Content of review 1, reviewed on June 12, 2017
Review of revision of Hybrid-denovo

In the revision of Chen et al., the authors have made substantial additions to their previous manuscript. While it now reads much better, a few essential points are still missing, most notably a comparison to state-of-the-art software, which would have been necessary to judge the performance of Hybrid-denovo in the year 2017. The software seems to excel at a few things (such as taxonomic assignments, but see comments below), yet in other areas it appears to avoid a fair comparison with software that could reach the same or better levels. Further, I am not convinced that the results are not driven by false positives. I recommend this paper for major revision, because I think the high ICC values hold some potential. I fear, however, that the increased taxonomic assignment rate may be an artifact of a methodological twist that is not completely, or not correctly, described in the Methods. The ICC values could likewise be artifacts, for several reasons outlined below. If the comparisons are redone in a comparable and openly described manner, and the authors still obtain higher ICC values than QIIME and mothur, I could imagine this paper being of scientific value. In my first review I did not comment on some of these points; only in the revision did I become aware of some potential problems, thanks to the better (but still not sufficient) description of the methods.
Major:

OTU picking / "Gold" standard: In the Methods section, the exact parameters for all software packages used should be given, together with the reasons these options were chosen. Especially in a methods paper this is essential for transparency. Regarding the usearch clustering (which is somewhat outdated), I have a major issue: how do the authors cluster paired-end reads with usearch? To the best of my knowledge this is not supported by usearch, so I wonder how it is done within the pipeline. This needs to be described, as simpler approaches such as "stitching" will in all likelihood introduce errors.

This leads back to the "Gold" standard: trying to achieve the same results as the Gold standard assumes that the Gold standard represents some truth. Since this is not simulated data, where the true outcome is known, but a clinical dataset, this definition of "best" is problematic: the Gold standard might just as well represent the most wrong interpretation of the data. Not surprisingly, the closer the filtered datasets (it is not a simulation, but a read filtering that produces the 25%, 50%, and 75% datasets) come to full (100%) coverage, the more similar the results become to the full ("Gold") read set. This is circular reasoning. It also introduces a serious theoretical problem: if the Gold standard were biased towards artificial / false-positive OTUs of a specific signature, then agreement with it would not represent improved performance but a decrease. Using ICC is a good idea, but can it rule out that a false-positive bias drives this benchmark?

Artificially increased diversity: the Discussion of this important topic now contains a single sentence saying that newer software exists. This still does not answer my question: how does this software deal with this important problem? I insist on this point because usearch, compared to UPARSE, usually yields increased OTU diversity, and an algorithm such as DADA2 might decrease it even further.
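For concreteness, the kind of agreement statistic at stake here can be checked against a simple one-way ANOVA implementation. This is only an illustrative sketch: the manuscript does not state which ICC form was used, so ICC(1,1) and the toy abundance values below are my assumptions, not the authors' method.

```python
import numpy as np

def icc_oneway(ratings):
    """One-way random-effects ICC(1,1) for an (n subjects x k raters) matrix.

    Here a "rater" could be a pipeline (Gold standard vs. filtered run) and a
    "subject" an OTU whose abundance both report. Illustrative sketch only.
    """
    ratings = np.asarray(ratings, dtype=float)
    n, k = ratings.shape
    grand_mean = ratings.mean()
    row_means = ratings.mean(axis=1)
    # Between-subject and within-subject mean squares from one-way ANOVA
    ms_between = k * np.sum((row_means - grand_mean) ** 2) / (n - 1)
    ms_within = np.sum((ratings - row_means[:, None]) ** 2) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Perfect agreement between two runs gives ICC = 1; any disagreement lowers it.
print(icc_oneway([[1, 1], [2, 2], [3, 3]]))            # 1.0
print(icc_oneway([[1, 2], [2, 1], [3, 3], [4, 5]]))    # < 1.0
```

Note that a shared false-positive signature present in both "raters" would inflate agreement just as real signal would, which is exactly why ICC alone cannot exclude the false-positive bias described above.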
Further, the authors mention that they perform very stringent read quality filtering. From what I can see this is Trimmomatic (with what I would regard as lenient parameters, e.g. Q = 15 allows error-prone reads to pass), which offers considerably less fine-grained quality filtering than e.g. mothur and QIIME, and has neither the parameter breadth and probabilistic read filtering of the LotuS pipeline, nor the denoising of DADA2, nor the abundance-corrected clustering of UPARSE. This part of the pipeline is therefore, in my opinion, not state of the art.

Further, it seems the authors use cmalign to remove reads / OTUs (I am not sure which) that do not belong to bacteria. I see two problems with this: 1) It is only mentioned in the Discussion; please expand on this in the Methods section, and likewise extend the Methods to cover all other steps not explicitly mentioned so far. 2) This will certainly bias the dataset, as it is comparable to closed-reference OTU picking (only OTUs/reads with a representative in the database are used). I would also assume that such a treatment biases the ICC values, and it would explain why the fraction of unassigned reads is higher in Hybrid-denovo (since all OTUs not matching bacteria are removed before taxonomy is compared). Please note somewhere how many OTUs are removed by the different filtering steps, and compare how many reads end up in the final OTU matrix for mothur, QIIME, and Hybrid-denovo. Whether unassigned OTUs should be removed is a philosophical question, but users need to be made aware that all diversity not represented in the databases is lost, and the benchmarks need to state this difference between the three tested pipelines clearly. I would recommend using public mock communities, WITHOUT any filtering to known taxa (which would automatically bias the analysis, since mock communities contain only known taxa).
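To illustrate why Q = 15 is lenient: the expected-error filtering used by UPARSE and DADA2 sums the per-base error probabilities, 10^(-Q/10), over the whole read, rather than checking a windowed average as Trimmomatic's SLIDINGWINDOW does. A minimal sketch (the 100 bp read length and the maxEE = 1.0 cutoff are common illustrative values, not the authors' settings):

```python
def expected_errors(quals):
    """Sum of per-base error probabilities for a list of Phred quality scores."""
    return sum(10 ** (-q / 10) for q in quals)

# A 100 bp read in which every base is exactly Q15 sails through a
# SLIDINGWINDOW:4:15-style filter, yet carries ~3.16 expected errors,
# far above the maxEE = 1.0 threshold typical of expected-error filtering.
read_q15 = [15] * 100
print(round(expected_errors(read_q15), 2))  # -> 3.16

# By contrast, an all-Q30 read of the same length has only 0.1 expected errors.
read_q30 = [30] * 100
print(round(expected_errors(read_q30), 2))  # -> 0.1
```

This is the sense in which a per-window Q = 15 threshold is substantially more permissive than probabilistic, whole-read filtering.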
Showing that the filtering and the use of R2 reads can improve OTU clustering, diversity, and taxonomic assignment rates would, in my opinion, be a more appropriate test.

Comparability to other pipelines: In the abstract the authors claim that "Existing 16S analysis pipeline can either process paired-end or single-end reads, but not a mixture." First, I do not see that Hybrid-denovo can accept an actual mixture of single- and paired-end reads as input; I suspect it can only process a mixture produced within the pipeline from paired-end input. Second, as pointed out by Pat Schloss in his response, there is a good reason why the second read is not used; the UPARSE paper explains these reasons in great detail, and several papers by now point out that the second read should NOT be used in the clustering step, as it will likely inflate diversity. Third, other pipelines are capable of processing a mixture of paired- and single-end reads (LotuS), even as input, so this statement is wrong. Last, since the concepts behind Hybrid-denovo and LotuS are very similar (the biggest difference being that LotuS uses the second read for taxonomic assignments, but not for de novo OTU clustering), I would find it more interesting to compare against DADA2 (better OTU resolution) and LotuS (better taxonomic resolution), both of which are in my experience also faster than mothur and QIIME.

Minor:

The Greengenes database was last updated in 2013 and is out of date in this rapidly evolving field; consider using SILVA.

Methods: describe with what parameters mothur and QIIME were run.

Wording: "enjoy", "OTUing", ... seem like colloquial word choices.

Abstract: "Captured more microbial diversity". Really? As far as I can see, Hybrid-denovo predicted fewer OTUs than either QIIME or mothur.

Abstract: "identified 30% more diessential.." ("diessential": I do not know this word). Also, 30% more than what?
I did not read about any other pipeline being used on this dataset, so this statement is false if it simply refers to the pruning of the dataset to test technical performance. Further, is this really 30% more? Looking at page 5, lines 18+, I can see that all three Hybrid-denovo approaches have 16%, 16%, and 20% of possible OTUs classified as significantly different in relative abundance, so the "30% more" could simply reflect more OTUs being available to test for significance.

Page 4, line 37+: It is not mentioned what parameters were used to assign taxonomy to the QIIME and mothur OTUs (RDP classifier, maybe?).

Page 6, line 10+: This is an empty statement and could easily be tested; correctly merging and quality-controlling merged reads is not as straightforward as suggested here, and a user could probably adapt QIIME or mothur with less hassle to perform "hybrid-denovo" assemblies within those pipelines. If you think Hybrid-denovo is more powerful than the standard pipelines, please demonstrate it.
Level of interest
Please indicate how interesting you found the manuscript:
An article of limited interest.
Quality of written English
Please indicate the quality of language in the manuscript:
Needs some language corrections before being published
Declaration of competing interests
Please complete a declaration of competing interests, considering the following questions:
Have you in the past five years received reimbursements, fees, funding, or salary from an organisation that may in any way gain or lose financially from the publication of this manuscript, either now or in the future?
Do you hold any stocks or shares in an organisation that may in any way gain or lose financially from the publication of this manuscript, either now or in the future?
Do you hold or are you currently applying for any patents relating to the content of the manuscript?
Have you received reimbursements, fees, funding, or salary from an organization that holds or has applied for patents relating to the content of the manuscript?
Do you have any other financial competing interests?
Do you have any non-financial competing interests in relation to this paper?
If you can answer no to all of the above, write 'I declare that I have no competing interests' below. If your reply is yes to any, please give details below.
I declare that I have no competing interests.
I agree to the open peer review policy of the journal. I understand that my name will be included on my report to the authors and, if the manuscript is accepted for publication, my named report including any attachments I upload will be posted on the website along with the authors' responses. I agree for my report to be made available under an Open Access Creative Commons CC-BY license (http://creativecommons.org/licenses/by/4.0/). I understand that any comments which I do not wish to be included in my named report can be included as confidential comments to the editors, which will not be published.
I agree to the open peer review policy of the journal.
Authors' response to reviews: (https://drive.google.com/open?id=1TBfuFe8tAy0LoD9wOUVYh6u4KinCmFo6)
Source
© 2017 the Reviewer (CC BY 4.0).
References
Chen, X., Johnson, S., Jeraldo, P., Wang, J., Chia, N., Kocher, J. A., Chen, J. 2017. Hybrid-denovo: a de novo OTU-picking pipeline integrating single-end and paired-end 16S sequence tags. GigaScience.