Content of review 1, reviewed on August 20, 2015

Benitez-Paez et al. present a dataset from amplicon sequencing of the 16S rRNA genes of a mock metagenomic community, using the MinION platform. The authors also present an analysis of the dataset and, based on the results, conclude that amplicon sequencing using MinION can be used for taxonomy assignment and to measure the relative abundance of all the species present in the community. I find this study interesting, and the dataset would be valuable to the research community since it is, as far as I know, the first of its kind. However, I also find that some claims do not follow conclusively from the analysis presented (points 3 and 4 below). The English and the writing can be significantly improved.

Major Compulsory Revisions:

1. The manuscript will need to be edited throughout to improve the English. Many statements, though not the main points of the paper, need to be reworded to avoid inaccuracy and ambiguity. For example:

- (Abstract) "...this technology is under constant development to completely deliver error-free and high quality reads...": I do not see how this technology (or any other) could deliver error-free reads in the foreseeable future.

- (Background, line 70): "The moderate throughput ..., thus making it possible to obtain reads of thousands nucleotides in length.". I fail to make sense of this sentence.

2. I am unclear about which reference sequences the authors mapped the MinION reads to. They state that they aligned reads against the 16S rRNA sequences, but the accession numbers listed refer to the complete genomes of the mentioned species. The authors need to make this consistent.

3. The authors found a "detrimental effect" of 2D reads, and they explained that this was because 86% of 2D reads were low quality - which I assume means they were classified as "fail" reads. If so, the template and complement reads used to construct these 2D reads were also in the fail category, and (theoretically) had even lower quality. So I do not think this explanation is justified.

4. From their analyses, the authors claim that MinION amplicon sequencing can be used for taxonomy assignment and abundance measurement. While I find the latter reasonable, the former is not conclusive from the analysis. The authors aligned the reads to the 16S rRNA genes (or genomes) of the 20 species in the sample, which would not normally be known in practice. The claim could be substantiated if the reads were aligned to a large set of 16S rRNA sequences (if not all known 16S rRNA sequences) and significantly more hits were found in (only) these 20 species.

5. I would be interested to know whether the alignment configuration they used reports multiple alignments (the same read aligned to multiple reference sequences) or only the best alignment. In multiple-alignment mode, reporting any such multiple alignments would help the interpretation of the results. On the other hand, if the alignment was run in competitive mode, I would be interested to see whether any reads aligned to species NOT in the list of 20 in the sample.

Discretionary Revisions:

Several long paragraphs (such as the whole analysis section) should be broken into smaller, self-contained units.

Level of interest

Please indicate how interesting you found the manuscript: An article of importance in its field

Quality of written English

Please indicate the quality of language in the manuscript: Not suitable for publication unless extensively edited

Declaration of competing interests

Please complete a declaration of competing interests, considering the following questions:

1. Have you in the past five years received reimbursements, fees, funding, or salary from an organisation that may in any way gain or lose financially from the publication of this manuscript, either now or in the future?

2. Do you hold any stocks or shares in an organisation that may in any way gain or lose financially from the publication of this manuscript, either now or in the future?

3. Do you hold or are you currently applying for any patents relating to the content of the manuscript?

4. Have you received reimbursements, fees, funding, or salary from an organization that holds or has applied for patents relating to the content of the manuscript?

5. Do you have any other financial competing interests?

6. Do you have any non-financial competing interests in relation to this paper?

If you can answer no to all of the above, write 'I declare that I have no competing interests' below. If your reply is yes to any, please give details below.

I declare that I have no competing interests.

I agree to the open peer review policy of the journal. I understand that my name will be included on my report to the authors and, if the manuscript is accepted for publication, my named report including any attachments I upload will be posted on the website along with the authors' responses. I agree for my report to be made available under an Open Access Creative Commons CC-BY license (http://creativecommons.org/licenses/by/4.0/). I understand that any comments which I do not wish to be included in my named report can be included as confidential comments to the editors, which will not be published.

I agree to the open peer review policy of the journal.

Authors' response to reviews:

Reviewers' concerns: Reviewer #1:

1. The authors mention at multiple points in the manuscript that the initial PCR is a potential influence on the downstream community analysis via nanopore sequencing. Methods exist for the absolute quantification of PCR products (digital PCR), and so it would be beneficial if this analysis was performed on the initial conserved primer PCR reaction. This would give insight into how the original PCR quantities influence "quantitative" sequence analysis.

R./ Lines 136-146: We have performed a quantitative analysis to know if the coverage deviation observed for the 16S rRNA gene sequencing of different species present in the mock community would come from the initial PCR step or from the sequencing process itself. Absolute quantification of three different amplicons was assessed through qPCR using specific oligos for the 16S sequences from E. coli, C. beijerinckii, and B. vulgatus which showed high coverage, close-to-expected coverage, and the lowest coverage during the sequencing process, respectively. The results of this absolute quantification are shown in the new Figure 2, consisting of a scatter plot of the 16S copies present in the PCR starting material versus the coverage deviation calculated from the MinION data. We demonstrated that a clear bias was produced during the initial 16S PCR, despite using "universal" primers, and that MinION sequencing properly replicates the original proportion of 16S molecules present in the starting material. The methods regarding this new approach are also included in the new version (lines 371-387).

2. Related to the above point, the authors also mention that secondary structure and/or melting temperatures might influence the PCR and/or the translocation of PCR products through the nanopores. It is possible to analyze each of the 20 amplicons for secondary structure, and the influence of DNA folding on MinION efficiency for a given sequence. This would be an extremely useful piece of information on a standard reference material that could have a large impact on how MinION data is analyzed and corrected. It is a dry-lab type of analysis, not dependent on the availability of reagents from ONT, and thus could be performed to add more impact to the work described. Again, the impact of these structures and melting points has not been quantitatively assessed on the MinION, so it would be of high value.

R./ We performed the additional analysis proposed by reviewer #1, using the mfold server to calculate the folding energy for each 16S sequence of the mock community. We found no significant correlation between the coverage deviation and the ∆G value obtained for the 16S folding. This information was not included in the main text, given that the origin of the coverage bias was strongly supported by the experiment described above.

Reviewer #3: Major Compulsory Revisions

1. The manuscript will need to be edited throughout to improve the English. Many statements, though not the main points of the paper, need to be reworded to avoid inaccuracy and ambiguity. For example: 1.1. (Abstract) "...this technology is under constant development to completely deliver error-free and high quality reads...": I do not see how this technology (or any other) could deliver error-free reads in the foreseeable future. 1.2. (Background, line 70): "The moderate throughput ..., thus making it possible to obtain reads of thousands nucleotides in length.". I fail to make sense of this sentence.

R./ These statements were reworded to make them clearer for the reader (lines 54-56 and 67-73). Additionally, the full text was revised and approved by a native English speaker who is also an author of our manuscript, Kevin Portune.

2. I am unclear about which reference sequences the authors mapped the MinION reads to. They state that they aligned reads against the 16S rRNA sequences, but the accession numbers listed refer to the complete genomes of the mentioned species. The authors need to make this consistent.

R./ Together with the accession numbers, which retrieve the complete genomes of the species analyzed in the mock community, we now detail the specific nucleotide range where the respective 16S rRNA genes are encoded (lines 344-352). Additionally, we have deposited the multi-fasta file of the twenty 16S sequences in a publicly available repository (https://github.com/alfbenpa/16S_MinION) where anyone can access this information, and we describe this in line 355 of the manuscript.

3. The authors found a "detrimental effect" of 2D reads, and they explained that this was because 86% of 2D reads were low quality - which I assume means they were classified as "fail" reads. If so, the template and complement reads used to construct these 2D reads were also in the fail category, and (theoretically) had even lower quality. So I do not think this explanation is justified.

R./ We have reworded this statement to clarify our methods (lines 126-129). We state that template and complement reads, which were more abundant in the MinION output data, were initially used to reconstruct the expected 16S molecules. In this way we can obtain reliable information based on more sequence observations. The 2D reads, less abundant but of higher quality, were used to answer the next two concerns of reviewer #3.

4. From their analyses, the authors claim that MinION amplicon sequencing can be used for taxonomy assignment and abundance measurement. While I find the latter reasonable, the former is not conclusive from the analysis. The authors aligned the reads to the 16S rRNA genes (or genomes) of the 20 species in the sample, which would not normally be known in practice. The claim could be substantiated if the reads were aligned to a large set of 16S rRNA sequences (if not all known 16S rRNA sequences) and significantly more hits were found in (only) these 20 species.

R./ We have now used the 2D reads to assess taxonomy assignment with a common tool for microbial community analysis, the SINA aligner. By submitting the 2D reads to this online tool, we obtained a taxonomy assignment for some of the microorganisms present in our mock community. Although the SINA alignment service only retrieves taxonomy assignments at the genus level, we obtained confident assignments even with a sequence identity threshold of only 70% (lines 185-205).

5. I would be interested to know whether the alignment configuration they used reports multiple alignments (the same read aligned to multiple reference sequences) or only the best alignment. In multiple-alignment mode, reporting any such multiple alignments would help the interpretation of the results. On the other hand, if the alignment was run in competitive mode, I would be interested to see whether any reads aligned to species NOT in the list of 20 in the sample.

R./ In order to put the MinION data into the context of sequence analysis for microbial communities, we have used NCBI’s BLAST against the bacterial 16S database for taxonomy assignment of the 2D reads, thus allowing classification to the species level. After discarding the alignments with E-value > 1e-03 and alignment length < 800 bp (approximately half the size of the 16S molecule), we retrieved the taxonomy assignment for 172 reads (lines 205-209). The species retrieved in this and the above analyses are presented in the new Table 2.
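The filtering step described in this response could be sketched as follows. This is only a minimal illustration, not the authors' actual pipeline. It assumes the BLAST results are in the standard 12-column tabular format (`-outfmt 6`), reads the criteria as discarding any alignment with an E-value above 1e-03 or a length below 800 bp, and additionally keeps only the highest bit-score hit per read, which the response does not state. The function name and the example rows are hypothetical.

```python
import csv
import io

MAX_EVALUE = 1e-3      # alignments with a larger E-value are discarded
MIN_ALIGN_LEN = 800    # alignments shorter than this (in bp) are discarded


def filter_blast_hits(handle, max_evalue=MAX_EVALUE, min_len=MIN_ALIGN_LEN):
    """Return {read_id: subject_id} for the best surviving hit of each read.

    Expects BLAST tabular output with the default 12 columns:
    qseqid sseqid pident length mismatch gapopen
    qstart qend sstart send evalue bitscore
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        read_id, subject = row[0], row[1]
        length = int(row[3])
        evalue = float(row[10])
        bitscore = float(row[11])
        if evalue > max_evalue or length < min_len:
            continue  # fails the E-value or the length criterion
        # Assumption: one assignment per read, chosen by highest bit score.
        if read_id not in best or bitscore > best[read_id][1]:
            best[read_id] = (subject, bitscore)
    return {read: subject for read, (subject, _) in best.items()}


# Hypothetical rows, for illustration only (not data from the study).
EXAMPLE = "\n".join(
    "\t".join(fields)
    for fields in [
        ("read1", "refA", "85.0", "1200", "150", "30", "1", "1200", "1", "1200", "0.0", "900"),
        ("read1", "refB", "80.0", "1100", "180", "40", "1", "1100", "1", "1100", "1e-50", "700"),
        ("read2", "refC", "75.0", "500", "100", "25", "1", "500", "1", "500", "1e-20", "300"),
        ("read3", "refD", "72.0", "900", "200", "52", "1", "900", "1", "900", "0.5", "50"),
    ]
)

if __name__ == "__main__":
    print(filter_blast_hits(io.StringIO(EXAMPLE)))
```

In this sketch, a read whose only alignments are short or weak is dropped entirely rather than assigned to a taxon, which is one way of reading how 172 reads came to carry an assignment after filtering.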

Discretionary Revisions: Several long paragraphs (such as the whole analysis section) should be broken into containable units.

R./ Changed as suggested.

 


Source

    © 2015 the Reviewer (CC BY 4.0 - source).

Content of review 2, reviewed on December 09, 2015

The authors have satisfactorily addressed the concerns I raised in the previous review. The alignment of nanopore reads to a large database of 16S sequences, which showed enrichment of the species in the mock community, has strengthened the claims of the manuscript. However, I still spot a number of grammatical and spelling mistakes. The manuscript should be edited again to improve the English.

Are the methods appropriate to the aims of the study, are they well described, and are necessary controls included? If not, please specify what is required in your comments to the authors.

Yes

Are the conclusions adequately supported by the data shown? If not, please explain in your comments to the authors.

Yes

Does the manuscript adhere to the journal’s guidelines on minimum standards of reporting (http://www.gigasciencejournal.com/authors/instructions/minimum_standards_reporting)? If not, please specify what is required in your comments to the authors.

Yes

Are you able to assess all statistics in the manuscript, including the appropriateness of statistical tests used? (If an additional statistical review is recommended, please specify what aspects require further assessment in your comments to the editors.)

Yes, and I have assessed the statistics in my report.

Quality of written English

Please indicate the quality of language in the manuscript: Needs some language corrections before being published

Declaration of competing interests

Please complete a declaration of competing interests, considering the following questions:

1. Have you in the past five years received reimbursements, fees, funding, or salary from an organisation that may in any way gain or lose financially from the publication of this manuscript, either now or in the future?

2. Do you hold any stocks or shares in an organisation that may in any way gain or lose financially from the publication of this manuscript, either now or in the future?

3. Do you hold or are you currently applying for any patents relating to the content of the manuscript?

4. Have you received reimbursements, fees, funding, or salary from an organization that holds or has applied for patents relating to the content of the manuscript?

5. Do you have any other financial competing interests?

6. Do you have any non-financial competing interests in relation to this paper?

If you can answer no to all of the above, write 'I declare that I have no competing interests' below. If your reply is yes to any, please give details below.

I declare that I have no competing interests.

I agree to the open peer review policy of the journal. I understand that my name will be included on my report to the authors and, if the manuscript is accepted for publication, my named report including any attachments I upload will be posted on the website along with the authors' responses. I agree for my report to be made available under an Open Access Creative Commons CC-BY license (http://creativecommons.org/licenses/by/4.0/). I understand that any comments which I do not wish to be included in my named report can be included as confidential comments to the editors, which will not be published.

I agree to the open peer review policy of the journal



References

    Benítez-Páez, A., Portune, K. J., Sanz, Y. 2016. Species-level resolution of 16S rRNA gene amplicons sequenced through the MinION (TM) portable nanopore sequencer. GigaScience.