Content of review 1, reviewed on December 15, 2014
Sequencing of single cell genomes relies on whole genome amplification. A range of methods have been proposed in recent years that are commercially available through different vendors. None of these methods is perfect: every method adds a significant amount of noise and distortion, in the form of amplification bias, chimeras and point mutations, hindering the identification of copy number variants, rearrangements, and single nucleotide variants. As a result, whole genome amplification is the bottleneck in many single cell sequencing assays. The authors compare the performance of three different amplification principles, available through seven different vendors, and thereby address an important issue. The study covers data from 34 single cells, and bulk reference data from different cell lines, generated on the Illumina (17 datasets) and Ion Torrent Proton (17 datasets) platforms. Five additional deep sequencing datasets (30x coverage) were generated. All in all, this represents a valuable dataset.
Major revisions required:
(1) Similar analyses have been reported recently (de Bourcy et al. PLoS One 2014, Voet et al. NAR 2013, Yu et al. Anal Chem 2014). The authors need to discuss their findings in the context of this recent literature.
(2) The primary analyses are based on only 17 new single cell datasets, corresponding to amplification products generated using MDA and DOP-PCR (kits of three different vendors, two and three replicates per kit). The comparison of MDA and DOP-PCR with MALBAC (a method introduced in 2012) is based on publicly available data for MALBAC. It is unfortunate that the authors did not generate an independent Illumina dataset for MALBAC. It would seem critical for this type of analysis that data are generated under the same conditions, in the same laboratory, using the same sequencing parameters. The authors did collect Ion Proton data for MALBAC but only discuss these data as a control for the copy number variant analysis. It is unclear why; it should be trivial, for example, to generate low coverage Illumina data from the existing YH single cell MALBAC amplification products.
Level of interest: An article of importance in its field
Quality of written English: Needs some language corrections before being published
Statistical review: No, the manuscript does not need to be seen by a statistician.
Declaration of competing interests: I declare that I have no competing interests.
Authors' response to reviews: (http://www.gigasciencejournal.com/imedia/3021547641562807_comment.pdf)
Content of review 2, reviewed on February 16, 2015
This revised version of the manuscript provides several improvements. In particular, the authors do a better job of citing and discussing recent relevant literature. They have also provided a schematic of the experimental design, which is very helpful.
Major compulsory revisions:
A few concerns remain that I would like to see addressed: (1) The authors include a discussion of the performance in genome assembly. I appreciate that it is hard to perform such a comparison based on sparse human genome data. The authors decided to compare the assembly of mitochondrial sequences, which I think is a fair choice. The authors should, however, discuss the limitations of this comparison in light of the small size of the mitochondrial genome (16 kb) and the high number of mitochondrial genome copies per cell.
(2) In comparing the assembly results to recent literature data, the authors write: “We measureed the contig N50 to evaluate the assembly quality, and found that the mitochondria genome assembled by the MDA and MALBAC data have comparable quality (Additional file 27: Table S14), which is in consistent with the previous report [24].” Are the results consistent or inconsistent? I am not able to tell. Data in ref 24 indicate that MALBAC and MDA yield assemblies of comparable quality for reaction gains <= 5*10^6. Please clarify and provide more discussion, including the caveats mentioned above.
(3) It is still not clear to me why the authors did not generate new Illumina sequencing data for MALBAC and instead rely on published data (a similar point was raised by the other reviewer). In response to my initial concern, the authors have included a schematic of the experimental design, which I think is very helpful, but they have not specifically addressed my concern about the re-use of MALBAC data. I think this is an important point, and the authors need to at least explain this choice. This is particularly confusing since the paper does include new data for MALBAC generated on the Ion Proton platform.
Level of interest: An article of importance in its field
Quality of written English: Needs some language corrections before being published
Statistical review: Yes, and I have assessed the statistics in my report.
Declaration of competing interests: I declare that I have no competing interests.
Authors' response to reviewers: (http://www.gigasciencejournal.com/imedia/1966304336168046_comment.pdf)
Content of review 3, reviewed on April 17, 2015
All my concerns and comments have been addressed adequately and I am now happy to recommend this paper for publication.
Level of interest: An article whose findings are important to those with closely related research interests
Quality of written English: Needs some language corrections before being published
Statistical review: Yes, and I have assessed the statistics in my report.
Declaration of competing interests: No competing interests here.