Content of review 1, reviewed on December 06, 2011

This manuscript presents a draft assembly of the Puerto Rican Parrot, an ecologically important and endangered species, where the resources for the project were generated from the local community. This is indeed an admirable effort and a valuable model for future projects. The authors used the data generated for producing two draft assemblies with quite different characteristics and also detail several downstream projects that they plan to undertake.

Major Compulsory Revisions

  1. The description of the genome assembly effort needs more detail (e.g. final set of parameters used, procedure used to identify mis-assemblies). The authors should also provide any custom scripts that they wrote for this work, as well as the results of their evaluation of assembly correctness (for example, as an annotation track for the final assembly). More details are provided below:

  2. Read filtering (p11: line 3-4)
    This part needs some clarification, especially the criteria used to filter/trim the reads. It would also be useful to report the number of reads filtered/trimmed, the general read quality, and the final base quality of the assembly.

  3. Assembly Reconciliation
    The authors use RAY and SOAPdenovo to produce two assemblies with very different characteristics in terms of contig and scaffold N50s. In practice, the user community would ideally have one good draft assembly. It is therefore essential either to combine the assemblies (using approaches such as those described in Zimin et al., 2007) or to compare them across a range of metrics (e.g. mis-assembly rate, base quality) to guide users to the assembly that will be most useful for downstream analysis.

  4. Assembly Validation (page 11: line 10-12)
    This part is not clear, particularly the criterion used to define mis-assembled regions.

  5. The quality of the assembly here is probably not sufficient for a lot of downstream genomic analysis. In our experience, steps such as error-correction and gap-filling can have a substantial impact on assembly quality and it's not clear if these have been explored fully by the authors.

  6. A draft genome without any gene annotations or basic analysis (such as repeat annotation, identification of heterozygous positions in the sequenced samples) has limited value for the genomics community.
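
To make the contig and scaffold N50 comparison in point 3 concrete, here is a minimal Python sketch of how N50 is commonly computed from a list of sequence lengths. The function name `n50` and the example lengths are hypothetical illustrations; they are not taken from the manuscript or from the output of either assembler.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a set of contig or scaffold lengths.

    N50 is the largest length L such that sequences of length >= L
    together contain at least half of the total assembled bases.
    Returns 0 for an empty input.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    if total == 0:
        return 0
    running = 0
    for length in ordered:
        running += length
        # Stop at the first sequence that brings the running sum to >= 50% of the total.
        if 2 * running >= total:
            return length
    return 0


# Hypothetical toy assemblies with the same total size (100 bases each).
fragmented = [10] * 10       # many short contigs
contiguous = [60, 30, 10]    # a few long contigs

print(n50(fragmented), n50(contiguous))
```

Two assemblies of the same total size can differ widely in N50, which is why N50 alone does not indicate which assembly is more correct and why it should be reported alongside metrics such as mis-assembly rate and base quality.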

Minor Essential Revisions

  • Page 4: the authors state that “One sample (Pa9a) has been finally selected (Fig 1) and sequenced”, but on page 5 they state that “four samples from two different A. vittata females were sequenced”. These two statements are inconsistent.

  • Table 2: this table shows the number of A, T, G, C and N bases in the sequencing reads, but this information does not seem very relevant.

  • Table 3: it would be more meaningful to combine the statistics of the paired reads and to add an additional column that reports physical coverage.

  • Table 5: It's not clear which dataset the second column is referring to.

Level of interest: An article of limited interest

Quality of written English: Needs some language corrections before being published

Statistical review: Yes, and I have assessed the statistics in my report.

Source

    © 2011 the Reviewer (CC-BY 4.0 - source).

Content of review 2, reviewed on May 03, 2012

Major Compulsory Revisions

This is admittedly a fragmented, incomplete assembly which requires substantial additional sequencing and analysis before it can be considered a sequenced genome of value in scientific studies.

Level of interest: An article of limited interest

Quality of written English: Needs some language corrections before being published

Statistical review: No, the manuscript does not need to be seen by a statistician.

Declaration of competing interests: I declare that I have no competing interests.

Source

    © 2012 the Reviewer (CC-BY 4.0 - source).

References

    K., O. T., Jean-Francois, P., Daniel, S., Anyimilehidi, M., Brian, R., Wilfried, G., Yashira, A., T., R. C., L., N. M., M., L. D., Michael, D., Luis, F., Ricardo, V., Juan-Carlos, M. 2012. A locally funded Puerto Rican parrot (Amazona vittata) genome sequencing project increases avian data and advances young researcher education. GigaScience.