Content of review 1, reviewed on December 08, 2011

The manuscript by Oleksyk et al. describes their effort to sequence the Puerto Rican parrot. They generated about 17x of short paired-end (PE) reads (300 bp) and about 10x of mate-pair reads (2.5 kbp). They assembled the data using two different assemblers, Ray and SOAPdenovo, to generate draft genome sequences. This is the first parrot draft sequence and also the first genome sequencing effort funded through a community-based scheme.

Major point

  1. Assuming this manuscript is submitted to the "Research" section, the amount of sequence is, in my opinion, not enough for a “draft” sequence. I would prefer 50-60x, even for a 1-2 Gb genome. However, considering that the parrot is an endangered species and that the work was done through community-based funding, it may be difficult to sequence more to reach 50-60x. One option is to submit this manuscript to another section of the journal that is better suited to this kind of data.

Minor points

  1. How many birds were used? In the Abstract, the authors wrote "one A. vittata female"; in Data Description, "One sample (Pa9a) has been finally selected (Fig 1) and sequenced on Illumina HiSeq platform"; while in the Discussion, "Four samples from two different A. vittata females were sequenced on the Illumina HiSeq platform resulting in a total of 42 billion bases".

  2. The 5th paragraph of Methods should be more specific.

  3. "The Illumina paired-end reads and mate-pairs were filtered and/or trimmed according to their quality values": Give the specific cutoff values.
  4. "The optimal parameters (e.g. k-mer) were defined empirically and iteratively.": Give the values used for Ray and SOAPdenovo.
  5. "Reads were subsequently mapped on the contigs to detect regions harboring unusually high/low coverage, and potential assembly errors were manually reviewed and corrected.": Give the cutoff values for unusually high/low coverage for Ray and SOAPdenovo.
  6. "Regions that could not be assembled using short sequencing reads were identified and these were selected for sequencing using other approaches including long-range PCR and Sanger chemistry.": List the regions that were sequenced by long-range PCR and Sanger sequencing for both assemblies.

Discretionary point

  1. In the 2nd paragraph of Background, the authors wrote "Finding these variants should invigorate search for the genomic regions that may contain genetic code relevant to parrot’s survival, and contribute to our understanding of the causes of its decline, and help guide conservation efforts through effective breeding strategies. Thus, the genome wide information will contribute to the conservation effort and to the species’ eventual recovery." However, survival and extinction are very complicated phenomena, and identifying survival genes will be difficult even with the full sequence. This part should be toned down.

Level of interest: An article whose findings are important to those with closely related research interests

Quality of written English: Acceptable

Statistical review: No, the manuscript does not need to be seen by a statistician.

Declaration of competing interests: No.

Source

    © 2011 the Reviewer (CC-BY 4.0).

Content of review 2, reviewed on May 01, 2012

The manuscript is much improved, and many of the points I was concerned about have been corrected. I think this work is now publishable after major editorial revisions.

Major points

  1. The majority of “Potential implications” should be deleted. Most of the points are either obvious or premature.

  2. A substantial part of “Discussion” could be merged into “Assembly”, “Comparative Analysis” and “Annotation”. This would improve readability.

  3. The remaining discussion should be concise.

Minor points

  1. P2 L19-21, “Although our analysis is preliminary, it indicates that the existence of repeats may interfere with the effort of advancing to the high quality draft.” should be deleted. It is too obvious to include in the Abstract.

  2. “Taeniopygia guttata” is NOT “Zebra fish”; it is the zebra finch. Please check the other species names as well.

  3. In Table 1, the numbers for “Pa9a_1” and “Pa9a_2” (and also for “Pa9a-MP_1” and “Pa9a-MP_2”) are exactly the same. Why? This may be a mistake. Please re-check the numbers in the other tables, too.

  4. After rewriting the manuscript, the order in which the tables and figures appear may differ from the original one. It may be easier for the reader if you renumber the tables and figures.

Level of interest: An article of importance in its field

Quality of written English: Acceptable

Statistical review: No, the manuscript does not need to be seen by a statistician.

Declaration of competing interests: I declare that I have no competing interests.

Source

    © 2012 the Reviewer (CC-BY 4.0).

References

    K., O. T., Jean-Francois, P., Daniel, S., Anyimilehidi, M., Brian, R., Wilfried, G., Yashira, A., T., R. C., L., N. M., M., L. D., Michael, D., Luis, F., Ricardo, V., Juan-Carlos, M. 2012. A locally funded Puerto Rican parrot (Amazona vittata) genome sequencing project increases avian data and advances young researcher education. GigaScience.