Content of review 1, reviewed on August 15, 2013

Bedoya-Reina et al. report on a set of tools that have been integrated into Galaxy for performing analyses on genetic variation within species. The authors illustrate how these tools can be used to analyse nucleotide and amino acid polymorphisms through a number of use cases based on published polymorphism data sets from a range of organisms, including canine, pig, lemur and human. The authors state that a major reason for integrating these tools into Galaxy is to enhance the reproducibility of analyses of genetic polymorphisms.

The set of tools described by the authors is comprehensive. Some tools allow users to pre-process polymorphism data once it has been loaded into a Galaxy server; others are then used to analyze population structure and how it relates to differences in genotype within species. Finally, tools using data from KEGG and the Gene Ontology have been wrapped into Galaxy so that hypotheses can be made about the possible biological consequences of polymorphisms in genes.

The range of use cases demonstrated by the authors is impressive, and the use cases are well described in the text. The authors appear to have been thorough in showing that, by using various combinations of their tools, it is possible to recapitulate published results as well as to produce new data, derived from the initial input polymorphism data, for follow-on studies.

Major Compulsory Revisions

  1. The data for the use cases are available within a data library on the main Galaxy server. However, I think it is important that there is a Galaxy server which provides users with access to the tools, workflows and histories associated with the use cases to complement this manuscript before it can be accepted for publication. It is currently stated in the manuscript that these resources are not yet available. Please provide the location of the server in the URLs section of the manuscript when this has been done.

Minor Essential Revisions

There seem to be a number of typos in the manuscript:

  1. Abstract: There are typos in the Conclusions section. “…that addresses the needs of a growing community of biologists who are attempting to reap the rewards of high-throughput genome sequencing to study intra-species diversity. This project provides a model for the development of a Galaxy tool set to meet the needs of a particular community of biologists”?

  2. Tools for analyzing SNV tables – page 19. “Principal component analysis (tool #12) is performed by smartpca”

  3. KEGG and Gene Ontology – page 19. “To do so,each gene is associated to a GO term following the Ensembl annotation (Flicek et al. 2013).”

  4. KEGG and Gene Ontology – page 20. “In addition, the Get Pathways tool (#19) maps KEGG genes and pathways to Ensembl codes, while the Pathway Image tool (#21) plots KEGG pathways highlighting genes of interest respectively (e.g. Figure 2).”

Level of interest: An article of importance in its field

Quality of written English: Acceptable

Statistical review: No, the manuscript does not need to be seen by a statistician.

Declaration of competing interests: I declare that I have no competing interests.

Source

    © 2013 the Reviewer (CC-BY 4.0 - source).

Content of review 2, reviewed on September 02, 2013

I made a big mistake in my review when I wrote that a Galaxy server hosting Webb's Genome Diversity tools is required. Webb and his colleagues have made the tools available on the main Galaxy server (https://main.g2.bx.psu.edu), but I had somehow overlooked this.

I have now successfully tested the Genome Diversity tools on the aye-aye data set by performing analyses such as PCA and phylogenetic tree construction. I found the tools straightforward to use with the documentation on each of the tools' Galaxy web pages and the information available at http://www.bx.psu.edu/miller_lab/docs/galaxy_phen_assoc/tutorial/intro.html. The authors should be commended on the documentation they have written for their tools and workflows, as I know how time-consuming this can be.

Based on the above, I suggest that the paper is accepted for publication after any minor revisions have been made.

Level of interest: An article of importance in its field

Quality of written English: Acceptable

Statistical review: No, the manuscript does not need to be seen by a statistician.

Declaration of competing interests: I declare that I have no competing interests.

Source

    © 2013 the Reviewer (CC-BY 4.0 - source).

Content of review 3, reviewed on October 25, 2013

The authors have addressed all of the comments of the three reviewers in a satisfactory manner. In particular, the manuscript is now much improved because the revisions make the biological question addressed in each example use case much clearer.

Level of interest: An article of importance in its field

Quality of written English: Acceptable

Statistical review: No, the manuscript does not need to be seen by a statistician.

Declaration of competing interests: I declare that I have no competing interests.

Source

    © 2013 the Reviewer (CC-BY 4.0 - source).

References

    Bedoya-Reina, O. C., Ratan, A., Burhans, R., Kim, H. L., Giardine, B., Riemer, C., Li, Q., Olson, T. L., Loughran, T. P., Jr., vonHoldt, B. M., Perry, G. H., Schuster, S. C., Miller, W. 2013. Galaxy tools to study genome diversity. GigaScience, 2.