Content of review 1, reviewed on February 16, 2015

In this updated manuscript, Gao et al. have substantially improved the text, which clarifies some previously raised issues. It is now clear that the authors aim to develop an RRBS analysis pipeline based on existing tools and custom scripts. My overall impression is that, for a few of the processing steps described in the manuscript, the authors have reinvented the wheel. The description of the pipeline should emphasize its novelty and how SMAT facilitates RRBS analysis, rather than describing re-programmed scripts whose functions are already provided by other tools. In other words, the novelty of the pipeline has not yet been demonstrated. What will the user gain by running SMAT (i.e. several Perl and batch scripts in a Linux terminal) instead of running the original tools directly (executable files in a Linux terminal)? However, the authors have explored an area that has not been widely investigated (using RRBS to study ASM and to detect SNPs), and their pipeline can be complementary to existing ones.

Major Compulsory Revisions:

1) In the abstract (under the “Findings” section), the authors claim that they have developed a “toolkit”. This paper describes a pipeline rather than a toolkit. Indeed, the majority of the key analysis steps are performed by existing tools (e.g. alignment, SNP detection). Therefore, the authors should demonstrate how their pipeline facilitates the automation of data processing. A comprehensive table describing the main RRBS tools should be added to the main manuscript. The current supplementary table 1 is too limited, both in terms of features (it does not cover all the analysis steps that other software can handle, such as read QC and M-bias plots) and in terms of the number of tools compared (only two were included, whereas more than 20 are available). The authors should refer to Adusumalli et al., 2014, Briefings in Bioinformatics, for this list of software.

2) An interesting part of the manuscript is the comparison of results obtained with different tools. SMAT is flexible in that it allows several combinations of software to be used (depending on the user's goal). I think that the paper would be more attractive if the authors divided it into two main sections: i) a description of the pipeline along with a comparison with existing tools, and ii) examples of data analysis using different procedures, along with their respective metrics (reliability of the results, processing time).

3) I am still confused by the fact that the authors propose three tools for the alignment step: Bismark, BSmap and Bowtie 2 (page 8). As raised previously, only Bismark and BSmap are designed for the analysis of bisulfite-treated DNA reads. The fact that the authors highlight Bowtie 2 suggests that they have developed a program to manage bisulfite read inputs and outputs for Bowtie. However, only Bismark and BSmap were used in the performance comparison section. The Bowtie 2 procedure should be either clarified and included in the comparison, or removed.

Level of interest: An article whose findings are important to those with closely related research interests

Quality of written English: Needs some language corrections before being published

Statistical review: No, the manuscript does not need to be seen by a statistician.

Declaration of competing interests: I declare that I have no competing interests.

Authors' response to reviewers: (http://www.gigasciencejournal.com/imedia/1485462551163638_comment.pdf)

Source

    © 2015 the Reviewer (CC BY 4.0 - source).

Content of review 2, reviewed on April 14, 2015

The authors have addressed all my concerns. Therefore, this manuscript is suitable for publication.

Level of interest: An article whose findings are important to those with closely related research interests

Quality of written English: Needs some language corrections before being published

Statistical review: No, the manuscript does not need to be seen by a statistician.

Declaration of competing interests: I declare that I have no competing interests.

Source

    © 2015 the Reviewer (CC BY 4.0 - source).

References

    Shengjie, G., Dan, Z., Likai, M., Quan, Z., Wenlong, J., Yi, H., Shancen, Z., Gang, C., Song, W., Dongdong, L., Fei, X., Huafeng, C., Maoshan, C., F., O. T., Lars, B., D., S. K. 2015. SMAP: a streamlined methylation analysis pipeline for bisulfite sequencing. GigaScience.