Content of review 1, reviewed on November 10, 2014

This paper presents a method for detecting linear and non-linear genetic effects on phenotypic traits using lasso. Specifically, the proposed method is a two-step procedure. In the first step, the method detects only main effects using lasso; in the second step, pairs of genetic variants with main effects (found in the first step) are added to the model used in the first step. As a result, one can detect genetic variants with main effects as well as pairwise interaction effects on the trait of interest.
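As a reading aid, the two-step procedure summarized above could be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The coordinate-descent solver, the function names (`lasso_cd`, `two_step_lasso`), the penalty values, and the simulated genotypes are all assumptions made here. The sketch also assumes that the second-step model contains only the variants selected in step one plus their pairwise products.

```python
import numpy as np
from itertools import combinations


def lasso_cd(X, y, lam, n_iter=200):
    """Minimize (1/2n)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float).copy()  # residual y - X @ beta, with beta = 0 at the start
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (column j excluded).
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


def two_step_lasso(X, y, lam1, lam2):
    """Step 1: lasso on main effects. Step 2: add pairwise products of the selected variants."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    idx = np.flatnonzero(lasso_cd(Xc, yc, lam1))
    pairs = [(int(i), int(j)) for i, j in combinations(idx, 2)]
    design = Xc[:, idx]
    if pairs:
        inter = np.column_stack([X[:, i] * X[:, j] for i, j in pairs])
        design = np.hstack([design, inter - inter.mean(axis=0)])
    beta2 = lasso_cd(design, yc, lam2)
    k = len(idx)
    main = {int(j): float(b) for j, b in zip(idx, beta2[:k])}
    interactions = {pair: float(b) for pair, b in zip(pairs, beta2[k:])}
    return main, interactions


# Hypothetical simulated data: 300 samples, 30 variants coded 0/1/2,
# two main effects and one pairwise interaction.
rng = np.random.default_rng(0)
X = rng.binomial(2, 0.5, size=(300, 30)).astype(float)
y = X[:, 0] + X[:, 1] + 0.8 * X[:, 0] * X[:, 1] + rng.normal(0.0, 0.5, 300)

main, interactions = two_step_lasso(X, y, lam1=0.1, lam2=0.05)
print("variants kept in step 1:", sorted(main))
print("non-zero interactions:", {k: round(v, 3) for k, v in interactions.items() if v != 0.0})
```

Restricting the interaction candidates to variants that survive step one keeps the second design matrix small, but it also means that a pair with a pure interaction effect and no main effects would not be considered in this sketch.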

Major comments:

  1. In the sparse regression community, the theoretical properties of lasso (e.g., conditions for recovering the true non-zero coefficients) have been studied extensively. It would be great if the paper compared the theoretical aspects of the proposed model with existing theoretical work on lasso, including:

Zhao, P., and Yu, B. "On model selection consistency of Lasso." The Journal of Machine Learning Research 7 (2006): 2541-2563.

Zou, H. "The adaptive lasso and its oracle properties." Journal of the American Statistical Association 101.476 (2006): 1418-1429.

Meinshausen, N., and Yu, B. "Lasso-type recovery of sparse representations for high-dimensional data." The Annals of Statistics (2009): 246-270.

  2. I would suggest discussing the relationship between the proposed model and existing lasso-based techniques for detecting non-linear interaction effects, including:

Park, M., and Hastie, T. Regularization path algorithms for detecting gene interactions. Department of Statistics, Stanford University, 2006.

Wu, J., et al. "Screen and clean: a tool for identifying interactions in genome‐wide association studies." Genetic Epidemiology 34.3 (2010): 275-285.

Lee, S., and Xing, E. P. "Leveraging input and output structures for joint mapping of epistatic and marginal eQTLs." Bioinformatics 28.12 (2012): i137-i146.

  3. To motivate the models in Eq. 2.1 and 2.2, it would be great to add either references or a biological motivation. Eq. 2.3 also needs a reference.

  4. It would be useful to specify the lasso model used in the paper and how the resulting optimization problem is solved (i.e., the algorithm used in the experiments).

  5. Popular ways to select the lambda parameter include cross-validation, BIC, and AIC. A comparison between these techniques and the proposed method for lambda selection on page 4 (below Eq. 2.3) would be useful.

  6. In practical genome-wide association studies, it is important to control false positives. It would be useful to demonstrate that the proposed method can control false positives at a user-specified level (e.g., FDR 0.05).

  7. In the experiments, I would suggest comparing the proposed method with existing methods for detecting non-linear interaction effects.

  8. Furthermore, an analysis of a real dataset would improve the paper.
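To make the comment above on false-positive control concrete, one standard way to target a user-specified FDR level is the Benjamini-Hochberg step-up procedure. The sketch below is only an illustration of that generic procedure, not something taken from the manuscript. It assumes that a p-value is available for each candidate variant, which a lasso fit does not provide directly. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the sorted indices of hypotheses rejected at FDR level alpha."""
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        # Step-up rule: remember the largest rank whose p-value is below its threshold.
        if p_values[i] <= alpha * rank / m:
            k_max = rank
    return sorted(order[:k_max])


# Hypothetical p-values for ten candidate variants.
p_vals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(p_vals, alpha=0.05)
print(rejected)  # [0, 1]
```

Because the rule is step-up, every hypothesis ranked at or below the largest passing rank is rejected, even if its own p-value exceeds its individual threshold.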

Minor comments:

  1. Below Eq. 2.2, it is argued that var(epsilon)=0.3 lies in a realistic range for highly heritable complex traits. A reference would be useful here.

  2. On page 3, “(see figures)” needs to include figure numbers.

  3. On page 4, an explanation of the sample size settings is missing.

Level of interest: An article of limited interest

Quality of written English: Needs some language corrections before being published

Statistical review: Yes, and I have assessed the statistics in my report.

Declaration of competing interests: I declare that I have no competing interests.

Authors' response to reviews: (http://www.gigasciencejournal.com/imedia/1055861928166366_comment.pdf)

Source

    © 2014 the Reviewer (CC BY 4.0 - source).

Content of review 2, reviewed on May 05, 2015

Major

  1. In the revised version, Section 2 on relations to previous work has been added, which is helpful. However, a more rigorous and extensive review of previous work on phase transitions, lasso theory, and epistasis detection would be valuable. For example, only two papers are cited for epistasis detection.

  2. It would be good to have a detailed explanation of how the experiments with the human data were done. It would also be helpful to include a detailed analysis of the real human dataset.

  3. In the paper, it would be helpful to clarify the similarities and differences between the proposed method and "screen and clean". If there are algorithmic differences, comparison experiments would be useful; otherwise, the novelty of the proposed method over "screen and clean" needs to be clearly stated.

  4. There are epistasis detection methods that can capture non-linear associations, such as the two-way ANOVA test, the two-locus epistasis test and its variants, BOOST (http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2933337/), and TEAM (http://www.cs.ucla.edu/~weiwang/paper/ISMB10_2.pdf). It would be helpful for readers to see some comparisons between the proposed method and these widely used methods.

Minor

  1. In reference 7, two references are mixed.

Level of interest: An article whose findings are important to those with closely related research interests

Quality of written English: Acceptable

Statistical review: No, the manuscript does not need to be seen by a statistician.

Declaration of competing interests: I declare that I have no competing interests.

Authors' response to reviews: (http://www.gigasciencejournal.com/imedia/1581269829172001_comment.pdf)

Source

    © 2015 the Reviewer (CC BY 4.0 - source).

Content of review 3, reviewed on June 04, 2015

Major Compulsory Revisions:

  • I feel that thorough comparisons with other existing tools are necessary. Such comparisons would help readers choose a tool for GWAS under different scenarios.

Level of interest: An article whose findings are important to those with closely related research interests

Quality of written English: Acceptable

Statistical review: No, the manuscript does not need to be seen by a statistician.

Declaration of competing interests: I declare that I have no competing interests.

Authors' response to reviews: (http://www.gigasciencejournal.com/imedia/6239355841764827_comment.pdf)

Source

    © 2015 the Reviewer (CC BY 4.0 - source).

References

    Ho, C. M., and Hsu, S. D. H. 2015. Determination of nonlinear genetic architecture using compressed sensing. GigaScience.