Content of review 1, reviewed on October 27, 2014

This article on the normalization of RNA-Seq data caught my attention because of a discussion on Twitter about its poorly written abstract. On closer inspection, it became clear that the method presented in the article is an extension of the TMM (trimmed mean of M-values) normalization method originally proposed by Robinson and Oshlack in 2010, http://genomebiology.com/2010/11/3/r25 (doi:10.1186/gb-2010-11-3-r25). Comparing the two papers revealed that Zhou et al. copied, with only minor changes to the wording, the Results & Discussion sections "Sampling framework" and "The trimmed mean of M-values normalization method" from the Robinson & Oshlack paper without appropriately attributing their source. In its present form, I would consider parts of the Zhou et al. paper to be plagiarized. This raises the question of who reviewed the paper, since any reviewer working in the field should be familiar with the Robinson & Oshlack 2010 paper and should have spotted the similarities. I, for one, did, and I am not an expert on RNA-Seq normalization.

Furthermore, the paper suffers from imprecise and ungrammatical language ("What’s more, we discovered that gene UNC5C is highly associated with kidney cancer and so on.", "Exclusion of most of genes may lead to lost of too much information [...]", "Finally, some conclusions are drawn.", "The reads produced by RNA-Seq are first mapped to the reference genome using computer programs.", "We also analysis other datasets with the different normalization methods", "YZ, NL and BZ built the model, done the simulations and drafted the manuscript. YZ look into real data and do some biology analysis."), which the peer reviewers and the editor should have addressed prior to publication.

Originally posted on PubPeer: https://pubpeer.com/publications/260FB7AFCB7ECB1649C87BCFE6ACCB

Source

    © 2014 the Reviewer (CC BY-SA 3.0).

References

    Yan, Z., Nan, L., Baoxue, Z. 2014. An iteration normalization and test method for differential expression analysis of RNA-seq data. BioData Mining.