Content of review 1, reviewed on March 13, 2016

Review of GigaScience commentary: Recommendations for open data science

Gymrek and Farjoun present a commentary on recommendations for open data science.

This is a very important topic, and a GigaScience commentary is a great forum for presenting such information and potentially a great venue for discussion in the community. That said, I offer a number of suggestions to improve the document:

Major suggestions:

1. Page 1, lines 26-38: The opening background does not set up the problem well. A number of parameters are missing, including a definition of the limits and context of data science as a field. How is it different from bioinformatics, and how would these recommendations differ from recommendations for open bioinformatics?

2. The background should have differentiated the various kinds of "Open": there is Open Access, Open Source, Open Data, and Open Teaching. How these fit with "Open Data Science" should have been made clearer. Defining what Data Science represents would have been justified here. Maybe a figure would have helped to limit the space this commentary wants to address?

3. The recommendations need to be stated as declarative statements (some are, but not all). They also need to be strong and distinct from each other. Right now you have these five recommendations:

Provide source code for tools in public repositories

Publish code for pipelines and workflows

Train scientists in data science best practices

Editors must enforce computational reproducibility

High quality review of computational methods

I would number the recommendations and recast these as (with comments/justification below each recommendation):

1. Provide or cite source code for tools in a public repository
Tools used may be available from a previous publication.

2. Provide or cite pipelines and workflows in a public repository
Workflows also need to be available from a public repository.

3. Train scientists in data science
This important recommendation needs to be more generic, and include more than "best
practices".

4. Journal editors and reviewers must demonstrate computational reproducibility
It is important to target "journal editors and reviewers", and important for them to
demonstrate reproducibility, however they see fit.

5. Enable and enforce review of computational methods
This last one needs to be more declarative, and maybe needs to be either
merged with #4 or made more distinct from it. As stated, I am not sure it represents
a fifth recommendation.

 

4. Page 2, lines 50-60: Isn't this what Software Carpentry and Data Carpentry are doing? http://software-carpentry.org/ http://www.datacarpentry.org/ The last paragraph of that section doesn't seem to make any point; maybe it should be reworked. This also relates to the point I raised above about the scope and target of "data science".

5. Page 3, lines 35-45: I am not sure a generic checklist would work, but if you include one, then you would need to add these items as well: "Usable documentation" and "Training material".

Minor suggestions:

6. Page 1, lines 29-30 we read: "The research community has recently recognized …"

I don't think this is recent. GenBank has been around for more than three decades, and those of us who worked on it back at the beginning thought sharing open data was important. Maybe the formalization is newer? This should be made clearer.

7. Page 1, line 33: I am not sure it is appropriate to make "Quality" part of policy. I would think this should be assumed, and not be part of these recommendations.

8. Page 3, lines 25-30: Saying "most journals" is pretty generic. It would be good to highlight journals that have good practices (this one? PLOS journals?).

Level of interest

Please indicate how interesting you found the manuscript:
An article of limited interest

Quality of written English

Please indicate the quality of language in the manuscript:
Not suitable for publication unless extensively edited

Declaration of competing interests

Please complete a declaration of competing interests, considering the following questions:

1. Have you in the past five years received reimbursements, fees, funding, or salary from an
organisation that may in any way gain or lose financially from the publication of this
manuscript, either now or in the future?

2. Do you hold any stocks or shares in an organisation that may in any way gain or lose
financially from the publication of this manuscript, either now or in the future?

3. Do you hold or are you currently applying for any patents relating to the content of the
manuscript?

4. Have you received reimbursements, fees, funding, or salary from an organization that
holds or has applied for patents relating to the content of the manuscript?


5. Do you have any other financial competing interests?

6. Do you have any non-financial competing interests in relation to this paper?

If you can answer no to all of the above, write 'I declare that I have no competing interests'
below. If your reply is yes to any, please give details below.

No financial competing interests.

I am on the SAB of Galaxy, a tool referenced directly in this paper. I am also coordinating a
bioinformatics training series which includes a course on big data, but I did not mention it in my
review.

I agree to the open peer review policy of the journal.

I understand that my name will be included on my report to the authors and, if the manuscript is accepted for publication, my named report
including any attachments I upload will be posted on the website along with the authors'
responses. I agree for my report to be made available under an Open Access Creative Commons
CC-BY license (http://creativecommons.org/licenses/by/4.0/). I understand that any comments
which I do not wish to be included in my named report can be included as confidential comments
to the editors, which will not be published.


Authors' response to reviewers: (http://www.gigasciencejournal.com/imedia/1372612332201180_comment.pdf)


Source

    © 2016 the Reviewer (CC BY 4.0 - source).

Content of review 2, reviewed on April 18, 2016

Reviewer's comments on this second round of review can be seen in the following file:

https://drive.google.com/open?id=0B0V9UazwxfgRU0dwUnZ4UjlJUTA

Level of interest

Please indicate how interesting you found the manuscript:
An article of importance in its field

Quality of written English

Please indicate the quality of language in the manuscript:
Acceptable

Declaration of competing interests

Please complete a declaration of competing interests, considering the following questions:

1. Have you in the past five years received reimbursements, fees, funding, or salary from an
organisation that may in any way gain or lose financially from the publication of this
manuscript, either now or in the future?

2. Do you hold any stocks or shares in an organisation that may in any way gain or lose
financially from the publication of this manuscript, either now or in the future?

3. Do you hold or are you currently applying for any patents relating to the content of the
manuscript?

4. Have you received reimbursements, fees, funding, or salary from an organization that
holds or has applied for patents relating to the content of the manuscript?

5. Do you have any other financial competing interests?


6. Do you have any non-financial competing interests in relation to this paper?

If you can answer no to all of the above, write 'I declare that I have no competing interests'
below. If your reply is yes to any, please give details below.

I declare that I have no competing interests.

I agree to the open peer review policy of the journal. I understand that my name will be included
on my report to the authors and, if the manuscript is accepted for publication, my named report
including any attachments I upload will be posted on the website along with the authors'
responses. I agree for my report to be made available under an Open Access Creative Commons
CC-BY license (http://creativecommons.org/licenses/by/4.0/). I understand that any comments
which I do not wish to be included in my named report can be included as confidential comments
to the editors, which will not be published.


Authors' response to review: (http://www.gigasciencejournal.com/imedia/4886765512011801_comment.pdf)


Source

    © 2016 the Reviewer (CC BY 4.0 - source).

Content of review 3, reviewed on May 01, 2016

Reviewer's report:

The authors have addressed all of my previous concerns.

Level of interest

Please indicate how interesting you found the manuscript:

An article of importance in its field

Quality of written English

Please indicate the quality of language in the manuscript:
Acceptable

 

Declaration of competing interests

Please complete a declaration of competing interests, considering the following questions:

1. Have you in the past five years received reimbursements, fees, funding, or salary from an
organisation that may in any way gain or lose financially from the publication of this
manuscript, either now or in the future?

2. Do you hold any stocks or shares in an organisation that may in any way gain or lose
financially from the publication of this manuscript, either now or in the future?

3. Do you hold or are you currently applying for any patents relating to the content of the
manuscript?

4. Have you received reimbursements, fees, funding, or salary from an organization that
holds or has applied for patents relating to the content of the manuscript?

5. Do you have any other financial competing interests?


6. Do you have any non-financial competing interests in relation to this paper?

If you can answer no to all of the above, write 'I declare that I have no competing interests'
below. If your reply is yes to any, please give details below.

I declare that I have no competing interests.

I agree to the open peer review policy of the journal. I understand that my name will be included
on my report to the authors and, if the manuscript is accepted for publication, my named report
including any attachments I upload will be posted on the website along with the authors'
responses. I agree for my report to be made available under an Open Access Creative Commons
CC-BY license (http://creativecommons.org/licenses/by/4.0/). I understand that any comments
which I do not wish to be included in my named report can be included as confidential comments
to the editors, which will not be published.


 


Source

    © 2016 the Reviewer (CC BY 4.0 - source).

References

    Gymrek, M., Farjoun, Y. 2016. Recommendations for open data science. GigaScience.