Content of review 1, reviewed on February 04, 2016

A nicely presented and well-written manuscript describing a widget that aims to facilitate collaborative tracking of experimental metadata.

As the authors point out, cloud-based systems offer solutions for provenance tracking and annotation management.

Following in the steps of Ontomaton's development, it is great to see components being built to validate content and syntax. This is demonstrated efficiently for two formats and does the job well.

I would be interested in the following input from the authors, as they are involved in large genomics projects.

1. Do you have any quantitative measure of the performance of the gdoc with very large datasets (i.e. have you tested the validation step with 10000 samples and related metadata)? In other words, it would be nice to have some real metrics on the stability/linearity of performance as data set size grows.

2. Data validation: Can the authors expand on the nature and extent of the data validation? While testing, I altered the QIIME-formatted data, setting values for Months to 'A' and '13'. These were not picked up by the validator (and similarly for YEAR).

I then altered other entries to check whether the validation tested for other data types (numerical values vs. strings for fields such as 'total carbon', and so on); no error was picked up.

It is therefore important that the manuscript clarifies the extent and nature of the validation, and how the level of validation stringency can be augmented or modulated without necessarily having to build a new plugin from source.

Upon inspection of the following:

https://github.com/biocore/Keemei/blob/master/src/QiimeFormat.gs

It is clear that data type checking is limited to key fields (e.g. the sequence check), but nothing prevents users from entering text where dates are expected.

So it would again be interesting to have a rough estimate of the performance impact if more rules are added and more constraints set.

3. How to extend? The authors mention ISA-Tab as a popular format. Are there plans to provide a validation option for that format?

4. Consumption of GoogleSheet generated QIIME files:

While working on Ontomaton (and ISA-Tab-based GoogleSheet templates), we ran into glitches when saving those documents as text files for download and use by third-party tools. Have the authors experienced similar issues? How are you controlling for these artefacts?
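
To make points 1 and 2 concrete, here is a minimal sketch of the kind of per-column type rules and scaling measurement I have in mind. It is not Keemei code and does not use the Google Sheets API. The column names (MONTH, YEAR, TOT_ORG_CARB), the rule functions and the `validate` and `time_validation` helpers are all hypothetical and chosen only for illustration.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical per-column rules; these names are illustrative, not Keemei's.
def is_month(value: str) -> bool:
    return value.isdigit() and 1 <= int(value) <= 12

def is_year(value: str) -> bool:
    return value.isdigit() and len(value) == 4

def is_number(value: str) -> bool:
    try:
        float(value)
        return True
    except ValueError:
        return False

RULES: Dict[str, Callable[[str], bool]] = {
    "MONTH": is_month,
    "YEAR": is_year,
    "TOT_ORG_CARB": is_number,
}

def validate(header: List[str], rows: List[List[str]],
             rules: Dict[str, Callable[[str], bool]] = RULES
             ) -> List[Tuple[int, str, str]]:
    """Return (row_index, column, value) for every cell that breaks a rule."""
    checks = [(i, name, rules[name])
              for i, name in enumerate(header) if name in rules]
    errors = []
    for r, row in enumerate(rows):
        for i, name, rule in checks:
            if not rule(row[i]):
                errors.append((r, name, row[i]))
    return errors

# Small table reproducing the altered values described in point 2.
header = ["#SampleID", "MONTH", "YEAR", "TOT_ORG_CARB"]
rows = [
    ["S1", "7", "2015", "3.2"],
    ["S2", "A", "2015", "n/a"],
    ["S3", "13", "15", "0.8"],
]
errors = validate(header, rows)
for row_index, column, value in errors:
    print(row_index, column, repr(value))

def time_validation(n: int) -> float:
    """Time one validation pass over n synthetic, valid rows."""
    big = [["S%d" % i, "7", "2015", "3.2"] for i in range(n)]
    start = time.perf_counter()
    validate(header, big)
    return time.perf_counter() - start

# Timings depend on the machine; only the growth trend is of interest.
for n in (1000, 10000, 100000):
    print(n, round(time_validation(n), 4))
```

Keeping the rules in a plain mapping from column name to check is one way to let users tighten or relax validation without rebuilding the plugin. Repeating the timing loop after each new rule would give the rough performance estimate requested above.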

Level of interest

Please indicate how interesting you found the manuscript:
An article whose findings are important to those with closely related research interests.

Quality of written English

Please indicate the quality of language in the manuscript:
Needs some language corrections before being published.

Declaration of competing interests

Please complete a declaration of competing interests, considering the following questions:

1. Have you in the past five years received reimbursements, fees, funding, or salary from
an organisation that may in any way gain or lose financially from the publication of this
manuscript, either now or in the future?

2. Do you hold any stocks or shares in an organisation that may in any way gain or lose
financially from the publication of this manuscript, either now or in the future?

3. Do you hold or are you currently applying for any patents relating to the content of the
manuscript?

4. Have you received reimbursements, fees, funding, or salary from an organization that
holds or has applied for patents relating to the content of the manuscript?


5. Do you have any other financial competing interests?

6. Do you have any non-financial competing interests in relation to this paper?

If you can answer no to all of the above, write 'I declare that I have no competing interests'
below. If your reply is yes to any, please give details below.

I am one of the developers of the ISA-Tab format and an author of the Ontomaton manuscript,
which describes the first Google plugin for metadata tracking and annotation.

I agree to the open peer review policy of the journal. I understand that my name will be included
on my report to the authors and, if the manuscript is accepted for publication, my named report
including any attachments I upload will be posted on the website along with the authors'
responses. I agree for my report to be made available under an Open Access Creative Commons
CC-BY license (http://creativecommons.org/licenses/by/4.0/). I understand that any comments
which I do not wish to be included in my named report can be included as confidential comments
to the editors, which will not be published.

Authors' response to reviews: (http://www.gigasciencejournal.com/imedia/1711043175201233_comment.pdf)


Source

    © 2016 the Reviewer (CC BY 4.0 - source).

Content of review 2, reviewed on May 11, 2016

Reviewer's report:

No further comment. I thank the authors for their clarification and for providing further information about overall performance of the validation procedure. This is a nice addition to the manuscript.

Level of interest

Please indicate how interesting you found the manuscript:
An article whose findings are important to those with closely related research interests.

Quality of written English

Please indicate the quality of language in the manuscript:
Acceptable

Declaration of competing interests

Please complete a declaration of competing interests, considering the following questions:

1. Have you in the past five years received reimbursements, fees, funding, or salary from
an organisation that may in any way gain or lose financially from the publication of this
manuscript, either now or in the future?

2. Do you hold any stocks or shares in an organisation that may in any way gain or lose
financially from the publication of this manuscript, either now or in the future?

3. Do you hold or are you currently applying for any patents relating to the content of the
manuscript?

4. Have you received reimbursements, fees, funding, or salary from an organization that
holds or has applied for patents relating to the content of the manuscript?


5. Do you have any other financial competing interests?

6. Do you have any non-financial competing interests in relation to this paper?

If you can answer no to all of the above, write 'I declare that I have no competing interests'
below. If your reply is yes to any, please give details below.

I declare that I have no competing interests but, for full disclosure, I am one of the developers
of the ISA-Tab format and the Ontomaton tools.

I agree to the open peer review policy of the journal. I understand that my name will be included
on my report to the authors and, if the manuscript is accepted for publication, my named report
including any attachments I upload will be posted on the website along with the authors'
responses. I agree for my report to be made available under an Open Access Creative Commons
CC-BY license (http://creativecommons.org/licenses/by/4.0/). I understand that any comments
which I do not wish to be included in my named report can be included as confidential comments
to the editors, which will not be published.

References

    Rideout, J. R., Chase, J. H., Bolyen, E., Ackermann, G., González, A., Knight, R., Caporaso, J. G. 2016. Keemei: cloud-based validation of tabular bioinformatics file formats in Google Sheets. GigaScience.