Content of review 1, reviewed on April 13, 2017

Summary: Kenkel and Bay describe three new coral transcriptomic resources that contribute both ecological and evolutionary significance to the field of coral biology. The inclusion of pooled RNA from coral fragments exposed to ambient and elevated temperatures also provides a more comprehensive collection of transcripts for future global expression profiling studies on thermal stress. The authors provided a comparison of the holobiont and host-specific assemblies, which is a necessary step for corals and allows for further investigation into host vs. symbiont (Symbiodinium and/or microbial) transcription patterns. Based on their evaluation of the transcriptomes, these new resources are comparable to other published scleractinian transcriptomes in contiguity (protein coverage >75%) and completeness (% of KOGs). Overall, the availability of these data will enhance studies on these interesting coral species.

The manuscript is well written, but one section required for Data Notes submissions appears to be absent. According to GigaScience's guide to authors, Data Notes should include "information on data collection, detailed data validation, and information on exactly how these data can be re-used." The authors cover the first two in detail but provide little information on how these data can be (re)used. An additional section following the "Evaluation of Assemblies" on the potential novel applications of these new data is needed to complete this manuscript. I believe these data are important to get out to the community, but minor revision is required.

Minor Comments:

Supplemental Figures 1-3: The use of color is misleading in these figures based on the figure legend description. When I pull up the Nematostella vectensis glycolysis/gluconeogenesis pathway (http://www.genome.jp/kegg-bin/show_pathway?nve00010), not all genes (boxes) are colored (i.e., present in the Nematostella genome). However, based on the figures I would guess that the new transcriptomes only have a small fraction of the Nematostella homologs (more white boxes than shaded boxes). To accurately present the new data compared to the genes present in Nematostella, a separate fill color should be used for (1) Nematostella-only, (2) new transcriptome-only, and (3) those shared by the two resources. For example, 2.7.1.147 was not found in the G. astreata assembly but is present in Nematostella. With separate colors, it will be clear that the transcriptomes have complete or nearly complete pathways but might differ slightly in the genes present.
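The three-way coloring suggested above amounts to a set comparison of the enzyme (EC) identifiers mapped in each resource. A minimal sketch follows, assuming each resource can be exported as a list of EC numbers; the function name and both example lists are hypothetical placeholders, not actual annotation results, with the exception of 2.7.1.147, which is placed in the reference-only group as described in the comment above.

```python
def classify_pathway_genes(reference_ecs, assembly_ecs):
    """Split EC numbers into the three fill-color groups for a pathway map."""
    reference = set(reference_ecs)
    assembly = set(assembly_ecs)
    return {
        "reference_only": sorted(reference - assembly),  # e.g. Nematostella-only
        "assembly_only": sorted(assembly - reference),   # new transcriptome-only
        "shared": sorted(reference & assembly),          # present in both resources
    }


# Hypothetical example lists (illustration only, NOT real annotation results).
reference_ecs = ["2.7.1.1", "2.7.1.147", "5.3.1.9"]
assembly_ecs = ["2.7.1.1", "5.3.1.9", "4.1.2.13"]

groups = classify_pathway_genes(reference_ecs, assembly_ecs)
for label, ecs in groups.items():
    print(label, ecs)
```

Each group would then be assigned its own fill color when the pathway map is drawn.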

Line 19: Remove "the first", which is written twice.

Line 73: Italicize Acropora tenuis.

Line 92: Provide the citation for the BBMap package.

Level of interest Please indicate how interesting you found the manuscript:
An article whose findings are important to those with closely related research interests.

Quality of written English Please indicate the quality of language in the manuscript:
Acceptable.

Declaration of competing interests Please complete a declaration of competing interests, considering the following questions: Have you in the past five years received reimbursements, fees, funding, or salary from an organisation that may in any way gain or lose financially from the publication of this manuscript, either now or in the future? Do you hold any stocks or shares in an organisation that may in any way gain or lose financially from the publication of this manuscript, either now or in the future? Do you hold or are you currently applying for any patents relating to the content of the manuscript? Have you received reimbursements, fees, funding, or salary from an organization that holds or has applied for patents relating to the content of the manuscript? Do you have any other financial competing interests? Do you have any non-financial competing interests in relation to this paper? If you can answer no to all of the above, write 'I declare that I have no competing interests' below. If your reply is yes to any, please give details below.
I declare that I have no competing interests.

I agree to the open peer review policy of the journal. I understand that my name will be included on my report to the authors and, if the manuscript is accepted for publication, my named report including any attachments I upload will be posted on the website along with the authors' responses. I agree for my report to be made available under an Open Access Creative Commons CC-BY license (http://creativecommons.org/licenses/by/4.0/). I understand that any comments which I do not wish to be included in my named report can be included as confidential comments to the editors, which will not be published.
I agree to the open peer review policy of the journal.

Authors' response to reviews: Reviewer 1

  1. The manuscript is well written, but one section required for Data Notes submissions appears to be absent. According to GigaScience's guide to authors, Data Notes should include "information on data collection, detailed data validation, and information on exactly how these data can be re-used." The authors cover the first two in detail but provide little information on how these data can be (re)used. An additional section following the "Evaluation of Assemblies" on the potential novel applications of these new data is needed to complete this manuscript. I believe these data are important to get out to the community, but minor revision is required.

Thank you for catching this oversight. We have now added a section on the potential for re-use as follows (L118-123): “These coral host-specific assemblies are sufficient for use as transcriptome references for Tag-based RNAseq (TagSeq) [30], a cost-effective method which was recently shown to be more accurate at quantifying gene expression levels than traditional RNAseq [31]. The fasta files and associated annotation files have been formatted for direct use in the TagSeq read mapping (https://github.com/z0on/tag-based_RNAseq) and GO-MWU analysis pipelines (https://github.com/z0on/GO_MWU)”.

  1. Supplemental Figures 1-3: The use of color is misleading in these figures based on the figure legend description. When I pull up the Nematostella vectensis glycolysis/gluconeogenesis pathway (http://www.genome.jp/kegg-bin/show_pathway?nve00010), not all genes (boxes) are colored (i.e., present in the Nematostella genome). However, based on the figures I would guess that the new transcriptomes only have a small fraction of the Nematostella homologs (more white boxes than shaded boxes). To accurately present the new data compared to the genes present in Nematostella, a separate fill color should be used for (1) Nematostella-only, (2) new transcriptome-only, and (3) those shared by the two resources. For example, 2.7.1.147 was not found in the G. astreata assembly but is present in Nematostella. With separate colors, it will be clear that the transcriptomes have complete or nearly complete pathways but might differ slightly in the genes present.

We have elected to replace this analysis with a more comprehensive assessment of transcriptome completeness suggested by Reviewer 2. We have replaced the KEGG map figures with a comparison to the Benchmarking Universal Single-Copy Ortholog (BUSCO v2) [29] set for metazoans. Our assemblies contain complete copies of 90% of these orthologs, on average. We have revised the text as follows (L93-96): “Transcriptome completeness was evaluated through comparison to the Benchmarking Universal Single-Copy Ortholog (BUSCO v2) [29] set for metazoans using the gVolante server (https://gvolante.riken.jp/analysis.html).” and (L114-116): “Of the 978 core BUSCO gene set for metazoans [29], 89.98-91.92% were found to be complete, while an additional 3.07-3.68% were partially assembled indicating that assemblies are fairly comprehensive (Table 1).”
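For readers who want to relate the quoted percentages to ortholog counts, the conversion can be sketched as below. This is a back-calculation from the 978-gene metazoan set and the percentage ranges quoted above; the resulting counts are derived for illustration and are not reported in the response itself, and the function names are hypothetical.

```python
TOTAL_BUSCOS = 978  # size of the metazoan BUSCO set quoted in the response


def percent_to_count(percent, total=TOTAL_BUSCOS):
    """Convert a reported percentage into the nearest whole number of orthologs."""
    return round(percent / 100 * total)


def count_to_percent(count, total=TOTAL_BUSCOS):
    """Convert an ortholog count back into a percentage with two decimals."""
    return round(count / total * 100, 2)


# Percentage ranges quoted in the response (minimum and maximum across assemblies).
complete_range = (89.98, 91.92)
partial_range = (3.07, 3.68)

complete_counts = tuple(percent_to_count(p) for p in complete_range)
partial_counts = tuple(percent_to_count(p) for p in partial_range)

print("complete orthologs:", complete_counts)
print("partial orthologs:", partial_counts)
```

Because the ranges are minima and maxima across assemblies, the lower "complete" value and the lower "partial" value do not necessarily belong to the same assembly.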

  1. Line 19: Remove "the first", which is written twice.

Corrected.

  1. Line 73: Italicize Acropora tenuis.

Italicized.

  1. Line 92: Provide the citation for the BBMap package.

Citation added.

Reviewer 2

  1. Overall, the article is well written and easy to understand. I think the authors should add a few sentences to the introduction on why they chose these particular corals and why they heat-stressed them; the motivation for this experiment is a bit unclear to a reader who might be outside the coral world.

We pooled control and heat-treated RNA samples from the same individual to maximize transcript representation, given that coral gene expression is modified under thermal stress. We refer readers to the manuscript describing the motivation for the thermal stress experiment and have now added a clarification on sample pooling as follows (L44-48): “To generate more comprehensive reference transcriptomes, 4-5 replicate cores of a single colony were subject to a two-week temperature stress experiment as described in [18] and paired samples from control (27°C) and heat (31°C) treatments were snap frozen in liquid nitrogen on day 2, day 4 and day 17 (Table 1, note for G. acrhelia, heat-treated fragments only included for day 4 and day 17).”

  1. Line 61: using a custom Perl script … : I suggest that the authors create a permanent branch/fork of the version of the GitHub repository that was used. Since this is a git-based system, any future commits to the repository might produce different results for those who try to re-run the analysis.

Thank you for this suggestion; we have replaced the master link with a link to a forked version.

  1. Line 73: "Acropora tenuis" should be in italics.

Italicized.

  1. Figures S1, S2, S3: I am not sure if these KEGG maps are needed. If the authors really want to show the completeness of their transcriptome, I recommend they utilize this tool (https://gvolante.riken.jp/analysis.html), which would give an accurate measure of how many families they covered. This is especially relevant because there is no explanation of why these three pathways were chosen over, for example, other core metabolic pathways.

Thank you for suggesting this alternative analysis. We have replaced the KEGG map figures with a comparison to the Benchmarking Universal Single-Copy Ortholog (BUSCO v2) [29] set for metazoans. Our assemblies contain complete copies of 90% of these orthologs, on average. We have revised the text as follows (L93-96): “Transcriptome completeness was evaluated through comparison to the Benchmarking Universal Single-Copy Ortholog (BUSCO v2) [29] set for metazoans using the gVolante server (https://gvolante.riken.jp/analysis.html).” and (L114-116): “Of the 978 core BUSCO gene set for metazoans [29], 89.98-91.92% were found to be complete, while an additional 3.07-3.68% were partially assembled indicating that assemblies are fairly comprehensive (Table 1).”

  1. I assume the authors will include the assembled transcriptomes with this GigaScience Data Note; if not, I recommend they upload them to other data sharing websites such as Dryad or figshare, which should guarantee more permanence and access to the assemblies.

We have now also archived the transcriptomes at the Australian Institute of Marine Science Data Centre, which is a permanent, open access repository, and have revised the section on Data Accessibility as follows (L126-131): “Raw reads are archived at NCBI’s SRA under project numbers PRJNA350363: Goniopora columna, PRJNA352640: Galaxea acrhelia, PRJNA352641: Galaxea astreata. The assembled transcriptomes and associated annotation files can be obtained from http://dornsife.usc.edu/labs/carlslab/data/ or from the Australian Institute of Marine Science Data Centre at http://data.aims.gov.au/metadataviewer/faces/view.xhtml?uuid=3c2d31c9-b921-491c-ae27-0d169fa98c84.”

Source

    © 2017 the Reviewer (CC BY 4.0).

References

    Kenkel, C. D., Bay, L. K. 2017. Novel transcriptome resources for three scleractinian coral species from the Indo-Pacific. GigaScience.