Content of review 1, reviewed on June 01, 2016

Overall this is a fascinating paper that outlines the insights from genomic analyses of one of the most enigmatic and charismatic fishes in the ocean, the Ocean sunfish (Mola mola). We have a number of suggestions for the authors that will lead to a more comprehensive manuscript. Please find our comments below.

As a referee we ask that you assess the paper on its own merits. The following list of potential issues may be helpful.

1. Is the rationale for collecting and analyzing the data well defined? Is the work carried out on a dataset that can be described as "large-scale" within the context of its field? Does it clearly describe the dataset and provide sufficient context for the reader to understand its potential uses? Does it properly describe previous work?

The motivation for conducting this research was well described in a solidly written introduction. The raw data appears to be of high quality and coverage relative to other work in the field and certainly this data set will be of significant interest to the research community. The types of analyses carried out are creative and will be informative but require some changes to ensure their accuracy and replicability.

2. Is it clear how data was collected and curated?

Credit should be given for transparency and provision of all supporting information. It would be helpful if the sex of the individual sequenced was specified. Because the sample was obtained in 1998 it would also be useful to know the storage conditions of the blood prior to DNA extraction or, if the DNA was extracted at that time, the storage conditions of the DNA between extraction and sequencing. The method of DNA extraction should also be specified.

3. Is it clear - and was a statement provided - on how data and analyses tools used in the study can be accessed?

While we make every effort to make sure this information is available, we appreciate reviewers providing an extra eye to make absolutely certain that this information is clearly stated and properly available. Data availability and access to tools are essential for reproducibility and provide the best means for reuse.

There are a few instances where it is not clear what tools were used to perform certain analyses. Please see detailed comments below.

4. Are accession numbers given or links provided for data that, as a standard, should be submitted to a community approved public repository?

Following community standards for data sharing is a requirement of the journal. Additionally, data sharing in the broadest possible manner expands the ways in which data and tools can be accessed and used.

At the moment there are only tentative NCBI accession numbers given for the assembled genome (XXXXX). I can see that on NCBI a BioProject (Accession: PRJNA305960 ID: 305960) and BioSample (SAMN04335856) have already been registered which is good. It would be helpful if all of the different insert size libraries (additional file 1: table S1) were named in the table and when the raw reads are submitted to the SRA for easy cross-referencing.

5. Is the data and software available in the public domain under a Creative Commons license?

Note, that unless otherwise stated, data hosted in our database (GigaDB) is available under a CC0 waiver. Additionally, did the authors indicate where the software tools and relevant source code are available, under an appropriate Open Source Initiative compliant license? If the source code is currently not in a hosted repository, we can help authors copy it over to a GigaScience GitHub repository.

6. Are the data sound and well controlled?

If you feel that inappropriate controls have been used please say so, indicating the reasons for your concerns, and suggesting alternative controls where appropriate. If you feel that further experimental/clinical evidence is required for obtaining solid biological conclusions and substantiating the results, please provide details.

Portions of the analysis, especially the definition of the gene family clusters, require more careful definition and clarification. The text regularly refers to 'single-copy' genes but often does not clearly define the criteria for classifying genes as 'single-copy' and perhaps as a consequence the text reads as internally inconsistent, with different analyses using different datasets of 'single-copy' genes ranging from 1,690 to 3,738 to 10,660. Rather than calling each of these datasets 'single-copy genes' a more specific descriptor should be used for the phylogenetic level at which homology was assessed and whether or not multiple paralogs are present in each case. For example the 1,690 gene set could be called 'single-copy ray-finned fish homologs' as this dataset should comprise only cases where gar and all teleost genomes contain only a single homolog, while the 10,660 gene set could be called simply 'teleost homologs' as this dataset is restricted to teleosts but includes cases where multiple paralogs (e.g. igf1ra, igf1rb) are present and therefore are not 'single-copy'. It is not clear to me if or where the 3,738 'single-copy orthologous' gene set was used or whether this gene set includes paralogs or not. I would additionally urge caution in describing genes as 'orthologous' where simple phenetic (i.e. BLAST-based) methods are used to classify them. Orthology has a very specific phylogenetic meaning and in the context of teleost genomes especially it is important to distinguish between orthologous and paralogous sequences. Where orthology and paralogy have not been assessed using phylogenetic methods the general term 'homology' should be used instead.

7. Is the interpretation (Analysis and Discussion) well balanced and supported by the data?

The interpretation should discuss the relevance of all the results in an unbiased manner. Are the interpretations overly positive or negative? Note that the authors may include opinions and speculations in an optional 'Potential Implications' section of the manuscript; thus, if there is material in other parts of the manuscript that you feel would be better suited in such a section, please state that. Conclusions drawn from the study should be valid and result directly from the data shown, with reference to other relevant work as applicable. Have the authors provided references wherever necessary?

The authors are appropriately careful in drawing biological conclusions from their data and throughout the analysis and discussion always imply potential roles rather than implying direct causality.

8. Are the methods appropriate, well described, and include sufficient details and supporting information to allow others to evaluate and replicate the work?

Please remark on the suitability of the methods for the study.

If statistical analyses have been carried out, please indicate if you feel they need to be assessed specifically by an additional reviewer with statistical expertise.

In some cases more detailed descriptions of the methodology including parameters are needed. See details below.

9. What are the strengths and weaknesses of the methods?

Please comment on any improvements that could be made to the study design to enhance the quality of the results. If any additional experiments are required, please give details. If novel experimental techniques were used please pay special attention to their reliability and validity.

In some instances methodological improvements during analysis seem to be necessary to meet minimum requirements for publication. Please see below for details.

10. Have the authors followed best-practices in reporting standards?

This is an essential component as ease of reproducibility and usability are key criteria for manuscript publication. Please note, the methodology sections should never contain "protocol available upon request" or "e-mail author for detailed protocol". Have the authors followed and used reporting checklists recommended by the Biosharing network and if the methods are amenable, have the authors used workflow management systems such as Galaxy, Taverna or one of the many related systems listed on MyExperiment? We can also host these in our Giga-Galaxy server if they currently do not have a home. We also encourage use of virtual machines and containers such as Docker, and the use and deposition of both wet-lab and computational protocols in a protocols repository like protocols.io.

In some cases additional details are required, particularly for methodology during the annotation and homologous gene cluster building.

11. Can the writing, organization, tables and figures be improved?

Although the editorial team may also assess the quality of the written English, please do comment if you consider the standard is below that expected for a scientific publication.

If the manuscript is organized in such a manner that it is illogical or not easily accessible to the reader please suggest improvements. Please provide feedback on whether the data are presented in the most appropriate manner; for example, is a table being used where a graph would give increased clarity? Do the figures appear to be genuine, i.e. without evidence of manipulation, and of a high enough quality to be published in their present form?

The manuscript is clearly written. I have suggested moving analysis of bone-forming genes to the main text as it warrants attention. Some minor changes to figures have been recommended. Please see below for details.

12. When revisions are requested.

Reviewers may recommend revisions for any or all of the following reasons: the data require additional testing to ensure their quality, additional data are required to support the authors' conclusions; better justification is needed for the arguments based on existing data; or the clarity and/or coherence of the paper needs to be improved.

Several changes and/or clarifications are necessary prior to being published. Please see below for details.

13. Are there any ethical or competing interests issues you would like to raise?

The study should adhere to ethical standards of scientific/medical research and the authors should declare that they have received ethics approval and/or patient consent for the study, where appropriate.

Whilst we do not expect reviewers to delve into authors' competing interests, if you are aware of any issues that you do not think have been adequately addressed, please inform the Editorial office.

No issues.

Detailed Revision Requests

Introduction

63 "other tetraodontid fishes such as pufferfish, boxfish and triggerfish"

KM1: This should be changed to tetraodontiform fishes to refer to the whole order and to avoid confusion with the family Tetraodontidae (pufferfishes only).

Genome assembly and annotation

KM2: The number and sizes of the different libraries should be mentioned in the main text along with an abbreviated version of the assembly metrics (perhaps reporting only N50, contig number, scaffold number, and total size), possibly in a brief concatenated figure combining S1, S2, and S3.

KM3: It should be made clear in the text that the estimate of 134X coverage is based on a (later described) k-mer counting method of genome size estimation. Table S1 also says 131X coverage rather than 134X so whichever is correct should be used.

KM4: My preference would also be that the coverage statistics reported in the main text should refer to the reads actually used to produce the assembly and not the discarded data, i.e. table S2 "statistics of clean reads", as this is a more accurate reflection of what was used for producing the assembly used in all downstream analyses, so 96X coverage rather than 131X.

KM5: The number of reads from each library actually used to produce the assembly should be reported so it is clear how much of the "clean" 68.87Gb was used by the assembler and how much was discarded. If this isn't available as a direct output from SOAPdenovo, all of the clean reads should be realigned to the genome assembly (e.g. using bwa or bowtie) and it should be reported what proportion of the clean reads align uniquely and concordantly to the genome assembly. This will also give a good idea of the completeness of the assembly.
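To illustrate the kind of summary requested in KM5, here is a minimal sketch that tallies properly paired, confidently mapped reads from SAM-formatted alignment records. It is not the authors' pipeline: the function name `alignment_summary` and the `min_mapq` cutoff of 20 are assumptions, and using mapping quality as a proxy for "unique" alignment is a simplification whose meaning depends on the aligner.

```python
def alignment_summary(sam_lines, min_mapq=20):
    """Count primary alignments that are mapped, properly paired and confident.

    sam_lines: iterable of SAM text lines (header lines start with '@').
    min_mapq:  hypothetical threshold used here as a proxy for unique placement.
    Returns (primary_records, confident_proper_pairs, fraction).
    """
    total = 0
    confident_proper = 0
    for line in sam_lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])   # column 2: bitwise FLAG
        mapq = int(fields[4])   # column 5: mapping quality
        # Ignore secondary (0x100) and supplementary (0x800) records so that
        # each read is counted once.
        if flag & 0x100 or flag & 0x800:
            continue
        total += 1
        is_mapped = not (flag & 0x4)       # 0x4 = read unmapped
        is_proper_pair = bool(flag & 0x2)  # 0x2 = mapped in a proper pair
        if is_mapped and is_proper_pair and mapq >= min_mapq:
            confident_proper += 1
    fraction = confident_proper / total if total else 0.0
    return total, confident_proper, fraction
```

Reporting this fraction per library, together with the aligner, its version and its parameters, would make it clear how much of the clean data is represented in the assembly.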

KM6: The parameters used for the SOAPdenovo assembly need to be stated. Justification for why these parameters and not others were used should be given, even if you only decided on these parameters a posteriori after comparing assemblies. Did you try a range of parameters and compare assembly metrics? Did you try a range of assembly programs? If yes this should be stated and summarized as a supplementary table. If not then it needs to be made clear that you only produced one assembly and did not compare, but the parameters you used still need to be stated.

KM7: The programs used for filtering, trimming and/or correcting the raw reads need to be stated along with the thresholds for calling a read or a base "low quality" and discarding it.

KM8: More detailed results from the CEGMA analysis should be provided. Did you identify 98.4% predicted 'full-length' proteins, or only partial proteins? Please report both values. Although I think CEGMA is still a useful tool, the authors should note that CEGMA is no longer maintained by the creators and they have released an alternative (BUSCO): http://www.acgt.me/blog/2015/5/18/goodbye-cegma-hello-busco

KM9: Given that the assembly comprised 642Mb (88%) of a 730Mb genome size estimated by the authors using a k-mer counting method, it would be useful to have some discussion of sunfish genome size estimated by other methods, e.g. flow cytometry, see (Brainerd, E.L. et al., 2001. Patterns of Genome Size Evolution in Tetraodontiform Fishes. Evolution, 55(11), pp. 2363-2368) and some personal communications by the authors themselves communicated in T. Ryan Gregory's genome size database (http://www.genomesize.com/) which both suggest even larger genome sizes for sunfish. A stringent realignment of the clean reads to the genome assembly should also give an idea of what proportion of the read data has been used by the assembly and what proportion has been discarded.

KM10: For the estimation of genome size using k-mer analysis, please state the tools used to make the calculation. How was the depth of 17-mers counted? Is this an output of SOAPdenovo or another program like jellyfish?
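For readers unfamiliar with the calculation queried in KM9 and KM10, the sketch below shows one common form of k-mer based genome size estimation: the total number of k-mer occurrences divided by the depth at the main peak of the k-mer depth histogram. Whether the authors used exactly this formula is not stated in the manuscript, so this is an assumption. The function names, the `min_depth` cutoff for discarding likely error k-mers, and the toy histogram in the usage note are all hypothetical.

```python
def estimate_genome_size(kmer_histogram, min_depth=5):
    """Estimate genome size from a k-mer depth histogram.

    kmer_histogram: dict mapping depth -> number of distinct k-mers seen
                    at that depth (e.g. parsed from a k-mer counter's output).
    min_depth:      hypothetical cutoff; k-mers below it are treated as
                    sequencing errors and ignored.
    Returns (estimated_genome_size, peak_depth).
    """
    usable = {d: c for d, c in kmer_histogram.items() if d >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # Depth with the most distinct k-mers approximates the k-mer coverage.
    peak_depth = max(usable, key=usable.get)
    # Total k-mer occurrences = sum over depths of depth * distinct k-mers.
    total_kmers = sum(d * c for d, c in usable.items())
    return total_kmers / peak_depth, peak_depth


def assembled_fraction(assembly_bp, genome_bp):
    """Fraction of the estimated genome size represented by the assembly."""
    return assembly_bp / genome_bp
```

With the values quoted in KM9, `assembled_fraction(642e6, 730e6)` gives about 0.88. A larger genome size estimate from another method would lower this fraction, which is why the source of the estimate matters.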

97-98 "The sunfish genome comprises approximately 11% repetitive sequences, which is comparable to the repeat content of the fugu genome (Figure 1)."

KM11: It could be made clearer in the main text if the figure of 11% refers to interspersed repeats only or is a combination including transposable elements, tandem repeats, and simple sequence repeats. A breakdown of transposable element composition by type should be accessible from the RepeatMasker runs already carried out and would enhance this analysis and should be included in the supplementary data.

99-100 "Using homology-based and de novo annotation methods, we predicted 19,605 protein-coding genes in the sunfish assembly"

KM12: The type of homology-based and de novo annotation methods should be mentioned in the main text (i.e. tBLASTn against protein predictions from 5 genomes and AUGUSTUS). In the methods it should be described what the cut-off thresholds for tBLASTn alignments were and what criteria for annotating the sunfish homolog were used (i.e. where more than one protein aligned did you choose the one with the greatest length, %ID, E-Value?). Because the final gene set merged with GLEAN also contains AUGUSTUS predictions please also report the sensitivity and specificity of the AUGUSTUS parameters chosen during the training.
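To make the accuracy values requested in KM12 concrete, the following sketch computes them from counts of correct, spurious and missed predictions on a held-out test set. It assumes the convention common in the gene-prediction literature, where "specificity" is the fraction of predictions that are correct (TP / (TP + FP)) rather than the true-negative rate. The function name and the counts in the usage note are hypothetical.

```python
def prediction_accuracy(true_positive, false_positive, false_negative):
    """Sensitivity and specificity of gene predictions on a test set.

    Counts can be taken at any level (nucleotide, exon or gene).
    sensitivity = TP / (TP + FN): fraction of annotated features recovered.
    specificity = TP / (TP + FP): fraction of predicted features that are
                  correct (gene-prediction convention, assumed here).
    """
    sensitivity = true_positive / (true_positive + false_negative)
    specificity = true_positive / (true_positive + false_positive)
    return sensitivity, specificity
```

For example, 80 correctly predicted exons with 20 spurious and 10 missed exons would give a sensitivity of about 0.89 and a specificity of 0.80.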

101-104 "Using a genome-wide set of 1,690 one-to-one orthologs in sunfish and seven other ray-finned fishes (fugu, Tetraodon, stickleback, medaka, tilapia, zebrafish and spotted gar), we reconstructed a phylogenetic tree and estimated the divergence times of various fish lineages using MCMCtree [8]."

KM13: It needs to be clearly stated how this set of 1,690 one-to-one orthologs was chosen and verified. Ensembl is a large database with many types of export tools. Please specify the tools used and the thresholds/criteria used for defining one-to-one orthologs. Please also report the genome assembly and annotation version for each genome separately rather than the Ensembl release version. A supplementary file containing the gene names and accession numbers for each of the additional ray-finned fish genes and the corresponding sunfish gene model numbers used to form each cluster would be necessary to make this analysis reproducible.

Figure 1

KM14: The bootstrap support for each of the nodes in the tree should be reported on the figure. The figure (preferably) or at least the legend needs to specify which assembly and annotation version of each of the genomes reported are being used to source the values for genome size, repeat content, and number of genes. If the repeat content comes from your own analysis rather than the published genomes this should be made clear as well. The value of 1.3% for the spotted gar repeat content is very different from the reported value of 20% from the gar genome paper (Braasch, I. et al., 2016. The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons. Nature Genetics, 48(4)) and this should be double-checked.

Population size history.

KM15: Having never carried out such analyses my expertise is limited here but I would appreciate a very brief explanation of the core methodology of PSMC in the text or methods and a brief justification of its use highlighting its potential strengths and weaknesses. Preferably cite one or two examples that show that PSMC analysis is appropriate for comparing genomes which diverged >50 mya rather than 250 thousand years ago (over two orders of magnitude difference) as this seems like it might be problematic.

Positively-selected and fast-evolving genes

127-128 "Using a set of 10,660 one-to-one orthologues from five teleost species (sunfish, fugu, Tetraodon, medaka and zebrafish) we conducted positive selection analyses"

KM16: Calling this 10,660 gene set 'one-to-one orthologues' is confusing as it contains multiple paralogs present in different quantities in different teleost genomes. It should be described how many sunfish paralogs are found in each case, and whether the subsequent selection analyses used the teleost 'a' or 'b' paralogy groups as the sunfish genes do not seem to be classified within the teleost 'a' or 'b' paralogy groups. For example, insulin-like growth factor 1 receptor (igf1r) is present as 2 paralogs in fugu, Tetraodon, medaka and zebrafish (igf1ra, igf1rb) but only one sunfish homolog (Sunfish09150) is reported in the selection analyses (Table S6, S7). Is this the ortholog of igf1ra or igf1rb? Table S8 suggests 2 copies of igf1r are found in sunfish and reports 2 dN/dS values and LRT p-values but doesn't distinguish which is which. Furthermore, the LRT p-values reported in table S6 and S7 don't correspond with those reported in table S8 (5.78 x 10^-4 for the one igf1r paralog presented in S6, and 3.64 x 10^-7, 2.3 x 10^-3 for the two igf1r paralogs presented in S8). It would help if the sunfish gene models were annotated with 'a' or 'b' if this has been assessed - and if orthology hasn't been assessed calling them (1 of 2) and (2 of 2) would be more appropriate. If different paralogs, rather than orthologs, were used in any alignments the dN/dS estimations and inferences of evolutionary rates are meaningless so it is crucial that the methods used to assess orthology are careful and clearly described.

395-397 "We picked up genes whose likelihood values of H1 are significantly larger (LRT p-value of <0.05) than H0 and likelihood values of H2 are not significantly larger than H1."

KM17: During the hypothesis testing it would also be more appropriate to select genes whose likelihood values of H1 (sunfish evolving independently from rest of the tree) are significantly greater than both H0 (all branches evolving at the same rate) and H2 (all branches evolving independently) before then sorting from this set which sunfish genes have a larger ω (dN/dS). It would also be interesting to report which sunfish genes have a lower ω as this might imply a greater amount of constraint.

144-147 "Using the branch models in PAML [20], we found multiple genes in the GH/IGF1 axis (ghr1, igf1r, grb2, irs1, irs2, jak2, stat5, akt3) with significantly higher dN/dS values compared to other lineages, suggesting that these genes are evolving rapidly in the sunfish lineage"

KM18: Contrary to the above statement, the authors are not reporting sunfish genes with significantly higher dN/dS than other lineages but rather sunfish genes for which hypothesis H1 (sunfish genes evolving at a different rate from the rest of the tree) is a significantly better hypothesis than H0 (all branches evolving equally). There are also multiple examples (both paralogs of irs1, one of the paralogs of irs2, one of the paralogs of jak2, and stat5) where the dN/dS value in sunfish is actually lower than the background dN/dS implying the sunfish genes are actually evolving slower than the background.
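The distinction drawn in KM17 and KM18 can be shown with a short sketch: a significant likelihood ratio test between H0 and H1 only says that the sunfish branch has a different rate, and the direction has to be read separately from the two dN/dS estimates. The sketch assumes H1 adds exactly one free parameter relative to H0, so the test statistic is compared with a chi-square distribution with 1 degree of freedom, for which the upper tail equals erfc(sqrt(x/2)). Function names and the numbers in the usage note are hypothetical and are not taken from the manuscript's tables.

```python
import math


def lrt_pvalue_df1(lnl_null, lnl_alt):
    """Likelihood ratio test with 1 degree of freedom.

    Returns (statistic, p_value), where statistic = 2 * (lnL_alt - lnL_null).
    For 1 df, the chi-square upper tail is erfc(sqrt(statistic / 2)).
    """
    statistic = max(2.0 * (lnl_alt - lnl_null), 0.0)
    return statistic, math.erfc(math.sqrt(statistic / 2.0))


def classify_branch(lnl_h0, lnl_h1, omega_sunfish, omega_background, alpha=0.05):
    """Label a gene by LRT significance and by the direction of the rate shift."""
    _, p_value = lrt_pvalue_df1(lnl_h0, lnl_h1)
    if p_value >= alpha:
        return "no significant rate difference"
    if omega_sunfish > omega_background:
        return "faster in sunfish"
    return "slower in sunfish"
```

For instance, `classify_branch(-1000.0, -998.0, 0.1, 0.2)` yields a significant test (p of roughly 0.046) but the label "slower in sunfish", which is the situation described for several genes in table S8.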

Table S8. Copy number and LRT p-values of sunfish genes in the GH/IGF-1 axis.

KM19: This should be changed to "select genes in the GH/IGF-1 axis" as this is not a comprehensive list of genes involved in this pathway.

131-132 "we identified 1117 genes that contained positively-selected sites specifically in sunfish (Additional file 3: Table S7)."

KM20: The authors should report how many sites (either absolute number or proportion of coding sequence) appear to be under positive selection for each of these cases in their supplementary data. Could the authors please also clarify whether their claim that these 1117 genes contained positively-selected sites specifically in sunfish means that the sites or that the genes show signs of positive selection only in sunfish.

132-133 "Inspection of the fast-evolving and positively-selected gene sets revealed several interesting genes."

KM21: 'Positively-selected genes' should be replaced with 'genes with positively selected sites' as none of the genes showed outright signs of positive selection (dN/dS > 1).

KM22: Ideally the authors would perform a type of overrepresentation analysis using for example GO or KEGG pathway terms to determine without bias whether the GH/IGF pathway, ECM components, or bone formation for example turn up more or less frequently than expected at random in their set of 'rapidly-evolving' or 'positively-selected' genes. Otherwise it should be made clear that the authors specifically looked at genes in the GH/IGF pathway and ECM. For example "we examined genes in the GH/IGF pathway" rather than "inspection revealed" as this implies that these genes somehow stood out from the rest of the data - which might be the case but without an overrepresentation analysis it is not clear.

144-147 "we found multiple genes in the GH/IGF1 axis (ghr1, igf1r, grb2, irs1, irs2, jak2, stat5, akt3) with significantly higher dN/dS values compared to other lineages, suggesting that these genes are evolving rapidly in the sunfish lineage"

KM23: Again here as I understand it the analysis tested whether there was a significant difference between H1 and H0, not whether there was a significant difference in dN/dS between sunfish and other lineages. If this is a separate analysis it should be clearly stated. Furthermore several dN/dS values reported for sunfish in table S8 are actually lower than the background reported.

147-149 "We found that both copies of igf1r (igf1ra and igf1rb) are under positive selection in the sunfish (Figure 2, Additional file 1: Table S8)"

KM24: Here please also replace "under positive selection" with "contain sites under positive selection". The same applies to ECM analysis. If you have indeed assessed orthology with igf1ra and igf1rb please make this clear in earlier methods sections and report orthology in table S7, S8 and elsewhere.

190-192 "However, the sunfish possesses intact orthologues for most of these genes except for some SCPP genes (see Supplementary Material)"

KM25: I find it disjointed that this analysis alone is described in supplementary materials. As it is integral to the motivation for conducting the study the analysis of bone forming genes should be included in the main text.
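As an illustration of the overrepresentation analysis suggested in KM22, the sketch below computes a one-sided hypergeometric p-value for a single pathway: the probability of seeing at least the observed number of pathway genes in the selected set if genes were drawn at random from all tested genes. This is one standard choice among several, not a prescription, and a real analysis would also need correction for testing many terms. The function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, pathway_in_population,
                                selected, pathway_in_selected):
    """One-sided hypergeometric test for overrepresentation.

    population:            number of genes tested (background set)
    pathway_in_population: background genes annotated to the pathway
    selected:              genes in the set of interest (e.g. fast-evolving)
    pathway_in_selected:   selected genes annotated to the pathway
    Returns P(X >= pathway_in_selected) under random sampling without
    replacement.
    """
    upper = min(pathway_in_population, selected)
    favourable = sum(
        comb(pathway_in_population, i)
        * comb(population - pathway_in_population, selected - i)
        for i in range(pathway_in_selected, upper + 1)
    )
    return favourable / comb(population, selected)
```

In the present case the background would be the genes actually tested for selection and the selected set would be the genes reported as fast-evolving or as containing positively selected sites. The pathway annotation counts would have to come from the authors' GO or KEGG assignments.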

Additional File 1

"We identified orthologues for all the above genes in the ocean sunfish genome on (a) scaffold10.1, (b) scaffold39.1, (c) scaffold20.1, and (d) scaffold77.1, except Optc and Omd."

KM26: Please state how you identified these homologs. Did you perform tblastn, or tblastx genome wide against your assembly and what did you use as your query sequences? What were the similarity thresholds you used?

"We BLASTX-searched the ocean sunfish loci of (a) and (b) to identify Optc and Omd respectively, but did not identify these genes."

KM27: Again, please clarify the type of BLAST algorithm you ran and the query and target sequences you used. The above statement implies you used blastx to run the sunfish scaffolds as a query against a database containing Optc and Omd protein sequences. Is this correct? Which species were the Optc and Omd proteins sourced from? What were the cutoff parameters used?

"An alignment of Runx2 proteins shows that ocean sunfish Runx2 is highly conserved (e.g. its DNA-binding domain is perfectly conserved; its central and C-terminal domains look intact as well) (data not shown)."

KM28: I have no reason to doubt this but if you are reporting it I suggest you show the data especially as your supplementary data is not restricted.

KM29: The analysis of presence/absence of each of the target bone-formation related genes should be presented in a table (in either the main text or SI). In each case where homologs of bone-formation genes were found in sunfish the exact number of homologs found should be stated. E.g. "For Smad4, we identified up to four copies in ocean sunfish" is confusing and the exact number should be reported.

200-201 "However, it has lost two P/Q-rich SCPP genes (fa93e10 and scpp7) that are conserved in the other two teleosts"

KM30: Before concluding gene loss please make it clear if you have searched the whole genome assembly and not just the identified clusters for these genes, and whether you have also searched the raw genomic reads which may contain unassembled reads corresponding to the missing genes.

KM31: Because of the complex duplication history of SCPP genes I would consider it essential to carefully assess homology of each of the genes in the P/Q-rich SCPP gene cluster with phylogenetic methods to ensure that scpp7 is indeed lost and that additional sunfish SCPP genes reported as scpp3b1 and scpp3b2 for example are not actually orthologs of scpp4, and that the reported pseudogene of scpp4 is not in fact scpp7.

KM32: To confirm that scpp4 is indeed a pseudogene and that the insertion of the "T" is not a sequencing/assembly error please report the results of a read re-mapping to this locus to verify that the additional "T" is present in most raw reads which realign to this site.
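The check requested in KM32 amounts to counting, among reads that span the site, how many carry the insertion. The sketch below does this from alignment start positions and CIGAR strings, assuming the reads were mapped to a reference that does not contain the extra base, so that a supporting read shows an "I" operation at the site. If reads are instead mapped to the assembly that already includes the "T", the equivalent check is the fraction of spanning reads without a deletion there. Function names and the toy reads in the usage note are hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = "MDN=X"  # operations that advance along the reference


def has_insertion_after(read_start, cigar, ref_pos):
    """True if the read has an insertion directly after 1-based ref_pos."""
    next_ref = read_start  # reference position of the next aligned base
    for length, op in CIGAR_OP.findall(cigar):
        if op in REF_CONSUMING:
            next_ref += int(length)
        elif op == "I" and next_ref - 1 == ref_pos:
            return True  # insertion lies between ref_pos and ref_pos + 1
    return False


def insertion_support(reads, ref_pos):
    """Count reads spanning ref_pos..ref_pos+1 and those with the insertion.

    reads: iterable of (start_position, cigar_string) tuples.
    Returns (supporting_reads, spanning_reads).
    """
    supporting = spanning = 0
    for start, cigar in reads:
        ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar)
                      if op in REF_CONSUMING)
        end = start + ref_len - 1
        if start <= ref_pos and end >= ref_pos + 1:
            spanning += 1
            if has_insertion_after(start, cigar, ref_pos):
                supporting += 1
    return supporting, spanning
```

As a usage example, `insertion_support([(100, "10M1I10M"), (100, "20M")], 109)` counts two spanning reads, one of which supports the insertion. Reporting these two numbers for the scpp4 locus would address the concern directly.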

Hox genes

KM33: In figure S3 a more appropriate or additional outgroup for analysis of Hox clusters in teleosts would be the spotted gar, which the authors have also previously used in their own analyses. See (Braasch, I. et al., 2016. The spotted gar genome illuminates vertebrate evolution and facilitates human-teleost comparisons. Nature Genetics, 48(4)). The figure would be improved if the authors marked the independent gene losses which occurred on each branch to highlight the differences in sunfish from other teleosts. It should also be reported what scaffold numbers in the sunfish assembly each Hox cluster corresponds to, in a similar fashion as reported for SCPP genes in Figure 4.

Are the methods appropriate to the aims of the study, are they well described, and are necessary controls included? If not, please specify what is required in your comments to the authors.

No

Are the conclusions adequately supported by the data shown? If not, please explain in your comments to the authors.

Yes

Does the manuscript adhere to the journal’s guidelines on minimum standards of reporting (http://resourcecms.springer.com/springercms/rest/v1/content/7117202/data/v1/Minimum+standards+of+reporting+checklist)? If not, please specify what is required in your comments to the authors.

Yes

Are you able to assess all statistics in the manuscript, including the appropriateness of statistical tests used? (If an additional statistical review is recommended, please specify what aspects require further assessment in your comments to the editors.)

Yes, and I have assessed the statistics in my report.

Quality of written English

Please indicate the quality of language in the manuscript:

Acceptable

Declaration of competing interests

Please complete a declaration of competing interests, consider the following questions:

1. Have you in the past five years received reimbursements, fees, funding, or salary from an organization that may in any way gain or lose financially from the publication of this manuscript, either now or in the future?
2. Do you hold any stocks or shares in an organization that may in any way gain or lose financially from the publication of this manuscript, either now or in the future?
3. Do you hold or are you currently applying for any patents relating to the content of the manuscript?
4. Have you received reimbursements, fees, funding, or salary from an organization that holds or has applied for patents relating to the content of the manuscript?
5. Do you have any other financial competing interests?
6. Do you have any non-financial competing interests in relation to this manuscript?

If you can answer no to all of the above, write ‘I declare that I have no competing interests’ below. If your reply is yes to any, please give details below.

I declare that I have no competing interests.

I agree to the open peer review policy of the journal. I understand that my name will be included on my report to the authors and, if the manuscript is accepted for publication, my named report including any attachments I upload will be posted on the website along with the authors' responses. I agree for my report to be made available under an Open Access Creative Commons CC-BY license (http://creativecommons.org/licenses/by/4.0/). I understand that any comments which I do not wish to be included in my named report can be included as confidential comments to the editors, which will not be published.

I agree to the open peer review policy of the journal.

Authors' response to reviews: (https://static-content.springer.com/openpeerreview/art%3A10.1186%2Fs13742-016-0144-3/13742_2016_144_AuthorComment_V1.pdf)


Source

    © 2016 the Reviewer (CC BY 4.0 - source).

References

    Pan, H., Yu, H., Ravi, V., Li, C., Lee, A. P., Lian, M. M., Tay, B.-H., Brenner, S., Wang, J., Yang, H., Zhang, G., Venkatesh, B. 2016. The genome of the largest bony fish, ocean sunfish (Mola mola), provides insights into its fast growth rate. GigaScience.