Content of review 1, reviewed on September 24, 2015

First of all: I love the pun in the title! I have reviewed this paper from my perspective as a naturalist-taxonomist-conservationist. I enjoyed reading it very much (this is exciting stuff!) but I do miss some realism, placing this research within the true challenges of biodiversity research and conservation.

This is a well-written and clear review of the topic at hand. I identify very much with the premise: given my own breadth of interests as a biologist, I too am frustrated that "different subdisciplines of ecology and evolution do not generally cross these different levels, and they continue to be defined by their specific questions and various logistical constraints. […] Thus, there is a great need for unifying these various subfields, which requires the use of universal character systems for tracking ecological and evolutionary processes." The question posed by the review is whether mito-metagenomics, used to obtain "whole-community phylogenetic and population-genetic composition […] to follow spatial and temporal change in [these communities]", is the solution.

The paper aims to review both the method and its advantages. As explained, the method is effectively "metabarcoding plus" (i.e. to "skip the laborious process of handling and sorting of specimens required in standard taxonomic approaches"), working with less material and bias, and with the added phylogenetic component. Some worries with the method are:
(1) The usual problems of reconstructing phylogenies with only mtDNA...
(2) You admit that "bulk MMG […] is biased towards the recovery of the most abundant species in a sample, meaning that rare species are likely to be lost and species lists are incomplete. One possible partial workaround is to assume that even overall rare species will be abundant in at least some samples." To express this as vaguely as "one possible partial workaround" suggests you're very unconvinced yourselves! In fact, this problem has the potential to undermine many advantages advocated further on. Anyone who has worked with tropical insects knows the assumption is a very poor one: lots of species are simply very rare. Moreover, they are exactly those 'indicators' (see comments below) that are most informative for most ecological, biogeographic or conservation questions, which is why specialist research (versus the generalist approach you present) is typically the most efficient way to guarantee results. Please try to deal with this challenge more seriously.
(3) You say that "if resources are tight" sequence reads from bulk samples can be mapped "against a database of standard DNA barcodes" but promote establishing "a mitogenome reference database" because it is more accurate in identification, contains more phylogenetic information, and helps quantify biomass better. The theoretical argument is convincing, but current estimates suggest that over 80% of perhaps 9 million eukaryote species are not described morphologically and at least 94% are not barcoded. I am entirely in favour of describing, barcoding and mito-sequencing each and every single species on earth, but how feasible is this endeavour? With limited resources, might an emphasis on mito-metagenomics not just lead to more information on even fewer species? What is the most efficient way forward?

The authors indicate that "metagenomic sequencing could thus improve the study of biodiversity in two important dimensions; (1) by analyzing numerous species collectively and hence shifting the focus to the study of large species assemblages rather than individual 'indicator' species, and (2) by characterizing all species in these assemblages simultaneously for presence at particular sites, their phylogenetic position, their biomass (abundance), and possibly their within-species genetic variation. The approach can be conducted at any scale, from comparisons of local samples through to comparisons across biomes globally. In each case, the sequence data, via the phylogenetic tree obtained from mitogenomes, will readily place the encountered species in the context of other studies."

How this is an improvement, especially for conservation, requires stronger arguments. Indicator species are not used because we have no information about all other species present, but because they are assumed to indicate something about a site's overall biodiversity or health. You are promoting the very opposite of biological monitoring's practical need to be selective and specific, i.e. knowing as much as possible by the simplest and cheapest means. Moreover, every ecologist knows that comparing species lists is uninformative without knowing something about those species. This requires linking your species to data on their status, distribution, ecology etc. You offer several ways to tackle that:
(1) Optimistically you state that "mitogenome data from multiple sources inform each other, for an ever more complete image of global biodiversity." But how realistic is that? As you admit "a concern with the use [of] MMG may be the comparatively high cost of sequencing and bioinformatics required for data acquisition". How can you expect to get comparative data with an expensive method if it is not available from existing cheaper ones? See also the next points about how incomplete our image of global biodiversity is. What makes you think the technique can fill in the blanks without being hugely impeded by those blanks?
(2) "Crampton-Platt et al.'s study of specimens of Coleoptera obtained in a single canopy-fogging event in the Bornean rainforest generated a largely complete set of mitogenomes for the species present in the sample, and by incorporating these sequences into an existing phylogenetic tree of major coleopteran lineages, a taxonomic placement of most species in this sample could be established without expert identification, which would have been extremely difficult in any event for a complex tropical assemblage." This is insanely cool and has me all excited as a diversity- and phylogeny-crazed biologist, but how long did it take you and how much shorter and cheaper can you make the process? Again, see the next points about how incomplete our image of global biodiversity is. The credit for the success described above is not for the method, but for the thousands of systematic man-hours that provided the phylogenetic backbone. So it's anything but "without taxonomic expertise"...
(3) You evoke exploiting the predictive power of phylogenetics, but often that will just mean substituting one set of unknowns (no species names) for another (no ecology known). Going back to the above example: "Even entirely novel species in the bulk sample could be grouped into guilds based on life traits in closely related, known species, e.g. separating herbivores and predators." That could grossly overestimate what we know of the living world. With 20% of species described and 6% barcoded, the chance you'll find an informative phylogenetic match for a random species in your dataset might be 5% at most. So how good are your chances to predict ecology? Think of the huge swathes of the Hymenoptera tree where we know nothing except perhaps a name and that they are probably parasitoids… Please address the technique's promise in the face of the taxonomic and ecological deficit.
(4) Also the idea presented towards the end of the review that "assessments of biomass will have to be carefully calibrated for each taxon" might be a bit too much in the light of overall biodiversity (mapping of which with this method is what you're envisioning) and the knowledge deficit thereof...

I like the positive (even utopian) tone of the review, with statements like "when combined with other mitogenome sequences in a phylogenetic tree, the members of a local assemblage are immediately integrated in an evolutionary framework relative to not only the specimens in the same samples but also to taxa studied elsewhere and by others", but please be realistic and honest about its biological context (see comments above). For example, it is very exciting and noble to address the method's potential to survey "most small-bodied eukaryotes" for which "basic biodiversity data are difficult to obtain" as their "taxonomy is poorly known", but how would this specific tool resolve the problems that centuries of natural history haven't gotten around to tackling?

The merits of the tool are synthesized as follows: "The PCR-free study of eukaryote biodiversity is a new and rapidly expanding field whose range of application can be expected to grow quickly with increasing data quality and quantity. […] The number of specimens and samples that can be studied may be very large and is essentially limited only by the cost of sequencing capacity and computing resources. In addition, specimens can be studied from across a range of taxa without the need to involve taxonomic experts, and in fact the methodology can be applied to undescribed species."

To suggest that this tool is limited only by capacity and not expertise is misleading. The tool builds on what centuries of natural history and taxonomy and a decade of barcoding have achieved. It is a very exciting extension to previous methods, but is as much limited by its vast reach as it seems limitless because of it. To do more than standalone studies of (phylo)genetic diversity at isolated sites, and show more than examples of its potential, we must invest in obtaining basic knowledge of the vast majority of eukaryote life. After all, vast amounts of genetic data of unknown species have little applied or scientific value without similarly exhaustive data of their relationships, ranges, habits, traits etc. The method's greatest limitation is not capacity, but interpretation.

So, in conclusion, can mito-metagenomics unify the different subdisciplines? As a tool it can get us a long way in that direction. However, the effort to develop the method must be matched with an even greater investment in the work it builds and relies on to succeed. Otherwise the vast majority of OTUs found will float in an interpretative vacuum. These methodological advancements are a great argument to ramp up overall biodiversity exploration (most of which will still be by 'simpler' methods, just because of cost) during the current environmental crisis, and I urge you to make that point, also in the conclusions. That "different subdisciplines of ecology and evolution [and natural history, taxonomy, barcoding etc.] do not generally cross these different levels" is exactly what you lamented at the outset. Please then be more emphatic about how the technique does let us cross the levels.

An interesting read! Feel free to contact me with any questions.
Klaas-Douwe B. Dijkstra (kd.dijkstra@naturalis.nl)

P.S. A small issue: the statement "pinned British butterflies collected mostly in the 1980's and 90's" places apostrophes after the years. This is very uncommon usage in English, but mandatory in Dutch. As the author of the cited study (Timmermans) is Dutch, perhaps this error was copied from him.

Level of interest
Please indicate how interesting you found the manuscript:
An article of importance in its field

Quality of written English
Please indicate the quality of language in the manuscript:
Acceptable

Declaration of competing interests

I declare that I have no competing interests.

I agree to the open peer review policy of the journal. I understand that my name will be included
on my report to the authors and, if the manuscript is accepted for publication, my named report
including any attachments I upload will be posted on the website along with the authors'
responses. I agree for my report to be made available under an Open Access Creative Commons
CC-BY license (http://creativecommons.org/licenses/by/4.0/). I understand that any comments
which I do not wish to be included in my named report can be included as confidential comments
to the editors, which will not be published.

Authors' response to reviews: (http://www.gigasciencejournal.com/imedia/1126582142200970_comment.pdf)


Source

    © 2015 the Reviewer (CC BY 4.0).

References

    Crampton-Platt, A., Yu, D. W., Zhou, X., Vogler, A. P. 2016. Mitochondrial metagenomics: letting the genes out of the bottle. GigaScience.