Content of review 1, reviewed on October 03, 2020

The manuscript presents a novel method of classifying animal sounds, categorizing vocal repertoires into acoustic clades and identifying calls that are most typical for each clade. The manuscript is well written and mostly clear (though some details of the methods need clarification). I liked this manuscript at first glance, but deeper inspection revealed a number of issues that reduce its value and validity.

A minor but obvious problem is that the title of the manuscript does not reflect its contents. In two of the three examples (wrens and crickets), the differences were at the subspecies/species level rather than the population level. Besides, acoustic clades do not necessarily reflect population structure. For example, in killer whales, highly diverse vocal clans can occur within the same population without any genetic isolation between them. The text of the manuscript also contains many references to ‘population structure’, which need correcting as well.

More importantly, I have some concerns about the suggested method: after a thorough reading of the manuscript, it appears less universal and ground-breaking than is claimed in the abstract. Classifying animal vocal repertoires is a tricky endeavour, and many different approaches have been applied; however, no universal algorithm has been developed so far. In most cases, automatic classification is hindered by the fact that different sounds and repertoires have different levels of variation, so an algorithm that works perfectly for one species or population fails with another. The suggested method has by no means overcome these limitations: to achieve the best results, the authors had to use different parameter values (minrep, mincall, etc.) for each dataset, even within the same species (sperm whale). To me, this implies that the suggested method is still rather subjective and does not have many benefits over manual classification by experienced observers. I would love to see this method applied to a blind dataset: for example, a set of killer whale dialects from different populations, with the operator not knowing which dialects belong to which population. If the method performs well (which I doubt), it would be much more impressive than distinguishing wren songs that have significant differences in frequency parameters between the subspecies anyway.

Also, the manuscript would benefit from some validation of the results of call categorization. No effort has been made (or at least none is presented in the manuscript) to check how well the automatically identified call types correspond to those identified by different human observers and by other automatic categorization algorithms.
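To illustrate the kind of check I have in mind, here is a minimal sketch of one possible agreement measure, the adjusted Rand index, computed between two labelings of the same calls. This is my own illustration and not part of the authors' method; the function name and the example labels are hypothetical, and other agreement measures would serve equally well.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Chance-corrected agreement between two labelings of the same calls.

    1.0 means the two partitions are identical (label names may differ);
    values near 0 mean agreement no better than chance.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same calls")
    n = len(labels_a)
    if n < 2:
        return 1.0  # fewer than two calls: no pairs to disagree on

    # Contingency counts: how many calls fall in each (type_a, type_b) cell.
    cell_counts = Counter(zip(labels_a, labels_b))
    a_counts = Counter(labels_a)
    b_counts = Counter(labels_b)

    # Number of call pairs grouped together in both, in A, and in B.
    together_both = sum(comb(c, 2) for c in cell_counts.values())
    together_a = sum(comb(c, 2) for c in a_counts.values())
    together_b = sum(comb(c, 2) for c in b_counts.values())

    expected = together_a * together_b / comb(n, 2)
    maximum = 0.5 * (together_a + together_b)
    if maximum == expected:
        return 1.0  # degenerate case, e.g. every call in a single type
    return (together_both - expected) / (maximum - expected)


# Hypothetical labels: automatic call types vs. one human observer's types.
automatic = [0, 0, 1, 1, 2, 2]
observer = ["A", "A", "B", "B", "C", "C"]
ari_same = adjusted_rand_index(automatic, observer)

# A deliberately mismatched labeling of four calls.
ari_mismatch = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Reporting such a score for each dataset, against several independent observers, would show whether the automatic call types are reproducible rather than an artefact of the chosen parameters.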

I do not really understand what the benefit of identifying the ‘identity calls’ is. They are not used to identify the clades; they are just calls typical for a particular clade and not for other clades. They do not provide any novel information that can help to identify the clades, nor can they be used to identify the clades in the field, because although they are typical for their clades, they are not exclusive: they are still used (quite often in some cases) by other clades.

Another issue that caught my attention is the meaning of the dendrograms. They were built ‘based on correlations of type usage’. I don’t really understand how this was done, and I would love to see a more detailed description of this stage of the analysis. In any case, every dendrogram has some idea behind it, and in biology this idea is often phylogeny, i.e. it is implied that the clades that are closer on the dendrogram are more closely related to each other. If you interpret your dendrograms in the same way, suggesting that the closer clades are more related, you should rather use a phylogenetic approach, which takes into account not only the mere similarities between the clades, but also the probabilities that the repertoires of two clades have descended from the same ancestral repertoire.

Less important, but still worth mentioning: to me, being a zoologist rather than a mathematician, the paper feels too colorless without proper examples of calls and a description of the parameters used to describe these calls. For example, a couple of sonograms of wren and cricket songs, with a list of the measured parameters, would help the reader to get a better idea of the datasets.

In general, the manuscript appears worth publishing, but it needs to address the comments above and to tone down its self-praise, honestly listing all the limitations of the method not only in the discussion, but also in the abstract. The text would also benefit from some self-criticism. For example, the fact that some Pacific sperm whale repertoires were not classified into the known clans is interpreted as the possibility that ‘some of the repertoires previously assigned to the Pacific Four-Plus clan may belong to a putative fifth clan, suggesting the method is sensitive enough to identify acoustic clades that were not previously detected with other methods’. However, this fact could equally imply that the method does not work well enough for all repertoires, i.e. what is interpreted by the authors as a benefit may in fact be a limitation. There are other examples of this sort in the text; I leave it to the authors to find and correct them.

A last minor comment: the reference to culture in the last paragraph looks out of context. The method does not work specifically for species with culturally inherited repertoires: crickets definitely have genetically inherited calls, and wrens probably do too. It would look more professional to finish the manuscript with something more closely related to its main topic.

Source

    © 2020 the Reviewer.