### Ancient divergence and recent population expansion in a leaf frog endemic to the southern Brazilian Atlantic forest

##### Content of review 1, reviewed on April 05, 2015

Review for Organisms Diversity and Evolution

Dr. Egon Heiss, Associate Editor

Title: Ancient divergence and recent population expansion in a leaf frog endemic to the southern Brazilian Atlantic forest

Manuscript No.: ODAE-D-15-00021

Authors: Brunes et al.

Summary:

This paper presents a statistical phylogeographic analysis of one species of leaf frog from the Atlantic Forest of Brazil. The data are based on 4 genes, including 118 samples of the mitochondrial ND2 gene and 3 nuclear genes with sample sizes of 102, 62 and 18. Extensive population genetic analyses are presented, including an evaluation of 4 a priori historical demographic scenarios via Approximate Bayesian Computation (ABC) statistics. The authors find support for 2 main clades, Northern and Southern, as predicted by mtDNA, with more genetic variation found in the North. ABC results reject (A) a single population and (D) a post-LGM establishment of the South from a Northern source. Strongest support is found for (B) a Pleistocene vicariance at 600,000 years (this value was not part of any a priori biogeographic hypothesis, but rather was taken from the output of the IMa2 population genetic analysis). This is referred to as ‘ancient’ vicariance. Modest support was found for (C) a single Pleistocene population fragmentation, reduction to 0.01–0.20 of current size at the LGM and subsequent Holocene expansion up to present size.

Positive aspects:

Abundant data and extensive analyses, thoughtfully discussed and integrated. I enjoyed reading it.

Negative aspects:

This manuscript has no major flaws, only minor problems with clarity that can be readily addressed without additional analyses, provided the authors better justify some of their choices. Here I offer my suggestions for improving the clarity of the work.

Please do not report theta and pi values as percentages. Theta is not a percent; it is not a fraction of some total value. In my humble opinion, this is non-standard and potentially confusing. All molecular population geneticists that I know report theta as decimals, without multiplying by 100 and adding a “%” sign. I recommend turning the percentages into simple decimals throughout the entire manuscript, including tables, e.g., a theta of 0.75% is actually a theta of 0.0075. (Perhaps substitution rates might be clearer, too, as decimals, e.g., 0.00957 rather than 0.957% in Lines 154-155.)
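To illustrate why decimals are the natural unit here, the following is a minimal sketch of how per-site pi and Watterson's theta are commonly computed from an alignment. The toy sequences and all function names are hypothetical and are not taken from the manuscript; the point is only that both quantities come out as per-site values, so a reported "0.75%" corresponds to 0.0075.

```python
from itertools import combinations
from typing import Sequence


def segregating_sites(seqs: Sequence[str]) -> int:
    """Count alignment columns that contain more than one state."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs: Sequence[str]) -> float:
    """Pi: mean number of pairwise differences, divided by alignment length."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


def watterson_theta(seqs: Sequence[str]) -> float:
    """Watterson's theta per site: S / (a_n * L), a_n = sum_{i=1}^{n-1} 1/i."""
    n = len(seqs)
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / (a_n * len(seqs[0]))


def percent_to_decimal(value_in_percent: float) -> float:
    """Convert a value reported with a '%' sign back to a plain decimal."""
    return value_in_percent / 100.0


# Hypothetical toy alignment (4 sequences, 10 sites), for illustration only.
toy_alignment = [
    "AAAAAAAAAA",
    "AAAAAAAAAT",
    "AAAAAAAATT",
    "AAAAAAAAAA",
]

pi = nucleotide_diversity(toy_alignment)
theta_w = watterson_theta(toy_alignment)
print(f"pi per site = {pi:.4f}, theta_W per site = {theta_w:.4f}")
print(f"0.75% as a decimal = {percent_to_decimal(0.75)}")
```

Gaps, missing data and ambiguity codes are ignored in this sketch; dedicated software handles those cases.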

Clarifying and quantifying the models in Fig. 2, especially in the caption, would be a tremendous and much appreciated aid to the reader. Please expand your caption greatly.

TABLES: The captions must define all terms and abbreviations, please, especially in the caption for Table 2. Table 3 caption should define terms, as well, including ‘K’.

Regarding the choice of CCSM: In my own limited experience with SDM of Neotropical frogs, I find that CCSM and MIROC often give dramatically different results, suggesting that one or both of these models is not very predictive. Perhaps the authors should include in the ‘SOM’ a second analysis using MIROC instead, and compare? Perhaps the MIROC model would even give results more consistent with the genetic data?

Line 255: The authors cite Elith et al. (2011) in the manuscript, but here they seem to ignore the advice provided in this same paper:

“ … [MaxEnt] is unlikely to be improved - and more likely, degraded - by procedures that use other modelling methods to pre-select variables (e.g., Wollan et al., 2008). In particular, it is more stable in the face of correlated variables than stepwise regression, so there is less need to remove correlated variables (unless some of them are known to be ecologically irrelevant)” (p. 50, Elith et al. 2011).

Thus, I would encourage the authors to comment on their choice of method here. If the authors stick to this method, I would ask the authors to please clarify how they decided which of 2 correlated variables was removed. And perhaps some justification of their final list of 5 bioclim variables? Are there previous studies suggesting that these 5 might be particularly useful for frogs and this region, relative to the other 14 bioclim variables?
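For readers unfamiliar with the screening step being questioned, here is a minimal sketch of one common way to flag correlated predictors: compute pairwise Pearson correlations and list the pairs above a cutoff. The layer values and the 0.8 cutoff are hypothetical assumptions for illustration, not the authors' data or threshold. Note that the procedure only identifies a correlated pair; which member to drop remains a separate decision, which is exactly the choice I am asking the authors to document.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlated_pairs(
    layers: Dict[str, Sequence[float]], threshold: float = 0.8
) -> List[Tuple[str, str, float]]:
    """Return every pair of layers whose |r| meets or exceeds the threshold."""
    flagged = []
    for (name_a, vals_a), (name_b, vals_b) in combinations(layers.items(), 2):
        r = pearson_r(vals_a, vals_b)
        if abs(r) >= threshold:
            flagged.append((name_a, name_b, r))
    return flagged


# Hypothetical values sampled at five localities (illustration only).
example_layers = {
    "bio1": [20.0, 22.0, 24.0, 26.0, 28.0],
    "bio5": [30.0, 32.0, 34.0, 36.0, 38.0],
    "bio12": [1500.0, 1200.0, 1800.0, 1100.0, 1600.0],
}

for name_a, name_b, r in correlated_pairs(example_layers):
    print(f"{name_a} and {name_b} are correlated (r = {r:.2f}); "
          "one of them would need a documented reason to be dropped")
```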

The section headings in the Methods section should be made more specific, please, as multiple sections talk about phylogeography, population structure, gene flow, historical demography, etc., and these terms are somewhat redundant. More precise section headings would help the reader. And same for Results. The section title “Bioclimatic models” might be more clear as “species distribution models”, which is what the authors actually call them anyway.

The statistical phylogeographic approach taken here via the ABC analyses is a wonderful approach. I would like to see a few lines (Discussion?) comparing and contrasting the alternative or complementary strategies of basing the scenarios and the more important parameters, i.e., divergence time, migration and population size fluctuations, on a priori historical biogeographic hypotheses vs. assuming values first estimated from other software, such as IMa2.

In scenarios B and C, why is migration unidirectional? Was this based on IMa2 or a priori biogeographic hypotheses?

Lines 175-176: Why a strict clock for some genes and relaxed clock for others? Please explain or justify. Can you comment on how sensitive or robust the results might be to these model assumptions?

Line 22: Why just 22 individuals? This is not clear.

Line 211: Why would the authors define these population parameters (time and Ne) on a per locus basis, when the authors are using the 3-4 genes together to estimate one population history? Can you clarify?

Line 237: Please provide a reference for this PCA approach. Delete ‘for’.

Lines 418-419: Perhaps I missed it, but what was the evidence for incomplete lineage sorting in the CXCR4 gene, and how do we know this ‘problem’ afflicted only this gene? This is a very important topic, so any additional clarity here would be appreciated.

Suggestions for improvements to grammar and clarity, by line numbers:

Throughout MS: “p-uncorrected distance” is redundant. Please replace everywhere with either ‘p-distance’ or ‘uncorrected distance’. Perhaps define at first use, if necessary.

Throughout MS: Rather than ‘neutral theory’, more correct would be ‘the standard neutral model’ or ‘the Wright-Fisher population model’, e.g., in line 161 and elsewhere. This clarification is actually important because the authors are interested in demography, not natural selection, and these standard tests (Fs, DT, etc.) are tests of the Wright-Fisher population model, and rejection can imply demographic changes or population structure, as well as natural selection.
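As a concrete reminder of what such a test measures, below is a minimal sketch of Tajima's D in its usual textbook form, computed from the sample size, the number of segregating sites and the mean number of pairwise differences. The function name and the input values are hypothetical. D simply contrasts two estimators of theta under the standard neutral model, so a non-zero value says the model was violated without saying whether demography, structure or selection is responsible.

```python
from math import sqrt


def tajimas_d(n: int, seg_sites: int, mean_pairwise_diffs: float) -> float:
    """Tajima's D from n sequences, S segregating sites and k (whole-locus
    mean number of pairwise differences, not per site)."""
    if n < 4:
        # The variance term is zero for n < 4, so D is undefined.
        raise ValueError("need at least 4 sequences")
    if seg_sites < 1:
        raise ValueError("D is undefined when there are no segregating sites")

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    # Numerator: pi-based estimate minus Watterson's estimate of theta.
    difference = mean_pairwise_diffs - seg_sites / a1
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return difference / sqrt(variance)


# Hypothetical inputs: 10 sequences, 5 segregating sites.
# An excess of rare variants lowers k relative to S / a1 and gives D < 0,
# a pattern that population expansion can produce without any selection.
print(tajimas_d(n=10, seg_sites=5, mean_pairwise_diffs=0.9))
```

Significance is normally assessed by coalescent simulation under the standard neutral model, which this sketch does not attempt.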

Throughout MS: any point estimates should be accompanied by confidence or credibility intervals, e.g., line 402 (600,000 years)

- 39: ‘to’ should be ‘with’.
- 46-47: subject of sentence is plural ‘patterns’, so ‘its’ (as in ‘underlying its’) is not clear.
- 51: ‘Advances in’
- 52: how about “substantial increase in our knowledge of species diversification,”
- 54-55: ‘contributed by both originating’ is awkward. Can this sentence be simplified?
- 57: ‘favored’ is not clear to me.
- 59: what is meant by ‘neo’tectonic? How about ‘recent tectonic’?
- 61: delete M T C
- 65: ‘of taxa’
- 67: ‘revisions suggest a need for’
- 70: ‘only one of five’
- 70: ‘the P. burmeisteri group with…’
- 72: investigation ‘of’
- 74 (and elsewhere): ‘drastically’ seems too dramatic. How about ‘substantially’?
- 77: how about ‘historical demographic’
- 86: delete C F B
- 99: ‘corrected’
- 103: unique instead of singleton?
- 104: perhaps delete ‘the number of’
- 105: after ‘simulations’ add: “under the standard neutral model.”
- 112: What does this line mean? Reduced indels to a single step?
- 116: down-the-line could be ‘subsequent’ or ‘downstream’.
- 119: delete ‘results’
- 120-121: not clear what you mean by “scenario to”.
- 124: ‘most parsimonious’
- 125: delete ‘the’
- 130: what are “standardized and non-standardized matrices”?
- 132: ‘via the’
- 137: ‘steps’ or ‘generations’ better than ‘repetitions’.
- 141: what is meant by ‘biologically meaningful’? Perhaps explain.
- 148: ‘We then ran…’
- 150: ‘to the’?
- 153: ‘time series’ instead of trendline.
- 155: family is now Craugastoridae. :-)
- 156: ‘the Phyllo…’
- 159: 2Nm, not 2NM.
- 163: dnaSP not DNAsp (see line 236)
- 167: delete ‘Additionally’
- 167: ‘in the’
- 168: delete ‘see revision in’
- 170: ‘Attempt to that’?? What does this mean? Expand, clarify.
- 170: ‘follow the latter authors’ advice’. Note the s-apostrophe.
- 171: ‘on’ should be ‘among’. Also: Why several? Why not fewer or just one? You mean ‘several’ as opposed to ‘all’? Please clarify.
- 172 and 765: tree should be three.
- 176: ‘for ND2’
- 180: Here and elsewhere, not clear what model parameters are being discussed. What are these priors for or on?
- 184: ‘results plot’?
- 188: since you don’t define or justify the term ‘robust’, how about ‘To apply a statistical phylogeographic approach, we…’
- 189: I would say ‘best fit the data’, yes?
- 191: ‘based on’
- 199-201: perhaps define ‘ancient’ and ‘strong’, etc., or say ‘see below’. Here the reader is left hanging.
- 205: ‘fit’
- 205: perhaps mention that the summ. stats. are mentioned below?
- 237: Delete ‘for’
- 238: ‘Therewith’ is not clear. ‘…expect… presenting…’ is also not clear.
- 242: ‘climatically suitable’?
- 246-7: ‘method has been widely’ (delete ‘a positive established trajectory’)
- 285: was
- 296: two things are not clear. First, the authors should use a | (pipe) not a / (slash), and also Pr(X|K) looks to me like the likelihood, i.e., Pr(data | hypothesis K=2), not the posterior probability. Please review and confirm/correct.
- 307 & 417 (and elsewhere?): you mean ‘loci’ not ‘locus’, yes?
- 307: ‘the’ before N and before S.
- 308: should be ‘of CXCR4’ or ‘of the CXCR4 gene’
- 309: change ‘representing the variation’ to ‘based on’ perhaps.
- 312: delete ‘left’
- 313: comma after here.
- 316: space after ‘values’
- 320: you could delete ‘x 10⁻⁵’ after 1.2
- 323: change ‘of time’ to ‘in the past’ perhaps.
- 324: ‘nm’ should be ‘Nm’ (or even ‘M’ if defined in Methods as ‘2Nm’, for example).
- 327: see comment above about ‘neutral theory’.
- 335: yes, this is a nice and informative and precisely named section heading. :-)
- 350-352: perhaps just delete this sentence.
- 361: perhaps clarify why this was ‘expected’?
- 369: please provide a reference for this first sentence.
- 372: ‘summary’ in place of ‘basic’
- 389: ‘However,’?
- 411: delete ‘seem to’
- 419: inconsistencY
- 421: ‘among’ (not ‘between’) preferred when referring to more than 2 things.
- 423: ‘with studies’
- 428: italicize Latin binomen
- 460: ‘support for’
- 462: ‘its location’ seems not the right word here… or perhaps ‘more instable’ could be ‘more dynamic’?
- 505 & 508: associated with
- 784: ‘axes are’ or ‘axis is’?
- 787: conjunctly? Perhaps ‘jointly’ or ‘simultaneously’?
- Table 3: Scenario, singular.
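On the line 296 comment about Pr(X|K): the distinction between likelihood and posterior can be made concrete with a small sketch. Assuming a uniform prior over the candidate K values (my assumption for illustration), the posterior Pr(K|X) follows from the log-likelihoods ln Pr(X|K) by Bayes' rule. The log-likelihood values and the function name below are hypothetical, not taken from the manuscript.

```python
from math import exp
from typing import Dict


def posterior_over_k(log_likelihoods: Dict[int, float]) -> Dict[int, float]:
    """Turn ln Pr(X|K) values into Pr(K|X), assuming a uniform prior on K.

    The maximum is subtracted before exponentiating so that large negative
    log-likelihoods do not underflow to zero.
    """
    max_ll = max(log_likelihoods.values())
    weights = {k: exp(ll - max_ll) for k, ll in log_likelihoods.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


# Hypothetical ln Pr(X|K) values for K = 1, 2, 3 (illustration only).
example_log_likelihoods = {1: -1250.0, 2: -1200.0, 3: -1203.0}

posterior = posterior_over_k(example_log_likelihoods)
for k in sorted(posterior):
    print(f"K = {k}: ln Pr(X|K) = {example_log_likelihoods[k]:.1f}, "
          f"Pr(K|X) = {posterior[k]:.4f}")
```

The two columns printed are different quantities, which is why the manuscript should state which one is being reported.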

Nice manuscript! I enjoyed reading it.
Sincerely, Andrew J. Crawford (andrew@dna.ac), Reviewer 2.

##### Source

© 2015 the Reviewer (CC BY 4.0).