Content of review 1, reviewed on July 03, 2020

The study “Hierarchical computing for hierarchical models in ecology” applies recursive Bayesian computation to an ecological data set and aims to make this computational tool accessible to ecologists. This is a very timely study, and the topic should be of much interest to a broad audience of ecologists who struggle to fit hierarchical models to large and complex data sets because of computational constraints. I found this an impressive and well-written manuscript. However, working at the interface of field ecology and hierarchical modelling myself, I spent a relatively long time on some of the details, and after reading I was still not fully convinced that this approach is readily applicable to a large range of problems. I thus wonder whether some more explanation would help to make the work accessible to a broad audience. Perhaps the TARB approach introduced from line 113 onwards could be explained better by adding more detail about its assumptions, and perhaps also by linking it to Bayes filters, which ecologists might be more familiar with?
Also, I would find it interesting to learn more about how this approach can be applied to sparse data and unequal sample sizes among groups. How would you, for example, weight first-stage estimates according to partition-level uncertainty in the second stage? Does the TARB approach result in less precision in parameter estimates at either the first or the second stage? What kinds of hierarchical models should not be fitted with TARB? It may be worth adding some discussion here to foster understanding of the approach.
I much appreciate the amount of work that apparently went into this manuscript. The authors have chosen a relatively complex movement model for demonstration while providing some other helpful examples of ecological data previously analysed with hierarchical models. If feasible, I think that providing some TARB model code and output (perhaps as a supplementary tutorial) for one of the most accessible data sets such as the harbor seal counts would make the approach much more accessible to a broad audience.
Overall, an interesting and stimulating manuscript!

SPECIFIC COMMENTS:
Line 12: Can you provide more details for the statement “reduced computation time for fitting our hierarchical movement model by half”? If the initial model is fitted with an extremely slow conventional MCMC algorithm, halving the time might be of less interest than if the initial model is fitted with state-of-the-art methods?
Lines 27-29: Would it be possible to explain in a few words or with a short example how individual telemetry data relate to a species-level parameter of interest?
Lines 55-58: Recursive computing is generally associated with sequential data, but some of the hierarchical model applications in ecology you mention above are not fitted to sequential data? It may be worth clarifying whether your approach is limited to sequential data or not.
Line 66: The examples (“(e.g., partitioned by individuals, sites, or species)”) are redundant with
Line 68: Replace “population-level parameters” with “group-level parameters” (simply to avoid confusion around the word ‘population’?). It may also be worth introducing the term ‘hyperparameter’ here.
Line 79: “temperature in blue tits”? Unclear.
Lines 94-95: Please check: I found the wording “specifying priors [theta_j] for the partition-level models” somewhat confusing: you specify priors for the partition-level parameter theta_j, but perhaps it is not clear to all readers that “[theta_j]” refers to a prior?
Line 98: “remaining parameters” are all “population-level parameters”?
Line 108: Does “first stage” correspond to the ‘partition-level’ terminology used before? I suggest making clear somewhere in the text how the first and second stages link to the partition and population levels.
Line 108: Perhaps explain ‘importance update’ in a few words (in parentheses)?
Lines 113-131: The transformation and the justification of the Jacobian prior, i.e. the TARB approach, are in my opinion not described completely enough to be accessible to ecologists and need a more detailed description. Also, shouldn’t aspects such as sequential data and underlying Markov processes be mentioned here? I think it needs to be made clear how the assumptions are met for data that resemble observations from an underlying Markov process (e.g., movement records or population fluctuations) versus independent samples (e.g., independent records of species occurrence or behavior in different environments).
Line 114: Perhaps some more information is warranted to explain how the transformation function g is determined and why a transformation should be used? It may also be worth mentioning the vector size here.
Line 148: Explain I in equation 11?
Line 149: What landscape covariates were used in your study?
Line 154: This is confusing to me and perhaps worth some clarification: shouldn’t delta_p rather than sigma relate to movement speed?
Line 179: For how many days (i.e. sequence length)?
Line 224: You present posterior estimates for first stage estimates but wouldn’t it be as interesting to present those from the second stage?
Legend Figure 1: Check "n = 15" and "n = 1675 telemetry locations from J = 15".
Line 250: Please check: the wording “to fit the full hierarchical model…” is confusing because you talk about fitting a TARB?

Source

    © 2020 the Reviewer.

Content of review 2, reviewed on September 14, 2020

The authors have improved their previous draft, and I have only minor suggestions to add.
Lines 78-83: Explanations for the hyperparameters in equations 3 and 4 are missing. Is “IG” an inverse-gamma prior?
Lines 81-83: An explanation or reference should be added for why this model cannot be sampled using Gibbs updates.
Line 90: Why would you still need to tune the updates for each logit(pj) by hand in the first stage? I think readers would appreciate a short explanation here.
Figure 1: Please check: in panel A, “Data” does not represent the hierarchical structure y_i_j. I suggest either using the two relevant indices in panel A or clarifying in the legend that you refer to partitioned data in both panels A and B.
Lines 105-106: Please check: I found the reference to your cheatgrass example here a bit out of context.
Line 218: Delete redundant “the”.
Lines 248-253: It is still difficult to understand how the computational gain can be explained, as you run the first stage of the recursive approach on 8 and 15 cores, compared with a single core for the single-algorithm approach? Would it be feasible to add the computation time for the recursive approach on a single core?

References

    M., M. H., B., F. A., B., H. M. 2020. Hierarchical computing for hierarchical models in ecology. Methods in Ecology and Evolution.