Content of review 1, reviewed on May 09, 2022

Review of DiRenzo et al MEE

The authors present a study of the utility of simulation-based approaches for assessing model robustness and refining biological inference. They split the work into two main sections focussing on 1) study-specific simulations (single-dataset cases) and 2) ‘general property simulations’, used to assess the appropriateness and accuracy of novel analytical approaches.

I think this is a great paper with some interesting findings. Both sections are logical and well written, and both make important points that should be adopted by the wider field. I think the second section in particular is the most exciting! The two main suggestions I have for the authors pertain to:

1) introducing more ecology into the manuscript, specifically with the Cape Weaver system and what questions are being addressed by hierarchical modeling here. This is obviously a statistical manuscript, but a lot of the text is very technical / statistics heavy, and devoting some time to fleshing out the ecology these models underpin will make the manuscript far more engaging.

2) broadening accessibility, particularly in the study-specific section. Though it is well written, I fear the language may lose less seasoned hierarchical modelers fairly quickly, and it would be a shame if this manuscript didn't appeal to as broad an audience as possible. Addressing point 1 above will help, as it will give some much-needed context to the statistical language, but some tweaks here and there will also help. Details below.

Specific Comments
L207 & L335: Harrison et al 2018 PeerJ guide the reader through simulations on fitted GLMMs to assess levels of overdispersion and/or zero inflation, which you may want to signpost here as one of the more simplistic applications before you move on to the more complex case of your spatial occupancy models. I think this, or other similar examples,
(https://peerj.com/articles/4794.pdf)
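
To make concrete the sort of simple check I have in mind, here is a minimal sketch of a simulation-based overdispersion test. It is not the code from Harrison et al, and it is not a GLMM: I assume an intercept-only Poisson model, and the counts, sample size and function names (rpois, dispersion, overdispersion_check) are all hypothetical. The idea is only to show the logic: fit the model, simulate new datasets from the fit, and ask whether the observed dispersion statistic looks typical of the simulated ones.

```python
import math
import random


def rpois(lam, rng):
    """Draw one Poisson variate with Knuth's method (adequate for small lam)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def dispersion(y, mu):
    """Pearson chi-square statistic divided by residual degrees of freedom."""
    return sum((yi - mu) ** 2 / mu for yi in y) / (len(y) - 1)


def overdispersion_check(y, n_sims=500, seed=1):
    """Compare observed dispersion with dispersion of data simulated from the fit."""
    rng = random.Random(seed)
    mu_hat = sum(y) / len(y)  # maximum-likelihood fit of an intercept-only Poisson
    observed = dispersion(y, mu_hat)
    exceed = 0
    for _ in range(n_sims):
        y_sim = [rpois(mu_hat, rng) for _ in y]  # simulate from the fitted model
        mu_sim = sum(y_sim) / len(y_sim)         # refit to the simulated data
        if mu_sim > 0 and dispersion(y_sim, mu_sim) >= observed:
            exceed += 1
    return observed, exceed / n_sims


# Hypothetical "field" counts: a gamma-Poisson mixture, so the data are
# overdispersed relative to the Poisson model we are about to fit.
data_rng = random.Random(42)
y_obs = [rpois(data_rng.gammavariate(2.0, 2.5), data_rng) for _ in range(200)]

observed, p_value = overdispersion_check(y_obs)
print(f"dispersion = {observed:.2f}, simulation p-value = {p_value:.3f}")
```

A dispersion statistic well above 1, combined with a small simulation p-value, suggests the fitted Poisson model cannot reproduce the variance in the data.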

A lot of the terminology throughout might be quite difficult to parse for less experienced modelers looking to improve their skills. This is particularly the case for Bayesian methods (e.g. L312 ‘posterior means’) and other more general terms (e.g. L215: ‘identifiability’). Consider defining some of these terms to make sure you appeal to a broader audience, or signposting where to look for a primer.

L306-319: Silk et al 2020 PeerJ showed the dangers of not accounting for spatial non-independence in the correct manner in GLMMs, so it may be useful to signpost here.
https://peerj.com/articles/9522/

L237 & Box 2/L617: I know you dedicate a box to the weaver data, but I found that this section was a little light on the ecology / questions underpinning the system. It would be good to introduce what is being studied and why it is interesting. Some readers new(er) to spatial occupancy modeling might not understand what biological and methodological variables may influence species distribution, and including some brief text on this will only improve the flow of the manuscript. At the minute it feels like I have to go looking for Box 2 to understand the system, but even when there I don’t quite get the hit of ecology I’m looking for. Perhaps bring the system to life a bit here for context.

L447: One could argue that all statistical models are misspecified because we will never know the true data-generating model in nature. This is important because we as ecologists are terrible for setting data-generating model = statistical model. You definitely argue this point here, but it is worth driving home WHY they don't match and why this introduces bias. Deliberately mis-specifying the statistical model is a fairer / more robust way of estimating power and parameter accuracy.
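
A toy example may help show why the mismatch matters. The sketch below is my own illustration, not the authors' model: I assume a hypothetical occupancy survey with true occupancy psi = 0.6 and three visits per site, and a deliberately naive estimator that treats "detected at least once" as "occupied". When the simulated data match that assumption (perfect detection) the estimator looks unbiased; when detection is imperfect (p = 0.5 per visit) the same estimator is biased low. All parameter values and function names are assumptions chosen for illustration.

```python
import random


def simulate_sites(n_sites, psi, p_detect, n_visits, rng):
    """Return 1 for each site where the species was detected on any visit, else 0."""
    detected = []
    for _ in range(n_sites):
        occupied = rng.random() < psi
        seen = occupied and any(rng.random() < p_detect for _ in range(n_visits))
        detected.append(1 if seen else 0)
    return detected


def naive_occupancy(detected):
    """Statistical model that assumes perfect detection: proportion of sites with a detection."""
    return sum(detected) / len(detected)


def mean_bias(p_detect, psi=0.6, n_sites=200, n_visits=3, n_sims=500, seed=1):
    """Average (estimate - truth) over repeated simulated surveys."""
    rng = random.Random(seed)
    estimates = [
        naive_occupancy(simulate_sites(n_sites, psi, p_detect, n_visits, rng))
        for _ in range(n_sims)
    ]
    return sum(estimates) / n_sims - psi


# Data-generating model = statistical model (detection is perfect).
bias_matched = mean_bias(p_detect=1.0)
# Data-generating model differs from the statistical model (imperfect detection).
bias_mismatched = mean_bias(p_detect=0.5)

print(f"bias when models match:  {bias_matched:+.3f}")
print(f"bias when models differ: {bias_mismatched:+.3f}")
```

Under these assumed values the expected naive estimate in the mismatched case is 0.6 × (1 − 0.5³) = 0.525, so a simulation study that only ever used the matched case would overstate how accurate the estimator is.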

Additional Comments

Referencing throughout needs tweaking, as author initials appear haphazardly throughout, e.g. L81, 89.
L80: missing space before reference.

Source

    © 2022 the Reviewer.

References

    DiRenzo, G. V., Hanks, E., Miller, D. A. W. 2023. A practical guide to understanding and validating complex models using data simulations. Methods in Ecology and Evolution.