Content of review 1, reviewed on January 23, 2023
Identifying the genetic basis of viral spillover using Lassa virus as a test case
Thank you for the opportunity to review this work under consideration for publication. This is a very well described piece of work. It addresses important concepts in the sampling design for monitoring viral spillover risk and understanding any potential contribution to this risk from genetic mutations. I highly recommend this for publication within your journal as I feel it highlights several important considerations applicable to a wide audience. I have a few minor comments and some questions which may help this work to appeal to a wider non-expert audience who may not be as experienced with phylogenetic and phylogeographic analysis. I have no major revisions to suggest.
Ethical approval
You report that ethical approval was not required for the current submission; however, you also report that rodent samples were collected for the purpose of this study (144-148). You should perhaps include a reference in the manuscript to the local and institutional ethical approval for obtaining these samples.
Similarly, if the protocol for the collection of these samples is available, it would be useful to reference it. It would be interesting to know whether the sampling of rodents was conducted with an a priori assumption of population stratification or whether these data were purely a useful source to test these hypotheses post hoc.
Introduction
27-29: It might be worth mentioning not just human and wildlife contact but also domesticated animals, i.e., “as humans, domesticated animals and wildlife come into contact more frequently”. This has been shown to be important for the risk of spillover of Hendra virus and Nipah virus, and potentially several other viral and bacterial zoonoses. These pathways of spillover can occur through amplification within the domesticated animal or through the animal acting as a bridging species.
33-36: It might be worth teasing out hazard and risk here. I generally consider the spatial and temporal distribution of pathogen and host to be a hazard. In this formulation there is a measure of the prevalence of host and pathogen across a landscape. This layer is separate from, but connected to, the risk of spillover into human populations. For example, the hazard of spillover can be high in an isolated region of the Guinean forest but the risk low due to sparse human populations and limited contact (i.e. high pathogen and host prevalence would suggest a high risk of spillover, but a small susceptible human population would lead to low risk). The inverse can also be true, where high human population density and high contact rates could amplify a relatively low hazard. I think this may be important to contextualise the overall risk of spillover, as many previous studies have generated risk maps that conflate the two.
47-48: Do you also get variable substitution rates between hosts? For example, could a putative virus have a different mutation rate in rodent hosts than in humans?
Figure 1: Is it possible to put the expected values of λA and λS on the lines for each scenario? The text describes it well in 96-104 so this is purely a suggestion.
Methods
119-123: These seem important assumptions but reasonable for the purpose of simulation. Did you perform sensitivity analyses to explore the impact of these assumptions? If so, it would be useful to discuss them; if not, is it possible to describe the potential impact of these assumptions? I would assume that if an SNP increased both spillover and transmission among rodents, causing clonal expansion, this would provide a different signal from that seen under the simulation. I think you explore this a bit in the results (203-205), but I am unsure whether the reference to frequency in the population relates to the human population or to both the rodent and human populations. You also briefly mention this in 315-316, but it would be helpful to link this back to your assumptions.
150-153: Is it possible to provide the clades and associated accession numbers in the supplementary material? It is interesting that you have a human sample from KGH that sits within rodent samples reported to be from the north of the country. It is likely that the human sample derives from a case that was transported to KGH for specialist care, as such samples often do not record the location where the case acquired the infection. You describe this in the results, but it might be useful to have a table.
Results
281-286: My interpretation of Figure 5 and the text is that these SNPs are relatively more likely to occur in rodent samples than in human samples. Is this correct? If so, does this suggest that the approach of detecting a genetic basis for spillover risk by observing an SNP in human infections but not rodent infections may not show the pattern that you were testing for?
Discussion
297-309: I think it could be useful to link the number of available sequences to the sampling effort of your rodent sampling study. 15 sequences were obtained between 2019 and 2020 for the current analysis; what rodent sampling effort was required to obtain these, in number of rodents sampled or trap nights? A common approach to increasing the number of rodents detected is to sample the same location over multiple timepoints. You mentioned in the results that time did not seem to have a significant impact on clades; would expanding the time period of rodent sampling to boost sequence sample size not come with any trade-offs? To reach the 100 samples you used for the simulation test, which still had low specificity at <60 allele frequency, how much more sampling would have been required? Is it feasible and, if not, what recommendations can be made to increase the probability of detecting these putative SNPs?
319-326: The limitations are well discussed here. It may be worth also mentioning the additional confounding introduced by case-finding. Increasingly in Nigeria, contacts of infected individuals seeking healthcare are being tested. If these contacts are found to be infected but have minimal symptoms, this may affect the inference, as sampling will be biased in these populations. A similar problem may be introduced if targeted sampling to obtain rodent infections, as occurs at KGH, takes place in locations where human cases are reported, as you mention in 313-315.
P.S. Are you intending to preprint this work? It would be good to be able to share it with a collaborator for some work we are doing on geographic biases in viral sampling and the impact these can have on inferences about spillover drawn from pathogen sequences.
Kind regards,
David Simons
Source
© 2023 the Reviewer.
References
B., W. A. O., H., B. B. H., Bruno, G., J., D. A. J., Joseph, H., Jenna, N., Matej, V., Emmanuel, A., James, B., G., L. E. G., C., K. M. C., T., K. O. T., Anna, S., H., R. C. H., L., N. S. L. 2023. Identifying the genetic basis of viral spillover using Lassa virus as a test case. Royal Society Open Science.
