  • This study examines the costs of increased brood size in terms of oxidative balance and telomere length in captive passerines breeding at low temperatures. It shows no effect of the brood size manipulation, although the sample size might be too limited to detect the small effects expected (esp. on TL). However, interesting changes were found between incubation and nestling feeding. This study is timely and provides new evidence in the hot debate on the oxidative costs of reproduction, by trying to add environmental constraints in a captive setting. However, I have two main concerns. First, there was no control for the temperature manipulation and the authors do not sufficiently discuss why they were confident that such a manipulation would increase the effect of the brood size manipulation and not mask it. Second, the authors do not impartially present both sides of the methodological debate around the measurement of reproductive costs, that is, whether reproductive costs should be studied by comparing non-breeding vs. breeding individuals or by modulating reproductive effort in breeding individuals. I think that their study does provide interesting information in this debate and should thus be discussed in that light.

    Some sections (esp. the end of the first paragraph of introduction and the statistics section in M&M) could probably be made clearer with some re-writing, and with proof-reading by a native English speaker (as I am not one myself, I am not sure whether some sentences are correct or not).

    Specific comments:

    l. 63-66 These sentences are poorly referenced and lack precision: which costs of reproduction are detected (in terms of body condition? survival?), and in which experimental settings? Please explain in more detail.

    l. 70 “both processes may independently mediate” instead of “both processes independently may mediate”

    l. 72 “the inevitable”

    l. 80 Are you sure this reference is appropriate? The main point in Speakman and Garratt 2014 on the links between oxidative stress and aging is that, citing them: “Evidence is also accumulating to suggest that oxidative stress is not the predominant cause of ageing”

    l. 81 In which direction and on which component of senescence (reproductive, actuarial)? Some cited references do not support this particular statement (Wiersma et al. 2004 does not measure senescence).

    l. 81-84 I do not follow the logic between the two parts of the sentence, especially since some of the references study reproductive senescence and not actuarial (survival) senescence. Please rephrase to make that clearer.

    l. 84-89 Please provide references for these statements.

    l. 100-103 Here you should probably discuss the “oxidative shielding” hypothesis and how it might or might not apply to your protocol (Blount, J. D. et al. 2015 Oxidative shielding and the cost of reproduction. Biol. Rev. doi:10.1111/brv.12179).

    l. 109 As Descamps et al. 2009 manipulated breeding conditions by subjecting birds to an immune challenge, could you better explain the rationale behind your manipulation of temperature? Indeed, Beamonte-Barrientos & Verhulst 2013 showed that increased metabolic rate through decreased temperature was not associated with higher oxidative damage (Beamonte-Barrientos, R. & Verhulst, S. 2013 Plasma reactive oxygen metabolites and non-enzymatic antioxidant capacity are not affected by an acute increase of metabolic rate in zebra finches. J. Comp. Physiol. doi:10.1007/s00360-013-0745-4). Moreover, in the context of your brood size manipulation, as individuals with increased reproductive effort might also produce more thermal energy, the thermoregulation costs of low temperature could be partly compensated in the enlarged-brood group.

    l. 137-142 It should be clearly stated that a donor nest gave nestlings to more than one recipient nest. Was there any effect of the nest of origin on nestling condition and survival?

    l. 144-146 Could a nestling from a donor nest then be swapped and end up in a control nest, or were only original nestlings from enlarged nests swapped with the control nest?

    l. 170-171 Could you provide the range and distribution of DNA concentration, 260/230 and 260/280 ratios?

    l. 176 Why did you choose GAPDH as the control gene?

    l. 192-195 As requested by the MIQE guidelines, “repeatability” should not be expressed as the CV of Ct values (due to the distribution of Ct and the inherent inter-plate variation), but as the CV of the eventual quantification measure (here TL ratio). See Bustin, S. A. et al. 2009 The MIQE guidelines: minimum information for publication of quantitative real-time PCR experiments. Clin. Chem for details. Could you also provide proper repeatability estimates of TL ratios to evaluate the range of technical variation compared to between-samples variation?

    l. 224 “mM” instead of “mmol”. Please use consistent notations between text and figures (either mmol/L or mM).

    l. 224-226 Here again, as the scale is very different for the two measures, could you provide a repeatability estimate?

    l. 239-260 It is not clear to me whether the models for adult body mass on one side and TL, dROMs and OXY on the other differed, as the effects on the latter variables are not explicitly mentioned. Also, why did you not check body mass before treatment as you did for the other variables? Overall, rewriting this section to better highlight the fixed and random effects in each model would improve readability.

    l. 309 Although they “sacrificed” individual offspring condition, the total reproductive output of enlarged broods (total body mass) was still higher, so parents seem to gain from brood enlargement without suffering more damage. Do you have data on nestling survival after independence and future reproductive success (or from the literature) to assess whether parents of enlarged broods have any interest in increasing their effort?

    l. 327-342 You should discuss here the actual meaning of the dROM and OXY measures. Indeed, dROM cannot be simply interpreted as a measure of ROS, as several antioxidants act early in the oxidative cascade, before ROMs formation, and OXY does not directly measure enzymatic antioxidant activities, but rather non-enzymatic activities.

    Have you tested that the dynamics of TL and oxidative parameters with time did not differ between males and females? If so, please rephrase the statistics and results sections to make that clearer.

    l. 338 Do you have evidence that metabolic rate is higher during nestling feeding than during incubation in similar laboratory settings?

    l. 343-351 Please cite some references in this paragraph.

    l. 353-355 and 381 Could telomere loss also be interpreted as a cost of reproduction? How does it compare to telomere loss in non-breeding individuals over the same length of time?

    l. 358-361 Do you have enough power to detect the small effect found in other studies?

    l. 376-379 You might rather discuss that effect of age in the second paragraph of the introduction, when you discuss opportunistic vs. seasonal breeders.
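
To illustrate the comments on l. 192-195 and l. 224-226, here is a minimal sketch of how technical variation could be summarised on the final quantification scale (e.g. the TL ratio) rather than on Ct values. It assumes a balanced design in which every sample is measured the same number of times; the function names and the example values are hypothetical and are not taken from the manuscript. The repeatability is a one-way ANOVA intraclass correlation, which is one common estimator among several.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the CV (in %) of replicate measurements of a single sample."""
    return 100.0 * stdev(values) / mean(values)


def repeatability(groups):
    """One-way ANOVA intraclass correlation for a balanced design.

    groups: list of lists, one inner list of replicate values per sample;
    every sample is assumed to have the same number of replicates k >= 2.
    """
    k = len(groups[0])
    n = len(groups)
    grand_mean = mean(v for g in groups for v in g)
    # Mean squares between samples and within samples (technical replicates).
    ms_between = k * sum((mean(g) - grand_mean) ** 2 for g in groups) / (n - 1)
    ms_within = sum((v - mean(g)) ** 2 for g in groups for v in g) / (n * (k - 1))
    var_between = (ms_between - ms_within) / k
    return var_between / (var_between + ms_within)


# Hypothetical duplicate TL ratios for three samples (illustrative values only).
tl_ratios = [[0.98, 1.02], [1.30, 1.26], [0.71, 0.75]]
print([round(coefficient_of_variation(s), 2) for s in tl_ratios])
print(round(repeatability(tl_ratios), 3))
```

Reporting both quantities on the same scale makes it possible to compare technical variation directly with between-sample variation, which a CV of Ct values does not allow.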

  • Dear authors and Prof. Clayton, The manuscript examines the synchronization of parasitic great spotted cuckoo egg laying in magpie host nests. It seems like there is the basis for a good study in this manuscript, however there is missing information and it is presented very unclearly, therefore I cannot properly evaluate the science at this stage. The lack of clarity, I suspect, is due in part to the need for a native English speaker to rewrite the paper. However, the overall structure also makes it difficult to read. I give more details below, but in general, there is information in the discussion that needs to be in the introduction, the discussion needs to focus clearly and primarily on the results found in the study, there is information missing from the methods so I cannot evaluate the results, the aims and predictions are not always obvious and then are only referred to as Prediction X throughout the rest of the paper rather than summarizing for continuity, and the figures need clarification for proper interpretation. I hope my comments below will be helpful for a revision of the manuscript.

    Throughout: be consistent with the number of significant digits used for p-values; “prevalence of parasitism” and “parasitism rate” seem to have different meanings throughout the manuscript so explain the actual measure at each use instead of using these terms, which will improve clarity; in the results section, summarize each prediction to make it easier to remember what they are; summarise each prediction when referring to it rather than just referring to Prediction X since it is tedious to go back to the introduction to try to find the predictions each time.

    Detailed comments: Lines 106-107 - do you mean “‘underlying’ causes in some other systems”? If so, also give an example of what these other systems are.

    Line 117 - define “clutch completion”

    Line 126 - it should be “parasite density” rather than “parasitism rate” since the measure is the abundance of brood parasites (number of individuals?) per area. Later (line 146) you seem to use parasitism rate differently from density, so perhaps you have two separate terms? If so, define each.

    Line 132 - define “breeding cycle”. I would normally think of this as breeding or non-breeding season, but since you only collected data during the breeding season, you mean something more specific so it would be good to clarify.

    Line 134+ state the aims and predictions clearly. Perhaps a bullet format where each aim and each prediction is clearly delineated would be more useful. As it is, this information is a bit difficult to find in the paragraphs. As well, aims 1 and 4 are missing.

    Line 141 - “prevalence of parasitism”: is this the same as parasite density? If so, stick with one term throughout the paper, if not, define the term.

    Line 145 - state the direction of the relationship predicted in prediction 1b, i.e., that it should reduce the level of synchronization.

    Line 148 - prediction 1c implies that cuckoos attend to conspecific presence/competition to judge when to lay their eggs (or some other factor - be specific about what you think it is) and, thus, why they would lay an egg before the host does. Is there evidence to support this assumption? What cues do you predict the cuckoos are attending to for this prediction?

    Line 167 - what are “acceptor species”?

    Line 170 - what are the four hypotheses? Summarize them here for clarity and continuity.

    Lines 184-186 - by “parasitism rate” do you mean the percentage of eggs in each magpie nest that are parasitic eggs?

    Experimental set up (line 208) - it is unclear what was involved in experiment 1 (2011) and 2 (2012) (e.g., what condition was tested in which breeding stage). For instance, experiment 1 says that mimetic and non-mimetic eggs were used, but there is no mention of the experimental design with relation to experimental eggs. Perhaps presenting the experimental design in a table format would be preferable. Regardless, I would need at least the following information to evaluate the methods section...[a table was inserted into the original document]

    Line 220 - instead of “any”, do you mean “all” of the different breeding stages?

    Line 233 - by “prevalence of parasitism” do you mean the number of parasitic eggs per nest?

    Line 236 - instead of “GLZ” do you mean “GLM”? If not, explain what the Z stands for.

    Line 248 - by “parasitism prevalence” do you mean the percentage of nests with one or more parasitic eggs?

    Paragraph starting on line 269 - are these eggs already included in the previous analyses? What prediction do they go with? A topic sentence is needed to clarify how this paragraph fits with the rest of the results.

    Lines 283-285 - explain how the results differ from your prediction (though this is better left to the discussion).

    Line 318 - don’t you mean this is a positively correlated relationship (as you report in the previous sentence) rather than an inversely correlated one?

    Discussion in general: discuss your results primarily and then bring in other literature to place your results in context. As it is now, the focus is on other literature, more similar to an introduction than a discussion. Also, only prediction 1c is explicitly discussed and not the other predictions, which should also be discussed explicitly.

    Line 391 - is there evidence that this hypothesis involves non-adaptive behavior?

    Line 408 - is there evidence that domed nests are not the cause of poor synchronization?

    Line 425 - this hypothesis needs to be explained in more detail.

    Figures 1, 2, and 5 - the terms “Before magpie egg-laying” and “After magpie egg-laying” appear at the bottom of the figure, but it is not clear to what they refer. I imagine that the first two bars, titled “Before egg-laying” go with the “Before magpie egg-laying” category, while the rest of the bars go with the “After” category, however there needs to be a clear distinction made on the figure.

    Figure 1 legend: does the y axis refer to the number of cuckoo eggs or magpie eggs or both?

    Figure 3: define “parasitism rate” on the x axis. Y axis: how is this a frequency when the limits go from 0.00 to 0.40?

    Figures 4 and 5: define “ejection rate” on the y axis.

    I hope this feedback helps the revision of your manuscript.

  • The authors present a study of the relationship between laterality and boldness in rainbowfish. While not novel, the hypothesis and experiments are simple, clear and well-justified; the manuscript is largely clear, concise, well-written and easy to follow. It was nice to read such a clear, simple and uncomplicated MS. I have few comments and points for clarification:

    L 36: Species name?
    L 47: Species name? were -> where
    L 66: Authors repeated
    L 77: have a myriad of ecological -> have myriad ecological
    L 141: Could the authors clarify whether the fish's whole body had to emerge?
    LL 148-149: Why was this technique used? Can the data be coerced to be normal? It would make more sense to use a more standard repeatability measure that reports r and preferably a CI for the point estimate.
    L 152: Could the authors state here that they investigated interactions between the fixed effects? I had to wait until the Results to find that out.
    L 156: I assume that laterality is repeatable but I could not find this stated in the MS (apologies if I have missed it). Could the authors test this or provide a reference stating this in the Intro?
    L 211: Could the authors elaborate on the function of the corpus callosum for naïve readers?
    L 218: I don't think this explanation is all that mundane! One could use a different test of laterality (that does not involve conspecifics) to get around this issue.
    L 244: I find the wording difficult to follow here. Could the authors rephrase this sentence?
    LL 248, 250, 252: There are a couple of typos on these lines.
    Discussion: Why do the authors think there is no effect of sex in this study, given that they have previously reported a link between laterality and sex?

    I hope the authors find my comments useful. Alecia Carter (Please note I sign all my reviews)

  • Review of BEAS-D-13-00136: Variation in plasticity of personality traits implies that the ranking of personality measures changes between environmental contexts: Calculating the cross-environmental correlation.

    The author presents a method to estimate the cross-environmental correlation of a behavioural trait using random regression. I am unable to comment on the maths presented, and I thus focus my comments on the other aspects of the manuscript. Although the method seems very well explained, it is still unclear to me what the information provided by the cross-environmental correlation adds to our current understanding of behavioural plasticity in an animal personality framework. The author suggests there is "clear scope to widen our understanding for why behavioral consistency is maintained" in the Introduction (LL64-65) and that "future studies can benefit" (L365), without explaining what the scope covers or what the benefit may be. Some information about the scope is provided, but not until the Discussion (in the hypothetical example on LL347-349). Although I realise this is a methods submission, I would suggest the author make the scope/benefits clearer and more explicit.

    It is further unclear to me how the information provided by the cross-environmental correlation differs from the information provided by the intercept-slope correlation, even if it is obvious that they are different from the figure and the text (LL328-337). From what I understand from this paper, in very simplistic terms, the cross-environment correlation quantifies rank-order consistency of individuals across a gradient (although it can obviously generalise across more than one context/environment, making it more useful than a direct correlation)? And the Correl I, I x E (int-slope correlation) will give information about how far the reaction norm is being estimated from the point that all intercepts converge, assuming an I x E interaction (Fig 1e). If there is a negative cross-environment correlation, then there cannot be correlation between I and I x E (see Nussey et al. 2007 Fig 2 e-h). If there is a positive cross-environment correlation AND a positive Correl I, I x E, the reaction norm is shifted to the right (moves positively along the x axis) (Fig 1c or Nussey et al. 2007 Fig 2 e-f) and if there is a positive cross-environment correlation AND a negative Correl I, I x E, the reaction norm is shifted to the left (no representative figures, but Fig 1c or Nussey et al. 2007 Fig 2 e-f could be flipped on the vertical axis). So, the covariance between intercepts at E1 and E2 will give an idea of whether the reaction norm is right or left shifted, assuming an I x E interaction? I apologise if I have not understood the estimate, or if my understanding of I, I x E correlations is mistaken.
Whether or not I have understood this correctly, I would suggest (1) that the author add a (few) hypothetical example(s) explaining why this is important and what it can tell us that we do not already know or are already estimating; (2) to add the intercept-slope correlation as an extra column of panels to Figure 1 so that the reader can understand how they differ from the cross-environmental correlation and why they are both necessary to understand the reaction norm (a reference to this difference and Fig. 1 is made at LL212-213) (or some other way of explaining the difference as it is currently unclear); and (3) add a list of advantages to using this method so it is clear to the reader what it adds. I would also suggest adding a figure that shows what the reaction norms would look like in the available parameter space in figure 2. For example, the top right corner can be occupied by parts of the reaction norm similar to Fig. 1c, the top middle part can be occupied by parts of the reaction norm similar to Fig. 1e etc. I can't imagine what can (significantly) occupy the bottom left corner.
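
To make the distinction discussed above concrete, here is a minimal sketch of how a cross-environment correlation follows from random-regression variance components. It assumes a linear reaction norm in which each individual's value in environment E is its intercept plus its slope times E, and it considers only the between-individual level. The function and parameter names are hypothetical, and the formula is the generic textbook result for this model, not necessarily the author's own equation.

```python
from math import sqrt


def variance_at(env, v_int, v_slope, cov_is):
    """Between-individual variance at environmental value env."""
    return v_int + 2.0 * env * cov_is + env ** 2 * v_slope


def cross_env_correlation(e1, e2, v_int, v_slope, cov_is):
    """Correlation of individual values between environments e1 and e2.

    v_int:   variance in intercepts (at env = 0)
    v_slope: variance in slopes (I x E)
    cov_is:  intercept-slope covariance
    """
    cov = v_int + (e1 + e2) * cov_is + e1 * e2 * v_slope
    return cov / sqrt(variance_at(e1, v_int, v_slope, cov_is)
                      * variance_at(e2, v_int, v_slope, cov_is))


# Same (assumed) variance components, environments on either side of env = 0:
# the further apart the environments, the more reaction norms cross.
print(cross_env_correlation(-1.0, 1.0, v_int=1.0, v_slope=1.0, cov_is=0.0))
print(cross_env_correlation(-2.0, 2.0, v_int=1.0, v_slope=1.0, cov_is=0.0))
```

Under these assumptions the intercept-slope covariance enters both the numerator and the two variances, which is why the intercept-slope correlation and the cross-environment correlation carry related but not identical information.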

    Further, I believe it would be very useful (and helpful) to use consistent notation across articles, therefore I would strongly suggest that the notation in the eqns follow that of Dingemanse & Dochtermann (2013) e.g. in eqn (1), V(ey) -> Vind0y; V(sy) -> Vind1y; C(ey,sy) -> Cov ind0y, ind1y (I believe).

    Finally, I'm unsure what the treatment of the function-valued vs character state approaches adds. From my reading of Stinchcombe et al., behavioural ecologists do seem to use the function-valued trait approach (though the author does provide an example of a character-state approach). I think it needs to be explained in more detail if it is kept in the ms.

    Below are some minor points that the author may find useful to include/address:

    L28: which both are of -> both of which are of
    L52: on -> at
    L72: usual -> usually
    L87: in -> is
    L160: AsReml -> ASReml
    L163: is therefore to -> therefore is to
    L191: variances -> variance
    L121: how -> what
    L215: could therefore not be -> therefore could not be
    L216: can you explain univariate residuals? (Also, my understanding is that univariate residuals are assumed/default for most statistical packages in R [other than MCMCglmm], which could explain the reviewed literature. This also applies to the point about homogeneous residuals L296, I believe)
    L225: Which Table?
    L232: remove 'what' and 'is'
    L233: remove 'as'
    L238: what does the author mean by 'decent'?
    L248: "how much variation there is in plasticity relative to variation in plasticity" makes little sense.
    L246: I don't yet understand how the correlation between intercept and slope describes the level of crossing; surely a negative cross-environmental correlation indicates crossing (and positive = no crossing), and the correlation between intercepts and slopes indicates how far from the crossing? Perhaps this could be clarified here.
    L258: could the author clarify how the information is of interest?
    LL273-275: could the author give some examples here?
    L283: that -> than
    L314: could the author please provide an example of the suggested plot?
    L317: could the author please provide a reference for this?
    L367: estimated -> estimates (but the wording is a bit odd here still).
    Table 1: what does "r_e_s" refer to?
    L468: panel -> panels
    L469: individual -> individuals
    L508 and Table 2: R(12) is perhaps a bit misleading. Could the author use r1(E1,E2) from eqn. (5)? Further, could the author explain in the text the limits of this estimate?
    L546: This data is -> These data are
    L553: that -> what

    Also, it would be useful if the data set and a text file of the code were made available to the reviewers (I'm not sure if they were both uploaded and I just don't have access).

  • Review of BEAS-D-12-00617: Testing for between individual correlations of personality and physiological traits in a wild bird

    The paper presents an investigation of a pace-of-life syndrome (POLS) in wild blue tits. The authors attempt to link behavioural and physiological/immunological traits in a 'behavioural syndrome'. They show repeatability of all the measured traits, but the authors found little to no evidence for a POLS. The paper is timely and the results are important. The sample size of individuals is very large for this kind of study, but the number of repeat measurements per individual is not. I found the paper very difficult to follow in many places, which I believe detracts from what is a potentially interesting and sophisticated study. Many of my comments below are on the presentation of the ms; I hope the authors find these comments helpful. I have a few more major theoretical/methodological comments for the authors:

    1. The authors seem to present the idea of testing a POLS in the first paragraph of the Introduction (and indeed it is a keyword of the study), but do not return to this idea again. Are the authors testing this theory? If so, I recommend structuring the paper around testing this hypothesis. Just because the authors found no evidence for this is not a reason not to mention the POLS theory in the Discussion - this result is important in that context.
    2. I find the presentation of variance partitioning in LL96-118 very confusing. For example, at L99 the authors define residual variation as a correlation at the within-individual level, but at L101 as environmental effects. Please clarify at the first mention what the within-individual/environmental/residual level is. There is a mistake in Eq. 1: after rw it should be [(1-R1)(1-R2)] bar (not [1-R1 1-R2] bar which is equivalent to [1-R1-R2] bar), but it's not clear where the (geometric) mean is coming from (which needs to be introduced and explained as well); from my reading of Searle 1961 this is supposed to be the square root, not the mean (from Eqns 3 and 5 in Searle, but I note that in these eqns, r refers to the genetic correlations). In any event, I think this section needs to be better explained. Can I encourage the authors to use the terminology/notation used in Dingemanse & Dochtermann. 2012. Quantifying individual variation in behaviour: mixed-effect modeling approaches. Journal of Animal Ecology. There is already confusion in terminology in the personality field and no need to introduce more confusion in notations. Further, could the authors provide an explanation of how R1 and R2 are calculated, and add to the end of the paragraph what methods (i.e. LME models) are used to estimate the between-individual correlations (although this should probably be introduced earlier in the paragraph).
    3. Novel object experiments: L183: was the same 'novel' object (pink pig) used for all novel object presentations? If so, the authors will not be measuring neophobia/boldness/exploration in the second+ presentations. At L194: what are the variables that are measured that are not repeatable? Do these refer to the novel object presentations? Why was no novel object presentation control performed? The responses of the birds may well be because an experimenter put his/her hand in the cage, not because a novel object was introduced per se. Are any data used that are described from this test in this study?
    4. Does body mass/size have an effect on any of the measured variables? i.e. this should be included in the models or an explanation of why it wasn't included provided.
    5. At L257 & 264: why calculate the raw phenotypic repeatability if variation can be explained by fixed effects that are included in other models? Further, I'm not sure an LRT is appropriate to compare LMEs and LMs; can the authors provide a reference/justification for this? Alternatively, can the authors use rptR in R to estimate the repeatabilities and p-values for these if they don't control for other fixed effects. Are LL265-68 necessary if the authors do not use this approach (even if it is more appropriate)? I don't understand why the authors continually refer to adjusted repeatability (Methods, Discussion, Table 4), but do not seem to calculate/present it (but it is assumed that it has been done at L454). Why not? I find the presentation of the (adjusted/raw) repeatabilities overall very confusing.
    6. Could the authors state whether they had sufficient power to accurately estimate between- and within-individual variances given the very low number of repeat samples (see 'Sampling Designs …' section of Dingemanse & Dochtermann and references therein).
    7. Do the authors have enough power to estimate behavioural reaction norms? How much of the within-individual variation could be attributed to individual differences in plasticity? It's strange that this isn't mentioned at all in the Discussion, although the authors have a paper in press that deals with this idea it seems.
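
As an illustration of major comment 2, here is a minimal sketch of the square-root form of the variance-partitioning identity that the comment refers to, in which the phenotypic correlation is a weighted sum of the between- and within-individual correlations. The function and argument names are hypothetical, and R1 and R2 are taken to be the repeatabilities of the two traits. This is the commonly cited form of the identity and is not a transcription of the manuscript's Eq. 1.

```python
from math import sqrt


def phenotypic_correlation(r_between, r_within, rep1, rep2):
    """Combine between- and within-individual correlations.

    r_P = r_B * sqrt(R1 * R2) + r_W * sqrt((1 - R1) * (1 - R2))

    rep1, rep2: repeatabilities (R1, R2) of the two traits, each in [0, 1].
    """
    between_part = r_between * sqrt(rep1 * rep2)
    within_part = r_within * sqrt((1.0 - rep1) * (1.0 - rep2))
    return between_part + within_part


# With low repeatabilities, even a strong between-individual correlation
# contributes little to the phenotypic correlation (assumed values).
print(phenotypic_correlation(r_between=0.8, r_within=0.0, rep1=0.25, rep2=0.25))
```

The weights are geometric means of the repeatabilities (and of their complements), which is where the square root in the comment comes from.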

    General points re clarification/style:

    1. Paragraph starting L119 and elsewhere: I believe that the convention for reporting methods and predictions is to present them in the past tense. Further, the sentence on L119 belongs in the Methods section, as it doesn't add much here (at least the last part of that sentence).
    2. I found the Introduction long and difficult to follow. The authors introduce the POLS before introducing covariation among traits and then they return to discussing physiology and immunity (which they sometimes call the same thing, sometimes not). I realize this is a matter of individual style, but the authors may find the flow of the Introduction improved if the Introduction (and Methods) is shortened and re-worked slightly. Paragraph 4 is too long and should be broken at the sentence starting at L94 and the beginning of the paragraph shortened and subsumed in the previous paragraphs, for example.
    3. L170: will it matter that the breath rate was calculated after different amounts of handling time for different individuals? I assume that the time between trapping and handling and breath rate estimation must be different for all trapped individuals; did the authors record this, and will it have an effect?
    4. L321: Perhaps I've misunderstood, but just because there is a correlation between the same trait in both seasons does not mean that 'there is no season-dependence for these traits'. In fact, 'Season' has an effect on all of the measured variables but aggression, so I would say that those behaviours/measures are context dependent. Please clarify what the authors mean here.

    Minor points for clarification in the ms that the authors may find helpful to include:

    1. L24: haematocrit, -> haematocrit;
    2. L24 and throughout the ms: between individual level -> between-individual level
    3. L27: Remove 'Furthermore'
    4. L29: allowing to partition phenotypic -> allowing the partitioning of phenotypic
    5. L31 and 43: pace-of-life
    6. Sentence ending L37: needs a reference. Also, the list of taxa does not add anything so I would suggest removing it.
    7. L39-41: remove occurrences of 'they' and 'e.g.'
    8. L42: advocate integration -> advocate the integration
    9. L43: the POLS abbreviation is never used and is not needed.
    10. L45: individual's life can -> individual's life history can
    11. L50: that are co-varying over -> that co-vary over
    12. L56 & 57: an individuals' physiology -> an individual's physiology or individuals' physiology
    13. L65: recently -> recently,
    14. L65: I believe this author is Careau
    15. Sentence starting L74: Move to L70
    16. L78: what does 'living environment' mean?
    17. L88: For example, -> In one example, (or something similar as the previous sentence stipulates that there are not many studies that investigate this and the authors are now giving an example of one)
    18. L99: between- and within-individual level should be hyphenated
    19. L107: difference -> differences
    20. L123: with -> of
    21. L129: remove dash after IgG
    22. L134: remove 'up and'
    23. L145: to what accuracy was age determined? i.e. to the nearest year?
    24. L152&3: use a different numbering style (e.g. i, ii and 1, 2, or a, b, etc.) to demonstrate these are two different lists
    25. L152: There needs to be an introductory sentence here (the subtitle is not enough) e.g. 'We measured Xnumber behaviours during handling after trapping the study birds.'
    26. L156: My understanding is that a Likert score refers to the agreement a rater has with a statement such as, in this case, 'the bird struggles and pecks a lot'. What seems to have been done here is rating on an interval scale ranging from 1 (…) to 5 (…). That's fine, it's just not a Likert scale per se, so remove the reference and call it an interval score.
    27. L156: picking -> pecking
    28. L164: minutes -> min (this can be changed elsewhere as well)
    29. Can the authors be consistent in whether they use digits or write numbers in full for small values? See LL 140, 164, 182, 183 etc.
    30. Could the authors clarify which birds had only their tarsus measured (L142) and which birds had all tarsus, head, wing, and tail (LL152-155) measured?
    31. L152: this is confusing as written currently. May I suggest that the authors add after 'handling: …': 'We scored birds' aggression while collecting the following morphological measurements:'
    32. L178: is this part of the (i) aggression, (ii) breath rate list? If so, why does this one get its own heading? If not, remove '(iii)' or add (i) and (ii) to the other two subheadings. If the authors do that, please use a different enumeration again for the lists at LL152-5 so that the reader knows to which list the authors are referring.
    33. L170: remove 'the lap function of' as this is unnecessary
    34. L180: insert spaces between digits and symbols
    35. L196: is 'can be viewed as' necessary? Is this the number of hops and flights? If so, remove the aforementioned text; if not, please clarify, as readers do not have access to the cited paper.
    36. Around L239: can you please provide the median, minimum and maximum number of (repeated) measurements on each individual? Or remove this from here completely as it is repeated at L307.
    37. L240: after 'records)' replace with 'for which both behavioural and physiological measurements were taken on the same day'
    38. Section L243: I don't understand why this is listed here. In the section above, the authors suggest that if there were any missing points those records were discarded (L239), is that correct? If so, this section could be turned into one or two sentences and added to the previous section.
    39. L246: of -> for
    40. L248: remove semicolon
    41. L249: analyse, to some -> analyse, some
    42. L256: could the authors clarify which five traits were being assessed (given the confusion created by the Cage Behaviours section)?
    43. L259: as fixed -> as a/the fixed
    44. L265: what fixed effects are the authors referring to?
    45. L270: please clarify the 'context' that the authors refer to here.
    46. L276: please clarify what the authors mean by 2nd calendar year etc. for age. Was this the second year they were observed? Or the estimation of their age? If so, '2 years old' is descriptive enough.
    47. L273-281: could the authors make this clearer? I had to read it many times to understand the modelling approach used. Also, were no birds measured in two breeding seasons or two winters? i.e. are there only birds that were measured from one winter to one breeding season or one breeding season to one winter?
    48. L284: perspective affect -> perspective, affect
    49. L285: put a space between 1 and pm
    50. L284 and elsewhere: In-sentence lists should be started with a colon
    51. L291: Could the authors describe the delta-method and supply a reference; I am not, and other readers may not be, familiar with it.
    52. L280 and elsewhere on this page: can the authors use multivariate here? Or use multi-variate or multivariate consistently?
    53. L294: which correlations? This is odd at the beginning of a paragraph; could the authors clarify which correlations are being referred to?
    54. L295: correlations, -> correlations;
    55. L299: summing up all the -> summing the
    56. L310: birds -> birds'
    57. L315: insert commas around 'respectively'
    58. L316: lower 0.18 -> lower at 0.18
    59. L317: remove 'ranged between' and the dash after IgG (or the space to match the presentation at L319) and add ', respectively' at the end of the sentence.
    60. L337: remove 'nearly significant' and replace with something more appropriate
    61. LL357-366: Much of this is speculation/explanation that belongs in the Discussion
    62. L373: how is a bird cage 'semi-artificial'? It seems completely artificial to me.
    63. Throughout the Discussion: Can the authors use past tense?
    64. L385: Could the authors clarify to which estimates they refer?
    65. L388: Are the authors referring to the POLS? If so, can they be explicit about it (as that terminology is used in the keywords and Introduction)?
    66. L400: reflects -> reflect
    67. L401: slower -> lower
    68. L415: breath -> breathe
    69. L415: can more easily acquire oxygen -> can carry more oxygen
    70. L418: remove 'e.g.' and add ', for example' at the end of the sentence
    71. L419: what does 'a better physiological condition' mean?
    72. L427: over the ontogeny -> over ontogeny
    73. L426: Exciting idea!
    74. L436: 10 min is not a long time for 'acclimation', especially as the authors use a measure of breath rate after about the same amount of capture-handling time. Do the birds not acclimate to the handling? I suggest removing the reference to acclimatisation (as the authors later attribute the freezing to being stressed in the cage at that point).
    75. L589: Can the authors confirm that these are the BLUPs for the individual intercepts? (Also, some description of what these are is likely necessary for many readers.)
    76. Table 1: Ig -> IgG? Un-hyphenate breath rate here, in the next table and at L272
    77. L599: Why 'nearly significant'? Can't the authors say 'trend'? Perhaps italicise the trends to show they are not significant but may be important.
    78. L600: a description of the superscript is unnecessary
    79. Table 4, Line 13: modle -> model
    80. Table 4, Line 15: Does Table X3 refer to Table 3?
    81. Table 4, Line 16: To what do Vi and Vr refer? Please define all notation used, either here or in the text.

    Thank you for asking me to review this paper. I hope the authors find my comments constructive and helpful.

