Content of review 1, reviewed on June 04, 2022

Summary and general impression
This paper reports behavioral data on speech comprehension performance and auditory-motor synchronization as a function of the syllable rate of the stimuli, the participant's preferred auditory rate (syllables/sec), and the participant's preferred motor rate (syllables/sec). The report contributes to the literature on the cognitive (neuro)science of language, even though, in my opinion, the paper presents the results as stronger than the data support.
For example, Figure 4 offers a schematic representation of 4 different variables (auditory-motor coupling strength, preferred auditory rate, preferred motor rate, and speech comprehension performance) in an attempt to visualize how these variables affect each other, but unfortunately the depiction is not intuitive enough. In the major comments I suggest ways to improve the comprehensibility of the schematic representation. Also, Figure 4 refers to auditory-motor coupling, but the Discussion (lines 444-458) clarifies that the measurement was actually auditory-motor synchronization, which is taken as a behavioral indicator of auditory-motor coupling in the brain. This discrepancy shows that the authors are making claims at the neural level based only on behavioral measurements. I would strongly recommend staying with the term auditory-motor synchronization in the schematic representation as well and, if you would indeed like to draw conclusions about auditory-motor coupling, phrasing those carefully in the discussion text, as you have already attempted in lines 444-458.
I would not recommend rejection, but a major, thorough revision in the hope that the resubmission will be more concise.

Major points
Lines 139-146: The description would be clearer if you could add a graphic representation of the trials for each experiment. Please also add the time course of each trial: what was presented and for how long, when the participants were allowed to respond, etc.
Line 157: Please expand on the motivation of minimizing auditory feedback. As it stands currently, I don’t see how it could affect the behavioral measures.
Lines 159-165: Please add a schematic representation of this trial; it would make the explanation much clearer and more intuitive.
Lines 168-173: Please explain why whispering was the best option.
Lines 187-188: Why were trials with z-scores larger than 2 excluded? Please explain or refer to a paper where this is advised.
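To make this concern concrete, here is a minimal sketch of what such a cutoff does by construction. It uses synthetic, normally distributed values rather than your data, the function name is hypothetical, and it assumes a two-sided criterion (|z| > 2), which may differ from the procedure you actually used. Under these assumptions roughly 4 to 5% of perfectly ordinary trials are discarded, which is why the criterion needs a justification.

```python
import random
import statistics


def exclude_by_zscore(values, threshold=2.0):
    """Return the values whose absolute z-score does not exceed the threshold."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    return [v for v in values if abs((v - mean) / sd) <= threshold]


random.seed(0)
# Synthetic stand-in for a per-trial measure; NOT the authors' data.
trials = [random.gauss(0.0, 1.0) for _ in range(10_000)]

kept = exclude_by_zscore(trials)
excluded_fraction = 1 - len(kept) / len(trials)
print(f"Excluded fraction: {excluded_fraction:.3f}")
```

If the underlying distribution is skewed, the excluded trials will fall mostly in one tail, which could bias the estimates.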
Lines 191-193: This sentence implies that only participants with very high auditory rate were included. Wouldn’t this reduce variability in the preferred auditory rate and therefore not present the full picture of participants’ preferred auditory rate? This exclusion needs to be motivated clearly or not performed at all. This is important, especially in light of the results that the preferred auditory rate was not found to affect auditory-motor synchronization.
Line 214: “perplexity”: What is the reason for choosing the word “perplexity” and not “complexity”? I am more familiar with the definition of “complexity”, and I think it would be useful to the reader if you added a sentence explaining the relation between these two features: how similarly or differently they are defined and/or how similarly or differently they have been modeled/implemented. Please also add references to this sentence to show where the term “perplexity” comes from.
Line 226: Why weren’t random slopes included in those models? I would encourage using random slopes as previous work has shown that intercept-only models can inflate Type I error (see Barr et al., 2013, Journal of Memory and Language).
Lines 235-236: The GLMM with logistic link function needs more specification. Which R package and which function did you use? (Was it the mgcv package?)
Line 316: “the odds of … increased by …” I am not familiar with this way of reporting statistical significance; could you please elaborate or provide a reference that explains it? It is also not clear to me why you would divide this perfectly good continuous variable into two categories. Wouldn’t it be easier to include it as a continuous predictor in the model? The motivation for this “dichotomy” is not spelled out well in the Introduction and Methods. Lastly, in light of the results on auditory-motor synchronization and the relevance you would like to show in relation to auditory-motor coupling, it would make much more sense to model this variable as continuous.
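To illustrate why I would prefer a continuous predictor, here is a minimal sketch of the information lost through a median split. Everything in it is an assumption made for illustration: the data are synthetic, the effect size (0.5) is arbitrary, and a plain Pearson correlation stands in for your mixed model.

```python
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


random.seed(1)
n = 5000
# Synthetic continuous predictor and an outcome that depends on it linearly.
predictor = [random.gauss(0.0, 1.0) for _ in range(n)]
outcome = [0.5 * p + random.gauss(0.0, 1.0) for p in predictor]

# Median split: the same predictor reduced to two categories (0 = low, 1 = high).
median = statistics.median(predictor)
split = [1.0 if p > median else 0.0 for p in predictor]

r_continuous = pearson_r(predictor, outcome)
r_split = pearson_r(split, outcome)
print(f"continuous: r = {r_continuous:.3f}, median split: r = {r_split:.3f}")
```

In this toy setting the dichotomized predictor shows a weaker association with the outcome than the continuous one, although the underlying relation is identical.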
Lines 355-356: Given this sentence, I strongly recommend reporting the values after correcting for multiple comparisons. It is very important to report the corrected values; otherwise, readers who do not read the paper carefully enough might wrongly assume that the reported values are already corrected.
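As an illustration of what reporting corrected values could look like, here is a minimal sketch of the Holm step-down adjustment. Holm is only one possible choice, named here as an example and not as the method you must use; the function name and the three p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical uncorrected p-values from three comparisons.
raw = [0.01, 0.04, 0.03]
adjusted = holm_adjust(raw)
print(adjusted)  # approximately [0.03, 0.06, 0.06]
```

In this made-up example, two of the three comparisons no longer fall below .05 after correction, which is exactly the kind of difference a reader needs to see in the reported values.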
Lines 413-420, Figure 4: This Figure is very difficult to comprehend. While I appreciate the attempt to create a schematic representation of the relations between the measured variables, I’m afraid that this picture does not serve the purpose. It is too complex, includes too much information and is not clear about the relation between the variables. As mentioned earlier, I would like to see the term auditory motor coupling replaced by auditory motor synchronization in the figure, as it would make it more precise. I would exclude the preferred auditory rate distribution and the speech plus the arrow graphs, as they seem more distracting than focusing on the core message. The core message is the auditory motor synchronization and how 1) performance, 2) preferred motor rate, and 3) preferred auditory rate relate to it. I would recommend taking the middle portion of the image (Circled A and M and the arrows) and repeating it 3 times in one figure with 3 panels. In the first panel you would show that the strength of the arrows (thickness of the arrows) differs with regard to performance. In the second you would show how the same relation (AM-synchronization) differs in relation to preferred motor rate. In the last panel you can keep the arrows in the same thickness, or have only one set of arrows instead of 3, showing that there is no such effect of preferred auditory rate on AM-synchronization. I think this simpler visualization would make the message much clearer in an intuitive way.
(I hope you appreciate the time and effort I have invested in thinking along to make this Figure better; my partner and I discussed it for about an hour before we found a satisfactory suggestion.)

Minor points
Line 71: Auditory cortex, capital letter after colon, maybe this is not what you meant.
Lines 73-76: difficult sentence, please cut in two parts and rephrase.
Line 76: Decoding, the article is missing
Generally, too many parentheses are used. Please see if you can include the phrases that are currently in parentheses into the sentence with commas or make a new sentence, because the parentheses break the text flow, example on lines 89-90.
Lines 98-100: This sentence lacks a citation, please add the evidence.
Line 130: Insert a subchapter Participants. Also, please add the names of the “local ethics committees” so that the information is complete. Please ignore this comment if the journal requires you to provide the ethics details in the supplementary material.
Line 142: “classic intelligibility task” please add a citation, to clarify.
Line 208: “k-means algorithm” needs a reference, for someone who doesn’t know what this is and will need to look it up.
Line 211: Please explain what θ, τ, and Τ stand for in the formula.
Lines 260-262: This sentence is already discussing the results, I would exclude it from the Results and add it in the right place in the Discussion.
Lines 265-270, Figure 1: Please add a letter (A, B, C, and D) to each panel of the figure so that each panel can be referred to more clearly. Figure caption: B “Predictions from a generalized additive mixed model (GAMM)”: is this estimated from or fitted to the data? I am confused; please clarify in the caption text.
Lines 273-279, 303-306, 359-369, 386-394: These paragraphs belong in the Methods section; please remove them from the Results. The Results section should report on the data, not repeat expectations or discuss the results; please keep it as simple as that. In the last of these passages, the sentence on lines 394-396 is the only one that belongs in the Results section.
Line 285: “most participants” needs to be specified in percentage.
Line 307: “significantly influenced” please add inferential stats values to make the significance statement specific, or combine with the next sentence so that the stats and the statement are read together.
Lines 328-335, Figure 3: 3A and 3B are showing overlapping information, I recommend excluding 3A. 3B: again, as in Figure 1, please include a letter/number for each panel. Figure caption: missing stats reports for main effects, please add this information either here or in the results text.
Line 351: “each unit”, does this mean “each word”? Please clarify what unit stands for.
Lines 476-479: This sentence is too long. I’d reformulate: “Interestingly, our findings suggest that the facilitatory effect of linguistic predictability is particularly effective at fast rates. Second, this effect may be used differently …”
Line 480 “under what conditions”: This is a very interesting consideration, can you please give examples of such conditions affecting the impact of the motor system on speech comprehension? I thought of trained musician motor systems, but what are your thoughts? Would be interesting to know the angle from which you’d approach this.

Source

    © 2022 the Reviewer.

References

    Christina, L., Anne, K., Jonas, O., David, P., M., R. J. 2023. Explaining flexible continuous speech comprehension from individual motor rhythms. Proceedings of the Royal Society B: Biological Sciences.