Content of review 1, reviewed on December 08, 2022

I would like to thank the authors for their careful revisions and their detailed response to my earlier comments. Nevertheless, I must say that my earlier concerns remain the same for the most part.

1) Regarding my point that “the findings are somewhat incremental in the sense that they mostly build upon the earlier paper by Niarchou et al. (reference 8), and only add a few novel aspects in my view”, the authors made an effort to highlight the impact and significance of the study. I am not entirely convinced by their argumentation but, as it is mostly a matter of opinion in the end, this is not a major impediment to publication as long as the authors make sure to clearly differentiate this study from the earlier one (which they did better in the revised version).

2) More problematic is the issue of the ceiling effect for the SMDT rhythm perception task. This is a major issue: the mean score for this test was 16.27 (out of a maximum of 18) in the current study, and in fact the mode of the distribution of the scores was 18 (i.e., the highest possible score)! To put it bluntly, this test has very poor psychometric properties and is statistically uninformative given that participants made, on average, about two mistakes out of 18 trials.

To their credit, the authors now acknowledge that “many individuals scored perfectly on the rhythmic discrimination task” but they argue that “the distributional characteristics were similar to the original population sample that was not selected for musical ability”. Prompted by this response, I went back to the original Ullen et al. (2014) study, and indeed the mean score obtained on the rhythm task was around 15.3 (Table 1 of Ullen et al.). However, in the same sample, the mean score for participants with music education was around 15.7, while it was around 14.9 for participants with no music education (Table 3 of Ullen et al.; similar results when categorizing participants on the basis of musical experience instead of education). This is important because it indicates that even people with no musical education made on average about three mistakes out of 18 trials. In other words, the ceiling effect is present even among non-musicians, meaning that, whether or not the authors actually “oversampled individuals interested in music”, the root of the problem is that the SMDT rhythm task itself has very poor psychometric properties and should be revised and made much more difficult (ideally, the mean score for non-musicians should be somewhere between 8 and 11 [if we keep 18 items] to avoid both floor and ceiling effects). I understand that the authors found it advantageous to use the SMDT rhythm task for other reasons, but, as I indicated in my first review, the poor statistical properties of the data thus obtained severely affect all subsequent analyses.
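
To make the arithmetic behind these figures explicit, here is a minimal sketch that converts the mean scores quoted above into mean errors and proportions correct, and expresses the suggested 8 to 11 target range on the same scale. The means are the approximate values cited in this review; the dictionary labels and the helper name `summarize` are introduced only for illustration.

```python
MAX_SCORE = 18  # number of items in the SMDT rhythm task

# Approximate mean scores as quoted in this review
# (current study; Ullen et al. 2014, Tables 1 and 3).
mean_scores = {
    "current study": 16.27,
    "Ullen et al., full sample": 15.3,
    "Ullen et al., music education": 15.7,
    "Ullen et al., no music education": 14.9,
}


def summarize(mean, max_score=MAX_SCORE):
    """Return the mean number of errors and the proportion correct."""
    return {
        "mean_errors": round(max_score - mean, 2),
        "proportion_correct": round(mean / max_score, 3),
    }


summary = {label: summarize(mean) for label, mean in mean_scores.items()}

# Suggested target range for non-musicians (8 to 11 out of 18),
# expressed as proportions correct.
target_range = (round(8 / MAX_SCORE, 3), round(11 / MAX_SCORE, 3))

for label, stats in summary.items():
    print(f"{label}: {stats}")
print("target proportion correct:", target_range)
```

Under these figures, even the group with no music education answers roughly 83% of items correctly, well above the 44% to 61% band that the suggested range corresponds to.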

Here, I want to briefly address the topic of normality: even if the arcsine-transformed scores show distributional characteristics that are closer to normality, this does not in any way alleviate the main problem, which is that the mode of the distribution is the highest possible score. The ceiling effect remains as problematic as before, as is obvious from even a cursory look at Figure 3 of the current study. However, looking more closely at Figure 3, I noticed that the relative height of the histogram columns for the rhythm discrimination task changed between the original version of Figure 3 and the revised version. This is puzzling because, while the arcsine transformation (or any mathematical transformation for that matter) would be expected to change the overall shape of the histogram, specifically the relative distance between the columns, the relative height of the columns should be the same (obviously, all scores of 18 should yield the same value after arcsine transformation, same with the scores of 17, and so on, so that the relative height of the two highest columns [for instance] in Figure 3 should be the same in both the old and new versions of Figure 3). Even more puzzling, I noticed that there appear to be six columns between 15 and 18 in the histogram found in the original version of Figure 3, although there are only four possible scores (15, 16, 17, and 18). Reassuringly, Figure S1 on p. 13 of the Supplemental Notes (original submission) does show the expected four columns between 15 and 18, although here again the relative height of the histogram columns changed between the original and revised version of Figure S1.
Hopefully, these discrepancies are only due to a glitch with the graphic software and are not indicative of a more serious problem with the data, but I would like the authors to check their data for any errors, redo Figures 3 and S1 if necessary, and make sure that the relative height of the histogram columns stays unchanged no matter what transformation is applied to the scores, as it should.
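
The check requested here can be sketched in a few lines. The example below assumes the common arcsine square-root transform of the proportion correct, which may differ from the exact transformation the authors applied, and it uses made-up scores rather than the study's data. The function name `arcsine_sqrt` is hypothetical.

```python
import math
from collections import Counter


def arcsine_sqrt(score, max_score=18):
    """Arcsine square-root transform of the proportion correct (assumed form)."""
    return math.asin(math.sqrt(score / max_score))


# Hypothetical scores for illustration only (not the study's data).
scores = [18, 18, 18, 17, 17, 16, 15, 18, 16, 17, 14, 18]

# Count how many participants obtained each distinct value,
# before and after the transformation.
raw_counts = Counter(scores)
transformed_counts = Counter(arcsine_sqrt(s) for s in scores)

# Every raw score maps to exactly one transformed value, so the count
# attached to each column must be identical in both versions.
for s in sorted(raw_counts):
    assert transformed_counts[arcsine_sqrt(s)] == raw_counts[s]

# The transform is strictly increasing, so the column order is preserved too.
sorted_raw = [raw_counts[s] for s in sorted(raw_counts)]
sorted_transformed = [transformed_counts[v] for v in sorted(transformed_counts)]
print(sorted_raw, sorted_transformed)
```

Only the positions of the columns on the horizontal axis change. Plotting one bar per distinct value, rather than relying on automatically chosen bin edges, makes the comparison between the raw and transformed figures direct.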

Finally, since the authors hope that “this study will be informative and inspirational to other researchers in the field of music cognition”, I would, as a first step, strongly recommend using tests and questionnaires with solid psychometric properties, and which generate statistically informative data (no floor or ceiling effects, and a distribution of scores showing enough variance to be able to conduct meaningful statistical analyses based on these scores). This is the case neither for the SMDT rhythm task, nor for the “Can you clap in time with a musical beat” question to which 92.6% of participants responded “yes”. In my opinion, when conducting research in this field, as much care and attention should be spent on collecting quality phenotypic data as on obtaining quality genetic data: if either of them is not up to par, the entire study is affected. In the present case, this means that, regardless of the amount of effort spent on collecting genetic samples and conducting state-of-the-art genetic analyses, the poor statistical properties of the SMDT rhythm scores irremediably affect all analyses based on these scores. If this study is accepted for publication, I would like the authors to acknowledge, even more explicitly than they have done in the current version, that the poor psychometric properties of the SMDT rhythm task (and, to a smaller extent, of the “can you clap in time with a musical beat” question) are a major limitation of this study, and that future studies in the field should pay particular attention to the design and selection of appropriate tests and questionnaires.

Source

    © 2022 the Reviewer.

References

    E., G. D., L., C. P., Youjia, W., Rachana, N., E., P. L., T., B. C., A., M. M., W., W. L., Fredrik, U., E., B. J., J., C. N., L., G. R. 2023. Exploring the genetics of rhythmic perception and musical engagement in the Vanderbilt Online Musicality Study. Annals of the New York Academy of Sciences.