Content of review 1, reviewed on March 25, 2021

The article is well written and proposes a DL pipeline for a 3-class classification of patients with AD, patients with MCI, and controls, which are called HA. The article presents the information very clearly and is easy to understand. It provides really interesting results that I am sure will be very useful for the AD research field.

Nevertheless, it also presents some issues that should be considered before publishing:

  • The first, and the most important one, is that assessing the true classification performance of the pipeline requires more metrics: Cohen's Kappa, AUROC, or, preferably, confusion matrices. This is a must, and their absence calls the results into question. Please include the confusion matrices and Cohen's Kappa in the next version of the manuscript to give clearer information about the capability of the algorithm to classify patients.
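
To make the request concrete, the following is a minimal sketch of how a 3-class confusion matrix and Cohen's Kappa could be computed from the true and predicted labels. The function names, the label order and the toy labels are my own assumptions for illustration only; they are not taken from the manuscript.

```python
# Sketch only: labels and example data are hypothetical, not the authors' results.
LABELS = ["HA", "MCI", "AD"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Rows are true classes, columns are predicted classes."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[index[true_label]][index[predicted_label]] += 1
    return matrix


def cohen_kappa(matrix):
    """Cohen's Kappa: agreement between truth and prediction, corrected for chance."""
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the marginal totals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    if expected == 1:
        return float("nan")  # Kappa is undefined when chance agreement is total.
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Toy labels for six hypothetical subjects.
    y_true = ["HA", "HA", "MCI", "MCI", "AD", "AD"]
    y_pred = ["HA", "MCI", "MCI", "MCI", "AD", "HA"]
    cm = confusion_matrix(y_true, y_pred)
    for label, row in zip(LABELS, cm):
        print(label, row)
    print("kappa:", cohen_kappa(cm))  # approximately 0.5 for this toy example
```

Reporting the matrix itself, and not only accuracy, shows which classes are confused with each other (for example, MCI with HA), which a single accuracy value hides.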

Moreover, there are also some minor issues that should be considered. There are many of them, but they are small and can be easily addressed:

  • Page (P) 2, Line (L) 23: It is stated that AD is provoked by plaques and tangles, and I'm not sure about this. You should provide a reference and state that they are related to AD, but that causality has not been demonstrated.
  • P 2 L 40: Gold standard instead of golden standard
  • Third paragraph of the Introduction: It is a bit confusing. Try to reorder it to improve its readability. Besides, I think that "algorithms" should be used instead of "processing techniques".
  • You use the expression "classify the differences" throughout the text. You do not classify differences; you classify subjects or recordings.
  • Section 2 is a bit disorganized. My advice is to rearrange it chronologically, or to clarify the order in which you present the studies you mention.
  • The format of Table 1 is inconsistent. Try to homogenize it.
  • The first paragraph of Section 3 should be rewritten: although it is a summary, it is not very clear.
  • P5 L32: Time series of 587 "seconds"?
  • Why do you use notch filters at 42 and 21 Hz? Is there power-line interference at these frequencies? Is 21 Hz a harmonic of 42 Hz, where the power line operates?
  • MARA classifies the ICA components, to the best of my knowledge. This should be stated.
  • Were subjects without MMSE used or discarded? This should be stated more clearly. Could this issue bias the results?
  • P6 L6: Where do the 340137 samples come from? Please make this explicit.
  • You should indicate how you chose the maximum scale and the type of scale of the CWT representations. The choice was made visually, but based on which criteria?
  • Figure 3 is mentioned before Figure 2.
  • In general, the figures are not placed at the top (or the bottom) of the pages.
  • The last paragraph of Section 3.5 is confusing, and some information is missing. You should state that you use eight of those splits for training, one for validation and one for testing. Where this is stated again in later sections of the article, it should be removed to avoid repetition.
  • P7 L36: Typo: “the model to tune the model”. "Model" is repeated many times; try to use an alternative word.
  • The optimizer used in the DL model should be explained a little bit more or, at least, include references to look for further information.
  • You say that the results from smaller batches have poorer generalization. What exactly do you mean?
  • In the first paragraph of the Discussion you repeat all the methods again. This is repetitive and should be removed or shortened.
  • The Conclusions section should be completely rewritten. It is too long, and it includes limitations (and future lines of work) that should be placed in the previous section, with the other limitations. Furthermore, the 2-class classification you propose (considering MCI non-pathological) is not correct. Consider comparing MCI-AD or MCI-HA instead.

Finally, I would like to repeat that, despite the long list of issues, the paper is well written and the content is appropriate.

Source

    © 2021 the Reviewer.

Content of review 2, reviewed on May 05, 2021

Thank you very much for your reply to all the comments I raised. You have properly addressed all the issues I pointed out, and your responses fully resolve my doubts.

Nevertheless, there are still a couple of minor issues that should be considered:
- The quality of the figures in general, and especially of Figure 6, is really poor. This could be due to the journal's submission system, but it should be fixed before publication. This is probably not your fault, but please take it into account.
- It is noteworthy that the recordings are 10 minutes long. The last epochs are therefore likely to be very different from the first ones, and the influence of this on the results should be assessed, considering it as a potential confounding factor. Assessing this would be very time consuming, but at least it should be stated in the limitations section.
- The educational level (as a proxy of cognitive reserve) is known to be a confounding factor, and the authors have not considered it. I understand that this can no longer be fixed, but it should be explained in the limitations section in more detail than it is now.

Finally, I want to thank the authors for the manuscript. They have done a really good and valuable job.

Source

    © 2021 the Reviewer.

References

    J., H. C., Javier, E., A., P. M., Brian, S., Renato, A., Amanda, V. L. D. A., F., B. L., Daniel, A. 2021. Deep learning of resting-state electroencephalogram signals for three-class classification of Alzheimer's disease, mild cognitive impairment and healthy ageing. Journal of Neural Engineering.