Content of review 1, reviewed on June 20, 2021

Lensink et al. present the community evaluation of the CAPRI50-CASP14 assembly predictions. Their paper gives an elaborate account of what happened during the Round. Their analyses are comprehensive and very informative. We recommend that the authors address the following points before their work is accepted for publication.

General comments:

  • How are the values classified as acceptable, medium, and high accuracy?

  • An interface residue definition was provided on P. 16, line #59:

“An interface residue is defined as such, when any of its atoms (hydrogen atoms excluded) are located within 10 Å of any of the atoms of the binding partner.”

Then, another interface definition was introduced on P.19, line #29:

“Interface residues of the receptor (R) and ligand (L) components in both the target and predicted models were defined as those whose solvent accessible surface area (ASA) is reduced (by any amount) in the complex relative to that in the individual components”

Why are there two distinct interface definitions to measure the success of the assembly and the predicted interface residues?
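
To make the question concrete, the first (distance-based) definition can be sketched as below. This is only an illustration under stated assumptions, not the assessors' actual procedure: the function name `interface_residues`, the dictionary layout of the chains, and the toy coordinates are all hypothetical, and the choice to exclude hydrogens on both partners is our reading of the quoted sentence.

```python
from math import dist

def interface_residues(receptor, ligand, cutoff=10.0):
    """Distance-based interface residues (hypothetical helper).

    Each chain is a dict: residue id -> list of (element, (x, y, z)).
    A residue is an interface residue if any of its non-hydrogen atoms
    lies within `cutoff` angstroms of any non-hydrogen atom of the partner.
    """
    def heavy(atoms):
        # Drop hydrogens (assumed here to be excluded on both sides).
        return [xyz for element, xyz in atoms if element != "H"]

    rec_hits, lig_hits = set(), set()
    for r_id, r_atoms in receptor.items():
        for l_id, l_atoms in ligand.items():
            if any(dist(a, b) <= cutoff
                   for a in heavy(r_atoms) for b in heavy(l_atoms)):
                rec_hits.add(r_id)
                lig_hits.add(l_id)
    return rec_hits, lig_hits

# Toy coordinates, invented purely for illustration.
receptor = {
    "A1": [("N", (0.0, 0.0, 0.0)), ("H", (1.0, 0.0, 0.0))],
    "A2": [("C", (30.0, 0.0, 0.0))],
}
ligand = {
    "B1": [("C", (8.0, 0.0, 0.0))],
    "B2": [("O", (50.0, 0.0, 0.0))],
}

print(interface_residues(receptor, ligand))
```

The second definition would instead require a solvent accessible surface area calculation on the complex and on the separated components, so the two criteria need not select the same set of residues. This is why a justification for using both would be helpful.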

Minor text corrections:

General comment:
- Numbers less than ten should be written out in word form.

Specific comments:
ABSTRACT
- P.6, line #9: easy to model --> easy
- P.6, line #14: template --> template(s)

THE TARGETS
- P.12, line #48: “The targets are designated by their CAPRI target ID followed by their corresponding CASP target ID.” --> This is the case only for Table 1 (not Figure 1).
- P.12, line #13: “oligomeric state of the protein” --> should this read “protein complex”?
- P.14, line #24-33: T180/T1099 could be linked to Figure 3.
Also, on this page, Figure 4 is cited before Figure 3. That should be revisited.
- P.14, line #33: “… as will further detailed in our analysis ” --> … as will be detailed further in our analysis.

OVERVIEW OF THE PREDICTION EXPERIMENT
- P.14, line #40: “As in in previous...” -> one “in” should be removed.
- P.15, line #49: Why was such a small fraction of the submitted models assessed in the assembly prediction category?

ASSESSMENT METRICS AND PROCEDURES
- P.16, line #40: It would be nice to cite the parameters (L_rms, i_rms, f(nat)) in the order in which they appear in the rest of this section.
- P.16, line #49: Which part of the Supplementary Material is cited here?
- P.17, line #29: There is no such parameter as “N” in Eq. (1).
Also, N is used to express the total number of interfaces considered per target, as well as the overall number of acceptable, etc. models. The authors could consider using different variable names to express those (to avoid confusion).

RESULTS AND DISCUSSION
- Here and later, the targets should be matched with their relevant panels in Figure 1.
- P.20, line #15: The fourth and final part analyzes … --> analyze
- P.20, line #56: The sentence starting with “The homodimer” could be dissected into two sentences.
- P.21, line #29: The sentence starting with “The majority” is confusing to read.
- P.21, line #10: “where available” --> “were available”
- P.23, line #47: “models” --> “model”
- P.23, line #52: “sand scoring” --> appears to be a typo
That sentence should also be rephrased in simpler terms.
- P.23, line #50: “proper” should be removed.
- P.23, line #52: “sever” should be corrected to “server”. Also, on the same line, the sentence starting with “For example” should be revisited for clarity.
- P.27, line #21: The sentence starting with “ Among the ..” --> please rephrase for clarity.
- P.28, line #49: “in term” --> “in terms”
- P.29, line #3: The sentence starting with “But” is too long and should be revisited.
- P.29, line #56: There is no Figure 3z available.
- P.29, line #13: “the most complex” --> “the most difficult complex”
- P.29, line #49: The section starting with “These include” is too long and should be revisited for clarity.
- P.38, line #26: The section starting with “Higher” should be revisited.
- P.41, line #23: There are no panels a and b in Figure 6. Also, it is not clear what is meant by “left” and “right” on line #27.

TABLES AND FIGURES
- Table 1 caption:
“Dimeric and trimeric targets are listed with Easy targets first and then Difficult targets.” --> Not clear, please rephrase

- Figure 5 legend:
Final sentence of this legend should be revisited for clarity.

- Figure 6:
What is the additional information obtained from the box plot version of the data spread?

Source

    © 2021 the Reviewer.

Content of review 2, reviewed on August 05, 2021

I would like to thank the authors for addressing all my comments.

References

    Lensink, M. F., et al. 2021. Prediction of protein assemblies, the next frontier: The CASP14-CAPRI experiment. Proteins: Structure, Function, and Bioinformatics.