Abstract

This paper examines the structure of scientific collaborations in Berlin, a specific case with a unique history of division and reunification. It aims to identify strategic organizational coalitions in a context of high sectoral diversity. By adopting a global, regional, and organization-based approach, we provide a quantitative, exploratory, and macro view of this diversity. We use publication data from 1996-2017 with at least one contributing organization located in Berlin. We further investigate four members of the Berlin University Alliance (BUA) through their self-represented research profiles, comparing them with empirical results across OECD disciplines. Using a bipartite network modeling framework, we move beyond the uncontested trend towards team science and increasing internationalization. Our results show that BUA members shape the structure of scientific collaborations in the region. However, they do not collaborate cohesively in all disciplines; larger divides exist in some, e.g., Agricultural Sciences and Humanities. Only the Medical and Health Sciences have cohesive intraregional collaborations, which signals the success of the regional cooperation established in 2003. We discuss possible underlying factors shaping the intraregional groupings and potential implications for regions worldwide. A major methodological contribution of this paper is its evaluation of the coverage and accuracy of different organization name disambiguation techniques.


Authors

Aliakbar Akbaritabar

Contributors on Publons
  • 1 author
  • 2 reviewers
  • pre-publication peer review (FINAL ROUND)
    Decision Letter
    2021/04/02

    02-Apr-2021

    Dear Dr. Akbaritabar:

    It is a pleasure to accept your manuscript entitled "A Quantitative View of the Structure of Institutional Scientific Collaborations Using the Example of Berlin" for publication in Quantitative Science Studies.

    I would like to request you to prepare the final version of your manuscript using the checklist available at https://bit.ly/2QW3uV5. Please also sign the publication agreement, which can be downloaded from https://bit.ly/2QYuW4w. The final version of your manuscript, along with the completed checklist and the signed publication agreement, can be returned to qss@issi-society.org.

    Thank you for your contribution. On behalf of the Editors of Quantitative Science Studies, I look forward to your continued contributions to the journal.

    Best wishes,
    Dr. Ludo Waltman
    Editor, Quantitative Science Studies
    qss@issi-society.org

    Author Response
    2021/04/01

    Dear Professor Waltman, QSS Editor in chief,

    Thank you very much for the further suggestions to help improve our manuscript. We appreciate them. We have replied to each point below on a new line starting with a "RESPONSE#" tag. The revised parts of the text are tracked with colored changes (placed after the revised manuscript, at the end of the PDF file) to facilitate the review.

    In addition, since our data is licensed and unfortunately cannot be made available, we provide replication Python code (https://doi.org/10.5281/zenodo.4657325) for the three disambiguation methods and for how they can be parallelized to increase computation speed. The code also includes the bipartite network construction and community detection reported in the manuscript. We hope this helps future researchers adopt the methods proposed in our manuscript.
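
    As a rough illustration of what one of the disambiguation methods can look like, the sketch below matches raw affiliation strings against a canonical name list using stdlib fuzzy matching. The canonical names, threshold, and matching logic here are illustrative assumptions, not the exact procedure in the replication code linked above.

```python
from difflib import SequenceMatcher

# Hypothetical canonical organization names; the actual lists, thresholds,
# and methods are in the replication code on Zenodo.
CANONICAL = [
    "Freie Universitaet Berlin",
    "Humboldt-Universitaet zu Berlin",
    "Technische Universitaet Berlin",
    "Charite - Universitaetsmedizin Berlin",
]

def disambiguate(raw_affiliation, canonical=CANONICAL, threshold=0.6):
    """Return the best-matching canonical name for a raw affiliation
    string, or None when no candidate clears the similarity threshold."""
    raw = raw_affiliation.lower()
    best_name, best_score = None, 0.0
    for name in canonical:
        score = SequenceMatcher(None, raw, name.lower()).ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None
```

    Per-record matching of this kind is embarrassingly parallel, which is why it can be distributed over the affiliation strings, e.g., with Python's multiprocessing.Pool.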

    Best regards,
    Authors

    Editor's Comments to Author:

    Your manuscript can almost be accepted for publication in Quantitative Science Studies, but a few more improvements need to be made:

    1. Please present the contributions of your manuscript in the introduction rather than in the literature review.

    RESPONSE# Contributions are now presented at the end of the introduction section.

    2. The abbreviation 'WOS' needs to be defined.

    RESPONSE# Thanks, corrected.

    3. To make Figure 2 easier to read, please increase the font size used in the figure. Also consider reducing the height of the figure.

    RESPONSE# Thanks. The figure height is decreased and the font sizes are increased (we also double-checked the other figures and increased their font sizes where possible).

    4. The concluding section of your manuscript should be more self-contained. Readers should be able to understand the main conclusions of your work by reading the concluding section, even without having read the preceding sections. In the current version of your manuscript, the concluding section is hard to understand in isolation. The section needs to be rewritten to make it more self-contained.

    RESPONSE# Thanks a lot for this suggestion. After adding sub-sections and changing the order of presentation, this section had become less readable (as you kindly pointed out). We double-checked the discussion, limitations, and conclusion sections and moved some text from earlier parts so that the concluding section is more self-contained and also provides broader suggestions applicable to other regions worldwide (based on the example of the Berlin region).

    5. Because Table 4 has been moved to an appendix, it should be labeled Table A1. In Section 4.3, 'in Table 4 in the Appendix section' should be changed into 'in Table A1 in the Appendix'.

    RESPONSE# Corrected, both in the text and the appendix.

    6. The Acknowledgements, Funding Information, and Data Availability sections should not be numbered. The appendix should not be numbered either.

    RESPONSE# Corrected.

  • pre-publication peer review (ROUND 3)
    Decision Letter
    2021/03/28

    28-Mar-2021

    Dear Dr. Akbaritabar:

    Thank you for revising your manuscript QSS-2020-0065.R2 entitled "A Quantitative View of the Structure of Institutional Scientific Collaborations Using the Example of Berlin" submitted to Quantitative Science Studies.

    Your manuscript can almost be accepted for publication in Quantitative Science Studies, but a few more improvements need to be made:

    1. Please present the contributions of your manuscript in the introduction rather than in the literature review.

    2. The abbreviation 'WOS' needs to be defined.

    3. To make Figure 2 easier to read, please increase the font size used in the figure. Also consider reducing the height of the figure.

    4. The concluding section of your manuscript should be more self-contained. Readers should be able to understand the main conclusions of your work by reading the concluding section, even without having read the preceding sections. In the current version of your manuscript, the concluding section is hard to understand in isolation. The section needs to be rewritten to make it more self-contained.

    5. Because Table 4 has been moved to an appendix, it should be labeled Table A1. In Section 4.3, 'in Table 4 in the Appendix section' should be changed into 'in Table A1 in the Appendix'.

    6. The Acknowledgements, Funding Information, and Data Availability sections should not be numbered. The appendix should not be numbered either.

    To revise your manuscript, log into https://mc.manuscriptcentral.com/qss and enter your Author Center, where you will find your manuscript title listed under "Manuscripts with Decisions." Under "Actions," click on "Create a Revision." Your manuscript number has been appended to denote a revision.

    You may also click the below link to start the revision process (or continue the process if you have already started your revision) for your manuscript. If you use the below link you will not be required to login to ScholarOne Manuscripts.

    PLEASE NOTE: This is a two-step process. After clicking on the link, you will be directed to a webpage to confirm.

    https://mc.manuscriptcentral.com/qss?URL_MASK=d176a87785ee4f0993ad569409ada4e9

    You will be unable to make your revisions on the originally submitted version of the manuscript. Instead, revise your manuscript using a word processing program and save it on your computer. Please also highlight the changes to your manuscript within the document by using the track changes mode in MS Word or by using bold or colored text.

    Once the revised manuscript is prepared, you can upload it and submit it through your Author Center.

    When submitting your revised manuscript, you will be able to respond to the comments made by the reviewers in the space provided. You can use this space to document any changes you make to the original manuscript. In order to expedite the processing of the revised manuscript, please be as specific as possible in your response to the reviewers.

    IMPORTANT: Your original files are available to you when you upload your revised manuscript. Please delete any redundant files before completing the submission.

    If possible, please try to submit your revised manuscript by 27-May-2021. Let me know if you need more time to revise your work.

    Once again, thank you for submitting your manuscript to Quantitative Science Studies and I look forward to receiving your revision.

    Best wishes,
    Dr. Ludo Waltman
    Editor, Quantitative Science Studies
    qss@issi-society.org

    Author Response
    2021/03/26

    Dear QSS Editor and reviewers,

    Thank you very much for the further suggestions to help improve our manuscript. We appreciate them. We have replied to each point below on a new line starting with a "RESPONSE#" tag. The revised parts of the text are tracked with colored changes (placed after the revised manuscript, at the end of the PDF file) to facilitate the review.

    Best regards,
    Authors

    Editor's Comments to Author:

    Both reviewers are satisfied with the revision of your manuscript. The reviewers still have a few minor comments and suggestions, mainly related to the tables and figures in your manuscript. I would like to invite you to prepare a second revised version of your manuscript in which the remaining comments of the reviewers are taken into consideration. In addition, I have a few small comments myself that I would like to ask you to address:

    1. The research questions are presented in the literature review section. Make sure to present them in a separate section, not as part of the literature review. If possible, please present the research questions before rather than after the literature review.

    RESPONSE# The research questions are now moved to the end of the introduction section and the text is revised accordingly. The literature review section now finishes with the main contributions of the study.

    2. The concluding section is quite lengthy. To make your manuscript easier to read, please consider splitting this section into subsections.

    RESPONSE# The concluding section is restructured into sub-sections.

    3. Please use section numbers instead of section titles in the paragraph starting with “The structure of the paper is as follows”.

    RESPONSE# Thanks. Corrected.

    4. In the context of organization name disambiguation, I wonder whether the following paper published recently in Quantitative Science Studies could be of relevance: https://doi.org/10.1162/qss_a_00031.

    RESPONSE# Thank you for introducing this article. We found it relevant and cited it in our text. Our study was initiated using both Web of Science and Scopus in a comparative research design. Unfortunately, due to the modifications WOS applies to organization affiliations and addresses, we were not able to develop robust and sound disambiguation techniques to improve WOS data quality (many of the issues documented in the text were observed in the WOS data as well). We thus limited our study to Scopus, which provides the affiliation addresses in the same format delivered by publishers and makes improving the disambiguation easier. Nevertheless, as declared in our limitations (and in the article you kindly introduced), relying on a single bibliometric database is problematic, and hence we draw our conclusions cautiously.

    5. “For hard sciences, e.g., particle physics, no scientific discovery is imaginable today without multi-organizational scientific collaborations”: This seems an exaggeration.

    RESPONSE# Rephrased.

    Reviewers' Comments to Author:

    Reviewer: 1

    Comments to the Author
    The author has taken my feedback in revising the paper. The revised paper has improved its clarity and presentation.

    RESPONSE# Thanks a lot.

    Minor comments:
    1. Figure 4 does not add much value to the paper. It can be removed.

    RESPONSE# With all due respect, we think that was due to our poor choice of color scheme in this figure. We followed the other reviewer's advice and completely redid the figure. The color scheme and legend, and how countries are mapped, are changed to make the trends clearer. The text is revised accordingly.

    2. Table 4 may be more appropriate to be placed in the appendix.

    RESPONSE# Table 4 and its description are moved to the appendix.

    Reviewer: 2

    Comments to the Author
    The manuscript has been substantially improved. I have only minor suggestions for improving the manuscript:

    RESPONSE# Thanks a lot.

    1) Some lines in Figure 2 have a very similar color (AS and SS as well as MHS and NS) so that it is virtually impossible to distinguish them. Different line colors or plotting symbols might help.

    RESPONSE# Thanks. The figure is revised so that line types represent the field and count types, and colors distinguish the two lines for each field.

    2) The color scale of Figure 4 is very compressed. Is it possible to expand the size of the scale? There is more than enough space. Some countries are missing in Figure 4, especially many countries are missing in the facet "no collaborator". I suppose that these missing countries have zero counts. However, the scale does contain zero. In any case, missing countries should be drawn in Figure 4 to facilitate easier recognition of the colored countries.

    RESPONSE# Thanks for the constructive suggestions. This figure is completely redone. The color scheme is reversed and all countries are now mapped to make trends clearer. Countries without a representative organization in a specific sector are mapped as gray to highlight the sector-based collaboration trends described in the text.

    3) The URLs for Stahlschmidt, Stephen, & Hinze (2019) and Stephen, Stahlschmidt, & Hinze (2020) do not work:

    Stahlschmidt, S., Stephen, D., & Hinze, S. (2019). Performance and Structures of the German Science System (p. 91). Studien zum deutschen Innovationssystem. https://www.e-fi.de/daten-und-informationen/indikatorenstudien/2019/
    Stephen, D., Stahlschmidt, S., & Hinze, S. (2020). Performance and Structures of the German Science System 2020. Studien zum deutschen Innovationssystem. https://www.e-fi.de/fileadmin/Innovationsstudien_2020/StuDIS_05_2020.pdf

    I receive page not found errors when I try to access these URLs.

    RESPONSE# We are sorry for the problem with the added URLs. It was due to recent changes on the EFI website (https://www.e-fi.de/publikationen/studien), which no longer uses the previous URLs. We double-checked and added the new URLs (copied below).

    Stahlschmidt, S., Stephen, D., & Hinze, S. (2019). Performance and Structures of the German Science System (p. 91). Studien zum deutschen Innovationssystem. https://www.e-fi.de/fileadmin/Assets/Studien/2019/StuDIS_05_2019.pdf

    Stephen, D., Stahlschmidt, S., & Hinze, S. (2020). Performance and Structures of the German Science System 2020. Studien zum deutschen Innovationssystem. https://www.e-fi.de/fileadmin/Assets/Studien/2020/StuDIS_05_2020.pdf

  • pre-publication peer review (ROUND 2)
    Decision Letter
    2021/03/23

    23-Mar-2021

    Dear Dr. Akbaritabar:

    Your manuscript QSS-2020-0065.R1 entitled "A Quantitative View of the Structure of Institutional Scientific Collaborations Using the Example of Berlin", which you submitted to Quantitative Science Studies, has been reviewed. The comments of the reviewers are included at the bottom of this letter.

    Both reviewers are satisfied with the revision of your manuscript. The reviewers still have a few minor comments and suggestions, mainly related to the tables and figures in your manuscript. I would like to invite you to prepare a second revised version of your manuscript in which the remaining comments of the reviewers are taken into consideration. In addition, I have a few small comments myself that I would like to ask you to address:

    1. The research questions are presented in the literature review section. Make sure to present them in a separate section, not as part of the literature review. If possible, please present the research questions before rather than after the literature review.

    2. The concluding section is quite lengthy. To make your manuscript easier to read, please consider splitting this section into subsections.

    3. Please use section numbers instead of section titles in the paragraph starting with “The structure of the paper is as follows”.

    4. In the context of organization name disambiguation, I wonder whether the following paper published recently in Quantitative Science Studies could be of relevance: https://doi.org/10.1162/qss_a_00031.

    5. “For hard sciences, e.g., particle physics, no scientific discovery is imaginable today without multi-organizational scientific collaborations”: This seems an exaggeration.

    To revise your manuscript, log into https://mc.manuscriptcentral.com/qss and enter your Author Center, where you will find your manuscript title listed under "Manuscripts with Decisions." Under "Actions," click on "Create a Revision." Your manuscript number has been appended to denote a revision.

    You may also click the below link to start the revision process (or continue the process if you have already started your revision) for your manuscript. If you use the below link you will not be required to login to ScholarOne Manuscripts.

    PLEASE NOTE: This is a two-step process. After clicking on the link, you will be directed to a webpage to confirm.

    https://mc.manuscriptcentral.com/qss?URL_MASK=f6c565e747364fcba7664ad086cba819

    You will be unable to make your revisions on the originally submitted version of the manuscript. Instead, revise your manuscript using a word processing program and save it on your computer. Please also highlight the changes to your manuscript within the document by using the track changes mode in MS Word or by using bold or colored text.

    Once the revised manuscript is prepared, you can upload it and submit it through your Author Center.

    When submitting your revised manuscript, you will be able to respond to the comments made by the reviewers in the space provided. You can use this space to document any changes you make to the original manuscript. In order to expedite the processing of the revised manuscript, please be as specific as possible in your response to the reviewers.

    IMPORTANT: Your original files are available to you when you upload your revised manuscript. Please delete any redundant files before completing the submission.

    If possible, please try to submit your revised manuscript by 22-May-2021. Let me know if you need more time to revise your work.

    Once again, thank you for submitting your manuscript to Quantitative Science Studies and I look forward to receiving your revision.

    Best wishes,
    Dr. Ludo Waltman
    Editor, Quantitative Science Studies
    qss@issi-society.org

    Reviewers' Comments to Author:

    Reviewer: 1

    Comments to the Author
    The author has taken my feedback in revising the paper. The revised paper has improved its clarity and presentation.

    Minor comments:
    1. Figure 4 does not add much value to the paper. It can be removed.
    2. Table 4 may be more appropriate to be placed in the appendix.

    Reviewer: 2

    Comments to the Author
    The manuscript has been substantially improved. I have only minor suggestions for improving the manuscript:

    1) Some lines in Figure 2 have a very similar color (AS and SS as well as MHS and NS) so that it is virtually impossible to distinguish them. Different line colors or plotting symbols might help.

    2) The color scale of Figure 4 is very compressed. Is it possible to expand the size of the scale? There is more than enough space. Some countries are missing in Figure 4, especially many countries are missing in the facet "no collaborator". I suppose that these missing countries have zero counts. However, the scale does contain zero. In any case, missing countries should be drawn in Figure 4 to facilitate easier recognition of the colored countries.

    3) The URLs for Stahlschmidt, Stephen, & Hinze (2019) and Stephen, Stahlschmidt, & Hinze (2020) do not work:

    Stahlschmidt, S., Stephen, D., & Hinze, S. (2019). Performance and Structures of the German Science System (p. 91). Studien zum deutschen Innovationssystem. https://www.e-fi.de/daten-und-informationen/indikatorenstudien/2019/
    Stephen, D., Stahlschmidt, S., & Hinze, S. (2020). Performance and Structures of the German Science System 2020. Studien zum deutschen Innovationssystem. https://www.e-fi.de/fileadmin/Innovationsstudien_2020/StuDIS_05_2020.pdf

    I receive page not found errors when I try to access these URLs.

    Reviewer report
    2021/03/22

    The manuscript has been substantially improved. I have only minor suggestions for improving the manuscript:

    1) Some lines in Figure 2 have a very similar color (AS and SS as well as MHS and NS) so that it is virtually impossible to distinguish them. Different line colors or plotting symbols might help.

    2) The color scale of Figure 4 is very compressed. Is it possible to expand the size of the scale? There is more than enough space. Some countries are missing in Figure 4, especially many countries are missing in the facet "no collaborator". I suppose that these missing countries have zero counts. However, the scale does contain zero. In any case, missing countries should be drawn in Figure 4 to facilitate easier recognition of the colored countries.

    3) The URLs for Stahlschmidt, Stephen, & Hinze (2019) and Stephen, Stahlschmidt, & Hinze (2020) do not work:

    Stahlschmidt, S., Stephen, D., & Hinze, S. (2019). Performance and Structures of the German Science System (p. 91). Studien zum deutschen Innovationssystem. https://www.e-fi.de/daten-und-informationen/indikatorenstudien/2019/
    Stephen, D., Stahlschmidt, S., & Hinze, S. (2020). Performance and Structures of the German Science System 2020. Studien zum deutschen Innovationssystem. https://www.e-fi.de/fileadmin/Innovationsstudien_2020/StuDIS_05_2020.pdf

    I receive page not found errors when I try to access these URLs.

    Reviewer report
    2021/02/26

    The author has taken my feedback in revising the paper. The revised paper has improved its clarity and presentation.

    Minor comments:
    1. Figure 4 does not add much value to the paper. It can be removed.
    2. Table 4 may be more appropriate to be placed in the appendix.

    Author Response
    2021/02/03

    Dear QSS Editor and reviewers,

    Thank you very much for the constructive feedback on our manuscript. We have replied to each point below on a new line starting with a "RESPONSE#" tag. We think the revised manuscript has improved thanks to your suggestions. The revised parts of the text are tracked with colored changes to facilitate the review.

    Best regards,
    Authors

    Reviewers' Comments to Author:

    Reviewer: 1

    Comments to the Author

    This is a confusing paper. Methods used in this paper seem to be valid but my major critique is that the author tried to analyze the collaborations among three Berlin universities, but when constructing the collaboration networks, the three universities are actually blended into the whole networks with thousands of nodes. Thus, the research objectives are not neatly matched with the actual methods and implementations.

    RESPONSE# We are sorry for the confusion caused. We asked a colleague who is a native English speaker to further check our narrative for clarity and applied their suggestions. The main research objective was to study the structure of scientific collaborations in the Berlin metropolitan region and to empirically identify potential strategic coalitions. It was not focused only on the three Berlin universities plus one university hospital (i.e., the four Berlin University Alliance (BUA) members), but also on the other scientific and academic institutions in the whole region; we wanted to map the scientific landscape of the region. BUA was one major pre-existing coalition for which we wanted to see whether it is as powerful as its members present it to be and whether it plays a noticeable role in the region's science landscape. Thus, in the constructed networks we kept all collaborating organizations in the picture to see whether BUA has played a differentiating role in this landscape, which turned out not to be the case. If we were to include only these three universities, as the reviewer kindly suggests and previous studies have done (unfortunately, the only cases we found were contract research, not publicly published to be cited in the text), the picture would have been a highly cohesive collaboration structure. However, when a proper contextual comparison is made by placing the said coalition into its regional context, as in the one case cited in the text (Abbasiharofteh, M., & Broekel, T. (2020). Still in the shadow of the wall? The case of the Berlin biotechnology cluster. Environment and Planning A: Economy and Space. https://doi.org/10.1177/0308518X20933904), and when we investigate the composition of collaborating organizations involved in each single publication (over the years), the collaboration structure is not as cohesive as the coalition has hoped to achieve (at least not yet), and there are smaller organizations active in the region that sometimes have more internationalized collaborations than the major players (e.g., BUA members).

    The author has a solid mastery of key bibliometric skills, but it is disappointing to see that the results are quite weak in addressing the proposed hypotheses. Nearly the entire analysis feels like a network analysis of 5000 or so universities in a collaboration network. I don't think the cluster-based methods are effective for the purpose of this paper. If your focus is the three universities only, what value does clustering add to the results, since most of the clusters do not contain any of the three universities? An intuitive approach may be comparing the shared collaborators between BUA members in different disciplines, and maybe examining the changes over the years.

    RESPONSE# Regarding the first part of this comment, we responded in the previous point that our focus was broader than the four BUA members; that is why we present the sector and geographical composition of all of the collaborations. We are afraid that applying the second part of the comment would further confuse the reader about our study's goal. We followed the reviewer's other kind comments, revised and reduced the text extensively, and thereby tried to increase the clarity of our narrative. We hope it is now clearer why we chose the implemented methods and how clustering helps answer our research question on mapping the scientific landscape of the region. Based on the evidence provided below in response to another comment, we believe the cluster-based method is one of the most accurate strategies to identify potential groupings in the regional context, stemming from the composition of collaborators in individual publications, a type of result that an aggregate view cannot provide.

    The paper is too long with lots of unnecessary details. For instance, on page 2, the author spent more than half page in describing someone else's work. There needs to be a literature review section separate from the introduction section. Literature should be synthesized. Some non-essential descriptions of the data, method, and results can be moved to the appendix.

    RESPONSE# We removed the said paragraph and added a "Literature review" section title. We reduced the details presented about prior research.

    The first RQ is about a process evaluation and should not be considered as a RQ.

    RESPONSE# RQ0 is removed.

    Why OECD disciplines? Any justifications?

    RESPONSE# The ASJC subject categories delivered by Scopus are too many to be interpretable (i.e., 33 categories); they are not as informative as we want them to be, and they are assigned at the journal level, sometimes covering overlapping or fairly close subjects. Thus, we used the reduced and more meaningful categories based on the OECD fields instead, following OECD. (2007). Revised Field of Science and Technology (FOS) classification in the Frascati Manual. https://www.oecd.org/science/inno/38235147.pdf. A citation to this source was missing and is now added to the text.
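
    To make the reduction concrete, here is a minimal sketch of such a crosswalk. The ASJC codes shown are real Scopus top-level codes, but the specific assignments to OECD fields are simplified illustrations; the authoritative mapping is the cited OECD document.

```python
# Illustrative partial crosswalk from Scopus ASJC top-level codes to OECD
# (Frascati) fields; the full mapping follows the OECD (2007) document and
# covers all categories, some of which split less cleanly than shown here.
ASJC_TO_OECD = {
    1100: "Agricultural Sciences",        # Agricultural and Biological Sciences
    1200: "Humanities",                   # Arts and Humanities
    2200: "Engineering and Technology",   # Engineering
    2700: "Medical and Health Sciences",  # Medicine
    3100: "Natural Sciences",             # Physics and Astronomy
    3300: "Social Sciences",              # Social Sciences
}

def oecd_fields(asjc_codes):
    """Collapse a publication's ASJC codes into its set of OECD fields,
    ignoring codes outside this partial mapping."""
    return {ASJC_TO_OECD[c] for c in asjc_codes if c in ASJC_TO_OECD}
```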

    Figure 4 looks pretty but I do not see much value in it. It is not surprising that US, China, France, Germany and GB are major collaborators of BUA, but they are also the key collaborators of perhaps any major university in the West.

    RESPONSE# In Figure 4, our aim was to present the aggregate and disciplinary differences in the top 5% of countries collaborating with the Berlin region. This figure and its description in the text are now removed.

    Table 3 is interesting but a percentage should be noted for clear comparisons.

    RESPONSE# Thanks, percentages are added to Table 3.

    The color scheme used in Figure 5 made it hard to read. Figure 6 is useless. Nothing can be told from this figure. It is also unclear what messages Figure 7 and Figure 8 are trying to convey. The alluvial chart simply had too many variables for any meaningful patterns to emerge.

    RESPONSE# The color scheme of Figure 5 is changed, and regions without any collaborating organization (previously excluded) are added in a separate panel, on the advice of the other reviewer, to highlight the trends and the global coverage of the region's collaborators. Figure 6 is removed. Figure 7 presents the most important finding of the paper, showing the divided and scattered structure of scientific collaborations in the region; we changed its color scheme to highlight the trends, and its description is further shortened and clarified in the revised text (please see the tracked changes and colored file for the revisions). Figure 8 is removed.

    I do not think Tables 4-10 are reliable because when a different clustering method is applied, the cluster membership may be shifted.

    RESPONSE# That is indeed true. To ensure replicability of our results when using the same algorithm (i.e., CPM bipartite), we use a fixed seed. Furthermore, we changed our previous approach and set a single resolution parameter (6e-03) for all the scientific fields and their aggregate, to show that the divide we observe and report in the text is an attribute of the region's collaborations and not a byproduct of the algorithm used. In addition, the chosen clustering method is one of the few that operate on bipartite networks and is available to researchers in the form of a well-maintained library. A new version of the label propagation method was introduced in mid-2020 (Taguchi, H., Murata, T., & Liu, X. (2020). BiMLPA: Community Detection in Bipartite Networks by Multi-Label Propagation. In N. Masuda, K.-I. Goh, T. Jia, J. Yamanoi, & H. Sayama (Eds.), Proceedings of NetSci-X 2020: Sixth International Winter School and Conference on Network Science (pp. 17–31). Springer International Publishing. https://doi.org/10.1007/978-3-030-38965-9_2), and there is a bipartite version of Infomap; both were not as accessible to other researchers (e.g., some algorithms are implemented in C++, and our technical skills in R and Python unfortunately do not allow us to use those implementations). In our limitations at the end of the conclusions, we added a note that algorithms available for comparing the results are scarce and that new developments can enable future research. We sincerely think, as stated in the text, that a one-mode projection of the studied network, although it provides access to many implemented community detection (clustering) methods, would give a biased picture of the collaboration structure (a highly cohesive structure that is simply not accurate when a finer granularity of detail is considered). Regarding the tables of clustering results, we integrated all separate tables into one, adding background colors and font changes for more clarity, and unified the description in the text to clarify the differences between disciplines.

    Overall, I think the results are not effective to address the research questions. Figures and tables were not informative and they did not reveal much insights.

    RESPONSE# Based on the explanations given in the previous points, we respectfully think this conclusion stems from an unclear narrative in the text, which may also have shaped the reviewer's interpretation of our research goals. We focused on the Berlin metropolitan region, trying to identify the presence of strategic coalitions and to compare them with pre-existing and known coalitions (e.g., BUA). We did not intend to evaluate only the intra-BUA collaborations and their evolution over time (that is what the BUA proposal, cited in the text, has presented). We hope that the extensive revisions applied to the text further clarify our research goals and our study's contributions to the literature (please see the tracked-changes and colored file for the revisions).

    Reviewer: 2

    Comments to the Author

    The manuscript "A Quantitative View of the Structure of Institutional Scientific Collaborations Using the Example of Berlin" presents a collaboration analysis of the research institutions in the Berlin metropolitan area. A new organization disambiguation technique is employed. Eight research questions are posed at the end of the introduction and answered during presentation of the results. Overall, the manuscript should be of interest to the readership of QSS.

    The author refers to the newly developed disambiguation techniques as "PyString" and "Fuzzy matching". This is prone to confusion. There is a python library called pystring (https://github.com/imageworks/pystring), and the term fuzzy matching is used to refer generally to a method that provides an improved ability to process word-based matching queries to find matching phrases or sentences from a database. The author might have used the library pystring and fuzzy matching methodology, but the methods in the manuscript should use different terms.

    RESPONSE# Thanks for pointing out this library. We renamed "PyString" to "OrgNameString" and "Fuzzy" to "OrgNameFuzzy". We hope the terms are clearer now.

    I might have some trouble understanding some details. Maybe a clearer description might help. Table 1 shows 66 connected components using PyString. Does this mean that 66 unique institutions active in research and located in Berlin have been found? That sounds like quite a lot, unless many companies are included. Maybe some comments can clarify this.

    RESPONSE# The description in the text has been revised to clarify this further. No, those connected components are not individual or unique organizations; they are groups of organizations that have co-authorship ties to each other and are disconnected from the other organizations in the network. The total count of these organizations is presented in the 7th row of Table 1, "N. of organizations", which for PyString (now renamed OrgNameString) is 5,244; this is now further highlighted by bold font and background color in the revised table.

    I do not see a clear quality assessment of the employed disambiguation technique. Table 1 shows (I guess) that the number of organizations in the Berlin area reduces from 10,269 over 159 and 100 to 66 from non-disambiguated institutions over Fuzzy and ROR to PyString. This, however, does not provide any indication that the correct institutional name variants are grouped together. Results on some check should be provided.

    RESPONSE# Thanks for this suggestion. We added the results of two manual validations (two samples, each with 100 randomly chosen cases) to the methods section. In addition, the information at the bottom of Figure 1, showing how many unique organizations are disambiguated by one method but cannot be disambiguated by the other two, previously on page 9 (lines 34-36, copied below), is now separated into a new paragraph to further clarify this: "Each technique successfully disambiguates a set of unique organization names which other techniques are unable to disambiguate (OrgNameString 1,206, OrgNameFuzzy 8,198 and ROR 8,449)."
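    As a minimal, hypothetical illustration of the kind of name-variant grouping such a validation checks, a greedy single-link fuzzy grouping can be sketched with only Python's standard library (the names, the 0.85 threshold, and the helper function are ours, not the actual OrgNameFuzzy pipeline):

```python
from difflib import SequenceMatcher

def similar(a, b, threshold=0.85):
    """Normalized string similarity; a stand-in for a fuzzy-matching step."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

names = [
    "Humboldt-Universitat zu Berlin",
    "Humboldt Universitat zu Berlin",
    "Freie Universitat Berlin",
]

# Greedy single-link grouping: each name joins the first group whose
# representative it resembles, otherwise it starts a new group.
groups = []
for name in names:
    for group in groups:
        if similar(name, group[0]):
            group.append(name)
            break
    else:
        groups.append([name])

print(groups)  # the two Humboldt variants end up in one group
```

    A manual validation of the kind described above would then sample such groups and check by hand that no distinct organizations were merged and no variants were missed.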

    I think I have similar trouble understanding Figure 4 and similar other figures. As far as I understand from the text, each circle corresponds to a cluster that comprises institutions that are active in research. Are there more than 1000 German institutions that co-authored more than 100,000 papers? I guess that not each institution features a co-author in every one of these papers. Some more detailed comments about the data might help to facilitate understanding.

    RESPONSE# We first intended to revise the description in the text to clarify that these numbers are the aggregate number of publications by organizations in the specific country (Figure 4) or community (Figure 7), but, following the advice of the other reviewer, we removed Figure 4 to reduce the word count of the paper. The description of the former Figure 7 (current Figure 5) has been revised thoroughly (please see the tracked-changes and colored file for the revisions).

    The color code could be chosen better in Figure 5. Most of the countries have the same color. Maybe a color gradient would produce better graphics than a fixed set of five colors. Many countries disappeared (I guess) because no papers were found from them or they might not have a collaboration with a Berlin area institution. Such countries should still be visible with a white background and black borders.

    RESPONSE# Thanks, the colors in Figure 5 have been revised. Countries without collaboration with the Berlin region (previously excluded) are now presented in a separate panel to highlight the coverage and diversity of the region's collaborators worldwide.

    The labels in Figure 6 are barely readable in the manuscript version. In a journal print version, I see no chance of reading such small labels without a magnifying glass. Furthermore, the label 2452 does not seem to be informative to me.

    RESPONSE# Figure 6 has been removed at the suggestion of the other reviewer.

    More comparisons to previous studies of such kind might be interesting. I can only find rather general comments when comparing to previous studies.

    RESPONSE# The comparisons to the results of the reviewed literature have been revised in the text. To the best of our knowledge (and after consulting more experienced researchers of the region, whose help we gratefully acknowledge in the text), we were only able to find contracted research that unfortunately was not publicly published and thus cannot be cited. The previously cited studies with a focus on the Berlin region are further highlighted in the revised text.

    One of the conclusions picked up in the abstract is: "Our results show that BUA members shape the structure of scientific collaborations in the region. However, they are not collaborating cohesively in all disciplines. Larger divides exist in some disciplines e.g., Agricultural Sciences and Humanities." I wonder if this is more a regional or a field effect. Would such larger divides also show up when comparing research collaboration of other metropolitan areas, e.g., Cologne?

    RESPONSE# With the current methods and results we cannot say for sure whether this is a field effect or a specificity of the region; that would require a study with Germany or the world as the sample. We know that the region has a history of divide, competition and coalition, which is evident in our results, but disentangling these from field effects would require data on the intentions and motivations of the organizations and researchers forming ties, which is among the limitations previously stated (and currently kept) at the end of the manuscript.

    URLs should be provided for some references, e.g., Stahlschmidt, Stephen, & Hinze (2019) and Stephen, Stahlschmidt, & Hinze (2020). Reference Stahlschmidt, Stephen, & Hinze (2019) points to page 91. The version I found via a Google Scholar search has only 87 pages.

    RESPONSE# Thanks, URLs have now been added to these references.

  • pre-publication peer review (ROUND 1)
    Decision Letter
    2020/10/11

    11-Oct-2020

    Dear Dr. Akbaritabar:

    Your manuscript QSS-2020-0065 entitled "A Quantitative View of the Structure of Institutional Scientific Collaborations Using the Example of Berlin", which you submitted to Quantitative Science Studies, has been reviewed. The comments of the reviewers are included at the bottom of this letter.

    Based on the comments of the reviewers as well as my own reading of your manuscript, my editorial decision is to invite you to prepare a major revision of your manuscript. I need to emphasize that revising your manuscript does not guarantee that your work will eventually be accepted for publication in Quantitative Science Studies. This depends on the outcome of the revision. Please note that reviewer 1 is rather critical about your work. It is essential to carefully address the concerns of this reviewer.

    To revise your manuscript, log into https://mc.manuscriptcentral.com/qss and enter your Author Center, where you will find your manuscript title listed under "Manuscripts with Decisions." Under "Actions," click on "Create a Revision." Your manuscript number has been appended to denote a revision.

    You may also click the below link to start the revision process (or continue the process if you have already started your revision) for your manuscript. If you use the below link you will not be required to login to ScholarOne Manuscripts.

    PLEASE NOTE: This is a two-step process. After clicking on the link, you will be directed to a webpage to confirm.

    https://mc.manuscriptcentral.com/qss?URL_MASK=b6ba8a4674894aa79879828008d952d3

    You will be unable to make your revisions on the originally submitted version of the manuscript. Instead, revise your manuscript using a word processing program and save it on your computer. Please also highlight the changes to your manuscript within the document by using the track changes mode in MS Word or by using bold or colored text.

    Once the revised manuscript is prepared, you can upload it and submit it through your Author Center.

    When submitting your revised manuscript, you will be able to respond to the comments made by the reviewers in the space provided. You can use this space to document any changes you make to the original manuscript. In order to expedite the processing of the revised manuscript, please be as specific as possible in your response to the reviewers.

    IMPORTANT: Your original files are available to you when you upload your revised manuscript. Please delete any redundant files before completing the submission.

    If possible, please try to submit your revised manuscript by 08-Feb-2021. Let me know if you need more time to revise your work.

    Once again, thank you for submitting your manuscript to Quantitative Science Studies and I look forward to receiving your revision.

    Best wishes,
    Dr. Ludo Waltman
    Editor, Quantitative Science Studies
    qss@issi-society.org

    Reviewers' Comments to Author:

    Reviewer: 1

    Comments to the Author
    This is a confusing paper. Methods used in this paper seem to be valid but my major critique is that the author tried to analyze the collaborations among three Berlin universities, but when constructing the collaboration networks, the three universities are actually blended into the whole networks with thousands of nodes. Thus, the research objectives are not neatly matched with the actual methods and implementations.

    The author has a solid mastery of key bibliometric skills, but it is disappointing to see that the results are quite weak in addressing the proposed hypotheses. Nearly the entire analysis feels like a network analysis of 5,000 or so universities in a collaboration network. I don't think the cluster-based methods are effective for the purpose of this paper. If your focus is the three universities only, what value does clustering add to the results, since most of the clusters do not contain any of the three universities? An intuitive approach may be comparing the shared collaborators between BUA members in different disciplines, and maybe seeing the changes over the years.

    The paper is too long, with lots of unnecessary details. For instance, on page 2, the author spent more than half a page describing someone else's work. There needs to be a literature review section separate from the introduction section. Literature should be synthesized. Some non-essential descriptions of the data, method, and results can be moved to the appendix.

    The first RQ is about a process evaluation and should not be considered an RQ.

    Why OECD disciplines? Any justifications?

    Figure 4 looks pretty but I do not see much value in it. It is not surprising that US, China, France, Germany and GB are major collaborators of BUA, but they are also the key collaborators of perhaps any major university in the West.

    Table 3 is interesting but a percentage should be noted for clear comparisons.

    The color scheme used in Figure 5 made it hard to read. Figure 6 is useless. Nothing can be told from this figure. It is also unclear what messages Figure 7 and Figure 8 are trying to convey. The alluvial chart simply had too many variables for any meaningful patterns to emerge.

    I do not think Tables 4-10 are reliable because when a different clustering method is applied, the cluster membership may be shifted.

    Overall, I think the results are not effective to address the research questions. Figures and tables were not informative and they did not reveal much insights.

    Reviewer: 2

    Comments to the Author
    The manuscript "A Quantitative View of the Structure of Institutional Scientific Collaborations Using the Example of Berlin" presents a collaboration analysis of the research institutions in the Berlin metropolitan area. A new organization disambiguation technique is employed. Eight research questions are posed at the end of the introduction and answered during presentation of the results. Overall, the manuscript should be of interest to the readership of QSS.

    The author refers to the newly developed disambiguation techniques as "PyString" and "Fuzzy matching". This is prone to confusion. There is a python library called pystring (https://github.com/imageworks/pystring), and the term fuzzy matching is used to refer generally to a method that provides an improved ability to process word-based matching queries to find matching phrases or sentences from a database. The author might have used the library pystring and fuzzy matching methodology, but the methods in the manuscript should use different terms.

    I might have some trouble understanding some details. Maybe a clearer description might help. Table 1 shows 66 connected components using PyString. Does this mean that 66 unique institutions active in research and located in Berlin have been found? That sounds like quite a lot, unless many companies are included. Maybe some comments can clarify this.

    I do not see a clear quality assessment of the employed disambiguation technique. Table 1 shows (I guess) that the number of organizations in the Berlin area reduces from 10,269 over 159 and 100 to 66 from non-disambiguated institutions over Fuzzy and ROR to PyString. This, however, does not provide any indication that the correct institutional name variants are grouped together. Results on some check should be provided.

    I think I have similar trouble understanding Figure 4 and similar other figures. As far as I understand from the text, each circle corresponds to a cluster that comprises institutions that are active in research. Are there more than 1000 German institutions that co-authored more than 100,000 papers? I guess that not each institution features a co-author in every one of these papers. Some more detailed comments about the data might help to facilitate understanding.

    The color code could be chosen better in Figure 5. Most of the countries have the same color. Maybe a color gradient would produce better graphics than a fixed set of five colors. Many countries disappeared (I guess) because no papers were found from them or they might not have a collaboration with a Berlin area institution. Such countries should still be visible with a white background and black borders.

    The labels in Figure 6 are barely readable in the manuscript version. In a journal print version, I see no chance of reading such small labels without a magnifying glass. Furthermore, the label 2452 does not seem to be informative to me.

    More comparisons to previous studies of such kind might be interesting. I can only find rather general comments when comparing to previous studies.

    One of the conclusions picked up in the abstract is: "Our results show that BUA members shape the structure of scientific collaborations in the region. However, they are not collaborating cohesively in all disciplines. Larger divides exist in some disciplines e.g., Agricultural Sciences and Humanities." I wonder if this is more a regional or a field effect. Would such larger divides also show up when comparing research collaboration of other metropolitan areas, e.g., Cologne?

    URLs should be provided for some references, e.g., Stahlschmidt, Stephen, & Hinze (2019) and Stephen, Stahlschmidt, & Hinze (2020). Reference Stahlschmidt, Stephen, & Hinze (2019) points to page 91. The version I found via a Google Scholar search has only 87 pages.

    Reviewer report
    2020/10/05

    Reviewer 2's report, identical to the comments included in the decision letter above.
    Reviewer report
    2020/09/24

    Reviewer 1's report, identical to the comments included in the decision letter above.
All peer review content displayed here is covered by a Creative Commons CC BY 4.0 license.