Abstract

BACKGROUND: The Metadata 2020 initiative is an ongoing effort to bring various scholarly communications stakeholder groups together to promote principles and standards of practice to improve the quality of metadata. To understand the perspectives and practices regarding metadata of the main stakeholder groups (librarians, publishers, researchers and repository managers), we conducted a survey during summer 2019. The survey content was generated by representatives from the stakeholder groups.

METHODS: A link to an online survey (17 or 18 questions depending on the group) was distributed through multiple social media, listserv, and blog outlets. Responses were anonymous, with an optional entry for names and email addresses for those who were willing to be contacted later.

RESULTS: Complete responses (N=211; 87 librarians, 27 publishers, 48 repository managers, and 49 researchers) representing 23 countries on four continents were analyzed and summarized for thematic content and ranking of awareness and practices.

CONCLUSIONS: Across the stakeholder groups, the level of awareness and usage of metadata methods and practices was highly variable. Clear gaps across the groups point to the need for consolidation of schema and practices, as well as broad educational efforts in order to increase knowledge and implementation of metadata in scholarly communications.


Authors

Kathryn A. Kaiser;  Michelle Urberg;  Maria Johnsson;  Jennifer Kemp;  Alice Meadows;  Laura Paglione


Contributors on Publons
  • 2 reviewers
  • pre-publication peer review (FINAL ROUND)
    Decision Letter
    2021/04/10

    10-Apr-2021

    Dear Ms. Paglione:

    Thank you for the careful revision of your manuscript "An International, Multi-Stakeholder Survey about Metadata Awareness, Knowledge and Use in Scholarly Communications". It is a pleasure to accept your manuscript for publication in Quantitative Science Studies.

    I would like to request you to prepare the final version of your manuscript using the attached checklist. Please also sign the publication agreement, which can be downloaded from https://bit.ly/2OALK46. The final version of your manuscript, along with the completed checklist and the signed publication agreement, can be returned to qss@issi-society.org.

    Thank you for your contribution. On behalf of the Editors of Quantitative Science Studies, I look forward to your continued contributions to the journal.

    Best wishes,
    Dr. Ludo Waltman
    Editor, Quantitative Science Studies
    qss@issi-society.org

    Author Response
    2021/04/08

    Thank you for your review of our paper and the detailed feedback. We believe that we have addressed the reviewers' comments. Please find the details below. Updates have been made using track changes in the manuscript. Supplementary materials were updated and have been published in Zenodo so that they can be referenced by both the paper and dataset. The manuscript has been updated to APA format as per publication guidelines. While figures are included in the manuscript inline, they are also provided as separate files.

    Best, Laura Paglione, Kathryn Kaiser, Michelle Urberg, Maria Johnsson, Jennifer Kemp, and Alice Meadows

    Reviewer Comments and Author Responses

    COMMENT
    Bias: The survey was in English, the survey time was relatively short - 40 days from issue of invitations to closing date, and the respondents heavily biased to native anglophones, and strongly biased to those resident in North America. This correlates closely with the characteristics of those initiating the survey.

    CHANGED: (Section 5.3) Inserted: "We must acknowledge the limitations of the convenience sample of respondents who were predominantly from English speaking countries and represent the organizations that control much in the scholarly communications ecosphere. Future work would benefit from targeted recruitment from under-represented areas of the world and stakeholders, and should include questions about less common but important items such as non-traditional works and indigenous knowledge."

    COMMENT
    Main report 1: The mention of the FAIR principles (Section 5.3) should be supported by a reference (https://www.force11.org/fairprinciples).

    CHANGED: (Reference section and Section 5.3) Reference was included

    COMMENT
    Main report 2: In Table 9, the second line should be corrected to read “Difference from Librarians”, not “Difference from Publishers” in THREE places.

    CHANGED: (Table 3 (note, table numbering has been updated)) Correction made as noted

    COMMENT
    Main report 3: Typographical errors to be corrected.
    (a) Last sentence of section 4.2.3 “with 11% reporting no checking at al (Publisher Survey Q10).”: “al” should be replaced by “all”.
    (b) Text should be checked to remove occasional unwanted extra spaces before words or punctuation marks:
    e.g. Section 4.3.1: “The professional duties of this group vary widely , with . . .”;
    or to add spaces where required:
    e.g. Section 4.3.2: “about their challenges with entering data(Repository Manager Survey Q10).”

    CHANGED: (Throughout the paper) Paper was reviewed for typographical and grammar/punctuation errors.

    COMMENT
    Main report 4: Spelling mistake to be corrected.
    Section 5.2: “reponses point to fields with rich metadata (Researcher Survey Q11 and Q12).” should be “responses . . .”.

    CHANGED: (Throughout the paper) Paper was reviewed for spelling errors

    COMMENT
    Findings 1. ORCIDS: There is an encouraging awareness of ORCID among repository managers and researchers.

    CHANGED: (section 5.3) New text in the results section

    COMMENT
    Findings 2. Abstracts: The fact that Abstracts were rated by publishers among their four most important metadata elements (together with authors, title and publication date) should bode well for publisher involvement in the Initiative for Open Abstracts (I4OA).

    CHANGED: (section 5.3) New text in the results section

    COMMENT
    Findings 3. Reference management systems: In answers to question A1 (What metadata fields are most important?), researchers list personal reference management systems (Mendeley, Zotero, EndNote and RefWorks) that are not mentioned by respondents from any of the other three communities. This shows a disjunction between researchers' practices and those attempting to serve their needs.

    CHANGED: (section 5.3) New text in the results section

    COMMENT
    Findings 4. Use of Semantic Web technologies: Although some repository managers report use of JSON-LD, Schema.org and RDF, and 10% of researchers use RDF when publishing research outputs, there seems to be a general ignorance or avoidance of semantic web technologies, despite the benefits these could bring to metadata definition.

    CHANGED: (section 5.3) New text in the results section

    COMMENT
    Findings 5. Controlled metadata vocabularies: Related to point 4, although among respondents there is widespread awareness of and use of NISO JATS and Dublin Core, a consistent finding from all four groups surveyed (librarians, publishers, repository managers and researchers) is the need for better controlled metadata / controlled vocabularies / standardised metadata schemas. Yet neither the survey respondents, nor the authors of this report, comment on the limited coverage of Dublin Core terms, or express awareness of the existing much richer controlled vocabularies specifically targeted to this domain, such as the SPAR (Semantic Publishing and Referencing) Ontologies [1], or of the OpenCitations Data Model [2] that structures metadata based on such ontologies (that this reviewer has helped develop).

    CHANGED: (section 5.3) New text in the results section

    COMMENT
    Findings 6. Lack of precision of JATS terms: The authors observe that many of the metadata systems being used were “not originally designed with computer technology in mind.” (Section 5.1). The principal difference between JATS and RDF ontologies is the precision with which terms are identified. JATS is a descriptive, not a prescriptive model, and is deliberately vague about the meaning of terms, because there is no intention to tell any publisher what they should call their content. Thus, unlike the world of RDF with its precise semantics, in the world of XML markup terms can take on different meanings, depending on who is using them.

    CHANGED: (section 5.3) Added to encourage a deeper treatment as a topic of future study

    COMMENT
    Clarification 1: In this study, the researcher respondents "are primarily creators and consumers of metadata" (p17). This would appear to be an effect of self-selection in relation to the survey. "Researchers are increasingly struggling with administrative demands associated with creating metadata for funding applications, publications, and data sets" (p25). That researchers are doing metadata work is quite useful for revealing gaps, whereas a more normative role for researchers might be users and creators of the research objects that metadata are meant to describe. In the survey responses, do the researchers express a view about their role in metadata work? And what is the expected role of researchers, from the authors' standpoint, in possible metadata futures? [For future work in this area, the authors propose a "persona perspectives" framework to "better encapsulate" metadata workflows. Among other affordances, this framework would seem to help reduce the possibility of normative questions.]

    CHANGED: (section 4.4.1 and 5.3.7) Added text to address these points

    COMMENT
    Clarification 2: The absence of Persistent Identifiers (PIDs) in this paper seems to suggest a lack of relevance to the task at hand. PIDs such as DOIs and ORCIDs are present as metadata elements--PIDs as metadata. However, PIDs also contain metadata. Could PID platforms provide useful metadata workflows as a contrast to the platforms chosen for this study? How might PID metadata practices inform the problem space (or solution space) for metadata as operationalized in this study?

    CHANGED: (section 5.3.8) Added to encourage deeper treatment as a topic of future study

    COMMENT
    Supplementary 1: Individual figures and tables should be NUMBERED to assist referencing them.

    CHANGED: (supplementary paper - https://doi.org/10.5281/zenodo.4666193) Figure and Table numbers were added in addition to the numbered questions that accompanied the figures. Question numbers are included in captions to avoid confusion

    COMMENT
    Supplementary 2: In Tables 3, 4, 5 and 6 (Pages 3 & 4 of Metadata 2020 Survey Methods and Results Summary, pages 32 & 33 of the PDF): The meaning of “Default” (second line heading for third column) MUST be defined!

    CHANGED: (supplementary paper - https://doi.org/10.5281/zenodo.4666193) A definition has been provided for "Default" and the other fields used in these tables

    COMMENT
    Supplementary 3: In Section 5. What is the role of services for and support of metadata in your library?
    The pixel resolution of the inset text box headed “To quote . . .” on page 8 of Metadata 2020 Survey Methods and Results Summary (page 37 of the PDF) is inadequate for publication and should be replaced by a paragraph headed “Write in comments [sic]:” containing the same textual information, as for later responses.

    CHANGED: (supplementary paper - https://doi.org/10.5281/zenodo.4666193) Quotes have been removed from the text box and added to the body text. Since these are excerpts from write-in answers, the introduction before the list of quotes is slightly different from the treatment elsewhere to indicate that they are not the full answers.

    COMMENT
    Supplementary 4: In the table on page 51 of the supplementary information (Page 80 of the PDF)(identical to Table 9 in the main report), the second line should be corrected to read “Difference from Librarians”, not “Difference from Publishers” in THREE places.

    CHANGED: (supplementary paper - https://doi.org/10.5281/zenodo.4666193) Correction made as noted



  • pre-publication peer review (ROUND 1)
    Decision Letter
    2021/02/09

    09-Feb-2021

    Dear Ms. Paglione:

    Your manuscript QSS-2020-0093 entitled "An International, Multi-Stakeholder Survey about Metadata Awareness, Knowledge and Use in Scholarly Communications", which you submitted to Quantitative Science Studies, has been reviewed. There are two reviewers. The comments of reviewer 1 can be found in the attached PDF file. The comments of reviewer 2 are included at the bottom of this letter.

    Both reviewers are positive about your work. The reviewers recommend publication of your manuscript after some minor issues have been addressed. Based on the comments of the reviewers, I would like to invite you to prepare a revised version of your manuscript.

    To revise your manuscript, log into https://mc.manuscriptcentral.com/qss and enter your Author Center, where you will find your manuscript title listed under "Manuscripts with Decisions." Under "Actions," click on "Create a Revision." Your manuscript number has been appended to denote a revision.

    You may also click the below link to start the revision process (or continue the process if you have already started your revision) for your manuscript. If you use the below link you will not be required to login to ScholarOne Manuscripts.

    PLEASE NOTE: This is a two-step process. After clicking on the link, you will be directed to a webpage to confirm.

    https://mc.manuscriptcentral.com/qss?URL_MASK=e3bad7b40d964be9934fed1225e13000

    You will be unable to make your revisions on the originally submitted version of the manuscript. Instead, revise your manuscript using a word processing program and save it on your computer. Please also highlight the changes to your manuscript within the document by using the track changes mode in MS Word or by using bold or colored text.

    Once the revised manuscript is prepared, you can upload it and submit it through your Author Center.

    When submitting your revised manuscript, you will be able to respond to the comments made by the reviewers in the space provided. You can use this space to document any changes you make to the original manuscript. In order to expedite the processing of the revised manuscript, please be as specific as possible in your response to the reviewers.

    IMPORTANT: Your original files are available to you when you upload your revised manuscript. Please delete any redundant files before completing the submission.

    If possible, please try to submit your revised manuscript by 10-Apr-2021. Let me know if you need more time to revise your work.

    Once again, thank you for submitting your manuscript to Quantitative Science Studies and I look forward to receiving your revision.

    Best wishes,
    Dr. Ludo Waltman
    Editor, Quantitative Science Studies
    qss@issi-society.org

    Reviewers' Comments to Author:

    Reviewer: 1

    Comments to the Author
    Please see attached review document.

    Reviewer: 2

    Comments to the Author
    This is an important initiative, Metadata 2020 in general and this project in particular. The findings present a challenging state of affairs, whereby metadata move "through various systems in complex and idiosyncratic ways, but the various workflows have not been identified or studied in ways that can help researchers learn more about where their metadata goes once it enters a platform or a publisher’s website" (p25). Related complexities include high degree of manual entry/update; inconsistent quality verification; and "no single powerful best practice or standard or software system drives the metadata ecosystem" (p23).

    As the authors suggest, moving toward better alignment of metadata schemata and workflows would be both challenging and consequential. Survey respondent categories included librarians, publishers, researchers and repository managers. Roles for each seemed to be embedded in the survey questions and discussion of responses. This raises two clarification questions: first, a specific question about the expected role of researchers with respect to metadata, and then a general question about the relevance of PIDs in the metadata landscape.

    In this study, the researcher respondents "are primarily creators and consumers of metadata" (p17). This would appear to be an effect of self-selection in relation to the survey. "Researchers are increasingly struggling with administrative demands associated with creating metadata for funding applications, publications, and data sets" (p25). That researchers are doing metadata work is quite useful for revealing gaps, whereas a more normative role for researchers might be users and creators of the research objects that metadata are meant to describe. In the survey responses, do the researchers express a view about their role in metadata work? And what is the expected role of researchers, from the authors' standpoint, in possible metadata futures? [For future work in this area, the authors propose a "persona perspectives" framework to "better encapsulate" metadata workflows. Among other affordances, this framework would seem to help reduce the possibility of normative questions.]

    The absence of Persistent Identifiers (PIDs) in this paper seems to suggest a lack of relevance to the task at hand. PIDs such as DOIs and ORCIDs are present as metadata elements--PIDs as metadata. However, PIDs also contain metadata. Could PID platforms provide useful metadata workflows as a contrast to the platforms chosen for this study? How might PID metadata practices inform the problem space (or solution space) for metadata as operationalized in this study?

    A solid contribution that sheds light on a vexing problem! The requested clarifications would, in my view, make it even better.

    Reviewer report
    2021/02/09

    This is an important initiative, Metadata 2020 in general and this project in particular. The findings present a challenging state of affairs, whereby metadata move "through various systems in complex and idiosyncratic ways, but the various workflows have not been identified or studied in ways that can help researchers learn more about where their metadata goes once it enters a platform or a publisher’s website" (p25). Related complexities include high degree of manual entry/update; inconsistent quality verification; and "no single powerful best practice or standard or software system drives the metadata ecosystem" (p23).

    As the authors suggest, moving toward better alignment of metadata schemata and workflows would be both challenging and consequential. Survey respondent categories included librarians, publishers, researchers and repository managers. Roles for each seemed to be embedded in the survey questions and discussion of responses. This raises two clarification questions: first, a specific question about the expected role of researchers with respect to metadata, and then a general question about the relevance of PIDs in the metadata landscape.

    In this study, the researcher respondents "are primarily creators and consumers of metadata" (p17). This would appear to be an effect of self-selection in relation to the survey. "Researchers are increasingly struggling with administrative demands associated with creating metadata for funding applications, publications, and data sets" (p25). That researchers are doing metadata work is quite useful for revealing gaps, whereas a more normative role for researchers might be users and creators of the research objects that metadata are meant to describe. In the survey responses, do the researchers express a view about their role in metadata work? And what is the expected role of researchers, from the authors' standpoint, in possible metadata futures? [For future work in this area, the authors propose a "persona perspectives" framework to "better encapsulate" metadata workflows. Among other affordances, this framework would seem to help reduce the possibility of normative questions.]

    The absence of Persistent Identifiers (PIDs) in this paper seems to suggest a lack of relevance to the task at hand. PIDs such as DOIs and ORCIDs are present as metadata elements--PIDs as metadata. However, PIDs also contain metadata. Could PID platforms provide useful metadata workflows as a contrast to the platforms chosen for this study? How might PID metadata practices inform the problem space (or solution space) for metadata as operationalized in this study?

    A solid contribution that sheds light on a vexing problem! The requested clarifications would, in my view, make it even better.

    Reviewer report
    2021/01/02

    Please see attached review document.

All peer review content displayed here is covered by a Creative Commons CC BY 4.0 license.