Over 22,000 new articles added to the CDR

Over the past few weeks, we’ve added more than 22,000 UNC-authored articles to the CDR. These articles were published between 1980 and 2015 and focus primarily on science, technology and mathematics. They are also available through PubMed Central.

As with the previous batch of articles, this batch was identified through a report from 1Science, which the Libraries purchased in May 2018. We learned a lot from the previous project, so we adapted our processes and loaded the second batch of articles quickly. Still, loading the articles into the CDR once again required a lot of work, including:

  • Rewriting portions of the download script to prevent overwriting of file names
  • Identifying and obtaining missing or incorrect metadata
  • Normalizing metadata, including mapping author affiliations to the CDR’s internal department list
  • Identifying embargo periods
  • Writing a script to ingest the articles into the CDR
  • Running multiple quality assurance checks

This list represents work from the Repository Services, Software Development, Infrastructure Management and Library Data Strategy and Services departments. Anna Goslen and Rebekah Kati presented on our process at the 2020 Triangle Research Libraries Network Annual Meeting. We are planning to load a third large batch in 2021.

These articles are only one component of our program to increase the number of scholarly articles in the Carolina Digital Repository and support the Libraries’ Sustainable Scholarship initiative. Read more about our recent initiatives to review faculty CVs and identify coronavirus-related research for deposit. If you’d like to deposit your work in the CDR, please contact us!
