Digital materials are subject to the same principles for selection, preservation, and access as non-digital collections.
Successful digital preservation requires a firm institutional commitment, clarity of purpose, skilled personnel, and an infrastructure with adequate resources to expand and adapt as collections and technology evolve. Rapid advances in technology dictate that digital preservation approaches respond and adapt to changes in digital media. This dynamic nature of the digital landscape heightens the importance of regularly reviewing and updating the Library’s digital preservation policy.
As an integral part of the larger infrastructure that supports the stewardship of content for which the Libraries has assumed responsibility, the Carolina Digital Repository (CDR) manages the infrastructure and services necessary to provide sustainable access to digital objects. The CDR is designed and operated to ensure the integrity of digital files at a bitstream level (the way information is encoded). Characteristics that will affect preservation are recorded in metadata as part of the deposit process. The CDR regularly verifies the integrity of files, maintains a record of preservation-related actions, and employs best practices in the field for persistent storage, including back-up and recovery procedures. Repository staff consult with prospective depositors regarding the nature of their digital collections, foreseeable challenges for long-term access, and strategies for meeting their preservation goals.
- Consult with pilot collection contributors to prepare content for submission to the CDR
- Create and maintain the service application, including:
- Provide and manage sufficient storage for ingest and maintenance of content and associated metadata
- Provide and manage persistent storage, including appropriate back-up and recovery procedures
- Sustain bitstream-level preservation of digital objects
- Perform system monitoring, testing and debugging
- Ensure the authenticity and integrity of content through the automated creation and capture of technical metadata
- Ensure continued access to the intellectual content of digital objects through the creation and capture of descriptive metadata
- Provide a suite of services and functionality for maintaining digital objects to enable future access
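The automated capture of technical metadata named in the list above can be illustrated with a short sketch. It is not the CDR's actual implementation. The function name `capture_technical_metadata`, the choice of SHA-256, and the field names in the returned record are all assumptions made for illustration. It uses only the Python standard library.

```python
import hashlib
import mimetypes
import os
from datetime import datetime, timezone


def capture_technical_metadata(path, chunk_size=1024 * 1024):
    """Return a dict of basic technical metadata for one file.

    Hypothetical helper: the field names are illustrative, not a CDR schema.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)

    # guess_type works from the file extension only; it does not inspect content.
    mime_type, _ = mimetypes.guess_type(path)

    return {
        "filename": os.path.basename(path),
        "size_bytes": os.path.getsize(path),
        "sha256": digest.hexdigest(),
        "mime_type": mime_type or "application/octet-stream",
        "captured_at": datetime.now(timezone.utc).isoformat(),
    }
```

A record like this, stored at deposit time, gives later integrity checks a baseline checksum to compare against. A production system would more likely identify formats by file signature than by extension.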
The content that will be deposited into the CDR is organized into three collecting areas:
- UNC-Chapel Hill research/scholarship/creative work. For example: preprints and postprints, grey literature, posters, datasets, learning objects, electronic theses, dissertations, and other capstone works.
- Born-digital special collections acquisitions. For example: email correspondence, digital literary manuscripts, photographs, audio/visual materials, research data, electronic records of UNC administrative units and student organizations.
- Digitized UNC-Chapel Hill Libraries collections. For example: digitized copies of print monographs and serials, manuscripts, maps, photographs, audio/visual materials, and musical scores held at UNC-Chapel Hill (except those deposited in other preservation repositories).
The following types of materials are currently not deposited into the CDR:
- Web harvesting data
- Print materials digitized through the Internet Archive Scribe workflow
- Licensed electronic resources
See also the CDR Collection Development Policy, linked in Appendix.
Preservation Priority Tiers
The Library determines preservation priorities based on user needs, the risk and consequence of loss, and feasibility. The priority tiers below serve as a broad guideline and must be applied in context. Where decisions must be made about which content receives priority placement in the CDR, these tiers will guide the decision process.
Tier 1: Highest Priority
- Born-digital content with scholarly value held solely by UNC-Chapel Hill, such as digital content in archival collections.
- Master files from digitization by UNC-Chapel Hill of materials with a high degree of deterioration or risk of loss in original format.
- Digital content preserved under the terms of formal agreements and obligations, such as grant contracts, depositor agreements, policies regarding electronic records in the University Archives, and terms of participation in the LOCKSS network.
- Master files of UNC-Chapel Hill theses and dissertations.
Tier 2: High Priority
- Master files from digitization by UNC-Chapel Hill of materials with a low or moderate degree of deterioration or risk of loss in original format.
Tier 3: Moderate Priority
- Published content with scholarly value in files acquired, stored and managed by the Library under the terms of perpetual access and purchase agreements but likely to be similarly stored and managed by peer libraries.
- Digital content accompanying print publications
Determining Levels of Preservation Priority
Levels of preservation priority consider both collecting priorities (i.e., collection development policies) and the technical preservability of the file formats and objects. Preservation priority levels are determined by library and repository staff. See the Appendix for more detail on this process and for related policies and documentation.
The following preservation actions are performed to ensure the long-term preservation of digital objects in the repository:
- Bitstream maintenance
- Assignment of a persistent, permanent identifier
- Preservation metadata
- On- and off-site backup
- Routine virus and file corruption checks
- Periodic refreshment to new storage media
- Monitoring of file format for changes that may warrant transformation/reassessment
- Migration to successor formats
Not all actions will be performed on all objects.
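As an illustration of the routine file corruption checks and preservation metadata listed above, the following sketch recomputes a file's checksum, compares it with a previously recorded value, and appends the outcome to an event log. It is a minimal example under stated assumptions: the function names `sha256_of` and `verify_fixity`, the use of SHA-256, and the structure of the event record are hypothetical and do not describe the CDR's actual tooling.

```python
import hashlib
from datetime import datetime, timezone


def sha256_of(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path, expected_sha256, event_log):
    """Compare the current checksum with the recorded one and log the result.

    event_log is any list-like object; each check appends one dict to it,
    so a record of preservation-related actions is kept (hypothetical schema).
    Returns True when the checksums match.
    """
    actual = sha256_of(path)
    passed = actual == expected_sha256
    event_log.append({
        "event_type": "fixity check",
        "object": path,
        "expected_sha256": expected_sha256,
        "actual_sha256": actual,
        "outcome": "pass" if passed else "fail",
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    return passed
```

A failed check would typically trigger recovery from a backup copy. That step is outside this sketch.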
The CDR does not manage any physical storage media outside of its own infrastructure. Physical media will be handled per the terms of the deposit agreement.
File Support & Preservation Best Practices
Several policies outline the criteria by which the repository determines preferred file formats and/or levels of preservation support. The most common criteria cited in preservation policies are:
- Format is in widespread and/or current use
- Availability of the format’s complete documentation and specifications
- Format is non-proprietary
- Format is based on established standards
- Format is platform-independent
- Format is uncompressed
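The criteria above can be treated as a simple checklist when assessing a file format. The sketch below is illustrative only: the `FormatProfile` class, its attribute names, and the example format are hypothetical, and the policy does not prescribe any scoring or weighting of the criteria.

```python
from dataclasses import dataclass

# One attribute name per criterion in the list above (names are illustrative).
CRITERIA = (
    "widely_used",
    "documented",
    "non_proprietary",
    "standards_based",
    "platform_independent",
    "uncompressed",
)


@dataclass(frozen=True)
class FormatProfile:
    """Checklist answers for a single file format (hypothetical structure)."""
    name: str
    widely_used: bool
    documented: bool
    non_proprietary: bool
    standards_based: bool
    platform_independent: bool
    uncompressed: bool


def criteria_met(profile):
    """Return the names of the criteria this format satisfies."""
    return [name for name in CRITERIA if getattr(profile, name)]


def criteria_unmet(profile):
    """Return the names of the criteria this format does not satisfy."""
    return [name for name in CRITERIA if not getattr(profile, name)]


# Usage with a made-up format; the values are placeholders, not an assessment.
example = FormatProfile(
    name="example-format",
    widely_used=True,
    documented=True,
    non_proprietary=True,
    standards_based=True,
    platform_independent=True,
    uncompressed=False,
)
print(criteria_unmet(example))  # ['uncompressed']
```

Listing unmet criteria rather than computing a single score keeps the assessment transparent and leaves the final decision about preservation support to repository staff.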