Born-Digital Ingest Workflow Overview

The following workflow is used primarily for special collections. 

1. Safely extract files from original storage media

  • Create accession record and label removable media as appropriate.
  • Prepare Staging Storage directory.
  • Create a disk image, or use other tools to copy the data safely, as appropriate for the media type.
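For media where a file-level copy is appropriate instead of a disk image, the copy step might be sketched as follows. This is a minimal illustration, not the documented CDR procedure: the function name copy_and_verify and both directory arguments are hypothetical, and the sketch assumes the media is already mounted read-only and that the staging directory is not inside the source directory.

```python
import filecmp
import shutil
from pathlib import Path


def copy_and_verify(source_dir, staging_dir):
    """Copy every file under source_dir into staging_dir.

    Returns the relative paths of any files whose copy does not match
    the original byte for byte (an empty list means the copy is clean).
    """
    source_root = Path(source_dir)
    staging_root = Path(staging_dir)
    mismatches = []
    # Build the full file list first so the walk is not affected by the copy.
    source_files = sorted(p for p in source_root.rglob("*") if p.is_file())
    for source_path in source_files:
        relative = source_path.relative_to(source_root)
        target_path = staging_root / relative
        target_path.parent.mkdir(parents=True, exist_ok=True)
        # copy2 copies the content and also tries to preserve timestamps.
        shutil.copy2(source_path, target_path)
        # shallow=False forces a comparison of file contents, not just metadata.
        if not filecmp.cmp(source_path, target_path, shallow=False):
            mismatches.append(relative.as_posix())
    return mismatches
```

A non-empty result means the listed files should be recopied or investigated before the original media is set aside.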

2. Appraise and prepare files for ingest in secure Staging Storage

  • Create a manifest of the transfer, including file format identification.
  • Run a virus scan.
  • Run a Personally Identifiable Information (PII) scan, as appropriate for text-based materials.
  • Appraise content and technical characteristics to determine further pre-ingest processing needs.
  • Carry out further pre-ingest processing as appropriate (e.g., weeding files, deduplication, file format normalization, embedded metadata review).
  • Note: This work and ingest to the CDR should happen over the course of a couple of months. Collections should not be left on Staging Storage indefinitely. If collections cannot be processed in the near term, see the Storage Menu for other temporary and secure options.
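The manifest and deduplication steps above can be sketched in a few lines of Python. This is an illustrative sketch only: the function names and CSV columns are hypothetical, and the "guessed_type" column is based on the file extension alone, so it stands in for, and does not replace, a dedicated format identification tool.

```python
import csv
import hashlib
import mimetypes
from collections import defaultdict
from pathlib import Path

FIELDS = ["path", "bytes", "sha256", "guessed_type"]


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(transfer_dir):
    """List every file under transfer_dir with size, checksum, and guessed type."""
    root = Path(transfer_dir)
    rows = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        # Extension-based guess only (an assumption of this sketch).
        guessed_type, _ = mimetypes.guess_type(path.name)
        rows.append({
            "path": path.relative_to(root).as_posix(),
            "bytes": path.stat().st_size,
            "sha256": sha256_of(path),
            "guessed_type": guessed_type or "unknown",
        })
    return rows


def find_duplicates(rows):
    """Group manifest rows by checksum and keep groups with more than one file."""
    by_checksum = defaultdict(list)
    for row in rows:
        by_checksum[row["sha256"]].append(row["path"])
    return {digest: paths for digest, paths in by_checksum.items() if len(paths) > 1}


def write_manifest(rows, csv_path):
    """Write the manifest rows to a CSV file with a header line."""
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
```

Files that share a checksum are byte-identical, so the output of find_duplicates is a list of candidates for review during appraisal, not an instruction to delete anything.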

3. Finalize SIP

  • Review the total size of the collection. If the SIP includes 5,000 items or more, contact the Repository Program Librarian. The ingest may need to be split into multiple ingests.
  • For almost all special collections ingests, use Bagger to package the collection.
    • Refer to the Using Bagger documentation on the CDR Help pages.
  • If you have single items to ingest or don’t think Bagger is necessary, contact the Repository Program Librarian or the Assistant University Archivist for Digital Records for further advice, or see the more detailed departmental workflow documentation for steps.
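The size review in this step can be done with a short script before packaging. The sketch below assumes that one "item" corresponds to one file, which may not match how the CDR counts items; the function names are hypothetical, and the 5,000 threshold is taken from the step above.

```python
from pathlib import Path

ITEM_THRESHOLD = 5000  # from the workflow: contact the librarian at 5,000 items or more


def count_items(sip_dir):
    """Count the files under sip_dir (assumes one item per file)."""
    return sum(1 for p in Path(sip_dir).rglob("*") if p.is_file())


def total_bytes(sip_dir):
    """Return the combined size in bytes of all files under sip_dir."""
    return sum(p.stat().st_size for p in Path(sip_dir).rglob("*") if p.is_file())


def needs_split_review(item_count, threshold=ITEM_THRESHOLD):
    """Return True when the item count reaches the threshold for a split review."""
    return item_count >= threshold
```

For example, needs_split_review(count_items("my_collection")) returning True would be the cue to contact the Repository Program Librarian before ingest; "my_collection" is a placeholder path.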

4. Ingest through the CDR Admin interface

  • Log in using your Onyen.
  • Click “Admin” to access the admin interface.
  • Find your collection in the “Collection overview” or in “Browse”.
  • Refer to the “Working in the Admin Interface” page for further details.
  • First time users should contact the Repository Program Librarian for CDR training.

5. Quality control

  • If ingest fails, contact CDR staff for assistance.
  • After ingest is complete, it is critical to perform QC and ensure expected files were ingested.
  • If you are performing QC for a large collection, sampling may be appropriate.
  • Some things to look for:
    • Were all the data and metadata for each file ingested properly? If you have different file types, look at each type.
    • Can each file be viewed and/or downloaded?
    • Was FITS generated for files?
    • What access controls need to be added? Should the collection be unpublished?
      • A lock icon will appear on files/collections that are unpublished or embargoed.
      • Are children inheriting access controls from their parents, and are the intended exceptions not inheriting?
  • If you find any problems, contact CDR staff for troubleshooting.
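For large collections, the sampling and completeness checks above can be prepared with a small script. This is a sketch under stated assumptions: the list of ingested paths is assumed to have been gathered separately (for example, by hand from the Admin interface), since no CDR export or API is described here, and both function names are hypothetical.

```python
import random


def draw_qc_sample(paths, sample_size, seed=None):
    """Return a sorted random sample of paths to check by hand.

    Passing the same seed reproduces the same sample, which makes it
    possible to record exactly which files were reviewed.
    """
    ordered = sorted(paths)
    if sample_size >= len(ordered):
        return ordered
    rng = random.Random(seed)
    return sorted(rng.sample(ordered, sample_size))


def compare_listings(expected_paths, ingested_paths):
    """Report files that are missing from, or unexpected in, the ingested set."""
    expected = set(expected_paths)
    ingested = set(ingested_paths)
    return {
        "missing": sorted(expected - ingested),
        "unexpected": sorted(ingested - expected),
    }
```

Each sampled file would still need to be opened in the CDR to confirm that it can be viewed or downloaded; the script only decides which files to look at and flags gaps between the two listings.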

6. Arrange files, describe with MODS, and finalize access controls

  • Work with Technical Services to arrange and describe collections (in MODS and finding aids).
  • In the Admin interface, collection managers can: add collections and folders; move files; and set access controls for collections, folders, and individual files.
  • MODS metadata

7. Final QC check and deletion of files or return of storage media

  • Take one more look at the access controls for the collection in the CDR. Ensure they are all correct.
  • After you have performed QC and confirmed that all files have been properly ingested, it is safe to delete the original files on local storage or return the storage media to donors.
  • The CDR will clean up the files and directories in the Staging Storage. Delete any other working copies (outside of the bag) for the collection that you may have created on Staging Storage.
  • Update the accession record as needed.

Note: This workflow is generalized for large, heterogeneous collections from legacy storage media. Workflows for other collection types (for example, ingest of a single PDF publication) may be different.