Carolina Digital Repository Blog

Only entropy comes easy. -Anton Chekhov

Technology Overview


The technologies that make up the CDR are carefully chosen and tested to provide trustworthy, sustainable and flexible access to digital objects. We rely heavily upon our backbone of Fedora Commons and iRODS to create our repository services. These two systems complement each other, providing high-level object services alongside distributed bitstream preservation and processing.

D is for Durability

Durability and survivability are qualities that we had in mind when we combined Fedora Commons with iRODS. These two systems work together to separate a complex and flexible world of well-described digital objects from the routine actions of bitstream preservation. Specifically, the Fedora Commons repository uses a special storage module to securely place object definitions and data files into an iRODS storage grid.

It is important to note that a Fedora Commons repository, complete with sophisticated object models and interrelationships, may be completely recovered from these object definition and data files. The parts of a Fedora Commons repository, such as its database, triple store, search engine and web services, while tremendously flexible and useful, are all equally ephemeral with respect to the survivability of the objects they represent. These systems may be recreated from the “flat files”. Therefore we ensure the survival of all our digital objects, albeit in latent form, by safeguarding the integrity of this growing set of files.
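To make the idea of recreating derived systems from the flat files concrete, here is a minimal sketch in Python. It is an illustration only, not the CDR's or Fedora's actual rebuild procedure. It assumes each object definition is stored as an XML file whose root element carries a PID attribute, as FOXML documents do. The function name rebuild_index and the directory layout are hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def rebuild_index(object_dir: Path) -> dict[str, Path]:
    """Rebuild a PID -> file lookup by scanning object definition files.

    Assumption: every *.xml file in object_dir is an object definition
    whose root element has a PID attribute. Files without one are skipped.
    """
    index: dict[str, Path] = {}
    for path in sorted(object_dir.glob("*.xml")):
        root = ET.parse(path).getroot()
        pid = root.get("PID")
        if pid is not None:
            index[pid] = path
    return index
```

The lookup table here stands in for the database, triple store or search index: it holds nothing that cannot be derived again from the files, so losing it costs time but not objects.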

The iRODS grid receives files and their digests from Fedora and takes action to preserve them based on rules, automatically verifying the digests and replicating files to a remote location. The grid can also be used to verify the integrity of copies over time and trigger a repair action. Furthermore, the grid and its distributed compute resources can be used for data processing activities, such as virus scans, format migrations, data subsetting and technical metadata extraction routines.
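The verify, replicate and repair cycle described above can be sketched in a few lines of Python. This is a simplified stand-in using local directories and the standard library, not iRODS rules or the iRODS API. The function names, the choice of SHA-256 and the two-directory layout (primary and replica) are all assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256") -> str:
    """Return the hex digest of a file, reading it in chunks."""
    h = hashlib.new(algorithm)
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def ingest(source: Path, expected: str, primary: Path, replica: Path) -> None:
    """Verify the digest supplied with the file, then store two copies."""
    if file_digest(source) != expected:
        raise ValueError(f"digest mismatch on ingest: {source.name}")
    for store in (primary, replica):
        store.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, store / source.name)


def audit_and_repair(name: str, expected: str, primary: Path, replica: Path) -> str:
    """Re-check both copies and overwrite a damaged or missing one from a good one."""
    copies = [primary / name, replica / name]
    good = [p for p in copies if p.exists() and file_digest(p) == expected]
    if not good:
        return "unrecoverable"
    if len(good) == len(copies):
        return "ok"
    for p in copies:
        if p not in good:
            p.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(good[0], p)
    return "repaired"
```

In this sketch the digest recorded at ingest is the only source of truth: a copy counts as good only if it still matches that digest, and a repair simply copies from whichever location still passes the check.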

We will expand this overview as we explore the technology of the CDR in these upcoming posts:

  • Ongoing Access, the Tier-N-y of the Web
  • How Can Humans Trust Machines?
  • Compose and Enhance, the Digital Object Dance
  • The Workbench: Hustle and Flow

Written by Greg Jansen

October 2nd, 2010 at 3:46 pm
