Mass digitization = mass confusion?

Photographic archivists Elizabeth Hull and Stephen Fletcher scan one of Hugh Morton’s negatives.
The image above (© 2008 Winston-Salem Journal photo/David Rolfe), which shows Stephen at the computer and me (Elizabeth “The Flash” Hull) operating the scanner, is from a recent article in the Winston-Salem Journal (2/17/2008 issue) about our work on the Morton collection. On the article page, scroll down to the “Multimedia” link to view more images, including some of Morton’s that you haven’t yet seen on A View to Hugh. (Note: as of 5/9/2008, the article no longer appears to be available online.)

Yes, digitization of the Morton collection has begun . . . sort of. As you may have read in Stephen’s recent post, students from the digital libraries class are hard at work scanning black-and-white sheet film negatives—about 60 scans and counting so far—and the Library’s Digital Production Center should start on other parts of the collection soon.

This way of doing things—scanning the collection in the middle of archival processing, rather than after it’s finished—is somewhat unorthodox and presents a number of challenges, especially in terms of how to maintain control over 1) the physical items in the collection (the actual negatives, prints, etc.), 2) the electronic versions we’re creating, and 3) the relationships between them.
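One way to picture the three-part control problem described above is as a small register that records each physical item, each scan file, and the link between them. The Python below is only a minimal sketch under stated assumptions, not a description of the system actually used for the Morton collection: the names `PhysicalItem` and `ScanRegister`, the identifier scheme, and the box/folder locations are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PhysicalItem:
    """One physical object in the collection (a negative, a print, etc.)."""
    item_id: str    # hypothetical identifier assigned when the item is first handled
    medium: str     # e.g. "sheet film negative"
    location: str   # hypothetical box/folder; expected to change during processing
    scan_files: List[str] = field(default_factory=list)


class ScanRegister:
    """Keeps physical items, scan files, and the links between them together."""

    def __init__(self) -> None:
        self._items: Dict[str, PhysicalItem] = {}   # item_id -> item
        self._scan_to_item: Dict[str, str] = {}     # scan filename -> item_id

    def add_item(self, item_id: str, medium: str, location: str) -> PhysicalItem:
        """Register a physical item; identifiers must be unique."""
        if item_id in self._items:
            raise ValueError(f"duplicate item id: {item_id}")
        item = PhysicalItem(item_id, medium, location)
        self._items[item_id] = item
        return item

    def record_scan(self, item_id: str, filename: str) -> None:
        """Link a scan file to the physical item it was made from."""
        if item_id not in self._items:
            raise KeyError(f"unknown item id: {item_id}")
        if filename in self._scan_to_item:
            raise ValueError(f"scan already recorded: {filename}")
        self._items[item_id].scan_files.append(filename)
        self._scan_to_item[filename] = item_id

    def move_item(self, item_id: str, new_location: str) -> None:
        """Update the location when processing re-houses an item."""
        self._items[item_id].location = new_location

    def item_for_scan(self, filename: str) -> PhysicalItem:
        """Go from a scan file back to its physical source."""
        return self._items[self._scan_to_item[filename]]


if __name__ == "__main__":
    register = ScanRegister()
    register.add_item("neg-0001", "sheet film negative", "unsorted box 12")
    register.record_scan("neg-0001", "neg-0001_a.tif")
    # Later, archival processing moves the negative to its permanent home.
    register.move_item("neg-0001", "Box 7, Folder 3")
    print(register.item_for_scan("neg-0001_a.tif").location)
```

The design choice in this sketch is that scans point at a stable item identifier rather than at a shelf location, so re-housing a negative during processing does not break the link to its scans.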

We feel strongly about using this method, however, for a few reasons. The first, of most immediate importance to me, is that scanning will actually help me process the collection. I don’t know if you’ve ever sorted through a chaotic pile of 50,000 unlabeled negatives before (I’m guessing not), but for even the most seasoned negative-reader, it can be hard to tell what you’re looking at. Having positive versions in an electronic format will be incredibly useful for categorizing and identification.

The second reason we want to do “mass digitization” of the Morton collection is so that we can make as much of it available as quickly as possible. This fits with a growing trend in the library/archives community, as shown by the report Shifting Gears: Gearing Up to Get Into the Flow published by Online Computer Library Center (OCLC). (This report was inspired by an all-day program entitled “Digitization Matters: Breaking Through the Barriers—Scaling Up Digitization of Special Collections,” attended by Stephen, at the Society of American Archivists annual meeting last August. Stephen’s series “200,000 slides” on this blog stems from ideas presented at that program).

The Shifting Gears report challenges libraries and archives to shift the focus of digitization away from “boutique” projects that either highlight marquee collections or “cherry-pick” from collections, scanning images at a very high resolution, describing them in great detail, and leaving the bulk of the archival holdings on the shelf where only in-person visitors can see them. Increasingly, the consensus is that we should be moving towards methods that emphasize access over preservation, aim for quantity over quality, and that let the user decide which “cherries” s/he wants to pick.

If this all sounds confusing, that’s because it is! We’re figuring it out as we go along. Ideas, advice, sympathy, encouragement, etc. are welcome as always.

6 thoughts on “Mass digitization = mass confusion?”

  1. Hello Elizabeth & crew –
    I find the mass digitization concept interesting and want to point out that selecting materials (aka cherry picking) for digitization involves decisions that should be taken into consideration in any mass digitization project.
    1) What about all the images you don’t want to make available, like near dups, items off topic, etc.? I realize this might not be a concern with the Morton collection. I am currently working with a collection where we are digitizing while processing, and what is happening is that tons of slides that I would never select to digitize are being digitized. I wonder about the waste of time, as we believe most of these would not be useful to researchers, yet we are spending time scanning and managing the scans.
    2) What about copyright? Are you making the images available via the web or an internal database? I can see the advantages of mass digitization while processing IF the images were available in-house only. But publishing all those digital images on the web constitutes an act covered by copyright laws. Once again, this may not be an issue with the Morton collection if all the images maintain the same copyright restrictions. But lots of collections can have a mixed bag of copyright issues. If processors have to investigate and satisfy copyright issues while they are processing, this can mean added work (and possibly A LOT of added work) that might actually slow down processing.
    I’d love to hear how folks grapple with these two issues during the course of mass digitization efforts.
    Thanks – Ginny Daley

  2. Thanks, Ginny–you bring up important points. Your #1 is most certainly an issue with the Morton. We’re asking ourselves, though, which takes more time/effort: going through hundreds of thousands of images individually to select which ones should be scanned, or just scanning them all? (Especially in a “mass production” kind of environment?)
    Digital storage is increasingly cheap these days. And, it doesn’t mean you have to keep all the scans in the end (though that would also involve a time-consuming review process). But in the case of negatives, it may be impossible to know what you’ve got until you’re able to look at a positive…
    Just a few thoughts on the matter, for now.

  3. Ginny, one of the questions we’ll be looking at is what takes less time: 1) pre-selection, meaning weeding through the physical items and then scanning those that survive the edit, or 2) scanning everything, then reviewing the scans on a monitor and selecting what gets presented on the Web. Initial testing suggests the latter *if* you are utilizing high-speed scanning, which we will be. (More on that very soon!) Use of a slow scanner would likely favor pre-selection. I’ll be sure to post our findings once we have them.

  4. As a former news director at Grandfather Mountain, I was the custodian of Hugh Morton’s photo files for 20-some years before they moved to Chapel Hill.
    This is to say, I sorted the black and white prints by subject and put them into a filing cabinet. I put color slides into ring binders sorted by subject. And when newspapers or magazines called to request illustrations for their stories, I would go through those files to see if I had something that might meet their needs.
    Through those many years I spent a great deal of time pondering how to catalog that body of work. If I had had my druthers, if I had had access to high-speed scanning, I would have voted to scan it all and sort it later.
    But because the strategy that was available to me at the time was to sort the negatives HMM chose to print or slides he chose to contribute to my photo files, I ended up with a good selection of a thousand or so images of Grandfather Mountain that I continue to reach for almost daily.
    Unfortunately, I only have two or three photos of azaleas or lighthouses or UNC basketball, because my mission was to build a photo file for Grandfather Mountain. It breaks my heart when I cannot supply requests from writers and historians looking to illustrate the many facets of the culture and history of North Carolina that my father had documented so faithfully.
    So I say go for it, Stephen. None of it is waste. Yes, you will end up with multiples of the same subject and only one will be perfectly composed or perfectly exposed…but every once in a while you’ll say, “let me look more closely at this face in the background.” Then you’ll discover Bill Clinton in the crowd at a Carolina basketball game or a youthful Richard Nixon touring the Blue Ridge Parkway.
    Scan them all, Stephen.

  5. Stephen: Having worked on many documentary projects during my 42 years at WFMY-TV in Greensboro, I can tell you that it is extremely important to have available various angles and different perspectives on the same shot. I couldn’t begin to count the number of times I have used the same photo more than once in a film, but having the ability to use just a slightly different angle makes all the difference.
    I agree completely with Catherine Morton. Scan them all.

  6. Dear All,
    First of all, I really appreciate the great work you are doing on the mass digitization of the Morton collection.
    I agree with Ginny that the concept of mass digitization is interesting, as is the issue of selecting materials for digitization.
    I have been working in the field of mass digitization for the last 3 years, and I have developed a complete workstation for it. While developing the workstation, my main focus was on finding the technology that gives the best output quality at very high speed without damaging any of the originals.
    After extensive research, I brought this revolutionary technology to India for the first time; with it, one can digitize 700 pages of a book an hour without breaking the book’s spine.
    I have worked as a consultant for several libraries to complete digitization projects ranging from newspapers, books, and bound documents to rare manuscripts and negatives.
    Among other work, I have digitized the original thesis of the great Indian scientist, the late Dr. Vikram Sarabhai; worked with the Indian Institute of Management; and digitized many rare manuscripts of the Jain religion, scripts written on leaves, and more.
    I am currently writing my dissertation on this same topic, the selection of materials for mass digitization, and found your article interesting.
    It’s a good debate: whether to pre-select the material and then digitize it, or to digitize everything and then sort out what is useful and what is waste.
    After ample research, I can definitely argue that there are certain projects where pre-selection is necessary in order to reduce wasted time, effort, and money. As Ginny mentioned, they are digitizing some material that is largely waste because it is not important, and yet they are digitizing it anyway.
    Some libraries also waste money, time, and effort digitizing materials that have already been digitized. I think replicating the same thing is not necessary. They should focus on materials that have never been digitized.
    On the other hand, I also argue that there are several projects where pre-selection is not necessary. For example, as Catherine stated, not all the negatives are that good and you will find some repetitions, but they won’t be waste. So you definitely need to digitize everything and then sort it out.
    As for copyright, I again agree with what Ginny said. Copyright is a crucial issue when selecting materials. Every country has different copyright rules and acts, so one has to be very specific in choosing. There are also several loopholes and several funny rules in the copyright acts, as I found when I went through the copyright acts of the USA and India as they apply to digitization.
    Some materials can be digitized for personal or in-house use, but you cannot make them open to the public or sell them without the consent of the original publisher.
    As for the copyright in Morton’s work, you don’t have any issue there.
    I would definitely love to hear how the work is going and how you have overcome the problems so far.
    If you are still running into problems, please do discuss them, and I will be more than happy to help.
    Thanking you,
    With warm regards,
    Rutul Kamdar
    +44 7868 111 683
    p.s. You can also contact me on my cell number if you wish to.
