Jessica Venlet, Assistant University Archivist for Digital Records and Records Management

First Women’s Athletic Scholarship: Camey Timberlake

Over the past several decades, UNC women’s athletics teams have had an incredible run of success, winning multiple NCAA championships in five different sports. However, it was just 50 years ago this month that UNC offered its first athletic scholarship to a woman athlete.

During the 1974-1975 season, women’s tennis coach and newly appointed Director of Intercollegiate Athletics for Women, Frances Hogan, drafted a note for her team about the evolving standards for women’s sports at UNC. In the memo found in a folder for the 1974-1975 Tennis season, she wrote:

Up until 1971, the philosophy of teams, or clubs as they were called, was to provide opportunities for a sociable, competitive experience… Women coaches have been oriented that way. We can no longer, as coaches, maintain this philosophy. We are now in athletics. A great amount will be budgeted for our women’s program from the athletic department next year. Coaches will have to look for the best players. . . . Recruiting is legal. Demands on the coaches will be greater. They must produce good teams to justify the money going in to the program.

This change from clubs to teams largely began with the passing of Title IX in 1972. As with most large-scale social and political changes, the actual implementation came in stages and looked different in different contexts. There were many local factors that shaped how things would change at UNC. One impactful moment came in 1974, when the first UNC women’s athletics scholarship was awarded. At the time, the scholarship was called a grant-in-aid. The awardee, Camey Timberlake, was a talented high school tennis player who would join the UNC squad in fall 1974.

Screenshot of a Daily Tar Heel photograph of Camey Timberlake playing tennis. — Photograph from November 8, 1975 issue of the Daily Tar Heel.

As Hogan’s note to the 1974-1975 team highlights so well, the pressure on UNC women’s athletic teams to show up for this moment was real. And the pressure on Camey Timberlake was not wholly different. It was time to perform to win and keep the momentum for women’s athletics.

Timberlake’s tenure at UNC included highs like beating Duke’s star player as a first-year and lows like injuries that sidelined her more than once over the years. Disappointments are part of sports, and those who play at the highest levels know that better than most. In an interview with the Daily Tar Heel in April 1978, she gave voice to the pressures she felt during her seasons with Carolina, noting, “I had decided I wasn’t going to let the pressure bother me, but subconsciously, it was there, and there wasn’t much I could do about it.” While Timberlake’s playing experience at UNC may not have been as she had hoped, she emphasized the positive impact her teammates had on her experience. Timberlake was inducted into the North Carolina Tennis Hall of Fame in 2000.

Screenshot of the Daily Tar Heel showing an article about the UNC tennis team beating Duke. The image shows Timberlake playing tennis. — Image of the November 1, 1974 issue of the Daily Tar Heel.

Image of the 1977-78 team brochure for tennis — 1977-78 season team brochure. This was Timberlake’s senior season.

On the other end of this pivotal moment, UNC continued to reshape the role of women’s athletics at the university in line with Title IX. Ahead of the final Title IX regulations for athletics, which were to be released in 1975 and required implementation by 1978, Chancellor Ferebee Taylor took proactive steps in 1974 to set the stage for change. The women’s athletic programs were moved into the Department of Athletics and Coach Hogan’s new role, Director of Intercollegiate Athletics for Women, was created. Chancellor Taylor also charged a new committee focused on equal opportunity for women at UNC called the Title IX Committee and Subcommittee on Athletics (Jackson, pp. 92, 134-137). Building on the first women’s athletics scholarship, the second scholarship recipient, Ann Marshall, became a member of the UNC swim team beginning in 1975. Marshall made the 1972 Summer Olympics team as a teenager and was named an All-American 18 times while at Carolina.

The path to women’s athletics at UNC was certainly not without challenges and some resistance. And to this day, all issues of equity and opportunity in women’s sports generally are far from settled. However, it’s still inspiring to consider these beginnings at UNC and see the line that can be drawn to all that has come after. We can celebrate the place that these early scholarship awardees made for so many other talented Tar Heels like Mia Hamm, Charlotte Smith, Shalane Flanagan, Ivory Latta, Erin Matson and countless others as well as opportunities for coaches like Karen Shelton to build programs of the highest quality to cheer for.

Images of documents from the Department of Athletics records:

First page of the 1974-75 Tennis season annual report.

First page of the team memo sent by Coach Hogan in 1974-75 season.

1974-1975 Women’s Team Conditioning Program

1975-76 Women’s Tennis Challenge Rules for Roster Spots

1975 press release announcing Ann Marshall’s scholarship.

References

For a more complete picture of the years long development of women’s athletics at UNC and Title IX context, check out this dissertation by Victoria Jackson. https://keep.lib.asu.edu/items/153493
You can peruse old issues of the Daily Tar Heel here for lots of sports stories and highlights: https://www.digitalnc.org/newspapers/daily-tar-heel-chapel-hill-n-c/
Swimming: 1975-1976 season: General, Box 27, in the Department of Athletics of the University of North Carolina at Chapel Hill Records #40093, University Archives, Wilson Library, University of North Carolina at Chapel Hill.
Tennis: 1974-1975 season: General, Box 28, in the Department of Athletics of the University of North Carolina at Chapel Hill Records #40093, University Archives, Wilson Library, University of North Carolina at Chapel Hill.
- Read the entire memo from Coach Hogan here: HoganTeamMemo-1974-1975
Tennis: 1974-1975 season: Camey Timberlake, Box 28, in the Department of Athletics of the University of North Carolina at Chapel Hill Records #40093, University Archives, Wilson Library, University of North Carolina at Chapel Hill.

Digital Records Management 101: Remote Edition

We had planned to host a Digital Records Management 101 training session this month, but we had to cancel the training due to COVID-19. However, we still wanted to provide the university community with some tips for managing digital records. If you are working from home, this might be a good time to work independently to organize your work records or remotely collaborate with colleagues in your department to tackle organization of a shared drive.

This post provides suggestions for reviewing and organizing digital records based on the requirements found in the UNC at Chapel Hill General Records Retention Schedule (Retention Schedule). University Archives staff are working from home and we are available to answer records management questions. You can reach us at archives@unc.edu.

What is Records Management and what is the Records Retention Schedule?

Essentially, records management provides a systematic way to manage records. The Retention Schedule outlines the rules for how different types of records should be managed at our institution. For example, the Schedule (available as a PDF here) provides retention rules for a variety of Personnel Records. So, if you are wondering how long to keep SHRA personnel records, you can find that information on page 128 of the Schedule.
Many of the most common questions about records management are answered on this guide.

How can I use the Retention Schedule to determine what digital records can be deleted and what we need to keep?

The following prompts can help you determine how to manage a record.

What type of record is it?
- Based on the information communicated in the record, what type of record is it? Personnel? A policy? Curriculum? Student information? Financial?
Who created the record? Who is responsible for it?
- An important concept in records management is the Office of Record and Reference Copy. You may have copies of digital records that weren’t created by your department and so aren’t your responsibility. This can get tricky when it comes to cross-departmental collaboration. If you are uncertain, feel free to send us an email.
What is the retention and disposition?
- Check the Retention Schedule for the retention and disposition rules based on your assessments in questions one and two above.
Does it go to the Archives?
- Some records are scheduled to be transferred to the University Archives. If you have records that need to be archived, please contact us.

I would like a way to better organize and manage digital records. What advice is there for individuals or departments on managing shared storage like shared drives or SharePoint sites?

One of the best things you can do to keep records organized is to discard files as soon as the retention period allows.
Creating a plan for how to organize active records and instructions for when to review records for retention can go a long way!
- The plan should account for the variety of record types and storage locations that you use. Once you’ve outlined a plan, implement it and use it consistently.
If you are developing a plan for a team or department, ensure that members of the team are involved in planning and communicate the plan clearly to everyone who will manage records.

What are some records organization plan components that I should consider?

One of the most important parts of a records management plan is to determine who (be specific!) will review records and how often that review should occur. This role might fall to one person or to a small team. In our experience, offices that designate a records management point person or small team have the most success at keeping things organized. We suggest that records are reviewed for retention yearly, but you can review more frequently.
List all the digital storage options available and create guidance on how to use that storage and what records should be stored there.
Create short, descriptive notes in digital folders.
- Use a text file (.txt) in a program like Notepad (PC) or TextEdit (Mac) to describe a folder of digital files. Remind your future self or future staff what the contents of the folder are. Title the file README.txt.
Use file and folder names wisely & go for consistency
- Create a standard date format for file and folder names to make finding things easier: YYYY-MM-DD_AnnualReport.pdf
- Think of other people – what would help them understand what this file is or what this folder contains?
Use folders strategically, but don’t go overboard with too many nested folders. That can end up making it harder to navigate to files later.
Centralize storage (digital or analog) for final copies of records. Avoid relying on individual staff computers/OneDrives for storing important departmental records.
Create a process to ensure any staff who leave employment in your department add important documents to shared storage, so that records are not left behind in personal OneDrive accounts or other cloud storage accounts (e.g. Google Drive).
Ensure digital files are secure and backed-up as needed. Discuss this with ITS as needed.
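
If someone in your office is comfortable with a little scripting, the naming and README suggestions above can be sketched in Python. This is only an illustration under stated assumptions: the function names `dated_name` and `write_readme`, the sample label, and the sample README wording are hypothetical, not part of any University tool or requirement.

```python
from datetime import date
from pathlib import Path


def dated_name(day: date, label: str, extension: str) -> str:
    """Build a file name in the YYYY-MM-DD_Label.ext pattern."""
    # date.isoformat() always yields YYYY-MM-DD, so names sort chronologically.
    return f"{day.isoformat()}_{label}.{extension.lstrip('.')}"


def write_readme(folder: Path, description: str) -> Path:
    """Write a short README.txt describing what the folder contains."""
    readme = folder / "README.txt"
    readme.write_text(description.strip() + "\n", encoding="utf-8")
    return readme


# Hypothetical example: name an annual report finalized on April 1, 2020.
example_name = dated_name(date(2020, 4, 1), "AnnualReport", "pdf")
print(example_name)
```

Calling `write_readme(Path("some/shared/folder"), "Final annual reports, one PDF per year.")` would then leave a note for future staff in that folder. The folder path here is a placeholder.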

Do I really need to look at every file? There is so much content and much of it was added to our shared drive by other people or before I worked in the department.

Records management assessment relies on understanding the information contained in a file, so in many cases it is necessary to look at files individually. But there are some higher level strategies that might help to make the task easier.

Try to use folder title and filename cues. If you trust a folder name like “Annual Reports 2012-2016” then you probably don’t need to open every single file in that folder to determine the contents.
Instead of trying to organize everything in one project, you might start by tackling one year’s worth of records at a time. Maybe start with the newest or oldest year. Similarly, you could focus on one specific record type at a time. For example, maybe the first project is to find and organize all annual reports and strategic planning documents.
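
To plan a year-by-year review, it can help to see how many files fall into each year before opening anything. The following is a minimal sketch, assuming that a file's last-modified date is a reasonable stand-in for the year of the record (it often is not, for example after files are copied between drives). The function names are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict
from datetime import datetime
from pathlib import Path


def group_files_by_year(root: Path) -> dict[int, list[Path]]:
    """Group every file under root by the year it was last modified."""
    groups: dict[int, list[Path]] = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if path.is_file():
            # Last-modified time is only a proxy for when the record was created.
            year = datetime.fromtimestamp(path.stat().st_mtime).year
            groups[year].append(path)
    return dict(groups)


def summarize(groups: dict[int, list[Path]]) -> list[str]:
    """Return one line per year, oldest first, with a file count."""
    return [f"{year}: {len(paths)} file(s)" for year, paths in sorted(groups.items())]
```

Running `summarize(group_files_by_year(Path("path/to/shared/drive")))` on a placeholder path like this gives a quick list of years to divide among reviewers. The script only reads file information and does not move or delete anything.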

How can I manage my email more effectively?

As of April 2019, we implemented a new policy on email retention (see Appendix A of the Retention Schedule document). Under this policy, email records created and received by employees in selected administrative positions will automatically be retained as permanent records in the University Archives. All other email accounts will be retained for a period of five years after the employee leaves the University and then discarded. All employees still have a responsibility to evaluate emails, like other record formats, based on the Retention Schedule.

To manage email more effectively, we suggest:

Delete “transitory” or reference copy emails as soon as possible. This refers to things like messages about meeting room changes, calendar invitations, messages about breakroom food or staff parties, or messages sent to the entire campus.
Use folders to organize emails that are related to your department and your substantial work projects.
We suggest that records are reviewed for retention yearly, but you can review more frequently.

How do I access work records from home?

If you are working with records in OneDrive, Outlook, SharePoint, or a work computer you brought home with you, you can log in to those sites or devices as you normally would when on campus. If you want to work with files that are on a shared drive or access a work computer that is still on campus, you can likely do this from your home with a few extra steps. Follow the guidance below to set up access at home:

If you were able to bring home a work computer:
- Install and log in to the VPN following this guidance from ITS.
- Once logged in with the VPN, you can access shared drives as you normally would.
  You do not need to be logged in with the VPN to access Outlook, SharePoint, or OneDrive.
If you were unable to bring a work computer home:
- You may be able to access your work computer desktop (and all your files) from home using VPN and the Remote Desktop application.
- See this guide for more information on connecting to Remote Desktop.
  - Note: Step 1 of this guide won’t be possible remotely. If you don’t know your office computer IP address, contact the ITS Help Desk or your departmental IT to get IP address information for your work computer.
If you run into any issues with VPN or remote desktop, contact ITS Help Desk or your departmental IT staff for further assistance.

How should I collaborate remotely on reviewing digital records from my department?

This will depend on your team and the type of records you have, but you could consider:

A series of virtual meetings to discuss the current state of your department’s digital storage options and goals for reviewing and organizing that storage.
A discussion of records management and the Records Management guide as it applies to records in your office.
Create a plan for assessing older digital records. Discuss the plan over virtual meetings and divvy up tasks based on who can access files from home.
Create a plan for organizing new digital records going forward and determine the best way to get buy-in from your department.

New Website Collection and Digital Access Guide

We have written before about collecting tweets related to the recent protests of the Confederate monument on the UNC-Chapel Hill campus. We would like to announce the availability of the UNC-Chapel Hill Confederate Monument Protest web archive collection as an additional resource on the recent protests.

Screenshot of a wordcloud. Some of the most prominent words are students, confederate, statue, unc, monument, silent, campus, protest — Sample wordcloud generated from collected tweets that included #silentsam or #silencesam (from Fall semester 2017).

The web archive collection contains a variety of content related to the protests. It contains many statements about the monument in the form of editorials, webpages, tweets, and Google documents. The collection also includes news articles from The News & Observer, The New York Times, The Washington Post, The Chronicle of Higher Education, and more. The collection also includes other online materials such as activist websites, editorial cartoons, and Facebook event pages. You can learn more about the web archive in the finding aid on our website.

Additionally, the UNC-Chapel Hill Confederate Monument Protest tweet collection has expanded to include tweets from 2018 and 2019. Visit the finding aid for additional details.

Access to the tweets and web archives can involve a slight learning curve due to technical methods used for collecting the material. So, with this in mind, we are also happy to announce the release of a guide to accessing digital materials. The guide includes information on where to find archived websites, tools for using Twitter data sets, and tips on accessing the myriad file formats in our collections.

If you have any questions about these collections or are interested in donating material related to protests of the monument, please feel free to contact us by email: archives@unc.edu.

Behind the Scenes: 3D Printing for Preservation

Many collections we receive at Wilson Special Collections Library include a wide range of legacy digital storage media. Floppy disks, of the 3.5″ and 5.25″ variety, are very common and our standard work computers obviously don’t come with floppy disk drives anymore! In order to process these materials for preservation and access we need to have legacy hardware to access the disks. We have developed workflows and a lab space to handle these collections (read more here). The lab includes hardware that is sometimes available online (often via eBay) or from specialty retailers. Often the drives or devices aren’t housed in a computer tower or protective case by default. To improve the handling and care of these items, we’ve 3D printed several cases.

Two of the cases we’ve printed used designs available on Thingiverse.

The first was a case for a 5.25″ floppy drive. You can find the design on Thingiverse. UCLA Special Collections’ digital archivist, Shira Peltzman, shared their design, which was created by graduate student Yvonne Eadon – thank you!

3d printed blue case housing a 5.25" floppy drive — The Carolina blue case for the 5.25″ floppy drive.

The second was a case for a KryoFlux board. You can learn more about KryoFlux on their website. The design is available on Thingiverse.

Empty 3d printed KryoFlux case with lid. — The KryoFlux board case with the lid – also in Carolina blue.

KryoFlux board in the 3d printed case — The case with the KryoFlux board inside.

The third item we needed a case for was the controller board for the 5.25″ floppy disk system from Device Side Data. We couldn’t find an existing case, so graduate student assistant Miana Breed took on a project to create and print a design.

Below Miana describes her process:

When given the task of creating and printing a case for the floppy controller circuit board, I was a little daunted. I had never used any CAD programs before or worked with item design. My initial attempt to get a case printed consisted of me going to the Kenan Science Library and hoping that someone would create the case for me! But, alas, I was given instructions for TinkerCAD and sent off to figure it out on my own. (A hands-on learning technique that ended up being very helpful.)

The experience was both interesting and frustrating. I enjoyed testing out the shapes, cutouts, and extensions in the online platform, which is more intuitive than you might think. The program allows two kinds of objects: shapes and holes. Your design is created using these shapes and holes, which are placed around the grid plane wherever you’d like. Objects can be levitated above the grid plane, lowered into or below the plane, and turned at any sort of angle. It takes a bit of playing around to get the hang of creating objects, but I found it to be very user-friendly. There are even some tutorials available through TinkerCAD that show you how to create certain types of objects.

Screenshot of the design in TinkerCAD — Screenshot of the TinkerCAD online design environment.

Luckily, because UNC is a research university, the Kenan Science Library was more than willing to print multiple iterations of my design. It took three tries before I finally landed on the right design for the case that offered the most protection with the best fit.

Our highest priorities when making the case were keeping the board secure in the case and protecting the delicate pins on the upper side of the board. The features that keep the board in place inside its tray are the two screw holes on either end of the tray. These line up with the mounting screws on the controller board that are intended to mount the board inside a computer tower.

Two of the cases I made were not quite tall enough to avoid brushing the pins, and one of the cases didn’t slide over the USB port. The most difficult pieces to get right were the cutouts on the front of the case. Measuring in TinkerCAD is relatively easy, but sometimes your design gets shifted by millimeters without letting you know. Each time I printed a new case, I thought I had gotten the cutouts measured correctly, with the right depth and distance apart, but I finally decided to go with a design that had one long cutout rather than two individual ones for the USB cable and ribbon cable. One lesson learned from this process is that, sometimes, the simplest design is the best design.

In the photo below, the two cases printed in clear plastic were my first two attempts. If you look closely, you can see two small black marks where I measured how much wider the cutouts needed to be.

Shows four 3-d printed cases. Three attempts and the final case. — Iterations on the design. Sometimes it’s hard to tell how successful the design is until it’s printed.

My final design for the board included tracks for the inner tray to slide on that aim to keep the tray in place inside the outer case. These tracks were another stumbling block in the third print (grey case in the picture above). Because you can move shapes around so easily in TinkerCAD, designs can sometimes become tilted or swiveled in a way that doesn’t fit all your pieces. When I received my printed case, the tracks were at a slight diagonal and the inner tray slid into the case at an angle that popped one of the seams.

Finally, after several consultations with makerspace staff at the Science Library, I landed on three fixes for my final design: 1) A thicker top and sides for the case that wouldn’t break at the seams. 2) A higher top that would avoid all pins and the USB port. 3) Tracks that were perfectly parallel with the sides of the case. The end result keeps a majority of dust and other particulates off the circuitry and provides some protection for the board. The case also allows the appropriate wires to be attached without removing the tray from the outer case, which will help prevent damage to the pins.

5.25" Floppy Controller Case with the controller board inside — The final print!

5.25" Floppy Controller Case with controller board. Close-up showing that the lid slides off. — Close-up showing the slide feature of the base.

While I was initially a bit reluctant to take on this 3D printing project, I see this sort of design as a valuable skill to have when working with digital archives. Some of the devices we use to read legacy media are difficult to find in their original housings, and some, like the KryoFlux, the Device Side Data controller board, and the 5.25″ floppy drive, come without protection for their inner workings. These pieces of hardware need cases in order to protect and maintain functionality of the devices and our ability to access legacy storage media.

You can find Miana’s design on Thingiverse.

Records Retention Schedule Updated

The UNC-Chapel Hill Records Retention and Disposition Schedule underwent a routine revision process in 2018 and the newly updated Schedule is now available (effective April 8, 2019). The new Schedule document is available on our Records Management Guide. If you have any questions, please contact us. Below we’ve outlined some of the major changes you’ll see in this newest edition of the Schedule.

Appendix policy on managing email

This policy reflects a new approach to selecting and retaining email at the University. It was developed in consultation with the State Archives, University Counsel’s office, and UNC Information and Technology Services.

This approach is based on the Capstone Approach developed by the National Archives and Records Administration. It enables us to collect email of permanent historical value based on an employee’s position and function rather than the content of individual email messages. Under this approach, email records created and received by employees in selected administrative positions will automatically be retained as permanent records in the University Archives. All other email accounts will be retained for a period of five years after the employee leaves the University and then discarded. All employees still have a responsibility to evaluate emails, like other record formats, based on the Records Retention Schedule and individuals not in “Capstone positions” can still work with us to transfer permanent records if needed.
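
The retention rule described above can be expressed as a small worked example. This is a sketch for illustration only, not how the University's systems actually apply the policy: the function names are hypothetical, and the handling of a February 29 departure date is an assumption made so the arithmetic always produces a valid date.

```python
from __future__ import annotations

from datetime import date
from typing import Optional

RETENTION_YEARS = 5  # from the policy: five years after the employee leaves


def add_years(day: date, years: int) -> date:
    """Return the same calendar day a number of years later."""
    try:
        return day.replace(year=day.year + years)
    except ValueError:
        # Assumption: a February 29 departure falls back to February 28.
        return day.replace(year=day.year + years, day=28)


def email_discard_date(is_capstone_position: bool, departure: Optional[date]) -> Optional[date]:
    """Return the earliest discard date for an account, or None if there is none.

    Accounts in Capstone positions are retained permanently, and accounts of
    current employees (no departure date yet) have no discard date.
    """
    if is_capstone_position or departure is None:
        return None
    return add_years(departure, RETENTION_YEARS)
```

For example, under this sketch an employee in a non-Capstone position who leaves on June 30, 2020 would have an account discard date of June 30, 2025, while an account in a Capstone position returns `None` because it is kept permanently.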

Document structure changes

Due to some changes to terminology and series headings the Schedule was re-alphabetized and reordered. You may find that a series you were used to using has changed location in the document. This does not necessarily mean the content of the series has changed.

New navigation has been introduced to the PDF document. The table of contents entries are now links that lead directly to the desired section. Every page includes a “back to top” link at the bottom of the page that leads back to the table of contents. We hope this helps to make the document easier to use.

Significant content changes

1.24: Insurance Records

Changed retention from permanent to destroy in office after 6 years. Changes will bring this schedule in line with the statewide college and university schedule and the State Archives.

11.13: Disciplinary Records

Longer retention period as proposed by University Counsel’s office.

11.34: Immigration Filings

Revision as proposed by UNC Office of International Student and Scholar Services.

11.46: Search Records

Removing requirement to retain records of administrative searches permanently after consultation with State Archives.

Required retention period for applications from unsuccessful student candidates (11.46b) changed from 1 year to 2 years to match statewide requirements.

12: Public Safety Records

Several changes made in this section in order to ensure compliance with Clery Act record-keeping requirements.

13: Sponsored Projects and Research Records

There are many changes in this section, all suggested by the UNC-CH Vice Chancellor for Research and University Counsel. Specific changes include:

- 13.2: Animal Research Records: Retention period reduced from 7 years to 3 years to match NIH and other federal guidelines.
- 13.11 and 13.12: Research Misconduct Reviews and Scientific Review Committee Records: New sections.

14.17: International Student Records

Changes as proposed by UNC Office of International Student and Scholar Services.

18.1: Disciplinary Records

Changes as proposed by UNC Equal Opportunity/Compliance office.

Behind the Scenes: Workflows Development Bit by Bit

Born-digital accessioning, processing, and ingest work has been handled in a variety of ways at Wilson Special Collections Library since about 2010. This post is about our most recent development in the evergreen quest to optimize and improve archival workflows. Over the past two years, improving workflows for born-digital materials at Wilson Library has often meant centralizing and standardizing.

Image shows a 3.5" floppy disk, yellow zip disk, 5.25" floppy disk, CD, USB thumb drive, and external hard drive. — These are some of the most common types of born-digital storage media that we process.

If you are an archivist, you might wonder, why centralize? Over the past couple of years there have been calls for moving away from the lone digital archivist model in our institutions. This can be a beneficial staffing move, but I also think it depends a lot on institutional context. At Wilson Library, we are not necessarily trying to centralize the work to one person, but are striving to use a consistent workflow across units and make a portion of the workflow (the really technical bits) centralized with a smaller number of people. The idea is that it will be easier to implement the workflows with a smaller number of staff who have capacity to become experts in the technical workflow. Other bits of the workflow like acquisition or description still happen elsewhere in our building-wide workflows.

So, what have we done so far to work toward this goal?

One thing was the creation of more detailed workflow documentation and training resources that could be easily available to all staff. This included filling in some workflow gaps between acquisition and ingest, creating more documentation of the software and hardware available that addressed why and when to use various tools, creating a metadata template for archival folders in the repository, training resources, and more. The documentation was then compiled into a website for easier navigation and use. The review and creation of documentation also presented an opportunity to think more about our goals in technical processing of born-digital materials. In an effort to reduce focus on specific tools, I drafted some digital preservation statements that underpin our workflow goals and development. Hopefully these statements can guide us no matter what tools we use in the future.

Image shows a cubicle space with two computer workstations, whiteboard, and a small round table — The Digital Preservation Lab is currently located in this fun cubicle space.

Another important development was making the hardware and software acquired over the years by the University Archives more available to all Wilson collecting units. This process evolved into the development of our Digital Preservation Lab and centralized service. Instead of each Wilson Library department developing their own born-digital workflows, staff can now bring born-digital accessions to the Lab where one of three dedicated staff (myself and two graduate students) will prepare the materials for appraisal and ingest to preservation storage. This has greatly reduced the number of people who need to learn the entirety of the pre-ingest and ingest workflows. It is also helping to highlight non-technical aspects of the born-digital workflow that need further assessment and development.

We still have more to do to integrate born-digital workflows into other accessioning and processing workflows—and of course there is always the on-going process of planning and managing the big picture of digital preservation over time—but we are well on our way!

Behind the Scenes: Describing Archived Websites

On May 22, I participated in an Archive-It training webinar on describing archived websites. The following is a summary of my short presentation on the Wilson Special Collections Library’s approach to describing archived websites in finding aids.

Special Collections has been archiving websites with Archive-It since 2013. Our Archive-It account is split into collections that reflect our five main collecting units as well as one collection for the UNC at Chapel Hill Art Library. Some of our collecting units use catalog records to describe archived websites, but my presentation is focused on the finding aid side of the house and uses examples from the University Archives collection.

What makes describing websites unique?

In many ways, our approach to archived website description lines up with existing archival finding aid practices. However, there are some ways that archived websites differ from other materials. For example, dates can be tricky. Do we describe the date we archived the website or try to assign some kind of creation date? Our technical services team opted for describing the date we started archiving a website rather than trying to assign the website a date of use or creation. Other challenges are the recurring nature of “crawling” websites, frequently changing content, URL changes and redirects, the differing frequencies used to archive different websites in our collections, and the technical limitations and incompleteness of some archived websites.

Case Studies

We have some consistency in our approach, but we don’t have written documentation yet. The following examples are representative of our approach as well as a couple newer things we have tried more recently.

Archive-It Collection level description

The first example is a finding aid for the University Archives’ Archive-It collection. The finding aid was created in 2013 and serves as a blanket entry point and general description of all URLs in the collection. I think this is a helpful finding aid to have, but the University Archives collection has grown a lot since 2013. One improvement might be adding series to this finding aid that describe groups of related URLs in the collection. The additional description would help the finding aid show up in more searches. It would also provide users with more access points rather than just being transported directly to the entire (very long) list of URLs in our collection.

Screenshot of a portion of a finding aid describing the University Archives archived website collection. — http://finding-aids.lib.unc.edu/40417/

URL level description

The second example is adding description of individual URLs to finding aids. This style of description is pretty standard across manuscript collecting units and was implemented broadly by our technical services team in 2013-14. Typically, these URLs were selected for archiving because we already had a collection for the person or organization. When adding individual archived websites to finding aids, we link to the Archive-It “calendar page” that shows each of the dates we archived the URL. The description also provides the URL, the first crawl date by month and year, and a brief description of the live website.
This approach works well. One way I’d like to iterate on this approach is to figure out how best to represent the incomplete nature of archived websites in the finding aid. The description of the site describes the live website features and content, but the archived version may be different based on how often we archive it or it may have elements missing due to technical limitations of web crawlers.
Example:

Screenshot of a finding aid section describing the Carolina Black Caucus archived website. — http://finding-aids.lib.unc.edu/40363/

Group of related URLs description

A third way we’ve represented archived websites is by creator groups, and this is a slightly newer approach for us. Instead of listing individual websites on this finding aid, we added one link to the group of URLs created by the student organization. We could have described each URL at the item level, and that might allow for better description given that each is quite different (e.g. a Facebook event page vs. an email newsletter vs. a website). But linking to a group of URLs does fit more closely with traditional archival description practices that focus on aggregates rather than items. We’ll have to continue to think about how to handle the donation or selection of several websites by one creator in our descriptions.
Example:

Screenshot of part of finding aid showing a group of URLs archived for the Asian Student Association collection. — http://finding-aids.lib.unc.edu/40486/

Intersection of legacy media and websites

The last example is really different from our other archived websites. Last year I worked on a project with a colleague to deal with website directories given to UA on optical media (I wrote about it on the blog here). These sites are no longer live on the web. We essentially re-hosted the website, gave it an artificial URL, and crawled it with Archive-It.
One of the questions we had was how to best describe these websites. In order to re-host and archive the sites with Archive-It we had to use an artificial URL and the crawl date is very different from the creation/use of the site. Additionally, the directory of files from the DVD had already been ingested to the repository a couple years ago. We needed to make some connections between these factors.
We decided to keep a link to the repository, note the DVD identification number, link to Archive-It, and explain a bit about the process to re-host the site.

Screenshot of finding aid section describing archived website given to us on DVD — http://finding-aids.lib.unc.edu/40296/#contentslist

Next Steps

Our staff last talked about this work in 2013-14 when we first started using Archive-It, so our best next step is to revisit this topic as a group and figure out how we can iterate on our current approaches to meet the unique description challenges posed by archived websites. I had the pleasure of participating in the OCLC Web Archives Description working group in 2016-17 and the guidelines produced by the group will be a helpful resource in this discussion. Documentation of our practices for describing websites will be an important addition to our existing documentation for description of born-digital materials in archival finding aids. I’d also like to use more metadata in the Archive-It access interface. The OCLC WAM guidelines can help with that as well.

You can use and explore our archived website collections online through our Archive-It access portal.

Collecting a Snapshot of #SilenceSam

The Confederate Monument on the UNC-Chapel Hill campus has been the subject of controversy and protest for decades. A detailed timeline and corresponding archival materials related to the monument between 1908 and 2015 can be explored online via our Guide to Resources about UNC’s Confederate Monument. While some aspects of the current protests mirror past efforts, social media has facilitated new approaches for sharing information and sparking action on campus. In an effort to document the current protests, we knew it would be important to explore methods for collecting a sampling of tweets related to the Silence Sam protests.

We decided to use a tool called twarc to harvest tweet data for specific hashtag searches. Twarc is a Python package that makes use of the Twitter API to collect tweets. Between August 22 and December 15, 2017, we performed a weekly search and harvest of #silencesam and #silentsam. In addition, we infrequently captured select complementary hashtags: #boycottunc #boycottunctownhall #iaarchat and the @Move_Silent_Sam user account. 15,063 tweets were collected across all searches. The hashtags #silentsam and #silencesam make up the majority with 12,993 tweets collected.

The tweets are in a raw form, so to speak. Twarc returns the tweets and associated metadata in a JSON document. So, in this collection you won’t automatically find a timeline that looks like the Twitter website. Instead, what we have is a structured text document with many lines and each line represents a tweet and associated metadata about that tweet. The data can be manipulated in a variety of ways for analysis or viewing. A wide variety of visualization tools could be useful for working with the data.
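To give a sense of what working with this line-oriented format looks like, here is a minimal Python sketch that reads one JSON tweet per line and tallies the hashtags used. It is only an illustration: the field names (`entities`, `hashtags`, `text`) assume the Twitter API v1.1 tweet layout, and the two sample lines are made up rather than taken from the collection.

```python
import json
from collections import Counter


def count_hashtags(lines):
    """Tally hashtags (case-insensitive) across tweets, one JSON object per line."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        tweet = json.loads(line)
        # Assumed layout: tweet["entities"]["hashtags"] is a list of {"text": ...}
        for tag in tweet.get("entities", {}).get("hashtags", []):
            counts[tag["text"].lower()] += 1
    return counts


# Made-up sample lines standing in for a hydrated dataset.
sample = [
    '{"id_str": "1", "entities": {"hashtags": [{"text": "SilenceSam"}]}}',
    '{"id_str": "2", "entities": {"hashtags": [{"text": "silencesam"}, {"text": "SilentSam"}]}}',
]
print(count_hashtags(sample).most_common())
```

In practice you would pass an open file handle for your hydrated JSON document instead of the `sample` list.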

To get started working with this collection, though, you’ll first need a Twitter account and Hydrator or twarc installed.

The first step is to “hydrate” the dataset. There are some specific access stipulations for this collection due to the Twitter API terms of service. We cannot make the full data we collected available for use. In particular, we are unable to make deleted tweets available for use. Instead, we provide a list of the tweet identifiers (tweet ids) for all the tweets we’ve collected in our repository. This list of identifiers can be hydrated by querying the Twitter API for the tweets that are still publicly available. There are two options for hydrating the tweet ids.

Download the Hydrator tool

You’ll need to authorize the app to connect to your Twitter account.
Upload the tweet identifier document to Hydrator and start the process.
Download the hydrated tweets from the tool.

Hydrate using the command line with twarc

This method will require you to have Twitter API credentials. It’s not as intense as it sounds. Social Feed Manager, a project at George Washington University Libraries, provides a helpful guide in their documentation under the Adding Twitter Credentials section. Don’t worry about the parts that are specific to using Social Feed Manager. Your API keys will be entered via command line when setting up twarc. Instructions for setting up twarc are available on GitHub.

Once you have hydrated the dataset using one of the options above, you’ll have the full text of tweets and metadata in a JSON document or a CSV spreadsheet (from Hydrator).
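Because deleted or protected tweets will not come back from the API, it can be useful to check how much of the identifier list actually hydrated. The following is a rough Python sketch of that comparison for the JSON output. It assumes each hydrated tweet stores its identifier in an `id_str` field (the v1.1 layout), and the ids shown are placeholders.

```python
import json


def missing_ids(id_lines, tweet_lines):
    """Return tweet ids from the identifier list that are absent from the hydrated data."""
    wanted = {line.strip() for line in id_lines if line.strip()}
    found = set()
    for line in tweet_lines:
        if line.strip():
            # Assumption: each hydrated tweet carries its id as the string field "id_str".
            found.add(json.loads(line)["id_str"])
    return sorted(wanted - found)


# Placeholder data: three requested ids, two of which were still available.
ids = ["101", "102", "103"]
hydrated = [
    '{"id_str": "101", "full_text": "example"}',
    '{"id_str": "103", "full_text": "example"}',
]
print(missing_ids(ids, hydrated))
```

Both arguments can be open file handles, since the function only iterates over lines.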

The next step is to begin working with the data. You could use a variety of tools to visualize the data. Twarc comes with a few useful “utilities” that can also be used. A few are highlighted below:

wordcloud.py

emoji.py

The emoji.py program provides a way to tally up the emojis used across collected tweets.

wall.py

The wall.py program is the best way to generate a timeline of tweets that can be read one by one.

noretweets.py and deduplicate.py

These programs may be useful if you want to pare down the dataset. We don’t anticipate much duplication of tweets in the dataset, but no deduplication has been performed by us prior to making the collection available.
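For readers curious about what this kind of paring down involves, here is a small Python sketch of the general idea. It is not the code of the twarc utilities themselves. It assumes that a retweet carries a `retweeted_status` object and that every tweet has an `id_str`, as in the v1.1 tweet layout, and the sample lines are invented.

```python
import json


def originals_only(lines):
    """Yield each tweet once, skipping retweets and repeated ids."""
    seen = set()
    for line in lines:
        if not line.strip():
            continue
        tweet = json.loads(line)
        # Assumption: a retweet includes the original tweet under "retweeted_status".
        if "retweeted_status" in tweet:
            continue
        if tweet["id_str"] in seen:
            continue  # already emitted this tweet
        seen.add(tweet["id_str"])
        yield tweet


# Invented sample: one original, one duplicate of it, one retweet.
sample_tweets = [
    '{"id_str": "1", "full_text": "original"}',
    '{"id_str": "1", "full_text": "original"}',
    '{"id_str": "2", "full_text": "RT original", "retweeted_status": {"id_str": "1"}}',
]
kept = list(originals_only(sample_tweets))
print([t["id_str"] for t in kept])
```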

A note on images and video: There are limitations to collecting video and image files embedded in tweets due to the nature of the collecting by API. You may try using the method shared in this blog post from Tim Sherratt under Get Images. He uses image URLs and wget to gather pictures.
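As a hedged illustration of the image URL approach (this is not Tim Sherratt's code), the Python sketch below pulls photo URLs out of hydrated tweets so they can be saved to a text file and passed to `wget -i`. It assumes the v1.1 layout in which media appear under `extended_entities` or `entities` with `type` and `media_url_https` fields. The sample URL is a placeholder.

```python
import json


def image_urls(lines):
    """Collect unique photo URLs from tweets, one JSON object per line."""
    urls = []
    for line in lines:
        if not line.strip():
            continue
        tweet = json.loads(line)
        # Prefer extended_entities when present, otherwise fall back to entities.
        source = tweet.get("extended_entities") or tweet.get("entities") or {}
        for item in source.get("media", []):
            url = item.get("media_url_https")
            if item.get("type") == "photo" and url and url not in urls:
                urls.append(url)
    return urls


# Placeholder tweet with one attached photo.
sample_media = [
    '{"id_str": "1", "extended_entities": {"media": '
    '[{"type": "photo", "media_url_https": "https://pbs.twimg.com/media/EXAMPLE.jpg"}]}}',
    '{"id_str": "2", "entities": {"hashtags": []}}',
]
print("\n".join(image_urls(sample_media)))
```

Video files are not covered by this sketch.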

Access the Collection: You can find the collection description here and access to the tweet identifiers documents can be found here.

Other on-going collecting efforts related to the Confederate Monument protests that began on August 22, 2017 can be found:

UNC-Chapel Hill Ephemera Collection (40446)
UNC Libraries’ Web Archives, Confederate Monument Protests related websites.

If you have materials related to the protests – like photos, signs, or video – and you are interested in donating these materials to the University Archives, please contact us by email at archives@unc.edu.

Other twarc and social media archive resources:

Digital Humanities/Digital Scholarship resources
Documenting the Now website and blog.
Social Feed Manager website.

Behind the Scenes: Introducing Really Old Website Resurrector (ROWR)

From time to time the University Archives finds copies of departmental websites stored on CDs or DVDs as directories of html and other associated files. These websites are usually no longer available on the web. When we receive CD/DVDs from University departments, the files are carefully copied from the optical media (a relatively unstable storage medium) and deposited with our repository which is designed specifically for digital materials preservation. However, accessing a website as a directory of individual files rather than web pages in a browser leaves something to be desired. The content might be available, but the use is very different than what was originally intended. Additionally, from an archival standpoint we would like all archived websites to be stored in the WARC file format (an international standard).

In reviewing these items in our collections last year, I began to wonder if it would be possible to temporarily host the websites again. Once hosted online, we could crawl the website with Archive-It, which is the tool we use for website archiving. This method would allow us to provide access to the webpages as a site, connect the websites with the rest of our archived website collections, and generate a WARC file copy of the site’s contents. Luke Aeschleman, then of the library’s software development department, helped me with this project by creating a tool, ROWR, to clean up links and prepare for hosting the site.

ROWR prompts the user for a directory of website files as well as appropriate actions for modifying or removing links. ROWR creates a copy of the site prior to making changes so it is possible to reset and start over, if needed. ROWR also keeps track of the files and folders it has scanned, so it is possible to stop and continue review of the site later.

ROWR essentially produces a website that has a new artificial URL to facilitate temporary hosting of the website through a library server. This URL is then added to the Archive-It application and we run a standard crawl of the site. Once the crawl is tested and finalized, we take the website down from the library server.
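To illustrate the kind of link clean-up involved in giving a site an artificial URL, here is a hypothetical Python sketch. It is not ROWR's code, and it covers only one simple case: absolute `href` and `src` links that point at the site's old address. The function name and both example addresses are invented for illustration.

```python
import re


def rehost_links(html, old_base, new_base):
    """Point absolute href/src links that start with old_base at new_base instead."""
    pattern = re.compile(r'((?:href|src)=["\'])' + re.escape(old_base), re.IGNORECASE)
    # A function replacement avoids any special handling of backslashes in new_base.
    return pattern.sub(lambda m: m.group(1) + new_base, html)


# Invented example page and addresses.
page = (
    '<a href="http://www.olddept.example.edu/about.html">About</a> '
    '<img src="http://www.olddept.example.edu/logo.gif">'
)
print(rehost_links(page, "http://www.olddept.example.edu", "https://rehost.example.org/olddept"))
```

A real clean-up pass also has to deal with broken links, links to external sites, and links inside scripts or stylesheets, which is part of why the review step can be time consuming.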

We tested this approach with two websites. Overall the process works fairly well, but I did come up against some unique collection management and description needs. For example:

Do we need to keep a copy of the files from the CD/DVD or can we discard it and just use the Archive-It version?
The crawl date in Archive-It is completely different from the date the website was created and originally used. How should we represent these dates to users in metadata and other description?
ROWR is changing the content and we are creating an artificial URL, so how do we communicate this to users and what would they want to know about these changes?
It can be time consuming to use ROWR and clean-up all the links.

I decided that we should keep the copy from the CD/DVD available in the repository, as it is representative of the original website and the version in Archive-It contains an artificial URL. To address the other issues, we added some language to finding aids:

“An alternative version made for access is available here. This website was transferred to the University Archives on optical disc. To aid preservation and access the website files were temporarily re-hosted online and archived with Archive-It in 2017.”

In Archive-It, I also created a URL group (“website cd archives”) for the websites that were part of this test project in an attempt to set them apart from our typical web archiving work. I’ve not yet found a satisfactory way to provide context for these websites in the Archive-It access portal with Dublin Core metadata, but I hope that the group tag can be a clue that more information exists if a user were to ask us.

These two approaches to description are likely not the permanent solutions to the collection management challenges, but they are a starting point that provides an easier way to access these particular websites online. A future project for us will be to assess metadata description in Archive-It for all of our archived websites.

ROWR is in an early iteration and is not being actively developed at this time, but you can find the code on the UNC Libraries’ GitHub. In the months since we wrapped up this project in the summer of 2017, the Webrecorder team introduced a tool called warcit. The tool can transform a directory of website files into the WARC file format. The resulting WARC file could then be accessed in the Webrecorder Player application. This new tool is something else we’ll be exploring as we continue to improve procedures for the preservation and access of website archives transferred to us as file directories.

Carolina Tweets #archiveunc

When you think of archives, you might think of dusty old books and papers tucked away to be used by historians and other academics. Here at the University Archives we preserve plenty of old University records (that are kept dust-free, by the way), but our day-to-day work is actually very focused on the current moment. Without collecting materials that document the present day, researchers can’t study the University in the future.

One way we archive the current moment is through collecting student life materials and UNC related web content. With only three full time staff members it can be tough to keep up with all the conversations, events, and activism happening on campus. We can’t do this alone. This is where you come in!

You can actively contribute to the documentation of what’s happening at UNC by using the hashtag #archiveunc on your public tweets or Instagram posts. That’s all you have to do! By using the hashtag, you opt in to having the posts archived for long-term preservation and research access.

How is the content archived? We will periodically use a tool called Archive-It to “crawl” the tweets or posts tagged with the #archiveunc hashtag. Once the posts have been crawled by the Archive-It tool, the data is preserved by the Internet Archive and we provide access through our Archive-It website.

What kind of tweets are we looking for? We’re open to any tweets or Instagram posts related to UNC academics, campus life, and events. For example:

Promoting a student organization event? #archiveunc
Protesting? #archiveunc
Promoting a cause? #archiveunc
Sharing activities or chalk messages seen on campus? #archiveunc

If you don’t use #archiveunc, we may be in touch to ask permission to add your social media content or website to the Archives. Collecting social media content as it unfolds is new for us. We’re experimenting, so how we ask for permission and the technology used may evolve over time. As things change, we’ll keep you in the loop.

We hope you’ll join us in this exciting new effort!

Not interested in social media? Other ways to get involved and help document Carolina history:

Submit photos of UNC shirts to the UNC T-Shirt Archive.
Connect with us regarding donation of student organization records, digital or print photos, videos, or campus posters/flyers. If it documents something happening at UNC, we’re happy to talk about adding it to the archives. Please email us (archives@unc.edu) to get the process started.
Nominate a UNC website for archiving. First check to see if we’ve already archived the website: https://archive-it.org/collections/3491. If the website can’t be found in our web archives, send us an email (archives@unc.edu) to get the process started.