
From Investigation to Implementation

Building a Program for the Large-Scale Digitization of Manuscripts


Order of Digitization

A critical question facing the SHC staff at the outset of this digitization effort was, "Where to begin?" With a collection as large and as broad as the SHC, full digitization of all the materials would not happen quickly, so priorities had to be set to determine the order. SHC staff considered several options and then presented them to the scholarly community (the SHC's core audience of users) to help determine which would best meet scholars' needs.

Chronological approach

One option was to digitize collections in the order in which they were received: either beginning with Box 1 of Collection 1 and working methodically toward the most recent collections, or beginning with Collection n (the most recently accessioned collection) and working backward toward Collection 1. Both approaches, however, present some inequities. If the Box 1 of Collection 1 approach were used, some rarely used collections would be digitized before collections that are used frequently. Records from early time periods would also be digitized before twentieth-century collections, given that the SHC's first 2,000 accessions were mostly nineteenth-century records, so scholars exploring the twentieth century might not find much of use in the digital search room. If the Collection n through Collection 1 approach were used, the bulk of the early digitization would focus on large twentieth-century collections, and scholars exploring the eighteenth and nineteenth centuries would wait indefinitely for useful digital collections to be put online.

As one scholar commented in the workshop, "Chronological would not work for me because you would inevitably pick the chronological period that I am not working in . . . and that would be frustrating." Another scholar said that twentieth-century collections tended to be "thin, but impossibly gigantic and so being able to search them in new ways and different ways becomes more and more critical."

If digitization could enable scholars exploring the twentieth century to navigate those voluminous collections more easily and conveniently, then an approach that placed all such collections near the end of the digitization queue would clearly be inappropriate.

Greatest hits approach

A second option was the "greatest hits" model of digitization: to begin with some of the most written-about collections, which also tend to be the most widely used. This approach has problems as well, including:

  • The greatest hits model isn't always preferable or inclusive: As one scholar noted, "We've all seen those greatest hits collections. I've never seen one that really covers the full breadth that we are talking about here."
  • Many of these collections have already been microfilmed—and so are available worldwide through interlibrary loan: These resources would certainly be well used if digitized early in the process. However, in prioritizing their digitization, collections that are not available on microfilm—and therefore can be accessed only by users who visit Wilson Library—may not be digitized as quickly. One participant commented, "If I can get it on interlibrary loan, I don't feel that I need to have it digitized right now. What I would rather see is the stuff that I have no way of accessing except to come here."
  • Lesser-used and newly added collections could also be of great research value and could very likely become more heavily used if digitized: In an informal survey, archivists on the SHC staff indicated that usage statistics should play some role in the prioritization for digitization, but that heavy usage should not automatically move a collection ahead in the queue.

Hidden treasures approach

The third approach was to prioritize the digitization of some of the "hidden treasures"—those resources that the archivists may be familiar with, but that few scholars have yet studied extensively. However, what one person considers a hidden treasure, another person may consider only marginally useful or interesting. As one scholar asked, "I want to see them, but how do you pick them?" Moreover, many treasures may still be hidden in boxes little used or unknown to the current SHC staff, and could be revealed as they are brought out of the stacks to be digitized.

Conclusions

As discussions with scholars about these approaches unfolded, no consensus emerged to guide the SHC staff in choosing among them. Scholars tended to agree that no single research agenda, e.g., the American Civil War, should drive the selection of collections for digitization (as such an approach would necessarily exclude material of great use to scholars with other research interests), and that the materials prioritized for digitization should have broad potential for use by a variety of scholars and other potential audiences. As one scholar noted, "If you're doing it because you want to benefit a huge number of people who could never come to the Southern [SHC] and would never pick up a letter, then I think you ought to be biased toward the collections that could usefully be used by that massive number of users" [rather than focusing on small collections, which may be helpful to, say, a scholar writing a monograph, but that few others would find useful].

Ultimately the SHC staff decided that by prioritizing collections that in some way addressed a very broad and important issue in southern history—one that was historically relevant in all time periods and locations—the initial digitization would represent a broad geographic, chronological, and topical range, covering both the most-used collections and the hidden treasures of the SHC.