Decision Matrix
With a strategy in place for priority setting, project director Laura Clark Brown built a detailed decision matrix to analyze the holdings of the SHC. The decision matrix is a complex series of questions, the first of which has already been applied to each of the SHC's 4,600 existing manuscript collections and will be applied to its future holdings. (See a complete copy of the Decision Matrix in Appendix G.) Priority will be given to collections that contain any documents addressing the "great subtext of American history" and of southern history in particular—race and race relations. The core question of the matrix deals with the presence of documentation on race. Subsequent questions explore the collections for size, formats, donor restrictions, copyright concerns, presence or absence of sensitive information on third parties, and status of the finding aids—all factors that might hasten, delay, or prevent digitization. Questions in the matrix determine each collection's place in the lineup of the digitization process.
A collection's priority ranking is based, in part, on the presence of documentation concerning race and race relations, broadly defined. However, the intent of the large-scale digitization model is to digitize the entirety of the prioritized collections rather than to extract those singular items focused on race. For example, the Graves Family Papers, a collection that documents two generations of an affluent white family in Georgia from 1815 to 1901, would be digitized in its entirety even though this large collection contains only a small pocket of documentation of freedpeople. The result of this approach—entire collections rather than selections—is broad coverage of geography, chronology, material types, creators, and content. In the Exit Survey this option was preferred by the scholars who rejected methods that gave priority to such elements as time periods, geographical locations, or document types (e.g., typescripts with OCR possibilities).
Matrix stages
The first stage determines whether the collection will proceed further through the matrix and be considered for inclusion in the initial digitization. Collections are first evaluated to determine whether the majority, or at least a substantive portion, of the collection consists of original materials held in the SHC, as opposed to copies. Collections with original materials are then assessed to determine whether they contain any materials related to race or race relations in the American South. When a collection has met both initial criteria, additional questions are applied and scored to determine the collection's priority for digitization. These questions cover possible impediments (e.g., restrictions imposed by collection donors, copyright protection, and privacy concerns), the extent to which the collection has been used by scholars or is deemed by archivists to be of great potential use to scholars, and the extent to which the collection has been completely processed by SHC staff.
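For repositories that track their holdings in a spreadsheet or database, the two first-stage criteria can be expressed as a simple filter. The following is a minimal sketch only, not the SHC's actual tooling: the `Collection` record, its field names, and the sample entries are hypothetical, and the scored follow-up questions are left out because this section does not specify how they are weighted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Collection:
    # Hypothetical record; field names are assumptions made for illustration.
    title: str
    has_substantive_originals: bool  # majority or substantive portion is original material
    documents_race: bool             # any material on race or race relations


def first_stage_exclusion(c: Collection) -> Optional[str]:
    """Return None if the collection passes both criteria, else the reason it stops."""
    # The criteria are checked in the order the text describes: originals first.
    if not c.has_substantive_originals:
        return "mostly copies"
    if not c.documents_race:
        return "no documentation of race or race relations"
    return None


def first_stage(collections: List[Collection]) -> List[Collection]:
    """Keep only the collections that move on to the scored questions."""
    return [c for c in collections if first_stage_exclusion(c) is None]


# Invented sample data, not real SHC collections.
sample = [
    Collection("Example Papers A", True, True),
    Collection("Example Papers B", False, True),
    Collection("Example Papers C", True, False),
]
passed = first_stage(sample)
```

Returning a reason for exclusion, rather than a bare yes or no, keeps a record of why a collection was set aside, which could be useful when the matrix is reapplied to future holdings.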
In the second stage, each collection is analyzed in more detail, considering: 1) any donor restrictions on the collection, 2) the existence within the collection of sensitive materials or materials that may pose privacy concerns, 3) the provenance of the collection, 4) the inclusion within the collection of documents related to subject strengths and collecting initiatives of the SHC, 5) the relationship between the collection and other collections within the SHC and in other nearby archives, 6) the past and anticipated uses of the collection by scholars, 7) the copyright and intellectual property status of the collection, 8) the existence of microfilm or digital copies of the collection's contents, and 9) the completeness of the collection's processing and finding aids.
The third stage of the matrix deals with quantitative data about each collection in the priority list. Questions address the size of the collection, the chronological and geographical scope of the collection, the types of manuscripts included (e.g., correspondence, diaries, ledgers, maps, drawings, photographs, recordings), the dates and processing status of the original accession and any additions or expected additions, and whether any part of the collection has been previously reproduced (either on microfilm or in digital form).
Testing the Matrix
By applying the three stages of the matrix, collections can be prioritized for digitization to best meet the needs of the scholarly community in a balanced and efficient way. SHC staff applied the first stage of the matrix to all 4,600 collections, which enabled prioritization of 1,030 collections for digitization. The staff then tested all stages of the decision matrix by thoroughly evaluating two twentieth-century collections and determining the priority for each. The Delta Health Center Records (#4613) would be a low priority due to privacy concerns (presence of health information). The Delta and Providence Farm Papers (#3474) would be a high priority because the risk of privacy concerns is low: although the Farm Papers is a twentieth-century collection, it does not contain sensitive information about identified third parties likely to be living today.
Previously microfilmed collections
Microfilm can be digitized quickly and cheaply without exposing original manuscript sources to additional handling. Thus digitizing collections from microfilm could result in a large volume of materials being digitized safely and made available online very quickly. However, scans of microfilm are only as readable as the original microfilm itself and may not be of the same high quality as scans created directly from the original documents. The graduate students and scholars were divided on the issue of digitizing materials for which microfilm copies already exist.
Many of the graduate students said that the SHC should digitize its existing microfilmed collections to get a large quantity of material online quickly. Others were more wary of digitized microfilm, stating concerns about image quality and readability and questioning whether the cost savings and speed truly offset the loss of quality. Some said that the degree of image-quality loss was substantial enough to forgo microfilm digitization altogether, and they advocated for original manuscript digitization only. Others argued that any initially digitized microfilm could be replaced by digitized manuscripts at a later date as resources permitted.
When presented with the probability that the digital surrogates of the documents would be legible, most interviewed scholars said that large-scale digitization should emphasize quantity of materials digitized rather than perfection of image quality. Several argued that the SHC staff should examine microfilm editions for quality, and advocated for digitizing only those microfilmed collections with superior legibility and for using the original archival material when the microfilm proved illegible or otherwise poor. The majority said that microfilm digitization should be considered on a case-by-case basis. In the Exit Survey, however, more than half (12 of 21) of the scholars who completed it responded that the SHC should not digitize microfilm holdings.
Given the current availability of SHC staff time and resources, it was decided that when collections that are available on microfilm reach the top of the priority list for digitization, working from the original documents instead of the microfilmed versions will likely be best for the following reasons:
- Many of the microfilmed copies of SHC collections were created by commercial publishing companies and therefore could not be digitized by the SHC.
- Microfilmed copies created by the SHC would have to be compared to original sources to determine image quality and might require a page-by-page assessment of each collection (as microfilm quality varies with different kinds of documents). After the film is scanned, individual pages or folders within a linear section of microfilm might also have to be delineated in order to fit correctly into the digital presentation of the collection.
- Many collections were microfilmed decades ago, and the original collections might have grown with additions, been reorganized and reprocessed, or changed in other ways. The microfilmed documents would therefore have to be aligned with the finding aid and the original documents as they are currently organized.
Extensibility
The SHC staff hopes that this decision matrix will help other repositories design plans for large-scale digitization of their own collections. Specifics will necessarily differ, but the basic elements may be transferable. The model focuses on a core question that is of interest to the archive's primary research audience. For the SHC, the core question is whether a collection documents race, and the audience is scholars of the American South. In shaping the core question, the keys are to determine the intended audience and to engage that community in setting priorities for digitization. Then a series of questions is developed to enable archivists to prioritize collections according to their relevance to that question.
Conclusion
After the initial assessment of all 4,600 collections, the identification of 1,030 high-priority collections, and the detailed assessment of two key collections, the SHC determined that the decision matrix will be a useful long-term tool for evaluating and prioritizing collections in a large-scale digitization effort.