Making New Connections to Old Materials: Bringing Special Collections into the Digital Age

The automation revolution that has transformed most facets of the modern library has not yet had the same impact on archives or special collections. Ever since the card catalog gave way to the online public access catalog (OPAC), archivists and special collections librarians have longed to provide their patrons with the same level of computer searching power enjoyed by users of the main collections. Unfortunately, the United States Machine-Readable Cataloging (USMARC) standard for books, media, and other library materials was not designed to classify, order, or retrieve the types of materials most often found in archives or special collections libraries. The difference is mostly a matter of scale. OPACs (like the card catalog before them) guide library patrons to books and periodicals, while archive indexes guide patrons to individual documents, photographs, artifacts, etc. The level of detail required by archive indexes cannot be handled efficiently by the USMARC standard.

Patrons who wish to find materials in most archives today are still relegated to searching printed finding aids, and locating information can be quite time consuming. Because there is no way to conduct global searches, patrons must read each printed register that may contain information relevant to their research. This is a cumbersome and inefficient research method, one that is especially aggravating to patrons and librarians accustomed to the computer power available for searching the main collections.

The proliferation of OPACs in libraries has made the ability to search library holdings from distant locations, through TELNET or the World Wide Web, more and more common. Again, users of special collections and archives generally have not benefited from this technology; they must travel to the location of the materials in order to access the printed finding aids. Even when collection registers are available through networks (e.g., local area networks, OPACs, or the WWW), the registers usually are reformatted text documents and are not searchable.

The Power of Integration: Unprecedented Access to Special Collections at Southern Utah University

In response to the challenge of making special collections searchable online, technology specialists at Southern Utah University have embraced a powerful and often overlooked concept in educational technology: integration. Not all important technological advances happen at the cutting edge; some result from novel combinations of existing hardware and software. Evidence of that fact can be found at the Gerald R. Sherratt Library at SUU, where the integration of four existing technologies, namely SGML (Standard Generalized Markup Language), the OPAC, the WWW, and digitization, provides patrons with unprecedented access to the library's special collections.

In the summer of 1997, the SUU Special Collections joined the beta test for a new national standard—introduced jointly by the Library of Congress and the Society of American Archivists—for archival cataloging. The new standard is based on SGML, a technique for representing documents in machine-readable form; SGML was approved in 1986 as an international standard (ISO 8879). Because it is a general language, SGML can be customized through a document type definition (DTD), which outlines the internal fields and tags used to meet specific document needs. HTML is a well-known example of a specific SGML application.

The DTD for using SGML to create archival finding aids is the Encoded Archival Description (EAD), which was conceived and created by a multidisciplinary team of librarians, historians, and computer scientists. Daniel Pitti of the University of California-Berkeley organized the team under the auspices of the Bentley Library Research Fellowship Program for the Study of Modern Archives. The Library of Congress and the Society of American Archivists have taken charge of the EAD and currently are fine-tuning its structure and promoting its use among libraries and archives. The EAD is destined to become the national standard for creating machine-readable finding aids for archives and museums.

Like all SGML applications, EAD uses embedded tags to add detail to text documents. Word processing codes that provide formatting details, such as <bold> or <left tab>, are an example of non-SGML embedded tags; they describe how a document looks. In contrast, the tags in an EAD document identify the kinds of information included in an item and the location of the material. Because archive indexes are in outline form, EAD tags also encode the hierarchical structure of the index. For example, <container> identifies the box and folder where an item is stored, <persname> indicates a personal name, <unittitle> describes a section or item, and <unitdate> indicates the date an item was created. The following is a simple example of the use of these tags:

<container> Box 5 </container> <container> Folder 1</container> <unittitle> Letter from <persname> John Smith </persname> to his parents, while stationed at Pearl Harbor, <unitdate> Dec 6, 1941.</unitdate></unittitle>
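
To illustrate why such tags make a register machine-readable, the following is a minimal Python sketch that pulls the container, title, name, and date out of the entry above. It is an illustration only, not part of SUU's system. It assumes the entry is also well-formed XML, and it wraps the entry in a <c> element so that the fragment has a single root; that wrapper and the names ENTRY, clean, and describe are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# The tagged entry from the text. The surrounding <c> element is an
# assumption added here so the fragment has one root and can be parsed.
ENTRY = (
    "<c>"
    "<container> Box 5 </container> <container> Folder 1</container> "
    "<unittitle> Letter from <persname> John Smith </persname> to his parents, "
    "while stationed at Pearl Harbor, <unitdate> Dec 6, 1941.</unitdate>"
    "</unittitle>"
    "</c>"
)


def clean(text):
    """Collapse runs of whitespace and trim both ends."""
    return " ".join(text.split())


def describe(entry_xml):
    """Return the tagged fields of one entry as a plain dictionary."""
    root = ET.fromstring(entry_xml)
    title = root.find("unittitle")
    return {
        # Where the item is stored: one value per <container> tag.
        "containers": [clean(c.text) for c in root.findall("container")],
        # The full title, including the text inside nested tags.
        "title": clean("".join(title.itertext())),
        # Personal names and dates nested inside the title.
        "persons": [clean(p.text) for p in title.iter("persname")],
        "dates": [clean(d.text) for d in title.iter("unitdate")],
    }


record = describe(ENTRY)
for field, value in record.items():
    print(field, "=", value)
```

Because each piece of information carries its own tag, a program can tell the storage location apart from the name and the date without guessing from punctuation or layout.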

At the heart of SUU's new integrated system are EAD-based registers of manuscript and photo collections. The EAD format allows the registers to be indexed and searched by a UNIX-based search engine. This searching power is linked to several user-friendly front ends that offer Special Collections patrons a range of access options.

The first level of access, a combination of SGML and Web-based systems, is a Special Collections homepage that contains a standard Web search tool. A patron can use this tool to search the contents of all indexed collections globally or only the contents of individual collections. The second level offers Special Collections access to general Web users who do not enter directly through the SUU Library or Special Collections Web site. The indexing software dynamically creates a general contents page from all indexed collections; the contents page is updated automatically whenever new registers are added to the database. Because this contents page is registered with most major commercial search engines, interested Web surfers can easily find the Sherratt Library's Special Collections page, where they can peruse the registers or conduct a specific content search. The third level of access exists via the library's OPAC, sometimes referred to as the "electronic card catalog." New standards for OPACs allow libraries to link digital resources directly to bibliographic records. Under this new protocol, library administrators have linked the Special Collections search engine directly to collection-level records in the OPAC.
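
The first level of access, a keyword search across all indexed collections or within a single one, can be illustrated with a small Python sketch of an inverted index. This is not the search engine used at SUU, whose internals are not described here; it is a sketch under stated assumptions. The sample entries, the collection names, the second entry's container, and the function names tokenize, build_index, and search are all invented for this example.

```python
import re
from collections import defaultdict

# Hypothetical sample entries. A real system would extract these
# fields from the EAD registers rather than list them by hand.
ENTRIES = [
    {"collection": "Sample Collection A", "container": "Box 5, Folder 1",
     "title": "Letter from John Smith to his parents, "
              "while stationed at Pearl Harbor, Dec 6, 1941."},
    {"collection": "Sample Collection B", "container": "Box 1, Folder 3",
     "title": "Photograph of the first four-year graduates, 1910"},
]


def tokenize(text):
    """Split text into lowercase words and numbers."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(entries):
    """Map each word to the positions of the entries whose title contains it."""
    index = defaultdict(set)
    for position, entry in enumerate(entries):
        for word in tokenize(entry["title"]):
            index[word].add(position)
    return index


def search(index, entries, query, collection=None):
    """Return entries containing every query word.

    With collection=None the search is global; otherwise it is
    restricted to the named collection.
    """
    words = tokenize(query)
    if not words:
        return []
    hits = set.intersection(*(index.get(word, set()) for word in words))
    results = [entries[position] for position in sorted(hits)]
    if collection is not None:
        results = [e for e in results if e["collection"] == collection]
    return results


INDEX = build_index(ENTRIES)
for hit in search(INDEX, ENTRIES, "graduates"):
    print(hit["collection"], "|", hit["container"], "|", hit["title"])
```

Building the index once and consulting it for each query is what spares a patron from reading every register in turn; the optional collection argument mirrors the choice between a global search and a search of one collection.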

The Rewards of Integration: Practical Examples

During the summer of 1998, SUU Special Collections staff members began experimenting with the beta version of the EAD. They began creating EAD-based registers in earnest when the EAD 1.0 version was released in the fall of 1998. (These projects were made possible by a federal Library Services and Technology Act mini-grant.) Since then, approximately 30 registers have been completed, and more are added every month.

This integrated system gives Special Collections patrons a level of searching power and access that they have never before enjoyed. If a patron's search on the OPAC retrieves a record for an item in Special Collections, one click of the mouse opens the Web browser and takes the patron directly to the searchable finding aid for that collection; once there, the patron can retrieve information down to the folder or item level. For example, a patron who searches the OPAC for information on the history of the Grand Canyon will find an entry for the George A. Croft Collection; this entry links to the machine-readable EAD register of that collection. The patron's Web-based search of the EAD index will turn up records for quite a few documents about, and photographs of, the construction of the first hotel on the Canyon's north rim. One more click of the mouse will bring the photographic images to the screen. All of this happens via the OPAC front end.

SUU has proved that new and powerful informational and educational systems can be created by combining standard technologies in a unique way. Readers of The Technology Source may discover for themselves the integrated system's capabilities by following the instructions below, which illustrate how a patron might search for information on early graduates of the Branch Normal School, the first college in southern Utah.

Connect to the library's Web-based OPAC at http://webpac.li.suu.edu/webpac-bin/wgbroker?new+-access+top.suu and do a subject search for branch normal school. From the resulting list of subjects, choose Branch Normal School -- History; when a list of titles appears, choose Lillian Higbee Macfarlane Collection. This retrieves the bibliographic record for the Macfarlane Collection. Follow the URL immediately below the Title and Author fields. This connects you to the search page for the collection. Do a keyword search for graduates. One item will be retrieved: a 1910 photograph of the first four-year graduates of the Branch Normal School. You may preview the photo by clicking on View Image.

You also may access the system directly through the main search page at http://archive.li.suu.edu/, or you can search a specific collection by choosing it from a complete listing of indexed collections at http://archive.li.suu.edu/cgi-bin/collections?.

Other archives, special collections libraries, and repositories that use EAD can be found at the official EAD Web site (sponsored by the Library of Congress): http://www.loc.gov/ead/eadsites.html.