Imaging Pittsburgh: Creating a shared gateway to digital image collections of the Pittsburgh region
by Edward A. Galloway
The University of Pittsburgh’s Digital Research Library received a two-year grant from the Institute of Museum and Library Services (IMLS) to provide online access to multiple photographic collections held by the University’s Archives Service Center, Carnegie Museum of Art, and the Historical Society of Western Pennsylvania. When the project ends in October 2004, the project team will have mounted over 7,000 visual images depicting the people, places and events of the greater Pittsburgh region between the mid-nineteenth and mid-twentieth centuries. Although the beta version of the Web site was released in February 2004, the project team will continue to develop the site and offer creative avenues for exploring the collections. This paper summarizes remarks made at the WebWise 2004 Conference in Chicago.
In 2002 the University of Pittsburgh received a National Leadership Grant from the Institute of Museum and Library Services (IMLS). Under the auspices of the Library & Museum Collaboration track, this grant proposed to create a shared gateway to visual image collections in the Pittsburgh region. The grant partners include three cultural heritage institutions in Pittsburgh, namely the Archives Service Center at the University of Pittsburgh, the Library & Archives division of the Historical Society of Western Pennsylvania, and the Carnegie Museum of Art. My department, the Digital Research Library (DRL) at the University of Pittsburgh, provides the overall leadership of the project, supports the technical infrastructure needs, and hosts the image collections. Work on the grant commenced on 1 November 2002 and will continue until 31 October 2004.
This paper will introduce you to the grant partners and the image collections, briefly summarize the purpose of our project, mention our progress and challenges encountered so far, note how we have addressed and resolved them, and discuss our intended project outcomes and impacts.
The main focus of our project is to create a single Web gateway for the public to access thousands of visual images from photographic collections held by the Archives Service Center of the University of Pittsburgh, Carnegie Museum of Art, and the Historical Society of Western Pennsylvania. These "content partners" are responsible for determining which collections to represent, selecting individual images, describing and cataloging the images, digitizing the images, and delivering the images and metadata to the DRL. The DRL is responsible for providing federated access to the multiple image collections utilizing DLXS middleware developed at the University of Michigan [1]. The DRL indexes and mounts the image collections on a central University Library System server, and creates a Web gateway with the functionality of cross-searching the multiple image collections.
Primary access to the image collections is through the existing Historic Pittsburgh Web site created and maintained by the DRL with support from the Historical Society of Western Pennsylvania [2]. Available for several years now, this Web site has greatly increased public access to significant collections of historic material documenting the growth and development of Pittsburgh and the surrounding western Pennsylvania region during the nineteenth and early twentieth centuries. Adding image content will increase the comprehensiveness of Historic Pittsburgh as an online research tool.
The online collection and its benefits
The online collection will comprise images from over 20 distinct photographic collections held by the three institutions. At a minimum, 7,000 individual images will be available with the anticipation that over 10,000 will be online by project’s end. This grant has created the necessary framework to add image content to the site beyond the life of the grant. In fact, we are already reaching out to new content partners in the Pittsburgh region [3].
An obvious benefit for users working with the collections as a group is the ability to obtain a wider picture of events and people, not to mention changes to localities, infrastructure, and land use. This is an important facet to mention since the collections document many different perspectives of the city throughout time.
Take for example the Lower Hill District in Pittsburgh. From the 1930s to the late 1950s, the Hill was one of the most prominent African-American neighborhoods in the country, the social and political center of a thriving Black population that had moved from the South in the hope of escaping segregation and finding work in Pittsburgh’s iron and steel mills. The photographs in the Teenie Harris Collection at the Carnegie Museum of Art visually document many of the people, places and events of the Hill District (see Figure 1).
Figure 1: Couples sitting on and standing behind couch, ca. 1950. Teenie Harris Collection, Carnegie Museum of Art.
The Lower Hill was demolished in the late 1950s to make room for Pittsburgh’s first "Renaissance" under the Allegheny Conference on Community Development (ACCD), as documented by images in the ACCD collection (see Figure 2). More than 1,300 buildings were razed, and 413 businesses and 8,000 residents were displaced, in an attempt to extend the revitalization of the adjacent Golden Triangle (Downtown).
Figure 2: Beginning of demolition of the Lower Hill, ca. 1955. Allegheny Conference on Community Development, Historical Society of Western Pennsylvania.
Other examples of diverse and competing views of the city include photographs in the City Photographer Collection, which were primarily created for utilitarian purposes (curb and street conditions, paving, etc.). These images can be contrasted with images found in the general collection of Carnegie Museum of Art photographs, which demonstrate the artistic nature and value of similar sites. A further contrast is between images documenting a wealthy industrialist family (see Figure 3) and images of families scratching out a living in the Irene Kaufmann Settlement.
Figure 3: Spencer children in coats, 25 November 1897. Spencer Family, Archives Service Center, University of Pittsburgh.
Characteristics of the Web gateway
Users of the image collection gateway will be able to do the following:
- Conduct a keyword search across all the image collections;
- Browse images within any given collection;
- Read about the collections and their contents, including provenance, date span, and coverage;
- Explore the image collections by time, place and theme; and,
- Order image reproductions.
The content partners are especially interested in providing an image reproduction service to help generate income for their respective institutions.
Challenges and accomplishments
We have experienced numerous communication challenges. When we commenced work on the grant, we established several avenues of communication, including the creation of a project email distribution list and a Web site for posting documentation. The team initiated monthly meetings that have been held on a revolving basis at each institution. However, there is often a lack of dialogue outside the regularly held meetings, and little communication on the listserv other than noting when new documentation has been added or updated on the project Web site. This simply confirms the inherent difficulty of communicating across institutional boundaries. Or does it? Speaking as project leader, it’s difficult for me to judge whether the silence means everyone knows exactly what to do and is doing just that, or whether they are so busy doing other jobs that they don’t have time to communicate. Likely it is a bit of both.
Another communication challenge is reflected in the different missions and institutional cultures of the respective content partners. We’ve learned much about how each institution perceives its collections and the role they play in educating or entertaining a patron. A simple example is a cataloging discussion we once had regarding subject terms. Our museum friends tend to view their image collections as works of art with intrinsic value as photographs. The academic archives tend to view their image collections for utilitarian purposes with minimal description, while the historical society’s practice has been to provide contextual information that not only describes the image, but informs the reader about the history and impact of an area or person depicted by the image. All this is to say that it has taken time to build a common dialogue for discussing critical elements of the project.
The original design of the project gave much freedom for the content partners to select the image collections to be represented in this project. The grant proposal mentioned the inclusion of 16 distinct photographic collections that documented diverse times, places and themes in the Pittsburgh region.
The team created a document to help guide selection of specific photographs, but most of these guidelines were instructive in nature and dealt mainly with scanning considerations such as size, format, and condition. Each content partner devised different methods of actually selecting unique images from the collections. Some collections did not need to have selections made because the entire collection was a good candidate for digitization (mainly due to its size).
One tool to help guide image selection (that never occurred to me during the grant preparation) was the use of the subject headings. Now that approximately 3,400 images have been described, the team has started reviewing the subject headings assigned to the images. When an alphabetical list of all subject headings was produced with an indication of the number of times a term was used, it soon became obvious what the online collection contains. Our challenge in the remainder of the project is to use lists like this to inform and perhaps change our selection decisions to ensure the collection as a whole is balanced.
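The kind of alphabetical list described above could be produced along the following lines. This is only a minimal sketch: the record layout, the `subjects` field name, and the sample headings are assumptions made for illustration, not the project's actual export format or data.

```python
from collections import Counter


def subject_heading_report(records):
    """Return (heading, count) pairs sorted alphabetically by heading."""
    counts = Counter()
    for record in records:
        # Count each heading once per image, ignoring stray whitespace.
        headings = {h.strip() for h in record.get("subjects", []) if h.strip()}
        counts.update(headings)
    return sorted(counts.items(), key=lambda item: item[0].lower())


# Hypothetical sample records; real data would come from the partners' exports.
sample_records = [
    {"id": "img001", "subjects": ["Bridges", "Rivers"]},
    {"id": "img002", "subjects": ["Bridges", "Steel industry and trade"]},
    {"id": "img003", "subjects": ["Streets"]},
]

if __name__ == "__main__":
    for heading, count in subject_heading_report(sample_records):
        print(f"{heading}\t{count}")
```

A list like this makes heavily used and rarely used headings visible at a glance, which is what allows it to inform later selection decisions.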
One of the biggest selection challenges remains: split collections. For historical reasons, the curatorship of two image collections is split between the Archives Service Center and the Historical Society of Western Pennsylvania. In order to select images from these collections, curators representing both institutions plan to collaboratively select images and decide together which images should be digitized.
We know that the metadata is the glue that holds these collections together, but the creation of metadata has been a challenge. One challenge we have faced is the difference between project-wide metadata needs and local needs. On one hand, each institution has its reasons for wanting to create and use its own metadata scheme for internal management purposes and so forth. On the other hand, the interoperability of the metadata was crucial to the success of the project. We agreed to map to eight Dublin Core elements (title, date, description, subject, creator, identifier, filename, rights). These elements will serve as the core descriptors in the online version of the metadata, but we also agreed that each content partner had the liberty to include additional fields in the online database if they thought appropriate.
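To illustrate the idea of mapping local schemes onto a shared core, here is a minimal sketch of a field crosswalk. The eight core element names come from the text; the partner keys, the local field names, and the sample values are hypothetical and do not reflect any partner's actual database.

```python
# The eight shared elements named in the text.
CORE_ELEMENTS = ("title", "date", "description", "subject",
                 "creator", "identifier", "filename", "rights")

# Hypothetical local-field-to-core-element maps, one per partner database.
CROSSWALKS = {
    "museum": {
        "Object Title": "title",
        "Date Made": "date",
        "Artist": "creator",
        "Accession No.": "identifier",
    },
    "archives": {
        "Caption": "title",
        "Photo Date": "date",
        "Photographer": "creator",
        "Image ID": "identifier",
    },
}


def to_core_record(partner, local_record):
    """Split a local record into shared core elements and partner extras."""
    crosswalk = CROSSWALKS[partner]
    core = {element: "" for element in CORE_ELEMENTS}
    extras = {}
    for field, value in local_record.items():
        element = crosswalk.get(field)
        if element is not None:
            core[element] = value
        else:
            # Unmapped fields are kept as optional partner-specific fields.
            extras[field] = value
    return core, extras
```

Keeping unmapped fields in a separate dictionary mirrors the agreement that each partner may carry additional fields alongside the shared core.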
We reached an agreement within the project team to use controlled vocabulary terms when cataloging the images (i.e., subject headings). But which controlled vocabulary should we choose? Although we investigated thesauri including the Art and Architecture Thesaurus and the Getty Thesaurus of Geographic Names, we settled on the use of the Library of Congress Subject Headings (LCSH) for two primary reasons. First, the most experienced cataloger on the project team utilized LCSH in his cataloging, and developed a brief guideline to ensure consistency among the three partners. He has also played a crucial role by systematically reviewing and commenting on the list of aggregated subject terms for proper syntax and construction. He also developed our own set of geographic vocabulary terms for describing local neighborhoods. Second, the museum curator, who would perform the bulk of the museum’s cataloging, had no prior experience using any kind of controlled vocabulary scheme, so she had no preference.
Another metadata challenge has involved the use of dates. We agreed to create two date fields: a normalized date (ISO standard) for computer sorting and a display date for the user to view in the metadata record. Although the middleware is currently limited in its ability to make robust use of the date field, we have expectations that we will eventually have the ability to sort dates, perform a search over a date range, etc.
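The two-field approach can be sketched as follows. This is an illustration only, not how the middleware handles dates: the field names are hypothetical, and the normalized values are assumed to be ISO 8601 strings of the form YYYY, YYYY-MM, or YYYY-MM-DD. The display dates are taken from the figure captions in this paper; the normalized values paired with them are assumptions.

```python
from datetime import date


def parse_normalized(value):
    """Expand YYYY, YYYY-MM, or YYYY-MM-DD into the earliest matching date."""
    parts = [int(part) for part in value.split("-")]
    parts += [1] * (3 - len(parts))  # Missing month or day defaults to 1.
    return date(*parts)


def sort_by_date(records):
    """Order records chronologically by their normalized date."""
    return sorted(records, key=lambda r: parse_normalized(r["date_normalized"]))


def in_date_range(records, start, end):
    """Keep records whose normalized date falls between start and end."""
    lo, hi = parse_normalized(start), parse_normalized(end)
    return [r for r in records
            if lo <= parse_normalized(r["date_normalized"]) <= hi]


# Hypothetical records pairing a display date with a normalized date.
sample_records = [
    {"id": "fig1", "date_display": "ca. 1950", "date_normalized": "1950"},
    {"id": "fig2", "date_display": "ca. 1955", "date_normalized": "1955"},
    {"id": "fig3", "date_display": "25 November 1897",
     "date_normalized": "1897-11-25"},
]
```

Note that this sketch treats a reduced-precision value such as "1955" as 1 January 1955, so a range ending in "1955" would not include an image dated later in that year; a fuller treatment would expand the end bound to the end of the period.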
Each content partner faced the challenge of creating a workflow to get the images through numerous processing steps, such as selection, digitization, description, and quality control. At first I wondered how difficult this might be since each institution inherently has its own unique practice and method of curating image collections. Although each institution did devise a workflow based on institutional practice, barriers, and handling and security issues, it was interesting to see that when these workflows were shared with the project team, ideas were exchanged that caused each institution to reconsider its method and incorporate new (and often better) ideas.
The project team agreed to set a minimum and consistent level of image quality for the "production masters" that were delivered to the DRL. This has ensured that the DRL always receives images of consistent size and quality for online viewing (i.e., as surrogate images). However, each content partner is at liberty to create an image of higher quality that may serve other purposes on behalf of the institution (e.g., printing, publishing).
Another workflow challenge was the creation and use of separate databases. From the outset, we did not want to impose the use of identical software applications or field structure due to the unique set of partners within this grant. Instead, we insisted that the appropriate and necessary metadata fields be able to be exported to the DRL per our instructions. Therefore, the museum cataloged its records in a commercial database while the archives and historical society used locally developed Microsoft Access databases.
Web site development challenges
When dealing with multiple collections from three different institutions, it has been a challenge to develop consistent copyright and permission statements for the site and accompanying images. Practically speaking, it is impossible for one such statement to cover the actual policies that govern access to and use of these collections. Therefore, we decided on a twofold strategy. First, we developed a generic copyright and use statement for the image collections as a whole, which basically says: "you are free to use the images on this Web site for personal research and noncommercial use, but you must seek permission from the respective institution that holds the images to reuse the images." Links are included in this statement that point directly to each institution’s Web page explaining its explicit policy on copyright and use. Second, the metadata that accompanies each individual image contains a copyright field. We have used the contents of that field to also directly point to the holding institution’s copyright and use Web page. In this way, we hope to point users directly to the source rather than a nebulous statement that simply cautions users.
What will be the best way to respond to user questions and comments about the site? Although we toyed with the idea of developing a sophisticated email system to filter comments directly to the appropriate institution and project personnel, we decided to continue using the email distribution list we presently use to handle any Historic Pittsburgh feedback. When a user submits an inquiry or comment in the Web-based form, an email is sent to a few team members from each institution. Typically it is obvious who should answer a question: all troubleshooting questions will be answered by the DRL while a curator or archivist at one of the institutions will answer content questions. A librarian in the DRL ensures that replies are made in a timely fashion.
During the past several months, we have begun to turn our attention to building the Web site for access to the image collections. We have faced the dual challenge of emphasizing access to the "collection" as a whole, while also maintaining the individual identity of each collection. Due to the collaborative nature of the grant, emphasis will be placed on accessing the image collections as a group via search or explore (see Figure 4). That said, each image collection will be represented by a custom-designed "homepage" for access to just that set of images. While we believe the majority of access to the image collections as a group will occur via the Historic Pittsburgh portal, each partner institution can point directly to their image collection "homepages" from their institutional Web server for promotion and access purposes.
Figure 4: Example of a cross-collection search for "bridge" sorted by date.
Another Web site development challenge is dealing with the limitations of the DLXS ImageClass middleware and the limited internal resources available to improve upon its functionality for this specific application. We recognize the limits of the middleware and have decided to live within the framework of the application and not make significant local alterations to the system. However, the DRL has often communicated with developers at the University of Michigan to seek improvements and report bugs.
Preparing Web site release
In fall 2003, we agreed as a project team to attempt to launch a beta release of the site by February 2004. I am happy to say that we just made it: the beta site went live on 25 February. Users of Historic Pittsburgh now have access to over 3,400 images from 18 different collections. As the site becomes more fully developed and more content is added in the coming months, we plan an official release in fall 2004. Launching the site in February, though, should give us an opportunity to receive immediate feedback from our users to help improve the site.
During the spring and summer months we plan on conducting interface testing in order to make improvements to the site. Following this analysis, we plan on conducting an online survey in the Fall to seek user information and outcome measurements. We also will create OAI records for each image in the collection to be shared and collected by Open Archives Initiative metadata aggregators to enable greater resource discovery.
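As a rough illustration of what a shareable record might look like, the sketch below serializes one image description as unqualified Dublin Core in the `oai_dc` format used by the Open Archives Initiative protocol. The two namespace URIs are the standard ones; the function name, the record layout, the subject terms, and the identifier are hypothetical. Because "filename" is not one of the unqualified Dublin Core elements, this sketch simply leaves it out, which is an assumption about how such a field might be handled rather than a description of the project's practice.

```python
import xml.etree.ElementTree as ET

OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)

# Core elements from the project that are also unqualified Dublin Core.
DC_ELEMENTS = ("title", "creator", "subject", "description",
               "date", "identifier", "rights")


def build_oai_dc(record):
    """Serialize a record's Dublin Core fields as an oai_dc XML string."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for element in DC_ELEMENTS:
        values = record.get(element, [])
        if isinstance(values, str):
            values = [values]
        # Repeatable elements (e.g., subject) get one child per value.
        for value in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical record; title and date follow the Figure 3 caption.
sample_record = {
    "title": "Spencer children in coats",
    "date": "1897-11-25",
    "subject": ["Children", "Families"],
    "identifier": "hypothetical-id-0001",
    "filename": "hypothetical-0001.tif",  # Not Dublin Core; skipped.
}

if __name__ == "__main__":
    print(build_oai_dc(sample_record))
```

In a full OAI-PMH response this fragment would sit inside the protocol's record and metadata wrappers, which are omitted here.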
Avenues for exploring
One of the most exciting challenges that still remains is developing creative ways to help users explore the collections. The team has spent tremendous energy cataloging the images; what better way to tell some of Pittsburgh’s stories than by leveraging the controlled metadata? This will allow us to provide curatorial context and guidance to the images in the collections. Our avenues for exploring the collections will encompass time, place and theme. Some of our theme ideas include "Pittsburgh at Work," "Pittsburgh at Play," "Pittsburgh at Home," and "Pittsburgh Personalities." We anticipate that users will be able to see selected highlights based on the above themes, and/or explore the images in more depth by cataloging terms. An idea we have for exploring the images by place involves a clickable city map whereby users can select a neighborhood and automatically retrieve images depicting that neighborhood from the collections.
I must confess that when we wrote this grant, we knew little about Outcome-Based Evaluation, or "OBE" as it’s fondly known. Yet after participating in the IMLS-sponsored workshop last January, the team developed an understanding and appreciation for "outcomes" rather than "outputs." With this new knowledge, the team worked through some of the OBE exercises and formulated several outcomes for different target audiences. We plan on measuring our success in the fall.
One of our main project goals for the user is to experience the image collections in a way that cannot be achieved in their analog format. Furthermore, we want users to learn something about the image collections. Therefore, we will measure whether the online photograph collections meet the research needs of users, and whether users gain knowledge of the photographic collections held by each institution.
Each institution developed one outcome statement and devised a means of tracking and measuring the outcome. The Archives Service Center hopes to increase the demand for images for publication and instruction purposes. The Carnegie Museum of Art hopes to collect specific information from the community about individuals and content in the Teenie Harris Collection. The Historical Society of Western Pennsylvania hopes to expand the use of the digital images in seventh- through twelfth-grade instruction.
This project has already impacted the curators and archivists who provide primary access to their respective institution’s collections in several significant ways: a better understanding of photographic collections that document the city; an improvement in communication; and, a new respect for each institution’s unique missions and attitudes toward their primary audience.
Work on this project continues, but already we are beginning to see how these image collections will increase opportunities for diverse cultural heritage institutions to make a significant impact on the availability of visual images documenting Pittsburgh during its industrial heyday and transformation. We hope users enjoy and benefit from their experience visiting these unique image collections.
About the Author
Edward A. Galloway is Coordinator of the University of Pittsburgh’s Digital Research Library. He is Principal Investigator of the Imaging Pittsburgh project funded in part by the Institute of Museum and Library Services (IMLS). He is also currently serving as Co-Director of a library project funded in part by the National Endowment for the Humanities (NEH) to microfilm and digitize scholarly Chinese material held within the University Library System.
Acknowledgments
I would like to thank the entire Imaging Pittsburgh project team for their dedication and hard work on this project. I appreciate the cooperation the project has received from the University of Pittsburgh’s University Library System, Carnegie Museum of Art, and the Historical Society of Western Pennsylvania. Finally, I want to thank IMLS for the opportunity to work on this exciting project and present our work to date at this WebWise conference.
Notes
1. University of Michigan’s Digital Library eXtension Service (DLXS) at http://www.dlxs.org/, accessed 1 April 2004.
2. Historic Pittsburgh at http://digital.library.pitt.edu/pittsburgh/, accessed 1 April 2004.
3. Since this presentation was made, the University of Pittsburgh at Greensburg (Pa.) and Chatham College (Pittsburgh, Pa.) have expressed interest in providing image content to this project.
Paper received 8 April 2004; accepted 15 April 2004.
Copyright ©2004, First Monday
Copyright ©2004, Edward A. Galloway
Imaging Pittsburgh: Creating a shared gateway to digital image collections of the Pittsburgh region by Edward A. Galloway
First Monday, Volume 9, Number 5 - 3 May 2004