OpenKey: Illinois-North Carolina Collaborative Environment for Botanical Resources
First Monday

by P. Bryan Heidorn and Lesley Deem

Contents

Related herbaria projects
Image-based description and identification
Knowledge driven imaging
Collection development
Testing and evaluation
Future work

 


 

In good weather, millions of nature enthusiasts migrate to nature areas or their back yards to observe the plants and animals that live there. Once there, many people wonder about the names and lives of the creatures around them. What is that flower? Where else does it grow? In spite of our increasingly urban lives, we build environmental and nature classes into K-12 education. These questions become serious science for botanists, entomologists and policy makers making decisions about land use. Most people identify plants and animals using a field guide, but anyone who has used one knows its limitations, both for identification and as a source of other information. In the OpenKey project we are building computer-based identification tools (keys) that will help people to identify plants and animals by selecting images and descriptions. Mobile technology will allow nature lovers to take their electronic nature library with them to their favorite nature area.

The Illinois-North Carolina OpenKey Project is revolutionizing access to botanical resources by making information about plant species more accessible and by simplifying and visualizing the process of identification. This project is supported by a partnership between botanists and information scientists at the University of Illinois at Urbana-Champaign (UIUC), the University of North Carolina at Chapel Hill (UNC-CH), the Illinois Natural History Survey and the North Carolina Botanical Garden. The project's major goals are to create new kinds of identification tools; provide online access to museum specimens; photograph and digitize images of plants in their natural environments; and test the systems with school children, citizen scientists and professional biologists.

++++++++++

Related herbaria projects

The digitization of holdings in natural history collections has become commonplace, at least among major institutions. Current examples include the New York Botanical Garden (NYBG) and Missouri Botanical Garden (MBG). The NYBG has digitized 82,000 of its 89,000 vascular plant type specimens (http://www.nybg.org/bsci/herbarium_imaging/). A type specimen is the reference specimen to which a species name is attached; it anchors the basic characteristics of that species. The MBG has a similar project, which includes the digitization of some rare books (http://www.mobot.org/MOBOT/Research/imaginglab/welcome.shtml). These projects generate high-quality digital scans of herbarium sheets and other components in their collections. Access is provided through specimen label data, which include the species name, collector, date, and location where the specimen was collected. Some of the uses of these collections are discussed by Schaub and Dunn (2002).

 

++++++++++

Image-based description and identification

One of the main uses of these collections, however, is the identification of new specimens, and label-based access offers no route to the properly named museum specimens unless you already know the name. OpenKey provides a solution to this issue by adding an identification tool as an access method. The OpenKey identification tool is a version of the Biological Information Browsing Environment (BIBE: http://www.biobrowser.org; Heidorn, 2001; Olsen et al., 1993), which is used as a polyclave (Figure 1).



Figure 1: BIBE weighted display.

A polyclave is also known as a multi-entry key when implemented on a computer. Important examples of computer-based keys include DELTA (Dallwitz, 2000; Dallwitz et al., 2002; Thiele, 2000) and Lucid (http://www.lucidcentral.com/). The polyclave derives its name from punch card-based identification systems in which holes in the cards represented characteristics of plants. A person trying to identify a plant would slide a rod through holes in cards to select the cards with each of the desired characteristics. We use the XML structure of our documents and the BIBE visualization system to the same effect, except that BIBE is more robust in the face of user errors. In BIBE a person can specify any number of characteristics independently, like a rod being slid through the holes that match a characteristic. A big difference is that BIBE uses selective springs rather than rods to connect to species that match a characteristic. A spring connects to a species only if the species matches the character set defined on the spring. Species descriptions (XML) that match one or more of these characteristics are returned to the user, so a single mistaken characteristic does not eliminate the correct species. Species (represented as piles of papers) that match only one characteristic are piled over that spring/characteristic. Species that match more than one spring/characteristic are pulled to the middle of the display, between the characteristics that match, as the springs each tug on the species. As in the real world, some springs are more powerful than others. Springs that represent characteristics that are rare in the collection are stronger and have more pull. The location of a species on the screen therefore shows how well it matches each spring/characteristic.
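The spring metaphor can be illustrated with a minimal Python sketch. It is not the BIBE implementation: the species data, the character-state strings, the logarithmic rarity weight and the function names are all assumptions made for illustration. Each selected characteristic is anchored at a screen position, and each matching species is placed at the strength-weighted average of the anchors it matches.

```python
import math

# Hypothetical collection: species name -> set of character states it exhibits.
SPECIES = {
    "species_a": {"leaf_shape:ovate", "leaf_arrangement:alternate"},
    "species_b": {"leaf_shape:ovate"},
    "species_c": {"leaf_shape:linear", "leaf_arrangement:alternate"},
    "species_d": {"leaf_shape:ovate", "margin:toothed"},
}


def spring_strength(characteristic, species):
    """Give rarer characteristics stronger springs (assumed log-based weight)."""
    matches = sum(1 for states in species.values() if characteristic in states)
    if matches == 0:
        return 0.0
    return 1.0 + math.log(len(species) / matches)


def layout(anchors, species):
    """Place each matching species at the weighted mean of its matching anchors.

    anchors maps a selected characteristic to its (x, y) screen position.
    Species that match no selected characteristic are left out of the result.
    """
    strengths = {c: spring_strength(c, species) for c in anchors}
    positions = {}
    for name, states in species.items():
        pulls = [(anchors[c], strengths[c]) for c in anchors if c in states]
        if not pulls:
            continue  # no spring attaches to this species
        total = sum(weight for _, weight in pulls)
        x = sum(point[0] * weight for point, weight in pulls) / total
        y = sum(point[1] * weight for point, weight in pulls) / total
        positions[name] = (x, y)
    return positions


if __name__ == "__main__":
    selected = {"leaf_shape:ovate": (0.0, 0.0), "margin:toothed": (1.0, 0.0)}
    for name, (x, y) in layout(selected, SPECIES).items():
        print(f"{name}: x={x:.2f}, y={y:.2f}")
```

In this sketch a species matching only the ovate-leaf spring sits directly on that anchor, while a species matching both springs lands between them, closer to the rarer (stronger) toothed-margin spring.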

For OpenKey, we enhanced BIBE to allow users to specify plant characteristics through image tables. With these tables, a person who is trying to identify a plant selects a collection of images that represent characteristics of the plant in question. A trivial example is a table of leaf shapes ranging from linear to ovate. For example, specifying a set of descriptive characteristics such as an image of an oval-shaped leaf, alternate leaf arrangement, bluntly toothed margins and "grows in Illinois" might match a wild mustard.
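As a rough illustration of how an image table could drive a query, the following Python sketch maps each selected image to a character state and ranks species by the number of states they match. The file names, the character vocabulary and the species records are hypothetical placeholders, not OpenKey data, and the simple match count stands in for BIBE's weighted display.

```python
# Hypothetical image table: image file -> (character, state) it depicts.
IMAGE_TABLE = {
    "leaf_oval.jpg": ("leaf shape", "oval"),
    "leaf_linear.jpg": ("leaf shape", "linear"),
    "arrangement_alternate.jpg": ("leaf arrangement", "alternate"),
    "arrangement_opposite.jpg": ("leaf arrangement", "opposite"),
    "margin_blunt_teeth.jpg": ("leaf margin", "bluntly toothed"),
}

# Illustrative species records only; these are not botanical data.
DESCRIPTIONS = {
    "wild mustard": {
        "leaf shape": "oval",
        "leaf arrangement": "alternate",
        "leaf margin": "bluntly toothed",
        "region": "Illinois",
    },
    "placeholder species X": {
        "leaf shape": "linear",
        "leaf arrangement": "alternate",
        "leaf margin": "entire",
        "region": "Illinois",
    },
    "placeholder species Y": {
        "leaf shape": "oval",
        "leaf arrangement": "opposite",
        "leaf margin": "entire",
        "region": "North Carolina",
    },
}


def rank_candidates(selected_images, region=None):
    """Rank species by how many selected character states they match.

    If two images for the same character are selected, the later one wins.
    Species that match nothing are dropped; partial matches are kept.
    """
    query = dict(IMAGE_TABLE[image] for image in selected_images)
    if region is not None:
        query["region"] = region
    ranked = []
    for name, description in DESCRIPTIONS.items():
        score = sum(
            1 for character, state in query.items()
            if description.get(character) == state
        )
        if score > 0:
            ranked.append((name, score))
    # Best match first; ties are broken alphabetically for a stable order.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


if __name__ == "__main__":
    picks = ["leaf_oval.jpg", "arrangement_alternate.jpg", "margin_blunt_teeth.jpg"]
    for name, score in rank_candidates(picks, region="Illinois"):
        print(name, score)
```

Because partial matches are retained rather than filtered out, a user who picks one wrong image still sees the intended species in the ranking, only lower down.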

 

++++++++++

Knowledge driven imaging

The inclusion of the images for these tables adds a whole new dimension to the digitization process that we can call "knowledge driven imaging." The professionals on the team not only scan herbarium specimens but also collect images and metadata about each distinguishing characteristic of the plant that can be used for identification. This requires the expert knowledge of the botanist along with the expert skills of a photographer. Sometimes the images of individual distinguishing characteristics must be made with living specimens to show the vibrant color or form of a flower or the environmental context of the plant's natural habitat. An example is the Anemone cylindrica (Candle Anemone) inflorescence illustrated in Figure 2. This image was created by the Illinois development team's botanist/photographer, Kenneth Robertson. Using this method we end up with a much richer depiction of the plant as a species in addition to a record of the individual specimens that serve as museum vouchers for the species. Each species and each specimen is associated with a dozen or more images reflecting critical characteristics of the plant.



Figure 2: Anemone cylindrica 0.5x, Prospect Cemetery prairie, Ford Co., Illinois, 30 October 2002.

One of the most innovative aspects of the project is the ability for botanists and key creators who are geographically distributed to coordinate their character lists. The idea is to create a centrally controlled vocabulary for species descriptions developed from the grass roots up, not from the canopy down. All descriptions and source code should be freely usable, with proper acknowledgement of the creator, as a small step toward the Biological Information Commons (Moritz, 2002). This coordination is managed using XML files. In the initial part of the project, coordination between UNC and UIUC researchers was through a long taxon-by-character matrix. These character lists are now created through Web forms that are available anywhere on the Internet. Botanists use these forms to create the lists of characters and character states that are valid to use in the description of any species, as well as the species descriptions themselves. The resulting XML files are used to transfer information between institutions in an easy-to-use fashion. They are also used to index species for searching with BIBE and to create species description pages (Figure 3).



Figure 3: XML creation, indexing and display.
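The role of the shared character list can be sketched in Python as follows. The XML element and attribute names here are invented for illustration, since the article does not give the OpenKey schema, and the species record is a placeholder. The sketch loads a controlled vocabulary of characters and states, then checks a species description against it, which is one way a description produced at one institution could be validated before being indexed at another.

```python
import xml.etree.ElementTree as ET

# Hypothetical character list; the element names are assumptions.
CHARACTER_LIST_XML = """
<characters>
  <character name="leaf shape">
    <state>linear</state>
    <state>ovate</state>
  </character>
  <character name="leaf arrangement">
    <state>alternate</state>
    <state>opposite</state>
  </character>
</characters>
"""

# Hypothetical species description; "whorled" is deliberately not in the list.
SPECIES_XML = """
<species name="placeholder species">
  <description character="leaf shape" state="ovate"/>
  <description character="leaf arrangement" state="whorled"/>
</species>
"""


def load_vocabulary(xml_text):
    """Return {character name: set of valid states} from the character list."""
    root = ET.fromstring(xml_text.strip())
    return {
        character.get("name"): {
            state.text.strip() for state in character.findall("state")
        }
        for character in root.findall("character")
    }


def check_description(xml_text, vocabulary):
    """Split a species description into accepted and rejected character states."""
    root = ET.fromstring(xml_text.strip())
    accepted, rejected = [], []
    for entry in root.findall("description"):
        pair = (entry.get("character"), entry.get("state"))
        if pair[1] in vocabulary.get(pair[0], set()):
            accepted.append(pair)
        else:
            rejected.append(pair)
    return root.get("name"), accepted, rejected


if __name__ == "__main__":
    vocabulary = load_vocabulary(CHARACTER_LIST_XML)
    name, accepted, rejected = check_description(SPECIES_XML, vocabulary)
    print(name, "accepted:", accepted, "rejected:", rejected)
```

The accepted pairs are the kind of data that could feed an index for searching, while rejected pairs flag terms that need to be added to the shared list or corrected in the description.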

 

++++++++++

Collection development

Our collection development strategy will center on critical ecosystems in two geographic regions. The data in the collection will include a comprehensive representation of species from these critical ecosystems: Midwestern prairies, Piedmont glades and prairies, Midwestern forests, and North Carolinian forests.

The species that we include in our collections are selected using the following criteria. Dominant species are those that have the highest biomass in the community; however, they may range over many communities. Characteristic species are those that are diagnostic for that habitat, even if not dominant. Rare species are important for monitoring since they are the most vulnerable to loss and have the greatest value to conservation areas. Bioindicator species are those that have the most significance for monitoring change. These changes can be negative, as in the invasion by foreign weeds or the damage to ozone-sensitive species, or they can be positive indicators of good habitat quality.

Unlike traditional herbarium digitization projects, which focus on digitizing rare specimens in the collection, we are selecting species based on their usefulness for conducting environmental monitoring. For example, we are digitizing prairie species in the following sequence:

  • First, all PrairieWatch species are scanned and serve as the starting point;
  • Then, other species that are not necessarily closely related taxonomically, but that could be confused with the target species, are scanned;
  • Then, those species that are closely related taxonomically to the PrairieWatch species are scanned;
  • Then, common species that will be encountered by volunteers in Illinois prairies are scanned; and,
  • Finally, species that are related to the previous set are scanned to the point where all the tall grass species are present in the database.
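The sequence above amounts to a tiered priority ordering, which can be sketched in Python as follows. The record fields and species names are hypothetical; the article does not describe how the project tracks these attributes, and the final tier is simplified here to "everything else".

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A species under consideration for scanning (fields are assumptions)."""
    name: str
    prairiewatch: bool = False             # a PrairieWatch target species
    confusable_with_target: bool = False   # could be mistaken for a target
    related_to_target: bool = False        # close taxonomic relative of a target
    common_in_prairies: bool = False       # likely to be met by volunteers


def tier(candidate):
    """Return the scanning tier (1 is scanned first) for a candidate."""
    if candidate.prairiewatch:
        return 1
    if candidate.confusable_with_target:
        return 2
    if candidate.related_to_target:
        return 3
    if candidate.common_in_prairies:
        return 4
    return 5  # remaining species that fill out the collection


def scan_order(candidates):
    """Sort candidates by tier, then by name for a reproducible order."""
    return sorted(candidates, key=lambda c: (tier(c), c.name))


if __name__ == "__main__":
    pool = [
        Candidate("species E"),
        Candidate("species D", common_in_prairies=True),
        Candidate("species C", related_to_target=True),
        Candidate("species B", confusable_with_target=True),
        Candidate("species A", prairiewatch=True),
    ]
    for candidate in scan_order(pool):
        print(tier(candidate), candidate.name)
```

A species that satisfies several criteria is assigned to the earliest tier it qualifies for, so it is scanned as soon as any of its roles makes it useful.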

Analogous procedures are used at the University of North Carolina for selecting their species. Some information is drawn from an ongoing project at the University of North Carolina, the Plant Information Center (Greenberg et al., 2000; Greenberg, 2001).

 

++++++++++

Testing and evaluation

Beginning in the summer of 2003, the system will be tested with a group of citizen scientists, teachers and students who conduct important environmental monitoring and education projects (PrairieWatch in Illinois and TreeWatch in North Carolina). In the lab, one group of volunteers will use BIBE to identify prairie plants while another uses PrairieWatch-provided paper-based keys and flash cards. We will then observe the use of the systems in a prairie plot where the subjects attempt to identify all PrairieWatch bioindicators.

The design of the project provides a framework that will foster scientific learning by citizen scientists who want to assist in documenting plant habits necessary for monitoring changes in the environment. The polyclave keys with data gathered by citizen scientists will be made available to other libraries and museums throughout the World Wide Web after development and testing.

 

++++++++++

Future work

The search facilities of BIBE/OpenKey are already being used in schools. In October 2002, over fifty teachers were trained in how to create species description pages and make digital images in the Illinois Schools' Flora and Fauna Online Project (http://www.siue.edu/OSME/river/). Hundreds of students have participated in creating plant species descriptions and placed them on the Web. The BIBE spider has indexed these descriptions and made them available through one search site using the BIBE visualization. The goal of this project is to create multiple species descriptions for all grade levels of all plants (and spiders) in Illinois ... and then the world.

 

About the Authors

P. Bryan Heidorn is Assistant Professor at the Graduate School of Library and Information Science at the University of Illinois at Urbana-Champaign.
E-mail: pheidorn@uiuc.edu

Lesley Deem is Research Associate at the Graduate School of Library and Information Science at the University of Illinois at Urbana-Champaign.
E-mail: l-deem@uiuc.edu

 

Acknowledgements

This work was funded by the Institute for Museum and Library Services and the National Science Foundation. Botanical and photographic assistance was provided by Kenneth Robertson, Illinois Natural History Survey. We also thank University of Illinois students Karen Medina, Heekyung Choi, Jing Zhao, Sharon Chow and Kanya Babu; University of North Carolina principal investigators Evelyn Daniel, Jane Greenberg and Peter White; and many others.

Direct comments to pheidorn@uiuc.edu.

 

References

M.J. Dallwitz, 2000. "A comparison of interactive identification programs," http://biodiversity.uno.edu/delta/www/comparison.htm.

M.J. Dallwitz, T.A. Paine, and E.J. Zurcher, 2002. "Interactive identification using the Internet," at http://biodiversity.uno.edu/delta/.

J. Greenberg, 2001. "Metadata applications for the Plant Information Center (PIC): A Web-based scientific learning center," Interactive Learning Environments, volume 9, number 3, pp. 291-313. http://dx.doi.org/10.1076/ilee.9.3.291.3570

J. Greenberg, E. Daniel, J. Massey, and P. White, 2000. "The Plant Information Center (PIC): A Web-based learning center for botanical study," Proceedings of WebNet 2000 World Conference on the WWW and Internet, San Antonio, Texas, 30 October-4 November 2000, AACE Association for the Advancement of Computing in Education, pp. 217-226; also at http://ils.unc.edu/daniel/PIC/.

P.B. Heidorn, 2001. "A tool for multipurpose use of online flora and fauna: The Biological Information Browsing Environment (BIBE)," First Monday, volume 6, number 2 (February), at http://www.firstmonday.org/issues/issue6_2/heidorn/. http://dx.doi.org/10.5210/fm.v6i2.835

T. Moritz, 2002. "Building the biodiversity commons," D-Lib Magazine, volume 8, number 6 (June), at http://www.dlib.org/dlib/june02/moritz/06moritz.html.

K.A. Olsen, R.R. Korfhage, K.M. Sochats, M.B. Spring, and J.G. Williams, 1993. "Visualization of a document collection: The VIBE system," Information Processing and Management, volume 29, number 1, pp. 69-81. http://dx.doi.org/10.1016/0306-4573(93)90024-8

M. Schaub and C.P. Dunn, 2002. "vPlants: A virtual herbarium of the Chicago region," First Monday, volume 7, number 5 (May), at http://www.firstmonday.org/issues/issue7_5/schaub/. http://dx.doi.org/10.5210/fm.v7i5.956

K. Thiele, 2000. "A critique of Dallwitz's 'A comparison of interactive identification programs'," http://biodiversity.uno.edu/delta/www/thiele.htm/


Editorial history

Paper received 6 May 2003; accepted 7 May 2003.



Copyright ©2003, First Monday

Copyright ©2003, P. Bryan Heidorn

Copyright ©2003, Lesley Deem

OpenKey: Illinois-North Carolina Collaborative Environment for Botanical Resources by P. Bryan Heidorn and Lesley Deem
First Monday, volume 8, number 5 (May 2003),
URL: http://firstmonday.org/issues/issue8_5/heidorn/index.html






© First Monday, 1995-2017. ISSN 1396-0466.