First Monday

Planning, implementing and managing online repositories: Lessons learned from the KnowGenesis Library
by Saurabh Kudesia

 

Abstract
Established in 2005, the KnowGenesis Online Library for Technical Communication (http://www.knowgenesis.org/tc) is India’s first online repository dedicated to accelerating knowledge sharing and promoting self–learning in the field of technical communication. The Library is free of cost and requires only a one–time free registration to access the available material. Its popularity is reflected in its growth: within a year of its launch, it attracted more than 24,000 visitors, gained more than 1,500 subscribers, and increased the volume of hosted content from a few documents to more than 2,000 important documents, presentations, tutorials and links.

The KnowGenesis (KG) Library presents a unique case for repository designers to study the complex design and implementation process that contributed to the stability and overall success rate of the online Library.

This paper shares the design and implementation challenges faced by the KnowGenesis team and presents the approach used to match user requirements with the Library design. Based on the lessons learned during the process, the paper also presents a specific set of guidelines and recommends methodologies that can provide critical assistance in developing and managing medium and large–scale repositories.

Contents

Overview
Background of the KnowGenesis Library
Design considerations
Salient features
Important lessons learned
Future work
Success so far

 


 

Overview

In the last few decades there has been a rapid expansion of research on digital networks. The education community has provided excellent opportunities to connect information sources electronically, resulting in emerging, universally accessible, digital libraries. Digital libraries are no longer restricted to collections of content gathered on behalf of users but are extending their scope as institutions provide a range of services in a digital environment (Borgman, 1999). A digital library service is thus an “assemblage of digital computing, storage, and communications machinery together with the software needed to reproduce, emulate, and extend the services provided by conventional libraries based on paper and other material means of collecting, storing, cataloguing, finding, and disseminating information” (Gladney, et al., 1994). In addition to digital collection and management tools, a digital library provides an environment to support the information life cycle (Duguid, 1997).

Digital libraries are based on a distributed technology environment (Fox, 1994) and require technology to link resources and to make information services available to end users (Association of Research Libraries [ARL], 1995). The easier and wider access to information enables digital libraries to extend their services to digital artefacts that cannot be represented or distributed in printed formats (ARL, 1995). These libraries also serve as communal repositories of knowledge, allowing users to define and guide the development of specific libraries (Wright, et al., 2002).

 

++++++++++

Background of the KnowGenesis Library

The trigger

When the Library was conceived, many Web sites catered to different aspects of technical communication, but very few met the overall learning requirements of technical writers. Some prominent online collections [1] provided comprehensive bibliographic data for the field but were restricted in their operation. Although the services of these collections were free, a user had to visit the referenced site to read the content. Content availability was frequently governed by the site hosting the content rather than by the collection itself, so some content was available only to subscribers of the parent site. There was no way to know whether a resource was available on the parent site without visiting it.

Major objectives

Propelled by the need to fill the gaps found during preliminary observations, and specifically to meet the requirements of a growing Indian technical writing community, the KnowGenesis Library was conceptualized with the aim to

  • Accelerate and promote self–learning through online support;
  • Establish a networked, non–centralized approach to search and discovery of resources;
  • Create a culture that fosters contribution to and use of the Library;
  • Bring together the expertise distributed throughout the technical writing community and convert it to forms that are shareable through the digital Library, for the benefit of all;
  • Develop tools for professional development and continuous learning aimed at addressing specific issues in technical writing education;
  • Share best practices, activities and methodologies within the community;
  • Foster synergistic relationships between contributors and users; and,
  • Provide impetus for similar initiatives within and outside the community.

Important considerations

One of the biggest challenges in library planning is to provide users with access to information that has been evaluated, organized and preserved in the most useful format (McMillan, 2000). Special attention was therefore given to design, construction, and management aspects of the KnowGenesis Library in order to fully utilize the distributed effort and energies of a broadly engaged community. Methodologies such as user–centered design (Norman and Draper, 1986), task–centered design (Gould, et al., 1991; Lewis and Rieman, 1993), and Participatory Design (PD) (Greenbaum and Kyng, 1991; Schuler and Namioka, 1993) were considered in order to involve users early in the design process and get their continuous and systematic feedback during and after the development of the Library.

Keeping user requirements in focus, major considerations were given to issues like

  • Affordability: keeping initial cost as low as possible;
  • Sustainability: minimal ongoing investment;
  • Repeatability: reusing the model to serve other requirements of the community;
  • Openness: minimizing software costs;
  • Compatibility: supporting a variety of document formats;
  • Commonality: building on a common denominator whose elements are readily accessible to subscribers; and,
  • Scalability: eventually serving more than 10,000 users.

 

++++++++++

Design considerations

Because the potential participants were distributed and most project communication and coordination took place over the Internet, a broad user–centric approach was adopted to cover the four major dimensions of data/collection, system/technology, users, and usage (Fuhr, et al., 2001). The initial framework was planned to serve different modules of the Library, such as

  • The underlying system and its components, including classical information retrieval evaluation methods and techniques as well as overall systems performance;
  • The interface and interaction level of the activities between the user and the system;
  • Support for different access and usage strategies;
  • Situational and contextual factors, such as organizational and group issues; and,
  • Design outcomes, with an emergent flavor, that can help to manage the unpredictable lines of inquiry being pursued in the broader community of participants (Wright, et al., 2002).

Technical requirements

Based on the initial research, the important technical requirements for Library development were identified and categorized as below:

  • Mechanism for data collection and dissemination: including content (partial/full, diversity, size and quality of the content) (Fuhr, et al., 2001);
  • Meta–content requirements: indexing, citation, level of detail, classification (Fischer, 2001a; 2001b);
  • Management: access rights, user management, document maintenance, growth requirements, ease of use, workflow; approval, consumption, synthesis and distribution of documents;
  • Technology: interface (user/admin), searching, printing, repository structure; supported document model and format;
  • Information access: uploading, access, modification, deletion; and,
  • Tools: minimum cost, ease of operation, availability of support documentation and online community.

Tool selection

To select the right tool for developing the Library, the immediate administrative and usage requirements were treated, alongside cost, as minimum requirements (see Table 1).

 

Table 1: Categorized minimum requirements for KG repository.
Built–in Applications | Ease of Use | Interoperability | Granular Privileges | Management
Database Reports | Friendly URLs | Content Syndication (RSS) | LDAP Authentication | Content Scheduling
Document Management | Server Page Language | FTP Support | Session Management | Inline Administration
Help Desk/Bug Reporting | Spell Checker | UTF–8 Support | Versioning | Online Administration
Link Management | Template Language | WAI Compliant | SSL Logins | Package Deployment
Mail Form | UI Levels | WebDAV Support | SSL Pages | Themes/Skins
Polls | Undo | XHTML Compliant | Support | Trash
Product Management | WYSIWYG Editor | Performance | Commercial Manuals | Web Statistics
Project Tracking | Flexibility | Advanced Caching | Commercial Support | Web–based Style/Template Management
Search Engine | Content Reuse | Page Caching | Commercial Training | Web–based Translation Management
Subscriptions | Extensible User Profiles | Security | Developer Community |
Syndicated Content (RSS) | Interface Localization | Content Approval | Online Help |
User Contributions | Metadata | E–mail Verification | |

 

Because of its rich feature set and the variety of ways it offers to communicate on the Web, Mambo Open Source was found suitable to satisfy the Library requirements.

 

Figure 1: KG library Architecture.
Source: http://www.websitesource.com/images/Mambo_Schema.jpg.

 

Mambo is a lightweight, efficient and sophisticated content management system that supports most of the tasks that content editors and site visitors care about, including [2]

  • A useful site structure and navigation system;
  • Allowing non–technical content editors to update content, add new pages or change navigation menu items;
  • Supporting a completely configurable graphic design, with flexibility to modify the Web site and CMS to match the exact requirements;
  • Facilitating internal work sharing;
  • Providing accessible sites, search engine optimization and human readable URLs; and,
  • Offering lots of plug–ins to support a wide range of common needs.

User participation planning

The initial requirements focused on developing the Library as a collaborative process, in which the users of the Library are the real contributors of content and can take on different roles and responsibilities according to development requirements and their individual interest and expertise.

Based on the nature of their engagement in the project, the participants are categorized into three primary groups: Administrators, Moderators and the community. The Administrators are responsible for policy guidance and for managing the core facets of the Library (services, users, collections and technology), including developing and operating its core infrastructure. The Moderators are volunteers who have shown interest in helping KnowGenesis with the day–to–day activities of the Library and its usage. The community includes individuals and institutions that have an interest in seeing the Library develop; they are involved as contributors based on their interests and expertise.

Two main hierarchies exist for User Groups: one for access to the front–end (to allow users to log into the Web site and view designated sections and pages) and one for back–end administration access (see Figure 2). In general, access provided to a parent group (like Registered) is inherited by a child group (like Author) unless specifically denied.
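The inheritance rule described above can be illustrated with a minimal sketch. This is not Mambo's implementation: the Group class and the permission names are hypothetical, and the chain Registered, Author, Editor, Publisher is assumed from the order in which the front–end groups are listed in Table 2.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Group:
    """A user group that inherits its parent's permissions unless denied."""
    name: str
    parent: Optional["Group"] = None
    granted: Set[str] = field(default_factory=set)  # added at this level
    denied: Set[str] = field(default_factory=set)   # explicitly denied at this level

    def permissions(self) -> Set[str]:
        # Start from whatever the parent group allows, add this group's own
        # grants, then remove anything specifically denied to this group.
        inherited = self.parent.permissions() if self.parent else set()
        return (inherited | self.granted) - self.denied


# Assumed front-end chain; permission names are illustrative only.
registered = Group("Registered", granted={"login", "view_registered"})
author = Group("Author", parent=registered, granted={"submit", "edit_own"})
editor = Group("Editor", parent=author, granted={"edit_any"})
publisher = Group("Publisher", parent=editor, granted={"publish"})

# A hypothetical child group that is denied one inherited permission.
guest_author = Group("Guest Author", parent=author, denied={"edit_own"})

if __name__ == "__main__":
    for group in (registered, author, editor, publisher, guest_author):
        print(group.name, sorted(group.permissions()))
```

Resolving permissions by walking up the parent chain keeps each group's definition small: a group records only what it adds or removes relative to its parent.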

Users in the Super Administrator group cannot be deleted and cannot be switched to another group.

 

Figure 2: Hierarchical access mechanism and membership provided in KG.

 

 

Table 2: Categorized membership provided in KG.
User Groups | Access privileges
Front End
Registered | These Users are able to log in to the front–end Web site. Additional information (sections and pages) may be available to a user once logged in.
Authors | These Users are given access to submit new content and edit their own content items/pages by logging into the front–end.
Editors | These Users are given access to submit and edit any content by logging into the front–end.
Publishers | These Users are given access to submit, edit and publish any content by logging into the front–end.
Back End
Manager | The Manager Group is generally restricted to matters of content creation.
Administrator | The Administrator Group has slightly restricted access to the back–end (Administrator) functions.
Super Administrator | The Super Administrator Group has access to all of the back–end (Administrator) functions and can perform the site’s Global Configuration.

 

Administrators can change a user’s access rights to reflect changes in that user’s roles and responsibilities.

 

Figure 3: KnowGenesis Library front–end.

 

 

Figure 4: KnowGenesis Library back–end.

 

 

++++++++++

Salient features

  • Security. Apart from role–based access, two additional access control parameters, Public and Registered, were assigned to content items, menu items, modules and components. Items published as Public can be viewed by any anonymous Web visitor, whereas anything assigned Registered access can be viewed or accessed only by a registered user of the Library who has logged into the Web site via the front–end.
  • User Registration. To register, visitors are prompted for a Name, Username, E–mail and Password. An e–mail containing an activation link is then sent to the address provided by the visitor. Clicking the activation link activates the account, after which the user is able to log in. This feature verifies that the visitor exists and has a valid e–mail address, lets users choose their own password at registration, and gives the Administrator a better overview of activated and non–activated accounts.
  • Content Uploading and Management. Any user created as Author, Editor, Publisher, Manager, Administrator or Super Administrator is considered a Special User and can submit news, articles, FAQs and Web links. Special Users are the only ones able to access an item with the ‘Special’ access parameter, so an entire content module may be hidden from any ‘Public’ or ‘Registered’ user by setting its access to ‘Special’. Uploaded content can be scheduled for publication on a specified date or made available for a specified duration. The Library also automatically picks the appropriate content items to show site visitors based on rules; for instance, the Library home page automatically displays only the most recent news stories, posts, uploaded content and events.
  • Online Editing and Saving. When a User edits a file, Mambo changes its status to “Checked Out” to prevent two Users from editing a document at the same time, thus preventing loss of data upon saving. The Administrator has the privilege to forcibly check in all checked–out files.
  • Archiving. The KnowGenesis Library supports archiving content according to pre–defined rules, with the additional facility of restoring or trashing archived content.
  • Language Management. The KnowGenesis Library supports a long list of languages for the core text on the front–end of the Library. Currently, only English has been enabled.
  • Messaging. The Messaging System facilitates workflow events and allows notes or messages to be sent to other Mambo Administrators. It has been customized to notify administrators of events such as the submission of new content. In addition, the Library can send a message by e–mail to one or more groups of Users.
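Two of the features listed above, scheduled publication and the “Checked Out” edit lock, can be sketched as follows. This is a simplified illustration under stated assumptions, not Mambo code: the ContentItem class, its field names and the user names are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class ContentItem:
    """Hypothetical content item with a publication window and an edit lock."""
    title: str
    publish_from: Optional[datetime] = None   # None: visible immediately
    publish_until: Optional[datetime] = None  # None: never expires
    checked_out_by: Optional[str] = None      # username holding the edit lock

    def is_visible(self, now: datetime) -> bool:
        # Visible only inside the [publish_from, publish_until) window.
        if self.publish_from is not None and now < self.publish_from:
            return False
        if self.publish_until is not None and now >= self.publish_until:
            return False
        return True

    def check_out(self, user: str) -> None:
        # Refuse a second editor while another user holds the lock.
        if self.checked_out_by not in (None, user):
            raise PermissionError(
                f"{self.title!r} is checked out by {self.checked_out_by}"
            )
        self.checked_out_by = user

    def check_in(self, user: str, is_admin: bool = False) -> None:
        # The lock holder releases the item; an administrator may force it.
        if self.checked_out_by is None:
            return
        if user != self.checked_out_by and not is_admin:
            raise PermissionError("only the lock holder or an administrator may check in")
        self.checked_out_by = None


if __name__ == "__main__":
    item = ContentItem(
        "Style guide",
        publish_from=datetime(2007, 1, 1),
        publish_until=datetime(2007, 2, 1),
    )
    print(item.is_visible(datetime(2007, 1, 15)))
    item.check_out("asha")
    item.check_in("site_admin", is_admin=True)  # forced check-in
```

Recording the lock holder on the item itself makes the forced check–in a single state change, which is why an administrator can release items that were left checked out.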

 

++++++++++

Important lessons learned

  • Working with a content management system. No single content management system (CMS) was found to be perfect during the tool assessment, and almost all systems required some modification to fit the requirements of KnowGenesis. It is therefore important to set library design priorities and to manage tradeoffs between time and other requirements such as features, stability and extensibility during the planning phase. While an open source CMS may have no initial cost, setting up the hardware, installing the system and configuring it all carry costs of their own. Evaluation of the browser and platform compatibility of the selected CMS, along with the programming skills required to set up and maintain the library, should therefore form an integral part of the technical requirement assessment.
  • Initial planning. The KnowGenesis Library has done a lot of things right, reflecting the time and analysis that went into initial planning. So far, the initiative has kept its focus on the principles of affordability, sustainability, repeatability, openness, compatibility, commonality, and scalability. Regular testing of Library features and services ensures that the Library’s performance meets user requirements.
  • Keep administrative requirements to a minimum. The KnowGenesis Library experience reflects that computer–mediated communication technologies (e–mail, news, Web forums) can provide vital assets to establish and support online communities. Solutions that minimize and simplify human involvement at all levels of the library’s operation and tools or approaches that allow users to diagnose their problems without the assistance of library administrators should be encouraged and form a core aspect of library design [3].
  • Balancing usability. Determining the quality of the actual user experience was a difficult task, as very few of the Library subscribers explicitly complained. It is therefore important for online repository designers to get explicit feedback about the quality of a given service, or to track user behavior, in order to get a real picture. Different methodologies and channels should be used to understand usability. Regular feedback through polls and e–mail discussion is among the methods that KnowGenesis uses to hear from its users.
  • Ensuring social facilitation. The KG framework experimented with intertwining social processes and technical artifacts to develop community–driven designs (Scharff, 2002). One major challenge was to manage the large amount of work required to support the social facilitation of a diverse set of participants, and to encourage them to get involved in the design, development and testing of the Library. Providing regular encouragement and due credit for individual contributions on different platforms is important in encouraging further participation.
  • Securing content distribution. From a service perspective, it is indeed challenging to manage secure distribution of content into and out of the library environment and to keep it secure from intruders. The KG Library has several checkpoints to ensure that content is safe in all aspects and authentication is required at all levels to manage the regular operations.

 

++++++++++

Future work

To ensure that the Library keeps pace with growing user requirements and makes the best use of future technical developments, further enhancements, features and facilities have been identified for incorporation (see Table 3).

 

Table 3: Planned future enhancements for the KG repository.
Applications | Guest Book | Security | Performance
Blog | Job Postings | Audit Trail | Database Replication
Chat | My Page | Login History |
Classifieds | Newsletter | Pluggable Authentication |
Contact | Photo Gallery | Support |
Management | Surveys | Professional Hosting |
Data Entry | Tests/Quizzes | Professional Services |
Discussion/Forum | Flexibility | Public Forum |
Events Calendar | Multilingual Content | Public Mailing List | Legends
FAQ Management | Multilingual Content Integration | Third–Party Developers | Initiated
File Distribution | Multi–Site Deployment | User Conference | Near completion
Graphs and Charts | URL Rewriting | Ease of Use | Completed
Groupware | Wiki Aware | E–mail to Discussion | Due

 

While some of the planned enhancements have already been implemented, most of them are in the final stage, waiting to be integrated into the Library.

 

++++++++++

Success so far

Within a year of its establishment, the KG community has come together to support the specific educational needs of technical writing. The community has changed fundamentally as it has explored new ways of sharing information, tools and services.

Partnering with projects focused on improving education provides a scalable model for addressing the social aspects of library building. Within a year of its inception and with the efforts and support of volunteers, the KnowGenesis Library is now home to more than 4,000 important documents, presentations, case studies, papers, and other items of interest to the community. The total volume of content is in excess of eight GB and there are over 2,000 subscribers [4]. The Library is also earning support from international organizations [5]. It is actively engaged in promoting fresh concepts and ideas to break new ground in collaboration for the community. While the Library has been busy developing itself as an important resource for technical writers (Bhange, 2006), international use is growing, with approximately one–third of traffic coming from outside India.

 

About the author

Saurabh Kudesia is deputy director of the Technical Communication Department at UTStarcom Inc., a world leader in providing IP–based telecommunications equipment and solutions. The majority of his team members are in China, with the rest in the U.S., Canada, and India.

He has over nine years of professional experience as a writer, editor, columnist, and evaluator, and has proven expertise in handling complex documentation projects in various domains for a global audience. He has successfully completed more than 22 documentation projects and eight medium/large size knowledge bases for IT/ITES, BPO, and product–based companies/clients in the U.S., U.K., Canada, China, Italy, Philippines, Taiwan, Japan, South Africa, and India. He has experience in developing, implementing, and managing cost–effective document development life cycles and process flows to increase team efficiency, deliverable quality, and customer satisfaction.

He received the Best Project Adherence Award from the CEO, HSBC Global Technology Center (India), for his unique methodological approach towards documentation development. He recently contributed to and reviewed ISO/IEC 26512 (standards for acquirers and suppliers of user documentation).

He is the associate editor of Directives, a newsletter published by the Society for Technical Communication’s (STC) Management Special Interest Group (SIG), and co–founder and editor–in–chief of the KnowGenesis International Journal for Technical Communication (IJTC), India’s first online journal for technical communication (www.knowgenesis.net).

He is a member of the Society for Technical Communication (STC) and a member of the Publicity and Communications Committee, HCI–Hyderabad, India.

 

Notes

1. See http://tc.eserver.org/about/.

2. For more information, see the Mambo Administrator Manual at http://help.mamboserver.com/index.php?option=com_content&task=category&sectionid=16&id=101&Itemid=121.

3. Several such approaches are given by Merwe, et al., 2003.

4. As of 23 January 2007.

5. See KnowGenesis proud to be associated with Digital Curation Centre (DCC) UK, available at http://www.knowgenesis.org/tc/index.php?option=content&task=view&id=133&Itemid=2.

 

Acknowledgments

The success of the KnowGenesis Library is shared by individual contributors whose continuous involvement in various roles and capacities provides critical support to the Library. KnowGenesis’ biggest asset is its community. Within that community are those who search for and upload hundreds of reference documents, original articles, presentations and tutorials that have helped to make the KG Library a truly great open source project.

 

References

Association of Research Libraries (ARL), 1995. “Definition and purpose of a digital library,” at http://www.ifla.org/documents/libraries/net/arl-dlib.txt, accessed 10 January 2007.

C. Bhange, 2006. “Virtual and digital libraries,” at http://eprints.rclis.org/archive/00007987/02/virtual_and_digital_library.pdf, accessed 10 January 2007.

C.L. Borgman, 1999. “What are digital libraries? Competing visions,” Information Processing and Management, volume 35, number 3, pp. 227–243.

P. Duguid, 1997. “Report of the Santa Fe Planning Workshop on Distributed Knowledge Work Environments: Digital Libraries,” at http://www.si.umich.edu/SantaFe/, accessed 10 January 2007.

E.A. Fox, 1994. Source book on digital libraries. Blacksburg, Va.: Computer Science Dept., Virginia Tech.

G. Fischer, 2001a. “Communities of interest: Learning through the interaction of multiple knowledge systems,” Proceedings of the 24th IRIS Conference (Bergen, Norway) at http://l3d.cs.colorado.edu/~gerhard/papers/iris24.pdf, accessed 20 February 2008.

G. Fischer, 2001b. “External and sharable artifacts as sources for social creativity in communities of interest,” Proceedings of the Fifth International Roundtable Conference on Computational and Cognitive Models of Creative Design (Heron Island, Australia).

N. Fuhr, P. Hansen, M. Mabe, A. Micsik and I. Sølvberg, 2001. “Digital libraries: A generic classification and evaluation scheme,” Proceedings, ECDL 2001, Fifth European Conference on Research and Advanced Technology for Digital Libraries (Darmstadt, Germany, 4–9 September), Lecture Notes in Computer Science, number 2163, pp. 187–199, and at http://www.is.informatik.uni-duisburg.de/bib/pdf/ir/Fuhr_etal:01.pdf, accessed 20 February 2008.

H.M. Gladney, N.J. Belkin, Z. Ahmed, E.A. Fox, R. Ashany and M. Zemankova, 1994. “Digital library: Gross structure and requirements (Report from a workshop),” IBM Research Report, RJ 9840, at http://www.ifla.org/documents/libraries/net/rj9840.pdf, accessed 20 February 2008.

J.D. Gould, S.J. Boies and C. Lewis, 1991. “Making usable, useful, productivity–enhancing computer applications,” Communications of the ACM, volume 34, number 1, pp. 74–85. http://dx.doi.org/10.1145/99977.99993

J. Greenbaum and M. Kyng (editors), 1991. Design at work: Cooperative design of computer systems. Hillsdale, N.J.: L. Erlbaum Associates.

C. Lewis and J. Rieman, 1993. “Task–centered user interface design: A practical introduction,” at http://www.hcibib.org/tcuid/, accessed 10 January 2007.

Mambo Administrator Manual, at http://help.mamboserver.com/index.php?option=com_content&task=category&sectionid=16&id=101&Itemid=121, accessed 10 January 2007.

G. McMillan, 2000. “The digital library: Without a soul can it be a library?” Proceedings of the Tenth VALA Biennial Conference (Melbourne, Australia), at http://www.vala.org.au/vala2000/2000pdf/McMillan.PDF, accessed 10 January 2007.

J. van der Merwe, P. Gausman, C.D. Cranor and R. Akhmarov, 2003. “Design, implementation and operation of a large enterprise content distribution network,” Proceedings of the Eighth International Workshop on Web Content Caching and Distribution, at http://www.research.att.com/~kobus/docs/wcw2003.pdf, accessed 10 January 2007.

D.A. Norman and S.W. Draper (editors), 1986. User–centered system design: New perspectives on human–computer interaction. Hillsdale, N.J.: L. Erlbaum Associates.

E.D. Scharff, 2002. “Open source: A conceptual framework for collaborative artifact and knowledge construction,” Ph.D. Thesis, Department of Computer Science, University of Colorado, Boulder, at http://www.cs.colorado.edu/events/defenses/2001-2002/scharff.html, accessed 20 February 2008.

D. Schuler and A. Namioka (editors), 1993. Participatory design: Principles and practices. Hillsdale, N.J.: L. Erlbaum Associates.

M. Wright, M. Marlino and T. Sumner, 2002. “Meta–design of a community digital library,” D–Lib Magazine, volume 8, number 5, at http://www.dlib.org/dlib/may02/wright/05wright.html, accessed 10 January 2007.

 


Editorial history

Paper received 3 May 2007; accepted 25 January 2008.


Copyright © 2008, First Monday.

Copyright © 2008, Saurabh Kudesia.

First Monday, Volume 13, Number 2 - 4 February 2008
http://firstmonday.org/ojs/index.php/fm/article/view/2130/1941






© First Monday, 1995-2017. ISSN 1396-0466.