Collaborative development of open content: A process model to unlock the potential for African universities by Derek Keats
Given the cost of content, the under-resourcing of universities and the scattered nature of expertise in Africa, the collaborative development of open content seems a useful way to obtain high-quality, locally relevant content to enhance teaching and learning. However, there is currently no published operational model to guide institutions or individuals in creating collaborative open content projects. This paper examines lessons learned from open source software development and uses these lessons to build the foundations of a process model for the collaborative development of open content.
Most mainstream higher education institutions (HEIs) in Africa are severely under-resourced in comparison to their counterparts in the developed world (Gauci, 2001; Nwuke, 2001). Given the financial limitations on most universities in Africa, and the low income levels supporting the students within them, it is worthwhile to explore alternative means of lowering costs and improving quality. One area where it is possible to drive costs down and improve quality at the same time is in the provision of content to support learning, the raw material upon which most undergraduate and many graduate courses are based. While content may not be the most important aspect of a course, it is difficult to imagine an undergraduate university course that does not make use of content in some form.
The cost of content materials, typically in the form of textbooks for undergraduate courses, often means that they are not equally accessible to all institutions and all students within them. Alternatively, it means that academics stretch the terms of fair use in order to provide photocopied "course packs" for their students to use in learning.
When we use textbooks in Africa that were developed in the U.S. or Europe, we obtain content that may not be locally relevant. Our purchases go to support the publishing industry in that part of the world, and contribute to our dependency on that industry. More importantly, this dependency means that African academics do not develop a strong tradition of authoring and publishing learning content, although of course there are some exceptions. Because of cost and unavailability factors, it is not uncommon for institutions to use out-of-date textbooks and older journal articles as learning content. This means that students may not be exposed to the latest ideas in their discipline of study.
As noted in the UNCTAD E-Commerce and Development Report 2002, publishing is one of the most important channels for disseminating knowledge, so improvements or expansions in publishing lead to further creation and dissemination of knowledge and, in turn, to increased economic and social development (UNCTAD, 2002). Although UNCTAD noted the poor state of the publishing industry in developing countries, it made no mention of open content as a potential knowledge creation and dissemination strategy for those countries, even though it did note the potential of an open source approach to software development. The potential of open content approaches for development still needs to be explored.
The cost of creating content by every academic in every HEI in Africa would be prohibitive, yet many academics do put effort into developing their own learning materials. It has been estimated that it costs US$25,000 to develop an online course in a U.S. university, and a further US$5,000 a year to maintain it (Nwuke, 2001), much of this presumably going into the creation of content for online delivery. Because salaries are lower, the cost of creating online course content in Africa will be lower than this, but if we take this amount for illustrative purposes then we can begin to show the benefits of the open content approach. First, however, it is necessary to understand the concept of open content and open content licensing.
The idea of open content has its background in the open source software movement, and can be considered a license agreement, a philosophy, a way of doing things, as well as the content produced and distributed according to the open content license agreement. As a philosophy, open content refers to the principle that content should be freely reusable so as to make knowledge available as common knowledge for the common good (see Newmarch, 2001). A key fundamental of open content licensing is that any object is freely available for modification, use, and redistribution with certain restrictions.
Open content licenses are legal instruments for promoting the free and open distribution of knowledge. The original OpenContent License (OPL), Version 1.0, 14 July 1998, does not stipulate how the nature of changes to OPL documents is to be made known. This and other problems with the OPL led to the development of the Open Publication License, which lacks an official acronym and was referred to as the OPbL by Keats and Shuttleworth (2003). This license first relieves the author(s) of any liability or implication of warranty. In a litigious world, this is important: it ensures that material can be distributed freely without the author having to worry that someone might imply a warranty or liability on the author's part. It grants everyone permission to use the content in whole or in part, and ensures that the original author will be properly credited when content is used. In other words, although you have the freedom to use the work, you may not claim that it is your own. This does not stop you from modifying the content and producing wholly or partially derived works, because the license grants others permission to modify and redistribute the content. It stipulates that they must clearly mark what changes have been made, when they were made, and who made them.
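The requirement to mark what was changed, when, and by whom lends itself to a simple record structure. The following is a minimal sketch in Python, not taken from the license text or from any existing tool; the names `ChangeRecord` and `describe` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ChangeRecord:
    """One marked modification to an open content work (hypothetical structure)."""
    what: str   # description of the change
    when: date  # date on which the change was made
    who: str    # person who made the change


def describe(original_author: str, changes: list[ChangeRecord]) -> str:
    """Build a notice that credits the original author and lists changes by date."""
    lines = [f"Original author: {original_author}"]
    for change in sorted(changes, key=lambda c: c.when):
        lines.append(f"{change.when.isoformat()}: {change.who} - {change.what}")
    return "\n".join(lines)
```

A derived work could carry the text returned by `describe` alongside the content, so that the credit to the original author and the history of modifications travel together.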
One of the most crucial aspects of the OPbL is that it ensures that if someone else bases a work on Open Content, the resultant work will be made available as Open Content as well. In other words, you may not take an Open Content work, produce a derived work, and then restrict the derived work or try to claim other than OPbL copyright for it. This feature of the license, commonly referred to as copyleft, ensures that improvements to an Open Content work remain freely available. Under the full license, anyone may distribute, reproduce, modify or even sell the original work, as well as work derived from it. However, clause VI (License options) presents two options to restrict the sale of printed versions of an item of content, or to prevent others from substantially modifying it.
Creative Commons promotes the innovative reuse of all sorts of intellectual works and offers the public a set of copyright licenses as well as a set of symbols akin to the © symbol to denote different aspects of open content licensing. As of 28 November 2002, the available licenses fall into four categories:
- Attribution. Permit others to copy, distribute, display, and perform the work and derivative works based upon it only if they give you credit.
- Noncommercial. Permit others to copy, distribute, display, and perform the work and derivative works based upon it only for noncommercial purposes.
- No Derivative Works. Permit others to copy, distribute, display and perform only verbatim copies of the work, not derivative works based upon it.
- Share Alike. Permit others to distribute derivative works only under a license identical to the license that governs your work.
This document is concerned mainly with open content developed with the share-alike license and the GNU copyleft clause that ensures derived content objects maintain the copyleft clause.
As a way of doing things, open content methodologies are not well understood, and there is no published model of methods of open content development. Therefore, the purpose of this paper is to develop such a model for use in guiding academic collaborations, as well as to provide guidelines for software development to support open content methodologies.
Examples of open content
A number of organizations are active in the open content area. OpenContentList is a site and mailing list covering the field of Open Content: emerging models of collaboratively-built content, from Weblogs to user-run encyclopedias to free media databases. The Open Content Network aims to be the world's largest content delivery network (CDN). Its Web site claims that users will soon be able to download open source and public domain software, movies, and music at incredibly fast speeds from this global, distributed network. The Open Content Network is based on peer-to-peer technology, called the "Content-Addressable Web", which enables advanced content location and distribution services for use with standard Web servers, caches, and browsers.
There are two open content online encyclopedia projects that are fairly mature, Nupedia and Wikipedia. Both projects were initiated by Larry Sanger, with Nupedia being a formal peer-reviewed project, and Wikipedia being much more open and less formal. Both Nupedia and Wikipedia were originated and are hosted by Bomis, a Web portal company. Nupedia has been under development for some time and still has a very limited number of articles. It relies completely on volunteers to create content, and articles are written by individuals rather than interacting teams of collaborators. Judging by the Web site, Nupedia has yet to achieve the momentum needed to make the project a success. Wikipedia started in January 2001 and as of November 2002 claimed to be working on 93,345 articles in the English version. Anonymous users are able to change content, but despite this there are some surprisingly good articles. Like Nupedia, Wikipedia relies completely on volunteers, but articles are more of a community effort since anyone can make corrections.
Andamooka hosts open content books for reading, annotation, and discussion. Nearly all books currently available on Andamooka are related to computer topics, mainly dealing with the Linux operating system and related software.
A model for open content in African universities
Learning from open source software development
Open source is a software engineering concept in which the source code for a software program is kept open and the software is freely distributable along with its source code. Anyone is free to modify the source code and change the program, as long as the resulting program is also freely distributable and modifiable (Drummond, 2000). However, like 'open content', 'open source' is more than a software engineering concept: it is a philosophy as well as a way of engaging in large-scale collaborations (Sandred, 2001; Williams, 2002). As a philosophy and a means of collaboration, open source principles can be applied as much to content as to software (Keats and Shuttleworth, 2003).
Open source software development methods are well-developed and reasonably well-studied processes (e.g. Crowston and Scozzi, 2002; Sandred, 2001; Stalder and Hirsch, 2002). We can use what is known about open source methods to develop a process model for open content development in the context of collaborating higher educational institutions. This process model can be used to guide practice in the creation of open content resources. Given that open content is an emerging area of collaborative work, it is likely that this model will be refined as we learn more through practice.
In conceptualizing this model, the most important lessons from open source software development were extracted from the literature, as well as from personal experience in open source software development (see Table 1). The literature reviewed below was used to construct the lessons presented in Table 1, which will guide the conceptualization of the open content process model.
Open source software projects behave as virtual organizations (Crowston and Scozzi, 2002), characterized by having a common interest or goal shared by its members, geographically dispersed membership, and the use of ICT to communicate and manage interdependent processes (Ahuja and Carley, 1998). Crowston and Scozzi (2002) applied 'competency rallying' (CR) theory to open source software projects hosted on SourceForge. CR theory suggests that four generic capabilities must be present for a virtual organization to succeed (Katzy and Crowston, 2000; Crowston and Scozzi, 2002):
- identification and development of individual competencies;
- identification of market opportunities;
- marshalling of competencies;
- management of a short-term co-operative effort.
The analysis by Crowston and Scozzi (2002) of data gathered from the SourceForge site provided empirical support for four propositions based on the CR theory capabilities. These capabilities and propositions serve as important lessons for the process model of open content development (Table 2).
Table 2: Generic capabilities of a successful virtual organization interpreted in terms of open content processes.
(based on Crowston and Scozzi, 2002)
| Capability | Proposition | Open content interpretation |
| --- | --- | --- |
| identification and development of individual competencies | The more available the required competencies, the more successful the open source software project will be | Open content projects should start in subject areas where there is a critical mass of available expertise; such projects will achieve more than those where expertise is limited |
| identification of market opportunities | The more readily developers can recognize the needs and problems addressed in the project, the more successful the open source software project | Open content authors should include as many of the content users as possible; learners should contribute to the identification of opportunities for improvements to content and related learning objects; ample opportunities for feedback should be provided |
| marshalling of competencies | The more quickly and accurately competencies can be marshaled, the more successful the open source software project | Open content projects will succeed better if they are led by a champion who is well-connected in the discipline for which content is being developed |
| management of a short-term co-operative effort | The greater the ability to manage short-term co-operation, the more successful the open source software project | Open content projects should "hit the ground running" under the leadership of a known and respected champion who understands the importance of communication to the open content process |
The organizational structure of open source projects tends to be flat, but a flat structure does not imply absence of leadership and responsibility. Rather, leaders in open source projects are able to provide leadership mainly because they have earned the respect of the team members (Sandred, 2001) based on known or perceived experience and expertise pertaining to the tasks of the project. Stalder and Hirsch (2002) noted that the power of the leader within the network is awarded by members of the group based on the leader's ability to convince others to follow and contribute. From experience, we know this to be true even when there is a person who has quasi-legal leadership responsibility (e.g., the official project manager). Project leaders may at times achieve results by virtue of the power of coercion, but often they achieve results simply by allowing people to have their input. The balance between these two approaches is crucial to success.
Any attempt to exercise power or status that arises outside the virtual organization can lead to a breakdown of the organization, the departure of those participants who are not paid members of the project team, and a consequent reduction in open source momentum. Forking (taking the code and going off on a separate branch) is an ever-present threat that helps keep the leader in line, as no leader would want to alienate the team to the point where a fork happens (Stalder and Hirsch, 2002).
According to Covey (1992), trust is the foundation for long term success in any endeavour. An open source network is dependent on the capacity of its members to trust in the abilities and integrity of one another (Hissam et al., 2002). In the absence of trust, an open source project will disintegrate, particularly when it involves voluntary and easily mobile participants (Sandred, 2001).
Configuration management is crucial to the success of open source software development (Asklund and Bendix, 2002). In open source projects, configuration management is typically done using a tool called the Concurrent Versions System (CVS). According to Asklund and Bendix (2002), configuration management is made up of a number of components that have a developer focus: build management, configuration selection, workspace management, concurrency control, change management, and release management. These processes correspond to analogous processes that can be applied to content.
One of the roles of the leader in an open source project is that of gatekeeper, a concept made known through the development of Linux by Linus Torvalds. Central to the Linux operating system is the body of code known as the kernel. For some time, Torvalds himself acted as gatekeeper on the Linux kernel. In recent years, programmer Alan Cox has performed a substantial part of this role by checking and incorporating bug fixes from developers around the world from his home in Wales (Boutin, 2001). As gatekeepers, Torvalds and Cox ensure the integrity of the code that goes into the Linux kernel, and they are thus key to quality control in the open source development process.
There are a number of risks inherent in open source software development, including code forking, team breakdown, and unpredictable delivery. It is important to understand these risks, and to build procedures into the open content model to reduce the likelihood of their occurrence and reduce their impact on content creation. Among these risks, the risk of forking is particularly likely to be a problem with content. This is so because there are more permutations and combinations of ideas in content than there can be of code in software. In software, badly written code is likely to fail, while poorly written content is still readable. Furthermore, there are relatively few content areas where everyone would agree, so mechanisms to cater for alternative viewpoints are likely to be important in open content processes.
A process model for open content
Key elements of a process model are processes, tools and people, and this is the approach taken in developing the process model for open content (Figure 1 and Table 3). An integrated suite of tools to facilitate these processes does not yet exist, so a broad conceptualization of such a suite is provided here.
Figure 1: People and processes involved in open content development.
Stage 1 (pre-development): The first stage of an open content project involves the predevelopment processes of planning, mobilizing people, and registration of the project (Figure 1 and Table 3). Planning will typically be initiated by a small group of people with a common interest, or even by an individual. This will be followed by mobilizing other interested parties to join in the content creation process, as well as further negotiation of the planning. Planning and mobilization may involve face-to-face meetings, but tools to support this process would include an asynchronous, threaded discussion forum that can be participated in by e-mail or via a Web interface, as well as synchronous tools such as instant messaging and chatrooms. The Web site would provide the common interface to these communicative processes, as well as access to the tools for project registration. The activities of a well-networked champion are likely to be important in mobilizing people during this stage.
Table 3: Processes, tools, and people in open content development.
Tools in italics do not exist at present, and an integrated tool to manage all processes has not yet been developed.
Note that it is assumed that auditing is part of all processes.
| Processes | Tools | People |
| --- | --- | --- |
| Planning | face-to-face meetings; e-mail-integrated, threaded discussion forum; instant messaging; chatrooms; Web site as common interface | project initiator or group |
| Mobilization | e-mail; e-mail-integrated, threaded discussion forum; instant messaging; chatrooms, Web site as common interface and marketing tool | project initiator or group, new members who join |
| Registration of open content project | content configuration management system based on WEB/DAV | project initiator or group |
| Local checkout | content configuration management system based on WEB/DAV, local client | each individual or work team |
| Develop | variety of authoring tools for HTML, images, video, audio, multimedia | each individual or work team |
| Assess | Web browser | authors, informed users |
| Modify | variety of authoring tools for HTML, images, video, audio, multimedia | each individual or work team |
| Check in | content configuration management system based on WEB/DAV, local client | each individual or work team |
| Automated view | content configuration management system based on WEB/DAV | automated process |
| Release | content configuration management system based on WEB/DAV, Web browser | consensus among authors, gatekeeper individual or workgroup |
| Communication | Fully threaded Web discussion forum with full e-mail integration and links to other processes; in-page discussion and feedback; integrated chatroom; integrated instant messaging | users, developers |
Stage 2 (content development): The second stage involves the ongoing processes of content development and refinement (Figure 1 and Table 3). Once the project has been registered, the development team is given access to a content repository that handles configuration management with capabilities similar to those of CVS in software development (Table 3).
Stage 3 (provision of live view): The third stage is automated by the software used to manage the repository, and provides a live view of content objects under development. This might be achieved, for example, by automatically copying the current version of documents to a Web site following their check-in by authors. This would be similar to running a cron job to update a Web site from a CVS repository in software development. An important part of this process will be obtaining feedback from users and other authors so that revisions can be made.
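The check-in, checkout, and roll-back behaviour that such a repository would borrow from CVS can be sketched in a few lines. This is a minimal in-memory illustration under stated assumptions, not a description of an existing tool: the class `ContentRepository` and its method names are hypothetical, and a real system would add persistent storage, access control, and concurrency handling.

```python
class ContentRepository:
    """In-memory version history for content objects (illustrative only)."""

    def __init__(self):
        # Maps a content object name to its list of (author, text) versions.
        self._history = {}

    def check_in(self, name, author, text):
        """Store a new version and return its 1-based version number."""
        versions = self._history.setdefault(name, [])
        versions.append((author, text))
        return len(versions)

    def check_out(self, name, version=None):
        """Return the text of the given version, or of the latest if none is given."""
        versions = self._history[name]
        index = len(versions) - 1 if version is None else version - 1
        if not 0 <= index < len(versions):
            raise ValueError(f"no version {version} of {name!r}")
        return versions[index][1]

    def roll_back(self, name, version, author):
        """Restore an earlier version by checking it in again as the newest one."""
        return self.check_in(name, author, self.check_out(name, version))
```

In this sketch a roll-back is recorded as a new version rather than by deleting later ones, so the full history, including the faulty revision, remains available for audit.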
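The automated copy step can be illustrated with a short Python sketch. It assumes that the checked-in documents are available as ordinary files in a working directory and that the live view is a directory served by a Web server; the function name `publish_live_view` and both directory arguments are hypothetical. Such a function could be called from a check-in hook or from a scheduled job.

```python
import shutil
from pathlib import Path


def publish_live_view(repository_dir, website_dir):
    """Copy every file under repository_dir into website_dir.

    Returns the relative paths that were copied, in sorted order.
    """
    source = Path(repository_dir)
    target = Path(website_dir)
    copied = []
    for path in sorted(source.rglob("*")):
        if path.is_file():
            relative = path.relative_to(source)
            destination = target / relative
            # Create any missing sub-directories before copying the file.
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, destination)
            copied.append(relative.as_posix())
    return copied
```

The sketch copies everything on each run for simplicity; copying only the documents changed since the last check-in would be a natural refinement.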
Stage 4 (release): The fourth stage is the release of content objects to the public for eventual incorporation into course curriculum (Figure 1 and Table 3). This process would again need to be automated by the software, but under the control of one or more gatekeepers who ensure that the content meets agreed standards before public release.
Another way to look at the open content approach is to follow the flow of events in a hypothetical project (Figure 2). An open content project would begin when someone gets an idea for a project, and puts together the initial team to work on it. The initial team could just be an individual, but would typically involve two or more individuals who have a common interest, such as teaching an introductory course on African art. This team would meet either physically or, more likely, using a synchronous chat tool such as Internet Relay Chat (IRC) or another real-time conferencing mechanism. Through this meeting, they would develop a rough project plan, and then register the content project on the content management server ("contentforge"). The content management server would provide facilities for collaborative content development and version control, similar to the facilities provided by SourceForge for open source software.
Figure 2: Flow diagram for a hypothetical open content development project.
Once the server has registered the project, the content developers would begin the process of content creation. When enough of the content is in place to be useful, content would be scheduled for release, which may involve a gatekeeper who is responsible for ensuring quality and standards are adhered to before release.
Communication is crucial in all stages of open content development, and it can take many forms, some destructive and others constructive. It is crucial that the tools provide appropriate communication mechanisms throughout the cycle. Communication may be hierarchical, brokered (structured gossip), multiple pair-wise (unstructured gossip) or open and technology-mediated (Figure 3). The hierarchical model resembles a military situation, and is inappropriate to a collaborative academic setting where there are no command and control mechanisms, although experience shows that we often have this model in our heads when we think of the structure of collaborative projects. Brokered communication occurs when communication passes through a coordinator or project leader (Figure 3), a situation that I have termed structured gossip. It is inappropriate for open content collaboration because experience shows that it leads to disconnected components (silos) and creates opportunities for blame, excuses and other destructive outcomes. Even worse for collaborative projects is multiple pair-wise or many-to-many communication among subsets of the whole group (Figure 3). I have termed this unstructured gossip, and experience with a number of collaborative projects that started off using this approach shows it to be a very destructive form of collaboration, often leading to the formation of virtual cliques and silos, as well as the creation of many opportunities for the destructive forces of blame, mistrust, and excuse. For collaborative projects involving open content creation, the best form of communication is one that is open and mediated by technology. This typically means asynchronous discussion forums that may include e-mail integration, as well as persistent synchronous chatrooms where all team members have access to the log of chats even if they were not participants.
Hence, tools designed to support open content authoring will need to incorporate these features.
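One rough way to compare the four models is to count how many separate conversations a team of a given size has to keep track of. The sketch below is an illustration only, not a measurement from any project: it assumes that the hierarchical model forms a tree and the brokered model a star around the coordinator, that the pair-wise model allows a private conversation between every pair of members, and that the open model uses one shared, logged forum. The function name `channel_count` is hypothetical.

```python
def channel_count(model, members):
    """Number of separate communication channels for a team of the given size."""
    if members < 2:
        return 0
    if model in ("hierarchical", "brokered"):
        # Each member talks only to a superior or to the coordinator.
        return members - 1
    if model == "pairwise":
        # Every pair of members may hold its own private conversation.
        return members * (members - 1) // 2
    if model == "open":
        # One shared, logged forum that every member can read.
        return 1
    raise ValueError(f"unknown model: {model}")
```

Under these assumptions a team of ten has up to 45 private pair-wise conversations, none of them visible to the whole group, against a single shared channel in the open model.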
Figure 3: Alternative communication models.
Once the content is released, a community of content users will develop around it, and some of these users will become new participants in the content creation process. To bring them into the project, the team might host another IRC meeting, after which the new participants would obtain authoring rights and contribute to the collaborative creation of content. Version control and management would ensure that if any problems are introduced, the content can always be rolled back to an earlier working version, just as is the case with software in CVS and other version control systems.
Figure 4: Graphical view of the suggested relationship between perceived value of open content and the degree to which developers have progressed towards an open process with an inclusive community of construction.
While it is possible to create these "communities of construction" for knowledge content, open content can also be made available by a centralized publishing process with little or no collaboration outside the originating organization (Table 4). One can imagine a continuum from centralized publishing to communities of construction, as well as the features of each extreme of the continuum (Table 4). We can hypothesize that users are likely to place greater value on content when they have been included in its planning and development (Figure 4), although the line may be less steep when the publishing organization is one of substantial international stature. What this hypothesis suggests in terms of the process model is that processes and tools must be available to support communication and the inclusion of content authors who do not form part of the original team.
Table 4: Characteristics of two extremes of open content creation and sharing.
|  | Centralized publishing characteristics | Community of construction characteristics |
| --- | --- | --- |
| Organizational structure | One or more experts who create content as part of a single organizational structure | One or more experts who create content as part of multiple organizational structures |
| Involvement of main users | Expected users are not involved in construction | Expected users are actively involved in construction |
| New development partners welcome | Exclusive | Inclusive |
| Examples | MIT Open Courseware | Webpedia, Wikipedia, AVOIR |
The economic benefits of the collaborative model of open content development stem from two inter-related processes, collaboration and reuse. When people with a common interest in different institutions collaborate in the creation of content, it stands to reason that as more people collaborate the costs per institution are reduced (Figure 5). As noted by Keats and Shuttleworth (2003) however, there will be some incremental costs of the collaboration as well as the integration of the content, so one cannot simply divide the total cost by the number of collaborators. Nevertheless, there should be substantial economic benefits of collaboration. Robinson (2002) called the collaborative approach "curriculum co-development" and documented some of its benefits even in the absence of open content license arrangements. If there is already a global repository of open content material in a particular subject area, the cost per institution will be even lower. Thus, as more institutions produce content and make it available under open content licensing, the costs of developing content go down even further (Figure 5).
Figure 5: Relationship between cost per institution, number of collaborators, and increasing availability of content.
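The cost reasoning above can be made concrete with a small calculation. The sketch below is a simplified illustration, not a validated cost model: it assumes that each additional collaborator adds a fixed coordination and integration cost, and that existing open content reduces the amount of new material to be written by a fixed fraction. The function name and the parameters `overhead_per_collaborator` and `reused_fraction` are hypothetical.

```python
def cost_per_institution(development_cost, collaborators,
                         overhead_per_collaborator=0.0, reused_fraction=0.0):
    """Estimate each institution's share of the cost of developing content.

    development_cost: cost for one institution working alone
    collaborators: number of institutions sharing the work (at least 1)
    overhead_per_collaborator: assumed extra cost for each additional partner
    reused_fraction: assumed share of content taken from existing open content
    """
    if collaborators < 1:
        raise ValueError("at least one institution is required")
    if not 0.0 <= reused_fraction <= 1.0:
        raise ValueError("reused_fraction must lie between 0 and 1")
    new_content_cost = development_cost * (1.0 - reused_fraction)
    coordination_cost = overhead_per_collaborator * (collaborators - 1)
    return (new_content_cost + coordination_cost) / collaborators
```

Taking the US$25,000 estimate cited earlier purely for illustration, five collaborating institutions with an assumed overhead of US$1,000 per additional partner would each carry US$5,800 rather than US$25,000; if half of the material could be drawn from an existing open content repository, the share would fall to US$3,300.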
While the process model presented here is not the only way for authors to create open content, other approaches will be difficult for institutions to sustain in the absence of sustained external funding. This is evidenced by the scarcity of good open content on the Web, and the relative scarcity of open content materials available even from dedicated Web sites such as Nupedia and others.
This model does not preclude other mechanisms of creating open content in certain circumstances. For example, informal associations of like-minded scholars or artists, an open content clause in works that have traditional copyright but which kicks in after a period of time, the non-collaborative approach of MIT's Open Courseware, and others are all valid approaches that can contribute to the knowledge commons under differing sets of circumstances. What I have attempted to present here is a model to enable small institutions with limited resources to collaborate and contribute to the knowledge commons, from which they can also draw content resources. The model includes assumptions and hypotheses that will become testable as more organizations become involved in open content development, so this model could act as a guide for future research on open content.
About the Author
Professor Derek Keats is Executive Director of Information & Communication Services at the University of the Western Cape in Bellville, South Africa.
Notes
1. Content, as used here, refers to any learning materials that can be read, viewed or listened to in the absence of a teacher (e.g. textbooks, journals, Web pages, video, television, radio, audiotape, multimedia package, etc.), as well as the materials used to support a learning interaction with such materials (e.g. study guide, study questions, self tests, essay assignment, worksheets, laboratory manuals, field exercises, etc.).
References
U. Asklund and L. Bendix, 2002. "A study of configuration management in open source software projects," IEE Proceedings Software, volume 149, pp. 40-46.
P. Boutin, 2001. "Lieutenant kernel," Wired, volume 9, number 2 (February), at http://www.wired.com/wired/archive/9.02/mustread.html?pg=7, accessed 31 July 2002.
S.R. Covey, 1992. Principle-centred leadership. London: Simon and Schuster.
K. Crowston and B. Scozzi, 2002. "Open source software projects as virtual organizations: Competency rallying for software development," IEE Proceedings Software, volume 149, pp. 3-17.
J.G. Drummond, 2000. "Open source software and documents: A literature and online resource review," at http://www.omar.org/opensource/litreview/index.html, accessed 25 February 2001.
A. Gauci, 2001. "Reforms in higher education and the use of information technology," Issues in Higher Education, Economic Growth, and Information Technology, Ad-hoc Expert Group Meeting (19-21 November), Nairobi, Kenya.
S.A. Hissam, D. Plakosh, and C.B. Weinstock, 2002. "Trust and vulnerability in open source software," IEE Proceedings Software, volume 149, pp. 47-51.
B. Katzy and K. Crowston, 2000. "A process theory of competency rallying in engineering projects," at http://crowston.syr.edu/papers/virtual-short.pdf, accessed 21 December 2002.
D.W. Keats and M. Shuttleworth, 2003. "Towards a view of knowledge as the common heritage of humanity: mapping an Open Content strategy," In: M.A. Beebe, K.K. Magloire, B. Oyeyinka, and M. Rao (editors). AfricaDotEdu: Higher education and IT opportunities. New Delhi: Tata McGraw-Hill.
J. Newmarch, 2001. "Lessons from open source: Intellectual property and courseware," First Monday, volume 6, number 6 (June), at http://firstmonday.org/issues/issue6_6/newmarch/, accessed 28 November 2002.
O.K. Nwuke, 2001. "Reforms in higher education and the use of information technology," Issues in Higher Education, Economic Growth, and Information Technology, Ad-hoc Expert Group Meeting (19-21 November), Nairobi, Kenya.
P. Robinson, 2002. "Curriculum co-development with African universities: Experiments in collaboration across two digital divides," Paper presented at the Conference on effective use of ICT to create a new environment for learning, teaching and research (29 July-1 August), UN International Conference Centre, Addis Ababa, Ethiopia.
J. Sandred, 2001. Managing open source projects: A Wiley tech brief. New York: Wiley Computer Publishing.
F. Stalder and J. Hirsch, 2002. "Open source intelligence," First Monday, volume 7, number 6 (June), at http://firstmonday.org/issues/issue7_6/stalder/, accessed 31 July 2002.
UNCTAD, 2002. E-Commerce and Development Report 2002. New York: United Nations Conference on Trade and Development.
S. Williams, 2002. "Free as in freedom: Richard Stallman's crusade for free software," at http://www.oreilly.com/openbook/freedom/, accessed 21 July 2002.
Paper received 9 January 2003; accepted 1 February 2003.
Copyright ©2003, First Monday
Copyright ©2003, Derek Keats
First Monday, volume 8, number 2 (February 2003).