The Second-Level Digital Divide of the Web and Its Impact on Journalism by Wiebke Loosen
The so-called digital divide is one of the most important issues in connection with the increasing development and distribution of the Internet and its technologies. Assumptions concerning the effects on social, economic and educational development are based on certain ideas about the technical principles of the organisation of the Web.
The paper discusses the fundamental ideas of the Web such as openness, freedom, and equality, and analyzes their scope under the increasing economic and technical influences which define the World Wide Web infrastructure and its potentials. This includes a comparison of specific functions of different search engines within a fast-growing search industry that may be responsible for a distortion of Internet content and for a certain mode of Web traffic. Equally important are efforts at self-regulation and the monitoring of content under these circumstances. This discussion will focus on the varied economic and technical aspects that strongly influence the quality of the Web - a phenomenon which can be called a 'second-level digital divide'. It includes an analysis of relevant aspects in conjunction with online journalism and its role relative to this new situation.
Social conditions and consequences of information technologies cannot be described in terms of economic imperatives or technological determinism alone. Society and information technology are co-evolving and changing one another. This is one of the main perspectives developed out of recent approaches to technology assessment, technological change and the sociology of technology (Ropohl, 2001; Baron, 1995; Rammert, 1990; Weingart, 1989). Technology in this sense is a socially constructed process throughout its whole life cycle, starting with its invention, development and production and finally culminating in its utilization.
Thus, social change and development are neither simply technology-determined, nor is the evolution of technology just an effect of political, economic or military requirements. Information technologies nevertheless have very different social implications, with communication and media technologies creating the foundations for certain kinds of utilization and implementation in society. This means "information technology goes social when the exposure of its output makes a transition from individuals or small groups to large numbers of interconnected users" (Hearst, 1999).
The social effects of communication and information technologies are most important for understanding the digital divide. In a sense, this is what the digital divide is basically about: it is one of the most important subjects in connection with the increasing development and distribution of the Internet and its technologies. Even though the digital divide is a major topic, research on it should avoid focusing exclusively on inequality of access to the Internet. Inequality is a multidimensional variable, which shows that the organization of technology is influenced by many determinants, which in turn also lead to a divide inherent in the Web.
Any analysis of technology's organization and social effects should be combined with a detailed look at the particular kinds of technological principles governing the Web, their implications, possibilities and restrictions, and how all of this is compatible with the idea of free information access. Some of these factors have social consequences that are leading to a phenomenon we can call a 'second-level' digital divide, which is dividing the Web itself.
Due to these effects, even those who are not confronted with the problem of access to and utilization of new communication and information technologies have to take into account that access is regulated and influenced by different determining factors. These factors concern the organization of technology, which in turn reduces the degrees of freedom on the Web and makes it more predictable.
There are a variety of economic and technological constraints which restrict the Web's infrastructure and its fundamental ideals such as openness, freedom and equality. The purpose of this paper is to analyze these increasing economic and technological influences and their unintended consequences, which define the Web's infrastructure and its potential. This also includes an analysis of the relevant aspects and potential effects in conjunction with journalism and its role within a fast-changing communication situation.
Assumptions concerning the effects of the Internet on social, economic and educational development are based on what is generally regarded as the promises of the Internet and on its (technological) infrastructure. In particular:
- fast and easy access to almost unrestricted masses of information;
- breaking of restrictions concerning the communicator-recipient relationship;
- new possibilities of interactivity; and, overall
- its positive impact on democratization and participation for everyone who has access to these communication technologies.
These promises seem to be something like absolute terms in media evolution which emerge with every developing 'new medium' and which also routinely lead to differentiated demands on journalism. In contrast to these idealizations, media development has always been accompanied by more or less rapid processes of commercialization and institutionalization, which is why the promise of democratization has never been fulfilled in the expected way.
Obviously, similar processes can be observed with regard to the Web's promises, which are - implicitly or explicitly - based on specific assumptions about the technical principles of its infrastructure and functions:
- equality of content existing in parallel;
- almost unrestricted possibilities of accessing, archiving and documenting information; and,
- the principle of hypertext that hypothetically connects everything with everything else.
Above all, these principles imply an increasing complexity of structures and content. As information becomes more abundant, attention becomes scarce (Franck, 1998; Goldhaber, 1997a; Goldhaber, 1997b), and strategies to reduce complexity are created.
Although an entire discussion of the technical details of the Web is beyond the scope of this paper, I will highlight some aspects - especially those concerning interconnectivity and search engines - that I consider relevant in the context of the second-level digital divide.
Access to information on the Web is restricted in specific ways, even though there are high degrees of freedom relative to date, time and intensity of use. The hypertext principle and the linking of Web documents have a strong impact on the Web's infrastructure and therefore on information access. "A link indicates the implicit presence of other documents and the ability to reach them instantly". But interconnectivity within the Web varies. According to a study of 203 million Web pages conducted at the IBM Almaden Research Center, about 90 percent of all pages are connected to each other in some way, but only about 25 percent belong to a "strongly connected" core with many in-links (links pointing to a page) and out-links (links from a page to other pages) (Broder et al., 2000).
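The notions of in-links, out-links and a strongly connected core can be illustrated with a small directed-graph sketch in Python; the page names and link structure below are invented for illustration and do not come from the study:

```python
from collections import defaultdict

# Toy Web graph: each page lists the pages it links to (invented names).
links = {
    "a": ["b"], "b": ["c"], "c": ["a", "d"],  # a, b, c link in a cycle
    "d": [],                                   # d has in-links but no out-links
    "e": ["a"],                                # e points in, nothing points back
}

def sccs(graph):
    """Kosaraju's algorithm: strongly connected components of a digraph."""
    order, seen = [], set()
    def dfs(u):                       # first pass: record finishing order
        seen.add(u)
        for v in graph.get(u, []):
            if v not in seen:
                dfs(v)
        order.append(u)
    for u in graph:
        if u not in seen:
            dfs(u)
    rev = defaultdict(list)           # second pass: explore the reversed graph
    for u, vs in graph.items():
        for v in vs:
            rev[v].append(u)
    comps, seen = [], set()
    for u in reversed(order):
        if u not in seen:
            stack, comp = [u], []
            seen.add(u)
            while stack:
                x = stack.pop()
                comp.append(x)
                for y in rev[x]:
                    if y not in seen:
                        seen.add(y)
                        stack.append(y)
            comps.append(comp)
    return comps

# Only a, b and c are mutually reachable, so they form the "core".
print(sorted(sorted(c) for c in sccs(links)))  # → [['a', 'b', 'c'], ['d'], ['e']]
```

Pages like "d" and "e" are reachable from, or can reach, the core, but are not part of it - the pattern the Broder et al. "bow-tie" study found at Web scale.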
The power of search engines
The link-based Web structure is extremely relevant for the work of search engines. They interpret connectivity (among other indicators) as a sign of popularity and utility within their search and ranking algorithms, and in doing so simultaneously reinforce that popularity. Search engines constitute a powerful gateway to access and accessibility within the Web, and their importance cannot be overestimated.
The following short description of their functions points out how strongly access to the Web is preconfigured by search engines and their particular techniques for determining importance and relevance (Cho, Garcia-Molina, and Page, 1998; Loosen, 1999; Introna and Nissenbaum, 2000). The following metrics can be combined in various ways and are used differently by different search engines, depending on their particular search algorithms:
Table 1: Importance Metrics
Source: Cho, Garcia-Molina, and Page, 1998.
Similarity to a driving query: A query drives the crawling process, and the importance of a page is defined as the textual similarity between the page and the query.
Backlink count: The importance of a page is the number of links to the page that appear over the entire Web.
Page Rank: The Page Rank backlink metric recursively defines the importance of a page as the weighted sum of the backlinks to it. Thus, a link from the Yahoo home page is assumed to be more important than a link from some person's home page.
Location metric: The importance of a page is a function of its location, not of its contents. For example, URLs ending with ".com" may be deemed more useful than URLs with other endings.
This table gives a brief impression of how search engines index and rank documents and of how selective this process is. It is by no means an objective reflection of the Web and its documents.
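The recursive logic behind the Page Rank metric in Table 1 can be sketched in a few lines of Python. The link graph, damping factor and iteration count below are illustrative assumptions, not the parameters of any actual engine:

```python
# Minimal PageRank sketch via power iteration; the link graph is invented.
links = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}

def pagerank(graph, damping=0.85, iters=50):
    """Each page's rank is the weighted sum of the ranks flowing in
    through its backlinks, plus a small uniform 'teleport' share."""
    nodes = list(graph)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        new = {n: (1 - damping) / len(nodes) for n in nodes}
        for u, outs in graph.items():
            if outs:
                share = rank[u] / len(outs)   # rank split over out-links
                for v in outs:
                    new[v] += damping * share
            else:                             # dangling page: spread evenly
                for v in nodes:
                    new[v] += damping * rank[u] / len(nodes)
        rank = new
    return rank

r = pagerank(links)
# C collects links from A, B and D, so it ends up with the highest rank -
# popularity in the link structure is converted directly into visibility.
print(max(r, key=r.get))  # → C
```

The sketch makes the self-reinforcing character of the metric visible: a page is important because important pages link to it, which is exactly why connectivity-based ranking rewards already-popular sites.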
This is one of the main reasons why Web pages are not 'equal.' The chances of being found by a search engine, and therefore by the user, vary considerably. But being found on the Web is extremely important for economic survival. This is why information retrieval and attempts to 'outwit' search algorithms have given rise to a whole 'search industry' with innumerable tools and applications to improve search engine rank positions (Loosen, 1999). The selling of rank positions by search engines exacerbates these problems. Today it is commonplace to buy your way up to the top of search engines' rankings without even "doctoring the content" of documents. A number of such offers are available on almost all major search engines: e.g. banner ads, content deals, paid placement, paid inclusion and paid submission.
Due to these facts, interconnectivity is not only a technical function, but a powerful strategic tool with important consequences for information access, modes of communication and forms of content. Web sites are designed and planned; to link to other pages is therefore also a communicative decision, which gives the Web structure a special kind of communicative nature (Jackson, 1997). And this seems to lead to a so-called "winner-take-all" market or a kind of "rich-get-richer" phenomenon, meaning that the bulk of Web traffic takes place on very few sites (Adamic and Huberman, 2000).
Potential effects of content rating
Worldwide, there are more or less massive attempts to regulate access to the Web and to control Web content (Freedom House, 2001). Less palpable are kinds of 'everyday censorship' that may be caused by industry self-regulation and the rating of Web content.
A general-purpose system for rating the content of Web pages is the Platform for Internet Content Selection (PICS), developed by the World Wide Web Consortium (http://www.w3.org). It was originally designed to control what children access on the Internet. Web publishers describe the content of their Web pages using PICS filtering categories (e.g. language used on a site, violence, nudity and sexual content) and voluntarily rate their pages with PICS-compliant rating systems, which provide them with content labels that are integrated into the HTML code of the page. PICS is already incorporated into Netscape Navigator and Microsoft's Internet Explorer, which use it - if enabled - when reading the description of a page. All of this allows the filtering of pages to prevent a Web user from viewing certain sites and content.
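The mechanics of label-based filtering can be sketched as follows. The label syntax here is a deliberately simplified stand-in - single-letter categories with numeric levels - and not the full PICS-1.1 label format, which carries a rating-service URL and further options:

```python
import re

# Simplified sketch of label-based filtering in the spirit of PICS.
# Real PICS-1.1 labels are richer; here a label is just category/level pairs.

def parse_label(content):
    """Extract category letters and numeric levels, e.g. 'v 2 s 0 n 0 l 0'."""
    return {cat: int(val) for cat, val in re.findall(r"([a-z])\s+(\d+)", content)}

def allowed(label, limits):
    """A page passes only if every rated category stays within the limits."""
    ratings = parse_label(label)
    return all(ratings.get(cat, 0) <= maximum for cat, maximum in limits.items())

# Hypothetical label: violence (v) level 2 exceeds the filter's limit of 1.
label = "v 2 s 0 n 0 l 0"
print(allowed(label, {"v": 1, "s": 1, "n": 1, "l": 1}))  # → False
```

The decisive point for the argument above is that the filtering decision happens entirely in the client, driven by whatever labels publishers (or third-party rating services) attach - the user only sees what passes.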
If search engines indexed rated and labeled sites only (Gruhler, 1998), the user would have almost no chance to bypass those measures. Even more importantly, rating and filtering systems could in fact facilitate government censorship. Some critics and citizens' groups see such systems as the "most effective censorship technology ever designed", leading to a Web that is more regulated and mainstream.
The distinction between technological and economic influences on the Web can only be an analytical one, because these processes are deeply related and often operate simultaneously. Nevertheless, there are some economic, rather than technological, indicators that illustrate the second-level digital divide. I will focus on examples relating economics and journalism, since this is a logical conjunction of content and business. In turn this affects journalism as a profession and its impact on society in particular.
The digitalization of all data has often been described as the trigger for various forms of media convergence in terms of technical, functional, economic, regulative and receptive processes. The digitalization of content is of enormous economic interest, as it offers additional ways of information distribution and further forms of product diversification, which allow a differentiated approach to increasingly segmented target groups.
The additional distribution of content via online media can be used for cross-media marketing, to reach new audiences and for the transfer of credibility (Scheufele, 1999) from the corresponding traditional 'mother medium' to online media brands. All in all, this increased capacity puts pressure on media organizations to make multiple uses of content and other resources, which simultaneously requires a high degree of standardization of content processing.
Nevertheless, it is not yet the predominant aim to make all media content available via computer networks. Rather, content is made available through different kinds of media types. The content is therefore re-differentiated again, in order to serve different strategies and preferences of media use and to secure all media types as sources of revenue.
It is this convergence of different media modalities that secures the survival of specific structures and performances of certain kinds of media types, as production and distribution of media output can - in the long run - only be economical in such a synergetic production process.
It is within this context that "Riepl's law", frequently cited in Germany's scientific community, becomes relevant: a new medium tends to complement rather than displace traditional media and patterns of use. But media types are converging in terms of organization and production processes. This assumption is, among others, an explanation for the fact that traditional media do not "disappear". With the formation of media conglomerates, and with mergers and alliances between previously unrelated players and branches, value-added chains for media offerings are created, and inter-media competition is partly compensated by diagonal concentration.

As the Web evolves into a mass medium, structures quite similar to those of the off-line world are emerging; the media actors working in the developing Web market are the same as in the off-line world. It is unavoidable that mass media distribution is tied to economics and notions of profitability. Despite an individualization of media content there is a rather limited revalorization of the audience: audiences are economically regarded as "target groups" and customers, as part of value-added chains.
Standardization of applications
Another fresh impetus to cross-media strategies has been given by new technological applications. In October 2000 the International Press Telecommunications Council (IPTC) ratified "NewsML" as a standard for the interchange of news data. The news agency Reuters conceptualized "NewsML" and wants nothing less than "[...] to revolutionise the way journalists create stories and users receive information".
"NewsML" is based on XML (extensible markup language) whose use has been encouraged by the World Wide Web Consortium since approximately 1996 (http://www.w3.org/XML/). XML-based technologies provide a structure to conceptualize and publish data from diverse sources - in the case of "NewsML" news - in any format.
This is why, contrary to earlier prognoses, mobility barriers between media subsystems will sooner or later be (technically) overcome. Some examples show that in places they already have been (e.g. at the Financial Times; Marjoribanks, 2000). Content can be generated from standardized components, which can be repeatedly re-arranged using databases. On the other hand, this weakens the often discussed advantages of customization of content, which is only possible when database content can be specifically recombined for any query and put together following a modular design principle.
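The modular principle can be sketched with Python's standard XML tools: a news item is stored once in structured form and then rendered for different channels. The element names and rendering targets are invented for illustration and do not follow the actual NewsML schema:

```python
import xml.etree.ElementTree as ET

# Store a news item once as structured XML (element names are illustrative,
# not the real NewsML vocabulary).
item = ET.Element("newsitem")
ET.SubElement(item, "headline").text = "Standards ratified"
ET.SubElement(item, "byline").text = "Staff"
ET.SubElement(item, "body").text = "The IPTC has ratified a news format."

def render_web(node):
    """Render the same item as an HTML fragment for a Web site."""
    return f"<h1>{node.findtext('headline')}</h1><p>{node.findtext('body')}</p>"

def render_sms(node, limit=60):
    """Render the same item as a short text alert for a mobile device."""
    return node.findtext("headline")[:limit]

print(render_web(item))
print(render_sms(item))
```

Because the content is separated from its presentation, each new output channel only needs a new rendering function, never a re-edited story - which is precisely what enables multiple use of content and, at the same time, its standardization.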
These processes of standardization have led to an increase in the number of so-called re-distributors on the Web, which subscribe to news agencies and other content services just to put content online without any further editorial modification. They offer journalistic content and products as added value in surroundings that are far from journalistic. This strategy could be called pseudo-journalism.
NewsML - as a professional standard - and XML are geared towards the needs of professional content providers, as the programming is too complex for ordinary applications, although XML is also designed for more application-oriented users searching for information. All of this means that the gap between professional and private content providers will grow, making it more obvious that not all content on the Web is equal or equally ranked.
The World Wide Web is about to become a mass medium. This is not only a question of 'mass' in a quantitative sense, but also in a qualitative sense, in terms of the process of institutionalization of a new medium. A medium according to this definition is the result of institutionalization relative to economic, technological, organizational and professional features - with all of their advantages and disadvantages.
Within this process the promises of the Internet seem to 'cancel' and 'neutralize' each other to some extent. Unrestricted documentation facilities have to be qualified by facilities and standards of information selection, processing and distribution, which will shape the Web and its content. These ways of organizing content also shape patterns of media consumption and the degrees of freedom in using the Web.
Nonetheless, the potentials of the Internet are expansive, though not without limits, and cannot be described with (alleged) dichotomies like 'push and pull', 'mainstreaming and individualization' or 'free and restricted access'.
Currently, the rise of the Web is one of the main reasons for the acceleration of technicalization and economization in journalism, and journalism faces a difficult challenge in applying professional standards to the Web. Technology has had, and will have, an impact on operational procedures, and technological innovations have often been important for modifications of professional procedures and rules, and in general for the distinction between technological and editorial functions (Weischenberg, 1982; Ursell, 2001).
The abundance of information on the Web - its storage, management, multiple use and the unlimited possibilities to recombine content and feed it into different systems and data formats - challenges journalism with regard to its own processes of rationalizing information. There is some ambivalence to these developments, due in part to the reduction of content-gathering work to a mere administration of databases on the one hand, and the possibility of new fields of activity and ways of working on the other.
Due to this fact, the term 'database journalism' is about to be redefined in a far-reaching sense. It would not only mean selecting material from databases and conducting online news research (Garrison, 2001), but also supplying databases with raw material - articles, photos and other content - by using medium-agnostic publishing systems and then making it available for different devices. This would turn databases into 'hubs' in newsrooms, which in turn will affect news values and the generation of media coverage.
The structural differentiation of journalism that comes with the rise of the Web will not automatically lead to a deterioration of journalistic traditions; it may lead to either an increase or a decrease of functional performance and output. A decrease would be the case if traditional journalistic functions were taken over by other systems, or if professional standards were re-defined due to economic or political influences.
Economics and the technology of the news gathering process will dominate future news production. Further research is now needed to decide whether, and in which specific contexts, this will be functional or dysfunctional for journalism and consequently for society.
About the Author
Wiebke Loosen is assistant lecturer at the Institute of Journalism and Communications at the University of Hamburg, Germany. She earned her MA and PhD in Communications at the University of Muenster. Amongst other things she has published on interactivity in online journalism, the function and influence of search engines on Web traffic and Web structure and on the influence of the Web on journalistic work procedures. Her current project is an empirical look at processes of convergence and cross-media synergies in journalism.
7. A detailed overview on how search engines work and how they rank is given at http://www.searchenginewatch.com/webmasters/work.html and at http://www.searchenginewatch.com/webmasters/rank.html, accessed 16 April 2002.
10. An overview is given at http://www.searchenginewatch.com/webmasters/paid.html, accessed 16 April 2002.
12. There are innumerable initiatives such as Peacefire (http://www.peacefire.org/), Global Internet Liberty Campaign (http://www.gilc.org/) and Human Rights Watch for freedom of expression on the Internet (http://www.hrw.org/wr2k/Issues-04.htm), accessed 16 April 2002.
13. Simson Garfinkel at http://hotwired.lycos.com/packet/garfinkel/97/05/index2a.html, accessed 16 April 2002.
16. See http://newsshowcase.reuters.com/, accessed 6 November 2001.
17. Reuters press release at http://newsshowcase.rtrlondon.co.uk/mainsite/contacts.htm, accessed 16 April 2002.
19. Cf. http://www.interactivepublishing.net/dbdownloads/ft.ppt, accessed 16 April 2002.
Lada A. Adamic and Bernardo A. Huberman, 2000. "The Nature of markets in the World Wide Web," Quarterly Journal of Electronic Commerce, volume 1, pp. 5-12, and at http://ginger.hpl.hp.com/shl/papers/webmarkets/, accessed 16 April 2002.
Waldemar M. Baron, 1995. Technikfolgenabschätzung. Ansätze zur Institutionalisierung und Chancen der Partizipation. Opladen: Westdeutscher Verlag.
Andrei Broder, Ravi Kumar, Farzin Maghoul, Prabhakar Raghavan, Sridhar Rajagopolan, Raymie Stata, Andrew Tomkins, and Janet L. Wiener, 2000. "Graph structure in the Web," Proceedings of the Ninth International World Wide Web Conference, Amsterdam, pp. 309-320, and at http://www.almaden.ibm.com/cs/k53/www9.final/, accessed 16 April 2002.
Junghoo Cho, Hector Garcia-Molina, and Lawrence Page, 1998. "Efficient crawling through URL ordering," Proceedings of the Seventh International World Wide Web Conference, Brisbane, at http://www7.scu.edu.au/programme/fullpapers/1919/com1919.htm, accessed 16 April 2002.
Paul DiMaggio, Eszter Hargittai, W. Russell Neumann, and John P. Robinson, 2001. "Social Implications of the Internet," Annual Review of Sociology, volume 27, pp. 307-336.
Georg Franck, 1998. Ökonomie der Aufmerksamkeit. Ein Entwurf. München, Wien: Carl Hanser Verlag.
Freedom House, 2001. "Press Freedom Survey 2000," at http://www.freedomhouse.org/pfs2001/pfs2001.pdf, accessed 6 April 2002.
Bruce Garrison, 2001. "Diffusion of online information technologies in newspaper newsrooms," Journalism, volume 2, pp. 221-239.
Michael H. Goldhaber, 1997a. "Die Aufmerksamkeitsökonomie und das Netz - Teil I," Telepolis, at http://www.ct.heise.de/tp/deutsch/special/eco/6195/1.html, accessed 16 April 2002.
Michael H. Goldhaber, 1997b. "Die Aufmerksamkeitsökonomie und das Netz - Teil II," Telepolis, at http://www.ct.heise.de/tp/deutsch/special/eco/6200/1.html, accessed 16 April 2002.
Alexander Gruhler, 1998. "PICS - eine moderne Version der Zensur? Das technische Konzept eines umstrittenen Kontrollinstruments und seine Auswirkungen auf die Netzwelt," Telepolis, at http://www.heise.de/tp/deutsch/inhalt/te/1464/1.html, accessed 16 April 2002.
Marti Hearst, 1999. "When information technology 'goes social'," IEEE Intelligent Systems (January/February), pp. 10-15.
Michael Heim, 1993. The Metaphysics of Virtual Reality. New York: Oxford University Press.
Jürgen Heinrich, 1999. Medienökonomie. Opladen: Westdeutscher Verlag.
Lucas D. Introna and Helen Nissenbaum, 2000. "Shaping the Web: Why the politics of search engines matters," Information Society, volume 16, pp. 1-17.
Michele H. Jackson, 1997. "Assessing the Structure of Communication on the World Wide Web," Journal of Computer-Mediated Communication, volume 3, at http://www.ascusc.org/jcmc/vol3/issue1/jackson.html#abstract, accessed 16 April 2002.
Michael Latzer, 1997. Mediamatik - Die Konvergenz von Telekommunikation, Computer und Rundfunk. Opladen: Westdeutscher Verlag.
Timothy Marjoribanks, 2000. "'The Anti-Wapping'? Technological Innovation and Workplace Re-organization at the Financial Times," Media, Culture & Society, volume 22, pp. 575-593.
Irene Neverla, 2000. "Das Netz - eine Herausforderung für die Kommunikationswissenschaft," Medien und Kommunikationswissenschaft, volume 2, pp. 175-187.
Ekkehardt Oehmichen and Christian Schröter, 2000. "Fernsehen, Hörfunk, Internet: Konkurrenz, Konvergenz oder Komplement?," Media Perspektiven, volume 8, pp. 359-368.
Werner Rammert (editor), 1990. Computerwelten - Alltagwelten: Wie verändert der Computer die soziale Wirklichkeit? Opladen: Westdeutscher Verlag.
Günther Ropohl (editor), 2001. Erträge der interdisziplinären Technikforschung: Eine Bilanz nach 20 Jahren. Berlin: Erich Schmidt.
Bertram Scheufele, 1999. "Mediendiskurs, Medienpräsenz und das World Wide Web: Wie 'traditionelle' Medien die Einschätzung der Glaubwürdigkeit und andere Vorstellungen von World Wide Web und Online-Kommunikation prägen können," In: Patrick Rössler (editor). Glaubwürdigkeit im Internet: Fragestellungen, Modelle, empirische Befunde. München: Fischer Verlag, pp. 68-88.
Siegfried J. Schmidt and Guido Zurstiege, 2000. Orientierung Kommunikationswissenschaft: Was sie kann, was sie will. Hamburg: Rowohlt.
Gabriele Siegert, 2001. Medien Marken Management: Relevanz, Spezifika und Implikationen einer medienökonomischen Profilierungsstrategie. München: Fischer Verlag.
Gillian D.M. Ursell, 2001. "Dumbing down or shaping up? New technologies, new media, new journalism," Journalism, volume 2, pp. 175-196.
Peter Weingart (editor), 1989. Technik als sozialer Prozeß. Frankfurt am Main: Suhrkamp.
Siegfried Weischenberg, 1998. "Pull, Push und Medien-Pfusch. Computerisierung - kommunikationswissenschaftlich revisited," In: Irene Neverla (editor). Das Netz-Medium. Kommunikationswissenschaftliche Aspekte eines Mediums in Entwicklung. Opladen/Wiesbaden: Westdeutscher Verlag, pp. 37-61.
Siegfried Weischenberg, 1995. Journalistik. Theorie und Praxis aktueller Medienkommunikation. Bd. 2: Medientechnik, Medienfunktionen, Medienakteure. Opladen: Westdeutscher Verlag.
Siegfried Weischenberg, 1985. "Die Unberechenbarkeit des Gatekeepers. Zur Zukunft professioneller Informationsvermittlung im Prozess technisch-ökonomischen Wandels," Rundfunk und Fernsehen, volume 33, number 2, pp. 187-201.
Siegfried Weischenberg, 1982. Journalismus in der Computergesellschaft: Informatisierung, Medientechnik und die Rolle der Berufskommunikatoren. München: Saur.
Siegfried Weischenberg, Klaus-Dieter Altmeppen, and Martin Löffelholz, 1994. Die Zukunft des Journalismus. Technologische, ökonomische und redaktionelle Trends. Opladen: Westdeutscher Verlag.
Paper received 17 April 2002; accepted 22 July 2002.
Copyright ©2002, First Monday
Copyright ©2002, Wiebke Loosen
First Monday, Volume 7, Number 8 - 5 August 2002