The new library of Babel? Borges, digitisation and the myth of the universal library by Christopher Rowe

The growing capacity of digital encoding and storage has opened up vast new avenues for the archiving and distribution of texts in virtual space, prompting many to declare the imminent obsolescence of print media, the book included. An interesting correlate to this situation is the revival of interest in and support for the idea of the universal library, a collection of every text in existence, albeit reimagined as an immense database of digitised material with online accessibility. Drawing mainly upon two texts by Jorge Luis Borges, a short story and an essay, this article challenges the premise that such a project would be possible or even desirable, and problematises the perceived equivalencies between print and digital media, reading a book and reading onscreen text, and library and database.


The concept of the book
The concept of the library
The concept of reading




The utopian idea of the universal library, a repository of every text ever published, has persisted in the western mind for over two millennia. The Library of Alexandria, founded in the third century BCE, is generally regarded as the first and, practically speaking, last such endeavour, an attempt to house and catalogue all of the texts (which were at that time primarily in the form of papyrus scrolls) in the then known world. Tradition holds that the collection was decimated by a fire, though the true fate of the Library of Alexandria is debatable; its existence and the comprehensiveness of its archives, however, are attested to by numerous sources [1]. Now, with the rise of digital media, virtual storage and the World Wide Web, many claim that the ancient dream of a universal library is approaching realisation, albeit in a new and very different form. The Google Books Library Project, the undertaking most often singled out as the modern equivalent of the Library of Alexandria, has reportedly compiled over 20 million scanned volumes [2], largely obtained from the collections of its 20 prominent partner libraries. Google’s stated objective at the inception of this project was no less than “to organize the world’s information and make it universally accessible and useful.” [3] Other proponents of the project have been even more hyperbolic; Kevin Kelly declared in a New York Times article that this new universal library would eventually offer “the entire works of humankind, from the beginning of recorded history, in all languages, available to all people, all the time” [4], including in its scope digital versions of all paintings, films, recorded music, television programs, every piece of print media and every internet site ever to have existed.

The idea of electronically storing and delivering vast collections of texts is older than most would imagine. In 1960, Ted Nelson, the inventor of the term “hypertext”, began working on (but never completed) the Xanadu system, a proposed “docuverse” which he later described as “a plan for a worldwide network, intended to serve hundreds of millions of users simultaneously from the corpus of the world’s stored writings, graphics and data” [5]. Nelson in turn drew inspiration from a 1945 article by Vannevar Bush, one of the first to seriously consider the logistics and possibilities of such a system [6]. However, I wish to draw the reader’s attention to an even earlier and more indirect theorisation of the universal digital library, one found in Jorge Luis Borges’ 1941 short story “The Library of Babel”. In this work, a nameless narrator describes the titular library as a seemingly endless vertical and horizontal series of hexagonal rooms housing 20 bookshelves apiece, the contents of which are described as follows: “each bookshelf holds 32 books identical in format; each book contains 410 pages; each page, 40 lines; each line, approximately 80 black letters” [7]. The contents of these books are revealed to be randomly generated combinations of a set of 25 characters: 22 letters representing all vowel and consonant sounds, the comma, the period and the space. This library, whose spatial dimensions would vastly exceed those of the observable universe [8], would by definition contain everything that has been, or possibly ever could be, expressed in writing; yet for every sentence, much less volume, of interpretable language there would exist galaxies of meaningless or indecipherable strings of characters [9]. While the library Borges describes here (and in his essay “The Total Library”, written two years prior to the story) does not resemble in content the universal library proposed by Google Books or other digitisation projects, there are certain commonalities between the two which are worth considering when attempting to conceptualise this more recent proposal.

Borges’ imagined library could be considered, as William Bloch points out, the output of a relatively simple computer program that generates random, non–repeating strings of 1,312,000 characters (410 pages x 40 lines per page x 80 characters per line) from a 25–character set, the library itself housing the inconceivably vast set of all possible strings: 25^1,312,000 books [10]. The production of these books would thus entail the atomisation of written language into individual characters (the typographical symbols), which may then be recombined irrespectively of their relationship to phonetic expression or linguistic sense. Electronic texts or ebooks, whether ASCII–based text files or other types of digital documents such as Adobe’s Portable Document Format, are likewise based upon encoding at the level of characters, as opposed to the linguistic level of sounds, words and phrases. In this transformation of language into data or information there is a concomitant transformation of our understanding of, and relationship to, digital texts themselves. Using Borges’ story as a touchstone, then, this essay aims to explore several potential effects of digitisation on our fundamental concepts of the book, the library, and the act of reading, followed by a final assessment of the nature of the new proposed universal library.
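
Bloch’s description of the Library as program output can be made concrete with a short sketch. The Python below is only an illustration under stated assumptions: Borges does not say which 22 letters the books use, so the alphabet here is a stand-in, and the function names are invented for this example. It computes the length of one book, estimates how many decimal digits the number 25^1,312,000 has, and draws a single random line of the kind that would fill most volumes.

```python
import math
import random

# Assumption: a stand-in alphabet of 22 letters plus comma, period and space.
# Borges specifies the count (25 symbols) but not the particular letters.
ALPHABET = "abcdefghijklmnopqrstuv" + ",. "

PAGES, LINES_PER_PAGE, CHARS_PER_LINE = 410, 40, 80
BOOK_LENGTH = PAGES * LINES_PER_PAGE * CHARS_PER_LINE  # characters per book


def digits_in_library_size() -> int:
    """Number of decimal digits in len(ALPHABET) ** BOOK_LENGTH.

    The integer itself is never built; its size is derived from its logarithm.
    """
    return math.floor(BOOK_LENGTH * math.log10(len(ALPHABET))) + 1


def random_line(rng: random.Random) -> str:
    """Draw one 80-character line uniformly at random from the alphabet."""
    return "".join(rng.choice(ALPHABET) for _ in range(CHARS_PER_LINE))


if __name__ == "__main__":
    print("characters per book:", BOOK_LENGTH)
    print("digits in the number of books:", digits_in_library_size())
    print("a sample line:", repr(random_line(random.Random(0))))
```

The number of distinct books is itself nearly two million digits long, which is why the sketch works with a logarithm rather than the number. Sampling lines at random also sets aside the non–repeating condition Bloch mentions; enumerating every string exactly once would be a different, and far less tractable, exercise.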



The concept of the book

In considering the concept of the book we will leave aside its deeply entrenched cultural and religious significance as a symbol of truth, knowledge and revelation and concentrate upon its technological development and the avenues it opened up with regard to cognition, and to the expression of ideas. The replacement of the scroll by the codex, the earliest form of the bound book, in the first century A.D. is considered by many historians a more significant technological innovation than Gutenberg’s invention of the moveable–type printing press [11]. Unlike the papyrus scroll, the codex required the use of only one hand for perusal, freeing the other hand for annotation or note–taking, while also allowing for much more rapid movement between parts of the text and hence a greater capacity for selective and repeated reading. Thus, the codex allowed for a decisive departure from the conventions of oral communication, the necessary temporal sequentiality of which was reflected in the scroll, whose format allowed only relatively slow, unidirectional progression. Much more than a rhetorical thread to be unwound, the text of a book is also an extensive surface to be viewed, as its format allows for the accommodation of a readily–accessible area hundreds of times that of its outer surface. This extensive visual accessibility of a book, which is comparable in certain respects to a painting or tableau, allots greater agency to the eye, which, as Christian Vandendorpe states, “can mobilize the analytical faculties more easily and more precisely than the ear” [12]. The format itself thus facilitated a more complex engagement between reader and text, one that allows the spatialized presentation of densely layered information and/or multiple threads of narrative. The influential literary critic I.A. Richards points towards the technical achievements of the book’s form — as well as its capacity to open up complex avenues for conceptualisation and expression — when he defines books as “machines to think with” [13].

In the ebook, however, we can detect a fundamental shift in spatial and temporal orientation compared to the traditional book. Due to its being accessed through screen–based hardware — whether a computer monitor, a dedicated ebook reader such as the Amazon Kindle, a tablet computer, or a smartphone — only a limited amount of text is visible at one time, and the speed at which other portions of the text may be located and oriented for the reader is far less than that of books. In this respect, the ebook’s visual format more closely resembles the scroll than the codex, though with a far greater flexibility and movement rate available with respect to its text; the use of the vertical “scroll bar” on most word processors, document–display computer programs and Web browsers further highlights this similarity [14]. J. David Bolter’s identification of a primarily spatial metaphor for reading and writing — a metaphor that is literalized and pushed to an absurd extreme in Borges’ story, wherein the universe itself is conceived of solely as a vast textual space — posits a series of shifts in the visual and spatial forms knowledge assumes following each technological development. The papyrus roll remained heavily aligned with an oral space of communication, and consequently remained “poor at suggesting a sense of closure” [15], while the codex “was more effective in enclosing, protecting and delimiting the writing it contained” [16], and hence the book came to represent the “physical embodiment, the incarnation, of the text” [17]. This sense of closure contributed to a spatial concept of knowledge which persists today: each book is considered a stable and discrete textual unit that forms arrays of shifting connections with other books but itself retains a relatively permanent identity. Electronic texts and the devices with which we access them, however, like scrolls before them, “work against closure, because both in form and function they refer their users to other texts, devices, or media forms” [18]. While not reverting to the aural spatial metaphor with which the scroll is associated, the electronic text is internally and externally enmeshed within a rapidly evolving mediascape that contributes to what Bolter describes as “a view of knowledge as collections of (verbal and visual) ideas that can arrange themselves into a kaleidoscope of hierarchical and associative patterns” [19]. This conceptualisation of knowledge is defined by a continual and active process of reconfiguration and adaptation according to variables such as individual user input, changing information technologies and selective modes of engagement. Knowledge thus assumes a diffuse and transient status, occupying something like a complexified aural space, which could be termed an “audiovisual space”.

In addition to this spatial affinity to an older technology than the book, then, electronic texts lend themselves to close associations with the temporal qualities of much more recent screen–based media technologies, namely film, television and video. Unlike the still surface of the painting, the space for contemplation with which the book is comparable, ebooks are aligned with the moving image, the dynamic flow of graphical and textual information proffered by other screen media. Carla Hesse, for instance, notes that while “the book form serves precisely to defer action, to widen the temporal gap between thought and deed, to create a space for reflection and debate” [20], we now face a growing impulse to “reconceptualize the key institutions of modern literary culture — the book, the author, the reader, and the library — in terms of time, motion, and modes of action, rather than in terms of space, objects, and actors” [21]. Speed, and thus time, defines the electronic text in terms of accessibility and transmission, qualities essential to the form and function of the text itself; in the case of the book, time factors into the publication cycle and the temporal investment of the reader — and in the articulation of the narrative in applicable books — but is not a formal determinant in the textual object itself. The screen, however, is always already aligned with the concept of the moving image and its temporalities, as well as with flows of information, since the visual accessibility of the text relies upon flux rather than permanency, stasis, or stability. Our mode of engagement with screen–based visual material is necessarily based upon movement, which directly implies considerations of time, as Bolter again points out: “The continuous flow of words and pages in the book is supplanted in electronic space by abrupt changes of direction and tempo, as the user interacts with a Web page or other interface” [22].

What should be recognised above all is the fact that the book is the tangible product of a sophisticated epistemological and technological system which is wholly distinct from that of information technology, and whose essential properties remain deeply entrenched in its very materiality. These properties — individual identity, spatial extensity, total surface accessibility and textual fixity — are, at this point in time, not replicable in the form of the digital text or ebook. As Borges puts it, in an assertion that appears in both “The Library of Babel” and “The Total Library,” though the volumes of a universal library would mimic the form of books, they would lack their fundamental material solidity and instead “run the incessant risk of changing into [other books] that affirm, deny and confuse everything like a delirious god” [23]. Thus, the very attributes that are the strengths of electronic texts, which Vandendorpe lists as “ubiquity, fluidity, interactivity and complete indexation” [24], are not advantageous to the individual work, defined as a book. However, these properties, which lend themselves to highly useful functions such as Boolean search operators and hyperlinks, are ideally applied to masses of text rather than to individual works in isolation. This raises the possibility that online or electronic collections of texts may be defined as a type of library — or even “universal library” — regardless of whether or not the texts themselves are analogous to books in form or function.



The concept of the library

The application of the term “universal library” to the digital archiving of text, whether by Google Books or any other organisation, is highly problematic. To begin with, universality has not been the aim of any library since Alexandria’s: no matter how expansive in space and aspiration, every library is by definition selective in its collection of texts. Even the U.S. Library of Congress, currently the largest library in the world in terms of size and holdings, retains only about 10,000 of the roughly 22,000 items submitted to it each day, according to its Web site [25]. It could be argued that this selectivity is based upon the limitations of physical space, an issue that the digital library would scarcely face. However, selection is also based upon other criteria which become clear when we start to develop a tentative list of works which an all–inclusive library would contain: all pornographic works ever printed, including those the possession of which is deemed a felony offense; all manuscripts ever rejected for publication, including works by the insane; documents containing state or national secrets; seditious pamphlets and hate literature; technical specifications and blueprints for every weapon, including nuclear warheads; every iteration of every work ever published, including those printed in violation of copyright law; et cetera. Additionally, as Miroslav Kruk notes:

The library of total inclusiveness would contain materials blatantly untrue, false or distorted — intentionally or unintentionally misrepresenting reality, in which case, the universal library could be defined as small islands of meaning surrounded by vast oceans of meaninglessness. [26]

Borges’ Library of Babel would in this consideration reflect the actual, and perhaps only possible, form of the universal digital library. Indeed, the very notion of universality or totality in such a context promotes utter ambiguity by making true statements or texts indiscernible from false or erroneous ones. In typically Borgesian lists found in both “The Library of Babel” and “The Total Library”, the author points out that a library of total inclusiveness would contain “the faithful catalog of the Library, thousands and thousands of false catalogs, the proof of the falsity of those false catalogs, a proof of the falsity of the true catalog” [27]. This absolute relativization of knowledge is indicative of the indifference towards truth evidenced by the computational algorithms — whether random, simple or complex in nature — which form the basis of both Borges’ total library and any proposed digital equivalent.

In “The Total Library”, Borges cites the “thinking machine” created by theologian and logician Ramón Llull (ca. 1232–1315) — also known as the “Llullian circle” — as an antecedent of the concept that developed into the Library of Babel [28]. Llull, whose work with logical systems is often regarded as “a distant precursor of computer science” [29], inscribed discs with letters, words and symbols, and overlaid two or more of them in such a way that they rotated independently of one another. Undetermined combinations of letters, words and symbols could then be obtained through rotations of these discs, generating a seemingly endless array of linguistic structures or conceptual formulae within a closed, but randomized, logical system. An aggregation of all possible combinations of such signifiers would not constitute a form of knowledge, however, but rather, as Bolter puts it in his description of the Library of Babel, “the exhaustion of human symbolic thought” [30]. Borges’ short story thus addresses two key tendencies in modern epistemology: the displacement of logical mechanisms or modes of thought onto external apparatuses — whether the “God” who produced the Library of Babel, the Llullian circle, or an electronic computing device — and the attendant conflation of data and knowledge. The need to distinguish between these two fields is crucial in any consideration of digitization or digital storage, particularly when questions of totality or universality are introduced. To offer a rather simplified contrast: knowledge is conceptualized in the primarily spatial dimensions of depth and breadth and applied to modes of cognition; information and data are conceptualized in the spatiotemporal dimensions of fluidity, movement and speed and applied to modes of communication and calculation. A totality of knowledge is inconceivable, since there are inherent limits to cognitive processes. A totality of information is inconceivable, due to continual variations in data sets. Space and time, in other words, circumscribe the two fields. It is for this reason that Robert Darnton categorically rejects the idea of cyberspace as a potentially limitless storehouse of knowledge:

Such a notion of cyberspace has a strange resemblance to Saint Augustine’s conception of the mind of God — omniscient and infinite, because His knowledge extends everywhere, even beyond time and space. Knowledge could also be infinite in a communication system where hyperlinks extended to everything — except, of course, that no such system could possibly exist. We produce far more information than we can digitize, and information isn’t knowledge, anyhow. [31]

The “delirious god” of the Library of Babel has shaped raw data output into masses of books, objects which are typically vehicles for knowledge rather than information, and placed them in an unordered sequence within a space utterly impossible to traverse. Inspired by Llull’s thinking machine, Borges took random orthographical computation — and by implication the digitization of texts, which likewise renders linguistic knowledge into arrays of pure information — from its earliest origins to its logical extreme: a vast desert of almost completely meaningless text.
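
The combinatorial principle behind Llull’s discs can be sketched in a few lines of Python. The disc contents below are hypothetical placeholders rather than Llull’s actual inscriptions, and the function name is invented for illustration; the point is only that every alignment of the discs is enumerated mechanically, with no regard for whether the resulting statement is true or even meaningful.

```python
from itertools import product

# Hypothetical disc inscriptions, chosen only to illustrate the mechanism.
DISCS = [
    ["the soul", "the world", "the library"],
    ["is", "is not"],
    ["eternal", "finite", "knowable"],
]


def all_alignments(discs):
    """Return every statement obtainable by rotating the discs independently."""
    return [" ".join(reading) for reading in product(*discs)]


if __name__ == "__main__":
    statements = all_alignments(DISCS)
    print(len(statements), "possible statements, for example:")
    for statement in statements[:4]:
        print(" ", statement)
```

Each affirmation appears alongside its own denial, which is the situation Borges describes: the procedure exhausts the space of combinations while remaining indifferent to which of them could count as knowledge.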

Rather than an enormous static collection, then, the concept of the universal digital library would entail a transformation of the term “library” itself from a location for ordered collections of stable and identifiable texts to an endlessly shifting and expanding informational topography which must be actively mapped and navigated. As Pierre Lévy points out, we tend to speak of “text” more and more in the partitive sense, like “water” or “sand”, as a semi–fluid and ultimately formless mass [32]. This decisive shift in the conceptual status of text from knowledge towards information has influenced the architectural design of newer libraries, such as Dominique Perrault’s Bibliothèque Nationale de France and Cathy Simon’s San Francisco Municipal Library, buildings which negotiate interior and exterior space in ways that suggest that their function is to accommodate flows of information as much as to contain spatial objects [33]. This informational model is, of course, much more relevant to our understanding of the World Wide Web than to anything western culture had previously defined as a “library”.

This is not to say that Google’s undertaking is in no way a library project, as it evidences several aspects of library collections: it is selective, insofar as it is scanning the holdings of actual libraries, which were of course compiled selectively; it envelops the texts within a unified space, albeit the virtual space of a network database; and, it upholds (if only barely, according to its many critics) the regulations governing the published holdings of libraries, such as fair use and copyright. It should be clear, however, that it is in no sense a nascent “universal library”, nor could it possibly develop into such in the foreseeable future. Furthermore, the concept of the library is not the most appropriate model for the aggregation of data or text envisioned by the supporters of the Google Books Library Project and similar endeavours. In an addendum to “The Library of Babel”, Borges states the following: “Letizia Álvarez de Toledo has observed that this vast Library is useless: rigorously speaking, a single volume would be sufficient, a volume of ordinary format ... containing an infinite number of infinitely thin leaves” [34]. Kevin Kelly, the fervent defender of the idea of the universal digital library, supplies a corollary to this idea, pointing out that “[i]n a curious way, the universal library becomes one very, very, very large single text: the world’s only book” [35]. The holistic nature of an electronic “docuverse”, like Google Books or Xanadu, ironically undermines the very idea of a library and leaves us instead with the unitary idea of a single massive text or intertext. But how do we propose to read, or even to approach, such a text?



The concept of reading

The act of reading has numerous and highly distinct functions and attitudes, as demonstrated in the quite different approaches we take to reading different media: novels, reference books, newspapers, magazines, and Web pages each invite a unique manner of reading with varying types or degrees of cognition, attention, concentration, physical habitude, and time investment. Two basic types of engagement with a text may be identified, however, which are generally termed linear and tabular reading. Linear or intensive reading characterises the way we consume narrative fiction, and has its origins in the oral storytelling traditions of the epic and folktale. The textual qualities of univocality (a consistent narrative voice) and sequential temporality (a consistent plot or timeline) thus still dominate this branch of literature, in spite of their having been eschewed by many prominent modernist and postmodernist authors. The reader of such works is typically highly absorbed in their storylines, borne along almost automatically by the temporal and causal narrative connections between sentences, paragraphs, and chapters. The tabular mode of reading, however, does not correspond to the storytelling form but rather to what Vandendorpe defines as “the semidialogical model of the question and answer” [36]. In this case, reading is interrogative, seeking information about a specific subject, and this is reflected in the formats of tabular texts — encyclopedias, dictionaries and other reference books. Such works employ a number of organisational strategies — including alphabetisation, block spacing of text, section headings, and so on — as well as extra-textual devices like the table of contents, index, glossary, annotations, and bibliography, all of which are designed to facilitate the readers’ search for specific information as well as to direct them towards the text’s sources or related material if necessary.

It should be clear that the digital text format excels in the scope and efficiency of its tools for referencing and indexation. With connection to a database, or the Internet itself, the hyperlink as a referencing tool offers not only information about sources or related texts but a near–instantaneous connection directly to another text in its entirety, and to the relevant section thereof. Internal, local and global search functions are equally powerful indexing tools, capable of generating in moments comprehensive lists of instances of a given word, phrase or topic in multiple texts, while human endeavour might require months or years to do the same. This functionality, along with considerations of the vast potential space and flexibility of content of electronic formats, assured the widespread replacement of many forms of tabular texts by information databases and digital texts. In terms of common public use, the most prominent effect of this has been the supersession of paper encyclopedias and dictionaries by the Internet, though of equal importance is the widespread institutional digitisation of legal, medical and scientific documents, the collation, indexation and cross–referencing of which are now vastly superior to those of older paper forms. In libraries, significantly, card catalogues and other indexes were generally digitized long before any book in the holdings. Tabular reading, the interrogative search for information, is in most cases clearly aided by electronic text and computer functionality; in other words, by what we broadly term information technology.

The contents of most linear texts such as fictional narratives, however, are not strictly definable as informational, since their content does not primarily derive from questions or problems posed independently of themselves, in the manner of semi–dialogical tabular texts. The novel, for example, exhibits a myriad of relationships to other texts — based upon author, publisher, subject matter, genre, direct or indirect intertextual references, and many other factors — yet its fundamental integrity of form and content is conducive to the sustained engagement between reader and text that is the basis of linear reading. The transformation of this content into data, while allowing for advances in the storage, distribution, and indexation of linear texts, offers no advantages to the way they are read. In many respects digitalisation enables, or even actively encourages, a less immersive mode of reading than the book, as Terje Hillesund points out with reference to a study by Anne Mangen [37]:

It requires less mental energy to click the mouse and rekindle our attention than to try to resist distractions by attempting to keep on structuring consciousness from within, and thus continue reading (Mangen, 2008). Furthermore, in front of the computer screen — and especially online — we are relentlessly tuned in to change. We have learned to expect something to happen and are thus doubly compelled by an urge to click. (Hillesund, 2010)

The temporalities associated with screen media and information technologies, it is suggested, tend to encroach upon the act of linear reading itself.

While the interactive, exhaustively annotated, hyperlinked and searchable online literary database is the inescapable future of literature in many opinions, it should be recognised how drastic a shift in our habits of reading this textual transformation and repositioning entails. To return to “The Library of Babel” — the content of which, we have established, is data in the form of books rather than books themselves — Borges’ description of the act of reading (or, more accurately, the impossibility of reading) is telling: “sometimes [searchers] pick up the nearest volume and leaf through it, looking for infamous words. Obviously, no one expects to discover anything” [38]. While the sophisticated indexation and referencing tools we employ aid the search for information, the mode of reading itself remains acutely interrogative to a significant degree. Linear reading, which is propelled not by the answers and explanations we seek but by the momentum of the quest itself, would become a difficult prospect within a large–scale network of digital text. Kevin Kelly envisions a near future in which the self–contained form of a work will be considered obsolete, without taking into account the advantages such a form offers for sustained engagement with a linear text:

Copies of isolated books, bound between inert covers, soon won’t mean much ... What counts are the ways in which these common copies of a creative work can be linked, manipulated, annotated, tagged, highlighted, bookmarked, translated, enlivened by other media and sewn together into the universal library. Soon a book outside the library will be like a Web page outside the Web, gasping for air. Indeed, the only way for books to retain their waning authority in our culture is to wire their texts into the universal library. [39]

In such a model, reading would be only one among many co–existing modes of interaction with a text, alongside watching and listening to related media, adding hyperlinks to other texts, commenting upon and tagging sections for the attention of others, and so on. If we consider such a system and space a “library”, then there is little room in it for readers; instead, as in the Borges story, its denizens or participants would largely function as “librarians”, individually and continually cataloguing and categorizing — or attempting to catalogue and categorize — the practically innumerable texts.


It is clear that the technology of book and library is distinct from information technology, and that the latter is not capable, at this point in time, of replicating many aspects of the former. Electronic media clearly improve or enable certain established and emergent forms of text, the former including informational or tabular texts and the latter including interactive fiction and hypertexts. Still, the phenomenal differences separating book and ebook, library and digital library, and reading a book and reading onscreen or Web–based text must be recognised. The utopian fantasy of the universal library so frequently raised in connection with digitisation, and particularly with the Google Books Library Project, is a new myth clothed in the garb of an old one, borrowing its terms and little else. It is preferable to think of Google Books not as a potentially perfect library but as a near–perfect library index, encompassing as it does not simply keywords and short descriptions of books but the entire contents of the books themselves, and Google seems to have embraced this definition, at least for the present [40]. The universal library, however, remains something best left to Borges’ “horrible imaginings” [41].


About the author

Christopher Rowe hails from St. John’s, Newfoundland, Canada. He obtained an M.A. in comparative literature from the University of Western Ontario, Canada, and is currently completing a Ph.D. in screen studies at the School of Culture and Communication, University of Melbourne, Australia.



First Monday, Volume 18, Number 2 - 4 February 2013

