
Six degrees of reputation: The use and abuse of online review and recommendation systems by Shay David and Trevor Pinch



This paper is included in the First Monday Special Issue: Commercial Applications of the Internet, published in July 2006.


Abstract
This paper reports initial findings from a study that used quantitative and qualitative research methods and custom–built software to investigate online economies of reputation and user practices in online product reviews at several leading e–commerce sites (primarily Amazon.com). We explore several cases in which book and CD reviews were copied in whole or in part from one item to another and show that hundreds of product reviews on Amazon.com might be copies of one another. We further explain the strategies involved in these suspect product reviews and the ways in which the collapse of the barriers between authors and readers affects how these information goods are produced and exchanged. We report on techniques employed by authors, artists, editors, and readers to promote their agendas while building their identities as experts. We suggest a framework for discussing the changes in the categories of authorship, creativity, expertise, and reputation that are being re–negotiated in this multi–tier reputation economy.

Contents

Introduction: A cultural Lake Wobegon?
Six degrees of reputation
Research methods
Empirical findings: Strategies and practices of reputation management
Continuities and discontinuities from earlier models and the offline world
Conclusion — Beyond six degrees of reputation?

 


 

Introduction: A cultural Lake Wobegon?

Why look at book reviews?

Charles McGrath, former editor of the New York Times Book Review, recently posed the rhetorical question: “has there ever been a book that wasn’t acclaimed?” (Safire, 2005) What McGrath laments, of course, is the inflation of accolades in the universe of book promotion, which, much as with CDs or other cultural products, is shaped by commercial interests more than by standards of accuracy in representing a product’s quality. Traditionally, the critics employed by respected institutions like the New York Times or other leading newspapers and trade magazines served as cultural gatekeepers and arbiters of quality. The growing abundance of books (over 100,000 new books were published in the U.S. last year), CDs, and similar ‘information goods’, however, precludes any wide coverage or quality assessment. Recently, the small group of paid experts hired by these select establishments has been aided by a wide variety of trade publications and Web sites, which cover ever more specialized sub–fields of the culture industry and employ systems that harness the power of user communities.

Evidently, in many areas of cultural production user reviews are mushrooming as an alternative to traditional expert reviews. If there was any doubt, it has long been established that reviews and recommender systems play a determining role in consumer purchasing (an early review is available in Resnick and Varian, 1997), and recent quantitative research adds weight to the claim that these review systems have causal and positive effects on sales; to nobody’s surprise, books with more and better reviews are shown to sell better (Chevalier and Mayzlin, 2004). With people in the culture industries increasingly realizing this truism, many of the reviews are positively biased and it becomes very hard to discern their ‘objective’ quality. In addition, due to the large variance in the quality of the reviews and the varied agendas of the reviewers, user input too often becomes untrustworthy, leaving consumers with little ability to gauge an item’s actual quality. Do we live in a cultural Lake Wobegon where “all the books are above average” (to paraphrase Keillor, 1985)? Is there a way to review the reviewers, to guard the guards? As will be discussed in detail below, emerging systems like the one employed on sites like Amazon.com (2005) suggest that there are ways to address this bias problem by offering a tiered reputation management system that provides a set of checks and balances. But these new options also bring with them new problems as participants adjust to what is at stake in this new economy of reputation.


The new user input systems which are burgeoning on the Internet employ various types of user input to assess the quality of books and CDs (Amazon.com, BN.com), news (Slashdot.org, Kuro5hin.org), consumer electronics (Shopping.com), home–recorded music (ACIDplanet.com), teaching quality (RateMyProfessor.com), drug effectiveness and side effects (DrugRatingZ.com), as well as many more types of information, information goods, and information–embedded goods (i.e., goods whose value derives from the information embedded in them). These systems are exemplifications of what Yochai Benkler (2002) calls peer–production systems in which communities of users pool their resources in order to produce higher–quality information goods and information–embedded goods, in some cases replacing the traditional mechanisms of firms and markets altogether.

There is disagreement among scholars regarding the novelty and the potential long–lasting effects of these systems. The proponents, on one hand, claim that peer–production systems will revolutionize the way we produce, consume, and use information, primarily due to the cost reductions they offer and the enhancements they enable in assessing and allocating human creativity (Benkler, forthcoming; Raymond, 1999; Himanen, 2001; Lessig, 2004; Coleman, 2005). The skeptics, on the other hand, ponder the ways in which such systems are being expropriated and appropriated by existing actors and refer us to a lengthy tradition of user involvement (Castells (1996), for one, offers a neutral account of larger trends and the rise of the network society). In and of itself, we should remember, harnessing the power of community members to evaluate the quality of products is not a new idea. The Whole Earth Catalog, for example, during its heyday in the early 1970s, circulated more than one million copies and introduced members of the back–to–the–land movement to products ranging from fertilizers to computer displays, offering for each product a summary of user experiences and recommendations. This community–based publication can be viewed as an early model that directly influenced later systems, including online communities (Turner, 2005). But the proponents of novel systems would argue that the sheer scale and immediacy provided by the Internet offers something genuinely new.

Regardless of the future prospects of peer–production systems, however, both sides of the debate would agree that practices as well as norms are not yet stabilized in this domain, and that the mechanisms which control this ‘reputation economy’ are not yet well understood. Clearly, the nascent systems that we explore here introduce new variations of old problems concerning authority and expertise. The primary objective of this paper is to explore the reputational underpinning of these systems in the face of alarming discoveries concerning the authenticity of some of their content. We start with two episodes that drew us to this research and attest to the role of user reviews in our culture and speak to some of the concerns we wish to address.

Two episodes, or why look at book review copying?

One of us, while sitting at a local Ithaca café, witnessed a meeting between a local author, Barry Strauss, and one of his fans. The fan, an enthusiast for naval histories, was very excited by the fact that Mr. Strauss’ latest book, The Battle of Salamis: The Naval Encounter that Saved Greece — and Western Civilization (New York: Simon & Schuster, 2004), was well received, and as proof he mentioned a review he had read on Amazon.com which compares Strauss’ style to that of best–selling author Tom Clancy. The review reads:

“A Good Story Well Told” / December 1, 2004 / Reviewer: John Matlock “Gunny” (Winnemucca, NV)

... This extensively researched book is centered on the naval battle, but it is set in its place with descriptions of other parts of the war ... It also includes an amazing amount of detail on the two countries, their cultures and the times in general ... I have to say that the author’s writing style makes this read like a Tom Clancy novel ... [1]

From a sociological point of view, this encounter is interesting in several respects. First we note that user book reviews have become conversation pieces in the ‘offline’ world; what is otherwise a ‘secondary’ information–good becomes the prime point of focus. Second, we note that the fan does not appear to know the reviewer, but that this does not impede him from invoking the reviewer as a legitimate authority; the materialization of the review — in writing — on a reputable Web site is enough to warrant quotation. In this specific case, the reviewer is a “Top 50” reviewer, with over 1,500 reviews to his credit, but does this make a difference? Third, the review content itself makes use of tiered reputation: the reviewer accredits Strauss by comparing him to a reputable thriller writer like Clancy. Taken together, these three points suggest that user book reviews offer an interesting topic for sociological analysis.

In a second case, we came across a more alarming dynamic concerning book reviews: review plagiarism. This instance concerned one of Pinch’s own books, Analog Days: The Invention and Impact of the Moog Synthesizer (Pinch and Trocco, 2002). This book, which chronicles the invention and early days of the electronic music synthesizer, was well received by reviewers both offline and online, and the Amazon.com editors quote a review from the Library Journal that reads as follows:

... In this well–researched, entertaining, and immensely readable book, Pinch (science & technology, Cornell Univ.) and Trocco (Lesley Univ., U.K. [sic]) chronicle the synthesizer’s early, heady years, from the mid–1960s through the mid–1970s ... . Throughout, their prose is engagingly anecdotal and accessible, and readers are never asked to wade through dense, technological jargon. Yet there are enough details to enlighten those trying to understand this multidisciplinary field of music, acoustics, physics, and electronics. Highly recommended. [2]

A similar (but distinctly different) book that had appeared earlier — Electronic Music Pioneers by Ben Kettlewell (Vallejo, Calif.: ProMusic Press, 2002) — received the following user review on Amazon.com on 15 April 2003:

This book is a must. Highly recommended., April 15, 2003 / Alex Tremain (Hollywood, CA USA)

... In this well–researched, entertaining, and immensely readable book, Kettlewell chronicles the synthesizer’s early, years, from the turn of the 20th century — through the mid–1990s ... . Throughout, his prose is engagingly anecdotal and accessible, and readers are never asked to wade through dense, technological jargon. Yet there are enough details to enlighten those trying to understand this multidisciplinary field of music, acoustics, physics, and electronics. Highly recommended. [3]

The ‘similarity’, of course, is striking. The second review is simply a verbatim copy of the first one, replacing only the name of the authors and the period the book covers. The word “heady” has been removed, as the period Mr. Kettlewell covers is lengthier than the sixties focus of the Pinch–Trocco volume, but it has been removed so sloppily that the comma after “early” has been left in, so that the review is now grammatically incorrect. Other reviews posted in subsequent weeks for Mr. Kettlewell’s book contained other sentences lifted from Pinch and Trocco’s accolades (in this case from reader reviews). Furthermore, an inspection of the entire set of reader reviews for Mr. Kettlewell’s book suggests that the copying from Pinch and Trocco’s reviews has an ulterior motivation. Just before the copied reviews is the following reader review:

Dissapointing, for a $21 book........, / March 12, 2003 Reviewer “djminiwjeats” (Chicago, Il USA)

This book, although comprehensive to be sure, often paints in extremely broad or disconnected brush strokes, leaving me wishing there was more detail at times. This was especially evident in the first section, a seemingly endless series of brief bio’s of various figures who are presented as key players in the development of electronic music, with very little indication of how they might actually fit into the historical continuum, or how they might relate to each other ... . I also have read Frank Trocco’s book on the Moog synthesizer (which also covers the Buchla, ARP, and others), and found it to be far superior. I’d recommend anyone just getting into this subject to start there instead. [4]

What does this copying strategy suggest? Is this simply an attempt to influence sales of a less well–received book, stealing the credit from a better established publication, or is there a deeper undercurrent provoked by the earlier reader review comparing this book unfavorably to the Pinch–Trocco book? When Pinch and Trocco contacted Amazon.com in 2003 and alerted them to this plagiarism (which Pinch had discovered himself by accident), Amazon’s response was to recite their policy, which stated that they gave users complete freedom in posting reviews and did not intervene in the process (a policy that has since changed). Clearly, issues concerning freedom of expression clash here with notions of integrity and the imputed genuineness of reader reviews. Pinch, as an author, felt disappointed by Amazon’s response. He was proud of his positive reviews — it is rather unusual for an academic book to get any reader reviews at all, and a student of his had once told him, as a mark of acclaim for the book, “it even has reader reviews on Amazon.” Pinch felt that somehow his own positive reviews, which he had taken to come from genuinely enthusiastic readers, were now compromised by a system that allowed such blatant copying by possibly non–existent readers.

With these two cases in mind, we decided to further explore the extent to which reviews are being used and abused in Amazon’s and similar systems. Clearly, electronic media allow perfect copies of both primary and secondary information goods. Usually we are concerned only with the authenticity and quality of the primary artifacts, but reviews play a determining role in assessing those characteristics. To what degree can we count on those reviews? A preliminary literature search revealed that, beyond our own experience, recent evidence suggested that many reviews are not authentic, that users are using various techniques to game the system, and that this phenomenon might be widespread. For example, in 2004 both the New York Times (Harmon, 2004) and the Washington Post (Marcus, 2004) reported that a technical fault on the Canadian division of Amazon.com exposed the identities of several thousand of its ‘anonymous’ reviewers, and alarming discoveries were made. It was established that a large number of authors had “gotten glowing testimonials from friends, husbands, wives, colleagues or paid professionals.” A few had even ‘reviewed’ their own books, and, unsurprisingly, some had unfairly slurred the competition.


Given these early observations, and the enormous volume of books sold through sites that make use of such systems, the task of understanding the mechanisms that control these reputation and expertise tools is of prime importance. It seems that cases in which reviews are being plagiarized or otherwise abused are good starting points for understanding the issues underpinning such systems. What follows are preliminary results from our investigation. We start by making a first cut at the different tiers or degrees of reputation that seem to come into play.

 

++++++++++

Six degrees of reputation

Amazon’s system is largely typical of other electronic retailers’ systems; thus the description below is intended to be illustrative of a whole class of similar systems that may vary in the details of implementation but are nonetheless based on the same principles. Importantly, the system features discrete levels of reputation management, which are layered in a structure that we call “six degrees of reputation”:

  1. At the first level we find authors’ past reputation and credentials accruing to their benefit directly from the association of the items with their names. This level of reputation is influenced by activities that take place OUTSIDE the online recommendation system. It might include official credentials, past performance, offline reviews, sales history, etc.

  2. At the second level, paid editors write editorial book reviews trying to influence buyers to buy a specific book; these are sometimes quoted from other media sources and at other times are produced for Amazon by its own employees. Mostly, they are commercial in nature, resembling the back–cover promotional blurbs that McGrath decries. The primary objective of these reviews is to offer readers a professional, well–written, mostly positive review of the item for sale.

  3. At a third level, expert–users (reviewers) write free–form reviews and compile best–of lists, which are ostensibly non–commercial and unbiased, however opinionated they may be; in these reviews the expert–users also rank the books on a numeric scale, assigning ‘stars’ from 1 to 5.

  4. At a fourth level lay–users (readers) rate expert–user reviews on a binary usefulness scale (useful or not). A summary of past ratings is available (e.g. “5 out of 10 people found this review useful”). In addition, a ‘report this’ feature (introduced by Amazon.com recently) allows lay–users to report inappropriate content that is then evaluated by Amazon’s staff. The primary objective of these mechanisms is to offer a form of balance to the reviewers’ power. Ostensibly, poor reviews will receive an inferior usefulness ranking, and thus future readers will be alerted and will pay less heed to such negatively rated reviews.

  5. At a fifth level some reviews are highlighted and given more visibility based on the usefulness scale and the ranking of the expert–users who wrote them. Reviews that are found to be useful by more readers, or reviews that are written by credible reviewers (i.e. those reviewers whose reviews consistently get high rankings) are displayed first.

  6. At the sixth level the expert–users (reviewers) themselves are credentialed based on the number of reviews they post and the usefulness of their reviews as evaluated by the lay–users. Reviewers who reach the top of the list receive a visual indication of their status in the form of an icon next to their name that states “Top X Reviewer” (where X is 1000, 500, 100 or 20), and reviewers who hit the top 20 get to write a full profile page that includes their biography and picture.

We clearly see how in a system like this the notions of quality, reputation and expertise are tightly bound, and as will be explained in detail below, are often conflated. There is a direct transition from evaluating the quality of an artifact (both quantitatively using a numerical scale and qualitatively using words) to meta–evaluating the usefulness of that evaluation (on a binary scale), to evaluating the expertise of the expert–user (based on the level of participation and the evaluation of that participation by lay–users). Higher levels of expertise and authority are directly tied to participation; formal credentials and past reputation enter the system only at the first two of the six levels; expertise is measured on a continuous scale and is affected by (a) levels of participation and expert–lay interaction; and, (b) lay–user review of expert activity.
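To make these interactions concrete, the following minimal sketch models levels three through six as described above. It is our own illustration, not Amazon's actual implementation; the class and function names are hypothetical, and the ranking rule is deliberately reduced to the two signals the text mentions, namely review volume and lay–user usefulness votes.

```python
from dataclasses import dataclass

@dataclass
class Review:
    reviewer: str           # pen name of the expert-user (level 3)
    item_id: str            # book or CD being reviewed
    stars: int              # 1-5 rating assigned by the reviewer (level 3)
    helpful_votes: int = 0  # lay-user votes of "useful" (level 4)
    total_votes: int = 0    # all lay-user votes cast (level 4)

    def usefulness(self) -> float:
        """Fraction of lay-users who found the review useful, e.g. 5 out of 10."""
        return self.helpful_votes / self.total_votes if self.total_votes else 0.0

def display_order(reviews):
    """Level 5: reviews judged more useful are surfaced first on the product page."""
    return sorted(reviews, key=lambda r: (r.usefulness(), r.helpful_votes), reverse=True)

def reviewer_ranking(reviews):
    """Level 6: reviewers are credentialed by the volume and helpfulness of their output."""
    scores = {}
    for r in reviews:
        count, helpful = scores.get(r.reviewer, (0, 0))
        scores[r.reviewer] = (count + 1, helpful + r.helpful_votes)
    # The ranking is relative: a reviewer rises only by out-reviewing others.
    return sorted(scores, key=lambda name: scores[name], reverse=True)
```

In a model like this, a reviewer near the top of reviewer_ranking would receive the “Top X Reviewer” badge; note that the rank is purely relative and tied to participation, a point we return to below.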

To understand what expertise means in this context we can compare this system with traditional book review venues where, in contrast, only certified experts get to express their opinion (other than, perhaps, in the ‘letters to the editor’ section where all readers can participate). In a system like Amazon’s the only thing required in order to gain expertise, other than basic language skills, is participation. Expert–users assert themselves as such by writing reviews and compiling their lists. They are then subject to the community’s scrutiny. The community evaluates these contributions and these evaluations in turn translate to increased or decreased levels of visibility. At the top of the expertise scale Amazon tries to mimic offline mechanisms of accreditation by ranking the top reviewers and giving them visual indication of their status as top–reviewers. We read this as an attempt to overcome the problem of information overload to which this system, of course, is not immune. By giving the readers (lay–users) a clear indication of the expert’s status, the site saves the users the effort to try and evaluate the expert on their own (a saving that can be significant when browsing items that have hundreds and sometimes thousands of reviews).

There are two significant differences between this accreditation system and the traditional educational accreditation system we are familiar with: (1) the only accreditation of the ‘experts’ is tied to participation, attests to ‘real–world’ performance within the system, and does not rely on external factors. This is very different from, say, a training certificate that attests to the completion of some educational activity but speaks little to the expert’s ability to partake in hands–on activities; and, (2) the accreditation is relative. A top–reviewer’s only way to reach the top is by topping other reviewers. As we have already seen, though, the actual working of this system is more nuanced.

In summary, the six degrees of reputation are underwritten by various mechanisms for gaining and expressing expertise through writing, reading, and evaluating reviews. At this point one may raise several objections. A first objection will note that books are particular types of goods for which reviews are particularly powerful because they help establish the meaning of the artifacts in question more than, say, for electronic gadgets. While this might be true, it only serves to make our point that in such systems meaning–making, quality assessment, expertise and reputation are conflated. A second objection will alert us to the point that people participating in this system are a self–selected group who, to begin with, ascribe weight to meta–information. Are these observations then generalizable to larger audiences, and can we see their parallels in other contexts? Anecdotal evidence suggests that we can. For example, we recently observed billboard ads in New York subway stations that were adorned by ‘user reviews’ in the form of physical mark–up. It seems as if user input as a form of review is becoming an accepted form of cultural expression. A third objection will note that levels 4, 5, and 6 of the reputation system concern only a small group of lay–experts, and that their activities do not necessarily reflect the activities of reviewers as a whole. This, of course, is true, but does not limit our claim that the system offers built–in incentives for such activities, and that at the outset such incentives might be a part of the solution to the free–rider problem. All these claims are explored in detail in the following sections.

 

++++++++++

Research methods

Our study involved three sorts of activities: participant observation, interviews, and downloading and analysis of data from Amazon.com using special custom–built software. We describe each in turn.

Participant observation

Both authors of this paper are regular browsers of Amazon.com and other product review systems. One of us is an author with many books listed and sold on Amazon.com and we both use Amazon.com to buy new and used books and CDs. We both read and write reviews, rank reviews, review expert user profiles and more. In addition, we have encouraged our students to write book and CD reviews and post them online, and have observed the dynamics of that interaction.

Interviews

We carried out a small number of interviews with a few prominent authors (novelists and non–fiction writers). The key question we were trying to answer was in what ways the relationships between authors and readers have changed as a result of the introduction of these tiered reputation systems. We also investigated the continuities and discontinuities that authors perceived between older models of quality control and reputation management and the new models. We encouraged our interviewees to recall cases in which book reviews were abused in the offline world, and asked them to reflect on the ways in which the review system has changed with the introduction of new systems.

Custom–built software for data download and plagiarism–detection

The primary quantitative tool for this research was a set of software programs that one of us wrote specifically for the purpose of evaluating the prevalence of review copying, plagiarism and abuse. The software included communication modules for downloading data from Amazon.com’s Web site and copy detection algorithms for detecting text re–use within downloaded data. The following section provides a detailed description of this software and the algorithms it uses.

Similarity matching and communication modules

The first task for the software was to identify which books or CDs might have re–used reviews. From a computer–scientific perspective a brute–force comparison of all the reviews ever published is technically feasible, but we did not have access to the full database and saw little point in comparing books from different categories with one another. As a working solution, we decided to compare only those items that are somewhat similar to one another. As a ‘similarity’ criterion we used data available from Amazon’s public application programming interfaces (APIs). Using eXtensible Markup Language (XML), these APIs programmatically expose various types of data, including similarity matching, which is based on collaborative filtering algorithms that use customers’ past purchasing behavior to deduce similarity among items and project customers’ interests. Evidently, such similarity algorithms can be very powerful. For example, the similarity algorithm for Pinch and Trocco’s Analog Days finds Kettlewell’s Electronic Music Pioneers to be similar (Figure 1). Our data download algorithm, then, was seeded with a book or CD title as its source and, using recursive calls to the similarity API, built a virtual graph representing other items which are ‘similar’ to the original one. Such items included both books and CDs.

 

Figure 1: Using similarity data.

 

Having built this similarity graph, the algorithm proceeded to make calls to further APIs that expose the content of user reviews (Figure 2). To stay within the limits Amazon sets on the amount of data that can be accessed through the APIs, we downloaded only the five most recent reviews for each book.

 

Figure 2: Downloading reviews.
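The two download steps can be illustrated with the following sketch. It is a simplified reconstruction rather than the software we actually ran: fetch_similar and fetch_recent_reviews are hypothetical wrappers standing in for the real XML calls to Amazon's similarity and review APIs, and the depth and item caps are illustrative assumptions rather than the limits we used.

```python
from collections import deque

def build_similarity_graph(seed_asin, fetch_similar, max_items=10000, max_depth=5):
    """Breadth-first crawl over the similarity API, starting from one seed title.

    fetch_similar(asin) should return the list of item identifiers that the
    collaborative-filtering similarity API reports as 'similar' to the given item.
    """
    graph = {}
    seen = {seed_asin}
    queue = deque([(seed_asin, 0)])
    while queue and len(seen) < max_items:
        asin, depth = queue.popleft()
        neighbours = fetch_similar(asin) if depth < max_depth else []
        graph[asin] = neighbours
        for n in neighbours:
            if n not in seen:
                seen.add(n)
                queue.append((n, depth + 1))
    return graph

def download_reviews(graph, fetch_recent_reviews, per_item=5):
    """For every item in the similarity graph, keep only the most recent reviews,
    mirroring the five-review limit used to respect Amazon's API quotas."""
    return {asin: fetch_recent_reviews(asin)[:per_item] for asin in graph}
```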

 

Copy detection

The next step of the algorithm was to compare the reviews to one another (see Appendix for a more detailed technical description of the algorithm). Simply put, the algorithm looks for text re–use at the sentence level and produces lists of re–used texts ranked by the amount of similar text and the probability of copying (Figure 3). Importantly, the algorithm is able to detect re–use even if the re–used sentences appear in a different order within the paragraph. For the examples described here we selected cases in which more than one sentence was re–used.

 

Figure 3: Comparing similar reviews.
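A minimal sketch of this kind of sentence–level, order–insensitive matching is shown below. It simplifies the algorithm detailed in the Appendix rather than reproducing it: sentences are normalized and compared as sets, and a pair of reviews attached to different items is flagged when at least two sentences are shared, mirroring the more–than–one–sentence threshold used for the examples in this paper.

```python
import re
from itertools import combinations

def sentences(text):
    """Split a review into normalized sentences (lowercased, punctuation stripped)."""
    out = set()
    for part in re.split(r'[.!?]+', text.lower()):
        cleaned = ' '.join(re.sub(r'[^a-z0-9 ]', ' ', part).split())
        if len(cleaned.split()) > 3:  # ignore very short fragments
            out.add(cleaned)
    return out

def detect_reuse(reviews, min_shared=2):
    """Flag pairs of reviews on different items that share at least min_shared
    sentences. reviews is a list of (item_id, reviewer, text) tuples; because
    sentences are compared as sets, the order of re-use within a paragraph
    does not matter."""
    indexed = [(item, reviewer, sentences(text)) for item, reviewer, text in reviews]
    hits = []
    for (item_a, rev_a, s_a), (item_b, rev_b, s_b) in combinations(indexed, 2):
        if item_a == item_b:
            continue  # duplicates on the same product page are a separate case
        shared = s_a & s_b
        if len(shared) >= min_shared:
            hits.append((item_a, item_b, rev_a, rev_b, len(shared)))
    # Rank candidate pairs by the amount of re-used text.
    return sorted(hits, key=lambda h: h[-1], reverse=True)
```

Comparing every pair of reviews is quadratic in the number of reviews, which is one reason the comparison was restricted to items already linked in the similarity graph.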

 

 

++++++++++

Empirical findings: Strategies and practices of reputation management

At this stage we report conclusions from evaluations of more than 50,000 user reviews pertaining to over 10,000 pseudo–randomly selected books and CDs. Our findings allow us to estimate that about one percent of all review data is duplicated, verbatim or with variations. The similar patterns observed across different genres of books and CDs suggest that our findings will be corroborated by larger datasets, but further research on such extended data is necessary [5]. In most of these cases the copying is done by a person who writes an original review and copies it to different items, with or without an attempt to change their reviewer identity. Importantly, we have not found further cases of reviews that were copied from a book or product to a competing book or product, as in the case that triggered our investigation.

The numerous cases of review re–use can be grouped into several categories:

  • Reviews copied from one item to another in order to promote the sales of a specific item.

  • Reviews posted by the same author on multiple items (or multiple editions of the same item) trying to promote a specific product, agenda, or opinion.

  • Reviews posted by the same author using multiple reviewer identities to bolster support for an item.

  • Reviews (or parts thereof) posted by the same reviewer to increase their own credibility and/or to build their identity.

The following sections describe examples of how these strategies are employed.

Cases of product, opinion or agenda promotion

The most common use we identified for duplicated reviews was the promotion of an agenda, a product or an opinion. In several cases, review space was used simply for free advertisements or spamming. For example, in the review space for drummer Bill Bruford’s Earthworks CD, which was sold on Amazon.com for US$19.98, we found the following ‘review’:

AMAZON IS THE WRONG PLACE TO BUY THIS / As with all the BB remastered Winterfold/Summerfold titles, this CD can be found @ $14.98 direct at billbruford.com. [6]

The same reviewer posted the same ad on other items as well and also participated persistently in attempts to boycott products and promote alternative products. Surprisingly, the ad stayed in the review space without Amazon.com taking action against it. Not surprisingly, perhaps, users did not find it necessary to report this ‘review’ as inappropriate content using the ‘report this’ feature. Many of them probably found this information useful. Users we have talked to also reported more extreme cases found on other CD product pages, where sometimes the review space is used to post links that point directly to digital copies of the music itself (often as torrents, files used by the popular peer–to–peer software BitTorrent; the music is hosted on users’ computers rather than on a central server and is thus more resistant to threats from copyright owners). Under this scheme the Amazon system is used simply as an easy–to–use index of copyrighted music (ironically, providing such an index was the original goal of Napster before it was sued out of existence by the music industry).

Another strategy can be found in the form of agenda promotion. Following are two examples from the political memoir genre. In one case, the review space for many of the books critiquing President George W. Bush was used to promote a conspiracy–theory video. Variations of the following excerpt were copied multiple times:

... check out the movie 911 in Plane Site. www.911inplanesite.com This movie shows suppressed news footage and video evidence from 9/11 that proves the government’s “official story” is ludicrous. For example, the Pentagon was hit by something, but it wasn’t a passenger airliner. That’s why we haven’t seen Pentagon footage. Please, check out this movie and show as many people as possible. [7]

The reviewer’s strategy is sophisticated. By inserting this morsel of information into the review space of books that are critical of the Presidency, the writer is assured of speaking to a specific type of audience that is likely to be receptive to his agenda. A more complicated use of the same strategy was employed against Henry Kissinger. In this case text was entered into the review space of all of his books and of many books written about him. The reviewer, of course, proposes an alternative in the form of another book.

... if you want the evil truth about Dr K and how he undermined the 1968 peace talks, read “No Peace, No Honor: Nixon, Kissinger, and Betrayal in Vietnam” by Larry Berman. This book explains how Nixon and Kissinger illegally colluded with SVN and Nguyen Van Thieu — he was told by Nixon via Anna Chenault to “hold on, we are going to win” and “you will get a better deal with us”. So Thieu says he won’t talk peace, Nixon wins, Kissinger openly changes sides after working with the Democrats, and together they crank up the war. The point is: The War could have ended in 1968 if it were not for this man — Dr Death himself, Henry Adolf Kissinger! [8]

A more subtle strategy in this category calls for posting the same review multiple times for the same item, under different reviewer names. Our data show a multitude of cases like this, which make use of what have become known as ‘sock puppets’. Sock puppets are virtual identities used to lend apparent support to another identity’s opinions. Under this strategy the same review will be repeated over and over using different ‘puppets’ speaking in the same voice. A reviewer called bookcritic.com, for instance, shows up with the same reviews as ‘faithful_reader’. This strategy is especially useful for popular items that receive dozens of reviews. In these cases the reader is not aware of the fact that the reviews are duplicated, since the reviews are spread over multiple pages, and readers’ limited attention spans usually prevent thorough browsing through more than a handful of reviews. It is important to note, however, that in several cases there seemed to be little commercial intent in this strategy. In some cases it was obvious that users had for some reason lost access to their older identities and built new user profiles, taking the trouble to manually copy all their former reviews and post them anew under the new identity. This suggests that reviewing is highly connected to identity construction, on which we elaborate next.

Identity building

Several cases demonstrate how online reviewing is becoming an activity that is aimed not only at assessing the quality of information goods for sale but also at constructing the reviewers’ own identities. In one case, for example, the reviewer took the opportunity to use review space for direct communication with the famous rock musician Bruce Springsteen, posting reviews on several of his albums, including Human Touch and Tunnel of Love. In a review titled “YO BOSS! UP OFF YOUR BUTT & GET BUSY REMASTERING!” the user writes:

Hey, BRRRRRUUUUUCCCE! What’s the deal? Just about every major Top 40 artist has had their catalogs sonically updated EXCEPT YOURS. Why can we buy the “Tracks” editions, and get glorious HDCD–encoded sound, but “Touch” sounds like it’s coming out an AM radio? OK, you did your best, but Dubya’s back in the White House for the next four years, and Kerry’s home in his underwear watching the Weather Channel. You should have plenty of time on your hands now... GET BUSY! Let’s see some remastering! [9]

Such direct dialog speaks both to the artist and to the audience. By writing in this language the reviewer establishes himself as an expert who is in the privileged position of telling the artist what to do. This is a reversal of the usual roles of artist and audience. Our interviews with authors corroborate the user’s premise that artists often go and read user reviews, looking for feedback. Interestingly, one of our interviewees, novelist J.R. Lennon, reported that he stopped doing so when he understood that some reviewers were reading his books in ways different from how he intended them to be read. Lennon had initially liked the access to readers’ comments, and indeed, on receiving his first negative comment, he actually posted a reaction to it. He told us:

I wanted to defend myself to people who were being really nasty in their customer comments ... . I was shocked that people were not discussing the book in any analytical or rational way. They were just sort of blurting out their gut reaction to it under the cloak of anonymity. The reaction to it was shocking. I think that’s why I posted a comment. Of course that’s not shocking anymore, because that’s what the Internet is full of now, and customer comments are sort of accepted as a reasonable form of book review.

He soon, however, asked Amazon.com to withdraw his comment:

I called Amazon and asked them to remove my own comment on my book, [I felt that] the whole concept of commenting on your own book on the Internet was really stupid.

Q: Why is it stupid?

Because if somebody wants to ask me about it they can, I feel the book is self explanatory. If someone wants to ask me about the book I don’t mind talking about it. So when I talked to the Amazon guy about removing it, he said “yeah everyone’s doing it.”

Here this author can be seen to be adjusting his own practices to life in this new and unaccustomed digital medium. There is no doubt that, over time, this author’s experience of the new reviewing system has become largely negative:

Basically I hate Amazon customer comments. I despise them, because any anonymous prick can wreck your day without even looking at your book. If your book is coming out at the same time as someone else’s they can just send their friends in to sabotage your rating.

Lennon found that early on he himself was drawn into the game of writing reviews simply to defend a book against what he perceived as unfair comments:

The only review I ever remember doing was for a guy named Stewart O’Nan, who was in the writing program in Cornell. He lives in Connecticut now. He wrote this terrific book, a very short novel about a priest in a small village in Wisconsin when plague and fire hit at the same time [Prayer for the Dying] It’s very dark and in second person. And someone had written in an Amazon customer comment who hated Stewart O’Nan, saying it was plagiarized, ... . It was really nasty and had nothing to do with the book, and I actually wrote a review defending the book.

Q: Did O’Nan ask you to do that?

No, I barely know him. I was furious.

The temptation to attack others by posting negative reviews, to write in support of oneself, or to fend off attacks on one’s friends and colleagues who receive negative reviews, is something which we are finding many users have felt and — on occasion — succumbed to. It is simply so easy to do and — with the possibility of posting anonymously — is almost totally cost free. In the course of this research we have encountered numerous stories of publishers, friends of authors and so on engaging in these practices. It seems likely that a significant percentage of Amazon book reviews, and probably of online product reviews in general, are generated in such a way. The effect is that the “worth” of even positive reviews becomes more and more difficult to gauge.

One interviewee, for example, an academic who published a well–received book, was frustrated when she found out that her reviewer was less genuine than she had originally thought. She commented to us in an e–mail:

[This reviewer] wrote a very lengthy and kindly review of my own book on Amazon, but upon surfing semi–randomly over the years for books by others that interest me, I see he has been everywhere, and every damned book he reviews gets 5 stars. It turns out he is “reviewer #47” [and] has written over 600 (five–star) book reviews. Is he really so easy to please? Has he also read hundreds of other books, which don’t merit five stars and which he generously declines to review so as not to post criticism? Does he perhaps pilfer his prose from other places? It might bear looking into. I can say that I never paid a dime for his flattery, but maybe he sells it to others ... I’m curious as to what he gets out of all these five–star reviews. Does he hope to ingratiate himself with authors, is he a frustrated ABD, or is he just a history whore?

Our initial data suggest that many of the reviews generated by top–ranked reviewers are indeed positively biased, and that these reviewers, perhaps inadvertently, continue to generate this bias in order to maintain their good standing in the reviewer rankings. Further research is needed, however, to establish this finding.

In another case a user reviewed several Tom Hanks/Meg Ryan movies. The user posted the same review for the movies Sleepless in Seattle and You’ve Got Mail. He found that each of those films was “a film about human relations, hope and second chances, but most importantly about trust, love, and inner strength.” [10] As we know, especially with the demands for producing one blockbuster after another, Hollywood movies are sometimes strikingly similar, and yet posting the same review for two different films suggests that the reviewer is interested less in accurate representation of the movie’s content or qualities and more in the sort of reputation and identity that he or she can build as someone who posts numerous reviews.

You might think that there is little harm in posting duplicated reviews like this. After all, the reviewer’s statements could independently be true of more than one movie, and there are no serious consequences if they are not. One reviewer who regularly posts music reviews on an online music site (ACIDplanet.com), where musicians can post their own music as well as review and download others’, told us that he only published positive reviews because he knew the chart rankings on this site were based mainly on such customer reviews and he had found that if he reviewed other artists favorably he would get positive reviews in return (known to the users as R=R). But such positive reviewing stretched his capacity to find enough words to post the same plaudits without actually repeating himself:

If I feel that I’ve written a lot, the same thing, I’ll get the dictionary out and change a few words. And I just, say, “I’ll change ‘creative’ to ‘inventive’.” Oh, that’s good! I’ll change, you know, ‘cool’ to ‘groovy.’ Oh, that’s good, you know!

This artist stressed that he found it unethical to actually copy a review but he encountered such copied reviews all the time:

And there’s people who post the same review over and over again.

Q: You’ve seen that?

Oh yeah, yeah. I’ve gotten a review, and it’s just all positive stuff, and then I’ve gone and I’ve listened to someone else’s music, “Hey, you know, this guy reviewed — oh it’s the same review!” And he just cuts and pastes, you know, a positive review. Or a negative review, for that matter. Uh, I see that a lot.

This problem takes on even more salience, however, when the product being reviewed is not a book or piece of music but a drug. We have found a case like this on DrugRatingZ.com, a site used by people to post reviews of prescription drugs. In this example, the review concerned the drug known by the generic name Lorazepam or by the brand name Ativan. The review reads:

Works well if you have never used benzodiazepines regularly. It is very habit–forming and stops being effective if used too often. Makes you very drowsy, but can be extremely useful for panic attacks. [11]

At first glance this seems like a simple, authoritative statement, but further browsing on the site reveals the review for the drug known by the generic name Clonazepam or the brand name Klonopin, where the review reads:

Works well if you have never used benzodiazepines regularly. It is very habit–forming and stops being effective if used too often. Makes you very drowsy, but can be extremely useful for panic attacks. [12]

A short consultation with a medical doctor reveals that these drugs are closely related members of the same family (both are benzodiazepines). Arguably, under those conditions it makes sense for the review to be copied, but given that in the latter case we are dealing with reviews of drugs that might have significant health effects, should the norms for such reviews be stricter than the ones for book and music reviews? We came across this one case by accident, and further research on drug review sites is clearly called for.

Another common form of copying we found was when part of the review space was used for summaries of a topic that were then used over and over again. For example, a top–20 reviewer with over 1,350 written reviews has made a point of reviewing books dealing with black history, and in several of his many reviews has used the same paragraphs as excerpts within specific reviews. Is there anything wrong with doing so? Should such self–plagiarizing be condemned? The answers to these questions are not clear and depend on the framework of analysis (Samuelson (1994), for example, proposes to consider these within the framework of fair use). What is clear, however, is that these well–written reviews (which, to be sure, demand a lot of time and attention) have in turn been highly ranked, allowing this reviewer to sustain his status as a top reviewer and continue the cycle that allows him to spread his ideas. In cases like this we see how strongly the reviewer’s strategies of identity building can be tied to those of agenda promotion. Apparently, self–plagiarism is a phenomenon common in other settings as well. The journal Nature recently published a special report, “Taking on the cheats”, which discusses self–plagiarism in academic settings. The report notes:

Self–plagiarism, in which authors attempt to pass off already published material as new, is a particular problem. In an increasingly competitive environment where appointments, promotions and grant applications are strongly influenced by publication record, researchers are under intense pressure to publish, and a growing minority are seeking to bump up their CVs through dishonest means ... . And although most cases are never discovered, almost all of the editors and publishers contacted by Nature agreed that self–plagiarism is on the rise. [13]

The question that remains open concerns norms. Unlike in academic settings, where the motivation for such practices is clear and there is a good moral and practical basis for scorn, here it is not clear what the standard should be. We return to this point later.

Why write reviews?

All book reviewing takes time. Obviously, professionals who write reviews for major organs such as the New York Times get paid, but, more importantly, reviewing adds to their identity as authors. A well–written review by a well–known author in a prominent place can itself become a topic for further discussion and, on rare occasions, a literary event in its own right. The New York Times will often review a book only if another suitably well–known person with special salience for the topic can be located and is willing to do the review. One novelist told us that there is also a possible “pay back” in reviewing for the New York Times — if you review for them you are more likely to get your own work reviewed there. Academic book reviewing pays nothing (other than a free copy of the book), and counts for little on academics’ resumes for job applications. Again it would seem that reviewers use such a medium to build their academic identities such that they can act as authoritative gatekeepers evaluating new work in their fields. Why do people engage in online reviewing? What is in it for them? One possible answer is that some reviewers on Amazon.com hope that eventually they might break into the offline world of paid reviewing. In what we might call “market signaling”, a top–20 reviewer in his twenties, for instance, declares in his profile: “My objective is to do what I do here for a living.” With hundreds of high–ranked reviews under his belt, this reviewer is signaling to the market that he is worthy of a job as a professional book reviewer.

In other cases, people, often adolescents, write reviews as a social practice and as conversation pieces with their significant others. It seems to be empowering to them to see their name and review on a Web site attached to a famous movie. We have witnessed cases in which children wrote DVD reviews for movies after they had seen them in the theater. They would later send links to their reviews to their friends, and take pride in their ability to ‘publish’. In other cases, our observations of students who participated in online book and CD reviewing as a form of writing practice suggest that such activities are also very empowering, especially when those reviews received high rankings on the usefulness scale. In those cases young students who had never ‘published’ before were very excited to see that their short pieces of writing (reviews) were actually read by many. Reportedly, that moment of interaction demonstrated to many of these students how powerful a tool their writing could be.

Users as active agents

Taken together these examples suggest that the new (and still evolving) systems of online product reviews do not function straightforwardly as a democratic pooling of expertise. As we have shown, users who write product reviews are engaged in a variety of activities: promoting agendas, carrying out personal attacks, boosting their own and others’ reputations, building their own identities as reviewers, experiencing for the first time the empowerment of publication, and so on. Of course, we should also not lose sight of the fact that the majority of reviews can probably be taken at face value and are authors’ attempts to give their own honest appraisal of a product. The multiple uses of this new digital technology we have encountered demonstrate a theme becoming increasingly prominent in the history and sociology of technology — the power and agency of users (Kline and Pinch, 1996; Oudshoorn and Pinch, 2003). Users are not only a source of new innovations (von Hippel, 2005) but they can radically reinterpret the meaning of existing technologies — a process known in the sociology of technology as “interpretative flexibility” (Bijker, et al., 1987). For example, early rural users of the Model T in the U.S. found new uses for the auto, not as a means of transportation but rather as a source of stationary power that could be harnessed to various appliances (Kline and Pinch, 1996). Similarly, in the cases we have documented above we clearly see how, for various reasons, whether personal or commercial, users find loopholes in the system’s design which help them game the system to their own ends. To what degree do these behaviors hinge upon specific digital system designs? To what extent do they represent departures from existing offline models?

 

++++++++++

Continuities and discontinuities from earlier models and the offline world

New affordances?

As a way of helping to think about this problem we would like to start with the concept of technological affordances. Affordance is a term originally coined by perceptual psychologists and used in the fields of cognitive psychology, environmental psychology, industrial design, and human–computer interaction. The term was first introduced by psychologist James J. Gibson in 1966 and then explored more fully in The Ecological Approach to Visual Perception (1986), where Gibson investigates affordances for action (e.g., the empty space of a doorway affords walking through the door). Donald Norman further developed the concept in his book The Psychology of Everyday Things (1988). His definition of an affordance is the design aspect of an object or a system that suggests how the object can and should be used:

... the term affordance refers to the perceived and actual properties of the thing, primarily those fundamental properties that determine just how the thing could possibly be used. A chair affords (‘is for’) support and, therefore, affords sitting.

However, this definition of affordance is too tied to the psychological literature and assumes too essentialist a use of a product for our purposes. We want to make the notion of “affordance” consistent with the sociology of technology. It is clear that users come up with new and unexpected uses of technology and that it is problematic to read off a definitive or ‘best–use’ from the design of an artifact. How a chair will actually be used depends on the context it is used in — for example some chairs may never be sat in and are always used as foot rests. Furthermore, what a technology is good for, or what can be done with it, is in itself a process of social construction. The analysis of affordances cannot be done by looking at technology alone. The social, historical, economic, and legal contexts are decisive in shaping the ways that technologies are interpreted.

An instructive example is that of the early French videotex system known as Minitel (Schneider, et al., 1991). This system is often seen as a precursor to the Internet; it enabled French users to exchange and post information with each other from their home terminals about restaurants and other products and services, including, famously, sexual services. The rapid take–up of the interactivity dimension of Minitel was something that completely surprised the engineers who had designed the system. They had included interactivity because it was technically feasible to do so, but they did not expect it to become the defining feature of Minitel. Other videotex services, like the U.K. British Telecom system Prestel, had no such interactivity, and users were limited to broadcast announcements about TV programs and services, posted from only a few central locations. If we think of this in terms of affordances we can say that use is constrained by the affordances but that users determine the actual use of the technology [15]. For example, the affordance of a chair allows it to become a foot rest but does not allow it to become a Minitel terminal [15]. We should be careful to distinguish physical limitations on affordances from social (including political, cultural and economic) limitations. For example, the physical limitation on the affordance of a book review does not constrain its length (physically we could write book reviews that were the length of novels) but social limitations constrain length [16]. In sum, we can say that affordances are socially constructed, and we can ask which new affordances the tiered reputation systems offer. How do authors and readers construct these affordances and how are they reflected in their practices? Which affordances make it easier to abuse review space? Are there parallels in the offline world?


One of the main affordances which the digital systems allow is that the barrier to entry into participation is dramatically lowered. All one needs to take part is access to a computer connected to the Internet and some rudimentary keyboarding skills. No one cares if you have ever written a book review before, what your age or qualifications are and whether you can even write. Even children can participate. The Internet, of course, does not permit limitless participation (there exist both access and skill digital divides) but it does give greater affordance to participation than any offline medium.

A second difference in affordances concerns the length of the reviews. Book reviewing in the physical world is constrained by length (especially for important organs like the New York Times). The different nuances in the length of reviews in the New York Times are something to which authors can be finely attuned. As one novelist told us:

This is not to say that longer does not always equal better because short reviews can force the reviewer to be more meaty and succinct and lead to a more quotable quote for blurbing, but there is no doubt in the Times longer and more towards the front part of the weekly review is best.

In the online review a pre–specified word–length limitation is removed. Reviewers can post reviews of just a few sentences or can write an extended essay — it is their choice. The review system affords very lengthy reviews to be posted. In practice we find very few online reviews that actually take up the offer of extended space. Most reviews are rather short, and yet the lack of length limitation allows a multitude of review lengths to co–exist.

A third, more important physical limitation that is removed in the online world is the constraint imposed by time and deadlines. Reviewers can take as long as they like to produce their review and post it when they like. This actually makes a difference for books that are not meant to be read quickly — some books are best read a few pages a day, but this is not how a reviewer with a deadline to meet reads them. In this case, online review systems, with their lack of deadlines, give affordance to a greater variety of reading practices by reviewers.

A fourth affordance facilitated by online review systems is the ability to refer to earlier reviews, so that a dialogue of sorts can ensue. Several instances show reviewers disagreeing with earlier reviews — something very rare indeed in the physical world of reviewing. Interestingly, when Amazon changed the ordering of reviews from chronological to “most useful,” it disrupted this dialogue, since the chronology is lost. This affordance can itself be abused. One interviewee reported a case in which reviewers who were part of a group of ‘anarcholiterists’ systematically discredited earlier positive reviews of an author’s book as part of an attempt to pursue a grudge against him, in revenge for his earlier criticism of their group in a newspaper article.

As we have seen earlier, a significant affordance of the new class of online review systems is, ostensibly, the ability to copy or cut and paste reviews from one item to another at negligible cost. This affordance forms the basis of many of the practices we described in detail above, but is it really a new affordance? One interviewee, novelist Brian Hall, tells the story of offline review plagiarism in English newspapers stemming from a blurb for one of his books:

I had noticed six months or so after the publication of the hard cover, when the clippings of the really little papers are coming in, that there were two or three [of these duplicated], and it may be of course some of these little papers in England, like here, are just syndicated with each other. So it may have been one person who did this. That then got farmed out to like three or four of these different little papers. It was basically just publisher’s blurb. “This is an enthralling journey into ... .” That kind of publisher talk. And so they just copied that, and at the end they added one sentence of like “so all in all a good read,” and then signed it and got paid for it. I chuckled when I saw it and stuck it in a box. “Oh look, some lazy guy scamming his editor, and the editor’s too lazy to even notice.” ... And then six months later when [my publisher] came out with the paperback, and they were putting together the back and trying to quote things, sure enough they quoted one of these things. Of course it sounded great, it was the publisher’s copy! And there too the editors of the paperback hadn’t noticed that it was exactly the same language as the flap copy on the hardcover. So I called them up and said, “Yes, this is a wonderful review, it sounds great. I don’t think we should have it, because if you’ll just turn to your hard cover you’ll notice that we wrote that.” So they took it off.

In summary, we are not claiming that plagiarism and copying do not take place in the offline world — clearly they do — it is just that in the online world copying becomes even easier. Speaking in the language of affordances, we can say that the new online recommendation systems offer, on the one hand, opportunities for greater participation and for larger variance in review style and length, and, on the other hand, easier mechanisms for copying and abuse. More people participate in the review process, but foul play also occurs more often.

Cross influences

An often overlooked aspect of the new systems is their cross influence on older systems. These cross influences can work in any of the six degrees of reputation. For example, one author encouraged all his friends to buy his book at a special time called ‘the Amazon hour’. This resulted in a spike in his book’s sales and brought the book to fourth position in the best–selling books of its category. Once that ranking had been established, the author received ‘Amazon best–selling author’ status, which, significantly, has no time dimension. This author can now enjoy that status for the rest of his career, online or offline, even though his best–selling tenure lasted only a few hours.

In another case, we have witnessed offline book stores that allow users to ‘publish’ book reviews using simple sticky notes (Figure 4). This cross–influence speaks to the profound ways in which online participation goes back to change offline practices. Once user reviews have been legitimated online, there is little resistance offline to the practice of users intervening in a process that was formerly a simple business transaction between a shop and a customer. Shops that allow this practice evidently believe that the shopping environment they foster will allow their business to flourish; at the same time, they are engaging in “me–too” behavior, offering their customers an online–like experience.

 


Figure 4: Offline user reviews for books using sticky notes.

 

 

++++++++++

Conclusion — Beyond six degrees of reputation?

Let us summarize. Clearly, the world of online recommendation systems is a world in transition. New systems introduce new technological affordances that make certain activities easier than others. Authors, artists, editors and users find creative ways to interpret those technologies, and often use the systems in richer ways than the designers originally intended.

Lawrence Lessig’s model of norms, law, markets, and code and their interplay is useful as a way of making sense of this changing realm. Lessig (1999) argues that these four categories constitute regulatory regimes that influence the behavior and freedoms of individuals within a given society (particularly cyberspace). Law, norms, market forces and architecture (code) together set the constraints and limitations on what we can or cannot do. They are all forms of regulation of human practices, which determine how individuals, groups, organizations, communities or states regulate and are regulated. Lessig argues that it is essential to consider these four forms of regulation as they pertain to one another, because they interact, can compete, and can each reinforce or undermine the others. Lessig further describes in detail how power nexuses such as the movie industry use their control over a combination of these categories to fortify their long–standing interests. In Lessig’s view, we should understand these interactions and intervene in them if we want to create a more equitable and just society.

In our context we have seen that what is ostensibly an activity within a market (online book and CD sales) is actually influenced directly by Code (i.e., the technological affordances that the system offers), by Norms (which, as we have seen, are often not yet stabilized), and to a lesser degree by the Law (a treatment of the legal aspects of these systems deserves a separate article and is beyond the scope of this work). Indeed, as the system matures we can expect all four components to come into play. There does, however, seem to be something that needs to be added to Lessig’s account, and that is the power of users to interpret technologies and their use in ever new contexts. The sorts of user practices we have documented are not things that can be read formally into the system in advance. User practices in the form of local resistance, well before they stabilize into norms, can be significant drivers that influence the stabilization of a technology. Norms, laws, markets and code are often, as this case shows, in flux in the early days of a technology and will change further as users evolve new practices and the system’s designers and operators respond to them.


And here our intervention as scholars forms another interesting potentiality in the system. The sorts of practices we have documented in this paper could have been documented by Amazon.com itself (and for all we know may indeed have been documented). Furthermore, if we can write an algorithm to detect copying, then Amazon.com could go further and use such algorithms to alert users to copying and, if necessary, remove material. If Amazon.com were to deploy such an algorithm and, say, remove copied material, this would not be the end of the story. Users would adapt to the new feature and would no doubt find new ways to game the system. At some point, though, we assume that, as with offline reviewing, the system will become fairly stable.

By studying a technology in transition, we were able in this paper to focus on novel user practices. What we hoped to gain from studying these cases is a better understanding of the system as a whole, not only where it ‘fails’ but also of its potential when it functions properly. There is little doubt that online book review systems are here to stay and that they are already changing many aspects of the book world as we thought we knew it. Whether these changes are good or not depends significantly on us as users. As we have shown, technology unfolds as user practices evolve — it is users who can and must play a role in shaping this technology.

 

About the authors

Shay David is a PhD candidate at Cornell’s Science and Technology Studies department and is affiliated with Cornell’s Information Science Program. Shay is also a fellow at The Information Society Project at the Yale Law School (http://www.law.yale.edu/isp). His dissertation on ‘open systems’ investigates the complex interconnections that exist among technology and ideology, language and legitimation, authority and assimilation, expertise and reputation. Specifically, it explores the links between metaphors and values and the ways by which the former enable or constrain the embedding of the latter within information technology (IT). Shay holds a B.Sc. in Computer Science and a B.A. in Philosophy, Magna Cum Laude, from Tel Aviv University, and an M.A. from New York University, where his interdisciplinary research thesis focused on the political economy of free and open source software and file sharing networks. Shay is an entrepreneur who co–founded two software start–up companies and was involved for several years in cutting–edge software research, combining open source and proprietary software. He divides his time between Ithaca, New Haven and New York City, where his wife Ofri, an exhibiting video artist, is working on several large–scale art projects. For more information check out http://www.shaydavid.info or mail to sd256 [at] cornell [dot] edu

Trevor Pinch is Professor of Sociology and Professor of Science and Technology Studies at Cornell University. He holds degrees in physics and sociology. He has published fourteen books and numerous articles on aspects of the sociology of science and technology. His studies have included quantum physics, solar neutrinos, parapsychology, health economics, the bicycle, the car, and the electronic music synthesizer. His most recent books are How Users Matter (edited with Nelly Oudshoorn, MIT Press, 2003), Analog Days: The Invention and Impact of the Moog Synthesizer (with Frank Trocco, Harvard University Press, 2002) and Dr Golem: How To Think About Medicine (with Harry Collins, University of Chicago Press, 2005). Analog Days won the 2003 silver award for popular culture “Book of the Year” from Foreword Magazine. He is currently researching the online music community ACIDplanet.com.

 

Acknowledgments

We wish to thank Daria Soronika from Cornell’s Computer Science department for her work on the plagiarism detection algorithms that were essential to our project. We wish to thank Joseph (Jofish) Kaye and Phoebe Sengers for alerting us to several examples of review copying practices. We wish to thank our interviewees and students for participating in this study. A draft of this paper was presented at the Annual Meeting of the Society for the Advancement of Socio–Economics (SASE) in Budapest, July 2005; we wish to thank SASE participants for useful comments. Further drafts were presented at Cornell University during the Cornell Conference on Economic Sociology, September 2005, and as part of New York University’s Information Technology and Society colloquium series. We wish to thank the participants of these fora for insightful ongoing discussions, especially Mike Lynch, Bruce Lewenstein, Helen Nissenbaum, Gaia Bernstein, Anindya Ghose, Kim Taipale, Eddan Katz and Jack Balkin. We also wish to thank Harry Collins and Robert Evans for a useful in–depth discussion concerning the role of reviews in meaning–setting and purchasing behavior, and John Lesko and his team for their detailed review of this work.

Naturally, any and all errors remain our own.

 

Notes

1. Source: http://www.amazon.com/gp/product/customer-reviews/0743244508/ref=cm_cr_dp_2_1/102-6460714-8298569?%5Fencoding=UTF8&customer-reviews.sort%5Fby=-SubmissionDate&n=283155.

2. Source: http://www.amazon.com/exec/obidos/tg/detail/-/0674008898/qid=1124793312/sr=8-2/ref=pd_bbs_2/102-6460714-8298569?v=glance&s=books&n=507846.

3. Source: http://www.amazon.com/exec/obidos/tg/detail/-/1931140170/qid=1124793411/sr=1-1/ref=sr_1_1/102-6460714-8298569?v=glance&s=books.

4. Source: http://www.amazon.com/gp/product/customer-reviews/1931140170/103-5681666-4511814.

5. Further research is needed primarily to establish correlations between review type and copying practices. Further research is also necessary to fully evaluate editorial review practices and their uptake within user reviews. As the research progresses, it is our intention to stabilize the software and offer it for free download as an open source package so that interested parties can corroborate our data and use it for further research.

6. http://www.amazon.com/gp/cdp/member-reviews/AGEJE3WH26UBR/ref=cm_cr_auth/103-4232317-9497407.

7. http://www.amazon.com/gp/cdp/member-reviews/AF4XMZXOYWR7L/.

8. http://www.amazon.com/gp/cdp/member-reviews/AHSOTSV5VRTAH/.

9. http://www.amazon.com/exec/obidos/ASIN/B0000028SR/qid%3D1115574821/sr%3D11-1/ref%3Dsr%5F11%5F1/103-4232317-9497407 and http://www.amazon.com/exec/obidos/ASIN/B0000026E5/qid%3D1115575375/sr%3D11-1/ref%3Dsr%5F11%5F1/103-4232317-9497407.

10. http://www.amazon.com/exec/obidos/ASIN/B0000AOV4I/qid%3D1115573323/sr%3D11-1/ref%3Dsr%5F11%5F1/103-4232317-9497407 and http://www.amazon.com/exec/obidos/ASIN/6305368171/qid%3D1115573451/sr%3D11-1/ref%3Dsr%5F11%5F1/103-4232317-9497407.

11. http://www.drugratingz.com/ShowRatings.jsp?tcvid=618.

12. http://www.drugratingz.com/ShowRatings.jsp?tcvid=754.

13. Jim Giles, 2005. “Taking on the cheats,” Nature, volume 435 (19 May), pp. 258–259. A section of the report discusses anti–plagiarism measures taken in the e–print repository arXiv.org. For this project we have used an adapted version of the software developed and used by arXiv. See the appendix for details.

14. A related notion here is Madeleine Akrich’s (Akrich, 1992) notion of a technological “script.” As with a literary script, designers try to designate particular patterns of use, which are “scripted” into the technology.

15. This issue is much debated within the field of science and technology studies. For instance see Grint and Woolgar (1992).

16. See Boczkowski (2004) for a nuanced study of how social and technical processes work together to enable new sorts of affordances in online newspapers. Hutchby (2001) uses the notion of affordances in discussing how an interactionist approach towards telephone conversations can be integrated with the sociology of technology.

 

References

M. Akrich, 1992. “The description of technical objects,” In: Wiebe E. Bijker and John Law (editors). Shaping technology/building society: Studies in sociotechnical change. Cambridge, Mass.: MIT Press.

Amazon.com. Web site. Available online at http://www.amazon.com.

Y. Benkler, forthcoming. The wealth of networks: How social production transforms markets and freedom. New Haven, Conn.: Yale University Press.

Y. Benkler, 2002. “Coase’s Penguin, or Linux and the Nature of the Firm,” Yale Law Journal, volume 112, number 3 (December), pp. 369–446; also at http://www.benkler.org/CoasesPenguin.html.

W.E. Bijker, T.P. Hughes, and T. Pinch (editors), 1987. The social construction of technological systems: New directions in the sociology and history of technology. Cambridge, Mass.: MIT Press.

P.J. Boczkowski, 2004. Digitizing the news: Innovation in online newspapers. Cambridge, Mass.: MIT Press.

M. Castells, 1996. The rise of the network society. Malden, Mass.: Blackwell.

J. Chevalier and D. Mayzlin, 2004. “The effect of word of mouth on sales: Online book reviews,” Yale School of Management, Working Paper, ES number 28 and MK number 15, at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=432481 accessed 25 May 2005.

G. Coleman, 2005. “The social production of freedom,” PhD dissertation, University of Chicago.

J.J. Gibson, 1986. The ecological approach to visual perception. Hillsdale, N.J.: Lawrence Erlbaum.

J. Giles, 2005. “Taking on the cheats,” Nature, volume 435 (19 May), pp. 258–259.

K. Grint and S.W. Woolgar, 1992. “Computers, guns and roses: What’s social about being shot?” Science, Technology & Human Values, volume 17, pp. 368–379.

P. Himanen, 2001. The hacker ethic, and the spirit of the information age. New York: Random House.

E. von Hippel, 2005. “Democratizing innovation: The evolving phenomenon of user innovation,” Journal für Betriebswirtschaft, volume 55, number 1, pp. 63–78.

I. Hutchby, 2001. Conversation and technology: From the telephone to the Internet. Cambridge: Polity Press.

G. Keillor, 1985. Lake Wobegon days. New York: Penguin Books.

B. Kettlewell, 2002. Electronic music pioneers. Vallejo, Calif.: ProMusic Press.

R. Kline and T. Pinch, 1996. “Users as agents of technological change: The social construction of the automobile in the rural United States,” Technology and Culture, volume 37, pp. 763–795.

L. Lessig, 2004. Free culture: How big media uses technology and the law to lock down culture and control creativity. New York: Penguin.

L. Lessig, 1999. Code and other laws of cyberspace. New York: Basic Books.

J. Marcus, 2004. “The boisterous world of online literary commentary is many things. But is it criticism?” Washington Post (11 April), p. BW13, and at http://www.washingtonpost.com/ac2/wp-dyn?pagename=article&contentId=A61073-2004Apr8&notFound=true, accessed 20 February 2006.

D.A. Norman, 1988. The psychology of everyday things. New York: Basic Books.

N. Oudshoorn and T. Pinch, 2003. “How users and non–users matter,” In: N. Oudshoorn and T. Pinch, (editors). How users matter: The co–construction of users and technologies. Cambridge, Mass.: MIT Press, pp. 1–29.

T. Pinch and F. Trocco, 2002. Analog days: The invention and impact of the Moog synthesizer. Cambridge, Mass.: Harvard University Press.

E.S. Raymond, 1999. The cathedral and the bazaar: Musings on Linux and open source by an accidental revolutionary. Sebastopol, Calif.: O’Reilly.

P. Resnick and H. Varian, 1997. “Recommender systems,” introduction to special section of Communications of the ACM, volume 40, number 3, pp. 56–58.

W. Safire, 2005. “Blurbosphere (On Language),” New York Times (1 May), section 6, column 1, p. 26.

P. Samuelson, 1994. “Self–plagiarism or fair use?” Communications of the ACM, volume 37, number 8, pp. 21–25.

S. Schleimer, D. Wilkerson, and A. Aiken, 2003. “Winnowing: Local algorithms for document fingerprinting,” Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 76–85.

V. Schneider, J.–M. Charon, I. Miles, G. Thomas, and T. Vedel, 1991. “The dynamics of videotex development in Britain, France and Germany: A cross–national comparison,” European Journal of Communication, volume 6, number 2, pp. 187–212.

B. Strauss, 2004. The Battle of Salamis: The naval encounter that saved Greece — and Western civilization. New York: Simon & Schuster.

F. Turner, 2005. “Where the counterculture met the new economy: The WELL and the origins of virtual community,” Technology and Culture, volume 46, number 3, pp. 485–512.

 

Appendix — Copy Detection Algorithm technical description

The copy detection algorithm in our solution is based on a modification of the winnowing algorithm (Schleimer, et al., 2003). The original algorithm works at the level of symbols; Daria Soronika’s adaptation of the algorithm to text problems moves it to the word and sentence levels. Winnowing has two parameters: k and t. Sequences of k symbols are called k–grams. Each k–gram can be converted into a number by some hash function. A document is represented by a set of fingerprints — a subset of all possible k–grams extracted from the document. The main idea of winnowing is the following: for each window (sequence) of size t > k, at least one k–gram from that window is chosen into the set of fingerprints. The chosen k–gram depends only on the content of the window and not on its position in the document. Therefore, if two documents share substrings of length at least t, then their sets of fingerprints will share some fingerprints as well.

In summary, in our version of the algorithm we compare texts at the sentence level. We therefore do not need k–grams that cross sentence borders. K–grams crossing word borders do not make much sense either, so we redefine a k–gram as a sequence of k words rather than k symbols, and we consider only those k–grams that fit inside a single sentence. Sentences are considered similar when they share at least one non–widespread k–gram — by our definition, these are k–grams used by three or fewer authors.
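
To make the procedure concrete, the following sketch (in Python) illustrates the general idea of word–level, sentence–bounded winnowing as described above. It is a minimal illustration under stated assumptions rather than the software actually used in this study: the parameter values (K, WINDOW), the hash function, the crude sentence splitter, and all helper names are our own illustrative choices, and the filter that discards widespread k–grams (those used by more than three authors) is only indicated in a comment, since it requires a corpus of reviews keyed by author.

# Minimal sketch of sentence-bounded, word-level winnowing (illustrative only;
# parameter values and helper names are assumptions, not the project's code).
import hashlib
import re

K = 4        # words per k-gram (assumed value)
WINDOW = 5   # consecutive k-gram hashes per winnowing window (assumed value)

def sentences(text):
    # Very rough sentence splitter; a production system would use a real tokenizer.
    return [s.strip() for s in re.split(r'[.!?]+', text) if s.strip()]

def kgrams(sentence, k=K):
    # Word-level k-grams that never cross sentence borders.
    words = re.findall(r"[a-z']+", sentence.lower())
    return [' '.join(words[i:i + k]) for i in range(len(words) - k + 1)]

def fingerprint(text, k=K, window=WINDOW):
    # Hash every in-sentence k-gram, then keep the minimum hash in each
    # window of consecutive hashes (the winnowing selection rule).
    hashes = []
    for sent in sentences(text):
        for gram in kgrams(sent, k):
            h = int(hashlib.md5(gram.encode('utf-8')).hexdigest(), 16)
            hashes.append((h, gram))
    prints = set()
    for i in range(max(len(hashes) - window + 1, 1)):
        win = hashes[i:i + window]
        if win:
            prints.add(min(win))
    return prints

def shared_fingerprints(review_a, review_b):
    # Fingerprints two reviews have in common; a non-empty result flags
    # candidate copying. In the full procedure a widespread-k-gram filter
    # (k-grams used by more than three authors) would be applied first.
    return fingerprint(review_a) & fingerprint(review_b)

In this sketch, two reviews that share a long enough run of identical words inside a single sentence are guaranteed to select at least one common fingerprint, which is the property the winnowing scheme provides and on which the copy detection relies.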

 


Editorial history

Paper received 3 December 2005; accepted 16 February 2006.


Creative Commons License
This work is licensed under a Creative Commons Attribution–NonCommercial–ShareAlike 2.5 License.

Six degrees of reputation: The use and abuse of online review and recommendation systems by Shay David and Trevor Pinch
First Monday, Special Issue #6: Commercial applications of the Internet
http://firstmonday.org/ojs/index.php/fm/article/view/1590/1505





A Great Cities Initiative of the University of Illinois at Chicago University Library.

© First Monday, 1995-2014.