Manifesto for the Reputation Society
First Monday

Abstract
Manifesto for the Reputation Society by Hassan Masum and Yi–Cheng Zhang

Information overload, challenges of evaluating quality, and the opportunity to benefit from experiences of others have spurred the development of reputation systems. Most Internet sites which mediate between large numbers of people use some form of reputation mechanism: Slashdot, eBay, ePinions, Amazon, and Google all make use of collaborative filtering, recommender systems, or shared judgements of quality.

But we suggest the potential utility of reputation services is far greater, touching nearly every aspect of society. By leveraging our limited and local human judgement power with collective networked filtering, it is possible to promote an interconnected ecology of socially beneficial reputation systems — to restrain the baser side of human nature, while unleashing positive social changes and enabling the realization of ever higher goals.

Contents

Introduction
Search
Communicating with peers
Filtering
Trade
Culture
Risks
Ideosphere
Conclusion

++++++++++

Introduction

The emergence of reputation systems

How do we identify what is good? And how do we censure what is bad? We will argue that developing a humane reputation system ecology can provide better answers to these two general questions — restraining the baser side of human nature, while liberating the human spirit to reach for ever higher goals.

Most social interactions require matching human needs on the one hand, and quality or taste on the other hand: hunting for a reliable mechanic, looking for an interesting book, sifting through potential investments, judging the merits of proposed policies. Drawing from a distributed pool of reputations has the potential to ease the search for opportunities, ideas, friendships, cultural goods, and high–quality services; hand in hand, pressure will increase for honest behavior, competence, and fulfilling subtle human needs. At the same time, more efficient tagging of con artists, sources of spam, untrue claims, and dishonest actions can better sanction antisocial behavior, for the most part in a bottom–up "distributed court of opinion."

Sustained rapid advances in information technology have created unprecedented abilities, which come along with unprecedented dilemmas. Data has never been easier to create and move around, but to make decisions one also needs to understand its context and implications. Just as important is what lies behind its face value, in realms such as speculative bubbles (Chancellor, 2000), shady financial practices (Partnoy, 2004), and the political statements that are the topic of much of our public discourse.

Access has generally become easier, driven by falling production, communication, and search costs; however, there are consequently far more suggestions, demands, and events to consider, straining our individual information processing capacity. Leaving the computational milieu alone to evolve "naturally" is no guarantee that an ideal or even acceptable society–wide information infrastructure will emerge, as Brin (1999), Lessig (2001), Norman (1994), Schenk (1998), Stallman (2002), Walker (2003), and others have warned. To promote an interconnected ecology of socially beneficial reputation systems, conscious design, analytical modeling, and learning from past successes and failures are indispensable.

Filtering tools are still in their infancy, and having a better window to look out at the world can also imply that others can more easily peek in. Search engines speed up finding similar work done elsewhere; collaboration tools from humble mailing lists to community software to advanced groupware aid both individual and collective problem–solving. Yet it seems that for many important issues, all these tools are not keeping up with the increasing scale and number of decisions we must make, leading to what Homer–Dixon (2002) has referred to as an "ingenuity gap."

Sheer computation power is not enough: another factor of 1000 in speed, storage, or bandwidth translates to a more modest gain in making better decisions, and perhaps even a net loss in other forms of effectiveness. (Indeed, if the data or assumptions are noisy enough, the programming aphorism of "Garbage In, Garbage Out" applies.) Those who seek to use technology or the public space of ideas to advance their own agendas can also leverage increased computation power. There is indeed a technological revolution still in progress, reshaping commercial and social interactions, but its ultimate course will be affected by designers, activists, researchers and civil society at large. As Fischer (2002) said:

"Peter Drucker argued that "there is nothing so useless as doing efficiently that which should not be done at all." Adding new media and new technologies to existing practices will not change the consumer mindsets of learners and workers. We need to explore new computational media based on fundamental aspects of how we think, create, work, learn, and collaborate ... New tools should not only help people to do known cognitive tasks more easily, but they should lead to fundamental alterations in the way problems are solved."

In order for computational advances to translate to widespread social advances, the tools must confer the ability to "think smarter, not harder" and to use our collective evaluations of what is desirable to steer resources away from unproductive negative–sum games. Human brains provide an analogy: the difference between a moron and a genius lies primarily not in more or faster neurons, but in neurons that are wired together more effectively. In the same way, reputation systems are a generic tool that allows our observations, analysis, and actions to be "wired together" more efficiently.

What is special about the present time? A networked society that can easily share opinions and access reputations is making new applications possible, and new possibilities in turn generate interest in previously impractical solutions. Rising living standards have led to higher expectations. Increasing political, cultural, and personal freedom in many societies has encouraged the widespread ability to question customary choices, with the resulting ferment of millions discussing and seeking better answers being a defining part of our zeitgeist. Finally, unprecedented challenges loom: global ecological issues, technologically–amplified terrorist threats, resource depletion, and the poverty of billions. Seeing that a better world is possible pushes us to solve challenges that we once might have resigned ourselves to.

The character of reputation

A vendor haggles with a prospective customer in the bazaar, shrewdly trying to estimate a closing price: the customer’s dress, mode of speech, degree of interest, and evident knowledge of the product all play a part. Simultaneously the customer tries to estimate how much the product is worth and what options the vendor has: the vendor’s location, store layout, knowledge, and guarantees can all modify the first impression of the product itself. In this age–old bargaining game, the image of both parties always looms in the background.

A press kit claiming development of a breakthrough cancer drug is received by a journalist. At first the temptation is strong to dump it straight in the waste basket; crackpots are a dime a dozen. But then a name jumps out from the page: one of the country’s most prominent investors has endorsed the science, along with a professor from a world famous university. The journalist’s skepticism turns quickly to interest.

Browsing through the new releases at the library, a grad student notes a book somewhat related to her thesis topic. Although it’s not directly relevant, the author has been widely quoted in newspapers — incentive enough to pick it up and read it.

All these examples share a dependence on reputation: of buyers and sellers, of investors and professors, of authors and ideas themselves. When in colloquial language we speak of a person’s "good reputation", we are implicitly claiming that the person fulfills many of his or her local society’s expectations of good social behavior — typically including qualities like honesty, reliability, "good moral character", and competence. (Note particularly the last — someone might be perfectly honest, sincere, and dedicated, yet still be mistaken or mediocre.)

Reputation is context–specific. A Ph.D. degree, medical license, or award of merit is meant to certify particular abilities. When a credit agency evaluates your financial history and generates a reputation, the context is your ability to repay loans; this ability may be correlated with but is quite distinct from more general character traits. And reputation could refer to any of these more general traits, like one’s sense of humor or ability to work in a team.

Since there is no absolute objective reputation quantity stamped on people’s foreheads, measurable proxies are necessary, such as book sales rankings, citations in academic papers, Web site visits, and readership of blogs. (Not coincidentally, they have similar highly asymmetric power–law distributions. Many distributions of wealth and of readership of non-electronic resources also follow power–law distributions, a fact noted in Zipf (1949) more than half a century ago.)

Reputation is a surrogate — a partial reflection representing our "best educated guess" of the underlying true state of affairs. Active evaluation by looking behind surface signals can corroborate or disprove reputations, while indiscriminate use degrades their reliability. The challenge is to encourage active evaluation, but also to use it efficiently since it will always be in limited supply.

Emerging information tools are making it possible for people to rate each other on a variety of traits, generating what is really a whole set of reputations for each person. (Information technology is also indirectly increasing the need for such reputations, as we have to sift through more and more possibilities.) You may mentally assign a friend a bad reputation for being on time or returning borrowed items promptly, while still thinking them reliable for helping out in case of real need. No person can be reduced to a single measure of "quality."

So people will have different reputations for different contexts. But even for the same context, people will often have different reputations as assessed by different judges. None of us is omniscient — we all bring our various weaknesses, tastes, bias, and lack of insight to bear when rating each other. And people and organizations often have hidden agendas, leading to consciously distorted opinions.

Reputations are rarely formed in isolation; we influence each other’s opinions. Studying the structure of social connectivity [1] promises to reveal insights about how we interact, and thinking about simple quantities like the average number of sources consulted before an opinion is formed will help us to better filter these opinions.

Are reputations only for people? No, their scope is far wider:

  • They can be for groups of people: companies, media sources, non–governmental organizations, fraternities, political movements.
  • They are often used for inanimate objects: books, movies, music, academic papers, consumer products. Typically, whenever we talk about the "quality" of an object with some degree of subjectivity, we can also speak of its reputation, usually as assessed by multiple users — bestseller lists are a simple example.
  • Finally, ideas can have reputations. Belief systems, theories, political ideas, and policy proposals are the bedrock of public discussion. The waxing and waning of idea–reputations directly affects their likelihood of implementation, and thus the environment that we all share [2].

In the twentieth century, perhaps the two biggest changes in how reputations were formed came from the pervasive spread of mass media and from advances in information technology. For better and for worse, newspapers, radio, and television indisputably play a central role in forming reputations of people and ideas, through near ubiquitous broadcasting, advertising and branding. However, long–term effects of information technology are still very much in formation.

The Web lets us publish and actively access information — but if we wind up using the Web as just an alternative conduit for an expanded mass media, not much has changed. E–mail, chat, and future collaboration technologies are democratizing communication and discussion — but how many of our opinions on important issues are formed through discussion and how many from seemingly authoritative sources?

Reputation itself is changing. While feudal entitlements and class differences once had a fair chance of lasting for decades or even centuries, now a single major news story can make a star or break a career. And the sphere of interaction within which we need to have some reputational information is expanding, as trade, travel, and personal links go global. Computational tools — and the implicit "division of judgement" they enable — will help the same number of hours in the day go further. Just as we have greatly leveraged our natural human muscle power with mechanical energy, we will also leverage our limited and local human judgement power with collective networked filtering energy.

Why reputation systems matter

Each of us has limits: limited time, limited motivation, and limited ability to make sense of facts and observations. Brains adapted over millennia for hunter-gatherer roles are suddenly being forced to cope with the complex and frenetic rhythms of the information age. As the tasks we must solve in professional, private, and civic roles require more and more resources, we are less and less able to cope. Complex issues become oversimplified; opportunities are missed; hidden agendas and snake–oil salesmen become rampant. But most of the solutions required are not dependent on endlessly increasing the amount of data that each of us must process — such a world is a recipe for stress, general degradation of society, and progressive loss of control over our destiny.

And dealing with analytical issues is not the only problem. A whole class of difficulties arises from conflicts of interest between multiple parties — especially when distributions of power, influence, and information become overly asymmetric. Increasing social welfare has been a challenge for millennia, but now increases in scale, scope, and speed of interactions require compensating tools to keep parties honest and encourage accountability. There is as well a hierarchy of basic human needs (Maslow, 1998; Csikszentmihalyi, 1991) which could be addressed more effectively: finding friends and peers, seeking cultural and intellectual stimulation, challenging oneself.

We argue that all these issues and more could potentially benefit from the use of reputation systems, a process that is already underway and beginning to be researched; see Dellarocas and Resnick (2003), Perugini et al. (2003), and Terveen and Hill (2001) for surveys, and Masum (2002) for a previous general paper along similar lines. Reputation is a judgement of quality. It becomes more trusted to the extent it accounts for differing biases and abilities of reputation–formers, and differing tastes and needs of reputation–users. Each time we can use reputation instead of having to process and judge the underlying raw data, we save time and effort, and extend our reach and capabilities.

Reputation systems systematically combine many reputations, providing a point of access and enforcing "rules of the game." The information institutions that make these services available — formal and informal, for–profit and non–profit, private and public — will become pillars of the Reputation Society. The challenge is to first understand and then design, build and foster healthy reputation systems — to systematically benefit from the experience of others, and avoid stumbling through endless trial–and–error cycles. In a world where information institutions are often global (and can underpin critical infrastructure) the cost of avoidable failure is unacceptably high.

The process of filtering information to distill a smaller yet more refined set of usable, verified, trustworthy judgements is not easy. But it is doable. And it is both more feasible and more necessary now than ever before, due to information proliferation, technological advances, and pressing socio–economic problems. Indeed, we already see many types of reputation systems emerging, especially online:

  • Slashdot has grown to be a prime tech news site largely because of its inspired combination of open contribution and bottom–up filtering, using a modest amount of effort distributed over a large number of people — ranking the thousands of daily comments so one can choose to read just a few gems or all contributions. Similar communities are arising with different focuses, and figuring out why some fail while others succeed will teach us valuable design lessons.
  • Amazon, the online bookselling pioneer that has grown to be a juggernaut, early on made a decision to let users themselves rate each item, optionally accompanied by comments. Browsing through these ratings, suggestions, and warnings can be a gold mine of useful tips, one that is hard to replicate.
  • eBay uses reputations at the heart of its online auction system, for ranking buyer and seller honesty. Without this feedback, weeding out the bad apples who renege on deals would be far more difficult.
  • Google uses derived reputations from Web page interlinking to decide which search results are most relevant, which proved so effective that it has rapidly grown to become a global information utility. It has no "community boundaries," but extends use of reputation to the Web in its entirety.
  • BizRate and ePinions provide ratings of businesses, seeking to identify those with better product quality and customer service. Both depend on feedback from many consumers, summarizing the experiences of many and in turn influencing future purchasing decisions of consumers in a virtuous feedback loop.

All these sites and more owe a big part of their usefulness to the large–scale use of reputation: to schemes for emphasizing what is perceived to be better, as measured by the explicit and implicit contributions of millions of users. If a reputation system is honest and well–designed, information filtering using a huge pool of individuals can be more stable, reliable, and insightful than the opinions of a small group of gatekeepers or pundits. In the early twenty–first century, lower costs for search, coordination, and evaluation are making previously unthought–of applications feasible, just as happened with the Internet in the 1990s.

The goal, then, is to devise ways of reducing the gap between reputations and reality. A good reputation system will take account of real–world limitations — scarce ratings, differing tastes, people gaming the system — and still manage to create reputation signals that are close enough to reality to be useful.

 

++++++++++

Search

Popularity vs. obscurity

Information transport has developed steadily, from shipping books and brains around, to telecommunications, to the Internet and beyond. Network protocols and skyrocketing bandwidth have more or less solved the transport of raw data, but we continue to grapple with a far harder problem: finding which data is relevant. And there is a lot of data to sift through, most of which never even gets printed (Lyman and Varian, 2003). With growing interconnectedness and increasing scope of problems to solve, there is a pressing need for better tools than the occasional "best–of list" for finding high–quality resources.

Steam engines heralded the Industrial Revolution more than two centuries ago. A diversity of power devices took over most of their functionality, but all still obey the same thermodynamic laws. Today, search engines play a similar role as universal information–powering devices, and merit special attention. It took many decades of trial and error and incremental evolution before reliable engineering solutions developed for steam engines, and even longer for theoretical understanding of efficiency limits and thermodynamics. Search in particular and Reputation in general await a similar theoretical science. The future may find other institutions or mechanisms to handle information matching, but the challenges posed to search engines today will survive the specific engines themselves.

It must be possible to find reputational information on a category of interest easily: even slight increases in the effort required can reduce usage of reputations. An illustrative example from early 2004 was the European Commission’s antitrust probe into Microsoft. One key claim that led to the imposition of penalties was that Microsoft’s tying of Windows Media Player to its Windows operating system made it difficult for alternative programs to compete on merit (EU, 2004). The implication is that even the few extra minutes of search effort needed to find alternatives were in practice enough to dissuade many users.

This suggests a fundamental dichotomy between searching for static, long–lasting objects like books and secure databases (which can safely remain in storage for decades, ready to be picked up again if someone becomes interested) and more ephemeral talents, ideas, or organizations which may wither away for lack of interest if no one seeks them out. Like flowers, many ventures need to bask in the light of human involvement to survive.

Similarly, there is a difference between predefined knowledge that comes in neatly labeled packages, and searching for more organic or abstract knowledge which may need to be pieced together from various sources. Finding a book with a known author or title is easy; much harder is finding works about an interdisciplinary or hard–to–define area, or judging quality and relevance. Those ideas which are hard to find naturally suffer a judgmental penalty; it is difficult to rate an item that is hard to find. Both ephemeral and abstract objects are also more at risk for being actively manipulated by interested parties, since they can more easily decay or be distorted.

It is instructive to start by looking at centuries of experience in information retrieval, which provide a rich base of time–tested tools for "finding out about" (Baeza–Yates and Ribeiro–Neto, 1999). The indices used to arrange books in libraries today are generally topical and hierarchical. For example, science books start with "Q" in the Library of Congress classification system, but "QA75" and "QA76" are reserved for "calculating machines," an archaic category which now includes computer science and software. When computer science first arose, defining it as a rather esoteric subfield of mathematics (which occupies the rest of the "QA" section) made sense. However, in the early twenty–first century, the holdings under these two numbers are probably larger and more rapidly growing than those under anything else between QA1 and QA999. Once a cataloging system is fixed and widely adopted, changing it is a major undertaking, and the categories used affect an idea’s "default reputation."

The simple Boolean method, used in older library systems, considers a document as a list of keywords. It searches indexed documents with queries composed of keywords along with the operators AND, OR, and NOT — for example, "(climate AND change) NOT warming" could give climate change information other than global warming. While both simple and fast, this method often returns too few or too many documents, and doesn’t rank search results. This can amplify the popularity of a few well–known references, as searchers who are pressed for time settle for what is better–known or what comes up first with the most obvious keywords.
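The Boolean method described above can be sketched in a few lines. This is a minimal illustration, not a description of any particular library system: the three-document collection and the helper name build_index are invented for the example, and documents are reduced to bare lowercase keywords.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase keyword to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

# Hypothetical collection, invented for illustration.
docs = {
    1: "climate change and global warming trends",
    2: "climate change effects on ocean currents",
    3: "regional climate records",
}
index = build_index(docs)

# "(climate AND change) NOT warming": AND is set intersection, NOT is set difference.
matches = (index["climate"] & index["change"]) - index["warming"]
print(sorted(matches))  # [2]
```

The result is an unordered set: a document either matches or it does not, which is why this method cannot rank its results.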

Adding consideration of keyword frequency — both the number of times each occurs within a document, and the relative frequency between different documents — gives the better-performing vector space model [3]. More sophisticated approaches can be built on these ideas, many of which are used in current search engines; for example, keywords occurring in titles or other prime locations are usually a sign of high relevance. Since queries like "information overload" and "data smog" probably refer to similar ideas, keywords can be mapped to more general categories, making it easier for once separate subdisciplines to recombine and share ideas on what is important.
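One common way to turn these two kinds of keyword frequency into a ranking is to weight each term by its count within a document times the logarithm of its rarity across documents, and then compare documents to the query by the cosine of the angle between their weight vectors. The sketch below assumes that particular weighting; real search engines use many variants. The sample documents and the function names are hypothetical.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Return a {term: weight} dict per document, plus the idf table.

    Weight = (count of term in document) * log(N / number of documents containing term).
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {term: math.log(n_docs / df) for term, df in doc_freq.items()}
    vectors = [{t: c * idf[t] for t, c in Counter(tokens).items()}
               for tokens in tokenized]
    return vectors, idf

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def rank(query, docs):
    """Return (score, document index) pairs, best match first."""
    vectors, idf = tfidf_vectors(docs)
    q_vec = {t: c * idf.get(t, 0.0)
             for t, c in Counter(query.lower().split()).items()}
    return sorted(((cosine(q_vec, vec), i) for i, vec in enumerate(vectors)),
                  reverse=True)

# Hypothetical documents, invented for illustration.
docs = [
    "information overload in email",
    "data smog and information overload overload",
    "steam engines and thermodynamics",
]
for score, i in rank("information overload", docs):
    print(round(score, 3), docs[i])
```

Unlike the Boolean method, every document receives a graded score, so results can be ordered; a document sharing no query terms scores zero.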

Search engines and link–based reputation

Let us turn now to the emergence of search engines, and the radical decrease they bring in certain search costs. When the Web first arose, there were few roadmaps — each user decided what to put up and who else to link to in a vast bottom–up creative construction. But navigating through a land with no signposts was difficult.

The physical routing problem — going from a URL or IP address to the corresponding computer, located anywhere — had been solved so well that it operated behind the scenes most of the time. However, the separate pieces of the Web built on top of this physical layer lacked connective tissue — the paths, roads, and maps that would allow users to find what they were looking for, and to know what there was to be found. And this in turn meant that information on which resources were good could not spread easily.

Many search–assisting sites arose. Yahoo! and the Open Directory Project were human–edited directories of links with hierarchical topic directories. But like a spinning loom factory in the dawn of the Industrial Revolution, they required too much manual attention, covering an ever smaller percentage of sites as the Web grew. AltaVista remedied this problem by trying to crawl all reachable sites to form a giant automated index; while useful for targeted search, the numerous results for more general queries rendered it difficult to use as an exploratory tool. An unmet demand for sorting search results by relevance and quality steadily built up, and when Google supplied an easy–to–use solution, it grew rapidly to become a global information utility.

Consider an arbitrary Web page: how can we estimate its quality? A breakthrough in increasing search result relevance came from the development and implementation of PageRank, as outlined in Brin and Page (1998). The basic idea behind PageRank is to estimate the reputation of a page using both the number and the quality of the other pages that link to it [4], building on insights from citation analysis and information science.

The crucial feature of all this link information is that it summarizes a huge number of independent choices. The creator of each Web page has a choice of which other Web pages to link to, and normally one would choose to link to higher–quality pages — so every incoming link to the page in question is an implicit vote of confidence. In a complementary way, suppose the page in question links to a lot of other high–quality pages — these are probably useful to a reader, and should hence also increase its score.
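The "incoming links as votes" idea can be sketched as a simple iterative calculation in which each page repeatedly passes its current score, split evenly, to the pages it links to. This is a simplified illustration in the spirit of Brin and Page (1998), not Google's production algorithm. The four-page link graph is invented, the damping value of 0.85 and the iteration count are assumptions of this sketch, and only incoming links are scored (the complementary credit for linking out to good pages is not modeled).

```python
def pagerank(links, damping=0.85, iterations=100):
    """links: {page: [pages it links to]}. Returns {page: score}; scores sum to 1."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline score regardless of links.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its vote over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank

# Hypothetical link graph: a, b, and d all link to c; c links back to a.
links = {"a": ["c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(links)
for page in sorted(scores, key=scores.get, reverse=True):
    print(page, round(scores[page], 3))
```

Page c comes out on top because it collects the most votes, and page a outranks b and d even though each has at most one incoming link, because a's single vote comes from the highly ranked c: quality of the linking page matters as well as quantity.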

The beauty of this type of scheme is that it implicitly leverages the writers of all publicly accessible Web pages — their collective selecting (Park, 2002), sifting, and evaluating is observed via the resulting link information. The filtered opinions of millions create Web page reputations that have helped make Google a global icon — and that affect how the page popularity distribution and link structure of the Web itself evolves over time.

Do link–based heuristics always work? Certainly not. All such heuristics are just approximations — a page could be good, yet still not satisfy some or all of the criteria. For example, the basic algorithm above will assign a low rank to a page that is high–quality but too new to have gathered many links yet, or one that is thoughtful but too difficult to be appreciated by most Web users. Along with other popularity–based methods, it also suffers from the "preferential attachment" problem: since Web pages that become popular will be returned first in the list of search results, they will tend to become even more popular, independently of their underlying quality. More sophisticated search algorithms are continually under development (Roush, 2004).
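The "preferential attachment" effect can be illustrated with a toy simulation in which all pages are assumed to be of identical quality, and each new visit goes to a page with probability proportional to the visits it has already received. The model, the parameter values, and the function name are assumptions made for this sketch; it is not a model of any real search engine's traffic.

```python
import random

def simulate_visits(n_pages=20, n_visits=5000, seed=0):
    """Rich-get-richer toy model: each visit favors already-popular pages."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    visits = [1] * n_pages     # every page starts equal, with one visit
    for _ in range(n_visits):
        # Pick a page with probability proportional to its current visit count.
        page = rng.choices(range(n_pages), weights=visits, k=1)[0]
        visits[page] += 1
    return visits

visits = simulate_visits()
print("most visited page:", max(visits))
print("least visited page:", min(visits))
```

Although no page is better than any other, early random luck is amplified and the final visit counts end up uneven, which is the sense in which popularity can grow independently of underlying quality.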

Similar search dynamics play a part in many other information resources, from academic citations to searching for new music to the market for books. While library catalogs are still as useful as ever, in 2004 Amazon.com is one of the most often used resources for purposes of book search — to look up bibliographic information, to search for books that are related to a given book of interest, or to browse books in a new field. User rankings and the way that Amazon orders query results influence which items people are likelier to buy.

Search engines also use other heuristics to try to figure out which sites are better in quality, and which sites best match a user’s query. One can consider factors like where the search terms occur in the document, how long the resource has been available, what format the resource is in, where it’s from, and so on. In fact, it’s common to use many of these same heuristics when searching in a library or bookstore for information about a new subject — from a glance or quick flip through the book, we use the typeface, layout, size, tone, publishing company, and many other more subliminal factors to estimate quality.

Masquerading and coevolutionary search

While academics have used citation tricks like cross–referencing cliques for decades, the commercial Internet has triggered a larger shift from static information "sitting there waiting to be noticed" to a co–evolutionary race between ways of finding relevant information and ways of getting noticed by the finders. This is part of a general pattern: when others are seeking positive search results to reward, we naturally emphasize ourselves and discount others; conversely, when others seek negative search results like blacklisted businesses to stay away from, we naturally try to stay off their radar. Like a masquerade ball, we seek to present the best appearance, while trying to imagine what might lie behind the masks of others.

Frequently people modify their pages in a deliberate attempt to improve search engine rankings. (A number of "hacks" are possible, as discussed in Calishain and Dornfest (2003); though largely about ways of using Google more cleverly, the information can also be applied to try to skew one’s ranking.) The possibility of inflating rankings has led to an arms race where Google updates its algorithms to counter abuses, another weakness is found which Google again combats, and so forth. Yet despite all these flaws, search engines are empirically quite useful, and for many purposes work well enough to meet users’ needs.

Clearly, being a gateway to the Web provides an opportunity for influencing what gets found and what gets ignored. The most obvious temptation is to skew search results based on payment, so that those Web sites that pay more appear earlier than they otherwise would. (This has been tried by some search engines.) Making a site just modestly more difficult or easier to find can have a major impact on its popularity, since people usually stop looking after relatively few results. The challenge for Google and other top engines is thus to keep answering the question, "Who evaluates the evaluators?", with the same answer: "Everybody."

As Simon (1996) pointed out, we usually "satisfice" instead of optimizing difficult problems due to limited time and insight, settling for a solution that is good enough. The "findability" of information biases its perceived quality — studies such as Lawrence (2001) suggest that papers available online are several times more likely to be cited than those offline (though there may be a self–selection effect where authors publicize only their better papers).

The search applications we have seen so far are only the tip of the iceberg. Classical information retrieval and many Web search tools deal with finding exact matches to specified queries. The better search engines modestly generalize exact matching, by also considering limited forms of approximate matching and quality indicators. Personalizable search engines of the future will find items satisfying more general properties:

  • A business partner, with liquidity, skills, and the right personality.
  • A list of opportunities matching your interests and abilities — charitable, professional, or personal.
  • An employee or graduate student, with talent, insight, and a strong work ethic.
  • A game partner, with matching ability level and tastes.
  • News items, whose topic is most important and interesting relative to your point of view, and whose author is most authoritative and clear.
  • Representatives of opposing points of view (Gerhart, 2004).
  • An expert advisor, with a history of solving tough problems and a list of satisfied clients.
  • The most urgent and ignored problems in a given field.

A big part of transaction costs in personal, commercial, and civic life is finding the right challenge to tackle, or the right partners. Indeed, in Zhang (2001) we suggest that the first part of the Internet Age was about people connecting computers, while the second (still just beginning) is about computers connecting people — and the ever more powerful computing resources pouring into society can be productively harnessed toward matching people together to their mutual benefit. "Search" in the broader sense of the term eases a whole range of difficulties which collectively sow confusion in the world.

Finding high quality options requires active search. If we’re content to take whatever is given to us, the other party has no incentive to improve. If we’re content to stick with the status quo and not search for alternatives, new products and ways of doing business will have a difficult time getting started. And if we’re content to accept any explanation given without questioning too hard, those who have power will always be tempted to make reassuring noises instead of doing the hard work of living up to what they say.

 

++++++++++

Communicating with peers

Virtual connections

What were the public places where people once formed reputations? Taverns, town squares, bazaars, and places of worship. Now, the Internet "idea bazaar" is creating many oases of discussion, and a few of real community.

From our point of view as observers and developers of reputation mechanisms, a key connecting thread is the varied solutions developed to the problem of raising discourse quality. Another thread is motivation for contribution; common rewards include peer esteem, making social connections, and the natural pleasure of helping others. To the degree that a conversational community has stable and important reputations for individuals (e.g., a relatively intimate mailing list), or has a reputation for the community as a whole by helping to create a public good (e.g., Wikipedia), there is more flexibility as reputation and other motivations substitute for direct reciprocity.

Mailing lists and newsgroups formed some of the earliest online communities, and are still quite active; their history and social functions are described in Rheingold (2000). Those who organized such communities had to solve many problems which characteristically arose past a certain size. Flaming is the term given to rude, overly emotional, or excessively argumentative replies to a posting; flamers may be socially shunned, or each reader may individually choose to place them in a "bozo filter" so that the reader is automatically shielded from their future postings. Newcomers to the list may ask questions which have previously been discussed ad nauseam; for this, the FAQ (Frequently Asked Questions) was developed, and is frequently a repository of high–quality information about the group’s topic.

Choosing to make a list or newsgroup moderated is a general strategy that combats all these issues: if every post must be approved, then off–topic, flame, and spam posts are much less likely to appear. However, moderation introduces its own problems, such as time (a busy list may require a lot of supervision from moderators), and taste (moderators have their own preferences and agendas which may differ from the rest of the group). Another type of moderation is by restricting access to the community as a whole, perhaps by requiring a recommendation from an existing group member or by voting on new admissions.

In a pattern that recurs in most cohesive communities, local experts often arise in newsgroups — people who because of their knowledge, eloquence, or wisdom come to be respected for the value of their postings. Thus there arises an informal but nonetheless real degree of positive reputation for those who have the capability and energy to contribute high–quality material. Modern conversational software harkens back to hunter–gatherer days when tribal leaders could arise rapidly through wit, cunning, and brute strength — as in the earliest times, most of us come into online communities "naked", and prove ourselves in a relatively level playing field.

Chat provides a more real–time discussion format. Chatting with a small group of friends or co–workers can be a productive experience. At the other extreme is participation in a large, open chat room, where drivel seems to scroll unendingly on the monitor. Usually, taking part in a restricted access conversation increases the potential level of trust, especially if the identities of the participants are known.

The reputation of a chat venue depends partly on its stability; if identities are persistent, if it is difficult for outsiders to break up good conversations with inane or off–topic remarks, if there is quality control on those entering, then (just as with real–life communities) the chat venue will come to be seen as a worthwhile place. Of course, a chicken–and–egg problem exists: it’s hard to attract good people without a reputation for high–quality conversation, but such a reputation is difficult to form without good people. Worthwhile communities are emergent phenomena.

SMS and instant messaging services form an interesting special case of chatting, usually being terse yet accessible anywhere. Although it’s hard to see deep discussions taking place in this medium, their ubiquitous availability could alter the reputation of a person or event in real time, or even that of a government if the sparks catch fire, as discussed in the case of the Philippines and elsewhere in Rheingold (2003). One could speculate on the potential effectiveness of a large group of connected people, with strong motivations (such as in a conflict, celebration, or natural disaster), all of whom are using some collaborative protocol that enables rapid decision–making for a crowd. These tactics have been seen in protest movements, and are reminiscent of swarm models (Bonabeau, et al., 1999).

Blogs (an abbreviation of "Web log") are structured forms of Web pages, which usually feature some combination of links to other bloggers, links to stories or items with personal commentary, feedback from readers, and a diary–like format. This combination of features gives them a story–telling feel, and perhaps makes it easier for readers to assess the personality of the person behind the blog; in turn, the implicit reputation formed by one’s blog is a valuable tool for making new and talented acquaintances. (See Rodzvilla (2002) for a collection of blog perspectives.)

Tools have been developed to derive ratings, with a simple method inferring higher ratings for items referred to more often. Due to their relatively lightweight nature — many blog postings consist of just a paragraph and a link or two — blogs often operate at a faster time scale than Web pages, and so the derived ratings from who–links–to–whom information can provide near real–time rankings of breaking news. This in turn can feed back to the community and focus attention of other bloggers; a blog is only as powerful as its reader base. Those blogs that specialize in current events and analysis can be seen as complementary to mainstream media, with a smaller but more focused audience.
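The simple method described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input format, the sample data, and the function name `rank_by_inbound_links` are hypothetical, and real tools would also have to deal with time windows, self–links, and deliberate link inflation.

```python
from collections import Counter

def rank_by_inbound_links(posts):
    """Rank linked items by how many distinct blogs refer to them.

    `posts` is a list of (blog_name, [linked_items]) pairs (an assumed format).
    Each blog counts at most once per item, so a single blog cannot raise
    a rating just by repeating the same link.
    """
    seen = set()        # (blog, item) pairs that were already counted
    counts = Counter()
    for blog, links in posts:
        for item in links:
            if (blog, item) not in seen:
                seen.add((blog, item))
                counts[item] += 1
    # Most-referenced items first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical sample data: who links to what.
posts = [
    ("alice", ["story-a", "story-b"]),
    ("bob", ["story-a"]),
    ("carol", ["story-a", "story-b", "story-c"]),
    ("bob", ["story-a"]),   # a repeated link from the same blog is ignored
]
ranking = rank_by_inbound_links(posts)
```

Rerunning such a count over only the most recent postings is one way the near real–time rankings mentioned above could be approximated.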

Enabling technologies for blogs allow easy syndication of articles, so that one can easily track new stories from a personally chosen selection of dozens or hundreds of other sites (Rittenbruch, et al., 2003). This "self–amplifying Web of respect" may allow for more effective group discourse, though as Blood (2003) discusses, there is a danger of becoming trapped in "echo chambers" of like–minded discussion partners. In fact, if methods can be found to improve their filtering and to consolidate conversational threads across multiple blogs into contextualized narratives, the structure of blogs as public discourse scaffolds may well make them powerful building blocks for honing large–scale ideas.

Scaling up: Conversations among millions

No discussion of online discourse would be complete without looking at Slashdot. This bottom–up news site, self–described as "News for Nerds — Stuff that matters," showed how hundreds of thousands of users could contribute toward readable commentary on technological and science developments. The operation of the site is straightforward: anyone can suggest a news item to be featured on the front page, but only a small number of suggestions are chosen by the site administrators. Then, all users can comment on the news story, or on each other’s comments.

Due to the large number of visitors, stories routinely receive hundreds of comments. Clearly these are not all of the same quality, so how can some ranking be done? Asking administrators to read and rate all comments, for all stories, is i) not feasible due to the large volume, and ii) not desirable since for any comment there are probably many readers who know more about that comment than an administrator. So the Slashdot administrators innovated out of necessity and set up peer moderation of comments: users themselves vote on the quality of each comment. The resulting load distribution of comment evaluation is effective enough that the comment ranking is reasonably correlated with quality.

From the Slashdot FAQ:

"Concentrate more on promoting than on demoting. The real goal here is to find the juicy good stuff and let others read it. Do not promote personal agendas. Do not let your opinions factor in. Try to be impartial about this. Simply disagreeing with a comment is not a valid reason to mark it down. Likewise, agreeing with a comment is not a valid reason to mark it up. The goal here is to share ideas. To sift through the haystack and find needles.

... Metamoderation is a second layer of moderation. It seeks to address the issue of unfair moderators by letting "metamoderators" (any logged–in Slashdotter) "rate the rating" of ten randomly selected comment posts. The metamoderator decides if the moderator’s rating was fair, unfair, or neither ... ."

This combination of open contribution and peer moderation is a natural way to deal with scaling, and with several refinements has proven to be a practical way of floating good commentary to the top. In particular, each user has a reputation called "karma", which is built up over time from activities like moderating comments and posting comments which get high scores from others; high–karma users get to moderate and meta–moderate others. Many other sites have arisen based on Slash, the same open source software that powers Slashdot. A number of variations based on different rules exist, such as kuro5hin; though it suffers from a lack of metamoderation, it uses voting to decide which stories and essays should be posted as "front–page items" (a function performed editorially in Slashdot).
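To make the mechanics concrete, here is a minimal Python sketch of peer moderation with karma. It is not Slashdot's actual implementation: the score bounds, the starting score, the karma rule, and the eligibility cutoff are all assumptions chosen for illustration, and metamoderation is left out.

```python
from dataclasses import dataclass

MIN_SCORE, MAX_SCORE = -1, 5    # assumed score bounds for this sketch

@dataclass
class Comment:
    author: str
    text: str
    score: int = 1              # assumed starting score

def moderate(comment, delta, karma):
    """Apply one moderator vote (+1 or -1) and update the author's karma."""
    if delta not in (1, -1):
        raise ValueError("a moderation is a single point up or down")
    new_score = max(MIN_SCORE, min(MAX_SCORE, comment.score + delta))
    # Karma moves only by the amount the score actually changed.
    karma[comment.author] = karma.get(comment.author, 0) + (new_score - comment.score)
    comment.score = new_score

def visible(comments, threshold):
    """Comments at or above the reader's chosen threshold, best first."""
    kept = [c for c in comments if c.score >= threshold]
    return sorted(kept, key=lambda c: -c.score)

def eligible_moderators(karma, min_karma=2):
    """Users whose accumulated karma earns them moderation rights (assumed rule)."""
    return sorted(user for user, k in karma.items() if k >= min_karma)

# Hypothetical discussion thread.
comments = [
    Comment("ann", "Insightful analysis"),
    Comment("bo", "First post!"),
    Comment("cy", "Useful link"),
]
karma = {}
for _ in range(2):
    moderate(comments[0], +1, karma)
for _ in range(3):
    moderate(comments[1], -1, karma)   # the third vote is clamped at MIN_SCORE

shown = [c.text for c in visible(comments, threshold=1)]
moderators = eligible_moderators(karma)
```

The design choice worth noting is that readers, not administrators, pick the threshold: the same pool of ratings can yield a terse or an exhaustive view of a discussion.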

Lampe and Resnick (forthcoming) have empirically examined Slashdot postings and moderations, and found several interesting patterns. Median time between posting of a story and accumulation of the first 50 percent of commentary is approximately three hours, and for the first 90 percent about 18 hours, so discussions happen quickly; those comments posted later tend to receive fewer ratings since many people will only read the comments that have already been rated highly. Of the comments that were moderated, only 15 percent received both positive and negative moderations (indicating disagreement among moderators), and only eight percent of metamoderations disagreed with the moderations they evaluated. The authors suggest that the system could be improved by highlighting comments needing additional moderator attention, which would distribute the raters’ attention more efficiently.

One common feature of discourse methods is that, to a greater or lesser extent, they generate a sense of community. In fact, online communities are a testbed for viewing the formation of social capital in action. Some "rules of the game" that have been suggested to make such communities work better include gradual access to full participation, a cost to exiting, penalties for those who transgress community norms, and persistent reputations associated with participants (Rheingold, 2000). Tolerating diversity of opinion while maintaining focus is a delicate balance.

Communities online include places for serious discussion and massively multiplayer games, places to brainstorm new ideas and program together, places for fans and for customers — such a rich variety of environments is both a social laboratory and a source of lessons for designers of reputation systems. These environments are also good places to see how the definition of an individual’s reputation influences the character of the community as a whole, which in turn feeds back and affects what individuals perceive as being a "good reputation" in that context.

In physical communities, gossip can be an effective reputation–spreading mechanism for adhering to community or social norms; carried too far, gossip risks enforcing social conformity (Durkheim, 1998). Knowing what the people in your community are up to, and feeling you can trust them, is an integral component of "social capital" (Putnam, 2002) — the interpersonal reserves of society that bind together and support people, and give them a sense of community. In high–trust societies, stories are told of farmers who leave baskets of cherries by the roadside, next to a box of money; passers–by are told to help themselves and leave an appropriate sum. By contrast, citizens of low–trust societies tend to be somewhat paranoid, often viewing a new social interaction with a stranger as a potential threat; everyone looks after their narrow interests out of necessity, leading to a more localized horizon of trust, and to neglected and degraded public goods.

Similar dynamics hold true for virtual communities. Developing a high–trust virtual community is no easy task, due to factors like anonymity, lack of inhibition online, lack of multiple channels of interaction, and splitting of contributor and caretaker effort across different virtual communities. Moderators of online communities have learned to limit the potential damage any participant can cause. But one must also proactively encourage higher levels of discourse quality, through techniques like peer moderation and bottom–up choices of whom to read and cite.

Let’s close this section by moving back to the biggest scale of discourse. A wiki is a fascinating type of medium: essentially a large Web site where anyone can edit the content. To guard against malicious or clueless people destroying previously created content, all changes are logged and one can easily roll back to previous versions.

Remarkably, this simple idea has managed to attract large numbers of participants and generate some high–quality results, with the best known example being Wikipedia — an open source encyclopedia formed incrementally from millions of separate improvements. This prototypical form of large–scale discourse has lessons for future efforts to combine and channel the efforts of thousands or millions toward common goals.

Wikipedia generated worthwhile results largely because the process for contributing was simple and contributions were "separable" (i.e., each entry of the encyclopedia could be modified relatively independently of other entries). For future efforts where this is not the case, coordinating contributions and resolving conflicting points of view will be a large challenge.

An item of debate within the Wikipedia community is the degree to which contributors should acquire some form of reputation, which might then be used to make their contributions to the encyclopedia harder to modify. Letting reputation of contributors emerge in a transparent manner will reward higher–quality contributions, and may provide a partial answer to coordination problems if those who make good contributions receive some proportionate ability to decide conflicts. However, the contrary point of view argues that it is the very openness of Wikipedia that made it a success. One suggestion that balances both points of view is to keep the full Wikipedia open, but to use a reputation system to highlight entries that will be periodically copied into an unmodifiable backup; more ideas can be found in the online discussion of a Wikipedia approval mechanism (WikiApproval, 2004).

The Internet itself is the largest example of discourse in action. From basic hyperlinks to all of the above discussed methods and more, Internet users and open source developers are engaged in a myriad of communication channels, on time scales ranging from seconds to years. Just as with the time–honored traditions of books and academic articles which form a constructive discourse spanning generations, readers of a Web page or blog can easily feel in direct contact with other minds, and reshape their ideas into new messages which in turn affect others — even continents or centuries away. But such a rich medium can benefit from gardening and guidelines. In order to more easily find like minds, to effectively select contributions which are better written and thought out, and to combine the separate efforts of millions toward common achievements, reputation mechanisms are a vital component.

 

++++++++++

Filtering

Fingers in the dike

An increasingly important function of reputation will be to control who and what can get our attention. As filters are initially applied to low–grade information, measuring progress is easy — few indeed are those who would welcome more spam or junk mail. But once the easy barriers are put up, demands competing for our attention get harder to rate, from advertising to charitable requests to greetings from new and potentially interesting people.

Good ideas can be drowned in an ocean of mediocrity, as too much material becomes a smokescreen for quality. This is not just an issue of wasted time: many have argued that having too much tempting but irrelevant information reduces our individual and society–wide capacity to function effectively (see for instance Schenk (1998), Postman (1986), and your television). What can be done?

Since many situations of information overload share similar structural features (cheap dissemination costs on the supply side, difficulty of separating desirable from unwanted information on the receiver side), similar solutions may also work. Wherever too many offers are being supplied, it will be helpful to highlight useful incoming information, share experiences on what is useful with friends and larger communities, and increase the cost to those who intrusively send unwanted information. Let’s look at several specific examples, ordered from least to most desired by the user.

Society may be polarized in many respects, but one issue that crosses boundaries is the nuisance factor of junk mail and telemarketers, and especially spam. Why is there so much of it?

First, there is little disincentive for producing spam. Legal remedies are slowly being put into place, but are still ineffective in many jurisdictions. Costs for sending spam are low, and remarkably spam does have a non–zero rate of success — so spam can pay off financially. Second, the lack of easy traceability means that negative reputational consequences do not accrue to most individuals behind spam. (It is worth noting that few large or reputable companies resort to spam, because of the potential reputational backlash.) Finally, costs in time and annoyance for the vast majority of recipients who do not want to receive spam are difficult to charge back to the sender.

Given this huge problem, what are some solutions to block spam? Currently, one of the most effective is the Bayesian approach (Graham, 2003) which essentially looks for telltale phrases or strings by which to differentiate spam from non–spam [5]. Empirically, this seems to work well for most spam, with few false positives. (Similar adaptive approaches have been used for filtering news, Usenet messages, and other domains, though spam may be the testbed with lowest signal–to–noise ratio.)
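As an illustration of the general idea, the following Python sketch implements a textbook naive Bayes classifier with add–one smoothing. It is not necessarily the exact method of Graham (2003); the class name, the whitespace tokenizer, and the tiny training set are assumptions made for the example.

```python
import math
from collections import Counter

def tokenize(text):
    """Split a message into lowercase words (a deliberately crude tokenizer)."""
    return text.lower().split()

class NaiveBayesFilter:
    """Word-count naive Bayes; needs at least one training message per class."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.message_counts[label] += 1
        self.word_counts[label].update(tokenize(text))

    def spam_probability(self, text):
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            total_words = sum(self.word_counts[label].values())
            # Start from the log prior, then add one log likelihood per word.
            score = math.log(self.message_counts[label] / total_messages)
            for word in tokenize(text):
                count = self.word_counts[label][word]
                # Add-one smoothing keeps unseen words from zeroing the product.
                score += math.log((count + 1) / (total_words + len(vocab)))
            log_score[label] = score
        # Turn the two log scores into P(spam | text).
        return 1.0 / (1.0 + math.exp(log_score["ham"] - log_score["spam"]))

# Hypothetical training data.
spam_filter = NaiveBayesFilter()
spam_filter.train("win free money now", "spam")
spam_filter.train("free prize claim now", "spam")
spam_filter.train("meeting notes attached", "ham")
spam_filter.train("lunch at noon tomorrow", "ham")

p_spam = spam_filter.spam_probability("claim your free money")
p_ham = spam_filter.spam_probability("notes from the meeting")
```

Because each user trains the filter on their own mail, the telltale words are personal: a word that signals spam for one recipient may be routine for another.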

Many other approaches use various forms of reputational filtering. Known spammers or spam sites can be placed on a blacklist, and the blacklist shared between legitimate servers; incoming mail from a blacklisted site can either be sent to a low–priority folder or blocked entirely. Several organizations monitor overall traffic patterns, with probes, fake decoy e–mails, and firewall monitors. There have been proposals to implement authentication and traceability in a widespread fashion, so that a real identity could reliably be linked to each e–mail. And one long–standing proposal for increasing the costs of spam is to add a low cost to each e–mail sent, by requiring the sender to perform some computational task or place a small amount in escrow before the receiver’s e–mail software accepts the inbound message. These tasks would be inexpensive enough not to deter legitimate senders, but would make the indiscriminate spamming of thousands or millions difficult.
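Two of these ideas, a shared blacklist and a small computational cost per message, can be sketched together in Python. This is a hashcash–style illustration rather than any deployed protocol: the stamp format, the difficulty setting, and the function names are assumptions, and a real scheme would also need to prevent reuse of old stamps.

```python
import hashlib
from itertools import count

DIFFICULTY_BITS = 12   # assumed; each extra bit doubles the sender's expected work

def has_leading_zero_bits(digest, bits):
    """True if the first `bits` bits of the hash digest are all zero."""
    return int.from_bytes(digest, "big") >> (len(digest) * 8 - bits) == 0

def mint_stamp(recipient, message_id, bits=DIFFICULTY_BITS):
    """Sender side: try nonces until the stamp's hash is rare enough."""
    for nonce in count():
        stamp = f"{recipient}:{message_id}:{nonce}"
        if has_leading_zero_bits(hashlib.sha256(stamp.encode()).digest(), bits):
            return stamp

def accept(sender_domain, recipient, stamp, blacklist, bits=DIFFICULTY_BITS):
    """Receiver side: check the blacklist, then verify the stamp with one hash."""
    if sender_domain in blacklist:
        return False
    if not stamp.startswith(f"{recipient}:"):
        return False   # the stamp was minted for a different recipient
    return has_leading_zero_bits(hashlib.sha256(stamp.encode()).digest(), bits)

# Hypothetical addresses and blacklist.
blacklist = {"spam.example"}
stamp = mint_stamp("bob@example.org", "msg-1")

ok = accept("example.com", "bob@example.org", stamp, blacklist)
blocked = accept("spam.example", "bob@example.org", stamp, blacklist)
wrong_recipient = accept("example.com", "carol@example.org", stamp, blacklist)
```

The asymmetry is the point: minting a stamp takes thousands of hash attempts on average at this setting, while checking one takes a single hash, so the cost falls on bulk senders rather than on receivers.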

Note the underlying features in spam which are common to other situations of information bombardment. The sender incurs minimal financial and reputational cost when sending spam, with the expectation of a reward higher than the costs. The receivers collectively incur a high cost from spam, but can’t impose these costs on the sender. Reputation — of senders of e–mail, or of an e–mail itself — provides a method on the receiver’s end to differentiate spam from non–spam effectively.

Telemarketers and spam provide the "worst–case scenario" — a similar but less severe situation exists with regard to advertising. Companies spend large amounts convincing customers to buy their products and services, a fraction of which is enough to supply major portions of operating revenue (and hence influence) for commercial television, radio, and newspapers. Political campaigns often rely on expensive avenues to force–feed their message to the public. Are all these advertising dollars generating ideal outcomes for advertisers and citizens?

From the point of view of the seller, the goal is to spend money in order to increase the reputation of the product, which will in turn induce greater sales. So advertising is only a means to an end — if an alternative method existed to increase the reputation of the product, it would also serve the seller’s purpose. And vendors of products that provide better value for money would benefit from being able to differentiate themselves from competitors, without having to engage in an arms race of spending to catch the consumer’s attention. Many companies do advertise through providing useful content, through rating channels like trade magazines, or through "infomediaries" which match needs of buyers with abilities of sellers (particularly at the business–to–business level) — but this process still has much room for improvement.

From the point of view of the consumer, advertising is a signal of product quality — but it is a noisy and inaccurate signal. And most signals for products that one is not interested in buying are visual and aural pollution, wasting one’s time and attention. So the consumer too would benefit from a signaling method which would more effectively indicate true quality.

Methods similar to but more complex than those for spam have the potential to provide an effective answer. If an intermediary can be motivated to provide competent assessment of product quality, consumers will have a tool to help select between product alternatives. If this third–party strategy does not function effectively due to corruption or insufficient incentives to provide inspection services, a bottom–up strategy may be feasible, in which consumers themselves pool their experiences. Through an Internet–based intermediary like epinions, consumers can easily enter their experiences and receive information, advice, or statistical data based on experiences of others. Challenges include providing such a service in a trustworthy and privacy–respecting manner, deriving effective advice from raw user experience data, and motivating service provision and use.

When quality comes to be more in demand, investing in quality becomes more feasible — marking a shift from image–based to substance–based advertising. Indeed, it will become possible to incorporate reputations for more general properties than quality — a possibility that the fair trade certification process illustrates. It should become easier to find goods and opportunities that are ranked by others as being environmentally friendly, mentally stimulating, possessing less restrictive copyrights, and so forth.

Consumer boycotts or pro–buying campaigns based on company ethics impart pressure to enhance a different kind of quality, as does increasing the cost to advertise to those who would rather be left alone. Such strategies can provide a win–win proposition: reducing the consumer’s time wasted, channeling the seller’s advertising budget to avenues where ads are appreciated, and redirecting effort toward improving underlying quality as it becomes more transparent. From a society–wide point of view, the goal would be to move from a negative–sum arms race in which advertisers must compete just to play in the commercial world — wasting consumer time and advertiser dollars — to a positive–sum experience which provides effective quality and relevance information for the advertising dollars that are spent.

Coevolving information channels

With spam and advertising, others push too much material into our personal space, only a tiny percentage of which is desired. But we also actively reach into the outside world to pull back materials satisfying personal needs. Due to limited time for examining source material directly, our choice is often limited to selecting between available channels of information.

Consider news media as an example. Not even as a full–time job would there be time to read all the articles in our favorite newspapers and magazines, participate actively in blogs and discussion sites, watch all the documentaries and webcasts available, and otherwise stay abreast of the ever–rolling wave of interesting incoming information. But with limited time, stories and developments become analogous to a voluntary, seductive form of spam — of much higher quality and interest, but with the same characteristic of requiring far more of one’s time than is available. It’s easy to get mentally exhausted just keeping up with it all, without actually allocating time to affect issues.

News institutions have been critically analyzed by a variety of commentators, and criticized for being driven by advertising and organized interests (e.g., Bagdikian, 2000). Four issues are especially relevant for our filtering discussion:

  1. Scope: To what degree is it useful to continually hear about conflicts and crises from around the globe? Conversely, what stories outside the usual areas are being missed, that would be useful for the audience?
  2. Signal: Does the typical newscast reflect the most important stories? How much is entertainment or self–referential trivia?
  3. Speed: Too much too fast, like a spotlight on steroids, chasing after the story of the day. Are long–term issues adequately on the radar?
  4. Spin: Bias in choice and presentation. How well does the news report match the opinion that would be formed if we (or an informed person we trust) could examine the underlying actuality?

Bowman and Willis (2003) review the recent shift toward a more diverse and participatory range of media formats, including blogs and discussion sites. To the extent that news is for civic informedness, there is a need to match knowledge with quality control — and with analysis and action, which Meikle (2002) refers to as "the conversational, the unfinished, the intercreative" in an argument for more engaging, bidirectional, and thoughtful media.

Perhaps too much exposure to surface events is not good for psychic health; like junk food, it’s easy to swallow, but becoming involved with fewer and deeper issues may be healthier. A group of citizens that wanted to have more say might choose to all hear some subset of the most important and general stories, while each would allocate the remaining time available to those issues they know most about, that most affect them, or that are most under–attended. Or they might choose a different strategy — in either case, a reasonable goal is to consciously and cooperatively allocate time and focus at both individual and group levels.

As a simple example, consider the proportion of "good news" (positive developments in the world) vs. "bad news" (wars, famines, accidents, scandals, hideous crimes, etc.). After a survey of 300 years of news, Davis and McLeod (2003) suggest that the choice of sensational news has remained relatively stable and partly reflects instinctive human traits. Since far more stories of both kinds exist than anyone can keep up with, why is the balance in each venue at a particular characteristic ratio? If one could choose the ratio to watch, what would it be?

These and many other properties of news could be "knobs" that future citizens can tune to tailor their stream of incoming news — knobs like estimated long–term importance, proportion of time given to pundits, timescale, local relevance, and even entertainment value. The issue will then be the best knob settings for each person at various times, and to what extent to agree to "bias" the knobs for larger groups; to ensure, for instance, that important stories are likely to be received by all those who should be concerned about them.
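One very simple way to picture such knobs is as weights in a scoring function. The Python sketch below is purely illustrative: the attribute names, their 0-to-1 values, and the weighted sum are assumptions, and where the attribute estimates would come from (editors, peers, or reputation services) is left open.

```python
def rank_news(items, knobs):
    """Order news items by a weighted sum of their attribute scores.

    `knobs` maps an attribute name to the weight the reader gives it;
    attributes missing from an item count as zero.
    """
    def score(item):
        return sum(weight * item["attributes"].get(name, 0.0)
                   for name, weight in knobs.items())
    return sorted(items, key=score, reverse=True)

# Hypothetical stories with hypothetical attribute estimates.
items = [
    {"title": "Celebrity feud",
     "attributes": {"long_term_importance": 0.1, "local_relevance": 0.0,
                    "entertainment": 0.9}},
    {"title": "Water treaty signed",
     "attributes": {"long_term_importance": 0.9, "local_relevance": 0.2,
                    "entertainment": 0.1}},
    {"title": "Town budget vote",
     "attributes": {"long_term_importance": 0.5, "local_relevance": 0.9,
                    "entertainment": 0.2}},
]

civic_knobs = {"long_term_importance": 1.0, "local_relevance": 0.5,
               "entertainment": 0.1}
leisure_knobs = {"long_term_importance": 0.1, "local_relevance": 0.1,
                 "entertainment": 1.0}

civic_order = [i["title"] for i in rank_news(items, civic_knobs)]
leisure_order = [i["title"] for i in rank_news(items, leisure_knobs)]
```

Agreeing to "bias" the knobs for a larger group would then amount to adding a shared set of weights to each person's own before ranking.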

Indeed, implementation of effective filtering may lead to the rise of "anti–spam" — thoughtful suggestions, inside tips and targeted offers you really do value highly. Picture a world where each person receives only advertising tailored to their specific interests, and rare gems certified as high quality by third parties or friends. Information, news, ads, and offers that make it through the much harder filter would be of high quality and relevance — perhaps even eagerly awaited.

Of course, some caution is needed so that items which may be socially important yet uninteresting still have paths to get through. Perhaps this will be part of the civic duty of future citizens — to read the "Weekly Us" along with the "Daily Me" warned about in Sunstein (2002). But the overall goal of filtering signal from noise remains vital, to maintain a citizenry that is shielded from spam and data smog, and that can hence spend enough time on issues that really matter.

 

++++++++++

Trade

Active vs. passive participants

We constantly make choices in economic activities, from choosing products and services to deciding which investments to make. But some choices are researched more actively than others — and the distinction between active and passive choices affects how closely reputations correspond to reality.

An active participant is one who brings thought, experience, and carefully considered advice from other people into play to evaluate the underlying quality behind the reputation of a trade good — "Sure, that stock is hot, but what do the underlying fundamentals really show?" In contrast, a passive participant is content to rely on expressed preferences of others — "Everyone else seems to be reading that book, I don’t want to miss it!"

Passive participants are essentially free riders: they take advantage of the time and resources others spend making evaluations. In the extreme case where all participants are passive, the distance between claims and actual quality can grow arbitrarily large, as many stock market manias and fads have illustrated. (The effect of various proportions of active participants is studied mathematically in Bianconi, et al. (2004).)

Everyone is a free rider most of the time. We can’t afford to investigate and ponder every choice, but must rely largely on past experience and guidance from others — otherwise even a trip to the supermarket would be a time–consuming burden of comparison and investigation. Reputation systems can reduce the proportion of active participants required via two complementary methods: reducing the impact of free riders, and easing the effort required for active investigation. Both methods work toward the goal that a few active searchers in each domain will suffice to keep counterparties honest for everyone.

Using reputational information and signals can thus be seen as a way of making the best inference, given a limited amount of evaluation capacity. In the case of trade, this is made more complex by conflicting interests between sellers and buyers — sellers have incentives to distort reputations to give the best possible spin on their product. Hence, along with efficiency, accuracy and resistance to manipulation are key goals. Let’s consider three practical ways of achieving all these goals: signals, experts, and peer reputations.

First, one can use credible signals from the seller (Riley, 2001). If resources are insufficient to evaluate the quality directly, a guarantee, evidence of popularity, or other signal can provide evidence of higher quality. An example is offering a longer warranty — a strategy that might be adopted by a new company to back its claims to customers. Such direct methods, along with past experience with a product or brand, are the age–old ways used to size up a seller.

If signals or other active search methods are impractical, expert opinions may be valuable: certifications for some quality level, or recommendations from a trusted "infomediary" that mediates recommendations and reputations (Hagel and Singer, 1999). If only a few experts are credibly capable of evaluating an object [6], this may be a reasonable approach. This simple solution has drawbacks, though: experts may be biased or even paid off by sellers, and a small set of evaluations may not provide sufficiently customized recommendations for people with unusual tastes.

Indeed, Partnoy (2004) sounds a cautionary note after examining why so many financial schemes and unsound investments failed to be caught by those who were entrusted with evaluating their quality:

"Gatekeepers benefit greatly from legal rules requiring that companies employ accounting firms to certify their financial statements, banks to underwrite their securities, law firms to examine the underlying documents and opine that they are legitimate, and credit–rating agencies to rate their bonds. Moreover, legal rules permit managers to insulate themselves from liability by involving gatekeeper firms in their transactions. In other words, gatekeepers do not survive based on their reputation alone, contrary to the assumptions of many academics."

The third solution, perhaps most interesting in the long run, is a system which combines evaluations from a variety of sources, in an accurate and trustworthy manner, to give each user the benefit of many people’s experience. A single evaluator may be mistaken, may be strange, or may be bought, but "you can’t fool all the people all the time." (Blacklists can be used to highlight particularly egregious offenders, as is done by some business associations and chambers of commerce.)

Such services already exist for specific domains, and there is a growing body of research on trade–related reputation mechanisms with distributed feedback (Bolton, et al., 2003; Dellarocas and Resnick, 2003). Examples of services include:

  • eBay’s buyer–seller reputations, where parties in a trade evaluate each other according to helpfulness and honesty.
  • ePinions, where products are rated by users, and raters are themselves rated on their usefulness.
  • comparis.ch, a site comparing quality of many products in Switzerland where the ratings are partly derived from user feedback.
  • BizRate, which rates companies using feedback from online buyers.

In our view, such a system should be as decentralized as a user wants it to be: providing opinions from just a few authoritative sources, from several trusted friends in a "network of trust" mode (Massa and Bhattacharjee, 2004), or from many people not personally known, according to the user’s desires and the available trusted sources of information. (Naturally, honesty in reporting opinions of others and algorithmic efficiency in identifying relevant opinions are also important.)

Striving for positive–sum games

Let’s turn now to the complementary perspective: reputation from the point of view of a seller. A reputation system, as described above from the point of view of the buyer, also has remarkable potential advantages for sellers producing high–quality goods:

  • Easier access to more accurate reputations for buyers implies that sellers will have a harder time getting away with substandard quality. However, those sellers producing high–quality goods will be better able to compete directly on terms of measurable performance and quality.
  • Easier access to customers seeking stimulating yet lesser–known opportunities, by reducing research and transaction costs.
  • Benefits will also accrue to those who provide better service throughout the product life cycle: selling honestly, giving good after–sales service, and taking customer feedback into account.

Sellers have some degree of choice for the quality of their products and services; buyers have some ability to perceive the true quality of products and services on offer. Increasing buyer perception ability tends to increase the quality level at which sellers have incentives to produce, which is clearly good for the buyer. It is also good for those sellers that can produce at the higher quality level while still making profits. Thus an increase in perception on the part of the buyer can result in a positive–sum game being played between buyers and higher–quality sellers, so that they both share increasing pieces of a growing "magic pie." In fact, considering both the seller’s quality options and buyer perceptiveness as fundamental parameters leads to an insightful economic theory (Zhang, forthcoming).

Certification, both formal and informal, becomes more important as access to and trust in certification information becomes easier (e.g., ISO standards or receiving a "seal of approval" from a consumer association). Brands will still be important assets, but will be linked to increasing amounts of third–party observation and information, which is already happening in merchant feedback sites. Beyond quality measures, ethical and fair trade certification would also be easy to add systematically, as a filter that would highlight companies and services deemed to be ethical by whatever agencies one trusts.

The buyer–seller relationship is recursively present for several levels in many industries, as sellers to a customer must in turn be buyers of their inputs, and evaluate and negotiate with sellers further up the supply chain. For both material inputs and human resources, companies which feel increased customer pressure from a fully–functional reputation system will nevertheless benefit as they in turn become purchasers. Indeed, SGS (Société Générale de Surveillance) and similar companies already provide specialized expertise in inspection, quality assurance and reputation certification of suppliers.

Efficiently and fairly handling the reputation of an employee — or an employee’s ideas — within a company is a key competitive advantage. Recognizing good employees and suggestions could be handled by internal reputation systems, which will give better solutions to both strategy development and fair employee rewards (though care must be taken to avoid dysfunctional behavior, like rating a competitor low to make oneself look better). Finding qualified people itself changes with formal reputations; while wise employers have always used a range of signals to look behind formal grades and references, direct testing and certification of ability can provide better filtering of candidates and complement formal qualifications. Poundstone (2004) showcases the use of puzzles and brainteasers in the technology field.

As Fischer (2002) argues:

"The fundamental challenge for computational media is to contribute to the invention and design of cultures in which humans can express themselves and engage in personally meaningful activities."

This is no less true of working life in general. We would hope that open reputations for conditions inside a company would provide both internal and external impetus for improvement. The experience from open source software development is instructive in this regard: Bonaccorsi and Rossi (2004) survey reasons for taking part in the open source movement and find that peer esteem and free software ideals are among the highest motivators for individuals. One additional factor is the visibility of code and contributors for open source projects, implying that one can develop a portable reputation as a good developer (which is more difficult in a closed company environment).

Mondragon, the Spanish "cooperative of cooperatives," is a new business model whose history and philosophy are described in MacLeod (1997): employee–owned, with distributed decision–making. Working environments which stay successful while being fair to their customers, their employees, and their societies can benefit from reputation in two senses: internally by a "reputation democracy" where ideas and rewards are not confined to the executive suites, and externally as their reputation for fair dealing attracts both customers and new employees. Such environments will make it harder for dysfunctional behavior to maintain itself behind the corporate veil, while making it easier to find matches between skills, personalities, needs, and opportunities.

Trade usually depends on a reasonable level of trust, with a thicker layer reducing many costs for compliance and monitoring, and generally making life more pleasant all around. With the same degree of native trust, people will be willing to engage in more trusting behavior if a reputation system can give credibility to claims of past good behavior, and increase the costs and thus reduce the risk of potential negative behavior by counterparties.

Having treated the properties of trade, it’s natural to turn to the dynamics of reputation and wealth in general. They are similar in several important ways. First, there are "increasing returns to scale": those who already possess some find it easier to acquire more. Reputation and wealth are also both signals of some underlying task performed that is of value to others, albeit noisy signals: chance, deception, and the passiveness of observers all play a part.

A related way in which they empirically appear similar is the speculative investment and reputation bubbles that have plagued us throughout history. Whenever the belief of many in an idea strengthens the idea, and the idea is not subject to external validation or correction, such bubbles can last for a long time. We can see examples of stock market bubbles since such markets began, from tulip mania in the seventeenth century to the speculative bubble that preceded the Great Depression to the Internet Bubble of the late 1990’s. (See Galbraith (1994) and Chancellor (2000) for general histories of manias, and Partnoy (2004) for an account of speculative bubbles and deceptive practices in North America since the early 1990’s.)

But reputation and wealth are also different. Wealth is far more readily exchanged for other goods and services. While reputation can also be exchanged, it is a much chancier affair, depending on the particular situation and people in question. Wealth is easier to store for a rainy day or future investment; reputation is an uncertain affair, lasting centuries for some and melting away with a single unfortunate incident for others.

With advancing information technology and improving social wiring, there is hope that society may naturally evolve toward distributions in better accord with underlying reality. In the context of wealth, this would lead to rewards better reflecting underlying contributions. In the context of reputations, this would lead to a closer match between how things are perceived and how much they are actually worth.

 

++++++++++

Culture

Music, movies, and all that

Finding music which matches one’s tastes is a paradigmatic example of the complexities that arise when no single meaningful "quality" measure exists — there are quite idiosyncratic opinions as to what constitutes good music. Fortunately, differences in taste can be handled algorithmically, by defining different measures of quality for each person. Then a user can interpret others’ ratings by weighting more highly the opinion of those raters who share similar quality measures [7].
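To make the weighting idea concrete, here is a minimal sketch of one common way to do it: score how similarly two listeners rate the albums they have in common, then predict a rating as a similarity-weighted average. The listeners, the albums, the 1 to 5 scale, and the function names are all hypothetical, and the choice of mean-centred cosine similarity is our assumption rather than a method prescribed above.

```python
from math import sqrt

# Hypothetical ratings on a 1-5 scale: listener -> {album: rating}.
ratings = {
    "ana":  {"A": 5, "B": 4, "C": 1},
    "ben":  {"A": 5, "B": 5, "C": 1, "D": 5},
    "cara": {"A": 1, "B": 2, "C": 5, "D": 1},
}

def similarity(u, v):
    """Cosine similarity of mean-centred ratings over albums both have rated."""
    shared = sorted(set(ratings[u]) & set(ratings[v]))
    if len(shared) < 2:
        return 0.0
    mean_u = sum(ratings[u].values()) / len(ratings[u])
    mean_v = sum(ratings[v].values()) / len(ratings[v])
    du = [ratings[u][i] - mean_u for i in shared]
    dv = [ratings[v][i] - mean_v for i in shared]
    norm = sqrt(sum(a * a for a in du)) * sqrt(sum(b * b for b in dv))
    return sum(a * b for a, b in zip(du, dv)) / norm if norm else 0.0

def predict(user, item):
    """Similarity-weighted average of other listeners' ratings for `item`.

    Only listeners with positive similarity count; returns None when
    no such listener has rated the item.
    """
    weighted = [(similarity(user, other), theirs[item])
                for other, theirs in ratings.items()
                if other != user and item in theirs]
    weighted = [(w, r) for w, r in weighted if w > 0]
    total = sum(w for w, _ in weighted)
    if total == 0:
        return None
    return sum(w * r for w, r in weighted) / total

print(predict("ana", "D"))
```

In this toy data ana's tastes track ben's and run opposite to cara's, so only ben's opinion of album D carries weight in the prediction.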

A complementary approach is to consider similarity between items, e.g., "most people who liked Pink Floyd also like Tangerine Dream." These and other approaches have been used in music recommender systems. (As indicated elsewhere in this article, this is just one example of the general promise of collaborative filtering technologies; for a mathematical treatment, see Maslov and Zhang (2001).) The trick is to refine quality measures and recommendations in a differentiated way — a system which only aggregates user preferences at a coarse level is not doing justice to the range of human culture and tastes.
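The item-based view can be sketched just as briefly: count how often two artists are liked by the same person, and report, for a given artist, the share of its fans who also like each other artist. The "liked" lists below are invented for illustration, and `also_liked` is a hypothetical helper, not a description of any deployed recommender.

```python
from collections import Counter
from itertools import combinations

# Hypothetical "liked" lists, one set of artists per listener.
likes = {
    "u1": {"Pink Floyd", "Tangerine Dream", "Can"},
    "u2": {"Pink Floyd", "Tangerine Dream"},
    "u3": {"Pink Floyd", "Miles Davis"},
    "u4": {"Miles Davis", "John Coltrane"},
}

fans = Counter()      # artist -> number of listeners who like it
together = Counter()  # (artist_a, artist_b) -> listeners who like both
for liked in likes.values():
    fans.update(liked)
    for a, b in combinations(sorted(liked), 2):
        together[(a, b)] += 1
        together[(b, a)] += 1

def also_liked(artist, k=3):
    """Top-k other artists by the fraction of `artist`'s fans who like them."""
    scores = {b: n / fans[artist]
              for (a, b), n in together.items() if a == artist}
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]

print(also_liked("Pink Floyd"))
```

A coarse count like this is exactly the kind of aggregate that needs refining: it says nothing about why the overlap exists or for whom it holds.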

But reputations are not just for finding what one already knows about — they can also lead to the development of new tastes, as when a friend recommends a piece of music in a new genre. And in the spirit of personal development, we should not ignore the participatory side of music which most people can take part in, whether by developing more nuanced appreciation for the music of others or by practicing one’s own talents. The reputation one gets for being a connoisseur or entertaining performer is a natural feedback loop that encourages further personal development.

Movies have had a well–developed culture of reviews for a long time. The combination of a plot description and an opinion is often sufficient to form an educated estimate of a movie’s likely appeal — if the reviewer is competent and has similar tastes to one’s own. But many times more personalized advice would be useful; perhaps as an aggregate of all those interested in a particular movie or niche, or perhaps just a combination of a few friends’ ratings. Either way, the advice (or "personalized reputation") benefits from being more targeted.

IMDB (Internet Movie Database) and similar Web sites aggregate millions of votes for a wide range of movies, and online sellers of movies use some degree of collaborative filtering to make recommendations given one’s past purchasing history. However, an open reputation–sharing mechanism has yet to become widespread. One can project forward to imagine innovative applications, such as "Movies Wanted": a system where plot descriptions are collaboratively developed and voted on, to highlight those movies already desired by a constituency. The net effect of reputation filtering will be to bring more old, foreign, and niche movies to light, with similar effects for music and other culture. Cultural opportunities that languish for want of attention due to high search costs will reach audiences that didn’t know what they were missing.

When discussing music and movies, conversation tends to turn naturally to issues of file sharing and intellectual property. However, by 2010 the particular technological issue of sharing music or movie files may well be resolved, leaving behind the longer–term issue of how to fairly assign reward to creative artists. Considerations of reputation are relevant to the debate; most obviously, the degree to which people copy various forms of media depends in part on how socially acceptable such behavior is in their milieu. This in turn depends on how reasonable the alternatives seem.

Negative reputation has been tried as a strategy by some companies, by making public the names of those engaged in large–scale file–sharing. Conversely, positive reputation as used by established institutions like museums, operas, and universities is a potential strategy, if for example a music fan would choose to pay some amount to become a "patron" of a band, in return for public acknowledgement of their status and access to a repository of the band’s music. (Fans could even invest in a band in advance, with a tour or album happening only after sufficient funds have been raised; Lewis (2002) recounts this taking place successfully with the band Marillion.) A society–wide scheme has been suggested in Baker (2003), where a tax–deductible contribution is made by each citizen to their choice of artists who have chosen to participate; these artists would then donate their works to the public domain.

Many recommender systems provide suggestions based on expressed or observed preferences. But reputations could also encode other properties of media, such as "ethicalness" of lyrics (and indeed of the performers’ lives and aims if one desires), or specific legal or reproduction rights. Licensing schemes like Creative Commons certify an artistic work as having particular legal properties; it is then feasible to provide both recommendations and direct access just within the set of freely available music.

Beyond music and movies, numerous cultural areas and experience goods are ripe for recommendation services provided by reputations. Book ratings and suggestions provide a navigation tool through humanity’s ever–growing literary output — most notably from Amazon, but also from a variety of small–scale services and personal lists. Travel guidebooks aid in getting the insider view of an unfamiliar locale, but interpreted experiences of natives and previous travelers could be even better. Whether for festivals, museums, opera, or the thousands of other shared activities which enrich our social landscape, the cultural sector is fertile ground for development.

Keeping score, cooperatively

Sports have always been the realm of the record — farther, faster, stronger, or just more, every athlete and team competes for a better score. Indeed, there is something quintessentially human about wanting to push the bar that one step higher.

Lewis (2003) recounts the remarkable success of a baseball team which shifted to quantitative analysis to help make personnel decisions. Better observations and interpretation of baseball statistics, and the contribution toward scoring runs that each fielding or batting statistic implies, allowed identification of "undervalued investments": players who possessed non–obvious valuable talents, but were not recognized as such by other teams. Lewis wonders what the previous lack of effective methods of assessing player value implies about society at large:

" ... if gross miscalculations of a person’s value could occur on a baseball field, before a live audience of thirty thousand, and a television audience of millions more, what did that say about the measurement of performance in other lines of work? If professional baseball players could be over– or under–valued, who couldn’t? Bad as they may have been, the statistics used to evaluate baseball players were probably far more accurate than anything used to measure the value of people who didn’t play baseball for a living."

Similar reputations exist for mental sports like chess and Go; in fact, these have been so well formalized that they provide a widely understood and unambiguous measuring stick for ranking theory (and indeed for comparing open–ended progress in human and computational abilities; see Masum, et al. (2003)). Most abilities we care about can’t be measured so precisely as gamesmanship — so the reputations found for games, and their statistically measurable ability to predict future performance, provide an accessible testbed for reputation theory development.
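The best-known example of such a formalized rating is the Elo system used in chess, which we take here as an illustration (the text above does not single out any one scheme). A rating difference translates into a predicted score, and ratings move in proportion to how far the actual result departs from that prediction. The K-factor of 20 below is an assumed value; federations use various settings.

```python
def expected_score(r_a, r_b):
    """Elo's predicted score for player A against player B (between 0 and 1)."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(r_a, r_b, score_a, k=20):
    """Return both new ratings after one game.

    score_a is 1 for a win by A, 0.5 for a draw, 0 for a loss.
    k is the K-factor (an assumed value here); whatever A gains, B loses.
    """
    delta = k * (score_a - expected_score(r_a, r_b))
    return r_a + delta, r_b - delta

print(expected_score(1900, 1500))  # the stronger player is the clear favorite
print(update(1500, 1500, 1.0))     # an even match: the winner gains k / 2
```

Because the prediction is explicit, it can be checked against later results, which is what makes game ratings a convenient testbed.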

Games illustrate two complementary facets of reputation: the score against others competing in the same arena, and the score against one’s previous personal best. Though there can by definition only be one overall bestseller in a market, there can be many products which are "best" if a variety of desirable criteria exist for different people. It may be psychologically beneficial to encourage people to find their own niches at which they can excel; Brooks (2002) discusses this tendency in modern society, recognizing its positive potential while cautioning that it may lead to solipsistic subcultures.

Newer massively multiplayer online games combine hundreds of thousands of people in virtual worlds with their own subcultures and economies; indeed, the persistence and scale of these virtual worlds is large enough that they already generate significant economic activity (Castronova, 2002). Reputations from games could carry over to other situations if the game includes enough real–life skills, as has happened with the competitive programming arena TopCoder. Ultimately games could merge into simulations, and provide entertaining vehicles for exploring policy alternatives and collaboratively solving tough issues (Aldrich, 2003; Sawyer, 2002).

Much of culture is not solitary. Whether in collaboration, competition, or connoisseurship, social group formation can accelerate growth in cultural spheres. Reputation can act as a matchmaker, to find those with similar goals or with matching skill and effort levels. As well–meshed teams and peer groups become easier to form, we might see a general amplification of human satisfaction.

We have just dipped below the surface of a vast sea of changes, but the general pattern should be apparent. Whether for books or music, for travel, art, or any of thousands of hobby subcultures, reputation will aid in several key ways:

  • In assessing and filtering the works on offer to find those best matched to personal tastes and abilities.
  • In finding others who share one’s passion.
  • In generating global and local reputations, to provide feedback that encourages increasing competence levels.
  • And in motivating the development of our leisure tastes as urged by Scitovsky (1997), to move beyond passive consumption to actively broadening our horizons.

 

++++++++++

Risks

Like any powerful technology, reputation systems are not without their risks. If the technical, economic, and social barriers to implementing a ubiquitous ecology of reputation systems are overcome, what dangers could be posed?

Perhaps the most obvious risk for an individual or organization is loss of privacy. Keeping in mind a model of influence spread as a network of nodes interconnected by edges, this can be further subdivided into two categories:

  • Inbound reputation — reputation that others have about you. Naturally, some people will know more about you than others, and it’s not difficult to imagine cases where one wouldn’t want such knowledge made public.
  • Outbound reputation — reputation that you have about others. The opinions you form say something about your personal beliefs and stance in life, and the right to keep such beliefs private has been fought for over many centuries.

Both forms of privacy must be dealt with convincingly before reputation systems can (or should) be used in a domain. In some areas, one wishes to publicize one’s reputation information: prizes or honors are advertised rather than hidden, tastes in music or books may be posted on one’s Web page to signal personality traits to others. But keeping the choice to make personal assessments of reputation public or private seems a reasonable design goal.

Here a distinction must be made. For outbound reputation, privacy rights will naturally be higher. Unless one proclaims one’s opinion of others in a public forum, the use of a reputation system should automatically imply that all data submitted will be kept confidential, with a minimum of loopholes and escape clauses.

But for inbound reputation — the reputation that others form about you — there will be a shifting balance between your right to privacy and the reputation former’s right to share and collaborate with others. Those with public functions — politicians, doctors, lawyers, professors, and so forth — can expect to be more exposed in roles where others have legitimate interests. A dentist’s reputation as a dentist may be public, while the same dentist’s reputation as a debater need not be. It is easy to foresee intense debates, differing legal interpretations, and the emergence of customs on this issue, similar to existing tensions on the desirable scope of intellectual property rights.

Often privacy is desired out of fear that information about oneself might be used against one by others. We usually prefer to know more about others while hiding our own shortcomings and embarrassments — living life with social sunglasses. The natural dynamic of effective reputation systems is to increase transparency, so there is an open question as to what the desirable countervailing privacy–preserving forces should be.

Deliberate skewing of reputations by those who benefit from their inaccuracy is one of the greatest operational problems reputation systems will face, once they have dealt with implementation issues like privacy and authentication. The public relations agencies of today may evolve into the reputation manipulation and repair agencies of tomorrow, with expertise ranging from understanding why one’s reputation is in trouble to underhanded ways of gaming reputation systems. Arenas with more heterogeneous interest groups like politics and commerce will naturally have more pressure for skewing reputations — consider the present–day difference in deceitfulness between commercial and educational Web sites.

Inaccurate reputations are already a concern. Mistakes in credit reports and off–base investment advice illustrate how insufficient effort, lack of insight, and diverse incentives can reduce reputation quality. Rigidity of evaluations poses similar dangers: people might find it difficult to escape from minor youthful misdemeanors, or to fulfill unreasonable rules imposed by a reputation evaluation agency wanting to make its job as easy as possible. As a new and specialized form of identity theft, "reputation theft" could occur when someone successfully hijacks the reputational characteristics of another (Kaye, 2004; Newitz, 2003).

Let’s look ahead now and consider longer–term risks of a fully functional reputation system ecology. The recent history of economic practice should provide a cautionary example: too much reliance on simple financial metrics means optimizing a distortion of the real issues, as for instance with maximizing economic activity (instead of a more accurate but harder to calculate basket of human welfare measures) or "locally optimizing" short–term company income (without adequately considering the many investments in human, technical, and other forms of capital necessary for long–term success). Too much reputation could lead to a similar sort of "satisficing" where decision–makers become overly reliant on the quantities they can easily measure, becoming passive users of reputations without adequately considering their limitations.

Pressure to perform is likewise a mixed blessing — while good in moderation, turning up the pressure too high leads to counterproductive stress. If everyone lives "under the microscope" in their dealings with others, the scope for innovation and risk–taking may be unduly constricted. Since everyone makes mistakes, a certain level of forgetfulness in maintaining reputations seems necessary, to encourage redemption and forgiveness; perhaps this will emerge naturally, since reputations of people are often relative to norms.
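One simple way to picture built-in forgetfulness is to let the weight of each past event shrink with age. The sketch below assumes an exponential decay with a one-year half-life and events scored between -1 and 1; both choices, and the function name, are purely illustrative.

```python
def decayed_score(events, now, half_life_days=365.0):
    """Age-weighted average of past events.

    events: list of (day, value) pairs, with value in [-1, 1].
    An event's weight halves every `half_life_days` (an assumed setting).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for day, value in events:
        weight = 0.5 ** ((now - day) / half_life_days)
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total if weight_total else 0.0

# A misstep two years ago, followed by one recent good deed.
history = [(0, -1.0), (730, 1.0)]
print(decayed_score(history, now=730))
```

Under these assumptions the old misstep still counts, but at a quarter of the weight of the recent event, so the overall score is already positive.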

Rapid flows of investment money have been destabilizing in the past. Rapid reputation flows may also be an issue. As mass media provide an amplifying feedback effect for a few major stories, the reputation of a person, a country, or even long–held ideas can come under intense, sustained attack — sometimes justified, other times not. As reputation becomes more directly linked to financial, economic, or even legal rights, the best safeguard against inaccurate booms and crashes of reputation is to be evaluated by a diversity of viewpoints, each of which has some independent capacity for thought and judgement — perhaps combined with a "circuit breaker" mechanism that focuses critical attention on rapidly changing reputations.
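A "circuit breaker" of this kind could be as simple as comparing a reputation score with its value a few steps earlier and calling for review when the swing is too large. The window, the threshold, and the 0 to 1 score scale in this sketch are assumptions chosen only for illustration.

```python
def tripped(history, window=3, max_change=0.2):
    """Return True if the latest score has moved by more than `max_change`
    compared with the score `window` steps earlier.

    history: chronological list of reputation scores in [0, 1].
    """
    if len(history) <= window:
        return False  # not enough history to judge
    return abs(history[-1] - history[-1 - window]) > max_change

steady = [0.80, 0.79, 0.80, 0.78]
crashing = [0.80, 0.70, 0.55, 0.40]
print(tripped(steady), tripped(crashing))
```

Tripping the breaker need not freeze anything; it would simply direct critical attention to the rapidly changing reputation before further consequences attach to it.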

As suggested in several venues such as Sunstein (2002), an overly personalized reputation system that catered exclusively to the expressed preferences of each citizen would lead to a world where each person reads a different newspaper, sees different shows, and interacts with different people — all sympathetic to issues and ideals they already like. Such a world could separate subcultures and demographic groups ever more sharply, with polarized viewpoints taking over and little left to hold the common ground that binds together societies.

Polarizing groups are a very real concern, as is the age–old problem of the tyranny of the majority drowning out the voice of the minority, or worse. As a saying often attributed to Benjamin Franklin puts it: "Democracy is two wolves and a lamb voting on what to have for lunch. Liberty is a well–armed lamb contesting the vote."

Space for insightful yet unpopular points of view may ironically be provided by the same polarizing phenomenon that discounts minority views, since easy polarization implies that differing points of view can easily be filtered out and ignored. It may also be desirable for explicit mechanisms or implicit dynamics of the reputation system ecology to encourage maintenance of diversity, cross–fertilization of views, and exposure to general culture and ideals.

 

++++++++++

Ideosphere

Informed debate

In the public sphere, the ceaseless ebb and flow of ideas affects each one of us. Policies are developed. Consensus is reached. Compromises are made. And this give–and–take and regard for the common welfare that characterizes healthy societies can be boosted through judicious assistance from reputation systems.

At root, informed public debate rests on a base of ideas. So let’s look at two of the most concrete manifestations of ideas: books and articles.

Occupying a special place in human history for many centuries as the channel of communication for difficult and significant ideas, books are still one of the best means of encapsulating a large yet coherent point of view. Since it’s difficult to construct a parallel synthesis without the large time investment of writing another book, a well–written book can come to partially define and symbolize a field. Many are the books which have had significant effects on the course of history, from religious texts to The Wealth of Nations and Das Kapital to On the Origin of Species.

What determines which books become popular? Amazon makes public ongoing sales rankings for millions of books; Rosenthal (2004) has used observations of these rankings combined with revenue information to estimate what a particular sales rank translates into in terms of sales per day, and finds a roughly power–law behavior for most of the ranking range. Advances in searchability, such as the full–text search in millions of books discussed in Wolf (2003), may flatten this distribution, by bringing relevant works to light.

Academic articles have served a similar purpose of idea communication for several centuries — the collection of articles in a field constitutes both a historical record and a body of currently accepted truths. The filtering provided by peer review has been critical to the maintenance of standards, now complemented by citation–based measures; as with search engines, the number of citations of a given article constitutes a rough measure of how useful that article was found to be.

Bibliometrics and citation analysis look at this graph of citations to extract reputational measures for individual papers, authors, institutions, and sometimes entire fields. However, cliques of mutual citations can artificially boost ratings. A more subtle effect comes when people are "passive citers," mentioning papers and books only because they are already well–known, without checking directly for quality and relevance. Simkin and Roychowdhury (2003) model misprint distributions to estimate that about 80 percent of citers don’t read the original paper.

Though informed public debate rests on a bedrock of deep ideas, much of the day–to–day action takes place in the realm of conversation, usually mediated by media. Clearly a world in which it is common to hear about and be affected by events on the other side of the planet is one where the task of sorting, filtering, and translating the ongoing deluge of raw events into news has never been more important. Can "news" be produced, filtered, and consumed more effectively? Bowman and Willis (2003) review online efforts in constructing self–filtering social communities, and suggest the answer is a cautious yes.

Pundits and public intellectuals play a significant role in shaping public opinion, but their reputations are only weakly linked with how useful their advice turns out to be over time. So why do so many people listen to "professional talking heads" from a few large media outlets, instead of more local voices or more insightful ones? One causal factor is that news is largely a search and authentication issue, and so a key barrier to becoming a primary news source is perceived quality. In principle, news collection and dissemination could be much more distributed, if one could easily find relevant news and trust the reputation for accuracy and insight of news intermediaries. Another barrier is that broadcast news is almost effortless for the user — there must be active demand for and participation in better alternatives, and a refusal to simply be spoon–fed couch potatoes (Shaw, 2003).

A credible negative reputation can be an effective deterrent to unethical behavior, complementary to legal sanctions. Several volunteer–driven sites attempt to identify those who would be willing to cross the line into unethical behavior, by tactics like posing as a potential medical customer or underage sex–chat participant — those who show repeated willingness to transgress social norms then have their identities publicized. Though there is a danger of mistaken allegations or ostracism of those with unpopular views, a well–functioning "distributed court of opinion" nevertheless has the potential to restrain antisocial behavior in areas that overburdened justice systems find difficult to prosecute.

Generating better policy

Wise policies and original ideas are usually not developed in a vacuum. They benefit immeasurably from concentrations of creative thinkers and doers: sharing know–how, competing for social status, and bouncing ideas off each other. A historical example is the Lunar Society, a discussion group in Birmingham, England that started around 1765, lasted for half a century, and included such thinkers and doers as Matthew Boulton, Erasmus Darwin, Samuel Galton, Joseph Priestley, James Watt, and Josiah Wedgwood (Benjamin Franklin was a corresponding member). Like other similar centers through the ages, this technological hub and relatively open society of its day drew in many remarkable people. By word of mouth and active search for new members, a creative and stimulating environment was maintained through many decades.

How can we similarly promote more productive seminars, conferences, and brainstorming sessions today? A paradox familiar to designers of both physical and online communities needs to be balanced — one wants to be known to others in order to draw in new members, but at the same time one wants to be selective in admitting new members, to avoid diluting talent and excitement levels.

Consciously guiding and distributing reputational pressure during conversation is a strategic tool for generating better final outputs. Thus far, implemented online discussion communities are providing both positive and negative case studies in exploring the consequences of a given story or idea; how to collaboratively develop large–scale policy or come to workable compromises on important issues remains an open problem. As the scale and diversity of participants grows, more structure becomes necessary, but reputational mechanisms for steering attention and time remain important, and the formation of these mechanisms is a social technology still very much under development. (See Stiegler (1999) for a science fictional treatment using global idea markets.)

Strogatz (2003) speculates on "Human Sync": the dynamics of human populations converging on ideas, positive or negative. Causative factors for fads include luck, hysteria, rational choice, "catchiness," propaganda, repetition, and amplifying feedback. This phenomenon seems to have been considered mostly from the point of view of political movements, marketing, and stock market bubbles, but an underexplored practical question follows: how can one kickstart social uptake of positive ideas (Gladwell, 2002)? Equally important is inoculating society against the contagion of negative ideas, whether they take the form of stock bubbles, mobs, political regimes, or simply ideas that turn out to have negative long–term expected survival value.

To the degree that reputations in the public sphere are thought to be important, free time for reputation evaluation is an important public good. Even so, each of us can only make informed decisions by offloading most research gathering and evaluation to others, which is an important limitation in the implementation of a democracy. As Schenk (1998) points out, subjecting all governance decisions to real–time referenda is not the way to promote reasoned governance.

A skilled breeder needs to notice good mutants and variants, or else they disappear. Spontaneous talents and ideas are constantly being formed, but better options need recognition, encouragement and support to develop. Hoping for an emergent selection process to generate the best outcomes is insufficient; healthy reputation systems can provide pressure to improve, focusing and clarifying the best ideas to promote action.

In most cases, reputations formed from a cooperating network of people have the potential to be better than those any single person could form. Our judgements on any complex topic are inevitably transmitted to each other in an incomplete and distorted fashion. The task of reputation system designers is then to set up incentives that minimize inaccuracies and maximize productive collaboration, so that wherever possible the judgement of a group — or indeed, of an entire society — becomes better than the judgement of its individual members.

 

++++++++++

Conclusions

Being human, each of us has many limitations: time, access, ability, and experience. The main goal of developing enhanced reputation filters is to do as much as possible despite our individual limitations — to cooperatively pierce the veils of deception, mediocrity, and banality.

"Who will guard the guardians?" — the age–old problem posed by the Romans remains just as relevant today as it was two thousand years ago. Accountability of leaders, of businesspeople, and of those entrusted with public duties must be a key goal of developing a society–wide reputation system ecology.

The sharing of observations and opinions builds up a picture in each person’s mind of the reputation’s subject, which we might call the "Invisible Eye" — the distributed formation of reputations, and consequent increased ability to distinguish better from worse. To the degree that you have access to and trust the experience of others, it is almost as if you yourself had been there watching that previous situation, thus increasing your base of experience from which to judge future reliability — and increasing pressure on the subject in question to behave responsibly. The analogy to Adam Smith’s Invisible Hand is not accidental; just as selfish local actions with market incentives can lead to collectively efficient behavior, locally maximizing actions with reputation incentives have the potential for similar guided emergent behavior that exceeds what might have been designed by a conscious planner.

The ultimate aim is to increase the level of collective wisdom through sharing our separate experience and expertise. This will enable a "division of experience" — instead of each of us personally suffering through scams, cheats, and mediocrity, we will be able to leverage each other’s experiences. Collectively, aided by astutely networked reputation systems, we stand the best chance of overcoming our dark side and bringing out the best in us.

How is this to be done? Though it would be premature to suggest engineering principles, we can with greater certainty discuss what goals the emerging reputation infrastructure should aim for (see also Masum, 2002):

 

Table 1: Common design goals.

Authentication: Data security, user privacy, identities that are difficult to hijack.
Searchability: Ease of finding relevant reputations.
Analytical capabilities: Understanding patterns, aggregations, and implications of reputations.
Efficient use of human time: Relatively simple to set up and maintain, easy to use, and stretching the limited number of human evaluations as far as possible.
System transparency: Public rules behind how reputations are made, checked, and corrected.

 

While these goals are relatively uncontroversial, other design principles involve tensions between different parties:

 

Table 2: Design tensions.

Scope (what reputations should be available): Too little, and the reputation system loses effectiveness. Too much, and there is a lack of privacy.
Centralization of reputation formation: Too little, and it is hard to find good sources of reputation information. Too much, and there is potential for corruption and incompetence in the gatekeepers.
Customization: One naturally applauds customization as a goal, giving each person a free choice of whose opinions to listen to ... but too much diversity of reputation sources might lead to polarization and weaken social cohesion.
Funding: A user–pay model might be expensive, and make it difficult to invest in common infrastructure and in–depth research ... but third parties who are paid to provide reputation services may intrude their biases, or pursue profit over truth.

 

The questions that reputation was created to answer are as old as human society itself. However, the answers have been taken for granted while society changed shape. It is now time to re–examine our previous answers and search for better ones — the emergence of solutions without focused effort cannot be taken for granted. Implemented and prototype reputation systems are already enabling new kinds of interpersonal exchanges, commercial opportunities, and discussion forums, giving a base of theory and practice with which to design the next generation of platforms.

Reliable and easy–to–use reputation systems are more needed now than ever before; constructing them will require a mix of information technology and "incentive engineering." At a personal level, there are many common activities which would be enhanced through better reputation services. And at a societal level, we are facing grave issues: from climate change to media monopolies, from fair trade to weapons of mass destruction, from poverty alleviation to developing shared ethics.

To connect abilities, opportunities, and individuals, and to better understand their true character and value: that is the challenge whose solution will overcome constraints on self–actualization and mutually beneficial interactions. Each new advance in human civilization liberates our mind, body, and spirit a little more from the limitations formed by physical circumstance and social imperfection. While rising living standards have improved the material lot of humanity relative to previous generations, we now have the opportunity to unleash positive changes in the social, psychological, and creative fabric of society. Only our best efforts will do.

 

About the authors

Hassan Masum is a postdoctoral researcher, affiliated with Carleton University and the Communications Research Center of Canada. Yi–Cheng Zhang is professor of theoretical physics at the University of Fribourg, Switzerland.

Hassan and Yi–Cheng are writing a book on the emerging Reputation Society; please see reputationsociety.com.

 

Acknowledgements

We’d like to thank Ben Houston, Joe McCauley, Lionel Moret, and John Spence for helpful feedback on drafts of this article.

 

Notes

1. The field of social network analysis examines the connectivity structure between people, and its implications for their interactions. Recently, a good deal of quantitative research has been triggered by the widespread realization that the network of influence between people has a special form, whose structure is based on a "Small World Network".

Small World Networks are those where neighboring nodes are highly clustered, but the minimum path length between a randomly chosen pair of nodes remains small. Analyzing similar properties and corresponding graphs has led to a productive branch of research, as overviewed in Barabási (2002) and Watts (2004), and more technically in Newman (2003).
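The two properties named above, high clustering and short path lengths, can both be measured directly on a graph. The following is a minimal sketch rather than a method taken from the cited works: the ring lattice, the three hand-picked shortcuts, and all function names are illustrative assumptions.

```python
from collections import deque


def ring_lattice(n, k):
    """Build a ring of n nodes, each linked to its k nearest neighbours on each side."""
    graph = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            graph[i].add(j)
            graph[j].add(i)
    return graph


def average_clustering(graph):
    """Mean, over all nodes, of the fraction of a node's neighbour pairs that are linked."""
    total = 0.0
    for neighbours in graph.values():
        nbrs = list(neighbours)
        degree = len(nbrs)
        if degree < 2:
            continue
        linked = sum(
            1
            for a in range(degree)
            for b in range(a + 1, degree)
            if nbrs[b] in graph[nbrs[a]]
        )
        total += linked / (degree * (degree - 1) / 2)
    return total / len(graph)


def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered pairs (breadth-first search)."""
    total = 0
    pairs = 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Assumed toy example: a regular ring, then the same ring plus three long-range shortcuts.
lattice = ring_lattice(60, 2)
small_world = ring_lattice(60, 2)
for node in (0, 10, 20):
    small_world[node].add(node + 30)
    small_world[node + 30].add(node)

print(average_clustering(lattice), average_path_length(lattice))
print(average_clustering(small_world), average_path_length(small_world))
```

In this toy case a handful of shortcuts shortens typical paths while leaving neighbourhoods almost as tightly clustered as before, which is the qualitative signature described in this note.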

2. Cohesive idea–units are often called "memes", a term that appears to have been introduced in the original edition of Dawkins (1976). By analogy with biological genes, memes may spread from mind to mind and evolve over time.

3. In the vector space model, documents are still represented as lists of words, but the number of times each word occurs is also stored — so the index would have an entry similar to "In this document, ‘a’ occurs 127 times, ‘aardvark’ occurs 1 time, ...". The rationale is that documents mentioning the requested keywords more often are probably more focused on the requested topic. In addition to word frequency, each entry is weighted by "inverse document frequency", with words occurring more often across all documents considered less important — intuitively, "a" and "the" distinguish between different documents much less than a rarer word like "hydrodynamics".

The combination of term frequency and inverse document frequency is often referred to as "TF–IDF" weighting. The query itself is just a list of keywords (such as one would type into a search engine) and can be represented as a vector in the same way as the documents, though almost all entries will be 0. The system then calculates the closeness between the query and each document, where they are considered closer when they both have high values for the same entries. (The query and all documents are represented as weighted vectors, and one just takes the cosine of the angle between the two vectors.)
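As a rough illustration of this note, the sketch below weights word counts by inverse document frequency and ranks documents by cosine similarity to a query. The particular IDF variant (log of N over document frequency), the three toy documents, and the function names are assumptions made for the example; real search engines use more refined formulae.

```python
import math
from collections import Counter


def tf_idf_vectors(documents):
    """Return one sparse TF-IDF vector (a dict) per tokenized document, plus the IDF table."""
    n_docs = len(documents)
    doc_freq = Counter()
    for words in documents:
        doc_freq.update(set(words))
    # Assumed IDF variant: log(N / df); a word found in every document gets weight 0.
    idf = {word: math.log(n_docs / df) for word, df in doc_freq.items()}
    vectors = [
        {word: count * idf[word] for word, count in Counter(words).items()}
        for words in documents
    ]
    return vectors, idf


def query_vector(keywords, idf):
    """Represent a keyword query in the same weighted space; unknown words are dropped."""
    counts = Counter(keywords)
    return {word: count * idf[word] for word, count in counts.items() if word in idf}


def cosine(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either has zero length."""
    dot = sum(weight * v.get(word, 0.0) for word, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Assumed toy corpus, already split into lowercase words.
docs = [
    "the aardvark is a mammal".split(),
    "hydrodynamics is the study of fluids in motion".split(),
    "the motion of a fluid obeys hydrodynamics".split(),
]
vectors, idf = tf_idf_vectors(docs)
query = query_vector(["hydrodynamics", "motion"], idf)
scores = [cosine(query, vec) for vec in vectors]
print(scores)
```

Here "the" occurs in every document and so carries no weight, while the first document, which mentions neither query word, scores zero.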

4. One easily observable feature of a Web page is which other pages it links to — just scan through the page and pick out the links. It is harder to find all the "backlinks" of a given page (i.e., the pages which link to a given page), but given moderate computing power and bandwidth this is doable, by first crawling the subset of the Web to be indexed and then efficiently analyzing the resulting Web graph.

To quantify the estimation of quality from link information, PageRank used a scheme which can be understood by analogy to the "random surfer model" (Richardson and Domingos, 2004). Suppose a person is plunked down at a random Web site, and starts surfing. At each page, the random surfer does one of two actions:

  1. A random link is picked to follow, with some high probability — 85 percent in the original model.
  2. Otherwise (or if the current page has no out-links) the surfer "gets bored" and jumps to a random Web page somewhere else on the Web — probability 15 percent in the original model.

Watching one random surfer doesn’t give much insight, except perhaps into odd browsing behavior. But imagine hundreds, millions, then trillions of random surfers all browsing the Web simultaneously. At any given time, each Web page will have some number of random surfers visiting it, and pages with more incoming links (especially from pages which themselves already have a lot of random surfers) will tend to get more visitors. The fraction of all random surfers found at each page is essentially its PageRank quality estimate.
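The random surfer picture can be simulated without any actual surfers by repeatedly redistributing the expected fraction of surfers on each page. What follows is a minimal sketch under stated assumptions: the four-page Web, the function name, and the fixed iteration count are illustrative, and production implementations differ in many details.

```python
def pagerank(links, follow_prob=0.85, iterations=100):
    """Estimate the long-run fraction of random surfers on each page.

    links maps every page to the list of pages it links to; every link
    target must itself appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Bored surfers (probability 1 - follow_prob) land on a uniformly random page.
        new_rank = {page: (1.0 - follow_prob) / n for page in pages}
        for page, out_links in links.items():
            if out_links:
                # Surfers following a link split evenly among the out-links.
                share = follow_prob * rank[page] / len(out_links)
                for target in out_links:
                    new_rank[target] += share
            else:
                # No out-links: these surfers also jump to a random page.
                share = follow_prob * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Assumed toy Web of four pages; "d" has no incoming links.
web = {
    "a": ["b", "c"],
    "b": ["c"],
    "c": ["a"],
    "d": ["c"],
}
ranks = pagerank(web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this toy graph, page "c" collects links from three pages and ends up with the largest share, while "d", which nobody links to, is visited only by surfers who jump there at random.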

See Richardson and Domingos (2004) for further explanation, algorithm refinements, and a contrast with Kleinberg’s HITS algorithm.

5. Bayes’ Theorem is a widely used result from probability theory, which we will rephrase with spam in mind. For a given e–mail E and word W in the e–mail:

P(E is spam | E contains W) = P(E contains W | E is spam) * P(a random e–mail is spam) / P(a random e–mail contains W)

What this says is that a particular word or phrase W (e.g., "sell", "buy", "special offer") is a better indicator of the containing e–mail being spam, if based on past observation spam tends to contain this phrase more than regular e–mail. Of course, one such phrase could be a coincidence, so a Bayesian spam filter will use a number of different indicators and classify an incoming e–mail as spam if it scores highly on enough of these indicators.
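To make the formula concrete, the sketch below estimates each probability from counts of previously labelled e–mail and then flags a message when enough of its words are strong spam indicators. It is a deliberately simplified illustration, not the combining rule of any particular filter: the tiny training sets, the 0.8 threshold, the two-indicator rule, and the function names are all assumptions.

```python
def word_spam_probability(word, spam_mails, ham_mails):
    """P(E is spam | E contains W), by Bayes' theorem on observed frequencies."""
    total = len(spam_mails) + len(ham_mails)
    p_spam = len(spam_mails) / total
    p_word_given_spam = sum(word in mail for mail in spam_mails) / len(spam_mails)
    p_word = sum(word in mail for mail in spam_mails + ham_mails) / total
    if p_word == 0:
        return p_spam  # word never seen before: fall back to the prior
    return p_word_given_spam * p_spam / p_word


def looks_like_spam(mail, spam_mails, ham_mails, word_threshold=0.8, min_indicators=2):
    """Flag a mail if at least min_indicators of its words are strong spam indicators."""
    strong = [
        word
        for word in mail
        if word_spam_probability(word, spam_mails, ham_mails) >= word_threshold
    ]
    return len(strong) >= min_indicators


# Assumed toy training data: each e-mail is reduced to the set of words it contains.
spam = [{"buy", "special", "offer"}, {"special", "offer", "now"}, {"buy", "now"}]
ham = [{"meeting", "now"}, {"lunch", "offer"}, {"project", "meeting"}]

print(word_spam_probability("offer", spam, ham))
print(looks_like_spam({"buy", "special", "offer", "today"}, spam, ham))
print(looks_like_spam({"project", "meeting", "offer"}, spam, ham))
```

Requiring several indicators rather than one reflects the point made above: a single suspicious word may be a coincidence.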

6. Economics discusses these issues under the label of "credence goods." For example, it’s hard for most patients to evaluate a surgeon directly, so professional standards, peer regulation and legal sanctions help share the evaluation burden with others. By definition, credence goods are ones where the proportion of competent raters is relatively small. As such, they are closely related to domains of expertise, where for instance the proof of Fermat’s Last Theorem could only be judged by a select group of mathematicians.

A distributed reputation system must account for vast differences in the proportion of competent raters in different domains, leaving room for those who can demonstrate expertise to join the rating group while discounting the opinions of others. This of course can be a thorny issue in areas like questions of public policy, where citizens may feel that they deserve some degree of input and oversight even to supposedly expert groups; we do not discount the difficulties involved, but defer a detailed discussion to a future venue.

7. One can consider a weighted bipartite graph, with the two node sets consisting of people and music respectively, and the weight on each link between a person and a song expressing the degree of preference. Then one can look for similar profiles of connectivity. Alternatively, one can consider the matrix of people vs music objects and look for vector similarity.
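The matrix view can be sketched in a few lines: each person is a sparse row of preference weights, and similar rows suggest similar tastes. This is only an illustrative sketch; the names, the ratings, and the choice of cosine similarity with a single nearest neighbour are assumptions rather than a description of any deployed recommender.

```python
import math

# Assumed toy data: weight on each person-song link (higher means stronger preference).
preferences = {
    "alice": {"fugue": 5, "ballad": 4},
    "bob": {"fugue": 4, "ballad": 5, "anthem": 1},
    "carol": {"anthem": 5, "polka": 4},
}


def cosine_similarity(u, v):
    """Cosine between two sparse preference rows; songs absent from a row count as 0."""
    shared = set(u) & set(v)
    dot = sum(u[song] * v[song] for song in shared)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(person, prefs):
    """Return the other person whose preference row is closest to this person's."""
    others = [p for p in prefs if p != person]
    return max(others, key=lambda p: cosine_similarity(prefs[person], prefs[p]))


def recommend(person, prefs):
    """Suggest songs the nearest neighbour likes that this person has no link to yet."""
    neighbour = most_similar(person, prefs)
    unseen = {s: w for s, w in prefs[neighbour].items() if s not in prefs[person]}
    return sorted(unseen, key=unseen.get, reverse=True)


print(most_similar("alice", preferences))
print(recommend("alice", preferences))
```

Looking for similar connectivity profiles in the bipartite graph amounts to the same computation, since each row of the matrix lists the weighted links leaving one person node.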

 

References

Clark Aldrich, 2003. Simulations and the future of learning. San Francisco, Calif.: Pfeiffer.

Altavista, at http://www.altavista.com, accessed 1 March 2004.

Amazon, at http://amazon.com, accessed 1 March 2004.

Ricardo Baeza–Yates and Berthier de Araújo Neto Ribeiro, 1999. Modern information retrieval. Reading, Mass.: Addison–Wesley.

Ben Bagdikian, 2000. The media monopoly. Sixth edition. Boston: Beacon Press.

Dean Baker, 2003. "The artistic freedom voucher: Internet age alternative to copyrights," CEPR paper, at http://www.cepr.net/publications/AFV.htm, accessed 1 March 2004.

Albert–László Barabási, 2002. Linked: The new science of networks. Cambridge, Mass.: Perseus.

G. Bianconi, P. Laureti, Y–K. Yu, and Y–C. Zhang, 2004. "Ecology of active and passive players and their impact on information selection," Physica A, volume 332 (1 February), pp. 519–532.

BizRate, at http://www.bizrate.com, accessed 1 March 2004.

Rebecca Blood, 2003. "Waging peace: Using our powers for good," Keynote at BlogTalk 1.0, 2003; at http://www.rebeccablood.net/talks/waging_peace.html, accessed on 1 March 2004.

Gary E. Bolton, Elena Katok, and Axel Ockenfels, 2003. "How effective are electronic reputation mechanisms? An experimental investigation," University of Cologne Working Paper Series in Economics, number 3 (September), at http://ideas.repec.org/p/kls/series/0003.html, accessed 2 July 2004.

Eric Bonabeau, Marco Dorigo, and Guy Theraulaz, 1999. Swarm intelligence: From natural to artificial systems. New York: Oxford University Press.

Andrea Bonaccorsi and Cristina Rossi, 2004. "Altruistic individuals, selfish firms? The structure of motivation in Open Source software," First Monday, volume 9, number 1 (January), at http://firstmonday.org/issues/issue9_1/bonaccorsi/, accessed 2 July 2004.

Shayne Bowman and Chris Willis, 2003. "We media: How audiences are shaping the future of news and information," at http://www.hypergene.net/wemedia/, accessed 1 March 2004.

David Brin, 1999. The transparent society: Will technology force us to choose between privacy and freedom? Reading, Mass.: Perseus.

Sergey Brin and Lawrence Page, 1998. "The anatomy of a large–scale hypertextual Web search engine," at http://www-db.stanford.edu/~backrub/google.html, accessed 2 July 2004.

David Brooks, 2002. "Superiority complex," Atlantic Monthly volume 290, number 4 (November), pp. 32–33, and at http://www.theatlantic.com/issues/2002/11/brooks.htm, accessed 2 July 2004.

Tara Calishain and Rael Dornfest, 2003. Google hacks. Sebastopol, Calif.: O’Reilly.

Edward Castronova, 2002. "On virtual economies," CESifo (Center for Economic Studies at the University of Munich and the Ifo Institute for Economic Research) Working Paper, number 752, at http://papers.ssrn.com/sol3/papers.cfm?abstract_id=338500, accessed 2 July 2004.

Edward Chancellor, 2000. Devil take the hindmost: A history of financial speculation. New York: Plume.

Creative Commons, at http://www.creativecommons.org, accessed 1 March 2004.

Mihaly Csikszentmihalyi, 1991. Flow: The psychology of optimal experience. New York: HarperPerennial.

Hank Davis and S Lyndsay McLeod, 2003. "Why humans value sensational news: An evolutionary perspective," Evolution and Human Behavior, volume 24, number 3, pp. 208–216. http://dx.doi.org/10.1016/S1090-5138(03)00012-6

Richard Dawkins, 1976. The selfish gene. Oxford: Oxford University Press.

Chrysanthos Dellarocas and Paul Resnick, 2003. "Online reputation mechanisms: A roadmap for future research," Report from the MIT/NSF Interdisciplinary Symposium on Reputation Mechanisms, 26–27 April 2003; at http://ccs.mit.edu/dell/symposium.html, accessed on 1 March 2004.

Emile Durkheim, 1998. De la division de travail social. Fifth edition. Paris: Presses universitaires de France.

eBay, at http://ebay.com, accessed 1 March 2004.

ePinions, at http://www.epinions.com, accessed 1 March 2004.

EU, 2004. "Commission concludes on Microsoft investigation, imposes conduct remedies and a fine," EU press release IP/04/382, Brussels (24 March), at http://europa.eu.int/, accessed on 24 March 2004.

Gerhard Fischer, 2002. "Beyond ‘couch potatoes’: From consumers to designers and active contributors," First Monday, volume 7, number 12 (December), at http://www.firstmonday.org/issues/issue7_12/fischer/, accessed 23 June 2004.

John Kenneth Galbraith, 1994. A short history of financial euphoria. London: Penguin.

Susan Gerhart, 2004. "Do Web search engines suppress controversy?" First Monday, volume 9, number 1 (January), at http://www.firstmonday.org/issues/issue9_1/gerhart/, accessed 23 June 2004.

Malcolm Gladwell, 2002. The tipping point: How little things can make a big difference. Boston: Back Bay Books.

Google, at http://google.com, accessed 1 March 2004.

Paul Graham, 2003. "Better Bayesian filtering," at http://www.paulgraham.com/better.html, accessed 1 March 2004.

John Hagel and Marc Singer, 1999. Net worth: Shaping markets when customers make the rules. Boston: Harvard Business School Press.

Thomas Homer–Dixon, 2002. The ingenuity gap: Facing the economic, environmental, and other challenges of an increasingly complex and unpredictable world. New York: Vintage.

IMDB, at http://imdb.com, accessed 1 March 2004.

Robert Kaye, 2004. "Next–generation file sharing with social networks," at http://www.openp2p.com/pub/a/p2p/2004/03/05/file_share.html, accessed 1 May 2004.

Cliff Lampe and Paul Resnick, forthcoming. "Slash(dot) and burn: Distributed moderation in a large online conversation space," In: Proceedings of ACM Computer Human Interaction Conference; at http://www.si.umich.edu/~presnick/papers/chi04/LampeResnick.pdf, accessed 25 June 2004.

Steve Lawrence, 2001. "Online or invisible?" Nature, volume 411, number 6837 (31 May), p. 521, and at http://www.neci.nec.com/~lawrence/papers/online-nature01/, accessed 2 July 2004.

Lawrence Lessig, 2001. The future of ideas: The fate of the commons in a connected world. New York: Random House.

Michael Lewis, 2003. Moneyball: The art of winning an unfair game. New York: Norton.

Michael Lewis, 2002. Next: The future just happened. New York: Norton.

Peter Lyman and Hal R. Varian, 2003. "How much information?" at http://www.sims.berkeley.edu/how-much-info-2003, accessed 1 March 2004.

Gregory MacLeod, 1997. From Mondragon to America: Experiments in community economic development. Sydney, Nova Scotia, Canada: University College of Cape Breton Press.

Sergei Maslov and Yi–Cheng Zhang, 2001. "Extracting hidden information from knowledge networks," Physical Review Letters, volume 87, number 24 (10 December), article number 248701.

Abraham H. Maslow, 1998. Toward a psychology of being. Third edition. New York: Wiley.

Paolo Massa and Bobby Bhattacharjee, 2004. "Using trust in recommender systems: An experimental analysis," iTrust2004 International Conference, at http://moloko.itc.it/paoloblog/papers/itrust2004/trust2004.html, accessed 2 July 2004.

Hassan Masum, 2002. "TOOL: The open opinion layer," First Monday, volume 7, number 7 (July), at http://www.firstmonday.org/issues/issue7_7/masum/, accessed 23 June 2004.

Hassan Masum, Steffen Christensen, and Franz Oppacher, 2003. "The Turing ratio: A framework for open–ended task metrics," Journal of Evolution and Technology, volume 13, number 2 (October), at http://www.jetpress.org/volume13/TuringRatio.pdf, accessed 2 July 2004.

Graham Meikle, 2002. Future active: Media activism and the Internet. New York: Routledge.

Annalee Newitz, 2003. "Defenses lacking at social network sites," at http://www.securityfocus.com/news/7739, accessed 1 March 2004.

Mark E.J. Newman, 2003. "The structure and function of complex networks," SIAM Review, volume 45, number 2, pp. 167–256. http://dx.doi.org/10.1137/S003614450342480

Donald A. Norman, 1994. Things that make us smart: Defending human attributes in the age of the machine. Reading, Mass.: Addison–Wesley.

Open Directory Project, at http://dmoz.org, accessed 1 March 2004.

Han Woo Park, 2002. "Examining the determinants of who is hyperlinked to whom: A survey of Webmasters in Korea," First Monday, volume 7, number 11 (November), at http://www.firstmonday.org/issues/issue7_11/park/, accessed 23 June 2004.

Frank Partnoy, 2004. Infectious greed: How deceit and risk corrupted the financial markets. London: Profile.

Saverio Perugini, Marcos A Goncalves, and Edward A Fox, 2003. "A connection–centric survey of recommender systems research," at http://arxiv.org/abs/cs.IR/0205059, accessed 2 July 2004.

Neil Postman, 1986. Amusing ourselves to death: Public discourse in the age of show business. New York: Penguin.

William Poundstone, 2004. How would you move Mount Fuji? Microsoft’s cult of the puzzle: How the world’s smartest company selects the most creative thinkers. Boston: Little, Brown.

Robert D. Putnam (editor), 2002. Democracies in flux: The evolution of social capital in contemporary society. New York: Oxford University Press.

Howard Rheingold, 2003. Smart mobs: The next social revolution. New York: Basic Books; book discussion site at http://www.smartmobs.com, accessed 1 March 2004.

Howard Rheingold, 2000. The virtual community. Revised edition. Cambridge, Mass.: MIT Press.

Matthew Richardson and Pedro Domingos, 2004. "Combining link and content information in Web search," In: Mark Levene and Alexandra Poulovassilis (editors). Web dynamics: Adapting to change in content, size, topology and use. Berlin: Springer.

John G. Riley, 2001. "Silver signals: Twenty–five years of screening and signaling," Journal of Economic Literature, volume 39, issue 2, pp. 432–478. http://dx.doi.org/10.1257/jel.39.2.432

Markus Rittenbruch, Tim Mansfield, and Luke Cole, 2003. "Making sense of ‘syndicated collaboration’," at http://www.dstc.edu.au/Research/Projects/Infoeco/publications/group03.pdf, accessed on 1 March 2004.

John Rodzvilla (editor), 2002. We’ve got blog: How Weblogs are changing our culture. Cambridge, Mass.: Perseus.

Morris Rosenthal, 2004. "Surfing the Amazon on a log — What Amazon sales ranks mean," at http://www.fonerbooks.com/surfing.htm, accessed 1 March 2004.

Wade Roush, 2004. "Search beyond Google," Technology Review, volume 107, number 2 (March), pp. 34–45.

Ben Sawyer, 2002. "Serious games: Improving public policy through game-based learning and simulation," at http://www.seriousgames.org/, accessed on 1 March 2004.

David Schenk, 1998. Data smog: Surviving the information glut. New York: Harper.

Tibor Scitovsky, 1997. The joyless economy. Oxford: Oxford University Press.

Helen Shaw, 2003. "The Age of McMedia: The challenge to information and democracy," Weatherhead Center for International Affairs at Harvard, Fellows Paper, at http://www.wcfia.harvard.edu/fellows/papers02-03/shaw.pdf, accessed 1 March 2004.

M.V. Simkin and V.P. Roychowdhury, 2003. "Read before you cite!" Complex Systems, volume 14, number 3, pp. 269–274.

Herbert Simon, 1996. The sciences of the artificial. Third edition. Cambridge, Mass.: MIT Press.

Slashdot, at http://slashdot.org, accessed 1 March 2004.

Richard M. Stallman, 2002. Free software, free society: Selected essays of Richard M. Stallman. Boston: Free Software Foundation.

Marc Stiegler, 1999. Earthweb. New York: Baen Books.

Steven Strogatz, 2003. Sync: The emerging science of spontaneous order. New York: Hyperion.

Cass Sunstein, 2002. Republic.com. Princeton, N.J.: Princeton University Press.

Loren Terveen and Will Hill, 2001. "Beyond recommender systems: Helping people help each other," In: Jack Carroll (editor). Human–computer interaction in the new millennium. Boston: ACM Press.

TopCoder, at http://www.topcoder.com, accessed 1 March 2004.

John Walker, 2003. "The digital imprimatur," at http://www.fourmilab.ch/documents/digital-imprimatur/, accessed 1 March 2004.

Duncan J. Watts, 2004. Six degrees: The science of a connected age. New York: Norton.

Wikipedia, at http://wikipedia.org, accessed 1 March 2004.

WikiApproval, 2004. "Online discussion of a Wikipedia approval mechanism," at http://en.wikipedia.org/wiki/Wikipedia_approval_mechanism, accessed 1 March 2004.

Gary Wolf, 2003. "The great library of Amazonia," Wired, volume 11, number 12 (December), at http://www.wired.com/wired/archive/11.12/amazon.html, accessed 1 March 2004.

Yi–Cheng Zhang, forthcoming. "Supply–demand law under imperfect information."

Yi–Cheng Zhang, 2001. "Happier world with more information," Physica A, (Proceedings of the NATO ARW on Application of Physics in Economic Modelling, 8–10 February 2001), at http://arxiv.org/pdf/cond-mat/0105186, accessed 2 July 2004.

George K. Zipf, 1949. Human behavior and the principle of least–effort. Cambridge, Mass.: Addison–Wesley.


Editorial history

Paper received 12 May 2004; accepted 15 June 2004.


Copyright ©2004, First Monday

Copyright ©2004, Hassan Masum and Yi–Cheng Zhang

Manifesto for the Reputation Society by Hassan Masum and Yi–Cheng Zhang
First Monday, Volume 9, Number 7 - 5 July 2004
http://firstmonday.org/ojs/index.php/fm/article/view/1158/1078




