Population automation: An interview with Wikipedia bot pioneer Ram-Man
First Monday

by Randall M. Livingstone



Abstract
Software robots (“bots”) play a major role across the Internet today, including on Wikipedia, the world’s largest online encyclopedia. Bots complete over 20 percent of all edits to the project, yet their work often goes unnoticed by other users. Their initial integration into Wikipedia was not uncontested and highlighted the opposing philosophies of “inclusionists” and “deletionists” who influenced the early years of the project. This paper presents an in-depth interview with Wikipedia user Ram-Man, an early bot operator on the site and creator of the rambot, the first mass-editing bot. Topics discussed include the social and technical climate of early Wikipedia, the creation of bot policies and bureaucracy, and the legacy of the rambot and Ram-Man’s work.

 


 

Originally launched in 2001 as an experiment to save a failing expert-written online encyclopedia project, today Wikipedia is the seventh most visited Web site on the Internet globally (Alexa, 2016), consisting of over five million articles on the English-language version alone (Wikipedia, 2016b). Some of the earliest articles on the site were created about the cities, towns, and counties in the United States. A quick experiment reveals an interesting commonality between these articles: go to any town or city’s page, view the history of the article (as all past data on a wiki page is publicly available), and find the earliest revisions to that page. Unless you are viewing a major metropolis like New York City or Los Angeles, you’ll find two accounts responsible for the earliest work on that page: Ram-Man and the rambot [1].

Bots (short for “software robots”) were used to add articles and content to Wikipedia as early as October 2001, when entries from Easton’s Bible Dictionary were imported into the encyclopedia by a script (Wikipedia, 2016a), but it was User Ram-Man’s automated creation, the rambot, that brought bots into the consciousness of the Wikipedia community in 2002. The legacy of Ram-Man and the rambot has been lasting, as today bots are found on nearly every language version of Wikipedia and contribute over 20 percent of the total edits to the project (Livingstone, 2014; Niederer and van Dijck, 2010; Zachte, 2015). These bots made Internet headlines in 2014, with Swedish programmer Sverker Johansson’s Lsjbot responsible for creating over 2.7 million new article stubs, raising the Swedish version of the project to the second largest after English (Jervell, 2014; Wikimedia, 2016). After briefly offering some necessary context, this paper presents a transcript (edited for readability) of an extensive 2012 interview the author conducted with Ram-Man to discuss his work and influence on Wikipedia and the early days of the project.

Bots have been a reality in the world of computer science since the early 1960s, but the concept actually stretches much farther back. The first bots, developed at the Massachusetts Institute of Technology, were called “daemons,” not in reference to evil spirits, but rather to Socrates’ conceptualization of a non-human, intelligent, and autonomous companion (Leonard, 1997). Indeed, those early bots in the days of mainframe computing were significant helpers for computer scientists, backing up files and performing tedious and time-consuming housekeeping tasks. The first functional bot is attributed to Fernando Corbato at MIT, whose program defined some of the major characteristics of a bot’s nature: bots are processes that run in the background, are normally invisible, react to their environment, and most importantly, run autonomously. Leonard (1997) claims autonomy is the “crucial variable”: “[A bot] is as different from a typical software program (a word processor, say) as a clock is from a hammer. Put the hammer down, and it is useless, a dead object. The clock never stops, as long as it has power.” [2]

Two of the most important programs in the history of bots are ELIZA and Julia, each significant for the development of autonomous software agents (Christian, 2011; Leonard, 1997). ELIZA, named after a character in George Bernard Shaw’s Pygmalion, was developed at MIT by Joseph Weizenbaum in 1966. Programmed to mimic a Rogerian therapist, ELIZA was the first software program to successfully impersonate a human being. A user would interact with ELIZA through a text interface, typing in statements that the program would then manipulate into psychoanalytic questions (for example, upon entering “I am afraid of dogs,” ELIZA would respond, “Why are you afraid of dogs?”). Weizenbaum (1976) reported that graduate student researchers would engage with ELIZA for hours and develop a sense of intimacy with the program. And though ELIZA’s responses were merely algorithmic, she gave an impression of intelligence that in many ways spurred the field of artificial intelligence (Schanze, 2010).

ELIZA was a chatterbot, a term coined nearly 30 years later by Michael Mauldin (1994), a graduate student at Carnegie Mellon University and programmer of Julia. Whereas ELIZA was a local chatterbot, run from a lab, Julia was run on early networked platforms like MUDs (Multi-User Dungeons), where she interacted with anyone in the online community. Julia’s conversation skills were quite advanced for a computer program; she could use humor, remember information from earlier in a conversation, delay her responses to mimic thinking, and purposefully send conversations on tangents. Even more importantly, though, Julia could answer questions about the online community that would help other users. Leonard (1997) claims, “Julia represents a giant step forward for botkind. ... Julia, as a useful servant, represents in embryological form the intelligent agents waiting to be born. And Julia, as an interface, signifies the importance of the anthropomorphic approach.” [3] As ELIZA and Julia demonstrate, there is a strong tradition of helper programs that vastly predates Wikipedia.

Late in 2002, Ram-Man began manually creating Wikipedia articles for the over 3,000 geographical counties in the United States, but he decided to use his programming skills when he moved to the city and town level. Over the course of a week in October, the rambot created over 30,000 new articles on the English Wikipedia, each including consistently formatted information on location, geography, and population demographics pulled from the 2000 U.S. Census and CIA World Factbook Web sites. At the time, the encyclopedia had approximately 50,000 articles, so the rambot’s work expanded the project by over 60 percent, flooding Recent Changes and contributor watchlists (Figure 1) [4]. And although Ram-Man’s work seized upon many of Wikipedia’s early principles — “Be bold,” “Ignore all rules,” “Assume good faith” — it was met with mixed reactions on Talk pages across the site. Users Tarquin and Juuitchan wrote respectively:

Hundreds of core topics are still uncovered or amateurishly-written, and here we have a page for every one-horse town across the U.S. It won’t project a terribly good image of Wikipedia; that concerns me.

And while you’re at it, why limit it to the USA? Why not do England, Canada, Australia ... why limit it to English-speaking countries? Why not do the whole world?? Clearly there is something absurd about this! (Wikipedia, 2016e)

Defenders of the rambot saw these additions to the encyclopedia as a positive step:

Just linked Bitterroot to Missoula, Montana, then added to the Missoula article the fact that it is the only place that bitterroot (the state flower) grows. Took about as long as it would take to repeat an oft-made complaint against the rambot, and much more interesting, fun, encyclopedic, and productive. These articles are the foundation for the encyclopedia of the future. Use them. Improve them. (Ortolan88)

These arguments encapsulate the ideological stances of two emerging groups on the site: inclusionists, who felt the project should take advantage of its openness and include a broad range of content, and deletionists, who held a more conservative, traditional vision for the encyclopedia.
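
What made the rambot’s output so uniform was straightforward template-filling: each stub was the same boilerplate prose with place-specific fields substituted in from tabular census data. The rambot itself was a Java program, but purely as an illustration — the field names, template wording, and input file below are hypothetical rather than the rambot’s actual format — a Python sketch of the idea might look like this:

    import csv

    # Hypothetical template; the real rambot articles covered location,
    # geography, and demographics in several sentences.
    STUB_TEMPLATE = (
        "'''{name}, {state}''' is a {kind} located in {county} County, {state}, "
        "United States. As of the 2000 census, the {kind} had a total population "
        "of {population:,}."
    )

    def build_stub(row):
        # Render one row of place data into consistently formatted wiki markup.
        return STUB_TEMPLATE.format(
            name=row["name"],
            state=row["state"],
            kind=row["kind"],          # e.g., "city", "town", or "village"
            county=row["county"],
            population=int(row["population"]),
        )

    # "places_2000.csv" is a stand-in for the census extracts the rambot used.
    with open("places_2000.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            print(build_stub(row))     # a real bot would save each stub as a new page

A real bot would submit each rendered stub to the wiki as a new article and, as Ram-Man describes below, re-edit all of them whenever the template needed correcting.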

 

Figure 1: The rambot’s effect on the growth of Wikipedia’s article population. Source: User:HenkvD, Wikimedia Commons.

From a technical perspective, critics of the rambot were worried that its speed and consistency could degrade the performance and response time of MediaWiki, which at the time still ran from a single server. At a larger level, the rambot’s automated actions raised anxieties about bots on the site running amok without the operator’s awareness. Debates went back and forth regarding the implications of bot work on a theoretically “user-generated” site. Ram-Man himself felt many of the arguments against bots were spurred by an “irrational fear” or technophobia among the community, concluding:

The big issue is that people are biased against bots. No one has complained one bit about the county articles but I hear a lot of complaint about the bot added cities. I bet no one even knew that the Alabama and Alaska entries were entered by hand! The articles are almost equivalent, but people don’t like one because a bot did it. (Wikipedia, 2016e)

Ultimately, the need for a policy around bots became apparent even to Ram-Man, who drafted many of the original bot policies that remain in effect today on the English Wikipedia. Bot policy there requires that automated edits be made from an account separate from the operator’s personal account (generally with “bot” in the name) and requires a bureaucrat-granted bot flag that both signifies the bot’s legitimacy and suppresses its edits from appearing on the Recent Changes page [5]. Bot operators must clearly define the tasks their bots will tackle, prove the proper functionality of their bots during a trial period, and remain open and available for communication from other contributors (Wikipedia, 2016d). Early bot policy served to address the concerns of the greater Wikipedia community around automated editing, and since then it has solidified into rules and guidelines largely respected by the bot community.
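
At a technical level, the bot flag is simply membership in a dedicated “bot” user group, which the MediaWiki software and outside tools can query. As a minimal sketch — this is not part of Wikipedia’s own tooling; the function name is made up, and the snippet assumes the third-party requests library — the following Python code asks the public MediaWiki API whether an account carries the flag:

    import requests

    API_URL = "https://en.wikipedia.org/w/api.php"

    def has_bot_flag(username):
        # Ask the MediaWiki API which user groups the account belongs to.
        params = {
            "action": "query",
            "list": "users",
            "ususers": username,
            "usprop": "groups",
            "format": "json",
        }
        response = requests.get(API_URL, params=params, timeout=10)
        response.raise_for_status()
        users = response.json()["query"]["users"]
        # Flagged bots are members of the "bot" user group.
        return bool(users) and "bot" in users[0].get("groups", [])

    print(has_bot_flag("ClueBot NG"))  # a long-running flagged bot on the English Wikipedia

Edits made by flagged accounts can be hidden from Recent Changes by default, which, as Ram-Man explains below, was the original motivation for creating the flag.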

In 2006, the Bot Approvals Group (BAG) was formed on the English Wikipedia to review Bot Requests for Approval (BRFAs) (Figure 2). Consisting of experienced bot operators, the group would both review the technical soundness of a bot request and determine whether there was community consensus for the task. By 2007, the BAG was facing accusations of being a technical cabal on the site, making decisions on bots without fully gauging community consensus and adding unnecessary bureaucracy and process to the site (Wikipedia, 2016e). The BAG maintained that the BRFA process is always open to the broader community, but few outside contributors regularly participated in it. Opposition largely subsided, and the BAG continued its work. Today, the group consists of only five active members and a number of semi-active or inactive members (Wikipedia, 2016c).

 

Figure 2: Unofficial Wikipedia logos for bots and automated tools (left) and the Bot Approvals Group (right). Each demonstrates the mechanical metaphor that is often applied to bot work on the site. Source: Wikimedia Commons.

 

The English Wikipedia’s BRFA process is the most formalized bot approval process on any version of Wikipedia, although a similar group has recently been formed on the Portuguese Wikipedia (Livingstone, 2014). Most language versions of the project, however, utilize a broader community review with less formalized processes [6]. Others often have an even simpler process, where a bureaucrat will grant the bot flag directly after a successful trial, barring any vocal opposition from the community.

Some scholarly work over the past decade has interrogated the influence of bots on Wikipedia maintenance, culture, and community, echoing many of the main themes from Ram-Man’s own experiences. Bots have developed important roles in task routing (Cosley, et al., 2007) and combatting vandalism (Geiger and Ribes, 2010; Kittur, et al., 2007), allowing human contributors to utilize their time on the site more efficiently while also “fundamentally changing Wikipedia’s culture” (Halfaker and Riedl, 2012). Geiger and Ribes (2010) suggest that “technological tools like bots and assisted editing programs are significant social actors in Wikipedia, making possible a form of distributed cognition regarding epistemological standards.” [7] Müller-Birn, et al. (2013) argue that Wikipedia now operates with a system of “algorithmic governance” because of the interdependence between human users, bots, and the technical infrastructure of the project, though the system is not without conflict regarding the purview of the technological actors. And Geiger’s (2011) case study of HagermanBot, a controversial bot used to sign unsigned user comments, demonstrates that bots are “subject to social and political pressures,” with the author suggesting that “we must be careful to not fall into familiar narratives of technological determinism when asking who — or what — actually controls Wikipedia” [8].

The most active accounts (by number of edits) on Wikipedia today are those of bots. Sixty-four different bots have made over one million edits to the project, while six have made over 10 million (Zachte, 2015). Most of these high-edit bots work on multiple language versions of Wikipedia. The rambot, their forebear in many ways, made only 131,000 edits to the English Wikipedia, though, and as Ram-Man recounts, faced a number of challenges from its earliest days:

Question: You wrote and operated what seemed to be the first bot that made significant contributions to content on Wikipedia (WP) (in terms of number of edits). There seemed to be a lot of debate whether your bot (and bots in general) should be allowed to run on WP. Can you describe what you feel were the reasons for that conflict?

Ram-Man: This question will require a detailed answer ... I was the first “major” bot for sure. There were other bots before mine, but they had a relatively insignificant influence as far as controversy goes. I was unemployed at the time and thought Wikipedia the greatest cause for freedom of information ever (I still hold that view to a large extent). I had a lot of free time — I am also a programmer — so I decided to do as much as I could do in the time I had.

So in the early spirit of Wikipedia (“Be Bold!” and “There are no rules”), I just went ahead and started adding. I started by adding all the U.S. counties/parishes (the ones that were missing anyway). This was only something like 3,000 articles, so it was not too major. As an aside, the articles were generated by “bot”, but I actually added all of those articles by hand.

Q: Did you pull the info from the U.S. census?

RM: Most of it, but there were other data sources. There may still be a page that lists them. I just used a database to link the various data sources. This would include things like geographical coordinates and such. The process of adding the cities was more involved. I started using a bot and this created a handful of problems:

  1. It affected performance.
  2. The percentage of articles got skewed. This directly affected “Random Page.”
  3. It cluttered up the “Recent Changes.”
  4. It generated lots of discussion on the merits of the articles themselves and the various camps all came out to play (Deletionists, Inclusionists, etc.).
  5. People started to be concerned about how a bot could be used to “mess up” the project.

Q: Right. Those camps were entrenched from the early days of the project?

RM: They were entrenched to some extent. Part of the joy of working back then was the relative lack of bureaucracy, so it was more open dialogue. Anyway, those five items were all separate issues that were all handled differently. I’d suggest we take them one by one.

I don’t know if you know the user “Maverick” (I think that was the spelling.) He was a VERY active administrator and one of my chief supporters [9]. At the time he had a lot of clout. Technically this was not supposed to happen, but you can’t prevent hierarchies of power and influence. He defended my articles from the almost constant barrage from people who wanted them deleted. There were discussions and discussions on the topic. Eventually we just referred people to previous discussions just so we could get on with our lives, but even to this day articles are put up for deletion.

In the end, the inclusionist camp won out because ultimately when the question was put to the software developer, the determination was made that disk space was not an issue. Strange to think of that being a concern, but back then it was certainly raised, though I don’t think it was ever seriously considered. Text does not take up that much space. You could go quite in depth researching the deletion debates, but in the end not a single article was deleted (that I know about anyway).

Q: Right. None of yours, you mean?

RM: Yes, mine were all legally recognized entities, so that was sufficient for notability. So many of the early policies (like notability) were strongly influenced by those articles. “Ignore all rules” was pretty much what saved me, since I frequently pushed the limits.

Q: So the articles you added didn’t affect performance?

RM: Oh, it did affect performance. No one noticed when I added the counties, but when I started adding cities as fast as I could (sequentially), it caused problems. Mind you, it wasn’t just adding either. Every time I made a mistake, I had to edit each article, some 30,000+ edits. This happened numerous times. Basically in discussions with developers, we decided to limit my editing rate to about once per second. You’ll probably see reference to that number in bot policy (maybe it is still there).

Q: What sort of mistake?

RM: Ok, mistakes ... well, sometimes the names of places in the census do not match the names used by, say, the postal service or in common usage. Other problems include places where the county and city are the same, others where they are separate but have the same name. Lots of things I can’t remember. Not everything was like that, of course. I made many modifications just to add new things. If you look at the early history of Autaugaville, Alabama, you can get an idea of the kinds of things done (that’s the first city in the list in alphabetical order, so it had EVERY change).

So after slowing down my edits, that controversy largely went away, although people still made the accusation so I had to deal with that. Oh, one other interesting anecdote: my bot edits exposed a bug in the article counter at one point. I don’t know if it was the way I was accessing them instead of using a browser, but every edit the bot made was counted as a new article. It was a bug in the software, but it was an example where using the bot actually exposed a flaw in the underlying software. There were a LOT of article count watchers back then, so it was annoying. Those charts have all been adjusted for the error and it has largely been forgotten. Until it was fixed, it distorted Wikipedia’s statistics, which affected its private and public perception. The only thing left to say about the topic is that it was one of the contributing reasons why bot policy was made. We didn’t want bots to affect normal human editing in any way.

So #2 “random article” was a problem that sort of resolved itself. It annoyed a lot of people, and people wanted to delete the “rambot articles” simply because they used “random page” as a search tool to work on a random article and they didn’t want my articles. But the decision makers were not going to delete articles because it messed up a software feature. In the end, the massive growth of Wikipedia solved the issue by lowering the percentage of articles that I had made.

Q: Who were the “decision makers” at the time? Admins like Maverick?

RM: Well, there were discussions on the deletion request pages, but also on the various Wikiproject pages. Wikiproject Cities was the biggest one. The talk page was where most of the arguments took place. Also, Wikiproject Counties. It looks like those pages may have been deleted. If so, then a lot of the history may be lost.

The “Recent Changes” was one of the most major issues. There were so many of my edits that it prevented adequate vandalism detection. At one point I may have restricted edits to once per minute for that reason. It was somewhat annoying, because my bot runs would take one to three days to complete sometimes. So the developers created the bot flag to get around this problem. The bot flag was created for that one purpose, to prevent my edits from affecting recent changes. At the time, there was not even a configuration option. The developers manually assigned everything and my edits were totally hidden. For a while I operated totally out of the spotlight, so controversy declined a bit. But eventually features caught up to where they are now. This was one of the other areas that led to bot policy. We needed a practical way to manage the bot flag. This was probably the most visible problem, but it was also easily fixed.

Q: This was still way before BAG, correct?

RM: Correct. We had bot policy well before BAG. You know that I authored the bot policy, yes?

Q: The original version? My understanding is it has morphed a bit over the years?

RM: To an extent, but looking at the page, not much has changed in some ways. The “Bot usage” section looks the same: #1) Bots edit far faster, #2) Lower level of scrutiny (because of the bot flag mentioned above hiding edits), #3) May cause disruption. While the wording may have been adjusted, a lot of the core policy remains the same.

So back to the list ... #5, fear of causing problems. This was a somewhat legit concern. A bot running amok could cause a lot of damage. And this is the third reason that bot policy was formed. Take a look at the original version [which includes the language] “useful”, “harmless”, and “not a server hog”. Bots have indeed run amok, sometimes intentionally, sometimes not so. As mentioned, my bot ran for hours or days at a time, often when I was asleep. There were mistakes that were not caught quickly. It was not intentional, and I fixed all the problems myself, but other bots had these problems too. BAG members have frequently blocked bots or users acting as bots. And there were some users who had to be blocked because they used bots maliciously (at least semi-maliciously). As you are quite aware, there are various ideological camps, and sometimes a bot was used as a big stick to emphasize a point. A dumb idea for sure, but it still happened.

Speaking of blocking, we determined early on that it was best to block first, ask questions later, if an unauthorized bot was detected. Part of the bot policy (and BAG) was to allow this behavior without controversy. Oh! Speaking of “malicious bots” ... spell checking bots.

Q: Yes ... A lot of people don’t like them.

RM: They are a very popular concept, but we made sure very early that these bots would be disallowed. They absolutely require a human element. We certainly approved semi-automated spell-check bots, because they sped things up.

Q: What was the environment and atmosphere like for bots and bot operators during the early years of Wikipedia?

RM: When I started, I argued vigorously that I should be allowed to do whatever I wanted. A bot and a human (at least in my case) were not very different. There was in my opinion prejudice against bots in the beginning.

Q: Would you call it technophobia?

RM: I think so. A lot of what I believe was irrational fear. I don’t like a lot of rules. You might not believe that because I founded bot policy. What I wanted was to be left alone so I could get work done. The way to do this was to write the rules myself. I had a vested interest in making it relatively easy. But once the policy was codified, people slowly stopped complaining. After all, there was consensus.

Q: Do you feel the fears were primarily from non-coders, non-developers?

RM: Yes, I would say that. Running a bot is a big responsibility because you can really muck things up.

Q: And what do you think the WP community at large — those not in the bot community — think about bots today?

RM: I don’t think (normal) people even think about bots anymore. There are so many bots running all manner of vital maintenance tasks. It looks like bot policy has not really changed all that much since I was involved.

Q: Do you feel editors should know more about bots? Or even non-editing users (i.e., readers) of Wikipedia? Should they have some form of literacy about bots and what bot operators do?

RM: No, editors do not need to care about bots. Only those looking for vandalism. What is a bot?

In the Wikimedia software, there are tasks that do all sorts of things, like count pages, count orphan pages, etc. If these things are not in the software, an external bot could do them. It is just a small step to editing pages. The main difference is where it runs and who runs it. (I’m not sure this is the best example for what I believe).

Q: Do you agree with the definition on the bot policy page: “Bots (short for ‘robots’) are generally programs or scripts that make automated edits without the necessity of human decision-making”? Did you write that?

RM: That’s pretty much in essence what has been there from the “beginning”. I don’t know if I wrote it or not, but I could have. It’s not entirely accurate anyway. Some bots make no decisions.

Q: How would you modify it?

RM: It’s fine, which is why I never changed it. But it does illustrate my point: If you know it is inaccurate, then you are informed enough that it does not matter. The average user does not care about the distinction.

Q: Were you involved in the discussions that ultimately created the Bot Approvals Group? The discussions about the group being a cabal? And what are your general feelings about the value of the group and the BRFA (Bot Request for Approval) process?

RM: I was just looking at an old version of the Bot Approvals Group talk page. The important quote I made:

“I’ve been neglecting approvals. I should probably be removed as a BAG member because I’ve been busy doing other things, some of which is defending BAG instead of getting useful work done. Ninety percent of the approvals process is open to everyone. Everyone. About 10 percent of the job is determining consensus and having the technical and policy understanding to recognize if a bot is a bad idea even if it has approval otherwise. People seem to be ignoring the fact that no one is helping with the most important aspect and instead complaining about the 10 percent that is working out pretty well. It’s the 90 percent that is broken: we don’t have enough people commenting so BAG members get bogged down. BAG members are really good at the 10 percent part: technically knowledgeable and really good at determining consensus. As far as I know, we’ve had no complaints about what we’ve actually done. If everyone, including myself, would just shutup and help, there wouldn’t be a problem. Anyone can join BAG if they prove themselves competent with the other 90 percent of the work. Why they won’t do that is beyond me.”

This roughly summarized my view that BAG was/is NOT a cabal. Unlike many other Wikipedia policies that generated hot debate, the management of bots was a largely ignored and thankless job.

Q: What do you think provoked criticism of BAG? (You had talked about how the rambot was a major reason for the establishment of bot policy.)

RM: [directing me to the Bot Approvals Group talk page] You’ll see various areas of the early controversy regarding whether or not BAG should exist: “Eventually a more formal process had to be created to grant the bot flag to a growing number of bots”. In other words, we had to have a minimal amount of process anyway simply to implement the technical feature.

Q: Do you feel the users who challenged this did it on principle? Paranoia about cabals on WP in general?

RM: Oh yes, they certainly thought it was a “cabal” and heavy handed, but I honestly believe it was misunderstanding of how BAG worked. Or perhaps people just had to push their agenda, but I’m not sure I’d go that far. I think my rebuttal to the question in that section adequately describes 90 percent of my feelings in defense of BAG’s existence.

Q: What have your experiences and interactions with the Bot Approvals Group (BAG) been like?

RM: Primarily frustration at the lack of participation. It was a lot of work approving bots. A lot of people got tired of it. There was not a lot of community interaction or interest. The people involved just wanted everything to work correctly without problems. Within the group itself, it was pretty civil. Most bots were approved with very little controversy. It was only a small fraction with any sort of problem. People tend to remember the problem areas I suppose, but in the end I’d say it worked just fine.

There was one thing I do remember that pertains to this discussion ... a lot of people wanted to be BAG members. And almost always we said, “If you want to join, you have to put in the work”. We didn’t want people to just walk in for a bot or two and then disappear. They had to work on the 90 percent that didn’t involve being a BAG member if they wanted the extra 10 percent responsibility. And there were some cases where a bot idea would be something like “automatic spell checking bot” and non-BAG members would say “What a great idea!” Obviously BAG members had to shoot that down as a matter of policy, but it didn’t help perception.

Q: Right. Did these people then get frustrated and drop their request to join?

RM: Yes, they would drop the request (let it fade away anyway).

Q: Any other comments on BAG or BRFA?

RM: I spent a lot of time defending BAG, as one of its founders, so all of that is in the [talk page] history for all to remember if they want to find it. I must say though, that eventually I tired of BAG, like a lot of other people. It was a lot of backend, maintenance, bureaucracy, without any real work improving articles. I think that is one of the reasons I stopped (that and being busy in real life). And one last thought about BAG ... I believe that Wikipedia would have been fine without it, but problems would have been detected slower. In the end, BAG was about volunteers wanting to make the project run smoother, but it was never vital.

Q: Does/did Rambot run on the Wikimedia Foundation’s Toolserver? [10]

RM: Rambot was a Java program that ran on my PC. I probably ran it on both Linux and Windows, in case that matters. I’m not sure, but I don’t think the Toolserver existed when I started rambot. Perhaps it did and I didn’t know about it. A lot of the early Toolserver applications were pretty experimental, and I was not interested in being a beta tester for that technology. I just wanted a quicker way to update articles.

Q: Right. What do you think overall about the MediaWiki framework and the technical infrastructure of WP?

RM: I’ve actually taken the MediaWiki framework and installed it on an internal wiki at my current company. I love the software. I’ve even made slight modifications to the code, but I have not looked at it that closely to know if it is well designed or a programmer’s nightmare. In some ways it is too technical for the novice, but you need that to get the features that the project needs. It’s a tradeoff, but I think it is good. I’ve played with less technical wikis and have not liked them at all. As a programmer I love the advanced features, particularly templates and transclusion. I also love categories, as they allow arbitrary and multiple hierarchies of information (unlike, say, a file system).

Q: How about the administrative structures on Wikipedia (other than BAG)? Any comments there?

RM: You won’t hear a lot of complaints out of me. I’ve always appreciated the project with very few complaints, except perhaps against the deletionists. I even thought that the Essjay controversy was overblown [11] (I worked with him on numerous occasions). About the admin structure ... when you get a large enough group together, you need some amount of overhead to manage coordination of resources. Chaos just does not work. It has never worked, which is why you get governments of one sort or another. So a “government” was bound to occur, the only question was whether the consensus method used was a good one. I think it works about as best as you can without electing politicians (metaphorically speaking). Considering how easy it is to become an administrator, they hardly count as a ruling body.

Q: What are your opinions on the sanction system (i.e., the Arbitration Committee, or ArbCom) or recognition systems like Barnstars?

RM: ArbCom is crucial to the success of the project. You MUST end debates (and problem users); they cannot be allowed to last forever, and so someone has to decide. Because any user can edit, you have to put restrictions in place. There is no way everyone can be made happy. However, 99 percent of editing will never have to deal with these issues.

Of bigger concern are people who sit on an article that they think of as theirs. At the risk of sounding arrogant, I’ll use a personal example. My images make up maybe 50 percent of the pictures used to illustrate the Monarch butterfly article (maybe more). In this case, I function as the “gatekeeper”. Someone comes in and changes the image to something they like and a prolonged edit war ensues. I think my images are superior, but I don’t have the time to patrol the article, so over time the quality degrades. You can see my bias here. Now for sake of argument, say I was incorrect in my view. As the gatekeeper, I would actually be harming the article. This happens ALL THE TIME, and usually it is not noticed. Someone finds a useful piece of information and the gatekeeper just removes it at a later point. No one can check every edit they ever made.

Q: Ok ... so you think this attitude of ownership degrades articles on the whole?

RM: Yes. Vandalism is relatively easy to deal with. Ownership is tough, because casual vandalism patrols will not be able to determine whether the changes are beneficial or harmful to the article because it requires a level of expertise.

Q: Just to play the other side, isn’t this attitude somewhat necessary to get articles up to speed in the first place? Is it very much a double-edged sword?

RM: I can’t agree more. It is one of the biggest problems, but “Be Bold!” is a core policy (for a good reason!). Over time gatekeepers become irrelevant.

Q: It’s a contradiction that the project has to live with?

RM: Indeed. The very nature of the project causes these problems. I’m not clever enough to think up the solutions to these problems (if they even exist), but I just wish that there was a way to avoid the short term degradation.

Q: Right. So, somewhat along those lines ... What does edit count mean to you? What does it represent? At one time earlier in the project, you had the highest edit count.

RM: Edit counts were a big load of fun in those early days because I was up in the top 10 for a long time. If you merge my three accounts, I had a lot. I don’t think the lists include bot edits. But it was never more than fun and a little pride; it didn’t matter. My edits were cheap and I never made a featured article.

Q: Cheap?

RM: I’m like the miner who produces the raw materials while some other architect/artist/builder turns it into something beautiful. There were a few exceptions to this, but this was part of the reason I stopped editing articles and went into photography to improve Wikipedia images.

Q: Edit counts? Or recognition in general?

RM: Oh no, my edit counts have been hardly anything since that transition. No, I found something that I was truly good at (taking high quality, encyclopedic images), and turned that towards improving articles. I don’t mind recognition, but it has always been about making the articles better. I do 95 percent of my edits on the Wikimedia Commons now.

Q: Do you follow the workings of the Wikimedia Foundation at all? The governance (if you’d call it that) at that level?

RM: Nope. I remember when I was once a very active administrator on Wikipedia, I followed all the head policy decisions, and actively participated in all of that. I even caused some concern with Mr. Wales [12] (about licensing), which was resolved when we met at a Wiki Meetup. Now, I don’t even pay attention. Quite a change.

Q: Do you have any other comments, experiences, anecdotes, or stories you’d like to share that might help my project, which endeavors to understand and give voice to Wikipedia bots and their creators?

RM: Bots (and their writers) do a tremendous service to Wikipedia. It is a vital role, but largely missed by the average user. If I go to the Albert Pujols page [13], where do all those stats come from? A bot! It is not needed for the average Wikipedia user to know about bots, but their function is crucial.

Q: Do you use any assisted-editing tools or semi-automated tools on WP?

RM: I use the plain text editor, and if I need automation, I use the rambot (it has done other tasks). But as I’ve said, most of what I do now is just images, which is pretty simple editing. Getting the images prepared and uploaded is the tough part.

Q: Finally, could you describe your motivations for your continued contributions to Wikimedia projects as a whole?

RM: In summary: Freedom of speech and information. Knowledge is power, and Wikipedia is the largest source of knowledge that I know. Being able to contribute to it has been a personal (and ideological) pleasure. Of course I get a lot of personal enjoyment out of the whole process, but the core motivation has always been the above. Incidentally, I also have been a large supporter of the theory that many eyes will improve Wikipedia’s quality. I wanted to see if I was right.

 

About the author

Randall M. Livingstone is an assistant professor in the School of Communication at Endicott College. He holds a Ph.D. in communication and society from the University of Oregon. His research interests include new/digital media, social media, collective intelligence, and the political economy of communication.
E-mail: rlivings [at] endicott [dot] edu

 

Notes

1. This bot is properly referred to as “the rambot,” including the definite article and lowercase spelling.

2. Leonard, 1997, p. 21.

3. Leonard, 1997, pp. 41–42.

4. Recent Changes is a real-time listing of the most recent edits made to Wikipedia. Watchlists display recent changes made to only a subset of pages selected by the particular user (or watcher). Both are features of the MediaWiki software that Wikipedia runs on, and both are tools used to monitor user additions and deletions from the site.

5. Technically, a bot flag is a designation of user rights, granting an account a certain set of editing privileges. More practically, though, the bot flag is an indication of trust and approval for a bot and its work.

6. Perhaps reflective of the unique nature of bots on the project, even with processes free of appointed reviewers, the approval of bots is generally only handled by a handful of Wikipedians (Livingstone, 2014).

7. Geiger and Ribes, 2010, p. 124. Distributed cognition refers to an approach to task completion that utilizes both human cognition and tools/artefacts in the environment (the concept sometimes associated with collective intelligence). See Hutchins (1995).

8. Geiger, 2011, p. 79.

9. An administrator is a user who has been granted a higher level of technical rights, including the ability to block other users, and often serves as a resource to the site’s community of contributors. Administrators go through a formal nomination and consensus approval process.

10. A server maintained by the WMF from which various software tools are hosted.

11. Essjay was a prolific Wikipedia contributor who was discovered to be grossly misrepresenting his off-line credentials, stirring controversy in 2007 around the credibility of the project. See Cohen (2007).

12. Jimmy Wales is the co-founder and most public spokesperson for Wikipedia.

13. A professional baseball player for the Los Angeles Angels of Anaheim, Calif.

 

References

Alexa, 2016. “The top 500 sites on the Web,” at http://www.alexa.com/topsites, accessed 5 January 2016.

B. Christian, 2011. The most human human: What talking with computers teaches us about what it means to be alive. New York: Doubleday.

N. Cohen, 2007. “Wikipedia ire turns against ex-editor,” New York Times (6 March), at http://www.nytimes.com/2007/03/06/technology/06iht-wiki.4817747.html, accessed 20 May 2015.

D. Cosley, D. Frankowski, L. Terveen, and J. Riedl, 2007. “SuggestBot: Using intelligent task routing to help people find work in Wikipedia,” IUI ’07: Proceedings of the 12th International Conference on Intelligent User Interfaces, pp. 32–41.
doi: http://dx.doi.org/10.1145/1216295.1216309, accessed 1 January 2016.

R.S. Geiger, 2011. “The lives of bots,” In: G. Lovink and N. Tkacz (editors). Critical point of view: A Wikipedia reader. Amsterdam: Institute of Network Cultures, pp. 78–93.

R.S. Geiger and D. Ribes, 2010. “The work of sustaining order in Wikipedia: The banning of a vandal,” CSCW ’10: Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work, pp. 117–126.
doi: http://dx.doi.org/10.1145/1718918.1718941, accessed 1 January 2016.

A. Halfaker and J. Riedl, 2012. “Bots and cyborgs: Wikipedia’s immune system,” Computer, volume 45, number 3, pp. 79–82.
doi: http://dx.doi.org/10.1109/MC.2012.82, accessed 1 January 2016.

E. Hutchins, 1995. Cognition in the wild. Cambridge, Mass.: MIT Press.

E.E. Jervell, 2014. “For this author, 10,000 Wikipedia articles is a good day’s work,” Wall Street Journal (13 July), at http://online.wsj.com/articles/for-this-author-10-000-wikipedia-articles-is-a-good-days-work-1405305001, accessed 20 May 2015.

A. Kittur, E. Chi, B.A. Pendleton, B. Suh, and T. Mytkowicz, 2007. “Power of the few vs. wisdom of the crowd: Wikipedia and the rise of the bourgeoisie,” CHI ’07: Proceedings from the 25th Annual ACM Conference on Human Factors in Computing Systems, pp. 1–9, at http://www-users.cs.umn.edu/~echi/papers/2007-CHI/2007-05-altCHI-Power-Wikipedia.pdf, accessed 1 January 2016.

A. Leonard, 1997. Bots: The origin of new species. San Francisco: Hardwired.

R.M. Livingstone, 2014. “Immaterial editors: Bots and bot policies across global Wikipedia,” In: P. Fichman and N. Hara (editors). Global Wikipedia: International and cross-cultural issues in online collaboration. Lanham, Md.: Rowman & Littlefield, pp. 7–23.

C. Müller-Birn, L. Dobusch, and J.D. Herbsleb, 2013. “Work-to-rule: The emergence of algorithmic governance in Wikipedia,” C&T ’13: Proceedings of the Sixth International Conference on Communities and Technologies, pp. 80–89.
doi: http://dx.doi.org/10.1145/2482991.2482999, accessed 1 January 2016.

S. Niederer and J. van Dijck, 2010. “Wisdom of the crowd or technicity of content? Wikipedia as a sociotechnical system,” New Media & Society, volume 12, number 8, pp. 1,368–1,387.
doi: http://dx.doi.org/10.1177/1461444810365297, accessed 1 January 2016.

J. Schanze, 2010. Plug & pray [Motion picture]. München: Mascha Film.

J. Weizenbaum, 1976. Computer power and human reason: From judgment to calculation. San Francisco: W.H. Freeman.

Wikimedia, 2016. “List of Wikipedias,” at https://meta.wikimedia.org/wiki/List_of_Wikipedias, accessed 5 January 2016.

Wikipedia, 2016a. “History of Wikipedia bots,” at https://en.wikipedia.org/wiki/Wikipedia:History_of_Wikipedia_bots, accessed 5 January 2016.

Wikipedia, 2016b. “Size of Wikipedia,” at https://en.wikipedia.org/wiki/Wikipedia:Size_of_Wikipedia, accessed 5 January 2016.

Wikipedia, 2016c. “Wikipedia:Bot approvals group,” at https://en.wikipedia.org/wiki/Wikipedia:Bot_Approvals_Group, accessed 5 January 2016.

Wikipedia, 2016d. “Wikipedia:Bot policy,” at https://en.wikipedia.org/wiki/Wikipedia:Bot_policy, accessed 5 January 2016.

Wikipedia, 2016e. “Wikipedia talk:Bots/Archive,” at https://en.wikipedia.org/wiki/Wikipedia_talk:Bots/Archive/1, accessed 5 January 2016.

E. Zachte, 2015. “Wikipedia statistics,” at http://stats.wikimedia.org/EN/BotActivityMatrixEdits.htm, accessed 5 January 2016.

 


Editorial history

Received 27 May 2015; revised 6 January 2016; accepted 7 January 2016.


Creative Commons License
“Population automation: An interview with Wikipedia bot pioneer Ram-Man” by Randall M. Livingstone is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Population automation: An interview with Wikipedia bot pioneer Ram-Man
by Randall M. Livingstone.
First Monday, Volume 21, Number 1 - 4 January 2016
http://firstmonday.org/ojs/index.php/fm/article/view/6027/5189
doi: http://dx.doi.org/10.5210/fm.v21i1.6027






© First Monday, 1995-2017. ISSN 1396-0466.