The cost of search and evaluation in online problem-solving social networks with financial and non-financial incentives
First Monday

by Daniel Scain Farenzena, Luis da Cunha Lamb, and Ricardo Matsumura de Araujo



Abstract
Online networks of individuals have been used to solve a number of problems on a scale that would not be possible outside a connected, virtual and social environment such as the Internet. In this paper, we show that when solving tasks of short duration (under five minutes), also known as microtasks, individuals' decision making is strongly biased by the costs of searching for (and evaluating) options rather than by financial or non-financial incentives. Indeed, we show that we can influence individuals' decisions during problem solving by rearranging elements visually to modify an individual's search sequence, whether by designing the virtual work environment or by manipulating which options are shown first in non-controlled environments, such as the Amazon Mechanical Turk labor market. We performed almost 50 experiments in online networks in which individuals were invited to work on tasks of varying degrees of difficulty within three settings: mathematical games with objective truth (Sudoku and SAT instances); surveys with subjective evaluation (public policy polling); and labor markets (Amazon Mechanical Turk).

Contents

Introduction
Human behavior studies in computer science
Goals and methodology
Results
Discussion and conclusions

 


 

Introduction

In the past decade, humans have spent increasingly more time in online, networked environments, such as online social networks, collaborative work Web sites or multiplayer games (Easley and Kleinberg, 2010). Given the amount and relevance of human activity performed in online environments, we need to be able to design information systems capable of optimally harvesting the human resources available online. However, are there good models to explain how humans behave in online environments? What are the important factors to consider when designing systems where humans interact and work? This paper investigates important factors that can influence human decision making in online environments. These factors can be used to explain online human behavior and possibly to inform rules for the design of information systems.

While this change from the off-line to the online world happens in networked environments, provided by an immense infrastructure of connected electronic devices (Abowd and Mynatt, 2000), data from human activity is being collected and made available to network hosts, making it less expensive to observe human behavior on a large scale. As a consequence, we can take advantage of this scenario and collect empirical data on the decision making processes of many individuals. In this paper, as a methodology to achieve our goal of understanding online decision making, we performed around 50 experiments and collected data in non-controlled environments to provide empirical evidence about how humans behave in online environments. This information is used as the basis for a new model based on artificial memetic networks.

Within this scenario of massive, cheap and accessible data, new behavioral phenomena have been observed. Online applications are succeeding in engaging a number of individuals comparable to the total human population, raising interest from academia, industry (von Ahn and Cathcart, 2009) and government. Examples of successful applications are reCAPTCHA, FoldIt and Amazon Mechanical Turk (AMT). Another example is Quora, a Web site where individuals can ask questions and have them answered, edited and organized by other users of the site (Paul, et al., 2012). As much as artificial intelligence has progressed, no artificial system based on AI can read natural language questions and answer them satisfactorily enough to surpass the popularity of Quora or other Question and Answer (Q&A) sites. Therefore, online human behavior studies are receiving special attention, and we model some of our experiments after applications where humans solve tasks, specifically SAT problems and Sudoku games. Both types of tasks have equivalent mathematical descriptions and can be used to compare solving performance and the traits that influence how individuals decide to solve tasks among many possible options.

There are many research areas dedicated to understanding online human behavior. One way to understand how humans manage online problem solving is to look at human cooperative behavior: the process whereby individuals work or act together for common or mutual benefit (Fehr and Fischbacher, 2003). For a more detailed description of cooperative behavior, we could choose among several fields, each providing its own definition, whether with rigorous mathematical models as in economics (Osborne and Rubinstein, 1994; Ohtsuki, et al., 2006) or with the more subjective statements found in the humanities (Richerson and Boyd, 2005). The study of cooperative behavior overlaps several classic fields of study, such as economics, psychology and the social sciences. Additionally, in the past five years, new disciplines were created to better describe ongoing interdisciplinary work in social computing, human computing and social dynamics. There is no firm agreement over which properties, phenomena or objects receive special emphasis in each of the aforementioned areas because, as we will see in this paper, theories and models about cooperative work are still under development or subjective, making comparisons not straightforward. However, we can more easily distinguish each area's definition of cooperative behavior if we add context and methodology. For instance, the game theoretic approach [1], although shared with other fields, is more commonly found in economics (Kreps, 1990), making use of complex mathematical models (Bredin and Parkes, 2012). We will use this method of highlighting important traits of each field to distinguish each approach as necessary.

Nonetheless, from the first phase of our experiments we gather evidence that people will decide to cooperate with any source of information that they can connect to, not only humans: they treat even machines as if they were communicating with humans. This makes our model less specific to networks of cooperating individuals and more general, in terms of nodes exchanging information. Put another way, if machines or information systems are not considered cooperative peers, our results apply to humans working in non-social, disconnected environments. Therefore, we can understand how people behave at the group level by using characteristics that are present even when individuals are not in a group. With this new hypothesis in hand, we performed a second phase of experiments to collect empirical data to test it, in which individuals were asked to participate in a controlled public policy poll and, after that, in a non-controlled experiment in a labor market, the Amazon Mechanical Turk.

In this work, we investigate human behavior in online environments based on our own model, artificial memetic networks (AMN). AMN were originally inspired by the naturally occurring spread of ideas (Dawkins, 1989) and were successfully applied in computer simulations to investigate the problem-solving performance of networks of artificial agents (Araújo and Lamb, 2008; Araújo, 2010). The present work uses an extension of AMN to design, perform and analyze experiments with individuals in problem-solving online networks. Our model provides sufficient mathematical rigor to let us model experiments and tell apart each object under investigation with precision, while keeping the assertions general enough to allow comparison with other models and theories. This goes along with our goal of enabling better design of information systems where people perform activities online.

The AMN model was first introduced in Araújo (2010) and Araújo and Lamb (2008). It can be used, as we explain in this paper, to understand human behavior in both connected and disconnected environments by modeling how memes (or ideas) spread and mutate as they are exchanged between individuals. As we develop the concepts of AMN, we also perform dozens of experiments to provide empirical evidence that our model can correctly pinpoint behavioral traits that explain observed phenomena.

We show that when solving tasks of very short duration (under five minutes), also known as microtasks, individuals act less rationally than classical game theory predicts, strongly biased by the costs of searching for (and evaluating) options rather than by financial or non-financial incentives. Therefore, the search and evaluation process, modeled more commonly in algorithms for combinatorial optimization, plays an important role in human decision making and constrains the options available to humans. Later in this paper, we provide a brief introduction to human decision making, setting a context for the applications and experiments that mark the beginning of highly available, massive data and low cost social experimentation.

The constraints on search and evaluation, as suggested by our study, limit the number of available options in microtask solving. Moreover, as shown in this work, influence from one individual to another when solving problems can be created by rearranging elements visually to modify an individual's search sequence, whether by designing the layout of the virtual work environment or by manipulating which options are shown first in non-controlled environments, such as the Amazon Mechanical Turk labor market.

Individuals, therefore, rely more on the search conditions than on exogenous incentives (Manski, 1993). Our evidence is based on around 50 experiments in online social networks, performed while individuals worked on tasks of varying degrees of difficulty within three settings: mathematical games with objective truth (Sudoku and SAT instances); surveys with subjective evaluation (public policy polling); and labor markets (Amazon Mechanical Turk).

We start by reviewing literature that sets a context for the applications and experiments that mark the beginning of highly available, massive data and low cost online human behavior experimentation. A description of the model that we used and of the experiments is then detailed. We provide empirical evidence from experiments with thousands of subjects, from controlled experiments to naturally occurring phenomena observed in a labor market and in a social network (Facebook). We then describe the results with a data analysis that supports our conclusions.

 

++++++++++

Human behavior studies in computer science

The performance of human cooperative work has been studied extensively in many contexts (Malone and Crowston, 1994). However, research using the methods of computer science did not appear before the dawn of very successful online applications whose value is based on the work of millions of users coordinated by electronic infrastructure (Bran, et al., 2009).

We show examples of applications that proved the power of human cooperative work in online environments, in an era when artificial intelligence and machine learning were believed to solve any tractable problem with powerful hardware and software alone. By showing not only the success of these applications but also that they were based on incomplete or unverified theories, we argue that a stronger, more reliable theory is needed, one that can be backed by experimental data and that has properties that can be applied generally to applications relying on online human work.

We present examples of candidate theories constructed to close the gap between unforeseen applications and existing, classic theories. Although they may provide clues about how people behave, they overlook several naturally occurring phenomena in experiments or cannot be applied to the design of experiments, as they are still under development and assume properties that are not realistic.

In the final section, we will present studies that show experiments specifically designed to provide evidence for new theories on human online collaborative work.

Examples of Internet applications

reCAPTCHA

A great deal of research has been published recently presenting faster ways of processing images by making use of graphics processing units (GPUs). GPUs have developed from simple processing cores with special single instruction, multiple data (SIMD) instructions to hundreds of specialized stream processors integrated into systems-on-chip (SoC). The increase in image processing power hinted at an era in which image segmentation and object recognition could be performed on a huge scale. However, the quality of the information extracted automatically from images could not match that of a different way of recognizing images: a network of motivated individuals tagging images, known as reCAPTCHA (von Ahn, et al., 2008).

A CAPTCHA (completely automated public Turing test to tell computers and humans apart) is a method to distinguish between humans and machines by using AI problems (von Ahn, et al., 2003). A captcha is a program that can generate and grade tests that: (A) most humans can pass, but (B) current computer programs can’t pass. A known successful application of captcha is the reCAPTCHA (von Ahn, et al., 2008), which is more commonly known as the wobbly word pairs that one has to identify as a requirement to request information on a Web site, such as survey forms or e-mail access (see Figure 1).

 

A reCAPTCHA event
 
Figure 1: In reCAPTCHA, an Internet user is required to type two words before submitting his or her credentials to a Web form, aiming to identify whether the user is a machine. One of the word images was generated by a computer, so the system knows what word it contains. The other word was captured from a real situation, e.g., a scanned book or a photo, and thus was not generated artificially by a computer. If the user's answer does not match the word used to generate the computer-generated image, the system assumes the user is a machine and the interaction with the application is ignored. Since the user does not know which of the two words was generated, he or she is required to type both words correctly to pass the test. Therefore, by correctly typing both words, the user is not only providing evidence of being human, but also performing character recognition for the non-artificially generated image.

 

With reCAPTCHA, words in images are correctly identified because users are required to match an image with the correct words, otherwise they cannot continue interacting with a given Web site. There is thus a motivational component (users want to interact with a site), usually explained in the game theory context of mechanism design or in algorithmic game theory (AGT), and the outcome of the interaction produces optical character recognition (OCR) work of higher quality than was previously available from any OCR algorithm performed by machines. Therefore, with a CAPTCHA, which is a generalization of reCAPTCHA (von Ahn, et al., 2008): “[...] either the problems remain unsolved and there is a way to differentiate humans from computers, or the problems are solved and there is a way to communicate covertly on some channels.”

By far the most successful example of a CAPTCHA, reCAPTCHA was acquired by Google for an estimated US$400 million, providing quality OCR services for Google’s goals of extracting information from the Web and digitizing books.

reCAPTCHA builds on the CAPTCHA originally described by von Ahn, et al. (2003), published two years after the original patent for OCR with CAPTCHAs (see Lillibridge, et al., 2001). It is not surprising that applications may precede theory, and this pattern repeats with online applications. Von Ahn, et al. (2003) and von Ahn and Dabbish (2008) demonstrated that, assuming each user has to match two words (as in Figure 1), it is possible to compute the probability that the information provided is true, up to an acceptable error. This analysis does not, however, model why people perform better than machines at OCR.
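To illustrate the flavor of that computation — a sketch of our own with assumed parameters, not von Ahn, et al.'s actual derivation — suppose each transcription of the unknown word is independently correct with probability p, and that two wrong transcriptions coincide with probability c. The confidence that a word accepted after k matching transcriptions is correct can then be estimated as:

    def agreement_confidence(p: float, c: float, k: int) -> float:
        """Probability that a word accepted after k matching transcriptions
        is correct, assuming each transcription is independently correct
        with probability p and wrong transcriptions coincide with
        probability c (both parameters are illustrative assumptions)."""
        all_correct = p ** k                           # every worker typed the true word
        all_same_error = (1 - p) ** k * c ** (k - 1)   # every worker made the same mistake
        return all_correct / (all_correct + all_same_error)

    # Example: fairly accurate workers (p=0.9) who rarely repeat each
    # other's mistakes (c=0.05); two matching answers already suffice.
    print(agreement_confidence(p=0.9, c=0.05, k=2))  # ~0.999

This is why a small number of independent matches is enough for the system to accept a transcription with high confidence.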

Wikipedia

Wikipedia is a cooperative effort to build an up-to-date, accurate encyclopedia. Currently, Wikipedia is one of the most accessed Web sites in the world (Alexa, 2016), competing well with traditional encyclopedias (Giles, 2005).

To create Wikipedia, generally any user can edit articles and contribute to their content and organization. Each article, together with its respective talk page where users can discuss and share information, is constructed from pieces of information from various users. Moreover, users can edit each other’s contributions, making corrections or adding to the existing information in each article. Therefore, Wikipedia is an online application of networked individuals sharing information and aggregating it into articles. As a reward, volunteers are essentially co-authors of any article. Another incentive is to altruistically contribute to the advance of knowledge and have others utilize this knowledge, with no monetary incentives.

GalaxyZoo

Moving beyond interstellar dust, GalaxyZoo (2015) proposes the classification of galaxy morphology based on captured sky images. From an expected crowd of 20,000 to 30,000 participants, the project quickly (and unexpectedly) gathered 100,000 active participants (Lintott, et al., 2011; see Figure 2).

 

The GalaxyZoo interface
 
Figure 2: In the GalaxyZoo interface, volunteers are requested to classify an object at the center of the image. As the object is classified into the classes “Smooth”, “Features or disk” or “Star or artifact”, more questions might be asked of the user.

 

We have been emphasizing the interconnected nature of some online applications. Both the Stardust@Home and GalaxyZoo projects have central servers with data collected from scientific experiments. The data is distributed to volunteers and output is returned to the server, where it is aggregated and stored. One could argue that the volunteers work individually and do not cooperate, as they do not communicate directly with each other. However, no work on such a scale can be performed without a (machine) central point of interaction — in this case, an Internet server. This is very important to consider when comparing online applications to other networks studied in economics and the social sciences. Previous studies did not consider situations where sharing work has such a low cost and where tasks can be performed so quickly. This can be argued to be a difference between the results predicted by classic fields and what has been observed in online behavior.

Additionally, we observe that no individual task performed by any volunteer described so far took more than a few seconds to perform. As we will see in some experiments, this type of work is sometimes classified as a microtask and has interesting, reproducible properties that will be useful in analyzing online behavior.

FoldIt

So far, we have discussed online applications that provide high quality output by matching individuals’ work (reCAPTCHA) or by distributing tasks from a server (Stardust@Home, GalaxyZoo). Let us examine an application in which individuals have to create solutions to problems whose solutions have not yet been found, highlighting a creative component.

In FoldIt (2015), participants are given 3D molecular structures and asked to find geometric configurations that minimize the energy potential of a given molecule. Finding the minimum energy of a molecule and its corresponding geometry is a central problem in bioinformatics and protein folding (PF) (de Andrades, et al., 2013):

[Computational protein design (CPD)] design can be considered as the inverse of the protein folding (PF) problem (Osguthorpe, 2000) because it starts with the structure rather than the sequence and looks for all sequences that will fold into such 3-D structure. Considering that there are 20 naturally occurring amino acids for each position, the combinatorial complexity of the problem amounts to 20^110 or 10^130 (Floudas, et al., 2006).

Therefore, PF is an intractable problem and the use of heuristics is required. By challenging volunteers to fold a protein to its lowest energy state, FoldIt aims at leveraging a human intellectual advantage over machines. It has never been proven that humans are more or less capable of processing data than computers; however, no robot or machine can match human cognitive skills in all types of activities, and thus it is still believed that humans have special cognitive skills. In this sense, FoldIt tries to harness this human cognitive potential by allowing individuals to work on proteins through a graphical interface (Figure 3).

 

FoldIt interface
 
Figure 3: In FoldIt, volunteers manipulate molecules to find the lowest energy state possible for the molecule. Although the energy of a given geometric configuration is easy to compute, finding the lowest-energy configuration is an intractable problem.

 

It has been shown that individuals are indeed capable of finding interesting structures, validating the hypothesis that humans can provide solutions beyond those of machines. The evidence for this is a series of solutions for folding geometries provided by humans that had not previously been found by any system (FoldIt, 2015).

Relevant fields and methods

In this section, we show examples of studies in classical fields (algorithmic game theory and evolutionary biology), new fields (network science and behavioral economics) and experiments that were performed with the current successful applications of social networks in mind (Kearns, et al., 2006).

These fields overlap, and we aim less at broadly addressing how each field tries to explain human behavior and more at highlighting how each field utilizes specific methods.

In the last section, we explain the idea behind our extended model of artificial memetic networks.

Algorithmic game theory

Rigorous mathematical treatment of group behavior was originally found in economic theory and later in algorithmic game theory. We start by quickly exposing the basic rational agent model that is generally used and then showing how it can be applied to online social applications.

A game theoretic analysis generally starts from the following model: there exists an agent that is deemed rational because, given a set of actions, it is capable of choosing the best action, i.e., the one that most increases a given fitness quantity of the agent. The actions available to the agent, and the change in the agent’s fitness according to his or her actions, are given by a payoff function. The sequence of actions an agent takes is his or her strategy. If all agents know the history of actions of all other agents, we say this is a perfect information game.

Let us test these definitions on Facebook. The agent (a human individual, or user) can take actions (e.g., post text and images, comment, like, scroll the page, leave the site, turn off the computer) that are expected to improve his or her fitness. A payoff function maps the current user context (e.g., set of peers, time of day, native language) and set of actions to a change in the agent’s fitness. Moreover, a sequence of actions can be grouped into an agent’s strategy, e.g., first post a message, then comment, then turn off the computer, then turn on the computer, then post again.

In this example, we can easily trace an analogy between agents and users and between posts and actions. However, we cannot easily determine the payoff function of users. Indeed, there is no generally accepted methodology to model a user payoff function. Each field has its own ways with pros and cons, motivating the development of new models.

In algorithmic game theory, we want to provide assumptions about the payoff functions of users and restrict their actions so we can describe the system as a mathematical model that can have a known analytic solution in game theory, also known as a solution concept. Having a model without an analytic solution is possible but avoided as it can introduce more parameters and weaken the model.

We will use Ghosh and Hummel (2014) as a reference for social behavior modeling in a Question and Answer (Q&A) forum. In a Q&A forum, each agent i can choose among an infinite, continuous set of actions:

  • to not contribute with an answer;
  • to contribute with an answer with quality qi.

In the latter case, the agent also chooses the quality of the contribution, from 0 to infinity, making the number of possible actions infinite as well. We will draw on part of this model when explaining our results, combined with a function fitted to our experimental data. Moreover, Jain and Parkes (2013) noted that AGT does not yet cover sufficient cases to be used as a tool for predicting or analyzing online behavior.
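To make this style of model concrete — a minimal sketch under a payoff of our own choosing (linear benefit b·q against a convex production cost q^e), not Ghosh and Hummel's actual utility function — an agent contributes only when the optimal quality yields positive utility:

    def best_action(benefit_per_quality: float, cost_exponent: float) -> float:
        """Return the quality q >= 0 maximizing u(q) = b*q - q^e, a
        hypothetical payoff: linear benefit, convex production cost.
        Returning q = 0 corresponds to the 'do not contribute' action."""
        # d/dq (b*q - q^e) = b - e*q^(e-1) = 0  =>  q = (b/e)^(1/(e-1))
        q = (benefit_per_quality / cost_exponent) ** (1.0 / (cost_exponent - 1.0))
        utility = benefit_per_quality * q - q ** cost_exponent
        return q if utility > 0 else 0.0

    print(best_action(benefit_per_quality=2.0, cost_exponent=3.0))  # contribute with q ~ 0.82

Setting the derivative to zero gives the interior optimum over the continuous action set; the whole analysis rests on assuming a tractable analytic form for the payoff, which is exactly the modeling burden discussed above.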

One experiment used extensively as a reference in designing the first phase of our experiments is Kearns, et al. (2006). In that work, individuals were asked to solve a coloring problem collaboratively. Each individual was given a color and was positioned inside a virtual social network implemented in the experiment. The other nodes in the network were individuals as well, and the available actions allowed individuals to change their own colors, with the ultimate goal of solving the coloring problem.

Suri and Watts (2011) performed experiments in a social network whose findings agree with our experimental results, in that network topology did not affect outcomes.

Artificial memetic networks

We will first introduce a more formal concept of culture so we can then discuss memes. There are a number of models of culture, some of which present a mathematical definition. An interesting survey may be found in Birukou, et al. (2013), from which we borrow and adapt a formal definition of culture.

We denote each agent as ai and the m-th group of agents as Am = {a1, a2, ...}. When no index is specified, A represents all agents that may exist in our context (Internet, social network, etc.).

Each agent ai possesses cultural traits, which are (Birukou, et al., 2013) characteristics of human societies that are potentially transmitted by non-genetic means and can be owned by an agent.

Although under different names, a trait is an element that can be found in most theories about humans, societies and cultures. For example, traits may be found in other works in the form of ideas, memes or even behaviors, such as “The Earth is round”, a description of how to build a computer or even just a number. All such instances of traits are based on informal, subjective definitions and often overly broad concepts, making it difficult to compare them and to construct a more universally accepted description.

Nevertheless, existing definitions of traits share a common characteristic: traits can be transmitted and learned by individuals (Birukou, et al., 2013). We can use this common characteristic to define traits.

Definition 0.1: Agent ai possesses zero or more (cultural) traits τj and the set of traits of ai is Tai, itself contained in the set of all traits, Tai ⊆ T. Traits can be shared and learned by agents. Agents can share traits that have never been learned by them, i.e., agents can create new traits and share them [2]. Thus, we present the function:
Tai : t ↦ Tai(t) ⊆ T        (1)
to map trait sets at different instants of time as they change due to learning or trait creation.

In our model, T is a totally ordered set whose elements can be indexed uniquely with j, j ∈ ℕ+. This is a reasonable assumption because traits can be described uniquely with natural language descriptions (such as “The Earth is round”) that can, at the very least, be ordered alphabetically.

Finally, we will start with the following informal definition of culture:

The culture of any society consists of the sum total of ideas, conditioned emotional responses and patterns of habitual behavior which the members of that society have acquired through instruction or imitation and which they share to a greater or less degree (Linton, 1936).

Culture is the set of traits of all agents in a group. Thus, the culture of group Am = {a1, a2, ...} is CAm = Ta1 ∪ Ta2 ∪ ...

Definition 0.2: (Definition of culture) The culture CAm of the m-th group of agents Am is
CAm = ⋃ai ∈ Am Tai        (2)

If group m has only one agent a1, then the culture of the group is that agent’s set of traits, CAm = Ca1 = Ta1, and we can speak of an agent’s culture.

An active field of study is how cultures evolve and, to model a culture’s capability to evolve, we define culture at different points of time t. For this, each τi carries a timestamp informing when agent ai acquired τi, and C is equipped with a total order.

CAm(t) = ⋃ai ∈ Am Tai(t)        (3)

For instance, if you wish to know the culture at times t0 and t1, you compute C(t0) and C(t1), respectively.

Definition 0.3: (Evolution of culture) The evolution (or difference) of the culture of group Am from time t′ to t′′ is
ΔCAm(t′, t′′) = CAm(t′′) \ CAm(t′)        (4)

Generally, we may be interested in the evolution of culture between the beginning and the end of the agents’ activity in solving problem instance pk. If t′ and t′′ are the time instants when work on problem instance pk begins and ends, respectively,

ΔCAm(pk) = CAm(t′′) \ CAm(t′)        (5)

In our work, traits are mainly limited to problem solutions, e.g., each trait is a SAT or Sudoku solution. Therefore, agents in our social network may have only one trait type [3].
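As a concrete reading of Definitions 0.1–0.3 — a minimal sketch of our own, with hypothetical trait strings — traits, cultures and cultural evolution map directly onto set operations:

    # Traits are unique natural-language (or encoded) descriptions, so plain
    # strings work; a culture is the union of the agents' trait sets (Eq. 2),
    # and cultural evolution is a set difference between snapshots (Eq. 4).
    from typing import Dict, Set

    Trait = str

    def culture(group: Dict[str, Set[Trait]]) -> Set[Trait]:
        """C_Am = union of T_ai over all agents ai in the group."""
        result: Set[Trait] = set()
        for traits in group.values():
            result |= traits
        return result

    # Snapshots of the same group at t' and t'' (hypothetical traits).
    group_t1 = {"a1": {"The Earth is round"}, "a2": {"1+1=2"}}
    group_t2 = {"a1": {"The Earth is round", "FTFT"}, "a2": {"1+1=2"}}

    evolution = culture(group_t2) - culture(group_t1)
    print(evolution)  # {'FTFT'} -- the trait created between the two snapshots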

Capability to forget

In our model, agents may forget traits, at variance with Birukou, et al. (2013). Thus, for our model, we do not take into account definitions and axioms related to causality and the sharing of traits. The possibility of an agent forgetting a trait is necessary to explain why some agents try to solve problems with solutions that they have already tried, a situation not possible in the Birukou, et al. (2013) model. We use an adapted form of snapshots (Birukou, et al., 2013), as noted in equations (1) and (3), allowing causality definitions to be included as long as proper notation is presented.

A game theoretic description

Here we present a game theoretic description of meme evolution, based on basic game theory concepts, that will be useful in discussing gains and costs in our experiments.

Each agent ai strategically chooses an action (also traditionally known as a move) αi(t) ∈ {E, I, N}, where αi = E indicates that agent i chooses to act extrovertedly by copying another agent’s trait. In this case, we may also specify from which agent a′ the trait was copied with the notation αi = E(a′). Additionally, αi = I indicates that agent i acts introvertedly by adapting his or her set of traits Ti into a new trait (Araújo and Lamb, 2008). When necessary, we may specify the new trait created, τ′, as αi = I = τ′. Finally, αi = N indicates that agent i does not act.

If we assume agents are solving a Sudoku game, for instance, then the game instances being played are finitely repeated games, and where necessary we specify the time each action was taken with the t parameter. All first actions of each agent happen at t = t0 in equation (5). However, not showing the time instant does not imply that time steps have the same duration between agent moves, although we can always add as many αi = N actions as needed to synchronize all agents’ actions into constant time steps.

The strategy of agent i is denoted si = {αi(t0), αi(t1), ..., αi(tpk)}, or si,pk when we refer specifically to the strategy taken by agent i while solving the k-th problem instance pk. The strategy profiles are denoted Spk = {s0, s1, ...} for the k-th problem instance pk. Since our game is finite (discrete steps and time bound), we define a new si(t′′) = si(t′) ∪ {αi(t′)} whenever an agent takes an action individually. Each next step is indexed sequentially, e.g., t1, t2, ..., tpk, and tpk is the duration of problem pk.

The N action is important to transform the asynchronous actions of agent ai into actions that happen at the same time, by adding an αi = N action to si whenever ai did not play while A-i played.

We are now able to formalize specific statements about agent interaction in our experiments. For instance, agent a3 copying another agent a5’s solution τ7 at time t11 is formalized as α3(t11) = E(a5).
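A small sketch of this bookkeeping — with type names of our own invention — shows how a strategy accumulates E, I and N actions over discrete steps:

    # Minimal action/strategy bookkeeping for the E/I/N formalism.
    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class Action:
        kind: str                      # 'E' (copy), 'I' (introspective), 'N' (no-op)
        source: Optional[str] = None   # for E: the agent copied from
        trait: Optional[str] = None    # for I: the trait generated

    # Strategy of agent a3: generate a trait, wait, then copy from a5 --
    # the last entry formalizes alpha_3(t11) = E(a5).
    s3: List[Action] = [
        Action("I", trait="FTFT"),     # t0: introspective move creating trait 'FTFT'
        Action("N"),                   # intermediate steps: no action while others play
        Action("E", source="a5"),      # t11: extroverted move copying a5's solution
    ]
    print([a.kind for a in s3])        # ['I', 'N', 'E']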

Continuity of traits (or ideas)

We stated earlier that traits can always be described with a sequence of characters in natural language. Even if phrases or words can be ambiguous, we can always add more description to the trait to resolve the ambiguity. For instance, the Wikipedia article about “Neighborhood” defines this word as part of a city. As a consequence, whenever we want the article about the mathematical notion of “Neighborhood” (from set theory), we add the context to the Wikipedia article’s title in parentheses (“Neighborhood (mathematics)”). In this way, we can uniquely address every trait, or idea, and define a relation such as alphabetical order.

We define the metric space (T, l), where l is the Levenshtein Distance (LD) between the string descriptions of two traits, l : description(T) × description(T) → ℕ. This metric is going to be used extensively to represent T graphically.
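For reference, here is a compact dynamic-programming implementation of the Levenshtein Distance between two trait descriptions — a standard textbook version, not code from our experimental platform:

    def levenshtein(a: str, b: str) -> int:
        """Minimum number of single-character insertions, deletions and
        substitutions needed to turn string a into string b."""
        prev = list(range(len(b) + 1))          # distances from '' to prefixes of b
        for i, ca in enumerate(a, start=1):
            curr = [i]                          # distance from a[:i] to ''
            for j, cb in enumerate(b, start=1):
                curr.append(min(
                    prev[j] + 1,                # delete ca
                    curr[j - 1] + 1,            # insert cb
                    prev[j - 1] + (ca != cb),   # substitute (free if equal)
                ))
            prev = curr
        return prev[-1]

    print(levenshtein("FTFT", "FTTT"))  # 1 -- a single-variable flip in a SAT solution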

For any experiment or observation, we need to force agents to change traits in such a way that there are discrete, unitary changes from the previous solution. By enforcing this behavior, we guarantee that we can trace an agent’s introspective line of thought, since it represents the agent’s strategy while not sharing his or her solution with other agents. We assume that an agent cannot hold a large number of traits in memory and thus uses the interface to keep track of the current solution.

Lemma 0.1: (Continuity of traits in a culture) For all agents ai ∈ A, any sequence of actions αi ∈ {I, N} forms a topological space with neighborhood
N(τ) = {τ′ ∈ T : l(τ, τ′) ≤ 1}        (6)

Proof: Users are forced by the UI to take actions that modify only one variable of a SAT solution, or only one Sudoku tile, each time they take action I. Also, a SAT solution is encoded as an array of variables, with the character ‘T’ for true values and ‘F’ for false for each variable of the problem, and a Sudoku solution as the concatenation of all numbers on the board, with blank spaces replaced by ‘0’ (zero). With this encoding, each time an agent changes one variable (or one tile in the Sudoku example), the Levenshtein Distance (LD) is at most 1. Finally, the action αi = N has Levenshtein Distance zero and abides by Lemma 0.1.
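The encodings named in the proof are simple enough to state in a few lines. This sketch, with made-up example assignments, shows why a single UI move changes the encoded string by an LD of at most 1:

    from typing import List, Optional

    def encode_sat(assignment: List[bool]) -> str:
        """SAT solution: 'T'/'F' per variable, in variable order."""
        return "".join("T" if v else "F" for v in assignment)

    def encode_sudoku(board: List[List[Optional[int]]]) -> str:
        """Sudoku solution: all tiles concatenated, blanks as '0'."""
        return "".join(str(cell or 0) for row in board for cell in row)

    before = encode_sat([True, False, True, False])  # 'TFTF'
    after = encode_sat([True, True, True, False])    # 'TTTF' -- one variable flipped
    # Equal-length strings differing in one position have LD exactly 1,
    # so each introspective (I) move keeps the trait sequence continuous.
    print(sum(x != y for x, y in zip(before, after)))  # 1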

The continuity of agents’ introspective behavior shown in Lemma 0.1 will also be used to identify when an agent acted extrovertedly by copying from another agent, i.e., whenever there is a break in the continuity established in Lemma 0.1. We can identify this clearly in our experiments, but we should not assume this information will always be available, as in non-controlled observations we cannot be certain whether agents were copying or thinking introspectively.

How can one say that a specific individual has been cooperating or acting selfishly in the long term? From game theory and sociobiology, it is well accepted that agents always act selfishly, but can, in the short term, show cooperative behavior, i.e., temporarily accept a situation of lower gain when a future overall gain justifies it.

Classifying an organism as cooperating is straightforward in game theoretic experiments with well defined rules, or within the life and death situations that living beings might face. In the case of artificial memetic networks, at least in simulated experiments, we will provide a similarly objective way of telling whether an agent is cooperating or not. However, when looking at social experiments, we are biased by our opinions and beliefs about who is cooperating or not.

Aiming to find in artificial memetic networks a compromise between the definitions provided by evolutionary theory and the well developed mathematical theories behind game theory, we need a visual aid, not only to better explain artificial memetic networks, but also to provide hints about how to select a classification method. In this sense, it is important to first provide a way for researchers to intuitively agree with our proposed model, in the sense that one should be able to recover, under the layers of mathematical models and data analysis, what we usually accept cooperation to be. Our goal in this section is, therefore, to provide a way of expressing behavioral patterns in a visual, intuitive way.

We will now take the proposed model and offer a visual representation for each of the described elements, mainly memes and meme transitions, as an intuitive way of understanding the model.

Timeline of the actions of agents

In this section we model agents solving a problem as an optimization problem and relate the solving process with the cultural dynamics of a group of agents. Through action αi(t′) = I, agent ai generates a trait τ′ based on other traits in Ti (Araújo and Lamb, 2008). We name this process generation for two reasons.

First, when acting αi(t′) = I, all information needed by the agent to present the trait τ′ is endogenous to the agent. Thus, no information was used from external resources, such as the other peers A-i in the social network. For an external observer, trait τ′ was created or generated, since the output τ′ came from agent ai interacting directly only with the environment. Secondly, if a trait τ′′ is subsequently generated at time t′′ = t′ + 1 by αi(t′′) = τ′′, we expect the cardinality of Ti to remain unchanged, |Ti(t′)| = |Ti(t′′)|, when the set of traits is unchanged. However, because agents can forget, Ti(t′′) may be smaller than Ti(t′) by the loss of one or more traits. Nevertheless, in our work |Ti(t)| is always monotonic, and agents’ capacity to forget will be mentioned only when an agent takes action αi(t′′) = τ′′ with τ′′ ∈ Ti(t′) and t′ < t′′, which means traits (also solutions to problems in our experiments) are being repeated by the same agent, even though they do not represent the final solution to the problem being solved.

We name the generative function as follows:

Definition 0.4: Agent ai can generate traits with Gai:
Gai : 2^T → T        (7)
and
τ′ = Gai(Tai(t′)), where possibly τ′ ∉ Tai(t′) ∪ TA-i(t′)        (8)
Therefore, generating traits can produce a new trait not previously in Tai or TA-i.

The process of generating a trait is at the core of the gain (and cost) function. It is in this function that actions take place and the agent has to decide whether to copy, to mutate a trait or to do nothing. Indeed, the decision process will be shown to carry important costs when an individual decides to act extrovertedly (αi = E) — costs through which we are able to modify an individual’s behavior when performing microtasks.

We may start now connecting our model to Araújo and Lamb (2008). From Araújo and Lamb (2008):

Definition 0.5: Aggregation by copying the best neighbor: Let A be the set of adjacent vertices to any vertex; let u = argmaxx ∈ A eval(x). If more than one vertex in A satisfies this condition, u is chosen randomly among these. Then, make v ⇐ u; otherwise v is left unaltered.

To translate this definition into our mathematical framework, we make u range over TA-i for connected peers, i.e., the agents that are connected and visible to ai, and each x corresponds to a trait τ.

Definition 0.5 states that agents act rationally by copying the best solution according to an evaluation function eval, just as in a game theoretic model. Naturally, both artificial memetic networks and game theory extend this assumption in many ways.

There are more steps involved in processing artificial memetic networks. However, for the sake of our experiments, it is enough to introduce the above definition; the other steps, like the connection step, the appropriation step and the remaining aggregation models, are beyond the scope of this work.
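A minimal sketch of Definition 0.5 over an explicit neighbor list — the eval function and traits below are illustrative stand-ins, not the scoring used in our platform:

    import random
    from typing import List

    def eval_trait(trait: str) -> int:
        """Stand-in evaluation: count satisfied positions ('T') in a SAT-style string."""
        return trait.count("T")

    def copy_best_neighbor(own: str, neighbors: List[str]) -> str:
        """Aggregation by copying the best neighbor (Definition 0.5):
        adopt the best-evaluated neighboring trait, breaking ties randomly;
        keep the current trait when no neighbor improves on it."""
        if not neighbors:
            return own
        best_score = max(eval_trait(t) for t in neighbors)
        if best_score <= eval_trait(own):
            return own                              # v is left unaltered
        best = [t for t in neighbors if eval_trait(t) == best_score]
        return random.choice(best)                  # v <= u, ties broken at random

    print(copy_best_neighbor("TFFF", ["TTFF", "FTTF", "FFFF"]))  # 'TTFF' or 'FTTF'

Keeping the current trait when no neighbor improves on it is our reading of the “otherwise” clause in Definition 0.5.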

If αi(t′) = I and the agent’s trait set satisfies |Ti(t′)| = |Ti(t′ + 1)|, then for all t′ < t′′ we still have |si(t′)| < |si(t′′)|, i.e., |si(t)| is monotonically increasing, with si growing by one action αi(t′) at each step. It may be useful to present a graphical representation of how agents interact and how their traits and culture change as they take actions.

In Figure 4 we describe the timeline of agents taking actions during a game. Although we can describe the actions of agents in natural language, when we deal with multiple agents and hundreds, possibly thousands, of actions in a single experiment run, the representation of agent activity becomes cluttered and we must simplify.

 

Diagram shows two columns, each representing agent a0 and a1 strategies s0 and s1
 
Figure 4: The diagram shows two columns, representing the strategies s0 and s1 of agents a0 and a1, respectively. When either agent takes an action individually, time advances by Δt, the other agent’s action is set to N and a new last trait is selected. In the example, agent a0 first takes action α0 = I, which means it does not copy from agent a1 but generates trait τ0 (indicated by the edge τ0) based on its own set of traits. Because agents most probably do not take actions simultaneously, we set α1 = N to keep the game synchronous. During the last step, t1, agent a1 copies from agent a0. Only in action α0(t0) is a trait τ0 generated. The number 1 inside each circle will be explained in Figure 5.

 

Using Figure 4 as a reference, we present a simplified diagram in Figure 5 that shows graphical features in social activities that inspired our further analysis of agents’ interaction with cultural dynamics.

 

Simplified visual model for meme evolution
 
Figure 5: Simplified visual model for meme evolution: the diagram simplifies Figure 4 to introduce cultural dynamics concepts in light of game theoretic models. We drop the labels a0 and a1, exclude α = N actions and set each node’s value to the number of agents that ever output τ′. Differentiation between α = I and α = E is guaranteed by continuity breaks (see Lemma 0.1). The diagram layout must be arranged with the neato algorithm so that continuity breaks are visible.

 

To simplify, agents’ identities are of no interest for cultural dynamics analysis, so we drop the labels (in Figure 4, a0 and a1). It is implied that any path in this graph represents the strategy of one or more agents, so strategic information is not lost and can still be used to approximate payoff functions.

Even if agents perform actions at different times, there is no individual timeline representation in Figure 5, and this will not be necessary to present our results. Additionally, the type of problem instances we used in our experiments makes agents take actions quickly, so displaying agents’ actions in time layers (as in Figure 4) does not change the representation significantly. The timing information will be useful in the next step of our work, where we study homophily with the aid of the traits interpretation.

As unique traits are generated, the culture Cai grows and therefore so does the culture Cpk. Each agent ai selects a strategy si aiming to maximize its utility function ui : S → ℝ, where S is the space of all possible strategies an agent can choose from.

A first visualization can be found in Figure 6 and a typical interaction diagram is displayed in Figure 7.

 

Problem instances with a low number of states cannot be visually represented with our visual model
 
Figure 6: Problem instances with a low number of states cannot be visually represented with our visual model: the low number of states of the SAT instances decreased the value of the visualization. After generating this visualization, we planned to work with SAT problem instances with a higher number of variables. Nonetheless, more than 65 variables was already too many for human agents to solve.

 

 

Memetic evolution paths can be clearly seen in this visualization, but not on areas with popular memes
 
Figure 7: Memetic evolution paths can be clearly seen in this visualization, but not in areas with popular memes: after finding that visualizing meme evolution on a problem instance with a low number of states was not possible, we devised a new set of experiments with Sudoku problem instances. The visualization of a single instance of a Sudoku problem can be seen above. It is now clear that some agents take paths isolated from other agents, while most agents try to copy each other and have similar ideas. However, because of the high number of states and the popularity of some memes (solutions), we needed to zoom in. The next figures show a zoomed version of the central area.

 

Important features, such as the exchange of memes, strategy and the evolution of memes, are represented and used to intuitively explain phenomena such as meme exchange in the context of cooperation and homophily. Our goal with the representation is not to directly show or verify cooperative behavior but to provide a grasp of the concepts of meme evolution and exchange that were presented in previous sections.

 

Meme evolution visualization with zoom 2x
 
Figure 8: Meme evolution visualization at 2x zoom can already show when agents cooperate by exchanging memes (problem instance solutions): even in areas with a high density of memes, we can still see the particular strategies of each agent. When information exchange occurs, an arrow merges two paths into one. In green, we can see the correct solution for this problem (zoomed in the detail box). In our experiments, we let agents finish solving a problem, and all agents were able to solve a given problem.

 

 

++++++++++

Goals and methodology

Goals

This paper aims at understanding factors that explain how individuals behave in online environments, specifically in relation to their performance when working on specific tasks. Previous work showed that artificial memetic networks (AMN) can be used to model social behavior and provide explanations for observed human cooperative problem solving (Araújo and Lamb, 2008; Araújo, 2010). However, research on AMN has focused on computer simulations rather than on human agents in controlled environments or in natural settings, such as online social networks like Facebook or Google+. Adding to Araújo (2010), we perform experiments modeled after AMN with the following objectives:

  • To model online human behavior experiments according to artificial memetic networks (AMN);
  • To identify potential factors that affect human behavior in online environments using AMN and perform experiments to provide empirical evidence; and
  • To apply the modeled factors to a new experiment and demonstrate how we can affect human behavior in online environments.

 

Meme evolution paths distant by L, but converging towards the solution
 
Figure 9: Meme evolution paths distant by L, but converging towards the solution. L is a measure on the meme space and represents homophily intensity: the symbol L represents homophily between agents, measured by the distance between their strategies. In this visualization, homophily is not dynamic over time. Highlighted are paths that visually show the continuity-of-traits behavior (see Lemma 0.1). In green is the solution of the problem, which after a few minutes was the only popular solution.

 

Therefore, our first goal is to provide a mathematical foundation for AMN in the context of online problem-solving social networks, one that can be used unambiguously, compared with other models, and used to design experiments that can be performed and tested. Our goal will be fulfilled when we can isolate a property that can be used to compare our model with others and that can be found in the highest number of experimental or observable situations in social environments. Additionally, we want to design and perform experiments that are able to provide confidence in our model. We believe that our model is able to make more detailed behavioral descriptions that can pinpoint where theories diverge from observations and which phenomena can be attributed to social interaction or to environmental factors. The designed experiments must comply with the scenario we described earlier: low cost experiments that are able to motivate individuals to act and that generate large amounts of data.

 

Interface of the Social SAT Solver (S3)
 
Figure 10: Interface of the Social SAT Solver (S3).

 

Methodology

We start by defining memes, based on a set of constraints from existing definitions of memes. For instance, for a successful software architecture on digital computers through which we plan to perform social experiments, data types must be bounded and compact enough that we can fit a high number of variables of a given data type into our data store. Therefore, memes should be stored within a few bytes. Moreover, computations on the data types — for instance, the sequence that describes one meme’s mutation into another as time passes — have to be well defined. With these restrictions related to software implementation, we know that we have provided concrete definitions for memes, allowing implementation in an experiment.
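For instance — a sketch of the storage constraint only, with an illustrative record layout rather than our actual schema — a Sudoku meme fits in a fixed-width record of under a hundred bytes:

    import struct

    # Illustrative fixed-width record: agent id (4 bytes), timestamp (8 bytes),
    # 81-character Sudoku board string ('0' for blanks) => 93 bytes per meme,
    # so millions of meme mutations fit comfortably in a modest data store.
    MEME_FORMAT = "<Iq81s"

    def pack_meme(agent_id: int, timestamp_ms: int, board: str) -> bytes:
        assert len(board) == 81
        return struct.pack(MEME_FORMAT, agent_id, timestamp_ms, board.encode("ascii"))

    record = pack_meme(42, 1_700_000_000_000, "0" * 81)
    print(struct.calcsize(MEME_FORMAT))  # 93 bytes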

To assess our model, based on AMN, we will perform experiments in connected environments in three different settings:

  • Experiment Phase I: we analyze which factors are important in objective truth problem solving. Individuals were challenged to solve SAT and Sudoku instances.
  • Experiment Phase II: we test whether the results found during Phase I hold even with subjective truth, specifically in public policy polling.
  • Experiment Phase III: we perform experiments that provide financial incentives to individuals and generalize the results found in Phases I and II.

Experiment Phase I: SAT and Sudoku

For Phase I, we planned several experiments in which human agents were able to communicate in a controlled way and were asked to solve a mathematical challenge. A first group of individuals solved SAT (Boolean satisfiability problem) instances. Instances of SAT have been studied extensively, and their solution space and equivalence to other problems are well known, making it easier to compare with agents solving other, equivalent problems. Another reason to select SAT instances was that we wanted to analyze the performance of artificial memetic networks solving SAT instances individually or in groups, aiming to possibly find new applications for collaboration.

A second group tried to solve Sudoku instances. Sudoku is a game mathematically equivalent to SAT and thus suited our need for a problem that can be compared more objectively regarding number of variables and (intuitive) difficulty. Moreover, it naturally provided motivation to individuals, diminishing the effect of individuals having interests other than winning the game.

Both problems, SAT and Sudoku, were used within a range of network settings, i.e., a varied number of agents, topologies and problem instance characteristics, in order to reduce bias from our experimental setup. We will have met our goal if any behavioral pattern observed in the first experiment (SAT) can also be found in the second (Sudoku, on Facebook), showing that artificial memetic networks offer predictive power while also providing results that can be generalized to other settings.

The experiments were performed with two online Web applications developed by our research group. Both applications allowed collaborative solving of SAT and Sudoku instances. From each computer, a subject accessed the server through a Web browser to participate in the experiment. The server was configured to allow users’ solutions to be seen by other users according to a selected topology; no other form of communication between participants was allowed. Moreover, the position of each neighbor’s solution in this area was randomized before the solutions were presented on the screen, and then fixed until all instances were solved by network members. When a subject submitted a solution, it became available to his or her neighbors on their respective screens. A given user had no option over sharing his or her solution. A button gave a given user the opportunity to copy any solution of a neighbor. Solutions could be changed before being submitted again. Figure 10 shows a screenshot of the Social SAT Solver.
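Since the central claim of this paper is that the order in which options are encountered biases decisions, the fixed random display order is the detail worth pinning down. A minimal sketch of how it might be done (hypothetical function name, not our server code):

    import random
    from typing import Dict, List

    def assign_display_order(neighbor_ids: List[str], seed: int) -> Dict[str, int]:
        """Randomize the on-screen position of each neighbor's solution once,
        then keep it fixed for the whole session: a deterministic seed makes
        the search sequence stable across page refreshes."""
        order = list(neighbor_ids)
        random.Random(seed).shuffle(order)
        return {nid: slot for slot, nid in enumerate(order)}

    # Example: the same seed always yields the same (random) fixed layout.
    print(assign_display_order(["a1", "a2", "a3", "a4"], seed=7))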

For the experiment with Sudoku (Figure 11), users accessed a Facebook app where, unlike the SAT experiment, they could potentially join the experiment at will. Hence, the number of participants in each experiment fluctuated.

 

User interface for the Facebook Sudoku experiments
 
Figure 11: User interface for the Facebook Sudoku experiments.

 

It is known that the topology of a social network has a significant impact on the behavior of individuals and of the group while solving problems (Araújo and Lamb, 2008; Kearns, et al., 2006). In our work, we define several topologies for the experiments. We use two ring networks (Newman, 2010) with different neighborhood sizes, 4 and 6, where subjects are arranged in a regular ring. Moreover, a scale-free network (Barabási, 2002) is used, where the degree distribution follows a power law with a large tail, allowing for the existence of highly connected subjects (hubs).
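These are standard topologies; a sketch of how they could be generated with the networkx library follows. The Barabási-Albert generator approximates a scale-free network, though its power-law exponent is fixed near 3 rather than the γ = 1.65 mentioned in Figure 12, which would require a configuration-model variant:

    import networkx as nx

    # Ring lattices where every subject has exactly K neighbors (K = 4 and 6):
    # watts_strogatz_graph with rewiring probability 0 is a regular ring.
    ring_k4 = nx.watts_strogatz_graph(n=20, k=4, p=0.0)
    ring_k6 = nx.watts_strogatz_graph(n=20, k=6, p=0.0)

    # A scale-free network via preferential attachment; its heavy tail
    # allows hubs with many more neighbors than the ring's fixed K.
    scale_free = nx.barabasi_albert_graph(n=20, m=2)

    for g, name in [(ring_k4, "ring K=4"), (ring_k6, "ring K=6"), (scale_free, "scale-free")]:
        degrees = [d for _, d in g.degree()]
        print(name, "max degree:", max(degrees))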

In order to perform a general analysis of the results, one has to consider variations across experiment runs or, alternatively, accept that the analysis applies to fewer social network setups. Also, variation in the number of human subjects participating in each experiment represents more closely what happens in non-controlled online social networks.

The SAT and Sudoku (Facebook) experiments were performed within laboratories at our universities. Before the start of each session, a short explanation of the problem was given to the subjects. In the case of the Facebook experiment, the app was therefore only available to a restricted set of students. During the SAT experiment, we introduced the SAT problem, its relevance to computer science and artificial intelligence, and how subjects should use the interface to create solutions. Instructions on how to visualize their peers’ solutions — when available — were presented.

 

Two network topologies were used in our experiments
 
Figure 12: Two network topologies were used in our experiments. The topology on the left is a scale-free network, for which we use γ = 1.65; hubs have a higher number of neighbours. The topology on the right is a ring network where K sets the number of neighbours each peer has. All peers have the same number of neighbours in the ring topology.

 

In order to encourage participation, we announced that a global ranking would be published, with a prize awarded to the best solvers and the value of the prize increasing with the solver’s rank. The ranking mechanism provided an incentive for those subjects not otherwise interested in solving instances alone. When all participants were ready, the system allowed them to start working on the selected problems. The experimental subjects consisted of groups of first and second year computer science undergraduates. Students had a basic acquaintance with propositional logic. No subject was knowledgeable about search algorithms or SAT or Sudoku solving techniques.

A total of 30 meetings were planned with students, providing 30 experiments.

Experiment Phase II: Public policy poll

Our goal was to verify whether the results from Phase I with Sudoku and SAT held even in the context of the non-objective problem of public policy polling.

In problem solving, it is often the case that the problem being solved has no objective answer, and the answer often depends on the culture of the group. For instance, there is an objective answer to “What is the result of 1+1?”, but the answer to “What is the best football team?” will depend on the local culture, whether locality is defined by geographic features, a political map, ethnicity, etc.

In this new experiment, we shared a hyperlink on Facebook through which users accessed a Web tool to estimate how public policies affect the bus fare in the city of Porto Alegre [4]. As users clicked each policy item, an estimated value for the new bus fare was shown on the screen, as well as the individual value of each policy chosen. Therefore, the time to evaluate each option was greatly reduced, as we showed the summed components of the bus fare. The set of options was chosen to represent what we perceived to be the opinions of left and right wing individuals, as well as more centrist positions. Each policy item therefore had a monetary value and a subjective political weight (from left to right wing, as perceived by us) associated with it. See Figure 13 for a screenshot of the Web site.
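The fare arithmetic behind the tool is simple. This sketch uses made-up policy items and values; the real items and amounts were specific to Porto Alegre's fare structure:

    # Hypothetical policy items mapped to their effect on the fare, in R$.
    POLICY_EFFECTS = {
        "subsidize fuel": -0.15,
        "free transfers": +0.20,
        "remove fare collectors": -0.25,
        "expand night service": +0.10,
    }
    BASE_FARE = 3.05  # illustrative base value, not the actual fare

    def estimated_fare(selected: set) -> float:
        """New fare = base fare plus the summed effect of each selected policy,
        mirroring the running total shown at the bottom of the page."""
        return round(BASE_FARE + sum(POLICY_EFFECTS[p] for p in selected), 2)

    print(estimated_fare({"subsidize fuel", "free transfers"}))  # 3.10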

 

 
Figure 13: Bus fare hike experiment interface: users can hover the mouse over each policy item to see the difference in price that the item contributes to the final bus fare. At the bottom of the page, the selected items are summarized as a final bus fare. The selected fare could be shared on Facebook, both to stimulate more access to the Web site and to compare how users who share information behave relative to those who do not.

 

While the Facebook Web site was live, users were allowed to share information with their social network, much as memes are exchanged in the artificial memetic network (AMN) model. They did so by clicking a Share button or, in our experiment, a similar-looking button labeled “compartilhar a sua proposta” (“share your proposal”). All clicks and Facebook shares were recorded, along with geographical information about the users accessing the Web site. This information was used to separate users who were exploring options by clicking on each one, much like the introspective action I described in our model, from users who were exchanging information with the medium in an extroverted action E.

Experiment Phase III: Labor market

The experiments in Phases I and II provided incentives that were non-financial, i.e., the work users performed by seeking solutions and spending time on problems was rewarded with gains other than money or other financial instruments.

To extend our results to Internet applications that aggregate work, we analyzed data from experiments performed in cooperation with Cornell University. The original intent of the project was to analyze how pay rate affects workers on Amazon Mechanical Turk; however, we were able to use the generated data for our current goals.

Amazon Mechanical Turk (AMT) is a Web site where requesters post tasks such as “Transcribe an audio file” or “Tag five images”. Each task is called a HIT in AMT, and individuals, known as workers, can agree to work on tasks and receive a reward set by the requester if they complete the work. A requester can post as many HITs as he or she wants, and workers can accept as many tasks as they want. It is up to the requester to decide whether a task was satisfactorily completed; if so, the worker receives the reward, set by the requester and paid through AMT. The requester is responsible for buying credits from AMT before posting any task, which assures that the worker is eventually paid.

There are many sites like AMT; we call this class of application an online labor market, where workers are paid for completing tasks posted online. In the case of AMT, these tasks are identified as microtasks, even though they may require a few hours to complete.

Before accepting a task, a worker can review details about the conditions surrounding it, such as the amount of time allotted to complete the task. These fields are described in Figure 14. The most important fields in our analysis are title, description, reward and allotted time.

 

 
Figure 14: A single HIT detailed: HITs are described by a Title (top line with text), requester name, expiration date (after which the HIT becomes unavailable for workers), a reward (in U.S. dollars), time allotted, which is the amount of time a worker has to complete the task, the number of HITs available (of similar tasks), a description of the task, keywords and qualifications that a worker needs to accept the task.

 

Upon accessing AMT’s site, a worker is presented with a list of available tasks with summary descriptions (Figure 15).

 

 
Figure 15: Amazon Mechanical Turk Web site interface: the initial screen of AMT shows a list of HITs available to workers. Before accepting, a worker can examine details of the HIT. It is very important to notice that the default sorting for HITs is HITs Available (most first) (at the top left corner of the screen). This default sorting has an important impact on the dynamics of workers on AMT. Also, the term HIT Group is used to aggregate tasks with the same description, though there may be small differences among the tasks.

 

As the number of HITs for each task is shown, we can periodically sample each row of the list in Figure 15 to measure how many HITs workers complete per unit of time, indicating workers’ preferences and, together with the other parameters, which factors affect those preferences.
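A sketch of this computation, assuming we already hold periodic snapshots of the list (the snapshot layout is our own illustration, not an AMT data structure):

    # Sketch: estimating HIT uptake rates from periodic snapshots of the list.
    # Each snapshot maps a HIT group id to the number of HITs still available;
    # this layout is our own illustration, not an AMT data structure.
    def uptake_rates(snapshots, interval_minutes=5.0):
        rates = {}
        for before, after in zip(snapshots, snapshots[1:]):
            for group, available in before.items():
                if group not in after:
                    continue                      # group expired or was removed
                consumed = available - after[group]
                if consumed > 0:                  # ignore groups that grew
                    rates.setdefault(group, []).append(consumed / interval_minutes)
        return {g: sum(r) / len(r) for g, r in rates.items()}

    snaps = [{"g1": 500, "g2": 40}, {"g1": 480, "g2": 40}, {"g1": 455}]
    print(uptake_rates(snaps))  # {'g1': 4.5}: (20 + 25) HITs over two 5-minute gaps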

 

++++++++++

Results

We were able to identify a common pattern of aggregation behavior across all experiments. Surprisingly, on a first analysis, agents were more strongly affected by the interface than by the quality of the solutions that could be exchanged with their peers. In the following sections, we explain the results achieved in each experiment and how our conclusions evolved; partial results were published earlier (Farenzena, et al., 2010; Farenzena, Araújo, and Lamb, 2011b; Farenzena, Araújo, and Lamb, 2011a). We begin with the social experiments scheduled and executed in the first step of our series, along with their analysis.

Experiment Phase I: Social experiments

Our first experiment, published in Farenzena, Araújo, and Lamb (2011a), comprised 42 runs with the participation of 160 human subjects.

As the time to collectively solve each instance differed across groups of agents (from 2 to 35 simultaneous agents in the network, with 125 different agents in total throughout all experiments) and across topologies, participants did not all solve the same number of instances in each experiment run.

For the SAT experiment, solutions are presented in a row, while for Sudoku they are presented in a grid (Figure 11).

Thirty experimental sessions were planned for the SAT experiments, as described in Table 1. Out of the 30 planned experiments, 14 were scheduled and eight were considered successful. Some could not be scheduled, largely due to overlapping academic activities such as exams and fairs. Scheduled experiments failed mainly due to software bugs, hardware or infrastructure failures, or a poor set of SAT problem instances (discovered while performing the first set of experiments).

 

Table 1: Schedule for the collaborative problem-solving SAT experiments.
Date | Time | Type | Results | Comments
23 Sept 2009 | 10:30 AM | Individual | Negative | Server hardware failure.
30 Sept 2009 | 10:30 AM | Individual | Positive | SAT instances proved to be too easy.
6 Oct 2009 | 3:30 PM | Individual | Positive | Individuals asked to fill in a report.
7 Oct 2009 | 10:30 AM | Individual | Negative |
8 Oct 2009 | 3:30 PM | Individual | Negative | Additional SAT instances were not ready.
9 Nov 2009 | 8:30 AM | Individual | Positive | Network hardware failure, but still worked.
9 Nov 2009 | 10:30 AM | Individual | Positive |
10 Nov 2009 | 3:30 PM | Group | Negative | Software bug.
11 Nov 2009 | 8:30 AM | Group | Negative | Software bug.
11 Nov 2009 | 10:30 AM | Group | Negative | Software bug.
12 Nov 2009 | 3:30 PM | Group | Positive | Ring N = 26; k = 4 and k = 6.
18 Nov 2009 | 10:30 AM | Group | Positive |
8 Dec 2009 | 10:30 AM | Group | Positive | Software bug, but still worked.
14 Dec 2009 | 10:30 AM | Group | Positive | Ring k = 4.

 

Next, we summarize the data collected for the SAT (Table 2) and Sudoku (Table 3) experiments.

 

Table 2: Experiment runs for SAT instances.
Note: * Total exchanged solutions ratio excludes solutions from disconnected topology.
Topology | Experiments (instances) | Variables | Clauses | Agents | Solutions (% copied)
Disconnected | 16 (16) | 3 to 26 | 5 to 63 | 1 | 25
Ring (K = 4) | 4 (3) | 4 to 8 | 16 | 16 to 19 | 335 (13%)
Ring (K = 6) | 3 (2) | 8 to 10 | 12 to 16 | 22 to 31 | 268 (14%)
Scale free (γ = 1.65) | 11 (9) | 4 to 26 | 8 to 63 | 1 to 32 | 1,179 (10%)
Total | 34 (24) | 3 to 26 | 5 to 63 | 1 to 125 | 7,630 (11%)*

 

 

Table 3: Experiment runs for Sudoku instances.
Topology | Experiments (instances) | Agents | Solutions (% copied)
Ring (K = 4) | 2 (2) | 2 to 5 | 533 (25%)
Ring (K = 6) | 2 (2) | 17 to 20 | 11,114 (18%)
Scale free (γ = 1.65) | 4 (3) | 14 to 35 | 6,012 (87%)
Total | 8 (6) | 2 to 35 | 17,659 (42%)

 

In the Topology column we classify experiments by the topology used. The Experiments column gives the number of runs for each topology and, in parentheses, the number of different problem instances used; we reused some instances to assess whether subjects could recognize repeated instances (only two percent were able to do so). The Agents column gives the number of human subjects that participated in the experiments (a range rather than a fixed number, because the number of subjects changed from one problem instance to another due to subjects’ availability). The Solutions column counts the distinct solutions subjects provided or copied from neighbors during all experiments for each topology; in parentheses we present the percentage of solutions copied. The Variables and Clauses columns are specific to Table 2 (SAT experiment).

 

 
Figure 16: Positional influence on user strategy: each point shows the percentage of times a solution at position k on the interface was copied, grouped by how many instances the copying agent had solved before that copy. ⟨X(k)⟩ is the probability mass function of the geometric distribution fitted to the data at the five percent significance level.

 

Our hypothesis was that the position of a peer’s solution would hardly matter, since agents would seek to copy the best solutions shown, evaluating each one. To test this hypothesis, we plotted the number of times a neighboring solution was copied as a function of its position on the screen for the SAT problem (Figure 16). This plot can be read as the probability of the kth solution being copied by a neighbor. With two possible outcomes (copy or not) over a discrete support, we can model it as a geometric distribution. We fitted the data to the probability mass function of the geometric distribution, where ⟨X(k)⟩ denotes the probability of an agent copying the kth neighbor solution, with parameter p = 0.5479:

⟨X(k)⟩ = p(1 − p)^(k−1), with p = 0.5479        (9)

As it turned out, the frequency with which a neighbor’s solution is copied decreases the farther to the right it appears in the solutions row. Agents were therefore picking the closest solution on the interface instead of using the best solutions available to make their choices. This suggests that subjects select the most readily available solution, which may not be the best solution to a given problem. In Table 2, we show that the number of neighbors could be as high as 20 (for scale-free networks), which implies that agents had up to 20 solutions available. However, almost all solutions copied at least once were among the first six positions, even though agents had access to far more solutions. That is, even when participants were not able to improve their solution and chose to copy a neighbor’s instead, they still would not go over all of the neighbors’ solutions to find the best one, choosing instead to copy a seemingly random solution from the pool.
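The fit itself is straightforward; a minimal sketch with placeholder counts (not the experimental data) follows. Since scipy provides no fitter for discrete distributions, we use the closed-form maximum-likelihood estimate.

    # Sketch: maximum-likelihood fit of a geometric distribution to copy
    # positions. The counts below are placeholders, not the experimental data.
    import numpy as np
    from scipy import stats

    positions = np.array([1] * 55 + [2] * 25 + [3] * 12 + [4] * 5 + [5] * 3)

    p_hat = 1.0 / positions.mean()    # closed-form MLE: p = 1 / mean(k)
    print(f"fitted p = {p_hat:.4f}")

    k = np.arange(1, 7)               # <X(k)> = p(1 - p)^(k - 1)
    print(stats.geom.pmf(k, p_hat))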

Next, we performed the same analysis for the Sudoku experiment (estimating the probability of a neighbor’s solution being copied as a function of its position on the interface) and found a similar pattern; here we fitted a negative binomial distribution. To assess fit, we first tested the collected data against the null hypothesis of a normal distribution with the Anderson-Darling test and rejected the null hypothesis at the five percent significance level. Comparing the modeled geometric distribution with the sampled data in a two-sample Kolmogorov-Smirnov test, we rejected the null hypothesis that the two samples come from the same distribution, with the exception of the vertical axis of the Sudoku experiment, where the null hypothesis was not rejected.
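Both checks are available in scipy; a sketch on synthetic stand-in data (not the observed copy positions):

    # Sketch of the two goodness-of-fit checks, on synthetic stand-in data.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    observed = rng.geometric(p=0.55, size=500)   # stand-in for copy positions

    # (1) Anderson-Darling test against the null hypothesis of normality.
    ad = stats.anderson(observed, dist="norm")
    print(ad.statistic, ad.critical_values[2])   # index 2 = 5% critical value

    # (2) Two-sample Kolmogorov-Smirnov test: observed data versus draws
    # from the fitted geometric distribution.
    fitted = stats.geom.rvs(p=1.0 / observed.mean(), size=500, random_state=1)
    print(stats.ks_2samp(observed, fitted).pvalue)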

A novel model for cooperative problem-solving

Data analysis demonstrates that there is a decreasing probability of a neighbor’s solution being copied based on the solution’s position. Since positioning is randomized, it is unlikely that the first few positions always hold the best solutions. This evidence may seem contradictory if we assume rational agents, and it indicates non-deterministic interaction behavior, since agents do not evaluate every neighbor’s solution and may not select the best ones.

However, in an informational natural selection model, it is not necessary for agents to evaluate each neighbor’s solution. This apparently random behavior is explained by the following proposed procedure for human cooperation. An agent has his or her own problem to solve and tries to do so by proposing solutions and receiving feedback from the environment. The environment is a virtual problem-solving activity where solutions are objectively evaluated by a computer. Thus an agent can “navigate” through solutions using only this feedback, without a need for external sources such as peers’ shared solutions. Even with varying degrees of problem-solving skill, all agents can solve the problem instances alone, as verified in the first part of the experiment, where all agents correctly solved problem instances in the absence of cooperation.

In that sense, with varying skills, all agents individually create solutions aiming at reaching a correct solution. If, for instance, an agent opts to copy a peer’s solution, her own solution is completely replaced by the peer’s, and the agent continues to improve upon this solution, even if the copied solution was not strictly better than the original. If we take the average distance of all agents’ solutions to the correct solution, we find that it decreases with time, as all agents are able to solve the problem. Thus, even with a non-deterministic choice of neighbor for cooperation, an agent achieves the correct solution sooner or later, as it keeps modifying a given solution towards the correct one.

In the case where a specific agent is surrounded by agents that share the very same solution, when the agent chooses to cooperate (i.e., exchange solutions), the probability that this solution will be copied is 1, which explains conformist behavior as treated in Efferson, et al. (2008). In a non-virtual environment, the feedback of the environment can cause an agent to stop communicating, e.g., in the case of injury or death. In this case, the agent cannot communicate his solution to a problem, and the probability of this agent’s solution being selected and copied is effectively zero. In conclusion, copying is a justified behavior and necessary for the survival of the group, as it only happens between agents able to do so — presumably the ones still alive to communicate their solutions.

Finally, for problem-solving social networks, cooperative behavior cannot be described by a model with deterministic rational agents. Our results provide evidence that solutions were not being evaluated. Indeed, in a real environment agents would not need to be rational at all: with many agents, many random solutions are tested against the environment; bad solutions could result in an agent halting communication, while good solutions would remain available, giving other agents further chances to copy them and survive. Thus, our results show that, in our settings, cooperation works because agents can copy each other.

However, even though randomly acting agents could survive in a real environment, in our experiments agents do not change their solutions randomly, since we know they can solve problems by themselves. Additional evidence that agents are not acting randomly is provided by the feedback in our experiments: there is no possibility that a bad solution will cause a given agent to halt communication. Therefore, agents need other mechanisms, such as reasoning, to evolve their own solutions.

Mapping back to our original mathematical framework and referencing Definition 0.5, we initially assumed the existence of the function eval, also used in modified versions of the aggregation step. As each option, regarded in our model as a trait τ, is evaluated, its chance of being copied decreases. We hypothesize that Phase I provides evidence that eval has an internal state influenced by previous evaluations. One might suspect an influence of time; however, every time an individual chose to evaluate the solutions of peers, even after several repeated experiments, the evaluation for the first solution remained at the same level. Therefore, we assume that the internal state is not time, but the cost of having evaluated the other solutions τ₋ᵢ before:

eval(τₖ ∣ τ₁, …, τₖ₋₁) ∈ ℝ        (10)

We have mapped the eval function to the real numbers ℝ for simplicity. We could choose any function with an order defined over the domain T, i.e., the domain of all traits.
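A toy illustration of such a stateful eval (our own sketch of the idea, not the paper’s formal definition): every evaluation accrues a cost that discounts later options, so the first option examined tends to win regardless of quality.

    # Toy sketch (ours, not the formal definition): an eval whose internal
    # state is the accumulated cost of prior evaluations.
    class StatefulEval:
        def __init__(self, unit_cost=0.6):
            self.unit_cost = unit_cost
            self.examined = 0        # internal state: options already evaluated

        def __call__(self, quality):
            score = quality - self.unit_cost * self.examined
            self.examined += 1
            return score

    ev = StatefulEval()
    scores = [ev(q) for q in [0.4, 0.9, 0.95]]   # later options are better...
    print(scores)                                # approximately [0.4, 0.3, -0.25]
    print(max(range(3), key=scores.__getitem__)) # ...yet option 0 wins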

Peer number cooperation limit

There is a quantitative limit to the number of peers with which a subject will interact in computational social problem-solving. This number was close to six in our experiments. This result suggests that, when choosing a topology for a problem-solving social network, one may limit node degree to a relatively small maximum, leading to a sparse network that may be easier to handle. Although we do not generalize the limits found in our experiments to other problem-solving social networks, the social network experimentation methodology from this work can be used to find this maximum for any other given problem. For instance, in a hierarchical organization, each working group probably has a maximum effective number of members, which may be determined with our methodology, i.e., by collecting the communication or solution data exchanged between participants and modeling it with a geometric distribution. This is relevant to collaborative work and collective intelligence applications (Bran, et al., 2009; Nowak, 2006; Woolley, et al., 2010).

Capability to forget

Around 5.31 percent (±6.93) of all solutions used by agents had, at some point, already been provided by those same agents. Consequently, agents do forget some of their previous attempts to solve the problem; we cannot model the problem as if agents were unable to lose traits.

We proposed a novel model that describes how humans interact when performing computation in a social network through informational natural selection. From experiments with networks of human subjects, we provide empirical evidence of the emergence of cooperation.

Experiment Phase II: Public policy poll and Internet-wide protests

Following our model of agent interaction from Equation 9, we plotted the same kind of graph, analyzing how people are affected by the position of options on the application interface. Each element k on the screen receives successively less interaction from the user, as seen in Figure 17; we write r(k) for the rate of interaction with the element at position k. The type of interaction a user may perform is rather broad and is specific to what users are allowed to do in each experiment.

r(k) = p(1 − p)^(k−1)        (11)

 

 
Figure 17: Positional influence of public policy item: each layout presents the distribution of element exploration by users. The elements are linearly ordered, vertically.

 

As a consequence, whichever element is shown first on the screen, the user will interact more with it. As we observed in our experiments, if the experiment is a poll, the user has a high probability of picking the first option and a probability close to zero of picking the last one. Therefore, choosing the order of elements on the screen can determine the result of a poll, assuming r(k) is sufficiently small. To avoid influence by the layout, and thus remove the undesired behavior that pollutes the collected data with type 2 users, we propose randomizing the elements of the application layout.

When we randomize the layout, suppose that each layout is displayed ξ(0) = a times with order of elements A, ξ(1) = b times with order of elements B, and so on, where a + b + ⋯ = M and M is the total number of users interacting.
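A sketch of this control (our illustration, with placeholder items): every user sees an independently shuffled order, so a geometric positional bias no longer favours any particular item.

    # Sketch of the layout-randomization control, with placeholder items.
    import random
    from collections import Counter

    ITEMS = ["policy A", "policy B", "policy C", "policy D"]

    def simulated_vote(rng):
        order = ITEMS[:]
        rng.shuffle(order)           # an independent layout for each user
        # Positional bias: picking position k decays geometrically with k.
        weights = [0.55 * 0.45 ** k for k in range(len(order))]
        return rng.choices(order, weights=weights, k=1)[0]

    rng = random.Random(0)
    print(Counter(simulated_vote(rng) for _ in range(10_000)))
    # The tally is roughly uniform: positional bias cancels out across layouts.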

To verify our model, we applied this method to the poll experiment. Figure 17 shows the distribution of interaction over a single layout; all layouts exhibited the same pattern. However, if we uniformly randomize each order of options (thus presenting a different layout to each user), we see the results in Figure 18, representing voting without the influence of non-motivated agents.

 

 
Figure 18: Randomizing layout for policy items: by randomizing elements on the layout, we cancel out the effect of data from users who selected options according to a distribution unrelated to the value of the options as solutions in the problem domain.

 

Experiment III: Case with financial incentives

We sampled every task present on AMT, every five minutes on average, for two months uninterruptedly [5]. Table 4 summarizes the collected data.

 

Table 4: Data collected from Amazon Mechanical Turk.
 | Our study | Chilton, et al. (2010) | Ratio
Setup | All pages | 3 pages |
Total pages | 240 | 36 | 15.00%
Total time (hours) | 1,440 | 32 | 2.22%
Distinct HIT groups | 67,181 | 2,040 | 3.04%
Total HITs | 3,696,730 | 2,207,548 | 59.72%
Sum of reward (US$) | $531,313.18 | $232,051.15 | 43.68%

 

To compare AMT with our experiments, we compare the decay rate of the total number of tasks per HIT group with the decay of exchanged solutions in our experiments. Our goal is to show that no other factor is more important to worker activity than the order in which items (τᵢ) are evaluated. This is an outcome of AMT choosing the number of HITs available per group as its default sorting, once again allowing us to observe that searching is expensive for individuals, who may choose not to be picky and not to evaluate each option individually.

For our investigation, we performed a linear regression with ridge regularization over our data, using five explanatory variables (a sketch of the setup follows the list):

  • Number of HITs: total number of HITs per HIT group. This is the field that is sorted by default on AMT.
  • Alphabetical order: influence of the alphabetical order of titles.
  • Expiration time: how close the HIT group is to expiration, and thus to being removed from AMT’s list of available HITs.
  • Reward: the amount of money a worker collects upon completing and having the work approved by a requester.
  • Time allotted: the amount of time a worker has to complete the task.
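A sketch of this regression setup on synthetic stand-in data (our reconstruction; the real analysis ran over the sampled AMT observations):

    # Sketch of the ridge regression setup, on synthetic stand-in data.
    import numpy as np
    from sklearn.linear_model import Ridge

    rng = np.random.default_rng(0)
    n = 1_000
    # Columns: rank by number of HITs, alphabetical order, expiration time,
    # reward, and time allotted (in that order).
    X = np.column_stack([rng.integers(1, 5_000, n) for _ in range(5)])
    y = 1.2 * X[:, 0] + rng.normal(0, 1_000, n)  # uptake driven by default sort

    model = Ridge(alpha=1.0).fit(X, y)
    print(model.coef_)         # the first coefficient dominates
    print(model.score(X, y))   # R^2; on the real AMT data this was very low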

Our results, shown in Table 5, confirm that HIT uptake is influenced by the number of HITs available more strongly than by any other factor in our analysis. However, R² was very low, indicating that our model is incomplete.

 

Table 5: Linear regression analysis of AMT.
Note: Number of observations: 23,460; error degrees of freedom: 23,454; root mean squared error: 1.25e03; R-squared: 0.0103; adjusted R-squared: 0.0101; F-statistic vs. constant model: 48.7; p-value = 2.61e-50.
 | Estimate | Standard error | t-statistic | p-value
(Intercept) | -345.46 | 29.462 | -11.726 | 1.1553e-31
Rank by number of HITs | 1.2442 | 0.11187 | 11.122 | 1.1555e-28
Rank by alphabetical order | 0.061772 | 0.011668 | 5.2943 | 1.2055e-07
Rank by expiration time | -0.016155 | 0.0097449 | -1.6578 | 0.097375
Rank by reward | -0.0059581 | 0.0089948 | -0.66239 | 0.50773
Rank by time allotted | 0.046183 | 0.0083148 | 5.5543 | 2.8178e-08

 

Assuming that requesters are also aware that being on the first page of AMT is a key factor in attracting workers, we can state that AMT is being gamed by requesters. The first page does not actually represent the tasks with the highest number of real tasks available: most of these tasks are fake, created only to inflate the total number of HITs available for a given task.

The idea that being amongst the first HIT groups shown on AMT’s first page increases the consumption rate of HITs, also known as the uptake rate, might not be new. Indeed, under that assumption, we should adjust our linear regression analysis by removing any HIT group that sits at the first position in any sorting method. We performed a new analysis without the first-position HIT groups and found the results noted in Table 6.

 

Table 6: Linear regression analysis of AMT, corrected.
Note: Number of observations: 23,142; error degrees of freedom: 23,136; root mean squared error: 579; R-squared: 0.0167; adjusted R-squared: 0.0165; F-statistic vs. constant model: 78.8; p-value = 2.95e-82.
 | Estimate | Standard error | t-statistic | p-value
(Intercept) | -183.87 | 13.825 | -13.299 | 3.2764e-40
Rank by number of HITs | 0.78528 | 0.052312 | 15.012 | 1.069e-50
Rank by alphabetical order | 0.030653 | 0.005438 | 5.6368 | 1.7521e-08
Rank by expiration time | -0.016715 | 0.0045583 | -3.6669 | 0.0002461
Rank by reward | -0.0081355 | 0.0042198 | -1.9279 | 0.053878
Rank by time allotted | 0.022 | 0.0038769 | 5.6745 | 1.4078e-08

 

With these improved statistics, the expiration time and time allotted explanatory variables reach confidence levels of p < 0.05. However, R² levels remain low.

By sampling data from the AMT Web site and performing a linear regression over the collected data, we have shown that AMT behaves in ways similar to our previous experiments. Once again, variables usually taken as the most important factors, such as reward and time allotted, matter less than interface factors, such as the position of tasks on the screen. However, although our analysis yielded safe p-values, R² remained low, and thus other factors still have to be taken into consideration.

 

++++++++++

Discussion and conclusions

This paper initially assumed that the connected nature of online applications could serve as a basis for explaining human online activities. As our first experiments were conducted, we realized that, when individuals solve microtasks, the cost of searching and evaluating is a strong component of how they make decisions in connected environments. Indeed, the experiments provided evidence that cooperative behavior was strongly biased by the high costs of searching and evaluating relative to the gains of analyzing each solution individually. We demonstrated that individuals performing tasks in online environments face search and evaluation costs comparable to the gains from cooperation, and thus reduce costs by searching less and spending less time evaluating options. The result holds even as individuals gain experience by performing the experiments repeatedly over a few hours, showing that they do not learn and modify their behavior, and that their behavior is not a consequence of purely exploratory use of the applications they used to solve tasks.

In similar experiments where individuals were challenged to solve mathematical problems (Judd, et al., 2010), the results could possibly be explained by specific interface features, such as the way elements are organized in the layout.

To deepen our investigation, we extended our study with a new round based on public policy polling, which became known as Experiment II. This task had no objective truth, although each option had an intrinsic financial value that could potentially affect future savings. Moreover, the layout was displayed vertically. Once again, we identified the same behavior, i.e., the chance of a vote for a specific public policy decreased the later it was shown among the options.

Experiments I and II were performed with volunteers. One could argue that the lack of financial incentives decreased the gains for the effort of each participant. To investigate this possibility, we examined an application where individuals receive financial incentives, Amazon Mechanical Turk. This experiment was not controlled; we only collected data. First, we showed that individuals are heavily affected by search and evaluation costs, even when, by searching and evaluating properly, they would be able to find tasks with higher rewards. Most importantly, by gaming the market and moving tasks to the first positions of the default task display, we showed that we can affect, as our model predicts, the decision-making process of individuals.

Therefore, it is important to consider search and evaluation costs in performing microtasks. This paper presented a mathematical description of traits based on artificial memetic networks and showed how the evaluation function of the aggregation step behaves in many settings, shedding light on problem-solving social networks.

The costs of search and evaluation during problem solving obey the same power law across all experiments, although with different parameters. For the case of financial incentives, the model did show that optimizing search costs is more important than financial incentives, but the error indicated that the model needs more explanatory variables, or an improved structure, to provide a more complete explanation of gains and costs in a labor market.

The experiments with Mechanical Turk showed a low R² value, which directs us to explore more properties of AMT.

We can point to consistent results across experiments, but not without limitations. First, we applied the meme concept to small SAT and Sudoku solutions and suggested that they can be compared with a given function, such as the Levenshtein distance. However, this comparison would not be tractable for other non-continuum problem spaces, such as valid binary representations of programs, or if we chose a distance whose computational complexity grows quickly with the size of the input. Therefore, we considered, in our experiments, cases where the solution has a small representation. For instance, a SAT solution with 64 variables can be represented in 64 bits (or 8 bytes in most architectures), while a Sudoku solution can take up to 299 bits. If we tried to apply our meme model to natural language problems, such as interpreting human conversation or books, the input size could span from bytes to several megabytes.
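For reference, the comparison function in question can be computed as follows; this is the standard dynamic-programming Levenshtein distance, applied here to two hypothetical eight-variable SAT assignments.

    # Standard dynamic-programming Levenshtein distance between two
    # solutions encoded as bit strings; O(len(a) * len(b)) time.
    def levenshtein(a: str, b: str) -> int:
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            curr = [i]
            for j, cb in enumerate(b, 1):
                curr.append(min(prev[j] + 1,                 # deletion
                                curr[j - 1] + 1,             # insertion
                                prev[j - 1] + (ca != cb)))   # substitution
            prev = curr
        return prev[-1]

    # Two hypothetical eight-variable SAT assignments:
    print(levenshtein("10110010", "10010011"))  # 2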

Our research was limited to microtasks and we have not extended our results to long-running problems. Although we have provided evidence that individuals do not change their behavior after repeated experiments, the total time a given individual worked from first problem instance to the last did not span more than one hour. Experiments that span days or maybe even months are required to understand the behavior of individuals in long-running problems. End of article

 

About the authors

Daniel Scain Farenzena is a software architect at Aurea. He holds a Ph.D. in computer science from the Federal University of Rio Grande do Sul. His research interests include data science, artificial intelligence and social computing.
Direct comments to dsfarenzena [at] inf [dot] ufrgs [dot] br

Prof. Dr. Luís da Cunha Lamb is Professor and Dean of the Institute of Informatics at the Federal University of Rio Grande do Sul, Porto Alegre, Brazil. He holds a Ph.D. in computing science from Imperial College London (2000) and the Diploma of Imperial College, as well as an M.Sc. by research (1995) and a B.Sc. in computer science (1992) from the Federal University of Rio Grande do Sul. His research interests include logic in computer science and artificial intelligence, neural computation, social computing and computing in the physical and social sciences. Lamb holds an Advanced Research Fellowship from the Brazilian National Research Council CNPq.
E-mail: lamb [at] inf [dot] ufrgs [dot] br

Prof. Dr. Ricardo Matsumura de Araújo is a computer science professor at the Center for Technological Advancement of the Federal University of Pelotas (UFPel). He holds a Ph.D. and a master’s degree in computer science from UFRGS and lectures in the computer science and computer engineering undergraduate courses and in the graduate program in computer science. His research interests include machine learning, computational social science and data science.
E-mail: ricado [at] inf [dot] ufpel [dot] edu [dot] br

 

Notes

1. Also known as the game-theoretic approach.

2. In our model, the creation of new traits (or memes) is the result of a mutation process (Araújo and Lamb, 2008).

3. We performed experiments where agents were able to communicate in natural language, and not only exchange solutions to problems. However, those cases were excluded from our analysis in this paper, as the amount of data was not sufficient. Additional experiments are planned.

4. Nahon and Hemsley, 2013, p. 25.

5. Except for a few hours when computer infrastructure was not available.

 

References

G.D. Abowd and E.D. Mynatt, 2000. “Charting past, present, and future research in ubiquitous computing,” ACM Transactions on Computer-Human Interaction, volume 7, number 1, pp. 29–58.
doi: http://doi.org/10.1145/344949.344988, accessed 20 July 2016.

Alexa, 2016. “How popular is wikipedia.org?” at http://www.alexa.com/siteinfo/wikipedia.org, accessed 20 July 2016.

R.D. Araújo, 2010. “Memetic networks: Problem-solving with social network models,” Ph.D. dissertation, Universidade Federal do Rio Grande do Sul, Instituto de Informática, at http://www.lume.ufrgs.br/handle/10183/25515, accessed 20 July 2016.

R.D. Araújo and L.C. Lamb, 2008. “Memetic networks: Analyzing the effects of network properties in multi-agent performance,” AAAI ’08: Proceedings of the 23rd National Conference on Artificial Intelligence, volume 1, pp. 3–8, and at http://www.aaai.org/Papers/AAAI/2008/AAAI08-001.pdf, accessed 20 July 2016.

A.-L. Barabási, 2002. Linked: The new science of networks. Cambridge, Mass.: Perseus.

A. Birukou, E. Blanzieri, P. Giorgini, and F. Giunchiglia, 2013. “A formal definition of culture,” In: K. Sycara, M. Gelfand, and A. Abbe (editors). Models for Intercultural Collaboration and Negotiation. Advances in Group Decision and Negotiation, volume 6. Berlin: Springer-Verlag, pp. 1–26.
doi: http://doi.org/10.1007/978-94-007-5574-1_1, accessed 20 July 2016.

C.A. Bran, T. Malone, D. Lewis, and J. Burton, 2009. “The knowledge worker of the future,” OOPSLA ’09: Proceedings of the 24th ACM SIGPLAN Conference Companion on Object Oriented Programming Systems Languages and Applications, pp. 851–852.
doi: http://doi.org/10.1145/1639950.1640044, accessed 20 July 2016.

J. Bredin and D.C. Parkes, 2012. “Models for truthful online double auctions,” arXiv, at https://arxiv.org/pdf/1207.1360.pdf, accessed 20 July 2016.

L.B. Chilton, J.J. Horton, R.C. Miller, and S. Azenkot, 2010. “Task search in a human computation market,” HCOMP ’10: Proceedings of the ACM SIGKDD Workshop on Human Computation, pp. 1–9.
doi: http://doi.org/10.1145/1837885.1837889, accessed 20 July 2016.

R. Dawkins, 1989. The selfish gene. New edition. New York: Oxford University Press.

R.K. de Andrades, M. Dorn, D.S. Farenzena, and L.C. Lamb, 2013. “A cluster-DEE-based strategy to empower protein design,” Expert Systems with Applications, volume 40, number 13, pp. 5,210–5,218.
doi: http://doi.org/10.1016/j.eswa.2013.03.011, accessed 20 July 2016.

D. Easley and J. Kleinberg, 2010. Networks, crowds, and markets: Reasoning about a highly connected world. New York: Cambridge University Press.

C. Efferson, R. Lalive, P.J. Richerson, R. McElreath, and M. Lubell, 2008. “Conformists and mavericks: The empirics of frequency-dependent cultural transmission,” Evolution & Human Behavior, volume 29, number 1, pp. 56–64.
doi: http://doi.org/10.1016/j.evolhumbehav.2007.08.003, accessed 20 July 2016.

D.S. Farenzena, R.M. Araújo, and L.C. Lamb, 2011a. “Collaboration emergence in social networks with informational natural selection,” 2011 IEEE Third International Conference on Privacy, Security, Risk and Trust (PASSAT) and 2011 IEEE Third International Conference on Social Computing (SocialCom), pp. 555–559.
doi: http://doi.org/10.1109/PASSAT/SocialCom.2011.174, accessed 20 July 2016.

D.S. Farenzena, R.M. Araújo, and L.C. Lamb, 2011b. “Towards social problem-solving with human subjects,” Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence, pp. 2,798–2,799, and at http://www.ijcai.org/Proceedings/11/Papers/471.pdf, accessed 20 July 2016.
doi: http://doi.org/10.5591/978-1-57735-516-8/IJCAI11-471, accessed 20 July 2016.

D.S. Farenzena, L.C. Lamb, and R.M. Araújo, 2010. “Combining human reasoning and machine computation: Towards a memetic network solution to satisfiability,” Proceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, pp. 1,929–1,930.

E. Fehr and U. Fischbacher, 2003. “The nature of human altruism,” Nature, volume 425, number 6960 (23 October), pp. 785–791.
doi: http://doi.org/10.1038/nature02043, accessed 20 July 2016.

C. Floudas, H. Fung, S. McAllister, M. Mönnigmann, and R. Rajgaria, 2006. “Advances in protein structure prediction and de novo protein design: A review,” Chemical Engineering Science, volume 61, number 3, pp. 966–988.
doi: http://doi.org/10.1016/j.ces.2005.04.009, accessed 20 July 2016.

FoldIt, 2015. “Solve puzzles for science,” at https://fold.it/portal/, accessed 20 July 2016.

GalaxyZoo, 2015. “Classify galaxies,” at http://www.galaxyzoo.org/, accessed 20 July 2016.

A. Ghosh and P. Hummel, 2014. “A game-theoretic analysis of rank-order mechanisms for user-generated content,” Journal of Economic Theory, volume 154, pp. 349–374.
doi: http://doi.org/10.1016/j.jet.2014.09.009, accessed 20 July 2016.

J. Giles, 2005. “Internet encyclopaedias go head to head,” Nature, volume 438, number 7070 (15 December), pp. 900–901.
doi: http://doi.org/10.1038/438900a, accessed 20 July 2016.

S. Jain and D.C. Parkes, 2013. “A game-theoretic analysis of the ESP game,” ACM Transactions on Economics and Computation, volume 1, number 1, article number 3.
doi: http://doi.org/10.1145/2399187.2399190, accessed 20 July 2016.

S. Judd, M. Kearns, and Y. Vorobeychik, 2010. “Behavioral dynamics and influence in networked coloring and consensus,” Proceedings of the National Academy of Sciences, volume 107, number 34, pp. 14,978–14,982.
doi: http://doi.org/10.1073/pnas.1001280107, accessed 20 July 2016.

M. Kearns, S. Suri, and N. Montfort, 2006. “An experimental study of the coloring problem on human subject networks,” Science, volume 313, number 5788 (11 August), pp. 824–827.
doi: http://doi.org/10.1126/science.1127207, accessed 20 July 2016.

D.M. Kreps, 1990. A course in microeconomic theory. Princeton, N.J.: Princeton University Press.

M.D. Lillibridge, M. Abadi, K. Bharat, and A.Z. Broder, 2001. “Method for selectively restricting access to computer systems,” U.S. patent 6,195,698 B1 (27 February); version at http://www.google.com/patents/US6195698, accessed 20 July 2016.

R. Linton, 1936. The study of man: An introduction. Student’s edition. New York: D. Appleton-Century.

C. Lintott, K. Schawinski, S. Bamford, A. Slosar, K. Land, D. Thomas, E. Edmondson, K. Masters, R.C. Nichol, M.J. Raddick, A. Szalay, D. Andreescu, P. Murray, and J. Vandenberg, 2011. “Galaxy Zoo 1: Data release of morphological classifications for nearly 900,000 galaxies,” Monthly Notices of the Royal Astronomical Society, volume 410, number 1, pp. 166–178.
doi: http://doi.org/10.1111/j.1365-2966.2010.17432.x, accessed 20 July 2016.

T.W. Malone and K. Crowston, 1994. “The interdisciplinary study of coordination,” ACM Computing Surveys, volume 26, number 1, pp. 87–119.
doi: http://doi.org/10.1145/174666.174668, accessed 20 July 2016.

C.F. Manski, 1993. “Identification of endogenous social effects: The reflection problem,” Review of Economic Studies, volume 60, number 3, pp. 531–542.
doi: http://doi.org/10.2307/2298123, accessed 20 July 2016.

M.E.J. Newman, 2010. Networks: An introduction. New York: Oxford University Press.

M.A. Nowak, 2006. “Five rules for the evolution of cooperation,” Science, volume 314, number 5805 (8 December), pp. 1,560–1,563.
doi: http://doi.org/10.1126/science.1133755, accessed 20 July 2016.

H. Ohtsuki, C. Hauert, E. Lieberman, and M.A. Nowak, 2006. “A simple rule for the evolution of cooperation on graphs and social networks,” Nature, volume 441, number 7092 (25 May), pp. 502–505.
doi: http://doi.org/10.1038/nature04605, accessed 20 July 2016.

M.J. Osborne and A. Rubinstein, 1994. A course in game theory. Cambridge, Mass.: MIT Press.

D.J. Osguthorpe, 2000. “Ab initio protein folding,” Current Opinion in Structural Biology, volume 10, number 2, pp. 146–152.
doi: http://doi.org/10.1016/S0959-440X(00)00067-1, accessed 20 July 2016.

S.A. Paul, L. Hong, and E.H. Chi, 2012. “Who is authoritative? Understanding reputation mechanisms in Quora,” arXiv (17 April), at https://arxiv.org/abs/1204.3724, accessed 20 July 2016.

P.J. Richerson and R. Boyd, 2005. Not by genes alone: How culture transformed human evolution. Chicago: University of Chicago Press.

S. Suri and D.J. Watts, 2011. “Cooperation and contagion in Web-based, networked public goods experiments,” ACM SIGecom Exchanges, volume 10, number 2, pp. 3–8, at http://www.sigecom.org/exchanges/volume_10/2/SURI.pdf, accessed 20 July 2016.

L. von Ahn and W. Cathcart, 2009. “Teaching computers to read: Google acquires reCAPTCHA” (16 September), at http://googleblog.blogspot.com.br/2009/09/teaching-computers-to-read-google.html, accessed 20 July 2016.

L. von Ahn and L. Dabbish, 2008. “General techniques for designing games with a purpose,” Communications of the ACM, volume 51, number 8, pp. 58–67.
doi: http://doi.org/10.1145/1378704.1378719, accessed 20 July 2016.

L. von Ahn, M. Blum, N.J. Hopper and J. Langford, 2003. “CAPTCHA: Using hard AI problems for security,” In: E. Biham (editor). Advances in Cryptology — EUROCRYPT 2003. Lecture Notes in Computer Science, volume 2656. Berlin: Springer, pp. 294–311.
doi: http://doi.org/10.1007/3-540-39200-9_18, accessed 20 July 2016.

L. von Ahn, B. Maurer, C. McMillen, D. Abraham and M. Blum, 2008. “reCAPTCHA: Human-based character recognition via Web security measures,” Science, volume 321, number 5895 (12 September), pp. 1,465–1,468.
doi: http://doi.org/10.1126/science.1160379, accessed 20 July 2016.

A.W. Woolley, C.F. Chabris, A. Pentland, N. Hashmi, and T.W. Malone, 2010. “Evidence for a collective intelligence factor in the performance of human groups,” Science, volume 330, number 6004 (29 October), pp. 686–688.
doi: http://doi.org/10.1126/science.1193147, accessed 20 July 2016.

 


Editorial history

Received 30 May 2016; accepted 20 July 2016.


Creative Commons License
This paper is licensed under a Creative Commons Attribution 4.0 International License.

The cost of search and evaluation in online problem-solving social networks with financial and non-financial incentives
by Daniel Scain Farenzena, Luís da Cunha Lamb, and Ricardo Matsumura Araújo.
First Monday, Volume 21, Number 8 - 1 August 2016
http://firstmonday.org/ojs/index.php/fm/article/view/6751/5599
doi: http://dx.doi.org/10.5210/fm.v21i8.6751




