Effectiveness and user satisfaction in Yahoo! Answers
First Monday

by Chirag Shah



Abstract
Social question–answering services such as Yahoo! Answers (YA) are becoming highly prominent venues for online information seeking. While their immense popularity indicates their success, there is a need to measure their effectiveness and how satisfying the information they provide is to information seekers. To study these questions of effectiveness and user satisfaction, we collected a large amount of data from YA. We operationalized effectiveness as the amount of time elapsed between a question being asked and answered, and user satisfaction as the asker choosing an answer as satisfying. Using data mining, we show that the majority of questions on YA receive at least one answer within a few minutes; however, it takes longer to receive an answer that satisfies the asker. We also demonstrate that the sooner an answer appears for a question, the higher its chances of being selected as the best answer by the asker.

Contents

Introduction
Methodology
Analysis
Conclusion

 


 

Introduction

With emerging online information sources, information seeking behavior is changing. A variety of sources have emerged, including social or community question–answering (social Q&A) sites such as Yahoo! Answers (http://answers.yahoo.com/), AnswerBag (http://www.answerbag.com/), and WikiAnswers (http://wiki.answers.com/). A common and defining characteristic of these sites is that anyone can pose their information need on almost any topic as a question, and receive answers from the community of users belonging to that particular site (Harper, et al., 2008). In recent years, these services have become increasingly popular: Yahoo! Answers is reported to have more than 200 million users worldwide [1], with 15 million users visiting daily [2].

While social Q&A sites may be relatively new, asking questions of experts via an online form, known as digital reference, has been around for a while. This area is considered an online or virtual version of traditional reference services, where a patron can receive information from reference librarians or other experts (Mon, 2000). Several studies have compared such digital reference or expert Q&A services with social Q&A in order to identify their relative pros and cons (e.g., Shah, et al., 2008).

Rather than comparing with other forms of online Q&A services, in this paper we focus on understanding the effectiveness of and satisfaction with a typical social Q&A service, since it is considered to be one of the core research agendas in this area (Shah, et al., 2009). We choose Yahoo! Answers (YA) due to its scope and accessibility, examining its effectiveness in terms of retrieving information and providing satisfactory answers. Hence, we will address the following research questions:

  1. How quickly is a specific question answered?
  2. How quickly does a question receive an answer that satisfies the interrogator?
  3. What is the relationship between a satisfactory answer and its position (or rank) in a list of answers?

The remainder of the paper describes how we collected a large amount of data from YA, mining it to answer these research questions. We conclude with a few interpretations and implications of our findings.

 


Methodology

To answer our research questions, we performed data mining on a large set of data retrieved from YA. This section describes our methodology for data collection, as well as a basic description of the data.

Collecting the data

We used YA’s Application Programming Interface (API, at http://developer.yahoo.com/answers/) to collect question and answer data. However, the API imposes a daily request limit, so collecting a large amount of data in a short period of time is not possible. We therefore ran our data collection processes for more than two years (between Fall 2007 and Fall 2009) to create a corpus of reasonable size. This resulted in a collection of over 3,000,000 questions, with over 16,000,000 answers.
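To make the collection procedure concrete, the following is a minimal sketch of the kind of rate–limited crawler such a process implies. It is illustrative only: the endpoint URL, parameter names (appid, category_name, type, results, start), page size, quota value, and response shape are assumptions modeled on the YA API of the time, not details reported in this paper.

```python
import json
import time

import requests  # third-party HTTP client (pip install requests)

# Assumed values; the paper does not specify the exact endpoint or quota.
API_URL = "http://answers.yahooapis.com/AnswersService/V1/getByCategory"
DAILY_LIMIT = 5000        # hypothetical daily request quota
PAGE_SIZE = 50            # hypothetical maximum results per request

def collect_resolved(category_name, out_path, app_id="YOUR_APP_ID"):
    """Page through one category, saving only 'resolved' questions."""
    made, start = 0, 0
    with open(out_path, "a", encoding="utf-8") as out:
        while made < DAILY_LIMIT:
            resp = requests.get(API_URL, params={
                "appid": app_id,              # placeholder credential
                "category_name": category_name,
                "type": "resolved",           # only questions with a best answer
                "results": PAGE_SIZE,
                "start": start,
                "output": "json",
            })
            made += 1
            if resp.status_code != 200:
                break  # quota exhausted or transient error; resume next day
            questions = resp.json().get("all", {}).get("questions", [])
            if not questions:
                break  # no more pages in this category
            for q in questions:
                out.write(json.dumps(q) + "\n")
            start += PAGE_SIZE
            time.sleep(1)  # stay well under any per-second rate limit
```

Running a loop of this sort once per day per category, over many months, yields a corpus of the size described above.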

On YA, an interrogator can choose, from the set of answers received, the answer that satisfies his or her information need. This selected answer is denoted the “best answer”, and with that, the question is considered “resolved” and closed to further answers. We collected only resolved questions; in other words, the interrogators of these questions had already picked the best answers from the sets of answers they received, and no further answers could be posted. We chose this approach because we wanted to examine the selected answer for each question. This creates one of the limitations of our approach to identifying effectiveness and user satisfaction: since we studied only resolved questions, we looked only at situations where at least one satisfactory answer was provided, which makes the service appear very effective. It is beyond the scope of this work, due to its methodology, to consider situations where no satisfactory answer was posted.

Description of data

Table 1 provides a summary of the data, presenting the number of questions and their respective answers for each of the 25 top-level subject categories of YA (http://answers.yahoo.com/dir/index). We do not know the actual size of the YA dataset, which constantly grows; we do, however, believe that our data is a reasonable sample of it.

 

Table 1: Summary of Yahoo! Answers data.
Category | Number of questions | Number of answers
Arts & Humanities | 155,034 | 748,097
Beauty & Style | 155,625 | 922,294
Business & Finance | 148,808 | 489,885
Cars & Transportation | 152,045 | 584,804
Computers & the Internet | 157,524 | 530,176
Consumer Electronics | 151,241 | 476,891
Dining Out | 98,599 | 597,079
Education & Reference | 134,265 | 448,545
Entertainment & Music | 130,064 | 1,207,108
Environment | 90,235 | 556,815
Family & Relationships | 136,315 | 887,881
Food & Drink | 135,576 | 803,592
Games & Recreation | 134,406 | 395,745
Health | 115,402 | 423,612
Home & Garden | 123,430 | 454,242
Local Businesses | 97,664 | 250,407
News & Events | 105,701 | 765,140
Pets | 130,701 | 739,300
Politics & Government | 133,020 | 1,026,122
Pregnancy & Parenting | 128,634 | 969,644
Science & Mathematics | 132,275 | 366,106
Social Science | 114,450 | 500,394
Society & Culture | 133,626 | 1,056,102
Sports | 123,900 | 702,559
Travel | 130,049 | 503,078
Total | 3,248,589 | 16,405,618

 

We note that the category “Entertainment & Music” had the highest average number of answers per question (more than nine), followed by “Society & Culture” and “Politics & Government” (seven to eight). Questions in these categories are often seeking opinions rather than a specific answer [3].

For each question, we collected its subject, content, category, the username of the interrogator, the number of answers, and the time the question was posted. For each answer, we collected its content, the username of the individual providing the answer, its rating (if any), and the time the answer was posted. Since a rating is given only when the interrogator selects an answer, the presence of a rating also tells us which answer was chosen as the best.
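The following sketch shows one way these records could be represented for the analyses that follow. The class and field names are our own illustration, not the API's schema.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional

@dataclass
class Answer:
    content: str
    username: str
    posted_at: datetime
    rating: Optional[int] = None  # 1-5; present only on the selected answer

@dataclass
class Question:
    subject: str
    content: str
    category: str
    username: str
    posted_at: datetime
    answers: List[Answer] = field(default_factory=list)

    @property
    def best_answer(self) -> Optional[Answer]:
        # The rated answer is the one the interrogator chose (see above).
        return next((a for a in self.answers if a.rating is not None), None)
```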

 


Analysis

This section presents our analysis of the data based on our three research questions.

1. How quickly is a specific question answered?

First, we looked at how many answers a given question received. Figure 1 depicts a scatter plot of the number of answers received per question. As we can see, the majority of questions received fewer than 10 answers. In fact, based on our data (about 16,000,000 answers for about 3,000,000 questions), we can derive that, on average, each question earned five to six answers.

 

Figure 1: Scatter plot of the number of questions and answers. Each data point (represented with a cross) indicates how many questions received how many answers.

 

Receiving five or six answers per question may seem encouraging, but a more important question is how soon one could receive at least one answer. This could, in a sense, indicate the effectiveness of YA. To answer this question, we computed the time lapse (in minutes) between the posting times of a question and its first answer. The results are summarized in Figure 2.
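Given records like the Question and Answer sketch in the Methodology section, this time lapse is straightforward to compute. A minimal sketch follows; the bucket boundaries (5 and 60 minutes) mirror the bands discussed with Figure 2.

```python
from typing import Dict, Iterable

def first_answer_lapse_minutes(q) -> float:
    """Minutes between a question's posting and its earliest answer."""
    earliest = min(a.posted_at for a in q.answers)
    return (earliest - q.posted_at).total_seconds() / 60.0

def bucket_first_answer_lapses(questions: Iterable) -> Dict[str, int]:
    """Tally questions into the bands used in the discussion of Figure 2."""
    buckets = {"under 5 min": 0, "5-60 min": 0, "over 60 min": 0}
    for q in questions:
        if not q.answers:
            continue  # resolved questions always have answers; guard anyway
        lapse = first_answer_lapse_minutes(q)
        if lapse < 5:
            buckets["under 5 min"] += 1
        elif lapse <= 60:
            buckets["5-60 min"] += 1
        else:
            buckets["over 60 min"] += 1
    return buckets
```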

 

Figure 2: Time lapse between a given question and the first answer to be posted.

 

We can see that more than 30 percent (965,867 out of about 3,000,000) of the questions received their first answers in less than five minutes, and only about eight percent (259,120 out of about 3,000,000) took longer than one hour to secure their first answers. In other words, more than 90 percent of the questions received at least one answer within an hour. Given that millions of questions are posted on YA, these statistics are indicative of the effectiveness of this system. Note that we did not collect unresolved questions from YA, so there may be a number of questions for which no satisfactory answer was received. We have observed that many such questions ask for opinions or advice rather than a solution or specific information; thus, by their nature, they are unresolvable. Such observations are supported by Harper, et al. (2010), who report that only about 34 percent of questions on YA are factual questions.

2. How quickly does a question receive an answer that satisfies the interrogator?

Interrogators select, from a set of answers, the answer that best fits their information needs. Liu, et al. (2008) regard this as an indication of user satisfaction. When selecting an answer, an interrogator can also rate that answer on a scale of 1 to 5. Figure 3 illustrates the rating distribution for answers in our dataset. A large portion of the answers have rating=0 (no rating), as they were not selected. Among the selected answers, most received a rating of 3 or higher, which Shah and Pomerantz (2010) regard as an indication of high quality answers [4]. In other words, interrogators were not only satisfied with the selected answers; they also found them to be of high quality.
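A minimal sketch of this tally, reusing the record layout sketched earlier and applying the two-way split of the five-point scale described in note 4:

```python
from collections import Counter
from typing import Iterable

def rating_distribution(questions: Iterable) -> Counter:
    """Count answer ratings; 0 stands for 'no rating' (not selected)."""
    counts = Counter()
    for q in questions:
        for a in q.answers:
            counts[a.rating or 0] += 1
    return counts

def high_quality_share(questions: Iterable, threshold: int = 3) -> float:
    """Fraction of selected (rated) answers at or above the threshold."""
    rated = [a.rating for q in questions for a in q.answers if a.rating]
    return sum(r >= threshold for r in rated) / len(rated) if rated else 0.0
```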

 

Figure 3: Distribution of answer ratings.

 

Let us consider these answers with respect to time. While receiving a quick response to a posted question may demonstrate the effectiveness of the system, it remains to be seen whether these answers satisfy the needs of a given interrogator. To analyze this, we looked at the time elapsed between when a question was posted and when the answer that was eventually selected was posted. This is shown in Figure 4.

 

Figure 4: Time lapse between a given question and its selected answer.

 

It is notable that about one–third of the best answers took more than an hour to appear. This is interesting to see in contrast with Figure 2: while many answers are posted within a few minutes of a given question being posted (Figure 2), the answers that satisfy an interrogator take longer to appear (Figure 4). The appearance of better answers after the first five minutes may be attributed to the fact that a delayed answer can build on earlier posted information to provide a more refined or comprehensive answer. Thus, a delayed answer may have a higher likelihood of being selected as the best answer. The large number of best answers arriving after 60 minutes is likely related to cases in which an interrogator does not select an answer within a reasonable time, and the community decides on an answer by voting.

3. What is the relationship between a satisfactory answer and its position (or rank) in a list of answers?

While answers for a given question may be posted by a variety of individuals, each answer can potentially be influenced by answers already posted. Let us examine which answers, in a list of answers, were selected as the best.

Figure 5 illustrates the rank distribution for best answers. We can see that answers appearing at rank 1 (the first answer) for a given question had a greater chance of being selected as the best answer. We also see a tall peak at rank=6. On average, a question on YA receives five to six answers; hence, the last answer posted is often the one chosen as the best. It should also be noted that once an interrogator selects a best answer, the question is considered resolved and closed; since we collected only resolved questions and their answers, this also helps explain the peak at rank=6. We plan to explore this phenomenon in future research.
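Computing this rank distribution from the records sketched earlier is again simple: order each question's answers chronologically and find the position of the rated (i.e., selected) one.

```python
from collections import Counter
from typing import Iterable

def best_answer_ranks(questions: Iterable) -> Counter:
    """Chronological rank (1 = earliest answer) of each chosen best answer."""
    ranks = Counter()
    for q in questions:
        ordered = sorted(q.answers, key=lambda a: a.posted_at)
        for rank, a in enumerate(ordered, start=1):
            if a.rating is not None:  # the rated answer is the chosen one
                ranks[rank] += 1
                break
    return ranks
```

Restricting the same count to questions whose best answer arrived more than 60 minutes after the question reproduces the view shown in Figure 6, discussed below.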

 

Figure 5: Rank distribution for best answers.

 

This finding, that the higher an answer appears in the list (i.e., the earlier it is posted), the better its chances of being selected as the best answer, was also noted by Shah and Pomerantz (2010).

Let us now examine the results displayed in Figures 2, 4, and 5 together. Figure 2 tells us that the first answer to a question appears quickly; however, Figure 4 indicates that the eventually selected answer may not appear immediately. Figure 5 indicates that best answers tend to appear higher on a list of answers. This can be explained by noting that there are many questions for which answers do not appear immediately (that is, for more than an hour); when answers to such questions eventually appear, the ones at early ranks are selected as the best answers. Figure 6 illustrates the rank distribution for best answers that appeared more than 60 minutes after the initial posting of their questions. Relative to Figure 5, we note a gradual decrease in the number of best answers as we go down the ranks, without a spike at any particular rank. This may indicate a level of difficulty for these questions. On the other hand, there are many questions for which answers appear quickly (within 5–10 minutes), and at some point the interrogator selects one as the best (on average, around rank=6) and closes the question (Figure 5).

 

Figure 6: Rank distribution for best answers appearing after 60 minutes.

 

Combining these factors, one could evaluate, or even explain, question difficulty, and possibly improve content quality, using the following guidelines (a simple heuristic for the first guideline is sketched after the list):

  • If a question has not been answered for over an hour, it could be difficult or poorly posed. Interrogators may be contacted in these cases to revise their questions.
  • Interrogators could be informed about the average response time for questions akin to a given posted question. YA provides on–the–fly suggestions for similar questions already posted. This feature could be modified to inform interrogators about response rates, encouraging patience and deliberation before selecting the best answer.
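As a concrete illustration of the first guideline, the following is a minimal service-side heuristic. It is a sketch only: our dataset contains resolved questions, so the one-hour threshold comes from the analysis above, not from any deployed system.

```python
from datetime import datetime, timedelta

STALE_AFTER = timedelta(hours=1)  # threshold suggested by the analysis above

def flag_possibly_difficult(question, now: datetime) -> bool:
    """Flag an open question with no answers after an hour as possibly
    difficult or poorly posed, so the interrogator can be prompted to
    revise it (first guideline above)."""
    return not question.answers and (now - question.posted_at) > STALE_AFTER
```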

 


Conclusion

Social or community Q&A services, such as Yahoo! Answers, are significant because a large number of online information seekers are increasingly using these resources. Individuals providing answers on these sites are, in a way, fulfilling the role of traditional reference librarians or field experts. It still remains to be seen how their answers compare in quality with those from traditional reference sources. However, there are ways to test the effectiveness of these services in terms of providing information seekers satisfactory, if not high quality, information.

In this paper we addressed questions about effectiveness and user satisfaction by mining data collected from YA. Our analysis indicated that YA provides a very effective platform for one to post a question and secure an answer quickly. However, a satisfactory answer may take longer to arrive, depending on the difficulty of the question. While we measured user satisfaction by examining the interactions of interrogators with YA, other approaches have been used: for instance, Kim, et al. (2007) and Kim and Oh (2009) examined comments left by interrogators in order to understand perceived relevance and satisfaction. Finally, the methodologies used in this study provide some evidence for evaluating question difficulty and content quality.

 

About the author

Dr. Chirag Shah is an assistant professor in the School of Communication & Information (SC&I) at Rutgers University. He received his Ph.D. from the School of Information & Library Science (SILS) at the University of North Carolina at Chapel Hill. His research interests include various aspects of interactive information retrieval/seeking, especially in the context of online social networks and collaboration, contextual information mining, and applications of social media services for exploring critical socio–political issues.
E–mail: chirags [at] rutgers [dot] edu

 

Notes

1. http://yanswersblog.com/index.php/archives/2009/12/14/yahoo-answers-hits-200-million-visitors-worldwide/, accessed 4 February 2011.

2. http://yanswersblog.com/index.php/archives/2009/10/05/did-you-know/, accessed 4 February 2011.

3. There are many questions in these categories asking for opinions, generating thousands of “answers”. We did not include these opinion-seeking questions in our dataset since they were not “resolved”.

4. Dividing five–point rating responses into two categories (in this case, high quality and low quality) is often done in the literature on usability. See, for instance, White and Kelly (2006), and Liu and Belkin (2010). The rationale behind this approach is that the five–point scale may be appropriate for individuals responding, but too fine-grained for a meaningful quantitative analysis.

 

References

F.M. Harper, J. Weinberg, J. Logie, and J.A. Konstan, 2010. “Question types in social Q&A sites,” First Monday, volume 15, number 7, at http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/2913/2571, accessed 4 February 2011.

F.M. Harper, D. Raban, S. Rafaeli, and J. Konstan, 2008. “Predictors of answer quality in online Q&A sites,” Proceedings of the 26th Annual SIGCHI Conference on Human Factors in Computing Systems, pp. 865–874.

S. Kim and S. Oh, 2009. “Users’ relevance criteria for evaluating answers in a social Q&A site,” Journal of the American Society for Information Science and Technology, volume 60, number 4, pp. 716–727. http://dx.doi.org/10.1002/asi.21026

S. Kim, J.S. Oh, and S. Oh, 2007. “Best–answer selection criteria in a social Q&A site from the user–oriented relevance perspective,” Proceedings of the American Society for Information Science and Technology, volume 44, number 1, pp. 1–15. http://dx.doi.org/10.1002/meet.1450440256

J. Liu and N.J. Belkin, 2010. “Personalizing information retrieval for multi–session tasks: The roles of task stage and task type,” SIGIR ’10: Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 26–33.

Y. Liu, J. Bian, and E. Agichtein, 2008. “Predicting information seeker satisfaction in community question answering,” SIGIR ’08: Proceedings of the 31st International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 483–490.

L. Mon, 2000. “Digital reference service,” Government Information Quarterly, volume 17, number 3, pp. 309–318. http://dx.doi.org/10.1016/S0740-624X(00)00046-0

C. Shah and J. Pomerantz, 2010. “Evaluating and predicting answer quality in community QA,” SIGIR ’10: Proceedings of the 33rd International ACM SIGIR Conference on Research and Development in Information Retrieval.

C. Shah, S. Oh, and J.S. Oh, 2009. “Research agenda for social Q&A,” Library & Information Science Research, volume 31, number 4, pp. 205–209. http://dx.doi.org/10.1016/j.lisr.2009.07.006

C. Shah, J.S. Oh, and S. Oh, 2008. “Exploring characteristics and effects of user participation in online social Q&A sites,” First Monday, volume 13, number 9, at http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/viewArticle/2182/2028, accessed 4 February 2011.

R.W. White and D. Kelly, 2006. “A study on the effects of personalization and task information on implicit feedback performance,” CIKM ’06: Proceedings of the 15th ACM International Conference on Information and Knowledge Management.

 


Editorial history

Received 1 August 2010; revised 30 January 2011; accepted 2 February 2011.


“Measuring effectiveness and user satisfaction in Yahoo! Answers” by Chirag Shah is licensed under a Creative Commons Attribution–NonCommercial–NoDerivs 3.0 Unported License.

Effectiveness and user satisfaction in Yahoo! Answers
by Chirag Shah.
First Monday, Volume 16, Number 2 - 7 February 2011
http://firstmonday.org/ojs/index.php/fm/article/view/3092/2769




