Influencing public opinion from corn syrup to obesity: A longitudinal analysis of the references for nutritional entries on Wikipedia
by Marcus Messner, Marcia W. DiStaso, Yan Jin, Shana Meganck, Scott Sherman, and Sally Norton

The collaboratively edited online encyclopedia Wikipedia has continuously increased its reliability through a revised editing and referencing process. As the public increasingly turns to online resources for health information, this study analyzed the development of the referencing that underlies Wikipedia content on nutritional health topics over five years. The study found that Wikipedia articles on nine selected nutrition topics not only consistently rank among the top search results in major search engines, but have also been heavily edited and revised over time. A longitudinal content analysis of more than 3,000 references showed that the articles rely heavily on academic publications as the sources for their information on nutrition, underscoring the improved reliability of Wikipedia content.


When “wiki wars” were raging and “Wikipedia politics” was regularly played during election cycles in the middle part of the past decade, the collaboratively edited online encyclopedia Wikipedia was regularly criticized for its hoaxes and inaccuracies (Boxer, 2004; McCaffrey, 2006). However, these days are now long gone as new editorial policies for Wikipedia have been put into place and a new referencing system has greatly increased the reliability of the Wikipedia content and made it much more difficult to add false information to the articles (DiStaso and Messner, 2010).

The collaborative nature of Wikipedia allows the public to define the current state of knowledge on any important issue, including information about health, nutrition and related diseases. Research has shown that the public is turning increasingly to online resources to find information on health topics such as nutrition, exercise and physical fitness as well as specific diseases (Fox, 2011). Through their high rankings for their respective topics on online search engines, most Wikipedia articles gain wide readership and are subsequently used as sources by news organizations and the public (DiStaso, et al., 2007; Messner and South, 2011).

It is the purpose of this study to broaden the research on Wikipedia, which has mainly focused on its overall reliability and the reputation management of companies, to the area of health communication, specifically nutritional information. As the public turns to Wikipedia as one of its main online sources for health information, it is important to explore the reliability of Wikipedia articles on nutrition and related diseases based on the referencing frequency and reference types over time.



Wikipedia: An emerging online public information source

Founded 15 January 2001, the online, user-contributed encyclopedia Wikipedia utilizes the wiki concept, which distinguishes it from traditional encyclopedias like Encyclopedia Britannica. While Britannica articles are written by experts in particular fields, Wikipedia lets any of its users contribute to its content.

Over the last decade, Wikipedia’s seemingly chaotic production process has contributed to a meticulous editing and revision process and has greatly increased the global knowledge base online (DiStaso and Messner, 2010). Today, Wikipedia ranks as the sixth most popular website in the world and in the United States, only surpassed by Google, Facebook, YouTube, Yahoo! and Amazon.

On a daily basis, it reaches almost 15 percent of global Internet users with its 19 million articles in more than 270 languages (Alexa, 2012; Wikipedia, 2012). It is 25 times as large as Britannica, which was founded 244 years earlier (Cohen, 2010; Ellison, 2006). Compared to YouTube, Wikipedia has similar rates of contributions by its users (Leung, 2009).

In comparison to traditional works of reference, Wikipedia “will never be finished” [1] as its contributors engage in discussions over added and rewritten content continuously. Miller (2005) pointed out that “the wiki culture invites, almost compels readers to edit” [2].

This editorial process has become so rigorous and time consuming that some contributors have quit their engagement on Wikipedia, leading the encyclopedia to call for more users to help edit (Angwin and Fowler, 2009; Gomes, 2007). This is in stark contrast to the encyclopedia’s early stages, when the collaboration in the editing process led to reliability problems that gained widespread public attention.

In 2004 and 2005, misuses by its users caused Wikipedia negative press coverage, especially when the founding editorial director of USA Today, John Seigenthaler, became subject to a hoax, linking him to the assassination of President John F. Kennedy in an article on him on Wikipedia (Lamb, 2006).

The 2004 presidential election also became an example of the “wiki wars” when supporters of former President George W. Bush and his challenger John Kerry started editing and re-editing the opposing candidate’s entries (Boxer, 2004). The reliability of Wikipedia was greatly questioned at the time (Orlowski, 2006), leading Palser (2006) to argue that “we’ve seen Wikipedia’s good reputation besmirched by a few historical revisionists, ending our bewilderment.”

However, Wikipedia consequently revised the editorial rules for its contributors and made it impossible to edit articles anonymously. Today, contributors have to register in order to add or change content on the Web site. In addition, very popular and controversial entries have been locked for editing and can only be changed by Wikipedia administrators (Waters, 2009). In some international editions, new contributors are reviewed by experienced administrators (Waters, 2010).

In addition, the Wikipedia community is much more attentive to changes to its articles, constantly engaging in discussions over them. The Financial Times (2009) even argued that Wikipedia “is moving closer towards being like a normal encyclopedia, where specialist editors make sure that, as far as possible, the words that they print are true” [3].

As these episodes of hoaxes and inaccuracies have become less frequent thanks to a more reliable editorial process, Wikipedia has been praised for its breadth, timeliness and reliability (Cohen, 2010; Crovitz, 2009; Giles, 2005; Güntheroth and Schönert, 2007; Leung, 2009). Its increasing popularity on the Web has also caused its articles on any given topic to rank higher on search results of major search engines like Google (DiStaso and Messner, 2010; Langlois and Elmer, 2009; Stross, 2009).

Wikipedia is now even being recognized as a reliable source in news reports by major newspapers in the U.S. (Messner and South, 2011). Not surprisingly, public relations practitioners have now also started to pay attention to Wikipedia content on their companies and organizations as well as the topics related to their own focus.

According to Wright and Hinson (2010), companies pay more attention to the impact of social media on their reputations. Dizikes (2008) also stressed that communication professionals now edit Wikipedia articles on their companies: “Wikipedia has become something of a battleground for the truth, or, at least, a kind of operating history.”

DiStaso and Messner (2010) also found that corporate entries are heavily edited by a great variety of contributors. However, only very few scholars have started to pay attention to the influence of references used for Wikipedia’s content.

DiStaso, et al. (2012) were among the first to examine the types of references used on Wikipedia, in an analysis of the articles on BP’s Deepwater Horizon Oil Spill. They found that references are primarily based on news reports by traditional news media.

Despite the lack of research on Wikipedia’s references, the agenda-setting effects of sources have long been established. McCombs (2005) stressed that sources can have strong effects on the framing of content. Fernando (2010) also pointed out that references have a strong impact on the discussions of Wikipedia contributors: “An article needs to be well referenced or it will be taken down” [4].

Wikipedia contributors discuss the authority of information and settle disagreements through referencing (Fullerton and Ettema, 2009). Subsequently, Wikipedia content is often referenced with readily available online news media reports (DiStaso, et al., 2012; Melandson, 2010; Siegel, 2007).

Health and nutritional information online

Online and mobile technologies play a growing role in the way Americans retrieve health information. A study by the Pew Research Center’s Internet & American Life Project found that online health information is more often retrieved through mobile devices, making that information available remotely for mobile phone users (Fox, 2010).

Another study showed that the public is using online resources to find information on health topics, especially on nutrition (Fox, 2011). While Wikipedia has been shown to be a quick choice of information for millions of online and mobile users (DiStaso and Messner, 2010), researchers have also pointed out that Wikipedia has increased its reliability and breadth to a degree that it can be used as a knowledge transfer tool by patients (Czarnecka-Kujawa, et al., 2008).

Scholars have also stressed the potential value of Wikipedia for global health promotion, if medical entries are edited by medical professionals and referenced with reputable sources (Heilman, et al., 2011). According to the Centers for Disease Control and Prevention (2011), the most prevalent health issues today are related to nutrition because it has been shown that nutrition plays an essential role in maintaining both one’s physical and mental health.

Specifically, an increasing amount of research clearly links food and health maintenance with disease development, and there are many documented nutritional therapies that can be used when treating mental disorders (World Health Organization, 2003; Lakhan and Vieira, 2008).

According to the Healthy Eating Index (HEI), the diets of most Americans are a concern because dietary factors are associated with four of the 10 leading causes of death: coronary heart disease, certain types of cancer, strokes, and Type 2 diabetes (Basiotis, et al., 2004).

All Americans, regardless of income level, could benefit from dietary improvements by making small changes, such as choosing more nutrition-dense forms of food, which would ultimately provide substantial health benefits (Guenther, et al., 2008).

Therefore, as noted by Lenoir-Wijnkoop, et al. (2011), “the important role of food and nutrition in public health is being increasingly recognized as crucial for its potential impact on health-related quality of life and economics, both at the societal and individual levels.”

With that being said, the use of nutrition and health claims, and access to proper nutritional health information in general, have the potential to enhance consumers’ nutritional knowledge and healthy eating patterns as well as improve public health more generally (Leathwood, et al., 2007).

Discussing the public’s need for nutrition information, Campbell and Campbell (2005) argued that “eating right can save your life” [5]. While medical findings suggest that heart disease, diabetes and obesity can be reversed by a healthy diet, Campbell and Campbell (2005) pointed out that “even though information and opinions are plentiful, very few people truly know what they should be doing to improve their health,” largely because the real science is buried beneath “a clutter of irrelevant or even harmful information” [6].

This confusion among Americans about diet and nutrition has much to do with “how health information is generated and communicated and who controls such activities” [7] as well as with contradictory results that have confused even scientists and policy-makers.

Health communication researchers have studied nutrition information and the role of media in conveying accurate health information and promoting healthier dietary behavior. Kava, et al. (2002) studied whether magazines provide readers with adequate safety information related to supplement use. The findings suggest that information about safety issues such as maximum safe doses and drug-supplement interactions was often lacking even in magazine articles about nutrition.

Maddock, et al. (2008) argued that poor nutrition, together with physical inactivity, makes up the second leading cause of preventable morbidity and mortality in the United States. The authors highlighted the critical role mass media campaigns can play in reaching large segments of the population to influence their dietary behaviors.

There clearly is a need for reliable online information on health and nutrition and previous research has shown the popularity and increasing reliability of Wikipedia content. Therefore, this study’s goal was to close a gap in the current research and analyze the online rankings of Wikipedia articles on nutritional topics and the types of references they are based on.

Research questions

Based on the above review of literature, the following research questions are posed for this study:

RQ1: How prominently are nutritional articles on Wikipedia ranked on popular online search engines?
RQ2: What is the frequency of edits in nutritional articles on Wikipedia?
RQ3: How has the number of references used in nutritional articles on Wikipedia changed over time?
RQ4: How have the types of references in nutritional articles on Wikipedia changed over time?
RQ5: How have the specific references in nutritional articles on Wikipedia changed over time?




This study applied a research methodology developed by DiStaso, et al. (2012) in their analysis of the references in Wikipedia articles on BP’s Deepwater Horizon Oil Spill. The goal of this study was to determine how prominently nutritional Wikipedia articles rank on popular search engines, how frequently they are edited, and how the referencing in these articles has developed over time.

A longitudinal content analysis was conducted for nine nutrition-related Wikipedia articles in a five-year time frame between 2007 and 2011. The coding protocol was constructed based on the one developed for coding of Wikipedia references by DiStaso, et al. (2012). The content analysis was carried out based on the principles for mass communications research established by Neuendorf (2002).

The study analyzed articles in the following three nutritional areas and selected three articles for each area based on expert recommendations: 1) nutritional ingredients: high fructose corn syrup, trans fat, gluten; 2) dietary medical conditions: hypercholesterolemia, hypertension, obesity; and, 3) dietary diseases: diabetes mellitus, cardiovascular disease, cancer.

Articles for each of the nine topics were captured through the Wikipedia revision history archive on December 31 for the years 2007, 2008, 2009, 2010 and 2011. Thereby, a total of 45 articles were selected for the content analysis.
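The sampling frame described above (nine topics, one snapshot per year on 31 December) can be sketched in code. This is only an illustration and not the procedure the authors report, who worked with the revision history archive directly. It assumes the public MediaWiki Action API on English Wikipedia, and the page titles listed are assumptions that may not match the live article names. The sketch only builds query URLs and does not send any request.

```python
from urllib.parse import urlencode

# Assumption: English Wikipedia's MediaWiki Action API endpoint.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"

# Assumption: illustrative page titles; the live titles may differ.
TOPICS = [
    "High fructose corn syrup", "Trans fat", "Gluten",
    "Hypercholesterolemia", "Hypertension", "Obesity",
    "Diabetes mellitus", "Cardiovascular disease", "Cancer",
]
YEARS = [2007, 2008, 2009, 2010, 2011]


def snapshot_query_url(title: str, year: int) -> str:
    """Build a query URL for the last revision of `title` saved on or
    before 31 December of `year`."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvlimit": 1,                 # only the most recent matching revision
        "rvdir": "older",             # walk backwards in time from rvstart
        "rvstart": f"{year}-12-31T23:59:59Z",
        "rvprop": "ids|timestamp",
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


snapshot_urls = [snapshot_query_url(t, y) for t in TOPICS for y in YEARS]
print(len(snapshot_urls))  # 45 snapshots: nine topics x five years
```

Each URL identifies one article snapshot, which matches the total of 45 articles in the sample.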

The rankings on popular search engines were analyzed using a current search for each of the terms in Google, Bing and Yahoo!. This provided an understanding of the ranking for each of the nutritional Wikipedia articles. Unfortunately, search engines do not allow searches to be run retrospectively for past dates, so changes in the rankings over time could not be measured for this study.

Through the use of the Wikipedia revision history archive, each nutritional Wikipedia article was examined for the number of edits received between 1 January 2007 and 31 December 2011. The idea behind this is that the more edits an article receives, the more rigorously it was created, and the more scrutiny was applied in the formation of reliable public information (DiStaso and Messner, 2010).

The articles were also coded for the number and the type of references. Types of references were coded as mainstream media (e.g., New York Times), mainstream medical media (e.g., Science), alternative medical media, government institutions (e.g., FDA), NGOs (e.g., American Cancer Society), academic publications (e.g., Journal of Nutrition), other, and inactive or broken links.

The coding of the 3,029 references in the Wikipedia articles was completed by two trained coders. Intercoder reliability was established with approximately 10 percent (n=309) of the coding content with a coefficient of .92 using Scott’s pi (Scott, 1955). The information on the editing history and the search ranking was recorded by one coder.
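For readers unfamiliar with the reliability coefficient, the following is a minimal sketch of how Scott's pi can be computed for two coders and nominal categories. The function name and the ten example codings are hypothetical and are not taken from the study's data. The coefficient compares observed agreement with the agreement expected from the pooled category proportions of both coders.

```python
from collections import Counter


def scotts_pi(coder_a, coder_b):
    """Scott's pi for two coders who assigned nominal categories
    to the same units."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of units")
    n = len(coder_a)
    # Observed agreement: share of units on which the two coders agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected agreement: squared pooled proportions of each category,
    # pooling the 2n judgments made by both coders.
    pooled = Counter(coder_a) + Counter(coder_b)
    expected = sum((count / (2 * n)) ** 2 for count in pooled.values())
    if expected == 1:
        return 1.0  # both coders used a single category throughout
    return (observed - expected) / (1 - expected)


# Hypothetical codings of ten references (illustration only).
coder_1 = ["academic", "academic", "news", "ngo", "academic",
           "news", "government", "academic", "ngo", "news"]
coder_2 = ["academic", "academic", "news", "ngo", "academic",
           "academic", "government", "academic", "ngo", "news"]
print(round(scotts_pi(coder_1, coder_2), 2))  # 0.85
```

In this toy example the coders agree on nine of ten units, yet pi is lower than 0.90 because some of that agreement would be expected by chance.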




This longitudinal study analyzed the search engine rankings of Wikipedia articles on nine nutritional topics as well as the editing and referencing in these articles. The research questions posed for this study are answered separately in the following sections.

RQ1: How prominently are nutritional articles on Wikipedia ranked on popular online search engines?

The searches for each of the nine nutritional Wikipedia articles showed that they always rank among the top three search results on Google, Bing and Yahoo!. In 63 percent (n=17) of the searches the Wikipedia article was the first search result, in 33.3 percent (n=9) the second and in only 3.7 percent (n=1) the third. Only one of the articles did not have at least one top ranking in one of the three search engines.

The average rankings, which were 1.56 for Google and Bing and 1.11 for Yahoo!, showed that Wikipedia articles are very likely to be the first resources the public finds when searching for nutritional information online (see Table 1).
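The reported shares can be reproduced from the rank-position counts given in the text. The short sketch below uses only those counts (17 first, nine second and one third place across 27 searches); the variable names are illustrative.

```python
# Rank-position counts reported in the text for 27 searches
# (nine topics x three search engines).
rank_counts = {1: 17, 2: 9, 3: 1}

total_searches = sum(rank_counts.values())
shares = {rank: round(100 * n / total_searches, 1)
          for rank, n in rank_counts.items()}
# Pooled mean rank across all three search engines.
mean_rank = sum(rank * n for rank, n in rank_counts.items()) / total_searches

print(shares)               # {1: 63.0, 2: 33.3, 3: 3.7}
print(round(mean_rank, 2))  # 1.41
```

The pooled mean of 1.41 agrees with the average of the three per-engine means reported above (1.56, 1.56 and 1.11).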


Table 1: Rankings for Wikipedia nutrition articles in major search engines


RQ2: What is the frequency of edits in nutritional articles on Wikipedia?

The analysis of the revision history for each Wikipedia article showed that the nine articles were edited 29,593 times between 1 January 2007 and 31 December 2011 for an average of 2,172 edits for each article over the 60 months (see Table 2). While each article was edited an average of 36.2 times per month, there were differences in the editing of individual articles. The differences in editing totals were not found to be significant, most likely because of the small sample size of nine Wikipedia articles.


Table 2: Number of edits for nutritional articles in Wikipedia


Not only did the articles on diabetes mellitus and obesity have the highest number of edits in 2007 (n=4,288 and n=3,731 respectively, p<.05), they also had the greatest percentage decrease in edits (89 percent and 87.2 percent respectively) (see Chart 1). Hypercholesterolemia had the fewest edits in all five years (n=71, n=47, n=53, n=135, n=55) and the smallest change with just a 23 percent decrease from 2007 to 2011. On average, there was a 55 percent decrease in edits from 2007 to 2011.
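As a worked example of the percentage decreases discussed above, the sketch below applies the usual relative-change formula to the yearly edit counts reported for the hypercholesterolemia article. The function name is illustrative, and the monthly figure is a derived value that the text does not report.

```python
def percent_change(first: float, last: float) -> float:
    """Relative change from `first` to `last`, in percent
    (a negative value means a decrease)."""
    return 100 * (last - first) / first


# Yearly edit counts for the hypercholesterolemia article, as reported.
hyperchol_edits = {2007: 71, 2008: 47, 2009: 53, 2010: 135, 2011: 55}

change = percent_change(hyperchol_edits[2007], hyperchol_edits[2011])
# The study window spans 60 months (January 2007 to December 2011).
edits_per_month = sum(hyperchol_edits.values()) / 60

print(round(change, 1))           # -22.5
print(round(edits_per_month, 1))  # 6.0
```

The decrease of about 22.5 percent rounds to the 23 percent stated in the text.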


Chart 1: Edits over time


RQ3: How has the number of references used in nutritional articles on Wikipedia changed over time?

In 2007, there was a total of 283 references for the nine nutritional Wikipedia articles, an average of 31 references per article (see Chart 2). In 2008, the total increased to 464 references for an average of 52 references per article. In 2009, it increased again to 792 references with an average of 88 references per article. There was a slight dip in 2010 to 777 references with an average of 86 references per article, followed by another small dip in 2011 to 713 references with an average of 79 references per article.
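The per-article averages follow directly from the yearly totals and the sample of nine articles. A brief sketch, using only the totals reported in the text, is:

```python
ARTICLE_COUNT = 9

# Total references across the nine articles in each year, as reported.
total_references = {2007: 283, 2008: 464, 2009: 792, 2010: 777, 2011: 713}

average_per_article = {year: round(total / ARTICLE_COUNT)
                       for year, total in total_references.items()}

print(average_per_article)             # {2007: 31, 2008: 52, 2009: 88, 2010: 86, 2011: 79}
print(sum(total_references.values()))  # 3029
```

The yearly totals sum to the 3,029 references that were coded in the content analysis.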


Chart 2: Average number of references over time


While the references were increasing, the length of the articles remained fairly constant with an average number of sentences of 231 in 2007, 262 in 2008, 242 in 2009, 218 in 2010, and 220 in 2011. This resulted in an average number of sentences per reference of 15.7 in 2007, 5.2 in 2008, 3.5 in 2009, 2.8 in 2010, and 3.5 in 2011. The lower the number the better, since this means that more sentences have references. This indicates that the increase in references cannot be attributed to an increase in the length of the articles.

While an ANOVA did not identify significant differences, it is still important to look at the number of references in each individual nutritional Wikipedia article (see Chart 3). Overall, the number of references for the articles increased from 2007 to 2011, except for the article on high fructose corn syrup, which maintained the same number, and the article on gluten, which decreased 10 percent. On the other end of the spectrum were the cardiovascular disease article, which increased 1,700 percent, the cancer article, which increased 620 percent, and the obesity article, which increased 537 percent.


Chart 3: Nutritional references over time


RQ4: How have the types of references in nutritional articles on Wikipedia changed over time?

Over the five years, the distribution of the types of references used in the articles has remained fairly consistent (see Chart 4) (χ2 (32, n=3,029)=270.24, p<.001). Academic publications (42.0 percent, n=119 in 2007; 48.5 percent, n=225 in 2008; 55.7 percent, n=441 in 2009; 52.3 percent, n=406 in 2010; 52.0 percent, n=371 in 2011) were the most common type of references over all five years, followed by NGO publications (12.7 percent, n=36 in 2007; 9.6 percent, n=53 in 2008; 9.6 percent, n=76 in 2009; 13.4 percent, n=104 in 2010; 13.3 percent, n=95 in 2011), government institutions (11.3 percent, n=32 in 2007; 11.4 percent, n=53 in 2008; 8.3 percent, n=66 in 2009; 9.4 percent, n=73 in 2010; 9.4 percent, n=67 in 2011), mainstream news media (13.4 percent, n=38 in 2007; 10.3 percent, n=48 in 2008; 7.4 percent, n=59 in 2009; 8.5 percent, n=66 in 2010; 8.7 percent, n=62 in 2011), mainstream medical media (0.7 percent, n=2 in 2007; 1.7 percent, n=8 in 2008; 2.4 percent, n=19 in 2009; 2.1 percent, n=16 in 2010; 2.1 percent, n=15 in 2011), and alternative medical media (0.4 percent, n=1 in 2007; 1.1 percent, n=5 in 2008; 0.5 percent, n=4 in 2009; 0.4 percent, n=3 in 2010; 0.3 percent, n=2 in 2011).

It is notable that the share of mainstream news media references decreased by almost five percentage points, from 13.4 percent (n=38) in 2007 to 8.7 percent (n=62) in 2011, while at the same time the share of references to academic publications rose from 42.0 percent to 52.0 percent, a relative increase of 23.8 percent. The percentages for all other reference types only changed in minor ways over the five-year time period.
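Because a change in a share can be expressed either in percentage points or as a relative change, a short worked example may help keep the two apart. The sketch below uses the 2007 and 2011 counts reported for mainstream news media and academic publications, and rounds each share to one decimal first, as the text does. The helper and variable names are illustrative.

```python
def share(count: int, total: int) -> float:
    """Share of `count` in `total`, in percent, rounded to one decimal."""
    return round(100 * count / total, 1)


# Counts and yearly totals as reported in the text.
totals = {2007: 283, 2011: 713}
reference_counts = {
    "mainstream news": {2007: 38, 2011: 62},
    "academic": {2007: 119, 2011: 371},
}

changes = {}
for label, counts in reference_counts.items():
    start = share(counts[2007], totals[2007])
    end = share(counts[2011], totals[2011])
    point_change = round(end - start, 1)                     # percentage points
    relative_change = round(100 * (end - start) / start, 1)  # percent of the 2007 share
    changes[label] = (point_change, relative_change)

print(changes)  # {'mainstream news': (-4.7, -35.1), 'academic': (10.0, 23.8)}
```

The academic share thus rose by 10 percentage points, which corresponds to the 23.8 percent relative increase, while the mainstream news share fell by 4.7 percentage points.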


Chart 4: Reference types over time


RQ5: How have the specific references in nutritional articles on Wikipedia changed over time?

The specific references did not change much over time (see Table 3). Lancet, New England Journal of Medicine, JAMA, and American Journal of Clinical Nutrition were in the top five most frequent references for all five years. CBC and New York Times were among the top in 2007 and 2008 while Hypertension was in the top for 2009, 2010 and 2011.


Table 3: Most common references over time


There were significant differences in the specific references by type (χ2 (5698, n=3,029)=20068.48, p<.001) (see Table 4). Most reference types drew on many different sources, except for the mainstream medical media references, of which 45 percent were to Medscape. This indicates that the nutrition Wikipedia articles in this analysis were created from quite a diverse set of references and a broad knowledge base.


Table 4: Most common references by type


All of the nutrition Wikipedia articles contained more references from academic publications than any other type of reference, except for the article on trans fat, which contained a higher number of mainstream media references (27.2 percent, n=40 versus 16.3 percent, n=24) (χ2 (56, n=713)=268.55, p<.001).



Discussion and conclusion

The purpose of this study was to analyze the rankings of nutritional Wikipedia articles on popular search engines, the overall editing frequency on these articles as well as the development in referencing quantity and types. This longitudinal content analysis collected data from 2007, 2008, 2009, 2010 and 2011 and found indications for the high online ranking of the content and its increasing reliability based on the references used over time.

After searching for all nine nutritional health topics under study (high fructose corn syrup, trans fat, gluten, hypercholesterolemia, hypertension, obesity, diabetes mellitus, cardiovascular disease, and cancer) on Google, Bing and Yahoo!, it is evident from the search results that Wikipedia is a high-ranking source of information online.

All Wikipedia articles were presented in the top three of the search results for each of the search engines, appearing most of the time as the first result. These rankings are even higher than the ones DiStaso and Messner (2010) found for Wikipedia articles on Fortune 500 companies.

It shows that any individual searching for these nutritional terms on the search engines will be offered a Wikipedia article at the top of the results list. This finding makes it even more important to study the reliability of these articles through their content references.

Every one of the nine articles under study was heavily edited, with some articles showcasing edits multiple times per day. Although some were edited less than others, all articles were edited a minimum of 300 times in the five-year time period, showing great rigor in the overall editorial process.

Additionally, the number of references steadily increased from 2007 to 2009, with some decline in 2010 and 2011. This decline likely shows that Wikipedia has transitioned to a mode of consolidation, unlike initially when there was a need to increase the quality of content. In other words, instead of being news-driven and constantly adding new content to the articles, Wikipedia articles on nutrition currently represent the established facts on the respective topics.

With this being said, the types of references have remained relatively steady over the five-year time period, with academic references increasing and mainstream media references slowly declining. Academic publications today represent more than half of all references cited in the nine articles.

This underlines that the reliability of information has increased. Nevertheless, there still are some less reliable and more questionable references included in all articles. For all years, 15.4 percent (n=464) of all references came from outside the main reference type categories or could not be clearly determined. This still shows that Wikipedia’s contributors have more work to do to increase the reliability of the articles.

However, the results overall show that reliable references — mainstream news media, mainstream medical media, government institutions, and academic publications — are still far more prevalent than the other possibly more questionable sources.

Therefore, while Wikipedia has grown and expanded over time with reference numbers increasing, the overall quality of articles has at a minimum stayed consistently reliable and has even increased for some of the articles in recent years. This shows that as Wikipedia continues to grow, there has been an overall consolidation, creating a better and more reliable source for information on health and nutrition.

Internet users who search for nutritional health information on popular search engines today will find high-ranking Wikipedia articles, which are mainly based on academic research articles. Future research should further analyze through surveys and interviews with Wikipedia users how much influence health articles on the online encyclopedia have on health information gathering and decision-making.

As with all research, this study also has limitations. The sample only included nine Wikipedia articles that were selected from three nutritional categories. While this was considered appropriate for this initial study to analyze Wikipedia references, future research should broaden the sample to include additional health categories to allow for generalizations in this area.

As this study only examined the references of the articles, future studies should also analyze references in the context of the content in which they are used, meaning that future analyses should include the entire Wikipedia articles.

The authors believe that this area of research is an important one that needs much more attention. As search engine results drive our decisions on which content we access online, Wikipedia needs to be high on the agenda for health communication researchers and practitioners.

Much more research is needed in this area and communications professionals in the health field need to be much more actively involved in ensuring that the content on Wikipedia is reliable and well sourced with reliable references.


About the authors

Marcus Messner (Ph.D., University of Miami) is an associate professor and coordinator for research and innovation at the Richard T. Robertson School of Media & Culture as well as executive director of the Center for Media+Health at Virginia Commonwealth University. His research interest is in new and social media and how they are adopted and used in journalism and public relations. He teaches social media, multimedia journalism and global communication.
Direct comments to: mmessner [at] vcu [dot] edu

Marcia W. DiStaso (Ph.D., University of Miami) is an associate professor in the College of Communications at Pennsylvania State University, senior research fellow with the Arthur W. Page Center and Chair-Elect of the Public Relations Society of America Educators Academy. Her research agenda, speaking engagements, and consulting focus on social media, financial communications, and investor relations.

Yan Jin (Ph.D., University of Missouri-Columbia) is an associate professor in the Grady College of Journalism and Mass Communication and associate director of the Center for Health & Risk Communication at the University of Georgia. Her research interest is in the area of crisis communication and strategic conflict management, as well as how emotions influence public relations decision-making and public responses to crisis and risk information.

Shana Meganck (Ph.D., Virginia Commonwealth University) is an assistant professor in the Richard T. Robertson School of Media and Culture and an affiliate member of the Center for Media+Health at Virginia Commonwealth University. Her research interests focus on the intersections of health communication, new media and public relations.

Scott Sherman (M.A., Syracuse University) is an associate professor in the Richard T. Robertson School of Media and Culture and director for operations and outreach in the Center for Media+Health at Virginia Commonwealth University. He has been working in advertising for 20 years. For most of his career, he has guided strategies and design in the creative department of advertising agencies.

Sally Norton (M.P.H., University of North Carolina at Chapel Hill) is a public health research consultant and wellness educator. Her career history includes public health service and education in academic health care. Previous employers include the Program on Integrative Medicine at the University of North Carolina and the Department of Social and Behavioral Health at Virginia Commonwealth University.



