First Monday

Report and repeat: Investigating Facebook's hate speech removal process by Caitlin Ring Carlson and Hayley Rousselle

Social media is rife with hate speech. Although Facebook prohibits this content on its site, little is known about how much of the hate speech reported by users is actually removed by the company. Given the enormous power Facebook has to shape the universe of discourse, this study sought to determine what proportion of reported hate speech is removed from the platform and whether patterns exist in Facebook’s decision-making process. To understand how the company is interpreting and applying its own Community Standards regarding hate speech, the authors identified and reported hundreds of comments, posts, and images featuring hate speech to the company (n=311) and recorded Facebook’s decision regarding whether or not to remove the reported content. A qualitative content analysis was then performed on the content that was and was not removed to identify patterns in Facebook’s content moderation decisions about hate speech. Of particular interest was whether the company’s 2018 policy update resulted in any significant change.

Our results indicated that only about half of reported content containing hate speech was removed. The 2018 policy change also appeared to have little impact on the company’s decision-making. The results suggest that Facebook also had substantial issues, including difficulty removing misogynistic hate speech, inconsistency in removing attacks and threats, an inability to consider context in removal decisions, and a general lack of transparency in its hate speech removal process. Facebook’s failure to effectively remove reported hate speech allows misethnic discourses to spread and perpetuates stereotypes. The paper concludes with recommendations for Facebook and other social media organizations to consider to minimize the amount and impact of hate speech on their platforms.


Defining hate speech
Relevant legal regulations
Content moderation
Facebook’s hate speech regulations
Recommendations & conclusion




Social media content is rife with hate speech. In the United States, one in four African Americans have been targeted online because of their race or ethnicity, as have one in 10 Hispanics (Duggan, 2017). The problem has gotten so bad that Facebook, Google, Microsoft, and Twitter have partnered with the Anti-Defamation League (ADL) to create a Cyberhate Problem-Solving Lab to address the growing tide of online hate (Fingas, 2017).

One way companies are able to regulate the expression of users is through their Terms of Service and Community Standards. These policies, which users must agree to in order to access a platform, allow social media companies to set the rules that govern how hateful content will be regulated on their sites. Facebook specifically prohibits hate speech on its platform and uses a combination of automatic detection and community flagging to identify and remove hate speech from the site. According to the company’s own transparency report, Facebook “took action” on four million pieces of content containing hate speech in Q1 of 2019 (Facebook, 2019c).

Although Facebook’s report provides data about how many pieces of content were ultimately removed from the site, what remains unclear is what proportion of comments reported by users were not removed by the company. While research exists that considers how social media companies like Facebook should be regulating content (Suzor, et al., 2018; Klonick, 2018), there is not a lot of data available to help scholars understand what is currently being done. This study contributes to the existing body of literature by examining how Facebook is applying its own community standards to content that includes hate speech. This research measures the proportion of reported content containing hate speech being removed, identifies trends evident in the removal decisions, and examines whether and how updates to Facebook’s hate speech policy have affected this process.

To address these issues, the authors reported hundreds of comments containing hate speech to Facebook in January 2018 and again in February 2019 (n=311) and recorded the company’s decisions about whether or not to remove the reported content. A qualitative content analysis was then performed on the resulting data to determine trends in the efficacy of Facebook’s removal process, as well as whether notable changes in this process occurred as a result of the company’s hate speech policy update in 2018. Results indicated that only about half of reported hate speech was removed and that the company demonstrated inconsistencies in removing hate speech, particularly when it targeted a person’s gender. We also found that the 2018 policy update had little impact on the removal process.

This paper begins by defining the term hate speech and exploring the current legal frameworks that exist to regulate hate speech on social media. Next, we provide an overview of the content moderation process and then look specifically at Facebook’s approach to regulating content containing hate speech. Once the review of relevant terminology, literature, and processes is complete, the method is presented along with the results of the study. The paper concludes with a discussion of the results and recommendations for Facebook to consider as the company works toward their stated goal of eliminating hate speech from their platform.



Defining hate speech

Hate speech is an expansive and contested term. The European Union’s “Framework Decision on Combating Certain Forms and Expressions of Racism and Xenophobia by Means of Criminal Law” defines hate speech as the public incitement to violence or hatred directed to groups or individuals on the basis of certain characteristics, including race, color, religion, descent, and national or ethnic origin (Council of the European Union, 2008). Like the EU definition, various countries’ legal prohibitions against hate speech generally focus on incitement to hatred toward people based on their fixed identity characteristics. However, the public tends to associate hate speech with the use of racist, homophobic, and misogynistic slurs (Carlson, 2017). The terms cyberbullying and harassment generally refer to offensive expression targeted at an individual, but hate speech can be aimed at a group of people or at an individual. Claims about a group’s inferiority, calls for violence against an individual or an entire group because of their fixed identity characteristics, and the use of racist, homophobic, or ethnic slurs all constitute hate speech (Carlson, 2017).

Hate speech can have negative impacts on individuals as well as society at large. Laura Leets’ 2002 study exposed Jewish and LGBTQ college students to harmful speech based on real scenarios. Leets found that the short- and long-term effects of participants’ exposure to hate speech might be similar in form (but sometimes not in intensity) to the effects of other kinds of traumatic experiences. At the societal level, hate speech can play a role in creating the conditions for bias-motivated violence. Legal scholar Alexander Tsesis (2009) argues that the very purpose of intimidating hate speech is to perpetuate and augment existing inequalities. “Although the spread of intimidating hate speech does not always lead to the commission of discriminatory violence, it establishes the rationale for attacking particular disfavored groups.” The recent brutality against the Rohingya people in Myanmar is evidence of the role Facebook content containing hate speech can play in this process. A 2018 Reuters investigation done in conjunction with the Human Rights Center and the UC Berkeley School of Law found over 1,000 posts calling Rohingya or other Muslims dogs, maggots, and rapists (Stecklow, 2018). This content was created and distributed in the lead-up to the military’s campaign of ethnic cleansing and crimes against humanity that forced 740,000 Rohingya to flee to Bangladesh. Given the severity of hate speech’s potential effects, it is essential that scholars examine how social media organizations are handling hate speech on their platforms.



Relevant legal regulations

In the European Union and in other countries such as Canada or South Africa, hate speech, whether it is used in person or online, is illegal and may be punished with fines or jail time (Article 19, 2018). Several countries, including Germany, also have civil remedies for group defamation, which the victims of hate speech can pursue (Article 19, 2018). In the United States, hate speech is protected by the First Amendment. Unless expression falls into one of the narrow categories of exception defined by the U.S. Supreme Court, such as fighting words, incitement to violence, or true threats, it is permitted and may not be sanctioned by the government.

However, as private virtual spaces, social media organizations are not required to extend First Amendment protection to the posts, images, and videos shared on their sites (Alkiviadou, 2018; Foxman and Wolf, 2013; Hartzog, 2010). This gives these organizations tremendous power to influence users’ free expression by determining which information does and does not make it into the marketplace of ideas via social media. By removing content or not, social media companies decide which posts audiences will be exposed to and have the opportunity to consider. This power to decide what will and will not remain on their sites makes social media organizations the de facto gatekeepers of their own public spheres and ultimately gives them the ability to influence the worldview of their users.

Facebook and other social media platforms and ISPs are also not liable for what users share on their sites in the United States. With the recent exception of content related to sex trafficking, Section 230 of the Communications Decency Act (CDA) says that providers of interactive computer services in the U.S., such as ISPs or social media companies, shall not be treated as publishers and, therefore, are not responsible for what third parties do on their sites (Communications Decency Act, 1996).

While the United States takes a hands-off approach to the regulation of hate speech in social media content, in 2017 Germany passed the Network Enforcement Law (Netzwerkdurchsetzungsgesetz or NetzDG), which requires social media companies with more than two million users to remove or block access to reported content that violates restrictions against hate speech in the German criminal code (Network Enforcement Act, 2017). Companies must remove “obvious hate speech” within 24 hours of receiving a notification or risk a US$50 million fine (Oltermann, 2018). The European Commission is also now working directly with social media organizations including Facebook, Twitter, YouTube, and Microsoft to combat the spread of hateful content in Europe. In May 2016, the group presented a “Code of Conduct on Countering Illegal Hate Speech Online.” A February 2019 report on the impact of the Code of Conduct indicated that “IT companies are now assessing 89% of flagged content within 24 hours and 72% of the content deemed to be illegal hate speech is removed, compared to 40% and 28% respectively when the Code was first launched in 2016” (European Commission, 2019).

After the 2019 attack on two mosques in New Zealand, global leaders met with executives from Facebook, Google, Twitter, and other companies to compile a set of guidelines called the “Christchurch Call,” which sought to enact measures against extreme, violent, and hateful rhetoric online. Notably, the United States did not sign the pledge. The United States’ unrestricted approach to regulating Internet content means that Facebook and other publicly traded social media companies are free to regulate expression and hate speech in particular in any way they see fit.



Content moderation

Content moderation is best defined as a series of practices with shared characteristics which are used to screen user-generated content, including posts, images, videos, or even hashtags, to determine what will make it onto, or remain on, a social media platform, Web site, or other online outlet (Roberts, 2019, 2016; Gerrard, 2018). The process often includes three distinct phases. First is editorial review, which refers to oversight imposed on content before it is made available, such as the ratings given to movies prior to their release (Gillespie, 2018). In the case of social media, editorial review often refers to the community standards set by social media platforms.

Next is automatic detection, which utilizes sophisticated software to aid in the removal process (Gillespie, 2018). Platforms use algorithms and/or artificial intelligence to remove content that violates their community standards both before and after it has been uploaded (Klonick, 2018). According to Facebook’s Q1 2019 transparency report, the company proactively removed 65 percent of hate speech on the site before users reported it (Facebook, 2019d). Recent research in this area suggests that best practices for algorithms that remove hate speech from social media content are emerging. A 2017 study found that fine-grained labels that help algorithms distinguish between hate speech and merely offensive speech, including humor, were effective (Davidson, et al., 2017). For example, the word f*g is used in hate speech and offensive language, but the term f*ggot is generally only associated with hate speech (Davidson, et al., 2017). Moreover, posts containing multiple racial or homophobic slurs are more likely to be hate speech, as opposed to offensive language (Davidson, et al., 2017).
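To illustrate the fine-grained labeling idea reported by Davidson, et al. (2017), the following toy sketch encodes the distinction between hate speech and merely offensive speech in simple keyword rules. The term lists, thresholds, and function names are simplified stand-ins of our own devising, not the study’s actual lexicon or trained classifier.

```python
# Illustrative sketch only: the real Davidson, et al. (2017) system was a
# trained classifier, not a keyword rule. Terms are censored as in the text.

HATE_ONLY_TERMS = {"f*ggot"}        # terms associated almost solely with hate speech
AMBIGUOUS_TERMS = {"f*g", "b*tch"}  # terms appearing in both hate and offensive speech

def label_post(tokens):
    """Return 'hate', 'offensive', or 'neither' for a tokenized post."""
    hits_hate = sum(t in HATE_ONLY_TERMS for t in tokens)
    hits_ambiguous = sum(t in AMBIGUOUS_TERMS for t in tokens)
    # Multiple slurs in one post make hate speech more likely (Davidson, et al., 2017).
    if hits_hate or hits_ambiguous >= 2:
        return "hate"
    if hits_ambiguous == 1:
        return "offensive"
    return "neither"

print(label_post(["you", "f*g"]))                # offensive
print(label_post(["f*g", "f*ggot"]))             # hate
print(label_post(["have", "a", "nice", "day"]))  # neither
```

The point of the sketch is the tiered lexicon: a term that appears almost exclusively in hate speech is decisive on its own, while an ambiguous term only tips the label when it co-occurs with other slurs.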

The last of the three phases of content moderation is community flagging (Gillespie, 2018). Here users report content they believe violates the Community Standards outlined by the company. Reported content is then manually reviewed by employees and a determination is made regarding whether it will be blocked, deleted, or remain on the site. Social media organizations often contract this work out to other organizations. Workers in these roles are dispersed globally at a variety of worksites and the work itself often takes place in secret by low-status workers paid very low wages (Roberts, 2016). Workers in these roles suffer panic attacks and other mental health issues as a result of the 1,500 violent, hateful, or otherwise troubling posts they review each week (Newton, 2019).

Although Facebook utilizes each of the three primary tools for content moderation (editorial review, automatic detection, and community flagging), users of Facebook and other social media platforms are generally unaware of how the process actually works. A study by Sarah Myers West (2018) examined how users interpret the role and function of companies in moderating their content, what kinds of content are taken down, and the impact this has on public expression. Myers West’s study found that users developed a number of folk theories to explain how and why their content was removed from a platform. Most believed that some form of human intervention, such as their content being flagged by other users, was the cause, although in reality algorithms and artificial intelligence play a substantial role in this process. Several users also attributed content removal to a perceived political bias on the part of the company (Myers West, 2018).

Although Facebook’s approach to content moderation is best characterized as industrial (Caplan, 2018), the company has had mixed results in its efforts to regulate content containing hate speech on its site, a failure that users, advertisers, and even the U.S. Congress have noticed. Concerns about the rise of hate speech and disinformation have increased the amount of public scrutiny placed on the search engine and social media companies that are responsible for mediating much of the world’s information (Caplan, 2018). Despite the company’s efforts, racism and other forms of discrimination persist online in ways that are both new and unique to the Internet (Daniels, 2013).



Facebook’s hate speech regulations

Facebook has prohibited hate speech since the company developed its first Terms of Service in October of 2005 (Facebook, 2019b). The company defines hate speech as “a direct attack on people based on protected characteristics — race, ethnicity, national origin, religious affiliation, sexual orientation, sex, gender, gender identity, and serious disability or disease” (Facebook, 2019b). Facebook says it has this policy because the company believes that such speech “creates an environment of intimidation and exclusion and in some cases may promote real-world violence” (Facebook, 2019b).

Facebook’s policies regarding hate speech have been regularly revised. The most recent update occurred in 2018, when the company made its detailed internal community standards policies public for the first time with the goal of highlighting Facebook’s devotion to “encouraging expression and creating a safe environment” (Facebook, 2019a). Facebook’s decision to release an expanded version of its hate speech policy was part of an effort by the company to “be more transparent [with users] about how it decides what content to take down or leave up” (Kelly, 2018). While Facebook’s Vice President of Global Policy Management, Monika Bickert, stated that the 2018 Community Standards update only “reflects standards that have long been in place,” Facebook did make some revisions as part of the 2018 update to its hate speech policy. These included adding immigration status as a protected characteristic, including a clear definition of what constitutes an “attack,” and providing detailed examples of the kinds of content that qualify as hate speech using a three-tiered approach (Facebook, 2019b). Facebook defined an attack as “violent or dehumanizing speech, statements of inferiority, or calls for exclusion or segregation” (Facebook, 2019b). The 2018 policy update made use of this three-part definition of what constitutes an attack by establishing three tiers of hate speech. According to the company, the first tier refers to “violent or dehumanizing speech,” the second tier refers to “statements of inferiority,” and the third tier refers to “calls for exclusion or segregation” (Facebook, 2019b).

Within its explanation of each tier, Facebook provides specific examples of the kinds of ideas, words, and concepts that the company considers hate speech and will remove from the site. For example, the company explains that dehumanizing speech might consist of a “reference or comparison to filth, bacteria, disease or feces;” statements of inferiority might include words like “deformed,” “low IQ,” and “hideous;” and calls for exclusion or segregation would include “content that describes or negatively targets people with slurs” (Facebook, 2019b). This addition of clear examples stands in stark contrast to the published hate speech policy Facebook had prior to 2018, which provided users with a bare-bones explanation of what hate speech was and the identities that were protected under the policy. Notably, one thing that did not change as a result of the policy update was the set of categories presented to users once they went to report hate speech that met the definition provided by the company. In order to report a piece of content as hate speech, a user must ultimately identify it as attacking one of the following categories: race/ethnicity, religious group, gender/orientation, or disability/disease.

While users have a better sense of how Facebook is defining and categorizing hate speech, little is known about how much and what types of content reported as hate speech is actually being removed. Given the company’s enormous power to regulate the speech marketplace, it is essential to understand how these policies are being applied and the impact that has on the resulting universe of discourse. Therefore, we propose the following research questions:

RQ1: Is Facebook following its own community standards regarding hate speech in its removal process for reported content?

RQ2: Is Facebook removing reported content that attacks each of the protected categories of identities listed in the community standards regarding hate speech?

RQ3: Have the updates in Facebook’s community standards and removal process impacted which content is or is not removed from the platform?




Method

To answer the research questions, we collected data at two points over a 13-month period. The first data collection period took place during January 2018 and the second took place during February 2019. Although the initial plan was to collect data only once, when Facebook announced changes to its hate speech policy, we felt it was essential to study the company’s removal process both before and after the update.

To collect data regarding removal rates, we identified and reported content we believed to be hate speech to Facebook. The definition of hate speech provided by the company was used by the two coders to determine whether the content in question should be considered hate speech or not. As a reminder, Facebook defines hate speech as “a direct attack on people based on protected characteristics — race, ethnicity, national origin, religious affiliation, sexual orientation, sex, gender, gender identity, and serious disability or disease.” In addition to the definition, other guidance provided by the company, particularly after the 2018 policy update, was used by the two coders to determine whether the post, comment, image, or video in question met Facebook’s definition. If a piece of content attacked, maligned, abused, or otherwise denigrated an individual or group based on one of the fixed characteristics listed by the company, then it was reported as hate speech. “Content” included primarily text, but also memes, emojis, images, and video.

In addition to identifying the content as hate speech, the two coders also had to determine which category of identity was being attacked in order to complete the reporting process and submit the content to Facebook for review. Category choices during both data collection periods included: race/ethnicity, religious group, gender/orientation, or disability/disease. Categories were mutually exclusive; multiple categories could not be selected during the reporting process.

Before each data collection period, the two coders established intercoder reliability using a test set of 28 comments. The percentage of agreement was 0.93 for the 2018 data collection period and 0.89 for the 2019 data collection period. To be considered “agreement,” the coders had to identify the sample comment or post as hate speech and select the same category of identity under which to report it.
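The percent-agreement measure described above can be expressed as a short sketch, modeling each coding decision as an (is hate speech, category) pair. This is an illustrative example of our own; the data below is hypothetical, not the study’s actual test set.

```python
# Hypothetical sketch of simple percent agreement between two coders.
# Each decision is a (is_hate_speech, category) tuple; agreement requires
# both elements to match, mirroring the criterion described in the text.

def percent_agreement(coder_a, coder_b):
    """Fraction of items on which both coders made identical decisions."""
    matches = sum(1 for a, b in zip(coder_a, coder_b) if a == b)
    return matches / len(coder_a)

# Illustrative decisions for four sample items (not real study data).
coder_a = [(True, "gender"), (True, "race"), (False, None), (True, "religion")]
coder_b = [(True, "gender"), (True, "race"), (False, None), (True, "race")]

print(round(percent_agreement(coder_a, coder_b), 2))  # 3 of 4 decisions match -> 0.75
```

Simple percent agreement does not correct for chance agreement; measures such as Cohen’s kappa are often preferred for that reason, but the study reports raw agreement.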

Reports to Facebook were made from one of the authors’ personal accounts, as well as a sock-puppet account. A sock-puppet account is an online identity used for purposes of deception; in other words, an account without a real person attached to it. For each report, the authors recorded Facebook’s decision about whether or not to remove the content. The category of hate speech, which the company requires users to include in the flagging process, was also recorded, along with the page the content was found on and the time of the report and response. Facebook generally provided a removal decision within 24–48 hours except in a handful of instances. The longest response time was one week.

In order to locate posts containing hate speech, the authors used a purposeful sampling technique to identify offensive content on pages and groups known in the United States for their hateful rhetoric. Pages on Facebook where hate speech regularly appeared were identified and scoured for instances of hate speech that met Facebook’s definition of the term. These pages included pro-gun groups such as “Cold Dead Hands,” as well as pro-Trump pages such as the “Deplorables.” Men’s rights pages, such as “Modern Man,” were also identified as containing a substantial amount of hate speech. Other Facebook pages where hate speech was identified and reported include: American White History Month 2, Because I’m Just a Guy, Christian Conservatives for Trump, Drain the Swamp, Liberalism is a Mental Disorder, Libtards; ya gotta love ’em, Meninist, Pissed Off White Americans, Survive Our Collapse, Trump 2020, Trump’s Deplorables, What Guys Like.

Many of these pages reflect similar core ideologies, often associated with groups aligned with the American far right, white pride movements, and men’s rights groups. They are almost all anti-feminist, anti-immigrant, anti-LGBTQ, and anti-Muslim, as well as pro-gun, pro-male, and pro-white. The prevalence of content from these groups in the sample may be due to the fact that conservative sites are more likely to generate uncivil comments. While it is important not to conflate incivility with hate speech, it is worth noting that a recent study by Su, et al. (2018), assessing and comparing the prevalence of incivility in the comments posted on the Facebook pages of a wide variety of American news outlets, found that 19 percent of all comments posted on conservative news outlets’ Facebook pages were extremely uncivil, compared to nine percent of comments posted to pages of liberal news outlets. Put simply, members of these groups are more likely than members of other coalitions to directly attack an individual’s or group’s identity characteristics, including their race, gender, gender identity, sexual orientation, immigration status, etc.

A total of 311 comments were identified by the coders as hate speech and reported to Facebook for a removal decision (n=144 during the 2018 data collection period and n=167 during the 2019 data collection period). The content that was removed by Facebook was separated from that which the company determined was not hate speech and therefore would remain on the site. A qualitative content analysis was then performed on the two resulting sets of content in order to identify trends in Facebook’s response patterns. The analysis required ongoing comparison of the data to identify categories and overarching themes (Schutt, 2015).

In terms of limitations, the ambiguity of interpreting and applying Facebook’s definition of hate speech is a clear limitation of this study. Along those lines, another important limitation is the role the authors’ own bias potentially played in determining what counted as hate speech under the company’s definition and what did not. The relatively small sample size compared to studies employing “big data analysis” also meant that specific groups became synonymous with certain identity categories. For example, most of the reported comments targeting “religion” included attacks on either Muslims or Jewish people. The comments in the sample that targeted gender were focused almost exclusively on people who identify as women. Also important to note is the fact that content was reported to the company using two different accounts: one was an author’s personal account and the other was a sock-puppet account created for this study. It is possible that the lack of activity on the sock-puppet account somehow impacted Facebook’s decision about whether to remove a particular post or not.




Results

This section highlights the results of the investigation described above. First, the numeric breakdown of Facebook’s decisions throughout the study regarding whether or not to remove content reported as hate speech is reported. Next, results from the qualitative analysis of the content that was and was not removed within each of Facebook’s reporting categories for hate speech are presented. Facebook’s categories of hate speech include: race/ethnicity, gender/orientation, religion, and disability/disease. To begin, we examine the data set as a whole, and then review differences between the 2018 and 2019 data collection periods.

RQ1: Is Facebook following its own community standards regarding hate speech in its removal process for reported content?

A total of 311 (n=311) comments, posts, or images featuring what one of the two coders considered hate speech according to Facebook’s definition were reported to the company as a violation of its community standards. Of those, 149 comments were removed and 162 were not removed. This means that less than half (48 percent) of reported hate speech was removed from the site.


Table 1: Overall removal rates for content reported to Facebook as hate speech.
| Total number of comments/posts reported | Number of comments/posts removed | Percent of total | Number of comments/posts not removed | Percent of total |
| 311 | 149 | 47.9% | 162 | 52.1% |
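For transparency, the removal-rate arithmetic behind Table 1 can be reproduced in a few lines. This is an illustrative sketch of our own, using only the totals reported above.

```python
# Reproducing the overall removal rate from the study's reported totals:
# 311 pieces of content reported, 149 removed by Facebook.

reported = 311
removed = 149
not_removed = reported - removed

removal_rate = removed / reported
print(not_removed)            # 162
print(f"{removal_rate:.1%}")  # 47.9%
```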


RQ2: Is Facebook removing reported content that attacks each of the protected categories of identities listed in the community standards regarding hate speech?

The majority of comments or posts reported as hate speech fell under the sub-category of gender/sexual orientation, followed closely by race/ethnicity and religion. Interestingly, reported hate speech that targeted people based on gender/sexual orientation had the highest percentage of posts that went unremoved. The table below outlines the removal rates for each category.


Table 2: Comments/posts removed and not removed by sub-category of hate speech.
| Category of hate speech reported | Total number of comments/posts reported in category | Number of comments/posts removed | Percent removed | Number of comments/posts not removed | Percent not removed |
| Religious group | 83 | 47 | 56.6% | 36 | 43.4% |
| Disability or disease | 3 | 1 | 33.3% | 2 | 66.7% |


The themes that emerged in the analysis of the comments and posts that were or were not removed are perhaps the most substantive of the findings presented here. The themes revealed demonstrate the extent to which Facebook does and does not follow its own community standards regarding hate speech. In order to properly contextualize the findings, the results from the categories of Gender/Orientation, Race/Ethnicity, and Religion are individually reviewed here. There were too few comments/posts associated with the category of Disability/Disease to analyze that group.


Gender/orientation

To begin, it is essential to note how problematic it is for Facebook to conflate these two categories in its reporting scheme. Some of the posts reported within this group had to do primarily with gender, while others were concerned distinctly with issues of sexual orientation or sexual identity. Therefore, these categories are reviewed independently here.

Gender-based hate speech was prolific on the site. However, when reported, the removal of gender-based hate speech was inconsistent. For example, the term “slut” was sometimes, but not always, removed: 7 of the 11 posts containing the word “slut” were removed during this study. Along those lines, the term “whore” was reported as gender-based hate speech 11 times during the study and removed 4 of those times. The term “cunt” was reported 9 times and removed only once. Notably, the term “bitch” was reported 42 times and went unremoved 64.3 percent of the time, further illustrating the inconsistency of Facebook’s response to hate speech. For example, comments such as “stupid bitch” and “disgusting bitches” were not removed when reported to Facebook as gender-based hate speech. Table 3 provides a complete breakdown of the most prevalent gender-based slurs identified in this study.


Table 3: Gender-based slurs removed and not removed.
| Gender slurs/terms implying moral inferiority | Total number of times term was reported | Number of times term was removed | Percent removed | Number of times term was not removed | Percent not removed |
| Total gender slurs | 73 | 30 | 41.1% | 43 | 58.9% |


One clear theme that surfaced from the data was the prevalence of threats made toward women in the digital community. This study identified 15 posts that made threats against women, of which 8 were removed. Keywords indicating a threat included terms such as “choke,” “shoot,” and “hang.” Despite the fact that Facebook indicates that “violent speech or support in written or visual form” (Facebook, 2019b) qualifies as hate speech, violent threats made against women were sometimes allowed to remain on the site, seemingly without reason. For example, of two nearly identical comments reported, “hang that bitch” was removed while “hang this bitch” was allowed to remain on the site. Comments about raping women and genital mutilation were also permitted to remain on the site despite being reported to moderators. Comments that mocked another user for her weight and other forms of fat shaming were likewise allowed to remain on the site after being reported.

Another clear theme that emerged in the analysis of removed and not removed comments from this category was the extent to which women were criticized if they did not adhere to traditional gender roles or feminine standards of beauty. For example, there were several comments indicating that powerful women athletes or politicians should “get back to the kitchen” or “make me a sandwich.” According to a 2010 study by Tyler Okimoto and Victoria Brescoll, this contemptuous reaction to women’s power-seeking is to be expected: Okimoto and Brescoll (2010) found that when women candidates were seen as ambitious, it evoked moral outrage on the part of voters, including contempt, anger, and disgust.

In terms of hate speech centered on a person’s sexuality or sexual identity, inconsistency in removal was once again evident. Certain homophobic slurs were always removed, while others, particularly those whose meaning depends on context or cultural connotation, were removed less consistently. For example, the term “fag” was reported and removed 7 times during the study, as was the comment “Fucking faggot, short rope tall tree.” However, the term “fruitcake” was not removed, nor were the comments “sure if you’re a homo” or “homogays.” Photos of non-binary individuals were also posted and mocked by members of several Facebook groups.

In summary, the analysis of gender- or sexual orientation-based hate speech revealed inconsistencies in the removal of gender-based slurs and threats against women. Also interesting was the extent to which failure to adhere to traditional gender roles was punished by other Facebook users with vitriolic comments and posts. Regarding homophobic slurs, most were removed, except those that required knowledge of colloquial U.S. English language and culture.


The examination of the race- or ethnicity-based hate speech reported to Facebook indicated that the company is proficient in removing the most egregious terms once they are reported by users. The “n-word” was removed each time it was reported, as were the terms “darkies,” “camel humper,” “beaner,” and “colored person.” However, less overt racial slurs that relied on wordplay, like “Oprah Nigfree” or “Niggarhea...”, were allowed to remain on the site.

One prominent theme that emerged from the analysis of data featuring race-based hate speech was the number of posts that included dehumanizing comparisons of people to animals. Most commonly, the racist comparisons identified in this study referenced animals such as pigs, monkeys, rats, and camels. To be clear, Facebook’s Community Standards recognize dehumanizing comparisons between humans and “animals who are culturally perceived as intellectually or physically inferior” as a “tier one” attack; as such, according to Facebook, these kinds of posts should not be allowed to proliferate on the site. Largely, these comparisons were removed when reported: only 22.9 percent remained on the site after being reported. Table 4 includes a more detailed breakdown of the results of reporting race-based hate speech involving comparisons to animals.


Table 4: Dehumanizing comparisons removed and not removed.
Dehumanizing term/Comparison | Total number of times term was reported | Number of times term was removed | Percent removed | Number of times term was not removed | Percent not removed
Total comparisons | 35 | 27 | 77.1% | 8 | 22.9%


Another theme that emerged in this content was the notion that white people were under attack and needed to “fight back” against people of color and other minority groups. This comment illustrates this perspective:

“If you are white be prepared as the minority trash is out there and they want everything. If that happens and do you really think a colored or minority person is going to help you find a job, get food, get shelter ... Hell no. The street will be your new address so lets get ready and remove this trash and vote out any politician that supports them while denying you and your family.”

While Facebook was effective at removing overt racial slurs, more insidious comments, like this one, were allowed to remain on the site. Also allowed to remain were some comments containing offensive remarks but no actual racial or ethnic slur, such as comments suggesting users “leave our country or return to their own country.”

Finally, comments that required contextual interpretation were not removed; for example “Black lives splatter” and a comment calling Oprah a “Cotton patch bitch.” If moderators are unfamiliar with the link between slavery, the history of cotton as a Southern cash crop, and the experience of modern African Americans as the descendants of slaves forced to harvest cotton, then they may not realize how racist these terms actually are.

What is also evident here is the extent to which Facebook relies on users to police content, as well as the importance of context in the removal process. Moreover, some content that included racist speech was not deemed hate speech by Facebook, likely because it did not contain an actual slur (i.e., “Black lives splatter”). This suggests that Facebook needs to dedicate more resources to interpreting the context of racist speech in order to ensure that all racist content is removed. Failure to do so suggests that Facebook is choosing to continue to profit from the use of the site by racist individuals and organizations.


The majority of comments reported to Facebook as religion-based hate speech included offensive remarks about Muslims. For example, a meme featuring a photo of a Barbie doll in a headscarf captioned, “Sharia Barbie: comes with hijab, bruises & quran,” was removed once reported, as was the comment, “Get rid of all Muslims.” However, general, unspecified attacks aimed at Muslims were permitted to remain on Facebook even after being reported; for example, the comment “Islam is a Cancer” was not removed once reported. More specific threats were inconsistently removed from the site. For example, Facebook did remove the comment “burn that stupid terrorist at stake,” but did not remove a comment that said “Islam is shit ... I’ll have a 40 caliber response for you.”

Another prominent theme that emerged from the data was the number of reported comments that included some kind of slur or demeaning term. The terms “rag” or “towel” were often used as a demeaning replacement for the word hijab. When reported, comments that included such remarks were typically removed. However, Facebook was less consistent in removing comments that referred to Muslims as “terrorists.” For example, a comment reading “Get rid of her!! Nobody wants this crazy terrorist bitch!” was not removed, while a comment stating “this b*** is a terrorist” was removed. In total, comments that referred to Muslims as “terrorists” were removed only 50 percent of the time. Table 5 includes a more detailed breakdown of the results of reporting religion-based slurs.


Table 5: Religious slurs removed and not removed.
Religious slurs | Total number of times slur was reported | Number of times slur was removed | Percent removed | Number of times slur was not removed | Percent not removed


Another theme identified in the data was the large number of reported comments and memes that mocked Muslims by using pictures of bacon or other types of pork. Largely, these kinds of posts were removed from the site. Finally, comments suggesting Muslims “go home” were not removed once reported; for example, comments like:

“Send them home”

“Piece of shit! Go back to your country and get beat!”

Once again, contextually driven hate speech that did not contain an actual slur, but was clearly in violation of Facebook’s community standards, was not removed despite being reported to the company.

RQ3: Have the updates in Facebook’s community standards and removal process impacted which content is or is not removed from the platform?

Before Facebook updated its publicly posted Community Standards in 2018, the authors reported a total of 144 comments, photos, and posts. Of those 144 posts, only 44.4 percent were removed from the site, leaving 55.6 percent of reported content in place. In a second round of reporting, conducted after Facebook made its 2018 updates, the authors identified and reported a total of 167 posts that violated Facebook’s hate speech policy. This second round resulted in the removal of 50.9 percent of reported posts. After the policy update, then, Facebook’s removal rate for reported hate speech increased by 6.5 percentage points. Table 6 provides a more in-depth comparison of the removal rates between the two reporting periods.


Table 6: Removal rates for content reported to Facebook as hate speech before and after the 2018 policy update.
Reporting period | Total number of comments/posts reported | Number of comments/posts removed | Percent removed | Number of comments/posts not removed | Percent not removed
Before update | 144 | 64 | 44.4% | 80 | 55.6%
After update | 167 | 85 | 50.9% | 82 | 49.1%
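The removal rates above follow directly from the raw counts in Table 6. As a quick arithmetic check (a minimal sketch, using only the counts reported in the table):

```python
# Verify the removal rates reported in Table 6 from the raw counts.
def removal_rate(removed, total):
    """Percent of reported posts that were removed, rounded to one decimal."""
    return round(100 * removed / total, 1)

# First round of reporting, before the 2018 policy update: 64 of 144 removed.
before = removal_rate(64, 144)
# Second round, after the update: 85 of 167 removed.
after = removal_rate(85, 167)

print(before)                    # 44.4
print(after)                     # 50.9
print(round(after - before, 1))  # 6.5 (percentage-point increase)
```

Note that the 6.5-point figure is the difference between the two rounded rates, matching the comparison drawn in the text.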


When examining the differences in removal rates by subcategory of hate speech before and after the 2018 policy change, the company shows some interesting inconsistencies. Most notably, Facebook seems to be doing a better job of removing hate speech that targets people based on their gender or sexual orientation. Before the policy update, the first round of reporting demonstrated that Facebook removed only 30 percent of gender/orientation-based hate speech; the second round of reporting revealed an increased removal rate of 45.2 percent, an increase of 15.2 percentage points between the two data sets. While this is a step in the right direction, Facebook is still removing less than half of reported hate speech that targets people because of their gender or sexual orientation.

Another surprising finding is that removal rates dropped for both race-based and religion-based hate speech during the second round of reporting. Facebook’s removal rate for racist hate speech decreased by 4.6 percentage points after its policy change, from 59.2 percent to 54.6 percent. In a similar vein, religion-based hate speech dropped from a 59 percent removal rate in the first round of reporting to a 55.7 percent removal rate after the company released its policy updates. Table 7 illustrates a more detailed breakdown of the removal rates for specific subcategories of hate speech before and after the 2018 policy update.


Table 7: Comments/posts removed and not removed by sub-category of hate speech before and after the 2018 policy update.
Category of hate speech | Reporting period | Total number of posts reported in category | Number of posts removed | Percent removed | Number of posts not removed | Percent not removed
Race/Ethnicity | Before update | 49 | 29 | 59.2% | 20 | 40.8%
Race/Ethnicity | After update | 33 | 18 | 54.6% | 15 | 45.4%
Gender/Orientation | Before update | 70 | 21 | 30% | 49 | 70%
Gender/Orientation | After update | 73 | 33 | 45.2% | 40 | 54.8%
Religion | Before update | 22 | 13 | 59% | 9 | 41%
Religion | After update | 61 | 34 | 55.7% | 27 | 44.3%


What is consistent across the board for all three sub-categories of hate speech is that Facebook seems to only be removing around 50 percent of the hateful rhetoric that is reported. This leaves a lot of room for improvement for the company.




Discussion
The analysis of Facebook’s decision-making process as it attempts to navigate the difficult task of removing reported hate speech yielded several interesting insights. First, no substantial improvements were noted as a result of the policy change. Although Facebook’s Community Standards added immigration status to the list of protected categories, immigration status is not included as a separate category in the reporting process. Thus, any hate speech based on a person’s immigration status was simply rolled into reports concerning race or ethnicity. As part of the policy update, Facebook also provided more detail about what did and did not constitute an attack based on a person’s protected characteristics. However, the data clearly indicated that the company was inconsistent in its removal of threats and other “attacks” on people’s identity characteristics. For example, terms such as “bitch” or “slut” were in some instances removed once reported and in other instances not, with no clear difference between the reported content.

More broadly, Facebook has a problem evaluating what should be referred to as misogynistic hate speech. While slurs that demeaned others based on their sexual orientation or ethnicity were quickly and regularly removed, slurs that demeaned someone based on their gender largely were not removed when reported. Such broad inconsistencies in removal decisions regarding gender-based hate speech call into question whether Facebook is truly treating gender as a protected category. Across the two data collection periods, gender-based hate speech was consistently less likely to be removed from the site than hate speech that targeted any of the other protected categories. Given Facebook’s enormous power to shape public discourse and influence dominant ideology, this discovery is particularly troubling. The proliferation of misogynistic hate speech on the site may actually be working to silence women and limit their participation on the platform (Citron, 2014; Citron and Norton, 2011). It is worth noting that Facebook’s removal rate for reported gender-based hate speech did increase by 15.2 percentage points between the two data collection periods; however, the fact remains that Facebook still removes less than half (45.2 percent) of reported misogynistic hate speech. As such, the company still has a lot of room for improvement in protecting victims of gender-based hate speech.

An additional area of concern raised by this study was how Facebook deals with threats made by users on the platform. Despite the company stating that “violent speech or support in written or visual form” is prohibited from the site, comments that attacked an individual’s fixed identity characteristics and included inherently violent words like “choke,” “shoot,” and “hang” were sometimes allowed to remain on the site. According to the U.S. Supreme Court in Elonis v. United States (2015), the key to determining when a threat has crossed the legal line between protected expression and a true threat rests in the intent of the poster or commenter, not, as the Third Circuit Court of Appeals originally held, in whether the individual reading the content feels threatened. However, Facebook does not seem to use either of these approaches in its efforts to regulate threatening hate speech. Instead, the key factor in whether Facebook removes a threat from the site seems to be how specific the threat is. In this study, the comments “I’d hit her with my car” and “I’ll choke her” were not removed from Facebook when reported. Perhaps Facebook should consider amending its policy regarding threats to better consider the victim’s interpretation of the content. Failure on the part of the company to do so may chill the speech of users who are threatened because of a particular identity characteristic, such as being a woman or being a Muslim.

Also notable here was the extent to which current news events influenced the amount and type of hate speech present on the site. In 2019, after Representative Ilhan Omar had been elected to the U.S. Congress, we observed substantially more hate speech regarding Muslims than we did during the 2018 data collection period. During the first period of data collection, 22 comments regarding religion were reported, compared to 61 during the second period. This suggests a connection between the nature of hate speech on the site and current news items.

Another major issue evident in the analysis of content removed and not removed by the company was its inability to effectively take context into account. Several clearly racist or misogynistic comments were not removed, likely because the posts did not contain an actual slur. For example, the phrase “cotton patch b*tch” was not removed when it was reported as hate speech. These results were somewhat unsurprising given that content moderators at Facebook are reportedly expected to review flagged content without being able to see the context in which it was made (Gillespie, 2018). However, given the role hate speech can play as a precursor to bias-motivated violence, it is essential that the company act to address this issue. Alexander Tsesis describes how “misethnic discourses” were used to justify the American slave trade and the relocation of Native Americans. Misethnic discourses include the practice of ascribing undesirable traits to members of these groups and using dehumanizing terms to describe these individuals (Tsesis, 2002). The results of this study indicated that Facebook failed to remove comparisons that likened African Americans to apes or monkeys 25 percent of the time. This means that comments featuring misethnic discourses like “A yard ape is a yard ape” or “shoot those [monkey emoji]” continue to exist on Facebook even after they have been reported by users. This is exactly the type of language that serves to make violence against African Americans and other people of color more palatable. Thus, Facebook should do more to take context, including the use of emojis, into account in its decision-making process regarding content reported as hate speech.

In addition to issues with taking context into account, a lack of transparency was also evident in Facebook’s decision-making process. In order to even begin to understand how decisions regarding reported content were made, and by whom, the authors had to rely on academic research (Gillespie, 2018, 2010; Roberts, 2019, 2016) and investigative media reports (Newton, 2019) in addition to the information published in the company’s own Community Standards. This concealment is unnecessary and likely creates additional problems, which ultimately impact the speech universe. If users are unaware of how their expression is being regulated on the platform, that expression and the resulting online public sphere are essentially being shaped in secret.

If Facebook is truly committed to free expression, the company should consider developing its own “sunshine” rules. Like the laws that require government agencies to make their proceedings open to the public, social media organizations like Facebook could make their content removal proceedings visible to the public. The current “transparency reports” published by the company include aggregate data only; instead, Facebook should simply publish the raw reporting data online. Under the German NetzDG law, social media organizations with more than two million users are already required to produce bi-annual reports outlining what content was reported, what was removed, why, and by whom (Network Enforcement Act, 2017). Facebook could replicate this reporting process for each country in which it operates. Letting users know how decisions are being made, and the frameworks being used to make them, would decrease the potential for bias inherent in the process. By sharing their outcomes more freely, Facebook and other social media companies would also create more opportunities for users to become involved in the process, perhaps even collectively. For example, repeat offenders might be more easily identified, which could help users craft collective blocklists for pages they want to avoid.

In addition to promoting transparency, Facebook should also strongly consider implementing a process for users to appeal decisions they disagree with. In this study, several reported posts that we believe clearly violated the company’s hate speech policy were not removed by content moderators. Fortunately, it seems that Facebook is in the process of building an oversight board to review controversial content moderation cases. According to the company’s draft charter, it plans to create a 40-person global board made up of people appointed by Facebook. It is unclear at this time how many cases the board will review, though the company has said each case will be reviewed by a group of three to five board members, who will make a final ruling and provide a public explanation of their decision (Wagner, 2019).



Recommendations & conclusion

The research questions proposed here sought to understand (1) whether Facebook was adhering to its stated community guidelines in its decisions about whether or not to remove reported content including hate speech; (2) whether Facebook was removing reported content that attacked each of the protected categories of identity or only certain ones; and (3) whether and how the efficacy of content moderation of hate speech had changed since the company’s policy updates. The larger goal of this analysis was to observe how Facebook is shaping users’ expression, along with online public discourse. Those observations, outlined in the results section and summarized in the discussion above, serve as the basis for the recommendations presented here. Based on our analysis of current hate speech removal practices, Facebook should pursue the following recommendations:

  1. Increase consistency in the content removal process.
  2. Broaden the interpretation of hate speech to better address misogynistic comments and posts.
  3. Consider threats from the receiver’s perspective. If a user feels threatened by a post and reports it, it should be considered an “attack” and be removed by the company.
  4. Do more to take context, particularly cultural context, into consideration in the decision-making process.
  5. Increase transparency about removal policies, processes, and decisions.

This study demonstrated the extent to which Facebook is and is not following its own community guidelines regarding hate speech. The results indicated that content attacking certain protected categories, such as gender, was less likely to be removed than content attacking race or ethnicity. In addition, this study draws attention to Facebook’s demonstrated inconsistency in the reporting process, the inability of content moderators to effectively take context into account, and the overall need for greater transparency in the hate speech reporting and removal process. The data collected and presented here also suggest that Facebook’s 2018 policy change regarding hate speech was more lip service than actual change on the part of the company. If real change is the desired goal, then Facebook should move quickly to provide content moderators with more contextual information with which to assess posts. In fact, the hate speech removal process would be substantially improved if content moderation were seen as an integral part of the company’s operations, rather than a low-level task to be outsourced.

Failure on the part of Facebook to address these issues raises questions about the company’s financial motivations for leaving hate speech on the site (Gillespie, 2010). Given the improvements in artificial intelligence and content removal algorithms, Facebook could choose to remove all instances of certain hateful slurs, such as “f*g” or “c*nt,” whenever they appear on the platform. The fact that the company does not take this action suggests that it is well aware of the financial benefits of keeping hate speech, and the users who consume that content, on the site.

Today, social media organizations have enormous power to shape which expression reaches and remains in the virtual public sphere. Some countries, such as Germany, have decided to regulate social media companies in order to force them to remove illegal hate speech (Network Enforcement Act, 2017). In the United States, hate speech is protected by the First Amendment; we are therefore reliant on these publicly traded companies to regulate the flow of hate speech into the universe of discourse. Given the role hate speech plays in creating the conditions for bias-motivated violence, it is the authors’ sincere hope that Facebook and other social media organizations devote the necessary resources to addressing this issue for the benefit of their users.


About the authors

Caitlin Ring Carlson is an Associate Professor in the Department of Communication at Seattle University. She teaches courses in media law, social media, and strategic communication. Her research focuses on media law, policy, and ethics from a feminist perspective. Carlson’s work has appeared in journals such as Communication Law and Policy, Journal of Mass Media Ethics, and Communication Law Review. Carlson received her Ph.D. in media studies from the University of Colorado Boulder.
Direct comments to: carlso42 [at] seattleu [dot] edu

Hayley Rousselle is a student at Syracuse University College of Law. She recently graduated from Seattle University with a bachelor’s degree in communication and media.



N. Alkiviadou, 2018. “Hate speech on social media networks: Towards a regulatory framework?” Information & Communications Technology Law, volume 28, number 1, pp. 19–35.
doi:, accessed 8 January 2020.

Article 19, 2018. “Responding to ‘hate speech’: Comparative overview of six EU countries,” at, accessed 8 January 2020.

R. Caplan, 2018. “Content or context moderation: Artisanal, community-reliant and industrial approaches,” Data & Society (14 November), at, accessed 8 January 2020.

C.R. Carlson, 2017. “Censoring hate speech in social media content: Understanding the user’s perspective,” Communication Law Review, volume 17, number 1, pp. 24–45.

D.K. Citron, 2014. Hate crimes in cyberspace. Cambridge, Mass.: Harvard University Press.

D.K. Citron and H. Norton, 2011. “Intermediaries and hate speech: Fostering digital citizenship for our information age,” Boston University Law Review, volume 91, pp. 1,435–1,484.

Communications Decency Act, 1996. “Telecommunications Act of 1996,” (8 February), at, accessed 8 January 2020.

Council of the European Union, 2008. “Council framework decision 2008/913/JHA on combating certain forms and expressions of racism and xenophobia by means of criminal law” (28 November), at, accessed 8 January 2020.

J. Daniels, 2013. “Race and racism in Internet studies: A review and critique,” New Media & Society, volume 15, number 5, pp. 695–719.
doi:, accessed 8 January 2020.

T. Davidson, D. Warmsley, M. Macy, and I. Weber, 2017. “Automated hate speech detection and the problem of offensive language,” Proceedings of the Eleventh International AAAI Conference on Web and Social Media, at, accessed 8 January 2020.

M. Duggan, 2017. “Online harassment 2017,” Pew Research Center (11 July), at, accessed 8 January 2020.

Elonis v. United States, 2015. Elonis v. United States, at, accessed 8 January 2020.

European Commission, 2019. “Countering illegal hate speech online — EU Code of Conduct ensures swift response,” at, accessed 8 January 2020.

Facebook, 2019a. “Community standards,” at, accessed 8 January 2020.

Facebook, 2019b. “Hate speech,” at, accessed 8 January 2020.

Facebook, 2019c. “Terms of service,” at, accessed 8 January 2020.

Facebook, 2019d. “Hate speech,” at, accessed 8 January 2020.

J. Fingas, 2017. “Tech giants team with Anti-Defamation League to fight online hate,” Engadget (10 October), at, accessed 8 January 2020.

A.H. Foxman and C. Wolf, 2013. Viral hate: Containing its spread on the Internet. New York: St. Martin’s Press.

Y. Gerrard, 2018. “Beyond the hashtag: Circumventing content moderation on social media,” New Media & Society, volume 20, number 12, pp. 4,492–4,511.
doi:, accessed 8 January 2020.

T. Gillespie, 2018. Custodians of the Internet: Platforms, content moderation, and the hidden decisions that shape social media. New Haven, Conn.: Yale University Press.

T. Gillespie, 2010. “The politics of ‘platforms’,” New Media & Society, volume 12, number 3, pp. 347–364.
doi:, accessed 8 January 2020.

W. Hartzog, 2010. “The new price to play: Are passive online media users bound by terms of use?” Communication Law and Policy, volume 15, number 4, pp. 405–433.
doi:, accessed 8 January 2020.

H. Kelly, 2018. “Facebook reveals its internal rules for removing controversial posts,” CNN Money (24 April), at, accessed 8 January 2020.

K. Klonick, 2018. “The new governors: The people, rules, and processes governing online speech,” Harvard Law Review, volume 131, pp. 1,598–1,690, and, accessed 8 January 2020.

L. Leets, 2002. “Experiencing hate speech: Perceptions and responses to anti-Semitism and antigay speech,” Journal of Social Issues, volume 58, number 2, pp. 341–361.
doi:, accessed 8 January 2020.

S. Myers West, 2018. “Censored, suspended, shadowbanned: User interpretations of content moderation on social media platforms,” New Media & Society, volume 20, number 11, pp. 4,366–4,383.
doi:, accessed 8 January 2020.

Network Enforcement Act, 2017. “Act to improve enforcement of the law in social networks,” Strafgesetzbuch (StGB) [German penal code] (1 October), at, accessed 8 January 2020.

C. Newton, 2019. “The trauma floor: The secret lives of Facebook moderators in America,” The Verge (25 February), at, accessed 8 January 2020.

T.G. Okimoto and V.L. Brescoll, 2010. “The price of power: Power seeking and backlash against female politicians,” Personality and Social Psychology Bulletin, volume 36, number 7, pp. 923–936.
doi:, accessed 8 January 2020.

P. Oltermann, 2018. “Tough new German law puts tech firms and free speech in spotlight,” Guardian (5 January), at, accessed 8 January 2020.

S.T. Roberts, 2019. Behind the screen: Content moderation in the shadows of social media. New Haven, Conn.: Yale University Press.

S.T. Roberts, 2016. “Commercial content moderation: Digital laborers’ dirty work,” In: S.U. Noble and B.M. Tynes (editors). The intersectional Internet: Race, sex, class, and culture online. New York: Peter Lang, pp. 147–160.
doi:, accessed 8 January 2020.

R.K. Schutt, 2015. Investigating the social world: The process and practice of research. Eighth edition. Thousand Oaks, Calif.: Sage.

S. Stecklow, 2018. “Why Facebook is losing the war on hate speech in Myanmar,” Reuters (15 August), at, accessed 8 January 2020.

L. Y.–F. Su, M.A. Xenos, K.M. Rose, C. Wirz, D.A. Scheufele, and D. Brossard, 2018. “Comparing patterns of incivility in comments on the Facebook pages of news outlets,” New Media & Society, volume 20, number 10, pp. 3,678–3,699.
doi:, accessed 8 January 2020.

N. Suzor, T. Van Geelan, and S. Myers West, 2018. “Evaluating the legitimacy of platform governance: A review of research and a shared research agenda,” International Communication Gazette, volume 80, number 4, pp. 385–400.
doi:, accessed 8 January 2020.

A. Tsesis, 2009. “Dignity and speech: The regulation of hate speech in a democracy,” Wake Forest Law Review, volume 44, pp. 497–532.

A. Tsesis, 2002. Destructive messages: How hate speech paves the way for harmful social movements. New York: New York University Press.

K. Wagner, 2019. “Facebook is building an oversight board. Can that fix its problems?” Bloomberg News (24 June), at, accessed 8 January 2020.


Editorial history

Received 20 September 2019; revised 4 December 2019; accepted 5 December 2019.

Creative Commons License
This paper is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Report and repeat: Investigating Facebook’s hate speech removal process
by Caitlin Ring Carlson and Hayley Rousselle.
First Monday, Volume 25, Number 2 - 3 February 2020