First Monday

Shades of hatred online: 4chan memetic duplicate circulation surge during hybrid media events by Asta Zelenkauskaite, Pihla Toivanen, Jukka Huhtamäki, and Katja Valaskivi

The 4chan /pol/ board is a controversial online space in which a surge in hate speech has been observed. While recent research indicates that events may lead to more hate speech, empirical evidence on the phenomenon remains limited. This study analyzes 4chan /pol/ user activity during the mass shootings in Christchurch and Pittsburgh and compares the frequency and nature of user activity prior to these events. We find not only a surge in the use of hate speech and anti-Semitism but also increased circulation of duplicate messages, links, and images, as well as an overall increase in messages from users who self-identify as “white supremacist” or “fascist,” primarily posted from English-speaking IP-based locations: the U.S., Canada, Australia, and Great Britain. Finally, we show how these hybrid media events share the arena with other prominent events involving different agendas, such as the U.S. midterm elections. The significant increase in duplicates during the hybrid media events in this study is interpreted beyond their memetic logic, through what we refer to as activism of hate. Our findings indicate that there is either a group of dedicated users who feel compelled to support the causes for which the shootings took place, that users employ automated means to achieve duplication, or both.


1. Introduction
2. Data and method
3. Results
4. Discussion
5. Conclusion



1. Introduction

While violence and hate have been part of the fabric of not only traditional media but also online communication (Phillips, 2015; Weaver, et al., 2012), racist and anti-Semitic hate speech, disinformation, and the circulation of conspiracy theories relating to disruptive media events have been reported to be on the rise (Hodge and Hallgrimsdottir, 2020; Lyons, 2017), under the guise of an ‘anti-political correctness’ label (Gantt Shafer, 2017). In hybridized media events, human actions are conditioned by the infrastructures of media technologies. Meaning is constructed in contexts where both human and non-human actors participate in the circulation of information, content, and affect (Sumiala, et al., 2018). This circulation may be translocal and transcultural, and the roles of production and consumption are often blurred. As violent attacks generate attention, users of different platforms adapt their rhetoric to the narratives surrounding an event and the affordances of the online environment, taking part in generating a hybrid media event.

4chan has received substantial attention from researchers for the hate-instilling nature of its interactions (Colley and Moore, 2020; Hine, et al., 2017; Mittos, et al., 2020; Nagle, 2017; Papasavva, et al., 2020; Tuters and Hagen, 2020; Tuters, et al., 2018), and for its relationship to violent events such as terrorist attacks and the circulation of hate across platforms (Oboler, et al., 2019). While early studies characterized the content of 4chan as “habitually unpleasant discourse” (Knuttila, 2011), more recent analyses of online spaces such as 4chan and specific groups such as /pol/ have revealed that they host and successfully circulate anti-Semitic hate speech (Malevich and Robertson, 2020; Zannettou, et al., 2017). They have also detected the use of 4chan for mobilization during events (Oboler, et al., 2019). Our goal here is to combine these two approaches and provide more granularity by identifying the specific nature of mobilization in the aftermath of disruptive events, showcasing the different shades of hate speech on 4chan and how violent events activate such social practices. While this study does not aim to prove that violent events provoke the surge of hate, content analysis shows that in the aftermath of a violent attack there is a surge of hate speech that contributes to the development of a hybrid media event.

We focus on two attacks, the Pittsburgh synagogue shootings in 2018 and the mosque shootings in Christchurch in 2019, as contexts for a surge in hate speech and the mobilization of users. Similar to Oboler, et al. (2019), we have analyzed hate speech in relation to terrorist attacks. However, we expand the understanding of the spread of hate speech prevalent across various fringe groups by tracing hate speech before and during two terrorist attacks.

Methodologically, our goal is to explore recurrent and potentially automated online communication which we refer to as duplicates. We operationalized our setting through message duplicate analyses and temporal content analysis. Message circulation is visible through functions of other social media platforms, such as Twitter’s retweet function, and has been analyzed to uncover content circulation (Grinberg, et al., 2019). However, 4chan does not have such a function; messages are merely reposted in their original form and are visible only when counted in the overall corpus. We are interested in the nature of these recirculated messages. To trace different shades of hate speech, we triangulated our analysis. At the message level, we included named entity analysis, image link analysis, and content keyword analysis. At the user level, we analyzed user self-naming and geographic self-naming. We aim to understand how attention is structured and accumulated during a hybrid media event and how such events create opportunities for individuals to capitalize on the attention in circulating hate speech (cf., Citton, 2017).

We address the circulation of racist and anti-Semitic discourse on the /pol/ section of 4chan in the context of two violent attacks: the Pittsburgh synagogue shootings with 11 victims in the U.S. on 27 October 2018 and the shooting of 51 people at al Noor and Linwood mosques in Christchurch, New Zealand on 15 March 2019 [1]. In both cases, the perpetrators were labeled by media as white supremacists and the attacks as hate crimes. We expected differences in racist and anti-Semitic rhetoric between the two datasets as the Pittsburgh synagogue shooting was expected to attract anti-Semitic keywords, while the New Zealand case was not an obvious context for anti-Semitic rhetoric. To contextualize our findings, we created two control datasets to understand changes in hate speech during disruptive hybrid media events.

1.1. Hybrid media events: Theoretical considerations

The theoretical framework of this study is based on media event theory (Dayan and Katz, 1992; Katz and Liebes, 2007). Theories of hybrid media system (Chadwick, 2013) and hybrid media events (Sumiala, et al., 2018) have tended to ignore fringe communities and have rather focused on the relationship between so-called mainstream media and commercial social media platforms. Recently, however, analyses of online spaces have provided evidence that fringe communities are not isolated from the larger media environment (Ylä-Anttila, et al., 2019; Toivanen, et al., in press); and that their role in the circulation of hate speech, conspiracy theories, and disinformation is significant (Zannettou, et al., 2017).

In our contemporary hybridized media environment (Valaskivi, et al., 2019; cf., Chadwick, 2013), the complexity and hybridity of disruptive media events must be recognized. Hybrid media events can be examined by looking at five elements, the so-called five As: actors, affordances, attention, affect, and acceleration (Sumiala, et al., 2018). Actors and affordances relate to the environment in which a hybrid media event unfolds. Attention is the motivation for a hybrid media event, which fuels the circulation of affect. Finally, acceleration is the consequence of all the other elements and the process they activate. Our main focus is on the actors, affordances, and circulation of affective content in hybrid media events. We examine the affordances of the 4chan platform, including anonymity and duplicated content, to understand how anonymous actors represent themselves in conversations related to violent, hybrid media events. We also study how attention is directed toward affective hate speech during hybrid media events and how racism and anti-Semitism circulate and increase during these events.

1.2. 4chan and fringe communities online

This study analyzes hate on 4chan, which describes itself as an “image-based bulletin board.” [2] 4chan is an anonymous forum where anyone can participate without registering. Some communities, such as ‘Politically Incorrect’, known as /pol/, carry a disclaimer that exempts the site from being held responsible for the content of posts [3]. 4chan reports the following traffic: the site attracts around 22 million users a month worldwide (46 percent from the U.S., eight percent from the U.K., six percent from Canada, five percent from Australia, four percent from Germany, two percent each from France, Sweden, and the Netherlands, 1.5 percent from Poland, and 1.5 percent from Brazil); 70 percent of users are males aged 18–34. The site features around 900,000–1,000,000 messages per day [4]. While it is hard to authenticate user location reliably due to, for example, IP masking, a study in which results were normalized by Internet users per capita showed that, in 2017, most /pol/ users identified with one of the following countries: the U.S., New Zealand, Canada, Ireland, Finland, and Australia (Hine, et al., 2017). The site offers a set of self-identification labels, automatically generated according to the user’s IP address, that accompany the user’s messages [5]. In addition to country flags, these include ideological terms such as nazi, tree-hugger, anarcho-communist, and confederate, among others (Malice, 2019). Users of /pol/ can replace the automatically generated country flags to identify themselves with “non-state organisations and political positions” (Wall and Mitew, 2018), or they can simply mask or manipulate their IP to stage their location.

While 4chan promotes free speech, it has been known for fostering radicalization, which has received extensive coverage in the U.S. media. Controversial media coverage of the site is reflected in Brian Feldman’s description of /pol/ as “the site’s most toxic community — as well as increasingly (and bizarrely) influential in online white-supremacist politics.” [6] He described how 4chan users circulated videos with guns and encouraged viewers to “stay white.” These videos were posted after the shooting of five Black Lives Matter protesters in Minneapolis on 24 November 2015, who were protesting the fatal police shooting of an unarmed black man [7]. This indicates that specific events, such as protests and violent attacks, solicit active responses from what Feldman calls “white supremacists.” In 2014, before the surge in popularity of the /pol/ discussion board, 4chan was already known as an arena for different forms of toxicity, such as celebrity nude leaks [8]. It is currently considered a forum for right-wing contrarianism, extremism, and user mobilization [9].

The academic community has also begun to raise awareness of the importance of analyzing communities that embrace hate online. Hine, et al. (2017) highlighted a lack of knowledge among the scholarly community on radical online communities such as 4chan and Gab. In 2020, however, scholars’ overall understanding of how these communities operate is substantial (Papasavva, et al., 2020), even if the effects of the specific events on the proliferation of hatred are not yet fully uncovered.

Some scholars view fringe communities as vehicles of conspiracy theory generation (Tuters, et al., 2018), while others see them as propagators of racial and anti-Semitic slurs. On 4chan, racial slurs have been combined with anti-Semitic rhetoric. Scholars have argued that in the past four years, 4chan has been radicalized by the emergence of radical fringe groups that use social media as an arena to engage in “nationalistic politics coupled with racist ideology” (Tuters, et al., 2018). Scholars of 4chan, such as Zannettou, et al. (2020), link the politically incorrect group /pol/ with alt-right ideologies. The alt-right, an abbreviation of “alternative right,” is defined by Hodge and Hallgrimsdottir (2020) as a radicalized far right professing a multifaceted ideology. They state that the alt-right embraces “anti-Semitic, anti-feminist, anti-multiculturalism, anti-post-modernism, anti-political correctness, anti-Afro centrism, pro-white, pro-Europe, pro-traditional families, pro-scientific racism, pro-free speech, and anti-SJW (social justice warrior) ideologies online.” [10] Moreover, according to Hodge and Hallgrimsdottir (2020), the alt-right as an ideology is tightly connected to ultranationalist groups such as Generation Identitaire, a European nationalist and anti-immigration movement.

Online platforms have been found to be weaponized to congregate and disseminate racist and anti-Semitic rhetoric online. Anti-Semitic rhetoric is particularly prevalent on online platforms such as 4chan and Gab (Zannettou, et al., 2020). Zannettou, et al. (2020) concluded that between July 2016 and January 2018, anti-Semitic rhetoric on 4chan /pol/ increased and circulated to more mainstream social media platforms in the form of memes and racist slurs. These were more pronounced during specific events such as the 2016 presidential inauguration or the Charlottesville white supremacist rally and the violence surrounding it. This “memetic warfare” has been called the new political trolling (Merrin, 2019).

Few studies have examined how these fringe communities rally during specific hybrid media events (Potts and Harrison, 2013; Thompson, 2018). Potts and Harrison (2013), analyzing the Boston Marathon bombing on 4chan and Reddit, showed how, by using images, users of 4chan engaged in investigative work to find the suspect; the discussion, however, also included other bombing topics and conspiracy theories related to the bombing. Another exception is Thompson’s (2018) report on the increase in anti-Semitic hate rhetoric after Trump’s presidential campaign. Thompson (2018) traced keyword frequency from Trump’s campaign launch to the Unite the Right rally in Charlottesville in August 2017, showing an increase in terms associated with white supremacy, such as white genocide, red siege, iron march, kike, and the n-word, from a few hundred to four million by the end of 2018. Similarly, 4chan has been found to systematically solicit hate rhetoric, contribute to radicalization, and circulate anti-Muslim discourse during mass shooting events reported by the media [11]. The goal of this study is to assess the degree to which mass shooting events attract and circulate discourses of hate on 4chan /pol/ and thus contribute to the formation of the hybrid media event.

1.3. Defining the rhetoric of hate

Hate speech is recognized as a growing problem in online communities and often manifests as anti-Semitism. Research has shown that hate speech is not only the product of site users’ motivations and actions but also of the interplay between site and platform policies, technological affordances, and users’ communicative acts (Ben-David and Fernandéz, 2016). It is also known that hate speech has been systematically used to increase animosities in discussions on conflict-prone topics (Arif, et al., 2018).

Although from a content analysis perspective the term “hate speech” can be treated as a latent category rather than a manifest one, it has been analyzed through specific keyword lists. Keyword lists can be categorized into two types: top-down and bottom-up. Top-down lists include keywords that have been identified a priori by researchers, while with the bottom-up approach, keywords emerge from the given datasets. Bottom-up, or data-driven, approaches identify context-sensitive keywords, while top-down approaches allow for theoretical operationalization based on previously established theoretical parameters. For example, a bottom-up approach used to analyze hate speech in Canadian tweets (Chaudhry, 2015) yielded racially charged words such as white boy, paki, whitey, pikey, nigga, spic, crow, squinty, and wigga. In this study, we used a combination of approaches to generate content lists.

Manual labeling of content with toxicity scores allows for the data to be analyzed in relation to the construct of toxicity, an approach that has been used to analyze 4chan /pol/ hate speech (Papasavva, et al., 2020). Specific practices that capture anti-Semitic discourse have been analyzed through the lens of the uses of (((They))) on 4chan, which was interpreted as “memetic antagonism and nebulous othering” (Tuters and Hagen, 2020). Tuters and Hagen (2020) describe (((They))) as a meme that was constructed with triple parentheses: Triple parentheses on 4chan began as an “explicitly anti-Semitic meme on an alt-right podcast.” [12] (((They))) had been identified by the researchers a priori as a form of othering and theoretically conceptualized as a way of identifying a common opponent to create an ingroup.

Hate speech on 4chan has been analyzed using discourse-based approaches (Ludemann, 2018). For example, racial hatred has been examined in relation to “fighting for the race” discourses (Mittos, et al., 2020). Mittos, et al. (2020) analyzed specific discourses on genetic testing as ways of propagating highly toxic language in hateful, racist, and misogynistic comments. In addition, the /pol/ group was found to openly engage in anti-Semitic rhetoric, often through memes. These discourses are intertwined with fringe political views.

In addition to technological affordances, platform policies, and user practices, the culture of a platform may also support the use of hate speech. Nagle (2017) analyzed cross-platform hate discourses and identified alt-right language as a cultural peculiarity of 4chan. “Kek,” which is equivalent to “lol,” stems from a culturally ambivalent term for an ancient Egyptian deity, but it is used online in relation to Pepe the Frog, an insider “joke” meme that has become iconic in alt-right ideologies. This blurs the lines between hate speech and degrading humor or mockery. Our study further examines the prevalence of context-specific hate speech on /pol/.

1.4. Research questions

In light of previous research on hatred on 4chan, the overarching aim of this research is to identify the granularity of the circulation of hatred on /pol/. We address this question by analyzing two media events: the Christchurch attacks on two mosques and the Pittsburgh synagogue shootings. We collected data related to each mass shooting (we refer to these two sets of data as the “target” datasets) and compare this data with datasets that allow us to establish baseline expectations of user behaviors (which we refer to as “control” datasets). This study is based on a multilateral design and is grounded in the following research questions:

RQ1a: How does the density of content on 4chan /pol/ differ between the target datasets and the control datasets?

RQ1b: How does the density of racist and anti-Semitic hate content on 4chan /pol/ differ between the target datasets and the control datasets?

We expected an overall increase in hate speech during the two hybrid media events and an increase in anti-Semitic keywords in the Pittsburgh case, given that the shooting targeted a synagogue.

Research question two relates to content circulation, which we operationalize through duplicates, i.e., repeated content. Past research shows how users have employed duplicate messages as a strategy in public texts to fulfill their visibility goals (Zelenkauskaite and Herring, 2008). Duplicates are an easy and efficient way of promoting ideas online, since duplication only requires copying and pasting a message and can even be automated by creating an algorithm to repeat and repost content. In some ways, duplicates resemble memetic behavior, since both represent what Shifman (2013) calls “sharing economies.” Shifman (2013) argues that such sharing encompasses distribution and communication, and she treats memes as part of online culture. While duplicates fit into the logic of memetic sharing, in this study they go beyond that cultural logic, since we have focused on the meaning of duplicates that disseminate hate, expanding on studies that have analyzed the memetic (sub)cultural logic of 4chan (see Tuters, et al., 2018). Thus, we ask the following research questions:

RQ2a: How does the circulation of the content differ between the target datasets and the control datasets?

RQ2b: What is the nature of the most widely circulated content?

We expected to find more duplicate circulation in the target dataset than in the control dataset.

Research question three focuses on the actors involved in the hybrid media events in relation to racist or anti-Semitic self-labeling:

RQ3: What is the density of users who identify themselves using racist or anti-Semitic self-labeling in the target versus the control datasets?

Based on hybrid media event theory, we expected to find more users with racist or anti-Semitic labels in the target datasets.

By considering both the Christchurch shootings in New Zealand and the Pittsburgh shootings in the U.S., this study extends research on hate speech on 4chan beyond particular national settings by analyzing the nature of user reactions to these events from various locations around the world. Based on this, we pose RQ4:

RQ4: What are the similarities and differences between national and global audiences of these two hybrid media events?

We predicted a core audience and common topics; however, we expected to find event-specific elements that were not present in the control datasets, which represent content exchanged prior to the analyzed events.



2. Data and method

2.1. Datasets and analytical approaches

In this study, we compared two case studies — the Christchurch and Pittsburgh shootings — and four datasets, which are summarized in Table 1.


Table 1: Overview of the datasets.
Dataset | Timeframe of message collection | Number of messages
Christchurch target | 15–29 March 2019 | 416,453
Christchurch control | 14 February–14 March 2019 | 3,500 (randomly selected)
Pittsburgh target | 27 October–10 November 2018 | 135,000
Pittsburgh control | 28 September–26 October 2018 | 3,500 (randomly selected)
Total | | 558,453


For data collection, we used 4CAT (Peeters and Hagen, 2018), a Web-based tool for capturing data from 4chan. This tool saves 4chan posts from /pol/ and allows a user to export posts from a selected timeframe. Posts can be extracted with the following variables: the post itself, the user’s anonymized ID, the post thread, the image filename, the country of the user, the ideology-based flag of the user, and the date of the post.

Target datasets were collected during the events (two weeks from the onset), while the control datasets contain 3,500 random posts published during the month prior to each of the mass shooting events. The control datasets were used to establish a baseline against which to compare posting behaviors on 4chan /pol/.
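The control sampling described above can be sketched as follows (a minimal illustration with pandas; the function and column names are our own, not the authors' code):

```python
import pandas as pd

def draw_control(posts: pd.DataFrame, n: int = 3500, seed: int = 42) -> pd.DataFrame:
    """Draw a reproducible random sample of posts as a control dataset.

    `posts` is assumed to hold one exported 4chan post per row for the
    month preceding the event; `seed` makes the draw repeatable.
    """
    return posts.sample(n=min(n, len(posts)), random_state=seed)
```

Sampling without replacement keeps each post at most once, so the control set never over-represents a single message.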

We conducted comparisons between and within datasets. By including a month’s worth of /pol/ data posted before each media event took place, we expected to establish a baseline, enabling us to observe the extent to which media events change behaviors in the fringe community of 4chan /pol/.

2.2. Design and operational definitions of the variables used in the study

This study encompasses a multi-level analysis to capture hate speech circulation over time. This includes a user level of analysis and a content level of analysis of the two target datasets and their control counterparts. Similar multi-layered approaches have been used in previous studies to study hidden or complex media phenomena in online spaces (Zelenkauskaite and Balduccini, 2017). Table 2 summarizes the levels of analysis and study design.


Table 2: Study design: Research questions and operational definitions.
Research question | Approach | Evidence
RQ1a: How does the density of the content on 4chan /pol/ differ between the target dataset and the control dataset? | Named entity comparison: (1) within cases (control vs. target); (2) between datasets (Christchurch and Pittsburgh) | Longitudinal density of named entities
RQ1b: How does the density of racist and anti-Semitic hate content on 4chan /pol/ differ between the target dataset and the control dataset? | Keyword frequency comparison: (1) within cases (control vs. target); (2) between datasets (Christchurch and Pittsburgh) | Density of keywords
RQ2a: How does the circulation of the content differ between the target dataset and the control dataset? | Comparison of duplicate messages, images, and links: (1) within cases (control vs. target); (2) between datasets (Christchurch and Pittsburgh) | Density of repeated content
RQ2b: What is the nature of the most widely circulated content? | Top link analysis | Top link qualitative content analysis
RQ3: What is the density of users who identify themselves using racist or anti-Semitic self-labeling in the target versus the control dataset? | User-level self-labeling comparison: (1) within cases (control vs. target); (2) between datasets (Christchurch and Pittsburgh) | Density of user self-labeling in messages
RQ4: What are the similarities and differences between national and global audiences of hybrid media events? | Country analysis by hate speech keywords: (1) within cases (control vs. target); (2) between datasets (Christchurch and Pittsburgh) | Geography of hate-related keywords


To analyze the complexity of hate speech on 4chan, we operationalized racist and anti-Semitic hate through socio-technically relevant variables found on 4chan. In addition to keyword analyses, we included link and image link analysis as types of content circulation over time. To explore the granularity of racist and anti-Semitic discourse, we worked with two cases about which we had differing expectations regarding anti-Semitic rhetoric. Furthermore, to gauge a baseline for comparison, we created control datasets of messages exchanged prior to the two shootings.

To capture the circulation of hate speech, we compared control and target datasets by calculating frequencies of duplicated content. The duplicate analysis builds on previous between-platform approaches to circulation such as that employed by Zannettou, et al. (2017), who examined news circulation across Reddit, 4chan, and Twitter to trace fringe communities.

2.3. Operationalization of analyzed concepts

The variables included in the study design are operationalized as detailed below.

2.3.1. Hate-related keywords

We employed the anti-Semitic keywords used by Zannettou, et al. (2020), together with user geography, user self-labeling, named entities, and link and image link analysis. Hate speech was operationalized through keyword frequency. We adopted previously identified racist and anti-Semitic keywords for our analysis, including “black(s),” “white(s),” “kike(s),” and “Jew(s).” In addition to this top-down approach, we also employed a bottom-up approach by including ideological labels adopted by users and extracted from our datasets. We tested for racism and anti-Semitism by extracting the percentages of users who labeled themselves “white supremacist,” “nazi,” “confederate,” and “fascist.”
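A frequency count of this kind can be sketched as below (a minimal illustration; the keyword list follows the terms named above, but the exact matching rules are our assumption, not the study's code):

```python
import re
from collections import Counter

# Keywords adapted from the terms named above; singular stems also match plurals
KEYWORDS = ["white", "black", "jew", "kike"]

def keyword_frequencies(messages, keywords=KEYWORDS):
    """Count case-insensitive whole-word occurrences of each keyword."""
    patterns = {k: re.compile(r"\b" + re.escape(k) + r"s?\b", re.IGNORECASE)
                for k in keywords}
    counts = Counter()
    for message in messages:
        for keyword, pattern in patterns.items():
            counts[keyword] += len(pattern.findall(message))
    return counts
```

Dividing each count by the total number of messages in a dataset would yield the densities reported in the results tables.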

This study analyzes anti-Semitism by combining various approaches proposed in previous studies. Following previous studies of radicalization, we analyzed topics of posts, user geographic distribution, and named entities (Hine, et al., 2017; Papasavva, et al., 2020). We used link analysis to explore the cross-platform distribution of hate speech through links to other platforms, such as YouTube (Hine, et al., 2017).

2.3.2. Named entities

Named entity recognition is a technique that is widely used in the field of natural language processing. It involves extracting structured information from unstructured source data, such as text (Nadeau and Sekine, 2007). There are many different ways of implementing named entity recognition, but it is typically used to detect names of persons or locations, dates, and numeric expressions (Nadeau and Sekine, 2007). In the area of text processing, named entity recognition has recently been applied to identify event locations from Twitter data (Zhou, et al., 2016) and to detect named entities of interest from citizens’ criminal complaints (Schraagen, et al., 2017).

For named entity recognition, we used the spaCy (v2.2+) library for Python. Specifically, we used spaCy’s “en_core_web_sm” model trained on the OntoNotes corpus (Weischedel, et al., 2013), which is a large collection of different text genres, such as “news, conversational telephone speech, weblogs, Usenet newsgroups, broadcast, [and] talk shows” (Weischedel, et al., 2013) [13].

2.3.3. Duplicates

Duplicates were operationalized as content repeated in its original form across messages. We specifically focused on message duplicates, link duplicates, and image file name duplicates. We used Python to detect message duplicates, treating messages as duplicates if their content was identical character by character. We detected link duplicates by first extracting all links using the Python “urlextract” library and then comparing them in the same way as message duplicates. Image duplicates were extracted by comparing the image file field in posts using the same method as for message and link duplicates.
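The duplicate logic can be sketched as follows (a minimal illustration; we substitute a simple regex for the urlextract library used in the study, and the function names are ours):

```python
import re
from collections import Counter

URL_RE = re.compile(r"https?://\S+")  # stand-in for the urlextract library

def find_duplicates(items):
    """Return items that occur more than once, with their counts."""
    return {item: n for item, n in Counter(items).items() if n > 1}

def message_duplicates(messages):
    # Messages count as duplicates only if identical character by character.
    return find_duplicates(messages)

def link_duplicates(messages):
    # Extract links first, then compare them exactly as messages are compared.
    links = [url for message in messages for url in URL_RE.findall(message)]
    return find_duplicates(links)
```

The same `find_duplicates` helper applies unchanged to image filenames, since all three comparisons rest on exact string equality.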

2.3.4. Geography and self-labeling

Geography and self-labeling were operationalized as the sociotechnical affordance that allows users to have their country automatically assigned based on their IP address or manually select a label from a list. Such labels appear in a message as flags. These variables were extracted at the message level. We counted the frequency of the labels “fascist,” “white supremacist,” “nazi,” and “confederate” in both the target and control datasets and compared them.
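Counting the share of messages carrying such labels can be sketched as follows (the column name "flag" is a hypothetical stand-in for the exported flag field):

```python
import pandas as pd

# The four self-labels compared across target and control datasets
LABELS = ["fascist", "white supremacist", "nazi", "confederate"]

def label_density(posts: pd.DataFrame, flag_col: str = "flag") -> dict:
    """Share of posts whose flag matches each ideological self-label."""
    flags = posts[flag_col].fillna("").str.lower()
    return {label: float((flags == label).mean()) for label in LABELS}
```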



3. Results

3.1. Longitudinal flow of the content

To answer RQ1a, concerning differences in the density of the content on 4chan /pol/ between the target datasets and the control datasets, we generated temporal visualizations of named entities, which are presented in Figures 1 and 2 below.

To create a timeline representation of the frequency of named entities, we used Python for data pre-processing and visualization. First, we extracted entities and their individual timestamps from the raw control and target data using Pandas (version 1.0.3), a Python library for data analysis (McKinney, 2011). Second, we visualized the data with Seaborn, a Python-based data visualization library (Waskom, et al., 2020). The x-axis represents the timeline and the y-axis the daily frequency with which named entities were used in messages. The entity frequency was normalized by the total message count in the respective dataset. The top 10 entities were filtered according to their total number of occurrences in the control and target datasets. The color legend of named entities is presented in descending order by total number of occurrences.
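The aggregation behind these figures can be sketched as follows (a pandas-only illustration; the column names are our assumptions, and the Seaborn call is shown as a comment):

```python
import pandas as pd

def entity_timeline(entities: pd.DataFrame, total_messages: int) -> pd.DataFrame:
    """Aggregate entity mentions per day, normalized by corpus size.

    `entities` is assumed to have a 'timestamp' and an 'entity' column,
    one row per extracted named entity.
    """
    df = entities.copy()
    df["day"] = pd.to_datetime(df["timestamp"]).dt.date
    daily = df.groupby(["day", "entity"]).size().reset_index(name="n")
    daily["freq"] = daily["n"] / total_messages  # normalize by message count
    return daily

# Plotting, roughly as in the study's figures, would then be:
# seaborn.lineplot(data=daily, x="day", y="freq", hue="entity")
```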

Both figures show an increase in the specified named entities at the time of the two mass shootings, which took place on 27 October 2018 and 15 March 2019, respectively. To contextualize Figures 1 and 2, it is worth mentioning that there was an overall increase in messages in the target datasets compared with the control datasets. Specifically, the overall message count for the 15 days prior to the Pittsburgh mass shooting was 1,964,818, compared with a 15-day total of 2,035,706 messages after it. In the Christchurch case, the pattern was the same: in the 15 days prior to the mass shooting, users produced a total of 1,705,645 messages, compared with 1,980,564 messages in the 15 days after the event.


Longitudinal visualization of named entities in Christchurch datasets
Figure 1: Longitudinal visualization of named entities in Christchurch datasets.


The legend at the top right corner of Figures 1 and 2 lists the named entities in order of highest density, operationalized as the frequency of a given named entity; the top entity had the highest frequency. The most prominent named entities found in the Christchurch datasets (Figure 1) included references to “Jews,” “Jewish,” “Israel,” the derogatory term “kek,” and Islam-related words, such as “Muslims.” Other named entities referred to geographic locations, such as “NZ,” where the attack took place, as well as Europe and America. Named entities related to the U.S. included “Trump,” “American,” and “first.”


Longitudinal visualization of named entities in Pittsburgh datasets
Figure 2: Longitudinal visualization of named entities in Pittsburgh datasets.


In the Pittsburgh dataset (Figure 2), the most prevalent named entities can be categorized as references to “Jews,” “Jewish,” “Israel,” and “kek.” The second most prevalent group of named entities was U.S.-related and included “Trump,” “Republicans,” “Democrats,” “house,” “first,” and “California.” “House” here refers to the House of Representatives. In addition to the U.S.-based locations, other mentioned locations included “Europe.”

3.2. Circulation of racist and anti-Semitic content

To answer RQ1b, concerning how the density of racist and anti-Semitic hate content on 4chan /pol/ differs between the target datasets and the control datasets, we analyzed the results presented in Table 3.


Table 3: Frequency of racist and anti-Semitic keywords.
Keywords | Christchurch target n (%) | Christchurch control n (%) | Pittsburgh target n (%) | Pittsburgh control n (%)
White | 23,911 (5.7%) | 188 (5.4%) | 4,589 (3.4%) | 174 (5%)
Black | 4,848 (1.2%) | 61 (1.7%) | 1,712 (1.3%) | 79 (2.3%)
Jew | 17,388 (4.2%) | 179 (5.1%) | 7,458 (5.5%) | 115 (3.3%)
kike | 6,518 (1.6%) | 67 (1.9%) | 2,339 (1.7%) | 59 (1.7%)
Total | 52,665 (13%) | 495 (14%) | 16,098 (12%) | 427 (12%)


The results in Table 3 show an overall greater use of all keywords in the target datasets compared to the control datasets, with the following percentage-wise differences. The most prevalent word in the Christchurch data was “White,” followed by “Jew.” In the Pittsburgh dataset, the most prominent keyword was “Jew,” followed by “White.” The word “Jew” was used substantially more frequently in the Pittsburgh target dataset compared to the Pittsburgh control dataset. Note that the percentages reported in Table 3 are derived from the totals of the respective datasets (see totals in Table 2).

3.3. Content circulation through duplicates

To answer RQ2a, concerning how the circulation of content differed between the target datasets and the control datasets, we identified duplicates within and across datasets.


Table 4: Duplicates across datasets.
Dataset | Duplicate messages n (%) | Total images | Image duplicates n (%) | Total links | Link duplicates n (%)
Christchurch target | 18,221 (4.4%) | 44,490 | 2,788 (6.3%) | 28,347 | 16,994 (59.9%)
Christchurch control | 80 (2.3%) | 967 | 9 (0.9%) | 196 | 7 (3.6%)
Pittsburgh target | 7,331 (5.4%) | 106,490 | 9,328 (8.8%) | 11,489 | 7,947 (69.2%)
Pittsburgh control | 88 (2.5%) | 963 | 3 (0.3%) | 267 | 20 (7.5%)
Total | 25,720 (15%) | 152,910 | 12,128 (16%) | 40,299 | 24,968


Table 4 shows that the percentages of message duplicates in the target datasets were double those in the control datasets for both Christchurch and Pittsburgh. We found similar patterns for both image and link duplicates. We found six times as many image duplicates in the Christchurch target dataset compared to the control dataset and eight times as many in the Pittsburgh target dataset compared to the control. Link duplicates were roughly 10 times more frequent in the target datasets compared to the control datasets [14].
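The duplicate counts above depend on how a "duplicate" is matched. As a minimal sketch, assuming duplicates are exact repeats of a normalized message body (the paper does not detail its matching rule), hashing each message and counting repeated hashes is one straightforward approach:

```python
# Sketch of duplicate identification via hashing of normalized text.
# The normalization (strip + lowercase) is an illustrative assumption.
import hashlib
from collections import Counter

def duplicate_stats(texts):
    """Return (duplicate_count, share), where a 'duplicate' is any message
    whose normalized body occurs more than once in the dataset."""
    hashes = [
        hashlib.sha1(t.strip().lower().encode("utf-8")).hexdigest()
        for t in texts
    ]
    counts = Counter(hashes)
    # Every copy of a repeated message counts toward the duplicate total.
    dup = sum(n for n in counts.values() if n > 1)
    return dup, dup / len(texts)

msgs = ["MAGA", "maga", "unique post", "another unique post"]
print(duplicate_stats(msgs))  # (2, 0.5)
```

The same routine applies unchanged to image filenames or link URLs, which is how per-type duplicate shares like those in Table 4 could be produced.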

The examples below are of duplicate messages showing anti-Semitic sentiment. These messages were repeated multiple times.

Example 1: Anti-Semitic message
2018-10-02 23:04:31 >>187867740
if you did the job right the first time and killed all the jews , we wouldnt have this problem. but no you had to attack russia.

Example 2: Anti-Semitic message
2018-10-03 12:05:16 >>187922731
Quiz: Same kike shilling with alternate accounts or no?
>pic related

Duplicate messages in the Pittsburgh target dataset also contained pro-Republican election slogans, such as MAGA (the Make America Great Again slogan used in the 2016 U.S. Republican presidential campaign), repeated 176 times, and “Red California,” which expressed the wish for Republican candidates to win over California’s electorate.

3.3.1. Image duplicate overview

As Table 4 shows, the numbers of image duplicates varied across samples. The Christchurch target dataset contained generic titles, such as maxresdefault.jpg, indicating that the file had been downloaded from a different site, and download.jpg. It also contained a “join degeneracy akarin.png” image file, of which 20 versions were shared. Each version was shared between 20 and 60 times. One version, posted in response to an anti-Semitic remark, was found in the thread below [15].


Sample message containing a frequently duplicated image
Example 3: Sample message containing a frequently duplicated image.


The image in Example 3 contains a link inviting users to join Discord — an application that originally hosted gamers but soon became home to hackers and, more recently, white supremacists [16]. While the link is no longer valid — Discord explains that this could be for multiple reasons — the image demonstrates how users with specific topical interests (in this case, anti-Semitic sentiments) were mobilized to join a different platform. Other prominent image files were “kek.jpg” (shared 29 times), “Sloppy job Mossad.jpg” (shared 17 times), and “Nigger.png” (shared 18 times).

The Christchurch control dataset contained only a small number of duplicated images: “images.jpg” was repeated eight times and “mememagic.jpg” was repeated twice. Other images were posted only once.

The Pittsburgh target dataset contained generic image names, such as “file.png,” which was repeated the most times (353 times). Other prominent repeated images were “MAGA.jpg” and “kek.jpg” (each repeated 11 times) and “Halloween frog.jpg” (repeated eight times).

The Pittsburgh control dataset had the fewest image duplicates. The most frequently circulated ones had generic titles, such as “image.jpg” and “download.jpg.”

3.3.2. Keyword analysis in duplicates

Table 5 shows the prevalence of keywords in duplicates across the datasets.


Table 5: Frequency of anti-Semitic and racist keywords in duplicates.
Keyword in duplicates | Christchurch target n (%) | Christchurch control n (%) | Pittsburgh target n (%) | Pittsburgh control n (%)
white(s) | 136 (0.8%) | 0 | 19 (0.3%) | 0
black(s) | 35 (0.2%) | 0 | 15 (0.2%) | 0
Jew(s) | 88 (0.5%) | 0 | 41 (0.6%) | 0
kike(s) | 16 (0.1%) | 0 | 23 (0.3%) | 0
Total | 275 (1.5%) | 0 | 98 (1.3%) | 0


The duplicates in the control datasets did not contain any racially charged words. However, all four terms were found in duplicates in the target datasets. “Jew(s)” was the most prevalent keyword in the Pittsburgh dataset, while “White(s)” was the most prevalent in the Christchurch dataset.

3.3.3. Link duplicate analysis

Given the prevalence of duplicate links in the datasets, we performed a top link analysis, which entailed qualitatively analyzing the most frequently circulated links.

Christchurch target top link analysis. The links in the Christchurch target data can be divided into two general categories. The first category of links led to “news” sites or pieces that could be seen as related to the event but were mostly from before the event [17]. These links were accompanied by speculation concerning false flag theories claiming that the attacks were committed with the intent of disguising the actual source of responsibility, referring to covert operations of various governments. The term false flag is popular among conspiracy theory promoters. The false flag theories were mostly related to international politics; Mossad was frequently mentioned, as was speculation about relationships between various Arab countries. The second category contained (supposed) links to a video of the attack or other references to the attack itself. The content of the links could not be verified, as the content of all links leading to YouTube had been removed. However, the video of the attack was found through three other video services linked in the data. The only one of the 120 most circulated links that did not relate to the attack was repeated 94 times; it appeared to be an advertisement for an app designed to “make your internet safer.”

The most circulated link was to a Telegraph article — “Mossad spy ring ‘unearthed because of Christchurch earthquake’” — which appeared 192 times, although the article was from 2011. It was shared in two forms: as a direct link to the Telegraph page (112 times) and as a link to another site (80 times). The link was shared 10 times with a post including a lengthy text [18].

On the list of the most frequently posted links, the first link to the video of the attack led to an external Web site. The users of that site thanked the poster for sharing the video as, apparently, it was difficult to find. There, it was accompanied by a warning regarding disturbing material. The video itself was from YouTube and began with a picture of Goebbels and a quotation from him. The video had been edited and had a soundtrack. The third most frequently found link to the attack video led to Kiwifarms, which is the N.Z. equivalent of 4chan. There was no warning to indicate disturbing material, and the video appeared to be available in its entirety with no sound editing.

Generally, we found very little material referencing the event itself (due to deleted videos, at least some of which we can assume were the video of the shooting). The links mostly led to news items that were not related to the event but were associated with it through various false flag theories. References to Mossad were significant in the link data.

Christchurch control dataset top link analysis. In the Christchurch control dataset [19], the most circulated link was an animation of Byron Smith’s audio recording of himself shooting two teens who broke into his home in Minnesota in 2012. This appeared in the data only three times. A video by “MatsWhatItIs” revealing a YouTube “wormhole” to child porn and a Russia Today link to a piece showing how “Iran launches annual navy maneuvers from Gulf to Indian Ocean” each appeared twice. Altogether there were only these three links that had duplicates in the control data. All other links appeared only once.

Pittsburgh target dataset top link analysis. The most circulated links in the Pittsburgh target dataset related to the midterm elections and were pro-Trump [20]. The 10 most circulated links were listed in a single frequently duplicated message. The links led to pages listing Trump’s successes and his public schedule, and some even led to the White House Web site. We found a slight variation in the frequency with which the links were repeated, which means that not all of the messages contained all the links. On the list of the most frequently duplicated links, numbers 2–4 were repeated 130 times, the next five links were repeated 129 times, and the next two were repeated 128 times. The most circulated link was to the Bloomberg Web site revealing the midterm election results; it was circulated 138 times. The midterm elections took place on 6 November 2018, nine days after the Pittsburgh attacks of 27 October.

The tenth most duplicated link led to a YouTube video featuring a current affairs program from September 2018. The video, which was produced by the Western Journal, was apparently highly critical of Trump. The video began with a warning from YouTube: “The following content has been identified by the YouTube community as inappropriate or offensive to some audiences. Viewer discretion is advised.” To see the video, the user needed to click “I understand and wish to proceed.” This link appeared in the message alongside the other links, all of which were pro-Trump, suggesting inverted propaganda that framed views opposed to Trump as suspicious.

Pittsburgh control dataset top link analysis. In the Pittsburgh control dataset, the 10 most shared links were shared only twice and were also shared in the Pittsburgh target data [21]. The top 20 links were to positive news stories about Trump’s activities.

3.4. Self-labeling analysis

To answer RQ3, concerning the density of users who identify themselves with racist or anti-Semitic labels in the target versus the control datasets, we analyzed self-labeling in relation to racist and anti-Semitic sentiments.


Table 6: Self-labeling frequency.
Self-labeling | Christchurch target n (%) | Christchurch control n (%) | Pittsburgh target n (%) | Pittsburgh control n (%) | Total n
Nazi | 3,841 (0.09%) | 32 (0.09%) | 82 (0.006%) | 0 | 3,955
Confederate | 2,454 (0.06%) | 21 (0.06%) | 68 (0.005%) | 0 | 2,543
Fascist | 2,442 (0.06%) | 16 (0.05%) | 25 (0.002%) | 0 | 2,483
White supremacist | 1,438 (0.03%) | 7 (0.02%) | 6 (0.0002%) | 0 | 1,451
Total | 10,175 (0.24%) | 76 (0.22%) | 181 (0.01%) | 0 | 10,432
Size of dataset | 416,453 | 3,500 | 135,000 | 3,500 |


Table 6 shows that “nazi,” “confederate,” and “fascist” were the most frequently used labels. The labels “fascist” and “white supremacist” were proportionally more frequently used in the target datasets compared to the control datasets, while the other keywords showed a similar frequency in both. In fact, in the case of Pittsburgh, these labels were only found in the target dataset and did not occur in the control dataset. When comparing target datasets, self-labeling was more prevalent in the Christchurch target dataset than the Pittsburgh target dataset.

3.5. Keyword frequency by countries

An analysis of the distribution of hate words by self-identified country yielded the following results. The top countries, which had above 0.1 percent of keywords within the target datasets, were the U.S., Canada, Australia, and Great Britain. Messages with the country labels Australia and Great Britain contained no references to “black” or “kike,” except for Great Britain in the Pittsburgh and Christchurch control datasets and Australia in the Christchurch target dataset.

Overall, the control datasets had a longer list of countries that passed the 0.1 percent threshold for hate keywords, while in the target datasets these keywords were concentrated among fewer countries. An analysis of the countries with at least 0.1 percent of hate keywords showed that the messages focused on the keywords “Jew” and “white.” However, certain countries, such as Germany, followed a different pattern. Users from Germany and users who self-identified as “nazi” used “Jew” frequently in the Christchurch target dataset, while in the Christchurch control dataset references to “kike” were more frequent among German and “nazi” users. In the Pittsburgh control dataset, users from Brazil, Germany, and Sweden, in addition to the U.S., Canada, and Great Britain, made more references to “black.”

The Christchurch dataset had a larger range of countries that passed the 0.1 percent threshold for racial and anti-Semitic keywords compared to the Pittsburgh datasets. The users in the Christchurch datasets also used a wider range of ideological flags. These included “nazi,” “confederate,” and “pirate” in the Christchurch target data and “nazi,” “confederate,” “pirate,” “white supremacist,” and “kekistani” in the Christchurch control dataset. Tables showing the results for each dataset can be found in Appendix 2.

3.6. Comparison between four datasets

The results of this study can be summarized by mapping the relative use of named entities that showcase the overall trends in hybrid media events (i.e., shared and divergent aspects) among four datasets.

The relative use of named entities was calculated by following a two-step procedure. First, a normalized frequency of named entity use was calculated for each entity for both media events during different points in time for the control and target datasets. The highest normalized means, which reflect the most used entities, were assigned a value of one; the lowest normalized means (i.e., the least used named entities) were assigned a value of zero. Second, for the two media events, we subtracted the frequency in the control data from that in the target data.
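The two-step procedure above amounts to min-max scaling followed by a target-minus-control difference per entity. A minimal sketch with toy numbers (the frequencies here are illustrative, not the study's values):

```python
# Sketch of the two-step relative-use calculation:
# (1) min-max scale each dataset's entity frequencies to [0, 1],
# (2) subtract the control score from the target score per entity.

def minmax(values):
    """Scale so the most used entity maps to 1 and the least used to 0."""
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:  # guard: all entities equally frequent
        return {k: 0.0 for k in values}
    return {k: (v - lo) / (hi - lo) for k, v in values.items()}

def target_minus_control(target, control):
    """Per-entity difference of scaled target and control frequencies."""
    t, c = minmax(target), minmax(control)
    return {k: t.get(k, 0.0) - c.get(k, 0.0) for k in set(t) | set(c)}

# Toy frequencies for one media event (illustrative only).
target = {"jews": 40, "trump": 25, "muslims": 10}
control = {"trump": 50, "jews": 20, "america": 5}
diff = target_minus_control(target, control)
```

Computing `diff` once per event yields the two coordinates of each entity in Figure 3: entities with a positive difference were used more during the event, negative ones more in the control period.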


Named entity overlap and divergence between control and target datasets for Christchurch and Pittsburgh
Figure 3: Named entity overlap and divergence between control and target datasets for Christchurch and Pittsburgh.


In the resulting visualization (Figure 3), the named entities above the x-axis were used more in the Christchurch target dataset compared to the control. These include ‘muslim’, ‘muslims’, ‘islam’, ‘islamic’, and ‘ISIS’. Similarly, the named entities to the right of the y-axis are specific to Pittsburgh. Key examples include ‘house’, ‘california’, ‘red wave’, and ‘jews’. Of these, the named entity ‘jews’ was used more in the Pittsburgh target and the Christchurch control. The empty top-right corner indicates that the two attacks do not share named entities that were predominantly used during both target periods. The named entities in the bottom-left corner were predominantly used in the control datasets for the two cases, that is, mostly before the attacks. The most prominent example is ‘trump’. Other control-specific examples include ‘america’, ‘american’, and ‘uk’.



4. Discussion

In this study, we analyzed shades of hate and how they are manifested during media events. The assumption is that the density of hate speech elements increases in response to or along with hybrid media events. We analyzed two mass shootings that took place on two different continents as examples of such events. The results show a) an increased number of named entities used during the target events and b) a proportional increase in keywords associated with hate speech. Thus, our expectation regarding RQ1 was confirmed since the control datasets showed lower density and less circulation of hate speech.

For RQ2, we expected duplicate circulation to increase in the target datasets. The results show c) an increase in content circulation through the use of messages, links, and picture duplicates during the target events.

In response to RQ3, we found more users who labeled themselves with racist or anti-Semitic labels in the target datasets. Overall, d) we found differences in actors (self-labeling and geographic self-labeling) in relation to hate speech and anti-Semitic speech.

For RQ4, we expected a common core audience and common topics across the datasets. However, we also expected to find event-specific elements that were not present in the control datasets (i.e., in the content exchanged prior to the analyzed events). We found e) a greater concentration of hate speech and “specialized” racist keywords in messages labeled as North American, while anti-Semitic keywords were present in messages relating to both events and across users.

Since we found that the circulation of repeated content increased during media events, the next step was to analyze the nature of this content. The results can be summarized as follows:

  1. When socio-politically significant events take place, /pol/ users are more active. During media events, online fringe communities have been found to be mobilized to express racial slurs and anti-Semitic sentiments.

  2. Duplicates and keyword analysis support the increased circulation of such content.

  3. Self-labeling as “white supremacist” was consistent across the datasets, indicating that such users represent the “culture” of this fringe community and that they were explicit about it. In our analysis, we did not include other self-naming options, such as “tree-hugger” or “gay.” The creation of character-based archetypes is common on /pol/ and makes this online space semi-performative and semi-authentic.

  4. Hatred was prevalent across the datasets. Moreover, anti-Semitism was also widespread and was not only expressed in relation to the Pittsburgh shooting. Thus, anti-Semitism seems to be an “agenda” item on 4chan. However, we found that anti-Semitic sentiment increased during media events: hybrid media events attracted larger audiences, and expressions of hate were used more frequently.

  5. Hybrid media events attract users from different causes. In our analysis, we saw an increase in support for the Republicans on /pol/ after the Pittsburgh shooting, as /pol/ was used to promote party politics.

The analysis of the named entities showed an overlap in content between the datasets. For example, the control datasets contained more mentions of “America,” while the target datasets showed an expected prominence of certain topics, such as “Jews” in Pittsburgh and “Islam” in New Zealand. The prevalence of the term “America” and related words shows how American politics was intertwined with the analyzed hybrid media events. Furthermore, our analysis of duplicates indicates the potential involvement of non-human actors in the circulation of such content, given that non-human actors such as bots are becoming increasingly sophisticated, even if the degree of automated content circulation online is unclear. Future studies should further analyze duplicates on 4chan to test the level of automation in online spaces and its relationship to bots.

This study is subject to a range of assumptions and limitations. Our sample is based on randomized data; thus, while it allows us to generalize the content analysis and frequency parts of the study, by design it includes gaps in our timeline representation. In terms of assumptions, this study analyzes texts; thus, inferences are drawn from textual analysis. While the textual analysis presented here entails content representation and its potential interpretation, access to users’ intentionality and readers’ interpretations of potential sarcasm, truthfulness, and role playing are not part of this analysis. However, even without access to content producers’ intentions, it is worth noting that racist or anti-Semitic content, even when used ironically, contributes to the normalization of racism and anti-Semitism within societies, as acknowledged by various scholars of online hate (e.g., Colley and Moore, 2020). Furthermore, while we acknowledge that content circulation through duplicates can be viewed by some users of /pol/ as part of the culture, here we expose the antagonistic nature of such practices and their presentation in online spaces freely accessible by anyone. Thus, the purpose of this work is to demonstrate the surge of content that features user categories such as “nazi” and “white supremacist” (even if ironic or playful), coupled with anti-Semitic content, as well as the surge of these categories in the aftermath of violent events that attract vast audiences and “fringe” discourses that gradually move into the mainstream (Whyte, 2020).



5. Conclusion

This study contributes to the understanding of the granularity of racially charged and anti-Semitic hate speech on 4chan during disruptive hybrid media events by analyzing the rhetoric on the platform in the aftermath of two mass shooting events. We also included control datasets, consisting of content posted before the events. This enabled us to understand the particularities of the content during the events and the differences between a “normal” situation and a situation of violent attack. We focused on two hybrid media events generated by violent attacks: a synagogue shooting in Pittsburgh and mosque shootings in Christchurch. We expected anti-Semitic rhetoric and racially charged slurs to be more prevalent in the Pittsburgh case and anti-Islamism to dominate after the Christchurch attacks.

Our study empirically demonstrates the surge of content, and of hate-related content in particular, during hybrid media events. It expands on previous research that compared discussion groups on 4chan (e.g., Nagle, 2017) by focusing on the /pol/ group. Our combinatory approach brings to light moments of “rupture” when anti-Semitic rhetoric spiked on /pol/ and events catalyzed a surge of anti-Semitism.

Based on our data, we can confirm that pro-Trump propaganda appeared on /pol/ during a U.S.-based disruptive and violent media event that coincided with the midterm elections. While we cannot claim a causal relationship between these two events, this relationship can be considered a hijacking of the Pittsburgh attack for election propaganda, i.e., the use of a socially relevant event to rally users. This relationship speaks to something that we could call an activism of hate, where social media affordances are utilized for political activism, as argued by previous scholars (Coleman, 2014), and where anonymity can mediate such a process (Beyer, 2014).

We present two core arguments. First, we found an increase in anti-Semitic rhetoric on 4chan during hybrid media events, i.e., in our target datasets. Second, hybrid media events involving terrorist violence mobilize hatred, including anti-Semitism, regardless of which group is the target of the violence. This is evidenced by the fact that both the Christchurch attacks on mosques and the Pittsburgh synagogue shootings increased the amount of anti-Semitic hateful content on /pol/.

Our findings show that user activity and expressions of hate increase in the aftermath of violent attacks, compared to times when such events are not topical, and confirm previous findings that show a surge of mobilization during specific events (Thompson, 2018; Malevich and Robertson, 2020). The activity on the board is a part of the hybrid media event, formed when the ripples of an attack spread through the media environment. Similar mobilization of a user base has also been observed in the study of another violent event — the Boston marathon bombing of 2013 (Potts and Harrison, 2013). The contribution of this study is to empirically demonstrate the variation and gradation in racist and anti-Semitic hate speech on the /pol/ board.

The 4chan /pol/ platform not only showcases anti-Semitic hatred in its content, as previously argued (e.g., Hine, et al., 2017), but also facilitates the expression of different shades of hatred through the affordances of the platform. The platform allows people to label themselves “white supremacist,” “confederate,” or “nazi” when using anti-Semitic and racially charged language. Hatred is mediated on this platform by self-identification, even if such self-identification is often ironic. We observed an increase in this type of self-identification in the aftermath of both attacks and an increase in posts from these “fringe users” during the events.

A comparison of the two events supports Merrin’s (2019) claim that /pol/ is politicized. We show that support for specific causes, such as MAGA and Donald Trump’s election, was concurrently mobilized during the Pittsburgh shooting. We also show that users who post more content and circulate content through duplicate messages are mobilized during hybrid media events. Future studies should investigate the mobilization that takes place on 4chan and the automated forces involved.

We found anti-Semitic and racially charged rhetoric in both the target and control datasets, but the circulation of hate speech was more pronounced during the media events. The same pattern was found throughout several levels of analysis. On the message level, the evidence shows a) an increase in overall activity during the hybrid media events (i.e., in the target datasets compared to the control datasets) and b) an increase in duplicate messages, links, and images during the events. Analysis of named entities over time shows an increase in the use of named entities during the hybrid media events (i.e., in the target datasets compared to the control datasets). Named entity analysis and keyword analysis both showed a prevalence of anti-Semitic and racist rhetoric. The user-level of analysis, which focused on self-naming, confirmed our finding of anti-Semitic and racist content. The geolocation analysis of hatred showed a particular prevalence of anti-Semitic rhetoric in posts from English-speaking countries.

One unexpected finding emerged when we compared the New Zealand case with the U.S. case. We observed a more diverse distribution of hatred in the New Zealand case in terms of posting traffic from a range of countries. However, in the U.S. case, the shooting coincided with the midterm elections; thus, 4chan /pol/ hatred was focused on both anti-Semitic and racist rhetoric and locally relevant political issues (e.g., calls to support Republicans and Trump). These findings are consistent with the observations of Mittos, et al. (2020) showing the intertwining of anti-Semitic and racist comments with far-right politics. We propose that when an event attracts attention, it is flooded with related (political) discourses that capitalize on the increased attention.

Our findings have multiple implications regarding hate rhetoric. First, we show there is gradation in the rhetoric of hate on 4chan, even within a single channel that is notorious for fostering hate and anti-Semitism. We show that hate rhetoric increases during violent events. Second, hate speech forums function as reactive spaces where violent events are reflected through increased rhetoric. While there is a certain baseline uniformity of topics and expressions, such rhetoric spikes in the aftermath of events such as mass shootings and their reporting in the media, which we here call hybrid media events. 4chan /pol/ can be treated as an arena where these violent events are exploited to draw attention and advance ideological agendas. Third, we argue that hybrid media events activate particular elements and features of communication, in this case those related to hatred, similar to Malevich and Robertson’s (2020) finding that posting frequency increases immediately following terrorist attacks. Our approach includes quantitative and qualitative multi-level analyses of online spaces. Future studies could test the value of duplicates in circulation practices not only between but also within datasets and expand this empirical work by looking at causal relationships between hate speech and specific events and at how communicative features are activated on different online platforms. End of article


About the authors

Asta Zelenkauskaite is an Associate Professor in the Department of Communication at Drexel University (U.S.). Her research focuses on dark participation online. Specifically, she is interested in automated and micro discourse practices to trace automated online influence. Her work is situated between information science and social science paradigms.
E-mail: az358 [at] drexel [dot] edu

Pihla Toivanen is a doctoral candidate in language technology at the University of Helsinki and has worked as a researcher and technical assistant at the University of Tampere and the University of Helsinki, focusing on computational methods for studying disinformation, counter-media, and conspiracy theories in hybrid media events.
E-mail: toivanenpihla [at] gmail [dot] com

Jukka Huhtamäki, D.Sc. (Tech.), is a postdoctoral researcher at the Unit of Information and Knowledge Management at Tampere University. He develops computational methods to investigate and intervene in socio-technical phenomena in diverse research contexts, including computer-supported cooperative work, hybrid media, and organizing. His research interests include computational social matching, fluid organizing, and digital ecosystems. Jukka applies a variety of research methods, including computational social science, action design research, and experimentation, to create new knowledge that is theoretically rich and has relevance in practice.
E-mail: jukka [dot] huhtamaki [at] tuni [dot] fi

Associate Professor Katja Valaskivi is based in the Faculty of Theology at the University of Helsinki, Finland. She is a media researcher who specializes in the circulation of belief systems and ideologies in the hybrid media environment, with a focus on the technological conditions of meaning-making in the digital world. Valaskivi is the PI of the Academy of Finland-funded consortium Hybrid Terrorizing. Developing a New Model for the Study of Global Media Events of Terrorist Violence (HYTE, 2017–2021), sub-project PI in Extremist Networks, Narcotics and Criminality in Online DarkNet Environments (ENNCODE, 2020–2022), and leader of the Politics of Conspiracy Theories research project (SAPO, Helsingin Sanomat Foundation, 2019–2020).
E-mail: katja [dot] valaskivi [at] helsinki [dot] fi



The authors would like to thank First Monday’s anonymous reviewers and editor for their comments. Katja Valaskivi gratefully acknowledges the support of the Helsingin Sanomat Foundation (the Politics of Conspiracy Theories, SAPO project) and the Academy of Finland (the Hybrid Terrorizing consortium, HYTE, 326642).



1. E. Grinberg, E.C. McLaughlin, S. Sidner, and A. Stapleton, 2018. “11 people were gunned down at a Pittsburgh synagogue. Here are their stories,” CNN (1 November), at; E. Ainge Roy, 2020. “Thousands in New Zealand protest against George Floyd killing,” Guardian (1 June), at, accessed 15 July 2020.


3. Before entering the /pol/ forum, users are presented with the following pop-up disclaimer message, which they must accept before proceeding to the discussion boards: “The content of this website is for mature audiences only and may not be suitable for minors. If you are a minor or it is illegal for you to access mature images and language, do not proceed. This website is presented to you AS IS, with no warranty, express or implied. By clicking ‘I Agree,’ you agree not to hold 4chan responsible for any damages from your use of the website, and you understand that the content posted is not owned or generated by 4chan, but rather by 4chan’s users. As a condition of using this website, you agree to comply with the ‘Rules’ of 4chan, which are also linked on the home page. Please read the Rules carefully, because they are important.”


5. As on any Web page, users can employ a VPN, ironically or otherwise, to change their IP address and therefore the country flag displayed.

6. B. Feldman, 2015. “Inside /pol/, the 4chan politics board shouted out in Minneapolis gun video,” New York Magazine (25 November), at, accessed 15 July 2020.

7. J. Eligon and A. Southall, 2015. “Black Lives Matter activists vow not to cower after 5 are shot,” New York Times (24 November), at

8. C. Dewey, 2014. “Absolutely everything you need to know to understand 4chan, the Internet’s own bogeyman,” Washington Post (25 September), at, accessed 15 July 2020.

9. A. Thompson, 2018. “The measure of hate on 4chan,” Rolling Stone (10 May), at, accessed 15 July 2020.

10. Hodge and Hallgrimsdottir, 2020, p. 568.

11. P.L. Austin, 2019. “What is 8chan, and how is it related to this weekend’s shootings? Here’s what to know,” Time (5 August), at; R. Evans, 2018. “How the MAGAbomber and the Synagogue shooter were likely radicalized” (31 October), at

12. Tuters and Hagen, 2020, p. 29.

13. spaCy defines a named entity as a “real-world object” that is assigned a name. The list of named entity types can be found in Appendix 1.

The prominence of the named entity scores across the four datasets (Christchurch target, Christchurch control, Pittsburgh target, and Pittsburgh control) is summarized below.
The Pittsburgh target dataset had more NORP and CARDINAL entities than the control dataset; the frequencies of other entity types were similar across the target and control datasets. The Christchurch target dataset had more GPE entities than the control dataset. Both target datasets had more CARDINAL entities than the corresponding control datasets, even though this entity type occurred less frequently overall.
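As a minimal sketch of this comparison step, entity labels extracted by a tagger such as spaCy can be tallied per dataset and compared by type. The labels below are illustrative, not drawn from the study’s datasets:

```python
from collections import Counter

# Entity labels as they might come out of a named entity tagger such as spaCy
# (illustrative values, not taken from the study's data).
target_labels = ["GPE", "NORP", "GPE", "CARDINAL", "PERSON"]
control_labels = ["GPE", "PERSON", "DATE"]

target_freq = Counter(target_labels)
control_freq = Counter(control_labels)

# Positive values indicate entity types more prominent in the target dataset.
difference = {label: target_freq[label] - control_freq.get(label, 0)
              for label in target_freq}
print(difference)  # {'GPE': 1, 'NORP': 1, 'CARDINAL': 1, 'PERSON': 0}
```

In practice the raw counts would also be normalized by dataset size before comparison, since the target and control samples differ in length.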

14. Note that the percentages for duplicate messages are computed over the total number of messages in the respective samples, reported in Table 2. The percentages for images and links are computed over the total number of images and the total number of links in the respective datasets.
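This derivation can be sketched as follows, assuming every copy of a repeated message counts toward the duplicate total; the message sample is hypothetical, not taken from the study’s data:

```python
from collections import Counter

# Hypothetical message sample; the real samples are the datasets in Table 2.
messages = ["post a", "post b", "post a", "post c", "post a"]

counts = Counter(messages)
# A message counts as a duplicate when its exact text appears more than once.
duplicate_total = sum(n for n in counts.values() if n > 1)
pct_duplicates = 100 * duplicate_total / len(messages)
print(f"{pct_duplicates:.1f}%")  # 60.0%
```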


16. I. Sher, 2019. “Discord, Slack for gamers, tops 250 million registered users,” CNET (13 May), at



Christchurch target dataset top links.


18. Example of a repeated post:

Threadly reminder to the edgy manchildren swarming /Pol/ now
>shooter was a personal trailer for 2 years after school. Leaves 2011 to “travel overseas” ...
>7 year gap who knows where he is ...
>Posts from the facebook of a Hotel in Islamabad with a picture
>Says how amazing and wonderful this muslim country is and the people
>Hotel owned by Shia muslims connected to Nizari people ...
> a little over 6 months later appears in christchurch a small NZ town of a little over 300k people
> a small town that has had multiple White kiwis become jihadis, travel to yemen and get killed by US drones
>the same small town another kiwi teen was radicalized into a Jihadi and planned a mass killing before being arrested
>the same small town that a large mossad operation had been operating in for who knows how many years



Christchurch control dataset top links.




Pittsburgh target dataset top links.




Pittsburgh control dataset top links.




A. Arif, L.G. Stewart, and K. Starbird, 2018. “Acting the part: Examining information operations within #BlackLivesMatter discourse,” Proceedings of the ACM on Human-Computer Interaction, article number 20.
doi:, accessed 4 December 2020.

A. Ben-David and A.M. Fernández, 2016. “Hate speech and covert discrimination on social media: Monitoring the Facebook pages of extreme-right political parties in Spain,” International Journal of Communication, volume 10, pp. 1,167–1,193, and at, accessed 4 December 2020.

J.L. Beyer, 2014. Expect us: Online communities and political mobilization. Oxford: Oxford University Press.
doi:, accessed 4 December 2020.

A. Chadwick, 2013. The hybrid media system: Politics and power. Oxford: Oxford University Press.
doi:, accessed 4 December 2020.

I. Chaudhry, 2015. “#Hashtagging hate: Using Twitter to track racism online,” First Monday, volume 20, number 2, at, accessed 14 July 2020.
doi:, accessed 4 December 2020.

Y. Citton, 2017. The ecology of attention. Cambridge: Polity.

T. Colley and M. Moore, 2020. “The challenges of studying 4chan and the alt-right: ‘Come on in the water’s fine’,” New Media & Society (September).
doi:, accessed 4 December 2020.

G. Coleman, 2014. Hacker, hoaxer, whistleblower, spy: The many faces of Anonymous. London: Verso.

D. Dayan and E. Katz, 1992. Media events: The live broadcasting of history. Cambridge, Mass.: Harvard University Press.

J. Gantt Shafer, 2017. “Donald Trump’s ‘political incorrectness’: Neoliberalism as frontstage racism on social media,” Social Media + Society, (July).
doi:, accessed 4 December 2020.

N. Grinberg, K. Joseph, L. Friedland, B. Swire-Thompson, and D. Lazer, 2019. “Fake news on Twitter during the 2016 US presidential election,” Science, volume 363, number 6425 (25 January), pp. 374–378.
doi:, accessed 4 December 2020.

G.E. Hine, J. Onaolapo, E. De Cristofaro, N. Kourtellis, I. Leontiadis, R. Samaras, and J. Blackburn, 2017. “Kek, cucks, and god emperor trump: A measurement study of 4chan’s politically incorrect forum and its effects on the Web,” arXiv:1610.03452, at, accessed 4 December 2020.

E. Hodge and H. Hallgrimsdottir, 2020. “Networks of hate: The alt-right, ‘troll culture’, and the cultural geography of social movement spaces online,” Journal of Borderlands Studies, volume 35, number 4, pp. 563–580.
doi:, accessed 4 December 2020.

E. Katz and T. Liebes, 2007. “‘No more peace!’ How disaster, terror and war have upstaged media events,” International Journal of Communication, volume 1, pp. 157–166, and at, accessed 4 December 2020.

L. Knuttila, 2011. “User unknown: 4chan, anonymity and contingency,” First Monday, volume 16, number 10, at, accessed 4 December 2020.
doi:, accessed 4 December 2020.

D. Ludemann, 2018. “/pol/emics: Ambiguity, scales, and digital discourse on 4chan,” Discourse, Context & Media, volume 24, pp. 92–98.
doi:, accessed 4 December 2020.

M.N. Lyons, 2017. “Ctrl-alt-delete: An antifascist report on the alternative right,” Political Research Associates (20 January), at, accessed 4 December 2020.

S. Malevich and T. Robertson, 2020. “Violence begetting violence: An examination of extremist content on deep Web social networks,” First Monday, volume 25, number 3, at, accessed 4 December 2020.
doi:, accessed 4 December 2020.

M. Malice, 2019. The new right: A journey to the fringe of American politics. New York: All Points Books.

W. McKinney, 2011. “pandas: A foundational Python library for data analysis and statistics,” PyHPC2011: Python for High Performance and Scientific Computing, at, accessed 14 October 2020.

W. Merrin, 2019. “President troll: Trump, 4chan and memetic warfare,” In: C. Happer, A. Hoskins, and W. Merrin (editors). Trump’s media war. Cham, Switzerland: Palgrave Macmillan, pp. 201–226.
doi:, accessed 4 December 2020.

A. Mittos, S. Zannettou, J. Blackburn, and E. De Cristofaro, 2020. “‘And we will fight for our race!’ A measurement study of genetic testing conversations on Reddit and 4chan,” Proceedings of the Fourteenth International AAAI Conference on Web and Social Media, volume 14, pp. 452–463, and at, accessed 4 December 2020.

D. Nadeau and S. Sekine, 2007. “A survey of named entity recognition and classification,” Lingvisticæ Investigationes, volume 30, number 1, pp. 3–26.
doi:, accessed 10 August 2020.

A. Nagle, 2017. Kill all normies: Online culture wars from 4chan and Tumblr to Trump and the alt-right. Alresford, Essex: Zer0 Books.

A. Oboler, W. Allington, and P. Scolyer-Gray, 2019. “Hate and violent extremism from an online subculture: The Yom Kippur terrorist attack in Halle, Germany,” Online Hate Prevention Institute, Report, IR19-4, at, accessed 4 December 2020.

A. Papasavva, S. Zannettou, E. De Cristofaro, G. Stringhini, and J. Blackburn, 2020. “Raiders of the lost kek: 3.5 years of augmented 4chan posts from the politically incorrect board,” Proceedings of the Fourteenth International AAAI Conference on Web and Social Media, volume 14, pp. 885–894, and at, accessed 4 December 2020.

S. Peeters and S. Hagen, 2018. “4CAT: Capture and analysis toolkit,” at, accessed 4 December 2020.

W. Phillips, 2015. This is why we can’t have nice things: Mapping the relationship between online trolling and mainstream culture. Cambridge, Mass.: MIT Press.

L. Potts and A. Harrison, 2013. “Interfaces as rhetorical constructions: Reddit and 4chan during the Boston Marathon bombings,” SIGDOC '13: Proceedings of the 31st ACM International Conference on Design of Communication, pp. 143–150.
doi:, accessed 4 December 2020.

M. Schraagen, M. Brinkhuis, and F. Bex, 2017. “Evaluation of named entity recognition in Dutch online criminal complaints,” Computational Linguistics in the Netherlands Journal, volume 7, pp. 3–16, and at, accessed 4 December 2020.

L. Shifman, 2013. Memes in digital culture. Cambridge, Mass.: MIT Press.

J. Sumiala, K. Valaskivi, M. Tikka, and J. Huhtamäki, 2018. Hybrid media events: The Charlie Hebdo attacks and the global circulation of terrorist violence. Bingley: Emerald Publishing.

A. Thompson, 2018. “The measure of hate on 4Chan,” Rolling Stone (10 May), at, accessed 4 December 2020.

P. Toivanen, M. Nelimarkka, and K. Valaskivi, in press. “Remediation in the hybrid media environment: Understanding countermedia in context,” New Media & Society.

M. Tuters and S. Hagen, 2020. “(((They))) rule: Memetic antagonism and nebulous othering on 4chan,” New Media & Society, volume 22, number 12, pp. 2,218–2,237.
doi:, accessed 4 December 2020.

M. Tuters, E. Jokubauskaitė, and D. Bach, 2018. “Post-truth protest: How 4chan cooked up the Pizzagate bullshit,” M/C Journal, volume 21, number 3, at, accessed 14 August 2020.
doi:, accessed 4 December 2020.

K. Valaskivi, A. Rantasila, M. Tanaka, and R. Kunelius, 2019. Traces of Fukushima: Global events, networked media and circulating emotions. London: Palgrave Macmillan.
doi:, accessed 4 December 2020.

T. Wall and T. Mitew, 2018. “Swarm networks and the design process of a distributed meme warfare campaign,” First Monday, volume 23, number 5, at, accessed 14 July 2020.
doi:, accessed 4 December 2020.

M. Waskom, O. Botvinnik, J. Ostblom, M. Gelbart, S. Lukauskas, P. Hobson, D.C. Gemperline, T. Augspurger, Y. Halchenko, J.B. Cole, J. Warmenhoven, J. De Ruiter, C. Pye, S. Hoyer, J. Vanderplas, S. Villalba, G. Kunter, E. Quintero, P. Bachant, M. Martin, K. Meyer, C. Swain, A. Miles, T. Brunner, D. O’Kane, T. Yarkoni, M.L. Williams, C. Evans, and C. Fitzgerald, 2020. “mwaskom/seaborn,” Zenodo, v0.10.1 (April), at, accessed 14 October 2020.
doi:, accessed 4 December 2020.

A.J. Weaver, A. Zelenkauskaite, and L. Samson, 2012. “The (non) violent world of YouTube: Content trends in Web video,” Journal of Communication, volume 62, number 6, pp. 1,065–1,083.
doi:, accessed 28 December 2020.

R. Weischedel, M. Palmer, M. Marcus, E. Hovy, S. Pradhan, L. Ramshaw, N. Xue, A. Taylor, J. Kaufman, M. Franchini, M. El-Bachouti, R. Belvin, and A. Houston, 2013. “OntoNotes Release 5.0,” Linguistic Data Consortium (16 October), at, accessed 4 December 2020.
doi:, accessed 4 December 2020.

C. Whyte, 2020. “Of commissars, cults and conspiratorial communities: The role of countercultural spaces in ‘democracy hacking’ campaigns,” First Monday, volume 25, number 4, at, accessed 14 October 2020.
doi:, accessed 4 December 2020.

T. Ylä-Anttila, G. Bauvois, and N. Pyrhönen, 2019. “Politicization of migration in the countermedia style: A computational and qualitative analysis of populist discourse,” Discourse, Context & Media, volume 32, 100326.
doi:, accessed 4 December 2020.

S. Zannettou, J. Finkelstein, B. Bradlyn, and J. Blackburn, 2020. “A quantitative approach to understanding online antisemitism,” Proceedings of the Fourteenth International AAAI Conference on Web and Social Media, volume 14, pp. 786–797, and at, accessed 4 December 2020.

S. Zannettou, T. Caulfield, E. De Cristofaro, N. Kourtellis, I. Leontiadis, M. Sirivianos, G. Stringhini, and J. Blackburn, 2017. “The Web centipede: Understanding how Web communities influence each other through the lens of mainstream and alternative news sources,” IMC ’17: Proceedings of the 2017 Internet Measurement Conference, pp. 405–417.
doi:, accessed 4 December 2020.

A. Zelenkauskaite and M. Balduccini, 2017. “‘Information warfare’ and online news commenting: Analyzing forces of social influence through location-based commenting user typology,” Social Media + Society (July).
doi:, accessed 10 August 2020.

A. Zelenkauskaite and S.C. Herring, 2008. “Gender differences in personal advertisements in Lithuanian iTV SMS,” Proceedings of Cultural Attitudes Towards Technology and Communication, pp. 462–476; version at, accessed 4 December 2020.

Y. Zhou, S. De, and K. Moessner, 2016. “Real world city event extraction from Twitter data streams,” Procedia Computer Science, volume 98, pp. 443–448.
doi:, accessed 4 December 2020.


Appendix 1: Description of named entities.


PERSON: People, including fictional
NORP: Nationalities or religious or political groups
FAC: Buildings, airports, highways, bridges, etc.
ORG: Companies, agencies, institutions, etc.
GPE: Countries, cities, states
LOC: Non-GPE locations, mountain ranges, bodies of water
PRODUCT: Objects, vehicles, foods, etc. (not services)
EVENT: Named hurricanes, battles, wars, sports events, etc.
WORK_OF_ART: Titles of books, songs, etc.
LAW: Named documents made into laws
LANGUAGE: Any named language
DATE: Absolute or relative dates or periods
TIME: Times smaller than a day
PERCENT: Percentage, including “%”
MONEY: Monetary values, including unit
QUANTITY: Measurements, as of weight or distance
ORDINAL: “first,” “second,” etc.
CARDINAL: Numerals that do not fall under another type



Appendix 2: Keyword analysis by country and by dataset. Each row represents a different dataset. Only variables and countries with values equal to or greater than 0.1 were considered for the analysis; those values are highlighted in yellow.


GREAT BRITAIN9291,509334327
NEW ZEALAND622193622



Editorial history

Received 20 August 2020; revised 28 October 2020; accepted 30 October 2020.

Creative Commons License
This paper is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Shades of hatred online: 4chan memetic duplicate circulation surge during hybrid media events
by Asta Zelenkauskaite, Pihla Toivanen, Jukka Huhtamäki, and Katja Valaskivi.
First Monday, Volume 26, Number 1 - 4 January 2021