Reddit, known as “the front page of the Internet,” has been one of the most widely visited Web sites since its inception in 2005. As a social networking site it is unique in that the personal relationships between its users are considered secondary to its content, which includes both original, user-generated content and links to outside sources. Although previous research has investigated other social networking platforms in depth, relatively little has been written on Reddit. The present research considers a variety of indicators, including text readability, emoticon usage, and domain linkage. It was found that the most popular communities on Reddit behave very differently from each other, in terms of language sophistication, sentiment, and topicality (as measured by top-level links to outside sources). The results can be used to inform future investigations of online discourse spaces, particularly those in the contemporary social media sphere.
Contents
Introduction
Related work
Methods
Results
Discussion
Conclusion
The online community Reddit, known as “the front page of the Internet,” is one of the most popular sites in cyberspace. As of September 2016 it is the ninth most-visited Web site in the United States and the 25th most-visited Web site worldwide (Alexa, 2016). It has been estimated that six percent of Internet-using adults visit the site (Duggan and Smith, 2013). A more recent study claims that “seven percent of U.S. adults report using the site,” with more than three-quarters of these using Reddit as a news source (Barthel, et al., 2016). Reddit, originally founded as a simple link-sharing platform in 2005, now describes itself as “a platform for communities to discuss, connect, and share in an open environment, home to some of the most authentic content anywhere online” (Reddit, 2016). The “communities” in question are more commonly known as subreddits, each of which has its own unique appearance, guidelines, team of moderators, and, of course, contributors (called “Redditors”). In this paper, subreddits are referred to in the following form: r/subredditName. This is a reference to the URL at which a subreddit can be visited: for example, the “news” subreddit can be viewed at http://reddit.com/r/news/. This also follows a convention on Reddit, wherein references to a subreddit within a comment are expressed in the “r/subredditName” format.
Although Redditors are allowed to “subscribe” to subreddits (which means that threads from these communities show up on the user’s front page), in most instances it is not necessary to subscribe to a subreddit in order to contribute to the discourse. There are a variety of means by which Redditors can participate in a community, including posting comments on a thread, voting on the quality of threads and comments, and posting original threads, though in the latter case there are sometimes constraints on what can be posted. For example, r/politics (a community devoted to political matters in the United States) restricts users to posting links to outside articles (this only applies to users seeking to start a thread; user-generated content is encouraged in the comments section for any given thread). However, most subreddits allow for users to post original thoughts, ideas, and questions in top-level posts (many, such as r/explainlikeimfive [1] and r/AskReddit [2], are entirely comprised of original content).
Though it may be tempting to view Reddit as a monolithic enterprise, the nature of its subreddit system casts it instead as a loosely joined collection of vastly disparate communities. Reddit’s structure allows its users to reject permanent virtual identities (Bergstrom, 2011), and Becker (2013) observed that “each subreddit has its own theme and ‘personality,’ which cater to its online community of readers.” Moreover, subreddits allow for users to tailor the Reddit experience to their own interests (Mills, 2015).
Although previous research has analyzed social interaction across a variety of subreddits (e.g., Choi, et al., 2015), and subreddits have been discussed in a networked context with users as links and the various communities as nodes (Olson and Neal, 2015; see also Olson, 2013, for a visualization of community ties), there is a dearth of research on the actual content of comment threads. Accordingly, this study seeks to address this gap by adopting a content analysis approach in order to analyze readability and, to a lesser extent, user interaction sentiment across different subreddits (the subreddits under analysis were curated from a list of the most popular subreddits in terms of subscribers, as cited by http://redditmetrics.com). Specifically, emoticon analysis (as a measure of sentiment) was used in conjunction with a battery of readability tests (Flesch-Kincaid, Gunning fog, and SMOG). These readability tests take into account average sentence length, syllable count, etc. in order to estimate the grade level (i.e., years of formal education) required to understand a text. Finally, links provided by original posters (that is, the users who create the threads under analysis) were analyzed and classified, in order to determine which types of sites are linked to via the most popular threads on Reddit. This is of particular interest when one considers the “Reddit hug of death,” which occurs when a highly visible thread calls attention to a little-known site without a robust server; the resulting traffic often crashes the site, necessitating the posting of mirrors so that Redditors can view the linked content. A somewhat less dramatic instance of the same phenomenon can be found by looking at Wikipedia page views, wherein pages related to a popular Reddit thread often experience temporary increases in viewcounts (Moyer, et al., 2015).
In sum, the research was guided by the following questions:
RQ1: To what degree do different subreddits (and subreddit categories) differ in terms of readability?
RQ2: To what degree do different subreddits (and subreddit categories) use emoticons (both in terms of frequency and in terms of sentiment)?
RQ3: What types of sites do different subreddits (and subreddit categories) link to, taking into consideration only links that are found in original posts (OPs)?
The primary metric of the popularity of any given thread or comment on Reddit is its score. This may be compared to Facebook’s “like” system or Twitter's retweets, although Reddit allows users to downvote items (expressing dislike). Accordingly, assessing the popularity of a thread or comment is not quite as straightforward on Reddit as it is on other social networking sites, a state of affairs that results in Reddit offering several methods of “sorting” comments within a thread or threads within a subreddit. These different methods take into account variables such as time (“hot”), raw upvote scores (“top”), and upvote/downvote ratios (“best”).
Reddit’s voting system has been criticized, with Gilbert (2013) claiming that it does not consistently identify “potentially popular links” [3], thus defeating the purpose of the site. Another criticism is that it leads to “Karma whoring” [4], wherein “users address the lowest common denominator and usually extend already popular topics” (Richterich, 2014) in the hopes of obtaining upvotes and thus a higher karma score. However, it has also been found that Reddit’s voting system was effective at identifying high-quality “quotable phrases” (Bendersky and Smith, 2012). In addition, Turcotte, et al. (2015) have observed that “social media recommendations improve levels of media trust, and also make people want to follow more news from that particular media outlet in the future,” which indicates that it is likely that Reddit’s content delivery system has an effect on frequent visitors. Although it has been noted that “consumption of news from information/news Web sites is positively associated with higher trust, while access to information available on social media is linked with lower trust” (Ceron, 2015), the fact that Reddit by its very nature actually links to outside information sources suggests that it captures the best of both worlds: the experience of social media coupled with the authority granted by “official” news sites (see also Johnson and Kaye [2014], wherein it was found that “reliance on other online sources is linked to perceptions of high credibility of SNS”).
Readability scores
Three of the most commonly used readability tests are Flesch-Kincaid, Gunning fog, and SMOG. All of these tests return an estimated grade level — that is, the amount of formal education required to comprehend any given text. Whereas the Flesch-Kincaid formula (hereafter referred to simply as “Flesch”) takes into account the total number of syllables in a text (Kincaid, et al., 1975), the Gunning fog index only considers the total number of “complex words” (defined as those with at least three syllables) (Bogert, 1985). Both tests take into account the total number of words and sentences in the text under analysis. The SMOG index, consciously designed as a simpler yet more accurate version of Gunning fog, only considers the total number of sentences and the total number of “complex words” (Fitzsimmons, et al., 2010).
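The three formulas described above can be sketched as follows. This is a minimal illustration using the commonly published form of each formula; the function names are hypothetical, the counts of words, sentences, syllables, and “complex words” are assumed to be supplied by the caller, and the block does not reproduce the tokenization or syllable-counting rules of any particular library.

```python
import math


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid grade level: uses words per sentence and syllables per word."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def gunning_fog(words: int, sentences: int, complex_words: int) -> float:
    """Gunning fog index: uses words per sentence and the share of complex words
    (complex = three or more syllables)."""
    return 0.4 * ((words / sentences) + 100.0 * (complex_words / words))


def smog(sentences: int, complex_words: int) -> float:
    """SMOG index: uses only the sentence count and the complex-word count,
    with the complex-word count normalized to a 30-sentence sample."""
    return 1.0430 * math.sqrt(complex_words * (30.0 / sentences)) + 3.1291


# Hypothetical counts for a single text: 100 words, 5 sentences,
# 150 syllables, 10 complex words.
fk = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
fog = gunning_fog(words=100, sentences=5, complex_words=10)
smog_score = smog(sentences=5, complex_words=10)
```

Note that only the Flesch-Kincaid function takes a syllable total, and only the SMOG function ignores the word count, which mirrors the distinctions drawn in the paragraph above.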
Readability scores have been used across a variety of contexts in order to ascertain the “difficulty” of reading a document, including articles published in medical journals (Weeks and Wallace, 2002), business reports (Clatworthy and Jones, 2001; Jones, 1996; Smith and Taffler, 1992), mission statements (Busch and Folaron, 2005), and college textbooks (McConnell, 1982). Flesch scores have been used to detect deceptive language (Burgoon, et al., 2003) and analyze the degree of community present in a classroom setting (Rovai, 2002). In computer-mediated contexts, Flesch scores have been used to analyze online news articles (Knobloch-Westerwick and Johnson, 2014) and medical Web sites (Whitten, et al., 2008). All three scores were used to determine that a job analysis questionnaire was considered to be at the college-level in terms of readability (Ash and Edgell, 1975).
Perhaps the most frequent usage of readability tests in the academic literature is in regard to health care materials. The Gunning fog metric has been used to establish that health care literature provided to patients was too advanced for the intended audience (Gazmararian, et al., 1999), just as the same index (taken in conjunction with Flesch) indicated that consent forms regarding oncology protocols were also too advanced for lay patients (Grossman, et al., 1994). Both SMOG and Flesch were used by Beaver and Luker (1997) to determine that breast cancer booklets distributed in the U.K. were too advanced for most readers, while SMOG on its own has been used to argue that materials relating to other health-care topics are too advanced for the general public, including HIV/AIDS (Wells, 1994), strokes (Sullivan and O’Conor, 2001), and dentistry (Jayaratne, et al., 2013). Finally, SMOG scores have been used to argue that many online materials relating to health are written above the average reader’s comprehension level (Aliu and Chung, 2010).
In the field of scholarly communication, Gunning fog was used to discover that the peer-review process improves readability, though the end results are still too advanced for many readers (Roberts, et al., 1994). It may be that editors and reviewers either consciously or unconsciously shy away from simplifying submissions too much, as more “difficult” texts are seen as containing better research (Armstrong, 1980).
Applying these scores to a computer-mediated environment is somewhat more difficult, as noted by Sallis and Kassabova (2000), although with proper adjustments (e.g., filtering out unparseable elements such as URLs) they can be quite powerful. The SMOG index in particular has been found to be an accurate measure of the readability of online documents (Gottron and Martin, 2009). In analyzing online messages, Walther (2007) used the Flesch index to determine that writers used more complex language when they believed they were speaking with a university professor, while using less sophisticated language when they were under the impression that they were speaking with a high schooler, indicating a degree of “language accommodation” in this context. Flesch scores have also been used to analyze e-mail correspondence in newsgroups (Sallis and Kassabova, 2000). Finally, Flesch was used as a means of ensuring that texts were written at a fifth-grade reading level (that is, the texts should be readable by a 10–11 year old), in order to facilitate the demands of a usability study concerning a computer-mediated communication health application (Lin, et al., 2009).
Given that there is something of a pervasive trend of “official” materials being too advanced for the general public, one would expect content generated by lay people, such as that found on Reddit, to be far more comprehensible in terms of language sophistication. The question, then, is how much “simpler” the discourse on Reddit is, and how it varies across communities and topic types. A cursory observation of the site’s most popular subreddits (even in pure terms of simple comment length) indicates that comments in “entertainment”-based subreddits seem to be relatively short, and comment chains extend deeper into the tree most often when there is a recurring joke that multiple members are exploiting. Conversely, subreddits about “serious” topics (e.g., those classified as “news,” “politics/history,” and “science”) tend to encourage discussion (as opposed to mere reactions to an outside link or original story), which carries with it a greater attention to detail and information-seeking, which in turn would be expected to lead to a higher level of discourse in the comments section. Hence the following hypothesis:
H1: “Entertainment” subreddits will be rated as more accessible via readability tests when compared to “serious” subreddits.
Emoticons
In previous research, Dresner and Herring (2010) have described:
... three functions of emoticons ... (a) emotion, mapped directly onto facial expression (e.g., happy or sad); (b) nonemotional meaning, mapped conventionally onto facial expression (e.g., a wink as indicating joking intent; an anxious smile); and (c) illocutionary force indicators that do not map conventionally onto facial expression (e.g., a smile as downgrading a complaint to a simple assertion). [5]
All of these can be found in the Reddit environment. Although the impacts of emoticons on readers have been claimed to be minimal (Walther and D’Addario, 2001), emoticons do indicate intended sentiment, and indeed, Derks, et al. (2007a) found that emoticons “have an impact on message interpretation” and “are useful in strengthening the intensity of a verbal message.” Emoticons have also been used to aid machine learning efforts (Read, 2005). From another perspective, emoticons serve a useful purpose in professional e-mail messages (Skovholt, et al., 2014). Along the same lines, emoticons allow for non-verbal information to be communicated in a computer-mediated context (Lo, 2008), just as they allow users to clarify the intended meanings of their missives (Thompson and Filik, 2016). There does seem to be something of a generation gap in terms of using emoticons as non-verbal indicators (Krohn, 2004), but this is somewhat less relevant in the Reddit environment, given that Reddit users tend to be younger than Internet users as a whole (Barthel, et al., 2016). Finally, previous research has found that people “used more emoticons in socio-emotional than in task-oriented social contexts” [6], which leads to the second hypothesis:
H2: “Entertainment” focused subreddits will be more likely to use emoticons, while more “serious” subreddits will be less likely to use emoticons.
The results of this research can be used to gain insights into the manner in which Reddit as a whole operates, as well as to inform future research that seeks to analyze emoticon use, topicality, and readability in online communities.
A total of 204 subreddits were chosen for analysis; these represent the subreddits with the largest number of subscribers in February 2016. Two separate samples, each consisting of the top 200 subreddits, were gathered during this time; as there were slight differences between the two lists, the total number of subreddits adds up to 204. These subreddits were identified via the “reddit metrics” site, which provides a wealth of information about Reddit, including the “fastest growing” subreddits, the “top new reddits,” and (most relevantly for the current research) a ranked list of subreddits (“Top subreddits,” 2016). All popularity metrics take the number of subscribers into account.
From each of the 204 subreddits, the top 75 threads of all time (based on Reddit’s “top” sorting feature) were chosen, and up to 500 comments were selected from each thread (when a thread contained more than 500 comments, the Reddit sorting algorithm was deferred to; that is, the 500 “best” comments were selected). In addition, the “top” threads in a 24-hour period were sampled from each subreddit (up to a maximum of 25 threads per subreddit), and on two separate occasions the “hot” threads in a 24-hour period were also sampled from each subreddit (again, for each of these samples, up to a maximum of 25 threads per subreddit were harvested) [7]. In short, the final sample consisted of the top all-time threads from the most popular subreddits, as well as a series of “snapshots” of these subreddits. Thus, this research is not intended to be a description of Reddit as a whole; rather, it seeks to consider what the most popular threads on the most popular subreddits are discussing, and the level of discourse evidenced in the most visible sections of the community.
Each of the subreddits was manually classified into one of 27 exclusive topical categories. While it is true that the comment trees in any given subreddit often drift away from the topic of the original post, the different subreddit categories effectively establish something of a baseline prompt, which makes the topical groupings of the subreddits meaningful. The classifications provided in the ModeratorDuck subreddit were used as a starting point (“Categorization of all subreddits,” 2014), although many subreddits were not included in this list, and several amendments needed to be made. A list of these categories, along with brief descriptions and member subreddits, can be found in Table A in the Appendix.
The study consists of three distinct sections: readability scores, sentiment (as measured by emoticon usage), and domain analyses. Each of these will be discussed in turn.
Readability
The Text-Statistics PHP library was used to facilitate large-scale computations of readability across the corpus (Childs, 2016). Individual scores were calculated for each comment, which were then averaged together to calculate the mean scores for each readability test for each subreddit. Finally, the mean scores of the three different readability tests (Flesch, Gunning fog, and SMOG) were averaged together in order to determine a mean readability score for each subreddit. These scores are expressed as a number that indicates the estimated grade level of education required to understand a text; thus, a score of 8 would indicate that a text is written at an eighth-grade reading level (that is, a 13-year-old student should be able to read the text without any significant difficulties).
Although the Text-Statistics library provided two further readability tests (Coleman-Liau and Automated Readability Index), these were found to be unsatisfactory for computer-mediated environments (particularly when a subreddit contained a high proportion of posts along the lines of “HAHAHAHAHAHAHAHAHAHAHAHAHA”), and thus these scores were not taken into consideration for this project (though it should be noted that the Automated Readability Index has been used in relation to product reviews, wherein language is somewhat more regulated, e.g., Hu, et al., 2012). In addition, any individual comments that were outside the normal Flesch range (0–100) were excluded from analysis for this particular test. A full list of results can be found in Table B in the Appendix.
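The two-stage averaging described in this section can be illustrated with a short sketch. The data layout and function name below are hypothetical and are not drawn from the Text-Statistics library; a value of None is assumed here to mark a comment that was excluded from a given test (as with comments falling outside the normal Flesch range).

```python
from statistics import mean

# The three tests that were retained for the analysis.
TESTS = ("flesch", "fog", "smog")


def mean_readability(comments):
    """Average each test over a subreddit's comments, then average the three means.

    `comments` is a list of dicts keyed by test name; None marks a comment
    that was excluded from that particular test (an assumed convention).
    """
    per_test_means = []
    for test in TESTS:
        scores = [c[test] for c in comments if c[test] is not None]
        per_test_means.append(mean(scores))
    return mean(per_test_means)


# Hypothetical per-comment grade levels for one subreddit.
sample = [
    {"flesch": 6.0, "fog": 8.0, "smog": 7.0},
    {"flesch": None, "fog": 10.0, "smog": 9.0},  # excluded from the Flesch mean only
]
subreddit_score = mean_readability(sample)
```

Because each test is averaged separately before the three means are combined, excluding a comment from one test does not remove it from the other two.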
Sentiment [Emoticons]
The emoticon list was drawn from the EmoticonLookupTable.txt file included in the SentiStrength download (see Thelwall, et al., 2010; Thelwall, et al., 2012). This file includes a list of emoticons mapped to their perceived sentiment (e.g., a smiley face — :) — is given a score of “1,” whereas a frowny face — :( — is given a score of “-1”). However, this list was slightly adapted in order to facilitate analysis. Specifically, the “:/” emoticon was removed from the list, as it generated a large number of false positives due to the presence of URLs in the comments (http:// being the most common violator). Emoji and other non-textual emoticons were not considered, as the Reddit platform only permits textual characters in the comments section.
Two separate analyses were carried out: the percentage of comments per subreddit that used at least one emoticon, and the average sentiment of all emoticons used across a given subreddit’s sampled comments. The results can be found in Table C in the Appendix.
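The two measures can be sketched as follows. The lookup table below is a small illustrative subset rather than the SentiStrength file itself, the function name is hypothetical, and matching on whitespace-delimited tokens is an assumption, as the exact matching procedure is not specified here. In keeping with the adaptation described above, “:/” is simply absent from the table.

```python
# Illustrative subset of an emoticon-to-sentiment table (not the full
# SentiStrength list). ":/" is deliberately omitted.
EMOTICONS = {":)": 1, ":-)": 1, ":D": 1, ":(": -1, ":-(": -1}


def emoticon_stats(comments):
    """Return (percentage of comments with at least one emoticon,
    mean sentiment over all emoticons found)."""
    comments_with_emoticon = 0
    sentiments = []
    for text in comments:
        found = [EMOTICONS[token] for token in text.split() if token in EMOTICONS]
        if found:
            comments_with_emoticon += 1
        sentiments.extend(found)
    percentage = 100.0 * comments_with_emoticon / len(comments) if comments else 0.0
    mean_sentiment = sum(sentiments) / len(sentiments) if sentiments else 0.0
    return percentage, mean_sentiment


# Hypothetical comments from a single subreddit.
sample = [
    "great job :)",
    "see http://example.com for details",
    "ugh :( but ok :)",
    "no emoticon here",
]
percentage, mean_sentiment = emoticon_stats(sample)
```

The first value counts each comment once regardless of how many emoticons it contains, whereas the second is averaged over every emoticon occurrence, so the two measures can diverge within the same subreddit.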
Domain analysis
Whereas some subreddits prohibit the posting of links in top-level posts (often because the nature of the subreddit, such as r/explainlikeimfive and r/showerthoughts, stipulates that top-level posts should only consist of text, often in the form of a question or statement), others prohibit the posting of anything but links (for example, r/politics, wherein OPs must link to an outside source, and the title of the post must be drawn from said source). Keeping this in mind, it is instructive to consider all OP links across the entire sample, as these represent the outside domains that were linked to most often in the most popular posts in the most popular subreddits (obviously, subreddits that prohibit outside links in OPs are not represented in this analysis). A list of Web sites that were linked to at least 10 times across the entire sample can be found in Table D in the Appendix. These Web sites (n = 21,797) represent 81.1 percent of the links that could be found in the OPs across the sample.
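A domain tally of this kind can be sketched with the Python standard library. The function name is hypothetical, and the normalization choices (lowercasing the host and stripping a leading “www.”) are assumptions made for illustration; the study’s own treatment of subdomains is not described here.

```python
from collections import Counter
from urllib.parse import urlparse


def domain_counts(urls, min_count=10):
    """Count linked domains and keep those linked at least `min_count` times."""
    counts = Counter()
    for url in urls:
        host = urlparse(url).netloc.lower()
        if host.startswith("www."):
            host = host[4:]  # assumed normalization; other subdomains are kept
        if host:
            counts[host] += 1
    return {domain: n for domain, n in counts.items() if n >= min_count}


# Hypothetical OP links gathered across a sample.
links = ["http://www.youtube.com/watch?v=1"] * 10 + ["https://imgur.com/a"] * 3
frequent = domain_counts(links)
```

With the default threshold of 10, only domains that would qualify for a table such as Table D are returned; the remaining, less frequently linked domains form the long tail.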
The various Web sites were classified into one of five categories. A list of these categories, along with example sites and brief descriptions, can be found in Table 1 (the precise categories can be found in Table D in the Appendix).
Table 1: Site classifications.
Classification | Explanation | Examples
GIFs | Sites that host silent videos, animations, clips, etc. | gfycat.com; tumblr.com
Images | Sites that host static images | imgur.com; instagram.com
News | Sites that provide current information | bbc.co.uk; huffingtonpost.com
User-generated content | Sites that rely on original user-generated content | en.wikipedia.org; twitter.com
Videos | Sites that host videos with sound | vimeo.com; youtube.com
For each of the dependent variables, a one-way ANOVA was calculated to predict the dependent variable based on the subreddit category variable. A significant finding indicates that the dependent variable is influenced by the subreddit category. All of these relationships were found to be significant at p < .0001, and all had a moderate effect size (η² > .25), per Ferguson (2009) (Table 2). Accordingly, we can say that a subreddit’s category has a predictive effect on all of the variables under analysis: the average readability score of a subreddit is dependent on the subreddit’s category, the category of a subreddit is a reliable predictor of emoticon usage within the subreddit, etc.
Table 2: ANOVA results for dependent variables.
Dependent variable | F-statistic | η²
Emoticon score | F(26, 177)=4.375 | 0.391
Emoticon percentage | F(26, 177)=4.586 | 0.403
GIFs | F(26, 177)=3.562 | 0.344
Images | F(26, 177)=9.042 | 0.57
News | F(26, 177)=7.836 | 0.535
Readability scores (mean) | F(26, 177)=9.901 | 0.593
Videos | F(26, 177)=9.958 | 0.594
User-generated content | F(26, 177)=3.66 | 0.35
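For readers who wish to check the relationship between the reported F-statistics and effect sizes, the following sketch may be useful. It relies on the standard one-way ANOVA identity η² = SS_between / SS_total, which can be rewritten in terms of F and the two degrees of freedom (here 26 = 27 categories − 1, and 177 = 204 subreddits − 27 categories). The function names and the toy data are hypothetical and are not taken from the study’s own analysis.

```python
def one_way_anova(groups):
    """Return (F, eta squared) for a list of groups of numeric observations."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_sq = ss_between / (ss_between + ss_within)
    return f_stat, eta_sq


def eta_squared_from_f(f_stat, df_between, df_within):
    """Recover eta squared from a reported F-statistic and its degrees of freedom."""
    return (f_stat * df_between) / (f_stat * df_between + df_within)


# Toy data: three groups of three observations each.
f_stat, eta_sq = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])

# Applying the identity to the "Emoticon score" row: F(26, 177) = 4.375.
emoticon_eta = round(eta_squared_from_f(4.375, 26, 177), 3)
```

Applied to the “Emoticon score” and “Readability scores (mean)” rows, the identity yields values that round to the effect sizes listed in Table 2.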
Readability
The mean readability scores across the 27 subreddit categories ranged from 4.6 (indicating a fourth- to fifth-grade reading level) to 7.8 (indicating a seventh- to eighth-grade reading level). Subreddits that were classified as “porn,” “GIFs,” “videos,” and “images” (which, taken together, might be considered a “multimedia” macro category) were found at the lower end of the spectrum, while subreddits classified as “philosophy/religion,” “business/finance,” and “science” (all of which could be considered more “academic,” or at least more likely to spark intricate discourse) were found at the upper end of the spectrum. The full results can be seen in Figure 1 (the y-axis numbers have been selected to emphasize the distinctions between the subreddit categories).
Figure 1: Mean readability scores by subreddit type.
In terms of the Tukey tests, the subreddits classified as “porn” and “videos” (the latter not containing any pornographic material) were consistently rated as having a less-sophisticated discourse style than other subreddits, particularly compared to those categorized as “philosophy/religion” and, to a lesser degree, “business/finance.” In addition, “sports” subreddits tended to rank on the lower end of the readability spectrum. This suggests that the “philosophy/religion” and “business/finance” subreddits contain in-depth discourse (perhaps with the usage of terms that are sufficiently “sophisticated” to register highly on the various readability tests), along with relatively intricate sentence construction. Conversely, the “porn,” “videos,” and “sports” subreddits tend towards comments that consist of simple language with little technical jargon. Moreover, it appears that more actual “discussion” goes on in subreddits that ranked higher on the readability tests (as the resulting back-and-forth between members engenders increasingly sophisticated discourse), whereas the lower-ranked subreddits consist more of simple opinions or arguments (“You’re wrong,” etc.).
Sentiment [Emoticons]
Subreddits classified as “sports,” “random/assorted,” and “humor” had the lowest mean emoticon score (indicating that they had a tendency to use negative emoticons, a tendency to avoid positive emoticons, or a combination of both), while subreddits classified as “relationships” and “health/food” had the highest mean emoticon scores (Figure 2). In terms of emoticon use frequency, subreddits classified as “politics/history” and “news” were the least likely to contain comments that used emoticons, while subreddits classified as “health/food” were the most likely to contain comments that used emoticons (Figure 3).
Figure 2: Mean emoticon scores by subreddit type.
Figure 3: Mean emoticon percentage by subreddit type.
Domain analysis
The types of sites that were linked to by OPs were heavily dependent on the subreddit classification. “Porn” subreddits consistently linked to “GIF” sites at a much higher rate than other subreddit types. Unsurprisingly, “images” subreddits, along with “GIFs,” “photography,” and “porn” subreddits were most likely to link to “images” sites. Similarly, “news” and “business/finance” subreddits were most likely to link to “news” sites (as were “politics/history” subreddits, albeit to a somewhat lesser degree). Finally, the “meta” and “sports” subreddits were the most likely to link to sites containing “user-generated content.”
The various subreddits exhibited a number of differences in posting style and content. It was perhaps the readability analysis that exhibited the starkest differences, as the topical focus of a subreddit was a reliable predictor of the complexity of its discourse. Subreddits that aim at answering questions or encouraging discussion (e.g., r/science, r/philosophy, r/AskHistorians) possessed the most linguistically advanced discourses, whereas subreddits such as r/ass, r/milf, r/gonewild, and r/Amateur (the “porn” subreddits) were consistently ranked at the bottom (the conversations on the latter subreddits were generally conducted at no higher than a fourth grade reading level). The latter is hardly surprising, as the words used most often across subreddits such as r/RealGirls (with stopwords ignored) consisted almost exclusively of expletives, obscenities, and terms such as “love,” “yeah,” “fake,” “face,” and “hot.” Conversely, the subreddits that ranked highest in terms of linguistic sophistication used words such as “moral,” “human,” “access,” “articles,” “research,” “science,” “question,” and “answer,” indicating concepts that lend themselves to a deeper level of discourse (as well as an environment in which a question/answer dynamic is frequent, suggesting that requests for more information may lead to more erudite discussions).
Four subreddits — r/AskHistorians, r/philosophy, r/askscience, and r/changemyview — averaged an eighth grade reading level, which was the highest average level observed across the sample, with the exception of two notable outliers: r/rickandmorty and r/circlejerk. These subreddits scored abnormally highly on at least one readability test, illustrating the imperfections inherent in applying these tools to a computer-mediated context. The latter subreddit is easy to explain, as a single post that scores abnormally highly on one or more of the readability tests (e.g., “hahahahahahahahahaha”) will often be copied by many subsequent users, many of whom may add their own variations, most of which will fall outside of the “normal” realm of discourse expected by the readability tests. This same effect can be seen in r/rickandmorty, wherein the results were heavily skewed by the presence of one thread wherein more than 60 posts simply consisted of a long string of capital Hs. When these results were removed, the subreddit ranked near the bottom in terms of linguistic sophistication. As a final note, it is worth mentioning that r/DepthHub, which by its own description “gathers the best in-depth submissions and discussion on Reddit,” ranked highly in the readability tests, simultaneously validating the success of this subreddit and the applicability of selected readability tests for a computer-mediated environment.
The emoticon analysis was not quite as revealing (nor were the subreddits at either end of the spectrum as easily classified), but there were still some interesting findings. Of the six subreddits with a negative score (indicating a greater proportion of emoticons classified as negative by SentiStrength’s dictionary), two are sports-related (r/nba and r/nfl), possibly because sports discussions tend to involve negativity towards players, teams, etc. However, it is important to emphasize that, on the whole, “sports” subreddits still had a positive emoticon score, although this was the lowest score witnessed across the 27 subreddit categories.
In terms of raw emoticon counts (not taking positivity/negativity into account), subreddits classified as “health/food” tended to contain more emoticons, whereas subreddits classified as “politics/history” and “news” tended to contain fewer emoticons. This appears to be due to the fact that many “health/food” subreddits involve dieting, wherein emoticons may be used as encouragement (or may be used as reflections of a poster’s individual experiences). Specifically, the two subreddits with the highest percentage of comments containing emoticons, r/loseit (10.43 percent) and r/MakeupAddiction (13.18 percent), are both lifestyle subreddits wherein motivational statements are highly valued. Subreddits such as r/SkincareAddiction and r/bodyweightfitness are not much further down the list. It is also important to note that these “health/food” subreddits ranked highest in terms of emoticon sentiment, further lending credence to the idea that contributors to these subreddits use positive emoticons as a means of encouraging others and solidifying the community.
Conversely, subreddits such as r/liberal and r/conservative (two of the three subreddits with the lowest percentage of comments containing emoticons, with 1.11 percent and 1.26 percent, respectively), r/news (1.35 percent), and r/politics (1.61 percent) seem to shy away from emoticon use, possibly because factual discussion is prioritized in these communities over personal opinions (or, alternately, because proffered opinions are expected to be “straight,” without any emoticon embellishments or other niceties). Interestingly, “humor” subreddits rank in the lower third of subreddit categories in terms of emoticon usage, suggesting that it may be considered somewhat gauche to use emoticons in these subreddits. A possible explanation is that, in these subreddits, images, videos, GIFs, and plain text are used for humorous effect, and thus emoticons would be the Internet equivalent of a laugh track — leading, distracting, and frowned upon by many in the community. Yet another possible explanation is that the inclusion of r/4chan in the “humor” category may have contributed greatly to this score, considering that 4chan is associated with a rather venomous type of humor.
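The two emoticon measures discussed here (the percentage of comments containing at least one emoticon, and whether positive or negative emoticons predominate) can be sketched as follows. This is an illustration only: the small lexicon, the whitespace tokenization, and the net-score formula are assumptions made for the example, and do not reproduce SentiStrength’s dictionary or its scoring.

```python
# Hypothetical mini-lexicon for illustration; NOT SentiStrength's dictionary.
EMOTICON_SENTIMENT = {
    ":)": 1, ":-)": 1, ":D": 1, ";)": 1,
    ":(": -1, ":-(": -1, ":'(": -1,
}


def emoticon_stats(comments):
    """Return (percent of comments with an emoticon, net sentiment in [-1, 1])."""
    with_emoticon = 0
    positive = 0
    negative = 0
    for comment in comments:
        # Assumption: emoticons appear as whitespace-separated tokens.
        found = [EMOTICON_SENTIMENT[t] for t in comment.split()
                 if t in EMOTICON_SENTIMENT]
        if found:
            with_emoticon += 1
        positive += sum(1 for s in found if s > 0)
        negative += sum(1 for s in found if s < 0)
    total = positive + negative
    share = 100.0 * with_emoticon / len(comments) if comments else 0.0
    # Assumed score: (positive - negative) / all emoticons; below 0 means
    # that negative emoticons outnumber positive ones.
    net = (positive - negative) / total if total else 0.0
    return share, net


sample_comments = [
    "Down five pounds this week :)",
    "Great work, keep going :D",
    "Plateaued again :(",
    "What is your routine?",
]
share, net = emoticon_stats(sample_comments)
print(share, net)
```

In this invented sample, three of four comments contain an emoticon and positive emoticons outnumber negative ones, which yields a high share and a positive net score, the pattern reported above for the “health/food” category.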
Finally, with regard to the domain analysis, the vast majority of sites linked to by the top Reddit threads are either mainstream news organizations (e.g., Guardian, New York Times) or social media (e.g., YouTube, Twitter, Imgur). This simultaneously supports and argues against Reddit’s claim to being “the front page of the Internet”: whereas the most popular threads on the most popular subreddits clearly link to popular sites, it is also true that the front page of Reddit is not necessarily the best place to seek out information that is available only in less widely-known venues. The most visible segments of Reddit, then, appear to reflect the most visible segments of the Internet, from Wikipedia to Imgur/YouTube to English-speaking news sites (both in the U.K. and in the U.S., which is hardly surprising for an English-based Web site). Of course, the long tail evidenced in regard to linked domains (accounting for 19.1 percent of domains) indicates that Reddit indeed does manage to highlight lesser-known venues, although these venues are (rather predictably) not as prominent as more mainstream sources.
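A domain tally of this kind, together with a measure of its long tail, can be sketched as follows. This is a hedged illustration rather than the procedure used in this study: the URLs are invented, the helper names are hypothetical, hostnames are normalized only by dropping a leading “www.”, and the tail is measured here as the share of links pointing outside the top N domains (the study’s own definition of the long tail may differ).

```python
from collections import Counter
from urllib.parse import urlparse


def link_domain(url: str) -> str:
    """Hostname of a URL, lower-cased, without a leading 'www.' (naive)."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def long_tail_share(urls, top_n):
    """Percent of links whose domain is not among the top_n most linked."""
    counts = Counter(link_domain(u) for u in urls)
    head = sum(n for _, n in counts.most_common(top_n))
    total = sum(counts.values())
    return 100.0 * (total - head) / total if total else 0.0


# Invented example links (assumption, not data from this study).
sample_urls = [
    "https://www.youtube.com/watch?v=abc",
    "https://youtube.com/watch?v=def",
    "https://imgur.com/a/xyz",
    "https://www.imgur.com/gallery/q",
    "https://www.theguardian.com/world",
    "https://example.org/post",
]
tail = long_tail_share(sample_urls, top_n=2)
print(Counter(link_domain(u) for u in sample_urls).most_common(2), tail)
```

A more careful version would map subdomains to their registrable domain (for example, treating i.imgur.com as imgur.com), which requires a public suffix list and is omitted here.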
The topical category of any given subreddit is a reliable predictor of the content within the subreddit. While some consequences are expected (e.g., certain subreddits prohibit links in OPs, whereas others require a link to a site such as Imgur; in these situations, it is not surprising that there are systematic differences in OP link domains), others are more surprising. Different subreddits exhibit vastly varying levels of discourse sentiment and sophistication, with “porn” subreddits using more basic vocabularies (posts often consist simply of crass statements such as “hot girl”) and “philosophy/religion” subreddits using the most sophisticated vocabularies. The “health/food” subreddits simultaneously use emoticons most frequently and use them the most positively, indicating that encouraging and motivating others via emoticons is an integral part of being a member of these communities.
Future research in this area could undertake a more comprehensive sentiment analysis based on actual language patterns, although this would most likely need to be conducted manually. Similarly, a robust topical analysis of OPs and their associated comments would go a long way towards determining what, precisely, people are talking about on Reddit. Finally, this study only considered the most popular comments on the most popular threads in the most popular subreddits. The reasoning behind this was that it was considered desirable to analyze what the average Reddit user sees. However, it would be interesting to see if this study’s findings hold up across the whole of Reddit. What is certain, however, is that differences will continue to be found, as Reddit is not a monolithic enterprise. Rather, it is effectively a collection of very different communities, with different participants, different goals, and different norms, which share a common platform but very little else. This may well be the reason for its continued popularity; given that users can locate, join, and even create communities with ease, and given the wide variation in topicality, language use, and user interaction across the site, a large variety of audiences can find much of worth on “the front page of the Internet.”
About the author
Andrew Tsou is a Ph.D. student in the Department of Information & Library Science at Indiana University Bloomington. His research interests include computer-mediated communication and discourse patterns across social media platforms.
E-mail: iatsou [at] umail [dot] iu [dot] edu
Acknowledgements
The author would like to thank Patrick Shih for his assistance in preparing this manuscript.
Notes
1. Often referred to as “ELI5” for short, this subreddit allows users to ask questions about a variety of topics, with the understanding that responses should be written as simply as possible (despite the subreddit’s name, responses are not expected to be comprehensible by actual five-year-olds). Analogies are often used to make complicated points.
2. “AskReddit” is somewhat more informal than “explainlikeimfive,” in that many questions involve asking the Reddit community about their opinions/suggestions on a variety of topics.
3. Gilbert, 2013, p. 803.
4. “Karma” refers to the points that a user accrues by posting popular comments. It is something of a status symbol on Reddit, similar to retweets on Twitter and “likes” on Facebook.
5. Dresner and Herring, 2010, p. 263.
6. Derks, et al., 2007b, p. 842.
7. For more information about Reddit’s sorting system, see https://redditblog.com/2009/10/15/reddits-new-comment-sorting-system/.
References
Alexa.com, 2016. “Reddit,” at http://www.alexa.com/siteinfo/reddit.com, accessed 30 September 2016.
Oluseyi Aliu and Kevin C. Chung, 2010. “Readability of ASPS and ASAPS educational websites: An analysis of consumer impact,” Plastic and Reconstructive Surgery, volume 125, number 4, pp. 1,271–1,278.
doi: http://dx.doi.org/10.1097/PRS.0b013e3181d0ab9e, accessed 25 October 2016.
J. Scott Armstrong, 1980. “Unintelligible management research and academic prestige,” Interfaces, volume 10, number 2, pp. 80–86.
doi: http://dx.doi.org/10.1287/inte.10.2.80, accessed 25 October 2016.
Ronald A. Ash and Steven L. Edgell, 1975. “A note on the readability of the Position Analysis Questionnaire (PAQ),” Journal of Applied Psychology, volume 60, number 6, pp. 765–766.
doi: http://dx.doi.org/10.1037/0021-9010.60.6.765, accessed 25 October 2015.
Michael Barthel, Galen Stocking, Jesse Holcomb, and Amy Mitchell, 2016. “Nearly eight-in-ten Reddit users get news on the site,” Pew Research Center (25 February), at http://www.journalism.org/files/2016/02/PJ_2016.02.25_Reddit_FINAL.pdf, accessed 30 September 2016.
Kinta Beaver and Karen Luker, 1997. “Readability of patient information booklets for women with breast cancer,” Patient Education and Counseling, volume 31, number 2, pp. 95–102.
doi: http://dx.doi.org/10.1016/S0738-3991(96)00988-3, accessed 25 October 2016.
Bernd Becker, 2013. “Learning analytics: Insights into the natural learning behavior of our students,” Behavioral & Social Sciences Librarian, volume 32, number 1, pp. 63–67.
doi: http://dx.doi.org/10.1080/01639269.2013.751804, accessed 25 October 2016.
Michael Bendersky and David A. Smith, 2012. “A dictionary of wisdom and wit: Learning to extract quotable phrases,” Proceedings of the Workshop on Computational Linguistics for Literature, co-located with the 2012 Conference of the North American Chapter of the Association for Computational Linguistics, pp. 69–77; version at http://bendersky.github.io/pubs/2012-3.pdf, accessed 25 October 2016.
Kelly Bergstrom, 2011. “‘Don’t feed the troll’: Shutting down debate about community expectations on Reddit.com,” First Monday, volume 16, number 8, at http://firstmonday.org/article/view/3498/3029, accessed 25 October 2016.
doi: http://dx.doi.org/10.5210/fm.v16i8.3498, accessed 25 October 2016.
Judith Bogert, 1985. “In defense of the Fog Index,” Bulletin of the Association for Business Communication, volume 48, number 2, pp. 9–11.
doi: http://dx.doi.org/10.1177/108056998504800203, accessed 25 October 2016.
Judee K. Burgoon, J.P. Blair, Tiantian Qin, and Jay F. Nunamaker Jr., 2003. “Detecting deception through linguistic analysis,” In: Hsinchun Chen, Richard Miranda, Daniel D. Zeng, Chris Demchak, Jenny Schroeder, and Therani Madhusudan (editors). Intelligence and security informatics: Proceedings of the First NSF/NIJ Symposium, ISI 2003, Tucson, AZ, USA, June 2–3, 2003. Lecture Notes in Computer Science, volume 2665. Berlin: Springer, pp. 91–101.
doi: http://dx.doi.org/10.1007/3-540-44853-5_7, accessed 25 October 2016.
Monique Busch and Gail Folaron, 2005. “Accessibility and clarity of state child welfare agency mission statements,” Child Welfare, volume 84, number 3, pp. 415–430.
“Categorization of all subreddits,” 2014. At https://www.reddit.com/r/ModeratorDuck/wiki/subreddit_classification, accessed 22 October 2016.
Andrea Ceron, 2015. “Internet, news, and political trust: The difference between social media and online media outlets,” Journal of Computer–Mediated Communication, volume 20, number 5, pp. 487–503.
doi: http://dx.doi.org/10.1111/jcc4.12129, accessed 25 October 2016.
Dave Childs, 2016. “Text-Statistics” (PHP library), at https://github.com/DaveChild/Text-Statistics, accessed 22 October 2016.
Daejin Choi, Jinyoung Han, Taejoong Chung, Yong-Yeol Ahn, Byung-Gon Chun, and Ted Taekyoung Kwon, 2015. “Characterizing conversation patterns in Reddit: From the perspectives of content properties and user participation behaviors,” COSN ’15: Proceedings of the 2015 ACM on Conference on Online Social Networks, pp. 233–243.
doi: http://dx.doi.org/10.1145/2817946.2817959, accessed 25 October 2016.
Mark Clatworthy and Michael John Jones, 2001. “The effect of thematic structure on the variability of annual report readability,” Accounting, Auditing & Accountability Journal, volume 14, number 3, pp. 311–326.
doi: http://dx.doi.org/10.1108/09513570110399890, accessed 25 October 2016.
Daantje Derks, Arjan E.R. Bos, and Jasper von Grumbkow, 2007a. “Emoticons and online message interpretation,” Social Science Computer Review, volume 26, number 3, pp. 379–388.
doi: http://dx.doi.org/10.1177/0894439307311611, accessed 25 October 2016.
Daantje Derks, Arjan E.R. Bos, and Jasper von Grumbkow, 2007b. “Emoticons and social interaction on the Internet: The importance of social context,” Computers in Human Behavior, volume 23, number 1, pp. 842–849.
doi: http://dx.doi.org/10.1016/j.chb.2004.11.013, accessed 25 October 2016.
Eli Dresner and Susan C. Herring, 2010. “Functions of the nonverbal in CMC: Emoticons and illocutionary force,” Communication Theory, volume 20, number 3, pp. 249–268.
doi: http://dx.doi.org/10.1111/j.1468-2885.2010.01362.x, accessed 25 October 2016.
Maeve Duggan and Aaron Smith, 2013. “6% of online adults are Reddit users,” Pew Research Center (3 July), at http://www.pewinternet.org/2013/07/03/6-of-online-adults-are-reddit-users/, accessed 25 October 2016.
Christopher J. Ferguson, 2009. “An effect size primer: A guide for clinicians and researchers,” Professional Psychology: Research and Practice, volume 40, number 5, pp. 532–538.
doi: http://dx.doi.org/10.1037/a0015808, accessed 25 October 2016.
P.R. Fitzsimmons, B.D. Michael, J.L. Hulley, and G.O. Scott, 2010. “A readability assessment of online Parkinson’s disease information,” Journal of the Royal College of Physicians of Edinburgh, volume 40, number 4, pp. 292–296.
doi: http://dx.doi.org/10.4997/JRCPE.2010.401, accessed 25 October 2016.
Julie A. Gazmararian, David W. Baker, Mark V. Williams, Ruth M. Parker, Tracy L. Scott, Diane C. Green, S. Nicole Fehrenbach, Junling Ren, and Jeffrey P. Koplan, 1999. “Health literacy among Medicare enrollees in a managed care organization,” Journal of the American Medical Association, volume 281, number 6, pp. 545–551.
doi: http://dx.doi.org/10.1001/jama.281.6.545, accessed 25 October 2016.
Eric Gilbert, 2013. “Widespread underprovision on Reddit,” CSCW ’13: Proceedings of the 2013 Conference on Computer Supported Cooperative Work, pp. 803–808.
doi: http://dx.doi.org/10.1145/2441776.2441866, accessed 25 October 2016.
Thomas Gottron and Ludger Martin, 2009. “Estimating Web site readability using content extraction,” WWW ’09: Proceedings of the 18th International Conference on World Wide Web, pp. 1,169–1,170.
doi: http://dx.doi.org/10.1145/1526709.1526911, accessed 25 October 2016.
Stuart A. Grossman, Steven Piantadosi, and Charles Covahey, 1994. “Are informed consent forms that describe clinical oncology research protocols readable by most patients and their families?” Journal of Clinical Oncology, volume 12, number 10, pp. 2,211–2,215.
Nan Hu, Indranil Bose, Noi Sian Koh, and Ling Liu, 2012. “Manipulation of online reviews: An analysis of ratings, readability, and sentiments,” Decision Support Systems, volume 52, number 3, pp. 674–684.
doi: http://dx.doi.org/10.1016/j.dss.2011.11.002, accessed 25 October 2016.
Yasas S.N. Jayaratne, Nina K. Anderson, and Roger A. Zwahlen, 2014. “Readability of websites containing information on dental implants,” Clinical Oral Implants Research, volume 25, number 12, pp. 1,319–1,324.
doi: http://dx.doi.org/10.1111/clr.12285, accessed 25 October 2016.
Thomas J. Johnson and Barbara K. Kaye, 2014. “Credibility of social network sites for political information among politically interested Internet users,” Journal of Computer–Mediated Communication, volume 19, number 4, pp. 957–974.
doi: http://dx.doi.org/10.1111/jcc4.12084, accessed 25 October 2016.
Michael John Jones, 1996. “Readability of annual reports: Western versus Asian evidence — A comment to contextualize,” Accounting, Auditing & Accountability Journal, volume 9, number 2, pp. 86–91.
doi: http://dx.doi.org/10.1108/09513579610116376, accessed 25 October 2016.
J. Peter Kincaid, Robert P. Fishburne Jr., Richard L. Rogers, and Brad S. Chissom, 1975. “Derivation of new readability formulas (automated readability index, fog count and Flesch reading ease formula) for Navy enlisted personnel,” Research Branch Report, 8–75. Millington, Tenn.: Naval Technical Training Command; version at http://www.dtic.mil/dtic/tr/fulltext/u2/a006655.pdf, accessed 25 October 2016.
Silvia Knobloch–Westerwick and Benjamin K. Johnson, 2014. “Selective exposure for better or worse: Its mediating role for online news’ impact on political participation,” Journal of Computer–Mediated Communication, volume 19, number 2, pp. 184–196.
doi: http://dx.doi.org/10.1111/jcc4.12036, accessed 25 October 2016.
Franklin B. Krohn, 2004. “A generational approach to using emoticons as nonverbal communication,” Journal of Technical Writing and Communication, volume 34, number 4, pp. 321–328.
doi: http://dx.doi.org/10.2190/9EQH-DE81-CWG1-QLL9, accessed 25 October 2016.
Carolyn A. Lin, Patricia J. Neafsey, and Zoe Strickler, 2009. “Usability testing by older adults of a computer-mediated health communication program,” Journal of Health Communication, volume 14, number 2, pp. 102–118.
doi: http://dx.doi.org/10.1080/10810730802659095, accessed 25 October 2016.
Shao-Kang Lo, 2008. “The nonverbal communication functions of emoticons in computer-mediated communication,” CyberPsychology & Behavior, volume 11, number 5, pp. 595–597.
doi: http://dx.doi.org/10.1089/cpb.2007.0132, accessed 25 October 2016.
Campbell R. McConnell, 1982. “Readability formulas as applied to college economics textbooks,” Journal of Reading, volume 26, number 1, pp. 14–17.
Richard A. Mills, 2015. “Reddit.com — A census of subreddits,” WebSci ’15: Proceedings of the ACM Web Science Conference, article number 49.
doi: http://dx.doi.org/10.1145/2786451.2786491, accessed 25 October 2016.
Daniel Moyer, Samuel L. Carson, Thayne Keegan Dye, Richard T. Carson, and David Goldbaum, 2015. “Determining the influence of Reddit posts on Wikipedia pageviews,” Proceedings of the Ninth International AAAI Conference on Web and Social Media, at https://www.aaai.org/ocs/index.php/ICWSM/ICWSM15/paper/view/10655, accessed 25 October 2016.
Randal S. Olson, 2013. “redditviz — reddit interest network,” at http://rhiever.github.io/redditviz/clustered/, accessed 25 October 2016.
Randal S. Olson and Zachary P. Neal, 2015. “Navigating the massive world of Reddit: Using backbone networks to map user interests in social media,” PeerJ Computer Science, volume 1, article e4, at https://peerj.com/articles/cs-4/, accessed 25 October 2016.
doi: https://doi.org/10.7717/peerj-cs.4, accessed 25 October 2016.
Jonathon Read, 2005. “Using emoticons to reduce dependency in machine learning techniques for sentiment classification,” ACLstudent ’05: Proceedings of the ACL Student Research Workshop, pp. 43–48.
Reddit, 2016. “Reddit content policy,” at https://www.reddit.com/help/contentpolicy/, accessed 30 May 2016.
Annika Richterich, 2014. “‘Karma, precious karma!’ Karmawhoring on Reddit and the Front Page’s econometrisation,” Journal of Peer Production, number 4, at http://peerproduction.net/issues/issue-4-value-and-currency/peer-reviewed-articles/karma-precious-karma/, accessed 25 October 2016.
John C. Roberts, Robert H. Fletcher, and Suzanne W. Fletcher, 1994. “Effects of peer review and editing on the readability of articles published in Annals of Internal Medicine,” Journal of the American Medical Association, volume 272, number 2, pp. 119–121.
Alfred P. Rovai, 2002. “Development of an instrument to measure classroom community,” Internet and Higher Education, volume 5, number 3, pp. 197–211.
doi: http://dx.doi.org/10.1016/S1096-7516(02)00102-1, accessed 25 October 2016.
Philip Sallis and Diana Kassabova, 2000. “Computer-mediated communication: Experiments with e-mail readability,” Information Sciences, volume 123, numbers 1–2, pp. 43–53.
doi: http://dx.doi.org/10.1016/S0020-0255(99)00109-7, accessed 25 October 2016.
Karianne Skovholt, Anette Grønning, and Anne Kankaanranta, 2014. “The communicative functions of emoticons in workplace e–mails::–),” Journal of Computer–Mediated Communication, volume 19, number 4, pp. 780–797.
doi: http://dx.doi.org/10.1111/jcc4.12063, accessed 25 October 2016.
Malcolm Smith and Richard Taffler, 1992. “The chairman’s statement and corporate financial performance,” Accounting & Finance, volume 32, number 2, pp. 75–90.
doi: http://dx.doi.org/10.1111/j.1467-629X.1992.tb00187.x, accessed 25 October 2016.
Karen Sullivan and Frances O’Conor, 2001. “A readability analysis of Australian stroke information,” Topics in Stroke Rehabilitation, volume 7, number 4, pp. 52–60.
doi: http://dx.doi.org/10.1310/6UM4-6J7J-YTLC-K976, accessed 25 October 2016.
Mike Thelwall, Kevan Buckley, and Georgios Paltoglou, 2012. “Sentiment strength detection for the social Web,” Journal of the American Society for Information Science and Technology, volume 63, number 1, pp. 163–173.
doi: http://dx.doi.org/10.1002/asi.21662, accessed 25 October 2016.
Mike Thelwall, Kevan Buckley, Georgios Paltoglou, Di Cai, and Arvid Kappas, 2010. “Sentiment strength detection in short informal text,” Journal of the American Society for Information Science and Technology, volume 61, number 12, pp. 2,544–2,558.
doi: http://dx.doi.org/10.1002/asi.21416, accessed 25 October 2016.
Dominic Thompson and Ruth Filik, 2016. “Sarcasm in written communication: Emoticons are efficient markers of intention,” Journal of Computer–Mediated Communication, volume 21, number 2, pp. 105–120.
doi: http://dx.doi.org/10.1002/asi.21416, accessed 25 October 2016.
“Top subreddits,” 2016. At http://redditmetrics.com/top, accessed February 2016.
Jason Turcotte, Chance York, Jacob Irving, Rosanne M. Scholl, and Raymond J. Pingree, 2015. “News recommendations from social media opinion leaders: Effects on media trust and information seeking,” Journal of Computer–Mediated Communication, volume 20, number 5, pp. 520–535.
doi: http://dx.doi.org/10.1111/jcc4.12127, accessed 25 October 2016.
Joseph B. Walther, 2007. “Selective self-presentation in computer-mediated communication: Hyperpersonal dimensions of technology, language, and cognition,” Computers in Human Behavior, volume 23, number 5, pp. 2,538–2,557.
doi: http://dx.doi.org/10.1016/j.chb.2006.05.002, accessed 25 October 2016.
Joseph B. Walther and Kyle P. D’Addario, 2001. “The impacts of emoticons on message interpretation in computer-mediated communication,” Social Science Computer Review, volume 19, number 3, pp. 324–347.
doi: http://dx.doi.org/10.1177/089443930101900307, accessed 25 October 2016.
William B. Weeks and Amy E. Wallace, 2002. “Readability of British and American medical prose at the start of the 21st century,” British Medical Journal, volume 325, number 7378, pp. 1,451–1,452.
doi: http://dx.doi.org/10.1136/bmj.325.7378.1451, accessed 25 October 2016.
James A. Wells, 1994. “Readability of HIV/AIDS educational materials: The role of the medium of communication, target audience, and producer characteristics,” Patient Education and Counseling, volume 24, number 3, pp. 249–259.
doi: http://dx.doi.org/10.1016/0738-3991(94)90068-X, accessed 25 October 2016.
Pamela Whitten, Sandi Smith, Samantha Munday, and Carolyn LaPlante, 2008. “Communication assessment of the most frequented breast cancer websites: Evaluation of design and theoretical criteria,” Journal of Computer–Mediated Communication, volume 13, number 4, pp. 880–911.
doi: http://dx.doi.org/10.1111/j.1083-6101.2008.00423.x, accessed 25 October 2016.
Appendix
Table A: Subreddit classifications and justifications.
Category | Description | Subreddits
Business/Finance | Subreddits concerned with finances or business practices | business; Economics; personalfinance; shutupandtakemymoney
Drugs | Subreddits dedicated to alcohol, drugs, etc. | Drugs; trees; woahdude
Gaming | Subreddits about specific games or the gaming experience in general | DestinyTheGame; DotA2; Fallout; GameDeals; Games; gaming; GlobalOffensive; hearthstone; leagueoflegends; Minecraft; pokemon; PS4; skyrim; smashbros; Steam; wow
General information | Subreddits containing general information that does not fit into any other category | BuyItForLife; DIY; everymanshouldknow; freebies; GetMotivated; LearnUselessTalents; lifehacks; LifeProTips; malefashionadvice; travel; TwoXChromosomes; YouShouldKnow
GIFs | Subreddits containing links to silent, animated images | aww; creepy; gifs; holdmybeer; reactiongifs; Unexpected; Whatcouldgowrong; wheredidthesodago
Health/Food | Subreddits about exercise, food, etc. | bodyweightfitness; Cooking; EatCheapAndHealthy; Fitness; food; loseit; MakeupAddiction; SkincareAddiction
Humor | Subreddits containing jokes, humor, memes, etc. | 4chan; AdviceAnimals; AnimalsBeingJerks; BlackPeopleTwitter; cats; cringepics; dadjokes; facepalm; funny; humor; ImGoingToHellForThis; Jokes
Images | Subreddits containing images and other static visual arts (includes discussions of such as well); Photographs are in a separate category | Art; comics; CrappyDesign; dataisbeautiful; fffffffuuuuuuuuuuuu; meirl; oddlysatisfying; OldSchoolCool; pics; QuotesPorn; tattoos; TumblrInAction; wallpapers; youdontsurf
Meta | Subreddits that refer to the Reddit experience | announcements; bestof; blog; circlejerk; DepthHub; reddit.com; SubredditDrama; TrueReddit
Movies/Television | Subreddits about movies and television | anime; breakingbad; doctorwho; Documentaries; fullmoviesonyoutube; gameofthrones; movies; NetflixBestOf; rickandmorty; scifi; StarWars; television; thewalkingdead
Music | Subreddits about music | hiphopheads; listentothis; Music
News | Subreddits focused on contemporary events | news; nottheonion; offbeat; UpliftingNews; worldnews
Non-subreddit | Special category for links harvested from the front page and from r/all | all; frontPage
Philosophy/religion | Subreddits about religious or philosophical topics | atheism; Futurology; philosophy
Photography | Subreddits containing links to photographs (or discussion of such); this category takes precedence over “Images” | AbandonedPorn; EarthPorn; FoodPorn; HistoryPorn; MapPorn; photography; photoshopbattles; RoomPorn
Politics/history | Subreddits about contemporary political topics or historical topics | conservative; conspiracy; firstworldanarchists; guns; history; liberal; politics
Porn | Subreddits containing NSFW content (note that not all subreddits with the word “Porn” in their names are actually pornography in the traditional sense) | Amateur; ass; Boobies; BustyPetite; cumsluts; FiftyFifty; gentlemanboners; gonewild; holdthemoan; milf; nsfw; nsfwgifs; RealGirls
Questions | Subreddits wherein participants can ask questions about various topics | AskHistorians; AskMen; AskReddit; askscience; AskWomen; changemyview; DoesAnybodyElse; explainlikeimfive; iama; OutOfTheLoop; shittyaskscience
Random/assorted | Subreddits with content that does not fit into any other category | geek; interestingasfuck; InternetIsBeautiful; mildlyinfuriating; mildlyinteresting; MorbidReality; nonononoyes; WTF
Reading and Writing | Subreddits about specific books, writing prompts, or reading/writing in general | asoiaf; books; harrypotter; nosleep; WritingPrompts
Regions | Subreddits dedicated to specific locations/countries | canada; europe
Relationships | Subreddits about interpersonal relationships | relationships; seduction; sex
Science | Subreddits dedicated to scientific topics | psychology; scholar; science; space; spaceporn
Sports | Subreddits about sports | hockey; nba; nfl; soccer; sports; worldcup
Stories | Subreddits wherein users share textual stories (often original thoughts or of the "this happened to me" variety) | cringe; Frugal; JusticePorn; offmychest; PerfectTiming; Showerthoughts; TalesFromRetail; talesfromtechsupport; thatHappened; tifu; todayilearned
Technology | Subreddits about specific technologies (phones, etc.) or technology in general (e.g., programming) | AlienBlue; Android; apple; baconreader; buildapc; gadgets; learnprogramming; linux; pcmasterrace; programming; technology
Videos | Subreddits containing links to videos | UnexpectedThugLife; videos; youtubehaiku
Table B: Readability scores by subreddit.
Subreddit | Subreddit classification | Flesch | Gunning fog | SMOG | Mean
4chan | Humor | 5.347894 | 5.529376 | 4.149865 | 5.009045
AbandonedPorn | Photography | 5.670579 | 6.636534 | 5.001848 | 5.769654
AdviceAnimals | Humor | 5.857784 | 6.974808 | 5.047536 | 5.960043
AlienBlue | Technology | 5.387022 | 6.405773 | 4.757142 | 5.516646
all | Non-subreddit | 5.722419 | 6.645442 | 4.842902 | 5.736921
Amateur | Porn | 4.812753 | 4.706766 | 3.689386 | 4.402969
Android | Technology | 6.283602 | 7.597016 | 5.61092 | 6.497179
AnimalsBeingJerks | Humor | 5.141306 | 5.947051 | 4.267147 | 5.118501
anime | Movies/Television | 5.91522 | 6.647706 | 5.16336 | 5.908762
announcements | Meta | 6.786669 | 8.365927 | 6.161095 | 7.104564
apple | Technology | 6.211603 | 7.499235 | 5.460794 | 6.390544
Art | Images | 5.644666 | 6.70974 | 4.954172 | 5.769526
AskHistorians | Questions | 8.433575 | 7.982846 | 6.357339 | 7.591253
AskMen | Questions | 6.080702 | 7.548459 | 5.425199 | 6.351453
AskReddit | Questions | 5.876155 | 6.829691 | 5.02241 | 5.909419
askscience | Questions | 8.290544 | 9.887492 | 7.136616 | 8.438217
AskWomen | Questions | 6.146896 | 7.603436 | 5.439704 | 6.396678
asoiaf | Reading and Writing | 6.278979 | 7.332256 | 5.407599 | 6.339611
ass | Porn | 4.669864 | 4.801753 | 3.667244 | 4.37962
atheism | Philosophy/religion | 6.742932 | 8.099285 | 6.009224 | 6.95048
aww | GIFs | 5.067664 | 5.683805 | 4.118282 | 4.956584
baconreader | Technology | 5.335055 | 6.324689 | 4.657829 | 5.439191
bestof | Meta | 6.818642 | 8.589356 | 6.209983 | 7.205994
BlackPeopleTwitter | Humor | 5.352074 | 5.625047 | 4.289933 | 5.089018
blog | Meta | 6.111717 | 7.138953 | 5.35143 | 6.2007
bodyweightfitness | Health/Food | 5.649285 | 7.13966 | 5.226887 | 6.005277
Boobies | Porn | 4.865284 | 4.814753 | 3.762431 | 4.480823
books | Reading and Writing | 6.389078 | 7.832435 | 5.800451 | 6.673988
breakingbad | Movies/Television | 5.533605 | 6.279049 | 4.676055 | 5.496236
buildapc | Technology | 5.934614 | 7.44487 | 5.355524 | 6.245003
business | Business/Finance | 6.951565 | 8.654455 | 6.416985 | 7.341002
BustyPetite | Porn | 4.818153 | 4.872508 | 3.850253 | 4.513638
BuyItForLife | General information | 5.76634 | 7.301313 | 5.26382 | 6.110491
canada | Regions | 6.841567 | 8.127388 | 6.215863 | 7.061606
cats | Humor | 4.931322 | 5.808197 | 4.154254 | 4.964591
changemyview | Questions | 8.289065 | 10.32271 | 7.520446 | 8.710742
circlejerk | Meta | 5.265051 | 14.2477 | 4.004863 | 7.839206
comics | Images | 5.698967 | 6.487716 | 4.806636 | 5.66444
conservative | Politics/history | 7.169054 | 8.607837 | 6.556951 | 7.444614
conspiracy | Politics/history | 6.637352 | 8.038079 | 5.954481 | 6.876637
Cooking | Health/Food | 5.590957 | 7.075309 | 5.079964 | 5.91541
CrappyDesign | Images | 5.71661 | 6.664863 | 4.78828 | 5.723251
creepy | GIFs | 5.450203 | 6.107264 | 4.518685 | 5.358718
cringepics | Humor | 5.466905 | 6.139775 | 4.507337 | 5.371339
cringe | Stories | 5.667174 | 6.529634 | 4.81221 | 5.669672
cumsluts | Porn | 4.963078 | 5.033149 | 3.894199 | 4.630142
dadjokes | Humor | 4.869485 | 5.165841 | 3.920746 | 4.652024
dataisbeautiful | Images | 6.665312 | 8.070513 | 6.006199 | 6.914008
DepthHub | Meta | 8.063707 | 9.465282 | 7.059377 | 8.196122
DestinyTheGame | Gaming | 5.601208 | 6.829063 | 4.987183 | 5.805818
DIY | General information | 5.410893 | 6.809351 | 4.845638 | 5.688627
doctorwho | Movies/Television | 5.473853 | 6.351778 | 4.716667 | 5.514099
Documentaries | Movies/Television | 6.933304 | 8.375321 | 6.216662 | 7.175096
DoesAnybodyElse | Questions | 5.886123 | 7.146253 | 5.102527 | 6.044968
DotA2 | Gaming | 5.890093 | 6.783667 | 4.915665 | 5.863142
Drugs | Drugs | 6.38057 | 7.746683 | 5.672008 | 6.599754
EarthPorn | Photography | 5.515065 | 6.427615 | 4.895985 | 5.612888
EatCheapAndHealthy | Health/Food | 5.627473 | 7.212713 | 5.233446 | 6.024544
Economics | Business/Finance | 7.923993 | 10.01633 | 7.283554 | 8.407959
europe | Regions | 6.816911 | 7.820154 | 6.00226 | 6.879775
everymanshouldknow | General information | 5.995496 | 7.385413 | 5.276011 | 6.218973
explainlikeimfive | Questions | 7.085486 | 8.648025 | 6.269556 | 7.334356
facepalm | Humor | 5.865661 | 6.725538 | 4.967516 | 5.852905
Fallout | Gaming | 5.754687 | 6.621546 | 4.899157 | 5.758463
fffffffuuuuuuuuuuuu | Images | 5.29869 | 6.167189 | 4.467057 | 5.310979
FiftyFifty | Porn | 5.209601 | 5.5845 | 4.141094 | 4.978398
firstworldanarchists | Politics/history | 5.62596 | 6.267572 | 4.715446 | 5.536326
Fitness | Health/Food | 5.636469 | 7.009356 | 5.052348 | 5.899391
FoodPorn | Photography | 5.258518 | 6.186185 | 4.63351 | 5.359404
food | Health/Food | 5.449894 | 6.570875 | 4.803423 | 5.608064
freebies | General information | 5.241357 | 5.980361 | 4.568176 | 5.263298
frontPage | Non-subreddit | 5.841877 | 6.761125 | 4.983555 | 5.862185
Frugal | Stories | 5.969371 | 7.378925 | 5.315077 | 6.221124
fullmoviesonyoutube | Movies/Television | 5.625362 | 6.284834 | 4.851697 | 5.587298
funny | Humor | 5.399688 | 6.058858 | 4.481948 | 5.313498
Futurology | Philosophy/religion | 7.369161 | 9.074459 | 6.562093 | 7.668571
gadgets | Technology | 6.354897 | 7.952356 | 5.722246 | 6.6765
GameDeals | Gaming | 5.699775 | 6.940361 | 5.190957 | 5.943698
gameofthrones | Movies/Television | 5.850069 | 6.471904 | 4.853028 | 5.725
Games | Gaming | 7.14884 | 8.824199 | 6.439507 | 7.470849
gaming | Gaming | 5.653163 | 6.602734 | 4.850906 | 5.702268
geek | Random/assorted | 5.852735 | 6.946881 | 5.094286 | 5.964634
gentlemanboners | Porn | 5.199246 | 5.568046 | 4.329885 | 5.032392
GetMotivated | General information | 5.862264 | 6.676124 | 4.89901 | 5.812466
gifs | GIFs | 5.410963 | 6.102726 | 4.43477 | 5.316153
GlobalOffensive | Gaming | 5.595012 | 6.347391 | 4.512837 | 5.48508
gonewild | Porn | 4.605298 | 5.014394 | 3.710148 | 4.44328
guns | Politics/history | 5.691455 | 6.666327 | 4.883211 | 5.746998
harrypotter | Reading and Writing | 6.045412 | 6.840469 | 5.221772 | 6.035885
hearthstone | Gaming | 6.197961 | 7.334673 | 5.372336 | 6.301657
hiphopheads | Music | 5.540423 | 6.245992 | 4.637696 | 5.474704
HistoryPorn | Photography | 6.698803 | 7.642645 | 5.7807 | 6.707383
history | Politics/history | 7.305193 | 8.219164 | 6.274859 | 7.266405
hockey | Sports | 5.268 | 5.919472 | 4.505735 | 5.231069
holdmybeer | GIFs | 5.20858 | 5.959939 | 4.300411 | 5.15631
holdthemoan | Porn | 5.102365 | 5.215768 | 3.986216 | 4.768117
humor | Humor | 6.363258 | 7.400205 | 5.602861 | 6.455441
iama | Questions | 6.206097 | 7.407847 | 5.54081 | 6.384918
ImGoingToHellForThis | Humor | 5.740358 | 5.916921 | 4.537413 | 5.39823
interestingasfuck | Random/assorted | 5.760709 | 6.721271 | 4.852475 | 5.778151
InternetIsBeautiful | Random/assorted | 5.683153 | 6.723698 | 4.982828 | 5.796559
Jokes | Humor | 5.415379 | 5.731282 | 4.3448 | 5.16382
JusticePorn | Stories | 5.790889 | 6.87647 | 4.957207 | 5.874855
leagueoflegends | Gaming | 5.901406 | 6.830236 | 4.908441 | 5.880028
learnprogramming | Technology | 6.406433 | 7.9075 | 5.854799 | 6.722911
LearnUselessTalents | General information | 5.572609 | 6.45065 | 4.695851 | 5.573037
liberal | Politics/history | 7.34185 | 8.977803 | 6.793625 | 7.704426
lifehacks | General information | 5.514462 | 6.562441 | 4.686507 | 5.587803
LifeProTips | General information | 5.904225 | 7.189206 | 5.141292 | 6.078241
linux | Technology | 6.760833 | 8.167975 | 6.033963 | 6.98759
listentothis | Music | 5.536638 | 6.819454 | 5.190042 | 5.848712
loseit | Health/Food | 5.172728 | 6.733055 | 4.847562 | 5.584448
MakeupAddiction | Health/Food | 5.394541 | 6.770124 | 4.969658 | 5.711441
malefashionadvice | General information | 5.529903 | 6.655028 | 4.869444 | 5.684792
MapPorn | Photography | 6.727538 | 7.306101 | 5.917751 | 6.650463
meirl | Images | 5.078466 | 5.113193 | 3.850227 | 4.680629
mildlyinfuriating | Random/assorted | 5.989153 | 7.108163 | 5.127624 | 6.07498
mildlyinteresting | Random/assorted | 5.475107 | 6.31609 | 4.580774 | 5.457324
milf | Porn | 4.484264 | 4.420356 | 3.49652 | 4.133713
Minecraft | Gaming | 5.648564 | 6.67722 | 4.954316 | 5.760034
MorbidReality | Random/assorted | 6.526668 | 7.588044 | 5.620855 | 6.578522
movies | Movies/Television | 6.111348 | 7.067411 | 5.334198 | 6.170986
Music | Music | 5.8597 | 6.963228 | 5.218397 | 6.013775
nba | Sports | 5.390673 | 6.131177 | 4.515989 | 5.345946
NetflixBestOf | Movies/Television | 5.807377 | 6.874135 | 5.168032 | 5.949848
news | News | 6.889155 | 8.60882 | 6.281335 | 7.25977
nfl | Sports | 5.525155 | 6.408474 | 4.724814 | 5.552814
nonononoyes | Random/assorted | 5.458937 | 6.418988 | 4.557342 | 5.478422
nosleep | Reading and Writing | 5.53151 | 6.234105 | 4.569758 | 5.445124
nottheonion | News | 6.310675 | 7.379756 | 5.521765 | 6.404065
nsfwgifs | Porn | 5.133341 | 5.39258 | 4.160841 | 4.895587
nsfw | Porn | 4.939942 | 5.153091 | 3.993413 | 4.695482
oddlysatisfying | Images | 5.418206 | 6.28681 | 4.526729 | 5.410581
offbeat | News | 6.568421 | 8.021845 | 5.915214 | 6.83516
offmychest | Stories | 5.666299 | 6.977753 | 5.073608 | 5.905887
OldSchoolCool | Images | 5.326501 | 5.820321 | 4.442444 | 5.196422
OutOfTheLoop | Questions | 6.580314 | 7.612843 | 5.656727 | 6.616628
pcmasterrace | Technology | 5.79224 | 6.747137 | 4.911844 | 5.817074
PerfectTiming | Stories | 5.243094 | 5.735839 | 4.287357 | 5.088763
personalfinance | Business/Finance | 6.64761 | 8.376616 | 6.008425 | 7.010884
philosophy | Philosophy/religion | 8.559984 | 10.48852 | 7.630801 | 8.8931
photography | Photography | 6.613143 | 8.383648 | 6.088992 | 7.028594
photoshopbattles | Photography | 4.843465 | 4.319794 | 3.58646 | 4.249906
pics | Images | 5.602581 | 6.391556 | 4.771202 | 5.588446
pokemon | Gaming | 5.844901 | 6.561064 | 5.106428 | 5.837464
politics | Politics/history | 7.219129 | 8.888562 | 6.634515 | 7.580735
programming | Technology | 6.993152 | 8.694611 | 6.372196 | 7.35332
PS4 | Gaming | 5.79333 | 7.061011 | 5.197124 | 6.017155
psychology | Science | 7.986655 | 9.645382 | 7.06937 | 8.233802
QuotesPorn | Images | 6.647033 | 7.897303 | 5.839256 | 6.794531
reactiongifs | GIFs | 5.421152 | 6.084136 | 4.508374 | 5.337887
RealGirls | Porn | 4.901627 | 4.895056 | 3.859492 | 4.552058
reddit.com | Meta | 6.07344 | 7.048564 | 5.255881 | 6.125962
relationships | Relationships | 6.023577 | 7.917952 | 5.585533 | 6.509021
rickandmorty | Movies/Television | 5.448059 | 6.146563 | 4.599745 | 5.398122
RoomPorn | Photography | 5.431156 | 6.507354 | 4.75393 | 5.564146
scholar | Science | 6.308917 | 7.357211 | 5.502449 | 6.389526
science | Science | 7.997577 | 8.778597 | 6.517832 | 7.764668
scifi | Movies/Television | 6.417296 | 7.603456 | 5.721776 | 6.580843
seduction | Relationships | 5.918029 | 7.159833 | 5.300075 | 6.125979
sex | Relationships | 6.15389 | 7.598063 | 5.505047 | 6.419
shittyaskscience | Questions | 5.729646 | 6.486256 | 4.75737 | 5.657757
Showerthoughts | Stories | 5.890653 | 6.844103 | 4.977926 | 5.904227
shutupandtakemymoney | Business/Finance | 5.65556 | 6.871307 | 4.986468 | 5.837778
SkincareAddiction | Health/Food | 5.91611 | 7.219664 | 5.315452 | 6.150409
skyrim | Gaming | 5.702826 | 6.601542 | 4.888569 | 5.730979
smashbros | Gaming | 5.676741 | 6.17487 | 4.752306 | 5.534639
soccer | Sports | 5.876072 | 6.477856 | 5.004039 | 5.785989
spaceporn | Science | 6.019494 | 7.086704 | 5.24835 | 6.11818
space | Science | 6.675357 | 8.336974 | 6.053386 | 7.021906
sports | Sports | 5.539677 | 6.404378 | 4.689062 | 5.544372
StarWars | Movies/Television | 5.832684 | 6.667757 | 4.981267 | 5.827236
Steam | Gaming | 6.247909 | 7.486559 | 5.432546 | 6.389005
SubredditDrama | Meta | 6.604216 | 7.912682 | 5.802684 | 6.773194
TalesFromRetail Stories 5.888246 7.220672 5.140921 6.08328 talesfromtechsupport Stories 6.129757 7.481391 5.348304 6.319817 tattoos Images 5.294619 6.289688 4.610483 5.398264 technology Technology 6.971372 8.651027 6.329837 7.317412 television Movies/Television 6.182763 7.588218 5.595737 6.455573 thatHappened Stories 5.730338 6.551266 4.730774 5.670793 thewalkingdead Movies/Television 5.667616 6.673219 4.830502 5.723779 tifu Stories 5.43839 6.354455 4.613559 5.468802 todayilearned Stories 6.37201 7.455654 5.495824 6.441163 travel General information 5.853839 7.045461 5.353245 6.084181 trees Drugs 5.883433 6.858741 5.025238 5.922471 TrueReddit Meta 7.71234 9.607225 7.092208 8.137258 TumblrInAction Images 6.494007 7.585155 5.602518 6.56056 TwoXChromosomes General information 7.008689 8.744964 6.360795 7.371483 UnexpectedThugLife Videos 5.441538 6.095202 4.552209 5.362983 Unexpected GIFs 5.300694 5.881228 4.341776 5.174566 UpliftingNews News 6.347076 7.398816 5.486144 6.410678 videos Videos 5.847053 7.022321 5.107927 5.992434 wallpapers Images 5.613932 6.520707 4.918605 5.684415 Whatcouldgowrong GIFs 5.54213 6.534984 4.652271 5.576461 wheredidthesodago GIFs 5.399545 6.13928 4.46587 5.334898 woahdude Drugs 5.666458 6.477783 4.731979 5.625407 worldcup Sports 5.974366 6.661892 5.245006 5.960421 worldnews News 7.069084 8.345221 6.25595 7.223419 wow Gaming 5.903424 6.917147 5.087131 5.969234 WritingPrompts Reading and Writing 5.558031 6.441681 4.84308 5.614264 WTF Random/assorted 5.512701 6.39577 4.668361 5.525611 youdontsurf Images 5.143515 5.41192 4.039822 4.865086 YouShouldKnow General information 6.354004 7.696667 5.635594 6.562088 youtubehaiku Videos 5.254319 5.433304 4.189676 4.9591
Table C: Emoticon scores and percentages by subreddit.
Subreddit | Category | Emoticon score | Emoticon percentage
4chan | Humor | -0.03556 | 1.62
AbandonedPorn | Photography | 0.42029 | 3.11
AdviceAnimals | Humor | 0.231373 | 1.95
AlienBlue | Technology | 0.090029 | 5.68
all | Non-subreddit | 0.169916 | 2.21
Amateur | Porn | 0.352941 | 1.69
Android | Technology | 0.227858 | 2.98
AnimalsBeingJerks | Humor | 0.119565 | 2.39
anime | Movies/Television | 0.4 | 4.38
announcements | Meta | 0.225836 | 4.89
apple | Technology | 0.295626 | 2.57
Art | Images | 0.555156 | 3.97
AskHistorians | Questions | 0.390887 | 3.77
AskMen | Questions | 0.364555 | 3.07
AskReddit | Questions | 0.171035 | 2.2
askscience | Questions | 0.393211 | 3.04
AskWomen | Questions | 0.412465 | 5.24
asoiaf | Reading and Writing | 0.17875 | 2.26
ass | Porn | 0.494253 | 1.99
atheism | Philosophy/religion | 0.40176 | 1.89
aww | GIFs | 0.39789 | 3.85
baconreader | Technology | 0.372781 | 3.74
bestof | Meta | 0.255932 | 2.07
BlackPeopleTwitter | Humor | 0.102151 | 1.35
blog | Meta | 0.334435 | 4.63
bodyweightfitness | Health/Food | 0.64 | 7.76
Boobies | Porn | 0.315789 | 2.5
books | Reading and Writing | 0.365922 | 3.06
breakingbad | Movies/Television | 0.020882 | 1.95
buildapc | Technology | 0.395214 | 4.67
business | Business/Finance | 0.365789 | 2.11
BustyPetite | Porn | 0.319048 | 2.71
BuyItForLife | General information | 0.244792 | 3.09
canada | Regions | 0.40709 | 2.39
cats | Humor | 0.281459 | 7.34
changemyview | Questions | 0.417943 | 2.11
circlejerk | Meta | 0.43012 | 4.73
comics | Images | 0.387622 | 3.06
conservative | Politics/history | 0.296552 | 1.26
conspiracy | Politics/history | 0.1566 | 1.68
Cooking | Health/Food | 0.555556 | 4.86
CrappyDesign | Images | 0.204878 | 2.03
creepy | GIFs | 0.089286 | 2.13
cringe | Stories | 0.0625 | 1.51
cringepics | Humor | 0.158765 | 2.39
cumsluts | Porn | 0.502075 | 6.35
dadjokes | Humor | 0.05802 | 2.71
dataisbeautiful | Images | 0.351136 | 2.66
DepthHub | Meta | 0.296089 | 2.37
DestinyTheGame | Gaming | 0.402254 | 4.37
DIY | General information | 0.566154 | 4.2
doctorwho | Movies/Television | 0.514325 | 4.99
Documentaries | Movies/Television | 0.36214 | 2.41
DoesAnybodyElse | Questions | 0.29806 | 2.96
DotA2 | Gaming | 0.314756 | 3.72
Drugs | Drugs | 0.533095 | 3.11
EarthPorn | Photography | 0.586441 | 4.27
EatCheapAndHealthy | Health/Food | 0.58361 | 6.99
Economics | Business/Finance | 0.328076 | 1.82
europe | Regions | 0.492298 | 4.61
everymanshouldknow | General information | 0.438066 | 2.26
explainlikeimfive | Questions | 0.373277 | 2.55
facepalm | Humor | 0.128857 | 2.01
Fallout | Gaming | 0.34101 | 3.44
fffffffuuuuuuuuuuuu | Images | 0.316354 | 3.29
FiftyFifty | Porn | 0.032787 | 2.07
firstworldanarchists | Politics/history | 0.258258 | 2.21
Fitness | Health/Food | 0.562543 | 4.28
food | Health/Food | 0.425882 | 3.61
FoodPorn | Photography | 0.609756 | 5.63
freebies | General information | 0.166322 | 5.62
frontPage | Non-subreddit | 0.210423 | 2.54
Frugal | Stories | 0.327713 | 3.28
fullmoviesonyoutube | Movies/Television | 0.17506 | 5.94
funny | Humor | 0.217659 | 2.12
Futurology | Philosophy/religion | 0.240117 | 2.1
gadgets | Technology | 0.393782 | 2.35
GameDeals | Gaming | 0.244478 | 6.8
gameofthrones | Movies/Television | 0.047091 | 2.37
Games | Gaming | 0.222561 | 2.5
gaming | Gaming | 0.145796 | 2.67
geek | Random/assorted | 0.155689 | 3.11
gentlemanboners | Porn | 0.374302 | 2.08
GetMotivated | General information | 0.51469 | 3.46
gifs | GIFs | 0.21601 | 2.04
GlobalOffensive | Gaming | 0.499226 | 5.79
gonewild | Porn | 0.566962 | 9.32
guns | Politics/history | 0.29108 | 2.48
harrypotter | Reading and Writing | 0.375635 | 4.56
hearthstone | Gaming | 0.324439 | 4.21
hiphopheads | Music | -0.03824 | 1.17
history | Politics/history | 0.392116 | 2.3
HistoryPorn | Photography | 0.282561 | 2.35
hockey | Sports | 0.064423 | 2.46
holdmybeer | GIFs | 0.162679 | 1.32
holdthemoan | Porn | 0.48248 | 5.11
humor | Humor | 0.349138 | 2.3
iama | Questions | 0.480447 | 3.7
ImGoingToHellForThis | Humor | 0.230032 | 1.47
interestingasfuck | Random/assorted | 0.233951 | 2.26
InternetIsBeautiful | Random/assorted | 0.21832 | 4.08
Jokes | Humor | 0.318919 | 2.41
JusticePorn | Stories | 0.279661 | 1.63
leagueoflegends | Gaming | 0.248251 | 5.05
learnprogramming | Technology | 0.546119 | 8.18
LearnUselessTalents | General information | 0.351515 | 2.91
liberal | Politics/history | 0.12 | 1.11
lifehacks | General information | 0.165669 | 2.12
LifeProTips | General information | 0.302792 | 2.98
linux | Technology | 0.415144 | 4.53
listentothis | Music | 0.4125 | 6.24
loseit | Health/Food | 0.752573 | 10.43
MakeupAddiction | Health/Food | 0.612269 | 13.18
malefashionadvice | General information | 0.343523 | 2.59
MapPorn | Photography | 0.25741 | 2.3
meirl | Images | -0.0625 | 2.16
mildlyinfuriating | Random/assorted | 0.270751 | 1.93
mildlyinteresting | Random/assorted | 0.19411 | 2.17
milf | Porn | 0.460784 | 2.68
Minecraft | Gaming | 0.552025 | 6.37
MorbidReality | Random/assorted | -0.15195 | 2.51
movies | Movies/Television | 0.15771 | 1.92
Music | Music | 0.2 | 2.59
nba | Sports | -0.02391 | 1.75
NetflixBestOf | Movies/Television | 0.163873 | 2.71
news | News | 0.081882 | 1.35
nfl | Sports | -0.10667 | 1.94
nonononoyes | Random/assorted | 0.145604 | 1.87
nosleep | Reading and Writing | 0.334686 | 3.63
nottheonion | News | 0.18 | 1.64
nsfw | Porn | 0.30137 | 2.02
nsfwgifs | Porn | 0.176471 | 2.43
oddlysatisfying | Images | 0.280353 | 2.16
offbeat | News | 0.189112 | 1.58
offmychest | Stories | 0.5 | 5.71
OldSchoolCool | Images | 0.366509 | 2.22
OutOfTheLoop | Questions | 0.200867 | 2.43
pcmasterrace | Technology | 0.317538 | 4.37
PerfectTiming | Stories | 0.40404 | 2.23
personalfinance | Business/Finance | 0.38172 | 2.73
philosophy | Philosophy/religion | 0.463221 | 2.29
photography | Photography | 0.515742 | 3.88
photoshopbattles | Photography | 0.330299 | 2.87
pics | Images | 0.227318 | 2.31
pokemon | Gaming | 0.394246 | 4.25
politics | Politics/history | 0.212121 | 1.61
programming | Technology | 0.409038 | 4.47
PS4 | Gaming | 0.234332 | 3.26
psychology | Science | 0.44586 | 3.63
QuotesPorn | Images | 0.473282 | 1.98
reactiongifs | GIFs | 0.166016 | 2.12
RealGirls | Porn | 0.293839 | 2.08
reddit.com | Meta | 0.385914 | 3.64
relationships | Relationships | 0.567117 | 3.53
rickandmorty | Movies/Television | 0.310945 | 2.11
RoomPorn | Photography | 0.350318 | 3.01
scholar | Science | 0.146341 | 8.3
science | Science | 0.341584 | 2.14
scifi | Movies/Television | 0.306977 | 3.09
seduction | Relationships | 0.625 | 4.19
sex | Relationships | 0.550234 | 5.22
shittyaskscience | Questions | 0.029915 | 1.62
Showerthoughts | Stories | 0.140731 | 1.98
shutupandtakemymoney | Business/Finance | 0.327411 | 3.54
SkincareAddiction | Health/Food | 0.4961 | 9.27
skyrim | Gaming | 0.349862 | 3.2
smashbros | Gaming | 0.113153 | 3.35
soccer | Sports | 0.257143 | 2.12
space | Science | 0.465649 | 3.24
spaceporn | Science | 0.504386 | 4.01
sports | Sports | 0.102941 | 1.73
StarWars | Movies/Television | 0.291022 | 2.25
Steam | Gaming | 0.297595 | 3.64
SubredditDrama | Meta | 0.14653 | 2.11
TalesFromRetail | Stories | 0.527891 | 3.95
talesfromtechsupport | Stories | 0.48557 | 4.53
tattoos | Images | 0.697588 | 3.97
technology | Technology | 0.16756 | 2
television | Movies/Television | 0.063861 | 1.93
thatHappened | Stories | 0.164384 | 1.83
thewalkingdead | Movies/Television | 0.303538 | 2.26
tifu | Stories | 0.318342 | 2.78
todayilearned | Stories | 0.220317 | 1.89
travel | General information | 0.662597 | 5.34
trees | Drugs | 0.308314 | 2.75
TrueReddit | Meta | 0.378556 | 1.88
TumblrInAction | Images | 0.16885 | 1.89
TwoXChromosomes | General information | 0.528397 | 4.02
Unexpected | GIFs | 0.17833 | 2.16
UnexpectedThugLife | Videos | 0.261818 | 2.06
UpliftingNews | News | 0.343434 | 2.55
videos | Videos | 0.204659 | 2.59
wallpapers | Images | 0.443503 | 4
Whatcouldgowrong | GIFs | 0.192529 | 1.54
wheredidthesodago | GIFs | 0.23221 | 1.97
woahdude | Drugs | 0.310984 | 2.36
worldcup | Sports | 0.366366 | 3.86
worldnews | News | 0.200244 | 1.93
wow | Gaming | 0.382143 | 5.2
WritingPrompts | Reading and Writing | 0.654905 | 4.98
WTF | Random/assorted | 0.010929 | 1.71
youdontsurf | Images | 0.11465 | 1.9
YouShouldKnow | General information | 0.199377 | 2.76
youtubehaiku | Videos | 0.211055 | 2.49
Table D: Domains linked to by OPs at least 10 times across the sample.
Domain | Overall OP link count | Classification
abcnews.go.com | 25 | News
arstechnica.com | 35 | News
bbc.co.uk | 70 | News
bbc.com | 44 | News
bloomberg.com | 35 | News
businessinsider.com | 27 | News
cbc.ca | 38 | News
cnbc.com | 19 | News
cnn.com | 27 | News
dailymail.co.uk | 19 | News
en.wikipedia.org | 78 | User-generated content
forbes.com | 12 | News
gfycat.com | 624 | GIFs
giant.gfycat.com | 11 | GIFs
huffingtonpost.com | 44 | News
imgur.com | 11827 | Images
independent.co.uk | 51 | News
instagram.com | 18 | Images
latimes.com | 24 | News
media.giphy.com | 20 | GIFs
medium.com | 26 | User-generated content
nbcnews.com | 20 | News
news.yahoo.com | 17 | News
npr.org | 30 | News
nypost.com | 15 | News
nytimes.com | 96 | News
pbs.twimg.com | 20 | User-generated content
qz.com | 11 | News
reddit.com | 305 | User-generated content
reuters.com | 32 | News
slate.com | 11 | News
smh.com.au | 11 | News
streamable.com | 90 | Videos
telegraph.co.uk | 24 | News
theatlantic.com | 20 | News
thedailybeast.com | 14 | News
theguardian.com | 151 | News
theverge.com | 41 | News
time.com | 18 | News
tumblr.com | 28 | GIFs
twitter.com | 300 | User-generated content
upload.wikimedia.org | 30 | User-generated content
usatoday.com | 21 | News
vimeo.com | 25 | Videos
vox.com | 20 | News
washingtonpost.com | 75 | News
wired.com | 21 | News
wsj.com | 13 | News
youtube.com | 1858 | Videos
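As a rough illustration of how a tally like the one in Table D could be produced, the following Python sketch counts the host of each OP link and keeps only hosts seen at least 10 times. It is not the author's code: the function name `tally_domains`, the sample URLs, and the decision to drop a leading "www." while keeping other subdomains (which would keep giant.gfycat.com distinct from gfycat.com, as in the table) are all assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlparse


def tally_domains(urls, min_count=10):
    """Count link hosts and keep those seen at least min_count times.

    Hypothetical helper; the threshold of 10 mirrors the Table D caption.
    """
    counts = Counter()
    for url in urls:
        # hostname is lowercased by urlparse and is None for malformed input
        host = urlparse(url).hostname or ""
        # Assumption: only a leading "www." is stripped; other subdomains stay
        if host.startswith("www."):
            host = host[4:]
        if host:
            counts[host] += 1
    return {host: n for host, n in counts.items() if n >= min_count}


# Hypothetical sample data, not taken from the paper
sample_urls = (
    ["http://imgur.com/abc"] * 12
    + ["https://www.youtube.com/watch?v=x"] * 10
    + ["http://example.org/page"] * 3
)
result = tally_domains(sample_urls)
```

Classification of each surviving domain (News, Images, and so on) is a separate manual step that this sketch does not attempt.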
Editorial history
Received 30 September 2016; revised 20 October 2016; accepted 28 October 2016.
This paper is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
How does the front page of the Internet behave? Readability, emoticon use, and links on Reddit
by Andrew Tsou.
First Monday, Volume 21, Number 11 - 7 November 2016
https://firstmonday.org/ojs/index.php/fm/article/download/7013/5651
doi: http://dx.doi.org/10.5210/fm.v21i11.7013