Tracking the imagined audience: A case study on Nike's use of Twitter for B2C interaction
First Monday

Tracking the imagined audience: A case study on Nike's use of Twitter for B2C interaction by Jacky Au Duong and Frauke Zeller

Social media platforms have become the new centre of attention in business-to-consumer (B2C) communication. These interactions provide a rich source of information for businesses in terms of their customers’ preferences, backgrounds and behaviour. We introduce a multi-disciplinary theoretical and methodological framework based on studies in marketing, communication and computer-mediated communication, which aims to inform marketing professionals and academic researchers on how social media can facilitate B2C engagement.






This paper discusses the usage of social media in brand-consumer relationships. Data generated on social media, often discussed as Big Data, represents a promising way for businesses and brands to track user behaviour and gain a better understanding of their customers’ needs and preferences. In some publications, Big Data is promised as the new digital oil for companies (Mayer-Schönberger and Cukier, 2013). It is also depicted as the tool all companies need for their market analyses, which in turn suggests a diminishing need for traditional, social-sciences-based user studies (Brooks, 2013). However, challenges related to Big Data research, such as velocity, variety, quantity and veracity (Zikopoulos, et al., 2012; Zeller, 2014), indicate that the mere existence of trackable user data does not necessarily translate easily into business and marketing strategies (Fisher, 2015; Constantinides, 2014; Polánska, 2014; Aral, et al., 2013; Breur, 2011). Arguably, there is no guaranteed success for social media campaigns, and more insight is needed into what Big Data, i.e., user-generated data on social media, can actually tell us in terms of business-to-consumer (B2C) communication. This need raises the further question of how we can measure and analyze social media data.

To address these areas, we leverage insights from communication studies to develop an innovative methodological approach for analyzing social media content in the field of B2C communication. Starting with a discussion of integrated marketing approaches and social media, we explored existing research in communication studies to build an informed understanding of “audiences” in social media and how tracked data can provide the information that businesses need in order to understand the dynamics and communication patterns on social media platforms. The relationship between communication studies and marketing on social media has been well-researched: Kapoor, et al. (2013) discussed the shift in consumer influence that took place with the advent of social media, claiming that social media had “empowered consumers by connecting them all together into conversational webs” [1]. Fischer and Reuber (2011) examined how user interaction on Twitter affected consumer influence and concluded that Twitter “can positively affect business outcomes such as sales growth, brand image, and company reputation” [2]. Gaps in knowledge still exist, however, regarding what ties Twitter usage by brands to consumer engagement, and also how Twitter can mediate consumer-consumer and brand-consumer relationships. The observed patterns and behaviours discussed in this paper can be used to validate or challenge concepts within existing studies on social media, ultimately deepening the understanding of B2C communication on Twitter. Our study uses a multi-disciplinary approach in marketing, communication and computer-mediated communication (CMC) studies to inform marketing professionals and academic researchers on how Twitter can facilitate B2C engagement. We will apply this approach to a case study using a mixed-methods design, tailored towards the analysis of complex social media data.



Literature review

Twitter in B2C

The advent of social media and social networking sites has caused somewhat of a crisis for marketers worldwide. While the traditional marketing and promotion mix mirrored the one-to-many broadcast approach to communication, studies in Twitter and consumer behaviour have revealed that with the advent of communication technology, the one-to-many approach is no longer effective [3]. It has now become almost impossible for organizations to control the message. Consequently, scholars have contended that professionals should actively monitor, facilitate and/or interject into user conversations that are taking place online (Nitins and Burgess, 2014; Mangold and Faulds, 2009).

As a starting point, both researchers and marketing professionals have recognized that within social media and social networking sites, consumer-generated messages “have become a major factor in influencing various aspects of consumer behaviour including awareness, information acquisition, opinions, attitudes, purchase behaviour, and post-purchase communication and evaluation” [4]. Burton and Soboleva (2011) found that organizations increasingly develop or introduce Twitter accounts as another means or channel to communicate with customers [5]. Their study compared the organizational use of Twitter in six companies with American and Australian accounts (12 accounts in total) and concluded that, “Twitter and other social media platforms create additional marketing communication channels” [6]. More interestingly, the study found that companies were inconsistent in their use of Twitter, revealing in one company that the Australian Twitter account used hashtags more frequently than its American profile. The authors suggested that one of the key issues for this discrepancy is that “for organizations attempting to develop an effective and efficient Twitter strategy, there is the lack of theoretical or empirical evidence on [the] use of Twitter” [7].

Mangold and Faulds (2009) suggested that the traditional integrated marketing communication (IMC) model excludes B2C interaction, and that existing literature “offers marketing managers very little guidance for incorporating social media into their IMC strategies” [8]. These and other studies acknowledge that “online communication in particular is an ideal avenue for fostering dialogue” [9], and propose an alternative approach which “combines some of the characteristics of traditional IMC tools with a highly magnified form of word-of-mouth communication” [10]. The approach accounts for the interactions that take place between the brand and the consumer, as well as between consumers [11]. Engagement-focused activities can also be summarized as social business [12]. Within the IMC model, the concept of social business can be helpful to organizations seeking to build and maintain consumer relationships, and it broadens an approach that otherwise often focuses too narrowly on analyzing communication messages. One of the ways organizations can benefit from social media is by understanding user expectations.

Besides being an additional way for customer communication, Twitter and other social networking platforms also present a variety of opportunities for social interaction and engagement, including disseminating coupons or initiating contests (Murthy, 2012; Litt, 2012; Marwick and boyd, 2010). Gruzd, et al. (2011) explored the notion of the imagined community and how Twitter facilitates a sense of community (SoC). Among the different social media platforms that are being used by organizations for B2C communication, Twitter takes on an exceptional role. Given its inherent platform structure, it provides great potential for a vast audience reach but also underscores the aforementioned challenge of not being able to control their messages. This challenge stems from the fact that — unlike other social media platforms such as Facebook — Twitter’s user-to-user relationship does not have to be mutual: users can subscribe to the tweets of other users without requiring them to follow back (Gruzd, et al., 2011; Naaman, et al., 2010). These and other studies illuminate the appeal and feasibility of studying Twitter as a communication tool that promotes social connections.

Communicative patterns in B2C relationships

Much of the literature reviewed highlighted Twitter as a platform for social interaction that is particularly useful for engaging consumers online. Concepts like electronic word-of-mouth (eWOM) (Kapoor, et al., 2013) and social business (Rajagopal, 2013) can be used to contribute to existing work on Twitter and communication. As consumers continue to take to Twitter to share product reviews and experiences, it becomes even more important for organizations to understand the conversations that may take place. To do so, it is important to have models and conceptualizations of communication on Twitter that are informed by communication and audience research approaches (Dann, 2010; Naaman, et al., 2010; Zappavigna, 2011). It is also important to analyze discussions in detail in order to gain a better understanding of the role different audiences and their communication patterns play and, consequently, in what ways B2C relationships manifest.

Twitter-inherent features such as hyperlinks, @replies, retweets, and hashtags are communicative features that connect and expand a user’s social network in various ways. These “linguistic markers” (Zappavigna, 2011) provide a structured approach for gathering and analyzing content, which is useful since they focus not just on what organizations communicate, but also the comments, actions and topics created by public users. For example, hashtags allow users to label and follow specific topics and therefore also represent an often-used metadata-reference for researchers as well as marketing professionals. Jones (2014) analyzed instances where multiple hashtags were used in one post to investigate how networks can leverage resources from other networks. Focusing on the hashtags thus helps to, “better understand the multiple purposes involved in their use” [13]. Zappavigna (2011) claimed that the Twitter hashtag “is broadly involved in construing heteroglossia” in the sense that it “presupposes a virtual community of interested listeners who are actively following [a] keyword or who may use it as a search term” [14]. Heteroglossia derives from the notion that language cannot be regarded as a neutral medium since it carries multiple intentions of each speaker and listener. Zappavigna (2011) claimed that a topic, indexed via hashtags, facilitates heteroglossia via the republishing of tweets (retweets) and through the adaptation of a single topic by multiple users [15]. Gillen and Merchant (2013) added that each tweet in the hashtag becomes what Bakhtin (1986) conceptualized as “a link in the chain of speech communion” that can be organized and added on to by the users [16]. Thus, hashtags have been argued to engender heteroglossia through the shared usage of the tag from one user to the next.

The need to better contextualize and interpret social media content is heightened by the fact that social media initiate and support online communities. Both positive and negative sentiments uttered on social media, and relating to a specific company or product, become a powerful communication tool when communities form or pick up on them. Viral marketing on social media can go both ways — supporting and advertising an organization or product, but also reflecting complaints and negative standpoints towards an organization (Burton and Soboleva, 2011). Arguably, viral marketing represents one of the biggest opportunities but also a challenge for B2C communication on social media as users are able to react and respond to content. In this study, we are using the concept of imagined communities in order to enhance the understanding of B2C communication.

Drawing from Bakhtin’s (1986) imagined community, a single utterance equates to a tweet that starts a chain to which other users can react and respond. These chains can be observed within Twitter via the user’s social awareness streams (SAS) (Naaman, et al., 2010), where users can respond to, ignore, retweet, or quote specific tweets from other users.

In the context of social media, Litt (2012) described the imagined audience as “the mental conceptualization of people with whom we are communicating, our audience” [17]. The imagined audience consists of all of the user’s followers who the user thinks he or she is communicating with and all other users that the tweet may be visible to. Thus, the hashtag plays a crucial role in communicating with the imagined audience. Other scholars claimed that hashtag participants “do not necessarily know each other, but have been brought together by a shared theme, interest, or concern” [18], and that hashtags “draw the attention of other users to a particular message within a wider network” [19]. Studies showed that the imagined audience does in fact hold influence on user behaviour (Litt, 2012), for example, through sharing information and experiences with various products (Kapoor, et al., 2013). Other studies found that despite having no prior relationship with one another, people can be motivated to act simply by observing others [20]. Marwick and boyd (2010) added one further distinction between the “imagined audience” and the “networked audience” by claiming that the latter is both public and personal. The networked audience includes connections with whom the user is familiar, as well as individuals who have random or unknown connections to the user [21]. The imagined audience on the other hand are all Twitter profiles that the user is not directly aware of, thus users “imagine it” [22]. This means that there is a much lower threshold, combined with the anonymity in social media, to participate in hashtag conversations expressing, for example, negative sentiments towards an organization or brand.

Greer and Ferguson (2011) conducted a large-scale analysis to find a relationship between Twitter interactivity and TV viewership. The study examined Twitter sites of 488 television stations and found that public stations and commercial stations utilized different promotional branding strategies [23]. Some quantitative studies examined the co-creation of meaning (Zappavigna, 2011) and user influence [24]. These studies showed that the quantitative approach can be useful for identifying usage patterns and behaviours. For our approach, we adapted a mixed-method study, following among others Lewis, et al. (2013) in their claim to adopt blended methods for analyzing big and complex data. In a study that investigated the use of computer-assisted content analysis in examining aspects of speech act theory, Einspänner, et al. (2014) argued that the use of computer-assisted qualitative analysis software (CAQDAS) for Twitter research can be more efficient when dealing with large datasets [25]. The researchers concluded that a mixed-methods approach may be appealing as the quantitative and qualitative elements would “minimize [each other’s] shortcomings” [26].

The communication perspective combined with a B2C focus shows that there are a range of promising avenues to pursue in order to analyze the role and potential of Twitter in B2C communication. For example, the concepts of heteroglossia and imagined audiences provide an approach for a more detailed understanding of how communication works, patterns form, and what influence these aspects have on B2C communication. The following sections will apply these concepts to a case study, showcasing our mixed-methods approach to analyze social media conversation.




Case study

Our case study uses Nike’s Possibilities [27] campaign to uncover and examine B2C communication patterns that take place on Twitter. This particular campaign was chosen not only because the campaign hashtag, #JustDoIt, contained a high level of daily activity, but also because the conversations within the hashtag demonstrated successful use for growing and engaging the brand’s audiences, as evidenced by the volume of activity that still takes place far beyond the official campaign period.

Investigating the Twitter interactions that have taken place within the #JustDoIt hashtag enables us to better understand how consumers were interacting with the brand and among each other. Given that “social media can provide a means of ‘observing’ customers, getting closer to customers, and developing personal and company brands” [28], the #JustDoIt conversation serves as a prime event for investigating these conclusions, and for investigating whether and how the message has changed in its intended meaning.

Data collection

Public data in the form of searchable tweets were collected from the #JustDoIt hashtag using the Netlytic platform (, 2014). We collected a total of 118,608 tweets [29] over two months from 1 March to 30 April 2014. Netlytic is a Web-based platform that can import data from Twitter and analyze the information using a suite of built-in tools including network visualizations, frequency-driven word clouds and topic analysis. Netlytic provided nine data points for each tweet, from basic structural categories such as ID#, URL, and source, to content-specific data including the tweet itself, its publication date, and the author. Our study analyzed the authorship, content, and publishing source as the main datapoints for each tweet, and the Netlytic platform was used to analyze the main corpus. Netlytic reported the most mentioned usernames, provided the most common words found within the corpus by frequency, and visualized interactions via total-degree centrality — a measure that counts the volume of tweets to and from a username (see, 2014).

This study also used the non-proprietary AntConc Corpus Toolkit (Anthony, 2005) for a quantitative analysis. As a statistical text analysis tool, AntConc generates frequency-driven wordlists and provides keyword, concordance, collocation, and cluster analyses. Keyword analyses differ from regular wordlist generators (e.g., a word cloud) because keyword lists require a reference corpus (one that is significantly larger than the study corpus), which is used to generate a keyness score for each word. In effect, a high keyness score indicates that the word occurs significantly more often in the study corpus than would be expected from the reference corpus, while a negative keyness score indicates an unusually infrequent appearance of a word when compared to the reference corpus. In addition, the collocation tool provides the frequencies with which words co-occur (e.g., are used side by side) with the search term, thus indicating a possible relationship between words and hinting toward a theme or topic.
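To make the idea of a keyness score concrete, the following is a minimal sketch of one common keyness statistic, the log-likelihood ratio. The text does not state which statistic was configured in AntConc, so the choice of log-likelihood is an assumption, as are the function name `keyness` and the convention of giving under-represented words a negative sign.

```python
import math
from collections import Counter


def keyness(study_tokens, reference_tokens):
    """Signed log-likelihood keyness for every word in the study corpus.

    Assumption: log-likelihood is used as the keyness statistic; the
    statistic actually configured in AntConc is not stated in the text.
    Words that occur only in the reference corpus are not scored.
    """
    study, ref = Counter(study_tokens), Counter(reference_tokens)
    n_study, n_ref = sum(study.values()), sum(ref.values())
    scores = {}
    for word, a in study.items():
        b = ref.get(word, 0)
        # Expected counts if the word were equally common in both corpora.
        e_study = n_study * (a + b) / (n_study + n_ref)
        e_ref = n_ref * (a + b) / (n_study + n_ref)
        ll = 2 * (a * math.log(a / e_study)
                  + (b * math.log(b / e_ref) if b else 0.0))
        # A negative sign marks words under-represented in the study corpus.
        under_represented = a / n_study < b / n_ref
        scores[word] = -ll if under_represented else ll
    return scores
```

Under this sketch, a word such as Nike that is far more frequent in the hashtag corpus than in general Twitter text would receive a large positive score, while a common function word would tend to receive a score near zero or below.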

For our qualitative analysis, we divided the main corpus into two sub-corpora. We sorted tweets with “RT @” within the body of the text into the Retweet (RT) corpus, while all other tweets were placed in the original-content (OC) corpus. Using a web-based statistical calculator [30], we determined that a sample size of 96 tweets per sub-corpus was sufficient to produce a 95 percent confidence level with a 10 percent confidence interval relative to corpus size. As a result, 96 tweets per sub-corpus were selected using a number randomizer [31], creating a sample size of 192 tweets with an expected 85 percent accuracy (95 percent ± 10 percent).
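The sample size of 96 can be reproduced with a standard calculation. The sketch below assumes Cochran's formula with a finite population correction, a z-value of 1.96 for the 95 percent confidence level, and maximum variability (p = 0.5); the Web-based calculator used in the study may apply a slightly different procedure. The sub-corpus sizes are those reported for the RT and OC sub-corpora in the findings.

```python
import math


def sample_size(population, margin=0.10, z=1.96, p=0.5):
    """Required sample size for a proportion (assumed: Cochran's formula).

    margin: half-width of the confidence interval (0.10 = 10 percent)
    z:      z-value for the confidence level (1.96 for 95 percent)
    p:      assumed proportion; 0.5 gives the most conservative estimate
    """
    n0 = (z ** 2) * p * (1 - p) / margin ** 2   # infinite-population size
    n = n0 / (1 + (n0 - 1) / population)        # finite population correction
    return math.ceil(n)


rt_sample = sample_size(54_530)   # RT sub-corpus
oc_sample = sample_size(64_078)   # OC sub-corpus
```

With corpora of this size the finite population correction has almost no effect, which is why both sub-corpora arrive at the same sample size despite differing by nearly 10,000 tweets.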

We then created a codebook with six categories to analyze each of the 192 tweets: (1) Author, (2) Forms of Interaction, (3) Content Structure, (4) Source Device, (5) Topics and Themes, and finally, (6) Unit of analysis. These categories were developed using an inductive approach with the data points provided by Netlytic, as well as emergent categories from manually reviewing the samples. The Topics and Themes category, for example, consisted of seven sub-classifications that were developed based on the content’s relation to Nike, fitness, and motivation. Similarly, the Content Structure category in this study adapted six sub-classifications from Dann’s (2010) review of Twitter literature.

Operationalizing the research questions

Our overall goal was to explore themes and behaviours within the #JustDoIt conversation. The following section describes how the research questions were operationalized using the different tools, and describes the qualitative and quantitative methods used in our mixed-methods design.

RQ1: How does #JustDoIt facilitate interaction between Nike as a brand and its consumers?

At the quantitative level, Netlytic’s Name Network tool visualizes the number of users as well as the nodes and ties between them. This study uses the total-degree centrality for visualizing these relationships, which includes the number of @mentions each user sent as well as received (see Gruzd and Roy, 2014). The number of retweets and original tweets can then reveal whether users in the #JustDoIt conversation tend to publish original content or retweet content. A manually coded content analysis was used to reveal different levels of interactivity based on each tweet’s structure and characteristics (e.g., URLs), which facilitates a theoretical exploration of heteroglossia and the imagined audience.
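As an illustration of total-degree centrality in a name network, the sketch below counts, for each username, the @mentions it sent plus the @mentions it received. It is not Netlytic's implementation: the regular expression, the case-folding of usernames, and the decision to ignore self-mentions are all assumptions made for the example.

```python
import re
from collections import Counter

# Assumed pattern for Twitter usernames following an "@" sign.
MENTION = re.compile(r"@(\w+)")


def total_degree(tweets):
    """Total-degree centrality from (author, text) pairs.

    Each @mention adds one to the sender's count (out-degree) and one
    to the mentioned user's count (in-degree).
    """
    degree = Counter()
    for author, text in tweets:
        sender = author.lower()
        for target in MENTION.findall(text):
            target = target.lower()
            if target == sender:
                continue  # assumption: self-mentions are not ties
            degree[sender] += 1   # mention sent
            degree[target] += 1   # mention received
    return degree
```

For example, `total_degree([("alice", "RT @Nike: keep moving #JustDoIt")])` credits one tie to both `alice` and `nike`, since a retweet prefix also contains an @mention.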

RQ2: What are Twitter users saying in Nike’s #JustDoIt conversation?

Keywords and their contexts were analyzed to explore the types of conversations users participated in. We used a reference corpus containing 1.6 million words from Twitter [32]. Collocation analyses were then performed to identify words that preceded and followed the keyword in question, which highlighted possible relationships and the contexts in which they were used. Cluster analyses organized and counted the frequency of groups of words (e.g., sentences), providing a better understanding of what users are “talking about”.

RQ3: What ends does #JustDoIt serve in Nike’s overall mission?

We chose a qualitative content analysis as our main instrument to answer the last RQ. Transcripts from Nike’s Investor Meeting (Nike, 2013b) and Quarterly Meeting (Nike, 2013c) were particularly helpful in identifying Nike’s high-level strategies for consumer engagement. Objectives were identified by combing through these documents, as well as Nike’s Web site, and included statements such as “to inspire everyone in their personal achievements” in the official Possibilities campaign press release (Nike, 2013a), and “to achieve goals” on the Nike Community Forums (Nike, 2014). These findings informed our codebook for the qualitative coding of tweets.

Findings from RQ1 and RQ2 helped to conceptualize the observed interactions in terms of social business (Rajagopal, 2013) and eWOM (Kapoor, et al., 2013). Through the discussion portion of this study, Research Question 3 will ultimately answer whether the observed interactions (RQ1) and conversations (RQ2) had any impact on Nike’s digital strategy.



Discussion of findings

RQ1: Visualizing user interaction via name networks

Netlytic reported a total of 99,973 usernames within the #JustDoIt conversation. A total of 59,772 nodes were found, along with 90,564 ties between them. The discrepancy between the total number of users and the total number of nodes may result either from #JustDoIt hashtags being omitted within replies or from the fact that not all users replied to tweets. Both situations are quite possible considering that the single collection criterion for this study was the presence of the #JustDoIt hashtag. For example, only the first two tweets in the following conversation would be collected because they were the only ones with the #JustDoIt hashtag.


Figure 1: Sample Twitter conversation. This study only collected tweets with #JustDoIt within the post (i.e., keyword corpus).


Figures 2–4 are the name network visualizations produced by Netlytic. The areas of concentration in the centre, which appear much brighter than the outer areas, are nodes that contain the most influential users within the #JustDoIt hashtag. Users @NikeFuel (Figure 3) and @Nike (Figure 4) were among these influencers [33].


Figure 2: Name network visualization of the #JustDoIt corpus. Tweets were collected from 1 March to 30 April 2014.



Figure 3: Isolated name network for @NikeFuel.



Figure 4: Isolated name network for @Nike.


RQ1: Patterns in user interaction

We found that @Nike was the most frequently mentioned username, and Nike was the most frequent word throughout the corpus (Figures 5, 6). We then divided the corpus into the retweet (RT) and original content (OC) sub-corpora for individual examination. The RT corpus contained 54,530 tweets (46 percent), while the OC sub-corpus contained 64,078 tweets (54 percent).
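The division into sub-corpora follows the rule given in the data collection section: a tweet containing “RT @” anywhere in its body belongs to the RT sub-corpus, and every other tweet belongs to the OC sub-corpus. A minimal sketch of that rule is shown below; the function names are hypothetical, and the case-sensitive substring match is an assumption about how the sorting was carried out.

```python
def split_corpus(tweets):
    """Split tweet texts into retweet (RT) and original-content (OC) lists."""
    rt, oc = [], []
    for text in tweets:
        # Assumed rule: a case-sensitive match on "RT @" anywhere in the body.
        (rt if "RT @" in text else oc).append(text)
    return rt, oc


def share(part, whole):
    """Percentage of the whole corpus, rounded to the nearest percent."""
    return round(100 * part / whole)


rt_share = share(54_530, 118_608)   # RT sub-corpus
oc_share = share(64_078, 118_608)   # OC sub-corpus
```

Note that under this rule a quoted retweet with added commentary (for example, “so true RT @Nike: ...”) is still counted as a retweet, because the marker may appear anywhere in the text.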


Figure 5: Top 10 mentioned users in the keyword corpus.



Figure 6: Top 10 frequently used words in the keyword corpus.


The initial division of the corpus into the RT and OC sub-corpora revealed that while users posted original content more often than they retweeted, retweeted content accounted for more interaction between users. This differs from a previous study in which Bruns and Stieglitz (2014) observed that 90 percent of the activity within a hashtag conversation was retweeted content from “Twitter celebrities” (the remaining 10 percent), thus concluding that the bulk of the users’ involvement in that particular conversation was “marginal” [34]. Yet, retweets are an important function for expanding the content’s visibility since they widen a tweet’s exposure from one mutually exclusive network to another [35]. Retweets also function as a form of acknowledgement since they “indicate actual attention given to a message” [36]. Retweets in this sense constitute interaction via acknowledgement. Accordingly, we wanted to identify “who tweeted” within the hashtag. We examined this by classifying our sample posts as either published by Nike, a Nike affiliate, or a user with no visible affiliation with Nike. Based on authorship alone, our sample corpora suggest that Nike rarely retweets content, but users are indeed retweeting existing content (100 percent of all our retweet samples were posted by users). Results from the OC corpus indicate that the conversation is dominated by users rather than the brand (only four OC tweets within our sample were from a Nike-affiliated account). This means that in our study, the liveliness of the hashtagged conversation has to be attributed to user-generated content, including retweets.

Next, we analyzed how often users interacted with the brand. Through manual coding (see Table 1 below), we found that 59.4 percent of the tweets analyzed in the RT sub-corpus directly mentioned @Nike: 54 tweets (56.3 percent) were a direct retweet of Nike content, noted with “RT @Nike”, and three tweets (3.1 percent) contained an “@Nike” that was not immediately preceded by “RT”.


Table 1: Forms of interaction with Nike. Each tweet was coded based on how, if applicable, it mentioned Nike.
Code  Form of interaction                 RT sub-corpus     OC sub-corpus
                                          n      %          n      %
1     Original tweet: direct mention      0      0.0        6      6.3
2     Original tweet: indirect mention    0      0.0        15     15.6
3     Original tweet: no mention          0      0.0        75     78.1
4     Retweet: direct mention             3      3.1        0      0.0
5     Retweet: Nike direct                54     56.3       0      0.0
6     Retweet: indirect mention           8      8.3        0      0.0
7     Retweet: no mention                 31     32.3       0      0.0
9     N/A: spam/junk data                 0      0.0        0      0.0


The OC sub-corpus only had 21.9 percent of its tweets mentioning Nike: 15 tweets (15.6 percent) contained “Nike”, but only six (6.3 percent) contained a direct “@Nike” mention. More than 75 percent of tweets in the OC corpus had no mention of the brand at all, compared to only 32.3 percent from the RT sub-corpus.
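The percentages reported for the forms of interaction follow directly from the coded counts over the two 96-tweet samples. The short sketch below recomputes them from the counts in Table 1; the dictionary keys are shorthand labels introduced here, and rounding half up to one decimal place is an assumption about how the published figures were rounded.

```python
from decimal import Decimal, ROUND_HALF_UP

# Coded counts per sample of 96 tweets, taken from Table 1.
rt_counts = {"nike_direct_retweet": 54, "direct_mention": 3,
             "indirect_mention": 8, "no_mention": 31}
oc_counts = {"direct_mention": 6, "indirect_mention": 15, "no_mention": 75}


def shares(counts):
    """Percentage of the sample per code, rounded half up to one decimal."""
    total = sum(counts.values())
    return {
        code: float((Decimal(100 * n) / Decimal(total))
                    .quantize(Decimal("0.1"), rounding=ROUND_HALF_UP))
        for code, n in counts.items()
    }


rt_shares = shares(rt_counts)
oc_shares = shares(oc_counts)
```

Decimal arithmetic is used because Python's built-in `round` rounds exact halves to the nearest even digit, which would turn 56.25 into 56.2 rather than the 56.3 reported in the text.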

From our analysis, we found that within retweeted content, users are more likely to interact directly with the brand. With original content, however, users are more likely to indirectly mention the brand (i.e., not directing tweets to @Nike or affiliated accounts). Finally, to classify these interactions based on the content of each tweet, we adapted Dann’s (2010) Twitter content classification system (see Table 2 below). The model describes the types of content users were posting, including categories such as conversational, pass along, news, status, phatic, and spam [37]. Information-seeking or addressivity related tweets (mentioning a username) were classified as conversational content. The pass along category included tweets that were intended to share information, were self-promoting/advertising, or contained a URL. Headlines, event coverage, or other reporting of information were classified as news, while those that answered “what are you doing now?” were classified as statuses. Within the phatic category were greetings, monologues, or opinions. The spam category encompassed automated posts.


Table 2: Results from coding tweet content structures focusing on interactivity.
Code  Content structure   RT sub-corpus     OC sub-corpus
                          n      %          n      %
2     Pass along          52     54.2       15     15.6


Within the RT corpus, the highest observed structure was the pass along category with 52 tweets (54.2 percent). Trailing by 33 tweets was the conversational structure with 19 tweets (19.8 percent). The news, status, and phatic categories had three tweets each, accounting for 9.3 percent of the total RT corpus.

Contrasting this with the OC corpus, the highest observed structure was the conversational style, accounting for 27 tweets (28.1 percent), which were addressed to others via an “@” symbol. The second-highest ranking category was the status category, accounting for 22 tweets (22.9 percent). Information sharing and self-promotion — the pass along and phatic categories — ranked third, accounting for 15 tweets each (totaling 31.2 percent of the OC corpus). Within both sub-corpora, 16 tweets were classified as spam.

The most surprising finding was that most of the retweets categorized as having a conversational structure were of content from @Nike. Of the 19 retweets identified as conversational, 13 (68 percent) contained “RT @Nike”.


Figure 7: Sample conversational tweet. @Nike’s tweet, which is a response to a post from another Twitter user, was retweeted 130 times.


These tweets are interesting not only because users were retweeting official Nike content, but also because the content of the tweet shows that Nike is addressing a different user. This suggests that (1) Nike engages with users regularly, (2) the addressee is invited to engage with Nike and (3) other users are endorsing Nike’s engagement (e.g., 130 retweets in the sample retweet). From a theoretical standpoint, these tweets exemplify the notion of the imagined audience (Litt, 2012) in that the content reached individuals that neither Nike nor the addressee were aware of.

RQ2: Emerging conversations

In order to gain a better understanding of what users are saying within the #JustDoIt conversation, the texts within the tweets were analyzed using the AntConc Corpus Toolkit. The RT and OC sub-corpora were analyzed separately through keyword, collocation, and cluster analyses.

Our keyword analysis (see Table 3 below) revealed a total of 154 keywords within the RT corpus. These words were repeated at least 225 times. The OC corpus had 156 keywords.


Table 3: Top 10 keywords by sub-corpus.
RT sub-corpus                 OC sub-corpus
Word      Keyness score       Word         Keyness score
Time      11,965.38           Nike         21,548.39
Nike      8,256.38            run          6,078.61
Finish    6,647.23            day          4,570.31
Run       6,346.15            workout      4,570.31
Line      6,115.24            nikeplus     4,312.55
Start     5,841.33            fitness      4,140.71
Season    4,673.29            time         3,808.01
Moving    4,499.00            today        3,659.95
Quit      4,315.66            running      3,402.22
Day       4,252.28            motivation   3,182.88


By identifying significant keywords within each corpus, we gained more insight into the conversations. Similar keywords were found between the RT and OC corpora; terms like time, Nike, run, and day were common in both top 10 lists. The difference between these lists was the keyness scores attributed to each word by AntConc. A word’s keyness reflects how frequently it appears in the main corpus (i.e., the corpus being analyzed) when compared to a much larger reference corpus (see Baker, 2006). Our reference corpus contained 1.6 million words extracted from Twitter [38]. Overall, the similar keywords found in the RT and OC corpora suggest that the users were tweeting and retweeting about similar topics.

To provide a greater depth of understanding, we ran the keywords first through a collocation analysis and then through a cluster analysis. The collocation analysis was configured to identify collocates within four words to the left and four words to the right of the search term. The statistical measure used in calculating collocations was mutual information (MI). The more often a word occurs close to the search term, the greater the chance that the collocate represents the context in which the search term was used (Baker, 2006). Thus, the collocation analysis provides further insight into what users are “talking about” in relation to the keyword. Table 4 (below) summarizes the top 10 collocates for Nike.
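The collocation step can be illustrated with a short sketch. It assumes a common textbook form of mutual information, MI = log2(observed / expected), where the expected co-occurrence count is scaled by the size of the window (four words on each side, as configured above). The function name and the tokenized input are hypothetical, and AntConc's exact formula may differ.

```python
import math
from collections import Counter


def mi_collocates(tokens, node, window=4, min_freq=1):
    """Rank words co-occurring with `node` by mutual information (sketch)."""
    n = len(tokens)
    freq = Counter(tokens)
    co_occurrences = Counter()

    # Count every word found within `window` tokens left or right of the node.
    for i, token in enumerate(tokens):
        if token != node:
            continue
        start = max(0, i - window)
        end = min(n, i + window + 1)
        for j in range(start, end):
            if j != i:
                co_occurrences[tokens[j]] += 1

    span = 2 * window  # number of window positions around each node occurrence
    scores = {}
    for word, observed in co_occurrences.items():
        if observed < min_freq:
            continue
        # Co-occurrences expected if the two words were independent.
        expected = freq[node] * freq[word] * span / n
        scores[word] = math.log2(observed / expected)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical, already tokenized and lowercased tweet text.
    sample = "join the nike family today and run with nike every day".split()
    for word, score in mi_collocates(sample, "nike"):
        print(word, round(score, 2))
```

Because MI rewards rare words that happen to sit next to the search term, a minimum frequency threshold (the `min_freq` parameter here) is usually applied before reading the ranking.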


Table 4: Top 10 collocates for Nike by sub-corpus.
                   RT sub-corpus                      OC sub-corpus
Rank   Collocate   Freq.   MI score       Collocate   Freq.   MI score


Many collocates had very similar MI scores, suggesting that the conversations shared a common theme. For example, we found that most collocates in the RT corpus for the term Nike were actually Nike product lines (5/10 collocates). Within the OC corpus, only two collocates for Nike were actual product names, though most of the content was about Nike purchases. One outlier was “gymwanker” — which is actually a hashtag on Instagram that functions much like a Twitter hashtag in that it indexes posts under the tag. In general, keywords were used in the same contexts via retweets and original content.

Our cluster analyses confirmed this finding. We configured AntConc to analyze clusters of five words surrounding each collocate. Within the OC corpus, the cluster analyses (for the collocates Nike, run, day, workout and nikeplus) showed that a majority of the themes were related to Nike products or quotes. The collocates most strongly related to fitness and motivation were run, day, and workout. Nike branding and product placements were most visible around Nike and nikeplus, which co-occurred with phrases such as “join Nike ...” and “Nike family”. The analysis found similar topics within the RT corpus, where product placement and branding collocates included hypervenoms and knightsnation, both of which were used as subsequent hashtags within the tweet.
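A cluster analysis of this sort can be approximated by counting the word sequences that contain a given term. The sketch below interprets a “cluster of five words” as any run of five consecutive tokens that includes the term; this interpretation, the function name and the sample input are assumptions, not a description of AntConc's internals.

```python
from collections import Counter


def clusters(tokens, term, size=5):
    """Count every run of `size` consecutive tokens that contains `term`."""
    counts = Counter()
    for start in range(len(tokens) - size + 1):
        gram = tokens[start:start + size]
        if term in gram:
            counts[" ".join(gram)] += 1
    # Most frequent clusters first.
    return counts.most_common()


if __name__ == "__main__":
    # Hypothetical tokenized and lowercased tweet text.
    sample = "why is nike sponsoring this barbarian boycott nike".split()
    for cluster, count in clusters(sample, "nike"):
        print(count, cluster)
```

Reading the term inside its surrounding words is what allows themes such as a boycott to surface, since the neighbouring words carry the stance that a single collocate does not.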

One difference between the RT and OC corpora was that the RT corpus contained far fewer fitness and motivational phrases. Tweets in the OC corpus had a stronger relation to fitness, motivation, and Nike-related themes. As the frequency of each item showed, tweets in the RT corpus were mostly quotes retweeted by multiple users; based on content alone, most of these were difficult to relate to fitness or Nike (e.g., quotes such as “Don’t let your luck run out”).

The cluster analysis also detected negative sentiment. Within both corpora, the topic of boycotting Nike was revealed through co-occurring keywords such as sponsoring and barbarian. A collocation analysis alone would have provided insufficient details to detect negative sentiment.

RQ3: Relevance to organizational goals

In order to relate the tweets back to Nike’s organizational goals to motivate and inspire users, we conducted a manual content analysis of the tweets based on the following categories: (1) Nike+ Related, (2) Nike — Fitness Related, (3) Nike — General and Other Products, (4) Fitness non-Nike, (5) Motivation non-Nike, (6) General — Unrelated, and (7) Other/Junk. Table 5 provides the results of our manual content analysis, and Table 6 provides sample tweets that were coded.


Table 5: Results of content analysis: Topics and themes.
                                         RT sub-corpus      OC sub-corpus
     Category                            n       %          n       %
1    Nike+ related                       0       0.0        3       3.1
2    Nike – Fitness related              27      28.1       8       8.3
3    Nike – General & other products     23      24.0       10      10.4
4    Fitness related, non-Nike           5       5.2        14      14.6
5    Motivation, non-Nike                2       2.1        15      15.6
6    General – Unrelated                 16      16.7       31      32.3


The RT corpus contained the greatest number of Nike-related tweets, with 27 messages (28.1 percent) related to Nike and fitness, and 23 tweets (24 percent) related to Nike in general (e.g., quotes, products). Only seven retweets addressed fitness or motivation without reference to Nike: five were fitness related and two were motivational in general. A total of 39 retweets (40.6 percent) were not related to Nike, fitness or motivation; 23 of these were identified as junk because they either lacked sufficient context or were undecipherable (e.g., due to language).

Within the OC corpus, 46 tweets were not related to Nike, fitness or motivation; 15 out of the 46 tweets were categorized as junk. The OC corpus contained 21 tweets that were directly related to Nike, including eight that were related to Nike and fitness, and three that were related to Nike+. The OC corpus also had 14 fitness tweets that were not affiliated with Nike, and 15 tweets that were generally motivational with no ties to Nike.


Table 6: Sample tweets within content analysis.
1    Nike+ related                      #nikefuel #nikerunning #JustDoIt I won 30 and 30, a 30.00mi Challenge using Nike+. #nikeplus
2    Nike – Fitness related             358m done! First time ever. #JustDoIt (@ SSC Swimming Pool)
3    Nike – General & other products    Love my new @Nike tees. #nike #JustDoIt #reupload #betterphoto #lovethem
4    Fitness related, non-Nike          I went hard in the gym today. #justdoit
5    Motivation, non-Nike               Goals ... Dreams with deadlines #justdoit
6    General – Unrelated                #illustration #marker #drawing #night #JustDoIt #art #draw
9    Other/junk                         Ja ik moet echt! @loopmaatjes #JustDoIt #lazybastard

Discussion of findings





The observed topics included brand communication, fitness and motivational themes, and other non-brand-related messages from users. By definition, Nike’s slogan-turned-campaign handle experienced heteroglossia: it carried multiple meanings among participants, all of whom used Nike’s “Just do it” slogan to their own ends.

While Nike originally intended the message to serve as a channel for users to inspire and motivate one another (Nike, 2013a), the collocation and cluster analyses revealed that, in at least one conversation, participants also used the tag to support a boycott of Nike products. The cluster analysis revealed keywords like barbarian and sponsor in phrases such as: “WHY is Nike sponsoring this BARBARIAN?” @DogRescueTweets: #Boycott #Nike #justdoit #animalabuse #dogfighting.

Androutsopoulos (2011) claimed that “heteroglossia does not occur ... but is made [emphasis in the source text]; it is fabricated by social actors who have woven voices of society into their discourses, contrasting these voices and the social viewpoints they stand for” [39]. In other words, participants in the boycott subtopic have taken the original slogan and reframed it as a form of action: to boycott Nike. Because they carried the #JustDoIt hashtag, these tweets and conversations were amplified to the larger, imagined audience that otherwise may not have been aware of or connected to the topic.

This amplification of content within the hashtag reflects the notion of the imagined audience when followers of @Nike retweet the brand’s messages, exposing the tweet to even more networks. Each retweet thus re-appears in the hashtag as a separate and more recent comment, effectively pushing down — or burying — previous tweets. Nike may have pushed its agreeable quotes to its followers in anticipation that the messages would reach its imagined audience and trigger more retweets. Moreover, Nike and its affiliates effectively reassigned content from the spheres of #nikeplus — and no doubt, others — to the #JustDoIt hashtag. In essence, Nike used Twitter to redirect users who have posted fitness and motivational content to the #JustDoIt thread, and they did so by recruiting audiences elsewhere.

The quantitative analysis provided strong evidence that users were in fact discussing Nike, fitness, motivation or a combination of the three. Statistical measures using mutual information (for the collocation analysis) and log-likelihood (for the keyword lists), for example, produced notable results indicating that product-related conversations were taking place. From the qualitative perspective, 52 percent of the sample RTs were related to Nike, compared to only 18.8 percent of the OC tweets. Using the image of “competing voices” [40], it is clear that participants who post original content are trying to adapt the #JustDoIt keyword for non-Nike related purposes. By recruiting users from other conversations and publishing retweet-worthy content, however, Nike seemed determined to reserve the meaning of #JustDoIt for fitness or brand-related content. Given the size and influence of Nike’s own and affiliated accounts, it would be a useful strategy for the brand to regain control of the hashtag by publishing retweetable content, thus burying tweets that are less favourable.

Summing up, our proposed mixed-method design to study the usage of social media in B2C relationships produced interesting insights into the actual design, interaction and dynamics of Nike’s campaign. The sample in this study was limited to the discussions around the #JustDoIt hashtag and to a certain timeframe, and the results therefore are not representative of all B2C interactions of this brand. However, by merging B2C approaches with communication studies and corpus linguistic tools, we provided an innovative, multi-disciplinary avenue of research into social media interactions that can be expanded to either larger or smaller data sets/samples (of different companies). Further, complementary research areas could focus on how social media affects B2C relationships off-line, and how B2C interactions through social media impact financial revenues of companies, that is, the ROI (return on investment) of companies’ social media engagement. Our approach offers analyses and insights that go beyond quantitative approaches focusing on the number of followers, and it provides the opportunity to explain follower numbers, behaviours and interaction patterns.


About the authors

Jacky Au Duong is a researcher and coordinator at the Centre for Communicating Knowledge at Ryerson University in Toronto, Canada. His areas of interest include social media, branding and marketing, communication and public engagement.
E-mail: jauduong [at] ryerson [dot] ca

Frauke Zeller is Associate Professor in the School of Professional Communication at Ryerson University in Toronto. Her recent publications focus on Big Data in audience research, social media research mixed-methods and human-robot interaction studies.
E-mail: fzeller [at] ryerson [dot] ca



1. Kapoor, et al., 2013, p. 54.

2. Fischer and Reuber, 2011, p. 16.

3. Nitins and Burgess, 2014, p. 294.

4. Mangold and Faulds, 2009, p. 358.

5. Burton and Soboleva, 2011, p. 492.

6. Burton and Soboleva, 2011, p. 491.

7. Ibid.

8. Mangold and Faulds, 2009, p. 358.

9. Rybalko and Seltzer, 2010, p. 336.

10. Mangold and Faulds, 2009, p. 359.

11. Rajagopal, 2013, p. 112.

12. Rajagopal, 2013, p. 111.

13. Jones, 2014, p. 103.

14. Zappavigna, 2011, p. 791.

15. Zappavigna, 2011, p. 790.

16. Bakhtin, 1986, pp. 75–76.

17. Litt, 2012, p. 331.

18. Bruns and Moe, 2014, p. 19.

19. Jones, 2014, p. 104.

20. Long, et al., 2012, p. 284.

21. Marwick and boyd, 2010, p. 129.

22. Marwick and boyd, 2010, p. 117.

23. Greer and Ferguson, 2011, p. 207.

24. Hawthorne, et al., 2013, p. 557.

25. Einspänner, et al., 2014, p. 99.

26. Einspänner, et al., 2014, p. 105.

27. Nike’s Possibilities campaign (Nike, 2013a) celebrated 25 years of its famous “Just Do It” slogan and asked users to share personal accomplishments via the #JustDoIt Twitter hashtag. The campaign ran from 20 August to 13 September 2013.

28. Fischer and Reuber, 2011, p. 16.

29. Compared to other studies, the size of this study’s corpus seemed sufficient given that the search term was confined to one hashtag. The sizes of corpora used in previous studies have varied significantly: a study on the 2011 Canadian federal election collected only 5,918 tweets over a two-day period (Gruzd and Roy, 2014), while another study used a corpus of 34,770,790 tweets for sentiment analysis (Thelwall, et al., 2011).



32. We defined Influencers using Netlytic’s Total Degree Centrality measure, which accounts for the total number of tweets to and from a particular username.

33. See previous note.

34. Bruns and Stieglitz, 2014, p. 78.

35. Bruns and Moe, 2014, p. 19.

36. Himelboim, et al., 2013, p. 163.

37. Tweets that were information-seeking or had addressivity through including the @Username tag were classified as conversational content. The pass along category included tweets that were intended to share information or had a URL, as well as tweets that were self-promoting and/or advertising information. Headlines, event coverage, or other reporting of information were classified as news, while those that answered “what are you doing now?” were classified as statuses. Within the phatic category were greetings, monologues, or opinions. Finally, the spam category encompassed tweets that were automated or unsolicited posts from malware or bots.

38. Our reference corpus contains 1.6 million words extracted from Twitter data. Retrieved June 2014 from Sentiment140 (

39. Androutsopoulos, 2011, p. 282.

40. Bruns and Moe, 2014, p. 204.



J. Androutsopoulos, 2011. “From variation to heteroglossia in the study of computer-mediated discourse,” In: C. Thurlow and K. Mroczek (editors). Digital discourse: Language in the new media. Oxford: Oxford University Press, pp. 277–298.
doi:, accessed 21 April 2017.

L. Anthony, 2014. “AntConc,” version 3.4.3. Tokyo, Japan: Waseda University, at, accessed 21 April 2017.

S. Aral, C. Dellarocas and D. Godes, 2013. “Introduction to the special issue — Social media and business transformation: A framework for research,” Information Systems Research, volume 24, number 1, pp. 3–13.
doi:, accessed 21 April 2017.

P. Baker, 2006. “‘The question is, how cruel is it?’ Keywords, foxhunting and the House of Commons,” AHRC ICT Methods Network Expert Seminar on Linguistics (8 September), at, accessed 2 April 2014.

M. Bakhtin, 1986. Speech genres and other late essays. C. Emerson and M. Holquist (editors) and V. McGee (translator). Austin: University of Texas Press.

T. Breur, 2011. “Data analysis across various media: Data fusion, direct marketing, clickstream data and social media,” Journal of Direct, Data and Digital Marketing Practice, volume 13, number 2, pp. 95–105.
doi:, accessed 21 April 2017.

D. Brooks, 2013. “The philosophy of data,” New York Times (4 February), at, accessed 2 April 2013.

A. Bruns and H. Moe, 2014. “Structural layers of communication on Twitter,” In: K. Weller, A. Bruns, J. Burgess, M. Mahrt and C. Puschmann (editors). Twitter and society. New York: Peter Lang, pp. 15–28.

A. Bruns and S. Stieglitz, 2014. “Metrics for understanding communication on Twitter,” In: K. Weller, A. Bruns, J. Burgess, M. Mahrt and C. Puschmann (editors). Twitter and society. New York: Peter Lang, pp. 68–82.

S. Burton and A. Soboleva, 2011. “Interactive or reactive? Marketing with Twitter,” Journal of Consumer Marketing, volume 28, number 7, pp. 491–499.
doi:, accessed 21 April 2017.

E. Constantinides, 2014. “Foundations of social media marketing,” Procedia — Social and Behavioral Sciences, volume 148, pp. 40–57.
doi:, accessed 21 April 2017.

S. Dann, 2010. “Twitter content classification,” First Monday, volume 15, number 12, at, accessed 14 April 2014.
doi:, accessed 21 April 2017.

J. Einspänner, M. Dang-Anh and C. Thimm, 2014. “Computer-assisted content analysis of Twitter data,” In: K. Weller, A. Bruns, J. Burgess, M. Mahrt and C. Puschmann (editors). Twitter and society. New York: Peter Lang, pp. 97–108.

E. Fischer and A. Reuber, 2011. “Social interaction via new social media: (How) can interactions in Twitter affect effectual thinking and behavior?” Journal of Business Venturing, volume 26, number 1, pp. 1–18.
doi:, accessed 21 April 2017.

E. Fisher, 2015. “‘You media’: Audiencing as marketing in social media,” Media, Culture & Society, volume 37, number 1, pp. 50–67.
doi:, accessed 21 April 2017.

J. Gillen and G. Merchant, 2013. “Contact calls: Twitter as a dialogic social and linguistic practice,” Language Sciences, volume 35, pp. 47–58.
doi:, accessed 21 April 2017.

C. Greer and D. Ferguson, 2011. “Using Twitter for promotion and branding: A content analysis of local television Twitter sites,” Journal of Broadcasting & Electronic Media, volume 55, number 2, pp. 198–214.
doi:, accessed 21 April 2017.

A. Gruzd and J. Roy, 2014. “Investigating political polarization on Twitter: A Canadian perspective,” Policy & Internet, volume 6, number 1, pp. 28–45.
doi:, accessed 21 April 2017.

A. Gruzd, B. Wellman and Y. Takhteyev, 2011. “Imagining Twitter as an imagined community,” American Behavioral Scientist, volume 55, number 10, pp. 1,294–1,318.
doi:, accessed 21 April 2017.

J. Hawthorne, J. Houston and M. McKinney, 2013. “Live-tweeting a presidential primary debate: Exploring new political conversations,” Social Science Computer Review, volume 31, number 5, pp. 552–562.
doi:, accessed 21 April 2017.

I. Himelboim, S. McCreery and M. Smith, 2013. “Birds of a feather tweet together: Integrating network and content analyses to examine cross-ideology exposure on Twitter,” Journal of Computer–Mediated Communication, volume 18, number 2, pp. 40–60.
doi:, accessed 21 April 2017.

J. Jones, 2014. “Switching in Twitter’s hashtagged exchanges,” Journal of Business and Technical Communication, volume 28, issue 1, pp. 83–108.
doi:, accessed 21 April 2017.

P. Kapoor, K. Jayasimha and A. Sadh, 2013. “Brand-related, consumer to consumer, communication via social media,” IIM Kozhikode Society & Management Review, volume 2, number 1, pp. 43–59.
doi:, accessed 21 April 2017.

S. Lewis, R. Zamith and A. Hermida, 2013. “Content analysis in an era of big data: A hybrid approach to computational and manual methods,” Journal of Broadcasting & Electronic Media, volume 57, number 1, pp. 34–52.
doi:, accessed 21 April 2017.

E. Litt, 2012. “Knock, knock. Who's there? The imagined audience,” Journal of Broadcasting & Electronic Media, volume 56, number 3, pp. 330–345.
doi:, accessed 21 April 2017.

C. Long, P. Gable, C. Boerstler and C. Albee, 2012. “Brands can be like friends: Goals and interpersonal motives influence attitudes toward preferred brands,” In: M. Fetscherin, S. Fournier and M. Breazeale (editors). Consumer-brand relationships: Theory and practice. Milton Park, Abingdon, Oxon.: Routledge, pp. 279–297.

W. Mangold and D. Faulds, 2009. “Social media: The new hybrid element of the promotion mix,” Business Horizons, volume 52, number 4, pp. 357–365.
doi:, accessed 21 April 2017.

A. Marwick and d. boyd, 2010. “I tweet honestly, I tweet passionately: Twitter users, context collapse, and the imagined audience,” New Media & Society, volume 13, number 1, pp. 114–133.
doi:, accessed 21 April 2017.

V. Mayer-Schönberger and K. Cukier, 2013. Big data: A revolution that will transform how we live, work and think. London: John Murray.

D. Murthy, 2012. “Towards a sociological understanding of social media: Theorizing Twitter,” Sociology, volume 46, number 6, pp. 1,059–1,073.
doi:, accessed 21 April 2017.

M. Naaman, J. Boase and C. Lai, 2010. “Is it really about me? Message content in social awareness streams,” CSCW ’10: Proceedings of the 2010 ACM Conference on Computer Supported Cooperative Work, pp. 189–192.
doi:, accessed 21 April 2017.

2014. “Netlytic,” at, accessed 14 February 2014.

Nike, 2014. “Nike community forum,” at, accessed 27 February 2014.

Nike, 2013a. “Nike redefines ‘Just Do It’ with new campaign” (21 August), at, accessed 24 January 2014.

Nike, 2013b. “2013 Nike, Inc. investor meeting (9 Oct 2013) [transcript],” at, accessed 21 April 2017.

Nike, 2013c. “FY 2014 Q2 earnings release conference call transcript December 19, 2013,” at, accessed 21 April 2017.

T. Nitins and J. Burgess, 2014. “Twitter, brands, and user engagement,” In: K. Weller, A. Bruns, J. Burgess, M. Mahrt and C. Puschmann (editors). Twitter and society. New York: Peter Lang, pp. 292–304.

K. Polánska, 2014. “Social media in modern business,” European Scientific Journal, volume 1, pp. 335–345, and at, accessed 21 April 2017.

Rajagopal, 2013. “Social media and consumer insight,” In: Rajagopal. Managing social media and consumerism: The grapevine effect in competitive markets. Houndmills, Basingstoke, Hampshire: Palgrave Macmillan, pp. 109–131.
doi:, accessed 21 April 2017.

S. Rybalko and T. Seltzer, 2010. “Dialogic communication in 140 characters or less: How Fortune 500 companies engage stakeholders using Twitter,” Public Relations Review, volume 36, number 4, pp. 336–341.
doi:, accessed 21 April 2017.

M. Thelwall, K. Buckley and G. Paltoglou, 2011. “Sentiment in Twitter events,” Journal of the American Society for Information Science and Technology, volume 62, number 2, pp. 406–418.
doi:, accessed 21 April 2017.

F. Zeller, 2014. “Big data in audience research,” In: F. Zeller, C. Ponte and B. O’Neill (editors). Revitalising audience research: Innovations in European audience research. London: Routledge, pp. 261–278.

P. Zikopoulos, C. Eaton, D. DeRoos, T. Deutsch and G. Lapis, 2012. Understanding big data: Analytics for enterprise class Hadoop and streaming data. New York: McGraw-Hill.

M. Zappavigna, 2011. “Ambient affiliation: A linguistic perspective on Twitter,” New Media & Society, volume 13, number 5, pp. 788–806.
doi:, accessed 21 April 2017.


Editorial history

Received 13 December 2016; accepted 22 April 2017.

Copyright © 2017, Jacky Au Duong and Frauke Zeller. All Rights Reserved.

Tracking the imagined audience: A case study on Nike’s use of Twitter for B2C interaction
by Jacky Au Duong and Frauke Zeller.
First Monday, Volume 22, Number 5 - 1 May 2017


© First Monday, 1995-2017. ISSN 1396-0466.