ERGM approach to press freedom, regime type, and Internet connectedness
First Monday

ERGM approach to press freedom, regime type, and Internet connectedness by Hyunjin Seo and Stuart J. Thorson

We investigate homophily in the tie structure of the global Internet by estimating Exponential Random Graph (ERG) models. Specifically, we analyze the extent to which different variables including Gross National Income, geographic proximity, political regime type, and press freedom rating account for the pattern of direct country-to-country Internet connections. Results show that for 2011–2014, but not before, press freedom homophily is significantly predictive of the presence (or absence) of country-to-country Internet connections even when controlling for geographic proximity, bandwidth, and whether or not a country is democratic. The regime type variable was a significant predictor in 2002–2004 but not after. The findings provide insights into changes in press freedom around the world and the global Internet structure. The ERG approach used in this study should be useful for future research in related areas.






The revival of nationalist politics in Europe and the United States has been accompanied by differing claims involving the role of the press in a democracy. For example, on 17 February 2017, U.S. President Donald Trump tweeted: “The FAKE NEWS media (failing @nytimes, @NBCNews, @ABC, @CBS, @CNN) is not my enemy, it is the enemy of the American People!” While politicians have often felt themselves in an adversarial relationship with the press, “the enemy of the people” phrase struck many as crossing a line in a country whose constitution prohibits government interference with a free press. The next day, German Chancellor Angela Merkel, speaking to an audience including U.S. Vice President Mike Pence, offered a contrasting position on press/government relations saying, “I stand by a free and independent press and have great respect for journalists. We’ve always done well in Germany when we mutually respect each other.” (Donahue, 2017).

Interestingly, these conversations about fake news and press freedom are taking place in a communication environment globalized by the Internet. Moreover, increasingly available and affordable digital communication networks enabled by the Internet allow actors other than traditional intermediaries such as governments or mass media to play an important role in producing and sharing news and information. However, different levels of access to the Internet and freedom to use the Internet, as well as press freedom, influence levels of citizen participation in important and relevant discourses in politics or other areas of citizens’ lives. In particular, country-level Internet infrastructure and policy are important factors in this. For example, North Korea’s government policies deny Internet access to most of its citizens. The Chinese government generally restricts access to Google and most of its services. South Korea’s national security law prohibits citizen access to North Korean Web sites and limits discussions of North Korea over social media. Internet access has become an integral component of freedom of press and expression even as much of the world’s population lacks the resources and/or skills to easily access it.

In this sense, it is important to better understand structural changes of the global Internet and how these changes are associated with changes in other aspects of society. Against this backdrop, this paper examines several Exponential Random Graph (ERG) models to understand the extent to which social, political, and economic variables including press freedom, regime type, and economic performance can account for the pattern of direct country-to-country Internet connections [1]. Our findings suggest that in the early days (2002) of the global Internet, macroeconomic performance, per capita domestic Internet users, and homophily with regard to regime type (whether or not a country is democratic) and geographic location were statistically meaningful predictors of structure. However, in the less centralized global Internet of 2014, regime type homophily was no longer predictive, while freedom of press homophily had become significant. Our paper provides a more nuanced understanding of the interplay between structural aspects of the global Internet and press freedom and regime type.



Literature review

The digital network now known as the Internet can be thought of as beginning with Kleinrock’s 1961 paper (Kleinrock, 1961) on packet switching. By 1969, his networking ideas had been realized in the initial ARPANET, connecting machines located in Menlo Park, California (Stanford Research Institute); University of California, Los Angeles; University of California, Santa Barbara; and, the University of Utah.

By 2002 the Internet was truly international, with 187 countries sharing a high speed direct connection with at least one other country with a total global bandwidth capacity of 0.9 terabits per second (Tbps). By 2014, there were 201 countries connected and total global bandwidth capacity had grown to 137.3 Tbps.

The rapid growth in the size of the global Internet is not the whole story. The structure of the Internet has also undergone considerable change from being highly centralized around the United States to becoming increasingly decentralized with increasing numbers of countries having direct connections with more than one other country. In 2002, the median number of direct connections across all countries was two while the most direct connections any country had was 124 (U.S.). The Freeman centralization score (Freeman, 1978) is an index of how central the most central node in a network is relative to all other nodes, normalized by the theoretical maximum centralization of the network. Thus scores can range between 0 (maximally decentralized) and 1 (maximally centralized). In 2002, the score for the global Internet, using number of direct connections as the index of a country’s centrality, was 0.64. Moving forward to 2014, the median number of direct connections had more than doubled to five. Great Britain now had the largest number of direct connections (100). Importantly, the Freeman centralization score in 2014 had dropped to 0.45, indicating increased decentralization with regard to the number of direct connections held by countries connected in the global Internet.
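The centralization index described above can be illustrated with a minimal Python sketch. It assumes degree (number of direct connections) is the centrality index, as in the text, and uses the standard normalization for an undirected graph with at least three nodes. The toy star and ring networks and the function name are hypothetical illustrations, not the authors' data or code.

```python
def degree_centralization(edges, nodes):
    """Freeman degree centralization for an undirected simple graph (n >= 3)."""
    degree = {node: 0 for node in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(nodes)
    max_degree = max(degree.values())
    # Sum of the gaps between the most central node and every node.
    observed = sum(max_degree - d for d in degree.values())
    # A star graph attains the theoretical maximum, (n - 1)(n - 2).
    theoretical_max = (n - 1) * (n - 2)
    return observed / theoretical_max


# Hypothetical toy networks: one hub connected to everyone vs. a ring.
star = [("US", "A"), ("US", "B"), ("US", "C"), ("US", "D")]
ring = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "A")]
star_score = degree_centralization(star, ["US", "A", "B", "C", "D"])
ring_score = degree_centralization(ring, ["A", "B", "C", "D", "E"])
```

The star yields the maximum score and the ring the minimum, which brackets the 0.64 (2002) and 0.45 (2014) values reported in the text.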

In this paper our interest is in considering some of the variables associated with these structural changes of the global Internet. In doing so, we apply Exponential Random Graph (ERG) modeling, which has gained attention in the field of communication (Song, 2015; Shumate and Palazzolo, 2010; Welles and Contractor, 2015). ERGM is used to examine interdependent mechanisms that affect how network ties are created, sustained, and dissolved, and to analyze the statistical likelihood that a particular network structure will be observed. While ERGM was initially developed in the 1990s (Pattison and Wasserman, 1999), it has taken off in recent years due to excellent expository texts along with packages for use in statistical environments such as R (Shumate and Palazzolo, 2010; Harris, 2014; Kolaczyk and Csárdi, 2014; Butts, 2016; Hunter, et al., 2008).

Welles and Contractor (2015) applied ERGM to analyzing how individual and network-level factors influence the formation of online social relationships. They found that the amount of time spent online was positively associated with the likelihood of online relationship formation. Another significant predictor of the emergence of online relationship ties in their study was network pressure toward balance, “individuals tending to form relationships with others who have relationships in common.” [2] Song (2015) used ERGM to examine what factors influence structural properties of discussion networks.

Previous studies of democracy and Internet adoption examined whether forms of political institutions influence the government’s Internet adoption or whether the government’s Internet adoption affects its democratic processes (Guillén and Suárez, 2005; Seo and Thorson, 2012; Kedzie, 1997; Milner, 2006; Norris, 2001; Rød and Weidmann, 2015). With empirical and theoretical studies in this area putting forward somewhat conflicting arguments, there are two primary schools of thought. The first group contends that the type of political institution is closely associated with the country’s adoption of the Internet and that democratic countries are more likely to adopt the Internet (Guillén and Suárez, 2005; Kedzie, 1997; Milner, 2006). For example, based on data from 1991 to 2001 on about 200 countries, Milner (2006) concluded that democracies adopt the Internet at a much faster pace than autocracies. Similarly, Guillén and Suárez (2005) showed that a country’s democracy score was significantly associated with the number of Internet users and hosts in that country.

In comparison, the second group suggests that Internet adoption has little to do with advancements in democracy, arguing that the association between the Internet and democratic development has been overestimated (Rød and Weidmann, 2015). Based on an empirical analysis of Internet use in authoritarian regimes from 1993 to 2010, Rød and Weidmann (2015) concluded that movements toward democracy were more frequent in autocratic countries with low Internet penetration. According to that study, autocracies which are concerned with the domestic information environment tend to introduce the Internet more actively than other autocracies.

While regime type has been examined, press freedom has rarely been studied in this context. Using dyadic bandwidth between countries, Seo and Thorson (2017) found that a country’s regime type influences a given country’s Internet adoption and that democratic countries dominate the global Internet with respect to bandwidth over direct shared connections in both 2002 and 2014. Their analyses concluded that liberal democracies and economically “central” countries played an important role in global Internet connectedness. While this study is helpful in enhancing our understanding of the effects of regime type on bandwidth over direct country-to-country connections, it did not take into account press freedom, which is increasingly intertwined with Internet use, nor did that study attempt to explain which countries were connected and which were not. Instead, it took the structure (pattern of connections) as a given and looked at the impact of regime type on the amount of bandwidth over those connections.

Examining press freedom, not just regime type, is important as a country’s press freedom can improve or deteriorate despite a country maintaining the same democratic status in terms of political institutions. For example, South Korea’s press freedom ratings significantly plunged during the Lee Myung-bak and Park Geun-hye administrations, primarily due to their attempts to censor online content. Our study aims to address this gap using the press freedom index developed by Freedom House, a U.S.-based non-governmental organization for promoting democracy and press freedom around the world. Freedom House publishes its Freedom of the Press report annually, evaluating the status of press freedom in countries across the globe. The press freedom rating — free, partly free, or not free — is based on scores each country receives as a result of the legal, political and economic environments in which media operate (Freedom House, 2015).

The Freedom of the Press index has been widely studied in social science research (Alam and Ali Shah, 2013; Tran, et al., 2011; Whitten-Woodring, 2009; Karlekar and Becker, 2014). For example, in their study of 115 countries over the period of 2002–2010, Alam and Ali Shah (2013) found a positive correlation between press freedom and economic growth. These authors analyzed foreign direct investment (FDI) and gross domestic product (GDP) along with the press freedom index. Karlekar and Becker (2014) conducted a systematic investigation of the relationship between democracy and press freedom, with a particular focus on Freedom House measures. They found a strong positive correlation between press freedom and democracy measures and concluded that, “Most often, changes in the state of media freedom have happened in tandem with changes in broader freedoms, therefore making it a sensitive indicator of the overall health of a democracy.” [3] In the case of the 2014 global Internet, as discussed in the Method section, Spearman’s ρ between the dichotomous regime type variable and press freedom status is .73, which is generally consistent with findings from Karlekar and Becker (2014). Sobel, et al. (2010) looked at the extent to which press freedom in one country spills over to neighboring countries. Using time series analysis on a sample of 102 countries over the 1994 to 2003 period, they concluded that “a country catches approximately one-fifth of the press freedom of its neighbors.” [4] Neighbor here refers to countries sharing a land border. The global Internet has arguably enabled countries some distance from one another to become information neighbors. If so, we might expect press freedom homophily to be associated with direct Internet ties.




Our primary analysis examines relations between the structure of Internet connections between countries in 2014 [5] and economic performance (Gross National Income), number of domestic Internet users per capita, and homophily with regard to (i) geographic location; (ii) regime type (democratic or non-democratic); and, (iii) the degree to which the country has a free press. We combine data from several sources. Values for country population (used in per capita calculations), gross national income, GNI [6], and Internet users per 100 population measures are from the World Bank (2016) [7]. Regime type (RegimeType) is calculated from data provided by the Institut for Statskundskab, Aarhus Universitet (2016). Country global Internet connectivity measures are from capacity data curated by TeleGeography (2016) and purchased from them. In addition, international Internet bandwidth is considered a useful and relevant indicator of global Internet connectedness, especially as previous research has shown that Internet bandwidth and Internet traffic tend to correlate (Barnett and Park, 2005). Finally, the press freedom measure is from Freedom House (2015). The primary reason for using data from 2002 to 2014 for this study is that complete datasets for all the variables analyzed were available to us for the time period.

Regime type measurements by Skaaning, et al. (2015) focus on electoral participation and code country government types into one of seven lexical categories: (i) non-electoral regimes; (ii) one- and no-party regimes; (iii) non-parliamentary constitutional monarchies; (iv) limited multi-party authoritarian regimes; (v) exclusive democracies; (vi) male democracies; and, (vii) electoral democracies. During the 2002–2014 period there were no countries coded as exclusive or male-only democracies, reflecting widespread near universal suffrage. In this paper we dichotomize the regime type variable into two categories: (i) democratic (lexical categories v-vii); and, (ii) non-democratic (lexical categories i-iv). Example countries with their regime type classification in 2014 include Iran (non-democratic), Egypt (non-democratic), China (non-democratic), Russian Federation (non-democratic), Indonesia (democratic), Republic of Korea (democratic), and France (democratic). Our dichotomized variable corresponds, over the 2002–2014 period, to whether a country is an electoral democracy where “leaders are selected through contested elections held periodically before a broad electorate.” [8]

As mentioned, we use the Freedom House press freedom measure [9]. In 2014, the measure included 197 countries. The press freedom score for each country is based upon answers from a panel of experts to 23 questions divided into legal, political, and economic categories. The result ultimately yields a country numeric score ranging between 0 (highest degree of press freedom) and 100 (lowest degree of press freedom). These numeric scores are then assigned categorical ones where a country’s media is classified as Free (numeric score less than 31), Partly Free (numeric score 31 to 60), or Not Free (numeric score 61 to 100). We use the Freedom House press freedom categorical measure (“free”, “partly free”, and “not free”) rather than its underlying numeric values (which range from 0 to 100). There are several reasons for this. Most importantly, our interest is in the impact of homophily, and the numeric scores make little sense for that purpose: we would be forced to treat two countries with scores of 10 and 95 as being, with respect to homophily, the same as two countries with scores of 10 and 11. That is, in neither case would they match, since the numeric scores are not identical. In addition, the panel of experts determining press freedom scores every year pays most attention to category changes. For example, changing a country’s press freedom status from “free” to “partly free” requires deeper discussions among the experts participating in the annual review. In contrast, changes in numeric measures that do not result in a category change do not always receive the same level of scrutiny. This insight is based on one of the authors’ participation in Freedom House meetings and interviews with panelists who have contributed to press freedom scores.
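The argument for categories over numeric scores can be made concrete with a short Python sketch. It uses the category cut-offs stated above and assumes integer scores; the function names are hypothetical and the example scores are the ones used in the text, not actual country ratings.

```python
def press_freedom_category(score):
    """Map an integer 0-100 score to the categorical status described in the text."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score < 31:
        return "Free"
    if score <= 60:
        return "Partly Free"
    return "Not Free"


def same_status(score_a, score_b):
    """Homophily on the categorical measure: do both scores fall in one category?"""
    return press_freedom_category(score_a) == press_freedom_category(score_b)


# Exact matching on raw scores treats both pairs as mismatches...
raw_match_close = (10 == 11)
raw_match_far = (10 == 95)
# ...while the categorical measure separates them.
cat_match_close = same_status(10, 11)
cat_match_far = same_status(10, 95)
```

Under exact matching on raw scores both pairs fail to match, whereas the categorical measure distinguishes the near pair (both Free) from the distant pair (Free versus Not Free).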

Our focus is on understanding the underlying structure of direct Internet connections between countries and the extent to which both country-specific social and political variables and how countries relate with one another can account for that structure. To investigate this question we will examine several theoretically informed Exponential Random Graph (ERG) models accounting for the tie structure of the global Internet. Structure here is represented by a graph whose nodes are countries with at least one measurable Internet connection to at least one other country and whose edges are the direct Internet connections between all pairs of countries in 2014 [10]. More specifically, we assume the global Internet in a given year is comprised of a fixed set of n countries (nodes) together with the data links connecting them represented by the n × n adjacency matrix A where:


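\[ A_{i,j} = \begin{cases} 1 & \text{if countries } i \text{ and } j \text{ share a direct Internet connection} \\ 0 & \text{otherwise} \end{cases} \tag{1} \]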


Information can flow in either direction over an Internet connection and thus the graph is undirected and A is symmetric ($A_{i,j} = A_{j,i}$) [11]. Our objective then is to estimate a stochastic model of A which also does a reasonable job accounting for other general properties of the observed Internet, such as degree distribution, shared links, and geodesic distances.
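As a minimal illustration of this representation, the following Python sketch builds a symmetric 0/1 adjacency matrix from a list of undirected links. The country codes and links are hypothetical placeholders rather than values from the TeleGeography data, and the function name is an assumption made for the example.

```python
def adjacency_matrix(countries, links):
    """Build a symmetric 0/1 matrix A with A[i][j] = 1 when i and j share a link."""
    index = {country: position for position, country in enumerate(countries)}
    n = len(countries)
    A = [[0] * n for _ in range(n)]
    for a, b in links:
        i, j = index[a], index[b]
        # The graph is undirected, so set both entries.
        A[i][j] = 1
        A[j][i] = 1
    return A


# Hypothetical three-country example.
countries = ["X", "Y", "Z"]
A = adjacency_matrix(countries, [("X", "Y"), ("Y", "Z")])
# A country's degree is its row sum.
degrees = [sum(row) for row in A]
```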

ERG models, sometimes referred to as p-star models, assume that any given network is a realization of an underlying statistical process. In this sense, our 2014 network is the result of sampling from a distribution of networks each of which was generated by the same statistical model. Using ERG techniques is a process involving several steps. First, we use existing data to generate a “best” approximation of that statistical process. We then use bootstrapping to assess how likely we are to generate simulated networks with macro network-level properties of the sort observed in our 2014 network assuming our observed dataset is not an extreme outlier in the actual underlying generating distribution.
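For readers less familiar with the model class, the general form of an ERG model can be written as follows. This is the standard textbook formulation rather than a specification taken from this study: $g(a)$ denotes a vector of network statistics (for example, counts of ties or of ties between matching countries), $\theta$ the corresponding coefficients, and $\kappa(\theta)$ the normalizing constant summing over all possible networks on the same node set.

```latex
P_{\theta}(A = a) = \frac{\exp\{\theta^{\top} g(a)\}}{\kappa(\theta)},
\qquad
\kappa(\theta) = \sum_{a'} \exp\{\theta^{\top} g(a')\}

% Conditional log-odds of a single tie, holding the rest of the network fixed:
\operatorname{logit} P_{\theta}\!\left(A_{i,j} = 1 \mid A_{-(i,j)}\right)
  = \theta^{\top} \left[ g(a^{+}_{i,j}) - g(a^{-}_{i,j}) \right]
```

Here $a^{+}_{i,j}$ and $a^{-}_{i,j}$ denote the network with the tie between $i$ and $j$ present and absent, respectively. The second expression is why coefficients can be read as changes in the log-odds of a tie.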

What follows considers the 162-country subset for which we have data on all measured variables used in estimating the various models. This compares with the total of 201 countries with positive direct Internet bandwidth connections to at least one other country in 2014. Countries with no missing data comprise 93.71 percent of the total 2014 global bandwidth and 100 percent of all the edges (direct Internet connections) in the full dataset. The countries excluded from our analysis tend to be less connected (low degree) countries. Figure 1 displays the full structure, with countries not included in our ERGM analyses shown in red along with their country names. Figure 2 illustrates the geographic distribution of press freedom status for the 162 countries in our 2014 analysis subset.


Figure 1: Global Internet full structure with dropped nodes in red and labelled: 2014.



Figure 2: Freedom House status (analysis country subset): 2014.


Previous work has shown that the evolution of the global Internet exhibits considerable path dependence associated with a country’s economic and political situation (Barnett, 2001; Barnett and Park, 2005; Seo and Thorson, 2016). These studies used country-level measures such as political regime-type, economic output, and domestic Internet penetration to account for variations in global bandwidth generally assuming that the attribute values for a given country are independent of those for other countries. We build on these past analyses by initially considering three country-level attribute variables — total global Internet bandwidth, gross national income (GNI), and domestic Internet users per 100 population. Given the heavy right tail of both total bandwidth and GNI, we use the natural log of these variables.

Our goal is to model whether any two countries share a direct Internet connection. To accomplish this, we go beyond node attribute values to explore the degree to which shared characteristics, or homophily, add to our ability to predict a direct Internet link between country pairs. So, for example, are two democratic countries more likely to share a direct global Internet connection, holding constant all other measured variables, than are democratic and non-democratic countries?

Direct Internet connections require capital investment and it seems reasonable to expect that countries near to one another may be more likely, again ceteris paribus, to share a connection than would country pairs at a considerable geographic distance. We use continent as a proxy for this and expect continent homophily (country dyads in the same continent) to be predictive of direct connections [12].

Regime type is another homophily variable. Are countries with identical regime types (democratic or non-democratic) more likely, ceteris paribus, to share a direct link? The Internet was based on packet switching research conducted in the U.S., France, and Great Britain, all democracies, in the 1960s. This research led to the ARPANET in 1969 and by the 1980s inter-networking connected countries including Western European democracies, Australia, and, in Asia, first Japan and then South Korea. By 2002, the first year of our dataset, the global Internet connected 187 countries.

The early mover advantage in terms of connectedness as measured by the degree, number of direct global Internet ties with other countries for a given country, can be seen by comparing 2002 with 2014. Of the 170 countries for which we had regime type measures in 2002, 60 percent were classified as democratic. Importantly, the democratic countries were considerably more connected to the global Internet than were the non-democratic ones. The mean degree (number of direct connections) for democratic countries was 7.6 (median = 3) as compared to a mean degree for non-democratic countries of 3.2 (median = 2) [13].

As mentioned earlier, by 2014 the entire global Internet had become more decentralized with more countries having more direct data ties to other countries. Of the 175 countries for which we had regime type measures in 2014, 66.3 percent were classified as democratic, roughly the same as in 2002. The mean degree (number of direct connections) for democratic countries was 10.6 (median = 6) as compared to a mean degree for non-democratic countries of 8.2 (median = 5). While both democratic and non-democratic countries were, on average, more connected than in 2002, the differences had decreased. Thus we might expect regime type homophily to matter more in the earlier days of the global Internet than it would by 2014.

The final homophily variable we consider is press freedom. Are countries with the same press freedom categorization more likely, ceteris paribus, to share a direct connection? This might seem an odd expectation. Firstly, if the critique of the Freedom House measure that it simply reflects Western liberal values is correct, then we would anticipate regime type and press freedom to be highly confounded. However, it is interesting to look at Figure 3. This shows the network structure of our 2014 network. Here the nodes (countries) are colored as follows: blue border indicates that the regime type is democratic, red border indicates the regime type is not democratic, blue interior color indicates press freedom status is free, red interior color indicates press freedom status is not free, and, finally, orange interior color indicates press freedom status is partly free. In trying to distinguish between regime type homophily and press freedom status homophily we are interested in how predictive border colors or interior colors are of an edge or link between any two countries.


Figure 3: Global Internet structure 2014 after dropping nodes (blue border indicates that the regime type is democratic, red border indicates the regime type is not democratic, blue interior color indicates press freedom status is free, red interior color indicates press freedom status is not free, and orange interior color indicates press freedom status is partly free).


To summarize, we examine whether plausible models of the tie structure of the global Internet can be estimated using domestic Internet users per 100, log(global bandwidth), and log(GNI) as attribute covariates and regime type, UN continent, and press freedom status as homophily variables.




Initially, two ERG models were estimated [14]. In all models, the dependent variable was whether or not a direct tie existed between country pairs. The main effects model for 2014 focused only on country attribute variables. The independent attribute variables for each country were (i) domestic Internet users per 100; (ii) total global bandwidth (logged); and, (iii) GNI (logged). A summary of results is in Table 1. Since a tie (direct connection) either exists or does not exist between any two countries, the coefficients reflect the change in the log-odds likelihood of a direct connection for a unit change in the predictor variable. Standard errors are shown below the coefficients. In the main effects ERG model, the domestic Internet users per 100 was not statistically significant. However, both the GNI and bandwidth variables were statistically meaningful predictors.

All three attribute variables are treated as continuous as indicated by the nodecov preceding their names in Table 1. Using nodecov adds a statistic to the model equal to the sum of the attribute value for each dyad. For bandwidth the sum of the total global bandwidth (logged) for each pair of countries in the network would be added to the bandwidth statistic. This means that a large bandwidth-low bandwidth pair would look similar to a pair in which each country had moderate total bandwidth. It is also worth noting again that bandwidth here refers to each country’s total global bandwidth and not to the shared bandwidth over the pair if there is a tie. Empirically this seems plausible in that we do tend to observe large bandwidth (or large GNI) countries connecting both to each other and to smaller ones. Basically, large bandwidth (or large GNI) countries tend also to be higher degree countries.
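The behavior of such a summed covariate can be sketched in a few lines of Python. This is an illustration of the statistic as described above, not the ergm package's implementation; the function name and the logged bandwidth values are hypothetical.

```python
def nodecov_statistic(edges, attribute):
    """Sum attribute[i] + attribute[j] over every tie (i, j) in the network."""
    return sum(attribute[i] + attribute[j] for i, j in edges)


# Hypothetical logged total bandwidth values for four countries.
ln_bw = {"High": 10.0, "Low": 2.0, "MidA": 6.0, "MidB": 6.0}

# A high-low tie and a moderate-moderate tie contribute the same amount.
high_low = nodecov_statistic([("High", "Low")], ln_bw)
mid_mid = nodecov_statistic([("MidA", "MidB")], ln_bw)
both = nodecov_statistic([("High", "Low"), ("MidA", "MidB")], ln_bw)
```

Because only the sum enters the statistic, the model cannot distinguish the two kinds of pairs on this term alone, which is the point made in the paragraph above.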


Table 1: 2014 Main effects only.
Term              2014 network (exponential family random graph)
nodecov.ln BW     0.2540***
nodecov.ln gni    0.1970***
Note: *p < .05; **p < .01; ***p < .001.


ERG model results are generally easier to interpret in terms of odds ratios which show any improvement a predictor variable provides within the estimated model. A coefficient is transformed to an odds ratio via an exponential transformation. So if the coefficient is $\hat{\theta}$, the odds ratio will be $e^{\hat{\theta}}$. 95 percent confidence intervals are calculated from the coefficient’s estimated standard error ($\widehat{se}$).


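\[ \left[\, e^{\hat{\theta} - 1.96\,\widehat{se}},\; e^{\hat{\theta} + 1.96\,\widehat{se}} \,\right] \tag{2} \]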


In what follows, we will summarize each model by providing odds ratios for the predictor variables using the 95 percent confidence interval. If a variable’s odds ratio is above 1.0 throughout 95 percent confidence interval range, we say it significantly affects our ability to predict a direct Internet connection between two countries. In the case of the homophily variables, the odds ratio can be interpreted as indicating the change in odds of a direct tie assuming a match on the variable between two countries and all other variables held constant. Table 2 shows the information in Table 1 but now as odds ratios.

In this model, the bandwidth variable (odds ratio = 1.29 with 95 percent interval 1.21 to 1.37) and the GNI variable (odds ratio = 1.22 with 95 percent interval 1.15 to 1.29) were significant. The estimated odds ratio of 1.29 for bandwidth can be interpreted as saying that the odds of a tie between two countries are multiplied by 1.29 (a 29 percent increase) for every unit increase in (logged) global bandwidth.
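The transformation from a coefficient to an odds ratio with a 95 percent interval can be sketched as follows. The coefficient 0.2540 is the bandwidth estimate from Table 1. The standard error of 0.031 is an illustrative assumption, since the standard errors are not reproduced in this text, and the normal-approximation multiplier 1.96 is likewise assumed.

```python
import math


def odds_ratio_ci(coef, std_err, z=1.96):
    """Return (lower, odds ratio, upper) from a log-odds coefficient."""
    lower = math.exp(coef - z * std_err)
    point = math.exp(coef)
    upper = math.exp(coef + z * std_err)
    return lower, point, upper


# Bandwidth coefficient from Table 1; the standard error is hypothetical.
lower, point, upper = odds_ratio_ci(0.2540, 0.031)
is_significant = lower > 1.0  # interval lies entirely above 1.0
```

With these inputs the rounded values agree with the bandwidth row of Table 2 (1.21, 1.29, 1.37), which suggests the assumed standard error is of roughly the right size.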


Table 2: 2014 odds 95% CI main effects model.
Term              Lower 95%    Odds ratio    Upper 95%
nodecov.ln BW     1.21         1.29          1.37
nodecov.ln gni    1.15         1.22          1.29


The homophily model drops the Internet users per 100 country variable (since it was not a significant predictor in the main effects model) and adds as second order effects three homophily variables: (i) CONTINENT (to reflect that it may be easier for a country to connect to countries on the same continent); (ii) regime type (coded as democratic or non-democratic); and, (iii) Freedom House status (coded as free, partly free, or not free). The three homophily variables are discrete and tested for exact value matching (nodematch in model language) with odds results shown in Table 3. The country-level variables remained significant. Not surprisingly, of the homophily variables, CONTINENT (p < .001) had the highest estimated odds ratio, indicating that knowing two countries are on the same continent increases the odds of their sharing a direct connection by a factor of a little over 4. Press freedom status homophily (p < .01) was also significant, indicating that if two countries have the same press freedom status, the odds of a direct connection increase by 28 percent, again assuming that all other variables are unchanged [15]. Importantly, the regime type variable was not significant.
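The exact-matching statistic behind each homophily term amounts to counting ties whose two endpoints share a value. The Python sketch below illustrates that count; it is not the ergm package's code, and the country labels, statuses, and ties are hypothetical.

```python
def nodematch_statistic(edges, attribute):
    """Count ties whose two endpoints have identical attribute values."""
    return sum(1 for i, j in edges if attribute[i] == attribute[j])


# Hypothetical press freedom statuses and ties.
status = {"P": "free", "Q": "free", "R": "partly free", "S": "not free"}
ties = [("P", "Q"), ("P", "R"), ("R", "S"), ("Q", "S")]

matching_ties = nodematch_statistic(ties, status)
share_matching = matching_ties / len(ties)
```

A positive coefficient on such a term means that, other things equal, a tie that would raise this count is more likely than one that would not.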

It is clear that homophily with respect to CONTINENT is by far the most predictive, followed by Freedom House status homophily. Looked at in terms of probability, if two countries are both on the same continent and have the same press freedom status, the probability of their sharing a direct connection is 0.84, all other things being equal. If we only know that two countries have the same press freedom status, then the probability of their sharing a direct connection is 0.56. And, finally, if all we know is that the countries are on the same continent, the probability of their sharing a connection is 0.81. These conditional probability estimates are obtained by applying an inverse logit transform to the estimated coefficients, $p = 1 / (1 + e^{-\hat{\theta}})$.
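The inverse logit calculation can be sketched in Python as follows. The press freedom odds ratio of 1.28 comes from the text. The continent odds ratio of 4.2 is an assumed stand-in for "a little over 4", since the exact Table 3 value is not reproduced here. As in the text, the calculation applies the transform to the homophily coefficients alone and ignores the other model terms.

```python
import math


def inverse_logit(log_odds):
    """Convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


def match_probability(*odds_ratios):
    """Coefficients add on the log-odds scale, so odds ratios multiply."""
    log_odds = sum(math.log(ratio) for ratio in odds_ratios)
    return inverse_logit(log_odds)


PRESS_FREEDOM_OR = 1.28  # reported in the text
CONTINENT_OR = 4.2       # assumption: "a little over 4"

press_only = match_probability(PRESS_FREEDOM_OR)
continent_only = match_probability(CONTINENT_OR)
both = match_probability(CONTINENT_OR, PRESS_FREEDOM_OR)
```

Under the assumed continent value, the three results round to 0.56, 0.81, and 0.84, in line with the probabilities quoted above.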

Homophily means that connections form between countries who share values on the measured characteristic. A possibly confounding effect would be transitivity wherein countries who have connections in common will develop direct connections with one another. To test for this we used the (ergm) term gwesp (Robins, et al., 2007) to check for geometrically weighted edgewise shared partners by estimating a 2014 model using only gwesp and then the homophily model adding a gwesp term. While the gwesp term was statistically significant in both these models, the impact of both the covariates and the homophily terms remained predictive, within the 95 percent confidence interval, as in the base homophily model. We used .25 as the scaling parameter in the gwesp term. While we conclude that homophily matters, we did not identify an “optimal” scaling factor [16].
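To make the transitivity term concrete, the sketch below computes a geometrically weighted edgewise shared partner statistic in Python using the commonly cited closed form, $e^{\alpha} \sum_k [1 - (1 - e^{-\alpha})^k]\, EP_k$, where $EP_k$ is the number of ties whose endpoints share exactly $k$ partners. It assumes the .25 scaling parameter mentioned above plays the role of $\alpha$ and that each undirected tie is listed once. It is an illustration, not the ergm package's implementation, and the toy networks are hypothetical.

```python
import math


def edgewise_shared_partners(edges):
    """Return {k: number of ties whose endpoints share exactly k partners}."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)
    counts = {}
    for u, v in edges:
        k = len(neighbors[u] & neighbors[v])
        counts[k] = counts.get(k, 0) + 1
    return counts


def gwesp(edges, decay=0.25):
    """Geometrically weighted edgewise shared partner statistic."""
    counts = edgewise_shared_partners(edges)
    weighted = sum(
        (1.0 - (1.0 - math.exp(-decay)) ** k) * n_ties
        for k, n_ties in counts.items()
    )
    return math.exp(decay) * weighted


triangle = [("A", "B"), ("B", "C"), ("A", "C")]
path = [("A", "B"), ("B", "C")]
gwesp_triangle = gwesp(triangle)
gwesp_path = gwesp(path)
```

Ties with no shared partners contribute nothing, and each additional shared partner adds a geometrically shrinking amount, so the statistic rewards closed triads without letting highly embedded ties dominate.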


Table 3: 2014 odds 95% CI homophily model.


Comparing AIC scores, the AIC for the homophily model (4071) is lower than that for the no-homophily model (4362). Thus the homophily model is preferred, and we consider its goodness of fit as shown in Figure 4. Here the bold line represents the observed data, plotted against data simulated from the estimated model. The fit looks fairly good for the degree distribution, though the model underestimates the number of edge-wise shared partners. We also looked at goodness of fit with the gwesp term included. Results were similar, though AIC was lowered to 3794, and the number of edge-wise shared partners continued to be underestimated.
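The model comparison can be summarized with a few lines of Python that rank the three reported AIC values and express each as a difference from the lowest. The AIC values come from the text; the dictionary keys are informal labels introduced here for readability.

```python
# AIC values reported in the text for the 2014 models.
aic = {
    "no homophily": 4362,
    "homophily": 4071,
    "homophily + gwesp": 3794,
}

# Lower AIC indicates the preferred model.
best = min(aic, key=aic.get)
# Difference of each model from the lowest AIC; larger means less support.
delta = {name: value - aic[best] for name, value in aic.items()}

print(best, delta)
```

A lower AIC on its own does not guarantee an adequate fit, which is why the goodness-of-fit comparison against simulated networks is still examined.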


Figure 4: 2014 homophily model GOF.


While the negative results for regime type may seem surprising, it is important to keep in mind that our results are with regard to the existence (or non-existence) of a direct tie between countries. When it comes to total global bandwidth, regime type clearly matters as shown in Table 4. The mean global bandwidth for democratic countries is almost four times greater than the mean for non-democracies.

To examine this further, we estimated the homophily model for each year in the 2002–2014 period of our dataset. The results, not shown, were that press freedom status was predictive in 2011, 2012, 2013, and 2014 but not before. The regime type variable was predictive of direct ties in 2002, 2003, and 2004. The country attribute variables remained predictive in every year for which we have data (2002–2014), as did the CONTINENT homophily variable. We have no plausible theoretical explanation for the recent significance of the press freedom variable. As far as we are aware, Freedom House made no methodological changes in its press freedom measure post 2002 [17]. It is always a possibility that the data for any given year are anomalous in some sense. However, the fact that results seem patterned and consistent over the past four years argues against 2014 being an outlier. Finally, recall that ERGM is most effective when there are no missing data, and our data set includes only those countries for which we have valid measures on all model variables. A consequence is that 39 countries, mostly of smaller degree and bandwidth, were excluded. Countries with missing data in 2014 and thus not included can be seen in Figure 1.


Table 4: Global bandwidth by regime type.
regime type      mean Tbps   median Tbps   total Tbps   n
Not democratic   0.27        0.01          13.32        50




Discussion and conclusions

Traditional news intermediaries including governments and mainstream media outlets are no longer exclusive gatekeepers, as social media platforms including Twitter, Weibo, and Facebook permit people to share news and information directly with one another. That said, not everyone has easy access to the global Internet, a necessary condition for using social media. Country-level Internet infrastructure and policy play an important role in this, as demonstrated by the North Korean government’s policies denying Internet access to most of its citizens and by the Chinese government’s general restriction of access to Google and most of its services. At the same time, a country’s press freedom status provides an important snapshot of the country’s legal, political, and economic conditions for citizens’ access to information, which is increasingly available online. Even within countries of a similar regime type (e.g., democratic countries), levels of press freedom can vary significantly.

Our results demonstrate that press freedom homophily is significantly predictive of direct Internet connections even in the context of geographic location and whether or not a country is democratic. That is, countries exhibiting homophily with regard to press freedom are more likely to share direct country-to-country Internet connections even after controlling for their economic performance, political regime type, and geographical location. Specifically, our study showed that press freedom status emerged as an important predictor of direct Internet ties between countries in 2011 and remained so until 2014, the final year of data analysis in this study. The regime type variable was a significant predictor in 2002, 2003, and 2004.

Our research offers several important scholarly and policy implications. First, this study provides important insights into the underlying structure of direct Internet connections between countries. By analyzing regime type and press freedom along with other relevant variables such as geographic location and economic development, this study advances research on global Internet connectedness. In addition, our results may increase understanding of the relationship between press freedom and democracy. While these variables are often found to be highly correlated (Karlekar and Becker, 2014), in this paper we find (i) that the two variables play distinct roles in predicting the existence of direct global Internet connections; and, (ii) that the specific nature of these roles has changed over the 2002–2014 period.

Second, our analysis of press freedom should be informative for scholars who examine how press freedom might spread across the globe. This study suggests that more attention needs to be paid to associations between press freedom homophily and direct Internet ties between countries. Press freedom has rarely been examined in the context of global Internet connections, and thus our study fills a gap in the literature. Moreover, the one past study of press freedom contagion that we identified (Sobel, et al., 2010) concluded that press freedom spread to nearby countries. In the age of the global Internet, “nearness” might be extended to countries sharing direct connections. If so, our positive results related to the impact of both edgewise shared partners and press freedom homophily appear consistent with a contagion effect. Consequently, our study helps scholars better understand the role of the Internet and press freedom in democratic changes.

Third, the ERG approach used in this study should be helpful for future research in related areas. Our study contributes to advancing computational methods in communication research by demonstrating how ERG approaches can be used to study democracy, press freedom, global Internet connectivity, and other related variables. ERGM has gained increased attention in the field of communication (Song, 2015; Shumate and Palazzolo, 2010; Welles and Contractor, 2015) but is still new to many scholars in the field.

Moreover, our findings should be helpful for policy-makers in the area of Internet connectivity and for organizations that analyze and advocate for press freedom, such as Freedom House and Reporters Without Borders.

Future research should provide a more in-depth look at relationships between the variables each year. In addition, obtaining more complete datasets for countries is important, though this problem probably will not be solved anytime soon. As mentioned before, missing data in our datasets prevented us from analyzing all countries. While the countries excluded due to missing data tended to be low bandwidth and low degree, we cannot rule out that having complete measures on those countries would affect our results.

Finally, as with any modeling technique, ERGM can be extremely sensitive to what might appear to be fairly minor tweaks to the model being estimated (Shalizi and Rinaldo, 2013). This becomes especially noticeable when adding variables to the model to account for macro features such as shared edges, triangles, or degree. These and other model parameters can be adjusted to see if, say, AIC scores get lower. There is always the danger of overfitting by modeling noise in the data. In this paper we have kept the models simple and made every effort to develop them in a manner informed by available literature.


About the authors

Hyunjin Seo is Associate Professor and Docking Faculty Scholar in the School of Journalism and Mass Communications at the University of Kansas and a Fellow at the Berkman Klein Center for Internet & Society at Harvard University.
E-mail: hseo [at] ku [dot] edu

Stuart J. Thorson is Professor Emeritus in the Maxwell School at Syracuse University.
E-mail: thorson [at] syr [dot] edu



Notes

1. We use the existence of a direct positive bandwidth capacity link between two countries to indicate those two countries are connected even if it were to be the case that no traffic flowed over that connection in a given time period. More on this in the Method section.

2. Welles and Contractor, 2015, p. 180.

3. Karlekar and Becker, 2014, p. 32.

4. Sobel, et al., 2010, p. 141.

5. Though our focus in this paper is on 2014, reference will also be made to our entire dataset which extends back to 2002.

6. GNI is gross domestic product plus income received by residents from abroad minus income earned domestically by non-residents. We use the World Bank’s PPP (purchasing power parity) GNI to facilitate cross-national comparisons.

7. We consider domestic Internet users as a proxy for how accessible the Internet is to citizens. As a consequence, we normalize it by the total population. Thus, for example, we would not want large population countries where a small proportion of the population uses the Internet to look “better” than a much smaller country with a high proportion of its population using the Internet. GNI, on the other hand, we use as a proxy for the economic strength of a country and thus do not normalize it by population.

8. Skaaning, et al., 2015, p. 1,495.

9. The Freedom House press freedom measure has been criticized both for being biased in favor of Western liberal values and for paying insufficient attention to new media such as blogs. Excellent summaries of these critiques are in Burgess (2010) and Becker, et al. (2007).

10. While we have 2015 data for most countries, global Internet bandwidth capacity data are often revised during the year following publication. Thus 2014 is the most recent year for which the measured values can be assumed reliable.

11. While we have bandwidth capacity for each data link, we are not using that information in this analysis and are treating the network as unweighted.

12. For this purpose we use the United Nations continent definitions. While UN region might seem a more fine-grained measure, it fails to match countries that share a land border but are classified in different regions. An example would be the Czech Republic (Eastern Europe) and Austria (Western Europe).

13. The large differences between means and medians reflect the heavy right tail of the degree distribution. As an example, in 2002, the U.S. was the largest degree country with degree 124.

14. Models were estimated in R (R Core Team, 2016) using the sna and ergm packages (Butts, 2016; Hunter, et al., 2008).

15. As a point of comparison, the Las Vegas house advantage on roulette is estimated to be 5.26 percent.

16. The value of .25 is typical in the literature we have reviewed. Model estimation takes considerably more computational time with the gwesp term specified. We attempted to identify a “best” value for the gwesp term using tools provided within ergm; however, the estimate did not converge in over a week of run time on a 4 GHz iMac with 32GB of memory and increased memory.size allocation for R.

17. Burgess (2010, p. 9) reports that Freedom House made changes in its press freedom measure in 1989, 1994, 1997, 1999, and 2002.



References

Abdullah Alam and Syed Zulfiqar Ali Shah, 2013. “The role of press freedom in economic development: A global perspective,” Journal of Media Economics, volume 26, number 1, pp. 4–20.
doi:, accessed 20 August 2019.

George A. Barnett, 2001. “A longitudinal analysis of the international telecommunication network, 1978–1996,” American Behavioral Scientist, volume 44, number 10, pp. 1,638–1,655.
doi:, accessed 20 August 2019.

George A. Barnett and Han Woo Park, 2005. “The structure of international Internet hyperlinks and bilateral bandwidth,” Annals of Telecommunications, volume 60, numbers 9–10, pp. 1,110–1,127.
doi:, accessed 20 August 2019.

Lee B. Becker, Tudor Vlad, and Nancy Nusser, 2007. “An evaluation of press freedom indicators,” International Communication Gazette, volume 69, number 1, pp. 5–28.
doi:, accessed 20 August 2019.

John Burgess, 2010. “Evaluating the evaluators: Media freedom indexes and what they measure,” Center for International Media Assistance, National Endowment for Democracy, at, accessed 20 August 2019.

Carter T. Butts, 2016. “sna: Tools for social network analysis,” R package version 2.4, at, accessed 20 August 2019.

Patrick Donahue, 2017. “Merkel pushes back on Trump’s media attacks, calls for ‘respect’,” Bloomberg (18 February), at, accessed 20 August 2019.

Freedom House, 2015. “Freedom in the world 2015,” at, accessed 20 August 2019.

Linton C. Freeman, 1978. “Centrality in social networks conceptual clarification,” Social Networks, volume 1, number 3, pp. 215–239.
doi:, accessed 20 August 2019.

Mauro F. Guillén and Sandra L. Suárez, 2005. “Explaining the global digital divide: Economic, political and sociological drivers of cross-national Internet use,” Social Forces, volume 84, number 2, pp. 681–708.
doi:, accessed 20 August 2019.

Jenine K. Harris, 2014. An introduction to exponential random graph modeling. Quantitative applications in the social sciences, volume 173. Los Angeles, Calif.: Sage.

David R. Hunter, Mark S. Handcock, Carter T. Butts, Steven M. Goodreau, and Martina Morris, 2008. “ergm: A package to fit, simulate and diagnose exponential-family models for networks,” Journal of Statistical Software, volume 24, number 3, pp. 1–29, at, accessed 20 August 2019.

Institut for Statskundskab, Aarhus Universitet, 2016. “Lexical index of electoral democracy,” at, accessed 6 July 2016.

Karin Deutsch Karlekar and Lee B. Becker, 2014. “By the numbers: Tracing the statistical correlation between press freedom and democracy,” Center for International Media Assistance (22 April), at, accessed 20 August 2019.

Christopher Kedzie, 1997. “Communication and democracy: Coincident revolutions and the emergent dictators,” RAND Corporation RGSD–127, at, accessed 20 August 2019.

Leonard Kleinrock, 1961. “Information flow in large communication nets,” RLE Quarterly Progress Report, number 1, at, accessed 20 August 2019.

Eric D. Kolaczyk and Gábor Csárdi, 2014. Statistical analysis of network data with R. New York: Springer-Verlag.
doi:, accessed 20 August 2019.

Helen V. Milner, 2006. “The digital divide: The role of political institutions in technology diffusion,” Comparative Political Studies, volume 39, number 2, pp. 176–199.
doi:, accessed 20 August 2019.

Pippa Norris, 2001. Digital divide: Civic engagement, information poverty, and the Internet worldwide. New York: Cambridge University Press.

Philippa Pattison and Stanley Wasserman, 1999. “Logit models and logistic regressions for social networks: II. Multivariate relations,” British Journal of Mathematical and Statistical Psychology, volume 52, number 2, pp. 169–193.
doi:, accessed 20 August 2019.

R Core Team, 2016. “R: A language and environment for statistical computing,” at, accessed 20 August 2019.

Garry Robins, Pip Pattison, Yuval Kalish, and Dean Lusher, 2007. “An introduction to exponential random graph (p*) models for social networks,” Social Networks, volume 29, number 2, pp. 173–191.
doi:, accessed 20 August 2019.

Espen Geelmuyden Rød and Nils B. Weidmann, 2015. “Empowering activists or autocrats? The Internet in authoritarian regimes,” Journal of Peace Research, volume 52, number 3, pp. 338–351.
doi:, accessed 20 August 2019.

Hyunjin Seo and Stuart Thorson, 2017. “Network approach to regime type and global Internet connectedness,” Journal of Global Information Technology Management, volume 20, number 3, pp. 141–155.
doi:, accessed 20 August 2019.

Hyunjin Seo and Stuart Thorson, 2016. “A mixture model of global Internet capacity distributions,” Journal of the Association for Information Science and Technology, volume 67, number 8, pp. 2,032–2,044.
doi:, accessed 20 August 2019.

Hyunjin Seo and Stuart Thorson, 2012. “Networks of networks: Changing patterns in country bandwidth and centrality in global information infrastructure, 2002–2010,” Journal of Communication, volume 62, number 2, pp. 345–358.
doi:, accessed 20 August 2019.

Cosma Rohilla Shalizi and Alessandro Rinaldo, 2013. “Consistency under sampling of exponential random graph models,” Annals of Statistics, volume 41, number 2, pp. 508–535.
doi:, accessed 20 August 2019.

Michelle Shumate and Edward T. Palazzolo, 2010. “Exponential random graph (p*) models as a method for social network analysis in communication research,” Communication Methods and Measures, volume 4, number 4, pp. 341–371.
doi:, accessed 20 August 2019.

Svend-Erik Skaaning, John Gerring, and Henrikas Bartusevičius, 2015. “A lexical index of electoral democracy,” Comparative Political Studies, volume 48, number 12, pp. 1,491–1,525.
doi:, accessed 20 August 2019.

Russell S. Sobel, Nabamita Dutta, and Sanjukta Roy, 2010. “Beyond borders: Is media freedom contagious?” Kyklos, volume 63, number 1, pp. 133–143.
doi:, accessed 20 August 2019.

Hyunjin Song, 2015. “Uncovering the structural underpinnings of political discussion networks: Evidence from an exponential random graph model,” Journal of Communication, volume 65, number 1, pp. 146–169.
doi:, accessed 20 August 2019.

TeleGeography, 2016. “Global Internet geography,” at, accessed 20 August 2019.

Hai Tran, Reaz Mahmood, Ying Du, and Andrei Khrapavitski, 2011. “Linking global press freedom to development and culture: Implications from a comparative analysis,” International Journal of Communication, volume 5, at, accessed 20 August 2019.

Brooke Foucault Welles and Noshir Contractor, 2015. “Individual motivations and network effects: A multilevel analysis of the structure of online social relationships,” Annals of the American Academy of Political and Social Science, volume 659, number 1, pp. 180–190.
doi:, accessed 20 August 2019.

Jenifer Whitten-Woodring, 2009. “Watchdog or lapdog? Media freedom, regime type, and government respect for human rights,” International Studies Quarterly, volume 53, number 3, pp. 595–625.
doi:, accessed 20 August 2019.

World Bank, 2016. “World development indicators,” at, accessed 8 July 2016.


Editorial history

Received 16 September 2018; accepted 21 August 2019.

Copyright © 2019, Hyunjin Seo and Stuart J. Thorson. All Rights Reserved.

ERGM approach to press freedom, regime type, and Internet connectedness
by Hyunjin Seo and Stuart J. Thorson.
First Monday, Volume 24, Number 9 - 2 September 2019


© First Monday, 1995-2019. ISSN 1396-0466.