Trust but verify: Caution in the application of Internet-based research
First Monday

by R. Michelle Green



Abstract
Are Internet–based research results sufficiently trustworthy? In 314 adults ages 18–82, systematic personality trait differences existed among broadband users, dial–up users and Internet non–users. Greater levels of openness and extraversion correlated with greater degrees of home Internet access in black subjects and in those over 40. No systematic differences were observed in white subjects or in those under 40. The ease of Internet sampling cannot free researchers from the obligation to identify anomalous results and to pursue diverse samples using diverse methods, an obligation that grows more critical as the United States population becomes more diverse and longer–lived.

Contents

Introduction
Methodology and theoretical framework
Results
Discussion
Conclusion

 


 

Introduction

Can Internet–based sampling take the place of other sampling techniques? Said differently, are there systematic differences between Internet users and non–users that would undermine the trustworthiness or generalizability of the research? The question is particularly important as the ease and economy of using Internet–based research methods grow.

To date there has been no clarion call for caution in the use of Internet research. Articles comparing individual differences between previously published samples and Internet samples are not widely disseminated. The cost efficiency of Internet–based survey studies should not obscure consideration of those populations that are either a) disproportionately underrepresented on the Internet or b) underparticipating in Internet use due to skill or resource limitations. Quality in the design, implementation and use of research demands inclusion not only of robust findings but also of appropriate caveats. For the previous six years, Pew Internet & American Life Project research (Fox, 2005) reported that a persistent 22 percent of Americans do not have Internet access. Its 2006 survey also confirms a skew in online demographics (Pew Internet & American Life Project, 2006): though 70 percent of Americans go online at least occasionally, only a third of those over 65 do; of the 15 percent of Americans without a high school diploma (nearly 45 million people), only 36 percent go online; and 42 percent of black Americans report not going online, even occasionally. In this light, critical use of Internet–based research methods must consider: are there systematic differences between broadband users, dial–up users and Internet non–users? This conceptual article examines observed differences in samples with varying degrees of Internet access at home and at work.

 

++++++++++

Methodology and theoretical framework

I originally collected these data in 2003 in order to study the psychosocial precursors of fluency in the use of computers (Green, 2005). Subjects completed a survey assessing various aspects of personality as well as skill with and knowledge of computers. Three hundred and fourteen U.S. citizens ages 18–82 participated in the study. There were 108 men and 206 women; 119 identified themselves as white and 163 as black. The median age was 41, with 14 percent of the sample over 65. The median level of education was “some college.” A fifth of the sample had at most a high school diploma, and 25 percent had at least a bachelor’s degree. One–fourth of participants did not own a computer at the time they completed the survey, though they may still have had computer access elsewhere.

Eighty–three percent of the sample responded to ads placed in one low–cost and two free metropolitan newspapers in the Midwestern United States. One free weekly is considered a definitive source for that area’s entertainment and classifieds. The other two papers target black readers, and were included in the pursuit of a diverse sample. To further ensure a sample diverse in age and race, I identified the remainder (53 participants) through active outreach to black churches and affinity groups.

Since use of computers and networking depends not only on one’s readiness to use technology (e.g., education or employment status), but also on one’s willingness to use technology, my research on predictors of computer use includes psychological units of analysis, such as traits. Traits describe aspects of one’s personality that are quickly recognizable. I examined personality traits by administering the NEO–PI–R (Costa and McCrae, 1992). The five traits in Costa and McCrae’s Revised NEO Personality Inventory are Neuroticism (more analogous to anxiety than pathology), Extraversion, Openness to experience, Agreeableness and Conscientiousness. Please see the Appendix below for descriptions of each trait. It should be noted that a low level of any trait is not per se problematic, just as a high level is not automatically positive. Rather, extreme scores in either direction may indicate problematic behavior.

Subjects could describe their home access to the Internet as nonexistent, dial–up or high–speed. Since work access is almost universally high–speed, the better discriminator for work was the degree of freedom permitted in using the Internet. Many employees have network access that only serves their job function, without free interaction with the Internet in general. Quality of work access was therefore described as no access, limited access or unlimited access. Through correlation analysis using SPSS, I examined the trait differences among populations with these three degrees of access to the Internet.
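To make the analysis concrete, the following is a minimal sketch of correlating a three-level access code with a trait score. It is not the SPSS procedure used in the study: the article does not say which correlation coefficient was computed, so the Pearson coefficient here is an assumption, and the function name `pearson_r` and every data value are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length, at least 2 cases")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / sqrt(sxx * syy)


# Hypothetical respondents, NOT the study's data.
# Access is coded as in the survey: 1 = none, 2 = dial-up, 3 = broadband.
home_access = [1, 1, 2, 2, 2, 3, 3, 3]
openness = [42, 47, 45, 50, 52, 49, 55, 58]  # made-up trait scores

r = pearson_r(home_access, openness)
print(f"r = {r:.2f}")
```

A positive r in this sketch would mean that higher trait scores accompany higher access codes. Because the access variable is ordinal with only three levels, a rank-based coefficient would be an equally reasonable choice.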

 

++++++++++

Results

Trait differences did exist among these three populations, not only in age cohorts or racial sub–samples, but also in the full sample, most frequently with higher levels of openness correlating with greater levels of Internet access.

In the full sample (Table 1), higher levels of openness and extraversion correlated with greater Internet access for the 72 percent of the sample with Internet access at home. There were no systematic trait differences in the full sample for different levels of Internet access at work. For subjects under 40 years of age, systematic differences in traits did not exist across varying levels of Internet access, whether at work or at home (Table 2). For the 110 subjects aged 40 and over (Table 3), however, higher scores on openness, conscientiousness and extraversion correlated with greater degrees of home Internet access. Openness was also significantly associated with the quality of Internet access at work for this sub–sample, but in an unexpected direction: lower levels of openness correlated with greater levels of Internet access at work.

 

Table 1: Correlation of psychological variables with quality of Internet access (full sample).
* p ≤ .05, ** p ≤ .01, *** p ≤ .001.
Home: “Do you have Internet access at home?” 1=None, 2=dialup, 3=broadband (N = 227)
Work: “How would you describe your access to the Internet at work?” 1=None, 2=limited, 3=unlimited (N = 203)

All subjects          Home      Work
Neuroticism           -.08       .03
Openness               .16*     -.13
Extraversion           .15*     -.02
Conscientiousness      .05      -.05
Agreeableness          .05       .00

 

 

Table 2: Correlation of psychological variables with quality of Internet access (younger subjects).
* p ≤ .05, ** p ≤ .01, *** p ≤ .001.
Home: “Do you have Internet access at home?” 1=None, 2=dialup, 3=broadband (N = 117)
Work: “How would you describe your access to the Internet at work?” 1=None, 2=limited, 3=unlimited (N = 104)

<40 years old         Home      Work
Neuroticism           -.13       .02
Openness              -.02      -.08
Extraversion          -.01       .06
Conscientiousness     -.03       .05
Agreeableness          .05      -.01

 

 

Table 3: Correlation of psychological variables with quality of Internet access (older subjects).
* p ≤ .05, ** p ≤ .01, *** p ≤ .001.
Home: “Do you have Internet access at home?” 1=None, 2=dialup, 3=broadband (N = 110)
Work: “How would you describe your access to the Internet at work?” 1=None, 2=limited, 3=unlimited (N = 99)

40+ years old         Home      Work
Neuroticism           -.06       .06
Openness               .34***   -.33**
Extraversion           .30**    -.07
Conscientiousness      .20*     -.19
Agreeableness          .07       .01
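Whether a coefficient earns an asterisk depends on the sample size as well as on the size of the correlation. The sketch below approximates a two-tailed significance level from a coefficient and its N, using the Fisher z-transformation with a normal approximation. This is an assumption made for illustration: SPSS would typically report an exact t-based value, and the published coefficients are rounded to two decimals, so results near a threshold may not match the tables. The function name `approx_two_tailed_p` is hypothetical.

```python
from math import atanh, erfc, sqrt


def approx_two_tailed_p(r, n):
    """Approximate two-tailed p-value for a correlation r observed in n cases.

    Uses the Fisher z-transformation with a normal approximation,
    not the exact t-test a statistics package would normally apply.
    """
    if not -1 < r < 1:
        raise ValueError("r must lie strictly between -1 and 1")
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = atanh(r) * sqrt(n - 3)
    return erfc(abs(z) / sqrt(2))


# (label, r, N) taken from the home-access columns of Tables 1 and 3.
reported = [
    ("Openness, full sample", 0.16, 227),
    ("Extraversion, full sample", 0.15, 227),
    ("Openness, 40+", 0.34, 110),
    ("Conscientiousness, 40+", 0.20, 110),
]

for label, r, n in reported:
    p = approx_two_tailed_p(r, n)
    print(f"{label}: r = {r:.2f}, N = {n}, approximate p = {p:.4f}")
```

Under this approximation, a correlation of a given size needs a larger sample to reach a given threshold, which is worth keeping in mind when comparing sub-samples of different sizes.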

 

With respect to race, I found no significant differences in psychological traits for white subjects by quality of Internet access, at home or at work (Table 4). In the 107 black subjects with Internet access at home, however, higher levels of openness were significantly correlated with greater quality of Internet access at home (Table 5). The work analysis paralleled the age results. Lower levels of openness in black subjects were correlated with greater levels of Internet access at work. No significant differences were observed in the other four traits.

In summary, the data showed systematic differences in psychological traits associated with differences in the quality of Internet access among black subjects and among subjects over 40 years of age.
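The article reports correlations separately for each sub-sample and does not report a direct test of whether two sub-sample correlations differ from each other. One way a reader might check this, offered here only as a sketch, is Fisher's z-test for correlations from independent groups. The function name `compare_independent_correlations` is hypothetical, and the inputs are the rounded openness and home-access coefficients from Tables 2 and 3.

```python
from math import atanh, erfc, sqrt


def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher z-test for the difference between two independent correlations.

    Returns (z, two-tailed p) under a normal approximation.
    """
    if min(n1, n2) <= 3:
        raise ValueError("each sample needs more than 3 cases")
    standard_error = sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (atanh(r1) - atanh(r2)) / standard_error
    return z, erfc(abs(z) / sqrt(2))


# Openness vs. home access: 40+ sub-sample (Table 3) against <40 (Table 2).
z, p = compare_independent_correlations(0.34, 110, -0.02, 117)
print(f"z = {z:.2f}, approximate two-tailed p = {p:.4f}")
```

The test assumes the two groups share no respondents, which holds for the age split. It would not apply to comparing home and work coefficients within the same sub-sample, since those come from overlapping respondents.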

 

++++++++++

Discussion

For white subjects and younger subjects, trait patterns appear quite similar at different levels of Internet access, with no significant differences that might taint research results or their application. Black Internet users and Internet users over 40, however, do display personality profiles that differ both by level of Internet access and by location of access. A preponderance of educationally or technologically oriented research is performed in formal institutions of learning, where a disproportionate share of those sampled are likely to be less than 40 years of age. Similarly, it is harder to find or successfully solicit non–white participants for survey research. In that light, the findings of this analysis are particularly troubling, since the populations least likely to be surveyed are the most likely to show systematic differences.

 

Table 4: Correlation of psychological variables with quality of Internet access (white subjects).
* p ≤ .05, ** p ≤ .01, *** p ≤ .001.
Home: “Do you have Internet access at home?” 1=None, 2=dialup, 3=broadband (N = 99)
Work: “How would you describe your access to the Internet at work?” 1=None, 2=limited, 3=unlimited (N = 83)

White subjects        Home      Work
Neuroticism           -.05       .13
Openness               .01      -.07
Extraversion           .15      -.06
Conscientiousness      .07      -.18
Agreeableness          .16      -.14

 

 

Table 5: Correlation of psychological variables with quality of Internet access (black subjects).
* p ≤ .05, ** p ≤ .01, *** p ≤ .001.
Home: “Do you have Internet access at home?” 1=None, 2=dialup, 3=broadband (N = 107)
Work: “How would you describe your access to the Internet at work?” 1=None, 2=limited, 3=unlimited (N = 99)

Black subjects        Home      Work
Neuroticism           -.14      -.02
Openness               .28**    -.23*
Extraversion           .08      -.01
Conscientiousness      .03       .06
Agreeableness         -.03       .09

 

The most anomalous result, where higher openness scores in black subjects and older subjects correlated with lower quality of Internet access at work, could reflect the subjective nature of the definition of “limited” access to the Internet. Individuals with high levels of openness may chafe at any limitation on their ability to access the Internet’s data store. The result’s meaning is less relevant to this discussion than its existence: the anomaly ironically underscores the importance of formal analysis of ancillary variables over the tendency to trust one’s results or instincts prima facie.

Online survey methods intensify the challenges of research application. Samples obtained solely using online methods would by definition not reflect the behavior of non–users. Such research could achieve results not replicable in or applicable to older populations or non–white populations, even as the United States population lives longer and becomes more diverse (Texeira, 2006). Researchers must communicate appropriate caveats to online research, just as research audiences must look for and understand such caveats.

 

++++++++++

Conclusion

Internet–based research is a powerful tool in research analysis today. Gosling, et al. (2004), in addressing race, age, and Internet sampling, argued that though Internet samples are not statistically representative of the population, they are no worse, and often better, than traditional samples in terms of achieving a diverse racial profile. By dint of sampling thousands, Internet research often yields race or age sub–samples with great statistical power. Having an Internet–based diverse sample has advantages over a non–diverse sample obtained through traditional techniques.
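The point about statistical power can be made concrete with a standard approximation for the number of cases needed to detect a correlation of a given size. The sketch below uses the Fisher z approximation. The function name `n_for_correlation` and the defaults of a two-tailed alpha of .05 and power of .80 are conventional assumptions, not values taken from this article or from Gosling, et al.

```python
from math import atanh, ceil
from statistics import NormalDist


def n_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate sample size needed to detect a correlation of size r.

    Two-tailed test, Fisher z approximation:
    n = ((z_alpha + z_beta) / atanh(|r|))**2 + 3, rounded up.
    """
    if r == 0 or not -1 < r < 1:
        raise ValueError("r must be non-zero and strictly between -1 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_beta) / atanh(abs(r))) ** 2 + 3)


# Correlation sizes of the order reported in this article's tables.
for r in (0.16, 0.28, 0.34):
    print(f"r = {r:.2f}: roughly {n_for_correlation(r)} cases needed")
```

Smaller correlations require markedly larger samples, which is why sub-samples drawn from thousands of Internet respondents can detect effects that a few hundred traditionally recruited respondents might miss.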

Any method that can yield a superior representation of the population under study has value. Internet–based surveys, however, make it easier for researchers to run afoul of two risks: first, neglecting or ignoring key phenomena in human behavior by undersampling populations (e.g., non–Internet users); second, misrepresenting these undersampled populations by presuming that behavior observed in the sample is applicable across the universe (e.g., across race or age). Quality in the design, implementation and use of research demands inclusion not only of robust findings but also of appropriate caveats. Researchers must be as knowledgeable about the pros and cons of Internet–based sampling as they are about self–report data and volunteer effects.

Systematic differences such as those observed in this analysis could appear for any set of variables, for any given research question. It behooves us as researchers to consider variables that might affect our results, and to make every possible effort to reveal systematic patterns in such intervening variables. The quality and trustworthiness of research are closely entwined with conscientiously considering sub–groups of interest and confirming the generalizability of the final results. The ease and relatively low cost of Internet sampling do not free researchers from the obligation to understand and pursue diverse populations using diverse sampling methods. Likewise, those who apply the outcomes of Internet–based research in fields like psychology, public policy, gerontology, parenting and education need clear indication of relevant caveats so that they may draw the best possible conclusions.

 

About the author

R. Michelle Green received her Ph.D. in 2005 from Northwestern University’s School of Education and Social Policy. Her research program for the better part of a decade has examined why adults embrace or reject Information Technology (IT). She is particularly concerned with the ramifications of technological inequity on those who are poor, older, or non–white. Dr. Green works at Hampshire College in Amherst, Mass., and is currently Visiting Faculty in Cognitive Science and Dean of Student Services.

 

Appendix: Short description of the NEO–PI–R’s five trait domains

The Revised NEO Personality Inventory, or NEO–PI–R (Costa and McCrae, 1992), is a self–report inventory of 240 items. Different inventories are available for college students and for older adults; this study used only the Adult inventory. The NEO–PI–R has six subscales within each of the five traits: Neuroticism, Extraversion, Openness to experience, Conscientiousness and Agreeableness. Please note that the NEO is designed such that neither a high nor a low score necessarily implies any recognizable or diagnosable psychiatric problem.

Neuroticism encompasses many behaviors associated with maladjustment and emotional instability. This domain is associated with negative descriptors like fear, worry, anxiety, or disgust. The existence of neuroses or even psychoses does not automatically generate high neuroticism scores. Conversely, those scoring low on neuroticism do not necessarily exhibit indicators of superior mental health — they are, however, calmer and steadier than others. The six sub–facets of neuroticism are anxiety, angry hostility, depression, self–consciousness, impulsiveness and vulnerability.

Extraversion’s high scorers are convivial and engaging. They are multitaskers, full of energy and drive. Those low in extraversion are reserved, and often prefer their own company to that of others. They may find comfort in a slower life tempo than others, or gravitate to the routine and stable. Its six sub–facets are excitement seeking, activity, warmth, gregariousness, assertiveness, and positive emotions.

Openness to experience tends to reflect characteristics like intelligence and creativity. It is not equivalent to either, however. Very high openness may contribute to indecisiveness or high levels of distraction, or make sensory experiences too intense. Similarly, though very low openness is associated with political conservatism, it does not automatically signal authoritarianism. The six sub–facets are (openness to) ideas, actions, fantasy, aesthetics, feelings, and values.

Conscientiousness is, of all the aspects of personality, most strongly associated with achievement. High scorers on this aspect manifest focus, purpose and strength of will. Such individuals also tend to have a higher degree of impulse control than others, with the ability to delay gratification to achieve their goals. Conversely, long work hours and a “work first, play later” approach driven by very high conscientiousness could challenge work and family balance. Since cautiousness and deliberateness are also part of this domain, aspects of conscientiousness can lead to indecisiveness as the individual obsessively considers all pros and cons of every option. Conscientiousness’ six sub–facets are competence, order, dutifulness, achievement striving, self–discipline, and deliberation.

Agreeableness tends to measure helpfulness or altruism in interactions with others. Those high on agreeableness would tend to prefer cooperation to competition. Those scoring high in agreeableness expect others to be agreeable and well intentioned too. Low scorers on this trait are more likely to manifest skepticism, even cynicism. Excessive agreeableness is also a concern — someone too high in compliance, for example, might not be able to reconcile multiple and conflicting demands. Agreeableness includes the six sub–facets trust, straightforwardness, altruism, compliance, modesty and tender–mindedness.

 

References

P.T. Costa, Jr. and R.R. McCrae, 1992. NEO–PI–R Professional Manual. Lutz, Fla.: Psychological Assessment Resources, Inc.

Susannah Fox, 2005. “Digital divisions,” Pew Internet and American Life Project 2005 (5 October), at http://www.pewInternet.org/pdfs/PIP_Digital_Divisions_Oct_5_2005.pdf, accessed 7 November 2007.

Samuel D. Gosling, Simine Vazire, Sanjay Srivastava, and Oliver P. John, 2004. “Should we trust Web–based studies? A comparative analysis of six preconceptions about Internet questionnaires,” American Psychologist, volume 59, number 2, pp. 93–104, at http://dx.doi.org/10.1037/0003-066X.59.2.93

R. Michelle Green, 2005. “Predictors of digital fluency,” dissertation, School of Education and Social Policy. Evanston, Ill.: Northwestern University.

Pew Internet & American Life Project, 2006. “Internet penetration and impact,” at http://www.pewinternet.org/PPF/r/182/report_display.asp, accessed 7 November 2007.

Erin Texeira, 2006. “U. S. will be more diverse at 400 million,” Washington Post (21 October), at http://www.washingtonpost.com/wp-dyn/content/article/2006/10/21/AR2006102100402.html, accessed 2 February 2007.

 


Editorial history

Paper received 2 February 2007; accepted 17 October 2007.


Copyright ©2007, First Monday.

Copyright ©2007, R. Michelle Green.

Trust but verify: Caution in the application of Internet–based research by R. Michelle Green
First Monday, Volume 12 Number 11 - 5 November 2007
http://firstmonday.org/ojs/index.php/fm/article/view/2027/1892




