Mining the Blogosphere: Age, gender and the varieties of self-expression

Shlomo Argamon, Moshe Koppel, James W. Pennebaker, Jonathan Schler

Abstract


The growth of the blogosphere offers an unprecedented opportunity to study language and how people use it on a large scale. We present an analysis of over 140 million words of English text drawn from the blogosphere, exploring if and how age and gender affect writing style and topic. Our primary result is that a number of stylistic and content-based indicators are significantly affected by both age and gender, and that the main difference between older and younger bloggers, and between male and female bloggers, lies in the extent to which their discourse is outer- or inner-directed. In fact, the linguistic factors that increase in use with age are just those used more by males of any age, and conversely, those that decrease in use with age are those used more by females of any age.



DOI: http://dx.doi.org/10.5210/fm.v12i9.2003




© First Monday, 1995-2017. ISSN 1396-0466.