Measuring monographs: A quantitative method to assess scientific impact and societal relevance
First Monday

by Ronald Snijder



Abstract
In the humanities and social sciences (HSS), the monograph is an important means of communicating scientific results. As in the field of STM (science, technology and medicine), the quality of research needs to be assessed. This is done by bibliometric measures and qualitative methods. Bibliometric measures based on articles do not function well in the field of HSS, where monographs are the norm. The qualitative methods which take into account several stakeholders are labour intensive and the results are dependent on self–assessment of the respondents, which may introduce bias. In the case of humanities, the picture becomes even less clear due to uncertainties about the stakeholders.

This paper describes a method that may complement the current research on scientific impact and societal relevance. This method measures the usage of online monographs and identifies the Internet provider involved. The providers are categorized as academic; government; business; non–profit organisations and the general public. The usage is further categorised in national and international. Combining this data makes it possible to assess the scientific impact and the societal relevance of the monographs. The method is quantitative, which makes the results easier to validate. It is not necessary to know the stakeholders in advance: the readers are identified through the method.

The dataset consists of over 25,000 downloads by more than 1,500 providers, spread over 859 monographs. More than two thirds of the usage can be categorised, and almost 45 percent of all usage comes from non–academics. This might indicate that the monographs have relevance in society.

Two possible influences on monograph usage were analysed: subject and language. Most of the subjects that received a higher than average number of downloads come from the field of the social sciences; the humanities were less ‘popular’. Books in English, the ‘lingua franca’ of science, were downloaded the most. Books in languages such as Dutch were read much less outside national borders than books in Italian or German. A Dutch or Belgian scholar would need a translation in order to have more influence abroad; this applies far less to Germans or Italians.

While further research is needed, the results are promising and the proposed method could be used as an addition to the existing tools to measure the scholarly impact and societal relevance of the field of HSS.

Contents

Monographs under pressure
Scientific impact, societal relevance and monographs
Methodology
The OAPEN Library as dissemination channel
Setup
Are all ISPs equal?
Possible influences on usage
Conclusion

 


 

Monographs under pressure

In the humanities and social sciences (HSS), monographs, rather than articles, play an important part in communicating scholarly results [1]. However, the publication of (paper) monographs faces challenges. Greco and Wharton (2008) describe the problems faced by university presses, resulting in smaller print runs per title and declining sales to libraries and institutions. Thompson (2005) likewise describes falling print runs and declining sales. The decline in dissemination of scientific monographs is further illustrated by the Association of Research Libraries (ARL). The expenditure for journals grew from more than US$1,400,000 in 1986 to over US$7,513,000 in 2011. This contrasts sharply with the US$1,120,000 spent in 1986 and US$1,936,000 spent in 2011 on monographs (Association of Research Libraries, 2012). Williams, et al. (2009) also describe a decline of sales combined with negative effects on print runs, but conclude that the monograph remains the single most valued means of scholarly publishing within the field of arts and humanities. Withey, et al. (2011) conclude that the economic model supporting monographs depends to a significant extent on subsidies. This funding model can only be sustained if the return on investment is clear.

This raises the question why monographs are used more than journal articles. The answer might be found in the definition by Case (2009): “The monograph is a large, specialized work of scholarship that treats a narrow topic in great detail.” He adds that “monographs are principally about establishing facts or narrative in a set of fields in which facts and narratives are often hard to establish.” Due to its size, a monograph enables researchers to describe the results of research spanning a long period in sufficient detail. It is therefore best suited for the type of research mostly conducted in the field of HSS. It is targeted at a specialised audience, in contrast to a ‘text book’ which is designed for a more general audience. However, in this article we will see that there is an interest in monographs by the ‘general public’.

The monograph clearly performs a useful function in the field of HSS, especially because of its length. An example of scholarly use of monographs is described by Mendez and Chapman (2006), who investigated the role of monographs as sources in the field of Latin American history. They conclude that the use of monographs as secondary sources, after a decline in the period 1985 to 1995, rose to a higher level in 2005.

However, scholars in the Humanities and Social Sciences are expected to describe their contribution to society. As in the field of Science, Technology and Medicine (STM), there is a need to assess the value of scholarly output.

 

++++++++++

Scientific impact, societal relevance and monographs

In several countries, government policies have been developed to assess the quality of scientific and scholarly research; in other countries the assessment is done by academies of sciences. The aim is to enhance the quality of scientific work and to maximise the societal benefits deriving from it. Assessing the quality of research is normally done on two levels: at the level of individual scientists or scholars and at the level of scientific or scholarly output. The first level is measured through ‘esteem indicators’ such as prizes and scholarly positions, or the amount of international influence. At the level of output we find ‘internal assessments’ (peer review of documents) and ‘external assessments’ through bibliometric indicators, such as high ranking journals, book series or publishers (Royal Netherlands Academy of Arts and Sciences, 2010). Furthermore, the assessment must take into account the variety of output forms (it should not be limited to journal articles) and the bureaucratic burden must be limited.

On top of this, research and its outcomes can be categorised as Mode 1 and Mode 2, where Mode 1 research is done within the academic discipline, and Mode 2 research aims at the application of research outcomes. This concept was introduced by Gibbons, et al. (1994); its application in research evaluation was recently discussed by Ernø–Kjølhede and Hansson (2011). Leydesdorff and Etzkowitz (1996) use a different angle by looking at the relations between universities, governments and industries: the “Triple Helix.”

Creating the best possible scientific or scholarly output is not a goal in itself; the output should be used by others. Usage by scientists is termed scientific impact; usage by others is termed societal relevance. Usage is not exactly the same as impact; it functions as an indicator for impact. Measuring scientific impact in the field of HSS is poorly developed compared to the field of STM. In the field of STM, the use of bibliometric measures such as the Journal Impact Factor (JIF) or the h–index is often discussed, although their application is controversial and often inappropriate. In the field of HSS, where articles play a smaller role in disseminating research results, similar tools are not widely available.

However, Nederhof (2006) and Linmans (2010) have discussed the usage of bibliometric tools in the humanities and the social sciences. Nederhof investigated the possibilities of bibliometric research in the field of HSS and concludes that it is possible to use the same methods as deployed in STM. It could be done if more types of publications — monographs and journals not covered by ISI — are taken into account and by applying impact indicators that compensate for the smaller volumes of citations in the humanities and social sciences, compared to the field of STM (Nederhof, 2006). Linmans focuses on citations per author, not from a certain period but on life–long citation data. This method aims to make more citation data available, which should lead to more robust results (Linmans, 2010).

Alternatives to the ‘standard’ bibliometric methods have also been described. White, et al. (2009) discuss ‘libcitations’, where the number of academic libraries holding a certain book is the unit of measure. The collection of a library is formed based on qualitative decisions; a monograph that is acquired by a large number of libraries is ‘better’ than a monograph that only resides in a few libraries (White, et al., 2009). The MESUR project is not only based on counting citations, but also focuses on the usage of online sources (mostly journal articles) by scientists. The authors see online usage as a better indicator for scientific impact than citations (Bollen, et al., 2009; Bollen, et al., 2008). The method described in this article is also based on measuring online usage, but here the focus is not on journal articles; it is on monographs instead. Online usage is also discussed by Herb, et al. (2010), who published work on the usage and interface design of repositories, the most widely used way to disseminate open access documents (Herb, 2010; Herb, et al., 2010). While the discussed research uses quite different modes of operation, all of it is aimed at scientific impact, not at societal relevance.

In order to measure the usage of scientific or scholarly output in society, more elaborate methods are needed. Several researchers have published work on defining societal relevance and the evaluation of the current frameworks. The methodology described by Lyall (2004) encompasses focus groups, questionnaires, desk research and stakeholder analysis; a method which does not seem to minimize bureaucratic demands. In the Netherlands, the same methodology was presented by the QANU organisation (Bennink, et al., 2008). The SIAMPI project defined three types of indicators (termed ‘productive interactions’): direct or personal interactions; indirect interactions through texts or artefacts and financial interactions through money or ‘in kind’ contributions (Spaapen and van Drooge, 2011). The method described here measures one of these interactions: the indirect interaction through the texts of electronic versions of monographs. Furthermore, current policy programs aimed at societal relevance have been studied. An example is the case study by Grant, et al. (2010) of the Australian RQF, the U.K. RAISS method, the U.S. PART framework and the Dutch ERiC framework.

Very little is known about the societal relevance of monographs. Only recently, Serenko, et al. (2011) published research on societal relevance in the field of knowledge management and intellectual capital. Within knowledge management, there is a relatively clear distinction between scholars and practitioners. As all stakeholders are known, the flow of knowledge from one group to the other is not hard to follow. In the social sciences, government agencies are considered to be a major beneficiary of the scientific results. Several usage studies, primarily based on surveys and interviews, have been published (Bell, et al., 2011; Landry, et al., 2001; Landry, et al., 2003). In other disciplines in the humanities, the picture is less clear. Benneworth and Jongbloed (2009) show that the stakeholders (in other words: the groups that would primarily benefit from research) are less visible to universities. Of course, if stakeholders are not known, it is impossible to perform the kind of qualitative research described by Lyall.

Measuring scholarly impact and societal relevance in the humanities and social sciences is not without problems. When methods based on bibliographic data are used to assess scholarly impact, the lack of data makes the results less reliable. The proposed and used methods to assess societal influence are labour intensive; this requires a large investment in time and money. Furthermore, the results are dependent on self–assessment of the respondents. Of course, this may introduce bias: depending on the respondent the perceived results may be too positive or too negative. In the case of humanities, the picture becomes even less clear due to uncertainties about the stakeholders.

This paper describes a method that may complement the current research on scientific impact and societal relevance. It relies on analysing data generated by usage of electronic versions of monographs. Every time a reader opens a Web page or downloads a document, information about the organisation through which the reader accesses the Web is recorded. By assessing this information, it is possible to determine the type of organisation and the country of origin. Due to extensive use of automated tools it is less labour intensive than the previously described methods, and it may uncover groups of users, even in disciplines where stakeholders are not well known. The method is tested on data generated from the OAPEN Library.

 

++++++++++

Methodology

The method is based on the fact that books can be made available online, in full or in part, through a dissemination channel. Those channels may impose restrictions such as full or limited availability, enabling downloading, printing etc. Examples of dissemination channels are the Google Book Search program, institutional repositories or e–book collections of academic libraries. Each of these channels collects usage data, such as the number of views or downloads and some information about the user. Almost all Web–based channels list the Web address of the ‘provider’: the organisation that grants access to the Internet. So, if a researcher of Leiden University downloads a book using her or his office equipment, the Web address (www.leidenuniv.nl) of that university will be logged. Basic information about the provider, such as address and telephone number, is publicly available and can be found using the so–called ‘WHOIS protocol’ (Wikipedia, n.d.). By combining the usage data and information about the provider, we can make an assumption about who is using a specific monograph. To put it differently: the type of provider is used to assess the type of reader. In the example used, the reader is affiliated with an academic institution, based in the Netherlands.

Defining stakeholders: scientific impact and societal relevance

If the dissemination channel is open to everybody, it may attract users from all kinds of organisations. Not everybody will have an academic organisation as provider; it may be another type of organisation or it will be an Internet service provider (ISP). It then becomes necessary to define several groups of organisations. Here, the following categories are used: academic; government; business; non–profit organisations and the general public. Academic users are seen as the main audience for monographs. Based on the literature on societal relevance, we could divide the other types of readers of monographs into the following categories: government, business and general public. If the provider is an ISP, the reader cannot be linked to an organisation. This could mean that the reader is not acting as a member of an organisation, and may be categorised as a member of the general public. In this article, another type of organisation is proposed: non–profit organisations.

Within the humanities and social sciences, we might expect to find stakeholders that are not commercial, who play a role in the discipline. In the social sciences, government is seen as a significant stakeholder, and government policies regarding certain subjects — for instance: immigration, environment — receive considerable attention from non–profit organisations. Societal relevance by those types of organisations is therefore also to be expected. As discussed before, the situation in the humanities is less clear and stakeholders are not identified. Still, we might expect usage from non–profit organisations. For instance, national history may cause considerable interest.
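The provider categories introduced above, and their link to scientific impact and societal relevance, can be sketched in a few lines of Python. This is only an illustration: the names `ProviderType`, `READER_GROUP` and `impact_kind` are hypothetical and do not come from the study, and the mapping of ISPs to the general public is the provisional assumption stated in the text.

```python
from enum import Enum


class ProviderType(Enum):
    """The five provider categories used in this article."""
    ACADEMIC = "Academic"
    GOVERNMENT = "Government"
    BUSINESS = "Business"
    NON_PROFIT = "Non-profit"
    ISP = "ISP"


# Reader group inferred from the provider type. A reader behind an ISP cannot
# be linked to an organisation, so that reader is provisionally treated as a
# member of the general public (an assumption, as discussed in the text).
READER_GROUP = {
    ProviderType.ACADEMIC: "academic",
    ProviderType.GOVERNMENT: "government",
    ProviderType.BUSINESS: "business",
    ProviderType.NON_PROFIT: "non-profit organisation",
    ProviderType.ISP: "general public",
}


def impact_kind(provider_type: ProviderType) -> str:
    """Usage by academics indicates scientific impact; all other usage
    indicates societal relevance."""
    if provider_type is ProviderType.ACADEMIC:
        return "scientific impact"
    return "societal relevance"
```

Keeping the category list in one place makes it explicit that every download is assigned exactly one provider type before any percentages are computed.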

Apart from the provider, information about the country from which the data request originated is available, indicating the nationality of the reader. This information can be used to classify the usage a bit further: national versus international. In order to classify usage to be national or international, we need to establish the ‘nationality’ of a monograph. Several choices are available: the nationality of the author(s), the country of the author’s organisation or the country of publication. Here, the country of publication is used; the information about authors or their organisations was not available.

This method can be used to measure the scientific impact and the societal relevance of one monograph. The ratio of academic readers versus other users may be used as an indication of the level of scientific impact and societal relevance. Examining a group of monographs enables us to look at other aspects as well: what is the influence of the monograph’s subject or is language a barrier for international usage? The amount of national and international usage could be closely linked to the language of the monograph. When looking at the monograph’s subject, it may be possible that different scientific disciplines display other usage patterns. For instance, the percentage of users connected to a government organisation may be larger in the social sciences than in the humanities.

Most literature on societal relevance does not explicitly focus on international usage; from a policy point of view societal relevance is looked at on a national level. Policy makers are more likely to prioritize usage on a national level as a way to measure the return on investments in science done by national governments. Still, the international usage should also be taken into account. As discussed before, international usage is used as an indication of esteem. The percentage of usage outside national borders may give an indication of the importance of the work. This reflects on the authors; one of the ‘esteem indicators’ is the level of international interest.

The dataset contains books that are published in west European countries. Usage is global, however, ranging from Albania to Zimbabwe. This also includes the so–called “developing countries”, with more limited financial resources. The digital divide between the developing countries and the developed countries could be described as a financial barrier to access (Swan and Hall, 2010). All monographs used here are published in open access; therefore this barrier does not exist and this aspect will not be discussed in this paper.

Conclusions regarding these statistics must be drawn with caution. First of all, the information found using the WHOIS protocol must be interpreted: what type of organisation is described? If the organisation is a university, it is quite clear. The question where to draw the line between an ISP and another type of commercial organisation is less easy to answer. Also, organisational affiliation does not tell anything about professional roles. For instance, if the provider is a university, there is no way to tell whether the reader is a student or a professor. Likewise, if the provider is an ISP, we cannot be sure whether the reader used the online monograph for personal or professional reasons. Regarding nationality, this too is not a 100 percent match: one could easily imagine a Spanish reader downloading a monograph while in the U.S. The user statistic would then indicate the U.S. as country of origin. A possible remedy could be found in using a survey, asking readers about their professional affiliation, role and nationality. And finally, here we measure the number of downloads. The number of downloads is an indication of readership: we can assume that the more a book has been downloaded, the more it has been read. But we cannot state that 100 downloads equal 100 people reading the book cover to cover.

Selecting a channel to measure usage

In order to measure the usage of electronic monographs, we need access to dissemination channels. One may consider academic libraries to be the obvious choice. However, there are certain drawbacks to this dissemination channel. First of all, measuring usage from an academic library restricts the user population to the staff and students of that particular academic institution. The composition of the group needs to be taken into account. For instance, if faculties are significantly different in size, it may reflect on the usage measured. A far more serious problem is the fact that academic libraries are not open to outsiders, making it impossible to measure societal relevance. Furthermore, usage ‘outside’ of the library catalogue (of monographs found through search engines) is not measured.

Collections of monographs are not only found in libraries. Academic publishers also have access to dissemination channels. Publishers have a different interest from academic libraries; instead of serving one academic community, publishers need to be known as widely as possible. This reflects on their usage of dissemination channels: at the very least, information on all available publications is accessible to everybody. Therefore, usage data is not restricted to certain groups and could be used to measure both scientific impact and societal relevance. Furthermore, access to the data is not channelled through a library catalogue, but is wide open to both search engines and other linking mechanisms, such as the Facebook Web site (Vascellaro, 2009).

 

++++++++++

The OAPEN Library as dissemination channel

The method was tested on the OAPEN Library, which was officially launched in September 2010. The OAPEN Consortium describes it as “an Online Library containing a freely available, quality–proven and multilingual collection of monographs from various fields of HSS” (OAPEN Consortium, 2011). It is a Web–based collection of monographs, which are all available in open access. The site offers several ways to make its contents accessible: it enables searching and browsing, readers can share book descriptions via social media and it contains several data feeds (Open Access Publishing in European Networks, 2010).

The OAPEN Library was used because its collection contains a diverse range of subjects, published by dozens of publishers and in several languages. This creates a large dataset, which contains sufficiently large sets of monographs with the same language, subject, etc. For this article, the number of downloads for the full year 2011, as measured through the Google Analytics program, was used. Google Analytics only measures the number of downloads that result from a visit to the OAPEN Library site. This does not draw a complete picture; all monographs can be directly downloaded, without browsing the OAPEN Library Web site. So, if a reader uses a search engine such as Google or Bing to find a book and downloads it directly from there, the download will not be registered in Google Analytics. The total number of downloads in 2011 is larger than 300,000. At this moment, not all user statistics are available. Therefore, the Google Analytics data will be used.

The dataset consists of a diverse set of monographs: 859 titles, published by 30 publishers. There is also a wide range of languages available: Danish; Dutch; English; French; German; Italian; Latin; Norwegian; Spanish and Welsh. For those titles, 25,405 downloads were measured, by 1,574 unique providers. Each provider was classified as one of the following types: academic, government, business, non–profit or ISP. For each download, the provider was further classified as national or international, depending on the country of publication and the country of the provider: if the country of publication equals the country of the provider, the provider is national; otherwise it is classified as international. So, if the University of Exeter downloads a book by the Dutch publisher Brill, it is classified as Academic (International). When a book published by Manchester University Press is downloaded by the University of Exeter, it is classified as Academic (National).
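The national versus international rule can be expressed as a small function. The sketch below is a minimal illustration; the function name `classify_download` and the use of plain country names as strings are assumptions, not part of the original setup.

```python
def classify_download(provider_type: str,
                      provider_country: str,
                      publication_country: str) -> str:
    """Label a download as national when the provider is located in the
    country of publication, and as international otherwise."""
    if provider_country == publication_country:
        scope = "National"
    else:
        scope = "International"
    return f"{provider_type} ({scope})"


# The two examples from the text: the University of Exeter (Great Britain)
# downloading a Brill title (Netherlands) and a Manchester University Press
# title (Great Britain).
brill_case = classify_download("Academic", "Great Britain", "Netherlands")
mup_case = classify_download("Academic", "Great Britain", "Great Britain")
```

Because the rule compares countries only, it inherits the limitation noted earlier: a reader travelling abroad is counted under the country of the provider, not under his or her own nationality.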

 

++++++++++

Setup

The goal of the research is to test the method and gather quantitative data about scientific impact and societal relevance of scientific monographs. This type of research is new; therefore no best practice is established. Here, the percentage of downloads per type of provider is used as a measure for scientific impact and societal relevance, combined with the average number of downloads per group of titles. By comparing these groups, we may be able to find significant differences. No benchmark is available, so it is not possible to say how well a certain monograph ‘performs’.

Monograph usage can be measured on two levels:

  1. At the level of separate titles
  2. At the level of the complete collection

Measuring usage at the level of separate titles

In this paper, the data at the level of the complete collection or at the level of large subsets will be discussed in most detail. It is possible to analyse each monograph’s usage. The following example shows the usage data for the book Globalization contested: An international political economy of work [2] written by Louise Amoore and published by Manchester University Press in 2002.

About 23 percent of the usage comes from (international) academic institutions, and almost 70 percent is generated by foreign ISPs. The remaining usage is generated by a company, a British ISP and a non–profit organisation: the International Atomic Energy Agency. We will see that those figures are no exception: the average usage percentages per provider type are more or less along these lines. The international usage is truly worldwide; this is also typical for all the measured data, which originated from 102 countries.

 

Table 1: Usage data of one book.
Organisation | Type | Country | Downloads
University of Queensland | Academic (International) | Australia | 1
University of Hong Kong | Academic (International) | China | 1
Universität Duisburg–Essen | Academic (International) | Germany | 1
University of the Aegean | Academic (International) | Greece | 1
Hokkaido University | Academic (International) | Japan | 1
Universiteit van Amsterdam | Academic (International) | Netherlands | 1
Universidade do Porto | Academic (International) | Portugal | 1
National University of Singapore | Academic (International) | Singapore | 3
Webtrade Ltd. | Business (International) | Ireland | 1
International Atomic Energy Agency | Non–profit (International) | Austria | 1
Virgin Media | ISP (National) | Great Britain | 1
Belgacom | ISP (International) | Belgium | 1
Telecel S.A. | ISP (International) | Bolivia | 1
Cambodian ISP, Country Wide, Wireless IAP | ISP (International) | Cambodia | 1
Ezecom | ISP (International) | Cambodia | 1
Bell | ISP (International) | Canada | 1
Cytanet | ISP (International) | Cyprus | 1
UPC Broadband | ISP (International) | Czech Republic | 1
Arcor AG | ISP (International) | Germany | 1
Ewe Tel | ISP (International) | Germany | 1
OTEnet S.A. | ISP (International) | Greece | 2
Videsh Sanchar Nigam | ISP (International) | India | 1
PT Telekomunikasi Indonesia, Tbk | ISP (International) | Indonesia | 5
PT. Global Media Teknologi | ISP (International) | Indonesia | 1
XS4All | ISP (International) | Netherlands | 2
Ar Telecom | ISP (International) | Portugal | 1
Astral | ISP (International) | Romania | 1
BEOTEL–AS BeotelNet–ISP | ISP (International) | Serbia and Montenegro | 1
Telia | ISP (International) | Sweden | 1
Asia Infonet | ISP (International) | Thailand | 1
TOT Content Farm Network | ISP (International) | Thailand | 1
Farlep–Odessa ISP | ISP (International) | Ukraine | 1
GoDaddy.com | ISP (International) | USA | 1
RoadRunner | ISP (International) | USA | 1
SYNCHRONOSS TECHNOLOGIES | ISP (International) | USA | 1
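The percentages quoted for this title can be recomputed from Table 1. The sketch below transcribes the download counts per provider type from the table; the variable and function names are illustrative only.

```python
# Download counts per row of Table 1, grouped by provider type.
DOWNLOADS_BY_TYPE = {
    "Academic (International)": [1, 1, 1, 1, 1, 1, 1, 3],
    "Business (International)": [1],
    "Non-profit (International)": [1],
    "ISP (National)": [1],
    "ISP (International)": [1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1, 5,
                            1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
}


def usage_shares(downloads_by_type):
    """Return the percentage of all downloads generated by each provider type."""
    totals = {kind: sum(counts) for kind, counts in downloads_by_type.items()}
    grand_total = sum(totals.values())
    return {kind: 100 * total / grand_total for kind, total in totals.items()}


shares = usage_shares(DOWNLOADS_BY_TYPE)
for kind, share in shares.items():
    print(f"{kind}: {share:.1f}%")
```

With 43 downloads in total, the academic share comes to 10 out of 43 (about 23 percent) and the share of foreign ISPs to 30 out of 43 (almost 70 percent), matching the figures given in the text.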

 

Measuring usage at the level of the complete collection

Looking at the usage data of all books, it is clear that most traffic comes from ISPs, followed by usage from academic institutions. While usage by government or business is discussed as the primary source of societal relevance, here it plays a minor role. Furthermore, 85 percent of the usage is international.

 

Table 2: Usage data of all books.
Usage | Total | National | International
Academic | 27.82% | 4.71% | 23.11%
Non–profit | 0.91% | 0.19% | 0.72%
Government | 2.25% | 0.30% | 1.95%
Business | 1.18% | 0.20% | 0.98%
ISP | 67.84% | 9.44% | 58.40%
Total |  | 14.84% | 85.16%

 

 

 
Figure 1: Downloads OAPEN Library.

 

 

++++++++++

Are all ISPs equal?

The high percentage of usage coming from ISPs presents an unexpected problem. Without further refinement, almost 68 percent of the usage is hard to categorize. A method is needed to distinguish whether the usage comes from users whose organisation does not provide Internet access or from users who are downloading the monographs ‘from home’. The solution can be found by looking at the internet infrastructure per country, combined with the percentage of ISPs.

Internet infrastructure and ISPs

The Internet infrastructure differs from country to country. We might assume that in countries with a highly developed Internet infrastructure, most organisations are capable of directly providing Internet access to their employees. In contrast, access to the Internet will almost certainly be provided through an ISP in countries with a weakly developed Internet infrastructure. In other words: we might expect that in countries with a highly developed infrastructure, ‘professional users’ are more likely to use the Internet access provided by their organisation and the users who access the OAPEN Library through an ISP are not doing that as part of their professional role.

In order to assess the state of the Internet infrastructure per country, statistical data from the World Bank is used. The publication The Little Data Book on Information and Communication Technology 2011 contains several indicators on the state of the IT infrastructure per country (World Bank, 2011). One of the indicators is the number of Internet users per 100 people. When this indicator is plotted against the percentage of ISPs per country found in the data, we find that in countries with a higher percentage of Internet users (countries with a better developed infrastructure) the percentage of ISPs is lower.

 

 
Figure 2: Percentage ISPs and Internet users per country.

 

When we look at the data we might assume that the people using a highly developed Internet infrastructure are less likely to use an ISP if they download books from the OAPEN Library in their professional role. So, downloads through an ISP from countries with a highly developed Internet infrastructure are more likely to be coming from non–professional users. The next question to answer is which countries are considered to have a highly developed infrastructure. Plotting all countries displays a smooth descent from 94.5 Internet users per 100 people (Iceland) to 0.5 (Ethiopia).

In order to find a suitable cutoff point, the number of providers of all countries was listed. From this list, the 25 countries with the highest number of providers — regardless of the type — were selected, and the number of Internet users per 100 people was plotted in the following chart.

 

 
Figure 3: Internet users (per 100 people).

 

The first cutoff point can be found between Switzerland (70.9 Internet users per 100 people) and the Czech Republic (63.7 Internet users per 100 people). Therefore, it is assumed that all countries with 70 or more Internet users per 100 people have a highly developed Internet infrastructure and ISP usage from these countries is more likely to come from the ‘general public’.

 

Table 3: The 25 countries with the highest number of providers.
| Provider country | Total number of providers | Number of ISPs | Percentage ISPs | Internet users (per 100 people) |
| --- | --- | --- | --- | --- |
| Sweden | 36 | 18 | 50.00% | 90.3 |
| Netherlands | 107 | 49 | 45.79% | 90 |
| Denmark | 45 | 25 | 55.56% | 85.9 |
| Finland | 34 | 13 | 38.24% | 83.9 |
| Great Britain | 119 | 31 | 26.05% | 83.2 |
| Germany | 113 | 26 | 23.01% | 79.5 |
| USA | 181 | 72 | 39.78% | 78.1 |
| Canada | 36 | 13 | 36.11% | 77.7 |
| Japan | 30 | 19 | 63.33% | 77.7 |
| Belgium | 41 | 20 | 48.78% | 75.2 |
| Austria | 36 | 12 | 33.33% | 73.5 |
| Poland | 30 | 20 | 66.67% | 72.3 |
| Australia | 45 | 18 | 40.00% | 72 |
| France | 49 | 13 | 26.53% | 71.3 |
| Switzerland | 25 | 10 | 40.00% | 70.9 |
| Czech Republic | 42 | 24 | 57.14% | 63.7 |
| Spain | 42 | 12 | 28.57% | 61.2 |
| Portugal | 25 | 9 | 36.00% | 48.6 |
| Italy | 53 | 16 | 30.19% | 48.5 |
| Greece | 25 | 9 | 36.00% | 44.1 |
| Russia | 63 | 49 | 77.78% | 42.1 |
| Brazil | 24 | 12 | 50.00% | 39.2 |
| Ukraine | 23 | 21 | 91.30% | 33.3 |
| Indonesia | 51 | 36 | 70.59% | 8.7 |
| India | 25 | 13 | 52.00% | 5.3 |
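The relationship between Internet usage and the share of ISPs can be checked roughly on the 25 rows of Table 3. The sketch below computes a Pearson correlation coefficient by hand; this choice of statistic is an assumption made for illustration, since the article only describes a plot. Table 3 also covers just the 25 countries with the most providers, so the result need not match the pattern across all countries in Figure 2.

```python
from math import sqrt

# (Internet users per 100 people, percentage of ISPs), copied from Table 3.
TABLE_3 = {
    "Sweden": (90.3, 50.00), "Netherlands": (90.0, 45.79),
    "Denmark": (85.9, 55.56), "Finland": (83.9, 38.24),
    "Great Britain": (83.2, 26.05), "Germany": (79.5, 23.01),
    "USA": (78.1, 39.78), "Canada": (77.7, 36.11),
    "Japan": (77.7, 63.33), "Belgium": (75.2, 48.78),
    "Austria": (73.5, 33.33), "Poland": (72.3, 66.67),
    "Australia": (72.0, 40.00), "France": (71.3, 26.53),
    "Switzerland": (70.9, 40.00), "Czech Republic": (63.7, 57.14),
    "Spain": (61.2, 28.57), "Portugal": (48.6, 36.00),
    "Italy": (48.5, 30.19), "Greece": (44.1, 36.00),
    "Russia": (42.1, 77.78), "Brazil": (39.2, 50.00),
    "Ukraine": (33.3, 91.30), "Indonesia": (8.7, 70.59),
    "India": (5.3, 52.00),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


users = [u for u, _ in TABLE_3.values()]
isp_share = [p for _, p in TABLE_3.values()]
r = pearson(users, isp_share)
print(f"Pearson r = {r:.2f}")
```

A negative coefficient is in line with the observation that a higher percentage of Internet users goes together with a lower percentage of ISPs.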

 

A refined categorisation of ISP usage statistics

Refining the categorisation of the ISP usage statistics paints quite a different picture. The percentage of data generated by ISPs is now divided into 31.86 percent that cannot be categorised as ‘personal’ or ‘professional’ use and almost 36 percent where the possibility of ‘personal’ usage is much higher. If we combine this with the other categories, more than two–thirds of the usage data can be explained.

 

Table 4: Usage data of all books, refined.
| Usage | Total | National | International |
| --- | --- | --- | --- |
| Academic | 27.82% | 4.71% | 23.11% |
| Non–profit | 0.91% | 0.19% | 0.72% |
| Government | 2.25% | 0.30% | 1.95% |
| Business | 1.18% | 0.20% | 0.98% |
| ISP | 31.86% | 0.75% | 31.11% |
| ISP (High Internet usage) | 35.97% | 8.68% | 27.29% |
| Total | | 14.84% | 85.16% |
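The 'more than two–thirds' figure can be reproduced from the Total column of Table 4. The following is a minimal sketch; the dictionary keys and variable names are illustrative, and the plain 'ISP' row is treated as the only uncategorised group, as described in the text.

```python
# Total-column percentages from Table 4.
usage_share = {
    "Academic": 27.82,
    "Non-profit": 0.91,
    "Government": 2.25,
    "Business": 1.18,
    "ISP": 31.86,  # cannot be categorised further
    "ISP (High Internet usage)": 35.97,
}

# Only the plain ISP group remains unexplained after the refinement.
UNCATEGORISED = {"ISP"}

categorised = sum(
    share for group, share in usage_share.items() if group not in UNCATEGORISED
)
print(f"Categorised usage: {categorised:.2f}%")  # prints "Categorised usage: 68.13%"
```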

 

 

Figure 4: Downloads OAPEN Library — 2011; refined.

 

 

++++++++++

Possible influences on usage

Two possible influences on the usage in the OAPEN Library will be discussed: subject and language. Using the average number of downloads per group of titles, the distribution of the providers will be analysed. The average number of downloads is used here to compensate for the varying number of titles per subject or language. As described below, the number of titles with the same subject ranges from 151 to 22. The same holds true for titles in the same language: the set contains 460 books in English, 105 in Dutch, 112 in Italian and 126 in German.

After analysing the data on the level of the complete OAPEN Library or relatively large subsets, the data on the level of individual books will be discussed. However, the analysis at the individual level will be less thorough.

Subject — highest level

In the OAPEN Library, the subject of the books is described using the BIC classification (Book Industry Communication, 2010). Due to its hierarchical nature, the classification assigned to each book can be abbreviated. This results in a larger group of monographs which share the same — broad — subject. The usage data of the 10 largest groups were compared with the averages of all books in the OAPEN Library, to see if the usage patterns differ significantly. In the following table, all data is normalised to the average number of downloads per subject.
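As an illustration of how a hierarchical classification code can be abbreviated, the sketch below truncates each code to its first two characters, which matches the two-letter codes such as JH and JP used for the subject groups. The sample titles and their full codes are hypothetical, and the truncation length is an assumption about how the grouping was done.

```python
from collections import defaultdict


def broad_subject(bic_code: str, level: int = 2) -> str:
    """Abbreviate a hierarchical subject code to its first `level` characters."""
    return bic_code[:level].upper()


# Hypothetical catalogue entries: (title, full subject code).
books = [
    ("Book A", "JHB"),
    ("Book B", "JHMC"),
    ("Book C", "JPA"),
    ("Book D", "HBJD"),
]

groups = defaultdict(list)
for title, code in books:
    groups[broad_subject(code)].append(title)

for subject, titles in sorted(groups.items()):
    print(subject, titles)
```

Books A and B carry different detailed codes but end up in the same broad group, which is what makes the groups large enough to compare.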

 

Table 5: Subject: Usage data of 10 largest groups.
| Number of titles | Book subject | Academic (National) | Academic (International) | Non–profit (National) | Non–profit (International) | Government (National) | Government (International) | Business (National) | Business (International) | ISP (National) | ISP High Internet usage (National) | ISP (International) | ISP High Internet usage (International) | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 859 | OAPEN (all books) | 1.39 | 6.84 | 0.06 | 0.21 | 0.09 | 0.58 | 0.06 | 0.29 | 0.22 | 2.57 | 9.20 | 8.07 | 29.58 |
| 65 | Sociology and anthropology (JH) | 1.78 | 16.35 | 0.03 | 0.29 | 0.11 | 1.68 | 0.02 | 0.69 | 0.11 | 4.35 | 19.35 | 18.22 | 62.98 |
| 43 | Science: general issues (PD) | 1.51 | 14.30 | 0.00 | 0.37 | 0.21 | 1.91 | 0.00 | 0.40 | 0.00 | 3.02 | 13.74 | 13.88 | 49.35 |
| 51 | Society and culture (JF) | 1.57 | 10.27 | 0.04 | 0.24 | 0.06 | 0.88 | 0.00 | 0.37 | 0.14 | 2.63 | 15.08 | 12.75 | 44.02 |
| 148 | Politics and government (JP) | 1.32 | 8.67 | 0.03 | 0.22 | 0.12 | 0.80 | 0.01 | 0.28 | 0.11 | 2.14 | 12.66 | 8.97 | 35.32 |
| 30 | Film, TV and Radio (AP) | 1.97 | 7.43 | 0.00 | 0.20 | 0.00 | 0.33 | 0.07 | 0.73 | 0.00 | 2.00 | 10.50 | 9.13 | 32.37 |
| 151 | History (HB) | 1.72 | 4.27 | 0.09 | 0.15 | 0.17 | 0.34 | 0.11 | 0.16 | 0.44 | 4.03 | 6.97 | 5.99 | 24.42 |
| 22 | Philosophy (HP) | 1.14 | 4.59 | 0.09 | 0.23 | 0.00 | 0.18 | 0.14 | 0.23 | 0.27 | 1.50 | 6.82 | 8.59 | 23.77 |
| 28 | Literature: history and criticism (DS) | 1.43 | 3.18 | 0.07 | 0.07 | 0.00 | 0.04 | 0.39 | 0.32 | 0.00 | 2.04 | 3.86 | 6.29 | 17.68 |
| 22 | Linguistics (CF) | 0.41 | 2.82 | 0.00 | 0.09 | 0.05 | 0.18 | 0.00 | 0.05 | 0.50 | 1.36 | 3.82 | 3.91 | 13.18 |
| 22 | Laws of Specific jurisdictions (LN) | 0.73 | 1.73 | 0.14 | 0.18 | 0.05 | 0.14 | 0.00 | 0.14 | 0.09 | 0.50 | 2.64 | 1.91 | 8.23 |

 

Average downloads per subject

All data is normalised to the average number of downloads per subject.

 

Figure 5: Average downloads per subject.

 

When looking at the average number of downloads, it is striking that subjects from the social sciences — Sociology and anthropology, Society and culture, Politics and government — are more ‘popular’ than well–known subjects from the humanities, such as History, Philosophy and Literature.

The large differences in downloads per subject raise the question whether this is caused by differences in the usage per reader group. For instance, is the large uptake of Sociology and anthropology caused by relatively high academic usage? In order to find the answer, the percentages of usage per provider were computed.
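The conversion from average downloads to percentages per reader group can be sketched as follows, using the Sociology and anthropology (JH) row of Table 5. The column order is an assumption reconstructed from the table headers (national before international within each group), and the function name is illustrative.

```python
# Assumed column order of Table 5, without the Total column.
COLUMNS = [
    "Academic (National)", "Academic (International)",
    "Non-profit (National)", "Non-profit (International)",
    "Government (National)", "Government (International)",
    "Business (National)", "Business (International)",
    "ISP (National)", "ISP High Internet usage (National)",
    "ISP (International)", "ISP High Internet usage (International)",
]

# Average downloads per title for Sociology and anthropology (JH), Table 5.
sociology = [1.78, 16.35, 0.03, 0.29, 0.11, 1.68,
             0.02, 0.69, 0.11, 4.35, 19.35, 18.22]


def to_percentages(values):
    """Express each value as a percentage of the row total."""
    total = sum(values)
    return [100 * v / total for v in values]


shares = dict(zip(COLUMNS, to_percentages(sociology)))
for group, pct in shares.items():
    print(f"{group}: {pct:.1f}%")
```

Expressed this way, subjects with very different download totals can be compared on the composition of their readership alone.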

Average downloads per subject — percentage

All data is normalised to the average number of downloads per subject.

 

Figure 6: Average downloads per subject — percentage.

 

Here, the distribution across the subjects does not change dramatically, with the exception of History, Linguistics and Literature. For these subjects the academic national usage is relatively high. In the case of History, the explanation may lie in the fact that if the historic subject is national, the usage will tend to be national as well. Linguistics and Literature are of course closely bound to national languages; the percentage of academic readers interested in their national language will be greater than readers interested in foreign languages.

Furthermore, the largest percentages of national ‘ISP usage’ coming from countries with a high level of Internet usage (in other words, readers who are most likely to be interested for non–professional reasons) are to be found with History and Linguistics. In contrast, the usage of legal books (Laws of Specific jurisdictions) by government agencies and businesses is relatively high, but is still dwarfed by academic and ‘ISP usage’.

Language — highest level

The collection of the OAPEN Library contains several languages. Not all languages are equally represented. Therefore, only the largest groups are discussed. In the following table, all data is normalised to the average number of downloads per language.

 

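Table 6: Language: Usage data of four largest groups.
| Number of titles | Language | Academic (National) | Academic (International) | Non–profit (National) | Non–profit (International) | Government (National) | Government (International) | Business (National) | Business (International) | ISP (National) | ISP High Internet usage (National) | ISP (International) | ISP High Internet usage (International) | Total |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 859 | OAPEN (all books) | 1.39 | 6.84 | 0.06 | 0.21 | 0.09 | 0.58 | 0.06 | 0.29 | 0.22 | 2.57 | 9.20 | 8.07 | 29.58 |
| 460 | English | 1.51 | 10.80 | 0.07 | 0.32 | 0.07 | 1.00 | 0.10 | 0.51 | 0.02 | 2.63 | 15.38 | 12.70 | 45.09 |
| 105 | Dutch | 2.46 | 2.10 | 0.00 | 0.01 | 0.08 | 0.10 | 0.03 | 0.06 | 0.00 | 5.94 | 1.62 | 2.56 | 15.04 |
| 112 | Italian | 0.48 | 1.90 | 0.00 | 0.07 | 0.07 | 0.11 | 0.01 | 0.04 | 1.64 | 0.00 | 1.97 | 2.22 | 8.52 |
| 126 | German | 0.71 | 1.56 | 0.06 | 0.17 | 0.08 | 0.06 | 0.02 | 0.02 | 0.00 | 0.88 | 1.54 | 2.69 | 7.79 |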

Average downloads per language

All data is normalised to the average number of downloads per language.

 

Figure 7: Average downloads per language.

 

When looking at the average number of downloads per title, it becomes clear that English is the most read language: it amounts to approximately 150 percent of the average of the complete OAPEN Library usage. The average number of downloads of Dutch titles is almost twice as high as that of titles in Italian and German. The explanation may be found in the fact that 14 percent of all usage data originated in the Netherlands, while Italian providers are responsible for four percent and German providers for eight percent.
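The ratios mentioned here follow directly from the Total column of Table 6. The sketch below simply divides the averages; the variable names are illustrative.

```python
# Average downloads per title (Total column of Table 6).
avg_downloads = {
    "All books": 29.58,
    "English": 45.09,
    "Dutch": 15.04,
    "Italian": 8.52,
    "German": 7.79,
}

baseline = avg_downloads["All books"]
english_vs_all = avg_downloads["English"] / baseline
dutch_vs_italian = avg_downloads["Dutch"] / avg_downloads["Italian"]
dutch_vs_german = avg_downloads["Dutch"] / avg_downloads["German"]

print(f"English vs. all books: {english_vs_all:.0%}")
print(f"Dutch vs. Italian: {dutch_vs_italian:.2f}x")
print(f"Dutch vs. German: {dutch_vs_german:.2f}x")
```

English comes out slightly above 150 percent of the overall average, and Dutch titles are downloaded roughly 1.8 to 1.9 times as often as Italian and German titles.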

The differences in usage may also be connected to differences in usage by each reader group. For that reason, the percentages of usage per provider were computed.

Average downloads per language — percentage

All data is normalised to the average number of downloads per language.

 

Figure 8: Average downloads per language — percentage.

 

The percentages reveal the ‘national appeal’ of Dutch language titles: the percentages of national usage, both academic and coming from ISPs, are far greater than those of the other languages or the average of all books. The international usage coming from ISPs is by far the lowest, and the percentage of international academic use is also lower compared to the other languages. National usage for Dutch language books of course comes from both the Netherlands and Belgium. In this particular case, the Dutch language books published by Dutch publishers account for 35 percent of the usage data, while the Dutch language books published by Belgian publishers account for 4.5 percent of the usage.

In contrast, the books written in English have the lowest percentages of national usage. This is of course not surprising: English functions as the ‘lingua franca’ of science. The percentages of German and Italian books fall between these two extremes. From this we might conclude that books written in English, German and Italian appeal to a far more international audience than those written in Dutch. If Dutch or Belgian authors want their work to be used outside their countries, translation is necessary. The same effect was found for Danish, but the number of titles was much lower: 22. Therefore these titles were not taken into account here.

Subject — book level

Here, all downloads per individual title are analysed, per subject. The main goal is to look at the skewness of the total number of downloads: is it heavily influenced by just a few titles, or is the number of downloads spread relatively evenly? Furthermore, the usage percentages of the 15 most downloaded titles are visualised, in order to determine if they deviate greatly from the percentages of the whole group.
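One simple way to express how strongly a group depends on a few titles is the share of all downloads taken by its 15 most downloaded titles, the figure reported for each group below. The sketch uses hypothetical download counts; only the choice of 15 titles comes from the text.

```python
def top_n_share(downloads, n=15):
    """Percentage of all downloads accounted for by the n most downloaded titles."""
    total = sum(downloads)
    if total == 0:
        return 0.0
    top = sorted(downloads, reverse=True)[:n]
    return 100 * sum(top) / total


# Hypothetical groups of 30 titles each.
skewed = [500] + [10] * 29   # one title dominates the group
even = [25] * 30             # downloads spread evenly

print(f"Skewed group: {top_n_share(skewed):.1f}%")
print(f"Even group: {top_n_share(even):.1f}%")
```

In the evenly spread group the top 15 titles account for exactly half of the downloads, while a single dominant title pushes the share much higher.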

It becomes clear that the social sciences are more prone to skewed distributions of downloads, compared to the humanities. The groups Sociology and anthropology, Society and culture and Politics and government all contain a title that is downloaded far more than the rest. All these titles with an exceptional number of downloads were authored by members of the IMISCOE Research Network [3]. The Web site of the IMISCOE Network contains links to all books in the OAPEN Library. This may be the reason for the high number of downloads.

When we look at the usage percentages, we see that lower numbers of downloads seem to correlate with larger differences in percentages. A good example can be found in the group Laws of Specific jurisdictions, where the title Videovernehmung kindlicher Zeugen; zur Praxis des Zeugenschutzgesetzes, ISBN 978–3–938–61683–3 shows a usage percentage of 40 percent by foreign government organisations. This looks very spectacular, but it is caused by two downloads. Small differences give high percentages!
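The effect of small numbers can be made concrete with a short calculation. A total of five downloads for the title is implied by the figures in the text (two downloads equal 40 percent) but is not stated explicitly; the second total of 500 is purely hypothetical.

```python
def share(part, total):
    """Percentage of `total` represented by `part`."""
    return 100 * part / total


# Two downloads out of an implied total of five.
print(share(2, 5))    # 40.0

# The same two downloads for a hypothetical title with 500 downloads.
print(share(2, 500))  # 0.4
```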

Each book is identified using ISBN (International Standard Book Number).

Sociology and anthropology

The chart depicts the total number of downloads per title. The total number of titles is 65.

 

Figure 9: Sociology and anthropology (JH) — Total downloads per title.

 

Here we see an outlier, with 680 downloads: Diaspora and Transnationalism: Concepts, Theories and Methods, ISBN 978–9–089–64238–7.

The chart depicts the 15 most downloaded titles, combined with the percentages for all titles with the same subject.

 

Figure 10: Sociology and anthropology (JH) — Most downloaded, percentage.

 

The first 15 titles are responsible for 71.64 percent of all downloads. Here we see one obvious outlier: Nationale identiteit en meervoudig verleden, ISBN 978–9–053–56358–8. The difference may come from the large number of OAPEN users from the Netherlands and Belgium.

Science: general issues

The chart depicts the total number of downloads per title. The total number of titles is 43.

 

Figure 11: Science: general issues (PD) — Total downloads per title.

 

The chart depicts the 15 most downloaded titles, combined with the percentages for all titles with the same subject.

 

Figure 12: Science: general issues (PD) — Most downloaded, percentage.

 

The first 15 titles are responsible for 83.84 percent of all downloads. Here we see the same pattern as the previous subject: no large differences save one outlier: Van natuurlandschap tot risicomaatschappij: De geografie van de relatie tussen mens en milieu, ISBN 978–9–053–56798–2. As this is the only Dutch language title, the large number of Dutch OAPEN users may have caused this.

Society and culture

The chart depicts the total number of downloads per title. The total number of titles is 51.

 

Figure 13: Society and culture (JF) — Total downloads per title.

 

This group contains one outlier: The Dynamics of International Migration and Settlement in Europe: A State of the Art, ISBN 978–9–053–56866–8.

The chart depicts the 15 most downloaded titles, combined with the percentages for all titles with the same subject.

 

Figure 14: Society and culture (JF) — Most downloaded, percentage.

 

The first 15 titles are responsible for 75.08 percent of all downloads. There is no obvious outlier.

Politics and government

The chart depicts the total number of downloads per title. The total number of titles is 148.

 

Figure 15: Politics and government (JP) — Total downloads per title.

 

As is the case with other social science groups, here we see one outlier: Innovative Concepts for Alternative Migration Policies: Ten Innovative Approaches to the Challenges of Migration in the 21st Century, ISBN 978–9–053–56990–0.

The chart depicts the 15 most downloaded titles, combined with the percentages for all titles with the same subject.

 

Figure 16: Politics and government (JP) — Most downloaded, percentage.

 

The first 15 titles are responsible for 42.98 percent of all downloads. The title Illegal Residence and Public Safety in the Netherlands, ISBN 978–9–089–64049–9 has a relatively large percentage of national academic usage, which is not surprising given the fact that it was published in the Netherlands.

Film, TV and Radio

The chart depicts the total number of downloads per title. The total number of titles is 30.

 

Figure 17: Film, TV and Radio (AP) — Total downloads per title.

 

There is no obvious outlier.

The chart depicts the 15 most downloaded titles, combined with the percentages for all titles with the same subject.

 

Figure 18: Film, TV and Radio (AP) — Most downloaded, percentage.

 

The first 15 titles are responsible for 71.68 percent of all downloads. Here, downloads from government agencies are a relatively large percentage of all downloads for one title. This is caused by the overall low number of downloads: per title it is one or two downloads.

History

The chart depicts the total number of downloads per title. The total number of titles is 151.

 

Figure 19: History (HB) — Total downloads per title.

 

There is no obvious outlier.

The chart depicts the 15 most downloaded titles, combined with the percentages for all titles with the same subject.

 

Figure 20: History (HB) — Most downloaded, percentage.

 

The first 15 titles are responsible for 27.66 percent of all downloads. For the following titles, we see a relatively high number of national downloads through ISPs in countries with high Internet usage:

  • Literary Cultures and Public Opinion in the Low Countries, 1450–1650, ISBN 978–9–004–20616–8
  • De hand van Huizinga, ISBN 978–9–089–64020–8
  • Het Hemels Mandaat: De Geschiedenis van het Chinese Keizerrijk, ISBN 978–9–089–64120–5
  • Opera omnia Desiderii Erasmi: Ordinis secundi tomus quartus, ISBN 978–0–444–70132–9

All are published by Dutch publishers, and the language is either Dutch or the book is concerned with a Dutch subject.

Philosophy

The chart depicts the total number of downloads per title. The total number of titles is 22.

 

Figure 21: Philosophy (HP) — Total downloads per title.

 

There is no obvious outlier.

The chart depicts the 15 most downloaded titles, combined with the percentages for all titles with the same subject.

 

Figure 22: Philosophy (HP) — Most downloaded, percentage.

 

The first 15 titles are responsible for 91.59 percent of all downloads. The 15th title received nine downloads. We will continue to see strong deviations in the percentages, combined with a small number of downloads.

Literature: history and criticism

The chart depicts the total number of downloads per title. The total number of titles is 28.

 

Figure 23: Literature: history and criticism (DS) — Total downloads per title.

 

There is no obvious outlier.

The chart depicts the 15 most downloaded titles, combined with the percentages for all titles with the same subject.

 

Figure 24: Literature: history and criticism (DS) — Most downloaded, percentage.

 

The first 15 titles are responsible for 76.77 percent of all downloads. Here we see much variation in the usage percentages per title. Because of the relatively low number of downloads, ranging from 49 to 16, a single download has a large impact in the chart.

Linguistics

The chart depicts the total number of downloads per title. The total number of titles is 22.

 

Figure 25: Linguistics (CF) — Total downloads per title.

 

The first two titles are responsible for 29.66 percent of all downloads.

The chart depicts the 15 most downloaded titles, combined with the percentages for all titles with the same subject.

 

Figure 26: Linguistics (CF) — Most downloaded, percentage.

 

The first 15 titles are responsible for 88.97 percent of all downloads. Again we see a large difference in usage percentages, but a small number of overall downloads.

Laws of Specific jurisdictions

The chart depicts the total number of downloads per title. The total number of titles is 22.

 

Figure 27: Laws of Specific jurisdictions (LN) — Total downloads per title.

 

The chart depicts the 15 most downloaded titles, combined with the percentages for all titles with the same subject.

 

Figure 28: Laws of Specific jurisdictions (LN) — Most downloaded, percentage.

 

The first 15 titles are responsible for 89.50 percent of all downloads. Again we see a large difference in usage percentages, but a small number of overall downloads.

Language — book level

The analysis on languages shows the same pattern: a low number of downloads seems to be correlated with high diversity in percentages. This is best illustrated with the differences between English and German. The usage percentages for English, where the average number of downloads per book is 45.29, are not much different. This contrasts with German, where the average number of downloads is much lower: 7.79.

English

The chart depicts the total number of downloads per title. The total number of titles is 460.

 

Figure 29: English — Total downloads per title.

 

The outlier is of course: Diaspora and Transnationalism: Concepts, Theories and Methods, ISBN 978–9–089–64238–7.

The chart depicts the 15 most downloaded titles, combined with the percentages for all titles with the same language.

 

Figure 30: English — Most downloaded, percentage.

 

The first 15 titles are responsible for 20.90 percent of all downloads. Here the usage percentages are the most consistent.

Dutch

The chart depicts the total number of downloads per title. The total number of titles is 105.

 

Figure 31: Dutch — Total downloads per title.

 

The most popular title is Nationale identiteit en meervoudig verleden, ISBN 978–9–053–56358–8.

The chart depicts the 15 most downloaded titles, combined with the percentages for all titles with the same language.

 

Figure 32: Dutch — Most downloaded, percentage.

 

The first 15 titles are responsible for 44.08 percent of all downloads. The ‘national’ appeal, which was discussed earlier, is clearly visible through the relatively high percentages of national academic and ISP usage.

Italian

The chart depicts the total number of downloads per title. The total number of titles is 112.

 

Figure 33: Italian — Total downloads per title.

 

The most downloaded title is Tell Barri/Kahat: la campagna del 2000, ISBN 88–8453–097–0.

The chart depicts the 15 most downloaded titles, combined with the percentages for all titles with the same language.

 

Figure 34: Italian — Most downloaded, percentage.

 

The first 15 titles are responsible for 36.79 percent of all downloads. The small number of downloads correlates once again with larger differences in usage percentages.

German

The chart depicts the total number of downloads per title. The total number of titles is 126.

 

Figure 35: German — Total downloads per title.

 

The most popular title — with 56 downloads — is Ein Compendium sumerisch–akkadischer Beschwörungen, ISBN 978–3–940–34417–5.

The chart depicts the 15 most downloaded titles, combined with the percentages for all titles with the same language.

 

Figure 36: German — Most downloaded, percentage.

 

The first 15 titles are responsible for 37.68 percent of all downloads.

 

++++++++++

Conclusion

The method as addition to existing assessments

The problem addressed in this paper is the measurement of scientific impact and societal relevance in the field of Humanities and Social Sciences. When looking at methods to measure scientific impact through the published output, we saw that the standard bibliometric methods employed in the fields of Science, Technology and Medicine — where publishing in articles is the norm — do not function well in a field where monographs are the standard. Quantitative methods that do take monographs into account are aimed at measuring scientific impact only, leaving out the societal benefits.

The methods used to assess societal relevance also have drawbacks. First of all, most of these methods are qualitative, depending on self–assessments by scholars and on opinions by representatives of stakeholders outside academia. Apart from possible subjective biases, these methods require that stakeholders are known. This is not always the case in the field of HSS, especially in the humanities. Another aspect is the amount of labour involved: discussions with focus groups, sending questionnaires and conducting desk research and stakeholder analysis require quite a lot of manpower. Societal relevance is hard to measure in the field of HSS, especially when the groups that would primarily benefit from research are not always known. In the field of STM, patents are used as an indicator, but a comparable indicator for HSS research has not been defined.

In this paper, a new method is used to overcome some of those issues. This method measures the usage of monographs and identifies the organisation responsible for the Internet access. Therefore both the usage and the readers of each monograph are known. The amount of usage — here restricted to number of downloads — by each type of reader could be used to assess the value of scientific or scholarly output.

The method is quantitative, which makes the results easier to validate. The number of measurements is also large: the dataset for this paper consists of over 25,000 downloads by more than 1,500 providers, spread over 859 monographs. A large dataset reduces the chances of outliers influencing the results. It is not necessary to know the stakeholders in advance: the method is used to identify the readers. This solves one of the identified problems, especially in the field of humanities, where beneficiaries besides academics are not always known. Knowing other users besides academics makes it easier to assess the societal relevance. Another drawback of the described qualitative methods is the labour intensity; by relying heavily on automated tools, this method is relatively easy to execute. Furthermore, one of the problems attached to measuring societal relevance is attribution: how to measure the influence of a certain scholar? Here we look at the usage of books, which makes it easy to identify the influence of each author.

Discussion of the results

When looking at the results, it becomes clear that the monographs are not used exclusively by scholars. From the measured data, over 27 percent is directly linked to academic users. The percentage of usage that can be linked directly to other ‘professional’ users is quite small: less than five percent. This leaves a large portion of users that cannot be categorised immediately. By taking into account the percentage of ISPs per country, this group is further categorised. This results in a group of users that cannot be categorised and a group of users (more than 35 percent of all users) that have a higher probability of being ‘non–professional users’, also known as the ‘general public’. Taken together, more than 68 percent of the usage can be categorised, and almost 45 percent of all usage comes from non–academics. This might indicate that the monographs have an impact in society.

In order to further refine the results, two possible influences on monograph usage were analysed: subject and language. When looking at the influence of subject on usage, we saw that the average number of downloads per subject varies widely. Most of the subjects that received a higher number of downloads than the average of the total set come from the field of the social sciences. The humanities were less ‘popular’, with numbers that lie mostly below the average for the complete set. If we use this as a measure of societal relevance, we might conclude that monographs in the social sciences enjoy a relatively large readership outside academia. The number of books on a certain subject may have influenced these results, but it is not very likely: 65 books on Sociology and anthropology receive an average number of 62.98 downloads, and 51 books on Society and culture are downloaded 44.02 times on average. In contrast, 151 History books (a much larger number of titles) are downloaded 24.42 times on average. When the usage percentages per group are taken into account, it becomes clear that they do not differ significantly. Only History, Linguistics and Literature are the exception: here the percentage of ‘national’ usage is higher. These subjects might have a tendency to be bound to national borders.

In order to measure the influence of language on monograph usage, the four largest language groups were analysed. Again, the average number of downloads and the percentages per group were used. It was hardly surprising to discover that books in English, the ‘lingua franca’ of science, were downloaded the most. A more interesting discovery was the fact that some languages such as Dutch (and Danish) were read much less outside of national borders than Italian or German. While a Dutch or Belgian scholar would need a translation in order to have more influence abroad, this applies far less to Germans or Italians.

The analysis on the level of individual books revealed that within the social sciences, the distribution of usage was relatively more skewed than within the humanities: the groups Sociology and anthropology, Society and culture and Politics and government all contain a title that is downloaded far more than the rest. It is interesting to note that all these books were written by authors connected to the IMISCOE [4] network. Possibly, readers were alerted through the IMISCOE Web site.

The results give an indication of the usage, and it becomes clear that HSS monographs are read outside academia, which points to their societal relevance. Below, the conclusions are discussed a little further.

First of all, the research method is based on measuring usage of electronic versions of monographs. The usage data of the paper versions — such as sales figures or borrowing data from libraries — were not available. It would be interesting to see if the percentage of user categories would differ dramatically. Given the economic circumstances discussed earlier, we might conclude that the dissemination of paper books is far less successful than electronic ones. However, it may be possible that a certain group of readers prefers the paper monograph to the electronic version, and this aspect has not been taken into account.

Another aspect is the dissemination channel. In earlier research (Snijder, 2010), it became clear that different dissemination channels display different results. There, the usage through an institutional repository was significantly smaller than usage through the Google Book Search program. Here, one dissemination channel is used and therefore we cannot compare the usage patterns. In other words: we cannot determine if the low usage by government agencies, non–profit organisations and businesses is solely caused by the contents of the monographs, or whether it is partly caused by the fact that the OAPEN Library is not used by these types of organisations. Another aspect of the OAPEN Library is that it only hosts open access monographs. This means that the complete text of the books is fully available online. As there is no comparable data set available of monographs that are not fully accessible, we cannot determine how usage is influenced by open access.

The data analysed is the usage measured through the OAPEN Web site; direct downloads are not taken into account. For those direct downloads, only the total number is available at this point; no other data. When the complete data becomes available, it will be interesting to see whether the percentage of ‘ISP’ usage will become smaller. The total number of downloads in 2011, over 300,000, is more than six times higher than the number of downloads in the current data set. This much larger number of ‘direct’ downloads may come from library systems or other collections of book data. These data files will probably be made available to ‘professional’ users, such as academics or civil servants. This may explain the small percentage of government use, or it may uncover a much higher scientific and scholarly use.

Possible refinements to the method

The method in its current form uses relatively broad categories. Users are divided into six groups and are categorised as national or international. Based on these categories, it is simple to make analyses on an abstract level. By doing so, smaller effects based on specific books are not visible. One of the possible refinements could be categorisation by country. This enables us to look at a more detailed level. For instance, the books published by KITLV Press are mostly downloaded through Indonesian providers. The reason for this is clear: the subject of all KITLV titles is South East Asia, and most of those monographs describe themes from Indonesia. An analysis on country level may uncover more of these effects, but the level of detail required is beyond the scope of this paper.
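A country-level refinement could start from a simple count of downloads per provider country for one publisher. The sketch below uses hypothetical download records; the record layout and the function name are assumptions and do not describe the structure of the actual dataset.

```python
from collections import Counter

# Hypothetical download records: (publisher, provider country).
records = [
    ("KITLV Press", "Indonesia"),
    ("KITLV Press", "Indonesia"),
    ("KITLV Press", "Netherlands"),
    ("Other Press", "Germany"),
    ("KITLV Press", "Indonesia"),
]


def downloads_by_country(records, publisher):
    """Count downloads per provider country for a single publisher."""
    return Counter(country for pub, country in records if pub == publisher)


kitlv = downloads_by_country(records, "KITLV Press")
for country, count in kitlv.most_common():
    print(country, count)
```

A breakdown like this would make effects such as the Indonesian readership of KITLV titles visible, which the national versus international split hides.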

Another refinement could be found in analysing the usage patterns for each individual author. Earlier, the usage per subject was analysed. We could use the percentages of the groups of readers as a ‘baseline’ against which to compare the usage patterns of the work or works of a certain author. Again, we cannot be sure how the dissemination channel influences the results. Therefore, this kind of analysis should be done with caution, and preferably at a time when more experience with this method has been gained.
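
A baseline comparison of this kind could be sketched as follows. This is a hedged Python illustration rather than an analysis performed in this paper: the function names are hypothetical, the download counts are invented, and the reader groups simply follow the provider categories named in the abstract.

```python
def percentages(counts):
    """Convert download counts per reader group into percentages."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no downloads to compare")
    return {group: 100.0 * n / total for group, n in counts.items()}


def deviation_from_baseline(author_counts, baseline_counts):
    """Percentage-point difference between an author's usage and the baseline."""
    author_pct = percentages(author_counts)
    baseline_pct = percentages(baseline_counts)
    return {
        group: author_pct.get(group, 0.0) - baseline_pct[group]
        for group in baseline_pct
    }


# Invented counts: downloads per reader group for one subject (the baseline)
# and for the works of one hypothetical author within that subject.
baseline = {"academic": 500, "government": 50, "business": 50,
            "non-profit": 100, "general public": 300}
author = {"academic": 30, "government": 5, "business": 0,
          "non-profit": 5, "general public": 10}

deviation = deviation_from_baseline(author, baseline)
for group, points in deviation.items():
    print(f"{group}: {points:+.1f} percentage points")
```

With few downloads per author, such differences can be large by chance alone, which is one more reason for the caution expressed above.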

The dataset is available at:
http://www.persistent-identifier.nl/?identifier=urn:nbn:nl:ui:13-fbfa-yd.

Evaluation of the results

In this paper, a method is introduced to measure both the scientific and the societal relevance of the Humanities and Social Sciences, by measuring the usage of its main publication form: the monograph. While both the monograph and the field of HSS are under pressure, we saw that there is considerable interest from both inside and outside academia. We could say that this is a good result: it indicates the scholarly impact and the societal relevance of HSS. Furthermore, it was possible to measure the influence of subject and language. On the other hand, some of the results were mixed. The usage patterns differ strongly from the literature on societal relevance: contrary to expectations, the data show a low usage percentage by ‘professionals’. Whether this is a property of HSS usage or is caused by the channel and dataset used is a question that needs further research. However, the results of this article are promising, and the proposed method can be used as an addition to the existing toolkit.

 

About the author

Ronald Snijder joined Amsterdam University Press (AUP) in 2007, where he is responsible for developing digital publications, combined with IT management. Since 2011 he has also been technical coordinator at the OAPEN Foundation, where he is responsible for the technical development of the OAPEN Library. Before that, he worked in several profit and not–for–profit organizations as an IT and information management specialist. Follow him on Twitter @ronaldsnijder.

 

Acknowledgements

Prof. Paul Wouters commented on the draft version of this article. The author would like to thank Anton Nederhof and all researchers at CWTS and Gert van Vugt at Amsterdam University Press for discussing the first results.

 

Notes

1. Psychology is an exception: in this field articles are used more than monographs (Schaffer, 2004).

2. See http://oapen.org/search?identifier=341340.

3. See http://www.imiscoe.org.

4. Ibid.

 

References

Association of Research Libraries (ARL), 2012. “ARL Statistics® & Salary Surveys,” at http://www.arl.org/stats/annualsurveys/arlstats/arlstats11.shtml, accessed 27 August 2012.

S. Bell, B. Shaw, and A. Boaz, 2011. “Real–world approaches to assessing the impact of environmental research on policy,” Research Evaluation, volume 20, number 3, pp. 227–237. http://dx.doi.org/10.3152/095820211X13118583635792

P. Benneworth and B.W. Jongbloed, 2009. “Who matters to universities? A stakeholder perspective on humanities, arts and social sciences valorisation,” Higher Education, volume 59, number 5, pp. 567–588. http://dx.doi.org/10.1007/s10734-009-9265-2

R. Bennink, I. Meijer, F. Wamelink, and F. Zuijdam, 2008. “De maatschappelijke kwaliteit van onderzoek in kaart Een handreiking,” pp. 1–22, at http://www.knaw.nl/Content/Internet_KNAW/actueel/bestanden/ERiC_handreiking_maatschappelijke_kwaliteit.pdf, accessed 3 May 2013.

J. Bollen, H. Van de Sompel, A. Hagberg, and R. Chute, 2009. “A principal component analysis of 39 scientific impact measures,” PloS One, volume 4, number 6, at http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0006022, accessed 3 May 2013.

J. Bollen, H. Van de Sompel, and M.A. Rodriguez, 2008. “Towards usage–based impact metrics: First results from the mesur project,” JCDL ’08: Proceedings of the 8th ACM/IEEE–CS Joint Conference on Digital libraries, pp. 231–240.

Book Industry Communication, 2010. “BIC Standard Subject Categories,” at http://www.bic.org.uk/7/BIC-Standard-Subject-Categories/, accessed 9 February 2012.

M. Case (editor), 1999. The specialized scholarly monograph in crisis, or, How can I get tenure if you won’t publish my book? Proceedings of a conference sponsored by American Council of Learned Societies, Association of American University Presses, [and] Association of Research Libraries, Washington, D.C. September 11–12, 1997, at http://www.arl.org/publications-resources, accessed 12 March 2012.

E. Ernø–Kjølhede and F. Hansson, 2011. “Measuring research performance during a changing relationship between science and society,” Research Evaluation, volume 20, number 2, pp. 130–142. http://dx.doi.org/10.3152/095820211X12941371876544

M. Gibbons, C. Limoges, H. Nowotny, S. Schwartzman, P. Scott, and M. Trow, 1994. The new production of knowledge: The dynamics of science and research in contemporary societies. London: Sage.

J. Grant, P.–B. Brutscher, S. Guthrie, L. Butler, and S. Wooding, 2010. Capturing research impacts: A review of international practice. Santa Monica, Calif.: RAND, at http://www.rand.org/pubs/documented_briefings/DB578.html, accessed 3 May 2013.

A.N. Greco and R.M. Wharton, 2008. “Should university presses adopt an open access [electronic publishing] business model for all of their scholarly books?” In: L. Chan and S. Mornati (editors). ELPUB2008. Open Scholarship: Authority, Community, and Sustainability in the Age of Web 2.0 — Proceedings of the 12th International Conference on Electronic Publishing held in Toronto, Canada 25–27 June 2008, pp. 149–164, and at http://elpub.scix.net/cgi-bin/works/Show?149_elpub2008, accessed 3 May 2013.

U. Herb, 2010. “Alternative impact measures for open access documents? An examination how to generate interoperable usage information from distributed open access services,” World Library and Information Congress; 76th IFLA General Conference and Assembly, at http://www.ifla.org/files/hq/papers/ifla76/72-herb-en.pdf, accessed 3 May 2013.

U. Herb, E. Kranz, T. Leidinger, and B. Mittelsdorf, 2010. “How to assess the impact of an electronic document? And what does impact mean anyway? Reliable usage statistics in heterogeneous repository communities,” OCLC Systems & Services, volume 26, number 2, pp. 133–145. http://dx.doi.org/10.1108/10650751011048506

R. Landry, M. Lamari, and N. Amara, 2003. “The extent and determinants of the utilization of university research in government agencies,” Public Administration Review, volume 63, number 2, pp. 192–205. http://dx.doi.org/10.1111/1540-6210.00279

R. Landry, N. Amara, and M. Lamari, 2001. “Climbing the ladder of research utilization: Evidence from social science research,” Science Communication, volume 22, number 4, pp. 396–422. http://dx.doi.org/10.1177/1075547001022004003

L. Leydesdorff and H. Etzkowitz, 1996. “Emergence of a triple helix of university–industry–government relations,” Science and Public Policy, volume 23, number 5, pp. 279–286.

A.J.M. Linmans, 2010. “Why with bibliometrics the humanities does not need to be the weakest link,” Scientometrics, volume 83, number 2, pp. 337–354. http://dx.doi.org/10.1007/s11192-009-0088-9

C. Lyall, 2004. “Assessing end–use relevance of public sector research organisations,” Research Policy, volume 33, number 1, pp. 73–87. http://dx.doi.org/10.1016/S0048-7333(03)00090-8

M. Mendez and K. Chapman, 2006. “The use of scholarly monographs in the journal literature of Latin American history,” Electronic Journal of Academic and Special Librarianship, volume 7, number 3, at http://southernlibrarianship.icaap.org/content/v07n03/mendez_m01.htm, accessed 3 May 2013.

A.J. Nederhof, 2006. “Bibliometric monitoring of research performance in the social sciences and the humanities: A review,” Scientometrics, volume 66, number 1, pp. 81–100. http://dx.doi.org/10.1007/s11192-006-0007-2

OAPEN Consortium, 2011. “OAPEN final report,” at http://project.oapen.org/images/documents/oapen_final_public_report.pdf, accessed 8 February 2012.

Open Access Publishing in European Networks, 2010. “OAPEN Library,” at http://www.oapen.org, accessed 17 November 2011.

Royal Netherlands Academy of Arts and Sciences, 2010. Quality indicators for research in the humanities, at http://www.knaw.nl/Content/Internet_KNAW/publicaties/pdf/20111024.pdf, accessed 3 May 2013.

T. Schaffer, 2004. “Psychology citations revisited: Behavioral research in the age of electronic resources,” Journal of Academic Librarianship, volume 30, number 5, pp. 354–360. http://dx.doi.org/10.1016/j.acalib.2004.06.009

A. Serenko, N. Bontis, and M. Moshonsky, 2011. “Exploring the role of books as a knowledge translation mechanism: Citation analysis and author survey,” AMCIS 2011 Proceedings, at http://aisel.aisnet.org/amcis2011_submissions/23/, accessed 3 May 2013.

R. Snijder, 2010. “The profits of free books: An experiment to measure the impact of open access publishing,” Learned Publishing, volume 23, number 4, pp. 293–301. http://dx.doi.org/10.1087/20100403

J. Spaapen and L. van Drooge, 2011. “Introducing ‘productive interactions’ in social impact assessment,” Research Evaluation, volume 20, number 3, pp. 211–218. http://dx.doi.org/10.3152/095820211X12941371876742

A. Swan and M. Hall, 2010. “Why open access can change science in the developing world,” Public Service Review: International Development Online, at http://eprints.soton.ac.uk/271550/, accessed 3 May 2013.

J.B. Thompson, 2005. Books in the digital age: The transformation of academic and higher education publishing in Britain and the United States. Cambridge: Polity Press.

J.E. Vascellaro, 2009. “Facebook, the Search Engine?” Wall Street Journal, at http://blogs.wsj.com/digits/2009/08/11/facebook?-the-search-engine/, accessed 17 November 2011.

Wikipedia, n.d. “WHOIS,” at http://en.wikipedia.org/wiki/Whois, accessed 23 November 2011.

H.D. White, S.K. Boell, H. Yu, M. Davis, C.S. Wilson, and F.T.H. Cole, 2009. “Libcitations: A measure for comparative assessment of book publications in the humanities and social sciences,” Journal of the American Society for Information Science and Technology, volume 60, number 6, pp. 1,083–1,096.

P. Williams, I. Stevenson, D. Nicholas, A. Watkinson, and I. Rowlands, 2009. “The role and future of the monograph in arts and humanities research,” Aslib Proceedings, volume 61, number 1, pp. 67–82. http://dx.doi.org/10.1108/00012530910932294

L. Withey, S. Cohn, E. Faran, M. Jensen, G. Kiely, W. Underwood, B. Wilcox, R. Brown, P. Givler, A. Holzman, and K. Keane, 2011. “Sustaining scholarly publishing: New business models for university presses,” Journal of Scholarly Publishing, volume 42, number 4, pp. 397–441.

World Bank, 2011. The little data book on information and communication technology 2011. Washington, D.C.: World Bank.

 


Editorial history

Received 21 September 2012; accepted 4 March 2013.


Creative Commons License
This work is licensed under a Creative Commons Attribution–ShareAlike 3.0 Unported License.

Measuring monographs: A quantitative method to assess scientific impact and societal relevance
by Ronald Snijder.
First Monday, Volume 18, Number 5 - 6 May 2013
http://firstmonday.org/ojs/index.php/fm/article/view/4250/3675
doi:10.5210/fm.v18i5.4250






© First Monday, 1995-2017. ISSN 1396-0466.