Game testing and evaluation on real devices: Exploring in the case of the Open Device Lab community
First Monday

Game testing and evaluation on real devices: Exploring in the case of the Open Device Lab community by Raquel Godinho-Paiva and Ruth Sofia Contreras-Espinosa

Game testing and evaluation (T&E) still lacks standards to ensure quality. T&E on real devices, instead of only using software solutions (e.g., emulators), has become a basic procedure for mobile software design and development, including games. This study presents the Open Device Lab (ODL) community, a grassroots movement helping the Web and app community to have free access to device labs. The findings reveal how the open community can benefit the game industry.


Literature review
Open Device Lab community
Limitations and conclusion




In the new global economy, games have become a central issue for software development. Scholars have studied the game industry as a business from the viewpoints of both large companies and indie developers (Consalvo and Phelps, 2019), and people are spending more time on online games than ever, making gaming the world’s favourite pastime (Newzoo, 2017). Consequently, a significant market share is available, while significant challenges to ensuring game quality still exist.

Large game corporations usually have in-house labs, but that route is not always feasible or affordable for independent video game developers (indie studios). A survey of 1,445 small and medium-sized indie studios (2 to 50 employees) showed that PCs and mobiles are the most popular development platforms and that virtual reality/augmented reality (VR/AR) development is increasing. Most of these studios develop, publish and market their own games (Zachariah, 2018), and are therefore more exposed to difficulties due to lack of support. Practices such as playtesting, for example, require specialised expertise and equipment (Mirza-Babaei, et al., 2016).

In this study, we present the Open Device Labs (ODLs) as an option to assist game testing and evaluation practices. The ODLs started as a grassroots movement to help the Web development community: a group of developers who got together to help the community with testing on real devices. They provide free access to device labs in an open and collaborative environment, with pools of devices connected to the Internet. Considering that “knowledge is located and develops in communities that are organized around practices” [1], this study aims to explore the potential of the community to support the game segment.

The results contribute to both academia and industry. Many problems of the game development industry have been solved in the software industry (Vithani and Kumar, 2014). Most of the relevant information on games is published on specialised online blogs and in magazines. The number of academic studies has been increasing, but it is still difficult to find relevant papers in scientific journals. Game community members tend to share design practice information through conversations with one another, industry conferences and online resources; they do not typically read academic papers (Isbister and Mueller, 2015). More recently, the indie community has started sharing its practices via live streaming (Consalvo and Phelps, 2019).



Literature review

Human-computer interaction (HCI) and software engineering (SE) share design as a common concern; both aim to design software systems but have different roots in theory, processes for design, and views on design representations (Sutcliffe, 2005). The software development life cycle (SDLC) is a term used to describe either software or systems development life cycles, from inception with requirements definition through to fielding and maintenance (Ruparelia, 2010). Desktop application development requires the familiar SDLC phases of planning, design, implementation, testing and deployment.

As distinct from desktops, mobile applications and games have distinguishing characteristics and do not entirely benefit from a common SDLC. For this reason, there are new life cycles specific either to mobile applications or to games: for example, the mobile application development life cycle (MADLC) (see Vithani and Kumar, 2014) and the game development life cycle (GDLC) (see Ramadan and Widyani, 2013). Games therefore require a specific approach [2] to characteristics such as game play, mechanics, art, and music and sound.

The results from a literature review of software development processes for games showed that “there is no single model that serves as a best practice process model for games development and it is a matter of deciding which model is best suited for a particular game” [3]. Furthermore, developers use multiple approaches, including informal knowledge sharing (Consalvo and Phelps, 2019).

Traditional software testing can be applied to specific software categories such as games, Web sites, and mobile applications, adapting the testing accordingly for each one (Aleem, et al., 2016a; Di Lucca and Fasolino, 2006). The principal phases of the game design software engineering life cycle (GDSE) can be combined into three main phases: pre-production, production, and post-production (Aleem, et al., 2016a). They are similar to those used in other software development (González Sánchez, et al., 2009). However, there are variations of the SDLC with specific phases from both industry and scientific sources (Aleem, et al., 2016b). These sources combine similar phases in different ways. Most of them present testing as a specific phase (see Figure 1).


Game development life cycle (Contreras-Espinosa and Eguia, 2017)
Figure 1: Game development life cycle (Contreras-Espinosa and Eguia, 2017).


Game-testing and evaluation on real devices

The use of real (physical) devices is essential in the SDLC. Although this may seem obvious, many technology professionals are not yet using real devices to perform tests; they still rely on software solutions such as emulators, simulators and remote testing (SauceLabs, 2015). Furthermore, some processes follow fixed procedures and steps, which makes them somewhat slow to react to other processes and to changing patterns in users’ needs and markets (Twidale and Hansen, 2019).

Device management is one of the main obstacles in the testing and evaluation routine, and it is fundamental to the development of both Web-based mobile games and native mobile applications. The challenges differ depending on which platform(s) the game will support. The most popular mobile operating systems (OS) are Android and iOS, followed by Windows Phone (Statcounter, 2018). Android is an open source OS, while iOS is closed source and proprietary to Apple devices (Scolastici and Nolte, 2013); development for Android is thus not as controlled as it is for Apple. On the other hand, there are many different Android devices on the mobile market, which makes it hard to secure a representative pool of devices. In the case of iOS, the diversity is not as great as for Android, but the price gap between devices has widened over the years (Richter, 2016).

Simulators and emulators are an alternative for testing, but their results are limited, for example regarding performance, user experience, hardware and software. There are options for each platform, which should be combined with real devices. Moreover, there is no standard approach for T&E in diverse mobile environments. For this reason, hands-on physical devices in the development methodology are critical (Gardner and Grigsby, 2012). A survey of 504 technology professionals responsible for the quality of Web and mobile applications showed that 29 percent use only simulators and emulators, 37 percent use only real devices, and 34 percent use a combination of both (SauceLabs, 2015).

“Testing is one of the most important phases of any development lifecycle model. The testing of prototype types is performed on an emulator/simulator followed by testing on the real device. The emulator/simulator is often provided in the software development kit (SDK). The testing on the real device, for example, in the case of Android operating system development, should be performed on multiple operating system versions, multiple models of handsets with variable screen size. The test cases are documented and forwarded to the client for feedback.” [4]
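The multi-version, multi-model coverage described in the quote can be sketched as a simple test matrix. The device models, screen sizes and OS versions below are placeholders invented for this sketch, not drawn from any real lab inventory:

```python
from itertools import product

# Hypothetical Android device pool and OS versions (placeholders only).
models = [
    {"model": "Phone A", "screen_inches": 5.0},
    {"model": "Phone B", "screen_inches": 6.1},
    {"model": "Tablet C", "screen_inches": 10.1},
]
os_versions = ["9", "10", "11"]

# Cross every handset model with every OS version to enumerate the
# configurations a real-device test pass would need to cover.
test_matrix = [
    {"model": m["model"], "screen_inches": m["screen_inches"], "os": v}
    for m, v in product(models, os_versions)
]

for case in test_matrix:
    print(f"{case['model']} ({case['screen_inches']} in) on Android {case['os']}")

print(f"{len(test_matrix)} configurations to cover")
```

Even this tiny pool yields nine configurations, which illustrates why a shared device lab, rather than a single handset per developer, matters for coverage.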

This is crucial not just for the SDLC or MADLC but also for the GDLC: “The real performance of a mobile game can only be measured on real devices.” [5] Testing and evaluation on physical devices gives the clearest feedback on how the game will work and how the user will interact with it. Depending on the research/study type, T&E can be done manually or automated, in person or remotely, lab-based or field-based. Some approaches are based on expert evaluation and others involve users. Each offers different benefits and limitations, and it is essential to plan their use depending on the purpose and SDLC phase. The following descriptions are based on studies conducted by Dix, et al. (2004), Kaner (2016) and Kjeldskov and Skov (2014).

  • Manual: time-consuming, but allows the tester to feel physical characteristics such as input methods (press, release, tap, and pinch/scale/rotate), form and portability, and capabilities (touch, GPS, accelerometer).

  • Automated: runs simultaneous tests on hundreds of real devices, which saves time and provides good results as a cost-effective solution. However, it does not allow evaluating and testing real interaction or multi-touch with the device.

  • In-person: the moderator is physically in the same place as the participant.

  • Remote: can be automated, performed by software, or manual, performed by the game player. The first simulates many devices using the same hardware as input, which is good for repeated tests. The second is performed by a person remotely, which is good for getting user feedback.

  • Lab-based: performed in a controlled environment using specific devices and/or users.

  • Field-based: performed in a natural context. It allows testing device performance and user behaviour in real-world environments.

  • User testing: testing with actual users, customers or stakeholders. It tends to occur in the later stages of development.

  • Expert evaluation: done by an expert, such as a designer, developer or tester. If done in-house, this is a cheap method in comparison to user testing and can be done more regularly.

Each of the listed approaches is related to different methods and techniques for evaluating a range of software aspects. Across the development phases, user ratings, performance and device compatibility are critical areas for a mobile game (Helppi, 2015). The first reflects the result of the user’s experience of a set of actions with the hardware, the software and their interaction. “Bad ratings and feedback lead to low number of downloads.” [6] Rating is directly related to device performance and compatibility. Some software needs more in-depth tests and other software more simplified ones, from the internal structure to the interface. There are regular tests for all Web and app products and specific ones for games. A quality assurance recommendation is to start testing as early as possible in the development life cycle (Dix, et al., 2004).

Play-testing is one of the main game quality-control methods. It is not just about verifying whether the game works; it is about asking whether the game works well. It can lead to questions such as: is the game too easy? Is the interface clear and easy to navigate? (Schultz, et al., 2005). It is a method for gathering qualitative and quantitative data while a potential user plays (Contreras-Espinosa and Eguia, 2017). The first step is to determine the goal of the play-testing: to evaluate controls, retention or the interface or, for example, to understand the player’s perception (Mirza-Babaei, et al., 2016).



Open Device Lab community

The Open Device Lab community is a grassroots movement of the Web development community that emerged in 2012, a global community grounded in open standards principles: the Web for all, on everything. Its proposal is to provide free access to labs equipped with devices, such as smartphones and tablets, connected to the Internet. They are located in different spaces, such as private companies, co-working spaces, universities and schools (Godinho-Paiva, 2015) (see Figure 2).


Open Device Lab FFM, Frankfurt, Germany
Figure 2: Open Device Lab FFM, Frankfurt, Germany.


In October 2018, there were 151 laboratories in 35 countries registered on the community Web site (Figure 3). The highest number of labs is in Europe, followed by North America. Germany has the highest number of labs, followed by the U.K. and the U.S. In terms of the number of devices, the largest ODLs are in the U.K., Germany and the Netherlands (Godinho and Contreras-Espinosa, 2016).


Community Web site
Figure 3: Community Web site.


The community Web site has three major goals: to help people locate an ODL, to explain and promote the movement, and to attract contributors and sponsors to help the labs. The site works as a hub. Each registered lab has a page with a brief profile and links to external Web sites, depending on their preference (lab Web site, Twitter, Facebook, GitHub, Google+, Google groups, LinkedIn, Instagram, Meetup and Xing) (see example in Figure 4). Most of the ODLs have links to a Web site, a Twitter account and a Facebook account. The community Web site served as the starting point for our data collection.


Example of links used to collect data
Figure 4: Example of links used to collect data.


The literature refers to the ODLs as a valuable alternative for making testing on real devices possible for more professionals, namely those who cannot set up their own device lab (Godinho-Paiva, 2015). The labs are an investment in local communities, working collaboratively to provide solutions that improve the Web and app user experience. The main benefit of the ODLs is the free access to devices and infrastructure such as Wi-Fi and software. In addition, most of the ODLs’ teams help guests with testing and evaluation procedures, if required. A device lab is not just about having new smartphones and tablets; it is also useful to have older key devices with a variety of operating systems and versions.

The main research we have been conducting on the global community identified a gap in game testing and evaluation on real devices. Furthermore, we decided to set up an ODL at the university we are affiliated with, to support the degrees in multimedia, applications and games.

After the first exploratory phase of the research, we proceeded to this systematic study, investigating the devices available and the software and approaches used by the community in relation to the game segment. We emphasise that we intentionally discuss only the findings related to the game research question (how do the Open Device Labs support game testing and evaluation?), which is based on two sub-questions:

RQ1: In terms of game market categories, what recurrent devices are available on the ODL community?

This RQ enabled us to determine which game segments are, and are not, covered by the community in terms of devices.

RQ2: Which approaches and software used by the ODLs are also used for game testing and evaluation?

This RQ allowed us to identify the strategies and software used by the ODLs, considered here as team expertise available to support guests.




Sampling and study population

This qualitative case study follows clearly identified systematic procedures (Creswell and Miller, 2000). It uses a homogeneous sample that shares a set of characteristics: labs hosted by private companies in the Web/app development industry that are part of the ODL community. It is not concerned with statistical generalisability, as it is field oriented (Crossman, 2019; Guest, et al., 2006).

The population comprises the entire community (Creswell, 2014), meaning all 151 labs registered online. The research is based on three main sources: online documents, observation and data from interviews. We collected all the data available online from the community’s Web sites (see Figure 4). We interviewed participants from Germany, as it is the country with the highest number of ODLs. They were recommended by the community manager according to active presence and availability. Labs in the Netherlands, Sweden and Finland were used to verify possible geographic differences in the sample [7]. The quality of the data and the number of interviews were linked to the research method (Guest, et al., 2006).

Data collection

To answer the research questions, we started with the information found on the community Web site and moved on to the data obtained in the interviews. To select the data, we visited all the working links from each laboratory and selected the labs’ Web sites, Twitter and Facebook accounts, since these are the most common communication channels. From Twitter, we selected the profile area and from Facebook the About page.

Online documents represent information collected through the community’s Web site, accessed in September 2017. We collected data from 93 ODL Web sites, 58 Twitter pages and 18 Facebook pages. Other links, such as Google Plus or Instagram, were not used because a representative number of them was not available. In total, we collected information from 151 labs.

Device lists — the first collection of device lists occurred in March 2016, when we collected 128 lists. We repeated the collection in July 2018 to verify updates concerning open and closed labs. Most of these lists are available on the ODLs’ Web sites, and a few we obtained by e-mail. Some labs did not provide their device lists; overall, we analysed 126 device lists with a total of 3,890 devices.

Considering and explaining some limitations of this device data is an important quality criterion (Tracy, 2010). For example, new devices may have been purchased but not registered on the published lists we collected. There are also on-demand devices, such as personal phones. Additionally, in some cases labs wrote comments about requested devices that are not on the permanent list. Older and broken devices were also found on the lists, with comments about them; the broken devices were not used in this study.

Data from interviews — the original study for which the interviews were conducted examined the community ecosystem as a global movement, using semi-structured and open-ended interviews. From this data, we present the results for the codes “approach” and “software”.

Data analysis

We conducted the data analysis through computer-aided approaches (Tracy, 2013). The online documents were analysed using Atlas.ti and the device list information using Excel.

To conduct the analysis for RQ1, the data was organised in an Excel file. We created a sheet for each ODL device list, in chronological order based on their registration date on the community Web site. The device lists were organised by device brand, model and type.

After all the lists were homogeneously organised in a master sheet, we used an Excel function to identify and count the data across all the device lists, classifying them according to game industry categories (Newzoo, 2016). Finally, devices that did not fit into these categories were excluded; a total of 3,709 devices were analysed:

  1. per device and segment: mobile (tablets and smartphones), PC/MMO games and Web games (browser PC and boxed/downloaded PC) and consoles;

  2. per screen and segment: included personal screen (smartphone and watches), entertainment screens (TV, consoles and VR), floating screens (tablets and handheld) and computer screen (Web games and PC/MMO games).
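The classification step above can be sketched in a few lines of Python. The device entries below are a hypothetical excerpt invented for illustration, and the segment mapping simply mirrors the screen categories named in the analysis (after Newzoo, 2016):

```python
from collections import Counter

# Hypothetical excerpt of a homogenised device list (brand, model, type).
devices = [
    ("Apple", "iPhone 6", "smartphone"),
    ("Samsung", "Galaxy S5", "smartphone"),
    ("Apple", "iPad Air", "tablet"),
    ("Sony", "PlayStation 4", "console"),
    ("Nintendo", "3DS", "handheld"),
    ("Apple", "Watch", "watch"),
    ("Generic", "Desktop PC", "pc"),
]

# Screen segments used in the analysis (after Newzoo, 2016).
SCREEN_SEGMENTS = {
    "smartphone": "personal screen",
    "watch": "personal screen",
    "tablet": "floating screen",
    "handheld": "floating screen",
    "console": "entertainment screen",
    "tv": "entertainment screen",
    "vr": "entertainment screen",
    "pc": "computer screen",
}

# Count devices per segment, as the Excel master sheet did.
counts = Counter(SCREEN_SEGMENTS[dev_type] for _, _, dev_type in devices)
total = sum(counts.values())
for segment, n in counts.most_common():
    print(f"{segment}: {n} ({n / total:.0%})")
```

Devices whose type is not in the mapping would raise a `KeyError` here; in the study, such devices were simply excluded from the analysis.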

To conduct the analysis for RQ2, we organised the online documents and interviews in two separate files. Information that was not published in English was translated. Using Atlas.ti, we conducted a qualitative analysis (Thomas, 2006). In the primary cycles (Tracy, 2013), we coded the data using descriptive and in vivo methods (Saldaña, 2009). The in vivo method was important for identifying similar T&E approaches described differently by participants. In the last coding round, the T&E approaches and software were classified as regular and/or games.




This study, as qualitative research, was interested in exploring and understanding the subject. However, we will present graphics and numbers to help the reader to visualise the scope of the results from the global community and the countries with higher numbers of labs. Quantitative data, in qualitative studies, complements and strengthens the data obtained (Bogdan and Biklen, 1994). The findings are presented in the following sections according to the research questions and their considerations; the final discussion is presented at the end of this section.

RQ1: In terms of game market categories, what recurrent devices are available on the ODL community?

To present the results in terms of numbers, we used two different samples: 1) the global community and 2) by country. For the latter, we chose to analyse the three countries with the highest number of laboratories: the U.K., the U.S. and Germany. Together they represent the majority of labs.

The results by country are similar to the global data, meaning the U.K., the U.S. and Germany make up a representative sample of the community. For this reason, we decided to show the findings by category in a comparative way: a. per screen and segment (Figure 5) and b. per device and segment (Figure 6). Per game screen and segment, we included personal screens (smartphone and watches), floating screens (tablets and handheld) and entertainment screens (TV, consoles and VR).


Comparative data of devices available in the ODLs, per screen
Figure 5: Comparative data of devices available in the ODLs, per screen.


In both cases, by global community and by country, up-to-date numbers for the Open Device Lab community show that the personal screen is the largest screen segment represented, followed by the floating screen. In the case of the entertainment screen, the comparative results differ slightly: in the global sample it is higher than the computer screen, in Germany it is slightly lower than the computer screen, and in the U.S. and the U.K. the percentages are similar (Figure 5).


Comparative data of available devices in the ODLs, per device segment
Figure 6: Comparative data of available devices in the ODLs, per device segment.


Per game device and segment: mobile (tablets and (smart) phones), PC/MMO (browser PC and boxed/downloaded PC) and consoles.

In these cases, the mobile segment was by far the most represented in the community (Figure 6): one percent consoles, two percent PC/MMO and 97 percent mobiles. We also analysed the game console sample. Even though it makes up a small percentage, most of these devices were equivalent to the best-selling devices on the global market in 2015–2017, according to Statista (2018); it is thus a valuable pool of devices.

Comparing the data by country, the mobile segment is also by far the most represented, followed by PC/MMO and consoles in all three countries. The U.S. presents a higher result for PC/MMO (seven percent) than for consoles (three percent); Germany and the U.K. show more similar results (Figure 6). These findings are commensurate with the global findings.

The outcomes, both in terms of devices and screens, show that mobiles and personal screens, corresponding to smartphones followed by tablets, represent almost 90 percent of the global sample. Therefore, in terms of devices, the community can mainly support the market’s demand in the mobile and personal screen game segment, which has been reaching a large market in terms of revenue and gamers (Newzoo, 2017). Even though the community does not support the other segments in terms of quantity of devices, it can still support them with a small but useful set of devices.

The representative aspects of the community are related to its context. The ODL community emerged in 2012 when testing and evaluating Web-based products was much more difficult. At that time, mobile fragmentation was increasing with inconsistent phones and systems. New strategies, such as the responsive Web design approach, were developed to deal with the challenges.

For this reason, devices with Web browsers are the most commonly used in the ODLs, including game consoles and handheld equipment. Additionally, many labs have been adding new kinds of devices to their collections. This includes beacons, glasses, motion controllers, VRs, bracelets and others.

RQ2: Which approaches and software used by the ODLs are also used for game testing and evaluation?

We know the ODLs offer devices and infrastructure, and usually they do not advertise help to guests with testing and evaluation procedures. This is not because they do not want to be helpful, but because there is a chance that they would be drawn into company or institutional projects. However, online user reviews reveal that helping external guests is a common practice at the ODLs. Consequently, exploring T&E approaches and software also means identifying the expertise available to the ODLs’ guests.

Several T&E methods are used by the Web industry as well as by the ODL community, some of which have already been listed earlier. Here, we present data obtained from ODLs’ Web sites and interviews with ODLs’ managers. In this context, the evidence mostly represents approaches and software used in the ODLs by the host team.

However, it is important to remember that their focus is Web and app products in general, not specifically games. Many labs are hosted by companies working on Web-based products, so their expertise and software are in line with the relevant segments. The results presented here are a selection of the most common approaches and software for game testing and evaluation, based on the findings of Contreras-Espinosa and Eguia (2017), Isbister and Schaffer (2008), Redavid and Farid (2011) and Schultz, et al. (2005); this is by no means an exhaustive list (Tables 1 and 2).


Table 1: Findings on common game testing and evaluation approaches used by ODLs [8].
Approach | Methods and techniques | Observation
Automated | Unit test, functional test, regression test. | It is a regular approach depending on the test phase.
Manual | UX test, usability test, performance test. | It is a regular approach depending on the test phase.
Field-based | Usability test, functional test. | There is not much data on field-based T&E.
Lab-based | Compatibility test, cross-platform, cross-browser. | It is the main approach in the community.
Expert evaluation | Exploratory testing, cognitive walkthrough, ad hoc testing. | It is the main approach in the community.
User-testing | A/B test, UI test, UX test, concept test. | It is not as common as expert evaluation.
In-person | Eye tracking, focus group, observation. | It is the main approach in the community.


Both automated and manual testing are regular practices in the ODLs, depending on the testing phase. There is software available to help with automated tests on multiple devices (see Table 2). As expected, lab-based is the main approach, but we also found information on field-based testing, although this is unusual since many ODLs do not allow devices to be removed from the lab. Expert evaluation is the main strategy compared to user testing, and in-person is the predominant practice, as there is no evidence of remote testing.
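To make the automated category in Table 1 concrete, the sketch below shows what a unit/regression test looks like in practice. The scoring function and its 10x combo cap are invented for this illustration; it is not taken from any game discussed in the study:

```python
import unittest

# A hypothetical scoring rule for a mobile game, used only to illustrate
# an automated unit/regression test; the function and its cap are invented.
def score(base_points: int, combo: int) -> int:
    """Base points multiplied by the combo count, capped at a 10x multiplier."""
    return base_points * min(combo, 10)

class ScoreTest(unittest.TestCase):
    def test_single_hit(self):
        self.assertEqual(score(100, 1), 100)

    def test_combo_multiplies(self):
        self.assertEqual(score(100, 3), 300)

    def test_combo_is_capped(self):
        # Regression guard: the 10x cap must hold for large combos.
        self.assertEqual(score(100, 50), 1000)

# Run the suite without exiting the interpreter.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ScoreTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Tests like these run unattended on many devices at once, which is exactly what makes the automated approach cost-effective, while, as noted above, they cannot evaluate real interaction or multi-touch.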


Table 2: Findings on common game testing and evaluation software used by ODLs.
Software | Type | Platform(s)
BrowserSync | Cross-browser and cross-platform tool | For Windows, Apple OS and Linux.
Selenium | Suite of tools | For browsers, Windows, Apple OS and Linux.
TestFlight | Beta test tool | For iOS.
Xcode | Integrated development environment (IDE) | For Apple OS.


The results show a diverse range of software: cross-platform and cross-browser tools, suites, software for user-testing and IDEs. In the case of software, the benefits are again more about the hosts’ expertise, help and knowledge exchange than about software use. Most ODL guests have their own laptop, which contains the most-used software for development, and usually they take it with them to perform tests at the lab. At the same time, many ODLs have PCs or laptops available for guests’ use, with free access to software such as cross-platform and browser sync tools, commonly sponsored by the companies that developed the software. To sum up, the main ODL benefits for indie developers are:

  1. Mobile devices — access to a pool of real devices for game testing and evaluation — primarily support to personal screen and mobile game segment, i.e., smartphones and tablets.
  2. Structure and facilities — besides the devices, access to Wi-Fi, software and a private and/or dedicated space is available.
  3. Help, knowledge exchange and expertise sharing — mainly for general challenging aspects of Web and mobile testing such as functionality, performance, and compatibility; and in terms of game aspects, more suitable for testing or playtesting, focusing on single player, Web and mobile titles.




As noted in the literature review presented in the first section of this paper, the ODLs work on a specific and key aspect of the game design life cycle: testing and evaluation on real devices.

“Ideally, evaluation should occur throughout the design life cycle, with the results of the evaluation feeding back into modifications to the design.” [9] Choosing when and how often devices are used depends on the choice of software development model and approach, although having a pool of devices or a device lab at hand makes it easier.

For those who choose to perform tests on real devices, the community mainly supports lab-based testing and evaluation; most labs do not lend devices to guests. They can also support manual, automated, in-person, expert evaluation and user-testing approaches, as well as playtesting. There is no evidence of remote testing and evaluation services; for these, it is necessary to pay for a platform offering manual and automated remote services on real devices.

The ODL community’s principles make possible a voluntary initiative with a collaborative purpose: to help the Web design and development community, as well as the interconnected areas that benefit from it. Its resources, which reflect a particular practice of the private sector offering free open spaces, include a pool of devices connected to the Internet and ready to go, i.e., charged, organised and ready for guests when they arrive at the lab. Furthermore, a representative list of devices used by the hosts for their own tests is available for free to local communities in 35 countries. Additionally, benefits for both guests and hosts, such as knowledge exchange, expertise sharing, networking and improvements, are available in these areas. The hosts are willing to help and share their expertise in approaches and software development. As we saw, this is not always specifically about games but about common areas of software development. However, this is not an issue, since game testers should have knowledge of their own area of work. All community efforts go to “lead to an ultimate improvement of the Web & app experience both for developers and for consumers.” [10]

Besides the potential and benefits, it is important to mention what is not supported by this community. For the moment, it is not a place for in-person multiplayer, physiological or biometric testing and evaluation. The testing industry is growing fast, mainly in automated and remote testing, due to their benefits and the growth of companies offering this kind of service. However, there is a cost to using such services, so shared spaces supported by the local community are still a relevant option for independent developers and small companies.

The number of companies offering testing solutions on real devices is increasing, and a key factor in choosing testing and evaluation strategies is the budget of the game project. Here the ODLs are once again a relevant solution, because they bring together what is needed to obtain the benefits of using real devices in Web, app, and game development, supporting both game testing and evaluation. There are common preferences, e.g., Web testing rather than native, or using the devices at the lab rather than allowing guests to borrow them. However, the labs are open to special cases and exceptions whenever they are requested and/or contacted in time to analyse the case. From what we observed in the interviews and in online user reviews, guests usually have what they need in terms of structure and help.

To sum up, this study is about a community that supports developers, designers, students, lecturers, and researchers with devices, structure, software and expertise. It is a bottom-up, peer-to-peer collaboration among amateurs, professionals and experts, grounded in openness and collaborative principles. A unique facility with high potential is already available to designers and developers, and also to indie game developers.

As noted earlier, most indie studios develop, publish and market their own games (Zachariah, 2018); free access to device labs could be a game changer in providing them with support on game testing and evaluation.



Limitations and conclusion

Mobile and online games are at the centre of global game market attention. In the development sector, while there are more distribution opportunities for indie games than before, there is still no standard for quality assurance, and there is a lack of support in the game development life cycle (GDLC).

To address this problem, this study presents the Open Device Labs as an unexplored case with high potential to support indie games. The Labs represent a bottom-up, peer-to-peer movement that aims to help indie Web developers with testing and evaluation on real devices, and that can now reach out to the game industry.

This exploratory research on the devices, approaches and software recurrently available at ODLs clearly illustrates their high potential to serve lab-based testing and evaluation of Web and mobile games for indie studios, playing an important role in addressing the issue of game testing and evaluation.

At the moment, the community promotes and supports game testing, but it is still not used heavily by game testers. The focus and limitations of this study do not allow us to present evidence on this aspect. We might hypothesise that it is a result of weak connections to, and little promotion in, the game community. Acquiring data from gamers who have visited ODLs would help to explain this issue; this is a limitation of this research and an opportunity to be explored in future work. In addition, there are other significant unexplored uses of ODLs. Our aim is to conduct long-term empirical research at the ODL we are establishing at the University of Vic-Central of Catalonia (UVic-UCC), Spain.


About the authors

Raquel Paiva Godinho is a tenured lecturer and researcher at the Design Department, Federal Institute of Education, Science and Technology Sul-rio-grandense (IFSul), Brazil. Currently, she is a member of the Balmes Foundation team involved in a H2020 project, and a Ph.D. candidate in Experimental Science and Technology at the University of Vic-Central of Catalonia (UVic-UCC), Spain. Her current research interests include design and collaborative practices.
E-mail: raquelpg [at] pelotas [dot] ifsul [dot] edu [dot] br

Dr. Ruth Sofia Contreras Espinosa is a Professor at the Faculty of Business and Communication, University of Vic-Central of Catalonia (UVic-UCC), Spain and a Project Manager in H2020 European projects. She is also coordinator of Observatory of Communication, Video Games and Entertainment (OCVE), InCom-UAB-UVIC. Her current research interests include game studies, game user experience and games user research.
Direct comments to: ruth [dot] contreras [at] uvic [dot] cat



Acknowledgements
We would like to thank the Open Device Lab community for allowing us to conduct this study and for collaborating with it. This work was partially supported by the IFSul, UVic-UCC, Erasmus+ program, and by the BBVA bank group and Antiga Caixa Manlleu Foundation.



Notes
1. Tuomi, 2001.

2. The words approach, method, strategy, and techniques have been used interchangeably. Therefore, in this study, we chose to use the word ‘approach’ based on Dix, et al. (2004).

3. Osborne O’Hagan, et al., 2014, p. 182.

4. Vithani and Kumar, 2014, p. 599.

5. Helppi, 2017, p. 25.

6. Helppi, 2015, p. 20.

7. Questions were answered by 10 managers of ODLs in Germany, the Netherlands, Sweden and Finland. The number of interviews was based on Guest, et al. (2006), who suggested that 6 to 12 is an appropriate range. These labs are all hosted by private companies and most of them are focused on Web-based products; a few work with native apps.

8. The examples in column two are not exclusive to the corresponding approach. For example, unit tests can be run in both automated and manual approaches.

9. Dix, et al., 2004, p. 319.

10. ODL, 2018.



References
Saiqa Aleem, Luiz Fernando Capretz and Faheem Ahmed, 2016a. “A digital game maturity model (DGMM),” Entertainment Computing, volume 17, pp. 55–73.
doi:, accessed 22 July 2019.

Saiqa Aleem, Luiz Fernando Capretz and Faheem Ahmed, 2016b. “Game development software engineering process life cycle: A systematic review,” Journal of Software Engineering Research and Development, volume 4, number 6.
doi:, accessed 22 July 2019.

Robert C. Bogdan and Sari Knopp Biklen, 1994. Investigação qualitativa em educação: Uma introdução à teoria e aos métodos. Porto: Porto Editora.

Mia Consalvo and Andrew Phelps, 2019. “Performing game development live on Twitch,” Proceedings of the 52nd Hawaii International Conference on System Sciences, pp. 2,438–2,447, and at, accessed 22 July 2019.

Ruth S. Contreras-Espinosa and Jose L. Eguia (editors), 2017. “Usability and user experience methodologies used by games companies,” Catalonia Research Project, Report, Observatory of Communication, Games and Entertainment, Autonomous University of Barcelona and University of Vic-Central of Catalonia.

John W. Creswell, 2014. Research design: Qualitative, quantitative, and mixed methods approaches. Fourth edition. Thousand Oaks, Calif.: Sage.

John W. Creswell and Dana L. Miller, 2000. “Determining validity in qualitative inquiry,” Theory Into Practice, volume 39, number 3, pp. 124–130.
doi:, accessed 22 July 2019.

Ashley Crossman, 2018. “Understanding purposive sampling: An overview of the method and its applications,” ThoughtCo. (3 July), at, accessed 22 July 2019.

Alan Dix, Janet Finlay, Gregory D. Abowd and Russell Beale, 2004. Human-computer interaction. Third edition. Harlow: Pearson.

Lyza Danger Gardner and Jason Grigsby, 2012. Head first mobile Web. Beijing: O’Reilly.

Raquel Godinho-Paiva, 2015. “Open Device Lab (ODL) — um movimento colaborativo para o uso de dispositivos reais em projetos para web e aplicativos (revisão da literatura),” Obra Digital, number 9, pp. 58–79, and at, accessed 22 July 2019.

Raquel Godinho-Paiva and Ruth S. Contreras-Espinosa, 2016. “Open Device Lab: An analysis of available devices in the gaming market,” 2016 Eighth International Conference on Games and Virtual Worlds for Serious Applications (VS-GAMES).
doi:, accessed 22 July 2019.

Jose Luis González Sánchez, Natalia Padilla Zea and Francisco L. Gutiérrez, 2009. “From usability to playability: Introduction to player-centred video game development process,” In: Masaaki Kurosu (editor). Human centered design. Lecture Notes in Computer Science, volume 5619. Berlin: Springer, pp. 65–74.
doi:, accessed 22 July 2019.

Greg Guest, Arwen Bunce and Laura Johnson, 2006. “How many interviews are enough? An experiment with data saturation and variability,” Field Methods, volume 18, number 1, pp. 59–82.
doi:, accessed 22 July 2019.

Ville-Veikko Helppi, 2017. “Mobile games under the microscope,” TEST, pp. 24–27, and at, accessed 26 February 2018.

Ville-Veikko Helppi, 2015. “The fundamentals of mobile game development and testing,” at, accessed 11 March 2018.

Katherine Isbister and Florian Mueller, 2015. “Guidelines for the design of movement-based games and their relevance to HCI,” Human–Computer Interaction, volume 30, numbers 3–4, pp. 366–399.
doi:, accessed 22 July 2019.

Katherine Isbister and Noah Schaffer, 2008. Game usability: Advice from the experts for advancing the player experience. San Francisco, Calif.: Morgan Kaufmann.

Cem Kaner, 2016. “Updating BBST to version 4.0,” at, accessed 22 January 2018.

Jesper Kjeldskov and Mikael B. Skov, 2014. “Was it worth the hassle? Ten years of mobile HCI research discussion on lab and field evaluations,” MobileHCI ’14: Proceedings of the 16th International Conference on Human-Computer Interaction with Mobile Devices & Services, pp. 43–52.
doi:, accessed 22 July 2019.

Giuseppe A. Di Lucca and Anna Rita Fasolino, 2006. “Web application testing,” In: Emilia Mendes and Nile Mosley (editors). Web engineering. Berlin: Springer, pp. 219–260.
doi:, accessed 22 July 2019.

Pejman Mirza-Babaei, Naeem Moosajee and Brandon Drenikow, 2016. “Playtesting for indie studios,” AcademicMindtrek ’16: Proceedings of the 20th International Academic Mindtrek Conference, pp. 366–374.
doi:, accessed 22 July 2019.

Newzoo, 2017. “Global games market report: Trends, insights, and projections toward 2020,” at, accessed 5 March 2018.

Newzoo, 2016. “Global game market report: An overview of trends & insights,” at, accessed 22 January 2018.

ODL, 2018. “ — Locate, contribute to and sponsor an Open Device Lab (ODL),” at, accessed 22 July 2019.

Ann Osborne O’Hagan, Gerry Coleman and Rory V. O’Connor, 2014. “Software development processes for games: A systematic literature review,” In: Béatrix Barafort, Rory V. O’Connor, Alexander Poth and Richard Messnarz (editors). Systems, software and services process improvement. Berlin: Springer, pp. 182–193.
doi:, accessed 22 July 2019.

Rido Ramadan and Yani Widyani, 2013. “Game development life cycle guidelines,” 2013 International Conference on Advanced Computer Science and Information Systems (ICACSIS), pp. 95–100.
doi:, accessed 22 July 2019.

Claudio Redavid and Adil Farid, 2011. “An overview of game testing techniques,” at, accessed 18 October 2018.

Felix Richter, 2016. “The smartphone price gap,” Statista (2 June), at, accessed 15 August 2018.

Nayan B. Ruparelia, 2010. “Software development lifecycle models,” ACM SIGSOFT Software Engineering Notes, volume 35, number 3, pp. 8–13.
doi:, accessed 19 July 2017.

Johnny Saldaña, 2009. The coding manual for qualitative researchers. London: Sage.

SauceLabs, 2015. “Testing trends in 2015: A survey of software professionals,” at, accessed 10 March 2018.

Charles P. Schultz, Robert Bryant and Tim Langdell, 2005. Game testing all in one. Boston, Mass.: Thomson/Course Technology.

Claudio Scolastici and David Nolte, 2013. Mobile game design essentials: A useful and detailed resource for designing games for mobile devices. Birmingham: Packt Publishing.

Statcounter, 2018. “Mobile operating system market share worldwide,” at, accessed 16 October 2018.

Statista, 2018. “Global unit sales of current generation video game consoles from 2008 to 2017 (in million units),” at, accessed 26 February 2018.

Allistair G. Sutcliffe, 2005. “Convergence or competition between software engineering and human computer interaction,” In: Ahmed Seffah, Jan Gulliksen and Michel C. Desmarais (editors). Human-centered software engineering — Integrating usability in the software development lifecycle. Dordrecht: Springer, pp. 71–84.
doi:, accessed 19 July 2017.

David R. Thomas, 2006. “A general inductive approach for analyzing qualitative evaluation data,” American Journal of Evaluation, volume 27, number 2, pp. 237–246.
doi:, accessed 22 July 2019.

Sarah J. Tracy, 2013. Qualitative research methods: Collecting evidence, crafting analysis, communicating impact. Chichester: Wiley-Blackwell.

Sarah J. Tracy, 2010. “Qualitative quality: Eight ‘big-tent’ criteria for excellent qualitative research,” Qualitative Inquiry, volume 16, number 10, pp. 837–851.
doi:, accessed 22 July 2019.

Ilkka Tuomi, 2001. “Internet, innovation, and open source: Actors in the network,” First Monday, volume 6, number 1, at, accessed 22 July 2019.
doi:, accessed 22 July 2019.

Michael Twidale and Preben Hansen, 2019. “Agile research,” First Monday, volume 24, number 1, at, accessed 23 April 2019.
doi:, accessed 22 July 2019.

Tejas Vithani and Anand Kumar, 2014. “Modeling the mobile application development lifecycle,” Proceedings of the International MultiConference of Engineers and Computer Scientists (IMECS), volume 1, pp. 596–600, and at, accessed 18 July 2017.

Shanti Zachariah, 2018. “The way small independent studios create” (3 August), at, accessed 17 September 2018.


Editorial history

Received 29 October 2018; revised 19 May 2019; accepted 2 July 2019.

Copyright © 2019, Raquel Godinho-Paiva and Ruth Sofia Contreras-Espinosa. All Rights Reserved.

Game testing and evaluation on real devices: Exploring in the case of the Open Device Lab community
by Raquel Godinho-Paiva and Ruth Sofia Contreras-Espinosa.
First Monday, Volume 24, Number 8 - 5 August 2019
