The dangers of webcrawled datasets

Graeme Baxter Bell

Abstract


This article highlights legal, ethical and scientific problems arising from the use of large experimental datasets gathered from the Internet, in particular image datasets. Such datasets are currently used in research on topics such as information forensics and image-processing. The article strongly recommends against webcrawling as a means of generating experimental datasets, and proposes safer alternatives.

Keywords


internet; webcrawler; webcrawling; data gathering; image-processing; information forensics

DOI: http://dx.doi.org/10.5210/fm.v15i2.2739



A Great Cities Initiative of the University of Illinois at Chicago University Library.

© First Monday, 1995-2017. ISSN 1396-0466.