An Introduction to Text Mining. Gabe Ignatow.

Author: Gabe Ignatow
Publisher: Ingram
Genre: Sociology
ISBN: 9781506337029

recent 2012 guidelines draw particular attention to three areas that need to be negotiated by researchers using user-generated online data: the concept of human subjects, public versus private online spaces, and the distinction between data and persons. The 2012 guidelines do not prescribe a set of dos and don’ts but instead recommend a series of questions for researchers to consider when thinking about the ethical dimensions of their study.

      For human subjects, the AoIR guidelines state as a key guiding principle that because “all digital information at some point involves individual persons, consideration of principles related to research on human subjects may be necessary even if it is not immediately apparent how and where persons are involved in the research data.” However, while the term human subject persists as a guiding concept for ethical social research, in Internet research this gets a bit tricky:

      “Human subject” has never been a good fit for describing many internet-based research environments. Ongoing debates among our community of scholars illustrate a diverse, educated range of standpoints on the answers to the question of what constitutes a “human subject.” We agree with other regulatory bodies that the term no longer enjoys the relatively straightforward definitional status it once did. As a community of scholars, we maintain the stance that when considered outside a regulatory framework, the concept of “human subject” may not be as relevant as other terms such as harm, vulnerability, personally identifiable information, and so forth. We encourage researchers to continue vigorous and critical discussion of the concept of “human subject,” both as it might be further specified in internet related research or as it might be supplanted by terms that more appropriately define the boundaries for what constitutes inquiry that might be ethically challenging. (p. 6)

      A second major consideration in the AoIR ethics guidelines is the idea of public versus private data. While privacy is a concept that must include a consideration of expectations and consensus, a “clearly recognizable boundary” between public and private does not exist:

      Individual and cultural definitions and expectations of privacy are ambiguous, contested, and changing. People may operate in public spaces but maintain strong perceptions or expectations of privacy. Or, they may acknowledge that the substance of their communication is public, but that the specific context in which it appears implies restrictions on how that information is—or ought to be—used by other parties. Data aggregators or search tools make information accessible to a wider public than what might have been originally intended. (p. 7)

      The third consideration or tension in the AoIR guidelines is that between data and persons. The report’s authors noted the following:

      The internet complicates the fundamental research ethics question of personhood. Is an avatar a person? Is one’s digital information an extension of the self? In the U.S. regulatory system, the primary question has generally been: Are we working with human subjects or not? If information is collected directly from individuals, such as an email exchange, instant message, or an interview in a virtual world, we are likely to naturally define the research scenario as one that involves a person.

      For example, if you are working with a data set that contains thousands of tweets or Facebook posts, it may appear that your data are far removed from the people who did the actual tweeting or posting. While it may be hard to believe that the people who produced your data could be directly or indirectly affected by the research, there is considerable evidence that even “anonymized” data sets contain personal information that allows the individuals who produced them to be identified. Researchers continue to debate how to adequately protect individuals when working with such data sets (e.g., Narayanan & Shmatikov, 2008, 2009; Sweeney, 2003). These debates are important because they concern the fundamental ethical principle of minimizing harm; the connection between a person’s online data and his or her physical person could lead to psychological, economic, or even physical harm. Thus, as a researcher, you must consider whether your data can be linked back to the people who produced them and whether there are scenarios in which this link could cause them harm.
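
      One rough way to gauge this linkage risk before analysis is to count how many records share the same combination of indirectly identifying attributes (often called quasi-identifiers). The Python sketch below is illustrative only and is not drawn from the studies cited above: the field names (zip, birth_year, gender), the sample values, and the helper functions group_sizes and at_risk are all hypothetical, and a record that passes this check is not thereby safe from re-identification.

```python
from collections import Counter

# Hypothetical "anonymized" records: names and handles are removed, but
# indirect attributes (quasi-identifiers) remain. All values are made up.
records = [
    {"zip": "75201", "birth_year": 1990, "gender": "F", "post": "text a"},
    {"zip": "75201", "birth_year": 1990, "gender": "F", "post": "text b"},
    {"zip": "75201", "birth_year": 1985, "gender": "M", "post": "text c"},
    {"zip": "76201", "birth_year": 1972, "gender": "F", "post": "text d"},
]


def group_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)


def at_risk(rows, quasi_identifiers, k=2):
    """Return rows whose quasi-identifier combination is shared by fewer than k rows."""
    sizes = group_sizes(rows, quasi_identifiers)
    return [
        row for row in rows
        if sizes[tuple(row[q] for q in quasi_identifiers)] < k
    ]


# Rows that are unique on these three attributes are the easiest to link
# back to a person using outside information.
flagged = at_risk(records, ["zip", "birth_year", "gender"], k=2)
print(len(flagged), "of", len(records), "records are unique on these attributes")
```

      Records flagged in this way are candidates for coarsening (for example, replacing a birth year with an age range) or removal before the data are shared. The threshold k is a judgment call that depends on the data and on the harm that identification could cause.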

      Professional research associations such as the British Psychological Society (www.bps.org.uk/system/files/Public%20files/inf206-guidelines-for-internet-mediated-research.pdf) and the American Psychological Association (APA; www.apa.org/science/leadership/bsa/internet/internet-report.aspx) have developed their own reports and ethical guidelines for online research. But because not all professional research associations have developed their own guidelines, it is critical that you submit your research proposal to your IRB for review before collecting or analyzing data.

      Institutional Review Boards

      IRBs are university committees that approve, monitor, and review behavioral and biomedical research involving humans. Within higher education institutions, ethical approval is required from a university-level ethics committee for any research involving human participants. IRBs and other university ethics committees continue to develop and revise standards to keep up with evolving social media and big data technologies.

      Since the 1990s, a consensus has emerged that the study of computer-mediated and Internet-based communication often requires that IRBs modify their human subjects principles and research ethics policies. Such modifications are necessary because in online environments it is often impossible to gain the consent of research participants (Sveningsson, 2003), and there is often an expectation of public exposure by users. Researchers and ethics professionals who write and revise university research ethics policies continue to grapple with several issues that we address next, including privacy, informed consent, manipulation of human subjects, and publishing ethics.

      Privacy

      In 1996, the Internet researchers Sudweeks and Rafaeli argued that social scientists should treat “public discourse on computer-mediated communication as just that: public” and that, therefore, “such study is more akin to the study of tombstone epitaphs, graffiti, or letters to the editor. Personal? Yes. Private? No” (p. 121). Sudweeks and Rafaeli’s position may be convenient for the practice of research, but it has not always proved sufficient for research using data from contemporary social media platforms. In many cases there is a lack of consensus about whether people who have posted messages on the Internet should be considered “participants” in research or whether research that uses their messages as data should be viewed as the analysis of secondary data that already existed in the public domain.

      Some researchers have argued that publicly available data carry no expectation of privacy, while many researchers who have carried out studies of online messages (e.g., Attard & Coulson, 2012; Coulson, Buchanan, & Aubeeluck, 2007) have deemed the data to be in the public domain yet have sought IRB approval from within their own institutions anyway.

      A number of Internet researchers have concluded that where data can be accessed without site membership, such data can be considered to be in the public domain (Attard & Coulson, 2012; Haigh & Jones, 2005, 2007). On this view, if anyone can access the data without registering on the website, it is reasonable to treat the data as within the public domain of the Internet.

      There appears to be agreement that data on websites that require registration or password protection should be considered to be in the private domain (Haigh & Jones, 2005) because users posting on password-protected websites are likely to have expectations of privacy. Websites that require registration are often copyrighted, which raises the legal issue of who owns the data and whether posts and messages can be legally and ethically used for research purposes.

      The Cornell–Facebook study is widely seen as having invaded the privacy of Facebook users. Some websites and social media platforms have privacy policies that set expectations for users’ privacy, and researchers can use these as guidelines for whether it is ethical to treat the site’s data as being in the public domain or whether informed consent may be required. But in most cases, such guidelines are insufficient and at best provide minimum standards that