Bots. Nick Monaco.

Author: Nick Monaco
Publisher: John Wiley & Sons Limited
ISBN: 9781509543601
identities and remove others’ posts. This meant that, in effect, a bot could be used to censor other users by deleting their content from the web. Once the secret was out, users and organizations began cancelling other users’ posts. For example, a bot called CancelBunny began deleting mentions of the Church of Scientology on Usenet, claiming they violated copyright. A representative from the Church itself said that it had contacted technologists to “remove the infringing materials from the Net,” and a team of digital investigators traced the cancelbot back to a Scientologist’s Usenet account (Grossman, 1995). The incident drew ire from Usenet enthusiasts and inspired hacktivists like the Cult of the Dead Cow (cDc) to declare an online “war” on the Church, feeling the attempt at automated censorship violated the free speech ethos of Usenet (Swamp Ratte, 1995). Another malicious cancelbot “attack” from a user in Oklahoma deleted 25,536 messages on Usenet (Woodford, 2005, p. 135). Some modern governments use automation in similar ways, and for similar purposes as these cancelbots and annoybots: using automation to affect the visibility of certain messages and indirectly censor speech online (M. Roberts, 2020; Stukal et al., 2020).

      Another prolific account on Usenet, Serdar Argic, posted political screeds on dozens of different newsgroups with astonishing frequency and volume. These posts cast doubt on Turkey’s role in the Armenian Genocide in the early twentieth century, and criticized Armenian users. Usenet enthusiasts still debate today whether Argic’s posts were actually automated or not, but their high volume and apparently canned responses to keywords such as “Turkey” in any context (even on posts referring to the food) seem to point toward automation.

      Over time, more advanced social Usenet bots began to emerge. One of these was Mark V. Shaney, a bot designed by two Bell Laboratories researchers that made its own posts and conversed with human users. Shaney used Markov chains, a probabilistic language-generation technique that builds sentences word by word, choosing each next word according to how likely it is to follow the word before it. The name Mark V. Shaney was actually a pun on the term Markov chain (Leonard, 1997, p. 49). The Markov chain technique is still widely used today in modern natural language processing (NLP) applications (Jurafsky & Martin, 2018, pp. 157–160; Markov, 1913).
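The word-by-word technique described above can be sketched in a few lines of Python. This is an illustrative bigram model, not Shaney’s actual code (which is not reproduced in the source); the training sentence and function names here are invented for demonstration.

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word to the list of words observed to follow it in the corpus."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, length=10, seed=None):
    """Walk the chain: from the current word, pick a random observed successor."""
    rng = random.Random(seed)
    word, output = start, [start]
    for _ in range(length - 1):
        successors = chain.get(word)
        if not successors:
            break  # dead end: no word was ever seen after this one
        word = rng.choice(successors)
        output.append(word)
    return " ".join(output)
```

Because successors are sampled in proportion to how often they appeared after a word, the output is locally plausible but globally meandering, which is exactly the half-coherent quality Shaney’s posts were known for.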

      The arc of bot usage and evolution in IRC is similar to that of Usenet. At first, bots played an infrastructural role; then, tech-savvy users began to entertain themselves by building their own bots for fun and nefarious users began using bots as a disruptive tool; in response, annoyed server runners and white-hat bot-builders in the community built new bots to solve the bot problems (Leonard, 1997; Ohno, 2018).

      Just as with Usenet, early bots in IRC channels played an infrastructural role, helping with basic routine maintenance tasks. For instance, in IRC’s initial design a channel existed only as long as at least one user was present in it. If every user logged out, the channel would close and cease to exist. Eventually, “Eggdrop” bots were created to solve this problem. Users deployed these bots to stay logged into IRC at all times, keeping channels open even when all human users were logged out (such as at night, when they were sleeping). Bots were easy to build in the IRC framework, and users thus quickly began designing other new bots with different purposes: bots that would say hello to newcomers in the chat, spellcheck typing, or allow an interface for users to play games like Jeopardy! or Hunt the Wumpus in IRC.
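The keep-alive role described above amounts to very little protocol traffic: the bot registers a nickname, joins the channel, and then simply answers the server’s periodic PINGs so it is never disconnected. The sketch below shows those messages as plain-text IRC protocol lines (per RFC 1459); the nickname and channel are hypothetical, and a real Eggdrop bot does far more (user management, scripting, and so on).

```python
def login_lines(nick, channel):
    """Protocol lines a minimal keep-alive bot sends once after connecting:
    register a nickname, then join the channel it is meant to keep open."""
    return [
        f"NICK {nick}",
        f"USER {nick} 0 * :{nick}",
        f"JOIN {channel}",
    ]

def handle_line(line):
    """Answer server PINGs with matching PONGs so the idle bot is never
    disconnected; any other traffic is ignored by this minimal sketch."""
    if line.startswith("PING"):
        return "PONG" + line[len("PING"):]
    return None
```

As long as the PONG replies keep flowing, the server considers the bot a live user, and the channel stays open for humans to rejoin later.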

      In addition to Usenet and IRC, computer games were also a hotbed of early bot development. From 1979 on, chatbots were relatively popular in online gaming environments known as MUDs (“multi-user domains” or “multi-user dungeons”). MUDs gained their name from the fact that multiple users could log into the same server at the same time and play the same game. Unlike console games, MUDs were text-based and entirely without graphics,5 due to early computers’ limited memory and processing power, making them an ideal environment for typed bot interaction. These games often had automated non-player characters (NPCs) that helped move gameplay along, providing players with necessary information and services. MUDs remained popular into the 1990s, and users increasingly programmed and forked their own bots as the genre matured (Abokhodair et al., 2015; Leonard, 1997).

      As we have seen, early internet environments such as Usenet, IRC, and MUDs were the first wave of bot development, driving bot evolution from the 1970s through the 1990s. The next stage of bot advancement came with the advent of the World Wide Web in 1991.

      The basic logic that drives crawlers is very simple. At base, web pages are text files. These text files are written using hypertext markup language (HTML), a standardized format