At the end of 2014, Professor Stephen Hawking rattled the data science world when he warned, “The development of full artificial intelligence could spell the end of the human race… It would take off on its own, and re‐design itself at an ever increasing rate. Humans, who are limited by slow biological evolution, couldn't compete and would be superseded.”16
In August 2014, Elon Musk took to Twitter to express his misgivings:
“Worth reading Superintelligence by Bostrom. We need to be super careful with AI. Potentially more dangerous than nukes,” (Figure 1.2) and “Hope we're not just the biological boot loader for digital superintelligence. Unfortunately, that is increasingly probable.”
Figure 1.2 Elon Musk expresses his disquiet on Twitter.
In a clip from the movie Lo and Behold, by German filmmaker Werner Herzog, Musk says:
I think that the biggest risk is not that the AI will develop a will of its own, but rather that it will follow the will of people that establish its utility function. If it is not well thought out – even if its intent is benign – it could have quite a bad outcome. If you were a hedge fund or private equity fund and you said, “Well, all I want my AI to do is maximize the value of my portfolio,” then the AI could decide, well, the best way to do that is to short consumer stocks, go long defense stocks, and start a war. That would obviously be quite bad.
While Hawking is thinking big, Musk raises the quintessential Paperclip Maximizer Problem and the Intentional Consequences Problem.
The AI that Ate the Earth
Say you build an AI system with a goal of maximizing the number of paperclips it has. The threat is that it learns how to find paperclips, buy paperclips (requiring it to learn how to make money), and then works out how to manufacture paperclips. It would realize that it needs to be smarter, and so would increase its own intelligence, all in service of making paperclips.
What is the problem? A hyper‐intelligent agent could figure out how to use nanotech and quantum physics to turn every atom on Earth into paperclips.
Whoops, somebody seems to have forgotten to include the Three Laws of Robotics from Isaac Asimov's 1950 book, I, Robot:
1. A robot may not injure a human being or, through inaction, allow a human being to come to harm.
2. A robot must obey orders given it by human beings except where such orders would conflict with the First Law.
3. A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
Max Tegmark, president of the Future of Life Institute, ponders what would happen if an AI
is programmed to do something beneficial, but it develops a destructive method for achieving its goal: This can happen whenever we fail to fully align the AI's goals with ours, which is strikingly difficult. If you ask an obedient intelligent car to take you to the airport as fast as possible, it might get you there chased by helicopters and covered in vomit, doing not what you wanted but literally what you asked for. If a superintelligent system is tasked with a(n) ambitious geoengineering project, it might wreak havoc with our ecosystem as a side effect, and view human attempts to stop it as a threat to be met.17
If you really want to dive into a dark hole of the existential problem that AI represents, take a gander at “The AI Revolution: Our Immortality or Extinction.”18
Intentional Consequences Problem
Bad guys are the scariest thing about guns, nuclear weapons, hacking, and, yes, AI. Dictators and authoritarian regimes, people with a grudge, and people who are mentally unstable could all use very powerful software to wreak havoc on our self‐driving cars, dams, water systems, and air traffic control systems. That would, to repeat Mr. Musk, obviously be quite bad.
That's why the Future of Life Institute offered “Autonomous Weapons: An Open Letter from AI & Robotics Researchers,” which concludes, “Starting a military AI arms race is a bad idea, and should be prevented by a ban on offensive autonomous weapons beyond meaningful human control.”19
In his 2015 presentation on “The Long‐Term Future of (Artificial) Intelligence,” University of California, Berkeley professor Stuart Russell asked, “What's so bad about the better AI? AI that is incredibly good at achieving something other than what we really want.”
Russell then offered some approaches to managing the it's‐smarter‐than‐we‐are conundrum. He described AIs that are not in control of anything in the world but only answer a human's questions, which leaves us wondering whether such an AI could learn to manipulate the human. He suggested creating an agent whose only job is to review other AIs to see if they are potentially dangerous, and admitted that was a bit of a paradox. He's very optimistic, however, given the economic incentive for humans to create AI systems that do not run amok and turn people into paperclips. The result will inevitably be the development of community standards and a global regulatory framework.
Setting aside science fiction fears of the unknown and a madman with a suitcase nuke, there are some issues that are real and deserve our attention.
Unintended Consequences
The biggest legitimate concern facing marketing executives when it comes to machine learning and AI is that the machine does what you tell it to do rather than what you wanted it to do. This is much like the paperclip problem, but much more subtle. In broad terms, this is known as the alignment problem. The alignment problem asks how to give an AI system goals that are not absolute but take all of human values into consideration, especially given that values vary widely from human to human, even in the same community. And even then, humans, according to Professor Russell, are irrational, inconsistent, and weak‐willed.
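The gap between what you tell the machine and what you wanted can be sketched in a few lines of Python. This is a toy illustration, not a real system: the campaign names, the click and complaint counts, and the `complaint_penalty` weight are all hypothetical, invented only to show how a literal objective and an intended one can pick different winners.

```python
# Hypothetical campaign options; every number here is made up for illustration.
campaigns = {
    "helpful_newsletter": {"clicks": 120, "complaints": 1},
    "clickbait_subject": {"clicks": 300, "complaints": 45},
    "daily_email_blast": {"clicks": 260, "complaints": 80},
}


def stated_objective(stats):
    # What we told the machine: maximize clicks, and nothing else.
    return stats["clicks"]


def intended_objective(stats, complaint_penalty=5):
    # Closer to what we wanted: clicks, minus an assumed cost per complaint.
    return stats["clicks"] - complaint_penalty * stats["complaints"]


def best(objective):
    # Return the campaign name that scores highest under the given objective.
    return max(campaigns, key=lambda name: objective(campaigns[name]))


told = best(stated_objective)      # what the machine picks when taken literally
wanted = best(intended_objective)  # what we would have picked ourselves

print("Told to maximize clicks:", told)
print("Actually wanted:", wanted)
```

Even here the "intended" objective is only a guess at what the business values. Choosing the penalty is itself a small version of the alignment problem.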
The good news is that this issue is being actively addressed at the industrial level. “OpenAI is a non‐profit artificial intelligence research company. Our mission is to build safe AI, and ensure AI's benefits are as widely and evenly distributed as possible.”20
The other good news is that this issue is also being actively addressed at the academic/scientific level. The Future of Humanity Institute teamed with Google to publish a paper titled “Safely Interruptible Agents.”21
Reinforcement learning agents interacting with a complex environment like the real world are unlikely to behave optimally all the time. If such an agent is operating in real‐time under human supervision, now and then it may be necessary for a human operator to press the big red button to prevent the agent from continuing a harmful sequence of actions – harmful either for the agent or for the environment – and lead the agent into a safer situation. However, if the learning agent expects to receive rewards from this sequence, it may learn in the long run to avoid such interruptions, for example by disabling the red button – which is an undesirable outcome. This paper explores a way to make sure a learning agent will not learn to prevent (or seek!) being interrupted by the environment or a human operator. We provide a formal definition of safe interruptibility and exploit the off‐policy learning property to prove that either some agents are already safely interruptible, like Q‐learning, or can easily be made so, like Sarsa. We show that even ideal, uncomputable reinforcement learning agents for (deterministic) general computable environments can be made safely interruptible.
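The abstract's distinction between Q‐learning and Sarsa rests on how each one updates its value estimates. The sketch below shows the two standard tabular update rules side by side. It is not the paper's construction for safe interruptibility, only the off‐policy versus on‐policy difference that construction builds on. The state and action names, the learning rate `alpha`, the discount `gamma`, and the starting values are all illustrative assumptions.

```python
from collections import defaultdict

ACTIONS = ["continue", "stop"]  # hypothetical action names


def q_learning_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.9):
    # Off-policy: bootstrap from the best-valued action in s_next,
    # regardless of which action is actually taken there.
    target = r + gamma * max(Q[(s_next, b)] for b in ACTIONS)
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next, alpha=0.5, gamma=0.9):
    # On-policy: bootstrap from the action actually taken in s_next.
    target = r + gamma * Q[(s_next, a_next)]
    Q[(s, a)] += alpha * (target - Q[(s, a)])


def fresh_table():
    # Assumed starting estimates: continuing from s1 looks valuable, stopping does not.
    Q = defaultdict(float)
    Q[("s1", "continue")] = 10.0
    Q[("s1", "stop")] = 0.0
    return Q


# Same transition s0 -> s1 with zero reward, but suppose an operator's
# interruption forces the action "stop" once the agent reaches s1.
q_table = fresh_table()
q_learning_update(q_table, "s0", "continue", 0.0, "s1")

sarsa_table = fresh_table()
sarsa_update(sarsa_table, "s0", "continue", 0.0, "s1", a_next="stop")

print("Q-learning estimate:", q_table[("s0", "continue")])
print("Sarsa estimate:", sarsa_table[("s0", "continue")])
```

In this toy transition the Q‐learning estimate ignores the forced "stop", while the Sarsa estimate absorbs it. That is one way to read the abstract's claim that Q‐learning is already safely interruptible and Sarsa needs a modification.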
There is also the Partnership on Artificial Intelligence to Benefit People and Society,22 which was “established