Multi-Objective Decision Making. Diederik M. Roijers.

Author: Diederik M. Roijers
Publisher: Ingram
Series: Synthesis Lectures on Artificial Intelligence and Machine Learning
Genre: Programs
ISBN: 9781681731827
Karl Tuyls, Francesco Delle Fave, Joris Mooij, Reyhan Aydoğan, and many others.

      Diederik M. Roijers and Shimon Whiteson

      April 2017

       Table of Abbreviations

Abbreviation  Full Name                                     Location
AOLS          approximate optimistic linear support         Algorithm 5.10, Section 5.5
CCS           convex coverage set                           Definition 3.7, Section 3.2.2
CH            convex hull                                   Definition 3.6, Section 3.2.2
CHVI          convex hull value iteration                   Section 4.3.2
CLS           Cheng’s linear support                        Section 5.3
CMOVE         convex multi-objective variable elimination   Section 4.2.3
CoG           coordination graph                            Definition 2.4, Section 2.2.1
CS            coverage set                                  Definition 3.5, Section 3.2
f             scalarization function                        Definition 1.1, Section 1.1
MDP           Markov decision process                       Definition 2.6, Section 2.3.1
MO-CoG        multi-objective coordination graph            Definition 2.5, Section 2.2.2
MODP          multi-objective decision problem              Definition 2.2, Section 2.1
MOMDP         multi-objective Markov decision process       Definition 2.8, Section 2.3.2
MORL          multi-objective reinforcement learning        Chapter 6
MOVE          multi-objective variable elimination          Algorithm 4.5, Section 4.2.3
MOVI          multi-objective value iteration               Section 4.3.2
OLS           optimistic linear support                     Algorithm 5.8, Section 5.3
OLS-R         optimistic linear support with reuse          Algorithm 5.11, Section 5.6
PMOVI         Pareto multi-objective value iteration        Section 4.3.2
PCS           Pareto coverage set                           Definition 3.11, Section 3.2.4
PMOVE         Pareto multi-objective variable elimination   Section 4.2.3
POMDP         partially observable Markov decision process  Section 5.2.1
PF            Pareto front                                  Definition 3.10, Section 3.2.4
SODP          single-objective decision problem             Definition 2.1, Section 2.1
U             undominated set                               Definition 3.4, Section 3.2
VE            variable elimination                          Algorithm 4.4, Section 4.2.1
VELS          variable elimination linear support           Section 5.7
VI            value iteration                               Section 4.3.1
V^π           value vector of a policy π                    Definition 2.2, Section 2.1
Π             a set of allowed policies                     Definition 2.1, Section 2.1
P             Pareto dominance relation                     Definition 3.3, Section 3.1.2

      CHAPTER 1

       Introduction

      Many real-world decision problems are so complex that they cannot be solved by hand. In such cases, autonomous agents that reason about these problems automatically can provide the necessary support for human decision makers. An agent is “anything that can be viewed as perceiving its environment through sensors and acting upon that environment through effectors” [Russell et al., 1995]. An artificial agent is typically a computer program, possibly embedded in specific hardware, that takes actions in an environment that changes as a result of these actions. Autonomous agents can act on a user’s behalf without human control or intervention [Franklin and Graesser, 1997].

      Artificial autonomous agents can assist us in many ways. For example, agents can control manufacturing machines to produce products for a company [Monostori et al., 2006, Van Moergestel, 2014], drive a car in place of a human [Guizzo, 2011], trade goods or services on markets [Ketter et al., 2013, Pardoe, 2011], and help ensure security [Tambe, 2011]. As such,