2 Overview of Artificial Intelligence and Machine Learning
Nur Zincir‐Heywood1, Marco Mellia2, and Yixin Diao3
1 Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada
2 Department of Electronics and Telecommunications, Politecnico di Torino, Torino, Italy
3 PebblePost, New York, NY, USA
2.1 Overview
As computer and network technologies improve, the ability to acquire, access, store, and process huge amounts of data from physically distant or nearby locations also increases. For example, people with smartphones are connected all the time to different social media systems, exchanging text, voice, photos, and videos at any time and any place. This typically amounts to gigabytes to terabytes of data every day on social networking platforms. This stored data becomes useful when it is analyzed and turned into information, for example for prediction or correlation. To this end, artificial intelligence (AI) and machine learning (ML) techniques have been increasingly employed over the years [1–3].
AI is for the most part logic based [4]. AI aims to make computers do the types of things that human minds can do. These include, but are not limited to, reasoning, planning, prediction, association, and perception, which enable humans to achieve their goals. There are several major types of AI, from classical or symbolic AI to ML, each of which includes many variations. Classical/symbolic AI models planning and reasoning and can also model learning. It is based on the spirit of the Turing machine combined with propositional logic and the theory of neural synapses. Complex propositions are built, and deductive arguments are carried out, by using logical operators to describe reasoning systems. Expert systems, knowledge bases, and case‐based reasoning are some examples of classical AI. Expert systems mimic the decision‐making process of a human expert. The program would ask an expert in a field how to respond in a given situation, and once this was learned for a sufficient range of situations, non‐experts could receive advice from that program. These programs would be used for creating knowledge bases, which may then be used in decision support systems for different application areas. Case‐based reasoners solve new problems by retrieving stored “cases” describing similar prior problem‐solving episodes and adapting their solutions to fit new needs. Over the years, we have seen applications of expert systems and case‐based reasoning systems in network and service management [5–7].
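The retrieve‐and‐adapt cycle of a case‐based reasoner can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the `Case` fields, the three stored cases, the Euclidean similarity measure, and the proportional adaptation rule are assumptions, not taken from any of the cited systems.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Case:
    # Hypothetical problem description: (link utilization, packet loss rate).
    symptoms: tuple[float, float]
    # Hypothetical stored solution.
    action: str
    rate_limit_mbps: float


# A tiny, made-up case base of prior problem-solving episodes.
CASE_BASE = [
    Case((0.95, 0.10), "throttle", 100.0),
    Case((0.30, 0.20), "replace_cable", 0.0),
    Case((0.50, 0.00), "no_action", 0.0),
]


def retrieve(symptoms, case_base):
    """Return the stored case whose symptoms are closest (Euclidean distance)."""
    return min(case_base, key=lambda case: dist(case.symptoms, symptoms))


def adapt(case, symptoms):
    """Adapt the retrieved solution to the new problem.

    Assumed rule: for a throttling action, scale the rate limit by the ratio
    of the stored utilization to the newly observed utilization.
    """
    if case.action != "throttle":
        return Case(symptoms, case.action, case.rate_limit_mbps)
    scale = case.symptoms[0] / symptoms[0]
    return Case(symptoms, case.action, case.rate_limit_mbps * scale)


def solve(symptoms, case_base=CASE_BASE):
    """Retrieve the most similar prior case, then adapt its solution."""
    return adapt(retrieve(symptoms, case_base), symptoms)


new_solution = solve((0.90, 0.12))
print(new_solution.action, round(new_solution.rate_limit_mbps, 1))
```

A practical reasoner would typically also store the adapted case back into the case base so that it can be reused; that step is omitted here for brevity.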
On the other hand, ML is data driven [8]. It means programming computers to optimize a performance criterion using example data or past experience. In ML, there is a model defined by some parameters, and learning is the execution of a program that optimizes the parameters of the model using the training data or past experience. The past‐experience case is distinct from both supervised and unsupervised learning because credit assignment is subject to delays: it is not immediately apparent which behaviors should be rewarded or penalized. This issue is specific to reinforcement learning. The model could be predictive, to make predictions about the future, or descriptive, to gain knowledge from data, or both. ML uses statistical theory to guide model building in order to infer from the training data or past experience. In training, efficient algorithms are necessary to solve the optimization problem as well as to store and process the data or past experience. Moreover, the model that is learned at the end of training is required to have an efficient representation and solution for inference purposes, possibly in real time. In some applications of ML, the efficiency of the learning and inference model, in other words its space and time complexity, could be as important as its prediction accuracy. The growth of network technologies for easy access to data, cheaper access to CPU power, and fast access to data storage has enabled the use of ML algorithms in network and service management [9, 10].
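As a minimal sketch of "learning as parameter optimization," the following fits a two‐parameter model y = w·x + b to training data by gradient descent on the mean squared error. The model form, the learning rate, the number of epochs, and the toy data are all assumptions made for illustration; the text does not prescribe a particular model or optimizer.

```python
def fit_line(xs, ys, lr=0.05, epochs=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # model parameters, initialized arbitrarily
    n = len(xs)
    for _ in range(epochs):
        # Gradient of (1/n) * sum((w*x + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Move the parameters a small step against the gradient.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Toy training data generated from y = 2x + 1 (assumed, noise-free).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
print(f"w = {w:.3f}, b = {b:.3f}")
```

Each epoch touches every training example once, so training time grows with both the data size and the number of epochs, while inference with the learned model is a single multiply‐add. This is a small instance of the trade‐off between training cost and inference cost noted above.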
2.2 Learning Algorithms
Most researchers categorize learning algorithms into three major types based on the underlying characteristics of the task: (i) supervised learning, (ii) unsupervised learning, and (iii) reinforcement learning. In supervised learning, the aim is to learn a mapping from the input data to the output data, whose correct values are provided by the ground truth (labels) during training. In unsupervised learning, there is no such ground truth; there is only input data. Thus, the aim is to find the similarities in the input data. In reinforcement learning, on the other hand, the focus is on identifying a system that is capable of maximizing the cumulative reward received when explicitly interacting with an environment.
However, independent of the task, ML has three key components, namely representation, cost function, and credit assignment. In this context, representation means the learning language used to build solutions. Examples include a neural network representation vs. the representation used in decision tree induction, or instructions from a simple instruction set. Representations can also be distinguished by whether or not they support some form of memory. Recurrent representations define the current output as a function of the previous “internal” state as well as the current input (state). If the representation does not support memory mechanisms, the resulting model is limited to reactive behaviors alone. Depending on the representation assumed, solutions might be more difficult to discover or more costly to estimate. Supporting memory is beneficial for tasks that are partially observable, but might also decrease the ability to establish how decisions have been made (transparency). Cost function refers to how the performance of a solution is evaluated, e.g. classification or prediction accuracy, posterior probability, or how simple a solution should be. Credit assignment guides how the representation is modified, i.e. rewarding/punishing to guide the search process. In the following, we will discuss the types of ML in more detail to gain more insight and understand their uses.
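The difference between a reactive representation and one that supports memory can be made concrete with a small sketch. The task is an assumption chosen for illustration: report whether the current binary input differs from the previous one, which is partially observable because the previous input is not part of the current observation. The reactive model is a lookup table over the current input only, the recurrent model keeps one bit of internal state, accuracy plays the role of the cost function, and the search over reactive models is a simple exhaustive enumeration.

```python
from itertools import product


def ground_truth(stream):
    """1 where a symbol differs from its predecessor (the first is compared with 0)."""
    padded = [0] + list(stream)
    return [int(padded[i + 1] != padded[i]) for i in range(len(stream))]


def reactive_model(table, stream):
    """Reactive representation: the output depends on the current input only."""
    return [table[x] for x in stream]


def recurrent_model(stream):
    """Recurrent representation: one bit of internal state stores the last input."""
    state, outputs = 0, []
    for x in stream:
        outputs.append(int(x != state))  # output uses the input and the state
        state = x                        # update the internal state
    return outputs


def accuracy(predicted, truth):
    """Cost function (to be maximized): fraction of correct outputs."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


stream = [0, 1, 1, 0, 0, 0, 1, 0]
truth = ground_truth(stream)

# Exhaustively evaluate all four possible reactive lookup tables.
best_reactive = max(
    accuracy(reactive_model({0: a, 1: b}, stream), truth)
    for a, b in product((0, 1), repeat=2)
)

print("best reactive accuracy:", best_reactive)
print("recurrent accuracy:", accuracy(recurrent_model(stream), truth))
```

On this stream no reactive table reproduces the target exactly, whereas the single bit of state is sufficient. The price is that explaining a given output now requires knowing the internal state as well as the input.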
2.2.1 Supervised Learning
The goal of supervised learning is to learn a mapping from the input space to the output space, where the correct output values are provided as labels by a supervisor. Figure 2.1 shows an overview of the supervised learning model. If the output data are real‐valued, such problems are called regression. Otherwise, they are called classification, where the learning system fits a model that associates sets of (input) exemplars with labels, possibly with a corresponding measure of certainty. After training with past data, the model learns a classification rule, which may take an If‐Then‐Else form. Having a rule like this enables us to make predictions, provided that the future is similar to the past. In some cases, we may want to calculate a probability; classification then becomes learning an association between the input and output data. Learning a rule from data also allows knowledge extraction. In this case, the rule is a simple (or complex) model that describes the data, and the learning model therefore provides insight into the process underlying the data. Moreover, a learning model also performs compression by fitting a rule to the data.
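As a minimal sketch of learning an If‐Then‐Else classification rule from labeled data, the following trains a one‐threshold rule (a decision stump) on a single feature. The feature (mean packet size in bytes), the two classes (0 = interactive, 1 = bulk transfer), the training data, and the assumption that larger values indicate class 1 are all hypothetical and serve only to illustrate the idea.

```python
def learn_stump(xs, labels):
    """Learn the threshold t for the rule: if x > t then 1 else 0.

    Candidate thresholds are midpoints between consecutive distinct feature
    values; the one with the fewest training errors is returned.
    Assumes at least two distinct values in xs.
    """
    points = sorted(set(xs))
    candidates = [(a + b) / 2 for a, b in zip(points, points[1:])]

    def errors(t):
        # Number of training examples the rule would misclassify.
        return sum((x > t) != bool(y) for x, y in zip(xs, labels))

    return min(candidates, key=errors)


def predict(t, x):
    """Apply the learned If-Then-Else rule to a new input."""
    if x > t:
        return 1
    else:
        return 0


# Hypothetical labeled training data: mean packet size -> traffic class.
sizes = [80, 120, 200, 900, 1200, 1400]
classes = [0, 0, 0, 1, 1, 1]

threshold = learn_stump(sizes, classes)
print(f"if mean_packet_size > {threshold} then bulk else interactive")
print(predict(threshold, 1000), predict(threshold, 100))
```

The learned rule also illustrates the knowledge extraction and compression points made above: six labeled examples are summarized by a single threshold that can be read and checked by a human.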