Considering Learning Styles
Machine learning can be applied in three main styles: supervised, unsupervised, and reinforcement learning. Supervised and unsupervised methods are behind most modern machine learning applications, and reinforcement learning is an up-and-coming star.
Learning with supervised algorithms
Supervised learning algorithms require labeled input data; that is, each training observation comes paired with a known label or outcome value. These algorithms learn the relationship between the features and labels of that data to produce an output model that successfully predicts labels for new, incoming, unlabeled data points. You use supervised learning when you have a labeled dataset composed of historical values that are good predictors of future events. Use cases include survival analysis and fraud detection, among others. Logistic regression is a type of supervised learning algorithm, and you can read more on that topic in the next section.
Survival analysis, also known as event history analysis in social science, is a statistical method that attempts to predict the time of a particular event — such as a mother’s age at first childbirth in the case of demography, or age at first incarceration for criminologists.
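To make this concrete, here's a minimal Python sketch of the supervised workflow using scikit-learn's logistic regression. The tiny dataset, the feature values, and the fraud labels are all made up for illustration; the point is simply that the model learns from labeled examples and then predicts labels for data it hasn't seen:

```python
# Minimal supervised-learning sketch: learn from labeled historical data,
# then predict labels for new, unlabeled observations.
# (All numbers and labels below are invented purely for illustration.)
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Each row is one observation; each column is a feature.
X = np.array([[25, 1], [40, 3], [35, 2], [50, 8], [23, 1],
              [60, 9], [48, 7], [33, 2], [55, 6], [29, 1]])
# Known labels for those observations (say, 1 = fraud, 0 = legitimate).
y = np.array([0, 0, 0, 1, 0, 1, 1, 0, 1, 0])

# Hold back some labeled data so you can check the model's predictions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)          # learn from the labeled examples

print(model.predict(X_test))         # predicted labels for held-out points
print(model.score(X_test, y_test))   # share of labels predicted correctly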
Learning with unsupervised algorithms
Unsupervised learning algorithms accept unlabeled data and attempt to group observations into categories based on underlying similarities in input features, as shown in Figure 3-2. Principal component analysis, k-means clustering, and singular value decomposition are all examples of unsupervised machine learning algorithms. Popular use cases include recommendation engines, facial recognition systems, and customer segmentation.
FIGURE 3-2: Unsupervised machine learning breaks down unlabeled data into subgroups.
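Here's an equally bare-bones Python sketch of the unsupervised side, using scikit-learn's k-means implementation. The two blobs of points are generated at random just so there's something to cluster; note that no labels are passed to the algorithm:

```python
# Unsupervised-learning sketch: no labels go in. K-means groups the
# observations into subgroups based only on similarity of feature values.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two synthetic blobs of unlabeled, two-dimensional points.
X = np.vstack([rng.normal(loc=[0, 0], scale=0.5, size=(50, 2)),
               rng.normal(loc=[5, 5], scale=0.5, size=(50, 2))])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)

print(kmeans.labels_[:5])        # cluster assignment for the first few points
print(kmeans.cluster_centers_)   # center of each discovered subgroup
```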
Learning with reinforcement
Reinforcement learning is a behavior-based learning model. It's based on a mechanic similar to how humans and animals learn: the model is given “rewards” based on how it behaves, and it subsequently learns to adapt its decisions in order to maximize the sum of the rewards it collects over time.
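To see the reward-maximizing loop in miniature, here's a toy Python sketch of tabular Q-learning on an invented five-state corridor. The environment, the reward of +1 for reaching the rightmost state, and all the parameter values are made up for illustration; real reinforcement-learning systems are far more elaborate, but the learn-from-rewards mechanic is the same:

```python
# Toy reinforcement-learning sketch: tabular Q-learning on a 5-state corridor.
# The agent starts at state 0 and earns a reward of +1 only when it reaches
# state 4; every other move earns 0. Over many episodes it learns that moving
# right maximizes the rewards it collects.
import numpy as np

n_states, n_actions = 5, 2             # actions: 0 = move left, 1 = move right
alpha, gamma, epsilon = 0.1, 0.9, 0.2  # learning rate, discount, exploration
Q = np.zeros((n_states, n_actions))    # estimated reward per state/action pair
rng = np.random.default_rng(0)

for episode in range(500):
    state = 0
    while state != 4:                        # an episode ends at the goal
        if rng.random() < epsilon:           # explore occasionally...
            action = int(rng.integers(n_actions))
        else:                                # ...otherwise exploit what's known
            action = int(np.argmax(Q[state]))
        next_state = max(state - 1, 0) if action == 0 else state + 1
        reward = 1.0 if next_state == 4 else 0.0
        # Nudge the estimate toward reward plus discounted future reward.
        Q[state, action] += alpha * (
            reward + gamma * np.max(Q[next_state]) - Q[state, action])
        state = next_state

print(np.argmax(Q[:4], axis=1))   # learned action in states 0-3: all 1 (right)
```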
Seeing What You Can Do
Whether you’re just becoming familiar with the algorithms that are involved in machine learning or you’re looking to find out more about what’s happening in cutting-edge machine learning advancements, this section has something for you. First, I give you an overview of machine learning algorithms, broken down by function, and then I describe more about the advanced areas of machine learning that are embodied by deep learning and Apache Spark.
Selecting algorithms based on function
When you need to choose a class of machine learning algorithms, it’s helpful to consider each model class based on its functionality. For the most part, algorithmic functionality falls into the categories shown in Figure 3-3.
Regression: You can use this type to model relationships between features in a dataset. You can read more on linear and logistic regression methods and ordinary least squares in Chapter 4.
Association rule learning: This type of algorithm is a rule-based set of methods that you can use to discover associations between features in a dataset. For an in-depth training and demonstration on how to use association rules in Excel, be sure to check out the companion website to this book (https://businessgrowth.ai/).
Instance-based: If you want to use observations in your dataset to classify new observations based on similarity, you can use this type. To model with instances, you can use methods like k-nearest neighbor classification, covered in Chapter 5.
Regularizing: You can use regularization to introduce added information as a means by which to prevent model overfitting or to solve an ill-posed problem. In case the term is new to you, model overfitting is a situation in which a model is so tightly fit to its underlying dataset, as well as its noise or random error, that the model performs poorly as a predictor for new observations.
Naïve Bayes: If you want to predict the likelihood of an event’s occurrence based on some evidence in your data, you can use this method, which applies Bayes’ theorem to classification under the assumption that features are independent of one another. Naïve Bayes is covered in Chapter 4.
Decision tree: A tree structure is useful as a decision-support tool. You can use it to build models that predict the potential downstream implications associated with any given decision.
Clustering: You can use this type of unsupervised machine learning method to uncover subgroups within an unlabeled dataset. Both k-means clustering and hierarchical clustering are covered in Chapter 5.
Dimension reduction: If you’re looking for a method to use as a filter to remove redundant information, unexplainable random variation, and outliers from your data, consider dimension reduction techniques such as factor analysis and principal component analysis. These topics are covered in Chapter 4.
Neural network: A neural network mimics how the brain solves problems, using a layer of interconnected neural units to learn, and infer rules, from observational data. It’s often used in image recognition and computer vision applications.

Imagine that you’re deciding whether you should go to the beach. You never go to the beach if it’s raining, and you don’t like going if it’s colder than 75 degrees (Fahrenheit) outside. These are the two inputs for your decision. Your preference to not go to the beach when it’s raining is a lot stronger than your preference to not go to the beach when it’s colder than 75 degrees, so you weight these two inputs accordingly. For any given instance where you decide whether you’re going to the beach, you consider these two criteria, add up the result, and then decide whether to go. If you decide to go, your decision threshold has been satisfied. If you decide not to go, your decision threshold was not satisfied. This is a simplistic analogy for how neural networks work.

Now for a more technical definition: The simplest type of neural network is the perceptron. It accepts more than one input, weights those inputs, adds them up on a processor layer, and then, based on its activation function and the threshold you set for it, outputs a result. An activation function is a mathematical function that transforms the weighted inputs into an output signal, and the processor layer is called a hidden layer. A neural network is a layer of connected perceptrons that all work together as a unit to accept inputs and return outputs that signal whether some criterion is met. A key feature of neural nets is that they’re self-learning; in other words, they adapt, learn, and optimize per changes in input data. Figure 3-4 is a schematic layout that depicts how a perceptron is structured.
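The beach analogy translates almost directly into code. Here's a toy Python sketch of a single perceptron with a step activation function; the two inputs, the hand-picked weights, and the threshold are invented to mirror the analogy rather than learned from data, which is what a real perceptron would do:

```python
# Toy perceptron for the beach decision: two binary inputs, hand-picked
# weights, and a step activation function that fires when the weighted sum
# clears the decision threshold. (A real perceptron learns its weights from
# data; these are set by hand only to mirror the analogy.)

def step_activation(weighted_sum, threshold):
    """Return 1 (go to the beach) if the weighted sum meets the threshold."""
    return 1 if weighted_sum >= threshold else 0

def beach_perceptron(not_raining, warm_enough):
    # The rain input is weighted heavily enough to decide the trip on its
    # own; the temperature input, a weaker preference, is not.
    weights = {"not_raining": 0.7, "warm_enough": 0.3}
    weighted_sum = (weights["not_raining"] * not_raining +
                    weights["warm_enough"] * warm_enough)
    return step_activation(weighted_sum, threshold=0.6)

# Inputs are 1 if the condition holds and 0 if it doesn't.
print(beach_perceptron(not_raining=1, warm_enough=1))  # 1: go to the beach
print(beach_perceptron(not_raining=1, warm_enough=0))  # 1: chilly but dry;
                                                       #    the weaker input
                                                       #    alone can't block
print(beach_perceptron(not_raining=0, warm_enough=1))  # 0: raining, stay home
```

With these invented weights, rain alone keeps the weighted sum below the threshold, while a chilly-but-dry day still clears it, which is exactly the "one preference outweighs the other" behavior the analogy describes.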
Deep learning method: This method incorporates traditional neural networks in successive layers to offer deep-layer training for generating predictive outputs. I tell you more about this topic in the next section.
Ensemble