It is easy to see machine learning in action in just the first 90 minutes of the day. It is not exactly clear which of your author's observed activities, or features, were used to trigger his phone to make suggestions, but it is clear that routine daily actions, observed and logged over time, served as training data and ultimately resulted in a number of predictions about subsequent activities, or labels. Clearly, the phone knew what time the alarm was set for, when the commute begins, and where your author's car is left for the day.
There is a scale of maturity for machine learning capabilities. It begins with descriptive analytics, which uses data aggregation and mining to examine what has happened in the past, and moves forward to diagnostic analytics, which seeks to understand the drivers of the observed outcomes. Moving further along the continuum, we reach predictive analytics, which uses statistical forecasting models to project from past observations what will happen in the future, and finally prescriptive analytics, which uses optimization and simulation algorithms to advise on possible outcomes and to determine what actions should be taken. In the phone example above, the machine learning model sits far to the right of this continuum, approaching prescriptive analytics. The model was able to predict the next actions and to prescribe what to do about them – launch the driving directions app, as you are in the car and headed to the train station!
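To make the continuum concrete, the short sketch below walks a toy commute dataset through all four stages. The figures, the leave-early assumption, and the simple straight-line model are invented purely for illustration and are not drawn from the phone example above.

```python
# An illustrative sketch of the analytics maturity scale on toy commute data;
# the numbers and simple models below are made up for demonstration only.
import numpy as np

commute_min = np.array([32, 35, 31, 44, 33, 46, 34, 36, 45, 38])  # last ten workdays
rained = np.array([0, 0, 0, 1, 0, 1, 0, 0, 1, 0])                 # 1 = it rained that day

# Descriptive: what happened? Aggregate past observations.
print("average commute:", commute_min.mean(), "minutes")

# Diagnostic: why did it happen? Look for drivers of the outcome.
print("rain vs. commute correlation:", np.corrcoef(rained, commute_min)[0, 1])

# Predictive: what will happen? Fit a simple model and project forward.
slope, intercept = np.polyfit(rained, commute_min, 1)
forecast = slope * 1 + intercept              # predicted commute if it rains tomorrow
print("predicted rainy-day commute:", forecast, "minutes")

# Prescriptive: what should we do about it? Choose the action with the better outcome.
leave_early_saving = 10                       # assumed minutes saved by leaving early
action = "leave early" if forecast - leave_early_saving < 40 else "leave as usual"
print("recommended action:", action)
```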
It is likely that analysts will encounter less mature models, where they would be pleased simply to draw out correlations that are difficult to uncover with traditional analysis tools. The analyst may be looking for descriptions of the Xs observed in the case of Y outcomes, or perhaps explanations of the Ys from observed Xs. However, the true value is thought to come when an algorithm understands a large number of observations well enough to make predictions about the future. Taking it a step further, by tying the predicted outcomes to prescribed action steps, we approach true artificial intelligence, enabling us to best deal with encountered scenarios in a data-driven and methodical way.
Optical Character Recognition/Intelligent Character Recognition
Optical character recognition (OCR) is a means of using software to convert images of typed, handwritten, or printed text into machine-encoded text, from a number of formats – scanned documents, photos of documents, or even subtitles, captions, or text superimposed on an image. As a practical matter, OCR is often used to digitally capture books or other documents with consistent and universally recognizable fonts. OCR is often a component of document management software (DMS) that can be used to go paperless. Many readers may use DMS programs to take a snapshot of transaction receipts with their phones, and the software will capture and categorize transactional details such as the items purchased and the vendor, directly from the image. Other images that we work with can be tailored to our needs with OCR. A common example is the Adobe Acrobat document image, frequently used to lock down documents into a stable, read-only format prior to distribution. OCR capabilities allow for more flexible digital archiving, can make such documents searchable, and can even allow users to copy and paste from the body of machine-encoded text, once it has been extracted from the image.
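As a rough illustration of how little code a basic OCR extraction can require, the sketch below assumes the open-source Tesseract engine is installed along with the pytesseract and Pillow Python packages; the receipt image path is a placeholder.

```python
# A minimal OCR sketch, assuming Tesseract, pytesseract, and Pillow are installed.
from PIL import Image
import pytesseract

image = Image.open("receipt.png")          # placeholder path to a scanned receipt or document
text = pytesseract.image_to_string(image)  # machine-encoded text extracted from the image

# Once extracted, the text can be searched, copied, or archived digitally.
print(text)
```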
Intelligent character recognition (ICR) is, on the surface, a very similar technology. It also enables the extraction of text from images. However, it has an added dimension of complexity: the ability to learn more complicated and non-standard fonts and, importantly, even human handwriting. Whereas OCR tends to be appropriate for easily understood typewritten text, being able to learn, recognize, and understand the freest of free-text forms is another skill altogether. The ability to continuously learn from training data makes ICR significantly more sophisticated and often more costly to deploy. Organizations that wish to capture and archive large volumes of information from images should evaluate the level of customization and flexibility required to process the target body of data, remaining conscious of the complexity and costs involved in moving along the capability spectrum from OCR to ICR.
Natural Language Processing
Natural language processing (NLP) is a dimension of artificial intelligence that enlists linguistics and computer science to improve how computers can capture, analyze, and process large volumes of human language data. Efforts in this area have centered around speech recognition, language translation, natural language understanding, and natural language generation. This is perhaps the area of artificial intelligence which has been in existence the longest, spanning decades of work in the field.
In our discussion of robotic process automation (RPA) in a previous section of this chapter, we described in some detail how Chat Bots can multiply the efforts of existing customer service staff by engaging users to extract common demands and then locating appropriate information to provide in response. Chat Bots leverage NLP to “understand” those demands. Other popular virtual assistants in use today likewise rely on NLP to interpret and act on user commands.
For many of us, informal conversational English can be quite different from explicit computer language demands. “Hey Siri, can you place a call to order pizza from that place on Main Street we ordered from last week?” is a natural language request that needs to be classified, or broken down, to answer such questions as Q1: “What does the user want?” – A1: The user would like me to place a call, or at least to locate a phone number. Q2: “Place a call to (or get a phone number for) whom?” – A2: A pizza restaurant that is in the user's call history, with an address on Main Street. Q3: “If my classification fails, or any of my actions do not appear suitable, relevant, or of value, is there any other related information I can provide to better assist the user?” – A3: Provide pizza options from nearby restaurants.
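A toy sketch of that breakdown appears below. It uses simple keyword rules to classify the intent and pull out slots from the request; the patterns, intent names, and fallback are hypothetical stand-ins, not a depiction of how Siri or any commercial assistant actually works.

```python
import re

# Toy intent classification and slot extraction with keyword rules.
# Intents, patterns, and helper names are hypothetical, for illustration only.
INTENT_PATTERNS = {
    "place_call": [r"\bcall\b", r"\bdial\b", r"\bphone\b"],
    "get_directions": [r"\bdirections\b", r"\bnavigate\b"],
}

def classify_intent(utterance: str) -> str:
    """Return the first intent whose keyword patterns match the utterance (Q1)."""
    text = utterance.lower()
    for intent, patterns in INTENT_PATTERNS.items():
        if any(re.search(p, text) for p in patterns):
            return intent
    return "fallback"  # Q3: no confident classification, so offer related help instead

def extract_slots(utterance: str) -> dict:
    """Pull out simple slots such as cuisine and street with naive patterns (Q2)."""
    text = utterance.lower()
    slots = {}
    cuisine = re.search(r"\b(pizza|sushi|thai|burger)\b", text)
    street = re.search(r"on ([a-z]+ street)", text)
    if cuisine:
        slots["cuisine"] = cuisine.group(1)
    if street:
        slots["street"] = street.group(1)
    return slots

request = ("Hey Siri, can you place a call to order pizza from "
           "that place on Main Street we ordered from last week?")
print(classify_intent(request))  # -> place_call
print(extract_slots(request))    # -> {'cuisine': 'pizza', 'street': 'main street'}
```

A production natural language understanding model would, of course, replace these keyword rules with statistical classifiers trained on large volumes of labeled utterances.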
NLP is all about being able to structure, classify, and understand meaning from volumes and volumes of unstructured data. If we think about how much information is available on the internet, could the internet not be a valuable and rich dataset, containing a nearly infinite number of observations for virtually any study? If you were the CEO of one of the largest social media sites, do you think you could benefit from digesting, at any point in time, the millions of posts published in your ecosystem? Remember, these posts are the lifeblood of your livelihood, but they may also represent the biggest liabilities to your reputation and, ultimately, threats to your profitability. Perhaps even worse, could a reckless failure to draft guidelines on appropriate online behavior, to actively monitor the body of content published on your site, and to implement policies and procedures to respond appropriately to behaviors that run counter to those guidelines introduce regulatory risk? Could such a failure prompt regulatory action, impact your business model, and compromise your long-cultivated self-determination and independence? GULP!
In the above example, a finely tuned NLP model, able to process the informal vernacular of posts, to glean meaning from content, to classify whether posts are appropriate, and finally to flag as exceptions any posts deemed inconsistent with usage guidelines, could be a very valuable tool to have at your disposal. Equipping an NLP model to extract, structure, and classify messages, all while continually improving the outcomes, is integral to allowing humans to interface with computers on our own terms.
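As a minimal sketch of such a classification step, the example below assumes the scikit-learn library is available and uses a handful of invented posts and labels; a real moderation model would be trained on a far larger, carefully labeled corpus and paired with human review.

```python
# A minimal post-classification sketch, assuming scikit-learn is installed.
# The sample posts, labels, and threshold are illustrative, not real moderation rules.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training examples: 1 = violates usage guidelines, 0 = acceptable
posts = [
    "Great product, would recommend to anyone",
    "Buy followers now!!! Click this link",
    "Thanks for the quick shipping",
    "Send me your password to claim your prize",
]
labels = [0, 1, 0, 1]

# Vectorize the free text and fit a simple linear classifier
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(posts, labels)

# Flag new posts whose predicted probability of violation exceeds a threshold
new_posts = ["Click here to win a free prize", "Lovely photo of your dog"]
for post, prob in zip(new_posts, model.predict_proba(new_posts)[:, 1]):
    status = "FLAG FOR REVIEW" if prob > 0.5 else "ok"
    print(f"{status:>15}  ({prob:.2f})  {post}")
```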
Self-Service Data Analytics
We have introduced self-service data analytics as an important and growing subset of the suite of data analytics tools that are rapidly saturating the finance, accounting, and operations functions across organizations. In most cases, these tools are off-the-shelf vendor products that individual operators, not technologists, can interact with and configure directly, thanks to their drag-and-drop ease of use. Process owners, rather than coders and technologists, can build workflows in