Seeing the Trouble Math Makes
Math is at the center of data science — which isn’t surprising, given that math is at the center of many areas, including music, computer programming, and the universe. However, though math may be the keystone of many things, it isn’t the whole thing of anything.
In fairness to math, it’s prudent to point out that math isn’t a decision, either. The discipline known as decision analysis defines a decision as an irrevocable act — that means an investment of effort, time, or resources must be fully committed and deployed before it can be technically deemed a decision. Math doesn’t commit, let alone act. It calculates. As such, it delivers a calculation, not a decision. The decision, my friend, rests with you, the diviner of the calculation.
It’s worrisome news, I know. It is so much more convenient to praise or blame the math for data-driven decisions and, in so doing, absolve ourselves of any responsibility or accountability. But no, at best math gives us limited cover for bad decisions. See? I told you math makes for trouble!
The limits of math-only approaches
When you come right down to it, math isn’t much of a strategist. If your business strategy involves putting all your eggs in the mathematical basket, you’re staking your business’s future on a naïve strategy that more likely than not will underperform. That’s what typically happens when strategies depend too much on quantitative values.
It’s nonetheless true that quantitative values have fueled (and continue to fuel) the harvesting of big data’s low-hanging fruit. That is to say, many of the algorithms used until this point do have value and will continue to have value going forward, but only in certain circumstances. For example, an algorithm that predicts when a mechanical part will reach the end of its useful life is a reliable indicator of when that part should be replaced. Such decision triggers (or decision recommendations, if you prefer a different term) will continue to be helpful. That being said, without added qualitative inputs for balance and context in decision-making, pure math tends to go a bit sideways in real-world applications.
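To make the idea of a decision trigger concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the Part record, the replacement_trigger function, the 10 percent safety margin, and the sample numbers. The predicted life is simply taken as given, standing in for whatever predictive algorithm you might actually use. Notice that the function hands back a recommendation; it doesn't replace anything.

```python
from dataclasses import dataclass


@dataclass
class Part:
    """Hypothetical record for a monitored mechanical part."""
    name: str
    hours_in_service: float
    predicted_life_hours: float  # assumed output of some predictive model


def replacement_trigger(part: Part, safety_margin: float = 0.1) -> str:
    """Return a recommendation (not a decision) about replacing the part.

    The trigger fires when the remaining predicted life falls to or below
    a fraction (safety_margin) of the total predicted life.
    """
    remaining = part.predicted_life_hours - part.hours_in_service
    threshold = safety_margin * part.predicted_life_hours
    if remaining <= threshold:
        return "recommend replacement"
    return "no action recommended"


# Illustrative values only: 800 hours remain, and the threshold is 1,000.
bearing = Part("pump bearing", hours_in_service=9200, predicted_life_hours=10000)
print(replacement_trigger(bearing))  # prints "recommend replacement"
```

Whether to act on that recommendation (say, by weighing budget, downtime, or an upcoming overhaul) is still up to a person, which is exactly where the qualitative inputs come in.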
So, what could act as qualitative measures in decision-making? For the most part, they are things you associate with human qualities, such as behaviors, emotional responses, talents, instincts, intuitive intelligence, experience, cultural interpretations, and creativity. Folks often refer to them as soft skills, but Google’s Cassie Kozyrkov hits the nail on the head when she says that it’s better to think of these skills as “the ‘hardest to automate.’”
I’m all for cultivating the soft skills, as my arguments throughout this book make clear. But I’m not about to throw the baby out with the bathwater. The points I make here in no way negate or contradict the usefulness of math in data science, data analytics, or decision processes. You can’t just skip the math in decision intelligence — nor should you want to. The good news is that much of the math you need has already been built into many of the more useful analytical tools available to you, making them much less troublesome and far easier to use. (I tell you more about tools with automated math later, in Chapter 7.) For now, the point is that math alone does not a decision make.
Decision intelligence adds to the data sciences; it doesn’t lessen the value of the associated disciplines, experiences, tools, or lessons learned thus far in scalable decision-making. Rather, it involves a rethinking of how and when to use those disciplines, experiences, tools, and lessons learned thus far in the decision-making process. Make no mistake; math and algorithms remain important cornerstones in many of the tools. However, math and algorithms are decoupled from the decision-making process in the user interface and pushed to the background in emerging decision intelligence and related tools.
Think of decision intelligence as the next logical, evolutionary step in data democratization and interpretation.
The right math for the wrong question
Math is the cornerstone of data analytics in particular and of decision-making in general. However, the right math can deliver the right answer to a wrong question, which leads to nothing good in the way of making a sound decision.
How does that happen? It’s the result (mostly, but not always) of communication errors and mismatched assumptions between people or groups. That’s right — the problem has nothing to do with the math. The math is right and the answer is right, yet it’s all wrong because the question was wrong for the result folks were looking for.
For example, it’s quite common for a data scientist or a data analyst to query the data based on a question asked by a business manager or an executive. But managers and executives often pose questions based on assumptions arising from their own (limited) perspective, often using imprecise language. Data scientists and data analysts, on the other hand, think and speak in the precise terms and statistical assumptions that are the norm in their crafts. The two seldom meet on the same train of thought.
If you ever want to develop a firsthand appreciation of what I think of as the great data divide, learn any programming language (provided you don’t know one already). The first thing you notice is how profoundly it changes the way you think, the assumptions you make, the way you approach logic, and the expectations you have of machine performance.
The truth is, people fall into patterns of thinking and often can’t imagine that any other pattern exists. Imagine a cake baker asking a bridge engineer to go outdoors and bring back some fruit flies. Never mind that what the cake baker really wants is a set of edible creations that look like flies made of fruit for an entomologist’s birthday cake that’s on order; the cake baker said “fruit flies,” which the more technical-thinking bridge engineer took to mean those nasty little fruit flies that bedevil your fruit bowl. The bridge engineer may work diligently for endless hours to collect fruit flies and deliver them to the cake baker, who will then see this result as disastrous to their own efforts and squash the lot. That, in a nutshell, is why so many data queries end up delivering so little in terms of business value.
Real-world examples that aren’t quite so fanciful are plentiful. As a science-and-technology journalist, I see news publications regularly derailed by their addiction to following the answers to the wrong questions. For example, it’s typical for news media to “run the numbers” to see which articles attract the most audience eyeballs, clicks, likes, and shares. Whatever tops those rankings becomes the basis for the next list of assignments for staff and freelance journalists. Sounds like a good plan, yes? Well, it is, but only for as far as it goes.
There’s a problem with diminishing returns. Think for a moment: How many times can the same article be written in different variations before readers lose interest and the publications pay for articles that readers won’t read? Those dead-on-arrival articles also impact other metrics, including the ones advertisers consider before buying ads or sponsored content with that publication.
In response, the publications run the numbers again to see which articles are trending now in order to repeat the cycle until it again ends in diminishing returns — all because of a wrong question, and even worse, a wrong question repeated endlessly.
The right question would be one that would put the publication in the lead position of trending articles rather than following the leader at the midsection or tail of moving trends. When a publication