FT.com writes:
According to Juerg Zeltner, CEO of UBS Wealth Management, a mass of information does not equal a wealth of knowledge. With global financial markets continuing to be volatile, the need for interpretation has never been greater.6
Before we proceed, let me introduce a simple but necessary concept for talking about data. It is quite surprising how confused the discussion of the term data still is, even among data scientists and vendors. In reality, there are four fundamentally different levels, depicted in Figure I.3:
Figure I.3 Pyramid of Use Cases (Levels 1–4)
● Level 1: Data – raw data and “cleansed” or “preprocessed” data
This could be a sequence of measurements sent from a temperature or vibration sensor in a packaging machine, a set of credit card transactions, or a few pictures from a surveillance camera. No meaning can be gleaned without further processing or analysis. You may know the term cleansing, but this just refers to readying data for further analysis (e.g., by changing some formats).
Returning to our restaurant analogy from the start of this section, raw data are like raw vegetables just delivered from the grocery store, but not really scrutinized by the chef. Data quality remains a very big issue for companies. In the 2014 Experian Data Quality survey, 75 percent of responding UK companies said that they had wasted 14 percent of their revenue due to bad data quality.7
● Level 2: Information – data already analyzed to some extent
Simple findings have already been derived. For example, the sensor data has been found to contain five unexpected outliers where vibrations are stronger than allowed in the technical specifications, or an analysis of market shares has shown the ranking of a product's market share in various countries in the form of a table or a pie chart. The key point is that we have some initial findings, but certainly no “so what.”
In the restaurant analogy, the chef might have cut and cooked the vegetables, but they haven't been arranged on the plate yet.
● Level 3: Insight – the “so what” that helps in making value-adding decisions
This is what the decision maker is looking for. In our restaurant analogy, the vegetables have now been served on the plate as part of the full meal, and the patron's brain has signaled this wonderful moment of visual and gustatory enjoyment.
There is definitely some room to improve, as shown in a BusinessIntelligence.com survey sponsored by Domo: only 10 percent of 302 CEOs and CXOs believed that their reports provided a solid foundation for decision making,8 and 85 percent of 600 executives interviewed by the Economist Intelligence Unit (EIU) mentioned that the biggest hurdle of analytics was to derive actionable insights from the data.9
● Level 4: Knowledge – a group of Level 3 insights available to others across time and space
This is the essence of what analytics, and indeed research, aims for: insights have been made reusable over time by multiple people in multiple locations. The decision maker might still decide to ignore the knowledge (not everyone learns from history!), but the insights are available in a format that can be used by others. In the restaurant analogy, our guest was actually a reviewer for a major and popular food blog, magazine, or even the Michelin Guide. The reviewer's description informs others, sharing the experience and helping in decisions about the next evening out.
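To make the step from Level 1 to Level 2 concrete, the vibration sensor example can be sketched in a few lines of Python. This is only an illustration: the readings, the specification limit, and the function name `find_outliers` are invented for this sketch and do not come from a real machine.

```python
# Level 1: raw vibration readings (hypothetical values, arbitrary units).
readings = [0.8, 1.1, 0.9, 3.2, 1.0, 2.9, 0.7, 3.5, 1.2, 3.1, 0.9, 4.0]

# Assumed maximum vibration allowed by the technical specification.
SPEC_LIMIT = 2.5


def find_outliers(values, limit):
    """Return (position, value) pairs for every reading above the limit."""
    return [(i, v) for i, v in enumerate(values) if v > limit]


# Level 2: a simple finding derived from the raw data.
outliers = find_outliers(readings, SPEC_LIMIT)
print(f"{len(outliers)} of {len(readings)} readings exceed the limit of {SPEC_LIMIT}")
```

The machine gets us to Level 2 here: it reports how many readings are out of specification. Whether the packaging machine should be stopped, serviced, or left running is the Level 3 "so what," and the code does not answer that.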
A core question that I am posing here is how these four levels of data relate to the concept of mind+machine: where does Mind have a unique role, and where can Machine assist? The short answer is that machines are essential at Level 1 and are becoming better at their role there. At Level 2, some success has been achieved with machines creating information out of data automatically. However, Levels 3 and 4 will continue to require the human mind for 99 percent of analytics use cases in the real world for quite some time.
It is interesting to see that companies are experiencing challenges across Levels 1 to 3, with a stronger focus on Level 3. A 2013 survey of more than 700 marketers, sponsored by Infogroup and YesMail, showed that 38 percent were planning to improve data analysis, 31 percent data cleansing, and 28 percent data collection capabilities.10 The survey did not include questions pertinent to Level 4.
To illustrate the variation in data volumes for each level, we'll take the use case of the chef explaining the process of cooking a great dish in various ways: in a video, in an audio recording, and in a recipe book. Let's assume that all these media ultimately contain the same Level 4 knowledge: how to prepare the perfect example of this dish.
A video can easily have a data volume of between 200 megabytes (1 MB = 1 million bytes = 8 million bits) and about 1 gigabyte (1 GB = 1 billion bytes = 8 billion bits), depending on the resolution. A one-hour audio book describing the same meal would be about 50 to 100 megabytes, roughly 4 to 10 times less data than the video, and the 10 pages of text required to describe the same process would be only about 0.1 megabytes, roughly 2,000 to 10,000 times less data than the video.
The actual Level 3 insights and the Level 4 knowledge consume only a very small amount of storage space (equal to or less than the text in the book) compared to the initial data volumes. If we include all the original footage that never made it into the final video, the Level 1 data volume might even have been 5 to 10 times bigger.
Therefore, the actual “from raw data to insight” compression factor could easily be 10,000 in this example. Please be aware that this compression factor is different from the more technical compression factor used to store pictures or data more efficiently (e.g., in a file format such as .jpeg or .mp3). This insight compression factor is probably always higher than the technical compression factor because we elevate basic data to higher-level abstract concepts the human brain can comprehend more easily.
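The arithmetic behind these factors can be checked with a short Python sketch. The sizes are the rough figures quoted above; the variable names and the pairing of low with low and high with high are assumptions made for this illustration.

```python
MB = 1_000_000  # bytes, using the decimal convention from the text

# Approximate sizes quoted in the text, in bytes.
video_low, video_high = 200 * MB, 1_000 * MB   # 200 MB to 1 GB
audio_low, audio_high = 50 * MB, 100 * MB      # one-hour audio book
text_size = MB // 10                           # 10 pages, about 0.1 MB

# Unused footage is assumed to make the raw data 5 to 10 times bigger.
raw_low, raw_high = 5 * video_low, 10 * video_high

video_to_audio = (video_low / audio_low, video_high / audio_high)
video_to_text = (video_low / text_size, video_high / text_size)
raw_to_text = (raw_low / text_size, raw_high / text_size)

print("video vs. audio:", video_to_audio)
print("video vs. text: ", video_to_text)
print("raw vs. text:   ", raw_to_text)
```

Under these assumptions, the raw footage is between 10,000 and 100,000 times larger than the text, so a factor of 10,000 sits at the conservative end of the range.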
The key point is that decision makers really want the compressed insight and the knowledge, not the raw data or even the information. The reality we see with our clients is exactly the opposite. Everyone seems to focus on creating Level 1 data pools or Level 2 reports and tables with the help of very powerful machines, but true insights are as rare as oases in the desert.
If you're not convinced, answer this question. Who in your organization is getting the right level of insight at the right time in the right delivery format to make the right decision?
Here is a funny yet sad real-life situation I encountered a few years ago. An individual in a prospective client's operations department had spent half of their working time for the previous seven years producing a list of records with various types of customer data. The total annual cost to the company, including overhead, was USD 40,000. When we spoke to the internal customer, we received confirmation that they had received the list every month since joining the company a few years previously. They also told us that they deleted it each time because they did not know its purpose. The analysis never made it even to Level 2 – it was in fact a total waste of resources.
Regarding delivery, I can share another story. A senior partner in a law firm got his team to do regular reports on the key accounts for business development – or as they referred to it, client development. The team produced well-written, insightful