Tableau Your Data! Murray Daniel G.

Author: Murray Daniel G.
Publisher: John Wiley & Sons Limited
Genre: Foreign educational literature
ISBN: 9781119001201
graphics should draw the viewer’s attention to the sense and substance of the data, not to something else.

– Edward R. Tufte[1]

      The seeds for Tableau were planted in the early 1970s when IBM invented Structured Query Language (SQL) and later in 1981 when the spreadsheet became the killer application of the personal computer. Data creation and analysis fundamentally changed for the better. Our ability to create and store data increased exponentially.

      The business intelligence (BI) industry was created with this wave, each vendor providing a product “stack” based on some variant of SQL. The pioneering companies invented foundational technologies and developed sound methods for collecting and storing data. Recently, a new generation of NoSQL[2] (Not Only SQL) databases has been enabling web properties like Facebook to mine massive, multi-petabyte[3] data streams.

      Deploying these systems can take years. Data today resides in many different databases and may also need to be collected from external sources. The traditional leaders in the BI industry have created reporting tools that focus on rendering data from their proprietary products. Performing analysis and building reports with these tools require technical expertise and time. The people with the technical chops to master them are product specialists who don’t always know the best way to present the information.

      The scale, velocity, and scope of data today demand reporting tools that deploy quickly. They must be suitable for non-technical users to master. They should connect to a wide variety of data sources. And the tools need to guide us toward the best known techniques for rendering data into information.

      The Shortcomings of Traditional Information Analysis

      Entities are having difficulty getting widespread usage of traditional BI tools. A study by the Business Application Research Center (BARC, 2009) reported that adoption rates are surprisingly low.[4]

      In any given BI-using organization, just over 8 percent of employees are actually using BI tools. Even in industries that have aggressively adopted BI tools (e.g., wholesale, banking, and retail), usage barely exceeds 11 percent.

– Nigel Pendse, BARC

      In other words, 92 percent of the people who have access to traditional BI tools don’t use them. The BARC survey noted these causes:

      • The tools are too difficult to learn and use.

      • Technical experts are needed to create reports.

      • The turnaround time for reports is too long.

      Companies that have invested millions of dollars in BI systems are still using spreadsheets for data analysis and reporting. When reports do arrive from BI systems, they often employ inappropriate visualization methods. Stephen Few has written several books that illuminate the problem and provide examples of data visualization techniques that adhere to best practices. Few also provides examples of inappropriate visualizations produced by legacy vendor tools.[5] It turns out that the skills required to design and build database products are different from the skills needed to create dashboards that communicate effectively. The BARC study clearly indicates that this IT-centric control model has failed to deliver compelling answers that attract users.

      You want to make informed decisions with reliable information. You have to connect with a variety of data sources and may not know the best ways to visualize the data. Ideally, the tool you use should automatically present the information using best practices. Tableau has become a popular choice because it makes industrial-strength reporting, analysis, and discovery accessible to less-technical staff. During the last few years, information technology teams have started to embrace end-user empowerment because it delivers information more efficiently, reduces request backlogs, and offers a toolset for leveraging the knowledge of constrained technical human resources.

      The Business Case for Visual Analysis

      Whether your entity seeks profits or engages in non-profit activities, all enterprises use data to monitor operations and perform analysis. Insights gleaned from the reports and analyses are then used to maintain efficiency, pursue opportunity, and prevent negative outcomes. Supporting this infrastructure (from the perspective of the information consumer) are three kinds of data.

       Three Kinds of Data That Exist in Every Entity

      Reports, analysis, and ad hoc discovery are used to express three basic kinds of data.

       Known Data (Type 1)

      Known data is encompassed in the daily, weekly, and monthly reports used for monitoring activity. These reports provide the basic context used to inform discussion and frame questions. Type 1 reports aren’t intended to answer questions. Their purpose is to provide visibility of operations.

       Data You Know You Need to Know (Type 2)

      Once patterns and outliers emerge in type 1 data, the question that naturally follows is: Why is this happening? People need to understand the cause of the outliers so that action can be taken. Traditional reporting tools provide a good framework to answer this type of query as long as the question is anticipated in the design of the report.
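As a rough illustration of how an outlier in a type 1 report might prompt a type 2 question, the following Python sketch flags values that sit far from the average. The weekly order counts, the function name `flag_outliers`, and the 1.5 standard deviation threshold are all assumptions made for this example; they do not come from the book, and this is not how Tableau identifies outliers.

```python
from statistics import mean, stdev

# Hypothetical weekly order counts from a type 1 report (illustrative only).
weekly_orders = {"W1": 102, "W2": 98, "W3": 105, "W4": 99, "W5": 161, "W6": 101}


def flag_outliers(series, threshold=1.5):
    """Return the labels whose value lies more than `threshold`
    sample standard deviations away from the mean of the series."""
    values = list(series.values())
    centre = mean(values)
    spread = stdev(values)  # requires at least two data points
    if spread == 0:
        return []  # all values identical, so nothing stands out
    return [
        label
        for label, value in series.items()
        if abs(value - centre) / spread > threshold
    ]


print(flag_outliers(weekly_orders))
```

The flagged week is only the starting point. The type 2 question (why did week 5 spike?) still has to be answered by drilling into the underlying data.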

       Data You Don’t Know You Need to Know (Type 3)

      Performing analysis with data in real time while using appropriate visual analytics provides the possibility of seeing patterns and outliers that are not visible in type 1 and type 2 reports. The process of interacting with granular data yields different questions that can lead to new actionable insights. Software that enables quick, iterative analysis and reporting is becoming a necessary element of effective business information systems.

      Distributing type 1 reports in a timely manner is important. This requires speedy design and build stages when a new type 1 report is created. To effectively enable type 2 and type 3 analyses, the reporting tool must adapt quickly to ad hoc queries and present the data in intuitive ways.

       How Visual Analytics Improves Decision Making

      Rendering data accurately is easy to achieve with Tableau, but your knowledge of the best practices enhances the clarity of the information being displayed. The next three figures illustrate how the choice of chart types can make it easier for your audience to see and understand important findings in the data. The goal of these examples is to provide sales analysis by region, product category, and product subcategory.

Figure 1-1 presents data using a grid of numbers (a text table) and pie charts. Text tables are useful for finding specific values. Pie charts are intended to show part-to-whole comparisons. The pie charts compare sales by region and product category.

Figure 1-1: Sales mix analysis using a text table and pie charts

      Text tables are not the most effective way to make part-to-whole comparisons or identify outliers. Pie charts are a commonly used chart type but are one of the least effective ways to make precise comparisons. This is especially true when there are many slices that are similar in size or very small.

Figure 1-2 employs a bar chart and heat map to convey the same information. Bar charts provide a better means for making precise comparisons. The linear presentation makes it easier to see the relative values. The heat map on the right provides total sales for each product category. The grayscale background color in the heat map highlights the high- and low-selling items. The blue-orange color encoding in the bar chart provides additional information on profit ratio. More importantly, this color scheme remains distinguishable to color-blind people.

Figure 1-2: Sales mix analysis using a bar chart and heat map
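To see why a sorted, linear presentation helps with precise comparison, consider the minimal Python sketch below. It prints a text-only bar chart in which every bar shares one scale, so values that would look like nearly equal pie slices become easy to rank. The regional sales figures and the function names `share_table` and `text_bar_chart` are hypothetical, chosen only for illustration; they are not the data behind the book's figures, and the sketch assumes a non-empty set of positive values.

```python
# Hypothetical sales by region (illustrative only, not from the book).
sales = {"East": 240_000, "West": 260_000, "Central": 225_000, "South": 195_000}


def share_table(values):
    """Return (label, amount, percent_of_total) rows, largest amount first."""
    total = sum(values.values())
    rows = [(label, amount, 100 * amount / total) for label, amount in values.items()]
    return sorted(rows, key=lambda row: row[1], reverse=True)


def text_bar_chart(values, width=40):
    """Render one horizontal bar per label, scaled so the largest value
    fills `width` characters."""
    rows = share_table(values)
    largest = rows[0][1]
    lines = []
    for label, amount, pct in rows:
        bar = "#" * round(width * amount / largest)
        lines.append(f"{label:<10}{bar} {amount:>9,} ({pct:.1f}%)")
    return "\n".join(lines)


print(text_bar_chart(sales))
```

In a pie chart these four regions would appear as slices between roughly 21 and 28 percent, which are hard to order by eye. On a common baseline the ranking is immediate.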

The bar chart and heat map communicate


[1] Edward R. Tufte, The Visual Display of Quantitative Information (Cheshire, CT: Graphics, 2001), 91.

[2] Margaret Rouse, “NoSQL (Not Only SQL Database),” “Essential Guide, Big Data Applications: Real-World Strategies for Managing Big Data,” SearchDataManagement, October 5, 2011, http://searchdatamanagement.techtarget.com/definition/NoSQL-Not-Only-SQL.

[3] Andrew Ryan, “Under the Hood: Hadoop Distributed Filesystem Reliability with Namenode and Avatarnode,” https://www.facebook.com/notes/facebook-engineering/under-the-hood-hadoop-distributed-filesystem-reliability-with-namenode-and-avata/10150888759153920.

[4] Stephen Swoyer, “Report Debunks Business Intelligence Usage Myth,” http://tdwi.org/Articles/2009/05/20/Report-Debunks-BI-Usage-Myth.aspx?Page=1.

[5] Stephen Few, Information Dashboard Design: The Effective Visual Communication of Data (Berkeley, CA: O’Reilly Media, Inc., 2006), 4.