
Statistical Tableau

How to Use Statistical Models and Decision Science in Tableau

With Early Release ebooks, you get books in their earliest form—the
author’s raw and unedited content as they write—so you can take advantage
of these technologies long before the official release of these titles.

Ethan Lang
Statistical Tableau
by Ethan Lang
Copyright © 2023 Ethan Lang. All rights reserved.
Printed in the United States of America.
Published by O’Reilly Media, Inc., 1005 Gravenstein Highway North,
Sebastopol, CA 95472.
O’Reilly books may be purchased for educational, business, or sales promotional
use. Online editions are also available for most titles (http://oreilly.com). For
more information, contact our corporate/institutional sales department: 800-998-
9938 or corporate@oreilly.com.

Editors: Michelle Smith and Sara Hunter

Production Editor: Beth Kelly

Copyeditor: FILL IN COPYEDITOR

Proofreader: FILL IN PROOFREADER

Indexer: FILL IN INDEXER

Interior Designer: David Futato

Cover Designer: Karen Montgomery

Illustrator: Kate Dullea

September 2024: First Edition


Revision History for the Early Release

2023-05-05: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781098151799 for release
details.
The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. Statistical
Tableau, the cover image, and related trade dress are trademarks of O’Reilly
Media, Inc.
The views expressed in this work are those of the author and do not represent the
publisher’s views. While the publisher and the author have used good faith
efforts to ensure that the information and instructions contained in this work are
accurate, the publisher and the author disclaim all responsibility for errors or
omissions, including without limitation responsibility for damages resulting
from the use of or reliance on this work. Use of the information and instructions
contained in this work is at your own risk. If any code samples or other
technology this work contains or describes is subject to open source licenses or
the intellectual property rights of others, it is your responsibility to ensure that
your use thereof complies with such licenses and/or rights.
978-1-098-15173-7
[FILL IN]
Chapter 1. Introduction to Tableau

A NOTE FOR EARLY RELEASE READERS


This will be the 1st chapter of the final book. Please note that the GitHub
repo will be made active later on.
If you have comments about how we might improve the content and/or
examples in this book, or if you notice missing material within this chapter,
please reach out to the editor at shunter@oreilly.com.

It is important to understand that Tableau is not simply a data visualization tool,
but a company with a suite of tools to support data visualization at an enterprise
level. There are many products within Tableau’s ecosystem, including Tableau
Desktop, Tableau Cloud, Tableau Server, Tableau Prep Builder, Tableau Public,
and more.
Some of these products require a license to use while others, like Tableau Public,
do not require you to purchase a license. With a license you can publish your
workbooks to Tableau Server or Tableau Cloud from Tableau Desktop. This
allows your users to view and interact with your data visualizations from a
browser. If you use Tableau Public you can also publish your work online.
However, workbooks published using Tableau Public are accessible to
anyone from Tableau Public’s website.

Download and start using Tableau Desktop


For this book I will focus primarily on Tableau Desktop. To download Tableau
Desktop, first navigate to Tableau’s website, tableau.com. From the top navigation
select Resources, hover over Support, and then click All Releases (see figure 1-1).
Figure 1-1. Finding All Product Releases on Tableau’s Website

On this page you will see Tableau’s different products. Find Tableau Desktop
and click on the link underneath that says “SEE ALL RELEASES” (see figure 1-2).
Figure 1-2. Releases by Product on Tableau’s Website
You will find all the versions of Tableau Desktop on this page. At the time of
this writing I am using 2023.1. Tableau has a very aggressive release schedule
and typically pushes out a new version every quarter. These updates are
denoted with the year, then the quarter, followed by the latest update number.
Click on 2023.1 to expand that version and then download the software (see
figure 1-3).

Figure 1-3. Choosing a Version of Tableau Desktop to Download

After installation open Tableau Desktop. This is a licensed product so you will
be prompted to enter your information. You can also opt to start a free trial of the
tool. Once you have finalized those details you will land on the Start Page as
shown in figure 1-4.
Figure 1-4. Start Page of Tableau Desktop

Tableau Desktop has hundreds of connectors that you can use to access data. On
the left hand side of the Start Page you can explore all of those options. For all
the demonstrations in this book I will be using the Sample - Superstore dataset.
To connect to this dataset, simply click on Sample - Superstore which I have
highlighted in figure 1-5 below.
Figure 1-5. Choose Sample - Superstore from the List of Connectors

After clicking on the sample dataset, you will be navigated from the Start Page
to Tableau Desktop’s authoring interface which you can see in figure 1-6.
Figure 1-6. Tableau Desktop’s Authoring Interface

I provide step by step instructions in the following chapters so I won’t spend too
much time covering every aspect of the authoring interface. However, to give
you a brief overview, on the left hand side you will find the Data pane as shown
in figure 1-7.
Figure 1-7. The Data Pane of the Authoring Interface

At the top of the Data pane you will see a list of the data sources you are
connected to. Moving down you will find a list of measures, dimensions, bins,
and sets. Last, you can find a list of your parameters.
To the right of the Data pane you will find the Marks shelf, Filters shelf, Pages
shelf, Columns shelf, Rows shelf, and canvas which you can see in figure 1-8.

Figure 1-8. Key Features of the Authoring Interface

The last major feature I want to call out in this chapter is in the bottom left
corner of the authoring interface. There you will find a button to navigate to the
Data Source page and three small buttons. These buttons are used to create new
sheets, new dashboards, or a story (see figure 1-9).
Figure 1-9. Key Actions of the Authoring Interface

Using Tableau Desktop is very intuitive and there are many different ways to do
things. To give you a basic example I am going to show you how to create two
simple charts and how to add them to a dashboard. First, double click on Sales in
the Data pane then double click Order Date. You should end up with a line chart
(see figure 1-10).

Figure 1-10. Creating a Simple Line Chart in Tableau Desktop


Click on the New Worksheet button at the bottom left of the authoring interface
as shown in figure 1-11.

Figure 1-11. Creating a New Sheet from the Authoring Interface

This will open Sheet 2; your first chart is still viewable by navigating back to
sheet 1. Double click on Sales then Segment in the Data pane. This will create a
simple Bar Chart showing the SUM of Sales by Segment on the canvas similar
to figure 1-12.

Figure 1-12. Creating a Simple Bar Chart in Tableau Desktop


Now click on the New Dashboard icon in the bottom left of the authoring
interface as shown in figure 1-13.

Figure 1-13. Creating a New Dashboard from The Authoring Interface

This will open a new view where you can create dashboards as shown in figure
1-14. Dashboards are the bread and butter of Tableau and are ultimately what
you will publish online for users to interact with.
Figure 1-14. Dashboard Canvas in Tableau Desktop

Now add your two sheets to the dashboard canvas. On the left, click and drag
Sheet 1 onto the canvas. Then click and drag Sheet 2 onto the canvas. Your
dashboard should now look similar to figure 1-15.
Figure 1-15. Creating a Simple Dashboard Layout in Tableau Desktop

This is a simple example to give you an understanding of how Tableau Desktop
works. Knowing the layout of the tool and its terms is the foundation for
understanding Tableau Desktop as a whole. In later chapters I will show you
more advanced features. For now, let’s introduce you to statistics.

Introduction to Statistics
According to the Merriam-Webster dictionary, statistics is defined as “a branch of
mathematics dealing with the collection, analysis, interpretation, and
presentation of masses of numerical data.” I personally think this definition
hits the nail on the head, especially in today’s landscape. To unlock deep insights
in your data you need to incorporate statistics into almost every aspect of the
analytics process. This includes collecting data in an efficient and ethical way,
understanding the data, finding deeper insights in the analysis, and presenting
your findings so your stakeholders can make informed decisions.
The best way I can introduce you to the power that statistical analysis can unlock
is to dive into a real-world example. Let’s say that your company wants to test
some new marketing in an email. However, they are worried that if the new
marketing fails it could significantly impact sales for this quarter. Therefore,
they want to test the new marketing email by sending it to a subset of the total
email list, then analyze the performance. The results of that test are shown in
the contingency table in Table 1-1.

Table 1-1. Contingency Table of Marketing Conversions

                   Original Email    New Marketing Email
Non-Conversions    727               117
Conversions        23                8

The marketing team has done a simple analysis looking at the conversion rates
of the emails by taking Conversions / Total Sent. Using this calculation they
found that the original email had a conversion rate of about 3% (23/750=0.031)
and the new marketing email had a conversion rate of about 6% (8/125=0.064).
They claim that the new email is an absolute success and that it will lead to
double the number of conversions when they send it out to their entire list next
time.
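If you want to check the marketing team's arithmetic, a minimal Python sketch is shown here. Python is not used elsewhere in this chapter; it is only a convenient calculator, and the dictionary layout and function name are my own choices for illustration. The counts come straight from Table 1-1.

```python
# Counts taken from Table 1-1.
original = {"non_conversions": 727, "conversions": 23}
new_email = {"non_conversions": 117, "conversions": 8}


def conversion_rate(group):
    """Return conversions divided by the total number of emails sent."""
    total_sent = group["non_conversions"] + group["conversions"]
    return group["conversions"] / total_sent


original_rate = conversion_rate(original)
new_rate = conversion_rate(new_email)

print(f"Original email conversion rate: {original_rate:.3f}")
print(f"New email conversion rate:      {new_rate:.3f}")
print(f"New rate relative to original:  {new_rate / original_rate:.2f}x")
```

The ratio on the last line is where the "double the conversions" claim comes from. Notice that it rests on only 8 conversions out of 125 emails.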
Management is thrilled with the idea of doubling the amount of sales and wants
to invest in several new salespeople to help with the increase. However, they
come to you for a second opinion and ask if the analytics team could review the
data and confirm the marketing team’s assumptions.
Where do you begin? This is where statistical analysis will become your best
friend. Armed with some basic statistics, you know that you can run a few simple
tests that will let you know whether the improvement from the new marketing
email is statistically significant or not. Before I get too far into the weeds, let’s
set a foundation so you better understand the test and results.

Hypothesis test
The first thing you need to do in this situation is to set up a hypothesis test. In a
standard hypothesis test you set two hypotheses. The first is referred to as your
Null Hypothesis and the second is called the Alternative Hypothesis. For this
example the hypotheses will be as follows:
Null hypothesis
The new marketing email makes no difference to conversions; email
conversions will remain the same on average as the original.

Alternative hypothesis
The new marketing email does make a difference; email
conversions will be higher on average than the original.

In statistics it’s important to understand that you are always trying to prove
yourself wrong. What do I mean by that? You always want to assume that
nothing is going to change when new things are introduced. Therefore you want
to assume that the Null Hypothesis is correct, and your test will determine if there
is enough evidence against it. In statistics you would say you have failed to reject
the Null Hypothesis if the test does not find enough evidence against it. If it turns
out that the evidence for the Alternative Hypothesis is statistically significant,
you would say you reject the Null Hypothesis in favor of the Alternative.

Chi-square test
Now that you have your hypotheses set up, it’s time to run a statistical analysis.
In the spirit of providing you with a foundational understanding, I have decided
to run a simple statistical test called a chi-square test. This is a perfect test to run
in this situation and very accessible even if you’re new to statistics. You don’t
have to have any special software or know any coding to calculate this test. You
can do it by hand, run it in Excel, or look for a calculator online.
To begin, let’s revisit the contingency table and add to it. As you can see in Table
1-2, I added totals for each row and column, plus a grand total.

Table 1-2. Adding Totals to the Contingency Table

                   Original Email    New Marketing Email    Totals
Non-Conversions    727               117                    844
Conversions        23                8                      31
Totals             750               125                    875

You need to use these totals to calculate the expected values for each cell in the
original table. This can be expressed mathematically as:
$E_{i,j} = \frac{R_i \times C_j}{N}$
where $R_i$ is the total for row $i$, $C_j$ is the total for column $j$, and $N$ is the
grand total. Or more simply: ((row total * column total) / the grand total) for
each cell. Starting with the cell in the upper left, which I will call $E_{1,1}$, I have
(750*844)/875 = 723.43. I will calculate $E_{1,2}$, $E_{2,1}$, and $E_{2,2}$ the same
way, and in Table 1-3 you can visually see where the connections are.

Table 1-3. Calculating Expected Values

                   Original Email             New Marketing Email        Totals
Non-Conversions    (750*844)/875 = 723.43     (125*844)/875 = 120.57     844
Conversions        (750*31)/875 = 26.57       (125*31)/875 = 4.43        31
Totals             750                        125                        875
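The expected values in Table 1-3 can be reproduced with a few lines of Python. This is only a sketch to check the arithmetic; the variable names are my own, and the observed counts are the ones from the contingency table.

```python
# Observed counts from the contingency table.
observed = [
    [727, 117],  # non-conversions: original email, new marketing email
    [23, 8],     # conversions:     original email, new marketing email
]

row_totals = [sum(row) for row in observed]           # [844, 31]
col_totals = [sum(col) for col in zip(*observed)]     # [750, 125]
grand_total = sum(row_totals)                         # 875

# Expected value for each cell: (row total * column total) / grand total.
expected = [
    [row_total * col_total / grand_total for col_total in col_totals]
    for row_total in row_totals
]

for row in expected:
    print([round(value, 2) for value in row])
```

Each printed row should line up with the matching row of Table 1-3 once rounded to two decimals.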

With your expected values calculated you need to finish by comparing those
values to the values you observed. This step is expressed mathematically by the
following formula:
$\chi^2 = \sum \frac{(O - E)^2}{E}$
where $O$ is the observed value and $E$ is the expected value for a cell.
Simply put, you need to take the observed value minus the expected value I just
calculated, square that, then divide by the expected value. You will do this for
each cell, then add up the values you get. To make it simple to
follow along I have put those calculations in Table 1-4 below.

Table 1-4. Comparing Expected Values to Observed Values

                   Original Email                     New Marketing Email                Totals
Non-Conversions    (727-723.43)^2 / 723.43 = 0.018    (117-120.57)^2 / 120.57 = 0.106    844
Conversions        (23-26.57)^2 / 26.57 = 0.48        (8-4.43)^2 / 4.43 = 2.877          31
Totals             750                                125                                875

Now add those values up:

$\chi^2$ = (0.018 + 0.106 + 0.48 + 2.877) = 3.481
That gives you an observed value of 3.481. The decision rule for a chi-square
test is as follows: if the observed value is greater than the critical value, you
reject the null hypothesis. For this example you are testing at a significance
level of 0.05, and a two-by-two table has one degree of freedom, which gives
you a critical value of 3.84.
3.481 is not greater than 3.84; therefore you would fail to reject the null
hypothesis. In simple terms, this means that the test did not find a statistically
significant increase in conversions for the new marketing email.
As far as this test can tell, moving forward with this new email marketing
campaign will yield similar results to the original on average.
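To tie the whole chi-square calculation together, here is a short Python sketch that goes from the observed counts to the decision. It is an illustration under a couple of assumptions: the critical value of 3.84 is simply copied from the text rather than computed, and the variable names are mine. Because the code does not round the intermediate values, its statistic comes out at roughly 3.48, very slightly different from the hand-rounded 3.481.

```python
# Observed counts: rows are non-conversions and conversions,
# columns are the original email and the new marketing email.
observed = [[727, 117], [23, 8]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count for each cell under the null hypothesis.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic: sum of (observed - expected)^2 / expected over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)

# Critical value quoted in the text (significance level 0.05, 1 degree of freedom).
CRITICAL_VALUE = 3.84

reject_null = chi_square > CRITICAL_VALUE

print(f"Chi-square statistic: {chi_square:.3f}")
print("Reject the null hypothesis" if reject_null else "Fail to reject the null hypothesis")
```

Keeping the decision rule as a single comparison makes it easy to rerun with the counts from a future, larger test.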

Conclusions drawn from statistical analysis


I chose this example specifically for two reasons. 1) This is a very real-world
example that gives you a foundational understanding of statistics and how it’s
used. 2) This example comes really close to being statistically significant. In
statistics one of the most important lessons is to understand the data and make
some assumptions, meaning it’s not always as black and white as it appears.
Unlike traditional mathematics, you have to be able to think outside the box and
make further recommendations after an analysis.
In this situation I may go back and say that the results did not yield a significant
increase in conversions. However, the data suggests that there is a slight
improvement. My recommendation would be to hold off on hiring, run the test
again next quarter, and split the total emails sent 50 / 50. This would give the
team a larger sample size to rerun the analysis. After all, you can make the
assumption that while the new campaign did not yield statistically significant
results to prove it increased conversions, the results did suggest that the new
marketing email did not hurt conversions.

Data Visualization and Statistics


In closing, there is an obvious advantage to data visualization when trying to
find quick insight in your data, and from the previous example you know the
power statistical analysis can have when making decisions. However, bringing
them together is where you will truly get the most out of any analytics tool or
analysis.
I want to share a great example to drive home the importance of bringing data
visualization together with statistical analysis. Below in Table 1-5 I have four
statistical summaries from four different datasets.

Table 1-5. Statistical Summary of Anscombe’s Quartet

        Dataset 1        Dataset 2          Dataset 3        Dataset 4
        X       Y        X       Y          X       Y        X       Y
Obs     11      11       11      11         11      11       11      11
Mean    9.00    7.50     9.00    7.50090    9.00    7.50     9.00    7.50
SD      3.16    1.94     3.16    1.94       3.16    1.94     3.16    1.94
r       0.82             0.82               0.82             0.82

You can see that the standard deviation, r, and mean are all the same across all
four datasets. However, if you were to plot the datasets and visualize them as
shown in figure 1-16 you can clearly see that each dataset is very different.
Figure 1-16. Visual Representation of Anscombe’s Quartet

This example is called Anscombe’s Quartet and it was constructed by the
statistician Francis Anscombe in 1973 to demonstrate the importance of
visualizing your data before modeling it. When building statistical models you
need to visualize the data to truly understand what the story is, whether there are
outliers, how the variables are correlated, how the data is distributed; the list goes
on. On the other hand, data visualization alone leaves a lot of assumptions and
room for misinterpretation, so you need to back it up with statistics.
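If you want to confirm the summary statistics yourself, the sketch below recomputes them in Python. The raw values are not listed in this chapter; they are the commonly published values of Anscombe's Quartet, typed in here as an assumption, so double-check them against a trusted copy before relying on them. The standard deviation uses the population formula (`pstdev`), which is what reproduces the 3.16 and 1.94 in Table 1-5, and the helper `pearson_r` is my own small implementation of the correlation coefficient.

```python
from statistics import mean, pstdev

# Commonly published values for Anscombe's Quartet (assumed, not from this chapter).
x_shared = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "Dataset 1": (x_shared, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "Dataset 2": (x_shared, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "Dataset 3": (x_shared, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "Dataset 4": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
                  [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return covariance / (spread_x * spread_y)


# One summary row per dataset: mean and SD of X, mean and SD of Y, and r.
summary = {
    name: (mean(xs), pstdev(xs), mean(ys), pstdev(ys), pearson_r(xs, ys))
    for name, (xs, ys) in quartet.items()
}

for name, (mx, sx, my, sy, r) in summary.items():
    print(f"{name}: mean X={mx:.2f} SD X={sx:.2f} mean Y={my:.2f} SD Y={sy:.2f} r={r:.2f}")
```

The four printed rows are nearly identical, which is exactly the point: the numbers alone cannot tell these datasets apart, so you still need the chart.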

Summary
In this chapter I discussed what Tableau is and listed several of its key
products. I then showed you how to download Tableau and gave you a brief
overview of the product. This foundational knowledge will be key in later
chapters, especially if you are new to Tableau. Then I showed you a basic
example of a statistical analysis and walked you through the importance of
running these tests. To drive home the importance of pairing Tableau and
statistics together, I showed you the Anscombe’s Quartet example. In the
following chapters I will show you how to start incorporating statistical analysis
into your data visualizations in Tableau.
Chapter 2. What is a Confidence Interval

A NOTE FOR EARLY RELEASE READERS


This will be the 3rd chapter of the final book. Please note that the GitHub
repo will be made active later on.
If you have comments about how we might improve the content and/or
examples in this book, or if you notice missing material within this chapter,
please reach out to the editor at shunter@oreilly.com.

As you begin applying statistics to your analysis you will be producing a lot of
estimates. These estimates will always have uncertainty around them because
you are making assumptions about things unknown to you. To quantify
uncertainty, statisticians turn to confidence intervals. There are many types of
confidence intervals, but for this book I will cover two-tailed confidence
intervals for the average and the median.
To be less abstract, a confidence interval is a range of values that you expect
your estimate to fall between a certain percentage of the time if you were to run
your experiment again or re-sample the population the same way. Confidence
intervals normally contain the average or median of the estimate and a plus and
minus variation from the estimate. This plus or minus variation is your
confidence interval range.
To set your interval range you first have to decide on what your confidence level
is. The standard level you will see is 95% confidence. However, you can
increase it, which would widen your interval range, or decrease your level of
confidence, which would narrow your interval range.
Your desired confidence level is usually one minus the alpha (α) value:
Confidence level = 1 - α
So if you use an alpha value of 0.05, then your confidence level would be 1 -
0.05 = 0.95, or 95%. If you wanted to be more confident you would use an alpha
value of 0.01: 1 - 0.01 = 0.99, or 99%. If you wanted to be less confident you
would use an alpha value of 0.1: 1 - 0.1 = 0.90, or 90%.
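The relationship between alpha and the confidence level is simple enough to express in a couple of lines of Python. This is just an illustrative sketch; the function name is my own.

```python
def confidence_level(alpha):
    """Confidence level implied by a given alpha (significance level)."""
    return 1 - alpha


# The three alpha values discussed in the text.
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f} -> confidence level = {confidence_level(alpha):.0%}")
```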
The level of confidence is completely up to you but important to understand. For
example, if you are working with healthcare data where your results could be the
difference between life or death, use a higher level of confidence. If you are
estimating the height of the next person to walk into a classroom of college
students, you have room to be less confident. Remember when someone says
they are 95% confident in an estimate they are basically saying that 95 out of
100 times the average of the estimate would fall between the upper and lower
values of the confidence interval.
If you visualize a confidence interval it would look like figure 2-1 below.
Figure 2-1. Confidence Interval Bell Curve

Looking at the visual you can see that the 95% represents the majority of
this distribution, with 2.5% on either side representing the two tails of the
confidence interval. Let’s say I resampled the population and I ended up with
new results. 95% of the time the new average would fall in that 95% range, like
figure 2-2 below.
Figure 2-2. Resampling and Falling within the Original Created Confidence Interval

If you were to re-run the experiment using this new sample of data you would
probably get similar results with some slight variation. However, you know that
5% of the time the average would fall outside the confidence interval range, like
in figure 2-3.
Figure 2-3. Resampling and Falling Outside the Original Confidence Interval

If you were to re-run your experiment using this new sample of the data you
could get results that were farther off. This is why it is important to understand
that you can change your level of confidence to better suit your needs.
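One way to get a feel for what "95% of the time" means is to simulate the resampling. The sketch below is a hedged illustration, not part of the Tableau workflow: it invents a hypothetical population (average 85, standard deviation 6), draws many samples of 12, builds a 95% interval from each one, and counts how often the interval contains the true population average. The t-value of 2.201 is the two-tailed 95% value for 11 degrees of freedom used in the worked example later in this chapter; the population parameters, seed, and trial count are assumptions chosen only for the demonstration.

```python
import random
from statistics import mean, stdev

random.seed(42)  # fixed seed so the run is repeatable

POPULATION_MEAN = 85   # hypothetical true average
POPULATION_SD = 6      # hypothetical true standard deviation
SAMPLE_SIZE = 12
T_VALUE = 2.201        # two-tailed 95% t-value for 11 degrees of freedom
TRIALS = 10_000

covered = 0
for _ in range(TRIALS):
    # Draw a fresh sample from the hypothetical population.
    sample = [random.gauss(POPULATION_MEAN, POPULATION_SD) for _ in range(SAMPLE_SIZE)]
    sample_mean = mean(sample)
    margin = T_VALUE * stdev(sample) / SAMPLE_SIZE ** 0.5
    # Does this sample's interval contain the true average?
    if sample_mean - margin <= POPULATION_MEAN <= sample_mean + margin:
        covered += 1

coverage = covered / TRIALS
print(f"Intervals containing the true average: {coverage:.1%}")
```

The printed share should land close to 95%. Swapping in a smaller t-value (a lower confidence level) narrows each interval and lowers that share, which is the trade-off described above.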

How to Calculate Confidence Intervals


I will show you an example of how to calculate confidence intervals by hand so
you understand the results better and how to implement it in Tableau. There are
many different ways to calculate confidence intervals. Tableau always assumes
that you are working with a sample of the population rather than the total
population, so that is the formula I will show you. However, with the amount of
data you work with today, doing this by hand will not be feasible. You will need
to rely on some sort of software like Tableau to compute the results.
The formula to calculate the confidence interval is expressed as:
$CI = \bar{x} \pm t \frac{s}{\sqrt{n}}$
$CI$ = confidence interval
$\bar{x}$ = sample average
$t$ = the t-score for the desired level of confidence and degrees of freedom
$s$ = sample standard deviation
$n$ = sample size
In this example I will be calculating a 95% confidence interval using 12 test
scores which are 80, 95, 80, 80, 85, 85, 90, 85, 75, 95, 90, and 80. To begin you
need to find the sample average. You can do this by adding up all the scores and
dividing them by the total number of tests:
(80 + 95 + 80 + 80 + 85 + 85 + 90 + 85 + 75 + 95 + 90 + 80) / 12 = 1020 / 12 = 85
$\bar{x}$ = 85
Next, you need to calculate the standard deviation. Start by subtracting the
average from each test score and squaring each result:
(80 - 85)^2 + (95 - 85)^2 + (80 - 85)^2 + (80 - 85)^2 + (85 - 85)^2 + (85 - 85)^2
+ (90 - 85)^2 + (85 - 85)^2 + (75 - 85)^2 + (95 - 85)^2 + (90 - 85)^2 + (80 -
85)^2 =
(-5)^2 + (10)^2 + (-5)^2 + (-5)^2 + (0)^2 + (0)^2 + (5)^2 + (0)^2 + (-10)^2 +
(10)^2 + (5)^2 + (-5)^2
Now add up the squares and divide the result by the sample size (n) minus one.
Since there are 12 test scores you will divide by 11. This will give you the sample
variance:
(25 + 100 + 25 + 25 + 0 + 0 + 25 + 0 + 100 + 100 + 25 + 25) / 11 = 450 / 11 = 40.91
Now find the standard deviation by taking the square root of the variance:
$s = \sqrt{40.91}$ = 6.396
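The same steps can be followed in Python if you want to check the hand calculation. This is a minimal sketch with variable names of my own choosing; it uses the 12 test scores from the example and cross-checks the manual result against the `variance` function from Python's standard `statistics` module, which also divides by n - 1.

```python
from statistics import mean, variance

scores = [80, 95, 80, 80, 85, 85, 90, 85, 75, 95, 90, 80]

# Step 1: sample average.
sample_mean = mean(scores)

# Step 2: squared deviation of each score from the average.
squared_deviations = [(score - sample_mean) ** 2 for score in scores]

# Step 3: divide the total by n - 1 to get the sample variance.
sample_variance = sum(squared_deviations) / (len(scores) - 1)

# Step 4: the standard deviation is the square root of the variance.
sample_sd = sample_variance ** 0.5

# Cross-check against the standard library's sample variance.
assert abs(sample_variance - variance(scores)) < 1e-9

print(f"Average:            {sample_mean}")
print(f"Sum of squares:     {sum(squared_deviations)}")
print(f"Sample variance:    {sample_variance:.2f}")
print(f"Standard deviation: {sample_sd:.3f}")
```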
Looking back at the formula, you only have one more number to find, which is t.
Since you are working with such a small sample size for this example, I had
to refer to a t-table to look up the correct value. However, when you have more
than 1,000 rows of data this value settles toward a standard value (about 1.96
for 95% confidence) as the sample size moves toward infinity. For reference I
include a t-table in the back of this book. See figure 2-4 for an excerpt from the
t-table.

Figure 2-4. t Distribution Table

My degrees of freedom (DF) equal my sample size (n) minus one, so I will
go to DF 11 then move across to the 0.95 column, which is my confidence level. I
end up with a t-value of 2.201. I now have all the data points I need to work
through my equation:
$CI$ = 85 ± 2.201 * (6.396 / $\sqrt{12}$) = 85 ± 4.064
Upper confidence interval = 85 + 4.064 = 89.064
Lower confidence interval = 85 - 4.064 = 80.936
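Here is a short Python sketch of the full calculation, in case you want to verify the bounds before opening Tableau. The t-value of 2.201 is taken from the t-table lookup above rather than computed, and the variable names are only illustrative. `stdev` from Python's standard `statistics` module uses the sample (n - 1) formula, matching the hand calculation.

```python
from statistics import mean, stdev

scores = [80, 95, 80, 80, 85, 85, 90, 85, 75, 95, 90, 80]

T_VALUE = 2.201  # from the t-table: 95% confidence, 11 degrees of freedom

n = len(scores)
sample_mean = mean(scores)
sample_sd = stdev(scores)  # sample standard deviation (divides by n - 1)

# Margin of error: t * s / sqrt(n)
margin = T_VALUE * sample_sd / n ** 0.5

lower = sample_mean - margin
upper = sample_mean + margin

print(f"Margin: {margin:.3f}")
print(f"95% confidence interval: {lower:.3f} to {upper:.3f}")
```

To try a different confidence level you would only need to swap in the matching t-value from the table.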

Interpreting the Results


With these results you can say that the average test score will fall between 80.94
and 89.06 for the total population of students 95% of the time.
If you were to visualize this, it would look like figure 2-5.
Figure 2-5. Confidence Interval from the Example

While this alone is powerful insight into the data, there is another quick statistic
you can calculate that will give you further insight, called the standard error. The
standard error is calculated by dividing the standard deviation (s) by the square
root of the sample size (n). Pulling those values from our formula I get the
following results:
6.396 / $\sqrt{12}$ = 1.846
The standard error tells you how accurately the sample data will reflect the total
population. The higher the standard error, the more volatility you can expect
from the total population; the lower the standard error, the less volatility. Since I
have a relatively low standard error you can say that the test scores in future
rounds will be close to the sample. However, 12 data points is a small amount.
My personal rule of thumb is that anything over 30 is a decent sample size. Be
careful of the assumptions you make with anything less than 30 rows of data.
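The standard error is quick to compute in Python as well. The sketch below first reproduces the value for the 12 test scores, then shows how the standard error would shrink for larger samples if the standard deviation stayed the same. That second part is a hypothetical illustration: the sample sizes of 30 and 120 are made up, and in real data the standard deviation would change with each new sample.

```python
from statistics import stdev

scores = [80, 95, 80, 80, 85, 85, 90, 85, 75, 95, 90, 80]

sample_sd = stdev(scores)  # sample standard deviation
standard_error = sample_sd / len(scores) ** 0.5

print(f"Standard error for n = {len(scores)}: {standard_error:.3f}")

# Hypothetical: same standard deviation, larger sample sizes.
for hypothetical_n in (30, 120):
    hypothetical_se = sample_sd / hypothetical_n ** 0.5
    print(f"Standard error for n = {hypothetical_n}: {hypothetical_se:.3f}")
```

Because the sample size sits under a square root, you need roughly four times as many rows to cut the standard error in half.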

How to Calculate Confidence Intervals in Tableau


You now have a solid understanding of confidence intervals and how to calculate
them. You can also see that anything beyond a couple dozen data points would
be too cumbersome to calculate by hand. That said, let’s open up Tableau and
check your work from the example above. Then you’ll implement confidence
intervals on the Sample - Superstore dataset.

Check the confidence interval you solved by hand


Start by opening Tableau and, from the start page, click Microsoft Excel from the
list of data connectors to the left (see figure 2-6).
Figure 2-6. Connecting to Excel Data in Tableau

A window will appear and ask you to choose the file you want to connect to.
Navigate to Chapter 3 - Test Scores.xlsx, select it, and click connect. From the
data source page, click Go to Sheet in the bottom left of the page (see figure 2-7).
Figure 2-7. Data Source Page in Tableau After Connecting to Chapter 3 - Test_Scores.xlsx

To start, drag Test Scores onto the rows shelf and Student ID to the columns
shelf. This will give you a vertical bar chart in the view as shown in figure 2-8.
Figure 2-8. Bar Chart of Test Scores

From here, switch to the Analytics pane, drag Average with 95% CI onto the view,
and drop it on Table (see figure 2-9).

Figure 2-9. Adding Confidence Intervals to the View From the Analytics Pane

This is going to add a 95% confidence interval to the view.


Figure 2-10. Confidence Interval in Tableau Desktop

If you hover over the upper and lower bounds of the confidence interval a tooltip
will activate. This shows the upper and lower bounds as well as the sample
average when you hover over each line. You can see in figure 2-10 that the
confidence intervals exactly match what you got when you calculated them by
hand.
Getting the confidence intervals in Tableau took me 2 minutes total vs the 15-20
minutes it took me to calculate the results by hand. Not only that but Tableau is
extremely flexible and can compute these results extremely quickly even when
you begin to get more data.

Implement confidence intervals on sample dataset


To demonstrate the flexibility of Tableau let’s connect to the Sample - Superstore
dataset and implement a confidence interval. To start, click Data from the top
navigation then select New Data Source from the menu as shown in figure 2-11.
Figure 2-11. Connecting to a New Data Source in Tableau

From the Connect menu, choose Sample - Superstore, which is the second to last
option from the menu (see figure 2-12).
Figure 2-12. Data Connection Page in Tableau Desktop

Tableau will add a new data source into the top of the Data pane and you will see
a list of the measures and dimensions loaded in the Tables section of the Data
pane as shown in figure 2-13.

Figure 2-13. Data Pane in the Authoring Interface of Tableau Desktop

Create a new sheet, then drag Sales onto the rows shelf and Sub-Category
onto the columns shelf as shown in figure 2-14. This will create a vertical bar
chart that displays the SUM of Sales by the Sub-Category dimension.
Figure 2-14. Bar Chart of Sales by Sub-Category in Tableau Desktop
Toggle to the Analytics pane and drag Average with 95% CI onto the view. As
you can see in figure 2-15, Tableau automatically calculates this figure for you in
a matter of seconds. This dataset also has 10,194 rows of data. This is a
relatively small amount by today’s standards but large enough to know it would
not be feasible to calculate this by hand like you did for the test scores.
Figure 2-15. Implementing a Confidence Interval on Bar Chart of Sales by Sub-Category in Tableau Desktop

Now that the confidence intervals are incorporated into the view, you can also
change the level of detail and Tableau will recalculate the results for you on the
fly. For example, replace Sub-Category with State/Province by dragging and
dropping the State/Province dimension onto Sub-Category in the columns shelf.
Tableau will calculate the confidence interval for you immediately at this new
level of detail as shown in figure 2-16.
Figure 2-16. Bar Chart With 95% CI of Sales by State

Summary
In this chapter, I showed you how to calculate confidence intervals so you have a
better understanding of the math Tableau uses behind the scenes. This will allow
you to better understand and communicate the results to your stakeholders. I also
walked you through how to implement this model within Tableau using the built
in feature from the Analytics pane. This allows you to apply this method at an
enterprise level on large amounts of data. Last, I showed you the flexibility of
Tableau when changing the level of detail. This allows you to move very quickly
when business requirements change.
Table of Contents
1. Introduction to Tableau
Download and start using Tableau Desktop
Introduction to Statistics
Hypothesis test
Chi-square test
Conclusions drawn from statistical analysis
Data Visualization and Statistics
Summary
2. What is a Confidence Interval
How to Calculate Confidence Intervals
Interpreting the Results
How to Calculate Confidence Intervals in Tableau
Check the confidence interval you solved by hand
Implement confidence intervals on sample dataset
Summary
