Probability Plot of Root (Normal): Mean 13.78, StDev 7.712, N 18, AD 0.285, P-Value 0.586.
The P-value is > 0.05, so we can assume our data is Normally Distributed.
Regions of DOUBT lie in each tail of the distribution; values between them are accepted as chance differences.
Analyze Phase
Welcome to Analyze
Now that we have completed the Measure Phase we are going to jump into the Analyze Phase.
Welcome to Analyze will give you a brief look at the topics we are going to cover.
Welcome to Analyze
Overview

Welcome to Analyze
Inferential Statistics
Intro to Hypothesis Testing
Hypothesis Testing ND P1
Hypothesis Testing ND P2
Hypothesis Testing NND P1
Hypothesis Testing NND P2
Wrap Up & Action Items
Estimate COPQ
Establish Team
Measure
Collect Data
Statistically Significant? (if No, collect more data)
Practically Significant? (if No, update the FMEA)
Root Cause?
Identify Root Cause
Implement Control Plan to Ensure Problem Doesn't Return
This provides a process look at putting “Analyze” to work. By the time we complete this phase you will
have a thorough understanding of the various Analyze Phase concepts.
We will build upon the foundational work of the Define and Measure Phases by introducing
techniques to find root causes, then using experimentation and Lean Principles to find solutions to
process problems. Next you will learn techniques for sustaining and maintaining process performance
using control tools and finally placing your process knowledge into a high level Process Management
tool for controlling and monitoring process performance.
Analyze Phase
“X” Sifting
Now we will continue in the Analyze Phase with "X Sifting" – determining the impact of the inputs to our process.
“X” Sifting
Overview
The core fundamentals of this phase are Multi-Vari Analysis and Classes and Causes.

Welcome to Analyze
"X" Sifting
  Multi-Vari Analysis
  Classes and Causes
Inferential Statistics
Hypothesis Testing NND P1
Hypothesis Testing NND P2
Wrap Up & Action Items
Multi-Vari Studies
The many X's when we first start (the trivial many).
The quantity of X's we keep after reducing, as you work the project thinking about Y = f(X) + e.
The quantity of X's remaining after we apply DMAIC leverage (the vital few).
In the Define Phase you use tools like Process Mapping to identify all possible "X's". In the Measure
Phase you use tools like the X-Y Diagram and FMEA to help refine all possible "X's".
In the Analyze Phase we start to “dis-assemble” the data to determine what it tells us. This is the fun
part.
“X” Sifting
Multi-Vari Definition
The Multi-Vari Chart helps in screening factors by using graphical techniques to logically subgroup
discrete X's (Independent Variables) plotted against a continuous Y (Dependent). By looking at the
pattern of the graphed points, conclusions are drawn about the largest family of variation.
At this point in DMAIC, Multi-Vari Charts are intended to be used as a passive study, but later in the
process they can be used as a graphical representation where factors were intentionally changed. The
only caveat with using MINITABTM to graph the data is that the data must be balanced: each source of
variation must have the same number of data points across time.
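The balance requirement can be checked before charting. Minitab enforces this internally; as a rough sketch in Python (the record layout and field names here are hypothetical), a subgroup is balanced when every (unit, time) combination holds the same number of observations:

```python
from collections import Counter

def is_balanced(records):
    """True when every (unit, time) subgroup has the same number of
    observations, the balance a Multi-Vari Chart requires."""
    counts = Counter((r["unit"], r["time"]) for r in records)
    return len(set(counts.values())) <= 1

# Balanced: two observations per (unit, time) subgroup
balanced = [
    {"unit": 1, "time": 1}, {"unit": 1, "time": 1},
    {"unit": 2, "time": 1}, {"unit": 2, "time": 1},
]
unbalanced = balanced[:-1]  # unit 2 now has only one observation
```

If the check fails, go back and fill in the missing observations before graphing.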
“X” Sifting
Multi-Vari Example
You are probably asking yourself what is Injection Molding? Well basically an injection molding
machine takes hard plastic pellets and melts them into a fluid. This fluid is then injected into a
mold or die, under pressure, to create products, such as piping and computer cases.
Method

Sampling plans should encompass all three types of variation: Within, Between and Temporal.

Typically, we start with a data collection sheet that makes sense based on our knowledge of the
process. Then follow the steps:

1). Create Sampling Plan
2). Gather Passive Data
3). Graph Data
4). Check to see if Variation is Exposed
5). Interpret Results

If we only see minor variation in the sample, it is time to go back and collect additional data. When
your data collection represents at least 80% of the variation within the process then you should have
enough information to evaluate the graph.
Remember for a Multi-Vari Analysis to work the output must be continuous and the sources of
variation discrete.
“X” Sifting
Sources of Variation
Within unit, between unit and temporal are the classic causes of variation. A unit can be a single
piece or a grouping of pieces depending on whether they were created at unique times. Multi-Vari
Analysis can be performed on other processes; simply identify the categorical sources of variation
you are interested in.

Within unit or Positional
– Within piece variation related to the geometry of the part.
– Variation across a single unit containing many individual parts, such as a wafer containing many
computer processors.
– Location in a batch process such as plating.

Between unit or Cyclical
– Variation among consecutive pieces.
– Variation among groups of pieces.
– Variation among consecutive batches.

Temporal or Over time
– Shift-to-Shift
– Day-to-Day
– Week-to-Week
Injection molding diagram: Master Injection Pressure, Injection Pressure Per Cavity, % Oxygen,
Distance to tank, Fluid Level, Ambient Temp, Die Temp, Die Release, with cavities #1 through #4.
An example of Within Unit Variation is measured by differences in the 4 widgets from a single
die cycle. For example, we could measure the wall thickness for each of the 4 widgets.
Between Unit Variation is measured by differences from sequential die cycles. An example of
Between Unit Variation is comparing the average of wall thickness from die cycle to die cycle.
Temporal Variation is measured over some meaningful time period. For example, we would
compare the average of all the data collected in one time period, say the 8 o'clock hour, to the 10
o'clock hour.
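The three families described above can be estimated directly from the collection sheet. A minimal sketch, assuming hypothetical wall-thickness readings (the values below are illustrative, not the worksheet's data):

```python
from statistics import mean

# Hypothetical wall-thickness readings (mm): for each time period,
# two die cycles, each producing 4 widgets (one per cavity)
cycles = {
    "08:00": [[2.1, 2.2, 2.0, 2.1], [2.2, 2.3, 2.1, 2.2]],
    "10:00": [[2.4, 2.5, 2.3, 2.4], [2.5, 2.6, 2.4, 2.5]],
}

# Within Unit: spread of the 4 widgets inside one die cycle
within = [max(c) - min(c) for t in cycles.values() for c in t]

# Between Unit: difference between consecutive die-cycle averages
between = [abs(mean(t[i + 1]) - mean(t[i]))
           for t in cycles.values() for i in range(len(t) - 1)]

# Temporal: difference between time-period averages
period_means = [mean(v for c in t for v in c) for t in cycles.values()]
temporal = max(period_means) - min(period_means)
```

With these numbers the temporal family (0.3 mm between the 8 o'clock and 10 o'clock averages) dominates the within-unit spread (0.2 mm) and the cycle-to-cycle differences (0.1 mm).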
Certified Lean Six Sigma Black Belt Book Copyright OpenSourceSixSigma.com
241
“X” Sifting
Sampling Plan
To continue with this example, the Multi-Vari sampling plan will be to gather data for 3 die cycles
on 3 different days for the 4 widgets inside the mold.

Sampling plan: on Monday, Wednesday and Friday, record Cavity #1 through Cavity #4 for Die
Cycle #1, #2 and #3.
“X” Sifting
Gather the list of potential X's and assign each to one of the families of variation.
– This information can be pulled from the X-Y Diagram from the Measure Phase.

If an X spans one or more families, assign %'s to the supposed split.

Now let's use the same information from the X-Y Diagram that was created in the Measure Phase. The
following exercise will help you assign each variable to a family of variation. If you find yourself
with a variable (X) that spans families, then assign percentages to the split. Use your best judgment
for the splits. Don't assume that the true X's causing variation have to come from one in the list.

Step 4 - Focus further effort on the X's associated with the family of largest variation.

Remember the goal is not only to figure out what it is, but what it is not!
“X” Sifting
Data Worksheet
Now create the Multi-Vari
Chart in MINITABTM.
Run Multi-Vari
“X” Sifting
To find an example of
within unit variation, look
at Unit 1 in the second
time period. Notice the
spread of data is 0.07.
To determine temporal
variation, compare the
averages between time periods. It appears time period 3 and 2 have a difference of 0.06.
To determine between unit variation, compare the unit averages within each time period; the
largest unit-to-unit difference appears at the second unit in the third time period.
Notice that the shifting from unit to unit is not consistent, but it certainly jumps up and down. The
question at this point should be: Does this graph represent the problem I’m working on? Do I see at
least 80% of the variation? Read the units off the Y axis or look in the worksheet. Notice the spread
of the data is 0.22 units. If the usual spread of the data is 0.25 units, then this data set represents
88% of the usual variation which tells us our sampling plan was sufficient to detect the problem.
“X” Sifting
Let's try another example; open the MINITABTM worksheet "CallCenter.mtw". This example is a
transactional application of the tool.

In this particular case, a company with two call centers wants to compare two methods of handling
calls at each location at different times of the day. One method involves a team to resolve customer
issues, and the other method requires a single subject-matter expert to handle the call alone.

• Output (Y)
– Call Time

• Input (X)
– Call Center (GA, NV)
– Time of Day (10:00, 13:00, 17:00)
– Method (Expert, Team)
“X” Sifting
It is not necessary to force fit any one tool to your project. For transactional projects Multi-Vari may
be difficult to interpret purely graphically. We will revisit this data set later when working through
Hypothesis Testing.

Multi-Vari Exercise
“X” Sifting
MVA Solution
Do you recall the reason why Normality is an issue? Normality is required if you intend to use the
information as a predictive tool. Early in the Six Sigma process there is no reason to assume that
your data will be Normal.

Check for Normality first. Probability Plot of Volume (Normal): Mean 514.7, StDev 6.854. Since the
P-value is greater than 0.05, we can assume the data is Normally Distributed.
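Minitab's probability plot and Anderson-Darling test produce the P-value used here. As a rough stdlib-only stand-in (the fill data below is simulated with the plot's Mean and StDev, not read from the worksheet), sample skewness and excess kurtosis near zero are consistent with Normality:

```python
import random

def skew_kurtosis(xs):
    """Sample skewness and excess kurtosis: a crude Normality screen.
    (Minitab's Anderson-Darling test is the proper tool.)"""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

random.seed(1)
# Simulated fill volumes mimicking the plot's statistics (N = 144)
volumes = [random.gauss(514.7, 6.85) for _ in range(144)]
skew, ex_kurt = skew_kurtosis(volumes)
# Near-Normal data keeps both statistics close to zero
```

A strongly Skewed or Bimodal data set would push these statistics well away from zero, matching the non-Normal classifications discussed later in this section.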
Having a graphical summary is quite nice since it provides a picture of the data as well as the
summary statistics. The graphical summary command in MINITABTM is an alternative method to
check for Normality. Notice that the P-value in this window is the same as the previous.

Summary for Volume
Anderson-Darling Normality Test: A-Squared 0.49, P-Value 0.212
Mean 514.71, StDev 6.85, Variance 46.97
Skewness -0.084725, Kurtosis -0.696960, N 144
Minimum 500.64, 1st Quartile 509.70, Median 515.32, 3rd Quartile 520.12, Maximum 529.39
95% Confidence Interval for Mean: 513.58 to 515.84
95% Confidence Interval for Median: 513.90 to 516.37
95% Confidence Interval for StDev: 6.14 to 7.75
“X” Sifting
MVA Solution
Now it is time to perform the process capability. For subgroup size, enter 12 since all 12 bottles are
filled at the same time. Also, use 500 milliliters as the upper spec limit in order to see how bad the
capability was from a manufacturer's perspective.

Under the "Options" tab you can select the "Benchmark Z's (sigma level)" of the process, or you can
leave the default as "Capability stats". Just for fun you can run MINITABTM to generate the Capability
Analysis using 500 as the upper spec limit, then run it again as the lower spec limit and see what
happens to the statistics.
MVA Solution
“X” Sifting
Perform an MVA
The order in which you enter the factors will produce different graphs. The "classical" method is to
use Within, Between and Over-time (Temporal) order.
MVA Solution
The graph shows the variation within a unit is consistent across all the data. The variation between
units also looks consistent across all the data. What seems to stand out is the machine may be set
up differently from first shift to second. That should be easy to fix! What is the largest source of
variation? Within Unit Variation is the largest, Temporal is the next largest (and probably easiest to
fix) and Between Unit Variation comes in last.
In this example it was suspected that a high price scale could be generating significant variation.
The in-line scale weighed the bottles and either sent them forward to ship or rejected them to be
topped off. The wind generated by the positive pressure in the room blew across the scale, making
the recorded weights fluctuate unacceptably. The filling machine was actually quite good; only a few
adjustments were made once the variation from the scale was fixed. Once the variation in the data
was reduced, they were able to shift the Mean closer to the specification of 500 ml.
“X” Sifting
Classes of Distributions
By now you are convinced that Multi-Vari is a tool that helps screen X's by visualizing three primary
sources of variation. Later we will perform Hypothesis Tests based on our findings.

At this point we will review classes and causes of distributions that can also help us screen X's to
perform Hypothesis Tests.

– Normal Distribution
– Non-normality – 4 Primary Classifications:
1. Skewness
2. Multiple Modes
3. Kurtosis
4. Granularity
“X” Sifting
Normal Distribution
However, just
because a
distribution of sample data looks Normal does not mean that the variation cannot be reduced and a
new Normal Distribution created.
Non-Normal Distributions
“X” Sifting
Skewness Classification
Potential Causes of Skewness

When a distribution is not symmetrical, it is Skewed. Generally a Skewed distribution's longest tail
points in the direction of the Skew.

Left Skew / Right Skew histograms from sources such as:
– Machine A vs Machine B
– Operator A vs Operator B
– Payment Method A vs Payment Method B
– Interviewer A vs Interviewer B

Sample A + Sample B = Combined
What causes Mixed Distributions? Mixed Distributions occur when data comes from several sources
that are supposed to be the same but are not.
Note that both distributions that formed the combined Skewed Distribution started out as Normal
Distributions.
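The mixing effect described above is easy to reproduce. A minimal sketch, assuming two hypothetical machines whose outputs "should be the same" but have different Means; the majority mode plus a shifted minority mode yields a clearly Skewed combined sample:

```python
import random

def skewness(xs):
    """Sample skewness: zero for symmetric data, positive for a right tail."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m3 = sum((x - m) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

random.seed(7)
# Two sources that are supposed to be the same but are not
machine_a = [random.gauss(10.0, 1.0) for _ in range(800)]  # the majority
machine_b = [random.gauss(14.0, 1.0) for _ in range(200)]  # shifted minority
combined = machine_a + machine_b
# The minority mode at 14 drags out a right tail in the combined data
```

Each machine alone is roughly symmetric; only the mixture is Skewed, which is exactly why a Skewed histogram is a clue to go looking for a hidden second source.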
“X” Sifting
Non-Linear Relationships occur when the X and Y scales are different.

Just because your Input (X) is Normally Distributed about a Mean, the Output (Y) may not be
Normally Distributed. (Figure: a curved X-Y relationship with the Marginal Distribution of X shown
on the X axis and the Marginal Distribution of Y on the Y axis.)
1-5 Interactions

Interactions occur when two inputs interact with each other to have a larger impact on Y than either
would by themselves. (Figure: Room Temperature over time for Spray On/Off versus No Spray.)
If you find that two inputs together have a large impact on Y but would not affect Y by themselves,
this is called an Interaction.

For instance, if you spray an aerosol can in the direction of a flame, what would happen to room
temperature? What do you see regarding these distributions?
“X” Sifting
The distribution is dependent on time. Time relationships occur when the distribution is dependent
on time; some examples are tool wear, chemical bath depletion, stock prices, etc. (Figure: Marginal
Distribution of Y drifting over time.)

Often seen when tooling requires "warming up", tool wear, chemical bath depletions, ambient
temperature effect on tooling.
“X” Sifting
Kurtosis 2

Kurtosis refers to the shape of the tails.
– Leptokurtic
– Platykurtic
• Different combinations of distributions cause the resulting overall shapes.

Platykurtic distributions are flat with short tails. Causes:
– Sorting or Selecting: scrapping product that falls outside the spec limits.
– Trends or Patterns: lack of independence in the data (example: tool wear, chemical bath).
– Non-Linear Relationships: chemical systems.
“X” Sifting
Leptokurtic

A positive Kurtosis value indicates a Leptokurtic distribution. Distributions overlaying each other
that have very different variance can cause a Leptokurtic distribution. Causes:
– Sorting or Selecting: scrapping product that falls outside the spec limits.
– Trends or Patterns: lack of independence in the data (example: tool wear, chemical bath).
– Non-Linear Relationships: chemical systems.
Multiple Modes 3
Multiple Modes have such dramatic combinations of underlying sources that they show distinct
modes. They may have appeared Platykurtic, but the modes were far enough apart to see separation.
“X” Sifting
Bimodal Distributions
Descriptive Statistics – Variable: ExtremeBiMod
Anderson-Darling Normality Test: A-Squared 22.657, P-Value 0.000
Mean 28.8144, StDev 7.5702, Variance 57.3081
Skewness 1.37767, Kurtosis 2.66E-03, N 127
Minimum 22.6294, 1st Quartile 24.2649, Median 25.2902, 3rd Quartile 26.5494, Maximum 45.3291
95% Confidence Interval for Mu: 27.4851 to 30.1438
95% Confidence Interval for Sigma: 6.7398 to 8.6359
95% Confidence Interval for Median: 25.0263 to 25.7491

If you see an extreme outlier, it usually has its own cause or source of variation. It's relatively
easy to isolate the cause by looking on the X Axis of the Histogram.
“X” Sifting
Mean 26.2507, StDev 4.8453, Variance 23.4767
Skewness 3.17250, Kurtosis 9.11483, N 108
Minimum 22.6294, 1st Quartile 24.1285, Median 25.0534, 3rd Quartile 25.9709, Maximum 46.0000
95% Confidence Interval for Mu: 25.3265 to 27.1750
95% Confidence Interval for Sigma: 4.2740 to 5.5943
95% Confidence Interval for Median: 24.8365 to 25.2971

Granular 4

Now let's take a moment and notice the P-value in the Normal Probability Plot; it is definitely
smaller than 0.05!
“X” Sifting
Normal Example
Non-normal Distributions can give more root cause information than Normal data (the nature of
why…)

Hey Honey, I found the key….
“X” Sifting
Notes
Analyze Phase
Inferential Statistics
Inferential Statistics
Overview
The core fundamentals of this phase are Inferential Statistics, Nature of Sampling and Central Limit
Theorem. We will examine the meaning of each of these and show you how to apply them.

Welcome to Analyze
"X" Sifting
Inferential Statistics
  Inferential Statistics
  Nature of Sampling
  Central Limit Theorem
Intro to Hypothesis Testing
Hypothesis Testing ND P1
Hypothesis Testing ND P2
Hypothesis Testing NND P1
Hypothesis Testing NND P2
Wrap Up & Action Items
Nature of Inference
Inferential Statistics
1. What do you want to know?
So many
questions….?
As with most things you have learned associated with Six Sigma – there are defined steps to be
taken.
Types of Error
1. Error in sampling
– Error due to differences among samples drawn at random from the population (luck of the draw).
– This is the only source of error that statistics can accommodate.
2. Bias in sampling
– Error due to lack of independence among random samples or due to systematic sampling
procedures (height of horse jockeys only).
3. Error in measurement
– Error in the measurement of the samples (MSA/GR&R).
4. Lack of measurement validity
– The measurement does not actually measure what it intends to measure (placing a probe in the
wrong slot, measuring temperature with a thermometer that is just next to a furnace).
Inferential Statistics
Population
– EVERY data point that has ever been or ever will be generated from a given characteristic.
Sample
– A portion (or subset) of the population, either at one time or over time.
Observation
– An individual measurement.
Let’s just review a few definitions: A population is EVERY data point that has ever been or ever will
be generated from a given characteristic. A sample is a portion (or subset) of the population, either
at one time or over time. An observation is an individual measurement.
Significance
* RORI includes not only dollars and assets but the time and participation of your teams.
Inferential Statistics
The Mission
Your mission, which you have chosen to accept, is to reduce cycle time, reduce the error rate,
reduce costs, reduce investment, improve service level, improve throughput, reduce lead time,
increase productivity… change the output metric of some process, etc…
In statistical terms, this translates to the need to move the process Mean and/or reduce the process
Standard Deviation.
You’ll be making decisions about how to adjust key process input variables based on sample data,
not population data - that means you are taking some risks.
How will you know your key process output variable really changed, and is not just an unlikely
sample? The Central Limit Theorem helps us understand the risk we are taking and is the basis for
using sampling to estimate population parameters.
Imagine you have some population. The individual values of this population form some distribution.
Take a sample of some of the individual values and calculate the sample Mean.
The Central Limit Theorem says that as the sample size becomes large, this new distribution (the
sample Mean distribution) will form a Normal Distribution, no matter what the shape of the
population distribution of individuals.
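The Central Limit Theorem claim above can be demonstrated with a simulation. A sketch (the die-roll population echoes the exercise that follows; sizes are illustrative): the individuals are uniform, yet the distribution of their sample Means piles up around the population Mean of 3.5 with a much smaller spread:

```python
import random
from statistics import mean, pstdev

random.seed(42)
# A decidedly non-Normal population: 10,000 rolls of a fair die
population = [random.randint(1, 6) for _ in range(10_000)]

# Sampling distribution of the Mean for samples of size n = 30
sample_means = [mean(random.sample(population, 30)) for _ in range(2_000)]

# CLT: the Means cluster near the population Mean (3.5), and their
# spread shrinks to roughly sigma / sqrt(n) of the individuals' spread
```

Plotting `sample_means` as a Histogram would show the bell shape forming even though the parent population is flat.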
Inferential Statistics
Population
• Samples from the population, each with five observations:

Sample 1: 1, 12, 9, 7, 8 (Mean 7.4)
Sample 2: 9, 8, 5, 14, 10 (Mean 9.2)
Sample 3: 2, 3, 6, 11, 10 (Mean 6.4)

• In this example, we have taken three samples out of the population, each with five observations
in it. We computed a Mean for each sample. Note that the Means are not the same!
• Why not?
• What would happen if we kept taking more samples?
Every statistic derives from a sampling distribution. For instance, if you were to keep taking samples
from the population over and over, a distribution could be formed for calculating Means, Medians,
Modes, Standard Deviations, etc. As you will see, the above sample distributions each have a
different statistic. The goal here is to successfully make inferences regarding the statistical data.
Create a sample of 1,000 individual rolls of a die that we will store in a variable named “Population”.
From the population, we will draw five random samples.
Inferential Statistics
Sampling Distributions
To draw random samples from the population follow the command shown below and repeat 4 more
times for the other columns.
Sampling Error
Calculate the Mean and Standard Deviation for each column and compare the sample statistics to
the population. Now compare the Mean and Standard Deviation of the samples of 5 observations
to the population. What do you see?

Stat > Basic Statistics > Display Descriptive Statistics…

Descriptive Statistics: Population, Sample1, Sample2, Sample3, Sample4, Sample5

Variable    N     N*  Mean    SE Mean  StDev   Minimum  Q1      Median  Q3      Maximum
Population  1000  0   3.5510  0.0528   1.6692  1.0000   2.0000  4.0000  5.0000  6.0000
Sample1     5     0   3.400   0.927    2.074   1.000    1.500   3.000   5.500   6.000
Inferential Statistics
Sampling Error
Sampling Error – Reduced
Calculate the Mean and Standard Deviation for each column and compare the sample statistics to
the population.
Sample6     10    0   3.600   0.653    2.066   1.000    1.750   3.500   6.000   6.000
Can you tell what is happening to the Mean and Standard Deviation? When the sample size
increases, the sampling error decreases: the sample statistics move closer to the population values,
and the standard error of the Mean shrinks. What do you think would happen if the sample size
increased further? Let's try 30 for a sample size.
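The shrinking error can be quantified before moving on. A sketch of the same experiment in Python (the die-roll population and trial counts are illustrative): draw many samples at each size and measure how much the sample Means scatter:

```python
import math
import random
from statistics import mean

random.seed(3)
population = [random.randint(1, 6) for _ in range(10_000)]

def observed_se(n, trials=1_000):
    """Empirical standard deviation of sample Means of size n,
    i.e. the sampling error seen in practice."""
    means = [mean(random.sample(population, n)) for _ in range(trials)]
    mu = mean(means)
    return math.sqrt(sum((m - mu) ** 2 for m in means) / trials)

# Sampling error falls as n grows, tracking sigma / sqrt(n)
```

Comparing `observed_se(5)`, `observed_se(10)` and `observed_se(30)` shows the same pattern as the Minitab output: larger samples estimate the population Mean with less error.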
Inferential Statistics
Do you notice
anything different?
Sampling Distributions
Now instead of looking at the effect of sample size on error, we will create a sampling distribution
of averages. Follow along to generate your own random data.
Inferential Statistics
Sampling Distributions
Repeat this command to calculate the Mean of C1-C10, and store the result in Mean10.

The commands shown above will create new columns that are now averages from the columns of
random population data. We have 1000 averages of sample size 5 and 1000 averages of sample
size 10.
Create a Histogram of C1, Mean5 and Mean10.

Graph > Histogram > Simple…
Multiple Graphs… On separate graphs… Same X, including same bins

In MINITABTM follow the above commands. The Histogram being generated makes it easy to see
what happened when the sample size was increased. Select "Same X, including same bins" to
facilitate comparison.
Inferential Statistics
Different Distributions
Everything we have gone through with sampling error and sampling distributions was leading up to
the Central Limit Theorem.

Sample Means of individual observations have a Mean, have a Std Dev, and will be Normally
distributed when the parent population is Normally distributed, or will be approximately Normal for
samples of size 30 or more when the parent population is not Normally distributed. This improves
with samples of larger size.

Bigger is Better!
Inferential Statistics
So What?
A Practical Example
What is the likelihood of getting a sample with a 2 second difference? This could be caused either
by implementing changes or could be a result of random sampling variation, sampling error. The
95% confidence interval exceeds the 2 second difference (delta) seen as a result. What is the delta
caused from? This could be a true difference in performance or random sampling error. This is why
you look further than relying only on point estimators.
Inferential Statistics
Distribution of individuals in the population; theoretical distribution of sample Means for n = 2;
theoretical distribution of sample Means for n = 10.
Inferential Statistics
Standard Error
The rate of change in the standard error approaches zero at about 30 samples. (Figure: Standard
Error versus Sample Size, from 0 to 30.)
When comparing standard error with sample size, the rate of change in the standard error
approaches zero at about 30 samples. This is why a sample size of 30 comes up often in discussions
on sample size.
This is the point at which the t and the Z distributions become nearly equivalent. If you look at a Z
table and a t table to compare Z = 1.96 to t at 0.975, as the sample size approaches infinite degrees
of freedom they become equal.
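The convergence can be seen numerically. A sketch: the Z value comes from the standard library's Normal distribution, while the t critical values are quoted from a standard t table rather than computed (Python's stdlib has no t distribution):

```python
from statistics import NormalDist

# Two-sided 95% critical value from the Z (standard Normal) distribution
z = NormalDist().inv_cdf(0.975)  # about 1.96

# t critical values at 0.975 for a few degrees of freedom, quoted from a
# standard t table: t approaches Z as df grows
t_table = {5: 2.571, 30: 2.042, 120: 1.980}
```

At 5 degrees of freedom the t value is far above 1.96; by 30 it is close, and by 120 the gap is only about 0.02, which is why sample sizes near 30 come up so often.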
Inferential Statistics
Notes
Analyze Phase
Introduction to Hypothesis Testing
Hypothesis Testing NND P1
Hypothesis Testing NND P2
Wrap Up & Action Items
Our goal is to improve our Process Capability; this translates to the need to move the process Mean
(or proportion) and reduce the Standard Deviation.
Because it is too expensive or too impractical (not to mention theoretically impossible) to
collect population data, we will make decisions based on sample data.
Because we are dealing with sample data, there is some uncertainty about the true
population parameters.
Hypothesis Testing helps us make fact-based decisions about whether there are different population
parameters or that the differences are just due to expected sample variation.
Before improvement: Observed Performance PPM < LSL 6666.67, PPM > USL 0.00, PPM Total 6666.67;
Exp. Within Performance PPM < LSL 115.74, PPM > USL 0.71, PPM Total 116.45; Exp. Overall
Performance PPM < LSL 55078.48, PPM > USL 18193.49, PPM Total 73271.97.
After improvement: all Observed, Exp. Within and Exp. Overall PPM values are 0.00.
The purpose of appropriate Hypothesis Testing is to integrate the Voice of the Process with the
Voice of the Business to make data-based decisions to resolve problems.
Hypothesis Testing can help avoid high costs of experimental efforts by using existing data. This
can be likened to:
Local store costs versus mini bar expenses.
There may be a need to eventually use experimentation, but careful data analysis can
indicate a direction for experimentation if necessary.
Recall from the discussion on classes and cause of distributions that a data set may seem Normal,
yet still be made up of multiple distributions.
Hypothesis Testing can help establish a statistical difference between factors from different
distributions.
(Figure: overlapping frequency curves plotted against x from -3 to 3.)
Because we do not have the capability to test an entire population, using a sample is the closest we
can get to the population. Since we are using sample data and not the entire population, we need
methods that will allow us to infer whether the sample is a fair representation of the population.
When we use a proper sample size, Hypothesis Testing gives us a way to detect the likelihood that
a sample came from a particular distribution. Sometimes the questions can be: Did our sample
come from a population with a Mean of 100? Is our sample variance significantly different from the
variance of the population? Is it different from a target?
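The "did our sample come from a population with a Mean of 100?" question is a one-sample t-test. A minimal stdlib sketch (the sample values are hypothetical, and the critical value is quoted from a t table):

```python
import math
from statistics import mean, stdev

def one_sample_t(xs, mu0):
    """t statistic for H0: the population Mean equals mu0."""
    n = len(xs)
    return (mean(xs) - mu0) / (stdev(xs) / math.sqrt(n))

# Hypothetical measurements from the process
sample = [102.1, 99.8, 101.4, 100.9, 103.2, 98.7, 101.8, 100.3]
t_stat = one_sample_t(sample, 100.0)
# Compare |t| to the table value for n - 1 = 7 df (2.365 at alpha = 0.05);
# here |t| is about 2.04, so we cannot reject H0: mu = 100
```

Even though the sample Mean is above 100, the sampling error at n = 8 is large enough that the difference could plausibly be luck of the draw, exactly the distinction between a practical and a statistical difference discussed next.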
Significant Difference
μ1 μ2
Sa mple 1 Sa mple 2
Do you see a difference between Sample 1 and Sample 2? There may be a real difference between
the samples shown; however, we may not be able to determine a statistical difference. Our
confidence is established statistically which has an effect on the necessary sample size. Our ability
to detect a difference is directly linked to sample size and in turn whether we practically care about
such a small difference.
Detecting Significance
HA: The sky is falling.
We will discuss the difference between practical and statistical throughout this session. We can
affect the outcome of a statistical test simply by changing the sample size.
Let's take a moment to explore the concept of Practical Differences versus Statistical Differences.
Detecting Significance
The difference d can be either a change in the Mean or in the variance.
Hypothesis Testing
A Hypothesis Test is an a priori theory relating to differences between variables.
DICE Example
You have rolled dice before, haven’t you? You know, dice that you would find in a board game or in Las Vegas.
Well, assume that we suspect a single die is “fixed,” meaning it has been altered in some form or fashion to make a certain number appear more often than it rightfully should.
Consider the example on how we would go about determining if in fact a die was loaded.
If we threw the die five times and got five ones, what would you conclude? How sure can you be?
The probability of getting just a single one. The probability of getting five ones.
We could throw it a number of times and track how many times each face occurred. With a standard die, we would expect each face to occur 1/6 or 16.67% of the time.
If we threw the die 5 times and got 5 ones, what would you conclude? How sure can you be?
– Pr (1 one) = 0.1667; Pr (5 ones) = (0.1667)^5 = 0.00013
There are approximately 1.3 chances out of 10,000 that we could have gotten 5 ones with a standard die.
Therefore, we would say we are willing to take a 0.013% chance of being wrong about our hypothesis that the die was “loaded,” since the results do not come close to our predicted outcome.
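As a quick check of the arithmetic above, here is a short sketch in Python (plain standard-library math; no statistics package required):

```python
# Probability of rolling five ones in a row with a fair die,
# mirroring the slide's arithmetic.
p_one = 1 / 6               # Pr(a single one) ~ 0.1667
p_five_ones = p_one ** 5    # Pr(five ones in five throws)

print(round(p_one, 4))      # 0.1667
print(p_five_ones)          # ~ 0.00013 -- about 1.3 chances in 10,000
```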
Hypothesis Testing
DECISIONS
Any differences between observed data and claims made under H0 may be real or due to chance.
Hypothesis Tests determine the probabilities of these differences occurring solely due to chance and
call them P-values.
The α level of a test (level of significance) represents the yardstick against which P-values are measured, and H0 is rejected if the P-value is less than the α level.
The most commonly used α levels are 5%, 10% and 1%.
There are two types of error: Type I, with an associated risk equal to alpha (the first letter in the Greek alphabet), and, of course, the other one is named Type II, with an associated risk equal to beta.
The formula reads: alpha is equal to the probability of making a Type 1 error, or alpha is equal to
the probability of rejecting the null hypothesis when the null hypothesis is true.
Alpha Risk
(Distribution with regions of doubt in both tails.)
Hypothesis Testing Risk
The beta risk or Type 2 Error (also called the “Consumer’s Risk”) is the probability that we could
be wrong in saying that two or more things are the same when, in fact, they are different.
Actual Conditions: Not Different (Ho is True) vs. Different (Ho is False)
Another way to describe beta risk is failing to recognize an improvement. Chances are the sample
size was inappropriate or the data was imprecise and/or inaccurate.
Reading the formula: Beta is equal to the probability of making a Type 2 error.
Or: Beta is equal to the probability of failing to reject the null hypothesis given that the null
hypothesis is false.
Beta Risk
Critical value of test statistic
(Theoretical Distribution of Means when n = 30, δ = 5, S = 1; a large S widens the distribution.)
All samples are estimates of the population. All statistics based on samples are estimates of the
equivalent population parameters. All estimates could be wrong!
These are typical questions you will experience or hear during sampling. The most common answer is “It depends,” primarily because someone could say a sample of 30 is perfect when that may actually be too many. The point is you don’t know what the right sample size is without the test.
Here is a Hypothesis Testing roadmap for Continuous Data. This is a great reference tool while you are conducting Hypothesis Tests.
(Hypothesis Testing roadmap: Continuous Data — Normal or Non-Normal — and Attribute Data; branches by One Sample, Two Samples, or Two or More Samples, and by One Factor or Two Factors, leading to tests such as the 2-Sample t and One-Way ANOVA.)
While using Hypothesis Testing, the following facts should be borne in mind at the conclusion stage:
– The decision is about Ho and NOT Ha.
– The conclusion statement is whether the contention of Ha was upheld.
– The null hypothesis (Ho) is on trial.
– When a decision has been made:
• Nothing has been proved.
• It is just a decision.
• All decisions can lead to errors (Types I and II).
– If the decision is to “Reject Ho,” then the conclusion should read “There is sufficient evidence at the α level of significance to show that [state the alternative hypothesis Ha].”
– If the decision is to “Fail to Reject Ho,” then the conclusion should read “There isn’t sufficient evidence at the α level of significance to show that [state the alternative hypothesis].”
Notes
Analyze Phase
Hypothesis Testing Normal Data Part 1
Overview
The core fundamentals of this phase are Hypothesis Testing, Tests for Central Tendency, Tests for Variance and ANOVA. We will examine the meaning of each of these and show you how to apply them.

Welcome to Analyze
“X” Sifting
Inferential Statistics
Intro to Hypothesis Testing
Hypothesis Testing ND P1 — Sample Size, Testing Means, Analyzing Results
Hypothesis Testing ND P2
Hypothesis Testing NND P1
Hypothesis Testing NND P2
Wrap Up & Action Items
T-tests are used to compare a Mean against a target, to compare Means from two different samples, and to compare paired data. When comparing multiple Means it is inappropriate to use a t-test. Analysis of variance, or ANOVA, is used when it is necessary to compare more than two Means.
t-tests are used:
– To compare a Mean against a target.
• i.e., the team made improvements and wants to compare the Mean against a target to see if they met the target.
“They don’t look the same to me!”
1 Sample t
Here we are looking for the region in which we can be 95% sure our true population Mean will lie.
This is based on a calculated average, Standard Deviation, number of trials and a given alpha risk of
.05.
A 1-sample t-test is used to compare an expected population Mean to a target.

In order for the Mean of the sample to be considered not significantly different than the target, the target must fall within the confidence interval of the sample Mean.

MINITAB™ performs a one-sample t-test or t-confidence interval for the Mean. Use 1-sample t to compute a confidence interval and perform a Hypothesis Test of the Mean when the population Standard Deviation, σ, is unknown, for a one- or two-tailed 1-sample t.
If you remember from earlier, 95% of the area under the curve of a Normal Distribution falls within plus
or minus 2 Standard Deviations. Confidence intervals are based on your selected alpha level, so if you
selected an alpha of 5%, then the confidence interval would be 95% which is roughly plus or minus 2
Standard Deviations. Using your eye to guesstimate you can see that the target value falls within plus
or minus 2 Standard Deviations of the sampling distribution of sample size 2.
If you used a sample of 30, could you tell if the target was different? Just using your eye it appears
that the target is outside the 95% confidence interval of the Mean. Luckily, MINITABTM makes this very
easy…
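For readers working outside MINITAB™, the same test can be sketched in Python with scipy; the summary statistics below (n = 9, mean 4.7889, s 0.2472, target 5.0) are taken from this module's worked example:

```python
import math
from scipy import stats

# 1-sample t-test from summary statistics (values from this module's
# worked example: n = 9, mean 4.7889, s 0.2472, target 5.0).
n, xbar, s, target = 9, 4.7889, 0.2472, 5.0

se = s / math.sqrt(n)                    # standard error of the Mean
t_stat = (xbar - target) / se            # ~ -2.56, matching the session window
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)   # two-sided P-value

print(round(t_stat, 2), round(p_value, 3))
# P-value < 0.05, so we reject Ho: the Mean differs from the 5.0 target.
```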
Sample Size
Instead of going through the dreadful hand calculations of sample size, we will use MINITAB™. To determine proper sample size in MINITAB™, three fields must be filled in and one left blank in the sample size window; MINITAB™ will solve for the one left blank. If you want to know the sample size, you must enter the difference, which is the shift that must be detected. It is common to state the difference in terms of “generic” Standard Deviations when you do not have an estimate for the Standard Deviation of the process. For example, if you want to detect a shift of 1.5 Standard Deviations, enter that in difference and enter 1 for Standard Deviation. If you knew the Standard Deviation and it was 0.8, then enter it for Standard Deviation and 1.2 for the difference (which is a 1.5 Standard Deviation shift in terms of real values).
If you are unsure of the desired difference, or in many cases simply get stuck with a sample size that you didn’t have a lot of control over, MINITAB™ will tell you how much of a difference can be detected. You as a practitioner must be careful when drawing Practical Conclusions, because it is possible to have statistical significance without practical significance. In other words, do a reality check. MINITAB™ has made it easy to see an assortment of sample sizes and differences.
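MINITAB™'s Power and Sample Size calculation can be approximated in Python using the noncentral t distribution from scipy; this is a sketch of the underlying search, not MINITAB™'s exact implementation:

```python
from scipy import stats

def power_1sample_t(n, delta, alpha=0.05):
    """Two-sided 1-sample t-test power for a shift of `delta`
    Standard Deviations, via the noncentral t distribution."""
    df = n - 1
    t_crit = stats.t.ppf(1 - alpha / 2, df)
    nc = delta * n ** 0.5                 # noncentrality parameter
    return (1 - stats.nct.cdf(t_crit, df, nc)
            + stats.nct.cdf(-t_crit, df, nc))

def sample_size(delta, power=0.9, alpha=0.05):
    """Smallest n whose power meets the requested level."""
    n = 2
    while power_1sample_t(n, delta, alpha) < power:
        n += 1
    return n

# Smallest n that detects a 1.5-sigma shift with 90% power at alpha = 0.05.
n = sample_size(delta=1.5)
print(n, round(power_1sample_t(n, 1.5), 3))
```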
1-Sample t Example
3. 1-sample t-test (population Standard Deviation unknown, comparing to target).
α = 0.05, β = 0.10
4. Sample Size:
• Open the MINITAB™ worksheet: Exh_Stat.MTW
• Use the C1 column: Values
– In this case, the new supplier sent 9 samples for evaluation.
– How much of a difference can be detected with this sample?
Hypothesis Testing
Follow along in
MINITABTM, as you can
see, we will be able to
detect a difference of
1.23 with the sample of
9.
Now refer to the road map for Hypothesis Testing, you must first check for Normality. In MINITABTM
select “Stats>Basic Statistics>Normality Test”. For the “Variable Fields” double-click on “Values” in
the left-hand box. Once this is complete select “OK”.
Since the P-value is greater than 0.05 we fail to reject the null hypothesis that the data are Normal.
(Normality plot of the Values column.) Are the data in the Values column normal?
1-Sample t Example
Perform the one-sample t-test. In MINITAB™ select Stat>Basic Statistics>1-Sample t. From the left-hand box double-click on “Values”.
– Click “Graphs”: select all 3.
– Click “Options”: in CI enter 95.
In the “Options” button there is a selection for the alternative hypothesis; the default is not equal, which corresponds to our hypothesis. If your alternative hypothesis was a greater than or less than, you would have to change the default.
(Histogram of Values, with the hypothesized value of 5 noted as Ho, the null hypothesis.) Note our target Mean (represented by the red Ho) is outside our population confidence boundaries, which tells us there is a significant difference between the population and target Mean.
(Boxplot of Values, with Ho and the 95% t-confidence interval for the Mean: Ho falls outside the interval.)
(Individual Value Plot of Values, with Ho and the 95% t-confidence interval for the Mean.)
As you will see the conclusion is the same, but the Dot Plot is just another representation of data.
Session Window
One-Sample T: Values
Test of mu = 5 vs not = 5

s = √( Σ (Xi − X̄)² / (n − 1) )          SE Mean = s / √n

N – sample size
Mean – calculated mathematical average
StDev – calculated individual Standard Deviation (classical method)
SE Mean – calculated Standard Deviation of the distribution of the Means
Confidence interval that our population average will fall between: 4.5989, 4.9789

t = (X̄ − Target) / (s / √n) = (4.79 − 5.00) / (0.247 / √9) = −2.56
T-Distribution (rows are degrees of freedom)
df    .600   .700   .800   .900   .950   .975    .990    .995
1    0.325  0.727  1.376  3.078  6.314  12.706  31.821  63.657
2    0.289  0.617  1.061  1.886  2.920   4.303   6.965   9.925
3    0.277  0.584  0.978  1.638  2.353   3.182   4.541   5.841
4    0.271  0.569  0.941  1.533  2.132   2.776   3.747   4.604
5    0.267  0.559  0.920  1.476  2.015   2.571   3.365   4.032

Critical regions at −2.306 and 2.306 (df = 8).
If the calculated t-value lies anywhere in the critical regions, reject the null hypothesis.
– The data supports the alternative hypothesis that the estimate for the Mean of the population is not 5.0.
Here is the formula for the confidence interval. Notice we get the same results as MINITAB™. The formula for a two-sided t-test is:

X̄ − t(α/2, n−1) · s/√n ≤ μ ≤ X̄ + t(α/2, n−1) · s/√n

or

X̄ ± t_crit · SE Mean = 4.7889 ± 2.306 × 0.0824 = 4.5989 to 4.9789
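The same interval can be reproduced in Python with scipy, using the example's summary statistics:

```python
import math
from scipy import stats

# Two-sided t confidence interval, using the example's numbers:
# mean 4.7889, s 0.2472, n = 9, alpha = 0.05.
n, xbar, s, alpha = 9, 4.7889, 0.2472, 0.05

t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)   # ~ 2.306, matching the t-table
se = s / math.sqrt(n)                           # ~ 0.0824
lo, hi = xbar - t_crit * se, xbar + t_crit * se

print(round(lo, 4), round(hi, 4))               # 4.5989 4.9789, as in MINITAB
# The target of 5.0 lies above the upper bound, so it is significantly
# different from the sample Mean.
```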
1-Sample t Exercise
3. Are we on Target?
Because we used the option of “Graphs”, we get a nice visualization of the data in a Histogram of ppm VOC (with Ho and the 95% t-confidence interval for the Mean).
Because the null hypothesis is within the confidence interval, you know we will “fail to reject” the null hypothesis and accept that the equipment is running at the target of 32.0 ppm VOC.
2 Sample t-test
Notice the difference in the hypothesis for a two-tailed vs. one-tailed test. This terminology is only used to know which column to look down in the t-table.

A 2-sample t-test is used to compare two Means.
Stat > Basic Statistics > 2-Sample t
MINITAB™ performs an independent two-sample t-test and generates a confidence interval. Use 2-Sample t to perform a Hypothesis Test and compute a confidence interval of the difference between two population Means when the population Standard Deviations, σ’s, are unknown.
Sample Size
In MINITAB™ select “Stat>Power and Sample Size>2-Sample t”. Follow the same steps that were taken for 1-sample t. To determine proper sample size in MINITAB™, three fields must be filled in and one left blank.
2-Sample t Example
Now in Step 4, open the worksheet in MINITAB™ called “Furnace.MTW”. How is the data coded?
4. Sample Size:
• Open the MINITAB™ worksheet: Furnace.MTW
• Scroll through the data to see how the data is coded.
• In order to work with the data in the BTU.In column, we will need to unstack the data by damper type.
2-Sample t Example
Notice the “unstacked” data for each damper. WE NOW HAVE TWO COLUMNS.
2-Sample t Example
For the field “Sample Sizes:” enter 40 space 50 because our data set has unequal sample sizes
which is not uncommon. The smallest difference that can be detected is based on the smallest
sample size, so in this case it is: 0.734.
(Probability Plot of BTU.In_1, Normal: Mean 9.908, StDev 3.020, N 40, AD 0.475, P-Value 0.228.)
The data is considered Normal since the P-value is greater than 0.05.
(Probability Plot of BTU.In_2, Normal: Mean 10.14, StDev 2.767, N 50, AD 0.190, P-Value 0.895.)
This is the Normality Plot for damper 2. Is the data Normal? It is Normal, continuing down the
roadmap…
(Test for Equal Variances for BTU.In by Damper — Levene's Test: Test Statistic 0.00, P-Value 0.996.)
The P-value of 0.558 indicates that there is no statistically significant difference in variance.
Let’s continue along the roadmap… Perform the 2-Sample t-test; be sure to check the box “Assume equal variances”.
Box Plot
(Boxplot of BTU.In by Damper.)
The Box Plots do not show much of a difference between the dampers.
Calculated average and Standard Deviation:

s = √( Σ (Xi − X̄)² / (n − 1) )          SE Mean = s / √n          df = (N1 − 1) + (N2 − 1)

Two-Sample T-Test (Variances Equal)
Ho: μ1 = μ2
Ha: μ1 ≠ (or < or >) μ2
(Session output: T-Value = −0.38; 95% CI for the difference: (−1.450, 0.980).)
Exercise
2. Statistical Problem:
Ho: μ1 = μ2
Ha: μ1 ≠ μ2
To unstack the data follow the steps here. This will generate two new columns of data shown on the next page…
By unstacking the data we now have the Clor.Lev data separated by the distributor it came from:
• Clor.Lev_Post_1 = Distributor 1
• Clor.Lev_Post_2 = Distributor 2
Now let’s move on to determining the correct sample size.
We want to determine what is the smallest difference that can be detected based on our data.
In this case: 0.7339, rounded to 0.734.
The results show us a P-value of 0.154 so our data is Normal. Recall if the P-value is greater than
.05 then we will consider our data Normal.
(Normality plots for Clor.Lev_Post_1 and Clor.Lev_Post_2.)
Look at the P-value of 0.574. This tells us that there is no statistically significant difference in the variance of these two data sets. What does this mean? We can finally run a 2-sample t-test with equal variances.
(Levene's Test by Distributor: Test Statistic 0.00, P-Value 0.986.)
(Boxplot of Clor.Lev_Post by Distributor — “Hmm, we’re a lot alike!”)
The Box Plots show VERY little difference between the Distributors; also note the P-value in the Session Window – there is no difference between the two Distributors.
Normality Test
(Probability Plot of Sample 1, Normal: Mean 4.853, StDev 1.020, N 100, AD 0.374, P-Value 0.411. A similar plot is produced for Sample 3.)
Our data sets are normally distributed.
We use the F-Test statistic because our data is normally distributed.
(F-Test: Test Statistic 0.106, P-Value 0.000. Levene's Test: Test Statistic 67.073, P-Value 0.000.)
The P-Value is less than 0.05, so our variances are not equal.
(Boxplots of the raw data; Medians of the Samples.)
This is the output from MINITAB™. Notice that even though the names of the columns in MINITAB™ were Sample 1 and Sample 3, MINITAB™ used factor levels 1 and 2 to differentiate the outcome. We have to interpret the meaning of the factor levels properly; it is simply the difference between the samples labeled one and three in our worksheet.
UNCHECK the “Assume equal variances” box.
You can see there is very little difference in the 2-Sample t-tests.
(Boxplot of Stacked by C4; sample Means indicated.)
The Box Plot shows no difference between the Means. The overall box is smaller for the sample on the left, which is an indication of the difference in variance.
(Individual Value Plot of Stacked vs C4; sample Means indicated.)
By looking at this Individual Value Plot you can notice a big spread or variance of the data.
Two-Sample T-Test (Variances Not Equal)
Ho: μ1 = μ2 (P-Value > 0.05)
Ha: μ1 ≠ (or < or >) μ2 (P-Value < 0.05)
What does the P-value of 0.996 mean? After conducting a 2-sample t-test there is no significant
difference between the Means.
N orma l
s
u ou
n t in a
Co Da t
2 Sa mple
p T O ne W a y AN O VA 2 Sa m p
ple T O ne W a y AN O VA
Paired t-test
• Use the Paired t command to compute a confidence interval and perform a Hypothesis Test of the difference between population Means when observations are paired. A paired t-procedure matches responses that are dependent or related in a pair-wise manner (delta, δ). This matching allows you to account for variability between the pairs, usually resulting in a smaller error term, thus increasing the sensitivity of the Hypothesis Test or confidence interval.
– Ho: μδ = μo
– Ha: μδ ≠ μo
(μbefore, μafter)
• Where μδ is the population Mean of the differences and μo is the hypothesized Mean of the differences, typically zero.
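A paired t-test can be sketched in Python with scipy; the Mat-A/Mat-B wear values below are hypothetical, and the example also shows the equivalence to a 1-sample t-test on the deltas:

```python
from scipy import stats

# Hypothetical paired wear measurements (Mat-A vs Mat-B soles worn by
# the same child) -- illustrative numbers only, not the worksheet data.
mat_a = [10.2, 12.1, 9.8, 11.5, 8.9, 10.7, 11.9, 9.4, 10.1, 11.0]
mat_b = [10.6, 12.4, 10.3, 11.9, 9.2, 11.1, 12.5, 9.9, 10.4, 11.5]

# A paired t-test is equivalent to a 1-sample t-test of the deltas vs 0.
t_paired, p_paired = stats.ttest_rel(mat_a, mat_b)
deltas = [a - b for a, b in zip(mat_a, mat_b)]
t_delta, p_delta = stats.ttest_1samp(deltas, 0.0)

print(round(t_paired, 3), round(p_paired, 3))
# Both approaches return identical t statistics and P-values.
```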
Example
Just checking your souls, er…soles!
Example (cont.)
EXH_STAT DELTA.MTW
Paired t-test Example
In MINITAB™ open “Stat>Power and Sample Size>1-Sample t”. Enter the appropriate Sample Size, Power Value and Standard Deviation. Now that’s a tee test!

MINITAB™ Session Window
Power and Sample Size
1-Sample t Test
Testing mean = null (versus not = null)
Calculating power for mean = null + difference
Alpha = 0.05   Assumed standard deviation = 1

Sample Size   Power   Difference
10            0.9     1.15456

This means we will only be able to detect a difference of 1.15 if the Standard Deviation is equal to 1. Given the sample size of 10 we will be able to detect a difference of 1.15. If this was your process you would need to decide if this was good enough. In this case, is a difference of 1.15 enough to practically want to change the material used for the soles of the children’s shoes?
Paired t-test Example
(Probability Plot of AB Delta, Normal: Mean 0.41, StDev 0.3872, N 10, AD 0.261, P-Value 0.622.)
1-Sample t
Box Plot
Analyzing the Box Plot we see that the null hypothesis falls outside the confidence interval, so we
reject the null hypothesis. The P-value is also less than 0.05. Given this we are 95% confident that
there is a difference in the wear between the two materials used for the soles of children’s shoes.
Paired T-Test
Click on “Graphs” and select the graphs you would like to generate.
(Boxplot of Differences, with Ho and the 95% t-confidence interval for the Mean: Ho falls outside the interval.)
The P-Value from this Paired T-Test tells us the difference in materials is statistically significant.
Paired T-Test and CI: Mat-A, Mat-B

Paired T for Mat-A - Mat-B
              N    Mean       StDev      SE Mean
Mat-A        10    10.6300    2.4513     0.7752
Mat-B        10    11.0400    2.5185     0.7964
Difference   10    -0.410000  0.387155   0.122429

95% CI for mean difference: (-0.686954, -0.133046)
T-Test of mean difference = 0 (vs not = 0): T-Value = -3.35  P-Value = 0.009
As you will see the conclusions are the same, but just presented differently.
Calc>Calculator
(Histogram of TX_MX-Diff, with Ho and the 95% t-confidence interval for the Mean.)
You have now completed Analyze Phase – Hypothesis Testing Normal Data Part 1.
Notes
Analyze Phase
Hypothesis Testing Normal Data Part 2
Overview
We are now moving into Hypothesis Testing Normal Data Part 2, where we will address Calculating Sample Size, Variance Testing and Analyzing Results.

Welcome to Analyze
“X” Sifting
Inferential Statistics
Intro to Hypothesis Testing
Hypothesis Testing ND P1
Hypothesis Testing ND P2 — Calculate Sample Size
Wrap Up & Action Items
Tests of Variance
Tests of Variance are used for both normal and non-normal data.
Normal Data
– 1 Sample to a target
– 2 Samples – F-Test
– 3 or More Samples – Bartlett’s Test
Non-Normal Data
– 2 or more samples – Levene’s Test
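Both variance tests listed above are available in scipy if you want to reproduce them outside MINITAB™; the three groups below are hypothetical, illustrative samples:

```python
from scipy import stats

# Three hypothetical samples, e.g. rot measurements at three
# temperature settings (illustrative numbers only).
g1 = [4.1, 5.2, 4.8, 5.5, 4.3, 5.0]
g2 = [4.6, 5.1, 4.2, 5.8, 4.9, 5.3]
g3 = [4.4, 5.6, 4.0, 5.2, 4.7, 5.4]

# Bartlett's test: 2+ samples, assumes normally distributed data.
b_stat, b_p = stats.bartlett(g1, g2, g3)
# Levene's test: 2+ samples, robust to non-normal data.
l_stat, l_p = stats.levene(g1, g2, g3)

print(round(b_p, 3), round(l_p, 3))
# P-values > 0.05 mean no significant difference in variances.
```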
1-Sample Variance
Use the sample size calculations for a 1-sample t-test, since variance tests are rarely performed without performing a 1-sample t-test as well.
1-Sample Variance
4. Sample Size:
• Open the MINITAB™ worksheet: Exh_Stat.MTW
• This is the same file used for the 1-Sample t example.
– We will assume the sample size is adequate.

(Graphical summary: Mean 4.7889, StDev 0.2472, Variance 0.0611, Skewness −0.02863, Kurtosis −1.24215, N 9; 95% Confidence Interval for Median: 4.6000 to 5.0772; 95% Confidence Interval for StDev shown.)

What does this mean from a practical standpoint? Shifts in the Mean are typically easier to accomplish in a process than reducing variance.
3. Equal variance test (F-test, since there are only 2 factors). Check for Normality.
5. Statistical Solution:
Stat>Basic Statistics>Normality Test (use Anderson-Darling)
Ho: Data is normal
Ha: Data is NOT normal
According to the graph we have Normal data.
(Probability Plot of Rot 1, Normal: Mean 4.871, StDev 0.9670, N 100, AD 0.306, P-Value 0.559.)
(Test for Equal Variances for Rot 1 — F-Test: Test Statistic 0.74, P-Value 0.298; Levene's Test: Test Statistic 0.53, P-Value 0.469; 95% Bonferroni Confidence Intervals for StDevs.)
Use the F-Test for 2 samples of normally distributed data. P-Value > 0.05 (0.298): assume equal variance.
Normality Test
(Probability Plot of Rot, Normal: Mean 13.78, StDev 7.712, N 18, AD 0.285, P-Value 0.586.)
The P-value is > 0.05; we can assume our data is normally distributed.
(Test for Equal Variances for Rot by Temp — F-Test: Test Statistic 0.68, P-Value 0.598; Levene's Test: Test Statistic 0.05, P-Value 0.824; 95% Bonferroni Confidence Intervals for StDevs.)
Ho: σ1 = σ2   Ha: σ1 ≠ σ2
P-Value > 0.05: there is no statistically significant difference.
You can see there is no statistical difference for variance in Rot based on temperature as a factor.
Since the data is Normally Distributed and we have 2 samples, use the F-Test statistic.
Another method for testing for equal variance will allow more than one factor. The Labels are the
factors. The data is the Output.
This time we have Rot as the response and Temp and Oxygen as the factors.
This graph shows a test of equal variance which displays Bonferroni 95% confidence intervals for the response Standard Deviation at each level. As you will see, the Bartlett’s and Levene’s tests are displayed in the same graph.

(Test for Equal Variances for Rot by Temp and Oxygen — Bartlett's Test: Test Statistic 2.71, P-Value 0.744; Levene's Test: Test Statistic 0.37, P-Value 0.858.)

A P-value > 0.05 shows an insignificant difference between variances; we treat them as being equal. Use Bartlett's Test if the data is Normal; it extends to more than 2 samples.
Does the Session Window have the same P-values as the Graphical Analysis?
First we want to do a graphical summary of the two samples from the two suppliers.
In “Variables:” enter ‘ppm VOC’.
The P-value is greater than 0.05 for both Anderson-Darling Normality Tests so we conclude the
samples are from Normally Distributed populations because we “failed to reject” the null hypothesis
that the data sets are from Normal Distributions.
Continue to determine if they are of equal variance. For “Response:” enter ‘ppm VOC’. Note MINITAB™ defaults to a 95% confidence interval, which is exactly the level we want to test for this problem.
(Test for Equal Variances by RM Supplier — Levene's Test; 95% Bonferroni Confidence Intervals for StDevs. The P-value of the F-test…)
Purpose of ANOVA
Analysis of Variance (ANOVA) is used to investigate and model the relationship between a response variable and one or more independent variables.
Is the between-group variation large enough to be distinguished from the within-group variation?
(Total (Overall) Variation (δ): illustration of individual values in two groups with Means μ1 and μ2.)
Calculating ANOVA
Where:
G = the number of groups (levels in the study)
xij = the ith individual in the jth group
nj = the number of individuals in the jth group or level
X̄ = the grand Mean
X̄j = the Mean of the jth group or level

Between-group variation: Σj nj (X̄j − X̄)²
Within-group variation (δ): Σj Σi (xij − X̄j)²
Total variation: Σj Σi (xij − X̄)²
Calculating ANOVA
1 − (1 − α )
k
The reason we don’t use a series of t-tests to evaluate multiple Means is that the alpha risk increases as the number of Means increases. If we had 7 pairs of Means and an alpha of 0.05, our actual alpha risk could be as high as 30%. Notice we did not say it was 30%, only that it could be as high as 30%, which is quite unacceptable.
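The inflation formula can be checked directly; for 7 comparisons at α = 0.05:

```python
# Family-wise alpha risk for k independent comparisons at alpha each.
alpha, k = 0.05, 7
family_wise = 1 - (1 - alpha) ** k
print(round(family_wise, 3))  # -> 0.302, i.e. "as high as 30%"
```
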
Three Samples
We have three potential suppliers that claim to have equal levels of quality. Supplier B provides a considerably lower purchase price than either of the other two vendors. We would like to choose the lowest cost supplier, but we must ensure that we do not affect the quality of our raw material.
We would like to test the data to determine whether there is a difference between the three suppliers.
(Graphics: Normal Probability Plots — Supplier A: StDev 0.4401, N 5, AD 0.246, P-Value 0.568; Supplier B: Mean 3.968, StDev 0.2051, N 5, AD 0.314, P-Value 0.385; Supplier C: Mean 4.03, StDev 0.4177, N 5, AD 0.148, P-Value 0.910.)
(Graphic: Test for Equal Variances for the three suppliers — Levene's Test Statistic 0.59, P-Value 0.568; 95% Bonferroni Confidence Intervals for StDevs.)
ANOVA in MINITABTM
Stat>ANOVA>One-Way Unstacked
ANOVA
(Graphic: Individual Value Plot of the data for Supplier A, Supplier B, and Supplier C.)
Looking at the P-value the conclusion is we fail to reject the null hypothesis. According to the data
there is no significant difference between the Means of the 3 suppliers.
Normal data, P-value > .05: No Difference.

Stat>ANOVA>One-Way

ANOVA
Before looking up the F critical value you must first know what the degrees of freedom are. The ANOVA test statistic is the variance between the Means divided by the variance within the groups. Therefore, the numerator degrees of freedom would be 3 suppliers minus 1, or 2 degrees of freedom. The denominator would be 5 samples minus 1 (for each supplier) multiplied by 3 suppliers, or 12 degrees of freedom. As you can see the critical F value is 3.89, and since the calculated F of 1.40 is not close to the critical value we fail to reject the null hypothesis.
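The critical value quoted above can be confirmed with scipy's F distribution; the degrees of freedom, 2 and 12, come from the text:

```python
from scipy import stats

# Numerator df = 3 suppliers - 1 = 2;
# denominator df = (5 samples - 1) * 3 suppliers = 12.
f_crit = stats.f.ppf(0.95, dfn=2, dfd=12)
print(round(f_crit, 2))  # -> 3.89, the critical value quoted above

# The calculated F of 1.40 is well below it, so we fail to reject H0.
assert 1.40 < f_crit
```
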
Test for Equal Variances: Suppliers vs ID
One-way ANOVA: Suppliers versus ID
Analysis of Variance for Supplier
Source   DF     SS     MS     F      P
ID        2  0.384  0.192  1.40  0.284
Error    12  1.641  0.137
Total    14  2.025

Level        N    Mean   StDev  Individual 95% CIs For Mean Based on Pooled StDev
Supplier A   5  3.6640  0.4401  (-----------*-----------)
Supplier B   5  3.9680  0.2051          (-----------*-----------)
Supplier C   5  4.0300  0.4177           (-----------*-----------)
Pooled StDev = 0.3698                3.60      3.90      4.20

F-Calc = 1.40; F-Critical = 3.89 (the excerpt of the F table shows the critical value for 2 numerator and 12 denominator degrees of freedom).
Sample Size
Let’s check on how much difference we can see with a sample of 5.
Will having a sample of 5 show a difference? After crunching the numbers, a sample of 5 can only detect a difference of 2.56 Standard Deviations. This means the Means would have to differ by at least 2.56 Standard Deviations before we could see a difference. To help alleviate this problem a larger sample should be used. If there is a larger sample you would be able to have a more sensitive reading for the Means and the variance.

Power and Sample Size
One-way ANOVA
Alpha = 0.05  Assumed Standard Deviation = 1  Number of Levels = 3
Sample Size  Power  SS Means  Maximum Difference
          5    0.9   3.29659             2.56772
The sample size is for each level.
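The Minitab power figure can be approximated by simulation. This sketch assumes 3 Normal populations with Standard Deviation 1 whose two extreme Means are separated by the quoted Maximum Difference of 2.56772 (the middle level centered between them):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Mirror the Power and Sample Size output: 3 levels, n = 5 per level,
# sigma = 1, extreme Means separated by the Maximum Difference.
means = [-2.56772 / 2, 0.0, 2.56772 / 2]
n, sims, alpha = 5, 2000, 0.05

rejections = 0
for _ in range(sims):
    samples = [rng.normal(m, 1.0, n) for m in means]
    _, p = stats.f_oneway(*samples)  # one-way ANOVA on simulated data
    if p < alpha:
        rejections += 1

power = rejections / sims
print(round(power, 2))  # close to the 0.9 power Minitab reports
```
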
ANOVA Assumptions
1. Observations are adequately described by the model.
2. Errors are normally and independently distributed.
3. Homogeneity of variance among factor levels.
Residual Plots
To generate the residual plots in MINITABTM select “Stat>ANOVA>One-way Unstacked>Graphs”,
then select “Individual value plot” and check all three types of plots.
Stat>ANOVA>One-Way Unstacked>Graphs
Histogram of Residuals
(Graphic: Histogram of the Residuals; responses are Supplier A, Supplier B, Supplier C.)
Normality plot of the residuals should follow a straight line. Results of our example look good. The normality assumption is satisfied.
(Graphic: Normal Probability Plot of the Residuals; responses are Supplier A, Supplier B, Supplier C.)
2-Sample t Example
For the field “Sample Sizes:” enter 40 space 50 because our data set has unequal sample sizes
which is not uncommon. The smallest difference that can be detected is based on the smallest
sample size, so in this case it is: 0.734.
(Graphic: Residuals Versus the Fitted Values; responses are Supplier A, Supplier B, Supplier C.)
ANOVA Exercise
In “Variables:” enter ‘ppm VOC’.
In “By Variables:” enter ‘Shift’.
(Graphics: Summary for ppm VOC, Shift = 2 and Shift = 3.
Shift 2 — Anderson-Darling A-Squared 0.37, P-Value 0.334; Mean 34.625, StDev 5.041, Variance 25.411, N 8, Median 35.500.
Shift 3 — Anderson-Darling A-Squared 0.24, P-Value 0.658; Mean 28.000, StDev 6.525, Variance 42.571, N 8, Median 28.000.
Both P-values are greater than 0.05, so both samples can be assumed Normally Distributed.)
(Graphic: Test for Equal Variances for ppm VOC by Shift — Bartlett's Test Statistic 0.63, P-Value 0.729; Levene's Test Statistic 0.85, P-Value 0.440; 95% Bonferroni Confidence Intervals for StDevs.)
Since our residuals look Normally Distributed and randomly patterned, we will assume our analysis is
correct.
(Graphic: four-in-one residual plots — Normal probability plot of the residuals, residuals versus fitted values, histogram of the residuals, and residuals versus observation order.)
Since the P-value of the ANOVA test is less than 0.05, we “reject” the null hypothesis that the Mean
product quality as measured in ppm VOC is the same from all shifts.
We “accept” the alternate hypothesis that the Mean product quality is different from at least one shift.
You have now completed Analyze Phase – Hypothesis Testing Normal Data Part 2.
Notes
Analyze Phase
Hypothesis Testing Non-Normal Data
Part 1
Overview
The core fundamentals of this phase are Equal Variance Tests and Tests for Medians. We will examine the meaning of each of these and show you how to apply them.

Welcome to Analyze
“X” Sifting
Inferential Statistics
Intro to Hypothesis Testing
Hypothesis Testing ND P1
Hypothesis Testing ND P2
Hypothesis Testing NND P1
  – Equal Variance Tests
  – Tests for Medians
Hypothesis Testing NND P2
Wrap Up & Action Items
At this point we have covered the tests for determining significance for Normal Data. We will continue to follow the roadmap to complete the tests for Non-Normal Continuous Data. Later in the module we will use another roadmap that was designed for Discrete Data. Recall that Discrete Data does not follow a Normal Distribution; because it is not Continuous Data, there is a separate set of tests to properly analyze it.
1 Sample t
Why do we care if a data set is Normally Distributed?
When it is necessary to make inferences about the true nature of the
population based on random samples drawn from the population.
When the two indices of interest (X-Bar and s) depend on the data
being Normally Distributed.
For problem solving purposes, because we don’t want to make a bad decision – having Normal Data is so critical that with EVERY statistical test, the first thing we do is check for Normality of the data.
Recall the four primary causes for Non-normal data:
Skewness – Natural and Artificial Limits
Mixed Distributions - Multiple Modes
Kurtosis
Granularity
We will focus on skewness for the remaining tests for Continuous Data.
(Roadmap: Continuous Data — Non-Normal.)
Now we will continue down the Non-Normal side of the roadmap. Notice this slide is primarily for tests
of Medians.
Sample Size
– Ho: σ1 = σ2 = σ3 …
– Ha: At least one is different.
You have already seen this command in the last module; this is simply the application for Non-Normal Data. The question is: are any of the Standard Deviations or variances statistically different?
P-Value < 0.05 (0.000)
Assume the data is not Normally Distributed.

EXH_AOV.MTW
(Graphic: Probability Plot of Rot 2.)
(Graphic: Test for Equal Variances for Rot 2 by Factors2 — F-Test Statistic 1.75, P-Value 0.053; Levene's Test Statistic 0.03, P-Value 0.860; 95% Bonferroni Confidence Intervals for StDevs.)
When testing >2 samples with a Normal distribution, use Bartlett’s Test:
– To determine whether multiple Normal distributions have equal variance.
Our focus for this module is working with Non-Normal distributions.
For the data to be Normal the P-value must be greater than 0.05. Based on the P-value, the variables being analyzed are Non-normal Data. As you can see the data illustrates a P-value of 0.247, which is more than 0.05. As a result, there is no difference in variance between CallperWk1 and CallperWk2.
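Both equal-variance tests used in this module have scipy counterparts. A minimal sketch with hypothetical weekly call-volume samples standing in for CallperWk1 and CallperWk2:

```python
from scipy import stats

# Hypothetical weekly call-volume samples (stand-ins for CallperWk1/2).
wk1 = [52, 48, 55, 60, 47, 51, 58, 49, 53, 56]
wk2 = [50, 62, 45, 58, 48, 54, 61, 46, 57, 52]

# Bartlett's Test assumes Normal data; Levene's Test is robust to
# Non-normality, which is why it is the focus for this module.
b_stat, b_p = stats.bartlett(wk1, wk2)
l_stat, l_p = stats.levene(wk1, wk2)

for name, p in [("Bartlett", b_p), ("Levene", l_p)]:
    # p > 0.05 -> fail to reject H0: the variances are equal.
    print(name, "p =", round(p, 3))
```
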
Nonparametric Tests
Non-parametric Hypothesis Testing works the same way as parametric testing. Evaluate the P-value in the same manner.
(Graphic: the target versus the sample Medians X̃, X̃1, X̃2.)
MINITABTM’s Nonparametrics
1-Sample Sign: performs a one-sample sign test of the Median and calculates the corresponding
point estimate and confidence interval. Use this test as an alternative to one-sample Z and one-
sample t-tests.
1-Sample Wilcoxon: performs a one-sample Wilcoxon signed rank test of the Median and calculates the corresponding point estimate and confidence interval (more discriminating or efficient than the sign test). Use this test as a nonparametric alternative to one-sample Z and one-sample t-tests.
Mann-Whitney: performs a Hypothesis Test of the equality of two population Medians and
calculates the corresponding point estimate and confidence interval. Use this test as a
nonparametric alternative to the two-sample t-test.
Kruskal-Wallis: performs a Hypothesis Test of the equality of population Medians for a one-way design. This test is more powerful than Mood’s Median (the confidence interval is narrower, on average) for analyzing data from many populations, but is less robust to outliers. Use this test as an alternative to the one-way ANOVA.
Mood’s Median Test: performs a Hypothesis Test of the equality of population Medians in a one-
way design. Test is similar to the Kruskal-Wallis Test. Also referred to as the Median test or sign
scores test. Use as an alternative to the one-way ANOVA.
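Most of the nonparametric tests listed above have scipy counterparts (the 1-Sample Sign is shown later). A minimal sketch on hypothetical skewed samples:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
# Hypothetical skewed (Non-normal) samples, for illustration only.
a = rng.exponential(2.0, 60) + 10
b = rng.exponential(2.0, 60) + 10
c = rng.exponential(2.0, 60) + 13   # clearly shifted Median

# 1-Sample Wilcoxon: test the Median of `a` against a target of 12.
w_stat, w_p = stats.wilcoxon(a - 12)

# Mann-Whitney: equality of two population Medians.
u_stat, u_p = stats.mannwhitneyu(a, b)

# Kruskal-Wallis and Mood's Median: equality of several population Medians.
h_stat, k_p = stats.kruskal(a, b, c)
m_stat, m_p, grand_median, table = stats.median_test(a, b, c)

# `c` was shifted, so the k-sample tests should reject H0 (p < 0.05).
print("Kruskal-Wallis p =", round(k_p, 4))
```
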
1-Sample Example
4. Sample Size:
This data set has 500 samples (well in excess of necessary sample size).
The Statistical Problem is: The null hypothesis is that the Median is equal to 63 and the
alternative hypothesis is the Median is not equal to 63.
Open the MINITABTM Data File: “DISTRIB1.MTW”. Next you have a choice of either performing a
1-Sample Sign Test or 1-Sample Wilcoxon Test because both will test the Median against a
target. For this example we will perform a 1-Sample Sign Test.
1-Sample Example
Sign Test for Median: Pos Skew
Sign test of Median = 63.00 versus ≠ 63.00
           N  Below  Equal  Above       P  Median
Pos Skew 500     37      0    463  0.0000   65.70
As you can see the P-value is less than 0.05, so we must reject the null hypothesis, which means we have data that supports the alternative hypothesis that the Median is different than 63. The actual Median of 65.70 is shown in the Session Window. Since the Median is greater than the target value, it seems the new process is not as good as we may have hoped.
Perform the same steps as the 1-Sample Sign to use the 1-sample Wilcoxon.
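The sign test in the Session Window can be reproduced by hand: under the null hypothesis that the Median equals 63, each untied observation falls above 63 with probability 0.5, so the counts reduce to a two-sided binomial test:

```python
from scipy import stats

# Counts from the Session Window: 500 samples, 37 below the
# target Median of 63, 0 equal, 463 above.
below, above = 37, 463

# Under H0 (Median = 63) each non-tied point is above with p = 0.5.
result = stats.binomtest(above, below + above, 0.5)
print(result.pvalue < 0.05)  # -> True: reject H0, the Median is not 63
```
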
1-Sample Example
For the 1-Sample Sign test, select a confidence interval level of 95%. As you can see this yields a resulting interval of 65.26 to 66.50. The NLI means a non-linear interpolation method was used to estimate the confidence interval. As you can see the confidence interval is very narrow. Since the target of 63 is not within the confidence interval, reject the null hypothesis.
As you will see the confidence interval is even tighter for the Wilcoxon test. Therefore we reject the null; the Median is higher than the target of 63. Unfortunately, the Median was higher than the target, which is not the desired direction.
HYPOTESTSTUD.MPJ
Stat>Nonparametrics>1-Sample Sign
The Black Belt in this case agrees the Mine Manager is achieving
his target of 2.1 tons/ day
We agree!
Mann-Whitney Example
The Mann-W hitney test is used to test if the Medians for 2 samples
are different.
2. Ho: M 1 = M 2
Ha: M 1 ≠ M 2
3. Mann-Whitney test.
4. There are 200 data points for each machine, well over the
minimum sample necessary.
Mann-Whitney Example
5. Statistical Conclusion: When looking at the Probability Plots, one machine yields a less than .05 P-value while the other is Normal. The good news is when performing a Nonparametric Test of 2 samples, only one has to be Normal. With that, perform the Mann-Whitney test.

(Graphic: Probability Plot of Mach A, Normal — Mean 16.73, StDev 5.284, N 200, AD 0.630, P-Value 0.099.)

Since zero (the difference between the 2 Medians) is not contained within the confidence interval we reject the null hypothesis. Also, the last line in the Session Window, where it says “… is significant at 0.0010”, is the equivalent of a P-value for the Mann-Whitney test.

6. Practical Conclusion: The Medians of the machines are different.

Stat>Nonparametric>Mann-Whitney…

If the samples are the same, zero would be included within the confidence interval.

Mann-Whitney Test and CI: Mach A, Mach B
          N  Median
Mach A  200  14.841
Mach B  200  16.346
Point estimate for ETA1-ETA2 is -1.604
95.0 Percent CI for ETA1-ETA2 is (-2.635,-0.594)
W = 36509.0
Exercise
The final 2 tests are the Mood’s Median and the Kruskal Wallis.
Certified Lean Six Sigma Black Belt Book Copyright OpenSourceSixSigma.com
(Graphic: Summary for Recovery, Location = Savannah — Anderson-Darling A-Squared 0.81, P-Value 0.032; Mean 87.660, StDev 7.944, Variance 63.113, N 25, Median 87.500.)
Notice evidence of outliers in at least 2 of the 3 populations. You could do a Box Plot to get a clearer idea about the outliers.
(Graphics: Summary for Recovery, Location = Bangor — Anderson-Darling A-Squared 0.72, P-Value 0.045; Mean 93.042, StDev 5.918, N 13, Median 94.800.
Summary for Recovery, Location = Ankhar — Anderson-Darling A-Squared 0.86, P-Value 0.022; Mean 88.302, StDev 6.929, N 20, Median 88.425.
All three Locations have Anderson-Darling P-values below 0.05, so the data are not Normally Distributed.)
(Graphic: Test for Equal Variances for Recovery by Location — Bartlett's Test Statistic 1.33, P-Value 0.514; Levene's Test Statistic 1.02, P-Value 0.367; 95% Bonferroni Confidence Intervals for StDevs for Ankhar, Bangor, and Savannah.)
Statistical Solution: Since the P-value of the Mood’s Median test is less than 0.05, we reject the null hypothesis.
Practical Solution: Bangor has the highest recovery of all three facilities.
We observe the confidence intervals for the Medians of the 3 populations. Note there is no overlap of the 95% confidence levels for Bangor — so we visually know the P-value is below 0.05.
Mood Median Test: Recovery versus Location
Kruskal-Wallis Test
Using the same data set, analyze using the Kruskal-Wallis test.
H = 6.86 DF = 2 P = 0.032
H = 6.87 DF = 2 P = 0.032 (adjusted for ties)
When comparing the Kruskal-Wallis test to the Mood’s Median test, the Kruskal-Wallis test is better.
In this case the Kruskal-Wallis Test showed the variances were equal and illustrated the same
conclusion.
Exercise
Unequal Variance
Example
This is an example of comparable products. To view these graphs open the data set “Var_Comp.mtw”.
Model A and Model B are similar in nature (not exact), but are
manufactured in the same plant.
– Check for Normality: Var_Comp.mtw
(Graphics: Normal Probability Plots for Model A and Model B.)
Does Model B have a larger variance than Model A? The Median for Model B is much lower. How can we capitalize on our knowledge of the process? Let’s look at the data demographics to help us explain the differences between the two processes.
(Graphic: Test for Equal Variances for Model A and Model B — Levene's Test Statistic 4.47, P-Value 0.049; 95% Bonferroni Confidence Intervals for StDevs; dotplots of the data.)
Data Demographics
What clues can explain the difference in variances? This example illustrates how Non-normal Data
can have significant informational content as revealed through data demographics. Sometimes this
is all that is needed to draw conclusions.
(Graphic: Dotplot of Model A, Model B.)
Now let’s look at the MINITABTM Session Window. As you can see the P-value is greater than 0.05.

Next we are going to check for variance. Before performing a Test for Equal Variance, should the data be stacked?

Therefore we fail to reject the null hypothesis; there is no difference between a potential Black Belt’s degree and performance.
You have now completed Analyze Phase – Hypothesis Testing Non-Normal Data Part 1.
Notes
Analyze Phase
Hypothesis Testing Non-Normal Data
Part 2
Overview
The core fundamentals of this phase are Tests for Proportions and Contingency Tables. We will examine the meaning of each of these and show you how to apply them.

Welcome to Analyze
“X” Sifting
Inferential Statistics
Intro to Hypothesis Testing
Hypothesis Testing ND P1
Hypothesis Testing ND P2
Hypothesis Testing NND P1
Hypothesis Testing NND P2
  – Tests for Proportions
  – Contingency Tables
Wrap Up & Action Items
(Roadmap: Attribute Data — One Factor or Two Factors; One Sample, Two Samples, Two or More Samples.)
We will now continue with the roadmap for Attribute Data. Since Attribute Data is Non-normal by
definition, it belongs in this module on Non-normal Data.
For Continuous Data:
– Capability analysis – a minimum of 30 samples
– Hypothesis Testing – depends on the practical difference to be detected and the inherent variation in the process.
For Attribute Data:
– Capability analysis – a lot of samples
– Hypothesis Testing – a lot, but depends on the practical difference to be detected.
The hypotheses:
– H0: p = p0
– Ha: p ≠ p0

Test statistic: Z_obs = (p̂ − p0) / √(p0(1 − p0)/n)
Now let’s try an example:
4. Sample size:
Take note of how quickly the sample size increases as the alternative proportion goes up. It would require 1402 samples to tell a difference between 98% and 99% accuracy. Our sample of 500 will do because the alternative hypothesis is 96% according to the proportion formula.
After you analyze the data you will see the statistical conclusion is to reject the null hypothesis.
What is the Practical Conclusion…(the process is not performing to the desired accuracy of 99%).
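A one-proportion test of this kind can be sketched with scipy's exact binomial test. The observed count of 480 accurate shipments is a hypothetical illustration (the text gives only the sample size of 500 and the hypothesized 99%):

```python
from scipy import stats

# Hypothetical: suppose 480 of the 500 sampled shipments were accurate
# (96%, matching the alternative proportion used to size the sample).
n, accurate, p0 = 500, 480, 0.99

result = stats.binomtest(accurate, n, p0)
print(result.pvalue < 0.05)  # -> True: reject H0, accuracy is not 99%
```
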
As you can see the Sample Size should be at least 4073 to prove our hypothesis.
Yes, you get your bonus since .80 is not within the confidence interval. Because the improvement
was 84%, the sample size was sufficient.
Answer: Use alternative proportion of .82, hypothesized proportion of .80. n=4073. Either you’d
better ship a lot of stuff or you’d better improve the process more than just 2%!
Now let us calculate if we receive our bonus… Out of the 2000 shipments, 1680 were accurate. Was the sample size sufficient?

p̂ = X/n = 1680/2000 = 0.84
Z_obs = (p̂1 − p̂2 − D) / √(p̂1(1 − p̂1)/n1 + p̂2(1 − p̂2)/n2)

This is compared to Z critical = Z_α/2
α     δ     p1     p2    n
5%   .01   0.79   0.8   ___________
5%   .01   0.81   0.8   ___________
5%   .02   0.08   0.1   ___________
5%   .02   0.12   0.1   ___________
5%   .01   0.47   0.5   ___________
5%   .01   0.53   0.5   ___________

Answers: 34,247; 32,986; 4,301; 5,142; 5,831; 5,831
Power and Sample Size
Test for Two Proportions
Testing proportion 1 = proportion 2 (versus not =)
Calculating power for proportion 2 = 0.95
Alpha = 0.05
              Sample  Target
Proportion 1    Size   Power  Actual Power
        0.85     188     0.9      0.901451
The sample size is for each group.
A sample of at least 188 is necessary for each group to be able to detect a 10% difference. If you have reason to believe your improved process has only improved to 90% and you would like to be able to prove that improvement is occurring, the sample size of 188 is not appropriate. Recalculate using .90 for proportion 2 and leave proportion 1 at .85. It would require a sample size of 918 for each sample!
The data shown was gathered for two processes. The following data were taken:

                     Total Samples  Accurate
Before Improvement             600       510
After Improvement              225       212

Calculate proportions:
Before Improvement: 600 samples, 510 accurate: p̂1 = X1/n1 = 510/600 = 0.85
After Improvement: 225 samples, 212 accurate: p̂2 = X2/n2 = 212/225 = 0.942

Difference = p(1) - p(2)
Estimate for difference: -0.0922222
95% CI for difference: (-0.134005, -0.0504399)
Test for difference = 0 (vs not = 0): Z = -4.33  P-Value = 0.000
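The Session Window Z value can be reproduced from the unpooled formula shown earlier:

```python
from math import sqrt

from scipy import stats

# Before: 510/600 accurate; After: 212/225 accurate (from the table above).
x1, n1, x2, n2 = 510, 600, 212, 225
p1, p2 = x1 / n1, x2 / n2

# Unpooled Z statistic from the slide (hypothesized difference D = 0).
z = (p1 - p2) / sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
p_value = 2 * stats.norm.sf(abs(z))  # two-sided p-value

print(round(z, 2))     # -> -4.33, matching the Session Window
print(p_value < 0.05)  # -> True: the improvement is significant
```
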
1. W ho is worse?
2. Is the sample size large enough?
Boris: p̂1 = X1/n1 = 47/356 = 0.132
Igor: p̂2 = X2/n2 = 99/571 = 0.173
Results: As you can see we fail to reject the null hypothesis with the data given. One conclusion is the sample size is not large enough. It would take a minimum sample of 1673 to distinguish the sample proportions for Boris and Igor.

Now let’s see what the minimum sample size will be…

Stat>Power and Sample Size>2 Proportions

              Sample  Target
Proportion 1    Size   Power  Actual Power
        0.17    1673     0.9      0.900078
Contingency Tables
Contingency Tables are used to simultaneously compare more than two sample proportions with each other.
Statisticians have shown that the following statistic forms a chi-square distribution when H0 is true:

Σ (observed − expected)² / expected

where “observed” is the sample frequency, “expected” is the calculated frequency based on the null hypothesis, and the summation is over all cells in the table.
That? ..oh, that’s my contingency table!
Chi-square Test

χ² = Σ_{i=1..r} Σ_{j=1..c} (O_ij − E_ij)² / E_ij

Where:
O = the observed value (from sample data)
E = the expected value
E_ij = (F_row × F_col) / F_total
r = number of rows
c = number of columns
F_row = total frequency for that row
F_col = total frequency for that column
F_total = total frequency for the table

χ²_critical = χ²_(α, ν), from the Chi-Square Table, where ν = degrees of freedom = (r-1)(c-1)
Wow!!! Can you believe this is the math in a Contingency Table. Thank goodness for MINITABTM.
Now let’s do an example.
Note the data gathered in the table. Curley isn’t looking too good right now (as if he ever did).
0.306 × 45 = 13.8
0.694 × 38 = 26.4
(observed − expected)² / expected
The final step is to create a summary table including the observed chi-squared.
Critical Value:
• Like any other Hypothesis Test, compare the observed statistic with the critical statistic. We decide α = 0.05; what else do we need to know?
• For a chi-square distribution, we need to specify ν. In a contingency table:
  ν = (r − 1)(c − 1), where
  r = # of rows
  c = # of columns
• In our example, we have 2 rows and 3 columns, so ν = 2.
• What is the critical chi-square? For a Contingency Table, all the risk is in the right hand tail (i.e. a one-tail test); look it up in MINITABTM (Calc>Probability Distributions>Chisquare…)
χ²crit = 5.99
(Graphic: chi-square distribution — the Accept region ends at χ²crit = 5.99; χ²obs = 7.02 falls in the Reject region.)
Using MINITABTM
As you can see, the data confirms that we reject the null hypothesis.
Stat>Tables>Chi-Square Test

Chi-Square Test
Expected counts are printed below observed counts

         Moe   Larry  Curley  Total
  1        5       8      20     33
        7.64   11.61   13.75
       0.912   1.123   2.841
  2       20      30      25     75
       17.36   26.39   31.25
       0.401   0.494   1.250
Total     25      38      45    108
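The whole table above can be reproduced in one call with scipy, which returns the chi-square statistic, degrees of freedom, and expected counts:

```python
import numpy as np
from scipy import stats

# Observed counts from the table: rows are outcomes 1 and 2,
# columns are Moe, Larry, Curley.
observed = np.array([[5, 8, 20],
                     [20, 30, 25]])

chi2, p, dof, expected = stats.chi2_contingency(observed)

print(round(chi2, 2))            # -> 7.02, the observed chi-square
print(dof)                       # -> 2, i.e. (2-1)*(3-1)
print(p < 0.05)                  # -> True: reject H0 (7.02 > crit 5.99)
print(np.round(expected[0], 2))  # first row of expected counts
```
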
Quotations Exercise
• You are the quotations manager and your team thinks that the
reason you don’t get a contract depends on its complexity.
• You determine a way to measure complexity and classify lost
contracts as follows:
Secondly, in MINITABTM perform a Chi-Square Test:
Stat>Tables>Chi-Square Test
You have now completed Analyze Phase – Hypothesis Testing Non-Normal Data Part 2.
Notes
Analyze Phase
Wrap Up and Action Items
• Embracing change
• Continuous learning
• Being tenacious and courageous
• Making data-based decisions
• Being rigorous
• Thinking outside of the box
Each “player” in the Six Sigma process must be A ROLE MODEL for the Six Sigma culture.
A Six Sigma Black Belt has a tendency to take on many roles; therefore these behaviors will help you through the journey.
Analyze Deliverables
Sample size is dependent on the type of data.
• Listed below are the Analyze Phase deliverables that each candidate will present in a PowerPoint presentation at the beginning of the Control Phase training.
• At this point you should all understand what is necessary to provide
these deliverables in your presentation.
– Team Members (Team Meeting Attendance)
– Primary Metric
– Secondary Metric(s)
– Data Demographics
– Hypothesis Testing (applicable tools)
– Modeling (applicable tools)
– Strategy to reduce X’s
– Project Plan
– Issues and Barriers

It’s your show!
DMAIC Roadmap
[Flowchart: DMAIC Roadmap — Establish Team, Estimate COPQ, Measure, then the Analyze Phase loop: Collect Data → Statistically Significant? → Practically Significant? → Root Cause? → Identify Root Cause → Update FMEA]
General Questions
• Are there any issues or barriers that prevent you from completing this phase?
• Do you have adequate resources to complete the project?
This is a template that should be used with each project to ensure you take the proper steps; remember, Six Sigma is very much about taking steps: lots of them, and in the correct order.
WHAT   WHO   WHEN   WHY   WHY NOT   HOW
You’re on your way!
Notes
Analyze Phase
Quiz
Now we will see what you have retained from the Analyze Phase of the course. Please answer
these questions to the best of your ability without referencing the text. The answers are in the
Appendix. Please check your answers against the answers provided and review the sections in
the Analyze Phase where your retention of the knowledge is less than you desire.
1. The Multi-Vari Chart was originally designed to show variation from 3 primary sources:
Within unit, Between unit, and Temporal (or over time).
True False
2. A Six Sigma tool that helps to screen factors by using graphical techniques to logically subgroup multiple Discrete X´s plotted against a Continuous Y is known as a ________________________ Chart. (fill in the blank)
4. As the sample size becomes large, the new distribution of Means will form a Normal Distribution, no matter what the shape of the population distribution of individuals is.
This concept is known as the Central Limit Theorem.
True False
5. Which of the following statements are true regarding Hypothesis Testing? (check all that
apply)
A. A Hypothesis Test is an a priori theory relating to differences between variables
B. A statistical test or Hypothesis Test is performed to prove or disprove the theory
C. A Hypothesis Test converts the Practical Problem into a Statistical Problem.
D. A Hypothesis Test illustrates short-term results
6. What are the four primary causes for Non-normal Data? (check all that apply)
A. Skewness
B. Mixed Distributions
C. Kurtosis
D. Formulosis
E. Granularity
7. When a data set is Normally Distributed, making inferences about the true nature of the
population based on random samples drawn from the population is an example of using
Non-normal Data.
True False
8. From the list below, which is the best example of a Mann-Whitney Test? (check all that
apply)
A. Determine if one of a few machines has a different Mean cycle time
B. Determine if one of a few machines has a different Median cycle time
C. Determine if document A and document B have different Mean cycle times
D. Determine if document A and document B have different Median cycle times
10. Having Unequal variance is a result of similar distributions having: (check all that apply)
A. Extreme tails
B. Outliers
C. Multiple Modes
D. Having the tails of the distribution equal each other
11. A Capability Analysis using Attribute Data should contain a lot of samples to be statistically sound.
True False
12. Contingency Tables are used to: (check all that apply)
A. Illustrate one tail proportion
B. Compare more than two sample proportions with each other
C. Contrast the outliers under the tail
D. Analyze the ´´what if´´ scenario
13. Contingency Tables are used to test for association (or dependency) between two or more classifications.
True False
14. To conduct a proper Capability Analysis using Continuous Data, what is the minimum
recommended number of samples to use? (check all that apply)
A. 15
B. 20
C. 30
D. 50
15. For a Skewed Distribution, the appropriate statistic to describe the central tendency is:
(check all that apply)
A. Mean
B. Median
C. Mode
16. A Non-parametric Test makes the assumption that the data are from Normal Populations.
True False
17. If the results from a Hypothesis Test are located in the ´´Region of Doubt´´ area, what can be concluded? (check all that apply)
A. Failure to reject the Null Hypothesis
B. Failure to accept the Null Hypothesis
C. The test was conducted improperly
D. Rejection of the alpha
19. To conduct a proper Hypothesis Test there are six recommended steps to follow.
True False