Module - 4 - Analyze Phase - Oct 20


Analyze Phase

Module 4

Agenda

– Overview of Analyze Phase

– Hypothesis Testing

– Confidence Intervals

– Sample Size

– Analysis of Variance (ANOVA)

– Chi Square

– Multi-variate Studies

– Correlation and Regression

Saudi Aramco: Company General Use
Overview of Analyze Phase

In the Analyze phase, we sift through the various x’s to focus on the critical x’s
Identify the problem’s root causes through process & data analysis

▪ Define phase: Symptoms “Y” (quantify output)
▪ Measure phase: Fishbones, C&E matrix, FMEA, Process Maps, etc.; 30 - 50 Inputs, then 10 - 15 Xs (prioritize causes)
▪ Analyze phase: Paretos, Multi-vari, 1-t, 2-t, paired t, ANOVA, Chi-Sq, Test for equal var, proportions tests; 8 - 10 PIVs (validate causes)
▪ 3-6 Key PIVs
▪ Optimized Process
Analyze Topics

❑ The individual components of the Analyze phase aren’t intuitively related, but if their
relationships are understood early, the relevant pieces can be assimilated more easily

❑ Several common concepts are repeated throughout this phase; if they are
understood up front, the follow-on lessons are easier

▪ Hypothesis Testing

▪ Confidence Intervals

▪ Analysis of Variance (ANOVA)

▪ Chi Square

▪ Correlation and Regression

Hypothesis Testing

What is Hypothesis Testing (1/2)

❑ The probability of occurrence is based on a pre-determined statistical confidence

❑ Decisions are based on:


▪ Beliefs (past experience)

▪ Preferences (current needs)

▪ Evidence (statistical data)

▪ Risk (acceptable level of failure)

❑ A hypothesis is just a statement that we want to test:


▪ The average time to process a purchase order is 12 days

▪ The reject rate from Machine 5 has improved as a result of our work

▪ There is no relationship between humidity and our Cost of Poor Quality

▪ My average Chess Ranking for 2022 is lower than my average Chess Ranking for 2023
What is Hypothesis Testing (2/2)

In statistics, we usually form at least two hypotheses:

❑ The “null hypothesis” H0 assumes no significant difference/relationship

▪ This is the default assumption of all statistical tests

❑ The “alternative hypothesis” Ha assumes there is significant difference/relationship

❑ To do any hypothesis testing, we need to


▪ First obtain a sample (or samples) and measure the relevant variable of interest

▪ Then the measured sample value is compared to the hypothesised value of the population,
and we decide whether or not the evidence supports the hypothesis

❑ The key question becomes how can we reliably use the values from a single sample
to make conclusions about a population value?

The cookie owner’s claim (Conjecture)

❑ A cookie shop is selling its famous product: gingerbread cookies! The owner
believes his product is the most delicious in the world

❑ The owner also says that the average weight (μ) of each product (a bag of
gingerbread cookies) is 500g

❑ Assume we know that the weight of a bag of cookies is normally distributed with a
standard deviation (σ) of 30g

❑ If the owner’s claim is true (the average weight of one bag of cookies = 500g), we
would expect the weight of one bag of cookies to follow a normal distribution centered
at 500g, as shown in the next figure
Can We Believe in the Owner’s Words?

❑ Does the average weight of a bag of cookies really equal 500g? What if the owner
deceives customers and gives less than 500g cookies? How do we validate his words?

❑ Here is where “hypothesis testing” comes in

❑ To implement hypothesis testing, let’s first set up our null hypothesis (H0) and the
alternative hypothesis (H1)

❑ As industrial engineers, we were taught not to suspect others without evidence

❑ So, we assume the owner is honest about his business (H0). If we want to check
whether his bags of cookies weigh less than 500g, we need to collect data and have enough
evidence to support our guess (H1). So we set up the hypothesis statements as
follows:
H0: Average weight of one bag of cookies (μ) = 500g
H1: Average weight of one bag of cookies (μ) < 500g
Can We Believe in the Owner’s Words?

❑ Since we are unsure what our population distribution looks like, dashed lines are
used to represent possible distributions. If the owner’s claim is true, we would expect
the weight of one bag of cookies to have a distribution with a mean equal to 500g
(left picture)

❑ However, if the owner’s claim is not true and the mean weight of a bag of cookies is less
than 500g, the population distribution should look different (any of the right
pictures)
How to test the owner’s statement ?

❑ The problem statement is set:

H0: Average weight of one bag of cookies (μ) = 500g
H1: Average weight of one bag of cookies (μ) < 500g

❑ So now, the next question is: how do we test our hypothesis statement?

❑ Maybe just weigh all bags of cookies so that we know the exact population
distribution?

❑ Well, obviously it is impossible for us to collect all the cookies (the population) produced
by the owner’s cookie shop! So what should we do?

❑ Here we need to use what we learned in inferential statistics
From Inferential statistics

❑ In practice, it is almost impossible to collect the data of the whole
population to calculate the parameters (population mean μ, population standard deviation
σ, etc.). That is why inferential statistics uses samples and statistics (sample mean x̄, sample
standard deviation s, etc.) as estimators to help us infer the unknown population parameters
Hypothesis testing the process

❑ In hypothesis testing, we are not interested in a single unknown parameter; instead,
we are interested in the question “can we reject the null hypothesis?”

❑ To answer this question, we follow the same method: we calculate a statistic from our
sample data to infer the answer. The statistic used here is called the test statistic
Sampling Distribution

❑ Before we collect sample data and calculate the test statistic to test the hypothesis
statement, we need to understand the concept of sampling distribution

❑ Sampling Distribution is the distribution of the sample statistic

❑ Let’s use sample mean (x̄) as an example


▪ If we sample from the population many times, we get many sample datasets (sample
1 to sample m). Then, if we calculate the sample mean (x̄) from each sample dataset, we
get m data points of the sample mean (x̄)

▪ Using these data points, we can draw a distribution of the sample mean (x̄). Since this
distribution is built from a sample statistic, we call it the sampling distribution of
the sample mean (x̄)

❑ The same idea applies to other statistics. For example, if we calculate the test statistic
from each sample dataset, we get the sampling distribution of the test statistic.
Sampling Distribution

❑ A sampling distribution is like any other distribution

❑ It shows how likely (with what probability) each value of the statistic is to appear if we
sample from the population many times

We will use the brown color to represent the sampling distribution curve in the following
sections
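The idea of a sampling distribution can be illustrated with a small simulation. This is only a sketch: it assumes the cookie example's population (normal, μ = 500g, σ = 30g) and samples of n = 25 bags; the function name, the seed, and the choice of m = 10,000 repetitions are illustrative and do not come from the slides.

```python
import random
import statistics


def simulate_sample_means(mu, sigma, n, m, seed=0):
    """Draw m samples of size n from Normal(mu, sigma) and return the m sample means."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(m)
    ]


# Assumed values: the cookie example, repeated m = 10,000 times.
means = simulate_sample_means(mu=500, sigma=30, n=25, m=10_000)
center = statistics.fmean(means)   # should sit close to the population mean
spread = statistics.stdev(means)   # should sit close to sigma / sqrt(n)

print(f"mean of the sample means: {center:.1f} g")
print(f"std dev of the sample means: {spread:.2f} g (theory: {30 / 25 ** 0.5:.2f} g)")
```

The spread of the simulated sample means is much narrower than the 30g spread of individual bags, which is why a sample mean of 485g can be surprising even though a single 485g bag would not be.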
Testing Hypothesis Statements

❑ The first thing we need to do is to have a sample dataset

❑ So, we go to the cookie shop and randomly pick 25 bags of cookies (n) as our sample
data, and we calculate that the mean weight (x̄) of this sample is 485g

❑ The first part of testing is to compare our sample statistic to the null hypothesis so that
we know how far away our sample statistic is from the expected value

❑ To do so, we first assume the null hypothesis is true

❑ What does this mean? In our case, it means we assume the population mean weight
of one bag of cookies really equals 500g
Testing Hypothesis Statements

❑ If the statement is true, then according to the Central Limit Theorem, if we sample from
this population many times, the sampling distribution of the sample mean (x̄) looks like
the picture below: it is centered at 500g with a standard error of σ/√n = 30/√25 = 6g
Testing Hypothesis Statements

❑ So now, if the null hypothesis is true, we can easily see that our sample mean is 15g below
(485 − 500 = −15) the expected mean value (500g)

❑ But “15g” is only a number, which by itself does not tell us whether the gap is large or small

❑ Also, if we want to calculate the probability under the curve, it is inefficient to calculate it
case by case

❑ Imagine there are numerous distributions, each with its own mean and standard
deviation. We do not want to repeat the probability calculation for every one of them

❑ So, what should we do? We standardize our value so that the mean of the distribution
always equals zero
Z-score and Test Statistic

❑ The benefit of standardization is that statisticians have already generated a table that
gives the area under the curve for each standardized value

❑ So we don’t need to calculate the area case by case. All we need to do is
standardize our data

❑ How do we standardize? In our case, we use the z-score to transform our data, and the
z-score is the test statistic in our case
Z-score and Test Statistic

❑ The next picture shows the sampling distribution of the test statistic (z-score). If our
sample data exactly matched the null hypothesis (population mean = 500g, sample
mean = 500g), the test statistic would equal 0

❑ In our case, our sample mean equals 485g, which gives a test statistic of
z = (x̄ − μ0) / (σ/√n) = (485 − 500) / (30/√25) = −2.5. This indicates that our
sample mean is 2.5 standard errors below the expected value
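The z-score for the cookie example can be reproduced in a few lines. This is a minimal sketch; the function name `z_statistic` is illustrative, and the inputs are the values stated in the example (x̄ = 485g, μ0 = 500g, σ = 30g, n = 25).

```python
import math


def z_statistic(sample_mean, mu0, sigma, n):
    """Standardize a sample mean: how many standard errors it lies from mu0."""
    standard_error = sigma / math.sqrt(n)  # 30 / 5 = 6 g in the cookie example
    return (sample_mean - mu0) / standard_error


z = z_statistic(sample_mean=485, mu0=500, sigma=30, n=25)
print(z)  # -2.5
```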
Choice of Test Statistic

❑ The test statistic is chosen based on different cases.

❑ You might hear of different kinds of statistical tests, such as the z-test, t-test, and
chi-square test. Why do we need different kinds of tests?

❑ Because we might need to test different types of data (categorical? quantitative?), we might
have different purposes of testing (testing a mean? a proportion?), the data might
follow different distributions, and we might know only limited attributes of our data. Hence,
choosing a suitable testing method is another crucial task

❑ In this case, we are interested in testing the mean, and we assume our
population data is normally distributed with a known population standard deviation (σ)

❑ Based on these conditions, we choose the z-test for this case
Measuring the probability of the sample data

❑ So, we know how far away our test statistic is from the expected value when the null
hypothesis is true. What we really want to know is: how likely (with what probability) is it
to get this sample data if the null hypothesis is true?

❑ To answer this question, we need to calculate a probability. As you know, the probability
of falling between two points is the area under our sampling distribution curve
between these two points

❑ So here, we do not calculate the probability of a specific point; instead, we calculate the
probability from our test statistic out to infinity, which is the cumulative probability of
all the points that are farther away from the expected test statistic than our test statistic is

❑ This cumulative probability is our p-value
P-value

❑ The p-value is the probability of obtaining test results at least as extreme as the results
actually observed, under the assumption that the null hypothesis is correct

❑ Let’s calculate the p-value in our case

❑ We can look up the z-table or use any statistical software to get the p-value

❑ In our case, the p-value equals 0.0062 (0.62%). Since our alternative hypothesis
(H1) is set up as “mean value less than 500g”, we only care about the values that are
less than our test statistic (the left-hand side)
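As a check on the table lookup, the left-tail area can be computed with Python's standard library. This is a sketch using `statistics.NormalDist` in place of a printed z-table; the only input is the test statistic z = −2.5 from the example.

```python
from statistics import NormalDist

z = -2.5  # test statistic from the cookie example

# Left-tailed test (H1: mean < 500 g): the p-value is the area to the left of z
# under the standard normal curve.
p_value = NormalDist().cdf(z)
print(round(p_value, 4))  # 0.0062
```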
P-value

❑ Now, we have p-value = 0.0062. It is a small number, but what does it mean?

❑ It means that, if our null hypothesis is true (the population mean really
equals 500g) and we sample from this population distribution 1000 times, we would expect
only about 6.2 of those samples to have a sample mean of 485g or less
P-value

❑ If we get sample data with a sample mean equal to 485g, there are two
possible explanations:

1. The population mean really equals 500g (H0 is correct), and we got “lucky” to draw this
rare sample (about 6.2 out of 1000 samples)

2. The assumption that the null hypothesis is true is incorrect. This sample data (sample
mean equals 485g) actually comes from another population distribution in which a sample
mean of 485g is more likely to happen
P-value

❑ So now we know that if our p-value is very small, either we drew a very rare
sample or our assumption (the null hypothesis is true) is incorrect

❑ Then, the next question is: we have the p-value, but how do we use it to judge
when to reject the null hypothesis? In other words, how small must the p-value be before
we are willing to say that this sample comes from another population?

❑ Here, let’s introduce the judgment standard: the significance level (α). The significance
level is a pre-defined value that needs to be set before performing the hypothesis
test. It is a threshold that gives us a criterion for when to
reject the null hypothesis

❑ This criterion is set as below:

▪ if p-value ≤ significance level (α), we reject the null hypothesis (H0)

▪ if p-value > significance level (α), we fail to reject the null hypothesis (H0)
Significance Level

❑ In the picture below, the red area is the significance level (in our case, it
equals 0.05). We use the significance level as our criterion: if the p-value is less
than or equal to the red area, we reject H0; if the p-value is greater than the
red area, we fail to reject H0

❑ The significance level (α) also indicates the maximum risk we are willing to accept
of a type I error (a type I error means we reject H0 when H0 is actually true)

❑ In our case, we have p-value = 0.0062, which is smaller than 0.05; as a result,
we reject our null hypothesis
What if we change the significance level?

❑ The result can be different. For example, with α = 0.005: since 0.0062 > 0.005, we fail
to reject H0. Here is the tricky part: since the significance level is subjective, we need
to determine it before the test. Otherwise, we are very likely to fool ourselves after seeing
the p-value
Recap of what we covered so far

❑ Part 1: To test whether our sample data support the alternative hypothesis or not, we
first assume the null hypothesis is true, so that we can see how far our sample
data is from the expected value given by the null hypothesis. The p-value is the
probability of obtaining test results at least as extreme as the results actually
observed, under the assumption that the null hypothesis is correct

❑ Part 2: Based on the distribution, data type, purpose, and known attributes of our data,
choose an appropriate test statistic and calculate it for our sample
data. (The test statistic shows how far our sample data is from the expected value)

❑ Part 3: Calculate the probability (area under the sampling distribution curve) from the
test statistic out to infinity (that is, toward more extreme values) in the direction that
represents your alternative hypothesis (left-tailed, right-tailed, or two-tailed)
Recap of what we covered so far

What is the meaning of a small p-value?

❑ A very small p-value can mean one of two things:

(1) We were “lucky” enough to draw this very rare sample

(2) This sample data is not from the distribution assumed by the null hypothesis; instead,
it is from another population distribution (so we consider rejecting the null hypothesis)

❑ How to use the p-value?

To determine whether we can reject the null hypothesis, we compare the p-value to
the pre-defined significance level (threshold)

▪ If p-value ≤ significance level (α), we reject the null hypothesis (H0)

▪ If p-value > significance level (α), we fail to reject the null hypothesis (H0)
Hypothesis Testing Risk

❑ The alpha risk or Type 1 Error (generally called the “Producer’s Risk”) is the probability
that we could be wrong in saying that something is “different”

❑ It is an assessment of the likelihood that the observed difference could have occurred
by random chance. Alpha is the primary decision-making tool of most statistical tests

                                     Actual Conditions
Statistical Conclusions              Not Different (Ho is True)   Different (Ho is False)
Not Different (Fail to Reject Ho)    Correct Decision             Type II Error
Different (Reject Ho)                Type 1 Error                 Correct Decision
Alpha Risk
Alpha (α) risks are expressed relative to a reference distribution
Distributions include:
▪ t-distribution
▪ z-distribution
▪ χ²-distribution
▪ F-distribution

The α-level is represented by the clouded areas (the regions of doubt in the tails). Sample
results in these areas lead to rejection of H0. Results between the two regions are accepted
as chance differences
Alpha Risk

❑ Alpha (α) is known as the significance level; the probability of being wrong (risk level)

The beta risk

❑ The beta risk or Type 2 Error (also called the “Consumer’s Risk”) is the probability that
we could be wrong in saying that two or more things are the same when, in fact, they
are different

                                     Actual Conditions
Statistical Conclusions              Not Different (Ho is True)   Different (Ho is False)
Not Different (Fail to Reject Ho)    Correct Decision             Type II Error
Different (Reject Ho)                Type 1 Error                 Correct Decision
The beta risk

❑ Beta Risk is the probability of failing to reject the null hypothesis when a difference
exists

[Figure: the distribution if H0 is true (centered on the H0 value) and the distribution if Ha
is true, separated by the critical value of the test statistic. On the “Reject H0” side,
α = Pr(Type 1 error) = 0.05 under the H0 distribution; on the “Accept H0” side,
β = Pr(Type II error) under the Ha distribution]
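To make the beta risk concrete, it can be computed for the cookie example once a specific alternative is assumed. The sketch below assumes, purely for illustration, that the true mean is 485g; the slides do not state a true mean. The function name is hypothetical, and α = 0.05, σ = 30g, and n = 25 come from the example.

```python
import math
from statistics import NormalDist


def beta_risk_lower_tailed(mu0, mu_true, sigma, n, alpha):
    """Beta risk of a lower-tailed z-test when the true mean is mu_true."""
    se = sigma / math.sqrt(n)
    # Critical sample mean: H0 is rejected when the sample mean is at or below it.
    critical_mean = NormalDist(mu0, se).inv_cdf(alpha)
    # Beta: probability the sample mean lands above the critical value
    # (so we fail to reject H0) even though the true mean is mu_true.
    beta = 1 - NormalDist(mu_true, se).cdf(critical_mean)
    return critical_mean, beta


# mu_true = 485 is an assumed alternative, not a value given in the slides.
critical_mean, beta = beta_risk_lower_tailed(
    mu0=500, mu_true=485, sigma=30, n=25, alpha=0.05
)
print(f"critical sample mean: {critical_mean:.2f} g")
print(f"beta risk: {beta:.3f}, power: {1 - beta:.3f}")
```

Under these assumptions the critical sample mean is about 490.1g, and roughly one test in five would miss a true 15g shortfall. A larger sample narrows both distributions and lowers β.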
Steps to Statistical Hypothesis Test
Define the problem and state objectives

State a “Null Hypothesis” (Ho)

State the “Alternate Hypothesis” (Ha)

Establish significance level (α)

Collect sample data

Calculate test statistic and/or p-value

DECIDE:
What does the evidence suggest?
Reject Ho? or Fail to reject Ho?

What are the types of Hypothesis Testing
Hypothesis Testing and Tool Selection (by number of samples compared: One, Two, Multiple)

▪ Mean (continuous data)
  – One: 1-Sample t
  – Two: 2-Sample t
  – Multiple: ANOVA
▪ Variance (continuous data)
  – One: χ²-test (MINITAB: Descriptive Statistics, use CI)
  – Two: ANOVA, 2 Variances, Test For Equal Variance
  – Multiple: ANOVA, Test For Equal Variances
▪ Proportion (discrete data)
  – One: 1-Sample P Test
  – Two: 2-Sample P Test, χ²-test
  – Multiple: χ²-test
What are the types of Hypothesis Testing
▪ Discrete data (defect counts): Chi Square; P Tests. Interpretation: differences in % defect.
  Example: 80 vs. 150 (shown against 585 and 535) gives a low p-value (< 0.05);
  140 vs. 150 (shown against 585 and 535) gives a high p-value (> 0.05)
▪ Continuous data (cycle time, $$): ANOVA; t-tests. Interpretation: differences in averages
▪ Continuous data (cycle time, $$): ANOVA; F-Tests. Interpretation: differences in variation
▪ Continuous data (cycle time, $$): Correlation & Regression. Interpretation: strength of relationship

In each case, a low p-value (< 0.05) means there is a difference, and a high p-value (> 0.05)
means no statistically significant difference was found
A low p-value identifies where the critical x’s are
Hypothesis Testing Roadmap

Normal
Two samples One sample

Test of Equal Variance 1 Sample Variance 1 Sample z-test/t-test

Variance Equal Variance Not Equal

Two samples One sample Two samples One sample

2 Sample T One Way ANOVA 2 Sample T One Way ANOVA

Hypothesis Testing Roadmap

Non Normal
Two samples One sample

Test of Equal Variance Median Test

Two samples One sample

Mann-Whitney Several Median Tests

Hypothesis Testing Roadmap

Attribute Data

▪ One Factor
  – One sample: One Sample Proportion
  – Two samples: Two Sample Proportion
  – Two or More Samples: Chi Square Test (Contingency Table)
▪ Two Factors: Chi Square Test (Contingency Table)
Common Pitfalls to Avoid
❑ While using Hypothesis Testing the following facts should be kept in mind at the
conclusion stage:

• The decision is about Ho and NOT Ha.

• The conclusion statement is whether the contention of Ha was upheld

• The null hypothesis (Ho) is on trial

• When a decision has been made:

o Nothing has been proved

o It is just a decision

o All decisions can lead to errors (Types I and II)

Common Pitfalls to Avoid
❑ If the decision is to “Reject Ho,” then the conclusion should read: “There is
sufficient evidence at the α level of significance to show that [state the alternative
hypothesis Ha].”

❑ If the decision is to “Fail to Reject Ho,” then the conclusion should read: “There
isn’t sufficient evidence at the α level of significance to show that [state the
alternative hypothesis Ha].”
Hypothesis Testing (Normal data)

Hypothesis test for μ (σ known)

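Null hypothesis: H0: μ = μ0

Test statistic value: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$

Alternative Hypothesis          Rejection Region for Level α Test
Ha: μ > μ0                      z ≥ z_α (upper-tailed)
Ha: μ < μ0                      z ≤ –z_α (lower-tailed)
Ha: μ ≠ μ0                      either z ≥ z_{α/2} or z ≤ –z_{α/2} (two-tailed)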


Testing means of a large sample

When the sample size is large, the z tests for case I (normal population, known σ) are easily
modified to yield valid test procedures without requiring either a normal population
distribution or known σ
A large n (> 30) implies that the standardized variable

$Z = \dfrac{\bar{X} - \mu}{S/\sqrt{n}}$

has approximately a standard normal distribution.


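Testing means of a small sample coming
from a normal population
The One-Sample t Test
Null hypothesis: H0: μ = μ0

Test statistic value: $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$ (normality must be assessed)

Alternative Hypothesis          Rejection Region for a Level α Test
Ha: μ > μ0                      t ≥ t_{α,n–1} (upper-tailed)
Ha: μ < μ0                      t ≤ –t_{α,n–1} (lower-tailed)
Ha: μ ≠ μ0                      either t ≥ t_{α/2,n–1} or t ≤ –t_{α/2,n–1} (two-tailed)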


P-Values for z Tests

The calculation of the P-value depends on whether the test
is upper-, lower-, or two-tailed:

▪ Upper-tailed test: P = 1 – Φ(z)
▪ Lower-tailed test: P = Φ(z)
▪ Two-tailed test: P = 2[1 – Φ(|z|)]

Here Φ is the standard normal cumulative distribution function. Each of these is the
probability of getting a value at least as extreme as what was obtained (assuming H0 true).
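The dependence of the P-value on the tail of the test can be sketched as a small helper. The function name and the `tail` labels are illustrative; Φ is taken from Python's `statistics.NormalDist`.

```python
from statistics import NormalDist


def z_test_p_value(z, tail):
    """P-value of a z test for tail in {'upper', 'lower', 'two'}."""
    phi = NormalDist().cdf  # standard normal CDF
    if tail == "upper":
        return 1 - phi(z)
    if tail == "lower":
        return phi(z)
    if tail == "two":
        # By symmetry, 2 * phi(-|z|) equals 2 * (1 - phi(|z|)).
        return 2 * phi(-abs(z))
    raise ValueError("tail must be 'upper', 'lower', or 'two'")


# The cookie example (z = -2.5) under each kind of alternative hypothesis.
for tail in ("lower", "upper", "two"):
    print(tail, round(z_test_p_value(-2.5, tail), 4))
```

The same test statistic gives very different P-values depending on the direction of the alternative hypothesis, which is why the direction must be fixed before looking at the data.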


Testing Proportion of a large sample

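The estimator $\hat{p} = X/n$ is unbiased ($E(\hat{p}) = p$), has
approximately a normal distribution, and its standard
deviation is $\sigma_{\hat{p}} = \sqrt{p(1-p)/n}$

When H0 is true, $E(\hat{p}) = p_0$ and $\sigma_{\hat{p}} = \sqrt{p_0(1-p_0)/n}$, so $\sigma_{\hat{p}}$
does not involve any unknown parameters. It then follows
that when n is large and H0 is true, the test statistic

$Z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$

has approximately a standard normal distribution.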


Proportions: Large-Sample Tests

Alternative Hypothesis          Rejection Region
Ha: p > p0                      z ≥ z_α (upper-tailed)
Ha: p < p0                      z ≤ –z_α (lower-tailed)
Ha: p ≠ p0                      either z ≥ z_{α/2} or z ≤ –z_{α/2} (two-tailed)

These test procedures are valid provided that np0 ≥ 10 and n(1 – p0) ≥ 10.
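A large-sample proportion test can be sketched as follows. The data are hypothetical (30 defectives in a sample of 200, tested against p0 = 0.10 with an upper-tailed alternative) and the function name is illustrative; neither comes from the slides.

```python
import math
from statistics import NormalDist


def one_proportion_z_test(x, n, p0):
    """Upper-tailed large-sample z test of H0: p = p0 against Ha: p > p0."""
    # Validity check for the normal approximation.
    if n * p0 < 10 or n * (1 - p0) < 10:
        raise ValueError("need n*p0 >= 10 and n*(1 - p0) >= 10")
    p_hat = x / n
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    p_value = 1 - NormalDist().cdf(z)  # area in the upper tail
    return z, p_value


# Hypothetical data: 30 defectives out of 200 units, H0: p = 0.10.
z, p_value = one_proportion_z_test(x=30, n=200, p0=0.10)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

Note that the standard error uses p0, not the sample proportion, because the test statistic is evaluated under the assumption that H0 is true.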


Two Samples Test, Known Variances

In general:

Null hypothesis: H0: μ1 – μ2 = Δ0

Test statistic value: $z = \dfrac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\sigma_1^2/m + \sigma_2^2/n}}$

Alternative Hypothesis          Rejection Region for Level α Test
Ha: μ1 – μ2 > Δ0                z ≥ z_α (upper-tailed)
Ha: μ1 – μ2 < Δ0                z ≤ –z_α (lower-tailed)
Ha: μ1 – μ2 ≠ Δ0                either z ≥ z_{α/2} or z ≤ –z_{α/2} (two-tailed)
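The two-sample z test with known variances can be sketched from summary statistics. All numbers below are hypothetical (they are not taken from the slides), and the function name is illustrative; m and n are the two sample sizes.

```python
import math
from statistics import NormalDist


def two_sample_z_test(x_bar, y_bar, sigma1, sigma2, m, n, delta0=0.0):
    """Two-tailed z test of H0: mu1 - mu2 = delta0 with known sigmas."""
    standard_error = math.sqrt(sigma1 ** 2 / m + sigma2 ** 2 / n)
    z = (x_bar - y_bar - delta0) / standard_error
    p_value = 2 * NormalDist().cdf(-abs(z))  # two-tailed area
    return z, p_value


# Hypothetical summary statistics for two independent samples.
z, p_value = two_sample_z_test(
    x_bar=29.8, y_bar=34.7, sigma1=4.0, sigma2=5.0, m=20, n=25
)
print(f"z = {z:.3f}, p-value = {p_value:.5f}")
```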


Large-Sample Tests

The assumptions of normal population distributions and known
values of σ1 and σ2 are fortunately unnecessary when both
sample sizes are sufficiently large

Furthermore, using $S_1^2$ and $S_2^2$ in place of $\sigma_1^2$ and $\sigma_2^2$ gives a
variable whose distribution is approximately standard normal:

$Z = \dfrac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{S_1^2/m + S_2^2/n}}$

These tests are usually appropriate if both m > 30 and n > 30


The Two-Sample t Test

When the population distributions are both normal (normality must be assessed), the
standardized variable

$T = \dfrac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\sqrt{S_1^2/m + S_2^2/n}}$

has approximately a t distribution with df v
estimated from the data by

$v = \dfrac{\left(s_1^2/m + s_2^2/n\right)^2}{\dfrac{(s_1^2/m)^2}{m-1} + \dfrac{(s_2^2/n)^2}{n-1}}$ (round v down to the nearest integer)

The two-sample t test for testing H0: μ1 – μ2 = Δ0 is as follows:

Test statistic value: $t = \dfrac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{s_1^2/m + s_2^2/n}}$

Alternative Hypothesis          Rejection Region for Approximate Level α Test
Ha: μ1 – μ2 > Δ0                t ≥ t_{α,v} (upper-tailed)
Ha: μ1 – μ2 < Δ0                t ≤ –t_{α,v} (lower-tailed)
Ha: μ1 – μ2 ≠ Δ0                either t ≥ t_{α/2,v} or t ≤ –t_{α/2,v} (two-tailed)
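The two-sample t statistic and its estimated degrees of freedom can be sketched from summary statistics. The numbers are hypothetical and the function name is illustrative. The sketch stops at t and the rounded-down df, which are then compared against a t-table (the Python standard library has no t distribution).

```python
import math


def two_sample_t(x_bar, s1, m, y_bar, s2, n, delta0=0.0):
    """Two-sample t statistic with df estimated from the data (unequal variances)."""
    a = s1 ** 2 / m  # estimated variance of the first sample mean
    b = s2 ** 2 / n  # estimated variance of the second sample mean
    t = (x_bar - y_bar - delta0) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (m - 1) + b ** 2 / (n - 1))
    return t, math.floor(df)  # df is rounded down to the nearest integer


# Hypothetical summary statistics for two independent samples.
t, df = two_sample_t(x_bar=10.0, s1=2.0, m=10, y_bar=8.0, s2=3.0, n=12)
print(f"t = {t:.3f} with df = {df}")
```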


A Test for Proportion Differences

Theoretically, we know that:

$Z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{pq\left(\dfrac{1}{m} + \dfrac{1}{n}\right)}}$, where p is the common proportion and q = 1 – p,

has approximately a standard normal distribution when H0
is true

However, this Z cannot serve as a test statistic because
the value of p is unknown: H0 asserts only that there is a
common value of p, but does not say what that value is


A Large-Sample Test Procedure

Under the null hypothesis, we assume that p1 = p2 = p. So,
instead of separate samples of size m and n from two
different populations (two different binomial distributions),
we really have a single sample of size m + n from one
population with proportion p

The total number of individuals in this combined sample
having the characteristic of interest is X + Y

The estimator of p is then

$\hat{p} = \dfrac{X + Y}{m + n} = \dfrac{m}{m+n}\hat{p}_1 + \dfrac{n}{m+n}\hat{p}_2$

Using $\hat{p}$ and $\hat{q} = 1 - \hat{p}$ in place of p and q in our old
equation gives a test statistic having approximately
a standard normal distribution when H0 is true

Null hypothesis: H0: p1 – p2 = 0

Test statistic value (large samples): $z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{m} + \dfrac{1}{n}\right)}}$

Alternative Hypothesis          Rejection Region for Approximate Level α Test
Ha: p1 – p2 > 0                 z ≥ z_α
Ha: p1 – p2 < 0                 z ≤ –z_α
Ha: p1 – p2 ≠ 0                 either z ≥ z_{α/2} or z ≤ –z_{α/2}

A P-value is calculated in the same way as for previous z tests.

The test can safely be used as long as $m\hat{p}_1$, $m\hat{q}_1$, $n\hat{p}_2$, and $n\hat{q}_2$ are all
at least 10
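The pooled two-proportion test can be tried on the defect-count comparison shown earlier in this module (80 vs. 150 and 140 vs. 150). The sketch assumes those figures are defect counts out of 585 and 535 units respectively, which is an interpretation of the flattened slide, not something the slide states outright. The function name is illustrative.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(x, m, y, n):
    """Two-tailed large-sample z test of H0: p1 - p2 = 0 using the pooled estimate."""
    p1_hat, p2_hat = x / m, y / n
    p_pooled = (x + y) / (m + n)  # estimator of the common p under H0
    q_pooled = 1 - p_pooled
    z = (p1_hat - p2_hat) / math.sqrt(p_pooled * q_pooled * (1 / m + 1 / n))
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value


# Assumed reading of the slide: defects / units in each group.
z_low, p_low = two_proportion_z_test(x=80, m=585, y=150, n=535)
z_high, p_high = two_proportion_z_test(x=140, m=585, y=150, n=535)
print(f"80 vs. 150:  z = {z_low:.2f}, p-value = {p_low:.2g}")
print(f"140 vs. 150: z = {z_high:.2f}, p-value = {p_high:.3f}")
```

Under that reading, the first comparison gives a P-value far below 0.05 and the second a P-value above 0.05, matching the low and high p-value labels on the slide.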


The F Test for Ratio of Variances

The F probability distribution has two parameters, denoted by v1 and


v2. The parameter v1 is called the numerator degrees of freedom, and
v2 is the denominator degrees of freedom

A random variable that has an F distribution cannot assume a


negative value. The density function is complicated and will not
be used explicitly, so it’s not shown

There is an important connection between an F variable and chi-squared
variables


The F Distribution

If X1 and X2 are independent chi-squared rv’s with v1 and
v2 df, respectively, then the rv

$F = \dfrac{X_1/v_1}{X_2/v_2}$

can be shown to have an F distribution.

Recall that a chi-squared distribution is obtained by summing squared
standard Normal variables (such as squared deviations, for example).
So a scaled ratio of two variances is a ratio of two scaled chi-squared
variables

[Figure: a typical F density function]

We use $F_{\alpha,v_1,v_2}$ for the value on the horizontal axis that
captures α of the area under the F density curve with v1
and v2 df in the upper tail

The density curve is not symmetric, so it would seem that both
upper- and lower-tail critical values must be tabulated. This is
not necessary, though, because of the fact that

$F_{1-\alpha,v_1,v_2} = 1/F_{\alpha,v_2,v_1}$

For example, F.05,6,10 = 3.22 and F.95,10,6 = 0.31 = 1/3.22.


The F Test for Ratio of Variances

A test procedure for hypotheses concerning the ratio $\sigma_1^2/\sigma_2^2$ is
based on the following result.

Theorem
Let X1,…, Xm be a random sample from a normal distribution
with variance $\sigma_1^2$, let Y1,…, Yn be another random
sample (independent of the Xi’s) from a normal distribution
with variance $\sigma_2^2$, and let $S_1^2$ and $S_2^2$ denote the
two sample variances. Then the rv

$F = \dfrac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$

has an F distribution with v1 = m – 1 and v2 = n – 1.

This theorem results from combining the definition of the F distribution with the fact that the
variables $(m-1)S_1^2/\sigma_1^2$ and $(n-1)S_2^2/\sigma_2^2$ each have a
chi-squared distribution with m – 1 and n – 1 df,
respectively.

Because F involves a ratio rather than a difference, the test
statistic is the ratio of sample variances.

The claim that $\sigma_1^2 = \sigma_2^2$ is then rejected if the ratio differs by
too much from 1.

Null hypothesis: H0: $\sigma_1^2 = \sigma_2^2$

Test statistic value: $f = s_1^2/s_2^2$

Alternative Hypothesis          Rejection Region for a Level α Test
Ha: σ1² > σ2²                   f ≥ F_{α,m–1,n–1}
Ha: σ1² < σ2²                   f ≤ F_{1–α,m–1,n–1}
Ha: σ1² ≠ σ2²                   either f ≥ F_{α/2,m–1,n–1} or f ≤ F_{1–α/2,m–1,n–1}

Testing the ratio of variances and testing the equality of variances are the same test: when
the variances are equal, their ratio is one and their difference is zero
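An upper-tailed F test can be sketched using the critical value quoted earlier in this section, F.05,6,10 = 3.22, which corresponds to sample sizes m = 7 and n = 11. The sample variances below are hypothetical, and the function name is illustrative; the critical value is passed in from a table because the Python standard library has no F distribution.

```python
def f_test_upper_tailed(s1_sq, s2_sq, f_critical):
    """Test H0: sigma1^2 = sigma2^2 against Ha: sigma1^2 > sigma2^2.

    f_critical is the tabulated upper-tail value F(alpha, m - 1, n - 1).
    """
    f = s1_sq / s2_sq  # test statistic: ratio of the sample variances
    return f, f >= f_critical


# Hypothetical sample variances with m = 7 and n = 11, so df = (6, 10);
# 3.22 is the F.05,6,10 value quoted in the text.
f, reject_h0 = f_test_upper_tailed(s1_sq=12.0, s2_sq=3.0, f_critical=3.22)
print(f"f = {f:.2f}, reject H0: {reject_h0}")
```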
Bartlett's Test for Equality of Variances

❑ To check the equality of variances between two samples, or two groups of
data, we use the F-test

❑ When we want to test the equality of more than two variances, we
use Bartlett’s test

❑ Bartlett's test is used to test if k samples have equal variances. Equal variances
across samples is called homogeneity of variances.

❑ Some statistical tests, for example the analysis of variance (ANOVA), assume that
variances are equal across groups or samples

Bartlett's Test for Equality of Variances

❑ Bartlett's test is sensitive to departures from normality

❑ The Levene’s Test is an alternative to the Bartlett test that is less sensitive to
departures from normality

❑ Some common statistical methods assume that variances of the populations from
which different samples are drawn are equal. Bartlett's test assesses this
assumption. It tests the null hypothesis that the population variances are equal

Bartlett's Test for Equality of Variances

H0: σ1² = σ2² = … = σk²
H1: σi² ≠ σj² for at least one pair (i,j)

The test statistic is rather ugly:

$T = \dfrac{(N-k)\ln s_p^2 - \sum_{i=1}^{k}(N_i-1)\ln s_i^2}{1 + \dfrac{1}{3(k-1)}\left(\sum_{i=1}^{k}\dfrac{1}{N_i-1} - \dfrac{1}{N-k}\right)}$

In the above, $s_i^2$ is the variance of the ith group, N is the total sample size, $N_i$ is the
sample size of the ith group, k is the number of groups, and $s_p^2$ is the pooled variance. The
pooled variance is a weighted average of the group variances and is defined as:

$s_p^2 = \sum_{i=1}^{k}(N_i-1)s_i^2/(N-k)$
Bartlett's Test for Equality of Variances
Critical Region:
The variances are judged to be unequal if
$T > \chi^2_{1-\alpha,\,k-1}$
where $\chi^2_{1-\alpha,\,k-1}$ is the critical value of the chi-square distribution
with k - 1 degrees of freedom and a significance level of α

Key assumptions: Homogeneity (common group variances), Normality of
responses (or of residuals), and Independence of responses (or of residuals), hopefully
achieved through randomization
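Bartlett's statistic can be sketched with the standard library, assuming the usual form of the statistic (pooled variance weighted by N_i − 1, with the 1/(3(k − 1)) correction in the denominator). The three groups are hypothetical data, the function name is illustrative, and the critical value 5.991 is the tabulated chi-square value for α = 0.05 with k − 1 = 2 degrees of freedom.

```python
import math
import statistics


def bartlett_statistic(groups):
    """Bartlett's T statistic for k groups of observations."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    variances = [statistics.variance(g) for g in groups]  # sample variances
    total = sum(sizes)
    # Pooled variance: average of the group variances weighted by (N_i - 1).
    pooled = sum((n_i - 1) * s2 for n_i, s2 in zip(sizes, variances)) / (total - k)
    numerator = (total - k) * math.log(pooled) - sum(
        (n_i - 1) * math.log(s2) for n_i, s2 in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (n_i - 1) for n_i in sizes) - 1 / (total - k)) / (3 * (k - 1))
    return numerator / correction


# Hypothetical data: three groups of five observations each.
groups = [[1, 2, 3, 4, 5], [2, 4, 6, 8, 10], [3, 6, 9, 12, 15]]
T = bartlett_statistic(groups)
critical = 5.991  # tabulated chi-square value for alpha = 0.05, df = k - 1 = 2
print(f"T = {T:.3f}, variances judged unequal: {T > critical}")
```

With only five observations per group, even sample variances of 2.5, 10, and 22.5 do not push T past the critical value, which illustrates how little power the test has on small samples.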
Paired t-test

❑ A Paired t-test is used to compare the Means of two measurements from the same
samples; it is generally used as a before-and-after test

❑ This is appropriate for testing the difference between two Means when the data are
paired and the paired differences follow a Normal Distribution

❑ This matching allows you to account for variability between the pairs, usually
resulting in a smaller error term, thus increasing the sensitivity
of the Hypothesis Test or confidence interval. The difference within each pair
(before vs. after) is the delta (δ)

Ho: μδ = μo
Ha: μδ ≠ μo

❑ Where μδ is the population Mean of the differences and μo is the hypothesized
Mean of the differences, typically zero.
Paired t-test Example

❑ We are interested in changing the sole material for a popular brand of shoes for
children. In order to account for variation in activity of children wearing the shoes,
each child will wear one shoe of each type of sole material. The sole material will
be randomly assigned to either the left or right shoe.

❑ 2. Statistical Problem:

Ho: μδ = 0

Ha: μδ ≠ 0

❑ 3. Paired t-test (comparing data that must remain paired).

α = 0.05 β = 0.10

Paired t-test Example

❑ How much of a difference can be detected with 10 samples?


MINITAB™ Session Window

Power and Sample Size
1-Sample t Test
Testing Mean = null (versus not = null)
Calculating power for Mean = null + difference
Alpha = 0.05  Assumed Standard Deviation = 1

Sample
  Size  Power  Difference
    10    0.9     1.15456

This means that with 10 samples and a Power of 0.9, the smallest difference we will be able to detect is about 1.15 if the Standard Deviation is equal to 1

Paired t-test Example

We need to calculate the difference (delta) between the two distributions for each pair

We are concerned with the delta: does the hypothesized value (Ho: μδ = 0) fall outside the confidence interval built from the calculated t?

Paired t-test Example

❑ Following the Hypothesis Test roadmap, we first test the AB-Delta distribution for
Normality

MINITAB™ Session Window

[Figure: Box Plot of AB Delta]

One-Sample T: AB Delta
Test of mu = 0 vs not = 0

Variable   N   Mean      StDev     SE Mean   95% CI                T     P
AB Delta   10  0.410000  0.387155  0.122429  (0.133046, 0.686954)  3.35  0.009

Reject the null hypothesis: the P-value (0.009) is below 0.05 and the 95% confidence interval for the Mean difference does not include zero, so there is a statistically significant difference in wear between the two materials
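
As a cross-check, the SE Mean, T and 95% CI in the session output can be recomputed from the reported N, Mean and StDev. The sketch below is illustrative only: it uses the Python standard library, and the critical value t(0.975, 9) = 2.262157 is hard-coded from a standard t table as an assumption rather than computed.

```python
import math

# Summary statistics copied from the One-Sample T output for AB Delta
n = 10
mean_delta = 0.410000
stdev_delta = 0.387155
mu0 = 0.0  # hypothesized Mean of the differences

# Two-sided critical value t(0.975, df = 9), taken from a standard t table
# (hard-coded assumption; not computed here)
T_CRIT_DF9 = 2.262157

se_mean = stdev_delta / math.sqrt(n)
t_stat = (mean_delta - mu0) / se_mean
ci_low = mean_delta - T_CRIT_DF9 * se_mean
ci_high = mean_delta + T_CRIT_DF9 * se_mean
reject_h0 = abs(t_stat) > T_CRIT_DF9

print(f"SE Mean = {se_mean:.6f}")
print(f"T = {t_stat:.2f}")
print(f"95% CI = ({ci_low:.6f}, {ci_high:.6f})")
print("Reject Ho" if reject_h0 else "Fail to reject Ho")
```

The decision is the same whether it is read from the T value (larger than the critical value) or from the confidence interval (does not contain zero).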
Hypothesis Testing (Non-normal data)

Non-Normal Hypothesis Tests

❑ At this point we have covered the tests for determining significance for Normal
Data. We will continue to follow the roadmap to complete the test for Non-normal
Data with Continuous Data

❑ Later in the module we will use another roadmap that was designed for Discrete
Data

❑ Recall that Discrete Data does not follow a Normal Distribution; because it is not Continuous Data, a separate set of tests is needed to properly analyze it

Non-Normal Hypothesis Tests

❑ Why do we care if a data set is Normally Distributed?


▪ When it is necessary to make inferences about the true nature of the population based
on random samples drawn from the population

▪ When the two indices of interest (X-Bar and s) depend on the data being Normal

▪ For problem solving purposes, because we don’t want to make a bad decision – having
Normal Data is so critical that with EVERY statistical test, the first thing we do is check
for Normality of the data

❑ There are four primary causes for Non-normal Data:


▪ Skewness – Natural and Artificial Limits

▪ Mixed Distributions - Multiple Modes

▪ Kurtosis

▪ Granularity

Non-Normal Distributions
[Figure: four example histograms: 1 Skewed, 2 Kurtosis, 3 Multi-Modal, 4 Granularity]

Skewness Classification
[Figure: example histograms (Frequency) of a Left Skew and a Right Skew distribution]

Potential Causes of Skewness
1-1 Natural Limits
1-2 Artificial Limits (Sorting)
1-3 Mixtures
1-4 Non-Linear Relationships
1-5 Interactions
1-6 Non-Random Patterns Across Time

1-3 Mixed Distributions

Mixed Distributions occur when data comes from multiple sources that are supposed to be the same yet are not

▪ Machine A and Machine B
▪ Operator A and Operator B
▪ Payment Method A and Payment Method B
▪ Interviewer A and Interviewer B

[Figure: Sample A + Sample B = Combined distribution]

1-4 Non-Linear Relationships

Non-Linear Relationships occur when Y does not change in proportion to X, so a symmetric distribution of X produces a Skewed distribution of Y

[Figure: plot of Y versus X, with the Marginal Distribution of X and the Marginal Distribution of Y]

1-5 Interactions

Interactions occur when two inputs interact with each other to have a larger impact on Y than either would by themselves

If you find that two inputs together have a large impact on Y but would not affect Y by themselves, this is called an Interaction

[Figure: Interaction Plot for Process Output, Aerosol Hairspray example: Room Temperature for Spray versus No Spray, With Fire versus No Fire]

1-6 Time Relationships / Patterns

The distribution is dependent on time

[Figure: Y plotted against Time, with the Marginal Distribution of Y]

Often seen when tooling requires “warming up”, and with tool wear, chemical bath depletions, and ambient temperature effect on tooling

Non-Normal Right (Positive) Skewed

Moment coefficient of Skewness will be close to zero for symmetric distributions, negative for left Skewed and positive for right Skewed

Summary for Pos Skew
[Figure: histogram of Pos Skew, 70 to 130, with 95% Confidence Intervals for Mean and Median]

Anderson-Darling Normality Test
A-Squared      46.49
P-Value        < 0.005

Mean           70.000
StDev          10.000
Variance       100.000
Skewness       2.41707
Kurtosis       6.93041
N              500

Minimum        62.921
1st Quartile   63.647
Median         65.695
3rd Quartile   72.821
Maximum        130.366

95% Confidence Interval for Mean     69.121   70.879
95% Confidence Interval for Median   65.260   66.501
95% Confidence Interval for StDev     9.416   10.662

Kurtosis

Kurtosis refers to the shape of the tails

– Leptokurtic: Peaked with Long-Tails
– Platykurtic: Flat with Short-Tails
• Different combinations of distributions cause the resulting overall shapes
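
The moment coefficients of Skewness and Kurtosis referred to in these slides can be sketched with the standard library as below. This is a minimal illustration, not the exact calculation behind the summaries shown here: the three small data sets are invented, the function names are hypothetical, and statistical packages often apply small-sample adjustments, so their reported values may differ slightly from these plain moment ratios.

```python
from statistics import fmean

def central_moment(data, order):
    """Average of (x - mean) ** order over all observations."""
    centre = fmean(data)
    return fmean((x - centre) ** order for x in data)

def moment_skewness(data):
    """Third central moment scaled by the 1.5 power of the second."""
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    """Fourth central moment scaled by the squared second, minus 3 (Normal = 0)."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3.0

# Hypothetical data sets, invented for illustration only
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 1, 1, 2, 10]
left_skewed = [-x for x in right_skewed]

for name, data in [("symmetric", symmetric),
                   ("right skewed", right_skewed),
                   ("left skewed", left_skewed)]:
    print(f"{name:13s} skewness = {moment_skewness(data):+.3f}  "
          f"excess kurtosis = {excess_kurtosis(data):+.3f}")
```

A negative excess kurtosis, as for the flat symmetric set, points toward a Platykurtic shape; a positive value points toward a Leptokurtic shape.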

Kurtosis - Platykurtic distribution
Platykurtic
Multiple Means shifting over time produce a plateau in the data

Summary for Flat
[Figure: histogram of Flat, 44 to 64, with 95% Confidence Intervals for Mean and Median]

Anderson-Darling Normality Test
A-Squared      1.74
P-Value        < 0.005

Mean           52.330
StDev          5.099
Variance       26.001
Skewness       0.033260
Kurtosis       -0.988765
N              182

Minimum        41.978
1st Quartile   48.006
Median         52.223
3rd Quartile   56.729
Maximum        64.140

95% Confidence Interval for Mean     51.585   53.076
95% Confidence Interval for Median   50.932   53.741
95% Confidence Interval for StDev     4.624    5.685

Causes:
2-1 Mixtures (Combined Data from Multiple Processes): Multiple Set-Ups, Multiple Batches, Multiple Machines, Tool Wear (over time)
2-2 Sorting or Selecting: Scrapping product that falls outside the spec limits
2-3 Trends or Patterns: Lack of Independence in the data (example: tool wear, chemical bath)
2-4 Non Linear Relationships: Chemical Systems

Negative coefficient of Kurtosis indicates Platykurtic distribution

Kurtosis - Leptokurtic distribution
Distributions overlaying each other that have very different variance can cause a Leptokurtic distribution

Summary for LongTail
[Figure: histogram of LongTail, 0 to 90, with 95% Confidence Intervals for Mean and Median]

Anderson-Darling Normality Test
A-Squared      3.59
P-Value        < 0.005

Mean           51.389
StDev          12.998
Variance       168.960
Skewness       -0.06752
Kurtosis       3.08271
N              125

Minimum        0.813
1st Quartile   46.488
Median         52.017
3rd Quartile   55.620
Maximum        94.795

95% Confidence Interval for Mean     49.088   53.691
95% Confidence Interval for Median   50.584   52.666
95% Confidence Interval for StDev    11.562   14.845

Causes:
2-1 Mixtures (Combined Data from Multiple Processes): Multiple Set-Ups, Multiple Batches, Multiple Machines, Tool Wear (over time)
2-2 Sorting or Selecting: Scrapping product that falls outside the spec limits
2-3 Trends or Patterns: Lack of Independence in the data (example: tool wear, chemical bath)
2-4 Non Linear Relationships: Chemical Systems

Positive Kurtosis value indicates Leptokurtic distribution

Multiple Modes
Reasons for Multiple Modes:
1 Mixtures of distributions (most likely)
2 Lack of independence – trends or patterns
3 Catastrophic failures (example: testing voltage on a motor and the motor shorts out so we get a zero reading etc.)

Multiple Modes have such dramatic combinations of underlying sources that they show distinct Modes. They might have appeared Platykurtic, but the sources were far enough apart to show separation

These causes are usually the easiest to identify

Bi-Modal Distribution
Summary for BiModal
[Figure: histogram of BiModal, 20 to 140, with 95% Confidence Intervals for Mean and Median]

Anderson-Darling Normality Test
A-Squared      27.11
P-Value        < 0.005

Mean           79.570
StDev          32.385
Variance       1048.785
Skewness       0.00716
Kurtosis       -1.63184
N              500

Minimum        21.341
1st Quartile   48.265
Median         83.772
3rd Quartile   110.379
Maximum        142.391

95% Confidence Interval for Mean     76.724   82.416
95% Confidence Interval for Median   62.354   97.233
95% Confidence Interval for StDev    30.494   34.527

This is an example of a Bi-Modal Distribution. Interestingly each peak is actually a Normal Distribution, but when the data is viewed as a group it is obviously not Normal

2 Different Distributions
- 2 different machines
- 2 different operators
- 2 different administrators

Extreme Bi-Modal (Outliers)
Summary for ExtremeBiModal
[Figure: histogram of ExtremeBiModal, 30 to 105, with 95% Confidence Intervals for Mean and Median]

Anderson-Darling Normality Test
A-Squared      22.88
P-Value        < 0.005

Mean           58.487
StDev          21.751
Variance       473.106
Skewness       -0.59479
Kurtosis       -1.03403
N              385

Minimum        19.987
1st Quartile   26.920
Median         66.161
3rd Quartile   74.140
Maximum        103.301

95% Confidence Interval for Mean     56.308   60.667
95% Confidence Interval for Median   63.410   67.793
95% Confidence Interval for StDev    20.315   23.406

If you see an extreme outlier, it usually has its own cause or own source of variation. It’s relatively easy to isolate the cause by looking on the X axis of the Histogram

Bi-Modal – Multiple Outliers
Summary for Multiple Outliers
[Figure: histogram of Multiple Outliers, 24 to 44, with 95% Confidence Intervals for Mean and Median]

Anderson-Darling Normality Test
A-Squared      20.90
P-Value        < 0.005

Mean           26.251
StDev          4.845
Variance       23.477
Skewness       3.17250
Kurtosis       9.11483
N              108

Minimum        22.629
1st Quartile   24.128
Median         25.053
3rd Quartile   25.971
Maximum        46.000

95% Confidence Interval for Mean     25.326   27.175
95% Confidence Interval for Median   24.836   25.297
95% Confidence Interval for StDev     4.274    5.594

Having multiple outliers is more difficult to correct. This typically means multiple inputs are involved

Granularity
Granular data is easy to see in a Dot Plot
– Use Caution!
• It looks ‘Normal’ but it is only symmetric and not Continuous
– Causes:
1 Measurement system resolution (Gage R&R)
2 Categorical (step-type function) data

Notice the P-value in the Normal Probability Plot; it is definitely smaller than 0.05

Normal Example
Notice the contrast to the previous slide!

Conclusions Regarding Distributions

❑ Non-normal Distributions are not BAD!!!

❑ Non-normal Distributions can give more Root Cause information than Normal data
(the nature of why…)

❑ Understanding what the data is telling us is KEY !!!

Hypothesis Testing Roadmap

Non Normal
Two samples One sample

Test of Equal Variance Median Test

Two samples One sample

Mann-Whitney Several Median Tests

Test of Equal Variance

❑ Levene’s test of Equal Variance is used to compare the estimated population Standard Deviations from two or more samples with Non-normal Distributions

❑ Ho: σ1 = σ2 = σ3 …

❑ Ha: At least one is different

Test of Equal Variance (Minitab)

Stat > Basic Statistics > Normality test…

[Figure: Probability Plot of Rot 2 (Normal); Mean 1.023, StDev 1.407, N 100, AD 7.448, P-Value <0.005]

P-value < 0.05 (0.00): assume data is not Normally Distributed

Test of Equal Variance Non-Normal Distribution
Stat>ANOVA>Test for Equal Variance
Use Levene’s Statistics for Non-Normal Data
Ho: σ1 = σ2 = σ3 …
Ha: At least one is different.

Test for Equal Variances for Rot 2 (grouped by Factors2, levels 1 and 2)

Test            Test Statistic  P-Value
F-Test          1.75            0.053
Levene's Test   0.03            0.860

[Figure: 95% Bonferroni Confidence Intervals for StDevs, and Box Plots of Rot 2 by Factors2]

P-value > 0.05 (0.860): assume variance is equal.

Test of Equal Variance - Conclusions

❑ When testing 2 samples with Normal Distribution, use F-test:

To determine whether two Normal Distributions have equal variance

❑ When testing >2 samples with Normal Distribution, use Bartlett’s test:

To determine whether multiple Normal Distributions have equal variance

❑ When testing 2 or more samples with Non-normal Distributions, use Levene’s test:

To determine whether two or more distributions have Equal Variance

Levene’s test is our focus for this module when working with Non-normal Distributions

Mean and Median

This Graphical Summary provides the confidence interval for the Median

With Normal Data notice the symmetrical shape of the distribution and notice how the Mean and the Median are centered

With skewed data, the Mean is influenced by the outliers. Notice the Median is still centered

[Figure: two Graphical Summaries, a Normal data set (histogram 340 to 360) and a skewed data set (histogram 0 to 15), each with 95% Confidence Intervals for Mean and Median]

                                    Normal data        Skewed data
Anderson-Darling Normality Test
A-Squared                           0.30               3.72
P-Value                             0.574              < 0.005

Mean                                350.51             4.8454
StDev                               5.01               3.1865
Variance                            25.12              10.1536
Skewness                            -0.079532          1.11209
Kurtosis                            -0.635029          1.26752
N                                   75                 200

Minimum                             339.09             0.1454
1st Quartile                        347.48             2.4862
Median                              350.48             4.1533
3rd Quartile                        353.99             6.5424
Maximum                             359.53             16.4629

95% Confidence Interval for Mean    349.35  351.66     4.4011  5.2898
95% Confidence Interval for Median  349.30  351.85     3.6296  4.7174
95% Confidence Interval for StDev   4.32    5.97       2.9018  3.5336

MINITAB’s Nonparametric tests

❑ 1-Sample Sign: performs a one-sample sign test of the Median and calculates the corresponding point estimate and confidence interval. Use this test as an alternative to one-sample Z and one-sample t-tests

❑ 1-Sample Wilcoxon: performs a one-sample Wilcoxon signed rank test of the Median and calculates the corresponding point estimate and confidence interval (more discriminating or efficient than the sign test). Use this test as a nonparametric alternative to one-sample Z and one-sample t-tests.

❑ Mann-Whitney: performs a Hypothesis Test of the equality of two population Medians and calculates the corresponding point estimate and confidence interval. Use this test as a nonparametric alternative to the two-sample t-test

MINITAB’s Nonparametric tests

❑ Kruskal-Wallis: performs a Hypothesis Test of the equality of population Medians for a one-way design. This test is more powerful than Mood’s Median (the confidence interval is narrower, on average) for analyzing data from many populations, but is less robust to outliers. Use this test as an alternative to the one-way ANOVA

❑ Mood’s Median Test: performs a Hypothesis Test of the equality of population Medians in a one-way design. Test is similar to the Kruskal-Wallis Test. Also referred to as the Median test or sign scores test. Use as an alternative to the one-way ANOVA

1-Sample Sign Test

❑ This test is used when you want to compare the Median of one distribution to a target
value

❑ Must have at least one column of numeric data. If there is more than one column of data, MINITAB™ performs a one-sample sign test separately for each column

❑ The hypotheses:

H0: M = Mtarget

Ha: M ≠ Mtarget

❑ Interpretation of the resulting P-value is the same as for the other Hypothesis Tests

1-Sample Sign Test

❑ Example: Our facility requires a cycle time from an improved process of 63 minutes.
This process supports the customer service division and has become a bottleneck to
completion of order processing. To alleviate the bottleneck the improved process
must perform at least at the expected 63 minutes

❑ Ho: M = 63

❑ Ha: M ≠ 63

❑ Use the 1-Sample Sign or the 1-Sample Wilcoxon test:
Stat>Non parametric> 1 sample sign …
or
Stat> Non parametric> 1 sample Wilcoxon

1-Sample Sign Test
Stat>Non parametric> 1 Sample Sign …

For a two tailed test, choose “not equal” for the alternative hypothesis.

Sign Test for Median: Pos Skew

Sign Test of Median = 63.00 versus not = 63.00

            N  Below  Equal  Above       P  Median
Pos Skew  500     37      0    463  0.0000   65.70

As you can see the P-value is less than 0.05, so we reject the null hypothesis: the data support the alternative hypothesis that the Median is different from 63.
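
The P-value of the sign test follows directly from the Below and Above counts: under Ho each observation is equally likely to fall on either side of the hypothesized Median. A minimal standard-library sketch, using the counts from the output above (the variable names are illustrative, and this is an exact binomial calculation rather than a reproduction of MINITAB's internals):

```python
from math import comb

# Counts copied from the Sign Test output for Pos Skew (hypothesized Median = 63)
below, equal, above = 37, 0, 463

n = below + above      # observations equal to the target are dropped
k = min(below, above)  # the less frequent sign

# Under Ho the count of either sign follows Binomial(n, 0.5).
# Exact integer arithmetic avoids overflow for large n.
tail = sum(comb(n, i) for i in range(k + 1))
p_value = min(1.0, 2 * tail / 2**n)

print(f"n = {n}, smaller count = {k}, two-sided P = {p_value:.4f}")
```

With only 37 of 500 observations below 63, the exact P-value is far smaller than the four decimals shown in the session output.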

1 Sample Wilcoxon Test
Stat>Non parametric> 1 Sample Wilcoxon …

Wilcoxon Signed Rank Test: Pos Skew

Test of Median = 63.00 versus Median not = 63.00

               N for   Wilcoxon           Estimated
            N   Test  Statistic      P       Median
Pos Skew  500    500   124015.0  0.000        67.83

As you can see the P-value is less than 0.05, so we reject the null hypothesis: the data support the alternative hypothesis that the Median is different from 63.

Mann-Whitney Example

❑ The Mann-Whitney test is used to test if the Medians for 2 samples are different.

❑ Determine if different machines have different Median cycle times.

Ho: M1 = M2

Ha: M1 ≠ M2

❑ There are 200 data points for each machine, well over the minimum sample
necessary

Mann-Whitney Example
First run a Normality Test…of course!

[Figure: Probability Plot of Mach A (Normal); Mean 15.24, StDev 5.379, N 200, AD 1.550, P-Value <0.005]

[Figure: Probability Plot of Mach B (Normal); Mean 16.73, StDev 5.284, N 200, AD 0.630, P-Value 0.099]

Looking at the probability plots, Mach A yields a P-value of less than .05, while Mach B yields a P-value greater than .05. So one data set is Non-normal and the other is Normal

Mann-Whitney Example
Now run the Mann-Whitney test; based on the results we determine that the Medians of the machines are different.

Stat>Nonparametric>Mann-Whitney…

Mann-Whitney Test and CI: Mach A, Mach B

          N   Median
Mach A  200   14.841
Mach B  200   16.346

Point estimate for ETA1-ETA2 is -1.604
95.0 Percent CI for ETA1-ETA2 is (-2.635,-0.594)
W = 36509.0
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.0019

Since zero is not contained within the confidence interval for the difference between the 2 Medians, we reject the null hypothesis. Also, the last line in the Session Window where it says … “is significant at 0.0019” is the equivalent of a P-value for the Mann-Whitney test
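
The reported significance level can be checked from W and the two sample sizes with the usual large-sample Normal approximation to the rank-sum statistic. The sketch below assumes that W is the rank sum of the first sample (Mach A), applies a continuity correction, and ignores any correction for ties; these are assumptions of the illustration, not statements about how MINITAB computes the value.

```python
import math
from statistics import NormalDist

# Values copied from the Mann-Whitney output for Mach A and Mach B
n1, n2 = 200, 200
w = 36509.0  # assumed to be the rank sum of the first sample (Mach A)

# Mean and standard deviation of W under Ho (no correction for ties)
mean_w = n1 * (n1 + n2 + 1) / 2
sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

# Continuity correction: move W half a unit toward its mean
correction = 0.5 if w < mean_w else -0.5
z = (w - mean_w + correction) / sd_w
p_value = 2 * NormalDist().cdf(-abs(z))

print(f"z = {z:.3f}, two-sided P = {p_value:.4f}")
```

Under these assumptions the result agrees with the 0.0019 shown in the Session Window.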
Mann-Whitney Example

❑ Example: A credit card company now understands there is no variability difference in customer calls/week for the two different credit card types. This means no difference in strategy of deploying the workforces. However, the credit card company wants to see if there is a difference in call volume between the two different card types. The company expects no difference since the total sales among the two credit card types are similar. A Black Belt was selected and told to evaluate with 95% confidence whether the averages were the same. The Black Belt reminded the credit card company that the calls/week data were not Normally distributed, so the comparison would have to use Medians, since Medians are used to describe the central tendency of Non-normal Populations

❑ Analyze the problem using the Hypothesis Testing roadmap.

❑ Is there a difference in call volume between the 2 different card types?

Mann-Whitney Example: Solution

❑ Since we know the data are Non-normal we can proceed to performing a Mann-Whitney Test

Stat>Nonparametrics>Mann-Whitney

Mann-Whitney Test and CI: CallsperWk1, CallsperWk2


N Median
CallsperWk1 22 739.0
CallsperWk2 105 770.0
Point estimate for ETA1-ETA2 is -26.5
95.0 Percent CI for ETA1-ETA2 is (-91.9,43.0)
W = 36509.0
Test of ETA1 = ETA2 vs ETA1 not = ETA2 is significant at 0.4580

Mann-Whitney Example: Solution

❑ The sample Medians differ (739.0 vs 770.0), but the difference is not statistically significant: the P-value (0.4580) is greater than 0.05 and the 95% confidence interval for ETA1-ETA2 (-91.9, 43.0) contains zero, so we fail to reject the null hypothesis

❑ Therefore, there is no evidence of a difference in call volume between the two different card types

Mood’s Median Test

❑ An aluminum company wanted to compare the operation of its three facilities worldwide.
They want to see if there is a difference in the recoveries among the three locations. A
Black Belt was asked to help management evaluate the recoveries at the locations with 95%
confidence.

❑ Ho: M1 = M2 = M3
Ha: at least one is different
Use the Mood’s Median test.

❑ Based on the smallest sample of 13, the test will be able to detect a difference close to 1.5

❑ Statistical Conclusions: Use the data in the columns named “Recovery” and “Location” in
the Minitab worksheet “Hypoteststud.mtw” for analysis

Mood’s Median Test Example: Solution

Stat>Basic Statistics>Graphical Summary…

Instead of using the Anderson-Darling test for Normality, this time we used the Graphical Summary method. It gives a P-value for Normality and allows a view of the data that the Normality test does not.

Summary for Recovery
Location = Savannah
[Figure: histogram of Recovery, 78 to 96, with 95% Confidence Intervals for Mean and Median]

Anderson-Darling Normality Test
A-Squared      0.81
P-Value        0.032

Mean           87.660
StDev          7.944
Variance       63.113
Skewness       -0.15286
Kurtosis       -1.11764
N              25

Minimum        75.300
1st Quartile   79.000
Median         87.500
3rd Quartile   96.550
Maximum        99.200

95% Confidence Interval for Mean     84.381   90.939
95% Confidence Interval for Median   86.179   90.080
95% Confidence Interval for StDev     6.203   11.052

Mood’s Median Test Example: Solution
Notice evidence of outliers in at least 2 of the 3 populations. You could use a Box Plot to get a clearer idea about Outliers.

Summary for Recovery
[Figure: histograms of Recovery for Bangor and Ankhar, 78 to 96, each with 95% Confidence Intervals for Mean and Median]

                                    Location = Bangor   Location = Ankhar
Anderson-Darling Normality Test
A-Squared                           0.72                0.86
P-Value                             0.045               0.022

Mean                                93.042              88.302
StDev                               5.918               6.929
Variance                            35.017              48.008
Skewness                            -1.81758            -0.105610
Kurtosis                            4.66838             0.182123
N                                   13                  20

Minimum                             76.630              73.500
1st Quartile                        90.600              85.150
Median                              94.800              88.425
3rd Quartile                        97.350              89.700
Maximum                             99.700              99.450

95% Confidence Interval for Mean    89.466  96.617      85.059  91.545
95% Confidence Interval for Median  90.637  97.036      86.735  89.299
95% Confidence Interval for StDev   4.243   9.768       5.269   10.120

Mood’s Median Test Example: Solution

Test for Equal Variances for Recovery (by Location: Ankhar, Bangor, Savannah)

Test             Test Statistic  P-Value
Bartlett's Test  1.33            0.514
Levene's Test    1.02            0.367

[Figure: 95% Bonferroni Confidence Intervals for StDevs by Location]
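
The Bartlett's Test statistic above can be recomputed from the group sizes and variances reported in the three Graphical Summaries for Recovery, using the pooled-variance form of the statistic defined earlier in this module. The sketch below is illustrative and uses only the standard library; the P-value shortcut exp(-T/2) is valid only because k = 3 groups gives a chi-square distribution with 2 degrees of freedom.

```python
import math

# (N_i, S_i^2) for each Location, copied from the Graphical Summaries of Recovery
groups = {
    "Savannah": (25, 63.113),
    "Bangor": (13, 35.017),
    "Ankhar": (20, 48.008),
}

k = len(groups)
n_total = sum(n for n, _ in groups.values())

# Pooled variance: group variances weighted by their degrees of freedom
pooled_var = sum((n - 1) * var for n, var in groups.values()) / (n_total - k)

numerator = (n_total - k) * math.log(pooled_var) - sum(
    (n - 1) * math.log(var) for n, var in groups.values()
)
denominator = 1 + (
    sum(1 / (n - 1) for n, _ in groups.values()) - 1 / (n_total - k)
) / (3 * (k - 1))
t_stat = numerator / denominator

# Chi-square upper-tail probability; the closed form exp(-x / 2) holds for df = 2 only
assert k - 1 == 2, "closed-form P-value below is valid for df = 2 only"
p_value = math.exp(-t_stat / 2)

print(f"Bartlett T = {t_stat:.2f}, P = {p_value:.3f}")
```

Since the P-value is well above 0.05, the variances of the three locations are treated as equal.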

Mood’s Median Test
Stat>NonParametrics > Moods Median [Session Output]…

Mood Median Test: Recovery versus Location

Mood median test for Recovery

Chi-Square = 12.11   DF = 2   P = 0.002

                                   Individual 95.0% CIs
Location  N<=  N>  Median  Q3-Q1   ---+---------+---------+---------+---
Ankhar     13   7    88.4    4.5   (-----*--)
Bangor      1  12    94.8    6.8   (-------------*------)
Savannah   15  10    87.5   17.6   (----*-------)
                                   ---+---------+---------+---------+---
                                   87.0      90.0      93.0      96.0
Overall median = 88.9

We observe the confidence intervals for the Medians of the 3 populations. Note that the 95% confidence interval for Bangor does not overlap the intervals for the other two locations, so we can see visually that the P-value is below 0.05.

Statistical Conclusion: Since the P-value of the Mood’s Median test is less than 0.05, we reject the null hypothesis.

Practical Conclusion: Bangor has the highest recovery of all three facilities.
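
The Chi-Square value in the Mood Median output is an ordinary chi-square statistic on the table of counts at or below (N<=) and above (N>) the overall median. A minimal standard-library sketch using the counts from the output above; the closed-form P-value exp(-x/2) applies only because DF = 2 here.

```python
import math

# Counts at or below (N<=) and above (N>) the overall median,
# copied from the Mood Median Test output
counts = {
    "Ankhar": (13, 7),
    "Bangor": (1, 12),
    "Savannah": (15, 10),
}

total_le = sum(le for le, _ in counts.values())
total_gt = sum(gt for _, gt in counts.values())
grand_total = total_le + total_gt

# Pearson chi-square on the (Location x below/above) contingency table
chi_square = 0.0
for le, gt in counts.values():
    row_total = le + gt
    for observed, column_total in ((le, total_le), (gt, total_gt)):
        expected = row_total * column_total / grand_total
        chi_square += (observed - expected) ** 2 / expected

df = len(counts) - 1
assert df == 2, "closed-form P-value below is valid for df = 2 only"
p_value = math.exp(-chi_square / 2)

print(f"Chi-Square = {chi_square:.2f}, DF = {df}, P = {p_value:.3f}")
```

Most of the statistic comes from Bangor, where only 1 of 13 observations falls at or below the overall median.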
Kruskal-Wallis Test

Using the same data set, analyze using the Kruskal-Wallis test.

Kruskal-Wallis Test: Recovery versus Location

Kruskal-Wallis Test on Recovery

Location   N  Median  Ave Rank      Z
Ankhar    20   88.43      27.3  -0.73
Bangor    13   94.80      40.2   2.60
Savannah  25   87.50      25.7  -1.49
Overall   58              29.5

H = 6.86  DF = 2  P = 0.032
H = 6.87  DF = 2  P = 0.032 (adjusted for ties)

When comparing the Kruskal-Wallis test to the Mood’s Median test, the Kruskal-Wallis test is the more powerful of the two. In this case the Test for Equal Variances showed the variances were equal, and the Kruskal-Wallis test led to the same conclusion.

This output is the “least friendly” to interpret. Look for the P-value, which is below 0.05 and tells us we reject the null hypothesis. We have the same conclusion as with the Mood’s Median test.
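
The H statistic can be approximated from the N and Ave Rank columns of the output. Because the average ranks are printed to one decimal only, the value recomputed below (about 6.82) differs slightly from the reported 6.86; the sketch is meant to show where H comes from, not to reproduce the session output exactly. The P-value shortcut exp(-H/2) again relies on DF = 2.

```python
import math

# (N, average rank) per Location, copied from the Kruskal-Wallis output.
# The average ranks are rounded to one decimal, so H is only approximate.
groups = {
    "Ankhar": (20, 27.3),
    "Bangor": (13, 40.2),
    "Savannah": (25, 25.7),
}

n_total = sum(n for n, _ in groups.values())
overall_rank = (n_total + 1) / 2  # 29.5, the "Overall" Ave Rank

# H measures how far each group's average rank sits from the overall average rank
h_stat = 12 / (n_total * (n_total + 1)) * sum(
    n * (avg_rank - overall_rank) ** 2 for n, avg_rank in groups.values()
)

df = len(groups) - 1
assert df == 2, "closed-form P-value below is valid for df = 2 only"
p_value = math.exp(-h_stat / 2)

print(f"H = {h_stat:.2f}, DF = {df}, P = {p_value:.3f}")
```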
Unequal Variance

❑ Where do you go in the roadmap if the variance is not equal?

▪ Unequal variances are usually the result of differences in the shape of the
distribution

▪ Extreme tails

▪ Outliers

▪ Multiple modes

❑ These conditions should be explored through data demographics

❑ For Skewed Distributions with comparable Medians, it is unusual for the variances to
be different without some assignable cause impacting the process

Check For Normality
Check for normality using Stat > Basic Statistics > Normality….

Model A and Model B are similar in nature (not exact) and are manufactured in the same plant

Probability Plot (Normal)  Mean   StDev   N   AD     P-Value
Model A                    10.28  0.7028  10  0.227  0.747
Model B                    2.826  3.088   10  0.753  0.033

Model A is Normal, Model B is Non-normal

Check for equal Variance
Now let’s check for Equal Variances using Levene’s test. Remember, you first need to stack the data so you can run this test.

Test for Equal Variances for Data (grouped by idvar: Model A, Model B)

Test           Test Statistic  P-Value
F-Test         0.05            0.000
Levene's Test  4.47            0.049

[Figure: 95% Bonferroni Confidence Intervals for StDevs, and Box Plots of Data by idvar]

The Levene’s Test P-value (0.049) is just under the limit of .05. Whenever the result is borderline, as in this case, use your process knowledge to make a judgment.

Plot the data to explore and explain the differences

Let’s look at data demographics for clues

[Figure: Graphical Summaries for Model A (histogram 9.0 to 11.5) and Model B (histogram 0 to 10), each with 95% Confidence Intervals for Mean and Median]

                                    Summary for Model A   Summary for Model B
Anderson-Darling Normality Test
A-Squared                           0.23                  0.75
P-Value                             0.747                 0.033

Mean                                10.279                2.8260
StDev                               0.703                 3.0882
Variance                            0.494                 9.5370
Skewness                            0.330968              1.29887
Kurtosis                            -0.614597             0.92377
N                                   10                    10

Minimum                             9.213                 0.2253
1st Quartile                        9.779                 0.3488
Median                              10.111                1.7773
3rd Quartile                        10.816                5.5508
Maximum                             11.496                9.4440

95% Confidence Interval for Mean    9.776   10.782        0.6169  5.0352
95% Confidence Interval for Median  9.767   10.848        0.3465  5.5873
95% Confidence Interval for StDev   0.483   1.283         2.1242  5.6379

Graph> Dotplot> Multiple Y’s, Simple

[Figure: Dotplot of Model A, Model B on a common Data axis from 0.0 to 11.2]

Confidence Interval

Why Confidence Interval?
❑ Sample statistics such as the mean, standard deviation and proportion (x̄, s, p) are only estimates of the population parameters (μ, σ, and P)

❑ Since there is variability in these estimates from sample to sample, we can quantify the uncertainty using confidence intervals

❑ Confidence intervals provide us with a range in which population parameters are likely to fall

What Is A Confidence Interval?
A Graphical View
❑ A 95% confidence interval suggests that, if we repeated the sampling many times, approximately 95 out of 100 confidence intervals would contain the population parameter

❑ Confidence level = 1-α

❑ 1-α is called the probability content or level of confidence

❑ Alpha (α) is known as the significance level; the probability of being wrong (risk level)

[Figure: a Confidence Interval centered on the Sample Mean and covering the Population Mean]

Central Limit Theorem
SE Mean = $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$

$\sigma_{\bar{x}}$ = Standard Error of the Mean
σ = Standard Deviation for the Individual Scores
n = Sample Size for the mean

[Figure: histogram (Frequency) of the Population Distribution and, below it, the narrower histogram of the Sample Means Distribution, both on a 30 to 100 scale]
Point and Interval Estimates
.
❑ A point estimate is a single We can estimate a with a Sample Statistic
Population Parameter (a Point Estimate)
number, and a confidence
interval provides additional Mean μ X
information about the
variability of the estimate Proportion P p

Lower Upper
Confidence Confidence
Point Estimate Limit
Limit
Width of
confidence interval

Point and Interval Estimates

❑ How much uncertainty is associated with a point estimate of a population parameter?

❑ An interval estimate provides more information about a population characteristic than
does a point estimate. Such interval estimates are called confidence intervals

❑ The general formula for all confidence intervals is:

Point Estimate ± (Critical Value)(Standard Error)


Where:

Point Estimate is the sample statistic estimating the population parameter of interest

Critical Value is a table value based on the sampling distribution of the point estimate and
the desired confidence level

Standard Error is the standard deviation of the point estimate


Confidence Intervals on mean with known σ

[Diagram: Confidence Intervals divide into Population Mean (σ Known or σ Unknown) and Population Proportion]

Confidence Interval for μ (σ Known)
❑ Assumptions

Population standard deviation σ is known

Population is normally distributed

❑ If population is not normal, use large sample (n > 30)

❑ Confidence interval estimate:

$$\bar{X} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

where X̄ is the point estimate
Zα/2 is the normal distribution critical value for a probability of α/2 in each tail
σ/√n is the standard error

Finding the Critical Value, Zα/2
Consider a 95% confidence interval: Zα/2 = 1.96

1 − α = 0.95, so α = 0.05 and α/2 = 0.025 in each tail

[Figure: standard normal curve with α/2 = 0.025 in each tail. Z units: −Zα/2 = −1.96, 0, Zα/2 = 1.96. X units: Lower Confidence Limit, Point Estimate, Upper Confidence Limit]

Common Levels of Confidence

Confidence Level    Confidence Coefficient, 1 − α    Zα/2 value
80%                 0.80                             1.28
90%                 0.90                             1.645
95%                 0.95                             1.96
98%                 0.98                             2.33
99%                 0.99                             2.58
99.8%               0.998                            3.09
99.9%               0.999                            3.29
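
The Zα/2 column can be reproduced from the standard normal distribution. A minimal sketch using Python's standard library follows; the helper name `z_critical` is ours, not part of the course material.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Two-sided standard normal critical value Z_{alpha/2}."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.80, 0.90, 0.95, 0.98, 0.99, 0.998, 0.999):
    print(f"{level:.1%}  Z = {z_critical(level):.3f}")
```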

Intervals and Level of Confidence

Sampling Distribution of the Mean

[Figure: sampling distribution of the mean centered at μx̄ = μ, with area α/2 in each tail and 1 − α in the middle; below it, confidence intervals built around sample means x̄1, x̄2, ...]

Intervals extend from

$$\bar{X} - Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \quad\text{to}\quad \bar{X} + Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

(1 − α)100% of intervals constructed contain μ; (α)100% do not.

Example
❑ A sample of 11 circuits from a large normal population has a mean resistance of 2.20
ohms. We know from past testing that the population standard deviation is 0.35 ohms

❑ Determine a 95% confidence interval for the true mean resistance of the population

Solution

$$\bar{X} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\left(\frac{0.35}{\sqrt{11}}\right) = 2.20 \pm 0.2068$$

$$1.9932 \le \mu \le 2.4068$$

Interpretation

We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms.
Although the true mean may or may not be in this interval, 95% of intervals formed in
this manner will contain the true mean
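
The resistance example can be reproduced in a few lines. This is a sketch with a helper name of our own choosing (`z_interval`); it computes the critical value from the standard normal distribution rather than reading it from a table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence_level=0.95):
    """Confidence interval for the mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / sqrt(n)          # critical value * standard error
    return x_bar - margin, x_bar + margin

lower, upper = z_interval(x_bar=2.20, sigma=0.35, n=11)
print(f"{lower:.4f} <= mu <= {upper:.4f}")
```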

Confidence Intervals on mean with unknown σ

[Diagram: Confidence Intervals divide into Population Mean (σ Known or σ Unknown) and Population Proportion]

Confidence Interval for μ (σ unknown)

❑ Assumptions

Population standard deviation is unknown

Population is normally distributed

If population is not normal, use large sample (n > 30)

❑ We use the Student’s t distribution instead of the normal distribution because it factors in
the greater uncertainty associated with small sample sizes

$$\bar{X} \pm t_{\alpha/2}\,\frac{s}{\sqrt{n}}$$

where tα/2 is the critical value of the t distribution with n − 1 degrees of freedom and
an area of α/2 in each tail

Student’s t Distribution
❑ The t is a family of distributions

❑ The tα/2 value depends on degrees of freedom (d.f.)

❑ Degrees of freedom are the number of observations that are free to vary after the
sample mean has been calculated

d.f. = n − 1

Degrees of Freedom (df)
Idea: Number of observations that are free to vary after the sample mean has been calculated

Example: Suppose the mean of 3 numbers is 8.0

Let X1 = 7
Let X2 = 8
What is X3?

If the mean of these three values is 8.0, then X3 must be 9 (i.e., X3 is not free to vary)

Here, n = 3, so degrees of freedom = n – 1 = 3 – 1 = 2

(2 values can be any numbers, but the third is not free to vary for a given mean)

Student’s t Distribution

t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal

[Figure: density curves for the standard normal (t with df = ∞), t (df = 13) and t (df = 5), all centered at t = 0]

Student’s t table

Upper Tail Area

df    .10      .05      .025
1     3.078    6.314    12.706
2     1.886    2.920    4.303
3     1.638    2.353    3.182

The body of the table contains t values, not probabilities

Let: n = 3
df = n − 1 = 2
α = 0.10
α/2 = 0.05

[Figure: t distribution with upper tail area α/2 = 0.05 to the right of t = 2.920]

Selected t distribution values
With comparison to the Z value

Confidence Level    t (10 d.f.)    t (20 d.f.)    t (30 d.f.)    Z (∞ d.f.)
0.80                1.372          1.325          1.310          1.28
0.90                1.812          1.725          1.697          1.645
0.95                2.228          2.086          2.042          1.96
0.99                3.169          2.845          2.750          2.58

Note: t → Z as n increases

Example

❑ A random sample of n = 25 has X̄ = 50 and S = 8. Form a 95% confidence interval for μ

❑ d.f. = n – 1 = 24, so tα/2 = t0.025 = 2.0639

The confidence interval is

Solution

$$\bar{X} \pm t_{\alpha/2}\,\frac{S}{\sqrt{n}} = 50 \pm (2.0639)\frac{8}{\sqrt{25}}$$

$$46.698 \le \mu \le 53.302$$

Interpretation

Interpreting this interval requires the assumption that the population you are sampling from
is approximately normally distributed (especially since n is only 25).
This condition can be checked by creating a normal probability plot or boxplot
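
The same calculation as a short sketch. Python's standard library has no t quantile function, so the critical value 2.0639 is copied from the example rather than computed; the helper name `t_interval` is ours.

```python
from math import sqrt

# t critical value for 95% confidence and 24 d.f., copied from the example above
T_0025_DF24 = 2.0639

def t_interval(x_bar, s, n, t_crit):
    """Confidence interval for the mean when sigma is estimated by s."""
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin

lower, upper = t_interval(x_bar=50, s=8, n=25, t_crit=T_0025_DF24)
print(f"{lower:.3f} <= mu <= {upper:.3f}")
```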

Confidence Intervals for the Population Proportion, P

[Diagram: Confidence Intervals divide into Population Mean (σ Known or σ Unknown) and Population Proportion]

Confidence Interval for the Population Proportion, P

❑ Recall that the distribution of the sample proportion is approximately normal if the sample
size is large, with standard deviation

$$\sigma_{\hat{p}} = \sqrt{\frac{P(1 - P)}{n}}$$

❑ We will estimate this with sample data:

$$\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

Confidence Interval for the Population Proportion, P

❑ Upper and lower confidence limits for the population proportion are calculated with the
formula

$$\hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$$

where Zα/2 is the standard normal value for the level of confidence desired

p̂ is the sample proportion

n is the sample size

Note: must have np > 5 and n(1 − p) > 5

Example

❑ A random sample of 100 people shows that 25 are left-handed.

❑ Form a 95% confidence interval for the true proportion of left-handers

Solution

$$\hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} = \frac{25}{100} \pm 1.96\sqrt{\frac{0.25(0.75)}{100}} = 0.25 \pm 1.96(0.0433)$$

$$0.1651 \le P \le 0.3349$$

Interpretation

We are 95% confident that the true percentage of left-handers in the population is between
16.51% and 33.49%. Although the interval from 0.1651 to 0.3349 may or may not contain the
true proportion, 95% of intervals formed from samples of size 100 in this manner will
contain the true proportion
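
A minimal sketch of the left-hander calculation, including the np > 5 and n(1 − p) > 5 rule of thumb as a guard. The function name `proportion_interval` is an illustrative choice, not something defined in the course material.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence_level=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # Rule of thumb from the slides: need n*p > 5 and n*(1 - p) > 5
    if n * p_hat <= 5 or n * (1 - p_hat) <= 5:
        raise ValueError("sample too small for the normal approximation")
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

lower, upper = proportion_interval(successes=25, n=100)
print(f"{lower:.4f} <= P <= {upper:.4f}")
```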

Sample Size

Distinguishing between Two Samples

❑ Recall from the Central Limit Theorem that as the number of individual observations
increases, the Standard Error decreases

❑ In this example, when n = 2 we cannot distinguish the difference between the
Means (> 5% overlap, P-value > 0.05)

❑ When n = 30, we can distinguish between the Means (< 5% overlap, P-value < 0.05).
There is a significant difference

[Figure: theoretical distributions of Means separated by δ = 5 with S = 1; one panel when n = 2, one panel when n = 30]

Delta Sigma—The Ratio between δ and S
❑ Delta (δ) is the size of the difference between two Means or one Mean and a target value

❑ Sigma (S) is the sample Standard Deviation of the distribution of individuals of one or
both of the samples under question

❑ When δ is large relative to S, we don’t need statistics because the differences are so large

❑ If the variance of the data is large, it is difficult to establish differences. We need
larger sample sizes to reduce uncertainty

[Figure: two pairs of distributions, one labeled Large Delta (δ) and one labeled Large S]

The Perfect Sample Size
Question: “How many samples should we take?”
Answer: “Well, that depends on the size of your delta and Standard Deviation”

Question: “How should we conduct the sampling?”

Answer: “Well, that depends on what you want to know”

Question: “Was the sample we took large enough?”

Answer: “Well, that depends on the size of your delta and Standard Deviation”

Question: “Should we take some more samples just to be sure?”

Answer: “No, not if you took the correct number of samples the first time!”

The Perfect Sample Size

❑ The perfect sample size is the minimum sample size required to provide exactly 5% overlap
(risk) in order to distinguish the Delta

❑ Note: If you are working with Non-normal Data, multiply your calculated sample size by 1.1
(this is based on recommendations by multiple studies)

[Figure: two overlapping population distributions; x-axis 40 to 70]
Determining Sample Size

[Diagram: Determining Sample Size divides into For the Mean and For the Proportion]

Sampling Error

❑ The required sample size can be found to reach a desired margin of error (e)
with a specified level of confidence (1 - α)

❑ The margin of error is also called sampling error: the amount of imprecision in
the estimate of the population parameter, or the amount added and subtracted
to the point estimate to form the confidence interval

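Determining Sample Size

For the Mean: the margin of error e is the amount added and subtracted in the confidence interval

$$\bar{X} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \qquad e = Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$

Solving for n gives

$$n = \frac{Z_{\alpha/2}^{2}\,\sigma^{2}}{e^{2}}$$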

Determining Sample Size

❑ To determine the required sample size for the mean, you must know:

1) The desired level of confidence (1 - α), which determines the critical value,
Zα/2

2) The acceptable sampling error, e

3) The standard deviation, σ

Required Sample Size Example

If σ = 45, what sample size is needed to estimate the mean within ± 5 with 90% confidence?

$$n = \frac{Z_{\alpha/2}^{2}\,\sigma^{2}}{e^{2}} = \frac{(1.645)^{2}(45)^{2}}{5^{2}} = 219.19$$

So the required sample size is n = 220

(Always round up)
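
The rounding-up rule is easy to get wrong by hand, so here is a small sketch of the calculation. The function name `sample_size_for_mean` is ours; the critical value is computed from the standard normal distribution instead of being read from a table.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, e, confidence_level):
    """Smallest n such that Z_{alpha/2} * sigma / sqrt(n) <= e."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return ceil((z * sigma / e) ** 2)     # always round up

n = sample_size_for_mean(sigma=45, e=5, confidence_level=0.90)
print(n)
```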

If σ is unknown

❑ If unknown, σ can be estimated when using the required sample size formula

❑ Use a value for σ that is expected to be at least as large as the true σ

❑ Select a pilot sample and estimate σ with the sample standard deviation, S

Determining Sample Size

For the Proportion: the margin of error e is the amount added and subtracted in the confidence interval

$$\hat{p} \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}(1 - \hat{p})}{n}} \qquad e = Z_{\alpha/2}\sqrt{\frac{p(1 - p)}{n}}$$

Solving for n gives

$$n = \frac{Z_{\alpha/2}^{2}\,p(1 - p)}{e^{2}}$$

Another approach to choosing n uses the fact that the sample size is always largest
for p = 0.5 [that is, p(1 − p) ≤ 0.25 with equality for p = 0.5], and this can be used to obtain an
upper bound on n. In other words, we are at least 100(1 – α)% confident that the error in
estimating p by p̂ is less than e if the sample size is

$$n = \frac{Z_{\alpha/2}^{2}}{e^{2}}\,(0.25)$$

Determining Sample Size

❑ To determine the required sample size for the proportion, you must know:

1) The desired level of confidence (1 - α), which determines the critical value,
Zα/2

2) The acceptable sampling error, e

3) The true proportion of events of interest, p

4) p can be estimated with a pilot sample if necessary (or conservatively use
0.5 as an estimate of p)

Required Sample Size Example

How large a sample would be necessary to estimate the true proportion of defectives in a
large population within ±3%, with 95% confidence?
(Assume a pilot sample yields p̂ = 0.12)

Solution:
For 95% confidence, use Zα/2 = 1.96, e = 0.03
p̂ = 0.12, so use this to estimate p

$$n = \frac{Z_{\alpha/2}^{2}\,p(1 - p)}{e^{2}} = \frac{(1.96)^{2}(0.12)(1 - 0.12)}{(0.03)^{2}} = 450.74$$

So use n = 451
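
A sketch of the proportion sample size, with the conservative p = 0.5 case as the default. The function name and its default argument are our own choices for illustration.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(e, confidence_level, p=0.5):
    """Required n for margin of error e; p=0.5 gives the conservative upper bound."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return ceil(z ** 2 * p * (1 - p) / e ** 2)

n_pilot = sample_size_for_proportion(e=0.03, confidence_level=0.95, p=0.12)
n_worst = sample_size_for_proportion(e=0.03, confidence_level=0.95)
print(n_pilot, n_worst)
```

Comparing the two results shows how much a pilot estimate of p can reduce the required sample relative to the worst case.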

Proportion data - Example

❑ Laura was looking at the percentage of duplicate payments. She randomly sampled 50
payments and discovered that four were duplicated or defective. She wants to estimate
the defect rate of the overall payments population to within plus or minus 2% with 95%
confidence. If she uses the defect percentage of her sample, calculate the sample size
she would need

A. What sample size is needed based on the information above?

B. Once she sees the number, she indicates she is uncertain about the defect
rate. Calculate the sample size needed with an unknown defect rate

C. After seeing the sample sizes needed, Laura is concerned about never being
able to go to Hawaii again. What could you suggest?

Proportion data - Example

❑ N = 50, C = 4, E = 2%, P(1 − P) = (4/50)(46/50) = 0.0736

A. What sample size is needed based on the information above?

$$n = \frac{Z_{\alpha/2}^{2}\,p(1 - p)}{e^{2}} = \frac{1.96^{2} \times 0.0736}{0.02^{2}} = 706.85 \Rightarrow 707$$

Proportion data - Example

❑ N = 50, C = 4, E = 2%, P(1 − P) = (4/50)(46/50) = 0.0736

B. Once she sees the number, she indicates she is uncertain about the defect rate.
Calculate the sample size needed with an unknown defect rate

$$n = \frac{Z_{\alpha/2}^{2}\,(0.25)}{e^{2}} = \frac{1.96^{2} \times 0.25}{0.02^{2}} = 2401$$

C. After seeing the sample sizes needed, Laura is concerned about never being
able to go to Hawaii again. What could you suggest?

Reduce the confidence level needed, or accept less precision (a larger margin of error) around the population proportion
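
To make the trade-off in part C concrete, the sketch below tabulates the conservative sample size for a few confidence levels and margins of error. The particular levels and margins are illustrative choices, and `worst_case_n` is a name we introduce here.

```python
from math import ceil
from statistics import NormalDist

def worst_case_n(e, confidence_level):
    """Conservative sample size for a proportion, using p(1 - p) = 0.25."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return ceil(z ** 2 * 0.25 / e ** 2)

for confidence_level in (0.95, 0.90):
    for e in (0.02, 0.03, 0.05):
        n = worst_case_n(e, confidence_level)
        print(f"confidence {confidence_level:.0%}, e = {e:.0%}: n = {n}")
```

Widening the margin of error has the larger effect because n falls with the square of e.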

Determining Sample Size Of Attribute Data

❑ To determine the required sample size for count (defect) data, you must know:

1) The desired level of confidence (1 - α), which determines the critical value,
Zα/2

2) The acceptable sampling error, e

3) The average number of defects of interest, C̄

$$n = \frac{\bar{C}\,Z_{\alpha/2}^{2}}{e^{2}}$$

Attribute Data - Example

❑ Jeri is looking at the number of claim line defects. There is no prior history on
this, so she takes a random sample of 100 claim lines and determines that the
average number of defects is 72. She wants to be 95% confident of the
overall population average, plus or minus 3 lines

❑ Was her sample of 100 adequate to estimate the overall c?

$$n = \frac{\bar{C}\,Z_{\alpha/2}^{2}}{e^{2}} = \frac{72 \times 1.96^{2}}{3^{2}} = 30.7 \Rightarrow 31$$

Yes: the required sample size of 31 is smaller than the 100 claim lines she sampled

Attribute Data - Example

❑ Jennifer has already completed her first project. She is now analyzing a
suggestion to reduce the number of cell phones the company pays for. While
there is a report from the phone company about the number of calls per cell
phone, Jennifer knows she needs to verify the data on the report for her
Measurement System Analysis
❑ What size sample does she need to be 95% confident in the GRR accuracy if the
average number of calls per cell phone per month is 32 and she wants to be within
+/- 5 calls?

❑ Jennifer says this is great news! I can afford to be more accurate. How about +/-
2 calls? What will you tell her?

Attribute Data - Example

❑ What size sample does she need to be 95% confident in the GRR accuracy if the
average number of calls per cell phone per month is 32 and she wants to be within
+/- 5 calls?

$$n = \frac{\bar{C}\,Z_{\alpha/2}^{2}}{e^{2}} = \frac{32 \times 1.96^{2}}{5^{2}} = 4.92 \Rightarrow 5$$

❑ Jennifer says this is great news! I can afford to be more accurate. How about +/-
2 calls? What will you tell her?

$$n = \frac{\bar{C}\,Z_{\alpha/2}^{2}}{e^{2}} = \frac{32 \times 1.96^{2}}{2^{2}} = 30.73 \Rightarrow 31$$
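
The count-data formula applied to the claim-line and cell-phone examples, as a minimal sketch. The function and variable names are ours, and Z is fixed at the 1.96 used on the slides.

```python
from math import ceil

Z_95 = 1.96   # critical value for 95% confidence, as used in the slides

def sample_size_for_count(c_bar, e, z=Z_95):
    """n = C-bar * Z^2 / e^2, rounded up."""
    return ceil(c_bar * z ** 2 / e ** 2)

claims = sample_size_for_count(c_bar=72, e=3)     # Jeri's claim lines
calls_5 = sample_size_for_count(c_bar=32, e=5)    # Jennifer, within +/- 5 calls
calls_2 = sample_size_for_count(c_bar=32, e=2)    # Jennifer, within +/- 2 calls
print(claims, calls_5, calls_2)
```

Tightening the margin from 5 calls to 2 calls multiplies the required sample by roughly (5/2)², which is the point of Jennifer's second question.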

Sample Size - Summary

Continuous: $n = \frac{Z_{\alpha/2}^{2}\,\sigma^{2}}{e^{2}}$

Proportions: $n = \frac{Z_{\alpha/2}^{2}\,p(1 - p)}{e^{2}}$

Count: $n = \frac{\bar{C}\,Z_{\alpha/2}^{2}}{e^{2}}$

❑ We often do not have historical defect data, so we begin by taking a sample of
100 for attribute data and at least 30 for continuous data to get an estimate

❑ When the required sample sizes are too large to investigate, we can decrease our
confidence level or increase the acceptable amount of error to get a more reasonable n

Analysis of Variance (ANOVA)

ANOVA

❑ Analysis of Variance (ANOVA) is used to investigate and model the relationship
between a response variable and one or more independent variables

❑ Analysis of Variance extends the two-sample t-test for testing the equality of two
population Means to a more general null hypothesis that more than two Means are
all equal, versus them not all being equal

❑ The classification variable, or factor, usually has three or more levels (If there are
only two levels, a t-test can be used)

❑ Allows you to examine differences among means using multiple comparisons

❑ The ANOVA test statistic is:

$$F = \frac{\text{Avg SS}_{\text{between}}}{\text{Avg SS}_{\text{within}}} = \frac{S^{2}_{\text{between}}}{S^{2}_{\text{within}}}$$

What do we want to know?

❑ Is the between group variation large enough to be distinguished from the within
group variation?

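[Figure: two groups of individual observations (X) centered on μ1 and μ2. The distance δ between the group means is the Between Group Variation, the spread within each group (for example, the level of supplier 1) is the Within Group Variation, and the spread of all observations together is the Total (Overall) Variation]

Calculating ANOVA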
Where:

g = the number of groups (levels in the study)

xij = the ith individual in the jth group

nj = the number of individuals in the jth group or level

X̿ = the grand Mean

X̄j = the Mean of the jth group or level

Between Group Variation + Within Group Variation = Total Variation

$$\sum_{j=1}^{g} n_j\,(\bar{X}_j - \bar{\bar{X}})^{2} \;+\; \sum_{j=1}^{g}\sum_{i=1}^{n_j} (X_{ij} - \bar{X}_j)^{2} \;=\; \sum_{j=1}^{g}\sum_{i=1}^{n_j} (X_{ij} - \bar{\bar{X}})^{2}$$

Alpha Risk and Pair-Wise t-tests

❑ The alpha risk increases as the number of Means increases with a pair-wise t-test
scheme. The overall alpha risk when testing more than one pair of Means using t-tests is:

$$1 - (1 - \alpha)^{k}$$

where k = number of pairs of means

so, for 7 pairs of means and an α = 0.05:

$$1 - (1 - 0.05)^{7} = 0.30$$

or 30% alpha risk
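
The inflation of alpha risk can be tabulated directly. The sketch below assumes the k tests are independent, which is what the formula above implies; the function name `familywise_alpha` is ours.

```python
from math import comb

def familywise_alpha(k, alpha=0.05):
    """Chance of at least one false alarm across k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k

# 7 pairs of means, as in the example above
print(round(familywise_alpha(7), 2))

# Comparing every pair among 4 means requires C(4, 2) pair-wise tests
pairs_for_4_groups = comb(4, 2)
print(pairs_for_4_groups, round(familywise_alpha(pairs_for_4_groups), 2))
```

This is the motivation for running a single ANOVA instead of many pair-wise t-tests.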

Comparison Of Means

❑ “Are the means of the populations (1, 2, 3, 4) equal, or are there statistically
significant differences?”

[Figure: four population distributions compared: 1 vs. 2 vs. 3 vs. 4]

❑ These populations represent the levels of a factor

❑ Use samples to make inferences about the populations

Example

❑ The Sigma Finance Company is attempting to improve the time it takes to process
forms. The team believes there is a difference in the form cycle time between the
four processing centers

Center 1    Center 2    Center 3    Center 4
62          63          68          56
60          67          66          62
63          71          72          60
59          64          67          61
            65          68          63
            66          68          64
                                    63
                                    59

ANOVA Table In MINITAB
One-way ANOVA: Center 1, Center 2, Center 3, Center 4

Analysis of Variance
Source DF SS MS F P
Factor 3 228.00 76.00 13.57 0.000
Error 20 112.00 5.60
Total 23 340.00
Individual 95% CIs For Mean
Based on Pooled StDev
Level N Mean StDev ---+---------+---------+---------+---
Center 1 4 61.000 1.826 (------*------)
Center 2 6 66.000 2.828 (-----*----)
Center 3 6 68.000 1.673 (----*-----)
Center 4 8 61.000 2.619 (----*----)
---+---------+---------+---------+---
Pooled StDev = 2.366 59.5 63.0 66.5 70.0

The Concept

[Figure: the Center 1 to Center 4 data table shown three times, highlighting Variation Within (error), Variation Between (Factor) and Total Variation]

            Center 1    Center 2    Center 3    Center 4
AVG         61          66          68          61
Stdev       1.82        2.82        1.67        2.62

Total Variation = Variation Between + Variation Within

The Formulas
[Figure: the Center 1 to Center 4 data table shown three times, highlighting Variation Within (error), Variation Between (Factor) and Total Variation; grand mean Y̿ = 64]

$$SS_{\text{Error (Within)}} = \sum_{j=1}^{4} (n_j - 1)\,S_j^{2}$$

$$SS_{\text{Factor (Between)}} = \sum_{j=1}^{4} n_j\,(\bar{y}_j - \bar{\bar{y}})^{2}$$

Total Variation = Variation Between + Variation Within

SST = SSb (Factor) + SSe (Error)

How it works
Variation Between (Factor)

        Center 1    Center 2    Center 3    Center 4
ȳj      61          66          68          61
sj²     3.33        7.95        2.79        6.85
nj      4           6           6           8

Grand mean Y̿ = 64

$$SS_{\text{Error (Within)}} = SS_e = \sum_{j=1}^{4} (n_j - 1)\,S_j^{2}$$

$$SS_{\text{Factor (Between)}} = SS_b = \sum_{j=1}^{4} n_j\,(\bar{y}_j - \bar{\bar{y}})^{2}$$

Analysis of Variance
Source    DF    SS        MS       F        P
Factor    3     228.00    76.00    13.57    0.000
Error     20    112.00    5.60
Total     23    340.00

Mean Sum Of Squares

$$MS_{\text{Factor (Between)}} = MS_b = \frac{\sum_{j=1}^{4} n_j\,(\bar{y}_j - \bar{\bar{y}})^{2}}{\text{number of factor levels} - 1} = \frac{36 + 24 + 96 + 72}{3} = \frac{228}{3} = 76$$

$$MS_{\text{Error (Within)}} = MS_e = \frac{\sum_{j=1}^{4} (n_j - 1)\,S_j^{2}}{\sum_{j=1}^{4} (n_j - 1)} = \frac{9.99 + 39.75 + 13.95 + 48.02}{20} \approx \frac{112}{20} = 5.60$$

F Calculated

$$F = \frac{MS_b}{MS_e} = \frac{76}{5.6} = 13.57$$

Analysis of Variance
Source    DF    SS        MS       F        P
Factor    3     228.00    76.00    13.57    0.000
Error     20    112.00    5.60
Total     23    340.00
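
The sums of squares, mean squares and F value can be rebuilt from the per-center summary statistics alone. The sketch below uses the N, Mean and StDev values reported in the MINITAB output for the four centers; the variable names are ours, and the P-value is not computed because the standard library has no F distribution.

```python
# Summary statistics per center, taken from the MINITAB output: (n, mean, stdev)
groups = {
    "Center 1": (4, 61.0, 1.826),
    "Center 2": (6, 66.0, 2.828),
    "Center 3": (6, 68.0, 1.673),
    "Center 4": (8, 61.0, 2.619),
}

total_n = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / total_n

# Between-group (Factor) and within-group (Error) sums of squares
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups.values())

df_between = len(groups) - 1
df_within = total_n - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_calc = ms_between / ms_within

print(f"grand mean = {grand_mean:.1f}")
print(f"SSb = {ss_between:.2f}, SSe = {ss_within:.2f}")
print(f"MSb = {ms_between:.2f}, MSe = {ms_within:.2f}, F = {f_calc:.2f}")
```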

ANOVA Table
One-way ANOVA: Center 1, Center 2, Center 3, Center 4

Analysis of Variance
Source DF SS MS F P
Factor 3 228.00 76.00 13.57 0.000
Error 20 112.00 5.60
Total 23 340.00
Individual 95% CIs For Mean
Based on Pooled StDev
Level N Mean StDev ---+---------+---------+---------+---
Center 1 4 61.000 1.826 (------*------)
Center 2 6 66.000 2.828 (-----*----)
Center 3 6 68.000 1.673 (----*-----)
Center 4 8 61.000 2.619 (----*----)
---+---------+---------+---------+---
Pooled StDev = 2.366 59.5 63.0 66.5 70.0

Conclusion

❑ The Sigma Finance Company has made a decision to outsource its contracting
function. Four companies have been identified and one of the criteria is the time
in which they close contracts

❑ Are any of the vendors significantly better than the others in average time and
consistency with at least 95% confidence?

▪ Since the p-value ≤ the significance level (α), we reject the null hypothesis (H0) and
conclude there is a difference between the four companies

How To Set Up In MINITAB

Follow the hypothesis test roadmap!


Main Effects Plot

[Figure: Main Effects Plot - Data Means for Stacked. x-axis: Center (Center 1 to Center 4); y-axis: Stacked (61 to 68); reference line at the Grand Average]

Three Samples Example

❑ We have three potential suppliers that claim to have equal levels of quality.
Supplier B provides a considerably lower purchase price than either of the other
two vendors. We would like to choose the lowest cost supplier but we must ensure
that we do not affect the quality of our raw material.

We would like to test the data to determine whether there is a difference between the three suppliers

Test for Normality
❑ All three suppliers’ samples are Normally Distributed

Supplier A (P-value 0.568), Supplier B (P-value 0.385), Supplier C (P-value 0.910)

[Figure: normal probability plots for Supplier A, Supplier B and Supplier C]

Probability Plot    Mean     StDev     N    AD       P-Value
Supplier A          3.664    0.4401    5    0.246    0.568
Supplier B          3.968    0.2051    5    0.314    0.385
Supplier C          4.03     0.4177    5    0.148    0.910

Test for Equal Variance

❑ Test for Equal Variance (Must stack data to create “Response” & “Factors”):

Test for Equal Variances for Data

Bartlett's Test: Test Statistic 2.11, P-Value 0.348
Levene's Test: Test Statistic 0.59, P-Value 0.568

[Figure: 95% Bonferroni Confidence Intervals for StDevs, one interval each for Supplier A, Supplier B and Supplier C; x-axis 0.0 to 1.8]

ANOVA MINITAB
Stat>ANOVA>One-Way Unstacked

Enter Stacked Supplier data in “Responses:”

Click on “Graphs…”,
Check “Boxplots of data”

ANOVA MINITAB
What does this graph tell us?

[Figure: Boxplot of Supplier A, Supplier B, Supplier C; y-axis: Data, 3.0 to 4.6]

ANOVA Session Window
Stat>ANOVA>One Way (unstacked)

One-way ANOVA: Supplier A, Supplier B, Supplier C

Source  DF     SS     MS     F      P
Factor   2  0.384  0.192  1.40  0.284
Error   12  1.641  0.137
Total   14  2.025

S = 0.3698   R-Sq = 18.95%   R-Sq(adj) = 5.44%

Level       N    Mean   StDev
Supplier A  5  3.6640  0.4401
Supplier B  5  3.9680  0.2051
Supplier C  5  4.0300  0.4177

Individual 95% CIs For Mean Based on Pooled StDev
[Chart: individual 95% CIs for each supplier mean on a common scale from 3.30 to 4.20]

Pooled StDev = 0.3698

P-value > .05 - No Difference between suppliers

ANOVA Session Window
One-way ANOVA: Supplier A, Supplier B, Supplier C

Source  DF     SS     MS     F      P
Factor   2  0.384  0.192  1.40  0.284
Error   12  1.641  0.137
Total   14  2.025

F-Calc = 1.40 (the F column above). F-Critical is read from the F table (α = 0.05) at numerator d.f. (N) = 2 and denominator d.f. (D) = 12:

D/N     1         2         3         4
1       161.40    199.50    215.70    224.60
2       18.51     19.00     19.16     19.25
3       10.13     9.55      9.28      9.12
4       7.71      6.94      6.59      6.39
5       6.61      5.79      5.41      5.19
6       5.99      5.14      4.76      4.53
7       5.59      4.74      4.35      4.12
8       5.32      4.46      4.07      3.84
9       5.12      4.26      3.86      3.63
10      4.96      4.10      3.71      3.48
11      4.84      3.98      3.59      3.36
12      4.75      3.89      3.49      3.26
13      4.67      3.81      3.41      3.18
14      4.60      3.74      3.34      3.11
15      4.54      3.68      3.29      3.06

Reject H0 if $F_{\text{calc}} \ge F_{\alpha,2,12}$

1.40 is less than 3.89, so we conclude there is no difference between suppliers
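
The F-Calc versus F-Critical decision can be written out as a few lines of code. All numbers below are copied from the session window and the F table; the variable names are ours.

```python
# Values copied from the session window and the F table above
ms_factor = 0.192      # mean square between suppliers (DF = 2)
ms_error = 0.137       # mean square within suppliers (DF = 12)
f_critical = 3.89      # F table entry for alpha = 0.05, numerator d.f. 2, denominator d.f. 12

f_calc = ms_factor / ms_error
reject_h0 = f_calc >= f_critical

print(f"F-calc = {f_calc:.2f}, F-critical = {f_critical}")
print("reject H0" if reject_h0 else "fail to reject H0: no difference detected between suppliers")
```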
ANOVA Assumptions

1. Observations are adequately described by the model

2. Errors are normally and independently distributed

3. Homogeneity of variance among factor levels

❑ In one-way ANOVA, model adequacy can be checked by either of the following:

▪ Check the data for Normality at each level and for homogeneity of variance
across all levels

▪ Examine the residuals (a residual is the difference in what the model predicts
and the true observation)
o Normal plot of the residuals

o Residuals versus fits

o Residuals versus order


Residual Plots

Histogram of Residuals

[Figure: Histogram of the Residuals (responses are Supplier A, Supplier B, Supplier C); x-axis: Residual, -0.6 to 0.6; y-axis: Frequency]

The Histogram of residuals should show a bell shaped curve.

Normal Probability Plot of Residuals
[Figure: Normal Probability Plot of the Residuals (responses are Supplier A, Supplier B, Supplier C); x-axis: Residual, -1.0 to 1.0; y-axis: Percent]

Normality plot of the residuals should follow a straight line

Results of our example look good
The Normality assumption is satisfied

Residuals Versus Fitted Values

[Figure: Residuals Versus the Fitted Values (responses are Supplier A, Supplier B, Supplier C); x-axis: Fitted Value, 3.65 to 4.05; y-axis: Residual, -0.50 to 0.75]

The plot of residuals versus fits examines constant variance

The plot should be structureless with no outliers present

Fisher’s Least Significant Difference

❑ A one-way ANOVA is used to determine whether or not there is a statistically
significant difference between the means of three or more independent groups

❑ If the p-value from the ANOVA is less than some significance level (like α = .05), we
can reject the null hypothesis and conclude that at least one of the group means is
different from the others

❑ But in order to find out exactly which groups are different from each other, we
must conduct a post-hoc test

❑ One commonly used post-hoc test is Fisher’s least significant difference test

Fisher’s Least Significant Difference

❑ To perform this test, we first calculate the following test statistic:

$$LSD = t_{\alpha/2,\ \text{DF for groups}}\sqrt{MS_{\text{Groups}}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

Where:

tα/2, DF for groups: the t-critical value from the t-distribution with an upper-tail area of α/2, where
DF for groups is the degrees of freedom within groups from the ANOVA table

MS Groups: the mean square within groups from the ANOVA table

n1, n2: the sample sizes of the two groups being compared

❑ We can then compare the mean difference between each pair of groups to this test
statistic. If the absolute value of the mean difference between two groups is
greater than the test statistic, we can declare that there is a statistically significant
difference between the group means

Example: Fisher’s LSD Test

❑ Suppose a professor wants to know whether or not three different studying
techniques lead to different exam scores among students. To test this, she
randomly assigns 10 students to use each studying technique and records their
exam scores

❑ The following table shows the exam scores for each student based on the studying
technique they used:

[Table: exam scores by studying technique]

Example: Fisher’s LSD Test

❑ The professor performs a one-way ANOVA and gets the following results:

[Table: one-way ANOVA output]

Example: Fisher’s LSD Test

❑ Since the p-value in the ANOVA table (.018771) is less than .05, we can conclude
that not all of the mean exam scores between the three groups are equal

❑ Thus, we can proceed to perform Fisher’s least significant difference test to
determine which group means are different

❑ Using the output of the ANOVA, we can calculate Fisher’s test statistic as:

$$LSD = t_{\alpha/2,\ \text{DF for groups}}\sqrt{MS_{\text{Groups}}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

$$LSD = t_{0.025,\,27}\sqrt{36.948\left(\frac{1}{10} + \frac{1}{10}\right)} = 2.052\sqrt{7.3896} = 5.578$$

Example: Fisher’s LSD Test

❑ We can then calculate the absolute mean difference between each group:
▪ Technique 1 vs. Technique 2: |80 – 85.8| = 5.8

▪ Technique 1 vs. Technique 3: |80 – 88| = 8

▪ Technique 2 vs. Technique 3: |85.8 – 88| = 2.2

❑ The absolute mean differences between technique 1 vs. technique 2 and technique
1 vs. technique 3 are greater than Fisher’s test statistic, thus we can conclude that
these techniques lead to statistically significantly different mean exam scores

❑ We can also conclude that there is no significant difference in mean exam scores
between technique 2 and technique 3
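
The LSD comparison can be sketched in code as follows. The group means, the within-groups mean square and the group size are the figures quoted in the example; the t critical value for 27 degrees of freedom is taken from a t table (2.052), since the standard library cannot compute it, and the variable names are ours.

```python
from itertools import combinations
from math import sqrt

# Values quoted in the example above
t_crit = 2.052            # t(0.025, 27 d.f.), from a t table
ms_within = 36.948        # mean square within groups from the ANOVA table
n_per_group = 10
means = {"Technique 1": 80.0, "Technique 2": 85.8, "Technique 3": 88.0}

lsd = t_crit * sqrt(ms_within * (1 / n_per_group + 1 / n_per_group))

# Compare every pair of group means against the LSD threshold
significant = {}
for a, b in combinations(means, 2):
    diff = abs(means[a] - means[b])
    significant[(a, b)] = diff > lsd
    print(f"{a} vs. {b}: |diff| = {diff:.1f}, significant = {diff > lsd}")
print(f"LSD = {lsd:.3f}")
```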
