STATS 2 Part 2 Rev 2.0 With Exercise

STATISTICS LEVEL 2
Part 2
(Rev 2.0)
ST Restricted
Structure of the course
Module 1: INTRODUCTION Module 4: HYPOTHESIS TESTING

• Introduction
• First Concepts • 1 Normal Population
• Population VS Sample • 2 Normal Populations
• Descriptive VS Inferential • ANOVA
• Estimation (point and interval) • Non-Parametric Test
• Hypothesis Testing Annex 1
• Inferential Error
• Introduction to Bootstrap
Module 2: CENTRAL LIMIT THEOREM
• Decision Making Process Annex 2
• Numerical Simulation & Examples
• t-Distribution • Overview of Outlier Detection Methods
Module 3: CONFIDENCE INTERVAL

• Estimation
• Point Estimation
• Properties of Point Estimators
• Interval Estimation
• Parameters of 1 Normal Population
• Parameters of 2 Normal Population
2
ST Restricted
Module 4: Hypotheses testing
ST Restricted
Module 4 objectives
• At the end of this chapter, you will be able to:
• Test statistical hypotheses on single population parameters

• To use hypothesis testing procedures to compare more than one population parameters (this will be
in subsequent module)
• To assess when a parametric procedure can be used and when it is better to use a non-parametric
one (this will be in subsequent module)
4
ST Restricted
Module 4
Hypothesis testing - introduction
Associated errors – type I & type II

Inferential Statistics (or Inference)
Significance, confidence, power
Parametric procedures
One normal population
Test on mean
Test on variance
Test on proportion
Two normal populations
Test difference of means
Test ratio of variances
Test difference of proportions
Test on correlation coefficient
More than two populations
Non-parametric procedures
Introduction
List of tests
•
Module 4 Key Learning
Hypothesis testing
e.g., Test the claim that the population mean weight is 120 pounds
5
ST Restricted
Module 4

Hypothesis Testing Procedure
Test on mean
Common steps to every hypothesis testing procedure:
Test on variance
Test on proportion
Test difference of proportions ❑ First step - definition of a “system of two hypotheses”, indicated by H0 and H1
More than two populations o The first, H0 is called the “null hypothesis” and
Non-parametric procedures o the second, H1, is called the “alternative hypothesis”.
Introduction o They are defined in a mutually exclusive way. This means that if one hypothesis is true, the
List of tests
second one must be false.
❑ Second step – data collection (sample) from the population on which we want to
make some inference. Of course, we assume that the sample adequately represents
the population.
❑ Third step - decision about the hypothesis which is more likely to be true based data
evidence. Statistics helps us to take this decision. The conclusion of the test is always
associated to an acceptable probability of error called “significance level”.
6
ST Restricted
Module 4

The Null Hypothesis – H0
Test on mean
Test on variance
Test on proportion ➢ States the assumption (numerical) to be tested
Example: The average number of TV sets in U.S.
More than two populations Homes is equal to three ( H0: μ=3)
Introduction
➢ Is always about a population parameter, not

List of tests

about a sample statistic
H0 : μ = 3 H0 : X = 3
7
ST Restricted
Module 4

The Null Hypothesis – H0
Test on mean
Test on variance
Test on proportion
➢ Begin with the assumption that the null hypothesis is true
Test ratio of variances o Like the notion of innocent until proven guilty
➢ Refers to the “status quo”
Introduction
➢ Always contains “=” , “≤” or “” sign

List of tests
➢ May or may not be rejected
8
ST Restricted
Module 4

The Alternative Hypothesis – H1
Test on mean
Test on variance
➢ Is the opposite of the null hypothesis
Test on proportion
Two normal populations • e.g., The average number of TV sets in U.S. homes is not equal to 3 ( H1: μ
Test ratio of variances ≠3)
Test on correlation coefficient ➢ Challenges the “status quo”
Introduction
➢ Always contains the “≠”, “<“ or “>” sign
• If it contains the “≠” sign, the test is a “two-sided test”
List of tests

• If it contains the “<“ or the “>” sign, the test is a “one-sided test”
➢ Never contains the “=” , “≤” or “” sign
➢ May or may not be supported
➢ Is generally the hypothesis that the researcher is trying to support
9
ST Restricted
Module 4
Hypothesis testing - introduction Hypothesis Testing Process

Start from the Claim: Population
Test on mean
Test on variance
“The population
Test on proportion
mean age is 50”.
(Null Hypothesis:
H0: μ = 50 )
Now select a
Introduction random sample
List of tests

Suppose
ഥ = 𝟐𝟎 likely if μ = 50?
Is 𝑿 the sample
mean age
ഥ = 𝟐𝟎
is 20: 𝑿
Sample
If not likely,
REJECT
the Null Hypothesis
10
ST Restricted
Module 4

Test on mean
Test on variance ഥ (if H0 is true)
Sampling Distribution of 𝒙
Test on proportion
H0 : μ = 50
Non-parametric procedures ഥ= 20
𝒙 μ = 50
X
Introduction
List of tests If it is unlikely that we
Module 4 Key Learning would get a sample
average of this value ...
... if in fact this (50)
were the population
mean…
... then we reject the null
hypothesis that μ = 50.
11
ST Restricted
Module 4

Parametric procedures “If the sample mean 𝑥ҧ were exactly 50, of course we would accept the null hypothesis.”
Test on mean Question:
Test on variance
Test on proportion
what is the maximum “distance” between the parameter value hypothesized
Two normal populations in H0 (50) and the sample mean 𝑥,ҧ to still conclude that H0 is true?
Test difference of proportions ഥ (if H0 is true)
Non-parametric procedures H0 : μ = 50
Introduction
List of tests H1 : μ ≠ 50
μ = 50 X
max distance (to still support H0)?

12
ST Restricted
Module 4

H0 : μ ≤ 50
One normal population Consider the (one-sided) system of hypotheses:
Test on mean H1 : μ > 50
Test on variance
Test on proportion
Two normal populations ഥ (if H0 is true)
Test ratio of variances How large the sample average can
Test difference of proportions be to still support H0?
More than two populations CRITERION: use as “threshold”, that
Non-parametric procedures value of ഥ𝒙 larger than (1-α)% of the
Introduction (1-α) = 0.95 α = 0.05 population and smaller than α%.
List of tests α is a probability, so, between 0 and 1.
For example, α = 0.05
μ = 50 X
“threshold”
Critical Value (for a given α)
Adopting this criterion to define the Critical Value, implies that we accept a risk that some large values
of 𝑥ҧ (larger than the critical value) will lead to an erroneous rejection of H0 (the yellow right tale of the
distribution in the figure).
- The probability associated to this event (our risk) is α.
- Conversely, we don’t risk to do this error with probability (1- α).
13
ST Restricted
Module 4

Test on mean In case of a two-sided test, the probability α is equally divided on the two
Test on variance
Test on proportion
tails of the distribution, so that the total risk has still a probability equal to α.
Two normal populations In these cases, we have two critical values:
ഥ (if H0 is true)
Introduction
List of tests H0 : μ = 50
Module 4 Key Learning H1 : μ ≠ 50
α/2 = 0.025 (1-α) = 0.95 α/2 = 0.025
μ = 50 X
Critical Value Critical Value
14
ST Restricted
Module 4

Test on mean Process Specs : 0 ± 20um.
Test on variance We claim the pop mean will be at nominal value 0um.
Test on proportion
Two normal populations We collect sample data and investigate if our claim is supported.
Test difference of proportions Null Hypothesis, 𝑯𝒐 Alternative Hypothesis, 𝑯𝟏
More than two populations • Status quo. • Challenge the status quo.
Non-parametric procedures • Contains only = , ≤ 𝑜𝑟 ≥ sign. • Contains only ≠ , < 𝑜𝑟 > sign.
Introduction • “Innocent unless proven guilty” • Hypothesis the researcher trying to
List of tests
investigate.
𝐻𝑜 : 𝜇 = 0 𝐻1 : 𝜇 ≠ 0 (Two Sided)
𝐻𝑜 : 𝜇 = 0 𝑜𝑟 (𝜇 ≥ 0) 𝐻1 : 𝜇 < 0 (One Sided Lower)
𝐻𝑜 : 𝜇 = 0 𝑜𝑟 (𝜇 ≤ 0) 𝐻1 : 𝜇 > 0 (One Sided Upper)
Hypothesis Testing is about inferring population parameter.
Only population parameter symbols are used in Hypothesis Statement and NOT sample statistic.
15
ST Restricted
Module 4

Hypothesis Testing Process
Associated errors – type I & type II 𝑆𝑝𝑒𝑐𝑠 = 5𝑢𝑚 ± 20𝑢𝑚
Parametric procedures Step 1 Define the System of Hypothesis
Test on mean
𝐻𝑜 : 𝜇 = 5
Test on variance
Test on proportion 𝐻1 : 𝜇 ≠ 5
Test ratio of variances Step 2 Convert Sample Data into Information.
Introduction
List of tests
16
ST Restricted
Module 4

Non-Standardized Scale Standardized Scale
Step 3
Establish the Distribution of the Sample Average Assume 𝐻𝑜 is true
One normal population (given a sample of size n)
Test on mean
Test on variance 𝝈𝒙
Test on proportion 𝑰𝒇 𝝈𝒙 𝒌𝒏𝒐𝒘𝒏 N ( µx , ) Std.Error
𝒏
𝝈𝒙
Test difference of means 𝒏
• 𝝁𝒙 = 𝝁𝒙ഥ
Test ratio of variances 𝝈𝒙 ഥ
𝒙
Test difference of proportions • 𝝈𝒙ഥ = 𝑺𝒕𝒅. 𝑬𝒓𝒓 = µx = 5
𝒏
Non-parametric procedures t-distribution Assume 𝐻𝑜 is true
𝑰𝒇𝝈𝒙 𝒖𝒏𝒌𝒏𝒐𝒘𝒏
Introduction (with n-1 dof)
List of tests
Std.Error 𝑑𝑜𝑓 = 29
Module 4 Key Learning 𝒔𝒙 𝟑. 𝟔𝟗
𝑺𝒕𝒅. 𝑬𝒓𝒓 = = = 𝟎. 𝟔𝟕 0.67
𝒏 𝟑𝟎
ഥ
𝒙
µx = 5
Here onwards, we are using 𝝈𝒙 unknown situation.
Step 4
Assume 𝐻𝑜 is true
0.025 0.025
User to define α. In this case α=0.05
0.67
ഥ
𝒙 17
µx = 5
ST Restricted
Module 4

Significance, confidence, power Step 5 Assume 𝐻𝑜 is true
One normal population Establish the Acceptance & Rejection Regions 0.025 0.025
Test on mean
Test on variance • If the Sample Mean is within the Acceptance Region
Test on proportion 0.67
Statistical Conclusion: Failed to Reject Ho
Test difference of means • If the Sample Mean is within the Rejection Region µx = 5
ഥ
𝒙
Test ratio of variances Statistical Conclusion : Reject Ho REJECT REJECT
ACCEPT REGION
Test difference of proportions REGION REGION

In this case, the Sample Mean, 𝑥ҧ = −1.27.
Non-parametric procedures But how do we know 𝑥ҧ = −1.27 is within which region?
Introduction This is why we need to find the Critical Value.
List of tests

Step 6
Find the Critical Value
0.025 0.025
• How to find the Critical Value?

0.67
• Note: In modern days, statistical software
ഥ
𝒙
will compute the Critical Value. µx = 5
REJECT REJECT
ACCEPT REGION
REGION REGION
• Subsequent steps is to show the classical
method used.

18
ST Restricted
Module 4

Significance, confidence, power Step 7 𝒙ǉ − 𝝁
Assume 𝐻𝑜 is true 𝒕= 𝒔 Assume 𝐻𝑜 is true
One normal population Convert to Standardized Scale, where
Test on mean in our case: 0.025 0.025
𝒏 0.025 0.025
Test on variance
Test on proportion 𝝁=5
0.67
Test difference of means 𝒔
Test ratio of variances = 𝟎. 𝟔𝟕
µx = 5
ഥ
𝒙 t
𝒏 µx =0
Test difference of proportions REJECT REJECT REJECT REJECT
ACCEPT REGION ACCEPT REGION
Test on correlation coefficient REGION REGION REGION REGION

Introduction Critical Value Critical Value Critical Value Critical Value
List of tests
Module 4 Key Learning Statistician have tabulated the function

of t-distribution creating the
t-statistical table.
19
ST Restricted
Module 4

Significance, confidence, power Step 8
One normal population Refer to statistical table for t-distribution Using the Statistical Table, we know
Test on mean the Critical Value is 2.045
Test on variance • Alpha is 0.025
Test on proportion • dof is 29.
Two normal populations Assume 𝐻𝑜 is true
Test ratio of variances 0.025 0.025
1
t
Non-parametric procedures µx = 0
Introduction REJECT REJECT
ACCEPT REGION
List of tests REGION REGION

-2.045 2.045
20
ST Restricted
Module 4

One normal population Covert back to Non-Standardized Scale
Test on mean 0.025 0.025
Test on variance
Test on proportion
1
Test ratio of variances t
µx =0
Test difference of proportions REJECT REJECT
ACCEPT REGION
Test on correlation coefficient REGION REGION

Non-parametric procedures -2.045 2.045
Introduction
List of tests
Module 4 Key Learning 𝒔

ഥ = 𝝁 ± (𝟐. 𝟎𝟒𝟓)(
𝒙 )
Assume 𝐻𝑜 is true 𝒏
0.025 0.025
0.67
µx = 5 ഥ
𝒙
𝒙ǉ − 𝟓
= ±𝟐. 𝟎𝟒𝟓 REJECT
ACCEPT REGION
REJECT
𝟎. 𝟔𝟕 REGION REGION
ഥ = 𝟓 ± (𝟐. 𝟎𝟒𝟓)(𝟎. 𝟔𝟕)

𝒙 3.63 6.37
21
ST Restricted
Module 4

One normal population Now we can answer which region the
Test on mean Sample Mean, 𝑥ҧ = −1.27 is at.
0.025 0.025
Test on variance
Test on proportion Statistical Conclusion:
Two normal populations Reject Ho 0.67
Test ratio of variances ഥ
𝒙
µx = 5
Test difference of proportions REJECT
REJECT REGION ACCEPT REGION
Test on correlation coefficient REGION
More than two populations Assume 𝐻𝑜 is true

Non-parametric procedures 𝑥ҧ = −1.27 3.63 6.37
Introduction 0.025 0.025
List of tests

t
µx =0
REJECT
REGION
Convert Sample Mean 𝑇𝑒𝑠𝑡 𝑆𝑡𝑎𝑡𝑖𝑠𝑡𝑖𝑐 = −9.29 -2.045 2.045

into Standardized Scale
𝒙ǉ − 𝝁 (−𝟏. 𝟐𝟕) − (𝟓)

𝑻𝒆𝒔𝒕 𝑺𝒕𝒂𝒕𝒊𝒔𝒕𝒊𝒄𝒔 = 𝒔 = 𝟎. 𝟔𝟕
𝒏
(Note: In JMP, the calculation takes into account

all the decimal points)
22
ST Restricted
Module 4
Results in JMP
Hypothesis testing - introduction Standardized Scale
Associated errors – type I & type II 𝐻𝑜 : 𝜇 = 5
Parametric procedures 𝐻1 : 𝜇 ≠ 5 0.025 0.025
Test on mean
Test on variance
Test on proportion
t
Two normal populations µx =0
Test difference of means REJECT REGION ACCEPT REGION
REJECT
REGION
𝑇𝑒𝑠𝑡 𝑆𝑡𝑎𝑡𝑖𝑠𝑡𝑖𝑐 = −9.29 -2.045 2.045
Introduction
List of tests
Module 4 Key Learning Non-Standardized Scale

0.025 0.025
0.67
ഥ
𝒙
µx = 5
REJECT
REGION
𝑥ҧ = −1.27 3.63 6.37

23
ST Restricted
0.025 0.025
This concept works for any
0.95 chosen 𝛼 value.
The case shown here is

using 𝛼 = 0.05.
REJECTION REGION ACCEPTANCE REGION REJECTION REGION
Test Statistic < Abs(Critical Value)

𝑃 𝑣𝑎𝑙𝑢𝑒 > 0.05
𝐹𝑎𝑖𝑙 𝑡𝑜 𝑅𝑒𝑗𝑒𝑐𝑡 𝐻𝑜
Test Statistic Test Statistic
Test Statistic > Abs (Critical Value)

𝑃 𝑣𝑎𝑙𝑢𝑒 < 0.05
𝑅𝑒𝑗𝑒𝑐𝑡 𝐻𝑜
Test Statistic Test Statistic 24

ST Restricted
Module 4

Exercise #14
Exercise File:
Parametric procedures Scenario:
Test on mean
Test on variance 1. You are setting up the machine.
Test on proportion 2. 30 setup units measured.
3. Descriptive Statistics obtained.
Test ratio of variances a. All single value within specs limit. Die Attach Placement Specs = 0𝑢𝑚 ± 20𝑢𝑚
Test on correlation coefficient b. Cpk > 1.67.
4. Decision? :
a. Release machine for production, or
Introduction b. Re-setup machine.
List of tests
Module 4 Key Learning Note:

• This is the same example used in Confidence Interval topic (Example #6).
• You can compare the statistical conclusion vs the two methods
(Confidence Interval & Hypothesis Testing).
Trainer will show:

1. How to perform Hypothesis
Testing.
3. Interpretation of results.
25
ST Restricted
Module 4

Exercise #14
Make first distribution. Go to Analyze > Distribution. Cast the 2 columns to Y
Test on mean
Test on variance
Test on proportion
Introduction Do a hypothesis testing. Use hotspot, select Test Mean. Enter Hypothesized Mean. Hit OK
List of tests
Use for non-normal distribution
26
ST Restricted
Module 4

Exercise #14
Test on mean
Test on variance
Test on proportion
Test difference of proportions If 2-sided test
Test on correlation coefficient If right test
If left test
Since Prob > |t| is
Introduction greater than the
List of tests
significance value (0.05),
Module 4 Key Learning then we FAIL TO REJECT
The NULL HYPOTHESIS.
Practical Conclusion:
The process is centered, can
release machine for
production.
Perform the same for Y-Offset data. What is your conclusion?
27
ST Restricted
Module 4

Exercise #14
Test on mean
Test on variance
Test on proportion
Test difference of means Since Prob > |t| is
Test difference of proportions less than the
Test on correlation coefficient significance value (0.05),
More than two populations then we REJECT
Non-parametric procedures The NULL HYPOTHESIS.
Introduction
List of tests
The process is not centered,
check the machine for
Y-offset.
28
ST Restricted
Module 4

𝐼𝑓 𝐻𝑜 𝑖𝑠 𝑇𝑟𝑢𝑒
2 Sided Test
𝐻𝑜 : 𝜇 = 0
Test on mean
Test on variance 𝐻1 : 𝜇 ≠ 0 2.5% 2.5%
Test on proportion
Two normal populations 95%
REJECTION REGION ACCEPTANCE REGION REJECTION REGION
Introduction
List of tests
By default this graph
Module 4 Key Learning shows two-sided test.
To see one-sided test,

goes “P(value)
Animation”.
Statistical Conclusion:
29
Fail to Reject Ho
ST Restricted
Module 4

𝐼𝑓 𝐻𝑜 𝑖𝑠 𝑇𝑟𝑢𝑒
Significance, confidence, power 1 Upper Sided Test
Test on mean
𝐻𝑜 : 𝜇 ≤ 0
Test on variance
Test on proportion 𝐻1 : 𝜇 > 0 5.0%
Two normal populations 95%
ACCEPTANCE REGION REJECTION REGION
Introduction
Statistical Conclusion:
List of tests Fail to Reject Ho
30
ST Restricted
Module 4
Associated errors – type I & type II 𝐼𝑓 𝐻𝑜 𝑖𝑠 𝑇𝑟𝑢𝑒

1 Lower Sided Test
Test on mean
Test on variance 𝐻𝑜 : 𝜇 ≥ 0
Test on proportion 5.0%
𝐻1 : 𝜇 < 0
Test ratio of variances 95%
Test on correlation coefficient Statistical Conclusion:
More than two populations Accept 𝐻1 REJECTION REGION ACCEPTANCE REGION
Introduction
List of tests
31
ST Restricted
Module 4 Side by Side Comparison : Confidence Interval vs Hypothesis Testing on same data

Test on mean
Test on variance
Test on proportion
Test difference of means Target of Interest : 0
Test on correlation coefficient Confidence Interval Hypothesis Testing
1. Establish Hypothesis Statement
Introduction
List of tests
𝐻𝑜 : 𝜇 = 0 𝐻1 : 𝜇 ≠ 0
Module 4 Key Learning 2. Establish the Distribution of the Sample Average
1. Establish the Distribution of the Sample Average
(given a sample of size n)
(given a sample of size n)
Assume is 𝐻𝑜 True.
𝑑𝑜𝑓 = 29 𝑑𝑜𝑓 = 29
𝟎
t t
𝟎 32
ST Restricted
Module 4
Associated errors – type I & type II Confidence Interval Hypothesis Testing

2. Define Significance Level, 𝜶 (in this case 0.05). 3. Define Significance Level, 𝜶 (in this case 0.05).
Test on mean 3. Establish Upper and Lower Confidence Limit. 4. Establish Acceptance and Rejection region.
Test on variance
Test on proportion
Test ratio of variances 2.5% 2.5% 2.5% 2.5%
Introduction 𝟎 𝟎
List of tests Critical Value Critical Value REJECT ACCEPTANCE REGION REJECT
(-2.045) (2.045)
Module 4 Key Learning Critical Value Test Statistic Test Statistic Critical Value
𝒔 𝒔 (-2.045) (-1.8778) (1.8778) (2.045)
ഥ − 𝒕𝒏−𝟏,𝜶/𝟐
𝑿 ഥ + 𝒕𝒏−𝟏,𝜶/𝟐
<𝝁<𝑿
𝒏 𝒏
Lower C.L Upper C.L
3.69 3.69
(-1,27) – (2.045)( ) (-1.27) + (2.045)( )
30 30 Represents the “Yellow” area shown
above. Hence, we know the Test Statistic
-2.65 0.11 Is within the Acceptance Region.
33
ST Restricted
Module 4
Associated errors – type I & type II Confidence Interval Hypothesis Testing

Parametric procedures 5. Statistical Conclusion
Test on mean P(value) > 0.05 ➔ Fail to Reject 𝐻𝑜 . ➔ Process is centered.
Test on variance
Test on proportion
More than two populations 4. Statistical Conclusion:
Non-parametric procedures Target of Interest : Zero is within CI ➔ Process is centered.
Introduction
List of tests
Both Inferential Method provides the same statistical conclusion.
34
ST Restricted
Module 4
Practical Simulation to show Confidence Interval and Hypothesis Testing provide same statistical result performed at same
Hypothesis testing - introduction significance level, alpha (Simulation using sheet Pop 1 from Central Limit Theorem.xls file where the known Pop mean is 5.02).
Associated errors – type I & type II Confidence Interval performed at alpha=0.05 (or 95% confidence level)
Test on mean
Test on variance
Test on proportion Pop Mean.
Two normal populations @5.02
More than two populations On Average 4/100 CI
do not contain µ
Introduction Hypothesis Statement : 𝐻𝑜 : 𝜇 =5.02 , 𝐻1 : 𝜇 ≠ 5.02. Hypothesis test performed at alpha=0.05 (or 95% confidence level)
List of tests
On Average
4/100
P(value) <0.05
35
ST Restricted
Module 4

Errors in making decisions
Test on mean
Test on variance
Test on proportion Type I Error - “reject a true null hypothesis”
The probability of Type I Error is 
•  is called “level of significance” of the test
Introduction
•  is set by researcher in advance
List of tests
Type II Error - “fail to reject a false null hypothesis”

The probability of Type II Error is β
36
ST Restricted
Module 4

Outcomes and probabilities
Test on mean
Test on variance Possible Hypothesis Test Outcomes
Test on proportion
Test ratio of variances Actual Situation Key:
Test on correlation coefficient Outcome
More than two populations Decision H0 True H0 False (Probability)
Introduction
List of tests Do Not No error Type II Error
Module 4 Key Learning Reject H0 (1 - α) (β)
Type I Error No Error

Reject H0
(α) (1 - β)
37
ST Restricted
Module 4

Type I & II Error Relationship
Test on mean
Test on variance
▪ Type I and Type II errors cannot happen at the same time
Test on proportion
▪ Type I error can only occur if H0 is true
Test on correlation coefficient ▪ Type II error can only occur if H0 is false
Introduction
List of tests

If type I error probability (  ) , then Type II error probability ( β )
Lowering , the probability of type I error (with no change in available data), β, the
probability of type II error, increases.
38
ST Restricted
Module 4

Consequences of type I & II Error
Test on mean
EXAMPLE on the consequences of the two types of error.
Test on variance “We want to test if two Equipment, on average, are aligned. Suppose that a sample of size n1 is
Test on proportion
drawn from the population of measurements of Equipment 1 and that a second sample of size n 2 is
Test difference of means drawn from the population of measurements of Equipment 2. Moreover, let’s assume that: (a) both
populations are Normally distributed with known (and equal) variances, (b) samples are
Test on correlation coefficient independent and (c) n1 = n2.
The hypotheses to test are: H0: μ1- μ2 = 0 => If true, the equipment are aligned.
Introduction H1: μ1- μ2  0 => If true, the equipment are misaligned.
List of tests
Module 4 Key Learning Type I Error

“Reject H0 which is true”: We waste our efforts, time and money trying to align
equipment which are already aligned.
Type II Error
“Accept H0 which is false”: We do nothing to match the equipment, since we think
that they are aligned. But they are not and this error will
increase process variability.
39
ST Restricted
Module 4

Consequences of type I & II Error
Test on mean EXAMPLE on the consequences of the two types of error.
Test on variance
Test on proportion A production process is monitored with SPC techniques.
Test difference of means H0: “The process is in-control”.
The hypotheses to test are:
H1: “The process is out-of-control”.
Type I Error
“Reject H0 which is true”: Actually, the process is in-control, but we erroneously
Introduction think that it is not. In this case, a FALSE ALARM occurred.
List of tests
Thus, we waste our efforts, time and money investigating
Module 4 Key Learning and trying to remove the effect of “special causes” of
variability which, actually, do not exist.
Type II Error
“Accept H0 which is false”: We fail to detect that the process run out-of control. The
effects of this error are considered very dangerous.
40
ST Restricted
Module 4

Factors affecting type II Error
Test on mean All else equal,
Test on variance
Test on proportion
β
Test difference of proportions o when the difference between hypothesized parameter and its true value

Introduction
List of tests
o β when
o β when σ
o β when n
41
ST Restricted
Module 4

Relationship between α and β
Test on mean
Test on variance
Test on proportion Test on 1 tail (right)
Two normal populations H0: μ ≤ 0
Test difference of means H1: μ = μ1 > 0
Introduction
Ho ACCEPTANCE REGION Ho REJECT REGION
List of tests
H1 REJECT REGION H1 ACCEPTANCE REGION
The values of α and β are linked.

Given α, the value of β depends on:
• The distribution of the considered variable (in the graphical example, it is normal)
• The distance between the values of the parameter in the null and alternative hypotheses. In this
case, the distance between μ0 and μ1
• The population variance.
42
ST Restricted
Module 4

Significance, Confidence, Power
Test on mean
Test on variance
Test on proportion
Two normal populations SYMBOL MEANING NAME
α Probability of making type I error Level of significance
(1-α) Probability of not making type I error Level of confidence
Non-parametric procedures β Probability of making type II error ---
Introduction
List of tests (1-β) Probability of not making type II error Power
43
ST Restricted
Module 4

Test on mean
Test on variance
Test on proportion
Test on correlation coefficient Hypothesis testing procedures on population
parameters (one normal population)
Introduction
List of tests
44
ST Restricted
Module 4

Hypothesis tests on one population parameters
One normal population Tests on the parameters of one normally distributed population
Test on mean
Test on variance
Test on proportion
Hypothesis
Test difference of proportions Tests
Introduction
List of tests Population Population Population
Module 4 Key Learning Mean Variance Proportion
σ2 Known σ2 Unknown
45
ST Restricted
Module 4
Hypothesis testing - introduction Hypothesis tests on one population parameters

Significance, confidence, power Two equivalent ways to test hypotheses. They are separated just for training purposes.
Parametric procedures However, they are two faces of the same coin. In everyday work, we shall use only the p-value
Test on mean
Test on variance
Test on proportion
Test difference of means Use the CRITICAL VALUE
Test ratio of variances Use the P-VALUE
Test difference of proportions (which defines a REJECTION REGION)
Introduction
List of tests
HYPOTHESIS TESTING PROCEDURE (Critical Value and the Rejection Region).

1. Define the HYPOTESES to test.
2. Draw a SAMPLE from the population.
3. Define and calculate the TEST STATISTIC using sample data (done by statistical software).
4. Define the LEVEL OF SIGNIFICANCE (α).
5. Find the CRITICAL VALUE (from Statistical Tables) and the REJECTION REGION.
6. Make your DECISION (Reject or not H0).
DECISION RULE:
Test statistic > Critical value ( level)  REJECT H0
Test statistic ≤ Critical value ( level)  DO NOT REJECT H0 46
ST Restricted
Module 4
Hypothesis testing - introduction Hypothesis tests on one population parameters

Significance, confidence, power The Hypotheses
One normal population H0: μ = 0 Assumption: Normal population
Test on mean H1: μ ≠ 0 Context: Variance known
Test on variance
Test on proportion H0: μ = 0 OR μ  0
Two normal populations H1: μ > 0
Test ratio of variances H0: μ = 0 OR μ ≥ 0 0 is a constant!
Test difference of proportions H1: μ < 0
TEST STATISTIC, CRITICAL VALUE and REJECTION RULE
Introduction
List of tests ❑ TEST STATISTIC – A statistic, is a Test Statistic for hypotheses H0 and H1, if it is
known how this statistic is distributed when the null hypothesis (H0) is true. Its
value is calculated on sample data by statistical software.
❑ CRITICAL VALUE – it depends on the test level of significance (α). It is the largest
(absolute) value of the test statistic (for a given α) that still permits to support H0 .
This value is found on Statistical Tables.
❑ REJECTION RULE – it tells us when H0 must be rejected. Typically, «Reject H0 for
“large” values of the test statistic». “LARGE” = “(the absolute value of the test
statistic is) larger than the CRITICAL VALUE”.
47
ST Restricted
Module 4
Associated errors – type I & type II Test Statistic and Critical Value
Significance, confidence, power EXAMPLE
“A phone industry manager thinks that customer monthly cell phone bill have increased, and
now average over $52 per month. The company wishes to test this claim.” (Assume  = 10
Test on mean
Test on variance is known) 1. Define the HYPOTESES to test.
Test on proportion
Two normal populations 2. Draw a SAMPLE from the population.
3. Calculate the TEST STATISTIC using sample data.
Test difference of means HYPOTHESIS TESTING PROCDURE : 4. Define the LEVEL OF SIGNIFICANCE (α).
Test difference of proportions (Method: Critical value and rejection region) 5. Find the CRITICAL VALUE and the REJECTION REGION.
Test on correlation coefficient 6. Make your DECISION (Reject or not H0).
More than two populations H0: μ  52 The average is not over $52 per month.
1. Hypotheses formulation:
Non-parametric procedures H1: μ > 52 The average is greater than $52 per month
Introduction
List of tests 2. Sample extraction: The following results are obtained: n = 64, 𝑥ҧ = 53.1, and it’s known that 𝜎 = 10.
Module 4 Key Learning x − μ0 53.1 − 52
3. Test statistic calculation: z = = = 0.88
σ 10
n 64
4. Level of significance: α = 0.10
5. Critical value and rejection region: zα=0.1= 1.28 (from statistical tables -
Rejection Region: z > 1.28
6. Decision:
Do not reject H0 at the significance level  = 0.1, since z = 0.88 < 1.28
(i.e.: at level 10%, there is not sufficient evidence that the mean bill is over $52).
48
ST Restricted
Module 4
Hypothesis testing - introduction Test Statistic and Critical Value

Test on mean GRAPHICALLY:
Test on variance (Test on the right tail)
Test on proportion
α = 0.1
Non-parametric procedures (1-α) = 0.90
Introduction
List of tests

μ=0 z = 0.88 zα =1.28
Z
(Test Statistic) (Critical Value)
DO NOT REJECT H0 REJECT H0
REJECTION REGION (H0)
The Test Statistic has not fallen into the Rejection Region → Do not reject H0
49
ST Restricted
Module 4

The P-Value
Significance, confidence, power Two equivalent ways to describe
Parametric procedures hypothesis testing procedure
Test on mean
Test on variance
Test on proportion
Use the CRITICAL VALUE
Use the P-VALUE
Test difference of means (which defines a REJECTION REGION)
HYPOTHESIS TESTING PROCEDURE (Using the P-Value).
Introduction
List of tests 1. Define the HYPOTESES to test.
2. Extract a SAMPLE from the population.
3. Define and calculate the TEST STATISTIC using sample data (by statistical software).
4. Define the LEVEL OF SIGNIFICANCE (α).
5. Calculate the P-VALUE (done by statistical software).
6. Draw your CONCLUSIONS (Reject or not H0) (*).
DECISION RULE:
P-value <   REJECT H0
P-value ≥   DO NOT REJECT H0
50
ST Restricted
Module 4
Hypothesis testing - introduction The P-Value

Test on mean Definition: The p-value is smallest value of  for which H0 can be rejected.
Test on variance
Test on proportion
Graphically: p-value

Introduction
List of tests

0
Do not reject H0 Reject H0
Zα
(Critical Value)
Z
(Test Statistic)
51
ST Restricted
Module 4
Hypothesis testing - introduction The P-Value

EXAMPLE
One normal population Consider again the previous example 3.3 of the “phone industry manager”.
Test on mean
Test on variance
Test on proportion H0: μ  52 The average is not over $52 per month.
1. Hypotheses formulation: H1: μ > 52 The average is greater than $52 per month
2. Sample results: n = 64, 𝑥ҧ = 53.1, 𝑎𝑛𝑑 𝜎 = 10 (assumed known).
x − μ0 53.1 − 52
3.Test statistic (*): z = = = 0.88
Introduction σ 10
List of tests
n 64
4. Level of significance: α = 0.10
5. P-Value calculation (*):

 53.1 − 52.0 
p − value = P(X  53.1 |  = 52 ) = P  Z   = P(Z  0.88 ) = 1 − F ( 0.88 ) = 1 − 0.8106 = 0.1894
 10 / 64 
6. Conclusions: Do not reject H0 at the significance level  = 0.1, since p-value = 0.1894 >  = 0.10
(*) calculations are carried out by statistical software. For more details, se also the Manual
of Statistical Methodology, ANNEX 6 (8482919 ver. 2). 52
ST Restricted
Module 4

The P-Value
Test on mean
Test on variance
Test on proportion
Graphically: p-value = 0.1894
 = 0.1
Introduction
List of tests 0
Do not reject H0 Reject H0
Zα =1.28
(Critical Value)
Z =0.88
(Test Statistic)
53
ST Restricted
Module 4
Hypothesis testing - introduction Summary of Rules

SUMMARY OF REJECTION RULES USING BOTH METHODS:
Test on mean
Test on variance
Test on proportion NO |Test YES NO YES
 P-value
Two normal populations Statistic|
Test difference of means is large
is large
REJECT H0
Introduction
List of tests
DO NOT REJECT H0
• The “test statistic” is considered large if: |Test Statistic| > Critical Value (tables)
• The “p-value” is considered large if: P-value > Significance Level (α)
• According to the formulation of H1 (one or two sided test) the comparison between the test
statistic and the critical value is carried out according to the following rules:
o One-sided (left) test - reject H0 if test statistic < critical value
o One-sided (right) test - reject H0 if test statistic > critical value
o Two-sided test - reject H0 if test statistic < critical value 1 OR if test statistic > critical value 2
• The comparison between the significance level (α) and the p-value is carried out according to the
following rule: reject H0 if p-value < significance level (α)
54
ST Restricted
Module 4

Summary of Rules
Test on mean
Test on variance The methodology so far illustrated, can be adopted for all the cases in
Test on proportion the next slides.
Test on correlation coefficient The only differences between them regards:
Non-parametric procedures ➢ the hypotheses
Introduction
List of tests
➢ the test statistic
Next slides will show only summary tables.

In all cases, we assume the populations to be normally distributed.
55
ST Restricted
Module 4 Hypothesis
Tests

Population
Summary of Tests
Population Population
Significance, confidence, power Mean
Variance Proportion
Test on mean
Test on variance σ2 Known
σ2 Unknown
FOR THE MEAN - variance known
Test on proportion
Test difference of means 𝒙ǉ − 𝝁𝟎
The Test Statistic:
𝒛 = 𝝈 > 𝒛𝜶
Test difference of proportions is a value of the standard normal distribution
𝒏
Introduction
List of tests
HYPOTHESES REJECT H0 IF P(value) < alpha or for Test
Statistics with condition stated below
H0: μ = μ0 (or H0: μ ≥ μ0) 𝒙ǉ − 𝝁𝟎

𝒛= 𝝈 < −𝒛𝜶
H1: μ < μ0
𝒏
H0: μ = μ0 (or H0: μ  μ0) 𝒙ǉ − 𝝁𝟎
𝒛 = 𝝈 ≻ 𝒛𝜶
H1: μ > μ0
𝒏
H0: μ = μ0 𝒙ǉ − 𝝁𝟎 𝒙ǉ − 𝝁𝟎 𝒙ǉ − 𝝁𝟎
𝒛 = 𝝈 > 𝒛𝜶/𝟐 ⇔ 𝝈 > 𝒛𝜶/𝟐 𝑶𝑹 𝝈 < −𝒛𝜶/𝟐
H1: μ  μ0
𝒏 𝒏 𝒏
56
ST Restricted
Module 4 Hypothesis
Tests

Summary of Tests
Population
Variance Proportion
Test on mean σ2 Known σ2 Unknown
Test on variance
Test on proportion FOR THE MEAN - variance unknown
Test difference of proportions 𝒙ǉ − 𝝁𝟎
Test on correlation coefficient The Test Statistic: 𝒕 = 𝒔 is a value of the t distribution with (n – 1) DF
𝒏
Introduction
List of tests
HYPOTHESES REJECT H0 IF P(value) < alpha or for Test Statistics with
Module 4 Key Learning condition stated below
H0: μ = μ0 (or H0: μ ≥ μ0) 𝒙ǉ − 𝝁𝟎
𝒕= 𝒔 < −𝒕𝒏−𝟏 ,𝜶
H1: μ < μ0
𝒏
H0: μ = μ0 (or H0: μ  μ0) 𝒙ǉ − 𝝁𝟎
𝒕 = 𝒔 > 𝒕𝒏−𝟏 ,𝜶
H1: μ > μ0
𝒏
H0: μ = μ0 𝒙ǉ − 𝝁𝟎 𝒙ǉ − 𝝁𝟎 𝒙ǉ − 𝝁𝟎
𝒕 = > 𝒕𝒏−𝟏,𝜶/𝟐 ⇔ > 𝒕𝒏−𝟏,𝜶/𝟐 𝑶𝑹 < −𝒕𝒏−𝟏,𝜶/𝟐
H1: μ  μ0 𝑺 𝑺 𝑺
𝒏 𝒏 𝒏 57
ST Restricted
Module 4 Hypothesis
Tests

Summary of Tests
Population
Variance Proportion
Test on mean
σ2 Known
Test on variance σ2 Unknown
Test on proportion
Two normal populations FOR THE VARIANCE
2
(n − 1)s2
Test on correlation coefficient The Test Statistic: 𝜒n−1 = is a value of the χ2 (Chi-squared) distribution with n-1 d.f.
σ20
Introduction HYPOTHESES REJECT H0 IF P(value) < alpha or GRAPHICALLY
List of tests
for Test Statistics with condition
Module 4 Key Learning stated below
𝐻0 : 𝜎 2 = 𝜎02 (𝑜𝑟 𝜎 2 ≥ 𝜎02 ) 2 2
α
𝜒𝑛−1 < 𝜒𝑛−1 ,1−𝛼
𝐻1 : 𝜎 2 < 𝜎02 2
𝜒𝑛− 1 ,1−𝛼
𝐻0 : 𝜎 2 = 𝜎02 (𝑜𝑟𝜎 2 ≤ 𝜎02 ) 2 2

α
𝜒𝑛−1 > 𝜒𝑛−1 ,𝛼
𝐻1 : 𝜎 2 > 𝜎02 2
𝜒𝑛−1 ,𝛼
2 2 α/2 α/2
𝐻0 : 𝜎 2 = 𝜎02 𝜒𝑛−1 > 𝜒𝑛−1 ,𝛼 /2 OR
𝐻1 : 𝜎 2 ≠ 𝜎02 2
𝜒𝑛−1 2
< 𝜒𝑛− 2
𝜒𝑛−
2
𝜒𝑛−1 ,𝛼/2
1 ,1−𝛼/2 1 ,1−𝛼/2
58
ST Restricted
Module 4

Exercise #14.1
Note : How to perform Hypothesis Testing on One Pop Mean have been shown in
Exercise 14.
Test on mean
Test on variance
Test on proportion
1. Open the exercise File:
Non-parametric procedures 2. Trainer will show using JMP:
Introduction
List of tests a. How to perform Hypothesis Testing on One Pop Variance.
59
ST Restricted
Module 4

Exercise #14.1
Parametric procedures Make first distribution. Go to Analyze > Distribution. Cast the 2 columns to Y
Test on mean
Test on variance
Test on proportion
Introduction Do a hypothesis testing. Use hotspot, select Test Std Dev.
List of tests

Value to enter should be variance
60
ST Restricted
Module 4
Hypothesis testing - introduction Ho: 2 = 1 Exercise #14.1

Associated errors – type I & type II H1: 2  1
Test on mean
Test on variance
Test on proportion
Test ratio of variances If 2-sided test
Test difference of proportions If left test
If right test
Introduction
List of tests
Since Min PValue is less than the significance value (0.05), then we REJECT the NULL HYPOTHESIS.
The process variance is statistically not equal to 1
61
ST Restricted
Module 4 Hypothesis
Tests

Summary of Tests
Population
Variance Proportion
Test on mean
σ2 Known
Test on variance σ2 Unknown
Test on proportion
Two normal populations FOR THE PROPORTION
Test difference of proportions 𝑝Ƹ − 𝑝0
Test on correlation coefficient The Test Statistic: 𝑧 = is a value of the standard normal distribution
𝑝0 (1 − 𝑝0 )
More than two populations 𝑛
Introduction
List of tests ASSUMPTION: The binomial distribution can be approximated by a normal distribution.
Module 4 Key Learning Rule of thumb → The normal approximation holds when np(1-p) > 9
HYPOTHESES REJECT H0 IF P(value) < alpha or for Test Statistics
with condition stated below
𝐻0 : 𝑝 = 𝑝0 (𝑜𝑟 𝑝 ≥ 𝑝0 )
𝑧 < −𝑧𝛼
𝐻1 : 𝑝 < 𝑝0
𝐻0 : 𝑝 = 𝑝0 (𝑜𝑟 𝑝 ≤ 𝑝0 ) 𝑧 > 𝑧𝛼
𝐻1 : 𝑝 > 𝑝0
𝐻0 : 𝑝 = 𝑝0 𝒛 < −𝑧𝛼 OR 𝒛 > 𝑧𝛼 62
𝐻1 : 𝑝 ≠ 𝑝0 2 2 ST Restricted
Module 4

Exercise #14.2
Test on mean 1. Open the exercise File:
Test on variance
Test on proportion
Two normal populations 2. Trainer will show using JMP:
Test difference of means a. How to perform Hypothesis Testing on One Pop Proportion.
Test on correlation coefficient 3. Interpretation of results.
Introduction
List of tests
63
ST Restricted
Module 4

Exercise #14.2
Make first distribution. Go to Analyze > Distribution. Cast Conformity to Y, then OK
Test on mean
Test on variance
Test on proportion
More than two populations Do a hypothesis testing. Use hotspot, select Test Probabilities.
Introduction
List of tests
Sum equal to 1
Since Prob value is greater than the significance value (0.05), then
we FAIL TO REJECT the NULL HYPOTHESIS.
The proportion conformity is equal to 0.8 64
ST Restricted
Module 4

Exercise #14.2
Using Sample Calculator in Add-Ins > Statistics Calculator > CI for One Proportion
Test on mean
Test on variance
Test on proportion
Introduction
List of tests Select Raw Data. Use Conformity for the Test.
65
ST Restricted
Module 4

Exercise #14.2
Test on mean
Test on variance
Test on proportion
Introduction
List of tests
Input same value as in previous slide (Hypothesized proportion)

Input alpha (0.05)
66
ST Restricted
Module 4

Test on mean
Test on variance
Test on proportion
Test on correlation coefficient Hypothesis testing on parameters
of two normal populations
Introduction
List of tests
67
ST Restricted
Module 4
Hypothesis testing - introduction Summary of Tests

GENERAL SCHEME:
Test on mean
Test on variance HYPOTHESIS TESTING ON PARAMETERS OF TWO NORMAL POPULATIONS
Test on proportion
Test on correlation coefficient DIFFERENCE of 2 means RATIO of 2 DIFFERENCE Correlation
variances of 2 proportions coefficient
Introduction
List of tests DEPENDENT INDEPENDENT
Module 4 Key Learning SAMPLES SAMPLES
Variances Variances
KNOWN UNKNOWN
Variances Variances assumed

assumed EQUAL UNEQUAL
68
ST Restricted
Module 4
Hypothesis testing - introduction HYPOTHESIS TESTING ON PARAMETERS OF TWO NORMAL POPULATIONS

Summary of Tests
DIFFERENC RATIO of 2 DIFFERENCE Correlation
E of 2 means variances of 2 proportions coefficient
One normal population DEPENDENT INDEPENDENT
SAMPLES SAMPLES
Test on mean
Test on variance
Test on proportion Variances
KNOWN
Variances Test the difference of means of two
UNKNOWN
Test difference of means dependent (or “paired”) samples
Test ratio of variances Variances assumed Variances
EQUAL assumed
Test difference of proportions UNEQUAL
Test on correlation coefficient 𝒅 − 𝒅𝟎 is a value of the t distribution with
𝒕= 𝒔𝒅
More than two populations The Test Statistic: (n – 1) degrees of freedom.
Non-parametric procedures 𝒏
Introduction
List of tests

HYPOTHESES REJECT H0 IF P(value) < alpha or for Test
Statistics with condition stated below Of particular interest is
H0: μx - μy = () d0 the case d0 = 0 to test
t < -tn-1, α
H1: μx - μy < d0 the equality of the
population means.
H0: μx - μy = () d0
H1: μx - μy > d0 t > tn-1, α
H0: μx - μy = d0
H1: μx - μy  d0 t > tn-1, α/2 OR t < -tn-1, α/2
69
ST Restricted
Module 4

Exercise #15
Test on mean
Test on variance 2. Trainer will show using JMP:
Test on proportion
a. How to perform Hypothesis Testing.
Test ratio of variances 3. Interpretation of results.
Introduction
List of tests
70
ST Restricted
Module 4

Exercise #15
Go to Analyze > Specialized Modeling > Matched Pair
Test on mean
Test on variance
Test on proportion
Introduction
List of tests
Cast T0 and T500 columns to Y, then OK If 2-sided test

If right test
If left test
It also has the Confidence interval for the mean difference
Since Prob value is less than the significance value

(0.05), then we REJECT the NULL HYPOTHESIS.
71
Mean difference is not equal to 0.
ST Restricted
Module 4

Exercise #15
Go to Analyze > Distribution > Test Mean
Test on mean
Test on variance
Test on proportion
Introduction
List of tests
Conclusion:
Since Prob value is less than the significance value
(0.05), then we REJECT the NULL HYPOTHESIS.
72
ST Restricted
Module 4
Hypothesis testing - introduction HYPOTHESIS TESTING ON PARAMETERS OF TWO NORMAL POPULATIONS

Summary of Tests
Significance, confidence, power DIFFERENC RATIO of 2 DIFFERENCE Correlation
Test on mean SAMPLES SAMPLES
Test on variance
Test on proportion Variances Variances Test the difference of means of two
Two normal populations KNOWN UNKNOWN
Test difference of means independent samples – variances known

EQUAL assumed
Test difference of proportions UNEQUAL
ഥ ഥ − 𝒅𝟎
𝒙−𝒚 is a value of the standard normal distribution.
Non-parametric procedures The Test Statistic: 𝒁 =
Introduction 𝛔𝐱 𝟐 𝛔𝐲 𝟐
List of tests +
𝒏𝐱 𝒏𝐲

H0: μx - μy = () d0
z < -zα
H1: μx - μy < d0
H0: μx - μy = () d0
z > zα
H1: μx - μy > d0
H0: μx - μy = d0
z < -z α/2 OR z > zα/2
H1: μx - μy  d0
73
ST Restricted
Module 4

Exercise #16
Test on mean
Test on proportion
a. How to perform Hypothesis Testing on 2 Independent
Test difference of means Samples with Variance Known.
Introduction
List of tests
74
ST Restricted
Module 4

Exercise #16
Parametric procedures Go to Tables > Stack
Test on mean
Test on variance
Test on proportion
Introduction
List of tests
Go to Add-Ins > Statistical Calculators > Hypothesis Test for Two Means
75
ST Restricted
Module 4

Exercise #16
Cast the columns to appropriate fields > OK
Test on mean
Test on variance
Test on proportion
Known variance
Introduction
List of tests
Unknown variance
Conclusion:
Since Prob value is less than the
significance value (0.05), then we REJECT
the NULL HYPOTHESIS.
76
* In case there is a larger size historical std dev, use them instead of this 30 samples
ST Restricted
Module 4

Exercise #16
Cast the columns to appropriate fields > OK
Test on mean
Test on variance
Test on proportion
Unknown variance
Test on correlation coefficient Select Variance option
based on scenario
Introduction
List of tests
77
ST Restricted
Module 4

HYPOTHESIS TESTING ON PARAMETERS OF TWO NORMAL POPULATIONS
Summary of Tests
Parametric procedures DIFFERENC RATIO of 2 DIFFERENCE Correlation
Test on mean
Test on variance DEPENDENT INDEPENDENT
Test on proportion SAMPLES SAMPLES

Test the difference of means of independent
Test difference of means Variances Variances samples – variances unknown but assumed equal
KNOWN UNKNOWN
Test on correlation coefficient Variances assumed Variances
EQUAL assumed
UNEQUAL
Non-parametric procedures ഥ ഥ − 𝒅𝟎
𝒙−𝒚
Introduction The Test Statistic: 𝒕 = is a value of the t distribution with (nx + ny – 2) d.f.
List of tests 𝟏 𝟏
𝒔𝟐𝒑 +
𝒏𝒙 𝒏𝒚 ( n x − 1)s x2 + ( n2 y −(1) 2
n sx y− 1)s x2 + ( n y − 1)s 2y
Module 4 Key Learning The pooled variance s 2p =is defined as: s p =
nx + n y − 2 nx + n y − 2
H0: μx - μy = () d0
H1: μx - μy < d0
𝑡 < −𝑡 𝑛𝑥 +𝑛𝑦 −2 ,𝛼
H0: μx - μy = () d0
H1: μx - μy > d0
𝑡>𝑡 𝑛𝑥 +𝑛𝑦 −2 ,𝛼
H0: μx - μy = d0
𝑡 < −𝑡 𝑛𝑥 +𝑛𝑦 −2 ,𝛼/2 OR 𝑡 >𝑡
H1: μx - μy  d0 𝑛𝑥 +𝑛𝑦 −2 ,𝛼/2
78
ST Restricted
Module 4

Exercise #17
Test on mean
Test on proportion
a. How to perform Unequal Variance Test.
Test difference of means b. How to perform pooled-t test.
Introduction
List of tests
79
ST Restricted
Module 4

Exercise #17
Go to Analyze > Fit Y by X. Cast Data to Y, Label to X then OK
Test on mean
Test on variance
Test on proportion
Introduction
List of tests
80
ST Restricted
Module 4

Exercise #17
Test on mean
Test on variance
Test on proportion
Introduction
List of tests
Module 4 Key Learning Since P-value > significance level (0.05), we FAIL TO REJECT
NULL HYPOTHESIS; variances are equal.
Note: Some explanation on the different Unequal Variance Tests Method. 81

https://www.jmp.com/support/help/en/16.0/index.shtml#page/jmp/unequal-variances.shtml ST Restricted
Module 4

Exercise #17
Test on mean
Test on variance
Test on proportion
Introduction
List of tests
Since P-value < significance level (0.05), we REJECT NULL

HYPOTHESIS.
82
ST Restricted
Module 4

Exercise #17
Go to Add-Ins > Statistics Calculator > Hypothesis Test for Two Means
Test on mean
Test on variance
Test on proportion
Introduction
List of tests
Since P-value < significance

level (0.05), we REJECT NULL
HYPOTHESIS.
83
ST Restricted
Module 4
Hypothesis testing - introduction Summary of Tests

Associated errors – type I & type II HYPOTHESIS TESTING ON PARAMETERS OF TWO NORMAL POPULATIONS

Parametric procedures DIFFERENC RATIO of 2 DIFFERENCE Correlation
Test on mean
Test on variance DEPENDENT INDEPENDENT
Test on proportion SAMPLES SAMPLES
Test the difference of means of independent
Test difference of means Variances
KNOWN
Variances samples – variances unknown and assumed
UNKNOWN
Test difference of proportions unequal
Test on correlation coefficient Variances assumed Variances
EQUAL assumed
More than two populations UNEQUAL
Introduction
List of tests

ഥ ഥ − 𝒅𝟎
𝒙−𝒚 is a value of the t distribution with  degrees of freedom
The Test Statistic: 𝒕 =
𝟐
𝝈𝟐𝒙 𝝈𝒚 2
+
𝒏𝒙 𝒏𝒚  s2 s 2y 
( ) + (
x
)
 n x ny 
HYPOTHESES REJECT H0 IF: where, v = 2
 s 2y 
2
 s x2 
H0: μx - μy = () d0   /(n x − 1) +   /(n y − 1)
𝑡 < −𝑡𝜈,𝛼 n   ny 
H1: μx - μy < d0  x  
H0: μx - μy = () d0
𝑡 > 𝑡𝜈,𝛼
H1: μx - μy > d0
H0: μx - μy = d0
𝑡 < −𝑡𝜈,𝛼/2 OR 𝑡 > 𝑡𝜈,𝛼/2 84
H1: μx - μy  d0
ST Restricted
Module 4

Exercise #18
Test on mean
Test on proportion
a. How to perform Unequal Variance Tests.
Test difference of means b. How to perform t-test.
Introduction
List of tests
85
ST Restricted
Module 4

Exercise #18
Go to Analyze > Fit Y by X. Cast Data to Y, Label to X then OK
Test on mean
Test on variance
Test on proportion
Introduction
List of tests
86
ST Restricted
Module 4

Exercise #18
Test on mean
Test on variance
Test on proportion
Introduction
List of tests
Module 4 Key Learning Since P-value > significance level (0.05), we REJECT NULL HYPOTHESIS;
variances are not equal.
Note: Some explanation on the different Unequal Variance Tests Method.

https://www.jmp.com/support/help/en/16.0/index.shtml#page/jmp/unequal-
variances.shtml
87
ST Restricted
Module 4

Exercise #18
Go to hotspot and select t Test
Test on mean
Test on variance
Test on proportion
Non-parametric procedures Since P-value < significance level (0.05), we REJECT NULL
Introduction
List of tests
HYPOTHESIS. Mean difference is not equal to 0.
88
ST Restricted
Module 4

Exercise #18
Go to Add-Ins > Statistics Calculator > Hypothesis Test for Two Means
Test on mean
Test on variance
Test on proportion
Introduction
List of tests
Since P-value <

significance level (0.05), we
REJECT NULL
HYPOTHESIS.
89
ST Restricted
Module 4

Summary of Tests
Test on mean
DEPENDENT INDEPENDENT
Test on variance SAMPLES SAMPLES
Test on proportion Test the ratio of variances
Variances Variances
Test difference of means KNOWN UNKNOWN
Test difference of proportions Variances
Variances assumed
Test on correlation coefficient EQUAL assumed
UNEQUAL
Introduction
List of tests HYPOTHESES REJECT H0 IF:
𝒔𝟐𝒙 is a value of the F
Module 4 Key Learning 𝜎𝑥2 The Test Statistic: 𝑭= 𝟐
𝐻0 : 2 = ≤ 1 𝒔𝒚
𝜎𝑦
𝐹 > 𝐹𝑛𝑥 −1,𝑛𝑦−1,𝛼
𝜎𝑥2 distribution with (nx-1) and (ny-1) d.f. for the
𝐻1 : 2 >1
𝜎𝑦 numerator and the denominator of F respectively.
𝜎𝑥2
𝐻0 : 2 = 1
𝜎𝑦
𝐹 > 𝐹𝑛𝑥 −1,𝑛𝑦−1,𝛼/2
𝜎𝑥2
𝐻1 : 2 ≠ 1
𝜎𝑦 90
ST Restricted
Module 4

Exercise #19
1. Two exercise files from previous example.
Test on mean
a. Unequal Variance
Test on variance b. Equal Variance
Test on proportion
Test difference of means 2. Unequal Variance Test have been shown
Test ratio of variances previously in Exercise #17 and #18..
More than two populations 3. If need be, trainer can show again.
Introduction
List of tests
91
ST Restricted
Module 4

Summary of Tests
Significance, confidence, power DIFFERENC RATIO of 2 DIFFERENCE Correlation
Test on mean SAMPLES SAMPLES
Test on variance Test the difference of proportions
Test on proportion
Variances Variances
Two normal populations KNOWN UNKNOWN
Test difference of proportions EQUAL assumed
UNEQUAL
Introduction (𝑝Ƹ𝑥 − 𝑝Ƹ𝑦 )
List of tests The Test Statistic: 𝑧= is a value of the standard normal distribution.
𝑝Ƹ0 (1 − 𝑝Ƹ0 ) 𝑝Ƹ0 (1 − 𝑝Ƹ0 )
+ n x p̂ x + n y p̂ y
Module 4 Key Learning 𝑛𝑥 𝑛𝑦 Where: p̂0 =
nx + n y
HYPOTHESES REJECT H0 IF: ASSUMPTIONS
H0: px - py = () 0
z < -z
H1: px - py < 0 Two large independent random
H0: px - py = () 0 samples of sizes nx and ny, are drawn.
z > z
H1: px - py > 0 The normal approximation holds (still
H0: px - py = 0 z < -z/2 OR z the rule of thumb, np(1-p)>9)
H1: px - py  0 >z/2
92
Where 𝒑 ෝ 𝟎 is a weighted estimate of the (under H0) common proportion.
ST Restricted
Module 4

Exercise #20
Test on mean
Test on variance
Test on proportion
2. Trainer will show using JMP:
Two normal populations a. How to perform Hypothesis Test for 2 Sample Proportion.
Test difference of proportions 3. Interpretation of results.
Introduction
List of tests
93
ST Restricted
Module 4

Exercise #20
Go to Tables > Stack. Go to Analyze > Fit Y by X
Test on mean
Test on variance
Test on proportion
Introduction
List of tests
If left test
If right test
If 2-sided test
Depending on the system of hypothesis, evaluate P-value

Can toggle the response
against alpha (0.05)
94
ST Restricted
Module 4
Hypothesis testing - introduction Inference on the correlation coefficient

Significance, confidence, power HYPOTHESIS TESTING ON PARAMETERS OF TWO NORMAL POPULATIONS
Test on mean E of 2 means variances of 2 proportions coefficient
Test on variance
Test on proportion
DEPENDENT INDEPENDENT
Two normal populations SAMPLES
SAMPLES
Test difference of proportions Variances Variances
KNOWN UNKNOWN
More than two populations Variances
FOR THE CORRELATION COEFFICIENT
Variances assumed
EQUAL assumed
Non-parametric procedures UNEQUAL
Introduction
List of tests
𝐫 (𝐧 − 𝟐)
The TEST STATISTIC: 𝐭= is a value of the t distribution with (n-2) d.f.
(𝟏 − 𝐫𝟐 )
HYPOTHESES REJECT H0 IF:

𝐻0 : 𝜌 = 0 (𝑜𝑟𝜌 ≥ 0)
𝑡 < −𝑡𝑛−2,𝛼
𝐻1 : 𝜌 < 0
𝐻0 : 𝜌 = 0 (𝑜𝑟𝜌 ≤ 0)
𝑡 > 𝑡𝑛−2,𝛼
𝐻1 : 𝜌 > 0 NOTE: for a nonparametric test on the
correlation coefficient, see also the
𝐻0 : 𝜌 = 0 𝑡 < −𝑡𝑛−2,𝛼/2 OR 𝑡 > 𝑡𝑛−2,𝛼/2 Manual of Statistical Methodology, §7.
𝐻1 : 𝜌 ≠ 0 (DMS 8482919_A) 95
ST Restricted
Module 4

Exercise #21
Test on mean
Test on variance
Test on proportion
Two normal populations a. How to perform Hypothesis Test for Correlation Coefficient.
Test difference of means 3. Interpretation of results.
Introduction
List of tests
96
ST Restricted
Module 4

Exercise #21
One normal population Go to Analyze > Multivariate > Multivariate
Test on mean
Test on variance
Test on proportion
Correlation is significant
Introduction
List of tests

Cast column to Y > OK
97
ST Restricted
Module 4
Hypothesis testing - introduction Comparing more than two population means

Parametric procedures PROBLEM: We want to compare the unknown means of k=3 populations.
One normal population “Populations” can be “equipment”, “testers”, “bonders” and so on, e.g. to
Test on mean
Test on variance assess the alignment between equipment, testers, bonders etc.
Test on proportion
Two normal populations The system of hypotheses to test is:
Test ratio of variances H0: μ1 = μ2 = μ3 = μ (constant)
Test on correlation coefficient H1: “At least one of the means is different from at least another one”.
Introduction
List of tests IDEA: “to compare all possible pairs of means, using the methods shown so far
(e.g. the t-test)”.
The following group of hypotheses should be tested:
H0: μ1 = μ2 H0: μ1 = μ3 H0: μ2 = μ3

H1: μ1 ≠ μ2 + H1: μ1 ≠ μ3 + H1: μ2 ≠ μ3
98
ST Restricted
Module 4

Comparing more than two population means
Parametric procedures WHY NOT? This approach has several drawbacks. Among them:
One normal population 1. It is time-consuming (as k increases, the number of pairs to compare will rapidly become very
Test on mean
Test on variance large)
Test on proportion 2. If each comparison is made at the α significance level, the conclusion, based on these tests is
not associated to the same alpha level. An adjustment is required (e.g. Bonferroni or others).
Non-parametric procedures We need a simpler and more efficient method: the F-test
Introduction
List of tests
Generalizing to a number k of means, we will test the following system of hypotheses.
𝑯𝟎 : “the k means are all equal”

𝑯𝟏 : “at least one of the k means is significantly different from at least another one”
Assumptions (of the F-test):

1. The samples are independent and randomly drawn from k populations
2. All k populations are normally distributed
3. The k population variances are homogeneous (i.e. not significantly different)
99
ST Restricted
Module 4

Parametric procedures CASE 1: one variable - simultaneous comparison of means (5 means for UBM thickness in the example)
Test on mean INPUT OUTPUT
Test on variance
Test on proportion
one-way X1 - one independent Y - one dependent
variable on many levels SYSTEM response variable
Test difference of means ANOVA
Test on correlation coefficient UBM thickness, 5 levels Ball Shear
Introduction CASE 2: two variables - simultaneous comparison of means (in the example: 5 means for variable X1
List of tests
and 3 means for variable X2)
Module 4 Key Learning INPUT OUTPUT
Two-ways
X1 and X2 - two independent Y - one dependent
ANOVA variables on many levels
SYSTEM response variable
X1 - Reflow Temperature, 5 levels

Ball Shear
X2 - Reflow Time, 3 levels
⋮
More than two variables multivariable ANOVA
NOTE 100
for a deeper analysis on these tests, consider the training on “Model Building and Design of Experiment, Level 2”. ST Restricted
Nb. of variables Goal of the test Parametric Non-parametric Multiple comparison technique (*)
test test (medians)
1 Compare means F-test Kruskal-Wallis Tukey/Bonferroni/Sheffè/LSD/Newman-Keuls/…
2 Compare means F-test Friedman Tukey/Bonferroni/Sheffè/LSD/Newman-Keuls/…
⋮
One variable:
For a given parameter (e.g. “Ball Diameter”), test the alignment between k machines (1 indep. Variable=MACHINE , k levels=k MACHINE_ID’s)
More than one variable

For a given parameter ((e.g. “Ball Diameter”), test the alignment between k machines (first variable, K levels)
AND
the effect of s different air temperatures (second variable, S levels) on the same parameter
AND
The effect of h values of relative Humidity (third variable, H levels) on the same parameter
(*) The F-test tells us if at least one mean is different from – at least – another one. It does not tell us which mean
is different from which other. To know this, we can use the “multiple comparison methods”, which identify groups of
homogeneous means.
101
ST Restricted
Module 4

F test to compare K means (1 variable)
To carry-out the test, some sample data are required. Once data are available, statistical
packages like JMP calculate the p-value of the F test.
Test on mean
Test on variance
Test on proportion
Hypotheses:
Two normal populations 𝐻0 : 𝜇𝑖 = 𝜇 ∀𝑖, 𝑖 = 1,2, … , 𝑘 all the k population means are equal
Test difference of means 𝐻1 : ∃𝑖: 𝜇𝑖 ≠ 𝜇 at least one of the population means is different
Non-parametric procedures Sample data :
Introduction
List of tests The following table summarizes the results of data collection. Each row represents a sample.
Samples can be either equally or differently sized
Sample from the first population
Sample
Levels Sample
Average
1 Average of data from
the first population
2
Variable X on K levels
⁞ ⁞ ⁞ Average of data from
the k-th population
k
Sample from the k-th population

102
ST Restricted
Module 4

Parametric procedures Example: we want to test the alignment of K machines. We refer to a relevant parameter Y e.g. “ball
One normal population shear”. From each machine, a sample of n measurements has been collected.
Test on mean
Test on variance
Test on proportion
STANDARD NOTATION
Test ratio of variances Machines Sample (n replications) Average
Test on correlation coefficient 1
2
𝑦𝑖𝑗 Average of the n replications
Introduction
List of tests
⁞ 𝑦ത𝑖∙ of the ith sample (ith machine)
k
Grand average 𝑦ത∙∙ 𝒊 = 𝟏, 𝟐, ⋯ , 𝒌
𝒋 = 𝟏, 𝟐, ⋯ , 𝒏
Ball shear value from ith
machine, in the jth replication)
Grand average of all the n x k observations
The dots (∙) substitute the indices used for averaging. For example, in 𝑦ത𝑖∙ it replaces the index j, (columns) to permit the
calculation of row averages. Or it replaces both indices i and j – like in 𝑦ത∙∙ to indicate the value of the grand average
NOTE
For simplicity, in this example the replications are “balanced”, i.e. same number of measurements from each machine (n). 103
This is not a necessary condition for data analysis. Unbalanced cases can be analyzed as well ST Restricted
Module 4

In short, we can say that the machines are aligned if simultaneously the samples are drawn
Parametric procedures from populations with:
One normal population 1. Same mean (𝝁𝟏 = 𝝁𝟐 = ⋯ = 𝝁𝒌 )
Test on mean
Test on variance 2. Same variance (𝝈𝟐𝟏 = 𝝈𝟐𝟐 = ⋯ 𝝈𝟐𝒌 )
Test on proportion
3. Same distribution (by assumption, the normal distribution)
Test difference of proportions Let’s focus on the equality of the k means;
More than two populations ഥ𝟏∙
𝒚 𝝁𝟏
ഥ𝟐∙
𝒚 𝝁𝟐
Introduction are estimators of
List of tests
⁞ ⁞
ഥ𝒌∙
𝒚 𝝁𝒌
Machines Sample (n replications) Average
1 Sample from machine 1 ഥ𝟏∙

𝒚
2 Sample from machine 2 ഥ𝟐∙
𝒚
𝑦ത𝑖∙
⁞ ⁞ ⁞
k Sample from machine k ഥ𝒌∙
𝒚 𝑖 = 1,2, ⋯ , 𝑘
𝑗 = 1,2, ⋯ , 𝑛
Grand average ഥ∙∙
𝒚 104
ST Restricted
Module 4

Test on mean
“all the machines are matched”
Test on variance
Test on proportion
Two normal populations one common
Test ratio of variances H0: μi = μ ∀ i σ
population
Non-parametric procedures μ1 = μ2 = ⋯ = μk = μ
Introduction
List of tests Two possible
outcomes:
“at least 2 machines are mismatched”
? ?
σ σ σ
H1: ∃ i: μi ≠ μ
μ1 μk μ2 = μ3
105
ST Restricted
Two sources of variability generate two different types of useful information to test the hypotheses on the
equality of means:
Source 1 Machines Sample (n replications) Average Source 2
Variability WITHIN the

1 Sample from machine 1 ഥ𝟏∙
𝒚
Variability BETWEEN the
k samples 2 Sample from machine 2 ഥ𝟐∙
𝒚
k samples
⁞ ⁞ ⁞
k Sample from machine k ഥ𝒌∙
𝒚
𝒚
This variability represents the
The more these values are
inherent process variability. No
variable, the more it is likely
link with the effect of the variable
that H0 will be rejected (i.e.
on the considered parameter
𝑯𝟎 : 𝝁𝒊 = 𝝁 ∀𝒊, 𝒊 = 𝟏, 𝟐, … , 𝒌 means are different)
(differences between machines)
𝑯𝟏 : ∃𝒊: 𝝁𝒊 ≠ 𝝁
106
ST Restricted
Module 4
Hypothesis testing - introduction F test to compare K means (1 variable)

Machines Sample (n replications) Average
Variability BETWEEN
Parametric procedures 1 Sample from machine 1 ഥ𝟏∙
𝒚 the k samples
Test on mean 2 Sample from machine 2 ഥ𝟐∙
𝒚
Test on variance
Test on proportion ⁞ ⁞ ⁞
Test ratio of variances k Sample from machine k ഥ𝒌∙
𝒚
Variability WITHIN
𝒚
More than two populations the k samples
𝑯𝟎 : 𝝁𝒊 = 𝝁 ∀𝒊, 𝟏 = 𝟏, 𝟐, … , 𝒌
Introduction 𝑯𝟏 : ∃𝒊: 𝝁𝒊 ≠ 𝝁
List of tests

How the F-test works
is considered Reject H0
𝒗𝒂𝒓𝒊𝒂𝒃𝒊𝒍𝒊𝒕𝒚 𝑩𝑬𝑻𝑾𝑬𝑬𝑵 LARGE (machine mismatched)
If the ratio 𝑭=
𝒗𝒂𝒓𝒊𝒂𝒃𝒊𝒍𝒊𝒕𝒚 𝑾𝑰𝑻𝑯𝑰𝑵 is considered Do not reject H0
SMALL (machine matched)
NOTE:
Statistics helps us in fixing the relative concepts of “large” and “small”, once a level of significance has been established. 107
ST Restricted
Module 4

F test to compare K means
The F statistic is the ratio of the between estimate of variance and the within estimate
of variance
Test on mean
Test on variance
o It is always positive
Test on proportion
o df1 = k -1 will typically be small
o df2 = (n-1)k = nT - k will typically be large
Non-parametric procedures g(F)
Introduction
List of tests
Decision Rule:
 = 0.05
Reject H0 if
F(calc) > F(k-1), (nT – k), 
0 F
Do not Reject H0
Or for small p-values. reject H0
F(k-1), (nT – k),
F(calc) = ?
108
ST Restricted
Advanced Explanation
Numerically
To calculate the test statistic F, we first decompose SST, the total deviance Machines Sample (n replications) Average
(SST stands for “Total Sum of Squares”) of the observations into the following
components: 1
2
SST = SSX + SSE
where, ⁞ 𝑦𝑖𝑗 𝑦ത𝑖∙
- SSX is the deviance of the sample averages (i.e. the variability “between” k
the samples due to the differences between the levels of the variable X, 𝑖 = 1,2, ⋯ , 𝐾
e.g. the machines) and
Grand average 𝑦ത∙∙
𝑗 = 1,2, ⋯ , 𝑛
- SSE is the sum of the deviances “within” the samples, the inherent
process variability
𝑘 𝑛 𝑘 𝑘 𝑛
2 2 2
෍ ෍ 𝑦𝑖𝑗 − 𝑦ത∙∙ = 𝑛 ෍ 𝑦ത𝑖∙ − 𝑦ത∙∙ + ෍ ෍ 𝑦𝑖𝑗 − 𝑦ത𝑖∙
𝑖=1 𝑗=1 𝑖=1 𝑖=1 𝑗=1
SST SSX SSE

109
ST Restricted
To calculate the test statistic F, we need a ratio of two variances. Not of two deviances.
To obtain a variance from a deviance, we divide it by its degrees of freedom.
𝑫𝒆𝒗(𝑿)
𝒗𝒂𝒓 𝑿 =
𝑫𝑭
In our case:
Deviance Degrees of Freedom (DF)
SST kn - 1
SSX k–1
SSE k(n – 1)
So, to get the variances (also called Mean Squares, MS), we simply divide the deviances (Sum of Squares) by the
corresponding DF:
Deviance (SS) Degrees of Freedom (DF) Variance (MS)
SSX k–1 MSX = SSX/(k -1)
SSE k(n – 1) MSE = SSE/(k(n – 1))
SST kn - 1 MST = SST/(kn – 1) is not used
110
ST Restricted
𝐻0 : 𝜇𝑖 = 𝜇 ∀𝑖, 𝑖 = 1,2, … , 𝑘 𝑀𝑆𝑋

To test the hypotheses 𝐻1 : ∃𝑖: 𝜇𝑖 ≠ 𝜇 the test statistic F can then be calculated: 𝐹=
𝑀𝑆𝐸
The terms calculated so far are usually summarized in a table called AN.O.VA. table (stands for ANalysis Of VAriance).
Degrees of
Source of Devianc Test
Freedom Variance (MS) P-value
variability e (SS) statistic
(DF)
Variable X
SSX k–1 MSX = SSX/(k -1)
(machines) 𝑴𝑺𝑿
𝑭𝑿 = 𝒑 − 𝒗𝒂𝒍𝒖𝒆𝑿
Error SSE k(n – 1) MSE = SSE/(k(n – 1)) 𝑴𝑺𝑬
Total SST kn - 1
One-factor (or “one-way”) AN.O.VA. table
NOTE:
Be careful not to be misled. This procedure is called analysis of variance (ANOVA) since it uses the ratio between two variances.
However, they are not the object of our investigation (see the hypotheses), we just “use” them, but to study the population means!
111
ST Restricted
Module 4

F test - 2 variables
X1 X2 1 2 … h Average
Test on mean 1
Test on variance
Test on proportion 𝒊 = 𝟏, 𝟐, ⋯ , 𝒌 Levels of X1
𝒋 = 𝟏, 𝟐, ⋯ , 𝒉 Levels of X2
Test difference of means 2
Test ratio of variances 𝒓 = 𝟏, 𝟐, ⋯ , 𝒏 Replications
More than two populations ⁞ ⁞ ⁞ ⁞ ⁞ ⁞
Introduction Average of the measurements
List of tests
k ഥ𝒊∙∙
𝒚 obtained when X1 is set on its ith level
Average ഥ∙𝒋∙
𝒚 ഥ⋯
𝒚 Grand average (average of all
⁞
the khn measurements)
𝒚𝒊𝒋𝒓 ഥ𝒊𝒋∙
𝒚
Average of the measurements
obtained when X2 is set on its jth level
rth measurement obtained when X1 and X2
are set on their ith and jth levels respectively
Average of the n replications when X1 and X2

112
are set on their ith and jth levels respectively
ST Restricted
Module 4

F test - 2 variables
Significance, confidence, power X1 X2 1 2 … h Average
Parametric procedures We want to assess the effects of 3 terms:
One normal population ➢ X1
Test on mean 1
Test on variance ➢ X2
Test on proportion ➢ X1X2, the interaction between X1 and X2
Test difference of means 2
Test difference of proportions For each term for which we need to assess
⁞ ⁞ ⁞
More than two populations ⁞ ⁞ ⁞ its effect on the response variable, an F test
is performed. The software calculates a p-
Introduction
value for each term to be tested.
List of tests
k ഥ𝒊∙∙
𝒚
Module 4 Key Learning 𝒊 = 𝟏, 𝟐, ⋯ , 𝒌
Average ഥ∙𝒋∙
𝒚 ഥ⋯
𝒚 𝒋 = 𝟏, 𝟐, ⋯ , 𝒉
⁞
𝒓 = 𝟏, 𝟐, ⋯ , 𝒏
The effect of an interaction between 2 variables X1 and X2 is significant, when for some combinations
(or settings) of X1 and X2, the value of the response variable is significantly higher (or lower) than what
we might expect considering X1 and X2 independently. This is called a “multiplicative” effect of X1 and
X2. If X1 and X2 are independent, their effect is said to be only “additive”.
113
ST Restricted
To test the effects of X1, X2 and their interaction, X1*X2, we decompose SST, the total sum of squares
(or deviance), in the following components:
SST = SSX1 + SSX2 + SS(X1X2) + SSE
𝑘 ℎ 𝑛 𝑘 ℎ 𝑘 ℎ 𝑘 ℎ 𝑛
2 2 2 2 2
෍ ෍ ෍ 𝑦𝑖𝑗𝑟 − 𝑦ത⋯ = 𝑛ℎ ෍ 𝑦ത𝑖∙∙ − 𝑦ത⋯ + 𝑛𝑘 ෍ 𝑦ത∙𝑗∙ − 𝑦ത⋯ + 𝑛 ෍ ෍ 𝑦ത𝑖𝑗∙ − 𝑦ത𝑖∙∙ − 𝑦ത∙𝑗∙ + 𝑦ത⋯ + ෍ ෍ ෍ 𝑦𝑖𝑗𝑟 − 𝑦ത𝑖𝑗∙
𝑖=1 𝑗=1 𝑟=1 𝑖=1 𝑗=1 𝑖=1 𝑗=1 𝑖=1 𝑗=1 𝑟=1
SST SSX1 SSX2 SS(X1X2) SSE
114
ST Restricted
Deviance (SS) Degrees of Freedom (DF) Variance (MS)

SSX1 k–1 MSX1
SSX2 h-1 MSX2
SS(X1X2) (k – 1)(h - 1) MS(X1X2)
SSE kh(n – 1) MSE
SST kh(n – 1) MST
To test the equality of means for X1, X2 and their interaction, three test statistics are calculated:
𝑀𝑆𝑋1 𝑀𝑆𝑋2 𝑀𝑆(𝑋1 𝑋2 )

𝐹𝑋1 = 𝐹𝑋2 = 𝐹𝑋1𝑋2 =
𝑀𝑆𝐸 𝑀𝑆𝐸 𝑀𝑆𝐸
115
ST Restricted
And finally, the AN.O.VA. table can be created
Source of Deviance Degrees of Variance

Test statistic p-value
variability (SS) Freedom (DF) (MS)
MSX1
Variable X1 SSX1 k–1 MSX1 𝑭𝑿𝟏 = 𝒑 − 𝒗𝒂𝒍𝒖𝒆𝑿𝟏
𝑴𝑺𝑬
MSX2
Variable X2 SSX2 h–1 MSX2 𝑭𝑿𝟐 = 𝒑 − 𝒗𝒂𝒍𝒖𝒆𝑿𝟐
𝑴𝑺𝑬
MSX1X2
Interaction X1X2 SSX1X2 (k – 1)(h – 1) MSX1X2 𝑭𝑿𝟏 𝑿𝟐 = 𝒑 − 𝒗𝒂𝒍𝒖𝒆𝑿𝟏 𝑿𝟐
𝑴𝑺𝑬
Error SSE kh(n – 1) MSE
Total SST khn - 1
Two-factors (or “two-ways”) AN.O.VA. table
116
ST Restricted
Module 4

Exercise #22
Test on mean
Test on variance
Test on proportion
Two normal populations 2. Trainer will show using JMP:
Test ratio of variances a. How to perform One Way and 2 Way ANOVA analysis
Introduction
List of tests
117
ST Restricted
Module 4

Exercise #22
One normal population Go to Tables > Stack (Machine A/B/C to Columns)
Test on mean
Test on variance Continuous
Test on proportion
Two normal populations Nominal
Test difference of proportions Ordinal
Introduction
List of tests
Go to Analyze > Fit Y by X. Cast Data to Y, Machine to X Go to hot spot > Means/Anova
118
ST Restricted
Module 4

Exercise #22
One normal population Go to hot spot > Compare Means (All Pairs)
Test on mean
Test on variance
Test on proportion
Introduction
List of tests
Prob value <

significance of 0.05, we
reject Null hypothesis.
Which mean is
different?
Machine B is
different with
Machine A and C
Machine A and C
has no statistically
significant difference 119
ST Restricted
Module 4

Exercise #22
One normal population Go to Analyze > Fit Model
Test on mean
Test on variance
Test on proportion
Introduction
List of tests
Cast Response to Y, Machine&Operator to Macros > Full Factorial
Machine is a significant factor
The factors machine and Operator &

their interaction were added in the model
120
ST Restricted
Module 4

Exercise #22
Test on mean
Test on variance
Test on proportion
Introduction
List of tests
Note: Use common axis settings

121
ST Restricted
Module 4
Associated errors – type I & type II Introduction to nonparametric tests

Test on mean
Test on variance
Test on proportion
Nonparametric Statistics deals with the
Test on correlation coefficient same problems of parametric Statistics.
Introduction
The method is different.
List of tests

Basically, there is at least one nonparametric
equivalent for each parametric type of test.
122
ST Restricted
Module 4

Introduction to nonparametric tests
Test on mean Parametric and nonparametric methods.
Test on variance
Test on proportion
Two normal populations ❑ need for statistical procedures that enable us to process data of "low quality”:
Test ratio of variances ▪ small samples,
▪ on variables about which nothing is known (e.g. their distribution).
❑ Specifically, nonparametric methods were developed to be used in cases when
Introduction the researcher knows nothing about the parameters of the variable of interest in
List of tests
the population (hence the name nonparametric).
❑ Nonparametric methods do not rely on the estimation of parameters (such as the

mean or the standard deviation) describing the distribution of the variable of
interest in the population.
❑ Therefore, these methods are also sometimes (and more appropriately)

called parameter-free methods or distribution-free methods.
123
ST Restricted
Module 4
Associated errors – type I & type II Introduction to nonparametric tests

Test on mean
Test on variance Nonparametric Statistics
Test on proportion
• Fewer restrictive assumptions about underlying
Test difference of proportions probability distributions
More than two populations • Population distributions may be skewed or, in general, assumptions
Non-parametric procedures on the distribution are not required
Introduction
List of tests • All else equal, nonparametric procedures are less
powerful than their parametric counterparts (i.e. higher β,
the probability of type II error → lower power (1-β) → when
the alternative is true, they may be less likely to reject H0)
124
ST Restricted
Module 4

Introduction to nonparametric tests
Test on mean
Test on variance
Test on proportion
Two normal populations ❑ Also for nonparametric statistics, the first step is
the formulation of 2 hypotheses.
Introduction
List of tests
❑ A second step is the calculation of a test statistic
based on sample data
❑ As final result, a p-value is produced. The

interpretation is the same as for parametric tests.
125
ST Restricted
Module 4
Associated errors – type I & type II Nonparametric tests

Differences between independent groups.
Test on mean
Test on variance
• The parametric counterpart is the t-test (for independent samples).
Test on proportion
• Nonparametric alternatives for this test are:
Test difference of means • the Wald-Wolfowitz runs test,
Test difference of proportions • the Mann-Whitney U test, and
• the Kolmogorov-Smirnov two-sample test.
• If we have multiple groups, we would use analysis of variance, ANOVA (the
Introduction nonparametric ANOVA equivalents to this method are the Kruskal-Wallis analysis
List of tests
of ranks and the Median test).
Multiple comparisons ➔ Steel Dwass
H0 / H1 P(value) Statistical Conclusion

H0 : µ1 = µ2 < Alpha Reject Ho
H1 : µ1 ≠ µ2 “The 2 populations have
different means” at Alpha
significance level. 126
ST Restricted
Module 4

Nonparametric tests
Test on mean Equality of variances of 2 independent groups.
• The parametric counterpart is the F-test (for independent
Test on variance
Test on proportion
samples) or the Bartlett test.
• Nonparametric alternatives for this test are:
More than two populations • the Brown-Forsythe test,
Introduction • the Levene test
List of tests
• Cochran Test
• Hartley Test

H0 : 𝜎12 = 𝜎22 < Alpha Reject Ho
H1 : 𝜎12 ≠ 𝜎22 “The 2 populations have
different variance” at Alpha
significance level.
127
ST Restricted
Module 4

Exercise #23
Test on mean
Test on variance
Test on proportion
Two normal populations a. How to perform non-parametric test for two independent samples.
Test difference of proportions 3. Interpretation of results.
Introduction
List of tests
128
ST Restricted
Module 4

Exercise #23
Go to Analyze > Fit Y by X
Test on mean
Test on variance Go to hotspot > Nonparametric > Wilcoxon Test
Test on proportion
Introduction
List of tests
Machine 3 & 4 are not different
129
ST Restricted
Module 4

Exercise #23
One normal population Go to hotspot > Unequal Variances
Test on mean
Test on variance
Test on proportion
Introduction
List of tests

Machine 3 & 4 variance
are the same
130
ST Restricted
Module 4
Hypothesis testing - introduction Nonparametric tests

Differences between dependent groups.
Parametric procedures • The parametric approach is the t-test (two variables measured in the same
One normal population sample. Dependent samples).
Test on mean
Test on variance • Nonparametric alternatives to this test are:
Test on proportion
• the Sign test and
Test difference of means • the Wilcoxon's matched pairs test.
Test difference of proportions • If the variables of interest are dichotomous (like "pass" vs. "no pass") then
McNemar's Chi-square test.
• If there are more than two variables that were measured in the same sample,
Introduction then we would customarily use repeated measures ANOVA.
List of tests • Nonparametric alternatives to this method are
Module 4 Key Learning • Friedman's two-way ANOVA and
• Cochran Q test (if the variable was measured in terms of
categories, e.g., "passed" vs. "failed").

H0 : µ1 = µ2 < Alpha Reject Ho
H1 : µ1 ≠ µ2 “The 2 populations have
different means” at Alpha
significance level.
131
ST Restricted
Module 4

Exercise #24
Test on mean
Test on variance
Test on proportion
Two normal populations a. How to perform non-parametric test for two dependent samples.
Test difference of means (available under Match Paired-t ➔ Non Parametric).
Introduction
List of tests
132
ST Restricted
Module 4

Exercise #24
One normal population Go to Analyze > Specialized Modelling > Matched Pair Go to hotspot > Wilcoxon or Sign Test
Test on mean
Test on variance
Test on proportion
Introduction
List of tests
133
ST Restricted
Module 4

Relationships between variables. Nonparametric tests
• To express a relationship between two variables one usually
Test on mean
computes the correlation coefficient.
Test on variance
Test on proportion
• Nonparametric equivalents to the standard (Pearson) correlation
coefficient are:
Test difference of proportions • Spearman R,
• Kendall Tau and others
Non-parametric procedures • If the two variables of interest are categorical in nature (e.g.,
Introduction
List of tests "passed" vs. "failed" by "male" vs. "female") nonparametric
Module 4 Key Learning statistics for testing the relationship between the two
variables are:
• the Chi-square test or
• the Fisher exact test.
H0 : X1 and X2 are not correlated. < Alpha Reject Ho
H1 : X1 and X2 are correlated. “X1 and X2 are correlated”
at Alpha significance level. 134
ST Restricted
Module 4

Exercise #25
Test on mean
Test on variance
Test on proportion 1. Open the exercise File:
Test difference of proportions a. How to perform non-parametric test for relationship between 2 variables.
More than two populations 3. Interpretation of results.
Introduction
List of tests
135
ST Restricted
Module 4

Exercise #25
Go to Analyze Multivariate Methods > Multivariate
Test on mean
Test on variance
Test on proportion
Introduction
List of tests
Prob value > significance level;

correlation is not significant 136
ST Restricted
Module 4
Associated errors – type I & type II Nonparametric tests

Test on mean
Test on variance
Test on proportion
Goodness Of Fit (GOF) test.
• This type of procedure is used to test a claim about the
distribution of a population (e.g. “normality tests”)
• Examples of nonparametric tests for GOF are:
Non-parametric procedures • Kolmogorov-Smirnov test,
• Anderson-Darling test
Introduction
List of tests
Module 4 Key Learning • Lilliefors test

• Shapiro-Wilks test
• Cramèr-von Mises test
H0 : Data follow an assumed < Alpha Reject Ho
distribution (for example: Normal).
H1 : Data do not follow the assumed
distribution. 137
ST Restricted
Module 4

Exercise #26
Test on mean
Test on variance
Test on proportion
Two normal populations a. How to perform non-parametric test for Goodness of Fit Test.
Test difference of means b. Dependent on what distribution are being assumed, JMP will utilize different test method.
Introduction
List of tests
138
ST Restricted
Module 4

Exercise #26
Go to Analyze > Distribution Column3
Test on mean Go to hot spot > Continuous Fit
Test on variance
Test on proportion
Introduction
List of tests
Module 4 Key Learning Column4
Evaluate the P-value

139
ST Restricted
Module 4

Comparison parametric vs. nonparametric. Summary
Test on mean
Test on variance
Test on proportion
Parametric Non-parametric
Two normal populations Assumed distribution Normal Any
Test ratio of variances Assumed variance Homogeneous Any
Test on correlation coefficient Typical data Ratio or Interval Ordinal or Nominal
More than two populations Data set relationships Independent Any
Non-parametric procedures Usual central measure Mean Median
Introduction
List of tests Benefits Can draw more conclusions Simplicity; Less affected by outliers
Tests
Correlation test Pearson Spearman
Independent measures, 2 groups Independent-measures t-test Mann-Whitney test
One-way, independent-
Independent measures, >2 groups Kruskal-Wallis test
measures ANOVA
Repeated measures, 2 conditions Matched-pair t-test Wilcoxon test
One-way, repeated
Repeated measures, >2 conditions Friedman's test
measures ANOVA
140
ST Restricted
Module 4 Key Learning’s
• Hypothesis Tests for one (normal) population parameters

• Using the Critical Value
• Using the P-Value
• Hypothesis Tests for two (normal) population parameters
• More than two populations: introduction to (one-way) ANOVA
• Introduction to nonparametric (distribution-free) Tests
141
ST Restricted
Annex: Overview of outlier detection methods
ST Restricted
Annex 2 objectives
At the end of this chapter, you will be able to:
• Assess the importance of detecting outliers prior to any statistical analysis.

• Have a better visibility on the most popular methods to detect outliers with
particular focus on univariate ones.
143
ST Restricted
Introduction
Outliers detection
As pointed out in the Manual of Statistical Methodology (8482919 ver.2), Chapter 7, great
importance resides in the adoption of effective methods to detect outliers. The quality of the
results of statistical analyses performed on contaminated data is heavily affected by the
presence of outliers in the dataset. As an example, consider two important statistical
applications which are heavily affected by the presence of outliers: Regression Analysis (with
OLS method) and Control Charts for process monitoring.
Moreover, from “Outlier identification in high dimensions” (2006), P. Filzmoser, R. Maronna, and M. Werner:
“Accurate identification of outliers plays an important role in statistical analysis. If classical statistical models
are blindly applied to data containing outliers, the results can be misleading at best. In addition, outliers
themselves are often the special points of interest in many practical situations and their identification is the
main purpose of the investigation. Classical tools based on the mean and covariance matrix are rarely able
to detect all the multivariate outliers in a given sample due to the masking effect (Becker and Gather, 1999),
with the consequence that methods based on classical measures are unsuitable for general use unless it is
certain that outliers are not present. Contaminated data are commonly found in several situations, and so
robust methods that identify or downweight outliers are essential tools for statisticians”.
144
ST Restricted
Methods for outlier detection
Several methods have been developed to detect outliers.
A first classification level separates between:
→ Methods for Univariate Outlier Detection

→ Methods for Multivariate Outlier Detection
While most surveys collect multivariate data, univariate outlier detection methods are usually
preferred for their simplicity. But these methods fail to detect observations that violate
the correlational structure of the dataset.
Graphical example: At univariate

level point A is not an outlier (neither
for X1 or X2). Conversely, considering
the bivariate distribution of X1 and A
X2, point A is an outlier.
OUTLIER
145
ST Restricted
Yet, the methods for outliers detection can be divided into different groups
according to the statistical procedure/approach which is adopted:
• Distribution-based methods.
• Distance-based methods.
• Density-based methods.
• Methods based on clustering.
146
ST Restricted
Distribution-based methods
they assume a known distribution of the data, and test if the target extreme value is an outlier of the
distribution, i.e., whether or not it deviates from the assumed distribution. Examples of this group of methods
are Dixon or Grubb tests. Often, in real world data it is not easy to fulfill the distributional requirements, and
this creates a limitation to their use.
Distance-based methods
Several outliers detection methods use some measure of distance to evaluate how far away an observation
is from the centre of the data. To measure this distance, the sample mean and variance may be used, but
since they are not robust to outliers, they can mask the very observations we seek to detect. In other terms,
a method which is not robust, i.e. which itself is being effected by the outliers, is of few (if no) help in
detecting them. To avoid this masking effect, variability and location estimators need to be “robustified” , that
means make the statistical estimators less sensitive to outliers. It is for this reason that many outlier
detection methods use order statistics, such as the median or quartile.
Methods for robustification of the estimators include, among the others, the Minimum Covariance
Determinant (MCD) due to Rousseeuw (The MCD estimator is determined by a subset of points of size h
which minimizes the determinant of the variance-covariance matrix over all subsets of size h).
In univariate statistics, distance-based methods provide interesting results and are often preferred for their
relative simplicity. However, in high dimensional space the notion of outlier based on distance may become
meaningless.
147
ST Restricted
Density-based methods
these methods assign to each object a degree to be an outlier. This degree is called the Local Outlier Factor
(LOF) of an object. It is “Local” since the considered property is the density of objects in the surrounding
neighborhood of the object itself.
Methods based on clustering
Clustering is a basic method to detect outliers. From the viewpoint of clustering algorithm, potential outliers
are data which are not located in any cluster. Furthermore, if a cluster significantly differs from other clusters,
the objects in this cluster might be outliers.
Graphically:
CLUSTER A CLUSTER B
To be noticed that with a distance-based

approach, the point in the red circle would
never be considered an outlier. In fact, it is very
OUTLIER close to the average of the dataset.
148
ST Restricted
Methods for univariate outlier detection
The methods listed below are based on “distance considerations” and are generally
considered robust in case of non-normal data (→ they do not require the normality
assumption). The idea of “distance”, means that an observed value is defined outlier if its
distance from what is considered the centre of the distribution is greater than a cut-off value.
Among the most popular methods:
• THE STANDARD DEVIATION (SD) METHOD

• THE Z-SCORE METHOD
• THE MODIFIED Z-SCORE METHOD
• THE TUKEY’S METHOD (BOXPLOT)
• THE ADJUSTED BOXPLOT METHOD
• THE MADe METHOD
149
ST Restricted
In a simulation study within the STATS Program, these methods have been tested on a
representative number of FE SPC variables with real data (results available).
The conclusion of the study about the most pertinent methods are summarized as follows:
➢ MADe and MD methods provide equivalent and pertinent results on both production and
monitor data.
➢ BoxPlot methods is generally aligned with MD & MADe, but in several cases the
proposed limits are not well adapted to distribution. This method provides good results
when employed on contamination data.
➢ Adjusted Box Plot (with Johnson Fit or Bootstrap methods) don’t provide correct limits.
150
ST Restricted
The MADe Method
The MADe method, using the Median and the Median Absolute Deviation (MAD), is one of the
basic robust methods which are largely unaffected by the presence of outliers in the dataset.
This approach is similar to the SD method. However, here the median and MADe are employed
instead of the mean and the standard deviation.
The MADe method is defined as follows:
RULE: An observation is considered outlier if its value is outside the interval:
MED ± 3 MADe
where (for a sample of size n):

• MED is the median (or 50th percentile)
• MADe=1.483×MAD for large normal data.
• MAD = Median Absolute Deviation = Median (|xi – Median(x)|) and i=1, 2, …, n
MAD is an estimator of data variability. It is similar to the standard deviation and like the
median has an approximately 50% breakdown point.
When the MAD value is scaled by a factor of 1.483, it is similar to the standard deviation in a
normal distribution. This scaled MAD value is the MADe.
151
ST Restricted
Key Learning’s
Now you know:
• The importance of adopting effective filters for outliers in
every statistical analysis.
• That exist several methods and approaches to detect
outliers.
• How to detect outliers at univariate level (using distance-
based methods).
152
ST Restricted
Conclusion
153
• What you could do next to better improve your statistical

competency:
• Use as much as possible what you have learned. And do it since
tomorrow!
• Only way to avoid forgetting what you learned: do not wait too
much time after the course to start implementing the techniques shown
in the training.
• Think about attending the next training course on “Statistical
Model Building”
• You will learn:
• how to generate a functional relation between a response variable and one or
more explanatory variables, based on empirical data.
• How to optimize these models making them reliable and usable.
• How to find stability windows and how to optimize a process
ST Restricted
Post-test
• Complete the post-test to the best of your knowledge
It allows us to measure the learning that has taken place

during the training.
10-15 minutes
154
ST Restricted
Customer satisfaction
How can we improve for next time?
Kirkpatrick Level 1 evaluation questionnaire
You will receive an e-mail, Please take 5mn to complete the

evaluation form this will help us to continually improve
the learning content, facilitation, organization
155
ST Restricted
CONGRATULATIONS!!
ST Restricted
File Revision
Version Date Remarks Who

1.0 2017 Initial Release Marco Della Seta
• New format and template.
2.0 June 2021 Marco Della Seta / HK Looi
• New exercises.
157
ST Restricted

STATS 2 Part 2 Rev 2.0 With Exercise

Uploaded by

Document Information

Original Title

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

STATS 2 Part 2 Rev 2.0 With Exercise

Uploaded by

Copyright:

Available Formats

STATISTICS LEVEL 2

Module 1: INTRODUCTION Module 4: HYPOTHESIS TESTING

Module 3: CONFIDENCE INTERVAL

• At the end of this chapter, you will be able to:

• Test statistical hypotheses on single population parameters

Hypothesis testing - introduction

Associated errors – type I & type II

Hypothesis testing - introduction

Associated errors – type I & type II

Hypothesis testing - introduction

Associated errors – type I & type II

➢ Is always about a population parameter, not

Module 4 Key Learning

Hypothesis testing - introduction

Associated errors – type I & type II

➢ Always contains “=” , “≤” or “” sign

Module 4 Key Learning

➢ May or may not be rejected

Hypothesis testing - introduction

Associated errors – type I & type II

Module 4 Key Learning

Hypothesis testing - introduction Hypothesis Testing Process

Module 4 Key Learning

Hypothesis testing - introduction Hypothesis Testing Process

Hypothesis testing - introduction Hypothesis Testing Process

max distance (to still support H0)?

Hypothesis testing - introduction Hypothesis Testing Process

Hypothesis testing - introduction Hypothesis Testing Process

Hypothesis testing - introduction Hypothesis Testing Process

𝐻𝑜 : 𝜇 = 0 𝑜𝑟 (𝜇 ≥ 0) 𝐻1 : 𝜇 < 0 (One Sided Lower)

𝐻𝑜 : 𝜇 = 0 𝑜𝑟 (𝜇 ≤ 0) 𝐻1 : 𝜇 > 0 (One Sided Upper)

Hypothesis Testing is about inferring population parameter.

Hypothesis testing - introduction

Module 4 Key Learning

Hypothesis testing - introduction

Here onwards, we are using 𝝈𝒙 unknown situation.

Hypothesis testing - introduction

Test on correlation coefficient

Module 4 Key Learning

• How to find the Critical Value?

Critical Value Critical Value

Hypothesis testing - introduction

More than two populations

Module 4 Key Learning Statistician have tabulated the function

Hypothesis testing - introduction

Module 4 Key Learning

Hypothesis testing - introduction

More than two populations

Module 4 Key Learning 𝒔

ഥ = 𝟓 ± (𝟐. 𝟎𝟒𝟓)(𝟎. 𝟔𝟕)

Hypothesis testing - introduction

More than two populations Assume 𝐻𝑜 is true

Module 4 Key Learning

Convert Sample Mean 𝑇𝑒𝑠𝑡 𝑆𝑡𝑎𝑡𝑖𝑠𝑡𝑖𝑐 = −9.29 -2.045 2.045

𝒙ǉ − 𝝁 (−𝟏. 𝟐𝟕) − (𝟓)

(Note: In JMP, the calculation takes into account

Module 4 Key Learning Non-Standardized Scale

𝑥ҧ = −1.27 3.63 6.37

The case shown here is

Critical Value Critical Value

Test Statistic < Abs(Critical Value)

Test Statistic Test Statistic

Test Statistic > Abs (Critical Value)