STATS 2 Part 1 Rev 2.0 With Exercise Slides Ao

STATISTICS LEVEL 2
Part 1
Speaker Name
ST Context2
• STatS is a project started in 2012 under the sponsorship of PQR Management.
• It aims to help ST become best-in-class by introducing innovative statistical tools and methodologies at Company level.
• The main goals of STatS are to review, rationalize and improve the effectiveness of statistical methodology in general.
• STatS is intended to continuously improve our detection capability through the adoption of an advanced statistical approach, and to reduce DPPM (Defective Parts Per Million) through innovation in the statistical techniques deployed in ST manufacturing.
• It drives and supports the deployment and correct application of the Statistics Manuals in all ST manufacturing plants.
Statistics Learning
[Diagram: Statistics learning path]
• Awareness: Statistics Level 1 (STAT 1), S.P.C. Awareness, M.S.A. Awareness, D.O.E. Awareness
• S.P.C. Operators, S.P.C. Engineers
• Statistics Level 2: Part 1, Part 2
• Statistical Model Building 1 + D.O.E. 1 (SMB 1 + D.O.E. 1)
• Measurement System Analysis (MSA)
• Statistical Model Building 2 + D.O.E. 2 (SMB 2 + D.O.E. 2)
• Multivariate Statistics
Color code legend:
• Green shadows = No prerequisites
• Other colors = Prerequisites
Why this course?
• To provide the fundamentals of statistics

• To answer current statistical questions in everyday work
• To produce more accurate/effective statistical analysis
4
Training purpose
• To understand the basics of inferential Statistics
• To form, use and interpret confidence intervals
• To perform and interpret statistical tests: null and alternative hypotheses, test statistics, p-value
• To understand and use the concept of correlation between variables
• To use both parametric and non-parametric Statistics
• To focus on the importance of the methods for outlier detection
• To know about the existence of bootstrap methodology
5
Benefits
• To analyze information at the proper level and take the right decision accordingly
6
Let’s get to know each other…
Round table:
• Name
• Organization
• Are you already using statistical methodology?
• If so, what are the main applications?
• Expectations from the course
7
Pre-test
• Complete the questionnaire to the best of your knowledge
• This is not an individual ‘control’ test
• It will allow us to gauge your level of knowledge of the subject prior to the training (so, if you don’t know, don’t worry …)
• We will repeat the questionnaire at the end of the course to measure the learning that has taken place
• Time: 10 minutes
8
Structure of the Course
Module 1: INTRODUCTION
• Introduction
• First Concepts
• Population VS Sample
• Descriptive VS Inferential
• Estimation (point and interval)
• Hypothesis Testing
• Inferential Error
Module 2: CENTRAL LIMIT THEOREM
• Decision Making Process
• Numerical Simulation & Examples
• t-Distribution
Module 3: CONFIDENCE INTERVAL
• Estimation
• Point Estimation
• Properties of Point Estimators
• Interval Estimation
• Parameters of 1 Normal Population
• Parameters of 2 Normal Populations
Module 4: HYPOTHESIS TESTING
• 1 Normal Population
• 2 Normal Populations
• ANOVA
• Non-Parametric Test
Annex 1
• Introduction to Bootstrap
Annex 2
• Overview of Outlier Detection Methods
Module 1 : Introduction
Objective: (Duration ~ TBD hrs)
• Recall from STATS 1: Population vs Sample, Descriptive vs Inferential Statistics.
• Point and interval estimation of parameters.
• Hypothesis testing procedures.
• Inferential error.
Know (Statistical Theory):
• Recall from STATS 1
• Introduction to Inference
• Estimation & Hypothesis Testing
• Module 1 Key Learning
How (JMP Application):
• Not Applicable.
FAQ
#1 : To collect input from participants.
12
Module 1
Recall from STATS 1
Introduction To Inference Descriptive and Inferential Statistics

Estimation & Hypothesis Testing
Inferential Error
Module 1 Key Learning
Two branches of Statistics
Descriptive statistics
• Collecting, summarizing, and processing data to transform data into information
Inferential statistics
• provide the basis for predictions, forecasts, and estimates that are used to transform
information into knowledge
13
Module 1
Recall from STATS 1
Introduction To Inference Descriptive Statistics

Inferential Error
Descriptive
•Collect data
• e.g., Survey
•Present data
• e.g., Tables and Graphs
•Summarize data
• e.g., Sample average: $\bar{X} = \frac{\sum X_i}{n}$
14
Module 1
Recall from STATS 1
Introduction To Inference Inferential Statistics

Inferential Error

Inference
•Estimation
• e.g., Estimate the population mean
weight using the sample mean weight
•Hypothesis testing
• e.g., Test the claim that the population
mean weight is 120 pounds
Inference is the process of drawing conclusions or making
decisions about a population based on sample results
15
Module 1
Recall from STATS 1
Introduction To Inference Inferential Statistics

Inferential Error
Inference is the process of drawing conclusions or making

decisions about a population based on sample results
Estimation (of unknown parameters)

• e.g., Estimate the population mean weight using the sample
mean weight
Hypothesis testing
• e.g., Test the claim that the population mean weight is 120
pounds
16
Module 1
Recall from STATS 1
Introduction To Inference
Inferential Error
Two approaches to estimate an unknown population parameter:
• Point Estimation: RESULT = Single Value. Example: “mean = 18”
• Interval Estimation: RESULT = Interval + Confidence. Example: “mean between 17 and 19, at the 95% confidence level”
17
Module 1
Recall from STATS 1
Inferential Error

• A point estimate is a single number.
• A confidence interval provides additional information about variability.
[Diagram: Lower Confidence Limit, Point Estimate, Upper Confidence Limit; the span between the two limits is the Confidence Interval]
18
Module 1
Recall from STATS 1
Inferential Error
Estimation
• e.g., Estimate the population mean weight using the
sample mean weight
Hypothesis testing
e.g., Test the claim that the population mean weight is
120 pounds
19
Module 1
Recall from STATS 1
Inferential Error
• A hypothesis is a claim (assumption) about an aspect of the
population under investigation
• It can be a population parameter like:

• The population mean / variance, …
Example: The mean monthly cell
phone bill of this city is μ = $42
• The population proportion
Example: The proportion of adults in this city with cell phones is π = 0.68
• Other
20
Module 1
Recall from STATS 1
Inferential Error Hypothesis Testing Procedure

Statistical hypothesis testing procedures are methods for investigating an aspect of interest of one or more populations.
For example, we might be interested in:

❑ determining the most likely value of a population parameter (e.g. mean or variance);
❑ comparing the same parameter of two or more than two populations (e.g. two or
more means, variances, proportions or other);
❑ assessing how precisely a certain theoretical distribution (e.g. the normal
distribution) fits the data (Goodness of Fit tests);
❑ and many others.
21
Module 1
Recall from STATS 1
Inferential Error

“What do interval estimation and hypothesis testing have in common?”
“Their results are based on sample data. The basic assumption is that the sample adequately represents the population from which it has been drawn.”
22
Module 1
Recall from STATS 1
Introduction To Inference However,

Inferential Error
“Represents adequately” ≠ “Represents perfectly”
1. In interval estimation problems, we cannot be 100% sure that the true value of the unknown parameter is contained in the interval.
2. When testing a system of hypotheses, we cannot be 100% sure of making the right decision about which hypothesis to support.
23
Module 1
Recall from STATS 1
Inferential Error
WHY can we not be 100% sure of making the right decision (be 100% confident) in interval estimation problems and hypothesis testing procedures?
BECAUSE there is always a positive probability that the sample misleads our conclusions, due to the INFERENTIAL (or SAMPLING) ERROR.
To eliminate this error, the entire population would have to be considered.
24
Module 1
Recall from STATS 1
EXAMPLE:
A sample (of size n) is drawn from a population (of size N >> n) and a confidence interval for the population mean is formed, with the following results:
- Confidence limits: LCL = 53, UCL = 61 (width = 8)
- Confidence level: 0.95 (or 95%)
- Point estimate of the population mean: $\bar{X} = 57$
If we use a higher confidence level (same data), say 99%, we must accept a wider confidence interval → a less precise estimate.
- Confidence limits: LCL = 50, UCL = 64 (width = 14)
- Confidence level: 0.99 (or 99%)
- Point estimate of the population mean: $\bar{X} = 57$
To maintain the same interval width at a higher confidence level, we must increase the sample size.
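A rough sketch of the last statement, assuming the known-σ (z-based) interval introduced later in Module 3. The function name `sample_size_ratio` is hypothetical, and the result does not reproduce the 8 vs. 14 widths of the example, which may come from a different method.

```python
from statistics import NormalDist

def sample_size_ratio(conf_old: float, conf_new: float) -> float:
    """Factor by which n must grow to keep the same CI width (sigma known)."""
    z_old = NormalDist().inv_cdf(1 - (1 - conf_old) / 2)
    z_new = NormalDist().inv_cdf(1 - (1 - conf_new) / 2)
    # width = 2 * z * sigma / sqrt(n), so at fixed width n is proportional to z**2
    return (z_new / z_old) ** 2

ratio = sample_size_ratio(0.95, 0.99)
print(f"n must grow by a factor of about {ratio:.2f}")
```

Under this assumption, moving from 95% to 99% confidence at the same width needs roughly 1.7 times as many units.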
25
Module 1
Recall from STATS 1

• Descriptive and Inferential Statistics
Inferential Error

• Estimation of population parameters
• Hypothesis Testing
• Key definitions:
• Population vs. Sample
• Point vs. Interval Estimation
• Null Hypothesis vs. Alternative Hypothesis
26
Module 2 : The Decision Process and the Central Limit Theorem
• Understand the concept of the Central Limit Theorem
• The Decision-Making Process
• Central Limit Theorem
• Numerical Simulation & Examples using JMP
• t-Distribution
• Module 2 Key Learning
FAQ
28
Module 2
The Decision-Making Process
[Diagram: Data → Descriptive Statistics (STATS 1) → Information → Inferential Statistics (STATS 2) → Knowledge → Statistical Decision; Statistical Decision + Engineering Consideration or Constraint → Pragmatic Decision]
29
Module 2
Predicting Unknown History
• POPULATION: assembled units. SAMPLE: units drawn from them.
• Example: a lot on hold at OQC. The lot is already produced.
• Descriptive Statistics refers mainly to the sample data.
• Will you obtain the same results if you re-sample from the same population?
Predicting Unknown Future
• POPULATION: units going to be assembled. SAMPLE: setup units.
• Example: process setup. Using 30 setup data points (the sample) to predict the next 100K units.
30
Module 2

From the Central Limit Theorem (CLT):
Let $x_1, x_2, \dots, x_n$ be a random sample of size n. The $x_i$ values represent a series of independent and identically distributed random variables, drawn from a population with mean $\mu_X$ and finite variance $\sigma_X^2$.
The CLT provides useful information about the distribution of the sample average. In fact, the theorem shows that the distribution of $\bar{X}$ approaches normality as n grows, regardless of the shape of the distribution of the individual $X_i$.
Two cases can be identified:
1. $X \sim N(\mu_X, \sigma_X^2) \;\rightarrow\; \bar{X} \sim N\left(\mu_{\bar{X}} = \mu_X,\ \sigma_{\bar{X}}^2 = \frac{\sigma_X^2}{n}\right)$
2. $X \sim \text{Whatever (unknown)}$ with parameters $\mu_X, \sigma_X^2 \;\rightarrow\; \bar{X} \sim N\left(\mu_{\bar{X}} = \mu_X,\ \sigma_{\bar{X}}^2 = \frac{\sigma_X^2}{n}\right)$ when $n \to \infty$ (typically, when n > 30)
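Case 2 can be checked numerically outside JMP. The sketch below is only an illustration: it assumes a skewed (exponential) population with mean 5 and standard deviation 5, which is not the exercise data, and the helper `sample_means` is hypothetical.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

def sample_means(n, groups=1000):
    """Means of `groups` samples of size n from an exponential population (mean 5)."""
    return [statistics.fmean(random.expovariate(1 / 5) for _ in range(n))
            for _ in range(groups)]

results = {}
for n in (5, 30):
    means = sample_means(n)
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    print(f"n={n:2d}  mean of means={results[n][0]:.2f}  "
          f"stdev of means={results[n][1]:.2f}  sigma/sqrt(n)={5 / n ** 0.5:.2f}")
```

The mean of the sample means should stay near 5, while their spread should shrink roughly as σ/√n.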
31
Module 2
Exercise #1
Exercise File: M2.0 Central Limit Theorem.xls
Trainer will guide participants in performing the exercise.
Exercise 1.1
1. Click on Sheet Pop 1.
2. Copy Pop 1 data into a JMP table.
3. Establish the data distribution for Pop 1.
Exercise 1.2
4. Create the sampling distribution of the mean and variance for sample size n = 5.
5. Copy into a JMP table.
6. Repeat Steps 4 and 5 for n = 10, 15, 20, 25 & 30.
Exercise 1.3
7. Establish the data distribution for the sampling distribution of the mean (for all n).
32
Module 2
Exercise File: M2.0 Central Limit Theorem.xls
Exercise 1.1
2. Copy the Population data column from the Pop 1 sheet into a JMP table (Edit > Paste With Column Names).
3.1 Click on Analyze > Distribution.
Cast the Population column into Y.
Make sure it is continuous data.

Module 2
Exercise 1.2
6. Repeat Steps 4 and 5 for n = 10, 15, 20, 25 & 30.
6.1 Make 1000 groups of data of size 5.
6.2 Copy Sample Mean and Sample Variance to the JMP data table.
6.3 Draw 1000 groups of data with size n = 10, copy to JMP.
4.26 21.94
3.91 38.86
7.15 31.04
5.00 6.48
5.95 20.37
3.04 15.32
2.14 26.31
4.50 22.12
7.15 32.26
5.44 24.84
6.4 Do the same for n = 15, 20, 25 & 30.
38
Module 2


Exercise 1.3
Go to Analyze > Distribution, select all Mean columns
39
Module 2
[Figure: histograms of the Population and of the Sampling Distribution of the Mean for n = 5, 10, 15, 20, 25, 30]
Did you notice that the Sampling Distribution of the Mean:
1. Is normally distributed.
2. Has a Stdev that decreases as the sample size increases.
3. Has a mean close to the population mean: $N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$
4. Has a Stdev similar to mine.
Everyone has a unique set of numbers, but we all converged to similar results!
40
Module 2

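Population → Sampling Distribution of the Mean: $N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$ (here the second parameter is the standard deviation of the mean)
• n = 5: $N\left(5.02, \frac{4.99}{\sqrt{5}}\right) \Rightarrow N(5.02, 2.23)$
• n = 25: $N\left(5.02, \frac{4.99}{\sqrt{25}}\right) \Rightarrow N(5.02, 1.00)$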
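The same calculation for every sample size used in the exercise can be sketched as follows. The helper name `standard_error` is hypothetical; σ = 4.99 is the population Stdev shown on the slide.

```python
import math

def standard_error(sigma: float, n: int) -> float:
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

sigma = 4.99  # population Stdev taken from the slide
for n in (5, 10, 15, 20, 25, 30):
    print(f"n={n:2d}  sigma/sqrt(n)={standard_error(sigma, n):.2f}")
```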
41
Module 2

Exercise File: M2.0 Central Limit Theorem.xls
Exercise (cont'd)
6. Repeat Steps 4 and 5 for n = 15 & 30.
42
Module 2

Calculated Answer, $N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$ (Simulated Answer: histograms not reproduced)
n = 5: N(4.00, 0.26)
n = 10: N(4.00, 0.18)
n = 15: N(4.00, 0.15)
n = 20: N(4.00, 0.13)
n = 25: N(4.00, 0.12)
n = 30: N(4.00, 0.11)
43
Module 2

Exercise File: M2.0 Central Limit Theorem.xls
Exercise (cont'd)
6. Repeat Steps 4 and 5 for n = 15 & 30.
44
Module 2

Calculated Answer, $N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$ (Simulated Answer: histograms not reproduced)
n = 5: N(6.06, 1.92)
n = 10: N(6.06, 1.36)
n = 15: N(6.06, 1.11)
n = 20: N(6.06, 0.96)
n = 25: N(6.06, 0.86)
n = 30: N(6.06, 0.78)
45
Module 2

[Diagram: Any Population Distribution → Sampling Distribution of the Mean, with standard deviation $\frac{\sigma}{\sqrt{n}}$, and Sampling Distribution of the Variance]
46
Module 2
[Figure: Population and sampling distributions for n = 5, 10, 15, 20, 25, 30]
47
Module 2

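The Sampling Distribution of the Variance follows a Chi-Square distribution (for a normal population, $(n-1)s^2/\sigma^2$ follows a Chi-Square distribution with n − 1 degrees of freedom).
This is another use of the variance, and a reason why it is important to study.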
48
Module 2

[Diagram: Any Population Distribution → Sampling Distribution of the Mean, with standard deviation $\frac{\sigma}{\sqrt{n}}$]
From here onwards, we will focus on inference about the Population Mean as the example (the concept of how we infer about the Population Variance works the same way).
49
Module 2

PROVING
[Diagram: Any Population Distribution → Sampling Distribution of the Mean, $N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$]
PRACTICAL APPLICATION
Using only 1 sample of size n. Ex.: infer about the Pop Mean.

Module 2
However, most likely we will not know the value of the population stdev. As such, we cannot form the Sampling Distribution of the Mean in this situation.
Sample of size n: Mean: $\bar{x}$, Stdev: $s$
Sampling Distribution of the Mean:
• If σ known: $N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$
• If σ unknown: $t\left(\mu, \frac{s}{\sqrt{n}}\right)$. We replace σ with s. This results in a distribution that follows a t-distribution. Real cases mainly belong to this group.
Inference about the Population Mean, e.g. Confidence Interval, Hypothesis Testing.
Example:
• Machine setup.
• Produce n setup units.
• Sample Mean, $\bar{x}$.
• Sample Stdev, $s$.
From the setup you did, you want to infer about the population (as in production). Then you decide whether you need to re-setup or can release the machine for production.
51
Module 2

t-Distribution

[Figure: t density functions, e.g. DoF = 10 and DoF = 60]
Main features of the Normal Density Function:
• It is symmetric
• It is unimodal
• It is bell-shaped
Main features of the t Density Function:
• It is symmetric
• It is unimodal
• It is bell-shaped
• It has thicker tails than the Normal distribution.
• t → Normal as the DoF (n − 1) increases; t ≈ Normal when DoF >> 60.
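The thicker tails can be seen by evaluating the two density functions at a point far from the center. This is a minimal sketch using the textbook density formulas; the function names and the evaluation point x = 3 are choices made here for illustration.

```python
import math

def normal_pdf(x: float) -> float:
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def t_pdf(x: float, dof: float) -> float:
    """Student t density with `dof` degrees of freedom."""
    log_c = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
             - 0.5 * math.log(dof * math.pi))
    return math.exp(log_c - (dof + 1) / 2 * math.log1p(x * x / dof))

for dof in (10, 60, 1000):
    print(f"DoF={dof:4d}  t density at x=3: {t_pdf(3, dof):.5f}")
print(f"Normal    density at x=3: {normal_pdf(3):.5f}")
```

The t density at x = 3 should decrease toward the Normal value as the DoF grows.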
52
Module 2
The known-σ situation can be used when the sample size is large enough.
Reason: t → Normal. In other words, it is as if we knew the Pop Stdev.
[Diagram: Sample of size n (Mean: $\bar{x}$, Stdev: $s$) → Sampling Distribution of the Mean, $N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$ or $t\left(\mu, \frac{s}{\sqrt{n}}\right)$ → Inference about the Population Mean, e.g. Confidence Interval, Hypothesis Testing]
To understand this concept, let’s look at how it is applied and go through practical examples.
Let’s start with the Confidence Interval.
53
Module 2

Module 2 Key Learning
• Understanding the concept behind the Central Limit Theorem.
• Understanding the effect of changing the sample size.
• What the t-distribution is.
54
Module 3 : Confidence Interval
• Know the main properties of point estimators.
• Form Confidence Intervals for the population parameters.
Topics:
• Estimation
• Point Estimation
• Properties of Point Estimators
• Interval Estimation
• Parameters of 1 Normal Population
• Parameters of 2 Normal Populations
• Confidence Interval for 1 Normal Population
  • Frequency Interpretation
  • Practical Exercises of using Confidence Interval
  • Effect of Sample Size on Confidence Interval Width
  • 1 Sample Proportion
• Confidence Interval for 2 Normal Populations
  • Two Independent Samples with Equal Variance
  • Two Independent Samples with Unequal Variance
  • Two Dependent Samples
  • Two Sample Proportion
FAQ
56
Module 3
Estimation
GOAL
Estimation of unknown population parameters (e.g. “the population mean weight μ”) using the available sample data
Relevant terms:
o ESTIMATOR → sample statistic used to estimate the unknown population parameter (e.g. the average of the sample weights, $\bar{X}$)
o ESTIMATE → value of the estimator calculated on sample data (e.g. “120 pounds”)
57
Module 3
Point Estimation
A point estimator of an unknown population parameter is
o a random variable that depends on sample information . . .
o whose value provides an approximation to this unknown parameter
A specific value of that random variable is called a point estimate
58
Module 3
Point Estimation
Despite its conceptual simplicity, a disadvantage of point estimation is that it does not allow us to assess the precision of the results in the language of probability, i.e. point estimators do not tell us the probability that the estimated value is really close to (or far from) the true unknown value.
Notation:
• The unknown parameters to estimate are indicated by Greek letters (e.g. μ, σ, θ, ρ, λ …)
• The estimators are indicated by Latin letters (e.g. $\bar{X}$, s, r …)
• The symbol ^ (hat) is used to indicate estimation (e.g. $\hat{\mu} = \bar{X}$, read “the point estimator of μ is the sample average $\bar{X}$”)
59
Module 3
Estimation
Point Estimation
Interval estimation
Usually, the unknown parameters that we need to estimate are:
• The population mean (μ)
• The population variance (σ²) or the population standard deviation (σ)
• The population proportion (θ)
Or, we might be interested in estimating other unknown population parameters like:
• The coefficients of a Regression Model (the β’s)
• The correlation coefficient (ρ)
• Other...
60
Module 3
Point Estimation
EXAMPLE
Let µ be an unknown parameter that we want to estimate. Here, µ is the mean lifetime of a certain type of battery.
A random sample of n = 30 batteries might yield the following observed lifetimes (hours):
x1 = 6.1, x2 = 5.3, …, x30 = 5.9
The computed value of the sample average lifetime is:
$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{1}{30}\sum_{i=1}^{30} X_i = \frac{1}{30}(6.1 + 5.3 + \cdots + 5.9) = 5.77$
Based on the available sample information, it is reasonable to regard 5.77 as a very plausible value of µ, or our “best guess”. The value 5.77 is a point estimate of the unknown population mean.
61
Module 3
Point Estimator Properties
Point estimators can be classified according to some desirable properties.
For example, estimators can be:
• Unbiased.
• Consistent.
• Most efficient.
NOTE
For more details, see also the Manual of Statistical Methodology (8482919 ver.2).
62
Module 3
Unbiasedness
• A point estimator $\hat{\theta}$ is said to be an unbiased estimator of the parameter θ if $E(\hat{\theta})$, the expected value (or mean) of the sampling distribution of $\hat{\theta}$, is θ.
$\hat{\theta}$ is an unbiased estimator of θ if: $E(\hat{\theta}) = \theta$
• Examples:
• The sample mean is an unbiased estimator of μ
• The sample variance (with the n − 1 divisor) is an unbiased estimator of σ²
• The sample proportion is an unbiased estimator of π
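The sample variance example can be illustrated by simulation. The sketch below assumes a hypothetical normal population with mean 10 and variance 4, and compares the n − 1 divisor (`statistics.variance`) with the n divisor (`statistics.pvariance`) over many small samples.

```python
import random
import statistics

random.seed(1)
SIGMA2 = 4.0          # true population variance (hypothetical)
n, groups = 5, 20000

unbiased, biased = [], []
for _ in range(groups):
    sample = [random.gauss(10.0, SIGMA2 ** 0.5) for _ in range(n)]
    unbiased.append(statistics.variance(sample))   # divides by n - 1
    biased.append(statistics.pvariance(sample))    # divides by n

mean_unbiased = statistics.fmean(unbiased)
mean_biased = statistics.fmean(biased)
print(f"average of s^2 (n-1 divisor): {mean_unbiased:.2f}")
print(f"average with n divisor:       {mean_biased:.2f}")
```

The first average should land near the true variance 4, the second near 4 × (n − 1)/n = 3.2.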
63
Module 3
Expected Value
Expected value, or mean, of the sampling distribution of $\hat{\theta}$:
1. The unknown population parameter that we want to estimate is θ
2. The population contains a very large (infinite) number of items
3. From that population, a very large (infinite) number (k) of samples is drawn
4. From each sample the estimate of θ is calculated
5. The expected value of the sampling distribution of $\hat{\theta}$ is the average of all the estimates calculated in step 4
[Diagram: Population → Sample 1, Sample 2, ..., Sample k → $\hat{\theta}_1, \hat{\theta}_2, \dots, \hat{\theta}_k$]
The average of all the $\hat{\theta}_i$ (i = 1, …, k), k → ∞, is the Expected Value (or Mean) of the sampling distribution of $\hat{\theta}$, and it is indicated by $E(\hat{\theta})$
64
Module 3
Unbiasedness
[Figure: sampling distributions $f(\hat{\theta})$ of two estimators, $\hat{\theta}_1$ and $\hat{\theta}_2$, with $E(\hat{\theta}_1) = \theta$ and $E(\hat{\theta}_2) > \theta$; the gap between $E(\hat{\theta}_2)$ and θ is the bias in $\hat{\theta}_2$]
For any estimator of θ, say $\hat{\theta}$, let’s define bias = $E(\hat{\theta}) - \theta$.
Since bias($\hat{\theta}_1$) = 0, $\hat{\theta}_1$ is an unbiased estimator of θ.
Conversely, since bias($\hat{\theta}_2$) > 0, $\hat{\theta}_2$ is a biased estimator of θ.
65
Module 3
Bias
• Let $\hat{\theta}$ be an estimator of θ
• The bias in $\hat{\theta}$ is defined as the difference between its mean and θ:
$\text{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta$
• The bias of an unbiased estimator is 0 by definition.
66
Module 3
Consistency
• Let $\hat{\theta}$ be an estimator of θ
• $\hat{\theta}$ is a consistent estimator of θ if the difference between the expected value of $\hat{\theta}$ and θ (i.e. the bias) decreases as the sample size increases
• Consistency is desired when unbiased estimators cannot be obtained
67
Module 3
Maximum Efficiency
Suppose there are several unbiased estimators of θ.
Definition:
The most efficient estimator, or minimum variance unbiased estimator, of θ is the unbiased estimator with the smallest variance (a measure of the dispersion of the estimates; in other words, it is the estimator that varies least from sample to sample).
This generally depends on the distribution of the population.
For example, the mean is more efficient than the median for the normal distribution but not for “skewed” (asymmetrical) distributions.
68
Module 3
Maximum Efficiency
Let $\hat{\theta}_1$ and $\hat{\theta}_2$ be two unbiased estimators of θ.
Then,
• $\hat{\theta}_1$ is said to be more efficient than $\hat{\theta}_2$ if: $\text{Var}(\hat{\theta}_1) < \text{Var}(\hat{\theta}_2)$
• The relative efficiency of $\hat{\theta}_1$ with respect to $\hat{\theta}_2$ is the ratio of their variances:
$\text{Relative Efficiency} = \frac{\text{Var}(\hat{\theta}_1)}{\text{Var}(\hat{\theta}_2)}$
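As a sketch of this ratio, the simulation below compares the sample mean (taken as the first estimator) with the sample median (the second) on hypothetical standard normal samples of size 25. The population, sample size, and number of repetitions are assumptions made for illustration.

```python
import random
import statistics

random.seed(2)
n, groups = 25, 5000

means, medians = [], []
for _ in range(groups):
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

var_mean = statistics.variance(means)
var_median = statistics.variance(medians)
relative_efficiency = var_mean / var_median   # Var(estimator 1) / Var(estimator 2)
print(f"Var(mean)={var_mean:.4f}  Var(median)={var_median:.4f}  "
      f"ratio={relative_efficiency:.2f}")
```

A ratio below 1 indicates that the mean varies less from sample to sample than the median for this normal population.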
69
Module 3
Estimation
Point Estimation
Interval estimation
Interval estimation of unknown population

parameters (one normal population)
70
Module 3
Estimation
Point Estimation
Interval estimation
• How much uncertainty is associated with a point estimate of a population parameter? We cannot know, but…
• …instead of a point estimate, Statistics helps us determine the limits of an interval that is expected to contain the unknown parameter with a certain probability. This probability is called the confidence level.
• An interval estimate provides more information about a population characteristic than a point estimate does.
71
Module 3
Estimation
Point Estimation
Interval Estimation
Interval estimation

Interval estimation is a methodology used to evaluate an unknown parameter (for example, a population mean) by computing an interval within which the unknown parameter is most likely to be located.
Intervals are commonly chosen such that the unknown parameter falls within them with a certain (percent) probability. This probability, determined a priori, is typically set between 90% and 99% and is called the confidence level. Hence, the intervals are called confidence intervals; the end points of such an interval are called the upper and lower confidence limits.
72
Module 3
Estimation
Point Estimation
Interval Estimation
Interval estimation
EXAMPLE
Consider again the battery lifetime example.
Instead of calculating only a point estimate of the mean lifetime of the batteries, we now want to go further and form a confidence interval, at the 95% confidence level, for the unknown mean lifetime of the batteries.
Using the methods presented in the next Modules, it will be possible to form such an interval. For example, a possible outcome is the interval (5.23, 6.31).
The values 5.23 and 6.31 are the lower and upper confidence limits, respectively.
It is reasonable to state that, with 95% confidence, the true unknown mean battery lifetime is contained in the interval from 5.23 to 6.31.
73
Module 3
Estimation
Point Estimation
Interval estimation
The interval containing a population parameter is established by calculating that statistic from values measured on a random sample taken from the population, and by applying the knowledge of the fidelity with which the properties of a sample represent those of the entire population.
The probability tells what percentage of the time the assignment of the interval will be correct, but not what the chances are that it is true for any given sample.
Of the intervals computed from many samples, a certain percentage will contain the true value of the parameter being sought.
74
Module 3
Estimation
Point Estimation
Interval estimation
An interval gives a range of values:
❑ Takes into consideration variation in statistics from sample to sample

❑ Based on observations from 1 sample
❑ Gives information about closeness to unknown population parameters
❑ Stated in terms of level of confidence
▪ Can never be 100% confident
75
Module 3
Estimation
Point Estimation
Let θ be an unknown population parameter that we want to estimate. If P(a < θ < b) = 1 − α, then the interval from a to b is called a 100(1 − α)% confidence interval of θ.
The quantity (1 − α) is called the confidence level of the interval (0 < α < 1).
Acceptable values for α are 0.01 < α < 0.10
Acceptable confidence levels are between 0.90 and 0.99
o In repeated samples, the true value of the parameter θ would be contained in 100(1 − α)% of the intervals.
o The confidence interval calculated in this manner is written as a < θ < b with 100(1 − α)% confidence
76
Module 3
Estimation
Point Estimation
Frequency Interpretation
Interval estimation
❑ Suppose confidence level = 95%
❑ Also written (1 − α) = 0.95
❑ A relative frequency interpretation:
▪ From repeated samples, 95% of all the confidence intervals that can be constructed will contain the unknown true parameter
❑ A specific interval either will contain or will not contain the true parameter
▪ No probability is involved in a specific interval
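The relative frequency interpretation can be sketched by simulation. The example assumes a hypothetical normal population (mean 50, known σ = 4), samples of size 20, and the known-σ interval $\bar{X} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$; none of these values come from the course data.

```python
import random
import statistics
from statistics import NormalDist

random.seed(3)
MU, SIGMA, n = 50.0, 4.0, 20      # hypothetical population and sample size
z = NormalDist().inv_cdf(0.975)   # reliability factor for 95% confidence
margin = z * SIGMA / n ** 0.5

trials = 4000
hits = 0
for _ in range(trials):
    xbar = statistics.fmean(random.gauss(MU, SIGMA) for _ in range(n))
    if xbar - margin < MU < xbar + margin:
        hits += 1

coverage = hits / trials
print(f"{coverage:.1%} of the {trials} intervals contain the true mean")
```

Each individual interval either contains the mean or not; only the long-run share of intervals should be close to 95%.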
77
Module 3
Frequency Interpretation
[Figure: Sampling Distribution of the Mean, centered at $\mu_{\bar{X}} = \mu$, with central area 1 − α and tail areas α/2; intervals built around sample means $\bar{X}_1$, $\bar{X}_2$, …]
In repeated samples, 100(1 − α)% of the intervals contain μ; 100α% do not.
78
Module 3
Estimation
Point Estimation Confidence Intervals

Interval estimation
Confidence intervals for parameters of one normal population
• Population Mean: σ² Known, σ² Unknown
• Population Variance
79
80
Module 3
Estimation
Point Estimation
Interval estimation

The general formula for the confidence interval for the population mean under the normality assumption is:
Point Estimate ± (Reliability Factor) × (Standard Error)
❑ The value of the reliability factor depends on the desired level of confidence
❑ The standard error is equal to $\frac{\sigma}{\sqrt{n}}$, where σ is the population std. dev. and n is the sample size
81
Confidence Interval for μ (σ² known)
- Assumptions
• Population is normally distributed (if the population is not normal, use a large sample)
- Context
• Population variance σ² is known
- Confidence interval estimate:
$\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
Where,
▪ $\bar{X}$: point estimator of the population mean; it is the sample average
▪ $z_{\alpha/2}$: percentage point of the N(0,1) distribution such that $P(Z \ge z_{\alpha/2}) = \alpha/2$
▪ σ: population standard deviation (known)
▪ n: sample size
For more references on the method, see also the Manual of Statistical Methodology, Ch. 6 and Annex 4 (DMS 8482919_A)
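A minimal sketch of this interval in code. The function name `z_interval` is hypothetical, and the inputs (sample mean 5.77 hours, n = 30, σ = 1.5) are illustrative: the first two echo the battery example, while σ = 1.5 is an assumed value not given in the course material.

```python
from statistics import NormalDist

def z_interval(xbar: float, sigma: float, n: int, confidence: float = 0.95):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    margin = z * sigma / n ** 0.5             # reliability factor * standard error
    return xbar - margin, xbar + margin

# Hypothetical inputs: sample mean 5.77 hours, assumed known sigma 1.5, n = 30
low, high = z_interval(5.77, 1.5, 30)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```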
82
Module 3
Estimation
Point Estimation
Confidence Interval for μ (σ² known)
The confidence interval can also be written as follows:
Point Estimate ± (Reliability Factor) × (Standard Error)
$\bar{X} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
or as
$\bar{X} \pm ME$
where ME is called the Margin of Error, $ME = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
The interval width, W, is equal to twice the margin of error: W = 2 × ME
83
Module 3
Estimation
Point Estimation
Reducing the Margin of Error
$ME = z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
Q. Why is it desirable to reduce the ME?
A. Because the width of the interval, W = 2 × ME, is inversely related to the precision of an interval estimate: the narrower the interval, the more precise the estimate.
The margin of error can be reduced if:
❑ The population standard deviation σ is reduced
❑ The sample size n is increased
❑ The confidence level (1 − α) is decreased
NOTE
Given a value for W and a confidence level, one can solve the equation $W = 2 z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ for n to obtain the minimum sample size required to guarantee a confidence level (1 − α) and a precision W.
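Solving $W = 2 z_{\alpha/2}\,\sigma/\sqrt{n}$ for n gives $n = (2 z_{\alpha/2}\,\sigma / W)^2$, rounded up to the next integer. A short sketch, with a hypothetical function name and hypothetical inputs (σ = 2.0, W = 1.0):

```python
import math
from statistics import NormalDist

def min_sample_size(sigma: float, width: float, confidence: float = 0.95) -> int:
    """Smallest n with 2 * z * sigma / sqrt(n) <= width (sigma known)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((2 * z * sigma / width) ** 2)

# Hypothetical inputs: sigma = 2.0, target width W = 1.0
n_95 = min_sample_size(2.0, 1.0, 0.95)
n_99 = min_sample_size(2.0, 1.0, 0.99)
print(n_95, n_99)
```

With the same σ and W, the higher confidence level requires the larger sample.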
84
Module 3
Estimation
Reducing the Margin of Error
Confidence Level | α | α/2 (each tail) | CI
90% | 0.10 | 0.05 | 3.18 to 6.03
95% | 0.05 | 0.025 | 2.89 to 6.31
99% | 0.01 | 0.005 | 2.30 to 6.91
• 90%: narrower CI, more precise, but less likely to include the true population mean.
• 99%: wider CI, less precise, but more likely to include the true population mean.
85
Module 3
Estimation
Point Estimation
Finding the Reliability Factor
Consider a 95% confidence interval: 1 − α = 0.95, so α/2 = 0.025 in each tail.
[Figure: standard normal curve with central area 0.95 and interval width W. Z units: −z = −1.96, 0, z = 1.96. X units: Lower Confidence Limit, Point Estimate ($\bar{X}$), Upper Confidence Limit]
$z_{0.025} = 1.96$ from the standard normal distribution table

86
Module 3
Estimation
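The standard normal distribution table
[Figure: standard normal curve with α/2 = 0.025 in each tail and 1 − α = 0.95 in the center; Z units: z = −1.96, 0, z = 1.96, corresponding to the Lower and Upper Confidence Limits]
Reading the table: row −1.9 and column .06 give the cumulative probability .0250, so z = −1.96.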
87
Module 3
Estimation
Point Estimation
Common Levels of Confidence
Commonly used confidence levels are 90%, 95%, and 99%
Confidence Level | Confidence Coefficient, 1 − α | $z_{\alpha/2}$ value
90% | 0.90 | 1.645
95% | 0.95 | 1.96
99% | 0.99 | 2.58
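The tabulated values can also be obtained without a printed table. This sketch uses the inverse CDF of the standard normal distribution from the Python standard library; the variable names are arbitrary.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1
z_values = {}
for confidence in (0.90, 0.95, 0.99):
    alpha = 1 - confidence
    # z_{alpha/2} leaves an area of alpha/2 in the upper tail
    z_values[confidence] = std_normal.inv_cdf(1 - alpha / 2)
    print(f"{confidence:.0%}  z = {z_values[confidence]:.3f}")
```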
88
Module 3
Estimation
Point Estimation
Advanced Explanation
How did we arrive at the previous formula for the confidence interval?
We need to focus on the following:
1. The Central Limit Theorem (CLT) tells us that if $X \sim N(\mu_X, \sigma_X^2) \Rightarrow \bar{X} \sim N\left(\mu_{\bar{X}} = \mu_X,\ \sigma_{\bar{X}}^2 = \frac{\sigma_X^2}{n}\right)$. In words, if X is a normally distributed random variable with parameters $\mu_X$ and $\sigma_X^2$, then $\bar{X}$, the sample average, is also a normal RV with parameters $\mu_{\bar{X}} = \mu_X$ and $\sigma_{\bar{X}}^2 = \frac{\sigma_X^2}{n}$, where n is the sample size. If X is NOT normally distributed, the previous distributional property still holds for large values of n (at least n > 30); statisticians say that this property holds “asymptotically”.
2. The transformation called “standardization” of a normal random variable.
For any normal RV X, $X \sim N(\mu_X, \sigma_X^2) \Rightarrow Z = \frac{X - \mu_X}{\sigma_X} \sim N(\mu_Z = 0, \sigma_Z = 1)$, i.e. Z is a standard normal RV.
89
Module 3
Estimation
Point Estimation
Interval estimation
To form a confidence interval for the population mean $\mu_X$ of a normal RV with known variance $\sigma_X^2$, we need 2 critical values, say CV1 and CV2, that include 100(1 − α)% of the population (thus leaving in each tail an area equal to α/2).
[Figure: normal curve of X centered at $\mu_X$, with area (1 − α) between $CV_1$ and $CV_2$ and area α/2 in each tail.]
Due to the symmetry of the normal distribution, it can be noticed that:
$CV_2 - \mu_X = \mu_X - CV_1$
90
Module 3
Estimation
Point Estimation
Interval estimation
We standardize the value of $\bar{X}$ to form the confidence interval for $\mu_X$:
$Z_{\bar{X}} = \dfrac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \dfrac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}}$
Note: the quantity $\sigma_X/\sqrt{n}$ is also called the “standard error of the mean”.
According to the desired confidence level, we may obtain the different critical values that one can find in the table of the standard normal distribution.
Let CV2, the upper critical value, be $CV_2 = Z(\alpha/2)$. Considering the symmetry of the normal distribution, $CV_1 = -CV_2 = -Z(\alpha/2)$ holds.
For example, for α = 0.05 ⇒ α/2 = 0.025, from the table of the Standard Normal Distribution we obtain $CV_1 = -1.96$ and $CV_2 = 1.96$.
91
Module 3
Estimation
Point Estimation
Interval estimation
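To form the confidence interval for $\mu_X$ we consider the following inequality:
$-Z(\alpha/2) \le \dfrac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}} \le Z(\alpha/2)$
And we solve it for $\mu_X$: $-Z(\alpha/2)\,\dfrac{\sigma_X}{\sqrt{n}} \le \bar{X} - \mu_X \le Z(\alpha/2)\,\dfrac{\sigma_X}{\sqrt{n}}$.
After solving the system of inequalities, we obtain the desired confidence interval:
$\bar{X} - Z(\alpha/2)\,\dfrac{\sigma_X}{\sqrt{n}} \le \mu_X \le \bar{X} + Z(\alpha/2)\,\dfrac{\sigma_X}{\sqrt{n}}$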
92
Module 3
Estimation
Point Estimation
Interval estimation EXAMPLE

Confidence interval for the mean of a normal population (variance known)
• A sample of 11 circuits from a large normal population has a mean resistance of 2.26 ohms.
We know from past testing that the population standard deviation is 0.35 ohms.
• The available data: {2.25, 2.48, 1.40, 2.31, 2.43, 2.47, 2.52, 2.56, 2.33, 1.48, 2.61}
• Determine & interpret a 95% confidence interval for the true mean resistance of the
population.
93
Module 3
Estimation Confidence Intervals

Point Estimation
SOLUTION
In this case, σ² is known, so we can use: $\bar{X} - Z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$
where
• $\bar{X}$ = 2.26
• $Z_{\alpha/2}$ = 1.96 (α = 0.05 → α/2 = 0.025)
• σ = 0.35
• n = 11
Confidence interval:
$2.26 - 1.96\,\dfrac{0.35}{\sqrt{11}} < \mu < 2.26 + 1.96\,\dfrac{0.35}{\sqrt{11}}$
$2.05 < \mu < 2.47$
• We are 95% confident that the true mean resistance is between 2.05 and 2.47 ohms
• Although the true mean may or may not be in this interval, 95% of intervals formed in this manner will contain the true mean
NOTE: From the statistical tables of the Standard Normal Distribution, we obtain that $Z_{\alpha/2}$ = 1.96.
In practice, however, the confidence interval is provided by the software and not calculated by hand.
94
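As a cross-check of the hand calculation, the same interval can be computed from the raw data. This is a minimal sketch in Python, not the course software (the course uses JMP); the function name `z_interval` is an illustrative choice.

```python
import math
from statistics import NormalDist, mean

# Resistance data (ohms) from the example; sigma is assumed known from past testing
resistance = [2.25, 2.48, 1.40, 2.31, 2.43, 2.47, 2.52, 2.56, 2.33, 1.48, 2.61]
SIGMA = 0.35


def z_interval(data, sigma, confidence=0.95):
    """Confidence interval for the mean of a normal population, variance known."""
    n = len(data)
    x_bar = mean(data)
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sigma / math.sqrt(n)  # margin of error ME
    return x_bar - margin, x_bar + margin


lower, upper = z_interval(resistance, SIGMA)
```

Rounded to two decimals, the limits agree with the interval on the slide.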
Module 3
Estimation
Point Estimation
Interval estimation
Confidence intervals for parameters of one population
95
Module 3
Estimation
Point Estimation
Confidence Interval for μ (σ2 Unknown)
Interval estimation

• If the population standard deviation σ is unknown, we can substitute the sample standard deviation, s
• This introduces extra uncertainty, since s is variable from sample to sample
• So, we use the Student’s t distribution instead of the Normal distribution
96
Module 3
Estimation
Point Estimation
Interval estimation
• Assumptions
  • Population is normally distributed (if population is not normal, use large sample)
• Context
  • Population variance σ² is unknown
• Use Student’s t Distribution
• Confidence Interval Estimate:
$\bar{X} - t_{n-1,\alpha/2}\,\dfrac{s}{\sqrt{n}} < \mu < \bar{X} + t_{n-1,\alpha/2}\,\dfrac{s}{\sqrt{n}}$
where $t_{n-1,\alpha/2}$ is the critical value of the t distribution with n−1 degrees of freedom and an area of α/2 in each tail:
$P(t_{n-1} > t_{n-1,\alpha/2}) = \alpha/2$

97
Module 3

Estimation
Point Estimation
Interval estimation

❑ The t is a family of distributions
❑ The t value depends on degrees of freedom (d.f.)
▪ Number of observations that are free to vary after
sample mean has been calculated
d.f. = n - 1
98
Module 3
Estimation
Point Estimation
Student’s t Distribution
[Figure: densities f(Z), f(t) plotted against Z, t and centered at 0: Standard Normal (t with d.f. = ∞), t (d.f. = 13) and t (d.f. = 5).]
t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal
Note: t → Z as n (and then the d.f.) increases

99
Module 3
Estimation
Point Estimation
Student’s t Distribution
Interval estimation
With comparison to the Z value (Z = Standard Normal, N(μ=0, σ²=1))
Confidence Level   t (10 d.f.)   t (20 d.f.)   t (30 d.f.)   t (60 d.f.)   t (120 d.f.)   Z
90%                1.812         1.725         1.697         1.671         1.658          1.645
95%                2.228         2.086         2.042         2.000         1.980          1.960
99%                3.169         2.845         2.750         2.660         2.617          2.576
Notes
• d.f. = n−1 ⇒ from the table above, we notice that as n increases the interval width decreases (and the precision increases)
• as the number of d.f. → ∞ ⇒ t → Z
• with d.f. > 60, approximately t = Z
100
Module 3
Estimation
Point Estimation
Use of the Student’s t Distribution
Interval estimation
EXAMPLE
confidence interval for the mean of a normal population (variance unknown)
• Consider the data from the previous example. Sample size n=11, but here the true
population standard deviation is unknown and must be estimated.
• From sample data, we obtain s = 0.42
• Determine a 95% confidence interval for μ, the true population mean.
101
Module 3
Estimation
Point Estimation
SOLUTION
In this case, σ² is unknown, so we use: $\bar{X} - t_{n-1,\alpha/2}\,\dfrac{s}{\sqrt{n}} < \mu < \bar{X} + t_{n-1,\alpha/2}\,\dfrac{s}{\sqrt{n}}$
where
• $\bar{X}$ = 2.260
• $t_{10,\alpha/2}$ = 2.228 (α = 0.05 → α/2 = 0.025)
• s = 0.420
• n = 11 → n−1 = 10
$2.260 - 2.228\,\dfrac{0.420}{\sqrt{11}} < \mu < 2.260 + 2.228\,\dfrac{0.420}{\sqrt{11}}$
$1.978 < \mu < 2.542$
NOTES:
• From the Statistical Tables of the t Distribution (see next slide), we obtain that $t_{10,0.025}$ = 2.228.
• The interpretation of this result is the same as for the previous example. The only difference is in the formula used for the calculation.
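The t-based interval can be reproduced from the raw data of the example. The sketch below is an illustration in Python; the critical value 2.228 is copied from the course's t table (10 d.f., upper-tail area 0.025) rather than computed, and the function name `t_interval` is an assumption.

```python
import math
from statistics import mean, stdev

# Resistance data (ohms) from the example
resistance = [2.25, 2.48, 1.40, 2.31, 2.43, 2.47, 2.52, 2.56, 2.33, 1.48, 2.61]
T_10_0025 = 2.228  # table value: 10 d.f., upper-tail area 0.025


def t_interval(data, t_crit):
    """Confidence interval for the mean of a normal population, variance unknown."""
    n = len(data)
    x_bar = mean(data)
    s = stdev(data)  # sample standard deviation (n - 1 in the denominator)
    margin = t_crit * s / math.sqrt(n)
    return x_bar, s, x_bar - margin, x_bar + margin


x_bar, s, lower, upper = t_interval(resistance, T_10_0025)
```

Because the slide rounds the mean and s before computing, the limits obtained from the raw data differ from 1.978 and 2.542 only in the third decimal.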
102
Module 3
Estimation
Point Estimation
Table of critical values for Student’s t distributions
df α = 0.1 0.05 0.025 0.01 0.005 0.001 0.0005
1 3.078 6.314 12.706 31.821 63.656 318.289 636.578
2 1.886 2.920 4.303 6.965 9.925 22.328 31.600
3 1.638 2.353 3.182 4.541 5.841 10.214 12.924
4 1.533 2.132 2.776 3.747 4.604 7.173 8.610
5 1.476 2.015 2.571 3.365 4.032 5.894 6.869
6 1.440 1.943 2.447 3.143 3.707 5.208 5.959
7 1.415 1.895 2.365 2.998 3.499 4.785 5.408
8 1.397 1.860 2.306 2.896 3.355 4.501 5.041
9 1.383 1.833 2.262 2.821 3.250 4.297 4.781
10 1.372 1.812 2.228 2.764 3.169 4.144 4.587
11 1.363 1.796 2.201 2.718 3.106 4.025 4.437
12 1.356 1.782 2.179 2.681 3.055 3.930 4.318
13 1.350 1.771 2.160 2.650 3.012 3.852 4.221
14 1.345 1.761 2.145 2.624 2.977 3.787 4.140
15 1.341 1.753 2.131 2.602 2.947 3.733 4.073
16 1.337 1.746 2.120 2.583 2.921 3.686 4.015
17 1.333 1.740 2.110 2.567 2.898 3.646 3.965
18 1.330 1.734 2.101 2.552 2.878 3.610 3.922
19 1.328 1.729 2.093 2.539 2.861 3.579 3.883
20 1.325 1.725 2.086 2.528 2.845 3.552 3.850
21 1.323 1.721 2.080 2.518 2.831 3.527 3.819
22 1.321 1.717 2.074 2.508 2.819 3.505 3.792
23 1.319 1.714 2.069 2.500 2.807 3.485 3.768
24 1.318 1.711 2.064 2.492 2.797 3.467 3.745
25 1.316 1.708 2.060 2.485 2.787 3.450 3.725
26 1.315 1.706 2.056 2.479 2.779 3.435 3.707
27 1.314 1.703 2.052 2.473 2.771 3.421 3.689
28 1.313 1.701 2.048 2.467 2.763 3.408 3.674
29 1.311 1.699 2.045 2.462 2.756 3.396 3.660
30 1.310 1.697 2.042 2.457 2.750 3.385 3.646
60 1.296 1.671 2.000 2.390 2.660 3.232 3.460
120 1.289 1.658 1.980 2.358 2.617 3.160 3.373
∞ 1.282 1.645 1.960 2.326 2.576 3.091 3.291
103
Module 3
Estimation
Point Estimation
Confidence Interval
[Diagram: Sample of size n (Average: $\bar{x}$, Stdev: s) → Sampling Distribution of Mean (standard error $\sigma/\sqrt{n}$; with σ unknown, $t(\mu, s/\sqrt{n})$) → Confidence Interval → Inference About Population Mean]
From here onwards, we will use the condition of “σ unknown”.
We will use a practical example to understand the application of the CLT in the Confidence Interval.
104
Module 3
Estimation
Exercise #4
Point Estimation
Exercise File:
Interval estimation
1. The trainer will show how to use the “Transpose” feature in JMP.
2. The subsequent slides use this data set to explain how the Confidence Interval is calculated.
105
Module 3
Estimation
Exercise #4
Point Estimation
Exercise File:
Interval estimation
1. Open the file…data is in row format
2. Transpose into column format
Go to Tables > Transpose. Select all columns and cast to Transpose Columns, click OK
This is sample data (e.g. machine setup data, etc.)

106
Data in column format
Module 3
Estimation
Exercise #4
Point Estimation
Exercise File:
Interval estimation
3. Make a distribution of the sample data
Go to Analyze > Distribution. Cast Row column into Y, then hit OK
JMP generated CI
107
Module 3
Estimation
Exercise #4
Point Estimation
Exercise File:
Interval estimation
To generate confidence interval for both Mean and Stdev:
Go to distribution hotspot > Confidence Inter
JMP generated CI
108
Module 3
Estimation
Point Estimation
Confidence Interval
Step 1 Convert Data into Information (Descriptive Statistics).
Step 2 Establish the Sampling Distribution for Mean (Inferential Statistics).
$t\left(\mu, \dfrac{s}{\sqrt{n}}\right) \rightarrow t\left(4.60, \dfrac{4.58}{\sqrt{30}}\right)$
[Figure: t distribution centered at 4.60, the Sampling Distribution of Mean used for inference on the Population Mean. Std Err: 0.84]
Note: Std Error = $\dfrac{s}{\sqrt{n}}$
109
Module 3
Estimation
Point Estimation
Confidence Interval
Step 3 User to Define Significance Level, α.
Significance Level, α   Confidence Level, (1 − α)
0.01                    99%
0.05                    95%
0.10                    90%
Typically we choose α = 0.05.
[Figure: t distribution centered at 4.60 with 1 − α = 0.95 between LCL and UCL and α/2 = 0.025 in each tail.]
LCL = Lower Confidence Limit
UCL = Upper Confidence Limit
As an illustration of the CLT, we computed the average of the sample means of 1000 samples of size n. Our results are similar, with the center of the sampling distribution almost equal to the pop mean.
In application, we use only 1 sample of size n to form the sampling distribution of the mean. Hence there is less information to infer the pop mean, and thus lower accuracy.
As such, a Confidence Interval is introduced to establish a range of values that likely contains the pop mean.
FAQ
Why do we need to introduce α or the Confidence Level?
This will be answered later.
FAQ
Why do we typically choose α = 0.05 or Confidence Level = 95%?
110
Module 3
Estimation
Point Estimation
Step 4 Establishing the Confidence Interval
[Figure: t distribution centered at 4.60 with 1 − α = 0.95 between LCL = 2.89 and UCL = 6.31 and α/2 = 0.025 in each tail.]
In the example:
What does the Confidence Interval mean?
It means that, at the 95% Confidence Level, the pop mean is contained between 2.89 and 6.31.
But how does this knowledge help us in making a decision?
111
Module 3
Estimation
Point Estimation
Confidence Interval
[Figure: t distribution centered at 4.60 with 1 − α = 0.95 between LCL = 2.89 and UCL = 6.31 and α/2 = 0.025 in each tail; the possible pop mean values 0 and 5.00 are marked as targets.]
Case 1: Let’s say specs target is 0 +/- 50 um
0 is not within the Confidence Interval (@ 95% Confidence Level)
Process not centered.
User’s Decision, Possible Action: Re-setup the machine.
Case 2: Let’s say specs target is 5 +/- 50 um
5 is within the Confidence Interval (@ 95% Confidence Level)
Process is centered.
User’s Decision, Possible Action: No need to re-setup the machine.
112
Module 3
Estimation
Point Estimation
Confidence Interval
FAQ
Why do we typically choose α = 0.05 or Confidence Level = 95%?
[Figure: CI @ 95% from 2.89 to 6.31, nested inside CI @ 99% from 2.30 to 6.91.]
What does it mean when we increase the Confidence Level?
1. The Confidence Interval becomes wider, so we have higher confidence that it contains the population mean.
2. It also means less precision, because the pop mean can take more different values.
113
Module 3
Estimation
Confidence Interval
FAQ
The Confidence Interval width will be smaller at CL 90%. Why don’t we use CL 90% then?
Confidence Level   α      Area in each tail   CI
90%                0.10   5.0%                3.18 to 6.03
95%                0.05   2.5%                2.89 to 6.31
99%                0.01   0.5%                2.30 to 6.91
What does it mean when we increase the Confidence Level?
1. The Confidence Interval becomes wider, so we have higher confidence that it contains the population mean.
2. It also means less precision, because the pop mean can take on more values.
114
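The three intervals compared above can be recomputed from the summary statistics of the example (mean 4.60, s = 4.58, n = 30). The following Python sketch is illustrative only: the critical values are copied from the 29 d.f. row of the course's t table, and the function name is hypothetical.

```python
import math

# t critical values for 29 d.f., taken from the t table in this module
# (upper-tail areas 0.05, 0.025 and 0.005)
T_29 = {0.90: 1.699, 0.95: 2.045, 0.99: 2.756}


def t_interval_from_summary(x_bar, s, n, t_crit):
    """Confidence limits for the mean from summary statistics."""
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


intervals = {cl: t_interval_from_summary(4.60, 4.58, 30, t) for cl, t in T_29.items()}
widths = {cl: hi - lo for cl, (lo, hi) in intervals.items()}
```

Since the inputs are rounded summary values, the limits can differ from the JMP output in the last digit; the ordering of the widths (90% < 95% < 99%) is the point of the comparison.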
Module 3
Estimation
Exercise #5
Point Estimation
Interval estimation
Frequency Interpretation – a numerical simulation
FAQ
The Confidence Interval width will be smaller at CL 90%. Why don’t we use CL 90% then?
1. Open M2.0 Central Limit Theorem.xls.
2. Create 100 sample groups of size n = 30 using sheet Pop 1.
3. Copy the raw data of these 100 groups.
4. Create a confidence interval for each of these 100 groups using:
a. α = 0.10 (Confidence Level = 90%).
b. α = 0.05 (Confidence Level = 95%).
c. α = 0.01 (Confidence Level = 99%).
5. Compare the results.
6. The trainer will show, step by step, how to do it.

115
Module 3
Estimation
Exercise #5
Point Estimation
Interval estimation
1. Open M2.0 Central Limit Theorem.xls.
2. Create 100 groups of sample with size, n=30 using sheet Pop 1.
[Screenshot: worksheet settings, sample size up to 30, 100 groups]
116
Module 3
Estimation
Exercise #5
Point Estimation
3. Copy/paste sample data to JMP, transpose to column format
Go to Tables > Transpose. Select all columns and cast to Transpose Columns, click OK
[Screenshot: transposed table, up to 100 columns and up to 30 rows]
The Label column can be deleted: click on the column, right-click > Delete Column
117
Module 3
Estimation
Exercise #5
Point Estimation
Go to Analyze > Distribution. Select all columns and cast to Y, Columns, click OK
There are 100 distributions created; remove (untick) Quantiles and Summary Statistics from the hotspot
118
Module 3
Estimation
Exercise #5
Point Estimation
4. Create confidence interval for all these 100 groups using:
a. α = 0.10 (Confidence Level = 90%).
b. α = 0.05 (Confidence Level = 95%).
c. α = 0.01 (Confidence Level = 99%).
While pressing the Ctrl key, click on the hotspot, select Confidence Interval, and choose the confidence level
Round 1: 90%   Round 2: 95%   Round 3: 99%
119
Module 3
Estimation
Exercise #5
Point Estimation
M2.0 Central

Right click on the tabulation > Make Combined Data Table
There are 30 groups, each group with 3 sets of CI (90, 95, 99% CL)
120
Module 3
Estimation
Exercise #5
Point Estimation
Remove Std Dev rows
Click on Rows > Row Selection > Select Where
Delete highlighted rows
Also remove the Parameter and Estimate columns (not needed)
121
Module 3
Estimation
Exercise #5
Point Estimation
Stack the data columns
Click on Tables > Stack
[Screenshot: stacked data]
Make Oneway graph
Click on Analyze > Fit Y by X
Y, Response: Data
X: Y
By: 1-Alpha
Click OK
122
Module 3
Estimation
Exercise #5
Point Estimation
M2.0 Central

123
Module 3
Estimation
Exercise #5
Point Estimation
Remove Grand Mean
Interval estimation
Click on hotspot > Display Options – untick Grand Mean
Create box plot
Module 3 Key Learning Click on hotspot > Display Options – tick Box Plot
124
Module 3
Estimation
Exercise #5
Point Estimation
From exercise 1.1, we get population mean = 5.02; we will add a pop mean line to our graph
Double-click on the Y axis of the graph, enter the value 5.02, then click Add > OK (do this for all 3 Oneway graphs)
CL(%) Tr P1 P2 P3 P4
90
95
99
125
Module 3
Estimation
Exercise #5
Point Estimation
Compare Confidence Interval Width, use uniform scaling to easily see
Interval estimation
Right-click on Y axis of the graph (for 99%) > Edit > Copy Axis Settings
Click on 95% CL graph Yaxis > Edit > Paste Axis Settings
Click on 90% CL graph Yaxis > Edit > Paste Axis Settings
126
Module 3
Estimation
Frequency Interpretation – a numerical simulation
Point Estimation
Interval estimation
Parameters of 2 normal populations Population Mean
At 90% Confidence Level ➔ On average, 90/100 Confidence Intervals contain the Pop Mean 127
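The frequency interpretation can also be checked with a small simulation outside JMP. The Python sketch below is a stand-in for the Excel/JMP exercise, not a reproduction of it: the population is assumed normal with mean 5.02 (the value quoted from exercise 1.1) and a standard deviation of 4.58 (an assumption borrowed from the sample s of the earlier example), and the t critical values come from the 29 d.f. row of the course's t table.

```python
import math
import random
from statistics import mean, stdev

T_29 = {0.90: 1.699, 0.95: 2.045, 0.99: 2.756}  # 29 d.f., from the t table


def coverage_rates(pop_mean, pop_sd, n, t_crits, n_samples, seed):
    """Fraction of simulated samples whose t-interval contains pop_mean."""
    rng = random.Random(seed)
    hits = {cl: 0 for cl in t_crits}
    for _ in range(n_samples):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        x_bar = mean(sample)
        std_err = stdev(sample) / math.sqrt(n)
        # The same sample is evaluated at every confidence level
        for cl, t_crit in t_crits.items():
            if abs(x_bar - pop_mean) <= t_crit * std_err:
                hits[cl] += 1
    return {cl: count / n_samples for cl, count in hits.items()}


# Assumed population: normal, mean 5.02, sd 4.58 (sd is a hypothetical choice)
rates = coverage_rates(5.02, 4.58, 30, T_29, n_samples=2000, seed=2024)
```

With 2000 simulated groups instead of 100, the observed fractions should settle close to the nominal 90%, 95% and 99%, which is what "on average" means in the statement above.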
Module 3
Estimation
Point Estimation
Interval estimation
Parameters of 2 normal populations Pop Mean
Module 3
Estimation
Point Estimation
Interval estimation
Pop Mean
Module 3
Estimation Frequency Interpretation

Point Estimation
Interval estimation
There’s nothing wrong in choosing 90% or 99%.
What is important is that you are clear about the pros and cons that come with your decision.
130
Module 3
Estimation
Exercise #6
Point Estimation
Interval estimation Example #1

𝑆𝑝𝑒𝑐𝑠 = 0𝑢𝑚 ± 20𝑢𝑚
Scenario:
1. You are setting up the machine.
2. 30 setup units measured.
3. Descriptive Statistics obtained.

a. All single value within specs limit.
b. Cpk > 1.67.
4. Decision? :
a. Release machine for production.
b. Re-setup machine.
131
Module 3
Estimation
Exercise #6
Point Estimation
Generate descriptive statistics
Interval estimation
Click on Analyze > Distribution, select X & Y offset columns, then OK

To add process capability
Go to data table, select both columns > Right-click > Column Properties > Specs Limits
Input here
Do the same for another parameter down below
132
Module 3
Estimation
Exercise #6
Point Estimation
Interval estimation
Parameters of 1 normal population Generate descriptive statistics
Parameters of 2 normal populations Click on Analyze > Distribution, select X & Y offset columns, then OK
With check, since we put

Specs Limit in the
Data column
133
Module 3
Estimation
Exercise #6
Point Estimation
Generate descriptive statistics
Interval estimation
Parameters of 1 normal population Click on Analyze > Distribution, select X & Y offset columns, then OK
134
Module 3
Estimation
Confidence Intervals
Specs = 0 um ± 20 um
Step 1 Convert Data into Information.
Step 2 Establish the Sampling Distribution for Mean: $t\left(\mu, \dfrac{s}{\sqrt{n}}\right)$
Step 3 User to Define Significance Level, α. Typically 0.05, i.e. 95% Confidence Level.
Step 4 Establish the Confidence Interval.
Step 5 Compare to your target of interest and make your decision.
Quantity                Left plot                      Right plot
Sampling distribution   t(−1.27, 3.69/√30)             t(−9.90, 1.37/√30)
Std Err                 0.67                           0.25
95% CI                  −2.65 to 0.11                  −10.41 to −9.39
Knowledge               At 95% CL, target within CI.   At 95% CL, target not within CI.
                        Process Is Centered.           Process Not Centered.
135
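The five steps can be condensed into a short check that starts from the summary statistics shown on the slide. This Python sketch is only an illustration: the function name `centering_check` is hypothetical, the target of 0 um is the spec center, and the t value 2.045 is the 29 d.f. table entry for α/2 = 0.025.

```python
import math

T_29_0025 = 2.045  # t table: 29 d.f., upper-tail area 0.025


def centering_check(x_bar, s, n, target, t_crit):
    """Return (LCL, UCL, centered?) for a process mean versus a target."""
    margin = t_crit * s / math.sqrt(n)       # Step 2 and 4: std error times t
    lcl, ucl = x_bar - margin, x_bar + margin
    return lcl, ucl, lcl <= target <= ucl    # Step 5: is the target inside?


# Summary statistics read from the slide (n = 30 setup units, target 0 um)
first = centering_check(-1.27, 3.69, 30, target=0.0, t_crit=T_29_0025)
second = centering_check(-9.90, 1.37, 30, target=0.0, t_crit=T_29_0025)
```

For the first parameter the target lies inside the interval, for the second it does not, which matches the "centered" and "not centered" conclusions on the slide even though every individual value was within the spec limits.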
Module 3
Estimation
Exercise #7
Point Estimation
Example #2
Interval estimation
Parameters of 2 normal populations Same scenario as in Exercise 6 but with different data set
𝑆𝑝𝑒𝑐𝑠 = 0𝑢𝑚 ± 20𝑢𝑚
136
Module 3
Estimation
Exercise #7
Point Estimation
Interval estimation
Parameters of 2 normal populations To add process capability
Go to data table, select both columns > Right-click > Column Properties > Specs Limits
Input here
137
Module 3
Estimation
Exercise #7
Point Estimation
Interval estimation
What is confidence interval telling us?

Check the min and max…Ppk
138
Module 3
Estimation
FAQ: Could we increase the precision without sacrificing the frequencies?
Answer: Increase the sample size, n.
If σ known: $N\left(\mu, \dfrac{\sigma}{\sqrt{n}}\right)$; otherwise: $t\left(\mu, \dfrac{s}{\sqrt{n}}\right)$
Sample size increases ➔ $\dfrac{s}{\sqrt{n}}$ decreases, resulting in a narrower spread of the Sampling Distribution for Mean.
[Figure: two sampling distributions at 95% CL with 2.5% in each tail: Sample Size $n_1$, and Sample Size $n_2$ ($n_2 > n_1$) with a narrower CI @ 95%.]
139
Module 3
Estimation
Exercise #8
Point Estimation
Interval estimation
Exercise File:
1. Use the M2.0 Central Limit Theorem.xls file.
2. Randomly sample 100 groups with sample size = 5 and 100 groups with sample size = 30.
3. Compute the Confidence Interval for all 100 groups for both n = 5 and n = 30.
4. Use 95% Confidence Level for both groups.
140
Module 3
Estimation
Exercise #8
Point Estimation
Interval estimation Random sampling 100 groups of sample with size = 5

Random sampling 100 groups of sample with size = 30
141
Module 3
Estimation
Exercise #8
Point Estimation
Copy and paste to JMP
Interval estimation
Go to Tables > Transpose
142
Module 3
Estimation
Exercise #8
Point Estimation
Remove Quantiles and other Statistics except Confidence Intervals
Interval estimation
Pressing Ctrl key, Go to hotspot > Display Options, untick Quantiles
Pressing Ctrl key, Go to hotspot of Summary Statistics, untick others except Confidence Intervals
143
Module 3
Estimation
Exercise #8
Point Estimation
Interval estimation
Parameters of 2 normal populations Right-Click somewhere here, select
Make Combined Data Table
144
Module 3
Estimation
Exercise #8
Point Estimation
Make a box plot
Interval estimation
Go to Analyze > Fit Y by X
Have the Grand Mean removed and tick on Box Plots Add the Pop Mean line (double-click on Yaxis)
145
Hotspot > Display Options
Module 3
Estimation
Exercise #8
Point Estimation
Do the same for n = 30 samples
Interval estimation
Make scaling uniform:

Go to n = 5 graph > right-click on Y axis > Edit > Copy Axis Settings
Go to n = 30 graph > right-click on Y axis > Edit > Paste Axis Settings
146
Module 3
Estimation
Exercise #8
Point Estimation
Interval estimation
147
Module 3
Estimation
Point Estimation
Interval estimation
148
Using the same CL of 95%, as the sample size increases ➔ precision improves while the frequency with which the intervals contain the pop mean stays the same.
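The effect of the sample size on precision can be illustrated with a simulation similar to Exercise #8. The sketch below is written in Python under stated assumptions: a normal population with mean 5.02 and standard deviation 4.58 (both hypothetical choices for illustration), and t critical values for 4 and 29 d.f. copied from the course's t table.

```python
import math
import random
from statistics import stdev

# t table values for upper-tail area 0.025: 4 d.f. (n = 5) and 29 d.f. (n = 30)
T_0025 = {5: 2.776, 30: 2.045}


def mean_ci_width(n, t_crit, pop_mean, pop_sd, n_samples, seed):
    """Average width of the 95% t-interval over many simulated samples."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        total += 2.0 * t_crit * stdev(sample) / math.sqrt(n)
    return total / n_samples


avg_width = {n: mean_ci_width(n, t, 5.02, 4.58, 1000, seed=7) for n, t in T_0025.items()}
```

Two effects combine here: the standard error s/√n shrinks with n, and the t critical value itself is smaller for more degrees of freedom, so the n = 30 intervals are much narrower at the same confidence level.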
Module 3
Estimation
[Diagram: Precision (horizontal axis) versus Probability to Contain Population Mean (vertical axis), shown for Sample Size $n_1$ and Sample Size $n_2 > n_1$ at CL 99%, 95% and 90%.]
Within the same sample size: CL impact vs Precision/Probability as per above.
Increasing the sample size:
CL 99%: increased precision but same probability to contain pop mean.
CL 95%: increased precision but same probability to contain pop mean.
CL 90%: increased precision but same probability to contain pop mean.
Precision increases at each CL, without decreasing the Probability.
149
Module 3
Estimation
Point Estimation
Interval estimation
Confidence intervals for parameters of one population
150
Module 3
Estimation
Confidence Interval for σ²
Assumption
- The population is normally distributed
The random variable $\chi^2_{n-1} = \dfrac{(n-1)s^2}{\sigma^2}$ follows a chi-square distribution with (n – 1) degrees of freedom
The chi-square value $\chi^2_{n-1,\alpha}$ denotes the number for which:
$P(\chi^2_{n-1} > \chi^2_{n-1,\alpha}) = \alpha$
151
Confidence Interval for σ2
Module 3
Estimation
Point Estimation
The 100(1 − α)% confidence interval for the population variance is:
$\dfrac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} < \sigma^2 < \dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$
Graphically, the 100(1 − α)% = 95% confidence interval:
[Figure: density $f(\chi^2_{n-1})$ with probability (1 − α) = 0.95 between the two critical values and probability α/2 = 0.025 in each tail.]
Note: The Chi-squared distribution is not symmetric like the Normal.
→ The critical values are $\chi^2_{n-1,\,1-\alpha/2}$ (lower) and $\chi^2_{n-1,\,\alpha/2}$ (upper)
152
Module 3
Estimation
Point Estimation
Interval estimation
EXAMPLE
You are testing the speed of a computer processor (X).
You collect the following data (MHz):
Sample Statistic     Value
Size                 11
Mean                 3,004
Standard Deviation   74
Assuming that the population is normal, determine a 95% confidence interval for $\sigma_X^2$, the true population variance.
153
Module 3

Estimation
Point Estimation
SOLUTION
In this case, we use the formula: $\dfrac{(n-1)s^2}{\chi^2_{n-1,\alpha/2}} < \sigma^2 < \dfrac{(n-1)s^2}{\chi^2_{n-1,1-\alpha/2}}$
where, from the Statistical Tables of the Chi-squared Distribution (see next slide),
• $\chi^2_{n-1,\alpha/2} = \chi^2_{10,0.025} = 20.48$
• $\chi^2_{n-1,1-\alpha/2} = \chi^2_{10,0.975} = 3.25$ (α = 0.05 → α/2 = 0.025 and (1 − α/2) = 0.975)
• s = 74
• n = 11
$\dfrac{(11-1)\,5476}{20.48} < \sigma^2 < \dfrac{(11-1)\,5476}{3.25}$
$2{,}673.83 < \sigma^2 < 16{,}849.23$
Converting to standard deviation, we are 95% confident that the population standard deviation of CPU speed is between 51.71 and 129.80 MHz
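The arithmetic of this solution can be verified with a few lines of Python. This is a minimal sketch: the two chi-square critical values are the rounded table values used on the slide (not computed), and the function name `variance_interval` is an illustrative choice.

```python
import math

# Rounded chi-square table values for 10 d.f., as used in the solution
CHI2_10_0025 = 20.48  # upper-tail area 0.025
CHI2_10_0975 = 3.25   # upper-tail area 0.975


def variance_interval(s, n, chi2_upper, chi2_lower):
    """Confidence limits for the variance of a normal population."""
    numerator = (n - 1) * s ** 2
    # The larger critical value gives the lower limit and vice versa
    return numerator / chi2_upper, numerator / chi2_lower


var_lo, var_hi = variance_interval(74, 11, CHI2_10_0025, CHI2_10_0975)
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)
```

Note that the interval is not symmetric around s² = 5476, which follows from the asymmetry of the chi-square distribution.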
154
Module 3
Estimation Use of the chi-square distribution table

Point Estimation
Table of values of χ² in a Chi-Squared Distribution with k degrees of freedom such that α is the area between χ² and +∞
d.f. \ α 0.995 0.99 0.975 0.95 0.9 0.75 0.5 0.25 0.1 0.05 0.025 0.01 0.005 0.002 0.001
1 3.927e-5 1.570e-4 9.820e-4 0.00393 0.0157 0.102 0.455 1.323 2.706 3.841 5.024 6.635 7.879 9.550 10.828
2 0.0100 0.0201 0.0506 0.103 0.211 0.575 1.386 2.773 4.605 5.991 7.378 9.210 10.597 12.429 13.816
3 0.0717 0.115 0.216 0.352 0.584 1.213 2.366 4.108 6.251 7.815 9.348 11.345 12.838 14.796 16.266
4 0.207 0.297 0.484 0.711 1.064 1.923 3.357 5.385 7.779 9.488 11.143 13.277 14.860 16.924 18.467
5 0.412 0.554 0.831 1.145 1.610 2.675 4.351 6.626 9.236 11.070 12.833 15.086 16.750 18.907 20.515
6 0.676 0.872 1.237 1.635 2.204 3.455 5.348 7.841 10.645 12.592 14.449 16.812 18.548 20.791 22.458
7 0.989 1.239 1.690 2.167 2.833 4.255 6.346 9.037 12.017 14.067 16.013 18.475 20.278 22.601 24.322
8 1.344 1.646 2.180 2.733 3.490 5.071 7.344 10.219 13.362 15.507 17.535 20.090 21.955 24.352 26.124
9 1.735 2.088 2.700 3.325 4.168 5.899 8.343 11.389 14.684 16.919 19.023 21.666 23.589 26.056 27.877
10 2.156 2.558 3.247 3.940 4.865 6.737 9.342 12.549 15.987 18.307 20.483 23.209 25.188 27.722 29.588
11 2.603 3.053 3.816 4.575 5.578 7.584 10.341 13.701 17.275 19.675 21.920 24.725 26.757 29.354 31.264
12 3.074 3.571 4.404 5.226 6.304 8.438 11.340 14.845 18.549 21.026 23.337 26.217 28.300 30.957 32.909
13 3.565 4.107 5.009 5.892 7.042 9.299 12.340 15.984 19.812 22.362 24.736 27.688 29.819 32.535 34.528
14 4.075 4.660 5.629 6.571 7.790 10.165 13.339 17.117 21.064 23.685 26.119 29.141 31.319 34.091 36.123
15 4.601 5.229 6.262 7.261 8.547 11.037 14.339 18.245 22.307 24.996 27.488 30.578 32.801 35.628 37.697
16 5.142 5.812 6.908 7.962 9.312 11.912 15.338 19.369 23.542 26.296 28.845 32.000 34.267 37.146 39.252
17 5.697 6.408 7.564 8.672 10.085 12.792 16.338 20.489 24.769 27.587 30.191 33.409 35.718 38.648 40.790
18 6.265 7.015 8.231 9.390 10.865 13.675 17.338 21.605 25.989 28.869 31.526 34.805 37.156 40.136 42.312
19 6.844 7.633 8.907 10.117 11.651 14.562 18.338 22.718 27.204 30.144 32.852 36.191 38.582 41.610 43.820
20 7.434 8.260 9.591 10.851 12.443 15.452 19.337 23.828 28.412 31.410 34.170 37.566 39.997 43.072 45.315 155
Module 3
Estimation
Point Estimation
Interval estimation
DEFINITION: if $P(\chi^2_{n-1} \ge \chi^2_{\alpha/2,n-1}) = \alpha/2$ then $\chi^2_{\alpha/2,n-1}$ is the percentage point of a Chi-Square distribution with (n-1) degrees of freedom.
❑ Let $\chi^2_{\alpha/2,n-1}$ be the percentage point of a Chi-Square distribution with (n-1) DF such that $P(\chi^2_{n-1} \ge \chi^2_{\alpha/2,n-1}) = \alpha/2$ (right tail).
❑ Let $\chi^2_{1-\alpha/2,n-1}$ be the percentage point of a Chi-Square distribution with (n-1) DF such that $P(\chi^2_{n-1} \ge \chi^2_{1-\alpha/2,n-1}) = 1 - \alpha/2$ (left tail).
It can be shown that the quantity $\dfrac{(n-1)s^2}{\sigma^2}$ follows a Chi-Square distribution with (n-1) DF.
A confidence interval at the level (1 − α) for the quantity $\dfrac{(n-1)s^2}{\sigma^2}$ can be formed by taking the interval between the 2 percentage points previously shown: $\chi^2_{\alpha/2,n-1}$ and $\chi^2_{1-\alpha/2,n-1}$.
So, it can be stated that $P\left(\chi^2_{1-\alpha/2,n-1} \le \dfrac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2,n-1}\right) = 1 - \alpha$
𝜎2
156
Module 3
Estimation Advanced Explanation

Point Estimation
Interval estimation
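To obtain the confidence interval for the population variance, the term σ² must be isolated.
Dividing by $(n-1)s^2$, we obtain
$P\left(\dfrac{\chi^2_{1-\alpha/2,n-1}}{(n-1)s^2} \le \dfrac{1}{\sigma^2} \le \dfrac{\chi^2_{\alpha/2,n-1}}{(n-1)s^2}\right) = 1 - \alpha$
To solve for σ² we take the reciprocal of each side, which reverses the direction of each inequality:
$\dfrac{\chi^2_{1-\alpha/2,n-1}}{(n-1)s^2} \le \dfrac{1}{\sigma^2} \Rightarrow \dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2,n-1}} \ge \sigma^2$
and
$\dfrac{1}{\sigma^2} \le \dfrac{\chi^2_{\alpha/2,n-1}}{(n-1)s^2} \Rightarrow \sigma^2 \ge \dfrac{(n-1)s^2}{\chi^2_{\alpha/2,n-1}}$
which leads to the 100(1 − α)% two-sided confidence interval for the population variance:
$\dfrac{(n-1)s^2}{\chi^2_{n-1,\,\alpha/2}} < \sigma^2 < \dfrac{(n-1)s^2}{\chi^2_{n-1,\,1-\alpha/2}}$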
157
Module 3
Estimation
Confidence Interval for the population proportion
Point Estimation
Interval estimation
Let $\hat{p}$ be the proportion of successes in n independent trials, each having a probability of success equal to p.
The following C.I. for the population proportion p is valid for large samples. In other terms, the method is valid under the assumption that the Binomial distribution is satisfactorily approximated by a Normal distribution. This happens if np(1 − p) > 9.
Under this assumption, a 100(1−α)% C.I. for the population proportion is given by:
$\hat{p} - Z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} < p < \hat{p} + Z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$
158
Module 3
Estimation
Confidence Interval for the population proportion
Point Estimation
Interval estimation
EXAMPLE:
During the inspection of a random sample, it turns out that 15 items out of 80 are nonconforming (for one or more causes of nonconformity). A point estimate of the proportion nonconforming is then: $\hat{p} = \dfrac{15}{80} = 0.1875$.
The normal approximation is plausible, since 80(0.1875)(0.8125) = 12.1875 > 9.
A 95% C.I. for the population proportion:
$\hat{p} - Z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} < p < \hat{p} + Z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} \implies 0.1875 - 1.96\sqrt{\dfrac{0.1875(0.8125)}{80}} < p < 0.1875 + 1.96\sqrt{\dfrac{0.1875(0.8125)}{80}}$
CI = (0.102, 0.273)
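The classical (normal-approximation) interval of this example can be reproduced as follows. This is a minimal Python sketch; the function names `wald_interval` and `normal_approx_ok` are illustrative and not part of the course material.

```python
import math
from statistics import NormalDist


def normal_approx_ok(p: float, n: int) -> bool:
    """Rule of thumb from the slide: n*p*(1-p) > 9."""
    return n * p * (1.0 - p) > 9.0


def wald_interval(nonconforming: int, n: int, confidence: float = 0.95):
    """Classical normal-approximation CI for a population proportion."""
    p_hat = nonconforming / n
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * math.sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - margin, p_hat + margin


lower, upper = wald_interval(15, 80)
```

Since p is unknown in practice, the rule of thumb is evaluated with the estimate 15/80 in place of p, exactly as in the example.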
159
Module 3
Estimation
Exercise #9
Point Estimation
1. Open the exercise File:
Interval estimation
2. Trainer will show using JMP:
a. How to perform one population Proportion Test
Module 3 Key Learning 3. Interpretation of results.
Note:
• In JMP, the default proportion test uses the Wilson Score Interval. Thus the result obtained is different from the one given by the method used in the previous slide.
• The method used in slide #126 is explained in Slide #125.

This method can be downloaded into JMP as “Add-In”.
Visit the JMP Learning Center under section of “Useful Add-In” and download
the add-in called Statistical Calculator.
https://stmicroelectronics.sharepoint.com/teams/jmplearningcenter9/SitePages/
Welcome-to-JMP-Video-Tutorial.aspx
160
Module 3
Estimation
Exercise #9
Point Estimation
Interval estimation 1. Open the exercise Files

Summarized data of conformity

Individual conformity, 80 rows
161
Module 3
Estimation
Exercise #9
Point Estimation
Interval estimation Using this data: Go to Analyze > Distribution – Conformity column to Y > OK
Go to hotspot > Confidence interval > 0.95
162
Module 3
Estimation
Exercise #9
Point Estimation
Interval estimation Using this data: Go to Analyze > Distribution – Conformity column to Y, Freq col to Freq > OK
Go to hotspot > Confidence interval > 0.95
163
Module 3
Estimation
Exercise #9
Point Estimation
Interval estimation
JMP uses the Wilson score method.
Manual calculation uses the classical method.
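To see why the two methods give different limits, the Wilson score interval can be computed by hand. The sketch below uses the standard textbook form of the Wilson score formula in Python; it is an illustration only, the function name is hypothetical, and JMP's reported digits may differ slightly from what this prints.

```python
from math import sqrt
from statistics import NormalDist


def wilson_proportion_ci(nonconforming, n, confidence=0.95):
    """Wilson score C.I. for a population proportion (textbook formula)."""
    p_hat = nonconforming / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    denom = 1 + z**2 / n
    # The interval is centred on a value shifted from p_hat towards 0.5
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Same data as the manual example: 15 nonconforming items out of 80
lower, upper = wilson_proportion_ci(15, 80)
print(f"Wilson 95% C.I. = ({lower:.3f}, {upper:.3f})")
```

Unlike the classical interval, the Wilson interval is not symmetric around p̂, which is why its limits are shifted relative to the manual calculation.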
164
Module 3
Estimation
Exercise #9
Point Estimation
Properties of Point Estimators Go to Add-Ins > Statistics Calculator III > Confidence Interval for One Proportion
Interval estimation
165
Module 3
Estimation
Exercise #9
Point Estimation
Properties of Point Estimators Go to Add-Ins > Statistics Calculator III > Confidence Interval for One Proportion
Interval estimation
166
Module 3
Estimation
Exercise #9
Point Estimation
Interval estimation
167
Module 3
Estimation
Point Estimation Confidence Interval summary

Interval estimation
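Formulas for confidence intervals of parameters of one normal population:

Population Parameter | Confidence Interval
μ (with σ² known) | $\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$
μ (with σ² unknown) | $\bar{X} - t_{n-1,\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}$
σ² | $\frac{(n-1)s^2}{\chi^2_{n-1,\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{n-1,1-\alpha/2}}$
Θ = p (population proportion) | $\hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} < p < \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$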
NOTE:
For more details, see also the Manual of Statistical Methodology – Ch. 6. (DMS 8482919_A)
Module 3
Estimation
Point Estimation
Interval estimation
Interval estimation of unknown population

parameters (two normal populations)
169
Module 3
Estimation
Point Estimation
Interval Estimation – 2 populations
When two populations are considered simultaneously, the goal of
the confidence intervals is often to compare the parameters of
the two populations.

Quantity | Population X | Population Y
Standard deviation | σX | σY
Mean | μX | μY
Samples | x1, x2, ..., xnx | y1, y2, ..., yny
Point estimates | $\bar{x}$, $s_x^2$ | $\bar{y}$, $s_y^2$
Comparisons: μX = μY ? and σ²X = σ²Y ?
170
Module 3
Estimation
Point Estimation
Interval Estimation(two populations)
Interval estimation
GENERAL SCHEME. Confidence interval for:
• difference of population means, independent samples. Example: comparison of means, Group 1 vs. independent Group 2
• difference of population means, dependent samples. Example: comparison of means, same group before vs. after treatment
• ratio of population variances. Example: comparison of variances of two normal distributions
171
Module 3
Estimation
Point Estimation
Interval estimation


172
Module 3
Estimation
Point Estimation
Interval estimation
Module 3 Key Learning Interval estimation of unknown population

parameters (two independent populations)
The previous results for one population are now extended to the case of two
independent populations
Two independent populations, Population X and Population Y, are considered
• Population X has mean μX and variance σ2X
• Population Y has mean μY and variance σ2Y
• Inferences are based on two random samples of sizes nX and nY, from population X and
population Y, respectively (→ from population X the random sample is: x1, x2, …, xnx and
from population Y the random sample is: y1, y2, …, yny ).
173
Module 3
Estimation
Point Estimation
C. I. for μX-μY (independent populations)
Interval estimation
Goal: form a confidence interval for the difference between two
population means, μX – μY
Population means, independent samples:
• Different data sources
• Uncorrelated
• Independent
• Sample selected from one population has no effect on the sample selected from the
other population
• The point estimate is the difference between the two sample means: $\bar{x} - \bar{y}$
174
Module 3
Estimation C. I. for μX-μY (independent populations)

Point Estimation
Interval estimation
Different cases can be identified (population means, independent samples):
• σx² and σy² known: confidence interval uses $Z_{\alpha/2}$
• σx² and σy² unknown, assumed equal: confidence interval uses a value from the Student's t distribution
• σx² and σy² unknown, assumed unequal: confidence interval uses a value from the Student's t distribution
175
Module 3

Point Estimation
Population means, independent samples
Case: σx² and σy² known (start with this case)
1 Assumptions:
▪ Samples are randomly and independently drawn
▪ Both population distributions are Normal
Context:
▪ Population variances are known
2 The confidence interval for μX – μY is:
$(\bar{x} - \bar{y}) - Z_{\alpha/2}\sqrt{\frac{\sigma_X^2}{n_x} + \frac{\sigma_Y^2}{n_y}} < \mu_X - \mu_Y < (\bar{x} - \bar{y}) + Z_{\alpha/2}\sqrt{\frac{\sigma_X^2}{n_x} + \frac{\sigma_Y^2}{n_y}}$
3 μx and μy comparison: form a 100(1-α)% confidence interval for the difference μx - μy.
If zero is included in the interval, the data show no evidence (at the 100(1-α)% confidence level) that μx and μy differ.
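The known-variance interval can be sketched in a few lines of Python. The function name and the input values below are hypothetical, chosen only to illustrate the formula; they do not come from the course material.

```python
from math import sqrt
from statistics import NormalDist


def z_interval_diff_means(x_bar, y_bar, var_x, var_y, n_x, n_y, confidence=0.95):
    """C.I. for mu_X - mu_Y when both population variances are known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sqrt(var_x / n_x + var_y / n_y)
    diff = x_bar - y_bar
    return diff - margin, diff + margin


# Hypothetical inputs, for illustration only
lower, upper = z_interval_diff_means(
    x_bar=10.2, y_bar=9.8, var_x=0.25, var_y=0.36, n_x=25, n_y=30
)
print(f"95% C.I. for the difference: ({lower:.3f}, {upper:.3f})")
print("Zero inside the interval:", lower < 0 < upper)
```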
176
Module 3

Point Estimation
Interval estimation
Case: σx² and σy² unknown, assumed equal (this case)

1 Assumptions:
Context:
▪ Population variances are unknown but assumed equal
2 Forming interval estimates:

• The population variances are unknown but assumed equal, so use the two
sample standard deviations and pool them to estimate the unknown common σ
• use a t value with (nx + ny – 2) degrees of freedom
Calculate the pooled variance $S_p^2$:
$S_p^2 = \frac{(n_x - 1)S_x^2 + (n_y - 1)S_y^2}{n_x + n_y - 2}$
177
Module 3

Point Estimation
Interval estimation

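3 The confidence interval for μX – μY is:
$(\bar{x} - \bar{y}) - t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{S_p^2}{n_x} + \frac{S_p^2}{n_y}} < \mu_X - \mu_Y < (\bar{x} - \bar{y}) + t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{S_p^2}{n_x} + \frac{S_p^2}{n_y}}$
4 μx and μy comparison: form a 100(1-α)% confidence interval for the difference μx - μy.
If zero is included in the interval, the data show no evidence (at the 100(1-α)% confidence level) that μx and μy differ.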
178
Module 3
Estimation
Point Estimation
Interval estimation EXAMPLE

You are testing the speed of two computer processors (X1 and X2).
You collect the following data (MHz):

Sample Statistic | 1st processor X1 | 2nd processor X2
Sample size | 17 | 14
Sample mean | 3004 | 2538
Sample standard deviation | 74 | 56

Assuming that the two populations are normally distributed with unknown but equal
variances, determine a 95% confidence interval for the difference of CPU mean speeds.
179
SOLUTION
The pooled variance is:
$S_p^2 = \frac{(n_x - 1)S_x^2 + (n_y - 1)S_y^2}{(n_x - 1) + (n_y - 1)} = \frac{(17 - 1)74^2 + (14 - 1)56^2}{(17 - 1) + (14 - 1)} = 4427.03$
The t value for a 95% confidence interval is: $t_{n_x+n_y-2,\alpha/2} = t_{29,0.025} = 2.045$
The 95% confidence interval is:
$(\bar{x} - \bar{y}) - t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{S_p^2}{n_x} + \frac{S_p^2}{n_y}} < \mu_X - \mu_Y < (\bar{x} - \bar{y}) + t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{S_p^2}{n_x} + \frac{S_p^2}{n_y}}$
$(3004 - 2538) - (2.045)\sqrt{\frac{4427.03}{17} + \frac{4427.03}{14}} < \mu_X - \mu_Y < (3004 - 2538) + (2.045)\sqrt{\frac{4427.03}{17} + \frac{4427.03}{14}}$
$416.89 < \mu_X - \mu_Y < 515.11$
We are 95% confident that the mean difference in CPU speed is between 416.89 and 515.11 MHz. Since zero
is not included in the interval, we conclude that the two processors do not perform equally in terms of mean CPU speed.
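The pooled-variance calculation can be reproduced with a short Python sketch. The function name is illustrative, and the critical value t(29, 0.025) = 2.045 is passed in from the t table rather than computed, so that only the standard library is needed. With that value the limits evaluate to about 416.89 and 515.11 MHz.

```python
from math import sqrt


def pooled_t_interval(x_bar, y_bar, s_x, s_y, n_x, n_y, t_crit):
    """C.I. for mu_X - mu_Y with unknown but equal variances (pooled variance)."""
    # Pooled variance: weighted average of the two sample variances
    sp2 = ((n_x - 1) * s_x**2 + (n_y - 1) * s_y**2) / (n_x + n_y - 2)
    margin = t_crit * sqrt(sp2 / n_x + sp2 / n_y)
    diff = x_bar - y_bar
    return sp2, diff - margin, diff + margin


# CPU example: t_crit = t(29, 0.025) = 2.045, read from the t table
sp2, lower, upper = pooled_t_interval(3004, 2538, 74, 56, 17, 14, t_crit=2.045)
print(f"Pooled variance = {sp2:.2f}")
print(f"95% C.I. = ({lower:.2f}, {upper:.2f}) MHz")
```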
180
Module 3
Estimation
Exercise #10
Point Estimation
Interval estimation
Parameters of 2 normal populations 1. Open the exercise File:
a. How to perform Unequal Variance Test.
b. How to perform Confidence Interval.
3. Interpretation of results.
181
Module 3
Estimation
Exercise #10
Point Estimation
Interval estimation 1. Open the exercise File and stack the data columns
Go to Tables > Stack
2. Plot the data and analyze

182
Module 3
Estimation
Exercise #10
Point Estimation
Go to hotspot > Unequal Variances
Interval estimation Are the variances equal?

Go to hotspot > Means/Anova/Pooled t
Does your CI
include 0?
183
Module 3
Estimation
Point Estimation
Interval estimation
Case: σx² and σy² unknown, assumed unequal (this case)
1 Assumptions:
Context:
▪ Population variances are unknown and assumed unequal
2 Forming interval estimates:

• The population variances are assumed unequal, so a pooled variance is not appropriate
• use a t value with ν degrees of freedom, where:
$\nu = \frac{\left(\frac{S_x^2}{n_x} + \frac{S_y^2}{n_y}\right)^2}{\frac{\left(S_x^2/n_x\right)^2}{n_x - 1} + \frac{\left(S_y^2/n_y\right)^2}{n_y - 1}}$
Module 3

Point Estimation
Interval estimation

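3 The confidence interval for μX – μY is:
$(\bar{x} - \bar{y}) - t_{\nu,\alpha/2}\sqrt{\frac{S_X^2}{n_x} + \frac{S_Y^2}{n_y}} < \mu_X - \mu_Y < (\bar{x} - \bar{y}) + t_{\nu,\alpha/2}\sqrt{\frac{S_X^2}{n_x} + \frac{S_Y^2}{n_y}}$
4 μx and μy comparison: form a 100(1-α)% confidence interval for the difference μx - μy.
If zero is included in the interval, the data show no evidence (at the 100(1-α)% confidence level) that μx and μy differ.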
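The degrees of freedom ν and the standard error for the unequal-variance case can be sketched as follows. The function names are illustrative. The inputs reuse the summary statistics of the CPU example purely as an illustration (that example itself assumes equal variances), and the t critical value must still be looked up for the computed ν.

```python
from math import sqrt


def welch_df(s_x, s_y, n_x, n_y):
    """Degrees of freedom nu for the unequal-variance t interval."""
    a = s_x**2 / n_x
    b = s_y**2 / n_y
    return (a + b) ** 2 / (a**2 / (n_x - 1) + b**2 / (n_y - 1))


def unequal_variance_interval(x_bar, y_bar, s_x, s_y, n_x, n_y, t_crit):
    """C.I. for mu_X - mu_Y without pooling; t_crit is t(nu, alpha/2) from a table."""
    margin = t_crit * sqrt(s_x**2 / n_x + s_y**2 / n_y)
    diff = x_bar - y_bar
    return diff - margin, diff + margin


# Summary statistics reused from the CPU example, for illustration only
nu = welch_df(74, 56, 17, 14)
std_err = sqrt(74**2 / 17 + 56**2 / 14)
print(f"nu = {nu:.2f}, standard error = {std_err:.2f}")
```

ν is generally not an integer and always lies between min(nx, ny) – 1 and nx + ny – 2.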
185
Module 3
Estimation
Exercise #11
Point Estimation
Interval estimation
Parameters of 2 normal populations 1. Open the exercise File:

a. How to perform Unequal Variance Test.
b. How to perform Confidence Interval.
186
Module 3
Estimation
Exercise #11
Point Estimation
1. Follow Step1 & 2 in Exercise 10
Interval estimation
Parameters of 1 normal population Go to hotspot > Unequal Variances

Are the variances equal?
Go to hotspot > Means/Anova/Pooled t
Does your CI
include 0?
187
Module 3
Estimation
Point Estimation
Interval estimation

188
Module 3
Estimation
Point Estimation
Interval estimation
Interval estimation of unknown population parameters (two
dependent populations)
• Compares the means of two related populations (X and Y)

o Paired or matched samples
o Repeated measures (before/after)
o Use the difference between paired values: di = (Xi – Yi), i = 1,2,…,n
• Pairing removes the variation among subjects from the comparison
Assumptions: both populations are normally distributed
189
Module 3
Estimation
Point Estimation
C. I. for μX-μY (dependent populations)
Interval estimation
The ith paired difference is di, where
di = Xi - Yi
The point estimate for the population mean paired difference is $\bar{d}$:
$\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n}$
The sample standard deviation is $S_d$:
$S_d = \sqrt{\frac{\sum_{i=1}^{n}\left(d_i - \bar{d}\right)^2}{n-1}}$
n is the number of matched pairs in the sample

Module 3
Estimation
Point Estimation
Interval estimation

The confidence interval for the difference between population means, μd, is
$\bar{d} - t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}}$
where
n = the sample size (number of matched pairs in the paired sample)
μd interpretation: if zero is included in the interval, the data show no evidence
(at the 100(1-α)% confidence level) that μd differs from 0, that is, no difference
between the population means is detected.
191
Module 3
Estimation
Point Estimation
Interval estimation

• The Margin of Error (ME) is:
$ME = t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}}$
• $t_{n-1,\alpha/2}$ is the value from the Student's t distribution
with (n – 1) degrees of freedom for which:
$P\left(t_{n-1} > t_{n-1,\alpha/2}\right) = \frac{\alpha}{2}$
192
Module 3
Estimation
Exercise #12
Point Estimation
Interval estimation
1. Open the exercise File:
a. How to perform Confidence Interval.
193
Module 3
Estimation
Exercise #12
Point Estimation
Interval estimation Method1: Using Matched Pair

Go to Analyze > Specialized Modeling > Matched Pair
Cast T0 and T500 to Y Response
Does the CI include the

target of interest (0)?
194
Module 3
Estimation
Exercise #12
Point Estimation
Interval estimation Method2: Using the Changes column (difference of T0 & T500 measurements)
Go to Distribution
Cast Changes to Y, Response → OK
Does the CI include the

target of interest (0)?
195
Module 3
Estimation
Point Estimation
EXAMPLE
Interval estimation
Six people sign up for a weight loss program. You collect the following data:

Weight:
Person | Before (xi) | After (yi) | Difference di
1 | 136 | 125 | 11
2 | 205 | 195 | 10
3 | 157 | 150 | 7
4 | 138 | 140 | -2
5 | 175 | 165 | 10
6 | 166 | 160 | 6
Sum of differences: 42

$\bar{d} = \frac{\sum d_i}{n} = 7.0$
$S_d = \sqrt{\frac{\sum (d_i - \bar{d})^2}{n-1}} = 4.82$
Form a 95% confidence interval for the difference of means and establish
whether the weight loss program helps people lose weight.
196
Module 3
Estimation
Point Estimation
SOLUTION
For a 95% confidence level, the appropriate t value is $t_{n-1,\alpha/2} = t_{5,0.025} = 2.571$
The 95% confidence interval for the difference between means, μd, is
$\bar{d} - t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}}$
$7 - (2.571)\frac{4.82}{\sqrt{6}} < \mu_d < 7 + (2.571)\frac{4.82}{\sqrt{6}}$
$1.94 < \mu_d < 12.06$
Interpretation:
• Since this interval does not contain zero, we can be 95% confident, given this
limited data, that the weight loss program produced a statistically significant effect
on people's weight.
• Since both confidence limits are positive, we can state that the weight loss program
helps people to lose weight (weight before > weight after).
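The paired calculation can be verified directly from the six before/after weights. This is a minimal Python sketch using only the standard library; the critical value t(5, 0.025) = 2.571 is taken from the t table, as in the solution.

```python
from math import sqrt
from statistics import mean, stdev

# Weights from the example
before = [136, 205, 157, 138, 175, 166]
after = [125, 195, 150, 140, 165, 160]

# Paired differences d_i = x_i - y_i
d = [x - y for x, y in zip(before, after)]
n = len(d)
d_bar = mean(d)   # point estimate of the mean paired difference
s_d = stdev(d)    # sample standard deviation (divides by n - 1)

t_crit = 2.571    # t(5, 0.025), read from the t table
margin = t_crit * s_d / sqrt(n)
lower, upper = d_bar - margin, d_bar + margin

print(f"d_bar = {d_bar}, S_d = {s_d:.2f}")
print(f"95% C.I. for mu_d = ({lower:.2f}, {upper:.2f})")
```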
197
Module 3
Estimation
Point Estimation
C. I. for μd and σd (Dependent samples)
Interval estimation
EXAMPLE
Confidence interval for the mean μd and for the standard deviation σd of the difference (d)
of the values drawn from 2 normal paired (or matched) samples
Use here the same data of previous example 2.6 for independent samples (same
parameter in two successive time periods).
Here, we assume that the data in columns A and B represent measurements on the
same units before and after a certain treatment was given (or, equivalently, in two
different periods of time: t-1 and t)
• Determine and interpret the 95% confidence intervals for μd and for σd
198
Module 3
Estimation C. I. for μd and σd (Dependent samples)

Point Estimation
Interval estimation SOLUTION (JMP)

ANALYZE > MATCHED PAIRS
Confidence interval
199
Module 3
Estimation
Point Estimation
Interval estimation

Examples:
• Comparison of means: Group 1 vs. independent Group 2
• Comparison of means: same group before vs. after treatment
• Comparison of variances of two normal distributions
200
C. I. for (σX)2/(σY)2 (indep. Normal populations)
Module 3
Estimation
Point Estimation
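1 Assumptions:
Samples are randomly and independently drawn from two Normal distributions
2 The confidence interval for the ratio of variances σX²/σY² is given by:
$\frac{s_X^2}{s_Y^2} F_{1-\alpha/2,\, n_Y-1,\, n_X-1} < \frac{\sigma_X^2}{\sigma_Y^2} < \frac{s_X^2}{s_Y^2} F_{\alpha/2,\, n_Y-1,\, n_X-1}$
where $F_{\alpha/2,\, n_Y-1,\, n_X-1}$ is the percentage point of the F distribution with (nY-1)
and (nX-1) degrees of freedom such that $P\left(F_{n_Y-1,\, n_X-1} \geq F_{\alpha/2,\, n_Y-1,\, n_X-1}\right) = \frac{\alpha}{2}$.
The degrees of freedom appear in the order (nY-1, nX-1) because the interval is obtained by
inverting the statistic $\frac{S_X^2/\sigma_X^2}{S_Y^2/\sigma_Y^2}$, which follows an F distribution with (nX-1) and (nY-1) degrees of freedom.
3 σ²X and σ²Y comparison: form a 100(1-α)% confidence interval for the ratio σ²X/σ²Y.
If one (1) is included in the interval, the data show no evidence (at the 100(1-α)% confidence level)
that σ²X and σ²Y differ.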
201
Module 3
Estimation
Confidence Interval for the difference of 2 population proportions
Point Estimation
ASSUMPTIONS:
Interval estimation
• Samples are randomly and independently drawn.
• Sample sizes are nX and nY.
• Both sample sizes are large and the Normal approximation to the Binomial distribution holds (np(1 - p) > 9).
The C.I. is relative to the difference of the population proportions: pX – pY.
The point estimate for the difference is $\hat{p}_X - \hat{p}_Y$.
The random variable
$Z = \frac{(\hat{p}_X - \hat{p}_Y) - (p_X - p_Y)}{\sqrt{\frac{\hat{p}_X(1-\hat{p}_X)}{n_X} + \frac{\hat{p}_Y(1-\hat{p}_Y)}{n_Y}}}$
is approximately normal, and then a 100(1-α)% confidence interval
for the difference of 2 population proportions is given by:
$(\hat{p}_X - \hat{p}_Y) \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}_X(1-\hat{p}_X)}{n_X} + \frac{\hat{p}_Y(1-\hat{p}_Y)}{n_Y}}$
202
Module 3
Estimation
Confidence Interval for the difference of 2 population proportion
Point Estimation
Interval estimation
EXAMPLE
Two production lots have been inspected. From the first, a random sample of size nX = 270 units is drawn and then
checked. 11 units out of 270 have been found defective. From the second lot, a random sample of size nY = 352 units
is drawn and then checked. 19 units out of 352 have been found defective.
Form a 90% C.I. for the difference of proportions of defective units in lot 1 and in lot 2 to assess equality of proportions.
SOLUTION
From lot 1: $\hat{p}_X = \frac{11}{270} = 0.0407$. From lot 2: $\hat{p}_Y = \frac{19}{352} = 0.053977$
For 90% confidence level, $Z_{\alpha/2} = 1.645$, and
$\sqrt{\frac{\hat{p}_X(1-\hat{p}_X)}{n_X} + \frac{\hat{p}_Y(1-\hat{p}_Y)}{n_Y}} = \sqrt{\frac{0.0407(0.9593)}{270} + \frac{0.053977(0.946023)}{352}} = 0.01702$
The confidence limits are:
$(\hat{p}_X - \hat{p}_Y) \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}_X(1-\hat{p}_X)}{n_X} + \frac{\hat{p}_Y(1-\hat{p}_Y)}{n_Y}} = (0.0407 - 0.053977) \pm 1.645(0.01702)$
and the C.I. is:
$-0.0412 < p_X - p_Y < 0.0148$
CONCLUSION
Since 0 ∈ C.I., at the 90% confidence level the data show no evidence that the proportion of defective units differs between the two lots.
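The two-lot calculation can be checked with a short Python sketch. The function name is illustrative, and this is the classical normal-approximation interval from the slide, not necessarily the method JMP applies in the exercise.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(def_x, n_x, def_y, n_y, confidence=0.90):
    """Normal-approximation C.I. for the difference p_X - p_Y."""
    p_x = def_x / n_x
    p_y = def_y / n_y
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    std_err = sqrt(p_x * (1 - p_x) / n_x + p_y * (1 - p_y) / n_y)
    diff = p_x - p_y
    return diff - z * std_err, diff + z * std_err


# Lot 1: 11 defective out of 270; lot 2: 19 defective out of 352
lower, upper = two_proportion_ci(11, 270, 19, 352, confidence=0.90)
print(f"90% C.I. = ({lower:.4f}, {upper:.4f})")
print("Zero inside the interval:", lower < 0 < upper)
```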
203
Module 3
Estimation
Exercise #13
Point Estimation
Interval estimation
Module 3 Key Learning 1. Open the exercise File:

a. How to perform 2 population Proportion Test
204
Module 3
Estimation
Exercise #13
Point Estimation
Interval estimation
Go to Tables > Stack
Stacked data
205
Module 3
Estimation
Exercise #13
Point Estimation Go to hot spot > Two Sample Test for Proportions
Interval estimation
CI for Proportion
Does this CI include 0?
You can toggle the response of interest
206
Module 3
Estimation
Exercise #13
Point Estimation
Interval estimation Go to Tables > Stack Stacked data table

207
Module 3
Estimation Using summarized data Using raw data Exercise #13

Point Estimation
Interval estimation
208
Module 3
Estimation
Point Estimation
Module 3 Key Learnings
Interval estimation ❑ Point and interval estimation


❑ Properties of a point estimator
o unbiasedness, consistency, maximum efficiency
❑ Confidence interval for parameters (under the normality assumption)

o One population
▪ For the mean (variance known and unknown)
▪ For the variance
o Two populations
▪ Independent samples
• For the difference of means (variance known and unknown, equal and unequal variances)
• For the ratio of variances
▪ Dependent samples
• For the difference of means
209
END OF STATS 2 PART 1
210
File Revision
Version | Date | Remarks | Who
1.0 | 2017 | Initial Release | Marco Della Seta
2.0 | April 2021 | New format and template. New exercises. | Marco Della Seta / HK Looi
211
