
STATISTICS LEVEL 2

Part 1
Speaker Name
ST Context2

• STatS is a project started in 2012 under the sponsorship of PQR Management.

• To enable ST to reach best-in-class status by introducing innovative statistical tools and methodologies at Company level

• The main goals of STatS are to review, rationalize and improve the effectiveness of statistical methodology in general.

• STatS is intended to continuously improve our detection capability through the adoption of an advanced statistical approach, and to reduce DPPM (Defective Parts Per Million) through innovation in the statistical techniques deployed in ST manufacturing.

• To drive and support the deployment and correct application of the Statistics
Manuals in all ST manufacturing plants

[Figure: Statistics Learning path]
Courses shown:
• AWARENESS: S.P.C. Awareness, M.S.A. Awareness, D.O.E. Awareness
• Statistics Level 1 (STAT 1)
• S.P.C. Operators
• S.P.C. Engineers
• Statistics Level 2 (Part 1, Part 2)
• Statistical Model Building 1 + D.O.E. 1 (SMB 1 + D.O.E. 1)
• Measurement System Analysis (MSA)
• Statistical Model Building 2 + D.O.E. 2 (SMB 2 + D.O.E. 2)
• Multivariate Statistics

Color code legend:
• Green shading = No prerequisites
• Other colors = Prerequisites

Why this course?

• To provide the fundamentals of statistics


• To answer current statistical questions in everyday work
• To produce more accurate/effective statistical analysis

Training purpose
• To understand the basics of inferential Statistics
• To form, use and interpret confidence intervals
• To perform and interpret statistical tests: null and alternative hypotheses, test statistics, p-value
• To understand and use the concept of correlation between variables
• To use both parametric and non-parametric Statistics
• To focus on the importance of the methods for outlier detection
• To know about the existence of bootstrap methodology

Benefits

• To analyze information at the proper level and make the right decision accordingly

Let’s get to know each other…

Round table:
• Name
• Organization
• Are you already using statistical methodology?
• If so, what are the main applications?
• Expectations from the course

Pre-test
• Complete the questionnaire to the best of your knowledge

• This is not an individual ‘control’ test

It will give us an idea of your level of knowledge of the subject prior to the training (so, if you don’t know, don’t worry …)
We will repeat the questionnaire at the end of the course to measure the learning that has taken place

10 minutes

Structure of the Course

Module 1: INTRODUCTION
• Introduction
• First Concepts
• Population VS Sample
• Descriptive VS Inferential
• Estimation (point and interval)
• Hypothesis Testing
• Inferential Error

Module 2: CENTRAL LIMIT THEOREM
• Decision Making Process
• Numerical Simulation & Examples
• t-Distribution

Module 3: CONFIDENCE INTERVAL
• Estimation
• Point Estimation
• Properties of Point Estimators
• Interval Estimation
• Parameters of 1 Normal Population
• Parameters of 2 Normal Populations

Module 4: HYPOTHESIS TESTING
• 1 Normal Population
• 2 Normal Populations
• ANOVA
• Non-Parametric Test

Annex 1
• Introduction to Bootstrap

Annex 2
• Overview of Outlier Detection Methods

Module 1 : Introduction
Duration ~ TBD hrs
Objective:
• Recall from STATS 1 : Population vs Sample, Descriptive vs Inferential Statistics.
• Point and interval estimation of parameters.
• Hypothesis Testing Procedures.
• Inferential error.

Know : Statistical Theory
• Recall from STATS 1
• Introduction to Inference
• Estimation & Hypothesis Testing
• Inferential Error
• Module 1 Key Learning

How : JMP Application
• Not Applicable.

FAQ
#1 : To collect input from participants.

Module 1

Recall from STATS 1
Introduction To Inference
Estimation & Hypothesis Testing
Inferential Error
Module 1 Key Learning

Descriptive and Inferential Statistics

Two branches of Statistics

Descriptive statistics
• Collecting, summarizing, and processing data to transform data into information

Inferential statistics
• Providing the basis for predictions, forecasts, and estimates that are used to transform information into knowledge

Descriptive Statistics

• Collect data
  • e.g., Survey

• Present data
  • e.g., Tables and Graphs

• Summarize data
  • e.g., Sample average: $\bar{X} = \frac{\sum X_i}{n}$

Inferential Statistics

Inference is the process of drawing conclusions or making decisions about a population based on sample results

• Estimation (of unknown parameters)
  • e.g., Estimate the population mean weight using the sample mean weight

• Hypothesis testing
  • e.g., Test the claim that the population mean weight is 120 pounds


Two approaches to estimate an unknown population parameter:

• Point Estimation: RESULT = Single Value
  Example: “mean = 18”

• Interval Estimation: RESULT = Interval + Confidence
  Example: “mean between 17 and 19, at the 95% confidence level”

• A point estimate is a single number.
• A confidence interval provides additional information about variability.

[Figure: a Confidence Interval drawn as a segment running from the Lower Confidence Limit to the Upper Confidence Limit, with the Point Estimate marked between them]


• A hypothesis is a claim (assumption) about an aspect of the population under investigation

• It can be a population parameter like:
  • The population mean / variance, …
    Example: The mean monthly cell phone bill of this city is μ = $42
  • The population proportion
    Example: The proportion of adults in this city with cell phones is π = 0.68
  • Other

Hypothesis Testing Procedure

Statistical hypothesis testing procedures are methods for investigating an aspect of interest of one or more populations.

For example, we might be interested in:

❑ determining the most likely value of a population parameter (e.g. mean or variance);
❑ comparing the same parameter of two or more populations (e.g. two or more means, variances, proportions or other);
❑ assessing how well a certain theoretical distribution (e.g. the normal distribution) fits the data (Goodness of Fit tests);
❑ and many others.

“What do interval estimation and hypothesis testing have in common?”

“Their results are based on sample data. The basic assumption is that the sample adequately represents the population from which it has been drawn”

However,

“Represents adequately” ≠ “Represents perfectly”

1. In interval estimation problems, we cannot be 100% sure that the true value of the unknown parameter is contained in the interval.
2. When testing a system of hypotheses, we cannot be 100% sure of making the right decision about which hypothesis to support.

WHY, in interval estimation problems and hypothesis testing procedures, can we not be 100% sure of making the right decision (cannot be 100% confident)?

BECAUSE there is always a positive probability that the sample misleads our conclusions due to the INFERENTIAL (or SAMPLING) ERROR.
To eliminate this error, the whole population would have to be considered.

EXAMPLE:

A sample (of size n) is drawn from a population (of size N >> n) and a confidence interval for the population mean has been formed with the following results:

- Confidence limits: LCL = 53, UCL = 61 (width = 8)
- Confidence level: 0.95 (or 95%)
- Point estimate of the population mean: $\bar{X} = 57$

If we use a higher confidence level (same data), say 99%, we must accept a wider confidence interval → less precise estimate.
- Confidence limits: LCL = 50, UCL = 64 (width = 14)
- Confidence level: 0.99 (or 99%)
- Point estimate of the population mean: $\bar{X} = 57$

To maintain the same interval width at a higher confidence level, we must increase the sample size.
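
The trade-off between confidence level, interval width and sample size can be sketched numerically. The snippet below is only an illustration under a simplifying assumption that is not in the slide: the population standard deviation is treated as known, so the interval is mean ± z·σ/√n. The values of `sigma` and `n` are hypothetical and do not reproduce the 53–61 example above.

```python
import math
from statistics import NormalDist


def z_interval_width(sigma, n, confidence):
    """Full width of a two-sided z-interval for the mean (sigma assumed known)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return 2 * z * sigma / math.sqrt(n)


sigma, n = 10.0, 25  # hypothetical values, not taken from the slide

width_95 = z_interval_width(sigma, n, 0.95)
width_99 = z_interval_width(sigma, n, 0.99)

# Sample size needed at 99% so that the width is no larger than the 95% width.
z95 = NormalDist().inv_cdf(0.975)
z99 = NormalDist().inv_cdf(0.995)
n_needed = math.ceil(n * (z99 / z95) ** 2)

print(f"95% width: {width_95:.2f}")
print(f"99% width: {width_99:.2f}")
print(f"n needed at 99% to keep the 95% width: {n_needed}")
```

Because the width scales with 1/√n, keeping the width fixed while raising the confidence level requires the sample size to grow with the square of the ratio of the two critical values.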

Module 1 Key Learning

• Descriptive and Inferential Statistics
• Estimation of population parameters
• Hypothesis Testing
• Inferential Error
• Key definitions:
  • Population vs. Sample
  • Point vs. Interval Estimation
  • Null Hypothesis vs. Alternative Hypothesis
  • Inferential Error

Module 2 : - The Decision Process
- The Central Limit Theorem
Duration ~ TBD hrs
Objective:
• Understand the Concept of Central Limit Theorem

Know : Statistical Theory
• The Decision-Making Process
• Central Limit Theorem
• t-Distribution
• Module 2 Key Learning

How : JMP Application
• Numerical Simulation & Examples using JMP

FAQ
#1 : To collect input from participants.

Module 2

The Decision-Making Process
Central Limit Theorem
Numerical simulations & examples
t-Distribution
Module 2 Key Learning

[Figure: the decision-making process. Data → Descriptive Statistics (STATS 1) → Information → Inferential Statistics (STATS 2) → Knowledge → Statistical Decision. The Statistical Decision is combined with an Engineering Constraint or Consideration to reach a Pragmatic Decision.]

Predicting Unknown History
[Figure: SAMPLE (Descriptive) → "Assume" → POPULATION (Inferential): ASSEMBLED UNITS]
Example: A lot on hold at OQC. The lot is already produced.

Descriptive Statistics mainly refers to sample data.
Will you obtain the same results if you re-sample from the same population?

Predicting Unknown Future
[Figure: SAMPLE (Descriptive) → "Assume" → POPULATION (Inferential): UNITS GOING TO BE ASSEMBLED]
Example: Process setup. Using data from 30 setup units (sample) to predict the next 100K.

From the Central Limit Theorem (CLT):

Let $x_1, x_2, \dots, x_n$ be a random sample of size n. The $x_i$ values represent a series of independent and identically distributed random variables, drawn from a population with mean $\mu_X$ and finite variance $\sigma_X^2$.

The CLT provides useful information about the distribution of the sample average. In fact, the theorem demonstrates that the distribution of $\bar{X}$ approaches normality regardless of the shape of the distribution of the individual $X_i$.

Two cases can be identified:

1. $X \sim N(\mu_X, \sigma_X^2) \;\rightarrow\; \bar{X} \sim N(\mu_{\bar{X}} = \mu_X,\ \sigma_{\bar{X}}^2 = \sigma_X^2 / n)$ (exactly, for any n)
2. $X \sim$ whatever (unknown) distribution with parameters $\mu_X, \sigma_X^2 \;\rightarrow\; \bar{X} \sim N(\mu_{\bar{X}} = \mu_X,\ \sigma_{\bar{X}}^2 = \sigma_X^2 / n)$ approximately, when $n \to \infty$ (typically, when n > 30)
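
Case 2 can be checked with a small simulation outside JMP. The sketch below is an illustration only: it assumes a skewed exponential population with mean 1 and standard deviation 1 (a choice made here, not one of the course populations), draws many samples of size n, and compares the mean and standard deviation of the sample averages with μ and σ/√n.

```python
import random
import statistics

random.seed(0)  # fixed seed so that the run is repeatable

POP_MEAN = 1.0   # mean of the assumed exponential population
POP_STDEV = 1.0  # standard deviation of the same population


def sample_means(n, groups=2000):
    """Averages of `groups` random samples of size n from the exponential population."""
    return [
        statistics.fmean([random.expovariate(1.0) for _ in range(n)])
        for _ in range(groups)
    ]


results = {}
for n in (5, 30):
    means = sample_means(n)
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    expected_stdev = POP_STDEV / n ** 0.5
    print(f"n={n}: mean of averages={results[n][0]:.3f}, "
          f"stdev of averages={results[n][1]:.3f}, sigma/sqrt(n)={expected_stdev:.3f}")
```

Plotting a histogram of `means` for each n would show the shape becoming more symmetric and bell-like as n grows, even though the population itself is strongly skewed.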

Exercise #1

Exercise File: M2.0 Central Limit Theorem.xls

Trainer will guide participants in performing the exercise.

Exercise 1.1
1. Click on Sheet Pop 1.
2. Copy Pop 1 data into JMP table.
3. Establish Data Distribution for Pop 1.

Exercise 1.2
4. Create Sampling Distribution for Mean & Variance for sample size, n=5.
5. Copy into JMP table.
6. Repeat Step 4 and 5 for n = 10 , 15 , 20 , 25 & 30.

Exercise 1.3
7. Establish Data Distribution for Sampling Distribution of Mean (for all n).


Exercise 1.1 (details)

Exercise File: M2.0 Central Limit Theorem.xls

1. Click on Sheet Pop 1.
2. Copy the Population data column of the Pop 1 sheet into a JMP table (Edit > Paste With Column Names).
3. Establish Data Distribution for Pop 1.
   3.1 Click on Analyze > Distribution.
       Cast the Population column into Y.
       Make sure it is continuous data.



Exercise 1.2 (details)

Exercise File: M2.0 Central Limit Theorem.xls

4. Create Sampling Distribution for Mean & Variance for sample size, n=5.
5. Copy into JMP table.
6. Repeat Step 4 and 5 for n = 10, 15, 20, 25 & 30.
   6.1 Make 1000 groups of data of size 5

   6.2 Copy Sample Mean and Sample Variance to the JMP data table

   6.3 Draw 1000 groups of data with size n=10, copy to JMP

       Sample Mean   Sample Variance
       4.26          21.94
       3.91          38.86
       7.15          31.04
       5.00          6.48
       5.95          20.37
       3.04          15.32
       2.14          26.31
       4.50          22.12
       7.15          32.26
       5.44          24.84

   6.4 Do the same for n=15, 20, 25 & 30

Exercise 1.3 (details)

Exercise File: M2.0 Central Limit Theorem.xls

7. Establish Data Distribution for Sampling Distribution of Mean (for all n).
   Go to Analyze > Distribution, select all Mean columns

[Figure: Population distribution and Sampling Distribution of Mean for n = 5, 10, 15, 20, 25 and 30]

Did you notice the following about the Sampling Distribution of Mean?

1. It is normally distributed.
2. As the sample size increases, the Stdev decreases.
3. The mean is close to the population mean: $N(\mu, \sigma/\sqrt{n})$
4. Your Stdev values are similar to mine.

Everyone has a unique set of numbers, but we all converged to similar results!

[Figure: Population distribution and Sampling Distribution of Mean]

$N(\mu, \sigma/\sqrt{n})$

n = 5: $N(5.02,\ 4.99/\sqrt{5}) \Rightarrow N(5.02, 2.24)$

n = 25: $N(5.02,\ 4.99/\sqrt{25}) \Rightarrow N(5.02, 1.00)$

Exercise #2

Exercise File: M2.0 Central Limit Theorem.xls

Trainer will guide participants in performing the exercise.

Exercise
1. Click on Sheet Pop 2.
2. Copy Pop 2 data into JMP table.
3. Establish Data Distribution for Pop 2.

Exercise (cont’d)
4. Create Sampling Distribution for Mean & Variance for sample size, n=5.
5. Copy into JMP table.
6. Repeat Step 4 and 5 for n = 15 & 30.

Exercise (cont’d)
7. Establish Data Distribution for Sampling Distribution of Mean (for all n).

[Figure: Population distribution and Sampling Distribution of Mean for Pop 2, comparing the Calculated Answer with the Simulated Answer]

Calculated Answer, $N(\mu, \sigma/\sqrt{n})$:
n = 5:  N(4.00, 0.26)
n = 10: N(4.00, 0.18)
n = 15: N(4.00, 0.15)
n = 20: N(4.00, 0.13)
n = 25: N(4.00, 0.12)
n = 30: N(4.00, 0.11)

Exercise #3

Exercise File: M2.0 Central Limit Theorem.xls

Trainer will guide participants in performing the exercise.

Exercise
1. Click on Sheet Pop 3.
2. Copy Pop 3 data into JMP table.
3. Establish Data Distribution for Pop 3.

Exercise (cont’d)
4. Create Sampling Distribution for Mean & Variance for sample size, n=5.
5. Copy into JMP table.
6. Repeat Step 4 and 5 for n = 15 & 30.

Exercise (cont’d)
7. Establish Data Distribution for Sampling Distribution of Mean (for all n).

[Figure: Population distribution and Sampling Distribution of Mean for Pop 3, comparing the Calculated Answer with the Simulated Answer]

Calculated Answer, $N(\mu, \sigma/\sqrt{n})$:
n = 5:  N(6.06, 1.92)
n = 10: N(6.06, 1.36)
n = 15: N(6.06, 1.11)
n = 20: N(6.06, 0.96)
n = 25: N(6.06, 0.86)
n = 30: N(6.06, 0.78)

[Figure: Any Population Distribution → Sampling Distribution of Mean (spread $\sigma/\sqrt{n}$) and Sampling Distribution of Variance]

[Figure: Population distribution and Sampling Distribution of Variance for n = 5, 10, 15, 20, 25 and 30]

The Sampling Distribution of Variance follows a Chi-Square distribution (for a normal population, after scaling: $(n-1)s^2/\sigma^2$).

Another usage of Variance and why it is important to study.

[Figure: Any Population Distribution → Sampling Distribution of Mean (spread $\sigma/\sqrt{n}$) and Sampling Distribution of Variance]

From here onwards, we will focus on inference about the Population Mean as the example.
(The concept of how we infer about the Population Variance works the same way.)

PROVING
[Figure: Any Population Distribution → Sampling Distribution of Mean, $N(\mu, \sigma/\sqrt{n})$]

PRACTICAL APPLICATION
Ex. Infer about the Population Mean using only 1 sample of size n


However, most likely we will not know the value of the population stdev.
As such, we cannot form the Sampling Distribution for Mean in this situation.

Sample with size n:
• Mean: $\bar{x}$
• Stdev: s

Sampling Distribution of Mean:
• If σ known: $N(\mu, \sigma/\sqrt{n})$
• If σ unknown: $t(\mu, s/\sqrt{n})$. We replace σ with s. This results in a distribution that follows a t-distribution. Real cases mainly belong to this group.

Ex: Inference About Population Mean
• Confidence Interval
• Hypothesis Testing

Example:
• Machine setup.
• Produce n setup units.
• Sample Mean, $\bar{x}$.
• Sample Stdev, s.

From the setup you did, you want to infer about the population (as in production).
Then you decide whether you need to re-setup or release the machine for production.
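
The effect of replacing σ with s can be illustrated with a small calculation. The sketch below is an illustration only: the ten setup measurements are hypothetical, and the t critical value for 9 degrees of freedom is hard-coded from a standard t-table because the Python standard library has no t quantile function. It compares a 95% interval built with the Normal critical value against one built with the t critical value.

```python
import math
import statistics

# Hypothetical setup measurements (n = 10); not data from the course.
setup_units = [10.2, 9.8, 10.1, 10.4, 9.9, 10.0, 10.3, 9.7, 10.1, 10.0]

n = len(setup_units)
x_bar = statistics.fmean(setup_units)   # sample mean
s = statistics.stdev(setup_units)       # sample stdev (n - 1 divisor)
std_error = s / math.sqrt(n)

Z_975 = statistics.NormalDist().inv_cdf(0.975)  # about 1.960
T_975_DF9 = 2.262  # standard t-table value, two-sided 95%, 9 degrees of freedom

z_interval = (x_bar - Z_975 * std_error, x_bar + Z_975 * std_error)
t_interval = (x_bar - T_975_DF9 * std_error, x_bar + T_975_DF9 * std_error)

print(f"mean={x_bar:.3f}, s={s:.3f}")
print(f"Normal-based 95% interval: ({z_interval[0]:.3f}, {z_interval[1]:.3f})")
print(f"t-based 95% interval:      ({t_interval[0]:.3f}, {t_interval[1]:.3f})")
```

The t-based interval is wider, which reflects the extra uncertainty introduced by estimating the standard deviation from a small sample.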

[Figure: t density functions compared with the Normal density. Ex: Dof = 10, Ex: Dof = 60]

Main features of the Normal Density Function:
• It is symmetric
• It is unimodal
• It is bell-shaped

Main features of the t Density Function:
• It is symmetric
• It is unimodal
• It is bell-shaped
• It has thicker tails compared to the Normal distribution.
• t → Normal as dof (n-1) increases. t ≈ Normal when dof >> 60.
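
The features listed above can be verified directly from the two density formulas. The following sketch is an illustration only; the helper names `normal_pdf` and `t_pdf` are introduced here. It evaluates the standard Normal density and the Student t density in the tail (x = 3) and at the center for several degrees of freedom.

```python
import math


def normal_pdf(x):
    """Density of the standard Normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2 * math.pi)


def t_pdf(x, dof):
    """Density of the Student t distribution with `dof` degrees of freedom."""
    log_coef = (math.lgamma((dof + 1) / 2) - math.lgamma(dof / 2)
                - 0.5 * math.log(dof * math.pi))
    return math.exp(log_coef - (dof + 1) / 2 * math.log1p(x * x / dof))


for dof in (10, 60, 1000):
    print(f"dof={dof}: t density at 3 = {t_pdf(3.0, dof):.5f}, "
          f"Normal density at 3 = {normal_pdf(3.0):.5f}")
```

The tail density of the t distribution is larger than the Normal one and shrinks toward it as the degrees of freedom grow, which is the "thicker tail" and "t → Normal" behavior described in the list.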

The "σ known" situation can be used when the sample size is large enough.
Reason: t ➔ Normal. In other words, it is as if we knew the Pop Stdev.

[Figure: Sample with size n (Mean: $\bar{x}$, Stdev: s) → Sampling Distribution of Mean ($N(\mu, \sigma/\sqrt{n})$ if σ known; $t(\mu, s/\sqrt{n})$ if σ unknown) → Inference About Population Mean (Ex: Confidence Interval, Ex: Hypothesis Testing)]

To understand this concept, let’s look at how it is applied and go through practical examples.

Let’s start with Confidence Interval.

Module 2 Key Learning

• Understanding the concept behind the Central Limit Theorem
• Understanding the effect of sample size modification.
• What the t-distribution is.

Module 3 : Confidence Interval
Duration ~ TBD hrs
Objective:
• Know the main properties of point estimators.
• Form Confidence Intervals for the population parameters.

Know : Statistical Theory
• Estimation
• Point Estimation
• Properties of Point Estimators
• Interval Estimation
• Parameters of 1 Normal Population
• Parameters of 2 Normal Populations

How : JMP Application
• Confidence Interval for 1 Normal Population
• Frequency Interpretation
• Practical Exercises of using Confidence Interval
• Effect of Sample Size on Confidence Interval Width
• 1 Sample Proportion
• Confidence Interval for 2 Normal Populations
• Two Independent With Equal Variance
• Two Independent With Unequal Variance
• Two Dependent Samples
• Two Sample Proportion

FAQ
#1 : To collect input from participants.

Module 3

Estimation
Point Estimation
Properties of Point Estimators
Interval estimation
Parameters of 1 normal population
Parameters of 2 normal populations
Module 3 Key Learning

GOAL
Estimation of unknown population parameters (e.g. “the population mean weight μ”) using the available sample data

Relevant terms:
o ESTIMATOR → sample statistic used to estimate the unknown population parameter (e.g. average of sample weights $\bar{X}$)
o ESTIMATE → value of the estimator calculated on sample data (e.g. “120 pounds”)

A point estimator of an unknown population parameter is
o a random variable that depends on sample information . . .
o whose value provides an approximation to this unknown parameter

A specific value of that random variable is called a point estimate

Despite its conceptual simplicity, a disadvantage of point estimation is that it does not allow us to assess the precision of the results using the language of probability, i.e. point estimators do not tell us the probability that the estimated value is really close to (or far from) the true unknown value

Notation:
• The unknown parameters to estimate are indicated by Greek letters (e.g. μ, σ, θ, ρ, λ …)
• The estimators are indicated by Latin letters (e.g. $\bar{X}$, s, r …)
• The symbol ^ (hat) is used to indicate estimation (e.g. $\hat{\mu} = \bar{X}$, read “the point estimator of μ is the sample average $\bar{X}$”)

Usually, the unknown parameters that we need to estimate are:
• The population mean (μ)
• The population variance (σ²) or the population standard deviation (σ)
• The population proportion (θ)

Or, we might be interested in estimating other unknown population parameters like:
• The coefficients of a Regression Model (the β’s)
• The correlation coefficient (ρ)
• Other...

Point Estimation

EXAMPLE
Let µ be an unknown parameter that we want to estimate. Here, µ is the mean lifetime of a certain type of battery.

A random sample of n = 30 batteries might yield the following observed lifetimes (hours):
x1 = 6.1, x2 = 5.3, …, x30 = 5.9

The computed value of the sample average lifetime is:

$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{1}{30}\sum_{i=1}^{30} X_i = \frac{1}{30}(6.1 + 5.3 + \cdots + 5.9) = 5.77$

Based on the available sample information, it is reasonable to regard 5.77 as a very plausible value of µ or our “best guess”. The value 5.77 is a point estimate of the unknown population mean.
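
The distinction between the estimator (the rule) and the estimate (the number) can be shown in a few lines. The sketch below is an illustration only: the five lifetimes are hypothetical stand-ins, since the slide lists only three of the 30 observations, and `sample_average` is a name introduced here.

```python
def sample_average(values):
    """The estimator: a rule that turns any sample into a single number."""
    return sum(values) / len(values)


# Hypothetical battery lifetimes in hours; NOT the 30 values from the slide.
lifetimes_hours = [6.1, 5.3, 5.9, 6.0, 5.6]

# The estimate: the value the estimator takes on this particular sample.
point_estimate = sample_average(lifetimes_hours)
print(f"Point estimate of the mean lifetime: {point_estimate:.2f} hours")
```

A different sample would give a different estimate from the same estimator, which is why the estimator is treated as a random variable.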

Point Estimator Properties

Point estimators can be classified according to some desirable properties.

For example, estimators can be:

• Unbiased.
• Consistent.
• Most efficient.

NOTE
For more details, see also the Manual of Statistical Methodology (8482919 ver.2).

Unbiasedness

• A point estimator $\hat{\theta}$ is said to be an unbiased estimator of the parameter θ if $E(\hat{\theta})$, the expected value, or mean, of the sampling distribution of $\hat{\theta}$, is θ.

$\hat{\theta}$ is an unbiased estimator of θ if: $E(\hat{\theta}) = \theta$

• Examples:
  • The sample mean is an unbiased estimator of μ
  • The sample variance (computed with the n − 1 divisor) is an unbiased estimator of σ²
  • The sample proportion is an unbiased estimator of π
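
Unbiasedness can be checked empirically by averaging an estimator over many samples. The sketch below is an illustration only, assuming a Normal population with a hypothetical σ = 2 (so σ² = 4). It compares the sample variance with the n − 1 divisor against the version with the n divisor.

```python
import random
import statistics

random.seed(1)  # fixed seed so that the run is repeatable

SIGMA = 2.0   # hypothetical population stdev, so sigma^2 = 4
N = 5         # sample size
REPS = 20000  # number of simulated samples

unbiased, biased = [], []
for _ in range(REPS):
    sample = [random.gauss(0.0, SIGMA) for _ in range(N)]
    unbiased.append(statistics.variance(sample))   # n - 1 divisor
    biased.append(statistics.pvariance(sample))    # n divisor

mean_unbiased = statistics.fmean(unbiased)
mean_biased = statistics.fmean(biased)
print(f"average of s^2 with n-1 divisor: {mean_unbiased:.3f} (sigma^2 = {SIGMA ** 2})")
print(f"average of s^2 with n divisor:   {mean_biased:.3f}")
```

The first average settles near σ², while the second settles near (n − 1)/n · σ², so the n-divisor version underestimates the variance on average.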

Expected Value

Expected value, or mean, of the sampling distribution of $\hat{\theta}$:
1. The unknown population parameter that we want to estimate is θ
2. The population contains a very large (infinite) number of items
3. From that population, a very large (infinite) number (k) of samples is drawn
4. From each sample the estimate of θ is calculated
5. The expected value of the sampling distribution of $\hat{\theta}$ is the average of all the estimates calculated in step 4

[Figure: Population → Sample 1, Sample 2, ..., Sample k → $\hat{\theta}_1, \hat{\theta}_2, \dots, \hat{\theta}_k$]

The average of all the $\hat{\theta}_i$ (i = 1, …, k), k → ∞, is the Expected Value (or Mean) of the sampling distribution of $\hat{\theta}$ and it is indicated by $E(\hat{\theta})$

Unbiasedness

[Figure: densities $f(\hat{\theta})$ of the sampling distributions of $\hat{\theta}_1$ and $\hat{\theta}_2$. $E(\hat{\theta}_1) = \theta$, while $E(\hat{\theta}_2)$ lies to the right of θ; the gap is the bias in $\hat{\theta}_2$.]

For any estimator of θ, say $\hat{\theta}$, let’s define bias = $E(\hat{\theta}) - \theta$.

Since bias($\hat{\theta}_1$) = 0, $\hat{\theta}_1$ is an unbiased estimator of θ.
Conversely, since bias($\hat{\theta}_2$) > 0, $\hat{\theta}_2$ is a biased estimator of θ.

Bias

• Let $\hat{\theta}$ be an estimator of θ

• The bias in $\hat{\theta}$ is defined as the difference between its mean and θ

$\mathrm{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta$

• The bias of an unbiased estimator is 0 by definition.

Consistency

• Let $\hat{\theta}$ be an estimator of θ

• $\hat{\theta}$ is a consistent estimator of θ if the difference between the expected value of $\hat{\theta}$ and θ (i.e. the bias) decreases toward zero as the sample size increases

• Consistency is desired when unbiased estimators cannot be obtained
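
As a worked illustration of a bias that vanishes with sample size, consider the variance estimator that divides by n instead of n − 1. The symbol $S_n^2$ is introduced here for that estimator and is not part of the course notation.

```latex
S_n^2 = \frac{1}{n}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2,
\qquad
E\!\left(S_n^2\right) = \frac{n-1}{n}\,\sigma^2,
\qquad
\mathrm{Bias}\!\left(S_n^2\right) = E\!\left(S_n^2\right) - \sigma^2 = -\frac{\sigma^2}{n}
\;\longrightarrow\; 0 \quad \text{as } n \to \infty
```

For example, with n = 5 the estimator falls short of σ² by 20% on average, while with n = 100 the shortfall is only 1%.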

Maximum Efficiency

Suppose there are several unbiased estimators of θ

Definition:
The most efficient estimator, or the minimum variance unbiased estimator, of θ is the unbiased estimator with the smallest variance (a measure of the amount of dispersion away from the estimate. In other words, the estimator that varies least from sample to sample).

This generally depends on the distribution of the population.

For example, the mean is more efficient than the median for the normal distribution but not necessarily for “skewed” (asymmetrical) distributions.

Maximum Efficiency

Let $\hat{\theta}_1$ and $\hat{\theta}_2$ be two unbiased estimators of θ.

Then,

• $\hat{\theta}_1$ is said to be more efficient than $\hat{\theta}_2$ if:

$\mathrm{Var}(\hat{\theta}_1) < \mathrm{Var}(\hat{\theta}_2)$

• The relative efficiency of $\hat{\theta}_1$ with respect to $\hat{\theta}_2$ is the ratio of their variances:

$\text{Relative Efficiency} = \frac{\mathrm{Var}(\hat{\theta}_1)}{\mathrm{Var}(\hat{\theta}_2)}$
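
The ratio above can be estimated by simulation. The sketch below is an illustration only, assuming a standard Normal population and a sample size of 25 (both chosen here): it takes the sample mean as the first estimator and the sample median as the second, and estimates the variance of each over many samples.

```python
import random
import statistics

random.seed(2)  # fixed seed so that the run is repeatable

N = 25       # sample size (assumption for this sketch)
REPS = 4000  # number of simulated samples

means, medians = [], []
for _ in range(REPS):
    sample = [random.gauss(0.0, 1.0) for _ in range(N)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

var_mean = statistics.variance(means)
var_median = statistics.variance(medians)
relative_efficiency = var_mean / var_median  # Var(mean) / Var(median), as in the formula above

print(f"Var(mean)   = {var_mean:.4f}")
print(f"Var(median) = {var_median:.4f}")
print(f"Relative efficiency = {relative_efficiency:.2f}")
```

A ratio below 1 means the sample mean varies less from sample to sample than the median, so it is the more efficient of the two for a Normal population.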

Interval estimation of unknown population parameters (one normal population)

• How much uncertainty is associated with a point estimate of a population parameter? We cannot know, but …

• … instead of a point estimate, Statistics helps us determine the limits of an interval which is expected to contain the unknown parameter with a certain probability. This probability is called the confidence level.

• An interval estimate provides more information about a population characteristic than does a point estimate

Interval Estimation

Interval estimation is a methodology used to evaluate an unknown parameter (for example, a population mean) by computing an interval within which the unknown parameter is most likely to be located.

Intervals are commonly chosen such that the unknown parameter falls within them with a certain (percent) probability. This probability, determined a priori, is typically set between 90% and 99% and is called the confidence level. Hence, the intervals are called confidence intervals; the end points of such an interval are called the upper and lower confidence limits.

72
Interval Estimation

EXAMPLE

Consider again the batteries lifetime example.

Instead of calculating a point estimate of the mean lifetime of the batteries, we now want to go further and form a confidence interval, at the 95% confidence level, for the unknown mean lifetime of the batteries.

Using the methods presented in the next Modules, it will be possible to form such an interval. For example, a possible outcome is the interval (5.23, 6.31).

The values 5.23 and 6.31 are the lower and upper confidence limits, respectively.

It is reasonable to state that, with a confidence equal to 95%, the true unknown mean of the batteries lifetime is contained in the interval from 5.23 to 6.31.
The interval containing a population parameter is established by calculating a statistic from values measured on a random sample taken from the population, and by applying knowledge of how faithfully the properties of a sample represent those of the entire population.

The confidence level tells what percentage of the time intervals built in this way will be correct, but not the chance that the interval obtained from any given sample is correct.

Of the intervals computed from many samples, a certain percentage will contain the true value of the parameter being sought.
An interval gives a range of values:

❑ Takes into consideration variation in statistics from sample to sample
❑ Based on observations from 1 sample
❑ Gives information about closeness to unknown population parameters
❑ Stated in terms of level of confidence
▪ Can never be 100% confident
Let $\theta$ be an unknown population parameter that we want to estimate. If $P(a < \theta < b) = 1 - \alpha$, then the interval from a to b is called a $100(1 - \alpha)\%$ confidence interval of $\theta$.

The quantity $(1 - \alpha)$ is called the confidence level of the interval ($0 < \alpha < 1$).

Acceptable values for α are 0.01 ≤ α ≤ 0.10

Acceptable confidence levels are between 0.90 and 0.99

o In repeated samples, the true value of the parameter $\theta$ would be contained in $100(1 - \alpha)\%$ of the intervals.
o The confidence interval calculated in this manner is written as $a < \theta < b$ with $100(1 - \alpha)\%$ confidence
Frequency Interpretation

❑ Suppose confidence level = 95%

❑ Also written $(1 - \alpha) = 0.95$

❑ A relative frequency interpretation:
▪ From repeated samples, 95% of all the confidence intervals that can be constructed will contain the unknown true parameter

❑ A specific interval either will contain or will not contain the true parameter
▪ No probability involved in a specific interval
Frequency Interpretation

[Figure: Sampling Distribution of the Mean, centered at $\mu_{\bar{X}} = \mu$, with central area $1-\alpha$ and area $\alpha/2$ in each tail; intervals are drawn around sample means $\bar{X}_1$ and $\bar{X}_2$.]

In repeated samples, $100(1-\alpha)\%$ of intervals contain μ; $100\alpha\%$ do not.
Confidence Intervals

Confidence intervals for parameters of one normal population:

• Population Mean
▪ σ² Known
▪ σ² Unknown
• Population Variance
Confidence Intervals

The general formula for the confidence intervals for the population mean under the normality assumption is:

Point Estimate ± (Reliability Factor) × (Standard Error)

❑ The value of the reliability factor depends on the desired level of confidence
❑ The standard error is equal to $\sigma/\sqrt{n}$, where σ is the population std. dev. and n is the sample size
Confidence Interval for μ (σ² known)

- Assumptions
• Population is normally distributed (if population is not normal, use large sample)

- Context
• Population variance σ² is known

- Confidence interval estimate:

$$\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Where,
▪ $\bar{X}$: point estimator of the population mean, it is the sample average
▪ $Z_{\alpha/2}$: percentage point of the N(0,1) distribution such that $P(Z \ge Z_{\alpha/2}) = \alpha/2$
▪ σ: population standard deviation (known)
▪ n: sample size

For more references on method see also the Manual of Statistical Methodology, Ch. 6 and Annex 4 (DMS 8482919_A)
Confidence Interval for μ (σ² known)

The confidence interval can be written also as follows:

Point Estimate ± (Reliability Factor) × (Standard Error)

$$\bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

or as

$$\bar{X} \pm ME$$

where ME is called the Margin of Error, $ME = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$

The interval width, W, is equal to twice the margin of error: W = 2 × ME
Reducing the Margin of Error

$$ME = Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Q. Why is it desirable to reduce the ME?

A. Because the width of the interval, W = 2 × ME, is inversely related to the precision of an interval estimator: the narrower the interval, the more precise the estimate.

The margin of error can be reduced if:

❑ The population standard deviation, σ, can be reduced

❑ The sample size, n, is increased

❑ The confidence level, (1 − α), is decreased

NOTE
Having fixed a value for W and a confidence level, one can solve the equation $W = 2 Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$ for n, which gives $n \ge \left(\frac{2 Z_{\alpha/2}\,\sigma}{W}\right)^2$, to obtain the minimum sample size required to guarantee a confidence (1 − α) and a precision W
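The sample-size note can be sketched in a few lines of Python. This is a minimal illustration, not a procedure from the Manual: the function names `margin_of_error` and `min_sample_size`, the value σ = 0.35 and the target width W = 0.20 are assumptions chosen for the example.

```python
import math
from statistics import NormalDist

def margin_of_error(sigma, n, confidence=0.95):
    """ME = Z_{alpha/2} * sigma / sqrt(n), for a known population sigma."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * sigma / math.sqrt(n)

def min_sample_size(sigma, width, confidence=0.95):
    """Smallest n for which the interval width W = 2*ME does not exceed `width`."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return math.ceil((2.0 * z * sigma / width) ** 2)

sigma = 0.35  # illustrative known standard deviation (assumption)
n_required = min_sample_size(sigma, width=0.20, confidence=0.95)
print("minimum n:", n_required)

# The margin of error shrinks as n grows and as the confidence level drops.
for n in (11, 30, 100):
    print(n, round(margin_of_error(sigma, n, 0.90), 3),
          round(margin_of_error(sigma, n, 0.95), 3),
          round(margin_of_error(sigma, n, 0.99), 3))
```

The result is rounded up because n must be an integer and the width requirement has to hold at the chosen n.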
Reducing the Margin of Error

Confidence Level   α      α/2 (each tail)   Confidence Interval   Comment
90%                0.10   0.05              3.18 to 6.03          Narrower CI. More precise. Less likely to include the true population mean.
95%                0.05   0.025             2.89 to 6.31
99%                0.01   0.005             2.30 to 6.91          Wider CI. Less precise. More likely to include the true population mean.
Finding the Reliability Factor

Consider a 95% confidence interval:

[Figure: standard normal curve with central area 1 − α = 0.95 and α/2 = 0.025 in each tail. Z units: −z = −1.96, 0, z = 1.96. X units: Lower Confidence Limit, Point Estimate ($\bar{X}$), Upper Confidence Limit. Width = W.]

$z_{0.025} = 1.96$ from the standard normal distribution table
The standard normal distribution table

[Figure: reading the standard normal table for a 95% interval (1 − α = 0.95, α/2 = 0.025 in each tail). Row −1.9 and column .06 give the area .0250, so the lower critical value is z = −1.96 and, by symmetry, the upper one is z = 1.96.]
Common Levels of Confidence

Commonly used confidence levels are 90%, 95%, and 99%

Confidence Level   Confidence Coefficient, 1 − α   $Z_{\alpha/2}$ value
90%                0.90                            1.645
95%                0.95                            1.96
99%                0.99                            2.576
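The tabulated reliability factors can also be obtained without a printed table. A minimal sketch using Python's standard library follows; the helper name `reliability_factor` is an assumption made for this illustration.

```python
from statistics import NormalDist

def reliability_factor(confidence):
    """Z_{alpha/2}: the N(0,1) point that leaves an area alpha/2 in the upper tail."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

z_values = {cl: round(reliability_factor(cl), 3) for cl in (0.90, 0.95, 0.99)}
for cl, z in z_values.items():
    print(f"{cl:.0%}: Z = {z}")
```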
Advanced Explanation

How did we arrive at the previous formula for the confidence interval?
We need to focus on the following:

1. The sampling distribution of the sample average. If $X \sim N(\mu_X, \sigma_X^2)$, then

$$\bar{X} \sim N\!\left(\mu_{\bar{X}} = \mu_X,\ \sigma_{\bar{X}}^2 = \frac{\sigma_X^2}{n}\right)$$

In words, if X is a normally distributed random variable with parameters $\mu_X$ and $\sigma_X^2$, then $\bar{X}$, the sample average, is also a normal RV with parameters $\mu_{\bar{X}} = \mu_X$ and $\sigma_{\bar{X}}^2 = \sigma_X^2/n$, where n is the sample size. If X is NOT normally distributed, the Central Limit Theorem (CLT) tells us that, for large values of n (as a rule of thumb, n > 30), the previous distributional property still holds approximately (statisticians say that this property holds "asymptotically").

2. The transformation called "standardization" of a normal random variable.

For any normal RV X, $X \sim N(\mu_X, \sigma_X^2) \Rightarrow Z = \frac{X - \mu_X}{\sigma_X} \sim N(\mu_Z = 0, \sigma_Z = 1)$, i.e. Z is a standard normal RV.
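The first point can be checked with a short simulation. The sketch below is illustrative only: the uniform population on [0, 1], the sample size of 30 and the number of repetitions are assumptions chosen for this example, not data from the course.

```python
import math
import random
import statistics

rng = random.Random(7)
n = 30         # sample size
trials = 5000  # number of repeated samples

# Non-normal population: uniform on [0, 1], with mean 0.5 and variance 1/12.
pop_mean = 0.5
pop_sigma = math.sqrt(1.0 / 12.0)

sample_means = [
    statistics.fmean(rng.random() for _ in range(n))
    for _ in range(trials)
]

observed_mean = statistics.fmean(sample_means)
observed_se = statistics.stdev(sample_means)
expected_se = pop_sigma / math.sqrt(n)  # sigma_X / sqrt(n)

print(f"mean of sample means: {observed_mean:.4f} (population mean {pop_mean})")
print(f"std of sample means:  {observed_se:.4f} (sigma/sqrt(n) = {expected_se:.4f})")
```

Even though the population is not normal, the sample means cluster around the population mean with a spread close to $\sigma_X/\sqrt{n}$.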
Advanced Explanation

To form a confidence interval for the population mean $\mu_X$ of a normal RV with known variance $\sigma_X^2$, we need 2 critical values, say $CV_1$ and $CV_2$, that include $100(1-\alpha)\%$ of the population (thus leaving in each tail an area equal to $\alpha/2$).

[Figure: normal curve centered at $\mu_X$ with central area $(1-\alpha)$ between $CV_1$ and $CV_2$ and area $\alpha/2$ in each tail.]

Due to the symmetry of the normal distribution, it can be noticed that:

$$CV_2 - \mu_X = \mu_X - CV_1$$
Advanced Explanation

We standardize the value of $\bar{X}$ to form the confidence interval for $\mu_X$:

$$Z_{\bar{X}} = \frac{\bar{X} - \mu_{\bar{X}}}{\sigma_{\bar{X}}} = \frac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}}$$

Note: the quantity $\sigma_X/\sqrt{n}$ is also called the "standard error of the mean".

According to the desired confidence level, we may obtain the different critical values that one can find in the table of the standard normal distribution.

Let $CV_2$, the upper critical value, be $CV_2 = Z(\alpha/2)$. Considering the symmetry of the normal distribution, $CV_1 = -CV_2 = -Z(\alpha/2)$ holds.

For example, for α = 0.05 ⇒ α/2 = 0.025, from the table of the Standard Normal Distribution we obtain $CV_1 = -1.96$ and $CV_2 = 1.96$.
Advanced Explanation

To form the confidence interval for $\mu_X$ we consider the following inequality:

$$-Z\!\left(\frac{\alpha}{2}\right) \le \frac{\bar{X} - \mu_X}{\sigma_X/\sqrt{n}} \le Z\!\left(\frac{\alpha}{2}\right)$$

And we solve it for $\mu_X$:

$$-Z\!\left(\frac{\alpha}{2}\right)\frac{\sigma_X}{\sqrt{n}} \le \bar{X} - \mu_X \le Z\!\left(\frac{\alpha}{2}\right)\frac{\sigma_X}{\sqrt{n}}$$

After solving the system of inequalities, we obtain the desired confidence interval:

$$\bar{X} - Z\!\left(\frac{\alpha}{2}\right)\frac{\sigma_X}{\sqrt{n}} \le \mu_X \le \bar{X} + Z\!\left(\frac{\alpha}{2}\right)\frac{\sigma_X}{\sqrt{n}}$$
Confidence Intervals

EXAMPLE

Confidence interval for the mean of a normal population (variance known)

• A sample of 11 circuits from a large normal population has a mean resistance of 2.26 ohms. We know from past testing that the population standard deviation is 0.35 ohms.

• The available data: {2.25, 2.48, 1.40, 2.31, 2.43, 2.47, 2.52, 2.56, 2.33, 1.48, 2.61}

• Determine and interpret a 95% confidence interval for the true mean resistance of the population.
Confidence Intervals

SOLUTION

In this case, σ² is known, so we can use:

$$\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

Where,
• $\bar{X}$ = 2.26
• $Z_{\alpha/2}$ = 1.96 (α = 0.05 → α/2 = 0.025)
• σ = 0.35
• n = 11

Confidence interval:

$$2.26 - 1.96\,\frac{0.35}{\sqrt{11}} < \mu < 2.26 + 1.96\,\frac{0.35}{\sqrt{11}}$$

$$2.05 < \mu < 2.47$$

• We are 95% confident that the true mean resistance is between 2.05 and 2.47 ohms

• Although the true mean may or may not be in this interval, 95% of intervals formed in this manner will contain the true mean

NOTE: From the statistical tables of the Standard Normal Distribution, we obtain that $Z_{\alpha/2} = 1.96$. However, the confidence interval is provided by the software and not calculated by hand.
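The calculation above can be reproduced from the raw data. This is a minimal sketch in Python rather than the course software (JMP); the variable names are chosen here for illustration.

```python
import math
from statistics import NormalDist, fmean

# Resistance data (ohms) and known population standard deviation from the example.
resistance = [2.25, 2.48, 1.40, 2.31, 2.43, 2.47, 2.52, 2.56, 2.33, 1.48, 2.61]
sigma = 0.35
confidence = 0.95

n = len(resistance)
x_bar = round(fmean(resistance), 2)  # sample average, rounded as in the example
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # Z_{alpha/2}
margin = z * sigma / math.sqrt(n)

lower, upper = x_bar - margin, x_bar + margin
print(f"{lower:.2f} < mu < {upper:.2f}")  # 2.05 < mu < 2.47
```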
Confidence Interval for μ (σ² Unknown)

• If the population standard deviation σ is unknown, we can substitute the sample standard deviation, s

• This introduces extra uncertainty, since s varies from sample to sample

• So, we use the Student's t distribution instead of the Normal distribution
Confidence Interval for μ (σ² Unknown)

• Assumptions
• Population is normally distributed (if population is not normal, use large sample)

• Context
• Population variance σ² is unknown

• Use Student's t Distribution

• Confidence Interval Estimate:

$$\bar{X} - t_{n-1,\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}$$

where $t_{n-1,\alpha/2}$ is the critical value of the t distribution with n-1 degrees of freedom and an area of α/2 in each tail:

$$P(t_{n-1} > t_{n-1,\alpha/2}) = \alpha/2$$
Confidence Interval for μ (σ² Unknown)

❑ The t is a family of distributions
❑ The t value depends on degrees of freedom (d.f.)
▪ Number of observations that are free to vary after the sample mean has been calculated

d.f. = n - 1
Student's t Distribution

[Figure: densities f(Z), f(t) plotted against Z, t and centered at 0: Standard Normal (t with d.f. = ∞), t (d.f. = 13), t (d.f. = 5).]

t-distributions are bell-shaped and symmetric, but have 'fatter' tails than the normal

Note: t → Z as n (and then the d.f.) increases
Student's t Distribution

Comparison with the Z value (Z = Standard Normal, N(μ=0, σ²=1))

Confidence Level   t (10 d.f.)   t (20 d.f.)   t (30 d.f.)   t (60 d.f.)   t (120 d.f.)   Z
90%                1.812         1.725         1.697         1.671         1.658          1.645
95%                2.228         2.086         2.042         2.000         1.980          1.960
99%                3.169         2.845         2.750         2.660         2.617          2.576

Notes
• d.f. = n-1 ⇒ from the table above, we notice that as n increases the interval width decreases (and the precision increases)
• as the number of d.f. → ∞ ⇒ t → Z
• with d.f. > 60, approximately t = Z
Use of the Student's t Distribution

EXAMPLE

Confidence interval for the mean of a normal population (variance unknown)

• Consider the data from the previous example. Sample size n=11, but here the true population standard deviation is unknown and must be estimated.
• From sample data, we obtain s = 0.42
• Determine a 95% confidence interval for μ, the true population mean.
Use of the Student's t Distribution

SOLUTION

In this case, σ² is unknown, so we use:

$$\bar{X} - t_{n-1,\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}$$

where,
• $\bar{X}$ = 2.260
• $t_{10,\alpha/2}$ = 2.228 (α = 0.05 → α/2 = 0.025)
• s = 0.420
• n = 11 → n-1 = 10

$$2.260 - 2.228\,\frac{0.420}{\sqrt{11}} < \mu < 2.260 + 2.228\,\frac{0.420}{\sqrt{11}}$$

$$1.978 < \mu < 2.542$$

NOTES:
• From the Statistical Tables of the t Distribution (see next slide), we obtain that $t_{10,0.025} = 2.228$.
• The interpretation of this result is the same as for the previous example. The only difference is in the formula used for the calculation.
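The same steps can be sketched in Python. The critical value is taken from the t table rather than computed, because the standard library has no t quantile function; the rounding of the summary statistics follows the worked example, and the variable names are chosen here for illustration.

```python
import math
import statistics

# Same resistance data (ohms) as in the previous example; sigma is now unknown.
resistance = [2.25, 2.48, 1.40, 2.31, 2.43, 2.47, 2.52, 2.56, 2.33, 1.48, 2.61]

n = len(resistance)
x_bar = round(statistics.fmean(resistance), 2)  # 2.26
s = round(statistics.stdev(resistance), 2)      # 0.42, sample standard deviation
t_crit = 2.228  # t_{10, 0.025}, read from the table of critical values

margin = t_crit * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin
print(f"{lower:.3f} < mu < {upper:.3f}")  # 1.978 < mu < 2.542
```

With the same data, this interval is wider than the one obtained with σ known, partly because the t critical value (2.228) is larger than the normal one (1.96) and partly because s = 0.42 exceeds σ = 0.35.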
Use of the Student's t Distribution

Table of critical values for Student's t distributions (α = area in the upper tail)

df α = 0.1 0.05 0.025 0.01 0.005 0.001 0.0005
1 3.078 6.314 12.706 31.821 63.656 318.289 636.578
2 1.886 2.920 4.303 6.965 9.925 22.328 31.600
3 1.638 2.353 3.182 4.541 5.841 10.214 12.924
4 1.533 2.132 2.776 3.747 4.604 7.173 8.610
5 1.476 2.015 2.571 3.365 4.032 5.894 6.869
6 1.440 1.943 2.447 3.143 3.707 5.208 5.959
7 1.415 1.895 2.365 2.998 3.499 4.785 5.408
8 1.397 1.860 2.306 2.896 3.355 4.501 5.041
9 1.383 1.833 2.262 2.821 3.250 4.297 4.781
10 1.372 1.812 2.228 2.764 3.169 4.144 4.587
11 1.363 1.796 2.201 2.718 3.106 4.025 4.437
12 1.356 1.782 2.179 2.681 3.055 3.930 4.318
13 1.350 1.771 2.160 2.650 3.012 3.852 4.221
14 1.345 1.761 2.145 2.624 2.977 3.787 4.140
15 1.341 1.753 2.131 2.602 2.947 3.733 4.073
16 1.337 1.746 2.120 2.583 2.921 3.686 4.015
17 1.333 1.740 2.110 2.567 2.898 3.646 3.965
18 1.330 1.734 2.101 2.552 2.878 3.610 3.922
19 1.328 1.729 2.093 2.539 2.861 3.579 3.883
20 1.325 1.725 2.086 2.528 2.845 3.552 3.850
21 1.323 1.721 2.080 2.518 2.831 3.527 3.819
22 1.321 1.717 2.074 2.508 2.819 3.505 3.792
23 1.319 1.714 2.069 2.500 2.807 3.485 3.768
24 1.318 1.711 2.064 2.492 2.797 3.467 3.745
25 1.316 1.708 2.060 2.485 2.787 3.450 3.725
26 1.315 1.706 2.056 2.479 2.779 3.435 3.707
27 1.314 1.703 2.052 2.473 2.771 3.421 3.689
28 1.313 1.701 2.048 2.467 2.763 3.408 3.674
29 1.311 1.699 2.045 2.462 2.756 3.396 3.660
30 1.310 1.697 2.042 2.457 2.750 3.385 3.646
60 1.296 1.671 2.000 2.390 2.660 3.232 3.460
120 1.289 1.658 1.980 2.358 2.617 3.160 3.373
∞ 1.282 1.645 1.960 2.326 2.576 3.091 3.291

Confidence Interval

Sample with size n (Average: $\bar{x}$, Stdev: s) → Sampling Distribution of Mean → Confidence Interval → Inference About Population Mean

• If σ known: the sampling distribution of the mean is normal, with mean μ and standard error $\sigma/\sqrt{n}$
• If σ unknown: the sampling distribution of the mean follows a t distribution, with center μ and standard error $s/\sqrt{n}$

From here onwards, we will use the condition of "σ unknown".

We will use a practical example to understand the application of the CLT in Confidence Intervals.
Exercise #4

Exercise File:

The trainer will show how to use the "Transpose" feature in JMP. The following steps use this set of data to explain how the Confidence Interval is calculated.

1. Open the file. The data is in row format. This is a sample data (e.g. machine setup data, etc.)

2. Transpose into column format
Go to Tables > Transpose. Select all columns and cast to Transpose Columns, click Ok. The data is now in column format.

3. Make a distribution of the sample data
Go to Analyze > Distribution. Cast Row column into Y, then hit Ok. JMP generates the CI.

To generate the confidence interval for both Mean and Stdev:
Go to distribution hotspot > Confidence Interval. JMP generates the CI.
Confidence Interval

Step 1: Convert Data into Information (Descriptive Statistics).

Step 2: Establish the Sampling Distribution for Mean (Inferential Statistics).

If σ unknown, the sampling distribution of the mean is a t distribution with center μ and standard error $s/\sqrt{n}$. In the example, the center is 4.60 and the standard error is $4.58/\sqrt{30}$.

Note: Std Error = $s/\sqrt{n}$. In the example, Std Err = 0.84.

The sampling distribution of the mean is used for inference on the Population Mean.
Confidence Interval

Step 3: User to Define Significance Level, α.

[Figure: t distribution centered at 4.60 with central area 1 − α = 0.95 and α/2 = 0.025 in each tail; LCL = Lower Confidence Limit, UCL = Upper Confidence Limit.]

Significance Level, α   Confidence Level, (1 − α)
0.01                    99%
0.05                    95%
0.10                    90%

Typically we choose α = 0.05.

FAQ: Why do we need to introduce α or a Confidence Level?

As an illustration of the CLT, we computed the average of the sample means of 1000 samples of size n. The results were similar, with the center of the sampling distribution almost equal to the pop mean.

In a real application, we use only 1 group of samples with size n to form the sampling distribution of the mean. Hence there is less information to infer the pop mean, and thus lower accuracy.

As such, a Confidence Interval is introduced to establish a range of values that likely contains the pop mean.

FAQ: Why do we typically choose α = 0.05 or Confidence Level = 95%?

This will be answered later.
Confidence Interval

Step 4: Establishing the Confidence Interval

[Figure: t distribution centered at 4.60 with central area 1 − α = 0.95 and α/2 = 0.025 in each tail.]

In the example: LCL = 2.89 and UCL = 6.31.

What does the Confidence Interval mean?

It means that, at the 95% Confidence Level, the Pop Mean is contained within 2.89 to 6.31.

But how does this knowledge help us in making decisions?
Confidence Interval

[Figure: t distribution centered at 4.60 with central area 1 − α = 0.95 and α/2 = 0.025 in each tail; LCL = 2.89, UCL = 6.31; two candidate targets, 0 and 5.00, are marked on the "Possible Pop Mean Value" axis.]

Case 1: Let's say specs target is 0 +/- 50 µm
• 0 is not within the Confidence Interval (@ 95% Confidence Level)
• Process not centered.
• User's Decision. Possible Action: re-setup the machine.

Case 2: Let's say specs target is 5 +/- 50 µm
• 5 is within the Confidence Interval (@ 95% Confidence Level)
• Process is centered.
• User's Decision. Possible Action: no need to re-setup the machine.
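The decision rule can be written down in a few lines. This is a sketch under stated assumptions: the summary statistics (mean 4.60, s = 4.58, n = 30) are those of the example, the critical values are read from the 29 d.f. row of the t table, and the helper names `t_interval` and `is_centered` are hypothetical.

```python
import math

# Summary statistics of the example sample.
x_bar, s, n = 4.60, 4.58, 30

# t critical values for n - 1 = 29 d.f. (upper-tail area alpha/2), read from the t table.
T_CRIT_29 = {0.90: 1.699, 0.95: 2.045, 0.99: 2.756}

def t_interval(x_bar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown."""
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

def is_centered(target, interval):
    """The process is judged centered if the target lies inside the interval."""
    lower, upper = interval
    return lower <= target <= upper

ci95 = t_interval(x_bar, s, n, T_CRIT_29[0.95])
print(f"95% CI: {ci95[0]:.2f} to {ci95[1]:.2f}")  # 2.89 to 6.31

for target in (0.0, 5.0):
    action = "no re-setup needed" if is_centered(target, ci95) else "re-setup machine"
    print(f"target {target}: {action}")
```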
Confidence Interval

FAQ: Why do we typically choose α = 0.05 or Confidence Level = 95%?

[Figure: CI @ 95% = 2.89 to 6.31, nested inside CI @ 99% = 2.30 to 6.91.]

What does it mean when we increase the Confidence Level?

1. The Confidence Interval becomes wider, and we have higher confidence that it contains the population mean.

2. It also means less precision, because the pop mean can take more different values.
Confidence Interval

FAQ: The Confidence Interval width will be smaller at CL 90%. Why don't we use CL 90% then?

Confidence Level   α      Area in each tail   Confidence Interval
90%                0.10   5.0%                3.18 to 6.03
95%                0.05   2.5%                2.89 to 6.31
99%                0.01   0.5%                2.30 to 6.91

What does it mean when we increase the Confidence Level?

1. The Confidence Interval becomes wider, and we have higher confidence that it contains the population mean.

2. It also means less precision, because the pop mean can take on more values.
Exercise #5

Frequency Interpretation: a numerical simulation

FAQ: The Confidence Interval width will be smaller at CL 90%. Why don't we use CL 90% then?

1. Open M2.0 Central Limit Theorem.xls.

2. Create 100 groups of sample with size n=30 using sheet Pop 1.

3. Copy the raw data of these 100 groups.

4. Create confidence intervals for all these 100 groups using:
a. α = 0.10 (Confidence Level = 90%).
b. α = 0.05 (Confidence Level = 95%).
c. α = 0.01 (Confidence Level = 99%).

5. Compare the results.

6. The trainer will show, step by step, how to do it.
Exercise #5

1. Open M2.0 Central Limit Theorem.xls.

2. Create 100 groups of sample with size n=30 using sheet Pop 1 (up to 30 values per group, 100 groups).

3. Copy/paste the sample data to JMP and transpose to column format.
Go to Tables > Transpose. Select all columns and cast to Transpose Columns, click Ok. The result has up to 100 columns and up to 30 rows.
The Label column can be deleted: click on the column, right click > Delete Column.

Go to Analyze > Distribution. Select all columns and cast to Y, Columns, click Ok.
There are 100 distributions created. Remove (untick) Quantiles and Summary Statistics from the hotspot.
Exercise #5

4. Create confidence intervals for all these 100 groups using:
a. α = 0.10 (Confidence Level = 90%).
b. α = 0.05 (Confidence Level = 95%).
c. α = 0.01 (Confidence Level = 99%).
Pressing the Ctrl key, click on the hotspot, select Confidence Interval, choose the confidence level.
Round 1: 90%. Round 2: 95%. Round 3: 99%.

Right click on the tabulation > Make Combined Data Table.
There are 100 groups, each group with 3 sets of CI (90, 95, 99% CL).

Remove Std Dev rows
Click on Rows > Row Selection > Select Where. Delete the highlighted rows.
Remove also the Parameter and Estimate columns (not needed).

Stack the data columns
Click on Tables > Stack to obtain the stacked data.

Make Oneway graph
Click on Analyze > Fit Y by X
Y, Response: Data
X: Y
By: 1-Alpha
Click OK
Exercise #5

Remove Grand Mean
Click on hotspot > Display Options, untick Grand Mean

Create box plot
Click on hotspot > Display Options, tick Box Plot

From exercise 1.1, we get population mean = 5.02; we will add a pop mean line to our graph.
Double-click on the Y axis of the graph, put value 5.02, then click Add > OK (do this for all 3 Oneway graphs).

CL(%)   Tr   P1   P2   P3   P4
90
95
99

Compare the Confidence Interval widths; use uniform scaling to see the difference easily.
Right-click on the Y axis of the graph (for 99%) > Edit > Copy Axis Settings
Click on the 95% CL graph Y axis > Edit > Paste Axis Settings
Click on the 90% CL graph Y axis > Edit > Paste Axis Settings
Frequency Interpretation: a numerical simulation

[Figures: for each confidence level, the 100 confidence intervals are plotted against a horizontal line at the Population Mean.]

At 90% Confidence Level ➔ On average, 90/100 Confidence Intervals contain the Pop Mean

At 95% Confidence Level ➔ On average, 95/100 Confidence Intervals contain the Pop Mean

At 99% Confidence Level ➔ On average, 99/100 Confidence Intervals contain the Pop Mean
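The exercise can also be mimicked outside JMP. The sketch below is an illustration under stated assumptions: the population is taken to be normal with mean 5.02 (the value quoted in the exercise) and a standard deviation of 4.5, which is a made-up value; the critical values are read from the 29 d.f. row of the t table; the function name `coverage` is hypothetical.

```python
import math
import random
import statistics

# t critical values for 29 d.f. (upper-tail area alpha/2), read from the t table.
T_CRIT_29 = {0.90: 1.699, 0.95: 2.045, 0.99: 2.756}

def coverage(confidence, pop_mean=5.02, pop_sigma=4.5, n=30, groups=100, seed=0):
    """Count how many of `groups` t-based intervals contain the population mean.

    pop_sigma = 4.5 is an assumed value, used only to generate data.
    """
    rng = random.Random(seed)
    t_crit = T_CRIT_29[confidence]
    hits = 0
    for _ in range(groups):
        sample = [rng.gauss(pop_mean, pop_sigma) for _ in range(n)]
        x_bar = statistics.fmean(sample)
        margin = t_crit * statistics.stdev(sample) / math.sqrt(n)
        if x_bar - margin <= pop_mean <= x_bar + margin:
            hits += 1
    return hits

# Same seed for every level, so the three rounds use the same 100 groups.
results = {cl: coverage(cl) for cl in T_CRIT_29}
for cl, hits in results.items():
    print(f"CL {cl:.0%}: {hits}/100 intervals contain the pop mean")
```

Any single run of 100 groups will fluctuate around the nominal counts; only over many repetitions does the proportion of intervals that contain the mean approach the confidence level.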
Frequency Interpretation

There is nothing wrong with choosing 90% or 99%.
What matters is that you are clear about the pros and cons that come with your decision.
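The frequency interpretation and the width trade-off can be sketched with a small simulation. This is an illustrative sketch, not the course's Excel/JMP exercise: it assumes a normal population with mean 5.02 (the value from Exercise 1.1) and a hypothetical known standard deviation of 1.0, and it uses the z-interval for known σ so that only the Python standard library is needed.

```python
import math
import random
from statistics import NormalDist, fmean


def coverage(conf_level, n=30, trials=2000, mu=5.02, sigma=1.0, seed=1):
    """Return (fraction of intervals containing mu, interval width).

    sigma = 1.0 is a hypothetical value chosen for illustration only.
    """
    rng = random.Random(seed)
    # Two-sided critical value z_{alpha/2} of the standard normal.
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials, 2 * half_width


for cl in (0.90, 0.95, 0.99):
    frac, width = coverage(cl)
    print(f"CL={cl:.0%}: coverage={frac:.3f}, width={width:.3f}")
```

A higher confidence level buys a higher coverage frequency at the price of a wider (less precise) interval.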

130
Module 3

Estimation
Exercise #6
Point Estimation
Properties of Point Estimators

Example #1

Specs = 0 µm ± 20 µm

Scenario:
1. You are setting up the machine.
2. 30 setup units are measured.
3. Descriptive statistics are obtained:
   a. All single values are within the spec limits.
   b. Cpk > 1.67.
4. Decision?
   a. Release the machine for production.
   b. Re-set up the machine.

131
Module 3

Estimation
Exercise #6
Point Estimation
Properties of Point Estimators
Generate descriptive statistics:
Click Analyze > Distribution, select the X and Y offset columns, then OK.

To add process capability:
Go to the data table, select both columns > right-click > Column Properties > Spec Limits.
Enter the spec limits for the first parameter, then do the same for the other parameter below it.

132
Module 3

Estimation
Exercise #6
Point Estimation
Properties of Point Estimators

Interval estimation
Generate descriptive statistics:
Click Analyze > Distribution, select the X and Y offset columns, then OK.

The option is already checked, since we put the spec limits in the data column properties.

Module 3

Confidence Intervals (Specs = 0 µm ± 20 µm)

Step 1 Convert data into information.

Step 2 Establish the sampling distribution for the mean.
If σ is unknown, the sample mean follows a t distribution centered at μ with standard error $s/\sqrt{n}$ and (n − 1) degrees of freedom.

Step 3 User to define the significance level, α. Typically α = 0.05, i.e. a 95% confidence level.

Step 4 Establish the confidence interval.

Step 5 Compare it to your target of interest and make your decision.

Applied to the two parameters of Exercise #6 (n = 30, target of interest = 0):
• First parameter: mean = −1.27, s = 3.69, Std Err = 3.69/√30 = 0.67, 95% CI = (−2.65, 0.11). Knowledge: at 95% CL, the target is within the CI. The process is centered.
• Second parameter: mean = −9.90, s = 1.37, Std Err = 1.37/√30 = 0.25, 95% CI = (−10.41, −9.39). Knowledge: at 95% CL, the target is not within the CI. The process is not centered.
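The five steps can be sketched in a few lines. This is a minimal sketch using only the summary statistics quoted on the slide; the critical value 2.045 is the tabulated Student's t value for 29 degrees of freedom at α/2 = 0.025, hardcoded here because the Python standard library has no t quantile function. The function names are illustrative.

```python
import math

# Tabulated t value for n - 1 = 29 degrees of freedom, alpha/2 = 0.025.
T_CRIT_29 = 2.045


def t_interval(mean, s, n, t_crit):
    """Two-sided confidence interval for the mean when sigma is unknown."""
    std_err = s / math.sqrt(n)
    return mean - t_crit * std_err, mean + t_crit * std_err


def is_centered(mean, s, n, target=0.0, t_crit=T_CRIT_29):
    """True when the target of interest lies inside the confidence interval."""
    lo, hi = t_interval(mean, s, n, t_crit)
    return lo <= target <= hi


print(t_interval(-1.27, 3.69, 30, T_CRIT_29))
print(is_centered(-1.27, 3.69, 30))
print(is_centered(-9.90, 1.37, 30))
```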
Module 3

Estimation
Exercise #7
Point Estimation
Properties of Point Estimators
Example #2

Same scenario as in Exercise #6, but with a different data set.

Specs = 0 µm ± 20 µm

136
Module 3

Estimation
Exercise #7
Point Estimation
Properties of Point Estimators

To add process capability:
Go to the data table, select both columns > right-click > Column Properties > Spec Limits, and enter the limits there.

137
Module 3

Estimation
Exercise #7
Point Estimation
Properties of Point Estimators

Interval estimation
Parameters of 1 normal population
Parameters of 2 normal populations

Module 3 Key Learning

What is the confidence interval telling us?


Check the min and max…Ppk

138
Module 3
Confidence Intervals: FAQ

Question: Could we increase the precision without sacrificing the frequency with which the intervals contain the population mean?
Answer: Increase the sample size, n.

Sampling distribution of the mean:
• If σ is known: Normal, centered at μ with standard error $\sigma/\sqrt{n}$
• If σ is unknown: Student's t, centered at μ with standard error $s/\sqrt{n}$

As the sample size increases (n2 > n1), $s/\sqrt{n}$ decreases, resulting in a narrower spread of the sampling distribution for the mean.

[Figure: sampling distributions of the mean for sample sizes n1 and n2 > n1. Each shows the CI @ 95% as the central 95% area with 2.5% in each tail; the curve for n2 is narrower.]
Module 3

Estimation
Exercise #8
Point Estimation
Properties of Point Estimators

Interval estimation
Parameters of 1 normal population
Parameters of 2 normal populations
Exercise file: use the M.20 Central Limit Theorem.xls file.

Trainer will guide participants in performing the exercise.

Exercise
1. Click on sheet Pop 1.
2. Randomly draw 100 samples of size n = 5 and 100 samples of size n = 30.
3. Compute the confidence interval for all 100 samples, for both n = 5 and n = 30.
4. Use a 95% confidence level for both sample sizes.

140
Module 3

Estimation
Exercise #8
Point Estimation
Properties of Point Estimators

Randomly draw 100 samples of size n = 5.

Randomly draw 100 samples of size n = 30.

141
Module 3

Estimation
Exercise #8
Point Estimation
Properties of Point Estimators
Copy and paste the data into JMP.
Go to Tables > Transpose.

142
Module 3

Estimation
Exercise #8
Point Estimation
Properties of Point Estimators
Remove Quantiles and all other statistics except the confidence intervals:
1. Holding the Ctrl key, go to hotspot > Display Options and untick Quantiles.
2. Holding the Ctrl key, go to the hotspot of Summary Statistics and untick everything except Confidence Intervals.

143
Module 3

Estimation
Exercise #8
Point Estimation
Properties of Point Estimators

Interval estimation
Parameters of 1 normal population
Right-click inside the report and select Make Combined Data Table.

144
Module 3

Estimation
Exercise #8
Point Estimation
Properties of Point Estimators
Make a box plot:
1. Go to Analyze > Fit Y by X.
2. Hotspot > Display Options: remove the Grand Mean and tick Box Plots.
3. Add the population-mean line (double-click the Y axis).
Module 3

Estimation
Exercise #8
Point Estimation
Properties of Point Estimators
Do the same for the n = 30 samples.

Make the scaling uniform:
1. Go to the n = 5 graph > right-click the Y axis > Edit > Copy Axis Settings
2. Go to the n = 30 graph > right-click the Y axis > Edit > Paste Axis Settings

Confidence Intervals

Using the same 95% CL, as the sample size increases, precision improves while the frequency with which the intervals contain the population mean stays the same.
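The effect of sample size on precision can be checked directly from the half-width formula $t_{n-1,\alpha/2}\, s/\sqrt{n}$. The sketch below assumes a hypothetical sample standard deviation of 1.0 for both sample sizes and hardcodes the standard tabulated t values for 4 and 29 degrees of freedom at 95% CL.

```python
import math

# Tabulated t values t_{n-1, 0.025} for the two sample sizes of Exercise #8.
T_CRIT = {5: 2.776, 30: 2.045}


def half_width(s, n):
    """Half-width (margin of error) of the 95% CI for the mean."""
    return T_CRIT[n] * s / math.sqrt(n)


s = 1.0  # hypothetical sample standard deviation, for illustration only
for n in (5, 30):
    print(f"n={n}: half-width = {half_width(s, n):.3f}")
print(f"ratio = {half_width(s, 5) / half_width(s, 30):.2f}")
```

Both intervals are 95% intervals, so the coverage frequency is unchanged; only the width shrinks as n grows.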
Module 3

Estimation
Confidence Intervals

[Figure: confidence interval widths at 99%, 95% and 90% CL, shown for sample size n1 and for a larger sample size n2 > n1. Axes: probability to contain the population mean, precision, sample size.]

Within the same sample size: a higher CL increases the probability of containing the population mean but decreases precision.

Increasing the sample size: precision increases at each CL (99%, 95%, 90%) without decreasing the probability of containing the population mean.
Module 3

Estimation

Point Estimation
Confidence intervals for parameters of one population

Confidence Intervals
• Population Mean
  o σ² known
  o σ² unknown
• Population Variance
Module 3

Estimation

Confidence Interval for σ²

Assumption
- The population is normally distributed

The random variable
$$\chi^2_{n-1} = \frac{(n-1)s^2}{\sigma^2}$$
follows a chi-square distribution with (n − 1) degrees of freedom.

The chi-square value $\chi^2_{n-1,\alpha}$ denotes the number for which:
$$P\left(\chi^2_{n-1} > \chi^2_{n-1,\alpha}\right) = \alpha$$
Confidence Interval for σ²

The 100(1 − α)% confidence interval for the population variance is:
$$\frac{(n-1)s^2}{\chi^2_{n-1,\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{n-1,1-\alpha/2}}$$

Graphically, the 100(1 − α)% = 95% confidence interval:

[Figure: density $f(\chi^2_{n-1})$ with central probability (1 − α) = 0.95 between the two critical values and probability α/2 = 0.025 in each tail.]

Note: The chi-squared distribution is not symmetric like the Normal.
→ The critical values are $\chi^2_{n-1,1-\alpha/2}$ (lower) and $\chi^2_{n-1,\alpha/2}$ (upper).
Module 3

Estimation

Point Estimation
Confidence Interval for σ²

EXAMPLE
You are testing the speed of a computer processor (X).
You collect the following data (MHz):

Sample Statistic      Value
Size                  11
Mean                  3,004
Standard Deviation    74

Assuming that the population is normal, determine a 95% confidence interval for σx², the true population variance.
Module 3

Confidence Interval for σ²

SOLUTION
In this case, we use the formula:
$$\frac{(n-1)s^2}{\chi^2_{n-1,\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{n-1,1-\alpha/2}}$$
where (α = 0.05 → α/2 = 0.025 and 1 − α/2 = 0.975):
• $\chi^2_{n-1,\alpha/2} = \chi^2_{10,0.025} = 20.48$
• $\chi^2_{n-1,1-\alpha/2} = \chi^2_{10,0.975} = 3.25$
  (both from the statistical tables of the chi-squared distribution, see next slide)
• s = 74, so s² = 5476
• n = 11

$$\frac{(11-1)\,5476}{20.48} < \sigma^2 < \frac{(11-1)\,5476}{3.25}$$
$$2{,}673.83 < \sigma^2 < 16{,}849.23$$

Converting to standard deviation, we are 95% confident that the population standard deviation of CPU speed is between 51.71 and 129.80 MHz.
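The arithmetic of this solution can be reproduced as follows. This is a minimal sketch: the two chi-square critical values are taken from the slide (20.48 and 3.25) rather than computed, and the function name is illustrative.

```python
import math


def variance_ci(s, n, chi2_upper, chi2_lower):
    """CI for the population variance of a normal population.

    chi2_upper = chi-square value with right-tail area alpha/2,
    chi2_lower = chi-square value with right-tail area 1 - alpha/2,
    both with n - 1 degrees of freedom (looked up in a table).
    """
    sum_sq = (n - 1) * s ** 2
    return sum_sq / chi2_upper, sum_sq / chi2_lower


var_lo, var_hi = variance_ci(s=74, n=11, chi2_upper=20.48, chi2_lower=3.25)
print("variance CI:", round(var_lo, 2), round(var_hi, 2))
print("std dev CI:", round(math.sqrt(var_lo), 2), round(math.sqrt(var_hi), 2))
```

Note that the larger critical value produces the lower limit, because it sits in the denominator.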

154
Module 3

Use of the chi-square distribution table

Table of values of χ² in a chi-squared distribution with k degrees of freedom such that α is the area between χ² and +∞

α
d.f. 0.995 0.99 0.975 0.95 0.9 0.75 0.5 0.25 0.1 0.05 0.025 0.01 0.005 0.002 0.001
1 3.927e-5 1.570e-4 9.820e-4 0.00393 0.0157 0.102 0.455 1.323 2.706 3.841 5.024 6.635 7.879 9.550 10.828
2 0.0100 0.0201 0.0506 0.103 0.211 0.575 1.386 2.773 4.605 5.991 7.378 9.210 10.597 12.429 13.816
3 0.0717 0.115 0.216 0.352 0.584 1.213 2.366 4.108 6.251 7.815 9.348 11.345 12.838 14.796 16.266
4 0.207 0.297 0.484 0.711 1.064 1.923 3.357 5.385 7.779 9.488 11.143 13.277 14.860 16.924 18.467
5 0.412 0.554 0.831 1.145 1.610 2.675 4.351 6.626 9.236 11.070 12.833 15.086 16.750 18.907 20.515
6 0.676 0.872 1.237 1.635 2.204 3.455 5.348 7.841 10.645 12.592 14.449 16.812 18.548 20.791 22.458
7 0.989 1.239 1.690 2.167 2.833 4.255 6.346 9.037 12.017 14.067 16.013 18.475 20.278 22.601 24.322
8 1.344 1.646 2.180 2.733 3.490 5.071 7.344 10.219 13.362 15.507 17.535 20.090 21.955 24.352 26.124
9 1.735 2.088 2.700 3.325 4.168 5.899 8.343 11.389 14.684 16.919 19.023 21.666 23.589 26.056 27.877
10 2.156 2.558 3.247 3.940 4.865 6.737 9.342 12.549 15.987 18.307 20.483 23.209 25.188 27.722 29.588
11 2.603 3.053 3.816 4.575 5.578 7.584 10.341 13.701 17.275 19.675 21.920 24.725 26.757 29.354 31.264
12 3.074 3.571 4.404 5.226 6.304 8.438 11.340 14.845 18.549 21.026 23.337 26.217 28.300 30.957 32.909
13 3.565 4.107 5.009 5.892 7.042 9.299 12.340 15.984 19.812 22.362 24.736 27.688 29.819 32.535 34.528
14 4.075 4.660 5.629 6.571 7.790 10.165 13.339 17.117 21.064 23.685 26.119 29.141 31.319 34.091 36.123
15 4.601 5.229 6.262 7.261 8.547 11.037 14.339 18.245 22.307 24.996 27.488 30.578 32.801 35.628 37.697
16 5.142 5.812 6.908 7.962 9.312 11.912 15.338 19.369 23.542 26.296 28.845 32.000 34.267 37.146 39.252
17 5.697 6.408 7.564 8.672 10.085 12.792 16.338 20.489 24.769 27.587 30.191 33.409 35.718 38.648 40.790
18 6.265 7.015 8.231 9.390 10.865 13.675 17.338 21.605 25.989 28.869 31.526 34.805 37.156 40.136 42.312
19 6.844 7.633 8.907 10.117 11.651 14.562 18.338 22.718 27.204 30.144 32.852 36.191 38.582 41.610 43.820
20 7.434 8.260 9.591 10.851 12.443 15.452 19.337 23.828 28.412 31.410 34.170 37.566 39.997 43.072 45.315
Module 3

Estimation

Point Estimation
Advanced Explanation

DEFINITION: if $P\left(\chi^2_{n-1} \ge \chi^2_{\alpha/2,n-1}\right) = \alpha/2$, then $\chi^2_{\alpha/2,n-1}$ is the percentage point of a chi-square distribution with (n − 1) degrees of freedom.

❑ Let $\chi^2_{\alpha/2,n-1}$ be the percentage point of a chi-square distribution with (n − 1) DF such that $P\left(\chi^2_{n-1} \ge \chi^2_{\alpha/2,n-1}\right) = \alpha/2$ (right tail).
❑ Let $\chi^2_{1-\alpha/2,n-1}$ be the percentage point of a chi-square distribution with (n − 1) DF such that $P\left(\chi^2_{n-1} \ge \chi^2_{1-\alpha/2,n-1}\right) = 1 - \alpha/2$ (left tail).

It can be shown that the quantity $\dfrac{(n-1)s^2}{\sigma^2}$ follows a chi-square distribution with (n − 1) DF.

A confidence interval at the level (1 − α) for the quantity $\dfrac{(n-1)s^2}{\sigma^2}$ can be formed by taking the interval between the two percentage points previously shown, $\chi^2_{1-\alpha/2,n-1}$ and $\chi^2_{\alpha/2,n-1}$.

So it can be stated that
$$P\left(\chi^2_{1-\alpha/2,n-1} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2,n-1}\right) = 1-\alpha$$

156
Module 3

Advanced Explanation

To obtain the confidence interval for the population variance, the term σ² must be isolated.
Dividing by (n − 1)s², we obtain
$$P\left(\frac{\chi^2_{1-\alpha/2,n-1}}{(n-1)s^2} \le \frac{1}{\sigma^2} \le \frac{\chi^2_{\alpha/2,n-1}}{(n-1)s^2}\right) = 1-\alpha$$

To solve for σ² we need the reciprocal, which reverses each inequality:
$$\frac{\chi^2_{1-\alpha/2,n-1}}{(n-1)s^2} \le \frac{1}{\sigma^2} \;\Rightarrow\; \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,n-1}} \ge \sigma^2$$
$$\frac{1}{\sigma^2} \le \frac{\chi^2_{\alpha/2,n-1}}{(n-1)s^2} \;\Rightarrow\; \sigma^2 \ge \frac{(n-1)s^2}{\chi^2_{\alpha/2,n-1}}$$

which leads to the 100(1 − α)% two-sided confidence interval for the population variance:
$$\frac{(n-1)s^2}{\chi^2_{n-1,\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{n-1,1-\alpha/2}}$$

157
Module 3

Estimation
Confidence Interval for the population proportion

Let $\hat{p}$ be the proportion of successes in n independent trials, each having a probability of success equal to p.

The following C.I. for the population proportion p is valid for large samples. In other terms, the method is valid under the assumption that the Binomial distribution is satisfactorily approximated by a Normal distribution. This happens if np(1 − p) > 9.

Under this assumption, a 100(1 − α)% C.I. for the population proportion is given by:
$$\hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} < p < \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

158
Module 3

Estimation
Confidence Interval for the population proportion

EXAMPLE:
The inspection of a random sample shows that 15 items out of 80 are nonconforming (for one or more causes of nonconformity). A point estimate of the proportion nonconforming is then $\hat{p} = 15/80 = 0.1875$.
The normal approximation is plausible, since 80(0.1875)(0.8125) = 12.1875 > 9.

A 95% C.I. for the population proportion:
$$0.1875 - 1.96\sqrt{\frac{0.1875(0.8125)}{80}} < p < 0.1875 + 1.96\sqrt{\frac{0.1875(0.8125)}{80}}$$

CI = (0.102, 0.273)
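The classical (normal-approximation) interval of this example can be sketched as below. The function names are illustrative, and the validity check plugs the estimate $\hat{p}$ into the rule np(1 − p) > 9 because the true p is unknown.

```python
import math
from statistics import NormalDist


def normal_approx_ok(p_hat, n):
    """Rule of thumb from the slide, evaluated at the sample proportion."""
    return n * p_hat * (1 - p_hat) > 9


def classical_proportion_ci(successes, n, conf_level=0.95):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


print(normal_approx_ok(15 / 80, 80))
lo, hi = classical_proportion_ci(15, 80)
print(round(lo, 3), round(hi, 3))
```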

159
Module 3

Estimation
Exercise #9
Point Estimation
Properties of Point Estimators
1. Open the exercise File:
2. Trainer will show using JMP:
   a. How to perform a one-population proportion test
3. Interpretation of results.

Note:
• In JMP, the default proportion test uses the Wilson score interval. The result obtained is therefore different from the manual calculation shown on the previous slide.
• The classical method used in the manual calculation can be added to JMP as an add-in.

Visit the JMP Learning Center, section “Useful Add-In”, and download the add-in called Statistical Calculator.

https://stmicroelectronics.sharepoint.com/teams/jmplearningcenter9/SitePages/Welcome-to-JMP-Video-Tutorial.aspx

160
Module 3

Estimation
Exercise #9
Point Estimation
Properties of Point Estimators

1. Open the exercise files:
• Summarized data of conformity
• Individual conformity, 80 rows

161
Module 3

Estimation
Exercise #9
Point Estimation
Properties of Point Estimators

Using the individual data (80 rows): go to Analyze > Distribution, cast the Conformity column to Y > OK.

Go to hotspot > Confidence Interval > 0.95

162
Module 3

Estimation
Exercise #9
Point Estimation
Properties of Point Estimators

Using the summarized data: go to Analyze > Distribution, cast the Conformity column to Y and the Freq column to Freq > OK.

Go to hotspot > Confidence Interval > 0.95

163
Module 3

Estimation
Exercise #9
Point Estimation
Properties of Point Estimators

JMP uses the Wilson score method.

The manual calculation uses the classical (normal-approximation) method.
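To see why the two results differ, the Wilson score interval can be computed by hand. The sketch below uses the standard textbook Wilson formula without continuity correction; it is an assumption that this matches JMP's output to the displayed digits, and the function name is illustrative.

```python
import math
from statistics import NormalDist


def wilson_ci(successes, n, conf_level=0.95):
    """Wilson score interval for a proportion (no continuity correction)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    denom = 1 + z ** 2 / n
    center = (p_hat + z ** 2 / (2 * n)) / denom
    half_width = z * math.sqrt(
        p_hat * (1 - p_hat) / n + z ** 2 / (4 * n ** 2)
    ) / denom
    return center - half_width, center + half_width


lo, hi = wilson_ci(15, 80)
print(round(lo, 4), round(hi, 4))
```

Unlike the classical interval, the Wilson interval is not centered on $\hat{p}$; its center is pulled toward 0.5, which is why both limits shift upward for the 15-out-of-80 example.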

164
Module 3

Estimation
Exercise #9
Point Estimation
Go to Add-Ins > Statistics Calculator III > Confidence Interval for One Proportion
Interval estimation
Parameters of 1 normal population
Parameters of 2 normal populations

Module 3 Key Learning

Module 3

Estimation

Confidence Interval summary

Formulas for confidence intervals of parameters of one normal population:

Population Parameter: Confidence Interval

μ (with σ² known):
$$\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$

μ (with σ² unknown):
$$\bar{X} - t_{n-1,\alpha/2}\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{n-1,\alpha/2}\frac{s}{\sqrt{n}}$$

σ²:
$$\frac{(n-1)s^2}{\chi^2_{n-1,\alpha/2}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{n-1,1-\alpha/2}}$$

p (population proportion):
$$\hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} < p < \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

NOTE:
For more details, see also the Manual of Statistical Methodology – Ch. 6. (DMS 8482919_A)
Module 3

Estimation

Point Estimation
Properties of Point Estimators

Interval estimation
Parameters of 1 normal population
Parameters of 2 normal populations

Module 3 Key Learning

Interval estimation of unknown population parameters (two normal populations)

169
Module 3

Estimation

Point Estimation
Interval Estimation – 2 populations

When two populations are considered simultaneously, the goal of the confidence intervals is often to compare the parameters of the two populations with each other.

                   Population X           Population Y
Parameters         μX, σX                 μY, σY
Samples            x1, x2, ..., xnx       y1, y2, ..., yny
Point estimates    x̄, sx²                 ȳ, sy²

Comparisons: μX = μY ? and σ²X = σ²Y ?
Module 3

Estimation

Point Estimation
Interval Estimation (two populations)

GENERAL SCHEME. Confidence interval for:
• Difference of population means, independent samples. Example: comparison of means, Group 1 vs. independent Group 2.
• Difference of population means, dependent samples. Example: comparison of means, same group before vs. after treatment.
• Ratio of population variances. Example: comparison of variances of two normal distributions.
Module 3

Estimation

Point Estimation
Interval Estimation (two populations)

Interval estimation of unknown population parameters (two independent populations)

The previous results for one population are now extended to the case of two independent populations.
Two independent populations, Population X and Population Y, are considered:
• Population X has mean μX and variance σ²X
• Population Y has mean μY and variance σ²Y
• Inferences are based on two random samples of sizes nX and nY, from population X and population Y, respectively (from population X the random sample is x1, x2, …, xnx, and from population Y the random sample is y1, y2, …, yny).

173
Module 3

Estimation

Point Estimation
C. I. for μX − μY (independent populations)

Goal: form a confidence interval for the difference between two population means, μX – μY.

Population means, independent samples:
• Different data sources
• Uncorrelated
• Independent: the sample selected from one population has no effect on the sample selected from the other population

• The point estimate is the difference between the two sample means: $\bar{x} - \bar{y}$
Module 3

C. I. for μX − μY (independent populations)

Different cases can be identified for population means, independent samples:
• σx² and σy² known: the confidence interval uses $z_{\alpha/2}$
• σx² and σy² unknown: the confidence interval uses a value from the Student’s t distribution
  o σx² and σy² assumed equal
  o σx² and σy² assumed unequal
Module 3

C. I. for μX − μY (independent populations)

Population means, independent samples. Start with this case: σx² and σy² known.

1 Assumptions:
▪ Samples are randomly and independently drawn
▪ Both population distributions are Normal
Context:
▪ Population variances are known

2 The confidence interval for μX – μY is:
$$(\bar{x}-\bar{y}) - Z_{\alpha/2}\sqrt{\frac{\sigma_X^2}{n_x}+\frac{\sigma_Y^2}{n_y}} < \mu_X-\mu_Y < (\bar{x}-\bar{y}) + Z_{\alpha/2}\sqrt{\frac{\sigma_X^2}{n_x}+\frac{\sigma_Y^2}{n_y}}$$

3 μx and μy comparison: form a 100(1 − α)% confidence interval for the difference μx − μy. If zero is included in the interval, we cannot conclude (at the 100(1 − α)% confidence level) that μx and μy differ.
Module 3

C. I. for μX − μY (independent populations)

Population means, independent samples. This case: σx² and σy² unknown, assumed equal.

1 Assumptions:
▪ Samples are randomly and independently drawn
▪ Both population distributions are Normal
Context:
▪ Population variances are unknown but assumed equal

2 Forming interval estimates:
• The population variances are unknown but assumed equal, so use the two sample standard deviations and pool them to estimate the unknown common σ
• Use a t value with (nx + ny – 2) degrees of freedom

Calculate the pooled variance $S_p^2$:
$$S_p^2 = \frac{(n_x-1)S_x^2 + (n_y-1)S_y^2}{n_x+n_y-2}$$
Module 3

C. I. for μX − μY (independent populations)

3 The confidence interval for μX – μY is:
$$(\bar{x}-\bar{y}) - t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{S_p^2}{n_x}+\frac{S_p^2}{n_y}} < \mu_X-\mu_Y < (\bar{x}-\bar{y}) + t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{S_p^2}{n_x}+\frac{S_p^2}{n_y}}$$

4 μx and μy comparison: form a 100(1 − α)% confidence interval for the difference μx − μy. If zero is included in the interval, we cannot conclude (at the 100(1 − α)% confidence level) that μx and μy differ.
Module 3

Estimation

Point Estimation
C. I. for μX − μY (independent populations)

EXAMPLE
You are testing the speed of two computer processors (X1 and X2).
You collect the following data (MHz):

Sample Statistic             1st processor X1    2nd processor X2
Sample size                  17                  14
Sample mean                  3,004               2,538
Sample standard deviation    74                  56

Assuming that the two populations are normally distributed with unknown but equal variances, determine a 95% confidence interval for the difference of CPU mean speeds.
179
C. I. for μX − μY (independent populations)
SOLUTION

The pooled variance is:
$$S_p^2 = \frac{(n_x-1)S_x^2 + (n_y-1)S_y^2}{(n_x-1)+(n_y-1)} = \frac{(17-1)\,74^2 + (14-1)\,56^2}{(17-1)+(14-1)} = 4427.03$$

The t value for a 95% confidence interval is: $t_{n_x+n_y-2,\alpha/2} = t_{29,0.025} = 2.045$

The 95% confidence interval is:
$$(\bar{x}-\bar{y}) - t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{S_p^2}{n_x}+\frac{S_p^2}{n_y}} < \mu_X-\mu_Y < (\bar{x}-\bar{y}) + t_{n_x+n_y-2,\alpha/2}\sqrt{\frac{S_p^2}{n_x}+\frac{S_p^2}{n_y}}$$
$$(3004-2538) - 2.045\sqrt{\frac{4427.03}{17}+\frac{4427.03}{14}} < \mu_X-\mu_Y < (3004-2538) + 2.045\sqrt{\frac{4427.03}{17}+\frac{4427.03}{14}}$$
$$416.89 < \mu_X-\mu_Y < 515.11$$

We are 95% confident that the mean difference in CPU speed is between 416.89 and 515.11 MHz. Since zero is not included in the interval, we cannot state that the two processors perform equally in terms of mean CPU speed.
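The pooled-variance calculation can be sketched as follows, using the summary statistics of the example and the tabulated value $t_{29,0.025} = 2.045$ quoted in the solution. The function names are illustrative; the t value is passed in because the Python standard library has no t quantile function.

```python
import math


def pooled_variance(sx, nx, sy, ny):
    """Pooled estimate of the common variance of two normal populations."""
    return ((nx - 1) * sx ** 2 + (ny - 1) * sy ** 2) / (nx + ny - 2)


def pooled_t_ci(xbar, sx, nx, ybar, sy, ny, t_crit):
    """CI for mu_X - mu_Y, variances unknown but assumed equal."""
    sp2 = pooled_variance(sx, nx, sy, ny)
    margin = t_crit * math.sqrt(sp2 / nx + sp2 / ny)
    diff = xbar - ybar
    return diff - margin, diff + margin


sp2 = pooled_variance(74, 17, 56, 14)
lo, hi = pooled_t_ci(3004, 74, 17, 2538, 56, 14, t_crit=2.045)
print(round(sp2, 2))
print(round(lo, 2), round(hi, 2))
print("zero inside interval:", lo <= 0 <= hi)
```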

180
Module 3

Estimation
Exercise #10
Point Estimation
Properties of Point Estimators

Interval estimation
Parameters of 1 normal population
1. Open the exercise File:
2. Trainer will show using JMP:
   a. How to perform the Unequal Variances test.
   b. How to compute the confidence interval.
3. Interpretation of results.

181
Module 3

Estimation
Exercise #10
Point Estimation
Properties of Point Estimators

1. Open the exercise file and stack the data columns:
   Go to Tables > Stack

2. Plot the data and analyze:
   Go to Analyze > Fit Y by X

182
Module 3

Estimation
Exercise #10
Point Estimation
Properties of Point Estimators
Go to hotspot > Unequal Variances
Are the variances equal?

Go to hotspot > Means/Anova/Pooled t
Does your CI include 0?
183
C. I. for μX − μY (independent populations)

Population means, independent samples. This case: σx² and σy² unknown, assumed unequal.

1 Assumptions:
▪ Samples are randomly and independently drawn
▪ Both population distributions are Normal
Context:
▪ Population variances are unknown and assumed unequal

2 Forming interval estimates:
• The population variances are assumed unequal, so a pooled variance is not appropriate
• Use a t value with ν degrees of freedom, where:
$$\nu = \frac{\left(\dfrac{S_x^2}{n_x} + \dfrac{S_y^2}{n_y}\right)^2}{\dfrac{\left(S_x^2/n_x\right)^2}{n_x-1} + \dfrac{\left(S_y^2/n_y\right)^2}{n_y-1}}$$
Module 3

C. I. for μX − μY (independent populations)

3 The confidence interval for μX – μY is:
$$(\bar{x}-\bar{y}) - t_{\nu,\alpha/2}\sqrt{\frac{S_X^2}{n_x}+\frac{S_Y^2}{n_y}} < \mu_X-\mu_Y < (\bar{x}-\bar{y}) + t_{\nu,\alpha/2}\sqrt{\frac{S_X^2}{n_x}+\frac{S_Y^2}{n_y}}$$

4 μx and μy comparison: form a 100(1 − α)% confidence interval for the difference μx − μy. If zero is included in the interval, we cannot conclude (at the 100(1 − α)% confidence level) that μx and μy differ.
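The degrees-of-freedom formula for the unequal-variance case is easy to get wrong by hand, so a short sketch may help. As an illustration only, it reuses the summary statistics of the two-processor example (which the course itself solves under the equal-variance assumption). The critical value $t_{\nu,\alpha/2}$ must still be looked up for the computed ν, so it is left as a parameter; the function names are illustrative.

```python
import math


def unequal_var_df(sx, nx, sy, ny):
    """Degrees of freedom nu for the unequal-variance t interval."""
    vx = sx ** 2 / nx
    vy = sy ** 2 / ny
    return (vx + vy) ** 2 / (vx ** 2 / (nx - 1) + vy ** 2 / (ny - 1))


def unequal_var_ci(xbar, sx, nx, ybar, sy, ny, t_crit):
    """CI for mu_X - mu_Y without pooling; t_crit = t_{nu, alpha/2}."""
    margin = t_crit * math.sqrt(sx ** 2 / nx + sy ** 2 / ny)
    diff = xbar - ybar
    return diff - margin, diff + margin


nu = unequal_var_df(74, 17, 56, 14)
print(f"nu = {nu:.2f}")
```

The result always lies between min(nx, ny) − 1 and nx + ny − 2, and it is usually not an integer; a common conservative choice when using printed tables is to round it down.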

185
Module 3

Estimation
Exercise #11
Point Estimation
Properties of Point Estimators

Interval estimation
Parameters of 1 normal population
1. Open the exercise File:
2. Trainer will show using JMP:
   a. How to perform the Unequal Variances test.
   b. How to compute the confidence interval.
3. Interpretation of results.

186
Module 3

Estimation
Exercise #11
Point Estimation
Properties of Point Estimators
1. Follow Steps 1 and 2 in Exercise #10.

Go to hotspot > Unequal Variances
Are the variances equal?

Go to hotspot > Means/Anova/Pooled t
Does your CI include 0?

Module 3

Estimation

Point Estimation
Interval Estimation (two populations)

Interval estimation of unknown population parameters (two dependent populations)

• Tests means of 2 related populations (X and Y)
  o Paired or matched samples
  o Repeated measures (before/after)
  o Use the difference between paired values: di = (Xi – Yi), i = 1, 2, …, n

• Does not consider variation among subjects

Assumptions: both populations are normally distributed
189
Module 3

Estimation

Point Estimation
C. I. for μX − μY (dependent populations)

The ith paired difference is di, where
di = Xi − Yi

The point estimate for the population mean paired difference is $\bar{d}$:
$$\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n}$$

The sample standard deviation is $S_d$:
$$S_d = \sqrt{\frac{\sum_{i=1}^{n}\left(d_i - \bar{d}\right)^2}{n-1}}$$

n is the number of matched pairs in the sample


Module 3

Estimation

Point Estimation
C. I. for μX − μY (dependent populations)

The confidence interval for the difference between population means, μd, is
$$\bar{d} - t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{n-1,\alpha/2}\frac{S_d}{\sqrt{n}}$$

where
n = the sample size (number of matched pairs in the paired sample)

μd interpretation: if zero is included in the interval, we cannot conclude (at the 100(1 − α)% confidence level) that μd differs from 0, i.e. that there is a difference between the population means.

191
Module 3

C. I. for μX-μY (dependent populations)

• The Margin of Error (ME) is:

$$ME = t_{n-1,\alpha/2}\,\frac{S_d}{\sqrt{n}}$$

• $t_{n-1,\alpha/2}$ is the value from the Student's t distribution with (n – 1) degrees of freedom for which:

$$P\left(t_{n-1} > t_{n-1,\alpha/2}\right) = \frac{\alpha}{2}$$

Exercise #12

1. Open the exercise File:
2. Trainer will show using JMP:
   a. How to compute a Confidence Interval.
3. Interpretation of results.

Exercise #12

Method 1: Using Matched Pairs

Go to Analyze > Specialized Modeling > Matched Pairs

Cast T0 and T500 to Y, Paired Response

Does the CI include the target of interest (0)?

Exercise #12

Method 2: Using the Changes column (difference of T0 & T500 measurements)

Go to Analyze > Distribution
Cast Changes to Y, Columns → OK

Does the CI include the target of interest (0)?

C. I. for μX-μY (dependent populations)

EXAMPLE

Six people sign up for a weight loss program. You collect the following data:

Weight:
Person   Before (xi)   After (yi)   Difference di
1        136           125          11
2        205           195          10
3        157           150           7
4        138           140          -2
5        175           165          10
6        166           160           6
                       Sum          42

$$\bar{d} = \frac{\sum d_i}{n} = 7.0 \qquad S_d = \sqrt{\frac{\sum \left(d_i - \bar{d}\right)^2}{n-1}} = 4.82$$

Form a 95% confidence interval for the difference of means and establish whether the weight loss program helps people lose weight.

C. I. for μX-μY (dependent populations)

SOLUTION

For a 95% confidence level, the appropriate t value is $t_{n-1,\alpha/2} = t_{5,0.025} = 2.571$

The 95% confidence interval for the difference between means, μd, is

$$\bar{d} - t_{n-1,\alpha/2}\,\frac{S_d}{\sqrt{n}} < \mu_d < \bar{d} + t_{n-1,\alpha/2}\,\frac{S_d}{\sqrt{n}}$$

$$7 - (2.571)\,\frac{4.82}{\sqrt{6}} < \mu_d < 7 + (2.571)\,\frac{4.82}{\sqrt{6}}$$

$$1.94 < \mu_d < 12.06$$

Interpretation:
• Since this interval does not contain zero, we can be 95% confident, given this limited data, that the weight loss program produced a statistically significant effect on people's weight.
• Since both confidence limits are positive, we can state that the weight loss program helps people to lose weight (weight before > weight after).
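The arithmetic of this example can be checked with a minimal Python sketch. The function name `paired_ci` is an illustrative choice, not part of the course material, and the critical value 2.571 is passed in from the t table because the Python standard library has no Student's t quantile function.

```python
from math import sqrt
from statistics import mean, stdev

def paired_ci(x, y, t_crit):
    """Return (d_bar, s_d, lower, upper) for the mean paired difference.

    t_crit is the tabulated value t_{n-1, alpha/2}, supplied by the caller.
    """
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    d_bar = mean(d)
    s_d = stdev(d)                     # sample standard deviation (n - 1)
    margin = t_crit * s_d / sqrt(n)    # margin of error (ME)
    return d_bar, s_d, d_bar - margin, d_bar + margin

# Weight data from the example (before and after the program).
before = [136, 205, 157, 138, 175, 166]
after = [125, 195, 150, 140, 165, 160]

# t_{5, 0.025} = 2.571, as quoted in the solution.
d_bar, s_d, lower, upper = paired_ci(before, after, t_crit=2.571)
print(f"d_bar = {d_bar:.2f}, S_d = {s_d:.2f}")
print(f"95% CI: {lower:.2f} < mu_d < {upper:.2f}")
```

The printed limits agree with the interval 1.94 < μd < 12.06 obtained by hand.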

C. I. for μd and σd (Dependent samples)

EXAMPLE

Confidence interval for the mean μd and for the standard deviation σd of the difference (d) of the values drawn from 2 normal paired (or matched) samples.

Use the same data as in the previous example 2.6 for independent samples (same parameter in two successive time periods).
Here, we assume that the data in columns A and B represent measurements on the same units before and after a certain treatment was given (or, equivalently, in two different periods of time: t-1 and t).

• Determine and interpret the 95% confidence intervals for μd and for σd

C. I. for μd and σd (Dependent samples)

SOLUTION (JMP)

ANALYZE > MATCHED PAIRS

Confidence interval

Interval Estimation (two populations)

GENERAL SCHEME: Confidence Interval for

• difference of population means, independent samples
  Example: comparison of means, Group 1 vs. independent Group 2
• difference of population means, dependent samples
  Example: comparison of means, same group before vs. after treatment
• ratio of population variances
  Example: comparison of variances of two normal distributions

C. I. for (σX)²/(σY)² (indep. Normal populations)

1 Assumptions:

Samples are randomly and independently drawn from two Normal distributions

2 The confidence interval for the ratio of variances σX²/σY² is given by:

$$\frac{s_X^2}{s_Y^2}\,\frac{1}{F_{\alpha/2,\;n_X-1,\;n_Y-1}} < \frac{\sigma_X^2}{\sigma_Y^2} < \frac{s_X^2}{s_Y^2}\,\frac{1}{F_{1-\alpha/2,\;n_X-1,\;n_Y-1}}$$

where $F_{\alpha/2,\;n_X-1,\;n_Y-1}$ is the percentage point of the F distribution with (nX-1) and (nY-1) degrees of freedom such that

$$P\left(F_{n_X-1,\;n_Y-1} \ge F_{\alpha/2,\;n_X-1,\;n_Y-1}\right) = \frac{\alpha}{2}$$

Since $1/F_{\alpha/2,\;n_X-1,\;n_Y-1} = F_{1-\alpha/2,\;n_Y-1,\;n_X-1}$, the limits can equivalently be written as $s_X^2/s_Y^2$ multiplied by F percentage points with the degrees of freedom swapped.

3 σX² and σY² comparison: "Form a 100(1-α)% confidence interval for the ratio σX²/σY². If one (1) is included in the interval, the data are consistent with σX² = σY² at the 100(1-α)% confidence level".
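As a small illustration, the Python sketch below computes the interval in the standard form where the sample variance ratio is divided by the upper and lower F percentage points with (nX-1, nY-1) degrees of freedom. The sample variances are hypothetical, the function name `variance_ratio_ci` is an assumption made for this sketch, and the F values are read from a table and passed in because the standard library has no F quantile function.

```python
def variance_ratio_ci(s2_x, s2_y, f_upper, f_lower):
    """Confidence interval for sigma_X^2 / sigma_Y^2.

    f_upper = F_{alpha/2, nX-1, nY-1}    (upper-tail percentage point)
    f_lower = F_{1-alpha/2, nX-1, nY-1}  (lower-tail percentage point)
    Both come from an F table and are supplied by the caller.
    """
    ratio = s2_x / s2_y
    return ratio / f_upper, ratio / f_lower

# Hypothetical sample variances from two samples of size 6 each.
s2_x, s2_y = 12.0, 8.0

# Tabulated F_{0.025, 5, 5} is about 7.15; with equal degrees of freedom
# the lower point is its reciprocal, F_{0.975, 5, 5} = 1 / 7.15.
f_upper = 7.15
f_lower = 1 / f_upper

lower, upper = variance_ratio_ci(s2_x, s2_y, f_upper, f_lower)
print(f"95% CI for the variance ratio: {lower:.3f} to {upper:.3f}")
print("1 inside the interval:", lower < 1 < upper)
```

With such small samples the interval is very wide, so a value of 1 inside it says only that equal variances cannot be ruled out.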

Confidence Interval for the difference of 2 population proportions

ASSUMPTIONS:
• Samples are randomly and independently drawn.
• Sample sizes are nX and nY.
• Both sample sizes are large and the Normal approximation to the Binomial distribution holds (np(1 - p) > 9).

The C.I. is relative to the difference of the population proportions: pX – pY.

The point estimate for the difference is $\hat{p}_X - \hat{p}_Y$

The random variable

$$Z = \frac{(\hat{p}_X - \hat{p}_Y) - (p_X - p_Y)}{\sqrt{\dfrac{\hat{p}_X(1-\hat{p}_X)}{n_X} + \dfrac{\hat{p}_Y(1-\hat{p}_Y)}{n_Y}}}$$

is approximately standard normal, so a 100(1-α)% confidence interval for the difference of 2 population proportions is given by:

$$\hat{p}_X - \hat{p}_Y \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}_X(1-\hat{p}_X)}{n_X} + \frac{\hat{p}_Y(1-\hat{p}_Y)}{n_Y}}$$

Confidence Interval for the difference of 2 population proportions

EXAMPLE
Two production lots have been inspected. From the first, a random sample of size nX = 270 units is drawn and then checked. 11 units out of 270 have been found defective. From the second lot, a random sample of size nY = 352 units is drawn and then checked. 19 units out of 352 have been found defective.
Form a 90% C.I. for the difference of proportions of defective units in lot 1 and in lot 2 to assess equality of proportions.

SOLUTION
From lot 1: $\hat{p}_X = \frac{11}{270} = 0.0407$. From lot 2: $\hat{p}_Y = \frac{19}{352} = 0.053977$

For 90% confidence level, $Z_{\alpha/2} = 1.645$, and

$$\sqrt{\frac{\hat{p}_X(1-\hat{p}_X)}{n_X} + \frac{\hat{p}_Y(1-\hat{p}_Y)}{n_Y}} = \sqrt{\frac{(0.0407)(0.9593)}{270} + \frac{(0.053977)(0.946023)}{352}} = 0.01702$$

The confidence limits are:

$$\hat{p}_X - \hat{p}_Y \pm Z_{\alpha/2}\sqrt{\frac{\hat{p}_X(1-\hat{p}_X)}{n_X} + \frac{\hat{p}_Y(1-\hat{p}_Y)}{n_Y}} = (0.0407 - 0.053977) \pm 1.645\,(0.01702)$$

and the C.I. is:

$$-0.0412 < p_X - p_Y < 0.0148$$

CONCLUSION
Since 0 ∈ C.I., at the 90% confidence level we cannot conclude that the proportions of defective units in the two lots differ.
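The calculation can be reproduced with a short Python sketch that uses only the standard library. The function names `two_proportion_ci` and `normal_approx_ok` are illustrative assumptions; the counts are those of the example, and the z value is obtained from `statistics.NormalDist` instead of a table.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_ci(x_def, n_x, y_def, n_y, confidence=0.90):
    """Normal-approximation CI for p_X - p_Y from defect counts."""
    p_x = x_def / n_x
    p_y = y_def / n_y
    # z_{alpha/2}: upper alpha/2 point of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_x * (1 - p_x) / n_x + p_y * (1 - p_y) / n_y)
    diff = p_x - p_y
    return diff - z * se, diff + z * se

def normal_approx_ok(defects, n):
    """Rule of thumb from the assumptions: n * p_hat * (1 - p_hat) > 9."""
    p = defects / n
    return n * p * (1 - p) > 9

# Lot data from the example: 11 of 270 and 19 of 352 defective.
lower, upper = two_proportion_ci(11, 270, 19, 352, confidence=0.90)
print(f"90% CI: {lower:.4f} < pX - pY < {upper:.4f}")
print("approximation reasonable:",
      normal_approx_ok(11, 270) and normal_approx_ok(19, 352))
```

Because the sketch keeps the unrounded proportions, its limits match the interval quoted in the solution to four decimal places.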

Exercise #13

1. Open the exercise File:
2. Trainer will show using JMP:
   a. How to perform a 2 population Proportion Test
3. Interpretation of results.

Exercise #13

Go to Tables > Stack

Stacked data

Go to Analyze > Fit Y by X

Exercise #13

Go to hot spot > Two Sample Test for Proportions

CI for Proportion

Does this CI include 0?

You can toggle the response of interest

Exercise #13

Go to Tables > Stack

Stacked data table

Go to Analyze > Fit Y by X

Exercise #13

Using summarized data | Using raw data

Module 3 Key Learnings

❑ Point and interval estimation

❑ Properties of a point estimator
  o unbiasedness, consistency, maximum efficiency

❑ Confidence interval for parameters (under the normality assumption)
  o One population
    ▪ For the mean (variance known and unknown)
    ▪ For the variance
  o Two populations
    ▪ Independent samples
      • For the difference of means (variance known and unknown, equal and unequal variances)
      • For the ratio of variances
    ▪ Dependent samples
      • For the difference of means

END OF STATS 2 PART 1

File Revision

Version   Date         Remarks                                     Who
1.0       2017         Initial Release                             Marco Della Seta
2.0       April 2021   New format and template. New exercises.     Marco Della Seta / HK Looi