Topic 05c - Inference About Population Variance

Decision Science 1– PGP 1 – Term 1 2020
DECISION SCIENCE ‐ 1
INFERENCES ABOUT POPULATION VARIANCE

Term 1: 2020
PGP ‐ 1
Notice: This material contains material that is protected by copyright. You have been granted permission to use this material only for
the course Decision Science ‐ 1 being offered Term 1 (Aug‐Oct 2020) at Indian Institute of Management, Bangalore. This presentation
may not be copied or distributed in whole or in part without permission from Ananth Krishnamurthy.
Decision Science ‐1 ‐ Course Material
Outline
Inference About Variance in a Single Population
• Statistical inference for $\sigma^2$
– Confidence intervals
– Hypothesis testing
Inference About Variances from Two Populations
• Statistical inference for $\sigma_1^2 / \sigma_2^2$

Prof. Ananth Krishnamurthy

Sample Variance $S^2$: Estimator of Population Variance $\sigma^2$
• Variance can provide important decision‐making information
• Consider the examples we saw:
– Average mileage (mpg) is important, but so is the variance
– Average assembly time of a box is important, but so is the variance
• By selecting a sample, we can compute the sample variance. If the sample variance is high,
it may indicate over‐performance or under‐performance in individual cases, even if the average
is the same
• We start with the sampling distribution of the sample variance, and use that information to
develop confidence intervals and conduct hypothesis tests
Sample Variance, $S^2$: Estimator of Population Variance, $\sigma^2$
• The sample variance, $S^2$, is the sample statistic used as an estimator of the population
variance, $\sigma^2$
• When we sample the population, we obtain a specific value of the sample variance $S^2$,
denoted by $s^2$
– For example, if we get $s^2 = 2.5$, this is our estimate of $\sigma^2$
• Every $s^2$ is a point estimate of $\sigma^2$, as the estimate is a single number
• If we had an interval estimate, we would be able to say that $\sigma^2$ lies in some interval,
say (1, 5)
[Figure: frequency distribution of the sample variance $S^2$, with individual values $s^2$ marked
and the population variance $\sigma^2$ indicated]

Use of Chi‐Square Distribution in Sample Variance Estimates
• We will assume that sampling is done from a population with Normal distribution
• Recall that: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
– If the random variables $X_i \sim N(0,1)$ for $1 \le i \le n$ are independent, then the random
variable $Y = X_1^2 + X_2^2 + \dots + X_n^2$ is said to have a Chi‐Square distribution with $n$
degrees of freedom
• The sampling distribution of $\frac{(n-1)S^2}{\sigma^2}$ has a Chi‐Square distribution with
$n - 1$ degrees of freedom whenever a simple random sample of size $n$ is drawn from a Normal
population
• We use the Chi‐Square distribution to develop interval estimates and conduct hypothesis
tests about a population variance
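The sampling-distribution claim above can be checked with a small simulation. The sketch below is illustrative only: the population mean and standard deviation (68.0 and 0.8), the sample size, the seed, and the helper name scaled_sample_variance are assumptions chosen for the demonstration, not values from the course material. It draws many Normal samples, computes $(n-1)s^2/\sigma^2$ for each, and compares the simulated mean and variance with $n - 1$ and $2(n - 1)$, the mean and variance of a Chi‐Square variable with $n - 1$ degrees of freedom.

```python
import random


def scaled_sample_variance(n, mu, sigma, rng):
    """Draw one Normal sample of size n and return (n - 1) * s^2 / sigma^2."""
    xs = [rng.gauss(mu, sigma) for _ in range(n)]
    mean = sum(xs) / n
    s2 = sum((x - mean) ** 2 for x in xs) / (n - 1)  # sample variance
    return (n - 1) * s2 / sigma ** 2


rng = random.Random(0)          # fixed seed so the run is repeatable
n, mu, sigma = 10, 68.0, 0.8    # hypothetical population parameters
draws = [scaled_sample_variance(n, mu, sigma, rng) for _ in range(20000)]

sim_mean = sum(draws) / len(draws)
sim_var = sum((d - sim_mean) ** 2 for d in draws) / (len(draws) - 1)

print(f"simulated mean     = {sim_mean:.2f}  (theory: n - 1 = {n - 1})")
print(f"simulated variance = {sim_var:.2f}  (theory: 2(n - 1) = {2 * (n - 1)})")
```

The simulated values should land close to 9 and 18, up to simulation noise.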
Chi‐Square Distribution
• If the random variables $X_i \sim N(0,1)$ for $1 \le i \le n$ are independent, then the random
variable $Y = X_1^2 + X_2^2 + \dots + X_n^2$ is said to have a Chi‐Square distribution with $n$
degrees of freedom
• It is denoted by $Y \sim \chi^2_n$ and has an expectation of $n$ and a variance of $2n$
[Figure: Chi‐Square density $f(x)$ for 2, 5, and 10 degrees of freedom, plotted for $x$ from 0 to 20]
Friedrich Robert Helmert (1843–1917) was a German geodesist and an important writer on
the theory of errors. In 1876 he discovered the chi‐squared distribution as the distribution of
the sample variance for a normal distribution.

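Chi‐Square Distribution
• We will use the notation $\chi^2_{\alpha,\nu}$ to denote the critical points of a Chi‐Square
distribution with $\nu$ degrees of freedom, where $P(X \ge \chi^2_{\alpha,\nu}) = \alpha$
• For example, there is a 0.95 probability of obtaining a $\chi^2$ (Chi‐Square) value such that:
$\chi^2_{0.975,\nu} \le \chi^2 \le \chi^2_{0.025,\nu}$
[Figure: Chi‐Square density with upper‐tail area $\alpha$ to the right of $\chi^2_{\alpha,\nu}$; a second
panel shows 95% of the possible $\chi^2$ values lying between $\chi^2_{0.975,\nu}$ and
$\chi^2_{0.025,\nu}$, with an area of 0.025 in each tail]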
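Critical points are normally read from a table, but they can also be computed. The following is a minimal sketch using only the Python standard library; the function names chi2_cdf and chi2_upper_critical are hypothetical helpers written for this illustration, and the choice of 9 degrees of freedom is an assumption. The CDF is evaluated through the series for the regularized lower incomplete gamma function, and the critical point is found by bisection on the upper-tail area.

```python
import math


def chi2_cdf(x, df):
    """P(X <= x) for a Chi-Square variable with df degrees of freedom.

    Uses the series expansion of the regularized lower incomplete
    gamma function P(df/2, x/2).
    """
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-15 * total:
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))


def chi2_upper_critical(alpha, df):
    """Return c such that P(X >= c) = alpha, found by bisection."""
    lo, hi = 0.0, df + 10.0 * math.sqrt(2.0 * df) + 10.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if 1.0 - chi2_cdf(mid, df) > alpha:
            lo = mid   # tail area still too large: move right
        else:
            hi = mid   # tail area too small: move left
    return (lo + hi) / 2.0


df = 9  # illustrative degrees of freedom
lower = chi2_upper_critical(0.975, df)
upper = chi2_upper_critical(0.025, df)
print(f"chi2(0.975, {df}) = {lower:.2f}")
print(f"chi2(0.025, {df}) = {upper:.2f}")
```

For 9 degrees of freedom, standard tables list these two points as 2.70 and 19.02, which gives a quick check on the sketch.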
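Confidence Interval Estimates for $\sigma^2$
• There is a $(1 - \alpha)$ probability of obtaining a $\chi^2$ value such that:
$\chi^2_{1-\alpha/2,\,n-1} \le \chi^2 \le \chi^2_{\alpha/2,\,n-1}$
• Substituting $\frac{(n-1)s^2}{\sigma^2}$ for the $\chi^2$, we get:
$\chi^2_{1-\alpha/2,\,n-1} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2,\,n-1}$
• Performing algebraic manipulations, we get:
$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$
[Figure: Chi‐Square density with 95% of the possible $\chi^2$ values between $\chi^2_{0.975,\nu}$ and
$\chi^2_{0.025,\nu}$, and an area of 0.025 in each tail]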

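Confidence Interval Estimates for $\sigma^2$
• Interval estimate for the population variance:
$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}$
• And, taking square roots, for the population standard deviation:
$\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}}} \le \sigma \le \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}}}$
• The $\chi^2$ values are based on a Chi‐Square distribution with $n - 1$ degrees of freedom
• The term $(1 - \alpha)$ is the confidence coefficient
[Figure: Chi‐Square density with 95% of the possible $\chi^2$ values between $\chi^2_{0.975,\nu}$ and
$\chi^2_{0.025,\nu}$, and an area of 0.025 in each tail]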
Example: Bart’s Chemistry Lab
• Bart is conducting a series of chemistry experiments to determine the exact temperature at
which a certain reaction starts. Recordings from 10 recent experiments are shown:

Experiment No.    Temperature
1                 67.4
2                 67.8
3                 68.2
4                 69.3
5                 69.5
6                 67.0
7                 68.1
8                 68.6
9                 67.9
10                67.2

• Determine a 95% confidence interval for the variance of the temperature at which the
reaction starts.


Example: Bart’s Chemistry Lab
• We have $n = 10$ and $\nu = n - 1 = 9$ degrees of freedom, i.e. $\chi^2$ has a mean of 9 and a
variance of 18
• We see that: $\chi^2_{0.975,9} = 2.70$ and $\chi^2_{0.025,9} = 19.02$
• Therefore: $\chi^2_{0.975,9} \le \chi^2 \le \chi^2_{0.025,9}$, or $2.70 \le \chi^2 \le 19.02$
• From the 10 readings: sample std. dev. $s = 0.84$ and sample variance $s^2 = 0.70$
[Figure: Chi‐Square density with 9 degrees of freedom; 95% of the possible $\chi^2$ values lie
between 2.70 and 19.02, with an area of 0.025 in each tail]

Example: Bart’s Chemistry Lab
• Since: $\chi^2_{0.975,9} \le \chi^2 \le \chi^2_{0.025,9}$, or $2.70 \le \chi^2 \le 19.02$, and
$\chi^2 = \frac{(n-1)s^2}{\sigma^2}$
• We can write: $2.70 \le \frac{(n-1)s^2}{\sigma^2} \le 19.02$
• Which gives: $\frac{9 \times 0.70}{19.02} \le \sigma^2 \le \frac{9 \times 0.70}{2.70}$, or
$0.33 \le \sigma^2 \le 2.33$
• The 95% confidence interval for the variance of the temperature at which the reaction starts
is given by: $0.33 \le \sigma^2 \le 2.33$
• From the 10 readings: sample std. dev. $s = 0.84$ and sample variance $s^2 = 0.70$
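The arithmetic of this example can be reproduced in a few lines. This is a sketch rather than a prescribed procedure: it takes the 10 readings and the two table values (2.70 and 19.02) from the slides, uses the standard library function statistics.variance for $s^2$, and the variable names are chosen only for illustration. The last line also reports the corresponding interval for the standard deviation, obtained by taking square roots of the variance bounds.

```python
import math
import statistics

# Bart's 10 temperature readings from the example
temps = [67.4, 67.8, 68.2, 69.3, 69.5, 67.0, 68.1, 68.6, 67.9, 67.2]
n = len(temps)
s2 = statistics.variance(temps)   # sample variance, divisor n - 1

# Critical values for 9 degrees of freedom, as read from the table in the slides
chi2_lower, chi2_upper = 2.70, 19.02

var_low = (n - 1) * s2 / chi2_upper    # lower bound uses the larger critical value
var_high = (n - 1) * s2 / chi2_lower   # upper bound uses the smaller critical value

print(f"s^2 = {s2:.2f}")
print(f"95% CI for the variance:  ({var_low:.2f}, {var_high:.2f})")
print(f"95% CI for the std. dev.: ({math.sqrt(var_low):.2f}, {math.sqrt(var_high):.2f})")
```

Note that the interval is not symmetric around $s^2 = 0.70$, because the Chi‐Square distribution itself is not symmetric.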

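Hypothesis Testing for Population Variance, $\sigma^2$
• Next, we look at hypothesis tests involving the population variance (i.e. $\sigma^2$)
• Let $\sigma_0^2$ be the hypothesized value of the population variance
– Lower‐tail: $H_0: \sigma^2 \ge \sigma_0^2$, $H_a: \sigma^2 < \sigma_0^2$
– Upper‐tail: $H_0: \sigma^2 \le \sigma_0^2$, $H_a: \sigma^2 > \sigma_0^2$
– Two‐tailed: $H_0: \sigma^2 = \sigma_0^2$, $H_a: \sigma^2 \ne \sigma_0^2$
• Test statistic used: $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$
• We can use the p‐value or the critical value approach used earlier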
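Hypothesis Testing for Population Variance, $\sigma^2$
• Following the logic of the p‐value and critical value approaches used earlier, we can
summarize the procedure for hypothesis tests for $\sigma^2$ as follows:

Test type | Lower Tail Test | Upper Tail Test | Two‐Tailed Test
Hypotheses | $H_0: \sigma^2 \ge \sigma_0^2$; $H_a: \sigma^2 < \sigma_0^2$ | $H_0: \sigma^2 \le \sigma_0^2$; $H_a: \sigma^2 > \sigma_0^2$ | $H_0: \sigma^2 = \sigma_0^2$; $H_a: \sigma^2 \ne \sigma_0^2$
Test Statistic | $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$ | $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$ | $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$
Rejection Rule (p‐value) | Reject $H_0$ if p‐value $\le \alpha$ | Reject $H_0$ if p‐value $\le \alpha$ | Reject $H_0$ if p‐value $\le \alpha$
Rejection Rule (critical value) | Reject $H_0$ if $\chi^2 \le \chi^2_{1-\alpha,\,n-1}$ | Reject $H_0$ if $\chi^2 \ge \chi^2_{\alpha,\,n-1}$ | Reject $H_0$ if $\chi^2 \le \chi^2_{1-\alpha/2,\,n-1}$ or $\chi^2 \ge \chi^2_{\alpha/2,\,n-1}$

• Note: Because the $\chi^2$ distribution is not symmetric, the lower and upper critical values
are different numbers and must each be looked up separately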
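The critical value rules in the summary can be written as a small decision function. This is a sketch under stated assumptions: the names chi2_statistic and reject_h0 are hypothetical, the critical values are expected to come from a Chi‐Square table with $n - 1$ degrees of freedom, and the sample variance of 1.2 in the usage lines is an invented number for illustration. The table values 2.70 and 19.02 are the ones quoted for 9 degrees of freedom in the confidence interval example.

```python
def chi2_statistic(n, s2, sigma0_sq):
    """Test statistic (n - 1) * s^2 / sigma_0^2."""
    return (n - 1) * s2 / sigma0_sq


def reject_h0(chi2_stat, tail, crit_lower=None, crit_upper=None):
    """Critical value decision for a Chi-Square test on one variance.

    tail is "lower", "upper", or "two". crit_lower and crit_upper are
    table values for n - 1 degrees of freedom; only the ones needed for
    the chosen tail have to be supplied.
    """
    if tail == "lower":
        return chi2_stat <= crit_lower
    if tail == "upper":
        return chi2_stat >= crit_upper
    if tail == "two":
        return chi2_stat <= crit_lower or chi2_stat >= crit_upper
    raise ValueError("tail must be 'lower', 'upper', or 'two'")


# Hypothetical two-tailed test at alpha = 0.05 with n = 10 (9 degrees of freedom)
stat = chi2_statistic(n=10, s2=1.2, sigma0_sq=0.5)
decision = reject_h0(stat, "two", crit_lower=2.70, crit_upper=19.02)
print(f"chi2 = {stat:.2f}, reject H0: {decision}")
```

Passing the two critical values separately mirrors the note above: the lower and upper points are not mirror images of each other.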


Example: Bart’s Chemistry Lab
• Bart is conducting a series of chemistry experiments to determine the exact temperature at
which a certain reaction starts. Recordings from 10 recent experiments are shown.
• Bart’s teacher considers the readings to be accurate if the variance is 0.5 or less. If the
variance is greater than 0.5, Bart will have to conduct more experiments.
• Conduct a hypothesis test with $\alpha = 0.10$ to determine whether Bart needs to conduct
more experiments.

Experiment No.    Temperature
1                 67.4
2                 67.8
3                 68.2
4                 69.3
5                 69.5
6                 67.0
7                 68.1
8                 68.6
9                 67.9
10                67.2

Sample std. dev: $s = 0.84$
Sample variance: $s^2 = 0.70$
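Bart’s Chemistry Lab: Using p‐Value Approach
• Develop the null and alternate hypotheses ($H_0$, $H_a$):
$H_0: \sigma^2 \le \sigma_0^2 = 0.5$ and $H_a: \sigma^2 > \sigma_0^2 = 0.5$
• Specify the level of significance ($\alpha$): $\alpha = 0.10$
• Collect sample data and compute the value of the test statistic: $n = 10$, $df = 9$, $s^2 = 0.70$
$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{9 \times 0.70}{0.5} = 12.60$
• Use the value of the test statistic ($\chi^2$) to compute the p‐value:
p‐value $= P(\chi^2 \ge 12.60)$. Since $P(\chi^2 \ge 14.68) = 0.10$ and $12.60 < 14.68$,
p‐value $= P(\chi^2 \ge 12.60) = 0.182 > 0.10$
• Reject $H_0$ if the p‐value $\le \alpha$: here p‐value $> \alpha = 0.10$, so we cannot reject $H_0$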
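The p‐value of 0.182 can be checked numerically. The sketch below assumes nothing beyond the Python standard library; chi2_sf is a hypothetical helper written for this illustration that evaluates the upper-tail area through the series for the regularized lower incomplete gamma function. The inputs ($n = 10$, $s^2 = 0.70$, $\sigma_0^2 = 0.5$, $\alpha = 0.10$) are the ones from the example.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a Chi-Square variable."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-15 * total:      # series for the lower incomplete gamma
        term *= z / (a + k)
        total += term
        k += 1
    cdf = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - cdf


n, s2, sigma0_sq, alpha = 10, 0.70, 0.5, 0.10
chi2_stat = (n - 1) * s2 / sigma0_sq
p_value = chi2_sf(chi2_stat, n - 1)   # upper-tail test

print(f"chi2 = {chi2_stat:.2f}, p-value = {p_value:.3f}")
print("reject H0" if p_value <= alpha else "cannot reject H0")
```

Because the p‐value exceeds $\alpha = 0.10$, the sketch reaches the same decision as the slide.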

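Bart’s Chemistry Lab: Using Critical Value Approach
• Develop the null and alternate hypotheses ($H_0$, $H_a$):
$H_0: \sigma^2 \le \sigma_0^2 = 0.5$ and $H_a: \sigma^2 > \sigma_0^2 = 0.5$
• Specify the level of significance ($\alpha$): $\alpha = 0.10$
• Collect sample data and compute the value of the test statistic: $n = 10$, $df = 9$, $s^2 = 0.70$
$\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{9 \times 0.70}{0.5} = 12.60$
• Use $\alpha$ to determine the critical value and rejection rule:
critical value $\chi^2_{0.10,9} = 14.68$, so reject $H_0$ if $\chi^2 \ge 14.68$
• Use the value of the test statistic and the rejection rule to decide on $H_0$:
$12.60 = \chi^2 < \chi^2_{0.10,9} = 14.68$, so we cannot reject $H_0$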
Outline
Inference About Variance in a Single Population
• Statistical inference for $\sigma^2$
Inference About Variances from Two Populations
• Statistical inference for $\sigma_1^2 / \sigma_2^2$

The F‐Distribution
• The ratio of two independent Chi‐Square random variables that have been divided by their
respective degrees of freedom is defined as the $F$‐distribution. This ratio is written as:
$F_{\nu_1,\nu_2} \sim \frac{\chi^2_{\nu_1} / \nu_1}{\chi^2_{\nu_2} / \nu_2}$
• The $F$‐distribution is defined on the positive state space (i.e. the random variable takes
only positive values)
• It has an expectation close to one and its variance decreases as the degrees of freedom
$\nu_1$ and $\nu_2$ increase
Sir Ronald Aylmer Fisher (1890–1962) was a British statistician and geneticist. For his work
in statistics, he has been described as "a genius who almost single‐handedly created the
foundations for modern statistical science" and "the single most important figure in 20th
century statistics".
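The definition above translates directly into a simulation. The following sketch is an illustration under assumptions: the degrees of freedom (9, 9) and (50, 50), the seed, the number of draws, and the helper names f_draw and summarize are all choices made for the demonstration. It relies on the fact that a Chi‐Square variable with $\nu$ degrees of freedom is a Gamma variable with shape $\nu/2$ and scale 2, which random.gammavariate can generate.

```python
import random


def f_draw(nu1, nu2, rng):
    """One F draw built from the definition (chi2_1 / nu1) / (chi2_2 / nu2)."""
    chi2_1 = rng.gammavariate(nu1 / 2.0, 2.0)   # Chi-Square with nu1 df
    chi2_2 = rng.gammavariate(nu2 / 2.0, 2.0)   # Chi-Square with nu2 df
    return (chi2_1 / nu1) / (chi2_2 / nu2)


def summarize(nu1, nu2, rng, reps=50000):
    """Return (mean, variance, minimum) of simulated F draws."""
    draws = [f_draw(nu1, nu2, rng) for _ in range(reps)]
    mean = sum(draws) / reps
    var = sum((d - mean) ** 2 for d in draws) / (reps - 1)
    return mean, var, min(draws)


rng = random.Random(1)   # fixed seed so the run is repeatable
mean_small, var_small, min_small = summarize(9, 9, rng)
mean_large, var_large, min_large = summarize(50, 50, rng)

# Theory: the mean is nu2 / (nu2 - 2) for nu2 > 2, i.e. 9/7 and 50/48 here
print(f"df (9, 9):   mean = {mean_small:.3f}, variance = {var_small:.3f}")
print(f"df (50, 50): mean = {mean_large:.3f}, variance = {var_large:.3f}")
```

With more degrees of freedom the simulated mean moves closer to one and the variance shrinks, in line with the last bullet, and every draw is positive.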
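The F‐Distribution
• We will use the notation $F_{\alpha,\nu_1,\nu_2}$ to denote the critical points of an F‐distribution
$F_{\nu_1,\nu_2} \sim \frac{\chi^2_{\nu_1} / \nu_1}{\chi^2_{\nu_2} / \nu_2}$, where
$P(X \ge F_{\alpha,\nu_1,\nu_2}) = \alpha$
[Figure: F density with upper‐tail area $\alpha$ to the right of $F_{\alpha,\nu_1,\nu_2}$]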

Inferences About Two Population Variances, $\sigma_1^2$ and $\sigma_2^2$
• The Scenario:

Population 1                                Population 2
$\sigma_1^2$: unknown variance              $\sigma_2^2$: unknown variance
Simple random sample of size $n_1$          Simple random sample of size $n_2$
$s_1^2$ = sample variance                   $s_2^2$ = sample variance

• Ratio of the variances: $\sigma_1^2 / \sigma_2^2$
• $s_1^2 / s_2^2$: point estimate of the ratio between the variances
Inferences About Two Population Variances, $\sigma_1^2$ and $\sigma_2^2$
• Recall: The sampling distribution of $\frac{(n-1)S^2}{\sigma^2}$ has a Chi‐Square distribution
$\chi^2_{n-1}$ (with $n - 1$ degrees of freedom) whenever a simple random sample of size $n$ is
drawn from a Normal population
• Suppose we have two independent random samples from two Normally distributed populations
• Then these samples will give rise to two random variables, $S_1^2$ and $S_2^2$, where:
– $\frac{(n_1-1)S_1^2}{\sigma_1^2} \sim \chi^2_{n_1-1}$, with $n_1 - 1$ degrees of freedom
– $\frac{(n_2-1)S_2^2}{\sigma_2^2} \sim \chi^2_{n_2-1}$, with $n_2 - 1$ degrees of freedom

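Hypothesis Testing for Population Variances $\sigma_1^2$ and $\sigma_2^2$
• Then the ratio of these two random variables $S_1^2$ and $S_2^2$ will be the random variable
$\frac{S_1^2}{S_2^2}$, where:
$\frac{S_1^2}{S_2^2} = \frac{\sigma_1^2 \cdot \chi^2_{n_1-1} / (n_1-1)}{\sigma_2^2 \cdot \chi^2_{n_2-1} / (n_2-1)}$
• When the population variances are equal, i.e. $\sigma_1^2 = \sigma_2^2$, then
$\frac{S_1^2}{S_2^2} = \frac{\chi^2_{n_1-1} / (n_1-1)}{\chi^2_{n_2-1} / (n_2-1)}$,
which is the ratio of two Chi‐Square random variables, each divided by its own degrees of freedom
• Then $F = \frac{S_1^2}{S_2^2}$ is a random variable with an $F$ distribution with $n_1 - 1$ and
$n_2 - 1$ degrees of freedom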
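• The test statistic used to investigate the ratio of population variances of two normally
distributed populations is (with the second equality holding when $\sigma_1^2 = \sigma_2^2$):
$F = \frac{S_1^2}{S_2^2} = \frac{\chi^2_{n_1-1} / (n_1-1)}{\chi^2_{n_2-1} / (n_2-1)}$
• For a specific sampling experiment, this random variable, $F$, takes the value
$s_1^2 / s_2^2$, just like we got a value $\bar{x}$ for the random variable $\bar{X}$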


• We consider the following hypothesis tests:
– Upper‐tail: $H_0: \sigma_1^2 \le \sigma_2^2$, $H_a: \sigma_1^2 > \sigma_2^2$
– Two‐tailed: $H_0: \sigma_1^2 = \sigma_2^2$, $H_a: \sigma_1^2 \ne \sigma_2^2$
• We usually denote the population providing the larger sample variance as population 1
• Since the $F$‐test statistic is constructed with the larger sample variance, $s_1^2$, in the
numerator, the value of the test statistic will be in the upper tail of the F distribution
(greater than 1)
• Therefore we do not consider lower‐tail tests, and tables do not list lower‐tail critical values
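Summary of Hypothesis Testing for $\sigma_1^2$ and $\sigma_2^2$
• Summary of approaches, assuming sampling from Normal populations and $s_1^2 \ge s_2^2$:

Test type | Upper Tail Test | Two‐Tailed Test
Hypotheses | $H_0: \sigma_1^2 \le \sigma_2^2$; $H_a: \sigma_1^2 > \sigma_2^2$ | $H_0: \sigma_1^2 = \sigma_2^2$; $H_a: \sigma_1^2 \ne \sigma_2^2$
Test Statistic | $F = \frac{S_1^2}{S_2^2} = \frac{\chi^2_{n_1-1} / (n_1-1)}{\chi^2_{n_2-1} / (n_2-1)}$ | $F = \frac{S_1^2}{S_2^2} = \frac{\chi^2_{n_1-1} / (n_1-1)}{\chi^2_{n_2-1} / (n_2-1)}$
Rejection Rule (p‐value) | Reject $H_0$ if p‐value $\le \alpha$ | Reject $H_0$ if p‐value $\le \alpha$
Rejection Rule (critical value) | Reject $H_0$ if $F \ge F_{\alpha}$ | Reject $H_0$ if $F \ge F_{\alpha/2}$

• Note: $F_{\alpha}$ and $F_{\alpha/2}$ are computed based on the $F$ distribution with $n_1 - 1$
degrees of freedom in the numerator and $n_2 - 1$ degrees of freedom in the denominator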

Example: Bart’s Chemistry Lab Versus Lisa’s Chemistry Lab
• Bart is conducting a series of chemistry experiments to determine the exact temperature at
which a certain reaction starts. Recordings from 10 experiments are shown. Lisa conducts the
same chemistry experiment. Recordings from her 10 experiments are also given.
• Bart’s teacher wants to compare the variances from both experiments and determine if they
are significantly different.
• Conduct a hypothesis test with $\alpha = 0.10$ to support this goal.

Experiment No.     Bart’s Temperature Reading    Lisa’s Temperature Reading
1                  67.4                          67.7
2                  67.8                          66.4
3                  68.2                          69.2
4                  69.3                          70.1
5                  69.5                          69.5
6                  67.0                          69.7
7                  68.1                          68.1
8                  68.6                          66.6
9                  67.9                          67.3
10                 67.2                          67.5
Sample Std. Dev    $s_1 = 0.84$                  $s_2 = 1.33$
Sample Variance    $s_1^2 = 0.70$                $s_2^2 = 1.77$
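The summary rows of the table can be reproduced from the raw readings. This is a short sketch using the standard library function statistics.variance; the variable names are illustrative. Following the convention stated earlier, the larger sample variance goes in the numerator of the $F$ statistic, which here means Lisa's variance is divided by Bart's.

```python
import math
import statistics

bart = [67.4, 67.8, 68.2, 69.3, 69.5, 67.0, 68.1, 68.6, 67.9, 67.2]
lisa = [67.7, 66.4, 69.2, 70.1, 69.5, 69.7, 68.1, 66.6, 67.3, 67.5]

s2_bart = statistics.variance(bart)   # sample variance, divisor n - 1
s2_lisa = statistics.variance(lisa)

# Larger sample variance in the numerator, so the statistic is at least 1
f_stat = max(s2_bart, s2_lisa) / min(s2_bart, s2_lisa)

print(f"Bart: s = {math.sqrt(s2_bart):.2f}, s^2 = {s2_bart:.2f}")
print(f"Lisa: s = {math.sqrt(s2_lisa):.2f}, s^2 = {s2_lisa:.2f}")
print(f"F = {f_stat:.3f}")
```

Computed from unrounded variances, $F$ comes out slightly below the value obtained from the rounded figures 1.77 and 0.70; the difference does not affect the decision.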
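Bart Versus Lisa: Using p‐Value Approach
• Develop the null and alternate hypotheses:
$H_0: \sigma_1^2 = \sigma_2^2$ and $H_a: \sigma_1^2 \ne \sigma_2^2$
• Specify the level of significance: $\alpha = 0.10$
• Collect sample data and compute the value of the test statistic ($F$): both samples have 10
readings, so each sample variance has 9 degrees of freedom; $s_1^2 = 0.70$ and $s_2^2 = 1.77$.
Placing the larger sample variance (Lisa’s) in the numerator:
$F = \frac{1.77}{0.70} = 2.53$
• Use the value of the test statistic ($F$) to compute the p‐value:
p‐value $= 2 \times P(F \ge 2.53)$. Since $P(F \ge 2.44) = 0.10$ and $P(F \ge 3.18) = 0.05$,
$P(F \ge 2.53) = 0.092$ lies between 0.05 and 0.10, so p‐value $= 2 \times 0.092 = 0.184 > 0.10$
• Reject $H_0$ if the p‐value $\le \alpha$: here p‐value $> \alpha = 0.10$, so we cannot reject $H_0$
• There is insufficient evidence to conclude that the population variances differ for the two
lab results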
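The two‐tailed p‐value can also be approximated by simulation instead of a table lookup. The sketch below is a Monte Carlo check, not the method used in the slides: it repeatedly draws two independent Normal samples of size 10 with equal variances (the null hypothesis), forms the ratio of their sample variances, and counts how often that ratio is at least the observed $F = 1.77 / 0.70$. The seed, the number of repetitions, and the helper name sample_variance are assumptions, and the result carries simulation noise.

```python
import random


def sample_variance(xs):
    """Sample variance with divisor n - 1."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)


rng = random.Random(2)    # fixed seed so the run is repeatable
n1 = n2 = 10              # 10 readings in each lab
f_obs = 1.77 / 0.70       # observed statistic, larger variance on top
reps = 40000

exceed = 0
for _ in range(reps):
    # Under H0 both populations share the same variance; N(0, 1) is enough
    # because the variance ratio does not depend on the common mean or scale
    a = [rng.gauss(0.0, 1.0) for _ in range(n1)]
    b = [rng.gauss(0.0, 1.0) for _ in range(n2)]
    if sample_variance(a) / sample_variance(b) >= f_obs:
        exceed += 1

p_one_tail = exceed / reps
p_two_tailed = 2 * p_one_tail
print(f"estimated P(F >= {f_obs:.2f}) = {p_one_tail:.3f}")
print(f"estimated two-tailed p-value = {p_two_tailed:.3f}")
```

The estimate should fall near the slide's value of 0.184 and above $\alpha = 0.10$, which supports the decision not to reject $H_0$.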

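Bart Versus Lisa: Using Critical Value Approach
• Develop the null and alternate hypotheses:
$H_0: \sigma_1^2 = \sigma_2^2$ and $H_a: \sigma_1^2 \ne \sigma_2^2$
• Specify the level of significance: $\alpha = 0.10$
• Collect sample data and compute the value of the test statistic ($F$): both samples have 10
readings, so each sample variance has 9 degrees of freedom; $s_1^2 = 0.70$ and $s_2^2 = 1.77$.
Placing the larger sample variance (Lisa’s) in the numerator:
$F = \frac{1.77}{0.70} = 2.53$
• Use $\alpha$ to determine the critical value and rejection rule:
critical value $F_{\alpha/2} = F_{0.05} = 3.18$, so reject $H_0$ if $F \ge F_{0.05} = 3.18$
• Use the value of the test statistic and the rejection rule to decide on $H_0$:
$2.53 = F < F_{0.05} = 3.18$, so we cannot reject $H_0$
• There is insufficient evidence to conclude that the population variances differ for the two
lab results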

Questions?
Acknowledgments: The authors are thankful for permission to use cartoons and graphics in this
presentation. They are solely intended to improve educational experience of students.