
Decision Science 1– PGP 1 – Term 1 2020

DECISION SCIENCE ‐ 1

INFERENCES ABOUT POPULATION VARIANCE


Term 1: 2020

PGP ‐ 1

Notice: This material contains material that is protected by copyright. You have been granted permission to use this material only for
the course Decision Science ‐ 1 being offered Term 1 (Aug‐Oct 2020) at Indian Institute of Management, Bangalore. This presentation
may not be copied or distributed in whole or in part without permission from Ananth Krishnamurthy.

Decision Science ‐1 ‐ Course Material

Outline

Inference About Variance in a Single Population

• Statistical inference for $\sigma^2$
– Confidence intervals
– Hypothesis testing

Inference About Variances from Two Populations

• Statistical inference for $\sigma_1^2 / \sigma_2^2$
– Confidence intervals
– Hypothesis testing


Prof. Ananth Krishnamurthy 1



Sample Variance $S^2$: Estimator of Population Variance $\sigma^2$

• Variance can provide important decision‐making information

• Consider the examples we saw


– Average mileage (mpg) is important, but so is the variance
– Average assembly time of a box is important, but so is the variance

• By selecting a sample, we can compute the sample variance. If the sample variance is high,
it may indicate over‐performance or under‐performance, even if the average is the same

• We start with the sampling distribution for the variance, and use that information to
develop confidence intervals and conduct hypothesis tests


Sample Variance, $S^2$: Estimator of Population Variance, $\sigma^2$

• The sample variance, $S^2$, is the sample statistic used as an estimator of the population variance, $\sigma^2$

• When we sample the population, we obtain a specific value of the sample variance $S^2$, denoted by $s^2$
– For example, if we get $s^2 = 2.5$, this is our estimate of $\sigma^2$

• Every $s^2$ is a point estimate of $\sigma^2$, as the estimate is a single number

• If we had an interval estimate, we would be able to say that $\sigma^2$ lies in some interval, say (1, 5)

[Figure: a population distribution with variance $\sigma^2$, and a frequency plot of the sample variance ($S^2$) with individual values $s^2$ marked]

Use of Chi‐Square Distribution in Sample Variance Estimates

• We will assume that sampling is done from a population with a Normal distribution

• Recall that: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$
– If the random variables $X_i \sim N(0,1)$ for $1 \le i \le n$ are independent, then the random variable $Y = X_1^2 + X_2^2 + \dots + X_n^2$ is said to have a Chi‐Square distribution with $n$ degrees of freedom.

• The sampling distribution of $\frac{(n-1)S^2}{\sigma^2}$ has a Chi‐Square distribution with $n - 1$ degrees of freedom whenever a simple random sample of size $n$ is drawn from a Normal population

• We use the Chi‐Square distribution to develop interval estimates and conduct hypothesis tests about a population variance
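The sampling-distribution claim above can be checked with a small simulation. The sketch below is only an illustration under assumed values (the seed, the population mean 68.0 and standard deviation 0.8 are not from the slides): it repeatedly draws samples of size 10 from a Normal population and looks at the mean and variance of $(n-1)s^2/\sigma^2$, which should be close to $n-1$ and $2(n-1)$ if the quantity is Chi‐Square with $n-1$ degrees of freedom.

```python
import random
import statistics

def scaled_sample_variance(n, mu, sigma, rng):
    """Draw one sample of size n from N(mu, sigma^2) and return (n-1)*s^2/sigma^2."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    return (n - 1) * statistics.variance(sample) / sigma ** 2

rng = random.Random(2020)      # fixed seed so the run is repeatable (arbitrary choice)
n, mu, sigma = 10, 68.0, 0.8   # illustrative population values, not from the slides

draws = [scaled_sample_variance(n, mu, sigma, rng) for _ in range(10000)]
sim_mean = statistics.fmean(draws)
sim_var = statistics.variance(draws)

print(f"simulated mean     = {sim_mean:.2f} (theory: n - 1 = {n - 1})")
print(f"simulated variance = {sim_var:.2f} (theory: 2(n - 1) = {2 * (n - 1)})")
```

The simulated values will not match the theoretical ones exactly; they should only be close, and closer as the number of replications grows.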

Chi‐Square Distribution
• If the random variables $X_i \sim N(0,1)$ for $1 \le i \le n$ are independent, then the random variable $Y = X_1^2 + X_2^2 + \dots + X_n^2$ is said to have a Chi‐Square distribution with $n$ degrees of freedom.
• It is denoted by $Y \sim \chi^2_n$ and has an expectation of $n$ and a variance of $2n$.

[Figure: Chi‐Square density $f(x)$ for 2, 5, and 10 degrees of freedom, plotted for $x$ from 0 to 20]

Friedrich Robert Helmert (1843–1917) was a German geodesist and an important writer on the theory of errors. In 1876 he discovered the chi‐squared distribution as the distribution of the sample variance for a normal distribution.
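The expectation $n$ and variance $2n$ quoted for the Chi‐Square distribution follow from the moments of a standard Normal variable. A short derivation, assuming the standard facts $E[X_i^2] = 1$ and $E[X_i^4] = 3$ for $X_i \sim N(0,1)$ (these are not stated on the slide):

```latex
\begin{align*}
E[Y] &= \sum_{i=1}^{n} E[X_i^2] = n \cdot 1 = n, \\
\operatorname{Var}(X_i^2) &= E[X_i^4] - \left(E[X_i^2]\right)^2 = 3 - 1 = 2, \\
\operatorname{Var}(Y) &= \sum_{i=1}^{n} \operatorname{Var}(X_i^2) = 2n
  \quad \text{(the } X_i \text{ are independent).}
\end{align*}
```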

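Chi‐Square Distribution
• We will use the notation $\chi^2_{\alpha,\nu}$ to denote the critical points of a Chi‐Square distribution with $\nu$ degrees of freedom:
$$P\left(X \ge \chi^2_{\alpha,\nu}\right) = \alpha$$

[Figure: Chi‐Square density with an upper‐tail area of $\alpha$ to the right of $\chi^2_{\alpha,\nu}$]

• For example, there is a 0.95 probability of obtaining a $\chi^2$ (Chi‐Square) value such that:
$$\chi^2_{0.975,\nu} \le \chi^2 \le \chi^2_{0.025,\nu}$$

[Figure: Chi‐Square density with an area of 0.025 in each tail; 95% of the possible $\chi^2$ values lie between $\chi^2_{0.975,\nu}$ and $\chi^2_{0.025,\nu}$]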


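Confidence Interval Estimates for $\sigma^2$

• There is a $(1 - \alpha)$ probability of obtaining a $\chi^2$ value such that:
$$\chi^2_{(1-\alpha/2),\nu} \le \chi^2 \le \chi^2_{\alpha/2,\nu}$$

• Substituting $\frac{(n-1)s^2}{\sigma^2}$ for the $\chi^2$ we get:
$$\chi^2_{(1-\alpha/2),\nu} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2,\nu}$$

• Performing algebraic manipulations, we get:
$$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\nu}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{(1-\alpha/2),\nu}}$$

[Figure: Chi‐Square density with an area of 0.025 in each tail; 95% of the possible $\chi^2$ values lie between $\chi^2_{0.975,\nu}$ and $\chi^2_{0.025,\nu}$]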
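The algebraic manipulation mentioned on this slide can be spelled out as follows. This is a sketch that writes $\nu = n - 1$ and takes $\chi^2_{\alpha,\nu}$ to be the point with upper‐tail area $\alpha$. Every quantity involved is positive, so taking reciprocals reverses the inequalities, and multiplying through by $(n-1)s^2$ preserves them:

```latex
\begin{align*}
\chi^2_{(1-\alpha/2),\nu} \le \frac{(n-1)s^2}{\sigma^2} \le \chi^2_{\alpha/2,\nu}
 &\iff \frac{1}{\chi^2_{\alpha/2,\nu}} \le \frac{\sigma^2}{(n-1)s^2}
       \le \frac{1}{\chi^2_{(1-\alpha/2),\nu}} \\
 &\iff \frac{(n-1)s^2}{\chi^2_{\alpha/2,\nu}} \le \sigma^2
       \le \frac{(n-1)s^2}{\chi^2_{(1-\alpha/2),\nu}}
\end{align*}
```

Note that the larger critical value ends up in the denominator of the lower limit, and the smaller one in the denominator of the upper limit.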


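Confidence Interval Estimates for $\sigma^2$

• Interval estimate for the population variance:
$$\frac{(n-1)s^2}{\chi^2_{\alpha/2,\nu}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{(1-\alpha/2),\nu}}$$

• And, taking square roots, for the population standard deviation:
$$\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2,\nu}}} \le \sigma \le \sqrt{\frac{(n-1)s^2}{\chi^2_{(1-\alpha/2),\nu}}}$$

• The $\chi^2$ values are based on a Chi‐Square distribution with $\nu = n - 1$ degrees of freedom
• The term $(1 - \alpha)$ is the confidence coefficient

[Figure: Chi‐Square density with an area of 0.025 in each tail; 95% of the possible $\chi^2$ values lie between $\chi^2_{0.975,\nu}$ and $\chi^2_{0.025,\nu}$]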


Example: Bart’s Chemistry Lab

• Bart is conducting a series of chemistry experiments to determine the exact temperature when a certain reaction starts. Recordings from 10 recent experiments are shown:

Experiment No.   Temperature
1                67.4
2                67.8
3                68.2
4                69.3
5                69.5
6                67.0
7                68.1
8                68.6
9                67.9
10               67.2

• Determine a 95% confidence interval for the variance of the temperature at which the reaction starts.
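The sample variance of these ten readings can be checked by hand. The arithmetic below is a worked sketch using the definition $s^2 = \sum (x_i - \bar{x})^2 / (n-1)$; the squared deviations are listed in the order of the experiments:

```latex
\begin{align*}
\bar{x} &= \frac{67.4 + 67.8 + \dots + 67.2}{10} = \frac{681.0}{10} = 68.1 \\
\sum_{i=1}^{10} (x_i - \bar{x})^2
  &= 0.49 + 0.09 + 0.01 + 1.44 + 1.96 + 1.21 + 0.00 + 0.25 + 0.04 + 0.81 = 6.30 \\
s^2 &= \frac{6.30}{10 - 1} = 0.70, \qquad s = \sqrt{0.70} \approx 0.84
\end{align*}
```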


Example: Bart’s Chemistry Lab

• We have $n = 10$ and $\nu = n - 1 = 9$ degrees of freedom, i.e. $\chi^2_9$ has a mean of 9 and a variance of 18

• We see that: $\chi^2_{0.975,9} = 2.70$ and $\chi^2_{0.025,9} = 19.02$

• Therefore: $\chi^2_{0.975,9} \le \chi^2 \le \chi^2_{0.025,9}$, or $2.70 \le \chi^2 \le 19.02$

• From the 10 readings: sample std. dev. $s = 0.84$, sample variance $s^2 = 0.70$

[Figure: Chi‐Square density with 9 degrees of freedom; an area of 0.025 lies in each tail and 95% of the possible $\chi^2$ values lie between 2.70 and 19.02]


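Example: Bart’s Chemistry Lab

• Since: $\chi^2_{0.975,9} \le \chi^2 \le \chi^2_{0.025,9}$, or $2.70 \le \chi^2 \le 19.02$, and $\chi^2 = \frac{(n-1)s^2}{\sigma^2}$

• We can write: $2.70 \le \frac{(n-1)s^2}{\sigma^2} \le 19.02$

• Which gives: $\frac{9 \times 0.70}{19.02} \le \sigma^2 \le \frac{9 \times 0.70}{2.70}$, or $0.33 \le \sigma^2 \le 2.33$

The 95% confidence interval for the variance of the temperature at which the reaction starts is given by: $0.33 \le \sigma^2 \le 2.33$

• From the 10 readings: sample std. dev. $s = 0.84$, sample variance $s^2 = 0.70$

[Figure: Chi‐Square density with 9 degrees of freedom; an area of 0.025 lies in each tail and 95% of the possible $\chi^2$ values lie between 2.70 and 19.02]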
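The interval above can be reproduced without printed tables. The following is a minimal sketch using only the Python standard library; the helper names `chi2_cdf` and `chi2_upper_critical` are hypothetical, and the Chi‐Square probabilities are computed from a series for the regularized incomplete gamma function rather than from any method named in the slides.

```python
import math
import statistics

def chi2_cdf(x, df):
    """P(X <= x) for a Chi-Square variable with df degrees of freedom,
    using the series for the regularized lower incomplete gamma P(df/2, x/2)."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:       # all terms are positive and eventually shrink
        term *= z / (a + k)
        total += term
        k += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def chi2_upper_critical(alpha, df):
    """Return c with P(X >= c) = alpha (upper-tail convention), by bisection."""
    lo, hi = 0.0, 1.0
    while 1.0 - chi2_cdf(hi, df) > alpha:   # widen until hi is beyond the target
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if 1.0 - chi2_cdf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

temps = [67.4, 67.8, 68.2, 69.3, 69.5, 67.0, 68.1, 68.6, 67.9, 67.2]
n = len(temps)
df = n - 1
s2 = statistics.variance(temps)                   # sample variance, divisor n - 1
alpha = 0.05                                      # 95% confidence

chi2_hi = chi2_upper_critical(alpha / 2, df)      # table value: 19.02
chi2_lo = chi2_upper_critical(1 - alpha / 2, df)  # table value: 2.70
ci_low = df * s2 / chi2_hi
ci_high = df * s2 / chi2_lo

print(f"s^2 = {s2:.2f}")
print(f"95% CI for sigma^2: ({ci_low:.2f}, {ci_high:.2f})")
```

Bisection is used here only because it is simple to verify; a statistics library would normally provide the critical values directly.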

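Hypothesis Testing for Population Variance, $\sigma^2$

• Next, we look at hypothesis tests involving the population variance (i.e. $\sigma^2$)

• Let $\sigma_0^2$ be the hypothesized value of the population variance

– Lower‐tail: $H_0: \sigma^2 \ge \sigma_0^2$ versus $H_a: \sigma^2 < \sigma_0^2$
– Upper‐tail: $H_0: \sigma^2 \le \sigma_0^2$ versus $H_a: \sigma^2 > \sigma_0^2$
– Two‐tailed: $H_0: \sigma^2 = \sigma_0^2$ versus $H_a: \sigma^2 \ne \sigma_0^2$

• Test statistic used: $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$

• We can use the p‐value or the critical value approach used earlier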


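Hypothesis Testing for Population Variance, $\sigma^2$

• Following the logic of the p‐value and critical value approaches used earlier, we can summarize the procedure for hypothesis tests for $\sigma^2$ as follows:

Hypotheses
– Lower Tail Test: $H_0: \sigma^2 \ge \sigma_0^2$ versus $H_a: \sigma^2 < \sigma_0^2$
– Upper Tail Test: $H_0: \sigma^2 \le \sigma_0^2$ versus $H_a: \sigma^2 > \sigma_0^2$
– Two‐Tailed Test: $H_0: \sigma^2 = \sigma_0^2$ versus $H_a: \sigma^2 \ne \sigma_0^2$

Test Statistic (all three tests)
– $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2}$

Rejection Rule (p‐value, all three tests)
– Reject $H_0$ if p‐value $\le \alpha$

Rejection Rule (critical value)
– Lower Tail Test: Reject $H_0$ if $\chi^2 \le \chi^2_{(1-\alpha),\nu}$
– Upper Tail Test: Reject $H_0$ if $\chi^2 \ge \chi^2_{\alpha,\nu}$
– Two‐Tailed Test: Reject $H_0$ if $\chi^2 \le \chi^2_{(1-\alpha/2),\nu}$ or $\chi^2 \ge \chi^2_{\alpha/2,\nu}$

• Note: Because the $\chi^2$ distribution is not symmetric, the lower and upper critical values differ and must be looked up separately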


Example: Bart’s Chemistry Lab

• Bart is conducting a series of chemistry experiments to determine the exact temperature when a certain reaction starts. Recordings from 10 recent experiments are shown.

• Bart’s teacher considers the readings to be accurate if the variance is 0.5 or less. If the variance is greater than 0.5, Bart will have to conduct more experiments.

• Conduct a hypothesis test with $\alpha = 0.10$ to determine whether Bart needs to conduct more experiments.

Experiment No.   Temperature
1                67.4
2                67.8
3                68.2
4                69.3
5                69.5
6                67.0
7                68.1
8                68.6
9                67.9
10               67.2

Sample std. dev.: $s = 0.84$
Sample variance: $s^2 = 0.70$


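Bart’s Chemistry Lab: Using p‐Value Approach

• Develop the Null and Alternate Hypotheses ($H_0$, $H_a$)
– $H_0: \sigma^2 \le \sigma_0^2 = 0.5$
– $H_a: \sigma^2 > \sigma_0^2 = 0.5$

• Specify the Level of Significance ($\alpha$)
– $\alpha = 0.10$

• Collect sample data and compute the value of the Test Statistic
– $n = 10$, $df = 9$, $s^2 = 0.70$
– $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{9 \times 0.70}{0.5} = 12.60$

• Use the value of the Test Statistic ($\chi^2$) to compute the p‐value
– p‐value $= P(\chi^2 \ge 12.60)$
– Since $P(\chi^2 \ge 14.68) = 0.10$, we have p‐value $= P(\chi^2 \ge 12.60) = 0.182 > 0.10$

• Reject $H_0$ if the p‐value $\le \alpha$
– p‐value $> \alpha = 0.10$
– Cannot reject $H_0$: there is insufficient evidence that the variance exceeds 0.5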
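The p‐value quoted on this slide can be computed directly rather than bracketed from a table. The sketch below uses only the Python standard library; `chi2_sf` is a hypothetical helper that evaluates the upper‐tail Chi‐Square probability through a series for the regularized incomplete gamma function (an implementation choice made here, not something the slides prescribe).

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a Chi-Square variable with df
    degrees of freedom, as 1 minus the regularized lower incomplete gamma."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    k = 1
    while term > 1e-16 * total:       # series of positive, eventually shrinking terms
        term *= z / (a + k)
        total += term
        k += 1
    cdf = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return 1.0 - cdf

# Values taken from the example: n = 10 readings, s^2 = 0.70, sigma_0^2 = 0.5
n, s2, sigma0_sq, alpha = 10, 0.70, 0.5, 0.10

chi2_stat = (n - 1) * s2 / sigma0_sq      # test statistic for the upper-tail test
p_value = chi2_sf(chi2_stat, n - 1)
reject = p_value <= alpha

print(f"chi-square statistic = {chi2_stat:.2f}")
print(f"p-value = {p_value:.3f}")
print("Reject H0" if reject else "Cannot reject H0")
```

Because this is an upper‐tail test, only the area to the right of the test statistic counts toward the p‐value.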

Bart’s Chemistry Lab: Using Critical Value Approach

• Develop the Null and Alternate Hypotheses ($H_0$, $H_a$)
– $H_0: \sigma^2 \le \sigma_0^2 = 0.5$
– $H_a: \sigma^2 > \sigma_0^2 = 0.5$

• Specify the Level of Significance ($\alpha$)
– $\alpha = 0.10$

• Collect sample data and compute the value of the Test Statistic
– $n = 10$, $df = 9$, $s^2 = 0.70$
– $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} = \frac{9 \times 0.70}{0.5} = 12.60$

• Use $\alpha$ to determine the critical value and rejection rule
– Critical value: $\chi^2_{0.10,9} = 14.68$
– Reject $H_0$ if $\chi^2 \ge 14.68$

• Use the value of the Test Statistic and the rejection rule to decide on $H_0$
– $\chi^2 = 12.60 < \chi^2_{0.10,9} = 14.68$
– Cannot reject $H_0$


Outline

Inference About Variance in a Single Population

• Statistical inference for $\sigma^2$
– Confidence intervals
– Hypothesis testing

Inference About Variances from Two Populations

• Statistical inference for $\sigma_1^2 / \sigma_2^2$
– Confidence intervals
– Hypothesis testing


The F‐Distribution
• The ratio of two independent Chi‐Square random variables, each divided by its respective degrees of freedom, is said to have an $F$‐distribution. This ratio is written as:
$$F_{\nu_1,\nu_2} \sim \frac{\chi^2_{\nu_1} / \nu_1}{\chi^2_{\nu_2} / \nu_2}$$

• The $F$‐distribution is defined on the positive state space (i.e. the random variable takes only positive values)

• It has an expectation close to one and its variance decreases as the degrees of freedom $\nu_1$ and $\nu_2$ increase

Sir Ronald Aylmer Fisher (1890–1962) was a British statistician and geneticist. For his work in statistics, he has been described as "a genius who almost single‐handedly created the foundations for modern statistical science" and "the single most important figure in 20th century statistics".
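The statements about the expectation and variance of the $F$‐distribution can be illustrated by simulation. The sketch below builds $F$ values from the definition (a ratio of two independent Chi‐Square variables, each divided by its degrees of freedom). It relies on two standard facts that are not on the slide: a Chi‐Square variable with $\nu$ degrees of freedom is a Gamma variable with shape $\nu/2$ and scale 2, and the exact mean of $F_{\nu_1,\nu_2}$ is $\nu_2/(\nu_2 - 2)$ for $\nu_2 > 2$. The seed and the two pairs of degrees of freedom are arbitrary choices.

```python
import random
import statistics

def simulate_f(nu1, nu2, size, rng):
    """Draw `size` values of (chi2_nu1 / nu1) / (chi2_nu2 / nu2).
    Each Chi-Square draw is sampled as Gamma(shape=nu/2, scale=2)."""
    draws = []
    for _ in range(size):
        numerator = rng.gammavariate(nu1 / 2, 2.0) / nu1
        denominator = rng.gammavariate(nu2 / 2, 2.0) / nu2
        draws.append(numerator / denominator)
    return draws

rng = random.Random(2020)                  # fixed seed, arbitrary choice
small = simulate_f(10, 20, 20000, rng)     # fewer degrees of freedom
large = simulate_f(50, 100, 20000, rng)    # more degrees of freedom

mean_small, var_small = statistics.fmean(small), statistics.variance(small)
mean_large, var_large = statistics.fmean(large), statistics.variance(large)

print(f"F(10, 20):  mean = {mean_small:.3f}, variance = {var_small:.3f}")
print(f"F(50, 100): mean = {mean_large:.3f}, variance = {var_large:.3f}")
```

In runs like this the mean stays a little above one and the spread shrinks markedly for the larger degrees of freedom, which is the behaviour the slide describes.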



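The F‐Distribution
• We will use the notation $F_{\alpha,\nu_1,\nu_2}$ to denote the critical points of an F‐distribution:
$$F_{\nu_1,\nu_2} \sim \frac{\chi^2_{\nu_1} / \nu_1}{\chi^2_{\nu_2} / \nu_2}, \qquad P\left(X \ge F_{\alpha,\nu_1,\nu_2}\right) = \alpha$$

[Figure: F density with an upper‐tail area of $\alpha$ to the right of $F_{\alpha,\nu_1,\nu_2}$]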


Inferences About Two Population Variances, $\sigma_1^2$ and $\sigma_2^2$

• The Scenario:

                   Population 1                          Population 2
Variance           $\sigma_1^2$: unknown variance        $\sigma_2^2$: unknown variance
Sample             Simple random sample of size $n_1$    Simple random sample of size $n_2$
Sample variance    $s_1^2$ = sample variance             $s_2^2$ = sample variance

Ratio of the variances: $\sigma_1^2 / \sigma_2^2$

$s_1^2 / s_2^2$: point estimate of the ratio between the variances



Inferences About Two Population Variances, $\sigma_1^2$ and $\sigma_2^2$

• Recall: The sampling distribution of $\frac{(n-1)S^2}{\sigma^2}$ has a Chi‐Square distribution $\chi^2$ (with $n - 1$ degrees of freedom) whenever a simple random sample of size $n$ is drawn from a Normal population

• Suppose we have two independent random samples from two Normally distributed populations

• Then these samples will give rise to two random variables, $S_1^2$ and $S_2^2$, where:

• $\frac{(n_1-1)S_1^2}{\sigma_1^2} \sim \chi^2_{\nu_1}$, with $\nu_1 = n_1 - 1$ degrees of freedom

• $\frac{(n_2-1)S_2^2}{\sigma_2^2} \sim \chi^2_{\nu_2}$, with $\nu_2 = n_2 - 1$ degrees of freedom


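Hypothesis Testing for Population Variances $\sigma_1^2$ and $\sigma_2^2$

• Then the ratio of these two random variables $S_1^2$ and $S_2^2$ will be the random variable $\frac{S_1^2}{S_2^2}$, where:
$$\frac{S_1^2}{S_2^2} = \frac{\sigma_1^2 \cdot \chi^2_{\nu_1} / (n_1 - 1)}{\sigma_2^2 \cdot \chi^2_{\nu_2} / (n_2 - 1)}$$

• When the population variances are equal, i.e. $\sigma_1^2 = \sigma_2^2$, then $\frac{S_1^2}{S_2^2} = \frac{\chi^2_{\nu_1} / (n_1 - 1)}{\chi^2_{\nu_2} / (n_2 - 1)}$, which is the ratio of two Chi‐Square random variables, each divided by its own degrees of freedom

• Then $F = \frac{S_1^2}{S_2^2} = \frac{\chi^2_{\nu_1} / (n_1 - 1)}{\chi^2_{\nu_2} / (n_2 - 1)}$ is a random variable with an $F$ distribution with $n_1 - 1$ and $n_2 - 1$ degrees of freedom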

Hypothesis Testing for Population Variances $\sigma_1^2$ and $\sigma_2^2$

• The test statistic used to investigate the ratio of population variances of two normally distributed populations is:
$$F = \frac{S_1^2}{S_2^2} = \frac{\chi^2_{\nu_1} / (n_1 - 1)}{\chi^2_{\nu_2} / (n_2 - 1)}$$

• For a specific sampling experiment, this random variable $F$ takes the value $\frac{s_1^2}{s_2^2}$, just as we got a value $\bar{x}$ for the random variable $\bar{X}$


Hypothesis Testing for Population Variances $\sigma_1^2$ and $\sigma_2^2$

• We consider the following hypothesis tests:

– Upper‐tail: $H_0: \sigma_1^2 \le \sigma_2^2$ versus $H_a: \sigma_1^2 > \sigma_2^2$
– Two‐tailed: $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_a: \sigma_1^2 \ne \sigma_2^2$

• We usually denote the population providing the larger sample variance as population 1

• Since the $F$‐test statistic is constructed with the larger sample variance, $s_1^2$, in the numerator, the value of the test statistic will be in the upper tail of the F distribution (greater than 1)

• Therefore we do not consider lower‐tail tests, and tables do not list lower‐tail critical values


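Summary of Hypothesis Testing for $\sigma_1^2$ and $\sigma_2^2$

• Summary of approaches (assuming sampling from Normal populations and $s_1^2 \ge s_2^2$):

Hypotheses
– Upper Tail Test: $H_0: \sigma_1^2 \le \sigma_2^2$ versus $H_a: \sigma_1^2 > \sigma_2^2$
– Two‐Tailed Test: $H_0: \sigma_1^2 = \sigma_2^2$ versus $H_a: \sigma_1^2 \ne \sigma_2^2$

Test Statistic (both tests)
– $F = \frac{S_1^2}{S_2^2} = \frac{\chi^2_{\nu_1} / (n_1 - 1)}{\chi^2_{\nu_2} / (n_2 - 1)}$

Rejection Rule (p‐value, both tests)
– Reject $H_0$ if p‐value $\le \alpha$

Rejection Rule (critical value)
– Upper Tail Test: Reject $H_0$ if $F \ge F_{\alpha}$
– Two‐Tailed Test: Reject $H_0$ if $F \ge F_{\alpha/2}$

• Note: $F_{\alpha}$ and $F_{\alpha/2}$ are computed based on the $F$ distribution with $n_1 - 1$ degrees of freedom in the numerator and $n_2 - 1$ degrees of freedom in the denominator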

Example: Bart’s Chemistry Lab Versus Lisa’s Chemistry Lab

• Bart is conducting a series of chemistry experiments to determine the exact temperature when a certain reaction starts. Recordings from 10 experiments are shown. Lisa conducts the same chemistry experiment. Recordings from her 10 experiments are also given.

• Bart’s teacher wants to compare the variances from both experiments and determine if they are significantly different.

• Conduct a hypothesis test with $\alpha = 0.10$ to support this goal.

Experiment No.     Bart’s Temperature Reading    Lisa’s Temperature Reading
1                  67.4                          67.7
2                  67.8                          66.4
3                  68.2                          69.2
4                  69.3                          70.1
5                  69.5                          69.5
6                  67.0                          69.7
7                  68.1                          68.1
8                  68.6                          66.6
9                  67.9                          67.3
10                 67.2                          67.5
Sample Std. Dev    $s_1 = 0.84$                  $s_2 = 1.33$
Sample Variance    $s_1^2 = 0.70$                $s_2^2 = 1.77$

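Bart Versus Lisa: Using p‐Value Approach

• Develop the Null and Alternate Hypotheses ($H_0$, $H_a$)
– $H_0: \sigma_1^2 = \sigma_2^2$
– $H_a: \sigma_1^2 \ne \sigma_2^2$

• Specify the Level of Significance ($\alpha$)
– $\alpha = 0.10$

• Collect sample data and compute the value of the Test Statistic ($F$)
– $n_1 = n_2 = 10$, so there are 9 degrees of freedom in both the numerator and the denominator
– $s_1^2 = 0.70$ (Bart), $s_2^2 = 1.77$ (Lisa)
– The larger sample variance (Lisa’s) goes in the numerator: $F = \frac{s_2^2}{s_1^2} = \frac{1.77}{0.70} = 2.53$

• Use the value of the Test Statistic ($F$) to compute the p‐value
– p‐value $= 2 \times P(F \ge 2.53)$
– From the tables, $P(F \ge 2.44) = 0.10$ and $P(F \ge 3.18) = 0.05$
– p‐value $= 2 \times P(F \ge 2.53) = 2 \times 0.092 = 0.184 > 0.10$

• Reject $H_0$ if the p‐value $\le \alpha$
– p‐value $> \alpha = 0.10$
– Cannot reject $H_0$

There is insufficient evidence to conclude that the population variances differ for the two lab results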
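The two‐tailed p‐value for the Bart and Lisa comparison can also be computed numerically. The sketch below uses only the Python standard library; `f_pdf` and `f_sf` are hypothetical helpers, and the upper‐tail area is obtained by integrating the $F$ density with Simpson's rule, which is a convenience chosen here rather than the method used in the slides. Because the code works from the unrounded sample variances, its $F$ value differs slightly in the third decimal from the 2.53 obtained with rounded inputs.

```python
import math
import statistics

def f_pdf(x, d1, d2):
    """Density of the F distribution with (d1, d2) degrees of freedom.
    Returning 0 at x = 0 is correct only for d1 > 2, as in this example."""
    if x <= 0:
        return 0.0
    log_const = (math.lgamma((d1 + d2) / 2) - math.lgamma(d1 / 2)
                 - math.lgamma(d2 / 2) + (d1 / 2) * math.log(d1 / d2))
    return math.exp(log_const + (d1 / 2 - 1) * math.log(x)
                    - ((d1 + d2) / 2) * math.log1p(d1 * x / d2))

def f_sf(x, d1, d2, steps=2000):
    """P(F >= x): 1 minus a Simpson's-rule integral of the density over [0, x].
    `steps` must be even."""
    h = x / steps
    total = f_pdf(0.0, d1, d2) + f_pdf(x, d1, d2)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f_pdf(i * h, d1, d2)
    return 1.0 - total * h / 3

bart = [67.4, 67.8, 68.2, 69.3, 69.5, 67.0, 68.1, 68.6, 67.9, 67.2]
lisa = [67.7, 66.4, 69.2, 70.1, 69.5, 69.7, 68.1, 66.6, 67.3, 67.5]

# Put the larger sample variance in the numerator, as the slides recommend.
samples = sorted([(statistics.variance(bart), len(bart)),
                  (statistics.variance(lisa), len(lisa))], reverse=True)
(num_var, num_n), (den_var, den_n) = samples

f_stat = num_var / den_var
alpha = 0.10
p_value = 2 * f_sf(f_stat, num_n - 1, den_n - 1)   # two-tailed test
reject = p_value <= alpha

print(f"F = {f_stat:.2f}, two-tailed p-value = {p_value:.3f}")
print("Reject H0" if reject else "Cannot reject H0")
```

Doubling the upper‐tail area is what makes this a two‐tailed p‐value; for an upper‐tail test the factor of 2 would be dropped.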


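Bart Versus Lisa: Using Critical Value Approach

• Develop the Null and Alternate Hypotheses ($H_0$, $H_a$)
– $H_0: \sigma_1^2 = \sigma_2^2$
– $H_a: \sigma_1^2 \ne \sigma_2^2$

• Specify the Level of Significance ($\alpha$)
– $\alpha = 0.10$

• Collect sample data and compute the value of the Test Statistic ($F$)
– $n_1 = n_2 = 10$, so there are 9 degrees of freedom in both the numerator and the denominator
– $s_1^2 = 0.70$ (Bart), $s_2^2 = 1.77$ (Lisa)
– The larger sample variance (Lisa’s) goes in the numerator: $F = \frac{s_2^2}{s_1^2} = \frac{1.77}{0.70} = 2.53$

• Use $\alpha$ to determine the critical value and rejection rule
– Critical value: $F_{\alpha/2} = F_{0.05} = 3.18$
– Reject $H_0$ if $F \ge F_{0.05} = 3.18$

• Use the value of the Test Statistic and the rejection rule to decide on $H_0$
– $F = 2.53 < F_{0.05} = 3.18$
– Cannot reject $H_0$

There is insufficient evidence to conclude that the population variances differ for the two lab results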



Questions?

Acknowledgments: The authors are thankful for permission to use cartoons and graphics in this
presentation. They are solely intended to improve educational experience of students.
