Inference About Comparing Two Populations: Hypothesis Testing


LECTURE 9

Inference About Comparing Two Populations: Hypothesis Testing
Learning Objectives
In this chapter, you learn how to use hypothesis testing for
comparing the difference between:

• The means of two independent populations


• The means of two related populations
• The proportions of two independent populations

Hence, the type of test conducted depends on the relationship between the two samples: whether they are dependent or independent.
Introduction
• This topic concerns comparing two sample means or two sample proportions to examine whether the difference between them is significant.

• Independent samples are those for which the selection process for one is not related to that for the other.

• Samples are dependent when the selection process for one is related to the selection process for the other. A typical example of dependent samples occurs when we have before-and-after measures of the same individuals or objects.
Example: Measure the test scores of students before and after they join a tuition class.
Testing the Difference Between Two Population Means, μ1 – μ2

Independent samples   Null hypothesis   Alternative hypothesis
Two-tail test         H0: μ1 = μ2       H1: μ1 ≠ μ2
Left-tail test        H0: μ1 = μ2       H1: μ1 < μ2
Right-tail test       H0: μ1 = μ2       H1: μ1 > μ2

Note: H0: μ1 = μ2 can be written as H0: μ1 – μ2 = 0 (the hypothesized difference between the population means is zero, meaning there is no difference between the two population means).
(i) Pooled-Variance t-Test for Comparing the Means of Two Independent Samples
When the samples are independent, we use the pooled-variance t-test.

Test statistic:

$$t^* = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

where $s_p^2$ is the pooled estimate of the population variance, which must be found first:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

Degrees of freedom for the t-test: v = n1 + n2 – 2

Assumptions: (i) the population variances are unknown but assumed to be equal; (ii) the populations are at least approximately normally distributed.
Hypothesis tests for μ1 – μ2
Two Population Means, Independent Samples
How to make the decision? Just recall step 5 in Lecture 8.

Test              Hypotheses                          Rejection region        Decision rule
Lower-tail test   H0: μ1 – μ2 = 0, H1: μ1 – μ2 < 0    area α in lower tail    Reject H0 if tSTAT < –tα
Upper-tail test   H0: μ1 – μ2 = 0, H1: μ1 – μ2 > 0    area α in upper tail    Reject H0 if tSTAT > tα
Two-tail test     H0: μ1 – μ2 = 0, H1: μ1 – μ2 ≠ 0    area α/2 in each tail   Reject H0 if tSTAT < –tα/2 or tSTAT > tα/2
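The three rejection rules above can be sketched as a small Python helper. This is only an illustration: the function name `reject_h0` and the `tail` labels are assumptions made for this sketch, and the critical value still has to be read from the t table.

```python
def reject_h0(t_stat: float, t_crit: float, tail: str) -> bool:
    """Apply the rejection rule for a lower-, upper-, or two-tail t-test.

    t_crit is the positive critical value from the t table:
    t_alpha for a one-tail test, t_(alpha/2) for a two-tail test.
    The name and the 'tail' labels are hypothetical, chosen for this sketch.
    """
    if tail == "lower":
        return t_stat < -t_crit
    if tail == "upper":
        return t_stat > t_crit
    if tail == "two":
        return t_stat < -t_crit or t_stat > t_crit
    raise ValueError("tail must be 'lower', 'upper', or 'two'")


# Values taken from Example 1 (two-tail test, critical value 1.711).
decision = reject_h0(1.504, 1.711, "two")
print("reject H0" if decision else "fail to reject H0")
```

With the Example 1 values the helper reports that H0 is not rejected, because 1.504 lies between –1.711 and +1.711.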
EXAMPLE 1

A statistician found that in a random sample of 15 cans of paint produced by one manufacturer, the mean drying time was 201 minutes, with a standard deviation of 48 minutes. In a random sample of 11 cans of paint produced by another manufacturer, the mean drying time was 170 minutes, with a standard deviation of 57 minutes.

Do these data allow us to infer at the 10% significance level that the mean drying times of the two kinds of paint differ? Assume that the populations are approximately normally distributed with equal variances.

Note: "the mean drying times differ" is the statement the researcher wants to prove, so it becomes H1: µ1 ≠ µ2 (proceed with a two-tail test).
List the information from the question to make it easier to work with:

            Sample 1    Sample 2
Mean        x̄1 = 201    x̄2 = 170
Std dev     s1 = 48     s2 = 57
Size        n1 = 15     n2 = 11

α = 0.10

Note: no hypothesized value is given for µ1 – µ2, so we take it to be zero.

The two samples are independent, so we proceed with the hypothesis test using the pooled-variance t-test.
H0: μ1 = μ2
H1: μ1 ≠ μ2        Test at α = 0.10

$$s_p^2 = \frac{(15 - 1)48^2 + (11 - 1)57^2}{15 + 11 - 2} = 2697.75$$

Test statistic:

$$t^* = \frac{201 - 170}{\sqrt{2697.75\left(\frac{1}{15} + \frac{1}{11}\right)}} = 1.504$$

Degrees of freedom: v = 15 + 11 – 2 = 24 (α/2 = 0.05)
Critical value: ± t0.05, 24 = ± 1.711

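Decision:
Do not reject (fail to reject) H0 because t* = 1.504 lies between the critical values –1.711 and +1.711.

Conclusion:
There is not enough evidence to infer that the mean drying times of the two kinds of paint differ.

[Figure: two-tail t distribution with critical values –1.711 and +1.711; the test statistic 1.504 falls in the "fail to reject H0" region between them.]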
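The arithmetic in Example 1 can be checked with a short Python sketch that works from the summary statistics alone. The function name `pooled_t` is an assumption for illustration, and the critical value 1.711 is simply copied from the t table as quoted in the example, since the standard library has no t distribution.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """Pooled-variance t statistic from summary statistics.

    Returns (t statistic, pooled variance, degrees of freedom).
    Assumes equal population variances, as the example does.
    """
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    std_error = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t_stat = ((mean1 - mean2) - hypothesized_diff) / std_error
    return t_stat, sp2, df


# Example 1: paint drying times for two manufacturers.
t_star, sp2, df = pooled_t(201, 48, 15, 170, 57, 11)
t_crit = 1.711  # t(0.05, 24), taken from the t table, not computed here

print(f"sp^2 = {sp2:.2f}, t* = {t_star:.3f}, df = {df}")
print("reject H0" if abs(t_star) > t_crit else "fail to reject H0")
```

Comparing `abs(t_star)` with the critical value is equivalent to the two-tail rule, because the rejection region is symmetric about zero.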
(ii) Comparing the Means When the Samples Are Dependent
• Tests in which the samples are dependent are also referred to as matched-pairs tests.

• In this case we are interested in only one variable: the paired difference, di, between the measurements for each person or object, di = x1 – x2, where x1 and x2 are the paired observations.

• The major advantage of designing a study that uses matched-pairs samples is that we can eliminate the individual differences that occur between persons or objects, which increases the power of the test.
Testing the Mean of the Paired Difference Between Two Populations, μd

Dependent samples   Null hypothesis   Alternative hypothesis
Two-tail test       H0: μd = 0        H1: μd ≠ 0
Left-tail test      H0: μd = 0        H1: μd < 0
Right-tail test     H0: μd = 0        H1: μd > 0

Note: H0: μd = 0 (the mean of the paired differences is hypothesized to be equal to zero).
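t-Test for Comparing the Means When the Samples Are Dependent

Test statistic:

$$t^* = \frac{\bar{x}_d - \mu_d}{s_d / \sqrt{n}}$$

(If μd is not given, it is assumed to be zero.)

where

$$\bar{x}_d = \frac{\sum d_i}{n} \quad \text{and} \quad s_d = \sqrt{\frac{1}{n - 1}\left(\sum d_i^2 - \frac{\left(\sum d_i\right)^2}{n}\right)}$$

Here $\sum d_i^2$ is the sum of the squares of each d, and n is the number of pairs of observations.

Degrees of freedom: v = n – 1

Assumption: the distribution of the population of differences, d, follows the normal distribution.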
EXAMPLE 2
An accountant is in the process of investigating the
consequences of switching to another method of depreciating
assets. She randomly selects six firms and calculates the after-
tax profits using both depreciation methods. The results
(rounded to the nearest RM million) are shown below.

Company A B C D E F
Method 1 90 18 70 112 77 40
Method 2 84 20 64 105 77 25
Do these data provide sufficient evidence to indicate that the adoption of method 2 results in a lower after-tax profit? Test with α = 0.05.

The question asks you to show that the after-tax profit using method 2 is less than that using method 1.
What are the steps?

(i) Define the difference d between method 1 and method 2.

(ii) The question claims that the after-tax profit using method 2 is lower, which means the profit using method 1 is higher. If you define d = method 1 – method 2, the claim corresponds to a positive mean difference, so you must proceed with a right-tail test.

Remember: d is the difference for the paired sample.
d = profit using method 1 – profit using method 2

Company A B C D E F
d 6 -2 6 7 0 15
H0: μd = 0
H1: μd > 0        Test at α = 0.05

Remember: you need to find x̄d and sd to calculate t*.

$$\bar{x}_d = \frac{6 + (-2) + 6 + 7 + 0 + 15}{6} = 5.333$$

$$s_d = \sqrt{\frac{1}{5}\left(350 - \frac{1024}{6}\right)} = 5.989$$

Test statistic:

$$t^* = \frac{5.333}{5.989 / \sqrt{6}} = 2.181$$

Critical value: t0.05, 5 = 2.015 (refer to the t table and use α as given, because the test is one-tail)
Reject H0 because t* > 2.015

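Conclusion:
There is enough evidence to indicate that the adoption of method 2 results in a lower after-tax profit.

[Figure: right-tail rejection region beyond the critical value +2.015; the test statistic t* = 2.181 falls inside it, so H0 is rejected because 2.181 > t0.05, 5 = 2.015.]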
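Example 2 can also be reproduced from the raw profit figures. The sketch below uses Python's standard `statistics` module; the variable names are chosen for illustration, and the critical value 2.015 is copied from the t table as quoted in the example rather than computed.

```python
import math
import statistics

# After-tax profits (RM million) for companies A to F, from the table in Example 2.
method1 = [90, 18, 70, 112, 77, 40]
method2 = [84, 20, 64, 105, 77, 25]

# Paired differences, d = method 1 - method 2.
d = [m1 - m2 for m1, m2 in zip(method1, method2)]
n = len(d)

d_bar = statistics.mean(d)   # mean of the paired differences
s_d = statistics.stdev(d)    # sample standard deviation (n - 1 divisor)
t_star = (d_bar - 0) / (s_d / math.sqrt(n))  # mu_d = 0 under H0

t_crit = 2.015  # t(0.05, 5), taken from the t table

print(f"d = {d}")
print(f"mean = {d_bar:.3f}, sd = {s_d:.3f}, t* = {t_star:.3f}")
print("reject H0" if t_star > t_crit else "fail to reject H0")
```

`statistics.stdev` uses the n – 1 divisor, which matches the hand calculation of sd in the example.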
Testing the Difference Between Two Population Proportions

Independent samples   Null hypothesis   Alternative hypothesis
Two-tail test         H0: p1 = p2       H1: p1 ≠ p2
Left-tail test        H0: p1 = p2       H1: p1 < p2
Right-tail test       H0: p1 = p2       H1: p1 > p2

Note: H0: p1 = p2 can also be written as H0: p1 – p2 = 0 (the hypothesized difference between the population proportions is zero).
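Z-Test for Comparing Proportions of Two Independent Samples

Test statistic:

$$Z^* = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$

where $\hat{p}$ is the pooled estimate of the population proportion:

$$\hat{p} = \frac{x_1 + x_2}{n_1 + n_2} \quad \text{or} \quad \hat{p} = \frac{n_1\hat{p}_1 + n_2\hat{p}_2}{n_1 + n_2}$$

Assumption: n1 and n2 are sufficiently large.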


EXAMPLE 3
Recently, the Canadian parliament debated the reinstatement
of the death penalty. One of the factors in this debate was
the amount of public support for the death penalty. In 1989,
a sample of 1500 Canadians revealed that 1125 favored the
death penalty. In 1999, 873 in a sample of 1200 supported
the death penalty.

Do these data provide sufficient evidence at the 1% significance level to indicate that the proportion of public support for the death penalty was lower in 1999 than in 1989?
You need to proceed with the hypothesis test for two population proportions using a one-tail (left-tail) test. Let p1 be the proportion supporting the death penalty in 1999 and p2 the proportion in 1989.

$$\hat{p}_1 = \frac{873}{1200} = 0.7275; \quad \hat{p}_2 = \frac{1125}{1500} = 0.75$$

H0: p1 = p2
H1: p1 < p2        Test at α = 0.01 (use α as given; do not divide it by 2, because this is a one-tail test)

$$\hat{p} = \frac{1125 + 873}{1500 + 1200} = 0.74$$

Test statistic:

$$Z^* = \frac{0.7275 - 0.75}{\sqrt{0.74(0.26)\left(\frac{1}{1200} + \frac{1}{1500}\right)}} = -1.324$$

Critical value: –Z0.01 = –2.3263 (the critical value has to be negative because this is a left-tail test)

Decision:
Fail to reject H0 because Z* = –1.324 > –2.3263.

Conclusion:
There is not enough evidence to indicate that the proportion of public support for the death penalty was lower in 1999 than in 1989.

[Figure: left-tail rejection region below the critical value –2.3263; the test statistic –1.324 falls in the "fail to reject H0" region.]
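As a check on Example 3, the pooled two-proportion Z statistic can be sketched in Python. The function name `two_proportion_z` is an assumption for illustration. The critical value and the one-sided p-value come from the standard library's `statistics.NormalDist`; the p-value is an extra that the example itself does not report.

```python
import math
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled Z statistic for H0: p1 = p2, from success counts and sample sizes.

    Returns (Z statistic, pooled proportion).
    """
    p1_hat = x1 / n1
    p2_hat = x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    std_error = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / std_error, p_pool


# Sample 1 = 1999 (873 of 1200), sample 2 = 1989 (1125 of 1500).
z_star, p_pool = two_proportion_z(873, 1200, 1125, 1500)

alpha = 0.01
z_crit = NormalDist().inv_cdf(alpha)  # left-tail critical value, about -2.3263
p_value = NormalDist().cdf(z_star)    # left-tail p-value

print(f"pooled p = {p_pool:.2f}, Z* = {z_star:.3f}, critical = {z_crit:.4f}")
print(f"p-value = {p_value:.4f}")
print("reject H0" if z_star < z_crit else "fail to reject H0")
```

The p-value gives the same decision as the critical-value comparison: H0 is rejected only if the p-value is below α.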
Example: Pooled-Variance t Test

You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE and NASDAQ? You collect the following data:

                 NYSE    NASDAQ
Number           21      25
Sample mean      3.27    2.53
Sample std dev   1.30    1.16

Assuming both populations are approximately normal with equal variances, is there a difference in mean yield (α = 0.05)?

Copyright ©2012 Pearson Education
Pooled-Variance t Test Example: Calculating the Test Statistic
H0: μ1 - μ2 = 0 i.e. (μ1 = μ2)
H1: μ1 - μ2 ≠ 0 i.e. (μ1 ≠ μ2)

The test statistic is:

$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021\left(\frac{1}{21} + \frac{1}{25}\right)}} = 2.040$$

where the pooled variance is:

$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)1.30^2 + (25 - 1)1.16^2}{(21 - 1) + (25 - 1)} = 1.5021$$
Pooled-Variance t Test Example: Hypothesis Test Solution

H0: μ1 - μ2 = 0 i.e. (μ1 = μ2)
H1: μ1 - μ2 ≠ 0 i.e. (μ1 ≠ μ2)

α = 0.05
df = 21 + 25 - 2 = 44
Critical values: t = ± 2.0154

[Figure: two-tail t distribution with rejection regions of area 0.025 in each tail, beyond -2.0154 and +2.0154; the test statistic 2.040 falls in the upper rejection region.]

Test statistic:

$$t_{STAT} = \frac{3.27 - 2.53}{\sqrt{1.5021\left(\frac{1}{21} + \frac{1}{25}\right)}} = 2.040$$

Decision:
Reject H0 because tSTAT (2.040) > 2.0154.

Conclusion:
There is enough evidence of a difference in mean yield.
Fast Revision:

(i) Use the pooled-variance t-test for independent samples.

(ii) Use the paired t-test for dependent samples.

(iii) Use the Z-test for comparing two population proportions.

(iv) Make sure you can decide whether to perform a one-tail or a two-tail test after reading the question.

(v) For a one-tail test, do not divide α by 2 to find the critical value for Z or t.

(vi) For a two-tail test, divide α by 2 to find the critical value for Z or t.
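Points (v) and (vi) can be illustrated for the Z case with a minimal Python sketch. The helper name `z_critical` is an assumption for illustration. It covers only Z critical values, because the standard library has no t distribution; t critical values still come from the t table.

```python
from statistics import NormalDist


def z_critical(alpha: float, two_tail: bool) -> float:
    """Positive critical Z value for a given significance level.

    One-tail test: the whole of alpha sits in one tail.
    Two-tail test: alpha is split, so each tail has area alpha / 2.
    """
    tail_area = alpha / 2 if two_tail else alpha
    return NormalDist().inv_cdf(1 - tail_area)


one_tail = z_critical(0.01, two_tail=False)  # as in Example 3
two_tail = z_critical(0.10, two_tail=True)   # 0.05 in each tail

print(f"one-tail, alpha = 0.01: {one_tail:.4f}")
print(f"two-tail, alpha = 0.10: {two_tail:.4f}")
```

For a left-tail test the critical value is the negative of the number returned, as in Example 3.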
