Hypothesis Testing: Comparing Two Populations


Lecture 13.

Chapter 13
Hypothesis testing:
Comparing two populations.
13.1 Testing the difference between two
population means: Independent samples
13.2 Testing the difference between two
population means: The matched pairs
experiment
13.3 Testing the difference between two
population proportions

13.1 Testing the difference between two
population means: Independent samples
• Two independent random samples are drawn from the
two populations of interest. Because we are interested
in the difference between two population means, we
use the statistic $\bar{x}_1 - \bar{x}_2$ as an estimator for $\mu_1 - \mu_2$.
• $\bar{x}_1 - \bar{x}_2$ is normally distributed if the (original)
population distributions are normal, and is
approximately normally distributed if the (original)
populations are not normal but the sample sizes are large.
• The expected value of $\bar{x}_1 - \bar{x}_2$ is $\mu_1 - \mu_2$ and the
variance of $\bar{x}_1 - \bar{x}_2$ is $\sigma_1^2/n_1 + \sigma_2^2/n_2$.
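The mean and variance quoted above follow in one line each. The sketch below assumes only that the two samples are independent, so that the variances of the two sample means add:

```latex
\begin{align*}
E(\bar{x}_1 - \bar{x}_2) &= E(\bar{x}_1) - E(\bar{x}_2) = \mu_1 - \mu_2, \\
\operatorname{Var}(\bar{x}_1 - \bar{x}_2) &= \operatorname{Var}(\bar{x}_1) + \operatorname{Var}(\bar{x}_2)
  = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}.
\end{align*}
```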

A. Testing a hypothesis about
$\mu_1 - \mu_2$ when the population
variances are known
If the sampling distribution of $\bar{x}_1 - \bar{x}_2$ is normal or
approximately normal, then

$$Z = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

can be treated as a Z-statistic and used as the
test statistic for $\mu_1 - \mu_2$.

Example 13.1, page 522


• Suppose that a large department-store chain in
Queensland is trying to decide whether to build a new
store in Logan City or in Ipswich. Building costs are
lower in Ipswich, and the company decides it will build
there unless the average household income is higher
in Logan City.
• A survey of 100 residences in each of the areas found
that the mean annual household income was $54 180
in Logan City and $45 340 in Ipswich. From other
sources, it is known that the population standard
deviations of annual household incomes are $5365 in
Logan City and $7440 in Ipswich.
• At the 5% significance level, can it be concluded that
the mean household income in Logan City exceeds
that of Ipswich?

Example 13.1 (contd.)

– The data are numerical.
– There are two independent samples.
– The parameter to be tested is the difference
between two means, the mean annual household
income in Logan City ($\mu_1$) and the mean annual
household income in Ipswich ($\mu_2$).
– The sample sizes are large. Therefore, the
difference between the sample means is
approximately normally distributed.
– The population variances are known.

Solution, pages 522– 523: Three steps to follow:


1. Set up the hypotheses and state the test:
Calculate $\bar{x}_1 - \bar{x}_2 = 54180 - 45340 = 8840 > 0$. According
to the question, we perform the following right-tail
test: $H_0: \mu_1 - \mu_2 = 0$ and
$H_A: \mu_1 - \mu_2 > 0$.
2. Calculate the value of the test statistic and $z_\alpha$:

$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(54180 - 45340) - 0}{\sqrt{\dfrac{5365^2}{100} + \dfrac{7440^2}{100}}} = 9.637$$

and $z_\alpha = z_{0.05} = 1.645$.


3. Make the decision / inference:
Since the value of the Z-statistic falls in the rejection region
$z \ge z_\alpha$, we reject $H_0$ and accept $H_A$ at the significance level
$\alpha = 5\%$. This means that there is enough evidence in
the sample to infer that the mean household
income of Logan City exceeds that of Ipswich.
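The three steps above can be sketched in a few lines of Python. This is only an illustration using the summary figures quoted in Example 13.1; the function name `two_sample_z` is a hypothetical helper, not something from the textbook or from Excel.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, mean2, sigma1, sigma2, n1, n2, hypothesized_diff=0.0):
    """Z-statistic for mu1 - mu2 when both population SDs are known."""
    standard_error = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return ((mean1 - mean2) - hypothesized_diff) / standard_error


# Summary figures from Example 13.1 (population 1 = Logan City, 2 = Ipswich)
z = two_sample_z(54180, 45340, 5365, 7440, 100, 100)

alpha = 0.05
z_critical = NormalDist().inv_cdf(1 - alpha)  # right-tail critical value
p_value = 1 - NormalDist().cdf(z)             # right-tail p-value

print(f"z = {z:.3f}, critical value = {z_critical:.3f}")
print("reject H0" if z >= z_critical else "do not reject H0")
```

The critical value is computed from the standard normal distribution rather than read from a table, so it should agree with the 1.645 used above up to rounding.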

Annual household income (see page 523 for details)

Logan City   Ipswich
55840        44360
53840        44460
53940        43460
52940        44810
54290        44260
53740        44810
54290        45010
54490        45010
54490        47360
56840        46360
55840        44560
54040        44660
54140        43660
53140        45010
54490        44460
53940        45010
.            .
.            .

Using Excel D.A.
z-Test: Two Sample for Means

                               Logan City    Ipswich
Mean                           54180         45340
Known Variance                 28783252      55353600
Observations                   100           100
Hypothesized Mean Difference   0
z                              9.637383377
P(Z<=z) one-tail               0
z Critical one-tail            1.644853627
P(Z<=z) two-tail               0
z Critical two-tail            1.959963985

At the 5% significance level, there is
sufficient evidence to reject the null
hypothesis. (We use the one-tail p-value =
P(Z ≥ 9.64) ≈ 0 < 5% to make
the decision.)

B. Testing a hypothesis about
$\mu_1 - \mu_2$ when the population
variances are unknown
• In practice, the Z-statistic is rarely used, because the
population variances are usually not known. Instead, we
construct a T-statistic using the sample variances ($s_1^2$
and $s_2^2$):

$$T = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

• Two cases are considered when producing the T-statistic:
– The two unknown population variances are equal.
– The two unknown population variances are not equal.

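Case I: Unequal variances

Construct the unequal-variances T-statistic as follows:

$$T = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

with

$$\text{d.f.} = \frac{\left(s_1^2/n_1 + s_2^2/n_2\right)^2}{\dfrac{(s_1^2/n_1)^2}{n_1 - 1} + \dfrac{(s_2^2/n_2)^2}{n_2 - 1}}$$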

Example 13.2, page 525: variances are unequal

• Do people who eat high-fibre cereal for breakfast
consume, on average, fewer kilojoules for lunch
than people who do not eat high-fibre cereal for
breakfast?
• A sample of 30 people was randomly drawn. Each
person was identified as a consumer or non-consumer
of high-fibre cereal.
• For each person, the number of kilojoules
consumed at lunch was recorded (data given in
Example 11.3, page 432).

Consumers   Non-consumers
2560        2008
2420        2812
2116        2940
2364        2828
2384        2092
2256        2136
2460        3072
2240        2504
2540        2480
2492        2356
            2944
            2260
            2744
            2116
            2528
            3804
            2976
            2528
            2372
            3388

Solution, pages 525– 526: Three steps to follow
1. Set up the hypotheses and state the test: Calculate
$\bar{x}_1 - \bar{x}_2 = 2383.2 - 2644.4 = -261.2 < 0$. According to the question,
we perform the following left-tail test:
$H_0: \mu_1 - \mu_2 = 0$ and $H_A: \mu_1 - \mu_2 < 0$. Also
calculate $s_1 = 142.75$ and $s_2 = 462.61$; hence the
variances appear to be unequal.
2. Calculate the value of the test statistic, d.f. and $t_{\alpha,\text{d.f.}}$:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}} = \frac{(2383.2 - 2644.4) - 0}{\sqrt{\dfrac{142.75^2}{10} + \dfrac{462.61^2}{20}}} = -2.31$$

d.f. = 25.1 ≈ 25 and $t_{0.05,25} = 1.708$.


3. Make the decision / inference: Since the value of the t-statistic
falls in the rejection region $t \le -t_{\alpha,\text{d.f.}} = -t_{0.05,25} = -1.708$,
we reject $H_0$ and accept $H_A$ at the significance level $\alpha = 5\%$.
This means that people who eat high-fibre cereal for
breakfast consume, on average, fewer kilojoules for
lunch than people who do not eat high-fibre cereal.
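As a cross-check of the unequal-variances calculation, here is a minimal Python sketch that recomputes the t-statistic and the degrees of freedom from the raw data of Example 13.2. The helper name `welch_t` is made up for illustration, and the critical value is copied from the lecture rather than computed, because the Python standard library has no t-distribution.

```python
from math import sqrt
from statistics import mean, variance

# Kilojoules consumed at lunch (data of Example 13.2)
consumers = [2560, 2420, 2116, 2364, 2384, 2256, 2460, 2240, 2540, 2492]
non_consumers = [2008, 2812, 2940, 2828, 2092, 2136, 3072, 2504, 2480, 2356,
                 2944, 2260, 2744, 2116, 2528, 3804, 2976, 2528, 2372, 3388]


def welch_t(sample1, sample2, hypothesized_diff=0.0):
    """Unequal-variances t-statistic and its (non-integer) degrees of freedom."""
    n1, n2 = len(sample1), len(sample2)
    v1 = variance(sample1) / n1  # s1^2 / n1
    v2 = variance(sample2) / n2  # s2^2 / n2
    t = ((mean(sample1) - mean(sample2)) - hypothesized_diff) / sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


t_stat, df = welch_t(consumers, non_consumers)
t_critical = 1.708  # t_{0.05, 25}, value quoted in the lecture

print(f"t = {t_stat:.3f}, d.f. = {df:.2f}")
print("reject H0" if t_stat <= -t_critical else "do not reject H0")
```

The degrees of freedom come out as a non-integer slightly above 25, which is truncated to 25 before looking up the critical value.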

Kilojoules consumed at lunch

Consumers   Non-consumers
2560        2008
2420        2812
2116        2940
2364        2828
2384        2092
2256        2136
2460        3072
2240        2504
2540        2480
2492        2356
            2944
            2260
            2744
            2116
            2528
            3804
            2976
            2528
            2372
            3388

Using Excel D.A.
t-Test: Two-Sample Assuming Unequal Variances

                         Consumers     Nonconsumers
Mean                     2383.2        2644.4
Variance                 20376.17778   214004.0421
Observations             10            20
Hypothesized Mean Diff   0
df                       25
t Stat                   -2.31433179
P(T<=t) one-tail         0.014576434
t Critical one-tail      1.708140189
P(T<=t) two-tail         0.029152868
t Critical two-tail      2.05953711

At the 5% significance level, there
is sufficient evidence to reject the
null hypothesis (using p-value =
P(T ≤ -2.31) = 0.014 < 5%).

Case II: Equal variances

• Calculate the pooled variance estimate by:

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

• Construct the equal-variances t-statistic as follows:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}, \qquad \text{d.f.} = n_1 + n_2 - 2$$

Example 13.3, page 529

• Does job design (referring to worker movements)
affect workers’ productivity?
• Two job designs are being considered for the
production of a new computer desk.
• Two samples are randomly and independently selected:
- A sample of 25 workers assembled the desk using design A.
- A sample of 25 workers assembled the desk using design B.
- The assembly times were recorded.
• Do the assembly times of the two designs differ?
(Use $\alpha = 5\%$.)

Assembly times in minutes

Design-A   Design-B
6.8        5.2
5.0        6.7
7.9        5.7
5.2        6.6
7.6        8.5
5.0        6.5
5.9        5.9
5.2        6.7
6.5        6.6
.          .
.          .

Remarks
– The data are numerical.
– There are two independent samples.
– The parameter of interest is the difference
between two population means.
– The claim to be tested is whether a difference
between the two designs exists.

Solution, pages 528– 529: Three steps to follow:


1. Set up the hypotheses and state the test:
Calculate $\bar{x}_1 - \bar{x}_2 = 6.288 - 6.016 = 0.272$, which is close to 0. According to
the question, we perform the following two-tail
test: $H_0: \mu_1 - \mu_2 = 0$ and
$H_A: \mu_1 - \mu_2 \ne 0$. Calculate $s_1 = 0.921$ and $s_2 = 1.142$;
hence the variances appear to be equal. Calculate also the
pooled variance $s_p^2 = 1.075$.
2. Calculate the value of the test statistic, d.f. and $t_{\alpha/2,\text{d.f.}}$:

$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(6.288 - 6.016) - 0}{\sqrt{1.075\left(\dfrac{1}{25} + \dfrac{1}{25}\right)}} = 0.93$$

d.f. = 25 + 25 – 2 = 48 and $t_{0.025,48} = 2.011$.


3. Make the decision / inference: Since the value of the t-statistic
falls in the acceptance (non-rejection) region
$-t_{\alpha/2,\text{d.f.}} \le t \le t_{\alpha/2,\text{d.f.}}$, we
do not reject $H_0$ at the significance level $\alpha = 5\%$. This
means that there is not enough evidence to infer that the
assembly times of the two designs differ.
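The equal-variances calculation can be reproduced from the summary statistics alone. The sketch below is illustrative only: `pooled_t` is a hypothetical helper, the inputs are the rounded values quoted in step 1, and the two-tail critical value is taken from the Excel output in this lecture rather than computed.

```python
from math import sqrt


def pooled_t(mean1, mean2, s1, s2, n1, n2, hypothesized_diff=0.0):
    """Equal-variances t-statistic, pooled variance and degrees of freedom."""
    df = n1 + n2 - 2
    pooled_variance = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    standard_error = sqrt(pooled_variance * (1 / n1 + 1 / n2))
    t = ((mean1 - mean2) - hypothesized_diff) / standard_error
    return t, pooled_variance, df


# Summary statistics of Example 13.3 (Design A = 1, Design B = 2)
t_stat, sp2, df = pooled_t(6.288, 6.016, 0.921, 1.142, 25, 25)

t_critical = 2.0106  # t_{0.025, 48}, from the Excel output in the lecture
reject_h0 = abs(t_stat) >= t_critical  # two-tail decision rule

print(f"t = {t_stat:.3f}, pooled variance = {sp2:.3f}, d.f. = {df}")
print("reject H0" if reject_h0 else "do not reject H0")
```

Because the standard deviations are rounded to three decimals, the pooled variance differs from Excel's value in the third decimal place; the conclusion is unaffected.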

Assembly times

Design-A   Design-B
6.8        5.2
5.0        6.7
7.9        5.7
5.2        6.6
7.6        8.5
5.0        6.5
5.9        5.9
5.2        6.7
6.5        6.6
.          .
.          .

Use Excel D.A.
t-Test: Two-Sample Assuming Equal Variances

                               Design-A      Design-B
Mean                           6.288         6.016
Variance (s1^2, s2^2)          0.847766667   1.3030667
Observations                   25            25
Pooled Variance (sp^2)         1.075416667
Hypothesized Mean Difference   0
df                             48
t Stat                         0.927332603
P(T<=t) one-tail               0.179196744
t Critical one-tail            1.677224191
P(T<=t) two-tail               0.358393488
t Critical two-tail            2.01063358

At the 5% significance level, there is insufficient
evidence to reject the null hypothesis (using p-value =
P(T ≤ -0.93) + P(T ≥ 0.93) = 0.3584 > 5%).

Checking the required condition for the equal
variances case (Example 13.3, page 529)

[Figure: frequency histograms of the assembly times for Design A and Design B.
Horizontal axis: bins 4, 5, 6, 7, 8, 9, More; vertical axis: Frequency.]

Checking the required condition for the equal variances case
using the frequency histograms: the distributions are not
exactly bell shaped, but they seem to be approximately
normal. Since the technique is robust, we can be confident
about the results.

13.2 Testing the difference
between two population means:
Matched pairs experiment
Example 13.5, page 541:
• To determine whether a new steel-belted radial tyre
lasts longer than a current model, the manufacturer
designs the following matched pairs experiment (to
make the comparison without any side effects).
• One of each type of tyre is installed on the rear
wheels of 20 randomly selected cars.
• Drivers drive in their usual way until the tyres are
worn out.
• The number of kilometres driven on each tyre is
recorded.

Car   New design   Existing design   Difference
1     65           56                9
2     72           58                14
3     110          97                13
4     70           64                6
5     90           87                3
6     95           83                12
7     69           58                11
8     70           57                13
9     82           78                4
10    70           74                -4
11    108          106               2
12    98           94                4
13    91           86                5
14    92           98                -6
15    94           106               -12
16    70           66                4
17    75           66                9
18    48           49                -1
19    79           69                10
20    86           91                -5

Solution (pages 542-543)
Calculate the sample of differences: $X_D = X_1 - X_2$
(look at the 4th column).
We assume that $X_D$ is normally distributed
(check it by drawing a histogram).
The T-statistic is used to perform the test.

Solution, pages 542– 543
Three steps to follow:
1. Set up the hypotheses and state the test:
Calculate $\bar{x}_D = 4.55 > 0$. According to the
question, we perform the following right-tail
test: $H_0: \mu_D = 0$ and
$H_A: \mu_D > 0$. Calculate also the standard deviation $s_D = 7.22$.

2. Calculate the value of the test statistic and $t_{\alpha,\text{d.f.}}$:

$$t = \frac{\bar{x}_D - \mu_D}{s_D/\sqrt{n_D}} = \frac{4.55 - 0}{7.22/\sqrt{20}} = 2.82$$

and $t_{\alpha,\text{d.f.}} = t_{0.05,\,n_D - 1} = t_{0.05,19} = 1.729$.

3. Make the decision / inference: Since the value of the t-statistic
falls in the rejection region $t \ge t_{\alpha,\text{d.f.}}$, we reject $H_0$ and
accept $H_A$ at the significance level $\alpha = 5\%$. This means that
the new-design tyre is superior.
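The matched pairs test reduces to a one-sample t-test on the differences. A minimal Python sketch, using the tyre data of Example 13.5 and taking the critical value from the lecture (the variable names are illustrative choices, not from the textbook):

```python
from math import sqrt
from statistics import mean, stdev

# Distance driven on each tyre, cars 1 to 20 (data of Example 13.5)
new_design = [65, 72, 110, 70, 90, 95, 69, 70, 82, 70,
              108, 98, 91, 92, 94, 70, 75, 48, 79, 86]
existing_design = [56, 58, 97, 64, 87, 83, 58, 57, 78, 74,
                   106, 94, 86, 98, 106, 66, 66, 49, 69, 91]

# Matched pairs: work with the per-car differences X_D = X_1 - X_2
differences = [new - old for new, old in zip(new_design, existing_design)]
n_d = len(differences)
d_bar = mean(differences)
s_d = stdev(differences)  # sample standard deviation of the differences

t_stat = (d_bar - 0) / (s_d / sqrt(n_d))  # H0: mu_D = 0
t_critical = 1.729  # t_{0.05, 19}, value quoted in the lecture

print(f"mean difference = {d_bar:.2f}, s_D = {s_d:.2f}, t = {t_stat:.3f}")
print("reject H0" if t_stat >= t_critical else "do not reject H0")
```

Pairing the tyres on the same car removes the car-to-car variation, which is why the test works on the single column of differences rather than on the two columns separately.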

Car   New design   Existing design   Difference
1     65           56                9
2     72           58                14
3     110          97                13
4     70           64                6
5     90           87                3
6     95           83                12
7     69           58                11
8     70           57                13
9     82           78                4
10    70           74                -4
11    108          106               2
12    98           94                4
13    91           86                5
14    92           98                -6
15    94           106               -12
16    70           66                4
17    75           66                9
18    48           49                -1
19    79           69                10
20    86           91                -5

Using Excel D.A.
t-Test: Paired Two Sample for Means

                               New design   Existing design
Mean                           81.7         77.15
Variance                       244.0105     318.766
Observations                   20           20
Pearson Correlation            0.915437
Hypothesized Mean Difference   0
df                             19
t Stat                         2.817587
P(T<=t) one-tail               0.005497
t Critical one-tail            1.729133
P(T<=t) two-tail               0.010994
t Critical two-tail            2.093024

At the 5% significance level, there
is sufficient evidence to reject the
null hypothesis (using p-value =
P(T ≥ 2.82) = 0.0055 < 5%).

13.3 Testing the difference
between two population proportions
• In this section we deal with two populations whose
data are nominal.
• When data are nominal, we can (only) ask questions
regarding the proportions of occurrences (successes)
of certain outcomes.
• Thus, we hypothesise on the difference p1 – p2 and
draw an inference from the hypothesis test.
• Consider the statistic (sample proportion) $\hat{p}_1 = X_1/n_1$,
where $X_1$ is the number of successes in a sample of size $n_1$
taken from the 1st population, and the statistic $\hat{p}_2 = X_2/n_2$,
where $X_2$ is the number of successes in a sample of size $n_2$
taken from the 2nd population.
• The statistic $\hat{p}_1 - \hat{p}_2$ is approximately normally
distributed if $n_1 p_1$, $n_1(1 - p_1)$, $n_2 p_2$ and $n_2(1 - p_2)$ are
all equal to or greater than 5 (since $p_1$ and $p_2$ are
unknown, we use their point
estimators $\hat{p}_1$ and $\hat{p}_2$ instead).
• The mean of $\hat{p}_1 - \hat{p}_2$ is $p_1 - p_2$. The variance of
$\hat{p}_1 - \hat{p}_2$ is $p_1(1 - p_1)/n_1 + p_2(1 - p_2)/n_2$.
Case 1: The following statistic may be
treated as a Z-statistic and used to perform
the hypothesis test when $H_0: p_1 - p_2 = 0$:

$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

where $\hat{p} = \dfrac{X_1 + X_2}{n_1 + n_2}$ is the (pooled) proportion of
successes in both samples.

Example 13.6, page 551

• A research project employing 22 000 patients
was conducted to discover whether aspirin can
prevent heart attacks.
• Half the participants in the research took aspirin
and half took a placebo.
• In a three-year period, 104 of those who took
aspirin and 189 of those who took the placebo
had heart attacks.
• Is aspirin effective in preventing heart attacks?

Solution, pages 550– 552: Three steps to follow:


1. Set up the hypotheses and state the test:
Calculate $\hat{p}_1 = 104/11000 = 0.009455 < \hat{p}_2 =
189/11000 = 0.01718$. According to the question,
we perform the following left-tail
test: $H_0: p_1 - p_2 = 0$ and $H_A: p_1 - p_2 < 0$.
Calculate also the pooled proportion $\hat{p} = 293/22000
= 0.01332$.

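2. Calculate the value of the test statistic and $z_\alpha$:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - 0}{\sqrt{\hat{p}(1 - \hat{p})\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(0.009455 - 0.01718) - 0}{\sqrt{(0.01332)(0.98668)\left(\dfrac{1}{11000} + \dfrac{1}{11000}\right)}} = -4.999$$

and $z_\alpha = z_{0.05} = 1.645$.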

Solution, pages 550–552 (contd.):
3. Make the decision / inference:

Since the value of the z-statistic falls in the rejection
region $z \le -z_\alpha$, we reject $H_0$ and accept $H_A$ at the
significance level $\alpha = 5\%$. This means that
aspirin is effective in reducing the incidence
of heart attacks.

Note:
Since the p-value = P(Z < -4.999) ≈ 0, we reject $H_0$
and accept $H_A$ at the significance level $\alpha = 5\%$.
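The Case 1 (pooled) test for the aspirin data can be sketched as follows. The function name `pooled_proportion_z` is a hypothetical helper, and the check on the success and failure counts is a simple stand-in for the "at least 5" condition stated earlier, applied to the sample counts.

```python
from math import sqrt
from statistics import NormalDist


def pooled_proportion_z(x1, n1, x2, n2):
    """Z-statistic for H0: p1 - p2 = 0, using the pooled proportion."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1_hat - p2_hat) / standard_error


# Example 13.6: heart attacks among aspirin takers (1) and placebo takers (2)
x1, n1, x2, n2 = 104, 11000, 189, 11000

# Normal approximation condition: every success/failure count is at least 5
condition_ok = min(x1, n1 - x1, x2, n2 - x2) >= 5

z = pooled_proportion_z(x1, n1, x2, n2)
alpha = 0.05
z_critical = NormalDist().inv_cdf(alpha)  # left-tail critical value (negative)
p_value = NormalDist().cdf(z)             # left-tail p-value

print(f"z = {z:.4f}, critical value = {z_critical:.3f}, p-value = {p_value:.2e}")
print("reject H0" if z <= z_critical else "do not reject H0")
```

The p-value is tiny rather than exactly zero; Excel and the slides display it as 0 after rounding.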

Using Excel D.A.

z-Test of the Difference Between Two Proportions (Case 1)

                    Sample 1   Sample 2
Sample proportion   0.009455   0.017182
Sample size         11000      11000
Alpha               0.05

z Stat                -4.9989
P(Z<=z) one-tail      0.0000
z Critical one-tail   1.6449
P(Z<=z) two-tail      0.0000
z Critical two-tail   1.9600

Since the p-value = P(Z < -4.999) ≈ 0 < 5% (or, equivalently, the z-
statistic value z = -4.999 falls in the rejection
region (-∞, -1.645)),
we make the inference: at the 5% significance
level, there is sufficient evidence to reject the
null hypothesis.

Case 2: The following statistic may be
treated as a Z-statistic and used to perform
the hypothesis test when $H_0: p_1 - p_2 = D$:

$$Z = \frac{(\hat{p}_1 - \hat{p}_2) - D}{\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}}$$

Example 13.7, page 554

• The process that is used to produce a complex
component used in medical instruments typically
results in defect rates in the 40% range. Recently,
two innovative processes have been developed to
replace the existing process.
• Process 1 appears to be more promising, but it is
considerably more expensive to purchase and operate
than process 2. After a thorough analysis of the costs,
management decides that it will adopt process 1 only
if the proportion of defective components produced by
process 2 is more than 8% more than that produced
by process 1.
• In a test to guide the decision, both processes were
used to produce 300 components. Of the 300
components produced by process 1, 33 were found to
be defective, while 84 out of the 300 produced by
process 2 were defective. Using a significance level of
10%, conduct a test to help management make a
decision.

Solution, pages 554– 555: Three steps to follow:
1. Set up the hypotheses and state the test:
Calculate $\hat{p}_1 = 33/300 = 0.11 < \hat{p}_2 - 0.08 = 84/300 - 0.08
= 0.28 - 0.08 = 0.20$. According to the question, we
perform the following left-tail
test: $H_0: p_1 - p_2 = -0.08$ and $H_A: p_1 - p_2 < -0.08$.
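2. Calculate the value of the test statistic and $z_\alpha$:

$$z = \frac{(\hat{p}_1 - \hat{p}_2) - D}{\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}} = \frac{(0.11 - 0.28) - (-0.08)}{\sqrt{\dfrac{0.11(1 - 0.11)}{300} + \dfrac{0.28(1 - 0.28)}{300}}} = -2.85$$

and $z_\alpha = z_{0.1} = 1.282$.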


3. Make the decision / inference: Since the value of the z-statistic
falls in the rejection region $z \le -z_\alpha$, we reject $H_0$ and
accept $H_A$ to conclude that the proportion of defective
components produced by process 2 is more than 8% more
than the proportion of defective components produced by
process 1. (Note: p-value = P(Z < -2.85) = 0.0022 < 10%.)
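For Case 2 the standard error is not pooled, because the null hypothesis no longer says the two proportions are equal. A minimal Python sketch for Example 13.7 follows; `unpooled_proportion_z` is a hypothetical helper name, and D = -0.08 is the hypothesised difference from step 1.

```python
from math import sqrt
from statistics import NormalDist


def unpooled_proportion_z(x1, n1, x2, n2, hypothesized_diff):
    """Z-statistic for H0: p1 - p2 = D, with separate variance estimates."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    standard_error = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    return ((p1_hat - p2_hat) - hypothesized_diff) / standard_error


# Example 13.7: defectives out of 300 components for process 1 and process 2
z = unpooled_proportion_z(33, 300, 84, 300, hypothesized_diff=-0.08)

alpha = 0.10
z_critical = NormalDist().inv_cdf(alpha)  # left-tail critical value (negative)
p_value = NormalDist().cdf(z)             # left-tail p-value

print(f"z = {z:.2f}, critical value = {z_critical:.3f}, p-value = {p_value:.4f}")
print("reject H0" if z <= z_critical else "do not reject H0")
```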

Summary: page 560

Home assignment:

- Section 13.1 Exercises, pages 534-537: 13.6, 13.10, 13.19
- Section 13.2 Exercises, pages 547-548: 13.37, 13.39, 13.42
- Section 13.3 Exercises, pages 558-559: 13.50, 13.57, 13.61
