Hypothesis of Two Population


Statistics for Business and

Economics
Inferences Based on Two Samples:
Confidence Intervals & Tests of
Hypotheses
Learning Objectives
1. Distinguish Independent and Related
Populations
2. Solve Inference Problems for Two
Populations
• Mean
• Proportion
• Variance
3. Determine Sample Size
Thinking Challenge
How would you try to answer these questions?

• Who gets higher grades: males or females?
• Which program is faster to learn: Word or Excel?
Target Parameters

Difference between Means: $\mu_1 - \mu_2$
Difference between Proportions: $p_1 - p_2$
Ratio of Variances: $\sigma_1^2 / \sigma_2^2$
Independent & Related
Populations
Independent
1. Different data sources
• Unrelated
• Independent
2. Use difference between the two sample means
• $\bar{X}_1 - \bar{X}_2$

Related
1. Same data source
• Paired or matched
• Repeated measures (before/after)
2. Use difference between each pair of observations
• $d_i = x_{1i} - x_{2i}$
Two Independent Populations
Examples
1. An economist wishes to determine whether
there is a difference in mean family income for
households in two socioeconomic groups.
2. An admissions officer of a small liberal arts
college wants to compare the mean SAT
scores of applicants educated in rural high
schools and in urban high schools.
Two Related Populations
Examples
1. Nike wants to see if there is a difference in
durability of two sole materials. One type is
placed on one shoe, the other type on the other
shoe of the same pair.
2. An analyst for Educational Testing Service
wants to compare the mean GMAT scores of
students before and after taking a GMAT
review course.
Thinking Challenge
Are they independent or related?
1. Miles per gallon ratings of cars before and after
mounting radial tires
2. The life expectancy of light bulbs made in two
different factories
3. Difference in hardness between two metals: one
contains an alloy, one doesn’t
4. Tread life of two different motorcycle tires: one on
the front, the other on the back
Comparing dependent and independent samples

Two types of dependent samples:
• The same subjects measured at two different points in time
• Matched or paired observations

Advantage of dependent samples: Reduction in variation in the sampling distribution
Disadvantage of dependent samples: Degrees of freedom are halved


Two Population Inference

Two Populations
• Mean, independent samples: Z (large sample) or t (small sample)
• Mean, paired sample: t
• Proportion: Z
• Variance: F
Comparing Two Means
Comparing
Two Independent Means
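Sampling Distribution
[Figure: select a simple random sample of size n1 from Population 1 (mean $\mu_1$, standard deviation $\sigma_1$) and compute $\bar{X}_1$; select a simple random sample of size n2 from Population 2 (mean $\mu_2$, standard deviation $\sigma_2$) and compute $\bar{X}_2$. Computing $\bar{X}_1 - \bar{X}_2$ for every pair of samples gives an astronomical number of values, whose sampling distribution is centered at $\mu_1 - \mu_2$.]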
Large-Sample Inference for
Two Independent Means
Conditions Required for Valid
Large-Sample Inferences about
μ1 – μ2
Assumptions
• Independent, random samples
• The sampling distribution of $\bar{X}_1 - \bar{X}_2$ can be approximated by the normal distribution
when $n_1 \ge 30$ and $n_2 \ge 30$
Large-Sample Confidence
Interval for μ1 – μ2
(Independent Samples)

Confidence Interval
$(\bar{X}_1 - \bar{X}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
Large-Sample Confidence
Interval Example
You’re a financial analyst for Charles Schwab.
You want to estimate the difference in dividend
yield between stocks listed on NYSE and
NASDAQ. You collect the following data:
NYSE NASDAQ
Number 121 125
Mean 3.27 2.53
Std Dev 1.30 1.16
What is the 95% confidence interval
for the difference between the mean
dividend yields? © 1984-1994 T/Maker Co.
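Large-Sample Confidence
Interval Solution
$(\bar{X}_1 - \bar{X}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
$(3.27 - 2.53) \pm 1.96 \sqrt{\frac{(1.30)^2}{121} + \frac{(1.16)^2}{125}}$
$.43 \le \mu_1 - \mu_2 \le 1.05$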
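The interval arithmetic above can be checked with a few lines of Python. This is a minimal sketch, not part of the original slides: the function name `two_mean_z_interval` is made up for illustration, and the sample standard deviations stand in for the population values, as the large-sample method allows.

```python
import math

def two_mean_z_interval(mean1, sd1, n1, mean2, sd2, n2, z_crit):
    """Large-sample confidence interval for mu1 - mu2 (independent samples)."""
    diff = mean1 - mean2
    # Standard error of the difference between the two sample means
    std_err = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    margin = z_crit * std_err
    return diff - margin, diff + margin

# NYSE vs. NASDAQ dividend yields; z_crit = 1.96 is the table value for 95% confidence
lower, upper = two_mean_z_interval(3.27, 1.30, 121, 2.53, 1.16, 125, z_crit=1.96)
print(f"{lower:.2f} <= mu1 - mu2 <= {upper:.2f}")
```

Because the whole interval lies above 0, it already hints that the two mean yields differ.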
Hypotheses for Means of Two
Independent Populations
Research Questions
Hypothesis   No Difference / Any Difference   Pop 1 ≥ Pop 2 / Pop 1 < Pop 2   Pop 1 ≤ Pop 2 / Pop 1 > Pop 2
H0           $\mu_1 - \mu_2 = 0$              $\mu_1 - \mu_2 \ge 0$           $\mu_1 - \mu_2 \le 0$
Ha           $\mu_1 - \mu_2 \ne 0$            $\mu_1 - \mu_2 < 0$             $\mu_1 - \mu_2 > 0$
Large-Sample Test for μ1 – μ2
(Independent Samples)

Two Independent Sample Z-Test Statistic
$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
where $(\mu_1 - \mu_2)$ is the hypothesized difference
Large-Sample Test
Example
You’re a financial analyst for Charles Schwab.
You want to find out if there is a difference in
dividend yield between stocks listed on NYSE
and NASDAQ. You collect the following data:
NYSE NASDAQ
Number 121 125
Mean 3.27 2.53
Std Dev 1.30 1.16
Is there a difference in average
yield ($\alpha$ = .05)?
Large-Sample Test
Solution
• H0: $\mu_1 - \mu_2 = 0$ ($\mu_1 = \mu_2$)
• Ha: $\mu_1 - \mu_2 \ne 0$ ($\mu_1 \ne \mu_2$)
• $\alpha$ = .05
• n1 = 121, n2 = 125
• Critical Value(s): z = ±1.96 (rejection regions of area .025 in each tail)
Test Statistic:
$z = \frac{(3.27 - 2.53) - 0}{\sqrt{\frac{(1.30)^2}{121} + \frac{(1.16)^2}{125}}} \approx 4.71$
Decision:
Reject at $\alpha$ = .05
Conclusion:
There is evidence of a difference in means
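As a quick check of the test statistic, here is a small Python sketch. The helper names are illustrative assumptions, not from the slides. It squares the reported standard deviations (1.30 and 1.16) directly, which gives a statistic near 4.71; either way the value is far beyond the critical value of 1.96.

```python
import math

def two_mean_z_statistic(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """Large-sample z statistic for H0: mu1 - mu2 = hypothesized_diff."""
    std_err = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return ((mean1 - mean2) - hypothesized_diff) / std_err

def reject_two_tailed(z, z_crit):
    """Two-tailed decision rule: reject H0 when |z| exceeds the critical value."""
    return abs(z) > z_crit

# NYSE vs. NASDAQ, alpha = .05, table critical value 1.96
z = two_mean_z_statistic(3.27, 1.30, 121, 2.53, 1.16, 125)
print(f"z = {z:.2f}, reject H0: {reject_two_tailed(z, 1.96)}")
```

The same two functions apply to any pair of large independent samples; only the critical value changes with $\alpha$.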
Large-Sample Test
Thinking Challenge
You’re an economist for the Department of
Education. You want to find out if there is a
difference in spending per pupil between urban and
rural high schools. You collect the following:
Urban Rural
Number 35 35
Mean $ 6,012 $ 5,832
Std Dev $ 602 $ 497
Is there any difference in population
means ($\alpha$ = .10)?
Large-Sample Test
Solution*
• H0: $\mu_1 - \mu_2 = 0$ ($\mu_1 = \mu_2$)
• Ha: $\mu_1 - \mu_2 \ne 0$ ($\mu_1 \ne \mu_2$)
• $\alpha$ = .10
• n1 = 35, n2 = 35
• Critical Value(s): z = ±1.645 (rejection regions of area .05 in each tail)
Test Statistic:
$z = \frac{(6012 - 5832) - 0}{\sqrt{\frac{602^2}{35} + \frac{497^2}{35}}} = 1.36$
Decision:
Do not reject at $\alpha$ = .10
Conclusion:
There is no evidence of a difference in means
Small-Sample Inference
for Two Independent Means
Conditions Required for Valid
Small-Sample Inferences about
μ1 – μ2
Assumptions
• Independent, random samples
• Populations are approximately normally distributed
• Population variances are equal
Small-Sample Confidence
Interval for μ1 – μ2
(Independent Samples)
Confidence Interval
$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2} \sqrt{S_P^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$
$S_P^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$
$df = n_1 + n_2 - 2$
Small-Sample Confidence
Interval Example
You’re a financial analyst for Charles Schwab.
You want to estimate the difference in dividend
yield between stocks listed on the NYSE and
NASDAQ. You collect the following data:
NYSE NASDAQ
Number 11 15
Mean 3.27 2.53
Std Dev 1.30 1.16
Assuming normal populations, what
is the 95% confidence interval
for the difference between the
mean dividend yields?
Small-Sample Confidence
Interval Solution
df = n1 + n2 – 2 = 11 + 15 – 2 = 24, $t_{.025}$ = 2.064
$S_P^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2} = \frac{(11 - 1)(1.30)^2 + (15 - 1)(1.16)^2}{11 + 15 - 2} = 1.489$
$(3.27 - 2.53) \pm 2.064 \sqrt{1.489 \left(\frac{1}{11} + \frac{1}{15}\right)}$
$-.26 \le \mu_1 - \mu_2 \le 1.74$
Small-Sample Test for μ1 – μ2
(Independent Samples)

Two Independent Sample t-Test Statistic
$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_P^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
where $(\mu_1 - \mu_2)$ is the hypothesized difference
$S_P^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}$
$df = n_1 + n_2 - 2$
Small-Sample Test
Example
You’re a financial analyst for Charles Schwab. Is
there a difference in dividend yield between stocks
listed on the NYSE and NASDAQ? You collect
the following data:
NYSE NASDAQ
Number 11 15
Mean 3.27 2.53
Std Dev 1.30 1.16
Assuming normal populations,
is there a difference in average
yield ($\alpha$ = .05)?



Small-Sample Test
Solution

$S_P^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2} = \frac{(11 - 1)(1.30)^2 + (15 - 1)(1.16)^2}{11 + 15 - 2} = 1.489$
$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_P^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.489 \left(\frac{1}{11} + \frac{1}{15}\right)}} = 1.53$
Small-Sample Test
Solution
• H0: $\mu_1 - \mu_2 = 0$ ($\mu_1 = \mu_2$)
• Ha: $\mu_1 - \mu_2 \ne 0$ ($\mu_1 \ne \mu_2$)
• $\alpha$ = .05
• df = 11 + 15 - 2 = 24
• Critical Value(s): t = ±2.064 (rejection regions of area .025 in each tail)
Test Statistic:
$t = \frac{3.27 - 2.53}{\sqrt{1.489 \left(\frac{1}{11} + \frac{1}{15}\right)}} = 1.53$
Decision:
Do not reject at $\alpha$ = .05
Conclusion:
There is no evidence of a difference in means
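The pooled-variance calculations for the small-sample interval and test can be sketched in Python as follows. The function names are hypothetical, and the critical value 2.064 is taken from the t table for df = 24 rather than computed.

```python
import math

def pooled_variance(sd1, n1, sd2, n2):
    """Pooled sample variance S_P^2, assuming equal population variances."""
    return ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)

def pooled_std_error(sd1, n1, sd2, n2):
    """Standard error of the difference in means under the pooled variance."""
    return math.sqrt(pooled_variance(sd1, n1, sd2, n2) * (1 / n1 + 1 / n2))

def pooled_t_statistic(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """t statistic with n1 + n2 - 2 degrees of freedom."""
    return ((mean1 - mean2) - hypothesized_diff) / pooled_std_error(sd1, n1, sd2, n2)

def pooled_t_interval(mean1, sd1, n1, mean2, sd2, n2, t_crit):
    """Confidence interval for mu1 - mu2 using a table value t_crit."""
    margin = t_crit * pooled_std_error(sd1, n1, sd2, n2)
    diff = mean1 - mean2
    return diff - margin, diff + margin

# NYSE (n = 11) vs. NASDAQ (n = 15); t_crit = 2.064 is the table value for df = 24
sp2 = pooled_variance(1.30, 11, 1.16, 15)
t = pooled_t_statistic(3.27, 1.30, 11, 2.53, 1.16, 15)
lower, upper = pooled_t_interval(3.27, 1.30, 11, 2.53, 1.16, 15, t_crit=2.064)
print(f"S_P^2 = {sp2:.3f}, t = {t:.2f}, CI = ({lower:.2f}, {upper:.2f})")
```

The interval contains 0 and |t| is below 2.064, so the interval and the test agree: no evidence of a difference.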
Small-Sample Test
Thinking Challenge
You’re a research analyst for General Motors. Assuming
equal variances, is there a difference in the average
miles per gallon (mpg) of two car models ($\alpha$ = .05)?
You collect the following:
Sedan Van
Number 15 11
Mean 22.00 20.27
Std Dev 4.77 3.64
Small-Sample Test
Solution*

$S_P^2 = \frac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2} = \frac{(15 - 1)(4.77)^2 + (11 - 1)(3.64)^2}{15 + 11 - 2} = 18.793$
$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_P^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(22.00 - 20.27) - 0}{\sqrt{18.793 \left(\frac{1}{15} + \frac{1}{11}\right)}} = 1.00$
Small-Sample Test
Solution*
• H0: $\mu_1 - \mu_2 = 0$ ($\mu_1 = \mu_2$)
• Ha: $\mu_1 - \mu_2 \ne 0$ ($\mu_1 \ne \mu_2$)
• $\alpha$ = .05
• df = 15 + 11 - 2 = 24
• Critical Value(s): t = ±2.064 (rejection regions of area .025 in each tail)
Test Statistic:
$t = \frac{22.00 - 20.27}{\sqrt{18.793 \left(\frac{1}{15} + \frac{1}{11}\right)}} = 1.00$
Decision:
Do not reject at $\alpha$ = .05
Conclusion:
There is no evidence of a difference in means
Paired Difference
Experiments
Small-Sample
Paired-Difference
Experiments
1. Compares means of two related populations
• Paired or matched
• Repeated measures (before/after)

2. Eliminates variation among subjects


Conditions Required for Valid
Small-Sample Paired-Difference
Inferences
Assumptions
• Random sample of differences
• The population of differences is approximately normally
distributed
Paired-Difference Experiment
Data Collection Table

Observation   Group 1   Group 2   Difference
1             x11       x21       d1 = x11 – x21
2             x12       x22       d2 = x12 – x22
⋮             ⋮         ⋮         ⋮
i             x1i       x2i       di = x1i – x2i
⋮             ⋮         ⋮         ⋮
n             x1n       x2n       dn = x1n – x2n
Paired-Difference Experiment
Small-Sample Confidence
Interval
$\bar{d} \pm t_{\alpha/2} \frac{S_d}{\sqrt{n_d}}$, df = $n_d - 1$
Sample Mean: $\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n_d}$
Sample Standard Deviation: $S_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n_d - 1}}$
Paired-Difference Experiment
Confidence Interval Example
You work in Human Resources. You want to see if there is a
difference in test scores after a training program. You collect
the following test score data:
Name Before (1) After (2)
Sam 85 94
Tamika 94 87
Brian 78 79
Mike 87 88
Find a 90% confidence interval for the
mean difference in test scores.
Computation Table
Observation Before After Difference
Sam 85 94 -9
Tamika 94 87 7
Brian 78 79 -1
Mike 87 88 -1
Total -4

$\bar{d}$ = –1, $S_d$ = 6.53
Paired-Difference Experiment
Confidence Interval Solution
df = nd – 1 = 4 – 1 = 3, $t_{.05}$ = 2.353
$\bar{d} \pm t_{\alpha/2} \frac{S_d}{\sqrt{n_d}}$
$-1 \pm 2.353 \frac{6.53}{\sqrt{4}}$
$-8.68 \le \mu_d \le 6.68$
Hypotheses for Paired-
Difference Experiment

Research Questions
Hypothesis   No Difference / Any Difference   Pop 1 ≥ Pop 2 / Pop 1 < Pop 2   Pop 1 ≤ Pop 2 / Pop 1 > Pop 2
H0           $\mu_d = 0$                      $\mu_d \ge 0$                   $\mu_d \le 0$
Ha           $\mu_d \ne 0$                    $\mu_d < 0$                     $\mu_d > 0$
Note: di = x1i – x2i for ith observation
Paired-Difference Experiment
Small-Sample Test Statistic
$t = \frac{\bar{d} - D_0}{S_d / \sqrt{n_d}}$, df = $n_d - 1$
Sample Mean: $\bar{d} = \frac{\sum_{i=1}^{n} d_i}{n_d}$
Sample Standard Deviation: $S_d = \sqrt{\frac{\sum_{i=1}^{n} (d_i - \bar{d})^2}{n_d - 1}}$
Paired-Difference Experiment
Small-Sample Test Example
You work in Human Resources. You want to see if a training
program is effective. You collect the following test score
data:
Name Before After
Sam 85 94
Tamika 94 87
Brian 78 79
Mike 87 88
At the .10 level of significance, was the
training effective?
Null Hypothesis
Solution
1. Was the training effective?
2. Effective means ‘Before’ < ‘After’.
3. Statistically, this means $\mu_B < \mu_A$.
4. Rearranging terms gives $\mu_B - \mu_A < 0$.
5. Defining $\mu_d = \mu_B - \mu_A$ and substituting into (4)
gives $\mu_d < 0$.
6. The alternative hypothesis is Ha: $\mu_d < 0$.
Paired-Difference Experiment
Small-Sample Test Solution
• H0: $\mu_d = 0$ ($\mu_d = \mu_B - \mu_A$)
• Ha: $\mu_d < 0$
• $\alpha$ = .10
• df = 4 - 1 = 3
• Critical Value(s): t = –1.638 (rejection region of area .10 in the lower tail)
Test Statistic:
$t = \frac{\bar{d} - D_0}{S_d / \sqrt{n_d}} = \frac{-1 - 0}{6.53 / \sqrt{4}} = -.306$
Decision:
Do not reject at $\alpha$ = .10
Conclusion:
There is no evidence training was effective
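The paired-difference calculations for the training data can be reproduced with Python's standard library. This is an illustrative sketch: the function names are assumptions, differences are taken as Before minus After to match the slides, and the critical values (2.353 for the 90% interval, –1.638 for the one-tailed test) come from the t table for df = 3.

```python
import math
import statistics

def paired_summary(first, second):
    """Mean, sample standard deviation, and count of the paired differences."""
    diffs = [x1 - x2 for x1, x2 in zip(first, second)]
    return statistics.mean(diffs), statistics.stdev(diffs), len(diffs)

def paired_t_statistic(d_bar, s_d, n, hypothesized_mean=0.0):
    """t statistic with n - 1 degrees of freedom."""
    return (d_bar - hypothesized_mean) / (s_d / math.sqrt(n))

def paired_t_interval(d_bar, s_d, n, t_crit):
    """Confidence interval for the mean difference using a table value t_crit."""
    margin = t_crit * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin

before = [85, 94, 78, 87]  # Sam, Tamika, Brian, Mike
after = [94, 87, 79, 88]
d_bar, s_d, n = paired_summary(before, after)
t = paired_t_statistic(d_bar, s_d, n)
lower, upper = paired_t_interval(d_bar, s_d, n, t_crit=2.353)
print(f"d_bar = {d_bar}, S_d = {s_d:.2f}, t = {t:.3f}, CI = ({lower:.2f}, {upper:.2f})")
```

Since t is nowhere near –1.638, the code reaches the same "do not reject" decision as the slide.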
Paired-Difference Experiment
Small-Sample Test Thinking
Challenge
You’re a marketing research analyst. You want to compare a client’s calculator to a competitor’s. You sample 8 retail stores. At the .01 level of significance, does your client’s calculator sell for less than their competitor’s?
Store   (1) Client   (2) Competitor
1       $ 10         $ 11
2       8            11
3       7            10
4       9            12
5       11           11
6       10           13
7       9            12
8       8            10
Paired-Difference Experiment
Small-Sample Test Solution*
• H0: $\mu_d = 0$ ($\mu_d = \mu_1 - \mu_2$)
• Ha: $\mu_d < 0$
• $\alpha$ = .01
• df = 8 - 1 = 7
• Critical Value(s): t = –2.998 (rejection region of area .01 in the lower tail)
Test Statistic:
$t = \frac{\bar{d} - D_0}{S_d / \sqrt{n_d}} = \frac{-2.25 - 0}{1.16 / \sqrt{8}} = -5.486$
Decision:
Reject at $\alpha$ = .01
Conclusion:
There is evidence client’s brand (1) sells for less
Comparing Two Population
Proportions
Conditions Required for Valid
Large-Sample Inference about
p1 – p2
Assumptions
• Independent, random samples
• Normal approximation can be used if
$n_1 \hat{p}_1 \ge 15$, $n_1 \hat{q}_1 \ge 15$, $n_2 \hat{p}_2 \ge 15$, and $n_2 \hat{q}_2 \ge 15$
Large-Sample Confidence
Interval for p1 – p2
Confidence Interval
$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}$
Confidence Interval for p1 – p2
Example
As personnel director, you
want to test the perception of
fairness of two methods of
performance evaluation. 63 of
78 employees rated Method 1
as fair. 49 of 82 rated Method
2 as fair. Find a 99%
confidence interval for the
difference in perceptions.
Confidence Interval for p1 – p2
Solution
$\hat{p}_1 = \frac{63}{78} = .808$, $\hat{q}_1 = 1 - .808 = .192$
$\hat{p}_2 = \frac{49}{82} = .598$, $\hat{q}_2 = 1 - .598 = .402$
$(.808 - .598) \pm 2.58 \sqrt{\frac{.808 \times .192}{78} + \frac{.598 \times .402}{82}}$
$.029 \le p_1 - p_2 \le .391$
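A short Python sketch of the same interval follows. The function name is illustrative only, and 2.58 is the table value for 99% confidence used on the slide. The code works from the raw counts, so it avoids the small rounding in .808 and .598.

```python
import math

def two_proportion_interval(x1, n1, x2, n2, z_crit):
    """Large-sample confidence interval for p1 - p2 (unpooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    std_err = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    margin = z_crit * std_err
    return (p1 - p2) - margin, (p1 - p2) + margin

# 63 of 78 rated Method 1 as fair; 49 of 82 rated Method 2 as fair
lower, upper = two_proportion_interval(63, 78, 49, 82, z_crit=2.58)
print(f"{lower:.3f} <= p1 - p2 <= {upper:.3f}")
```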
Hypotheses for
Two Proportions
Research Questions
Hypothesis   No Difference / Any Difference   Pop 1 ≥ Pop 2 / Pop 1 < Pop 2   Pop 1 ≤ Pop 2 / Pop 1 > Pop 2
H0           $p_1 - p_2 = 0$                  $p_1 - p_2 \ge 0$               $p_1 - p_2 \le 0$
Ha           $p_1 - p_2 \ne 0$                $p_1 - p_2 < 0$                 $p_1 - p_2 > 0$
Large-Sample Test
for p1 – p2
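Z-Test Statistic for Two Proportions
$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}\hat{q} \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
where $\hat{p} = \frac{X_1 + X_2}{n_1 + n_2}$ and $\hat{q} = 1 - \hat{p}$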
Test for Two Proportions
Example
As personnel director, you
want to test the perception of
fairness of two methods of
performance evaluation. 63 of
78 employees rated Method 1
as fair. 49 of 82 rated Method
2 as fair. At the .01 level of
significance, is there a
difference in perceptions?
Test for Two Proportions
Solution
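$\hat{p}_1 = \frac{X_1}{n_1} = \frac{63}{78} = .808$, $\hat{p}_2 = \frac{X_2}{n_2} = \frac{49}{82} = .598$
$\hat{p} = \frac{X_1 + X_2}{n_1 + n_2} = \frac{63 + 49}{78 + 82} = .70$
$Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\hat{p}(1 - \hat{p}) \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{(.808 - .598) - 0}{\sqrt{(.70)(1 - .70) \left(\frac{1}{78} + \frac{1}{82}\right)}} = 2.90$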
Test for Two Proportions
Solution
• H0: $p_1 - p_2 = 0$
• Ha: $p_1 - p_2 \ne 0$
• $\alpha$ = .01
• n1 = 78, n2 = 82
• Critical Value(s): z = ±2.58 (rejection regions of area .005 in each tail)
Test Statistic:
Z = +2.90
Decision:
Reject at $\alpha$ = .01
Conclusion:
There is evidence of a difference in proportions
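The pooled two-proportion statistic can be sketched in Python as below. The function name is hypothetical. The pooled estimate is only appropriate when the hypothesized difference is 0, which is the case tested here.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """z statistic for H0: p1 - p2 = 0, using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / std_err

# Fairness ratings: 63 of 78 for Method 1, 49 of 82 for Method 2
z = two_proportion_z(63, 78, 49, 82)
print(f"Z = {z:.2f}, reject at alpha = .01: {abs(z) > 2.58}")
```

For a one-tailed question, compare the signed statistic with a single critical value instead of using the absolute value.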
Test for Two Proportions
Thinking Challenge
You’re an economist for the Department of Labor. You’re studying unemployment rates. In MA, 74 of 1500 people surveyed were unemployed. In CA, 129 of 1500 were unemployed. At the .05 level of significance, does MA have a lower unemployment rate than CA?
Test for Two Proportions
Solution*
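$\hat{p}_{MA} = \frac{X_{MA}}{n_{MA}} = \frac{74}{1500} = .0493$, $\hat{p}_{CA} = \frac{X_{CA}}{n_{CA}} = \frac{129}{1500} = .0860$
$\hat{p} = \frac{X_{MA} + X_{CA}}{n_{MA} + n_{CA}} = \frac{74 + 129}{1500 + 1500} = .0677$
$Z = \frac{(.0493 - .0860) - 0}{\sqrt{(.0677)(1 - .0677) \left(\frac{1}{1500} + \frac{1}{1500}\right)}} = -4.00$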
Test for Two Proportions
Solution*
• H0: $p_{MA} - p_{CA} = 0$
• Ha: $p_{MA} - p_{CA} < 0$
• $\alpha$ = .05
• nMA = 1500, nCA = 1500
• Critical Value(s): z = –1.645 (rejection region of area .05 in the lower tail)
Test Statistic:
Z = –4.00
Decision:
Reject at $\alpha$ = .05
Conclusion:
There is evidence MA is less than CA
Determining Sample Size
• Sample size for estimating μ1 – μ2
$n_1 = n_2 = \frac{(z_{\alpha/2})^2 (\sigma_1^2 + \sigma_2^2)}{(ME)^2}$, where ME = Margin of Error

• Sample size for estimating p1 – p2
$n_1 = n_2 = \frac{(z_{\alpha/2})^2 (p_1 q_1 + p_2 q_2)}{(ME)^2}$
Sample Size Example
What sample size is needed to estimate μ1 – μ2
with 95% confidence and a margin of error of
5.8? Assume prior experience tells us σ1 = 12
and σ2 = 18.
$n_1 = n_2 = \frac{(1.96)^2 (12^2 + 18^2)}{(5.8)^2} = 53.44 \approx 54$
Sample Size Example
What sample size is needed to estimate p1 – p2
with 90% confidence and a width of .05?
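$ME = \frac{width}{2} = \frac{.05}{2} = .025$
With no prior estimates of p1 and p2, use the conservative values p1 = p2 = .5:
$n_1 = n_2 = \frac{(1.645)^2 (.5 \times .5 + .5 \times .5)}{(.025)^2} = 2164.82 \approx 2165$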
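Both sample-size formulas can be written as small Python helpers. This is a sketch with made-up function names. The result is always rounded up, because rounding down would give a margin of error slightly larger than requested.

```python
import math

def sample_size_means(z_crit, sigma1, sigma2, margin_of_error):
    """Equal sample sizes n1 = n2 needed to estimate mu1 - mu2."""
    n = z_crit**2 * (sigma1**2 + sigma2**2) / margin_of_error**2
    return math.ceil(n)

def sample_size_proportions(z_crit, p1, p2, margin_of_error):
    """Equal sample sizes n1 = n2 needed to estimate p1 - p2."""
    n = z_crit**2 * (p1 * (1 - p1) + p2 * (1 - p2)) / margin_of_error**2
    return math.ceil(n)

# 95% confidence, ME = 5.8, sigma1 = 12, sigma2 = 18
n_means = sample_size_means(1.96, 12, 18, 5.8)
# 90% confidence, width .05 (ME = .025), conservative p1 = p2 = .5
n_props = sample_size_proportions(1.645, 0.5, 0.5, 0.05 / 2)
print(n_means, n_props)
```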
Comparing
Two Population Variances
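Sampling Distribution
[Figure: select a simple random sample of size n1 from Population 1 (mean $\mu_1$, standard deviation $\sigma_1$) and compute $S_1^2$; select a simple random sample of size n2 from Population 2 (mean $\mu_2$, standard deviation $\sigma_2$) and compute $S_2^2$. Computing $F = S_1^2 / S_2^2$ for every pair of samples of sizes n1 and n2 gives an astronomical number of values; the resulting sampling distributions differ for different sample sizes.]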
Conditions Required for a Valid
F-Test for Equal Variances

Assumptions
• Both populations are normally distributed
— Test is not robust to violations
• Independent, random samples
F-Test for Equal
Variances Hypotheses
• Hypotheses
H0: $\sigma_1^2 = \sigma_2^2$ OR H0: $\sigma_1^2 \ge \sigma_2^2$ (or $\le$)
Ha: $\sigma_1^2 \ne \sigma_2^2$    Ha: $\sigma_1^2 < \sigma_2^2$ (or >)

• Test Statistic
• $F = s_1^2 / s_2^2$
• Two sets of degrees of freedom: $\nu_1 = n_1 - 1$; $\nu_2 = n_2 - 1$
• Follows F distribution
F-Test for Equal Variances
Critical Values

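[Figure: F distribution with rejection regions of area $\alpha/2$ in each tail and a "Do Not Reject H0" region between the lower critical value $F_L$ and the upper critical value $F_U$.]
Note! $F_{L(\alpha/2;\ \nu_1, \nu_2)} = \frac{1}{F_{U(\alpha/2;\ \nu_2, \nu_1)}}$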
F-Test for Equal
Variances Example
You’re a financial analyst for Charles Schwab.
You want to compare dividend yields between
stocks listed on the NYSE & NASDAQ. You
collect the following data:
NYSE NASDAQ
Number 21 25
Mean 3.27 2.53
Std Dev 1.30 1.16
Is there a difference in variances
between the NYSE & NASDAQ
at the .05 level of significance?
F-Test for Equal Variances
Solution
[Figure: F distribution with rejection regions of area $\alpha/2$ = .025 in each tail.]
$F_{U(.025;\ 20, 24)} = 2.33$
$F_{L(.025;\ 20, 24)} = \frac{1}{F_{U(.025;\ 24, 20)}} = \frac{1}{2.41} = .415$
F-Test for Equal Variances
Solution
• H0: $\sigma_1^2 = \sigma_2^2$
• Ha: $\sigma_1^2 \ne \sigma_2^2$
• $\alpha$ = .05
• $\nu_1 = 20$, $\nu_2 = 24$
• Critical Value(s): $F_L$ = .415, $F_U$ = 2.33 (rejection regions of area .025 in each tail)
Test Statistic:
$F = \frac{S_1^2}{S_2^2} = \frac{1.30^2}{1.16^2} \approx 1.26$
Decision:
Do not reject at $\alpha$ = .05
Conclusion:
There is no evidence of a difference in variances
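The F-test arithmetic, including the reciprocal rule for the lower critical value, can be sketched in Python. The function names are assumptions for illustration, and the upper critical values 2.33 and 2.41 are the table values quoted on the slides rather than computed quantities.

```python
def f_statistic(sd1, sd2):
    """Ratio of sample variances, F = s1^2 / s2^2."""
    return sd1**2 / sd2**2

def lower_critical_value(upper_crit_swapped_df):
    """F_L(a/2; v1, v2) = 1 / F_U(a/2; v2, v1): pass the upper value with swapped df."""
    return 1 / upper_crit_swapped_df

def reject_two_tailed_f(f, lower_crit, upper_crit):
    """Reject H0 when F falls in either tail."""
    return f < lower_crit or f > upper_crit

# NYSE (n = 21) vs. NASDAQ (n = 25): v1 = 20, v2 = 24
f_stocks = f_statistic(1.30, 1.16)
f_lower = lower_critical_value(2.41)  # 2.41 = F_U(.025; 24, 20) from the table
f_upper = 2.33                        # F_U(.025; 20, 24) from the table
print(f"F = {f_stocks:.2f}, F_L = {f_lower:.3f}, reject: "
      f"{reject_two_tailed_f(f_stocks, f_lower, f_upper)}")
```

Swapping the degrees of freedom when looking up the reciprocal is the step that is easiest to get wrong by hand, which is why it gets its own helper here.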
F-Test for Equal Variances
Thinking Challenge
You’re an analyst for the Light & Power Company.
You want to compare the electricity consumption of
single-family homes in two towns. You compute the
following from a sample of homes:
Town 1 Town 2
Number 25 21
Mean $ 85 $ 68
Std Dev $ 30 $ 18
At the .05 level of significance, is there evidence
of a difference in variances between the two towns?
Critical Values
Solution*
[Figure: F distribution with rejection regions of area $\alpha/2$ = .025 in each tail.]
$F_{U(.025;\ 24, 20)} = 2.41$
$F_{L(.025;\ 24, 20)} = \frac{1}{F_{U(.025;\ 20, 24)}} = \frac{1}{2.33} = .429$
F-Test for Equal Variances
Solution*
• H0: $\sigma_1^2 = \sigma_2^2$
• Ha: $\sigma_1^2 \ne \sigma_2^2$
• $\alpha$ = .05
• $\nu_1 = 24$, $\nu_2 = 20$
• Critical Value(s): $F_L$ = .429, $F_U$ = 2.41 (rejection regions of area .025 in each tail)
Test Statistic:
$F = \frac{S_1^2}{S_2^2} = \frac{30^2}{18^2} = 2.778$
Decision:
Reject at $\alpha$ = .05
Conclusion:
There is evidence of a difference in variances
Comparing two populations
Does the distribution of the differences in sample means have a mean of 0?
If both samples contain at least 30 observations we use the z distribution as the test statistic.
The samples are from independent populations.
No assumptions about the shape of the populations are required.
The formula for computing the value of z is:
$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Comparing two populations
EXAMPLE 1
Two cities, Bradford and Kane, are separated only by the Conewango River. There is competition between the two cities. The local paper recently reported that the mean household income in Bradford is $38,000 with a standard deviation of $6,000 for a sample of 40 households. The same article reported the mean income in Kane is $35,000 with a standard deviation of $7,000 for a sample of 35 households. At the .01 significance level can we conclude the mean income in Bradford is more?
Step 1
State the null and alternate hypotheses.
H0: $\mu_B \le \mu_K$
H1: $\mu_B > \mu_K$
Step 2
State the level of significance. The .01 significance level is stated in the problem.
Step 3
Find the appropriate test statistic. Because both samples are larger than 30, we can use z as the test statistic.
Step 4
State the decision rule. The null hypothesis is rejected if z is greater than 2.33 or p < .01.
Example 1 continued
Step 5: Compute the value of z and make a decision.
$z = \frac{\$38,000 - \$35,000}{\sqrt{\frac{(\$6,000)^2}{40} + \frac{(\$7,000)^2}{35}}} = 1.98$
The p(z > 1.98) is .0239 for a one-tailed test of significance.
Because the computed z of 1.98 < critical z of 2.33, and the p-value of .0239 > $\alpha$ of .01, the decision is to not reject the null hypothesis. We cannot conclude that the mean household income in Bradford is larger.
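The one-tailed p-value quoted for this example can be approximated without a z table. The sketch below uses the identity P(Z > z) = 0.5 · erfc(z / √2) for a standard normal variable; the function name is an illustrative assumption. Because the code keeps the unrounded z, its p-value differs from the table lookup at z = 1.98 in the fourth decimal place.

```python
import math

def upper_tail_p_value(z):
    """P(Z > z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Bradford (n = 40) vs. Kane (n = 35) household incomes
z = (38000 - 35000) / math.sqrt(6000**2 / 40 + 7000**2 / 35)
p_value = upper_tail_p_value(z)
print(f"z = {z:.2f}, one-tailed p-value = {p_value:.4f}, reject at .01: {p_value < 0.01}")
```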
Two Sample Tests of Proportions investigate whether two samples came from populations with an equal proportion of successes.

The two samples are pooled using the following formula.
$p_c = \frac{X_1 + X_2}{n_1 + n_2}$
where X1 and X2 refer to the number of successes in the respective samples of n1 and n2.

The value of the test statistic is computed from the following formula.
$z = \frac{p_1 - p_2}{\sqrt{\frac{p_c (1 - p_c)}{n_1} + \frac{p_c (1 - p_c)}{n_2}}}$
where p1 = X1/n1 and p2 = X2/n2 are the sample proportions.
Example 2
Are unmarried workers more likely to be absent from work than married workers? A sample of 250 married workers showed 22 missed more than 5 days last year, while a sample of 300 unmarried workers showed 35 missed more than five days. Use a .05 significance level.
The null and the alternate hypotheses
H0: $p_U \le p_M$    H1: $p_U > p_M$

The null hypothesis is rejected if the computed value of z is greater than 1.65 or the p-value < .05.

The pooled proportion
$p_c = \frac{35 + 22}{300 + 250} = .1036$
Example 2 continued
$z = \frac{\frac{35}{300} - \frac{22}{250}}{\sqrt{\frac{.1036 (1 - .1036)}{300} + \frac{.1036 (1 - .1036)}{250}}} = 1.10$
The p(z > 1.10) = .136 for a one-tailed test of significance.
Because a calculated z of 1.10 < a critical z of 1.65, and the p-value of .136 > $\alpha$ of .05, the null hypothesis is not rejected. We cannot conclude that a higher proportion of unmarried workers miss more days in a year than the married workers.
Small Sample Tests of Means
The t distribution is used as the test statistic if one or more of the samples have fewer than 30 observations.

The required assumptions
1. Both populations must follow the normal distribution.
2. The populations must have equal standard deviations.
3. The samples are from independent populations.
Finding the value of the test statistic requires two steps.

Step One: Pool the sample standard deviations.
$s_p^2 = \frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}$

Step Two: Determine the value of t from the following formula.
$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
Small sample test of means continued
Example 3
A recent EPA study compared the highway fuel economy of domestic and imported passenger cars. A sample of 15 domestic cars revealed a mean of 33.7 mpg with a standard deviation of 2.4 mpg. A sample of 12 imported cars revealed a mean of 35.7 mpg with a standard deviation of 3.9. At the .05 significance level can the EPA conclude that the mpg is higher on the imported cars?
Example 3 continued
Step 1
State the null and alternate hypotheses.
H0: $\mu_D \ge \mu_I$
H1: $\mu_D < \mu_I$
Step 2
State the level of significance. The .05 significance level is stated in the problem.
Step 3
Find the appropriate test statistic. Both samples are less than 30, so we use the t distribution.
Step 4
The decision rule is to reject H0 if t < -1.708 or if p-value < .05. There are n1 + n2 - 2 = 25 degrees of freedom.
Step 5
We compute the pooled variance.
$s_p^2 = \frac{(n_1 - 1)(s_1^2) + (n_2 - 1)(s_2^2)}{n_1 + n_2 - 2} = \frac{(15 - 1)(2.4)^2 + (12 - 1)(3.9)^2}{15 + 12 - 2} = 9.918$
We compute the value of t as follows.
$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{33.7 - 35.7}{\sqrt{9.918 \left(\frac{1}{15} + \frac{1}{12}\right)}} = -1.640$
P(t < -1.64) = .0567 for a one-tailed t-test.
Since a computed t of –1.64 > critical t of –1.71, and the p-value of .0567 > $\alpha$ of .05, H0 is not rejected. There is insufficient sample evidence to claim a higher mpg on the imported cars.
Independent samples are samples that are not related in any way.
Dependent samples are samples that are paired or related in some fashion.

• If you wished to buy a car you would look at the same car at two (or more) different dealerships and compare the prices.
• If you wished to measure the effectiveness of a new diet you would weigh the dieters at the start and at the finish of the program.
Hypothesis Testing Involving Paired Observations
Use the following test when the samples are dependent:
$t = \frac{\bar{d}}{s_d / \sqrt{n}}$
where $\bar{d}$ is the mean of the differences
$s_d$ is the standard deviation of the differences
n is the number of pairs (differences)
EXAMPLE 4
An independent testing agency is comparing the daily rental cost for renting a compact car from Hertz and Avis. A random sample of eight cities revealed the following information. At the .05 significance level can the testing agency conclude that there is a difference in the rental charged?
City          Hertz ($)   Avis ($)
Atlanta       42          40
Chicago       56          52
Cleveland     45          43
Denver        48          48
Honolulu      37          32
Kansas City   45          48
Miami         41          39
Seattle       46          50
Step 1
H0: $\mu_d = 0$
H1: $\mu_d \ne 0$
Step 2
The stated significance level is .05.
Step 3
The appropriate test statistic is the paired t-test.
Step 4
H0 is rejected if t < -2.365 or t > 2.365; or if p-value < .05. We use the t distribution with n-1 or 7 degrees of freedom.
Step 5
Perform the calculations and make a decision.
Example 4 continued
City          Hertz   Avis   d    d²
Atlanta       42      40     2    4
Chicago       56      52     4    16
Cleveland     45      43     2    4
Denver        48      48     0    0
Honolulu      37      32     5    25
Kansas City   45      48     -3   9
Miami         41      39     2    4
Seattle       46      50     -4   16
Total                        8    78

Example 4 continued
$\bar{d} = \frac{\sum d}{n} = \frac{8.0}{8} = 1.00$
$s_d = \sqrt{\frac{\sum d^2 - \frac{(\sum d)^2}{n}}{n - 1}} = \sqrt{\frac{78 - \frac{8^2}{8}}{8 - 1}} = 3.1623$
$t = \frac{\bar{d}}{s_d / \sqrt{n}} = \frac{1.00}{3.1623 / \sqrt{8}} = 0.894$
Example 4 continued
P(t > .894) = .20 for a one-tailed t-test at 7 degrees of freedom, so the two-tailed p-value is about .40.

Because 0.894 is less than the critical value of 2.365, and the p-value exceeds $\alpha$ of .05, do not reject the null hypothesis. There is insufficient evidence of a difference in the mean amount charged by Hertz and Avis.
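The computational (sums) form of the standard deviation used in Example 4 can be checked with a short Python sketch. The function name is made up for illustration; the data are the eight city prices from the example.

```python
import math

def paired_t_from_sums(first, second):
    """Paired t statistic using the shortcut formula based on sum(d) and sum(d^2)."""
    d = [x1 - x2 for x1, x2 in zip(first, second)]
    n = len(d)
    sum_d = sum(d)
    sum_d2 = sum(v * v for v in d)
    d_bar = sum_d / n
    s_d = math.sqrt((sum_d2 - sum_d**2 / n) / (n - 1))
    t = d_bar / (s_d / math.sqrt(n))
    return sum_d, sum_d2, d_bar, s_d, t

hertz = [42, 56, 45, 48, 37, 45, 41, 46]
avis = [40, 52, 43, 48, 32, 48, 39, 50]
sum_d, sum_d2, d_bar, s_d, t = paired_t_from_sums(hertz, avis)
print(f"sum d = {sum_d}, sum d^2 = {sum_d2}, s_d = {s_d:.4f}, t = {t:.3f}")
```

The shortcut form gives the same standard deviation as the definitional formula based on squared deviations from the mean; it is simply easier to compute by hand from a table of d and d² columns.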
Conclusion
1. Distinguished Independent and Related
Populations
2. Solved Inference Problems for Two
Populations
• Mean
• Proportion
• Variance
3. Determined Sample Size
Exercise
1. In order to compare the means of two populations, independent random
samples of 144 observations are selected from each population with the
following results.
Sample 1 Sample 2
x1 = 7,123 x2 = 6,957
s1 = 175 s2 = 225
Use a 95% confidence level to test the hypothesis of a difference between the
population means (µ1 - µ2).
2. A sample of 200 Lion Store charge customers 50 years or older showed
that 20 did not pay their entire balance at the end of the month. A sample
of 300 customers under 30 showed that 50 did not pay their entire balance
at the end of the month. At the .02 significance level can we conclude that
a larger percent of the younger customers do not pay their entire balance
at the end of the month?
3. The mean high temperature for 12 days in July in Detroit, Michigan was 88
degrees with a standard deviation of 4 degrees. The mean high
temperature in Hilton Head, South Carolina for 8 July days was 91 degrees
with a standard deviation of 3 degrees. At the .05 significance level, can
we conclude that it is warmer in Hilton Head?
4. An egg farmer wanted to determine if increasing the time the lights were
on in his hen house would increase egg production. For a sample of eight
chickens he determined their production before and after increasing the
amount of time the lights were on. The data are reported below. At the .01
significance level, has there been an increase in production?
Hen 1 2 3 4 5 6 7 8
Before 10 8 5 2 3 7 3 3
After 7 5 6 8 8 8 10 2
5. One indication of how strongly the real estate market is performing is the
proportion of properties that sell in less than 30 days after being listed.
Of the condominiums in a Florida beach community that sold in the first six
months of 2006, 75 of the 115 sampled had been on the market less than 30
days. For the first six months of 2007, 25 of the 85 sampled had been on
the market less than 30 days.
Test the hypothesis that the proportion of condominiums that sold within
30 days decreased from 2006 to 2007. Use α = .01.
Next week

Chi Square Distribution
