Lecture 7 (Two Sample Tests)

Two-Sample Tests
Learning Outcomes
In this session, you learn:

• How to use hypothesis testing to compare the difference between:
  • The means of two independent populations
  • The means of two related populations
Two-Sample Tests
• Population means, independent samples. Example: Group 1 vs. Group 2
• Population means, related samples. Example: same group before vs. after treatment
Difference Between Two Means
Population means, independent samples
Goal: Test a hypothesis or form a confidence interval for the difference between two population means, μ1 – μ2
The point estimate for the difference is $\bar{X}_1 - \bar{X}_2$
Two cases:
• σ1 and σ2 unknown, assumed equal
• σ1 and σ2 unknown, not assumed equal
Difference Between Two Means: Independent Samples
• Different data sources
Population means, • Unrelated
independent
samples
* • Independent
• Sample selected from one population
has no effect on the sample selected
from the other population
Use Sp to estimate unknown

σ1 and σ2 unknown, σ. Use a Pooled-Variance t
assumed equal test.
σ1 and σ2 unknown, Use S1 and S2 to estimate

not assumed equal unknown σ1 and σ2. Use a
Separate-variance t test
Hypothesis Tests for Two Population Means
Two Population Means, Independent Samples
• Lower-tail test: H0: μ1 ≥ μ2, H1: μ1 < μ2, i.e., H0: μ1 – μ2 ≥ 0, H1: μ1 – μ2 < 0
• Upper-tail test: H0: μ1 ≤ μ2, H1: μ1 > μ2, i.e., H0: μ1 – μ2 ≤ 0, H1: μ1 – μ2 > 0
• Two-tail test: H0: μ1 = μ2, H1: μ1 ≠ μ2, i.e., H0: μ1 – μ2 = 0, H1: μ1 – μ2 ≠ 0
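Hypothesis tests for μ1 – μ2
Two Population Means, Independent Samples
• Lower-tail test (H0: μ1 – μ2 ≥ 0, H1: μ1 – μ2 < 0): rejection region of area α in the lower tail. Reject H0 if tSTAT < -tα
• Upper-tail test (H0: μ1 – μ2 ≤ 0, H1: μ1 – μ2 > 0): rejection region of area α in the upper tail. Reject H0 if tSTAT > tα
• Two-tail test (H0: μ1 – μ2 = 0, H1: μ1 – μ2 ≠ 0): rejection regions of area α/2 in each tail. Reject H0 if tSTAT < -tα/2 or tSTAT > tα/2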
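The three rejection rules can be written as one small function. This is a minimal sketch, not part of the original slides: the function name `reject_h0` and the `tail` labels are illustrative choices, and the caller is assumed to supply the positive critical value from a t table for the appropriate degrees of freedom.

```python
def reject_h0(t_stat: float, t_crit: float, tail: str) -> bool:
    """Apply the critical-value rejection rule for a t test.

    t_crit is the positive table value: t_alpha for a one-tail test,
    t_{alpha/2} for a two-tail test.
    """
    if t_crit <= 0:
        raise ValueError("t_crit must be the positive critical value")
    if tail == "lower":
        return t_stat < -t_crit
    if tail == "upper":
        return t_stat > t_crit
    if tail == "two":
        return t_stat < -t_crit or t_stat > t_crit
    raise ValueError("tail must be 'lower', 'upper', or 'two'")


# Made-up values for illustration only: t_stat = -2.3, critical value 2.0
print(reject_h0(-2.3, 2.0, "lower"))  # True
print(reject_h0(-2.3, 2.0, "upper"))  # False
print(reject_h0(-2.3, 2.0, "two"))    # True
```

The same statistic can lead to different decisions depending on which alternative hypothesis was stated before looking at the data.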
Hypothesis tests for μ1 - μ2 with σ1 and σ2 unknown and assumed equal
Population means, independent samples
Assumptions:
▪ Samples are randomly and independently drawn
▪ Populations are normally distributed or both sample sizes are at least 30
▪ Population variances are unknown but assumed equal
Hypothesis tests for μ1 - μ2 with σ1 and σ2 unknown and assumed equal (continued)
Population means, independent samples
• The pooled variance is:
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)}$$
• The test statistic is:
$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$
• Where tSTAT has d.f. = (n1 + n2 – 2)
Confidence interval for μ1 - μ2 with σ1 and σ2 unknown and assumed equal
Population means, independent samples
The confidence interval for μ1 – μ2 is:
$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$$
Where tα/2 has d.f. = n1 + n2 – 2
Pooled-Variance t Test Example
You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE and NASDAQ? You collect the following data:

                  NYSE    NASDAQ
Number            21      25
Sample mean       3.27    2.53
Sample std dev    1.30    1.16

Assuming both populations are approximately normal with equal variances, is there a difference in mean yield (α = 0.05)?
Pooled-Variance t Test Example: Calculating the Test Statistic (continued)
H0: μ1 - μ2 = 0 i.e. (μ1 = μ2)
H1: μ1 - μ2 ≠ 0 i.e. (μ1 ≠ μ2)
The test statistic is:
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \frac{(3.27 - 2.53) - 0}{\sqrt{1.5021\left(\dfrac{1}{21} + \dfrac{1}{25}\right)}} = 2.040$$
where the pooled variance is:
$$S_p^2 = \frac{(n_1 - 1)S_1^2 + (n_2 - 1)S_2^2}{(n_1 - 1) + (n_2 - 1)} = \frac{(21 - 1)1.30^2 + (25 - 1)1.16^2}{(21 - 1) + (25 - 1)} = 1.5021$$
Pooled-Variance t Test Example: Hypothesis Test Solution
H0: μ1 - μ2 = 0 i.e. (μ1 = μ2)
H1: μ1 - μ2 ≠ 0 i.e. (μ1 ≠ μ2)
α = 0.05
df = 21 + 25 - 2 = 44
Critical Values: t = ± 2.0154 (rejection regions of area 0.025 in each tail: reject H0 if t < -2.0154 or t > 2.0154)
Test Statistic:
$$t = \frac{3.27 - 2.53}{\sqrt{1.5021\left(\dfrac{1}{21} + \dfrac{1}{25}\right)}} = 2.040$$
Decision: Reject H0 at α = 0.05, since 2.040 > 2.0154
Conclusion: There is evidence of a difference in means.
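The pooled-variance calculation and decision above can be checked with a few lines of Python. This is a sketch under the example's stated assumptions (normal populations, equal variances). The function name `pooled_t` is illustrative, and the critical value 2.0154 is taken from the slide rather than computed.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """Return (pooled variance, t statistic, degrees of freedom)."""
    df = (n1 - 1) + (n2 - 1)
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    std_error = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    t_stat = ((mean1 - mean2) - hypothesized_diff) / std_error
    return sp2, t_stat, df


# Summary statistics from the example (sample 1 = NYSE, sample 2 = NASDAQ)
sp2, t_stat, df = pooled_t(3.27, 1.30, 21, 2.53, 1.16, 25)

t_crit = 2.0154  # two-tail critical value for alpha = 0.05, df = 44 (from the slide)
reject = abs(t_stat) > t_crit
print(df, round(sp2, 4), round(t_stat, 3), reject)  # 44 1.5021 2.04 True
```

If SciPy is available, `scipy.stats.ttest_ind_from_stats` with `equal_var=True` should give the same statistic together with a p-value, which is a convenient cross-check.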
Pooled-Variance t Test Example: Confidence Interval for μ1 - μ2
Since we rejected H0, can we be 95% confident that μNYSE > μNASDAQ?
95% Confidence Interval for μNYSE - μNASDAQ:
$$(\bar{X}_1 - \bar{X}_2) \pm t_{\alpha/2}\sqrt{S_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = 0.74 \pm 2.0154 \times 0.3628 = (0.009,\ 1.471)$$
Since the entire interval lies above 0, we can be 95% confident that μNYSE > μNASDAQ.
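The interval can be reproduced from the summary statistics. The sketch below assumes the same equal-variance setting as the example. `pooled_ci` is an illustrative name, and the critical value is passed in from a t table rather than computed.

```python
import math


def pooled_ci(mean1, sd1, n1, mean2, sd2, n2, t_crit):
    """Confidence interval for mu1 - mu2 under the equal-variance assumption.

    t_crit is t_{alpha/2} with n1 + n2 - 2 degrees of freedom,
    looked up by the caller.
    """
    sp2 = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    margin = t_crit * math.sqrt(sp2 * (1 / n1 + 1 / n2))
    diff = mean1 - mean2
    return diff - margin, diff + margin


# NYSE vs. NASDAQ figures and the 95% critical value from the example
lower, upper = pooled_ci(3.27, 1.30, 21, 2.53, 1.16, 25, t_crit=2.0154)
print(round(lower, 3), round(upper, 3))  # 0.009 1.471
print(lower > 0)  # True
```

The lower endpoint is only just above 0, which matches the test statistic (2.040) being only just beyond the critical value (2.0154).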
Hypothesis tests for μ1 - μ2 with σ1 and σ2 unknown, not assumed equal
Population means, independent samples
Assumptions:
▪ Samples are randomly and independently drawn
▪ Populations are normally distributed or both sample sizes are at least 30
▪ Population variances are unknown and cannot be assumed to be equal
Hypothesis tests for μ1 - μ2 with σ1 and σ2 unknown and not assumed equal (continued)
Population means, independent samples
The test statistic is:
$$t_{STAT} = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}}$$
tSTAT has d.f. ν, where:
$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(S_2^2/n_2\right)^2}{n_2 - 1}}$$
Separate-Variance t Test Example
You are a financial analyst for a brokerage firm. Is there a difference in dividend yield between stocks listed on the NYSE and NASDAQ? You collect the following data:

                  NYSE    NASDAQ
Number            21      25
Sample mean       3.27    2.53
Sample std dev    1.30    1.16

Assuming both populations are approximately normal with unequal variances, is there a difference in mean yield (α = 0.05)?
Separate-Variance t Test Example: Calculating the Test Statistic (continued)
H0: μ1 - μ2 = 0 i.e. (μ1 = μ2)
H1: μ1 - μ2 ≠ 0 i.e. (μ1 ≠ μ2)
The test statistic is:
$$t = \frac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}}} = \frac{(3.27 - 2.53) - 0}{\sqrt{\dfrac{1.30^2}{21} + \dfrac{1.16^2}{25}}} = 2.019$$
The degrees of freedom are:
$$\nu = \frac{\left(\dfrac{S_1^2}{n_1} + \dfrac{S_2^2}{n_2}\right)^2}{\dfrac{\left(S_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(S_2^2/n_2\right)^2}{n_2 - 1}} = \frac{\left(\dfrac{1.30^2}{21} + \dfrac{1.16^2}{25}\right)^2}{\dfrac{\left(1.30^2/21\right)^2}{20} + \dfrac{\left(1.16^2/25\right)^2}{24}} = 40.57$$
Use degrees of freedom = 40
Separate-Variance t Test Example: Hypothesis Test Solution
H0: μ1 - μ2 = 0 i.e. (μ1 = μ2)
H1: μ1 - μ2 ≠ 0 i.e. (μ1 ≠ μ2)
α = 0.05
df = 40
Critical Values: t = ± 2.021 (rejection regions of area 0.025 in each tail: reject H0 if t < -2.021 or t > 2.021)
Test Statistic: t = 2.019
Decision: Fail to reject H0 at α = 0.05, since 2.019 < 2.021
Conclusion: There is insufficient evidence of a difference in means.
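The separate-variance statistic and its degrees of freedom can be checked the same way. This is a sketch only: `separate_variance_t` is an illustrative name, the degrees of freedom are rounded down to 40 as on the slide, and the critical value 2.021 is taken from the slide.

```python
import math


def separate_variance_t(mean1, sd1, n1, mean2, sd2, n2, hypothesized_diff=0.0):
    """Return (t statistic, approximate degrees of freedom nu)."""
    v1 = sd1**2 / n1  # S1^2 / n1
    v2 = sd2**2 / n2  # S2^2 / n2
    t_stat = ((mean1 - mean2) - hypothesized_diff) / math.sqrt(v1 + v2)
    nu = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t_stat, nu


# Summary statistics from the example (sample 1 = NYSE, sample 2 = NASDAQ)
t_stat, nu = separate_variance_t(3.27, 1.30, 21, 2.53, 1.16, 25)
df = math.floor(nu)  # the slide rounds 40.57 down to 40 for the table lookup

t_crit = 2.021  # two-tail critical value for alpha = 0.05, df = 40 (from the slide)
reject = abs(t_stat) > t_crit
print(round(t_stat, 3), round(nu, 2), df, reject)  # 2.019 40.57 40 False
```

With the same data, the pooled-variance test rejects H0 while this test does not. The statistic sits very close to the critical value in both cases, so the variance assumption decides the outcome.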
Related Populations: The Paired Difference Test
Tests Means of 2 Related Populations (related samples)
• Paired or matched samples
• Repeated measures (before/after)
• Use difference between paired values: Di = X1i - X2i
• Eliminates variation among subjects
• Assumptions:
  • Both populations are normally distributed
  • Or, if not normal, use large samples
Related Populations: The Paired Difference Test (continued)
The ith paired difference is Di, where Di = X1i - X2i
The point estimate for the paired difference population mean μD is $\bar{D}$:
$$\bar{D} = \frac{\sum_{i=1}^{n} D_i}{n}$$
The sample standard deviation is SD:
$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$
n is the number of pairs in the paired sample
The Paired Difference Test: Finding tSTAT
Paired samples
• The test statistic for μD is:
$$t_{STAT} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}}$$
• Where tSTAT has n - 1 d.f.

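The Paired Difference Test: Possible Hypotheses
Paired Samples
• Lower-tail test (H0: μD ≥ 0, H1: μD < 0): rejection region of area α in the lower tail. Reject H0 if tSTAT < -tα
• Upper-tail test (H0: μD ≤ 0, H1: μD > 0): rejection region of area α in the upper tail. Reject H0 if tSTAT > tα
• Two-tail test (H0: μD = 0, H1: μD ≠ 0): rejection regions of area α/2 in each tail. Reject H0 if tSTAT < -tα/2 or tSTAT > tα/2
Where tSTAT has n - 1 d.f.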
The Paired Difference Confidence Interval
Paired samples
The confidence interval for μD is
$$\bar{D} \pm t_{\alpha/2}\frac{S_D}{\sqrt{n}}$$
where
$$S_D = \sqrt{\frac{\sum_{i=1}^{n} (D_i - \bar{D})^2}{n - 1}}$$
Paired Difference Test: Example
• Assume you send your salespeople to a “customer service” training workshop. Has the training made a difference in the number of complaints? You collect the following data:

Number of Complaints
Salesperson    Before (1)    After (2)    Difference, Di = (2) - (1)
C.B.           6             4            -2
T.F.           20            6            -14
M.H.           3             2            -1
R.K.           0             0            0
M.O.           4             0            -4
Sum                                       -21

$$\bar{D} = \frac{\sum D_i}{n} = \frac{-21}{5} = -4.2$$
$$S_D = \sqrt{\frac{\sum (D_i - \bar{D})^2}{n - 1}} = 5.67$$
Paired Difference Test: Solution
• Has the training made a difference in the number of complaints (at the 0.01 level)?
H0: μD = 0
H1: μD ≠ 0
α = 0.01, $\bar{D}$ = -4.2
d.f. = n - 1 = 4
Critical values: t0.005 = ± 4.604 (rejection regions of area α/2 in each tail: reject H0 if tSTAT < -4.604 or tSTAT > 4.604)
Test Statistic:
$$t_{STAT} = \frac{\bar{D} - \mu_D}{S_D / \sqrt{n}} = \frac{-4.2 - 0}{5.67 / \sqrt{5}} = -1.66$$
Decision: Do not reject H0 (tSTAT is not in the rejection region)
Conclusion: There is not a significant change in the number of complaints.
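The paired test can be reproduced from the raw complaint counts with the standard library. This is a sketch, not the lecture's own code: the variable names are illustrative, the critical value 4.604 is taken from the slide, and the differences are formed as after minus before, matching the "(2) - (1)" column of the example.

```python
import math
import statistics

before = [6, 20, 3, 0, 4]  # complaints before training, column (1)
after = [4, 6, 2, 0, 0]    # complaints after training, column (2)

# Each difference is (2) - (1), i.e. after minus before
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
d_bar = statistics.mean(diffs)
s_d = statistics.stdev(diffs)  # sample standard deviation (n - 1 divisor)
t_stat = (d_bar - 0) / (s_d / math.sqrt(n))

t_crit = 4.604  # t_{0.005} with n - 1 = 4 d.f. (from the slide)
reject = abs(t_stat) > t_crit
print(diffs, d_bar, round(s_d, 2), round(t_stat, 2), reject)
# [-2, -14, -1, 0, -4] -4.2 5.67 -1.66 False
```

Most of the spread in the differences comes from a single salesperson (T.F., -14), which inflates SD and keeps the statistic well inside the non-rejection region.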
The Paired Difference Confidence Interval -- Example
The confidence interval for μD is:
$$\bar{D} \pm t_{\alpha/2}\frac{S_D}{\sqrt{n}}$$
$\bar{D}$ = -4.2, SD = 5.67
99% CI for μD:
$$-4.2 \pm 4.604\,\frac{5.67}{\sqrt{5}} = (-15.87,\ 7.47)$$
Since this interval contains 0, we cannot be 99% confident that μD ≠ 0.
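The 99% interval follows directly from the summary values. The sketch below uses an illustrative function name, `paired_ci`, and plugs in the rounded figures quoted on the slide (SD = 5.67, critical value 4.604).

```python
import math


def paired_ci(d_bar, s_d, n, t_crit):
    """Confidence interval for mu_D: d_bar +/- t_crit * s_d / sqrt(n)."""
    margin = t_crit * s_d / math.sqrt(n)
    return d_bar - margin, d_bar + margin


# Rounded summary values and critical value as given on the slide
lower, upper = paired_ci(d_bar=-4.2, s_d=5.67, n=5, t_crit=4.604)
print(round(lower, 2), round(upper, 2))  # -15.87 7.47
print(lower < 0 < upper)  # True
```

Using the unrounded standard deviation (about 5.6745) instead of 5.67 moves each endpoint by roughly 0.01, which does not change the conclusion.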
Session Summary
In this session we discussed
• Comparing two independent samples
• Performed pooled-variance t test for the difference in two means
• Performed separate-variance t test for difference in two means
• Formed confidence intervals for the difference between two means
• Comparing two related samples (paired samples)
• Performed paired t test for the mean difference
• Formed confidence interval for the mean difference
