Inferences Based On Two Samples
9.1
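1. $X_1, \ldots, X_m$ is a random sample from a population with mean $\mu_1$ and variance $\sigma_1^2$ (m: size of sample 1).
2. $Y_1, \ldots, Y_n$ is a random sample from a population with mean $\mu_2$ and variance $\sigma_2^2$ (n: size of sample 2).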
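Null hypothesis: $H_0: \mu_1 - \mu_2 = \Delta_0$ (the same $\Delta_0$ appears in the test statistic)
Test statistic value: $z = \dfrac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{m} + \dfrac{\sigma_2^2}{n}}}$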
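As a minimal sketch of the z statistic above, the following Python computes z from summary values and applies the usual standard normal rejection regions. The function names `two_sample_z` and `reject_h0` and all numbers are hypothetical, chosen only for illustration; the population standard deviations are assumed known.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar, ybar, sigma1, sigma2, m, n, delta0=0.0):
    """z statistic for H0: mu1 - mu2 = delta0 with known population SDs."""
    standard_error = sqrt(sigma1**2 / m + sigma2**2 / n)
    return (xbar - ybar - delta0) / standard_error


def reject_h0(z, alpha=0.05, alternative="two-sided"):
    """Apply the usual level-alpha rejection region to a computed z."""
    std_normal = NormalDist()
    if alternative == "greater":
        return z >= std_normal.inv_cdf(1 - alpha)
    if alternative == "less":
        return z <= -std_normal.inv_cdf(1 - alpha)
    if alternative == "two-sided":
        return abs(z) >= std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


# Made-up summary values, for illustration only.
z = two_sample_z(xbar=29.8, ybar=34.7, sigma1=4.0, sigma2=5.0, m=20, n=25)
print(round(z, 3), reject_h0(z, alpha=0.01))
```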
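$\beta(\Delta') = P(\text{Type II error when } \mu_1 - \mu_2 = \Delta')$

Alt. Hypothesis                        $\beta(\Delta')$
$H_a: \mu_1 - \mu_2 > \Delta_0$        $\Phi\left(z_\alpha - \dfrac{\Delta' - \Delta_0}{\sigma}\right)$
$H_a: \mu_1 - \mu_2 < \Delta_0$        $1 - \Phi\left(-z_\alpha - \dfrac{\Delta' - \Delta_0}{\sigma}\right)$
$H_a: \mu_1 - \mu_2 \ne \Delta_0$      $\Phi\left(z_{\alpha/2} - \dfrac{\Delta' - \Delta_0}{\sigma}\right) - \Phi\left(-z_{\alpha/2} - \dfrac{\Delta' - \Delta_0}{\sigma}\right)$

Here $\sigma = \sigma_{\bar{X} - \bar{Y}} = \sqrt{\sigma_1^2/m + \sigma_2^2/n}$. (Similar to the formulas on p. 330.)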
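The three Type II error formulas can be evaluated directly. The sketch below assumes $\sigma$ is the standard deviation of $\bar{X} - \bar{Y}$, that is $\sqrt{\sigma_1^2/m + \sigma_2^2/n}$; the function name `beta` and its arguments are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def beta(delta_prime, delta0, sigma1, sigma2, m, n,
         alpha=0.05, alternative="greater"):
    """P(Type II error) when the true difference mu1 - mu2 is delta_prime."""
    sigma = sqrt(sigma1**2 / m + sigma2**2 / n)  # SD of Xbar - Ybar
    shift = (delta_prime - delta0) / sigma
    if alternative == "greater":
        z_a = STD_NORMAL.inv_cdf(1 - alpha)
        return STD_NORMAL.cdf(z_a - shift)
    if alternative == "less":
        z_a = STD_NORMAL.inv_cdf(1 - alpha)
        return 1 - STD_NORMAL.cdf(-z_a - shift)
    if alternative == "two-sided":
        z_a2 = STD_NORMAL.inv_cdf(1 - alpha / 2)
        return STD_NORMAL.cdf(z_a2 - shift) - STD_NORMAL.cdf(-z_a2 - shift)
    raise ValueError("alternative must be 'greater', 'less' or 'two-sided'")


# Illustrative values only: beta shrinks as the true difference moves away
# from delta0 in the direction of the alternative.
for true_diff in (0.0, 1.0, 2.0, 3.0):
    print(true_diff, round(beta(true_diff, 0.0, 4.0, 5.0, 20, 25), 4))
```

A quick sanity check on any such implementation: when $\Delta' = \Delta_0$, every one of the three expressions reduces to $1 - \alpha$.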
Large-Sample Tests
The Two-Sample t Test and Confidence Interval
Assumptions
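Degrees of freedom (round $\nu$ down to the nearest integer): $\nu = \dfrac{\left(\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}\right)^2}{\dfrac{(s_1^2/m)^2}{m-1} + \dfrac{(s_2^2/n)^2}{n-1}}$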
Null hypothesis: $H_0: \mu_1 - \mu_2 = \Delta_0$ (usually zero)
Test statistic value: $t = \dfrac{\bar{x} - \bar{y} - \Delta_0}{\sqrt{\dfrac{s_1^2}{m} + \dfrac{s_2^2}{n}}}$
The Two-Sample t Test
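Alternative Hypothesis                 Rejection Region for Approx. Level $\alpha$ Test
$H_a: \mu_1 - \mu_2 > \Delta_0$        $t \ge t_{\alpha,\nu}$
$H_a: \mu_1 - \mu_2 < \Delta_0$        $t \le -t_{\alpha,\nu}$
$H_a: \mu_1 - \mu_2 \ne \Delta_0$      $t \ge t_{\alpha/2,\nu}$ or $t \le -t_{\alpha/2,\nu}$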
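A small sketch of the two-sample t computation follows. It assumes $\nu$ is obtained from the usual unequal-variance formula $(s_1^2/m + s_2^2/n)^2 / [(s_1^2/m)^2/(m-1) + (s_2^2/n)^2/(n-1)]$, rounded down; the function name `two_sample_t` and the data are made up. The critical value $t_{\alpha,\nu}$ would still be read from a t table.

```python
from math import floor, sqrt
from statistics import mean, variance


def two_sample_t(x, y, delta0=0.0):
    """Return (t, nu) for H0: mu1 - mu2 = delta0 without pooling variances."""
    m, n = len(x), len(y)
    v1 = variance(x) / m  # s1^2 / m
    v2 = variance(y) / n  # s2^2 / n
    t = (mean(x) - mean(y) - delta0) / sqrt(v1 + v2)
    nu = (v1 + v2) ** 2 / (v1**2 / (m - 1) + v2**2 / (n - 1))
    return t, floor(nu)  # degrees of freedom rounded down


# Made-up samples, for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
t, nu = two_sample_t(x, y)
print(round(t, 4), nu)
```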
Usage in formulas:
$\dfrac{S_1^2}{m} + \dfrac{S_2^2}{n}$ becomes $\dfrac{S_p^2}{m} + \dfrac{S_p^2}{n}$ or $S_p^2\left(\dfrac{1}{m} + \dfrac{1}{n}\right)$
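To illustrate the substitution, here is a hedged sketch of a pooled t statistic. The slide does not define $S_p^2$; the code assumes the standard pooled estimator $S_p^2 = [(m-1)S_1^2 + (n-1)S_2^2]/(m+n-2)$ with $m+n-2$ degrees of freedom, which is only appropriate when the two population variances can be taken as equal. The name `pooled_t` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(x, y, delta0=0.0):
    """Return (t, sp2, df) using a pooled variance estimate (assumed form)."""
    m, n = len(x), len(y)
    # Weighted average of the two sample variances (assumed definition).
    sp2 = ((m - 1) * variance(x) + (n - 1) * variance(y)) / (m + n - 2)
    # S1^2/m + S2^2/n is replaced by Sp^2 * (1/m + 1/n).
    t = (mean(x) - mean(y) - delta0) / sqrt(sp2 * (1 / m + 1 / n))
    return t, sp2, m + n - 2


# Made-up samples, for illustration only.
t_pooled, sp2, df = pooled_t([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(round(t_pooled, 4), sp2, df)
```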
9.3 Analysis of Paired Data
Paired Data (Assumptions)
Important: A natural pairing must exist!
The data consists of n independently
selected pairs $(X_1, Y_1), \ldots, (X_n, Y_n)$, with
$E(X_i) = \mu_1$ and $E(Y_i) = \mu_2$.
Let $D_1 = X_1 - Y_1, \ldots, D_n = X_n - Y_n$.
The $D_i$'s are assumed to be normally
distributed with mean value $\mu_D$ and
variance $\sigma_D^2$.
Bottom line: Two-sample problem becomes a one-sample problem!
The Paired t Test
Null hypothesis: $H_0: \mu_D = \Delta_0$ (usually zero)
Test statistic value: $t = \dfrac{\bar{d} - \Delta_0}{s_D / \sqrt{n}}$
$\bar{d}$ and $s_D$ are the sample mean
and standard deviation of the $d_i$'s.
The Paired t Test (nothing new here!)
Alternative Hypothesis        Rejection Region for Level $\alpha$ Test
$H_a: \mu_D > \Delta_0$       $t \ge t_{\alpha,n-1}$
$H_a: \mu_D < \Delta_0$       $t \le -t_{\alpha,n-1}$
$H_a: \mu_D \ne \Delta_0$     $t \ge t_{\alpha/2,n-1}$ or $t \le -t_{\alpha/2,n-1}$
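The paired test is just a one-sample t test on the differences, which the following sketch makes explicit. The function name `paired_t` and the data are made up; the critical value 2.353 is the tabled $t_{0.05,3}$ and is passed in by hand rather than computed.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y, delta0=0.0):
    """Return (t, df) for H0: mu_D = delta0 using differences d_i = x_i - y_i."""
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    t = (mean(d) - delta0) / (stdev(d) / sqrt(n))
    return t, n - 1


# Made-up paired measurements, for illustration only.
first = [10, 12, 9, 11]
second = [8, 11, 9, 8]
t_stat, df = paired_t(first, second)

T_CRIT = 2.353  # tabled t_{0.05, 3}, upper-tailed test at level 0.05
print(round(t_stat, 4), df, t_stat >= T_CRIT)
```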
Confidence Interval for $\mu_D$ (nothing new here!)
The paired t CI for $\mu_D$ is
$\bar{d} \pm t_{\alpha/2,n-1} \cdot s_D / \sqrt{n}$
One-sided confidence bounds can be found by keeping the relevant sign and
replacing $t_{\alpha/2}$ by $t_\alpha$.
For large samples, you could use the z test and CI instead.
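A short sketch of the paired t interval, assuming the critical value is looked up in a t table and supplied by the caller (3.182 is the tabled $t_{0.025,3}$). The name `paired_t_interval` and the data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_interval(x, y, t_crit):
    """Two-sided CI for mu_D: dbar +/- t_crit * s_D / sqrt(n)."""
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    half_width = t_crit * stdev(d) / sqrt(n)
    return mean(d) - half_width, mean(d) + half_width


# Made-up paired measurements, for illustration only.
low, high = paired_t_interval([10, 12, 9, 11], [8, 11, 9, 8], t_crit=3.182)
print(round(low, 3), round(high, 3))
```

For a one-sided bound, the same function could be called with $t_\alpha$ in place of $t_{\alpha/2}$, keeping only the endpoint of interest.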
Paired Data and Two-Sample t
$V(\bar{X} - \bar{Y}) = V(\bar{D}) = V\left(\dfrac{1}{n}\sum_{i=1}^{n} D_i\right) = \dfrac{V(D_i)}{n} = \dfrac{\sigma_1^2 + \sigma_2^2 - 2\rho\sigma_1\sigma_2}{n}$
Remember: Smaller variance means better estimates
Independence between X and Y: $\rho = 0$
Positive dependence: $\rho > 0$
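To see how much positive correlation within pairs can reduce the variance, the formula can be evaluated for a few values of $\rho$. This is only an illustrative sketch; the function name and the numbers are made up.

```python
def var_of_mean_difference(sigma1, sigma2, rho, n):
    """V(Dbar) = (sigma1^2 + sigma2^2 - 2*rho*sigma1*sigma2) / n."""
    return (sigma1**2 + sigma2**2 - 2 * rho * sigma1 * sigma2) / n


# Illustrative values: equal SDs of 2 and n = 10 pairs.
independent = var_of_mean_difference(2.0, 2.0, rho=0.0, n=10)
correlated = var_of_mean_difference(2.0, 2.0, rho=0.9, n=10)
print(independent, correlated)
```

With these numbers, a within-pair correlation of 0.9 cuts the variance of the estimator to about a tenth of its value under independence.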
Pros and Cons of Pairing
1. For great heterogeneity and large correlation
within experimental units, the loss in degrees
of freedom will be compensated for by an
increased precision associated with pairing
(use pairing). Usually, we're in case 1;
use pairing if possible.
2. If the units are relatively homogeneous and
the correlation within pairs is not large, the
gain in precision due to pairing will be
outweighed by the decrease in degrees of
freedom (use independent samples).
9.4 Inferences Concerning a Difference Between Population Proportions
Difference Between Population Proportions
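Let $X \sim \mathrm{Bin}(m, p_1)$ and $Y \sim \mathrm{Bin}(n, p_2)$ with
X and Y independent variables. Then
$\hat{p}_1 - \hat{p}_2$ is an estimator of $p_1 - p_2$.
Note: $\hat{p}_1 = X/m$ and $\hat{p}_2 = Y/n$
$E(\hat{p}_1 - \hat{p}_2) = p_1 - p_2$
$V(\hat{p}_1 - \hat{p}_2) = \dfrac{p_1 q_1}{m} + \dfrac{p_2 q_2}{n}$ $(q_i = 1 - p_i)$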
$m\hat{p}_1 \ge 10$ and $m\hat{q}_1 \ge 10$ and $n\hat{p}_2 \ge 10$ and $n\hat{q}_2 \ge 10$
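A minimal sketch of the estimator, its estimated variance, and the sample-size check follows. It assumes the four conditions are applied to the sample proportions, so that $m\hat{p}_1$ and $m\hat{q}_1$ are simply the success and failure counts. The function name and the counts are hypothetical.

```python
def diff_in_proportions(x, m, y, n):
    """Return (p1_hat - p2_hat, estimated variance, large-sample check)."""
    p1_hat, p2_hat = x / m, y / n
    # Variances of independent estimators add, even for a difference.
    var_hat = p1_hat * (1 - p1_hat) / m + p2_hat * (1 - p2_hat) / n
    # m*p1_hat = x, m*q1_hat = m - x, n*p2_hat = y, n*q2_hat = n - y.
    large_enough = min(x, m - x, y, n - y) >= 10
    return p1_hat - p2_hat, var_hat, large_enough


# Made-up counts, for illustration only.
diff, var_hat, ok = diff_in_proportions(x=40, m=100, y=30, n=100)
print(round(diff, 3), round(var_hat, 5), ok)
```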
Large-Samples
Null hypothesis: $H_0: p_1 - p_2 = 0$
$\hat{p} = \dfrac{m}{m+n}\hat{p}_1 + \dfrac{n}{m+n}\hat{p}_2$
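The pooled estimate is used in the test statistic, which does not appear on the slide. The sketch below assumes the standard large-sample form $z = (\hat{p}_1 - \hat{p}_2)/\sqrt{\hat{p}\hat{q}(1/m + 1/n)}$ with $\hat{q} = 1 - \hat{p}$; the function name and the counts are made up.

```python
from math import sqrt


def two_proportion_z(x, m, y, n):
    """z statistic for H0: p1 - p2 = 0 using the pooled proportion."""
    p1_hat, p2_hat = x / m, y / n
    # (x + y)/(m + n) equals m/(m+n)*p1_hat + n/(m+n)*p2_hat.
    p_pooled = (x + y) / (m + n)
    se_under_h0 = sqrt(p_pooled * (1 - p_pooled) * (1 / m + 1 / n))
    return (p1_hat - p2_hat) / se_under_h0


# Made-up counts, for illustration only.
z_prop = two_proportion_z(x=40, m=100, y=30, n=100)
print(round(z_prop, 4))
```

The resulting z would be compared with standard normal critical values, exactly as in the z test for means.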
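Large-sample CI for $p_1 - p_2$:
$\hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2}\sqrt{\dfrac{\hat{p}_1\hat{q}_1}{m} + \dfrac{\hat{p}_2\hat{q}_2}{n}}$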
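The interval can be sketched in a few lines. Note that, unlike the test, it uses the unpooled standard error, since no assumption $p_1 = p_2$ is made. The function name and the counts are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ci(x, m, y, n, confidence=0.95):
    """Large-sample CI for p1 - p2 with the unpooled standard error."""
    p1_hat, p2_hat = x / m, y / n
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # z_{alpha/2}
    se = sqrt(p1_hat * (1 - p1_hat) / m + p2_hat * (1 - p2_hat) / n)
    diff = p1_hat - p2_hat
    return diff - z_crit * se, diff + z_crit * se


# Made-up counts, for illustration only.
ci_low, ci_high = two_proportion_ci(x=40, m=100, y=30, n=100)
print(round(ci_low, 4), round(ci_high, 4))
```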