Professional Documents
Culture Documents
Stat Anal11
Stat Anal11
Stat Anal11
ii) we can choose the size of the sample taken from each stratum - sampling fractions need not
be the same in each stratum. This gives scope to do an efficient job allocating resources to the
sampling within strata.
11.2 Notation
Let Yij be the value of the variable Y for the j-th member of the i-th stratum in the population.
Let yij be the value of the variable y for the j-th member taken from the i-th stratum from
within the sample.
1
Population Sample
Size N n
Strata sizes Ni ni
Strata weights Wi = Ni /N wi = ni /n
PK PNi PK Pni
Total T = i=1 j=1 Yij t= i=1 j=1 yij
Mean Y = T /N y = t/n
PNi Pni
Strata Totals Ti = j=1 Yij ti = j=1 yij
PNi 2
Pni 2
j=1 (Yij −Y i ) j=1 (yij −y i )
Strata Variances Si2 = Ni −1
s2i = ni −1
Also,
define fi = ni /Ni to be the sampling fraction for the i-th stratum, and,
f = n/N to be the overall sampling fraction.
and
K
X K
X Si2
var(y st ) = Wi2 var(y i ) = Wi2 (1 − fi ) (2)
i=1 i=1
ni
2
Compare this with the overall sample average:
K
X K
0 1X
y = wi y i = ni y i
i=1
n i=1
In general, this will differ from y st unless ni /Ni = n/N , i.e. ni ∝ Ni for i = 1, . . . , K: the
sample sizes are proportional to the stratum sizes, which is proportional allocation (see later);
such an allocation also ensures that y 0 is unbiased for Y .
The design efficiency is determined by comparing var(y st ) with var(y). Indeed, define
Def f = var(y st )/var(y).
If the within-stratum sampling fractions are equal to each other, then the gain in efficiency
depends on how small the {Si2 } are in comparison to S 2 .
Indeed, since, in this case (see later),
K
1 − f X Ni 2
var(y st ) = S
n i=1 N i
then
( K
)
(1 − f ) 2 1 X
var(y) − var(y st ) = S − Ni Si2 .
n N i=1
However,
( K N ) ( K N )
1 XX i
1 XX i
S2 = (Yij − Y )2 = (Yij − Y i + Y i − Y )2
N − 1 i=1 j=1 N − 1 i=1 j=1
X K µ ¶ XK
Ni − 1 2 Ni
= Si + (Y i − Y )2 .
i=1
N −1 i=1
N −1
Thus
K K
(1 − f ) X (1 − f ) X
var(y) − var(y st ) ≈ Ni (Y i − Y )2 = Wi (Y i − Y )2
nN i=1 n i=1
which is strictly positive unless the {Y i } are all equal to each other. Thus (under the proviso
that the {Ni } are sufficiently large), the stratified sample mean is usually more efficient than
the simple random mean. A more precise analysis can be found in the next section.
3
11.4 Choice of Strata fractions
There are a number of possible strategies available for selecting the strata sampling fractions.
We outline a few of these below, along with their relative efficiencies.
K
X 2
2 Si
var(y st,p ) = Wi (1 − fi )
i=1
ni
K
(1 − f ) X 1−f 2
= Wi Si2 = Sw
n i=1
n
Then
Let c0 be some fixed cost, and define ci to be the cost per sampled unit in stratum i.
Then the total cost satisfies
K
X
c0 + ni ci ≤ C (3)
i=1
4
Thus we seek to minimize
K
X W 2 S 2 (1 − fi )
i i
var(y st ) =
i=1
ni
Now
∂L Wj2 Sj2
= − 2 + λcj
∂nj nj
Actually, if the costs across all the strata were equal, then the optimal allocation would be
such that nj ∝ Wj Sj : we denote the resulting estimate by y st;o . It can be shown that
ÃK !2 K
1 X 1 X
var(y st;o ) = Wi Si − Wi Si2 .
n i=1
N i=1
5
Stratum No. of students No. of Stratum Stratum Std.
per institution institutions Total Mean Dev.
i Ni Ti Yi Si
1 <1000 661 292671 443 236
2 1000-3000 205 345302 1684 625
3 3000-10000 122 672728 5514 2008
4 ≥10000 31 573693 18506 10023
N = 1019 T = 1884394
These data might be used to plan a sample designed to give a quick estimate of total registra-
tion in some future year.
Assume that the sampling costsPper unit are the same in all strata and that we are aim-
ing at a total sample size of n = 4i=1 ni = 250. Then, we could attempt to invoke the optimal
allocation rule, i.e. that nj ∝ Nj Sj . Under this rule, n4 would be
31 × 10023
250 × ≈ 92.
(661 × 236) + (205 × 625) + (122 × 2008) + (31 × 10023)
However, N4 = 31: thus our ’disproportionate’ allocation procedure says that we take a 100%
sample from stratum 4 and then use the rule ni ∝ Ni Si to distribute the remaining sample
over the other strata.
6
11.5 Estimation of Totals
A number of unbiased estimators for T can be formulated: these differ in their variability
according to the sampling scheme from which they are generated.
E[Tb] = N × E[y] = N Y = T
var(Tb) = N 2 (1 − f )S 2 /n.
Tbst;p = N y st;p .
Now
Finally, consider stratified random sampling with optimal allocation using equal costs for all
sampling units. Then defining
Tbst;o = N y st;o
it follows that
and
à !2
1 XK
1 XK
var(Tbst;o ) = N 2 × var(y st;o ) = N 2 Wi Si − 2
Wi Si .
n N i=1
i=1
7
11.6 Stratified Sampling for Attributes
Here it is assumed that Xij = 1 if the j-th member of the i-th stratum belongs to a certain
category, and Xij = 0 otherwise. Thus the proportion of the population that belong to this
P PNi
category is X = K i=1 j=1 Xij /N ; let us denote this by π. Similarly, the proportion of the
P i
i-th stratum population that belong to this category is given by X i = N j=1 Xij /Ni , which we
denote by πi .
Let xij be equal to 1 if the j-th member of the i-th stratum within the sample possesses
the attribute, and is equal to 0 otherwise.
Let xi be the proportionPof the ni members of the sample from the i-th stratum that be-
long to the category, i.e. nj=1
i
xij /ni . The quantity π can be estimated by
K
X
π
bst = W i xi
i=1
Ni πi (1−πi )
Also, using the fact1 that Si2 = Ni −1
, then
K
X K
X X K
πi (1 − πi ) 1 − fi πi (1 − πi )
var(b
πst ) = Wi2 var(xi ) = Wi2 1 ≈ Wi2 (1 − fi )
i=1 i=1
ni 1 − Ni i=1
ni
where the final expression holds if the {Ni } are sufficiently large.
1 N π(1−π)
Under this regime, S 2 = N −1 : see top of p.12 of lecture 10.
8
(a) C.I. criterion √
Suppose that we have a finite population of size N with standard deviation S = S 2 . We
could estimate Y using y based on a sample of size n. Along with the assumption of normality,
we have that
µ ¶
S2 n no
y ∼N Y, 1−
n N
approximately, in which case the width of the ’approximate’ 100(1 − α)% C.I. is
S ³ n ´1/2
2 × zα/2 √ 1 − .
n N
For a 95% C.I., this is approximately equal to
S ³ n ´1/2
4× √ 1−
n N
since zα/2 = 1.96.
for large N .
(b) S.E.Criterion
Suppose that we require that the standard error of the estimate of the mean is no greater than
e. Then
µ ¶1/2
S ³ n ´1/2 1 1
√ 1− =S − =e
n N n N
i.e.
S2 S2
= e2 +
n N
which implies that
S2 S2
n= 2 ≈
e2 + SN e2
9
11.7.2 Cost Criteria
(a) Costing the Survey or Study
The costs may be detailed under headings such as set-up and initial design, pilot survey/experiment,
analysis of pilot and refinement of design, interview training, data collection, data entry, data
screening, analysis, report presentation.
Some costs are fixed, others monotonic in the sample size (not necessarily linear). Overall
cost generally of the form:
K
X
Cost = Constant + gi (stratum sample size)
i=1
where the function gi (·) is often of a form involving either steps or slopes. For example, the
steps may correspond to the need for another interviewer.
500
400
300
cost
200
100
0
sample.size
Might start with a statistical criterion curve, standard error say, plotted against sample size.
This needs to be converted to a monetary value curve.
10
400
300
monetary.return
200
100
0
sample.size
300
200
100
0
sample.size
11