Stratified Sampling: Sample Size Determination

STRATIFIED SAMPLING

SAMPLE SIZE DETERMINATION
✓ The formula works the same way as the sample size formula in the previous chapter.
✓ However, it contains an additional allocation term, ai, which must be understood clearly.
✓ If the population variance σi² is not given, use the sample variance si² from a previous experiment. [σi² = si²]
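General Formula

Sample size determination when estimating the mean:

$$ n = \frac{\sum_{i=1}^{L} N_i^2 \sigma_i^2 / a_i}{N^2 D + \sum_{i=1}^{L} N_i \sigma_i^2}, \qquad D = \frac{B^2}{4} $$

When estimating the total:

$$ n = \frac{\sum_{i=1}^{L} N_i^2 \sigma_i^2 / a_i}{N^2 D + \sum_{i=1}^{L} N_i \sigma_i^2}, \qquad D = \frac{B^2}{4N^2} $$

When estimating the proportion:

$$ n = \frac{\sum_{i=1}^{L} N_i^2 p_i q_i / a_i}{N^2 D + \sum_{i=1}^{L} N_i p_i q_i}, \qquad D = \frac{B^2}{4} $$

Where D is the estimated variance for the estimator and ai is the fraction of observations allocated to stratum i.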
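As a reminder of where D comes from, here is a short sketch. It assumes the usual convention that the bound B equals two standard errors of the estimator, which the slides use but do not state explicitly.

$$ B = 2\sqrt{V(\bar{y}_{st})} \;\Longrightarrow\; V(\bar{y}_{st}) = \frac{B^2}{4} = D \quad \text{(mean; the proportion case is analogous)} $$

$$ B = 2\sqrt{N^2\, V(\bar{y}_{st})} \;\Longrightarrow\; V(\bar{y}_{st}) = \frac{B^2}{4N^2} = D \quad \text{(total, since } \hat{\tau} = N\bar{y}_{st}\text{)} $$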
Example
A company manager wants to estimate the average mark on a personal motivation test for the staff. The staff were divided by department: Administration, Operation and Logistic. The marks achieved by the staff on the test are summarized below.

Administration   Operation   Logistic
N1 = 23          N2 = 48     N3 = 29
σ1 = 3.4         σ2 = 2.5    σ3 = 3.6
n1 = 12          n2 = 24     n3 = 14
ȳ1 = 78          ȳ2 = 81     ȳ3 = 85

Suppose the mean mark for all staff is to be estimated again as part of restructuring the organization. Assuming the sample is allocated equally to all strata, find the sample size needed to obtain a bound on the error of estimation equal to 0.3.
Solution
Since the allocation is the same for all strata, a1 = 1/3, a2 = 1/3 and a3 = 1/3.

$$ \sum_{i=1}^{L} \frac{N_i^2 \sigma_i^2}{a_i} = \frac{N_1^2\sigma_1^2}{a_1} + \frac{N_2^2\sigma_2^2}{a_2} + \frac{N_3^2\sigma_3^2}{a_3} = \frac{(23)^2(3.4)^2}{1/3} + \frac{(48)^2(2.5)^2}{1/3} + \frac{(29)^2(3.6)^2}{1/3} = 94243.8 $$

$$ \sum_{i=1}^{L} N_i \sigma_i^2 = N_1\sigma_1^2 + N_2\sigma_2^2 + N_3\sigma_3^2 = (23)(3.4)^2 + (48)(2.5)^2 + (29)(3.6)^2 = 941.72 $$

$$ D = \frac{B^2}{4} = \frac{0.3^2}{4} = 0.0225, \qquad N^2 D = (100)^2 (0.0225) = 225 $$

$$ n = \frac{\sum_{i=1}^{L} N_i^2 \sigma_i^2 / a_i}{N^2 D + \sum_{i=1}^{L} N_i \sigma_i^2} = \frac{94243.8}{225 + 941.72} = 80.8 \approx 81 $$

$$ n_1 = n(a_1) = 81\left(\tfrac{1}{3}\right) = 27, \qquad n_2 = n(a_2) = 81\left(\tfrac{1}{3}\right) = 27, \qquad n_3 = n(a_3) = 81\left(\tfrac{1}{3}\right) = 27 $$

The manager should take a sample of n = 81 staff, 27 from each department.
Note: Examples of sample size determination for the population total and the population proportion are not given here. (See the example in the CHAPTER 3 notes.)

Allocation of the Sample
➢ Use the allocation that gives a specified amount of information at minimum cost.
➢ The best allocation scheme is affected by three factors:
1. The cost of obtaining an observation from each stratum. (Take smaller samples from strata with high costs.)
2. The variability of observations within each stratum. (A larger sample is needed to obtain a good estimate of the population parameters when the observations are less homogeneous: the larger the variance, the greater the sample size needed.)
3. The total number of elements in each stratum. (Larger samples should be assigned to strata with more elements.)
➢ There are three formulas for calculating the allocation, depending on the information at hand:
1. Optimal allocation
2. Neyman allocation
3. Proportional allocation
➢ Each type of allocation can be the best one, depending on how closely it matches the information given.
➢ The formulas below are for allocation with the variance of the estimator fixed at D.
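Optimal allocation:

$$ n = \frac{\left(\sum_{k=1}^{L} N_k\sigma_k/\sqrt{c_k}\right)\left(\sum_{i=1}^{L} N_i\sigma_i\sqrt{c_i}\right)}{N^2 D + \sum_{i=1}^{L} N_i\sigma_i^2}, \qquad n_i = n\left(\frac{N_i\sigma_i/\sqrt{c_i}}{\sum_{k=1}^{L} N_k\sigma_k/\sqrt{c_k}}\right) $$

Neyman allocation:

$$ n = \frac{\left(\sum_{k=1}^{L} N_k\sigma_k\right)^2}{N^2 D + \sum_{i=1}^{L} N_i\sigma_i^2}, \qquad n_i = n\left(\frac{N_i\sigma_i}{\sum_{k=1}^{L} N_k\sigma_k}\right) $$

Proportional allocation:

$$ n = \frac{\sum_{i=1}^{L} N_i\sigma_i^2}{N D + \frac{1}{N}\sum_{i=1}^{L} N_i\sigma_i^2}, \qquad n_i = n\left(\frac{N_i}{\sum_{k=1}^{L} N_k}\right) = n\left(\frac{N_i}{N}\right) $$

$$ D = \frac{B^2}{4} \text{ when estimating } \mu, \qquad D = \frac{B^2}{4N^2} \text{ when estimating } \tau $$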
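These formulas follow from the general sample size formula once the allocation fraction a_i = n_i / n of each scheme is substituted into its numerator. A short sketch for the Neyman and proportional cases (the intermediate steps are added here, not taken from the slides):

$$ a_i = \frac{N_i\sigma_i}{\sum_{k=1}^{L} N_k\sigma_k} \;\Longrightarrow\; \sum_{i=1}^{L}\frac{N_i^2\sigma_i^2}{a_i} = \left(\sum_{k=1}^{L} N_k\sigma_k\right)\sum_{i=1}^{L} N_i\sigma_i = \left(\sum_{k=1}^{L} N_k\sigma_k\right)^2 $$

$$ a_i = \frac{N_i}{N} \;\Longrightarrow\; \sum_{i=1}^{L}\frac{N_i^2\sigma_i^2}{a_i} = N\sum_{i=1}^{L} N_i\sigma_i^2, \qquad n = \frac{N\sum_{i=1}^{L} N_i\sigma_i^2}{N^2 D + \sum_{i=1}^{L} N_i\sigma_i^2} = \frac{\sum_{i=1}^{L} N_i\sigma_i^2}{N D + \frac{1}{N}\sum_{i=1}^{L} N_i\sigma_i^2} $$

The optimal case works the same way with a_i proportional to N_i σ_i / √c_i.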
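For proportions, the three allocations become:

Optimal allocation:

$$ n = \frac{\left(\sum_{k=1}^{L} N_k\sqrt{p_k q_k / c_k}\right)\left(\sum_{i=1}^{L} N_i\sqrt{p_i q_i c_i}\right)}{N^2 D + \sum_{i=1}^{L} N_i p_i q_i}, \qquad n_i = n\left(\frac{N_i\sqrt{p_i q_i / c_i}}{\sum_{k=1}^{L} N_k\sqrt{p_k q_k / c_k}}\right) $$

Neyman allocation:

$$ n = \frac{\left(\sum_{i=1}^{L} N_i\sqrt{p_i q_i}\right)^2}{N^2 D + \sum_{i=1}^{L} N_i p_i q_i}, \qquad n_i = n\left(\frac{N_i\sqrt{p_i q_i}}{\sum_{k=1}^{L} N_k\sqrt{p_k q_k}}\right) $$

Proportional allocation:

$$ n = \frac{\sum_{i=1}^{L} N_i p_i q_i}{N D + \frac{1}{N}\sum_{i=1}^{L} N_i p_i q_i}, \qquad n_i = n\left(\frac{N_i}{\sum_{k=1}^{L} N_k}\right) = n\left(\frac{N_i}{N}\right) $$

$$ D = \frac{B^2}{4} \text{ when estimating } p $$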
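Since the slides give no worked example for proportions, here is a minimal Python sketch of the Neyman formula for a proportion. The stratum proportions `props` and the bound of 0.1 are made-up values for illustration only, and the function name `neyman_for_proportion` is hypothetical.

```python
import math

def neyman_for_proportion(sizes, props, bound):
    """Neyman sample size and allocation fractions when estimating a proportion."""
    N = sum(sizes)
    D = bound ** 2 / 4
    # N_i * sqrt(p_i * q_i) for each stratum, with q_i = 1 - p_i
    weights = [Ni * math.sqrt(p * (1 - p)) for Ni, p in zip(sizes, props)]
    total_weight = sum(weights)
    denominator = N ** 2 * D + sum(Ni * p * (1 - p) for Ni, p in zip(sizes, props))
    n = total_weight ** 2 / denominator
    fractions = [w / total_weight for w in weights]
    return n, fractions

sizes = [23, 48, 29]     # stratum sizes reused from the staff example
props = [0.5, 0.4, 0.2]  # hypothetical prior estimates of p_i (assumption)

n_raw, fractions = neyman_for_proportion(sizes, props, bound=0.1)
n = math.ceil(n_raw)     # round the total sample size up
per_stratum = [n * a for a in fractions]  # unrounded stratum sample sizes
print(n, [round(x, 1) for x in per_stratum])
```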
Which allocation is suitable?
➢ The choice of allocation depends on the information given.

Comparison across strata       Optimal allocation    Neyman allocation     Proportional allocation
Cost per observation           c1 ≠ c2 ≠ c3          c1 = c2 = c3          c1 = c2 = c3
Variance per stratum           σ1² ≠ σ2² ≠ σ3²       σ1² ≠ σ2² ≠ σ3²       σ1² = σ2² = σ3²
Population size per stratum    N1 ≠ N2 ≠ N3          N1 ≠ N2 ≠ N3          N1 ≠ N2 ≠ N3
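The decision rule in this comparison can be written as a small helper. This is only a sketch: `suggest_allocation` is a hypothetical name, and testing for exact equality is a simplification, since in practice costs or variances that are merely close may be treated as equal.

```python
def suggest_allocation(costs, variances):
    """Pick an allocation scheme from which quantities differ across strata."""
    costs_equal = len(set(costs)) == 1
    variances_equal = len(set(variances)) == 1
    if not costs_equal:
        return "optimal"       # costs differ across strata
    if not variances_equal:
        return "Neyman"        # equal costs, unequal variances
    return "proportional"      # equal costs and equal variances

print(suggest_allocation([4, 9, 16], [11.56, 6.25, 12.96]))
print(suggest_allocation([1, 1, 1], [11.56, 6.25, 12.96]))
print(suggest_allocation([1, 1, 1], [4.0, 4.0, 4.0]))
```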
Example
A company manager wants to estimate the average mark on a personal motivation test for the staff. The staff were divided by department: Administration, Operation and Logistic. The marks achieved by the staff on the test are summarized below.

Administration   Operation   Logistic
N1 = 23          N2 = 48     N3 = 29
σ1 = 3.4         σ2 = 2.5    σ3 = 3.6
n1 = 12          n2 = 24     n3 = 14
ȳ1 = 78          ȳ2 = 81     ȳ3 = 85

Suppose the mean mark for all staff is to be estimated again as part of restructuring the organization. Assume the costs of collecting data for the Administration, Operation and Logistic departments are 4, 9 and 16 respectively. Find the sample size needed to obtain a bound on the error of estimation equal to 0.3.
Solution
Because the costs differ across strata, optimal allocation is used.

$$ \sum_{k=1}^{L} \frac{N_k\sigma_k}{\sqrt{c_k}} = \frac{N_1\sigma_1}{\sqrt{c_1}} + \frac{N_2\sigma_2}{\sqrt{c_2}} + \frac{N_3\sigma_3}{\sqrt{c_3}} = \frac{23(3.4)}{\sqrt{4}} + \frac{48(2.5)}{\sqrt{9}} + \frac{29(3.6)}{\sqrt{16}} = 105.2 $$

$$ \sum_{i=1}^{L} N_i\sigma_i\sqrt{c_i} = N_1\sigma_1\sqrt{c_1} + N_2\sigma_2\sqrt{c_2} + N_3\sigma_3\sqrt{c_3} = 23(3.4)\sqrt{4} + 48(2.5)\sqrt{9} + 29(3.6)\sqrt{16} = 934 $$

$$ \sum_{i=1}^{L} N_i\sigma_i^2 = N_1\sigma_1^2 + N_2\sigma_2^2 + N_3\sigma_3^2 = (23)(3.4)^2 + (48)(2.5)^2 + (29)(3.6)^2 = 941.72 $$

$$ D = \frac{B^2}{4} = \frac{0.3^2}{4} = 0.0225, \qquad N^2 D = (100)^2(0.0225) = 225 $$

$$ n = \frac{\left(\sum_{k=1}^{L} N_k\sigma_k/\sqrt{c_k}\right)\left(\sum_{i=1}^{L} N_i\sigma_i\sqrt{c_i}\right)}{N^2 D + \sum_{i=1}^{L} N_i\sigma_i^2} = \frac{(105.2)(934)}{225 + 941.72} = 84.2 \approx 85 $$

$$ n_1 = n\left(\frac{23(3.4)/2}{105.2}\right) = 0.372n = 0.372(85) = 31.6 \approx 32 $$

$$ n_2 = n\left(\frac{48(2.5)/3}{105.2}\right) = 0.380n = 0.380(85) = 32.3 \approx 32 $$

$$ n_3 = n\left(\frac{29(3.6)/4}{105.2}\right) = 0.248n = 0.248(85) = 21.1 \approx 21 $$
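The optimal allocation arithmetic can be reproduced in Python as a check. This is a minimal sketch with illustrative variable names. It uses 934 for the cost-weighted sum, as computed in the solution, which gives about 84.2 before rounding up to 85.

```python
import math

sizes = [23, 48, 29]     # N1, N2, N3
sigmas = [3.4, 2.5, 3.6] # stratum standard deviations
costs = [4, 9, 16]       # cost per observation in each stratum
bound = 0.3

N = sum(sizes)
D = bound ** 2 / 4

# N_k * sigma_k / sqrt(c_k): these also drive the per-stratum shares
weights = [Ni * si / math.sqrt(ci) for Ni, si, ci in zip(sizes, sigmas, costs)]
cost_weighted = sum(Ni * si * math.sqrt(ci)
                    for Ni, si, ci in zip(sizes, sigmas, costs))
denominator = N ** 2 * D + sum(Ni * si ** 2 for Ni, si in zip(sizes, sigmas))

n_raw = sum(weights) * cost_weighted / denominator
n = math.ceil(n_raw)  # round the total sample size up
per_stratum = [round(n * w / sum(weights)) for w in weights]
print(round(n_raw, 1), n, per_stratum)
```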
Example
A company manager wants to estimate the average mark on a personal motivation test for the staff. The staff were divided by department: Administration, Operation and Logistic. The marks achieved by the staff on the test are summarized below.

Administration   Operation   Logistic
N1 = 23          N2 = 48     N3 = 29
σ1 = 3.4         σ2 = 2.5    σ3 = 3.6
n1 = 12          n2 = 24     n3 = 14
ȳ1 = 78          ȳ2 = 81     ȳ3 = 85

Suppose the mean mark for all staff is to be estimated again as part of restructuring the organization. Assuming the cost of sampling is equal for all strata, find the sample size needed to obtain a bound on the error of estimation equal to 0.3.
Solution
Because the costs are equal and the variances differ, Neyman allocation is used.

$$ \sum_{k=1}^{L} N_k\sigma_k = N_1\sigma_1 + N_2\sigma_2 + N_3\sigma_3 = 23(3.4) + 48(2.5) + 29(3.6) = 302.6 $$

$$ \sum_{i=1}^{L} N_i\sigma_i^2 = N_1\sigma_1^2 + N_2\sigma_2^2 + N_3\sigma_3^2 = (23)(3.4)^2 + (48)(2.5)^2 + (29)(3.6)^2 = 941.72 $$

$$ D = \frac{B^2}{4} = \frac{0.3^2}{4} = 0.0225, \qquad N^2 D = (100)^2(0.0225) = 225 $$

$$ n = \frac{\left(\sum_{k=1}^{L} N_k\sigma_k\right)^2}{N^2 D + \sum_{i=1}^{L} N_i\sigma_i^2} = \frac{(302.6)^2}{225 + 941.72} = 78.5 \approx 79 $$

$$ n_1 = n\left(\frac{23(3.4)}{302.6}\right) = 0.258n = 0.258(79) = 20.4 \approx 21 $$

$$ n_2 = n\left(\frac{48(2.5)}{302.6}\right) = 0.397n = 0.397(79) = 31.4 \approx 31 $$

$$ n_3 = n\left(\frac{29(3.6)}{302.6}\right) = 0.345n = 0.345(79) = 27.3 \approx 27 $$

Here n1 is rounded up to 21 so that the three stratum samples sum to n = 79.
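The Neyman result, including the rounding of n1 up to 21, can be reproduced with a largest-remainder rounding step. The slides do not name a rounding rule, so `largest_remainder` is an assumed helper that happens to reproduce their numbers: each share is floored, and leftover units go to the strata with the largest fractional parts.

```python
import math

def largest_remainder(shares, total):
    """Round shares to integers that sum to total (assumed rounding rule)."""
    rounded = [math.floor(s) for s in shares]
    leftover = total - sum(rounded)
    # strata ordered by fractional part, largest first
    order = sorted(range(len(shares)),
                   key=lambda i: shares[i] - rounded[i], reverse=True)
    for i in order[:leftover]:
        rounded[i] += 1
    return rounded

sizes = [23, 48, 29]
sigmas = [3.4, 2.5, 3.6]
bound = 0.3

N = sum(sizes)
D = bound ** 2 / 4
weights = [Ni * si for Ni, si in zip(sizes, sigmas)]  # N_i * sigma_i
denominator = N ** 2 * D + sum(Ni * si ** 2 for Ni, si in zip(sizes, sigmas))

n_raw = sum(weights) ** 2 / denominator
n = math.ceil(n_raw)
shares = [n * w / sum(weights) for w in weights]
per_stratum = largest_remainder(shares, n)
print(round(n_raw, 1), n, per_stratum)
```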
Example
A company manager wants to estimate the average mark on a personal motivation test for the staff. The staff were divided by department: Administration, Operation and Logistic. The marks achieved by the staff on the test are summarized below.

Administration   Operation   Logistic
N1 = 23          N2 = 48     N3 = 29
σ1 = 3.4         σ2 = 2.5    σ3 = 3.6
n1 = 12          n2 = 24     n3 = 14
ȳ1 = 78          ȳ2 = 81     ȳ3 = 85

Suppose the mean mark for all staff is to be estimated again as part of restructuring the organization. Assuming the cost of sampling and the variance are equal for all strata, find the sample size needed to obtain a bound on the error of estimation equal to 0.3.
Solution
Given: c1 = 5, c2 = 8 and c3 = 10. (These costs do not enter the proportional allocation formulas.)

$$ \sum_{i=1}^{L} N_i\sigma_i^2 = N_1\sigma_1^2 + N_2\sigma_2^2 + N_3\sigma_3^2 = (23)(3.4)^2 + (48)(2.5)^2 + (29)(3.6)^2 = 941.72 $$

$$ D = \frac{B^2}{4} = \frac{0.3^2}{4} = 0.0225, \qquad N D = (100)(0.0225) = 2.25 $$

$$ n = \frac{\sum_{i=1}^{L} N_i\sigma_i^2}{N D + \frac{1}{N}\sum_{i=1}^{L} N_i\sigma_i^2} = \frac{941.72}{2.25 + \left(\frac{1}{100}\right)(941.72)} = 80.7 \approx 81 $$

$$ n_1 = n\left(\frac{23}{100}\right) = 0.23n = 0.23(81) = 18.6 \approx 19 $$

$$ n_2 = n\left(\frac{48}{100}\right) = 0.48n = 0.48(81) = 38.9 \approx 39 $$

$$ n_3 = n\left(\frac{29}{100}\right) = 0.29n = 0.29(81) = 23.5 \approx 23 $$
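A quick Python check of the proportional allocation figures, written as a sketch with illustrative variable names. Note that this formula uses N·D rather than N²·D in the denominator.

```python
import math

sizes = [23, 48, 29]
sigmas = [3.4, 2.5, 3.6]
bound = 0.3

N = sum(sizes)
D = bound ** 2 / 4
within = sum(Ni * si ** 2 for Ni, si in zip(sizes, sigmas))  # sum of N_i * sigma_i^2

n_raw = within / (N * D + within / N)
n = math.ceil(n_raw)                               # round the total up
per_stratum = [round(n * Ni / N) for Ni in sizes]  # n_i = n * N_i / N
print(round(n_raw, 1), n, per_stratum)
```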
Which allocation is the best?
➢ Generally:

Rank   Allocation                Reason
1      Optimal allocation        The allocation takes into account the variability of the cost, the variance and the population size.
2      Neyman allocation         The allocation takes into account the variability of the variance and the population size.
3      Proportional allocation   The allocation only takes into account the variability of the population size.

➢ However, this ranking is not the only basis for the selection.
➢ The cost, variability and size of each stratum must also be considered, as highlighted in the factors that affect the best allocation. (See SLIDE 3.)
Consider the given cases…
➢ Given that the cost for each sample is provided, the best choice of allocation is the one with the least cost.

Case 1
Given: c1 = 30, c2 = 20, c3 = 10; σ1² = 2, σ2² = 3, σ3² = 1; N1 = 30, N2 = 20, N3 = 10
Optimal allocation:        n1 = 10, n2 = 12, n3 = 8
Neyman allocation:         n1 = 12, n2 = 8, n3 = 10
Proportional allocation:   n1 = 8, n2 = 10, n3 = 12
Best allocation: No allocation stands out with respect to cost, variance and stratum size, but the proportional allocation gives the lowest cost.

Case 2
Given: c1 = 30, c2 = 20, c3 = 10; σ1² = 2, σ2² = 3, σ3² = 1; N1 = 30, N2 = 20, N3 = 10
Optimal allocation:        n1 = 8, n2 = 12, n3 = 10
Neyman allocation:         n1 = 10, n2 = 12, n3 = 8
Proportional allocation:   n1 = 10, n2 = 8, n3 = 12
Best allocation: Neyman allocation is the best since the differences in the sample variance were significant among the strata.

Case 3
Given: c1 = 30, c2 = 20, c3 = 10; σ1² = 2, σ2² = 3, σ3² = 1; N1 = 30, N2 = 20, N3 = 10
Optimal allocation:        n1 = 10, n2 = 12, n3 = 8
Neyman allocation:         n1 = 12, n2 = 8, n3 = 10
Proportional allocation:   n1 = 12, n2 = 10, n3 = 8
Best allocation: Proportional allocation is the best since the differences in the sample size were significant among the strata.
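The "least cost" comparison for case 1 can be made concrete by totalling c_i × n_i for each allocation. This sketch assumes case 1 reads as Optimal (10, 12, 8), Neyman (12, 8, 10) and Proportional (8, 10, 12), which is how the table is interpreted here; `total_cost` is an illustrative helper name.

```python
costs = [30, 20, 10]  # c1, c2, c3 from the table

# Case 1 allocations as read from the table (an assumption about its layout)
allocations = {
    "optimal": [10, 12, 8],
    "Neyman": [12, 8, 10],
    "proportional": [8, 10, 12],
}

def total_cost(costs, allocation):
    """Total sampling cost: sum of c_i * n_i over the strata."""
    return sum(c * n for c, n in zip(costs, allocation))

totals = {name: total_cost(costs, alloc) for name, alloc in allocations.items()}
cheapest = min(totals, key=totals.get)
print(totals, cheapest)
```

Under this reading the proportional allocation has the smallest total cost, in line with the conclusion stated for case 1.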
Thank you
