Statistics For Management Sampling Theory: Post Graduate Programme

Post Graduate Programme

Statistics for Management


Sampling Theory

Prof. Saibal Chattopadhyay


IIM Calcutta
Sampling Theory
• Census Vs. Sampling
• Judgment Sampling Vs. Probability
Sampling

Different Probability Sampling Procedures


• Simple Random Sampling – With
Replacement (SRSWR) & Without
Replacement (SRSWOR)
• Stratified Random Sampling
• Systematic Sampling
Preliminary Concepts
• Finite Population:
N units having values X1, X2, …, XN
• Parameter:
A function of the population values
Examples:
• Population Mean = $\mu = \sum_{i=1}^{N} X_i / N$
• Population SD = $\sigma = \sqrt{\sum_{i=1}^{N} (X_i - \mu)^2 / N}$
• Population Proportion = P
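As a small illustration of these definitions, the sketch below computes the three parameters for a made-up population of five values. The data and the attribute used for the proportion (value greater than 4) are assumptions chosen only for illustration.

```python
import math

# Hypothetical finite population of N = 5 values (illustrative data only).
population = [2, 4, 4, 6, 9]
N = len(population)

# Population mean: mu = sum(X_i) / N
mu = sum(population) / N

# Population SD: sigma = sqrt(sum((X_i - mu)^2) / N); the divisor is N, not N - 1.
sigma = math.sqrt(sum((x - mu) ** 2 for x in population) / N)

# Population proportion: share of units having an attribute (assumed here: value > 4).
P = sum(1 for x in population if x > 4) / N

print(mu, sigma, P)
```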
Simple Random Sampling With
Replacement (SRSWR)
• n units to draw from N units
• Unit drawn is returned before next draw
• All possible choices are equally likely
• $N^n$ possible samples of size n each
• Each sample has probability $1/N^n$
• Same unit may repeat in the same sample
• Values of sampled units are random
variables !
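The counting argument for SRSWR can be checked by brute force on a tiny case. This is a minimal sketch assuming a hypothetical population of N = 5 units and samples of size n = 2; units are identified by their index so that repeated values do not confuse the count.

```python
from fractions import Fraction
from itertools import product

N, n = 5, 2  # assumed sizes, small enough to enumerate

# Every ordered sample of unit indices, repetition allowed (with replacement).
samples_wr = list(product(range(N), repeat=n))

# All samples are equally likely, so each has probability 1 / N^n.
prob_each = Fraction(1, N ** n)

# Samples in which the same unit appears more than once.
repeats = sum(1 for s in samples_wr if len(set(s)) < n)

print(len(samples_wr), prob_each, repeats)
```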
SRSWR

Denote the sample values as x1, x2, …, xn.


Consider x1 (the first sample value)
This could be any one of the N values of
the population
x1 takes each of the values X1, X2, …, XN
with probability 1/N.
Thus
P(x1=X1) = P(x1=X2) = …= P(x1=XN) = 1/N.
SRSWR

What about x2?


Sampling done with replacement;
Composition of the population unchanged;
Second sample value x2 is identically
distributed as x1
True for all subsequent sample values
• Sample Values are identically distributed
P(xi = X1) = P(xi = X2) = … = P(xi = XN) = 1/N,
for all i = 1, 2, …, n.
SRSWR

Are the sample values independent?


P(xi = X1 and xj = X2) = $1/N^2$
P(xi = X1) = 1/N & P(xj = X2) = 1/N, so the joint probability equals the product of the two
⇒ xi and xj are independent
True for all pairs of values
• Sample Values are identically distributed
• Independent and identically distributed
(IID) random variables
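The independence claim can be verified exhaustively for a small case. The sketch below assumes N = 4 units and n = 2 draws, and uses a hypothetical helper `prob` that computes probabilities by counting equally likely ordered samples.

```python
from fractions import Fraction
from itertools import product

N, n = 4, 2  # assumed sizes for illustration
samples = list(product(range(N), repeat=n))  # equally likely ordered samples
total = len(samples)

def prob(event):
    """Probability of an event, by counting equally likely samples."""
    return Fraction(sum(1 for s in samples if event(s)), total)

# Independence: the joint probability factorises for every pair of units (a, b).
independent = all(
    prob(lambda s: s[0] == a and s[1] == b)
    == prob(lambda s: s[0] == a) * prob(lambda s: s[1] == b)
    for a in range(N)
    for b in range(N)
)
print(independent)
```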
SRS Without Replacement (SRSWOR)

• n units to draw at random from N units


• Unit once drawn is not returned before
drawing the next unit
• All possible choices are equally likely
• $\binom{N}{n}$ possible samples of size n each
• Each sample has probability $1/\binom{N}{n}$
• Units in a sample are all distinct
• Values of sampled units are random
variables !
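The same brute-force check works for SRSWOR. This sketch again assumes N = 5 and n = 2, and treats a sample as an unordered set of distinct unit indices.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

N, n = 5, 2  # assumed sizes for illustration

# Every unordered sample of n distinct units (without replacement).
samples_wor = list(combinations(range(N), n))

# Each of the C(N, n) samples is equally likely.
prob_each = Fraction(1, comb(N, n))

# No unit can appear twice in the same sample.
all_distinct = all(len(set(s)) == n for s in samples_wor)

print(len(samples_wor), prob_each, all_distinct)
```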
SRSWOR
Are the sample units still identically
distributed ?
For x1 the distribution is same as SRSWR
What about x2 ?
P(x2 = X1 | x1 = Xi) = 0 if i = 1;
= 1/(N-1) otherwise
P(x2 = X1) = (1/N).0 + (N-1).(1/N).(1/(N-1))
= 1/N, same as in SRSWR !
• Yes; units are identically distributed
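To see the total-probability argument numerically, the sketch below enumerates all ordered draws without replacement for an assumed N = 5 and n = 2, and tabulates the marginal distribution of the second draw.

```python
from fractions import Fraction
from itertools import permutations

N, n = 5, 2  # assumed sizes for illustration

# Ordered draws without replacement; each has probability 1 / (N (N - 1)).
draws = list(permutations(range(N), n))

# Marginal distribution of the second draw x2 over the N units.
marginal_x2 = {
    u: Fraction(sum(1 for d in draws if d[1] == u), len(draws))
    for u in range(N)
}
print(marginal_x2)
```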
SRSWOR
Are the sample units still independent?
P(x1 = X1, x2 = X2) = 1/(N(N-1)),
but
P(x1 = X1) = 1/N = P(x2 = X2), so the product is $1/N^2 \neq 1/(N(N-1))$
⇒ x1 and x2 are not independent
True for all pairs of sample values
• Sample units are not independent
• What about their dependence?
SRSWOR

Are the sampled units uncorrelated?


• No; the covariance between any two of them
is $-\sigma^2/(N-1)$
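The covariance formula can be confirmed exactly on a small example. This is a minimal sketch using a made-up population of five values and exact rational arithmetic; it averages the product of deviations over all equally likely ordered pairs of distinct units.

```python
from fractions import Fraction
from itertools import permutations

population = [2, 4, 4, 6, 9]  # hypothetical values, for illustration only
N = len(population)

mu = Fraction(sum(population), N)
sigma2 = sum((x - mu) ** 2 for x in population) / N  # population variance

# All equally likely ordered pairs (first draw, second draw) of distinct units.
draws = list(permutations(range(N), 2))

# Cov(x1, x2) = E[(x1 - mu)(x2 - mu)], since both draws have mean mu.
cov = sum((population[i] - mu) * (population[j] - mu) for i, j in draws) / len(draws)

print(cov, -sigma2 / (N - 1))
```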
What is a Statistic?
• A function of the sample values;
Examples
• Sample Mean
• Sample SD
• Sample Proportion
SRSWOR

• A Statistic is a Random Variable


• Probability Distribution of a Statistic –
Called Sampling Distribution
• Mean of a Statistic – Called Expectation
• SD of a Statistic – Called Standard Error
(SE)
• Role of SE – compares efficacy of
different sampling procedures
• The smaller the SE, the better the sampling procedure
Sampling Distribution of Sample Mean in
Simple Random Sampling
• Finite Population of size N
• Population mean = $\mu$ and SD = $\sigma$
• Random Sample of size n drawn (WR/WOR)
• Statistic is Sample Mean
• Expectation = $\mu$ (both SRSWR and SRSWOR)
• SE = $\sigma/\sqrt{n}$ for SRSWR
• SE = $(\sigma/\sqrt{n}) \cdot \sqrt{\text{FPC}}$ for SRSWOR
• FPC = Finite Population Correction = (N-n)/(N-1)
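These expectation and SE formulas can be checked against the full sampling distribution for a small case. The sketch below assumes a made-up population of five values and n = 2, and uses a hypothetical helper `mean_and_se` that enumerates every equally likely ordered sample.

```python
import math
from itertools import permutations, product

population = [2, 4, 4, 6, 9]  # hypothetical values, for illustration only
N, n = len(population), 2
mu = sum(population) / N
sigma = math.sqrt(sum((x - mu) ** 2 for x in population) / N)

def mean_and_se(index_samples):
    """Expectation and SE of the sample mean over equally likely samples."""
    means = [sum(population[i] for i in s) / n for s in index_samples]
    expectation = sum(means) / len(means)
    se = math.sqrt(sum((m - expectation) ** 2 for m in means) / len(means))
    return expectation, se

exp_wr, se_wr = mean_and_se(list(product(range(N), repeat=n)))   # SRSWR
exp_wor, se_wor = mean_and_se(list(permutations(range(N), n)))   # SRSWOR

fpc = (N - n) / (N - 1)
print(exp_wr, se_wr, sigma / math.sqrt(n))
print(exp_wor, se_wor, sigma / math.sqrt(n) * math.sqrt(fpc))
```

In both printed lines the enumerated SE should agree with the formula value up to floating-point rounding.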
Comparing SRSWR and SRSWOR

• For n = 1, FPC = 1, so SRSWR and SRSWOR are equivalent
• For n > 1, FPC < 1, so SRSWOR has the smaller SE and is better
than SRSWR
• Limiting Behaviour: As N becomes large
with n fixed, both sampling methods are
asymptotically equivalent
---- Intuitively Obvious !
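The limiting behaviour is easy to tabulate. This short sketch fixes an assumed sample size of n = 10 and prints the factor $\sqrt{\text{FPC}}$, by which the SRSWOR standard error differs from the SRSWR one, for increasing population sizes.

```python
import math

def fpc(N, n):
    """Finite Population Correction, (N - n) / (N - 1)."""
    return (N - n) / (N - 1)

n = 10  # assumed fixed sample size
for N in (20, 100, 1000, 100000):
    # The ratio SE(SRSWOR) / SE(SRSWR) approaches 1 as N grows.
    print(N, round(math.sqrt(fpc(N, n)), 4))
```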
Central Limit Theorem

Sampling from a normal population


• Mean = $\mu$ and SD = $\sigma$
• SRS of size = n (With Replacement)
• Statistic = Sample Mean
• Expectation = $\mu$; SE = $\sigma/\sqrt{n}$
• Z = (Sample Mean - $\mu$)/($\sigma/\sqrt{n}$) is N(0, 1)
What happens if sampling is done from a non-
normal distribution?
• Distribution of sample mean is no longer normal
though formulae for Expectation & SE are still
true
Can we say anything more?

Yes, provided the sample size n is ‘large’ !


How large is ‘large’ ? As a rule of thumb, n $\geq$ 30 will do !!
What happens if n is ‘large’ ?
• Distribution of sample mean is still normal, but
only approximately
• Approximation is better and better as n becomes
larger and larger
• True regardless of the shape of the underlying
distribution from which sampling is done, provided its variance is finite
⇒ Central Limit Theorem
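A quick simulation can make the theorem concrete. The sketch below is an illustration under assumptions, not part of the original notes: it draws samples of size n = 30 from an Exponential(1) distribution (strongly skewed, with mean 1 and SD 1), standardises each sample mean, and reports the fraction of Z values within ±1.96, which would be about 0.95 for an exact N(0, 1).

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable
n, reps = 30, 20000  # assumed sample size and number of simulated samples
mu, sigma = 1.0, 1.0  # mean and SD of the Exponential(1) distribution

z_values = []
for _ in range(reps):
    sample = [random.expovariate(1.0) for _ in range(n)]
    xbar = sum(sample) / n
    # Standardise the sample mean: Z = (xbar - mu) / (sigma / sqrt(n))
    z_values.append((xbar - mu) / (sigma / math.sqrt(n)))

# Fraction of standardised means inside the central 95% band of N(0, 1).
coverage = sum(1 for z in z_values if abs(z) <= 1.96) / reps
print(round(coverage, 3))
```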
Some Standard Sampling Distributions

1. Chi-Square Distribution
• n IID N(0, 1) variables: Z1, Z2, …, Zn
• Y = Sum of Squares of Z1, Z2, …, Zn
= $Z_1^2 + Z_2^2 + \dots + Z_n^2$
• Y is Chi-Square with n degrees of freedom
(d.f)
• Mean = n; SD = $\sqrt{2n}$
• $(Y - n)/\sqrt{2n}$ is approximately Standard Normal for large n
• Distribution is positively skewed; probability
table available
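The construction of the Chi-Square variable can be imitated directly by simulation. This sketch assumes n = 5 degrees of freedom and 20,000 replications, and compares the simulated mean and SD of Y with n and $\sqrt{2n}$.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable
n, reps = 5, 20000  # assumed degrees of freedom and number of replications

# Y = Z1^2 + ... + Zn^2 with Z1, ..., Zn IID N(0, 1)
y = [sum(random.gauss(0, 1) ** 2 for _ in range(n)) for _ in range(reps)]

mean_y = statistics.fmean(y)
sd_y = statistics.pstdev(y)
print(round(mean_y, 2), round(sd_y, 2), n, round(math.sqrt(2 * n), 2))
```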
Some Standard Sampling Distributions

2. t – distribution
• Z is N(0, 1)
• Y is Chi-Square with d.f = n
• Z and Y are independently distributed
• Sampling Distribution of t = $Z/\sqrt{Y/n}$ is
called the t-distribution with d.f = n
• Similar to N(0, 1); Approaches N(0,1) as
the d.f n becomes large (n $\geq$ 30); Probability
tables for n < 30 available
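The definition of t can likewise be simulated from its ingredients. The sketch below assumes a small d.f of n = 5 so that the difference from N(0, 1) is visible: it estimates P(|t| > 1.96), which is 0.05 for a standard normal and noticeably larger for a t-distribution with few degrees of freedom. The helper `chi_square` is hypothetical and simply sums squared N(0, 1) draws.

```python
import math
import random

random.seed(2)  # fixed seed so the run is repeatable
n, reps = 5, 20000  # assumed d.f and number of replications

def chi_square(df):
    """One draw of a Chi-Square variable with df degrees of freedom."""
    return sum(random.gauss(0, 1) ** 2 for _ in range(df))

# t = Z / sqrt(Y / n), with Z and Y generated independently.
t_values = [random.gauss(0, 1) / math.sqrt(chi_square(n) / n) for _ in range(reps)]

# Two-sided tail probability beyond the normal 5% cut-off.
tail_t = sum(1 for t in t_values if abs(t) > 1.96) / reps
print(round(tail_t, 3))
```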
Some Standard Sampling Distributions

3. F – distribution
• Y1 is Chi-Square with d.f = n1
• Y2 is Chi-Square with d.f = n2
• Y1 and Y2 are independently distributed
• Sampling distribution of
F = (Y1/n1)/(Y2/n2)
is F distribution with d.f = (n1, n2)
⇒ Useful for Hypothesis-Testing problems
when we have samples available from a
normal population (exact or approximate)
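The F ratio can be built the same way from two independent Chi-Square variables. This sketch assumes d.f = (4, 10) and checks the simulated mean against the known value n2/(n2 - 2) for n2 > 2, a standard property of the F distribution that is not stated in the notes above. The helper `chi_square` is hypothetical.

```python
import random
import statistics

random.seed(3)  # fixed seed so the run is repeatable
n1, n2, reps = 4, 10, 20000  # assumed d.f and number of replications

def chi_square(df):
    """One draw of a Chi-Square variable with df degrees of freedom."""
    return sum(random.gauss(0, 1) ** 2 for _ in range(df))

# F = (Y1 / n1) / (Y2 / n2), with Y1 and Y2 generated independently.
f_values = [(chi_square(n1) / n1) / (chi_square(n2) / n2) for _ in range(reps)]

mean_f = statistics.fmean(f_values)
print(round(mean_f, 2), n2 / (n2 - 2))
```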
References

• Statistics for Managers using Microsoft Excel (8th ed.): Levine, D.M., Stephan, D.F., & Szabat, K.A.
• Class Notes
• Complete Business Statistics (5th ed., Tata McGraw-Hill): Aczel, A.D. & Sounderpandian, J.
• Sampling Techniques (Wiley): Cochran, W.
