In applications:

Probability Sampling: Simple Random Sampling, Stratified Random Sampling, Multi-Stage Sampling

• What is each and how is it done?

• How do we decide which to use?

• How do we analyze the results differently depending on the type of sampling?

Non-probability Sampling: Why don't we use non-probability sampling schemes? Two reasons:

• We can't use the mathematics of probability to analyze the results.

• In general, we can't count on a non-probability sampling scheme to produce representative samples.

What are the main types of sampling and how is each done?

Simple Random Sampling: A simple random sample (SRS) of size n is produced by a scheme which
ensures that each subset of n members of the population has an equal probability of being chosen as the
sample.
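As a rough illustration of the definition above, here is a minimal Python sketch of drawing an SRS. The population of member IDs 1 to 1000, the function name, and the seed are assumptions made only for this example. The standard library's `random.sample` draws without replacement, so every subset of the requested size is equally likely.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct members; every size-n subset is equally likely."""
    rng = random.Random(seed)          # seed is optional, used here for repeatability
    return rng.sample(population, n)   # sampling without replacement

# Hypothetical population: member IDs 1..1000 (not from the notes)
population = list(range(1, 1001))
sample = simple_random_sample(population, 30, seed=1)
print(sample)
```

Note that the scheme needs a complete list of the population (a sampling frame) before any drawing can happen.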

Stratified Random Sampling: Divide the population into "strata". There can be any number of
these. Then choose a simple random sample from each stratum. Combine those into the overall sample.
That is a stratified random sample. (Example: Church A has 600 women and 400 men as members.
One way to get a stratified random sample of size 30 is to take a SRS of 18 women from the 600 women
and another SRS of 12 men from the 400 men.)
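The church example can be sketched in Python as follows. The allocation here is proportional to stratum size (18 of 30 for a stratum of 600 out of 1000, 12 of 30 for a stratum of 400), which is one way to allocate, as the example says. The member labels and function names are made up for illustration.

```python
import random

def proportional_allocation(stratum_sizes, n):
    """Split a total sample size n across strata in proportion to stratum size.
    Rounding can make the parts miss n by one or two in general; it is exact here."""
    total = sum(stratum_sizes.values())
    return {name: round(n * size / total) for name, size in stratum_sizes.items()}

def stratified_sample(strata, n, seed=None):
    """Take an independent SRS inside each stratum, sized by proportional allocation."""
    rng = random.Random(seed)
    sizes = {name: len(members) for name, members in strata.items()}
    allocation = proportional_allocation(sizes, n)
    return {name: rng.sample(strata[name], allocation[name]) for name in strata}

# Hypothetical member labels for Church A
strata = {
    "women": [f"W{i}" for i in range(600)],
    "men": [f"M{i}" for i in range(400)],
}
sample = stratified_sample(strata, 30, seed=1)
print({name: len(chosen) for name, chosen in sample.items()})  # {'women': 18, 'men': 12}
```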

Multi-Stage Sampling: Sometimes the population is too large and scattered for it to be practical to
make a list of the entire population from which to draw a SRS. For instance, when a polling
organization samples US voters, it does not take a SRS. Since voter lists are compiled by counties, it
might first take a sample of the counties and then sample within the selected counties. This illustrates
two stages. In some instances, more stages might be used. At each stage, the organization might take a
stratified random sample on sex, race, income level, or any other useful variable on which it could
get information before sampling.
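A two-stage scheme like the county example might look like the sketch below. The counties, voter labels, and sample sizes are hypothetical, and the sketch uses a plain SRS at both stages with no stratification, which is a simplification of what a polling organization would actually do.

```python
import random

def two_stage_sample(units_by_cluster, n_clusters, n_per_cluster, seed=None):
    """Stage 1: SRS of clusters (e.g. counties).
    Stage 2: SRS of units (e.g. voters) inside each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(units_by_cluster), n_clusters)
    return {
        cluster: rng.sample(
            units_by_cluster[cluster],
            min(n_per_cluster, len(units_by_cluster[cluster])),
        )
        for cluster in chosen
    }

# Hypothetical frame: 50 counties with 200 listed voters each
counties = {
    f"county_{i:02d}": [f"voter_{i:02d}_{j}" for j in range(200)]
    for i in range(50)
}
sample = two_stage_sample(counties, n_clusters=5, n_per_cluster=20, seed=1)
for county, voters in sample.items():
    print(county, len(voters))
```

Only the voter lists of the five selected counties are needed at the second stage, which is the practical appeal of the method.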

How does one decide which type of sampling to use?

The formulas in almost all statistics books assume simple random sampling. Unless you are willing to
learn the more complex techniques to analyze the data after it is collected, it is appropriate to use
simple random sampling. To learn the appropriate formulas for the more complex sampling schemes,
look for a book or course on sampling.

Stratified random sampling generally gives more precise information than simple random sampling for a given
sample size, provided the strata are relevant to what is being measured. So, if information is available
on all members of the population that divides them into strata that seem relevant, stratified sampling
will usually be used.

If the population is large and enough resources are available, usually one will use multi-stage sampling.
In such situations, usually stratified sampling will be done at some stages.

How do we analyze the results differently depending on the type of sampling?

The main difference is in the computation of the estimates of the variance (or standard deviation). An
excellent book for self-study is A Sampler on Sampling, by Williams (Wiley). In it, you see a rather small
population and then a complete derivation and description of the sampling distribution of the sample
mean for a particular small sample size. I believe that it is accessible to any student who has had an
upper-division mathematical statistics course and to some strong students who have had a freshman
introductory statistics course. A very simple statement of the conclusion is that the variance of the
estimator is smaller if it came from a stratified random sample than from a simple random sample of the
same size. Since small variance means more precise information from the sample, we see that this is
consistent with stratified random sampling giving better estimators for a given sample size.
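To make the variance comparison concrete, the sketch below evaluates the standard textbook variance formulas for the sample mean under each design, both with the finite population correction. The two-stratum population is invented so that values are similar within a stratum and different between strata, and the allocation of 6 and 4 is proportional. This is an illustration under those assumptions, not a general proof.

```python
from statistics import variance

def srs_variance_of_mean(values, n):
    """Var of the sample mean under SRS without replacement: (1 - n/N) * S^2 / n."""
    N = len(values)
    return (1 - n / N) * variance(values) / n

def stratified_variance_of_mean(strata, allocation):
    """Var of the stratified mean: sum over strata of W_h^2 * (1 - n_h/N_h) * S_h^2 / n_h."""
    N = sum(len(values) for values in strata.values())
    total = 0.0
    for name, values in strata.items():
        N_h, n_h = len(values), allocation[name]
        W_h = N_h / N                      # stratum weight
        total += W_h ** 2 * (1 - n_h / N_h) * variance(values) / n_h
    return total

# Hypothetical population: two internally similar strata that differ from each other
strata = {
    "A": [20 + (i % 5) for i in range(60)],   # values 20..24
    "B": [50 + (i % 5) for i in range(40)],   # values 50..54
}
everyone = strata["A"] + strata["B"]

srs_var = srs_variance_of_mean(everyone, n=10)
strat_var = stratified_variance_of_mean(strata, {"A": 6, "B": 4})
print("SRS variance of the mean:       ", srs_var)
print("Stratified variance of the mean:", strat_var)
```

The gap between the two numbers comes from the between-stratum variation, which the stratified design removes from the estimator's variance.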

Non-probability sampling schemes

These include voluntary response sampling, judgment sampling, convenience sampling, and maybe
others.

In the early part of the 20th century, many important samples were done that weren't based on
probability sampling schemes. They led to some memorable mistakes. Look in an introductory statistics
text at the discussion of sampling for some interesting examples. The introductory statistics books I
usually teach from are Basic Practice of Statistics by David Moore, Freeman, and Introduction to the
Practice of Statistics by Moore and McCabe, also from Freeman. A particularly good book for a
discussion of the problems of non-probability sampling is Statistics by Freedman, Pisani, and Purves. The
detail is fascinating. Or, ask a statistics teacher to lunch and have them tell you the stories they tell in
class. Most of us like to talk about these! Someday when I have time, maybe I'll write some of them
here.

Mathematically, the important thing to recognize is that the discipline of statistics is based on the
mathematics of probability. That's about random variables. All of our formulas in statistics are based on
probabilities in sampling distributions of estimators. To create a sampling distribution of an estimator
for a sample size of 30, we must be able to consider all possible samples of size 30 and base our analysis
on how likely each individual result is.
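For a sample of size 30 the list of all possible samples is far too long to write out, but the idea can be seen on a tiny case. The sketch below enumerates every sample of size 2 from a made-up population of five values and tabulates the sampling distribution of the sample mean. Under simple random sampling each of these samples is equally likely.

```python
from collections import Counter
from itertools import combinations
from statistics import mean

# Hypothetical tiny population, chosen only to keep the enumeration readable
population = [2, 4, 6, 8, 10]
n = 2

samples = list(combinations(population, n))      # all 10 possible samples of size 2
sample_means = [mean(s) for s in samples]

# Sampling distribution: each possible sample mean with its probability
distribution = Counter(sample_means)
for value in sorted(distribution):
    print(value, distribution[value] / len(samples))

# The mean of the sampling distribution equals the population mean (both are 6)
print(mean(sample_means), mean(population))
```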

Cluster Sampling Vs. Stratified Sampling


There are various methods by which sampling can be done. This article focuses on cluster sampling vs.
stratified sampling, which are two different sampling methods. The main
difference between them is that in cluster sampling the cluster is treated as the sampling unit. Hence, in the
first stage, the sample is drawn from a population of clusters. In stratified sampling, the elements within
each stratum are sampled.

Cluster Sampling
In this mode of sampling, naturally occurring groups are selected for inclusion in the sample.
Its main use is in market research. In this method, the total population is divided into groups (clusters),
after which a sample of the groups is selected.
After this process, the required data is collected from all the elements of all the selected groups.
At times, instead of collecting information from every element of each selected group, information can be
collected from a sub-sample of the elements. This technique works best when most of the variation is between
the members within the groups and not between the groups themselves. Before you start using this
method, make sure that the clusters are collectively exhaustive and mutually
exclusive.

Stratified Sampling
In this technique, the population is divided into strata, and a random sample is drawn from each stratum.
Creating separate strata allows the use of a different sampling percentage
in each stratum. The strata are simply groups, each consisting of a number of
elements. Within each stratum, simple random selection is performed. Make sure that every
element is assigned to only one stratum. This method is known to produce a weighted mean whose
variability is less than that of the arithmetic mean of a simple random sample of the population.
Even in stratified sampling, the strata should be collectively exhaustive and mutually exclusive.
This helps in applying random or systematic sampling in each stratum. It also
helps in the reduction of errors.
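The weighted mean mentioned above can be sketched as follows: each stratum's sample mean is weighted by that stratum's share of the population, so strata sampled at different percentages are still represented correctly. The stratum sizes and sample values are hypothetical.

```python
from statistics import mean

def stratified_mean(stratum_sizes, stratum_samples):
    """Weighted mean: sum over strata of (N_h / N) * (sample mean of stratum h)."""
    N = sum(stratum_sizes.values())
    return sum(
        stratum_sizes[h] / N * mean(stratum_samples[h])
        for h in stratum_sizes
    )

# Hypothetical strata, deliberately sampled at different percentages
sizes = {"A": 600, "B": 400}
samples = {"A": [10, 12, 14], "B": [20, 22, 24, 26]}

estimate = stratified_mean(sizes, samples)        # 0.6 * 12 + 0.4 * 23
pooled = mean(samples["A"] + samples["B"])        # unweighted mean, over-represents B
print(estimate, pooled)
```

The unweighted pooled mean is pulled toward stratum B because B was sampled more heavily. The weighting corrects for that.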

Cluster Vs. Stratified


Cluster Sampling

Application: It is used when natural groupings are evident in a statistical population.

Choice: It can be chosen when the groups are similar to one another, with most of the variation lying
among the members within each group.

Advantage: The method is cheaper than the other methods.

Disadvantage: The main disadvantage is that it introduces higher sampling error.
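A one-stage cluster sample, as described earlier, can be sketched like this: check that the clusters partition the population, pick whole clusters at random, and keep every element of the chosen clusters. The city blocks and household labels are invented for the example.

```python
import random

def is_partition(clusters, population):
    """True if clusters are mutually exclusive and collectively exhaustive."""
    listed = [unit for members in clusters.values() for unit in members]
    return len(listed) == len(set(listed)) and set(listed) == set(population)

def cluster_sample(clusters, n_clusters, seed=None):
    """Select whole clusters by SRS and include every element of each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical population: 12 city blocks with 8 households each
clusters = {
    f"block_{b:02d}": [f"household_{b:02d}_{h}" for h in range(8)]
    for b in range(12)
}
population = [unit for members in clusters.values() for unit in members]

assert is_partition(clusters, population)
sample = cluster_sample(clusters, n_clusters=3, seed=1)
print({name: len(members) for name, members in sample.items()})
```

The cost saving comes from visiting only three blocks. The price is that households in the same block may resemble each other, which is where the higher error comes from.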

Stratified Sampling

Application: In this method, the members are grouped into relatively homogeneous groups. This
allows greater balancing of statistical power of tests.

Choice: It is a good option for a heterogeneous population.

Advantages: This method ignores the irrelevant subpopulations and focuses on the crucial ones.
Different sampling techniques can be used in different strata. This also helps in improving the efficiency
and accuracy of the estimation.

Disadvantage: It requires a choice of relevant stratification variables, which can be tough at
times. It is not very useful when there are no homogeneous subgroups, and its implementation can be
expensive. If accurate information about the population is not available, an error may be
introduced.

This article should help readers who are unsure which sampling method
to choose. Though both methods are appropriate, one must choose according to one's needs and
the availability of data.

Case Study

The following data show the electricity used in one month by 100 residential
consumers in a certain locality of Karachi. Treat the usage as a random variable x.
i. Draw a random sample of size n = 15 by using random numbers.
ii. Find the sample mean and standard deviation.
iii. Find a 95% confidence interval for the population mean.
iv. Find the population mean and standard deviation.
v. Comment on the results.

5 93 95 103 30 105 30 40 17 87
12 91 83 95 93 91 93 126 60 98
19 89 64 97 85 52 140 67 150 50
24 87 120 64 43 87 144 60 25 83
25 86 55 68 135 142 160 65 60 84
120 102 18 82 130 75 111 155 146 41
82 100 103 75 165 121 23 21 70 42
81 78 74 73 15 84 85 85 69 90
124 63 98 100 124 40 52 98 98 45
80 105 106 122 87 142 81 95 86 55
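One way to work through steps i to iv is sketched below in Python. The seed is arbitrary, so a different seed gives a different sample and a different interval. The critical value 2.145 is the two-sided 95% t value for 14 degrees of freedom, taken from a standard t table. The interval assumes the usual t-based formula and ignores the finite population correction, both of which are assumptions of this sketch rather than requirements stated in the exercise.

```python
import random
from math import sqrt
from statistics import mean, pstdev, stdev

# Electricity usage of the 100 consumers, copied row by row from the table above
usage = [
    5, 93, 95, 103, 30, 105, 30, 40, 17, 87,
    12, 91, 83, 95, 93, 91, 93, 126, 60, 98,
    19, 89, 64, 97, 85, 52, 140, 67, 150, 50,
    24, 87, 120, 64, 43, 87, 144, 60, 25, 83,
    25, 86, 55, 68, 135, 142, 160, 65, 60, 84,
    120, 102, 18, 82, 130, 75, 111, 155, 146, 41,
    82, 100, 103, 75, 165, 121, 23, 21, 70, 42,
    81, 78, 74, 73, 15, 84, 85, 85, 69, 90,
    124, 63, 98, 100, 124, 40, 52, 98, 98, 45,
    80, 105, 106, 122, 87, 142, 81, 95, 86, 55,
]

# i. Simple random sample of size 15 (the seed is an arbitrary choice)
rng = random.Random(2024)
sample = rng.sample(usage, 15)

# ii. Sample mean and sample standard deviation (divisor n - 1)
x_bar = mean(sample)
s = stdev(sample)

# iii. 95% confidence interval for the population mean
t_crit = 2.145                        # t table value, 14 degrees of freedom
half_width = t_crit * s / sqrt(len(sample))
ci = (x_bar - half_width, x_bar + half_width)

# iv. Population mean and standard deviation (divisor N, all 100 values)
mu = mean(usage)
sigma = pstdev(usage)

print("sample:", sample)
print("sample mean, sd:", x_bar, s)
print("95% CI:", ci)
print("population mean, sd:", mu, sigma)
print("CI covers population mean:", ci[0] <= mu <= ci[1])
```

For step v, the last line gives a starting point: over many repeated samples, roughly 95% of such intervals would be expected to cover the population mean.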

Case Study

More recent results suggest a different value for body temperature. Dr. Philip Mackowiak,
Dr. Steven Wasserman, and Dr. Myron Levine, researchers at the University of Maryland, conducted clinical
tests, and we will analyze some of the results they obtained. Listed in the following table are the body
temperatures of 106 healthy adults.

98.6 98.8 98.5 99.6 98.0 96.9 98.8 98.6 98.9 98.0
98.6 98.6 97.3 98.7 97.8 97.6 98.7 98.3 98.4 98.4
98.0 97.0 98.7 99.4 98.0 97.1 97.8 98.7 98.6 97.8
98.0 97.0 97.4 98.2 98.4 97.9 98.0 98.8 97.1 98.4
99.0 98.8 98.9 98.0 98.6 98.4 97.1 99.1 97.9 97.4
98.4 97.6 98.6 98.6 98.6 97.3 97.4 98.6 98.8 98.0
98.4 97.7 99.5 98.6 97.8 98.0 99.4 97.9 98.7 97.0
98.4 98.8 97.5 97.2 99.0 97.5 98.4 98.8 97.6 98.4
98.0 97.3 98.4 96.5 97.6 98.6 98.0 98.2 98.6 98.0
97.6 98.6 97.6 98.2 98.4 98.7 99.2 98.6 98.3 98.2
98.2 98.0 98.5 98.5 98.5 97.8
