Cluster Sampling

Cluster sampling:Cluster sampling is a sampling technique used when
"natural" but relatively homogeneous groupings are

evident in a statistical population. It is often used in
marketing research. In this technique, the total
population is divided into these groups (or clusters)
and a simple random sample of the groups is
selected. Then the required information is collected
from a simple random sample of the elements within
each selected group. This may be done for every
element in these groups or a sub sample of elements
may be selected within each of these groups. A
common motivation for cluster sampling is to reduce
the total number of interviews and costs given the
desired accuracy. Assuming a fixed sample size, the
technique gives more accurate results when most of
the variation in the population is within the groups,
not between them.
Cluster elements
The population within a cluster should ideally be as
heterogeneous as possible, but there should be
homogeneity between cluster means. Each cluster
should be a small scale representation of the total
population. The clusters should be mutually
exclusive and collectively exhaustive. A random
sampling technique is then used on any relevant
clusters to choose which clusters to include in the
study. In single-stage cluster sampling, all the
elements from each of the selected clusters are used.
In two-stage cluster sampling, a random sampling
technique is applied to the elements from each of the
selected clusters.
The main difference between cluster sampling and
stratified sampling is that in cluster sampling the
cluster is treated as the sampling unit so analysis is
done on a population of clusters (at least in the first
stage). In stratified sampling, the analysis is done on
elements within strata. In stratified sampling, a
random sample is drawn from each of the strata,
whereas in cluster sampling only the selected
clusters are studied. The main objective of cluster
sampling is to reduce costs by increasing sampling
efficiency. This contrasts with stratified sampling
where the main objective is to increase precision.
There also exists multi stage sampling, here more
than two steps are taken in selecting clusters from
clusters.
Aspects of cluster sampling
One version of cluster sampling is area sampling or
geographical cluster sampling. Clusters consist of
geographical areas. Because a geographically
dispersed population can be expensive to survey,
greater economy than simple random sampling can
be achieved by treating several respondents within a
local area as a cluster. It is usually necessary to
""

. .
, ( )
.

.

.
.
,
, .

, .
.
.

. ,
.
,
.
(
)
. ,
.
,
, .
.
.
,
.

.
.
,

.
increase the total sample size to achieve equivalent

precision in the estimators, but cost savings may
make that feasible.
In some situations, cluster analysis is only
appropriate when the clusters are approximately the
same size. This can be achieved by combining
clusters. If this is not possible, probability
proportionate to size sampling is used. In this
method, the probability of selecting any cluster
varies with the size of the cluster, giving larger
clusters a greater probability of selection and smaller
clusters a lower probability. However, if clusters are
selected with probability proportionate to size, the
same number of interviews should be carried out in
each sampled cluster so that each unit sampled has
the same probability of selection.
Cluster sampling is used to estimate high mortalities
in cases such as wars, famines and natural disasters
Advantages
Can be cheaper than other methods e.g. fewer
travel expenses, administration costs
estimators
Disadvantages
Higher sampling error, which can be expressed in the
so-called "design effect", the ratio between the
number of subjects in the cluster study and the
number of subjects in an equally reliable, randomly
sampled un clustered study
Two-stage cluster sampling, a simple case of
multistage sampling, is obtained by selecting cluster
samples in the first stage and then selecting sample
of elements from every sampled cluster. Consider a
population of N clusters in total. In the first stage, n
clusters are selected using ordinary cluster sampling
method. In the second stage, simple random
sampling is usually used. It is used separately in
every cluster and the numbers of elements selected
from different clusters are not necessarily equal. The
total number of clusters N, number of clusters
selected n, and numbers of elements from selected
clusters need to be pre-determined by the survey
designer. Two-stage cluster sampling aims at
minimizing survey costs and at the same time
controlling the uncertainty related to estimates of
interest. This method can be used in health and
social sciences. For instance, researchers used twostage cluster sampling to generate a representative
sample of the Iraqi population to conduct mortality
surveys.
Multistage sampling
Multistage sampling is a complex form of cluster
sampling. Cluster sampling is a type of sampling
which involves dividing the population into groups
(or clusters). Then, one or more clusters are chosen
at random and everyone within the chosen cluster is
, .
,
. .
, .
,
,
.

,
.
,

- ,
" ",
,

,
,
, ,

. .
,
.
, used. It

. , ,

.

.
. ,

.

. (
) . ,

.
sampled.
Using all the sample elements in all the selected
clusters may be prohibitively expensive or not
necessary. Under these circumstances, multistage
cluster sampling becomes useful. Instead of using all
the elements contained in the selected clusters, the
researcher randomly selects elements from each
cluster. Constructing the clusters is the first stage.
Deciding what elements within the cluster to use is
the second stage. The technique is used frequently
when a complete list of all members of the
population does not exist and is inappropriate.
In some cases, several levels of cluster selection may
be applied before the final sample elements are
reached. For example, household surveys conducted
by the Australian Bureau of Statistics begin by
dividing metropolitan regions into 'collection
districts', and selecting some of these collection
districts (first stage). The selected collection districts
are then divided into blocks, and blocks are chosen
from within each selected collection district (second
stage). Next, dwellings are listed within each
selected block, and some of these dwellings are
selected (third stage). This method means that it is
not necessary to create a list of every dwelling in the
region, only for selected blocks. In remote areas, an
additional stage of clustering is used, in order to
reduce travel requirements.
Although cluster sampling and stratified sampling
bear some superficial similarities, they are
substantially different. In stratified sampling, a
random sample is drawn from all the strata, where in
cluster sampling only the selected clusters are
studied, either in single- or multi-stage.
Advantages
1)cost and speed that the survey can be done in
2)convenience of finding the survey sample
3)normally more accurate than cluster sampling for
the same size sample
Disadvantages
1) Is not as accurate as SRS if the sample is the same
size
2) More testing is difficult to do
In statistics, a simple random sample is a subset of

individuals (a sample) chosen from a larger set (a
population). Each individual is chosen randomly and
entirely by chance, such that each individual has the
same probability of being chosen at any stage during
the sampling process, and each subset of k
individuals has the same probability of being chosen
for the sample as any other subset of k individuals.
[1] This process and technique is known as simple
random sampling, and should not be confused with
, ( )
. , multistage
.

. .

.
.
,
. ,
' '
, ( )
. ,
( ) . ,
, ( )
.
.
,
.
,
. ,

, .
1)
2)
3)

1)
2)
( ) .

,

. [1]
,
.
random sampling. A simple random sample is an

unbiased surveying technique.
Simple random sampling is a basic type of sampling,
since it can be a component of other more complex
sampling methods. The principle of simple random
sampling is that every object has the same possibility
to be chosen. For example, N college students want
to get a ticket for a basketball game, but there are not
enough tickets (X) for them, so they decide to have a
fair way to see who gets to go. Then, everybody is
given a number (0 to N-1), and random numbers are
generated, either electronically or from a table of
random numbers. Non-existent numbers are ignored,
as are any numbers previously selected. The first X
numbers would be the lucky ticket winners.
In small populations and often in large ones, such
sampling is typically done "without replacement" ,
i.e., one deliberately avoids choosing any member of
the population more than once. Although simple
random sampling can be conducted with replacement
instead, this is less common and would normally be
described more fully as simple random sampling
with replacement. Sampling done without
replacement is no longer independent, but still
satisfies exchangeability, hence many results still
hold. Further, for a small sample from a large
population, sampling without replacement is
approximately the same as sampling with
replacement, since the odds of choosing the same
individual twice is low.
An unbiased random selection of individuals is
important so that in the long run, the sample
represents the population. However, this does not
guarantee that a particular sample is a perfect
representation of the population. Simple random
sampling merely allows one to draw externally valid
conclusions about the entire population based on the
sample.
Conceptually, simple random sampling is the
simplest of the probability sampling techniques. It
requires a complete sampling frame, which may not
be available or feasible to construct for large
populations. Even if a complete frame is available,
more efficient approaches may be possible if other
useful information is available about the units in the
population.
Advantages are that it is free of classification error,
and it requires minimum advance knowledge of the
population other than the frame. Its simplicity also
makes it relatively easy to interpret data collected via
SRS. For these reasons, simple random sampling
best suits situations where not much information is
available about the population and data collection
can be efficiently conducted on randomly distributed
items, or where the cost of sampling is small enough
.

, .
. ,

, () ,
. ,
, (0
-1) , .

.
.
,
"" , , -
.
,

.
, ,
.
, ,
, .
,
. ,
.

.
, .

, .
,
.
,
.

. ,

,

. ,
.
to make efficiency less important than simplicity. If

these conditions are not true, stratified sampling or
cluster sampling may be a better choice.
Nonprobability sampling
Sampling is the use of a subset of the population to
represent the whole population. Probability
sampling, or random sampling, is a sampling
technique in which the probability of getting any
particular sample may be calculated. Nonprobability
sampling does not meet this criterion and should be
used with caution. Nonprobability sampling
techniques cannot be used to infer from the sample
to the general population. Any generalizations
obtained from a nonprobability sample must be
filtered through one's knowledge of the topic being
studied. Performing nonprobability sampling can be
considerably less expensive than doing probability
sampling. While probability methods are suitable for
large scale studies concerned with
representativeness, non-probability approaches are
often more suitable for in-depth qualitative research
in which the focus is often to understand complex
social phenomena. In this respect, the sampling
method chosen should match the particular focus of
the research study at hand.
Examples of nonprobability sampling include:
Convenience, Haphazard or Accidental sampling members of the population are chosen based on their
relative ease of access. To sample friends, coworkers, or shoppers at a single mall, are all
examples of convenience sampling. Such samples
are biased because researchers may unconsciously
approach some kinds of respondents and avoid
others (Lucas 2012), and respondents who volunteer
for a study may differ in unknown but important
ways from others (Wiederman 1999).
Snowball sampling - The first respondent refers a
friend. The friend also refers a friend, and so on.
Such samples are biased because they give people
with more social connections an unknown but higher
chance of selection (Berg 2006).
Judgmental sampling or Purposive sampling - The
researcher chooses the sample based on who they
think would be appropriate for the study. This is
used primarily when there is a limited number of
people that have expertise in the area being
researched.
Deviant Case - Get cases that substantially differ
from the dominant pattern (a special type of
purposive sample).
Case study - The research is limited to one group,

. , ,
, .
Nonprobability
. Nonprobability
.
nonprobability
.
Nonprobability
.
,

. ,
.
Nonprobability :
, -
. , ,
, .

( 2012) ,

(Wiederman
1999) .
- .
, .
( 2006)
.
-
.
.
Deviant - ( )
.
-
, .
- (65% )

.
often with a similar characteristic or of small size.

ad hoc quotas - A quota is established (say 65%
women) and researchers are free to choose any
respondent they wish as long as the quota is met.
Even studies intended to be probability studies
sometimes end up being non-probability studies due
to unintentional or unavoidable characteristics of the
sampling method. In public opinion polling by
private companies (or other organizations unable to
require response), the sample can be self-selected
rather than random. This often introduces an
important type of error: self-selection bias. This error
sometimes makes it unlikely that the sample will
accurately represent the broader population.
Volunteering for the sample may be determined by
characteristics such as submissiveness or
availability. The samples in such surveys sh
What is snowball sampling?
Snowball sampling uses a small pool of initial
informants to nominate, through their social
networks, other participants who meet the eligibility
criteria and could potentially contribute to a specific
study. The term snowball sampling reflects an
analogy to a snowball increasing in size as it rolls
downhill [9]
Snowball Sampling is a method used to obtain
research and knowledge, from extended associations,
through previous acquaintances, Snowball sampling
uses recommendations to find people with the
specific range of skills that has been determined as
being useful. An individual or a group receives
information from different places through a mutual
intermediary. This is referred to metaphorically as
snowball sampling because as more relationships are
built through mutual association, more connections
can be made through those new relationships and a
plethora of information can be shared and collected,
much like a snowball that rolls and increases in size
as it collects more snow. Snowball sampling is a
useful tool for building networks and increasing the
number of participants. However, the success of this
technique depends greatly on the initial contacts and
connections made. Thus it is important to correlate
with those that are popular and honorable to create
more opportunities to grow, but also to create a
credible and dependable reputation
Stratified sampling
In statistics, stratified sampling is a method of
sampling from a population.
In statistical surveys, when sub populations within
an overall population vary, it is advantageous to
sample each sub population (stratum) independently.
Stratification is the process of dividing members of

. (
) , . - :
.
.

.
?
, ,

. " " [9]

, ,
, " .

"

. ,

.

. ,
.

,

, .
, ,
() .

. :
the population into homogeneous subgroups before

sampling. The strata should be mutually exclusive:
every element in the population must be assigned to
only one stratum. The strata should also be
collectively exhaustive: no population element can
be excluded. Then simple random sampling or
systematic sampling is applied within each stratum.
This often improves the representativeness of the
sample by reducing sampling error. It can produce a
weighted mean that has less variability than the
arithmetic mean of a simple random sample of the
population.
In computational statistics, stratified sampling is a
method of variance reduction when Monte Carlo
methods are used to estimate population statistics
from a known population.
Sequential analysis
In statistics, sequential analysis or sequential
hypothesis testing is statistical analysis where the
sample size is not fixed in advance. Instead data are
evaluated as they are collected, and further sampling
is stopped in accordance with a pre-defined stopping
rule as soon as significant results are observed. Thus
a conclusion may sometimes be reached at a much
earlier stage than would be possible with more
classical hypothesis testing or estimation, at
consequently lower financial and/or human cost.
Probability and Non-probability Sampling
The term probability samplingis used when the
selection of the sample is purely based on chance.
The human mind has no control on the selection or
non- selection of the units for the sample. Every unit
of the population has known nonzero probability of
being selected for the sample. The probability of
selection may b equal or unequal but it should be
non-zero and should be known. The probability
samplingis also called the random sampling (not
simple random sampling). Some examples of
random sampling are:
Simple random sampling.

Stratified random sampling.
Systematic random sampling.
In non-probability sampling,the sample is not based
on chance. It is rather determined by some person.
We cannot assign to an element of population the
probability of its being selected in the sample.
Somebody may use his personal judgment in the
selection of the sample. In this case the sampling is
called judgment sampling.A drawback in nonprobability sampling is that such a sample cannot be
used to determine the error. Any statistical method
cannot be used to draw inference from this sample.
But it should be remembered that judgment sampling
becomes essential in some situations. Suppose we
: .
.
.

.
,

.

,
.
,

.
/

.

samplingis.
.
.
,
. samplingis (
) . :
.
.
, .
.
.
.
sampling.A ,
.

.
.
.
have to take a small sample from a big heap of coal.

We cannot make a list of all the pieces of coal. The
upper part of the heap will have perhaps big pieces
of coal. We have to use our judgment in selecting a
sample to have an idea about the quality of coal. The
nonprobability sampling is also called non-random
sampling.
. .

.
.
Sampling
In statistics, quality assurance, and survey
methodology, sampling is concerned with the
selection of a subset of individuals from within a
statistical population to estimate characteristics of
the whole population. Acceptance sampling is used
to determine if a production lot of material meets the
governing specifications. Two advantages of
sampling are that the cost is lower and data
collection is faster than measuring the entire
population.
Each observation measures one or more properties
(such as weight, location, color) of observable bodies
distinguished as independent objects or individuals.
In survey sampling, weights can be applied to the
data to adjust for the sample design, particularly
stratified sampling (blocking). Results from
probability theory and statistical theory are
employed to guide practice. In business and medical
research, sampling is widely used for gathering
information about a population.
The sampling process comprises several stages:
Defining the population of concern
Specifying a sampling frame, a set of items or events
possible to measure
Specifying a sampling method for selecting items or
events from the frame
Determining the sample size
Implementing the sampling plan
Sampling and data collecting
, , ,

.
,
.
.

( , , ) .
,
, ()
.
. ,
.
:

,

Cluster Sampling

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Cluster Sampling

Uploaded by

Copyright:

Available Formats

Cluster sampling:Cluster sampling is a sampling technique used when

"natural" but relatively homogeneous groupings are

increase the total sample size to achieve equivalent

In statistics, a simple random sample is a subset of

random sampling. A simple random sample is an

to make efficiency less important than simplicity. If

often with a similar characteristic or of small size.

the population into homogeneous subgroups before

Simple random sampling.

have to take a small sample from a big heap of coal.

You might also like