Prof. Lhars M.

Biostatistics with Epidemiology Program
Cagayan State University
By the end of the session, the learner should be able to:
1. Identify examples of the applications of sampling in
health research;
2. Define the meaning of basic sampling concepts;
3. Differentiate between probability and non-probability
sampling designs; and
4. Describe the procedures in implementing the basic
probability sampling designs.
a. Sampling of individuals
 Determining the health status of populations
 Evaluating the effectiveness of health measures
b. Sampling of health facilities/institutions
 Evaluating the performance of health facilities (ex.,
level of utilization; adequacy of equipment, etc.)
 Assessing the status of health facilities (ex.,
determining the level of disaster preparedness of
c. Sampling of communities
 Determining investments of LGUs in health (ex., % of total
budget allocated for health; preparation and implementation
of Disaster Preparedness Plan)
 Determining status of communities in relation to
environmental health and climate change variables (ex., level
of air pollution; amount of rainfall; temperature change, etc.)
d. Sampling of non-human populations
 Water sampling to determine potability
 Sampling of shellfish to determine incidence of red tide or
sampling of fish sold in markets to determine incidence of
use of formalin as preservative
a. It is cheaper.
b. It is faster.
c. Better quality of information can be obtained.
d. More comprehensive data may be obtained.
e. It is the only possible method when the
procedure is destructive.
a. Population – the entire group of individuals or items of
interest in the study
b. Target population – the group from which representative
information is desired and to which inferences will be made.
Whatever conclusions are derived from the study will be
generalized to the target population.
c. Sampling Population – the population from which a sample
will actually be taken
Ideally, the target population should be the same as the sampling population. However,
there are instances when there is a gap between the two, resulting from limited
resources and other field realities. When this occurs, it is important for the
investigator to determine the extent and direction of the bias (if any) created by the gap
between the target and the sampling population.
d. Elementary unit or elements – an object or a person
on which a measurement is actually taken
e. Sampling Unit – units which are chosen in selecting
the sample
f. Sampling frame – a listing or any other material like
spot maps or aerial photographs which shows or
accounts for the target population. It is a collection
of the sampling units
4.1 Non-probability Sampling Designs
a. The probability that each member of the sampling population is
selected in the sample is difficult to determine or cannot be
specified; hence the reliability of the resulting estimates
cannot be assessed.
b. The external validity of the results becomes an issue
c. Examples of non-probability sampling designs are:
 Purposive sampling
 Judgment sampling
 Convenience or accidental sampling
 Snow-ball technique or referral sampling
d. These are the types of designs usually used in qualitative studies.
4.2 Probability Sampling Designs
a. The rules and procedures for selecting the sample
and estimating the parameters are explicitly and
rigidly specified.
b. The reliability of the resulting estimates can be assessed.
c. Most quantitative studies use probability sampling
designs in the selection of subjects.
a. Where? – geographic area to be covered by the survey
b. Who? – elements (households, mothers, infants, etc.) to be
studied in the survey. In cases when the subjects of the survey
are not in a position to provide the information (ex., young
children, sick elderly), the actual survey respondents must be
c. How many? – sample size; number of elements to be included
in the survey and its basis. The values of important parameters
considered in sample size determination must be explicitly
indicated (ex., specific variable used as basis, anticipated value
of the variable, confidence level; margin of error; power of the
test, etc.)
d. How to select? – procedures to be followed in selecting the
elements to be included. If stratification variables are used,
these should be mentioned with a concise justification why
they were considered. If multi-stage sampling is used, the
sampling units at each level of selection and the
corresponding sampling frames used must be mentioned.
e. When? – refers to the time period for the conduct of the
survey. This is an important consideration when the
variable being studied has seasonality
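As an illustration of how the parameters listed under "How many?" enter a sample size computation, the sketch below uses the common formula for estimating a single proportion, n = z^2 p(1 - p) / d^2. This formula does not appear in the slides; it is assumed here, and it further assumes simple random sampling from a large population with no finite population correction. The function name and the input values (anticipated proportion of 0.5, margin of error of 0.05) are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_proportion(p, margin, confidence=0.95):
    """Sample size to estimate a proportion p within +/- margin.

    Assumes simple random sampling from a large population
    (no finite population correction, no design effect).
    """
    # z-value for a two-sided interval at the chosen confidence level
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Round up so the target precision is not understated
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


# Hypothetical inputs: anticipated proportion 0.5, margin of error 0.05
n = sample_size_proportion(p=0.5, margin=0.05, confidence=0.95)
print(n)
```

A sampling plan would state each of these inputs explicitly, together with the variable on which the anticipated value is based.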
In simple random sampling, every element in the population has an
equal chance of being included in the sample.
Procedures for Sample Selection
a. Prepare the sampling frame
b. Number all the population elements in the sampling
frame consecutively from 1 to N, where N is the
population size
c. Determine the required sample size, n.
d. Select n numbers at random between 1 and N,
using either the lottery method or computer-generated
random numbers using a software like
e. The population elements in the list whose
numbers correspond to the n numbers randomly
selected will comprise the simple random sample
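The selection steps above can be sketched in a few lines of Python. This is a minimal sketch, not a prescribed tool: the frame of 800 household labels is hypothetical, and the standard library `random` module stands in for whatever software is used to generate the random numbers.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Select n elements from the sampling frame without replacement."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    N = len(frame)
    # Number the elements 1..N and draw n distinct numbers at random
    chosen_numbers = sorted(rng.sample(range(1, N + 1), n))
    # The elements whose numbers were drawn make up the sample
    return [(number, frame[number - 1]) for number in chosen_numbers]


# Hypothetical sampling frame: 800 household labels
frame = [f"HH-{i:03d}" for i in range(1, 801)]
sample = simple_random_sample(frame, n=200, seed=1)
print(len(sample), sample[:3])
```

Drawing without replacement means no household can be selected twice, which matches the lottery method.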
Stratified random sampling is used when the investigator wants to:
a. ensure that groups of interest or subsections of the
population considered important for the study are
adequately represented
b. derive reasonably precise estimates for important
subsections of the population
1. Identify the stratification variable.
2. Classify the population elements according to the
categories of the stratification variable
3. Number the population elements consecutively from 1
to N, within each category of the stratification variable.
4. Determine the sample size needed from each stratum
5. Within each stratum, select the required number of
samples by simple random sampling.
Suppose we have the following:
N = 800 households, of which N_urban = 320 and N_rural = 480
n = 200 households, of which n_urban = 80 and n_rural = 120
SAMPLING DESIGN: Simple random sampling
SAMPLING FRAME: List of 800 households, numbered consecutively from 1 to 800
SELECTION: Select 200 numbers at random between 1 and 800

SAMPLING DESIGN: Stratified random sampling
SAMPLING FRAME: Two sampling frames are needed:
a. For URBAN areas: list of 320 urban households, numbered consecutively from 1 to 320
b. For RURAL areas: list of 480 rural households, numbered consecutively from 1 to 480
SELECTION: Urban and rural samples are selected separately, as follows:
a. For urban areas, 80 numbers are selected at random between 1 and 320
b. For rural areas, 120 numbers are selected at random between 1 and 480
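The stratified selection in this example can be sketched as follows. The stratum sizes and sample sizes come from the example; the function name, the dictionary layout, and the fixed seed are assumptions made for illustration.

```python
import random


def stratified_sample(stratum_sizes, stratum_samples, seed=None):
    """Run a separate simple random sample inside each stratum."""
    rng = random.Random(seed)
    selected = {}
    for stratum, N_h in stratum_sizes.items():
        n_h = stratum_samples[stratum]
        # Each stratum has its own frame, numbered 1..N_h
        selected[stratum] = sorted(rng.sample(range(1, N_h + 1), n_h))
    return selected


stratum_sizes = {"urban": 320, "rural": 480}    # N_urban, N_rural
stratum_samples = {"urban": 80, "rural": 120}   # n_urban, n_rural
selected = stratified_sample(stratum_sizes, stratum_samples, seed=1)
print({stratum: len(numbers) for stratum, numbers in selected.items()})
```

Because each stratum is sampled from its own frame, the number 5 in the urban list and the number 5 in the rural list refer to different households.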
Suppose we want to allocate 250 samples to 3 sample barangays
included in the study. These 3 barangays have the following populations:

Barangay   Population   % of total
A               3000         15.0
B              10500         52.5
C               6500         32.5
TOTAL          20000        100.0
The 250 samples can be allocated to the 3 barangays to reflect the
population distribution as follows:

Barangay   Population   % of total   Sample size   % of sample
A               3000         15.0            38          15.0
B              10500         52.5           131          52.5
C               6500         32.5            81          32.5
TOTAL          20000        100.0           250         100.0
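The allocation above can be reproduced with a short calculation. The slides do not say how fractional sample sizes (for example 37.5 for barangay A) were rounded; the sketch below assumes a largest-remainder rule so that the allocated sizes always add up to the total sample. The function name is hypothetical.

```python
import math


def proportional_allocation(populations, n):
    """Allocate n samples to strata in proportion to population size."""
    total = sum(populations.values())
    # Exact (possibly fractional) share of each stratum
    exact = {name: n * size / total for name, size in populations.items()}
    # Start from the rounded-down shares
    allocation = {name: math.floor(share) for name, share in exact.items()}
    # Hand out the remaining units to the largest fractional parts
    # (assumed rounding rule; not stated in the source)
    shortfall = n - sum(allocation.values())
    by_remainder = sorted(
        exact, key=lambda name: exact[name] - allocation[name], reverse=True
    )
    for name in by_remainder[:shortfall]:
        allocation[name] += 1
    return allocation


populations = {"A": 3000, "B": 10500, "C": 6500}
allocation = proportional_allocation(populations, 250)
print(allocation)
```

The exact shares are 37.5, 131.25 and 81.25, so barangay A receives the one unit left over after rounding down.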
a. In systematic sampling, every element has an equal chance of being selected.
b. It is often used under the following conditions:
 the population elements are too many to list or
to number consecutively
 a frame is not available

c. It is often used in combination with other designs

1. Determine the required sample size, n.
2. Determine the sampling interval, k, where:
k = N/n
3. Select a number at random between 1 and k. The
population element in the frame corresponding to the
random number selected will be the first to be included in
the sample
4. Include in the sample every kth population element
after the first element selected
Using the same example presented earlier where N=800
and n=200
SAMPLING DESIGN: Systematic sampling
SAMPLING FRAME: Not needed
SELECTION:
1. Compute the sampling interval, k, where k = N/n. Therefore
k = 800/200 = 4. This means that for every 4 households in the
population, 1 household will be selected as a sample.
2. Select a random number between 1 and 4. Suppose 2 was selected.
The second household in the population is then the first household
included in the sample.
3. Every 4th household thereafter will be included in the study.
These are households number 2, 6, 10, 14, 18, 22, 26, 30, 34, 38, etc.
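The same selection can be sketched in Python. This is a minimal sketch that assumes N is an exact multiple of n, as in the example; the function name and the optional `start` argument are hypothetical conveniences.

```python
import random


def systematic_sample(N, n, start=None, seed=None):
    """Return the positions selected by systematic sampling."""
    k = N // n  # sampling interval; assumes N is a multiple of n
    if start is None:
        # Random start between 1 and k, inclusive
        start = random.Random(seed).randint(1, k)
    # Take the starting position and every kth position after it
    return list(range(start, N + 1, k))[:n]


# Example from the text: N = 800, n = 200, random start of 2
households = systematic_sample(800, 200, start=2)
print(households[:10])
```

Passing `start=2` reproduces the households listed in the example; leaving it out draws the random start between 1 and k.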
a. Cluster sampling is used when a frame for the individual
elementary units in the population is not
available but a frame for groups or clusters
of elements is available.
b. The sampling unit is different from the
elementary unit.
1. Identify the groups or clusters of elementary
units. It is best if the sizes of the clusters are not
too big and do not vary much from each other.
2. Select a random sample of clusters.
3. All elements in the selected clusters will be
included in the survey.
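The three steps above can be sketched as follows. The clusters here are ten hypothetical barangays of 30 households each; the names, sizes, and number of clusters selected are assumptions for illustration only.

```python
import random


def cluster_sample(clusters, m, seed=None):
    """Select m clusters at random and keep every element in them."""
    rng = random.Random(seed)
    # The sampling unit is the cluster, not the elementary unit
    chosen = rng.sample(sorted(clusters), m)
    # All elements of each selected cluster enter the survey
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical frame of clusters: 10 barangays, 30 households each
clusters = {
    f"Barangay {i}": [f"B{i}-HH{j}" for j in range(1, 31)]
    for i in range(1, 11)
}
surveyed = cluster_sample(clusters, m=3, seed=1)
print({name: len(members) for name, members in surveyed.items()})
```

Only a list of clusters is needed up front; the household lists are used only for the clusters that end up in the sample.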
a. Multi-stage sampling is generally used when the survey has a wide
coverage and a sampling frame for the elementary
units is difficult to obtain.
b. Sampling is done in successive stages.
c. Data collection is concentrated only on the samples
selected at each stage, resulting in lower cost per unit
of inquiry.
d. Statistical analysis of the data is more complicated.
Procedures for Sample Selection:
1. Determine the number of stages of selection to be used in the
sampling design and the sampling units to be used at each stage.
2. Determine the sample size necessary for each stage of selection.
3. Prepare the sampling frame for the 1st stage of selection, and select
at random a sample of primary sampling units (PSUs).
4. For each of the PSUs earlier selected, prepare the sampling frame for
the 2nd stage of selection. Randomly select the corresponding
number of secondary sampling units (SSUs) from each PSU included in
the sample.
5. Repeat the process of frame preparation and sample selection until
the last stage of sampling is reached.
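A two-stage version of this procedure can be sketched as follows. The sketch assumes barangays as PSUs and households as SSUs, with simple random sampling at both stages; the frame, the stage sample sizes, and the function name are hypothetical.

```python
import random


def two_stage_sample(psus, n_psu, n_ssu, seed=None):
    """Stage 1: sample PSUs. Stage 2: sample SSUs within each chosen PSU."""
    rng = random.Random(seed)
    # Stage 1: select primary sampling units from the first-stage frame
    chosen_psus = rng.sample(sorted(psus), n_psu)
    sample = {}
    for psu in chosen_psus:
        # A second-stage frame is needed only for the selected PSUs
        frame = psus[psu]
        sample[psu] = rng.sample(frame, min(n_ssu, len(frame)))
    return sample


# Hypothetical frame: 12 barangays (PSUs), 40 households (SSUs) each
psus = {
    f"Barangay {i}": [f"B{i}-HH{j}" for j in range(1, 41)]
    for i in range(1, 13)
}
two_stage = two_stage_sample(psus, n_psu=4, n_ssu=10, seed=1)
print({psu: len(households) for psu, households in two_stage.items()})
```

Adding further stages would repeat the inner step: prepare a frame for each selected unit, then sample from it.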

