
Chapter Two

Sampling Theory

11/14/2023 1
Basic Concepts of Sampling Theory
 Population (or universe): the group of all elements or
observations (persons, animals, objects, measurements,
etc.) under consideration in a given problem.
The word population is a technical term in statistics, not
necessarily referring to people.
Examples:
 All students in this university;
 All households in Hawassa city;
All light bulbs produced by a firm in a single day;
All fish in a lake, etc.

 Census is a collection of data from the whole population
(that is, complete enumeration).
It is the actual measurement or observation of all
possible elements from the population or it is a survey of
everyone in the population.
 Reference population (source/target population):
the population of interest, to which the researcher
would like to generalize the results of the study.
Example: If a researcher would like to study the effect of
a new fertilizer on crop yield in Ethiopia, then the
reference population is all farmers in Ethiopia who are
using the new fertilizer.

 Sampling theory is a study of relationships existing
between a population and samples drawn from the
population.
Attaining a specified precision at minimum cost is the
main intention of sampling theory.
 In sampling theory population is often required as an
assumption.
 Sample: a part (subset) of a population that is chosen
for the study so that some generalizations about the
population can be made.

The main concern in sampling is to ensure that the
sample accurately represents the population we are
interested in studying.
That is, samples are taken in such a way that they will
be representative of the population.
Sampling is the process of selecting a finite number of
elements (a sample) from a given population of interest
for the purposes of an inquiry.

Sample size is the number of individuals or observations
in a sample (usually denoted by n).
Parameter is any measurable characteristic of a
population.
 Example: Population means, Population standard
deviations, population medians, etc.
Statistic is a number computed from sample data; it is any
measurable characteristic of a sample.
Example: sample means, sample standard deviations,
sample medians, etc.
 A statistic is used to estimate a population parameter
such as the population mean, the population standard
deviation, etc.
The sampling error is the difference between a sample
statistic and its corresponding population parameter.
It is the error that occurs because a sample has been taken
instead of a census.
 For example: the sample mean may differ from the true
population mean.
Sampling unit is the ultimate unit to be sampled (an
element of the population to be sampled); that is, it is the
unit of selection in the sampling process.

Examples
 In a sample of households, the sampling unit is a
household;
 In a sample of students, a student is the sampling unit.
 In a sample of districts, the sampling unit is a district, etc.
Sampling frame: the list of all possible units in the
reference population, from which a sample is to be drawn.

Example: If a researcher would like to do a research on
poverty levels of residents in a town and if she/he
decided that the sampling unit for the study is an
individual, then the sampling frame would be the list
of all individuals living in that town.
Sample design is a set of procedures for selecting the
units from the population that are to be in the sample.
Sampling fraction: the ratio of the number of units in the
sample to the number of units in the sampling frame or in
the reference population (n/N). Its reciprocal (N/n) is the
sampling interval.

For example, a sampling fraction (ratio) of 1:3
corresponds to a sampling interval of 3, that is, 1 unit
is selected in every 3 units.
This means that the sample constitutes 33.3% of
the total units in the sampling frame or in the
reference population.
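
The arithmetic behind the 1:3 example can be sketched in a few lines of Python. The function names and the sizes n = 100 and N = 300 are assumptions chosen only to reproduce the 1:3 ratio; they are not taken from the slides.

```python
from fractions import Fraction

def sampling_fraction(n: int, N: int) -> Fraction:
    """Ratio of sample size n to frame (or population) size N."""
    return Fraction(n, N)

def sampling_interval(n: int, N: int) -> Fraction:
    """Reciprocal of the sampling fraction: one unit is taken in every N/n units."""
    return Fraction(N, n)

# Hypothetical sizes that give the 1:3 ratio used in the example.
f = sampling_fraction(100, 300)
k = sampling_interval(100, 300)
print(f, k, f"{float(f):.1%}")  # 1/3 3 33.3%
```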

An application of the terminologies
Population: All students in Hawassa University in 2012 E.C.
entry
Sampling Frame: All students appearing in the list of students
prepared by the registrar on October 30, 2012 E.C.
Sample design: Probability sampling
Sample size: 200 students selected from the sampling frame.
Sampling unit (unit of analysis): a student
Statistic: Students in the sample have spent an average of 300
birr per month.
Parameter: Students in the university are probably spending, on
average, between 250 birr and 350 birr per month (estimate
derived from sample statistic).

Reasons for Sampling
Why a Sample instead of a census?
 When studying characteristics of a population, there
are many practical reasons why we prefer to select
samples of a population.
 A census can be extremely expensive and time-consuming.
Contacting every member of a large population would
require great expenditures of time and money, whereas
sampling from the list can provide satisfactory results
more quickly and at much lower cost.
Efficiency is the commonly known advantage of
sampling.

 The physical impossibility of checking all items in the
population (sometimes census is impossible)
Example: populations of fish, birds, mosquitoes and
the like are large and constantly moving, being born
and dying.
Therefore, we just take some samples to do research,
as it is impractical to conduct a census of such types of
populations.
 A census can be destructive:
The Awash wine factory, like every other winery, employs
wine tasters to ensure the consistency of product quality.
Naturally, it would be counterproductive if the tasters
consumed all of the wine, since none would be left to sell
to the thirsty customers.
 The sample results are usually adequate: In practice, a
sample can even be more accurate than a census, because
a smaller data collection effort is easier to supervise and
check carefully.
 Speed: The collection and analysis of data can be
done more quickly if the data are not excessive.
Time and energy are saved.
That is, the data can be collected and summarized
more quickly with a sample than with a census.

 It enables the researcher to get more detailed
information about a particular subject under
investigation.
Disadvantages of sampling:
Reliability: If the sample is not truly representative of
the population, then we may sacrifice reliability in favor
of less time and money.
If complete information is required on each and every
element of the population, census should be applied.

Sampling Methods
Sampling involves the selection of a number of study
units from a defined population.
Sampling methods can be categorized as probability
and non-probability.
 Probability sampling: a sample is selected such that
each item in the population being studied has a known
chance (greater than zero) of being included in the
sample.
These methods remove human judgment from the
selection process and so tend to give a more
representative sample.
Non-probability (non-random) sampling: a sample is
selected based on convenience and judgment.
Methods of Probability Sampling:
The four basic types of probability sampling methods are:
 SIMPLE RANDOM SAMPLING,
 SYSTEMATIC SAMPLING,
 STRATIFIED SAMPLING, AND
 CLUSTER SAMPLING.
The choice will depend on the type of problem being
investigated, the aim of the research and the available
resources.

Simple Random Sample (SRS)
In SRS, each item in the population has the same known,
nonzero chance of being included in the sample.
Random samples are selected by using methods such as
random numbers (which can be generated from
computers) or lottery method.
To select a simple random sample, follow these
steps:

1. Make a numbered list of all units in the population
(sampling frame),
2. Each unit on the list should be numbered in
sequence from 1 to N (where N is the size of the
population),
3. Select the required number of study units, using a
"lottery" or a table of random numbers.

Lottery Method in SRS
1. Numbered or named papers, each representing a unit in
the population, are placed in a hat.
2. The papers are thoroughly mixed and the number of
papers equal to the sample size is selected from the
hat.
For a sample of 200 students, the researcher would
select 200 papers.
3. The sample then consists of all units of the population
corresponding to the selected papers.
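
The lottery draw above can be mimicked in code. The following is a minimal sketch, assuming a hypothetical frame of N = 1000 student numbers; `simple_random_sample` is an illustrative helper name, and Python's standard `random.sample` plays the role of drawing papers from the hat without putting them back.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n distinct units from the frame, each unit equally likely."""
    rng = random.Random(seed)     # a seed makes the draw repeatable
    return rng.sample(frame, n)   # sampling without replacement

# Hypothetical frame: units numbered in sequence from 1 to N.
N = 1000
frame = list(range(1, N + 1))
sample = simple_random_sample(frame, 200, seed=1)
print(len(sample), len(set(sample)))  # 200 200 (no unit drawn twice)
```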

Random Number Table Method in SRS
1. The researcher assigns a number to each unit of the
population and constructs the random table.
2. Then s/he randomly selects a starting place (point),
goes through the table across the rows or down the
columns and lists the numbers as they appear on the
table.
3. Members of the population with the selected numbers
constitute the sample.
4. A random number table is a list of numbers generated
by a computer that has been programmed to yield a set
of random numbers.
5. It is possible for a unit’s number to be selected more
than once.
 Advantage of SRS
 Ensures that the sample is unbiased in that every individual
and every sample has an equal chance of being chosen.
SRS is the basic sampling method assumed in most statistical
computations for surveys, so it can be used with confidence.
 Disadvantages of SRS
SRS requires a sampling frame and this is sometimes
impossible (the case of fish population),
 It is difficult to take samples if the reference population is
scattered,
If the population is extremely large, it is tedious and time
consuming to number and select the sample,
Minority subgroups of interest in the population may not be
represented in the sample.
Note that: In SRS, when we apply the table of random
numbers, we have to ignore repeated numbers and those
lying outside the range 1 to N (the population size).
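
The rule in this note can be illustrated with a short sketch. The list `stream` stands in for numbers read off a random number table; its values, and the helper name `select_from_random_numbers`, are made up for illustration.

```python
def select_from_random_numbers(numbers, N, n):
    """Read numbers in order, skipping repeats and values outside 1..N,
    until n distinct units have been selected."""
    chosen = []
    seen = set()
    for x in numbers:
        if 1 <= x <= N and x not in seen:
            chosen.append(x)
            seen.add(x)
            if len(chosen) == n:
                break
    return chosen

# Hypothetical two-digit entries read from a table, for a population of N = 50.
stream = [57, 12, 88, 12, 3, 40, 71, 3, 25]
print(select_from_random_numbers(stream, N=50, n=4))  # [12, 3, 40, 25]
```

Here 57, 88 and 71 are skipped because they exceed N, and the second occurrences of 12 and 3 are skipped as repeats.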

Systematic Sampling
(Quasi-random sampling):
The elements to be included in a sample are
picked at a constant interval.
The items or individuals of the population are
arranged in some order, and a random starting point
is selected from 1 through k, where
k = population size/sample size = N/n.
Then every kth member of the population is selected
for the sample.

In systematic sampling:
A complete list of all the elements within the population
(sampling frame) is required.
The procedure is to take every kth item from the
sampling frame.
Let N = population size; n = sample size; k = sampling
interval, k = N/n.
Choose any number between 1 and k at random; suppose it
is j (1 ≤ j ≤ k).

The jth unit is selected first, and then the (j+k)th,
(j+2k)th, etc. units are selected until the required
sample size is reached.
Example: Suppose there are 2000 subjects in the
population and a sample of 50 subjects is
needed. Select a systematic sample of these 50 subjects.
Solution: The sampling interval (k) is 40 (2000/50).
The number of the first subject to be included in the
sample is chosen randomly, for example, by blindly
picking up one out of 40 pieces of paper numbered 1 to
40.
Suppose subject 12 was the first subject selected; then the
sample would consist of the subjects whose numbers were 12,
52, 92, etc., up to 1972, so that 50 subjects are obtained.
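
The worked example can be checked with a small sketch. It assumes that N is an exact multiple of n, as in the example; `systematic_sample` is an illustrative name, not something defined in the slides.

```python
import random

def systematic_sample(N, n, start=None, seed=None):
    """Select every kth unit (k = N // n) beginning at a start j in 1..k.
    Assumes N is an exact multiple of n."""
    k = N // n
    if start is None:
        start = random.Random(seed).randint(1, k)  # random start, 1 <= j <= k
    return [start + i * k for i in range(n)]

# Values from the example: N = 2000, n = 50, first subject drawn = 12.
s = systematic_sample(2000, 50, start=12)
print(s[:3], s[-1], len(s))  # [12, 52, 92] 1972 50
```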
A sample chosen this way is not strictly random:
once the starting point is fixed the rest of the sample
is determined, so only k different samples are possible
and not every group of n members can be selected.
The answer is not unique, as it depends on which
number is picked for the first subject to be included.
Advantages of Systematic Sampling:
 Less time consuming and easier to perform than
SRS,
It is more convenient to use as compared to SRS,
It provides a good approximation to SRS

Disadvantages of Systematic Sampling:
If there is any sort of cyclic ordering of the subjects, the
samples will not be representative of the population.
Example: If subjects in the population are arranged in a
manner such as:
1. Defective item
2. Non-defective item
3. Defective item
4. Non-defective item
5. etc,

The selection of the starting point could produce
a sample of all defective items or all non-defective
items if the number to be added (k) is even; with an
odd k, the sample alternates between the two types.
Example: starting point = defective item + even k = all
defective items in the sample, and
starting point = non-defective item + even k = all
non-defective items in the sample.
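
The effect of cyclic ordering can be seen in a small sketch. It assumes a hypothetical list of 20 items in which odd positions are defective and even positions are non-defective, matching the arrangement described above; the helper `kinds_selected` is illustrative.

```python
# Hypothetical cyclic list: positions 1, 3, 5, ... are defective.
N = 20
items = ["defective" if pos % 2 == 1 else "non-defective"
         for pos in range(1, N + 1)]

def kinds_selected(k, start):
    """Return the set of item types hit by taking every kth position from start."""
    positions = range(start, N + 1, k)
    return {items[pos - 1] for pos in positions}

print(kinds_selected(4, 1))          # {'defective'}
print(kinds_selected(4, 2))          # {'non-defective'}
print(sorted(kinds_selected(5, 1)))  # ['defective', 'non-defective']
```

With an even interval the sample contains only one type, whichever the starting point lands on; with an odd interval both types appear.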

Example: Moha Company stores boxes containing Pepsi and
Mirinda in the following order.
1. Box containing Pepsi
2. Box containing Mirinda
3. Box containing Pepsi
4. Box containing Mirinda
...
200.
The quality department of the company would like to
check the expiry date of the products by taking a
systematic sample of 40 boxes containing either
Pepsi or Mirinda.
Assume that you are working in the quality department
of the company, and select the required systematic sample.
Is the sample you selected representative?

Stratified Sampling:
 A population is first divided into subgroups, called
strata (singular: stratum), and
a sample is selected from each stratum using the simple
random or systematic sampling method.
We can divide a human population into different strata
(subgroups) on the basis of age, sex, occupation,
education, region, religion, etc.
Stratified sampling is applied if the population is
heterogeneous.

Stratified sampling can be proportionate or
non-proportionate.
 Non-proportionate stratified sampling: an equal number
of elements is drawn from each stratum.
 Proportionate stratified sampling: the number of units
selected from each stratum is directly proportional to
the size of the stratum.
 If Pi represents the proportion of the population included
in stratum i, and n represents the total sample size,
the number of elements selected from stratum i is n*Pi.

Example: Suppose that we want a sample of size 30 to
be drawn from a population of size 8000, which is divided
into three strata of sizes 4000, 2400 and 1600. Adopting
proportional allocation,
find the sample size for each stratum.
Solution: We shall get the sample size for the different strata:
 N1 = 4000: P1 = 4000/8000 = 0.5, and hence n1 = n*P1 = 30*0.5 = 15
 N2 = 2400: P2 = 2400/8000 = 0.3, and hence n2 = n*P2 = 30*0.3 = 9
 N3 = 1600: P3 = 1600/8000 = 0.2, and hence n3 = n*P3 = 30*0.2 = 6
 Check: N = N1 + N2 + N3 = 8000, P1 + P2 + P3 = 1, and
n1 + n2 + n3 = 15 + 9 + 6 = 30
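
The same allocation can be computed with a short sketch. The function name is illustrative, and rounding to the nearest whole unit is an assumption: for strata that do not divide as evenly as in this example, the rounded sizes may not add up exactly to n and would need adjusting by hand.

```python
def proportional_allocation(strata_sizes, n):
    """Allocate a total sample of size n to strata in proportion to their sizes,
    i.e. n_i = n * N_i / N, rounded to the nearest whole unit."""
    total = sum(strata_sizes)
    return [round(n * size / total) for size in strata_sizes]

# Strata sizes and sample size from the example above.
allocation = proportional_allocation([4000, 2400, 1600], 30)
print(allocation, sum(allocation))  # [15, 9, 6] 30
```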
Advantage: The representativeness of the sample is
improved.
Disadvantages:
If there are many variables of interest, dividing a large
population into representative subgroups requires a
great deal of effort,
If variables are somewhat complex or ambiguous (such as
beliefs, attitudes, etc.), it is difficult to separate
individuals into subgroups according to these
variables.

Cluster Sampling:
If the population is homogeneous and very large, or resides
in a large area, it is costly and time consuming to take
samples by using the three methods mentioned above.
In this case, we divide the population into groups called
clusters and then we select representative clusters
randomly.
Finally, the samples will be taken from the sample
clusters.
We can take either all members of the sample clusters or
we may select samples from the clusters by using other
sampling techniques.
Procedures
1. The reference population is divided into clusters or
subgroups, preferably similar in size,
2. A sample of the clusters is taken by random or
systematic sampling,
3. All the units in the selected clusters are then
studied or we may select samples from each cluster.
 If part of the elements in each cluster is included in the
sample, then the procedure is called two stage sampling.
 The first stage is selecting a sample of clusters and the
second stage is selecting a sample of elements from each
cluster.
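
The two stages can be sketched as follows. The cluster names and member labels are hypothetical, `two_stage_sample` is an illustrative helper, and simple random sampling is assumed at both stages (the slides also allow systematic sampling at the first stage).

```python
import random

def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: randomly select clusters.
    Stage 2: randomly select units within each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Hypothetical population: 5 clusters (A to E) with 10 units each.
clusters = {f"cluster_{c}": [f"{c}-{u}" for u in range(1, 11)] for c in "ABCDE"}
result = two_stage_sample(clusters, n_clusters=2, n_per_cluster=3, seed=0)
for name, units in result.items():
    print(name, units)
```

Setting `n_per_cluster` to the full cluster size gives ordinary (one-stage) cluster sampling, in which all units of the selected clusters are studied.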

Advantage:
 A list of all individual study units in the reference
population is not required.
 Reduces cost
 Simplifies field work and is convenient
Disadvantage:
 The members of the clusters are often more
homogeneous than the members of the whole
population and therefore, it may not be representative.
 The elements in a cluster may not have the same variation
in characteristics as elements selected individually from
the population
Multi-Stage sampling
is a sampling technique that is used when the
reference population is large and widely scattered.
Selection of samples is done in stages until the final
sampling unit is obtained.
 The number of stages of sampling is the number of
times a sampling procedure is carried out.
The primary sampling unit (PSU) is the sampling unit in
the first sampling stage, and the secondary sampling unit
(SSU) is the sampling unit in the second sampling stage,
etc.

For example: the PSUs can be the weredas, the SSUs can be
the kebeles, etc.
 From the PSUs, we can select a sample based on a suitable
method, and
 each of the selected PSUs is further sub-divided into
second stage units (say kebeles), and from these SSUs
a sample is again taken by some suitable method.
Further stages may be added if required.

 Advantages
Cuts the cost of preparing a sampling frame.
 Disadvantages
Gives less precise estimate than SRS for the same
sample size

Comparison of Stratified and Cluster Sampling
Students often find it difficult to differentiate
between stratified and cluster sampling. The
main criterion distinguishing stratified from cluster
sampling is that in stratified sampling the
population is divided into well-defined groups,
where each group is homogeneous within itself but
there is wide heterogeneity (or variation) among the groups.
 In cluster sampling, the situation is the
reverse of stratified sampling (i.e., the different
clusters are similar to one another, but the elements
within each cluster are heterogeneous).

Methods of Non-Probability Sampling
In non-probability sampling, not every unit in the
population has a chance of being included in the sample,
and the process involves at least some degree of personal
subjectivity instead of following predetermined,
probabilistic rules for selection.
This sampling technique is:
 Used when a sampling frame doesn't exist,
 It is non-random selection (unrepresentative)
 Inappropriate if the aim is to measure variables and
generalize findings
 Easier, quicker and cheaper to carry out than probability
designs.

There are three non-probability sampling methods.
 Convenience Sampling: is a method in which a sample is
chosen with ease of access being the primary concern.
Example: Interviews conducted in convenient locations such
as a student lounge.
 Purposive (Judgmental) Sampling: the researcher exercises
deliberate subjective choice, drawing the units that s/he
regards as most informative for the study being undertaken.
 Quota Sampling: is a method that ensures that a certain
number of sample units from different categories with
specific characteristics are represented.
 Here, judgmental and convenience sampling methods are
combined.
Quota sampling can be applied for affirmative action.
Example:
Suppose we know that 54% of the adults in a community
are females, and the study requires 100 respondents as a
sample.
 In quota sampling, we might interview the first 54
females and the first 46 males.
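
The quota rule in this example can be sketched in code. The stream of respondents below is hypothetical (people are assumed to arrive alternately female and male), and `quota_sample` is an illustrative helper: it accepts each respondent in arrival order until the quota for that category is full.

```python
def quota_sample(respondents, quotas):
    """Accept respondents in arrival order until every category quota is filled."""
    counts = {group: 0 for group in quotas}
    selected = []
    for person, group in respondents:
        if group in counts and counts[group] < quotas[group]:
            selected.append((person, group))
            counts[group] += 1
        if counts == quotas:  # all quotas filled, stop interviewing
            break
    return selected

# Hypothetical arrival order: even ids are female, odd ids are male.
stream = [(i, "female" if i % 2 == 0 else "male") for i in range(200)]
picked = quota_sample(stream, {"female": 54, "male": 46})
print(len(picked))  # 100
```

Because respondents are taken in the order they happen to arrive, the selection is not random even though the category totals match the community's proportions.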

Errors in Sampling
There are two types of errors:
 Sampling error: is the discrepancy between the population
value (parameter) and the sample value (statistic).
It may also arise from the use of an inappropriate sampling
technique.
 It can be minimized by increasing the size of the sample.
When n = N, sampling error = 0.
 Non-sampling errors (bias): are due to procedural bias such as:
Subjects' non-response
Incorrect responses
Problems with the sampling frame
Measurement error
Errors at different stages in processing the data.
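
The behaviour of sampling error described above can be explored with a small simulation. This is a sketch under stated assumptions: the population is a hypothetical list of the values 1 to 1000, the statistic is the sample mean, and samples are drawn by simple random sampling. The error of any single sample depends on chance, so only the n = N case has a guaranteed result.

```python
import random
from statistics import mean

def sampling_error(population, n, seed=None):
    """Sample mean minus population mean for one simple random sample of size n."""
    rng = random.Random(seed)
    sample = rng.sample(population, n)
    return mean(sample) - mean(population)

# Hypothetical population of N = 1000 values.
population = list(range(1, 1001))
for n in (10, 100, 1000):
    print(n, abs(sampling_error(population, n, seed=42)))
# For n = N = 1000 the sample is the whole population, so the error is 0.
```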
Ways to reduce data error
Ensure that survey instruments are well prepared, simple
to read, and easy to understand.
Properly select and train interviewers to control data
gathering bias or error.
Use sound editing, coding, and tabulating procedures
to reduce the possibility of data processing error.
