Sampling Method

Sampling
• Sampling – by selecting some of the elements of the population, a conclusion is drawn about the entire population
• Element – each individual participant or object on which measurement is taken
• Population – the total collection of all the elements about which we want to make an inference
• Census – a count of all the elements in the population
• Sample frame – the listing of all the population elements from which the sample will be drawn
Sampling - Advantages
• Lower cost
• Greater accuracy of results
• Greater speed of data collection
• Sampling is the only option in certain cases (e.g. to test the efficacy of a drug for diabetics before its commercial use, or to check the safety features of a vehicle before its launch: we cannot crash all vehicles)
Sampling Design
• A sampling design is the definite plan for obtaining a sample from a given population
• Sampling design includes:
  • The technique/process adopted to select items for the sample
  • The number of items (sample size)
Steps in Sampling Design
• Step 1 – Clearly define the objectives and thereby the universe
• Step 2 – Clearly define the sampling unit (item)
  • Geographical – village, city, town
  • Construction unit – house, flat
  • Social unit – family, club, school
  • Individual unit – male, female, child
• Step 3 – Clearly define the sampling frame / source list (it should contain the names of all the items in the universe)
Steps in Sampling Design
• Step 4 – Decide upon the size of the sample. It should be neither too large nor too small. The size depends on the size of the population, on the cost and the budget available, and on the confidence level we wish to establish
• Step 5 – Consider the specific population parameters that we may be interested in. Population parameters are summary descriptors of variables of interest in the population
  • E.g. for actual diners only: frequency of dining at a restaurant in the last 7 days (0-5 times, 6-10 times, etc.), or the actual count of dining at the restaurant (average 5 times)
  • E.g. for non-diners who may be interested: proportion of people interested in dining in the restaurant
  • How many of the diners were married men or women, how many lived close by, how many were students? (If these matter, our sampling frame should be selected accordingly)
• Step 6 – Decide upon the sampling procedure or technique/type, i.e., decide the sample design
Types of Sample Design
• Depends on two aspects
  1. Representation – how are the members/units/elements of the sample selected?
    • Is it arbitrary and haphazard, or is it a controlled procedure?
    • If it is haphazard, it is non-probability sampling; if it is a controlled procedure in which every element has a known, non-zero chance of being picked for the study (an equal chance in simple random sampling), it is probability sampling
  2. Element selection
    • Is each element selected individually and directly from the entire population, or are additional controls imposed?
    • The former is referred to as unrestricted sampling; everything else is restricted sampling
Types of Sample Design
Non-Probability (unrestricted)
• Convenience – the researcher selects any readily available individual as a participant
• Not very reliable
• But cheapest and easiest to obtain
• The reason for selection is usually that the respondent happened to be at the right place at the right time
• E.g. an informal pool of friends, voluntary responses to an ad, neighbours
• Could be used in the initial stages of an exploratory study
• E.g. to understand the emotional state of students on returning to the campus: if the first 25 students I speak to all speak in the same tone, that gives me a sense of what is to come ☺
Types of Sample Design
Non-Probability (restricted)
• Purposive – the researcher chooses a non-probability sample (arbitrarily) that conforms to certain criteria. Participants may be chosen for their unique characteristics, experiences, attitudes or perceptions. Two types: judgment and quota
• Judgment – the sample is selected on the basis of the judgement of the researcher
  • E.g. an organisation wants to understand customer response to a new product prior to launch, so it decides to first try the product on its employees
  • E.g. to explore the factors influencing the glass ceiling, the researcher decides to meet women in the senior-most positions in organisations
Types of Sample Design
Non-Probability (restricted)
• Quota – the researcher uses this to improve representativeness
• Tries to get the sample to have the same characteristics as the population
• May be viewed as two-stage judgment sampling
• Develop control categories and then use judgement to select elements from each category
• E.g. I want to understand the inclination of students to use the canteen food coupons, and I know that the ratio of boys to girls in the population is 60:40. In stage one, I set the control that my sample should have a 60:40 ratio of boys to girls. In stage two, I select or interview students based on my own judgement
Types of Sample Design
Non-Probability (restricted)
• Snowball – the researcher finds respondents through a referral network. The snowball gathers subjects as it rolls along
• An initial set of respondents is chosen; they in turn refer the researcher to others whom they know to possess similar characteristics, experiences or attitudes
• E.g. people who changed stream mid-way through their careers, who are difficult to find by other means
Probability Sampling - Simple Random Sampling
• Simple Random Sampling (SRS) – this is unrestricted random sampling and is the purest form of probability sampling
• Each population element has a known and equal chance of selection
• Probability of selection = sample size / population size
• Accomplished with the help of computer software (Excel also has an option)
• Every element is selected independently of the others
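The selection rule above can be sketched in a few lines of Python. This is only an illustration: the frame of 200 student IDs, the sample size of 20 and the helper name `simple_random_sample` are assumptions made for the example, not part of the slides.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n elements from the frame without replacement.

    Every element has the same chance of selection: n / len(frame).
    """
    rng = random.Random(seed)
    return rng.sample(frame, n)

# Hypothetical sampling frame: student IDs 1..200
frame = list(range(1, 201))
sample = simple_random_sample(frame, 20, seed=42)

# Probability of selection = sample size / population size
selection_probability = len(sample) / len(frame)
print(sorted(sample))
print(selection_probability)
```

A fixed seed is used only so that the draw can be repeated; in a real study the seed would normally be left unset.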
Probability Sampling - Simple Random Sampling
• Simple Random Sampling (SRS) – it requires a sampling frame
• The frame may be expensive to produce if the population is large
• The sample may also not be truly representative of the population
• E.g. the sample may not include people of all the ethnicities present in the population, so the results may not be appropriate if ethnicity is an important characteristic in the population
Probability Sampling - Systematic Random Sampling
• Complex Random Sampling – this can be better than SRS in terms of precision, cost and time efficiency. Four types: systematic, stratified, cluster and double sampling
• Systematic random sampling
  • Pick the first unit at random (e.g. a randomly picked house among the first K in the lane), then pick every Kth unit after it. K is referred to as the skip interval
  • Skip interval = population size / sample size
  • You must have a sample frame that is available as a list or in some order
  • E.g. picking invoices for audit purposes
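A minimal sketch of the procedure, using the invoice example. The invoice numbering, the population of 1000 and the sample of 50 are hypothetical, and rounding the skip interval down when the division is not exact is a choice made here for simplicity.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick a random start within the first k units, then every k-th unit."""
    k = len(frame) // n  # skip interval = population size / sample size
    if k < 1:
        raise ValueError("sample size cannot exceed the size of the frame")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random position among the first k units
    return [frame[start + i * k] for i in range(n)]

# Hypothetical frame: 1000 invoices listed in order
invoices = [f"INV-{i:04d}" for i in range(1, 1001)]
picked = systematic_sample(invoices, 50, seed=7)
print(picked[:3])
```

With 1000 invoices and a sample of 50, the skip interval is 20, so consecutive picks are always 20 positions apart in the list.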
Probability Sampling - Systematic Random Sampling
• Systematic random sampling
• It is important to randomise the population, else bias creeps in
• For instance, if the sample is picked from a sampling frame of students listed by grade in ascending order and the first random number picked is 8, we would miss the data from the seven students with the lowest grades in the class. Hence the sample frame should be in alphabetical order or in some other order unrelated to the variable being studied
• In spite of the best efforts, the sample may still be skewed. For instance, the class may have a gender ratio of 60:40 while the sample has 70:30, which is not truly representative. Systematic random sampling does not guard against this
Probability Sampling - Stratified Random Sampling
• Stratified Random Sampling – the population is segregated into mutually exclusive subpopulations, or strata. The process by which we constrain the sample to include elements from each of the strata in the population is called stratified random sampling
• Students can be classified into boys and girls, or first years, second years and final years
• Stratification is usually more efficient than simple random sampling
• Each stratum is homogeneous internally and heterogeneous with respect to the other strata
• Samples can be stratified using more than one variable. E.g. employee data can be stratified on the basis of salary and domain area/department
• Helps study various characteristics of the population considered important
• The sampling frame is the entire list of elements in the population
Probability Sampling - Stratified Random Sampling
• Stratified Random Sampling – proportionate vs disproportionate sampling
• Proportionate: each stratum is properly represented, so that its percentage in the sample is proportionate to its percentage in the population
• Disproportionate: any stratification that departs from the proportionate relationship is referred to as disproportionate stratified random sampling
• Process for drawing a stratified sample
  • Determine the variables to use for stratification
  • Determine the proportions
  • Select proportionate or disproportionate stratification based on the requirement of the project
  • Divide the sampling frame into separate frames for each stratum
  • Randomise the elements
  • Draw the sample from each stratum
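The process above can be sketched for the proportionate case as follows. The population of 60 boys and 40 girls, the sample size of 10 and the function name are assumptions chosen for illustration.

```python
import random
from collections import defaultdict

def proportionate_stratified_sample(frame, stratum_of, n, seed=None):
    """Split the frame into strata, then draw from each stratum in
    proportion to its share of the population."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for element in frame:
        strata[stratum_of(element)].append(element)  # one frame per stratum

    sample = {}
    for name, members in strata.items():
        # Stratum sample size = n * (stratum size / population size), rounded
        size = round(n * len(members) / len(frame))
        sample[name] = rng.sample(members, size)  # random draw within stratum
    return sample

# Hypothetical population: 60 boys and 40 girls
students = [("boy", i) for i in range(60)] + [("girl", i) for i in range(40)]
result = proportionate_stratified_sample(students, lambda s: s[0], 10, seed=1)
print({name: len(chosen) for name, chosen in result.items()})
```

When the proportions do not divide evenly, the rounded stratum sizes may not add up exactly to the requested total, so a real design would need a rule for allocating the remainder.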
Probability Sampling – Cluster Sampling
• Cluster Sampling – the population is divided into subgroups (clusters) such that each subgroup is heterogeneous in itself, instead of selecting individual elements in the population on the basis of a characteristic
• Each cluster should be representative of the entire population
• Clusters should be similar to one another (homogeneous across clusters)
• The researcher conducts the study only on the selected clusters
• This method is chosen when the size of the population is very large and it is difficult to get a sampling frame of individual elements and decide upon the sample
• The sampling frame is the complete list of clusters, not of individual units/participants in the population
Probability Sampling – Cluster Sampling
• Cluster Sampling – the grouping into clusters could be any naturally occurring grouping
• Stages involved:
  • Choose the cluster groupings
  • Number each cluster with a unique number
  • Select the sample of clusters by simple random sampling and measure each element in the selected clusters
• This is also known as single-stage cluster sampling
• Considered less accurate than stratified random sampling, as it may not be truly representative of the population
• Example of clusters – in a study on the reading habits of 7th graders in India, the total number of 7th graders is too large to reach them all. I can form clusters on the basis of schools with more than 150 7th-grade students. Each school becomes a cluster and I can randomly choose schools from the list of schools
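A small sketch of single-stage cluster sampling along the lines of the school example. The ten schools, their student lists and the choice of three clusters are invented for illustration.

```python
import random

def single_stage_cluster_sample(clusters, n_clusters, seed=None):
    """Select whole clusters by simple random sampling and keep
    every element of each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # frame = list of clusters
    return {name: clusters[name] for name in chosen}

# Hypothetical frame: 10 schools, each with more than 150 7th graders
schools = {
    f"school_{i}": [f"s{i}_student_{j}" for j in range(150 + i)]
    for i in range(1, 11)
}
selected = single_stage_cluster_sample(schools, 3, seed=3)
for name, pupils in selected.items():
    print(name, len(pupils))
```

Note that the sampling frame here is the list of school names only; no list of individual students is needed before the clusters are chosen.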
Probability Sampling – Cluster Sampling
• Cluster Sampling – Two stage
  • A simple random sample of clusters is selected, and then a simple random sample is selected from the units in each sampled cluster
• Cluster Sampling – Multi stage
  • Cluster sampling repeated at a number of levels
  • A more complex form of cluster sampling which contains more stages in sample selection. In simple terms, large clusters of the population are divided into smaller clusters in several stages in order to make primary data collection more manageable
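The two-stage variant can be sketched as below. The districts, household labels and sample sizes are hypothetical; a multi-stage design would repeat the same idea at further levels.

```python
import random

def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: simple random sample of clusters.
    Stage 2: simple random sample of units within each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster) for name in chosen}

# Hypothetical clusters: 8 districts with 200 households each
districts = {
    f"district_{d}": [f"d{d}_household_{h}" for h in range(200)]
    for d in range(1, 9)
}
two_stage = two_stage_cluster_sample(districts, 2, 25, seed=11)
for name, units in two_stage.items():
    print(name, len(units))
```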
Stratified Vs Clustered
Probability Sampling – Cluster Sampling
• Area Sampling – the most important form of cluster sampling
• The entire geographical area is divided into smaller non-overlapping areas, generally called geographical clusters
• E.g. a researcher wants to study the feedback for a product newly launched in the market across a state. He/she may define clusters on the basis of the districts in which the product is being sold in the state. These districts would then be numbered and sample clusters would be chosen randomly
Probability Sampling – Double Sampling
• Double Sampling – also known as sequential sampling or multiphase sampling
• Two-phase method of sampling
• Step one – preliminary analysis of an initial sample, on which information is comparatively easy to obtain. Either simple characteristics or infrequent characteristics are studied on a large scale
  • E.g. for a study on household expenses on education, select a preliminary sample and collect information about income, number of children, level of education, type of school, etc.
• Step two – another sample is taken from the preliminary sample and more analysis is done to observe some specific characteristic
Probability Sampling – Double Sampling
• Double Sampling – collecting some information from the sample in the first phase, and then using this information as the basis for selecting a sub-sample for further study, may prove more convenient or economical
• The information collected in the first phase can be used to advantage for stratifying or clustering the sampling units and selecting appropriate sub-samples
• The point in support of this is that when collecting all the information from every respondent would be redundant, there is little justification for the added burden of approaching everyone for everything. This saves time and money
Determination of Sample Size
• Choice of sample size is determined by the following:
• Margin of error we are willing to tolerate – the accuracy we want in any estimate we make from our sample. Precision is measured using the standard error of the estimate: the smaller the standard error, the higher the precision of the sample. If the estimated mean is Rs 4000 and the precision desired is +/- 4%, then the true value should lie between Rs 3840 and Rs 4160. (Margin of error = critical value × standard error)
• Confidence level – the confidence we want in our data: the level of certainty that the characteristics of the data collected will represent the characteristics of the target population. It is the percentage of times that the value will fall within the precision limits, e.g. 95% of the time the value will fall within the required range. The significance level is the percentage of times it will lie outside the desired range. Thus if the confidence level is 95%, the significance level is 5%.
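As a rough illustration of how these quantities fit together, the sketch below derives the critical value from the confidence level using the standard normal distribution and reproduces the Rs 4000 +/- 4% range. The standard error of Rs 80 is a made-up figure, and the use of a normal (Z) critical value is an assumption that suits large samples.

```python
from statistics import NormalDist

def critical_value(confidence_level):
    """Two-sided Z critical value, e.g. about 1.96 for 95% confidence."""
    significance_level = 1 - confidence_level
    return NormalDist().inv_cdf(1 - significance_level / 2)

def margin_of_error(confidence_level, standard_error):
    # Margin of error = critical value * standard error
    return critical_value(confidence_level) * standard_error

z = critical_value(0.95)
moe = margin_of_error(0.95, 80)  # hypothetical standard error of Rs 80

# Precision of +/- 4% around an estimated mean of Rs 4000
estimate, precision = 4000, 0.04
low, high = estimate * (1 - precision), estimate * (1 + precision)
print(round(z, 2), round(moe, 1), round(low), round(high))
```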
Determination of Sample Size
• The larger the absolute size of the sample, the closer the sampling distribution of its mean will be to a normal distribution – central limit theorem
• Samples of large absolute size are more likely to be representative of the target population; in particular, the mean of a large sample is likely to be closer to the mean of the target population – law of large numbers
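The law of large numbers can be seen in a quick simulation. Everything here is synthetic: the population of 100,000 uniformly distributed values and the sample sizes are arbitrary choices for illustration, and a single run shows a tendency rather than a guarantee at every step.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic population: 100,000 values spread uniformly between 0 and 100
population = [rng.uniform(0, 100) for _ in range(100_000)]
true_mean = mean(population)

def sample_mean_error(n):
    """Absolute gap between the mean of one random sample of size n
    and the population mean."""
    return abs(mean(rng.sample(population, n)) - true_mean)

errors = {n: sample_mean_error(n) for n in (10, 100, 1_000, 10_000)}
for n, err in errors.items():
    print(n, round(err, 3))
```

Across repeated runs the gap for the largest sample is typically far smaller than for the smallest one, which is the behaviour the bullet describes.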
Determination of Sample Size
• The smaller the absolute size of the sample, and the smaller the proportion of the population sampled, the greater the margin of error
• Hence we may need a larger sample to ensure that we get enough responses for the margin of error required
• Choice of sample size is also determined by the following:
  • The type of analysis we wish to undertake – the number of categories we wish to divide our data into, e.g. the minimum data threshold requirement for a chi-square test
  • The budget and time
Determination of Sample Size
• Sample size for estimating population mean

n = (Z × σ / e)²

Where
n = Sample size
σ = Population standard deviation
e = Margin of error
Z = The Z-value (critical value) for the given confidence level
Determination of Sample Size
• A researcher wants the estimate to be within +/- Rs 25 of the true population value and wishes to be 95% confident that the interval will contain the true population mean. Also assume that earlier studies have shown the standard deviation to be around Rs 100. What would be the required sample size?
• n = ?, Z = 1.96, SD = 100, margin of error = 25
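The question can be worked through with the standard formula for estimating a mean, n = (Z × σ / e)², taken here as an assumption about the formula intended. The function name is illustrative, and the result is rounded up because a fraction of a respondent cannot be sampled.

```python
import math

def sample_size_for_mean(z, sigma, e):
    """n = (z * sigma / e)^2, rounded up to the next whole unit."""
    return math.ceil((z * sigma / e) ** 2)

# Values from the question: Z = 1.96, SD = 100, margin of error = 25
n = sample_size_for_mean(z=1.96, sigma=100, e=25)
print(n)  # 62, since (1.96 * 100 / 25)^2 = 61.47
```

Halving the margin of error to Rs 12.50 would roughly quadruple the required sample, because e enters the formula squared.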
Determination of Sample Size
• Sample size for estimating population proportion
• When population proportion p is known: n = Z² × p × (1 - p) / e²
• When population proportion p is not known: use p = 0.5, which maximises p × (1 - p), giving n = Z² × 0.25 / e²
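A short sketch of both cases, assuming the standard formula n = Z² × p × (1 - p) / e² and the usual convention of taking p = 0.5 when the proportion is unknown. The 5% margin of error and the known proportion of 0.3 are hypothetical inputs.

```python
import math

def sample_size_for_proportion(z, e, p=0.5):
    """n = z^2 * p * (1 - p) / e^2, rounded up.

    p defaults to 0.5, the most conservative choice when p is unknown.
    """
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)

# 95% confidence (Z = 1.96) and a margin of error of 5 percentage points
n_unknown = sample_size_for_proportion(1.96, 0.05)       # p not known
n_known = sample_size_for_proportion(1.96, 0.05, p=0.3)  # p known to be 0.3
print(n_unknown, n_known)  # 385 323
```

Using p = 0.5 never understates the requirement, since any other value of p gives a smaller p × (1 - p).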


Determination of Sample Size
• A widely accepted rule of thumb is 10
cases/observations per indicator variable in
setting a lower bound of an adequate
sample size (Nunnally, 1967)
• Bentler and Chou (1987) suggest a ratio as
low as 5 cases per variable would be
sufficient
• Kaiser-Meyer-Olkin (KMO) Test for
Sampling Adequacy - KMO values between
0.8 and 1 indicate the sampling is adequate.
KMO values less than 0.6 indicate the
sampling is not adequate and that remedial
action should be taken.
Determination of Sample Size
• Bartlett sample size determination table – a table is provided that can be used to select the sample size