
Chapter 5: Sampling and Data Collection

Dr. Mohammed Shamim Uddin Khan


Professor and Ex-Chairman
Department of Finance
University of Chittagong
Terminology
 Sample: Subset of a larger population
 Sampling: The process of obtaining information from a
subset (sample) of a larger group (population)
 Population: Any Complete Group
– People
– Sales Territories
– Stores
 Census: Investigation of all individual elements that make
up a population
Population Vs. Sample

[Figure: the population of interest with a sample drawn from it. A parameter describes the population; a statistic describes the sample.]

We measure the sample using statistics in order to draw inferences about the population and its parameters.
Steps in Sampling Process
1. Define the population
2. Identify the sampling frame
3. Select a sampling design or procedure
4. Determine the sample size
5. Draw the sample
Sampling Design Process
1. Define Population
2. Determine Sampling Frame
3. Determine Sampling Procedure
– Probability Sampling: Simple Random Sampling, Stratified Sampling, Cluster Sampling
– Non-Probability Sampling: Convenience, Judgmental, Quota
4. Determine Appropriate Sample Size
5. Execute Sampling Design
Sampling Frame
 A list of elements from which the sample may be
drawn
 Working Population
 Mailing Lists - Data Base Marketers
 Sampling Frame Error
Random Sampling Error
 The difference between the sample results and
the result of a census conducted using identical
procedures
 Statistical fluctuation due to chance variations
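As a rough illustration of random sampling error, the Python sketch below compares sample means with the "census" mean of a made-up population. The population values, the seed, and the function name `sampling_errors` are assumptions chosen for illustration; they do not come from the slides.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 made-up income values.
population = [rng.gauss(50_000, 12_000) for _ in range(10_000)]
census_mean = statistics.mean(population)  # what a full census would report


def sampling_errors(sample_size, repeats=200):
    """Return (sample mean - census mean) for repeated simple random samples."""
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, sample_size)
        errors.append(statistics.mean(sample) - census_mean)
    return errors


small = sampling_errors(25)
large = sampling_errors(400)

# The typical size of the chance error shrinks as the sample grows.
spread_small = statistics.pstdev(small)
spread_large = statistics.pstdev(large)
print(round(spread_small), round(spread_large))
```

The errors here arise purely from chance in which units are drawn, which is what distinguishes them from the systematic errors described next.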
Systematic Errors
 Non-sampling errors
 Unrepresentative sample results
 Not due to chance
 Due to study design or imperfections in
execution
Errors Associated with Sampling

 Sampling Frame Error


 Random Sampling Error
 Non-response Error
Two Major Categories of Sampling

 Probability Sampling
 Non-probability Sampling
Non-probability Sampling
 Convenience Sampling (Chunk Sampling)
 Judgment Sampling (Purposive Sampling)
 Quota Sampling
 Snowball Sampling
Probability Sampling
 Simple Random Sample
 Systematic Sample
 Stratified Sample
 Cluster Sample
 Multistage Area Sample
Convenience Sampling
 Also called haphazard or accidental sampling
 The sampling procedure of obtaining the people
or units that are most conveniently available
Judgment Sampling
 Also called purposive sampling
 An experienced individual selects the sample
based on his or her judgment about some
appropriate characteristics required of the
sample member
Quota Sampling
 The population is divided into cells on the basis of
relevant control characteristics.
 A quota of sample units is established for each cell.
 A convenience sample is drawn for each cell until
the quota is met.
(Quota sampling resembles stratified sampling but should not be confused with it: within each cell, units are chosen by convenience rather than at random.)
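The three steps above can be sketched in Python as follows. This is a minimal illustration, assuming units simply arrive in a convenience order; the function `quota_sample`, the cell labels, and the data are hypothetical.

```python
from collections import Counter


def quota_sample(arrivals, cell_of, quotas):
    """Take units in arrival (convenience) order until every cell's quota is met."""
    counts = Counter()
    chosen = []
    for unit in arrivals:
        cell = cell_of(unit)
        if counts[cell] < quotas.get(cell, 0):
            chosen.append(unit)
            counts[cell] += 1
        # Stop as soon as all quotas are filled.
        if all(counts[c] >= q for c, q in quotas.items()):
            break
    return chosen


# Hypothetical passers-by as (id, gender) pairs, in the order they were met.
arrivals = [(1, "F"), (2, "F"), (3, "M"), (4, "F"), (5, "M"), (6, "F"), (7, "M")]
quotas = {"F": 2, "M": 2}

sample = quota_sample(arrivals, lambda unit: unit[1], quotas)
print(sample)  # → [(1, 'F'), (2, 'F'), (3, 'M'), (5, 'M')]
```

No random selection happens inside a cell, so selection probabilities are unknown; this is why quota sampling is a non-probability method.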
Snowball Sampling
 A variety of procedures
 Initial respondents are selected by probability
methods
 Additional respondents are obtained from
information provided by the initial respondents
Simple Random Sampling
 A sampling procedure that ensures that each
element in the population will have an equal
chance of being included in the sample
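A minimal Python sketch of simple random sampling, assuming the sampling frame is available as a list. The frame contents and the function name `simple_random_sample` are illustrative only.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct elements; every element has the same chance of inclusion."""
    rng = random.Random(seed)
    return rng.sample(frame, n)  # sampling without replacement


# Hypothetical sampling frame of 100 units.
frame = [f"unit-{i:03d}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```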
Systematic Sampling
 A simple process
 Every nth name from the list will be drawn
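A short Python sketch of systematic sampling. It assumes a random starting point within the first interval, which is common practice but is not stated on the slide; the interval is called `k` here to avoid confusion with the sample size `n`, and the function name is hypothetical.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Pick a random start in the first interval, then take every k-th element."""
    k = len(frame) // n  # sampling interval
    if k < 1:
        raise ValueError("sample size exceeds the size of the frame")
    start = random.Random(seed).randrange(k)
    return [frame[start + i * k] for i in range(n)]


# Hypothetical frame: names numbered 1 to 100 on a list.
frame = list(range(1, 101))
sample = systematic_sample(frame, 10, seed=3)
print(sample)
```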
Stratified Sampling
 Probability sample
 Subsamples are drawn within different strata
 Elements within each stratum are more or less alike on some characteristic
 Do not confuse with quota sample
Cluster Sampling
 The purpose of cluster sampling is to sample
economically while retaining the characteristics
of a probability sample.
 The primary sampling unit is no longer the
individual element in the population
 The primary sampling unit is a larger cluster of
elements located in proximity to one another
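A minimal Python sketch of one-stage cluster sampling, assuming whole clusters are selected at random and every element inside a selected cluster is kept. The block names, household IDs, and the function `cluster_sample` are made up for illustration.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly pick whole clusters, then keep every element in the picked clusters."""
    rng = random.Random(seed)
    picked = rng.sample(sorted(clusters), n_clusters)  # the cluster is the sampling unit
    elements = [element for name in picked for element in clusters[name]]
    return picked, elements


# Hypothetical clusters: city blocks mapped to the households they contain.
clusters = {
    "block-A": ["A1", "A2", "A3"],
    "block-B": ["B1", "B2"],
    "block-C": ["C1", "C2", "C3", "C4"],
    "block-D": ["D1", "D2"],
}

picked, households = cluster_sample(clusters, 2, seed=7)
print(picked, households)
```

Only the selected blocks need to be visited, which is where the economy of cluster sampling comes from.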
Examples of Clusters

Population Element       Possible Clusters in the United States

U.S. adult population    States, Counties, Metropolitan Statistical Areas, Census tracts, Blocks, Households
College seniors          Colleges
Manufacturing firms      Counties, Metropolitan Statistical Areas, Localities, Plants
Airline travelers        Airports, Planes
Sports fans              Football stadiums, Basketball arenas, Baseball parks
What is the
Appropriate Sample Design?

 Degree of Accuracy
 Resources
 Time
 Advanced Knowledge of the Population
 National versus Local
 Need for Statistical Analysis
Determination of Sample Size
In sampling analysis, the most delicate question is: what should the size of the sample (n) be? If the sample is too small, it may not achieve the objectives of the study. If it is too large, it incurs unnecessary cost and wastes resources. As a general rule, the sample should be of optimum size, i.e. neither excessively large nor too small. Two alternative approaches for determining the size of the sample are:
1. Estimating the sample size based on a proportion
2. Estimating the sample size based on a mean
Estimating the Sample Size Based
on a Proportion
Example 1: A nutrition survey is to be conducted in a Rohingya camp. Assume that 40% of the children suffer from malnutrition. How large a sample is needed to be 95% confident that the estimated prevalence does not differ from the true prevalence by more than 0.05?
Solution: Assume that the population is large. Here z = 1.96, the maximum allowable error is e = 0.05, and the proportion of children suffering from malnutrition is p = 0.40, so q = 1 - p = 0.60. Thus we employ
$$n_0 = \frac{z^2 pq}{e^2} = \frac{(1.96)^2 (0.4)(0.6)}{(0.05)^2} \approx 369$$
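The calculation in Example 1 can be checked with a few lines of Python. This is only a sketch of the large-population formula n0 = z²pq/e²; the function name `sample_size_for_proportion` is an illustrative choice.

```python
def sample_size_for_proportion(p, e, z=1.96):
    """Large-population sample size for estimating a proportion: z^2 * p * q / e^2."""
    q = 1 - p
    return (z ** 2) * p * q / (e ** 2)


# Values from Example 1: p = 0.40, e = 0.05, z = 1.96 (95% confidence).
n0 = sample_size_for_proportion(p=0.40, e=0.05)
print(round(n0, 2))  # → 368.79
print(round(n0))     # → 369
```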
Estimating the Sample Size
Based on a Mean
Example 2: Suppose a researcher wishes to estimate the average (mean) income level of employees in a city within a margin of error of 0.25 at a 95% confidence level. On the basis of prior studies, the researcher believes that the standard deviation can be estimated as 1.5. What is the required sample size?
Solution: Here z = 1.96, the maximum allowable error is e = 0.25, and the standard deviation is σ = 1.5. Thus we employ
$$n_0 = \frac{z^2 \sigma^2}{e^2} = \frac{(1.96)^2 (1.5)^2}{(0.25)^2} \approx 138$$
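A small Python check of Example 2, using σ = 1.5 as stated in the problem and in the formula. The function name `sample_size_for_mean` is hypothetical. The unrounded result is about 138.3; the slide rounds to the nearest integer, while rounding up (to 139) is a common, more conservative convention.

```python
import math


def sample_size_for_mean(sigma, e, z=1.96):
    """Large-population sample size for estimating a mean: (z * sigma / e)^2."""
    return (z * sigma / e) ** 2


# Values from Example 2: sigma = 1.5, e = 0.25, z = 1.96 (95% confidence).
n0 = sample_size_for_mean(sigma=1.5, e=0.25)
print(round(n0, 1))   # → 138.3
print(round(n0))      # → 138 (as on the slide)
print(math.ceil(n0))  # → 139 (rounding up instead)
```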
Stratified Sampling
Example 3: A population with 300 university students is
divided according to the faculty they belong to: Science,
Arts, Social Sciences and Business studies. The numbers
of students in these faculties were respectively 50, 120,
70, and 60. A stratified sample of 30 is to be selected.
Use proportional allocation technique to allocate sample
size to different strata.
Solution:
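Proportional Allocation Method:
$$n_i = n \cdot \frac{N_i}{N}$$
where n = sample size = 30; N_i = size of stratum i, with N_1 = 50, N_2 = 120, N_3 = 70, N_4 = 60; and N = size of population = 300.
$$n_1 = n \cdot \frac{N_1}{N} = 30 \times \frac{50}{300} = 5, \qquad n_2 = n \cdot \frac{N_2}{N} = 30 \times \frac{120}{300} = 12$$
$$n_3 = n \cdot \frac{N_3}{N} = 30 \times \frac{70}{300} = 7, \qquad n_4 = n \cdot \frac{N_4}{N} = 30 \times \frac{60}{300} = 6$$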

Thus, using stratified random sampling, we will select 5 students from stratum 1 (Science), 12 students from stratum 2 (Arts), 7 students from stratum 3 (Social Sciences) and 6 students from stratum 4 (Business Studies) to make up a total of n = 30. Note that all four strata have a uniform sampling fraction of 1/10 = 10%.
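The proportional allocation in Example 3 can be reproduced with a short Python sketch. The function name `proportional_allocation` is illustrative. In this example every allocation is a whole number; in general the results would need rounding, which this sketch does not handle.

```python
def proportional_allocation(strata_sizes, n):
    """Allocate a total sample of n across strata in proportion to stratum size."""
    N = sum(strata_sizes.values())
    return {name: n * size / N for name, size in strata_sizes.items()}


# Stratum sizes from Example 3 (N = 300) and total sample size n = 30.
strata = {"Science": 50, "Arts": 120, "Social Sciences": 70, "Business Studies": 60}
alloc = proportional_allocation(strata, 30)
print(alloc)
# → {'Science': 5.0, 'Arts': 12.0, 'Social Sciences': 7.0, 'Business Studies': 6.0}
```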
Data and Its Classification
Data: The raw material of statistics consists of numbers or observations, usually obtained by some process of counting or measurement; collectively, they are referred to as data. Thus, ‘a set of observations is called data’.
Classification of Data: Data can be classified in a number of ways.
1. Data according to origin: (a) Population data (b) Sample data.
2. Data according to variable: (a) Qualitative (categorical) data (b) Quantitative data.
3. Data according to time: (a) Time series data (b) Cross-section data (c) Panel data.
4. Data according to scale of measurement: (a) Nominal data (b) Ordinal data (c) Interval data (d) Ratio data.
5. Data according to subject (discipline): (a) Economic data (b) Agriculture data (c) Medical data (d) Business data (e) Meteorological data (f) Import data (g) Export data, etc.
N.B. Quantitative data can be further classified as (i) Discrete data (ii) Continuous data.
Types of Data

1. Categorical: (e.g., sex, marital status, income category)
2. Continuous: (e.g., age, income, weight, height, time to achieve an outcome)
3. Discrete: (e.g., number of children in a family)
4. Binary or Dichotomous: (e.g., responses to all Yes or No type questions)

Scale of Data

1. Nominal: These data do not represent an amount or quantity (e.g., marital status, religion, race, sex)
2. Ordinal: These data represent an ordered series of relationships (e.g., level of education)
3. Interval: These data are measured on an interval scale having equal units but an arbitrary zero point (e.g., temperature in Fahrenheit)
4. Ratio: Variables such as weight, for which one value can be meaningfully compared with another (say, 100 kg is twice 50 kg)
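The difference between the interval and ratio scales can be illustrated with a small Python sketch. The idea, which goes slightly beyond the slide, is that a ratio is meaningful only if it survives a change of units; the conversion constant and variable names are illustrative.

```python
def fahrenheit_to_celsius(f):
    """Convert a Fahrenheit temperature to Celsius."""
    return (f - 32) * 5 / 9


# Interval scale: the zero point is arbitrary, so ratios depend on the unit.
ratio_f = 100 / 50                                                # 2.0 in Fahrenheit
ratio_c = fahrenheit_to_celsius(100) / fahrenheit_to_celsius(50)  # about 3.78 in Celsius

# Ratio scale: zero means "none", so the ratio is the same in any unit.
KG_TO_LB = 2.20462  # approximate conversion factor
ratio_kg = 100 / 50
ratio_lb = (100 * KG_TO_LB) / (50 * KG_TO_LB)

print(ratio_f, round(ratio_c, 2))   # the two temperature ratios disagree
print(ratio_kg, round(ratio_lb, 2)) # the two weight ratios agree
```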
Methods of Data Collection

 The task of data collection begins after a research problem has been defined and the research design/plan chalked out.
 While deciding about the method of data collection to be
used for the study, the researcher should keep in mind two
types of data viz., primary and secondary.
 A researcher as per requirement of study may decide on
use of primary data or secondary data or both.
 Both primary and secondary data have their own pros and
cons.
Primary and Secondary Data

Primary Data: Primary data are those collected afresh and for the first time, and are thus original in character. Primary data are collected by the researcher.
Secondary Data: Secondary data are those that have already been collected by some other agency and have already been passed through the statistical process.
Collecting Secondary Data

 Sources of secondary data include existing literature, reports of professional agencies, departments, archives, the Internet, etc.
 While collecting secondary data, one must follow the required legal procedures and maintain academic ethics.
Scrutiny of Secondary Data

1. Suitability: The compiler should be satisfied that the data contained in the publication are suitable for the study. In particular, the conformity of the definitions, units of measurement, and time frame should be checked.
2. Reliability: The reliability of the secondary data can be ascertained
from the collecting agency, mode of collection and the time period
of collection. For instance, secondary data collected by a voluntary
agency with unskilled investigators are unlikely to be reliable.
3. Adequacy: The source of data may be suitable and reliable but the data
may not be adequate for the proposed enquiry. The original data may
cover a bigger or narrower geographical region or the data may not
cover suitable periods.
4. Accuracy: The user must be satisfied about the accuracy of the
secondary data. The process of collecting raw data, the reproduction
of processed data in the publication, the degree of accuracy desired
and achieved should also be satisfactory and acceptable to the
researcher.
Methods of Collecting Primary Data

There are several methods of collecting primary data, particularly in surveys and descriptive research. Important ones are:
 Observation
 Interview
 Questionnaire
 Schedule
 Other Methods
Primary Data Collection Techniques

Quantitative Data Collection Techniques    Qualitative Data Collection Techniques
1. Interviewing Method                     1. Unstructured Interview
2. Observation Method                      2. Observation Method
3. Mail Questionnaire                      3. Focus Group Discussion
4. Experimental Method                     4. Document Study
5. Data Base                               5. Content Analysis
Other Data Collection Techniques
1. Delphi Technique
2. Panel Study
3. Rapid Rural Appraisal
4. Participatory Rural Appraisal
5. Nominal Group Technique
6. Key Informant Interview
7. Community Risk Assessment
Observation

See what is happening


– traffic patterns
– land use patterns
– layout of city and rural areas
– quality of housing
– condition of roads
– conditions of buildings
– who goes to a health clinic
Observation is Helpful when:

 Need direct information
 Trying to understand ongoing behavior
 There is physical evidence, products, or outputs that can be observed
 Need to provide an alternative when other data collection is infeasible or inappropriate
Types of Observation

 Participatory and Non Participatory

 Candid and Covert

 Structured, Semi-structured and Unstructured

 Controlled and Uncontrolled


Advantages/Disadvantages of
Observation
Advantages:
 Subjective bias eliminated

 Researcher gets current information

 Independent of Respondents

Disadvantages:
 Expensive, Time consuming

 Limited information

 Unforeseen factors may influence observation


Interview
 The interview method of collecting data
involves presentation of oral-verbal stimuli
and reply in terms of oral-verbal responses.

 This method can be used through personal interviews or telephone interviews.

 Structured, Semi-Structured or Unstructured Interview.
Interview Types
 Personal Interviews: The interviewer asks questions, generally in face-to-face contact with the other person or persons. This may take the form of direct personal investigation or indirect oral investigation.
 Focused Interview is meant to focus attention on the
given experience of the respondent and its effects.
 Clinical Interview is concerned with broad underlying
feelings or motivations or with the course of
individual’s life experience.
 Non-directive Interview is that where the
interviewer’s function is simply to encourage the
respondent to talk about the given topic with a bare
minimum of direct questioning.
Skill of Interviewer

The main game in interviewing is to facilitate an interviewee’s ability to answer. This involves:
– easing respondents into the interview
– asking strategic questions
– prompting and probing appropriately
– keeping it moving
– winding it down when the time is right
Merits/Demerits of Interview

Merits:
 More and in depth information obtained

 Personal Information

 Greater Flexibility

 Adaptation as per the respondent

Demerits:
 Bias of Interviewer

 Expensive/Time Consuming

 Need expertise
Questionnaire Method

 A questionnaire is sent (usually by post) to the persons concerned, with a request to answer the questions and return the questionnaire.
 A questionnaire consists of a number of questions
printed in a definite order.
 The respondents have to answer the questions on
their own.
Steps in Questionnaire
Construction
 Preparation
 Constructing the first draft
 Self-evaluation
 External evaluation
 Revision
 Pre-test or Pilot study
 Revision
 Second pre-testing
 Preparing final draft
Advantages of Questionnaire

 Lower cost
 Time saving
 Accessibility to widespread respondents
 No interviewer’s bias
 Greater anonymity
 Respondent’s convenience
 Standard wordings
 No Variation
Disadvantages of Questionnaire

 Questionnaires can be used only with educated people.
 Sometimes different respondents interpret questions differently
 Questionnaires do not provide an opportunity to collect
additional information
 Researchers are not sure whether the person to whom the
questionnaire was mailed has himself answered the
questions.
 Many questions remain unanswered
 The respondent can consult other persons before filling
in the questionnaire.
Essentials of a Good Questionnaire

1. Number of questions should be kept to the minimum.
2. Questions should be simple, short, and unambiguous.
3. Questions should be arranged from simple to difficult.
4. Questions of a sensitive/personal nature, technical terms, and vague expressions should be avoided.
5. Answers to questions should not require calculations.
6. Questions should be capable of an objective answer.
7. Questions should be arranged logically.
8. Proper words should be used in the questionnaire.
9. Questionnaire should look attractive.
10. Questionnaire should be pre-tested to find out its shortcomings if
any.
11. Cross-Check and footnotes should be considered in the
questionnaire.
12. Necessary instructions should be given to the informant.
Collection of Data Through Schedule

 Schedules, like questionnaires, contain a set of questions.
 The researcher or appointed enumerators collect data through schedules.
 Enumerators go to the field, put questions to the
respondents and fill the schedules.
 Enumerators need to be trained.
Questionnaire Vs. Schedule

Questionnaire                                  Schedule
Mailed, filled by respondent                   Direct contact, filled by researcher or enumerator
Economical                                     Expensive
Non-response high                              Non-response low
Time consuming                                 Time bound
Needs literate, co-operative respondents       No such precondition
Success depends on quality of questionnaire    Success depends on quality of enumerator
Some Other Methods

 Warranty Cards: Postcard-sized cards sent to customers, with feedback collected by asking questions.
 Distributor or Store Audits: Performed by the manufacturer or distributor through salesmen. The information so obtained is used to estimate market size, market share, seasonal sales patterns, etc.
 Pantry Audits: The customer’s pantry is observed to learn people’s purchase habits (which products, which brands, etc.). Questions may be asked at the time of the audit.
Some Other Methods

 Consumer Panels: A pantry audit conducted on a regular basis is known as a ‘consumer panel’, in which a set of consumers agrees to maintain detailed daily records of their consumption and to make them available to the investigator on demand.
 Projective Techniques: Developed by psychologists, these use the projections of respondents to infer underlying motives, urges, or intentions that the respondent either resists revealing or is unable to identify.
Some Other Methods

 Use of Mechanical Devices: An eye camera records where a respondent’s eyes focus on a specific portion of a sketch, diagram, or written material. A psychogalvanometer measures the extent of body excitement resulting from a visual stimulus. A motion picture camera records the movement of a consumer at the time of purchase. An audiometer is used to learn preferences for TV channels and programmes.
Some Other Methods

 Depth Interviews: Interviews designed to discover underlying motives and desires, often used in motivational research. Indirect questions or projective techniques are used to learn about the behaviour of respondents.

 Content Analysis: Analyzing the contents of documentary materials such as books, magazines, and newspapers, and the contents of all other verbal materials, whether spoken or printed.
Editing of Primary Data
Editing involves reviewing the data collected by investigators to ensure maximum
accuracy and unambiguity. It should be done as soon as possible after the data
have been collected. The different steps of editing are discussed below:
1. Checking legibility: Obviously, the data must be legible to be used. If a
response is not presented clearly, the concerned investigator should be asked
to rewrite it.
2. Checking Completeness: An omitted entry on a fully structured questionnaire
may mean that no attempt was made to collect data from the respondent or that
the investigator simply did not record the data. If the investigator did not
record the data, prompt editing and questioning of the investigator may
provide the missing item.
3. Checking Consistency: The editor should examine each questionnaire to
check inconsistency or inaccuracy if any, in the statement. The income and
expenditure figures may be unduly inconsistent. The age and the date of birth
may disagree. The concerned investigators should be asked to make the
necessary corrections.
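The completeness and consistency checks described above can be partly automated. The Python sketch below is only an illustration: the field names (`age`, `date_of_birth`, `income`), the record layout, and the function `edit_checks` are assumptions, not part of the slides. It flags omitted entries and an age that disagrees with the recorded date of birth.

```python
from datetime import date

REQUIRED_FIELDS = ("age", "date_of_birth", "income")  # hypothetical questionnaire fields


def edit_checks(record, survey_date):
    """Return a list of problems found in one questionnaire record."""
    problems = []

    # Completeness: flag every omitted entry.
    for field in REQUIRED_FIELDS:
        if record.get(field) is None:
            problems.append(f"missing {field}")

    # Consistency: the stated age should agree with the date of birth.
    age, dob = record.get("age"), record.get("date_of_birth")
    if age is not None and dob is not None:
        had_birthday = (survey_date.month, survey_date.day) >= (dob.month, dob.day)
        expected = survey_date.year - dob.year - (0 if had_birthday else 1)
        if age != expected:
            problems.append(f"age {age} disagrees with date of birth (expected {expected})")

    return problems


record = {"age": 30, "date_of_birth": date(1990, 5, 1), "income": None}
problems = edit_checks(record, survey_date=date(2024, 3, 15))
print(problems)
# → ['missing income', 'age 30 disagrees with date of birth (expected 33)']
```

Flagged records would still go back to the investigator concerned for correction, as the steps above describe; the script only finds candidates.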
Selection of Appropriate Method
of Data Collection
 Nature, Scope and Object of enquiry
 Availability of Fund
 Availability of Time
 Degree of Precision Required
Precautions in Data Collection

 The data must be relevant to the research problem.


 It should be collected through formal or standardized research
tools.
 The data should be in a form that can easily be subjected to statistical treatment.
 The data should have minimum measurement error.
 The data must be tenable for the verification of the hypotheses.
 The data should be collected through objective procedure.
 The data should be accurate and precise.
 The data should be reliable and valid
 The data should be complete in itself and also comprehensive in
nature.
