Research Methodology
Jimma University
Department of GeES
Course: Advanced Research Methods
Academic Year: 2022/23
Semester: II
Target group: Regular GIS & RS, and LRAM PG students
By
Girma Alemu (PhD)
Email: girmaalemu83@gmail.com
ORCID: https://orcid.org/0000-0001-5783-870X
Course Outline
Survey Study
It is a method of gathering data at a particular
time for a specific objective;
• It is characterized by:
Cont’d
• gathering data on a one-shot basis and hence
is economical and efficient,
• can represent a wide range of target
population,
• generates numerical data,
• Provides:
• descriptive,
• explanatory and inferential data;
- manipulates key factors and variables to
derive frequencies
• Surveys can be distinguished as:
• cross-sectional studies, and
• longitudinal studies.
[Diagram: the research cycle – Formulate problem → Hypothesis → Materials and Methods → Results → Discussion → Interpretation and conclusion – alongside the report sections Introduction, Materials and Methods, Results, Discussion, Conclusion]
Introduction
Professional Experience:
One’s own professional experience is the most important source of a research problem,
Contacts and discussions with research-
oriented people,
Attending conferences, seminars, and
Listening to learned speakers
Cont’d
Funding,
2. Definition and Statement of the Problem
Cont’d
f) In addition:
Technical terms or phrases, with special meanings
used in the statement of the problem should be
clearly defined.
Basic assumptions or postulates relating to the
research problem should be clearly stated.
The suitability of the time period and the sources
of data available must be considered in
defining the problem.
The scope of the investigation within which the
problem is to be studied must be mentioned
explicitly in defining a research problem.
3. Extensive Literature Survey
Cont’d
Articles:
JSTOR: www.jstor.org
EconLit
Web Pages
Cont’d
Theoretical framework:
Definition. Theories are formulated to explain,
predict, and understand phenomena and, in
many cases, to challenge and extend existing
knowledge within the limits of critical bounding
assumptions.
The theoretical framework is the structure that
can hold or support a theory of a research study.
Cont’d
Declarative form
The declarative form of hypothesis is a positive statement about the expected outcome of a study.
Declarative hypotheses are further divided into:
Directional or
Non-directional
Directional
The directional hypothesis stipulates the direction of the
expected differences or relationships.
- For example, the hypothesis:
- “Soil qualities in terms of nutrients are much higher
on cultivated land than on fields under forest and
pasture land” is directional and declarative.
Non-directional
A statement which does not specify the direction of
expected difference or relationship is a non-directional
research hypothesis.
For example, the hypothesis:
“Soil qualities in terms of nutrients on cultivated land
and fields under forest and pasture land are not the
same” is declarative but non-directional.
The Null Form
In the null form, the hypothesis states that no relationship (or difference) exists.
For example, the hypothesis
“there is no difference in soil fertility status
between cultivated and pasture fields of farmers” is
the null form.
5. Scope and Limitations
7. Sample Selection
8. Execution of the Project
Some of the most important shared values
A research proposal should usually contain the
following categories of information:
Question-type titles:
These are used less commonly than
indicative and hanging titles.
However, they are acceptable where it is possible
to use few words – say less than 15.
Example: Does agricultural credit alleviate poverty in low-potential areas of Ethiopia?
2. Statement of the Problem:
Cont’d
Hence, the justification should answer the following:
How does the research relate to the priorities
of the region and the country?
What knowledge and information will be obtained?
What is the ultimate purpose the knowledge obtained
from the study will serve?
How will the results be disseminated?
How will the results be used, and who will be the beneficiaries?
6. Definition of terms and concepts:
It is necessary to define all unusual terms and concepts
that could be misinterpreted.
Technical terms or words and phrases having
special meanings need to be defined operationally.
7. Scope and limitations of the study:
Boundaries of the study should be made clear with reference
to:
(i) the scope of the study by specifying the areas to which
the conclusions will be confined and
(ii) the procedural treatment, including the sampling procedures, the techniques of data collection, the methods of data analysis, etc.
8. Basic assumptions:
Assumptions are statements of ideas that
are accepted as true.
They serve as the foundation upon which the research
study is based.
9. Review of Literature:
It refers to reading the existing literature, such as published books, research reports from various sources, office reports, etc. that have a direct relation to the problem studied.
The theoretical and empirical framework
from which the problem arises must briefly be
discussed.
Both conceptual and empirical literature is to
be reviewed for this purpose.
The researcher has to make it clear that his
problem has roots in the existing literature but it
needs further research and exploration.
The analysis of previous research eliminates the risk of duplication of what has been done.
Why does the Researcher Review the Related Literature?
The review of the related literature (Koul, 1984:8) enables
the researcher to:
be up-to-date on the work which others have done; and
this further enables him to justify that his problem has
roots in the existing literature but it needs further
investigation,
avoid unintentional duplication of well-investigated
problem,
understand and scrutinize scientifically tested
research methodology relevant to his study,
identify the tools and instruments of the data
collection which are proved to be useful
and promising in the previous studies ;
get bases for formulating hypotheses, and
know about the recommendations of previous
researchers for further research which they have
listed in their studies.
How to organize the Related Literature
Thematically, or
methodologically
Conclusion:
summarize the major contributions,
evaluate the current position, and
point out:
flaws in methodology,
gaps in the research,
contradictions, and
areas for further study
Build it under different headings
Identify sections of the review of the related literature
outline the sections of the discussion under different
headings.
One of the best guides to identifying the sections of the review of related literature is to relate them to the objectives of the study.
A careful consideration of the objectives of the study should suggest relevant sections for the discussion of related literature.
Writing the sections of the review of the
related literature
In the discussion
Give an account for the cited information in terms
of the research problem;
Try to extract important information and paraphrase
it;
Use short direct quotation;
Use long direct quotations only for very good
reason, otherwise avoid using long direct
quotations;
Discuss the related literature from:
a comprehensive perspective to more and
more specific or more localized studies which focus
closer and closer to the specific problem of the
study.
Avoid direct copying
Direct copying, other than short and long direct quotations, is plagiarism.
Cont’d
II) Methodology
Cont’d
a) Procedures for data collection: details about
sampling procedures and data collection tools
are described.
(i) Sampling – in research situations the researcher
usually comes across unmanageable populations in
which large numbers are involved.
(ii) Tools (instruments) – in order to collect evidence or
data for a study the researcher has to make use of
certain tools such as observations, interviews,
questionnaires, etc.
The proposal should explain the reasons for
selecting a particular tool or tools for collecting
the data.
Cont’d
(iii) Simulation models:
Cont’d
Direct costs
Include at least:
Personnel cost,
Consumable supplies,
Equipment,
Travel,
Communications,
Publication, and
Other direct costs.
Cont’d
Consumable supplies:
office supplies (stationeries),
computers,
Chemicals (if appropriate), and
educational materials (books, journals)
Cont’d
phy, slides, and others
Cont’d
Indirect costs
Those costs incurred in support and management of the proposed activities that cannot be readily determined by direct measurement. Examples include:
Overhead costs for institutions or associations
General administrative costs
Operation and maintenance
Depreciation and use allowance
Cont’d
Work plan
Research must also be scheduled appropriately.
The researcher should also prepare a realistic time schedule for completing the study within the time available.
Dividing a study into phases and assigning dates for the completion of each phase helps the researcher to use his time systematically.
Cont’d
Work plan is a schedule, chart or graph that
summarizes how different components of a research
proposal will be implemented in a coherent way within
a specific time-span.
It may include:
The tasks to be performed;
When and where the tasks will be performed;
Who will perform the tasks and the time each person
will spend on them;
It describes the plan of assessing the ongoing progress
toward achieving the research objectives;
Cont’d
VI. Bibliography:
Be sure to include every work that was referred to in the
proposal
Works not referred to in the proposal need not be listed; the bibliography does not have to be long or exhaustive.
Formats vary slightly by journal, etc.
A common format:
For a book: Smith, Adam (1776). An Inquiry into the Nature and
Causes of the Wealth of Nations. London: Dent and Sons
publishing.
For an article: Coase, R (1937). “The Nature of the Firm.”
Economica 4, 386-405.
Cont’d
References
It is a must to give references to all the information that
you obtain from books, papers in journals, and other
sources.
References may be made in:
the main text and
the reference section.
Cont’d
Doran, J.W., Parkin, T.B., 1994. Defining and assessing soil quality. In:
Doran, J.W., Coleman, J.W. and Bezdicek, D.F. (Eds.), Defining soil
quality for a sustainable environment. Soil Sci. Soc. Am., Special Publ.
35. ASA-SSSA, Madison. WI, USA. pp. 3-21
Cont’d
For an internet reference
UN, 2002. United Nations Emergencies Unit for Ethiopia, Addis
Ababa, Ethiopia.
http://www.sas.upenn.edu/African_Studies/eue_web/eue_mnu.htm (consulted in April 2005).
Cont’d
Appendices/Annexes
Cont’d
Bias can arise:
if the selection of the sample is done by some
non-random method i.e. selection is consciously
or unconsciously influenced by human choice.
if the sampling frame (i.e. list, index, population
record) does not adequately cover the target
population.
if some sections of the population are
impossible to find or refuse to co-operate.
Cont’d
Major Reasons for Sampling
1) Resource Limitations: A sample study is usually less
expensive than a census.
2) Superior Quality of Results:
more accurate measurement
3) Infinite Population: sampling is also the only process
possible if the population is infinite.
4) Destructive nature of some tests: sampling remains the
only option when a test involves the destruction of the
items under study.
Example: testing the quality of a commodity (beer, etc.).
Cont’d
Steps in Sampling Design
a) Identifying the relevant population: when one wants to undertake a sample survey, the relevant population from which the sample is going to be drawn needs to be identified.
Example: if the study concerns income, then the
definition of the population elements as individuals
or households can make a difference.
b) Determining the method of sampling:
Whether a probability sampling procedure or a non-
probability sampling procedure has to be used is also
very important.
Cont’d
c) Securing a sampling frame:
A list of elements from which the sample is
actually drawn is important and necessary
(e.g. Kebele registry)
d) Identifying parameters of interest:
What specific population characteristics
(variables and attributes) may be of interest.
Cont’d
e) Determining the sample size:
The determination of the sample size depends on
several factors:
i) Degree of homogeneity:
The size of the population variance is the
Survey and Field Research Methods… cont’d
Cont’d
Hence:
For small populations (<1000), a researcher needs a large
sampling ratio (about 30%). Hence, a sample size of
about 300 is required for a high degree of accuracy.
For a moderately large population (about 10,000), a smaller sampling ratio (about 10%) is needed – a sample size of around 1,000.
To sample from a very large population (over 10 million), one can achieve accuracy using tiny sampling ratios (0.025%), or samples of about 2,500.
These are approximate sizes, and practical limitations (e.g. cost) also play a role in a researcher’s decision about sample size.
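These rules of thumb can be written down as a small helper; the function and its thresholds are only a sketch of the approximate ratios above, not a formal rule, and practical limits such as cost still apply:

```python
def suggested_sample_size(population):
    """Approximate sample size from the rule-of-thumb sampling ratios."""
    if population < 1_000:
        return round(population * 0.30)   # small population: ~30% ratio
    if population <= 10_000:
        return round(population * 0.10)   # moderately large: ~10% ratio
    return 2_500                          # very large: ~2,500 usually suffices
```

For example, a population of 900 suggests about 270 units, while any population beyond ten million is served by roughly 2,500.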
Cont’d
Sample Size in Qualitative Studies
There are no fixed rules for sample size in qualitative research.
The size of the sample depends on WHAT you try to find out,
and from what different informants or perspectives you try to
find that out.
The sample size is therefore estimated as precisely as possible by non-probability means.
Cont’d
Probability sampling:
Cont’d
Types of probability sampling methods:
We can distinguish between the following types
of probability sampling methods:
Simple Random Sampling
Systematic Sampling
Stratified Sampling
Cluster Sampling
Hybrid Sampling
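The first two methods in this list can be sketched in a few lines (illustrative Python; the frame of 100 numbered households is hypothetical):

```python
import random

def simple_random_sample(frame, n, seed=1):
    """Every element of the frame has an equal chance of selection."""
    return random.Random(seed).sample(frame, n)

def systematic_sample(frame, n):
    """Pick every k-th element of the frame, where k = N // n."""
    k = len(frame) // n
    return frame[::k][:n]

households = list(range(1, 101))          # a hypothetical frame of 100 households
srs = simple_random_sample(households, 10)
sys_sample = systematic_sample(households, 10)
```

Note that the systematic sample is fully determined by the ordering of the frame, which is why a periodic ordering can bias it.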
Cont’d
3. Stratified Sampling
Most populations can be segregated into a number
of mutually exclusive sub-populations or Strata.
Cont’d
How to Stratify
Three major decisions must be made in order to stratify the given population into mutually exclusive groups.
(1) What stratification base to use: stratification would be
based on the principal variable under study such as
income, age, education, sex, location, religion, etc.
(2) How many strata to use: there is no precise answer as
to how many strata to use.
The more strata, the closer one comes to maximizing inter-strata differences and minimizing intra-strata differences.
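A proportional stratified sample can be sketched as follows, under the assumption that each unit carries a stratum label (the frame and the 10% fraction are hypothetical):

```python
import random

def stratified_sample(frame, stratum_of, fraction, seed=1):
    """Group the frame into mutually exclusive strata, then draw a simple
    random sample from each stratum at the given fraction."""
    rng = random.Random(seed)
    strata = {}
    for unit in frame:
        strata.setdefault(stratum_of(unit), []).append(unit)
    sample = []
    for units in strata.values():
        sample.extend(rng.sample(units, round(len(units) * fraction)))
    return sample

# hypothetical frame: (household id, location stratum)
frame = [(i, "highland" if i % 2 else "lowland") for i in range(200)]
sample = stratified_sample(frame, stratum_of=lambda u: u[1], fraction=0.10)
```

Using the same fraction in every stratum gives proportional allocation; different fractions per stratum would give disproportional allocation.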
Cont’d
4. Cluster Sampling:
The selection of groups of study units (clusters) instead
of the selection of study units individually is called
CLUSTER SAMPLING.
If the total area of interest is a big one and can be divided into a number of smaller non-overlapping areas (clusters), and if some of these clusters are selected randomly, we have cluster sampling.
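A minimal sketch of this idea (the villages-of-households structure is hypothetical): whole clusters are drawn at random, and every unit inside a chosen cluster is studied.

```python
import random

def cluster_sample(clusters, n_clusters, seed=1):
    """Randomly select whole clusters, then take every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

# hypothetical: 10 villages (clusters) of 5 households each
villages = {v: [f"hh-{v}-{i}" for i in range(5)] for v in range(10)}
sample = cluster_sample(villages, 3)
```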
(2) Quota Sampling
Quotas are assigned to different strata groups and
interviewers are given quotas to be filled from
different strata.
A researcher first identifies categories of people
(e.g., male, female) then decides how many to get
from each category.
The major limitation of this method is the absence
of an element of randomization. Consequently the
extent of sampling error cannot be estimated.
It is used by opinion pollsters, in marketing research and other similar research areas.
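Quota filling can be sketched as below; note that there is no randomization anywhere in the selection, which is exactly why the sampling error cannot be estimated (the respondent stream and quotas are hypothetical):

```python
def fill_quotas(stream, category_of, quotas):
    """Take respondents from a non-random stream until every quota is met."""
    counts = {c: 0 for c in quotas}
    sample = []
    for person in stream:
        c = category_of(person)
        if c in counts and counts[c] < quotas[c]:
            counts[c] += 1
            sample.append(person)
        if counts == quotas:
            break
    return sample

# hypothetical stream of passers-by: (name, sex)
people = [(f"p{i}", "female" if i % 3 == 0 else "male") for i in range(50)]
sample = fill_quotas(people, category_of=lambda p: p[1],
                     quotas={"male": 5, "female": 5})
```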
(3) Purposive or Judgment sampling
Purposive sampling occurs when one draws a non-
probability sample based on certain criteria.
Focusing on a limited number of informants, whom we select strategically so that their in-depth information gives optimal insight into an issue, is known as purposeful sampling.
It uses the judgment of the expert in selecting cases.
BUT care should be taken that, for different categories of informants, selection rules are developed to prevent the researcher from sampling according to personal preference.
(4) Snowball (Network) Sampling
This is a method for identifying and sampling (or
selecting) the cases in a network.
Snowball sampling is based on an analogy to a snowball, which begins small but becomes larger as it is rolled on wet snow and picks up additional snow.
Snowball sampling begins with one or a few people or cases and spreads out on the basis of links to the initial cases.
You start with one or two information-rich
key informants and ask them if they know
persons who know a lot about your topic of
interest.
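The "start small and spread out along links" idea can be sketched as a breadth-first walk over a referral network (the network, seed and size limit below are hypothetical):

```python
from collections import deque

def snowball_sample(referrals, seeds, max_size):
    """Begin with a few key informants and follow their referrals outward
    until the desired sample size is reached."""
    sample, queue = [], deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        if person not in sample:
            sample.append(person)
            queue.extend(referrals.get(person, []))
    return sample

# hypothetical referral network: who names whom
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["E"]}
sample = snowball_sample(referrals, seeds=["A"], max_size=4)
```

Because each wave depends on who the previous wave knows, the result is a non-probability sample of the network, not of the whole population.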
Problems in Sampling
Two types of errors:
Non-sampling errors
Sampling errors
Non-sampling errors are biases or errors due to fieldwork problems, interviewer-induced bias, clerical problems in managing data, etc.
These would contribute to error in a survey,
irrespective of whether a sample is drawn or a
census is taken.
On the other hand, error which is attributable to
sampling, and which therefore, is not present in
information gathered in a census is called sampling
error.
a) Non-Sampling Error
Non-sampling errors refer to:
Non-coverage error
The wrong population being sampled
Non-response error
Instrument error
Interviewer error
Non-Coverage sampling error: This refers to sample
frame defect.
Omission of part of the target population (for
instance, soldiers, students living on campus, people
in hospitals, prisoners, households without a
telephone in telephone surveys, etc).
Non-coverage error also occurs when the lists used for sampling are incomplete or outdated.
The wrong population is sampled
Researchers must always be sure that the group
being sampled is drawn from the population they
want to generalize about or the intended
population.
Non-response error
Some people refuse to be interviewed because they
are ill, are too busy, or simply do not trust the
interviewer.
One should try to reduce the incidence of
non-response errors.
Non-response error can occur in any interview
situation, but it is mostly encountered in large-scale
surveys with self-administered questionnaires.
It is important in any study to mention the non-
response rate and to honestly discuss whether and
how the non-response might have influenced the
results.
Instrument error
The word instrument in sampling survey means the
device in which we collect data- usually a
questionnaire.
When a question is badly asked or worded, the
resulting error is called instrument error.
Example: leading questions or carelessly worded questions may be misinterpreted by some respondents.
Interviewer error: This occurs when some characteristic of the interviewer, such as age or sex, affects the way in which respondents answer questions.
Example: questions about sexual behavior might be
differently answered depending on the gender of
the interviewer.
To sum up, a researcher must ensure that non-sampling errors are avoided as far as possible, or are evenly balanced (non-systematic) and thus cancel out in the calculation of the population estimates.
b) Sampling Errors
Sampling errors are random variations in the
sample estimates around the true
population parameters.
Error which is attributable to sampling, and
which therefore is not present in a census-
gathered information, is called sampling
error.
Sampling errors can be calculated only for
probability samples.
Increasing the sample size is one of the major
instruments to reduce the extent of the sampling
error.
Sampling error is related to confidence intervals.
A narrower confidence interval means more precise estimates of the population for a given level of confidence.
The confidence interval for the true population mean is given by:

    Mean ± z · σ / √n

where Mean is the sample mean, z is the value of the standard variate at a given confidence level (to be read from the table giving the area under the normal curve), n is the sample size, and σ is the standard deviation of the sample.
The sampling error is given by:

    z · σ / √n
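Both quantities are direct to compute; a minimal sketch (function names are illustrative, and z = 1.96 corresponds to the 95% confidence level):

```python
import math

def sampling_error(s, n, z=1.96):
    """Sampling error: z * s / sqrt(n)."""
    return z * s / math.sqrt(n)

def confidence_interval(mean, s, n, z=1.96):
    """Confidence interval for the population mean: mean +/- z * s / sqrt(n)."""
    e = sampling_error(s, n, z)
    return (mean - e, mean + e)

# e.g. sample mean 50, standard deviation 10, sample size 100
low, high = confidence_interval(50, 10, 100)
```

Quadrupling the sample size halves the sampling error, since the error shrinks with the square root of n.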
Dealing with missing data:
There are several reasons why the data may be
missing.
They may be missing because equipment
malfunctioned, the weather was terrible, or
people got sick, or the data were not entered
correctly.
If data are missing at random, by far the most
common approach is to simply omit those cases
with missing data and to run our analyses on what
remains.
Although deletion often results in a substantial
decrease in the sample size available for the
analysis, it does have important advantages.
Under the assumption that data are "missing at
random”, it leads to unbiased parameter
estimates.
If, on the other hand, data are not missing at
random, but are missing as a function of some
other variable, a complete treatment of missing
data would have to include a model that accounts
for missing data.
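The "omit cases with missing data" approach (listwise deletion) can be sketched as follows; the row layout and the None-for-missing convention are assumptions for illustration:

```python
def listwise_delete(rows):
    """Keep only complete cases: drop any row containing a missing value (None)."""
    return [row for row in rows if None not in row]

# hypothetical survey records: (age, plot size in ha)
records = [(25, 3.1), (31, None), (40, 2.7), (None, 1.9)]
complete_cases = listwise_delete(records)  # analyses then run on what remains
```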
Data Collection Techniques
Every study is a search for information about the
given topic.
Qualitative and Quantitative data
The data should be sufficient to test the
hypotheses
Collection of the data should be
feasible
The question is from where and how to
get the information (the data).
Data can be acquired from:
Secondary sources, and
Primary sources
Secondary Sources of data
Secondary sources are those, which have
been collected by other individuals or
agencies.
As much as possible secondary data should always
be considered first, if available.
Why reinvent the wheel if the data already exist?
When dealing with secondary data you should ask:
Is the owner of the data making them available to
you?
Is it free of charge? If not, how will you pay?
Are the data in a format that you can work with?
Advantages of Secondary data
Can be found more quickly and cheaply.
Most research on past events or distant places has to rely on secondary data sources.
Limitations
The information often does not meet one’s specific
needs.
Definitions might differ, units of measurements may be
different and different time periods may be involved.
It is difficult to assess the accuracy of the information when the research design or the conditions under which the research took place are unknown.
Data could also be out of date.
Sources of Secondary Data
Secondary data may be acquired from various sources:
Department reports, production summaries, financial and accounting reports, marketing and sales studies, books, periodicals, reference books (encyclopedias), university publications (theses, dissertations, etc.), policy documents, statistical compilations, research reports, proceedings, personal documents (historical studies), etc.
The Internet
Primary Sources of Data
Data that came into being by the people
directly involved in the research.
Data collected afresh and for the first time happen to
be original in character.
Qualitative and Quantitative data collection techniques
There are two approaches to primary data collection:
Qualitative data collection approaches
Qualitative data can be acquired from:
case studies,
rapid rural appraisal methods,
focus group discussions and
key informant interviews.
i) Case studies
A case study research involves a
detailed investigation of a particular
case.
• Through Interviews (several forms of interviews-
open-ended, focused, or structured).
• Through Direct observation (field visits).
ii)Rapid Rural Appraisal (RRA)
RRA is a systematic but semi-structured activity
often by a multidisciplinary team.
The techniques rely primarily on expert observation
coupled with semi-structured interviewing.
The RRA method takes only a short time to complete.
The techniques of RRA include:
Interviews with individuals, households and key
informants
Group interview techniques, including focus-group interviewing, etc.
iii) Focus group discussions
A FGD is a group discussion guided by a facilitator,
during which group members talk freely and
spontaneously about a certain topic.
The group of individuals is expected to have experience of, or opinions on, the topic and is selected by the researcher.
Its purpose is to obtain in-depth information on
concepts, perceptions and ideas of a group.
It is more than a question-answer interaction.
The idea is that group members discuss the
topic and interact among themselves with
guidance from the facilitator.
Why use focus groups?
The main purpose of focus group research is to obtain in-depth information on the concepts, perceptions and ideas of a group.
v) Triangulation
Types of Triangulation
Data triangulation, which entails gathering data
through several sampling strategies at different
times and social situations.
Investigator triangulation, which refers to the use
of more than one researcher in the field to gather
and interpret data.
Theoretical triangulation, which refers to the use
of more than one theoretical proposition in
interpreting data.
Methodological triangulation, which refers to the
use of more than one method for analyzing the
data.
Quantitative Primary Data Collection Methods
Weakness of the Method
The quality of information secured depends heavily on
the ability and willingness of the respondents.
A respondent may interpret questions or
concept differently from what was intended
by the researcher.
A respondent may deliberately
mislead the researcher by giving false information.
Surveys could be carried out through:
Face to face personal interview
By telephone interview
By mail or e-mail, or
By a combination of all these.
Personal Face to face Interview
It is a two-way conversation in which the respondent is asked to provide information.
Advantages:
The depth and detail of the information that can be
secured far exceeds the information secured from
telephone or mail surveys.
Interviewers can probe additional questions, gather
supplemental information through observation, etc.
Interviewers can make adjustments to the language of the interview because they can observe the problems and effects the respondent is faced with.
Limitations of the Method
Non-response error
This error occurs when you are not able to find those whom
you are supposed to study.
In probability samples there are pre-designated persons to
be interviewed.
When one is forced to interview substitutes, an unknown
bias is introduced.
Under such circumstances, one of the following could be tried.
The most reliable solution is to make callbacks.
To treat all remaining non-respondents as a new
subpopulation and draw a random sample from the
subpopulation.
To substitute someone else for the missing respondent if the population is homogeneous.
Response error
Errors are made in the processing and tabulating
of data.
Respondent may fail to report fully and accurately.
Cheating by enumerators, usually those with only limited training and under little direct supervision.
Enumerators can also distort the results of a survey by inappropriate suggestions, word emphasis, tone of voice and question rephrasing.
Perceived social distance between enumerator and
respondent also has a distorting effect.
Cost Considerations
Interviewing is a costly exercise.
Much of the cost results from the substantial enumerator time taken up with administrative and travel tasks.
b) Telephone Interview
The telephone can be a helpful medium of communication for setting up interviews and screening large populations for rare respondent types.
Strength of this method
Moderate travel and administrative costs
Faster completion of the study
Responses can be directly entered on to the computer
Less interviewer bias.
Limitations of this method
Respondents must be available by phone.
The length of the interview period is short.
Telephone interviews can result in less complete responses, and those interviewed by phone may find the experience less rewarding than a personal interview.
C) Interviewing by Mail
Self-administered questionnaires may be used in surveys.
Advantages
Lower cost than personal interview
Persons who might otherwise be
inaccessible can be contacted (major corporate
executives)
Respondents can take more time to collect facts
Disadvantages
Non response error is expected
Large amount of information may not be acquired
Survey Instrument Design
Actual instrument design begins by drafting
specific measurement questions.
Both the subject and wording of each question are
important.
The psychological order of the question needs to be
considered.
Questions that are more interesting, easier to answer,
and less threatening usually are placed early in the
sequence to encourage response.
The main components of a questionnaire
Designing of a Questionnaire
1. Question Content
Both questions and statements could be used in
survey research.
Using both in a given questionnaire gives the
researcher more flexibility.
Minimizing the number of questions is highly
desirable, but one should never try to ask
two questions in one.
Question content usually depends on the
respondent’s:
ability, and
willingness to answer the question accurately.
a) Is the question of proper scope?
Respondent must be competent enough to answer the
questions.
The respondent information level should be assessed
when determining the content and appropriateness of a
question.
Questions that overtax the respondent’s recall ability
may not be appropriate.
b) Willingness of respondent to answer adequately
Even if respondents have the information, they may be
unwilling to give it.
Some topics are also too sensitive to discuss with
strangers.
Examples: the most sensitive topics concern money matters
If respondents consider a topic to be irrelevant and
uninteresting they would be reluctant to give an
adequate answer.
Some of the main reasons for unwillingness:
The situation is not appropriate for disclosing the
information
Disclosure of information would be embarrassing
Disclosure of information is a potential threat to the
respondent
Some approaches that may help to secure more
complete and truthful information
2. Question Wording
a) Shared Vocabulary
In a survey the two parties must understand each other
and this is possible only if the vocabulary used is
common to both parties. So, don’t use unfamiliar words
or abbreviations or ambiguous words.
b) Question Clarity
Do not use emotionally loaded or vaguely defined
words.
c) Personalization
Finding the right degree of personalization may be a
challenge.
Instead of asking “What would you do about ...?”, it is better to ask “What would people do about ...?”
d) Provision of adequate alternatives
Asking a question that does not accommodate all
possible responses can confuse and frustrate the
respondent.
Are adequate alternatives provided? It is wise to express each alternative explicitly in order to avoid bias.
3. Response structure or format
a) Open Questions
Limitations
Different respondents give different degree of details in
answers – responses may not be consistent.
Some responses may be irrelevant
Comparison and statistical analysis become very
difficult.
Articulate and highly literate respondents have an advantage
Requires a greater amount of respondent time and effort.
b) Closed Questions
Limitations
Can suggest ideas that the respondents would not otherwise have
Respondents with no opinion or no knowledge can answer anyway
Respondents can be confused because of too
many choices
During the construction of closed ended questions:
The response categories provided should be exhaustive.
They should include all the possible
responses that might be expected.
In multiple choice type questions, the answer categories
must be mutually exclusive.
The respondent should not be compelled to select more than one answer.
4) Question Sequence - order
The order in which questions are asked can affect the
response as well as the overall data collection
activity.
Transitions between questions should be smooth.
Grouping questions that are similar will make the
questionnaire easier to complete, and the respondent
will feel more comfortable.
Questionnaires that jump from one topic to another are
not likely to produce high response rates.
Some guides to improve quality
5) Physical Characteristics of a Questionnaire
Formats for Responses
A variety of methods are available for presenting a
series of response categories.
Boxes
Blank spaces
Entering code numbers besides each response and
circle.
Providing Instructions
Every questionnaire, whether self-administered by the
respondent or administered by an interviewer,
should contain clear instructions.
General instructions
Data Processing and Analysis
iii) Classification and Tabulation
Once the data are edited and coded, the data presentation exercise begins.
Most research studies result in a large volume of raw data, which must be reduced into homogeneous groups if we are to get meaningful relationships.
Classification is the process of arranging data in
groups or classes on the basis of common
characteristics.
Data having common characteristics are placed in
similar classes and in this way the entire data get
divided into a number of groups or classes.
Tabulation is the process of summarizing raw data and
displaying it in compact form (i.e. in the form of statistical
tables) for further analysis.
It is an orderly arrangement of data in columns and
rows.
Tabulation may be done by hand or by mechanical or
electronic devices such as the computer.
The choice is made largely on the basis of the size and type of study, alternative costs, time pressures and the availability of computer facilities.
In the case of computer tabulation, programs such as SPSS, Lotus, Excel, STATA, etc. could be used.
Tabulation provides the following advantages:
It conserves space and reduces explanatory and descriptive statements to a minimum.
It facilitates the process of comparison.
It facilitates the summation of items and the detection of errors and omissions.
It provides a basis for various statistical computations such as measures of central tendency, dispersion, etc.
Tabulation may be classified as simple and complex.
Simple tabulation gives information about one or more groups of independent questions.
Complex tabulation shows the division of data into two or more categories.
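The simple/complex distinction can be sketched with Python's standard library (the survey records below are invented for illustration):

```python
from collections import Counter

# Invented survey records: (sex, response) pairs for illustration only
records = [("F", "Yes"), ("F", "No"), ("M", "Yes"),
           ("M", "Yes"), ("F", "Yes"), ("M", "No")]

# Simple tabulation: counts for one question only
simple = Counter(resp for _, resp in records)

# Complex (cross) tabulation: counts divided across two categories
cross = Counter(records)

print(simple)               # overall Yes/No counts
print(cross[("F", "Yes")])  # Yes answers among female respondents
```

The same `Counter` idea scales to any pair of coded questions; statistical packages produce the equivalent two-way tables directly.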
II) Data Analysis
A large volume of raw statistical information needs to be reduced to more manageable dimensions if one is to see meaningful relationships in it.
Data analysis is the computation of certain indices or measures.
It refers to the computation of certain measures along with searching for patterns of relationship that exist among data groups.
Data can be analyzed qualitatively or quantitatively.
Quantitative data analysis
Where the data are quantitative, there are some
determinants of the appropriate statistical tool for
analysis.
Were the data collected using a random or non-random sample?
If non-random, then non-parametric data analysis techniques are appropriate;
if random, then parametric techniques are appropriate.
Were the samples dependent (related) or
independent?
Samples are said to be dependent (related) when
the measurement taken from one sample affects
the measurement taken again from the same
sample.
Samples are independent if the measurements
taken from one sample do not affect those from
another sample.
Parametric tests
Do the data have characteristics which can lead to the application of parametric tests? i.e.
Were the observations drawn from a population with a normal distribution, i.e. are the data normally distributed?
Do the sets of data being compared have approximately equal variances (homogeneity of variances)?
Were the data measured on a ratio scale?
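As a rough illustration of the homogeneity-of-variances check, a common rule of thumb (an assumption here, not stated in the slides) is that the larger sample variance should not exceed about four times the smaller; formal tests such as Levene's are available in packages like SciPy. A minimal sketch with invented data:

```python
import statistics

def variances_roughly_equal(a, b, max_ratio=4.0):
    """Rule-of-thumb check: ratio of larger to smaller sample variance."""
    va, vb = statistics.variance(a), statistics.variance(b)
    return max(va, vb) / min(va, vb) <= max_ratio

# Invented measurements for two groups
group1 = [12.1, 11.8, 12.5, 12.0, 11.9]
group2 = [11.5, 12.6, 12.2, 11.7, 12.4]
print(variances_roughly_equal(group1, group2))
```

If the check fails, a non-parametric test or a variance-stabilizing transformation is usually the safer route.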
Non-parametric tests (data is nominal or ordinal)
Uni-variate Analysis
Uni-variate analysis refers to the analysis with
respect to one variable.
It is also called a one-dimensional analysis.
The uni-variate analysis could either be presented in the form of statistical measures, such as measures of central tendency and measures of variation, or in the form of graphs.
Graphical illustrations could also be used to
demonstrate the frequency distribution (histograms,
ogives, polygons, bar graphs, line graphs and
circular graphs or pie charts).
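A minimal sketch of such uni-variate measures using Python's standard library (the sample values are invented for illustration):

```python
import statistics

# Invented sample of household sizes for illustration
values = [2, 3, 3, 4, 5, 3, 6, 2]

print(statistics.mean(values))    # central tendency: mean
print(statistics.median(values))  # central tendency: median
print(statistics.mode(values))    # central tendency: mode
print(statistics.stdev(values))   # variation: sample standard deviation
```

The same figures could equally be shown graphically as a histogram or bar graph, as the slide notes.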
Descriptive Analysis
The initial uni-variate analysis may be the
presentation of descriptive analysis in the form of
frequency distributions.
A frequency distribution provides a profile of different groups on any of a multitude of characteristics, such as size, composition, efficiency, or preferences of persons or other entities.
The data in a frequency distribution can be used to calculate a number of statistical indices, which summarize the results even further.
Measures of central tendency are examples.
Multivariate Analysis
Multivariate analysis involves the considerations
of two or more variables.
If we have two variables then we have bi-variate analysis, but if we have more than two variables then we have multivariate analysis.
Several multivariate analyses could be undertaken, such as the construction of bi-variate tables, or multivariate techniques such as multiple regression, ANOVA, discriminant analysis, probit and logit analyses, canonical analysis, etc.
Summary chart concerning analysis of data:
Analysis of data (in a broad, general way) can be categorised into:
- processing of data (preparing data for analysis), and
- analysis of data (analysis proper).
Pitfalls in Data Analysis
The problem with statistics
Some aspects of statistical thought might lead many people to be distrustful of it.
There are three broad classes of statistical pitfalls:
The first involves sources of bias. These are conditions or circumstances which affect the external validity of statistical results.
The second category is errors in methodology, which can lead to inaccurate or invalid results.
The third class of problems concerns interpretation of results - how statistical results are applied (or misapplied) to real-world issues.
1. Sources of Bias
The core value of statistical methodology is its ability
to assist one in making inferences about a large group
(a population) based on observations of a smaller
subset of that group.
In order for this to work correctly,
the sample must be similar to the target population in all
relevant aspects (representative sampling);
certain aspects of the measured variables must conform to
assumptions which underlie the statistical procedures to be
applied (statistical assumptions).
Representative sampling
This is one of the most fundamental tenets of inferential
statistics:
the observed sample must be representative of the target
population in order for inferences to be valid.
The ideal scenario would be where the sample is chosen by selecting members of the population at random, with each member having an equal probability of being selected for the sample.
The sample "parallels" the population with respect to certain key characteristics which are thought to be important to the investigation at hand.
The problem comes in applying this principle to real-world situations.
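A minimal sketch of such equal-probability selection using Python's standard library (the sampling frame of household IDs is invented for illustration):

```python
import random

# Invented sampling frame: 1,000 household IDs
population = list(range(1, 1001))

random.seed(42)  # fixed seed so the draw is reproducible
sample = random.sample(population, k=50)  # each ID equally likely, no repeats

print(len(sample), len(set(sample)))  # 50 distinct members
```

In practice the hard part is assembling a complete and accurate sampling frame, which is exactly the real-world difficulty the slide points to.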
Statistical assumptions.
The validity of a statistical procedure depends on
certain assumptions it makes about various aspects of
the problem.
For instance, linear methods depend on the assumptions of normality and independence.
Unfortunately, this offers an almost irresistible temptation
to ignore any non-normality, no matter how bad the
situation is.
If the distributions are non-normal, try to figure out
why; if it's due to a measurement artifact try to
develop a better measurement device.
Another possible method for dealing with unusual
distributions is to apply a transformation.
However, this has dangers as well; an ill-considered
transformation can do more harm than good in terms of
interpretability of results.
The assumption regarding independence of observations is
more troublesome, because it is so frequently violated in
practice.
Observations which are linked in some way may show some
dependencies.
One way to try to get around this is to aggregate cases to the higher level.
Example: use households as the unit of analysis, rather than individuals.
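As a minimal sketch of the transformation idea mentioned above, a log transform is a common choice for right-skewed data (the income values are invented; the technique itself is standard, not specific to these slides):

```python
import math

# Invented right-skewed incomes: a few large values dominate
incomes = [500, 700, 900, 1200, 2000, 15000]

logged = [math.log10(x) for x in incomes]

# The raw values span a factor of 30; on the log scale
# the spread compresses to about 1.5 units
print(max(incomes) / min(incomes))
print(round(max(logged) - min(logged), 2))
```

The danger the slide warns about is real: results are now statements about log-income, and interpreting coefficients requires transforming back.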
2. Errors in methodology
The most common hazards include designing experiments
with insufficient power, ignoring measurement error, and
performing multiple comparisons.
Statistical Power. The power of your test generally depends
on the sample size, the effect size you want to be able to
detect, the alpha you specify, and the variability of the
sample.
Based on these parameters, you can calculate the power level of
your experiment.
Similarly, you can specify the power you desire (e.g. .80) and the alpha level, and use the power equation to determine the proper sample size for your experiment.
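That sample-size calculation can be sketched for a two-sample comparison of means; the closed-form formula below and the effect size d = 0.5 are standard textbook assumptions, not taken from the slides:

```python
import math

def sample_size_per_group(d, z_alpha=1.96, z_beta=0.8416):
    """n per group for a two-sided two-sample z-test:
    z_alpha = 1.96 (alpha = .05), z_beta = 0.8416 (power = .80)."""
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)

# Medium effect size (d = 0.5) at alpha = .05, power = .80
print(sample_size_per_group(0.5))  # about 63 per group
```

Smaller effect sizes drive the required n up quadratically, which is why underpowered studies are so common.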
If you have too little power, you run the risk of overlooking
the effect you're trying to find.
If your sample is too large, nearly any difference, no matter
how small or meaningless from a practical standpoint, will
be "statistically significant".
Measurement error. Most statistical models assume error-free measurement.
However, measurements are seldom if ever perfect.
Particularly when dealing with noisy data such as
questionnaire responses or processes which are difficult to
measure precisely, we need to pay close attention to the
effects of measurement errors.
Two key characteristics of measurement are reliability and validity.
Reliability refers to the ability of a measurement
instrument to measure the same thing each time it
is used.
So, a reliable measure should give you similar
results.
If the characteristic being measured is stable over
time, repeated measurement of the same unit should
yield consistent results.
Validity is the extent to which the indicator
measures the thing it was designed to measure.
Validity is usually measured in relation to some
external criterion.
3. Problems with interpretation
There are a number of difficulties which can arise in the
context of interpretation.
Confusion over significance. The difference between "significance" in the statistical sense and "significance" in the practical sense continues to elude many consumers of statistical results.
Significance (in the statistical sense) is really a function of sample size and experimental design, and does not by itself indicate the practical strength of the relationship.
With low power, you may be overlooking a really useful relationship; with excessive power, you may be finding microscopic effects with no real practical value.
Precision and Accuracy. These two concepts often
get confused.
Precision refers to how finely an estimate is specified, whereas accuracy refers to how close an estimate is to the true value.
Estimates can be precise without being accurate.
Causality: assessing causality is the most important
function of most statistical analysis.
For causal inference you must have random
assignment.
Many of the things we might wish to study are not subject to experimental manipulation.
Hence, it will require a multifaceted approach to
the research to come to any strong conclusions
regarding causality:
use of chronologically structured designs (placing
variables in the roles of antecedents and
consequents),
Use several replications.
Graphical Representations. There are many ways to present quantitative results graphically, and it is easy to go astray by misapplying graphical techniques.
Multiple Variables and Confounds
Controlling for Confounding Variables
We can first organize the universe of variables and
reduce them by classifying every variable into one of
two categories: Relevant or Irrelevant to the
phenomenon being investigated.
The relevant variables are those which are important to
understand the phenomenon, or those for which a
reasonable case can be made.
Example: if the literature tells us that Consumption Expenditure
is associated with income, then we will consider income to be
a relevant variable.
If we have not included the relevant variable in our
analysis it can be because of different reasons.
One reason we might choose to exclude a variable is
because we consider it to be irrelevant to the
phenomenon we are investigating.
If we classify a variable as irrelevant, it means that it
has no systematic effect on any of the variables
included.
Irrelevant variables require no form of control, as they
are not systematically related to any of the variables in
our model, so they will not introduce any influence.
Two basic reasons why relevant variables might be
excluded:
First, the variables might be unknown.
We might have overlooked some relevant variables, but the
fact that we have missed these variables does not mean that
they have no effect.
Another reason for excluding relevant variables is
because they are simply not of interest.
Although the researcher knows that the variable
affects the phenomenon being studied, he does not
want to include its effect in the model.
Finally, there remain two kinds of variables which
are explicitly included in our hypothesis tests.
The first are the relevant, interesting variables which are
directly involved in our hypothesis test.
The second is called a control variable.
The control variable is included because it affects
the relevant variables and we need to remove or
control for its effect.
Internal and External Validity
Knowing what variables you need to control for is
important, but even more important is the way you
control for them.
Several ways of controlling variables exist.
Internal validity is the degree to which we
can be sure that no confounding variables have
obscured the true relationship between the
variables in the hypothesis test.
It is the confidence that we can put on the assertion
that the independent variables actually produce the
effects that we observe.
External validity describes our ability to generalize
from the results of a research study to the real world.
Unfortunately although controlling for the effect of
confounding variables increases internal validity it
often reduces external validity.
Methods for Controlling Confounding Variables
The effects of confounding variables can be controlled
with three basic methods: manipulated control, statistical
control, and randomization.
Internal validity, external validity, and the amount of
information that can be obtained about confounding
variables differs for each of these methods.
Manipulated Control
Manipulated control essentially changes a variable into a
constant.
We eliminate the effect of a confounding variable by not
allowing it to vary. If it cannot vary, it cannot produce
any change in the other variables.
If we can hold all confounding variables constant, we
can be confident that any difference observed between
two groups is indeed due to the explanatory variable
and not due to the other variables.
This gives us high internal validity.
So, Manipulated control prevents the controlled
variables from having any effect on the dependent
variable.
Statistical Control
With this method of control, we include the confounding
variable into the research design as an additional measured
variable, rather than forcing its value to be a constant.
So, we will be working with three (or more) variables and not two: the independent and dependent variables, plus the confounding (or control) variable or variables.
The effect of the control variable is mathematically removed
from the effect of the independent variable, but the control
variable is allowed to vary naturally.
This process yields additional information about the
relationship between the control variable and the other
variables.
In addition to the additional information about the
confounding variables that statistical control
provides, it also has some real advantages over
manipulated control.
External validity is improved, because the confounding
variables are allowed to vary naturally, as they would
in the real world.
But, internal validity is not compromised to achieve this
advantage.
In general, statistical control provides us with much
more information about the problem we are
researching than does manipulated control.
But advantages in one area usually have a cost in
another, and this is no exception.
An obvious drawback of the method lies in the increased
complexity of the measurement and statistical analysis
which will result from the introduction of larger
numbers of variables.
Randomization
The third method of controlling for confounding
variables is to randomly assign the units of analysis
(experimental subjects) to experimental groups or
conditions.
The rationale for this approach is straightforward: any
confounding variable will have its effects spread
evenly across all groups, and so it will not produce
any consistent effect that can be confused with the
effect of the independent variable.
This is not to say that the confounding variables
produce no effects in the dependent variable—they do.
But the effects are approximately equal for all groups, so the
confounding variables produce no systematic effects on the
dependent variable.
The major advantage of randomization is that we can
assume that all confounding variables have been controlled.
Even if we fail to identify all the confounding variables, we
will still control for their effects.
As these confounding variables are allowed to vary naturally, as they would in the real world, external validity is high for this method of control.
Since we don’t actually measure the confounding
variables, we assume that randomization produces
identical effects from all confounding variables in all
groups, and that removes any systematic confounding
effects of these variables.
But any random process may result in
disproportionate outcomes occasionally.
Example: If we flip a coin 100 times, we will not always
see exactly 50 heads and 50 tails.
Sometimes we will get 60 heads and 40 tails, or even 70
tails and 30 heads.
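A minimal simulation of that point (the seed is fixed only so the run is reproducible):

```python
import random

random.seed(7)  # fixed seed so the demonstration is reproducible

# Ten runs of 100 fair coin flips each
runs = [sum(random.random() < 0.5 for _ in range(100)) for _ in range(10)]
print(runs)  # heads per run; rarely exactly 50
```

Most runs land near, but not exactly at, 50 heads, which is why randomization balances confounders only on average, not with certainty.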
Consequently, we have no way of knowing with absolute certainty that the randomization control procedure has actually distributed the effects of all confounding variables identically.
We are only trusting that it did.
But with manipulated control and statistical control, we can be completely confident that the effects of the confounding variables have been distributed so that no systematic influence can occur, because we can measure the effects of the confounding variable directly.
There is no chance involved.
A further disadvantage of randomization is that it
produces very little information about the action of any
confounding variables.
• We assume that we have controlled for any effects of these variables, but we don't know what the variables are, or the size of their effects, if there are any.
• We assume that we've eliminated the systematic effects of the confounding variables by ensuring that these effects are distributed across all values of the relevant variables.
• But we have not actually measured or removed these effects - the confounding variables will still produce change in the relevant variables.
Guide to Research Report Writing
Parts/sections of a Research Report:
Title page
Acknowledgement
I. Introduction
II. Literature review
IV. Results & Discussion
  4.1. Results related to objective 1 (General; Tables)
V. Conclusion & Recommendation
  1. Conclusion
  2. Recommendation
Data collection formats