
THE FUNDAMENTALS OF

RESEARCH METHODS

Kefelegn Getahun, PhD


The Scientific Method and Research

 To understand the term research clearly,
it is important to know the meaning of
the scientific method.

 It is the research methodology adopted in the
process of research that makes a given piece of
research scientific.

 Thus, research methods have become a central part
of almost all kinds of investigation (e.g. natural
science, social science).
The Scientific Method… cont’d

 A science is a coherent body of thought about a topic
over which there is a broad consensus among its
practitioners.
 Science aims to discover universal laws about how
the world works.
 The scientific method (SM) is a conscious, objective,
logical and systematic method of investigation.
 The SM can be defined as the search for truth as
determined by logical considerations.
 It refers to the ideas, rules, techniques, and
approaches that are commonly used in research.
The Scientific Method… cont’d

 The SM uses both inductive and deductive logical
arguments.
 Inductive logic:
 begins by observing facts,
 then proceeds from observations of facts to
universal laws.
 Deductive logic starts with:
 (a) Universal law(s), or
 (b) Initial conditions
 We then show how the event we are trying to
explain follows from them.
 Then we test the generality of this explanation.
The Scientific Method… cont’d

 The goal of scientific research is to test current
theories and develop new ones:

 Is the current theory consistent with the world?

 Can we develop a new theory that is more
consistent with the world?
The Scientific Method… cont’d

 The SM involves the following series of steps or
procedures:
 Identification of the problem to be investigated
 Collection of essential facts to prove or disprove
the theory
 Selection (hypothesizing) of tentative solutions to
the problem
 Evaluation of these alternative solutions to
determine which of them is in accordance with all
the facts, and
 The final selection of the most likely solution.
The Scientific Method… cont’d

 Each research problem involves gathering data to
confirm or refute an existing theory.
 We are not supposed to let our prior beliefs influence
our conclusions.
 Because of this, we formulate our hypothesis before we
gather the data.
 We do not alter the hypothesis after seeing the data
merely to make sure that it will be confirmed.
 Any truly scientific theory must be refutable; otherwise,
it is just dogma.
 A sound theory is one that has withstood many
attempts to disprove it.
Guiding Principles of the SM

 Three useful guiding principles need to be considered:

a) Use of empirical evidence and ethical neutrality
 The scientific method is based on empirical evidence
(experimental, observed, practical) and utilizes relevant
concepts.
 The goal of the SM is to facilitate independent verification of
scientific observations through the use of empirical
evidence.
 It presupposes ethical neutrality, i.e., it aims at nothing
but making adequate and correct statements about
population objects.
Guiding Principles… cont’d

b) Logical reasoning (Critical thinking)


 The SM practices logical reasoning which allows
determination of the truth through steps different from
emotional and hopeful thinking
 Its methodology is made known to all concerned for
critical scrutiny and for use in testing the conclusions
through replications
 Critical thinkers always use logical reasoning.
 Logic is not an ability that humans are born with but
rather it is a skill that must be learned within a formal
educational environment.

Guiding Principles… cont’d

Logical reasoning?

Think about it and give examples

Guiding Principles… cont’d

c) Possessing a Skeptical Attitude


 The final key idea is skepticism:
 the constant questioning of your beliefs and
conclusions.
 It requires the possession of skeptical attitudes.
 A scientific attitude (SA) implies a skeptical attitude.
 A skeptic holds beliefs tentatively, and is open to
new evidence and rational argument.
The Meaning of Research

 Research begins with a question that the researcher is
trying to answer.
 Research inculcates scientific reasoning and promotes
the development of logical habits of thinking.
 Hence, the term research can be broadly defined as the
scientific and systematic search or inquiry for
pertinent information or knowledge.
 It is a movement from the known to the unknown and
is a deliberate response to a need for information in
order to solve a given problem.
 It is an original contribution to the existing stock of
knowledge.
The Meaning of Research…cont’d

 Research comprises the following activities:
 defining and redefining the problem,
 formulating hypotheses or suggested solutions,
 collecting, organizing and evaluating data or
facts,
 making deductions and reaching conclusions.
 Research requires a specific plan or procedure.
 We need to know if the question is answerable and how.
 We need to know whether the research project is feasible
in terms of time and money.
The Meaning of Research…cont’d

 Research usually divides a problem into more
manageable sub-problems.
 “What is the current state of the Ethiopian economy?” is
vague by itself.
 We could divide the economy into sectors,
 by industry or by occupation (agriculture, households, etc.).
 Research makes certain critical assumptions.
 If you do not make assumptions, you cannot draw
logical conclusions.
 Research is, by its nature, cyclical.
 Every research project brings answers but also new
questions.
 Those questions, in turn, bring new research projects.
The Purpose of Research

 The purpose of research is to discover new ideas or
solutions through the application of scientific
procedures.
 Research has a clearly articulated goal
Examples:
 Test a current theory
 Add details to a theory
 Replace a theory with a better one
 Write a new theory where none existed, etc.

The Purpose of Research…Cont’d

 In general, the purpose of research may be any of the
following:
 Exploration
 Description
 Explanation
 The major factors to be considered before embarking
on research include:
 Type and nature of information sought
 Timing
 Availability of resources
 Cost/benefit analysis
 Ethical considerations
Classification of Research Activities

 Different people may use different classification
systems.
 The classification can be based on:
 the methods employed,
 the time dimension,
 the research environment, or
 the data used.

 Accordingly, several classifications of research
can be identified, some of which are described
below.
Classification of Research
Activities…Cont’d

I. Descriptive vs. Analytical Research


i. Descriptive research
 The purpose of descriptive research is to describe
the state of affairs as it exists at present.
 The main characteristic of this method is that the
researcher has no control over the variables.
 The researcher can only report what has happened or
what is happening.
 Examples: the frequency of shopping by people, the
preferences of people, the number of workers
employed in a factory, etc.
Classification of Research
Activities…Cont’d

ii. Analytical research


 In analytical research the researcher has to use facts or
information and analyze these to make a critical
evaluation of the material.

 Analytical studies go beyond simple description in their
attempt to model empirically the existing phenomena under
investigation.

 It asks “why” and tries to find the answer to a
problem.
Classification of Research
Activities…Cont’d

II. Applied vs. fundamental research


 Research may be undertaken either to understand the
fundamental nature of a social reality (basic research)
or to apply knowledge to address specific practical
issues (applied research).
i. Applied research
 Applied research aims at finding a solution to an immediate,
pressing problem facing a society/environment or an
industrial or business organization.
 Applied research tries to solve specific policy problems or
help practitioners accomplish a specific task.
 Theory is less central than seeking a solution to a specific problem
in a limited setting.
Classification of Research
Activities…Cont’d
ii. Fundamental research
 Fundamental research is mainly concerned with
generalizations and with the formulation of a theory.
 It is primarily concerned with understanding the
fundamental nature of social reality (e.g. the RAN project at
JU).
 It is the source of most scientific ideas and ways of
thinking about the world.
 It is mostly exploratory in nature.

 So, gathering knowledge for knowledge’s sake is termed
fundamental or basic research.
 Mostly deductive: seeks new conclusions from current
assumptions.
Classification of Research
Activities…Cont’d

III. Quantitative vs. Qualitative Research


i. Quantitative research
 Quantitative research is based on the measurement of
quantity or amount.
 It is applicable to phenomena that can be expressed in
terms of quantity.
 Most often we are testing a hypothesis.
 We collect data and see whether the hypothesis is
consistent with the data.
 The methodology is simpler than in qualitative research.
 But it often takes longer: identifying, collecting, and
analyzing appropriate data is difficult and expensive.
Classification of Research
Activities…Cont’d

 This approach can be further subdivided into:
 inferential,
 experimental and
 simulation approaches.
 The purpose of the inferential approach is to form a
database from which to infer characteristics or
relationships of populations.
 This usually means survey research, in which a sample of the
population is studied to determine its characteristics and it is then
inferred that the population has the same characteristics.
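
As a rough illustration of the inferential approach, the sketch below draws a random sample from a made-up population and uses it to estimate the population mean. The population values, the sample size of 400 and the helper name `estimate_population_mean` are assumptions made only for this example, and the interval relies on a normal approximation.

```python
import random
import statistics


def estimate_population_mean(population, sample_size, seed=0):
    """Estimate the population mean from a simple random sample."""
    rng = random.Random(seed)
    sample = rng.sample(population, sample_size)  # sampling without replacement
    mean = statistics.mean(sample)
    # Standard error of the mean; 1.96 gives an approximate 95% interval.
    std_error = statistics.stdev(sample) / sample_size ** 0.5
    return mean, (mean - 1.96 * std_error, mean + 1.96 * std_error)


# Hypothetical population: 10,000 made-up monthly household incomes.
rng = random.Random(42)
population = [rng.gauss(5000, 1200) for _ in range(10_000)]

sample_mean, (low, high) = estimate_population_mean(population, sample_size=400)
true_mean = statistics.mean(population)
print(f"sample estimate: {sample_mean:.0f} (approx. 95% interval {low:.0f} to {high:.0f})")
print(f"population mean: {true_mean:.0f}")
```

In a real survey the population mean is unknown; it is computed here only so that the sample estimate can be compared against it.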

Classification of Research
Activities…Cont’d

 The experimental approach is characterized by much
greater control over the research environment.
 Some or all of the variables are manipulated to observe
their effect on other variables.
 The simulation approach involves the construction of an
artificial environment (e.g. a greenhouse) within which
relevant information and data can be generated.
 This permits an observation of the dynamic behavior of a
system under controlled conditions.
 Given values of initial conditions, parameters and
exogenous variables, a simulation is run to represent the
behavior of the process over time.
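
The simulation approach can be sketched in a few lines. The example below is only an illustration under assumed values: the logistic growth rule, the parameter names and the numbers are all hypothetical, chosen to show how an initial condition, parameters and an exogenous input series together generate the behavior of a process over time.

```python
def simulate(initial_stock, growth_rate, capacity, external_inflows):
    """Run a simple discrete-time simulation and return the path of the stock.

    initial_stock    -- initial condition
    growth_rate      -- parameter of the (assumed) logistic growth rule
    capacity         -- parameter: level at which growth stops
    external_inflows -- exogenous variable, one value per period
    """
    stock = initial_stock
    path = [stock]
    for inflow in external_inflows:
        growth = growth_rate * stock * (1 - stock / capacity)
        stock = stock + growth + inflow
        path.append(stock)
    return path


# Hypothetical run over five periods.
path = simulate(
    initial_stock=100.0,
    growth_rate=0.5,
    capacity=1000.0,
    external_inflows=[0.0, 0.0, 10.0, 10.0, 0.0],
)
for period, value in enumerate(path):
    print(period, round(value, 1))
```

Changing one input at a time (for example the inflow series) and rerunning shows how the system responds under controlled conditions.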

Classification of Research
Activities…Cont’d

ii. Qualitative research


 Qualitative research is concerned with subjective
assessment of attitudes, opinions, and behavior.
 Qualitative Research is a function of researchers’
insights and impressions.
 It generates results, which are not subjected to
rigorous quantitative analysis.
 Generally group interviews, projective techniques
and in-depth interviews are used.
 Qualitative research is particularly important in the
behavioral sciences.

Classification of Research
Activities…Cont’d

 These days, however, most research activities are becoming
essentially pluralistic:
 researchers often combine quantitative and qualitative
research methods within the same study.

 Mixed-method research strategies are particularly
effective in policy-oriented research, and the contribution
that qualitative research can make to policy evaluation is
increasingly being recognized.
Classification of Research
Activities…Cont’d

IV. Conceptual vs. Empirical Research


 This classification is similar to the applied vs.
fundamental classification.
i. Conceptual research
 Conceptual research is related to some abstract
idea(s) or theory.
 It is generally used by philosophers and other similar thinkers
to develop new concepts or to reinterpret existing ones.

 Mostly deductive: seeks new conclusions from current
assumptions.
Classification of Research
Activities…Cont’d

ii. Empirical research


 It relies on experience or observation alone, often without
due regard for system and theory.
 It is data-based research, coming up with conclusions
that are capable of being verified by observation or
experiment.
 The researcher first formulates a working
hypothesis.
 He/she then works to get enough data to prove or
disprove the hypothesis.
Classification of Research
Activities…Cont’d

 Some other types of research:
 Research:
 can be one-time (cross-sectional) or longitudinal
research,
 can be field-based, laboratory-based or
simulation research,
 can also be clinical or diagnostic research,
 can be conclusion-oriented or decision-oriented,
etc.
Time Dimension in Research

 Quantitative research may be divided into two
groups in terms of the time dimension:
 Research at a single point in time (cross-sectional research)
 Research at multiple points in time (longitudinal research)
 Cross-sectional research takes a snapshot approach.
 This is the simplest and least costly research approach.
 Limitation: it cannot capture social processes or
changes.
Time Dimension in Research… cont’d

 Longitudinal research examines features of people or
other units at more than one point in time.
 It is usually more complex and costly than cross-sectional
research but is also more powerful, especially
with respect to social change.
 Types of longitudinal research:
 Time series research: a study of a group of people or other
units across multiple periods (e.g. time series data on coffee
exports, or land use/land cover change (LULCC) from a series of
remote sensing (RS) images).
 The panel study: the researcher observes exactly the same
people, group or organization across time periods, each time
using the snapshot approach.
Time Dimension in Research… cont’d

 In a panel study the focus is on individuals or
households.
 Example: interviewing the same people in 1991, 1993,
1995, etc., and observing the change yields a
panel data set.
 Cohort analysis is similar to the panel study, but
rather than observing exactly the same people, a
category of people who share a similar life experience
in a specified period is studied.
 Hence the focus is on a group of individuals, not on
specific individuals or households.
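
The difference between following individuals (panel) and following a group can be made concrete with a small data structure. The sketch below uses invented income figures for three hypothetical respondents, keyed by the survey years from the example above; the function names are illustrative only.

```python
# Hypothetical panel data set: survey year -> {respondent id: income}.
panel = {
    1991: {"A": 100, "B": 150, "C": 120},
    1993: {"A": 110, "B": 140, "C": 135},
    1995: {"A": 125, "B": 160, "C": 150},
}


def individual_changes(data, first_year, last_year):
    """Panel view: change for each respondent observed in both years."""
    first, last = data[first_year], data[last_year]
    return {pid: last[pid] - first[pid] for pid in first if pid in last}


def group_means(data):
    """Group-level view: the average per year, whoever was observed."""
    return {year: sum(obs.values()) / len(obs) for year, obs in data.items()}


print(individual_changes(panel, 1991, 1995))
print(group_means(panel))
```

The first function needs the same identifiers in every wave, which is what makes a data set a panel. The second only needs comparable groups, which is the kind of comparison a cohort analysis relies on.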
Ethical Consideration in the
Research Process
 Shared Values
 There is no one best way to undertake research.
 There is no universal method that applies to all scientific
investigations.
 Accepted practices for the conduct of research can and
do vary from discipline to discipline.

 There are, however, some important shared values for
the responsible conduct of research that bind all
researchers together.
Some of the most important shared values

 HONESTY: conveying information truthfully
and honoring commitments,

 ACCURACY: reporting findings precisely and
taking care to avoid errors,

 EFFICIENCY: using resources wisely and
avoiding waste, and

 OBJECTIVITY: letting the facts speak for
themselves and avoiding improper bias.
Some of the most important shared
values…cont’d
 During data collection,
 Some ethical principles governing data collection
include: informed consent, respect for privacy and
safeguarding the confidentiality of data.
 Informed consent implies that persons who are invited to
participate in research activities should be free to choose to
take part or refuse.
 They are free to decide after having been given the fullest
information concerning the nature and purpose of the
research, including any risks to which they personally
would be exposed, the arrangements for maintaining the
confidentiality of the data, and so on.

Some of the most important shared
values…cont’d
 Thus, collecting data illegally, under false pretenses, from
minors, etc. is unethical.
 Getting access and consent to do research is, therefore,
essential.
 During analysis (misuse of data)

 Fabrication and falsification of research results are
serious forms of misconduct.
 It is a primary responsibility of a researcher to avoid either a
false statement or an omission that distorts the truth.
 In order to preserve accurate documentation of
observed facts with which later reports or conclusions
can be compared, every researcher has an obligation to
maintain a clear and complete record of the data acquired.
Some of the most important shared
values…cont’d
 Records should include sufficient detail to permit
examination for the purpose of,
 replicating the research,
 responding to questions that may result from
unintentional error or misinterpretation,
 establishing authenticity of the records, and
 confirming the validity of the conclusions.
 It is considered a breach of research integrity to
fail to report data that contradict or merely fail to
support the conclusions, including the purposeful
withholding of information about confounding
factors.
 Negative (unexpected) results must be reported.

Some of the most important shared
values…cont’d
 When writing the research paper: plagiarism
 Plagiarism is the unauthorized use of someone else's
thoughts or wording, whether through:
 incorrect documentation,
 failing to cite your sources altogether, or
 simply relying too heavily on external resources.
 Whether intentional or inadvertent, some or all of
another author's ideas become represented as your
own.
 Plagiarizing undermines your academic integrity.
 It betrays your own responsibilities as a student
writer, your audience, and the very research
community you were entering by deciding to write a
research paper in the first place.
Some of the most important shared
values…cont’d
 Incidentally, plagiarism also covers informally
obtained material, such as a paper "bought"
from another student.

 If you feel cheating is an easy way out, and the moral
and intellectual consequences don't sound alarm bells,
stop and think of the serious punitive repercussions
you could incur.

 Because it is intellectual theft, plagiarism is considered
an academic crime, with punishment ranging from
an F on that particular paper to dismissal from the
course to expulsion from a college or university.
The Research Process and Preparing
the Research Proposal

The research process
[Figure: the research process. Stage labels: Question; What is known?; Formulate problem; Hypothesis; Project plan; Experiment / Collect data; Analyse, Results; Interpretation, conclusion; New knowledge. Output (thesis, publication, ...) sections: Introduction; Materials and Methods; Results; Discussion; Conclusion.]
Introduction

 Most research activities follow these steps:
 Selecting a topic
 Formulating the research problem and research
questions
 Extensive literature survey/review
 Formulating the working hypothesis
 Preparing the research design and determining the
sample design
 Collecting and analyzing the data
 Generalizations and interpretations of results
 Preparing the report and presentation of the results
(formal write-up of conclusions reached)
1. Identification of a Research Topic

 To do research, a topic or a research problem
must be identified.
What is a research problem?
 A research problem refers to some difficulty
that a researcher experiences in the context of
either a theoretical or a practical situation and for
which he/she wants to obtain a solution.
 A research topic should seek to advance the state
of science.
 It usually starts with a felt practical or theoretical
difficulty.
Identification of a Research Topic (Cont’d)

 A research topic (RT) should ask a question to which the answer is not known.
 An RT should ask an interesting question.
 An RT should be as objective as possible (based on fact, on things
quantifiable and measurable).
Some Potential Sources of a Research Topic
 A topic must spring from the researcher’s mind like a plant
springs from its own seed.
 The best way to identify a topic is to draw up a shortlist of
possible topics that have emerged from your reading or
from your own experience and that look interesting.
• A general area of interest or aspect of a subject matter
(agriculture, industry, social sector, etc.) may have to be
identified at first.
Identification of a Research Topic (Cont’d)

 Some important sources that may be helpful in
selecting a research problem:

 Professional experience:
 One's own professional experience is the most
important source of research problems,
 Contacts and discussions with research-oriented
people,
 Attending conferences and seminars, and
 Listening to learned speakers.
Identification of a Research Topic (Cont’d)

 Inferences from theory and professional literature:

 Research problems can also emanate from inferences
that can be drawn from theories or from the empirical
literature.

 Two types of literature can be reviewed:
 The conceptual literature
 The empirical literature

 Research reports, bibliographies of books and articles,
periodicals, research abstracts and research guides
suggest areas that need research.
Identification of a Research Topic (Cont’d)

 Technological and Social Changes:

 New developments bring forth new
challenges for research.

 Innovations and changes need to be carefully
evaluated through the research process.

 In general, the most fundamental rule for a good research
topic is to investigate questions that sincerely interest
you,
 i.e. research that the researcher honestly enjoys even if
he/she encounters frustrating or discouraging problems.
Identification of a Research Topic (Cont’d)

 The following points are important in selecting a
research problem:
 A subject that is overdone should be avoided, since it will
be difficult for the average researcher to throw any new
light on it.
 Controversial subjects should not become the choice of the
average researcher.
 Problems that are too narrow, too broad or vague should be
avoided.
 Other criteria to consider include:
 The importance of the subject,
 The qualifications and training of the researcher,
 The cost involved and the time factor, etc.
Identification of a Research Topic (Cont’d)

 In general, the choice of a research topic is not made
in a vacuum and is influenced by several factors:
 Interest and values of the researcher,
 Current debate in the academic world,
 Funding,
 The value and power of the subject, etc.
2. Definition and Statement of the Problem

 After a topic has been selected, the next task is to
define it clearly.
 To define a problem means to put a fence around it.

 It involves the task of laying down the boundaries
within which a researcher shall study the problem.

 The researcher must be certain that he/she knows
exactly what his/her problem is before he/she begins
work on it.

 A problem clearly defined is a problem half solved.
Definition and Statement of the Problem
(Cont’d)
 Defining the problem unambiguously will help to find
answers to questions like:
 What data are to be collected?
 What characteristics of data are relevant and need to be
studied?
 What relations are to be explored?
 What techniques are to be used for the purpose?

 Hence, in the formal definition of the problem the
researcher is required:
 To describe the background of the study, its theoretical basis
and its underlying assumptions, and to state the problem as
concrete, specific and workable questions.
Definition and Statement of the Problem
(Cont’d)
 Useful steps in defining the research problem:
a) Statement of the problem in a general way
 Problem should be stated in a broad and general way
keeping in mind either some practical concern or some
scientific or intellectual interest.
b) Understanding the nature of the problem more clearly
 The next step is to understand its origin and nature
clearly.
 The best way to understand the problem is to discuss it
with others who are better acquainted with it or more experienced.
Definition and Statement of the Problem
(Cont’d)
c) Survey of the available literature:
 The researcher must devote sufficient time to reviewing
both the conceptual and the empirical literature:
 Research already undertaken on related topics or
problems needs to be systematically reviewed.
 This exercise enables the researcher to:
 find out what data are available,
 find out if there are gaps in theories,
 find out whether the existing theory is applicable to the
problem under study,
 find out what other researchers have to say about the
topic,
 ensure that no one else has already exhausted the
questions that you aim to examine, etc.
Definition and Statement of the Problem
(Cont’d)
d) Developing the idea through discussion:
 Discussion concerning a problem often produces useful
information.
 The discussion sharpens the researcher’s focus of
attention on specific aspects of the study.
e) Rephrasing the research problem:
 The researcher must then rephrase the research problem
into a working proposition.
 Through rephrasing, the researcher puts the research
problem in as specific terms as possible so that it may
become operationally viable and may help in the
development of a working hypothesis.
Definition and Statement of the Problem
(Cont’d)
f) In addition:
 Technical terms or phrases, with special meanings
used in the statement of the problem should be
clearly defined.
 Basic assumptions or postulates relating to the
research problem should be clearly stated.
 The suitability of the time period and the sources
of data available must be considered in defining
the problem.
 The scope of the investigation within which the
problem is to be studied must be mentioned
explicitly in defining a research problem.
3. Extensive Literature Survey

 Once the problem is formulated, the researcher
should undertake an extensive literature survey
connected with the problem:

 Academic journals, conference proceedings,
dissertations, government reports, policy reports,
publications of international organizations, books, etc.
must be tapped depending on the nature of the
problem.

 Usually one source leads to the next, and the best
place for the survey is the library.
Extensive Literature Survey (Cont’d)

 Main goals of the literature review:
 To familiarize oneself with the issue and establish
credibility (believability)
 To show the path of prior research and how the current
project is linked to it
 To integrate and summarize what is known in the
area
 To learn from others and stimulate new ideas.
Extensive Literature Survey (Cont’d)

 From the survey of the literature, you will know
whether your question has already been answered
elsewhere.

 You will also know what other people have said about
similar topics.
 You can learn how other people faced methodological
and theoretical issues similar to your own.
 You can learn about sources of data that you might not
have known before.
Extensive Literature Survey (Cont’d)

 You can identify other researchers tackling similar
problems.
 Potential literature sources:
 General information: Google (Google Scholar), etc.
 Books: library, amazon.com
 Articles: JSTOR (www.jstor.org), EconLit
 Web pages
Extensive Literature Survey (Cont’d)

Structuring the review:


 Summarize every article briefly; a sentence or two
will do.
 Interpret the article in light of its relevance to your
own study.
 Critique it, if necessary.
 Show the stock of knowledge building up over the
course of the literature.
 Show how your research topic adds naturally to
this stock of knowledge.
4. Developing a working hypothesis

 A hypothesis is a statement that predicts the
relationship between two or more variables.
 Formulating an appropriate and realistic research hypothesis
is a prerequisite for sound research.
 The role of the hypothesis is to guide the researcher by
delimiting the area of research and keeping him/her on the
right track.

 It is a tentative answer to a research question that can
be confirmed or disproved by data.

 Formulating a hypothesis is particularly useful for causal
relationships.
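
To make "confirmed or disproved by data" concrete, the sketch below states a hypothesis first and then checks it against data with a permutation test. Everything in it is an assumption for illustration: the hypothesis (plots that receive fertilizer have a different mean yield from plots that do not), the yield figures, and the choice of a permutation test rather than any other statistical test.

```python
import random
import statistics


def permutation_test(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation test for a difference in group means.

    Returns the observed difference and the share of random relabelings
    whose difference is at least as large in absolute value.
    """
    rng = random.Random(seed)
    observed = statistics.mean(group_a) - statistics.mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # relabel the observations at random
        diff = statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_permutations


# Hypothetical yields (quintals per hectare), invented for this example.
with_fertilizer = [30, 32, 35, 31, 36, 34, 33, 37]
without_fertilizer = [25, 27, 24, 28, 26, 29, 23, 27]

observed, p_value = permutation_test(with_fertilizer, without_fertilizer)
print(f"observed difference in means: {observed}")
print(f"estimated p-value: {p_value}")
```

A small p-value means the data would be unlikely if there were no difference between the groups, so the hypothesis survives this attempt to refute it. It does not prove the hypothesis, and it says nothing about causation unless the design (for example random assignment of plots) supports it.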
4. Developing a working hypothesis (Cont’d)

 Major problems in formulating a working
hypothesis:
 Formulating a hypothesis is not an easy task.
 The major problems that may arise include:
 The lack of a clear theoretical framework
 The lack of ability to utilize that theoretical
framework logically
 The lack of enough experience with research techniques
to be able to phrase the hypothesis properly.
4. Developing a working hypothesis (Cont’d)

 Theoretical framework:
 Definition. Theories are formulated to explain,
predict, and understand phenomena and, in
many cases, to challenge and extend existing
knowledge within the limits of critical bounding
assumptions.
 The theoretical framework is the structure that
can hold or support a theory of a research study.

4. Developing a working hypothesis (Cont’d)

 Characteristics of useable hypotheses:
 The hypothesis must be conceptually clear.
 This involves two things:
 the concepts should be clearly defined, and
 the definitions should be commonly accepted and communicable
(i.e. the hypothesis should be stated in simple terms).
 The hypothesis should have empirical referents.
 No useable hypothesis embodies moral judgments.
 While a hypothesis may study value judgments, such a goal
must be separated from moral preachment or a plea for
acceptance of one’s values.
4. Developing a working hypothesis (Cont’d)

 The hypothesis must be specific.
 All the operations and predictions indicated by it should be
spelled out.
 The hypothesis should be related to available
techniques.
 A theorist who does not know what techniques are available
to test his/her hypothesis is in a poor position to formulate
useable hypotheses or questions.
 The hypothesis should be related to a body of theory.
 It should possess theoretical relevance.
 The hypothesis should be testable.
 A hypothesis should be formulated in such a way that it is
possible to verify it.
5. Scope and Limitations

 A research project must be clear about its scope:

(a) Geographical limitations
 The study might focus only on some regions, even though the
question pertains to a whole country (e.g. Ethiopia).

(b) Limitations by industry or occupation
 The study might only be able to capture some industries or
occupations (e.g. the formal or the informal sector).

(c) Limitations by subject matter
 The researcher must also recognize that many other interesting
questions may arise that are outside the scope of the study.
6. Preparing the Research Design

 Research design is a plan that specifies the sources and
types of information relevant to the research.
 It is the arrangement of conditions for the collection and
analysis of data in a manner that aims to combine relevance to
the research purpose with economy in procedure.

 It is the conceptual structure, plan, and strategy of
investigation within which research is conducted.

 It is the blueprint for the collection, measurement and
analysis of data.
 The design that gives the smallest experimental error is the
best design.
6. Preparing the Research Design (Cont’d)

 The following are critical questions when making
design decisions:
 What type of data is required? (required data)
 Where can the required data be found? (source of data)
 What will the sampling design be?
 What techniques of data collection will be used?
 How will the data be analyzed? (method of data analysis)
7. Sample Selection

 The researcher must decide on the sampling
method.
 Probability or non-probability sampling methods
can be employed.
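
The contrast between the two families of sampling methods can be sketched briefly. The slide does not name particular techniques, so the three used here are assumed examples: simple random and systematic sampling (probability methods) and convenience sampling (a non-probability method). The sampling frame of 100 numbered households is hypothetical.

```python
import random


def simple_random_sample(frame, n, seed=0):
    """Probability sampling: every unit has the same chance of selection."""
    return random.Random(seed).sample(frame, n)


def systematic_sample(frame, n, seed=0):
    """Probability sampling: random start, then every k-th unit."""
    step = len(frame) // n
    start = random.Random(seed).randrange(step)
    return [frame[start + i * step] for i in range(n)]


def convenience_sample(frame, n):
    """Non-probability sampling: take whichever units are easiest to reach
    (here, simply the first n on the list)."""
    return frame[:n]


# Hypothetical sampling frame: households numbered 1 to 100.
frame = list(range(1, 101))

print(simple_random_sample(frame, 10))
print(systematic_sample(frame, 10))
print(convenience_sample(frame, 10))
```

Only the first two allow the chance of selection to be stated in advance, which is what statistical inference from the sample to the population depends on.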

8. Execution of the Project

 Execution concerns how the survey is conducted,
i.e., by means of a structured or semi-structured
questionnaire or otherwise.
 Several data collection methods exist, which
may differ in terms of:
 Financial costs
 Time costs
 Other resources
8. Execution of the Project (Cont’d)

 Survey data can be collected in one or more of the
following ways:
 By observation
 Through personal interviews
 Through telephone interviews
 By mailing questionnaires or through schedules
 The researcher should select among these methods
taking into account:
 the nature of the investigation,
 the objectives and scope of the study,
 financial resources,
 available time and the desired level of accuracy, etc.
9. Analyzing the Data

 After the data have been collected the researcher turns to
the task of analyzing them.
 The analysis may involve a number of closely related
operations such as:
 Editing of the raw data
 Summarizing and tabulating the data to obtain answers
to research questions
 Drawing statistical inferences
 Spatial analysis
 Spatial statistics
 Various statistical software packages are available for data entry
and analysis.
 Examples: SPSS, Stata, SAS, CSPro, and spreadsheet programs such as
Excel and Lotus.
 Second-round editing is done once data entry is
completed, by examining the frequency
distributions, averages, ranges, modes, etc. to detect
outliers.
 Analysis is completed with the preparation of
descriptive tables, mathematical models, spatial
models or programming models.
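Second-round editing can be sketched in a few lines of Python. This is only an illustration: the function names, the sample ages and the 1.5 x IQR outlier rule are assumptions made here, not something the slides prescribe.

```python
import statistics
from collections import Counter


def second_round_summary(values):
    """Summarise one numeric variable: frequencies, average, range, modes."""
    return {
        "frequencies": Counter(values),
        "mean": statistics.mean(values),
        "min": min(values),
        "max": max(values),
        "modes": statistics.multimode(values),
    }


def flag_outliers(values, k=1.5):
    """Flag values outside the quartile fences (assumed rule: k * IQR)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]


# Hypothetical ages after data entry; 250 looks like a typing error.
ages = [23, 25, 25, 27, 29, 31, 34, 250]
summary = second_round_summary(ages)
suspects = flag_outliers(ages)
print(summary["min"], summary["max"], summary["modes"], suspects)
```

A flagged value is only a candidate for checking against the original questionnaire; it should not be deleted automatically.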

10. Interpretation and Generalizations
 Explaining and discussing the research results in line
with the theoretical framework is part of the
interpretation exercise.
 The real value of research lies in its ability to arrive at
certain generalizations.
11. Preparation of the Report
 The research process is completed only when the results
are shared with the scientific community.
 The report should be written in a concise and objective style,
in simple language, avoiding vague expressions.
Preparing the Research Proposal

 The research proposal helps the researcher to organize
ideas in a way that makes it possible to look for flaws or
inadequacies.
 It is a pre-requisite in the research process.
 It serves as a basis for determining the feasibility of the
project and provides a systematic plan of procedure for
the researcher to follow.
 It assures that the parties understand the project’s
purpose and the proposed method of investigation.
 It provides an inventory/guide of what must be done
and which materials have to be collected.

Research Proposal…cont’d

 A research proposal should usually contain the
following categories of information:

I. Introduction – this part should include the
following information:

1. The title:
 It should be worded in such a way that it suggests the
theme of the study.
 It should be long enough to be explicit but not so long
that it becomes tedious – usually between 15 and 25 words.
 It should contain the key words – important words that
indicate the subject.
Research Proposal…cont’d

 There are three types of titles:
 Indicative titles:
 they state the subject of the proposal rather than
expected outcomes.
 Example: The role of agricultural credit in alleviating
poverty in a low-potential area of Ethiopia.
 Hanging titles:
 have two parts: a general first part followed by a
more specific second part.
 Example: Alleviation of poverty in low-potential areas of
Ethiopia: the impact of agricultural credit.
Research Proposal…cont’d

 Question-type titles:
 These are used less commonly than indicative and
hanging titles.
 However, they are acceptable where it is possible to use
few words – say, fewer than 15.
 Example: Does agricultural credit alleviate poverty in
low-potential areas of Ethiopia?
Research Proposal…cont’d

2. Statement of the Problem:

 This section makes up between ¼ and ½ of the proposal.
 It is an expansion of the title.
 It introduces the research by situating it (by giving
background), presenting the research problem and saying
how and why this problem will be “solved.”
 Without this important information the reader cannot
easily understand the more detailed information
about the research that comes later.
 It also explains why the research is being done (the rationale),
which is crucial for the reader to understand the significance
of the study.
Research Proposal…cont’d

 The problem statement should make a convincing
argument that there is not sufficient knowledge
available to explain the problem or there is a need
to test what is known and taken as fact.
 It should provide a brief overview of the literature and
research done in the field related to the problem and of
the gaps that the proposed research is intended to fill.
 Some ways to demonstrate that you are adding to
the knowledge in your field:
 Gap: A research gap is an area where no or little
research has been carried out.
 Raising a question: The research problem is defined
by asking a question to which the answer is
unknown, and which you will explore in your
research.
 Continuing a previously developed line of enquiry:
Building on work already done, but taking it further
(by using a new sample, extending the area studied,
taking more factors into consideration, taking fewer
factors into consideration, etc.).
 Counter-claiming: A conflicting claim, theory or
method is put forward.
Research Proposal…cont’d

3. Objectives of the study:


 In this section the specific activities to be performed are
listed.
 This is the step of rephrasing the problem into
operational or analytical terms, i.e. putting the
problem in as specific terms as possible.
 This section is rather brief, usually no more than
half a page.
 This is because the rationale for each objective will
already have been established in the previous section.
Objectives of the study…cont’d

 The general objective provides a short statement
of the overall goal pursued by the research.
 The specific objectives are the objectives against
which the success of the whole research will be
judged.
 The specific objectives are operational and indicate the
type of knowledge to be produced, audiences to be
reached, etc.
 An objective for a proposal should be Specific,
Measurable, Achievable, Realistic and Time-bound –
that is, SMART.
Research Proposal…cont’d

4. Review of Literature:

 The theoretical and empirical framework from
which the problem arises must be discussed
briefly.
 Both conceptual and empirical literature are to be
reviewed for this purpose.
 The researcher has to make it clear that the problem
has roots in the existing literature but needs
further research and exploration.
 The analysis of previous research eliminates the
risk of duplicating what has already been done.
Research Proposal…cont’d

5. The Hypothesis:
 Questions that the research is designed to answer are
usually framed as hypotheses to be tested on the basis of
evidence.
 It gives direction to the data gathering procedure.
6. Significance of the Study:
 This section justifies the need of the study.
 It describes the type of knowledge expected to be
obtained and the intended purpose of its application.
 It should indicate clearly how the results of the research
could influence theory or practice.
Research Proposal…cont’d

 Rationale:
 The rationale for undertaking a research study can
be:
1. To show the existence of a time lapse between the
earlier study and the present one, and therefore, the
new knowledge, techniques or considerations indicate
the need to replicate the study.

2. To show that there are gaps in knowledge provided by
previous research studies and to show how the present
study will help to fill in these gaps and add to the
quantum of existing knowledge.
Research Proposal…cont’d

 Hence, the justification should answer the following:
 How does the research relate to the priorities of the region
and the country?
 What knowledge and information will be obtained?
 What is the ultimate purpose the knowledge obtained from
the study will serve?
 How will the results be disseminated?
 How will the results be used, and who will be the
beneficiaries?
Research Proposal…cont’d

7. Definition of terms and concepts:


 It is necessary to define all unusual terms and concepts
that could be misinterpreted.
 Technical terms or words and phrases having special
meanings need to be defined operationally.
8. Scope and limitations of the study:
 Boundaries of the study should be made clear with
reference to:
 (i) the scope of the study by specifying the areas to which the
conclusions will be confined and
 (ii) the procedural treatment including the sampling
procedures, the techniques of data collection and analysis, etc.
Research Proposal…cont’d

9. Basic assumptions:
 Assumptions are statements of ideas that are
accepted as true.
 They serve as the foundation upon which the research
study is based.

Research Proposal…cont’d

II) Methodology

 The methodology explains how each specific
objective will be achieved.
 It is impossible to define the budgetary needs
of the research project in the absence of a solid
methodology section.
METHODOLOGY…cont’d

a) Procedures for data collection: details about
sampling procedures and data collection tools
are described.
(i) Sampling – in research situations the researcher
usually comes across unmanageable populations in
which large numbers are involved.
(ii) Tools (instruments) – in order to collect evidence or
data for a study the researcher has to make use of
certain tools such as observations, interviews,
questionnaires, etc.
 The proposal should explain the reasons for
selecting a particular tool or tools for collecting
the data.
Research Proposal…cont’d

b) Procedures for treating data (method of analysis)


 In this section, the researcher describes how the data will be
organized, analyzed and interpreted.
 The details of the statistical techniques and the rationale
for using such techniques should be described in the
research proposal.
(i) Statistical inference models
 Regression analysis is a good analytical tool, providing a
method to test various hypotheses relating to the
classical economic theory.
 The analysis is built upon the causal (factor-effect) analytical
framework.
 A range of regression models exists, as well as various ways of
estimating regression coefficients, the most common being
the ordinary least squares (OLS) method.
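As a minimal sketch of the OLS method for a single explanatory variable, the slope and intercept can be computed directly from the textbook formulas. The function name, the variable names and the figures below are hypothetical, chosen only to show the arithmetic.

```python
def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: credit received (x) and household income (y).
credit = [1, 2, 3, 4, 5]
income = [3, 5, 7, 9, 11]
intercept, slope = ols_simple(credit, income)
print(intercept, slope)
```

With real survey data one would normally use a statistical package, which also reports standard errors and test statistics for the coefficients.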
Research Proposal…cont’d

(ii) Mathematical programming models


 An example of a mathematical programming model is
the linear programming model.
 It is, however, only one example from a wide range of
mathematical models.
 There are also non-linear and dynamic mathematical
programming models that address a range of economic
and policy analysis questions and hypotheses.
 The central theme in these models is to optimize an
objective function subject to a set of constraints.
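For orientation, a linear programming model is commonly written in the standard form below. The symbols are introduced here for illustration and are not defined in the slides: x is the vector of decision variables, c the vector of objective coefficients, A the matrix of constraint coefficients and b the vector of resource limits.

```latex
\max_{x} \; c^{\top} x
\quad \text{subject to} \quad
A x \le b, \qquad x \ge 0
```

Non-linear and dynamic programming models keep the same idea of an objective and constraints, but relax the linearity or add a time dimension.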

Research Proposal…cont’d

(iii) Simulation models:

 Simulation is the operation of an abstract prototype of
a real system designed to trace out dynamic
interactions.
 Simulation models have acquired substantial appeal
among policy analysts because of their ability to
explore the consequences of a wide range of
alternative sets of policies, plans and even
management strategies.
Research Proposal…cont’d

III) Budgeting and Scheduling the Research


 Research costs money; how much depends on its complexity and
on the number of people and activities involved.
 A proposal should include a budget estimating the funds
required for travel expenses, typing, printing, purchase of
equipment, tools, books, etc.
 It would include all or some of the following items:
 Management time
 Bought out resources time
 Data collection
 Data analysis cost – software and hardware
 Transport cost
 Respondents’ incentives
Research Proposal…cont’d

 Research must also be scheduled appropriately.
 The researcher should also prepare a realistic time schedule for
completing the study within the time available.
 Dividing a study into phases and assigning dates for the
completion of each phase helps the researcher to use his/her time
systematically.
IV. Citations and references
 It is important that you correctly cite all consulted published
and unpublished documents that you refer to in the proposal.
 This allows the reader to know the sources of your information.
 Every source you cite must appear in the list of references at
the end of the proposal.
 How many types of citation exist? Please look them up.
Research Proposal…cont’d

V. Bibliography:
 Be sure to include every work that was referred to in the
proposal.
 You do not have to refer to any other works if you do not
want to; the bibliography does not have to be long or
complete.
 Formats vary slightly by journal, etc.
 A common format:
 For a book: Smith, Adam (1776). An Inquiry into the Nature and
Causes of the Wealth of Nations. London: Dent and Sons
publishing.
 For an article: Coase, R (1937). “The Nature of the Firm.”
Economica 4, 386-405.
Survey and Field Research Methods

 Survey research is one of the most basic methods in
social and medical research.
 Survey research permits a rigorous step by step
development and testing of complex propositions
through survey data.
 The aim of sample surveys is to generalize from the
sample to the population.
 The three most common purposes of surveys are:
 Description
 Explanation, and
 Exploration
Survey and Field Research Methods… cont’d

 One can distinguish between two basic types of survey
designs:
 Cross-sectional surveys
 Data are collected at one point in time.
 This is the less expensive and most common type.
 Longitudinal surveys
 Surveys are conducted at different points in time.
 Useful for capturing changes over time.
 Survey Sampling:
 Some studies involve only a small number of people and thus
all of them can be included.
 But when the population is large, it is usually not possible to
undertake a census of all items in the population.
Survey and Field Research Methods… cont’d

 Sampling is the process of selecting a number of study units
from a defined study population.
 It aims at obtaining consistent and unbiased estimates of the
population parameters.
 There are two principles underlying any sample design:
 The need to avoid bias in the selection procedure
 The need to gain maximum precision.
 Bias can arise:
 if the selection of the sample is done by some non-random
method i.e. selection is consciously or unconsciously
influenced by human choice.
 if the sampling frame (i.e. list, index, population record)
does not adequately cover the target population.
 if some sections of the population are impossible to find or
refuse to co-operate.
Survey and Field Research Methods… cont’d

 Major Reasons for Sampling


1) Resource Limitations: A sample study is usually less
expensive than a census.
2) Superior Quality of Results:
more accurate measurement
3) Infinite Population: sampling is also the only process
possible if the population is infinite.
4) Destructive nature of some tests: sampling remains the
only option when a test involves the destruction of the
items under study.
 Example: testing the quality of a commodity (beer,
cigarettes, coffee, etc.)
Survey and Field Research Methods… cont’d

 Representativeness:
• Representativeness is important particularly if you want to
make generalizations about the population.
• A representative sample has all the important characteristics
of the population from which it is drawn.
 For Quantitative Studies:
• If researchers want to draw conclusions which are valid for
the whole study population, they should draw a sample in
such a way that it is representative of that population.
 For Qualitative Studies:
• Representativeness of the sample is NOT a primary concern.
• We select study units which give us the richest possible
information.
 You go for INFORMATION-RICH cases!

Survey and Field Research Methods… cont’d

 Steps in Sampling Design


a) Identifying the relevant population: when one wants to
undertake a sample survey, the relevant population from
which the sample is going to be drawn needs to be
identified.
 Example: if the study concerns income, then the
definition of the population elements as individuals
or households can make a difference.
b) Determining the method of sampling:
 Whether a probability sampling procedure or a non-
probability sampling procedure has to be used is also
very important.
Survey and Field Research Methods… cont’d

c) Securing a sampling frame:


 A list of elements from which the sample is actually
drawn is important and necessary (e.g. Kebele registry)
d) Identifying parameters of interest:
 What specific population characteristics (variables and
attributes) may be of interest.
e) Determining the sample size:
 The determination of the sample size depends on several
factors:
i) Degree of homogeneity:
 The size of the population variance is the single most
important parameter.
 The greater the dispersion in the population, the larger the
sample must be to provide a given estimation precision.
Survey and Field Research Methods… cont’d

ii) Degree of confidence required:


 Since a sample can never reflect its population for certain,
the researcher must determine how much precision s/he
needs.
 Precision is measured in terms of:
 An interval range in which we would expect to find the
parameter estimate.
 The degree of confidence we wish to have in the estimate.
iii) Number of sub groups to be studied:
 When the researcher is interested in making estimates
concerning various subgroups of the population then the
sample must be large enough for each of these
subgroups to meet the desired quality level.
Survey and Field Research Methods… cont’d

iv) Cost: Cost considerations have a major impact on
decisions about the size and type of sample.
 All studies have some budgetary constraint and hence cost
dictates the size of the sample.
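The first two factors (population dispersion and the confidence and precision required) are often combined in the textbook formula n = (z·σ / E)². The sketch below assumes simple random sampling from a large population, an estimate of the standard deviation σ from prior information, and a tolerated margin of error E; the formula and the numbers are illustrative and are not taken from the slides.

```python
import math


def sample_size_for_mean(sigma, margin, z=1.96):
    """Smallest n such that z * sigma / sqrt(n) does not exceed the margin.

    sigma  : assumed population standard deviation
    margin : tolerated sampling error (same units as the variable)
    z      : standard normal value for the confidence level (1.96 for 95%)
    """
    return math.ceil((z * sigma / margin) ** 2)


# Hypothetical example: doubling the dispersion roughly quadruples n.
n_low = sample_size_for_mean(sigma=10, margin=2)
n_high = sample_size_for_mean(sigma=20, margin=2)
print(n_low, n_high)
```

Cost and the number of subgroups then adjust this starting figure up or down.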

 To determine sample size:


1. Use prior information: If our process has been
studied before, we can use that prior information
to determine our sample size.
 This can be done by using prior mean and variance
estimates and by stratifying the population to reduce
variation within groups.
Survey and Field Research Methods… cont’d

2. Rule of thumb: rules of thumb are based on past experience with samples
that have met the requirements of the statistical methods.
 Researchers use them because they rarely have information on
the variance or standard errors.
3. Practicality: Of course the sample size you select must
make sense.
 We want to take enough observations to obtain reasonably
precise estimates of the parameters of interest but we also
want to do this within a practical resource budget.
 Therefore the sample size is usually a compromise
between what is DESIRABLE and what is FEASIBLE.
 In general, the smaller the population, the bigger the
sampling ratio has to be for a reasonable sample.

Survey and Field Research Methods… cont’d

 Hence:
 For small populations (<1000), a researcher needs a large
sampling ratio (about 30%). Hence, a sample size of
about 300 is required for a high degree of accuracy.
 For moderately large populations (10,000), a smaller
sampling ratio (about 10%) is needed – a sample size
around 1,000.
 To sample from very large populations (over 10 million),
one can achieve accuracy using tiny sampling ratios
(0.025%) or samples of about 2,500.
 These are approximate sizes, and practical limitations
(e.g. cost) also play a role in a researcher’s decision about
sample size.
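The arithmetic behind these rules of thumb is simply population size times sampling ratio. A short check, with a hypothetical helper name:

```python
def sample_size_from_ratio(population, ratio):
    """Sample size implied by a sampling ratio (ratio given as a fraction)."""
    return round(population * ratio)


# (population size, sampling ratio) pairs taken from the rules of thumb above.
examples = [(1_000, 0.30), (10_000, 0.10), (10_000_000, 0.00025)]
sizes = [sample_size_from_ratio(pop, ratio) for pop, ratio in examples]
print(sizes)
```

The sample size grows much more slowly than the population, which is why the ratio can shrink so sharply for very large populations.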
Survey and Field Research Methods… cont’d

 Sample Size in Qualitative Studies


 There are no fixed rules for sample size in qualitative research.
 The size of the sample depends on WHAT you try to find out,
and from what different informants or perspectives you try to
find that out.
 The sample size is therefore estimated as precisely as
possible, but not determined.
 Probability and non-probability sampling
 Several sampling methods can be used
to draw a sample.
 They fall into two types:
 probability samples
 non-probability samples
Survey and Field Research Methods… cont’d

 Probability sampling:

 It is based on the concept of random selection of survey
units.
 It uses a random selection procedure to ensure that
each unit of the sample is chosen on the basis of chance.
 A randomization process is used in order to reduce or
eliminate sampling bias so that the sample is
representative of the population from which it is drawn.
 A sample will be representative of the population from which it
is drawn if all members of the population have an equal chance
of being included in the sample.
 Probability sampling requires a sampling frame (a
listing of all study units).
Survey and Field Research Methods… cont’d

 Probability samples, although not perfectly
representative, are more representative than any other
type of sample.
 So, probability sampling has considerable advantages
over other sampling methods:
• Sampling errors can be calculated.
• It relies on a random process, i.e. the selection process
operates in a truly random manner (no pattern).
• Finally, since each element has an equal chance or
probability of being selected, it is possible to get consistent
and unbiased estimates of the population parameters.
Survey and Field Research Methods… cont’d

 Types of probability sampling methods:


 We can distinguish between the following types of
probability sampling methods:
 Simple Random Sampling
 Systematic Sampling
 Stratified Sampling
 Cluster Sampling
 Hybrid Sampling

Survey and Field Research Methods… cont’d

1. Simple Random Sampling (SRS)


 The SRS is the simplest and easiest method of
probability sampling.
 It is the sampling procedure in which each element
of the population has an equal chance of being
selected into the sample.
 It assumes that an accurate sampling frame exists.
 Usually two SRS methods are adopted to pick a
sample.
 The lottery method
 A table of random numbers
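In software, the lottery method amounts to drawing without replacement from the sampling frame. A minimal sketch, assuming the frame is available as a Python list (the household labels and the seed are hypothetical):

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every unit has the same chance."""
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    return rng.sample(frame, n)


# Hypothetical sampling frame of 100 households.
frame = [f"household_{i:03d}" for i in range(1, 101)]
sample = simple_random_sample(frame, 10, seed=42)
print(sample)
```

Recording the seed lets another researcher reproduce exactly the same draw.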

Survey and Field Research Methods… cont’d

 SRS requires a listing of the entire population of
interest. This may not be possible for national surveys.
 It is too expensive to interview a national sample face
to face based on SRS.
 The cost of interviewing randomly selected individuals
drawn from a list of the entire population is extremely
high.
 So, in practice SRS can only be applied in situations where the
population size is small.
Survey and Field Research Methods… cont’d

2. Systematic Sampling Technique

 In SYSTEMATIC SAMPLING individuals are chosen
at regular intervals (for example every fifth) from the
sampling frame.
 Instead of a list of random numbers, the researcher
calculates a sampling interval.
 The sampling interval is the standard distance between
elements selected in the sample.
 The major advantages of systematic sampling (SS) are its simplicity and
flexibility.
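A minimal sketch of systematic sampling follows. It assumes the sampling interval is computed as frame size divided by desired sample size (rounded down) and that the starting point is chosen at random within the first interval; the slides do not state these details, so treat them as the usual convention rather than the author's rule.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select every k-th unit from the frame after a random start."""
    k = len(frame) // n                       # sampling interval
    start = random.Random(seed).randrange(k)  # random start in the first interval
    return [frame[start + i * k] for i in range(n)]


# Hypothetical frame of 100 numbered units; a sample of 20 gives k = 5.
frame = list(range(1, 101))
sample = systematic_sample(frame, 20, seed=1)
print(sample)
```

If the frame is ordered in a cycle that matches the interval, the sample can be biased, so the ordering of the list deserves a check first.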

Survey and Field Research Methods… cont’d

3. Stratified Sampling
 Most populations can be segregated into a number
of mutually exclusive sub-populations, or strata.
 The stratified sampling technique is particularly
useful when we have heterogeneous populations.
 After a population is divided into the appropriate
strata, a random sample can be taken from each
stratum using either the SRS or the SS technique.
Survey and Field Research Methods… cont’d

 Reasons for stratifying


 There are three major reasons why a researcher
chooses stratified random sampling:
 To increase a sample’s statistical efficiency.
 To provide adequate data for analyzing the various
sub-populations.
 To enable different research methods and procedures to
be used in different strata.
Survey and Field Research Methods… cont’d

 How to Stratify
 Three major decisions must be made in order to
stratify the given population into some mutually
exclusive groups.
(1) What stratification base to use: stratification would be
based on the principal variable under study such as
income, age, education, sex, location, religion, etc.
(2) How many strata to use: there is no precise answer as
to how many strata to use.
 The more strata are used, the closer one comes to
maximizing inter-strata differences and minimizing
intra-strata variation.
Survey and Field Research Methods… cont’d

(3) What strata sample size to draw: different
approaches could be used:
 One could adopt a proportionate sampling procedure:
if the number of units selected from each
stratum is proportional to the total number of units in
that stratum, then we have proportionate sampling.
 Or one could use disproportionate sampling, which allocates
elements on some other basis (a deliberate departure from proportionality).
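Proportionate allocation can be sketched as follows. The strata names and sizes are hypothetical, and rounding is an assumption made here for simplicity.

```python
def proportionate_allocation(strata_sizes, n):
    """Allocate a total sample of n across strata in proportion to their size."""
    total = sum(strata_sizes.values())
    return {name: round(n * size / total) for name, size in strata_sizes.items()}


# Hypothetical population of 10,000 households in three strata.
strata = {"urban": 2000, "peri_urban": 1000, "rural": 7000}
allocation = proportionate_allocation(strata, 500)
print(allocation)
```

With less convenient numbers the rounded allocations may not add up exactly to n, in which case one unit is usually added to or removed from the largest stratum.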

Survey and Field Research Methods… cont’d

4. Cluster Sampling:
 The selection of groups of study units (clusters) instead
of the selection of study units individually is called
CLUSTER SAMPLING.
 If the total area of interest is large and can
be divided into a number of smaller non-overlapping
areas (clusters), and some of these clusters are
selected randomly, we have cluster sampling.

 Clusters are often geographic units (e.g., districts,
villages) or organizational units (e.g., firms, clinics,
training groups, etc.).
 Cluster sampling addresses two problems:
 Researchers lack a good sampling frame for a dispersed
population.
Survey and Field Research Methods… cont’d

 The cost to reach a sample element is very high and cluster
sampling reduces cost by concentrating surveys in selected
clusters.
 Multistage area sampling (MAS) is cluster
sampling with several stages:
 First, take a sample of a set of geographic regions or clusters –
randomly select X number of clusters.
 Next, a subset of geographic areas is sampled within each of
those regions, and so on.
 Finally, a sample of elements is drawn from the smaller areas.
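The three stages can be sketched with nested random draws. The nested dictionary below is a made-up frame (regions containing areas containing households), and equal numbers are drawn at each stage only to keep the example short.

```python
import random


def multistage_sample(regions, n_regions, n_areas, n_elements, seed=None):
    """Three-stage draw: regions, then areas within regions, then elements."""
    rng = random.Random(seed)
    chosen = []
    for region in rng.sample(sorted(regions), n_regions):    # stage 1
        areas = regions[region]
        for area in rng.sample(sorted(areas), n_areas):      # stage 2
            chosen.extend(rng.sample(areas[area], n_elements))  # stage 3
    return chosen


# Hypothetical frame: 4 regions x 5 areas x 20 households.
regions = {
    f"region_{r}": {
        f"area_{r}_{a}": [f"hh_{r}_{a}_{h}" for h in range(20)]
        for a in range(5)
    }
    for r in range(4)
}
sample = multistage_sample(regions, n_regions=2, n_areas=2, n_elements=5, seed=7)
print(len(sample))
```

Only the selected areas need a household listing, which is the practical attraction of the method when no national frame exists.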


5. Hybrid sampling:
 Where there is no single way to sample a particular
population, some researchers use a combination of the four
different methods discussed above.
Survey and Field Research Methods… cont’d

 Non-Probability Sampling
 Non-probability selection is non-random, i.e. each
member does not have a known non-zero chance
of being included.
 Generally, three conditions need to be met in order
to use non-probability sampling.
 First, if there is no desire to generalize to a
population parameter, then there is much less
concern whether or not the sample fully reflects the
population, i.e. precise representation is not
necessary.
 Secondly, it is used because of cost and time
requirements.
 Probability sampling could be prohibitively
expensive since it calls for more planning and
repeated callbacks to ensure that each selected
sample unit is contacted.
 Thirdly, though probability sampling may be
superior in theory, there are breakdowns in its
application.
 The total population may not be available for the
study in certain cases.
 Non-probability sampling methods:
 (1) Convenience sampling
 The method selects anyone who is convenient.
 It can produce ineffective, highly unrepresentative
samples and is not recommended.
 Such samples are cheap, but biased and
full of systematic errors.
 Example: the person-on-the-street interview
conducted by television programs is an
example of a convenience sample.
 (2) Quota Sampling
 Quotas are assigned to different strata groups and
interviewers are given quotas to be filled from
different strata.
 A researcher first identifies categories of people
(e.g., male, female) then decides how many to get
from each category.
 The major limitation of this method is the absence
of an element of randomization. Consequently the
extent of sampling error cannot be estimated.
 Quota sampling is used by opinion pollsters and in marketing research
and other similar research areas.
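The mechanics of filling quotas can be sketched as below. The list of passers-by and the quota of two per category are invented for illustration; the point is that people are taken in the order they happen to be encountered, with no random draw.

```python
def quota_sample(people, quotas, key):
    """Take people in encounter order until each category's quota is full."""
    remaining = dict(quotas)
    selected = []
    for person in people:
        group = key(person)
        if remaining.get(group, 0) > 0:
            selected.append(person)
            remaining[group] -= 1
    return selected


# Hypothetical passers-by as (id, sex) pairs, in the order met by the interviewer.
passers_by = [
    ("p1", "female"), ("p2", "male"), ("p3", "male"),
    ("p4", "male"), ("p5", "female"), ("p6", "female"),
]
sample = quota_sample(passers_by, {"male": 2, "female": 2}, key=lambda p: p[1])
print([p[0] for p in sample])
```

Because who gets encountered first is not random, the sampling error of such a sample cannot be estimated.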

 (3) Purposive or Judgment sampling
 Purposive sampling occurs when one draws a non-
probability sample based on certain criteria.
 Focusing on a limited number of informants,
whom we select strategically so that their in-depth
information will give optimal insight into an issue, is
known as purposeful sampling.
 It uses the judgment of the expert in selecting cases.
 BUT care should be taken that, for the different
categories of informants, selection rules are
developed to prevent the researcher from sampling
according to personal preference.
 (4) Snowball (Network) Sampling
 This is a method for identifying and sampling (or
selecting) the cases in a network.
 Snowball sampling is based on an analogy to a
snowball, which begins small but becomes
larger as it is rolled on wet snow and picks up
additional snow.
 Snowball sampling begins with one or a few people
or cases and spreads out on the basis of links to the
initial case.
 You start with one or two information-rich
key informants and ask them if they know
persons who know a lot about your topic of
interest.
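The way a snowball sample spreads through referrals can be sketched as a breadth-first walk over a referral network. The network, the seed informant and the size limit below are hypothetical.

```python
from collections import deque


def snowball_sample(referrals, seeds, max_size):
    """Start from seed informants and follow their referrals, wave by wave."""
    selected = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(selected) < max_size:
        person = queue.popleft()
        selected.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:  # do not interview anyone twice
                seen.add(contact)
                queue.append(contact)
    return selected


# Hypothetical referral network: who each informant names as knowledgeable.
referrals = {"A": ["B", "C"], "B": ["D"], "C": ["D", "E"], "D": [], "E": ["F"]}
sample = snowball_sample(referrals, seeds=["A"], max_size=5)
print(sample)
```

Everyone reached depends on the first informants chosen, so starting from several unconnected seeds helps widen the range of views.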
 Problems in Sampling
 Two types of errors:
 Non sampling errors
 Sampling errors
 Non Sampling errors are biases or errors due to
fieldwork problems, interviewer induced bias,
clerical problems in managing data, etc.
 These would contribute to error in a survey,
irrespective of whether a sample is drawn or a
census is taken.
 On the other hand, error which is attributable to
sampling, and which therefore, is not present in
information gathered in a census is called sampling
error.
 a) Non-Sampling Error
 Non-sampling errors include:
 Non-coverage error
 Wrong population being sampled
 Non-response error
 Instrument error
 Interviewer error
 Non-coverage error: This refers to a defect in the sampling
frame.
 Omission of part of the target population (for
instance, soldiers, students living on campus, people
in hospitals, prisoners, households without a
telephone in telephone surveys, etc.).
 Non-coverage error also occurs when the lists used
for sampling are incomplete or outdated.
 The wrong population is sampled
 Researchers must always be sure that the group
being sampled is drawn from the population they
want to generalize about or the intended
population.
 Non response error
 Some people refuse to be interviewed because they
are ill, are too busy, or simply do not trust the
interviewer.
 One should try to reduce the incidence of
non-response errors.
 Non-response error can occur in any interview
situation, but it is mostly encountered in large-scale
surveys with self-administered questionnaires.

 It is important in any study to mention the non-
response rate and to honestly discuss whether and
how the non-response might have influenced the
results.
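The non-response rate itself is a simple proportion: selected units that did not respond, divided by all selected units. A small sketch with hypothetical figures:

```python
def non_response_rate(selected, responded):
    """Share of selected units from whom no response was obtained."""
    if selected <= 0 or not 0 <= responded <= selected:
        raise ValueError("need 0 <= responded <= selected and selected > 0")
    return (selected - responded) / selected


# Hypothetical survey: 400 households selected, 340 interviewed.
rate = non_response_rate(selected=400, responded=340)
print(f"{rate:.0%}")
```

Reporting the rate separately for key subgroups (for example by region or sex) helps show whether non-response was concentrated in one part of the sample.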
 Instrument error
 The word instrument in a sample survey means the
device with which we collect data, usually a
questionnaire.
 When a question is badly asked or worded, the
resulting error is called instrument error.
 Example: leading questions or carelessly
worded questions may be misinterpreted by
some respondents.
 Interviewer error: This occurs when some
characteristic of the interviewer, such as age or sex,
affects the way in which respondents answer
questions.
 Example: questions about sexual behavior might be
answered differently depending on the gender of
the interviewer.
 To sum up, a researcher must ensure that non-sampling
errors are avoided as far as possible, or are
evenly balanced (non-systematic) and thus cancel
out in the calculation of the population estimates.
 b) Sampling Errors
 Sampling errors are random variations in the
sample estimates around the true population
parameters.
 Error which is attributable to sampling, and
which therefore is not present in a census-
gathered information, is called sampling
error.
 Sampling errors can be calculated only for
probability samples.
 Increasing the sample size is one of the major
instruments to reduce the extent of the sampling
error.
 Sampling error is related to confidence intervals.
 A narrower confidence interval means more
precise estimates of the population for a given
level of confidence.
 The confidence interval for the true population
mean is given by:
$\text{Mean} \pm z \frac{\sigma}{\sqrt{n}}$
 Mean is the sample mean, z is the value of the
standard variate at a given confidence level (to be
read from the table giving the area under the
normal curve), n is the sample size, and σ is the
standard deviation of the population (in practice estimated
from the sample), so that σ/√n is the standard deviation
of the sample mean.
 The sampling error is given by:
$z \frac{\sigma}{\sqrt{n}}$
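The two formulas above can be sketched in a few lines of Python. This is only an illustration: the survey figures (mean 50, standard deviation 10) are hypothetical, and z is taken from the standard normal distribution rather than read from a printed table.

```python
import math
from statistics import NormalDist


def sampling_error(sigma, n, confidence=0.95):
    """Return z * sigma / sqrt(n) for a two-sided confidence level."""
    # z is the standard normal value that leaves (1 - confidence) / 2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sigma / math.sqrt(n)


def confidence_interval(mean, sigma, n, confidence=0.95):
    """Return (lower, upper) limits: mean -/+ the sampling error."""
    error = sampling_error(sigma, n, confidence)
    return mean - error, mean + error


# Hypothetical survey: sample mean 50, standard deviation 10.
for n in (100, 400):
    low, high = confidence_interval(50.0, 10.0, n)
    print(n, round(low, 2), round(high, 2))
```

Quadrupling the sample size from 100 to 400 halves the sampling error, which is why increasing the sample size narrows the confidence interval.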
135
 Dealing with missing data:
 There are several reasons why the data may be
missing.
They may be missing because equipment
malfunctioned, the weather was terrible,
people got sick, or the data were not entered
correctly.
 If data are missing at random, by far the most
common approach is to simply omit those cases
with missing data and to run our analyses on what
remains.

136
 Although deletion often results in a substantial
decrease in the sample size available for the
analysis, it does have important advantages.
 Under the assumption that data are "missing at
random" (strictly, missing completely at random, that is,
unrelated to any variable), it leads to unbiased parameter estimates.
 If, on the other hand, data are not missing at
random, but are missing as a function of some
other variable, a complete treatment of missing
data would have to include a model that accounts
for missing data.
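The deletion approach described above (often called listwise deletion or complete-case analysis) can be sketched as follows. The records and the field names `age` and `income` are hypothetical, and `None` is assumed to mark a missing value.

```python
from statistics import mean

# Hypothetical survey records; None marks a missing value.
records = [
    {"age": 34, "income": 1200},
    {"age": 41, "income": None},
    {"age": None, "income": 900},
    {"age": 29, "income": 1500},
]


def complete_cases(rows):
    """Keep only the rows in which no field is missing."""
    return [r for r in rows if all(v is not None for v in r.values())]


kept = complete_cases(records)
print(len(records), "cases collected,", len(kept), "cases analysed")
print("mean income of complete cases:", mean(r["income"] for r in kept))
```

Half of the cases are lost in this small example, which illustrates the decrease in sample size that deletion can cause.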

137
 Data Collection Techniques
 Every study is a search for information about the
given topic.
 Qualitative and Quantitative data
 The data should be sufficient to test the
hypotheses
 Collection of the data should be feasible
 The question is from where and how to get the
information (the data).
 Data can be acquired from:
 Secondary sources
 Primary sources 138
 Secondary Sources of data
 Secondary sources are those, which have been
collected by other individuals or agencies.
 As much as possible secondary data should always
be considered first, if available.
 Why reinvent the wheel if the data already exist?
 When dealing with secondary data you should ask:
 Is the owner of the data making them available to
you?
 Is it free of charge? If not, how will you pay?
 Are the data in a format that you can work with?

139
 Advantages of Secondary data
 Can be found more quickly and cheaply.
 Most research on past events or distant places has
to rely on secondary data sources.
 Limitations
 The information often does not meet one’s specific
needs.
Definitions might differ, units of measurement may be
different and different time periods may be involved.
 It is difficult to assess the accuracy of the information
when the research design or the conditions under
which the research took place are unknown.
 Data could also be out of date.
140
 Sources of Secondary Data
 Secondary data may be acquired from various
sources:
 Department reports, production summaries,
financial and accounting reports, marketing and
sales studies, books, periodicals, reference books,
encyclopedias, university publications (theses,
dissertations, etc.), policy documents, statistical
compilations, research reports, proceedings, personal
documents (historical studies), etc.
 The Internet

141
 Primary Sources of Data
 Data that are generated by the people directly
involved in the research.
 Data collected afresh and for the first time are
original in character.
 Qualitative and Quantitative data collection techniques
 There are two approaches to primary data collection:
 the qualitative approach and
 the quantitative approach
142
 Qualitative data collection approaches
 Qualitative data can be acquired from:
 case studies,
 rapid rural appraisal methods,
 focus group discussions and
 key informant interviews.
i) Case studies
 A case study research involves a detailed
investigation of a particular case.
• Through Interviews (several forms of interviews-
open-ended, focused, or structured).
• Through Direct observation (field visits). 143
ii) Rapid Rural Appraisal (RRA)
 RRA is a systematic but semi-structured activity,
often carried out by a multidisciplinary team.
 The techniques rely primarily on expert observation
coupled with semi-structured interviewing.
 The RRA method:
 takes only a short time to complete,
 tends to be relatively cheap, and
 makes use of more 'informal' data collection
procedures.
 The techniques of RRA include:
 Interviews with individuals, households and key
informants
 Group interview techniques, including focus-group
interviewing, etc. 144
iii) Focus group discussions
 A FGD is a group discussion guided by a facilitator,
during which group members talk freely and
spontaneously about a certain topic.
 The group members are selected by the researcher
and are expected to have experience of, or an
opinion on, the topic.
 Its purpose is to obtain in-depth information on
concepts, perceptions and ideas of a group.
 It is more than a question-answer interaction.
 The idea is that group members discuss the
topic and interact among themselves with
guidance from the facilitator.
145
 Why use focus groups?
 The main purpose of a focus group research is
to draw upon respondents’ attitudes, feelings,
beliefs, experiences and reactions which would
not be captured using other methods.
 Attitudes, feelings and beliefs are more likely to be
revealed through the social gathering and the
interaction.
 Compared to individual interviews, which aim
to obtain individual attitudes, beliefs and
feelings, focus groups elicit a multiplicity of
views and emotional processes within a group
context.
146
 Strengths and limitations
 Provided the groups have been well chosen, in terms of
composition and number, FGDs can be a powerful research
tool which provides valuable information in a short period
of time and at relatively low cost.
 BUT, FGD should not be used for quantitative purposes, such
as the testing of hypotheses or the generalization of findings
for larger areas, which would require more elaborate surveys.
 It may be risky to use FGDs as a single tool.
 In group discussions, people tend to centre their opinions on
the most common ones.
 Therefore, it is advisable to combine FGDs with other
methods (in-depth interviews).
 In case of very sensitive topics group members may
hesitate to express their feelings and experiences freely.

147
iv) Key Informant Interview
 The key informant interview technique is an
interviewing process for gathering information from
opinion leaders such as elected officials, government
officials, and business leaders, etc.
 This technique is particularly useful for:
 Raising community awareness about socio-economic
issues
 Learning minority viewpoints
 Gaining a deeper understanding of opinions and
perceptions, etc.

148
v) Triangulation
 Triangulation refers to the use of more than one
approach to the investigation of a research question in
order to enhance confidence in the findings.
 The purpose of triangulation is to obtain confirmation of
findings through convergence of different perspectives.
 Why use triangulation?
 By combining multiple methods and empirical materials,
researchers can hope to overcome the weaknesses, biases and
problems that are associated with a single method.

149
 Types of Triangulation
 Data triangulation, which entails gathering data
through several sampling strategies at different
times and social situations.
 Investigator triangulation, which refers to the use
of more than one researcher in the field to gather
and interpret data.
 Theoretical triangulation, which refers to the use
of more than one theoretical proposition in
interpreting data.
 Methodological triangulation, which refers to the
use of more than one method for analyzing the
data.
150
Quantitative Primary Data Collection Methods

 Primary data may be collected through:
 Direct personal observation method, or
 Survey or questioning other persons.
 The Observation Method
 Observation includes the full range of monitoring
behavioral and non-behavioral activities.
 It is less demanding and has less bias.
 One can collect data at the time an event occurs and need not
depend on reports by others.
 With this method one can capture the whole event as
it occurs.
151
Weakness of the Method
 The observer normally must be at the scene of the event
when it takes place.
 But it is often difficult or impossible to predict when and
where an event will occur.
 Observation is also a slow and expensive process. It
requires either human observers or some type of costly
surveillance equipment.
 Its most reliable results are restricted to data that can be
determined by an open or deliberate action or surface
indicator.
 It is limited as a way to learn about the past, and it is difficult to
gather information on such topics as intentions, attitudes,
opinions and preferences.
152
The Survey Method
 The most common method
 To survey is to ask people questions in a
questionnaire (mailed or handed to people) or
during an interview and then record the answer.
 Surveys are used to generate data on economic
behavior, statistics, opinion polls, etc.
 In a survey the unit of analysis is typically a
person.
153
Strength of the Survey Method
 It is a versatile or flexible method - capable of many
different uses.
 It does not require that there be a visual or other
objective perception of the sought information by a
researcher.
 One can seldom learn much about opinion and
attitudes except by questioning.
 Surveys tend to be more efficient and economical
than observations.
Information can be gathered by a few well-chosen
questions. For instance, surveying using telephone or
mail is less expensive.

154
Weakness of the Method
 The quality of information secured depends heavily on
the ability and willingness of the respondents.
 A respondent may interpret questions or concept
differently from what was intended by the researcher.
 A respondent may deliberately mislead the
researcher by giving false information.
 Surveys could be carried out through:
 Face to face personal interview
 By telephone interview
 By mail or e-mail, or
 By a combination of all these.
155
a) Personal Face to face Interview
 It is a two-way conversation where the respondent is
asked to provide information.
Advantages:
 The depth and detail of the information that can be
secured far exceeds the information secured from
telephone or mail surveys.
 Interviewers can probe with additional questions, gather
supplemental information through observation, etc.
 Interviewers can make adjustments to the language
of the interview because they can observe the
problems and effects with which the respondent is
faced.
156
Limitations of the Method
 The method is expensive.
 Interviewers may be reluctant to visit unfamiliar
neighborhoods.
 Biased results can arise from three types of error:
 Sampling error (discussed earlier)
 Non-response error
 Response error

157
Non-response error
 This error occurs when you are not able to find those whom
you are supposed to study.
 In probability samples there are pre-designated persons to
be interviewed.
 When one is forced to interview substitutes, an unknown
bias is introduced.
 Under such circumstances one of the following could be
tried:
 Make callbacks, which is the most reliable solution.
 Treat all remaining non-respondents as a new
subpopulation and draw a random sample from that
subpopulation.
 Substitute someone else for the missing respondent, if
the population is homogeneous.
158
Response error
 Errors are made in the processing and tabulating of
data.
 Respondents may fail to report fully and accurately.
 Enumerators, who usually have only limited
training and work under little direct supervision, may cheat.
 Enumerators can also distort the results of a survey by
inappropriate suggestions, word emphasis, tone of voice
and question rephrasing.
 Perceived social distance between enumerator and
respondent also has a distorting effect.

159
Cost Considerations
 Interviewing is a costly exercise.
 Much of the cost results from the substantial
enumerator time taken up with administrative and
travel tasks.
b) Telephone Interview
 The telephone can be a helpful medium of communication
in setting up interviews and screening large populations
for rare respondent types.

160
Strength of this method
 Moderate travel and administrative costs
 Faster completion of the study
 Responses can be directly entered on to the computer
 Less interviewer bias.
Limitations of this method
 Respondents must be available by phone.
 The length of the interview period is short.
 Telephone interviews can result in less complete
responses, and those interviewed by phone find the
experience less rewarding than a personal
interview.
161
c) Interviewing by Mail
 Self-administered questionnaires may be used in
surveys.
Advantages
 Lower cost than personal interview
 Persons who might otherwise be inaccessible can
be contacted (major corporate executives)
 Respondents can take more time to collect facts
Disadvantages
 Non response error is expected
 Large amount of information may not be acquired
162
Survey Instrument Design
 Actual instrument design begins by drafting
specific measurement questions.
 Both the subject and wording of each question are
important.
 The psychological order of the question needs to be
considered.
 Questions that are more interesting, easier to answer,
and less threatening usually are placed early in the
sequence to encourage response.

163
The main components of a questionnaire
 Identification data: respondent’s name, address,
time and date of interview, code of interviewer,
etc.
 Instruction: depends on type of survey and may
include skip questions
 Information sought: major portion of the
questionnaire
 Cover letter: brief purpose of the survey, who is
doing it, time involved, etc.

164
Designing of a Questionnaire
 In developing a survey instrument the following
issues need to be considered carefully:
 Question content
 Question wording
 Response form
 Question sequence

165
1. Question Content
 Both questions and statements could be used in
survey research.
 Using both in a given questionnaire gives the
researcher more flexibility.
 Minimizing the number of questions is highly
desirable, but one should never try to ask two
questions in one.
 Question content usually depends on the
respondent’s:
 ability, and
 willingness to answer the question accurately.
166
a) Is the question of proper scope?
 Respondent must be competent enough to answer the
questions.
 The respondent's level of information should be assessed
when determining the content and appropriateness of a
question.
Questions that overtax the respondent’s recall ability may
not be appropriate.
b) Willingness of respondent to answer adequately
 Even if respondents have the information, they may be
unwilling to give it.
 Some topics are also too sensitive to discuss with strangers.
 Examples: the most sensitive topics concern money matters and
family life.
 If respondents consider a topic to be irrelevant and
uninteresting they would be reluctant to give an
adequate answer.
 Some of the main reasons for unwillingness:
 The situation is not appropriate for disclosing the
information
 Disclosure of information would be embarrassing
 Disclosure of information is a potential threat to the
respondent

168
Some approaches that may help to secure more
complete and truthful information
 Use an indirect statement, i.e. one that refers to "other people".
 Motivate respondent to provide appropriate
information.
 Change the design of the questioning process.
 Apply appropriate questioning sequences that will
lead a respondent from "safe" questions gradually to
those that are more sensitive.
 Use methods other than questioning to secure the
data (observation).

169
2. Question Wording

a) Shared Vocabulary
 In a survey the two parties must understand each other
and this is possible only if the vocabulary used is
common to both parties. So, don’t use unfamiliar words
or abbreviations or ambiguous words.
b) Question Clarity
 Do not use emotionally loaded or vaguely defined
words.

170
c) Personalization
 Finding the right degree of personalization may be a
challenge.
 Instead of asking "What would you do about ...?", it
is better to ask "What would people do about ...?"
d) Provision of adequate alternatives
 Asking a question that does not accommodate all
possible responses can confuse and frustrate the
respondent.
 Are adequate alternatives provided? It is wise to
express each alternative explicitly in order to avoid
bias.
3. Response structure or format
 A third major decision area is the degree and form of
the structure imposed on the responses.
 The options range from open (free choice of words) to
closed (specified alternatives).
a) Open Ended Questions
 An open-ended (free response) question is one
to which respondents can give any answer.
 Open-ended (free response) questions in turn range from
 those in which the respondents express themselves
extensively, to
 those in which the freedom is to choose one word in a
"fill in" question.
Advantages
 Permit an unlimited number of possible answers
 Respondents can answer in detail and can qualify
and clarify responses
 Permit creativity, self-expression, etc.
Limitations
 Different respondents give different degrees of detail in
answers, so responses may not be consistent.
 Some responses may be irrelevant
 Comparison and statistical analysis become very difficult.
 Articulate and highly literate respondents have an
advantage
 Requires greater amount of respondent time, thought and
effort.
b) Closed Questions
 Although the open response question may have many
advantages, closed questions are generally preferable in
large surveys.
 Closed questions are often categorized as dichotomous
or multiple-choice questions.
Advantages
 Easier and quicker for respondents to answer
 Easier to compare the answers of different respondents
 Easier to code and statistically analyze
 Are less costly to administer
 Reduce the variability of responses
 Make fewer demands on interviewer skill, etc.

174
Limitations
 Can suggest ideas that the respondents would not
otherwise have
 Respondents with no opinion or no knowledge can
answer anyway
 Respondents can be confused because of too many
choices
 During the construction of closed ended questions:
 The response categories provided should be exhaustive.
They should include all the possible responses that
might be expected.
 In multiple choice type questions, the answer categories
must be mutually exclusive.
The respondent should not feel compelled to select more
than one answer.

175
4) Question Sequence - order
 The order in which questions are asked can affect the
response as well as the overall data collection
activity.
 Transitions between questions should be smooth.
 Grouping questions that are similar will make the
questionnaire easier to complete, and the respondent
will feel more comfortable.
Questionnaires that jump from one topic to another are
not likely to produce high response rates.

176
Some guides to improve quality
 The question process must quickly awaken interest
and motivate the respondent to participate in the
interview by choosing early interview questions
that are attention getting and not controversial in
subject.
 The respondent should not be confronted by early
request for information that might be considered
too personal or threatening.
 The questioning process should move from simpler
questions to more complex ones.

177
5) Physical Characteristics of a Questionnaire
 The format of a questionnaire is as important as the
nature and wording of the questions asked.
 An improperly laid out questionnaire can confuse
respondents and lead them to miss questions.
 The questionnaire should be spread out properly.
Putting more than one question on a line will result in
some respondents skipping the second question.
Abbreviating questions will result in misinterpretation
of the question.

178
Formats for Responses
 A variety of methods are available for presenting a
series of response categories.
 Boxes
 Blank spaces
 Entering code numbers beside each response for the
respondent to circle.
Providing Instructions
 Every questionnaire whether to be self administered
by the respondent or administered by an
interviewer should contain clear instructions.

179
 General instructions
 It is useful to begin a questionnaire with basic
instructions to be followed in completing it.
 Introduction: If a questionnaire is arranged into
subsections it is useful to introduce each section with a
short statement concerning its content and purpose.
 Specific Instructions: Some questions may require
special instructions to facilitate proper answering.
 Interviewer's instructions: It is important to provide
clear complementary instructions to the interviewer
where appropriate.
180
6) Reproducing the questionnaire
 Having constructed the questionnaire, it is necessary to
provide enough copies for the actual data collection.
 A neatly reproduced instrument will encourage a
higher response rate, thereby providing better data.

181
Data Processing and Analysis
 Data analysis ranges from very simple summary
statistics to extremely complex multivariate analyses.
Data Preparation and Presentation
 Data processing starts with the editing, coding,
classifying and tabulation of the collected data.
i) Editing
 Editing of data is the process of examining the collected
raw data to detect errors and omissions and to correct
these when possible.
 Editing involves a careful scrutiny of the completed
questionnaires.
182
 In general one edits to assure that the data are:
 Accurate
 Consistent with other information/facts gathered
 Uniformly entered
 As complete as possible
 Arranged to facilitate coding and tabulation
 The editing can be done at two levels
 On the field and in the office.
a) Field level Editing
 After an interview, field workers should review their
reporting forms, complete what was abbreviated,
translate personal shorthand, rewrite illegible entries,
and make callback if necessary. 183
b) Central editing
 The central editing takes place when all forms have
been completed and returned to the office.
 Data editors correct obvious errors such as entry in
wrong place, recorded in wrong units, etc.
ii) Coding
 Coding refers to the process of assigning numerals to
answers so that responses can be put into a limited
number of categories or classes.
 Several thousands of replies or answers can be
reduced to a few categories, which contain the critical
information needed for analysis.
 Data are transcribed from a questionnaire to a coding
sheet. 184
 The coding must be:
 Appropriate, which implies that the classes or categories
must provide the best partitioning of data for testing
hypotheses and showing relationships.
 Exhaustive: there must be a class for every data item.
 Mutually exclusive: category components should be
mutually exclusive, meaning that specific answers can be
placed in one and only one cell in a given category set.
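A minimal sketch of such a coding scheme is given below. The question (level of education), the numeric codes and the catch-all code 9 are all hypothetical; the point is only that every answer receives exactly one code (mutual exclusivity) and that no answer is left without a code (exhaustiveness).

```python
# Hypothetical codebook for a question on level of education.
CODEBOOK = {"none": 0, "primary": 1, "secondary": 2, "tertiary": 3}
OTHER = 9  # catch-all class, so that there is a class for every data item


def code_answer(answer):
    """Return the single numeric code assigned to one written answer."""
    # Normalise spelling variants before looking the answer up.
    return CODEBOOK.get(answer.strip().lower(), OTHER)


answers = ["Primary", "tertiary ", "Secondary", "vocational"]
codes = [code_answer(a) for a in answers]
print(codes)
```

Because each answer is looked up in a single dictionary, it can fall into one and only one class, and the `OTHER` code catches answers that the researcher did not anticipate.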

185
iii) Classification and Tabulation
 Once data are edited and coded, the data
presentation exercise begins.
 Most research studies result in a large volume of
raw data, which must be reduced into homogeneous
groups if we are to get meaningful relationships.
 Classification is the process of arranging data in
groups or classes on the basis of common
characteristics.
 Data having common characteristics are placed in
similar classes and in this way the entire data get
divided into a number of groups or classes.

186
 Tabulation is the process of summarizing raw data and
displaying it in compact form (i.e. in the form of statistical
tables) for further analysis.
It is an orderly arrangement of data in columns and
rows.
 Tabulation may be done by hand or by mechanical or
electronic devices such as the computer.
 The choice is made largely on the basis of the size and type
of study, alternatives costs, time pressures and the
availability of computer facilities.
 In the case of computer tabulation, computer programs
such as SPSS, Lotus, Excel, STATA, etc. could be used.
187
 Tabulation provides the following advantages:
 It conserves space and reduces explanatory and
descriptive statement to a minimum.
 It facilitates the process of comparison
 It facilitates the summation of items and the detection of
errors and omissions
 It provides a basis for various statistical computations
such as measures of central tendencies, dispersions, etc.
 Tabulation may be classified as simple and complex.
 Simple tabulation gives information about one or more
groups of independent questions.
 Complex tabulation shows the division of data into two
or more categories.
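As an illustration of tabulation by computer, the sketch below builds a one-way frequency table and a two-way (cross) table from coded responses. The variables `sex` and `owns_phone` and the five responses are hypothetical, and treating the one-way table as "simple" and the two-way table as "complex" is one reading of the distinction made above.

```python
from collections import Counter

# Hypothetical coded responses: (sex, owns_phone)
rows = [("F", "yes"), ("M", "no"), ("F", "yes"), ("M", "yes"), ("F", "no")]

# One-way table: frequency of each category of a single variable.
simple = Counter(sex for sex, _ in rows)

# Two-way table: frequency of each combination of two variables.
cross = Counter(rows)

print("Sex  Count")
for sex in sorted(simple):
    print(f"{sex:<4} {simple[sex]}")

print("Sex  Phone  Count")
for (sex, phone) in sorted(cross):
    print(f"{sex:<4} {phone:<6} {cross[(sex, phone)]}")
```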
188
II) Data Analysis
 Large volumes of raw statistical information need
to be reduced to more manageable dimensions if
one is to see meaningful relationships in them.
 Data analysis is the computation of certain indices
or measures.
It refers to the computation of certain measures
along with searching for patterns of relationship
that exist among data groups.
 Data can be analyzed qualitatively or quantitatively.

189
Quantitative data analysis
 Where the data are quantitative, there are some
determinants of the appropriate statistical tool for
analysis.
 Were the data collected using a random or non-random sample?
If it was non-random, then non-parametric data
analysis techniques are appropriate;
if random, then parametric techniques are
appropriate.

190
 Were the samples dependent (related) or
independent?
 Samples are said to be dependent (related) when
the measurement taken from one sample affects
the measurement taken again from the same
sample.
 Samples are independent if the measurements
taken from one sample do not affect those from
another sample.

191
Parametric tests
 Do the data have characteristics that allow
the application of parametric tests? i.e.
 Were observations drawn from a population with
normal distribution, i.e. are the data normally distributed?
 Do the sets of data being compared have
approximately equal variances (homogeneity of
variances)?
 Were the data measured on an interval or ratio scale?

192
 Non-parametric tests (data are nominal or ordinal)
 Question: is there a relationship between the
variable that distinguishes the rows and the
variable that distinguishes the columns?
 Method: chi-square test.
 Analysis can also be categorized as descriptive
analysis or inferential analysis (statistical analysis).
 With respect to the number of variables involved in the
analysis, it can also be divided into uni-variate
analysis and multivariate analysis.
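The chi-square test mentioned above compares the observed cell counts of a rows-by-columns table with the counts expected if the row variable and the column variable were unrelated. The sketch below computes the test statistic and its degrees of freedom by hand; the 2 x 2 table of counts is hypothetical.

```python
def chi_square(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts: rows = sex, columns = answer (yes / no).
table = [[30, 10],
         [20, 40]]
stat, dof = chi_square(table)
print(round(stat, 2), dof)
```

The statistic is then compared with the tabulated chi-square value for the given degrees of freedom. With one degree of freedom the 5% critical value is 3.84, so a statistic above that value suggests a relationship between the two variables.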

193
 Uni-variate Analysis
 Uni-variate analysis refers to the analysis with
respect to one variable.
 It is also called a one-dimensional analysis.
 The uni-variate analysis could either be presented
in the form of statistical measures such as measures
of central tendencies and measure of variations or
in the form of graphs.
 Graphical illustrations could also be used to
demonstrate the frequency distribution (histograms,
ogives, polygons, bar graphs, line graphs and
circular graphs or pie charts).

194
Descriptive Analysis
 The initial uni-variate analysis may be the
presentation of descriptive analysis in the form of
frequency distributions.
 A frequency distribution provides a profile of
different groups on any of a multitude of
characteristics such as size, composition, efficiency,
or preferences of persons or other entities.
 The data in a frequency distribution can be used to
calculate a number of statistical indices, which
summarizes the results even further.
 Measures of central tendency are examples.
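A short sketch of a frequency distribution and the summary measures calculated from it, using only the Python standard library. The household sizes are hypothetical data.

```python
from collections import Counter
from statistics import mean, median, mode, stdev

# Hypothetical data: household size reported by eight respondents.
household_sizes = [3, 4, 4, 5, 2, 4, 6, 3]

# Frequency distribution: how often each value occurs.
freq = Counter(household_sizes)
for size in sorted(freq):
    print(size, "*" * freq[size])

# Measures of central tendency and one measure of variation.
print("mean:", mean(household_sizes))
print("median:", median(household_sizes))
print("mode:", mode(household_sizes))
print("standard deviation:", round(stdev(household_sizes), 2))
```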

195
Multivariate Analysis
 Multivariate analysis involves the consideration
of two or more variables.
 If we have two variables then we have bi-variate
analysis but if we have more than two variables then
we have multivariate analysis.
 Several such analyses could be undertaken, ranging from
the construction of bi-variate tables to
multivariate techniques such as multiple regression,
ANOVA, discriminant analysis, probit and logit
analyses, canonical analysis, etc.

196
Summary chart concerning analysis of data
Analysis of data (in a broad general way can be categorised into):
1. Processing of data (preparing data for analysis)
 Editing
 Coding
 Classification
 Tabulation
 Using percentages
2. Analysis of data (analysis proper)
a) Descriptive and causal analysis
 Uni-dimensional analysis (calculation of several measures
mostly concerning one variable):
i. measures of central tendency
ii. measures of dispersion
iii. measures of skewness
iv. simple correlation
v. one way ANOVA
 Bivariate analysis (analysis of two variables or attributes
in a two way classification):
 Simple regression and correlation
 Association of attributes
 Two way ANOVA
 Multi-variate analysis (simultaneous analysis of more than
two variables or attributes in a multiway classification):
 Multiple regression
 Multiple discriminant analysis
 Multi-ANOVA
 Canonical analysis
b) Inferential analysis/statistical analysis
 Estimation of parameter values
 Point estimation
 Interval estimation
 Testing hypothesis
 Parametric test
 Non-parametric test or distribution free tests

197
Pitfalls in Data Analysis
 The problem with statistics
 Some aspects of statistical thinking might lead
many people to be distrustful of it.
 There are three broad classes of statistical pitfalls.
 The first involves sources of bias. These are conditions
or circumstances which affect the external validity of
statistical results.
 The second category is errors in methodology,
which can lead to inaccurate or invalid results.
 The third class of problems concerns interpretation
of results - how statistical results are applied (or
misapplied) to real world issues.
1. Sources of Bias
 The core value of statistical methodology is its ability
to assist one in making inferences about a large group
(a population) based on observations of a smaller
subset of that group.
 In order for this to work correctly,
 the sample must be similar to the target population in all
relevant aspects (representative sampling);
 certain aspects of the measured variables must conform to
assumptions which underlie the statistical procedures to be
applied (statistical assumptions).

199
 Representative sampling
 This is one of the most fundamental tenets of inferential
statistics:
 the observed sample must be representative of the target
population in order for inferences to be valid.
 The ideal scenario would be where the sample is
chosen by selecting members of the population at
random, with each member having an equal
probability of being selected for the sample.
 The sample "parallels" the population with respect to
certain key characteristics which are thought to be
important to the investigation at hand.
 The problem comes in applying this principle to real world
situations.
 Statistical assumptions.
 The validity of a statistical procedure depends on
certain assumptions it makes about various aspects of
the problem.
 For instance, linear methods depend on the
assumptions of normality and independence.
 Unfortunately, this offers an almost irresistible temptation
to ignore any non-normality, no matter how bad the
situation is.
 If the distributions are non-normal, try to figure out
why; if it's due to a measurement artifact try to
develop a better measurement device.

201
 Another possible method for dealing with unusual
distributions is to apply a transformation.
 However, this has dangers as well; an ill-considered
transformation can do more harm than good in terms of
interpretability of results.
 The assumption regarding independence of observations is
more troublesome, because it is so frequently violated in
practice.
 Observations which are linked in some way may show some
dependencies.
 One way to try to get around this is to aggregate cases to
the higher level.
 Example: use households as the unit of analysis, rather than
individuals.
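Aggregating cases to the higher level can be sketched as follows. The individual-level records, the household IDs, and the choice of the mean as the summary are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical individual-level records: (household ID, individual income).
individuals = [
    ("H1", 100.0), ("H1", 300.0),
    ("H2", 250.0),
    ("H3", 50.0), ("H3", 150.0), ("H3", 100.0),
]

def aggregate_by_household(records):
    """Collapse individual records into one mean value per household."""
    grouped = defaultdict(list)
    for household_id, value in records:
        grouped[household_id].append(value)
    return {hid: mean(values) for hid, values in grouped.items()}

household_means = aggregate_by_household(individuals)
print(household_means)   # {'H1': 200.0, 'H2': 250.0, 'H3': 100.0}
```

The six linked individual observations become three household observations, which are more plausibly independent of one another. The cost is a smaller number of cases.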
2. Errors in methodology
 The most common hazards include designing experiments
with insufficient power, ignoring measurement error, and
performing multiple comparisons.
 Statistical Power. The power of your test generally depends
on the sample size, the effect size you want to be able to
detect, the alpha you specify, and the variability of the
sample.
 Based on these parameters, you can calculate the power level of
your experiment.
 Similarly you can specify the power you desire (e.g. .80), the
alpha level, and use the power equation to determine the
proper sample size for your experiment.
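The slides do not say which power equation is meant. The sketch below uses one common normal approximation for a two-sided comparison of two independent groups of equal size. It assumes a standardized effect size (the difference in means divided by the standard deviation), so the variability of the sample enters through the effect size. The function names are illustrative.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()   # standard normal distribution

def power_two_groups(n_per_group, effect_size, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * math.sqrt(n_per_group / 2)
    return STD_NORMAL.cdf(shift - z_crit)   # ignores the negligible far tail

def n_per_group_for_power(power, effect_size, alpha=0.05):
    """Approximate sample size per group needed to reach the desired power."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = STD_NORMAL.inv_cdf(power)
    return math.ceil(2 * ((z_crit + z_power) / effect_size) ** 2)

# Desired power .80, alpha .05, and a hypothetical medium effect of 0.5 SD.
n_needed = n_per_group_for_power(0.80, 0.5)
print(n_needed)                          # 63 per group under this approximation
print(power_two_groups(n_needed, 0.5))   # slightly above 0.80
```

Exact calculations based on the t distribution give slightly larger sample sizes, so the result should be treated as a planning estimate.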
 If you have too little power, you run the risk of overlooking
the effect you're trying to find.
 If your sample is too large, nearly any difference, no matter
how small or meaningless from a practical standpoint, will
be "statistically significant".
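The effect of a very large sample can be illustrated with a small calculation. It assumes the same two-group normal approximation as a z test and a hypothetical observed difference of 0.02 standard deviations, which would rarely matter in practice.

```python
import math
from statistics import NormalDist

def two_sided_p_value(effect_size, n_per_group):
    """p-value of a two-group z test when the observed standardized
    difference equals effect_size (normal approximation)."""
    z = effect_size * math.sqrt(n_per_group / 2)
    return 2 * (1 - NormalDist().cdf(abs(z)))

tiny_effect = 0.02   # hypothetical, practically negligible difference
for n in (100, 10_000, 100_000):
    print(n, two_sided_p_value(tiny_effect, n))
```

The same negligible difference is far from significant with 100 cases per group and highly significant with 100,000 cases per group.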
 Measurement error. Most statistical models assume
error-free measurement.
 However, measurements are seldom if ever perfect.
 Particularly when dealing with noisy data such as
questionnaire responses or processes which are difficult to
measure precisely, we need to pay close attention to the
effects of measurement errors.
 Two important characteristics of measurement are reliability
and validity.
 Reliability refers to the ability of a measurement
instrument to measure the same thing each time it
is used.
 So, a reliable measure should give you similar
results each time it is applied.
 If the characteristic being measured is stable over
time, repeated measurement of the same unit should
yield consistent results.
 Validity is the extent to which the indicator
measures the thing it was designed to measure.
 Validity is usually measured in relation to some
external criterion.
3. Problems with interpretation
 There are a number of difficulties which can arise in the
context of interpretation.
 Confusion over significance. The difference between
"significance" in the statistical sense and "significance" in
the practical sense continues to elude many consumers
of statistical results.
 Significance (in the statistical sense) is as much a function
of sample size and experimental design as it is of the
strength of the relationship.
 With low power, you may be overlooking a really
useful relationship; with excessive power, you may be
finding microscopic effects with no real practical value.
 Precision and Accuracy. These two concepts often
get confused.
 Precision refers to how finely an estimate is
specified, whereas accuracy refers to how close an
estimate is to the true value.
 Estimates can be precise without being accurate:
an estimate of 12.3456 is precise, but it is not
accurate if the true value is 20.
 Causality. Assessing causality is the most important
function of most statistical analyses.
 For causal inference you must have random
assignment.
 Many of the things we might wish to study are not
subject to experimental manipulation.
 Hence, reaching any strong conclusion regarding
causality requires a multifaceted approach to the
research:
 use chronologically structured designs (placing
variables in the roles of antecedents and
consequents);
 use several replications.
 Graphical Representations. There are many ways
to present quantitative results graphically, and it
is easy to go astray by misapplying graphical
techniques.
Multiple Variables and Confounds
 It would make our life simpler if every effect
variable had only one cause, and it co-varied only
with one other variable.
 Unfortunately, this is hardly ever the case.
 If we have a number of interrelated variables, then it
becomes difficult to sort out how variables affect
each other.
 It is easy to confuse one cause with another, or to
attribute all changes to a single cause when many
causal factors are operating.
 Having multiple variables related to each other obscures
the nature of covariance relationships.
 If we observe covariance between two variables, we must
question whether they co-vary because of some real
relationship between them, or whether the covariance is
merely due to the spurious effect of a third confounding
variable.
 The process of determining whether a relationship exists
between two variables requires first that we establish
covariance between them.
 In addition to verifying that the two variables change in
predictable, non-random patterns, we must also be able to
discount any other variable or variables as sources of the
change.
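A small simulation can show how a third variable produces covariance between two variables that have no direct link. Everything here is hypothetical: z plays the role of the confounding variable, and x and y are each generated from z plus independent noise, so by construction neither affects the other.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)
n_units = 20_000

z = [rng.gauss(0, 1) for _ in range(n_units)]   # confounding variable
x = [zi + rng.gauss(0, 1) for zi in z]          # depends on z only
y = [zi + rng.gauss(0, 1) for zi in z]          # depends on z only

r_xy = pearson(x, y)
print(round(r_xy, 2))   # clearly positive, although x never influences y
```

The observed covariance between x and y is real, but the relationship is spurious: it comes entirely from their shared dependence on z.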
 To establish a true relationship, we must be able to
confidently state that we observed the relationship under
conditions which eliminated the effects of any other variables.
 Failure to properly control for confounding variables is
a common error.
 We must take steps to control all confounding
variables, so that we can avoid misestimating
the size of relationships, or even drawing the wrong
conclusions from our observations.
Controlling for Confounding Variables
 We can first organize the universe of variables and
reduce them by classifying every variable into one of
two categories: Relevant or Irrelevant to the
phenomenon being investigated.
 The relevant variables are those which are important for
understanding the phenomenon, or those for which a
reasonable case for relevance can be made.
 Example: if the literature tells us that consumption expenditure
is associated with income, then we will consider income to be
a relevant variable.
 If we have not included a variable in our
analysis, it can be for different reasons.
 One reason we might choose to exclude a variable is
because we consider it to be irrelevant to the
phenomenon we are investigating.
 If we classify a variable as irrelevant, it means that it
has no systematic effect on any of the variables
included.
 Irrelevant variables require no form of control, as they
are not systematically related to any of the variables in
our model, so they will not introduce any influence.
 There are two basic reasons why relevant variables
might be excluded:
 First, the variables might be unknown.
 We might have overlooked some relevant variables, but the
fact that we have missed these variables does not mean that
they have no effect.
 Another reason for excluding relevant variables is
because they are simply not of interest.
 Although the researcher knows that the variable
affects the phenomenon being studied, he or she does
not want to include its effect in the model.
 Finally, there remain two kinds of variables which
are explicitly included in our hypothesis tests.
 The first are the relevant, interesting variables which are
directly involved in our hypothesis test.
 The second is called a control variable.
 The control variable is included because it affects
the relevant variables and we need to remove or
control for its effect.
Internal and External Validity
 Knowing what variables you need to control for is
important, but even more important is the way you
control for them.
 Several ways of controlling variables exist.
 Internal validity is the degree to which we can be
sure that no confounding variables have obscured
the true relationship between the variables in the
hypothesis test.
 It is the confidence that we can place in the assertion
that the independent variables actually produce the
effects that we observe.
 External validity describes our ability to generalize
from the results of a research study to the real world.
 Unfortunately, although controlling for the effect of
confounding variables increases internal validity, it
often reduces external validity.
Methods for Controlling Confounding Variables
 The effects of confounding variables can be controlled
with three basic methods: manipulated control, statistical
control, and randomization.
 Internal validity, external validity, and the amount of
information that can be obtained about confounding
variables differ for each of these methods.
Manipulated Control
 Manipulated control essentially changes a variable into a
constant.
 We eliminate the effect of a confounding variable by not
allowing it to vary. If it cannot vary, it cannot produce
any change in the other variables.
 If we can hold all confounding variables constant, we
can be confident that any difference observed between
two groups is indeed due to the explanatory variable
and not due to the other variables.
 This gives us high internal validity.
 So, manipulated control prevents the controlled
variables from having any effect on the dependent
variable.
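Holding a confounding variable constant can be illustrated with a tiny, entirely hypothetical data set. Here the age band is the confounding variable: most treated units are young, and young units have higher outcomes regardless of treatment.

```python
from statistics import mean

# Hypothetical records: (group, age band, outcome).
records = [
    ("treated", "young", 8.0), ("treated", "young", 8.0),
    ("treated", "young", 8.0), ("treated", "old", 4.0),
    ("control", "young", 8.0), ("control", "old", 4.0),
    ("control", "old", 4.0), ("control", "old", 4.0),
]

def group_difference(rows):
    """Mean outcome of the treated group minus that of the control group."""
    treated = [outcome for group, _, outcome in rows if group == "treated"]
    control = [outcome for group, _, outcome in rows if group == "control"]
    return mean(treated) - mean(control)

naive_diff = group_difference(records)

# Manipulated control: age is not allowed to vary (only "young" units are kept).
young_only = [row for row in records if row[1] == "young"]
controlled_diff = group_difference(young_only)

print(naive_diff, controlled_diff)   # 2.0 0.0
```

Once age cannot vary, the apparent treatment effect disappears, which shows it was produced by age. The price is that the result now describes young units only.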
Statistical Control
 With this method of control, we include the confounding
variable into the research design as an additional measured
variable, rather than forcing its value to be a constant.
 So, we will be working with three (or more) variables
and not two: the independent and dependent variables, plus
the confounding (or control) variable or variables.
 The effect of the control variable is mathematically removed
from the effect of the independent variable, but the control
variable is allowed to vary naturally.
 This process yields additional information about the
relationship between the control variable and the other
variables.
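One simple way to remove the effect of a single control variable mathematically is the first-order partial correlation. The slides do not name a specific technique, so this is offered only as one possible illustration. The correlation values below are hypothetical, with x and y as the variables of interest and z as the control variable.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of z."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Case 1: z explains only part of the x-y relationship (about 0.56 remains).
partly_explained = partial_correlation(0.60, 0.30, 0.30)

# Case 2: the x-y correlation equals the product of the correlations with z,
# so essentially nothing remains once z is controlled.
fully_explained = partial_correlation(0.49, 0.70, 0.70)

print(round(partly_explained, 2), round(abs(fully_explained), 2))
```

Because z is measured rather than held constant, the correlations of z with x and y are available as additional information.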
 Besides the additional information about the
confounding variables that statistical control
provides, it also has some real advantages over
manipulated control.
 External validity is improved, because the confounding
variables are allowed to vary naturally, as they would
in the real world.
 But, internal validity is not compromised to achieve this
advantage.
 In general, statistical control provides us with much
more information about the problem we are
researching than does manipulated control.
 But advantages in one area usually have a cost in
another, and this is no exception.
 An obvious drawback of the method lies in the increased
complexity of the measurement and statistical analysis
which will result from the introduction of larger
numbers of variables.
Randomization
 The third method of controlling for confounding
variables is to randomly assign the units of analysis
(experimental subjects) to experimental groups or
conditions.
 The rationale for this approach is straightforward: any
confounding variable will have its effects spread
evenly across all groups, and so it will not produce
any consistent effect that can be confused with the
effect of the independent variable.
 This is not to say that the confounding variables
produce no effects in the dependent variable—they do.
 But the effects are approximately equal for all groups, so the
confounding variables produce no systematic effects on the
dependent variable.
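Random assignment can be sketched as a shuffle followed by a split. In this hypothetical example, age stands in for a confounding variable that the researcher never measures. The number of units, the age range, and the function name are assumptions made for illustration.

```python
import random
from statistics import mean

rng = random.Random(7)   # the seed only makes the example repeatable

# Hypothetical units, each carrying an unmeasured confounder (age in years).
ages = [rng.randint(18, 80) for _ in range(2000)]

def randomly_assign(units, rng):
    """Shuffle the units and split them into two equal-sized groups."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

treatment_ages, control_ages = randomly_assign(ages, rng)
age_gap = mean(treatment_ages) - mean(control_ages)

print(len(treatment_ages), len(control_ages))
print(round(age_gap, 2))   # small relative to the 18-80 range of ages
```

Age still affects each unit, but its average is nearly the same in both groups, so it cannot masquerade as an effect of the independent variable.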
 The major advantage of randomization is that we can
assume that all confounding variables have been controlled.
 Even if we fail to identify all the confounding variables, we
will still control for their effects.
 Because these confounding variables are allowed to vary
naturally, as they would in the real world, external
validity is high for this method of control.
 Since we don’t actually measure the confounding
variables, we assume that randomization produces
identical effects from all confounding variables in all
groups, and that removes any systematic confounding
effects of these variables.
 But any random process may result in
disproportionate outcomes occasionally.
 Example: If we flip a coin 100 times, we will not always
see exactly 50 heads and 50 tails.
 Sometimes we will get 60 heads and 40 tails, or even 70
tails and 30 heads.
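How often such disproportionate outcomes occur can be computed exactly from the binomial distribution. The sketch assumes a fair coin and independent flips. The function names are illustrative.

```python
from math import comb

def binomial_pmf(k, n, p=0.5):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def prob_at_least(k, n, p=0.5):
    """Probability of k or more successes in n independent trials."""
    return sum(binomial_pmf(i, n, p) for i in range(k, n + 1))

n_flips = 100
p_exactly_50_heads = binomial_pmf(50, n_flips)     # roughly 0.08
p_60_or_more_heads = prob_at_least(60, n_flips)    # roughly 0.03
p_70_or_more_tails = prob_at_least(70, n_flips)    # well below 0.0001

print(p_exactly_50_heads, p_60_or_more_heads, p_70_or_more_tails)
```

An exact 50/50 split is the single most likely outcome, yet it occurs in fewer than one in ten sets of 100 flips. A split as lopsided as 70/30 is possible but very rare.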
 Consequently, we have no way of knowing with
absolute certainty that the randomization control
procedure has actually distributed identically the
effects of all confounding variables.
 We are only trusting that it did.
 But, with manipulated control and statistical control,
we can be completely confident that the effects of the
confounding variables have been distributed so that no
systematic influence can occur, because we can
measure the effects of the confounding variable
directly.
 There is no chance involved.
 A further disadvantage of randomization is that it
produces very little information about the action of any
confounding variables.
• We assume that we have controlled for any effects of
these variables, but we don’t know what the variables
are, or the size of their effects, if there are any.
• We assume that we’ve eliminated the systematic effects of
the confounding variables by ensuring that these effects
are distributed across all values of the relevant variables.
• But we have not actually measured or removed these
effects; the confounding variables will still produce
change in the relevant variables.