Research Methods Draft Chapter 3-7-1
What techniques should be used to collect and analyze the data? And so forth.
Therefore, defining a research problem properly is a prerequisite for any study and a very
important step; indeed, it is often more essential than its solution.
Techniques involved in defining a problem
The research problem should be defined in a systematic manner. The technique involved in
defining a research problem follows a number of steps, which include:
i. Statement of problem in a general way
First of all, the problem should be stated in a broad, general way, in keeping with some practical,
scientific, or intellectual interest. For that purpose, the researcher must immerse himself completely
in the subject matter in which he wishes to pose a problem. In social science, it is
advisable to do some field observation and/or a preliminary (pilot) survey. Then the
researcher can state the problem or can seek the guidance of a subject expert.
ii. Understanding the nature of the problem
The next step is to understand clearly the nature and the origin of the problem. The best ways of
understanding the problem are:
To discuss it with those who first raised the problem, in order to know how the problem
originally came into view.
To discuss it with those who have a good knowledge of the problem concerned or of
similar problems.
iii. Surveying the available literature
All available literature concerning the problem must be studied and examined before defining the
research problem. This means the researcher must be familiar with the relevant theory in the area.
Theory has the following roles in research studies overall:
Theory provides patterns for the interpretation of data
It links one study with another
It provides frameworks within which concepts and variables acquire special significance
It allows us to interpret the larger meaning of our findings for ourselves and others
The researcher should also review reports, records, and other literature in the area concerned, and
review research works undertaken on related problems. This is important especially to
learn what data and other material have been used and are available for operational purposes.
Knowledge of all these will help the researcher to narrow the problem down.
Generally, a survey of the literature will enable the researcher to know:
Whether there are certain gaps in the theory
Whether the existing theories are applicable to the problem and consistent
with each other
Whether the findings of previous research do or do not follow a pattern
consistent with the theoretical expectations
Studies on related problems are also useful for indicating the types of
difficulty that may be encountered in the present study.
iv. Developing ideas through discussion
Discussion of a problem produces useful information. Various new ideas can be discovered and
developed through it. The researcher should discuss his problem with colleagues and others who
have enough experience in the same area. Such a practice is called an 'experience survey'. People
with rich experience are in a position to show the researcher different aspects of his proposed study,
and their advice and comments are usually of high value.
v. Rephrasing the research problem (Reformulation of the problem)
Finally, at this stage the researcher should be able to reformulate the problem that has been stated
in a broad and general way into working propositions. The researcher should narrow the problem
down and break it into its component variables and relationships. That is, the problem should be
expressed as:
a relationship between two or more variables
a statement in either question form or hypothesis form
The question form is appropriate mostly when the research is descriptive in nature. What is
important is that, when a researcher states the problem in question form, the formulated problem
should be free from ambiguity and the relationships among variables should be clearly expressed.
Examples
Does a relationship exist between income level of university students and score on their
exams?
Is there a relationship between employees' age and their productivity?
Does a relationship exist between male circumcision and susceptibility to HIV?
In the above examples, the study's main elements are identified in a reasonably clear fashion. The
following points must be considered while redefining the research problem:
Technical terms, words, or phrases with special meanings used in the statement of the
problem should be clearly defined.
Basic assumptions or postulates (if any) relating to the research problem should be clearly
stated.
A straightforward statement of the value of the investigation (i.e., the criteria for the
selection of the problem) should be provided.
The suitability of the time and the sources of data available must also be considered by the
researcher in defining the problem.
The scope of the investigation or the limits within which the problem is to be studied must
be mentioned explicitly in defining the research problem.
3.1.2. Evaluation of the Research Problem
Before the final decision is made on the investigation of the problem, the feasibility of the problem
has to be tested with regard to personal suitability of the researcher and social value of the
problem. In short, the research problem should be evaluated in terms of the following criteria.
Is the problem researchable?
Some problems cannot be effectively solved through the process of research. In particular, research
cannot provide answers to philosophical and ethical questions that do not show vividly the
relationship existing between two or more variables. Therefore, the problem must be stated as a
workable research question that can be answered empirically.
Example: metaphysical or philosophical studies, which are concerned with the identification of the
sources of knowledge, the essence of truth, etc.
Is the problem new?
As much as possible, the research problem needs to be new. One should not target his investigation
at a problem that has already been thoroughly investigated by other researchers. To avoid
such duplication, the researcher has to go through the records of previous studies in a given field.
However, there are times when a problem that has been investigated in the past could be
worthy of study. A researcher may repeat a study when he wants to verify its conclusions or to
extend the validity of its findings in a situation entirely different from the previous one.
Is the problem significant?
The question of the significance of the problem usually relates to what a researcher hopes to
accomplish in a particular study. What is the researcher's purpose in undertaking the research
process to solve the particular problem he has chosen? What new knowledge does he hope to add
to the sum total of what is known? In addition, what value is this knowledge likely to have? When
all these questions are answered clearly by the researcher, the problem should be considered for
investigation. The researcher should show that the study is likely to fill gaps in the existing
knowledge, to help resolve some of the inconsistencies in previous research, or to help in the
reinterpretation of known facts. The findings should become a basis for theories, generalizations,
or principles, and should lead to new problems for further research.
Is the problem feasible?
In addition to the above-stipulated points, the feasibility of the research problem should also be
examined from the point of view of the researcher's personal circumstances, as stated hereunder.
Researcher's competence: The problem should be in an area in which the researcher is qualified
and competent. Before embarking on the investigation of the problem, the researcher has to make
sure that he is well acquainted with the existing theories, concepts, and laws related to the
problem. He must also possess the necessary skills and competence that may be needed to
develop, administer, and interpret the necessary data-gathering tools. What is more, he needs
to consider whether he has the necessary knowledge of research design and statistical
procedures that may be required to carry the research through to its completion.
Interest and enthusiasm: The researcher has to make sure that the problem really interests him.
He must also be truly enthusiastic about the problem. If the problem is chosen properly by
observing these points, the research will not be boring; rather, it will be a labor of love.
Financial consideration: Research is an expensive endeavor that requires a great deal of
money. In this regard, the researcher should ascertain whether he has the necessary
financial resources to carry out the investigation of the selected problem. An estimate of the
expenditure involved in data-gathering equipment, printing, test materials, travel, and
clerical assistance should be specified. Furthermore, the possible sources of funds must be
consulted ahead of time.
Time requirement: Research should be undertaken within a given time frame, allocated
after careful analysis of the prevailing situation. Each activity of the research process
requires time. In particular, it is worthwhile to plan for the time that will be needed for the
development and administration of tools, the processing and analysis of data, and the writing of
the research report. While allocating time for a research project, account should be taken of the
researcher's other engagements or commitments, the respondents' accessibility, and the expiry
date of the required data.
Administrative consideration: The researcher has to pay attention to all administrative matters
that are necessary to bring his study to its full completion. In this regard, the researcher should
consider the kinds of data, equipment, specialized personnel, and administrative facilities that are
needed to complete the study successfully. The researcher must also make sure that the pertinent
data are available and accessible to him.
Extrapolating the Research Problem into a Proposal:
After selecting a problem, the researcher should state it carefully in order to delimit his task and
isolate a specific problem before he can proceed with active planning of the study. This type of
decision culminates in the problem statement.
Therefore, to extrapolate the problem into a research proposal, the following activities should be
undertaken:
1. Determining the relation between two or more variables
2. Stating the problem clearly and unambiguously in question form
3. Making the problem amenable to empirical testing
Performing these activities helps to identify what the researcher wants to do and yields a clear and
concise idea that sets the stage for further planning.
Selecting a Research topic
Topics to avoid
It is often only possible in retrospect to recognize the topic you should not even have attempted!
However, here are a few hints that may help you to avoid a research disaster. The topics to avoid
are those that are:
1. Too big: For example, 'Human resource management: innovative international perspectives'
and similar others. Some very large projects can be worthy and valuable to an organization, but
you need to ask yourself whether you have the time, experience, and resources to complete
them. Winkler and McCuen (1985) also warn that the big topic is the most difficult to
write about: it is difficult to know where to begin, and omissions and oversights are more
cruelly exposed.
2. Traced to a single source: This may not be a particular problem in pure business research
when a single solution is needed to a problem. However, if the research is linked to an
academic program of study, or important for your own professional development, there will
usually be a requirement that issues are explored from a variety of different angles.
3. Too trivial. This may seem rather subjective, but you should use your common sense to
evaluate the kinds of projects that are worth doing and those that are not. As a general rule of
thumb try using the ‘So what?’ test. Ask yourself, after completing the research, whether the
results have any meaning or significance (to others not just to you). For example, a research
project that surveyed how to reduce the use of paper in a marketing department of ten people
would yield very little of value. On the other hand, a project that took the issue of recycling
(paper, printer cartridges, furniture, computers, etc.) across an organization could have
considerable scope and link into the broader environmental debate.
4. Lacking in resource materials: Look out for warning signs – very few references to the topic
in the main textbooks, practitioner journals or other refereed journals or websites. If the project
is going to rely on access to in-house knowledge experts, make sure (in advance) that they are
both available and willing to cooperate with you.
5. Lacking in sponsorship: This does not necessarily mean financial sponsorship, but it is often
important to obtain the support and commitment of key people in the organization or fieldwork
setting where the research is taking place. These are likely to be directors, senior managers, or
the leaders of networks or groups.
6. Too technical. Some projects are more concerned with solving highly technical problems
rather than organizational research. Leave these to the technical gurus.
7. Intractable. You may be offered a problem that nobody else has been able to solve. Be highly
suspicious of this kind of gift! Ask yourself ‘Why me?’ It may be an offer you need to refuse.
8. Dependent on the completion of another project: Even if you are ‘guaranteed’ that projects
you hope to use as data sources will be completed in time for your use, you are strongly
advised not to make your own project dependent on them. If slippage occurs, your own
research will be held up or even scrapped.
9. Unethical: Avoid taking on projects that can damage other people physically, emotionally or
intellectually. Refuse to take on a project that forces you to breach confidentiality or trust.
When using interviews, observation, or surveys, you will need to pay particular attention to
politically sensitive issues such as power relationships, race, gender and the disclosure of
personal information. Ethics are discussed in more detail at the end of this chapter and
elsewhere in this book.
3.2. Formulation of Research Hypotheses, Objectives and Research Question
A hypothesis proposes a relationship between two or more variables. The hypothesis form is
employed when the state of existing knowledge and theory permits the formulation of a reasonable
prediction about the relationship among variables. A research hypothesis differs from a research
question in that a hypothesis both states the question in testable form and predicts the nature of the
answer. In other words, a hypothesis is a theory entertained in order to study the facts and examine
the validity of the theory. The task of the researcher in this case is to establish and test such a
hypothesis. Establishing a hypothesis should follow rules such as:
The variables must be clearly specified and measurable by some technique we know
The relationship between them must be stated precisely.
Importance of Hypothesis
A well-grounded hypothesis provides the following advantages:
It represents a specific objective, which determines the nature of the data needed to test the
proposition
It offers a basis for selecting the sample, the research procedure, and the statistical analysis needed
It keeps the study restricted in scope, thereby preventing it from becoming too broad
It sets a framework for reporting the conclusions of the study.
Criteria of usable hypotheses
Hypotheses can be useful if and only if they are carefully formulated. There are several criteria
used to evaluate hypotheses. These include the following:
Hypotheses should be clearly and precisely formulated
Hypotheses should be formulated in such a way that they can be tested or verified (they
should be testable)
A hypothesis should state explicitly the expected relationship between variables
Hypotheses should be limited in scope. Hypotheses of global significance are not usable, as
they are not specific and simple enough for testing and drawing conclusions
Hypotheses should be consistent with the known facts. In other words, hypotheses should
be grounded in well-established facts, theories, or laws
Hypotheses should be stated in terms as simple as possible. A simple statement
offers the following advantages:
i. It becomes easily understandable to others (readers)
ii. It becomes easily testable
iii. It provides a basis for a clear and easily comprehended report at the completion of the
study
The hypotheses selected should be amenable to testing within a reasonable time.
Some examples of hypotheses:
Hypothesis one (general form): Variable one (dependent variable) stands in some stated
relationship to Variable two (independent variable).
Hypothesis two (example): Alienation increases with Poverty.
The result of the hypothesis test is the substance of our conclusion and is expressed as a
generalization.
Hypothesis Formulation in Quantitative Methods
In developing hypotheses in quantitative research, variables must be operationalized. What
variables refer to is dealt with in detail in the following section.
Variables
A variable refers to a characteristic or attribute of an individual or an organization that can be
measured or observed and that varies among the people or organization being studied (Creswell,
2007).
Independent variables are those that (probably) cause, influence, or affect outcomes.
They are also called treatment, manipulated, antecedent, or predictor variables.
Dependent variables are those that depend on the independent variables; they are the
outcomes or results of the influence of the independent variables. Other names for
dependent variables are criterion, outcome, and effect variables.
Intervening or mediating variables stand between the independent and dependent
variables, and they mediate the effects of the independent variable on the dependent
variable.
Moderating variables are new variables constructed by a researcher by taking one
variable and multiplying it by another to determine the joint impact of both (e.g. age X
attitudes toward quality of life). These variables are typically found in experiments.
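The construction of a moderating variable as a product term can be sketched as follows; the ages, attitude scores, and the 1-5 scale below are hypothetical, invented purely to illustrate the "age X attitudes" idea from the text:

```python
# Hypothetical data: ages in years and attitude scores on a 1-5 scale.
ages = [25, 40, 55]
attitudes = [3.0, 4.5, 2.0]

# The moderating variable is built by multiplying the two variables
# case by case, giving one product score per participant.
moderator = [a * t for a, t in zip(ages, attitudes)]
print(moderator)  # [75.0, 180.0, 110.0]
```

In an experiment, this product column would then be entered into the analysis alongside the two original variables to estimate their joint impact.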
Two other types of variables are control variables and confounding variables.
o Control variables play an active role in quantitative studies. These are a special
type of independent variable that researchers measure because they potentially
influence the dependent variable. Researchers use statistical procedures (e.g.,
analysis of covariance) to control for these variables.
o A confounding (or spurious) variable is not actually measured or observed in a
study. It exists, but its influence cannot be directly detected. Researchers comment
on the influence of confounding variables after the study has been completed,
because these variables may have operated to explain the relationship between the
independent variable and the dependent variable, but they were not, or could not be,
easily assessed (e.g., discriminatory attitudes).
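One simple way to see what "controlling" for a variable means is stratification: comparing treatment and control outcomes within each level of the control variable, so that its influence is held constant. The records below are hypothetical, and this is only a minimal sketch of the idea; real studies would use procedures such as analysis of covariance:

```python
from statistics import mean

# Hypothetical records: (group, level of the control variable, outcome).
records = [
    ("experimental", "low",  14), ("control", "low",  10),
    ("experimental", "low",  16), ("control", "low",  11),
    ("experimental", "high", 22), ("control", "high", 19),
    ("experimental", "high", 24), ("control", "high", 18),
]

# Compare group means within each level of the control variable, so that
# differences in the control variable cannot explain the group difference.
for level in ("low", "high"):
    exp = mean(o for g, lvl, o in records if g == "experimental" and lvl == level)
    ctl = mean(o for g, lvl, o in records if g == "control" and lvl == level)
    print(level, exp - ctl)
```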
In general, variables in business research can be classified into:
• Categorical or Qualitative Variable
• Quantitative Variable
• Continuous Variable
• Discrete Variable
Other Classification of a Variable
• Treatment/Experimental/Independent Variable
• Dependent Variable
• Control/Intervening/Extraneous Variable
The following guidelines help in writing good quantitative research questions and hypotheses:
1. One of three basic approaches should be followed: (a) comparing the
variables (the impact of an independent variable on a dependent variable), (b) relating the
variables, or (c) describing the variables.
2. Specify the hypothesis.
3. Use either a research question or a hypothesis, not both, to eliminate redundancy.
4. A hypothesis can be stated in either null or alternative form.
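A null hypothesis test can be sketched as follows, using hypothetical productivity scores for two age groups that echo the earlier example question ("Is there a relationship between employees' age and their productivity?"); the data are invented, and the quoted critical value assumes a two-tailed test at alpha = 0.05 with 14 degrees of freedom:

```python
import statistics as st

def pooled_t(a, b):
    """Two-sample pooled t statistic (equal-variance assumption)."""
    na, nb = len(a), len(b)
    sp2 = ((na - 1) * st.variance(a) + (nb - 1) * st.variance(b)) / (na + nb - 2)
    return (st.mean(a) - st.mean(b)) / (sp2 * (1 / na + 1 / nb)) ** 0.5

# Hypothetical productivity scores for younger and older employees.
younger = [52, 55, 48, 61, 50, 57, 53, 49]
older   = [58, 62, 60, 65, 59, 63, 61, 57]

t = pooled_t(younger, older)
# H0 (null): the group means are equal; H1 (alternative): they differ.
# The two-tailed critical value for df = 14 at alpha = 0.05 is about 2.145.
if abs(t) > 2.145:
    print("Reject H0: mean productivity differs between the age groups")
else:
    print("Fail to reject H0: no significant difference detected")
```

Note how the null form (no difference) is what is actually tested, while the alternative form states the relationship the researcher expects.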
Research objective and Research question
Research Objective
It is about what you intend to do.
You should state the General objective and Specific objective in your proposal.
The General objective
It provides a short statement of the scientific goal being pursued by the research.
It is the ultimate goal to be achieved in the research outcome.
It is not detailed.
The Specific objective
They are operational in nature.
They are the means of achieving the general objective and discuss the activities in detail.
They refer to the objectives against which the success of the research will be judged.
It is important to distinguish the objectives from the means of achieving them.
Research question
Some research objectives need research questions or research hypotheses.
- Research questions are to be answered
- Research hypotheses are to be proved or disproved
A research question simply means putting a research objective in question form.
The clearer the research questions are, the more convincing the research project will be.
Types of literature review:
• In a thematic review of the literature, the researcher identifies a theme and briefly
cites literature to document this theme. In this approach, the author discusses only the
major ideas or results from studies rather than the details of any single study.
• In a study-by-study review, the literature review provides a detailed summary of each study,
grouped under a broad theme.
Theoretical and Conceptual Frameworks
Definition of a Theory
For example, Kerlinger's (1979) definition of a theory is still valid today. He said, a theory
is "a set of interrelated constructs (variables), definitions, and propositions that presents a
systematic view of phenomena by specifying relations among variables, with the purpose
of explaining natural phenomena" (p. 64).
In this definition, a theory is an interrelated set of constructs (or variables) formed into
propositions, or hypotheses, that specify the relationship among variables (typically in
terms of magnitude or direction).
A theory might appear in a research study as an argument, a discussion, or a rationale, and
it helps to explain (or predict) phenomena that occur in the world.
Labovitz and Hagedorn (1971) add to this definition the idea of a theoretical rationale,
which they define as "specifying how and why the variables and relational statements are
interrelated.” Why would an independent variable, X, influence or affect a dependent
variable, Y?
The terms “literature review,” “conceptual framework,” and “theoretical framework” are often used
interchangeably by researchers to explain one another and as steps in the research process. Though
interrelated, several authors (e.g., Rocco & Plakhotnik, 2009) agree that the three terms are
distinct. In any scientific research, theoretical and conceptual frameworks share five functions:
(1) To build a foundation
(2) To demonstrate how a study advances knowledge
(3) To conceptualize the study
(4) To assess research design and instrumentation
(5) To provide a reference point for interpretation of findings (Merriam & Simpson, 2000).
• All five functions are not necessarily fulfilled by the review or framework in each
manuscript, but often they are.
Theory:
• Theories are constructed in order to explain, predict, and master phenomena (e.g.,
relationships, events, behavior).
• A theory generalizes about observations and consists of an inter-related, coherent set of
ideas.
Theoretical Framework:
• No inquirer can investigate a problem from all perspectives simultaneously. Then what?
• A theoretical framework establishes a vantage point, a perspective, a set of lenses through
which the researcher views the problem.
• In this sense, the selection of a theoretical framework is both a clarifying and exclusionary
step in the research process.
• In simple terms, a theoretical framework involves the presentation of a specific theory
(e.g., self-efficacy theory) and empirical and conceptual work about that theory.
• This then serves as a basis for conducting the research.
• Merriam (2001) describes the theoretical framework as “the structure, the scaffolding, and
the frame of your study” (p. 45).
Formulating a theoretical framework: How to form?
(a) Specify the theory used as a basis for your study (and mention the proponents of the
theory).
(b) Cite the main points emphasized in the theory.
(c) Support the exposition of the theory by ideas from other experts (empirical and conceptual
work).
(d) Illustrate the theoretical framework by means of a diagram.
Conceptual Framework
• After formulating a theoretical framework, the researcher may develop the conceptual
framework of the study.
• A concept is an image or symbolic representation of an abstract idea.
• While the theoretical framework is the theory on which the study is based, the conceptual
framework is the operationalization of the theory.
• The conceptual framework is the researcher’s own position on the problem and gives direction
to the study.
• In other words, a conceptual framework is the researcher’s own idea on how the problem
shall be explored.
• The conceptual framework may be an adaptation of a model used in a previous study with
modification to suit the inquiry.
• Aside from showing the direction of the study through the conceptual framework, the
researcher can show the relationships of the different constructs that he/she wants to
investigate. (you can use diagrams)
NB. Once the conceptual framework has been determined, the next step for the researcher is to
determine what research methods to employ to best answer the research question through the
proposed framework.
Differences between Theoretical and Conceptual Frameworks
Scope
• The conceptual framework is founded on the theoretical framework, which lies at a much
broader scale of resolution.
Generality-Specificity
• The theoretical framework provides a general representation of relationships between
things in a given phenomenon.
• The conceptual framework, on the other hand, embodies the specific direction by
which the research will have to be undertaken.
• That is, the conceptual framework describes the relationships between the specific
variables identified in the study.
The variables involved, the sample of participants, the research settings, the data collection methods,
and the data analysis methods are factors that contribute to the selection of the appropriate research
design. Thus, a research design is the structure, or the blueprint, of research that guides the process
of research from the formulation of the research questions and hypotheses to reporting the research
findings. In designing any research study, the researcher should be familiar with the basic steps of
the research process that guide all types of research designs. In addition, the researcher should be
familiar with a wide range of research designs in order to choose the most appropriate design to
answer the research questions and hypotheses of interest.
Dear students, you can identify many different kinds of research designs. However, for the sake of
simplicity, research designs can be classified into one of three broad categories based on the nature
of the research, the purpose of the research, the research questions, sample selection, data
collection methods, and data analysis techniques:
(1) Quantitative research designs,
(2) Qualitative research designs, and
(3) Mixed-research designs.
3.4.1. Quantitative Research Designs
Quantitative research is a deductive theory-based research process that focuses primarily on testing
theories and specific research hypotheses that consider finding differences and relationships using
numeric data and statistical methods to make specific conclusions about the phenomena.
Quantitative research designs can be classified into one of four broad research design categories
based on the strength of the research design’s experimental control:
(1) True experimental research designs,
(2) Quasi-experimental research designs,
(3) Pre-experimental research designs, and
(4) Non-experimental research designs.
Although each of the categories of research design is important and can provide useful research
findings, they differ in the nature of the evidence they provide in establishing causal relations
between variables and drawing causal inferences from the research findings. Experimental designs
are the most rigorous, powerful, and the strongest of the design categories to establish a cause–
effect relationship. Non-experimental designs are the weakest in terms of establishing a cause–
effect relationship between variables because of the lack of control over the variables, conditions,
and settings of the study.
1. True Experimental Research Designs
The true experiment is a type of research design where:
The researcher deliberately manipulates one or more independent variables (also called
experimental variable or treatment conditions),
Randomly assigns individuals or objects to the experimental conditions (e.g., experimental
or control groups) and controls other environmental and extraneous variables, and
Measures the effect of the independent variable on one or more dependent variables
(experimental outcome).
The experimental group is the group that receives the treatment, and the control group is the group
that receives no treatment or sometimes a placebo (alternative treatment that has nothing to do with
the experimental treatment). Thus, in a typical experimental study, the researcher randomly selects
the participants and randomly assigns them to the experimental conditions (e.g., experimental and
control), controls the extraneous variables that might have an effect on the outcome (dependent)
variable, and measures the effect of the experimental treatment on the outcome at the conclusion of
the experimental study.
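The random-assignment step described above can be sketched as follows; the participant labels and the seed are hypothetical, the seed being fixed only so the sketch is reproducible:

```python
import random

# Hypothetical pool of participants.
participants = ["P01", "P02", "P03", "P04", "P05", "P06", "P07", "P08"]

rng = random.Random(42)  # fixed seed only so the sketch is reproducible
shuffled = participants[:]
rng.shuffle(shuffled)    # random order removes systematic selection bias

# Split the shuffled pool evenly into the two experimental conditions.
experimental = shuffled[: len(shuffled) // 2]
control = shuffled[len(shuffled) // 2 :]
print("experimental:", experimental)
print("control:", control)
```

Because assignment depends only on the random shuffle, any pre-existing difference between the two groups is due to chance rather than to a systematic factor, which is what licenses the causal inference.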
It is important to emphasize that the experimental research design, if well conducted, is the most
conclusive and powerful of all the research designs and the only research design that tests research
questions and hypotheses to establish cause–effect relationships. For this reason, it is sometimes
called the ‘‘Golden Design.’’ The simple randomized experimental designs with two groups can be
conducted using one of the following four basic experimental designs:
1.1. Randomized Two-Group Posttest-Only Designs
1.2. Randomized Two-Group Pretest–Posttest Designs
1.3. Solomon Four-Group Designs
1.4. Experimental Factorial Designs
2. Quasi-Experimental Research Designs
Quasi-experimental research is used in situations where it is not feasible or practical to use a true
experimental design because the individual subjects are already in intact groups (e.g.,
organizations, departments, classrooms, schools, institutions). In these situations, it is often
impossible to randomly assign individual subjects to experimental and control groups. Thus, quasi-
experimental designs are similar to experimental designs in terms of one or more independent
(experimental) variables being manipulated, except for the lack of random assignment of
individual subjects to the experimental conditions (i.e., experimental and control groups). Instead,
the intact groups are assigned in a nonrandom fashion to the conditions. Types of quasi-
experimental designs include nonequivalent control group designs, longitudinal research designs,
and multi-level research designs.
2.1. Non-equivalent Control Group Design
The nonequivalent control group design involves the assignment of intact nonequivalent groups
(e.g., classrooms, schools, departments, and organizations) to experimental conditions
(experimental and control groups). Thus, the intact groups are assigned to the treatment conditions and not the
individual subjects, as was the case in the true experimental designs. For example, in a study of the
effects of a new curriculum of students’ knowledge of science and attitudes toward science, some
classrooms would be assigned to receive the new curriculum and others would not. Toward the end
of the school year, all students are measured on their science knowledge and attitudes toward
science. Because the effects are being measured at the level of the individual student, but the
students themselves were not randomly assigned to the control and treatment condition, this is a
quasi- experiment, not a true experiment.
2.2. Longitudinal Research Designs
Longitudinal, repeated-measures, or time-series research designs involve repeated measurement or
observation on the same individuals at several points over a period. It is an elaboration of the one-
group pretest–posttest design and focuses primarily on change, growth, and developmental types
of research questions across many different disciplines such as medicine, public health, business,
and social and behavioral sciences. Longitudinal designs, if well designed and conducted, are
usually more complex, time consuming, and expensive than the other types of research designs.
2.3. Multi-Level Research Designs
Multi-level or hierarchical research designs involve the nesting of individuals (micro-level units)
within organizations (macro-level units) and having explanatory independent variables
characterizing and describing both levels. For example, in a two-level design, the emphasis is on
how to model the effects of explanatory variables (predictors) at one level on the relationships
occurring at another level. These multilevel and hierarchical structured data present analytical
challenges that cannot be handled by traditional linear regression methods because there is a
regression model for each level of the hierarchy. Thus, hierarchical models explicitly model the
micro and macro levels in the hierarchy by taking into consideration the interdependence of
individuals within the groups.
The most common statistical methods used for prediction purposes are simple and multiple
regression analyses. The significance of correlation research stems from the fact that many
complex and sophisticated statistical analyses are based on correlation data. For example, logistic
regression analysis and discriminant function analysis are quite similar to simple and multiple
regression analyses with the exception that the dependent (criterion) variable is categorical and not
continuous as in simple and multiple regression analyses. Canonical analysis is another statistical
method that examines the relationship between a set of predictor (independent) variables and a set
of criterion (dependent) variables. Path analysis and structural equation modeling are other
complex statistical methods that are based on correlation data to examine the relationships among
more than two variables and constructs.
4.3. Causal-Comparative Research
Causal-comparative or ex post facto research is a type of descriptive non-experimental research
because it describes the state of existing differences among groups of individuals or objects as they
existed at a given time and place and attempts to determine the possible causes or reasons for the
existing differences. Thus, the basic causal-comparative approach starts with selecting two or more
groups with existing differences and comparing them on an outcome (dependent) variable. In
addition, it attempts to examine and explain the possible causes of the existing differences between
the groups.
3.5. Sampling and Sample Size Determination
lower costs, economy of time, and better organization of the work.
However, there is an important problem to deal with, namely sampling error: because a sample is a model of reality (like a map, a doll, or an MP3) and not the reality itself, the sampling error measures this inevitable distance of the model from reality. Obviously, the smaller it is, the closer the estimates are to reality. Unfortunately, in some cases, the sampling error is unknowable.
Population vs. Sample
All the items under consideration in any field of inquiry constitute a ‘universe’ or ‘population’. A
complete enumeration of all the items in the ‘population’ is known as a census inquiry. It can be
presumed that in such an inquiry when all the items are covered, no element of chance is left and
highest accuracy is obtained. Nevertheless, in practice this may not be true. Even the slightest
element of bias in such an inquiry will get larger and larger as the number of observations
increases. Moreover, there is no way of checking the element of bias or its extent except through a
resurvey or use of sample checks. Besides, this type of inquiry involves a great deal of time,
money, and energy. Census inquiry is not possible in practice under many circumstances. For
instance, blood testing is done only on a sample basis. Hence, quite often we select only a few items
from the universe for our study purposes. The items so selected constitute what is technically
called a sample.
The sample size of a survey most typically refers to the number of units that were chosen from
which data were gathered. However, sample size can be defined in various ways. There is the
designated sample size, which is the number of sample units selected for contact or data collection.
There is also the final sample size, which is the number of completed interviews or units for which
data are actually collected. The final sample size may be much smaller than the designated sample
size if there is considerable non-response, ineligibility, or both. Not all the units in the designated
sample may need to be processed if productivity in completing interviews is much higher than
anticipated to achieve the final sample size. However, this assumes that units have been activated
from the designated sample in a random fashion. Survey researchers may also be interested in the
sample size for subgroups of the full sample.
Sampling Fraction
A sampling fraction, denoted by f, is the proportion of a universe that is selected for a sample. The
sampling fraction is important for survey estimation because in sampling without replacement, the
sample variance is reduced by a factor of (1−f), called the finite population correction or
adjustment.
In a simple survey design, if a sample of n is selected with equal probability from a universe of N, then the sampling fraction is defined as f = n/N. In this case, the sampling fraction is equal to the probability of selection. In the case of systematic sampling, f = 1/I, where I is the sampling interval.
The sampling fraction can also be computed for stratified and multi-stage samples. In a stratified
(single-stage) sample, the sampling fraction fh is computed separately for each of the h strata. For a stratified sample, fh = nh/Nh, where nh is the sample size for stratum h and Nh is the number of units
(in the universe of N) that belong to stratum h. Because many samples use stratification to
facilitate oversampling, the probabilities of selection may differ among strata, in which case the fh values will not be equal. For multi-stage samples, the sampling fraction can be computed at each
stage, assuming sampling is with equal probability within the stage.
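The sampling-fraction definitions above can be sketched numerically. A minimal example, with invented universe, sample, and stratum sizes:

```python
# Sampling fraction f = n/N and finite population correction (1 - f)
# for a hypothetical simple design with N = 5,000 and n = 250.
N, n = 5000, 250
f = n / N                      # sampling fraction
fpc = 1 - f                    # finite population correction factor

# Stratified (single-stage) sample: fh = nh/Nh computed per stratum.
# Stratum names and (nh, Nh) pairs are invented for illustration.
strata = {"h1": (40, 800), "h2": (60, 1200), "h3": (150, 3000)}
f_h = {h: nh / Nh for h, (nh, Nh) in strata.items()}
print(f, fpc, f_h)
```

Here every stratum happens to use the same fraction; with oversampling, the fh values would differ by design.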
Sampling Frame
A survey may be a census of the universe (the study population) or may be conducted with a
sample that represents the universe. Either a census or a sample survey requires a sampling frame.
For a census, the frame will consist of a list of all the known units in the universe, and each unit
will need to be surveyed. For a sample survey, the sample frame represents a list of the target
population from which the sample is selected. Major categories of sampling frames are area frames
for in-person interviews, random-digit dialing (RDD) frames for telephone survey samples, and a
variety of lists used for all types of surveys. Few lists that are used as sampling frames were
created specifically for that use. Exceptions are commercially available RDD frames.
Sampling interval
The length of the string of consecutive integers is commonly referred to as the sampling interval. If the size of the population or universe is N and n is the size of the sample, then the smallest integer that is at least as large as the ratio N/n is called the sampling interval (often denoted by k). Used in conjunction with systematic sampling, the sampling interval partitions the universe into n zones, or strata, each consisting of k units. In general, systematic sampling is operationalized by selecting a random start between 1 and the sampling interval. This random start, r, and every subsequent kth integer would then be included in the sample (i.e., r, r+k, r+2k, etc.), creating k possible cluster samples, each containing n population units. The probability of selecting any one population unit, and consequently the probability of selecting any one of the k cluster samples, is 1/k. The sampling
interval and its role in the systematic sample selection process are illustrated in Figure 1.
For example, suppose that 100 households are to be selected for interviews within a neighborhood containing 1,000 households (labeled 1, 2, 3, ..., 1,000 for reference). Then the sampling interval, k = 1,000/100 = 10, partitions the population of 1,000 households into 100 strata, each having k = 10 households. The random start 1 would then refer to the cluster sample of households {1, 11, 21, 31, 41, ..., 971, 981, 991} under systematic random sampling.
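The selection procedure just described can be sketched in a few lines, assuming N is evenly divisible by n so that k = N/n is an exact integer:

```python
import random

def systematic_sample(N, n, seed=None):
    """Systematic sample of n units from a universe labeled 1..N,
    assuming N is evenly divisible by n so the interval k = N/n is exact."""
    k = N // n                                # sampling interval
    rng = random.Random(seed)
    r = rng.randint(1, k)                     # random start between 1 and k
    return [r + i * k for i in range(n)]      # r, r+k, r+2k, ...

# The household example from the text: N = 1,000, n = 100, so k = 10.
households = systematic_sample(1000, 100, seed=1)
print(households[:5])
```

Each of the k possible random starts yields one of the k cluster samples, so each has probability 1/k of selection.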
There are two main families of sampling methods: probability (random) sampling and non-probability sampling, typical of (but not exclusive to) quantitative and qualitative research respectively. Probability samples are considered representative of reality: What can be said about the sample can be extended, by statistical inference, to the reality of what is sampled. Another advantage is that the sampling error, which is a crucial datum for assessing the validity of the sample, is calculable: This is possible only for probability samples. The main problem, however, is that researchers need the complete list of the target population (i.e., the sample frame) to extract the sample, though sometimes the exact number of the population is sufficient, and often this is impossible to obtain (e.g., when a researcher wants to study the audience of a movie). With
probability samples, each element has a known probability of being included in the sample but the
non-probability samples do not allow the researcher to determine this probability. Probability
samples are those based on simple random sampling, systematic sampling, stratified sampling,
cluster/area sampling whereas non-probability samples are those based on convenience sampling,
judgment sampling, and quota sampling techniques.
In most cases, the size of a probability sample is determined by the following formula:

n = z²pqN / (E²(N − 1) + z²pq)

Where:
z: refers to the confidence level (cl) of the estimate (often fixed at 1.96, corresponding to a 95% cl),
pq: is the variance (which is unknown and therefore fixed at its maximum value, 0.25),
N: is the size of the population,
E: is the sampling error (often ≤ 0.04), and
n: is the required sample size.
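The formula can be turned into a small calculator. The population size in the example call is hypothetical; the defaults follow the values given above:

```python
import math

def sample_size(N, z=1.96, pq=0.25, E=0.04):
    """n = z^2 * pq * N / (E^2 * (N - 1) + z^2 * pq), rounded up."""
    n = (z**2 * pq * N) / (E**2 * (N - 1) + z**2 * pq)
    return math.ceil(n)

# For a hypothetical population of 10,000 at the default values:
print(sample_size(10000))   # → 567
```

Note that as N grows very large, the required n levels off near z²pq/E², so sample size depends only weakly on population size for big universes.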
On the other hand, non-probability samples are generally purposive or theory driven. This means
they are gathered following a criterion the researcher believes to be satisfying to obtain typological
representativeness. The latter is achieved when the researcher has sufficient members of all the
main categories of interest to be able to describe with confidence their patterned similarities and
differences.
common types of non-probability sampling techniques. The main problem with non-probability
samples is that the researcher has only loose criteria for assessing their validity: The sampling error
is unknowable, so the researchers cannot say whether the results are representative or not, and the
risk of non-sampling errors is large.
Area sampling is quite close to cluster sampling and is often used when the total geographical area of interest happens to be a big one. Under area sampling, we first divide the total area into a number of smaller non-overlapping areas, generally called geographical clusters; then a number of these smaller areas are randomly selected, and all units in these small areas are included in the sample. Area sampling is especially helpful where we do not have a list of the population concerned. It also makes field interviewing more efficient, since the interviewer can do many interviews at each location.
6. Multi-stage sampling: This is a further development of the idea of cluster sampling. This
technique is meant for big inquiries extending to a considerably large geographical area like an
entire country. Under multi-stage sampling, the first stage may be to select large primary sampling
units such as states, then districts, then towns and finally certain families within towns. If the
technique of random-sampling is applied at all stages, the sampling procedure is described as
multi-stage random sampling.
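The stages named above (states, then districts, then towns) can be sketched as successive random selections from a nested frame; the frame and unit names below are invented for illustration:

```python
import random

# A hypothetical three-stage frame: states -> districts -> towns.
frame = {
    "StateA": {"D1": ["T1", "T2", "T3"], "D2": ["T4", "T5"]},
    "StateB": {"D3": ["T6", "T7"], "D4": ["T8", "T9", "T10"]},
}

rng = random.Random(42)
state = rng.choice(list(frame))                 # stage 1: primary sampling unit
district = rng.choice(list(frame[state]))       # stage 2: district within the state
towns = rng.sample(frame[state][district], 2)   # stage 3: towns within the district
print(state, district, towns)
```

Because random selection is applied at every stage, this is multi-stage random sampling; a real study would select several units at each stage and then sample families within the chosen towns.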
7. Sequential sampling: This is a somewhat complex sample design in which the ultimate size of the sample is not fixed in advance but is determined according to mathematical decisions based on information yielded as the survey progresses. This design is usually adopted under an acceptance sampling plan in the context of statistical quality control. In practice, several of the methods of
sampling described above may well be used in the same study in which case it can be called mixed
sampling. It may be pointed out here that normally one should resort to random sampling so that
bias can be eliminated and sampling error can be estimated. However, purposive sampling is
considered desirable when the universe happens to be small and a known characteristic of it is to
be studied intensively. In addition, there are conditions under which sample designs other than
random sampling may be considered better for reasons like convenience and low costs. The
sample design to be used must be decided by the researcher taking into consideration the nature of
the inquiry and other related factors.
3.6. Methods and Techniques of Data Collection
At the end of this lesson, students will be able to:
Identify the difference between primary and secondary data,
Classify and label primary and secondary data collection mechanisms,
Define, explain and discuss the advantages and disadvantages of personal interviewing,
telephone interviewing, self-completion questionnaires and diaries as methods of data
collection, and
Explain the problems that can occur in data collection and research ethics.
Data may broadly be divided into two categories, namely primary data and secondary
data.
The primary data:
o Are those that are collected for the first time by the organization that is using them.
o May be collected by observation, oral investigation, and questionnaire method
or by telephone interviews.
o Questionnaires may be used for data collection by interviewers.
o They may also be mailed to prospective respondents.
o The drafting of a good questionnaire requires utmost skill.
o The process of interviewing also requires a great deal of tact, patience, and
competence to establish rapport with the respondent.
The secondary data:
o Are those that have already been collected by some other agency but can also be
used by the organization under consideration.
o Secondary data are available in various published and unpublished documents.
o The suitability, reliability, adequacy and accuracy of the secondary data should,
however, be ensured before they are used for research problems.
6.2. Methods of Collecting Primary Data
The primary data are those that are collected afresh and for the first time, and thus happen
to be original in character.
Such data are published by authorities who themselves are responsible for their collection.
The collection of primary data for business research is of paramount importance to assist
management in making decisions.
Generally, information regarding a large number of characteristics is necessary to analyze
any problem pertaining to management.
For instance, a study relating to employment in rural areas requires data on income,
wages, types of crops and land holdings. The collection of primary data thus requires a
great deal of deliberation and expertise.
Several methods of collecting primary data are available.
i. Interviewing
In social research, there are many types of interview.
The most common of these are unstructured, semi-structured, and structured
interviews.
a. Unstructured interviews
Unstructured or in-depth interviews are sometimes called life history interviews.
Used when the researcher attempts to achieve a holistic understanding of the interviewees’
point of view or situation
The participant is free to talk about what he or she deems important, with little directional
influence from the researcher.
This type of interview can only be used for qualitative research.
b. Semi-structured interviews
This is perhaps the most common type of interview used in qualitative social research.
Used when the researcher wants to know specific information, which can be compared and
contrasted, with information gained in other interviews.
c. Telephone Interview
This method of collecting information consists of contacting respondents by telephone.
This method is inexpensive but limited in scope, as respondents must possess a telephone.
The telephone interview method is used in industrial surveys especially in developed
regions.
d. In-Depth Interviewing
Is a technique designed to elicit a vivid picture of the participant’s perspective on the research
topic. Researchers engage with participants by posing questions in a neutral manner, listening
attentively to participants’ responses, and asking follow-up questions and probes based on those
responses. In-depth interviews are usually conducted face-to-face and involve one interviewer and
one participant.
ii. Observation
The researcher collects the requisite information personally through observation.
For example, in order to study the conditions of students residing in a university,
the investigator meets the students in their hostels
Gold (1958, cited in Lodico, Spaulding and Voegtle, 2006:117) identified the following
classification of observation based on the degree of participation of the observer:
a. Complete participant:
This means that the researcher is a member of the group, and no one in the group is
aware of the fact that he or she is also an observer. While this might allow a true
"insider's" view, it raises ethical concerns because, in essence, the researcher is
deceiving the participants.
b. Participant as observer:
In this situation, the researcher is an active member of the group and actively
participates in the group's activities and interactions, but each member of the
group knows that the researcher is also serving a research role.
In essence, a collaborative relationship is developed between the observer and
the participants. Although this removes the ethical concerns presented by being
a complete participant, researchers may compromise the natural interaction of the
group.
c. Observer as participant:
Choosing to be an observer as participant removes the researcher a bit from
group membership.
Although researchers certainly still have a connection to the group, they will
not likely participate in the group’s activities.
d. Complete observer:
Here researchers might conduct their observations from behind a one-way
mirror or in a public setting.
They are not a member of the group and do not participate in the group’s
activities.
According to Goetz and LeCompte (1984, cited in Lodico, Spaulding, and Voegtle, 2006:118),
careful observation should include at least the following key features:
An explanation of the physical setting:
o This would include an overall physical description of the space.
A description of the participants in the setting.
o Careful explanation of the participants would include not only who is in the setting
but also why they might be there and a description of their roles.
o Besides, any relevant demographic information should be included.
Individual and group activities and group interactions:
o The researcher should observe the activities the participants are engaging in.
o Special note should be made of the particular activities that will help to answer the
foreshadowed questions.
Participant conversation and nonverbal communication.
o Because case data often include direct quotes, conversations should be observed in
such a way as to note not only what is being said but also how it is being said.
Researcher behavior.
o Because the researcher is part of the setting, careful attention must be paid to the
influence the observer has on the behavior of the participants.
o Does the researcher’s presence in any way influence what is occurring in the
setting?
Questionnaire
A popular and common method of collection of primary data is by personally interviewing
individuals, recording their answers in a structured questionnaire.
The complete enumeration of the Ethiopian decennial census is performed by this method.
The enumerators visit the dwellings of individuals and put questions to them that elicit the
relevant information about the subject of enquiry.
This information is recorded in the questionnaire.
Occasionally a part of the questionnaire is unstructured so that the interviewee can feel
free to share information about intimate matters with the interviewer.
As the field staff collects the data personally, it is also known as personal interview
method.
Much of the accuracy of the collected data, however, depends on the ability and tactfulness
of investigators, who should be subjected to special training as to how they should elicit the
correct information through friendly discussions.
Mailed Questionnaire Method
A set of questions relevant to the subject of enquiry is mailed to a selected list of
persons with a request to return them duly filled in.
In any study, two things might be true: (1) there is a difference (the experimental hypothesis), or
(2) there is no difference (the null hypothesis). Various statistical tests have been devised to permit
a decision between the experimental and null hypotheses based on the data. Decision-making
based on a statistical test is open to error, in that we can never be sure whether we have made the
correct decision. However, certain standard procedures are generally followed, and these are
discussed in this chapter. Finally, there are important issues relating to the validity of the findings
obtained from a study. One reason why the validity of the findings may be limited is that the study
itself was not carried out in a properly controlled and scientific fashion. Another reason why the
findings may be partially lacking in validity is that they cannot readily be applied to everyday life,
a state of affairs that occurs most often with laboratory studies. Issues relating to these two kinds
of validity are discussed towards the end of the chapter.
3.7.1. Qualitative Analysis of Data
Qualitative data take the form of words (spoken or written) and visual images (observed or
creatively produced). They are associated primarily with strategies of research such as
ethnography, phenomenology, and grounded theory, and with research methods such as interviews,
documents, and observation. Qualitative data, however, can be produced by other means. For
example, the use of open-ended questions as part of a survey questionnaire can produce answers in
the form of text – written words that can be treated as qualitative data. The kind of research
method used, then, does not provide the defining characteristic of qualitative data. It is the nature
of the data produced that is the crucial issue. It is important to bear this point in mind and to
recognize that qualitative data can be produced by a variety of research methods.
Those carrying out qualitative research sometimes make use of direct quotations from their
participants, arguing that such quotations are often very revealing. There has been rapid growth in
the use of qualitative methods since the mid-1980s. This is due in part to increased dissatisfaction
with the quantitative or scientific approach that has dominated social sciences for the past 100
years. Investigators who collect qualitative data use several different kinds of analysis. The
cardinal principle of qualitative analysis is that causal relationships and theoretical statements be
clearly emergent from and grounded in the phenomena studied. The theory emerges from the data;
it is not imposed on the data.
How do investigators use this principle? One important way is by considering fully the categories
spontaneously used by the participants before the investigators develop their own categories. A
researcher first gathers all the information obtained from the participants. This stage is not always
entirely straightforward. For example, if we simply transcribe tape recordings of what our
participants have said, we may be losing valuable information. The details about which words are
emphasized, where the speaker pauses, and when the speaker speeds up or slows down should also
be recorded, so that we can understand fully what he or she is trying to communicate. The
researcher then arranges the items of information (e.g. statements) into various groups in a
preliminary way. If a given item seems of relevance to several groups, then it is included in all of
them. Frequently, the next step is to take account of the categories or groupings suggested by the
participants themselves. The final step is for the researcher to form a set of categories based on the
information obtained from the previous steps.
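The preliminary grouping step described above, in which an item relevant to several groups is included in all of them, can be sketched as a simple many-to-many assignment; the statement labels and category names are hypothetical:

```python
# Each statement is assigned to every group it seems relevant to;
# labels and categories are invented for illustration.
relevance = {
    "s1": ["fatigue"],                # "The new schedule left me exhausted."
    "s2": ["management"],             # "My supervisor never explains decisions."
    "s3": ["fatigue", "management"],  # relevant to several groups -> in all of them
}

groups = {}
for item, categories in relevance.items():
    for category in categories:
        groups.setdefault(category, []).append(item)
print(groups)   # {'fatigue': ['s1', 's3'], 'management': ['s2', 's3']}
```

The resulting groups are only a preliminary arrangement; as the text notes, the final categories emerge after also taking account of the groupings suggested by the participants themselves.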
Qualitative analysis is often less influenced than is quantitative analysis by the biases and
theoretical assumptions of the investigator. In addition, it offers the prospect of understanding the
participants in a study as rounded individuals in a social context. This contrasts with quantitative
analysis, in which the focus is often on rather narrow aspects of behavior. The greatest limitation
of the qualitative approach is that the findings reported tend to be unreliable and hard to replicate.
Why is this so? The qualitative approach is subjective and impressionistic, and so the ways in
which the information is categorized and then interpreted often differ considerably from one
investigator to another.
There are various ways in which qualitative researchers try to show that their findings are reliable
(Coolican, 1994). Probably the most satisfactory approach is to see whether the findings obtained
from a qualitative analysis can be replicated. This can be done by comparing the findings from an
interview study with those from an observational study. Alternatively, two different qualitative
researchers can conduct independent analyses of the same qualitative data, and then compare their
findings. Qualitative researchers argue that the fact that they typically go through the “research
cycle” more than once helps to increase reliability. Thus, for example, the initial assumptions and
categories of the researcher are checked against the data, and may then be changed. After that, the
new assumptions and categories are checked against the data. Repeating the research cycle is of
value in some ways, but it does not ensure that the findings will have high reliability.
Interpretation of Interviews, Case Studies, and Observations
Qualitative analyses as discussed in the previous section are carried out in several different kinds
of studies. They are especially common in interviews, case studies, and observational studies,
although quantitative analyses have often been used in all three types of studies. What we will do
in this section is to consider the interpretation of interviews, case studies, and observations.
i. Interviews
In general, unstructured interviews lend themselves to qualitative analyses, whereas structured
interviews lend themselves to quantitative analysis. As Coolican (1994) pointed out, there are
various skills that interviewers need in order to obtain valuable data. These skills involve
establishing a good understanding with the person being interviewed, adopting a non-judgmental
approach, and developing effective listening skills. Unstructured interviews with many of those
involved indicated that in fact they had good reasons for their actions. They argued that they were
defending their area against the police, and they experienced strong feelings of solidarity and
community spirit. This interpretation was supported by the fact that very little of the damage
affected private homes in the area.
There are various problems involved in interpreting interview information, such as:
1. First, there is the problem of social desirability bias. Most people want to present
themselves in the best possible light, so they may provide socially desirable rather than
honest answers to personal questions. This problem can be handled by the interviewer
asking additional questions to establish the truth.
2. Second, the data obtained from an interview may reveal more about the social interaction
processes between the interviewer and the person being interviewed (the interviewee) than
about the interviewee's thought processes and attitudes.
3. Third, account needs to be taken of the self-fulfilling prophecy. This is the tendency for
someone’s expectations about another person to lead to the fulfillment of those
expectations. For example, suppose that a therapist expects his or her patient to behave
very anxiously. This expectation may cause the therapist to treat the patient in such a way
that the patient starts to behave in the expected fashion.
ii. Case studies
Case studies (intensive investigations of individuals) come in all shapes and sizes. We need to be
very careful when interpreting the evidence from a case study. The greatest danger is that very
general conclusions may be drawn based on a single atypical individual. For this reason, it is
important to have supporting evidence from other sources before drawing such conclusions. It is
often hard to interpret the evidence from case studies. How, then, should the findings from a case
study be interpreted? Probably the greatest value of a case study is that it can suggest hypotheses
that can then be tested under more controlled conditions with larger numbers of participants. In
other words, case studies usually provide suggestive rather than definitive evidence. In addition,
case studies can indicate that there are limitations in current theories.
iii. Observations
As discussed in chapter six, there are numerous kinds of observational studies, and the data
obtained may be either quantitative or qualitative. The general issue here is that it is often very
hard to interpret or make sense of the data obtained from observational studies, because we can
only speculate on the reasons why the participants are behaving in the ways that we observe.
Content Analysis
Content analysis is used when originally qualitative information is reduced to numerical terms.
Content analysis started as a method for analyzing messages in the media, including articles
published in newspapers, speeches made by politicians on radio and television, various forms of
propaganda, and health records. The first stage in content analysis is that of sampling, or deciding
what to select from what may be an enormous amount of material. The issue of sampling is an
important one. For example, television advertisers target their advertisements at particular sections
of the population, and so arrange for the advertisements to be shown when the relevant groups are
most likely to be watching television. As a result, advertisements for beer are more likely to be
shown during a football match than a program about fashion.
The other key ingredient in content analysis is the construction of the coding units into which the
information is to be categorized. In order to form appropriate coding units, the researcher needs to
have considerable knowledge of the kinds of material to be used in the content analysis. He or she
also needs to have one or more clear hypotheses, because the selection of coding units must be
such as to permit these hypotheses to be tested effectively. The coding can take many forms. The
categories used can be very specific (e.g. use of a given word) or general (e.g. theme of the
communication). Instead of using categories, the coders may be asked to provide ratings. Another
form of coding involves ranking items, or putting them in order. For example, the statements of
politicians could be ranked in terms of the extent to which they agreed with the facts.
One of the greatest strengths of content analysis is that it provides a way of extracting
information from a wealth of real-world settings. The media influence the ways we think
and feel about business issues, and so it is important to analyze media communications in
detail. Content analysis can reveal issues of concern.
The greatest limitation of content analysis is that it is often very hard to interpret the
findings. There are also problems of interpretation with other communications such as
personal diaries or essays. Diaries or essays may contain accurate accounts of what an
individual does, thinks, and feels. On the other hand, individuals may provide deliberately
distorted accounts in order to protect their self-esteem, to make it appear that their lives are
more exciting than is actually the case, and so on. Another problem is that the selection and
scoring of coding units can be rather subjective. The coding categories that are used need to
reflect accurately the content of the communication, and each of the categories must be
defined as precisely as possible.
Advantages of qualitative analysis
a. The data and the analysis are ‘grounded’: A particular strength associated with
qualitative research is that the descriptions and theories such research generates are
‘grounded in reality’. This is not to suggest that they depict reality in some simplistic sense,
as though social reality were ‘out there’ waiting to be ‘discovered’. However, it does
suggest that the data and the analysis have their roots in the conditions of social existence.
There is little scope for ‘armchair theorizing’ or ‘ideas plucked out of thin air’.
b. There is a richness and detail to the data. The in-depth study of relatively focused areas,
the tendency towards small-scale research, and the generation of ‘thick descriptions’ mean
that qualitative research scores well in terms of the way it deals with complex social
situations. It is better able to deal with the intricacies of a situation and do justice to the
subtleties of social life.
c. There is tolerance of ambiguity and contradictions. To the extent that social existence
involves uncertainty, accounts of that existence ought to be able to tolerate ambiguities and
contradictions, and qualitative research is better able to do this than quantitative research.
This is not a reflection of a weak analysis. It is a reflection of the social reality being
investigated.
d. There is the prospect of alternative explanations. Qualitative analysis, because it draws
on the interpretive skills of the researcher, opens up the possibility of more than one
explanation being valid. Rather than a presumption that there must be, in theory at least,
one correct explanation, it allows for the possibility that different researchers might reach
different conclusions, despite using broadly the same methods.
Disadvantages of qualitative analysis
a. The data might be less representative. The other side of qualitative research’s attention
to thick description and the grounded approach is that it becomes more difficult to establish
how far the findings from the detailed, in-depth study of a small number of instances may
be generalized to other similar instances. Provided sufficient detail is given about the
circumstances of the research, however, it is still possible to gauge how far the findings
relate to other instances, but such generalizability is still more open to doubt than it is with
well-conducted quantitative research.
b. Interpretation is bound up with the ‘self’ of the researcher. Qualitative research
recognizes more openly than does quantitative research that the researcher’s own identity,
background and beliefs have a role in the creation of data and the analysis of data. The
research is ‘self-aware’. This means that the findings are necessarily more cautious and
tentative, because qualitative research operates on the basic assumption that the findings are a creation of the
researcher rather than a discovery of fact. Although it may be argued that quantitative
research is guilty of trying to gloss over the point – which equally well applies – the greater
exposure of the intrusion of the ‘self’ in qualitative research inevitably means more
cautious approaches to the findings.
c. There is a possibility of de-contextualizing the meaning. In the process of coding and
categorizing the field notes, texts or transcripts there is a possibility that the words (or
images for that matter) are taken literally out of context. The context is an integral part of
the qualitative data, and the context refers to both events surrounding the production of the
data, and events and words that precede and follow the actual extracted pieces of data that
are used to form the units for analysis. There is a very real danger for the researcher that in
coding and categorizing of the data the meaning of the data is lost or transformed by
wrenching it from its location (a) within a sequence of data (e.g. interview talk), or (b)
within surrounding circumstances which have a bearing on the meaning of the unit as it
was originally conceived at the time of data collection.
d. There is the danger of oversimplifying the explanation. In the quest to identify themes
in the data and to develop generalizations the researcher can feel pressured to underplay,
possibly disregard data that ‘doesn’t fit’. Inconsistencies, ambiguities, and alternative
explanations can be frustrating in the way they inhibit a nice clear generalization – but they
are an inherent feature of social life. Social phenomena are complex, and the analysis of
qualitative data needs to acknowledge this and avoid attempts to oversimplify matters.
e. The analysis takes longer. The volume of data that a researcher collects will depend on
the time and resources available for the research project. When it comes to the analysis of
that data, however, it is almost guaranteed that it will seem like a daunting task. This is true
for quantitative research as much as it is for qualitative research. But, as Bryman and
Burgess (1994: 216) point out, when it comes to the analysis of seemingly vast amounts of
quantitative data ‘the availability of standard statistical procedures and computer programs
for handling them is generally perceived as rendering such data non-problematic’. The
analysis of the data can be achieved relatively quickly and the decision-making behind the
analysis can be explained succinctly. When researchers use qualitative data, however, the
situation is rather different. Primarily, this is because qualitative data are generally
unstructured when they are first collected in their ‘raw’ state (e.g. interviews, field notes,
photographs). Computer programs can assist with the management of this data and they can
even help with its analysis, but nowhere near to the extent that they can with quantitative
data. In the case of qualitative data, the techniques used for the analysis are more time-
consuming and the decisions made by the researcher are less easily described to the reader
of the research. The result is that qualitative data takes considerably longer to analyze.
3.7.2. Quantitative data analysis
Quantitative data take the form of numbers. They are associated primarily with strategies of
research such as surveys and experiments, and with research methods such as questionnaires and
observation. These are not, however, the only sources of quantitative data. For example, the use of
content analysis with texts (such as interview transcripts) can also produce numerical data. The
kind of research method used, then, is not the crucial thing when it comes to defining quantitative
data. It is the nature of the data that the method produces that is the key issue. It is important to
bear this point in mind and to recognize that quantitative data can be produced by a variety of
research methods.
Descriptive statistics
Suppose that we have carried out an experiment on the effects of noise on learning with three
groups of nine participants each. One group was exposed to very loud noise, another group to
moderately loud noise, and the third group was not exposed to noise at all. What they had learned
from a book chapter was assessed by giving them a set of questions, producing a score between 0
and 20. What is to be done with the raw scores? There are two key types of measures that can be
taken whenever we have a set of scores from participants in a given condition.
First, there are measures of central tendency, which provide some indication of the size of
average or typical scores.
Second, there are measures of dispersion, which indicate the extent to which the scores
cluster around the average or are spread out.
i. Measures of central tendency
Measures of central tendency describe how the data cluster together around a central point. There
are three main measures of central tendency: the mean; the median; and the mode.
a. Mean
Mean: an average worked out by dividing the sum of all participants’ scores by the number of
participants.
Normal distribution: a bell-shaped distribution in which most scores cluster fairly close to the
mean.
Median: the middle score out of all participants’ scores in a given condition.
Mode: the most frequently occurring score among the participants in a given condition.
Example
Scores: 1, 2, 4, 5, 7, 9, 9, 9, 17 (nine scores)
Total = 63; Mean = 63/9 = 7
The mean in each group or condition is calculated by adding up all the scores in a given condition,
and then dividing by the number of participants in that condition. Suppose that the scores of the
nine participants in the no-noise condition are as follows: 1, 2, 4, 5, 7, 9, 9, 9, and 17. The mean is
given by the total, which is 63, divided by the number of participants, which is 9. Thus, the mean
is 7. The main advantage of the mean is the fact that it considers all the scores. This generally
makes it a sensitive measure of central tendency, especially if the scores resemble the normal
distribution, which is a bell-shaped distribution in which most scores cluster fairly close to the
mean. However, the mean can be very misleading if the distribution differs markedly from the
normal and there are one or two extreme scores in one direction.
Suppose that eight people complete one lap of a track in go-karts. For seven of them, the times
taken (in seconds) are as follows: 25, 28, 29, 29, 34, 36, and 42. The eighth person’s go-kart breaks
down, and so the driver has to push it around the track. This person takes 288 seconds to complete
the lap. This produces an overall mean of 64 seconds. This is clearly misleading, because no one
else took even close to 64 seconds to complete one lap.
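The two mean calculations above can be checked with a short sketch using Python's statistics module; the scores are those from the text's own examples:

```python
from statistics import mean

# No-noise condition scores from the worked example in the text
no_noise = [1, 2, 4, 5, 7, 9, 9, 9, 17]
print(mean(no_noise))  # 63 / 9 = 7

# The go-kart lap times: one extreme score (288 s) drags the mean upwards
lap_times = [25, 28, 29, 29, 34, 36, 42, 288]
print(mean(lap_times))  # 511 / 8 = 63.875, reported as roughly 64 in the text
```

This shows the mean's sensitivity to a single extreme score: the other seven drivers all finished in under 43 seconds, yet the mean is 64.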
b. Median
Another way of describing the general level of performance in each condition is known as the
median. If there is an odd number of scores, then the median is simply the middle score, having an
equal number of scores higher and lower than it does. In the example with nine scores in the no-
noise condition (1, 2, 4, 5, 7, 9, 9, 9, 17), the median is 7. Matters are slightly more complex if
there is an even number of scores. In that case, we work out the mean of the two central values.
For example, suppose that we have the following scores in size order: 2, 5, 5, 7, 8, 9. The two
central values are 5 and 7, and so the median is: (5+7)/2=6.
The main advantage of the median is that it is unaffected by a few extreme scores, because it
focuses only on scores in the middle of the distribution. It also has the advantage that it tends to be
easier than the mean to work out. The main limitation of the median is that it ignores most of the
scores, and so it is often less sensitive than the mean. In addition, it is not always representative of
the scores obtained, especially if there are only a few scores.
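The two median rules just described (odd and even numbers of scores) can be sketched the same way:

```python
from statistics import median

# Odd number of scores: the median is simply the middle score
no_noise = [1, 2, 4, 5, 7, 9, 9, 9, 17]
print(median(no_noise))  # 7

# Even number of scores: the mean of the two central values
even_set = [2, 5, 5, 7, 8, 9]
print(median(even_set))  # (5 + 7) / 2 = 6.0

# Unlike the mean, the median is barely moved by the 288-second lap
lap_times = [25, 28, 29, 29, 34, 36, 42, 288]
print(median(lap_times))  # (29 + 34) / 2 = 31.5
```

Comparing the lap-time median (31.5) with the lap-time mean (64) makes the median's resistance to extreme scores concrete.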
c. Mode
The final measure of central tendency is the mode. This is simply the most frequently occurring
score. In the above example of the nine scores in the no-noise condition, the mode is 9.
The main advantages of the mode are that it is unaffected by one or two extreme scores,
and that it is the easiest measure of central tendency to work out. In addition, it can still be
worked out even when some of the extreme scores are not known.
However, its limitations generally outweigh these advantages. The greatest limitation is
that the mode tends to be unreliable. For example, suppose we have the following scores: 4,
4, 6, 7, 8, 8, 12, 12, 12. The mode of these scores is 12. If just one score changed (a 12
becoming a 4), the mode would change to 4!
Another limitation is that information about the exact values of the scores obtained is
ignored in working out the mode. This makes it a less sensitive measure than the mean.
A final limitation is that it is possible for there to be more than one mode.
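A short sketch illustrates both the mode and the instability described above:

```python
from statistics import mode, multimode

scores = [4, 4, 6, 7, 8, 8, 12, 12, 12]
print(mode(scores))  # 12

# Changing a single score (one 12 becomes a 4) flips the mode to 4
changed = [4, 4, 4, 6, 7, 8, 8, 12, 12]
print(mode(changed))  # 4

# multimode reveals when there is more than one mode
print(multimode([1, 1, 2, 2, 3]))  # [1, 2]
```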
Levels of measurement
From what has been said so far, we have seen that the mean is the most generally useful measure
of central tendency, whereas the mode is the least useful. However, we need to take account of the
level of measurement when deciding which measure of central tendency to use. At the interval and
ratio levels of measurement, each added unit represents an equal increase. For example, someone
who hits a target four times out of ten has done twice as well as someone who hits it twice out of
ten. Below this is the ordinal level of measurement, in which we can only order, or rank, the scores
from highest to lowest. At the lowest level, there is the nominal level, in which the scores consist
of the numbers of participants falling into various categories.
The mean should only be used when the scores are at the interval level of
measurement.
The median can be used when the data are at the interval or ordinal level.
The mode can be used when the data are at any of the three levels. It is the only one of
the three measures of central tendency that can be used with nominal data.
ii. Measures of dispersion
The mean, median, and mode are all measures of central tendency. It is also useful to work out
what are known as measures of dispersion, such as the range, inter-quartile range, variation ratio,
and standard deviation. These measures indicate whether the scores in a given condition are similar
to each other or whether they are spread out.
KEY TERMS
Range: the difference between the highest and lowest score in any condition.
Inter-quartile range: the spread of the middle 50% of an ordered or ranked set of scores.
Variation ratio: a measure of dispersion based on the proportion of the scores that are not at
the modal value.
Standard deviation: a measure of dispersal that is of special relevance to the normal
distribution; it is the square root of the variance. It takes account of every score, and is a
sensitive dispersion measure.
Variance: a measure of dispersion that is the square of the standard deviation.
a. Range
The simplest of these measures is the range, which can be defined as the difference between the
highest and the lowest score in any condition. In the case of the no-noise group (1, 2, 4, 5, 7, 9, 9,
9, 17), the range is 17-1=16. In fact, it is preferable to calculate the range in a slightly different
way (Coolican, 1994). The revised formula (when we are dealing with whole numbers) is as
follows: (highest score - lowest score) +1. Thus, in our example, the range is (17-1) +1 =17. This
formula is preferable because it takes account of the fact that the scores we recorded were rounded
to whole numbers. In our sample data, a score of 17 stands for all values between 16.5 and 17.5,
and a score of 1 represents a value between 0.5 and 1.5. If we take the range as the interval
between the highest possible value (17.5) and the lowest possible value (0.5), this gives us a range
of 17, which is precisely the figure produced by the formula.
The main advantages of the range as a measure of dispersion are that it is easy to calculate
and that it takes full account of extreme values.
The main weakness of the range is that it can be greatly influenced by one score that is very
different from all of the others.
The other important weakness of the range is that it ignores all but two of the scores, and so
is likely to provide an inadequate measure of the general spread or dispersion of the scores
around the mean or median.
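Both versions of the range can be computed in a couple of lines of Python:

```python
# Simple range, and Coolican's (1994) adjusted range for whole-number scores
scores = [1, 2, 4, 5, 7, 9, 9, 9, 17]

simple_range = max(scores) - min(scores)
print(simple_range)  # 17 - 1 = 16

# Adjusted formula: (highest - lowest) + 1, allowing for rounding to whole numbers
adjusted_range = (max(scores) - min(scores)) + 1
print(adjusted_range)  # 17
```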
b. Inter-quartile range
The inter-quartile range is defined as the spread of the middle 50% of scores. For example,
suppose that we have the following set of scores: 4, 5, 6, 6, 7, 8, 8, 9, 11, 11, 14, 15, 17, 18, 18,
and 19.
Ordered scores:  4  5  6  6 | 7  8  8  9  11  11  14  15 | 17  18  18  19
                 Bottom 25%   Middle 50% (8 scores)        Top 25% (4 scores)
                 (4 scores)
Lower boundary = (6 + 7)/2 = 6.5
Upper boundary = (15 + 17)/2 = 16
Inter-quartile range = 16 - 6.5 = 9.5
There are 16 scores, which can be divided into the bottom 25% (4), the middle 50% (8), and the
top 25% (4). The middle 50% of scores start with 7 and run through to 15. The upper boundary of
the inter-quartile range lies between 15 and 17, and is given by the mean of these two values, i.e.
16. The lower boundary of the inter-quartile range lies between 6 and 7, and is their mean, i.e. 6.5.
The inter-quartile range is the difference between the upper and lower boundaries, i.e. 16 - 6.5 = 9.5.
The inter-quartile range has the advantage over the range that it is not influenced by a
single extreme score. As a result, it is more likely to provide an accurate reflection of the
spread or dispersion of the scores. It has the disadvantage that it ignores information from
the top and the bottom 25% of scores. For example, we could have two sets of scores with
the same inter-quartile range, but with more extreme scores in one set than in the other. The
difference in spread or dispersion between the two sets of scores would not be detected by
the inter-quartile range.
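The worked example can be reproduced with a small Python function. This is a sketch of the text's method only: it assumes, as in the example, that the number of scores divides evenly into quarters; other cases need the more general quartile formulas from a statistics text.

```python
def inter_quartile_range(scores):
    """Spread of the middle 50% of an ordered set of scores.
    Assumes the number of scores is divisible by 4, as in the
    worked example in the text."""
    s = sorted(scores)
    n = len(s)
    q = n // 4
    lower = (s[q - 1] + s[q]) / 2          # boundary below the middle 50%
    upper = (s[n - q - 1] + s[n - q]) / 2  # boundary above the middle 50%
    return upper - lower

data = [4, 5, 6, 6, 7, 8, 8, 9, 11, 11, 14, 15, 17, 18, 18, 19]
print(inter_quartile_range(data))  # 16 - 6.5 = 9.5
```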
c. Variation ratio
Another simple measure of dispersion is the variation ratio. This can be used when the mode is the
chosen measure of central tendency. The variation ratio is defined simply as the proportion of the
scores obtained that are not at the modal value (i.e. the value of the mode). In the no-noise
condition discussed earlier (scores of 1, 2, 4, 5, 7, 9, 9, 9, and 17), where the mode is 9, six of
the nine scores are not at the modal value, so the variation ratio is 6/9 = 0.67.
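A minimal sketch of the variation ratio, using the same no-noise scores:

```python
from statistics import mode

def variation_ratio(scores):
    # Proportion of scores that are not at the modal value
    m = mode(scores)
    return sum(1 for x in scores if x != m) / len(scores)

no_noise = [1, 2, 4, 5, 7, 9, 9, 9, 17]
print(round(variation_ratio(no_noise), 2))  # 6 of the 9 scores are not 9 -> 0.67
```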
Standard deviation: A worked example

Participant     X    Mean   X − Mean   (X − Mean)²
1              13     10        3          9
2               6     10       -4         16
3              10     10        0          0
4              15     10        5         25
5              10     10        0          0
6              15     10        5         25
7               5     10       -5         25
8               9     10       -1          1
9              10     10        0          0
10             13     10        3          9
11              6     10       -4         16
12             11     10        1          1
13              7     10       -3          9
Total      ΣX = 130                Σ(X − Mean)² = 136

N = 13
Mean = ΣX/N = 130/13 = 10
Variance = 136/12 = 11.33
Standard deviation = √11.33 = 3.37
The first step is to work out the mean of the sample. This is given by the total of all of the
participants’ scores (Σx=130; the symbol “Σ” means the sum of) divided by the number of
participants (N =13). Thus, the mean is 10.
The second step is to subtract the mean in turn from each score (X-M). The calculations are
shown in the fourth column.
The third step is to square each of the scores in the fourth column, giving (X − M)².
The fourth step is to work out the total of all the squared scores, Σ(X − M)². This comes to 136.
The fifth step is to divide the result of the fourth step by one less than the number of
participants, N-1= 12. This gives us 136 divided by 12, which equals 11.33. This is known as
the variance, which is in squared units. Finally, we use a calculator to take the square root of
the variance. This produces a figure of 3.37; this is the standard deviation.
The method for calculating the standard deviation that has just been described is used when we
want to estimate the standard deviation of the population. If we want merely to describe the spread
of scores in our sample, then the fifth step involves dividing the result of the fourth step by N.
What is the meaning of this figure for the standard deviation? We expect about two-thirds of the
scores in a sample to lie within one standard deviation of the mean.
In our example, the mean is 10.0, one standard deviation above the mean is 13.366, and one
standard deviation below the mean is 6.634. In fact, 61.5% of the scores lie between those two
limits, which is only slightly below the expected percentage. The standard deviation has special
relevance in relation to the so-called normal distribution. As was mentioned earlier, the normal
distribution is a bell-shaped curve in which there are as many scores above the mean as below it.
Most of the scores in a normal distribution cluster fairly close to the mean, and there are fewer and
fewer scores as you move away from the mean in either direction. In a normal distribution:
68.26% of the scores fall within one standard deviation of the mean,
95.44% fall within two standard deviations, and
99.73% fall within three standard deviations.
The standard deviation takes account of all of the scores and provides a sensitive measure of
dispersion. As we have seen, it also has the advantage that it describes the spread of scores in a
normal distribution with great precision. The most obvious disadvantage of the standard deviation
is that it is much harder to work out than the other measures of dispersion.
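The whole worked example can be verified with Python's statistics module, whose variance and stdev functions use the N − 1 divisor described in the fifth step above:

```python
from statistics import mean, variance, stdev

# The 13 scores from the worked example
scores = [13, 6, 10, 15, 10, 15, 5, 9, 10, 13, 6, 11, 7]

print(mean(scores))                # 130 / 13 = 10
print(round(variance(scores), 2))  # 136 / 12 = 11.33 (sample variance, N - 1 divisor)
print(round(stdev(scores), 2))     # sqrt(11.33) = 3.37

# Roughly two-thirds of scores should lie within one standard deviation of the mean
m, sd = mean(scores), stdev(scores)
within = sum(1 for x in scores if m - sd <= x <= m + sd)
print(round(100 * within / len(scores), 1))  # 61.5 (% of scores), as in the text
```

For a purely descriptive standard deviation of the sample itself (dividing by N rather than N − 1), the module's pstdev and pvariance functions would be used instead.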
3.7.3. Data Presentation
Information about the scores in a sample can be presented in several ways. If it is presented in a
graph or chart, this may make it easier for people to understand what has been found, compared to
simply presenting information about the central tendency and dispersion. We will shortly consider
some examples. The key point to remember is that all graphs and charts should be clearly labeled
and presented so that the reader can rapidly make sense of the information contained in them.
i. Frequency polygon
One way of summarizing these data is in the form of a frequency polygon. This is a simple form
of chart in which the scores from low to high are indicated on the x or horizontal axis and the
frequencies of the various scores (in terms of the numbers of individuals obtaining each score) are
indicated on the y or vertical axis. The points on a frequency polygon should only be joined up
when the scores can be ordered from low to high. In order for a frequency polygon to be most
useful, it should be constructed so that most of the frequencies are neither very high nor very low.
The frequencies will be very high if the width of each class interval (the categories used to
summarize frequencies) is too broad, and very low if it is too narrow. See the example below.
[Figure: line chart covering 1999/00 to 2010/11, with values from -30.0 to 30.0 on the vertical axis; series shown are Gross Saving (% of GDP), Investment (% of GDP), Change in GDP Deflator, Per capita GDP ($), Resource Balance, and Gross Domestic Savings]
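To illustrate how class-interval width affects the frequencies behind a frequency polygon, here is a sketch using Python's collections.Counter; the scores are hypothetical, invented purely for illustration:

```python
from collections import Counter

# Hypothetical test scores (0-20), for illustration only
scores = [7, 9, 9, 12, 12, 12, 14, 15, 15, 17, 4, 9, 11, 13, 8]

# Frequency of each individual score (class interval width = 1)
print(sorted(Counter(scores).items()))

# Grouping into wider class intervals (width = 5) reduces the number of
# categories and raises the frequency within each one
intervals = Counter((s // 5) * 5 for s in scores)
print(sorted(intervals.items()))  # each key is the start of a 5-point interval
```

Plotting the interval frequencies against the interval mid-points, and joining the points, would give the frequency polygon itself.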
ii. Histogram
A similar way of describing these data is by means of a histogram. In a histogram, the scores are
indicated on the horizontal axis and the frequencies are shown on the vertical axis. In contrast to a
frequency polygon, however, the frequencies are indicated by rectangular columns. These columns
are all the same width but vary in height in accordance with the corresponding frequencies. As
with frequency polygons, it is important to make sure that the class intervals are not too broad or
too narrow. All class intervals are represented, even if there are no scores in some of them. Class
intervals are indicated by their mid-points at the centre of the columns.
Histograms are clearly rather similar to frequency polygons. However, frequency polygons are
sometimes preferable when you want to compare two different frequency distributions. The
information contained in a histogram is interpreted in the same way as the information in a
frequency polygon.
iii. Bar chart
Frequency polygons and histograms are suitable when the scores obtained by the participants can
be ordered from low to high. In more technical terms, the data should be either interval or ratio.
However, there are many studies in which the scores are in the form of categories rather than
ordered scores; in other words, the data are nominal. In a bar chart, the categories are shown along
the horizontal axis, and the frequencies are indicated on the vertical axis. In contrast to the data
contained in histograms, the categories in bar charts cannot be ordered numerically in a meaningful
way. However, they can be arranged in ascending (or descending) order of popularity. Another
difference from histograms is that the rectangles in a bar chart do not usually touch each other. The
scale on the vertical axis of a bar chart normally starts at zero. However, it is sometimes
convenient for presentational purposes to have it start at some higher value. If that is done, then it
should be made clear in the bar chart that the lower part of the vertical scale is missing. The
columns in a bar chart often represent frequencies. However, they can also represent means or
percentages for different groups (Coolican, 1994). How do you interpret the information in a bar
chart? In the present example, a bar chart makes it easy to compare the sources of funds to the
manufacturing sector in Ethiopia (1957-61).
Figure xxx: Estimated sources of Fund to the manufacturing Sector in the FFYP (1957-61)
[Bar chart: percentage shares of the funding sources, with bars of 35%, 28%, 14%, 12%, 11%, and 1%; the vertical axis shows percentage share from 0% to 40%]
iv. Pie-charts
Pie charts are other means used to present data through circular graphics. Pie charts, as their name
suggests, present data as segments of the whole pie. Pie charts are visually powerful. They convey
simply and straightforwardly the proportions of each category which go to make up the total. In
most cases, the segments are presented in terms of percentages. To enhance their impact, pie charts
that draw attention to one particular component can have that segment extracted from the rest
of the pie. On other occasions, all segments are pulled away from the core in what is known as an
exploded pie chart. Weighed against their visual strength, pie charts can only be used with one
data set. Their impact is also dependent on not having too many segments. As a rule of thumb, a
pie chart should have no more than seven segments and have no segment which accounts for
less than 2 per cent of the total.
[Pie chart: segments for 1960/61 (15%), 1964/65 (27%), 1968/69 (26%), and 1973/74 (32%)]
Please refer to other data presentation techniques in any statistics textbook.
3.7.4. STATISTICAL TESTS
The various ways in which the data from a study can be presented are all useful in that they give us
convenient and easily understood summaries of what we have found. However, to have a clearer
idea of what our findings mean, it is generally necessary to carry out one or more statistical tests.
The first step in choosing an appropriate statistical test is to decide whether your data were
obtained from an experiment in which some aspect of the situation (the independent variable) was
manipulated in order to observe its effects on the dependent variables.
In using a statistical test, you need to take account of the experimental hypothesis. If you predicted
the direction of any effects (e.g. loud noise will disrupt learning and memory), then you have a
directional hypothesis, which should be evaluated by a one-tailed test. If you did not predict the
direction of any effects (e.g. loud noise will affect learning and memory), then you have a non-
directional hypothesis, which should be evaluated by a two-tailed test.
Another factor to consider when deciding which statistical test to use is the type of data you have
obtained.
There are four types of data of increasing levels of precision:
Nominal: the data consist of the numbers of participants falling into various categories
(e.g. fat, thin; men, women).
Ordinal: the data can be ordered from lowest to highest (e.g. the finishing positions of
athletes in a race).
Interval: the data differ from ordinal data, because the units of measurement are fixed
throughout the range; for example, there is the same “distance” between a height of 1.82
meters and 1.70 meters as between a height of 1.70 meters and one of 1.58 meters.
Ratio: the data have the same characteristics as interval data, with the exception that they
have a meaningful zero point; for example, time measurements provide ratio data because
the notion of zero time is meaningful, and 10 seconds is twice as long as 5 seconds. The
similarities between interval and ratio data are so great that they are sometimes combined
and referred to as interval/ratio data.
Summary
Nominal data: data consisting of the numbers of participants falling into qualitatively different
categories.
Ordinal data: data that can be ordered from smallest to largest.
Interval data: data in which the units of measurement have an invariant or unchanging value.
Ratio data: as interval data, but with a meaningful zero point.
Parametric tests: statistical tests that require interval or ratio data, normally distributed data,
and similar variances in both conditions.
Non-parametric tests: statistical tests that do not involve the requirements of parametric tests.
Statistical tests can be divided into parametric tests and non-parametric tests. Parametric tests
should only be used when the data obtained from a study satisfy various requirements. More
specifically, there should be interval or ratio data, the data should be normally distributed, and the
variances in the two conditions should be reasonably similar. In contrast, non-parametric tests can
nearly always be used, even when the requirements of parametric tests are not satisfied.
Statistical significance
So far, we have discussed some of the issues that influence the choice of statistical test. What
happens after we have chosen a statistical test, and analyzed our data, and want to interpret our
findings? We use the results of the test to choose between the following:
Experimental hypothesis (e.g., loud noise disrupts learning).
Null hypothesis, which asserts that there is no difference between conditions (e.g. loud
noise has no effect on learning).
In fact, two errors may occur when reaching a conclusion based on the results of a statistical test:
Type I error: we may reject the null hypothesis in favor of the experimental
hypothesis even though the findings are actually due to chance; the probability of
this happening is given by the level of statistical significance that is selected.
Type II error: we may retain the null hypothesis even though the experimental
hypothesis is actually correct.
It would be possible to reduce the likelihood of a Type I error by using a more stringent level of
significance. For example, if we used the 1% (p=0.01) level of significance, this would greatly
reduce the probability of a Type I error. However, use of a more stringent level of significance
increases the probability of a Type II error. We could reduce the probability of a Type II error by
using a less stringent level of significance, such as the 10% (p=0.10) level. However, this would
increase the probability of a Type I error. These considerations help to make it clear why most
business researchers favor the 5% (or p=0.05) level of significance: it allows the probabilities of
both Type I and Type II errors to remain reasonably low.
3.7.5. Correlation studies
a. Spearman’s rho
Suppose that we have scores on two variables from each of our participants, and we want to see
whether there is an association or correlation between the two sets of scores. This can be done by
using a test known as Spearman’s rho, if the data are at least ordinal.
rho = 1 − (6 Σd²) / (N(N² − 1))
Steps
1. Draw up a table in which each participant’s scores for the two variables are placed in the
same row.
2. Rank all the scores for variable A. A rank of 1 is assigned to the smallest score, a rank of 2
to the second smallest score, and so on up to N.
3. Rank all the scores for variable B, with a rank of 1 being assigned to the smallest score.
4. Calculate the difference between the two ranks obtained by each individual, with the rank
for variable B being subtracted from the rank for variable A.
5. Square all of the d scores obtained in the fourth step.
6. Add up all of the d² scores to obtain Σd², the sum of the squared difference scores.
7. Substitute Σd² and N into the formula to obtain rho.
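The steps above can be sketched in Python. This is a minimal illustration assuming no tied scores; tied ranks would need the usual averaged-rank adjustment.

```python
def spearman_rho(scores_a, scores_b):
    """Spearman's rho following the steps above: rank each variable,
    take the difference in ranks for each participant, square and sum
    the differences, then apply rho = 1 - (6 * sum_d2) / (N * (N**2 - 1)).
    Simple version: assumes no tied scores."""
    def ranks(scores):
        # Rank 1 goes to the smallest score, rank N to the largest.
        order = sorted(range(len(scores)), key=lambda i: scores[i])
        r = [0] * len(scores)
        for rank, i in enumerate(order, start=1):
            r[i] = rank
        return r

    n = len(scores_a)
    ra, rb = ranks(scores_a), ranks(scores_b)
    sum_d2 = sum((x - y) ** 2 for x, y in zip(ra, rb))
    return 1 - (6 * sum_d2) / (n * (n ** 2 - 1))

# A perfect positive association gives rho = 1; a perfect reversal gives -1.
print(spearman_rho([1, 2, 3, 4, 5], [10, 20, 30, 40, 50]))  # 1.0
print(spearman_rho([1, 2, 3, 4, 5], [50, 40, 30, 20, 10]))  # -1.0
```

Because only the ranks enter the calculation, the result is unchanged by any order-preserving transformation of the raw scores, which is why ordinal data suffice.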
NB. For more advanced statistical techniques on multi-variant analysis, please refer to your
statistics courses or contact your research advisors while progressing with your research
project.
In addition, there are some general sources of information on such conventions. For the production
of dissertations there is, for example, the British Standards specification no. 4821. Although this
was withdrawn in 1990, it has not been superseded and is still recommended by the British
Library. For the production of academic articles and for referencing techniques, the researcher
could turn to the Publication Manual of the American Psychological Association. There are also
books devoted to guidance for authors on the technical conventions associated with writing up
research – for example, K. L. Turabian’s Manual for Writers of Term Papers, Theses, and
Dissertations (University of Chicago Press, 6th edn, 1996).
There is no single set of rules and guidelines for writing up the research that covers all
situations and provides a universally accepted convention.
The Harvard referencing system
There are conventions for referring to the ideas, arguments and supporting evidence gleaned from
others. There are two that are generally recognized: the numerical system and the Harvard system.
The numerical system involves placing a number in the text at each point where the author wishes
to refer to a specific source. The full references are then given at the end of the book or individual
chapters, and these can be incorporated into endnotes. In the Harvard system, the sources of ideas,
arguments, and supporting evidence are indicated by citing the name of the author and the date of
publication of the relevant work. This is done at the appropriate point in the text. Full details of the
author’s name and the publication are subsequently given at the end of the report, so that the reader
can identify the exact source and, if necessary, refer to it directly. The Harvard system involves
referring to authors in the text in the following ways:
Baker (2007) argues that postmodernism has a dubious future.
It has been argued that postmodernism has a dubious future (Baker, 2007).
The point has been made that ‘it is not easy to see what contribution postmodernism
will make in the twenty-first century’ (Baker, 2007: 131).
In the References section towards the end of the research report, the full details of ‘Baker, 2007’
are given, as they are for all the authors’ works cited in the report. For Baker, it might look like
this:
Baker, G. (2007). The meaning of postmodernism for research methodology, British Journal of
Health Research, 25: 249–66.
As far as the References section is concerned, there are seven key components of the Harvard
system:
Author’s name and initial(s). Alphabetical order on authors’ surnames. Surname followed
by forename or initial. If the book is an edited volume, then (ed.) or (eds) should follow the
name.
Date of publication. To identify when the work was written and to distinguish different
works published by the same author(s).
Title. The title of a book is put in italics, and uses capital letters for the first letter of the
main words. Papers and articles are not in italics and have titles in lower case.
Journal name (if applicable). This is put in italics, and details are given of the volume,
number and page numbers of the specified article. If the source is a contribution to an
edited volume, details are given of the book in which it appears (i.e. editor’s name,
title of the edited volume).
Publisher. Vital for locating more obscure sources. This is included for books but not for
journals.
Place of publication. Helpful in the location of obscure sources.
Edition. If the work appears in a second or subsequent edition this needs to be specified.
Style and presentation when writing up the research
Project researchers, then, are best advised to stick to the rules:
Use the third person
Use the past tense
Ensure good standards of spelling and grammar
Develop logical links from one section to the next
Use headings and sub-headings to divide the text into clear sections
Be consistent in the use of the referencing style
Use care with the page layout
Present tables and figures properly
The structure of research reports
The conventional structure for reporting research divides the material into three parts:
1. the preliminary part,
2. the main text, and
3. the end matter.
The preliminary part
Title
The title itself needs to indicate accurately the contents of the work. It also needs to be brief. A
good way of combining the two is to have a two-part title. The first part acts as the main title and
gives a broad indication of the area of the work. The second part adds more detail. For example,
‘Ethnicity and friendship: the contrast between socio-metric research and fieldwork observation in
primary school classrooms’.
Abstract
An abstract is a synopsis of a piece of research. Its purpose is to provide a brief summary, which
can be circulated widely to allow other people to see, at a glance, if the research is relevant to their
needs and worth tracking down to read in full. An abstract is normally about 250–300 words in
length, and is presented on a separate sheet.
Key words
Researchers are often asked to identify up to five ‘key words’. These words are ‘identifiers’ –
words that capture the essence of what the report is all about. The key words are needed for cross-
referencing during library searches.
List of contents
Depending on the context, this can range from being just a list of chapter headings and their
starting page through to being an extensive list, including details of the contents within the major
sections of the report, for instance based on headings and sub-headings.
List of tables and figures: This should list the titles of the various tables and figures and their
locations.
Preface
This provides the opportunity for the researcher to give a personal statement about the origins of
the research and the significance of the research for the researcher as a person. In view of the
importance of the ‘self’ in the research process, the Preface offers a valuable place in the research
report to explore, albeit briefly, how the research reflects the personal experiences and biography
of the researcher.
Acknowledgements
Under this heading, credit can be given to those who have helped with the research. This can range
from people who acted as ‘gatekeepers’ in relation to fieldwork, through to academic supervisors,
through to those who have commented on early drafts of the research report.
List of abbreviations
If the nature of the report demands that many abbreviations are used in the text, these should be
listed, usually alphabetically, under this heading, alongside the full version of what they stand for.
The main text
The main text is generally divided into sections. The sections might be chapters as in the case of a
larger piece of work or headings as in the case of shorter reports. In either case, they are normally
presented in the following order.
Introduction
For the purposes of writing up research, there needs to be an introduction. This may, or may not,
coincide with a section or chapter titled as an ‘Introduction’, depending on how much discretion is
open to the researcher and how far this is taken. The important thing is to recognize that, at the
beginning, the reader needs to be provided with information about:
• The background to the work (in relation to significant issues, problems, ideas);
• The aims of the research;
• Key definitions and concepts to be used;
• Optionally, in longer pieces, an overview of the report (mapping out its contents).
Literature review
This may be presented as an integral part of the ‘Introduction’ or it may appear as a separate
chapter or section. It is, though, essential that in the early stages of the report there is a review of
the material that already exists on the topic in question. The current research should build on
existing knowledge, not ‘reinvent the wheel’. The literature review should demonstrate how the
research being reported relates to previous research and, if possible, how it gives rise to particular
issues, problems, and ideas that the current research addresses.
Methods of investigation
At this point, having analyzed the existing state of knowledge on a topic, it is reasonable to
describe the methods of investigation. See the section on ‘The research methods chapter or section’
(pp. 328–9) for guidance on how this should be done.
Findings
This is where the reader is introduced to the data. Aspects of the findings are singled out and
described. The first step is to say, ‘This is what was found with respect to this issue . . . This is
what was found with respect to another issue . . .’ The aim for the researcher is to be able to
present relevant findings before going ahead to analyse those findings and see what implications
they might have for the issues, problems or ideas that prompted the research. First things first: let
us see what we have found. Then, and only then, as a subsequent stage, will we move on to
considering what significance the data might have in the context of the overall aims of the
research.
Discussion and analysis
Here, the findings that have been outlined are subjected to scrutiny in terms of what they might
mean. They are literally discussed and analyzed with reference to the theories and ideas, issues and
problems that were noted earlier in the report as providing the context in which the research was
conceived. The researcher ‘makes sense’ of the findings by considering their implications beyond
the confines of the current research.
Conclusions and recommendations
Finally, in the main text, the researcher needs to draw together the threads of the research to arrive
at some general conclusion and, perhaps, to suggest some way forward. Rather than let the report
fizzle out as it reaches the end, this part of the report should be constructive and positive. It can
contain some of the following things:
• A retrospective evaluation of the research and its contribution;
• Recommendations for improving the situation, guidelines or codes of practice;
• Identification of new directions for further research.
The end matter
Appendices
This is the place for material which is too bulky for the main body of the text, or for material
which, though directly relevant to the discussion, might entail too much of a sidetrack if placed in
the text. Typical items that can be lodged in an appendix are:
• Extensive tables of data;
• Questionnaires used in a survey;
• Extracts from an interview transcript;
• Memos or minutes of meetings;
• Technical specifications.
Notes
These will mainly occur when the researcher is using a numerical referencing system. They also
offer the opportunity for scholarly details to be added which would interrupt the flow of the
reading were they to be put directly into the text.
References
See the section on the Harvard system of referencing above.
Index
Provision of an index is usually restricted to large reports and books. It is unlikely that the kind of
report produced by a project researcher would require an index.
Examples of citations
You do not have to follow the citation format shown here. It is my personal preference, but you
should always use what is specified by the university in which you hope to publish your work. You
will find that all citation formats give the same kinds of information, so once you have learned one
of them, it is easy to convert to others. Never copy a bibliography entry from someone else's
bibliography without understanding it and checking that it is correct.
Book:
Chomsky, Noam (1957) Syntactic structures. The Hague: Mouton.
Book with two authors:
Aho, Alfred V., and Ullman, Jeffrey D. (1972) The theory of parsing, translation, and compiling.
Englewood Cliffs, N.J.: Prentice-Hall.
Book with three or more authors:
Partee, Barbara H.; ter Meulen, Alice; and Wall, Robert E. (1990). Mathematical methods in
linguistics. Dordrecht: Kluwer.
Book in a series:
Blaser, A. (1988). Natural language at the computer. Lecture notes in computer science, 320.
Berlin: Springer.
Article in a journal:
Rapaport, William J. (1986) Philosophy of artificial intelligence. Teaching Philosophy
9.2:103–120.
Here “9.2:103–120” means “Volume 9, issue 2, pages 103 to 120.” If the page numbering does not
start afresh in each issue, do not give the issue number. Note that you must list all the pages the
article occupies, not just the pages you cited.
Article in a book:
Rapaport, William J. (2008) Philosophy of artificial intelligence. In John Doe and William F.
Nusquam, eds., Studies in the philosophy of artificial intelligence, pp. 103–122. Dordrecht: Reidel.
Article in proceedings of a well-known conference:
Doe, John P. (1988) Prolog optimization systems. Proceedings, AAAI-88, 128–145.
If the conference is not well known, handle the proceedings volume like a book of articles,
identifying its editors and publisher.
Unpublished paper provided to you by the author:
Doe, John P. (1984) Giant computers of the future. Department of Computer Science, University
of Tasmania.
Paper retrieved from a web site (and not otherwise published):
Doe, John P. (2008) Giant computers of the present. http://www.utas.edu.au/cs/doe/giant.pdf.
APA's preference is to give the date on which the item was retrieved.
Reprinted article:
Doe, John P. (1987) Prolog optimizers. Reprinted in L. C. Moe, ed., Prolog optimization, pp.
101–105. New York: Columbia University Press, 1998.