
Unit - 1

Meaning of Research:

Research in simple terms refers to search for knowledge. It is a scientific and systematic search
for information on a particular topic or issue. It is also known as the art of scientific
investigation. Several social scientists have defined research in different ways.

In the Encyclopaedia of Social Sciences, D. Slesinger and M. Stephenson (1930) defined research
as “the manipulation of things, concepts or symbols for the purpose of generalizing to extend,
correct or verify knowledge, whether that knowledge aids in the construction of theory or in
the practice of an art”.

According to Redman and Mory (1923), research is a “systematized effort to gain new
knowledge”. It is an academic activity and therefore the term should be used in a technical
sense. According to Clifford Woody (Kothari, 1988), research comprises “defining and
redefining problems, formulating hypotheses or suggested solutions; collecting, organizing and
evaluating data; making deductions and reaching conclusions; and finally, carefully testing the
conclusions to determine whether they fit the formulated hypotheses”.

Thus, research is an original addition to the available knowledge, which contributes to its
further advancement. It is an attempt to pursue truth through the methods of study,
observation, comparison and experiment. In sum, research is the search for knowledge, using
objective and systematic methods to find a solution to a problem.

Objectives of Research

The objective of research is to find answers to the questions by applying scientific procedures.
In other words, the main aim of research is to find out the truth which is hidden and has not yet
been discovered. Although every research study has its own specific objectives, the research
objectives may be broadly grouped as follows:

1. To gain familiarity with a phenomenon or to achieve new insights into it (i.e.,
exploratory or formulative research studies);
2. To portray accurately the characteristics of a particular individual, group or situation
(i.e., descriptive research studies);
3. To determine the frequency with which something occurs (i.e., diagnostic research
studies); and
4. To test the hypothesis of a causal relationship between two variables (i.e.,
hypothesis-testing research studies).

Significance of Research:

“All progress is born of inquiry. Doubt is often better than overconfidence, for it leads to
inquiry, and inquiry leads to invention” is a famous remark by Hudson Maxim, in the context of
which the significance of research can well be understood. Increased amounts of research make
progress possible. Research inculcates scientific and inductive thinking and it promotes the
development of logical habits of thinking and organization.

The role of research in several fields of applied economics, whether related to business or to the
economy as a whole, has greatly increased in modern times. The increasingly complex nature of
business and government has focused attention on the use of research in solving operational
problems. Research, as an aid to economic policy, has gained added importance, both for
government and business.

Research provides the basis for nearly all government policies in our economic system. For
instance, government’s budgets rest in part on an analysis of the needs and desires of the people
and on the availability of revenues to meet these needs. The cost of needs has to be equated to
probable revenues and this is a field where research is most needed. Through research we can
devise alternative policies and can as well examine the consequences of each of these
alternatives.

Decision-making may not be a part of research, but research certainly facilitates the decisions of
the policy maker. Government has also to chalk out programmes for dealing with all facets of
the country’s existence and most of these will be related directly or indirectly to economic
conditions. The plight of cultivators, the problems of big and small business and industry,
working conditions, trade union activities, the problems of distribution, even the size and
nature of defense services are matters requiring research. Thus, research is considered
necessary with regard to the allocation of the nation’s resources.

Another area in government where research is necessary is that of collecting information on the
economic and social structure of the nation. Such information indicates what is happening in
the economy and what changes are taking place. Collecting such statistical information is by no
means a routine task, but involves a variety of research problems. These days nearly all
governments maintain a large staff of research technicians or experts to carry on this work.

Thus, in the context of government, research as a tool to economic policy has three distinct
phases of operation, viz.,

1. Investigation of economic structure through continual compilation of facts;
2. Diagnosis of events that are taking place and the analysis of the forces underlying them;
and
3. The prognosis, i.e., the prediction of future developments.

The main purpose of research is to inform action, to prove a theory, and to contribute to
developing knowledge in a field of study. This section will highlight the significance of research
with the following points:

1. A Tool for Building Knowledge and for Facilitating Learning
2. Means to Understand Various Issues and Increase Public Awareness
3. An Aid to Business Success
4. A Way to Prove Lies and to Support Truths
5. Means to Find, Gauge, and Seize Opportunities
6. A Seed to Love Reading, Writing, Analyzing, and Sharing Valuable Information
7. Nourishment and Exercise for the Mind

Some significant aspects of research are elaborated below:


• Research has its special significance in solving various operational and planning
problems of business and industry. Operations research and market research, along with
motivational research, are considered crucial and their results assist, in more than one
way, in taking business decisions. Market research is the investigation of the structure
and development of a market for the purpose of formulating efficient policies for
purchasing, production and sales.
• Operations research refers to the application of mathematical, logical and analytical
techniques to the solution of business problems of cost minimisation or of profit
maximisation, or what can be termed optimisation problems (a small illustrative sketch
follows this list). Motivational research, which seeks to determine why people behave as
they do, is mainly concerned with market characteristics.
• In other words, it is concerned with the determination of motivations underlying the
consumer (market) behaviour. All these are of great help to people in business and
industry who are responsible for taking business decisions. Research with regard to
demand and market factors has great utility in business. Given knowledge of future
demand, it is generally not difficult for a firm or an industry to adjust its supply
schedule within the limits of its projected capacity.
• Market analysis has become an integral tool of business policy these days. Business
budgeting, which ultimately results in a projected profit and loss account, is based
mainly on sales estimates which in turn depend on business research. Once sales
forecasting is done, efficient production and investment programmes can be set up
around which are grouped the purchasing and financing plans. Research, thus, replaces
intuitive business decisions by more logical and scientific decisions.
• Research is equally important for social scientists in studying social relationships and in
seeking answers to various social problems. It provides the intellectual satisfaction of
knowing a few things just for the sake of knowledge and also has practical utility for the
social scientist to know for the sake of being able to do something better or in a more
efficient manner. Research in social sciences is concerned both with knowledge for its
own sake and with knowledge for what it can contribute to practical concerns.
• “This double emphasis is perhaps especially appropriate in the case of social science. On
the one hand, its responsibility as a science is to develop a body of principles that make
possible the understanding and prediction of the whole range of human interactions. On
the other hand, because of its social orientation, it is increasingly being looked to for
practical guidance in solving immediate problems of human relations.”
• In addition to what has been stated above, the significance of research can also be
understood keeping in view the following points:
o To those students who are to write a master’s or Ph.D. thesis, research may
mean careerism or a way to attain a high position in the social structure;
o To professionals in research methodology, research may mean a source of
livelihood;
o To philosophers and thinkers, research may mean the outlet for new ideas and
insights;
o To literary men and women, research may mean the development of new styles
and creative work;
o To analysts and intellectuals, research may mean the generalizations of new
theories.
• Thus, research is the fountain of knowledge for the sake of knowledge and an important
source of guidelines for solving different business, governmental and social
problems. It is a sort of formal training which enables one to understand the new
developments in one’s field in a better way.
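
As a minimal illustration of the kind of optimisation problem mentioned above, the following
Python sketch solves a hypothetical profit-maximisation problem; the products, profit figures
and capacity limits are all assumptions made purely for illustration:

    # Hypothetical product-mix problem: maximise profit from two
    # products, A and B, subject to machine-hour and labour-hour limits.
    from scipy.optimize import linprog

    # Assumed profit per unit: 40 for A, 30 for B.
    # linprog minimises, so the coefficients are negated to maximise.
    c = [-40, -30]

    # Assumed constraints: 2A + 1B <= 100 machine hours,
    #                      1A + 2B <= 80 labour hours.
    A_ub = [[2, 1], [1, 2]]
    b_ub = [100, 80]

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
    print("Optimal output of A and B:", res.x)
    print("Maximum profit:", -res.fun)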



Motives of Research

The possible motives for doing research may be one or more of the following:

1. Desire to get a research degree along with its consequential benefits;
2. Desire to face the challenge in solving unsolved problems, i.e., concern over practical
problems initiates research;
3. Desire to get the intellectual joy of doing some creative work;
4. Desire to be of service to society; and
5. Desire to get respectability.

However, this is not an exhaustive list of factors motivating people to undertake research
studies. Many more factors, such as directives of government, employment conditions,
curiosity about new things, desire to understand causal relationships, and social thinking and
awakening, may motivate (or at times compel) people to perform research operations.

Characteristics of Social Research - Characteristics of social research are given below.

• Study of human behavior: Social research deals with social phenomena. It studies
the behavior of human beings as members of society, and their feelings, responses and
attitudes under different circumstances.
• Discovering new facts: In social science, a lot of research is carried on both to discover
new facts and to verify old concepts, and it has resulted in the discovery of new facts as
well as the modification of old ones. Constant verification of old concepts is needed,
especially in the case of dynamic sciences. Verification is needed for two reasons.
o Firstly, there may be an improvement in the technique of research and it is
necessary to test the old concepts by this improved technique.
o Secondly, the phenomena under study might have undergone a change and it
may be required to test the validity of old concepts in the changed
circumstances.
• Scientific analysis: Social research tries to establish causal connections between various
human activities. It is really very interesting to note whether various complex human
activities are performed only at random, without any sequence, law or system
behind them. At a first casual look at varied human behavior, attitudes, moods and
temperaments, the presence of any system may appear to be an impossibility, but a close
and patient study of different cases discloses the truth that most of them are not merely
jumbled as they appear, but are governed by definite rules, systems and universal
laws.



Limitations of Social Research - Limitations of social research are explained below.

• In the course of conducting research, one has to constantly guard against bias,
subjectivity and inaccuracy. Yet, it is difficult to avoid them totally, and a minimum of
bias and inaccuracy is always present in any research. Social research happens to be no
exception, and it is replete with instances of problematic situations where some errors
cannot possibly be avoided.
• In social surveys, the traders, farmers and wage earners from whom the data are
collected may not reveal their business secrets, which also sets certain limitations. The
respondents may not maintain any record of their expenses and receipts; hence, the
data provided by them from memory may have limitations. The researcher has to
carefully minimize such errors by educating the respondents about the scope of the
study and with all possible verification of the data collected in the field.

Purpose of Research - There are three purposes of research:

• Exploratory: As the name suggests, exploratory research is conducted to explore a
group of questions. The answers and analytics may not offer a final conclusion to the
perceived problem. It is conducted to handle new problem areas which haven’t been
explored before. This exploratory process lays the foundation for more conclusive
research and data collection.
• Descriptive: Descriptive research focuses on expanding knowledge on current issues
through a process of data collection. Descriptive studies are used to describe the
behavior of a sample population. In a descriptive study, only one variable is required to
conduct the study. The three main purposes of descriptive research are describing,
explaining, and validating the findings. An example is a study conducted to find out
whether top-level management leaders in the 21st century possess the moral right to
receive a huge sum of money from the company’s profits.
• Explanatory: Explanatory research or causal research is conducted to understand the
impact of certain changes in existing standard procedures. Conducting experiments is
the most popular form of causal research. For example, a study conducted to understand
the effect of rebranding on customer loyalty.

Types of Research

Research has been classified differently, depending upon the approach, the purpose and the
nature of a research activity. The basic types of research are as follows:
A. Fundamental, Pure or Theoretical Research - This type of research is original or basic in
character. An imaginative and meticulous research worker, with his qualities of honesty and
integrity and his passion for the pursuit of truth, makes persistent and patient efforts to discover
something new to enrich human knowledge in a fundamental fashion. Such research is
known as fundamental or pure. Fundamental research can take shape in two different ways:

• Discovery of a new theory: Fundamental research may be entirely a new discovery,
the knowledge of which has not existed so far. The researcher is often a born genius with
a sharp intellect and a thirst for knowledge, from whose vast store of knowledge gems of
new insight emerge. This discovery may have nothing to do with an existing theory.
• Development of the existing theory: Since theory is always based on assumptions,
there often exists enormous scope for altering or formulating new set of assumptions
and adding new dimensions to the existing theory. There also exist the possibilities of
re-interpretation of the theory that has already been developed. A researcher may as
well take off from the existing theories and come out with a new one of his own. The
assumptions of a theory should always be well defined and plausible. Relaxing
assumptions, altering them or making new ones altogether depends upon how a
researcher views the existing theory.

B. Applied Research - This type of research is based on the application of known theories and
models to the actual operational fields or populations. The applied research is conducted to test
the empirical content or the basic assumptions or the very validity of theory under given
conditions. Applied research contributes to social science by:

• providing the kind of convincing evidence of usefulness to society which is necessary
for continuing support;
• utilizing and developing techniques which can also be made to serve so-called basic
research; and
• providing data and ideas which may speed up the process of generalization.

Utility to developing countries: Applied research has practical utility in the developing
countries. Developing countries may benefit by the discoveries, scientific and otherwise, that
have already been made, either by direct application or by making some modifications wherever
necessary.

Field investigation method: Applied research often takes the form of a field investigation and
aims at collecting the basic data for verifying the applicability of existing theories and models
in a given situation.

C. Exploratory Research - It pursues several possibilities simultaneously, and in a sense it is
not quite sure of its objective. Exploratory research is designed to provide a background, to
familiarise and, as the word implies, just explore the general subject. A part of exploratory
research is the investigation of relationships among variables, without knowing why they are
studied. It borders on an idle-curiosity approach, differing from it only in that the
investigator thinks there may be a payoff in an application somewhere in the forest of questions.



D. Descriptive Research - Descriptive research, as the name suggests, is designed to describe
something, for example, the characteristics of users of a given product or the degree to which
product use varies with income. To be of maximum benefit, a descriptive study must collect
data for a definite purpose. Descriptive studies vary in the degree to which a specific hypothesis
is the guide. They allow both implicit and explicit hypotheses to be tested, depending on the
research problem.

E. Action Research - Action research is also a recent addition to the categories of research
known to a modern social scientist. By its very definition, “it is a research through launching of
a direct action with the objective of obtaining workable solutions to the given problems”. In its
first phase, it aims at collecting information from sources that have a direct or indirect bearing
on the research programme.

At the second phase, the planned action is practically launched and then at the next phase,
research carries out periodical assessment of the project. At a subsequent stage, changes,
modifications and other improvements are made in the functional aspect of the project and
finally the whole process culminates in the evaluation of the project as a whole.

The methods used for this type of research are usually the personal interview and
survey methods. Sometimes attitude measurement techniques are also used. Some problems
associated with action research are the personal values of the individuals, lack of the social
scientist’s interest and exclusive locations with the respondent.

F. Evaluation Research - It is a recent addition to the types of research. This type of research
is primarily directed to evaluate the performance of the developmental projects and other
economic programmes that have already been implemented. The objective of evaluation
research is to realistically assess the impact of any such programmes. Evaluation is held to
mean a comprehensive concept of measurement, and it is because of this definition of evaluation
that project evaluations have become frequent in recent years.

The evaluation research is of three types:

• Concurrent evaluation: It is a continuing process of inspection of the project that
has been launched. In this manner, such research not only evaluates the
performance but also stimulates it and gives direction and control as and when possible.
• Phasic or periodic evaluation: This type of evaluation takes place at different phases
or stages of performance of the project. It enables us to evaluate the performance of the
completed phase and make adjustments in the subsequent phases after keeping in view
the failures and successes of the previous phase.
• Terminal evaluation: It is the evaluation of the final phase of the project. Once the
project has been completed, an overall assessment is made to see how best a project has
served the objectives for which it was launched.

G. Experimental Research - Experimentation is the process of research in which one or more
variables are manipulated under conditions which permit the collection of data that show
the effects. Experiments create an artificial situation so that the researcher can obtain the
particular data needed and can measure the data accurately. Experiments are artificial in the
sense that the situations are usually created for testing purposes.

H. Empirical Research - This type relies on experience or observation alone, often without
due regard for system and theory. It is data-based research, coming up with conclusions
which are capable of being verified by observation or experiment. In such research, it is
necessary to get the facts first hand, at their source, and actively do certain things to stimulate
the production of desired information. Empirical research is appropriate when proof is sought
that certain variables affect other variables in some way.

I. Survey Research - This type of research has become very popular these days as a scientific
method for discovering the relevant impacts and inter-relationships of sociological and
psychological variables in a given population. Survey research studies large and small
populations by selecting and studying samples chosen from those populations.

The advantage of this type of research is that it links sample investigations with populations
and thereby offers an easy opportunity of studying population behavior through sample survey
research assessments.

Methods of survey: Survey research is approached through the methods of personal
interviews, mailed questionnaires and personal discussions, besides indirect oral investigation.

Advantages: This type of research has the advantage of greater scope, in the sense that a larger
volume of information can be collected from a very large population.

J. Qualitative Research - It is concerned with qualitative phenomena, i.e., phenomena
relating to or involving quality or kind. This type of research aims at discovering the
underlying motives and desires, using in-depth interviews for the purpose.

K. Quantitative Research - It is based on the measurement of quantity or amount. It is
applicable to phenomena that can be expressed in terms of quantity.

L. Field Investigation Research -

This is generally credited with a few virtues which are supposed to be unique to this category
of research. These virtues may be listed as:

• The variables in a field experiment operate more strongly than those used in a laboratory
experiment, because the field situation takes stock of realistic, natural operations.
• Secondly, field experiments have the advantage of investigating more fruitfully the
dynamics of inter-relationships of small groups of variables.
• Field experimental studies are also ideal for testing theory and for solving
real-world problems.

Uses: Field experimental studies are an important part of applied research which, at times,
plays an important role in pointing out the nature and direction of the refinements required for
an existing doctrine.

M. Ex-post Facto Research - This is a systematic empirical inquiry in which the scientist
does not have direct control of independent variables because their manifestations have already
occurred or because they are inherently not manipulable. Inferences about relations among
variables are made, without direct intervention, from the concomitant variation of independent
and dependent variables.

N. Historical Research - Historical study is a study of past records and other information
sources with a view to reconstructing the origin and development of an institution or a
movement or a system and discovering the trends in the past. It is descriptive in nature. It is a
difficult task; it must often depend upon inference and logical analysis of recorded data and
indirect evidence rather than upon direct observation.

Some Other Types of Research

• All other types of research are variations of one or more of the above-stated approaches,
based on the purpose of the research, the time required to accomplish it, the
environment in which the research is done, or some other similar factor.
• Research can be field-setting research or laboratory research or simulation research,
depending upon the environment in which it is to be carried out. Research can as well be
understood as clinical or diagnostic research. Such research follows case-study methods
or in-depth approaches to reach the basic causal relations.
• Such studies usually go deep into the causes of things or events that interest us, using
very small samples and very deep probing data gathering devices. The research may be
exploratory or it may be formalised. The objective of exploratory research is the
development of hypotheses rather than their testing, whereas formalised research
studies are those with substantial structure and with specific hypotheses to be tested.
Historical research is that which utilises historical sources like documents, remains, etc.
to study events or ideas of the past, including the philosophy of persons and groups at
any remote point of time.
• Research can also be classified as conclusion-oriented and decision-oriented research.
While doing conclusion oriented research, a researcher is free to pick up a problem,
redesign the enquiry as he proceeds and is prepared to conceptualise as he wishes.
Decision-oriented research is always for the need of a decision maker and the researcher
in this case is not free to embark upon research according to his own inclination.
• Operations research is an example of decision oriented research since it is a scientific
method of providing executive departments with a quantitative basis for decisions
regarding operations under their control.

Research Approaches

Different research approaches are explained below.

Descriptive vs. Analytical

• Descriptive research includes surveys and fact-finding enquiries of different kinds. The
major purpose of descriptive research is description of the state of affairs as it exists at
present. In social science and business research, we quite often use the term ‘ex post
facto research’ for descriptive research studies.
• The main characteristic of this method is that the researcher has no control over the
variables; he can only report what has happened or what is happening. Most ex post
facto research projects are used for descriptive studies in which the researcher seeks to
measure such items as, for example, frequency of shopping, preferences of people, or
similar data.
• Ex post facto studies also include attempts by researchers to discover causes even when
they cannot control the variables. The methods of research utilized in descriptive
research are survey methods of all kinds, including comparative and correlational
methods. In analytical research, on the other hand, the researcher has to use facts or
information already available, and analyze these to make a critical evaluation of the
material.

Applied vs. Fundamental

• Research can either be applied (or action) research or fundamental (or basic or pure)
research. Applied research aims at finding a solution for an immediate problem facing a
society or an industrial/business organisation, whereas fundamental research is mainly
concerned with generalisations and with the formulation of a theory.
• Research concerning some natural phenomenon or relating to pure mathematics is an
example of fundamental research. Similarly, research studies concerning human
behaviour, carried on with a view to making generalisations about human behaviour, are
also examples of fundamental research; but research aimed at certain conclusions (say, a
solution) to a concrete social or business problem is an example of applied research.
• Research to identify social, economic or political trends that may affect a particular
institution, copy research (research to find out whether certain communications
will be read and understood), marketing research and evaluation research are
examples of applied research. Thus, the central aim of applied research is to discover a
solution for some pressing practical problem, whereas basic research is directed towards
finding information that has a broad base of applications and thus, adds to the already
existing organised body of scientific knowledge.

Quantitative vs. Qualitative

• Quantitative research is based on the measurement of quantity or amount. It is
applicable to phenomena that can be expressed in terms of quantity. Qualitative
research, on the other hand, is concerned with qualitative phenomena, i.e., phenomena
relating to or involving quality or kind. For instance, when we are interested in
investigating the reasons for human behaviour (i.e., why people think or do certain
things), we quite often talk of ‘Motivation Research’, an important type of qualitative
research.
• This type of research aims at discovering the underlying motives and desires, using in-
depth interviews for the purpose. Other techniques of such research are word
association tests, sentence completion tests, story completion tests and similar other
projective techniques. Attitude or opinion research, i.e., research designed to find out
how people feel or what they think about a particular subject or institution, is also
qualitative research.
• Qualitative research is especially important in the behavioural sciences where the aim is
to discover the underlying motives of human behaviour. Through such research we can
analyse the various factors which motivate people to behave in a particular manner or
which make people like or dislike a particular thing. It may be stated, however, that to
apply qualitative research in practice is relatively a difficult job and therefore, while
doing such research, one should seek guidance from experimental psychologists.

Conceptual vs. Empirical

• Conceptual research is that related to some abstract idea(s) or theory. It is generally
used by philosophers and thinkers to develop new concepts or to reinterpret existing
ones. On the other hand, empirical research relies on experience or observation alone,
often without due regard for system and theory. It is data-based research, coming up
with conclusions which are capable of being verified by observation or experiment.
• We can also call it experimental research. In such research, it is necessary to
get at facts firsthand, at their source, and actively to go about doing certain things to
stimulate the production of desired information. In such research, the researcher must
first provide himself with a working hypothesis or guess as to the probable results. The
researcher then works to get enough facts (data) to prove or disprove his hypothesis. He
then sets up experimental designs which he thinks will manipulate the persons or the
materials concerned so as to bring forth the desired information.
• Such research is thus characterised by the experimenter’s control over the variables
under study and his deliberate manipulation of one of them to study its effects.
Empirical research is appropriate when proof is sought that certain variables affect
other variables in some way. Evidence gathered through experiments or empirical
studies is today considered to be the most powerful support possible for a given
hypothesis.

Characteristics of Good Research – Academic research is defined as a process of collecting,
analyzing and interpreting information to answer questions or solve a problem. But to qualify
as good research, the process must have certain characteristics and properties: it must, as far as
possible, be controlled, rigorous, systematic, valid and verifiable, empirical and critical. The
main characteristics of good-quality research are listed below:

1. It is based on the work of others.
2. It can be replicated and is doable.
3. It is generalisable to other settings.
4. It is based on some logical rationale and tied to theory, in a way that it has the
potential to suggest directions for future research.
5. It generates new questions or is cyclical in nature.
6. It is incremental.
7. It addresses directly or indirectly some real problem in the world.

8. It clearly states the variables or constructs to be examined.
9. It is valid and verifiable, such that whatever you conclude on the basis of your findings
is correct and can be verified by you and others.
10. The researcher is sincerely interested and/or invested in the research.

Meanwhile, bad research has the following properties:

1. The opposite of the characteristics discussed above.
2. Looking for something when it simply is not to be found.
3. Plagiarizing other people’s work.
4. Falsifying data to prove a point.
5. Misrepresenting information and misleading participants.

The qualities of good research

1. Systematic: It means that research is structured with specified steps to be taken in a
specified sequence in accordance with a well-defined set of rules. The systematic
characteristic of research does not rule out creative thinking, but it certainly does
reject the use of guessing and intuition in arriving at conclusions.
2. Logical: This implies that research is guided by the rules of logical reasoning, and the
logical processes of induction and deduction are of great value in carrying out research.
Induction is the process of reasoning from a part to the whole, whereas deduction is the
process of reasoning from some premise to a conclusion which follows from that very
premise. In fact, logical reasoning makes research more meaningful in the context of
decision making.
3. Empirical: This implies that research is related basically to one or more aspects of a
real situation and deals with concrete data that provide a basis for the external validity
of research results.
4. Replicable: This characteristic allows research results to be verified by replicating the
study and thereby building a sound basis for decisions.

Research Process - Before embarking on the details of research methodology and techniques,
it seems appropriate to present a brief overview of the research process. The research process
consists of a series of actions or steps necessary to carry out research effectively, together with
the desired sequencing of these steps.



• The chart indicates that the research process consists of a number of closely related
activities. But such activities overlap continuously rather than following a strictly
prescribed sequence. At times, the first step determines the nature of the last step to be
undertaken. If subsequent procedures have not been taken into account in the early
stages, serious difficulties may arise which may even prevent the completion of the
study.
• One should remember that the various steps involved in a research process are not
mutually exclusive; nor are they separate and distinct. They do not necessarily follow
each other in any specific order and the researcher has to be constantly anticipating at
each step in the research process the requirements of the subsequent steps.
• However, the following order concerning various steps provides a useful procedural
guideline regarding the research process:
o formulating the research problem
o extensive literature survey
o developing the hypothesis
o preparing the research design
o determining sample design
o collecting the data
o execution of the project
o analysis of data
o hypothesis testing
o generalisations and interpretation, and
o preparation of the report or presentation of the results, i.e., formal write-up of
conclusions reached
1. Formulating the research problem -
a. There are two types of research problems, viz., those which relate to states of
nature and those which relate to relationships between variables. At the very
outset, the researcher must single out the problem he wants to study, i.e., he
must decide the general area of interest or aspect of a subject-matter that he
would like to inquire into.
b. Initially, the problem may be stated in a broad general way and then the
ambiguities, if any, relating to the problem be resolved. Then, the feasibility of a
particular solution has to be considered before a working formulation of the
problem can be set up. The formulation of a general topic into a specific research
problem, thus, constitutes the first step in a scientific enquiry.
c. Essentially two steps are involved in formulating the research problem, viz.,
understanding the problem thoroughly, and rephrasing the same into
meaningful terms from an analytical point of view. The best way of
understanding the problem is to discuss it with one’s own colleagues or with
those having some expertise in the matter. In an academic institution the
researcher can seek help from a guide, who is usually an experienced person and
has several research problems in mind.

d. Often, the guide puts forth the problem in general terms and it is up to the
researcher to narrow it down and phrase the problem in operational terms. In
private business units or in governmental organisations, the problem is usually
earmarked by the administrative agencies with which the researcher can discuss
as to how the problem originally came about and what considerations are
involved in its possible solutions.
e. The researcher must at the same time examine all available literature to get
himself acquainted with the selected problem. He may review two types of
literature—the conceptual literature concerning the concepts and theories, and
the empirical literature consisting of studies made earlier which are similar to
the one proposed. The basic outcome of this review will be the knowledge as to
what data and other materials are available for operational purposes which will
enable the researcher to specify his own research problem in a meaningful
context.
f. After this the researcher rephrases the problem into analytical or operational
terms, i.e., to put the problem in as specific terms as possible. This task of
formulating, or defining, a research problem is a step of greatest importance in
the entire research process. The problem to be investigated must be defined
unambiguously, for that will help in discriminating relevant data from irrelevant
ones. Care must, however, be taken to verify the objectivity and validity of the
background facts concerning the problem.
2. Extensive literature survey
a. Once the problem is formulated, a brief summary of it should be written down. It
is compulsory for a research worker writing a thesis for a Ph.D. degree to write
a synopsis of the topic and submit it to the necessary Committee or the Research
Board for approval. At this juncture the researcher should undertake extensive
literature survey connected with the problem.
b. For this purpose, the abstracting and indexing journals and published or
unpublished bibliographies are the first place to go to. Academic journals,
conference proceedings, government reports, books etc., must be tapped
depending on the nature of the problem. In this process, it should be
remembered that one source will lead to another. The earlier studies, if any,
which are similar to the study in hand, should be carefully studied. A good
library will be a great help to the researcher at this stage.
3. Development of working hypotheses
a. After extensive literature survey, researcher should state in clear terms the
working hypothesis or hypotheses. A working hypothesis is a tentative assumption
made in order to draw out and test its logical or empirical consequences. As such
the manner in which research hypotheses are developed is particularly important
since they provide the focal point for research.
b. They also affect the manner in which tests must be conducted in the analysis of
data and indirectly the quality of data which is required for the analysis. In most
types of research, the development of the working hypothesis plays an important
role. The hypothesis should be very specific and limited to the piece of research in
hand because it has to be tested. The role of the hypothesis is to guide the
researcher by delimiting the area of research and to keep him on the right track.
c. It sharpens his thinking and focuses attention on the more important facets of
the problem. It also indicates the type of data required and the type of methods
of data analysis to be used. How does one go about developing working
hypotheses? The answer is by using the following approach:
i. Discussions with colleagues and experts about the problem, its origin and
the objectives in seeking a solution;
ii. Examination of data and records, if available, concerning the problem for
possible trends, peculiarities and other clues;
iii. Review of similar studies in the area or of studies on similar
problems; and
iv. Exploratory personal investigation which involves original field
interviews on a limited scale with interested parties and individuals with
a view to secure greater insight into the practical aspects of the problem.
d. Thus, working hypotheses arise as a result of a priori thinking about the subject,
examination of the available data and material, including related studies, and the
counsel of experts and interested parties. Working hypotheses are more useful
when stated in precise and clearly defined terms. It may as well be remembered
that occasionally we may encounter a problem for which we do not need working
hypotheses, especially in the case of exploratory or formulative researches, which
do not aim at testing the hypothesis. But as a general rule, the specification of
working hypotheses is another basic step of the research process in most
research problems.
4. Preparing the research design
a. The research problem having been formulated in clear cut terms, the researcher
will be required to prepare a research design, i.e., he will have to state the
conceptual structure within which research would be conducted. The
preparation of such a design facilitates research to be as efficient as possible
yielding maximal information. In other words, the function of research design is
to provide for the collection of relevant evidence with minimal expenditure of
effort, time and money. But how all these can be achieved depends mainly on the
research purpose. Research purposes may be grouped into four categories, viz.,
i. Exploration
ii. Description
iii. Diagnosis
iv. Experimentation
b. A flexible research design which provides opportunity for considering many
different aspects of a problem is considered appropriate if the purpose of the
research study is that of exploration. But when the purpose happens to be an
accurate description of a situation or of an association between variables, the
suitable design will be one that minimises bias and maximises the reliability of
the data collected and analysed.

c. There are several research designs, such as, experimental and non-experimental
hypothesis testing. Experimental designs can be either informal designs (such as
before-and-after without control, after-only with control, before-and-after with
control) or formal designs (such as completely randomised design, randomised
block design, Latin square design, simple and complex factorial designs), out of
which the researcher must select one for his own project.
d. The preparation of the research design, appropriate for a particular research
problem, involves usually the consideration of the following:
i. the means of obtaining the information
ii. the availability and skills of the researcher and his staff (if any)
iii. explanation of the way in which selected means of obtaining information
will be organised and the reasoning leading to the selection
iv. the time available for research
v. the cost factor relating to research, i.e., the finance available for the
purpose
5. Determining sample design
a. All the items under consideration in any field of inquiry constitute a ‘universe’ or
‘population’. A complete enumeration of all the items in the ‘population’ is
known as a census inquiry. It can be presumed that in such an inquiry when all
the items are covered no element of chance is left and highest accuracy is
obtained. But in practice this may not be true.
b. Even the slightest element of bias in such an inquiry will get larger and larger as
the number of observations increases. Moreover, there is no way of checking the
element of bias or its extent except through a resurvey or use of sample checks.
Besides, this type of inquiry involves a great deal of time, money and energy.
c. The researcher must decide the way of selecting a sample or what is popularly
known as the sample design. In other words, a sample design is a definite plan
determined before any data are actually collected for obtaining a sample from a
given population. With probability samples each element has a known
probability of being included in the sample but the non-probability samples do
not allow the researcher to determine this probability.
d. Probability samples are those based on simple random sampling, systematic
sampling, stratified sampling and cluster/area sampling, whereas non-probability
samples are those based on convenience sampling, judgment sampling and quota
sampling techniques (a small illustrative sketch follows below).
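
As a minimal sketch of the difference between these designs, the following Python fragment
draws a simple random sample and a proportionally allocated stratified sample from a
hypothetical population of 1,000 units; the population, the strata and the sample size are all
assumptions made for illustration:

    import random

    random.seed(1)  # fixed seed so the illustration is reproducible
    # Hypothetical population: 600 urban and 400 rural units.
    population = [{"id": i, "stratum": "urban" if i < 600 else "rural"}
                  for i in range(1000)]
    n = 50  # desired sample size

    # Simple random sampling: every unit has the same known
    # probability of inclusion, n/N = 50/1000.
    srs = random.sample(population, k=n)

    # Stratified sampling: group units by stratum, then draw from each
    # stratum in proportion to its share of the population.
    strata = {}
    for unit in population:
        strata.setdefault(unit["stratum"], []).append(unit)

    stratified = []
    for units in strata.values():
        k = round(n * len(units) / len(population))  # proportional allocation
        stratified.extend(random.sample(units, k))

    print(len(srs), "units drawn by simple random sampling")
    print(len(stratified), "units drawn by stratified sampling")
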
6. Data collection
a. In dealing with any real-life problem, it is often found that data at hand are
inadequate, and hence, it becomes necessary to collect data that are appropriate.
There are several ways of collecting the appropriate data which differ
considerably in context of money costs, time and other resources at the disposal
of the researcher. Primary data can be collected either through experiment or
through survey. If the researcher conducts an experiment, he observes some
quantitative measurements, or the data, with the help of which he examines the
truth contained in his hypothesis. But in the case of a survey, data can be
collected by any one or more of the following ways:
i. By observation: This method implies the collection of information by
way of the investigator’s own observation, without interviewing the
respondents. The information obtained relates to what is currently
happening and is not complicated by either the past behaviour or the
future intentions or attitudes of respondents. This method is, no doubt,
an expensive one, and the information provided by it is also very
limited. As such, this method is not suitable in inquiries where large
samples are concerned.
ii. Through personal interview: The investigator follows a rigid
procedure and seeks answers to a set of pre-conceived questions through
personal interviews. This method of collecting data is usually carried out
in a structured way where output depends upon the ability of the
interviewer to a large extent.
iii. Through telephone interviews: This method of collecting information
involves contacting the respondents on the telephone itself. This is not a
very widely used method, but it plays an important role in industrial
surveys in developed regions, particularly when the survey has to be
accomplished in a very limited time.
iv. By mailing of questionnaires: The researcher and the respondents do
not come in contact with each other if this method of survey is adopted.
Questionnaires are mailed to the respondents with a request to return
them after completion. It is the most extensively used method in
various economic and business surveys. Before applying this method,
usually a pilot study for testing the questionnaire is conducted, which
reveals the weaknesses, if any, of the questionnaire. The questionnaire
to be used must be prepared very carefully so that it may prove to be
effective in collecting the relevant information.
v. Through schedules: Under this method, enumerators are appointed
and given training. They are provided with schedules containing
relevant questions. These enumerators go to the respondents with these
schedules. Data are collected by the enumerators filling up the schedules
on the basis of the replies given by respondents. Much depends upon the
capability of the enumerators so far as this method is concerned. Some
occasional field checks on the work of the enumerators may ensure
sincere work. The researcher should select one of these methods of
collecting the data taking into consideration the nature of the
investigation, the objective and scope of the inquiry, financial resources,
available time and the desired degree of accuracy. Though he should pay
attention to all these factors, much depends upon the ability and
experience of the researcher.

7. Execution of the project
a. Execution of the project is a very important step in the research process. If the
execution of the project proceeds on correct lines, the data to be collected would
be adequate and dependable. The researcher should see that the project is
executed in a systematic manner and in time. If the survey is to be conducted by
means of structured questionnaires, data can be readily machine-processed. In
such a situation, questions as well as the possible answers may be coded. If the
data are to be collected through interviewers, arrangements should be made for
proper selection and training of the interviewers.
b. The training may be given with the help of instruction manuals which explain
clearly the job of the interviewers at each step. Occasional field checks should be
made to ensure that the interviewers are doing their assigned job sincerely and
efficiently. A careful watch should be kept for unanticipated factors in order to
keep the survey as realistic as possible. This, in other words, means that
steps should be taken to ensure that the survey is under statistical control so
that the collected information is in accordance with the pre-defined standard of
accuracy.
c. If some of the respondents do not cooperate, some suitable methods should be
designed to tackle this problem. One method of dealing with the non-response
problem is to make a list of the non-respondents and take a small sub-sample of
them, and then with the help of experts vigorous efforts can be made for
securing response.
8. Analysis of data
a. After the data have been collected, the researcher turns to the task of analysing
them. The analysis of data requires a number of closely related operations such
as establishment of categories, the application of these categories to raw data
through coding, tabulation and then drawing statistical inferences. The
unwieldy data should necessarily be condensed into a few manageable groups
and tables for further analysis.
b. Thus, researcher should classify the raw data into some purposeful and usable
categories. Coding operation is usually done at this stage through which the
categories of data are transformed into symbols that may be tabulated and
counted. Editing is the procedure that improves the quality of the data for
coding. With coding the stage is ready for tabulation.
c. Tabulation is a part of the technical procedure wherein the classified data are put
in the form of tables. Mechanical devices can be made use of at this juncture.
A great deal of data, especially in large inquiries, is tabulated by computers.
Computers not only save time but also make it possible to study a large number
of variables affecting a problem simultaneously.
d. Analysis work after tabulation is generally based on the computation of various
percentages, coefficients, etc., by applying various well defined statistical
formulae. In the process of analysis, relationships or differences supporting or
conflicting with original or new hypotheses should be subjected to tests of

19 Dr. Ashish Adholiya, Assistant Professor, PIM, PAHER University, Udaipur


significance to determine with what validity data can be said to indicate any
conclusion(s).
e. For instance, if there are two samples of weekly wages, each sample being drawn
from factories in different parts of the same city, giving two different mean
values, then our problem may be whether the two mean values are significantly
different or the difference is just a matter of chance. Through the use of
statistical tests we can establish whether such a difference is a real one or is the
result of random fluctuations. If the difference happens to be real, the inference
will be that the two samples come from different universes and if the difference is
due to chance, the conclusion would be that the two samples belong to the same
universe.
f. Similarly, the technique of analysis of variance can help us in analysing whether
three or more varieties of seeds grown on certain fields yield significantly
different results or not. In brief, the researcher can analyse the collected data
with the help of various statistical measures.
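
The two illustrations above can be made concrete with a minimal Python sketch; the wage and
yield figures below are invented purely to show how such tests are applied:

    from scipy import stats

    # Two-sample t-test: weekly wages from factories in two parts of
    # the same city (invented figures).
    wages_a = [420, 455, 430, 470, 445, 460, 435]
    wages_b = [480, 465, 495, 450, 500, 475, 490]
    t_stat, p_value = stats.ttest_ind(wages_a, wages_b)
    print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
    # A small p-value suggests the two samples come from different
    # universes; a large one, that the difference is a matter of chance.

    # One-way analysis of variance: yields of three seed varieties
    # (invented figures).
    seed_1 = [28, 30, 27, 31]
    seed_2 = [33, 35, 32, 34]
    seed_3 = [29, 28, 30, 27]
    f_stat, p_value = stats.f_oneway(seed_1, seed_2, seed_3)
    print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
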
9. Hypothesis-testing
a. After analysing the data as stated above, the researcher is in a position to test
the hypotheses, if any, he had formulated earlier. Do the facts support the
hypotheses, or do they happen to be contrary? This is the usual question which
should be answered while testing hypotheses. Various tests, such as the chi-square
test, t-test and F-test, have been developed by statisticians for the purpose. The
hypotheses may be tested through the use of one or more of such tests,
depending upon the nature and object of the research inquiry.
b. Hypothesis-testing will result in either accepting the hypothesis or in rejecting
it. If the researcher had no hypotheses to start with, generalisations established
on the basis of data may be stated as hypotheses to be tested by subsequent
researches in times to come.
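For example, a chi-square test of independence on a hypothetical contingency table can be run with scipy as follows:

```python
from scipy.stats import chi2_contingency

# Hypothetical 2x2 contingency table: gender vs. product preference.
observed = [[40, 60],
            [55, 45]]

# Chi-square test of independence: do the facts support the
# hypothesis of an association, or do they happen to be contrary?
chi2, p_value, dof, expected = chi2_contingency(observed)
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
# Reject the hypothesis of independence at the 5% level if p < 0.05.
```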
10. Generalisations and interpretation
a. If a hypothesis is tested and upheld several times, it may be possible for the
researcher to arrive at generalisation, i.e., to build a theory. As a matter of fact,
the real value of research lies in its ability to arrive at certain generalisations. If
the researcher had no hypothesis to start with, he might seek to explain his
findings on the basis of some theory. It is known as interpretation. The process
of interpretation may quite often trigger off new questions which in turn may
lead to further researches.
11. Preparation of the report or the thesis
Finally, the researcher has to prepare the report of what has been done by him. Writing of the report must be done with great care, keeping in view the following:
a. The layout of the report should be as follows:
i. the preliminary pages
ii. the main text
iii. the end matter

In its preliminary pages, the report should carry title and date followed by acknowledgements
and foreword. Then there should be a table of contents followed by a list of tables and list of
graphs and charts, if any, given in the report.

b. The main text of the report should have the following parts:
i. Introduction: It should contain a clear statement of the objective of the
research and an explanation of the methodology adopted in
accomplishing the research. The scope of the study along with various
limitations should as well be stated in this part.
ii. Summary of findings: After introduction there would appear a
statement of findings and recommendations in non-technical language. If
the findings are extensive, they should be summarised.
iii. Main report: The main body of the report should be presented in logical
sequence and broken down into readily identifiable sections.
iv. Conclusion: Towards the end of the main text, researcher should again
put down the results of his research clearly and precisely. In fact, it is the
final summing up.
c. At the end of the report, appendices should be enlisted in respect of all technical
data. Bibliography, i.e., list of books, journals, reports, etc., consulted, should also
be given in the end. Index should also be given specially in a published research
report. Report should be written in a concise and objective style in simple
language avoiding vague expressions such as ‘it seems,’ ‘there may be’, and the
like.
d. Charts and illustrations in the main report should be used only if they present
the information more clearly and forcibly.
e. Calculated ‘confidence limits’ must be mentioned and the various constraints
experienced in conducting research operations may as well be stated.
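As an illustration of reporting confidence limits, here is a sketch computing 95 per cent confidence limits for a mean from hypothetical sample data, using the t distribution since the population variance is unknown:

```python
import numpy as np
from scipy import stats

# Hypothetical sample observations.
sample = np.array([12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 12.2, 11.9])

# 95% confidence limits for the population mean.
mean = sample.mean()
sem = stats.sem(sample)  # standard error of the mean
low, high = stats.t.interval(0.95, len(sample) - 1, loc=mean, scale=sem)
print(f"mean = {mean:.2f}, 95% confidence limits = ({low:.2f}, {high:.2f})")
```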

Types of Research Design

Research Design

The next step after stating the management problem, research purpose, and research
hypotheses and questions, is to formulate a research design. The starting point for the research
design is, in fact, the research questions and hypotheses that have been so carefully developed.
In essence, the research design answers the question: How are we going to get answers to these
research questions and test these hypotheses? The research design is a plan of action indicating
the specific steps that are necessary to provide answers to those questions, test the hypotheses,
and thereby achieve the research purpose that helps choose among the decision alternatives to
solve the management problem or capitalise on the market opportunity.

Meaning of Research Design

The difficult problem that follows the task of defining the research problem is the preparation
of the design of the research project, popularly known as the “research design”. Decisions
regarding what, where, when, how much, by what means concerning an inquiry or a research
study constitute a research design.

“A research design is the arrangement of conditions for collection and analysis of data in a
manner that aims to combine relevance to the research purpose with economy in procedure.” In
fact, the research design is the conceptual structure within which research is conducted; it
constitutes the blueprint for the collection, measurement and analysis of data. As such the
design includes an outline of what the researcher will do from writing the hypothesis and its
operational implications to the final analysis of data. More explicitly, the design decisions
happen to be in respect of:

• What is the study about?
• Why is the study being made?
• Where will the study be carried out?
• What type of data is required?
• Where can the required data be found?
• What periods of time will the study include?
• What will be the sample design?
• What techniques of data collection will be used?
• How will the data be analyzed?
• In what style will the report be prepared?

Keeping in view the above stated design decisions; one may split the overall research design
into the following parts:

• the sampling design which deals with the method of selecting items to be observed for the given study;
• the observational design which relates to the conditions under which the observations are to be made;
• the statistical design which concerns the question of how many items are to be observed and how the information and data gathered are to be analyzed; and
• the operational design which deals with the techniques by which the procedures specified in the sampling, statistical and observational designs can be carried out.


Need for Research Design

• Research design is needed because it facilitates the smooth sailing of the various
research operations, thereby making research as efficient as possible, yielding maximal
information with minimal expenditure of effort, time and money. Just as for better,
economical and attractive construction of a house, we need a blueprint (or what is
commonly called the map of the house) well thought out and prepared by an expert
architect, similarly we need a research design or a plan in advance of data collection and
analysis for our research project.

• Research design stands for advance planning of the methods to be adopted for collecting
the relevant data and the techniques to be used in their analysis, keeping in view the
objective of the research and the availability of staff, time and money. Preparation of the
research design should be done with great care as any error in it may upset the entire
project.
• Research design, in fact, has a great bearing on the reliability of the results arrived at
and as such constitutes the firm foundation of the entire building of the research work.
Even so, the need for a well-thought-out research design is at times not realised by many, and the problem is not given the importance it deserves. As a result, many
researches do not serve the purpose for which they are undertaken. In fact, they may
even give misleading conclusions.
• Thoughtlessness in designing the research project may result in rendering the research exercise useless. It is, therefore, essential that an efficient and appropriate design must
be prepared before starting research operations. The design helps the researcher to
organise his ideas in a form whereby it will be possible for him to look for flaws and
inadequacies. Such a design can even be given to others for their comments and critical
evaluation. In the absence of such a course of action, it will be difficult for the critic to
provide a comprehensive review of the proposed study.

Features of a Good Design

• A good design is often characterised by adjectives such as flexible, appropriate, efficient, economical, and so on. Generally, the design which minimises bias and maximises
the reliability of the data collected and analysed is considered a good design. The design
which gives the smallest experimental error is supposed to be the best design in many
investigations.
• Similarly, a design which yields maximal information and provides an opportunity for
considering many different aspects of a problem is considered most appropriate and
efficient design in respect of many research problems. Thus, the question of good design
is related to the purpose or objective of the research problem and also with the nature of
the problem to be studied. A design may be quite suitable in one case, but may be found
wanting in one respect or the other in the context of some other research problem.
• One single design cannot serve the purpose of all types of research problems. A research
design appropriate for a particular research problem, usually involves the consideration
of the following factors:
o the means of obtaining information
o the availability and skills of the researcher and his staff, if any
o the objective of the problem to be studied
o the nature of the problem to be studied
o the availability of time and money for the research work
• If the research study happens to be an exploratory or a formulative one, wherein the
major emphasis is on discovery of ideas and insights, the research design most
appropriate must be flexible enough to permit the consideration of many different
aspects of a phenomenon. But when the purpose of a study is accurate description of a
situation or of an association between variables (or in what are called the descriptive
studies), accuracy becomes a major consideration and a research design which minimises
bias and maximises the reliability of the evidence collected is considered a good design.
• Studies involving the testing of a hypothesis of a causal relationship between variables
require a design which will permit inferences about causality in addition to the
minimisation of bias and maximisation of reliability. But in practice it is the most
difficult task to put a particular study in a particular group, for a given research may
have in it elements of two or more of the functions of different studies. It is only on the
basis of its primary function that a study can be categorised either as an exploratory or
descriptive or hypothesis-testing study and accordingly the choice of a research design
may be made in case of a particular study.
• Besides, the availability of time, money, skills of the research staff and the means of
obtaining the information must be given due weightage while working out the relevant
details of the research design such as experimental design, survey design, sample design
and the like.

Important Concepts Related to Research Design

Before describing the different research designs, it will be appropriate to explain the various
concepts relating to designs so that these may be better and easily understood.

Dependent and independent variables:

• A concept which can take on different quantitative values is called a variable. As such
the concepts like weight, height, income are all examples of variables. Qualitative
phenomena (or the attributes) are also quantified on the basis of the presence or absence
of the concerning attribute(s). Phenomena which can take on quantitatively different
values even in decimal points are called ‘continuous variables’. But all variables are not
continuous.
• If they can only be expressed in integer values, they are non-continuous variables or in
statistical language ‘discrete variables’. Age is an example of a continuous variable, but the number of children is an example of a non-continuous variable. If one variable
depends upon or is a consequence of the other variable, it is termed as a dependent
variable, and the variable that is antecedent to the dependent variable is termed as an
independent variable.
• For instance, if we say that height depends upon age, then height is a dependent
variable and age is an independent variable. Further, if in addition to being dependent
upon age, height also depends upon the individual’s sex, then height is a dependent
variable and age and sex are independent variables. Similarly, readymade films and
lectures are examples of independent variables, whereas behavioural changes, occurring
as a result of the environmental manipulations, are examples of dependent variables.

Extraneous variable:

• Independent variables that are not related to the purpose of the study, but may affect
the dependent variable are termed as extraneous variables. Suppose the researcher
wants to test the hypothesis that there is a relationship between children’s gains in
social studies achievement and their self-concepts. In this case self-concept is an
independent variable and social studies achievement is a dependent variable.
• Intelligence may as well affect the social studies achievement, but since it is not related
to the purpose of the study undertaken by the researcher, it will be termed as an
extraneous variable. Whatever effect is noticed on dependent variable as a result of
extraneous variable(s) is technically described as an ‘experimental error’. A study must
always be so designed that the effect upon the dependent variable is attributed entirely
to the independent variable(s), and not to some extraneous variable or variables.

Control:

• One important characteristic of a good research design is to minimise the influence or effect of extraneous variable(s). The technical term ‘control’ is used when we design the study minimising the effects of extraneous independent variables. In experimental researches, the term ‘control’ is used to refer to the restraint of experimental conditions.

Confounded relationship:

• When the dependent variable is not free from the influence of extraneous variable(s), the relationship between the dependent and independent variables is said to be confounded by an extraneous variable(s).

Research hypothesis:

• When a prediction or a hypothesised relationship is to be tested by scientific methods, it is termed as a research hypothesis. The research hypothesis is a predictive statement that relates an independent variable to a dependent variable. Usually a research hypothesis must contain, at least, one independent and one dependent variable. Predictive statements which are not to be objectively verified, or relationships that are assumed but not to be tested, are not termed research hypotheses.

Experimental and non-experimental hypothesis-testing research:

• When the purpose of research is to test a research hypothesis, it is termed as hypothesis-testing research. It can be of the experimental design or of the non-experimental design. Research in which the independent variable is manipulated is termed ‘experimental hypothesis-testing research’ and a research in which an independent variable is not manipulated is called ‘non-experimental hypothesis-testing research’.
• For instance, suppose a researcher wants to study whether intelligence affects reading
ability for a group of students and for this purpose he randomly selects 50 students and
tests their intelligence and reading ability by calculating the coefficient of correlation
between the two sets of scores. This is an example of non-experimental hypothesis-
testing research because herein the independent variable, intelligence, is not
manipulated.
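A minimal sketch of this non-experimental case, correlating hypothetical intelligence and reading scores with scipy's pearsonr:

```python
from scipy.stats import pearsonr

# Hypothetical scores for a small group of students.
intelligence  = [95, 110, 102, 120, 88, 105, 130, 98]
reading_score = [58, 66, 61, 74, 52, 63, 80, 60]

# Non-experimental hypothesis-testing: the independent variable
# (intelligence) is measured, not manipulated; we simply compute the
# coefficient of correlation between the two sets of scores.
r, p_value = pearsonr(intelligence, reading_score)
print(f"r = {r:.2f}, p = {p_value:.4f}")
```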
• But now suppose that our researcher randomly selects 50 students from a group of
students who are to take a course in statistics and then divides them into two groups by
randomly assigning 25 to Group A, the usual studies programme, and 25 to Group B,
the special studies programme.
• At the end of the course, he administers a test to each group in order to judge the
effectiveness of the training programme on the student’s performance-level. This is an
example of experimental hypothesis-testing research because in this case the
independent variable, viz., the type of training programme, is manipulated.
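The random assignment step of this experimental case can be sketched as follows, assuming a hypothetical roster of 50 students:

```python
import random

# Hypothetical roster of 50 randomly selected students.
students = [f"student_{i}" for i in range(1, 51)]

# Manipulate the independent variable (type of training programme)
# by randomly assigning 25 students to each treatment.
random.shuffle(students)
group_a = students[:25]   # usual studies programme (control)
group_b = students[25:]   # special studies programme (experimental)
print(len(group_a), len(group_b))
```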

Experimental and control groups:

• In an experimental hypothesis-testing research when a group is exposed to usual conditions, it is termed a ‘control group’, but when the group is exposed to some novel or special condition, it is termed an ‘experimental group’. In the above illustration, the Group A can be called a control group and the Group B an experimental group. If both groups A and B are exposed to special studies programmes, then both groups would be termed ‘experimental groups’.
• It is possible to design studies which include only experimental groups or studies which
include both experimental and control groups.

Treatments:

• The different conditions under which experimental and control groups are put are
usually referred to as ‘treatments’. In the illustration taken above, the two treatments
are the usual studies programme and the special studies programme. Similarly, if we
want to determine through an experiment the comparative impact of three varieties of
fertilisers on the yield of wheat, in that case the three varieties of fertilisers will be
treated as three treatments.

Experiment:

• The process of examining the truth of a statistical hypothesis, relating to some research
problem, is known as an experiment. For example, we can conduct an experiment to
examine the usefulness of a certain newly developed drug. Experiments can be of two
types, viz., absolute experiment and comparative experiment.
• If we want to determine the impact of a fertiliser on the yield of a crop, it is a case of
absolute experiment; but if we want to determine the impact of one fertiliser as
compared to the impact of some other fertiliser, our experiment then will be termed as a
comparative experiment. Often, we undertake comparative experiments when we talk of
designs of experiments.

Experimental unit(s):

• The pre-determined plots or the blocks, where different treatments are used, are known
as experimental units. Such experimental units must be selected (defined) very carefully.

Types of Research Design:

There are different types of research designs. They may be broadly categorized as:

(1) Exploratory Research Design;
(2) Descriptive and Diagnostic Research Design; and
(3) Hypothesis-Testing Research Design.

1. Exploratory Research Design:

The Exploratory Research Design is also known as formulative research design. The main objective
of using such a research design is to formulate a research problem for an in-depth or more
precise investigation, or for developing a working hypothesis from an operational aspect. The
major purpose of such studies is the discovery of ideas and insights. Therefore, the research design suitable for such a study should be flexible enough to provide opportunity for
considering different dimensions of the problem under study. The in-built flexibility in research
design is required as the initial research problem would be transformed into a more precise one
in the exploratory study, which in turn may necessitate changes in the research procedure for
collecting relevant data. Usually, the following three methods are considered in the context of a
research design for such studies. They are (a) a survey of related literature; (b) experience
survey; and (c) analysis of ‘insight-stimulating’ instances.

2. Descriptive and Diagnostic Research Design:

A Descriptive Research Design is concerned with describing the characteristics of a particular individual or a group. Meanwhile, a diagnostic research design determines the frequency with
which a variable occurs or its relationship with another variable. In other words, the study
analyzing whether a certain variable is associated with another comprises a diagnostic research
study. On the other hand, studies that are concerned with specific predictions or with the narration of facts and characteristics related to an individual, group or situation are instances of descriptive research studies. Generally, most social research designs fall under this category. As a research design, both the descriptive and diagnostic studies share common requirements, hence they are grouped together. However, the procedures to be used and the research design need to be planned carefully. The research design must also make appropriate
provision for protection against bias and thus maximize reliability, with due regard to the
completion of the research study in an economical manner. The research design in such studies
should be rigid and not flexible. Besides, it must also focus attention on the following:

a) Formulation of the objectives of the study,
b) Proper designing of the methods of data collection,
c) Sample selection,
d) Data collection,
e) Processing and analysis of the collected data, and
f) Reporting the findings.

3. Hypothesis-Testing Research Design:

Hypothesis-Testing Research Designs are those in which the researcher tests the hypothesis of
causal relationship between two or more variables. These studies require procedures that would
not only decrease bias and enhance reliability, but also facilitate deriving inferences about the
causality. Generally, experiments satisfy such requirements. Hence, when research design is
discussed in such studies, it often refers to the design of experiments.

4. Experimental Designs:

Professor Fisher has enumerated three principles of experimental designs:

• According to the Principle of Replication, the experiment should be repeated more than
once. Thus, each treatment is applied in many experimental units instead of one. By
doing so, the statistical accuracy of the experiments is increased. For example, suppose
we are to examine the effect of two varieties of rice. For this purpose, we may divide the
field into two parts and grow one variety in one part and the other variety in the other
part. We can then compare the yield of the two parts and draw conclusion on that basis.
• But if we are to apply the principle of replication to this experiment, then we first divide
the field into several parts, grow one variety in half of these parts and the other variety
in the remaining parts. We can then collect the data of yield of the two varieties and
draw conclusion by comparing the same. The result so obtained will be more reliable in
comparison to the conclusion we draw without applying the principle of replication.
• The entire experiment can even be repeated several times for better results.
Conceptually replication does not present any difficulty, but computationally it does.
For example, if an experiment requiring a two-way analysis of variance is replicated, it
will then require a three-way analysis of variance since replication itself may be a source
of variation in the data.
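The gain in precision from replication can be illustrated with a small simulation on a hypothetical yield distribution; the standard error of the estimated treatment mean falls roughly as one over the square root of the number of replicates:

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated plot yields for one rice variety (true mean 30, s.d. 4).
# More replicates per treatment -> smaller standard error of the mean.
for n in (2, 8, 32):
    replicates = rng.normal(loc=30, scale=4, size=(10_000, n))
    se = replicates.mean(axis=1).std()
    print(f"n = {n:>2} replicates per treatment -> s.e. of mean ~ {se:.2f}")
```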
• However, it should be remembered that replication is introduced in order to increase the
precision of a study; that is to say, to increase the accuracy with which the main effects
and interactions can be estimated. The Principle of Randomisation provides protection,
when we conduct an experiment, against the effect of extraneous factors by
randomisation.
• In other words, this principle indicates that we should design or plan the experiment in
such a way that the variations caused by extraneous factors can all be combined under
the general heading of “chance.” For instance, if we grow one variety of rice, say, in the
first half of the parts of a field and the other variety is grown in the other half, then it is
just possible that the soil fertility may be different in the first half in comparison to the
other half.
• If this is so, our results would not be realistic. In such a situation, we may assign the
variety of rice to be grown in different parts of the field on the basis of some random
sampling technique, i.e., we may apply randomisation principle and protect ourselves
against the effects of the extraneous factors (soil fertility differences in the given case).
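A minimal sketch of such random assignment of varieties to field plots, using hypothetical plot labels:

```python
import random

# Sixteen field plots (hypothetical labels).
plots = [f"plot_{i}" for i in range(1, 17)]

# Randomise: assign each variety to 8 plots at random, so that soil
# fertility differences are swept into "chance" rather than biasing
# the comparison between varieties.
random.shuffle(plots)
assignment = {plot: ("variety_A" if i < 8 else "variety_B")
              for i, plot in enumerate(plots)}
print(assignment)
```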
• The Principle of Local Control is another important principle of experimental designs.
Under it the extraneous factor, the known source of variability, is made to vary
deliberately over as wide a range as necessary and this need to be done in such a way
that the variability it causes can be measured and hence eliminated from the
experimental error.
• This means that we should plan the experiment in a manner that we can perform a two-
way analysis of variance, in which the total variability of the data is divided into three
components attributed to treatments (varieties of rice in our case), the extraneous factor
(soil fertility in our case) and experimental error.
• In other words, according to the principle of local control, we first divide the field into
several homogeneous parts, known as blocks, and then each such block is divided into
parts equal to the number of treatments. Then the treatments are randomly assigned to
these parts of a block. Dividing the field into several homogenous parts is known as
‘blocking’.
• In general, blocks are the levels at which we hold an extraneous factor fixed, so that we
can measure its contribution to the total variability of the data by means of a two-way
analysis of variance. In brief, through the principle of local control we can eliminate the
variability due to extraneous factor(s) from the experimental error.
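A sketch of this randomised-block analysis on hypothetical yield data, using the statsmodels library for the two-way analysis of variance:

```python
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Hypothetical randomised block data: 3 blocks x 2 rice varieties.
data = pd.DataFrame({
    "block":   ["b1", "b1", "b2", "b2", "b3", "b3"],
    "variety": ["A",  "B",  "A",  "B",  "A",  "B"],
    "yield_q": [30.1, 27.5, 28.4, 25.9, 32.0, 29.3],
})

# Two-way analysis of variance: total variability is split into
# treatments (variety), the extraneous factor (block, i.e. soil
# fertility) and experimental error.
model = smf.ols("yield_q ~ C(variety) + C(block)", data=data).fit()
print(sm.stats.anova_lm(model, typ=2))
```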

Importance of Research Design:

The need for a research design arises out of the fact that it facilitates the smooth conduct of the
various stages of research. It contributes to making research as efficient as possible, thus
yielding the maximum information with minimum effort, time and expenditure. A research
design helps to plan in advance, the methods to be employed for collecting the relevant data
and the techniques to be adopted for their analysis. This would help in pursuing the objectives
of the research in the best possible manner, provided the available staff, time and money are
given. Hence, the research design should be prepared with utmost care, so as to avoid any error
that may disturb the entire project. Thus, research design plays a crucial role in attaining the
reliability of the results obtained, which forms the strong foundation of the entire process of the
research work.

Despite its significance, the purpose of a well-planned design is not realized at times. This is
because it is not given the importance that it deserves. As a consequence, many researchers are
not able to achieve the purpose for which the research designs are formulated, due to which
they end up arriving at misleading conclusions. Therefore, faulty designing of the research
project tends to render the research exercise meaningless. This makes it imperative that an
efficient and suitable research design must be planned before commencing the process of
research. The research design helps the researcher to organize his/her ideas in a proper form,
which in turn facilitates him/her to identify the inadequacies and faults in them. The research
design is also discussed with other experts for their comments and critical evaluation, without
which it would be difficult for any critic to provide a comprehensive review and comments on
the proposed study.

Unit – 2
Meaning and Need for Data

• Data is required to make a decision in any business situation. The researcher is faced
with one of the most difficult problems of obtaining suitable, accurate and adequate data.
Utmost care must be exercised while collecting data because the quality of the research
results depends upon the reliability of the data.
• Suppose you are the Director of your company. Your Board of Directors has asked you
to find out why the profit of the company has decreased since the last two years. Your
Board wants you to present facts and figures. What are you going to do?
• The first and foremost task is to collect the relevant information to make an analysis for
the above-mentioned problem. The information collected from various sources for a specific purpose, which can be expressed in quantitative form, is therefore called data. The rational decision maker seeks to evaluate information in order to select
the course of action that maximises objectives.
• For decision making, the input data must be appropriate. This depends on the
appropriateness of the method chosen for data collection. The application of a statistical technique is possible when the questions are answerable in quantitative terms, for instance, the cost of production and profit of the company measured in rupees, or the age of the
workers in the company measured in years. Therefore, the first step in statistical
activities is to gather data. The data may be classified as primary and secondary data.
Let us now discuss these two kinds of data in detail.

Source of Data
Data sources can be broadly categorized into three types.

1. Primary data Sources: Primary data refers to information gathered first hand by the
researcher for the specific purpose of the study. It is raw data without interpretation and
represents the personal or official opinion or position. Primary sources are the most authoritative since the information is not filtered or tampered with. Some examples of the
sources of primary data are individuals, focus groups, panel of respondents, internet etc.
Data collection from individuals can be made through interviews, observation etc.
2. Secondary Data Sources: Secondary data are data that have already been collected and processed by some other agency, for example census reports, government publications, hospital and official records, company reports, books, journals and databases. Such data are often available only at an aggregate level and may not provide the spatial or micro-level detail a particular study requires, so their suitability must be checked against the needs of the study.
3. Tertiary Sources: Tertiary sources are an interpretation of a secondary source. It is
generally represented by index, bibliographies, dictionaries, encyclopaedias, handbooks,
directories and other finding aids like the internet search engines.

Primary and Secondary Data

• The Primary data are original data which are collected for the first time for a specific
purpose. Such data are published by authorities who themselves are responsible for their
collection.
• The Secondary data on the other hand, are those which have already been collected by
some other agency and which have already been processed. Secondary data may be
available in the form of published or unpublished sources.
• For instance, population census data collected by the Government in a country is
primary data for that Government. But the same data becomes secondary for those
researchers who use it later. In case you have decided to collect primary data for your
investigation, you have to identify the sources from where you can collect that data.
• For example, if you wish to study the problems of the workers of X Company Ltd., then
the workers who are working in that company are the source. On the other hand, if you
have decided to use secondary data, you have to identify the secondary source that has
already collected the related data for their study purpose.
• With the above discussion, we can understand that the difference between primary and
secondary data is only a matter of degree. That is, data which are primary in the hands of one become secondary in the hands of another.

Methods and Techniques of Data Collection

• The task of data collection begins after a research problem has been defined and
research design/ plan chalked out. While deciding about the method of data collection
to be used for the study, the researcher should keep in mind two types of data, viz.,
primary and secondary.
• The primary data are those which are collected afresh and for the first time, and thus
happen to be original in character. The secondary data, on the other hand, are those
which have already been collected by someone else and which have already been passed
through the statistical process.
• The researcher would have to decide which sort of data he would be using (thus
collecting) for his study and accordingly he will have to select one or the other method
of data collection. The methods of collecting primary and secondary data differ since
primary data are to be originally collected, while in case of secondary data the nature of
data collection work is merely that of compilation.

Collection of Primary Data

• Primary data are collected during the course of experiments in experimental research; but where the research is of the descriptive type and surveys are performed, whether sample surveys or census surveys, primary data can be obtained either through observation, through direct communication with respondents in one form or another, or through personal interviews.
• This, in other words, means that there are several methods of collecting primary data,
particularly in surveys and descriptive researches. Important ones are:

• Observation method
• Interview method
• Through questionnaires
• Through schedules
• Other methods which include
- warranty cards
- distributor audits
- pantry audits
- consumer panels
- using mechanical devices
- through projective techniques
- depth interviews
- content analysis

Observation Method

• The observation method is the most commonly used method especially in studies
relating to behavioural sciences. In a way we all observe things around us, but this sort
of observation is not scientific observation. Observation becomes a scientific tool and
the method of data collection for the researcher, when it serves a formulated research
purpose, is systematically planned and recorded and is subjected to checks and controls
on validity and reliability.
• Under the observation method, the information is sought by way of investigator’s own
direct observation without asking from the respondent. For instance, in a study relating
to consumer behaviour, the investigator instead of asking the brand of wrist watch used
by the respondent, may himself look at the watch. The main advantage of this method is
that subjective bias is eliminated, if observation is done accurately.
• Secondly, the information obtained under this method relates to what is currently
happening; it is not complicated by either the past behaviour or future intentions or
attitudes.
• Thirdly, this method is independent of respondents’ willingness to respond and as such
is relatively less demanding of active cooperation on the part of respondents as happens
to be the case in the interview or the questionnaire method. This method is particularly
suitable in studies which deal with subjects (i.e., respondents) who are not capable of
giving verbal reports of their feelings for one reason or the other.
• However, observation method has various limitations. Firstly, it is an expensive
method. Secondly, the information provided by this method is very limited. Thirdly,
sometimes unforeseen factors may interfere with the observational task.
• At times, the fact that some people are rarely accessible to direct observation creates
obstacle for this method to collect data effectively. While using this method, the
researcher should keep in mind things like:
o What should be observed?
o How should the observations be recorded?
o How can the accuracy of observation be ensured?
• In case the observation is characterised by a careful definition of the units to be
observed, the style of recording the observed information, standardised conditions of
observation and the selection of pertinent data of observation, then the observation is
called as structured observation. But when observation is to take place without these
characteristics to be thought of in advance, the same is termed as unstructured
observation.
• Structured observation is considered appropriate in descriptive studies, whereas in an
exploratory study the observational procedure is most likely to be relatively
unstructured. We often talk about participant and non-participant types of observation
in the context of studies, particularly of social sciences.
• This distinction depends upon the observer’s sharing or not sharing the life of the group
he is observing. If the observer observes by making himself, more or less, a member of
the group he is observing so that he can experience what the members of the group
experience, the observation is called as the participant observation. But when the
observer observes as a detached emissary without any attempt on his part to experience
through participation what others feel, the observation of this type is often termed as
non-participant observation. When the observer is observing in such a manner that his presence may be unknown to the people he is observing, such an observation is described as disguised observation.
• There are several merits of the participant type of observation:
o The researcher is enabled to record the natural behaviour of the group.
o The researcher can even gather information which could not easily be obtained if
he observes in a disinterested fashion.
o The researcher can even verify the truth of statements made by informants in
the context of a questionnaire or a schedule.
• But there are also certain demerits of this type of observation, viz., the observer may
lose the objectivity to the extent he participates emotionally; the problem of
observation-control is not solved; and it may narrow-down the researcher’s range of
experience.
• Sometimes, we talk of controlled and uncontrolled observation. If the observation takes
place in the natural setting, it may be termed as uncontrolled observation, but when
observation takes place according to definite pre-arranged plans, involving
experimental procedure, the same is then termed controlled observation.
• In non-controlled observation, no attempt is made to use precision instruments. The
major aim of this type of observation is to get a spontaneous picture of life and persons.
It has a tendency to supply naturalness and completeness of behaviour, allowing
sufficient time for observing it.
• But in controlled observation, we use mechanical (or precision) instruments as aids to
accuracy and standardisation. Such observation has a tendency to supply formalised data
upon which generalisations can be built with some degree of assurance.
• The main pitfall of non-controlled observation is that of subjective interpretation. There
is also the danger of having the feeling that we know more about the observed
phenomena than we actually do. Generally, controlled observation takes place in various
experiments that are carried out in a laboratory or under controlled conditions, whereas
uncontrolled observation is resorted to in case of exploratory researches.

Kinds of Observation

Observation can be classified into various types according to the method used and the type of
control exercised. Following are the chief types of observation.

• Participant observation: The observation may be participant or non-participant. When the observer participates in the activities of the group under study, it is known as participant observation. Thus, a participant observer makes himself part of the group under study. He need not necessarily carry out all the activities carried out by other members of the group, but his presence as an active member of the group is necessary.
• Non-participant observation: When the observer does not actually participate in the activities of the group, but simply observes them from a distance, it is known as non-participant observation. Purely non-participant observation is extremely difficult.
• Non-controlled observation: When the observation is made in the natural surroundings and the activities are performed in their usual course, without being influenced or guided by any external force, it is known as non-controlled observation. Non-controlled observation is generally not very reliable. The observation itself may be
biased and coloured by the views of the observer, because there is no check upon him.
Various observers may observe the same thing differently and draw different
conclusions. The greatest difficulty is that the observer may be so overpowered by
uncontrolled and stray events that he may regard them to be absolutely true while they
are far from being so.
• Controlled observation: Controlled observation affords greater precision and objectivity, and the observation can be repeated under identical conditions. The main purpose
of a controlled observation is, thus, to check any bias due to faulty perception,
inaccurate data and influence of outside factors on the particular incident.

Interview Method

The interview method of collecting data involves presentation of oral-verbal stimuli and reply
in terms of oral-verbal responses. This method can be used through personal interviews and, if
possible, through telephone interviews.

Personal interviews

• Personal interview method requires a person known as the interviewer asking questions
generally in a face-to-face contact to the other person or persons. (At times the
interviewee may also ask certain questions and the interviewer responds to these, but
usually the interviewer initiates the interview and collects the information.)
• This sort of interview may be in the form of direct personal investigation or it may be
indirect oral investigation. In the case of direct personal investigation the interviewer
has to collect the information personally from the sources concerned. He has to be on
the spot and has to meet people from whom data have to be collected.
• This method is particularly suitable for intensive investigations. But in certain cases it
may not be possible or worthwhile to contact directly the persons concerned or on
account of the extensive scope of enquiry, the direct personal investigation technique
may not be used. In such cases an indirect oral examination can be conducted under
which the interviewer has to cross-examine other persons who are supposed to have
knowledge about the problem under investigation and the information, obtained is
recorded.
• Most of the commissions and committees appointed by government to carry on
investigations make use of this method. The method of collecting information through
personal interviews is usually carried out in a structured way. As such we call the
interviews as structured interviews. Such interviews involve the use of a set of
predetermined questions and of highly standardised techniques of recording.
• Thus, the interviewer in a structured interview follows a rigid procedure laid down,
asking questions in a form and order prescribed. As against it, the unstructured
interviews are characterised by a flexibility of approach to questioning. Unstructured
interviews do not follow a system of pre-determined questions and standardised
techniques of recording information.
• In a non-structured interview, the interviewer is allowed much greater freedom to ask,
in case of need, supplementary questions or at times he may omit certain questions if the
situation so requires. He may even change the sequence of questions. He has relatively
greater freedom while recording the responses to include some aspects and exclude
others.
• But this sort of flexibility results in lack of comparability of one interview with another
and the analysis of unstructured responses becomes much more difficult and time-
consuming than that of the structured responses obtained in case of structured
interviews.
• Unstructured interviews also demand deep knowledge and greater skill on the part of
the interviewer. Unstructured interview, however, happens to be the central technique
of collecting information in case of exploratory or formulative research studies. But in
case of descriptive studies, we quite often use the technique of structured interview
because of its being more economical, providing a safe basis for generalisation and
requiring relatively lesser skill on the part of the interviewer.
• We may as well talk about focused interview, clinical interview and the non-directive
interview. Focused interview is meant to focus attention on the given experience of the
respondent and its effects. Under it the interviewer has the freedom to decide the
manner and sequence in which the questions would be asked and has also the freedom to
explore reasons and motives.
• The main task of the interviewer in case of a focused interview is to confine the
respondent to a discussion of issues with which he seeks conversance. Such interviews
are used generally in the development of hypotheses and constitute a major type of
unstructured interviews. The clinical interview is concerned with broad underlying
feelings or motivations or with the course of individual’s life experience.
• The method of eliciting information under it is generally left to the interviewer’s
discretion. In case of non-directive interview, the interviewer’s function is simply to
encourage the respondent to talk about the given topic with a bare minimum of direct
questioning. The interviewer often acts as a catalyst to a comprehensive expression of
the respondents’ feelings and beliefs and of the frame of reference within which such
feelings and beliefs take on personal significance.
• Despite the variations in interview-techniques, the major advantages and weaknesses of
personal interviews can be enumerated in a general way. The chief merits of the
interview method are as follows:
o More information and that too in greater depth can be obtained.
o Interviewer by his own skill can overcome the resistance, if any, of the
respondents; the interview method can be made to yield an almost perfect sample
of the general population.
o There is greater flexibility under this method as the opportunity to restructure
questions is always there, especially in case of unstructured interviews.
o Observation method can as well be applied to recording verbal answers to various questions.
o Personal information can as well be obtained easily under this method.
o Samples can be controlled more effectively as there arises no difficulty of the
missing returns; non-response generally remains very low.
o The interviewer can usually control which person(s) will answer the questions.
This is not possible in mailed questionnaire approach. If so desired, group
discussions may also be held.
o The interviewer may catch the informant off-guard and thus may secure the
most spontaneous reactions than would be the case if mailed questionnaire is
used.
o The language of the interview can be adapted to the ability or educational level
of the person interviewed and as such misinterpretations concerning questions
can be avoided.
o The interviewer can collect supplementary information about the respondent’s
personal characteristics and environment which is often of great value in
interpreting results. But there are also certain weaknesses of the interview
method. Among the important weaknesses, mention may be made of the
following:
 It is a very expensive method, especially when large and widely spread
geographical sample is taken.
 There remains the possibility of the bias of interviewer as well as that of
the respondent; there also remains the headache of supervision and
control of interviewers.
 Certain types of respondents such as important officials or executives or
people in high income groups may not be easily approachable under this
method and to that extent the data may prove inadequate.

 This method is relatively more time-consuming, especially when the sample is large and recalls upon the respondents are necessary.
 The presence of the interviewer on the spot may over-stimulate the
respondent, sometimes even to the extent that he may give imaginary
information just to make the interview interesting.
 Under the interview method the organisation required for selecting,
training and supervising the field-staff is more complex with formidable
problems.
 Interviewing at times may also introduce systematic errors.
 Effective interview presupposes proper rapport with respondents that
would facilitate free and frank responses. This is often a very difficult
requirement.

Pre-requisites and basic tenets of interviewing:

• For successful implementation of the interview method, interviewers should be carefully selected, trained and briefed. They should be honest, sincere, hardworking and impartial, and must possess the technical competence and necessary practical experience.
• Occasional field checks should be made to ensure that interviewers are neither cheating,
nor deviating from instructions given to them for performing their job efficiently. In
addition, some provision should also be made in advance so that appropriate action may
be taken if some of the selected respondents refuse to cooperate or are not available
when an interviewer calls upon them.
• In fact, interviewing is an art governed by certain scientific principles. Every effort
should be made to create friendly atmosphere of trust and confidence, so that
respondents may feel at ease while talking to and discussing with the interviewer. The
interviewer must ask questions properly and intelligently and must record the
responses accurately and completely. At the same time, the interviewer must answer
legitimate question(s), if any, asked by the respondent and must clear any doubt that the
latter has.
• The interviewer's approach must be friendly, courteous, conversational and unbiased.
The interviewer should not show surprise or disapproval of a respondent’s answer but
he must keep the direction of interview in his own hand, discouraging irrelevant
conversation and must make all possible effort to keep the respondent on the track.

Telephone Interviews

This method of collecting information consists in contacting respondents on the telephone itself. It is not a very widely used method, but it plays an important part in industrial surveys, particularly in developed regions. The chief merits of such a system are:

• It is more flexible in comparison to the mailing method.
• It is faster than other methods, i.e., a quick way of obtaining information.
• It is cheaper than the personal interviewing method; here the cost per response is relatively
low.

• Recall is easy; call backs are simple and economical.
• There is a higher rate of response than what we have in the mailing method; the non-response is generally very low.
• Replies can be recorded without causing embarrassment to respondents.
• Interviewer can explain requirements more easily.
• At times, access can be gained to respondents who otherwise cannot be contacted for
one reason or the other.
• No field staff is required.
• Representative and wider distribution of the sample is possible.

But this system of collecting information is not free from demerits. Some of these may be
highlighted.

• Little time is given to respondents for considered answers; interview period is not likely
to exceed five minutes in most cases.
• Surveys are restricted to respondents who have telephone facilities.
• Extensive geographical coverage may get restricted by cost considerations.
• It is not suitable for intensive surveys where comprehensive answers are required to
various questions.
• Possibility of the bias of the interviewer is relatively more.
• Questions have to be short and to the point; probes are difficult to handle.

Types of Interviews:

Interviews are generally of the following types:

• Structured Interview: It is also known as a controlled, guided or direct interview. In this kind of interview a complete schedule is used. The interviewer is asked to get the answers to those questions only. He generally does not add anything from his own side. The language, too, is not changed. He can only interpret or amplify the statement wherever necessary.
• Unstructured Interview: It is known as an uncontrolled, unguided or undirected interview. No direct or predetermined questions are used in this type of interview. The field worker may be told certain broad topics upon which the information is to be collected. It is generally held in the form of a free discussion or story-type narrative. The subject is asked to narrate the incidents of his life, his own feelings and reactions, and the researcher has to draw his own conclusions from them.
• Focused Interview: As the name suggests, its main focus is on the social and psychological effects of mass communication, for example, the reactions to a film show or radio programmes. Merton has given the following characteristics of a focused interview: the interviewee is known to have been involved in a particular concrete situation; the situation under study is one that has already been analysed prior to the interview; the field worker tries to focus his attention on the particular aspects of the problem, and
tries to know his experiences, attitudes and emotional response regarding the concrete
situation under study.
• Repetitive Interview: An interview is repetitive in nature when it is desired to note the gradual influence of some social or psychological process. There are some social changes that have a far-reaching influence upon the people, and it is sometimes desired to know the effect of such factors in time sequence. Thus, for example, suppose a village is linked by some road connecting it with the city. Naturally, this will have its own influence upon the life of the people. The influence, of course, would not be sudden. There will be a gradual change in the economic status, standard of living, attitudes, opinions and inter-relationships of the people. In order to study this influence in time sequence, a study has to be conducted at regular intervals to mark the gradual change taking place.

Panel Research

• Panel research is a method for collecting data repeatedly, from a pre-recruited set of
people. These individuals generally provide demographic, household and behavioral
data, which can make conducting future studies easier. Technology, primarily the
internet, has transformed panel research methodology through the ease with which we can
access larger numbers of respondents. Panel research provides many advantages for
companies including faster turnaround, higher participation rates, and cost savings. The
quantitative data can provide companies with insights into pricing, effectiveness and
sales projection of their products or brand.
• Building a quality research panel is very important because your data depends on it.
With a research panel, you are able to build rich profiles of your members, which will
help to ensure that your reporting provides quality responses. If you are looking to
launch a new mobile phone, you would want to target panelists who are interested in
mobile phones and technology to yield more informed responses. Additionally, a well-
managed panel of pre-recruited respondents allows for a faster response rate, as the
participants have shown interest in participating in surveys by joining the panel.
• Advantages of Panel Research:
o The rate of research response is amplified as panel members have willingly
signed up to participate in the research process.
o Different aspects of a particular subject can be discussed with panel members,
unlike other research methods where a single topic needs to be discussed at a
time. This makes panel research effective and less expensive.
o A panel consisting of a sizeable number of participants makes it easy for
marketers to record behavioral changes across demographics due to the diversity
of panel members.
o Better details are captured in panel research insights, as panel members have a more sophisticated understanding of the research subject, since they are profiled, screened and validated during recruitment.
o Qualitative market research methods such as focus groups, discussions and online interviews can be far more effective if they are conducted with a well-recruited panel.

o Quantitative market research can be conducted to gather data and metrics-based inputs for survey research by sending out online surveys and online polls to a panel.
• Disadvantages of Panel Research:
o There are cases where certain panel members may not intend to respond honestly, as they register for every panel they come across merely for the perks. Evaluate your panel for authenticity at regular intervals and remove members whose responses appear suspect.
o Over the course of time, response rates of tenured members who have been part of the panel for an extended time frame may decrease.
o Frequent panel management is required to deal with problems regarding
attrition.

Questionnaire

This method of data collection is quite popular, particularly in case of big enquiries. It is being
adopted by private individuals, research workers, private and public organisations and even by
governments. In this method a questionnaire is sent (usually by post) to the persons concerned
with a request to answer the questions and return the questionnaire. A questionnaire consists
of a number of questions printed or typed in a definite order on a form or set of forms. The
questionnaire is mailed to respondents who are expected to read and understand the questions
and write down the reply in the space meant for the purpose in the questionnaire itself. The
respondents have to answer the questions on their own.

Merits and Demerits of a Questionnaire

The method of collecting data by mailing the questionnaires to respondents is most extensively
employed in various economic and business surveys. The merits claimed on behalf of this
method are as follows:

• There is low cost even when the universe is large and is widely spread geographically.
• It is free from the bias of the interviewer; answers are in respondents’ own words.
• Respondents have adequate time to give well thought out answers.
• Respondents, who are not easily approachable, can also be reached conveniently.
• Large samples can be made use of and thus the results can be made more dependable
and reliable.

The main demerits of this system can also be listed here:

• Low rate of return of the duly filled-in questionnaires; bias due to non-response is often indeterminate.
• It can be used only when respondents are educated and cooperative.
• The control over questionnaire may be lost once it is sent.
• There is inbuilt inflexibility because of the difficulty of amending the approach once
questionnaires have been dispatched.

• There is also the possibility of ambiguous replies or omission of replies altogether to
certain questions; interpretation of omissions is difficult.
• It is difficult to know whether willing respondents are truly representative.
• This method is likely to be the slowest of all.

Before using this method, it is always advisable to conduct a 'pilot study' (pilot survey) for testing the questionnaires. In a big enquiry the significance of the pilot survey is felt very much. A pilot survey is in fact a replica and rehearsal of the main survey. Such a survey, being conducted by experts, brings to light the weaknesses (if any) of the questionnaires and also of the survey techniques. From the experience gained in this way, improvements can be effected.

Essentials of a Good Questionnaire

• To be successful, a questionnaire should be comparatively short and simple, i.e., the size of the questionnaire should be kept to the minimum. Questions should proceed in logical
sequence moving from easy to more difficult questions. Personal and intimate questions
should be left to the end.
• Technical terms and vague expressions capable of different interpretations should be
avoided in a questionnaire. Questions may be dichotomous (yes or no answers), multiple
choice (alternative answers listed) or open-ended.
• The latter type of questions is often difficult to analyse and hence should be avoided in a
questionnaire to the extent possible. There should be some control questions in the
questionnaire which indicate the reliability of the respondent.
• For instance, a question designed to determine the consumption of a particular material may be asked first in terms of financial expenditure and later in terms of weight. The control questions thus introduce a cross-check to see whether the information collected is correct or not (a small numerical sketch of such a cross-check follows this list). Questions affecting the sentiments of respondents should be avoided.
Adequate space for answers should be provided in the questionnaire to help editing and
tabulation. There should always be provision for indications of uncertainty, for example, "do not know," "no preference" and so on. Brief directions with regard to filling up the
questionnaire should invariably be given in the questionnaire itself.
• Finally, the physical appearance of the questionnaire affects the cooperation the
researcher receives from the recipients and as such an attractive looking questionnaire,
particularly in mail surveys, is a plus point for enlisting cooperation. The quality of the
paper, along with its colour, must be good so that it may attract the attention of
recipients.
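
A minimal Python sketch of the expenditure-versus-weight cross-check mentioned in the list above; all figures and the unit price are invented assumptions, for illustration only:

# Cross-checking a control question: reported spending vs reported weight.
# All figures are invented; the unit price is an assumption.
PRICE_PER_KG = 50.0          # assumed market price of the material

reported_spend = 260.0       # respondent's answer in money terms
reported_weight = 5.0        # respondent's answer in kilograms

implied_spend = reported_weight * PRICE_PER_KG
# Allow a tolerance, since respondents answer from memory.
if abs(implied_spend - reported_spend) <= 0.15 * implied_spend:
    print("Answers are consistent.")
else:
    print("Answers disagree; the reply may be unreliable.")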

Functions of Questionnaire

A questionnaire serves five functions:

• Give the respondent clear comprehension of the question.
• Induce the respondent to cooperate and to trust that answers will be treated confidentially.

• Stimulate response through greater introspection, plumbing of memory or reference to
records.
• Give instructions on what is wanted and the manner of responding.
• Identify what needs to be known in order to classify and verify the interview.

Paper-and-pencil questionnaires can be sent to a large number of people and save the researcher time and money. People are more truthful when responding to questionnaires, particularly regarding controversial issues, because their responses are anonymous. But questionnaires also have drawbacks. The majority of people who receive questionnaires don't return them, and those who do might not be representative of the originally selected sample.

Web-based questionnaires: A new and rapidly growing methodology is the use of Internet-based research. This would mean receiving an e-mail in which you would click on an address that would take you to a secure web-site to fill in a questionnaire. This type of research is often quicker and less detailed. Some disadvantages of this method include the exclusion of people who do not have a computer or are unable to access one. Also, the validity of such surveys is in question, as people might be in a hurry to complete them and so might not give accurate responses.

Questionnaires often make use of checklists and rating scales. These devices help simplify and quantify people's behaviors and attitudes. A checklist is a list of behaviors, characteristics, or other entities that the researcher is looking for. Either the researcher or the survey participant simply checks whether each item on the list is observed, present or true, or not. A rating scale is more useful when a behavior needs to be evaluated on a continuum; such scales are also known as Likert scales.
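
As a small illustration of how such devices quantify responses, the Python sketch below records a checklist as true/false values and a Likert-type rating as a number from 1 to 5. All item names and responses are invented for illustration:

# Quantifying a checklist and a Likert-type rating scale
# (hypothetical items and responses, for illustration only).
checklist = {
    "read the instructions": True,
    "asked for clarification": False,
    "completed all questions": True,
}
observed = sum(checklist.values())  # count of behaviours observed
print("Behaviours observed:", observed, "of", len(checklist))

# Rating scale: 1 = strongly disagree ... 5 = strongly agree
likert_labels = {1: "strongly disagree", 2: "disagree", 3: "neutral",
                 4: "agree", 5: "strongly agree"}
response = 4
print("Rating:", response, "->", likert_labels[response])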

Schedules

The schedule is a form containing questions or blanks which are to be filled in by the field workers after getting information from the informants. The schedule may, thus, contain two types of questions: firstly, those that are in the form of direct questions, and secondly, those that are in the form of a table. There are some kinds of information that can be procured only by putting a question, for example, questions eliciting the informant's opinion, attitude, preferences or his suggestions about some matter. Others may better be put in the form of tables. Generally, most of them are interchangeable.

Purpose of Schedule

• To attain objectivity: The purpose of a schedule is to provide a standardised tool for observation or interview in order to attain objectivity. With a schedule, every informant has to reply to the same questions put in the same language, and the field worker has no choice to get the desired reply by putting a different question or changing the language of the same question. The order of the questions is also the same, and thus the whole interview takes place under standardised conditions, so the data received are perfectly comparable.

It has been proved that responses of people regarding the same matter differ if the language of the question is even slightly different or even the place of the question is changed. A perfectly standardised form is, therefore, needed for any objective study.
• To Act as a Memory Tickler: The other purpose of a schedule is to act as a memory tickler. In the absence of a schedule the field worker may put a different number of questions to different people. He may forget to inquire about some important aspects and would then be required to go over the whole process again to collect the missing information. The schedule keeps his memory refreshed and reminds him of the different aspects that are to be particularly observed.

Essentials of a Good Schedule

According to P.V. Young, there are two essential conditions of a good schedule.

• Accurate Communication: Accurate communication is achieved when respondents understand the questions in the same sense in which they are intended. Thus the basis of accurate communication is the proper wording of questions. The questions should be so worded that they clearly carry the desired sense without any ambiguity. Various tools for achieving accuracy of communication have been discussed in the succeeding paragraphs.
• Accurate Response: Accurate response is said to have been achieved when the replies contain the information sought. The information should thus be unbiased and true. The respondents should also co-operate in filling in the schedule by giving out correct information.

Difference between Questionnaires and Schedules

Both questionnaire and schedule are popularly used methods of collecting data in research
surveys. There is much resemblance in the nature of these two methods and this fact has made
many people to remark that from a practical point of view, the two methods can be taken to be
the same. But from the technical point of view there is difference between the two. The
important points of difference are as under:

• The questionnaire is generally sent through mail to informants to be answered as specified in a covering letter, but otherwise without further assistance from the sender. The schedule is generally filled out by the research worker or the enumerator, who can interpret questions when necessary.
• To collect data through a questionnaire is relatively cheap and economical, since we have to spend money only in preparing the questionnaire and in mailing it to respondents. No field staff is required. To collect data through schedules is relatively more expensive, since a considerable amount of money has to be spent in appointing enumerators and in imparting training to them. Money is also spent in preparing the schedules.
• Non-response is usually high in case of questionnaire as many people do not respond
and many return the questionnaire without answering all questions. Bias due to non-
response often remains indeterminate. As against this, non-response is generally very
low in case of schedules because these are filled by enumerators who are able to get
answers to all questions. But there remains the danger of interviewer bias and cheating.
• In case of questionnaire, it is not always clear as to who replies, but in case of schedule
the identity of respondent is known.
• The questionnaire method is likely to be very slow since many respondents do not
return the questionnaire in time despite several reminders, but in case of schedules the
information is collected well in time as they are filled in by enumerators.
• Personal contact is generally not possible in the case of the questionnaire method, as questionnaires are sent to respondents by post, who in turn return them by post. But in the case of schedules direct personal contact is established with respondents. The questionnaire method can be used only when respondents are literate and cooperative, but in the case of schedules the information can be gathered even when the respondents happen to be illiterate.
• Wider and more representative distribution of sample is possible under the
questionnaire method, but in respect of schedules there usually remains the difficulty in
sending enumerators over a relatively wider area.
• Risk of collecting incomplete and wrong information is relatively more under the
questionnaire method, particularly when people are unable to understand questions
properly. But in case of schedules, the information collected is generally complete and
accurate as enumerators can remove the difficulties, if any, faced by respondents in
correctly understanding the questions. As a result, the information collected through
schedules is relatively more accurate than that obtained through questionnaires.
• The success of questionnaire method lies more on the quality of the questionnaire itself,
but in the case of schedules much depends upon the honesty and competence of
enumerators.
• In order to attract the attention of respondents, the physical appearance of
questionnaire must be quite attractive, but this may not be so in case of schedules as
they are to be filled in by enumerators and not by respondents.
• Along with schedules, observation method can also be used but such a thing is not
possible while collecting data through questionnaires.

Collection of Secondary Data

Secondary data means data that are already available i.e., they refer to the data which have
already been collected and analysed by someone else. When the researcher utilises secondary
data, then he has to look into various sources from where he can obtain them. In this case he is
certainly not confronted with the problems that are usually associated with the collection of
original data. Secondary data may either be published data or unpublished data. Usually
published data are available in:

• Various publications of the central, state or local governments
• Various publications of foreign governments or of international bodies and their subsidiary organisations
• Technical and trade journals

• Books, magazines and newspapers
• Reports and publications of various associations connected with business and industry,
banks, stock exchanges, etc.
• Reports prepared by research scholars, universities, economists, etc. in different fields
• Public records and statistics, historical documents, and other sources of published
information

Researcher must be very careful in using secondary data. He must make a minute scrutiny
because it is just possible that the secondary data may be unsuitable or may be inadequate in
the context of the problem which the researcher wants to study. By way of caution, the researcher, before using secondary data, must see that they possess the following characteristics:

Reliability of Data

The reliability can be tested by finding out such things about the said data:

• Who collected the data?
• What were the sources of data?
• Were proper methods adopted to collect the data?
• At what time were they collected?
• Was there any bias involved?
• What level of accuracy was desired? Was it achieved?

Suitability of Data

• The data that are suitable for one enquiry may not necessarily be found suitable in
another enquiry. Hence, if the available data are found to be unsuitable, they should not
be used by the researcher. In this context, the researcher must very carefully scrutinize
the definition of various terms and units of collection used at the time of collecting the
data from the primary source originally.
• Similarly, the object, scope and nature of the original enquiry must also be studied. If
the researcher finds differences in these, the data will remain unsuitable for the present
enquiry and should not be used.

Adequacy of Data

• If the level of accuracy achieved in data is found inadequate for the purpose of the
present enquiry, they will be considered as inadequate and should not be used by the
researcher. The data will also be considered inadequate, if they are related to an area
which may be either narrower or wider than the area of the present enquiry.
• From all this we can say that it is very risky to use already available data. The researcher should use such data only when he finds them reliable, suitable and adequate. But he should not blindly discard such data if they are readily available from authentic sources and are also suitable and adequate, for in that case it will not be economical to spend time and energy on field surveys to collect the information afresh. At times there may be a wealth of usable information in the already available data, which an intelligent researcher should use, though with due precaution.

Qualitative Research

• Qualitative research is a type of social science research that collects and works with non-numerical data and that seeks to interpret meaning from these data to help understand social life through the study of targeted populations or places.
• People often frame it in opposition to quantitative research, which uses numerical data
to identify large-scale trends and employs statistical operations to determine causal and
correlative relationships between variables.
• Within sociology, qualitative research is typically focused on the micro-level of social
interaction that composes everyday life, whereas quantitative research typically focuses
on macro-level trends and phenomena.
• This type of research has long appealed to social scientists because it allows the
researchers to investigate the meanings people attribute to their behavior, actions, and
interactions with others.
• While quantitative research is useful for identifying relationships between variables,
like, for example, the connection between poverty and racial hate, it is qualitative
research that can illuminate why this connection exists by going directly to the
source—the people themselves.
• Qualitative research is designed to reveal the meaning that informs the action or
outcomes that are typically measured by quantitative research. So qualitative
researchers investigate meanings, interpretations, symbols, and the processes and
relations of social life.
• What this type of research produces is descriptive data that the researcher must then
interpret using rigorous and systematic methods of transcribing, coding, and analysis of
trends and themes.
• Because its focus is everyday life and people's experiences, qualitative research lends
itself well to creating new theories using the inductive method, which can then be tested
with further research.

Methods

Qualitative researchers use their own eyes, ears, and intelligence to collect in-depth perceptions
and descriptions of targeted populations, places, and events.

Their findings are collected through a variety of methods, and often a researcher will use at
least two or several of the following while conducting a qualitative study:

• Direct observation: With direct observation, a researcher studies people as they go about their daily lives without participating or interfering. This type of research is often unknown to those under study and, as such, must be conducted in public settings where people do not have a reasonable expectation of privacy. For example, a researcher might observe the ways in which strangers interact in public as they gather to watch a street performer.
• Open-ended surveys: While many surveys are designed to generate quantitative data,
many are also designed with open-ended questions that allow for the generation and
analysis of qualitative data. For example, a survey might be used to investigate not just
which political candidates voters chose, but why they chose them, in their own words.
• Focus group: In a focus group, a researcher engages a small group of participants in a
conversation designed to generate data relevant to the research question. Focus groups
can contain anywhere from 5 to 15 participants. Social scientists often use them in
studies that examine an event or trend that occurs within a specific community. They
are common in market research, too.
• In-depth interviews: Researchers conduct in-depth interviews by speaking with
participants in a one-on-one setting. Sometimes a researcher approaches the interview
with a predetermined list of questions or topics for discussion but allows the
conversation to evolve based on how the participant responds. Other times, the
researcher has identified certain topics of interest but does not have a formal guide for
the conversation, and instead allows the participant to guide it.
• Oral history: The oral history method is used to create a historical account of an event,
group, or community, and typically involves a series of in-depth interviews conducted
with one or multiple participants over an extended period.
• Participant observation: This method is similar to direct observation; however, with this one the researcher also participates in the action or events, not only to observe others but to gain first-hand experience in the setting.
• Ethnographic observation: Ethnographic observation is the most intensive and in-
depth observational method. Originating in anthropology, with this method, a
researcher fully immerses themselves into the research setting and lives among the
participants as one of them for anywhere from months to years. By doing this, the
researcher attempts to experience day-to-day existence from the viewpoints of those
studied to develop in-depth and long-term accounts of the community, events, or trends
under observation.
• Content analysis: This method is used by sociologists to analyze social life by
interpreting words and images from documents, film, art, music, and other cultural
products and media. The researchers look at how the words and images are used, and
the context in which they are used to draw inferences about the underlying culture.
Content analysis of digital material, especially that generated by social media users, has
become a popular technique within the social sciences.
• Phenomenological Method: Describing how any one participant experiences a specific
event is the goal of the phenomenological method of research. This method utilizes
interviews, observation and surveys to gather information from subjects.
Phenomenology is highly concerned with how participants feel about things during an
event or activity. Businesses use this method to develop processes to help sales
representatives effectively close sales using styles that fit their personality.

• Grounded Theory Method: The grounded theory method tries to explain why a course of action evolved the way it did. Grounded theory looks at large numbers of subjects. Theoretical models are developed from the data themselves, rather than from existing modes of genetic, biological or psychological science. Businesses use grounded theory when conducting user or satisfaction surveys that target why consumers use company products or services. This data helps companies maintain customer satisfaction and loyalty.
• Case Study Model: Unlike grounded theory, the case study model provides an in-depth
look at one test subject. The subject can be a person or family, business or organization,
or a town or city. Data is collected from various sources and compiled using the details
to create a bigger conclusion. Businesses often use case studies when marketing to new
clients to show how their business solutions solve a problem for the subject.
• Historical Model: The historical method of qualitative research describes past events
in order to understand present patterns and anticipate future choices. This model
answers questions based on a hypothetical idea and then uses resources to test the idea
for any potential deviations. Businesses can use historical data of previous ad campaigns
and the targeted demographic and split-test it with new campaigns to determine the
most effective campaign.
• Narrative Model: The narrative model occurs over extended periods of time and
compiles information as it happens. Like a story narrative, it takes subjects at a starting
point and reviews situations as obstacles or opportunities occur, although the final
narrative doesn't always remain in chronological order. Businesses use the narrative
method to define buyer personas and use them to identify innovations that appeal to a
target market.

While much of the data generated by qualitative research is coded and analyzed using just the researcher's eyes and brain, the use of computer software for these processes is increasingly popular within the social sciences; a minimal sketch of such computer-assisted coding follows.
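
For instance, the Python sketch below codes a handful of invented interview excerpts against a hypothetical coding scheme and counts how often each theme appears. It is only a toy illustration of computer-assisted content analysis, not any standard software package:

from collections import Counter
import re

# Hypothetical interview excerpts (invented for illustration).
documents = [
    "I trust the brand because my family always used it.",
    "The price matters more to me than the brand.",
    "Family tradition decides what brand we buy.",
]

# Simple coding scheme: each category is a set of indicator words.
codes = {
    "brand_loyalty": {"brand", "trust", "tradition"},
    "price_sensitivity": {"price", "cheap", "cost"},
    "family_influence": {"family"},
}

counts = Counter()
for doc in documents:
    words = set(re.findall(r"[a-z]+", doc.lower()))
    for code, indicators in codes.items():
        if words & indicators:  # the document mentions this theme
            counts[code] += 1

for code, n in counts.items():
    print(code, "appears in", n, "of", len(documents), "documents")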

Qualitative research has both benefits and drawbacks:

• On the plus side, it creates an in-depth understanding of the attitudes, behaviors, interactions, events, and social processes that comprise everyday life. In doing so, it helps social scientists understand how everyday life is influenced by society-wide things like social structure, social order, and all kinds of social forces.
• This set of methods also has the benefit of being flexible and easily adaptable to changes
in the research environment and can be conducted with minimal cost in many cases.
• Among the downsides of qualitative research is that its scope is fairly limited, so its findings are not always widely generalizable.
• Researchers also have to use caution with these methods to ensure that they do not
influence the data in ways that significantly change it and that they do not bring undue
personal bias to their interpretation of the findings.
• Fortunately, qualitative researchers receive rigorous training designed to eliminate or
reduce these types of research bias.

Motivational Research

Motivation Research Technique: There are four techniques of conducting motivation research:

(a) Non-disguised, Structured Techniques.
(b) Non-disguised, Non-structured Techniques.
(c) Disguised, Non-structured Techniques.
(d) Disguised, Structured Techniques.

(a) Non-disguised, Structured Techniques: This approach employs a standardized questionnaire to collect data on beliefs, feelings, and attitudes from the respondent.

Single Question Method: (I think it is a good product or I think it is a poor product).

Multiple Questions Method (a number of questions asked about the attitude) and Physiological Tests (laboratory tests such as galvanic skin response, eye movement, etc., which measure attitudes of people towards products) are carried out under this approach.

(b) Non-disguised, Non-structured Techniques: These techniques use a non-standardized questionnaire. The techniques are also called depth interviews, qualitative interviews, unstructured interviews, or focussed interviews. All these techniques are designed to gather information on various aspects of human behaviour, including the "why" component.

(c) Disguised, Non-structured Techniques: In this approach, the purpose of the study is not disclosed to respondents, unlike in the above two cases. A list of unstructured questions is used to collect data on consumers' attitudes. This art of using disguised and unstructured methods is referred to as "Projective Techniques".

The projective techniques include several tests given to the respondents. They may be asked to give their comments on cartoons, pictures, stories, etc. The stimuli used for this purpose are capable of evoking a variety of reactions from the respondent. A number of projective techniques are available to the market researcher for the purpose of analysing the "why" part of consumer behaviour.

Qualitative Techniques: The main projective techniques (including word association) are as follows:

1. Word Association Test (W.A.T.): The interviewer calls out a series of listed words one by one and the respondent quickly replies with the first word that enters his mind. The underlying assumption is that by "free associating" with certain stimuli (words), spontaneous reactions are elicited; the responses are timed so that those answers which the respondent "reasons out" are identified.

2. Sentence Completion: Sentence completion test is similar to word association test except
that the respondent is required to complete an unfinished sentence.

For example, “I do not use shampoos because……..”

“Coffee that is quickly made…………. ”

3. Story Completion: In this technique the respondent is asked to complete a story, the end of which is missing. This enables a researcher to find out an almost exact version of the images and feelings of people towards a company's product. This helps in finalising the advertising and promotional themes for the product in question.

4. Rorschach Ink-blot Tests (or Rorschach Tests): Motivation research employs this famous test, although such tests are not in much use in marketing research. The Rorschach test expresses in a classic way the rationale behind all projective tests, that is, in filling in the missing parts of a vague and incomplete stimulus, the respondent projects himself and his personality into the picture.

A blot of ink is put on a piece of paper, reference is made to a company or product, and the respondent is asked to give his viewpoint after interpreting what he sees in the blot before him. The respondent may say "ugly packaging of the product" or "excellent performance of the product". This response will help the seller finalize his marketing strategies.

5. Psychographic Technique: This includes the galvanic skin response, eye movement and eye blink tests, etc., which use various instruments to record physiological responses.

6. Espionage Technique: There are two methods in this technique:

(i) Use of Hidden Recorders: Hidden tape recorders and cameras are used to watch consumers as they make purchases or consume items.

(ii) Rubbish Research: This is another method of espionage activity. Here, the researcher sifts through the garbage of individuals or groups and records patterns of consumption, waste, and brand preference. It gives much-needed estimates of the consumption of cigarettes, medicines, liquor, magazines, etc.

(d) Disguised Structured Techniques: When we are to measure those attitudes which
respondents might not readily and accurately express, we can use disguised structured
techniques. The disguised structured questionnaires are easy to administer and code.

Respondents are given questions which they are not likely to be able to answer accurately. In such circumstances they are compelled to 'guess at' the answers, and the respondent's attitude on the subject is assumed to be revealed by the extent and direction in which these guessing errors are committed.

Uses of Motivation Research:

1. Motivation research leads to useful insights and provides inspiration to creative persons in the advertising and packaging world.
2. Knowledge and measurement of the true attitudes of customers help in choosing the best selling appeal for the product and the best way to represent the product in the sales talk, and in determining the appropriateness and weightage of various promotional methods.
3. Motivation research can help in measuring changes in attitudes, thus aiding advertising research.

4. Knowledge and measurement of attitudes provides us with an imaginative market
segmentation tool and also enables estimating market potential of each additional
segment.
5. Strategies to position the offer of the company in a particular market segment should be
based on the findings of motivation research.

Limitations of Motivation Research:

1. Caution is required not only in the application of these techniques; the resultant data should also be analysed and interpreted according to psychological theory.
2. Originally these techniques were developed to collect data from a single individual over a period of time. The approach is not free from drawbacks when we apply these techniques to gather data from a number of individuals.
3. The designing and administering of these techniques need qualified and experienced researchers. Such personnel are not easily available.

Techniques of Motivation Research:

The techniques used in motivation research are of two types namely, Projective Techniques
and Depth Interviews.

Projective Techniques: These projective techniques represent tests conducted to establish the personalities of the respondents and their reactions to product, media, advertisement, package, product design and the like.

They project or reflect the subject's thoughts about what he or she sees, feels or perceives, thus producing the reactions.

These tests are derived from clinical psychology and work on the postulation that if an
individual is placed in an ambiguous situation, he is guided by his own perceptions to describe
the situation.

They often provide an insight into the motives that lie below the level of consciousness. Where the respondent is likely to rationalize his motives, consciously or unconsciously, his responses tend to reflect his own attitudes and beliefs by indirection; they are his own perceptions of, and interpretations applied to, the situation to which he is exposed.

The five most commonly administered tests of this kind are:

1. Thematic Apperception Test.

2. Sentence Completion Test.

3. Word Association Test.

4. Paired Picture Test and

5. Third Person Test.

1. Thematic Apperception Test (TAT): Under this test, the respondent is presented with a picture or series of pictures of a scene or scenes involving people and objects associated with the goods or services in question. These are unstructured, ambiguous in action and very often neutral, giving no expressions or emotions. The respondent is to study the picture or pictures and construct a story.

His narrations or readings are interpreted by a skilled analyst. Thus, the picture may be of a young man scribbling on a piece of paper. Here, the respondent is to read whether the person in the picture is writing; if so, what? For whom? And why? And so on.

2. Sentence Completion Test (SCT): Sentence completion tests are designed to discover
emotional responses of the respondent. It is the easiest, most useful and reliable test to get the
correct information in an indirect manner. The respondent is asked to complete the sentence
given.

For instance, the questions may be, in the case of ladies:

1. I like instant coffee because……………

2. I use talcum powder because…………..

3. I use electric kitchen gadgets because………………….

4. I do not use pain-killers like aspirin because……………..

5. I do not like red, brown and black colours because……………………..

In the case of men, these questions may be:

(a) I liked filter tipped cigarettes because………………….

(b) I gave up smoking because…………………

(c) I love natural proteins because………………..

(d) I prefer cold coffee because………………

(e) I do not use foam beds because……………….

The way the questions are asked does not suggest right or wrong answers. However, the emotional values and tensions are reflected in the answers given.

3. Word Association Test (WAT): The word association test is similar to the sentence completion test. The only difference is that instead of an incomplete sentence, a list of words ranging from twenty-five to seventy-five is given. This is the oldest and the simplest kind of test.

The respondent is to match each word; that is, the word suggested by the researcher is to be associated by the respondent with the most fitting word that comes to his mind. This test is widely used to measure the effect of brand names and advertising messages.

Here, it is not possible to give all the seventy-five words. On an illustrative basis, let us take fifteen words:

1. Perfume………..

2. Tooth paste

3. Hair oil………..

4. Shampoo……………………….

5. Shoes…………

6. Two-wheelers…………

7. Four-wheelers…………

8. Tyre………………………

9. Glass wares………….

10. Ink………..

11. Pencils…………

12. Fridges…………………….

13. Cupboards…………

14. Television………………….

15. Video cassettes………………

Thus, a respondent may give his preference as 'Colgate' or 'Promise' or 'Close-up' or 'Forhans' in the case of toothpaste. On the basis of such answers, it is possible to determine a scale of preference; a small tally sketch follows.
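
A minimal Python sketch of such a preference tally; the responses are invented for illustration:

from collections import Counter

# Hypothetical first-response words for the stimulus "tooth paste".
responses = ["Colgate", "Promise", "Colgate", "Close-up",
             "Colgate", "Forhans", "Promise"]

# Rank brands by how often they were the first association.
preference_scale = Counter(responses).most_common()
for rank, (brand, count) in enumerate(preference_scale, start=1):
    print(rank, brand, count)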

4. Paired Picture Test (PPT): This is another very appealing and easy-to-administer test. In a paired picture test the respondent is given a pair of pictures, almost identical in all respects except one. For instance, the researcher may be interested in knowing the reaction of respondents to a new brand of refrigerator.

The pair of pictures may show a woman opening the door of a moderately priced refrigerator of a familiar brand, and the same woman opening the refrigerator door of another brand.

Looking at these two pictures, the respondent is to give his own feelings or reactions. Though the same pair is shown to many respondents, the reactions differ from person to person. Instead of using the usual figures, cartoons may be introduced. The analyst thus gets at the inner feelings of an individual for analysis purposes.

5. Third Person Test (TPT): The format of this test is that the respondent is given a photograph of a third person—maybe a friend, a colleague, a neighbour, a star, a player, a professional and the like. The point involved is that the researcher is interested in knowing what the third person thinks of an issue, as heard through the respondent.

It is assumed that the respondent’s answer will reveal his own inner feelings more clearly
through the third person than otherwise it would have been possible.

The best example of this kind is the test conducted on American housewives in connection with 'instant coffee'. Prior to the test, the stated attitude of the housewives was "It does not taste good"; when the test was conducted, the real attitude turned out to be "A lady using instant coffee is lazy, a spendthrift and not a good housewife". This amply shows how the test revealed the naked truth.

Measurement and Scaling Technique


Measurement in Research:

• Measurement is a relatively complex and demanding task, especially so when it concerns qualitative or abstract phenomena. By measurement we mean the process of assigning numbers to objects or observations, the level of measurement being a function of the rules under which the numbers are assigned.
• It is easy to assign numbers in respect of properties of some objects, but it is relatively
difficult in respect of others. For instance, measuring such things as social conformity,
intelligence, or marital adjustment is much less obvious and requires much closer
attention than measuring physical weight, biological age or a person’s financial assets.
In other words, properties like weight, height, etc., can be measured directly with some
standard unit of measurement, but it is not that easy to measure properties like
motivation to succeed, ability to stand stress and the like.
• We can expect high accuracy in measuring the length of pipe with a yard stick, but if
the concept is abstract and the measurement tools are not standardised, we are less
confident about the accuracy of the results of measurement. Technically speaking,
measurement is a process of mapping aspects of a domain onto other aspects of a range
according to some rule of correspondence.
• In measuring, we devise some form of scale in the range (in terms of set theory, range
may refer to some set) and then transform or map the properties of objects from the
domain (in terms of set theory, domain may refer to some other set) onto this scale. For
example, in case we are to find the male to female attendance ratio while conducting a
study of persons who attend some show, then we may tabulate those who come to the
show according to sex.

• In terms of set theory, this process is one of mapping the observed physical properties of those coming to the show (the domain) onto a sex classification (the range). The rule of correspondence is: if the object in the domain appears to be male, assign "0", and if female, assign "1". Similarly, we can record a person's marital status as 1, 2, 3 or 4, depending on whether the person is single, married, widowed or divorced. We can as well record "Yes or No" answers to a question as "0" and "1" (or as 1 and 2, or perhaps as 59 and 60). A short sketch of such coding follows this list.
• In this artificial or nominal way, categorical data (qualitative or descriptive) can be made into numerical data, and if we thus code the various categories, we refer to the numbers we record as nominal data. Nominal data are numerical in name only, because they do not share any of the properties of the numbers we deal with in ordinary arithmetic.
• In those situations when we cannot do anything except set up inequalities, we refer to
the data as ordinal data.
• When in addition to setting up inequalities we can also form differences, we refer to the
data as interval data.
• When in addition to setting up inequalities and forming differences we can also form
quotients (i.e., when we can perform all the customary operations of mathematics), we
refer to such data as ratio data. In this sense, ratio data includes all the usual
measurement (or determinations) of length, height, money amounts, weight, volume,
area, pressures etc.
• The above stated distinction between nominal, ordinal, interval and ratio data is
important for the nature of a set of data may suggest the use of particular statistical
techniques. A researcher has to be quite alert about this aspect while measuring
properties of objects or of abstract concepts.
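
The coding described above can be made concrete with a short Python sketch. The attendee records are invented; the codes follow the rules of correspondence stated in the text:

# Rules of correspondence mapping observed categories to numbers
# (as in the text: male -> 0, female -> 1; marital status 1-4).
sex_code = {"male": 0, "female": 1}
marital_code = {"single": 1, "married": 2, "widowed": 3, "divorced": 4}

# Hypothetical attendees at the show: (sex, marital status).
attendees = [("male", "married"), ("female", "single"), ("female", "married")]
coded = [(sex_code[s], marital_code[m]) for s, m in attendees]
print(coded)  # [(0, 2), (1, 1), (1, 2)]

# The codes are nominal: counting categories is meaningful,
# but arithmetic on the codes themselves is not.
females = sum(1 for s, _ in coded if s == 1)
print("male : female =", len(coded) - females, ":", females)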

Measurement Scales

• The level of measurement refers to the relationship among the values that are assigned
to the attributes, feelings or opinions for a variable. For example, the variable ‘whether
the taste of fast food is good’ has a number of attributes, namely, very good, good,
neither good nor bad, bad and very bad. For the purpose of analysing the results of this
variable, we may assign the values 1, 2, 3, 4 and 5 to the five attributes respectively.
• The level of measurement describes the relationship among these five values. Here, we
are simply using the numbers as shorter placeholders for the lengthier text terms. We
don’t mean that higher values mean ‘more’ of something or lower values mean ‘less’ of
something. We don’t assume that ‘good’ which has a value of 2 is twice of ‘very good’
which has a value of 1. We don’t even assume that ‘very good’ which is assigned the
value ‘1’ has more preference than ‘good’ which is assigned the value ‘2’. We simply use
the values as a shorter name for the attributes, opinions, or feelings. The assigned
values of attributes allow the researcher more scope for further processing of data and
statistical analysis.
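
As a small illustration, the Python sketch below assigns the values 1 to 5 to the five attributes of the 'taste of fast food' variable, exactly as described above; the sample answers are invented:

# Values 1-5 assigned to the five attributes, as in the text.
taste_values = {"very good": 1, "good": 2, "neither good nor bad": 3,
                "bad": 4, "very bad": 5}
answers = ["good", "very good", "bad", "good"]
coded = [taste_values[a] for a in answers]
print(coded)  # [2, 1, 4, 2] -- shorter placeholders for the text labels,
              # not quantities implying 'more' or 'less' of something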

Selection of Measurement of Scale

A scaling technique should be used that will yield the highest level of information feasible in a given situation. If possible, the technique should also permit the use of a variety of statistical analyses. A number of issues decide the choice of scaling technique. Some significant issues are:

Problem Definition and Statistical Analysis: The choice between ranking, sorting, or rating techniques is determined by the problem definition and the type of statistical analysis likely to be performed. For example, ranking provides only ordinal data, which limits the use of statistical techniques.

The Choice between Comparative and Non-comparative Scales: Sometimes it is better to use a comparative scale rather than a non-comparative scale.

Type of Category Labels: Many researchers use verbal categories since they believe that
these categories are understood well by the respondents. The maturity and the education level
of the respondents influence this decision.

Number of Categories: While there is no single, optimal number of categories, traditional guidelines suggest that there should be between five and nine categories. Also, if a neutral or indifferent scale response is possible for at least some of the respondents, an odd number of categories should be used. However, the researcher must determine the number of meaningful positions that are best suited for a specific problem.

Balanced versus Unbalanced Scale: In general, the scale should be balanced to obtain
objective data.

Forced versus Non-forced Categories: In situations where the respondents are expected to have no opinion, the accuracy of data may be improved by a non-forced scale that provides a 'no opinion' category.

Measurement of Scales

From what has been stated above, we can say that scales of measurement can be considered in terms of their mathematical properties. The most widely used classification of measurement scales is the following (a short illustrative sketch follows the list):

A. Nominal scale:
• A nominal scale is simply a system of assigning number symbols to events in order to label them. The usual example of this is the assignment of numbers to basketball players in order to identify them. Such numbers cannot be considered to be associated with an ordered scale, for their order is of no consequence; the numbers are just convenient labels for the particular class of events and as such have no quantitative value.
• Nominal scales provide convenient ways of keeping track of people, objects and events. One cannot do much with the numbers involved. For example, one cannot usefully average the numbers on the backs of a group of football players and come up with a meaningful value. Neither can one usefully compare the numbers assigned to one group with the numbers assigned to another.
• The counting of members in each group is the only possible arithmetic operation when a nominal scale is employed. Accordingly, we are restricted to the mode as the measure of central tendency. There is no generally used measure of dispersion for nominal scales.
• Chi-square test is the most common test of statistical significance that can be
utilised, and for the measures of correlation, the contingency coefficient can be
worked out.
• Nominal scale is the least powerful level of measurement. It indicates no order or
distance relationship and has no arithmetic origin. A nominal scale simply
describes differences between things by assigning them to categories. Nominal
data are, thus, counted data. The scale wastes any information that we may have
about varying degrees of attitude, skills, understandings, etc.
• In spite of all this, nominal scales are still very useful and are widely used in
surveys and other ex-post-facto research when data are being classified by major
sub-groups of the population.
B. Ordinal scale:
• The lowest level of the ordered scale that is commonly used is the ordinal scale.
The ordinal scale places events in order, but there is no attempt to make the
intervals of the scale equal in terms of some rule. Rank orders represent ordinal
scales and are frequently used in research relating to qualitative phenomena.
• Ordinal scales only permit the ranking of items from highest to lowest. Ordinal
measures have no absolute values, and the real differences between adjacent
ranks may not be equal. All that can be said is that one person is higher or lower
on the scale than another, but more precise comparisons cannot be made.
• Thus, the use of an ordinal scale implies a statement of ‘greater than’ or ‘less
than’ (an equality statement is also acceptable) without our being able to state
how much greater or less. The real difference between ranks 1 and 2 may be
more or less than the difference between ranks 5 and 6.
• Since the numbers of this scale have only a rank meaning, the appropriate
measure of central tendency is the median. A percentile or quartile measure is
used for measuring dispersion. Correlations are restricted to various rank order
methods. Measures of statistical significance are restricted to the non-parametric
methods.
C. Interval scale:
• In the case of interval scale, the intervals are adjusted in terms of some rule that
has been established as a basis for making the units equal. The units are equal
only in so far as one accepts the assumptions on which the rule is based. Interval
scales can have an arbitrary zero, but it is not possible to determine for them
what may be called an absolute zero or the unique origin.
• The primary limitation of the interval scale is the lack of a true zero; it does not
have the capacity to measure the complete absence of a trait or characteristic.

Interval scales provide more powerful measurement than ordinal scales, for the interval scale also incorporates the concept of equality of intervals.
• As such, more powerful statistical measures can be used with interval scales. Mean is the appropriate measure of central tendency, while standard deviation is the most widely used measure of dispersion. Product moment correlation techniques are appropriate, and the generally used tests for statistical significance are the 't' test and the 'F' test.
D. Ratio scale:
• Ratio scales have an absolute or true zero of measurement. The term ‘absolute
zero’ is not as precise as it was once believed to be. We can conceive of an
absolute zero of length and similarly we can conceive of an absolute zero of time.
• The ratio involved does have significance and facilitates a kind of comparison
which is not possible in case of an interval scale. Ratio scale represents the actual
amounts of variables. Measures of physical dimensions such as weight, height,
distance, etc. are examples. Generally, all statistical techniques are usable with
ratio scales and all manipulations that one can carry out with real numbers can
also be carried out with ratio scale values.
• Multiplication and division can be used with this scale but not with the other scales mentioned above. Geometric and harmonic means can be used as measures of central tendency, and coefficients of variation may also be calculated. Thus, proceeding from the nominal scale (the least precise type of scale) to the ratio scale (the most precise), increasingly precise information is obtained.
• If the nature of the variables permits, the researcher should use the scale that
provides the most precise description. Researchers in physical sciences have the
advantage to describe variables in ratio scale form but the behavioural sciences
are generally limited to describe variables in interval scale form, a less precise
type of measurement.
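
The following Python sketch (using only the standard statistics module, with invented data) illustrates the measures of central tendency and dispersion appropriate to each level of measurement discussed in the list above:

import statistics

# Nominal: only counting categories is meaningful; use the mode.
blood_groups = ["A", "O", "O", "B", "O", "AB"]
print("mode:", statistics.mode(blood_groups))

# Ordinal: ranks permit a median but not a meaningful mean.
satisfaction_ranks = [1, 2, 2, 3, 4, 5, 5]
print("median rank:", statistics.median(satisfaction_ranks))

# Interval: arbitrary zero; mean and standard deviation apply.
temps_celsius = [21.5, 23.0, 19.5, 22.0]
print("mean:", statistics.mean(temps_celsius),
      "sd:", statistics.stdev(temps_celsius))

# Ratio: true zero; ratios, geometric and harmonic means apply.
weights_kg = [50.0, 60.0, 75.0]
print("geometric mean:", statistics.geometric_mean(weights_kg))
print("harmonic mean:", statistics.harmonic_mean(weights_kg))
print("coefficient of variation:",
      statistics.stdev(weights_kg) / statistics.mean(weights_kg))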

Scaling

• In research we quite often face a measurement problem (since we want a valid measurement but may not obtain it), especially when the concepts to be measured are complex and abstract and we do not possess standardised measurement tools. Alternatively, we can say that while measuring attitudes and opinions, we face the problem of their valid measurement.
• Similar problem may be faced by a researcher, of course in a lesser degree, while
measuring physical or institutional concepts. As such we should study some procedures
which may enable us to measure abstract concepts more accurately. This brings us to
the study of scaling techniques.

Meaning of Scaling

• Scaling describes the procedures of assigning numbers to various degrees of opinion, attitude and other concepts. This can be done in two ways, viz.,

o Making a judgment about some characteristic of an individual and then placing
him directly on a scale that has been defined in terms of that characteristic and
o Constructing questionnaires in such a way that the score of an individual's responses assigns him a place on a scale. It may be stated here that a scale is a continuum, consisting of the highest point (in terms of some characteristic, e.g., preference, favourableness, etc.) and the lowest point, along with several intermediate points between these two extreme points.
• These scale-point positions are so related to each other that when the first point
happens to be the highest point, the second point indicates a higher degree in terms of a
given characteristic as compared to the third point and the third point indicates a higher
degree as compared to the fourth and so on. Numbers for measuring the distinctions of
degree in the attitudes/opinions are, thus, assigned to individuals corresponding to
their scale-positions.
• All this is better understood when we talk about scaling technique(s). Hence the term
‘scaling’ is applied to the procedures for attempting to determine quantitative measures
of subjective abstract concepts. Scaling has been defined as a “procedure for the
assignment of numbers (or other symbols) to a property of objects in order to impart
some of the characteristics of numbers to the properties in question.”

Scale Classification Bases

The number assigning procedures or the scaling procedures may be broadly classified on one or
more of the following bases:

• Subject Orientation: Under it, a scale may be designed to measure characteristics of the respondent who completes it or to judge the stimulus object which is presented to
the respondent. In respect of the former, we presume that the stimuli presented are
sufficiently homogeneous, so that the between stimuli variation is small as compared to
the variation among respondents. In the latter approach, we ask the respondent to judge
some specific object in terms of one or more dimensions and we presume that the
between-respondent variation will be small as compared to the variation among the
different stimuli presented to respondents for judging.
• Response Form: Under this, we may classify the scales as categorical and comparative.
Categorical scales are also known as rating scales. These scales are used when a
respondent scores some object without direct reference to other objects. Under
comparative scales, which are also known as ranking scales, the respondent is asked to
compare two or more objects. In this sense, the respondent may state that one object is
superior to the other or those three models of pen rank in order 1, 2 and 3. The essence
of ranking is, in fact, a relative comparison of a certain property of two or more objects.
• Degree of Subjectivity: With this basis, the scale data may be based on whether we
measure subjective personal preferences or simply make non-preference judgments. In
the former case, the respondent is asked to choose which person he favors or which
solution he would like to see employed, whereas in the latter case he is simply asked to
judge which person is more effective in some aspect or which solution will take fewer
resources without reflecting any personal preference.

• Scale properties: Considering scale properties, one may classify the scales as nominal,
ordinal, interval and ratio scales. Nominal scales merely classify without indicating
order, distance or unique origin. Ordinal scales indicate magnitude relationships of
‘more than’ or ‘less than’, but indicate no distance or unique origin. Interval scales have
both, order and distance values, but no unique origin. Ratio scales possess all these
features.
• Number of Dimensions: In respect of this basis, scales can be classified as
‘unidimensional’ and ‘multidimensional’ scales. Under the former we measure only one
attribute of the respondent or object, whereas multidimensional scaling recognises that
an object might be described better by using the concept of an attribute space of ‘n’
dimensions, rather than a single-dimension continuum.
• Scale construction techniques: Following are the five main techniques by which scales
can be developed.
o Arbitrary approach: It is an approach where scale is developed on ad hoc basis.
This is the most widely used approach. It is presumed that such scales measure
the concepts for which they have been designed, although there is little evidence
to support such an assumption.
o Consensus approach: Here a panel of judges evaluates the items chosen for
inclusion in the instrument in terms of whether they are relevant to the topic
area and unambiguous in implication.
o Item analysis approach: Under it, a number of individual items are developed
into a test which is given to a group of respondents. After administering the test,
the total scores are calculated for everyone. Individual items are then analysed to
determine which items discriminate between persons or objects with high total
scores and those with low scores.
o Cumulative scales are chosen on the basis of their conforming to some ranking
of items with ascending and descending discriminating power. For instance, in
such a scale the endorsement of an item representing an extreme position should
also result in the endorsement of all items indicating a less extreme position.
o Factor scales may be constructed on the basis of inter correlations of items
which indicate that a common factor accounts for the relationship between items.
This relationship is typically measured through factor analysis method.

Important Scaling Techniques

Some of the important scaling techniques often used in research, especially social or business research, are described below.

Rating scales:

• The rating scale involves qualitative description of a limited number of aspects of a thing or of traits of a person. When we use rating scales (or categorical scales), we judge
an object in absolute terms against some specified criteria, i.e., we judge properties of
objects without reference to other similar objects.

• These ratings may be in such forms as “like—dislike”, “above average, average, below average”, or other classifications with more categories such as “like very much—like somewhat—neutral—dislike somewhat—dislike very much”; “excellent—good—average—below average—poor”; “always—often—occasionally—rarely—never”, and so on. There is no specific rule on whether to use a two-point scale, a three-point scale or a scale with still more points.
• In practice, three- to seven-point scales are generally used, for the simple reason that more points on a scale provide an opportunity for greater sensitivity of measurement. A rating scale may be either a graphic rating scale or an itemised rating scale.
• Under ranking scales (or comparative scales) we make relative judgements against other
similar objects. The respondents under this method directly compare two or more
objects and make choices among them. There are two generally used approaches of
ranking scales, viz.,
o Method of paired comparisons: Under it the respondent can express his
attitude by making a choice between two objects, say between a new flavour of
soft drink and an established brand of drink. But when there are more than two
stimuli to judge, the number of judgements required in a paired comparison is
given by the formula:

N = n(n − 1) / 2

where N = number of judgments and n = number of stimuli or objects to be judged.

• For instance, if there are ten suggestions for bargaining proposals available to a workers’ union, there are 45 paired comparisons that can be made with them. When N happens to be a big figure, there is the risk of respondents giving ill-considered answers or they may even refuse to answer.
• We can reduce the number of comparisons per respondent either by presenting to each
one of them only a sample of stimuli or by choosing a few objects which cover the range
of attractiveness at about equal intervals and then comparing all other stimuli to these
few standard objects. Thus, paired-comparison data may be treated in several ways.
• If there is substantial consistency, we will find that if X is preferred to Y, and Y to Z,
then X will consistently be preferred to Z. If this is true, we may take the total number
of preferences among the comparisons as the score for that stimulus. It should be
remembered that paired comparison provides ordinal data, but the same may be
converted into an interval scale by the method of the Law of Comparative Judgment
developed by L.L. Thurstone.
• This technique involves the conversion of frequencies of preferences into a table of proportions which are then transformed into a Z matrix by referring to the table of areas under the normal curve. J.P. Guilford in his book “Psychometric Methods” has given a procedure which is relatively easier; the method is known as the Composite Standard Method.

• By following the composite standard method, we can develop an interval scale from the paired comparison ordinal data given in the table, for which purpose we have to adopt the following steps in order (a computational sketch follows this list):
o Using the data in the table, we work out the column mean with the help of the formula given below:

Mp = (C + .5N) / (nN)

where Mp = the mean proportion of the columns, C = the total number of choices for a given suggestion, n = number of stimuli (proposals in the given problem), and N = number of items in the sample.

The column means have been shown in the Mp row in the above table.

• The Z values for the Mp are secured from the table giving the area under the normal curve. When the Mp value is less than .5, the Z value is negative, and for all Mp values higher than .5, the Z values are positive. These Z values are shown in the Zj row in the above table.
• As the Zj values represent an interval scale, zero is an arbitrary value. Hence we can eliminate negative scale values by giving the value of zero to the lowest scale value (this being –.11 in our example, which we shall take as equal to zero) and then adding the absolute value of this lowest scale value to all other scale items. This scale has been shown in the Rj row in the above table.
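
The following is a minimal computational sketch of these steps in Python, under purely illustrative assumptions: n = 3 proposals judged by N = 20 respondents, with hypothetical choice counts C; the standard-library function NormalDist().inv_cdf stands in for the normal-curve table.

    # Composite Standard Method sketch (hypothetical data).
    from statistics import NormalDist

    n, N = 3, 20                    # stimuli and respondents (assumed)
    pairs = n * (n - 1) // 2        # number of paired comparisons: n(n-1)/2 = 3
    C = {"proposal_A": 28, "proposal_B": 20, "proposal_C": 12}  # choices per proposal

    # Step 1: column mean proportion Mp = (C + .5N) / (nN)
    Mp = {k: (c + 0.5 * N) / (n * N) for k, c in C.items()}

    # Step 2: Z value for each Mp from the normal curve (negative below .5)
    Zj = {k: NormalDist().inv_cdf(p) for k, p in Mp.items()}

    # Step 3: shift so the lowest scale value becomes zero (zero is arbitrary
    # on an interval scale), giving the Rj scale values
    low = min(Zj.values())
    Rj = {k: z - low for k, z in Zj.items()}

    for k in C:
        print(k, round(Mp[k], 3), round(Zj[k], 3), round(Rj[k], 3))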

Method of Rank Order:

• Under this method of comparative scaling, the respondents are asked to rank their
choices. This method is easier and faster than the method of paired comparisons stated
above. Moreover, a complete ranking at times is not needed in which case the
respondents may be asked to rank only their first, say, four choices while the number of
overall items involved may be more than four, say 15, 20 or more. To secure a simple ranking of all items involved we simply total the rank values received by each item (a short sketch of this follows below).
• There are methods through which we can as well develop an interval scale of these data.
But then there are limitations of this method. The first one is that data obtained
through this method are ordinal data and hence rank ordering is an ordinal scale with
all its limitations. Then there may be the problem of respondents becoming careless in
assigning ranks particularly when there are many (usually more than 10) items.
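
A minimal sketch of rank-order totaling, assuming three hypothetical respondents who each ranked four items (rank 1 = most preferred); the resulting totals remain ordinal data:

    # Rank-order scoring sketch (hypothetical rankings).
    rankings = [
        {"item_A": 1, "item_B": 2, "item_C": 3, "item_D": 4},
        {"item_A": 2, "item_B": 1, "item_C": 4, "item_D": 3},
        {"item_A": 1, "item_B": 3, "item_C": 2, "item_D": 4},
    ]

    totals = {}
    for r in rankings:
        for item, rank in r.items():
            totals[item] = totals.get(item, 0) + rank

    # Lower total = higher overall preference (ordinal information only)
    for item, t in sorted(totals.items(), key=lambda kv: kv[1]):
        print(item, t)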

Scale Construction Techniques

• In social science studies, while measuring attitudes of the people, we generally follow
the technique of preparing the opinionnaire (or attitude scale) in such a way that the
score of the individual responses assigns him a place on a scale. Under this approach, the
respondent expresses his agreement or disagreement with a number of statements
relevant to the issue. While developing such statements, the researcher must note the
following two points:
o That the statements must elicit responses which are psychologically related to
the attitude being measured;
o That the statements need to be such that they discriminate not merely between extremes of attitude but also among individuals who differ slightly.
• Researchers must as well be aware that inferring attitude from what has been recorded
in opinionnaires has several limitations. People may conceal their attitudes and express
socially acceptable opinions. They may not really know how they feel about a social
issue. People may be unaware of their attitude about an abstract situation; until
confronted with a real situation, they may be unable to predict their reaction.
• Even behaviour itself is at times not a true indication of attitude. For instance, when
politicians kiss babies, their behaviour may not be a true expression of affection toward
infants. Thus, there is no sure method of measuring attitude; we only try to measure the
expressed opinion and then draw inferences from it about people’s real feelings or
attitudes.
• With all these limitations in mind, psychologists and sociologists have developed
several scale construction techniques for the purpose. The researcher should know these
techniques so as to develop an appropriate scale for his own study. Some of the
important approaches, along with the corresponding scales developed under each
approach to measure attitude are as follows:

Different Scales for Measuring Attitudes of People

A. Arbitrary Scales
• Arbitrary scales are developed on ad hoc basis and are designed largely through
the researcher’s own subjective selection of items. The researcher first collects a few statements or items which he believes are unambiguous and appropriate to a given topic. Some of these are selected for inclusion in the measuring instrument and then people are asked to check in a list the statements with which they agree.
• The chief merit of such scales is that they can be developed very easily, quickly
and with relatively less expense. They can also be designed to be highly specific
and adequate. Because of these benefits, such scales are widely used in practice.

• At the same time, there are some limitations of these scales. The most important
one is that we do not have objective evidence that such scales measure the
concepts for which they have been developed. We have simply to rely on
researcher’s insight and competence.
B. Differential Scales (or Thurstone-type Scales)
The name of L.L. Thurstone is associated with differential scales which have been
developed using consensus scale approach. Under such an approach, the selection of
items is made by a panel of judges who evaluate the items in terms of whether they are
relevant to the topic area and unambiguous in implication. The detailed procedure is as
under:
• The researcher gathers a large number of statements, usually twenty or more,
that express various points of view toward a group, institution, idea, or practice
(i.e., statements belonging to the topic area).
• These statements are then submitted to a panel of judges, each of whom
arranges them in eleven groups or piles ranging from one extreme to another in
position. Each of the judges is requested to place generally in the first pile the
statements which he thinks are most unfavorable to the issue, in the second pile
to place those statements which he thinks are next most unfavorable and he goes
on doing so in this manner till in the eleventh pile he puts the statements which
he considers to be the most favourable.
• This sorting by each judge yields a composite position for each of the items. In
case of marked disagreement between the judges in assigning a position to an
item, that item is discarded.
• For items that are retained, each is given its median scale value between one and
eleven as established by the panel. In other words, the scale value of any one
statement is computed as the ‘median’ position to which it is assigned by the
group of judges.
• A final selection of statements is then made. For this purpose, a sample of statements whose median scores are spread evenly from one extreme to the other is taken. The statements so selected constitute the final scale to be administered to respondents. The position of each statement on the scale is the same as determined by the judges.
• After developing the scale as stated above, the respondents are asked during the administration of the scale to check the statements with which they agree. The median value of the statements that they check is worked out, and this establishes their score or quantifies their opinion (a computational sketch follows this subsection). It may be noted that in the actual instrument the statements are arranged in random order of scale value.
• If the values are valid and if the opinionnaire deals with only one attitude
dimension, the typical respondent will choose one or several contiguous items
(in terms of scale values) to reflect his views. However, at times divergence may
occur when a statement appears to tap a different attitude dimension.
• The Thurstone method has been widely used for developing differential scales
which are utilised to measure attitudes towards varied issues like war, religion,
etc. Such scales are considered most appropriate and reliable when used for
measuring a single attitude. But an important deterrent to their use is the cost
and effort required to develop them. Another weakness of such scales is that the
values assigned to various statements by the judges may reflect their own
attitudes. The method is not completely objective; it ultimately involves a subjective decision process. Critics of this method also opine that some other scale designs give more information about the respondent’s attitude in comparison to differential scales.
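
A minimal sketch of the Thurstone scoring arithmetic, under assumed data: five hypothetical judges sort three statements into piles 1 (most unfavourable) to 11 (most favourable); a statement’s scale value is the median pile, and a respondent’s score is the median scale value of the statements he endorses.

    # Thurstone-type scale value sketch (hypothetical judge sortings).
    from statistics import median

    judge_piles = {                      # statement -> pile assigned by each judge
        "stmt_1": [2, 3, 2, 1, 2],
        "stmt_2": [6, 5, 6, 7, 6],
        "stmt_3": [10, 11, 10, 9, 10],
    }

    # Scale value of a statement = median pile position across judges
    scale_value = {s: median(p) for s, p in judge_piles.items()}

    # A respondent's score = median scale value of the statements endorsed
    endorsed = ["stmt_2", "stmt_3"]
    respondent_score = median(scale_value[s] for s in endorsed)

    print(scale_value)        # {'stmt_1': 2, 'stmt_2': 6, 'stmt_3': 10}
    print(respondent_score)   # 8.0
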
C. Summated Scales (or Likert-type Scales)
• Summated scales (or Likert-type scales) are developed by utilising the item analysis approach, wherein a particular item is evaluated on the basis of how well it discriminates between those persons whose total score is high and those whose score is low. Those items or statements that best meet this sort of discrimination test are included in the final instrument.
• Thus, summated scales consist of a number of statements which express either a favourable or unfavourable attitude towards the given object to which the respondent is asked to react. The respondent indicates his agreement or disagreement with each statement in the instrument. Each response is given a numerical score, indicating its favourableness or unfavourableness, and the scores are totalled to measure the respondent’s attitude.
• In other words, the overall score represents the respondent’s position on the continuum of favourableness-unfavourableness towards an issue. Most frequently used summated scales in the study of social attitudes follow the pattern devised by Likert. For this reason they are often referred to as Likert-type scales.
• In a Likert scale, the respondent is asked to respond to each of the statements in terms of several degrees, usually five degrees (but at times 3 or 7 may also be used) of agreement or disagreement. For example, when asked to express an opinion on whether one considers his job quite pleasant, the respondent may respond in any one of the following ways:
i. strongly agree
ii. agree
iii. undecided
iv. disagree
v. strongly disagree
• We find that these five points constitute the scale. At one extreme of the scale there is strong agreement with the given statement and at the other, strong disagreement, and between them lie intermediate points. Each point on the scale carries a score. The response indicating the least favourable degree of job satisfaction is given the least score (say 1) and the most favourable is given the highest score (say 5). These score values are normally not printed on the instrument but are shown here just to indicate the scoring pattern.
• The Likert scaling technique, thus, assigns a scale value to each of the five responses. The same thing is done in respect of each and every statement in the
instrument. This way, the instrument yields a total score for each respondent,
which would then measure the respondent’s favourableness toward the given
point of view. If the instrument consists of, say 30 statements, the following
score values would be revealing.
30 × 5 = 150 Most favourable response possible
30 × 3 = 90 A neutral attitude
30 × 1 = 30 Most unfavourable attitude.
• The scores for any individual would fall between 30 and 150. If the score happens to be above 90, it shows a favourable opinion on the given point of view; a score below 90 would mean an unfavourable opinion; and a score of exactly 90 would suggest a neutral attitude (a scoring sketch follows this subsection).
• Procedure: The procedure for developing a Likert-type scale is as follows:
i. As a first step, the researcher collects a large number of statements which are relevant to the attitude being studied, each of which expresses definite favourableness or unfavourableness to a particular point of view or attitude; the number of favourable and unfavourable statements should be approximately equal.
ii. After the statements have been gathered, a trial test should be
administered to a number of subjects. In other words, a small group of
people, from those who are going to be studied finally, are asked to
indicate their response to each statement by checking one of the
categories of agreement or disagreement using a five point scale as stated
above.
iii. The responses to the various statements are scored in such a way that a response indicative of the most favourable attitude is given the highest score of 5 and that with the most unfavourable attitude is given the lowest score, say, of 1.
iv. Then, the total score of each respondent is obtained by adding the scores received for the separate statements.
v. The next step is to array these total scores and find out those statements
which have a high discriminatory power. For this purpose, the researcher
may select some part of the highest and the lowest total scores say the
top 25 per cent and the bottom 25 per cent. These two extreme groups
are interpreted to represent the most favourable and the least favourable
attitudes and are used as criterion groups by which to evaluate individual
statements. This way we determine which statements consistently
correlate with low favourability and which with high favourability.
vi. Only those statements that correlate with the total test should be
retained in the final instrument and all others must be discarded from it.
• Advantages: The Likert-type scale has several advantages. Mention may be
made of the important ones.
i. It is relatively easy to construct the Likert-type scale in comparison to the Thurstone-type scale because the Likert-type scale can be constructed without a panel of judges.

ii. Likert-type scale is considered more reliable because under it
respondents answer each statement included in the instrument. As such it
also provides more information and data than does the Thurstone-type
scale.
iii. Each statement, included in the Likert-type scale, is given an empirical
test for discriminating ability and as such, unlike Thurstone-type scale,
the Likert-type scale permits the use of statements that are not
manifestly related (to have a direct relationship) to the attitude being
studied.
iv. Likert-type scale can easily be used in respondent-centred and stimulus-
centred studies i.e., through it we can study how responses differ between
people and how responses differ between stimuli.
v. Likert-type scale takes much less time to construct; it is frequently used
by the students of opinion research. Moreover, it has been reported in
various research studies that there is high degree of correlation between
Likert-type scale and Thurstone-type scale.
• Limitations: There are several limitations of the Likert-type scale as well.
i. One important limitation is that, with this scale, we can simply examine
whether respondents are more or less favourable to a topic, but we
cannot tell how much more or less they are. There is no basis for belief
that the five positions indicated on the scale are equally spaced. The
interval between ‘strongly agree’ and ‘agree’, may not be equal to the
interval between “agree” and “undecided”. This means that Likert scale
does not rise to a stature more than that of an ordinal scale, whereas the
designers of Thurstone scale claim the Thurstone scale to be an interval
scale.
ii. One further disadvantage is that often the total score of an individual
respondent has little clear meaning since a given total score can be
secured by a variety of answer patterns. It is unlikely that the respondent
can validly react to a short statement on a printed form in the absence of
real-life qualifying situations.
iii. Moreover, there “remains a possibility that people may answer according
to what they think they should feel rather than how they do feel.” This
particular weakness of the Likert-type scale is met by using a cumulative
scale.
• In spite of all the limitations, the Likert-type summated scales are regarded as
the most useful in a situation wherein it is possible to compare the respondent’s
score with a distribution of scores from some well defined group. They are
equally useful when we are concerned with a programme of change or
improvement in which case we can use the scales to measure attitudes before and
after the programme of change or improvement in order to assess whether our
efforts have had the desired effects.
• We can as well correlate scores on the scale to other measures without any
concern for the absolute value of what is favourable and what is unfavourable.
All this accounts for the popularity of Likert-type scales in social studies relating
to measuring of attitudes.
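
A minimal sketch of the scoring and item-analysis steps above, with randomly generated stand-in responses (40 respondents, 10 items already coded 1 to 5); the 0.5 retention cut-off is illustrative, not a fixed rule.

    # Likert-type scoring and item analysis sketch (simulated data).
    import numpy as np

    rng = np.random.default_rng(0)
    responses = rng.integers(1, 6, size=(40, 10))   # 40 respondents, 10 items

    totals = responses.sum(axis=1)                  # summated score per respondent
    # With 10 items: maximum 50 (most favourable), neutral 30, minimum 10

    # Item analysis: compare the top 25% and bottom 25% of total scorers
    order = np.argsort(totals)
    k = len(totals) // 4
    low_group, high_group = order[:k], order[-k:]

    # Discrimination index: mean item score in the high group minus the low
    # group; items with small differences discriminate poorly and are discarded
    discrimination = (responses[high_group].mean(axis=0)
                      - responses[low_group].mean(axis=0))
    keep = discrimination > 0.5    # illustrative cut-off
    print(np.round(discrimination, 2))
    print("retain items:", np.where(keep)[0])
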
D. Multidimensional Scaling: Multidimensional scaling (MDS) is a relatively more complicated scaling device, but with this sort of scaling one can scale objects, individuals or both with a minimum of information. Multidimensional scaling can be characterised as a set of procedures for portraying perceptual or affective dimensions of substantive interest.
• It “provides useful methodology for portraying subjective judgements of diverse
kinds.” MDS is used when all the variables (whether metric or non-metric) in a
study are to be analysed simultaneously and all such variables happen to be
independent. The underlying assumption in MDS is that people (respondents)
“perceive a set of objects as being more or less similar to one another on a
number of dimensions (usually uncorrelated with one another) instead of only
one.”
• Through MDS techniques, one can represent geometrically the locations and
interrelationships among a set of points. In fact, these techniques attempt to
locate the points, given the information about a set of interpoint distances, in
space of one or more dimensions such as to best summarise the information
contained in the interpoint distances.
• The distances in the solution space then optimally reflect the distances contained
in the input data. For instance, if objects, say X and Y, are thought of by the
respondent as being most similar as compared to all other possible pairs of
objects, MDS techniques will position objects X and Y in such a way that the
distance between them in multidimensional space is shorter than that between
any two other objects.
• Two approaches, viz., the metric approach and the non-metric approach, are usually talked about in the context of MDS, while attempting to construct a space containing m points such that m(m – 1)/2 interpoint distances reflect the input data. The metric approach to MDS treats the input data as interval scale data and solves by applying statistical methods for the additive constant which minimises the dimensionality of the solution space.
• This approach utilises all the information in the data in obtaining a solution. The
data (i.e., the metric similarities of the objects) are often obtained on a bipolar
similarity scale on which pairs of objects are rated one at a time. If the data
reflect exact distances between real objects in an r-dimensional space, their
solution will reproduce the set of interpoint distances. But as the true and real
data are rarely available, we require random and systematic procedures for
obtaining a solution.
• Generally, the judged similarities among a set of objects are statistically
transformed into distances by placing those objects in a multidimensional space
of some dimensionality.
• The non-metric approach first gathers the non-metric similarities by asking
respondents to rank order all possible pairs that can be obtained from a set of
objects. Such non-metric data is then transformed into some arbitrary metric
space and then the solution is obtained by reducing the dimensionality.
• In other words, this non-metric approach seeks “a representation of points in a
space of minimum dimensionality such that the rank order of the interpoint
distances in the solution space maximally corresponds to that of the data. This is
achieved by requiring only that the distances in the solution be monotone with
the input data.” The non-metric approach has come into prominence during the
sixties with the coming into existence of high speed computers to generate
metric solutions for ordinal input data.

• The significance of MDS lies in the fact that it enables the researcher to study
“the perceptual structure of a set of stimuli and the cognitive processes
underlying the development of this structure. Psychologists, for example,
employ multidimensional scaling techniques in an effort to scale psychophysical
stimuli and to determine appropriate labels for the dimensions along which these
stimuli vary.”
• The MDS techniques, in fact, do away with the need in the data collection process to specify the attribute(s) along which the several brands, say of a
particular product, may be compared as ultimately the MDS analysis itself
reveals such attribute(s) that presumably underlie the expressed relative
similarities among objects. Thus, MDS is an important tool in attitude
measurement and the techniques falling under MDS promise “a great advance
from a series of unidimensional measurements (example, a distribution of
intensities of feeling towards single attribute such as colour, taste or a preference
ranking with indeterminate intervals), to a perceptual mapping in
multidimensional space of objects, company images, advertisement brands, etc.”
• In spite of all the merits stated above, the MDS is not widely used because of the
computation complications involved under it. Many of its methods are quite
laborious in terms of both the collection of data and the subsequent analyses.
However, some progress has been achieved (due to the pioneering efforts of Paul
Green and his associates) during the last few years in the use of non-metric
MDS in the context of market research problems.
• The techniques have been specifically applied in “finding out the perceptual dimensions, and the spacing of stimuli along these dimensions, that people use in making judgements about the relative similarity of pairs of stimuli.” But, “in the long run, the worth of MDS will be determined by the extent to which it advances the behavioural sciences.” (A small non-metric MDS sketch follows.)
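
A minimal non-metric MDS sketch using scikit-learn (an assumption of this illustration, not a tool named in the text), on a hypothetical 4 × 4 matrix of judged dissimilarities among four objects (0 = identical):

    # Non-metric MDS sketch (hypothetical dissimilarity data; requires scikit-learn).
    import numpy as np
    from sklearn.manifold import MDS

    dissim = np.array([
        [0.0, 1.0, 4.0, 5.0],
        [1.0, 0.0, 3.5, 4.5],
        [4.0, 3.5, 0.0, 1.5],
        [5.0, 4.5, 1.5, 0.0],
    ])

    # metric=False asks only that solution distances be monotone with the
    # rank order of the input data, as in the non-metric approach above
    mds = MDS(n_components=2, metric=False, dissimilarity="precomputed",
              random_state=0)
    coords = mds.fit_transform(dissim)
    print(np.round(coords, 2))   # similar objects land close together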

Sampling
Need for Sampling

Sampling is used in practice for a variety of reasons such as,

• Sampling can save time and money. A sample study is usually less expensive than a census study and produces results at a relatively faster speed.
• Sampling may enable more accurate measurements, for a sample study is generally conducted by trained and experienced investigators.
• Sampling remains the only way when the population contains infinitely many members.
• Sampling remains the only choice when a test involves the destruction of the item under study.
• Sampling usually enables us to estimate sampling errors and, thus, assists in obtaining information concerning some characteristic of the population.

Definition

Following are some fundamental definitions concerning sampling concepts and principles:

Universe/Population:

• From a statistical point of view, the term ‘Universe’ refers to the total of the items or
units in any field of inquiry, whereas the term ‘population’ refers to the total of items
about which information is desired.
• The attributes that are the object of study are referred to as characteristics and the units possessing them are called elementary units. The aggregate of such units is generally described as the population. Thus, all units in any field of inquiry form the universe and all elementary units (on the basis of one characteristic or more) form the population.
• Many a time it is difficult to find any difference between the population and the universe, and as such the two terms are taken as interchangeable. However, a researcher must define these terms precisely.
• The population or universe can be finite or infinite. The population is said to be finite if
it consists of a fixed number of elements so that it is possible to enumerate it in its
totality. For instance, the population of a city, the number of workers in a factory are
examples of finite populations. The symbol ‘N’ is generally used to indicate how many
elements (or items) are there in case of a finite population.
• An infinite population is that population in which it is theoretically impossible to observe all the elements. Thus, in an infinite population the number of items is infinite, i.e., we cannot have any idea about the total number of items. The number of stars in the sky and the possible rolls of a pair of dice are examples of infinite populations.
• From a practical consideration, we then use the term infinite population for a population
that cannot be enumerated in a reasonable period of time. This way we use the
theoretical concept of infinite population as an approximation of a very large finite
population.

Sampling frame:

• The elementary units or the group or cluster of such units may form the basis of
sampling process, in which case they are called sampling units. A list containing all
such sampling units is known as sampling frame. Thus, sampling frame consists of a list
of items from which the sample is to be drawn.
• If the population is finite and the time frame is in the present or past, then it is possible
for the frame to be identical with the population. In most cases they are not identical
because it is often impossible to draw a sample directly from population.
• As such this frame is either constructed by the researcher for the purpose of his study or may consist of some existing list of the population. For instance, one can use a telephone directory as a frame for conducting an opinion survey in a city. Whatever the frame may be, it should be a good representative of the population.

Sampling design:

• A sample design is a definite plan for obtaining a sample from the sampling frame. It refers to the technique or the procedure the researcher would adopt in selecting some sampling units from which inferences about the population are drawn. Sampling design is determined before any data are collected.

Statistic(s) and parameter(s):

• A statistic is a characteristic of a sample, whereas a parameter is a characteristic of a population. Thus, when we work out certain measures such as mean, median, mode or the like from samples, they are called statistic(s), for they describe the characteristics of a sample. But when such measures describe the characteristics of a population, they are known as parameter(s). For instance, the sample mean is a statistic, while the population mean is a parameter. To obtain the estimate of a parameter from a statistic constitutes the prime objective of sampling analysis.

Sampling error:

• Sample surveys do imply the study of a small portion of the population and as such
there would naturally be a certain amount of inaccuracy in the information collected.
This inaccuracy may be termed as sampling error or error variance.
• Sampling error = Frame error + Chance error + Response error (If we add
measurement error or the non-sampling error to sampling error, we get total error).
Sampling errors occur randomly and are equally likely to be in either direction. The
magnitude of the sampling error depends upon the nature of the universe; the more
homogeneous the universe, the smaller the sampling error. Sampling error is inversely
related to the size of the sample, i.e., sampling error decreases as the sample size
increases and vice-versa.
• A measure of the random sampling error can be calculated for a given sample design
and size and this measure is often called the precision of the sampling plan. Sampling
error is usually worked out as the product of the critical value at a certain level of
significance and the standard error.

• As opposed to sampling errors, we may have non-sampling errors which may creep in
during the process of collecting actual information and such errors occur in all surveys
whether census or sample. There is no way to measure non-sampling errors.
• In other words, sampling errors are those errors which arise on account of sampling, and they generally happen to be random variations (in the case of random sampling) in the sample estimates around the true population values.
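
A minimal sketch of the computation just described (critical value × standard error), with hypothetical sample figures:

    # Sampling error = critical value x standard error (hypothetical figures).
    from math import sqrt
    from statistics import NormalDist

    n = 400                 # sample size (assumed)
    s = 50.0                # sample standard deviation (assumed)
    confidence = 0.95

    se = s / sqrt(n)                                    # standard error of the mean
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value, ~1.96
    sampling_error = z * se
    print(round(sampling_error, 2))                     # ~4.9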

Precision:

• Precision is the range within which the population average (or other parameter) will lie in accordance with the reliability specified in the confidence level, expressed as a percentage of the estimate (±) or as a numerical quantity. For instance, if the estimate is Rs 4000 and the precision desired is ± 4%, then the true value will be no less than Rs 3840 and no more than Rs 4160.
• This is the range (Rs 3840 to Rs 4160) within which the true answer should lie. But if
we desire that the estimate should not deviate from the actual value by more than Rs
200 in either direction, in that case the range would be Rs 3800 to Rs 4200.

Confidence level and significance level:

• The confidence level or reliability is the expected percentage of times that the actual
value will fall within the stated precision limits. Thus, if we take a confidence level of
95%, then we mean that there are 95 chances in 100 (or .95 in 1) that the sample results
represent the true condition of the population within a specified precision range against
5 chances in 100 (or .05 in 1) that it does not.
• Precision is the range within which the answer may vary and still be acceptable;
confidence level indicates the likelihood that the answer will fall within that range, and
the significance level indicates the likelihood that the answer will fall outside that range.
If the confidence level is 95%, then the significance level will be (100 – 95) i.e., 5%; if the
confidence level is 99%, the significance level is (100 – 99) i.e., 1%, and so on.
• The areas of the normal curve within the precision limits for the specified confidence level constitute the acceptance region, and the areas of the curve outside these limits in either direction constitute the rejection regions.
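
A short sketch tying precision, confidence level and significance level together, reusing the Rs 4000 estimate with ± 4% precision from above:

    # Precision, confidence level and significance level in one place.
    estimate = 4000.0
    precision = 0.04                 # +/- 4% of the estimate
    confidence = 0.95                # reliability
    significance = 1 - confidence    # 0.05, likelihood of falling outside

    lower = estimate * (1 - precision)   # Rs 3840
    upper = estimate * (1 + precision)   # Rs 4160
    print(lower, upper, significance)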

Sampling distribution:

• We are often concerned with the sampling distribution in sampling analysis. If we take a certain number of samples and for each sample compute various statistical measures such as mean, standard deviation, etc., then we find that each sample may give its own value for the statistic under consideration.
• All such values of a particular statistic, say mean, together with their relative
frequencies will constitute the sampling distribution of the particular statistic, say mean.
• Accordingly, we can have sampling distribution of mean, or the sampling distribution of
standard deviation or the sampling distribution of any other statistical measure. It may
be noted that each item in a sampling distribution is a particular statistic of a sample.
• The sampling distribution tends to be quite close to the normal distribution if the number of samples is large. The significance of the sampling distribution follows from the fact that the mean of a sampling distribution is the same as the mean of the universe. Thus, the mean of the sampling distribution can be taken as the mean of the universe (the simulation sketch below illustrates this).
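
A small simulation of the sampling distribution of the mean, assuming a hypothetical skewed universe with mean 10; each sample mean is one item in the sampling distribution:

    # Sampling distribution of the mean, simulated (hypothetical universe).
    import numpy as np

    rng = np.random.default_rng(1)
    universe = rng.exponential(scale=10.0, size=100_000)

    sample_means = [rng.choice(universe, size=50).mean() for _ in range(2_000)]

    print(round(universe.mean(), 2))          # mean of the universe, ~10
    print(round(np.mean(sample_means), 2))    # mean of the sampling distribution, ~10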

Essentials of Good Samples

It is important that the sampling results must reflect the characteristics of the population.
Therefore, while selecting the sample from the population under investigation it should be
ensured that the sample has the following characteristics:

• A sample must represent a true picture of the population from which it is drawn.
• A sample must be unbiased by the sampling procedure.
• A sample must be taken at random so that every member of the population of data has
an equal chance of selection.
• A sample must be sufficiently large but as economical as possible.
• A sample must be accurate and complete. It should not leave any information
incomplete and should include all the respondents, units or items included in the
sample.
• Adequate sample size must be taken considering the degree of precision required in the
results of inquiry.

Methods of Sampling

• If money, time, trained manpower and other resources were not a concern, the researcher could get the most accurate data by surveying the entire population of
interest. Since most often the resources are scarce, the researcher is forced to go for
sampling. But the real purpose of the survey is to know the characteristics of the population. The question, then, is with what level of confidence the researcher will be able to say that the characteristics of a sample represent the entire population.
• Using a combination of tests of hypotheses and unbiased sampling methods, the researcher can collect data that actually represent the characteristics of the entire population from which the sample was taken. To ensure a high level of confidence that the sample represents the population, it is necessary that the sample is unbiased and sufficiently large.
• It is statistically established that if we increase the sample size we shall be that much closer to the characteristics of the population. Ultimately, if we cover each and every unit of the population, the characteristics of the sample will be equal to the characteristics of the population. That is why in a census there is no sampling error. The larger the sample size, the smaller the sampling error.
• In statistics, bias means systematic error. The sample must be free of such error to be an unbiased sample. In practice, it is impossible to achieve a completely error-free sample even using unbiased sampling methods; however, we can minimise the error by employing appropriate sampling methods. The various sampling methods can be classified into two categories.

There are two ways to obtain a representative sample:

1. Probability sampling/Random sampling: In probability sampling, the choice of the sample is made at random, which guarantees that each member of the population has a known and equal probability of selection and inclusion in the sample group. Researchers should ensure that they have updated information on the population from which they will draw the sample, and that the sample is large enough to establish representativeness.
2. Non-probability sampling: In non-probability sampling, the researcher deliberately selects different types of people in order to obtain a more balanced, representative sample. Knowing the demographic characteristics of the group will undoubtedly help to limit the profile of the desired sample and define the variables that interest the researchers, such as gender, age, place of residence, etc. By knowing these criteria before obtaining the information, researchers can exercise control to create a sample that is representative and efficient for the study.

Non-Probability Sampling Methods: Non-probability sampling is one in which there is no way of assessing the probability of the element or group of elements of the population being included in the sample. In other words, non-probability sampling methods are those that provide no basis for estimating how closely the characteristics of a sample approximate the parameters of the population from which the sample was obtained. This is because non-probability samples do not use the techniques of random sampling. Important techniques of non-probability sampling methods are:

i) Haphazard, Accidental, or Convenience Sampling

Haphazard sampling can produce ineffective, highly unrepresentative samples and is not
recommended. When a researcher haphazardly selects cases that are convenient, he or she can
easily get a sample that seriously misrepresents the population. Such samples are cheap and
quick; however, the systematic errors that easily occur make them worse than no sample at all.
The person-on-the-street interview conducted by television programs is an example of a haphazard sample. Likewise, television interviewers often select people who look “normal” to them and avoid people who are unattractive, poor, very old, or inarticulate.

Such haphazard samples may have entertainment value, but they can give a distorted view and seriously misrepresent the population. For example, an investigator may take the students of class X into the research plan simply because the class teacher happens to be his/her friend. This illustrates accidental or convenience sampling.

ii) Quota Sampling

Quota sampling is an improvement over haphazard sampling. In quota sampling, a researcher first identifies relevant categories of people (e.g., male and female; or under age 30, ages 30 to 60, over age 60, etc.), then decides how many to get in each category. Thus, the number of
people in various categories of the sample is fixed. For example, a researcher decides to select 5
males and 5 females under age 30, 10 males and 10 females aged 30 to 60, and 5 males and 5
females over age 60 for a 40-person sample. It is difficult to represent all population
characteristics accurately.

Quota sampling ensures that some differences are in the sample. In haphazard sampling, all those interviewed might be of the same age, sex, or background. But, once the quota sampler fixes the categories and the number of cases in each category, he or she uses haphazard or convenience sampling. Nothing prevents the researcher from selecting people who act friendly or who want to be interviewed. Because interviewers choose whom they like (within the quota criteria), they may select those who are easiest to interview, so sampling bias can take place. Because the random method is not used, it is impossible to estimate the accuracy. Despite these limitations, quota sampling is a popular method among the non-probability methods of sampling, because it enables the researcher to introduce a few controls into his research plan, and these methods of sampling are more convenient and less costly than many other methods of sampling (a quota-filling sketch follows).
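
A minimal sketch of quota filling, assuming the 40-person design above (fixed quotas per sex and age cell, filled from whoever turns up, which is exactly where the bias can enter):

    # Quota sampling sketch (hypothetical 40-person design).
    quotas = {
        ("male", "under 30"): 5,  ("female", "under 30"): 5,
        ("male", "30-60"): 10,    ("female", "30-60"): 10,
        ("male", "over 60"): 5,   ("female", "over 60"): 5,
    }
    selected = {cell: [] for cell in quotas}

    def consider(person, sex, age_band):
        """Accept a respondent only while his/her quota cell is unfilled."""
        cell = (sex, age_band)
        if cell in quotas and len(selected[cell]) < quotas[cell]:
            selected[cell].append(person)
            return True
        return False

    # Respondents arrive in convenience order; acceptance is non-random
    consider("R1", "male", "under 30")
    consider("R2", "male", "under 30")
    print(sum(len(v) for v in selected.values()))   # grows toward 40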

Limitations

• In quota sampling, the respondents are selected according to the convenience of the field investigator rather than on a random basis. This kind of selection of the sample may be biased. Suppose, in our example of soft drinks, after the sample is taken it is found that most of the respondents belong to the lower income group; then the purpose of conducting the survey becomes useless and the results may not reflect the actual situation.
• If the number of parameters on the basis of which the quotas are fixed is large, then it becomes difficult for the researcher to fix the quota for each sub-group.
• The field workers have the tendency to cover the quota by going to those places where
the respondents may be willing to provide information and avoid those with unwilling
respondents. For example, the investigators may avoid places where high income group
respondents stay and cover only low income group areas.

iii) Purposive sampling

Purposive sampling is a valuable kind of sampling for special situations. It is used in exploratory research or in field research. It uses the judgment of an expert in selecting cases, or it selects cases with a specific purpose in mind. With purposive sampling, the researcher never knows whether the cases selected represent the population. Purposive sampling is appropriate to select unique cases that are especially informative.

For example, a researcher wants to study the temperamental attributes of children with certain problem behaviours. It is very difficult to list all such children and sample randomly from the list. The researcher uses many different methods to identify these cases and approaches them to obtain the relevant information. The primary consideration in purposive sampling is the judgment of the researcher as to who can provide the best information to achieve the objectives of the study. The researcher only goes to those people who in his/her opinion are likely to have the required information and be willing to share it.

For studying attitudes toward any national issue, a sample of journalists, teachers and legislators may be taken as an example of purposive sampling, because they can more reasonably be expected to represent the correct attitude than other classes of people residing in the country.

Purposive sampling is somewhat less costly, more readily accessible and more convenient, and it selects only those individuals that are relevant to the research design. Despite these advantages, there is no way to ensure that the sample is truly representative of the population, and much depends on the ability of the researcher to assess the elements of the population.

iv) Snowball Sampling

Snowball sampling is also known as the network, chain referral or reputational sampling method. Snowball sampling, which is a non-probability sampling method, is basically sociometric. It begins with the collection of data on one or more contacts usually known to the person collecting the data. At the end of the data collection process (e.g., questionnaire, survey, or interview), the data collector asks the respondent to provide contact information for other potential respondents. These potential respondents are contacted and provide more contacts. Snowball sampling is most useful when there are very few methods to secure a list of the population or when the population is unknowable.

Snowball sampling has some advantages: 1) Snowball sampling, which is primarily a sociometric sampling technique, has proved very important and is helpful in studying small informal social groups and their impact upon formal organisational structure; 2) Snowball sampling reveals communication patterns in community organisation; concepts like community power and decision-making can also be studied with the help of such a sampling technique.
Snowball sampling has some limitations also: 1) Snowball sampling becomes cumbersome and difficult when the sample is large, say when it exceeds 100; 2) This method of sampling does not allow the researcher to use probability statistical methods. In fact, the elements included in the sample are not randomly drawn; they are dependent on the subjective choices of the originally selected respondents. This introduces some bias into the sampling. (A chain-referral sketch follows.)
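
A minimal sketch of snowball (chain referral) sampling over a hypothetical referral network, in which each respondent names further contacts; note that inclusion is driven by the referral chains, not by chance:

    # Snowball sampling sketch (hypothetical referral network).
    referrals = {
        "seed": ["A", "B"],
        "A": ["C"],
        "B": ["C", "D"],
        "C": [],
        "D": ["E"],
        "E": [],
    }

    sample, queue = [], ["seed"]
    seen = set(queue)
    while queue and len(sample) < 5:          # stop at a target sample size
        person = queue.pop(0)
        sample.append(person)                 # interview this respondent
        for contact in referrals[person]:     # then follow his/her referrals
            if contact not in seen:
                seen.add(contact)
                queue.append(contact)

    print(sample)   # ['seed', 'A', 'B', 'C', 'D']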

v) Systematic sampling

Systematic sampling is another method of non-probability sampling plan, though the label ‘systematic’ is somewhat misleading in the sense that all probability sampling methods are also systematic sampling methods. Due to this, it often sounds as if systematic sampling should be included under the category of probability sampling, but in reality this is not the case. Systematic sampling may be defined as drawing or selecting every nth person from a predetermined list of elements or individuals. Selecting every 5th roll number in a class of 60 students will constitute systematic sampling. Likewise, drawing every 8th name from a telephone directory is an example of systematic sampling. If we pay attention to the systematic sampling plan, it becomes obvious that such a plan possesses certain characteristics of randomness (the first element selected is a random one) and, at the same time, possesses some non-probability traits, such as excluding all persons between every nth element chosen.

Systematic sampling is a relatively quick method of obtaining a sample of elements, and it is very easy to check whether every nth number or name has been selected. Further, systematic sampling is easy to use. Despite these advantages, systematic sampling ignores all persons between every nth element chosen; it is therefore not a probability sampling plan. In systematic sampling there is a chance of sampling error if the list is arranged in a particular order.

Advantages

• The main advantage of using a systematic sample is that it is more expeditious to collect a sample systematically, since the time taken and work involved are less than in simple random sampling. For example, it is frequently used in exit polls and in surveys of store consumers.
• This method can be used even when no formal list of the population units is available. For example, suppose we are interested in knowing the opinion of consumers on improving the services offered by a store; we may simply choose every kth consumer visiting the store, provided that we know how many consumers visit it daily (say 1000 consumers visit and we want 100 consumers as the sample size; the sketch below implements this).
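
A minimal sketch of systematic selection, reusing the example above: 1000 consumers a day and a desired sample of 100 give an interval k = 1000 // 100 = 10; the visitor numbers are, of course, stand-ins.

    # Systematic sampling sketch (hypothetical daily visitor list).
    import random

    population = list(range(1, 1001))       # visitor numbers for the day
    sample_size = 100
    k = len(population) // sample_size      # sampling interval

    start = random.randint(0, k - 1)        # random start gives the randomness trait
    sample = population[start::k]           # then every kth element is taken
    print(len(sample), sample[:5])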

Limitations

• If there is periodicity in the occurrence of elements of a population, the selection of a sample using systematic sampling could give a highly unrepresentative sample. For
example, suppose the sales of a consumer store are arranged chronologically and using
systematic sampling we select the sample for the 1st of every month. The 1st day of a month
cannot be a representative sample for the whole month. Thus in systematic sampling
there is a danger of order bias.
• Every unit of the population does not have an equal chance of being selected, and the selection of units for the sample depends on the initial selection. Regardless of how we select the first unit of the sample, subsequent units are automatically determined, lacking complete randomness.

Probability Sampling

Probability sampling methods are those that clearly specify the probability or likelihood of inclusion of each element or individual in the sample. Probability sampling is free of bias in selecting sample units. Such methods help in the estimation of sampling errors and in evaluating sample results in terms of their precision, accuracy and efficiency; hence, the conclusions reached from such samples are worth generalising and are comparable to the similar populations to which they belong. Major probability sampling methods are:

i) Simple Random Sampling

A simple random sample is a probability sample. A simple random sample requires (a) a complete listing of all the elements, (b) an equal chance for each element to be selected, and (c) a selection process whereby the selection of one element has no effect on the chance of selecting another element. For example, if we are to select a sample of 10 students from the seventh grade consisting of 40 students, we can write the names (or roll numbers) of each of the 40 students on separate slips of paper – all equal in size and colour – and fold them in a similar way.

Subsequently, they may be placed in a box and reshuffled thoroughly. A blindfolded person, then, may be asked to pick up one slip. Here, the probability of each slip being selected is 1/40. Suppose that after selecting the slip and noting the name written on it, he again returns it to the box. In this case, the probability of the second slip being selected is again 1/40. But if he does not return the first slip to the box, the probability of the second slip becomes 1/39.

When an element of the population is returned to the population after being selected, it is called sampling with replacement; when it is not returned, it is called sampling without replacement. Thus, random sampling may be defined as one in which all possible combinations of samples of fixed size have an equal probability of being selected (see the sketch below).
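
A small sketch of simple random sampling with and without replacement, using the 40-student example above (roll numbers 1 to 40):

    # Simple random sampling sketch (roll numbers 1..40).
    import random

    rolls = list(range(1, 41))

    without_replacement = random.sample(rolls, 10)   # no repeats possible
    with_replacement = random.choices(rolls, k=10)   # repeats possible

    print(sorted(without_replacement))
    print(sorted(with_replacement))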

Advantages of simple random sampling are:

1) Each element has the same chance as any other of being selected in the sample.
2) Simple random sampling serves as a foundation against which other methods are sometimes evaluated.
3) It is most suitable where the population is relatively small and where the sampling frame is complete and up-to-date.
4) As the sample size increases, the sample becomes more representative of the universe.

5) This method is the least costly, and its accuracy is easily assessed.

Despite these advantages, some of the disadvantages are:

1) A complete and up-to-date list of the universe is necessary.
2) A large sample size is required to establish reliability.
3) When the population is widely dispersed geographically, studying the sampled items involves higher cost and more time.
4) Unskilled and untrained investigators may produce wrong results.

ii) Stratified Random Sampling

In stratified random sampling the population is divided into two or more strata, which may be
based upon a single criterion such as sex, yielding two strata – male and female – or upon a
combination of two or more criteria such as sex and graduation, yielding four strata, namely,
male undergraduates, male graduates, female undergraduates and female graduates. These
divided populations are called subpopulations, which are non-overlapping and together
constitute the whole population.

Having divided the population into two or more strata, which are considered to be
homogeneous internally, a simple random sample for the desired number is taken from each
population stratum. Thus, in stratified random sampling the stratification of population is the
first requirement. There can be many reasons for stratification in a population.

Two of them are:

1) Stratification tends to increase the precision in estimating the attributes of the whole
population.
2) Stratification gives some convenience in sampling. When the population is divided into
several units, a person or a group of persons may be deputed to supervise the sampling
survey in each unit.
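Putting the two requirements together – stratify first, then draw a simple random sample from each stratum – a minimal Python sketch might look as follows; the frame, the strata and the proportional allocation rule are illustrative assumptions, not prescribed by the text:

    import random
    from collections import defaultdict

    # Illustrative frame of (unit, stratum) pairs, e.g. students and their sex.
    frame = [("unit%d" % i, "male" if i % 2 else "female") for i in range(1, 201)]

    strata = defaultdict(list)
    for unit, stratum in frame:
        strata[stratum].append(unit)          # step 1: stratify the population

    sample_size, total = 20, len(frame)
    sample = []
    for units in strata.values():
        n_h = round(sample_size * len(units) / total)   # proportional allocation
        sample.extend(random.sample(units, n_h))        # step 2: SRS within stratum

    print(len(sample))                        # 20 units, 10 from each stratum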

Advantages of Stratified Random Sampling are:

1) Stratified sampling is more representative of the population, because the formation of strata and the random selection of items from each stratum ensure that no stratum of the universe is left out, which increases the representativeness of the sample.
2) It is more precise and avoids bias to a great extent.
3) It saves time and cost of data collection, since the sample size can be smaller in this method.

Despite these advantages, some of the disadvantages of stratified sampling are:


1) Improper stratification may cause wrong results.
2) Greater geographical dispersion of the strata may result in heavier cost and more time.
3) Trained investigators are required for stratification.

iii) Cluster Sampling

Cluster sampling is a type of random sampling that uses multiple stages and is often used to cover wide geographic areas: aggregated units (clusters) are randomly selected, and samples are then drawn from the sampled clusters. For example, suppose the investigator wants to survey some aspect of 3rd grade elementary school children. First, a random sample of states would be selected from the country. Next, within each selected state, a random selection of a certain number of districts would be made. Then, within each district, a random selection of a certain number of elementary schools would be made. Finally, within each elementary school, a certain number of children would be randomly selected. Because each level is randomly sampled, the final sample is random. However, the selection of samples is done at different stages, which is why this is also called multistage sampling.

This sampling method is more flexible than the other methods: sub-division into second-stage units needs to be carried out only for those units selected in the first stage. Despite these merits, this method is less accurate than a single-stage sample containing the same number of units.
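The state → district → school → pupil example can be sketched recursively; the nested frame below is made up for illustration, and picks gives the number of units drawn at each stage:

    import random

    def multistage_sample(tree, picks):
        """Draw picks[0] units at this stage, then recurse into each of them."""
        if not picks:
            return tree                        # no further stage: keep every unit
        chosen = random.sample(list(tree), min(picks[0], len(tree)))
        if isinstance(tree, dict):
            return {key: multistage_sample(tree[key], picks[1:]) for key in chosen}
        return chosen                          # leaf list: the sampled final units

    # Illustrative frame: 3 states, 3 districts each, 5 schools each, 30 pupils each.
    frame = {"State%d" % s:
                 {"District%d" % d:
                      {"School%d" % k: ["pupil%d" % p for p in range(1, 31)]
                       for k in range(1, 6)}
                  for d in range(1, 4)}
             for s in range(1, 4)}

    # Two states, two districts per state, two schools per district, ten pupils each.
    sample = multistage_sample(frame, [2, 2, 2, 10])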

• In cluster sampling, we divide the population into groups having heterogeneous characteristics, called clusters, and then select a sample of clusters using simple random sampling. It is assumed that each cluster is representative of the population as a whole. This sampling is widely used for geographical studies of many issues.
• The principles basic to cluster sampling are as follows:
o The differences or variability within a cluster should be as large as possible. As far as possible, the variability within each cluster should be the same as that of the population.
o The variability between clusters should be as small as possible. Once the clusters are selected, all the units in the selected clusters are covered for obtaining data.

Advantages

• Cluster sampling provides significant gains in data collection costs, since travelling costs are smaller.
• Since the researcher need not cover all the clusters and only a sample of clusters is
covered, it becomes a more practical method which facilitates fieldwork.

Limitations

• The cluster sampling method is less precise than sampling of units from the whole
population since the latter is expected to provide a better cross-section of the population
than the former, due to the usual tendency of units in a cluster to be homogeneous.

• The sampling efficiency of cluster sampling is likely to decrease with a decrease in cluster size or an increase in the number of clusters.

Multistage sampling

• Multistage sampling is a generalisation of two-stage sampling. As the name suggests, multistage sampling is carried out in different stages. At each stage, progressively smaller geographic areas (populations) are randomly selected.
• In this method, it is possible to take as many stages as are necessary to achieve a representative sample. Each stage results in a reduction of sample size. At each stage of a multistage sample, a suitable method of sampling is used, and as many stages are used as are needed to arrive at the desired sampling units.

Advantages

• Multistage sampling provides cost gains by reducing data collection costs.
• Multistage sampling is more flexible and allows us to use different sampling procedures
in different stages of sampling.
• If the population is spread over a very wide geographical area, multistage sampling is
the only sampling method available in a number of practical situations.

Limitations

• If the sampling units selected at the different stages are not representative, multistage sampling becomes less precise and efficient.

Errors in Sampling
Investigators expect a sample to be representative of the population. However, errors do occur.

Statistical Error

It is the difference between the value of a sample statistic of interest (for example, the average willingness-to-buy-the-service score) and the corresponding value of the population parameter (here, the population willingness-to-buy score). It is classified into random sampling errors and systematic (non-sampling) errors.

• Random sampling error: Random sampling error occurs because of chance variation in the scientific selection of sampling units. Random sampling error is a function of sample size: as the sample size increases, random sampling error decreases (see the sketch after this list).
• Systematic (non-sampling) errors: These types of errors are not due to sampling. They are the result of a study's design and execution. Sample biases account for a large portion of errors in business research. Random sampling errors and systematic errors associated with the sampling process may combine to yield a sample that is less than perfectly representative of the population. As such, the researcher has to adopt a scientific approach to sampling.
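The inverse relationship between sample size and random sampling error is captured by the standard error of the mean, σ/√n. A quick illustration, assuming a population standard deviation of 10:

    import math

    sigma = 10.0                      # assumed population standard deviation
    for n in (25, 100, 400, 1600):
        se = sigma / math.sqrt(n)     # standard error of the sample mean
        print(n, se)                  # 2.0, 1.0, 0.5, 0.25: quadrupling n halves the error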

In a sample survey, since only a small portion of the population is studied, its results are bound to differ from the census results and thus have a certain amount of error. In statistics, the word error is used to denote the difference between the true value and the estimated or approximated value. This error would always be there, even when the sample is drawn at random and is highly representative. This error is attributed to fluctuations of sampling and is called sampling error. Sampling error exists because only a subset of the population has been used to estimate the population parameters and draw inferences about the population. Thus, sampling error is present only in a sample survey and is completely absent in the census method.

Sampling errors occur primarily due to the following reasons:

1. Faulty selection of the sample:

Some of the bias is introduced by the use of a defective sampling technique for the selection of a sample, e.g., purposive or judgement sampling, in which the investigator deliberately selects a "representative" sample to obtain certain results. This bias can be easily overcome by adopting the technique of simple random sampling.

2. Substitution:

When difficulties arise in enumerating a particular sampling unit included in the random
sample, the investigators usually substitute a convenient member of the population. This
obviously leads to some bias since the characteristics possessed by the substituted unit will
usually be different from those possessed by the unit originally included in the sample.

3. Faulty demarcation of sampling units:

Bias due to defective demarcation of sampling units is particularly significant in area surveys
such as agricultural experiments in the field of crop cutting surveys etc. In such surveys, while
dealing with border line cases, it depends more or less on the discretion of the investigator
whether to include them in the sample or not.

4. Error due to bias in the estimation method:

The sampling method consists in estimating the parameters of the population by appropriate statistics computed from the sample. An improper choice of estimation technique may introduce error.

5. Variability of the population:

Sampling error also depends on the variability or heterogeneity of the population to be sampled.

Sampling errors are of two types: Biased Errors and Unbiased Errors

1. Biased Errors: The errors that occur due to bias or prejudice on the part of the informant or enumerator in selecting, estimating or using the measuring instruments are called biased errors. Suppose, for example, the enumerator uses the deliberate sampling method in place of the simple random sampling method; the resulting errors are biased errors. These errors are cumulative in nature and increase as the sample size increases. They arise due to defects in the method of collection of data, in the method of organisation of data and in the method of analysis of data.
2. Unbiased Errors: Errors which occur in the normal course of investigation or
enumeration on account of chance are called unbiased errors. They may arise
accidentally without any bias or prejudice. These errors occur due to faulty planning of
statistical investigation.

To avoid these errors, the statistician must take proper precaution and care in using the
correct measuring instrument. He must see that the enumerators are also not biased.
Unbiased errors can be removed with the proper planning of statistical investigations.
Both these errors should be avoided by the statisticians.

Types of Errors in Testing Of Hypothesis:

As stated earlier, inductive inference consists in arriving at a decision to accept or reject a null hypothesis (Ho) after inspecting only a sample from the population. As such, an element of risk – the risk of taking a wrong decision – is involved. In any test procedure, the four possible mutually disjoint and exhaustive decisions are:

A. Reject Ho when actually it is not true, i.e., when Ho is false.
B. Accept Ho when it is true.
C. Reject Ho when it is true.
D. Accept Ho when it is false.

The decisions in (A) and (B) are correct decisions, while the decisions in (C) and (D) are wrong decisions. These decisions may be expressed in the following dichotomous table:

                 Accept Ho          Reject Ho
Ho is true       Correct decision   Type I Error
Ho is false      Type II Error      Correct decision

Thus, in testing of hypothesis we are likely to commit two types of errors. The error of
rejecting Ho when Ho is true is known as Type I Error and the error of accepting Ho when Ho
is false is known as Type II Error. For example, in the Industrial Quality Control, while
inspecting the quality of a manufactured lot, the Inspector commits Type I Error when he
rejects a good lot and he commits Type II Error when he accepts a bad lot.
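These two risks can be made concrete with a small simulation. The sketch below assumes a two-sided one-sample z-test at the 5% level with known standard deviation (our choice of illustration, not prescribed by the text); it estimates how often Ho is wrongly rejected when true (Type I) and wrongly accepted when false (Type II):

    import random
    import statistics

    def rejects_ho(sample, mu0, sigma, z_crit=1.96):
        """Two-sided z-test of Ho: population mean equals mu0, at the 5% level."""
        z = (statistics.mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
        return abs(z) > z_crit

    random.seed(1)
    sigma, n, trials = 1.0, 30, 2000

    # Type I error: Ho is true (mean really is 0), yet the test rejects it.
    type1 = sum(rejects_ho([random.gauss(0.0, sigma) for _ in range(n)], 0.0, sigma)
                for _ in range(trials)) / trials

    # Type II error: Ho is false (true mean is 0.3), yet the test accepts it.
    type2 = sum(not rejects_ho([random.gauss(0.3, sigma) for _ in range(n)], 0.0, sigma)
                for _ in range(trials)) / trials

    print(type1, type2)      # type1 close to 0.05; type2 is the beta risk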

Concept of Measurement – Reliability and Validity Tools
Sound measurement must meet the tests of validity, reliability and practicality. In fact, these
are the three major considerations one should use in evaluating a measurement tool.

• Validity refers to the extent to which a test measures what we actually wish to measure.
• Reliability has to do with the accuracy and precision of a measurement procedure.
• Practicality is concerned with a wide range of factors of economy, convenience, and interpretability.

A. Test of Validity
a. Validity is the most critical criterion and indicates the degree to which an
instrument measures what it is supposed to measure. Validity can also be
thought of as utility. In other words, validity is the extent to which differences
found with a measuring instrument reflect true differences among those being
tested.
b. But the question arises: how can one determine validity without direct
confirming knowledge? The answer may be that we seek other relevant evidence
that confirms the answers we have found with our measuring tool. What constitutes relevant evidence often depends upon the nature of the research problem and
the judgment of the researcher. But one can certainly consider three types of
validity in this connection:
i. Content validity is the extent to which a measuring instrument provides
adequate coverage of the topic under study. If the instrument contains a
representative sample of the universe, the content validity is good. Its
determination is primarily judgmental and intuitive. It can also be
determined by using a panel of persons who shall judge how well the
measuring instrument meets the standards, but there is no numerical way
to express it.
ii. Criterion-related validity relates to our ability to predict some outcome or estimate the existence of some current condition. This form of validity reflects the success of measures used for some empirical estimating purpose. The concerned criterion must possess the following qualities:
1. Relevance: (A criterion is relevant if it is defined in terms we judge to be the proper measure.)
2. Freedom from bias: (Freedom from bias is attained when the criterion gives each subject an equal opportunity to score well.)
3. Reliability: (A reliable criterion is stable or reproducible.)
4. Availability: (The information specified by the criterion must be available.)
In fact, criterion-related validity is a broad term that actually refers to predictive validity (the usefulness of a test in predicting some future performance) and concurrent validity (the usefulness of a test in closely relating to other measures of known validity). Criterion-related validity is expressed as the coefficient of correlation between test scores and some measure of future performance, or between test scores and scores on another measure of known validity.
iii. Construct validity is the most complex and abstract. A measure is said to possess construct validity to the degree that it conforms to predicted correlations with other theoretical propositions. Construct validity is the degree to which scores on a test can be accounted for by the explanatory constructs of a sound theory. For determining construct validity, we associate a set of other propositions with the results received from using our measurement instrument. If measurements on our devised scale correlate in a predicted way with these other propositions, we can conclude that there is some construct validity.
If the above stated criteria and tests are met, we may state that our measuring instrument is valid and will result in correct measurement; otherwise, we shall have to look for more information and/or resort to the exercise of judgment.
B. Test of Reliability
a. The test of reliability is another important test of sound measurement. A measuring instrument is reliable if it provides consistent results. A reliable measuring instrument contributes to validity, but a reliable instrument need not be a valid instrument.
b. If the quality of reliability is satisfied by an instrument, then while using it we
can be confident that the transient and situational factors are not interfering.
Two aspects of reliability, viz., stability and equivalence deserve special mention.
The stability aspect is concerned with securing consistent results with repeated
measurements of the same person and with the same instrument.
c. We usually determine the degree of stability by comparing the results of
repeated measurements. The equivalence aspect considers how much error may
get introduced by different investigators or different samples of the items being
studied. A good way to test for the equivalence of measurements by two
investigators is to compare their observations of the same events. Reliability can
be improved in the following two ways:
i. By standardising the conditions under which the measurement takes
place i.e., we must ensure that external sources of variation such as
boredom, fatigue, etc., are minimised to the extent possible. That will improve the stability aspect.
ii. By carefully designed directions for measurement with no variation from
group to group, by using trained and motivated persons to conduct the
research and also by broadening the sample of items used. This will improve the equivalence aspect.
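The stability aspect is often checked by administering the same instrument twice and correlating the two sets of scores (test-retest reliability). A minimal sketch with illustrative scores, using the statistics module of Python 3.10+:

    from statistics import correlation   # available from Python 3.10

    # Scores of the same ten respondents on two administrations of a test.
    first_run  = [12, 15, 11, 18, 14, 16, 13, 17, 15, 12]
    second_run = [13, 14, 11, 19, 15, 15, 12, 18, 14, 13]

    # A high test-retest correlation suggests the stability aspect is met.
    print(round(correlation(first_run, second_run), 2))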
C. Test of Practicality
a. The practicality characteristic of a measuring instrument can be judged in terms
of economy, convenience and interpretability. From the operational point of
view, the measuring instrument ought to be practical i.e., it should be
economical, convenient and interpretable.

b. Economy consideration suggests that some trade-off is needed between the ideal
research project and that which the budget can afford. The length of measuring
instrument is an important area where economic pressures are quickly felt.
Although more items give greater reliability as stated earlier, but in the interest
of limiting the interview or observation time, we have to take only few items for
our study purpose.
c. Similarly, data-collection methods to be used are also dependent at times upon
economic factors. Convenience test suggests that the measuring instrument
should be easy to administer. For this purpose one should give due attention to
the proper layout of the measuring instrument. For instance, a questionnaire,
with clear instructions (illustrated by examples), is certainly more effective and
easier to complete than one which lacks these features.
d. Interpretability consideration is especially important when persons other than
the designers of the test are to interpret the results. The measuring instrument,
in order to be interpretable, must be supplemented by
i. Detailed instructions for administering the test
ii. Scoring keys
iii. Evidence about the reliability
iv. Guides for using the test and for interpreting results

Technique of Developing Measurement Tools: The technique of developing measurement tools involves a four-stage process, consisting of the following:

• The first and foremost step is that of concept development which means that the
researcher should arrive at an understanding of the major concepts pertaining to his
study. This step of concept development is more apparent in theoretical studies than in
the more pragmatic research, where the fundamental concepts are often already
established.
• The second step requires the researcher to specify the dimensions of the concepts that
he developed in the first stage. This task may be accomplished either by deduction, i.e., by adopting a more or less intuitive approach, or by empirical correlation of the individual dimensions with the total concept and/or the other concepts.

• For instance, one may think of several dimensions such as product reputation, customer treatment, corporate leadership, concern for individuals, sense of social responsibility and so forth when one is thinking about the image of a certain company.
Once the dimensions of a concept have been specified, the researcher must develop
indicators for measuring each concept element. Indicators are specific questions, scales,
or other devices by which respondent’s knowledge, opinion, expectation, etc., are
measured.
• As there is seldom a perfect measure of a concept, the researcher should consider several
alternatives for the purpose. The use of more than one indicator gives stability to the
scores and it also improves their validity.
• The last step is that of combining the various indicators into an index, i.e., formation of
an index. When we have several dimensions of a concept or different measurements of a
dimension, we may need to combine them into a single index. One simple way for
getting an overall index is to provide scale values to the responses and then sum up the
corresponding scores.
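The summated index of this last step can be shown in a few lines; the five company-image indicators and their 1-5 scale values below are hypothetical:

    # Hypothetical scale values (1-5, 5 = most favourable) given by one
    # respondent on five indicators of a company's image.
    responses = {
        "product_reputation": 4,
        "customer_treatment": 5,
        "corporate_leadership": 3,
        "concern_for_individuals": 4,
        "social_responsibility": 2,
    }

    # Simple summated index: add up the scale values of all indicators.
    image_index = sum(responses.values())
    print(image_index)        # 18 out of a possible 25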

Data Analysis
Editing of Data

The editing of data is a process of examining the raw data to detect errors and omissions and to
correct them, if possible, so as to ensure legibility, completeness, consistency and accuracy.

The recorded data must be legible so that it can be coded later. An illegible response may be corrected by getting in touch with the people who recorded it, or alternatively it may be inferred from other parts of the questionnaire.

Completeness requires that all the items in the questionnaire be fully completed. If some questions are not answered, the interviewer may be contacted to find out whether he failed to record the response or the respondent refused to answer the question. In the case of the former, it is quite likely that the interviewer will not remember the answer. In such a case the respondent may be contacted again, or alternatively this particular piece of data may be treated as missing data.

It is very important to check whether or not the respondent is consistent in answering the questions. For example, a respondent claiming that he makes purchases by credit card may not actually have one.

The inaccuracy of the survey data may be due to interviewer bias or cheating. One way of spotting this is to look for a common pattern of responses in the instruments of a particular interviewer.

Apart from ensuring quality data, this will also facilitate the coding and tabulation of data. In fact, editing involves a careful scrutiny of the completed questionnaires.

• Editing is the first stage in data processing. Editing may be broadly defined as a procedure which uses available information and assumptions to substitute inconsistent values in a data set. In other words, editing is the process of examining the data collected through various methods to detect errors and omissions and correct them for further analysis. While editing, care has to be taken to see that the data are as accurate and complete as possible, and that the units of observation and the number of decimal places are the same for the same variable. The following practical guidelines may be handy while editing the data:
o The editor should have a copy of the instructions given to the interviewers.
o The editor should not destroy or erase the original entry. Original entries should be crossed out in such a manner that they are still legible.
o All answers which are modified or filled in afresh by the editor have to be indicated.
o All completed schedules should have the signature of the editor and the date.
• For checking the quality of the data collected, it is advisable to take a small sample of the questionnaires and examine them thoroughly. This helps in understanding the following types of problems:
o whether all the questions are answered
o whether the answers are properly recorded
o whether there is any bias
o whether there is any interviewer dishonesty
o whether there are inconsistencies
• At times, it may be worthwhile to group the same set of questionnaires according to the
investigators (whether any particular investigator has specific problems) or according to
geographical regions (whether any particular region has specific problems) or according
to the sex or background of the investigators, and corrective actions may be taken if any
problem is observed.
• Before tabulation of data it may be good to prepare an operation manual to decide the
process for identifying inconsistencies and errors and also the methods to edit and
correct them. The following broad rules may be helpful.

Incorrect answers: It is quite common to get incorrect answers to many of the questions. A person with thorough knowledge of the survey will be able to notice them.

For example, against the question “Which brand of biscuits do you purchase?” the answer may
be “We purchase biscuits from ABC Stores”. Now, this questionnaire can be corrected if ABC
Stores stocks only one type of biscuits, otherwise not. Answer to the question “How many days
did you go for shopping in the last week?” would be a number between 0 and 7. A number
beyond this range indicates a mistake, and such a mistake cannot be corrected. The general rule
is that changes may be made if one is absolutely sure, otherwise this question should not be
used. Usually a schedule has a number of questions, and even if the answers to a few questions are incorrect, it is advisable to use the other correct information from the schedule rather than discard the schedule entirely.

Inconsistent answers: When there are inconsistencies in the answers, or when there are incomplete or missing answers, the questionnaire should not be used. Suppose that in a survey, per capita expenditure on various items is reported as follows: Food – Rs. 700, Clothing – Rs. 300, Fuel and Light – Rs. 200, Other items – Rs. 550 and Total – Rs. 1600. The answers are obviously inconsistent, as the total of the individual items of expenditure (Rs. 1750) exceeds the reported total expenditure.
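Both kinds of checks described in the last two paragraphs – the 0-to-7 range check and the expenditure consistency check – are easy to mechanise. A sketch, with field names of our own choosing:

    def edit_record(record):
        """Flag range and consistency problems in one survey record."""
        problems = []
        # Range check: shopping days in the last week must lie between 0 and 7.
        if not 0 <= record["shopping_days"] <= 7:
            problems.append("shopping_days out of range")
        # Consistency check: itemised expenditure must not exceed the total.
        items = record["food"] + record["clothing"] + record["fuel"] + record["other"]
        if items > record["total"]:
            problems.append("itemised expenditure exceeds the reported total")
        return problems

    record = {"shopping_days": 3, "food": 700, "clothing": 300,
              "fuel": 200, "other": 550, "total": 1600}
    print(edit_record(record))    # ['itemised expenditure exceeds the reported total']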

Modified answers: Sometimes it may be necessary to modify or qualify the answers. Such modifications have to be indicated for reference and checking.

Numerical answers to be converted to the same units: Against the question "What is the plinth area of your house?", answers could be either in square feet or in square metres. It will be convenient to convert all the answers to such questions into the same unit, square metres for example.

The editing can be done at two stages:

1. Field Editing, and
2. Central Editing.

Field Editing: Field editing consists of a review of the reporting forms by the investigator for completing or translating what the latter has written in abbreviated form at the time of interviewing the respondent. This form of editing is necessary because handwriting varies from individual to individual and is sometimes difficult for the tabulator to understand. This sort of editing should be done as soon as possible after the interview, while the memory is still fresh. While doing so, care should be taken that the investigator does not correct errors of omission by simply guessing what the respondent would have answered if the question had been put to him.

Central Editing: Central editing should be carried out when all the forms or schedules have been completed and returned to the headquarters. This type of editing requires that all the forms be thoroughly edited by a single person (editor) in a small field study, or by a small group of persons in the case of a large field study. The editor may correct the obvious errors, such as an entry in the wrong place, or an entry recorded in daily terms whereas it should have been recorded in weeks/months, etc. Sometimes, inappropriate or missing replies can also be recorded by the editor by reviewing the other information in the schedule. If necessary, the respondent may be contacted for clarification. All incorrect replies, which are quite obvious, must be deleted from the schedules.

Some other Editing

In-house editing - Field editing or early review of the data is not always possible; sometimes the study may be conducted in various parts of the country, and responses (filled questionnaires) may take several days to reach the central location or research coordination centre. In such situations, in-house editing rigorously investigates the results of data collection. The research supplier or the research department normally has a centralised office staff to perform the editing and coding functions.

Editing for consistency - The in-house editor's task is to ensure that inconsistent or contradictory responses are adjusted so that the answers will not be a problem for coders and keyboard operators. For example, the editor's task may be to eliminate an obviously incorrect sampling unit. The in-house editor must determine whether the answers given by a respondent are consistent with other, related questions, and must use good judgment in correcting such inconsistencies.

Editing for completeness - In some cases the respondent may have answered only one portion
of a two-part question. Item non-response is the technical term for unanswered questions on an
otherwise complete questionnaire. Specific decision rules for handling this problem should be
meticulously outlined in the editor’s instructions. If an editor finds a missing answer where
there can be no missing values, he or she may insert an answer (plug value) according to such a
predetermined rule. Another decision rule might be to randomly select an answer.

The editor should be familiar with the instructions and the codes given to the interviewers while editing. New (corrected) entries made by the editor should be in some distinctive form and should be initialled by the editor. The date of editing may also be recorded on the schedule for future reference.

Data Coding

Coding refers to the process by which data are categorised into groups and numerals or other symbols or both are assigned to each item depending on the class it falls in. Hence, coding involves:

• deciding the categories to be used, and
• assigning individual codes to them.

In general, coding reduces the huge amount of information collected into a form that is
amenable to analysis. A careful study of the answers is the starting point of coding. Next, a
coding frame is to be developed by listing the answers and by assigning the codes to them. A
coding manual is to be prepared with the details of variable names, codes and instructions. Normally, the coding manual should be prepared before collection of data, except for open-ended and partially coded questions; these two categories are to be taken care of after the data collection. The following are the broad general rules for coding:

• Each respondent should be given a code number (an identification number).
• Each qualitative question should have codes. Quantitative variables may or may not be coded, depending on the purpose. Monthly income should not be coded if one of the objectives is to compute average monthly income, but if it is used as a classificatory variable it may be coded to indicate poor, middle or upper income groups.
• All responses, including "don't know", "no opinion", "no response", etc., are to be coded.

Sometimes it is not possible to anticipate all the responses and some questions are not coded
before collection of data. Responses of all the questions are to be studied carefully and codes are
to be decided by examining the essence of the answers. In partially coded questions, usually
there is an option “Any other (specify)”. Depending on the purpose, responses to this question
may be examined and additional codes may be assigned.

Coding is the process of assigning symbols, either alphabetical or numerical or both, to the answers so that the responses can be recorded into a limited number of classes or categories. The classes should be appropriate to the research problem being studied. They must be exhaustive and mutually exclusive, so that each answer can be placed in one and only one cell in a given category. Further, every class must be defined in terms of only one concept.

The coding is necessary for the efficient analysis of data. The coding decisions should usually
be taken at the designing stage of the questionnaire itself so that the likely responses to
questions are pre-coded. This simplifies computer tabulation of the data for further analysis. It
may be noted that any errors in coding should be eliminated altogether or at least be reduced to
the minimum possible level.

Coding an open-ended question is more tedious than coding a closed-ended question. For a closed-ended or structured question, the coding scheme is very simple and is designed prior to the field work. For example, consider the following question.

What is your Gender?

Male Female Transgender

We may assign a code of 0 to male, 1 to female and 2 to transgender respondents. These codes may be specified prior to the field work, and if codes are written on all questions of a questionnaire, it is said to be wholly pre-coded.

The same approach can also be used for coding numeric data that either are not to be coded into categories or have had their relevant categories specified. For example,

What is your monthly income?

• Here the respondent would indicate his monthly income which may be entered in the
relevant column. The same question may also be asked like this:

What is your monthly income?

− Less than Rs. 5000
− Rs. 5000 – 8999
− Rs. 9000 – 12999
− Rs. 13000 or above
• We may code the class 'less than Rs. 5000' as 1, 'Rs. 5000 – 8999' as 2, 'Rs. 9000 – 12999' as 3 and 'Rs. 13000 or above' as 4.
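A sketch of such pre-coding in Python, mirroring the gender and income codes above; the function name is ours:

    GENDER_CODES = {"Male": 0, "Female": 1, "Transgender": 2}

    def income_code(monthly_income):
        """Return codes 1-4 for the four income classes listed above."""
        if monthly_income < 5000:
            return 1
        if monthly_income <= 8999:
            return 2
        if monthly_income <= 12999:
            return 3
        return 4

    print(GENDER_CODES["Female"], income_code(11500))    # 1 3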

Coding of open-ended questions is a more complex task, as the verbatim responses of the respondents are recorded by the interviewer. Into what categories should these responses be put? The researcher may select at random 60-70 of the responses to a question and list them. After examining the list, a decision is taken on what categories are appropriate to summarise the data, and the coding scheme for categorised data discussed above is used. A word of caution: while classifying the data into various categories, we should keep a provision for "any other" to include responses which may not fall into our designated categories.

It may be kept in mind that the response categories must be mutually exclusive and collectively exhaustive. A study was carried out among the readers of newspapers with the following objectives:

• To identify and understand the factors that determine the preference for Times of India amongst the readers.
• To identify the profile of the readers of Times of India.
• To ascertain the expectations vs. perceptual reality and locate gaps if any amongst the
readers of Times of India.
• To analyze the factors responsible for the most preferred subjects of information
attracting the readers to prefer Times of India.

To achieve these objectives a questionnaire was designed. We give below a part of the
questionnaire, and discuss the coding scheme for the same. Please note that the objective here is
not to evaluate the questionnaire but to design the coding scheme for any given questionnaire
of a study. The said questionnaire is given below in Exhibit 1.

• Let us design the coding scheme for the questionnaire given in Exhibit 1. We note that question number 1 may have multiple responses, because a respondent could read more than one newspaper. There are 5 alternatives listed in question number 1, and therefore we will use five columns in the data matrix to record the responses to this question. If the respondent reads Times of India we code it with a value of 1, otherwise 0; similarly for the remaining newspapers. Thus, if a respondent reads both Times of India and Indian Express, we code questions 1a and 1c with a value of 1 and the remaining parts, namely 1b, 1d and 1e, with a value of 0.
• For question number 2, the respondent can choose only one of the four alternatives. Therefore a single column is required to record the responses. The response categories are mutually exclusive and collectively exhaustive. Whichever category is chosen by the respondent is coded 1 and the remaining are coded 0.
• Question number 3 has seven parts and the respondent is to rate each one of them on a 5-point scale ranging from 1 to 5. Therefore a total of seven columns are required to record the responses. Suppose the respondent rates International News as 4; the value 4 should then be assigned to question number 3b, and so on.
• There are five attributes of Times of India mentioned in question number 4 and the respondent is to rate each of them on a scale of 1 to 5. Therefore five columns are required to record the responses to this question. Suppose for question 4c (Weekend Supplements) the rating of the respondent is 2; the same will be shown in the coding book corresponding to this question.
• There are six features of Times of India mentioned in question number 5, labeled 5a to 5f. The respondent is to rank them from 1 to 6 with regard to the importance he gives to each of these features. Therefore we need six columns for this. Suppose the ranks are 2, 3, 6, 1, 4 and 5 for questions 5a to 5f respectively; the same numbers would appear on the coding sheet corresponding to this question.
• Question number 6 is divided into five parts. For each part a separate column is required. 6a indicates the age of the respondent, which will be entered as revealed by the respondent. Question 6b concerns the sex of the respondent; here male respondents are coded as 1 whereas female respondents are coded as 0. Question 6c indicates the total number of members in the household. Question 6d is concerned with the occupation of the respondent. Question 6e records the monthly income of the household in categorised form; here the responses are mutually exclusive and collectively exhaustive. If the respondent has a monthly income of less than 5000 rupees, the response is coded as 1; if the monthly income is between 5001 and 10000 rupees, it is coded as 2; between 10001 and 15000, the code is 3; between 15001 and 20000, the code is 4; between 20001 and 25000, the code is 5; and above 25000, the code is 6. The above discussion can be shown in the form of a code book.

The data matrix corresponding to the above coding scheme is shown in the table given below:

The above table indicates that respondent number 1 reads both Times of India and Indian Express and no other newspaper. This is indicated by the code 1 corresponding to questions 1a and 1c; for the remaining parts of question 1 a 0 is indicated. Question number 2 indicates that the respondent has been reading Times of India for 6 to 12 months. The rating of the various features of a newspaper in terms of his interest in them is indicated by the responses to questions 3a to 3g. The respondent is not very interested in critical news, interested in international news, not particular about city news, very interested in corporate and business news, very uninterested in sports news, and interested in people and lifestyle news as well as leisure, art and entertainment news. The respondent rates Times of India on five attributes. He can give a rating of Times of India on each attribute on a scale of 1 to 5, where 1 represents the extremely unfavourable side and 5 the extremely favourable side. He has rated Times of India on news content as 4, editorial as 3, weekend supplements as 5, weekday supplements as 3 and layout as 5. His ranking of how important the various features are to him, on a 1 to 7 scale where 1 represents the most important and 7 the least important, is indicated in question 5.

As per the respondent, classified advertisements are ranked the least important (rank 7), weekday supplements get a rank of 4, number of pages a rank of 6, advertisements a rank of 3, news content a rank of 1, weekend supplements a rank of 2, and layout a rank of 5. The respondent is 32 years of age and is a male, as indicated by a code of 1 for question 6b. There are four members in his household. His occupation is business and he has a monthly income between Rs. 10001 and Rs. 15000, as indicated by the code 3 for question 6e.

Respondent 2 does not read Times of India; in fact, the respondent reads Hindustan Times and no other newspaper, and therefore only the questions under question number 6 are asked of this respondent. The respondent is 30 years of age and a female, as indicated by the code 0 for question 6b. The respondent has 3 family members, is a professional and has a monthly income between Rs. 15001 and Rs. 20000, as indicated by the code 4 corresponding to question 6e.

Classification of Data

Once the data is collected and edited, the next step towards further processing the data is
classification. In most research studies, voluminous data collected through various methods
needs to be reduced into homogeneous groups for meaningful analysis. This necessitates
classification of data, which in simple terms is the process of dividing data into different groups
or classes according to their similarities and dissimilarities.

The groups should be homogeneous within and heterogeneous between themselves. Classification condenses a huge amount of data and helps in understanding the important underlying features. It enables us to make comparisons, draw inferences, locate facts and also helps in bringing out relationships, so as to draw meaningful conclusions. In fact, classification of data provides a basis for tabulation and analysis of data.

Types of classification

Data may be classified according to one or more external characteristics or one or more
internal characteristics or both. Let us study these kinds with the help of illustrations.

Classification according to external characteristics

In this classification, data may be classified according to area or region (Geographical) and
according to occurrences (Chronological).

• Geographical: In this type of classification, data are organised in terms of geographical area or region. State-wise production of manufactured goods is an example of this type. Data collected from an all-India market survey may be classified geographically. Usually the regions are arranged alphabetically or according to size to indicate their importance.
• Chronological: When data is arranged according to time of occurrence, it is called
chronological classification. Profit of engineering industries over the last few years is an
example. We may note that it is possible to have chronological classification within
geographical classification and vice versa. For example, a large scale all India market
survey spread over a number of years.

Classification according to internal characteristics

Data may be classified according to attributes (Qualitative characteristics which are not capable
of being described numerically) and according to the magnitude of variables (Quantitative
characteristics which are numerically described).

• Classification according to attributes: In this classification, data are classified by descriptive characteristics like sex, caste, occupation, place of residence, etc. This is done in two ways – simple classification and manifold classification. In simple classification (also called classification according to dichotomy), data are simply grouped according to the presence or absence of a single characteristic – male or female, employed or unemployed, etc. In manifold classification (also known as multiple classification), data are classified according to more than one characteristic. First, the data may be divided into two groups according to one attribute (employed and unemployed, say). Then, using another attribute, the data are sub-grouped again (male and female, based on sex). This may go on for other attributes, like married and unmarried, rural and urban, and so on. The following table is an example of manifold classification.

Classification according to magnitude of the variable:

• This classification refers to the classification of data according to some characteristic that can be measured. In this classification there are two aspects: one is the variable (age, weight, income, etc.), the other is the frequency (the number of observations which can be put into a class).
• Quantitative variables may generally be divided into two groups – discrete and continuous. A discrete variable is one which can take only isolated (exact) values; it does not carry any fractional value. Examples are the number of children in a household, the number of departments in an organisation, the number of workers in a factory, etc.
• The variables that can take any numerical value within a specified range are called continuous variables. Examples of continuous variables are the height of a person, the profit/loss of companies, etc. One point may be noted: in practice, even continuous variables are measured up to some degree of precision, and so they essentially become discrete variables.

Classification condenses the data, facilitates comparisons, helps to study relationships and facilitates the statistical treatment of data. The classification should be unambiguous, mutually exclusive and collectively exhaustive. Further, it should not only be flexible but also suitable for the purpose for which it is sought. Classification can be either according to attributes or according to numerical characteristics.

• Classification According to Attributes: To classify the data according to attributes we use descriptive characteristics like sex, caste, education, user of a product, etc. Descriptive characteristics are those which cannot be measured quantitatively; one can only talk in terms of their presence or absence. The classification according to attributes may be of two types:
o Simple Classification: In the case of simple classification each class is divided into two sub-classes and only one attribute is studied, viz., user of a product or non-user of a product, married or unmarried, employed or unemployed, Brahmin or non-Brahmin, etc.
o Manifold Classification: In the case of manifold classification more than one attribute is considered. For example, the respondents in a survey may be classified as users of a particular brand of a product and non-users of that brand. Both users and non-users can be further classified into male and female. Further, one can classify males and females into two categories such as below 25 years of age and 25 or more years of age. We can further classify them as professionals and non-professionals. This way one can keep on adding more attributes, as shown in Figure 1. However, the addition of a particular attribute (the process of sub-classification) depends upon the basic purpose for which the classification is required. The objectives of such a classification have to be clearly spelt out.
• Classification According to Numerical Characteristics: When the observations possess numerical characteristics such as sales, profits, height, weight, income or marks, they are classified according to class intervals. For example, persons whose monthly income is between Rs. 2001 and Rs. 3500 may form one group, those whose income is between Rs. 3501 and Rs. 7000 may form another group, and so on. In this manner, the entire data may be divided into a number of groups or classes, which are usually called class intervals. The number of items in each class is called the frequency of the class. Every class has two limits: an upper limit and a lower limit, which are known as class limits. The difference between these two limits is called the magnitude of the class or the width of the class interval. The class intervals may be formed by using the inclusive or the exclusive method. Class intervals such as 10 - 15, 16 - 21, 22 - 27, etc. are an example of the inclusive method, because both the lower and the upper limit are included in the class; if the variable X falls in the first class interval, it takes values 10 ≤ X ≤ 15. Class intervals like 10 - 15, 15 - 20, 20 - 25, etc. are an example of the exclusive method, since the lower limit is included whereas the upper limit is excluded from the class interval; the variable X, if falling in the first class interval, would take values 10 ≤ X < 15. As an illustration of how data can be classified into class intervals, consider the following example, which uses the inclusive method.

Example: Following data refers to the sales of a company for the 40 quarters.

Tabulate the data using the inclusive method.

Qtr. Sales Qtr. Sales Qtr. Sales Qtr. Sales
1 1060 11 1255 21 1690 31 1200
2 2125 12 1190 22 2130 32 2190
3 1440 13 870 23 1870 33 1800
4 1940 14 1460 24 1875 34 2255
5 2060 15 2125 25 1650 35 2000
6 1310 16 750 26 945 36 1060
7 2120 17 1120 27 2240 37 1370
8 2560 18 2000 28 1700 38 2375
9 2250 19 1750 29 1165 39 1470
10 2135 20 1760 30 1945 40 2250

We will use the data given above. We form five class intervals, each of width 370. These are inclusive class intervals in the sense that the variable X can take any value between the lower and upper limit, with both ends of the interval covered. The class intervals, along with the number of items (frequency) in each class interval, can be tabulated as follows.
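A minimal Python sketch of this tabulation, assuming the first inclusive class starts at the minimum sale of 750:

    sales = [1060, 2125, 1440, 1940, 2060, 1310, 2120, 2560, 2250, 2135,
             1255, 1190,  870, 1460, 2125,  750, 1120, 2000, 1750, 1760,
             1690, 2130, 1870, 1875, 1650,  945, 2240, 1700, 1165, 1945,
             1200, 2190, 1800, 2255, 2000, 1060, 1370, 2375, 1470, 2250]

    width, lower = 370, min(sales)            # five inclusive classes of width 370
    while lower <= max(sales):
        upper = lower + width - 1             # both limits belong to the class
        frequency = sum(lower <= x <= upper for x in sales)
        print("%4d - %4d : %d" % (lower, upper, frequency))
        lower = upper + 1

For the data above, this yields the classes 750 - 1119 (5 quarters), 1120 - 1489 (10), 1490 - 1859 (6), 1860 - 2229 (13) and 2230 - 2599 (6).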

Tabulation of data

Presentation of collected data in the tabular form is one of the techniques of data presentation.
The two other techniques are diagrammatic and graphic presentation. Arranging the data in an
orderly manner in rows and columns is called tabulation of data.

• Sometimes data collected by survey or even from publications of official bodies are so numerous that it is difficult to understand the important features of the data. Therefore it becomes necessary to summarise the data through tabulation into an easily intelligible form. It may be noted that there may be a loss of some minor information in certain cases, but the essential underlying features come out more clearly.
• Quite frequently, data presented in tabular form is much easier to read and understand
than the data presented in the text. In classification, as discussed in the previous section,
the data is divided on the basis of similarity and resemblance, whereas tabulation is the process of recording the classified facts in rows and columns. Therefore, after classifying
the data into various classes, they should be shown in the tabular form.

Types of Tables

Tables may be classified, depending upon the use and objectives of the data to be presented,
into simple tables and complex tables. Let us discuss them along with illustrations.

Simple table: In this case data are presented for only one variable or characteristic. Therefore, this type of table is also known as a one-way table. A table showing the data relating to the sales of a company in different years is an example of a simple table.

Parts of a statistical table: A table should have the following four essential parts:

1. Title
2. Main data
3. Stub – row headings
4. Caption or box head – column headings
• At times it may also contain an end note and source note below the table. The table
should have a title, which is usually placed above the statistical table. The title should be
clearly worded to give some idea of the table’s contents. Usually a report has many
tables. Hence the tables should be numbered to facilitate reference.
• Caption refers to the headings of the columns. It is also termed the "box head".
• There may be sub-captions under the main caption. Stub refers to the titles given to the rows. Caption and stub should also be unambiguous. To the extent possible, abbreviations should not be used in either caption or stub; if they are used, the expansion must be given in the end note below. Notes pertaining to stub entries or box headings may use numerals. But, to avoid confusion, it is better to use symbols (like *, **, @, etc.) or alphabets for notes referring to the entries in the main body. If the table is based on outside information, this should be mentioned in the source note below.

This note should be complete with author, title, year of publication etc to enable the reader to
go to the original source for crosschecking or for obtaining additional information. Columns
and rows may be numbered for easy reference.

Arrangement of items in stub and box-head

There is no hard and fast rule about the arrangement of column and row headings in a table. It
depends on the nature of data and type of analysis. A number of different methods are used -
alphabetical, geographical, chronological / historical, magnitude-based and customary or
conventional.

• Alphabetical: This method is suitable for general tables as it is easy to locate an item if
it is arranged alphabetically. For example, population census data of India may be
arranged in the alphabetical order of states/union territories.
• Geographical: It can be used when the reader is familiar with the usual geographical
classification.

• Chronological: A table containing data over a period of time may be presented in the
chronological order. Population data (1961 to 2001) presented earlier are in
chronological order. One may either start from the most recent year or the earliest year.
However, there is a convention to start with the month of January whenever year and
month data are presented.
• Based on Magnitude: At times, items in a table are arranged according to the value of
the characteristic. Usually the largest item is placed first and other items follow in
decreasing order. But this may be reversed also. Suppose that state-wise population data
is arranged in order of decreasing magnitude. This will highlight the most populous
state and the least populous state.
• Customary or Conventional: Traditionally some order is followed in certain cases.
While presenting population census data, usually ‘rural’ comes before ‘urban’ and ‘male’
first and ‘female’ next. At times, conventional geographical order is also followed.

Some Problems in Processing

Following are two problems in processing the data for analytical purposes.

The problem concerning "Don't know" (or DK) responses:

• While processing the data, the researcher often comes across some responses that are
difficult to handle. One category of such responses may be ‘Don’t Know Response’ or
simply DK response. When the DK response group is small, it is of little significance.
But when it is relatively big, it becomes a matter of major concern, in which case the question arises: is the question which elicited the DK response useless? The answer depends on two points, viz., the respondent may actually not know the answer, or the researcher may have failed in obtaining the appropriate information.
• In the first case the concerned question is said to be all right and the DK response is taken as a legitimate DK response. But in the second case, the DK response is more likely to be a failure of the questioning process. How are DK responses to be dealt with by researchers? The best way is to design better questions. Good rapport of interviewers with respondents will result in minimising DK responses. But what about the DK responses that have already taken place?
• One way to tackle this issue is to estimate the allocation of DK answers from other data
in the questionnaire. The other way is to keep DK responses as a separate category in
tabulation where we can consider it as a separate reply category if DK responses happen
to be legitimate, otherwise we should let the reader make his own decision.
• Yet another way is to assume that DK responses occur more or less randomly, and as
such we may distribute them among the other answers in the ratio in which the latter
have occurred (a small sketch of this proportional allocation follows). Similar results will
be achieved if all DK replies are excluded from tabulation, without inflating the actual
number of other responses.
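As a quick illustration, here is a minimal Python sketch of the proportional-allocation idea; the response counts are hypothetical:

responses = {"Yes": 120, "No": 60, "DK": 20}  # hypothetical survey counts

dk = responses.pop("DK")                      # set the DK responses aside
total = sum(responses.values())               # total of substantive answers
# distribute DK among the other answers in the ratio in which they occurred
adjusted = {k: v + dk * v / total for k, v in responses.items()}
print(adjusted)                               # {'Yes': 133.33..., 'No': 66.66...}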

Data Analysis and Interpretation

Preliminary Data Analysis

Preliminary data analysis refers to the application of simple procedures that enable the researcher
to get a feel for the data. It allows the researcher to understand the basic relationships
among the variables so that he can carry out a more rigorous analysis of the data in a focused way.
The interpretations and insights gained during the initial data analysis are sometimes very useful in
clarifying the results obtained from further analyses.

For better applications of statistical tools and selection of a data analysis strategy, the
researcher may consider the following aspects:

• Objectives and hypotheses - The preliminary data analysis strategy prepared as a part
of the research design, together with the earlier steps of the process (problem definition,
development of an approach, and research design), may serve as a starting point. However,
it should be modified in light of the additional information generated.
• Known characteristics of the data - The scales of measurement (nominal, ordinal,
interval, or ratio) and the nature of the research design strongly influence the choice of
a statistical technique. For example, techniques like ANOVA are highly suited for
analyzing experimental data from causal designs.
• Properties of statistical techniques - The statistical techniques serve varying
purposes and some techniques are more robust to the violations of the underlying
assumptions as compared to the others. Thus, depending on the applications (e.g.
examining differences in the variables, making predictions) appropriate statistical
techniques should be chosen.
• Background and philosophy of the researcher – Depending on their sophistication,
researchers employ various statistical methods and make different assumptions about
the variables and underlying populations. For example, a conservative researcher may
choose only those statistical techniques that are distribution free.

Descriptive Statistics

Descriptive statistics present the data in terms of percentages and averages, as well as measures of variability such as the variance.

Simple Tabulation

Tabulation of the data refers to an orderly arrangement of data collected by the researcher in a
table or other summary format. Counting the number of responses to a question and putting
them into a frequency distribution is a simple tabulation, or marginal tabulation. It is also
referred to as frequency table. In addition to frequency counts, percentages, and cumulative
percentages associated with the variable are also given. Moreover, this frequency table can be
used to draw inferences about the central tendency, variability, or shape of the underlying
distribution. The accompanying table shows such a simple tabulation, from which it is obvious
that the most preferred brand is Diet Pepsi, as 50% of the respondents have expressed their preference for it.
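A minimal Python sketch of a one-way frequency table using pandas; the brand names and counts are assumed for illustration:

import pandas as pd

# hypothetical responses: 10 of 20 respondents prefer Diet Pepsi (50%)
prefs = pd.Series(["Diet Pepsi"] * 10 + ["Diet Coke"] * 6 + ["Sprite"] * 4)

freq = prefs.value_counts()                     # frequency counts
pct = prefs.value_counts(normalize=True) * 100  # percentages
table = pd.DataFrame({"Frequency": freq, "Percent": pct,
                      "Cumulative %": pct.cumsum()})
print(table)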

Cross Tabulation - Simple tabulation may not yield the full value of the research. Most data
can be further organized in a variety of ways. Analyzing results by groups, categories, or
classes is the technique of crosstabulation. The purpose of categorization and cross-tabulation
is to allow the inspection and comparison of differences among groups. This form of analysis
also helps determine the type of relationship among variables.

Percentage cross-tabulation: The total number of respondents or observations may be used


as a base for computing the percentage in each cell. Selecting either the row percentages or the
column percentages will emphasize a particular comparison or distribution.

There is a conventional rule for determining the direction of percentages if the researcher has
identified which variable is the independent variable and which is the dependent variable. The
margin total of the independent variable should be used as the base for computing the
percentages.
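The convention can be demonstrated with a small pandas sketch; the data frame below is hypothetical, with gender treated as the independent variable so that each row of percentages sums to 100:

import pandas as pd

# hypothetical customer records: gender vs. preferred visiting time
df = pd.DataFrame({
    "gender": ["F", "F", "F", "F", "M", "M", "M", "M"],
    "time": ["after 7 PM", "after 7 PM", "after 7 PM", "before 7 PM",
             "before 7 PM", "before 7 PM", "after 7 PM", "after 10 PM"],
})

# normalize="index" gives row percentages, i.e. percentages computed on the
# margin totals of the independent variable (gender)
row_pct = pd.crosstab(df["gender"], df["time"], normalize="index") * 100
print(row_pct.round(1))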

However, due to the increase in complexity, three-variable cross-tabulation is seldom used. For
example, a popular coffee restaurant in Bangalore city observed the inflow of customers on a
randomly selected day. The data were collected and summarized in a table, from which it is
evident that nearly 75% of the female customers prefer to visit after 7 PM, whereas only 3.5% of
the male customers prefer to visit after 10 PM. The restaurant therefore has to take appropriate
measures to protect the interests of its female customers.

Measures of Central Tendencies

Among the various tools, the most often used measures are the mean, mode, and median. The mean is
the average of the responses, the mode is the most commonly given response, and the median is
the middle value of the data when data are arranged in an ascending or descending order. Data
must be at least interval scaled in order to calculate the mean, but only ordinal for the median
and nominal for the mode. The simple mean or average is probably the most commonly used
method of describing central tendency. To compute the mean all you do is add up all the values
and divide by the number of values. For example, the mean or average score of a cricketer in
the recent 20-20 format is determined by summing all the scores and dividing by the number of
matches he played.

For example, consider the match score of a batsman in the last 10 matches;

11, 16, 21, 21, 20, 34, 15, 25, 15, 22

The sum of these 10 values is 200, so the mean is 200/10 = 20.

In notation, X̄ = ∑Xi / N, where i = 1, 2, 3, …, N

The median is the value which exactly divides the data into two parts; that is, the middle of the
set of values. One way to compute the median is to list all scores in numerical order, and then
locate the score in the center of the sample. If we order the batsman scores [10 items], shown
above, [11, 16, 21, 21, 20, 34, 15, 25, 15, 22], we would get:

11, 15, 15, 16, 20, 21, 21, 22, 25, 34

Since there are 10 items, item-5 and item-6 divide the data into two equal parts. Therefore, the
median is the simple average of item-5 and item-6.

Median = (20 + 21)/2 = 20.5

If we have only 9 items, for example, [11, 16, 21, 21, 20, 15, 25, 15, and 22], then, we would
get:

11, 15, 15, 16, 20, 21, 21, 22, 25

Since there are 9 items, item-5 divides the data into two equal parts. Therefore, median is
simply the 5th entry, which is nothing but 20.

Median = 20

The mode is the most frequently occurring value in the set of scores. To determine the mode,
you might again order the scores as shown above, and then count each one. The most
frequently occurring value is the mode. In our example, the value 15 occurs twice and the value
21 also occurs twice; therefore, the distribution is bi-modal. In some distributions there is
more than one modal value; such distributions are known as multi-modal distributions. One
should note that a distribution need not have a unique mode or multiple modes.
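These three measures can be verified with Python's standard library, using the batsman's scores quoted above:

from statistics import mean, median, multimode

scores = [11, 16, 21, 21, 20, 34, 15, 25, 15, 22]  # last 10 match scores

print(mean(scores))       # 20
print(median(scores))     # 20.5, the average of the 5th and 6th ordered items
print(multimode(scores))  # [21, 15]: both occur twice, so the data are bi-modal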

Hypothesis Testing Procedure

A hypothesis is a statement of an expected result. It is an unproven proposition or supposition that
tentatively explains certain facts or phenomena. It is put forward for testing and proving.

The process of hypothesis testing goes as follows:

Step-1 State Null Hypothesis

A null hypothesis is a statement about the status quo. It is often denoted by H0.

Step-2 State Alternative Hypothesis

The alternative hypothesis states the opposite of the null hypothesis.

The symbol H1 denotes it.

Step-3 Determine the Significance Level

Statisticians have defined the decision criterion as the significance level. The significance level
is a critical probability in choosing between the null hypothesis and the alternative hypothesis.
The level of significance determines the probability level (say, .05 or .01) that is to be
considered. The researcher in a way decides “how much” he or she is willing to bet. More
appropriately, the researcher selects the odds of making an “incorrect” decision. Some gamblers
will take an 80 percent chance; others, more conservative, will take 99 percent. By convention,
95 percent is often utilized.

Step-4 Choose an Appropriate Test

If the data are nominal and the hypothesis seeks to test associations, use chi-square. If the data
are ordinal and the hypothesis seeks to examine rank correlations, use rank correlation tests. If
the hypothesis concerns the means of two samples, use the t test. If it seeks to examine variances,
use ANOVA. Likewise, select an appropriate test on the basis of the data.

Step-5 State the Decision Rule

It tells us when to accept or reject the null hypothesis. Find the table value of the test statistic for
the given degrees of freedom and significance level. In general, when the computed value is greater
than the table value, we reject the null hypothesis.

Step-6 Compute the test Statistic

Use the appropriate formula, make necessary calculations and find the value of the test statistic.

Step-7 Make Decision

Compare computed value with critical value and take decision as per laid out rule.

Step-8 Make Inferences

Draw final conclusion based on the result.

Illustrative Case

The Red Chicken restaurant is concerned about its store image, one aspect of which is the
friendliness of the service. In a personal interview, customers are asked to indicate their
perception of the service on a 5-point scale, where 1 indicates very friendly and 5 indicates very
unfriendly. The scale is assumed to be an interval scale and the distribution of the service
variable is assumed to be normal.

Step-1 Null Hypothesis

The researcher formulates the null hypothesis that the population mean is equal to 3.0:

H0: μ = 3.0

Step-2 Alternative hypothesis

The alternative hypothesis is that the mean does not equal 3.0: H1: μ ≠ 3.0

Step-3 Significance Level

The researchers have set the significance level at .05. This means that, in the long run,
the probability of making an erroneous decision when H0 is true will be fewer than five times in
100.

Step-4 Choose an Appropriate Test

The Red Chicken restaurant hired research consultants, who collected a sample of 225 interviews. The mean
score on the 5-point scale was 3.78 and the sample standard deviation was S = 1.5. They decided to
use a test based on the standardized normal distribution (for a sample this large, the t test and the
Z test coincide for all practical purposes).

Step-5 State the Decision Rule

From the tables of the standardized normal distribution, the researcher finds that a Z score of
1.96 represents a probability of .025 that a sample mean will lie more than 1.96 standard errors
above μ. Likewise, .025 of all sample means will fall more than 1.96 standard errors below μ. The
values that lie exactly on the boundary of the region of rejection are called the critical values. If
the sample mean is greater than the upper critical value, 3.196, the researchers say that the sample
result is statistically significant beyond the .05 level.

Decision rule: Reject the null hypothesis if the sample mean falls outside the critical values
2.804 and 3.196 (obtained below from the tables).

Step-6 Compute the Critical Value

Now we must transform these critical Z-values to the sampling distribution of the mean for this
study. The critical values are:

μ ± Z·Sx̄, i.e. μ ± Z (S / √n) = 3.0 ± 1.96 (1.5 / √225)

= 3.0 ± 1.96 (0.1) = 3.0 ± 0.196

Lower limit = 2.804 and Upper limit = 3.196

Step-7 Make Decision

The sample mean X̄ = 3.78 is greater than the critical value, 3.196. Hence, the sample result
is statistically significant beyond the .05 level. Therefore, the null hypothesis is rejected and the
alternative hypothesis is accepted.

Step-8 Make Inferences

The results indicate that customers believe the service is friendly. It is unlikely (less than five in
100) that this result would occur because of random sampling error.
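The whole case can be reproduced numerically with a short Python sketch using scipy; the figures are those of the illustration above:

from math import sqrt
from scipy.stats import norm

n, xbar, s, mu0, alpha = 225, 3.78, 1.5, 3.0, 0.05

se = s / sqrt(n)                    # standard error = 1.5/15 = 0.1
z_crit = norm.ppf(1 - alpha / 2)    # 1.96 for a two-tailed test at .05
lower = mu0 - z_crit * se           # 2.804
upper = mu0 + z_crit * se           # 3.196
print(lower, upper, xbar > upper)   # the sample mean 3.78 exceeds 3.196,
                                    # so the null hypothesis is rejected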

Type I and Type II Errors

The researcher runs the risk of committing two types of errors. A Type I error, which has the
probability alpha (α)—the level of significance that we have set up—is an error caused by the
rejection of the null hypothesis when it is true.

Type II error has the probability of beta (β) and it is an error caused by the failure to reject the
null hypothesis when the alternative hypothesis is true.

Without increasing the sample size, the researcher cannot simultaneously reduce both types of
error, because for a fixed sample size there is a trade-off between the two: reducing the
probability of a Type I error increases the probability of a Type II error, and vice versa. In
marketing problems, Type I errors are generally more serious than Type II errors.
The number of variables that will be simultaneously investigated is a primary consideration in
the choice of statistical techniques.

UNIT – 3
Analysis of Variance – Standard Deviation, Coefficient of Variance:
Correlation and Regression
Standard Deviation:
It is the most widely used measure of dispersion of a series and is commonly denoted by the symbol
σ (pronounced as sigma). Standard deviation is defined as the square root of the average of the
squared deviations, where such deviations of the values of the individual items in a series are
taken from the arithmetic average. It is worked out as under:

Standard deviation* (σ) = √[ ∑(Xi − X̄)² / n ]

* If we use an assumed average, A, in place of X̄ while finding deviations, then standard deviation
would be worked out as:

σ = √[ ∑(Xi − A)²/n − ( ∑(Xi − A)/n )² ]

This is also known as the short-cut method of finding σ.

Or

Standard deviation (σ) = √[ ∑fi(Xi − X̄)² / ∑fi ] ; in case of a frequency distribution, where fi
means the frequency of the ith item.

• When we divide the standard deviation by the arithmetic average of the series, the
resulting quantity is known as the coefficient of standard deviation, which is a relative
measure and is often used for comparison with similar measures of other series. When
this coefficient of standard deviation is multiplied by 100, the resulting figure is known
as the coefficient of variation. Sometimes we work out the square of the standard
deviation, known as the variance, which is frequently used in the context of analysis of
variance.
• The standard deviation (along with several related measures like variance, coefficient of
variation, etc.) is used mostly in research studies and is regarded as a very satisfactory
measure of dispersion in a series. It is amenable to mathematical manipulation because
the algebraic signs are not ignored in its calculation (as we ignore in case of mean
deviation). It is less affected by fluctuations of sampling. These advantages make

standard deviation and its coefficient a very popular measure of the scatteredness of a
series. It is popularly used in the context of estimation and testing of hypotheses.

“Standard deviation or S.D. is the square root of the mean of the squared deviations of
the individual scores from the mean of the distribution.”

To be clearer, we should note here that in computing the S.D. we square all the deviations
separately, find their sum, divide the sum by the total number of scores, and then find the square
root of this mean of the squared deviations.

So S.D. is also called the ‘Root mean square deviations from mean’ and is generally denoted by
the small Greek letter σ (sigma).

Symbolically, the standard deviation for ungrouped data is defined as:

σ = √( ∑d² / N ) … (18)

Where d = deviation of individual scores from the mean;

(Some authors use ‘x’ as the deviation of individual scores from the mean)

∑ = sum total of; N = total number of cases.

The mean squared deviation is referred to as the variance. In simple words, the square of the
standard deviation is called the second moment of dispersion, or variance.
Computation of S.D. (Ungrouped data):
Computation of S.D. (Ungrouped data):

There are two ways of computing S.D. for ungrouped data:


(a) Direct method.
(b) Short-cut method.

(a) Direct Method:

Find the standard deviation for the scores given below:


X = 12, 15, 10, 8, 11, 13, 18, 10, 14, 9

This method uses formula (18) for finding S.D. which involves the following steps:

Step 1:
Calculate arithmetic mean of the given data:

Step 2:
Write the value of the deviation d, i.e. X – M, against each score in column 2. Here the
deviations of the scores are to be taken from the mean, 12. Now you will find that ∑d or ∑(X – M)
is equal to zero. Think: why is it so? Check it. If this is not so, find out the error in computation
and rectify it.

Step 3:
Square the deviations and write the value of d2 against each score in column 3. Find the sum of
squared deviations. ∑d2 = 84.

Table: Computation of S.D:

The required standard deviation is 2.9.

Step 4:
Calculate the mean of the squared deviations and then find out the positive square root for
getting the value of standard deviation i.e. σ.

Using formula (19), the variance is σ² = ∑d²/N = 84/10 = 8.4, and hence σ = √8.4 ≈ 2.9.
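A minimal Python check of the direct method, using the ten scores of the example:

from math import sqrt

scores = [12, 15, 10, 8, 11, 13, 18, 10, 14, 9]
m = sum(scores) / len(scores)                # arithmetic mean = 12
d2 = [(x - m) ** 2 for x in scores]          # squared deviations from the mean
variance = sum(d2) / len(scores)             # 84/10 = 8.4
print(m, variance, round(sqrt(variance), 1)) # 12, 8.4, 2.9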

(b) Short-cut Method:


In most cases the arithmetic mean of the given data happens to be a fractional value, and then
the process of taking deviations and squaring them becomes tedious and time-consuming in the
computation of S.D.

To facilitate computation in such situations, the deviations may be taken from an assumed
mean. The adjusted short-cut formula for calculating S.D. will then be:

σ = √[ ∑d²/N − (∑d/N)² ]

Where,

d = Deviation of the score from an assumed mean, say AM; i.e. d = (X – AM).
d2 = The square of the deviation.
∑d = The sum of the deviations.
∑d2 = The sum of the squared deviations.
N = No. of the scores or variates.

The computation procedure is clarified in the following example:

Example: Find S.D. for the scores given in table 4.5 of X = 12, 15, 10, 8, 11, 13, 18, 10, 14, 9.
Use short-cut method.

Solution:
Let us take assumed mean AM = 11.

The deviations and squares of deviations needed in formula are given in the following
table:

Putting the values from table in formula, the S.D.

The short-cut method gives the same result as we obtained by using the direct method in the
previous example, but it tends to reduce the calculation work in situations where the
arithmetic mean is not a whole number.
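The student may verify this with a small Python sketch of the short-cut formula, using the same scores and the assumed mean of 11:

from math import sqrt

scores = [12, 15, 10, 8, 11, 13, 18, 10, 14, 9]
am = 11                                    # assumed mean
d = [x - am for x in scores]               # deviations from the assumed mean
n = len(scores)
sd = sqrt(sum(di ** 2 for di in d) / n - (sum(d) / n) ** 2)
print(round(sd, 1))                        # 2.9, the same as the direct method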

Computation of S.D. (Grouped data):

(a) Long Method/Direct Method:

Example: Find the S.D. for the following distribution:

Here also, the first step is to find the mean M, for which we have to take the mid-points of the
c.i.'s, denoted by X′, and find the product fX′. The mean is given by ∑fX′/N. The second step is to
find the deviations of the mid-points of the class intervals X′ from the mean, i.e. X′ − M, denoted by d.
The third step is to square the deviations and find the product of the squared deviations and the
corresponding frequency.

To solve the above problem, c.i.’s are written in column 1, frequencies are written in column 2,
mid-points of c.i’s i.e. X’ are written in column 3, the product of fX’ is written in column 4, the
deviation of X’ from the mean is written in column 5, the squared deviation d2 is written in
column 6, and the product fd2 is written in column 7,
As shown below:

So, the deviations of the midpoints are to be taken from 11.1.

Thus, the required standard deviation is 4.74.

(b) Short-cut Method:

Sometimes, in the direct method, it is observed that the deviations from the actual mean result in
decimals, and the values of d² and fd² become difficult to calculate. In order to avoid this problem
we follow a short-cut method for calculating standard deviation.
In this method, instead of taking the deviations from actual mean, we take deviations from a
suitably chosen assumed mean, say A.M.

The following formula is then used for calculating S.D.:

σ = √[ ∑fd²/N − (∑fd/N)² ] … (22)

Where d is the deviation from the assumed mean.

The following steps are then involved in the computation of standard deviation:

(i) Obtain deviations of the variates from assumed mean A.M. as d = (X – AM)

(ii) Multiply these deviations by the corresponding frequencies to get the column fd. The
sum of this column gives ∑fd.
(iii) Multiply fd by the corresponding deviation (d) to get the column fd². The sum of this
column gives ∑fd².
(iv) Use formula (22) to find S.D.

Example:
Using short-cut method find S.D. of the data in table 4.7.

Solution:
Let us take assumed mean AM = 10. Other calculations needed for calculating S.D. are given in
table 4.8.

Putting values from table

Using the formula (19), the variance

(c) Step-Deviation Method:

In this method, in column 1 we write the c.i.'s; in column 2 we write the frequencies; in column 3
we write the values of d, where d = (X′ − AM)/i; in column 4 we write the product fd; and in
column 5 we write the values of fd², as shown below:

Here, the assumed mean is the mid-point of the c.i. 9-11, i.e. 10, so the deviations d have been
taken from 10 and divided by 3, the length of the c.i. The formula for S.D. in the step-deviation
method is:

σ = i × √[ ∑fd²/N − (∑fd/N)² ]

Where i = length of the c.i’s,

f= frequency;

d = deviations of the mid-points of c.i. ‘s from the assumed mean (AM) in class interval (i) units,
Which can be stated:

Putting values from the table

The procedures of calculation can also be stated in following manner:

Combined Standard Deviation (σcomb):

When two sets of scores have been combined into a single lot, it is possible to calculate the σ of
the total distribution from the σ's of the two component distributions. The formula is:

σcomb = √[ (N1σ1² + N2σ2² + N1d1² + N2d2²) / (N1 + N2) ] … (24)

Where σ1 , = SD of distribution 1
σ2 = SD of distribution 2
d1 = (M1 – Mcomb)
d2 = (M2 – Mcomb)
N1 = No. of cases in distribution 1.
N2 = No. of cases in distribution 2.
An example will illustrate the use of the formula.

Example:
Suppose we are given the means and SD's on an Achievement Test for two classes differing in
size, and are asked to find the σ of the combined group. Data are as follows:

First, we find that

The formula (24) can be extended to any number of distributions. For example, in the case of
three distributions it will be:

σcomb = √[ (N1σ1² + N2σ2² + N3σ3² + N1d1² + N2d2² + N3d3²) / (N1 + N2 + N3) ]
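Since the same pattern holds for any number of groups, a small Python helper can compute the combined mean and σ; the class figures below are hypothetical:

from math import sqrt

def combined_sd(groups):
    # groups: list of (N, mean, sd) tuples for the component distributions
    n_total = sum(n for n, _, _ in groups)
    m_comb = sum(n * m for n, m, _ in groups) / n_total       # combined mean
    num = sum(n * (sd ** 2 + (m - m_comb) ** 2) for n, m, sd in groups)
    return m_comb, sqrt(num / n_total)

# two hypothetical classes: (N, mean, SD)
print(combined_sd([(50, 60.0, 8.0), (30, 70.0, 10.0)]))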

Properties of S.D:

1. If each variate value is increased by the same constant value, the value of S.D. of the
distribution remains unchanged:

We will discuss this effect upon S.D. by considering an illustration. The table (4.10) shows
original scores of 5 students in a test with an arithmetic mean score of 20.

New scores (X’) are also given in the same table which we obtain by adding a constant 5 to each
original score. Using formula for ungrouped data, we observe that S.D. of the scores remains
the same in both the situations.

Thus, the value of S.D. in both situations remains same.

2. When a constant value is subtracted from each variate, the value of S.D. of the new
distribution remains unchanged:
The students can also verify that when we subtract a constant from each score, the mean is
decreased by the constant, but the S.D. remains the same. This is because the deviations d
remain unchanged.

3. If each observed value is multiplied by a constant value, S.D. of the new observations
will also be multiplied by the same constant:
Let us multiply each score of the original distribution (Table 4.10) by 5.

Thus, the S.D. of the new distribution will be multiplied by the same constant (here, it is 5).
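All three properties can be checked in a few lines of Python, using a hypothetical set of scores:

import statistics as st

x = [10, 15, 20, 25, 30]                     # hypothetical scores
sd = st.pstdev(x)                            # population S.D.

print(sd, st.pstdev([v + 5 for v in x]))     # unchanged by adding a constant
print(sd, st.pstdev([v - 5 for v in x]))     # unchanged by subtracting one
print(5 * sd, st.pstdev([v * 5 for v in x])) # multiplied by the constant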

Merits of S.D:

1. Standard deviation is rigidly defined and its value is always definite.


2. It is based on all the observations of data.
3. It is capable of further algebraic treatment and possesses many mathematical properties.
4. Unlike Q and AD, it is less affected by fluctuations of scores.
5. Unlike AD, it does not ignore the negative signs; by squaring the deviations it
overcomes this difficulty.
6. It is the most reliable and accurate measure of variability. It always goes with the mean,
which is the most stable measure of central tendency.
7. S.D. gives a measure that has comparable meaning from one test to another. Above all,
the units of the normal curve are expressed in terms of S.D.

Demerits of S.D:

1. S.D. is difficult to understand and not easy to calculate.


2. S.D. gives more weight to extreme scores and less to those which are nearer to the
mean. This is because the squares of the deviations which are big in size are
proportionately greater than the squares of those deviations which are comparatively
small.

Uses of S.D:

1. S.D. is used when our thrust is to measure the variability having the greatest stability.
2. S.D. is used when extreme deviations should exercise a proportionally greater effect
upon the variability.
3. S.D. is used for calculating the further statistics like coefficient of correlation, standard
scores, standard errors, Analysis of Variance, Analysis of Co-variance etc.
4. When the interpretation of scores is made in terms of the NPC, S.D is used.

5. When we want to determine the reliability and validity of test scores, S.D. is used.

Coefficient of Variance (Relative Dispersion)

The coefficient of variation (CV) is a measure of relative variability. It is the ratio of the
standard deviation to the mean (average). For example, the expression “The standard deviation
is 15% of the mean” is a CV.

The CV is particularly useful when you want to compare results from two different surveys or
tests that have different measures or values. For example, if you are comparing the results from
two tests that have different scoring mechanisms. If sample A has a CV of 12% and sample B
has a CV of 25%, you would say that sample B has more variation, relative to its mean. The
measures of dispersion give us an idea about the extent to which scores are scattered around
their central value. Therefore, two frequency distributions having the same central values can
be compared directly with the help of various measures of dispersion.

If, for example, on a test in a class, boys have a mean score M1 = 60 with S.D. σ1 = 15 and girls
have a mean score M2 = 60 with S.D. σ2 = 10, then clearly the girls, who have the smaller S.D.,
are more consistent in scoring around their average score than the boys.

We have situations when two or more distributions having unequal means or different units of
measurements are to be compared in respect of their scattered-ness or variability. For making
such comparisons we use coefficients of relative dispersion or coefficient of variations (C.V.).

The formula is:


V = 100σ / M

(coefficient of variation, or coefficient of relative variability)

V gives the percentage which σ is of the test mean. It is thus a ratio which is independent of the
units of measurement.

V is restricted in its use owing to certain ambiguities in its interpretation. It is defensible when
used with ratio scales—scales in which the units are equal and there is a true zero or reference
point.

For example, V may be used without hesitation with physical scales—those concerned with
linear magnitudes, weight and time.

Two cases arise in the use of V with ratio scales:

(1) When units are dissimilar, and


(2) When M’s are unequal, the units of the scale being the same.

1. When units are unlike:

Example:
A group of 10-year-old boys has a mean height of 137 cm with a σ of 6.2 cm. The same group
of boys has a mean weight of 30 kg with a σ of 3.5 kg. In which trait is the group more variable?

Solution:
Obviously, we cannot compare centimetres and kilograms directly, but we can compare the
relative variability of the two distributions in terms of V.

In the present example, two groups not only differ in respect of mean but also in units of
measurements which is cm. in the first case and kg. in the second. Coefficient of variation may
be used to compare the variability of the groups in such a situation.

We thus calculate: V(height) = 100 × 6.2/137 = 4.53 and V(weight) = 100 × 3.5/30 = 11.67.

Thus, from the above calculation it appears that these boys are about two and a half times as
variable (11.67/4.53 = 2.58) in weight as in height.

2. When means are unequal, but scale units are the same:
Suppose we have the following data on a test for a group of boys and a group of men:
Group M σ V
Boys 75 20 26.67
Men 65 25 38.46

Then, compare:

(i) The performance of the two groups on the test.
(ii) The variability of scores in the two groups.

Solution:
(i) Since the mean score of the group of boys is greater than that of the men, the boys' group
has given a better performance on the test.

(ii) For comparing two groups in respect of variability among scores, coefficient of variations
are calculated V of boys = 26.67 and V of men = 38.46.

Therefore, the variability of scores is greater in the group of men. The boys' group, having
the lesser C.V., is more consistent in scoring around its average score as compared to the
men's group.

S.D. and the spread of observations:


In a symmetrical (normal) distribution,

(i) Mean ± 1 SD covers 68.26% of the scores.

Mean ± 2 SD covers 95.44% of the scores.

Mean ± 3 SD covers 99.73% of the scores.

(ii) In large samples (N = 500), the Range is about 6 times SD.

If N is about 100, the Range is about 5 times the SD.

If N is about 50, the Range is about 4.5 times the SD.

If N is about 20, the Range is about 3.7 times the S.D.

Interpretation of Standard Deviation:


The standard deviation characterizes the nature of the distribution of scores. When the scores are
more widely spread, the S.D. is larger; when the scores are less scattered, the S.D. is smaller. For
interpreting the value of this measure of dispersion, we must understand that the greater the value
of σ, the more scattered are the scores from the mean.
As in the case of mean deviation, the interpretation of standard deviation requires the values of
M and N for consideration.

In the following examples, the required values of σ, mean and N are given:

Here, the dispersion is more in example 2 as compared to example 1. It means the values are
more scattered in example 2, as compared to the values of example 1.

How to find a coefficient of variation

Use the following formula to calculate the CV by hand for a population or a sample.

σ is the standard deviation for a population; its sample counterpart is “s”.

μ is the mean for the population; its sample counterpart is X̄.

In other words, to find the coefficient of variation, divide the standard deviation by the mean
and multiply by 100.

How to find a coefficient of variation in Excel:

You can calculate the coefficient of variation in Excel using the built-in formulas for standard
deviation and mean. For a given column of data (e.g. A1:A10), you could enter
“=STDEV(A1:A10)/AVERAGE(A1:A10)” and then multiply the result by 100.

How to Find a Coefficient of Variation by hand: Steps:

Sample question: Two versions of a test are given to students. One test has pre-set answers and
a second test has randomized answers. Find the coefficient of variation.

Regular Test Randomized Answers

Mean 50.1 45.8

SD 11.2 12.9

Step 1: Divide the standard deviation by the mean for the first sample:

11.2 / 50.1 = 0.22355

Step 2: Multiply Step 1 by 100:

0.22355 * 100 = 22.355%

Step 3: Divide the standard deviation by the mean for the second sample:

12.9 / 45.8 = 0.28166

Step 4: Multiply Step 3 by 100:

0.28166 * 100 = 28.166%
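The same steps, expressed as a small Python helper:

def cv(sd, mean):
    return sd / mean * 100        # coefficient of variation, in percent

print(round(cv(11.2, 50.1), 3))   # 22.355 (regular test)
print(round(cv(12.9, 45.8), 3))   # 28.166 (randomized answers)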

Correlation and Regression

Correlation

Definitions of Correlation:

If the change in one variable appears to be accompanied by a change in the other variable, the
two variables are said to be correlated and this interdependence is called correlation or
covariation.

In short, the tendency of simultaneous variation between two variables is called correlation or
covariation. For example, there may exist a relationship between heights and weights of a
group of students, the scores of students in two different subjects are expected to have an
interdependence or relationship between them.

To measure the degree of relationship or covariation between two variables is the subject
matter of correlation analysis. Thus, correlation means the relationship or “going-
togetherness” or correspondence between two variables. In statistics, correlation is a method of
determining the correspondence or proportionality between two series of measures (or scores).
To put it simply, correlation indicates the relationship of one variable with the other.

Meaning of Correlation:

To measure the degree of association or relationship between two variables quantitatively, an


index of relationship is used and is termed as co-efficient of correlation.

Co-efficient of correlation is a numerical index that tells us to what extent the two variables are
related and to what extent the variations in one variable changes with the variations in the
other. The co-efficient of correlation is always symbolized either by r or ρ (Rho).

The notation ‘r’ is known as the product moment correlation co-efficient or Karl Pearson's
Coefficient of Correlation. The symbol ‘ρ’ (Rho) is known as the Rank Difference Correlation
coefficient or Spearman's Rank Correlation Coefficient.

The size of ‘r’ indicates the amount (or degree or extent) of correlation between the two
variables. If the correlation is positive, the value of ‘r’ is positive, and if the correlation is
negative, the value of ‘r’ is negative. Thus, the sign of the coefficient indicates the kind of
relationship. The value of ‘r’ varies from +1 to -1.

Correlation can vary in between perfect positive correlation and perfect negative correlation.
The top of the scale will indicate perfect positive correlation and it will begin from +1 and then
it will pass through zero, indicating entire absence of correlation.

The bottom of the scale will end at -1 and it will indicate perfect negative correlation. Thus
numerical measurement of the correlation is provided by the scale which runs from +1 to -1.

[NB: The coefficient of correlation is a number and not a percentage. It is generally rounded
off to two decimal places.]

Need for Correlation:

Correlation gives meaning to a construct. Correlational analysis is essential for basic psycho-
educational research. Indeed most of the basic and applied psychological research is
correlational in nature.

Correlational analysis is required for:

1. Finding characteristics of psychological and educational tests (reliability, validity, item


analysis, etc.).
2. Testing whether certain data are consistent with a hypothesis.
3. Predicting one variable on the basis of the knowledge of the other(s).
4. Building psychological and educational models and theories.
5. Grouping variables/measures for parsimonious interpretation of data.
6. Carrying out multivariate statistical tests (Hotelling's T²; MANOVA, MANCOVA,
discriminant analysis, factor analysis).
7. Isolating influence of variables.

Types of Correlation:

In a bivariate distribution, the correlation may be:

1. Positive, Negative and Zero Correlation; and


2. Linear or Curvilinear (Non-linear).

1. Positive, Negative or Zero Correlation:

When the increase in one variable (X) is followed by a corresponding increase in the other
variable (Y); the correlation is said to be positive correlation. The positive correlations range
from 0 to +1; the upper limit i.e. +1 is the perfect positive coefficient of correlation.

The perfect positive correlation specifies that, for every unit increase in one variable, there is
proportional increase in the other. For example “Heat” and “Temperature” have a perfect
positive correlation.

If, on the other hand, the increase in one variable (X) results in a corresponding decrease in the
other variable (Y), the correlation is said to be negative correlation.

The negative correlation ranges from 0 to – 1; the lower limit giving the perfect negative
correlation. The perfect negative correlation indicates that for every unit increase in one
variable, there is proportional unit decrease in the other.

Zero correlation means no relationship between the two variables X and Y; i.e. the change in
one variable (X) is not associated with the change in the other variable (Y). For example, body
weight and intelligence, shoe size and monthly salary; etc. The zero correlation is the mid-point
of the range – 1 to + 1.

2. Linear or Curvilinear Correlation:


Linear correlation exists when the ratio of change between the two variables is constant, either
in the same direction or in the opposite direction, and the graphical representation of the one
variable with respect to the other is a straight line.

Consider another situation. First, with increase of one variable, the second variable increases
proportionately upto some point; after that with an increase in the first variable the second
variable starts decreasing.

The graphical representation of the two variables will be a curved line. Such a relationship
between the two variables is termed as the curvilinear correlation.

Methods of Computing Co-Efficient of Correlation:

In case of ungrouped data of a bivariate distribution, the following three methods are used
to compute the value of co-efficient of correlation:

1. Scatter diagram method.


2. Pearson’s Product Moment Co-efficient of Correlation.
3. Spearman’s Rank Order Co-efficient of Correlation.

1. Scatter Diagram Method:
Scatter diagram or dot diagram is a graphic device for drawing certain conclusions about the
correlation between two variables.

In preparing a scatter diagram, the observed pairs of observations are plotted by dots on a
graph paper in a two dimensional space by taking the measurements on variable X along the
horizontal axis and that on variable Y along the vertical axis.

The placement of these dots on the graph reveals the change in the variable as to whether they
change in the same or in the opposite directions. It is a very easy, simple but rough method of
computing correlation.

The frequencies or points are plotted on a graph by taking convenient scales for the two series.
The plotted points will tend to concentrate in a band of greater or smaller width according to
its degree. ‘The line of best fit’ is drawn with a free hand and its direction indicates the nature
of correlation. Scatter diagrams, as an example, showing various degrees of correlation are
shown in Fig. 5.1 and Fig. 5.2.

If the line goes upward and this upward movement is from left to right it will show positive
correlation. Similarly, if the lines move downward and its direction is from left to right, it will
show negative correlation.

The degree of slope will indicate the degree of correlation. If the plotted points are scattered
widely it will show absence of correlation. This method simply describes the ‘fact’ that
correlation is positive or negative.
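A quick matplotlib sketch of a scatter diagram; the paired scores below are hypothetical:

import matplotlib.pyplot as plt

# hypothetical paired observations on variables X and Y
x = [32, 34, 36, 40, 45, 50, 52, 58, 60, 65]
y = [25, 41, 30, 35, 40, 42, 50, 48, 55, 60]

plt.scatter(x, y)       # one dot per (X, Y) pair
plt.xlabel("Variable X")
plt.ylabel("Variable Y")
plt.title("Dots rising from left to right suggest positive correlation")
plt.show()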

2. Pearson’s Product Moment Co-efficient of Correlation:
The coefficient of correlation, r, is often called the “Pearson r” after Professor Karl Pearson, who
developed the product-moment method following the earlier work of Galton and Bravais.

Coefficient of correlation as ratio:


The product-moment coefficient of correlation may be thought of essentially as that ratio which
expresses the extent to which changes in one variable are accompanied by—or dependent
upon-changes in a second variable.

As an illustration, consider the following simple example which gives the paired heights
and weights of five college students:

The mean height is 69 inches, the mean weight is 170 pounds, and the σ's are 2.24 inches and
13.69 pounds, respectively. In column (4) the deviation (x) of each student's height from the
mean height, and in column (5) the deviation (y) of each student's weight from the mean
weight, are given. The product of these paired deviations (xy) in column (6) is a measure of the
agreement between individual heights and weights. The larger the sum of the xy column, the
higher the degree of correspondence. In the above example the value of ∑xy/N is 55/5 or 11.
But there is no fixed maximum value of ∑xy/N that corresponds to perfect agreement, i.e.
r = ± 1.00.

Thus, ∑xy/N would not yield a suitable measure of relationship between x and y. The reason is
that such an average is not a stable measure, as it is not independent of the units in which
height and weight have been expressed.

In consequence, this ratio will vary if centimeters and kilograms are employed instead of inches
and pounds. One way to avoid the trouble-some matter of differences in units is to express each
deviation as a σ score or standard score or Z score, i.e. to divide each x and y by its own σ.

Each x and y deviation is then expressed as a ratio, and is a pure number, independent of the
test units. The sum of the products of the σ scores column (9) divided by N yields a ratio which
is a stable expression of relationship. This ratio is the “product-moment” coefficient of
correlation. In our example, its value of .36 indicates a fairly high positive correlation between
height and weight in this small sample.

The student should note that our ratio or coefficient is simply the average product of the σ
scores of corresponding X and Y measures i.e.

Nature of rxy:

(i) rxy is a product-moment r.

(ii) rxy is a ratio, and a pure number independent of the units of measurement.
(iii) rxy can be +ve or -ve, bound by the limits -1.00 to +1.00.
(iv) rxy may be regarded as an arithmetic mean (rxy is the mean of standard-score products).
(v) rxy is not affected by any linear transformation of scores on either X or Y or both.
(vi) When variables are in the standard-score form, r gives a measure of the average amount of
change in one variable associated with a change of one unit in the other variable.
(vii) rxy = √(byx·bxy), where byx = regression coefficient of Y on X and bxy = regression coefficient
of X on Y; i.e. rxy is the geometric mean of the slopes of the two regression lines.
(viii) rxy is not influenced by the magnitude of the means (scores are always relative).
(ix) rxy cannot be computed if one of the variables has no variance, i.e. S²X or S²Y = 0.
(x) rxy = .60 implies the same magnitude of relationship as rxy = -.60. The sign tells the
direction of the relationship, and the magnitude its strength.
(xi) df for rxy is N - 2, which is used for testing the significance of rxy. Testing the significance
of r is testing the significance of the regression. The regression line involves a slope and an
intercept, hence 2 df are lost. So when N = 2, rxy is either +1.00 or -1.00, as there is no freedom
for sampling variation in the numerical value of r.

A. Computation of rxy (Ungrouped Data):

Here, the choice of formula for the computation of r depends on where the deviations are taken
from. In different situations deviations can be taken either from the actual mean, from zero, or
from an assumed mean. The formula conveniently applied for the calculation of the coefficient
of correlation depends upon the mean value (whether in fractions or whole numbers).

(i) The Formula of r when Deviations are taken from Means of the Two Distributions X and Y.

rxy = ∑xy / (N σx σy)

Where rxy = Correlation between X and Y

x = deviation of any X score from the mean in the test X

y = deviation of corresponding Y score from the mean in test Y.

∑xy = Sum of all the products of deviations (X and Y)

σx and σy = standard deviations of the distributions of X and Y scores,

in which x and y are deviations from the actual means, and ∑x² and ∑y² are the sums of the
squared deviations of x and y taken from the two means.

This formula is preferred:

1. When mean values of both the variables are not in fraction.

2. When finding the correlation between short, ungrouped series (say, twenty-five cases
or so).
3. When deviations are to be taken from actual means of the two distributions.

The steps necessary are illustrated in Table 5.1. They are enumerated here:
Step 1:
List in parallel columns the paired X and Y scores, making sure that corresponding scores are
together.
Step 2:
Determine the two means Mx and My. In table 5.1, these are 7.5 and 8.0, respectively.
Step 3:
Determine for every pair of scores the two deviations x and y. Check them by finding algebraic
sums, which should be zero.
Step 4:
Square all the deviations, and list in two columns. This is for the purpose of computing σx and
σy.
Step 5:
Sum the squares of the deviations to obtain ∑x² and ∑y². Find the xy product for each pair and
sum these for ∑xy.
Step 6:
From these values compute σx and σy.

An alternative and shorter solution:
There is an alternative and shorter route that omits the computation of σx and σy, should they
not be needed for any other purpose.

Applying Formula (28):

(ii) The Calculation of rxy from Original Scores or Raw Scores:

It is another procedure with ungrouped data, which does not require the use of deviations. It
deals entirely with original scores. The formula may look forbidding but is really easy to apply.

This formula is preferred:

i. When computing r from direct raw scores.
ii. When the data are small, ungrouped series of original scores.
iii. When the mean values are in fractions.
iv. When a good calculating machine is available.

Formula (29) is: rxy = (N∑XY − ∑X·∑Y) / √{[N∑X² − (∑X)²][N∑Y² − (∑Y)²]}, where X and Y
are original scores in variables X and Y. The other symbols tell what is done with them.

We follow the steps that are illustrated in Table 5.2:

Step 1:
Square all X and Y measurements.

Step 2:
Find the XY product for every pair of scores.

Step 3:
Sum the X's, the Y's, the X²'s, the Y²'s, and the XY's.

Step 4:
Apply formula (29):
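A direct translation of formula (29) into Python; the paired scores below are hypothetical:

from math import sqrt

def pearson_r(xs, ys):
    # raw-score (product-moment) formula: no deviations needed
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

# hypothetical paired test scores
print(round(pearson_r([12, 15, 10, 8, 11], [14, 16, 11, 9, 12]), 3))  # ≈ 0.987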

(iii) Computation of rxy when deviations are taken from an Assumed Mean:
Formula (28) is useful in calculating r directly from two ungrouped series of scores, but it has
the disadvantage that it requires the “long method” of calculating means and σ's. The deviations
x and y, when taken from the actual means, are usually decimals, and the multiplication and
squaring of these values is often a tedious task.
For this reason, even when working with short ungrouped series, it is often easier to assume
means, calculate deviations from these A.M.'s, and apply formula (30).

This formula is preferred:


1. When actual means are usually decimals and the multiplication and squaring of these
values is often a tedious task.
2. When deviations are taken from A.M.’s.
3. When we are to avoid fractions.

The steps in computing r may be outlined as follows:


Step 1:
Find the mean of Test 1 (X) and the mean of Test 2 (Y). The means as shown in Table 5.3
MX = 62.5 and MY = 30.4 respectively.

Step 2:
Choose A.M.’s of both X and Y i.e. A.M.X as 60.0 and A.M.Y as 30.0.
Step 3:
Find the deviation of each score on Test 1 from its A.M., 60.0, and enter it in column x’. Next
find the deviation of each score in Test 2 from its A.M., 30.0, and enter it in column y’.
Step 4:
Square all of the x′ and all of the y′ and enter these squares in columns x′² and y′², respectively.
Total these columns to obtain ∑x′² and ∑y′².
Step 5:
Multiply x’ and y’, and enter these products (with due regard for sign) in the x’y’ column. Total
x’y’ column, taking account of signs, to get ∑x’y’.
Step 6:
The corrections, Cx and Cy, are found by subtracting AMX from MX and AMY from MY. Thus,
Cx is found to be 2.5 (62.5 – 60.0) and Cy to be .4 (30.4 – 30.0).
Step 7:
Substitute ∑x′y′ = 334, ∑x′² = 670 and ∑y′² = 285 in formula (30), as shown in Table 5.3,
and solve for rxy.

Properties of r:

1. The value of the coefficient of correlation r remains unchanged when a constant is


added to one or both variables:
In order to observe the effect on the coefficient of correlation r when a constant is added to one
or both of the variables, we consider an example. Now we add a score of 10 to each score in X and
20 to each score in Y, and represent these new scores by X′ and Y′ respectively.

The calculations for computing r for original and new pairs of observations are given in
Table 5.4:

The same formula for new scores can be written as:

Thus, we observe that the value of the coefficient of correlation r remains unchanged when a
constant is added to one or both variables.

2. The value of the coefficient of correlation r remains unchanged when a constant is


subtracted from one or both variables:
Students can examine this by taking an example. When a constant is subtracted from each score
of one or both variables, the value of the coefficient of correlation r likewise remains unchanged.

3. The value of the coefficient of correlation r remains unaltered when one or both sets
of variate values are multiplied by some constant:
In order to observe the effect of multiplying the variables by some constant on the value of r,
we arbitrarily multiply those original scores of first and second sets in the previous example by
10 and 20 respectively.

The r between X’ and Y’ may then be calculated as under:

Thus, we observe that the value of the coefficient of correlation r remains unchanged when a
constant is multiplied with one or both sets of variate values.

4. The value of r will remain unchanged even when one or both sets of variate values are
divided by some constant: Students can examine this by taking an example.
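All four properties can be examined at once with Python's statistics module (Python 3.10+); the paired scores are hypothetical:

import statistics as st

x = [12, 15, 10, 8, 11]
y = [14, 16, 11, 9, 12]
r = st.correlation(x, y)

# adding a constant to X and multiplying Y by a positive constant leaves r unchanged
print(r, st.correlation([v + 10 for v in x], [v * 20 for v in y]))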

B. Coefficient of Correlation in Grouped Data:


When the number of pairs of measurements (N) on two variables X and Y is large, or even
moderate in size, and when no calculating machine is available, the customary procedure is to
group data in both X and Y and to form a scatter diagram or correlation diagram, which is also
called a two-way frequency distribution or bivariate frequency distribution. The choice of the size
of the class interval and the limits of the intervals follows much the same rules as were given
previously. To clarify the idea, we consider bivariate data concerned with the scores earned by a
class of 20 students in Physics and Mathematics examinations.

Preparing a Scatter diagram:


In setting up a double grouping of data, a table is prepared with columns and rows. Here, we
classify each pair of variates simultaneously in the two classes, one representing score in
Physics (X) and the other in Mathematics (Y) as shown in Table 5.6.

The scores of 20 students in both Physics (X) and Mathematics (Y) are shown in Table
below:

We can easily prepare a bivariate frequency distribution table by putting tallies for each pair of
scores. The construction of a scattergram is quite simple. We have to prepare a table as shown
in the diagram above.

Along the left hand margin the class intervals of X-distribution are laid off from bottom to top
(in ascending order). Along the top of the diagram the c.i’s of Y-distribution are laid off from
left to right (in ascending order).

Each pair of scores (both in X and Y) is represented through a tally in the respective cell. No. 1
student has secured 32 in Physics (X) and 25 in Mathematics (Y). His score of 32 in (X) places
him in the last row and 25 in (Y) places him in the second column. So, for the pair of scores (32,
25) a tally will be marked in the second column of 5th row.

In a similar way, in case of No. 2 student, for scores (34, 41), we shall put a tally in the 4th
column of the 5th row. Likewise, 20 tallies will be put in the respective rows and columns. (The
rows will represent the X-scores and the columns will represent the Y-scores).

Along the right-hand margin the fx column, the number of cases in each c.i., of X-distribution
are tabulated and along the bottom of the diagram in the fy row the number of cases in each c.i.,
of Y-distribution are tabulated.
The total of fx column is 20 and the total of fy row is also 20. It is in fact a bi-variate distribution
because it represents the joint distribution of two variables. The scattergram is then a
“correlation table.”

Calculation of r from a correlation table:

The following outline of the steps to be followed in calculating r will be best understood
if the student will constantly refer to Table 5.7 as he reads through each step:

Step 1:
Construct a scattergram for the two variables to be correlated, and from it draw up a
correlation table.

Step 2:
Count the frequencies of each c.i. of distribution – X and write it in the fx column. Count the
frequencies for each c.i. of distribution – Y and fill up the fy row.

Step 3:
Assume a mean for the X-distribution and mark off the c.i. in double lines. In the given
correlation table, let us assume the mean at the c.i. 40-49 and put double lines as shown in the
table. The deviations above the line of the A.M. will be positive and the deviations below it will
be negative.

The deviation against the line of the A.M., i.e. against the c.i. where we assumed the mean, is
marked 0 (zero); above it the d's are noted as +1, +2, +3, and below it d is noted as -1.

Now dx column is filled up. Then multiply fx. and dx of each row to get fdx.
Multiply dx and fdx of each row to get fdx2.

[Note: While computing the S.D. in the assumed mean method we were assuming a mean,
marking the d’s and computing fd and fd2. Here also same procedure is followed.]

Step 4:
Adopt the same procedure as in step 3 and compute dy, fdy and fdy2. For the distribution-Y, let
us assume the mean in the c.i. 20-29 and put double lines to mark off the column as shown in
the table. The deviations to the left of this column will be negative and right be positive.

Thus, d for the column where mean is assumed is marked 0 (zero) and the d to its left is marked
– 1 and d’s to its right are marked +1, +2 and +3. Now dy column is filled up. Multiply the
values of fy and dy of each column to get fdy. Multiply the values of dy and fdy to each column to
get fdy2.

Step 5:
As this phase is an important one, we are to mark carefully for the computation of dy for
different c.i.’s of distribution X and dx for different c.i.’s of distribution -Y.

dy for different c.i.'s of distribution X: In the first row, 1 f is under the column 20-29, whose
dy is 0 (look at the bottom: the dy entry of this column is 0). Again, 1 f is under the column
40-49, whose dy is +2. So dy for the first row = (1 × 0) + (1 × 2) = +2.

In the second row we find that:

1 f is under the column, 40-49 whose dy is + 2 and


2 fs are under the column, 50-59 whose dy’s are + 3 each.
So dy for 2nd row = (1 x 2) + (2 X 3) = 8.

In the third row,

2 fs are under the column, 20-29 whose dy‘s are 0 each,


2 fs are under the column, 40-49 whose dy‘s are +2 each, and 1 f is under the column, 50-59
whose dy is +3.
So dy for the 3rd row = (2 x 0) + (2 x 2) + (1 X 3) = 7.

In the 4th row,


3 fs are under the column, 20-29 whose dy‘s are 0 each,
2 fs are under the column, 30-39 whose dy‘s are +1 each, and 1 f is under the column, 50-59
whose dy is + 3,
So dy for the 4th row = (3 X 0) + (2 X 1) + (1 x 3) = 5.

Likewise in the 5th row


dy for the 5th row = (2 x – 1) + (1 x 0) + (1 x 2) = 0
dx for different c.i.'s of distribution-Y:

In the first column,
2 fs are against the row, 30-39 whose dx is – 1.
So dx of the 1st column = (2 x – 1) = – 2

In the second column,


1 f is against the c.i., 70-79 whose dx is +3,
2 fs are against the c.i., 50-59 whose dx‘s are +1 each,
3 fs are against the c.i., 40-49 whose dx‘s are 0 each,
1 f is against the c.i., 30-39 whose dx is – 1.
So dx for the 2nd column = (1 × 3) + (2 × 1) + (3 × 0) + (1 × –1) = 4.

In the third column,
dx for the 3rd column = 2 × 0 = 0

In the fourth column,


dx for the 4th column = (1 x 3) + (1 x 2) + (2 x 1) + (1 x – 1) = 6.

In the fifth column,


dx for the 5th column = (2 x 2) + (1 x 1) + (1 X 0) = 5.

Step 6:
Now, calculate dx.dy for each row of distribution-X by multiplying the dx entry of each row by the dy entry of each row. Then calculate dx.dy for each column of distribution-Y by multiplying the dy entry of each column by the dx entry of each column.

Step 7:
Now, take the algebraic sum of the values of the columns fdx, fdx², dy and dx.dy (for distribution-X). Take the algebraic sum of the values of the rows fdy, fdy², dx and dx.dy (for distribution-Y).

Step 8:
∑dx.dy of X-distribution = ∑dx.dy of Y-distribution
∑fdx = total of dx row (i.e. ∑dx)
∑fdy = total of dy column (i.e. ∑dy)

Step 9:
The values of the symbols as found are:
∑fdx = 13, ∑fd²x = 39
∑fdy = 22, ∑fd²y = 60
∑dx.dy = 29 and N = 20.

In order to compute the coefficient of correlation in a correlation table, the following formula can be applied:
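r = (∑dx.dy/N – CxCy) / (σxσy) … (31)

where Cx = ∑fdx/N, Cy = ∑fdy/N, σx = √(∑fd²x/N – Cx²) and σy = √(∑fd²y/N – Cy²).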

We may mark that in the denominator of formula (31) we apply the formulas for σx and σy with the exception that no i's appear. We may note here that Cx, Cy, σx and σy are all expressed in units of class intervals (i.e., in units of i). Thus, while computing σx and σy, no i's are used. This is desirable because all the product deviations, i.e., the ∑dx.dy's, are in interval units.

Thus, we compute:
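Substituting the values found in Step 9 in formula (31):

Cx = ∑fdx/N = 13/20 = 0.65 and Cy = ∑fdy/N = 22/20 = 1.10

σx = √(∑fd²x/N – Cx²) = √(39/20 – 0.4225) = √1.5275 = 1.24 (approx.)
σy = √(∑fd²y/N – Cy²) = √(60/20 – 1.21) = √1.79 = 1.34 (approx.)

r = (29/20 – 0.65 × 1.10) / (1.24 × 1.34) = (1.45 – 0.715) / 1.66 = 0.735 / 1.66 = 0.44 (approx.)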

Interpretation of the Coefficient of Correlation:


Mere computation of a correlation does not have any significance until and unless we determine how large the coefficient must be in order to be significant, what the correlation tells us about the data, and what we mean by the obtained value of the coefficient of correlation.

Misinterpretation of the Coefficient of Correlation:


Sometimes we misinterpret the value of the coefficient of correlation and establish a cause and effect relationship, i.e. we take one variable to be causing the variation in the other variable. Actually, we cannot interpret it in this way unless we have a sound logical basis.

The correlation coefficient gives us a quantitative determination of the degree of relationship between two variables X and Y, not information as to the nature of the association between the two variables. Causation implies an invariable sequence (A always leads to B), whereas correlation is simply a measure of mutual association between two variables.

For example, there may be a high correlation between maladjustment and anxiety. But on the basis of this high correlation we cannot say that maladjustment causes anxiety; it may be that high anxiety is the cause of maladjustment. This shows that maladjustment and anxiety are mutually associated variables. Consider another example.

There is a high correlation between aptitude in a subject at school and achievement in that subject at the end of the school examinations. Does this reflect a causal relationship? It may or may not.

Aptitude in the study of a subject certainly causes variation in achievement in that subject, but a student's high achievement in the subject is not the result of high aptitude alone; it may be due to other variables as well.

Thus, interpreting the size of the correlation coefficient in terms of cause and effect is appropriate if, and only if, the variables under investigation provide a logical basis for such an interpretation.

Factors influencing the size of the Correlation Coefficient:

We should also be aware of the following factors which influence the size of the
coefficient of correlation and can lead to misinterpretation:

1. The size of "r" is very much dependent upon the variability of measured values in the correlated sample. The greater the variability, the higher will be the correlation, everything else being equal.
2. The size of "r" is altered when an investigator selects an extreme group of subjects in order to compare these groups with respect to certain behaviour. The "r" obtained from the combined data of extreme groups will be larger than the "r" obtained from a random sample of the same group.
3. Adding or dropping extreme cases from the group can lead to a change in the size of "r". Adding an extreme case may increase the size of the correlation, while dropping extreme cases will lower the value of "r".

Uses of Product moment r:

Correlation is one of the most widely used analytic procedures in the field of Educational
and Psychological Measurement and Evaluation. It is useful in:

1. Describing the degree of correspondence (or relationship) between two variables.
2. Predicting one variable (the dependent variable) on the basis of the other (the independent variable).
3. Validating a test; e.g., a group intelligence test.
4. Determining the degree of objectivity of a test.

5. Educational and vocational guidance and decision-making.
6. Determining the reliability and validity of a test.
7. Determining the role of various correlates of a certain ability.
8. Factor analysis techniques for determining the factor loadings of the underlying variables in human abilities.

Assumptions of Product moment r:

1. Normal distribution:
The variables from which we want to calculate the correlation should be normally distributed. This assumption may be met through random sampling.

2. Linearity:
The relationship between the two variables should be linear, i.e. capable of being described by a straight line; the product-moment correlation measures only such linear relationship.

3. Continuous series:
The variables should be measured on a continuous scale.

4. Homoscedasticity:
The data must satisfy the condition of homoscedasticity (equal variability).

3. Spearman’s Rank Correlation Coefficient:


There are some situations in Education and Psychology where the objects or individuals may
be ranked and arranged in order of merit or proficiency on two variables and when these 2 sets
of ranks covary or have agreement between them, we measure the degrees of relationship by
rank correlation.

Again, there are problems in which the relationship among the measurements made is non-
linear, and cannot be described by the product-moment r. For example, the evaluation of a
group of students on the basis of leadership ability, the ordering of women in a beauty contest,
students ranked in order of preference or the pictures may be ranked according to their
aesthetic values. Employees may be rank-ordered by supervisors on job performance.

School children may be ranked by teachers on social adjustment. In such cases objects or
individuals may be ranked and arranged in order of merit or proficiency on two variables.
Spearman has developed a formula called Rank Correlation Coefficient to measure the extent or
degree of correlation between 2 sets of ranks.

This coefficient of correlation is denoted by Greek letter ρ (called Rho) and is given as:
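ρ = 1 – (6∑D²) / {N(N² – 1)}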

Where, ρ = rho = Spearman’s Rank Correlation Coefficient

D = Difference between paired ranks (in each case)
N = Total number of items/individuals ranked.

Characteristics of Rho (ρ):

1. In the Rank Correlation Coefficient the observations or measurements of the bivariate variable are based on the ordinal scale, in the form of ranks.
2. The size of the coefficient is directly affected by the size of the rank differences.
(a) If the ranks are the same for both tests, each rank difference will be zero and ultimately ∑D² will be zero. This means that the correlation is perfect, i.e. 1.00.
(b) If the rank differences are very large and the fraction is greater than one, the correlation will be negative.

Assumptions of Rho (ρ):


i. It is appropriate when N is small or the data are badly skewed.
ii. Rank methods are free, or independent, of certain characteristics of the population distribution.
iii. In many situations Ranking methods are used, where quantitative measurements are
not available.
iv. Though quantitative measurements are available, ranks are substituted to reduce
arithmetical labour.
v. Such tests are described as non-parametric.
vi. In such cases the data are comprised of sets of ordinal numbers, 1st, 2nd, 3rd….Nth.
These are replaced by the cardinal numbers 1, 2, 3,………, N for purposes of
calculation. The substitution of cardinal numbers for ordinal numbers always assumes
equality of intervals.

1. Calculating ρ from Test Scores:

Example 1: The following data give the scores of 5 students in Mathematics and General
Science respectively: Compute the correlation between the two series of test scores by Rank
Difference Method.

The value of coefficient of correlation between scores in Mathematics and General Science is
positive and moderate.

Steps of Calculation of Spearman’s Co-efficient of Correlation:

Step 1:
List the students' names or their serial numbers in column 1.

Step 2:
In columns 2 and 3 write the scores of each student or individual in Test I and Test II.

Step 3:
Take the set of scores in column 2 and assign a rank of 1 to the highest score, which is 9, a rank of 2 to the next highest score, which is 8, and so on, till the lowest score gets a rank equal to N, which is 5.

Step 4:
Take the second set of scores, in column 3, and assign rank 1 to the highest score. In the second set the highest score is 10; hence it obtains rank 1. The next highest score, that of student B, is 8; hence his rank is 2. The rank of student C is 3, the rank of E is 4, and the rank of D is 5.

Step 5:
Calculate the difference of ranks of each student (column 6).

Step 6:
Check the sum of the differences recorded in column 6. It is always zero.

Step 7:
Each difference of ranks in column 6 is squared and recorded in column 7. Get the sum ∑D².

Step 8:
Put the values of N and ∑D² in the formula of Spearman's co-efficient of correlation.
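Where software is available, the same steps can be traced in a few lines of Python. The following is a minimal sketch with illustrative scores (the worked table itself has not survived in this copy of the text), valid only when no scores are tied:

import math  # not strictly needed here; shown for clarity of intent

def ranks(scores):
    # Rank 1 goes to the highest score, rank N to the lowest (no ties assumed).
    ordered = sorted(scores, reverse=True)
    return [ordered.index(s) + 1 for s in scores]

def spearman_rho(x, y):
    # rho = 1 - 6*sum(D^2) / (N*(N^2 - 1)), D = difference of paired ranks.
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    n = len(x)
    return 1 - (6 * d_squared) / (n * (n ** 2 - 1))

maths = [9, 8, 6, 5, 4]       # illustrative Test I scores
science = [10, 8, 7, 4, 6]    # illustrative Test II scores
print(spearman_rho(maths, science))   # 0.9 for these illustrative scores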

2. Calculating from Ranked Data:

Example 2:
In a speech contest Prof. Mehrotra and Prof. Shukla, judged 10 pupils. Their judgements were
in ranks, which are presented below. Determine the extent to which their judgements were in
agreement.

The value of co-efficient of correlation is + .83. This shows a high degree of agreement between
the two judges.

3. Calculating ρ (Rho) for tied Ranks:

Example 3:

The following data give the scores of 10 students on two trials of a test, with a gap of 2 weeks between Trial I and Trial II.

Compute the correlation between the scores of two trials by rank difference method:

The correlation between Trial I and II is positive and very high. Look carefully at the scores
obtained by the 10 students on Trial I and II of the test.

Do you find any special feature in the scores obtained by the 10 students? Probably, your
answer will be “yes”.

In the above table, in columns 2 and 3, you will find that more than one student is getting the same score. In column 2, students A and G get the same score, viz. 10. In column 3, students A and B, C and F, and G and J also get the same scores, which are 16, 24 and 14 respectively.

Definitely these pairs will have the same ranks; known as Tied Ranks. The procedure of
assigning the ranks to the repeated scores is somewhat different from the non-repeated scores.

Look at column 4. Students A and G have the same score of 10 each and they occupy the 6th and 7th ranks in the group. Instead of assigning the 6th and 7th ranks, the average of the two ranks, i.e. 6.5 ((6 + 7)/2 = 13/2), has been assigned to each of them.

The same procedure has been followed in respect of the scores on Trial II. In this case, ties occur at three places. Students C and F have the same score and hence obtain the average rank of (1 + 2)/2 = 1.5. Students A and B have rank positions 5 and 6; hence each is assigned rank 5.5 ((5 + 6)/2). Similarly, students G and J have each been assigned rank 7.5 ((7 + 8)/2).

If the values are repeated more than twice, the same procedure can be followed to assign
the ranks:

For example:
if three students get a score of 10 at the 5th, 6th and 7th ranks, each one of them will be assigned a rank of (5 + 6 + 7)/3 = 6.

The rest of the steps of procedure followed for calculation of ρ (rho) are the same as explained
earlier.
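The average-rank rule is easy to mechanise. Below is a minimal Python sketch (the helper name average_ranks and the sample scores are illustrative assumptions); the averaged ranks it produces are then used in the same ρ formula as before:

def average_ranks(scores):
    # Rank 1 = highest score; tied scores share the average of the ranks
    # they jointly occupy (e.g. ranks 6 and 7 become 6.5 each).
    ordered = sorted(scores, reverse=True)
    return [ordered.index(s) + 1 + (ordered.count(s) - 1) / 2 for s in scores]

print(average_ranks([10, 12, 10, 15]))   # [3.5, 2.0, 3.5, 1.0]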

Interpretation:
The value of ρ can also be interpreted in the same way as Karl Pearson’s Coefficient of
Correlation. It varies between -1 and + 1. The value + 1 stands for a perfect positive agreement
or relationship between two sets of ranks while ρ = – 1 implies a perfect negative relationship.
In case of no relationship or agreement between ranks, the value of ρ = 0.

Advantages of Rank Difference Method:


1. Spearman's Rank Order Coefficient of Correlation is quicker and easier to compute than the (r) computed by Pearson's Product Moment Method.
2. It is an acceptable method if data are available only in ordinal form, or if the number of paired observations is more than 5 and not greater than 30, with at most a few ties in ranks.
3. It is quite easy to interpret ρ.

Limitations:
1. When interval data are converted into rank-ordered data, the information about the size of the score differences is lost; e.g. in Table 5.10, if D's score in Trial II were anywhere from 18 up to 21, his rank would remain 4.
2. If the number of cases is large, assigning ranks to them becomes a tedious job.

Multiple Regression

Multiple regression is an extension of simple linear regression. It is used when we want to


predict the value of a variable based on the value of two or more other variables. The variable
we want to predict is called the dependent variable (or sometimes, the outcome, target or
criterion variable). The variables we are using to predict the value of the dependent variable
are called the independent variables (or sometimes, the predictor, explanatory or regressor
variables).

For example, you could use multiple regression to understand whether exam performance can
be predicted based on revision time, test anxiety, lecture attendance and gender. Alternatively,
you could use multiple regression to understand whether daily cigarette consumption can be
predicted based on smoking duration, age when started smoking, smoker type, income and
gender.

Multiple regression also allows you to determine the overall fit (variance explained) of the
model and the relative contribution of each of the predictors to the total variance explained.
For example, you might want to know how much of the variation in exam performance can be
explained by revision time, test anxiety, lecture attendance and gender "as a whole", but also
the "relative contribution" of each independent variable in explaining the variance.

In the pairs of observations, if there is a cause and effect relationship between the variables X
and Y, then the average relationship between these two variables is called regression, which
means “stepping back” or “return to the average”. The linear relationship giving the best mean
value of a variable corresponding to the other variable is called a regression line or line of the
best fit. The regression of X on Y is different from the regression of Y on X. Thus, there are
two equations of regression and the two regression lines are given as follows:

Regression of Y on X: Y – Ȳ = byx (X – X̄)

Regression of X on Y: X – X̄ = bxy (Y – Ȳ)

where X̄, Ȳ are the means of X and Y respectively.

Result: Let σx, σy denote the standard deviations of x and y respectively. We have the following result:
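byx = r (σy/σx) and bxy = r (σx/σy), so that byx × bxy = r².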

The coefficient of correlation r between X and Y is the square root of the product of the b values in the two regression equations. We can find r in this way also.

Application

The method of regression is very much useful for business forecasting.

Principle of Least Squares

Let x, y be two variables under consideration. Out of them, let x be an independent variable
and let y be a dependent variable, depending on x. We desire to build a functional relationship
between them. For this purpose, the first and foremost requirement is that x, y have a high
degree of correlation. If the correlation coefficient between x and y is moderate or less, we shall
not go ahead with the task of fitting a functional relationship between them.

Suppose there is a high degree of correlation (positive or negative) between x and y. Suppose it
is required to build a linear relationship between them i.e., we want a regression of y on x.

Geometrically speaking, if we plot the corresponding values of x and y in a 2-dimensional plane


and join such points, we shall obtain a straight line. However, we can hardly expect all the pairs (x, y) to lie on a straight line. We can consider several straight lines which are, to some extent, near all the points (x, y). Consider one such line. An observation (x1, y1) may be either above or below the line under consideration. Draw the vertical line through this point; it will meet the straight line at the point (x1, y1e). Here the theoretical (or expected) value of the variable is y1e while the observed value is y1. When there is a difference between the expected and observed values, there appears an error. This error is E1 = y1 – y1e. It is positive if (x1, y1) is a point above the line and negative if (x1, y1) is a point below the line. For the n pairs of observations, we have the following n quantities of error:

E1 = y1 – y1e,

E2 = y2 – y2e,

…,

En = yn – yne.

Some of these quantities are positive while the remaining ones are negative. However, the
squares of all these quantities are positive.

i.e., E1² = (y1 – y1e)² ≥ 0, E2² = (y2 – y2e)² ≥ 0, …, En² = (yn – yne)² ≥ 0.

Hence the sum of squares of errors (SSE) = E1² + E2² + … + En²
= (y1 – y1e)² + (y2 – y2e)² + … + (yn – yne)² ≥ 0.

Among all those straight lines which are somewhat near to the given observations (x1, y1), (x2, y2), …, (xn, yn), we consider that straight line as the ideal one for which the SSE is the least. Since the ideal straight line giving the regression of y on x is based on this concept, we call this principle the Principle of Least Squares.

Normal equations
Suppose we have to fit a straight line to the n pairs of observations (x1, y1), (x2, y2), …, (xn , yn).
Suppose the equation of straight line finally comes as

Y = a + b X … (1)

where a, b are constants to be determined. Mathematically speaking, when we are required to find the equation of a straight line, two distinct points on the line are sufficient. However, a different approach is followed here. We want to include all the observations in our attempt to build a straight line. Then all the n observed points (x, y) are required to satisfy relation (1). Consider the summation of all such terms. We get

∑y = ∑(a + bx) = ∑(a·1 + bx) = a(∑1) + b(∑x), i.e.,

∑y = an + b(∑x) … (2)

To find the two quantities a and b, we require two equations. We have obtained one equation, i.e., (2). We need one more equation. For this purpose, multiply both sides of (1) by x. We obtain

xy = ax + bx²

Consider the summation of all such terms. We get

∑xy = ∑(ax + bx²) = a(∑x) + b(∑x²), i.e.,

∑xy = a(∑x) + b(∑x²) … (3)

Equations (2) and (3) are referred to as the normal equations associated with the regression of y on x. Solving these two equations, we obtain
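a = {(∑x²)(∑y) – (∑x)(∑xy)} / {n(∑x²) – (∑x)²}

b = {n(∑xy) – (∑x)(∑y)} / {n(∑x²) – (∑x)²}

These are the same expressions applied in the worked problems below.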

Note: For calculating the coefficient of correlation, we require ∑X, ∑Y, ∑XY, ∑X², ∑Y². For calculating the regression of y on x, we require ∑X, ∑Y, ∑XY, ∑X². Thus, the tabular columns are the same in both cases, with the difference that ∑Y² is also required for the coefficient of correlation. Next, if we consider the regression line of x on y, we get the equation X = a + bY. The expressions for the coefficients can be got by interchanging the roles of X and Y in the previous discussion. Thus, we obtain
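a = {(∑y²)(∑x) – (∑y)(∑xy)} / {n(∑y²) – (∑y)²}

b = {n(∑xy) – (∑x)(∑y)} / {n(∑y²) – (∑y)²}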

Problem: Consider the following data on sales and profit.

X (Sales) 5 6 7 8 9 10 11
Y (Profit) 2 4 5 5 3 8 7

Determine the regression of profit on sales.


Solution: We have N = 7. Take X = Sales, Y = Profit.
Calculate ∑X, ∑Y, ∑XY, ∑X² as follows:

X Y XY X²

5 2 10 25

6 4 24 36

7 5 35 49

8 5 40 64

9 3 27 81

10 8 80 100

11 7 77 121

Total: 56 34 293 476

a = {(∑x²)(∑y) – (∑x)(∑xy)} / {n(∑x²) – (∑x)²}
= (476 × 34 – 56 × 293) / (7 × 476 – 56²)
= (16184 – 16408) / (3332 – 3136)
= – 224 / 196
= – 1.1429

b = {n(∑xy) – (∑x)(∑y)} / {n(∑x²) – (∑x)²}
= (7 × 293 – 56 × 34) / 196
= (2051 – 1904) / 196
= 147 / 196
= 0.75
The regression of Y on X is given by the equation

Y = a + b X, i.e., Y = – 1.14 + 0.75 X
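As an illustrative aid (not part of the original solution), the same least-squares coefficients can be cross-checked in a few lines of Python; NumPy's degree-1 polyfit performs exactly this fit:

import numpy as np

sales = np.array([5, 6, 7, 8, 9, 10, 11], dtype=float)
profit = np.array([2, 4, 5, 5, 3, 8, 7], dtype=float)

# Degree-1 polyfit returns the least-squares slope b and intercept a.
b, a = np.polyfit(sales, profit, 1)
print(f"Y = {a:.4f} + {b:.4f} X")   # Y = -1.1429 + 0.7500 X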

Problem: Consider the following data on occupancy rate and profit of a hotel.

Occupancy Rate 40 45 70 60 70 75 70 80 95 90

Profit 50 55 65 70 90 95 105 110 120 125

Determine the regressions of profit on occupancy rate and occupancy rate on profit.

Solution: We have N = 10. Take X = Occupancy Rate, Y = Profit.

Note that in the previous problem we wanted only one regression line, and so we did not take ∑Y². Now we require two regression lines. Therefore, calculate ∑X, ∑Y, ∑XY, ∑X², ∑Y².

X Y XY X² Y²
40 50 2000 1600 2500
45 55 2475 2025 3025
70 65 4550 4900 4225
60 70 4200 3600 4900
70 90 6300 4900 8100
75 95 7125 5625 9025
70 105 7350 4900 11025
80 110 8800 6400 12100
95 120 11400 9025 14400
90 125 11250 8100 15625
Total: 695 885 65450 51075 84925
The regression line of Y on X:

Y=a+bX

where a = {(∑x²)(∑y) – (∑x)(∑xy)} / {n(∑x²) – (∑x)²}

and b = {n(∑xy) – (∑x)(∑y)} / {n(∑x²) – (∑x)²}

We obtain: a = (51075 × 885 – 695 × 65450) / (10 × 51075 – 695²)
= (45201375 – 45487750) / (510750 – 483025)
= – 286375 / 27725
= – 10.329

b = (10 × 65450 – 695 × 885) / 27725
= (654500 – 615075) / 27725
= 39425 / 27725
= 1.422

So, the regression equation is Y = - 10.329 + 1.422 X

Next, if we consider the regression line of X on Y, we get the equation X = a + b Y, where

a = {(∑y²)(∑x) – (∑y)(∑xy)} / {n(∑y²) – (∑y)²}
b = {n(∑xy) – (∑x)(∑y)} / {n(∑y²) – (∑y)²}

a = (84925 × 695 – 885 × 65450) / (10 × 84925 – 885²)
= (59022875 – 57923250) / (849250 – 783225)
= 1099625 / 66025
= 16.655

b = (10 × 65450 – 695 × 885) / 66025
= (654500 – 615075) / 66025
= 39425 / 66025
= 0.597

So, the regression equation is X = 16.655 + 0.597 Y

Note:

For the data given in this problem, if we use the formula for r, we get

r = (10 × 65450 – 695 × 885) / {√(10 × 51075 – 695²) × √(10 × 84925 – 885²)}
= (654500 – 615075) / (√27725 × √66025)
= 39425 / (166.508 × 256.95)
= 39425 / 42784.23
= 0.9214

However, once we know the two b values, we can find the coefficient of correlation r between X and Y as the square root of the product of the two b values. Thus we obtain

r = √(1.422 × 0.597)
= √0.848934
= 0.9214

Note that this agrees with the above value of r.
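Again as an illustrative aid (not part of the original solution), both slopes and the identity r = √(byx × bxy) can be verified in Python:

import numpy as np

x = np.array([40, 45, 70, 60, 70, 75, 70, 80, 95, 90], dtype=float)      # occupancy rate
y = np.array([50, 55, 65, 70, 90, 95, 105, 110, 120, 125], dtype=float)  # profit

b_yx = np.polyfit(x, y, 1)[0]   # slope of Y on X, about 1.422
b_xy = np.polyfit(y, x, 1)[0]   # slope of X on Y, about 0.597

# The square root of the product of the two slopes reproduces Pearson's r.
print(round((b_yx * b_xy) ** 0.5, 4))     # ~0.9214
print(round(np.corrcoef(x, y)[0, 1], 4))  # ~0.9214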

Multiple linear regression attempts to model the relationship between two or more explanatory variables and a response variable by fitting a linear equation to observed data. Every value of the independent variable x is associated with a value of the dependent variable y. The population regression line for p explanatory variables x1, x2, …, xp is defined to be

μy = β0 + β1x1 + β2x2 + … + βpxp.

This line describes how the mean response μy changes with the explanatory variables. The observed values for y vary about their means μy and are assumed to have the same standard deviation σ. The fitted values b0, b1, …, bp estimate the parameters β0, β1, …, βp of the population regression line.

Since the observed values for y vary about their means μy, the multiple regression model includes a term for this variation. In words, the model is expressed as DATA = FIT + RESIDUAL, where the "FIT" term represents the expression β0 + β1x1 + β2x2 + … + βpxp. The "RESIDUAL" term represents the deviations of the observed values y from their means μy, which are normally distributed with mean 0 and variance σ². The notation for the model deviations is ε.

Formally, the model for multiple linear regression, given n observations, is

yi = β0 + β1xi1 + β2xi2 + … + βpxip + εi, for i = 1, 2, …, n.

In the least-squares model, the best-fitting line for the observed data is calculated by minimizing the sum of the squares of the vertical deviations from each data point to the line (if a point lies on the fitted line exactly, then its vertical deviation is 0). Because the deviations are first squared, then summed, there are no cancellations between positive and negative values. The least-squares estimates b0, b1, …, bp are usually computed by statistical software.

The values fitted by the equation b0 + b1xi1 + … + bpxip are denoted ŷi, and the residuals ei are equal to yi – ŷi, the difference between the observed and fitted values. The sum of the residuals is equal to zero.

The variance σ² may be estimated by s² = ∑ei² / (n – p – 1), also known as the mean-squared error (or MSE). The estimate of the standard error s is the square root of the MSE.
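A minimal Python sketch of these least-squares estimates, on small made-up data (the variable names and all numbers below are illustrative assumptions, not drawn from the text):

import numpy as np

# Made-up data: 50 observations, two explanatory variables.
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 + 1.5 * x1 - 0.8 * x2 + rng.normal(scale=0.5, size=n)

# Design matrix with a leading column of 1s for the intercept b0.
X = np.column_stack([np.ones(n), x1, x2])

# Least-squares estimates b0, b1, b2 (DATA = FIT + RESIDUAL).
b, *_ = np.linalg.lstsq(X, y, rcond=None)

residuals = y - X @ b                                  # e_i = y_i - yhat_i
s_squared = residuals @ residuals / (n - X.shape[1])   # s^2 = SSE / (n - p - 1)
print(b, s_squared)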

Hypothesis
Meaning of Hypotheses: Once the problem to be answered in the course of research is finally instituted, the researcher may, if feasible, proceed to formulate tentative solutions or answers to it. These proposed solutions or explanations are called hypotheses, which the researcher is obliged to test on the basis of facts already known or which can be made known.

If such answers are not formulated, even implicitly, the researcher cannot effectively go ahead
with the investigation of his problem because, in the absence of direction which hypotheses
typically provide, the researcher would not know what facts to look for and what relation or
order to search for amongst them.

The hypotheses guide the researcher through a bewildering jungle of facts to see and select only those that are relevant to the problem or difficulty he proposes to solve. Collection of facts merely for the sake of collecting them will yield no fruits.

To be fruitful, one should collect such facts as are for or against some point of view or
proposition. Such a point of view or proposition is the hypothesis. The task of the inquiry or
research is to test its accord with facts.

Lundberg aptly observes, “The only difference between gathering data without a hypothesis and gathering them with one is that in the latter case we deliberately recognize the limitations of our senses and attempt to reduce their fallibility by limiting our field of investigation so as to prevent a greater concentration of attention on particular aspects which past experience leads us to believe are irrelevant or insignificant for our purpose.”

Simply stated, a hypothesis helps us see and appreciate:

(1) The kind of data that need be collected in order to answer the research question and

(2) The way in which they should be organized most efficiently and meaningfully.

Webster’s New International Dictionary of English Language, 1956, defines the term
“hypothesis” as “proposition, condition or principle which is assumed, perhaps without belief, in
order to draw out its logical consequences and by this method to test its accord with facts
which are known or may be determined.”

Cohen and Nagel bring out the value of hypothesis thus:

“We cannot take a single step forward in any inquiry unless we begin with a suggested
explanation or solution of the difficulty which originated it. Such tentative explanations are
suggested to us by something in the subject-matter and by our previous knowledge. When they
are formulated as propositions, they are called hypotheses.”

Once the scientist knows what his question (problem) is, he can make a guess, or a number of
guesses as to its possible answers. According to Werkmeister, “The guesses he makes are the
hypotheses which either solve the problems or guide him in further investigation.”

It is clear now that a hypothesis is a provisional formulation, a tentative solution of the problem posed by the scientist. The scientist starts by assuming that the solution is true without, of course, personally believing in its truthfulness.

Based on this assumption, the scientist anticipates that certain logical consequences will be
observed on the plane of observable events or objects. Whether these anticipations or
expectations really materialize is the test of the hypothesis, its proof or disproof.

If the hypothesis is proved, the problem of which it was a tentative solution is answered. If it is
not proved, i.e., falsified owing to non-support of proof, alternative hypotheses may be
formulated by the researcher. An hypothesis thus stands somewhere at the midpoint of
research; from here, one can look back to the problem as also look forward to data.

The hypothesis may be stated in the form of a principle; that is, the tentative explanation or solution to the question “how?” or “why?” may be presented in the form of a principle that X varies with Y. If the inquiry establishes that an empirical referent of X varies with the empirical referent of Y in a concrete observable situation (i.e., the hypothesis is proved), then the question is answered.

Hypotheses, however, may take other forms, such as intelligent guesses, conditions,
propositions deduced from theories, observations and findings of other scholars etc.

Proceeding on the basis of hypotheses has been the slow and hard way of science. While some
scientific conclusions and premises seem to have arisen in the mind of the investigator as if by
flashes of insight, in a majority of cases the process of discovery has been a slower one.

“The scientific imagination devises a possible solution, a hypothesis and the investigator
proceeds to test it. He makes intellectual keys and then tries to see whether they fit the lock. If
the hypothesis does not fit, it is rejected and another is made. The scientific workshop is full of
discarded keys.”

Cohen and Nagel’s statement that one cannot take a single step forward in any inquiry without a hypothesis may well be a correct statement of the value of hypothesis in scientific investigation generally, but it hardly does justice to an important function of scientific research, i.e., the formulation of hypotheses.

Hypotheses are not given to us readymade. Of course, in fields with a highly developed theoretic structure it is reasonable to expect that most empirical studies will have at least some sharp hypotheses to be tested.

This is, however, hardly the case in the social sciences, where in many areas of the subject-matter there has not yet evolved a highly developed theoretic system which can afford fruitful bases for hypothesis-formulation.

As such, attempts to force research into this mould are either deceitful or stultifying and
hypotheses are likely to be no more than hunches as to where to look for sharper hypotheses in
which case the study may be described as an intelligent fishing trip.

As a result, in the social sciences at least, a considerable quantum of research endeavour is
directed understandably toward ‘making’ hypotheses rather than at testing them.

A very important type of research has as its goal, the formulation of significant hypotheses
relating to a particular problem. Hence, we will do well to bear in mind that research can begin
with well formulated hypotheses or it may come out with hypotheses as its end product.

Let us recapitulate the role of hypotheses for research in the words of Chaddock who
summarizes it thus:

“(A hypothesis) in the scientific sense is … an explanation held after careful canvass of known facts, in full knowledge of other explanations that have been offered and with a mind open to change of view, if the facts disclosed by the inquiry warrant a different explanation. A hypothesis as an explanation, however, is held with the definite purpose of including in the investigation all available and pertinent data either to prove or disprove the hypothesis…. (A hypothesis) gives point to the inquiry and, if founded on sufficient previous knowledge, guides the line of investigation. Without it much useless data may be collected in the hope that nothing essential will be omitted, or important data may be omitted which could have been easily included if the purpose of inquiry had been more clearly defined.”

A hypothesis is therefore held with the definite purpose of including in the investigation all available and pertinent data either to prove or disprove the hypothesis.

Types of Hypotheses: On the Basis of Abstraction

There are many kinds of hypotheses the social researcher has to work with. One type of hypothesis asserts that something is the case in a given instance; that a particular object, person or situation has a particular characteristic.

Another type of hypothesis deals with the frequency of occurrences or of association among variables; such a hypothesis may state that X is associated with Y in a certain proportion of cases, e.g., that urbanism tends to be accompanied by mental disease, or that something is greater or lesser than something else in a specific setting.

Yet another type of hypothesis asserts that a particular characteristic is one of the factors which determine another characteristic, i.e., X is the producer of Y (the product). Hypotheses of this type are known as causal hypotheses.

Hypotheses can be classified in a variety of ways. But classification of hypotheses on the basis of
their levels of abstraction is regarded as especially fruitful. Goode and Hatt have identified
three differential levels of abstraction reached by hypotheses. We shall here be starting from
the lowest level of abstraction and go over to the higher ones.

(a) At the lowest level of abstraction are the hypotheses which state the existence of certain empirical uniformities. Many types of such empirical uniformities are common in social research; for instance, it may be hypothesized with reference to India that in the cities men will get married between the ages of 22 and 24 years.

Or, hypotheses of this type may state that a certain behaviour pattern may be expected in a specified community. Thus, hypotheses of this type frequently seem to invite scientific verification of what are called “common sense propositions,” indeed without much justification.

It has often been said by way of a criticism of such hypotheses that these are not useful in as
much as they merely state what everyone seems to know already. Such an objection may
however be overruled by pointing out that what everyone knows is not often put in precise
terms nor is it adequately integrated into the framework of science.

Secondly, what everyone knows may well be mistaken. To put common sense ideas into
precisely defined concepts and subject the proposition to test is an important task of science.

This is particularly applicable to social sciences which are at present in their earlier stage of
development. Not only social science but all sciences have found such commonsense knowledge
a fruitful item of study. It was commonsense knowledge in the olden days that sun revolved
round the earth. But this and many other beliefs based on commonsense have been exploded by
patient, plodding, empirical checking of facts.

The monumental work, The American Soldier by Stouffer and associates was criticized in
certain quarters, for it was according to them mere elaboration of the obvious. But to this study
goes the credit of exploding some of the commonsense propositions and shocking many people
who had never thought that what was so obvious a commonsense could be totally wrong or
unfounded in fact.

(b) At a relatively higher level of abstraction are hypotheses concerned with complex ‘ideal
types.’ These hypotheses aim at testing whether logically derived relationship between
empirical uniformities obtain. This level of hypothesizing moves beyond the level of
anticipating a simple empirical uniformity by visualizing a complex referent in society.

Such hypotheses are indeed purposeful distortions of empirical exactness and owing to their
remoteness from empirical reality, these constructs are termed ‘ideal types.’ The function of
such hypotheses is to create tools and formulate problems for further research in complex areas
of investigation.

An example of one such hypothesis may be cited. Analyses of minority groups brought to light
empirical uniformities in the behaviour of members of a wide variety of minorities. It was
subsequently hypothesized that these uniformities pointed to an ‘ideal type’.

First called by H. A. Miller the ‘oppression psychosis,’ this ideal-typical construction was subsequently modified as the ‘marginal man’ by E. Stonequist and associates. Empirical evidence marshalled later substantiated the hypothesis, and so the concept of marginality (the marginal man) has very much come to stay as a theoretic construct in the social sciences, and as part of sociological theory.

(c) We now come to the class of hypotheses at the highest level of abstraction. This category of
hypotheses is concerned with the relation obtaining amongst analytic variables. Such
hypotheses are statements about, how one property affects other, e.g., a statement of
relationship between education and social mobility or between wealth and fertility.

It is easy to see that this level of hypothesizing is not only more abstract compared to others; it
is also the most sophisticated and vastly flexible mode of formulation.

This does not mean, however, that this type of hypotheses is ‘superior’ or ‘better’ than the other
types. Each type of hypotheses has its own importance depending in turn upon the nature of
investigation and the level of development the subject has achieved.

The sophisticated hypotheses of analytical variables owe much of their existence to the building-blocks contributed by the hypotheses at the lower orders of abstraction.

Sources of Hypotheses:

Hypotheses may be developed from a variety of sources. We examine here, some of the major
ones.

(1) The history of sciences provides an eloquent testimony to the fact that personal and
idiosyncratic experiences of the scientist contribute a great deal to type and form of questions
he may ask, as also to the kinds of tentative answers to these questions (hypotheses) that he
might provide. Some scientists may perceive an interesting pattern in what may merely seem a jumble of facts to the common man.

The history of science is full of instances of discoveries made just because the ‘right’ person
happened to make the ‘right’ observation owing to his characteristic life-history and exposure
to a unique mosaic of events. Personal life-histories are a factor in determining the kinds of a
person’s perception and conception and this factor may in turn direct him to certain hypotheses
quite readily.

An illustration of such individual perspectives in social sciences may be seen in the work of
Thorstein Veblen whom Merton describes as a sociologist with a keen eye for the unusual and
paradoxical.

A product of an isolated Norwegian community, Veblen lived at a time when the capitalistic system was barely subjected to any criticism. His own community background was replete with deprivational experiences attributable to the capitalist system.

Veblen being an outsider, was able to look at the capitalist economic system more objectively
and with dispassionate detachment. Veblen was thus strategically positioned to attack the
fundamental concepts and postulates of classical economics.

He was an alien who could bring a different experience to bear upon the economic world.
Consequently, he made penetrating analyses of society and economy which have ever since
profoundly influenced social science.

(2) Analogies are often a fountainhead of valuable hypotheses. Students of sociology and
political science in the course of their studies would have come across analogies wherein society
and state are compared to a biological organism, the natural law to the social law,
thermodynamics to social dynamics, etc. such analogies, notwithstanding the fact that analogies
as a class suffer from serious limitations, do provide certain fruitful insight which formulated as
hypotheses stimulate and guide inquiries.

One of the recent orientations to hypothesis-formulation is provided by cybernetics; the communication models now so well entrenched in the social sciences testify to the importance of analogies as a source of fruitful hypotheses. The hypothesis that similar human types or activities may be found occupying the same territory was derived from plant ecology.

When the hypothesis was borne out by observations in society, the concept of segregation as it
is called in plant ecology was admitted into sociology. It has now become an important idea in
sociological theory. Such examples may be multiplied.

In sum, analogy may be very suggestive but care needs to be taken not to accept models from
other disciplines without a careful scrutiny of the concepts in terms of their applicability to the
new frame of reference in which they are proposed to be deployed.

(3) Hypotheses may rest also on the findings of other studies. The researcher on the basis of the
findings of other studies may hypothesize that similar relationship between specified variables
will hold good in the present study too. This is a common way of researchers who design their
study with a view of replicating another study conducted in a different concrete context or
setting.

As noted earlier, many a study in social science is exploratory in character, i.e., it starts without explicit hypotheses; the findings of such studies may then be formulated as hypotheses for more structured investigations directed at testing them.

(4) An hypothesis may stem from a body of theory which may afford by way of logical
deduction, the prediction that if certain conditions are present, certain results will follow.
Theory represents what is known; logical deductions from this constitute the hypotheses which
must be true if the theory was true.

Dubin aptly remarks, “Hypothesis is the feature of the theoretical model closest to the ‘things observable’ that the theory is trying to model.” Merton illustrates this function of theory with his customary felicity. Basing his deductions on Durkheim’s theoretic orientation, Merton shows how hypotheses may be derived as deductions from a theoretic system.

(1) Social cohesion provides psychic support to group members subjected to acute stresses
and anxieties.
(2) Suicide rates are functions of unrelieved anxieties to which persons are subjected.
(3) Catholics have greater social cohesion than Protestants.
(4) Therefore, lower suicide rates should be expected among Catholics than among Protestants.

If theories purport to model the empirical world, then there must be a linkage between the two.
This linkage is to be found in the hypotheses that mirror the propositions of the theoretical
model.

It may thus appear that the points of departure vis-a-vis hypotheses-construction are in two
opposite directions:

a. Conclusions based on concrete or empirical observations lead through the process of


induction to more abstract hypotheses and
b. The theoretical model through the process of logical deduction affords more concrete
hypotheses.

It may be well to bear in mind, however, that although these two approaches to hypotheses
formulation seem diametrically opposed to each other, the two points of departure, i.e.,
empirical, observations and the theoretical structure, represent the poles of a continuum and
hypotheses lie somewhere in the middle of this continuum.

Both these approaches to hypotheses-construction have proved their worth. The Chicago
School in American sociology represents a strong empirical orientation whereas the Mertonian
and Parsonian approach is typified by a stress on theoretic models as initial bases for
hypotheses-construction. Hence hypotheses can be deductively derived from theoretic models.

(5) It is worthy of note that value-orientation of the culture in which a science develops may
furnish many of its basic hypotheses.

That certain hypotheses and not others capture the attention of scientists or occur to them in
particular societies or culture may well be attributed to the cultural emphases. Goode and Hatt
contend that the American emphasis upon personal happiness had had considerable effect upon
social science in that country.

The phenomenon of personal happiness has been studied in great detail. In every branch of
social science, the problem of personal happiness came to occupy a position meriting central
focus. Happiness has been correlated with income, education, occupation, social class, and so on.
It is evident that the culture emphasis on happiness has been productive of a very wide range of
hypotheses for the American social science.

Folk-wisdom prevalent in a culture may also serve as source of hypotheses. The sum and
substance of the discussion is aptly reflected in Larrabee’s remark that the ideal source of
fruitful and relevant hypotheses is a fusion of two elements: past experience and imagination in
the disciplined mind of the scientist.

Hypothesis Formulation

In conducting research, the important consideration after the formulation of a research problem
is the construction of hypothesis. As you know, any scientific inquiry starts with the statement
of a solvable problem, when the problem has been stated, a tentative solution in the form of
testable proposition is offered by the researcher.

A hypothesis is often considered a tentative and testable statement of the possible relationship between two or more events/variables under investigation. According to McGuigan (1990), a hypothesis is ‘a testable statement of a potential relationship between two or more variables, advanced as a potential solution to the problem’. Kerlinger (1973) defined it thus: ‘a hypothesis is a conjectural statement of the relation between two or more variables’. In order to be useful in any study, the hypothesis needs to be stated in such a way that it might be subjected to empirical testing.

The researcher is responsible for suggesting or finding some way to check how the hypothesis stands against empirical data. When a hypothesis is formulated, the investigator must determine the usefulness of the formulated hypothesis. There are several criteria or characteristics of a good research hypothesis; a good hypothesis is one which meets such criteria to a large extent. Some of these characteristics are enumerated below:

1) Hypothesis should be conceptually clear;
2) Hypothesis must be testable;
3) Hypothesis should be related to the existing body of theory and fact;
4) Hypothesis should have logical unity and comprehensiveness;
5) Hypothesis should be capable of verification; and
6) Hypothesis should be operationalisable.

Science proceeds by observation, hypothesis formulation and hypothesis testing. After testing the hypothesis through various statistical tests, the researcher can accept or reject it. If the hypothesis is accepted, the researcher can replicate the results; if it is rejected, the researcher can refine or modify the hypothesis.

By stating a specific hypothesis, the researcher narrows the focus of the data collection effort
and is able to design a data collection procedure which is aimed at testing the plausibility of the
hypothesis as a possible statement of the relationship between the terms of the research
problem.

It is, therefore, always useful to have a clear idea and vision about the hypothesis the researcher intends to verify, as it will direct the investigation and greatly help in the interpretation of the results.

Possible Difficulties in Formulation of A Good Hypothesis

There are three major possible difficulties a researcher could face during the formulation of a hypothesis. First, the absence of knowledge of a theoretical framework is a major difficulty in formulating a good research hypothesis. Second, if detailed theoretical evidence is not available, or if the investigator is not aware of its availability, a research hypothesis cannot be formulated. Third, when the investigator is not aware of scientific research techniques, she/he will not be able to frame a good research hypothesis.

Despite these difficulties, the investigator attempts in her/his research to formulate a hypothesis. Usually the hypothesis is derived from the problem statement. The hypothesis should be formulated in a positive and substantive form before data are collected. In some cases an additional hypothesis may be formulated after the collection of data, but it should be tested on a new set of data and not on the old set which suggested it. The formulation of a hypothesis is a creative task and involves a lot of thinking, imagination and innovation.

Reichenbach (1938) has made a distinction between the two processes found commonly in any hypothesis-formulation task: one is the context of discovery and the other is the context of justification. The manner or process through which a scientist actually arrives at a hypothesis illustrates the context of discovery. In the development of a hypothesis, however, a scientist is concerned more with the context of justification. He never puts his ideas or thoughts as they nakedly occur in the formulation of a hypothesis; rather, he logically reconstructs his ideas or thoughts and draws some justifiable inferences from them. He never cares to relate how he actually arrived at a hypothesis. He does not say, for example, that while he was shaving this particular hypothesis occurred to him. He usually arrives at a hypothesis by the rational reconstruction of thoughts: when a scientist reconstructs his thoughts and communicates them in the form of a hypothesis to others, he uses the context of justification. When he arrives at a hypothesis, he extensively as well as intensively surveys a mass of data, abstracts them, tries to find out similarities among the abstracted data and finally makes a generalization or deduces a proposition in the form of a hypothesis.

Here an important distinction is to be made between formulating a hypothesis and choosing one. Although a researcher often becomes interested in a question about human behaviour for personal reasons, the ultimate value of a research study depends on the researcher bringing methodological criteria to bear on the selection of the hypothesis to be tested. In other words, good hypotheses are made, not born.

The hypothesis plays a key role in formulating and guiding any study. Hypotheses are generally derived from earlier research findings, existing theories, and personal observations and experience. For instance, suppose you are interested in knowing the effect of reward on learning. You have analysed past research and found that the two variables are positively related. You need to convert this idea into a testable statement. At this point you may develop the following hypothesis:

Those who are rewarded will require a smaller number of trials to learn the lesson than those who are not rewarded. A researcher should consider certain points while formulating a hypothesis:

i) The expected relationship or differences between the variables;
ii) The operational definition of the variables; and
iii) Hypotheses are formulated following the review of literature.

The literature leads a researcher to expect a certain relationship. A hypothesis is a statement that is assumed to be true for the purpose of testing its validity.

TYPES OF HYPOTHESES

As explained earlier, any assumption that you seek to validate through investigation is called a hypothesis. Hence, theoretically, there should be only one type of hypothesis on the basis of investigation, that is, the research hypothesis. However, because of the conventions in scientific enquiry and the wording used in the construction of hypotheses, hypotheses can be classified into several types, like universal hypotheses, existential hypotheses, conceptual hypotheses, etc. Broadly, there are two categories of hypothesis:

i. Null hypothesis
ii. Alternative hypothesis

Null Hypothesis: The null hypothesis is symbolised as H0. The null hypothesis is a useful tool in testing the significance of a difference. In its simplest form, this hypothesis asserts that there is no true difference between two population means, and that the difference found between sample means is accidental and unimportant, that is, arising out of fluctuations of sampling and by chance. Traditionally the null hypothesis states that there is zero relationship between the terms of the hypothesis. For example: (a) schizophrenics and normals do not differ with respect to digit-span memory; (b) there is no relationship between intelligence and height.

The null hypothesis is an important component of the decision-making methods of inferential statistics. If the difference between the sample means is found significant, the researcher can reject the null hypothesis; this indicates that the difference is statistically significant. Acceptance of the null hypothesis indicates that the difference is due to chance. The null hypothesis should always be a specific hypothesis, i.e. it should not state a value only approximately. The null hypothesis is often stated in the following way:

H0: μHV ≤ μLV

Thus, the null hypothesis is that the mean of the population of those children who have high vocabulary (Group 1) is less than or equal to the mean of those who lack the vocabulary (Group 2).

Alternative Hypothesis

The alternative hypothesis, symbolised as H1 or Ha, is the hypothesis that specifies those values that the researcher believes to hold true, and the researcher hopes that the sample data will lead to acceptance of this hypothesis as true. The alternative hypothesis represents all other possibilities and indicates the nature of the relationship.

The alternative hypothesis is stated as follows:

H1: μHV > μLV

The alternative hypothesis is that the mean of the population of those who have the vocabulary is greater than the mean of those who lack the vocabulary. In this example the alternative hypothesis is that the experimental population has a higher mean than the controls. This is called a directional hypothesis, because the researcher predicted that the high-vocabulary children would differ in one particular direction from the low-vocabulary children. Sometimes the researcher predicts only that the two groups will differ from each other, without knowing which group will be higher. This is a non-directional hypothesis. The null and alternative hypotheses in this case would be stated as follows:

Ho: μ1 = μ2
H1: μ1 ≠ μ2

Thus, the null hypothesis is that the mean of group 1 equals the mean of group 2, and the
alternative hypothesis is that the mean of group 1 does not equal the mean of group 2.
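To make the directional and non-directional forms concrete, here is a minimal sketch in Python. The scores are hypothetical, and scipy's ttest_ind with its alternative argument (available in scipy 1.6 and later) is assumed:

from scipy import stats

high_vocab = [98, 105, 110, 102, 99, 108, 112, 101]  # hypothetical Group 1 scores
low_vocab = [95, 97, 100, 92, 96, 99, 94, 98]        # hypothetical Group 2 scores

# Non-directional test: Ho: mu1 = mu2 against H1: mu1 != mu2
t_two, p_two = stats.ttest_ind(high_vocab, low_vocab, alternative='two-sided')

# Directional test: H1: mu1 > mu2 (the researcher predicts the direction)
t_one, p_one = stats.ttest_ind(high_vocab, low_vocab, alternative='greater')

print(f"two-sided: t = {t_two:.2f}, p = {p_two:.4f}")
print(f"one-sided: t = {t_one:.2f}, p = {p_one:.4f}")  # here, half the two-sided p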

Errors in Testing a Hypothesis

You have already learned that hypotheses are assumptions that may prove to be either correct
or incorrect. It is possible to arrive at an incorrect conclusion about a hypothesis for various
reasons, if:

• The sampling procedure adopted is faulty
• The data collection method is inaccurate
• The study design selected is faulty
• Inappropriate statistical methods are used
• The conclusions drawn are incorrect

Two common errors exist when testing a hypothesis.

 Type I error – Rejection of a null hypothesis when it is true.
 Type II error – Acceptance of a null hypothesis when it is false.

A superintendent in a medium-size school has a problem. The mathematics scores on
nationally standardized achievement tests such as the SAT and ACT of the students
attending her school are lower than the national average. The school board members, who
don't care whether the football or basketball teams win or not, are greatly concerned about
this deficiency. The superintendent fears that if the situation is not corrected, she will lose
her job before long.

As the superintendent was sitting in her office wondering what to do, a salesperson
approached with a briefcase and a sales pitch. The salesperson had heard about the problem
of the mathematics scores and was prepared to offer the superintendent a "deal she couldn't
refuse." The deal was teaching machines to teach mathematics, guaranteed to increase the
mathematics scores of the students. In addition, the machines never take breaks or demand
a pay increase.

The superintendent agreed that the machines might work, but was concerned about the
cost. The salesperson finally wrote some figures. Since there were about 1000 students in
the school and one machine was needed for every ten students, the school would need
about one hundred machines. At a cost of $10,000 per machine, the total cost to the school
would be about $1,000,000. As the superintendent picked herself up off the floor, she said
she would consider the offer, but didn't think the school board would go for such a big
expenditure without prior evidence that the machines actually worked. Besides, how did
she know that the company that manufactures the machines might not go bankrupt in the
next year, meaning the school would be stuck with a million dollars' worth of useless
electronic junk?

The salesperson was prepared, making an offer to lease ten machines for testing purposes
to the school for one year at a cost of $500 each. At the end of a year, the superintendent
would make a decision about the effectiveness of the machines. If they worked, she would
pitch them to the school board; if not, then she would return the machines with no further
obligation.

An experimental design was agreed upon. One hundred students would be randomly
selected from the student population and would be taught using the machines for one year.
At the end of the year, the mean mathematics scores of those students would be compared
to the mean scores of the students who did not use the machine. If the means were different
enough, the machines would be purchased. (The astute statistics student will recognize this
as a nested t-test.) In order to help decide how different the two means would have to be in
order to buy the machines, the superintendent did a theoretical analysis of the decision
process. Her analysis is presented in the following decision box.

Decision Boxes in Hypothesis Testing


"Real World"
The machines do NOT
DECISION The machines work.
work.
(4.) CORRECT
Buy the machines. (1.) Type I ERROR probability=
Decide the machines work. probability=a 1-b
"power"
Do not buy the machines.
(2.) CORRECT
Decide that the machines do not (3.) Type II ERROR
probability=1-a
work probability=b

The decision box has the decision that the superintendent must make in the left column.
For simplicity's sake, only two possibilities are permitted: either buy all the machines or
buy none of the machines. The other two column titles represent "the state of the real
world". The state of the real world can never be truly known, because if it were known
whether or not the machines worked, there would be no point in doing the experiment.

The four "Real World" cells represent various places one could be, depending upon the
state of the world and the decision made. Each cell will be discussed in turn.

1. Buying the machines when they do not work.

This is called a Type I error and in this case is very costly ($1,000,000). The probability of
this type of error is α, also called the significance level, and is directly controlled by the
experimenter. Before the experiment begins, the experimenter directly sets the value of α.
In this example the value of α would be set low, lower than the usual value of .05, perhaps
as low as .0001, which means that one time out of 10,000 the experimenter would buy the
machines when they didn't work.

2. Not buying the machines when they really didn't work.

This is a correct decision, made with probability 1- α when in fact the teaching machines
don't work and the machines are not purchased.

The relationship between the probabilities in these two cells can be illustrated using the
sampling distribution when the null hypothesis is true. The decision point is set by α, the
area in the tail or tails of the distribution. Setting α smaller moves the decision point
further into the tails of the distribution.

3. Not buying the machines when they really work.

This is called a Type II error and is made with probability β. The value of β is not directly
set by the experimenter, but is a function of a number of factors, including the size of α, the
size of the effect, the size of the sample, and the variance of the original distribution. The
value of β is inversely related to the value of α: the smaller the value of α, the larger the
value of β. It can now be seen that setting the value of α to a small value was not done
without cost, as the value of β is increased.

4. Buying the machines when they really work.

This is the cell where the experimenter would usually like to be. The probability of making
this correct decision is 1-β and is given the name "power." Because α was set low, β would
be high, and as a result 1-β would be low. Thus it would be unlikely that the
superintendent would buy the machines, even if they did work. The relationship between
the probability of a Type II error (β) and power (1-β) can be seen in the sampling
distribution when there actually is an effect.

The relationship between the sizes of α and β can be seen by superimposing the two
sampling distributions, one with α = .05 and the other with α = .01.

The size of the effect is the difference between the center points (µ) of the two
distributions. As the size of the effect is increased, the size of beta is decreased.

When the error variance of the scores is decreased and everything else remains constant,
the probability of a Type II error is decreased.

The interplay between alpha, the size of the effect, the size of the sample (N), and the size
of the error determines the value of beta. When any one of these values is changed, the
value of beta changes as well. These relationships can be used to verify:

• The size of beta decreases as the size of error decreases.
• The size of beta decreases as the size of the sample increases.
• The size of beta decreases as the size of alpha increases.
• The size of beta decreases as the size of the effect increases.

The size of the increase or decrease in beta is a complex function of changes in all of the
other values. For example, changes in the size of the sample may have either small or large
effects on beta depending upon the other values. If a large treatment effect and small error
are present in the experiment, then changes in the sample size are going to have a small
effect.
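These relationships can also be checked numerically. The sketch below is a minimal illustration using the normal approximation for a one-tailed test; the function and all of the figures are hypothetical choices, not values from the text:

from scipy.stats import norm

def beta(alpha, effect, sigma, n):
    # Type II error probability for a one-tailed z-test.
    z_crit = norm.ppf(1 - alpha)       # decision point set by alpha
    delta = effect * n ** 0.5 / sigma  # standardized effect size
    return norm.cdf(z_crit - delta)    # area short of the decision point under H1

print(f"baseline                 beta = {beta(0.05, 5, 15, 25):.3f}")
print(f"smaller error (sigma=10) beta = {beta(0.05, 5, 10, 25):.3f}")  # decreases
print(f"larger sample (n=100)    beta = {beta(0.05, 5, 15, 100):.3f}") # decreases
print(f"larger alpha (0.10)      beta = {beta(0.10, 5, 15, 25):.3f}")  # decreases
print(f"larger effect (10)       beta = {beta(0.05, 10, 15, 25):.3f}") # decreases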

A Second Chance

As might be expected, in the previous situation the superintendent chose not to purchase
the teaching machines, because she had essentially stacked the deck against deciding that
there were any effects. When she described the experiment and the result to the
salesperson the next year, the salesperson listened carefully and understood the reason
why α had been set so low.

The salesperson had a new offer to make, however. Because of an advance in microchip
technology, the entire teaching machine had been placed on a single integrated circuit. As a

result the price had dropped to $500 a machine. Now it would cost the superintendent a
total of $50,000 to purchase the machines, a sum that is quite reasonable.

The analysis of the probabilities of the two types of errors revealed that the cost of a Type
I error, buying the machines when they really don't work ($50,000), is small when
compared to the loss encountered in a Type II error, when the machines are not purchased
when in fact they do work, although it is difficult to put into dollars the cost of the
students not learning to their highest potential.

In any case, the superintendent would probably set the value of α to a fairly large value (.10
perhaps) relative to the standard value of .05. This would have the effect of decreasing the
value of β and increasing the power (1-β) of the experiment. Thus the decision to buy the
machines would be made more often if in fact the machines worked. The experiment was
repeated the next year under the same conditions as the previous year, except that the size
of α was set to .10.

The results of the significance test indicated that the means were significantly different,
the null hypothesis was rejected, and a decision about the reality of effects made. The
machines were purchased, the salesperson earned a commission, the math scores of the
students increased, and everyone lived happily ever after.

The Analysis Generalized to All Experiments

The analysis of the reality of the effects of the teaching machines may be generalized to
all significance tests. Rather than buying or not buying the machines, you reject or retain
the null hypothesis. In the "real world," rather than the machines working or not working,
the null hypothesis is true or false. The following decision box presents the choices
representing significance tests in general.

Decision Boxes in Hypothesis Testing


"Real World"
NULL TRUE ALTERNATIVE NULL FALSE
DECISION FALSE ALTERNATIVE TRUE
No Effects Real Effects
Reject Null Type I CORRECT
Accept Alternative ERROR probability=1-β
Decide there are real effects. probability=α "power"
Retain Null Type II
Retain Alternative CORRECT
ERROR
Decide that no effects were probability=1-α
discovered. probability=β

Testing Hypotheses

In business research and social science research, different approaches are used to study a variety
of issues. Such research may be formal or informal, but all research begins with a generalized
idea in the form of a hypothesis or a research question. In the beginning, the research effort may
be directed at an area of study, or it may take the form of a question about the relationship
between two or more variables. For example: do good working conditions improve employee
productivity? Or: how do working conditions influence employees' work?

Process of Hypothesis Testing:

Hypothesis testing is a systematic method used to evaluate the data collected and serves as an
aid in the process of decision making. The testing of hypotheses is conducted through several
steps, which are given below.

1. State the hypotheses of interest.
2. Determine the appropriate test statistic.
3. Specify the level of statistical significance.
4. Determine the decision rule for rejecting or not rejecting the null hypothesis.
5. Collect the data and perform the needed calculations.
6. Decide to reject or not to reject the null hypothesis.

In order to provide more detail on the above steps in the process of hypothesis testing, each
step will be explained here with a suitable example to make it easy to understand.

1. Stating the Hypotheses

The statistical analysis of any research study includes at least two hypotheses: one is the null
hypothesis and the other is the alternative hypothesis. The hypothesis being tested is referred
to as the null hypothesis and is designated H0. It is also referred to as the hypothesis of no
difference. It should include a statement which the researcher expects to disprove.

The alternative hypothesis presents the alternative to the null hypothesis and includes a
statement of inequality. The null hypothesis and the alternative hypothesis are complementary.
The null hypothesis is the statement that is provisionally assumed to be correct, and the
analysis is based on this null hypothesis. For example, the null hypothesis might state that the
average age for entering a management institute is 20 years: average age for institute entry =
20 years.

2. Determining the Appropriate Test Statistic

The appropriate test statistic to be used in statistical hypothesis testing is based on various
characteristics of the sample and the population of interest, including the sample size and
distribution.

The test statistic can assume many numerical values. As the value of the test statistic has a
significant effect on the decision, one must use the appropriate statistic in order to obtain
meaningful results. The formula to be used while testing a population mean is:

z = (x̅ − μ) / (σ/√n)

where z is the test statistic, x̅ is the mean of the sample, μ is the mean of the population, σ is
the standard deviation, and n is the sample size.
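As a minimal sketch, the statistic can be computed directly in Python; the sample figures below are hypothetical, chosen to echo the institute-entry-age example above (μ = 20):

import math

def z_statistic(sample_mean, pop_mean, pop_sd, n):
    # z = (sample mean - population mean) / (population sd / sqrt(n))
    return (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))

# Hypothetical figures: 36 entrants with mean age 21.5, population sd 4
print(round(z_statistic(21.5, 20, 4, 36), 2))  # 2.25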

3. The Significance Level

As already explained, the null hypothesis can be rejected, or we can fail to reject it. A null
hypothesis that is rejected may in reality be true or false, and a null hypothesis that fails to be
rejected may likewise in reality be true or false. The outcome that a researcher desires is to
reject a false null hypothesis or to fail to reject a true null hypothesis. However, there is always
the possibility of rejecting a true hypothesis or failing to reject a false hypothesis.

Type I and Type II Errors

Type I error: rejecting a null hypothesis that is true.

Type II error: failing to reject a false null hypothesis.

4. Decision Rule

Before collection and analysis of the data, it is necessary to decide under which conditions the
null hypothesis will be rejected or will fail to be rejected. The decision rule can be stated in
terms of the computed test statistic or in probabilistic terms; the same decision will be
applicable whichever method is selected.

5. Data Collection and Calculation Performance

In the research process, the method of data collection is decided at an early stage. Once the
research problem is decided, a decision about the type and sources of data should be taken
immediately. It must be made clear which type of data will be needed for the purpose of the
study and how the researcher plans to collect the required data.

This decision will provide the basis for processing and analysing the data. It is advisable to
make use of approved methods of research for collecting and analysing data.

6. Decision on Null Hypotheses

The decision regarding the null hypothesis is an important step in the process and follows from
the decision rule. Under the decision rule, one has to reject or fail to reject the null hypothesis.
If the null hypothesis is rejected, then the alternative hypothesis can be accepted. If one fails to
reject the null hypothesis, one can only suggest that the null hypothesis may be true.

7. Two-Tailed and One-Tailed Tests

In the testing of hypotheses, both of the above terms are quite important and must be clearly
understood. A two-tailed test rejects the null hypothesis if the sample mean is significantly
higher or lower than the hypothesized value of the population mean. Such a test is appropriate
when the null hypothesis specifies some particular value and the alternative hypothesis is any
value not equal to that specified value. A one-tailed test, by contrast, is appropriate when the
alternative hypothesis specifies a direction (greater than, or less than, the hypothesized value).

Procedure for Testing of Hypotheses:

Testing of hypotheses means deciding the validity of the hypotheses on the basis of the data
collected by the researcher. In the testing procedure, we have to decide whether the null
hypothesis is accepted or not accepted. This decision is reached through several steps, leading
to one of two courses of action, i.e., rejection or acceptance of the null hypothesis. The steps
involved in testing of hypotheses are given below.

1. Setting up of Hypotheses

This step consists of setting up the hypotheses; formal statements of the hypotheses are made.
In traditional practice, instead of one, two hypotheses are set, so that if one hypothesis is
rejected, the other is accepted. Hypotheses should be clearly stated with respect to the nature of
the research problem. These hypotheses are:

i. the null hypothesis, and
ii. the alternative hypothesis.

Acceptance or rejection of hypotheses is based on the sampling information. Any sample which
we draw from the population will vary from it; therefore, it is necessary to judge whether these
differences are statistically significant or insignificant. The formulation of hypotheses is an
important step which must be accomplished with due care, as per the requirements and object
of the research problem under consideration. This step should also specify whether a one-tailed
or a two-tailed test will be used.

2. Selecting Statistical Technique

In this stage we make a selection of the statistical technique which is going to be used. There
are various statistical tests which are used in testing of hypotheses. These tests are:

- the Z-test
- the t-test
- the F-test
- the χ2-test
It is the job of the researcher to make a proper selection of the test. The Z-test is used when
the hypothesis relates to a large sample (30 or more).

The t-test is used when the hypothesis relates to a small sample (less than 30).

The selection of the test will depend on various considerations, such as the variables involved,
the sample size, the type of data, and whether the samples are related or independent.

3. Selecting Level of Significance

This stage consists of selecting the desired level of significance. The researcher should specify
the level of significance, because testing of hypotheses is based on a pre-determined level of
significance. The rejection or retention of a hypothesis by the researcher is also based on the
significance level. The level of significance is generally expressed in percentage form, such as
5% or 1%. If a 5% level of significance is accepted by the researcher, it means he will be making
a wrong decision about 5% of the time; that is, if the hypothesis is rejected at the 5% level, he
runs the risk of rejecting a true hypothesis on 5 out of 100 occasions.

The following factors may affect the level of significance:

- the magnitude of the difference between the sample means
- the size of the sample
- the validity of measurement

4. Determining Sampling Distribution

The next step after deciding the significance level is to determine the appropriate sampling
distribution. The choice is generally between the normal distribution and the t-distribution.

5. Selecting Sample and Value

In this step a random sample is selected and the appropriate value of the test statistic is
computed from the sample data by utilizing the relevant distribution.

6. Performance Computation

In this step the calculations are performed. They include the test statistic and its standard
error. A hypothesis is tested against the following four possibilities: the hypothesis is

a. true, but the test leads to its rejection;
b. false, but the test leads to its acceptance;
c. true, and the test leads to its acceptance;
d. false, and the test leads to its rejection.

Out of the above four possibilities, (a) and (b) lead to wrong decisions: (a) leads to a Type I
error and (b) leads to a Type II error.

7. Statistical Decision

This is the step in which we draw the statistical decision involving the acceptance or rejection
of the hypothesis. This will depend on whether the calculated value of the test statistic falls in
the region of acceptance or in the region of rejection at the given significance level.

If a hypothesis is tested at the 5% level and the observed result has a probability of less than
5%, then we consider the difference between the hypothetical parameter and the sample
statistic to be significant.

Testing of Hypothesis Using Various Distribution Test:

A. The Parametric Tests:

The tests of significance used for hypothesis testing are of two types: parametric and non-
parametric. The parametric tests are more powerful, but they depend on the parameters or
characteristics of the population. They are based on the following assumptions:

1. The observations or values must be independent.
2. The population from which the sample is drawn on a random basis should be normally
distributed.
3. The populations should have equal variances.
4. The data should be measured at least at the interval level, so that arithmetic operations can
be used.

a) The Z – Test

Prof. R.A. Fisher developed the Z-test. It is based on the normal distribution and is widely used
for testing the significance of several statistics such as the mean, median, mode, coefficient of
correlation, and others. This test is used even when the binomial distribution or the
t-distribution is applicable, on the presumption that such a distribution tends to approximate
the normal distribution as the sample size (n) becomes larger.

b) The T – Test

The t-test was developed by W.S. Gosset around 1915. Since he published his findings under
the pen name 'Student', it is known as Student's t-test. It is suitable for testing the significance
of a sample mean, or for judging the significance of the difference between the means of two
samples, when the samples are fewer than 30 in number and the population variance is not
known. When two samples are related, the paired t-test is used. The t-test can also be used for
testing the significance of the coefficients of simple and partial correlation.

In determining whether the mean of a sample drawn from a normal population deviates
significantly from a stated value when the variance of the population is unknown, we calculate
the statistic

t = (x̅ − μ) / (s/√n), with n − 1 degrees of freedom,

where x̅ is the sample mean, μ is the stated value, s is the sample standard deviation, and n is
the sample size.

c) The F – Test

The F-test is based on the F-distribution (which is a distribution skewed to the right, and
which tends to become more symmetrical as the number of degrees of freedom in the numerator
and denominator increases).

The F-test is used to compare the variances of two independent samples. It is also used for
judging the significance of multiple correlation coefficients.

B. The Non-parametric Tests

The non-parametric tests are population-free tests, as they are not based on the characteristics
of the population. They do not assume a normally distributed population or equal variances,
and they are easy to understand and to use.

The important non-parametric tests are:

- the chi-square test
- the median test
- the Mann-Whitney U test
- the sign test
- the Wilcoxon matched-pairs test
- the Kolmogorov–Smirnov test

Parametric and Non Parametric Test – Comparison

Nonparametric tests don’t require that your data follow the normal distribution. They’re also
known as distribution-free tests and can provide benefits in certain situations. Typically, people
who perform statistical hypothesis tests are more comfortable with parametric tests than
nonparametric tests.

• Parametric analyses to assess group means
• Nonparametric analyses to assess group medians

Related Pairs of Parametric and Nonparametric Tests: Nonparametric tests are a shadow world
of parametric tests.

Parametric tests of means                               Nonparametric tests of medians
1-sample t-test                                         1-sample Sign, 1-sample Wilcoxon
2-sample t-test                                         Mann-Whitney test
One-Way ANOVA                                           Kruskal-Wallis, Mood's median test
Factorial DOE with a factor and a blocking variable     Friedman test
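For readers working in Python, the pairs in this table map onto scipy.stats functions roughly as sketched below; the data are hypothetical and the mapping is indicative rather than exhaustive:

from scipy import stats

a = [12.1, 14.3, 11.8, 15.2, 13.7, 12.9]  # hypothetical samples
b = [10.4, 11.9, 12.6, 9.8, 11.1, 10.7]
c = [13.5, 14.8, 12.2, 15.0, 13.1, 14.2]

# 1-sample t-test vs 1-sample Wilcoxon (signed ranks of deviations from 12)
print(stats.ttest_1samp(a, popmean=12))
print(stats.wilcoxon([x - 12 for x in a]))

# 2-sample t-test vs Mann-Whitney test
print(stats.ttest_ind(a, b))
print(stats.mannwhitneyu(a, b))

# One-way ANOVA vs Kruskal-Wallis
print(stats.f_oneway(a, b, c))
print(stats.kruskal(a, b, c))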

Advantages of Parametric Tests:

Advantage 1: Parametric tests can provide trustworthy results with distributions that
are skewed and nonnormal

Parametric analyses can produce reliable results even when your continuous data are
nonnormally distributed. You just have to be sure that your sample size meets the
requirements for each analysis in the table below. Simulation studies have identified these
requirements.
Parametric analysis    Sample size requirements for nonnormal data
1-sample t-test        Greater than 20 observations
2-sample t-test        Each group should have more than 15 observations
One-Way ANOVA          For 2-9 groups, each group should have more than 15 observations;
                       for 10-12 groups, each group should have more than 20 observations

These parametric tests can be used with nonnormally distributed data thanks to the central
limit theorem.

Advantage 2: Parametric tests can provide trustworthy results when the groups have
different amounts of variability

It’s true that nonparametric tests don’t require data that are normally distributed. However,
nonparametric tests have the disadvantage of an additional requirement that can be very hard
to satisfy. The groups in a nonparametric analysis typically must all have the same variability
(dispersion). Nonparametric analyses might not provide accurate results when variability
differs between groups. Conversely, parametric analyses, like the 2-sample t-test or one-way
ANOVA, allow you to analyze groups that have unequal variances. In most statistical software,
it’s as easy as checking the correct box! You don’t have to worry about groups having different
amounts of variability when you use a parametric analysis.

Advantage 3: Parametric tests have greater statistical power

In most cases, parametric tests have more power. If an effect actually exists, a parametric
analysis is more likely to detect it.

Advantages of Nonparametric Tests

Advantage 1: Nonparametric tests assess the median which can be better for some study
areas

For some datasets, nonparametric analyses provide an advantage because they assess the
median rather than the mean. The mean is not always the better measure of central tendency

for a sample. Even though you can perform a valid parametric analysis on skewed data, that
doesn’t necessarily equate to being the better method. Let me explain using the distribution of
salaries.

Salaries tend to be a right-skewed distribution. The majority of wages cluster around the
median, which is the point where half are above and half are below. However, there is a long
tail that stretches into the higher salary ranges. This long tail pulls the mean far away from the
central median value. Consider two typical right-skewed salary distributions that have roughly
equal medians but different means.

In these distributions, if several very high-income individuals join the sample, the mean
increases by a significant amount despite the fact that incomes for most people don’t change.
They still cluster around the median. In this situation, parametric and nonparametric test
results can give you different results, and they both can be correct! For the two distributions, if
you draw a large random sample from each population, the difference between the means is
statistically significant. Despite this, the difference between the medians is not statistically
significant. Here’s how this works.

For skewed distributions, changes in the tail affect the mean substantially. Parametric tests can
detect this mean change. Conversely, the median is relatively unaffected, and a nonparametric
analysis can legitimately indicate that the median has not changed significantly.

You need to decide whether the mean or median is best for your study and which type of
difference is more important to detect.
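A small simulation illustrates the point; the salary figures are entirely hypothetical:

import numpy as np

rng = np.random.default_rng(0)
salaries = rng.lognormal(mean=10.8, sigma=0.5, size=10_000)  # right-skewed incomes

print(f"median = {np.median(salaries):,.0f}, mean = {salaries.mean():,.0f}")

# A handful of very high earners join the sample: the mean jumps,
# while the median barely moves.
with_outliers = np.append(salaries, [5_000_000] * 20)
print(f"median = {np.median(with_outliers):,.0f}, mean = {with_outliers.mean():,.0f}")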

Advantage 2: Nonparametric tests are valid when the sample size is small and the data
are potentially non-normal

Use a nonparametric test when your sample size isn’t large enough to satisfy the requirements
in the table above and you’re not sure that your data follow the normal distribution. With small
sample sizes, be aware that tests for normality can have insufficient power to produce useful
results.

This situation is difficult. Nonparametric analyses tend to have lower power at the outset, and a
small sample size only exacerbates that problem.

Advantage 3: Nonparametric tests can analyze ordinal data, ranked data, and outliers

Parametric tests can analyze only continuous data and the findings can be overly affected by
outliers. Conversely, nonparametric tests can also analyze ordinal and ranked data, and are not
tripped up by outliers.

Sometimes you can legitimately remove outliers from your dataset if they represent unusual
conditions. However, sometimes outliers are a genuine part of the distribution for a study area,
and you should not remove them.

You should verify the assumptions for nonparametric analyses, because the various tests can
analyze different types of data and have differing abilities to handle outliers. For example, if
your data use the ordinal Likert scale and you want to compare two groups, the Mann-Whitney
test may be appropriate.

T-Test

One Sample T Test Hypothesis

Null hypothesis (H0): The difference between population mean and the hypothesized value is
equal to zero

Alternative hypothesis (H1):

• The population mean is not equal to the hypothesized value (two-tailed)
• The population mean is greater than the hypothesized value (upper-tailed)
• The population mean is less than the hypothesized value (lower-tailed)

Assumptions of One Sample T Hypothesis Test

• Data are continuous and quantitative at the scale level (in other words, ratio or interval data)
• The sample should be randomly selected from the population
• Observations are independent of each other
• Data should follow the normal probability distribution
• There should be no extreme outliers in the dependent variable

When Would You Use a One Sample T Hypothesis Test?

The one sample t test is a type of parametric test because it assumes the data are randomly
sampled and approximately normally distributed. It tests whether the sample mean is
significantly different from a population mean when the standard deviation of the population is
unknown. Hence the t test is used when the population standard deviation is unknown and the
sample size is below 30; otherwise use the Z-test (for known variance).

Steps to Calculate One Sample T Hypothesis Test

• State the claim of the test and determine the null hypothesis and alternative hypothesis
• Determine the level of significance
• Calculate degrees of freedom
• Find the critical value
• Calculate the test statistic:

  t = (x̅ − μ0) / (s/√n)

  where
  o x̅ is the observed sample mean
  o μ0 is the population mean
  o s is the sample standard deviation
  o n is the number of observations in the sample

• Make a decision: the null hypothesis is rejected if the absolute value of the test statistic
exceeds the critical value
• Finally, interpret the decision in the context of the original claim.

Example of a One Sample T Hypothesis Test in a DMAIC Project

The one sample t test is mostly performed in the Analyze phase of DMAIC to check for a
significant difference between the population mean and the sample mean, while the paired
t-test can be performed in the Measure phase to review before-and-after process improvement
(see the example below for more details).

According to the American health association, the average blood pressure of a pregnant woman
is 120 mm Hg. Fifteen random samples were collected from pregnant women to check whether
the sample blood pressure differs from the accepted standard blood pressure.

• Null Hypothesis: No difference between sample data and population blood pressure (H0:
μ=120)
• Alternative Hypothesis: There is a difference between sample data and population blood
pressure (H1: μ≠120)

Significance level: α=0.05

Degrees of freedom:15-1= 14

Calculate the critical value

Two Tailed T Test

If the calculated t value is less than -2.145 or greater than 2.145, then reject the null
hypothesis.

Test statistics

• x̅ = 123
• μ0 = 120

The calculated t statistic is less than the critical value; hence we fail to reject the null
hypothesis (H0). So, there is no significant difference between the sample mean and the
population mean.
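A minimal sketch of this example in Python, assuming fifteen hypothetical readings with mean 123 (the text reports only the sample mean, not the individual values or s):

from scipy import stats

# Hypothetical readings with mean 123; the original sample values are not given.
bp = [113, 127, 137, 107, 119, 131, 123, 115, 133, 125, 121, 129, 117, 135, 113]

t, p = stats.ttest_1samp(bp, popmean=120)  # H0: mu = 120, two-sided
print(f"t = {t:.2f}, p = {p:.4f}")  # here |t| ≈ 1.29 < 2.145 (df = 14): fail to reject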

Two Sample T Hypothesis Test

What is a Two Sample T Hypothesis Test?

A two sample t test is used to analyze the difference between two independent population
means. The two-sample t-test is used when two small samples (n < 30) are taken from two
different populations and compared.

Assumptions of Two Sample T Hypothesis Test

• The samples should be randomly selected from the two populations
• Samples are independent of each other
• The variances of the two populations are equal
• Data should be continuous

When Would You Use a Two Sample T Hypothesis Test?

The two sample t test is most often used to compare two process means, when the data have
one nominal variable and one measurement variable. It is a hypothesis test of means. The two
sample t test is used to compare two population means, while analysis of variance (ANOVA) is
the best option if more than two group means are to be compared.

The two sample t test is performed when the two group samples are statistically independent of
each other, while the paired t-test is used to compare the means of two dependent or paired
groups.

Steps to Calculate Two Sample T Hypothesis Test

• State the claim of the test and determine the null hypothesis and alternative hypothesis
• Determine the level of significance
• Calculate degrees of freedom
• Find the critical value
• Calculate the test statistic:

  t = (x̅1 − x̅2) / (Sp √(1/n1 + 1/n2)), with Sp = √[((n1 − 1)s1² + (n2 − 1)s2²) / (n1 + n2 − 2)]

  where Sp is the pooled standard deviation

• Make a decision: the null hypothesis is rejected if the absolute value of the test statistic
exceeds the critical value
• Finally, interpret the decision in the context of the original claim.

Example of a Two Sample T Hypothesis Test in a DMAIC Project

The two sample t test is mostly performed in the Analyze phase of DMAIC to evaluate whether
the difference between two process means is really significant or due to random chance; it is
basically used to validate the root cause(s) or critical Xs (see the example below for more
detail).

An apple orchard owner wants to compare two farms to see if there is any weight difference in
the apples. From farm A, 15 randomly collected apples have an average weight of 86 gms and a
standard deviation of 7. From farm B, 10 collected apples have an average weight of 80 gms
and a standard deviation of 8. With a 95% confidence level, is there any difference between the
farms?

• Null Hypothesis (H0): the mean apple weight of farm A is equal to that of farm B
• Alternative Hypothesis (H1): the mean apple weight of farm A is not equal to that of farm B

Significance level: α = 0.05

Degrees of freedom: df = 10 + 15 − 2 = 23

Calculate the critical value

Two Tailed T Test

If the calculated t value is less than -2.069 or greater than 2.069, then reject the null
hypothesis.

Test Statistic

Sp = √[(14 × 7² + 9 × 8²) / 23] ≈ 7.41, so t = (86 − 80) / (7.41 × √(1/15 + 1/10)) ≈ 1.98.

The calculated t statistic (≈ 1.98) is less than the critical value (2.069); hence we fail to reject
the null hypothesis (H0). So, there is no significant difference between the mean weights of
apples in farm A and farm B.
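The same conclusion can be reached from the summary statistics alone; a minimal sketch using scipy's pooled (equal-variance) two-sample t-test from summary statistics:

from scipy import stats

t, p = stats.ttest_ind_from_stats(mean1=86, std1=7, nobs1=15,
                                  mean2=80, std2=8, nobs2=10,
                                  equal_var=True)
print(f"t = {t:.3f}, p = {p:.4f}")  # t ≈ 1.98 < 2.069, so H0 is not rejected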

Paired Difference t-test

Requirements: A set of paired observations from a normal population

This t‐test compares one set of measurements with a second set from the same sample. It is
often used to compare “before” and “after” scores in experiments to determine whether
significant change has occurred.

Hypothesis test

Formula:

t = (x̅d − Δ) / (sd/√n)

where x̅d is the mean of the change scores, Δ is the hypothesized difference (0 if testing for
equal means), sd is the sample standard deviation of the differences, and n is the sample size.
The number of degrees of freedom for the problem is n – 1.

A farmer decides to try out a new fertilizer on a test plot containing 10 stalks of corn. Before
applying the fertilizer, he measures the height of each stalk. Two weeks later, he measures the
stalks again, being careful to match each stalk's new height to its previous one. The stalks
would have grown an average of 6 inches during that time even without the fertilizer. Did the
fertilizer help? Use a significance level of 0.05.

Null hypothesis: H0: μ = 6

Alternative hypothesis: Ha: μ > 6

Subtract each stalk's “before” height from its “after” height to get the change score for each
stalk; then compute the mean and standard deviation of the change scores and insert these into
the formula.

The problem has n – 1, or 10 – 1 = 9 degrees of freedom. The test is one-tailed because you are
asking only whether the fertilizer increases growth, not reduces it. The critical value from
the t-table for t(.05, 9) is 1.833.

Because the computed t‐value of 2.098 is larger than 1.833, the null hypothesis can be rejected.
The test has provided evidence that the fertilizer caused the corn to grow more than if it had
not been fertilized. The amount of actual increase was not large (1.36 inches over normal
growth), but it was statistically significant.

Example: comparison of peak expiratory flow rate (PEFR) before and after a walk on a cold
winter's day for a random sample of 9 asthmatics. The data form two columns, one of PEFRs
before the walk and the other of PEFRs after the walk; each row represents the same subject:

Subject  Before  After
1        312     300
2        242     201
3        340     232
4        388     312
5        296     220
6        254     256
7        391     328
8        402     330
9        290     231

For differences between PEFR Before and PEFR After:


Mean of differences = 56.111111 (n = 9)
Standard deviation = 34.173983
Standard error = 11.391328

95% CI = 29.842662 to 82.37956

df = 8
t = 4.925774

One sided P = .0006


Two sided P = .0012

Power (for 5% significance) = 98.47%

A null hypothesis of no difference between the means is clearly rejected; the confidence interval
is a long way from including zero.
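This example can be reproduced exactly from the nine data pairs with scipy's paired t-test:

from scipy import stats

before = [312, 242, 340, 388, 296, 254, 391, 402, 290]
after = [300, 201, 232, 312, 220, 256, 328, 330, 231]

t, p = stats.ttest_rel(before, after)
print(f"t = {t:.4f}, two-sided p = {p:.4f}")  # t ≈ 4.9258, p ≈ 0.0012, as above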

Z-Test

A standard normal (z) table gives, for each value of z, the area under the bell curve to the left
of z. Entries for negative z scores correspond to values below the mean, and entries for positive
z scores correspond to values above the mean.

A Z-test is a type of hypothesis test. Hypothesis testing is just a way for you to figure out if
results from a test are valid or repeatable. For example, if someone said they had found a new
drug that cures cancer, you would want to be sure it was probably true. A hypothesis test will
tell you if it's probably true, or probably not true. A Z-test is used when your data are
approximately normally distributed.

When can you run a Z Test?

Several different types of tests are used in statistics (e.g., the F test, chi-square test, and t test).
You would use a Z test if:

• Your sample size is greater than 30. Otherwise, use a t test.
• Data points should be independent from each other. In other words, one data point isn't
related to and doesn't affect another data point.
• Your data should be normally distributed. However, for large sample sizes (over 30) this
doesn't always matter.
• Your data should be randomly selected from a population, where each item has an equal
chance of being selected.
• Sample sizes should be equal if at all possible.
How do I run a Z Test?

Running a Z test on your data requires five steps:

• State the null hypothesis and alternate hypothesis.
• Choose an alpha level.
• Find the critical value of z in a z table.
• Calculate the z test statistic.
• Compare the test statistic to the critical z value and decide if you should support or
reject the null hypothesis.
Z-Tests for Different Purposes

There are different types of Z-tests, each for a different purpose. Some of the popular types are
outlined below:

1. The z test for a single proportion is used to test a hypothesis about a specific value of the
population proportion.
Statistically speaking, we test the null hypothesis H0: p = p0 against the alternative
hypothesis H1: p ≠ p0 (or a one-sided alternative), where p is the population proportion
and p0 is a specific value of the population proportion we would like to test for acceptance.
For example, testing whether half the citizens of a town are tea drinkers would use this
test with p0 = 0.5; here, proportion refers to the proportion of tea drinkers.
2. The z test for a difference of proportions is used to test the hypothesis that two populations
have the same proportion.
For example, suppose one is interested to test if there is any significant difference in the
habit of tea drinking between male and female citizens of a town. In such a situation, the
Z-test for a difference of proportions can be applied.
One would have to obtain two independent samples from the town, one of males and the
other of females, and determine the proportion of tea drinkers in each sample in order to
perform this test.

3. The z-test for a single mean is used to test a hypothesis about a specific value of the
population mean.
Statistically speaking, we test the null hypothesis H0: μ = μ0 against the alternative
hypothesis H1: μ ≠ μ0 (or a one-sided alternative), where μ is the population mean and μ0
is a specific value of the population mean that we would like to test for acceptance.
Unlike the t-test for a single mean, this test is used if n ≥ 30 and the population standard
deviation is known.
4. The z test for a single variance is used to test a hypothesis about a specific value of the
population variance.
Statistically speaking, we test the null hypothesis H0: σ = σ0 against H1: σ ≠ σ0, where σ
is the population standard deviation and σ0 is a specific value of the population standard
deviation that we would like to test for acceptance.
In other words, this test enables us to test whether the given sample has been drawn from
a population with specific variance σ0². Unlike the chi-square test for a single variance,
this test is used if n ≥ 30.
5. The Z-test for testing equality of variances is used to test the hypothesis of equality of two
population variances when the sample size of each sample is 30 or larger.

You could perform all these steps by hand: for example, you could find a critical value by hand,
or calculate a z value by hand, as in the step-by-step examples below.

One Sample Z Test

Formula:

z = (x̅ − Δ) / (σ/√n)

where x̅ is the sample mean, Δ is a specified value to be tested, σ is the population standard
deviation, and n is the size of the sample. Look up the significance level of the z-value in the
standard normal table (Table in Appendix B).

A herd of 1,500 steer was fed a special high‐protein grain for a month. A random sample of 29
were weighed and had gained an average of 6.7 pounds. If the standard deviation of weight gain
for the entire herd is 7.1, test the hypothesis that the average weight gain per steer for the
month was more than 5 pounds.

Null hypothesis: H0: μ = 5

Alternative hypothesis: Ha: μ > 5

z = (6.7 − 5) / (7.1/√29) ≈ 1.29

The tabled value for z ≤ 1.28 is 0.8997, and 1 – 0.8997 = 0.1003.

So, the conditional probability that a sample from the herd gains an average of at least 6.7
pounds per steer is p = 0.1003. Should the null hypothesis of a mean weight gain of 5 pounds
for the population be rejected? That depends on how conservative you want to be. If you had
decided beforehand on a significance level of p < 0.05, the null hypothesis could not be rejected.
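A minimal sketch of this calculation in Python (the exact p here differs slightly from the 0.1003 above, which reads the table at the rounded value z = 1.28):

import math
from scipy.stats import norm

x_bar, mu0, sigma, n = 6.7, 5, 7.1, 29
z = (x_bar - mu0) / (sigma / math.sqrt(n))
p = 1 - norm.cdf(z)  # upper-tailed test, Ha: mu > 5
print(f"z = {z:.2f}, p = {p:.4f}")  # z ≈ 1.29, p ≈ 0.10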

In national use, a vocabulary test is known to have a mean score of 68 and a standard deviation
of 13. A class of 19 students takes the test and has a mean score of 65.

Is the class typical of others who have taken the test? Assume a significance level of p < 0.05.

There are two possible ways that the class may differ from the population. Its scores may be
lower than, or higher than, the population of all students taking the test; therefore, this
problem requires a two‐tailed test. First, state the null and alternative hypotheses:

Null hypothesis: H0: μ = 68

Alternative hypothesis: Ha: μ ≠ 68

Because you have specified a significance level, you can look up the critical z-value in the
standard normal table (Appendix B) before computing the statistic. This is a two-tailed test, so
the 0.05 must be split
such that 0.025 is in the upper tail and another 0.025 in the lower. The z‐value that
corresponds to –0.025 is –1.96, which is the lower critical z‐value. The upper value corresponds
to 1 – 0.025, or 0.975, which gives a z‐value of 1.96. The null hypothesis of no difference will be
rejected if the computed z statistic falls outside the range of –1.96 to 1.96.

Next, compute the z statistic:

z = (65 − 68) / (13/√19) ≈ −1.006

Because −1.006 is between −1.96 and 1.96, the null hypothesis that the population mean is 68
cannot be rejected. That is, there is no evidence that this class can be considered different from
others who have taken the test.

Formula:

(a, b) = x̅ ± z(α/2) · σ/√n

where a and b are the limits of the confidence interval, x̅ is the sample mean, z(α/2) is the upper
(or positive) z-value from the standard normal table corresponding to half of the desired alpha
level (because all confidence intervals are two-tailed), σ is the population standard deviation,
and n is the size of the sample.
A sample of 12 machine pins has a mean diameter of 1.15 inches, and the population standard
deviation is known to be 0.04. What is a 99 percent confidence interval of diameter width for
the population?

First, determine the z-value. A 99 percent confidence level is equivalent to p < 0.01. Half of
0.01 is 0.005. The z-value corresponding to an area of 0.005 is 2.58. The interval may now be
calculated:

1.15 ± 2.58 × (0.04/√12) = 1.15 ± 0.03

The interval is (1.12, 1.18).

We have 99 percent confidence that the population mean of pin diameters lies between 1.12 and
1.18 inches. Note that this is not the same as saying that 99 percent of the machine pins have
diameters between 1.12 and 1.18 inches, which would be an incorrect conclusion from this test.
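A minimal sketch of this interval calculation:

import math
from scipy.stats import norm

x_bar, sigma, n, conf = 1.15, 0.04, 12, 0.99
z = norm.ppf(1 - (1 - conf) / 2)       # ≈ 2.58
half_width = z * sigma / math.sqrt(n)  # ≈ 0.03
print(f"({x_bar - half_width:.2f}, {x_bar + half_width:.2f})")  # (1.12, 1.18)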

Because surveys cost money to administer, researchers often want to calculate how many
subjects will be needed to determine a population mean using a fixed confidence interval and

significance level. The formula is

n = (2 · z(α/2) · σ / w)²

where n is the number of subjects needed, z(α/2) is the critical z-value corresponding to the
desired significance level, σ is the population standard deviation, and w is the desired
confidence interval width.

How many subjects will be needed to find the average age of students at Fisher College plus or
minus a year, with a 95 percent significance level and a population standard deviation of 3.5?

Here n = (2 × 1.96 × 3.5 / 2)² = 6.86² ≈ 47.1. Rounding up, a sample of 48 students would be
sufficient to determine the students' mean age plus or minus one year. Note that the confidence
interval width is always double the "plus or minus" figure.
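A minimal sketch of this sample-size calculation (plus or minus one year means a width w of 2):

import math
from scipy.stats import norm

z = norm.ppf(0.975)  # 1.96 for a 95 percent level
sigma, w = 3.5, 2
n = (2 * z * sigma / w) ** 2
print(math.ceil(n))  # 48, matching the text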

Two Sample Z-Tests

Formula:

z = ((x̅1 − x̅2) − Δ) / √(σ1²/n1 + σ2²/n2)

where x̅1 and x̅2 are the means of the two samples, Δ is the hypothesized difference between the
population means (0 if testing for equal means), σ1 and σ2 are the standard deviations of the
two populations, and n1 and n2 are the sizes of the two samples.

A related test for a difference in proportions, the two-proportion z-test, allows you to compare
two proportions to see if they are the same (see the flu drug example below).

• The null hypothesis (H0) for that test is that the proportions are the same.
• The alternate hypothesis (H1) is that the proportions are not the same.

The amount of a certain trace element in blood is known to vary with a standard deviation of
14.1 ppm (parts per million) for male blood donors and 9.5 ppm for female donors. Random
samples of 75 male and 50 female donors yield concentration means of 28 and 33 ppm,
respectively. What is the likelihood that the population means of concentrations of the element
are the same for men and women?

Null hypothesis: H0: μ1 = μ2, or H0: μ1 − μ2 = 0

Alternative hypothesis: Ha: μ1 ≠ μ2, or Ha: μ1 − μ2 ≠ 0

The computed statistic is z = (28 − 33) / √(14.1²/75 + 9.5²/50) ≈ −2.37.

The computed z-value is negative because the (larger) mean for females was subtracted from
the (smaller) mean for males. But because the hypothesized difference between the populations
is 0, the order of the samples in this computation is arbitrary: x̅1 could just as well have been
the female sample mean and x̅2 the male sample mean, in which case z would be 2.37 instead of
−2.37. An extreme z-score in either tail of the distribution (plus or minus) will lead to rejection
of the null hypothesis of no difference.

The area of the standard normal curve corresponding to a z‐score of –2.37 is 0.0089. Because
this test is two‐tailed, that figure is doubled to yield a probability of 0.0178 that the population
means are the same. If the test had been conducted at a pre‐specified significance level of α <
0.05, the null hypothesis of equal means could be rejected. If the specified significance level had
been the more conservative (more stringent) α < 0.01, however, the null hypothesis could not
be rejected.

In practice, the two‐sample z‐test is not used often, because the two population standard
deviations σ 1 and σ 2 are usually unknown. Instead, sample standard deviations and
the t‐distribution are used.
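A minimal sketch of the trace-element example, computed from the summary statistics:

import math
from scipy.stats import norm

m1, s1, n1 = 28, 14.1, 75  # male donors
m2, s2, n2 = 33, 9.5, 50   # female donors

z = (m1 - m2) / math.sqrt(s1**2 / n1 + s2**2 / n2)
p = 2 * norm.cdf(-abs(z))  # two-tailed
print(f"z = {z:.2f}, p = {p:.4f}")  # z ≈ -2.37, p ≈ 0.0178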

Sample question: let’s say you’re testing two flu drugs A and B. Drug A works on 41 people
out of a sample of 195. Drug B works on 351 people in a sample of 605. Are the two drugs
comparable? Use a 5% alpha level.

Step 1: Find the two proportions:
P1 = 41/195 = 0.21 (that’s 21%)
P2 = 351/605 = 0.58 (that’s 58%).
Set these numbers aside for a moment.

Step 2: Find the overall sample proportion. The numerator will be the total number of
“positive” results for the two samples and the denominator is the total number of people in the
two samples.

p = (41 + 351) / (195 + 605) = 0.49. Set this number aside for a moment.

Step 3: Insert the numbers from Step 1 and Step 2 into the test statistic formula:

z = (p1 − p2) / √[p̂(1 − p̂)(1/n1 + 1/n2)] = (0.21 − 0.58) / √[0.49 × 0.51 × (1/195 + 1/605)]

Solving the formula, we get |z| = 8.99.

We need to find out if the z-score falls into the "rejection region."

Step 4: Find the z-score associated with α/2. From the standard normal table, the z-score
associated with a 5% alpha level divided by 2 (i.e., 0.025 in each tail) is 1.96.

Step 5: Compare the calculated z-score from Step 3 with the table z-score from Step 4. If the
calculated z-score is larger, you can reject the null hypothesis.

8.99 > 1.96, so we can reject the null hypothesis.
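The whole calculation can be sketched in a few lines of Python:

import math

x1, n1 = 41, 195   # drug A successes and sample size
x2, n2 = 351, 605  # drug B successes and sample size

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)  # overall sample proportion, 0.49
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
z = (p1 - p2) / se
print(f"|z| = {abs(z):.2f}")  # ≈ 8.99 > 1.96, so reject H0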

Chi-Square

Critical values of χ2 at the 0.05, 0.01, and 0.001 levels of significance:

df P = 0.05 P = 0.01 P = 0.001 df P = 0.05 P = 0.01 P = 0.001
1 3.84 6.64 10.83 51 68.67 77.39 87.97
2 5.99 9.21 13.82 52 69.83 78.62 89.27
3 7.82 11.35 16.27 53 70.99 79.84 90.57

4 9.49 13.28 18.47 54 72.15 81.07 91.88
5 11.07 15.09 20.52 55 73.31 82.29 93.17
6 12.59 16.81 22.46 56 74.47 83.52 94.47
7 14.07 18.48 24.32 57 75.62 84.73 95.75
8 15.51 20.09 26.13 58 76.78 85.95 97.03
9 16.92 21.67 27.88 59 77.93 87.17 98.34
10 18.31 23.21 29.59 60 79.08 88.38 99.62
11 19.68 24.73 31.26 61 80.23 89.59 100.88
12 21.03 26.22 32.91 62 81.38 90.80 102.15
13 22.36 27.69 34.53 63 82.53 92.01 103.46
14 23.69 29.14 36.12 64 83.68 93.22 104.72
15 25.00 30.58 37.70 65 84.82 94.42 105.97
16 26.30 32.00 39.25 66 85.97 95.63 107.26
17 27.59 33.41 40.79 67 87.11 96.83 108.54
18 28.87 34.81 42.31 68 88.25 98.03 109.79
19 30.14 36.19 43.82 69 89.39 99.23 111.06
20 31.41 37.57 45.32 70 90.53 100.42 112.31
21 32.67 38.93 46.80 71 91.67 101.62 113.56
22 33.92 40.29 48.27 72 92.81 102.82 114.84
23 35.17 41.64 49.73 73 93.95 104.01 116.08
24 36.42 42.98 51.18 74 95.08 105.20 117.35
25 37.65 44.31 52.62 75 96.22 106.39 118.60
26 38.89 45.64 54.05 76 97.35 107.58 119.85
27 40.11 46.96 55.48 77 98.49 108.77 121.11
28 41.34 48.28 56.89 78 99.62 109.96 122.36
29 42.56 49.59 58.30 79 100.75 111.15 123.60
30 43.77 50.89 59.70 80 101.88 112.33 124.84
31 44.99 52.19 61.10 81 103.01 113.51 126.09
32 46.19 53.49 62.49 82 104.14 114.70 127.33
33 47.40 54.78 63.87 83 105.27 115.88 128.57
34 48.60 56.06 65.25 84 106.40 117.06 129.80
35 49.80 57.34 66.62 85 107.52 118.24 131.04
36 51.00 58.62 67.99 86 108.65 119.41 132.28
37 52.19 59.89 69.35 87 109.77 120.59 133.51
38 53.38 61.16 70.71 88 110.90 121.77 134.74
39 54.57 62.43 72.06 89 112.02 122.94 135.96
40 55.76 63.69 73.41 90 113.15 124.12 137.19
41 56.94 64.95 74.75 91 114.27 125.29 138.45
42 58.12 66.21 76.09 92 115.39 126.46 139.66
43 59.30 67.46 77.42 93 116.51 127.63 140.90
44 60.48 68.71 78.75 94 117.63 128.80 142.12

45 61.66 69.96 80.08 95 118.75 129.97 143.32
46 62.83 71.20 81.40 96 119.87 131.14 144.55
47 64.00 72.44 82.72 97 120.99 132.31 145.78
48 65.17 73.68 84.03 98 122.11 133.47 146.99
49 66.34 74.92 85.35 99 123.23 134.64 148.21
50 67.51 76.15 86.66 100 124.34 135.81 149.48

Meaning of Chi-Square Test:

The Chi-square (χ2) test represents a useful method of comparing experimentally obtained
results with those to be expected theoretically on some hypothesis.
Thus Chi-square is a measure of actual divergence of the observed and expected frequencies. It
is very obvious that the importance of such a measure would be very great in sampling studies
where we have invariably to study the divergence between theory and fact.

Chi-square, as we have seen, is a measure of divergence between the expected and observed
frequencies; as such, if there is no difference between expected and observed frequencies, the
value of chi-square is 0.

If there is a difference between the observed and the expected frequencies then the value of Chi-
square would be more than 0. That is, the larger the Chi-square the greater the probability of a
real divergence of experimentally observed from expected results.

If the calculated value of chi-square is very small as compared to its table value it indicates that
the divergence between actual and expected frequencies is very little and consequently the fit is
good. If, on the other hand, the calculated value of chi-square is very big as compared to its
table value it indicates that the divergence between expected and observed frequencies is very
great and consequently the fit is poor.

To evaluate chi-square, we enter the table of critical values of χ2 (such as the one above) with
the computed value of chi-square and the appropriate number of degrees of freedom. The
number of df = (r – 1)(c – 1), in which r is the number of rows and c the number of columns in
which the data are tabulated.

Thus in a 2 x 2 table the degrees of freedom are (2 – 1)(2 – 1) or 1. Similarly, in a 3 x 3 table
the degrees of freedom are (3 – 1)(3 – 1) or 4, and in a 3 x 4 table the degrees of freedom are
(3 – 1)(4 – 1) or 6.

Levels of Significance of Chi-Square Test:

The calculated values of χ2 (Chi-square) are compared with the table values to conclude whether the difference between expected and observed frequencies is due to sampling fluctuations, and as such not significant, or whether the difference is due to some other reason, and as such significant. The divergence of theory and fact is always tested in terms of certain probabilities.

The probabilities indicate the extent of reliance that we can place on the conclusion drawn. The
table values of χ2 are available at various probability levels. These levels are called levels of
significance. Usually the value of χ2 at .05 and .01 level of significance for the given degrees of
freedom is seen from the tables.

If the calculated value of χ2 is greater than the tabulated value, it is said to be significant. In
other words, the discrepancy between the observed and expected frequencies cannot be
attributed to chance and we reject the null hypothesis.

Thus we conclude that the experiment does not support the theory. On the other hand if
calculated value of χ2 is less than the corresponding tabulated value then it is said to be non-
significant at the required level of significance.

This implies that the discrepancy between observed values (experiment) and the expected
values (theory) may be attributed to chance, i.e., fluctuations of sampling.

Chi-Square Test under Null Hypothesis:

Suppose we are given a set of observed frequencies obtained under some experiment and we
want to test if the experimental results support a particular hypothesis or theory. Karl Pearson, in 1900, developed a test for testing the significance of the discrepancy between experimental
values and the theoretical values obtained under some theory or hypothesis.

This test is known as χ2-test and is used to test if the deviation between observation
(experiment) and theory may be attributed to chance (fluctuations of sampling) or if it is really
due to the inadequacy of the theory to fit the observed data.

Under the Null Hypothesis we state that there is no significant difference between the observed
(experimental) and the theoretical or hypothetical values, i.e., there is a good compatibility
between theory and experiment.

The equation for chi-square (χ2) is stated as follows:

χ2 = Σ [ (fo – fe)² / fe ]

in which fo = frequency of occurrence of observed or experimentally determined facts, and fe = expected frequency of occurrence on some hypothesis.

Thus chi-square is the sum of the values obtained by dividing the square of the difference
between observed and expected frequencies by the expected frequencies in each case. In other
words the differences between observed and expected frequencies are squared and divided by
the expected number in each case, and the sum of these quotients is χ2.
Several illustrations of the chi-square test will clarify the discussion given above. The differences between fo and fe are always taken as positive.
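A minimal sketch of this computation in Python (the function name and the toy data below are ours, not the text's):

    def chi_square(observed, expected):
        """Chi-square = sum over all cells of (fo - fe)^2 / fe."""
        return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

    # Toy data: fo = [30, 20] against fe = [25, 25] gives 25/25 + 25/25 = 2.0.
    print(chi_square([30, 20], [25, 25]))  # 2.0

Because the differences are squared, their signs do not affect the result, which matches the remark above.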

1. Testing the divergence of observed results from those expected on the hypothesis of
equal probability (null hypothesis):

Example 1:
Ninety-six subjects are asked to express their attitude towards the proposition “Should AIDS
education be integrated in the curriculum of Higher secondary stage” by marking F
(favourable), I (indifferent) or U (unfavourable).

It was observed that 48 marked ‘F’, 24 ‘I’ and 24 ‘U’:


• Test whether the observed results diverge significantly from the results to be expected
if there are no preferences in the group.
• Test the hypothesis that “there is no difference between preferences in the group”.
• Interpret the findings.

Solution:
The following steps may be followed for the computation of χ2 and for drawing the conclusions:

Step 1:
Compute the expected frequencies (fe) corresponding to the observed frequencies in each case under some theory or hypothesis. In our example the theory is one of equal probability (the null hypothesis), so the 96 answers are expected to be divided equally among the three categories: fe = 96/3 = 32 in each case.

Step 2:
Compute the deviations (fo – fe) for each frequency. Each of these differences is squared and
divided by its fe (256/32, 64/32 and 64/32).

Step 3:
Add these values to compute:

χ2 = 256/32 + 64/32 + 64/32 = 8 + 2 + 2 = 12
Step 4:
The degrees of freedom are calculated from the formula df = (r – 1)(c – 1) to be (3 – 1)(2 – 1) or 2.

Step 5:
Look up the critical (table) value of χ2 for 2 df at the chosen level of significance, usually 5% or 1%. With df = 2, the χ2 value required for significance at the .01 level is 9.21 (Table E). The obtained χ2 value of 12 > 9.21.

i. Hence the marked divergence is significant.
ii. The null hypothesis is rejected.
iii. We conclude that our group really favours the proposition.

We reject the “equal answer” hypothesis and conclude that our group favours the proposition.
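Assuming SciPy is available, the same result can be cross-checked in a few lines (scipy.stats.chisquare defaults to equal expected frequencies, which matches the equal-probability hypothesis here):

    from scipy.stats import chi2, chisquare

    stat, p = chisquare([48, 24, 24])   # expected frequencies default to 32 each
    print(stat)                         # 12.0, matching the hand computation
    print(chi2.ppf(0.99, 2))            # ≈ 9.21, the .01-level critical value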

Example 2:
The number of automobile accidents per week in a certain community were as follows:
12, 8, 20, 2, 14, 10, 15, 6, 9, 4

Are these frequencies in agreement with the belief that accident conditions were the same
during this 10-week period?

Solution:
Null Hypothesis—Set up the null hypothesis that the given frequencies (of number of accidents
per week in a certain community) are consistent with the belief that the accident conditions
were same during the 10-week period.

The total number of accidents over the 10 weeks is:

12 + 8 + 20 + 2 + 14 + 10 + 15 + 6 + 9 + 4 = 100.

Under the null hypothesis, these accidents should be uniformly distributed over the 10-week
period and hence the expected number of accidents for each of the 10 weeks are 100/10 = 10.

Since the calculated value of χ2 = 26.6 is greater than the tabulated value of 21.666 (for 9 df at the .01 level), it is significant and the null hypothesis is rejected at the .01 level of significance. Hence we conclude that the accident conditions were certainly not uniform (the same) over the 10-week period.
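A hedged sketch of the same check, again assuming SciPy is available:

    from scipy.stats import chi2, chisquare

    accidents = [12, 8, 20, 2, 14, 10, 15, 6, 9, 4]
    stat, p = chisquare(accidents)   # expected: 10 accidents per week
    print(stat)                      # 26.6
    print(chi2.ppf(0.99, 9))         # ≈ 21.666, so the result is significant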
2. Testing the divergence of observed results from those expected on the hypothesis of a
normal distribution:
The hypothesis, instead of being equally probable, may follow the normal distribution. An
example illustrates how this hypothesis may be tested by chi-square.

Example 3:
Two hundred salesmen have been classified into three groups: good, satisfactory and poor, by consensus of sales managers.

Does this distribution of ratings differ significantly from that to be expected if selling ability is normally distributed in our population of salesmen?

We set up the hypothesis that selling ability is normally distributed. The normal curve extends
from – 3σ to + 3σ. If the selling ability is normally distributed the base line can be divided into
three equal segments, i.e.

(+ 1σ to + 3σ), (- 1σ to + 1σ) and (- 3σ to – 1σ) representing good, satisfactory and poor


salesmen respectively. By referring to Table A we find that 16% of cases lie between +1σ and +3σ, 68% between –1σ and +1σ, and 16% between –3σ and –1σ. In the case of our problem, 16% of 200 = 32 and 68% of 200 = 136.

The calculated χ2 = 72.76, with df = 2. The critical value of χ2 at the .01 level for 2 df is 9.21; since 72.76 > 9.21, P is less than .01.

∴ The discrepancy between observed frequencies and expected frequencies is quite significant. On this ground the hypothesis of a normal distribution of selling ability in this group must be rejected. Hence we conclude that the distribution of ratings differs from that to be expected under a normal distribution.

3. Chi-square test when our expectations are based on predetermined results:


Example 4:
In an experiment on the breeding of peas a researcher obtained the following data: the theory predicts that the proportions of peas in the four groups A, B, C and D should be 9 : 3 : 3 : 1. In an experiment among 1,600 peas, the numbers in the four groups were 882, 313, 287 and 118. Does the experimental result support the genetic theory? (Test at .05 level.)
Solution:
We set up the null hypothesis that there is no significant difference between the experimental
values and the theory. In other words there is good correspondence between theory and
experiment, i.e., the theory supports the experiment.

Since the calculated χ2 value of 4.726 < 7.81 (the critical value for 3 df at the .05 level), it is not significant. Hence the null hypothesis may be accepted at the .05 level of significance and we may conclude that the experimental results support the genetic theory.
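Under the same assumption that SciPy is available, the computation can be sketched as:

    from scipy.stats import chisquare

    observed = [882, 313, 287, 118]
    expected = [900, 300, 300, 100]   # 9 : 3 : 3 : 1 of 1,600
    stat, p = chisquare(observed, f_exp=expected)
    print(stat)                       # ≈ 4.727, below 7.81 (3 df, .05 level)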

4. The Chi-square test when table entries are small:


When table entries are small and when table is 2 x 2 fold, i.e., df = 1, χ2 is subject to
considerable error unless a correction for continuity (called Yates’ Correction) is made.

Example 5:
Forty rats were offered opportunity to choose between two routes. It was found that 13 chose
lighted routes (i.e., routes with more illumination) and 27 chose dark routes.

(i) Test the hypothesis that illumination makes no difference in the rats’ preference for routes
(Test at .05 level).

(ii) Test whether the rats have a preference towards dark routes.

Solution:
If illumination makes no difference in preference for routes, i.e., if H0 is true, the proportionate preference would be 1/2 for each route (i.e., an expected frequency of 20 rats per route).

In our example we are to subtract .5 from each (fo – fe) difference because df = 1 and the cell entries are small (Yates' correction for continuity).

The data can be tabulated as follows:

Route            fo    fe
Lighted routes   13    20
Dark routes      27    20

When the expected entries in the 2 x 2 fold table are the same, as in our problem, the formula for chi-square may be written in a somewhat shorter form as follows:

χ2 = ( |fo1 – fo2| – 1 )² / (fo1 + fo2)

Here χ2 = (|13 – 27| – 1)² / 40 = 169/40 = 4.22.

206 Dr. Ashish Adholiya, Assistant Professor, PIM, PAHER University, Udaipur
(i) The critical value of χ2 at .05 level is 3.841. The obtained χ2 of 4.22 is more than 3.841.
Hence the null hypothesis is rejected at .05 level. Apparently light or dark is a factor in the rats’
choice for routes.
(ii) In our example we have to make a one-tailed test. Entering Table E we find that χ2 of 4.22 has a P = .043 (by interpolation).
∴ P/2 = .0215, or about 2%. In other words, there are 2 chances in 100 that such a divergence would occur.

Hence we mark the divergence to be significant at the .02 level.

Therefore, we conclude that the rats have a preference for dark routes.
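A minimal sketch of Yates' correction for this example (the helper function is ours):

    def yates_chi_square(observed, expected):
        """Chi-square with Yates' correction: (|fo - fe| - .5)^2 / fe, summed."""
        return sum((abs(fo - fe) - 0.5) ** 2 / fe
                   for fo, fe in zip(observed, expected))

    print(yates_chi_square([13, 27], [20, 20]))  # 4.225, i.e. the 4.22 above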

5. The Chi-square test of independence in contingency tables:


Sometimes we may encounter situations which require us to test whether there is any relationship (or association) between two variables or attributes. In other words, a χ2 test can be made when we wish to investigate the relationship between traits or attributes which can be classified into two or more categories.
For example, we may be required to test whether the eye-colour of father is associated with the
eye-colour of sons, whether the socio-economic status of the family is associated with the
preference of different brands of a commodity, whether the education of couple and family size
are related, whether a particular vaccine has a controlling effect on a particular disease etc.

To make the test we prepare a contingency table and calculate fe (expected frequency) for each cell of the contingency table, and then compute χ2 by using the formula:

χ2 = Σ [ (fo – fe)² / fe ]

Null hypothesis:
χ2 is calculated with an assumption that the two attributes are independent of each other, i.e.
there is no relationship between the two attributes.

The calculation of the expected frequency of a cell is as follows:

fe = (total of the row containing that cell × total of the column containing that cell) / grand total N
Example 6:
In a certain sample of 2,000 families, 1,400 families are consumers of tea, of whom 1,236 are Hindu families and 164 are non-Hindu; 600 families are not consumers of tea, of whom 564 are Hindu families and 36 are non-Hindu. Use the χ2 test and state whether there is any significant difference between the consumption of tea among Hindu and non-Hindu families.

Solution:
The above data can be arranged in the form of a 2 x 2 contingency table as given below:

                        Hindu    Non-Hindu    Total
Consumers of tea        1,236    164          1,400
Non-consumers of tea    564      36           600
Total                   1,800    200          2,000

We set up the null hypothesis (H0) that the two attributes viz., ‘consumption of tea’ and the
‘community’ are independent. In other words, there is no significant difference between the
consumption of tea among Hindu and non-Hindu families.

Since the calculated value of χ2, viz., 15.24, is much greater than the tabulated value of χ2 at the .01 level of significance (6.635 for 1 df), the value of χ2 is highly significant and the null hypothesis is rejected.

Hence we conclude that the two communities (Hindu and non-Hindu) differ significantly as regards the consumption of tea among them.
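Assuming SciPy is available, scipy.stats.chi2_contingency reproduces this result once its built-in Yates correction is switched off (the cell entries here are large, so no correction is needed):

    from scipy.stats import chi2_contingency

    table = [[1236, 164],   # tea consumers: Hindu, non-Hindu
             [564, 36]]     # non-consumers: Hindu, non-Hindu
    stat, p, dof, expected = chi2_contingency(table, correction=False)
    print(stat)             # ≈ 15.24 with df = 1; p is well below .01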

Example 7:
The table given below shows the data obtained during an epidemic of cholera.

Test the effectiveness of inoculation in preventing the attack of cholera.

Solution:
We set up the null hypothesis (H0) that the two attributes, viz., inoculation and absence of attack from cholera, are not associated, i.e., that these two attributes in the given table are independent.

Based on our hypothesis we can calculate the expected frequencies as follows:


Calculation of (fe):

The five percent value of χ2 for 1 df is 3.841, which is much less than the calculated value of χ2. In the light of this, the conclusion is evident that the hypothesis is incorrect, and inoculation and absence of attack from cholera are associated.

Conditions for the Validity of Chi-Square Test:


The Chi-square test statistic can be used if the following conditions are satisfied:

• N, the total frequency, should be reasonably large, say greater than 50.
• The sample observations should be independent. This implies that no individual item
should be included twice or more in the sample.
• The constraints on the cell frequencies, if any, should be linear (i.e., they should not involve squares or higher powers of the frequencies), such as ∑fo = ∑fe = N.
• No theoretical frequency should be small. Small is a relative term. Preferably each
theoretical frequency should be larger than 10 but in any case not less than 5. If any
theoretical frequency is less than 5 then we cannot apply χ2 -test as such. In that case
we use the technique of “pooling” which consists in adding the frequencies which are
less than 5 with the preceding or succeeding frequency (frequencies) so that the
resulting sum is greater than 5 and adjust for the degrees of freedom accordingly.
• The given distribution should not be replaced by relative frequencies or proportions
but the data should be given in original units.
• Yates’ correction should be applied in special circumstances when df = 1 (i.e. in 2 x 2
tables) and when the cell entries are small.
• The χ2 test is mostly used as a non-directional test (i.e., we make a two-tailed test). However, there may be cases when χ2 tests can be employed in making a one-tailed test. In a one-tailed test the P-value is halved; equivalently, the critical value is read from the column for double the chosen level. For example, with df = 1, the critical value of χ2 at the .05 level is 2.706 (the value written under the .10 level) and the critical value of χ2 at the .01 level is 5.412 (the value written under the .02 level).

The Additive Property of Chi-Square Test:


χ2 has a very useful property of addition. If a number of sample studies have been conducted in
the same field then the results can be pooled together for obtaining an accurate idea about the
real position.

Suppose ten experiments have been conducted to test whether a particular vaccine is effective
against a particular disease. Now here we shall have ten different values of χ2 and ten different
values of df.

We can add the ten χ2 to obtain one value and similarly ten values of df can also be added
together. Thus, we shall have one value of χ2 and one value of degrees of freedom. Now we can
test the results of all these ten experiments combined together and find out the value of P.

Suppose five independent experiments have been conducted in a particular field. Suppose in each case there was one df, and the following values of χ2 were obtained: 4.3, 5.7, 2.1, 3.9 and 8.3.
Now at the 5% level of significance (or for P = .05) the value of χ2 for one df is 3.841. From the calculated values of χ2 given above we notice that in only one case, i.e., experiment No. 3, the observed value of χ2 is less than the tabulated value of 3.841.

It means that so far as this experiment is concerned the difference is insignificant but in the
remaining four cases the calculated value of χ2 is more than 3.841 and as such at 5% level of
significance the difference between the expected and the actual frequencies is significant.

If we add all the values of χ2 we get (4.3 + 5.7 + 2.1 + 3.9 + 8.3) or 24.3. The total of the
degrees of freedom is 5. It means that the calculated value of χ2 for 5 df is 24.3.

If we look in the table of χ2 we shall find that at the 5% level of significance for 5 df the value of χ2 is 11.070. The calculated value of χ2, which is 24.3, is much higher than the tabulated value, and as such we can conclude that the difference between observed and expected frequencies is a significant one.

Even if we take 1% level of significance (or P = .01) the table value of χ2 is only 15.086. Thus
the probability of getting a value of χ2 equal to or more than 24.3 as a result of sampling
fluctuations is much less than even .01 or in other words the difference is significant.
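A short sketch of the additive property, assuming SciPy for the chi-square distribution:

    from scipy.stats import chi2

    chis = [4.3, 5.7, 2.1, 3.9, 8.3]   # one df each
    total, df = sum(chis), len(chis)    # 24.3 on 5 df
    print(chi2.ppf(0.95, df))           # ≈ 11.070; 24.3 far exceeds it
    print(chi2.sf(total, df))           # pooled P-value, well below .01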

Applications of Chi-Square Test:
The applications of χ2-test statistic can be discussed as stated below:

• Testing the divergence of observed results from expected results when our expectations
are based on the hypothesis of equal probability.
• Chi-square test when expectations are based on normal distribution.
• Chi-square test when our expectations are based on predetermined results.
• Correction for discontinuity or Yates’ correction in calculating χ2.
• Chi-square test of independence in contingency tables.

Uses of Chi-Square Test:


• Although the test is conducted in terms of frequencies, it can best be viewed conceptually as a test about proportions.
• χ2 test is used in testing hypothesis and is not useful for estimation.
• Chi-square test can be applied to complex contingency table with several classes.

Chi-square test has a very useful property i.e., ‘the additive property’. If a number of sample studies are
conducted in the same field, the results can be pooled together. This means that χ2-values can be added.

F-Test

An “F test” is a catch-all term for any test that uses the F-distribution. In most cases, when people talk about the F test, what they are actually talking about is the F test to compare two variances. However, the F statistic is used in a variety of tests including regression analysis, the Chow test and the Scheffé test (a post-hoc ANOVA test).

General Steps for an F Test

1. State the null hypothesis and the alternate hypothesis.
2. Calculate the F value. In a regression setting the F value is calculated using the formula F = [(SSE1 – SSE2) / m] / [SSE2 / (n – k)], where SSE = residual sum of squares, m = number of restrictions and k = number of independent variables.
3. Find the F statistic (the critical value for this test). In ANOVA, the F statistic is the variance of the group means divided by the mean of the within-group variances; its critical value can be found in the F-Table.
4. Support or reject the null hypothesis.

F Test to Compare Two Variances

A statistical F test uses an F statistic to compare two variances, s1² and s2², by dividing them. The result is always a positive number (because variances are always positive). The equation for comparing two variances with the F test is:

F = s1² / s2²

If the variances are equal, the ratio of the variances will equal 1. For example, if you had two
data sets with a sample 1 (variance of 10) and a sample 2 (variance of 10), the ratio would be
10/10 = 1.

You always test that the population variances are equal when running an F test. In other words, you always assume that the ratio of the variances equals 1. Therefore, your null hypothesis will always be that the variances are equal.

To compare the variances of two different sets of values, the F test formula is used. Working with the F distribution under the null hypothesis, we first need to find the mean of each of the two samples and then calculate their variances:

s² = Σ(x – x̄)² / (n – 1)

where s² = sample variance, x = the values given in a set of data, x̄ = the mean of the data, and n = the total number of values.

Assumptions

Several assumptions are made for the test. Your populations must be approximately normally distributed (i.e., fit the shape of a bell curve) in order to use the test, and the samples must be independent events. In addition, you’ll want to bear in mind a few important points:

• The larger variance should always go in the numerator (the top number) to force the test
into a right-tailed test. Right-tailed tests are easier to calculate.
• For two-tailed tests, divide alpha by 2 before finding the right critical value.
• If you are given standard deviations, they must be squared to get the variances.
• If your degrees of freedom aren’t listed in the F Table, use the larger critical value. This
helps to avoid the possibility of Type I errors.

F Test to compare two variances by hand: Steps

Warning: F tests can get really tedious to calculate by hand, especially if you have to calculate
the variances.

Step 1: If you are given standard deviations, go to Step 2. If you are given variances to compare, go to
Step 3.
Step 2: Square both standard deviations to get the variances. For example, if σ1 = 9.6 and σ2 = 10.9, then the variances (s1² and s2²) would be 9.6² = 92.16 and 10.9² = 118.81.
Step 3: Take the largest variance, and divide it by the smallest variance to get the F value. For example, if your two variances were s1² = 2.5 and s2² = 9.4, divide 9.4 / 2.5 = 3.76.
Why? Placing the largest variance on top will force the F-test into a right tailed test, which is
much easier to calculate than a left-tailed test.
Step 4: Find your degrees of freedom. Degrees of freedom is your sample size minus 1. As you
have two samples (variance 1 and variance 2), you’ll have two degrees of freedom: one for the
numerator and one for the denominator.
Step 5: Look up the F value you calculated in Step 3 in the F-table. Note that there are several tables, so you’ll need to locate the right table for your alpha level.
Step 6: Compare your calculated value (Step 3) with the table F value from Step 5. If the table F value is smaller than the calculated value, you can reject the null hypothesis.

Two Tailed F-Test

The difference between running a one or two tailed F test is that the alpha level needs to be
halved for two tailed F tests. For example, instead of working at α = 0.05, you use α = 0.025;
Instead of working at α = 0.01, you use α = 0.005.

With a two tailed F test, you just want to know if the variances are not equal to each other. In
notation:
Ha: σ1² ≠ σ2²

Sample problem: Conduct a two tailed F Test on the following samples:
Sample 1: Variance = 109.63, sample size = 41.
Sample 2: Variance = 65.99, sample size = 21.

Step 1: Write your hypothesis statements:


Ho: No difference in variances.
Ha: Difference in variances.

Step 2: Calculate your F value. Put the highest variance as the numerator and the lowest variance as the denominator:
F = variance 1 / variance 2 = 109.63 / 65.99 = 1.66

Step 3: Calculate the degrees of freedom:


The degrees of freedom in the table will be the sample size -1, so:
Sample 1 has 40 df (the numerator).
Sample 2 has 20 df (the denominator).

Step 4: Choose an alpha level. No alpha was stated in the question, so use 0.05 (the standard “go
to” in statistics). This needs to be halved for the two-tailed test, so use 0.025.

Step 5: Find the critical F Value using the F Table. There are several tables, so make sure you
look in the alpha = .025 table. Critical F (40, 20) at alpha (0.025) = 2.287

Step 6: Compare your calculated value (Step 2) to your table value (Step 5). If your calculated
value is higher than the table value, you can reject the null hypothesis:
F calculated value: 1.66
F value from table: 2.287.

1.66 < 2.287, so we cannot reject the null hypothesis.
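Assuming SciPy, the same two-tailed comparison can be sketched as:

    from scipy.stats import f

    f_stat = 109.63 / 65.99                  # larger variance on top: ≈ 1.66
    crit = f.ppf(1 - 0.025, dfn=40, dfd=20)  # upper 2.5% point ≈ 2.287
    print(f_stat < crit)                     # True -> cannot reject H0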

An F-test (Snedecor and Cochran, 1983) is used to test if the variances of two populations are
equal. This test can be a two-tailed test or a one-tailed test. The two-tailed version tests against
the alternative that the variances are not equal. The one-tailed version only tests in one direction, that is, whether the variance of the first population is greater than or less than (but not both) the second population's variance. The choice is determined by the problem. For
example, if we are testing a new process, we may only be interested in knowing if the new
process is less variable than the old process.

Definition: The F hypothesis test is defined as:

H0: σ1² = σ2²

Ha: σ1² < σ2² for a lower one-tailed test
    σ1² > σ2² for an upper one-tailed test
    σ1² ≠ σ2² for a two-tailed test

Test Statistic: F = s1² / s2²

where s1² and s2² are the sample variances. The more this ratio deviates from 1, the stronger the evidence for unequal population variances.

Significance Level: α

Critical Region: The hypothesis that the two variances are equal is rejected if

F > Fα, N1−1, N2−1 for an upper one-tailed test

F < F1−α, N1−1, N2−1 for a lower one-tailed test

F < F1−α/2, N1−1, N2−1 or F > Fα/2, N1−1, N2−1 for a two-tailed test

where Fα, N1−1, N2−1 is the critical value of the F distribution with N1 − 1 and N2 − 1 degrees of freedom and a significance level of α.

In the above formulas for the critical regions, the Handbook follows the convention that Fα is
the upper critical value from the F distribution and F1-α is the lower critical value from the F
distribution. Note that this is the opposite of the designation used by some texts and software
programs.

F-Test Example

Lin Xiang, a young banker, has moved from Saskatoon, Saskatchewan, to Winnipeg, Manitoba,
where she has recently been promoted and made the manager of City Bank, a newly established
bank in Winnipeg with branches across the Prairies. After a few weeks, she has discovered that
maintaining the correct number of tellers seems to be more difficult than it was when she was a
branch assistant manager in Saskatoon. Some days, the lines are very long, but on other days,
the tellers seem to have little to do. She wonders if the number of customers at her new branch
is simply more variable than the number of customers at the branch where she used to work.
Because tellers work for a whole day or half a day (morning or afternoon), she collects the
following data on the number of transactions in a half day from her branch and the branch
where she used to work:

Winnipeg branch: 156, 278, 134, 202, 236, 198, 187, 199, 143, 165, 223

Saskatoon branch: 345, 332, 309, 367, 388, 312, 355, 363, 381

She hypothesizes:
H0: σW² = σS²
Ha: σW² ≠ σS²

She decides to use α = .05. She computes the sample variances and finds:
sW² = 1828.56
sS² = 795.19

Following the rule to put the larger variance in the numerator, so that she saves a step, she finds:
F = sW² / sS² = 1828.56 / 795.19 = 2.30
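A sketch of her computation, using the standard library's sample variance and SciPy for the critical value (the variable names are ours):

    from statistics import variance          # sample variance (n - 1 divisor)
    from scipy.stats import f

    winnipeg = [156, 278, 134, 202, 236, 198, 187, 199, 143, 165, 223]
    saskatoon = [345, 332, 309, 367, 388, 312, 355, 363, 381]

    f_stat = variance(winnipeg) / variance(saskatoon)  # 1828.56 / 795.19 ≈ 2.30
    crit = f.ppf(1 - 0.025, dfn=len(winnipeg) - 1, dfd=len(saskatoon) - 1)
    print(round(f_stat, 2), round(crit, 2))  # ≈ 2.30 against roughly 4.30; since
                                             # 2.30 < crit, H0 would not be rejected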

U-Test

The Mann-Whitney U test is a nonparametric test that allows two groups or conditions or
treatments to be compared without making the assumption that values are normally
distributed. So, for example, one might compare the speed at which two different groups of
people can run 100 metres, where one group has trained for six weeks and the other has not.

Requirements

• Two random, independent samples


• The data is continuous - in other words, it must, in principle, be possible to distinguish
between values at the nth decimal place
• Scale of measurement should be ordinal, interval or ratio
• For maximum accuracy, there should be no ties, though this test - like others - has a
way to handle ties

Null Hypothesis: The null hypothesis asserts that the medians of the two samples are
identical.

The default assumption or null hypothesis is that there is no difference between the distributions of the data samples. Rejection of this hypothesis suggests that there is likely some difference between the samples. More specifically, the test determines whether it is equally likely that any randomly selected observation from one sample will be greater or less than a randomly selected observation from the other sample. If this assumption is violated, it suggests differing distributions.

• Fail to Reject H0: Sample distributions are equal.


• Reject H0: Sample distributions are not equal.

For the test to be effective, it requires at least 20 observations in each data sample.

This test is often performed as a two-sided test and, thus, the research hypothesis indicates that the populations are not equal as opposed to specifying directionality. A one-sided research hypothesis is used if interest lies in detecting a positive or negative shift in one population as compared to the other. The procedure for the test involves pooling the observations from the two samples into one combined sample, keeping track of which sample each observation comes from, and then ranking the pooled observations from lowest to highest, from 1 to n1 + n2.

Example: Consider a Phase II clinical trial designed to investigate the effectiveness of a new
drug to reduce symptoms of asthma in children. A total of n=10 participants are randomized to
receive either the new drug or a placebo. Participants are asked to record the number of
episodes of shortness of breath over a 1 week period following receipt of the assigned
treatment. The data are shown below.

Placebo 7 5 6 4 12

New Drug 3 6 4 2 1

Is there a difference in the number of episodes of shortness of breath over a 1 week period in
participants receiving the new drug as compared to those receiving the placebo? By inspection,
it appears that participants receiving the placebo have more episodes of shortness of breath, but
is this statistically significant?

In this example, the outcome is a count and in this sample the data do not follow a normal
distribution.

[Figure: Frequency histogram of the number of episodes of shortness of breath]

In addition, the sample size is small (n1=n2=5), so a nonparametric test is appropriate. The
hypothesis is given below, and we run the test at the 5% level of significance (i.e., α=0.05).

H0: The two populations are equal versus


H1: The two populations are not equal.

Note that if the null hypothesis is true (i.e., the two populations are equal), we expect to see
similar numbers of episodes of shortness of breath in each of the two treatment groups, and we
would expect to see some participants reporting few episodes and some reporting more
episodes in each group. This does not appear to be the case with the observed data. A test of
hypothesis is needed to determine whether the observed data is evidence of a statistically
significant difference in populations.

The first step is to assign ranks and to do so we order the data from smallest to largest. This is
done on the combined or total sample (i.e., pooling the data from the two treatment groups
(n=10)), and assigning ranks from 1 to 10, as follows. We also need to keep track of the group
assignments in the total sample.

Original Sample        Ordered (Smallest to Largest)        Ranks
Placebo   New Drug     Placebo   New Drug                   Placebo   New Drug
7         3            -         1                          -         1
5         6            -         2                          -         2
6         4            -         3                          -         3
4         2            4         4                          4.5       4.5
12        1            5         -                          6         -
                       6         6                          7.5       7.5
                       7         -                          9         -
                       12        -                          10        -

Note that the lower ranks (e.g., 1, 2 and 3) are assigned to responses in the new drug group
while the higher ranks (e.g., 9, 10) are assigned to responses in the placebo group. Again, the
goal of the test is to determine whether the observed data support a difference in the
populations of responses. Recall that in parametric tests (discussed in the modules on
hypothesis testing), when comparing means between two groups, we analyzed the difference in
the sample means relative to their variability and summarized the sample information in a test
statistic. A similar approach is employed here. Specifically, we produce a test statistic based on
the ranks.

First, we sum the ranks in each group. In the placebo group, the sum of the ranks is 37; in the
new drug group, the sum of the ranks is 18. Recall that the sum of the ranks will always equal
n(n+1)/2. As a check on our assignment of ranks, we have n(n+1)/2 = 10(11)/2=55 which is
equal to 37+18 = 55.

For the test, we call the placebo group 1 and the new drug group 2 (the assignment of groups 1 and 2 is arbitrary). We let R1 denote the sum of the ranks in group 1 (i.e., R1 = 37), and R2 denote the sum of the ranks in group 2 (i.e., R2 = 18). If the null hypothesis is true (i.e., if the two
populations are equal), we expect R1 and R2 to be similar. In this example, the lower values
(lower ranks) are clustered in the new drug group (group 2), while the higher values (higher
ranks) are clustered in the placebo group (group 1). This is suggestive, but is the observed
difference in the sums of the ranks simply due to chance? To answer this we will compute a test
statistic to summarize the sample information and look up the corresponding value in a
probability distribution.

Test Statistic for the Mann Whitney U Test

The test statistic for the Mann Whitney U Test is denoted U and is the smaller of U1 and U2, defined below:

U1 = n1 n2 + n1(n1 + 1)/2 – R1 and U2 = n1 n2 + n2(n2 + 1)/2 – R2

where R1 = sum of the ranks for group 1 and R2 = sum of the ranks for group 2. For this example,

U1 = 5(5) + 5(6)/2 – 37 = 25 + 15 – 37 = 3 and U2 = 5(5) + 5(6)/2 – 18 = 25 + 15 – 18 = 22.

In our example, U=3. Is this evidence in support of the null or research hypothesis? Before we
address this question, we consider the range of the test statistic U in two different situations.

Situation #1

Consider the situation where there is complete separation of the groups, supporting the research hypothesis that the two populations are not equal. If all of the higher numbers of episodes of shortness of breath (and thus all of the higher ranks) are in the placebo group, and all of the lower numbers of episodes (and ranks) are in the new drug group, and there are no ties, then:

R1 = 6 + 7 + 8 + 9 + 10 = 40 and R2 = 1 + 2 + 3 + 4 + 5 = 15,

so U1 = 25 + 15 – 40 = 0 and U2 = 25 + 15 – 15 = 25.

Therefore, when there is clearly a difference in the populations, U = 0.

Situation #2

Consider a second situation where low and high scores are approximately evenly
distributed in the two groups, supporting the null hypothesis that the groups are equal. If
ranks of 2, 4, 6, 8 and 10 are assigned to the numbers of episodes of shortness of breath
reported in the placebo group and ranks of 1, 3, 5, 7 and 9 are assigned to the numbers of
episodes of shortness of breath reported in the new drug group, then:

R1 = 2 + 4 + 6 + 8 + 10 = 30 and R2 = 1 + 3 + 5 + 7 + 9 = 25,

so U1 = 25 + 15 – 30 = 10 and U2 = 25 + 15 – 25 = 15.

When there is clearly no difference between populations, then U = 10.

Thus, smaller values of U support the research hypothesis, and larger values of U support the
null hypothesis.

In every test, we must determine whether the observed U supports the null or research
hypothesis. This is done following the same approach used in parametric testing. Specifically,
we determine a critical value of U such that if the observed value of U is less than or equal to
the critical value, we reject H0 in favor of H1 and if the observed value of U exceeds the critical
value we do not reject H0.

The critical value of U can be found in the table below. To determine the appropriate critical
value we need sample sizes (for Example: n1=n2=5) and our two-sided level of significance
(α=0.05). For Example 1 the critical value is 2, and the decision rule is to reject H0 if U < 2.
We do not reject H0 because 3 > 2. We do not have statistically significant evidence at α =0.05,
to show that the two populations of numbers of episodes of shortness of breath are not equal.
However, in this example, the failure to reach statistical significance may be due to low power.
The sample data suggest a difference, but the sample sizes are too small to conclude that there
is a statistically significant difference.
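Assuming SciPy, scipy.stats.mannwhitneyu reproduces this example; note that SciPy reports U for the first sample, so the textbook U is the smaller of that value and n1 n2 minus it:

    from scipy.stats import mannwhitneyu

    placebo = [7, 5, 6, 4, 12]
    new_drug = [3, 6, 4, 2, 1]

    res = mannwhitneyu(placebo, new_drug, alternative="two-sided")
    # SciPy's statistic is U for the first sample (here 22).
    u = min(res.statistic, len(placebo) * len(new_drug) - res.statistic)
    print(u)  # 3.0; since 3 > 2 (the critical value), we do not reject H0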

Table of Critical Values for U

Example: A new approach to prenatal care is proposed for pregnant women living in a rural
community. The new program involves in-home visits during the course of pregnancy in
addition to the usual or regularly scheduled visits. A pilot randomized trial with 15 pregnant
women is designed to evaluate whether women who participate in the program deliver
healthier babies than women receiving usual care. The outcome is the APGAR score measured
5 minutes after birth. Recall that APGAR scores range from 0 to 10 with scores of 7 or higher
considered normal (healthy), 4-6 low and 0-3 critically low. The data are shown below.

Usual Care 8 7 6 2 5 8 7 3
New Program 9 9 7 8 10 9 6

Is there statistical evidence of a difference in APGAR scores in women receiving the new and
enhanced versus usual prenatal care? We run the test using the five-step approach.

Step 1. Set up hypotheses and determine level of significance.


H0: The two populations are equal versus
H1: The two populations are not equal. α =0.05

Step 2. Select the appropriate test statistic.


Because APGAR scores are not normally distributed and the samples are small (n1=8 and n2=7), we use the Mann Whitney U test. The test statistic is U, the smaller of

U1 = n1 n2 + n1(n1 + 1)/2 – R1 and U2 = n1 n2 + n2(n2 + 1)/2 – R2

where R1 and R2 are the sums of the ranks in groups 1 and 2, respectively.

Step 3. Set up decision rule.

The appropriate critical value can be found in the table above. To determine the appropriate
critical value we need sample sizes (n1=8 and n2=7) and our two-sided level of significance
(α=0.05). The critical value for this test with n1=8, n2=7 and α =0.05 is 10 and the decision
rule is as follows: Reject H0 if U < 10.

Step 4. Compute the test statistic.

The first step is to assign ranks of 1 through 15 to the smallest through largest values in the
total sample, as follows:

Original Sample              Ordered (Smallest to Largest)        Ranks
Usual Care   New Program     Usual Care   New Program             Usual Care   New Program
8            9               2            -                       1            -
7            9               3            -                       2            -
6            7               5            -                       3            -
2            8               6            6                       4.5          4.5
5            10              7            7                       7            7
8            9               7            -                       7            -
7            6               8            8                       10           10
3            -               8            -                       10           -
-            -               -            9                       -            13
-            -               -            9                       -            13
-            -               -            10                      -            15
                                                                  R1 = 44.5    R2 = 75.5

Next, we sum the ranks in each group. In the usual care group, the sum of the ranks is R1 = 44.5 and in the new program group, the sum of the ranks is R2 = 75.5. Recall that the sum of the ranks will always equal n(n+1)/2. As a check on our assignment of ranks, we have n(n+1)/2 = 15(16)/2 = 120, which is equal to 44.5 + 75.5 = 120.
We now compute U1 and U2, as follows:

U1 = 8(7) + 8(9)/2 – 44.5 = 56 + 36 – 44.5 = 47.5
U2 = 8(7) + 7(8)/2 – 75.5 = 56 + 28 – 75.5 = 8.5

Thus, the test statistic is U = 8.5.

Step 5. Conclusion:

We reject H0 because 8.5 < 10. We have statistically significant evidence at α = 0.05 to show that the populations of APGAR scores are not equal in women receiving usual prenatal care as compared to the new program of prenatal care.
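The same check for this example, again assuming SciPy:

    from scipy.stats import mannwhitneyu

    usual = [8, 7, 6, 2, 5, 8, 7, 3]
    new_program = [9, 9, 7, 8, 10, 9, 6]

    res = mannwhitneyu(usual, new_program, alternative="two-sided")
    u = min(res.statistic, len(usual) * len(new_program) - res.statistic)
    print(u)  # 8.5, below the critical value of 10 -> reject H0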

Example: A clinical trial is run to assess the effectiveness of a new anti-retroviral therapy for
patients with HIV. Patients are randomized to receive a standard anti-retroviral therapy (usual
care) or the new anti-retroviral therapy and are monitored for 3 months. The primary outcome
is viral load which represents the number of HIV copies per milliliter of blood. A total of 30
participants are randomized and the data are shown below.

Stand
ard 75 80 20 55 12 10 22 68 34 63 91 97 10 67
400
Thera 00 00 00 0 50 00 50 00 00 00 00 0 40 0
py
New
40 25 80 14 80 74 10 60 92 14 27 42 52 41 undetect
Thera
0 0 0 00 00 00 20 00 0 20 00 00 00 00 able
py
Is there statistical evidence of a difference in viral load in patients receiving the standard versus
the new anti-retroviral therapy?

Step 1. Set up hypotheses and determine level of significance.


H0: The two populations are equal versus
H1: The two populations are not equal. α=0.05

Step 2. Select the appropriate test statistic.

Because viral load measures are not normally distributed (with outliers as well as limits of
detection (e.g., "undetectable")), we use the Mann-Whitney U test. The test statistic is U, the
smaller of

Where R1 and R2 are the sums of the ranks in groups 1 and 2, respectively.

Step 3. Set up the decision rule.

The critical value can be found in the table of critical values based on sample sizes (n1 = n2 = 15) and a two-sided level of significance (α = 0.05). The critical value is 64, and the decision rule is as follows: Reject H0 if U < 64.

Step 4. Compute the test statistic.

The first step is to assign ranks of 1 through 30 to the smallest through largest values in the
total sample. Note in the table below, that the "undetectable" measurement is listed first in the
ordered values (smallest) and assigned a rank of 1.

Original Sample            Ordered (Smallest to Largest)        Ranks
Standard     New           Standard     New                     Standard     New
7500         400           -            undetectable            -            1
8000         250           -            250                     -            2
2000         800           400          400                     3.5          3.5
550          1400          550          -                       5            -
1250         8000          670          -                       6            -
1000         7400          -            800                     -            7
2250         1020          -            920                     -            8
6800         6000          970          -                       9            -
3400         920           1000         -                       10           -
6300         1420          -            1020                    -            11
9100         2700          1040         -                       12           -
970          4200          1250         -                       13           -
1040         5200          -            1400                    -            14
670          4100          -            1420                    -            15
400          undetectable  2000         -                       16           -
                           2250         -                       17           -
                           -            2700                    -            18
                           3400         -                       19           -
                           -            4100                    -            20
                           -            4200                    -            21
                           -            5200                    -            22
                           -            6000                    -            23
                           6300         -                       24           -
                           6800         -                       25           -
                           -            7400                    -            26
                           7500         -                       27           -
                           8000         8000                    28.5         28.5
                           9100         -                       30           -
                                                                R1 = 245     R2 = 220

Next, we sum the ranks in each group. In the standard anti-retroviral therapy group, the sum
of the ranks is R1=245; in the new anti-retroviral therapy group, the sum of the ranks is
R2=220. Recall that the sum of the ranks will always equal n(n+1)/2. As a check on our
assignment of ranks, we have n(n+1)/2 = 30(31)/2 = 465, which is equal to 245 + 220 = 465. We now compute U1 and U2, as follows:

U1 = 15(15) + 15(16)/2 – 245 = 225 + 120 – 245 = 100
U2 = 15(15) + 15(16)/2 – 220 = 225 + 120 – 220 = 125

Thus, the test statistic is U = 100.

Step 5. Conclusion.

We do not reject H0 because 100 > 64. We do not have sufficient evidence to conclude that the
treatment groups differ in viral load.

Wilcoxon Signed Rank Test

This section describes nonparametric tests to compare two groups with respect to a continuous
outcome when the data are collected on matched or paired samples. The parametric procedure
for doing this was presented in the modules on hypothesis testing for the situation in which the
continuous outcome was normally distributed. This section describes procedures that should be
used when the outcome cannot be assumed to follow a normal distribution. There are two
popular nonparametric tests to compare outcomes between two matched or paired groups. The
first is called the Sign Test and the second the Wilcoxon Signed Rank Test.

Recall that when data are matched or paired, we compute difference scores for each individual
and analyze difference scores. The same approach is followed in nonparametric tests. In
parametric tests, the null hypothesis is that the mean difference (μd) is zero. In nonparametric
tests, the null hypothesis is that the median difference is zero.

Example:

Consider a clinical investigation to assess the effectiveness of a new drug designed to reduce
repetitive behaviors in children affected with autism. If the drug is effective, children will
exhibit fewer repetitive behaviors on treatment as compared to when they are untreated. A
total of 8 children with autism enroll in the study. Each child is observed by the study
psychologist for a period of 3 hours both before treatment and then again after taking the new
drug for 1 week. The time that each child is engaged in repetitive behavior during each 3 hour
observation period is measured. Repetitive behavior is scored on a scale of 0 to 100 and scores
represent the percent of the observation time in which the child is engaged in repetitive
behavior. For example, a score of 0 indicates that during the entire observation period the child
did not engage in repetitive behavior while a score of 100 indicates that the child was
constantly engaged in repetitive behavior. The data are shown below.

Child Before Treatment After 1 Week of Treatment


1 85 75
2 70 50
3 40 50
4 65 40
5 80 20
6 75 65
7 55 40
8 20 25

Looking at the data, it appears that some children improve (e.g., Child 5 scored 80 before treatment and 20 after treatment), but some got worse (e.g., Child 3 scored 40 before treatment and 50 after treatment). Is there statistically significant improvement in repetitive behavior after 1 week of treatment?

Because the before and after treatment measures are paired, we compute difference scores for
each child. In this example, we subtract the assessment of repetitive behaviors after treatment
from that measured before treatment so that difference scores represent improvement in
repetitive behavior. The question of interest is whether there is significant improvement after
treatment.

Child   Before Treatment   After 1 Week of Treatment   Difference (Before - After)
1       85                 75                          10
2       70                 50                          20
3       40                 50                          -10
4       65                 40                          25
5       80                 20                          60
6       75                 65                          10
7       55                 40                          15
8       20                 25                          -5

In this small sample, the observed difference (or improvement) scores vary widely and are
subject to extremes (e.g., the observed difference of 60 is an outlier). Thus, a nonparametric test
is appropriate to test whether there is significant improvement in repetitive behavior before
versus after treatment. The hypotheses are given below.

H0: The median difference is zero versus


H1: The median difference is positive α=0.05

In this example, the null hypothesis is that there is no difference in scores before versus after
treatment. If the null hypothesis is true, we expect to see some positive differences
(improvement) and some negative differences (worsening). If the research hypothesis is true, we
expect to see more positive differences after treatment as compared to before.
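A hedged sketch of how this test could be run, assuming SciPy (scipy.stats.wilcoxon performs the Wilcoxon signed rank test on paired data; the conclusion is left to the computed p-value):

    from scipy.stats import wilcoxon

    before = [85, 70, 40, 65, 80, 75, 55, 20]
    after = [75, 50, 50, 40, 20, 65, 40, 25]

    # alternative="greater" tests whether the median of (before - after)
    # is positive, i.e. whether repetitive behavior improved on treatment.
    res = wilcoxon(before, after, alternative="greater")
    print(res.statistic, res.pvalue)  # statistic is the sum of positive ranks;
                                      # a p-value below .05 would favour H1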

Rank Sum Test

When the requirements for the t-test for two independent samples are not satisfied, the
Wilcoxon Rank-Sum non-parametric test can often be used provided the two independent
samples are drawn from populations with an ordinal distribution.

For this test we use the following null hypothesis:

H0: the observations come from the same population

From a practical point of view, this implies:

H0: if one observation is made at random from each population (call them x0 and y0), then the
probability that x0 > y0 is the same as the probability that x0 < y0, and so the populations for
each sample have the same medians.

The Wilcoxon Rank Sum test can be used to test the null hypothesis that two populations X
and Y have the same continuous distribution. We assume that we have independent random
samples x1, x2, . . ., xm and y1, y2, . . ., yn, of sizes m and n respectively, from each population.
We then merge the data and rank each measurement from lowest to highest. All sequences of
ties are assigned an average rank.

The Wilcoxon test statistic W is the sum of the ranks from population X. Assuming that the
two populations have the same continuous distribution (and no ties occur), then W has a mean
and standard deviation given by

µ = m(m + n + 1) / 2

and

s = √[ m n (N + 1) / 12 ]

where N = m + n.

We test the null hypothesis H0: no difference in distributions. A one-sided alternative is Ha: the first population yields lower measurements. We use this alternative if we expect or see that W is unusually lower than its expected value µ. In this case, the p-value is given by a normal approximation: letting W* ~ N(µ, s), we compute the left-tail probability P(W* ≤ W) (using a continuity correction if W is an integer).

If we expect or see that W is much higher than its expected value, then we should use the alternative Ha: the first population yields higher measurements. In this case, the p-value is given by the right-tail probability P(W* ≥ W), again using a continuity correction if needed. If the two sums of ranks from each population are close, then we could use a two-sided alternative Ha: there is a difference in distributions. In this case, the p-value is given by twice the smaller tail value (2 P(W* ≤ W) if W < µ, or 2 P(W* ≥ W) if W > µ).

Wilcoxon Signed Rank Sum Test

The Wilcoxon signed rank sum test is another example of a non-parametric or distribution-free test. As for the sign test, the Wilcoxon signed rank sum test is used to test the null hypothesis that the median of a distribution is equal to some value. It can be used (a) in place of a one-sample t-test, (b) in place of a paired t-test, or (c) for ordered categorical data where a numerical scale is inappropriate but where it is possible to rank the observations.

Case 1: Paired data

1. State the null hypothesis - in this case it is that the median difference, M, is equal to zero.
2. Calculate each paired difference, di = xi − yi, where xi, yi are the pairs of observations.
3. Rank the dis, ignoring the signs (i.e. assign rank 1 to the smallest |di|, rank 2 to the next
etc.)
4. Label each rank with its sign, according to the sign of di.
5. Calculate W+, the sum of the ranks of the positive dis, and W−, the sum of the ranks of the negative dis. (As a check, the total W+ + W− should be equal to n(n+1)/2, where n is the number of pairs of observations in the sample.)

Case 2: Single set of observations

1. State the null hypothesis - the median value is equal to some value M.
2. Calculate the difference between each observation and the hypothesised median, di = xi −
M.
3. Apply Steps 3-5 as above.

Under the null hypothesis, we would expect the distribution of the differences to be approximately symmetric around zero and the distribution of positives and negatives to be distributed at random among the ranks. Under this assumption, it is possible to work out the exact probability of every possible outcome for W. To carry out the test, we therefore proceed as follows:

6. Choose W = min (W −, W +).


7. Use tables of critical values for the Wilcoxon signed rank sum test to find the probability
of observing a value of W or more extreme. Most tables give both one-sided and two-
sided p-values. If not, double the one-sided p-value to obtain the two-sided p-value. This
is an exact test.

Normal approximation

For larger samples, W+ (or W−) is approximately normally distributed with mean n(n + 1)/4 and variance n(n + 1)(2n + 1)/24, so z = [W − n(n + 1)/4] / √[n(n + 1)(2n + 1)/24] can be referred to the standard normal distribution.

Dealing with ties: There are two types of tied observations that may arise when using the
Wilcoxon signed rank test:

• Observations in the sample may be exactly equal to M (i.e. 0 in the case of paired
differences). Ignore such observations and adjust n accordingly.
• Two or more observations/differences may be equal. If so, average the ranks across the tied observations and reduce the variance by (t³ – t)/48 for each group of t tied ranks.

Example:

The table below shows the hours of relief provided by two analgesic drugs in 12 patients
suffering from arthritis. Is there any evidence that one drug provides longer relief than the
other?

Case Drug A Drug B Case Drug A Drug B


1 2.0 3.5 7 14.9 16.7
2 3.6 5.7 8 6.6 6.0
3 2.6 2.9 9 2.3 3.8
4 2.6 2.4 10 2.0 4.0
5 7.3 9.9 11 6.8 9.1
6 3.4 3.3 12 8.5 20.9

Solution:

1. In this case our null hypothesis is that the median difference is zero.
2. Our actual differences (Drug B - Drug A) are:
+1.5, +2.1, +0.3, −0.2, +2.6, −0.1, +1.8, −0.6, +1.5, +2.0, +2.3, +12.4

Our actual median difference is 1.65 hours.

3. Ranking the differences and affixing a sign to each rank (steps 3 and 4 above):

Diff. 0.1 0.2 0.3 0.6 1.5 1.5 1.8 2.0 2.1 2.3 2.6 12.4

Rank 1 2 3 4 5.5 5.5 7 8 9 10 11 12

Sign - - + - + + + + + + + +

Calculating W + and W − gives:

W− =1+2+4=7 W+ =3+5.5+5.5+7+8+9+10+11+12=71

As a check, we have W+ + W− = 12 × 13/2 = 78. For the normal approximation we take W = max(W−, W+) = 71, with mean n(n + 1)/4 = 39 and variance n(n + 1)(2n + 1)/24 = 162.5.

We can use a normal approximation in this case. We have one group of 2 tied ranks, so we must reduce the variance by (2³ − 2)/48 = 0.125. We get:

z = (71 − 39) / √(162.5 − 0.125) ≈ 2.51

This gives a two-sided p-value of p = 0.012. There is strong evidence that Drug B provides
more relief than Drug A.
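A sketch that mirrors the hand computation above, assuming SciPy for ranking and the normal tail probability:

    import math
    from scipy.stats import norm, rankdata

    diffs = [1.5, 2.1, 0.3, -0.2, 2.6, -0.1, 1.8, -0.6, 1.5, 2.0, 2.3, 12.4]
    ranks = rankdata([abs(d) for d in diffs])                # average ranks for ties
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)   # 71.0
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)  # 7.0

    n = len(diffs)
    mean = n * (n + 1) / 4                          # 39.0
    var = n * (n + 1) * (2 * n + 1) / 24 - 0.125    # 162.5 minus the tie adjustment
    z = (w_plus - mean) / math.sqrt(var)            # ≈ 2.51
    print(2 * norm.sf(z))                           # two-sided p ≈ 0.012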

Run Test

A runs test is a statistical procedure that examines whether a string of data is occurring
randomly from a specific distribution. The runs test analyzes the occurrence of similar events
that are separated by events that are different.

Running a Test of Randomness is a non-parametric method that is used in cases when the
parametric test is not in use. In this test, two different random samples from different
populations with different continuous cumulative distribution functions are obtained. Running
a test for randomness is carried out in a random model in which the observations vary around a
constant mean. The observation in the random model in which the run test is carried out has a
constant variance, and the observations are also probabilistically independent. The run in a run
test is defined as a consecutive sequence of ones and twos. The test checks whether or not the number of runs is appropriate for a randomly generated series. The observations from the two independent samples are ranked in increasing order, each value is coded as a 1 or 2 according to the sample it comes from, and the total number of runs is used as the test statistic. Small values suggest different populations, and large values suggest identical populations (i.e., that the arrangement of the values is random). The Wald-Wolfowitz run test is commonly used.

Assumptions:

• Data is collected from two independent groups.


• If the run test is being tested for randomness, then it is assumed that the data should
enter in the dataset as an ordered sample, increasing in magnitude. This means that for
carrying-out the run test for randomness, there should not be any groupings or other
pre-processing.
• If the run test is carried out in SPSS, then it is assumed that the variables that are being
tested in the run test should be of numeric type. This means that if the test variables
are of the string type, then the variables must be coded as numbers in order to make
those variables of the numeric type.
• Generally, in non-parametric tests, no underlying distribution is assumed. This holds
for the run test as well, but if the number of observations is more than twenty, then it is
assumed (in the run test) that the underlying distribution would be normal and would
have the mean and variance that is given by the formulas.

Null Hypothesis: The order of the ones and twos is random.

Alternative Hypothesis: The order of ones and twos is not random.

This checking is done in the following manner:

Let us consider that ‘H’ denotes the number of observations. The ‘Ha‘ is considered to be the
number that falls above the mean, and ‘Hb‘ is considered to be the number that falls below the
mean. The ‘R’ is considered to be the observed number of runs. After considering these
symbols, then the probability of the observed number of runs is derived.

Formulas for the mean and the variance of the observed number of runs:

E(R) = (2 Ha Hb / H) + 1

V(R) = 2 Ha Hb (2 Ha Hb – H) / [ H² (H – 1) ]

The researcher should note that in the run test for the random type of model, if the number of observations is larger than twenty, then the distribution of the observed number of runs approximately follows the normal distribution. The value of the standard normal variate of the observed number of runs in the run test is given by the following:

Z = [R – E(R)] / SD(R)

This follows the normal distribution with mean zero and variance 1, i.e., the standard normal distribution that the Z variate must follow.

Benefits of a Runs Test

The runs test model is important in determining whether an outcome of a trial is truly random,
especially in cases where random versus sequential data has implications for subsequent

theories and analysis. A runs test can be a valuable tool for investors who employ technical
analysis to make their trading decisions. These traders analyze statistical trends, such as price
movement and volume, to spot potentially profitable trading opportunities. It's important for
these traders to understand the underlying variables that could be impacting price movement
and a runs test can help with this.

Two powerful ways traders can use a runs test include:

• Testing the randomness of distribution, by taking the data in the given order and
marking with a plus (+) the data greater than the median, and with a minus (-) the data
less than the median (numbers equalling the median are omitted.)
• Testing whether a function fits well to a data set, by marking the data exceeding the
function value with + and the other data with −. For this use, the runs test, which takes
into account the signs but not the distances, is complementary to the chi-square test,
which takes into account the distances but not the signs.

A run is defined as a series of increasing values or a series of decreasing values. The number of increasing or decreasing values is the length of the run. In a random data set, the probability that the (i+1)th value is larger or smaller than the ith value follows a binomial distribution, which forms the basis of the runs test.

The first step in the runs test is to count the number of runs in the data sequence. There are
several ways to define runs in the literature, however, in all cases the formulation must produce
a dichotomous sequence of values. For example, a series of 20 coin tosses might produce the
following sequence of heads (H) and tails (T).

HHTTHTHHHHTHHTTTTTHH

The number of runs for this series is nine. There are 11 heads and 9 tails in the sequence.

We will code values above the median as positive and values below the median as negative. A
run is defined as a series of consecutive positive (or negative) values. The runs test is defined as:

H0: the sequence was produced in a random manner

Ha: the sequence was not produced in a random manner

Test Statistic: The test statistic is

Z = (R – R̄) / sR

where R is the observed number of runs, R̄ is the expected number of runs, and sR is the standard deviation of the number of runs. With n1 observations of one type and n2 of the other, R̄ and sR are computed as follows:

R̄ = 1 + 2 n1 n2 / (n1 + n2)

sR² = 2 n1 n2 (2 n1 n2 – n1 – n2) / [(n1 + n2)² (n1 + n2 – 1)]
The runs test rejects the null hypothesis if

|Z| > Z1-α/2

For a large-sample runs test (where n1 > 10 and n2 > 10), the test statistic is compared to a
standard normal table. That is, at the 5 % significance level, a test statistic with an absolute
value greater than 1.96 indicates non-randomness. For a small-sample runs test, there are
tables to determine critical values that depend on values of n1 and n2.

Example

Let X and Y denote the times in hours per week that students in two different schools watch television. Let F(x) and G(y) denote the respective distribution functions. To test the null hypothesis:

H0: F(z) = G(z)

a random sample of eight students was selected from each school, yielding the following results. [Data table omitted.]
What conclusion should we make about the equality of the two distribution functions?

Solution: The x and y values are pooled into a single combined ordered sample. [Combined ordered sample omitted.]

Counting, we see that there are 9 runs. We should reject the null hypothesis if the number of runs is smaller than expected; therefore, the critical region should be of the form r ≤ c. In order to determine what the value of c should be, we need to know something about the p.m.f. of R. For n1 values of one kind and n2 values of the other, the p.m.f. of R is:

P(R = 2k) = 2 C(n1 – 1, k – 1) C(n2 – 1, k – 1) / C(n1 + n2, n1)

P(R = 2k + 1) = [C(n1 – 1, k – 1) C(n2 – 1, k) + C(n1 – 1, k) C(n2 – 1, k – 1)] / C(n1 + n2, n1)

where C(n, r) denotes a binomial coefficient. With n1 = 8 and n2 = 8, these formulas give P(R = 2) = 0.00016, P(R = 3) = 0.00109, P(R = 4) = 0.00761, P(R = 5) = 0.02284 and P(R = 6) = 0.06853. Adding these up:

P(R ≤ 6) = 0.00016 + 0.00109 + 0.00761 + 0.02284 + 0.06853 = 0.1002

That tells us that if we set c = 6, we would have a 0.1002 probability of committing a Type I error. That seems reasonable, so let us decide to reject the null hypothesis of the equality of the two distribution functions if the observed number of runs r ≤ 6. Is our observed number of runs in this critical region? It is not: we observed 9 runs. Therefore, we fail to reject the null hypothesis at the 0.10 level. There is insufficient evidence at the 0.10 level to conclude that the distribution functions are not equal.
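
The exact probabilities used above can be verified with a few lines of Python built on the same p.m.f. (a sketch; the function name is ours):

    from math import comb

    def pmf_runs(r, n1, n2):
        # Exact p.m.f. of the number of runs R when n1 values of one kind
        # and n2 of the other are pooled and ordered.
        total = comb(n1 + n2, n1)
        if r % 2 == 0:                     # even number of runs: r = 2k
            k = r // 2
            return 2 * comb(n1 - 1, k - 1) * comb(n2 - 1, k - 1) / total
        k = (r - 1) // 2                   # odd number of runs: r = 2k + 1
        return (comb(n1 - 1, k - 1) * comb(n2 - 1, k)
                + comb(n1 - 1, k) * comb(n2 - 1, k - 1)) / total

    # P(R <= 6) for n1 = n2 = 8; prints approximately 0.1002
    print(sum(pmf_runs(r, 8, 8) for r in range(2, 7)))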

A Large-Sample Test

As our work in the previous example illustrates, conducting even a single run test can be quite extensive on the calculation front. Is there an easier way? Fortunately, yes, provided n1 and n2 are large. Typically, we consider the samples to be large if n1 and n2 are each at least 10. If the samples are large, then the distribution of R can be approximated by a normally distributed random variable. That is, it can be shown that

Z = (R – R̄) / sR

is approximately standard normal, with R̄ and sR as defined above.

Example

A charter bus line has 48-passenger buses and 38-passenger buses. With X and Y denoting the
number of miles travelled per day for the 48-passenger and 38-passenger buses, respectively,
the bus company is interested in testing the equality of the two distributions:

H0: F(z)=G(z)

The company observed the following data on a random sample of n1 = 10 buses holding 48 passengers and n2 = 11 buses holding 38 passengers. [Data table omitted.]
Using the normal approximation to R, conduct the hypothesis test at the 0.05 level.
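
Although the bus data themselves are not reproduced here, the large-sample procedure is mechanical; the following Python sketch (our own function and variable names, assuming no ties across the two samples) carries out the whole computation for any two samples:

    from math import sqrt

    def wald_wolfowitz_z(x, y):
        # Large-sample Wald-Wolfowitz runs test of H0: F = G.
        # Pool the samples, order them, and count runs of sample labels.
        pooled = sorted([(v, 'x') for v in x] + [(v, 'y') for v in y])
        labels = [label for _, label in pooled]
        r = 1 + sum(1 for i in range(1, len(labels)) if labels[i] != labels[i - 1])
        n1, n2 = len(x), len(y)
        mean_r = 1 + 2 * n1 * n2 / (n1 + n2)
        var_r = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
                 / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
        return r, (r - mean_r) / sqrt(var_r)

A number of runs well below its expectation (a large negative z) argues against H0; at the 0.05 level, |z| > 1.96 leads to rejection under the two-sided rule given earlier.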

K-W Test

The Kruskal-Wallis test is the non-parametric alternative to the one-way ANOVA. Non-parametric means that the test doesn't assume your data come from a particular distribution. The H test is used when the assumptions for ANOVA aren't met (like the assumption of normality). It is sometimes called the one-way ANOVA on ranks, as the ranks of the data values are used in the test rather than the actual data points.

The test determines whether the medians of two or more groups are different. Like most
statistical tests, you calculate a test statistic and compare it to a distribution cut-off point. The
test statistic used in this test is called the H statistic. The hypotheses for the test are:

H0: population medians are equal.

H1: population medians are not equal.

The Kruskal-Wallis test will tell you if there is a significant difference between groups. However, it won't tell you which groups are different. For that, you'll need to run a post hoc test.

Examples

• You want to find out how test anxiety affects actual test scores. The independent
variable “test anxiety” has three levels: no anxiety, low-medium anxiety and high
anxiety. The dependent variable is the exam score, rated from 0 to 100%.
• You want to find out how socioeconomic status affects attitude towards sales tax
increases. Your independent variable is “socioeconomic status” with three levels:
working class, middle class and wealthy. The dependent variable is measured on a 5-
point Likert scale from strongly agree to strongly disagree.

Assumptions for the Kruskal-Wallis Test

• One independent variable with two or more levels (independent groups). The test is
more commonly used when you have three or more levels. For two levels, consider
using the Mann Whitney U Test instead.
• Ordinal, interval or ratio scale dependent variable.
• Your observations should be independent. In other words, there should be no
relationship between the members in each group or between groups. For more
information on this point, see: Assumption of Independence.
• All groups should have the same shape distributions. Most software (e.g. SPSS, Minitab) will test for this condition as part of the test.

Sample question: A shoe company wants to know if three groups of workers have different
salaries:

Women: 23K, 41K, 54K, 66K, 90K.

Men: 45K, 55K, 60K, 70K, 72K.
Minorities: 20K, 30K, 34K, 40K, 44K.

Step 1: Sort the data for all groups/samples into ascending order in one combined set.
20K, 23K, 30K, 34K, 40K, 41K, 44K, 45K, 54K, 55K, 60K, 66K, 70K, 72K, 90K

Step 2: Assign ranks to the sorted data points. Give tied values the average rank.
20K 1
23K 2
30K 3
34K 4
40K 5
41K 6
44K 7
45K 8
54K 9
55K 10
60K 11
66K 12
70K 13
72K 14
90K 15

Step 3: Add up the different ranks for each group/sample.

Women: 23K, 41K, 54K, 66K, 90K = 2 + 6 + 9 + 12 + 15 = 44.


Men: 45K, 55K, 60K, 70K, 72K = 8 + 10 + 11 + 13 + 14 = 56.
Minorities: 20K, 30K, 34K, 40K, 44K = 1 + 3 + 4 + 5 + 7 = 20.

Step 4: Calculate the H statistic:

H = [12 / (n(n + 1))] × Σ (Tj² / nj) – 3(n + 1)

Where:
• n = sum of the sample sizes for all samples,
• c = number of samples,
• Tj = sum of the ranks in the jth sample,
• nj = size of the jth sample.

Here n = 15, so H = [12 / (15 × 16)] × (44²/5 + 56²/5 + 20²/5) – 3 × 16 = 54.72 – 48 = 6.72.

Step 5: Find the critical chi-square value, with c-1 degrees of freedom. For 3 – 1 degrees of
freedom and an alpha level of .05, the critical chi square value is 5.9915.
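
If a χ² table is not at hand, the critical value can be obtained programmatically; for instance, with scipy:

    from scipy import stats

    # Upper 5% critical value of the chi-square distribution with 2 df
    print(stats.chi2.ppf(0.95, df=2))   # prints approximately 5.9915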

Step 6: Compare the H value from Step 4 to the critical chi-square value from Step 5.

If the critical chi-square value is less than the H statistic, reject the null hypothesis that the
medians are equal.

If the chi-square value is not less than the H statistic, there is not enough evidence to suggest
that the medians are unequal.

In this case, 5.9915 is less than 6.72, so you can reject the null hypothesis.
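
The same calculation can be reproduced in Python with scipy's kruskal function (a sketch; the variable names are ours). It returns the H statistic together with a p-value from the chi-square approximation:

    from scipy import stats

    women = [23, 41, 54, 66, 90]        # salaries in K
    men = [45, 55, 60, 70, 72]
    minorities = [20, 30, 34, 40, 44]

    # Kruskal-Wallis H test across the three independent groups
    h_stat, p_value = stats.kruskal(women, men, minorities)
    print(h_stat, p_value)   # H = 6.72, p ≈ 0.035 < 0.05: reject the null hypothesis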

Analysis Steps

Step 1: Set up the hypothesis test

The null hypothesis, Ho: The k distributions are identical given k different sets of
measurements.

The alternative hypothesis, Ha: At least one of the k distributions is different than the others.

Note: the test does not indicate which group or how many are different.

Step 2: Determine the Test Statistic: The test statistic is calculated as

H = [12 / (nT(nT + 1))] × Σ (Ti² / ni) – 3(nT + 1)

where ni is the number of measurements from sample i,

nT is the total sample size across all sets of measurements,

and Ti is the sum of the ranks in sample i after assignment of ranks across the combined sample.

Step 3: Determine the Rejection Region

Given a confidence level, C, let α = 1 – C. Reject Ho if H exceeds the critical value of χ² at significance level α with df = k – 1.

If there is a large number of ties in the data, calculate the corrected statistic H′:

H′ = H / [1 – Σ (ti³ – ti) / (nT³ – nT)]

where ti is the number of measurements in the ith group of tied ranks.

Example: The data below are bearing times to failure (in months) for three machines.

Machine A Machine B Machine C
12 14 9
19 20 14
26 14 11
23 16 8
20 22
29

Set up the Hypothesis Test

Ho: There is no difference in bearing time to failure across the three machines.

Ha: At least one machine bearing lifetime is different than the others.

Compute the Test Statistic

Combine the data in rank order and assign ranks.

Combined Data Rank Machine


8 1 C
9 2 C
11 3 C
12 4 A
14 6 C
14 6 B
14 6 B
16 8 B
19 9 A
20 10.5 A
20 10.5 B
22 12 B
23 13 A
26 14 A
29 15 A

For ties, the rank is the average of the span of ranks the group would occupy. For example, in
the data, there are three bearings that failed after 14 months. The three values would receive
ranks of 5, 6, and 7, therefore, use the average of the three rank values, or 6 in this case.

Now, sort the data back into the three groups and determine the rank sum for each machine. The values in parentheses are the rank values for the measurements.

Machine A Machine B Machine C

12 (4) 14 (6) 9 (2)

19 (9) 20 (10.5) 14 (6)

26 (14) 14 (6) 11 (3)

23 (13) 16 (8) 8 (1)

20 (10.5) 22 (12)

29 (15)

Rank sums (Ti): 65.5 42.5 12

Compute H:

H = [12 / (15 × 16)] × (65.5²/6 + 42.5²/5 + 12²/4) – 3 × 16 ≈ 55.61 – 48 = 7.61

Note: only 5 measurements are involved in ties here. In general, use the corrected statistic H′ when more than half the values are involved in ties.

Determine the Rejection Region: The critical value of the χ² distribution with α = 0.05 and df = k – 1 = 2, from a χ² table, is 5.991.

Conclusion: Since the test statistic (7.61) is greater than the critical value, i.e. it falls in the rejection region, we conclude that at least one of the machines wears out bearings at a different rate than the others.

A box plot may provide additional information and is a good way to visualize the data from the
three machines.
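
For the machine data, the same scipy call applies (a sketch, with our own variable names). Note that scipy applies the tie correction H′ automatically, so its statistic comes out slightly above the uncorrected 7.61:

    from scipy import stats

    machine_a = [12, 19, 26, 23, 20, 29]   # bearing time to failure, in months
    machine_b = [14, 20, 14, 16, 22]
    machine_c = [9, 14, 11, 8]

    h_stat, p_value = stats.kruskal(machine_a, machine_b, machine_c)
    print(h_stat, p_value)   # H' ≈ 7.68, p ≈ 0.02 < 0.05: reject the null hypothesis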

Unit – 4
Report Writing

Meaning and Purpose of a Research Report:

A research report is a formal statement of the research process and its results. It narrates the
problem studied, methods used for studying it and the findings and conclusions of the study.
The purpose of a research report is to communicate to interested persons the methodology and
the results of the study in such a manner as to enable them to understand the research process
and to determine the validity of the conclusions. The aim of the report is not to convince the
reader of the value of the result, but to convey to him what was done, why it was done, and what
was its outcome. It is so written that the reader himself can reach his own conclusions as to the
adequacy of the study and the validity of the reported results and conclusions.

Characteristics of a Report:

A research report is a narrative but authoritative document on the outcome of a research effort.
It presents highly specific information for a clearly designated audience. As a form of communication it is non-persuasive; extra caution is shown in advocating a course of action, and any such advocacy is subordinated to the matter being presented. It is a simple, readable and accurate form of communication.

Functions of a Research Report:

A well written research report performs several functions:-

1. It serves as a means for presenting the problem studied, the methods and techniques used for collecting and analysing data, and the findings, conclusions and recommendations in an organised manner.
2. It serves as a basic reference material for future use in developing research proposals in
the same or related area.
3. A report serves as a means for judging the quality of the completed research project.
4. It is a means for evaluating the researcher’s ability and competence to do research.
5. It provides factual base for formulating policies and strategies relating to the subject
matter studied.
6. It provides systematic knowledge on problems and issues analysed.

Purpose of Report Writing

• The report may be meant for the people in general, when the investigation has not been carried out at the instance of any third party. Research is essentially a cooperative venture and it is essential that every investigator should know what others have found about the phenomena under study. The purpose of a report is thus the dissemination of knowledge, the broadcasting of generalisations so as to ensure their widest use.

• A report of research has only one function, “it must inform”. It has to propagate
knowledge. Thus, the purpose of a report is to convey to the interested persons the
results and findings of the study in sufficient detail, and so arranged as to enable each
reader to comprehend the data, and to determine for himself the validity of conclusions.
Research results must invariably enter the general store of knowledge. A research
report is always an addition to knowledge. All this explains the significance of writing a
report.
• In a broader sense, report writing is common to both academics and organisations. However, the purpose may be different. In academics, reports are used for comprehensive and application-oriented learning, whereas in organisations reports form the basis for decision making.

Types of Reports:

Research reports may be classified into (a) technical report (b) popular report, (c) interim
report, (d) summary report, (e) oral presentation, (f) research abstract, and (g) research article.
These types of reports vary from one another in terms of the degree of formality, physical form,
scope, style and size.

(a) Technical Report/Thesis:-

This is a comprehensive full report of the research process and its outcome. It is primarily meant for the academic community, i.e. the scientists of the researcher's discipline and other researchers. It is a formal long report covering all the aspects of the research process: a description of the problem studied, the objectives of the study, the methods and techniques used, a detailed account of sampling, field and other research procedures, sources of data, tools of data collection, methods of data processing and analysis, detailed findings and conclusions, and suggestions. There is also a technical appendix for methodological details, copies of measuring instruments and the like. It is so comprehensive and complete that the study can be replicated by others. The technical report is essentially technical in nature and scope and couched in technical language. It follows a specified pattern and consists of several prefatory sections with appropriate headings and paragraphs.

(b) Popular Report:-

This type of report is designed for an audience of executives/administrators and other non-technical users. The requirements of this audience are different. The reader is less concerned with methodological details but more interested in studying quickly the major findings and conclusions. He is interested in applying the findings to decisions. The organisation of this report is very important. The presentation can be more forceful and persuasive without, of course, any distortion of fact. It should be clear, brief and straightforward; complicated statistical techniques and tables need not be used. Instead, pictorial devices may be used extensively. The format of this report is different from that of a technical report. After a brief introduction to the problem and the objectives of the study, an abstract of the findings, conclusions and recommendations is presented. The methodological details, data analysis and their discussion are presented in the second part. More headlines, underlining, pictures and graphs may be used. Sentences and paragraphs should be short. There can be a liberal use of margins and blank space. The style may be more journalistic but precise, and it should encourage rapid reading and quick comprehension.

(c) Interim Report:-

When there is a long time lag between data collection and the presentation of the results in the
case of a sponsored project, the study may lose its significance and usefulness and the sponsor
may also lose interest in it. One of the most effective ways to avoid such eventualities is to
present an interim report. This short report may contain either the first results of the analysis
or the final outcome of the analysis of some aspects completely analysed. Whatever may be the
coverage of the interim report, it fulfils certain functions. It enables the sponsoring agency to take action without waiting for the full report. It helps to keep alive the agency's interest in the study and prevents misunderstandings about the delay. In addition, it serves to spread over a
longer period the time consuming process of discussion of research findings and their
implication. The report also enables the researcher to find the appropriate style of reporting.
The interim report contains a narration of what has been done so far and what was its outcome.
It presents a summary of the findings of that part of analysis which has been completed.

(d) Summary Report:-

A summary report is generally prepared for the consumption of the lay audience, viz., the general public. The preparation of this type of report is desirable for any study whose findings are of general interest. It is written in non-technical, simple language with a liberal use of pictorial charts. It just contains a brief reference to the objective of the study, its major findings and their implications. It is a short report of two or three pages, its size so limited as to be suitable for publication in a daily newspaper.

(e) Oral presentation:-

At times oral presentation of the results of the study is considered effective, particularly in
cases where policy recommendations are indicated by project results. The merit of this
approach lies in the fact that it provides an opportunity for give-and-take decisions which
generally lead to a better understanding of the findings and their implications. But the main
demerit of this sort of presentation is the lack of any permanent record concerning the research
details and it may be just possible that the findings may fade away from people’s memory even
before an action is taken. In order to overcome this difficulty, a written report may be
circulated before the oral presentation and referred to frequently during the discussion.

(f) Research Abstract:-

This is a short summary of the technical report. It is usually prepared by a doctoral student on the eve of submitting his thesis. Its copies are sent by the university, along with letters of request, to the examiners invited to evaluate the thesis. It contains a brief presentation of the statement of the problem, the objectives of the study, the methods and techniques used, and an overview of the report. A brief summary of the results of the study may also be added. This abstract is primarily meant for enabling the examiners invited to decide whether the study belongs to the area of their specialisation and interest.

(g) Research Article:-

This is designed for publication in a professional journal. If a study has two or more aspects that can be discussed independently, it may be advisable to write separate articles rather than to crowd too many things into a single article. A research article must be clearly written in concise and unambiguous language. It must be logically organised, progressing from a statement of the problem and the purpose of the study, through the analysis of evidence, to the conclusions and implications. A professional journal may have its own special format for reporting research. It is important to find out in advance whether the publication has specific format requirements. For example, research articles submitted for publication in the Journal of Applied Psychology should be prepared according to the Publication Manual of the American Psychological Association. The preferred format is:

1. Introduction:- A statement of the nature of the problem and a brief review of previous
studies pertinent to the development of the specific questions or hypotheses to be tested.

2. Method:- A brief statement of what was done where and how it was done, and a statement of
the specific techniques and tools used.

3. Results:- A presentation of the salient findings, with tables or charts.

4. Discussion:- A discussion of the findings and their implications.

5. Conclusion:- A presentation of the contribution of the study to theory and/or practice and the broad implications of the findings.

The article must be accompanied by an abstract of 100-150 words typed on a separate sheet of paper. Any reference to an article or other source is to be identified at an appropriate point in the text by the last name of the author, year of publication and pagination where appropriate, all within parentheses, e.g. Sherman (1980); Heller (1976, p. 701). No footnote is to be used for the purpose of citation. All references are to be listed alphabetically by author in an appendix titled 'References', e.g.:

Grove, A.S. (1983) High Output Management, New York: Random House.

Tannenbaum, A. & Schmidt, W. (1958) How to choose a leadership pattern. Harvard Business Review, 36, 95-101.

Similarly, the Indian Society of Agricultural Economics, Mumbai has prescribed guidelines for submission of papers for publication in the Indian Journal of Agricultural Economics. The preferred format is: 1. Introduction, 2. Methodology, 3. Results and discussion, and 4. Policy implications/conclusion, followed by references. Only cited works should be included in the reference list. The style of citation to be followed is:

A.S. Kahlon and K. Singh: Managing Agricultural Finance: Theory and Practice, Allied Publishers Pvt. Ltd., New Delhi, 1984.

Jairam Krishna, “Focus on Wasteland Development: Degradation and Poverty,” The Economic Times, April 13, 1986.

RESEARCH REPORT FORMAT:


In this section, the format of a comprehensive technical report or doctoral thesis is discussed. The report outline is as follows.

A. Prefatory Items

1. Title page
2. Researcher's declaration
3. The certificate of the research supervisor
4. Preface/Acknowledgements
5. Table of contents
6. List of tables
7. List of graphs and charts
8. Abstract or synopsis

1. Introduction

a) Theoretical background of the topic
b) Statement of the problem
c) Review of literature
d) The scope of the present study
e) The objectives of the study
f) Hypotheses to be tested
g) Definition of concepts
h) Model, if any.

B. Body of the Report

2. The design of the study

a) Methodology
   1. Overall typology
   2. Methods of data collection
b) Sources of data
c) Sampling plan
d) Data collection instruments
e) Field work
f) Data processing and analysis plan
g) An overview of the report
h) Limitations of the study

3. Results:- Findings and Discussion

4. Summary, conclusions and Recommendations

C. Terminal Items

1. Bibliography
2. Appendix
a) Copies of data collection instruments
b) Technical details of sampling plan
c) Complex tables
d) Glossary of new terms used in the report.

(A) PREFATORY ITEMS:-

Title Page:
The title page is the first page of a research report. It carries (1) the title of the study, (2) the name of the degree for which it is submitted, (3) the name of the author, (4) the name of the institution to which the report is submitted, and (5) the date of presentation. The title should be precise and reflect the core of the problem under study. It should be printed in capital letters and centred on the page.

Researcher’s Declaration:-

In the case of a research undertaken by a student in fulfilment of the requirements of a degree, he may be required to make a declaration.

Research Supervisor’s Certificate:-

In the case of a student's research work, his research supervisor has to certify that it was a record of independent research work done by the student.

Acknowledgement:-

In this section, the researcher acknowledges the assistance and support received from
individuals and organizations in conducting the research. It is thus intended to show his
gratitude. Good taste calls for acknowledgements to be expressed simply and nicely.

In the case of a research undertaken by a non-student researcher, acknowledgements may be made in the preface itself, where a brief background of the study is given.

Table of contents:-

A table of contents gives an outline of the contents of the report. It contains a list of the
chapters and their sub-titles with page numbers. It facilitates ready location of topics in the
report. The chapter headings may be typed in capital letters and the subtitles in small letters.

List of tables:-

This comes after the table of contents. It is presented in the following format:

Table number The title of the table Page

All the tables may be numbered serially as 1, 2, 3, 4 ... in one continuous series, or the tables in each chapter may be given a separate serial order: 1.1, 1.2, 1.3 for tables in chapter 1; 2.1, 2.2, 2.3, 2.4 for tables in chapter 2; and so on.

Body of the Report:-

After the prefatory items, the body of the report is presented. It is the major and main part of
the report. It covers the formulation of the problem studied, methodology, findings and
discussion and a summary of the findings and recommendations. In a comprehensive report the
body of the report will consist of several chapters.

1. Introduction:- This is the first chapter in the body of a research report. It is devoted to introducing the theoretical background of the problem, and its definition and formulation. It may consist of the following sections.

(a) Theoretical background of the topic:- The first task is to introduce the background and the nature of the problem so as to place it in a larger context and enable the reader to appreciate its significance in a proper perspective. This section summarises the theory or conceptual framework within which the problem has been investigated. For example, the theoretical background in a thesis entitled “A Study of Social Responsibilities of Large Scale Industrial Units in India” discusses the nature of the Indian economy, the objective of India's constitution to establish an egalitarian social order, the various approaches to the concept of social responsibility (the property rights approach, the trusteeship approach, the legitimacy approach and the social responsibility approach) and their implications for industries in the Indian context. Within this conceptual framework, the problem was defined, the objectives of the study were set up, the concept of social responsibility was operationalised and the methodology of investigation was formulated.

(b) The statement of the problem:- In this section, why and how the problem was selected are stated; the problem is clearly defined and its facets and significance are pointed out.

(c) Review of Literature:- This is an important part of the introductory chapter. It is devoted to a brief review of previous studies on the problem and significant writings on the topic under study. This review provides a summary of the current state of knowledge in the area of investigation. Which aspects have been investigated, what research gaps exist and how the present study is an attempt to fill that gap are highlighted. Thus, the underlying purpose is to locate the present research in the existing body of research on the subject and to point out what it contributes to the subject.

(d) The scope of the present study:- The dimensions of the study in terms of the geographical area covered, the designation of the population being studied and the level of generality of the study are specified.

(e) The objectives of the study:- The objectives of the study and the investigative questions
relating to each of the objectives are presented.

(f) Hypotheses:- The specific hypotheses to be tested are stated. The sources of their
formulation may be indicated.

(g) Definition of concepts:- The reader of a report is not equipped to understand the study unless he knows what concepts are used and how they are used. Therefore, the operational definitions of the key concepts and variables of the study are presented, giving justifications for the definitions adopted.

How those concepts were defined by earlier writers and how the definitions of the researcher
were an improvement over earlier definitions may be explained.

(h) Models:- The models, if any, developed for depicting the relationships between variables
under study are presented with a review of their theoretical or conceptual basis. The
underlying assumptions are also stated.

2. The design of the study:-

This part of the report is devoted to the presentation of all the aspects of the methodology and their implementation, viz., the overall typology, the methods of data collection, the sample design, the data collection instruments, the methods of data processing and the plan of analysis. Much of this material is taken from the research proposal/plan. The revisions, if any, made in the initial design and the reasons therefor should be clearly stated. If a pilot study was conducted before designing the main study, the details of the pilot study and its outcome are reported. How the outcome of the pilot study was utilised for designing the final study is also pointed out.

The details of the study's design should be so meticulously stated as to fully satisfy the criterion of replicability. That is, it should be possible for another researcher to reproduce the study and test its conclusions. Technical details may be given in the appendix. Failure to furnish them could cast doubts on the design.

(a) Methodology:- In this section, the overall typology of research (i.e. experimental, survey,
case study or action research) used, and the data collection methods (i.e. observation,
interviewing or mailing) employed are described.

The sources of data the sampling plan and other aspects of the design may be presented under
separate subheadings as described below.

(b) Sources of Data:- The sources from which the secondary and/or primary data were gathered are stated. In the case of primary data, the universe of the study and the unit of study are clearly defined. The limitations of the secondary data should be indicated.

(c) Sampling Plan:- The size of the universe from which the sample was drawn, the sampling
methods adopted and the sample size and the process of sampling are described in this section.
What were originally planned and what were actually achieved and the estimate of sampling
error are to be given. These details are crucial for determining the limitations of
generalisability of the findings.

(d) Data-collection Instruments:- The types of instruments used for data collection and their
contents, scales and other devices used for measuring variables, and the procedure of
establishing their validity and reliability are described in this section. How the tools were pre-tested and finalised is also reported.

(e) Field work:- When and how the field work was conducted, and what problems and
difficulties were faced during the field work are described under this sub-heading. The
description of field experiences will provide valuable lessons for future researches in organising
and conducting their field work.

(f) Data processing and analysis plan:- The method, manual or mechanical, adopted for data processing, and an account of the methods used for data analysis and for testing hypotheses, must be outlined and justified. If common methods like the chi-square test, correlation analysis and analysis of variance were used, it is sufficient to say that such and such methods were used. If an unusual or complex method was used, it should be described in sufficient detail, with the formula, to enable the reader to understand it.

(g) An overview of the report:- The scheme of subsequent chapters is stated and the purpose
of each of them is briefly described in this section in order to give an overview of the
presentation of the result of the study.

(h) Limitations of the study:- No research is free from limitations and weaknesses. These arise from methodological weaknesses, sampling imperfections, non-response, data inadequacies, measurement deficiencies and the like. Such limitations may vitiate the
conclusions and their generalisability. Therefore, a careful statement of the limitations and
weaknesses of the study should be made in order to enable the reader to judge the validity of
the conclusions and the general worth of the study in the proper perspective. A frank statement
of limitations is one of the hallmarks of an honest and competent researcher.

Summary Conclusions and Recommendations:-

The presentation of the analysis and results is followed by a separate final chapter. This chapter is more extensive than the abstract given at the beginning of the report. It should be a self-contained summary of the whole report, containing a summary of essential background information, findings, conclusions and recommendations. After a brief statement of the problem, the purpose of the study and the methodology used in the investigation, the findings and conclusions are presented. This summary may be more or less a reproduction of the topical sentences of the various findings and conclusions presented in the main body.

Terminal Items

Bibliography:-

This is the first of the terminal items, presented at the end of the research report. The bibliography lists, in alphabetical order, all published and unpublished references used by the writer in preparing the report. All books, articles, reports and other documents may be presented in one common list in the alphabetical order of their authors; alternatively, the bibliography may be classified into three or four sections: (a) books, (b) articles, (c) reports, and (d) other documents, and in each section the relevant references may be arranged in alphabetical order.

Characteristics of a Good Research Report

Research report is a channel of communicating the research findings to the readers of the
report. A good report is one which does this task efficiently and effectively. As such it should
have the following characteristics/qualities.

1. It must be clear in informing the what, why, who, whom, when, where and how of the
research study.
2. It should be neither too short nor too long. One should keep in mind the fact that it
should be long enough to cover the subject matter but short enough to sustain the
reader’s interest.
3. It should be written in an objective style and simple language; correctness, precision and clarity should be the watchwords of the scholar. Wordiness, indirection and pompous language are barriers to communication.
4. A good report must combine clear thinking, logical organisation and sound
interpretation.
5. It should not be dull. It should be such as to sustain the reader’s interest.
6. It must be accurate. Accuracy is one of the requirements of a report. It should be factual
with objective presentation. Exaggerations and superlatives should be avoided.
7. Clarity is another requirement of presentation. It is achieved by using familiar words
and unambiguous statements, explicitly defining new concepts and unusual terms.

8. Coherence is an essential part of clarity. There should be a logical flow of ideas (i.e. continuity of thought) and sequence of sentences. Each sentence must be so linked with the other sentences as to move the thoughts smoothly.
9. Readability is an important requirement of good communication. Even a technical
report should be easily understandable. Technicalities should be translated into
language understandable by the readers.
10. A research report should be prepared according to the best composition practices.
Ensure readability through proper paragraphing, short sentences, illustrations,
examples, and section headings, use of charts, graphs and diagrams.
11. Draw sound inferences/conclusions from the statistical tables. But don’t repeat the
tables in text (verbal) form.
12. Footnote references should be in proper form. The bibliography should be reasonably
complete and in proper form.
13. The report must be attractive in appearance, neat and clean whether typed or printed.
14. The report should be free from mistakes of all types, viz., language mistakes, factual mistakes, spelling mistakes, calculation mistakes, etc.

The researcher should try to achieve these qualities in his report as far as possible.

Precautions for Writing Research Reports

Research report is a channel of communicating the research findings to the readers of the
report. A good research report is one which does this task efficiently and effectively. As such it
must be prepared keeping the following precautions in view:

1. While determining the length of the report (since research reports vary greatly in
length), one should keep in view the fact that it should be long enough to cover the
subject but short enough to maintain interest. In fact, report-writing should not be a
means to learning more and more about less and less.
2. A research report should not, if this can be avoided, be dull; it should be such as to
sustain reader’s interest.
3. Abstract terminology and technical jargon should be avoided in a research report. The
report should be able to convey the matter as simply as possible. This, in other words,
means that report should be written in an objective style in simple language, avoiding
expressions such as “it seems,” “there may be” and the like.
4. Readers are often interested in acquiring a quick knowledge of the main findings and as
such the report must provide a ready availability of the findings. For this purpose,
charts, graphs and the statistical tables may be used for the various results in the main
report in addition to the summary of important findings.
5. The layout of the report should be well thought out and must be appropriate and in
accordance with the objective of the research problem.
6. The reports should be free from grammatical mistakes and must be prepared strictly in
accordance with the techniques of composition of report-writing such as the use of

quotations, footnotes, documentation, proper punctuation and use of abbreviations in
footnotes and the like.
7. The report must present the logical analysis of the subject matter. It must reflect a
structure wherein the different pieces of analysis relating to the research problem fit
well.
8. A research report should show originality and should necessarily be an attempt to solve
some intellectual problem. It must contribute to the solution of a problem and must add
to the store of knowledge.
9. Towards the end, the report must also state the policy implications relating to the problem under consideration. It is usually considered desirable if the report makes a forecast of the probable future of the subject concerned and indicates the kinds of research that still need to be done in that particular field.
10. Appendices should be enlisted in respect of all the technical data in the report.
11. Bibliography of sources consulted is a must for a good report and must necessarily be
given.
12. Index is also considered an essential part of a good report and as such must be prepared
and appended at the end.
13. Report must be attractive in appearance, neat and clean, whether typed or printed.
14. Calculated confidence limits must be mentioned and the various constraints experienced
in conducting the research study may also be stated in the report.
15. Objective of the study, the nature of the problem, the methods employed and the
analysis techniques adopted must all be clearly stated in the beginning of the report in
the form of introduction.

Presentation of Research Report

Presentation has become an important communication medium in organisations because a research report is properly understood if it is accompanied by a presentation.

Presentation skill: Research report presentation skills include the ability to mix in the right
proportion various elements of:

• Communications
• Presentation package
• Use of audio – visual aids to achieve proper presentation.

The researcher needs to acquire public speaking skills.

Communication dimension

The major elements of communication dimension, which are relevant to a presentation, are:

• Purpose: The researcher has to think about the purpose of the presentation and focus sharply on the research analysis. The researcher can try to achieve a variety of purposes, such as informing, selling, exploring, aiding decision making, persuading and changing attitudes or behaviour.
• Audience: In a research report presentation, multiple audience interests have to be served at the same time. The researcher and the receivers of the message keep changing roles through clarification queries, question and answer, dialogue and discussion; the researcher should give the presentation of the report a live and dynamic shape.
• Media: In a research report presentation, sound, sight and body language all come into play. Therefore, the coordination of all three at once becomes an important aspect of presentation.
• Message: The researcher has to think of the focus of the message. Since the presentation situation is built on interaction between the presenter and the audience, the emotional content of the message and the audience should be considered.
• Time: The element of time in the presentation of a research report depends on various factors like the availability of the room, the audience and the right time, if the researcher has the choice. The major aspect of time is how much time is given to the researcher to make the presentation.
• Place: The researcher may not have much choice in selecting the place, but making the best use of the place and the facilities available will depend on the researcher. On the room arrangement depend the kind of audio-visual tools that can be used and the type of interaction that the researcher can have with the audience.
• Cost: The preparation of a good presentation is time consuming and expensive. The researcher could use cheaper production methods and aids than the ones he has chosen to put across the message.

Audio-visual aids: These can be classified as follows:

• Audio: tape recorder and compact disc
• Visual:
   o Non-projected: blackboard, flip charts, models
   o Projected: epidiascope, overhead projector, slides, film strips, slide projector with a timer
• Audio-visual: film and video cassettes

Usefulness of audio-visual aids: When visual aids are combined with sound, greater credibility and clarity can be achieved in a presentation. Since both the auditory and visual senses are activated at the same time, along with the body language, better concentration, retention and recall can be obtained in the presentation of a research report.

Researcher's poise: The researcher himself is an essential part of the presentation. The researcher's posture and movement on the dais or at the speaking place and his hand gestures indicate his level of confidence in presenting the report. The researcher's ability to maintain eye contact with the audience and to keep his facial expressions suited to the subject is also important. The fluency, pace of delivery, level of the voice and command of the language show the level of confidence and preparation of the researcher presenting the report.

Evaluation of the Research Report

After the report has been submitted by the researcher, he should try to get feedback on it. Feedback will enable him to know the deficiencies of his report, both in regard to the subject matter and the method of write-up. A detailed evaluation of the research project should be undertaken. This evaluation and review should be done by a panel of experts related to the subject. With respect to each stage of the research process, some important aspects the researcher may have to keep in mind are:

• Conformity with the actual requirements of the study
• Tools used in the collection of data
• Whether the data collection instrument was properly designed
• Appropriateness of the sample survey
• Adequacy of the sample size
• Sufficient control over the survey and appropriate analysis of data
• Interpretation of data
• Suitability of the research findings

A research report should be free from different types of experimental errors. Thus, a proper method of evaluating the research report can lead to an improvement in the quality of research. A scientific researcher will find considerable improvement in the quality of the research undertaken by him over a period of time.

Bibliography and References

While writing an assignment, article or book, the writer often looks to various sources to generate ideas or data. In this context, students usually misconstrue bibliography for reference, but the two are different, in the sense that you give references to the sources that you have quoted in-text in the research report or assignment, whereas in the bibliography you create a list of all the sources you have gone through to conceive the idea.

References and bibliography are an important part of any project under study because they help in acknowledging others' work and also help the readers in finding the original sources of information. They not only prevent plagiarism but also indicate that the writer has done good research on the subject by using a variety of sources to gain information.

Definition of Reference

Reference can be understood as the act of giving credit to or mentioning the name of, someone
or something. In research methodology, it denotes the items which you have reviewed and
referred to, in the text, in your research work. It is nothing but a way to acknowledge or
indirectly showing gratitude, towards the sources from where the information is gathered.

While using references, one thing to be noted is that you should go for reliable sources only, because they increase credence and also support your arguments. References may include books, research papers, articles from magazines, journals or newspapers, interview transcripts, and internet sources such as websites, blogs, videos watched, and so forth.

These are used to inform the reader about the sources of direct quotations, tables, statistics,
photos etc. that are included in the research work.

Why References

When you are writing an essay, report, dissertation or any other form of academic writing,
your own thoughts and ideas inevitably build on those of other writers, researchers or teachers.
It is essential that you acknowledge your debt to the sources of data, research and ideas on
which you have drawn by including references to, and full details of, these sources in your
work. Referencing your work allows the reader:

• To distinguish your own ideas and findings from those you have drawn from the work
of others;
• To follow up in more detail the ideas or facts that you have referred to.

Before you write

Whenever you read or research material for your writing, make sure that you include in your
notes, or on any photocopied material, the full publication details of each relevant text that you
read. These details should include:

• surname(s) and initial(s) of the author(s);


• the date of publication;
• the title of the text;
• if it is a paper, the title of the journal and volume number;
• if it is a chapter of an edited book, the book's title and editor(s)
• the publisher and place of publication*;
• the first and last page numbers if it is a journal article or a chapter in an edited book.

For particularly important points, or for parts of texts that you might wish to quote word for
word, also include in your notes the specific page reference.

* Please note that the publisher of a book should not be confused with the printer. The
publisher's name is normally on a book's main title page, and often on the book's spine too.

When to use references

Your source should be acknowledged every time the point that you make, or the data or other
information that you use, is substantially that of another writer and not your own. As a very
rough guide, while the introduction and the conclusions to your writing might be largely based
on your own ideas, within the main body of your report, essay or dissertation, you would expect
to be drawing on, and thus referencing your debt to, the work of others in each main section or

paragraph. Look at the ways in which your sources use references in their own work, and for
further guidance consult the companion guide Avoiding Plagiarism.

Referencing styles

There are many different referencing conventions in common use. Each department will have
its own preferred format, and every journal or book editor has a set of 'house rules'. This guide
aims to explain the general principles by giving details of the two most commonly used
formats, the 'author, date' system and footnotes or endnotes. Once you have understood the
principles common to all referencing systems you should be able to apply the specific rules set
by your own department.

How to reference using the 'author, date' system

In the 'author, date' system (often referred to as the 'Harvard' system) very brief details of the
source from which a discussion point or piece of factual information is drawn are included in the
text. Full details of the source are then given in a reference list or bibliography at the end of the
text. This allows the writer to fully acknowledge her/his sources, without significantly
interrupting the flow of the writing.

1. Citing your source within the text

As the name suggests, the citation in the text normally includes the name(s) (surname only) of
the author(s) and the date of the publication. This information is usually included in brackets at
the most appropriate point in the text.

The seminars that are often a part of humanities courses can provide opportunities for students to develop the communication and interpersonal skills that are valued by employers (Lyon, 1992).

The text reference above indicates to the reader that the point being made draws on a work by
Lyon, published in 1992. An alternative format is shown in the example below.

Knapper and Cropley (1991: p. 44) believe that the willingness of adults to learn is affected by
their attitudes, values and self-image and that their capacity to learn depends greatly on their
study skills.

Note that in this example reference has been made to a specific point within a very long text (in
this instance a book) and so a page number has been added. This gives the reader the
opportunity to find the particular place in the text where the point referred to is made. You
should always include the page number when you include a passage of direct quotation from
another writer's work.

When a publication has several authors, it is usual to give the surname of the first author
followed by et al. (an abbreviation of the Latin for 'and the others') although for works with just
two authors both names may be given, as in the example above.

Do not forget that you should also include reference to the source of any tables of data,
diagrams or maps that you include in your work. If you have included a straight copy of a table
or figure, then it is usual to add a reference to the table or figure caption thus:

Figure 1: The continuum of influences on learning (from Knapper and Cropley, 1991: p. 43).

Even if you have reorganised a table of data, or redrawn a figure, you should still acknowledge
its source:

Table 1: Type of work entered by humanities graduates (data from Lyon, 1992: Table 8.5).

You may need to cite an unpublished idea or discussion point from an oral presentation, such as
a lecture. The format for the text citation is normally exactly the same as for a published work
and should give the speaker's name and the date of the presentation.

Recent research on the origins of early man has challenged the views expressed in many of the
standard textbooks (Barker, 1996).

If the idea or information that you wish to cite has been told to you personally, perhaps in a
discussion with a lecturer or a tutor, it is normal to reference the point as shown in the example
below.

The experience of the Student Learning Centre at Leicester is that many students are anxious
to improve their writing skills, and are keen to seek help and guidance (Maria Lorenzini, pers.
comm.).

'Pers. comm.' stands for personal communication; no further information is usually required.

2. Reference lists/ bibliographies

When using the 'author, date' system, the brief references included in the text must be followed
up with full publication details, usually as an alphabetical reference list or bibliography at the
end of your piece of work. The examples given below are used to indicate the main principles.

Book references

The simplest format, for a book reference, is given first; it is the full reference for one of the
works quoted in the examples above.

Knapper, C.K. and Cropley, A. 1991: Lifelong Learning and Higher Education. London: Croom
Helm.

The reference above includes:


• the surnames and forenames or initials of both the authors;
• the date of publication;
• the book title;
• the place of publication;
• the name of the publisher.

The title of the book should be formatted to distinguish it from the other details; in the example
above it is italicised, but it could be in bold, underlined or in inverted commas. When multi-
authored works have been quoted, it is important to include the names of all the authors, even
when the text reference used was et al.

Papers or articles within an edited book

A reference to a paper or article within an edited book should in addition include:

• The editor and the title of the book;


• The first and last page numbers of the article or paper.

Lyon, E.S. 1992: Humanities graduates in the labour market. In H. Eggins (ed.), Arts
Graduates, their Skills and their Employment. London: The Falmer Press, pp. 123-143.

Journal articles

Journal articles must also include:


• the name and volume number of the journal;
• the first and last page numbers of the article.

The publisher and place of publication are not normally required for journals.

Pask, G. 1976: Styles and strategies of learning. British Journal of Educational Psychology, 46, pp. 128-148.

Note that in the last two references above, it is the book title and the journal name that are italicised, not the title of the paper or article. The name highlighted should always be the name under which the work will have been filed on the library shelves or referenced in any indexing system. It is often the name written on the spine of the volume; remembering this may make it easier to identify the appropriate title to highlight.

Other types of publications

The three examples above cover the most common publication types. You may also wish to
refer to other types of publications, including PhD dissertations, translated works, newspaper
articles, dictionary or encyclopaedia entries or legal or historical texts. The same general
principles apply to the referencing of all published sources, but for specific conventions consult
your departmental handbook or your tutor, or look at the more detailed reference books listed
in the Further reading section of this guide.

Referencing web pages

The internet is increasingly used as a source of information, and it is just as important to reference internet sources as it is to reference printed sources. Information on the internet changes rapidly, and web pages move or are sometimes inaccessible, meaning it can often be difficult to validate or even find information cited from the internet. When referencing web pages it is helpful to include details that will help other people check or follow up the information. A suggested format is to include the author of the information (this may be an individual, group or organisation), the date the page was put on the internet (most web pages have a date at the bottom of the page), the title, the http:// address, and the date you accessed the web page (in case the information has been subsequently modified). A format for referencing web pages is given below.

University of Leicester Standing Committee of Deans (6/8/2002) Internet code of practice and
guide to legislation. Accessed 8/8/02

http://www.le.ac.uk/committees/deans/codecode.html

Referencing lectures

Full references to unpublished oral presentations, such as lectures, usually include the speaker's
name, the date of the lecture, the name of the lecture or of the lecture series, and the location:

Barker, G. 1996 (7 October): The Archaeology of Europe, Lecture 1. University of Leicester.

Please note that in contrast to the format used for the published sources given in the first three
examples above, the formatting of references for unpublished sources does not include italics, as
there is no publication title to highlight.

Formatting references

If you look carefully at all the examples of full references given above, you will see that there is
a consistency in the ways in which punctuation and capitalisation have been used. There are
many other ways in which references can be formatted - look at the books and articles you read for other examples and at any guidelines in your course handbooks. The only rule governing
formatting is the rule of consistency.

How to reference using footnotes or endnotes

Some academic disciplines prefer to use footnotes (notes at the foot of the page) or endnotes
(notes at the end of the work) to reference their writing. Although this method differs in style
from the 'author, date' system, its purpose - to acknowledge the source of ideas, data or
quotations without undue interruption to the flow of the writing - is the same.

Footnote or endnote markers, usually a sequential series of numbers either in brackets or slightly above the line of writing or printing (superscript), are placed at the appropriate point in the text. This is normally where you would insert the author and date if you were using the 'author, date' system described above.

Employers are not just looking for high academic achievement and have identified
competencies that distinguish the high performers from the average graduate.¹ This view has
been supported by an early study that demonstrated that graduates employed in the industrial
and commercial sectors were as likely to have lower second and third class degrees as firsts and
upper seconds.²

Full details of the reference are then given at the bottom of the relevant page or, if endnotes are
preferred, in numerical order at the end of the writing. Rules for the formatting of the detailed
references follow the same principles as for the reference lists for the 'author, date' system.

1. Moore, K. 1992: National Westminster Bank plc. In H. Eggins (ed.), Arts Graduates, their
Skills and their Employment. London: The Falmer Press, pp. 24-26.

2. Kelsall, R.K., Poole, A. and Kuhn, A. 1970: Six Years After. Sheffield: Higher Education
Research Unit, Sheffield University, p. 40.

NB. The reference to 'p.40' at the end of note 2 above implies that the specific point referred to
is to be found on page 40 of the book referenced.

If the same source needs to be referred to several times, on second or subsequent occasions, a
shortened reference may be used.

Studies of women's employment patterns have demonstrated the relationship between marital status and employment sector.³
-------------------------
3. Kelsall et al. 1970 (as n.2 above).

In this example, the footnote refers the reader to the full reference to be found in footnote 2.

In some academic disciplines, footnotes and endnotes are not only used for references, but also
to contain elaborations or explanations of points made in the main text. If you are unsure about
how to use footnotes or endnotes in your work, consult your departmental guidelines or
personal tutor.

Definition of Bibliography

At the end of the research report, a bibliography is added, which contains a list of the books, magazines, journals, websites and other publications that are in some way relevant to the topic under study and were consulted by the researcher during the research. In finer terms, it comprises all the references cited in the form of footnotes, together with the other important works that the author has studied.

The bibliography is helpful to the reader in gaining information about the literature available on the topic and what influenced the author. For better presentation and convenient reading, the bibliography can be grouped into two parts: the first part lists the names of the books and pamphlets consulted, and the other contains the names of the magazines and newspapers considered.

Types of Bibliography

• Bibliography of works cited: It contains the names of those books whose content has been cited in the text of the research report.
• Selected bibliography: As is evident from the name itself, a selected bibliography covers only those works which the author assumes are of major interest to the reader.
• Annotated bibliography: In this type of bibliography, a short description of each item covered is given by the author to ensure readability and to improve the usefulness of the book.

Preparation of the final bibliography:

The bibliography, which is generally appended to the research report, is a list of books in some
way relevant to the research which has been done. It should contain all those works which the
researcher has consulted. The bibliography should be arranged alphabetically and may be
divided into two parts; the first part may contain the names of books and pamphlets, and the
second part may contain the names of magazine and newspaper articles.

Generally, this pattern of bibliography is considered convenient and satisfactory from the point of view of the reader, though it is not the only way of presenting a bibliography. The entries in the bibliography should be made in the following order.

For books and pamphlets the order may be as under:
• Name of author, last name first.
• Title, underlined to indicate italics.
• Place, publisher, and date of publication.
• Number of volumes.

Example: Kothari, C.R., Quantitative Techniques, New Delhi, Vikas Publishing House Pvt.
Ltd., 1978.

For magazines and newspapers the order may be as under:


1. Name of the author, last name first.
2. Title of article, in quotation marks.
3. Name of periodical, underlined to indicate italics.
4. The volume or volume and number.
5. The date of the issue.
6. The pagination.

Example: Robert V. Roosa, “Coping with Short-term International Money Flows”, The
Banker, London, September, 1971, p. 995.

The above examples are just samples of bibliography entries and may be used, but one should also remember that they are not the only acceptable forms. The important thing is that, whatever method one selects, it must remain consistent.

Types of Bibliographic Style

MLA Works Cited: A works cited page is the citation page of the popular style of the Modern Language Association. The MLA style sheet was first published in 1951. It was taken out of print in 2016 but is still a popular writing style. Designed for literature, arts and philosophical writing, MLA breaks down how to format non-print materials like web pages, personal interviews, advertisements and other communication sources.

Perfect Citations

Since MLA helps format sources that might not have a publication date, like web pages, using an author-page format makes it easy for people to find the information. Citations in an MLA works cited list look like this:

Example – Web page:

Lindsey, Suzie. “How to Make Vegetarian Chili.” eHow, www.ehow.com/how_10727_make-vegetarian-chili.html. Accessed 25 Nov 2018.

Example – Image:

Klee, Paul. Twittering Machine. 1922. Museum of Modern Art, New York. The Artchive,
www.artchive.com/artchive/K/klee/twittering_machine.jpg.html. Accessed 9 January 2019.

Example – Email:

Collens, Suzie. “Re: Literature.” Received by Jennifer Betts, 15 Nov. 2018.

APA Reference List:

The most popular reference list is found in the American Psychological Association writing
style. Originating in 1929, in the Psychological Bulletin, the APA style is designed for
psychology, education, social science and technical research.

This style breaks down formatting citations for journals, books, manuals and other technical sources. That's not to say, though, that there isn't formatting for sources like blogs and photographs; APA just makes citing statistics, research findings and technical reports easier. It isn't just the citation, either: the tone and word usage are also regulated by APA style. For example, APA style writing should use unbiased language and the active voice.

When creating a reference list in APA, the author and date are the first things that you will see.
This is because the in-text citations follow the author-date format. Formatting sources for
citation pages will follow a unique format whether you are listing a journal, book, web page or
blog.

Examples of citations in APA include:

Book:

Calfee, R. C. (1991). APA guide. Washington, DC: American Psychological Association.

Journal article:

Jourls, H. F. (1983). Fundamentals of medicine. Journal of Medicine, 46, 837-845.

Magazine article:

Henry, W. (2001, April 19). Making the grade. Time, 135, 28-31.

Chicago Bibliography Style: Now it's time to look at the great bibliography. The Chicago Manual of Style (CMOS) is by far one of the most common bibliography styles around, and it also comes in a student-friendly version called Turabian. Now in its 17th edition, Chicago style has been in print since 1906. While Chicago uses a reference list for the citation page, you can also create a bibliography if you use notes for the in-text citations. Notes can be in the form of endnotes or footnotes.

Bibliographical Sourcing

Chicago bibliographies are a good general style that works for many fields, including history, anthropology, theology and philosophy. Chicago also handles web sources well, along with audiovisual sources, lectures and even recordings. Examples of formatting for a Chicago bibliography include:

Example – Web page:

Heck, Jim. “About the Philosophical Gourmet Report.” Last modified July 8, 2011.
http://rgheck.frege.org/philosophy/aboutpgr.php.

Example – Facebook:

Chicago Manual of Style. “Is the world ready?” Facebook, April 19, 2017.
https://www.facebook.com/ChicagoManual/posts/10152906193679151.

Example – Audiovisual:

Beyoncé. “Sorry.” Directed by Kahlil Joseph and Beyoncé Knowles. June 22, 2016. Music video,
4:25. https://youtu.be/QxsmWxxouIM.

The type of bibliography you create will depend largely on the type of citation or writing style
that you are following. For example purposes, we will explore APA vs MLA. The two are
similar in many ways, but there are some major differences as well.

Here is a summary of the likenesses and differences between the two styles that are important when you have to choose between APA and MLA as a whole, and not just as they relate to bibliographies.

Likenesses (both styles):

• Both papers are double spaced; this includes the works cited or reference pages. Hanging indents are used for citations.
• Every piece of information used in the text of the paper MUST be included in the reference or works cited pages.
• Both necessitate the use of parenthetical citations in the body of the paper.
• Citations are listed in alphabetical order on both the reference and works cited pages.

Differences:

• MLA is most commonly used in the Arts & Humanities fields; APA is most commonly used in the Social Sciences fields.
• MLA uses a 'Works Cited' page at the end of the paper to cite all works used in the research process; APA uses a 'reference page' for the same purpose.
• Whenever information is cited and the name of the author is listed in the same sentence, MLA requires the page number at the end (e.g., According to Smith, cover page samples are amazing (19)), whereas APA requires the year of publication (e.g., Smith (2015) believes that citation examples help).
• Whenever information is cited and the name of the author is NOT listed, MLA places the surname of the author and the page number at the end, whereas APA places the author's surname and the year of publication in the sentence.

Key Differences between Reference and Bibliography


The difference between reference and bibliography can be drawn clearly on the following
grounds:

1. Reference implies referring to someone or something; it provides the list of sources whose text has been used in the assignment or research work. Conversely, a bibliography represents the list of all the sources from which the research has gained some information about the topic, irrespective of whether the work is cited or not.
2. References are based on primary sources, whereas a bibliography is created on the basis of primary and secondary sources.
3. References used in the assignment can be arranged alphabetically or numerically. On the contrary, the list of sources used in the bibliography is arranged numerically.
4. The bibliography is used to list everything you go through to obtain information relating to the assignment, no matter whether you specifically cite it in the assignment or not. References, by contrast, take into account only those sources which have been cited in the assignment.
5. The main objective of adding a reference at the end of the document is to improve credibility or support an idea or argument. By contrast, the bibliography is not used for supporting an argument.
6. References are used in theses and dissertations, whereas a bibliography is used in the case of journal papers and research work.

BASIS FOR COMPARISON: REFERENCE vs. BIBLIOGRAPHY

• Meaning: A reference implies the list of sources that have been referred to in the research work; a bibliography lists out all the materials that have been consulted during the research work.
• Based on: Reference - primary sources; Bibliography - both primary and secondary sources.
• Arrangement: Reference - alphabetically or numerically; Bibliography - numerically.
• Includes: Reference - only the in-text citations that have been used in the assignment or project; Bibliography - both the in-text citations and the other sources that were used to generate the ideas.
• Supporting argument: A reference can be used to support an argument; a bibliography cannot be used to support an argument.
• Used for: Reference - thesis and dissertation; Bibliography - journal papers and research work.

Use of Software in Research Data Analysis

Computers have always assisted mankind in solving problems. Since their invention, computers have shrunk drastically in size, from machines that filled a room to devices that fit in a human palm. The word computer means "something which computes, or a machine for performing calculations automatically". But today a computer is not merely a "calculator": it does a vast variety of jobs with tremendous speed and efficiency. Today people use computers in almost every walk of life, and computers have become a subject of study at schools. Electronic computers have now become an indispensable part of every profession, and research is no exception.

A computer has three basic components. They are:

1) An input device (keyboard and mouse)

2) A central processing unit (CPU) and

3) An output device (monitor and/or printer)

Important characteristics of a computer

1. Speed: computers can perform calculations in just a few seconds that a human being would need weeks to complete.

2. Storage: an enormous amount of data can be stored in the computer and retrieved when needed, whereas the human mind can remember only limited information and may forget unimportant details.

3. Accuracy: the computer's accuracy is consistently high. Almost without exception, errors in computing are due to human rather than technological weakness, i.e. to imprecise thinking by the programmer, inaccurate data or a poorly designed system.

4. Automation: computer programs run automatically; once started, a program executes its instructions without step-by-step human intervention.

5. Diligence: being a machine, a computer does not suffer from the human traits of tiredness and lack of concentration. A computer can perform any number of calculations continuously with the same accuracy and speed.

Computers in Research

Computers are indispensable throughout the research process, and their role becomes even more important when the research involves a large sample. Data can be stored in computers for immediate use, or stored in auxiliary memories such as floppy discs, compact discs, universal serial bus drives (pen drives) or memory cards, so that the same can be retrieved later. Computers assist the researcher throughout the different phases of the research process.

Phases of Research Process

There are five major phases of the research process. They are:

1) Conceptual phase

2) Design and planning phase

3) Empirical phase

4) Analytic phase and

5) Dissemination phase

1) Role of Computer in Conceptual Phase

The conceptual phase consists of formulating the research problem, reviewing the literature, developing the theoretical framework and formulating the hypothesis.

Role of Computers in Literature Review: Computers help in searching the literature (for the review of literature) and the bibliographic references stored in the electronic databases of the World Wide Web. They can also be used for storing relevant published articles, to be retrieved whenever needed. This has an advantage over searching the literature in books, journals and other newsletters at libraries, which consumes a considerable amount of time and effort.

2) Role of Computers in Design and planning phase

The design and planning phase consists of the research design, population, research variables, sampling plan, reviewing the research plan and the pilot study.

Role of Computers for Sample Size Calculation: Several software packages are available to calculate the sample size required for a proposed study; NCSS-PASS is one such package. The standard deviation of the data from the pilot study is required for the sample size calculation.
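As an illustration of how such a calculation can be done without commercial software, the short Python sketch below solves for the sample size of a two-group comparison using the free statsmodels package; the effect size and pilot-study values are made up for demonstration only.

# A minimal sample size sketch, assuming the statsmodels package is installed.
from statsmodels.stats.power import TTestIndPower

# Hypothetical pilot-study values: an expected group difference of 5 units
# and a pilot standard deviation of 10 give Cohen's d = 5 / 10 = 0.5.
effect_size = 5 / 10

# Solve for the number of subjects per group at alpha = .05 and 80% power.
n_per_group = TTestIndPower().solve_power(
    effect_size=effect_size,
    alpha=0.05,
    power=0.80,
    alternative="two-sided",
)
print(f"Required sample size per group: {n_per_group:.0f}")  # roughly 64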

3) Role of Computers in Empirical phase

The empirical phase consists of collecting the data and preparing them for analysis.

Data Storage: The data obtained from the subjects are stored in computers as word files or Excel spreadsheets. This has the advantage of allowing corrections or edits to the whole layout of the tables if needed, which is impossible or time-consuming in the case of data recorded on paper. Thus, computers help in data entry, data editing and data management, including follow-up actions. Computers also allow for greater flexibility in recording the data while they are collected, as well as greater ease during the analysis of these data.

In research studies, the preparation and input of data are the most labour-intensive and time-consuming aspects of the work. Typically the data will initially be recorded on a questionnaire or record form suitable for acceptance by the computer. To do this, the researcher, in conjunction with the statistician and the programmer, will convert the data into a Microsoft Word file or Excel spreadsheet. These spreadsheets can be opened directly with statistical software for analysis.

4) Role of Computers in Data Analysis

This phase consists of the statistical analysis of the data and the interpretation of the results.

Data Analysis: Much software is now available to perform the 'mathematical part' of the research process, i.e. the calculations using various statistical methods. Packages such as SPSS, NCSS-PASS, Stata and SYSTAT are among the most widely used. Typical tasks include calculating the sample size for a proposed study, hypothesis testing and calculating the power of the study. Familiarity with any one package will suffice to carry out even the most intricate statistical analyses. Computers are useful not only for statistical analyses, but also for monitoring the accuracy and completeness of the data as they are collected.

5) Role of Computers in Research Dissemination

This phase is the publication of the research study.

Research Publishing: The research article is typed in word-processing format, converted to portable document format (PDF), and stored and/or published on the World Wide Web.

Some Tools in Data Analysis:

There are some software packages that are readily available and often used (for example at UniSA), including Microsoft Excel, SPSS, SAS, Stata and R, which are briefly overviewed here. Further details about each of these packages are provided in subsequent modules.

• Microsoft Excel: This is part of the Microsoft Office suite of programs. Excel version 1.0 was first released in 1985, with the latest version being Excel 2016.
o Good points
- Extremely easy to use and interchanges nicely with other Microsoft products
- Excel spreadsheets can be read by many other statistical packages
- Has an add-on module, included with Excel, for undertaking basic statistical analyses
- Can produce very nice graphs
o Bad points
- Excel is designed for financial calculations, although it is possible to use it for many other things
- Cannot undertake more sophisticated statistical analyses without the purchase of expensive commercial add-ons

• SPSS: SPSS stands for Statistical Package for the Social Sciences. It was one of the earliest statistical packages, with Version 1 being released in 1968, well before the advent of desktop computers. It is now on Version 23.
o Good points
- Very easy to learn and use
- Can use either with menus or syntax files
- Quite good graphics
- Excels at descriptive statistics, basic regression analysis, analysis of variance, and some newer techniques such as Classification and Regression Trees (CART)
- Has its own structural equation modelling software, AMOS, which dovetails with SPSS
o Bad points
- Focus is on statistical methods mainly used in the social sciences, market research and psychology
- Has advanced regression modelling procedures such as LMM and GEE, but they are awful to use, with very obscure syntax
- Has few of the more powerful techniques required in epidemiological analysis, such as competing risk analysis or standardised rates
• SAS: SAS stands for Statistical Analysis System. It was developed at North Carolina State University in 1966, so it is contemporary with SPSS.
o Good points
- Can use either with menus or syntax files
- Much more powerful than SPSS
- Commonly used for data management in clinical trials
o Bad points
- Harder to learn and use than SPSS
• Stata: Stata is a more recent statistical package, with Version 1 being released in 1985. Since then, it has become increasingly popular in the areas of epidemiology and economics, and probably now rivals SPSS and SAS in its user base. We are now on Version 14.
o Good points
- Can use either with menus or syntax files
- Much more powerful than SPSS – probably equivalent to SAS
- Excels at advanced regression modelling
- Has its own in-built structural equation modelling
- Has a good suite of epidemiological procedures
- Researchers around the world write their own procedures in Stata, which are then available to all users
o Bad points
- Harder to learn and use than SPSS
- Does not yet have some specialised techniques such as CART or Partial Least Squares regression

• R: S-plus is a statistical programming language developed in Seattle in 1988. R is a free version of S-plus, developed in 1996. Since then the original team has expanded to include dozens of individuals from all over the globe. Because it is a programming language and environment, it is used by giving the software a series of commands, often saved in text documents called syntax files or scripts, rather than through a menu-based system. Because of this, it is probably best used by people already reasonably expert at statistical analysis, or who have an affinity for computers.
o Good points
- Very powerful – easily matches or even surpasses many of the models found in SAS or Stata
- Researchers around the world write their own procedures in R, which are then available to all users
- Free!
o Bad points
- Much harder to learn and use than SAS or Stata

What is SPSS and Its Importance in Research & Data Analysis?

SPSS (Statistical Package for the Social Sciences) is a set of software programs combined together in a single package. The basic application of this program is to analyse scientific data related to the social sciences. This data can be used for market research, surveys, data mining, etc.

With the help of the statistical information obtained, researchers can easily understand the demand for a product in the market and can change their strategy accordingly. Basically, SPSS first stores and organises the provided data, then it compiles the data set to produce suitable output. SPSS is designed in such a way that it can handle a large set of variable data formats.

How SPSS Helps in Research & Data Analysis Programs:

SPSS is revolutionary software, mainly used by research scientists, which helps them process critical data in simple steps. Working on data is a complex and time-consuming process, but this software can easily handle and operate on information with the help of certain techniques. These techniques are used to analyse and transform data, and to find characteristic patterns between different data variables. In addition, output can be obtained through graphical representation so that a user can easily understand the results. Read below to understand the factors that are responsible for the process of data handling and its execution.

1. Data Transformation: This technique is used to convert the format of the data. After changing the data type, it integrates the same type of data in one place, and it becomes easy to manage. You can insert different kinds of data into SPSS, and it will change their structure as per the system specification and requirement. This means that even if you change the operating system, SPSS can still work on old data.

2. Regression Analysis: It is used to understand the relationship between a dependent variable and the independent variables stored in a data file. It also explains how a change in the value of an independent variable can affect the dependent data. The primary purpose of regression analysis is to understand the type of relationship between different variables (a short worked sketch follows this list).

3. ANOVA (Analysis of Variance): It is a statistical approach to compare events, groups or processes and to find out the differences between them. It can help you understand which method is more suitable for executing a task; by looking at the result, you can judge the feasibility and effectiveness of a particular method (see the sketch after this list).

4. MANOVA (Multivariate Analysis of Variance): This method extends ANOVA to compare groups on several dependent variables at once. The MANOVA technique can also be used to analyse different types of population and the factors that can affect their choices.

5. T-tests: A t-test is used to understand the difference between two sample means, and researchers apply this method to find out whether the interests of two kinds of groups differ. This test can also indicate whether the observed difference is meaningful or merely due to chance.
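As a concrete, hedged illustration of two of the techniques above (this is ordinary Python with the scipy package, not SPSS itself, and the data are invented for demonstration), the sketch below runs a simple regression and a one-way ANOVA:

from scipy import stats

# Regression: relationship between an independent variable (advertising
# spend) and a dependent variable (sales). Data are hypothetical.
spend = [1, 2, 3, 4, 5, 6]
sales = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]
reg = stats.linregress(spend, sales)
print(f"slope = {reg.slope:.2f}, r-squared = {reg.rvalue ** 2:.3f}, p = {reg.pvalue:.4f}")

# One-way ANOVA: do mean scores differ across three teaching methods?
method_a = [85, 90, 88, 75, 95]
method_b = [70, 65, 80, 72, 68]
method_c = [90, 95, 88, 92, 94]
f_stat, p_value = stats.f_oneway(method_a, method_b, method_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")  # a small p suggests the means differ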

This software was first developed in 1968, and IBM acquired it in 2009. IBM has made some significant changes in the programming of SPSS, and it can now perform many types of research tasks in various fields. Because of this, the use of this software has extended to many industries and organisations, such as marketing, health care, education, surveys, etc.

TESTS

One Sample t-Tests

One sample t-tests can be used to determine if the mean of a sample is different from a
particular value. In this example, we will determine if the mean number of older siblings that
the PSY 216 students have is greater than 1.

We will follow our customary steps:

1. Write the null and alternative hypotheses first:


H0: µ_216 Students ≤ 1
H1: µ_216 Students > 1
Where µ is the mean number of older siblings that the PSY 216 students have.
2. Determine if this is a one-tailed or a two-tailed test. Because the hypothesis involves the
phrase "greater than", this must be a one tailed test.
3. Specify the α level: α = .05
4. Determine the appropriate statistical test. The variable of interest, older, is on a ratio
scale, so a z-score test or a t-test might be appropriate. Because the population standard
deviation is not known, the z-test would be inappropriate. We will use the t-test instead.
5. Calculate the t value, or let SPSS do it for you!

The command for a one sample t tests is found at Analyze | Compare Means | One-Sample T
Test (this is shorthand for clicking on the Analyze menu item at the top of the window, and
then clicking on Compare Means from the drop down menu, and One-Sample T Test from the
popup menu.):

The One-Sample t Test dialog box will appear:

Select the dependent variable(s) that you want to test by clicking on it in the left hand pane of
the One-Sample t Test dialog box. Then click on the arrow button to move the variable into
the Test Variable(s) pane. In this example, move the Older variable (number of older siblings)
into the Test Variables box:

Click in the Test Value box and enter the value that you will compare to. In this example, we
are comparing if the number of older siblings is greater than 1, so we should enter 1 into the
Test Value box:

Click on the OK button to perform the one-sample t test. The output viewer will appear. There
are two parts to the output. The first part gives descriptive statistics for the variables that you
moved into the Test Variable(s) box on the One-Sample t Test dialog box. In this example, we
get descriptive statistics for the Older variable:

This output tells us that we have 46 observations (N), the mean number of older siblings is 1.26
and the standard deviation of the number of older siblings is 1.255. The standard error of the
mean (the standard deviation of the sampling distribution of means) is 0.185 (1.255 / square
root of 46 = 0.185).

The second part of the output gives the value of the statistical test:

The second column of the output gives us the t-test value: (1.26 - 1) / (1.255 / square root of 46) = 1.410 (if you do the calculation by hand, the values will not match exactly because of round-off error). The third column tells us that this t test has 45 degrees of freedom (46 - 1 = 45). The fourth column tells us the two-tailed significance (the 2-tailed p value). But we didn't want a two-tailed test; our hypothesis is one tailed and there is no option to specify a one-tailed test. Because this is a one-tailed test, look in a table of critical t values to determine the critical t. The critical t with 45 degrees of freedom, α = .05 and one-tailed is 1.679.

Determine if we can reject the null hypothesis or not. The decision rule is: if the one-tailed
critical t value is less than the observed t AND the means are in the right order, then we can
reject H0. In this example, the critical t is 1.679 (from the table of critical t values) and the
observed t is 1.410, so we fail to reject H0. That is, there is insufficient evidence to conclude
that the mean number of older siblings for the PSY 216 classes is larger than 1.

If we were writing this for publication in an APA journal, we would write it as:
A t test failed to reveal a statistically reliable difference between the mean number of older siblings that the PSY 216 class has (M = 1.26, s = 1.26) and 1, t(45) = 1.410, p > .05, α = .05.
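The same test can be reproduced outside SPSS. The sketch below (assuming Python with scipy version 1.6 or later for the alternative argument, and an invented list of sibling counts in place of the PSY 216 data) runs a one-sample t-test against a test value of 1; unlike SPSS, scipy can return the one-tailed p value directly.

from scipy import stats

# Hypothetical data: number of older siblings reported by each student.
older = [0, 1, 2, 1, 0, 3, 1, 2, 0, 1, 4, 1, 0, 2, 1]

# One-sample t-test against the test value 1; alternative="greater"
# gives the one-tailed p value for H1: mu > 1.
t_stat, p_one_tailed = stats.ttest_1samp(older, popmean=1, alternative="greater")
print(f"t = {t_stat:.3f}, one-tailed p = {p_one_tailed:.3f}")

If the one-tailed p value is less than or equal to α, reject H0; otherwise fail to reject, as in the worked example above.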

Independent Samples t-Tests


Single Value Groups

When two samples are involved, the samples can come from different individuals who are not
matched (the samples are independent of each other.) Or the sample can come from the same
individuals (the samples are paired with each other) and the samples are not independent of
each other. A third alternative is that the samples can come from different individuals who have
been matched on a variable of interest; this type of sample will not be independent. The form of
the t-test is slightly different for the independent samples and dependent samples types of two
sample tests, and SPSS has separate procedures for performing the two types of tests.

The Independent Samples t-test can be used to see if two means are different from each other
when the two samples that the means are based on were taken from different individuals who
have not been matched. In this example, we will determine if the students in sections one and
two of PSY 216 have a different number of older siblings.

We will follow our customary steps:

1. Write the null and alternative hypotheses first:
H0: µ_Section 1 = µ_Section 2
H1: µ_Section 1 ≠ µ_Section 2
Where µ is the mean number of older siblings that the PSY 216 students have.
2. Determine if this is a one-tailed or a two-tailed test. Because the hypothesis involves the
phrase "different" and no ordering of the means is specified, this must be a two tailed
test.
3. Specify the α level: α = .05
4. Determine the appropriate statistical test. The variable of interest, older, is on a ratio
scale, so a z-score test or a t-test might be appropriate. Because the population standard
deviation is not known, the z-test would be inappropriate. Furthermore, there are
different students in sections 1 and 2 of PSY 216, and they have not been matched.
Because of these factors, we will use the independent samples t-test.
5. Calculate the t value, or let SPSS do it for you!

The command for the independent samples t tests is found at Analyze | Compare Means |
Independent-Samples T Test (this is shorthand for clicking on the Analyze menu item at the
top of the window, and then clicking on Compare Means from the drop down menu, and
Independent-Samples T Test from the popup menu.):

The Independent-Samples t Test dialog box will appear:

Select the dependent variable(s) that you want to test by clicking on it in the left hand pane of
the Independent-Samples t Test dialog box. Then click on the upper arrow button to move the
variable into the Test Variable(s) pane. In this example, move the older variable (number of
older siblings) into the Test Variables box:

Click on the independent variable (the variable that defines the two groups) in the left hand
pane of the Independent-Samples t Test dialog box. Then click on the lower arrow button to
move the variable in the Grouping Variable box. In this example, move the Section variable
into the Grouping Variable box:

You need to tell SPSS how to define the two groups. Click on the Define Groups button. The
Define Groups dialog box appears:

In the Group 1 text box, type in the value that determines the first group. In this example, the
value of the 10 AM section is 10. So you would type 10 in the Group 1 text box. In the Group 2
text box, type the value that determines the second group. In this example, the value of the 11
AM section is 11. So you would type 11 in the Group 2 text box:

Click on the Continue button to close the Define Groups dialog box. Click on the OK button in
the Independent-Samples t Test dialog box to perform the t-test. The output viewer will
appear with the results of the t test. The results have two main parts: descriptive statistics and
inferential statistics. First, the descriptive statistics:

This gives the descriptive statistics for each of the two groups (as defined by the grouping
variable.) In this example, there are 14 people in the 10 AM section (N), and they have, on
average, 0.86 older siblings, with a standard deviation of 1.027 older siblings. There are 32
people in the 11 AM section (N), and they have, on average, 1.44 older siblings, with a standard
deviation of 1.318 older siblings. The last column gives the standard error of the mean for each
of the two groups.

The second part of the output gives the inferential statistics:

The columns labeled "Levene's Test for Equality of Variances" tell us whether an assumption
of the t-test has been met. The t-test assumes that the variability of each group is
approximately equal. If that assumption isn't met, then a special form of the t-test should be
used. Look at the column labeled "Sig." under the heading "Levene's Test for Equality of
Variances". In this example, the significance (p value) of Levene's test is .203. If this value is
less than or equal to your α level for the test (usually .05), then you can reject the null
hypothesis that the variability of the two groups is equal, implying that the variances are
unequal. If the p value is less than or equal to the α level, then you should use the bottom row
of the output (the row labeled "Equal variances not assumed.") If the p value is greater than
your α level, then you should use the middle row of the output (the row labeled "Equal
variances assumed.") In this example, .203 is larger than α, so we will assume that the variances
are equal and we will use the middle row of the output.

The column labeled "t" gives the observed or calculated t value. In this example, assuming equal
variances, the t value is 1.461. (We can ignore the sign of t for a two tailed t-test.) The column
labeled "df" gives the degrees of freedom associated with the t test. In this example, there are
44 degrees of freedom.

The column labeled "Sig. (2-tailed)" gives the two-tailed p value associated with the test. In this
example, the p value is .151. If this had been a one-tailed test, we would need to look up the
critical t in a table.

Decide if we can reject H0: As before, the decision rule is given by: If p ≤ α, then reject H0. In
this example, .151 is not less than or equal to .05, so we fail to reject H0. That implies that we
failed to observe a difference in the number of older siblings between the two sections of this
class.

If we were writing this for publication in an APA journal, we would write it as: A t test failed to
reveal a statistically reliable difference between the mean number of older siblings that the 10
AM section has (M = 0.86, s = 1.027) and that the 11 AM section has (M = 1.44, s = 1.318),
t(44) = 1.461, p = .151, α = .05.
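For comparison, this whole procedure, including Levene's test, can be sketched in a few lines of Python (assuming scipy; the two sections' sibling counts below are invented for illustration):

from scipy import stats

# Hypothetical data: older siblings in the 10 AM and 11 AM sections.
section_10am = [0, 1, 0, 2, 1, 0, 1, 3, 0, 1, 2, 0, 1, 0]
section_11am = [2, 1, 0, 3, 1, 2, 4, 1, 0, 2, 1, 3, 0, 1, 2, 1]

# Levene's test for equality of variances, as in the SPSS output.
levene_stat, levene_p = stats.levene(section_10am, section_11am)

# If Levene's p > alpha, use the pooled (equal variances) t-test;
# otherwise use Welch's test ("equal variances not assumed").
t_stat, p_value = stats.ttest_ind(section_10am, section_11am,
                                  equal_var=(levene_p > 0.05))
print(f"Levene p = {levene_p:.3f}, t = {t_stat:.3f}, two-tailed p = {p_value:.3f}")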

Independent Samples t-Tests


Cut Point Groups

Sometimes you want to perform a t-test but the groups are defined by a variable that is not
dichotomous (i.e., it has more than two values.) For example, you may want to see if the
number of older siblings is different for students who have higher GPAs than for students who
have lower GPAs. Since there is no single value of GPA that specifies "higher" or "lower", we
cannot proceed exactly as we did before. Before proceeding, decide which value you will use to
divide the GPAs into the higher and lower groups. The median would be a good value, since
half of the scores are above the median and half are below.

1. Write the null and alternative hypotheses first:
H0: µ_lower GPA = µ_higher GPA
H1: µ_lower GPA ≠ µ_higher GPA
Where µ is the mean number of older siblings that the PSY 216 students have.
2. Determine if this is a one-tailed or a two-tailed test. Because the hypothesis involves the
phrase "different" and no ordering of the means is specified, this must be a two tailed
test.
3. Specify the α level: α = .05
4. Determine the appropriate statistical test. The variable of interest, older, is on a ratio
scale, so a z-score test or a t-test might be appropriate. Because the population standard
deviation is not known, the z-test would be inappropriate. Furthermore, different
students have higher and lower GPAs, so we have a between-subjects design. Because of
these factors, we will use the independent samples t-test.
5. Calculate the t value, or let SPSS do it for you.

The command for the independent samples t tests is found at Analyze | Compare Means |
Independent-Samples T Test (this is shorthand for clicking on the Analyze menu item at the
top of the window, and then clicking on Compare Means from the drop down menu, and
Independent-Samples T Test from the popup menu.):

The Independent-Samples t Test dialog box will appear:

Select the dependent variable(s) that you want to test by clicking on it in the left hand pane of
the Independent-Samples t Test dialog box. Then click on the upper arrow button to move the
variable into the Test Variable(s) pane. In this example, move the Older variable (number of
older siblings) into the Test Variables box:

Click on the independent variable (the variable that defines the two groups) in the left hand
pane of the Independent-Samples t Test dialog box. Then click on the lower arrow button to
move the variable in the Grouping Variable box. (If there already is a variable in the Grouping
Variable box, click on it if it is not already highlighted, and then click on the lower arrow which
should be pointing to the left.) In this example, move the GPA variable into the Grouping
Variable box:

You need to tell SPSS how to define the two groups. Click on the Define Groups button. The
Define Groups dialog box appears:

Click in the circle to the left of "Cut Point:". Then type the value that splits the variable into
two groups. Group one is defined as all scores that are greater than or equal to the cut point.
Group two is defined as all scores that are less than the cut point. In this example, use 3.007
(the median of the GPA variable) as the cut point value:

Click on the Continue button to close the Define Groups dialog box. Click on the OK button in
the Independent-Samples t Test dialog box to perform the t-test. The output viewer will
appear with the results of the t test. The results have two main parts: descriptive statistics and
inferential statistics. First, the descriptive statistics:

This gives the descriptive statistics for each of the two groups (as defined by the grouping
variable.) In this example, there are 23 people with a GPA greater than or equal to 3.01 (N),
and they have, on average, 1.04 older siblings, with a standard deviation of 1.186 older siblings.
There are 23 people with a GPA less than 3.01 (N), and they have, on average, 1.48 older
siblings, with a standard deviation of 1.310 older siblings. The last column gives the standard
error of the mean for each of the two groups. The second part of the output gives the inferential
statistics:

As before, the columns labeled "Levene's Test for Equality of Variances" tell us whether an
assumption of the t-test has been met. Look at the column labeled "Sig." under the heading
"Levene's Test for Equality of Variances". In this example, the significance (p value) of Levene's
test is .383. If this value is less than or equal to your α level for this test, then you can reject the
null hypothesis that the variabilities of the two groups are equal, implying that the variances
are unequal. In this example, .383 is larger than our α level of .05, so we will assume that the
variances are equal and we will use the middle row of the output.

The column labeled "t" gives the observed or calculated t value. In this example, assuming
equal variances, the t value is 1.180. (We can ignore the sign of t when using a two-tailed t-
test.) The column labeled "df" gives the degrees of freedom associated with the t test. In this
example, there are 44 degrees of freedom.

The column labeled "Sig. (2-tailed)" gives the two-tailed p value associated with the test. In this
example, the p value is .244. If this had been a one-tailed test, we would need to look up the
critical t in a table.

Decide if we can reject H0: As before, the decision rule is given by: If p ≤ α, then reject H0. In this example, .244 is greater than .05, so we fail to reject H0. That implies that there is not sufficient evidence to conclude that people with higher or lower GPAs have a different number of older siblings.

If we were writing this for publication in an APA journal, we would write it as: An equal
variances t test failed to reveal a statistically reliable difference between the mean number of
older siblings for people with higher (M = 1.04, s = 1.186) and lower GPAs (M = 1.48, s =
1.310), t(44) = 1.18, p = .244, α = .05.
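The cut point logic itself is easy to reproduce. A minimal Python sketch (assuming scipy, with invented GPA and sibling data) splits the sample at the median and then runs the same independent samples t-test:

import statistics
from scipy import stats

# Hypothetical data: each student's GPA and their number of older siblings.
gpa = [2.5, 3.2, 3.8, 2.9, 3.5, 2.7, 3.1, 3.9, 2.4, 3.6]
older = [2, 1, 0, 3, 1, 2, 1, 0, 2, 1]

# Use the median GPA as the cut point, as in the SPSS Define Groups dialog:
# group 1 is GPA >= cut point, group 2 is GPA < cut point.
cut = statistics.median(gpa)
higher = [o for g, o in zip(gpa, older) if g >= cut]
lower = [o for g, o in zip(gpa, older) if g < cut]

t_stat, p_value = stats.ttest_ind(higher, lower)
print(f"cut point = {cut}, t = {t_stat:.3f}, two-tailed p = {p_value:.3f}")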

Paired Samples t-Tests

When two samples are involved and the values for each sample are collected from the same
individuals (that is, each individual gives us two values, one for each of the two groups), or the
samples come from matched pairs of individuals then a paired-samples t-test may be an
appropriate statistic to use. The paired samples t-test can be used to determine if two means are
different from each other when the two samples that the means are based on were taken from
the matched individuals or the same individuals. In this example, we will determine if the
students have different numbers of younger and older siblings.

1. Write the null and alternative hypotheses:


H0: µ_older = µ_younger
H1: µ_older ≠ µ_younger
Where µ is the mean number of siblings that the PSY 216 students have.
2. Determine if this is a one-tailed or a two-tailed test. Because the hypothesis involves the
phrase "different" and no ordering of the means is specified, this must be a two tailed
test.

3. Specify the α level: α = .05
4. Determine the appropriate statistical test. The variables of interest, older and younger, are on a ratio scale, so a z-score test or a t-test might be appropriate. Because the population standard deviation is not known, the z-test would be inappropriate. Furthermore, because the same students are reporting the number of older and younger siblings, we have a within-subjects design. Because of these factors, we will use the paired samples t-test.
5. Let SPSS calculate the value of t for you.

The command for the paired samples t tests is found at Analyze | Compare Means | Paired-
Samples T Test (this is shorthand for clicking on the Analyze menu item at the top of the
window, and then clicking on Compare Means from the drop down menu, and Paired-Samples
T Test from the popup menu.):

The Paired-Samples t Test dialog box will appear:

You must select a pair of variables that represent the two conditions. Click on one of the
variables in the left hand pane of the Paired-Samples t Test dialog box. Then click on the other
variable in the left hand pane. Click on the arrow button to move the variables into the Paired
Variables pane. In this example, select Older and Younger variables (number of older and
younger siblings) and then click on the arrow button to move the pair into the Paired Variables
box:

Click on the OK button in the Paired-Samples t Test dialog box to perform the t-test. The
output viewer will appear with the results of the t test. The results have three main parts:
descriptive statistics, the correlation between the pair of variables, and inferential statistics.
First, the descriptive statistics:

This gives the descriptive statistics for each of the two groups (as defined by the pair of
variables.) In this example, there are 45 people who responded to the Older siblings question
(N), and they have, on average, 1.24 older siblings, with a standard deviation of 1.26 older
siblings. These same 45 people also responded to the Younger siblings question (N), and they
have, on average, 1.13 younger siblings, with a standard deviation of 1.20 younger siblings.
The last column gives the standard error of the mean for each of the two variables.

The second part of the output gives the correlation between the pair of variables:

This again shows that there are 45 pairs of observations (N). The correlation between the two variables is given in the third column. In this example r = -.292. The last column gives the p value for the correlation coefficient. As always, if the p value is less than or equal to the alpha level, then you can reject the null hypothesis that the population correlation coefficient (ρ) is equal to 0. In this case, p = .052, so we fail to reject the null hypothesis. That is, there is insufficient evidence to conclude that the population correlation (ρ) is different from 0.

The third part of the output gives the inferential statistics:

The column labeled "Mean" is the difference of the two means (1.24 - 1.13 = 0.11 in this example; the discrepancy is due to round-off error). The next column is the standard deviation of the difference between the two variables (1.98 in this example).

The column labeled "t" gives the observed or calculated t value. In this example, the t value is
0.377 (you can ignore the sign.) The column labeled "df" gives the degrees of freedom
associated with the t test. In this example, there are 44 degrees of freedom. The column labeled
"Sig. (2-tailed)" gives the two-tailed p value associated with the test. In this example, the p
value is .708. If this had been a one-tailed test, we would need to look up the critical value of t
in a table.

Decide if we can reject H0: As before, the decision rule is given by: If p ≤ α, then reject H0. In
this example, .708 is not less than or equal to .05, so we fail to reject H0. That implies that there
is insufficient evidence to conclude that the number of older and younger siblings is different.

If we were writing this for publication in an APA journal, we would write it as: A paired
samples t test failed to reveal a statistically reliable difference between the mean number of
older (M = 1.24, s = 1.26) and younger (M = 1.13, s = 1.20) siblings that the students have,
t(44) = 0.377, p = .708, α = .05.
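A paired test is equally brief to sketch in Python (assuming scipy; the paired sibling counts are invented). Note that the two lists must be in the same subject order, since each position represents one student:

from scipy import stats

# Hypothetical data: for each student, (older siblings, younger siblings).
older = [1, 0, 2, 1, 3, 0, 1, 2, 1, 0]
younger = [0, 1, 1, 2, 0, 1, 2, 1, 0, 3]

# Paired-samples t-test: the two values in each pair come from the same student.
t_stat, p_value = stats.ttest_rel(older, younger)
print(f"t = {t_stat:.3f}, two-tailed p = {p_value:.3f}")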

One Proportion Z-Tests in SPSS
A certain soft drink bottler claims that less than 20% of its customers drink another brand of
soft drink on a regular basis. A random sample of 100 customers yielded 18 who did in fact
drink another brand of soft drink on a regular basis. Do these sample results support the
bottler’s claim? (Use a level of significance of 0.05.)

1. Enter the category values (Brand of Drink: 1=other brand, 2=same brand) into one
variable and the observed counts (other brand=18, same brand=82) into another variable (see
left figure, below). Then weight the category values variable by the observed counts variable
(see two right figures, below).

2. Select Analyze −> Nonparametric Tests −> Chi-Square… (See left figure, below).

3. Select “Brand of Drink” as the test variable and enter the values for the null hypothesis
proportions in numerical order by category value [i.e., P(other brand) = 0.20, then P(same
brand) = 0.80] (see right figure, below).

4. Your output should look like this.

5. You should use the output information in the following manner to answer the question.
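The same chi-square goodness-of-fit computation can be sketched in Python (assuming scipy) using the counts from the problem:

from scipy import stats

# Observed counts: 18 customers drank another brand, 82 did not (n = 100).
observed = [18, 82]

# Expected counts under H0: P(other brand) = 0.20, P(same brand) = 0.80.
expected = [100 * 0.20, 100 * 0.80]

# Chi-square goodness-of-fit test, mirroring the SPSS Chi-Square procedure.
chi2, p_value = stats.chisquare(f_obs=observed, f_exp=expected)
print(f"chi-square = {chi2:.3f}, p = {p_value:.3f}")

Here chi-square = (18 - 20)²/20 + (82 - 80)²/80 = 0.25, with a p value of about .62, so we fail to reject the hypothesised proportions of 0.20 and 0.80.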

Two Proportion Z-Tests in SPSS


In a test of the reliability of products produced by two machines, machine A produced 15
defective parts in a run of 280, while machine B produced 10 defective parts in a run of 200. Do
these results imply a difference in the reliability of these two machines? (Use α = 0.01.)

1. Enter the group values (Machine: 1=Machine A, 2=Machine B) into one variable, the
quality values (Quality: 1=Defective, 2=Acceptable) into another variable, and the observed
counts into a third variable (see left figure, below). Then weight the category variables
(Machine, Quality) by the observed counts variable (see two right figures, below).

2. Select Analyze -> Descriptive Statistics -> Crosstabs… (see top-left figure, below).

3. Select “Machine” as the row variable and “Quality” as the column variable. Click the
“Statistics…” button and be sure that “Chi-square” is selected (see bottom figure, below). Click
“Continue” to close the “Statistics…” window, and then click “OK” to perform the analysis (see
top-right figure, below).

4. Your output should look like this.

5. You should use the output information in the following manner to answer the question.
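
The corresponding syntax, assuming variables named Machine, Quality, and Count (hypothetical names):

WEIGHT BY Count.
CROSSTABS
  /TABLES=Machine BY Quality
  /STATISTICS=CHISQ
  /CELLS=COUNT EXPECTED.

Because the table is 2x2, the Pearson chi-square equals the square of the two-proportion z statistic, so its p value can be compared directly against α = 0.01.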

Chi-Square Test of Independence

The Chi-Square Test of Independence determines whether there is an association between
categorical variables (i.e., whether the variables are independent or related). It is a
nonparametric test.

This test is also known as:

• Chi-Square Test of Association.

This test utilizes a contingency table to analyze the data. A contingency table (also known as
a cross-tabulation, crosstab, or two-way table) is an arrangement in which data is classified
according to two categorical variables. The categories for one variable appear in the rows, and
the categories for the other variable appear in columns. Each variable must have two or more
categories. Each cell reflects the total count of cases for a specific pair of categories.

Common Uses

The Chi-Square Test of Independence is commonly used to test the following:

• Statistical independence or association between two or more categorical variables.

The Chi-Square Test of Independence can only compare categorical variables. It cannot make
comparisons between continuous variables or between categorical and continuous variables.
Additionally, the Chi-Square Test of Independence only assesses associations between
categorical variables, and cannot provide any inferences about causation.

If your categorical variables represent "pre-test" and "post-test" observations, then the chi-
square test of independence is not appropriate. This is because the assumption of the
independence of observations is violated. In this situation, McNemar's Test is appropriate.
Data Requirements

Your data must meet the following requirements:

1. Two categorical variables.
2. Two or more categories (groups) for each variable.
3. Independence of observations.
   • There is no relationship between the subjects in each group.
   • The categorical variables are not "paired" in any way (e.g., pre-test/post-test observations).
4. Relatively large sample size.
   • Expected frequencies for each cell are at least 1.
   • Expected frequencies should be at least 5 for the majority (80%) of the cells.

Hypotheses

The null hypothesis (H0) and alternative hypothesis (H1) of the Chi-Square Test of
Independence can be expressed in two different but equivalent ways:

H0: "[Variable 1] is independent of [Variable 2]"


H1: "[Variable 1] is not independent of [Variable 2]"

OR

H0: "[Variable 1] is not associated with [Variable 2]"


H1: "[Variable 1] is associated with [Variable 2]"

Data Set-Up

There are two different ways in which your data may be set up initially. The format of the data
will determine how to proceed with running the Chi-Square Test of Independence. At
minimum, your data should include two categorical variables (represented in columns) that will
be used in the analysis. The categorical variables must include at least two groups. Your data
may be formatted in either of the following ways:

If you have the raw data (each row is a subject):

• Cases represent subjects, and each subject appears once in the dataset. That is, each row
represents an observation from a unique subject.
• The dataset contains at least two nominal categorical variables (string or numeric). The
categorical variables used in the test must have two or more categories.

If you have frequencies (each row is a combination of factors):

• Cases represent the combinations of categories for the variables.
• Each row in the dataset represents a distinct combination of the categories.
• The value in the "frequency" column for a given row is the number of unique
subjects with that combination of categories.
• You should have three variables: one representing each category, and a third
representing the number of occurrences of that particular combination of factors.
• Before running the test, you must activate Weight Cases, and set the frequency variable
as the weight.
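
In syntax, activating Weight Cases is a single command. A sketch assuming the frequency variable is named Freq (a hypothetical name):

WEIGHT BY Freq.
* Turn weighting back off when finished.
WEIGHT OFF.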
Run a Chi-Square Test of Independence

In SPSS, the Chi-Square Test of Independence is an option within the Crosstabs procedure.
Recall that the Crosstabs procedure creates a contingency table or two-way table, which
summarizes the distribution of two categorical variables.

To create a crosstab and perform a chi-square test of independence, click Analyze ->
Descriptive Statistics -> Crosstabs.

A. Row(s): One or more variables to use in the rows of the crosstab(s). You must enter at least
one Row variable.

B. Column(s): One or more variables to use in the columns of the crosstab(s). You must enter
at least one Column variable.

Also note that if you specify one row variable and two or more column variables, SPSS will
print crosstabs for each pairing of the row variable with the column variables. The same is true
if you have one column variable and two or more row variables, or if you have multiple row and
column variables. A chi-square test will be produced for each table. Additionally, if you include
a layer variable, chi-square tests will be run for each pair of row and column variables within
each level of the layer variable.

C. Layer: An optional "stratification" variable. If you have turned on the chi-square test results
and have specified a layer variable, SPSS will subset the data with respect to the categories of
the layer variable, then run chi-square tests between the row and column variables. (This
is not equivalent to testing for a three-way association, or testing for an association between
the row and column variable after controlling for the layer variable.)

D. Statistics: Opens the Crosstabs: Statistics window, which contains fifteen different
inferential statistics for comparing categorical variables. To run the Chi-Square Test of
Independence, make sure that the Chi-square box is checked.

E. Cells: Opens the Crosstabs: Cell Display window, which controls which output is displayed
in each cell of the crosstab. (Note: in a crosstab, the cells are the inner sections of the table. They
show the number of observations for a given combination of the row and column categories.)

There are three options in this window that are useful (but optional) when performing a Chi-
Square Test of Independence:

1. Observed: The actual number of observations for a given cell. This option is enabled by
default.

2. Expected: The expected number of observations for that cell (see the test statistic formula).

3. Unstandardized Residuals: The "residual" value, computed as observed minus expected.

F. Format: Opens the Crosstabs: Table Format window, which specifies how the rows of the
table are sorted.

Example: Chi-square Test for 3x2 Table

PROBLEM STATEMENT

In the sample dataset, respondents were asked their gender and whether or not they were a
cigarette smoker. There were three answer choices: Nonsmoker, Past smoker, and Current
smoker. Suppose we want to test for an association between smoking behavior (nonsmoker,
current smoker, or past smoker) and gender (male or female) using a Chi-Square Test of
Independence (we'll use α = 0.05).

BEFORE THE TEST

Before we test for "association", it is helpful to understand what an "association" and a "lack of
association" between two categorical variables look like. One way to visualize this is using
clustered bar charts. Let's look at the clustered bar chart produced by the Crosstabs procedure.

This is the chart that is produced if you use Smoking as the row variable and Gender as the
column variable (running the syntax later in this example):

The "clusters" in a clustered bar chart are determined by the row variable (in this case, the
smoking categories). The color of the bars is determined by the column variable (in this case,
gender). The height of each bar represents the total number of observations in that particular
combination of categories.

This type of chart emphasizes the differences within the categories of the row variable. Notice
how within each smoking category, the heights of the bars (i.e., the number of males and
females) are very similar. That is, there are approximately equal numbers of male and female
nonsmokers, of male and female past smokers, and of male and female current smokers. If there
were an association between gender and smoking, we would expect these counts to differ
between groups in some way.

RUNNING THE TEST

1. Open the Crosstabs dialog (Analyze > Descriptive Statistics > Crosstabs).
2. Select Smoking as the row variable, and Gender as the column variable.
3. Click Statistics. Check Chi-square, and then click Continue.
4. (Optional) Check the box for Display clustered bar charts.
5. Click OK.
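
This is the syntax referred to earlier in the example; the variable names Smoking and Gender are those used in the steps above:

CROSSTABS
  /TABLES=Smoking BY Gender
  /STATISTICS=CHISQ
  /BARCHART.

The BARCHART subcommand produces the clustered bar chart discussed above.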

OUTPUT

TABLES

The first table is the Case Processing summary, which tells us the number of valid cases used
for analysis. Only cases with nonmissing values for both smoking behavior and gender can be
used in the test.

The next tables are the crosstabulation and chi-square test results.

The key result in the Chi-Square Tests table is the Pearson Chi-Square.

• The value of the test statistic is 3.171.


• The footnote for this statistic pertains to the expected cell count assumption (i.e.,
expected cell counts are all greater than 5): no cells had an expected count less than 5,
so this assumption was met.
• Because the test statistic is based on a 3x2 crosstabulation table, the degrees of freedom
(df) for the test statistic is df = (R − 1) ∗ (C − 1) = (3 − 1) ∗ (2 − 1) = 2 ∗ 1 = 2.

• The corresponding p-value of the test statistic is p = 0.205.

DECISION AND CONCLUSIONS

Since the p-value is greater than our chosen significance level (α = 0.05), we do not reject the
null hypothesis. Rather, we conclude that there is not enough evidence to suggest an
association between gender and smoking.

Based on the results, we can state the following:

• No association was found between gender and smoking behavior (Χ2(2) = 3.171, p =
0.205).

PROBLEM STATEMENT (CHI-SQUARE TEST FOR 2X2 TABLE)

Let's continue the row and column percentage example from the Crosstabs tutorial, which
described the relationship between the variables RankUpperUnder (upperclassman/
underclassman) and LivesOnCampus (lives on campus/lives off-campus). Recall that the
column percentages of the crosstab appeared to indicate that upperclassmen were less likely
than underclassmen to live on campus:

• The proportion of underclassmen who live off campus is 34.8%, or 79/227.
• The proportion of underclassmen who live on campus is 65.2%, or 148/227.
• The proportion of upperclassmen who live off campus is 94.4%, or 152/161.
• The proportion of upperclassmen who live on campus is 5.6%, or 9/161.

Suppose that we want to test the association between class rank and living on campus using a
Chi-Square Test of Independence (using α = 0.05).

BEFORE THE TEST

The clustered bar chart from the Crosstabs procedure can act as a complement to the column
percentages above. Let's look at the chart produced by the Crosstabs procedure for this
example:

The height of each bar represents the total number of observations in that particular
combination of categories. The "clusters" are formed by the row variable (in this case, class
rank). This type of chart emphasizes the differences within the underclassmen and
upperclassmen groups. Here, the differences in the number of students living on campus versus
living off-campus are much starker within the class rank groups.

RUNNING THE TEST

1. Open the Crosstabs dialog (Analyze > Descriptive Statistics > Crosstabs).
2. Select RankUpperUnder as the row variable, and LiveOnCampus as the column variable.
3. Click Statistics. Check Chi-square, then click Continue.
4. (Optional) Click Cells. Under Counts, check the boxes for Observed and Expected, and
under Residuals, click Unstandardized. Then click Continue.
5. (Optional) Check the box for Display clustered bar charts.
6. Click OK.
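
The equivalent syntax, using the variable names from the steps above (note that the dataset description also spells the second variable LivesOnCampus; match whichever name your file uses):

CROSSTABS
  /TABLES=RankUpperUnder BY LiveOnCampus
  /STATISTICS=CHISQ
  /CELLS=COUNT EXPECTED RESID
  /BARCHART.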

OUTPUT

TABLES

The first table is the Case Processing summary, which tells us the number of valid cases used
for analysis. Only cases with nonmissing values for both class rank and living on campus can be
used in the test.

The next table is the crosstabulation. If you elected to check off the boxes for Observed Count,
Expected Count, and Unstandardized Residuals, you should see the following table:

With the Expected Count values shown, we can confirm that all cells have an expected value
greater than 5.

The expected cell counts and residuals (observed minus expected) for the crosstabulation of
class rank by living on campus can be computed directly from the table's row and column
totals. These numbers can be plugged into the chi-square test statistic formula:
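A worked version, using the counts implied by the percentages above (148 and 79 underclassmen living on and off campus, 9 and 152 upperclassmen, N = 388) and expected counts computed as row total × column total / N (e.g., 227 × 157 / 388 ≈ 91.85 for underclassmen on campus):

$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(148 - 91.85)^2}{91.85} + \frac{(79 - 135.15)^2}{135.15} + \frac{(9 - 65.15)^2}{65.15} + \frac{(152 - 95.85)^2}{95.85} \approx 138.93$$

which matches the Pearson Chi-Square value in the output up to rounding.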

We can confirm this computation with the results in the Chi-Square Tests table:

The row of interest here is Pearson Chi-Square and its footnote.

• The value of the test statistic is 138.926.


• The footnote for this statistic pertains to the expected cell count assumption (i.e.,
expected cell counts are all greater than 5): no cells had an expected count less than 5,
so this assumption was met.
• Because the crosstabulation is a 2x2 table, the degrees of freedom (df) for the test
statistic is df = (R − 1) ∗ (C − 1) = (2 − 1) ∗ (2 − 1) = 1.

• The corresponding p-value of the test statistic is so small that it is cut off from display.
Instead of writing "p = 0.000", we write the mathematically correct statement p < 0.001.

DECISION AND CONCLUSIONS

Since the p-value is less than our chosen significance level α = 0.05, we can reject the null
hypothesis, and conclude that there is an association between class rank and whether or not
students live on-campus.

Based on the results, we can state the following:


• There was a significant association between class rank and living on campus (Χ2(1) =
138.9, p < .001).

Chi-Square Test of Association


Research question type: Association of two variables

What kind of variables: Categorical (nominal or ordinal with few categories)

Common Applications: Questionnaire data from a survey

Example 1: Research question: Is there an association between personality and colour
preference?

A group of students were classified in terms of personality (introvert or extrovert) and in
terms of colour preference (red, yellow, green or blue). Personality and colour preference are
categorical data.

Table 1:

Student   Personality   Colour preference
1         Introvert     Yellow
2         Extrovert     Red
3         Extrovert     Yellow
4         Introvert     Green
5         Extrovert     Blue
etc.

Data of this type are usually summarised by counting the number of subjects in each
personality/colour group and presented in the form of a table (cross-tabulation), sometimes
called a contingency table.

The results of a survey of 400 students are tabulated below:

Table 2:

                           Colour
               Red          Yellow       Green        Blue         Totals
Personality
  Introvert    20 (10%)     6 (15%)      30 (37.5%)   44 (55%)     100 (25%)
  Extrovert    180 (90%)    34 (85%)     50 (62.5%)   36 (45%)     300 (75%)
Totals         200 (100%)   40 (100%)    80 (100%)    80 (100%)    400 (100%)

As there are different numbers of students in each group, use of percentages helps to spot any
patterns in the data. Table 2 shows column percentages in brackets. Table 3 shows row
percentages in brackets. [You can choose total percentages too, when each number is presented
as a percentage of the total.] Think about how you would choose which to use.

Table 3:

                           Colour
               Red          Yellow       Green        Blue         Totals
Personality
  Introvert    20 (20%)     6 (6%)       30 (30%)     44 (44%)     100 (100%)
  Extrovert    180 (60%)    34 (11.3%)   50 (16.7%)   36 (12%)     300 (100%)
Totals         200 (50%)    40 (10%)     80 (20%)     80 (20%)     400 (100%)

Hypotheses:
The 'null hypothesis' might be:
H0: Colour preference is not related to (associated with) personality
And an 'alternative hypothesis' might be:
H1: Colour preference is related to (associated with) personality

Steps in SPSS (PASW):

SPSS likes numbers, so with data entered in the format of Table 1 (data from individuals), using
1 for introvert and 2 for extrovert personality, and 1=red, 2=yellow, 3=green, 4=blue, choose

Analyze > Descriptive Statistics > Crosstabs

• Select one variable as the Row variable, and the other as the Column variable (see
below)
• Click on the Statistics button and select Chi-square in the top left-hand corner and Continue.
• Click on the Cells button and select Column percentages (or Row) and
Continue. [NB You can also ask for Expected Frequencies from the Cells
button]

• Click OK
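
In syntax form, assuming the variables are named Personality and Colour as in Table 1:

CROSSTABS
  /TABLES=Personality BY Colour
  /STATISTICS=CHISQ
  /CELLS=COUNT COLUMN.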

Output should look something like below:

Results:

From the top row of the last table, Pearson Chi-Square statistic, χ2 = 71.20, and p < 0.001; i.e., a
very small probability of the observed data under the null hypothesis of no relationship.
[NEVER write p = 0.000.] The null hypothesis is rejected, since p < 0.05 (in fact p < 0.001).

Conclusion:

Colour preference seems to be related to personality (p < 0.001). Go back to the tabulations
(Tables 2 and 3). Note that, for instance, the most popular colour for introverts is blue (44% of
them preferred blue, Table 3), whilst the most popular colour for extroverts is red (60% of them
preferred red, Table 3). Also, of all people preferring red, 90% are extroverts (Table 2), whilst
of all people preferring blue, 55% are introverts (Table 2).

Data already grouped into a table:

Grouped data as tabulated in Table 2 can be entered in SPSS as below (with codes as above):

Personality   Favourite colour   Frequency
1             1                  20
1             2                  6
1             3                  30
1             4                  44
2             1                  180
2             2                  34
2             3                  50
2             4                  36

Before carrying out the SPSS steps listed above, choose:

Data -> Weight Cases, select Weight cases by, and choose your frequency variable as the
Frequency Variable. Then repeat the steps outlined above to get the same output as before.
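
The whole grouped-data analysis as syntax, again assuming variables named Personality, Colour, and Frequency:

WEIGHT BY Frequency.
CROSSTABS
  /TABLES=Personality BY Colour
  /STATISTICS=CHISQ
  /CELLS=COUNT COLUMN.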

Example 2:

Research question: Is there an association between the proportion of defectives and the
machine used?

A sample of 200 components is selected from the output of a factory that uses three different
machines to manufacture these components. Each component in the sample is inspected to
determine whether or not it is defective. The machine that produced the component is also
recorded. The results are as follows:

                            Machine
                 1            2            3            Totals
Outcome
  Defective      8 (13%)      6 (9%)       12 (17%)     26 (13%)
  Non-defective  54 (87%)     62 (91%)     58 (83%)     174 (87%)
Totals           62 (100%)    68 (100%)    70 (100%)    200 (100%)

The manager wishes to determine whether or not there is a relationship (association)
between the proportion of defectives and the machine used. The null and alternative hypotheses
can be formulated as above, but in this case it is also equivalent to saying:

H0: There are no differences between machines in the percentage of defectives produced

H1: There are differences between machines in the percentage of defectives produced

Using the instructions outlined above for grouped data, SPSS gives Pearson Chi-Square
statistic, χ2 = 2.112, and p = 0.348. Hence, there is no real evidence that the percentage of
defectives varies from machine to machine.
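
As a check on the arithmetic: each expected count is row total × column total / N (e.g., 26 × 62 / 200 = 8.06 expected defectives from machine 1), and summing (O − E)²/E over the six cells reproduces the SPSS value:

$$\chi^2 = \frac{(8 - 8.06)^2}{8.06} + \frac{(6 - 8.84)^2}{8.84} + \frac{(12 - 9.10)^2}{9.10} + \frac{(54 - 53.94)^2}{53.94} + \frac{(62 - 59.16)^2}{59.16} + \frac{(58 - 60.90)^2}{60.90} \approx 2.11$$

with df = (2 − 1) ∗ (3 − 1) = 2.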

Validity of Chi-squared (χ2) tests for 2-way tables

Chi-squared tests are only valid when you have a reasonable sample size.

For 2x2 tables (i.e., only two categories in each variable):

• If the total sample size is greater than 40, χ2 can be used.
• If the total sample size is between 20 and 40, and the smallest expected frequency is at
least 5, χ2 can be used (see note 'a.' at the bottom of the SPSS output to see if this is a
problem).
• Otherwise Fisher's exact test must be used (SPSS will automatically give this).

For other tables:

• χ2 can be used if no more than 20% of the expected frequencies are less than 5 and none
is less than 1 (see note 'a.' at the bottom of the SPSS output to see if this is a problem).

It is possible to 'pool' or 'collapse' categories into fewer, but this must only be done if it is
meaningful to group the data in this way.

ANOVA
A one-way analysis of variance (ANOVA) test is a statistical tool to determine whether there
are any differences between the means of three or more independent groups on a continuous
variable. This particular test assumes that the data in each group are normally distributed.

Assumptions of a One-Way ANOVA test

Before running a One-Way ANOVA test in SPSS, it is best to ensure the data meets the
following assumptions.

1. The dependent variable should be measured on a continuous scale (either
interval or ratio).
2. There should be three or more independent (non-related) groups.
3. There are no outliers present in the dependent variable.
4. The dependent variable should be normally distributed. See how to test for
normality in SPSS.
5. The dependent variable should have homogeneity of variances across the groups.
In other words, the group standard deviations need to be approximately the same.

Example experiment

For instance, say we have measured the weights of different rats. There are three groups of
rats:

1. Controls: these have not received any physical exercise.
2. Exercised: these have performed 6 weeks of physical exercise.
3. Pill: these have been treated with a diet pill for 6 weeks.

We want to know if there are any differences between the weights of the rats after the 6 week
period. We can now formulate two hypotheses.

• The null hypothesis would read: There are no differences in the weights of the rats after
the 6 week period.
• The alternative hypothesis would be: There is a difference in weight between the three
rat groups.

The one-way ANOVA test will be able to inform us if there is a significant difference between
the three groups. However, it cannot directly state which group(s) are different from each
other. So, if a one-way ANOVA test indicates a significant result, further post-hoc testing is
required to investigate specifically which groups are significantly different.

The dataset

In SPSS, I have created a file containing two data variables labeled 'Weight' and
'Group'. The first contains all of the rat weights (measured in grams). In the 'Group'
column, I have assigned the numbers '1', '2', or '3' to indicate which experiment group
the rats belong to.

Below is a snapshot of what part of the data looks like so you get the idea.

Performing a One-Way ANOVA test in SPSS

Now that we have the dataset, let's perform the one-way ANOVA test in SPSS.

1. Firstly, go to Analyze > Compare Means > One-Way ANOVA....

2. A new window will open. Here you need to move the dependent variable (Weight in
the example) into the window called Dependent List and the grouping variable (Group)
into the box titled Factor.

3. Since we do not know whether there are any differences in weights between our three
groups, we should avoid performing any post-hoc test just yet. It is, however, worth
getting further descriptive data at this point. To do this, click the Options... button.
This will bring up a new window; here you should tick the Descriptive option under the
Statistics heading and click the Continue button.

4. Finally, click the OK button to run the ANOVA test.
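
Steps 1–4 have a short syntax equivalent, using the Weight and Group variable names from the dataset described above:

ONEWAY Weight BY Group
  /STATISTICS DESCRIPTIVES.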

One-way ANOVA Output

The results are presented in the output window in SPSS. You should be presented with
two boxes.

The first (Descriptives) contains a wealth of information including mean, standard
deviation, standard error and 95% confidence intervals stratified by each group, as well
as combined. We can clearly see large differences in mean weight values.

The next output box (ANOVA) contains all of the statistical information regarding the
one-way ANOVA test. This includes the degrees of freedom (df), the F statistic (F) and
the all-important significance value (Sig.).

One-Way ANOVA interpretation

By looking at the table we can see that the significance (Sig.) value is '.000', i.e. p < .001.
This is considerably lower than our significance threshold of p < 0.05. Therefore, we
should reject the null hypothesis and accept the alternative hypothesis.

One-Way ANOVA reporting

At this point, we can confirm that there is a significant difference in rat weights between
the three groups. Thus we could summarise this, including the statistical output, in one
simple sentence.

The reporting includes the degrees of freedom, both between and within groups, the F
statistic and the P value.

Performing post-hoc tests

Since the results of the one-way ANOVA test returned a significant result, it is now
appropriate to carry out post-hoc tests. This is to determine which specific groups are
significantly different from one another.

1. To perform post-hoc tests in SPSS, firstly go back to the one-way ANOVA
window by going to Analyze > Compare Means > One-Way ANOVA... (as
described in Step 1).
2. Now, enter the same data into the appropriate windows again (as described in
Step 2).
3. Click the Post Hoc... button to open the Post Hoc Multiple Comparisons
window. There are multiple options for post hoc testing, but for this example
we will use the commonly adopted Tukey post hoc test. Tick the Tukey
option under the Equal Variances Assumed heading.

Now click the Continue button.

4. To run the test, click the OK button.
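
In syntax, the post-hoc run simply adds a POSTHOC subcommand to the earlier ONEWAY command:

ONEWAY Weight BY Group
  /STATISTICS DESCRIPTIVES
  /POSTHOC=TUKEY ALPHA(0.05).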

Post-hoc (Tukey) output

By going to the output window, you will now see a new section of results titled Post
Hoc Tests. The results that we are interested in are presented in the Multiple
Comparisons box.

The output compares each possible pair of groups. For example, the first row presents the
results for the comparison between the 'Control' and the 'Exercised' groups, as well as
that between the 'Control' and 'Pill' groups. The Mean Difference is also given, which is
the average difference in weights between the groups in comparison. Additionally, the
table contains the standard error (Std. Error) and 95% confidence intervals. The P
values for each comparison can be found under the Sig. column.

Post hoc (Tukey) interpretation

By looking at the Sig. column, it can be seen that all comparisons are significant, since
the P values are all reported as .000 (i.e., p < .001). Thus, the weights for the three rat
groups are significantly different from each other.

Post hoc (Tukey) reporting

Since we now know the comparisons between each group, we can add to our previous
reporting with the additional post-hoc results. I have provided an example for the full
reporting below.

Two Way ANOVA and Interactions

The Design

Suppose a statistics teacher gave an essay final to his class. He randomly divides the class in
half such that half the class writes the final in a blue book and half on notebook computers.
In addition, the students are partitioned into three groups: no typing ability, some typing
ability, and highly skilled at typing. Answers written in blue books will be transcribed to word
processors and scoring will be done blindly. Not with a blindfold, but the instructor will not
know the method or skill level of the student when scoring the final. The dependent measure
will be the score on the essay part of the final exam.

The first factor will be called Method and will have two levels, blue book and computer. The
second factor will be designated as Ability and will have three levels: none, some, and lots. Each
subject will be measured a single time. Any effects discovered will necessarily be between
subjects or groups, hence the designation "between groups" design.

The Data

In the case of the example data, the Method factor has two levels while the Ability factor has
three. The X variable is the score on the final exam. The example data file appears below.

The independent variables or factors do not need to be in any particular order in the data file;
it is simply more convenient and easier to read if they are in order, as shown above. In most
real-life two-factor ANOVAs there will be unequal numbers of subjects in each group.

Example Analysis using General Linear Model in SPSS

The analysis is done in SPSS by selecting Analyze > General Linear Model > Univariate....
In the next screen, the Dependent Variable is X and the Fixed Factors are Ability and Method.
The screen will appear as follows.

The Options/Display/Descriptive Statistics button was selected in this example to produce the
table of means and standard deviations. In addition, the Plots option button was selected as
follows.

Two graphs will be drawn, the first with Ability as the horizontal axis and Method as the
separate lines and the second with Method as the horizontal axis and Ability as the separate
lines.
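
A syntax sketch of this General Linear Model run, reproducing the descriptive statistics and both profile plots (the variable names X, Ability, and Method are those used in the example):

UNIANOVA X BY Ability Method
  /METHOD=SSTYPE(3)
  /PRINT=DESCRIPTIVE
  /PLOT=PROFILE(Ability*Method Method*Ability)
  /DESIGN=Ability Method Ability*Method.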

Interpretation of Output

The interpretation of the output from the General Linear Model command will focus on two
parts: the table of means and the ANOVA summary table. The table of means is the primary
focus of the analysis while the summary table directs attention to the interesting or statistically
significant portions of the table of means.

Often the means are organized and presented in a slightly different manner than the form of the
output from the General Linear Model command. The table of means may be rearranged and
presented as follows:

Table of means for two factor ANOVA:

                None     Some     Lots     Marginal
blue book       26.67    31.00    33.33    30.33
computer        28.00    36.67    27.00    30.56
Marginal        27.33    33.83    30.17    30.44

The means inside the boxes are called cell means, the means in the margins are called marginal
means, and the number on the bottom right-hand corner is called the grand mean. An analysis
of these means reveals that there is very little difference between the marginal means for the
different levels of Method across the levels of Ability (30.33 vs. 30.56). The marginal means of
Ability over levels of Method are different (27.33 vs. 33.83 vs. 30.17) with the mean for "Some"
being the highest. The cell means show an increasing pattern for levels of Ability using a blue
book (26.67 vs. 31.00 vs. 33.33) and a different pattern for levels of Ability using a computer
(28.00 vs. 36.67 vs. 27.00).

Graphs of Means

Graphs of means are often used to present information in a manner that is easier to
comprehend than tables of means. One factor is selected for presentation as the X-axis and
its levels are marked on that axis. Separate lines are drawn at the height of the mean for each
level of the second factor. In the following graph, the Ability, or keyboard experience, factor
was selected for the X-axis and the Method factor was selected for the different lines.

Presenting the information in an opposite fashion would be equally correct, although some
graphs are more easily understood than others, depending upon the values for the means and
the number of levels of each factor. The second possible graph is presented below.

If there is any doubt, it is recommended that both versions of the graph be attempted and the
one which best illustrates the data be selected for inclusion in the statistical report. In this
case it appears that the graph with Ability on the X-axis is easier to understand than the one
with Method on the X-axis.

The ANOVA Summary Table

The results of the analysis are presented in the ANOVA summary table, presented below for
the example data.

The items of primary interest in this table are the effects listed under the "Source" column and
the values under the "Sig." column. As in the previous hypothesis test, if the value of "Sig." is
less than the value of α set by the experimenter, then that effect is significant. If α = .05, then
the Ability main effect and the Ability BY Method interaction would be significant in this table.

Main Effects

Main effects are differences in means over levels of one factor collapsed over levels of the other
factor. This is actually much easier than it sounds. For example, the main effect of Method is
simply the difference between the means of final exam score for the two levels of Method,
ignoring or collapsing over experience. As seen in the second method of presenting a table of
means, the main effect of Method is whether the two marginal means associated with the
Method factor are different. In the example case these means were 30.33 and 30.56, and the
difference between them was not statistically significant. As can be seen from the
summary table, the main effect of Ability is significant. This effect refers to the differences
between the three marginal means associated with Ability. In this case the values for these
means were 27.33, 33.83, and 30.17 and the differences between them may be attributed to a
real effect.

Simple Main Effects

A simple main effect is a main effect of one factor at a given level of a second factor. In the
example data it would be possible to talk about the simple main effect of Ability at Method
equal blue book. That effect would be the difference between the three cell means at level
a1 (26.67, 31.00, and 33.33). One could also talk about the simple main effect of Method at
Ability equal lots (33.33 and 27.00). Simple main effects are not directly tested in this analysis.
They are, however, necessary to understand an interaction.

Interaction Effects

An interaction effect is a change in the simple main effect of one variable over levels of the
second. An A X B or A BY B interaction is a change in the simple main effect of B over levels of
A or the change in the simple main effect of A over levels of B. In either case the cell means
cannot be modeled simply by knowing the size of the main effects. An additional set of
parameters must be used to explain the differences between the cell means. These parameters
are collectively called an interaction. The change in the simple main effect of one variable over
levels of the other is most easily seen in the graph of the interaction. If the lines describing
the simple main effects are not parallel, then a possibility of an interaction exists. As can be
seen from the graph of the example data, the possibility of a significant interaction exists
because the lines are not parallel. The presence of an interaction was confirmed by the
significant interaction in the summary table. The following graph overlays the main effect of
Ability on the graph of the interaction.

Two things can be observed from this presentation. The first is that the main effect of Ability is
possibly significant, because the means are different heights. Second, the interaction is possibly
significant because the simple main effects of Ability using blue book and computer are
different from the main effect of Ability.

One method of understanding how main effects and interactions work is to observe a wide
variety of data and data analyses. With three effects, A, B, and A x B, each of which may or may
not be significant, there are eight possible combinations of effects. All eight are presented on
the following pages.

Example Data Sets, Means, and Summary Tables

No Significant Effects

Main Effect of A

Main Effect of B

A x B Interaction

Main Effects of A and B

Main effect of A, A x B Interaction

Main Effect of B, A x B Interaction

Main Effects of A and B, A x B Interaction

No Significant Effects

The following analysis is interesting in that the means and the graph are identical to the case
where all effects are statistically significant. In this case, however, the within-cell variance is
high relative to the differences between the means, resulting in no significant effects.

Note that the means and graphs of the last two example data sets were identical. The ANOVA
table, however, provided a quite different analysis of each data set. The data in this final set was
constructed such that there was a large standard deviation within each cell. In this case the
marginal and cell means were not different enough to warrant rejecting the hypothesis of no
effects, thus no significant effects were observed.
