Research Methodology - ITM University - MBA - Sem 2
UNIVERSITY
ONLINE
Research Methodology
1.1 Introduction
1.7 Characteristics of Research
1.11 Chapter Summary
2.1 Introduction
2.5 Research Design
2.10 Chapter Summary
3.1 Introduction
3.8 Chapter Summary
www.itmuniversityonline.org Page 2
4.1 Introduction
4.6 Chapter Summary
5.1 Introduction
5.7 Chapter Summary
6.1 Introduction
6.2 Tabulation
6.4 Multiple Discriminant Analysis
6.7 Measures of Skewness
Introduction to Research Methodology
1.1 Introduction
The term research is used progressively for any kind of exploration that is intended to
significant to a wide range of subjects, such as leisure studies and sports, hospitality,
healthcare and nursing studies, the natural sciences, social sciences, the environment,
Various university courses include research that students must carry out independently,
in the form of projects, dissertations, and theses, and the more advanced the degree, the
Define research
1.2 Definition of Research
matter. Research is one of the ways to find a good solution for problems, by
and evaluating data; making deductions and reaching conclusions; and at last
carefully testing the conclusions to determine whether they fit the formulating
hypothesis."
In other words, research is the search for knowledge through objective and systematic
hypothesis, enunciating the problem, analyzing the facts, collecting the facts or data,
Research Methods
Research methods and research methodology differ. Methods/techniques that are used
during the course of conducting the research are known as research methods. The study
of research methods offers training to apply various methods to solve the research
problem.
Methods that are concerned with the collection of data. Example: Questionnaire
Statistical techniques that are used for building relationships between the variables
Methods used for calculating the accuracy of the results. For example, testing of
hypothesis, etc.
Research Methodology
research, various steps are generally adopted by a researcher. Methodology refers to the
methodology is the procedure by which researchers predict, explain, and describe their
work.
methods, and techniques relevant for the problem chosen. Also, research methodology
1.4 Objectives of Research
intention of research is to find out the hidden truth. Research study has its own specific
purpose.
consideration.
1.5 Motivation for Conducting Research
Note:
studies.
3. The procedural design of the research should be carefully planned to yield results
4. The researcher should report with complete frankness the flaws in procedural
5. The analysis of data should be sufficiently adequate to reveal its significance and
6. Conclusions should be confined to those justified by the data of the research and
Source: 1. James Harold Fox, Criteria of Good Research, Phi Delta Kappan, Vol. 39 (March, 1958), pp.
1.7 Characteristics of Research
Various terms are used to check the validity and fairness of the research; the success of
Reliability
This is a subjective term that cannot be measured precisely. However, various techniques
or instruments are used to assess the reliability of research. A reliable
research is that which yields similar results every time it is undertaken, with similar
number of similar results produced.
Validity
Validity refers to the effectiveness with which you approximate research conclusions,
its validity. The validity of the research instrument can be defined as the suitability of
the research instrument to the research problem or how accurately the instrument
measures the problem. Defining concepts in the best possible manner can keep the
Accuracy
Accuracy refers to the degree to which each research process, instrument, and tool is
related to each other. It measures whether research tools have been selected in the best
possible manner and research procedures suit the research problem or not. The
accuracy of research can be improved by choosing the best data collection tool.
Credibility
Credibility comes with the use of the best source of information and the best procedures
in research, as secondary data has been manipulated by humans and is therefore not
very valid to use in research. So the research might be completed in less time, but its
credibility will be at stake. Basing research entirely on secondary data when primary data
can be gathered is the least credible option, although a certain
percentage of secondary data can be used. The credibility of research can be increased
by giving accurate references.
Generalizability
takes a small sample from the target population to conduct the research. As the sample
is merely a representative of the population, the findings should also be the same. If
research findings can be applied to any sample from the population, the results of the
Empirical
Research has been tested for accuracy and is based on real life experiences.
Systematic
Each step must follow the other. There is a set of procedures that have been tested
Controlled
When similar events are tested in research, due to the broader nature of factors that
affect that event, some factors are taken as controlled factors, while others are tested
for the possible effect. The controlled factors or variables must be controlled
experiments are conducted in the laboratory but in social sciences, it becomes difficult to
Good Research, Phi Delta Kappan, Vol. 39 (March, 1958), pp. 285-86. 3. Danny N. Bellenger and Barnett,
1.8 Types of Research
statistical research. It deals with everything that can be enumerated and studied, which
has an impact on the lives of the people it deals with. Example: frequency of customers,
everyday problems, cure of illnesses, and developing innovative techniques, rather than
acquiring knowledge, can be obtained using applied research. For example, to increase
fundamental research. In other words, gathering knowledge for the sake of knowledge is
concerning human behavior carried on with a view to make generalizations about human
behavior, basic science probe for answers to questions such as 'how did the universe
Qualitative research aims to collect detailed information about human attitudes and the
reasons that govern such attitudes. Research designed to find out how people feel or
Conceptual vs. Empirical: Philosophers and thinkers use this type of research to
research, coming up with conclusions. These conclusions are capable of being verified by
research.
The objective of exploratory research is to analyze the data and explore the possibility of
understanding of the situation. It uses a survey and observation method for research
findings. For example, finding the various causes for decrease in the revenue of a
or variables. For example, finding the impact of incentives on the productivity of the
Significance of Research
Scientific and inductive thinking and the development of logical habits of thinking and
Research is very useful in fields such as applied economics, business, and medicine,
and its use is increasing day by day. Due to the complicated nature of business,
In the economic system, research provides a basis for all government policies. A large part of
the government's budget is based on an analysis of the needs and desires of the
Research has a great importance in solving many planning and operational problems of
industries and businesses. Market and operations research, along with motivational
In market research, investigation of the development and structure of a market for the
purpose of formulating efficient policies for production, sales, and purchasing is done.
maximization.
The significance of research can also be explained with the following points:
To students who have to write a master's or Ph.D. thesis, research may mean
livelihood.
To philosophers and thinkers, research may mean the outlet for new ideas and
insights.
To literary men and women, research may mean the development of new styles
Thus, research is the fountain of knowledge for the sake of knowledge and an important
source of providing guidelines for solving different business, governmental, and social
problems. It is a sort of formal training that enables one to understand the new
Source: C. R. Kothari, Research Methodology: Methods and Techniques, New Age International Publishers,
2nd Edition
[Figure: Types of research, classified by application (pure research, applied research), by objective (descriptive, explanatory, exploratory, and co-relational research), and by inquiry mode (qualitative research, quantitative research; structured and unstructured approaches).]
[Figure: Research process flow: research problem formulation, literature review, hypothesis formulation, research design (sample design), data collection, hypothesis testing, interpretation, report preparation.]
There are two types of research problems. Some research problems relate to states of
nature and others relate to the relationship between variables. A research problem is
first stated in a general way, and then doubts or ambiguities, if any, relating to the problem
are resolved. The feasibility of a final result is considered before the formulation of the
research problem.
Understanding the problem thoroughly and rephrasing the same into meaningful terms
from an analytical point of view are the two steps involved in the formulation of the
research problem. Initially, the problem can be stated in a broad, general way and
terms as possible.
understanding of the problem chosen and to acquire proper theoretical and practical
to be considered for research. A literature review helps in assessing the current status of
the problem. After formulating a research problem, a brief summary should be written.
For example, for a research worker, writing a thesis for a Ph.D. degree or writing a
synopsis of the topic and submitting it to the Committee or Research Board is necessary
for approval.
Extensive literature survey that is concerned with the research problem is very
important. It can be made simple and easy by the abstracting and indexing of journals
government reports, academic journals, books, etc. help a lot. Also, it should be noted
that one source will lead to another. At this level, a researcher can take the help of a
Hypothesis formulation is the next step after the literature survey. The hypotheses should be
formulation is an important step because it provides the focal point for research. This is
the most crucial step in the analysis of data. It indirectly affects the quality of data that
is required for the analysis. This is an important step in the development of research
problems. The hypothesis to be formulated must be very specific and limited because it
Hypotheses are more specific predictions about the nature and direction of the
assumption made in order to draw and test its logical or empirical consequences.
Research design consists of sample design and methods for the collection of
measurement and analysis of data. The research design must contain the details of the
defined sample and population, population and sample type, their size and their
also contains procedures and techniques for data collection, the sample of research
population and method, and the technique to process and analyze the data.
While selecting methods of collecting primary data, take into consideration the nature of
investigation, scope, and objective of the inquiry, available time, financial resources, and
the desired degree of accuracy. There are various methods for collecting primary data:
After data collection, the data is codified, tabulated, and analyzed for statistical
inferences. Through coding, the data is categorized and transformed in the form of
Statistical values are obtained for this data, and hypothesis tests are conducted by
applying tests such as chi-square, ANOVA, the F-test, and many more. According to the testing
develop concepts and theory and the empirical application of the data to a wider
population, that is, building the theory based on research outputs. Interpretation refers
to the task of drawing inferences from the collected facts after an analytical and/or
Report writing is a vital step in research, where the complete research and findings are
compiled together. A proper and valid report increases the efficiency of the research.
misleading conclusions about the research vitality, the whole research may be
questioned. Valid interpretations about the research can expose processes and relations
that underlie its findings.
The major role of research in business is to reduce the risk of the business decision by
In a research process, the organization is able to obtain information about key business
1.11 Chapter Summary
Research method includes various procedures and techniques used for obtaining
and analyzing data.
problems.
Co-relational Research
Research process gives a detailed flow of steps to be followed for any kind of
research.
decision-making.
Research Problem Formulation and Research Design
2.1 Introduction
Selecting and properly defining a research problem is the foremost step in the research
Without being clear about what you are going to research, it is difficult to plan how you
are going to research it. You will be able to define your research strategy and data
problem formulation will guide you toward accurate research problem identification and
formulation.
You have to form the blueprint of how to conduct the research after defining a problem
as in the field of construction, architects with the help of a blueprint design, decide on
the efficient allocation of various resources. A blueprint for conducting a research is the
research design. It gives a detailed logical flow of the research approach and
In the words of Zikmund and Babin, "A problem is a situation, occurs when there
conditions."
The process of defining and developing a decision statement and the steps involved in
Defining the research problem correctly guides the literature survey, the selection of the research
strategy and research design, and the choice of data collection and analysis methods.
Ill-defined research problems may create hurdles, but a proper definition of the research
problem will enable the researcher to stay on track. Thus, defining a research problem
properly is a requirement for any study and is a highly important step. The formulation
of a problem is more important than its solution. It is only on the careful detailing of the
research problem that you can work out the research design and can smoothly carry on
anticipated results must be commensurate with the efforts put into the research, in
terms of benefit/returns.
problem definition.
The technique of defining the general research problem involves the following steps:
issue or some scientific discovery. Keeping in view some practical concern or some
survey. Then, the researcher can seek the guidance of the guide or the subject
expert, in accomplishing this task. The guide puts forth the problem in general
terms, and it is then up to the researcher to narrow it down and phrase the
problem in operational terms. The problem stated in a general way may contain
all the points that induced them to make a general statement concerning the
problem. They can enter into discussion with those who have good knowledge of
3. Literature Survey
Review all the possible literature that is available on the research area and give a
The researcher must be well-conversant with relevant theories in the field, reports,
and records premise. For indicating the type of difficulties that may be
studied on related problems are useful. At times, such studies may also suggest
research problem. Researchers can discuss the problem with colleagues and others
information. Discussions can develop new ideas; people with rich experiences are
study.
Put the research problem in 'as specific terms as possible', so that it may become
In case of business research problem definition, the process can be shown as illustrated
in Fig. 2.3a.
hypotheses
symptoms
Source: Zikmund, Babin, Carr, Griffin, Business Research Methods, 8th Edition
The symptoms can be: decline in sales, increase in the cost of recruitment,
etc.
Relate the symptoms with various possible reasons or causes. For example, a firm
has a problem with advertising effectiveness; the causes can be low brand
statement
decision making.
The unit of analysis for a study indicates what or who should provide the
data and at what level of aggregation. For example, it can be the target
objectives.
The research discovers answers to questions using the application of scientific methods.
According to C. R. Kothari, research objectives can be divided into the following broad
groups:
Formulating Hypothesis
A hypothesis states the relationship between two or more variables that suggest
also predicted.
The suggestion formulated in the hypothesis may be the solution to the problem.
Null Hypothesis (H0)
Alternative hypothesis is just the opposite of null hypothesis; it states that there is a
significant difference or relationship between the groups or variables that can be tested.
and television.
Or
2.5 Research Design
conditions for collection and analysis of data in a manner that aims to combine
procedures and is useful for obtaining the information needed to structure or solve the
research problems. It is a decision matrix which looks into the aspects of 5WH - what,
Research design facilitates the logical flow of research operations. It gives the concise
plan of logical relations between the research type, data required, data collection and
Sampling Design
It deals with the method of selecting items to be observed for the given research type.
Observational Design
Statistical Design
Statistical design is concerned with the question of how many items are to be
observed and how the information and data gathered are to be analyzed.
Operational Design
Operational design deals with techniques by which the procedures specified in the
Source: Claire Selltiz and others, Research Methods in Social Sciences, 1962, p. 50.
[Figure: Variables, Experiment, Experimental Group, Control Group]
Variables
anything that varies or changes from one instance to another. It can exhibit differences
in value or direction. Example: Concepts like weight, height, and income, which vary
Continuous Variable
It is a variable that can take any value, even a decimal value, between its minimum
Discrete Variable
It is a variable that takes only integer values. Example: Count of children in a family. The
Dependent Variable
by other variables.
Independent Variable
dependent variable, in some way. For example, customer loyalty may be a dependent
Extraneous Variable
Extraneous variables are independent variables that are not related to the purpose of
the study but may affect the dependent variable. They are not under the control of the
researcher.
Experiment
The process of examining the truth of a statistical hypothesis (H0), relating to some
the causal links, whether a change in one independent variable produces a change in
another dependent variable.
Treatments
The different conditions under which experimental and control groups are tested are
referred to as treatments.
These are pre-determined plots or blocks where different treatments are used; always
In a classic experiment, two groups are established and certain members are assigned
to each group. The two groups will be exactly similar in all aspects relevant to the
[Figure: Classic experiment: group members are assigned at random, then the independent variable is manipulated, then the dependent variable is measured.]
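The random assignment step of a classic experiment can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `assign_groups`, the participant labels, and the equal split into two groups are assumptions made for the example, not part of the course text.

```python
import random

def assign_groups(members, seed=None):
    """Randomly split members into an experimental and a control group."""
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(members)       # copy, so the caller's list is not modified
    rng.shuffle(shuffled)          # every ordering is equally likely
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

# Hypothetical participants P01..P20 (illustrative labels only)
participants = [f"P{i:02d}" for i in range(1, 21)]
experimental, control = assign_groups(participants, seed=42)

print("Experimental group:", sorted(experimental))
print("Control group:     ", sorted(control))
```

Because membership is decided by chance alone, differences between the two groups before the treatment should be due only to random variation, which is what allows a later difference in the dependent variable to be attributed to the manipulated independent variable.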
of finding out:
What is happening
New insights
Literature Survey
The literature survey method is one of the simplest and most fruitful methods of
Experience Survey
Experience survey means a survey of people who have had practical experience
with the problem to be studied. For such a survey, people who are competent and
define the problem more concisely and help in the formulation of the research
hypothesis.
people. Focus groups are led by a trained moderator, who follows a flexible
group. Diagnostic is concerned with determining the frequency with which something
1. Formulating the objective of the study: What is the study about and why is it
being made?
3. Selecting the sample: How much material will be needed? From which population
4. Collecting data: Where can the required data be found and with what time period
6. Reporting the findings.
It is concerned with the testing of hypothesis for the causal relationships between
variables and helps in drawing inferences about the causality. Testing of hypothesis
employs statistical procedures, in which the inferences about the target population are
drawn from a study sample. Experimental design is the method for conducting the
hypothesis testing method. While conducting hypothesis testing research, three basic
Principle of replication
Principle of randomization
Principle of local control
Principle of Replication
The research design should be such that the experiment can be repeated more than
once. Each treatment is applied in many experimental units, instead of one. Due to
is increased.
Principle of Randomization
units. The research design should be planned such that while experimenting, the
Randomization and replication do not remove all the extraneous sources of variation;
some experimental errors still remain but are unknown. Local control refers to the grouping of
the experimental units in such a way that the units within a group are more
homogeneous than units in other groups. Then the randomized treatment is assigned to
these parts of blocks. Dividing the samples into various homogeneous parts is known as
blocking. Blocking is done in such a way that variation due to the extraneous variable
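Local control combined with randomization can be sketched as a randomized block assignment: units are first grouped into homogeneous blocks, and treatments are then randomized within each block. The sketch below is a minimal illustration; the function name, the age-band blocking variable, and the treatment labels "A" and "B" are assumptions introduced for the example.

```python
import random
from collections import defaultdict

def randomized_block_assignment(units, block_of, treatments, seed=None):
    """Group units into blocks, then randomize treatments within each block."""
    rng = random.Random(seed)
    blocks = defaultdict(list)
    for unit in units:
        blocks[block_of(unit)].append(unit)   # local control: homogeneous groups
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)                  # randomization inside the block
        for i, unit in enumerate(members):
            # cycling through treatments keeps each block balanced
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical experimental units: (id, age band); age band is the blocking variable
units = [("U1", "young"), ("U2", "young"), ("U3", "young"), ("U4", "young"),
         ("U5", "old"), ("U6", "old"), ("U7", "old"), ("U8", "old")]
plan = randomized_block_assignment(units, block_of=lambda u: u[1],
                                   treatments=["A", "B"], seed=7)

for unit, treatment in sorted(plan.items()):
    print(unit, "->", treatment)
```

Since every treatment appears equally often inside each block, a comparison between treatments is not distorted by the blocking variable itself.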
2.10 Chapter Summary
Formulation of the research question and stating the hypothesis are key
The research question or a research problem statement presents the idea that is
The final research question consists of a statement about the relationship of two or
more variables.
Sampling Design and Sampling Techniques
3.1 Introduction
If researchers want to discover the most pressing financial problems faced by the people
in general, varying from low wages to rising health care and housing costs, they have
to ask everyone for their opinions. However, due to economic and time constraints, it
Representative small groups can be selected from the general population for research
and analysis within the time frame for the required data. Such a grouping is known as
Often, it is not possible to study each and every observation in the population due to
time, money, and many other constraints. In that case, a fraction of that population is
studied and it is known as a sample study or sample survey, that is, the part of the
3.2 Population, Census, and Sample
Population
Population is any complete group of entities that share some common set of
known as parameters. For example, all registered voters in India or all members of the
international teachers union.
Census
total enumeration, rather than a sample. In a census, the survey investigator studies
the characteristics of each and every entity in the population. For example, the census
characteristics of the people. This census data is collected once every 10 years.
Sample
Sample is a subset or some part of a population, used to make inferences about the
whole population, as shown in Fig. 3.2a. Sampling involves the process of selecting a
[Fig. 3.2a: A sample as a subset of the population]
A systematic plan for obtaining a sample from a given population is known as a sample
design. It refers to the technique or the procedure that the researcher should adopt in
selecting items for the sample. Three decisions have to be considered while designing a
sample:
Who will be surveyed? - Sample:
Determine what type of information is needed and who is most likely to have it.
as 'probability sampling.'
'judgmental sampling.'
3.4 Sample Design Procedure
The target population should have the characteristics, about which inferences are
finite. If the count is not known, such as listeners of a specific radio program, then
Selecting target population (or the set of objects, technically called the Universe,
items is infinite.
The population of a town and the number of people in that town are examples of a
finite universe. The number of fishes in the sea, viewers of a specific TV serial
2. Select a Sampling Frame
A complete list of all cases in the population from which the sample will be drawn,
social unit, such as school, club, family, etc. or it may be an individual. The
researcher will have to decide one or more of such units that he has to select for
his study.
For example, if the research objective is concerned with the members of the sports
club, then the sampling frame will contain a complete list of individuals who are
members of that club.
Chosen
judgment.
Sample size is nothing but the total number of units to be selected from the
One of the main problems for the researcher is in the sample size selection. This
sample size should not be too large or too small, that is, the sample size must be
optimal. Optimal size samples can easily fulfil requirements, such as reliability,
the sample size, the researcher must determine the desired precision to be
achieved and also, an acceptable confidence level for the estimate. The value of a
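
The link between desired precision, confidence level, and sample size can be illustrated with a short calculation. The sketch below assumes the common textbook formula for estimating a population mean from a large population with a known standard deviation, n = (z * sigma / e)^2; the formula, the function name, and the numbers are assumptions for illustration and are not taken from this section.

```python
import math

def sample_size_for_mean(sigma, margin_of_error, z=1.96):
    """Smallest n such that z * sigma / sqrt(n) <= margin_of_error.

    sigma           : assumed population standard deviation
    margin_of_error : desired precision (half-width of the interval)
    z               : standard normal value for the confidence level
                      (1.96 corresponds to roughly 95% confidence)
    """
    return math.ceil((z * sigma / margin_of_error) ** 2)

# Hypothetical inputs: standard deviation of 15 units, precision of +/- 2 units
n = sample_size_for_mean(sigma=15, margin_of_error=2)
print("Required sample size:", n)
```

Under these assumptions, halving the margin of error roughly quadruples the required sample size, which is why the desired precision has to be weighed against the cost of sampling.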
5. Parameters of Interest
While determining the design of a sample, one must note the question of the
For example, you may be interested in estimating the proportion of students with
6. Budgetary Constraint
Costs involved in the total sampling procedure have a great impact on decisions
7. Sampling Procedure
The researcher must decide about the technique to be used in choosing items for
the sample. This is a part of the sample design itself. There are several sample
designs, out of which the researcher must choose one for his study. Obviously, he
must select the design which, for a given sample size and cost, has a smaller
sampling error.
Its design must be applicable in the context of funds available for the research
study.
Source: C. R. Kothari, Research Methodology: Methods and Techniques, New Age International Publishers,
2nd Edition
There are two types of costs involved in sampling analysis: the cost of an incorrect
inference, resulting from incorrect data and the cost of collecting the data. Incorrect
inferences arise due to systematic bias and sampling error. Error in the sampling
increasing sample size. One can detect and correct the causes responsible for these
errors.
survey work, systematic bias can result if the questionnaire or the interviewer is
biased.
3. Non-respondents
If you are unable to sample all the individuals initially included in the sample,
4. Indeterminacy principle
systematic bias in many inquiries. There is usually a downward bias in the income
upward bias in the income data collected by social organizations. People in general
understate their incomes if asked about it for tax purposes, but they overstate the
Sampling Errors
Sampling errors are the random variations in sample estimates. Sampling error
decreases with the increase in the size of the sample. It can be measured for a given
sample design and size. The measurement of sampling error is usually called the
'precision of the sampling plan'. If we increase the sample size, the precision can be
improved. Thus, the effective way to increase precision is, usually, to select a better
sampling design, which has a smaller sampling error for a given sample size, at a given
cost.
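
The claim that sampling error decreases as the sample size increases can be checked with a small simulation. This is a sketch under stated assumptions: the population is a synthetic, hypothetical set of incomes, and sampling error is measured here as the standard deviation of the sample mean over repeated simple random samples.

```python
import random
import statistics

def sampling_error(population, n, repeats=500, seed=0):
    """Estimate sampling error as the spread of sample means for size n."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)

# Hypothetical population of 10,000 incomes (synthetic data, illustration only)
pop_rng = random.Random(1)
population = [pop_rng.gauss(50_000, 12_000) for _ in range(10_000)]

errors = {n: sampling_error(population, n) for n in (25, 100, 400)}
for n, err in errors.items():
    print(f"sample size {n:4d}: sampling error of the mean is about {err:,.0f}")
```

In this setup each fourfold increase in sample size roughly halves the sampling error, so precision improves with sample size but at a diminishing rate.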
Sampling errors are the random variations in the sample estimates around the
Thus, a major criterion while selecting a sampling procedure is to ensure that the
procedure causes a relatively small sampling error and helps to control the systematic
Source: C.R. Kothari, Research Methodology: Methods and Techniques, New Age International Publishers,
2nd Edition
Different types of sample designs depend on two factors, namely, the representation
basis and the element selection technique. In representation basis, the sample selected
individually, from the population under consideration at large. All other forms of
Non-probability: Judgment Sampling, Quota Sampling, Snowball Sampling
Probability: Systematic Sampling, Stratified Sampling, Cluster Sampling
researcher selects the items deliberately, that is, in such a sampling technique, the
In this method, the results selected by the investigator are favourable to his point of
view, so that the entire inquiry may get vitiated. There is always a serious bias entering
Convenience Sampling
In this type of sampling, the researcher selects units of the population most
respondents that happen to be in the right place, at the right time, get selected in the
samples. For exploratory research, convenience samples are best used when additional
Judgment Sampling
researcher selects the units of the sample, based on their judgment, about some
is informative, this method is often used when working on very small samples. The
samples selected by this method satisfy specific purposes of research but will not fully
represent the population. This sampling technique is used to obtain information from a
Quota Sampling
the exact extent that the investigator desires. Quota sampling is a two-stage, restricted
judgmental sampling during which, in the first stage, the population is divided into
various groups and a quota must be calculated for each group, depending on relevant
and available data. In the second stage, sample elements from quota groups are
This sampling method is usually used for interview and survey methods. For example, in
the city. The interviewer selects the sample with 10% of high class, 60% of middle
class, 10% of lower middle class, and 20% of the rest according to the quota assigned
to each group.
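As a rough illustration of the first stage of quota sampling, the following sketch turns the percentages from the example above into the number of respondents to be interviewed in each group. The total sample size of 200 and the function name quota_counts are assumptions made only for this illustration.

```python
def quota_counts(total, quotas_percent):
    """Return the number of respondents to select from each group."""
    if sum(quotas_percent.values()) != 100:
        raise ValueError("quota percentages must add up to 100")
    # Integer division is exact here; other totals may need explicit rounding.
    return {group: total * pct // 100 for group, pct in quotas_percent.items()}


# Percentages taken from the example in the text; the total of 200 is assumed.
quotas = {"high class": 10, "middle class": 60, "lower middle class": 10, "rest": 20}
sample_plan = quota_counts(200, quotas)
print(sample_plan)
```

The second stage, in which the interviewer picks individual respondents within each group by judgment, is not something code can decide and is therefore left out.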
Snowball Sampling
respondents are obtained from the information provided by the initial respondents.
First, a group of respondents is selected randomly and then, subsequent respondents
This technique is usually used to locate members of a rare population and also, the
cases for which identifying population is very difficult. The error of systematic bias
occurs frequently with such sampling. For example, people can claim unemployment
3.7.2 Probability Sampling Techniques
It is also known as 'chance' or 'random' sampling. In this method, each item of the
lottery method, individual units are picked up from the whole group, using some
mechanical process. The results obtained from random or probability sampling can be
measured in terms of probability. Some of the types of probability sampling are simple
such a way, that the unit in every possible sample of size n has an equal chance of being
selected, that is, simple random sampling is one of the types of probability sampling
procedure that assures each element in the population will have an equal chance of
and easily accessible sampling frame that lists the entire population, is available,
preferably stored on a computer. For a small sample size, methods like drawing
large sample size, random number generation techniques are applied in obtaining
sample units.
A simple random sample is a subset of units selected from a population. Each unit is
selected randomly and entirely by chance, such that each unit has the same chance or
of R units has the same probability of being selected for the sample, as any other subset
Suppose N people want to get a ticket for a movie but there are only X tickets
where X < N. So, the authority decides to distribute the ticket among the people,
without any bias. Then, everyone is given a number in the range from 0 to N - 1 and
random numbers are drawn, either from a table of random numbers or electronically,
with the help of computers. Numbers in the range 0 to N - 1 are considered,
ignoring any numbers previously selected. The people holding the first X numbers drawn get the X tickets.
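The ticket example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name allocate_tickets, the values N = 50 and X = 5, and the fixed seed are all chosen for the example and do not come from the text.

```python
import random


def allocate_tickets(n_people, n_tickets, seed=None):
    """Draw n_tickets distinct numbers from 0 .. n_people - 1 at random."""
    if not 0 <= n_tickets <= n_people:
        raise ValueError("number of tickets must be between 0 and n_people")
    rng = random.Random(seed)
    # sample() never returns the same number twice, which corresponds to
    # "ignoring any numbers previously selected" (sampling without replacement).
    return rng.sample(range(n_people), n_tickets)


winners = allocate_tickets(n_people=50, n_tickets=5, seed=1)
print(winners)
```

Every person holds exactly one number and every number is equally likely to be drawn, so each person has the same chance of receiving a ticket.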
In small or large populations, this type of sampling is typically done 'without
replacement', in which one avoids selecting any individual of the population more than
once. Alternatively, simple random sampling can be carried out with replacement.
For a small sample from a large size population, sampling without replacement is
approximately the same as sampling with replacement, since the odds of selecting the
Systematic Sampling
is selected by a random process and then, every i-th number on the list is selected for
population size N by the sample size 'n' and rounding to the nearest integer. When the
For example, there are 100,000 individuals in the population and a sample of 1,000 is
required. In this case, the sampling interval, 'i', is 100. Now, select a random number
between 1 and 100. If this number is 23, then the sample consists of elements 23, 123, 223, 323, and so on.
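The worked example above can be reproduced with a short sketch. The function name systematic_sample is an assumption made for illustration; elements are numbered from 1 to N, and the sampling interval is N divided by n, rounded to the nearest integer, as described in the text.

```python
import random


def systematic_sample(population_size, sample_size, start=None):
    """Return the element numbers (1-based) chosen by systematic sampling."""
    interval = round(population_size / sample_size)
    if start is None:
        # Random starting point between 1 and the sampling interval.
        start = random.randint(1, interval)
    # Take every interval-th element from the starting point onwards.
    chosen = list(range(start, population_size + 1, interval))
    return chosen[:sample_size]


sample = systematic_sample(100_000, 1_000, start=23)
print(sample[:4], "...", sample[-1])
```

With a population of 100,000 and a sample of 1,000, the interval is 100, so a starting point of 23 gives 23, 123, 223, and so on.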
Stratified Sampling
subsamples that are more or less equal in some characteristic are drawn from within
in nature, as possible.
There are two primary reasons for using stratified sampling: the sample will be more
[Figure: Population, Strata, Sample]
populations that are individually more homogeneous than the total population. These
different sub-populations are called 'strata'. Items should be selected from each
stratum, in order to constitute a sample. Variation within each stratum is much less than
that of a population. Precise estimates for each stratum are computed and thus, a better
Stratified sampling gives more detailed and reliable information. The following three
c) How many items should be selected from each stratum or how to allocate the
relative size of that stratum and to the standard deviation of the distribution of the
estimates. Hence, such sampling techniques are used. To increase sample efficiency, the
strata having large variability are sampled more heavily, that is, to produce smaller
random sampling error.
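The allocation question can be made concrete with a small sketch. It compares allocation in proportion to stratum size alone with allocation in proportion to stratum size multiplied by the stratum standard deviation, which samples the more variable strata more heavily. The stratum names, sizes, standard deviations, total sample size of 100, and function names are all assumed values chosen for illustration.

```python
def proportional_allocation(n, sizes):
    """Allocate n sample units in proportion to stratum size."""
    total = sum(sizes.values())
    return {s: n * size / total for s, size in sizes.items()}


def size_and_sd_allocation(n, sizes, sds):
    """Allocate n in proportion to (stratum size x stratum standard deviation)."""
    weights = {s: sizes[s] * sds[s] for s in sizes}
    total = sum(weights.values())
    return {s: n * w / total for s, w in weights.items()}


# Hypothetical strata: sizes and standard deviations are assumptions.
sizes = {"A": 5000, "B": 3000, "C": 2000}
sds = {"A": 10.0, "B": 20.0, "C": 30.0}

by_size = proportional_allocation(100, sizes)
by_size_and_sd = size_and_sd_allocation(100, sizes, sds)
print(by_size)
print(by_size_and_sd)
```

The results are left unrounded; in practice, they would be rounded to whole units while keeping the total equal to the required sample size.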
Cluster Sampling
complete list of the members of the population, then cluster sampling is conducted
stratified sample. In the method of cluster sampling, grouping the population units is
done and then, selecting the groups or the clusters, instead of individual elements, for
inclusion in the sample.
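A minimal sketch of this idea is given below, assuming a one-stage design in which every unit of a selected cluster enters the sample. The cluster names, household labels, and the function name cluster_sample are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Select whole clusters at random and keep all units inside them."""
    rng = random.Random(seed)
    # Clusters, not individual units, are the objects being drawn.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen}


# Hypothetical population already grouped into non-overlapping clusters.
clusters = {
    "ward 1": ["h1", "h2", "h3"],
    "ward 2": ["h4", "h5"],
    "ward 3": ["h6", "h7", "h8"],
    "ward 4": ["h9"],
}
selected = cluster_sample(clusters, n_clusters=2, seed=7)
print(selected)
```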
In cluster sampling, it is necessary to divide the total area into a number of smaller,
non-overlapping areas, which are known as clusters. After forming appropriate clusters,
a number of these clusters are randomly selected, so that all units in these small areas,
3.8 Chapter Summary
'Population.' A complete enumeration or study of all items in the 'population' of a
about the whole population. Sampling involves the process of selecting a number
sample design.
o Determine sample size
o Systematic Bias
o Sampling Errors
Methods and Tools of Data Collection
4.1 Introduction
In real life situations, we deal with different types of data, which are nothing but values
Data collection is a process of preparing and collecting data. The main purpose of data
topics and also, to pass information to others. Primarily, collected data provides
While dealing with any real life problem, sometimes, it is discovered that the data at
hand is inadequate; it, then, becomes essential to collect data that is applicable. There
are several ways of collecting appropriate data, which differ considerably, in context with
4.2 Data Types
Data collection is the next step to defining and planning research problems. There are
various methods of data collection. The type of method used for the collection of data
depends on the type of data to be collected. There are two types of data, primary and
secondary data. Data collected afresh, for the first time, is known as primary data, while
secondary data are those which have already been collected by another person and
Researchers have to decide which type of data needs to be collected first for the study
and then, they can select the method of data collection. The methods used to collect
primary data are different from those used for secondary data collection. Facts and
statistics collected together for reference or analysis are known as data. Data is values of
data collection. This process can be established in a systematic fashion that enables one
to answer the stated research questions, evaluate outcomes, and test hypotheses.
The term data collection is used commonly, in the research of all fields of study,
including physical and humanities, business, social sciences, etc. While methods vary by
discipline, the importance of ensuring accurate and honest collection remains the same.
Accurate data collection is a very important step, rather than defining data (qualitative
In case of experimental research, you can collect primary data during the course of
respondents through personal interviews. Data which is collected afresh and for the first
It means there are various methods of primary data collection. Some of these important
methods are:
Observation method
Interview method
Through questionnaires
Through schedules
Other methods include:
o Warranty cards
o Distributor audits
o Pantry audits
o Consumer panels
o Depth interviews
o Content analysis
Observation Method
This is the most commonly used method of primary data collection, especially in studies
investigator's own direct observation, without asking the respondent. For example, in
consumer behavior study, the investigator may look at the watch, instead of inquiring
about the brand of the wrist watch that is used by the respondent.
knows they are being observed) or covert (no one knows they are being observed and
indirectly.
occur. For example, observing teachers teaching a topic from a written curriculum, in
order to determine whether they are delivering it with fidelity. In indirect observation
method, you watch the results of behaviors, interactions or processes. For example,
Merits
The researcher can even gather information which could not be easily obtained, if
The researcher can even verify the truth of statements made by informants in the
Demerits
If the participant participates emotionally, then the observer may lose objectivity.
Source: C. R. Kothari, Research Methodology-Methods and Techniques, Second Revised Edition, New Age
Interview Method
In interview method, data is collected by oral-verbal stimuli and replies in terms of oral
Merits
Demerits
There remains the possibility of the bias of interviewer, as well as, that of the
respondent.
The presence of the interviewer on the spot may over-stimulate the respondent,
sometimes even to the extent that the respondent may give imaginary information
Source: C. R. Kothari, Research Methodology-Methods and Techniques, Second Revised Edition, New Age
Through Questionnaires
Particularly in the case of big enquiries, this method is quite popular. It is being adopted
by private and public organizations, private individuals, research workers, and even by
If it is not properly set up, the survey is bound to fail. The general form, question
Merits
It is low cost, even when the universe is large and is widely spread geographically.
It is free from the bias of the interviewer; answers are in the respondent's own
words.
Respondents, who are not easily approachable, can also be reached conveniently.
Large samples can be made use of, thus the results can be made more dependable
and reliable.
Demerits
often indeterminate.
There is inbuilt inflexibility due to the difficulty of amending the approach once
Source: C. R. Kothari, Research Methodology-Methods and Techniques, Second Revised Edition, New Age
Through Schedules
This method of data collection is very similar to the method of collection of data through
questionnaire. The difference here is that schedules are filled by the enumerators that
are appointed for this purpose. Schedules may be handed over to respondents and
enumerators may help them in recording their answers against the questions.
Enumerators explain the aim and objective of the investigation; they also remove
question, the definition or concept of difficult terms. The enumerators should be trained
well and the scope and nature of the investigation should be explained to them
thoroughly, so that they can perform. With complete training, they can understand the
implications of different questions put in the schedule. Enumerators must possess the
Merits
Demerits
schedules.
area.
Source: C. R. Kothari, Research Methodology-Methods and Techniques, Second Revised Edition, New Age
4.2.2 Secondary Data
Data that has already been collected by someone else and has already been
passed through the statistical process is known as secondary data. This data may be in
that data which has already been collected and analyzed by someone else. If you are using
secondary data, then you have to look into the various sources from which it is
obtained.
governments.
fields.
Sources of Unpublished Data
Scholars and research workers, labor bureaus, trade associations, and other
Before using secondary data, the researcher must know the following characteristics of
secondary data:
Reliability of Data
Suitability of Data
Data that is suitable for one type of enquiry may not necessarily be suitable to another
researcher. The researcher must very carefully scrutinize the definition of various terms
and units of collection used at the time of collecting the data from the primary source.
Adequacy of Data
If the level of accuracy achieved in the data is known and if it is found to be inadequate,
the researcher should not use that data for further study. Data that is related to an area,
which may be either narrower or wider than the area of the present enquiry, will be
considered as inadequate. It means that using secondary data is very risky. Secondary
data should be used only if it is found to be reliable, suitable, and adequate.
No one can blindly refuse the use of available data, if it is
available from authentic sources. Using secondary data will not be economical to spend
4.3 Questionnaire Design
and pre-determined questions. The form of the question may be either closed (the
type 'yes' or 'no') or open (inviting free response) but should be stated in advance
A wide range of data, in the respondent's own words, cannot be obtained with
used effectively. On the basis of the results obtained in pretest (testing before
final use) and operations from the use of unstructured questionnaires, one can
Question Sequence
be smoothly-moving and clear, thereby meaning that the relation of one question
to another should be readily apparent to the respondent, with questions that are
Questions that put great strain on the memory or intellect of the respondent.
tabulation plan. In general, all questions should meet the following standards:
Should be simple
Should be concrete
Since words are likely to affect responses, they should be properly chosen. Simple
words, which are familiar to all respondents, should be preferred. Words with
Sample of a Questionnaire
This is a research study conducted by a group of medical students. Please do NOT write
your name on the questionnaire, as this study is anonymous. Do not feel obligated to
for taking the time to complete our questionnaire, your effort is greatly appreciated.
Male .................... 1 [ ]
Female .................. 2 [ ]
17 - 19 ................. 1 [ ]
20 - 22 ................. 2 [ ]
23 - 25 ................. 3 [ ]
30+ ..................... 5 [ ]
Europe .................. 1 [ ]
Asia .................... 2 [ ]
Africa .................. 3 [ ]
Non-English ............. 2 [ ]
None [ ]  1 [ ]  2 [ ]  3 [ ]  4 [ ]  5 [ ]  6 [ ]  7 [ ]  8 [ ]  9 [ ]  Excellent [ ]
Christian ............... 2 [ ]
Hindu ................... 3 [ ]
Islam ................... 4 [ ]
Jewish .................. 5 [ ]
Other ................... 7 [ ]
1 ....................... 1 [ ]
2 ....................... 2 [ ]
3 ....................... 3 [ ]
4 ....................... 4 [ ]
Other ................... 5 [ ]
Administration
Nursing ................. 4 [ ]
questions.
be avoided in a questionnaire.
answers listed) or open-ended.
The quality of the paper, along with its color, must be good so that it may attract
Source: C. R. Kothari, Research Methodology-Methods and Techniques, Second Revised Edition, New Age
4.5 Case Study
This method involves a careful and complete observation of a social unit. A person, an
institute, a family, a cultural group or even an entire community, are examples of social
units. It is a very popular form of qualitative analysis. Case study emphasizes the full
the process that takes place and their interrelationship and it is also an intensive
investigation of the particular unit under consideration. To locate the factors that
account for the behavior patterns of the given unit, as an integrated totality, is the
For the purpose of this study, the researcher can take one single social unit or
The researcher can study the behavior pattern of the concerned unit directly and not
This method results in fruitful hypotheses, along with the data, which may be
The assumption of uniformity in basic human nature, in spite of the fact that
Source: C. R. Kothari, Research Methodology-Methods and Techniques, Second Revised Edition, New Age
It enables the researcher to trace out the natural history of the social unit and its
relationship with the social factors and the forces involved in its surrounding
environment.
in testing them.
This method facilitates an intensive study of social units, which is generally not
possible.
schedule for the said task, which requires thorough knowledge of the concerning
universe.
The researcher can use one or more of the several research methods under case
It is beneficial in determining the nature of the units to be studied, along with the
Case studies constitute the perfect type of sociological material, as they represent
a real record of personal experiences, which very often escapes the attention of
Case study method enhances the experience of the researcher and this, in turn,
This method makes the study of social changes possible. On account of the minute
study of the different facets of a social unit, the researcher can well understand
Source: C. R. Kothari, Research Methodology-Methods and Techniques, Second Revised Edition, New Age
Case situations are seldom comparable and the information gathered in case
Real information is often not collected because the subjectivity of the researcher
No set rules are followed in the collection of the information and only few units are
Case data is often vitiated. Sampling is not possible under a case study method.
Case study method is based on several assumptions, which may not be very
Source: C. R. Kothari, Research Methodology-Methods and Techniques, Second Revised Edition, New Age
Primary data are those which are collected afresh and for the first time and hence,
happen to be original in character.
The following methods are for collecting primary data: observation method,
Secondary data are those which have already been collected by someone else and
have already been passed through the statistical process. Secondary data may
Measurement and Scaling Techniques
5.1 Introduction
Measurement is an essential factor in our daily life. To cook a dish, one has to mix the
in the recipe are mixed without any measurement, it can spoil the whole recipe. To give
such measurement, one has to have some standards. For this standard, different
Some standard measurement scales are liter for water, kilogram for weight of an object,
meter for height of an object, etc. These are physical measurements but in sciences and
business research, to measure the attitudes of the respondents from whom data is
5.2 Measurement and Scaling
Measurement
valid way."
Thus, when an object or item is measured, it is assigned some numerical value, which,
the properties and the concept of the object or items for its measurement. Properties
object from the other. The properties of the objects may be of an objective or subjective
type. The objective properties are the properties that can be physically described, while
subjective properties can only be mentally described. Also, the concept of the object
should be known, which gives you a brief view about the object, that is, the meaning.
Scaling
Scaling is defined by Zikmund as, "A device providing a range of values that
Consensus approach: A panel of experts evaluates the chosen items for their
5.3 Primary Scales of Measurement
The primary scales of measurements are classified into four categories, namely, nominal
Nominal Scale
The simplest scale in measurement is the nominal scale. It helps in identifying the types
among different categories, in which it falls. For example, male, female, married,
unmarried, etc.
data, rather than metric data. Nominal data may be represented by numbers, like 1, 2,
3, etc. that are in a metric representation. The nominal data can also be represented by
symbols or letters or figures. For example, in a business research, the gender of the
respondent from whom the data is collected is labeled as 0 for female and 1 for male.
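The coding described above can be sketched as follows. The list of responses is made up for illustration; only the codes 0 for female and 1 for male come from the text. Because the numbers are mere labels, counting and the mode are meaningful, whereas an average of the codes would not be.

```python
from collections import Counter

# Codes from the text: 0 for female and 1 for male.
GENDER_CODES = {"female": 0, "male": 1}

# Hypothetical responses collected in a survey.
responses = ["female", "male", "female", "female", "male"]

coded = [GENDER_CODES[r] for r in responses]
frequencies = Counter(coded)                     # how often each code occurs
mode_code = frequencies.most_common(1)[0][0]     # most frequent code

print(coded)
print(dict(frequencies), "mode:", mode_code)
```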
Ordinal Scale
properties, like nominal scale. However, such data is arranged in either ascending or
descending order. Thus, if data can be ordered accordingly, then such data is termed as
ordinal data.
For example, three biscuit brands that are in very close competition, in terms of their
taste, can be rated accordingly, with the best tasting biscuit brand as the first, followed
Interval Scale
An interval scale is the same as the ordinal scale, with additional information about the
In this scale, the units of measurement between the numbers on the scale are all
(in °C or °F); the scale of temperature from 0 to 100 °C is divided into 100 equal parts,
that is, the difference between any two successive numbers is fixed.
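A short numerical sketch may help to show what the fixed difference implies. The temperatures below are arbitrary illustrative values. Because the zero point of an interval scale is arbitrary, a ratio of two readings changes when the unit changes, while a ratio of two differences does not.

```python
def c_to_f(celsius):
    """Convert a temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# Illustrative readings in degrees Celsius.
t1, t2, t3 = 10.0, 20.0, 30.0

# Ratio of two readings: depends on the unit, so it is not meaningful.
ratio_c = t2 / t1
ratio_f = c_to_f(t2) / c_to_f(t1)

# Ratio of two differences: the same in both units.
diff_ratio_c = (t3 - t2) / (t2 - t1)
diff_ratio_f = (c_to_f(t3) - c_to_f(t2)) / (c_to_f(t2) - c_to_f(t1))

print(ratio_c, ratio_f)
print(diff_ratio_c, diff_ratio_f)
```

Here 20 °C is twice 10 °C as a number, but 68 °F is not twice 50 °F, so a statement such as "twice as hot" has no meaning on this scale.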
Ratio Scale
It is the most advanced level of measurement. The property on the basis of which it is
ratio scale are weight, height, distance, sale, etc., which are numerical values defining
the property for the object, with zero denoting the absence of the property in the object.
Level: Nominal
  Examples: Employee ID number; Yes or No; Good or Bad; Religion: Hindu, Muslim, Christian, Sikh, etc.
  Numerical operations: Counting
  Descriptive statistics: Frequencies; Mode
Level: Interval
  Examples: Student's Grade Point Average (GPA); 100-point job performance rating provided by supervisor
  Numerical operations: Common arithmetic operations
  Descriptive statistics: Frequencies; Mode; Median; Range; Mean; Variance; Standard deviation
Level: Ratio
  Examples: Salesperson's sales volume; Number of stores visited on a shopping trip; Annual family income
  Numerical operations: All arithmetic operations
  Descriptive statistics: Frequencies; Mode; Median; Range; Mean; Variance; Standard deviation
Source (for the table): Zikmund, Babin, Carr, Griffin, Business Research Methods, Eighth Edition
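The descriptive statistics listed for each level can be computed with Python's standard library. The data values below are invented for illustration; only the choice of statistics per level follows the table.

```python
import statistics
from collections import Counter

# Nominal data (hypothetical): only frequencies and the mode are meaningful.
religion = ["Hindu", "Muslim", "Hindu", "Sikh", "Hindu"]
nominal_summary = {
    "frequencies": dict(Counter(religion)),
    "mode": statistics.mode(religion),
}

# Interval data (hypothetical 100-point job performance ratings).
ratings = [62, 75, 75, 88, 90]
interval_summary = {
    "median": statistics.median(ratings),
    "range": max(ratings) - min(ratings),
    "mean": statistics.mean(ratings),
    "variance": statistics.variance(ratings),   # sample variance
    "stdev": statistics.stdev(ratings),         # sample standard deviation
}

print(nominal_summary)
print(interval_summary)
```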
Robert Stevens, and David Louden as, "A reliable measure is one that
consistently generates the same result over repeated measures. For example, if
on the scale today, tomorrow, and next Tuesday, then it appears to be reliable
accurate weight."
According to Zikmund, Babin, Carr, and Griffin, the difference between reliability and
modern rifle. The shots from the older gun are considerably scattered, but
those from the newer gun are closely clustered. The variability of the old rifle
compared with that of the new one indicated it is less reliable. The target on
the right illustrates the concept of a systematic bias influencing validity. The
new rifle is reliable (because it has little variance), but the sharpshooter's
5.4 Classification of Scaling Techniques
Scaling techniques can be further classified into two main categories, namely,
Comparative Scale
In this type of scale, the respondent is asked to compare one object with another and
furnish a response. For example, the respondent is asked to compare two cosmetic
brands, Lakme and Ponds, on its effectiveness. The response, in such a case, can be
either Lakme or Ponds, whichever the respondent finds effective. However, in this scale,
the respondent can only choose the brand he/she finds effective but cannot allocate a
These types of scales are ordinal in characteristics and so it is also known as non-metric
scale. There is no standard for this scale and different respondents use different
approaches or standards.
In a categorical scale, the respondent has to rate his/her response on a scale in which
each object is individually evaluated and each object is independent of the other. For
of 1 to 5, where 1 indicates the best and 5 indicates the worst fragrance, as shown in
Fig. 5.4a.
Q. Rate the fragrance of the following brands in the scale below:
Brand 1
1 2 3 4 5
Brand 2
1 2 3 4 5
Brand 3
1 2 3 4 5
The comparative scale is further classified into two different scales: Rank order and
continuous rating scale and itemized scale. The itemized rating scale is further divided
Likert Scale
Cumulative Scale
Stapel Scale
[Figure: Scaling techniques. Comparative scales: paired comparison. Categorical scales: continuous rating scale and itemized rating scale. Itemized rating scale: semantic differential scale and cumulative scale.]
5.5 Comparative Scales
In this scale, the respondent is asked to rank several objects based on certain properties
or criteria. It is the simplest and quickest to apply. Ranking the extremes is very easy in
Example:
Rank the following soft drinks, on the basis of their taste, from 1 for the best tasting one
Coca-Cola
7-Up
Sprite
Pepsi
Mountain Dew
Paired Comparison
As the name implies, this scale involves the comparison of different pairs of objects.
Here, the respondent is provided with pairs of objects and he/she has to select one of
If there are n objects, then there will be n(n - 1)/2 pairs to be compared.
The data obtained in this scale is ordinal and the responses of the respondent obtained
can be transformed into a matrix form. This method was given by L. L. Thurstone
(1927).
The method can be explained with the help of an example; if six apparel brands are
compared, namely, Biba, Aurelia, W, Global Desi, Kimaya, and Fab India, on the basis of
be 15 pairs [6(6 - 1)/2 = 15]. After each respondent furnishes their response in the forms,
              Biba   Aurelia   W   Global Desi   Kimaya   Fab India
Biba           x       0       1       1           1         1
Aurelia        1       x       1       1           1         1
W              0       0       x       0           0         0
Global Desi    0       0       1       x           0         0
Kimaya         0       0       1       1           x         1
Fab India      0       0       1       1           0         x
Total          1       0       5       4           2         3
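The number of pairs and the column totals in the table can be checked with a short sketch. The matrix below is copied from the table, with None standing for the diagonal entries marked x; the variable names are assumptions made for illustration.

```python
from itertools import combinations

brands = ["Biba", "Aurelia", "W", "Global Desi", "Kimaya", "Fab India"]

# Every unordered pair of brands: n(n - 1)/2 pairs for n brands.
pairs = list(combinations(brands, 2))

# Rows and columns follow the order of `brands`; None marks the diagonal (x).
matrix = [
    [None, 0, 1, 1, 1, 1],
    [1, None, 1, 1, 1, 1],
    [0, 0, None, 0, 0, 0],
    [0, 0, 1, None, 0, 0],
    [0, 0, 1, 1, None, 1],
    [0, 0, 1, 1, 0, None],
]

# Column totals, as in the "Total" row of the table.
totals = {
    brand: sum(row[j] for row in matrix if row[j] is not None)
    for j, brand in enumerate(brands)
}

print(len(pairs))
print(totals)
```

Sorting the brands by their totals turns the pairwise responses into a rank order, which is why the data obtained from this scale is ordinal.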
In Table 5.5a, the value 1* indicates that the brand in that column (that is, Biba) is
5.6 Categorical Scales
In this scale, the respondents are to mark (✓) an appropriate position, which is
considered by them to be the favorable case in a number scale or pictorial scale. The
rating scale is represented by line diagrams, scale with pictures, and others.
Example:
1. How would you rate the services of DTDC as country-specific courier service?
Very good Good Quite good Neither Quite bad Bad Very bad
Or
7 6 5 4 3 2 1
0 1 2 3 4 5
Very Very
Satisfied Unsatisfied
Itemized rating scale is also known as a numerical rating scale. In this scale, a series of
statements are given, from which the respondent needs to select the statement
according to the response. Unlike the continuous scale, this scale gives a rating scale in
which the respondent can select the favorable statement. The measurements in the
scale used should be of odd categories, preferably and most likely to be five to nine
categories.
According to variables, respondents, etc., the itemized rating scale has the following
divisions:
Cumulative Scale
Stapel Scale
This scale was developed by Rensis Likert in 1932. According to Mukul Gupta and
This scale is a five-point scale and the scale ranges from 1 to 5 or -2 to 2, with 0 as the
neutral response.
Example:
5 4 3 2 1
Disagree
-2 -1 0 1 2
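The two numberings of the five-point scale are related by a simple shift, as the sketch below shows. The wording of the five labels is an assumption based on the usual form of the scale, and the function name to_centered is hypothetical.

```python
# Assumed labels for a five-point agreement scale.
LIKERT_LABELS = {
    1: "strongly disagree",
    2: "disagree",
    3: "neutral",
    4: "agree",
    5: "strongly agree",
}


def to_centered(score):
    """Convert a 1-to-5 score to the -2-to-2 form, with 0 as neutral."""
    if score not in LIKERT_LABELS:
        raise ValueError("score must be an integer from 1 to 5")
    return score - 3


responses = [5, 4, 3, 2, 1]
centered = [to_centered(s) for s in responses]
print(centered)
```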
as semantic differential scale. In this scale, words, rather than numbers, are used in a
seven-point scale, that is, two adjectives are placed in the extreme points of the scale
and the respondent has to select the value, accordingly, in the bipolar scale.
Strong Weak
Expensive Inexpensive
Fashionable Unfashionable
Cumulative Scale
It is also known as Louis Guttman's scalogram analysis, named after the person who
developed it. It consists of some statements, which the respondent has to give his/her
Questions                 Respondent's Score
4    3    2    1
✓    ✓    ✓    ✓          4
x    ✓    ✓    ✓          3
x    x    ✓    ✓          2
x    x    x    ✓          1
✓ = Agreement; x = Disagreement
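The pattern in the table can be expressed as a small check. It is a sketch under the assumption that questions are numbered 1 to k and that a response fits the cumulative pattern when the respondent agrees with exactly the questions up to his/her score; the function name guttman_score is hypothetical.

```python
def guttman_score(answers):
    """Return (score, fits_pattern) for a dict of question number -> agreement.

    The score is the number of agreements. The response fits the cumulative
    pattern when the agreed questions are exactly 1, 2, ..., score.
    """
    score = sum(answers.values())
    fits_pattern = all(agreed == (q <= score) for q, agreed in answers.items())
    return score, fits_pattern


# Second row of the table: disagreement on question 4, agreement on 3, 2, and 1.
row = {4: False, 3: True, 2: True, 1: True}
# A response that breaks the pattern: agrees with 3 and 2 but not with 1.
odd = {4: False, 3: True, 2: True, 1: False}

print(guttman_score(row))
print(guttman_score(odd))
```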
From Table 5.6d, you can see that the respondent is provided with two options,
agreement and disagreement, for each question asked. The questions are built up in
such a way that if the respondent's answer is positive to question 2, then the
answer for question 3 is agreement, then the answer for question 1 and 2 may be
Stapel Scale
The stapel scale was developed by Jan Stapel. It is a 10-point interval scale from
+5 to -5. But there is no neutral point 0. This scale is unipolar because only one
adjective is under consideration. It has an even number of categories, unlike the other
scales.
Example:
How would you rate the extent to which the tag line of a particular product matches
accordingly?
(+5)
(+4)
(+3)
(+2)
(+1)
Perfectly Matches
(-1)
(-2)
(-3)
(-4)
(-5)
This scale is usually presented vertically. Respondents select their response on the basis
context. The larger positive number indicates high accuracy, while a smaller negative
Non-comparative Scale
Scale    Basic Characteristics    Examples    Advantages    Disadvantages
unless
computerized
disagree) to 5 understand
(strongly agree)
interval
(zero)
Table 5.6d: Different Measurement Scales
Source: Naresh K. Malhotra, Satyabhushan Dash, Marketing Research, An applied Orientation, Fifth
Before preparing a non-comparative itemized rating scale, you should keep the following
the scale in which the positive and negative categories are equal, while in an
unbalanced scale, the scale does not have equal number of categories.
The scale must be a forced or an unforced rating scale. In a forced rating scale,
respondents are forced to select an option in the middle of the scale, as it does
not contain a 'no opinion or comment' option, while in the unforced scale,
respondents are given an option of 'no opinion', if they find that the options are
The scales can be either horizontally or vertically presented. They can also be
5.7 Chapter Summary
Tabulation and Analysis of Data
6.1 Introduction
helps to view large data in a compact manner. It also enables us to compare different
variables at one time. The summing of the values of different variables becomes easy
when it is expressed in a tabular form. Thus, tabulation of data helps to get a precise
Analysis of data is the most essential part of a research, which leads to the conclusion of
statistical techniques or tools. In this chapter, some of the statistical techniques used for
a research are obtained.
Discuss multiple discriminant analysis
6.2 Tabulation
reports, articles, journal papers, etc. to summarize a particular data. The use of tables is
data is organized in a systematic manner, reflecting the information of the data used for
the table and it also helps in further analysis required for the interpretation of the
research finding.
qualifying words, phrases and statements in the form of titles, headings and
explanatory notes to make clear the full meaning, context and the origin of the
data."
Thus, the technique of organizing the given data in a tabular form is known as
tabulation. In a table, it is very important that the headings for each of the rows and
columns are properly inserted according to the data. There is always a title to define the
table provided and also, if there is a source from which the table is taken or any other
additional information relating to the table, it is provided below the table. A table
consists of the table title, along with the table number, rows, columns, and footnotes, as
shown in Fig. 6.2a.
Sources: Coltan, cassiterite and gold figures derived from Rwandan Official Statistics
(No. 227/01/10/MIN); diamond figures from the Diamond High Council. (All figures
originally appeared in the UN Panel of Inquiry, 2001. All 2000 figures are to October.)
The table title should be concise, reflect the complete meaning of the representation and
must follow after a table number. In Fig. 6.2a, the table is about mineral production
from the year 1995-2000, taken from Rwanda official statistics and is appropriately
The horizontal and vertical arrangements of the data are known as the rows and columns of the
table, respectively. In Fig. 6.2a, there are four columns and six rows. Each data item is
placed in an individual cell. If, other than the source, there is more information to be
furnished in the table, like short-forms or abbreviations used in the table, they can be
provided below the table after the source (if any). All this information is together known
as footnotes.
1. Every table must contain a title that is clearly understandable and concise.
5. The footnotes should be placed beneath the table, along with any other additional
8. The data of the columns that are to be compared should be placed side by side.
9. Alignment of values in the cells should be proper. Also, the decimal point and signs
10. Abbreviations should be rarely used, only if necessary, and also, ditto marks
should be avoided.
11. Representation of large data in a table can make it look messy and unclear. So, in
such cases, the data should not be clumped in one single table.
12. The row totals are placed at the extreme right column of the table, while the
The tables, figures, and graphs help in the brief representation of the data, which helps
the data type and size of the data. It is very important to implement an appropriate
statistical technique, in order to get the appropriate results. There are many statistical
techniques used in data analysis. In this chapter, some of the important statistical
Consider that X1 and X2 are two independent variables and Y is a dependent variable.
$Y = a + b_1 X_1 + b_2 X_2$
Where,
Y = Dependent variable
a, b1, b2 = Constants
To obtain the regression equation, you need to calculate the values of the constants.
$\sum Y = Na + b_1 \sum X_1 + b_2 \sum X_2$
$\sum X_1 Y = a \sum X_1 + b_1 \sum X_1^2 + b_2 \sum X_1 X_2$
$\sum X_2 Y = a \sum X_2 + b_1 \sum X_1 X_2 + b_2 \sum X_2^2$
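As a minimal sketch, the three normal equations can be solved directly once the required sums are computed. The Python below is illustrative only: the function names and the data are hypothetical, and Cramer's rule is used simply because the system has just three unknowns.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_two_predictor_regression(x1, x2, y):
    """Solve the normal equations for (a, b1, b2) in Y = a + b1*X1 + b2*X2."""
    n = len(y)
    s1, s2, sy = sum(x1), sum(x2), sum(y)
    s11 = sum(u * u for u in x1)
    s22 = sum(v * v for v in x2)
    s12 = sum(u * v for u, v in zip(x1, x2))
    s1y = sum(u * w for u, w in zip(x1, y))
    s2y = sum(v * w for v, w in zip(x2, y))

    # Coefficient matrix and right-hand side of the three normal equations.
    lhs = [[n, s1, s2], [s1, s11, s12], [s2, s12, s22]]
    rhs = [sy, s1y, s2y]
    d = det3(lhs)
    if d == 0:
        raise ValueError("the normal equations have no unique solution")

    coefficients = []
    for col in range(3):
        m = [row[:] for row in lhs]
        for r in range(3):
            m[r][col] = rhs[r]  # Cramer's rule: swap one column for the RHS
        coefficients.append(det3(m) / d)
    return tuple(coefficients)


# Hypothetical data generated exactly from Y = 1 + 2*X1 + 3*X2.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
y = [1 + 2 * u + 3 * v for u, v in zip(x1, x2)]
a, b1, b2 = fit_two_predictor_regression(x1, x2, y)
print(a, b1, b2)
```

Because the hypothetical Y values were built from known constants, the recovered a, b1 and b2 can be checked against them.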
6.4 Multiple Discriminant Analysis
quantitative and the dependent variable is categorical in nature. For example, on the
basis of an applicant's age, income, length of time at present home, etc., a credit
dependent variable is categorical and the predictor or independent variables are interval
in nature (Lachenbruch, 1975). The discriminant analysis involving more than two
The equation showing such a relationship between 'n' variables is called a discriminant
function and is given as:
$Z = w_1 X_1 + w_2 X_2 + \dots + w_n X_n$
Where,
$Z$ = Discriminant scores
$X_i$ = $i$th independent variable
6.5 Measures of Central Tendency
characteristics of the population from which the data is extracted. It is also known as a
statistical average as an average value of the data is obtained when these measures are
computed.
The measures of central tendency are also known as measures of location as the value
obtained from such computation indicates the position/location of the value. The
Mean/Arithmetic mean/Average
Mode
Median
Harmonic Mean
Geometric Mean
It is the most commonly used measure of central tendency and the simplest of all. It is
obtained by dividing the sum of all the values of the observation by total number of
observations.
Consider the values of 'n' observations $X_1, X_2, X_3, \dots, X_n$, then, the arithmetic mean is
given as:
$\bar{X} = \frac{\sum X_i}{n}$
Where,
$\bar{X}$ = Arithmetic mean
$n$ = Total number of observations
For example, if leaves taken by 10 employees in a bank, in the last three months are 2,
mean:
$\bar{X} = \frac{2 + 4 + 6 + 1 + 2 + 3 + 1 + 2 + 1 + 5}{10} = \frac{27}{10} = 2.7$
While, if for these 'n' observations, the corresponding frequencies are given, that is,
$\bar{X} = \frac{\sum f_i X_i}{\sum f_i}$
Where,
$\bar{X}$ = Arithmetic mean
$\sum f_i X_i$ = Sum of the products of the observations and their corresponding frequencies
$\sum f_i$ = Total frequency
$\bar{X} = \frac{\sum f_i X_i}{\sum f_i}$
x     0    1    2     3     Total
f     2    4    6     8     20
fx    0    4    12    24    40
Therefore, $\bar{X} = \frac{40}{20} = 2$
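The two forms of the arithmetic mean can be checked with a short Python sketch. It is only an illustration: `frequency_mean` is a hypothetical helper name, and the data are the leave counts and the x/f table used in the examples above.

```python
from statistics import mean

# Leaves taken by the 10 employees in the ungrouped example.
leaves = [2, 4, 6, 1, 2, 3, 1, 2, 1, 5]
simple_mean = mean(leaves)


def frequency_mean(values, freqs):
    """Arithmetic mean of a discrete frequency distribution: sum(f*x) / sum(f)."""
    total_frequency = sum(freqs)
    return sum(f * x for x, f in zip(values, freqs)) / total_frequency


# Observations x = 0, 1, 2, 3 with frequencies f = 2, 4, 6, 8.
weighted_mean = frequency_mean([0, 1, 2, 3], [2, 4, 6, 8])
print(simple_mean, weighted_mean)
```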
6.5.2 Median
Median is the middle value of the data when all the observations are arranged in
ascending or descending order of their magnitude. This is applicable if the data under
consideration is ungrouped.
company are 10, 11, 13, 12, 17, 18, 9, 14, and 15. To obtain the median for this data,
first, the observations are arranged in ascending order, that is, 9, 10, 11, 12, 13, 14, 15, 17,
and 18. Then, the value of the observation in the $\left(\frac{9 + 1}{2}\right)$th position, that is, the 5th position
However, suppose that in the above example, there are 10 salespersons and the
observations are 10, 11, 13, 12, 17, 18, 9, 14, 15, and 17. Similarly, the observations
are arranged in ascending order: 9, 10, 11, 12, 13, 14, 15, 17, 17, and 18. Then, the
median is the value obtained as the mean of the values of the observations in the
$\left(\frac{10}{2}\right)$th and $\left(\frac{10}{2} + 1\right)$th positions. Thus, the median is $\frac{13 + 14}{2} = 13.5$
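A quick way to check both cases is Python's standard `statistics.median`, which sorts the data and applies the odd and even rules described above. This is a sketch using the salesperson figures quoted in the example; the variable names are illustrative.

```python
from statistics import median

# Nine observations: the median is the single middle value after sorting.
sales_nine = [10, 11, 13, 12, 17, 18, 9, 14, 15]
median_nine = median(sales_nine)

# Ten observations: the median is the mean of the two middle values.
sales_ten = sales_nine + [17]
median_ten = median(sales_ten)

print(sorted(sales_nine), median_nine)
print(sorted(sales_ten), median_ten)
```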
x        f        c.f.
x1       f1       c1
x2       f2       c2
x3       f3       c3
...      ...      ...
xn       fn       cn
Total    N
Then, the value of c.f. just greater than the value obtained from $\frac{N}{2}$ is found and the
value of X corresponding to that value of c.f. is the median. Also, for a continuous
frequency table, the calculation of the median is different from that for a discrete frequency table.
x f c.f.
0 - 10 2 2
10 - 20 4 6
20 - 30 6 12
30 - 40 8 20
Total 20
Table 6.5.2b: Distribution of Employees' Bonus
$\text{Median} = L + \frac{\frac{N}{2} - c.f.}{f} \times i$
Where,
Therefore, from Table 6.5.2b, $\frac{N}{2} = \frac{20}{2} = 10$, then c.f. just greater than 10 is 12 and 12
$\text{Median} = L + \frac{\frac{N}{2} - c.f.}{f} \times i$
$= 20 + \frac{10 - 6}{6} \times 10$
$= 20 + \frac{4}{6} \times 10$
$= 20 + 6.667$
$= 26.667$
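The grouped-median calculation can be sketched in Python as follows. The function name `grouped_median` and its argument layout are assumptions made for illustration; the data are the class intervals and frequencies of Table 6.5.2b.

```python
def grouped_median(classes, freqs):
    """Median of a continuous frequency distribution.

    classes: list of (lower, upper) class limits in ascending order
    freqs:   list of class frequencies
    """
    half = sum(freqs) / 2          # N / 2
    cumulative = 0                 # c.f. of the classes before the current one
    for (lower, upper), f in zip(classes, freqs):
        if cumulative + f > half:  # first class whose c.f. exceeds N / 2
            width = upper - lower
            return lower + (half - cumulative) / f * width
        cumulative += f
    raise ValueError("frequencies must sum to a positive total")


classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
freqs = [2, 4, 6, 8]
bonus_median = grouped_median(classes, freqs)
print(round(bonus_median, 3))
```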
Thus, the median divides the whole data series into two equal parts. In addition, it is not
6.5.3 Mode
It measures the most frequently occurring value in the data. For ungrouped discrete
data, the mode is the value of the observation which occurs the maximum number
of times.
For example, if the age of the employees in the marketing department is given as: 25,
22, 35, 29, 42, 26, 36, 25, 29, 28, and 29. Here, 29 occurs the maximum number of times,
For a grouped continuous frequency distribution, the mode is obtained by the following
formula:
$M_0 = L + \frac{f_1 - f_0}{2f_1 - f_0 - f_2} \times i$
Where,
x f c.f.
0 - 10 5 5
10 - 20 14 19
20 - 30 23 42
30 - 40 8 50
Total 50
$M_0 = 20 + \frac{23 - 14}{2 \times 23 - 14 - 8} \times 10$
$= 20 + \frac{9}{46 - 22} \times 10$
$= 20 + \frac{9}{24} \times 10$
$= 23.75$
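The same calculation can be sketched in Python. The helper `grouped_mode` is a hypothetical name; it assumes a single modal class and treats a missing neighbouring class as having frequency zero, which is a simplifying assumption rather than something stated in the text.

```python
def grouped_mode(classes, freqs):
    """Mode of a continuous frequency distribution with one modal class."""
    k = max(range(len(freqs)), key=freqs.__getitem__)  # index of modal class
    f1 = freqs[k]                                      # modal frequency
    f0 = freqs[k - 1] if k > 0 else 0                  # preceding class
    f2 = freqs[k + 1] if k < len(freqs) - 1 else 0     # succeeding class
    lower, upper = classes[k]
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("formula undefined when 2*f1 - f0 - f2 is zero")
    return lower + (f1 - f0) / denominator * (upper - lower)


classes = [(0, 10), (10, 20), (20, 30), (30, 40)]
freqs = [5, 14, 23, 8]
mode_value = grouped_mode(classes, freqs)
print(mode_value)
```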
The mode can be determined graphically. In some cases, there can be two or more
modes in a single data set. For such data, the interpretation of results is very
difficult.
6.5.4 Geometric Mean
The nth root of the product of all the observations is called a geometric mean and is
abbreviated as G.M. That is, if $x_1, x_2, \dots, x_n$ is a given set of 'n' observations, then the
$G.M. = \sqrt[n]{x_1 \cdot x_2 \cdots x_n}$
Taking the logarithm and then the antilogarithm of the above, the formula can be written as:
$G.M. = \text{antilog}\left(\frac{\sum \log x_i}{n}\right)$
5. 5, 8, and 5.
$G.M. = \sqrt[N]{x_1^{f_1} \, x_2^{f_2} \cdots x_n^{f_n}}$
or
$G.M. = \text{antilog}\left(\frac{\sum f_i \log x_i}{N}\right)$
Where, $N = \sum f_i$
For continuous frequency distribution, from the class intervals, the mid-values ($m_i$) are
$G.M. = \text{antilog}\left(\frac{\sum f_i \log m_i}{N}\right)$
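The logarithmic form of the geometric mean translates directly into code. The sketch below is illustrative: `geometric_mean` is a hypothetical helper (Python's `statistics` module has its own function of that name for ungrouped data), the frequencies default to 1 for ungrouped data, and the sample values are made up.

```python
import math


def geometric_mean(values, freqs=None):
    """G.M. = antilog( sum(f * log x) / N ), using base-10 logarithms.

    For a continuous distribution, pass the class mid-values as `values`.
    All values must be positive.
    """
    if freqs is None:
        freqs = [1] * len(values)   # ungrouped data: every frequency is 1
    n = sum(freqs)
    log_sum = sum(f * math.log10(x) for x, f in zip(values, freqs))
    return 10 ** (log_sum / n)      # antilogarithm


# Hypothetical data: sqrt(4 * 9) = 6 and cube root of (2 * 2 * 4) = 16 ** (1/3).
gm_ungrouped = geometric_mean([4, 9])
gm_frequency = geometric_mean([2, 4], [2, 1])
print(gm_ungrouped, gm_frequency)
```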
6.5.5 Harmonic Mean
observations. That is, if $x_1, x_2, \dots, x_n$ is a given set of 'n' observations, then the harmonic
mean is given as:
$H.M. = \frac{n}{\frac{1}{x_1} + \frac{1}{x_2} + \dots + \frac{1}{x_n}} = \frac{n}{\sum \frac{1}{x_i}}$
$H.M. = \frac{3}{\frac{1}{4} + \frac{1}{6} + \frac{1}{2}} = 3 \times \frac{12}{11} = 3.273$
$H.M. = \frac{N}{\sum \left(f_i \times \frac{1}{x_i}\right)}$
$H.M. = \frac{N}{\sum \left(f_i \times \frac{1}{m_i}\right)}$
classes, and $N = \sum f_i$
x          Mid-value (m)    f     f/m
0 - 10     5                2     0.4
10 - 20    15               4     0.267
20 - 30    25               1     0.04
30 - 40    35               3     0.086
Total                       10    0.793
$H.M. = \frac{10}{0.793} = 12.610$
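Both harmonic-mean calculations can be reproduced with a few lines of Python. This is a sketch: `grouped_harmonic_mean` is a hypothetical helper, and because the code does not round f/m to three decimals, its grouped result differs slightly in the second decimal place from a hand calculation that uses rounded values.

```python
from statistics import harmonic_mean

# Ungrouped example: the three observations 4, 6 and 2.
hm_simple = harmonic_mean([4, 6, 2])


def grouped_harmonic_mean(midpoints, freqs):
    """H.M. = N / sum(f / m) for a continuous frequency distribution."""
    return sum(freqs) / sum(f / m for m, f in zip(midpoints, freqs))


# Mid-values and frequencies from the table above.
hm_grouped = grouped_harmonic_mean([5, 15, 25, 35], [2, 4, 1, 3])
print(round(hm_simple, 3), round(hm_grouped, 3))
```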
6.6 Measures of Dispersion
Dispersion in statistics means the variability or spread of the observation in data. The
measure of dispersion helps to identify the suitability of the data, that is, if the data is
Range
Mean deviation
Standard deviation
6.6.1 Range
Range is the simplest measure of dispersion. It is defined as the difference between the
highest and the lowest value of the data series and is given a s :
Mean deviation or M.D. is defined as the average of the sum of all deviations of the
$M.D. = \frac{1}{n} \sum_{i=1}^{n} |x_i - \bar{x}|$
Where,
$M.D. = \frac{1}{N} \sum_{i=1}^{n} f_i |x_i - \bar{x}|$
Where,
$\bar{x} = \frac{\sum f_i x_i}{N}$
x    f    fx    $|x - \bar{x}|$    $f|x - \bar{x}|$
1 5 5 2.9 14.5
2 10 20 1.9 19
3 15 45 0.9 13.5
4 30 120 0.1 3
5 40 200 1.1 44
From Table 6.6.2a, $N = \sum f_i = 100$ and $\bar{x} = \frac{\sum f_i x_i}{N} = \frac{390}{100} = 3.9$
Therefore, $M.D. = \frac{1}{N} \sum_{i=1}^{n} f_i |x_i - \bar{x}| = \frac{1}{100} \times 94 = 0.94$
6.6.3 Standard Deviation
Standard deviation is the most widely used measure of dispersion. It is defined as the
positive square root of the average of the squares of deviations of the observation from
$\sigma = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})^2}$
$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{n} f_i (x_i - \bar{x})^2}$
The value $\sigma^2$ is the variance of the data, that is, $\sigma^2 = \frac{1}{n} \sum (x_i - \bar{x})^2$
For example, for the given data, the standard deviation is calculated as:
Class Interval    Mid-value (x)    f    fx    $x - \bar{x}$    $(x - \bar{x})^2$    $f(x - \bar{x})^2$
Total    8    39    31.628
$\bar{x} = \frac{\sum f_i x_i}{N} = \frac{39}{8} = 4.875$
$\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{n} f_i (x_i - \bar{x})^2} = \sqrt{\frac{1}{8} \times 31.628} = 1.988$
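A minimal Python sketch of the frequency form of the standard deviation is given below. The function name and the data are hypothetical (the data were chosen so that the answer is a round number); they are not the class intervals of the worked example.

```python
import math


def frequency_std_dev(values, freqs):
    """Standard deviation: sqrt( sum(f * (x - mean)^2) / N )."""
    n = sum(freqs)
    mean = sum(f * x for x, f in zip(values, freqs)) / n
    variance = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs)) / n
    return math.sqrt(variance)


# Hypothetical distribution: mean = 5, variance = 4.
values = [2, 4, 5, 7, 9]
freqs = [1, 3, 2, 1, 1]
sigma = frequency_std_dev(values, freqs)
print(sigma)
```

For a continuous distribution, the class mid-values would be passed as `values`.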
6.7 Measures of Skewness
distribution, the mean, mode, and median lie on the same point that divides the whole
distribution into two equal parts. However, in case of an asymmetric distribution, the
mean, median, and mode do not lie on the same point. Fig. 6.7a gives the shape of
different curves:
[Fig. 6.7a: negatively skewed, symmetrical, and positively skewed curves, with the positions of the mean, median, and mode marked]
The first figure shows a negatively skewed curve, where mode > median > mean.
However, the third figure shows mode < median < mean. While, in the second figure,
mode = median = mean.
Prof. Karl Pearson has defined the measure of skewness, which is called Karl Pearson's
coefficient of skewness, a s :
$\text{Karl Pearson's Coefficient of Skewness} = \frac{\text{Mean} - \text{Mode}}{\text{Standard Deviation}}$ ................. (1)
If mean >, = or < mode, then by equation (1) the distribution is positively skewed, symmetrical or
negatively skewed, respectively.
However, if the data has two or more modes, then the coefficient is instead given as:
$\text{Karl Pearson's Coefficient of Skewness} = \frac{3(\text{Mean} - \text{Median})}{\text{Standard Deviation}}$ ................. (2)
If mean >, = or < median, then by equation (2) the distribution is positively skewed, symmetrical or
negatively skewed, respectively.
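Both forms of Karl Pearson's coefficient can be sketched with Python's `statistics` module. The function names are illustrative, the standard deviation is taken as the population form (divisor n) to match the definition in Section 6.6.3, and the ages are the ones listed in the mode example.

```python
from statistics import mean, median, mode, pstdev


def skewness_from_mode(data):
    """Equation (1): (mean - mode) / standard deviation."""
    return (mean(data) - mode(data)) / pstdev(data)


def skewness_from_median(data):
    """Equation (2): 3 * (mean - median) / standard deviation."""
    return 3 * (mean(data) - median(data)) / pstdev(data)


ages = [25, 22, 35, 29, 42, 26, 36, 25, 29, 28, 29]
sk_mode = skewness_from_mode(ages)
sk_median = skewness_from_median(ages)
print(round(sk_mode, 3), round(sk_median, 3))
```

A positive result indicates positive skewness, a negative result negative skewness, and zero a symmetrical distribution.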
6.8 Measures of Relationships
In data with many variables, you can use different statistical measures to calculate
the relationship between the different variables. For bivariate data, a cross tabulation,
simple regression can be used for measuring the relationships between the variables.
Cross Tabulation
In a cross table representation, the data is represented such that the variables can be
compared with each other. The whole data is arranged into categories, each of which is further
divided into two or more sub-categories. The row categories and column categories
are then compared and the cross table is filled in according to the given data.
(A) Cross-Tabulation of Question "Have you followed the news stories about AIG bonuses?"
[Cross table with column groups Total (Adults), Gender (Men, Women), and Age (18-29, 30-39, 40-49, 50-64, 65+)]
It is the most popular statistical test for measuring the relationship between two
variables. It can only detect the extent of relation between the variables but does not
give any information about the cause and effect of the relationship.
Where,
$\bar{X}$ = Mean of X variable
$\bar{Y}$ = Mean of Y variable
The value of r lies between -1 and 1. If the values of r = -1, then there is a perfect
When the data is in ordinal scale of measurement, Karl Pearson's correlation coefficient
fails to determine the relationship between the variables. In such a case, Spearman's
rank correlation coefficient is used and the values of the variables are assigned ranks.
$r = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$
Where,
$d_i$ = Difference between the ranks of the $i$th pair of observations
$n$ = Total pairs of observations
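Spearman's formula is easy to sketch in Python. The helpers below are hypothetical names, the data are made up, and the ranking step assumes there are no tied values (ties would need averaged ranks, which this sketch does not handle).

```python
def ranks(values):
    """Rank each value, 1 = smallest. Assumes no tied values."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rank_correlation(x, y):
    """r = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), d = difference in ranks."""
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Hypothetical scores given by two judges to five candidates.
judge_1 = [86, 72, 91, 65, 78]
judge_2 = [80, 70, 95, 60, 75]   # same ordering as judge_1
judge_3 = [60, 75, 55, 90, 70]   # exactly the reverse ordering

r_same = spearman_rank_correlation(judge_1, judge_2)
r_reverse = spearman_rank_correlation(judge_1, judge_3)
print(r_same, r_reverse)
```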
Regression
Regression gives the linear relationship between two variables. Unlike correlation, it can
give the cause and effect of the relation. In this technique, the equation representing
Y = a + bX ................. (1)
Solving equation (1) and obtaining the values of a and b give the regression equation.
$\sum XY = a \sum X + b \sum X^2$
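As a sketch, the constants a and b can be obtained by solving the normal equation above together with its standard companion, $\sum Y = na + b \sum X$ (stated here as an assumption, since only one normal equation appears in the text). The function name and the data are hypothetical.

```python
def fit_line(x, y):
    """Least-squares constants (a, b) of Y = a + bX from the normal equations."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(u * v for u, v in zip(x, y))
    sum_xx = sum(u * u for u in x)
    denominator = n * sum_xx - sum_x ** 2
    if denominator == 0:
        raise ValueError("all X values are identical; slope is undefined")
    b = (n * sum_xy - sum_x * sum_y) / denominator
    a = (sum_y - b * sum_x) / n
    return a, b


# Hypothetical data lying exactly on Y = 2 + 3X.
x = [1, 2, 3, 4, 5]
y = [5, 8, 11, 14, 17]
a, b = fit_line(x, y)
print(a, b)
```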
[Figure: scatter of data points with the fitted regression line]
These techniques are used for only two variables but when more than two variables are
6.9 Association of Attributes
To study the relationship between two attributes, you have to use association of
attributes. For such association, Prof. Yule has defined a coefficient of association, which
is known as Yule's coefficient of association and is denoted by $Q_{AB}$, where A and B are
$Q_{AB} = \frac{(AB)(ab) - (Ab)(aB)}{(AB)(ab) + (Ab)(aB)}$
Where,
The mentioned frequencies are shown in Table 6.9a which is a 2 x 2 contingency table.
Attribute    A       a       Total
B            (AB)    (aB)    (B)
b            (Ab)    (ab)    (b)
Total        (A)     (a)     N
After computing the value of $Q_{AB}$, if $Q_{AB} = +1$, then there is a perfect
positive association between the attributes. If $Q_{AB} = -1$, then there is a perfect negative
association between the attributes and if $Q_{AB} = 0$, there is no association between the
attributes.
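Yule's coefficient needs only the four cell frequencies of the 2 x 2 table. The sketch below uses a hypothetical function name and made-up frequencies to show the three boundary cases described above.

```python
def yules_q(n_AB, n_Ab, n_aB, n_ab):
    """Yule's coefficient of association from the four cell frequencies."""
    numerator = n_AB * n_ab - n_Ab * n_aB
    denominator = n_AB * n_ab + n_Ab * n_aB
    if denominator == 0:
        raise ValueError("coefficient undefined: both cross-products are zero")
    return numerator / denominator


q_perfect_positive = yules_q(n_AB=10, n_Ab=0, n_aB=0, n_ab=10)
q_perfect_negative = yules_q(n_AB=0, n_Ab=10, n_aB=10, n_ab=0)
q_none = yules_q(n_AB=5, n_Ab=5, n_aB=5, n_ab=5)
print(q_perfect_positive, q_perfect_negative, q_none)
```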
Time Series
Data recorded with respect to a sequence of time is called time series data.
Example: The yearly or monthly sales of a departmental store, the monthly incentives
Thus, a time series data consists of a variable denoted as $Y_t$, recorded at specified time
Secular variations are the variations in data when it is observed for a long period of
time. Thus, the effect of a trend is almost consistent throughout the period considered.
good example of a cyclical variation as the cycle oscillates from prosperity to recession,
[Figure: business cycle showing the peaks, the prosperity phase, and the trough]
Seasonal variations are the variations that occur in data seasonally. For example, the sales of
flight tickets go high during the holiday season, while during the rest of the year they are normal.
Lastly, irregular fluctuations are the variations that occur in data randomly. For example, if
there is a natural calamity, like a flood or an earthquake, or there is a strike or war, then
Index Number
An index number is a device which shows, by its variation, the change in a magnitude
(Wheldom, Business Statistics). Thus, index number studies the change mainly in
economic activities in a period of time. For example, it studies the change in prices of a
Some of the most commonly used index numbers are Laspeyres method, Paasche
method, Fisher's ideal method. Index number is also called economic barometer because
the change but does not give an accurate result of the change.
6.11 Chapter Summary
single value. The measures of central tendency involve mean, median, mode,
series. The main types of measures of dispersion are range, mean deviation, and
standard deviation.
categories.
When there are two or more than two independent variables, the analysis
The change in the economic condition over two situations is called an index
number.
7.1 Introduction
is carried out for new and advanced findings and hypothesis testing enables the
researcher to obtain this. It helps to draw conclusions about the population based on
sample observations.
Hypothesis testing also helps in decision making in the field of business and industry.
For example, the manager of a garment factory wants to compare the outputs of two of
their factories in two different locations. Thus, to test the hypothesis, one must initially
proceed, considering that both factories have the same outputs. Finally, after the
application of statistical tools for testing the hypothesis, one can conclude whether the
hypothesis is true or not, on the basis of the result value. Also, if one wants to study the
of customers visiting the shopping centre for 10 days and, accordingly, after calculating
the result by a statistical technique, one can conclude the average number of customers
Define hypothesis
Discuss hypothesis testing for mean of single sample and compare two means of
two samples
Discuss hypothesis testing for simple, partial, and multiple correlation coefficients
7.2 Hypothesis
a manner that completely reflects the research problem. These statements are based on
educated guess, a proposition that is empirically testable." They also stated that,
Usually, there are relational hypotheses denoting some relationship between the
variables under study, hypotheses about the differences between groups and also,
the variables.
7.3 Types of Hypothesis
Null Hypothesis
The null hypothesis is a statement of no difference, that is, a hypothesis that states that
According to Prof. Fisher, "A null hypothesis is the hypothesis which is tested for
For example, if you want to test whether two samples from a population are similar or
Also, if the average height of the students in a college is said to be 5.4 feet, where the
$H_0: \mu = 5.4$ feet
Alternative Hypothesis
$H_1$.
For example, the alternative hypothesis for a null hypothesis of the type
While, $H_1: \mu < 5.4$ and $H_1: \mu > 5.4$ are one-tailed alternative hypotheses (left-tailed and
right-tailed, respectively).
Population
The population is an aggregate of all the items or individuals. For example, the collection
population S, if there are N numbers of units, then N is termed as the population size.
A population can be finite or infinite, depending on its size. For example, the population
Parameter
The parameters are characteristics of the population which define the population. The
population parameters are mean, denoted by the symbol $\mu$, variance, denoted by $\sigma^2$, etc.
Sample
The size of the sample is the total number of items or individuals in a sample and is
denoted by n. A sample is very useful when the population is very large and studying
the population will involve more money, time, and effort. In such a case, it is convenient
Statistic
different symbols to distinguish between parameters and statistics, that is, mean is
For example, if the weekly sales of an apparel brand for a year are normally distributed
with mean $\mu$ and variance $\sigma^2$, then, the simple hypothesis is given as:
$H_0: \mu = \mu_0$ and $\sigma^2 = \sigma_0^2$
a) $H_0: \mu = \mu_0$
b) $H_0: \mu = \mu_0$, $\sigma^2 > \sigma_0^2$
Test Statistic
After the formulation of the hypothesis, the next step is to test the hypothesis, that is,
to accept or reject the null hypothesis. To test the population characteristics under
selected. Based on this sample the hypothesis is tested, that is, statistics of the sample
are involved in the computation and a test statistic is a function of these statistics.
For example, $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$ is a test statistic. Here, z is a function of the sample mean ($\bar{x}$),
Thus, the test statistic is calculated from the sample and its value is used in decision
appropriate test statistic depends on the hypothesis formulated and the population
distribution.
Critical Value
A critical value for a hypothesis test is the value to which the value of the test statistic is
value varies according to the level of significance of the hypothesis and two-sided or
one-sided test.
While testing a hypothesis, two types of errors can be committed, namely, type-I error
and type-II error. Type-I error is committed if the null hypothesis is rejected, when it is
in fact true and type-II error is committed, if the null hypothesis is accepted, when it is
in fact false.
Level of Significance
The level of significance is a fixed value which indicates the amount of correctness in the
If $\alpha = 0.05$, then the level of significance is 5%, that is, the probability of rejecting a
p-value
Probability value is termed as p-value. It is the probability, computed on the assumption that
the null hypothesis is true, of obtaining a value of the test statistic at least as extreme as
the one observed. If the p-value in a hypothesis test is less
than the already decided significance level, the difference is significant. A smaller p
The power of a hypothesis test is the probability of not committing a type-II error. Thus,
the power of a hypothesis test is one minus the probability of accepting the null
The acceptance region is the region formed by the values of test statistics under
consideration, in which the null hypothesis is accepted, that is the sample space of all
the values of the test statistic is divided into two regions, acceptance region and
rejection region. If the calculated value of the test statistic lies in the acceptance region,
then the null hypothesis is accepted; whereas, if the calculated value of the test statistic
lies in the rejection region, then the null hypothesis is rejected. The rejection region is
also known as the critical region. Fig. 7.4a shows the acceptance and rejection region in
$H_1: \mu < \mu_0$
Fig. 7.4a: Acceptance and Critical Region for Left-Tailed Test
Therefore, the critical region is located on one tail of the probability distribution, as
In Fig. 7.4a, the critical region is located on the left side of the distribution and is
termed as left-tailed test but when the critical region is located on the right side of the
Fig. 7.4b: Acceptance and Critical Region for Right-Tailed Test
For $H_1: \mu \neq \mu_0$, the critical region falls on both sides of the distribution and is termed as
a two-sided test.
[Figure: acceptance region in the centre, with a critical region of area $\alpha/2$ in each tail]
The first step is to formulate the null hypothesis and the alternative hypothesis in
Compare the calculated value of the test statistic with the critical value. If the
calculated value is less than the critical value, then the null hypothesis is accepted,
If the calculated probability is less than or equal to the $\alpha$
value in case of one-tailed test (and $\alpha/2$ in case of two-tailed test), then reject
the null hypothesis, but if the calculated probability is greater, accept the null
hypothesis.
on the value of the test statistic. There are many statistical tests used for hypothesis
testing. These tests are of two types: Parametric and non-parametric tests.
Parametric test involves some assumptions about the population considered. While in
the case of a non-parametric test, it does not involve any such assumptions. Thus, when
information about the population is not available, the non-parametric test is appropriate.
Some parametric tests are z-test, t-test, $\chi^2$ test, F-test, etc. The non-parametric tests
are sign test, run test, Wilcoxon matched pairs test, Kruskal-Wallis test, etc.
Parametric tests for hypothesis testing that are based on the assumption of normal
distribution are:
z-test
two samples, in case of a large sample and also, when the variance is known. It is
t-test
student's t statistic. Like the z-test, t-test is also used to test the significance of a
sample mean or the difference between two sample means, when the variance of
the population from which the sample is drawn is not known. Also, for paired
Chi-square ($\chi^2$) test
variance.
F-test
This test is used to compare the two independent samples in ANOVA. It is also
used to compare more than two samples at a time and for testing the
Sign test
It is the simplest of all the non-parametric tests. In this test, the values of the
observations are replaced by a '+' or '-' sign according to the direction it is moving towards or
Run test
The word 'run' here denotes the sequence or series of symbols which are followed
When you have data for two samples which are paired, the Wilcoxon matched
pairs test is applicable. This test is used to make inferences about the difference
between two populations.
Kruskal-Wallis test
This test is used to compare more than two populations for continuous data. In
ANOVA is used, with the ranks as the values of the observations for each group.
Hypothesis testing for mean of single sample and for difference between mean is:
Case 01:
When the population is infinite and normally distributed but the variance, $\sigma^2$, of the
Here, $H_0: \mu = \mu_0$
Where,
$\mu_0$ = A hypothetical value
Then, for a one-sided or two-sided alternative hypothesis, the test statistic applied is
given as: $z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}}$
Case 02:
When the population is finite and normally distributed but the variance ($\sigma^2$) of the
sided or two-sided.
Then, test statistic z is given as: $z = \frac{\bar{X} - \mu_0}{(\sigma / \sqrt{n}) \sqrt{(N - n)/(N - 1)}}$
Case 03:
When the population is infinite and normally distributed but the variance ($\sigma^2$) of the
population is unknown.
Since the population variance is not known, sample standard deviation is used as an
And the test statistic used to test the hypothesis is $t = \frac{\bar{X} - \mu_0}{\sigma_s / \sqrt{n}}$, with (n - 1) degrees of
freedom or df.
t - X - o
Example 01:
A sample of 400 male students is found to have a mean height 67.47 inch. Can it be
reasonably regarded as a sample from a large population with mean height 67.39 inch
and standard deviation 1.30 inch? Test whether the sample is drawn from the given
Solution 01:
Taking the null hypothesis that the mean height of the population is equal to 67.39 inch,
we can write:
$H_0: \mu = 67.39$ inch
$H_1: \mu \neq 67.39$ inch
the population to be normal, we can work out the test statistics 'z' as follows:
$z = \frac{\bar{X} - \mu_0}{\sigma / \sqrt{n}} = \frac{67.47 - 67.39}{1.30 / \sqrt{400}} = \frac{0.08}{0.065} = 1.231$
determining the rejection regions at 5% level of significance which, using normal curve
$R: |z| > 1.96$
The observed value of z is 1.231, which is in the acceptance region, since $R: |z| > 1.96$
and thus, $H_0$ is accepted. We may conclude that the given sample (with mean height =
67.47 inch) can be regarded to have been taken from a population with mean height
Source: Kothari, C.R., Research Methodology: Methods and Techniques, New Age International Publishers,
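The arithmetic of Example 01 can be reproduced with a short Python sketch. The function name is hypothetical; the critical value is taken from the standard normal distribution via `statistics.NormalDist` (Python 3.8 or later) instead of a printed table.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(sample_mean, mu0, sigma, n):
    """z = (sample mean - hypothesised mean) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


z = one_sample_z(sample_mean=67.47, mu0=67.39, sigma=1.30, n=400)

alpha = 0.05
critical_value = NormalDist().inv_cdf(1 - alpha / 2)  # two-tailed test
reject_null = abs(z) > critical_value

print(round(z, 3), round(critical_value, 2), reject_null)
```

Here `reject_null` is false, which corresponds to z falling in the acceptance region.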
Case 04:
When the population variances ($\sigma_1^2$, $\sigma_2^2$) are known and the samples are large. Then, to
$H_0: \mu_1 = \mu_2$
Where,
$\mu_1, \mu_2$ = Population means of two separate populations from which samples are drawn.
Where,
If the population variances $\sigma_1^2$, $\sigma_2^2$ are not known, sample variances are used.
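Case 05:
If large samples are drawn from the same population with known variance, then the test
statistic z is given as: $z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
However, in case the population variance is not known, combined sample standard
$\sigma_{s_{12}} = \sqrt{\frac{n_1(\sigma_{s_1}^2 + d_1^2) + n_2(\sigma_{s_2}^2 + d_2^2)}{n_1 + n_2}}$
Where,
$d_1 = (\bar{X}_1 - \bar{X}_{12})$
$d_2 = (\bar{X}_2 - \bar{X}_{12})$
$\bar{X}_{12} = \frac{n_1 \bar{X}_1 + n_2 \bar{X}_2}{n_1 + n_2}$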
Case 06:
For small samples, if the population variances are unknown but assumed to be equal.
t = (x,-x,)
with df (n, + n, - 2)
Example 02:
The mean score in a test of total marks 100 of two samples of size 80 and 100 students
are 61 and 55. Also, the standard deviations for the two samples are given as 2 and 1,
respectively. Test whether the two samples are drawn from the same population with
Solution 02:
$H_0: \mu_1 = \mu_2$
$H_1: \mu_1 \neq \mu_2$
Samples of size $n_1 = 80$ and
$\sigma = 1.4$
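$z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma^2 \left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$
$= \frac{5}{\sqrt{(1.96)\left(\frac{1}{80} + \frac{1}{100}\right)}}$
$= \frac{5}{\sqrt{(1.96)(0.0225)}}$
$= 23.8095$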
Using normal curve area table, the critical region for 5% level of significance is $|z| > 1.96$
Therefore, the calculated value of z falls on the rejection region. Thus, the null
conclude that the two samples are not drawn from the same population.
When the hypothesis is tested for variance there can be two cases:
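When a sample is drawn with variance $\sigma_s^2$ from a population with variance $\sigma_p^2$, the null
hypothesis is given as: $H_0: \sigma_s^2 = \sigma_p^2$
When two populations are to be compared for equality of variances, the null hypothesis
is given as: $H_0: \sigma_{p_1}^2 = \sigma_{p_2}^2$
$F = \frac{\sigma_{s_1}^2}{\sigma_{s_2}^2}$
Where,
$\sigma_{s_1}^2 = \frac{\sum (X_{1i} - \bar{X}_1)^2}{(n_1 - 1)}$
$\sigma_{s_2}^2 = \frac{\sum (X_{2i} - \bar{X}_2)^2}{(n_2 - 1)}$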
If the calculated value of F is greater than the table value of F, at a certain level of
significance, for (n1 - 1) and (n2 - 1) degrees of freedom, regard the F-ratio as
significant.
Example 03:
If two samples are drawn from two normally distributed populations as given below, test
Sample 1    4   6   3   8   10  11  6   7   2   12  17  16
Sample 2    12  14  18  19  15  10  11  16  21  20  18  9   13  17
Table 7.5a: Distribution of the Two Samples
Solution 03:
Since the population variance is not known, we use sample variances $\sigma_{s_1}^2$ and $\sigma_{s_2}^2$
Sample 1 ($X_{1i}$)    Sample 2 ($X_{2i}$)    $(X_{1i} - \bar{X}_1)^2$    $(X_{2i} - \bar{X}_2)^2$
4                      12                     20.25                       10.3298
6                      14                     6.25                        1.473796
3                      18                     30.25                       7.761796
8                      19                     0.25                        14.3338
10                     15                     2.25                        0.045796
11                     10                     6.25                        27.1858
6                      11                     6.25                        17.7578
7                      16                     2.25                        0.617796
2                      21                     42.25                       33.4778
12                     20                     12.25                       22.9058
17                     18                     72.25                       7.761796
16                     9                      56.25                       38.6138
                       13                                                 4.901796
                       17                                                 3.189796
$\bar{X}_1 = \frac{\sum X_{1i}}{n_1} = \frac{102}{12} = 8.5$
$\bar{X}_2 = \frac{\sum X_{2i}}{n_2} = \frac{213}{14} = 15.214$
$\sigma_{s_1}^2 = \frac{\sum (X_{1i} - \bar{X}_1)^2}{(n_1 - 1)} = \frac{257}{12 - 1} = 23.364$
$\sigma_{s_2}^2 = \frac{\sum (X_{2i} - \bar{X}_2)^2}{(n_2 - 1)} = \frac{190.357}{14 - 1} = 14.643$
The value of the test statistic is:
$F = \frac{\sigma_{s_1}^2}{\sigma_{s_2}^2} = \frac{23.364}{14.643} = 1.596$
The calculated value (1.596) is less than the tabulated value (2.60), so we accept the
null hypothesis at 5% level of significance. Hence, we can conclude that the two samples
have been drawn from two populations with the same variance.
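The sample variances and the F-ratio for the two samples of Table 7.5a can be recomputed with Python's `statistics.variance`, which uses the (n - 1) divisor. This is a sketch with illustrative variable names; placing the larger variance in the numerator is an assumption that matches the example, and the tabulated F value still has to be read from a table.

```python
from statistics import variance

sample_1 = [4, 6, 3, 8, 10, 11, 6, 7, 2, 12, 17, 16]
sample_2 = [12, 14, 18, 19, 15, 10, 11, 16, 21, 20, 18, 9, 13, 17]

var_1 = variance(sample_1)  # sum of squared deviations / (n1 - 1)
var_2 = variance(sample_2)  # sum of squared deviations / (n2 - 1)

# F-ratio with the larger sample variance in the numerator.
f_ratio = max(var_1, var_2) / min(var_1, var_2)

print(round(var_1, 3), round(var_2, 3), round(f_ratio, 3))
```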
If a sample of 'n' pairs of observations (x, y) is drawn from a normal population, 'r' is the
correlation coefficient between X and Y, and the population correlation is '$\rho$', the null
hypothesis is: $H_0: \rho = 0$
In case of simple correlation coefficient, the test statistic is: $t = \frac{r \sqrt{n - 2}}{\sqrt{1 - r^2}}$ with df (n - 2)
This calculated value is then compared with its tabulated value for a specific level of
significance. If the calculated value is less than the tabulated value, the null hypothesis
rP
In case of partial correlation coefficient (rp), t = --'----=
Ji- r:
Where,
n = N u m b e r of paired observations
k = N u m b e r of variables
If the tabulated value o f t , for (n - k)df, is greater than the calculated value, we accept
the null hypothesis for a specific level of significance that there is no partial correlation
coefficient.
If the multiple correlation coefficient is denoted by R, then the test statistic applicable
R A!< - 1 )
here is F-statistics and is given as: F-
- (l- R2 V
/(n- k)
Where,
k = N u m b e r of variables involved
n = N u m b e r of paired observations
less than the calculated value of F, the null hypothesis is rejected at a% level of
significance.
7.10 Limitations of Testing of Hypothesis
It only explains whether the null hypothesis is true or false but it does not explain why the hypothesis is accepted or rejected. Thus, it fails to give the cause of the acceptance or rejection.
The result obtained from the computation of a test statistic is compared with the
7.11 Chapter Summary
hypothesis.
Parametric test procedures assume that the data has come from a type of probability distribution.
Analysis of Variance (ANOVA)
8.1 Introduction
As discussed in the previous chapter, the t-test is used to compare the means of two samples. However, when an experiment involves more than two sets of data, it would be time-consuming to compare the results pair by pair. In agricultural applications, for example, you must test more than two samples to study the influence of various factors, such as variation in seed quality, effect of fertilizers on the types of seeds, etc. In such a situation, analysis of variance is applicable.
Analysis of variance, most commonly known as ANOVA, is one of the main statistical
techniques used to test differences between two or more means. ANOVA means
'Analysis of Variance', rather than 'Analysis of Means' because inferences about means
are made by analyzing the variance. With ANOVA, you can analyze data from several
independent variables simultaneously.
Analysis of variance for experimenting with only one factor is called 'one-way ANOVA' and for experimenting with two factors, it is called 'two-way ANOVA'. In this method, the
computed for each factor in the experiment, which is a test for main effects. The ANOVA
method doesn't depend on the number of levels of each factor. ANOVA is available for
8.2 Analysis of Variance (ANOVA)
the sum of its non-negative components, where each of these is a measure of the
difference in means between two or more groups are statistically significant, which is
practically difficult to solve by z-test or t-test. In such a case, the hypothesis testing technique ANOVA is used.
Using the ANOVA technique, you can decompose the total variability found within a data set into two components: random and systematic factors. The random factors do not have any statistical influence on the given data set, while the systematic factors do. The
Examples:
as well as, their final grading and if you are interested in finding out whether the
ANOVA, you can break up the group according to the grade and see if the relation
Consider that a manager wants to find whether the location has an effect on the
profit of an apparel retail business having the following alternatives for location,
Here, location is the only independent variable (IV) and the profit/loss is the
dependent variable (DV). In such a case, the t-test would not be appropriate. One
Assumptions in ANOVA
variance and zero mean. The error is not related to any of the level of X.
The error terms or variation within samples are uncorrelated. If the error terms
are correlated (that is, the observations are not independent), then F-ratio can be
seriously distorted.
ANOVA must have a Dependent Variable (DV) that is metric and also, one or more
Independent Variables (IV) that are all categorical. Measurable variables, such as height,
According to Gudmund R. Iversen, Mary Gergen, and Mary M. Gergen, "a metric
variable is not metric in the sense that the metric system but in the sense that
example of retail business, profit/loss is DV and three locations are treatment for a
location as a factor.
various levels for the single factor. If more than two factors are involved, the analysis is
technique. ANOVA is, essentially, a procedure for testing the difference among different
ANOVA is that the total amount of variation in a set of data is broken down into
two types, that amount which can be attributed to chance and that amount
There may be variation within sample items and between samples. Using ANOVA, you can split the variance for analytical purposes. Therefore, it is a method of analyzing the various sources of variation. Using the ANOVA technique, you can explain whether varieties
feed prepared for a particular class of animal or various types of drugs manufactured for curing a specific disease, may be studied and judged to be significant or not through the application of the ANOVA technique.
[Figure: distributions of the groups in Experiment 1 and Experiment 2]
groups.
Hence, to differentiate the groups in experiment 2, the variability between the groups must be greater than the variability within the groups. If the variability within the groups is large compared to the variability between the groups, any difference between the groups is difficult to detect. Variability between the groups and variability within the groups are compared to determine whether or not the group means are significantly different.
Using ANOVA technique, one can investigate any number of factors that influence the
dependent variable. Also, one may investigate the differences amongst various
categories within each of these factors, which may have a large number of possible
values.
If you take only one factor and investigate the differences amongst its various categories having numerous possible values, you are said to use one-way ANOVA, and in case you investigate two factors at the same time, you use two-way ANOVA. In a two or
more way ANOVA, the interaction (that is, inter-relation between two independent
8.4 Variability Measure by One-way ANOVA
Differences among the means of the population are tested by analyzing the amount of
[Figure: components of the total variability in DV]
In terms of variation within the given population, it is assumed that the value of the i-th observation of the j-th group, $Y_{ij}$ (where i and j are positive integers excluding zero), differs from the mean of this population only because of random effects, that is, there are influences on $Y_{ij}$ that are unexplainable.
unexplainable, thus, an error in observations.
between the mean of the j-th group and the grand mean is attributable to what is called a
Two estimates are compared with the F-test for the given degrees of freedom and level
In ANOVA, the F-test is used to test the null hypothesis, which is stated as:
$$H_0: \mu_1 = \mu_2 = \mu_3 = \dots = \mu_k$$
The larger the value of the F statistic, the greater the likelihood that the differences between
means are due to the treatment or something other than chance alone, that is, the
Thus, you have to accept the alternative hypothesis, which states that at least one of the sample means is significantly different from the rest of the means.
8.5 ANOVA Technique
Consider that N observations $x_{ij}$ (i = 1, 2 ... k; j = 1, 2 ... $n_i$) of a random variable X are grouped into k classes of sizes $n_1, n_2 \dots n_k$, respectively, ($N = \sum_{i=1}^{k} n_i$) as shown in Table 8.5a.
Class | Observations | Means | Total
1 | $x_{11} \; x_{12} \; \dots \; x_{1n_1}$ | $\bar{x}_{1.}$ | $T_{1.}$
... | ... | ... | ...
k | $x_{k1} \; x_{k2} \; \dots \; x_{kn_k}$ | $\bar{x}_{k.}$ | $T_{k.}$
Example 01:
Three machines A, B, and C are tested to see whether their outputs (number of items
Machine A | 12 10 11 13 14 15
Machine B | 11 8 12 10 13
Machine C | 10 11 14 15 12 13
Solution 01:
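The total variation in Y, denoted by $SS_y$ (SS: Sum of Squares), is decomposed into two components:
$$SS_y = SS_{between} + SS_{within}$$
Where,
Total Variability
$$SS_y = \sum_{j=1}^{c} \sum_{i=1}^{n_j} Y_{ij}^2 - c.f., \qquad c.f. = \frac{\left(\sum_{j=1}^{c} \sum_{i=1}^{n_j} Y_{ij}\right)^2}{N}$$
Within Group
$SS_{within}$ = Total of (Group observed value - Group mean)$^2$
Between Group
$$\text{Between sum of squares} = \sum_{j} \frac{Y_j^2}{n_j} - c.f.$$
where $Y_j$ is the total of group j and c.f. is the correction factor.
$$\bar{Y}_j = \frac{\sum_{i=1}^{n_j} Y_{ij}}{n_j} \quad \text{(Mean for category/group j)}$$
$$\bar{Y} = \frac{\sum_{j=1}^{c} \sum_{i=1}^{n_j} Y_{ij}}{N} \quad \text{(Mean over the whole sample or the grand mean)}$$
That is, $\bar{Y}_1 = 12.5$, $\bar{Y}_2 = 10.8$, $\bar{Y}_3 = 12.5$
Machine | Observations | Total ($Y_j$) | Sum of squares ($\sum Y_{ij}^2$)
A | 12 10 11 13 14 15 | 75 | 955
B | 11 8 12 10 13 | 54 | 598
C | 10 11 14 15 12 13 | 75 | 955
$$c.f. = \frac{(75 + 54 + 75)^2}{17} = \frac{204^2}{17} = 2448$$
$$SS_y = (955 + 598 + 955) - 2448 = 60$$
$$\text{Between sum of squares} = \sum_j \frac{Y_j^2}{n_j} - c.f. = \left(\frac{75^2}{6} + \frac{54^2}{5} + \frac{75^2}{6}\right) - 2448 = 10.2$$
$$SS_{within} = SS_y - SS_{between} = 60 - 10.2 = 49.8$$
The strength of the effect of the factor is measured by:
$$\eta^2 = \frac{SS_{between}}{SS_y} = \frac{SS_y - SS_{within}}{SS_y}$$
The value of $\eta^2$ varies between 0 and 1.
$$\eta^2 = \frac{SS_{between}}{SS_y} = \frac{10.2}{60} = 0.17$$
In other words, 17% of the variation in the output is accounted for by the type of machine.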
In one-way ANOVA, the interest lies in testing the null hypothesis that the category means are equal.
Under null hypothesis:
$$H_0: \mu_1 = \mu_2 = \mu_3$$
Assume that the variation between the samples and within the samples come from the same source of variation.
The null hypothesis may be tested by the F statistic based on the ratio between the two estimates as follows:
$$F = \frac{SS_{between}/(c - 1)}{SS_{within}/(N - c)} = \frac{MS_{between}}{MS_{within}}$$
Refer to the F-distribution table for the value of $F_{critical}$ for various levels of significance with degrees of freedom (c - 1) = 2 and (N - c) = 14.
$$F_{calculated} = \frac{SS_{between}/(c - 1)}{SS_{within}/(N - c)} = \frac{10.2/2}{49.8/14} = 1.4338$$
The independent variable does not have a significant effect on the dependent variable, if the null hypothesis is accepted.
On the other hand, the effect of the independent variable is significant, if the null hypothesis is rejected.
A comparison of the category means will indicate the nature of the effect of the independent variable.
For the given example, the null hypothesis may be accepted. There is no significant difference in the mean of the groups. The type of machine does not have a significant effect on the output.
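The one-way ANOVA computation for the three machines can be reproduced in a few lines. The sketch below, written in Python with variable names chosen here for illustration, follows the correction-factor approach used in the solution: total, between-group, and within-group sums of squares, then the F-ratio and $\eta^2$.

```python
# Outputs of the three machines, as given in Example 01
groups = {
    "A": [12, 10, 11, 13, 14, 15],
    "B": [11, 8, 12, 10, 13],
    "C": [10, 11, 14, 15, 12, 13],
}

n_total = sum(len(v) for v in groups.values())        # N
grand_total = sum(sum(v) for v in groups.values())    # sum of all observations
correction = grand_total ** 2 / n_total               # c.f.

# Total SS: sum of squared observations minus the correction factor
ss_total = sum(x * x for v in groups.values() for x in v) - correction
# Between SS: sum of (group total)^2 / group size minus the correction factor
ss_between = sum(sum(v) ** 2 / len(v) for v in groups.values()) - correction
ss_within = ss_total - ss_between

df_between = len(groups) - 1          # c - 1
df_within = n_total - len(groups)     # N - c

f_value = (ss_between / df_between) / (ss_within / df_within)
eta_squared = ss_between / ss_total

print(f"SS_total={ss_total:.1f}, SS_between={ss_between:.1f}, "
      f"SS_within={ss_within:.1f}, F={f_value:.4f}, eta^2={eta_squared:.2f}")
```

The calculated F is then compared with the tabulated F for (c - 1) and (N - c) degrees of freedom, exactly as in the worked solution.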
8.7 Two-way ANOVA
While doing research, the researcher is often concerned with the effect of more than one
factor simultaneously. In two-way ANOVA, the influences of two factors are considered
For example, the quality of fabric (high, medium, and low) interacts with price levels
(high, medium, and low) to influence a brand's sale. Here, the dependent variable is the
brand's sale and the independent variables are quality and pricing. Within two factors,
Depending upon the replication of data within the levels of the factor, two-way ANOVA is
'm' number of times.
The data format for two-way ANOVA without replication is shown in Table 8.7a.
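Factor A / Factor B | $B_1$ | $B_2$ | ... | $B_c$ | Row Mean
$A_1$ | $Y_{11}$ | $Y_{12}$ | ... | $Y_{1c}$ | $\bar{Y}_{1.}$
$A_2$ | $Y_{21}$ | $Y_{22}$ | ... | $Y_{2c}$ | $\bar{Y}_{2.}$
... | ... | ... | ... | ... | ...
$A_r$ | $Y_{r1}$ | $Y_{r2}$ | ... | $Y_{rc}$ | $\bar{Y}_{r.}$
Column Mean | $\bar{Y}_{.1}$ | $\bar{Y}_{.2}$ | ... | $\bar{Y}_{.c}$ | $\bar{Y}$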
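Source of Variation | Sum of Squares | Degrees of Freedom | Mean Square | F-test Ratio
Factor A | $SS_A = c\sum_{i=1}^{r} (\bar{Y}_{i.} - \bar{Y})^2$ | (r - 1) | $MS_A = \frac{SS_A}{r - 1}$ | $F_A = \frac{MS_A}{MS_E}$
Factor B | $SS_B = r\sum_{j=1}^{c} (\bar{Y}_{.j} - \bar{Y})^2$ | (c - 1) | $MS_B = \frac{SS_B}{c - 1}$ | $F_B = \frac{MS_B}{MS_E}$
Error | $SS_E = \sum_{i=1}^{r} \sum_{j=1}^{c} (Y_{ij} - \bar{Y}_{i.} - \bar{Y}_{.j} + \bar{Y})^2$ | (r - 1)(c - 1) | $MS_E = \frac{SS_E}{(r - 1)(c - 1)}$ |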
While studying differences in the mean values of the dependent variables related to the
effect of the controlled independent variables, it is often necessary to take into account
variables is usually removed by simple linear regression method and the residual sums
of squares are used to provide variance estimates, which, in turn, are used to make
tests of significance.
Consider the influence of variable X (IV) on variable Y (DV) and also, the influence of
Subtracting each individual score (Y;) from correction factor of Y(Yi), that is
ANOCOVA: Assumptions
Assume that there is some sort of relationship between the dependent variable
Source: C. R. Kothari, Research Methodology-Methods and Techniques, Second Revised Edition, New Age
8.9 Chapter Summary
It considers that total variation is due to the variation in treatment and the variation that is unexplained.
In one-way ANOVA, only one factor of influence is considered for study, in two
Two estimates are compared with the F-test for the given degrees of freedom and
Testing and Chi-square Test
9.1 Introduction
There are various statistical techniques that are based on assumptions about the
drawn from a normally-distributed population. All such techniques fall under parametric tests, but in situations where there are no rigid assumptions about the population, a parametric test is not applicable. Thus, for such data, non-parametric tests are the only choice because they make no assumptions regarding the population and its parameters.
measure of non-parametric test, is discussed. Chi-square test is a widely used test for
tests
Discuss Chi-square test
9.2 Non-parametric Test
Definition
The non-parametric test is for inferences which do not need any assumptions about the
test'. For example, if the sales of two sports goods' brands are to be compared and
there is no assumption about the distribution of the two variables (both brands), then
its result is more reliable than a non-parametric test. Thus, parametric and non
parametric tests differ from each other. Table 9.2a shows the advantages and
Advantages Disadvantages
data.
applications in psychometrical,
statistics,
as: A+, A-, B+, A, B, etc.
Table 9.2a: Advantages and Disadvantages of Non-parametric Test over Parametric Test
Source: Gupta. S. C., Kapoor V. K., Fundamentals of Mathematical Statistics, Eleventh Edition, Sultan
In this chapter, some of the non-parametric tests, like sign test, rank sum test, run test,
Non-parametric tests are very easy to calculate and also provide quick results. In a situation where you have data that is not exact or have no information about the population distribution from which it is taken, the parametric test fails and the non
9.3 Chi-square Test
The Chi-square test is symbolically represented as $\chi^2$. The chi-square test is used for
The chi-square test can be used to determine whether the population variance is significant, that is, a sample is drawn from a population, which is normally distributed with mean $\mu$ and variance $\sigma_p^2$.
The null hypothesis is:
$$H_0: \sigma_s^2 = \sigma_p^2$$
Where,
$\sigma_s^2$ = Variance of the sample
n = Sample size
The test statistic is:
$$\chi^2 = \frac{\sigma_s^2}{\sigma_p^2}(n - 1)$$
The value obtained from the above formula is compared with the critical value of chi
significance $\alpha$.
Chi-square is also used as a non-parametric test when the assumptions about the population from which the samples are drawn are not known. The chi-square for non
parametric test involves chi-square goodness of fit and chi-square to test the
This test measures the difference between the observed and the theoretical (expected) frequencies. This mechanism was developed by Karl Pearson in 1900, who named it
$$\chi^2 = \sum_{i=1}^{k} \frac{(o_i - e_i)^2}{e_i}$$
Where,
$e_i$ = Expected frequency
$o_i$ = Observed frequency
k = Number of categories
If a sample is arranged in k categories and the observed frequency is given, then the
$$e_i = n p_i, \quad i = 1, 2, 3 \dots k$$
Where,
n = Sample size
Thus, comparing the calculated and the critical value at a specified level of significance
Example 01:
If two coins are tossed 60 times, then the number of heads is given as shown in Table 9.3a.
Number of Heads | 0 | 1 | 2
Frequency | 15 | 25 | 20
Solution 01:
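x | $p_i$ | $o_i$ | $e_i$ | $o_i - e_i$ | $(o_i - e_i)^2$ | $\frac{(o_i - e_i)^2}{e_i}$
0 | 0.25 | 15 | 15 | 0 | 0 | 0
1 | 0.5 | 25 | 30 | -5 | 25 | 25/30 = 5/6
2 | 0.25 | 20 | 15 | 5 | 25 | 25/15 = 5/3
Total | | | | | | 15/6
$$\text{Therefore, } \chi^2 = \sum \frac{(o_i - e_i)^2}{e_i} = \frac{15}{6} = 2.5$$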
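The goodness-of-fit statistic for the coin example can be verified with exact fractions. This is a minimal sketch in Python; the dictionary names are illustrative, and the probabilities 1/4, 1/2, 1/4 are the binomial probabilities of 0, 1, and 2 heads for two fair coins, as used in the solution table.

```python
from fractions import Fraction

# Observed frequencies of 0, 1 and 2 heads in 60 tosses of two coins
observed = {0: 15, 1: 25, 2: 20}
# Probabilities of 0, 1 and 2 heads under the null hypothesis of fair coins
probabilities = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

n = sum(observed.values())
# Expected frequency e_i = n * p_i
expected = {heads: n * p for heads, p in probabilities.items()}

# Chi-square statistic: sum of (o_i - e_i)^2 / e_i over all categories
chi_square = sum(
    (observed[h] - expected[h]) ** 2 / expected[h] for h in observed
)
degrees_of_freedom = len(observed) - 1

print(f"chi-square = {chi_square} = {float(chi_square)}, "
      f"df = {degrees_of_freedom}")
```

Using `Fraction` keeps the intermediate terms 5/6 and 5/3 exact, so the result matches the hand calculation without rounding.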
Thus, the critical value of $\chi^2$ for 2 degrees of freedom at 5% level of significance is 5.991. Since the calculated $\chi^2$ (2.5) is less than the critical value, we accept the null hypothesis at 5% level of significance.
If two attributes, A and B, are given, which are divided into 'r' and 's' sub-categories, they are arranged in an r x s contingency table, as shown in Table 9.3c.
$A_1$ | $A_2$ | ... | $A_r$ | Total
$$\chi^2 = \sum \frac{(o_i - e_i)^2}{e_i}$$
Where,
$o_i$ = Observed frequency
$e_i$ = Expected frequency
Finally, the critical value is obtained in order to test the significance of the null
Example 02:
Table 9.3d shows the distribution of sales of two apparel brands in showrooms,
Brand | Sales from Showroom | Sales from Shopping Centre | Sales from Online Shopping | Total
A | 120 | 100 | 60 | 280
B | 180 | 60 | 80 | 320
Total | 300 | 160 | 140 | 600
Test whether both the brands are equally preferred at a 5% level of significance.
Solution 02:
The null hypothesis is $H_0$: There is no difference between sales of the two brands in the given outlets.
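The expected frequencies for brand A are:
$$\text{Sales in showroom} = \frac{280 \times 300}{600} = 140$$
$$\text{Sales in shopping centre} = \frac{280 \times 160}{600} = 74.667$$
$$\text{Sales in online shopping} = \frac{280 \times 140}{600} = 65.333$$
Similarly, for brand B:
$$\text{Sales in showroom} = \frac{320 \times 300}{600} = 160$$
$$\text{Sales in shopping centre} = \frac{320 \times 160}{600} = 85.333$$
$$\text{Sales in online shopping} = \frac{320 \times 140}{600} = 74.667$$
Therefore, the value of the $\chi^2$ statistic is calculated, as shown in Table 9.3e.
Brands | Observed Frequency (o) | Expected Frequency (e) | (o - e) | $(o - e)^2$ | $\frac{(o - e)^2}{e}$
A, showroom | 120 | 140 | -20 | 400 | 2.857
A, shopping centre | 100 | 74.667 | 25.333 | 641.778 | 8.595
A, online shopping | 60 | 65.333 | -5.333 | 28.444 | 0.435
B, showroom | 180 | 160 | 20 | 400 | 2.500
B, shopping centre | 60 | 85.333 | -25.333 | 641.778 | 7.521
B, online shopping | 80 | 74.667 | 5.333 | 28.444 | 0.381
Total | | | | | 22.289
Table 9.3e: 2 x 3 Contingency Table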
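The contingency-table calculation can be sketched in Python as follows. The observed counts for brand A (120, 100, 60) are not printed in the surviving table; they are assumed here from the row and column totals (280, 320; 300, 160, 140) used in the expected-frequency calculations. Variable names are illustrative.

```python
# Rows: brands A and B; columns: showroom, shopping centre, online shopping.
# Brand A counts are inferred from the marginal totals used in the solution.
observed = [
    [120, 100, 60],
    [180, 60, 80],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency for each cell: row total * column total / grand total
expected = [
    [rt * ct / grand_total for ct in col_totals] for rt in row_totals
]

# Chi-square statistic summed over every cell of the table
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

print(f"chi-square = {chi_square:.3f}, df = {degrees_of_freedom}")
```

The resulting statistic is compared with the tabulated chi-square value for (r - 1)(s - 1) degrees of freedom.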
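The critical value of $\chi^2$ for (2 - 1)(3 - 1) = 2 degrees of freedom at 5% level of significance is 5.991. Since the critical value is less than the calculated value of chi-square (22.289), we reject the null hypothesis.
Since no assumptions are available about the population, the test is not based on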
frequencies.
No group should contain less than 10 items. In cases where the frequencies are
The constraints must be linear. Constraints which involve linear equations in the
Source: Kothari, Research Methodology, 2002, New Age International, New Delhi.
9.4 Sign Test
The easiest and simplest of all non-parametric tests is the sign test. In this test, the direction of the observations, that is, positive or negative direction, which are denoted by '+' and '-' signs, are considered instead of the magnitude. There are two types of sign test:
One sample sign test
Two sample sign test
If a data $X_1, X_2, X_3, \dots, X_n$ is given with the sample median $\theta$. This test is used when you
$$H_0: \theta = \theta_0$$
If the value of the sample observation is greater than $\theta_0$, then the values are replaced by a positive sign '+', otherwise by a negative sign '-'. However, if the values of the
The total number of '+' signs (r) and the total number of '-' signs (s) is such that
$$r + s \le n$$
Thus, in order to test the hypothesis here, r is considered to follow binomial distribution
$$H_0: p = \frac{1}{2} \quad \text{and} \quad H_1: p > \frac{1}{2} \text{ or } p < \frac{1}{2}$$
Example 03:
given: 320, 370, 430, 320, 350, 310, 390, 380, 360, 320, 400, and 320.
Using sign test at 5% level of significance, test that the average pages printed are 385.
Solution 03:
The null hypothesis is $H_0: \theta = 385$
The values of the given data are replaced by a positive sign '+' if they are greater than 385 and by a negative sign '-' otherwise, as below:
-, -, +, -, -, -, +, -, -, -, + and -.
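The sign counts, and an exact binomial probability as an alternative to the tabulated critical value, can be obtained with a short script. This is a sketch in Python under the assumption of a two-sided alternative; the p-value route is one common way of carrying out the sign test and is not necessarily the table lookup the text relies on.

```python
from math import comb

pages = [320, 370, 430, 320, 350, 310, 390, 380, 360, 320, 400, 320]
hypothesised_median = 385

# Count '+' signs (values above the median) and '-' signs (values below)
plus = sum(1 for v in pages if v > hypothesised_median)
minus = sum(1 for v in pages if v < hypothesised_median)
n = plus + minus  # observations equal to the median would be dropped

# Under H0 the number of '+' signs follows Binomial(n, 1/2).
# Two-sided p-value: twice the tail probability of the rarer sign.
k = min(plus, minus)
tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
p_value = min(1.0, 2 * tail)

decision = "accept H0" if p_value > 0.05 else "reject H0"
print(f"r = {plus}, s = {minus}, p-value = {p_value:.4f}: {decision}")
```

A p-value larger than 0.05 leads to the same decision as comparing r with the tabulated binomial value at the 5% level.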
Thus, the modified data follows binomial distribution and the null hypothesis is
$$H_0: p = \frac{1}{2}$$
And n = 12, r = 3, s = 9, and $p = \frac{1}{2}$, so tabulated value at 5% level of significance from
Therefore, the null hypothesis is accepted at 5% level of significance and we can
This test is used to determine if two samples are drawn from an identical population. Thus, if there are two samples, then the sign test for two samples is applicable.
In this method, each pair of values is replaced with a positive sign '+' if the value of the first sample is greater than the value of the second sample. Otherwise, it is replaced by a negative sign '-'.
9.5 Run Test
The word 'run' here denotes the sequence or series of symbols that are followed or
preceded by a different symbol or no symbol. A run test is used to verify whether there
is any randomness among the observations in a given data. Thus, a run test helps you
For example, before launching a new health drink in the market, the manager of the
company wants to conduct a survey to determine which age group would be the
preferable target group for the product. Customers in the age group < 25 years are denoted by T and those > 25 years are denoted by O. The manager puts up a counter for the customers to taste the health drink and feedback is taken.
$H_0$: The customers in age group < 25 and > 25 visiting the counter are random.
T T | O O | T T T T T T T T | O O O O O O | T T T T
Run 1 | Run 2 | Run 3 | Run 4 | Run 5
Thus, in the above representation, there are 5 total number of runs (r) in which 14 are
Thus, for small samples, when the sample size is less than 20, the lower ($r_l$) and upper ($r_u$) critical values at a specific significance level can be obtained from the table for
If the sample size is greater than 20, then the sampling distribution of 'r' tends to normal distribution with mean ($\mu$) and variance ($\sigma^2$).
Where,
$$\mu = \frac{2 n_1 n_2}{n_1 + n_2} + 1$$
$$\sigma^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)}$$
To test 'r', the following standard normal statistic is obtained:
$$Z = \frac{r - \mu}{\sigma}$$
If the calculated value of Z lies between the tabulated values $-Z_{\alpha/2}$ and $Z_{\alpha/2}$, then the
Example 04:
If a die is thrown 20 times, you need to test whether the occurrence of an even number
E E E E E E E O O O E E O O O O E E E E
Use $\alpha$ = 0.05.
Solution 04:
In the given sequence,
r = Number of runs = 5
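Counting runs by hand is error-prone for longer sequences, so a small script helps. The sketch below, in Python, counts the runs in the die-throw sequence and also evaluates the mean and variance formulas of the number of runs; those two values are shown only to illustrate the formulas, since for a small sample like this one the decision is taken from the tabulated critical values.

```python
from itertools import groupby

# E = even, O = odd, in the order the 20 throws occurred
sequence = "EEEEEEE" + "OOO" + "EE" + "OOOO" + "EEEE"

# A run is a maximal block of identical consecutive symbols
runs = [(symbol, len(list(block))) for symbol, block in groupby(sequence)]
r = len(runs)
n1 = sequence.count("E")
n2 = sequence.count("O")

# Mean and variance of the number of runs under randomness
mean_runs = 2 * n1 * n2 / (n1 + n2) + 1
variance_runs = (
    2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
    / ((n1 + n2) ** 2 * (n1 + n2 - 1))
)

print(f"runs = {runs}")
print(f"r = {r}, n1 = {n1}, n2 = {n2}, "
      f"mean = {mean_runs:.2f}, variance = {variance_runs:.3f}")
```

`itertools.groupby` groups only adjacent equal items, which is exactly the definition of a run.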
The lower ($r_l$) and upper ($r_u$) critical values of r at 5% level of significance for given $n_1 = 13$, $n_2 = 7$ are 5 and 15, respectively. As a result, $r_l \le r \le r_u$, so we accept the null hypothesis and state that the occurrence of even and odd numbers in an experiment is random.
When the data values are not numerically measurable but can be ranked, a rank correlation coefficient is used to measure the association between the variables. It has been formulated by Charles Edward Spearman
$$\rho = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$
Where,
$\sum d_i^2$ = Sum of the squares of difference between ranks
n = Number of paired observations
Here, the null hypothesis $H_0$: The variables are independent or there is no correlation between the variables.
For sample sizes less than 30, if the critical value is greater than the tabulated value of
For sample sizes greater than 30, the sample distribution is assumed to follow normal
Note:
It must be noted that, if the ranks of two or more values are equal, then the average value of the ranks that would have been assigned to the values if they were different, is assigned to those values. So, the formula for the statistic is adjusted by the term $\frac{1}{12}(m^3 - m)$, where m denotes the number of observations involved in a tie in any of the variables under study.
$$\rho = 1 - \frac{6\left(\sum d_i^2 + \frac{1}{12}\sum (m^3 - m)\right)}{n(n^2 - 1)}$$
Example 05:
Table 9.6a displays the values of two variables X and Y. Test whether the variables are
x | y
101 | 120
111 | 125
102 | 123
105 | 121
112 | 122
109 | 126
Solution 05:
$$\rho = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$
x | y | $R_1$ | $R_2$ | $d_i$ | $d_i^2$
101 | 120 | 1 | 1 | 0 | 0
111 | 125 | 5 | 5 | 0 | 0
102 | 123 | 2 | 4 | -2 | 4
105 | 121 | 3 | 2 | 1 | 1
112 | 122 | 6 | 3 | 3 | 9
109 | 126 | 4 | 6 | -2 | 4
Total | | | | | 18
$$\rho = 1 - \frac{6 \times 18}{6(6^2 - 1)} = 1 - \frac{108}{210} = \frac{102}{210} = 0.486$$
greater than the calculated value, we accept the null hypothesis. Thus, the variables X and Y are independent.
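The ranking and the coefficient in this example can be reproduced programmatically. This is a minimal sketch in Python; the helper `ranks` is a hypothetical name and assumes there are no tied values, which holds for this data set (ties would need the averaged ranks and the correction term described in the note above).

```python
x = [101, 111, 102, 105, 112, 109]
y = [120, 125, 123, 121, 122, 126]

def ranks(values):
    """Rank values from smallest (1) to largest; assumes no ties."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

rank_x = ranks(x)
rank_y = ranks(y)

# Sum of squared rank differences
sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
n = len(x)

# Spearman's rank correlation coefficient
rho = 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

print(f"R1 = {rank_x}, R2 = {rank_y}, "
      f"sum d^2 = {sum_d_squared}, rho = {rho:.3f}")
```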
9.7 Kendall's Test
This test is an important non-parametric test for measuring the significant relationship between two variables. When two variables are tested for the association between them,
The ranking is given independently
The procedure for computing and interpreting Kendall's coefficient of concordance (W) is:
1. All the objects, N, should be ranked by all k judges in the usual fashion and this
= 1, 2, 3 ... k).
$$W = \frac{S}{\frac{1}{12} k^2 (N^3 - N)}$$
Where,
$$S = \sum (R_j - \bar{R}_j)^2$$
N = Number of objects ranked
Source: Kothari, Research Methodology-Methods and Techniques, New Age International Publishers, New Delhi, 2002
Note:
If there are tied ranks in the data, then the above formula is modified as:
$$W = \frac{S}{\frac{1}{12} k^2 (N^3 - N) - k\sum T}$$
A correction factor 'T' is calculated for each of the k sets of ranks and these are added
$$T = \frac{\sum (t^3 - t)}{12}$$
and the summation depends on the number of tied ranks.
Example 06:
The ranks obtained by 5 candidates from 4 interviews that were conducted for the post
Candidate | A | B | C | D
1 | 1 | 1 | 5 | 1
2 | 3 | 5 | 3 | 4
3 | 4 | 4 | 2 | 2
4 | 2 | 2 | 4 | 5
5 | 5 | 3 | 1 | 3
Solution 06:
Candidate | A | B | C | D | Sum of ranks ($R_j$) | $(R_j - \bar{R}_j)^2$
1 | 1 | 1 | 5 | 1 | 8 | 16
2 | 3 | 5 | 3 | 4 | 15 | 9
3 | 4 | 4 | 2 | 2 | 12 | 0
4 | 2 | 2 | 4 | 5 | 13 | 1
5 | 5 | 3 | 1 | 3 | 12 | 0
Total | | | | | 60 | S = 26
$$\text{Therefore, } W = \frac{S}{\frac{1}{12} k^2 (N^3 - N)} = \frac{26}{\frac{1}{12} \times 4^2 \times (5^3 - 5)} = 0.1625$$
The calculated value of S is 26 and the tabulated value for k = 4 and N = 5 (using
So, we accept the null hypothesis and conclude that the judges' ranking is insignificant at 5% level of significance.
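Kendall's coefficient of concordance for this example can be computed directly from the rank table. The following Python sketch uses illustrative variable names and the untied formula, which applies here because no interviewer assigned tied ranks.

```python
# Ranks given to candidates 1..5 by each of the four interviewers
ranks_by_judge = {
    "A": [1, 3, 4, 2, 5],
    "B": [1, 5, 4, 2, 3],
    "C": [5, 3, 2, 4, 1],
    "D": [1, 4, 2, 5, 3],
}

k = len(ranks_by_judge)   # number of judges
n_objects = 5             # number of candidates ranked (N)

# Sum of ranks received by each candidate across all judges
rank_sums = [
    sum(judge[i] for judge in ranks_by_judge.values())
    for i in range(n_objects)
]
mean_rank_sum = sum(rank_sums) / n_objects

# S: sum of squared deviations of the rank sums from their mean
s = sum((r - mean_rank_sum) ** 2 for r in rank_sums)

# Kendall's W without tie correction
w = s / (k ** 2 * (n_objects ** 3 - n_objects) / 12)

print(f"rank sums = {rank_sums}, S = {s}, W = {w}")
```

W lies between 0 (no agreement among the judges) and 1 (complete agreement).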
Coefficient
more sets of ranks but you can also determine the degree of association among k sets of
pairs $\frac{k(k - 1)}{2}$ of ranking in view that W bears a linear relation to the average ($\rho$) taken over all possible pairs. The relationship between the average of $\rho$ and Kendall's W can
$$\text{Average of } \rho = \frac{kW - 1}{k - 1}$$
However, the method of finding W, using average $\rho$ between all possible pairs is quite tedious, particularly when k happens to be a big figure and, as such, this method is
Wilcoxon matched-pairs test is used in case of paired data. If the data for paired samples is given like the values before and after a medical treatment, the supply and demand of a commodity in the market, etc., then the Wilcoxon test, among all non
If X and Y are two paired data of small sample sizes, then the difference between the
Then, the ranks are assigned to the differences by ignoring the + and - sign and also ignoring the differences with value equal to zero. The next step is to calculate the sum of all ranks with a positive sign ($T^+$), with a negative sign ($T^-$), and then obtain Min ($T^+$, $T^-$).
Finally, the critical value is obtained at a specified level of significance. The null hypothesis is rejected if the calculated value is less than or equal to the critical value; otherwise, it is accepted.
Example 07:
Using Wilcoxon matched-pairs test, test whether the two samples are significantly
First Sample | 10 21 15 14 11 16 13 16
Second Sample | 9 25 13 15 17 16 10 11
Table 9.8a: Distribution of Values of Two Samples X and Y
Solution 07:
$X_i$ | $Y_i$ | $d_i$ | $|d_i|$ | Ranks
10 | 9 | 1 | 1 | 1.5
21 | 25 | -4 | 4 | 5
15 | 13 | 2 | 2 | 3
14 | 15 | -1 | 1 | 1.5
11 | 17 | -6 | 6 | 7
16 | 16 | 0 | 0 | -
13 | 10 | 3 | 3 | 4
16 | 11 | 5 | 5 | 6
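which is less than the calculated value of T. Since the null hypothesis is rejected only when the calculated value of T is less than or equal to the critical value, we accept the null hypothesis and conclude that there is no significant difference between the samples at 5% level of significance.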
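The rank sums for the Wilcoxon matched-pairs test can be obtained with a short script. This Python sketch drops zero differences, assigns average ranks to tied absolute differences, and computes $T^+$, $T^-$ and their minimum; the helper name `average_rank` is illustrative, and the critical value still has to be read from a Wilcoxon table.

```python
first = [10, 21, 15, 14, 11, 16, 13, 16]
second = [9, 25, 13, 15, 17, 16, 10, 11]

# Differences, ignoring pairs whose difference is zero
diffs = [a - b for a, b in zip(first, second) if a != b]
sorted_abs = sorted(abs(d) for d in diffs)

def average_rank(value):
    """Average of the 1-based positions that `value` occupies in sorted_abs."""
    positions = [i + 1 for i, v in enumerate(sorted_abs) if v == value]
    return sum(positions) / len(positions)

# Sum of ranks of the positive and of the negative differences
t_plus = sum(average_rank(abs(d)) for d in diffs if d > 0)
t_minus = sum(average_rank(abs(d)) for d in diffs if d < 0)
t_statistic = min(t_plus, t_minus)

print(f"T+ = {t_plus}, T- = {t_minus}, T = {t_statistic}, n = {len(diffs)}")
```

As a check, $T^+ + T^-$ must equal n(n + 1)/2 for the n non-zero differences.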
9.9 Mann-Whitney U Test
Mann-Whitney U test is used to find out whether the two given samples that are drawn from the two populations are identical or not. Here, the null hypothesis states that the two samples are drawn from different populations having the same distribution.
The ranks are assigned to the values of the two samples taken together and if two values are similar, then the average of the ranks that would be assigned to the two values if they were different, is assigned to the two numbers. Then, the ranks of the two samples are summed
$$U_1 = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1$$
$$U_2 = n_1 n_2 + \frac{n_2 (n_2 + 1)}{2} - R_2$$
Then, obtain min($U_1$, $U_2$) and compare the obtained value with the tabulated value of U for ($n_1$, $n_2$) degrees of freedom. If the tabulated value is greater than the calculated
$$U = n_1 n_2 + \frac{n_1 (n_1 + 1)}{2} - R_1$$
Where,
$$U \sim N(\mu_U, \sigma_U^2)$$
With mean $\mu_U = \frac{n_1 n_2}{2}$ and variance $\sigma_U^2 = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12}$
$$Z_U = \frac{U - \mu_U}{\sigma_U}$$
Thus, if the first population > the second population, then for $Z_U < -Z_\alpha$, reject $H_0$. If the first population < the second population, then for calculated $Z_U > Z_\alpha$, reject $H_0$. If the first population and the second population differ from each other, the calculated
9.10 Chapter Summary
The chi-square ($\chi^2$) test is used for the following cases:
Sign test is a non-parametric test in which only the direction, denoted by a '+' or '-' sign, is considered, while the magnitude is not considered. There are two types of sign test:
A run test is used to find out whether the sample is random or not. This test is
are independent or not.
Writing
10.1 Introduction
In the previous chapters, the topics necessary for a research were discussed. In this
chapter, you will get a brief idea of how to pen-down the research, in order to get a
proper research report. Right from data collection to analysis and interpretation of the
Research report writing is the final stage of a research. To write a good, effective, and
detailed research report is very subjective in the sense that it varies according to the
Lately, research is being widely carried out in different fields of physical and social
written. Thus, a research report contains all the evidence and valid references to support
its interpretation. Hence, research report writing is the most important part of any
research.
Meaning
audiences."
Thus, a research report is an essential part of research, without which the research is
incomplete. It not only reflects information about the research finding but also the whole
help in business decision making, as well as, to forecast the measures to be taken which
defined by Ranjit Kumar as, "an overall plan, scheme, structure and strategy
your research project, its main function is to detail the operational plan for
medical, biotechnology, social sciences, etc., research is carried out immensely but the
followed, in order to obtain consistency in the reports. In this chapter, the standard
format for a research report, that is, the main sections to be included in a report are
discussed.
For example, the Ph.D. thesis is a research report, where the whole process is written to
Importance
It also helps in planning schemes and strategies for the future, based on its result.
It can be used for future references in any research with relation to it, for a more
A good report is one which is clear, easily understood, and precise. To distinguish a
readers/audience.
The facts mentioned in the report should be scientifically verified. Also, the data
It should highlight the difficulties faced in data collection and not only the
A report having the above characteristics is attractive in its approach and gets more
audience or readers.
Step 5.
Rewriting and Refining the Rough Draft
The first step in writing a research report is the analysis of the research question,
Logical: Analyze all logical associations and relation between the research
question under consideration.
The next step is to prepare an outline of the research work. By doing this, the
research can be framed in a systematic order and also, one can list out all the
After the outline of the research is prepared, the next step is to prepare a rough
draft. The rough draft will consist of the procedure for data collection and the
In this step, all the limitations present in the rough draft of the research report are
Before the final step, you need to prepare the bibliography, which is the list of
books and pamphlets are to be listed is, name of the author (last name first), title,
place, publisher, date of publication, and volume number. The order in which
magazines and newspapers are listed is, name of the author (last name first), title
of article (in quotation marks), name of periodical, the volume number, date of
Lastly, the final draft of the research is prepared, where detailed information about
drafts of the research, in order to get a polished and proper research report.
10.4 Report Format
For an effective and valid research report, it is necessary to write the report in some
standard format that is universally accepted. A report can be divided into three parts:
preliminary parts, main body, and appended part. These three parts can be further
[Fig. 10.4a: Report format. Recoverable labels: Letter of Authorization, Executive Summary, Methodology, Detailed Calculations, Conclusions]
A report should contain all the parts shown in Fig. 10.4a and in the same order, so as
Title page
Letter of authorization
Table of content
Title Page
The title page should express the title of the research, 'for whom' the report is prepared,
'by whom' it is prepared with the 'name of the institute/university/company', and the
'date of its release'. The title of the research report should correctly and completely
Letter of Authorization
For the validity and approval of the research, the letter of authorization from the
concerned authority is required. The letter approves the work done for the research,
highlighting the details of the data and its sources. Also, along with this letter of
authorization, a letter of transmittal is given, which indicates the release of the report to
its readers.
EMR Research Group
Columbia, IA 50057
The report outlined in the research proposal of March 15, 2009, is complete. I have
personally supervised the project, conducted the statistical analyses, and prepared this
report along with my two senior research associates, Natalia James and David Parker.
The report addresses the key decision statement: In what ways can your restaurants build
customer loyalty so that revenues increase through more frequent patronage? The key
greater share of wallet. As agreed upon in the proposal, the report offers no specific
recommendations for managerial action, but rather, it presents conclusions which should
enable you to make informed decisions. Thus, the conclusions conform to the
able to meet our goals for interviewing groups of customers and non-customers in a
timely fashion. We are grateful for your business and look forward to working with you
as you develop strategic plans of action based on this report. Once you have taken a look
at the report, please contact me and we will schedule a formal presentation and
Sincerely,
Barry J. Babin
President
Thus, the letter of authorization is a declaration given by the person who has verified the
whole study and declared the acceptability of the study, as in Fig. 10.5a.
Table of Content
It is an essential part of any report. It is the list of all the topics covered, with the topic
divisions and subdivisions along with their page references. The table of content is
Table of Contents
Research 15
Research Methods 15
Research Methodology 17
Scientific Methods 17
Research Process 19
MOTIVATION IN RESEARCH 21
PROBLEMS/LIMITATIONS OF RESEARCH 31
SUMMARY 35
REVIEW EXERCISES 35
DECISIONS 38
If there are many figures, graphs, and tables in support of the research, a list of the
names of the figures, graphs, and tables with page references is given after the table of
These sections are included in the preliminary part of a research report. After this, the
main body of the research report begins, explaining the whole procedure and technique
of the research. In addition to these sections, the preliminary part also contains
an executive summary.
The Executive Summary
It is simply the summary of the whole report. It briefly explains all the four parts of a
research:
Objectives: It states all the important information and purpose of the research.
Executive Summary
Uncertainty associated with changes in carbon stock is from two additive variance
components:
show the effects of varying prediction error on total uncertainty, the confidence
The estimates of carbon stock from the 2004 Nelson and Marlborough pilot data
are 64.4 ± 12.6 t/ha (95% confidence interval). This estimate uses analytical
methods to calculate the uncertainty. Carbon stock is estimated from 104 plots for
six pools:
o Fine litter
Estimates of change in carbon stock using C_Change to predict carbon for 2008
and 2013 are 55.0 ± 10.3 t/ha. The estimate of change in carbon is for four
pools:
o Fine litter
two surveys, 2008 and 2013, are correlated with ρ = 0.90. There is some evidence
that the correlation could be as high as 0.97 but may be less than this if there is
approach should be adopted in choosing the final number of sites in the nationwide
Estimates of uncertainty have been derived using analytical methods and it is not
Extra error from area definition will inevitably increase uncertainty associated with
Introduction
Methodology
Appended part
10.6.1 Introduction
The first section of the main body is the introduction. The introduction in a report
introduces the research to its readers. In this section, the objectives of the research are
clearly stated, along with the reasons for which the investigation is taken up. The main
concept involved in the research is introduced and explained properly, so that all the terms
Thus, the introduction helps the readers to comprehend the purpose and concept of the
research. The introduction always follows the executive summary. A sample of an
Introduction
As a signatory to the Kyoto Protocol New Zealand has agreed to report, in a transparent
and verifiable manner, greenhouse gas emissions by sources, and removals by sinks,
associated with direct human-induced, land-use change and forestry activities. These
land-use change and forestry activities are limited to afforestation, reforestation, and
deforestation that have occurred since 1990. In order to provide the necessary data to
allow carbon stocks, and changes in carbon stock, to be estimated in accordance with
the recently-adopted Good Practice Guidance for Land-Use, Land-Use Change and
Forestry (IPCC 2003), a national forest inventory specifically designed for carbon
compliant forests. These are forests which were established after 1 January 1990 on
land, which did not previously contain forests. Part of the preliminary work associated
with the development of this national inventory consisted of a pilot survey, which was
conducted in the Nelson and Marlborough regions. The purpose of the pilot study was to
test the proposed field methodology and collect sufficient data to be able to produce
Any large-scale survey will include some errors (Merritt et al. 2005). Good practice in
forest inventories means that uncertainty associated with the survey and estimation
should be reduced as far as practicable. Good practice also recognizes that while there
(Cullen and Frey 1999). Moreover, the good practice guide (IPCC 2003) for the
In this report, we present estimates of the uncertainty associated with the carbon
estimates from the pilot study to demonstrate procedures for future analysis, when
As the sample pages above show, the executive summary is followed by
the introduction of the report. In the sample, bullet points are used to separate
the different sections of the executive summary. The executive summary can thus be
regarded as the abstract of the report, whereas the introduction treats the research
more elaborately.
After explaining the research objective, concept, and purpose, the review of literature is
provided. It helps the reader to place the research in the context of other similar
studies. Each study mentioned should be identified by its author and
year.
Methodology
Data for research should be collected in a scientific manner, in order to get valid
results. The methods and techniques used for the collection of data are explained in this
section. The methods are selected so as to minimize the biases that can be introduced
in the different phases of data collection.
Research Design: This includes the study type, the source of data collection
(primary or secondary), details of how the data is collected, and the medium used in
the research.
Sample Design: It explains the type of sampling design and the sample size used
in the research for its data collection. An appropriate sampling type is used
The Fieldwork for Data Collection: The whole process of field data collection
includes the information about 'by whom', 'how' and 'where' the collection of the
data will be done.
After the methodology of the research is clearly stated, the next section states what types of
analysis are done to obtain the results of the study. This section is the most important
part of the research. The data analysis employed is explained briefly, and
the reason for its suitability to the research is stated. If an appropriate
that case, the result of the research may deviate from its actual finding.
much necessary to implement proper analysis and find the result. The interpretation of
the result will indicate whether the hypothesis under consideration is correct or wrong.
This section presents the judgment of the researcher on the basis of the results
obtained, along with the suggestions drawn from them. That is, the view of
the researcher towards the study is summarized, so that it can communicate the
Appended Part
documents can also be added in the appended part. The documents/references that are
Detailed Calculations
Bibliography
Detailed Calculations
To calculate the result, the calculations need to be illustrated and because of the brief
provided in the appendix. Also, some terminologies mentioned in the report must be
The statistical or measurement tables, which are used for interpretation of the finding
Bibliography
where all the books, articles, and links used for reference are listed.
BIBLIOGRAPHY
"Bureau of Indian Affairs: Quick Facts". , 2 July 2008. Department of Indian Affairs.
2002 <http://www.doi.gov/bia/quick_facts.html>.
"Education Facts and History". 18 July 2008. National Indian Education Association.
2002 <http://www.niea.org/history/research.php>.
Ethridge, R. Creek Country: The Creek Indians and Their World. Chapel Hill: The
Fenn, E. A., Wood, P. H., Watson, H. L., Clayton, T. H., Nathans, S., Parramore, T. C.,
et al. The Way We Lived in North Carolina. Chapel Hill, NC: The University of
Taylor, R. A. FLORIDA: An Illustrated History. New York: Hippocrene Books, Inc., 2005.
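The field order for bibliography entries described in the report-writing steps can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the class names, the function names, the "; " separator, and the "vol." label are all hypothetical choices, since the text specifies only the order of the fields. The order for periodicals is cut off in the text after the date, so the sketch stops at that field. The article entry in the usage lines is invented purely for demonstration.

```python
from dataclasses import dataclass


@dataclass
class Book:
    author_last: str
    author_first: str
    title: str
    place: str
    publisher: str
    date: str
    volume: str = ""  # left empty when no volume number applies


@dataclass
class Article:
    author_last: str
    author_first: str
    article_title: str
    periodical: str
    volume: str
    date: str


def format_book(book: Book) -> str:
    """Order: author (last name first), title, place, publisher, date, volume."""
    fields = [
        f"{book.author_last}, {book.author_first}",
        book.title,
        book.place,
        book.publisher,
        book.date,
    ]
    if book.volume:
        fields.append(f"vol. {book.volume}")
    # The "; " separator is an assumption; the text fixes only the field order.
    return "; ".join(fields)


def format_article(article: Article) -> str:
    """Order: author (last name first), article title in quotation marks,
    periodical, volume number, date."""
    fields = [
        f"{article.author_last}, {article.author_first}",
        f'"{article.article_title}"',
        article.periodical,
        f"vol. {article.volume}",
        article.date,
    ]
    return "; ".join(fields)


# Book entry taken from the sample bibliography above.
print(format_book(Book("Taylor", "R. A.", "FLORIDA: An Illustrated History",
                       "New York", "Hippocrene Books, Inc.", "2005")))
# Hypothetical article entry, for illustration only.
print(format_article(Article("Doe", "J.", "A Sample Article",
                             "Sample Periodical", "3", "2010")))
```

Keeping each field separate until the final join means the same record can be re-rendered if an institution prescribes different punctuation.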
A research report can be written in different types, depending on its target audience.
The report can be of the following types in terms of the presentation of results and
Technical report
Popular report
Article
Monograph
Oral presentation
In all these types of research reports, the main purpose is to describe the research
completely; they differ only in their writing style, that is, the way the whole
procedure of the research is written. For example, a study meant for the general population is
presented in a simple but concise way, while for an audience comprising people
who are well aware of the technical terminology of the subject under study, the report
Technical Report
assumptions required for the research/study, and also the limitations mentioned in
formulation and the hypothesis. It also includes details about the population
3. Methodology: It includes the methods and techniques used for the collection of
data. This also includes the details of the sample size, type of sampling, and
5. Analysis of data and interpretation of results: The data collected are analyzed by
6. Conclusion: A detailed summary of the result and suggestions drawn from the
8. Technical appendices: It includes all the documents related to the research. Like
Popular Report
1. Summary: This section throws light on the general findings of the research and
3. Objective of the study: Specific objectives for the research are given in this
section.
4. Methodology: The techniques and methods used for the research are included in
this section. The details are given in such a way that they are easily understood and
do not contain technical terms that are not in common use.
Article
this is for a mass audience. It is short, attractive, and less formal in its writing style.
3. Main body: It comprises two to five paragraphs describing the details of how the
research is done.
study.
Monograph
In comparison to all the reports mentioned above, this is the most detailed write-up,
which is technically written for a specific subject. The main objective of such a report is
to provide insight into the topic under study and to be more informative. The writer of such a
report must make sure that the topic considered for the study has not been covered earlier
in any other study, that is, it should be a unique one. However, it can be an
The target readership for a monograph is very limited because it is subject-specific and not
general.
Oral Presentation
its clients/readers. In such reports, the researcher can highlight the importance,
objectives, and results of the study in a precise way by using more graphs, tables, and
flowcharts. Since it is a face-to-face presentation, the audience gets the opportunity to clear their
doubts by asking the researcher relevant questions. It has an advantage that there is