RM4151 - RESEARCH METHODOLOGY UNIT - 2

UNIT II DATA COLLECTION AND SOURCES


Measurements, Measurement Scales, Questionnaires and Instruments, Sampling and
methods. Data - Preparing, Exploring, examining and displaying.

Measurements

Measurement is the process of describing some property of a phenomenon under
study and assigning a numerical value to it. Measurement is considered the foundation of
scientific inquiry. In our daily life, many things are measured continuously in different ways
for different purposes.
We can measure not only physical objects but also abstract concepts; that is, we can
measure both quantitatively and qualitatively. We can measure height, weight, length, width,
income, etc. (quantitative measurement), and we can also measure attitude,
personality, perception, intelligence, preference, etc. (qualitative measurement). A
measurement can give us different kinds of information about a theoretical concept under
study.
A more contemporary definition describes measurement as “the estimation or the discovery
of the ratio of some magnitude of a quantitative attribute to a unit of the same attribute”
(Michell, 1997).
According to Warren S. Torgerson, measurement is “the assignment of numbers to objects to represent
amounts or degrees of a property possessed by all of the objects.” To understand the nature of
the data, we must know at which level the data is measured. Measurement can occur at
different levels, and the relationship among the values assigned determines the level of
measurement. There are four hierarchical levels of measurement identified by Stevens
(1946): nominal, ordinal, interval, and ratio.

Measurement Scales
Nominal Scale
This is a method of classifying objects or events into discrete categories. It is regarded
as the most basic form of measurement. Here we assign a number to an object only to
identify it, so the data is categorical or qualitative. The numbers
are used only for labeling the object and have no quantitative value at all. This scale is used to
categorize the data into different groups. In a survey, all the respondents are divided into
different categories, which should be mutually exclusive and collectively exhaustive. Here
the categories have no predefined order.
Examples of nominal scale data collection using a questionnaire:
1. Specify your gender
   1. Male
   2. Female
2. Are you married?
   1. Yes
   2. No
3. You are from
   1. Urban
   2. Rural
4. Specify your working department
   1. Marketing
   2. HR
   3. Finance
   4. Sales
   5. Production
   6. Operations
5. Specify your food habit
   1. Vegetarian
   2. Non-vegetarian
Here we can assign a number to each option, such as 1 to Male and 2 to Female, 1 to Yes and 2
to No, 1 to Urban and 2 to Rural, 1 to Marketing, 2 to HR, 3 to Finance, etc.
These numbers have no quantitative value; they only represent the category. So we
cannot apply any arithmetic operations on this type of scale. We can only count the number of
items in each category.
We can therefore prepare a frequency distribution table to represent nominal data.
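
As a minimal sketch of such a frequency distribution table, the Python snippet below counts coded answers to the "working department" question. The response values and the helper name frequency_table are made up for illustration; only the category codes come from the example above.

```python
from collections import Counter

# Category codes from the questionnaire example above.
DEPARTMENT_LABELS = {
    1: "Marketing", 2: "HR", 3: "Finance",
    4: "Sales", 5: "Production", 6: "Operations",
}

# Hypothetical coded responses from ten respondents (illustrative data only).
responses = [1, 3, 3, 2, 6, 1, 3, 5, 4, 1]


def frequency_table(codes, labels):
    """Return (label, count, percentage) rows, one per category."""
    counts = Counter(codes)
    total = len(codes)
    return [
        (labels[code], counts.get(code, 0), round(100 * counts.get(code, 0) / total, 1))
        for code in sorted(labels)
    ]


table = frequency_table(responses, DEPARTMENT_LABELS)
for label, count, percent in table:
    print(f"{label:<12}{count:>3}{percent:>7.1f}%")
```

Note that the code only counts the labels. It never adds or averages the codes, because the numbers on a nominal scale carry no quantity.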
Ordinal Scale
The ordinal scale is the next level of data measurement scale. Here we measure according to
the rank order of the data without considering the degree of difference between the
data. Here “ordinal” indicates “order”. In ordinal measurement, we assign a
numerical value to the variables based on their relative ranking or positioning in comparison
with other data in that group. An ordinal scale indicates the logical hierarchy among the
variables under observation.
Here the data has an order. In a nominal scale, there is no predefined order for arranging the
data. In an ordinal scale, the data is arranged according to some predefined order, but the
magnitude of difference is not considered. The ranking scale tells us the relative position of the objects
under study.
Suppose in a 100-meter race John finished first, Tom finished second, Mathew
finished third and Xavier finished fourth. Here we describe the data on a ranking scale. We arrange
the data according to the relative position within the data set. We do not consider the magnitude
of difference between John and Tom, Tom and Mathew, or Mathew and Xavier. They may not
finish at equal intervals; for instance, Tom finished 5 seconds after John, Mathew finished 9
seconds after Tom, and Xavier finished 18 seconds after Mathew. We do not consider
this magnitude of difference, but only the order of the finishing positions.
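
The race example can be sketched in a few lines of Python. John's finishing time of 12 seconds is an assumed value chosen only so the gaps from the text (5, 9 and 18 seconds) can be reproduced. The ranks keep the order and discard the gaps.

```python
# Finishing times in seconds. John's time is an assumption for illustration;
# the gaps between runners (5 s, 9 s, 18 s) follow the example in the text.
finish_times = {"John": 12, "Tom": 17, "Mathew": 26, "Xavier": 44}

# Ordinal view: only the order of finishing matters.
ordered = sorted(finish_times, key=finish_times.get)
ranks = {name: position for position, name in enumerate(ordered, start=1)}

# The unequal gaps exist in the raw data but are not visible in the ranks.
gaps = [finish_times[b] - finish_times[a] for a, b in zip(ordered, ordered[1:])]

print(ranks)
print(gaps)
```

The difference between rank 1 and rank 2 is "one place", and so is the difference between rank 3 and rank 4, even though the underlying gaps are 5 and 18 seconds.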
Examples of ordinal scale (rank scale) data collection using a questionnaire.
Example: Rank your feature preferences when you buy a mobile phone. The most preferred
feature should be ranked one, the second most preferred feature should be ranked two, and so on.

Rank the following mobile brands in order of your preference. The most preferred
brand should be ranked one, the second most preferred should be ranked two, and so on.

Interval Scale
It is the next higher level of measurement. It overcomes the limitation of ordinal scale
measurement. In the ordinal scale, the magnitude of the difference is unimportant, but on
an interval scale, the magnitude of the difference is important. In the interval scale, the
difference between two values has a meaningful interpretation, and the distances between
adjacent values are equal. The distance between any two adjacent
attributes is called an interval, and intervals are always equal.
Example of interval scale data collection using a questionnaire:
How likely are you to recommend our product to your friends or relatives?

The Likert scale, developed by Rensis Likert, is a tool commonly used to collect data that is treated as interval data.
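
A minimal sketch of how responses to such a question might be tabulated is given below. The five answer labels and the response data are assumptions for illustration, since the original answer options are not reproduced in these notes.

```python
from collections import Counter
from statistics import mean

# Assumed 5-point answer options (the original options are not shown in the notes).
SCALE = {
    1: "Very unlikely", 2: "Unlikely", 3: "Neutral",
    4: "Likely", 5: "Very likely",
}

# Hypothetical answers from ten respondents.
responses = [5, 4, 4, 3, 5, 2, 4, 5, 1, 4]

counts = Counter(responses)
distribution = {SCALE[score]: counts.get(score, 0) for score in SCALE}

# Averaging is meaningful only if the steps between options are treated as equal,
# which is the defining assumption of an interval scale.
average_score = mean(responses)

print(distribution)
print(average_score)
```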
Ratio Scale
Ratio scale is purely quantitative. Among the four levels of measurement, the ratio scale
is the most precise. Unlike the other three scales, the ratio scale has a true zero point:
a score of zero is not arbitrary but means the absence of the property being measured.
This is the unique quality of ratio scale data. It has all the characteristics of the nominal,
ordinal, and interval scales. Examples of ratio scale variables are age, weight, height, income, distance,
etc.
Examples of ratio scale data collection using a questionnaire:
Specify your monthly income:
How many students are there in your institution?:
Number of departments in your organisation:

Questionnaires and Instruments

What is a Questionnaire?

A questionnaire is a research instrument that consists of a set of questions or other types of
prompts that aim to collect information from a respondent. A research questionnaire is
typically a mix of close-ended questions and open-ended questions.

Open-ended, long-form questions offer the respondent the ability to elaborate on their
thoughts. Research questionnaires were developed in 1838 by the Statistical Society of
London.

The data collected from a data collection questionnaire can be both qualitative as well
as quantitative in nature. A questionnaire may or may not be delivered in the form of
a survey, but a survey always consists of a questionnaire.

Questionnaire Examples

The best way to understand how questionnaires work is to see the types of questionnaires
available. Some examples of a questionnaire are:

1. Customer Satisfaction Questionnaire: This type of research can be used in any situation
where there’s an interaction between a customer and an organization. For example, you
might send a customer satisfaction survey after someone eats at your restaurant. You can
use the study to determine if your staff is offering excellent customer service and a positive
overall experience.
2. Product Use Satisfaction: You can use this type of questionnaire to better understand usage
trends for your product and for similar products. It also lets you collect customer preferences
about the types of products they enjoy or want to see on the market.
3. Company Communications Evaluation: Unlike the other examples, a company
communications evaluation looks at internal and external communications. It can be used
to check if the policies of the organization are being enforced across the board, both with
employees and clients.

The above survey questions are typically easy to use, understand, and execute. Additionally,
the standardized answers of a survey questionnaire instead of a person-to-person conversation
make it easier to compile useable data.

The most significant limitation of a data collection questionnaire is that respondents need to
read all of the questions and respond to them. For example, you send an invitation through
email asking respondents to complete the questions on social media. If a target respondent
doesn’t have the right social media profiles, they can’t answer your questions.

Characteristics of a good questionnaire

Your survey design depends on the type of information you need to collect from respondents.
Qualitative questionnaires are used when there is a need to collect exploratory information to
help prove or disprove a hypothesis. Quantitative questionnaires are used to validate or test a
previously generated hypothesis. However, most questionnaires follow some essential
characteristics:

• Uniformity: Questionnaires are very useful to collect demographic information, personal
opinions, facts, or attitudes from respondents. One of the most significant attributes of a
research form is uniform design and standardization. Every respondent sees the same
questions. This helps in data collection and statistical analysis of this data. For example,
a retail store evaluation questionnaire contains questions for evaluating retail
store experiences. Questions relate to purchase value, range of options for product
selections, and quality of merchandise. These questions are uniform for all customers.
• Exploratory: It should be exploratory to collect qualitative data. There is no restriction on
questions that can be in your questionnaire. For example, you use a data collection
questionnaire and send it to the female of the household to understand her spending and
saving habits relative to the household income. Open-ended questions give you more
insight and allow the respondents to explain their practices. A very structured question list
could limit the data collection.
• Question Sequence: A questionnaire typically follows a structured flow of questions to increase the
number of responses. The usual sequence is screening questions, warm-up
questions, transition questions, skip questions, challenging questions, and classification
questions. For example, a motivation and buying experience questionnaire
covers initial demographic questions and then asks for time spent in sections of
the store and the rationale behind purchases.

Types of questions in a questionnaire

You can use multiple question types in a questionnaire. Using various question types can help
increase responses to your research questionnaire as they tend to keep participants more
engaged. The best customer satisfaction survey templates are the most commonly used for
better insights and decision-making.

Some of the widely used types of questions are:

• Open-Ended Questions: Open-ended questions help collect qualitative data in a
questionnaire where the respondent can answer in a free form with little to no restrictions.
• Dichotomous Questions: The dichotomous question is generally a “yes/no” close-ended
question. It is usually used when a basic validation is needed. It is the
most natural form of question in a questionnaire.
• Multiple-Choice Questions: Multiple-choice questions are a close-ended question type in
which a respondent has to select one (single-select multiple-choice question) or many
(multi-select multiple-choice question) responses from a given list of options. The
multiple-choice question consists of an incomplete stem (question), right answer or
answers, incorrect answers, close alternatives, and distractors. Of course, not all
multiple-choice questions have all of the answer types. For example, you probably won’t have
wrong or right answers if you’re looking for customer opinion.
• Scaling Questions: These questions are based on the principles of the four measurement
scales – nominal, ordinal, interval, and ratio. A few of the question types that utilize these
scales’ fundamental properties are rank order questions, Likert scale questions, semantic
differential scale questions, and Stapel scale questions.
• Pictorial Questions: This question type is easy to use and encourages respondents to
answer. It works similarly to a multiple-choice question. Respondents are asked a question,
and the answer choices are images. This helps respondents choose an answer quickly
without over-thinking their answers, giving you more accurate data.

Types of Questionnaires based on Distribution

Questionnaires can be administered or distributed in the following forms:

• Online Questionnaire: In this type, respondents are sent the questionnaire via email or
other online mediums. This method is generally cost-effective and time-efficient.
Respondents can also answer at leisure. Without the pressure to respond immediately,
responses may be more accurate. The disadvantage, however, is that respondents can easily
ignore these questionnaires.
• Telephone Questionnaire: A researcher makes a phone call to a respondent to collect
responses directly. Responses are quick once you have a respondent on the phone.
However, a lot of times, the respondents hesitate to give out much information over the
phone. It is also an expensive way of conducting research. You’re usually not able to
collect as many responses as other types of questionnaires, so your sample may not
represent the broader population.
• In-House Questionnaire: This type is used by a researcher who visits the respondent’s
home or workplace. The advantage of this method is that the respondent is in a comfortable
and natural environment, and in-depth data can be collected. The disadvantage, though, is
that it is expensive and slow to conduct.
• Mail Questionnaire: These are becoming obsolete but are still used in
some market research studies. This method involves a researcher sending a physical data
collection questionnaire to a respondent, who fills it in and sends it back. The
advantage of this method is that respondents can complete it in their own time and answer
truthfully and completely. The disadvantage is that this method is expensive and
time-consuming. There is also a high risk of not collecting enough responses to draw actionable
insights from the data.

A good questionnaire design

Questionnaire design is a multistep process that requires attention to detail at every step.

Researchers always hope that the responses received for a survey questionnaire yield
useable data. If the questionnaire is too complicated, there is a fair chance that the respondent
might get confused and drop out or answer inaccurately.

As a survey creator, you may want to pre-test the survey by administering it to a focus group
during development. You can try out a few different questionnaire designs to determine
which resonates best with your target audience. Pre-testing is a good practice because it lets the survey
creator find out at an early stage whether any changes are required in the survey.

Steps Involved in Questionnaire Design

1. Identify the scope of your research:

Think about what your questionnaire is going to include before you start designing the look
of it. The clarity of the topic is of utmost importance as this is the primary step in creating the
questionnaire. Once you are clear on the purpose of the questionnaire, you can begin the
design process.

2. Keep it simple:

The words or phrases you use while writing the questionnaire must be easy to understand. If
the questions are unclear, the respondents may simply choose any answer and skew the data
you collect.

3. Ask only one question at a time:

At times, a researcher may be tempted to add two similar questions. This might seem like an
excellent way to consolidate answers to related issues, but it can confuse your respondents or
lead to inaccurate data. If any of your questions contain the word “and,” take another look.
This question likely has two parts, which can affect the quality of your data.

4. Be flexible with your options:

While designing, the survey creator needs to be flexible in terms of “option choice” for the
respondents. Sometimes the respondents may not necessarily want to choose from the answer
options provided by the survey creator. An “other” option often helps keep respondents
engaged in the survey.

5. The open-ended or closed-ended question is a tough choice:

The survey creator might end up in a situation where they need to make distinct choices
between open or close-ended questions. The question type should be carefully chosen as it
defines the tone and importance of asking the question in the first place.

If the questionnaire requires the respondents to elaborate on their thoughts, an open-ended
question is the best choice. If the surveyor wants a specific response, then close-ended
questions should be their primary choice. The key to asking closed-ended questions is to
generate data that is easy to analyze and spot trends.

6. It is essential to know your audience:

A researcher should know their target audience. For example, if the target audience speaks
mostly Spanish, sending the questionnaire in any other language would lower the response
rate and accuracy of data. Something that may seem clear to you may be confusing to your
respondents. Use simple language and terminology that your respondents will understand,
and avoid technical jargon and industry-specific language that might confuse your
respondents.

For efficient market research, researchers need a representative sample collected using one of
the many sampling techniques, such as a sample questionnaire. It is imperative to plan and
define these target respondents based on the demographics required.

7. Choosing the right tool is essential:

QuestionPro is a simple yet advanced survey software platform that the surveyors can
use to create a questionnaire or choose from the already existing 300+ questionnaire
templates.

Always save personal questions for last. Sensitive questions may cause respondents to drop
off before completing. If these questions are at the end, the respondent has had time to
become more comfortable with the interview and is more likely to answer personal or
demographic questions.

Differences between a Questionnaire and a Survey

Each row below compares a questionnaire with a survey.

Meaning
    Questionnaire: A questionnaire is a research instrument that consists of a set of questions to collect information from a respondent.
    Survey: A survey is a research method used for collecting data from a pre-defined group of respondents to gain information and insights on various topics of interest.
What is it?
    Questionnaire: The instrument of data collection.
    Survey: Process of collecting and analyzing that data.
Characteristic
    Questionnaire: Subset of survey.
    Survey: Consists of questionnaire and survey design, logic and data collection.
Time and Cost
    Questionnaire: Fast and cost-effective.
    Survey: Much slower and expensive.
Use
    Questionnaire: Conducted on the target audience.
    Survey: Distributed or conducted on respondents.
Questions
    Questionnaire: Close-ended and very rarely open-ended.
    Survey: Close-ended and open-ended.
Answers
    Questionnaire: Objective.
    Survey: Subjective or objective.

Sampling and methods


Sampling Methods with Examples

What is sampling?

Sampling is a technique of selecting individual members or a subset of the population
to make statistical inferences from them and estimate characteristics of the whole population.
Different sampling methods are widely used by researchers in market research so that they do
not need to research the entire population to collect actionable insights.

It is also a time-saving and cost-effective method and hence forms the basis of
any research design. Sampling techniques can also be applied within research survey software.

For example, if a drug manufacturer would like to research the adverse side effects of a drug
on the country’s population, it is almost impossible to conduct a research study that involves
everyone. In this case, the researcher selects a sample of people from each demographic and
then studies them, which gives indicative feedback on the drug’s behavior.

Types of sampling: sampling methods


Sampling in market research is of two types – probability sampling and non-probability
sampling. Let’s take a closer look at these two methods of sampling.

1. Probability sampling: Probability sampling is a sampling technique where a
researcher sets a few selection criteria and chooses members of a population
randomly. All the members have an equal opportunity to be a part of the sample
with this selection parameter.
2. Non-probability sampling: In non-probability sampling, the researcher chooses
members for research based on judgment or convenience rather than at random. This
sampling method does not follow a fixed or predefined selection process, so not all
elements of a population have an equal opportunity to be included in a sample.
The following sections discuss the various probability and non-probability sampling methods that
you can implement in any market research study.

Types of probability sampling with examples:


Probability sampling is a sampling technique in which researchers choose samples from a
larger population using a method based on the theory of probability. This sampling method
considers every member of the population and forms samples based on a fixed process.

For example, in a population of 1000 members, every member will have a 1/1000 chance of
being selected on any single draw. Probability sampling reduces selection bias
and gives all members a fair chance to be included in the sample.

There are four types of probability sampling techniques:

• Simple random sampling: One of the best probability sampling techniques that helps
in saving time and resources, is the Simple Random Sampling method. It is a
reliable method of obtaining information where every single member of a
population is chosen randomly, merely by chance. Each individual has the same
probability of being chosen to be a part of a sample.
For example, in an organization of 500 employees, if the HR team decides on
conducting team building activities, it is highly likely that they would prefer
picking chits out of a bowl. In this case, each of the 500 employees has an equal
opportunity of being selected.
• Cluster sampling: Cluster sampling is a method where the researchers divide the
entire population into sections or clusters that represent a population. Clusters are
identified and included in a sample based on demographic parameters like age, sex,
location, etc. This makes it very simple for a survey creator to derive effective
inference from the feedback.
For example, if the United States government wishes to evaluate the number of
immigrants living in the United States, they can divide the country into clusters based on
states such as California, Texas, Florida, Massachusetts, Colorado, Hawaii, etc.
This way of conducting a survey will be more effective as the results will be
organized into states and provide insightful immigration data.
• Systematic sampling: Researchers use the systematic sampling method to choose the
sample members of a population at regular intervals. It requires the selection of a
starting point for the sample and sample size that can be repeated at regular
intervals. This type of sampling method has a predefined range, and hence this
sampling technique is the least time-consuming.
For example, a researcher intends to collect a systematic sample of 500 people in a
population of 5000. He/she numbers each element of the population from 1-5000
and will choose every 10th individual to be a part of the sample (Total population/
Sample Size = 5000/500 = 10).
• Stratified random sampling: Stratified random sampling is a method in which the
researcher divides the population into smaller groups (strata) that don’t overlap but
together represent the entire population. The researcher then draws a sample
from each group separately.
For example, a researcher looking to analyze the characteristics of people
belonging to different annual income divisions will create strata (groups) according
to the annual family income, e.g. less than $20,000, $21,000 to $30,000, $31,000 to
$40,000, $41,000 to $50,000, etc. By doing this, the researcher can draw conclusions about the
characteristics of people belonging to different income groups. Marketers can
analyze which income groups to target and which ones to eliminate to create a
roadmap that would bear fruitful results.
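
The selection mechanics of three of the techniques above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not a prescribed procedure: the function names, the sample size of 20 employees, the random starting point in the systematic sample, the income brackets without gaps, and the generated income data are all assumptions made for the example.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n members so that every member has the same chance of selection."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def systematic_sample(population, n, seed=None):
    """Pick every k-th member after a random start, where k = population size / n."""
    k = len(population) // n
    start = random.Random(seed).randrange(k)  # random start is an assumption
    return population[start::k][:n]


def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Group members into non-overlapping strata, then sample each stratum separately."""
    rng = random.Random(seed)
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    return {
        name: rng.sample(members, min(n_per_stratum, len(members)))
        for name, members in strata.items()
    }


def income_bracket(income):
    """Assumed brackets (without gaps) modelled on the income example above."""
    if income < 20_000:
        return "under 20,000"
    if income < 30_000:
        return "20,000 to 29,999"
    if income < 40_000:
        return "30,000 to 39,999"
    if income < 50_000:
        return "40,000 to 49,999"
    return "50,000 and above"


# 500 employees, as in the simple random sampling example; 20 picks is assumed.
employees = list(range(1, 501))
picked = simple_random_sample(employees, 20, seed=1)

# 5000 people numbered 1-5000, sample of 500, so every 10th person is chosen.
people = list(range(1, 5001))
systematic = systematic_sample(people, 500, seed=1)

# Hypothetical annual family incomes, five households drawn per stratum.
incomes = [15_000 + 500 * i for i in range(80)]
by_bracket = stratified_sample(incomes, income_bracket, 5, seed=1)

print(len(picked), len(systematic), {k: len(v) for k, v in by_bracket.items()})
```

Passing a seed makes the draws repeatable, which is convenient when a sample has to be documented or audited.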
Uses of probability sampling
There are multiple uses of probability sampling:

• Reduce Sample Bias: Using the probability sampling method, the bias in the sample
derived from a population is negligible to non-existent. The selection of the sample
mainly depicts the understanding and the inference of the researcher. Probability
sampling leads to higher quality data collection as the sample appropriately
represents the population.
• Diverse Population: When the population is vast and diverse, it is essential to have
adequate representation so that the data is not skewed towards one demographic.
For example, if Square would like to understand the people that could make use of their
point-of-sale devices, a survey conducted from a sample of people across the US
from different industries and socio-economic backgrounds helps.
• Create an Accurate Sample: Probability sampling helps the researchers plan and
create an accurate sample. This helps to obtain well-defined data.
Types of non-probability sampling with examples
The non-probability method is a sampling method that involves a collection of feedback
based on a researcher or statistician’s sample selection capabilities and not on a fixed
selection process. In most situations, the output of a survey conducted with a non-probability
sample leads to skewed results, which may not represent the desired target population. However,
there are situations, such as the preliminary stages of research or cost constraints on
conducting research, where non-probability sampling will be much more useful than the other
type.

Four types of non-probability sampling explain the purpose of this sampling method in a
better manner:

• Convenience sampling: This method is dependent on the ease of access to subjects
such as surveying customers at a mall or passers-by on a busy street. It is usually
termed as convenience sampling, because of the researcher’s ease of carrying it out
and getting in touch with the subjects. Researchers have nearly no authority to
select the sample elements, and it’s purely done based on proximity and not
representativeness. This non-probability sampling method is used when there are
time and cost limitations in collecting feedback. In situations where there are
resource limitations such as the initial stages of research, convenience sampling is
used.
For example, startups and NGOs usually conduct convenience sampling at a mall to
distribute leaflets of upcoming events or promotion of a cause – they do that by
standing at the mall entrance and giving out pamphlets randomly.
• Judgmental or purposive sampling: Judgmental or purposive samples are formed
at the discretion of the researcher. Researchers consider only the purpose of the
study, along with their understanding of the target audience. For instance, suppose
researchers want to understand the thought process of people interested in studying
for their master’s degree. The selection criterion will be: “Are you interested in
doing your masters in …?” and those who respond with a “No” are excluded from
the sample.
• Snowball sampling: Snowball sampling is a sampling method that researchers apply
when the subjects are difficult to trace. For example, it will be extremely
challenging to survey shelterless people or illegal immigrants. In such cases, using
the snowball approach, researchers can trace a few members of the group, interview them,
and reach further members through them. Researchers also implement this sampling method in situations where the
topic is highly sensitive and not openly discussed, for example, surveys to gather
information about HIV/AIDS. Not many victims will readily respond to the
questions. Still, researchers can contact people they might know or volunteers
associated with the cause to get in touch with the victims and collect information.
• Quota sampling: In quota sampling, members are selected based on a pre-set
standard. Because the sample is formed
based on specific attributes, the created sample will have the same qualities found
in the total population. It is a rapid method of collecting samples.
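
Quota sampling, the last method in the list above, can be sketched as filling pre-set quotas from whoever is reached first. The quota sizes, the Urban/Rural attribute and the respondent data below are hypothetical; the point is that selection stops once each group's quota is full and no random draw is involved.

```python
def quota_sample(respondents, attribute_of, quotas):
    """Accept respondents in the order they are reached until every quota is filled."""
    remaining = dict(quotas)
    sample = []
    for person in respondents:
        group = attribute_of(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break  # all quotas filled, stop collecting
    return sample


# Hypothetical stream of respondents in the order the researcher meets them.
respondents = [
    {"id": 1, "area": "Urban"},
    {"id": 2, "area": "Urban"},
    {"id": 3, "area": "Rural"},
    {"id": 4, "area": "Urban"},
    {"id": 5, "area": "Urban"},
    {"id": 6, "area": "Rural"},
    {"id": 7, "area": "Rural"},
]

# Assumed pre-set standard: three urban and two rural respondents.
quotas = {"Urban": 3, "Rural": 2}

sample = quota_sample(respondents, lambda person: person["area"], quotas)
selected_ids = [person["id"] for person in sample]
print(selected_ids)
```

Respondent 5 is skipped because the urban quota is already full, and respondent 7 is never reached because collection stops once both quotas are met.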
Uses of non-probability sampling
Non-probability sampling is used for the following:

• Create a hypothesis: Researchers use the non-probability sampling method to create
an assumption when limited to no prior information is available. This method helps
with the immediate return of data and builds a base for further research.
• Exploratory research: Researchers use this sampling technique widely when
conducting qualitative research, pilot studies, or exploratory research.
• Budget and time constraints: The non-probability method is used when there are budget
and time constraints and some preliminary data must be collected. Since the survey
design is not rigid, it is easier to pick readily available respondents and have them take the
survey or questionnaire.
Difference between probability sampling and non-probability sampling methods
We have looked at the different types of sampling methods above and their subtypes. To
encapsulate the whole discussion, though, the significant differences between probability
sampling methods and non-probability sampling methods are as below:

Probability Sampling Methods versus Non-Probability Sampling Methods

Definition
• Probability sampling: a sampling technique in which samples from a larger population are
chosen using a method based on the theory of probability.
• Non-probability sampling: a sampling technique in which the researcher selects samples based
on the researcher’s subjective judgment rather than random selection.

Alternatively known as
• Probability sampling: random sampling method.
• Non-probability sampling: non-random sampling method.

Population selection
• Probability sampling: the population is selected randomly.
• Non-probability sampling: the population is selected arbitrarily.

Nature
• Probability sampling: the research is conclusive.
• Non-probability sampling: the research is exploratory.

Sample
• Probability sampling: since there is a method for deciding the sample, the population
demographics are conclusively represented.
• Non-probability sampling: since the sampling method is arbitrary, the representation of the
population demographics is almost always skewed.

Time taken
• Probability sampling: takes longer to conduct since the research design defines the
selection parameters before the market research study begins.
• Non-probability sampling: this type of sampling method is quick since neither the sample nor
the selection criteria of the sample are defined in advance.

Results
• Probability sampling: this type of sampling is entirely unbiased and hence the results are
unbiased too and conclusive.
• Non-probability sampling: this type of sampling is entirely biased and hence the results are
biased too, rendering the research speculative.

Hypothesis
• Probability sampling: there is an underlying hypothesis before the study begins and the
objective of this method is to prove the hypothesis.
• Non-probability sampling: the hypothesis is derived after conducting the research study.
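
The contrast in population selection can be illustrated with a minimal sketch. It assumes a hypothetical sampling frame of 100 respondent IDs; the function names and the fixed seed are illustrative only and are not part of the course material. Convenience sampling stands in here for non-probability sampling in general.

```python
import random

# Hypothetical sampling frame: respondent IDs 1..100 in the order they were reached.
population = list(range(1, 101))

def simple_random_sample(frame, n, seed=0):
    """Probability sampling: every unit has the same known chance of selection."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    return rng.sample(frame, n)

def convenience_sample(frame, n):
    """Non-probability sampling: take whoever is easiest to reach (here, the first n)."""
    return frame[:n]

probability_sample = simple_random_sample(population, 10)
non_probability_sample = convenience_sample(population, 10)

print("random:", sorted(probability_sample))
print("convenience:", non_probability_sample)
```

The convenience sample always consists of the first ten IDs, which is why its representation of the population tends to be skewed, while the random sample can come from anywhere in the frame.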

The data preparation process

The data preparation process is shown in Figure 17.1. The entire process is guided by
the preliminary plan of data analysis that was formulated in the research design phase. The first
step is to check for acceptable questionnaires. This is followed by editing, coding and
transcribing the data. The data are cleaned and a treatment for missing responses is prescribed.
Often, after the stage of sample validation, statistical adjustment of the data may be necessary
to make them representative of the population of interest. The researcher should then select an
appropriate data analysis strategy. The final data analysis strategy differs from the preliminary
plan of data analysis due to the information and insights gained since the preliminary plan was
formulated. Data preparation should begin as soon as the first batch of questionnaires is
received from the field, while the fieldwork is still going on. Thus, if any problems are detected,
the fieldwork can be modified to incorporate corrective action.

Checking the questionnaire


The initial step in questionnaire checking involves reviewing all questionnaires for
completeness and interviewing or completion quality. Often these checks are made while
fieldwork is still under way. If the fieldwork was contracted to a data collection agency, the
researcher should make an independent check after it is over. A questionnaire returned from
the field may be unacceptable for several reasons:
1. Parts of the questionnaire may be incomplete.
2. The pattern of responses may indicate that the respondent did not understand or follow the
instructions. For example, filter questions may not have been followed.
3. The responses show little variance. For example, a respondent has ticked only 4s on a
series of seven-point rating scales.
4. The returned questionnaire is physically incomplete: one or more pages are missing.
5. The questionnaire is received after the pre-established cut-off date.
6. The questionnaire is answered by someone who does not qualify for participation.
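
The six reasons above can be turned into a simple screening routine. The following is a minimal sketch, assuming each returned questionnaire is stored as a dictionary; the field names, the cut-off date and the page count are hypothetical.

```python
from datetime import date

CUTOFF = date(2024, 3, 31)   # hypothetical pre-established cut-off date
EXPECTED_PAGES = 4           # hypothetical length of the questionnaire

def rejection_reasons(q):
    """Return the reasons (if any) why a returned questionnaire is unacceptable."""
    reasons = []
    ratings = q["ratings"]                      # None marks an unanswered item
    if any(r is None for r in ratings):
        reasons.append("incomplete")
    if q["filter_not_followed"]:
        reasons.append("instructions not followed")
    answered = [r for r in ratings if r is not None]
    if len(answered) > 1 and len(set(answered)) == 1:
        reasons.append("little variance")       # e.g. only 4s ticked
    if q["pages_returned"] < EXPECTED_PAGES:
        reasons.append("pages missing")
    if q["received"] > CUTOFF:
        reasons.append("received after cut-off")
    if not q["qualified"]:
        reasons.append("respondent not qualified")
    return reasons

sample = {
    "ratings": [4, 4, 4, 4, 4],
    "filter_not_followed": False,
    "pages_returned": 4,
    "received": date(2024, 4, 2),
    "qualified": True,
}
print(rejection_reasons(sample))  # ['little variance', 'received after cut-off']
```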

Editing
Editing is the review of the questionnaires with the objective of increasing accuracy
and precision. It consists of screening questionnaires to identify illegible, incomplete,
inconsistent or ambiguous responses. Responses may be illegible if they have been poorly
recorded. This is particularly common in questionnaires with a large number of unstructured
questions. The data must be legible if they are to be properly coded. Likewise, questionnaires
may be incomplete to varying degrees. A few or many questions may be unanswered.
At this stage, the researcher makes a preliminary check for consistency. Certain
obvious inconsistencies can be easily detected. For example, a respondent may have
answered a whole series of questions relating to their perceptions of a particular bank, yet in
other questions may have indicated that they have not used that particular bank or even heard
of it.
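
The bank example can be expressed as a preliminary consistency check. This is only a sketch under assumed field names (`heard_of_bank`, `bank_ratings`); real questionnaires would need one such rule per pair of related questions.

```python
def inconsistent_respondents(records):
    """Return IDs of respondents who rated a bank they say they have never heard of."""
    flagged = []
    for r in records:
        gave_ratings = any(v is not None for v in r["bank_ratings"])
        if gave_ratings and not r["heard_of_bank"]:
            flagged.append(r["id"])
    return flagged

records = [
    {"id": 101, "heard_of_bank": True,  "bank_ratings": [5, 4, 4]},
    {"id": 102, "heard_of_bank": False, "bank_ratings": [3, 3, 2]},       # inconsistent
    {"id": 103, "heard_of_bank": False, "bank_ratings": [None, None, None]},
]
print(inconsistent_respondents(records))  # [102]
```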
Coding
Coding means assigning a code to represent a specific response to a specific question, along
with the data record and column position that the code will occupy.
Many questionnaire design and data entry software packages code data automatically.
Examples of the options available will be presented in the Internet and computer applications
section and on the Companion Website. Learning how to use such packages or even using
spreadsheet packages means that the process of coding is now a much simpler task for the
marketing researcher. Many of the principles of coding are based on the days of data
processing using ‘punched cards’ or even, much more recently, DOS files.
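
A minimal sketch of coding follows. The codebook, its column positions and the missing-value code 9 are assumptions made for illustration; they do not come from any particular software package.

```python
# Hypothetical codebook: question -> (column position, {response label: code})
CODEBOOK = {
    "gender": (1, {"male": 1, "female": 2}),
    "usage": (2, {"light": 1, "medium": 2, "heavy": 3}),
}
MISSING = 9  # code reserved for missing responses

def code_response(answers):
    """Turn one respondent's verbal answers into a fixed-position record of codes."""
    record = [MISSING] * len(CODEBOOK)
    for question, (column, codes) in CODEBOOK.items():
        label = answers.get(question)
        if label is not None:
            # An unknown label raises KeyError, so it surfaces instead of being miscoded.
            record[column - 1] = codes[label.lower()]
    return record

print(code_response({"gender": "Female", "usage": "heavy"}))  # [2, 3]
print(code_response({"gender": "male"}))                       # [1, 9]
```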
Transcribing
Transcribing data involves keying the coded data from the collected questionnaires
into computers. If the data have been collected via the Internet, CATI or CAPI, this step is
unnecessary because the data are entered directly into the computer as they are collected.
Besides the direct keying of data, they can be transferred by using mark sense forms, optical
scanning or computerised sensory analysis.
Mark sense forms require responses to be recorded in a pre-designated area coded for
that response, and the data can then be read by a machine. Optical scanning involves direct
machine reading of the codes and simultaneous transcription.
A familiar example of optical scanning is the transcription of universal product code
(UPC) data, scanned at supermarket checkout counters. Technological advances have resulted
in computerised sensory analysis systems, which automate the data collection process. The
questions appear on a computerised gridpad, and responses are recorded directly into the
computer using a sensing device.
Except for CATI and CAPI, an original record exists which can be compared with
what was either automatically read or keyed. Errors can occur in an automatic read or as data
are keyed and it is necessary to verify the dataset, or at least a portion of it, for these errors.
To verify keyed data, a second operator re-keys the data from the coded questionnaires. The transcribed
data from the two operators are compared record by record. Any discrepancy between the
two sets of transcribed data is investigated to identify and correct for data keyed in error.
Verification of the entire data set will double the time and cost of data transcription. Given
the time and cost constraints, and that experienced operators who key data are quite accurate,
it is sufficient to verify 10–25% of the data. With automatically read data, the completed data
set that has been read can be compared with original records. Again, a percentage may be
selected and checks made to see what may have caused differences between the original
record and the read data (e.g. respondents entering two ticks when only one was requested).
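
Verification by double keying can be sketched as follows. The function names and the 25% share are illustrative assumptions; the records are small made-up lists of codes.

```python
import random

def find_discrepancies(first_pass, second_pass):
    """Compare two keyings record by record; return (record index, column index) mismatches."""
    mismatches = []
    for i, (a, b) in enumerate(zip(first_pass, second_pass)):
        for j, (x, y) in enumerate(zip(a, b)):
            if x != y:
                mismatches.append((i, j))
    return mismatches

def verification_subset(n_records, share=0.25, seed=0):
    """Pick which record indices to re-key when only a portion (10 to 25%) is verified."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    k = max(1, round(n_records * share))
    return sorted(rng.sample(range(n_records), k))

operator_1 = [[1, 3, 5], [2, 2, 4], [1, 5, 5]]
operator_2 = [[1, 3, 5], [2, 3, 4], [1, 5, 5]]
print(find_discrepancies(operator_1, operator_2))  # [(1, 1)]
print(len(verification_subset(20)))                # 5
```

Each reported mismatch would then be resolved by going back to the original questionnaire.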

Cleaning the data

Data cleaning includes consistency checks and treatment of missing responses. Even though
preliminary consistency checks have been made during editing, the checks at this stage are
more thorough and extensive, because they are made by computer.
Consistency checks
Consistency checks identify data that are out of range or logically inconsistent or have
extreme values. Out-of-range data values are inadmissible and must be corrected. For
example, respondents have been asked to express their degree of agreement with a series of
lifestyle statements on a 1 to 5 scale. Assuming that 9 has been designated for missing values,
data values of 0, 6, 7 and 8 are out of range. Computer packages can be programmed to
identify out-of-range values for each variable and will not progress to another variable within
a record until a value in the set range is entered. Other packages can be programmed to
print out the respondent code, variable code, variable name, record number, column number
and out-of-range value. This makes it easy to check each variable systematically for out-of-
range values. The correct responses can be determined by going back to the edited and coded
questionnaire.
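
The out-of-range check described above can be sketched in a few lines. The sketch assumes the 1 to 5 agreement scale with 9 reserved for missing values; the variable names `q1` to `q3` and the data are made up.

```python
VALID = {1, 2, 3, 4, 5}   # admissible values on the 1 to 5 scale
MISSING = 9               # code designated for missing values

def out_of_range(records, variable_names):
    """List (record number, variable name, value) for every inadmissible value."""
    problems = []
    for rec_no, record in enumerate(records, start=1):
        for name, value in zip(variable_names, record):
            if value not in VALID and value != MISSING:
                problems.append((rec_no, name, value))
    return problems

data = [[1, 5, 9], [0, 3, 6], [2, 7, 4]]
names = ["q1", "q2", "q3"]
print(out_of_range(data, names))  # [(2, 'q1', 0), (2, 'q3', 6), (3, 'q2', 7)]
```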
Treatment of missing responses
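Missing responses can be treated in one of the following ways:
• Substitute a neutral value: typically the mean response to the variable is substituted for
the missing responses.
• Substitute an imputed response: the respondent’s pattern of responses to other questions is
used to impute a suitable response to the missing question.
• Casewise deletion: cases or respondents with any missing responses are discarded from the
analysis.
• Pairwise deletion: for each calculation, only the cases or respondents with complete
responses for the variables involved are used.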
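
Three of the options listed above can be sketched with the standard library. The data and function names are hypothetical, `None` stands for a missing response, and the mean is used as the neutral value. Imputation is left out because it requires a model of the respondent's other answers.

```python
from statistics import mean

MISSING = None
data = {
    "q1": [4, 2, MISSING, 3],
    "q2": [3, MISSING, 1, 5],
    "q3": [2, 4, 3, MISSING],
}

def substitute_neutral(values):
    """Replace missing responses with the mean of the observed responses."""
    observed = [v for v in values if v is not MISSING]
    fill = mean(observed)
    return [fill if v is MISSING else v for v in values]

def casewise_deletion(table):
    """Keep only respondents who answered every question."""
    rows = zip(*table.values())
    return [row for row in rows if MISSING not in row]

def pairwise_deletion(table, a, b):
    """For one pair of variables, keep the respondents who answered both."""
    return [(x, y) for x, y in zip(table[a], table[b])
            if x is not MISSING and y is not MISSING]

print(substitute_neutral(data["q1"]))        # [4, 2, 3, 3]
print(casewise_deletion(data))               # [(4, 3, 2)]
print(pairwise_deletion(data, "q1", "q2"))   # [(4, 3), (3, 5)]
```

Casewise deletion keeps only one of the four respondents here, while pairwise deletion keeps two for the q1 and q2 calculation, which shows why pairwise deletion preserves more data.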
Statistically adjusting the data
Procedures for statistically adjusting the data consist of weighting, variable
respecification and scale transformation. These adjustments are not always necessary but can
enhance the quality of data analysis.
Weighting
In weighting, each case or respondent in the
database is assigned a weight to reflect its importance relative to other cases or respondents.
The value 1.0 represents the unweighted case. The effect of weighting is to increase or decrease
the number of cases in the sample that possess certain characteristics. (See Chapter 15, which
discussed the use of weighting to adjust for non-response bias.) Weighting is most widely
used to make the sample data more representative of a target population on specific
characteristics. It may also be used to give greater importance to cases or respondents
with higher-quality data. Yet another use of weighting is to adjust the sample so that greater
importance is attached to respondents with certain characteristics. If a study is conducted to
determine what modifications should be made to an existing product, the researcher might want
to attach greater weight to the opinions of heavy users of the product. This could be
accomplished by assigning weights of 3.0 to heavy users, 2.0 to medium users, and 1.0 to light
users and non-users.
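
The effect of the weights 3.0, 2.0 and 1.0 can be seen in a small worked sketch. The ratings below are made up; only the weights come from the text.

```python
# Weights from the text: heavy users count three times as much as light users and non-users.
WEIGHTS = {"heavy": 3.0, "medium": 2.0, "light": 1.0, "non-user": 1.0}

def weighted_mean(responses):
    """responses: list of (usage group, rating). Returns the weighted mean rating."""
    total = sum(WEIGHTS[group] * rating for group, rating in responses)
    weight_sum = sum(WEIGHTS[group] for group, _ in responses)
    return total / weight_sum

ratings = [("heavy", 2), ("medium", 4), ("light", 5), ("non-user", 5)]

unweighted = sum(r for _, r in ratings) / len(ratings)
print(unweighted)                         # 4.0
print(round(weighted_mean(ratings), 2))   # 3.43
```

The weighted mean (24/7) is lower than the unweighted mean (4.0) because the heavy user's low rating counts three times.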
Selecting a data analysis strategy
The process of selecting a data analysis strategy is described in the figure below. The
selection of a data analysis strategy should be based on the earlier steps of the marketing
research process, known characteristics of the data, properties of statistical techniques, and
the background and philosophy of the researcher. Data analysis is not an end in itself. Its
purpose is to produce information that will help address the problem at hand. The selection of
a data analysis strategy must begin with a consideration of the earlier steps in the process:
problem definition (step 1), development of an approach (step 2), and research design (step 3).
The preliminary plan of data analysis prepared as part of the research design should be used as
a springboard. Changes may be necessary in the light of additional information generated in
subsequent stages of the research process. The next step is to consider the known characteristics
of the data. The measurement scales used exert a strong influence on the choice of statistical
techniques (see Chapter 12). In addition, the research design may favour certain techniques.
For example, analysis of variance (see Chapter 19) is suited for analysing experimental data
from causal designs. The insights into the data obtained during data preparation can be valuable
for selecting a strategy for analysis. It is also important to take into account the properties of
the statistical techniques, particularly their purpose and underlying assumptions.
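
One way the measurement scale constrains the choice of technique is in the measure of central tendency. The sketch below follows a common rule of thumb (mode for nominal, median for ordinal, mean for interval and ratio data); it is an illustration, not the selection procedure of any particular textbook, and the function name is hypothetical.

```python
from statistics import mean, median, mode

def central_tendency(values, scale):
    """Pick a measure of central tendency that is meaningful for the given scale."""
    if scale == "nominal":
        return mode(values)      # categories can only be counted
    if scale == "ordinal":
        return median(values)    # order is meaningful, distances are not
    if scale in ("interval", "ratio"):
        return mean(values)      # distances between values are meaningful
    raise ValueError(f"unknown measurement scale: {scale}")

print(central_tendency(["bus", "car", "bus"], "nominal"))   # bus
print(central_tendency([1, 3, 2, 5, 4], "ordinal"))         # 3
print(central_tendency([10, 20, 30], "ratio"))              # 20
```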

What is Data Exploration?

Data exploration is the first step of data analysis, in which data are explored and visualized
to uncover initial insights or to identify areas and patterns worth examining in more detail.
Using interactive dashboards and point-and-click data exploration, users can better understand
the bigger picture and get to insights faster.

Why is Data Exploration Important?

Starting with data exploration helps users to make better decisions on where to dig
deeper into the data and to gain a broad understanding of the business before asking more
detailed questions later. With a user-friendly interface, anyone across an organization can
familiarize themselves with the data, discover patterns, and generate thoughtful questions that
may spur on deeper, valuable analysis.

Data exploration and visual analytics tools build understanding, empowering users to
explore data in any visualization. This approach shortens the time to answers and deepens
users’ understanding by covering more ground in less time. Data exploration is also important
because it democratizes access to data and provides governed self-service
analytics. Furthermore, businesses can accelerate data exploration by provisioning and
delivering data through visual data marts that are easy to explore and use.

What are the Main Use Cases for Data Exploration?

Data exploration can help businesses explore large amounts of data quickly to better
understand next steps in terms of further analysis. This gives the business a more manageable
starting point and a way to target areas of interest. In most cases, data exploration involves
using data visualizations to examine the data at a high level. By taking this high-level
approach, businesses can determine which data is most important and which may distort the
analysis and therefore should be removed. Data exploration can also be helpful in decreasing
time spent on less valuable analysis by selecting the right path forward from the start.
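
A high-level first look at a variable can be sketched without any dashboard tool. The example below assumes a made-up list of 1 to 5 ratings and prints summary statistics plus a text histogram; the function names are illustrative.

```python
from collections import Counter
from statistics import mean, median

def summarize(values):
    """Basic summary statistics for a first look at one variable."""
    return {
        "n": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }

def text_histogram(values):
    """One line per distinct value, with one '#' per occurrence."""
    counts = Counter(values)
    return [f"{v}: {'#' * counts[v]}" for v in sorted(counts)]

ratings = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]
print(summarize(ratings))
for line in text_histogram(ratings):
    print(line)
```

Even this crude display shows where responses concentrate and whether any values look out of place before a more detailed analysis is chosen.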
