Reading Material For AMR - Dr. Vikas Goyal

Advanced Marketing Research

Reading Material
Content
Basic Marketing Research: ........................................................................................................ 4
How clear research objectives can lead a project to success ................................................ 4
The top five mistakes in marketing statistics ......................................................................... 9
Computers know 'how' but they don't know 'what' ............................................................ 13
Take these steps to build your research on a solid foundation ........................................... 17
Qualitative Research: .............................................................................................................. 22
Adapting quantitative techniques to qualitative research .................................................. 22
Qualitative research demands a scientific approach ........................................................... 27
Measurement Scales, Questionnaire Design and Survey Errors: .......................................... 33
A simple solution to nagging questions about survey, sample size and validity ................. 33
Understanding data requires recognition of types of error ................................................ 36
An analysis of the impact of survey scales ........................................................................... 40
Quant or qual, let’s go back to the basics ............................................................................ 50
Survey and sampling in an imperfect world......................................................................... 55
Increasing survey accuracy................................................................................................... 61
Statistical Significance: ............................................................................................................ 66
The significance of significance ............................................................................................ 66
The insignificance of significance testing ............................................................................. 70
The use, misuse and abuse of significance .......................................................................... 74
Vexed by significance testing? Try the bootstrap technique ............................................... 76
Basic Data Analyses:................................................................................................................ 80
Secrets of effective data use ................................................................................................ 80
Ordered up wrong ................................................................................................................ 89
Let's test everything ............................................................................................................. 91
A comparison of missing value options in regression analysis ............................................ 94
Chi-Square Test: ...................................................................................................................... 97
By the Numbers: The cool logic of chi-square ..................................................................... 97

t-Test: ....................................................................................................................................... 99
Nonparametric tests: sturdy alternatives ............................................................................ 99
ANOVA and ANCOVA: ........................................................................................................... 103
Using ANCOVA to gauge the impact of demographic differences on satisfaction ............ 103
Regression Analysis: .............................................................................................................. 106
Regression regression ........................................................................................................ 106
To progress you must first regress ..................................................................................... 110
Have you ever wondered... ................................................................................................ 121
Factor Analysis: ..................................................................................................................... 126
Factor analysis: A useful tool but not a panacea ............................................................... 126
Discriminant Analysis: ........................................................................................................... 129
A walk through discriminant analysis................................................................................. 129
Clustering: .............................................................................................................................. 135
Latent class modeling as a probabilistic extension of k-means clustering ........................ 135
Multi-Dimensional Scaling: ................................................................................................... 143
Exploring marketing ideas with perceptual maps.............................................................. 143
Perceptual mapping and cluster analysis: some problems and solutions ......................... 153
Quadrant analysis (Percep maps) ...................................................................................... 163
Conjoint Analysis: .................................................................................................................. 169
A short history of conjoint analysis .................................................................................... 169
Conducting full-profile conjoint analysis over the Internet ............................................... 176
Benefit impact analysis (Alternative to Conjoint) .............................................................. 185
Segmentation: ....................................................................................................................... 190
Ten guidelines for a good segmentation............................................................................ 190
Q-Factors or K-Means? A market segmentation dilemma ................................................ 195
Multivariate Analyses: .......................................................................................................... 200
Multivariate analysis - some vocabulary............................................................................ 200
A marketing researcher's guide to multivariate analysis ................................................... 204
(Sub-) optimal test designs for multivariable marketing testing ....................................... 207
A survey of multivariate methods useful for market research .......................................... 210
Report Writing: ...................................................................................................................... 221
Mastering the art of writing quantitative research reports .............................................. 221

Charting and graphing software comes of age .................................................................. 223
Additional Readings: ............................................................................................................. 241
A survey of analysis methods ............................................................................................. 241
Part I: key driver analysis ................................................................................................ 241
Part II: Segmentation analysis ........................................................................................ 246
McCullough’s Laws: first principles of commercial data analysis ...................................... 251
Time series analysis: what it is and what it does ............................................................... 256
Social Media Data Analysis: .................................................................................................. 261
Cracking the code of social media data analysis ................................................................ 261
Analyzing the content of social media data ....................................................................... 266
BIG Data Analysis: ................................................................................................................. 271
Big data no big deal ............................................................................................................ 271
Big data matters: Why you should be using it and how others already are ...................... 273
Big data: boon to improving customer experience, bane of researchers?........................ 276
Dealing with External Research Providers: .......................................................................... 283
Ten research industry secrets and how to handle them ................................................... 283
Working with a statistical expert and surviving ................................................................. 288

*The articles are adapted from Quirk's Marketing Research Review

Basic Marketing Research:
How clear research objectives can lead a project to success
Author - Bonnie W. Eisenfeld

Article Abstract

Defining research objectives at the beginning of a project can serve as a guiding light
throughout the research process and help ensure that client needs are satisfied by asking
the right questions to the right people the right way.

Knowing what you want

A variety of problems can, and most likely will, occur when research objectives are not
explicitly stated at the beginning of the project or are forgotten during the research process.
So, the very first thing a marketing researcher should do is work with the client to identify
and specify research objectives. Research objectives are statements describing the types and
categories of information you want to obtain, the target population you want it from, and
the comparisons you want to make. As the project progresses through its
stages, the research objectives guide and inform the project team.

Exploring or measuring. Research objectives need to specify whether you want to explore
or to measure. Exploration leads to qualitative methodologies such as focus groups or in-
depth interviews. If you know nothing about the market, it is important to explore it and
obtain ideas before proceeding to a quantified measurement phase. A measurement
objective leads to quantitative research methods yielding numerical data. When
measurement is your objective, you need to specify what you want to measure. If a
company is going to make a large, expensive or risky decision, it is particularly important to
quantify market data for the purpose of minimizing the risk.

Categories of data. Research objectives are summary statements describing the categories
of data you want to obtain. Market research objectives might include learning about buyer
behaviors, attitudes, brand awareness, brand image, product satisfaction, product likes and
dislikes, good and bad experiences, likelihood to consider, likelihood to purchase and so
forth. In each case, these objectives need to be tailored to the specific project.

Definition of the target population. Research objectives need to be tied to one or more
target populations. A target population must be able to provide the data you want. For
example, you can’t ask technical questions to respondents who don’t understand the
technical jargon. You can’t ask people to talk about their experience with a product if they
have no experience in that category.

Comparisons. Often researchers want to compare segments of the population to each other
or measure year-to-year changes.

Triggered that need

Research objectives are not invented out of thin air. When you have a need for market
research, some marketing, business, strategy or communications problem or objective has
triggered that need. When you conduct research, you should know how the findings will
eventually be used, particularly if a decision is going to be made or an action taken based on
the findings.

For example, a marketing objective might be to sell more of your product. You could
conduct research among three target populations: 1) your customers, 2) customers of
competing brands, and 3) people who potentially need your product.

Customer research. Your research objectives for customers would be to find out how they
are using your product; what would motivate them to use more of your product; what other
brands they are using; problems or dissatisfactions they might have with your brand; and
other obstacles to more frequent usage.

Here is a simple example: A company produced and sold a unique over-the-counter health
product that people used orally in liquid form. Sales had been declining over the past couple
of years. A research project was designed with the objectives of learning the reasons for the
decline and how to increase sales. By conducting interviews with customers, the company
learned that customers did not like the taste of the product and used it less than they
needed it. The solution was to reformulate the product so it would taste better. As a result,
sales increased.

Users of competing brands. Research objectives for users of competing brands would be to
seek to discover their opinions of that brand; what they like and dislike about that brand;
dissatisfactions or problems they have with that brand; perceptions of your brand; and what
would motivate them to try your brand.

For both sets of respondents, you could find out how they use the product and whether
they have additional needs.

Potential users of your product. You could hypothesize a target population with a potential
or latent need for your product. (They need it but they don’t yet know it.) Your research
objective would be to identify the problems or needs that population is having for which
your product provides a solution. Another objective could be to test your product concept
to get respondents’ opinions, likes and dislikes and likelihood to purchase.

Other sources of research objectives

Other starting points for market research might be the information needs for a strategic
plan; an investment or acquisition; a new product launch; a new delivery or communication
channel; or some other major company decision.

A review of published market research studies can assist you in defining appropriate
research objectives. For example, a company wanted to measure employees’ satisfaction
with its communication program. From published research, they learned that other
companies were measuring employees’ trust in communications, a key element in defining
satisfaction. Trust was then considered as an option for an additional research objective.

Limiting and prioritizing research objectives

How many research objectives is the right number? Time limits the number of questions
that can be included in a focus group or an individual interview. If an interview is too long,
respondents will become fatigued, rush through their responses and/or terminate early. In a
focus group, time may run out before you have covered all topics. Unless you are going to pay
an enormous incentive to get participants to answer a huge questionnaire, you need to limit
your questions. In order to do that, you need to prioritize your objectives. Those that are
less important may need to be omitted.

However, if you have a lot of important objectives - too many for one questionnaire - an
option is to split your sample randomly and conduct two research projects, each with a
different set of objectives and a shorter questionnaire. Assuming the split samples have the
same characteristics, you should meet all your research objectives.
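
To make the mechanics concrete, here is a minimal Python sketch of such a random split; the respondent IDs and the 50/50 allocation are assumptions for illustration, not part of the article.

```python
# Sketch (assumed respondent list): randomly split one sample into two halves so that
# each half can receive a shorter questionnaire covering a different set of objectives.
import random

respondents = [f"R{i:03d}" for i in range(1, 201)]   # 200 made-up respondent IDs

random.seed(42)
random.shuffle(respondents)
half = len(respondents) // 2
group_a, group_b = respondents[:half], respondents[half:]

print(len(group_a), len(group_b))   # 100 respondents per questionnaire version
```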

Exploring or measuring

Are you trying to explore a topic or are you trying to measure something? The best approach is
to explore first and then measure. You can miss a lot of information if you skip the
exploratory stage. In many instances where exploratory research was skipped, the
questionnaire for the measurement phase neglected to ask the most important questions.
In addition, the multiple response choices did not include some of the most important
answers.

As an example, a company had just installed an employee software platform and wanted to
measure employee satisfaction with it. The IT managers were about to jump in and ask
people how satisfied they were with different elements of the software and postpone
asking any questions about training and support. Luckily the marketing research team was
able to persuade the IT managers to conduct some exploratory in-depth interviews first.

Findings from these exploratory interviews showed that training and support were
primary concerns. The quantitative research phase was then designed to measure
satisfaction with those two elements and to uncover any suggestions for improvement.
Eventually, the findings led to greatly improved levels of support and more tailored training
methods for segments of the employee population.

Target population and recruiting

The recruiting of eligible and appropriate respondents should be based on the research
objectives. Depending on the objectives, respondents may need to have prior knowledge or
experience to enable them to voice an opinion. For example, if you want to find out details
about customers’ complaints, don’t ask the CEO, ask the call-center staff. The CEO will know
about the customers’ complaints only when they reach high levels of magnitude.

In many cases, it is necessary to hypothesize the definition of the most appropriate
audience, especially for the exploratory phase of the research. A company was considering
offering an assistance plan that helps people find medical resources, legal resources,
alternative transportation, hotels and even burial assistance or transportation of a deceased
person. The product manager neglected to think about who would be the most likely type of
person who would be interested in this service and did not specify requirements to the
recruiter. It turned out that none of the participants recruited were travelers, visitors or
newcomers; they had never left their hometown. They were totally bored with the idea
and couldn’t see why they wouldn’t just ask their family and friends for advice.

Another organization wanted research as input to its strategic plan, one element of which
was to include how to win against its competitors. Unfortunately, the project manager
neglected to specify that some of the participants in the research should either currently
use or previously have used a competitor’s brand - not just the client’s brand. It turned out
that the list source contained no one who used or had used a competitor’s brand and
therefore researchers were unable to obtain any data on opinions about competitors.

Methodology for comparisons

If you plan to compare current data to a previous year’s data, you need to collect the data in
a consistent manner. For example, if you have been using a telephone survey, you need to
continue the telephone survey method. Many marketing researchers are switching to online
methods. A switch in data collection method can be accomplished as a test, simultaneously
with the original method. In that way, you will accomplish the objective of correct
comparison, while at the same time testing the new method for future comparisons.

If you plan to make a comparison of current data with the previous year’s data, you also
need to use the same questionnaire as the previous year, although new questions can also
be added.

Writing questions to meet research objectives

Questions should be written to meet the research objectives. I have seen questionnaires
with questions in them that did not seem to meet any of the research objectives, and
conversely I have seen questionnaires where there were no questions at all for certain
objectives. Either way, you have a problem. This mismatch is common, especially when a
questionnaire is heavily edited by multiple people within an organization.

The easiest way to ensure you do not have a mismatch is to head each series of questions
with the appropriate research objective. Don’t remove the research objectives from the
final questionnaire; they will assist the moderators or interviewers in focusing their
questioning appropriately.

For example, if a key objective is to obtain competitor information, have a section titled
“Competitor Information.” Ask respondents which competitors they have heard of; which
ones they use; what they think of them; how satisfied they are with them; and so on. Of
course, if you have not properly recruited respondents who can answer these questions,
you will not have data to meet that objective.

Lately, it seems that clients want longer questionnaires and shorter reports. By focusing on
the research objectives, questionnaires and reports can be better aligned.

Choosing a moderator or interviewer

A moderator or interviewer needs to be matched to the type of data collection and type of
respondent. For a relatively simple structured questionnaire administered by telephone, the
requirements may be a good speaking voice, ability to read and some basic interviewing
training. For specialized populations such as business or technical respondents, particularly
in qualitative research, a moderator or interviewer who sounds knowledgeable about the
subject matter will be able to obtain more information from the respondents by probing
intelligently. The research findings will be richer as a result.

Analysis plan

The analysis plan should be based on the research objectives. For a quantitative study,
tabulating everything by everything is a common procedure but unnecessary. Using the
research objectives, you can think ahead about what kinds of tabulations and other analysis
you will need. You should write the analysis plan prior to finalizing the questionnaire. You
may find that you have neglected to ask a certain question that would provide useful data to
analyze and you still have time to add it to the questionnaire.

Writing the report to meet objectives

The easiest way to write the report is to list the objectives and then take each one and write
to that objective. Pretend you are writing a college exam with an open book. Write what
you think you have found and then go back to check the data. I have seen reports with lots
of data, but at the end you did not know if the research objectives were met because the
objectives were not the focus of the report. A reader should not have to work too hard to
obtain the necessary answers.

Whether your report is written in traditional style or presentation format, it is important to
include the research objectives, description of the respondents and methodology. Keep in
mind that some clients keep reports on file for a long time and eventually other people in
the company may use them, so each report needs to be self-explanatory and self-contained.

A continuous cycle

If research objectives are defined correctly in the beginning and threaded through all
elements of the research project, then at the end of the project, you will have useful
findings that meet the research objectives. Market research is a continuous cycle; findings
from each research project can be used to inform the research objectives for subsequent
projects.

The top five mistakes in marketing statistics


Author - William M. Briggs

Article Abstract

From asking too many questions to falling for the latest technique, here is a statistician’s
take on marketers’ common statistics-related mistakes.

Statistics isn’t as easy as it looks. Mastering the subject isn’t equivalent to “submitting the
data to software.” From my perspective as a statistician, these are the top five mistakes I
have seen marketers and researchers make. Do any of them seem familiar to you?

1. Asking too many questions

Data drives statistics: If there isn’t any, few questions can be answered. Yet too much data
causes problems just as too little does. I don’t mean big data, defined as rich and plentiful
data, but of such size that it’s difficult to handle in the usual manner. Too much bad data is
what hurts.

Who’s been in a survey-design meeting where a client wants to know what makes his
product popular, where everybody contributes a handful of questions they want asked? And
those questions lead to more questions, which bring up still others.

The discussion ranges broadly: Everybody has an idea what might be important. A v.p. will
say, “I feel we should ask, ‘Do you like the color blue?’,” while a rival v.p. will insist on,
“About blue, do you not like it?” Gentle hints that one of these questions could and should
be dropped might be taken as impolitic. The marketing analysis company, wanting to keep
its contract, acquiesces.

Statisticians are rarely invited to these soirées but if one were present he would have
insisted that duplicate or near-duplicate data cannot provide additional insight but can
cause the analysis to break or give absurd answers.

If there is genuine uncertainty about a battery of questions, then a test survey should be run
first. This trial analysis works out bugs and sets expectations. The process can be iterated
until the suite of questions is manageable and there is a high likelihood that each piece of
data will be useful. This also prevents situations where an analytical method has
been promised but where the survey design did not include the necessary questions (this
often happens; see Mistake 5).

This simple yet rare procedure, if used routinely, would eliminate most of the mistakes
listed below and save money in the long haul.

2. Failing to appreciate limitations

Not everything you want to know can be answered. The best brain, programming the fastest
computer running the most sophisticated algorithm, can’t discover what isn’t there. Even if
you ask Ph.D.s from the best universities or if you write large checks to a company with a
reputation for doing the impossible.

Probability and statistical algorithms are not magic. Software spits out answers but answers
don’t imply the results are what you hope or believe they are.

Example: driver models, where drivers of some outcome are input into an algorithm which
orders the importance and gives the strength of each driver. Now, clients often insist that
each driver be positively associated with the outcome and that negative associations are
either impossible or unacceptable. Pleas for positive “correlations” become so earnest that
some analysts, concerned about their paycheck, provide the client what he wishes.

But sometimes negative results which don’t make sense are still found. This always means
the wrong method of analysis has been used or, pace Mistake 1, too much bad data has
been used.

Or it means that a driver has nothing to say about the outcome after all the other drivers
have been taken into account. These superfluous drivers should be expunged from the
model. But then comes politics: Whichever driver is tossed will be somebody’s favorite.
What’s worrying is when, under pressure, statisticians “discover” ways to keep problematic
drivers.

Other common instances where a statistician is asked to “make it work” are when an old
analysis doesn’t match a current one, when a decline in some measure “should be” an
increase or when somebody doesn’t want to deliver bad news.

3. Not understanding regression

Regression or regression-like techniques are the backbone of marketing statistics. Yet most
folks don’t have a good handle on their interpretation and limitations.

Here’s the setup: We have something we want explained, like customer purchase intent or
money spent; any number. Call that number Y. It’s also called the outcome, or, in older
terminology, the dependent variable.

We also have other data which we hope are probative of Y. These are called drivers or
correlates, or, in the same old words, independent variables. Call this potential explanatory
data X. Since we might have more than one piece of explanatory data, we call them X1, X2,
and so forth.

You see equations written like this

Y = b0 + b1 X1 + b2 X2 + …

where the ellipsis indicates we could go on adding terms and go on and go on some more –
you get the idea. People who use regression certainly grasp this trick: They add terms like
there’s no tomorrow, figuring, “Why not?” Because the equation is wrong, that’s why.
Here’s the real math:

Y ~ N(b0 + b1 X1 + b2 X2 + …, s)

where the tilde indicates it is our uncertainty in Y – and not Y itself – which is characterized
by a normal distribution with a central parameter (which tells where the peak of the bell-
shaped curve goes) which is dependent on values of the Xs. The “s” describes the width of
the bell-shaped curve.

The b0, b1, … are called parameters, coefficients or sometimes betas (they are occasionally written
using the Greek alphabet). Inordinate interest is given to these creatures, as if they were the
reason for regression. They are not.

It turns out that in classical statistics you can make guesses for the parameters (Bayesians
do this less often). These guesses fascinate marketers in several ways, though they
shouldn’t. Remember the intent of the model was that once we knew what value X took,
then we would know the likely values – plural – Y might take. Who cares about a
parameter? They can’t be seen, tasted or touched.

It’s rare to see uncertainty accompany parameter guesses but it should. Or ideally, as said,
we should eschew the parameters altogether and speak of the relationship between the Xs
and the subsequent uncertainty in Y. But the methods to do this (Bayesian predictive
analytics) are not well-known.
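
As a rough, non-Bayesian stand-in for that idea, here is a minimal Python sketch on made-up data: it fits an ordinary least-squares line and then uses the fitted mean plus the residual spread s to describe a range of likely new Y values, rather than dwelling on the coefficients. The data, the single driver X and the plus-or-minus-two-s band are all illustrative assumptions, not anything prescribed by the article.

```python
# Minimal sketch (invented data): focus on the uncertainty in Y, not on the coefficients.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 200)                   # one made-up driver X
y = 5 + 2 * x + rng.normal(0, 3, size=200)    # outcome Y with noise

# Fit Y = b0 + b1*X by ordinary least squares; polyfit returns [b1, b0].
b1, b0 = np.polyfit(x, y, 1)

# "s" describes the width of the bell curve around the fitted mean.
resid = y - (b0 + b1 * x)
s = resid.std(ddof=2)

# For a new X, report the range of likely Y values, not just a point.
x_new = 7.0
mean_y = b0 + b1 * x_new
print(f"likely Y for X={x_new}: roughly {mean_y - 2 * s:.1f} to {mean_y + 2 * s:.1f}")
```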

Now you can appreciate Mistake 2 in more detail. Sticking dozens of Xs (drivers) into a
regression equation – which is designed to say things about Xs and Ys, not about parameters –
practically guarantees some parameters will be negative. Such is life.
The discrepancies between understanding and usage occur everywhere, incidentally, not
just regression.

4. Falling for the latest gee-whiz approach

Every year some new algorithm is touted which will solve all conceivable statistical
problems. Remember neural nets? Genetic algorithms? How about partial least squares,
permutation tests, support vector machines, trees, smoothing, machine learning, Bayesian
nets, Markov chain Monte Carlo? Now it’s big data (which isn’t even a technique). Add your
favorite to the list.

Once the new algorithm is released from academia into the wild, somebody invariably
writes a hagiographical article which catches the imagination of marketers, who then beg
statisticians to have the slick new wonder applied to their data. Doesn’t make any difference
if the method is appropriate or not or that it is like applying a sledge hammer to a tack; the
algorithm is hot, it’s sexy and it must be used.

Believe it or not, sometimes the best and fairest analysis is no analysis at all. Simple
summaries and descriptions of data are often superior to the fanciest model. This is because
statistical models are not meant to tell you about what you’ve already seen but what you
will see in the future, given conditions are this or that.

We don’t need to model old data, we need to predict new data. We don’t need to guess
(using p-values or hypothesis tests) whether this X is associated with that Y, we can just
look. If the relationship is real, then given the simplest model the situation allows, knowing
X will give the uncertainty in new Ys. In this way models can actually be validated with new
data.
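
A minimal sketch of that validation idea, again on invented data: fit the simplest model on part of the data, then check how often genuinely new Ys fall inside the band the model predicts. The split point and the two-s band are assumptions chosen only to illustrate the point.

```python
# Sketch: validate a simple model on data it has not seen (illustrative, simulated data).
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 300)
y = 4 + 1.5 * x + rng.normal(0, 2, size=300)

train, new = np.arange(200), np.arange(200, 300)   # old data vs. new (holdout) data

b1, b0 = np.polyfit(x[train], y[train], 1)
s = (y[train] - (b0 + b1 * x[train])).std(ddof=2)

# How often do genuinely new Ys land inside the model's +/- 2s band?
inside = np.abs(y[new] - (b0 + b1 * x[new])) <= 2 * s
print(f"share of new Ys inside the predicted band: {inside.mean():.0%}")
```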

5. Not coming to a statistician (soon enough)

Too often statisticians are called at the same time coroners are called to murder scenes.
What they can do at that point is the same, too: identify the cause of death.

I don’t want to hurt anybody’s feelings but the next topic is rather sensitive. Let me put it to
you in the form of a question: Would you board a jumbo jet piloted by a man whose only
experience comes from operating remote-control models? What if he learned his
techniques from older experienced hobbyists? What if he possessed a certificate showing he
knows all about model planes? What if he had a Ph.D. (proving his intelligence) in a subject
not related to piloting? Still no?

I am anxious to agree that it is possible that those who have had a statistics class or two
from a psychologist as they study for their Ph.D. in the same subject can understand fully
the complexities and nuances of probability and are just as facile with computation as any
statistician and sometimes are even more so.

But – and don’t get mad – it doesn’t happen that often. And just think: How many
statisticians try to practice psychology, politics, sociology, etc., or all those other fields which
contribute much to marketing science?

Computers know 'how' but they don't know 'what'


Author - Gary M. Mullet

Article Abstract

This article points to several potential pitfalls of taking statistical software results at face
value.

Recently I tried to convince a statistical software package that when I typed "varible"
I meant "variable". The software, however, used what I said and ignored what I meant to
say. Shortly after that I ran across a headline for some new statistical software which blared,
"For people who aren't statistics experts." I'm not sure that one necessarily has to be a
statistical expert to properly use statistical software, but as long as computers and their
programs do exactly what they're told to do, instead of what they should have been told to
do, oversimplification of software use can lead to trouble. Many times the difficulty is as
easy to spot as the "variable-variable" one. Many times it's not, as will be seen below.

None of the instances which follow are meant to deride or belittle anyone. Instead, they are
shown to illustrate just how easy it is to push the wrong button and ask for the wrong
analysis. I still type "varible" at least half the time, inadvertently and incorrectly. My error
brings the analysis to a screeching halt and is easy to find. These examples are both more
subtle and potentially more serious.

Examples

At least one data tabulation package does a t-test for proportions or says it does. Generally,
for large enough samples (whatever that may be--and for proportions it's not necessarily
anything greater than 30) the results will agree quite closely with the more correct Z-test or
chi-square test. However, there are some fairly strong assumptions underlying the t-test. Even
though these assumptions may sometimes be violated with impunity, strictly speaking there
is no such animal as a t-test for proportions. The program in question, however, is simple to
use and incorrect analyses can be performed without question. While a "statistical expert" is
probably not necessary to tell you whether or not your particular analyses are all right to
report, someone with at least a modicum of knowledge could certainly help. The easy-to-
use software can get an unwary analyst into serious difficulty.
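
For readers who want to see what the "more correct" test looks like in practice, here is a short sketch of the classical two-proportion Z-test; the awareness counts are invented, and this is one textbook formulation rather than the exact routine any particular package uses.

```python
# Sketch of the two-proportion Z-test (made-up counts, for illustration only).
from math import sqrt
from scipy.stats import norm

x1, n1 = 120, 300   # e.g., 120 of 300 respondents aware of brand A (assumed)
x2, n2 = 95, 300    # e.g., 95 of 300 respondents aware of brand B (assumed)

p1, p2 = x1 / n1, x2 / n2
p_pool = (x1 + x2) / (n1 + n2)                        # pooled proportion under H0: p1 = p2
se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))  # standard error of the difference
z = (p1 - p2) / se
p_value = 2 * norm.sf(abs(z))                         # two-sided p-value

print(f"z = {z:.2f}, p = {p_value:.3f}")
```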

While we're at it, you should be aware that the assumptions behind the above mentioned Z-
test and/or chi-square test are also quite stringent. For some of your analyses they, too, may be
violated--and the computer package used might not flag the violation. It happens a lot in
practice, because the programs do exactly as they are told, whether or not you really should
have meant to tell it to do such an analysis. (If you do find cases where these tests shouldn't
be done on your proportions, you're probably stuck either doing an exact test or an arcsin
transformation.)

Another variation on this theme is the analysis which was done on a simple paired-product
preference. How was it decided whether or not the proportion who preferred product A
was different than that preferring B? A dependent or paired t-test. Why? The computer
certified the methodology by performing the requested analysis. Quick, simple, easy-to-use
and wrong. But at least the analysis was done without the use of a "statistics expert".

Computer programs that don't require a "statistics expert" may be useful in designing
conjoint studies. Just push the right button (usually ENTER or RETURN), and here come your
conjoint scenarios ready to print and send to the field. Again, at least in a few cases, the
easy-to-use computer programs have been the source of trouble. In one, a 32-card sort was
produced for a study in which one of the attributes had 5-levels. With the other attributes at
2-, 3-, and 4-levels, the design was not a desired orthogonal array--but was unknowingly
used anyway.

Another conjoint study was designed for respondents to sort 16 cards. The problem here
was that two of the attributes didn't vary independently--their levels always appeared in the
same pairs. To illustrate, if one of the attributes was color with two levels, say, red and
blue and the other was size, say, large and small, what the respondents saw was red-large
on 8 cards and blue-small on the other eight. Clearly, there is no way to generate the utility
estimates that were desired, but no one thought to question or check the computer
generated design before the study was actually completed. The computer program which
did the design (and it was written especially for this study) performed exactly as instructed,
not as it should have been instructed.
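
A simple cross-tabulation of the design, run before fieldwork, would have exposed the problem. The sketch below rebuilds the flawed color/size pattern described above (the card data are reconstructed for illustration, not taken from the actual study) and shows the empty cells that make utility estimation impossible.

```python
# Sketch: cross-tabulate two attributes of a (reconstructed) 16-card design to spot confounding.
from collections import Counter

# The confounded pattern described above: color and size never vary independently.
cards = [("red", "large")] * 8 + [("blue", "small")] * 8

counts = Counter(cards)
for color in ("red", "blue"):
    for size in ("large", "small"):
        print(f"{color:>4} / {size:<5}: {counts[(color, size)]:2d} cards")

# The empty cells (red/small and blue/large never appear) show the two attributes are
# perfectly confounded, so separate utilities for color and size cannot be estimated.
```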

A cluster analysis was run on one of the easier-to-use, among the easy-to-use, cluster
programs. The program ran exactly as told, but after a couple of iterations, clusters of size 1
or 2 popped up. What happened? Seems that for a handful of respondents, some, but not
all, of their answers were punched one card column to the right of where they should have
been. These respondents, then, were showing up as the small clusters since they were, in
fact, very different from everyone else. The user of the cluster program had no idea
whether or not the cluster solution made sense; after all, there were no error messages displayed.

Another computer program was designed to generate mailing labels from a data base. Just
tell it how many you need and names are selected at random and mailing labels produced.
In this particular case the computer-generated labels weren't even given a cursory glance--
after all, the computer printed them--but just stuck on the envelopes and dropped into the
mail. The only problem was that, while the name, city, state, and ZIP code were on each
label, the street and number were not. Lots of undelivered surveys were returned to the
sponsoring organization, with the obvious disastrous consequences to the study.

In yet another case, a computer program did an analysis which was really unnecessary. A
series of statements were collected on a scale where 1 = Yes, the statement applies and 0 =
No, the statement doesn't apply. No problem so far. What the computer was asked to do,
and did, was produce correlations between these statements and the same set of
statements recorded as 1 = No, the statement doesn't apply and 0 = Yes, the statement applies. The
computer was all too happy to compute these unnecessary correlations, at no small cost.
They could be done, therefore they were done.

Yet another frequent happening (mentioned by Gurwitz) is to request the computer to run a
discriminant or regression or factor analysis. Quick and easy, if it weren't for item
nonresponse. Most computer packages drop a respondent totally from such analyses for
having only a single missing answer, sometimes out of 100 or so items. Several times the
ultimate user of such analyses will be looking at their multivariate analyses for marketing
insights only to find that the analyses weren't performed at all due to every respondent
having at least one missing answer. These, at least, wave a red flag. Even worse are the
analyses which are performed, retained and acted on even though the base sizes were only
10 or 15--those who answered everything requested in the survey. Again, the computers are
merely following orders.
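
A small sketch with simulated data shows how quickly listwise deletion erodes a sample; the 200 respondents, 50 items and 5 percent item-nonresponse rate are assumptions chosen only to illustrate the effect.

```python
# Sketch: how listwise deletion (dropping any row with a missing answer) shrinks a sample.
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
answers = rng.integers(1, 6, size=(200, 50)).astype(float)   # 200 respondents, 50 items, 1-5 ratings
answers[rng.random((200, 50)) < 0.05] = np.nan               # ~5% item nonresponse, at random

df = pd.DataFrame(answers)
complete = df.dropna()   # what many packages silently do before a multivariate analysis

print(f"respondents collected: {len(df)}")
print(f"respondents left after listwise deletion: {len(complete)}")
```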

The missing data problem can be severe, but generally unnoted, when discriminant based
perceptual maps are drawn. Reliability can be a real problem when the bases for such maps
are only 10 or 15 respondents, but the mapping algorithms perform anyway--quickly and
easily. Also, you can get maps done when the different brands shown are rated on different
attribute lists or the attributes are scaled differently on the questionnaire. So-called multiple
correspondence analysis maps have been produced from several 2-variables cross-
tabulations, rather than going back to the respondent data. They show all of the points
required, even though the coordinates were not generated as they should have been. Then
there was the discriminant based map which used such a high significance level that the
attribute directions were essentially random. The map made no sense because someone
told the computer to use a high significance level instead of a high confidence level. The
computer didn't balk; thus, the analyses which could be done were done but the analyses
which should have been done were not done.

The mystique associated with statistical computer programs is not limited to those
commercially available. A computer program was specifically written to perform a non-
standard, but still valuable, statistical procedure. As in most such cases, textbook data sets
were used to test the program, which performed well. Unfortunately, the degrees-of-
freedom were set as a constant value, 4, irrespective of the number of respondents and/or
stimuli. No one noticed this one for weeks, mainly because everyone believed that the
printed value should be correct; after all, the computer said it and the program was easy to
use.

It's also easy to get in trouble, since the computer is like Ado Annie (it "can't say no"), on
some harmless looking analyses. In one such study, a series of attribute ratings were of the
variety "Too Big," "Just Right" and "Too Small." These were coded and entered into the data
file as 1, 2, and 3, respectively. Two products compared on one such scale showed Product A
with 3 votes for "Too Big" and 112 for "Just Right." Product B had 23 respondents say, "Too
Big," 59 say, "Just Right" and 33 respond with "Too Small." Obviously, the products are
different with respect to this scale. However, the computer was instructed to do a
dependent t-test on the means which turned up as not significantly different. If only the
computer could have said no!
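
A short sketch of the numbers above makes the point: the coded means are nearly identical while the full distributions clearly differ. For simplicity the chi-square below treats the two products as independent samples, which ignores the pairing in the original study, so it illustrates the contrast rather than the exact test that should have been run.

```python
# Sketch of the example above: similar means, very different distributions.
import numpy as np
from scipy.stats import chi2_contingency

#            Too Big (1)  Just Right (2)  Too Small (3)
product_a = [3,           112,            0]
product_b = [23,          59,             33]

codes = np.array([1, 2, 3])
mean_a = np.dot(product_a, codes) / sum(product_a)
mean_b = np.dot(product_b, codes) / sum(product_b)
print(f"mean coded rating: A = {mean_a:.2f}, B = {mean_b:.2f}")          # nearly identical

chi2, p, dof, _ = chi2_contingency([product_a, product_b])
print(f"chi-square on the full table: chi2 = {chi2:.1f}, p = {p:.2g}")   # clearly different
```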

Computers also don't question you (or me) when you try to analyze dependent samples as if
they were independent (or vice versa) as long as the data fit the required format for the
test. They also don't ask if you have an overlapping sample for analysis--they just do as they
are told.

In one survey, the project director designed the study to test for order bias by using all 6
possible rotations of the 3 brands in the survey. Here the sin was of omission--the CRT
interview did not capture which rotation was used on which respondent. Here, too, the
computer did exactly as it was told-- it just wasn't told to do enough. An easily used CRT
interviewing package was involved in this one.

Conclusions

It would be nice to say at this point that the above cases were all apocryphal. Alas, none of
them are. This is not to say that you need to be totally paranoid every time you skim a
computer generated statistical analysis, although a little paranoia may not hurt. The point is
that the easier-to-use the statistical programs become, the more self-styled statistical
experts seem to turn up-statisticians-on-a-chip, as it were. Doing the wrong thing or doing
the right thing incorrectly, just because the computer programs allow it, is probably more
harmful to a marketing research project than not doing anything at all. At least in the latter
case, getting no answer at all is probably less harmful than getting the wrong answer (it sure
is when I type "varible").

It's also not as simple as comparing the means from the statistical analysis with those from
your cross-tabulations. If they agree, then the statistical analysis must have been done
correctly; if not, the advanced analysis must be wrong--right? Not quite.

In one recent study, the statistical analysis was correctly performed, on carefully
"derotated" data and the means didn't even begin to agree with the data tabs. You guessed
it--the data were not derotated before the tabs were done. The statistical analyst was
questioned at length about the disagreement between the means, as well. In this case, at
least, it was the easier analysis which the computer didn't question--and should have.

Nor is a solution coming through the haze of my crystal ball. Both the American Marketing
Association and the American Statistical Association have wrestled with and continue to
wrestle with the issue of certification, but that's probably overkill for this type of problem.
Even assuming that certification would help, it's still a long way off. At the very least, we
need to ask questions, lots and lots of questions--not just of our data but of those who ask
questions of our data. Taking computer printouts at face value can be very risky until
computers are programmed to know "what" as well as "how."

Take these steps to build your research on a solid foundation


Author - Mark Hardy

Article Abstract

Beyond getting the right sample, researchers conducting online surveys face a host of
obstacles, from harder-to-reach consumers to the speeders and cheaters who lurk among
the pool of willing respondents. Here are 10 topics to cover with your sample provider to
help improve your odds.

10 questions to ask before choosing a sample provider

Solid market research is the foundation of every business’s most critical decisions - and the
quality of that research depends on the quality of the sample. A representative, valid,
unbiased sample is essential for research results to provide accurate reflections of the
market - and reliable guidance for business direction. A host of factors, however, from the
explosion of the Internet to the fragmentation of media usage to the rise of social networks,
has made it increasingly difficult to attract the right participants and create the optimal
sample.

Even traditional RDD (random-digit dialing) landline phone sample - long considered the
most methodologically-sound for survey research - is coming into question with the rise of
mobile phone usage. The latest National Health Interview Survey shows that 51 percent of
U.S. homes are now cell-only or cell-primary households - making them hard or even
impossible to reach through traditional RDD. The growth of wireless is not just a U.S.
phenomenon. Research by our firm shows that around the world - from the U.K. to Spain to
Japan - more people now own cell phones than landlines, and that gap is largest among 18-
24-year-olds.

The challenges can be just as great, if not greater, in the online world. The traditional
paradigm for online research has been sending e-mail invitations to potential participants.
But that paradigm is no longer enough to sustain research into the future. Since 2003,
e-mail use for personal messages has plummeted 41 percent, according to the Online
Publishers Association. Although 90 trillion e-mails were sent over the last year, 81 percent
were spam. It is harder than ever to get participants’ attention when their in-boxes are
overflowing with junk mail.

But the decline in e-mail is just one of many changes transforming how researchers need to
reach and communicate with participants. Participants now can take surveys anywhere, on
a plethora of devices and often while doing other tasks. In fact, our research shows that
consumers worldwide now media multitask regularly, often texting, chatting on the phone
and surfing the Web all at the same time.

In this new world, technology has given birth to a wide range of new sampling options and
sources - as well as new threats to data quality and integrity. To be sure market researchers
get sample they can count on to drive the right business decisions, they need to ask
questions - and, more importantly, demand answers - different than in the past.
Understanding the responses to the following 10 questions can ensure that researchers
make the right sample choice for their projects.

1. How are reach and diversity achieved?

The reality is that only a finite number of people will ever join a panel. Although panels
always will remain a critical part of the access mix, it is increasingly difficult to deliver all the
participants needed for a survey from panels alone - particularly when looking for hard-to-
reach targets. In fact, in today’s world, no one source can deliver the reach and diversity
critical to unbiased sample.

The optimal sample taps into a variety of sources - traditional panels as well as social media,
online communities, affiliate partners, reward programs, shopping portals and more - to
provide true reach and diversity. In our multimedia, multitasking world, it is important to
engage people wherever they are - and to include all participants, even those who would
never be part of a managed panel.

2. How are multiple sources blended?

Blending sample from multiple sources lets researchers reach all people who want to share
their opinions - even those not on panels - maximizing diversity, the most important
characteristic of a representative sample. Blending also creates a better sample by
improving coverage and ensuring that the opportunity to take surveys is placed in front of
as large and varied a population as possible. Blending, therefore, actually can result in a
better-quality sample than using any one source alone - that is, when it’s done right.

When choosing a sample provider, confirm there are quality controls in place, such as digital
fingerprinting to avoid duplication. Plus, verify that blended sample is regularly checked to
reflect changes in source composition and market dynamics. It is also important to ensure
the provider considers a full range of factors by retesting in a multisource environment to
ensure balanced sample, adding calibration questions to surveys to help explain differences
and employing smoothing techniques.

3. What is the recruitment approach?

In today’s sampling world, variety and flexibility are key. Diverse sourcing - critical to
unbiased sample - demands eclectic recruiting. Effective sample providers use a variety of
recruitment methods to drive traffic to surveys, rather than, for example, bombarding
people with pop-up ads. By matching recruitment methods to partner sources and their
membership, sample providers can both increase participation and improve the participant
experience.

4. How are participants treated?

People are vastly different in what motivates and engages them. Even more importantly,
when they feel they are being treated fairly and their efforts are appreciated, they provide
better data. That’s why your sample provider should nurture participants. Effective sample
providers treat their participants like their clients. They make expectations clear, answer
questions quickly and offer reward systems as varied as their sources, with options ranging
from point systems to sweepstakes to charitable donations to information to sincere thanks.
Fully customizing rewards ensures sample providers successfully motivate each target
audience.

5. How is the participant experience managed?

In a world where there is constant competition for participants’ attention, it is more critical
than ever that we create survey experiences that are positive and engaging. One of the
biggest obstacles to participant satisfaction - and the largest sources of participant fatigue
and frustration - is being screened-out from surveys.

Screen-outs happen when participants are ready and willing to share their opinions - but
just at the moment they want to participate, they are told they don’t qualify. Screen-outs
happen when sample providers screen participants for one survey at a time. Many people
seeking to participate will not meet the criteria for an individual project - and will find
themselves shut out of the process.

To avoid that negative experience, it’s important to seek sample providers who offer
participants many projects for which they could qualify, screening for multiple studies at
once and thus greatly increasing the chances that people who want to complete a survey
will have that opportunity. This approach reduces screen-outs, increases participant
satisfaction, slashes the number of e-mail invitations and cuts drop-out rates.

6. What processes are in place for identifying speeders and cheaters?

Technology has created new ways for people who want to game the system. Fortunately, it
also has enabled the creation of powerful tools to protect against this type of fraudulent
activity. Research shows a very small number of participants intentionally try to cheat on
surveys. Nevertheless, it is critical that sample providers have proven techniques in place to
prevent any type of fraud that would compromise data integrity. The tools and
processes you should ensure sample providers are implementing to protect quality include:

Timestamps to flag participants who have completed a survey - or a portion of a survey - too
quickly to have provided relevant responses.

Checks to identify straightliners - participants whose answers remain static across a survey
(all As, for example) or follow the same pattern of response (such as ABCABC, etc.); a simple
flagging sketch appears after this list.

Quality-control questions to catch participants who are not paying attention, are
inconsistent in their demographic information or are not following instructions.

Matches against third-party consumer databases to confirm each panelist’s name, address
and date of birth, making sure all participants are who they say they are.

Database analyses to identify and remove fraudsters in real time. People trying to complete
surveys fraudulently - using false identities and providing answers just to collect rewards -
exhibit common behavior patterns, such as very fast survey completion times and out-of-
area IP addresses. They also tend to use their varied identities in a consistent pattern across
surveys, as well as to qualify for surveys with very different target audiences. As a result, it is
critical that sample providers have tools in place that identify these behavior patterns so they
can eliminate fraudulent responders.
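
As promised above, here is a minimal sketch of how speeder and straightliner checks might be flagged in practice; the response data, the half-the-median-time rule and the all-identical-answers rule are illustrative assumptions, not industry standards.

```python
# Sketch (invented data): flag speeders and straightliners in a batch of survey responses.
import pandas as pd

responses = pd.DataFrame({
    "resp_id": [101, 102, 103, 104],
    "minutes": [12.0, 2.5, 9.0, 11.0],            # total completion time
    "q1": [3, 4, 5, 2], "q2": [4, 4, 5, 3],
    "q3": [2, 4, 5, 4], "q4": [5, 4, 5, 1],
})

items = ["q1", "q2", "q3", "q4"]
median_time = responses["minutes"].median()

responses["speeder"] = responses["minutes"] < 0.5 * median_time      # far faster than typical
responses["straightliner"] = responses[items].nunique(axis=1) == 1   # same answer to every item

print(responses[["resp_id", "speeder", "straightliner"]])
```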

7. How are participants validated and de-duped?

In today’s world, where survey participants come from multiple sources, it is possible for
people to be invited more than once to the same survey. Therefore, often unintentionally, a
person may try to respond twice to the same questionnaire. For quality data, it is essential
that sample providers have controls in place to protect against duplicate participants.

Digital fingerprinting is one tool that is critical for preventing duplication. Digital
fingerprinting identifies each participant’s machine. This is done through watermarking (a
sophisticated type of cookie that cannot be easily removed) or through tracking multiple
data points (such as system time, screen resolution and software versions). When a person
logs on to take a survey, the machine’s ID is screened against all those already on file. If a
duplicate is found, the participant is not able to take the survey.
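
Conceptually, the fingerprint is just a stable identifier derived from several machine data
points and checked against the IDs already on file. The short Python sketch below is a
simplified, hypothetical illustration of that check; production systems combine far more
signals and handle edge cases (shared machines, changed settings) much more carefully.

import hashlib

def device_fingerprint(user_agent, screen_resolution, timezone, font_list):
    # Combine several machine data points into one stable ID (simplified).
    raw = "|".join([user_agent, screen_resolution, timezone, font_list])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

seen_ids = set()  # machine IDs already on file for this survey

def allow_entry(machine_id):
    # Block the machine if this fingerprint has already taken the survey.
    if machine_id in seen_ids:
        return False
    seen_ids.add(machine_id)
    return True

fp = device_fingerprint("Mozilla/5.0 ...", "1920x1080", "UTC-5", "Arial;Calibri")
print(allow_entry(fp))  # True on the first attempt
print(allow_entry(fp))  # False on a duplicate attempt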

In addition, sample providers should use a variety of techniques to authenticate participants.


These can include traps to identify geo-IP violations, address matching (such as, in the U.S.,
matches against the USPS postal file) and profile-specific queries that only the legitimate
participant would know how to answer.

8. How are Web partners chosen?

With multisource sampling, providers integrate information from many Web partners to
create a balanced sample. When choosing a sampling vendor, make sure it shares its
process for ensuring each partner provides quality sample. Reliable sample providers have a
consistent set of standards they apply to evaluate sources before incorporating them. They
should fully vet each source to confirm it provides a positive participant experience and
contributes to providing a fully representative sample.

9. What modes of access are available?

With all of today’s communication options, the people needed to complete a research
project can be tougher than ever to reach. They may be online or offline, wired or wireless,
Internet-savvy or Web-averse. Therefore, depending on only one mode to fill sample may
mean missing out on a critical segment of the universe.

Our multimedia world demands multimode sampling - particularly for lower-incidence
targets. Ask if sample providers under consideration can offer access through a range of
online and offline modes, as well as through mixed-access approaches. This is particularly
critical for projects with small universes, narrow parameters or hard-to-reach audiences.

10. How is science applied to ensure representative, balanced samples?

Sampling is not just about filling quotas. If the sample is not balanced, unbiased and
representative, the information it delivers can be inaccurate - and misleading. Make sure
vendors can provide methodologically-sound sample plans before beginning a job. Plans
should include solid selection techniques; detailed stratification and targeting; precise
geographic and demographic allocations; rewards that motivate; appropriate contact
methods; and active panel and community management programs.

Take the time to ask

A solid sample is a critical foundation for effective research. With so many forces converging
to transform how people seek and share information, it is important to evaluate sample
providers against new standards - ensuring they can deliver quality sample in our new
world. Take the time to ask sample providers the right questions - and demand complete
answers. Ensuring quality sample is essential to ensuring effective research results that
guide accurate business decisions.

Qualitative Research:
Adapting quantitative techniques to qualitative research
Author - Alan Kornheiser

Article Abstract

Once-academic techniques have become increasingly common in everyday quantitative
market research. This article discusses three multivariate techniques that have been
adapted for qualitative research: conjoint analysis, cluster analysis and multidimensional
scaling.

Borrowing from one to enrich the other

Once-academic techniques, such as conjoint analysis and multidimensional scaling, have
become increasingly common in everyday quantitative market research. Today, it is the rare
study that does not include at least a quadrant analysis or a set of factor scores in its report,
even if the results simplify reality beyond recognition and force the data into a Procrustean
bed of limited dimensions.

Given that such quantitative techniques, especially when improperly applied, brush
ambiguities and the small but telling detail under the rug, it may seem strange that we are
proposing a variant of their use in qualitative research. After all, the purpose of good
qualitative research is not to simplify but to enrich; not to reduce the number of key
variables but rather to develop hypotheses and generate as wide a range of possibilities as
possible. However, if we focus not on the underlying mathematics of such quantitative
techniques, which are indeed designed to simplify, but focus instead on the test
methodologies themselves, we may find ourselves with new and useful methods for
generating ideas, terminologies, and relationships in a qualitative environment.

Accordingly, we have adapted three multivariate techniques - conjoint analysis, cluster
analysis, and multidimensional scaling - for use in qualitative research. While we do not
employ mathematical reductions of the results, we do use the sorting, trade-off, and scaling
procedures inherent in these methodologies as the basis for rich idea and hypotheses
generation. What follows shows how we do this, and why.

Pseudoconjoint analysis

In a quantitative study, conjoint analysis is typically used to determine underlying
valuations. While a respondent may say, and believe, that he considers price, a range of
features, and quality to be equivalently valuable, in fact he will invariably choose to trade
one off for another at different rates. For example, price and reliability are vitally important
in choosing an automobile. Different respondents will choose different trade-offs; one will
be much more price sensitive, another far more concerned with quality. A well-designed

study can have prospective car buyers trading off price, quality, features, attractiveness,
dealer service, and many other variables in such a way as to effectively model a consumer's
buying decisions. By presenting the consumer with a deck of options (i.e., a set of cards,
each containing a different set of car descriptions) and asking him to rank order the deck in
terms of desirability, a skilled researcher can determine why a prospective buyer makes the
decisions he makes, even if the buyer himself cannot clearly express the trade-offs.

As qualitative researchers, we are interested in understanding precisely what this technique
deliberately ignores: why the trade-offs are made. While conjoint analysis argues that it can
predict buying behavior without explicitly letting buyers describe the reasons for that
behavior, we as qualitative researchers are most interested in precisely those reasons...and
much less interested in making predictions. Accordingly, if we turn conjoint analysis on its
head, we may find we can use its tools as a means of learning why decisions are made,
without actually trying to predict the decisions themselves.

This technique, pseudoconjoint analysis, is best designed to generate understanding of the
way choices are made. We begin in the same place standard conjoint analysis begins: with a
deck of options. Since this is being done in a group setting - although it works just as well in
minigroups or even in-depth interviews - a far smaller set of cards is used: six is typical,
although one might use as many as a dozen if many variables were being examined. By way
of contrast, true conjoint analyses typically use dozens of cards at a minimum.

It is vitally important that this set of cards expresses real, complex choices. While each card
need not contain all possible options (for example, one card might not discuss a car's color,
while another might simply omit the issue of reliability), the entire deck must include all
options, and it must include them in such a way as to require respondents to consider real
trade-offs; there is no point in having people decide they'd rather buy a cheap, reliable blue
car than an expensive, unreliable red car.

Respondents are then asked to sort the cards, from most desirable to least desirable. When
done in a group session, as is most common, the moderator tries to obtain a consensus -
which happens, more often than not, especially if only a limited number of cards are used.
However, almost as commonly no consensus can be reached and there will be
disagreement, as one respondent prefers this while another prefers that. This is actually the
more desirable - and certainly the more realistic - outcome.

Where conflict arises, the moderator must generate discussion. Where is there
disagreement? How important is this disagreement? Other than this disagreement, is there
consensus? The heart of such discussion is the elucidation of the extent of differences in
perceived importance of various elements and the reasons for these differences.

This is best done using standard laddering techniques. A difference over price/quality trade-
offs might, for example, be explored by asking: Why is price more important? What does
price mean to you in this context? What else? What does quality mean? What else? One
takes the resulting terms and ladders them up. If quality means reliability, why is that
important? If price means that you can afford other things, what other things? Why are they
important? And so on.

This procedure is repeated again once a consistent set of choices has been generated. What
about this choice makes it better than that one? What does such a choice mean to you?
What does that mean? Again, and so on, using standard probes. By forcing decisions, by
requiring respondents to set priorities, rich discussions about why choices are made and
how choices are made invariably result.

A good example of how this process works involves a recent study conducted for an
international airline that wished to improve its in-flight entertainment in its business and
first-class cabins. Except for a (perfectly understandable) revulsion at the types of movies
typically shown in airplanes and the usual complaints about air flight, several groups of
business travelers were unable to generate any interesting discussions about their desires
for in-flight entertainment. Worse, when shown a range of possible improvements, these
frequent fliers liked all of them and were unable to explain why one was better than
another. However, when presented with a series of possible sets of entertainment (e.g.,
individual movie screens and GameBoys vs. improved access to computer power supplies
and a non-stop stream of snacks), the respondents were able to create very clear
preferences and to discuss the reasons for their choices with great clarity. Distinct types
emerged - workers vs. sleepers vs. players vs. self-entertainers - and the way in which
travelers moved from one category to another during a flight also emerged. Note that the
pseudoconjoint was valuable not because it enabled us to find and identify these groups; it
was valuable because it catalyzed the discussions that led to these groupings, with their
needs, preferences, and language.

By forcing preferences among combinations of roughly equal value, we are able to create
rich conversations where there might otherwise be only silence.

Pseudocluster analysis

In quantitative studies, cluster analysis is a general term used to describe several statistical
techniques that group - as one might expect - similar things closely together. The technique
can group all the products that appeal to young men over in this corner and the products
that appeal to older women in the opposite corner. Because it contains some of the more
basic simplification algorithms (and, in fairness, because it is often done in only two
dimensions, which is almost guaranteed to wipe out any useful subtleties), cluster analysis is
almost the direct opposite of good qualitative analysis. However, by borrowing not the
analysis and not even (as above) the test materials of cluster analysis, but rather by
reproducing cluster analyses outputs, a rich new way of generating discussion and deriving
information is possible.

In pseudocluster analysis, the moderator simply places a large number (a dozen is often a
useful number) of products on the table: a dozen types of candy or perfume or software or
anything else being discussed. Respondents are asked to group them into as many sets as
they feel appropriate. They then discuss the reasons for their groupings - what the products
within a group have in common and what the different groups do not.

To avoid trivial results, the moderator should feel free to make this harder for the
respondents. If they initially group by color, forbid grouping by color. If they initially create
three groups with everything interesting in a center group, have them do it again using only
the products in the center group. Once they've created useful groups, forbid all the key
discriminators they've used and have them do it again. Continue until you have generated a
rich and complex vocabulary of how products differ and why.

You can then continue by asking where an ideal product would go on such a set of
groupings. Or ask where a product for a young person or an old one or one who hated TV
would fit. You can ladder from reasons for difference to reasons for choice, or from reasons
for choice to reasons for difference. The only key is that you must keep laddering...each
time a grouping becomes firm, probe to determine why that group exists, why the
differences are important, and what those differences mean.

This is a remarkably simple exercise. Respondents greatly enjoy the tactile nature of actually
maneuvering real products on the table, and good internal discussions (take notes - the
recording will miss them!) during the grouping will provide additional richness.

By giving respondents tactile objects to organize, it becomes much easier for them to find
and then discuss similarities and differences among the products.

Pseudo-MDS

The quantitative technique known as multidimensional scaling is an extremely good way to
establish the key dimensions of variability when you have no sense of the appropriate
terminology. The moderator simply asks respondents to say how similar (or different) any
two objects are, using any of several simple scales. This is repeated with all of the (or many
of the) possible pairs of objects (which can be brands of cigarettes or types of blue jeans or
makers of computers), and a sophisticated mathematical algorithm is then used to generate
the actual dimensions being used to discriminate. Interestingly, the first dimension is almost
always "how much I like it," even for objects for which liking would seem to be irrelevant or
fairly consistent.

This is a time-consuming process. One cannot, in a qualitative session, ask respondents to
evaluate multiple pairs. However, one can ask respondents to do something simpler: put
objects on a table, or on a wall, in such a way that the ones closest to each other are the
most similar and those furthest away are the most different.

Clearly, this process - pseudo-MDS - is procedurally similar to pseudocluster analysis.
However, in actual use the differences are profound. To begin with, it does not allow any
clusters. All products must be placed distinctly. Secondarily, the distance between objects is
important. While in pseudocluster analysis the final result is almost always three or four or
six or 10 piles arranged neatly on the table, in pseudo-MDS the outcome consists of
products scattered very widely, and with the distance from one to another being very
important.

It's actually hard to do this on a table, and it works best with 3x5 cards fastened to a wall.
However, a single wall allows only two dimensions, so one ideally lets respondents use
several walls...and the table...and the floor...and even the ceiling!

The key probe in this technique is not "Why are these two different?" - since all objects are
different. Rather, it is "Why are these two so much more different than those two?"

Here we are reinventing an ancient military expression: quantity has a quality all its own.
Suppose, as an easy example, we are evaluating candy, and respondents have put a very
sweet candy at one end of the room and a very tart one on the other end of the room. They
will easily tell you that they're using sweet/tart as a way to divide the candies. However, it is
an easy probe to ask why this simple dimension has become so very important...why have
they not differentiated between cherry and chocolate that way, or between inexpensive and
expensive? One obtains a key insight into what it is about this dimension that matters so
much, and one develops a richer vocabulary and sense of what matters in the category.

By declining to explicitly define which differences are important while forcing respondents
to find very explicit differences, pseudo-MDS exposes the underlying structure used to
define a category.

Not true substitutes

These pseudoquantitative tools are not, as the reader surely now appreciates, in any way
substitutes for true quantitative tools. Rather, they are a set of methodologies that borrow
from quantitative analyses to give the interviewer new ways to make his respondents stop
and think about the topic at hand. They are not magic and do not work automatically.
However, when coupled with appropriate probing, follow-up, laddering, and the
encouragement of group interaction, they offer a new way to provide understanding of the
topic at hand.

Qualitative research demands a scientific approach
Author- Martha Wilson

Article Abstract

Many people conducting qualitative research have little or no understanding of the scientific
aspect of their work. This article discusses the need for approaching qualitative research
scientifically, detailing the six basic steps involved in the process: problem formulation,
research design, sampling, data collection, analysis and reporting. The article also notes 10
guidelines that have been selected to initiate an ongoing dialogue in the field regarding
improving the quality of qualitative research.

At a recent conference, in response to a comment on the science of qualitative research, a
market researcher was heard to say, "Science? I only think of science when I think of
quantitative research." This perspective is quite common. Many of the people doing
qualitative research have little or no understanding of the "research" or scientific aspect of
their work. In fact, a significant number of qualitative researchers have no research training.

This means that day after day, year after year, decisions are being made by health care
organizations, toothpaste conglomerates, clothing retailers and an infinite number of other
businesses based on qualitative work that may not be credible research. It's information,
but is it data?

So what turns the collection of information into research? What transforms information into
data? And why is it important? The answer is straightforward. It is the use of the rules of
scientific inquiry, known as scientific method, to guide the work at hand.

If there is no scientific method used to conduct the work then it isn't research. The idea
behind research of any kind is that information based on research is more reliable and
credible than information gleaned subjectively. And yes, even qualitative research requires
scientific method for it to qualify as research.

What does the qualitative researcher (or the client seeking good qualitative research) need
to know to conduct true qualitative research? They need to understand the fundamental
principles of the scientific method and have the ability to implement them in everyday
practice.

There are essentially six basic steps involved in scientific inquiry for qualitative research:

1. Problem formulation
2. Research design
3. Sampling
4. Data collection
5. Analysis
6. Reporting

Carrying out each of these steps requires attention, knowledge and training. These steps are
intimately related and critically interdependent; without any one of them, the others are
inadequate and the work loses its status as research. The brief descriptions of these steps
that follow are designed to highlight some methodological issues and problems.

1. Problem formulation

Ideally, a great deal of thought goes into the identification and formulation of the topic to
be researched. This may include the development of an actual hypothesis to be tested or it
may involve setting the parameters for exploratory research. In either case, it must clarify
what is being measured or tested and why. It's critical to define the terms for the research
at the outset to ensure that what the respondents mean and what the researchers mean
are the same thing. Concepts such as "customer satisfaction" or "product attractiveness"
should be clearly spelled out before being included in the research instruments.

Problem formulation involves a thorough review of similar research and literature available
on the topic and then requires a systematic construction of the problem to be researched. It
is most common to specify the actual, measurable objectives of the research during this
process. Once this step is complete, the researcher is ready to begin the research design.

2. Research design

Problem formulation and research design are probably the most neglected areas of
qualitative research. "Let's do a focus group" frequently substitutes for these
comprehensive steps.

Designing the research first involves weighing the value of a variety of qualitative and
quantitative data collection techniques. The researchers choose the data collection
techniques that are most effective in meeting the research objectives with the least amount
of error and researcher bias. (Unfortunately, the economics of focus groups is more often
the reason for selecting them than their actual value in producing the most reliable data.)

Issues of reliability, credibility and replicability are considered and documented for later
inclusion in the methods section of the final report.

Having chosen the technique(s), the researcher designs the instruments, which might
include a moderator's guide, a guide to field procedures, a questionnaire, an interviewer's
guide or observational guidelines and procedures.

3. Sampling

The ability to obtain the particular sample often determines which data collection
techniques to use. The design of the sample is usually part of the research design phase. It is
so important to qualitative research and so neglected that it is prudent to highlight it as a
separate, but integrated step in the research process.

Sampling consists of designing the selection process for the study participants to determine
who gets selected, why and how. There are myriad sampling techniques but all share the
same goal: minimizing the chances of getting respondents who do not reflect the target
population. Sampling also minimizes the chances that the findings are accidental or
coincidental.

A major problem in the field is that focus group research is largely reliant upon databases
maintained by focus group facilities. In some cases, these databases have become pools of
self-selected, recycled participants, some of whom participate in focus groups and
interviews several times a month. In essence, they become "professional research subjects"
and as such, their feedback is highly suspect as they come to adapt to the focus group
culture and learn to say what they think the facilitators and clients want to hear. There is no
reason to believe such databases are in any way random or otherwise representative of the
population to be studied.

Simply put, the databases maintained by most facilities are not appropriate for scientific
sampling. This means that there is bias built into the universe used to select participants.

Sampling is designed to factor out bias and limit error in the type of respondent. The best
sample is one that both provides access and limits the possibility of including people in the
sample who shouldn't be. Note that with focus groups and most interviewing, the sample
size is too small, and is not randomized, to support generalizations about larger populations.
Even so, the sample should be carefully selected from the universe of people identified during the
formulation of the problem.

4. Data collection

Data collection involves the administration of the instruments selected and finalized during
the design phase of the project. It is done under firmly controlled circumstances prescribed
by the design to ensure consistency and replicability. This means, for example, that if you wish to
compare responses, all of the questions in an interview are asked of each interviewee in the
same way.

Focus group moderator guides are data collection instruments. Often moderators use
guides as just that, guides. This means that across groups the questions may not always get
asked in the same way with the same wording. Thus, comparative analyses cannot, from a
scientific standpoint, be made using the findings of a series of focus groups. (They are made
routinely, but are probably not accurate.) For purposes of reliability, the questions must be
asked in the same way for comparable groups.

Of course, moderators argue that the nature of qualitative research allows us a great deal
more flexibility than quantitative research. The beauty and uniqueness of qualitative work is
its lack of structure and seemingly limitless ability to explore the issues. This is not in
contradiction to the requirement for structure according to a scientifically derived method.

In fact, asking the question the same way every time provides the scientific structure and
then allows the moderator to explore the answer, once it is given, in as many creative ways
as possible. Thus, the creative aspect works hand in hand with the structure.

5. Analysis

Analysis in qualitative research is, more than any other step, not very well defined. In
quantitative research, analyses are highly reliant on statistical techniques, while in
qualitative research the most accurate form of analysis is simple description, with leeway
for subjective interpretation.

It is important for descriptive analyses to include all responses and for each response to be
characterized as equally important. There is nothing in qualitative research that allows one
respondent's answer to be more important than another's. In fact, the researcher must
guard against clients who try to prioritize the responses based on what the client likes or
dislikes, wants to hear or doesn't want to hear.

One of the most common errors in qualitative research is to fall into quantifying the
responses. It is misleading to report numbers or percentages (e.g., 80 percent felt that the
product was wonderful) because seldom are focus groups, interviews or observations
representative of the target or the general population. Generalizations cannot be extracted
about the general population from small group interviews or from focus groups. One can
assume that people similar in attitude and behavior to those in the room will hold similar
viewpoints but this requires really knowing who the participants are.

The goal of the analysis is to organize and categorize the findings in a way that increases our
understanding of the responses in the context of the population under study. This means
that the "data" must be analyzed and interpreted in the context of the originally defined
problem and research objectives.

6. Reporting

Reporting qualitative findings requires the inclusion of the purpose of the research, a
description of the research including the reasons for selecting the techniques used, a
description of the sampling techniques and a discussion of the recruitment methods. The
latter should include a brief discussion of the number and type of people who self-selected
versus those who refused to participate.

The findings of qualitative research are most accurate and effective when delivered with a
caveat regarding their usefulness. This discussion should highlight both the nature and
limitations of qualitative techniques and focus on their value in providing "flavor" and
increased understanding. The audience and/or client should be cautioned against making
major decisions based solely on qualitative findings. Instead, they should be encouraged to

combine the findings with other quantitative and qualitative research results to be sure that
they have a solid basis for decision making.

Increasing the scientific method

How can you increase the reliability and credibility of qualitative market and social
research? Increase the use of the scientific method. Does that infringe on the nature of
qualitative research and limit its creative and exploratory capabilities? No. In fact, it can
enhance these crucial aspects by providing the credibility the research and the findings
deserve.

Qualitative methods were never intended to be without science or structure. The notion
that "If it's qualitative, anything goes" defies the very fact that we're trying to conduct a
unique type of scientific research. If it is to be called research it must be based in science not
whimsy, gut feelings or budgets.

The unstructured nature of qualitative research is both its strength and its weakness. Its
strength lies in the ability to probe the respondent's thoughts, behavior, motivations and
lifestyle. It provides a rich array of information and often provides the context that
quantitative research can't. But its weakness is that its limited structure makes it subject to
a great potential for error. It's much more susceptible to researcher/client bias and
therefore requires objectivity and systematic processes.

While there are numerous things we can do to improve the quality of qualitative research,
these ten guidelines have been selected to initiate an ongoing dialogue in the field:

1. Remember that qualitative research is best for providing an understanding of the
complexities of the issue(s) at hand rather than offering conclusive findings. Both clients and
qualitative researchers must refrain from treating the findings as conclusive without
including both literature review and other research.

2. Qualitative research is most effective when integrated into a larger project which includes
a healthy quantitative component. One of the best uses of focus groups is to test the initial
drafts of telephone, mail or intercept survey instruments. This allows the client to get
feedback about the questionnaire, about whether or not participants were likely to actually
complete it and about their understanding of what each question is intended to measure.
The focus groups, being relatively inexpensive compared to the implementation of a survey,
allow the client an opportunity for refinement before investing significant time and money
fielding the instrument.

3. Many people conducting qualitative research lack training in research methods. Becoming
conversant with the scientific method through market research or social science courses in
research methods at your local community college or university can only improve
professionalism in the field.

4. Qualitative researchers must educate clients on the proper use of qualitative findings. We
can do this through our initial discussions of design, through the questions we ask about the
data collection techniques that the clients have asked us to use and through the oral and
written reports we provide. Of course, if we have the opportunity to actually design the
research, we can make it a habit to present our designs in the context of the steps of
scientific inquiry.

5. The true strength of qualitative research lies in its research design and its theoretical
framework. The findings can be validly interpreted within that framework and only within
that framework. The soundness and potential replicability of the findings are dependent
upon the steps of scientific inquiry. And, like dominoes, each step profoundly influences the
balance and integrity of the others.

6. Discuss the potential for bias with your client. Highlight the ways that the client's and/or
the researcher's preconceived ideas can produce particular results. Identify ways to
minimize them and build these approaches into the design.

7. When reporting, include in the methods section a description of the sampling technique
and the recruitment process with a discussion of the number and types of people who self-
selected into the process and the number and types of people who selected out and why.
This means carefully documenting who refuses to participate in the study. If at all possible,
obtain minimal demographics.

8. Spend more time finding out who is really in your focus group, individual or small group
interviews, observational settings, etc. Collect not only demographics but administer other
data collection techniques to find out about lifestyle, decision making, buying habits, etc.
Knowing your respondents provides a very solid context for analysis and interpretation.
Rather than trying to extrapolate to larger groups through generalizations, tell your clients
the kind of people you have as respondents and extrapolate from there.

9. Purchase sample whenever possible from reputable organizations. Avoid using facility
databases when possible. Develop healthy rationale for your sample design and stick to it.

10. If you want to know the proportion of the population that feels a particular way or
engages in a particular behavior, use quantitative research methods. The findings will
always be more reliable.

Qualitative research is finally taking its rightful place in the research arena after decades of
being frowned upon by the scientific community. To establish a permanent foothold for it in
social and marketing research, we must increase its credibility, maintain its naturally fragile
integrity and treat it as a serious form of scientific research.

Measurement Scales, Questionnaire Design and Survey Errors:
A simple solution to nagging questions about survey, sample size and
validity
Author - Susie Sangren

Article Abstract

The quality of a market analysis is judged by its validity. Unfortunately, data from non-
probability, informal sample surveys lack measurable confidence. This article demonstrates
an easy method of calculating the sample size needed for a specific market survey or
experiment.

You wouldn’t believe how many times I have been asked, "How big should my sample size
be to give a reasonable estimate of the target population?" (My answer is, "It all depends. .
.") The questioners are usually research analysts not trained in probability sampling and
statistical theory.

The quality of a market analysis is judged by its validity - in other words, how confident are
you, as a researcher, about your findings being replicated in the real marketplace? Data
collected from non-probability, informal sample surveys will not allow you to make
conclusions about the population with measurable confidence. Remember that the intent of
a survey is never just to describe the particular individuals who happen to be selected into
the sample, but to obtain a composite profile of the population.

What I am about to show you is an easy (and nonetheless robust) method of calculating the
sample size you would need for your specific market survey or experiment. The research
design is simple random sampling, and the sample size calculated is the number of
completed surveys required to achieve a certain confidence level and error rate. The
number of "completes" may be a lot lower than the number of surveys you will actually send
out, depending on your expectation of the response rate.

The beauty of simple random sampling is that it is probability-based (therefore
representative of the population, because everyone in the population has an equal chance
of being selected), and it is simple. You can use a random-number generator to pick any
sampling units out of the entire population. Simple random sampling is robust because it
can meet the needs of most managers. With probability sampling, you can report the
following two quantities to relate the accuracy of your sample estimate to the population
parameter:

Sampling errors: How close is your sample estimate to the true population number? A
typical answer may be, "The population number is within ±3 percent of the sample
estimate." Naturally, the smaller the sampling error you want, the larger the sample size you
will need.

Level of confidence: How confident are you about your one-sample estimate in repeating
itself through repeated samples? An answer may be, "I am 95 percent confident that the
population number is between A and B." The larger the confidence level you want, the
larger the sample size you will need.

The sample size should be determined before other survey considerations such as: what
questions you should ask; what response rate you can expect; and how, and by whom, the
data should be collected. There are two ways to approach sample-size problems:

1) You have already decided on the confidence level and the sampling error requirements,
now you want to know the sample size;

2) You have decided on the sample size and the confidence level required, now you want to
know the error rate of your sample estimate.

To solve Problem One for the sample size, I begin by assuming the following, rather limited,
conditions:

All my survey questions have the yes/no type of dichotomous answers.

My absolute error-rate (E) requirement is 3 percent. (The true population number is within
the range of ±3 percent of my sample estimate.)

My confidence level (C) requirement is 95 percent. (I want to be sure that my population
number estimated from one sample can be repeated 95 times out of 100 samples.)

My first guess at the percentage estimate for the "yes" answer in my sample for a particular
question (P) is 35 percent.

The sample size (N) calculation formula is simply:

N = square of {square root of [P x (1-P)] / (E / std(C))},

where "std(C)" is the equivalent of confidence level, expressed in terms of standard


deviation. I list below three widely acceptable levels of confidence, and their standard-
deviation counterparts:

1. 68 percent confidence level - The population number is within plus or minus one standard
deviation of my sample estimate.

2. 95 percent confidence level - plus or minus two standard deviations. It is the most
popular level.

3. 99.7 percent (almost 100 percent) confidence level - three standard deviations.

Now, let’s substitute all the known quantities into the size calculation formula to solve for N:

sq. rt. of [0.35 x (1-0.35)] = 0.4770

E / std(C) = 0.03 / 2 = 0.015

N = (0.4770 / 0.015) ** 2 = 1,011

Therefore, the required survey sample size is 1,011, for a 95 percent confidence level and a
tight error bound of ±3 percent. Exhibit 1 shows the calculated sample sizes under various
levels of sampling error rates and estimated "yes" percentages, all at 95 percent confidence
level by simple random sampling.
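
As a cross-check, the same arithmetic is easy to script. The short Python sketch below simply
reproduces the calculation above under the same assumptions (P = 0.35, E = 0.03, std(C) = 2
for roughly 95 percent confidence); it illustrates the formula and is not a substitute for a full
sample design.

import math

def sample_size(p, e, z):
    # N = ( sqrt(p x (1-p)) / (e / z) ) squared, for simple random sampling.
    return (math.sqrt(p * (1 - p)) / (e / z)) ** 2

n = sample_size(p=0.35, e=0.03, z=2)  # z = 2 corresponds to ~95 percent confidence
print(round(n))                       # about 1,011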

To solve for Problem Two for the error rate, I have already been given a sample size, say,
1,011 (N), and the confidence level, say, 95 percent (C). Using the same formula, converting
the confidence level (C) into an appropriate standard deviation, std(C), and assuming that
my sample percentage of the "yes" answer (P) is 35 percent, my sampling error rate will
again be calculated as ±3 percent. Remember that increased sample size generally means
increased survey reliability, which must be traded off with increased cost and time.

Exhibit 2 shows the calculated sampling errors under various sample sizes and estimated
"yes" percentages, all at 95 percent confidence level by simple random sampling.

Notice also that when P=0.5, or 50 percent, the value of [P x (1-P)] is at the maximum. What
this implies is that, the more unsure I am about the survey outcome (i.e., the percentage
estimate for the "yes" answer, P, would be close to 50 percent - I am only certain half the
time), the larger the sampling error will be.

Going back to the Problem Two scenario, and changing my sample estimate for the "yes"
percentage from the earlier 35 percent to 50 percent, now I would calculate a slightly larger
sampling error (3.145 percent versus the earlier 3 percent):

sq. rt. of [0.50 x (1-0.50)] = 0.50

sq. rt. of 1,011 = 31.7980

E = (0.50 / 31.7980) x 2 = 0.03145 (or 3.145%)
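
The Problem Two direction can be scripted the same way; the sketch below reproduces the
numbers above under the same assumptions (N = 1,011, P = 0.50, std(C) = 2).

import math

def sampling_error(p, n, z):
    # E = z x sqrt(p x (1-p)) / sqrt(n), for simple random sampling.
    return z * math.sqrt(p * (1 - p)) / math.sqrt(n)

e = sampling_error(p=0.50, n=1011, z=2)
print(round(e * 100, 3))  # about 3.145 percent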

Finally, I may want to enlarge the calculated sample size (done somewhat subjectively)
because:

1. My survey contains questions with multinomial answers. In such a case, I will pick the
question with the highest number of answer categories to estimate my sample size. The
resulting size should be good for the entire survey.

2. I have to take into consideration the non-response rate.

3. I want to ensure that when I crosstabulate one variable with another, I would have
enough data in each cell.

Understanding data requires recognition of types of error
Article Abstract

Error is unavoidable in market research. But understanding the different types of error, and
how they come about, can improve surveys and improve data.

Every survey that is based on a sample from a large universe is subject to two different
types of error, which relate to very different ways in which survey results can yield a
misleading picture.

"Error," says Alan Roberts, former manager of market research, Wayne Seed Division of
Continental Grain Co., Chicago, "are factors which may cause the picture portrayed by the
sample to differ from the picture that would have emerged if a completely accurate count
(U.S. Census) had been made of the universe from which the sample was drawn.

"These two types of error are called sampling error and for want of a better word, non-
sampling error," says Roberts. "Sampling error relates to the reliability of data; non-
sampling error relates to the validity of data."

Reliability

Reliability is a concept like repeatability, says Roberts. That is, if you keep repeating, in all
executional details, your first survey, a technical statement can be made that results will
probably fall within a certain range, that numbers generated will have a degree of stability, a
certain percent above or below what the first survey reported.

"Note that this has nothing whatsoever to do with how accurately your survey reflects the
real world out there, the world of everybody that your little survey did not communicate
with," says Roberts. But that limitation never prevents researchers from making what they
call "confidence statements" about the "statistical significance" of their numbers.

The confidence they speak of, such as 90% or 95% or 19 chances out of 20, comes only from
probability theory. It enables researchers to make very impressive statements that
differences in numbers generated by a survey are either significant (i.e., outside the range of
numbers one would expect on a chance basis, given the sample size) or not significant (i.e.,
within the expected range).

Says Roberts, "This is all good and well, but survey research is used to guide decision-making
by management. What management needs is a true picture, a true road map or blueprint,
of a given market, and/or of the purchase processes that drive that market. There is only
very limited value in management knowing that findings of a first survey would probably be
very similar to those of a second survey, if it were identically conducted. Such knowledge
begs the issue of whether the survey methodology was any good in the first place. In other

words, statements of statistical significance beg the issue of data validity and hence its
usefulness.

Types of error

One can scarcely list all possible types of non-sampling error, all the ways that a sample
survey can yield misleading data, all sources of invalid information about a target market
that can be associated with sample surveys. Just a dozen such types are listed here:

(1) Non-probability sample, which is by far the most common type of sample used and puts
"up for grabs" the issue of degree to which the sample of convenience actually used reflects
or fails to reflect the universe (or market) that management seeks to gain information
about.

(2) Non-response, even when at an "allowably" low rate such as 15 or 20%, creates doubt
(seldom addressed in research) as to how survey results would have changed if
non-respondents had all, in fact, participated in the survey. In many procedures, such as
widely used intercept surveys in shopping malls, no information at all is available about
refusals, and there is no basis for learning more about non-response.

(3) Response by a non-targeted individual can arise in by-mail surveys when the
questionnaire is executed, or influenced, by a person other than the addressee (e.g., a
family member).

(4) Interrespondent bias can occur in by-mail surveys, as when neighbors participating in the
same survey get together, but more commonly occurs with research done in any theater-
type setting where respondents sit side-by-side as they execute self-administering
questionnaires. Or, during one-on-one interviewing in a public area, a subsequent
respondent may overhear questions and answers from the interview with a prior
respondent.

(5) Respondent "yea-saying" is a widely encountered phenomenon. It is based on a
psychological need, more strongly felt by some individuals than by others, to please the
interviewer by answering according to how the respondent senses the interviewer would
like to have a question answered.

(6) Respondent fatigue may arise early or late in an interview, but more likely toward the
end. Fatigue is a euphemism for unrest, since the respondent need not become physically
tired and it does not require a two-hour interview process for such unrest to arise.
Commonly, interviews are solicited with explicit or implied promise that they will be brief
and/or easy. If, at any point, the respondent concludes that the interview has gone beyond
his/ her expectations, termination may occur. But, more likely, the respondent will be too
polite to cut off the interviewer and will simply begin to answer whatever comes to mind
that will more swiftly conclude the interview. Quality of data deteriorates in that process.

(7) Questionnaire bias can involve either construction (sequence of questions) or phrasing.
Order bias is a special issue that can occur within a question. Professional researchers are
usually competent enough to avoid the more obvious types of questionnaire bias, but when
operating management starts hanging "whistles and bells" on the professional's
questionnaire draft, much bias can creep in. Even in otherwise unbiased questionnaires,
some order bias may be unavoidable, as when sample size or other cost factors do not
permit rotation of listing order to the fullest extent needed to avoid any possible bias.

(8) "Iffy" questions that yield "soft" data, i.e., data of low predictive or descriptive value,
abound in questionnaires. Most notorious is the almost universal five-point "intent to buy"
questions (definitely would buy/ probably would buy/might or might not buy/probably
would not buy/definitely would not buy). Any question that asks for more than a
respondent's actual (past) behavior and/or current opinions tends to be "iffy."

(9) Questions outside the respondent's qualified range of personal knowledge or interests,
the researcher hopes, will be answered "don't know." Unfortunately, many respondents feel
that admitting ignorance about a subject may undermine their self-image. So they prefer to
guess, and their answers are tabulated right along with those of knowledgeable
respondents.

(10) Interviewer bias can be insidious, especially in surveys where interviewing is not
centrally controlled. Personal, one-on-one interviewing is a situation permitting overt or
subtle exercise of influence by the interviewer on response pattern. This may occur with
minor rephrasing of a question by the interviewer, tone of voice, facial expression, anything
that clues the respondent as to an expected answer. Often, after completion of several
interviews, the interviewer begins to expect a certain response pattern and may, without
fully appreciating it, communicate that expectation in the course of subsequent interviews.

(11) Interviewer cheating need not be of the most egregious (and easily detected) sort that
involves reporting of many totally fictitious interviews. It can also be more limited or subtle,
as when an interviewer who has skipped a question or two, or experienced a termination
just before asking the last couple of questions, yields to the temptation of raising her
completion count by inventing a few brief answers here and there, after the interview. Or,
the interviewer may find an apparently cooperative would-be respondent who fails to meet
respondent qualifications specified in the survey design and yields to temptation of
completing that interview after falsifying one or more questions on the qualifier.

(12) Simple incompetence in data gathering is probably a bigger source of invalid data than
actual cheating, although both stem from the same root cause: interviewers tend to be
poorly trained, part-time people, often grossly under-compensated given the importance of
what they do. Sloppy interviewing techniques can take many forms, including misrecording
answers, failure to probe, skipping or rephrasing questions, asking questions or reading lists
out of required sequence and failing to qualify respondents.

Probable validity

Of course, these examples do not exhaust all possible types of non-sampling error, and they
do not address problems of maintaining data quality across the edit, code and tabulation
stages, says Roberts. They are set out only to underscore how much false security may be
involved when management accepts sample survey results in an uncritical way, on the basis
of a researcher's confidence statement about the statistical significance of various reported
totals.

Far more salient to the success of management decision-making is the need for
management to assess the probable validity of survey data referred to in decision-making,
to dig into the design, methodology and controls used in the survey, to satisfy themselves
that data reported likely will supply a reasonably accurate picture of the market and market
segment that it purports to measure.

Someone once drew an analogy between total survey error and the hypotenuse of a right
triangle, where the other two sides represent sampling error and non-sampling error, says
Roberts. That is, the hypotenuse must be longer than either of the other two sides, because
it is the square root of the sum of the squares of the other two sides (e.g., the total survey
error hypotenuse is five when the sides are three and four).

That metaphor is useful because, first, it focuses attention on possible error other than that
implicit in every sampling process and, additionally, it positions total survey error as
necessarily something greater than sampling error alone.

Unfortunately, the metaphor is also a bit simplistic and misleading. Sampling error can be
(and should be) stated with quantitative precision; all other sources of error - any factor
tending to undermine data validity - are too diverse to permit quantification and require
qualitative assessment by professionals whose skills extend into many areas besides
probability statistics.

Assessing impact

In the real world, the confidence statement of statistical significance seems so scientific that
the difficult, often messy, process of assessing impact of non-sampling error is all too easily
overlooked. We are not used to thinking of a right triangle with two sides in ratio of one to
20 or 30. Yet, in terms of usefulness of survey findings, when the sampling error side is one
and the non-sampling error side, if it could be quantified, would turn out to be many times
larger, management can be really "blind-sided" by the size of the hypotenuse.
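
The arithmetic behind the metaphor is easy to verify with made-up magnitudes, as in the
Python sketch below: the classic 3-4-5 triangle, and the lopsided case just described, where
a non-sampling side twenty times the sampling side all but determines the total.

import math

def total_error(sampling_error, non_sampling_error):
    # Hypotenuse analogy: total error = sqrt(sampling^2 + non-sampling^2).
    return math.hypot(sampling_error, non_sampling_error)

print(total_error(3, 4))   # 5.0 -- the classic right-triangle example
print(total_error(1, 20))  # about 20.02 -- dominated by non-sampling error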

An analysis of the impact of survey scales
Author - Adam S. Cook

Article Abstract

Adam Cook examines many options for survey scales and offers some research-on-research
that explores the effects of various scale point ranges.

Survey scales are important because they help differentiate the degree to which people feel
toward certain questions. Yes-or-no responses are not always an option in consumer
perceptions and feelings. But what’s the best numeric scale to use in analyses and what
scale is easiest for the respondent to interpret?

If it’s a paper or phone survey, sophisticated and/or easily-misinterpreted scale questions
can be a real challenge. A scale can’t have too many radial points (e.g., you don’t want to
ask on a scale of 0 to 100, where 0 = x and 100 = y) or too many descriptors defining each
point (e.g., completely, somewhat, rarely, never, neutral, etc.). If we don’t include a number
of options, our ability to analyze differentiation in responses becomes more limited (e.g., a
1-to-3 scale doesn’t give us a whole lot of information to differentiate between responses).
See Figure 1 for varying examples of scale questions.

Thankfully, interactive online surveys exist and they have some real untapped potential for
finding the sweet spot between maximizing participation and analytic reliability or
differentiation. Given the challenges presented in traditional collection methods, I’m going
to focus on the ideal for interactive online scales.

Even vs. odd number of scale collection points. The options are many: 1-to-10 (10 options)
or 0-to-10 (11 options); 0-to-7 (eight options) or 1-to-7 (seven options); 0-to-5 (six options)
or 1-to-5 (five options). I’ve usually heard the value of even-numbered scales is that they
require respondents to choose or lean toward one of the extremes on the scale presented. I
understand the desire to acquire definitive feelings but the truth is that some people have
no feeling, one way or another, toward certain things (an inconvenient reality to some
decision makers). I believe eliminating the neutral option interjects bias into the results and
ultimately the analysis. I feel it’s “extremely important” to use odd-numbered scales, ones
that have the option to select a true mid-point of neutrality. But sometimes you have to
deal with the scale that may be given to you to analyze. We’re not always in a position to
choose.

Number of scale points: three, five, seven, nine, 11, 13 . . . ? Well, we know one scale point
isn’t an option and I’m ruling out even-numbered scales. Here’s what I do know (from the
book Marketing Research: Methodological Foundations): “Research indicates a positive
relationship between the number of scale points and reliability.”

Having a large number of scale points is important for analyses. If you’re using a radial point
scale collection method, having a scale exceeding 0 to 10 or 1 to 11 (11 points) can look
overwhelming. With a radial scale point display, I would recommend not exceeding 11. It’s
important to note: If you are going to display the numbers on the scale, your maximum
scale should probably be 0 to 10. Many books have been written on the importance of this
scale and its ease of translation to potential respondents. Zero is typically defined as bad
and 10 is usually associated with the highest of marks. A scale of 1 to 11 wouldn’t work
because 11 is not commonly associated with perfection or rankings. Sometimes, without a
clear definition, 1 is considered the best as well (very similar to defining an ace as high or
low in card games). This lack of a clear association may create confusion.

Here’s where an interactive scale can help us overcome participation and visual fatigue. The
use of sliding-scale displays enables us to remove the need to display numbers (see options
five and six in Figure 1 for visual examples of sliding scales). With the numbers coded into
the background, you need not worry about confusion or overwhelming radial point displays.
In fact, the scale coded into the background is ultimately up to the analyst developing the
interactive survey. The sliding scale can even get us a scale greater than 11 points to help
maximize reliability. Technically we would be limitless, but 0 to 1 billion sounds like a bit
much. If you don’t like the idea of 101 points spanning 0 to 100, simply create 101 points
spanning 0 to 10. It’s simply moving the decimal point to the tenths (e.g., 0, 0.1, 0.2, 0.3 …
all the way up to 9.8, 9.9, 10). I have yet to see this offered as a scale option but I’d love to
have this capability. This brings us to our last quandary of where to start our scale.
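
Before moving on to that quandary, here is a minimal sketch (in Python, with a hypothetical function name) of the 101-point idea just described: the slider silently records a position from 0 to 100 and the analyst recodes it to a 0-to-10 scale with one decimal place.

    # Hypothetical recode of a hidden 0-100 slider position to a 0-to-10 scale
    # with one decimal place (101 distinct points: 0.0, 0.1, ..., 10.0).
    def slider_to_scale(position):
        if not 0 <= position <= 100:
            raise ValueError("slider position must be between 0 and 100")
        return round(position / 10.0, 1)

    print(slider_to_scale(37))    # 3.7
    print(slider_to_scale(100))   # 10.0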

Starting a scale with a 1 or a 0. If you’re displaying numbers, it’s actually pretty arbitrary in
terms of what you use as long as the numbers are clearly defined. It’s ultimately the number
of scale points that dictates the strength of your analysis. A 1-to-10 scale is essentially the
same as a 0-to-9 scale or, as crazy as it may sound, a 2-to-11 scale. I would hope it’s self-evident
that if you’re using a low-end descriptor of “not at all, none, never” or anything
that’s a definitive null, then 0 is the best number to start with on the scale. I really don’t
have a case for using 1 and I’m not completely sure why scales do start with a 1 for display
or analysis purposes. Until I hear a solid case or rationale for using 1 to start the scale, I’m
going to stick to 0 when given the option. Data collection with 0-to-10 also has the easiest
conversion to percentage analyses.

Obviously the choice of scale is ultimately yours, but it is worth considering the implications
of that choice. Figure 2 represents my preferences.

Scale analysis options and pitfalls

Why median scores are a bad idea. Medians are good for analyses that incorporate
extreme outliers in data. Household income is probably the best example of when to use a
median in analysis. One billionaire can make an average income analysis skyrocket. Since
we’re analyzing a scale, there’s a distinct and established range, so extreme outliers aren’t a
concern. Even so, the percent differences between an analysis using averages and one using
medians can be significant.
See Figure 3 for a random example of 100 respondents analyzed using medians versus
averages.

The average analysis was 19 percent higher than the median analysis in 2011 and 11 percent
lower in 2012. Conversely, when you analyze the change from 2011 to 2012, we see a 12
percent increase in average scoring, while the median analysis shows a 50 percent increase.
If this isn’t enough to put the nail in the coffin of median analyses on scale questions, I don’t
know what is.
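
As a rough illustration of the problem, the sketch below uses made-up 0-to-10 scores (not the actual data behind Figure 3) to show how a median can sit still while the average moves.

    # Hypothetical 0-to-10 scores: the average shifts between waves,
    # while the median does not move at all.
    from statistics import mean, median

    wave_1 = [2, 3, 5, 5, 6, 6, 7, 8, 9, 10]
    wave_2 = [4, 5, 5, 6, 6, 6, 7, 7, 8, 9]

    print(mean(wave_1), median(wave_1))   # 6.1 6.0
    print(mean(wave_2), median(wave_2))   # 6.3 6.0 -> the median hides the change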

Why rounding averages is another bad idea. In Figure 4’s example of two different average
scores, where Group A = 4 and Group B = 5, Group B’s average score is 25 percent higher
than Group A’s. What if Group A’s average was actually 4.49 and Group B’s was 4.51? The
difference would be minimal. What if Group A’s average was actually 3.50 and Group B’s
was 5.49? Group B’s average score would be 57 percent higher. The range in difference is
anywhere from nearly 0 percent to 57 percent. This would indicate to me that it’s a bad idea to
round results in analyses. See Figure 4 for examples of the impact in rounding averages.
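
The arithmetic behind that range is easy to check; this short sketch uses the same hypothetical averages quoted above.

    # Percent difference between Group A and Group B, rounded vs. unrounded.
    def pct_diff(a, b):
        return (b - a) / a * 100

    print(round(pct_diff(4, 5)))           # 25 (the rounded averages)
    print(round(pct_diff(4.49, 4.51), 1))  # 0.4 (nearly identical underlying averages)
    print(round(pct_diff(3.50, 5.49)))     # 57 (these still round to 4 and 5)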

Why percentage groupings can actually misrepresent results. I’ve seen reports and analyses
that group numbers together from a scale question. See Figure 5 for an example of grouping
7s or higher on a scale of 0 to 10. All of the examples used in Figure 5 have 25 percent
scoring a 7 or higher.

It didn’t occur to me until recently how inaccurate these groupings can be in analysis. When
you start creating a number of different scenarios, some random and some extreme, the
variations are a wake-up call. In Figure 5, Example 3 Minimum and Example 4 Maximum
share the same 25 percent scoring a 7 or higher, but Example 4’s average score is 19 percent
higher than Example 3’s.

At its most extreme in scoring, shown as “Extreme (-)” or “Extreme (+),” the maximum can
be 300 percent higher than the minimum. See Figure 6 for additional analysis comparisons
in averages (non-rounded) versus the grouping method for 50 percent and 75 percent
scoring 7 or higher on a scale of 0 to 10.
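
A quick, purely hypothetical sketch (not the actual Figure 5 data) makes the point: two samples can share an identical “percent scoring 7 or higher” and still have wildly different averages.

    # Two made-up 0-to-10 distributions with the same 25% top-box score.
    from statistics import mean

    low_group  = [0] * 75 + [7] * 25     # everyone else scores 0
    high_group = [6] * 75 + [10] * 25    # everyone else scores 6

    top_box = lambda scores: sum(s >= 7 for s in scores) / len(scores)
    print(top_box(low_group), mean(low_group))     # 0.25 1.75
    print(top_box(high_group), mean(high_group))   # 0.25 7.0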

Converting the scale analysis into percentage representation. Scale analysis can be
converted into a percentage analysis (as already seen in Figures 5 and 6). If you’re using a 0-
to-10 scale, simply moving the decimal point converts your average scores into percentage
representation. At its lowest, the entire sample giving 0s equates to 0 percent and at its
highest, the entire sample giving 10s equates to a 100 percent. If you’re using a scale other
than 0-to-10, you’ll need to use a less-obvious conversion formula. The conversion formula
for varying scales other than 0-to-10 can be found in Figure 7. For an example of scale
impact and conversion on a scale of 1-to-10, see Figure 8.
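
For scales that don’t run from 0 to 10, a generic linear rescaling does the job; the sketch below is one common form of that conversion and may or may not match the exact formula shown in Figure 7.

    # Convert an average score on any min-to-max scale to a 0-100% representation.
    def to_percent(avg, scale_min, scale_max):
        return (avg - scale_min) / (scale_max - scale_min) * 100

    print(round(to_percent(7.3, 0, 10), 1))   # 73.0 -- on 0-to-10, just move the decimal
    print(round(to_percent(7.3, 1, 10), 1))   # 70.0 -- a 1-to-10 scale needs the formula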

Not all analyses are equal. In fact, some can be downright deceptive. My advice: When you
can, use 0-to-10 scales, conduct average (non-rounded) analyses and convert to percentage
analyses when needed.

Quant or qual, let’s go back to the basics
Author - Kevin Gray

Article Abstract

Research is more than procedures and computations. It's also a way of thinking. Kevin Gray
provides steps for getting back to the basics of research quality.

Although marketing research may not be "hard" science, as researchers it is our professional
obligation to strive for scientific rigor to the best of our abilities within the constraints under
which we work. Some methodological fine points are likely to have little or no impact on the
client's decisions. Other kinds of details may seem trivial or even geeky at first glance but
are actually consequential and marketing researchers must be wary of conflating the two.

I would like to offer some thoughts on what I call research thinking. Written from my
perspective as a quantitative researcher, I believe this article will have relevance to most
kinds of marketing research, including qualitative and big data analytics. Marketing research
methodologies have been adapted from disparate fields and pieced together by
practitioners over the course of several generations. There surely are good and bad
practices but no absolute best practice, despite occasional assertions to the contrary.

Verifying data quality

Errors of many kinds can sneak into our data and therefore, data checking, cleaning and
other "janitorial work" should never be skipped. We should be reasonably confident the
data contain no serious errors and actually mean what we think they mean. Consumers do
not necessarily interpret survey questions in exactly the same way as the marketing
researchers who write them do. International research is more prone to misunderstandings,
not only due to translation errors but because some concepts don't travel well across
cultures and cannot be communicated precisely. This is particularly true when questions
pertain to values, attitudes and lifestyles. Unfortunately the frantic pace of today’s business
world often causes flaws in questionnaire design to be detected only after fieldwork has
been completed.

We also need to exercise care when interpreting customer records and other big data,
which can be messy and confusing. Even building a traditional data warehouse is rarely a
simple task, in part because the various parts of an organization have diverse requirements.
Data definitions are often ambiguous and it's not uncommon to discover two or more data
fields that are almost but not exactly the same, and we must decide which to use. These
janitorial tasks may seem like unappealing grunt work and not part of your formal job
description but are really important steps to ensuring your data are error free.

Correlation versus causation

Scientists need to be careful about inferring cause-and-effect relationships. In most
marketing research we are making causal links but often are not consciously aware that we
are doing so.

Patterns we spot may result from any number of factors, including those we are unable to
measure and those we are unaware of. Consumer groups are often non-equivalent in
important ways before we compare them and, since differences between consumers have
not been "randomized away," conclusions about causation are usually more problematic
than in experimental research. Instead, we must make our causal deductions based upon
associations, though this entails risks. "Correlation does not imply causation" is a warning
drilled into future statisticians in the classroom and often cited in the business media these
days.

Defining relationships

Associations can also be spurious. For example, if a correlation between sales of ice cream
and sales of sunscreen were found it probably would be the result of weather and seasonal
marketing activities, since it is improbable that the sales of one caused sales of the other.
There are also interactions, in which the relationship between two variables is moderated by
other variables. For example, the relationship between age and product evaluations
may depend to some degree on gender, and vice versa. Another kind of relationship is
reciprocal causation, whereby one variable influences a second and that second variable, in
turn, affects the first variable. A case in point is when raising awareness of a brand increases
purchase of it, which leads to greater awareness of the brand since people are more apt to
recall brands they often use than those they use infrequently.

There are still other trick pitches data can throw at us. For example, correlations between
two variables can actually be masked by other variables and appear to be small unless
statistical adjustments are made that remove noise from the relationship. Also, curvilinear
relationships among variables will be obscured by the use of the standard correlation
coefficient and two variables may appear unrelated when in fact they are strongly
associated, though just not in a straight-line fashion.
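
A tiny illustration of that last point (a Python sketch, NumPy assumed): a variable that is perfectly determined by another can still show a near-zero Pearson correlation when the relationship is U-shaped.

    # y is completely determined by x, yet the linear correlation is about 0.
    import numpy as np

    x = np.linspace(-3, 3, 61)
    y = x ** 2
    print(round(float(np.corrcoef(x, y)[0, 1]), 4))   # approximately 0.0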

When data are collected over time – for example weekly sales and marketing data – causal
relationships can sometimes be easier to unravel since cause must logically precede effect.
For example, some marketing activities are not intended to have an immediate impact but
are correlated with sales in later periods. In our more typical cross-sectional research,
however, the data have been collected during one period and a time dimension is lacking.

With either cross-sectional or time-series data, multivariate analysis can help untangle
causal relationships by statistically accounting for potential confounders but this is rarely
easy and different statistical methods and models can give us very different readings. While
often exceedingly useful, it should not be conducted mechanically or on the fly.

Causation requires correlation of some kind but correlation and causation are not the same.

Understanding (and avoiding) data interpretation traps

There are other ways in which we can be led astray when interpreting data. A very serious
example would be when the consumers who have completed our survey are atypical and
their opinions dissimilar to those of our target population. In truth, it is nearly impossible to
obtain a sample of completed interviews that is perfectly representative of our population
of interest. This does not imply, however, that all surveys are more or less the same.
Representativeness is a continuum and at some point the lack of representativeness will
begin to shape decisions.

Though this should be Marketing Research 101, the difference between focus groups and
survey research is much more than sample size. Even if we were to assume that
representativeness is no more of a concern with focus groups than with surveys, respondent
interactions, group dynamics and the data make the two methodologies fundamentally
different. Text analytics software cannot make them the same.

Regression to the mean is a statistical phenomenon that is not intuitive for most of us and I
will defer to Wikipedia's concise definition of it: "... if a variable is extreme on its first
measurement, it will tend to be closer to the average on its second measurement – and,
paradoxically, if it is extreme on its second measurement, it will tend to have been closer to
the average on its first." This has bearing on marketing research and our classifying of
consumers as heavy, medium or light purchasers is one example of where this comes into
play. Independent of our marketing efforts, some of the consumers we have put into our
heavy bucket if checked on a later occasion would have lower purchase frequency and some
light purchasers, conversely, would have higher purchase frequency.
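
A small simulation with made-up numbers shows the effect: consumers placed in the "heavy" bucket on a noisy first measurement look noticeably less heavy on a second measurement, even though nothing about their true behavior has changed.

    # Regression to the mean: stable true purchase rates plus measurement noise.
    import random
    from statistics import mean

    random.seed(1)
    true_rate = [random.gauss(10, 2) for _ in range(10000)]   # stable behavior
    wave_1 = [r + random.gauss(0, 3) for r in true_rate]      # observed, with noise
    wave_2 = [r + random.gauss(0, 3) for r in true_rate]

    heavy = [i for i, v in enumerate(wave_1) if v >= 13]      # "heavy" by wave 1
    print(round(mean(wave_1[i] for i in heavy), 1))   # well above 13
    print(round(mean(wave_2[i] for i in heavy), 1))   # pulled back toward the overall mean of 10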

Statistical significance testing

Data dredging can be hazardous to our professional health as it's not hard to find an
interesting pattern and assume it is real when it is actually just a chance result. Statistical
significance testing, if used advisedly, can be helpful in screening out fluke results but there
are risks in over-relying on it. Significance testing assumes probability sampling and
measurement without error, two assumptions that are usually not met in the real world of
marketing research. Apart from that, we should not let significance testing do our thinking
for us and should instead first ask ourselves if a difference or correlation we have found is
large enough to have practical significance. If not, it does not matter whether it is
statistically significant. Moreover, I also have found that patterns of results are more
enlightening and trustworthy than examining masses of significance tests.

Big data

Thus far, big data may not have made life easier for marketers. David Hand, a former
president of the Royal Statistical Society, describes what he calls the law of truly large
numbers saying, "with a large enough number of opportunities, any outrageous thing is
likely to happen." With a gigantic number of customer records and variables, for example,
significance testing is seldom helpful in flagging chance results and, in any event, the more
we look the more we will find ... though what we find might be fleeting. The signal-to-noise
ratio can be very small in big data.

Related to this is "HARKing," or hypothesizing after the results are known – something of
which we need to be mindful. An example is the imaginative use of crosstabulations in
marketing research. The ways in which we define consumer subgroups are often fairly
arbitrary, even with something as basic as age group. It is not sound or ethical practice,
however, to redefine these groupings after having looked at the data in order to find
something to please or impress the client.

Models and reality

In analytics there often is a trade-off between explaining and predicting. Many models
provide us with a good understanding of why patterns occur – for instance, why certain
segments of consumers buy certain brands for certain occasions – but some of them are so
complex that they don't predict the behavior of a new sample of consumers that well.
Conversely, some algorithms predict well but are not intuitive and cannot be easily
explained in non-mathematical language. This quandary is not present in all research but is a
frequent challenge modelers must face.

Models are not reality, only simplified representations of reality. In The Grand
Design, Stephen Hawking and Leonard Mlodinow outline what they term "model-dependent
realism" and conclude that if two physical theories or models accurately predict the same
events "one cannot be said to be more real than the other; rather, we are free to use
whichever model is most convenient." Most statisticians I know would not find this a
controversial statement. It is not at all unusual for two or more models to provide an
equivalent fit to the data but suggest very different interpretations and implications for
decision-makers. Settling on which model to use, or whether to go back to the drawing
board, should not be made solely on the basis of criteria such as the BIC or cross-validation
figures. This does not imply of course that the decision should rest on purely subjective
considerations.

GIGO – garbage in garbage out – may be one of the handiest acronyms ever devised.
However sophisticated, a mathematical model won't be helpful if it is based on data that
isn't relevant to our problem or if the data cannot be trusted.

Probabilities versus categories

We humans love to categorize and are strongly inclined to think dichotomously, which is
perhaps why we also love to quibble so much about definitions. Categorization can be
useful, especially when quick go/no go decisions are absolutely required and hard evidence
is scant, but this mode of thinking can introduce rigidities and encourage bad decisions.
Though it does not come naturally to us, thinking in terms of probabilities, especially
conditional probabilities, will often lead to better decisions.

Keep an open mind but keep it simple

Confirmation bias is a very human tendency to search for or interpret information in ways
that confirm our preconceptions. On the other hand, we also need to be wary of falling into
the trap of assuming that an exotic theory is necessarily a valid one! Simple answers, even if
boring, are more likely to be true than elaborate ones. We marketing researchers are
frequently guilty of both these cognitive sins without being aware of them and without any
conscious attempt to skew the results of our research or find something sexy. Avoid
confusing the possible with the plausible and the plausible with fact. It's also not difficult,
though, to miss something of genuine practical significance that lies hidden beneath the
surface of our data, so caution in both directions is urged.

A few more tips

• Do your homework. Many phenomena have more than one cause and I would urge you to
integrate data and information from diverse sources and to adopt a holistic perspective with
regard to analytics. Printing "Think Multivariate!" on our T-shirts may be impractical but
nevertheless it's a useful mind-set for marketing researchers to have.

• As a rule marketing research is most valuable when motivated by specific business
objectives and when the research design and interpretation of results are closely tied to
these objectives. Note that research need not be immediately actionable and can also add
value by providing context, for example in market entry feasibility studies.

• When designing research, first consider who will be using the results, how the results will
be used and when they will be used, and then work backward into the methodology. Don't
let the tools be the boss.

• Develop hypotheses, even rough ones, to help clarify your thinking when designing
research. These can be formally tested against the evidence when data become available.

• Take care not to over-interpret data. I have witnessed instances in which detailed
profitability calculations have been made based on data that should really have been
interpreted directionally – not as precise figures – or even ignored.

• When you observe a pattern of potential interest in the data, before jumping to
conclusions it's best to ask yourself a few basic questions:

Is this pattern actually real?

Is it strong enough to be meaningful from a business point of view?

If it is, what are its business implications?

What could plausibly have caused this pattern? Are there other likely causes?

Do I have real evidence that what I think are the causes are the actual causes?

• Remember that a decision made too slowly is a bad decision ... but a bad decision made
hastily is not a good decision either.

• Be skeptical and don't let yourself be pressured by the opinions of "thought leaders.”

Survey and sampling in an imperfect world


Author - Susie Sangren

Article Abstract

This article addresses the demands that arise in a random sample survey and their effects
on the sample, offers solutions, and proposes stratified random sampling as an alternative to
help achieve the same level of accuracy as computed on a simple random sample with a
reduced sample size.

My January 1999 Data Use article focused on the mechanics of calculating the sample size
for a simple random sample survey at a prescribed level of precision, in an ideal state. But
the world is not ideal: We rarely have the luxury of doing a true random (equal opportunity)
sample survey, and we have to accommodate many conflicting demands. In this article, I
address those external demands and their effects on your sample, and offer solutions. I then
propose stratified random sampling as an alternative to help you achieve the same level of
accuracy as computed on a simple random sample with a reduced sample size.

Compromise between practical constraints and technical elegance

When was the last time you actually knew the entire population before you took the survey,
which is a pre-requisite for any random sampling to ensure that everyone in the population
has an equal chance of being selected into your sample? In some cases, we might use a
convenience sample, a judgement sample, or a quota sample (all of them non-probability
samples) without realizing that it isn’t a random sample.

A convenience sample is convenient to take for the surveyor. For example, a doctor may
select the patients treated at his hospital for a clinical study.

A judgement sample is one taken by an overeager expert believing that, with his intimate
knowledge of the individuals in the sample, the sample must represent characteristics of the
population. For example, the leader of a school board may choose four of his allies on the
board to represent the opinion of all members.

A quota sample is one in which the population is subdivided into several sub-populations, or
strata; within each stratum the surveyor is free to select individuals in any manner he
wishes, usually by way of a convenience sample or a judgement sample, until he reaches the
specified number of individuals.

All of these samples share one thing in common: there is no knowledge of their
representativeness (of any population) and their reliability because they are not random
samples. Does this mean that we should abandon our efforts in calculating probability-
based sample sizes for them, or drawing statistical conclusions at the required levels of
precision and confidence? Absolutely not. If we are doing something wrong, we might as
well do it in the most effective way!

Statistics, though not applied optimally, can in so many ways dramatically improve
operating efficiencies, cut costs and improve estimation for a survey. We should, however,
be mindful that what we calculate are: approximately unbiased sample estimates and their
precision; approximately valid confidence intervals; and sample sizes that are based on
quantifiable statistics rather than some arbitrary industry standard.

Close substitutes for simple random sampling

Systematic sampling is a probability-based sample design often used when a listing is
available and can be ordered. You would select every kth element in the population, after a
random start somewhere within the first k elements. For example, suppose you have a list
of 5,000 households in a city, and you want to sample 100 households. Your interval, k,
would be 50 (=5,000/100), or every 50th household. You would then select a random
number between one and 50, say 13, and survey the houses numbered 13, 63 (=13+50), 113
(=63+50), 163 (=113+50), and so forth.
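
A minimal sketch of that procedure (the household list here is just the numbers 1 through 5,000):

    # Systematic sampling: every kth element after a random start within the first k.
    import random

    def systematic_sample(population, n):
        k = len(population) // n              # sampling interval
        start = random.randint(0, k - 1)      # random start within the first k elements
        return population[start::k][:n]

    households = list(range(1, 5001))
    sample = systematic_sample(households, 100)
    print(len(sample), sample[:3])            # 100 households, each 50 apart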

Systematic sampling is used more often in practice than simple random sampling because it
is much easier and cheaper to do. It has two advantages:

1) You do not jump back and forth all over the list wherever your random number leads you,
and you do not have to worry about duplication of a sample.

2) You can select a sample without a complete list of all households.

One major disadvantage with systematic sampling is “periodicity,” where you may
encounter cyclical patterns. For example, every, say, 50th business from a list in New York
City might turn out to be located on or near Fifth Avenue. When this happens, you
must reorder the list and redo the sampling.

Random-digit dialing has become an important probability sampling procedure with the
rising popularity of telephone interviewing. In its purest form, this procedure calls for
randomizing all seven digits of a telephone number. However, this is too costly and
inefficient. What is more common in practice is that numbers are selected from a telephone
directory by first using a systematic sampling procedure, and then the last one or two digits
of the numbers are replaced with random numbers. This procedure gives a much higher
percentage of usable telephone numbers, and also has the flavor of a true probability
sample.
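
A rough sketch of that directory-assisted approach (every number below is a fabricated seven-digit string, purely for illustration):

    # Take a systematic sample of listed numbers, then randomize the last two digits.
    import random

    def rdd_from_listing(listed, n):
        k = max(len(listed) // n, 1)
        seeds = listed[random.randint(0, k - 1)::k][:n]
        return [number[:-2] + f"{random.randint(0, 99):02d}" for number in seeds]

    directory = [f"555{random.randint(0, 9999):04d}" for _ in range(1000)]
    print(rdd_from_listing(directory, 5))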

Myth and reality about sample size

The general public seems to believe that larger samples are necessarily better than smaller
ones. It simply sounds more credible to say: “Based on a study of 3,000 people” than to say:
“Based on a study of 250.” This is only partially true: What if just 250 of the 3,000 people
responded (a response rate of roughly 8 percent), whereas 100 percent of the 250 responded?
You may only need 250 responses to attain the preset precision level for your estimate at
the 95 percent confidence level. Remember that
the sample size you need for computing precision is that of the survey “completes,” not the
survey “mail out.” Once you take non-responses into consideration, a very small, well-
executed sample may yield an estimate as accurate as that from a huge, sloppy sample.

I should also point out that the statistical precision requirement is only one of many
considerations a researcher must face in choosing the sample size, and there is no one
correct answer. Whatever his choice may be, the researcher should fully understand the
consequence of the precision gain or loss due to his choice. Practical constraints, other than
the precision requirement, affecting sample size decisions are:

Time pressure. Often research results are needed “yesterday.”

Cost constraint. A limited amount of money is available for the study.

Study objective. What is the purpose of the study? A decision that does not need great
precision can make do with a very small sample size. A company may be happy to measure
interest in its new product to within 15 to 20 percent. A political pollster can be off
by less than 1 percent and still fail to predict the election result.

Data analysis procedures. Data analysis procedures have an impact on the sample-size
decision. The sample-size and precision formulae I have proposed so far are premised upon
you doing a basic, one-variable analysis of frequencies. When you start doing
crosstabulations examining the relationship of two variables at a time, you may run into
situations where some cell sizes are so small that the precision of estimates within cells
becomes suspicious. A study doing only one-variable analysis may only require 200
completed responses, whereas a similar study doing two-variable analysis may require over
1,000 responses.

Stratified random sample survey

If the population is first grouped (or stratified) according to some criterion and a simple
random sample is then selected from every stratum, the resulting design is a stratified
random sample.

Quota sampling, undoubtedly the most popular form of sampling used in the research
industry, closely resembles stratified random sampling, and should follow formulae
developed for the stratified sample to approximate its sample size and precision. If you use
the simple random sample calculations for a quota sample, you would overstate the error of
your estimate and the sample-size requirement.

If intelligently used, stratification nearly always results in a smaller sampling error than is
given by a comparable-size simple random sample (That’s why stratified sampling is
statistically “more efficient.”) It is not always true though -- the key is in the careful selection
of the stratification criterion. In constructing strata, you must always ask yourself: “What
factor contributes most meaningfully to all the outcome variables I want to measure?”

As an example, suppose that you are asked to study personal income in some target
population. The most important contributing factor to the differences in income may be
education. Better-educated individuals earn more than less-educated ones. If you
distinguish four levels of education (eight years or less in school, 12 years, 16 years, 17 or
more years), you would have four different strata. In the “17 or more years” stratum, you
may find most of the high-income earners. In the “eight years or less” stratum, you may find
most of the low-income persons. The within-strata variability is much smaller than that
across strata. Because you only need the within-strata variability to calculate the overall
sampling error for a stratified sample, the advantages of a stratified design over simple
random design become clear:

For the same level of precision, you would need a smaller sample size in total, thus a lower
cost.

For the same total sample size, you would gain a greater precision for your estimate.

Conversely, if your stratification variable was so poorly chosen that the sample
measurements are all over the place within a stratum, you lose all the advantages inherent
in a stratified random sample. (You might as well do a simple random sample instead.)

There are two popular ways of assigning the sample size to the different strata, once the
total sample size is determined: equal allocation - taking the same sample size from each
stratum; proportional allocation - taking the sample size from each stratum in proportion to
the stratum population size. Other methods exist to achieve even smaller sampling errors
and reliable estimates. But they are complex and beyond the scope of this article. In
general, the larger the stratum, the larger the sample size should be; and the greater the
variability within a stratum, the larger the sample size should be.

An example

Let’s suppose a business has the following employee profile:

62 percent are skilled or unskilled males;

31 percent are clerical females; and

7 percent are supervisors.

From a total sample of 400 employees (n=400), the firm wishes to estimate the “overall”
proportion of employees who use certain on-site fitness facilities. Rough guesses are that
the facilities are used by 40 to 50 percent of the males, 20 to 30 percent of the females, and
5 to 10 percent of the supervisors.

A) How would you allocate the sample among the three groups?

B) If the true proportions of users are 48 percent (males), 21 percent (females), and 4
percent (supervisors), respectively, what would be the sampling error of the “overall”
estimated proportion (Pstratum) with stratification?

C) What would be the sampling error from a random sample (Psimple) without stratification
with the same sample size of 400?

A) Using the proportional allocation, we would assign the three stratum sample sizes as:

nstratum 1 = 400 x 62% = 248 to the male stratum

nstratum 2 = 400 x 31% = 124 to the female stratum

nstratum 3 = 400 x 7% = 28 to the supervisor stratum

B) If I guess Pstratum 1= 45 percent, Pstratum 2 = 25 percent, and Pstratum 3= 7.5 percent, then my
overall proportion estimate is:

Pstratum = (45% x 62%) + (25% x 31%) + (7.5% x 7%) = 36.2% (which is a weighted average of
the within-strata proportions)

And, my sampling error for the overall proportion estimate is calculated as:

Sampling error (P stratum) = square root of { ∑ Wi² [ P stratum i (1 - P stratum i) / n stratum i ] }

= square root of [ (0.62 x 0.62 x 0.45 x 0.55) / 248 + (0.31 x 0.31 x 0.25 x 0.75) / 124 + (0.07 x 0.07 x 0.075 x 0.925) / 28 ]

= 0.02326 = 2.33%

where Wi is the weighting factor for stratum i, i.e., the size of the population within that
stratum relative to the total population.

The 95 percent confidence interval (approximately two standard errors) for a total sample size
of 400 is:

36.2% ± [2 x (2.33%)], or 36.2% ± 4.66%, or [31.54%, 40.86%]

Note: The true overall proportion is 36.6% = (48% x 62%) + (21% x 31%) + (4% x 7%)

C) With a simple random sample, my overall proportion estimate, P simple, is the same as that
from a stratified sample, 36.2 percent. However, the sampling error for this estimate is
larger:

Sampling error (P simple) = square root of { P simple (1 - P simple) / n }

= square root of { 36.2% x (1 - 36.2%) / 400 }

= 0.02403 = 2.4%

The 95 percent confidence interval for a total sample size of 400 is:

36.2% ± [2 x (2.4%)], or 36.2% ± 4.8%, or [31.4%, 41%]

In this example, the improvement of sampling error from a simple random sample to a
stratified random sample may not seem dramatic, from ±4.8 percent to ±4.66 percent.
However, this difference amounts to a reduction of 25 interviews. (To achieve the same
level of precision with a simple random sample, we would need 425 samples, an increase of
6.25 percent!)

n simple = square of { square root of [ P simple (1 - P simple) ] / (E / Std. deviations) }

= square of { square root of [ 36.2% x (1 - 36.2%) ] / (4.66% / 2) } = 425

where E = my desired level of precision and Std. deviations = the number of standard errors
corresponding to the 95 percent confidence level (here, 2).
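
The whole worked example can be reproduced in a few lines of Python; the results round to the same figures reported above.

    # Proportional allocation, stratified vs. simple random sampling error, and the
    # simple random sample size needed to match the stratified precision.
    from math import sqrt

    weights = [0.62, 0.31, 0.07]      # stratum shares of the workforce
    p_guess = [0.45, 0.25, 0.075]     # guessed within-stratum usage proportions
    n_total = 400

    n_strata = [round(w * n_total) for w in weights]            # [248, 124, 28]
    p_overall = sum(w * p for w, p in zip(weights, p_guess))    # about 0.362

    se_strat = sqrt(sum(w ** 2 * p * (1 - p) / n
                        for w, p, n in zip(weights, p_guess, n_strata)))
    se_simple = sqrt(p_overall * (1 - p_overall) / n_total)
    print(round(se_strat, 4), round(se_simple, 4))              # 0.0233 0.024

    E = 0.0466                                 # the stratified precision (+/- 4.66%)
    n_simple_needed = p_overall * (1 - p_overall) / (E / 2) ** 2
    print(round(n_simple_needed))              # 425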

Finally, for a fixed total sample size, the gain in precision from stratified random over simple
random sampling is the largest if the stratum proportion estimates vary greatly from
stratum to stratum (i.e., great across-strata variability). I conclude with a table comparing
the relative precision of stratified and simple random sampling for the above employee
survey example with three strata and 400 samples, at various combinations of stratum
proportion estimates:

Four cases are presented in the table, the first having Pstratum 1 = 40 percent, Pstratum 2= 50
percent, and Pstratum 3= 60 percent, and the last having P stratum 1= 10 percent, Pstratum 2= 50
percent, and Pstratum 3= 90 percent. Columns 4 and 5 give the standard errors of the overall
estimated proportion. The last column gives the relative precision of stratified to simple
random sampling. The gain in precision is large only in the last two cases.

Increasing survey accuracy


Author - Norman Frendberg

Article Abstract

This article discusses ways of minimizing three types of survey errors: sampling error,
observational error (incorrect measurements), and non-observational error (the inability to
obtain information from qualified respondents).

Maximizing survey accuracy is the ultimate goal of every survey researcher. Several
approaches to addressing this ongoing challenge follow. In order to minimize survey error,
we must first identify the three types of error:

1. Sampling error- This error occurs when we survey only a sample of the population rather
than every person, i.e., we may survey 1,000 households rather than the approximately 93
million households comprising the total U.S. population.

2. Two types of non-sampling error-

2a. Observational error- This type of error includes incorrect measurements caused by a
variety of factors such as the respondent's failure to recall information accurately (e.g., the
last brand bought).

An observational error may also occur in the information processing phase during activities
such as keypunching.

2b. Non-observational error- This type of error results from the inability to
obtain information from qualified respondents. For example, respondents may be
unavailable or refuse to be interviewed. Additionally, there may be potential respondents
who are excluded from the survey as in the case of those without phones in a telephone
study.

Sampling error

The sampling error can be reduced simply by increasing the sample size. The increases in
sample size necessary to reduce sampling error are illustrated in Table 1.

Let's assume that among a random sample of 200 respondents, 20% indicate they
"definitely would buy" our new product. We would be 95% confident that this score among
the total population is ±6 percentage points, (i.e., between 14% to 26%). Increasing the
sample size by a factor of five to 1,000 respondents reduces the sampling error by half to ±3
percentage points. However, for most mall-intercept or phone survey research, such an
increase in sample size would drastically increase the study cost. Increasing the sample size
to reduce sampling error may fail to represent a cost-effective approach to increasing
overall survey accuracy.

Table 1
Sampling Error at 95% Confidence on Sample Measure of 20%

Sample size     Sampling error (± percentage points)
100             8
200             6
500             4
1000            3
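
The figures in Table 1 can be reproduced with the usual 95 percent half-width formula, 1.96 x square root of [p(1 - p)/n]; the table reports the results rounded to whole percentage points.

    # 95% confidence half-width for an observed proportion of 20%.
    from math import sqrt

    p = 0.20
    for n in (100, 200, 500, 1000):
        half_width = 1.96 * sqrt(p * (1 - p) / n)
        print(n, round(half_width * 100, 1))   # 7.8, 5.5, 3.5 and 2.5 points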

However, sampling error has the unique distinction of being a measurable source of error in
survey research, as well as the most commonly understood one.

Non-sampling error

Survey accuracy can also increase as a result of decreasing non-sampling error, although the
exact margin of improvement is not measurable.

Reducing non-sampling error can be accomplished by employing a wide variety of
techniques. However, creating a comprehensive list of techniques is impossible since a
particular method may be suitable for one study, but totally inappropriate for another.

Observational errors

One way to decrease observational error is by enhancing the communication between
interviewer and respondent, thereby improving the collection of accurate data.

Photo exhibits can be effectively used to clarify choices for a respondent and aid in memory
recall when appropriate. For example, when conducting mall-intercept panty hose studies,
respondents are often shown color exhibits illustrating package fronts of different styles
(e.g., regular, control top) for the major brands. Respondents can refer to the illustration
when asked about brand and style usage, and many times will mention a product by the
package color.

In another study, a photo exhibit helped respondents determine the weight of their dogs.
The illustration provided information on dog breeds and approximate weights, which served
as a visual reference for the respondent. This process furnished helpful information that
yielded more accurate data, even in the case of a mixed breed.

Reduction in observational errors can also be achieved in the data collection process.
Questions can be designed so that recording information is easier for the interviewers,
which results in a higher degree of error-free data.

In the following example we asked respondents, "Have you ever heard of [brand name] ice
cream?" The response to this question could be recorded in several ways, two of which
follow:

OPTION #1 Interview Instruction:


If "yes," circle answer for each brand respondent has heard of:

Haagen-Dazs 1

Perry’s 2

Sealtest 3

Ben & Jerry’s 4

Baskin-Robbins 5

OPTION #2 Interview Instruction:


Circle "yes" for all brands respondent has heard of and "no" for those brands respondent
has never heard of:

Ever hear of? Yes No

Haagen-Dazs 1 1

Perry’s 2 2

Sealtest 3 3

Ben & Jerry’s 4 4

Baskin-Robbins 5 5

Both methods are functional, but option #2 requires the interviewer to record an answer for
each brand. The process in this second option is easier, more complete and, therefore,
probably yields greater accuracy.

Non-observational errors

A key component of non-observational error is the respondent's refusal to be interviewed.


One approach to increasing survey accuracy by reducing the refusal rate involves offering a
cash incentive. Not only does this technique enhance survey accuracy, but it may actually
save money as indicated below. The following data were collected several years ago
illustrating the actual cost of screening respondents using a $2 cash incentive for various
incidence rates. At a cost of $15/hour, a $2 cash incentive actually saves money below the
30% incidence level.

Cost Analysis by Incidence Rate for Client Cost of $15 Per Hour*

Incidence %    Screening cost     Total cost         Total cost        Actual cost of $2
               ($2 incentive)     ($2 incentive)     (no incentive)    incentive per interview

100% $1.58 $3.58 2.20 1.38

90% 1.75 3.75 2.44 1.31

80% 1.97 3.97 2.75 1.22

70% 2.26 4.26 3.14 1.12

60% 2.63 4.63 3.67 0.96

50% 3.16 5.16 4.40 0.76

40% 3.95 5.95 5.50 0.45

31.25% 5.04 7.04 7.04 0.00

30% 5.27 7.27 7.33 -0.06

20% 7.90 9.90 11.00 -1.10

10% 15.80 17.80 22.00 -4.20
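
The logic of the table can be reconstructed from its own 100-percent-incidence figures (roughly $1.58 per completed screen with the $2 incentive and $2.20 without it); the sketch below reproduces the pattern, with small rounding differences from the published numbers.

    # Cost per completed screen, with and without the $2 incentive.
    screen_with, screen_without, incentive = 1.58, 2.20, 2.00

    for incidence in (1.00, 0.50, 0.30, 0.10):
        total_with = screen_with / incidence + incentive
        total_without = screen_without / incidence
        print(f"{incidence:.0%}: {total_with:.2f} with vs. {total_without:.2f} without")

    breakeven = (screen_without - screen_with) / incentive
    print(f"The incentive pays for itself below roughly {breakeven:.0%} incidence.")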

Summary

Although enlarging the sample size does reduce error, this approach may be quite costly for
a small return of increased accuracy. Increasing survey accuracy can be best achieved in a
cost-effective manner by decreasing non-sampling error. Three ways to reduce these non-
sampling errors are: use of visual communication aids during the interview process,
improving the recording of answers by interviewers, and offering cash incentives to
respondents.

* Frederick Wiseman, Marianne Schafer and Richard Schafer, "An Experimental Test of the
Effects of a Monetary Incentive on Cooperation Rates and Data Collection Costs in Central
Location Interviewing," Journal of Marketing Research, pp. 439-442, November 1983.

Statistical Significance:
The significance of significance
Author - Hank Zucker, Patrick Baldasare, Vikas Mittel and Albert Madansky

Article Abstract

This article, written in response to two articles previously published in Quirk’s, discusses
significance by way of a definition and an example.

The recurrence of articles on the meaning of "significance levels" is clear evidence that the
concept is murkily understood. The root cause of this may even hark back to poor (or, worse
yet, incorrect) exposition of this concept in statistics textbooks. Indeed, Professor Gerd
Gigerenzer of the University of Chicago Psychology Department has collected and published
a number of misstatements about this concept in the plethora of "statistics for
psychologists" books on the market.

That QMRR has published two articles on the meaning of "significance levels" within a year
indicates that this concept is murkily understood by the marketing research profession as
well. Unfortunately, both of these articles contain ambiguities which help to further muddy
one's understanding of this concept. The purpose of this article is to set the record straight,
hopefully in a clear enough fashion to dispel any erroneous notions readers may have about
this concept.

To provide a context for my comments, consider the following quotes from "The Use,
Misuse and Abuse of Significance," by Baldasare and Mittel (QMRR, November 1994) and
"What Is Significance?" by Zucker (QMRR, March 1994).

"A significance level of, say 95 percent merely implies that there is a 5 percent chance of
accepting something as being true based on the sample when, in fact, in the population it
may be false." (Baldasare & Mittel)

"Given our particular sample size, there is a 5 percent chance that in the population
represented by this sample the proportions for Group A and Group B are not different."
(Baldasare & Mittel)

"It (statistical significance) only tells us the probability with which a difference found in the
sample would not be found in the population." (Baldasare & Mittel)

"Significance levels show you how probably true a result is." (Zucker)

What's troublesome about these statements? First of all, the words "something," "it," and
"result" (as underlined by me above), as referents of the adjective "true," are somewhat
imprecise, which can lead the reader to erroneous conclusions about what "truth" is being
assessed by significance testing. Secondly, when Baldasare and Mittel talk about probability
of finding a characteristic in the population and Zucker talks about probability of the truth of
a conclusion they are expressing a common misunderstanding of what the probability
statement associated with a significance test is all about.

Let me illustrate with a simple example. Someone hands me a coin, and I'd like to determine
whether the coin is fair. The coin either is or isn't a fair coin. At the moment, only God
knows for sure (and perhaps so does the person who handed me the coin). But what does
the expression "the probability that the coin is fair" mean? Objectively, that probability is
either 1 (if the coin is in truth fair) or 0 (otherwise). Subjectively, one can interpret the
expression as "What odds would I give that the coin is fair?" But my odds may not be the
same as your odds, which is why I dubbed this interpretation "subjective." And I don't think
this is what Baldasare, Mittel, and Zucker are talking about when they use the word
"probability."

Let's continue with the example. Suppose I toss the coin 100 times and find that I come up
with 60 heads. I can ask myself what is the probability of obtaining 60 or more heads in 100
tosses of a fair coin. In the parlance of significance testing, I postulated a null hypothesis
(that the coin is fair) and asked what is the probability of my data (or data more inconsistent
with the null hypothesis) arising when the null hypothesis is true. That probability is called a
"p-value," and is the only probability calculated in the standard significance testing
packages. (In my example, the p-value is .0284.)
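
That p-value is a simple binomial tail probability, easy to verify in Python; the same calculation also shows how many heads it takes for the tail to drop to .05 or below, which matters for the decision rule described next.

    # Probability of k or more heads in 100 tosses of a fair coin (Python 3.8+).
    from math import comb

    def upper_tail(k, tosses=100):
        return sum(comb(tosses, j) for j in range(k, tosses + 1)) / 2 ** tosses

    print(round(upper_tail(60), 4))   # 0.0284 -- the p-value for 60 heads
    print(round(upper_tail(59), 4))   # 0.0443 -- still at or below .05
    print(round(upper_tail(58), 4))   # 0.0666 -- above .05, so 58 heads would not be enough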

What's a significance level? To understand this concept, let me continue my story. Before I
tossed my coin 100 times, I sat back and planned my analysis. As a professional statistician, I
am asked to make a recommendation about whether or not to accept a posited null
hypothesis. Only God knows whether the null hypothesis is true or false. Suppose God were
to keep a scorecard on my recommendations, but only on those recommendations made
when the null hypothesis is true. (God could also keep a separate scorecard of my
recommendations when the null hypothesis is false, but we won't look at that scorecard
now.) If I want my lifetime percentage of correct calls, given that the null hypothesis is true,
to be 95 percent, I will adopt the following procedure:

1. calculate the p-value as defined above

2. if that p-value is at most .05, I will recommend rejecting the null hypothesis; if that p-
value is greater than .05, I will recommend accepting the null hypothesis. (In my example,
since the p-value was less than .05, I would recommend rejecting the null hypothesis.
Indeed, using this procedure I would have recommended rejecting the null hypothesis had I
observed 59 or more heads out of 100, but not 58 or fewer.)

On any one recommendation, I don't know whether I'm right or wrong. I can only tell you
that the way I operate I'm right 95 percent of the time when the null hypothesis is true. The
level of significance is just the p-value that I use as a cutoff in making my recommendations.
The only correct statement about significance levels is the following restatement of one of
those by Baldasare and Mittel, namely:

"A significance level of, say, 5 percent merely implies that, given my procedure for making
inferences, there is a 5 percent chance of my rejecting a null hypothesis based on the
sample when, in fact, in the population it (the null hypothesis) is true."

The concept associated with the Baldasare and Mittel quote, "the chance of accepting a
hypothesis as being true based on the sample when, in fact, in the population it is false," is
called the operating characteristic of a statistical test. More regularly referred to in the
statistics literature is the power of the test, which is defined as 1 minus the operating
characteristic, or "the chance of rejecting a hypothesis based on the sample when, in fact, in
the population it is false." It is this that is being recorded on God's other scorecard, the one
he keeps on the accuracy of my calls when the null hypothesis is false. This latter concept is
also important in market research, in that it is the power of the test (and not the level of
significance) that determines the required sample size. But this is off the main point of this
article, and should itself be the subject of a future article in this publication.

--Albert Madansky

PATRICK BALDASARE AND VIKAS MITTEL'S REPLY:

We would like to thank Al Madansky for taking the time to carefully read the articles related
to statistically discernible differences (a.k.a. statistical significance). While the points made
by Madansky warrant consideration, we should point out that the differences between his
work and ours are essentially due to a difference in orientation.

We take issue with Madansky's conclusion that the publication of two articles in QMRR about
significance testing indicates that "this concept is murkily understood by the marketing
research profession." While some people within the marketing research field do not fully
understand the concept, it is stereotypic thinking to draw judgments about the profession
as a whole. Publications such as QMRR serve as vehicles for continuing education among
professionals who may not have the time to take formal classes to refresh their skills. By
publishing articles on topics that are of practical importance, QMRR and other such
publications (1) provide a forum for professionals to brush up their skills and knowledge and
(2) remind readers of the importance of basic concepts. Publishing more than one article on
a given topic does not suggest ignorance or ambiguity on the profession's part. Rather, it
shows the field's penchant to revisit and revive basic concepts that are useful.

Second, the words something, it, and result refer to the alternative hypothesis. While
phrasing sentences in technical terms such as "the null hypothesis" and/or "alternative
hypothesis" may make the exposition seem more precise, they do not necessarily render it
more understandable or readable. In fact, carrying Madansky's recommendation to the
extreme, we could express the entire problem in terms of mathematical symbols. While this
would make our exposition more technically appealing, it would not necessarily make it
more practical or useful. Ironically, it is the same sort of difference that we pointed out
between statistical significance and practical significance.

Third, the restatement of our point by Madansky says nothing new. On careful examination
we find that his restatement is the same statement as ours, except that he phrases it in
terms of the null hypothesis compared to our phrasing in terms of the alternative
hypothesis. Practitioners are more used to thinking in terms of the alternative hypothesis
rather than the null hypothesis. For instance, a manager is more likely to understand the
statement that we are looking for differences, rather than the statement that we are trying
to gather evidence against the assertion that there are no differences in the population.
Therefore, to enhance the readability of the article we phrased our sentences accordingly. Is
the glass half empty or is it half full? We believe such a debate only muddies the issues.

Nevertheless, Madansky's note is useful in itself because it describes the philosophy of
undertaking a test of significance using a common example. Additionally, he highlights the
dilemma practitioners wrestle with on a daily basis: what is the dividing line between
technical clarity from a purist's perspective and practical clarity from an end user's
viewpoint? This question does not have a right or wrong answer. We leave the readers to
draw their own conclusions.

--Patrick Baldasare and Vikas Mittel

HANK ZUCKER'S REPLY:

Prof. Madansky seems to have misunderstood the aims of my article "What is significance?"
A primary aim was to avoid statistics jargon.

One of the reasons that the meaning of "significance levels" is so "murkily understood" (in
Madansky's terms) is the unfortunate choice of words statistics professionals use to discuss
the underlying concepts. Many statistical terms mean nothing to the non-expert. Worse,
others have clear meanings in normal English that have nothing to do with their meanings in
statistics. Significance is a prime example. A non-expert hearing or seeing the term
significance level would likely think it refers to importance rather than to the chance of
erroneously rejecting a null hypothesis. A key aim of my article was to correct this all-too-understandable mistake.

My article was originally written for a newsletter sent to users of our interviewing and
tabulation software The Survey System. Our clients include many academics and long-time
research professionals, but also many people new to survey research. The article was
written primarily for the latter group. I attempted to give the non-expert a clear, generally
correct understanding of the term significance, to explain how to read the probability
notations provided by statistical packages and to caution the reader that significance tests
do not measure all types of errors. Phrases like "something" or "a result" being true may be
less precise than "rejecting a null hypothesis," but they are more easily understood by non-
experts. As Baldasare and Mittel mention in their response to Madansky, some sacrifice in
precision is often worthwhile for the sake of clarity.

Some readers may find Madansky's approach useful. Others may prefer a less jargonistic
approach, especially since it allowed a similar length article to include important
information about issues related to statistical significance, not just a definition of the term.

The insignificance of significance testing


Author - Neil Helgeson

Article Abstract

Statistical tests often are misinterpreted. This article discusses significance testing and its
place in marketing research.

After conducting a study to compare two products or services, there are a number of
questions that a researcher could ask when comparing the two along a relevant dimension:

1. Which one is better, and by how much?
2. Did the two differ?
3. How likely is it that the two differ?
4. How likely is it that we obtained the observed difference if the two did not actually differ?

Most researchers would like the answer to #1, would settle for #2 or #3, and would say
"Who cares?" about #4. Unfortunately, #4 is the question answered when we use an
inferential statistical test on our data. Most users of these tests have no idea that it is really
#4 they are answering when they ask for statistical testing.

Inferential statistics are required when we work with samples, not populations. When we
use samples, we cannot be certain that the conclusions we reach based upon them
accurately reflect the populations from which the samples are drawn. To guide us in using
our results we can perform a statistical test. The approach is called "null hypothesis testing,"
and while the calculations vary based on the specific test we perform, conceptually they
contain the same steps.

1. Assume the null hypothesis is true. The null hypothesis is the opposite of the conclusion
we are testing (which we call the alternative hypothesis). For example, if we think that
stated purchase intents for product A and Product B are different (our alternative
hypothesis), our null hypothesis would be that the population purchase intent for product A
was equal to the population purchase intent for product B. The null hypothesis and
alternative hypothesis must be mutually exclusive and exhaustive - one and only one of
them is true. It is important to note that these are statements about population parameters,
not sample statistics, as our goal is to draw conclusions about populations. If we want to
draw conclusions about the samples, we only need to compare their means directly. This is
simple, but we usually do not care about our samples, only about the populations they
represent.

2. Calculate a statistic whose distribution is known, given the null hypothesis. The statistic
we calculate depends upon the test - t, χ², etc. If we are doing a t-test, we calculate t - a
statistic that involves the differences in the sample means, the variability in the samples,
and the sample sizes. If the null hypothesis is true, we know how likely it would be to get t
values in a given range — the range of interest usually being "as large or larger" than the
value we obtained.

3. Determine whether it is reasonable to keep assuming the null hypothesis is true. If we can
expect to get data which produces t values as extreme as ours infrequently if the null
hypothesis is true, we reject our assumption of the null hypothesis in favor of the
alternative hypothesis. Common values for the probability of obtaining the data before we
reject the null hypothesis are 5 percent or 10 percent. This number is called alpha (α); it
represents what is called the type I error rate, the probability of rejecting the null
hypothesis when it is true.

If we reject our null hypothesis about products A and B at an alpha of .05, it tells us that, if
the null hypothesis is true, we would get a statistic (t in our example) as large or larger only
5 percent of the time.
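
To make these three steps concrete, here is a minimal sketch in Python using scipy; the purchase-intent ratings are invented for illustration and are not taken from the article.

```python
# A minimal sketch of null hypothesis testing with a two-sample t-test.
# The ratings below are hypothetical 5-point purchase-intent scores.
from scipy import stats

product_a = [5, 4, 4, 3, 5, 4, 2, 5, 4, 3, 4, 5]
product_b = [3, 4, 2, 3, 4, 3, 3, 2, 4, 3, 2, 3]

# Step 1: assume the null hypothesis - equal population means - is true.
# Step 2: calculate t, whose distribution is known if the null is true.
t_stat, p_value = stats.ttest_ind(product_a, product_b)

# Step 3: decide whether it is reasonable to keep assuming the null hypothesis.
alpha = 0.05
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
if p_value < alpha:
    print("Reject the null hypothesis at alpha = .05")
else:
    print("Do not reject the null hypothesis at alpha = .05")
```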

There are a great many things that it does not tell us, the most frequently mistaken
conclusion being:

"There is a 95 percent chance that the products are different."

The prevalence of this misinterpretation can be seen in the use of the phrases "95 percent
confidence level," and "90 percent confidence level" rather than "alpha of .05" or "alpha of
.10." The conclusion we reach is based upon the probability of getting data like ours if the
null hypothesis is true, it is not based on the probability of the null hypothesis being true.
This is a crucial distinction. Since our real concern is whether the products differ, it is
convenient to assume that this is what the test tells us, but the probability of the products
being different is unknowable under most circumstances. From Bayes’ Theorem we know
that the probability of the two products being different given our data is:

P(data given the alternative hypothesis) × P(alternative hypothesis)
---------------------------------------------------------------------------------------------
P(data given the alternative hypothesis) × P(alternative hypothesis) + P(data given the null
hypothesis) × P(null hypothesis)

All these are unknown to us. While we might be able to estimate reasonable values for the
probability of the null hypothesis and alternative hypothesis, knowing the probability of the
data given the alternative hypothesis requires knowing the "real" difference in product
means. If we knew that, there would be no reason to do the statistical test! Under normal
circumstances, there is no way of knowing the probability of products being different based
on the analysis of experimental data. Any phrasing of the analysis of experimental results
that states or implies that there is a certain probability of the products being the same or
different is completely inaccurate. If we find two means different at the "95 percent
confidence level," we are 95 percent confident that, if the null hypothesis is true, we would
not have obtained a difference as large or larger than we obtained. We are not 95 percent
confident that the products differ.
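
A small numerical sketch makes the distinction visible. In the Python lines below, the prior probabilities and the probability of the data under the alternative hypothesis are simply assumed numbers; in practice these are exactly the quantities we do not know, which is why a significance test by itself cannot yield the probability that the products differ.

```python
# Hypothetical illustration of Bayes' Theorem for two competing hypotheses.
# Every input below is an assumption invented for the example.
p_h1 = 0.5              # assumed prior probability of the alternative hypothesis
p_h0 = 1 - p_h1         # prior probability of the null hypothesis
p_data_given_h0 = 0.05  # roughly what a significance test reports
p_data_given_h1 = 0.60  # unknowable in practice without knowing the true difference

posterior_h1 = (p_data_given_h1 * p_h1) / (
    p_data_given_h1 * p_h1 + p_data_given_h0 * p_h0
)
print(f"P(alternative hypothesis given the data) = {posterior_h1:.2f}")
```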

Another common misinterpretation of the results of a statistical test is that they tell us that:

"The differences are "real," or the findings are "valid."

The finding of statistical significance does not tell us that the differences we observed are
"real." We may choose to treat them as real if we find them to be statistically significant, but
that is not what is being tested. No matter what the results of a test, there may or may not
be a difference, and the difference we observe may or may not accurately reflect the size of
that difference. It is important to remember that the means and the differences between
means we observe are our best guess of the population means and differences, regardless
of the results of any significance testing. If we observe a mean purchase intent of 3.86, 3.86
is our best guess of the population purchase intent, although we realize that the actual
value is probably different. Finding that the 3.86 is significantly different from another value
does not tell us that the 3.86 is correct, and finding that it is not significantly different does
not tell us it is incorrect. The precision of our numbers is not directly addressed by the
significance testing.

Another common misinterpretation is that:

"Failure to achieve significance shows that the means are the same."

The observed sample statistics are our best guess of the population parameters. If we find a
difference, our best guess is that there is a difference, even if that difference is not
significant. Failure to find that a difference is significant may mean that we do not treat the
difference as "real," but it does not tell us that there is no difference.

The p values we calculate in reaching a decision about the null hypothesis are not
particularly useful in drawing other conclusions. In particular, it is not true that:

"Smaller p values indicate larger differences."

In testing means, the sample size, variability, and absolute difference in means enter into
the calculations. If we hold all else constant, increasing the size of a difference will
ultimately lower the p value when we check our test statistic, but since other factors enter in as
well, the p value should not be used as a measure of the size of the difference.

An example of this misapplication can be seen in a situation where our product was
compared to a competitor’s product on a series of dimensions, each dimension measured
by a question. It would be possible to statistically test the differences in means on each
question, and calculate p values for each comparison. It would not be correct to say that our
greatest superiority is on those dimensions where we have higher means with the smallest p
values, and that we have less superiority on those dimensions with larger p values. While
the sizes of the differences do enter into the calculations, larger p values may also be due to
more variability in responding to a question (either due to differing understanding of the
question or differing expectations of respondents), or they could be caused by reduced
sample size, with a larger number of respondents failing to answer a question due to a
failure to understand it or a belief that the question did not apply to them.
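
A short sketch using scipy's summary-statistics t-test shows why: the same 0.3-point difference in means yields very different p values as the sample size or the variability changes, so the p value cannot be read as a measure of how big the difference is. All of the summary figures below are invented.

```python
# Same difference in means, different p values - the p value is not an effect size.
from scipy import stats

# Hypothetical summary statistics: mean, standard deviation, sample size per group.
small_n = stats.ttest_ind_from_stats(4.1, 1.2, 50, 3.8, 1.2, 50)
large_n = stats.ttest_ind_from_stats(4.1, 1.2, 500, 3.8, 1.2, 500)
noisy = stats.ttest_ind_from_stats(4.1, 2.0, 500, 3.8, 2.0, 500)

print(f"n = 50 per group:                 p = {small_n.pvalue:.3f}")
print(f"n = 500 per group:                p = {large_n.pvalue:.4f}")
print(f"n = 500, more variable responses: p = {noisy.pvalue:.4f}")
```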

Given the limitations in the questions addressed by significance testing, why use it? We use
it because it provides a threshold that keeps us from being constantly buffeted around by
chance variation due to sampling. We realize that a difference we observe may be due to
sampling, and that the populations may not really differ. By looking for statistical
significance, we are assuring that some threshold has been reached before we act.

We should not mindlessly apply the testing, but adjust it according to the consequences of
the actions we may take. If we will use the results of a study to implement a costly change,
we should set our threshold high; an alpha of .01 may be appropriate, to reduce the chance
of incorrectly rejecting a true null hypothesis (a Type I error). If the gains to be made are large, we may
want to set a less demanding threshold, an alpha of .10 or more, to reduce the chance of failing to reject a
false null hypothesis (a Type II error). We should consider the consequences of the types of errors and set
our criterion appropriately.

When the results of testing are irrelevant, we should not test. The results will just confuse
us. Suppose we are testing 10 potential new product formulations, with the goal of selecting
the best three for further development. Assuming there are no cost differences, etc.,
whether the third-best is significantly better than the fourth-best is irrelevant. Failure to
find statistical significance does not tell us that the third-best is no better than the fourth-
best, and should not be used as a reason to choose anything other than the three best-
performing formulations.

We should use alternatives to statistical testing when they more directly address our
concerns. If we are interested in the precision of our numbers, how close our 3.86 is to the
true population purchase intent, we should calculate confidence intervals. The results will
tell us that a certain percentage of the time, the true value will be in a given range. For
example, that 95 percent of the time the population mean will be in the range 3.66 to 4.06.
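
One standard way to compute such an interval is from the sample mean, the sample standard deviation and the sample size. In the sketch below, the standard deviation and sample size are invented and were chosen so that the result lands near the 3.66-to-4.06 range used in the example above.

```python
# A 95 percent confidence interval for a mean, from summary statistics.
import math
from scipy import stats

mean = 3.86   # observed sample mean, as in the example above
sd = 1.02     # hypothetical sample standard deviation
n = 100       # hypothetical sample size

std_error = sd / math.sqrt(n)
t_crit = stats.t.ppf(0.975, df=n - 1)   # two-sided 95 percent critical value
lower, upper = mean - t_crit * std_error, mean + t_crit * std_error
print(f"95% confidence interval: {lower:.2f} to {upper:.2f}")
```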

If our concern is whether a difference is "meaningful," a measure of association such as
eta-squared (η²) is appropriate. These statistics tell us what proportion of the total variance is
explained by our manipulation. For example, if we obtained a value of .37 in a test of
purchase intent for two products, it tells us that 37 percent of variability in purchase intent
can be explained by which product was being evaluated. This is quite large. On the other
hand, if we obtained a value of .01, it tells us that only 1 percent of variability in purchase
intent can be explained by which product is being tested. Large sample sizes make it quite
possible to achieve statistical significance with eta-squared values this low or lower.
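
For a simple two-group comparison, eta-squared can be computed by hand as the between-groups sum of squares divided by the total sum of squares. The ratings in the sketch below are invented; it is meant only to show the calculation.

```python
# Eta-squared: proportion of total variance explained by the grouping variable.
import numpy as np

# Hypothetical purchase-intent ratings for two products.
product_a = np.array([5, 4, 4, 3, 5, 4, 5, 4])
product_b = np.array([3, 3, 4, 2, 3, 4, 3, 2])

all_ratings = np.concatenate([product_a, product_b])
grand_mean = all_ratings.mean()

ss_total = ((all_ratings - grand_mean) ** 2).sum()
ss_between = sum(
    len(group) * (group.mean() - grand_mean) ** 2
    for group in (product_a, product_b)
)
eta_squared = ss_between / ss_total
print(f"eta-squared = {eta_squared:.2f}")
```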

Statistical testing has its place in marketing research, but its proper role is smaller than the
role it currently plays. The somewhat convoluted logic of null hypothesis testing fails to
provide answers to the questions which interest researchers the most. Failure to
understand what these tests really tell us can lead to incorrect and perhaps costly errors in
decision making, and can keep us from using the statistics which might provide more
meaningful interpretations of our results.

The use, misuse and abuse of significance


Author - Patrick Baldasare and Vikas Mittel

Article Abstract

Researchers often misuse and abuse the concept of significance, tending to associate
statistical significance with the magnitude of the result. This article suggests an alternative.
When comparing numbers, consider two types of significance: statistical and practical.

Researchers often misuse and abuse the concept of significance. Many in research comb
piles of crosstabulations and reams of analyses to find significant differences and formulate
their decisions based on statistical significance. They tend to associate statistical significance
with the magnitude of the result. Their reasoning is something like this: "The more
statistically significant a result, the bigger the difference between two numbers." In other
words, the fact that one proportion is significantly different than another suggests to many
that there is a big difference between the two proportions and statistical significance is
often associated with the "bigness" of a result. People often think that if the difference
between two numbers is significant it must be large and therefore must be considered in
the analysis. We suggest that when comparing numbers, we should consider two types of
significance: statistical significance and practical significance. By understanding the
difference between statistical and practical significance, we can avoid the pitfall that many
in the research industry make.

Statistical significance

What does statistical significance mean? A significance level of, say, 95 percent merely
implies that there is a 5 percent chance of accepting something as being true based on the
sample when, in fact, in the population it might be false. The statistical significance of an
observed difference depends on two factors: the sample size and the magnitude of the
difference observed in the samples.

For example, let's say we do a significance test between two groups of people who are
exposed to a product concept and find a 20-point difference between Group A (65 percent
acceptance) and Group B (45 percent acceptance). Is the difference statistically significant?
Despite the large magnitude of the difference (20 points), its statistical significance will
depend on the sample size. According to statistical theory, we need a sample size of about
50 or more people in each of the groups for the difference to be statistically significant at
the 95 percent level of confidence. If, in fact, we meet the sample size requirement, then
the difference of 20 points will be statistically significant at the 95 percent level of
confidence.

What does this result mean? Many marketers will look at this result and conclude that since
there is a 20-point difference and the difference is statistically significant, there must be a
big difference between Groups A and B. In reality, if we had done a census (i.e., surveyed
the entire population) instead of surveying a sample, the difference between Group A and
Group B may turn out to be smaller.

In other words, what this result tells us is merely this:

Given our particular sample size, there is a 5 percent chance that in the population
represented by this sample, the proportions for Group A and Group B are not different.

That's all. Statistical significance does not tell us anything about how big the difference is. It
only tells us the probability with which a difference found in the sample would not be found
in the population. Thus, for this case statistical significance would allow us to conclude that
there is only a 5 percent chance that in the population the proportion of Group A favoring
the product is not higher than Group B; we are taking a 5 percent risk of concluding a
difference exists when there may not be any such difference. If this difference were
significant at the 99 percent level of confidence, the difference itself would not have become any
larger. It would only mean that there is a 1 percent chance that the difference observed in the sample
would not be observed in the population. Thus, we are only taking a 1 percent risk.

Practical significance

From a marketing perspective, the statistically significant difference of 20 points may be
meaningful or meaningless. It all depends on our research objectives and resources. If it
costs millions of dollars to reach each additional percentage of the market, we may decide
to funnel resources toward Group A since it has a higher acceptance rate. In this case, the
difference may be termed a "big" difference because (a) we are reasonably sure (95 percent
or 99 percent sure) that the difference observed in our sample also exists in the population
and (b) each percentage of difference is worth millions of dollars to the client. Thus,
statistical significance should not be used to decide how big a difference is, but merely to
ascertain our confidence in generalizing the results from our sample to the population.

In another situation this same difference may be ignored despite the fact that it may be
statistically significant. For instance, if the marketing costs are so low that it makes sense to
market to both groups, we can ignore the difference (even though it is significant) and treat
both groups as if they are the same. We may choose to market to both groups as if they had
similar acceptance rates (even though our statistical test was significant).

Our logic is the following: Although we can be 95 percent sure that the difference observed
here exists in the population, given the marketing scenario, the difference is not meaningful.
Thus, the relevance of a statistically significant difference should be determined based on
practical criteria including the absolute value of the difference, marketing objectives,
strategy, and so forth. The mere presence of a statistical significance does not imply that the
difference is large or that it is of noteworthy importance.

Implications

Statistical significance of a result is not a rule of thumb to ascertain how "big" a difference
is, but a context-dependent tool to assess the riskiness of the decisions we make based on a
given sample. At most it can be used to ascertain that a difference actually exists in the
population when we observe it in the sample.

One last thing: How can we avoid this trap whereby significance takes on a larger meaning?
We recommend using the term statistically discernible instead of statistically significant
when discussing results. While this cannot fully solve the problem, it certainly does not
aggravate it either. We, as researchers can explicitly note in our reports: "While such and
such result is statistically discernible, its practical significance will depend on..." In this way
we can alert the end-user of our data to interpret the results realistically.

Vexed by significance testing? Try the bootstrap technique


Author -William S. Farrell

Article Abstract

Significance testing can be difficult to teach and learn. This article explains how the
bootstrap technique is simple to use and understand, valid and valuable-in hypothetical and
real-world application. Though not new, the technique is becoming newly accessible to a
majority of market researchers with varying degrees of computing resources.

I teach market research as well as conduct it, and when I come to the part of the course
where significance testing enters the picture, it's never clear who is more worried - me or
the students. We're worried about the same thing, of course: the difficulty of teaching
(learning) the dauntingly complicated theory underlying significance testing. There are
problems even when I try to avoid most of the theory - normal distribution, central limit
theorem, etc. - and go with a "cookbook" approach.

I usually have my students analyze data using a spreadsheet package such as Excel, since
few of them have access to a statistical package. As soon as they try to run their first t-test,
however, they are forced to make decisions about "homoscedastic" vs. "heteroscedastic,"
among other things. And even if they are fortunate enough to have access to a true
statistical package like SPSS, they don't know which of two p values to use for the t-test until
they understand something about "Levene's F test for equality of variances."

Is it any wonder that my students react to statistics the way they react to Freddy Krueger?
Fortunately, help is on the way (for practitioners as well as students), in the form of
something known as the bootstrap technique.

I'll introduce it by way of an example. Let's say we're rolling a pair of dice (you didn't think
you'd get through a statistics article without reading about dice, did you?) and we're curious
about how often a seven will show up. We could answer the question using the formula for
the binomial expansion - if we remembered the formula for the binomial expansion -- or we
could do it another way.

First, we'd count how many ways there are to roll a seven: 1-6, 2-5, 3-4, 4-3, 5-2, 6-1 - six
ways in all. Then we'd count the total number of ways two dice could come up: 1-1, 1-2, 1-3,
etc. I'll spare you the list - there are 36 ways altogether.

So there's our answer: we simply divide six (ways to get a seven) by 36 (total combinations)
and find that a seven should come up about 17 percent of the time, on average. You can bet
on it.
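
The same counting logic is easy to verify by brute force; a throwaway Python sketch simply lists all 36 outcomes and counts the sevens.

```python
# Enumerate all 36 outcomes of rolling two dice and count the sevens.
outcomes = [(d1, d2) for d1 in range(1, 7) for d2 in range(1, 7)]
sevens = [roll for roll in outcomes if sum(roll) == 7]
print(len(sevens), "of", len(outcomes), "outcomes")                     # 6 of 36
print(f"chance of rolling a seven: {len(sevens) / len(outcomes):.0%}")  # about 17%
```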

How does this relate to significance testing? Let's look at a hypothetical example more
directly relevant to market research. Say you've just conducted your annual customer
satisfaction survey and you find that customers in the Northeast give you a 9.2 rating on a
10-point scale, while customers in the South give you an 8.5 rating. You'd like to know if the
difference of 0.7 is statistically significant.

One (good) way of re-stating your question is as follows: if chance factors alone were at
work, how often would you get a difference as large as 0.7 between the means for these
two groups of customers? That question can be answered using a traditional t-test, or we
could apply the bootstrap method in a way that's analogous to what we just did with the
dice. Theoretically, we'd list all possible ways your customers could have responded, then
we'd calculate the proportion of those in which the difference between sample means was
equal to or greater than 0.7.

Practically, we'd do something like this: let's say you have responses from 93 customers in
the Northeast and 58 customers in the South. We'd put all 151 numbers into a pot; draw a
sample of 93 with replacement and calculate the mean; draw a sample of 58 with
replacement and calculate that mean; calculate the difference between the two means; and
then store that difference. This process would be repeated perhaps a thousand times. When
we were done, we'd calculate the proportion of differences that equaled or exceeded 0.7.
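
A minimal Python sketch of the pooled, sample-with-replacement procedure described above might look like the following. The ratings are simulated stand-ins, not the study's actual data, and the p value is computed two-tailed.

```python
# A resampling ("bootstrap") test of the difference between two group means,
# following the pool-and-redraw procedure described in the text.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 10-point satisfaction ratings for the two regions.
northeast = rng.normal(9.2, 0.8, size=93).clip(1, 10)
south = rng.normal(8.5, 0.8, size=58).clip(1, 10)
observed_diff = northeast.mean() - south.mean()

pot = np.concatenate([northeast, south])   # put all 151 numbers into a pot
n_iterations = 1000
count = 0
for _ in range(n_iterations):
    resample_ne = rng.choice(pot, size=len(northeast), replace=True)
    resample_s = rng.choice(pot, size=len(south), replace=True)
    if abs(resample_ne.mean() - resample_s.mean()) >= abs(observed_diff):
        count += 1

p_value = count / n_iterations
print(f"observed difference: {observed_diff:.2f}, bootstrap p = {p_value:.3f}")
```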

Though you may find this difficult to accept at first (I certainly did), that proportion is
conceptually the same as the p value one could calculate in Excel or SPSS, and is in fact a
more valid answer to the question of whether the two groups differ.

The bootstrap p value and the traditional p value are conceptually identical because they
both tell us the following: If we repeated the customer satisfaction study many times, and
there were no difference between the two populations, we would observe a sample
difference of 0.7 or greater exactly p percent of the time.

The bootstrap value is more valid than the traditional p value because it doesn't depend on
a major assumption underlying traditional significance testing; namely, that the distribution
of what we're measuring is normal in the population (or alternatively, that we have a large
enough sample so that the sampling distribution of the mean is normal).

Alert readers will have noticed that in our hypothetical application of the bootstrap, we
looked at only 1,000 shuffles of the customer data, not all possible combinations as we did
with the dice. Is this kosher? It is, but the details would take us too far afield. Suffice it to say
that in most implementations of the bootstrap, 1,000 to 3,000 iterations (depending on the
specific problem) have been shown to produce extremely accurate p values.

Does the bootstrap work in the "real world" of market research? You can bet on it. I recently
asked a national sample of physicians to rate, on a 10-point scale, the importance of 25
attributes of a medical device. I wanted to compare the ratings of two subgroups of
physicians, to see if one group viewed any of the attributes as differentially important.

One group was much smaller than the other - 47 vs. 131. Despite this difference in sample
sizes, SPSS told me that sample variances were equal for the two groups on 22 of the 25
attributes (remember Levene's F test?). For those 22 attributes, the two-tailed p value
computed using a bootstrap procedure differed by no more than .006 from the p value
calculated by SPSS in a traditional t-test. This was reassuring.

For the three attributes where SPSS said the groups had different variances, things got
interesting. Differences for two of these attributes were deemed non-significant, both by
SPSS and by the bootstrap. For the third attribute, SPSS computed a p value of .049, a value
that meets the "standard" criterion for statistical significance. The bootstrap procedure
computed a p value of .12 for this attribute - not even close to significant by most people's
standards. Which one did I believe? I think you can guess.

The real question is why this technique is only now coming into widespread use, and the
answer has a lot to do with computer power. Typical bootstrap significance tests that might
take one to five minutes to solve on a fast 486 today would have required hours on a fast
286 a decade ago.

You might be wondering why this technique, first described in 1979 by Stanford statistician
Bradley Efron, is called the bootstrap. The term is a whimsical reference to the fictional
Baron von Munchausen, who is said to have avoided drowning by pulling himself up by his
bootstraps from the bottom of a lake. It reflects the notion that analysis is performed
without the help of outside agencies, such as the normal distribution.

The bootstrap has been implemented under a variety of descriptive rubrics, including
distribution-free statistics, resampling statistics, exact inference testing and permutation
statistics. They all have in common the notion of repeated sampling from the original data,
calculation of a statistic with each sampling, and then inspection of the resulting distribution
of that statistic.

The technique can be applied to data at all levels of measurement: nominal (categorical),
ordinal (ranking), interval and ratio. It can be used to assess significance (p values) and to
compute confidence intervals. The technique is not a new one, but it is becoming newly
accessible to the vast majority of market researchers whose computing resources lie
somewhere between a calculator and a Cray.

And compared to teaching the normal curve, central limit theorem, etc., I find it much easier
to convey what boils down to a three-step process: (1) What's our result? (2) What are all
the different results that could have occurred? (3) How many of the possible results equal or
exceed ours?

I believe this paradigm will transform the way statistical analysis is taught and conducted.
Stay tuned.

Basic Data Analyses:


Secrets of effective data use
Author - Richard J. Vondruska, Ph.D.

Article Abstract

This article discusses the nature of data and a myriad of factors influencing the uses of data
in marketing research. These considerations include the purpose and focus of the research,
interactions between the perspectives of marketers and researchers, the questions and
hypotheses being explored in the research, and the possibilities of using customer
satisfaction data to project consumer behavior.

It is difficult, but very rewarding, to distance ourselves occasionally from the day-to-day
details of the marketing research profession, and to consider the "big" issues. Marketing
research occupies a unique position between the more academic world of research, and the
more practical world of business. In a sense, marketing researchers have a sort of "dual
citizenship" in both worlds. Since the agendas of research and business are very different, it
is sometimes problematic having allegiance to both worlds. In this context, the topic "data
use" is a somewhat ambiguous topic. On the one hand, it acknowledges that, in the world of
business, if DATA is not Useful, it is of little value. On the other hand, it implies that
statistical and analytical tools can be employed to transform "raw DATA" into something
that is USEful. However, usefulness is a judgmental term. It implies that "someone"
determines that "something" is useful. Who is the "someone," and what is the "something?"

The nature of data

Before one can use data effectively, it helps to have a rudimentary grasp of the nature of
data. In marketing research, we often think of data as the conglomeration of numbers
obtained from a survey. Unlike mere numbers, however, data is inherently meaningful. It
assumes meaning to the extent that it relates to an aspect of phenomenal reality. More
colloquially put, phenomenal reality is "where the things of interest (phenomena) are
happening." In marketing research, that phenomenal reality is usually the marketplace.

Figure 1 illustrates my own viewpoint on how data should be construed. In Figure 1,
phenomenal reality is represented by the Oriental symbol of wholeness--Yin and Yang. For
those unfamiliar with this symbol, a brief explanation is in order. The ancient Chinese
believed that the world originated with two opposite yet complementary "forces." Yin is
symbolized by the large black area of the symbol, and Yang by the large white area. Within
each area, there is a small dot of the opposite color. This dot represents the
interdependence of Yin and Yang, despite their separateness. These two "forces" also have
connotative as well as denotative aspects. Yin is characterized as female, passive and dark.
Yang is characterized as male, active and light.

What the Yin-Yang symbol is intended to reflect, for current purposes, is the idea that the
many phenomena we investigate have a "completeness" that is resistant to an analysis
designed to break it into components. Although dividing the whole is sometimes the only
way to gain understanding, that whole must eventually be reconstituted in our theories
about the phenomena. The significance of this viewpoint is not patently obvious if one
construes data using only more traditional Western thought, which emphasizes
componential aspects (e.g., computer flow charts).

The Yin-Yang symbol also captures the subtle complexity of the phenomena under
investigation. It suggests a "harmony of opposites." If data reflected only chaos, there would
be no reason to collect it in the first place. The "pie wedge" removed from the symbol
represents the act of measurement to obtain data.

It should be made clear that we are not talking here about drawing a sample from a
population of consumers. In sampling, we expect to obtain a representative group of
respondents--a sort of microcosm of the population. Notice that what we obtain is not a
microcosm of the symbol (i.e., a complete, but smaller Yin-Yang symbol), but rather
incomplete information in the form of a piece of data.

For the purposes of the following discussion, any complex black and white figure could
suffice. Here the "surplus meaning" of the Yin-Yang symbol merely enriches the process.
Imagine that you did not know what the entire symbol looked like. You had only the piece of
data. What could you conclude about the entire symbol? For one, you could conclude that it
has both white and black areas. For another, you could conclude that it is possible to have a
circle of white surrounded by black. You might also note the arc of the edge of the piece.
Something you might be able to infer, but not necessarily conclude, is that the arc is part of
a larger circle. Likewise, you might be able to conjecture that a small black circle might also
exist.

By "slicing" things slightly differently the next time you collect data, you might get the black
circle. Or, you might get a portion of the "S" shaped curve that divides the main regions of
black and white. In other words, data can never give us the "full picture." We must use our
mental faculties to interpret that data for it to become useful. A key point to be made here
is that we sometimes concentrate on the piece of data rather than on how it fits into the
whole.

Theories are developed to explain or account for phenomena. In any particular discipline,
there is an implicit understanding of what "counts" as a phenomenon of interest. For
example, the behavior of free-falling bodies would be considered appropriate for study by
physicists, but not by marketing researchers. In marketing research, the primary
phenomenon of study is purchase behavior.

The bane of marketing research is the theory-less "one-shot" study. Anyone who continually
does one-shot studies is simply wasting ammunition. A one-shot marketer is trying to grab
the proverbial gold ring on the carousel. A one-shot researcher is using skills and training in
a mostly opportunistic way. Approached correctly, the field of marketing research can grow
in sophistication to encompass issues bordering on a better understanding of human
behavior itself. Approached poorly, it will never be more than a way for marketers to help
protect their interests in risk-laden situations.

In this context, it should be noted that many of the activities related to marketing research
are actually tangential to the purchase behavior per se. For example, advertisements are
often tested to determine their effectiveness in communicating key ideas, but testing is not
typically tied to the purchase behavior. Only in recent years, with the advent of scanner
technology, has it even been feasible to ask whether or not advertising can produce a
measurable effect on purchase behavior. There is no doubt, of course, that advertising (and
other promotional activities) can help establish the preconditions for a particular purchase
behavior (e.g., awareness of a new product).

Data in marketing research

From whatever angle one approaches the topic of "data use," there are certain premises
that are tacitly assumed. From the purely academic perspective, the major premise is that
the "goal" of marketing research is an explanation of the dynamics of the marketplace.
What is sought is understanding rather than knowledge about a specific situation. The
practical applications of this understanding need not be immediate, but application ought to
be within the realm of possibility. The academic perspective can be seen as a "long-term"
one. The main reason it can be viewed so is that there is no reason to believe that the bases
of consumer behavior will change radically over time. Specific products and services may
change, but not the underlying principles governing behavior. In this vein, marketing
research can be viewed as a special member of the family of behavioral sciences - special
because of its direct ties to practical concerns. The cross-fertilization of the behavioral
sciences over the years is evident to even the casual observer. Through this dynamic
process, models of the marketplace are being molded, chiseled, and hewn into powerful
conceptual frameworks.

From a business perspective, the major premise is that the goal of marketing research is to
provide information for decision making. Marketing research, per se, holds no preeminent
position in the array of information used to reach decisions. Obviously, the overriding goal
for a decision maker is to seek good decisions and avoid bad decisions. This "short-term"
perspective might be labeled hedonic empiricism. In this context, the "long-term" view is
precluded by the immediacy of the need for information.

It must not be concluded, however, that either of these two very different perspectives is
the "superior" one. Both perspectives have adaptive advantages, as well as attendant
dangers. "Long-term" academic researchers are often criticized for being out of touch with
the realities of the marketplace (the "ivory tower" criticism). "Short-term" marketers are
often accused of doing research that is motivated by the fear of being judged solely
responsible for a "bad" intuitive decision. They are merely seeking a place to "point a finger"
if the consequences of their decisions don't pan out as expected. With apologies to
comedian Flip Wilson, it is as though they want to be able to say, "The research made me do
it!"

Synergistic relationships

It is often overlooked that most marketing research situations are of the form of a
synergistic relationship between researcher and marketer. In this sense, marketing research
is not "done" in the sense that a statistical analysis is "done." Rather, marketing research
emerges as a joint function of the needs of the marketer and the skills of the researcher.
One might argue that marketing research is more of a transaction than a product or service.
To the extent that marketers see research as a product, they will de-emphasize the
understanding that can be gained. To the extent that researchers see marketing as a service,
they will de-emphasize the important role it has in the non-academic world. The optimal
situation is a dialogue between marketer and researcher that ensures mutually satisfactory
transactions.

Without such dialogue, the analysis of a data set is often divorced from the original
questions the survey was intended to address. From an objective standpoint, any statistical
textbook could be consulted to determine the "proper" analysis. But the main questions
might not be addressed even in the objectively "proper" analysis. "Proper" data is not
necessarily useful data. Since the design of the survey and analysis of the data are inevitably
interwoven, this dialogue between marketer and researcher should precede questionnaire
development.

There can be no "magical" statistical solutions if the prior steps have not insured that the
"proper" analyses can be performed. Worsening the situation is the widespread availability
of statistical software. This encourages untrained individuals to apply statistical tests in an
indiscriminate manner. The expectations generated in the minds of the owners of these
statistical packages are oftentimes unrealistic. Owning a "statistical cookbook" does not
make a person a "chef." And not even the greatest chef can make chocolate mousse from
headcheese.

It is a lucky marketer who works with a researcher who is aware of the validity, and business
necessity, of the "short-term" view. And it is an equally lucky researcher whose client
appreciates that the "long-term" view can pay dividends in the future. Working together,
this "team" of the marketer and researcher can address any challenge offered by the
marketplace. They will not only find opportunities with a "long-term" view, but also will
seize opportunities by dealing with "short-term" competitive threats with information
rather than emotion.

Fueled by imagination and insight, the contribution of both marketers and researchers
should lead to those "competitive advantages" that are so sought after in the world of
business. So how does one go about finding such "gems" in the data? In some sense, what
we seek is information rather than insight, but I would contend that the two go together
more often than not.

Broad generalizations contribute little, and preoccupation with minutiae is equally
counterproductive. Useful data should satisfy both the marketer and the researcher. The
real challenge to those in marketing research is finding the right "level of focus" for the
wisest "data use."

Vondruska's Postulates

What we need is a principled way in which analysis can be approached to maximize
obtaining the desired information. The "level of focus" notion leads directly to Vondruska's
Postulates, which are as follows:

Postulate 1: Lower levels of phenomenal organization are easier to detect than higher levels
of phenomenal organization.

Postulate 2: Higher levels of phenomenal organization are easier to imagine than lower
levels of phenomenal organization.

Obviously, the converse of each postulate is implied as well (e.g., it is difficult to detect
organization at higher levels). What do I mean by "organization?" Simply that the world is
not merely a collection of disjointed atoms in space. Hydrogen molecules organize into
stars; people organize into market segments. We see patterns. We see constancy. We
understand.

Admittedly, the postulates are a bit abstract. So an illustrative analogy seems in order.
Consider the following (familiar?) high school math formulas: the standard equations of the
ellipse, the parabola and the hyperbola. In terms of the postulates, these formulas can be
considered at a "low" level of organization. They are useful unto themselves, but no
relationship between the formulas is implied. Now consider the illustration of the conic
sections in Figure 2.

By re-conceptualizing ellipses, parabolas, and hyperbolas at a "higher" level of
"organization," we now see something new. Despite their distinct formulas, we see them as
members of the family of plane figures. As the philosopher Ludwig Wittgenstein contended,
sometimes things are related by family resemblance rather than common attributes. If we
do not know that, we will not look for such resemblances.

The point here is that the same type of mental processes prevail when we work with data.
Recasting the postulates in terms of the phrase "He cannot see the forest because of the
trees" may help to explain them further.

Sometimes we can easily detect the "trees," but we miss imagining the "forest." And at
other times, we get clobbered by "trees" as we dash through the "forest" of our
preconceived notions.

The true power of these postulates is that they apply not only to marketing, but to most
investigative endeavors. The proper "level of focus" for most meaningful investigations
usually lies between the extremes of high and low levels of organization. Often, more than
one "focus" is needed to thoroughly understand an array of data. Some, of course, will be
more useful than others for particular purposes.

Facts vs. Ideas

Facts "need" ideas, and ideas "need" facts. Examples of the need for both measurement and
theory abound in the history of science. The astronomer Johann Kepler spent many years of
his life pursuing a mathematical/theoretical framework that would provide an account of
planetary orbits. He immersed himself in the mysteries of mathematics in his attempt to
bring order to astronomical phenomena. His driving intuition was that the perfection of
mathematics must be hidden in the universe itself.

One of Kepler's contemporaries, the lesser known Tycho Brahe, approached the problem of
determining the nature of the planetary orbits in a different way. He measured. He collected
data. Night after night, he sat at his observing instruments and dutifully recorded the positions of
the observable planets. But to his eye, no patterns emerged from the data. It was only when he
and Kepler shared their different perspectives that the true usefulness of the data became
apparent. Kepler is credited with the discovery that planets orbit the Sun in an elliptical
pattern, but Tycho Brahe had no small contribution to that discovery.

Kepler's discovery of the elliptical orbits of the planets would not have been possible
without the painstaking data collection of planetary positions by Tycho Brahe. The key is
that Kepler had to consider the facts in his discovery. He would have much preferred the
orbits to be perfect celestial circles, but the evidence militated against that theory. On a
more mundane level, research realities such as these are encountered in marketing research
on an everyday basis.

Hypothesis-driven research

It is not enough merely to subject data to rigorous analysis. The most useful data is gleaned
from an analysis in which one already has a suspicion of what is sought. Hypothesis-driven
research also yields the greatest insights from analysis. I have a personal rule that I apply to
any analysis. After I have applied all of the "right" statistical tools, I look for "patterns" in the
data. When I start to scour statistics manuals to find a procedure that will give me
interesting results, I stop. This is a sure sign that I have "tortured" the data into confessing
all of its secrets. Alas, sometimes there are no further secrets.

Higher level statistical analyses do not typically uncover relationships that are not at all
apparent at lower levels. They simply "formalize" those relationships in a more elegant, and
sometimes more useful way. A good example of this is hierarchical log linear analysis.
Although there is the potential in this procedure for detecting very high level interactions
between variables, these complex interactions are often impossible to interpret--for all
practical purposes.

Obviously, there is a big difference between knowing what one ultimately wants to
accomplish through marketing research and actually accomplishing it. Ambiguity in research
design is especially common in the non-academic world. Invoking another astronomical
analogy, it is as though many marketers fail to realize that even though they can see the
planet Jupiter, that does not mean that they can get there directly. It takes a long time to
get to Jupiter - and when you finally get there it will be in a new location! Both theoretical
knowledge and technical knowledge are required to reach distant goals. Only then can the
improbable become the possible.

There is a lesson to be learned here. Straightforward thinking does not always produce the
desired result. Some research problems have solutions that possess a property that is
denoted in the German language by the word "umweg." There is no suitable direct
translation, but the idea is that only a roundabout approach will work. All direct approaches
fail. Most puzzles and games incorporate this "umweg" principle. Indeed, Nature herself
seems to have an immense sense of humor with regard to thwarting direct approaches.

Of course, marketing research is not exempt from this "umweg" principle. An analysis plan
which is too straightforward often founders on the rocks of perplexing findings. Luckily, by
understanding the nature of data, we are still able to tease out the actionable information
needed for practical marketing solutions.

Prediction vs. Assessment

Behavior itself is governed by a multitude of factors, some of which are only measurable
after the fact. This is a major reason why customer satisfaction research enjoys its current
popularity. Marketers realize that although it might be impossible to predict behavior in the
marketplace, they can determine the characteristics of products that succeed, and products
that fail. If these characteristics are interpreted at the proper level of abstraction, they may
be applied to future products with a degree of confidence heretofore not possible.

Some marketers use the argument that looking at "after the fact" measures such as
customer satisfaction is like looking in the rearview mirror while driving a car (after Marshall
McLuhan's comments). This is specious thinking, because we do not really have a front
window in marketing research. Nor do we have the "crystal ball" that all marketers seem to
covet. What we do have is the ability to learn from our mistakes, and to see products and
services through the eyes of the consumer. Every projection is a gamble of sorts. Useful data
allows us to hedge our bets. It does not provide a sure thing.

Another way to characterize customer satisfaction research is in terms of a feedback


mechanism. In much the same way that the thermostat on a climate control system detects
deviations from some acceptable range, a good customer satisfaction survey provides
information about problems in the marketplace. It should also provide a feel for one's
competitive position in the marketplace. This is the best way to use customer satisfaction
data. The worst way to use it is as a yardstick to set "goals" for employee performance. This
is because customer satisfaction has an intuitively asymptotic aspect to it.

In plain English, 1) you can only please people so much; 2) some people will never be
completely satisfied; 3) the more you please people, the more they expect. So if your "goal"
is to improve overall customer satisfaction by 5% each year, you are doomed to failure once
the "performance curve" starts to level off (asymptote) over time.

Also, note well that simply because a survey is repeated over time (i.e., a tracking study)
does not mean that it fulfills the requirements of a good customer satisfaction survey. What
is monitored is as important as the monitoring itself. The acid test for any customer
satisfaction program is how well it can detect the problems that detract from the quality of
a product or service. If the program does that, it will make a difference to the bottom line as
well.

Implications for theory and action

To obtain a complete perspective on the myriad of different activities that constitute the
field of marketing research, we must "take a step backward to admire the work." What we
then see is a lattice of interrelated activities leading toward a dual goal - to better
understand the consumer, and to better compete in the marketplace.

If I have given a plausible account of the nature of data, then it follows that we sometimes
must proceed with marketing decisions based on incomplete information. Looking on the
bright side, however, informed decisions are almost always superior to those made in a
vacuum. So although we may be tempted to look to data for crystal clear answers, all data
can ever really provide us with is prudent guidance for our theories and our actions. Therein
lies the main secret to effective data use.

Ordered up wrong
Author - Stephen J. Hellebusch

Article Abstract

Using a recent example to illustrate his points, the author discusses the importance of using
the correct statistical test.

One phenomenon that has always baffled us is why anyone would hire an expert to work on
project and then order the expert to ignore his/her knowledge and do it wrong. A recent
experience drove home the confusion in a pointed way.

As many marketing researchers familiar with statistics are aware, you use a different
statistical test when you have three or more groups than you do when you have only two.
The automatic statistical testing that is very helpful in survey data tables does NOT “know”
this, and cheerfully uses the two-group test in every situation, regardless. Each statistical
test addresses a slightly different question, and the question is very important in the
selection of the correct test to use.

In a recent “pick-a-winner” study, we had three independent groups, each one based on a
different version of a concept - Concepts A, B and C. We used a standard purchase intent
scale (definitely will buy, probably will buy, might or might not buy, probably will not buy,
definitely will not buy), and the question was: How do we test to see if there is any
difference in consumer reaction to the three?

The first test was analysis of variance (ANOVA), which addresses the question: Do the mean
values of these three groups differ? We used weights of 5 (for “definitely will buy”), 4, 3, 2
and 1 (for “definitely will not buy”) and generated mean values for each of the three
concepts. The ANOVA showed that the means did not differ significantly at the 90 percent
confidence level, which leads to the conclusion that consumer reaction to these three
concepts on purchase intent does not differ, on average.
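
For readers who want to see the mechanics, a hedged sketch of this first test in Python follows; the coded ratings are invented, and scipy's f_oneway simply runs a one-way ANOVA on the 5-to-1 scale values.

```python
# One-way ANOVA on purchase-intent ratings coded 5 ("definitely will buy")
# down to 1 ("definitely will not buy"). All ratings are hypothetical.
from scipy import stats

concept_a = [5, 4, 4, 3, 5, 4, 2, 4, 3, 4]
concept_b = [3, 4, 2, 3, 4, 3, 3, 2, 4, 3]
concept_c = [4, 3, 4, 4, 3, 5, 3, 4, 2, 4]

f_stat, p_value = stats.f_oneway(concept_a, concept_b, concept_c)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
# If p >= .10, the means do not differ at the 90 percent confidence level.
```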

At this point in the project, the client was displeased, and told us to test using the
“definitely/probably will buy” percentages (the top two box). This is another testing option
that makes sense. The chi-square test addresses the question: Do these three percentages
differ? It is the proper test to use when there are three or more percentages to test across
three different groups of people. We conducted it, and learned that the percentages did not
differ significantly at the 90 percent confidence level. It told us that, with respect to positive
purchase interest, across all three products, the consumer reaction in terms of the top two
box was the same.
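
One way to run that chi-square test is on a table of top-two-box counts versus all other responses for the three concepts, as in the sketch below; the counts are invented for illustration.

```python
# Chi-square test of top-two-box incidence across three independent concepts.
import numpy as np
from scipy import stats

# Rows: top two box vs. all other responses. Columns: Concepts A, B, C.
# The counts are hypothetical.
counts = np.array([
    [52, 38, 45],     # definitely/probably will buy
    [98, 112, 105],   # remaining scale points
])
chi2, p_value, dof, expected = stats.chi2_contingency(counts)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p_value:.3f}")
```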

The wrong test

The client was displeased. Having conducted the testing himself, he learned that Concept B
was significantly lower than Concept A, both in the top two box and in the top box, at the 90
percent confidence level. He told us not to use the chi-square, but to use the test the data
tables use. The Z test addressed the question: Do these two percentages differ? When it is
misused in a situation where there are three or more groups, this testing method disregards
key information, and makes the determination after having thrown out data. To please the
client, we conducted multiple Z tests and determined that there were no statistically
significant differences between any of the three pairs (A vs. B; A vs. C; B vs. C) at the 90
percent confidence level. The client had another person in his department conduct the test,
and that testing showed, as the client’s had, that the top two box for A was significantly
higher than B’s at the 90 percent confidence level.

Fairly confused at this point, we ran the data tables, which showed, exactly as the client
said, that A was significantly higher than B at the 90 percent confidence level, both on the
top two box and on top box percentages.

The less-preferred formula

We then conducted the three tests by hand, and compared our Z values with the client’s.
We learned that the client, his department mate, and the statistical testing in the survey
table program all used the less-preferred Z test formula. There are two versions of this test.
One of them does not use the recommended correction for continuity. This, essentially, is a
very small adjustment that should be made because the basic formula assumes a
continuous variable (peanut butter) and we are actually working with a discrete variable -
people (peanuts; the count of respondents making up the top two box). Normally, it makes
no difference in the results, because it is so small. In this case, however, it made the
difference between crossing the line into significance and not crossing it.
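
The textbook version of the two-proportion Z test, written out by hand with and without the continuity correction, shows how such a borderline case can arise. The counts below are invented and are not the project's data.

```python
# Two-proportion Z test, with and without the continuity correction.
import math

def two_prop_z(x1, n1, x2, n2, continuity=False):
    """Z statistic for comparing two independent proportions (x = count of successes)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    diff = abs(p1 - p2)
    if continuity:
        diff = max(0.0, diff - 0.5 * (1 / n1 + 1 / n2))   # correction for continuity
    return diff / se

# Hypothetical top-two-box counts for Concepts A and B.
z_plain = two_prop_z(52, 150, 38, 150, continuity=False)
z_corrected = two_prop_z(52, 150, 38, 150, continuity=True)
print(f"without correction: Z = {z_plain:.2f}")
print(f"with correction:    Z = {z_corrected:.2f}")
# With these made-up counts, the uncorrected Z clears the two-tailed 90 percent
# critical value (about 1.645) while the corrected Z falls just short of it.
```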

With that resolved, we discussed the client’s desire to test every row of the scale, with the
wrong statistical test, using the less-preferred formula. We were told that the client always
does this, and that we should do so. So, we did.

The wrong way

This procedure violates the fundamental logic behind testing. By testing the top two box, we
have tested the difference between these three concepts on this scale. When we test the
five rows of the scale (and various other combinations of rows), using multiple Z tests, the
probabilities are so distorted that it is doubtful anyone knows the confidence level at which
we are really operating.

So, we successfully used the less appropriate formula with the wrong test and followed the
wrong procedure for testing. We remain baffled.

Let's test everything


Author - Stephen J. Hellebusch

Article Abstract

In statistical testing, the key is to make sure the right numbers are being tested.

The logic of statistical (stat) testing is not complex, but it can be difficult to understand,
because it is the reverse of everyday logic and what normal people expect. Basically, to
determine if two numbers differ significantly, it is assumed that they are the same. The test
then determines whether this notion can be rejected, and we can say that the numbers are
“statistically significantly different at the (some predetermined) confidence level.”

While it is not complex, the logic can be subtle. One subtlety leads to a common error, aided
and abetted by automatic computer stat testing - overtesting. Suppose there is a group of
200 men and one of 205 women, and they respond to a new product concept on a purchase
intent scale. The data might look like that shown in Table A.

Statistical logic assumes that the two percentages to be tested are from the same
population - they do not differ. Therefore, it is assumed that men have the same purchase
interest as women. The rules also assume that the numbers are unrelated, in the sense that
the percentages being tested are free to be whatever they might be, from 0 percent to 100
percent. Restricting them in any way changes the probabilities, and the dynamics of the
statistical test.

The right way to test for a difference in purchase intent is to pick a key measure to
summarize the responses, and test that measure. In Table A, the Top Two Box score was
tested - the combined percentages from the top two points on the scale (“definitely would
buy” plus “probably would buy”). Within the group of men, this number could have turned
out to be anything. It just happened to be 13 percent. Within the group of women, it could
have been anything, and, as it turns out, was 40 percent. Within each group, the number
was free to be anything from 0 percent to 100 percent, so picking this percentage to test
follows the statistical rule. The stat test indicates that the idea that these percentages are
from the same place (or are the same) can be rejected, so we can say they are “statistically
significantly different at the 95 percent confidence level.”
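A minimal sketch of that comparison, using a pooled two-proportion Z test with the counts reconstructed from the percentages in Table A (26 of 200 men, 82 of 205 women):

import math

x_men, n_men = 26, 200      # 13 percent top two box
x_women, n_women = 82, 205  # 40 percent top two box

p_men, p_women = x_men / n_men, x_women / n_women
p_pool = (x_men + x_women) / (n_men + n_women)
se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_men + 1 / n_women))
z = (p_women - p_men) / se
print(round(z, 1))  # far beyond 1.96, hence significant at the 95 percent confidence level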

Something different often happens in practice, though. Since the computer programs that
generate survey data do not “know” what summary measure will be important, these
programs test everything. When looking at computer-generated data tables, the statistical
results will look something like those shown in Table B.

If the Top Two Box score is selected ahead of time, and that is all that is examined (as in
Table A), then this automatic testing is very helpful. It does the work, and shows that 13
percent differs from 40 percent. The other stat test results are ignored. However, if the data
are reported as shown in Table B, there is a problem.

The percentages for the men add to 100 percent. If one percentage is picked for testing, it is
“taken out” of the scale, in a sense. The other percentages are no longer free to be
whatever they might be. They must add to 100 percent minus the set, fixed percent that
was selected for testing. Percentages for the men can vary from 0 percent to 87 percent,
but they can’t be higher, because 13 percent is “used up.” Similarly, percentages for the
women can vary from 0 percent to 60 percent, but 40 percent is used already. When you
look at testing in the other rows, or row by row, you are no longer using the confidence
level you think you are using - it becomes something else.

Statistically, if one said of Table B that the percentages that “definitely would buy” and the
percentages that “definitely/probably would buy” both differ at the 95 percent confidence
level, it would be wrong. One of them does, but the other difference is at some unknown
level of significance, probably much less than 95 percent, given one related significant
difference.

Stat tests are very useful. Each one answers a specific question about a numerical
relationship. The one most commonly asked about scale responses is whether two numbers
differ significantly. If they are the right two numbers, and the proper test is used, the
question is easily answered. If they are the wrong two numbers, or the wrong test has been
used, the decision maker can be misled.

A comparison of missing value options in regression analysis


Author - Gary M. Mullet

Article Abstract
Regression analysis is one tool for evaluating customer satisfaction measurement. Non-
response is problematic for multiple regression analysis because most software discards all
of a respondent’s data when it encounters a missing value. This article discusses options for
coping with item non-response in regression runs, comparing run results based on a real
data set.
Whenever you manage to get off the telephone long enough to even glance at your in-box,
you're sure to notice that a large amount of correspondence deals with various facets of
customer satisfaction measurement (CSM). It also seems that more and more promotion,
compensation and retention decisions are based, at least in part, on the results of CSM
studies.

One tool, although certainly not the only one, for evaluating such studies is regression
analysis. As readers of this column are aware, regression analysis is certainly widely used in
other types of marketing research studies. One bugaboo of multiple regression analysis is
item non-response. When (most) computer packages encounter a missing value, they pitch
all of the other data from the given respondent, by default.

There are various options for coping with item non-response in regression runs. We will
compare the results of some of these below, using a real, albeit disguised, data set. If your
livelihood depends on the results of a CSM study, you should be interested in the differing
conclusions which may be drawn from these comparisons. All of the results reported below
use a 95 percent confidence (5 percent significance) criterion and stepwise regression runs.
There are certainly myriad other options available which are not examined below.
Listwise deletion
As already noted, the default option in most programs is listwise deletion. In a very small
nutshell, this means that if a respondent fails to answer even one of the many ratings, that
respondent ceases to exist for the regression in question. As a case in point, a recent
regression on 1200+ respondents yielded not a single valid case for a regression trying to
use only 15 (out of 60-some) independent variables to predict overall satisfaction. While this
is extreme, it is not unusual to lose 50 percent or more of the respondents to item
non-response. Thus, conclusions (and compensation) may be based on fewer than half of the
respondents in your carefully designed study!
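As a quick illustration of how fast the base melts away, here is a sketch using fabricated data shaped like the example described next (500 respondents, an overall measure and ten ratings) showing what the listwise default actually keeps.

import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
cols = ["overall"] + [f"X{i}" for i in range(1, 11)]
df = pd.DataFrame(rng.integers(1, 11, size=(500, 11)).astype(float), columns=cols)
df = df.mask(rng.random(df.shape) < 0.08)   # punch roughly 8% random holes per item

complete = df.dropna()                      # what listwise deletion keeps
print(len(df), "interviews,", len(complete), "usable for the regression")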

Our example comes from a data set of 500 respondents who were asked 10 ratings that
were potentially related to an overall opinion measure. For proprietary reasons, the 10
scales used for the independent variables will be denoted below as X1, X2, . . .X10, rather
than given more meaningful labels. The results of the first regression, using the listwise
(default) option, are noted in Table 1 under column A.

As a variation on listwise deletion, some analysts use a portion of the column A results only
to see which set of variables is significant and then instruct the computer to run another
regression, using only those attributes and pretending that the others don't exist. This can
accomplish a couple of things. First, almost assuredly, the base size will increase since fewer
variables require answers from everyone. Secondly, (partial) regression coefficient
magnitudes may change, as well as order of entry of the variables - just look below. In some
cases, attributes that are statistically significant in the first pass through the data will not be
so in this second pass. The results from this "variable screening" analysis are listed under
column B.
Pairwise option
In this variation of regression, attributes are (essentially) looked at two-by-two (sounds like
Noah's Ark). Without beating anyone over the head with statistical theory, the effect of
invoking this option changes the matrix upon which the computer program operates to find
the estimated regression coefficients. The results of the pairwise option are under column
C, in Table 1.

Mean substitution
Be careful here! Mean substitution for missing values is a very attractive option since it's
easy to invoke - just push a computer key - and dramatically increases the base size on
which these personnel and/or other decisions are made. The mean substitution option fills
in the arithmetic mean value for everyone who did answer a given rating for the void
existing for those who did not. Thus, everyone is assumed to be "average" on anything that
they failed to answer.

Then why be careful? First, if you blithely select mean substitution without any filtering of
the data, the mean on the dependent variable, here overall opinion, is also substituted for
those who didn't answer it. You will then be running regressions that include a substantial
number of people who did not give a rating on the criterion measure - be they no longer
customers, no longer product users or whatever. See column D for this type of mean
substitution.

O.K., let's say you're alert enough to run the mean substitution option on only those who
gave an answer to the overall opinion question. The results, in column E of Table 1, still
include several respondents who answered only one or two of the independent variable
ratings, which may cause an eyebrow or two to be raised if the results are broadcast.

Finally, let's look at more intelligent mean substitution. You need to ask yourself, "How
many questions should a respondent answer to convince me that they have a grasp of the
interview?" For the data which we are looking at, the answer to this (arbitrarily) was set at
eight. Then, mean substitution was used for those who met two criteria. One, there had to
be a valid answer to the overall opinion question. Second, there had to be at least eight
legal answers to the 10 predictor attribute ratings. The regression coefficients are shown in
column F of the table.
Respondent mean substitution
Many feel that the major drawback to using the automatic mean substitution option is that
an individual with missing values is treated like everyone else; the mean of all who did
answer is substituted as the value for those who did not, as already noted, variable-by-
variable. Respondent mean substitution treats each individual as an independent entity; the
mean for the questions that were answered (which may require some reverse coding) for
each individual respondent is substituted for the value(s) for which there is no answer for
that respondent and that respondent only. This, then, makes use of scale usage differences
between individuals or genuinely different (average) ratings on the independent variables
between individuals. As before, the resulting regression may be run irrespective of the
number of ratings which a respondent did answer, but in column G you'll find the results of
substituting the respondents' own mean for items which had no answer for, as before,
those who answered at least eight of the predictors and also gave an overall opinion rating.
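For concreteness, here is a sketch of the substitution options discussed above, reusing the fabricated data frame df from the earlier sketch (overall plus ratings X1-X10, with holes).

preds = [f"X{i}" for i in range(1, 11)]
answered = df[df["overall"].notna()].copy()          # keep only those who rated overall

# Item (column) mean substitution: missing answers become the average of those who answered
item_mean = answered.copy()
item_mean[preds] = item_mean[preds].fillna(item_mean[preds].mean())

# Respondent mean substitution, limited to those answering at least 8 of the 10 predictors
enough = answered[answered[preds].notna().sum(axis=1) >= 8].copy()
row_means = enough[preds].mean(axis=1)
resp_mean = enough.copy()
resp_mean[preds] = resp_mean[preds].apply(lambda col: col.fillna(row_means))

print(len(item_mean), "cases with item means,", len(resp_mean), "with respondent means")

Running the same stepwise regression on each version is then just a matter of swapping in the data frame, which makes it easy to see how sensitive the coefficients are to the choice.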
Are we done yet?
Just about. We'll leave perusal of Table 1 to the reader during your scarce leisure time.
Note, however, that there are some common and uncommon threads between the
columns. Depending on your actual application of regression analysis, none of these
differences may be daunting at all. Certainly, in some applications they are somewhat scary.

It should be obvious by now that there are still other analytical variations, such as using the
pairwise option on the respondent mean substitution data. That's not the point. The
important conclusion to draw from the above mathematical manipulations is that it is essential
for the analyst to know exactly which options are used on any regression analysis before
blindly trying to implement the results, whether they be for sales force compensation, new
product share forecasting, brand image analysis or whatever. As always, clear, careful,
concise communication is what it's all about. And please, please don't use total mean
substitution just to be able to show a regression base equal to the number of questionnaires
in hand. While that sounds like a no-brainer, it has been done.

Chi-Square Test:
By the Numbers: The cool logic of chi-square
Author - Stephen J. Hellebusch

Article Abstract

Uses brief examples to illustrate the capabilities of the chi-square test.

As many marketing researchers are aware, there are statistical tests built into the programs
we use to show survey data. Most of these are set to operate at the 90 or 95 percent
confidence level, and automatically test the difference between percentages in specified
columns, as shown in the mock data example in Table 1.

As some marketing researchers are aware, the automatic test built into the survey programs
is not the right test to use when there are more than two subgroups. You need a statistical
test that will look at three percentages simultaneously, and that test is the chi-square (not
to be confused with its cousin, the chi-square goodness of fit test).

The chi-square test looks at all the percentages and tests to see if what we have is different
than what we would expect to have by chance alone. The logic behind it is actually deeper
than this article will go, but, at one level it is cool.

Take a look at the mock data in Table 2 as an example. We want to know if the three
percentages differ significantly statistically at the 95 percent confidence level. If they do, we
will hypothesize that awareness decreases with education level.

As Table 2a shows, the first step is to eliminate all the things that make the table pretty, and
(oddly enough), to eliminate the percentages that we are interested in testing. We also add
a new row. Since awareness is a zero-one concept (you are either aware or you are not), we
add the number not aware, which we get just by subtracting the number aware from the
total. Next, we add the rows to get totals, and put the bases in as the column totals.

The chi-square test actually compares all the numbers in the cells to the number that you
would expect to be in the cell by chance. You get this number for one cell by multiplying the
row total by the column total and dividing by the total. For the first cell of 250, we would
expect (421x501)/1003 = 210 to be in it. For the Some College/Tech School + Aware cell,
(421x200)/1003 = 84, etc. The idea is neat. Table 2b shows the actual numbers and the
expected values in boldface.

That’s it. There is no need to go through the whole formula for chi-square, since it can be
found many, many places, and the rest of the logic of the test is the same as for all statistical
tests of difference. (Compare the obtained chi-square to the table value of chi-square that
one would expect if the percentages did not differ; if it is bigger, the percentages differ. If it
is not, they do not.)

The logic compares the actual cell values to the cell values you would expect if the
percentages do not differ, given that the row totals and column totals are what they are.
You are comparing all of the percentages at once, but the logic is based on the number you
expect to see in each cell. Better still, you can calculate that number and see for yourself, if
you are so inclined.
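For those so inclined, here is a minimal sketch of that expected-count arithmetic using the marginals quoted above (the third column base, 302, comes from subtraction; the article does not give the remaining observed cell counts).

import numpy as np

row_totals = np.array([421, 1003 - 421])   # aware, not aware
col_totals = np.array([501, 200, 302])     # the three education groups
grand_total = 1003

expected = np.outer(row_totals, col_totals) / grand_total
print(np.round(expected))   # top row starts 210, 84, ... exactly as computed in the text

# With the full table of observed counts in hand, scipy.stats.chi2_contingency(observed)
# returns the chi-square statistic, its p-value and this same table of expected counts.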

Is our example chi-square significant at the 95 percent confidence level? It certainly is! Of
course, it was constructed to have large differences, so that is really no surprise.

t-Test:
Nonparametric tests: sturdy alternatives
Author - William Bailey

Article Abstract

The current economic conditions have affected strategies of consumer research. This article
discusses alternative strategies that are often overlooked: nonparametric tests.

Does this situation sound familiar? "I can't afford the research plan you advise! Is there a
way we can do fewer surveys but still get usable and reliable results?" The current economic
conditions have affected strategies of consumer research. As a result, more and more
clients are trying to find ways to cut costs while at the same time delivering to business
objectives.

As market researchers, we tend to focus on crosstabulations that offer paired tests of
proportions and generally take the results right to the portion of the final report that details
the statistical results. This is not necessarily intended to be a criticism; it's just the way we
typically do consumer research. While this works in many cases, this author is finding that
clients are asking somewhat different questions: "How do these two products differ in
comparison to these other two?" "Is there a difference in opinion by product within gender
or age or...?" They also ask, "How do these product features rank as they apply to the
respondent's overall opinion of my company?" As you can see, these questions begin to
move things beyond the realm of basic data evaluation.

The preferred research plan is to interview a sufficient number of consumers to make the
results statistically reliable at the 90 percent or 95 percent level of confidence with a certain
margin of error, e.g., ±5 percentage points. Why? Because that is what we have always
done! Depending on how one sets the constraint parameters, this works out to be from 250
to 350 completed interviews at the base level of analysis and then we work up from there.
With this base we can apply standard analysis tools such as paired t-tests, analysis of
variance, and factor or regression analysis with reasonable comfort. Further, for this
response base there is usually only marginal violation of the implied assumptions; the data
approach a normal distribution with reasonable homogeneity of variance. But is this always the
case? Depending on the response scales used, more likely not; there is some violation we may
be overlooking. I am not suggesting that we have done a bad job; we just haven't always done a
job appropriate to the data's characteristics.

Back to the statement: "I can't afford the research plan you advise! Is there a way we can do
fewer surveys but still get usable and reliable results?" Not to worry. There are alternatives
available that are often overlooked. These approaches fall into the general category of
sturdy or distribution-free statistics or, more specifically, nonparametric statistics.

Sturdy statistics
Most market researchers automatically use procedures that assume that the measurements
are drawn from a normal distribution and then proceed to test hypotheses on parameters
such as the mean or the variance (usually the standard deviation, which is the square root of
the variance). Useful tests include but are not limited to the Student's t or the Z statistic,
various forms of regression analysis, and/or analysis of variance to help understand a
study's result and/or differences between product or control/treatment sets. These tools
are a part of what is called parametric statistical tests.
While some of these statistical tests do work well even if the assumption of normality is
violated, extreme violations of this assumption can affect the interpretation of the results.
There are technical reasons behind this, such as the fact that the effect of violating the
assumption of normality is to decrease the Type I error (a conclusion is drawn that the null
hypothesis is false when, in fact, it is true), but that is beyond the scope of our intent here.
If a violation of an assumption is realized, or, as is often the case, if the sample size desired
for the analysis base is small, e.g., under 20 or 30 observations - when "traditional"
statistical tests become questionable, there is a collection of tests that do not depend that
much on the precise shape of the distribution. This class of statistical tests is based
on the signs of differences, ranks of measurements, and/or counts of objects falling into
categories. Such methods do not rest heavily on the specific parameters of the
distribution, and for this reason are called nonparametric or distribution-free tests. They make
either no assumptions, or far less stringent ones, about the distribution from which the numbers
were sampled.
However, the term nonparametric is somewhat misleading, since these statistics do in fact
deal with parameters such as the median of a distribution or the probability of success p in a
binomial distribution. The main advantage of many of the methods described herein is that
they hold up against outliers, markedly non-normal distributions and other failures of
assumptions. Statisticians use adjectives such as "robust," "resistant" and
"sturdy" to describe them.
Specifically, and more importantly, sturdy statistical techniques provide comparable test
results to traditional tests when the samples are from asymmetric or skewed distributions.
Here the term "power" is usually introduced. While there are transformations available such
as taking logarithms or square roots of the data to bring them more in line with appropriate
parametric assumptions, sturdy or distribution-free tests are a worthwhile alternative.
Further, sturdy statistical methods are useful in cases when the researcher knows nothing
about the parameters of the variable of interest in the population (hence the name
nonparametric).
A comparison

This section provides a comparison between tests in these two classifications (called
parametric and nonparametric in the table) based on some popular study scenarios. It is not
meant to be all-inclusive.
Most parametric tests have their nonparametric analogues. In other words, nonparametric
tests exist for most situations a market analyst commonly uses: two independent groups,
two matched groups, and multiple groups. The primary difference is that the data is no
longer interval; instead it is ordinal (or is treated as ordinal). The table summarizes several
"crossover" tools. It offers a very simple comparison between several parametric tests with
their analogues.

Parametric Tests        Nonparametric Tests

Independent t-Test      Mann-Whitney; Median
Matched Pairs t-Test    Wilcoxon; Sign Test
One-Way ANOVA           Kruskal-Wallis

While nonparametric tests make fewer assumptions regarding the nature of distributions,
they are usually less powerful than their parametric counterparts. However, in cases where
assumptions are violated and interval data is treated as ordinal, not only are nonparametric
tests more proper, they can also be more powerful.
This section highlights the applicability of the nonparametric tests noted above. For more
detailed information the reader is directed to a statistical resource, the Internet, or software
packages such as (but certainly not limited to) SPSS, SAS, and Prophet. (The author is not
endorsing any of these packages, and no rank order is implied.)
The Mann-Whitney U test is the most popular of the two-independent-samples tests. It is
equivalent to the Wilcoxon rank sum test and the Kruskal-Wallis test for two groups. Mann-
Whitney tests whether two sampled populations are equivalent in location. The
observations from both groups are combined and ranked, with the average rank assigned in
the case of ties. The number of ties should be small relative to the total number of
observations. If the populations are identical in location, the ranks should be randomly
mixed between the two samples. The number of times a score from Group 1 precedes a
score from Group 2 and the number of times a score from Group 2 precedes a score from
Group 1 are calculated. The Mann-Whitney U statistic is the smaller of these two numbers.

The Median test tests whether two or more independent samples are drawn from
populations with the same median using the chi-square statistic. This test should not be
used if any cell has an expected frequency less than one, or if more than 20 percent of the
cells have expected frequencies less than five.
The Wilcoxon test is used with two related variables to test the hypothesis that the two
variables have the same distribution. It makes no assumptions about the shapes of the
distributions of the two variables. This test takes into account information about the
magnitude of differences within pairs and gives more weight to pairs that show large
differences than to pairs that show small differences. The test statistic is based on the ranks
of the absolute values of the differences between the two variables.
The Sign test is designed to test a hypothesis about the location of a population distribution.
It is most often used to test the hypothesis about a population median, and often involves
the use of matched pairs, for example, before and after data, in which case it tests for a
median difference of zero. In many applications, this test is used in place of the one sample
t-test when the normality assumption is questionable. It is a less powerful alternative to the
Wilcoxon signed ranks test, but does not assume that the population probability distribution
is symmetric. This test can also be applied when the observations in a sample of data are
ranks; that is, ordinal data rather than direct measurements.
The Kruskal-Wallis test is used to test the null hypothesis that "all populations have identical
distribution functions" against the alternative hypothesis that "at least two of the samples
differ only with respect to location (median), if at all." It is the analogue to the F-test used in
analysis of variance. While analysis of variance tests depend on the assumption that all
populations under comparison are normally distributed, the Kruskal-Wallis test places no
such restriction on the comparison. It is a logical extension of the Wilcoxon-Mann-Whitney
test.
The Spearman Rank Correlation Coefficient bases itself on the rank ordering of each
variable. It may also be a better indicator that a relationship exists between two variables
when the relationship is non-linear.
Kendall's tau-b is a measure of association for ordinal or ranked variables that takes ties into
account. The sign of the coefficient indicates the direction of the relationship, and its
absolute value indicates the strength, with larger absolute values indicating stronger
relationships.
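As a quick reference, here is a sketch of how a few of the tests named above are invoked in scipy.stats; the ratings are fabricated purely to show the calls.

import numpy as np
from scipy import stats

rng = np.random.default_rng(7)
group_a = rng.integers(1, 6, 40)    # 5-point ratings, Product A
group_b = rng.integers(1, 6, 40)    # 5-point ratings, Product B
group_c = rng.integers(1, 6, 40)
before, after = rng.integers(1, 6, 30), rng.integers(1, 6, 30)

print(stats.mannwhitneyu(group_a, group_b))      # two independent samples
print(stats.wilcoxon(before, after))             # two related samples
print(stats.kruskal(group_a, group_b, group_c))  # three or more independent groups
print(stats.spearmanr(group_a, group_b))         # rank correlation
print(stats.kendalltau(group_a, group_b))        # Kendall's tau-b, ties handled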
Validate, validate, validate
While in most cases, we are able to be "traditional," there are alternatives if the situation
warrants. Regardless, the analyst has a basic responsibility: validate, validate, validate, and
then analyze and interpret with confidence.

ANOVA and ANCOVA:
Using ANCOVA to gauge the impact of demographic differences on
satisfaction
Author - Timothy Taylor

Article Abstract

In instances where surveyed groups may have a known history of responding to questions
differently, rather than using the traditional method of weighting to address those
differences, analysis of covariance (ANCOVA) can be employed. ANCOVA looks at the
correlation between a dependent variable and the covariate independent variables and
removes the variability from the dependent variable that can be accounted for by the
covariates.

In market research, we often run into situations where we are trying to understand what
may be driving significant differences in satisfaction levels between two separate groups of
respondents. This article aims to help researchers discover (or rediscover) a useful analytical
tool to better understand the true differences in satisfaction between respondent groups.

Let’s take the example where there are two independent samples - one consists of users of
Product A and the other of users of Product B. Let’s say that Product A receives significantly
higher satisfaction ratings than Product B. Are these differences due to the fact that Product
A is actually better than Product B? Or could there be variation in the demographic
composition of the two user groups that could be accounting for the observed difference in
overall satisfaction?

For instance, certain demographic groups are known to give consistently higher satisfaction
ratings. So if more of these individuals make up the population evaluating Product A than
Product B, does that account for the observed difference in overall satisfaction between the
two products?

It’s a critical question, and one that might be addressed through traditional weighting - by
adjusting the populations so that the demographic profiles of the two respondent groups
are nearly the same. However, weighting may not always be the best solution given that the
model used for weighting the data may be somewhat arbitrary. For example, should the
Product A group be weighted to match the demographic profile of Product B, or vice versa?
Or, should both groups be weighted to some separate standard? Also, if the weighting
factors become too large, undesirable error could be introduced into the analysis.

To address this problem, a technique called analysis of covariance or ANCOVA can be
successfully applied. This technique has been around for many years and is well known
among statisticians. However, non-statisticians may not be as familiar with the technique
and therefore be more likely to turn to basic weighting to answer the type of question
posed here.

The ANCOVA technique looks at the correlation between a dependent variable (overall
satisfaction in this case) and the covariate independent variables (the demographic
variables) and removes the variability from the dependent variable that can be accounted
for by the covariates. Differences in the residual dependent variable as a function of the
original independent variables are then tested for significance. The focus of this analysis is
whether the observed differences in satisfaction are still true after the differential
demographic composition of the groups has been taken into account.
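In practice, the ANCOVA described here can be run as an ordinary regression of the dependent variable on the group factor plus the covariates, and then checking whether the group effect survives. Below is a minimal sketch using statsmodels with fabricated data; the variable names are hypothetical.

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "group": rng.choice(["A", "B"], n),       # Product A vs. Product B users
    "age": rng.integers(18, 75, n),
    "income": rng.normal(60, 20, n),          # in thousands
})
# Fabricated outcome: satisfaction driven by the demographics, not by the product
df["satisfaction"] = 5 + 0.03 * df["age"] - 0.01 * df["income"] + rng.normal(0, 1, n)

ancova = smf.ols("satisfaction ~ C(group) + age + income", data=df).fit()
print(ancova.summary().tables[1])   # is the C(group) effect still significant once
                                    # age and income have soaked up their share?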

Case study example

To illustrate how this technique might be applied, and what the output might look like, the
following is an example from an actual research effort (the client and product names have
been kept confidential).

In this example, a major bank was looking at the satisfaction levels of one of its cobranded
credit cards and comparing it with satisfaction levels with its basic, non-rewards card. What
it found was troubling: The rewards card showed a 15 percent lower top two box
satisfaction rating than the basic card. However, a quick demographic analysis revealed the
rewards customers to be younger, higher-income Caucasians - known for often having lower
satisfaction ratings than their counterparts.

So the question became, is the large gap in satisfaction due to these demographic
differences, or was there something about the rewards card itself that was contributing to
the lower satisfaction levels?

To answer this question, we conducted an ANCOVA analysis which looked at whether there
were significant differences between the rewards card and the basic card on five key
measures - both before the ANCOVA analysis and after.

As can be seen in Table 1, prior to conducting the ANCOVA analysis, significant differences
existed between the rewards card and the basic card on all five key measures. After the
effects of age, ethnicity and income had been controlled by running the ANCOVA model,
significant differences in the key satisfaction measures between the rewards card and the
basic card still existed.

It was concluded from this analysis that the variation in the demographic composition of the
two respondent groups was not driving the differences in the satisfaction ratings between
the two cards. Rather, something else about the card itself, or the customer experience,
must account for the differences.

In this particular case, once the influence of demographics was ruled out, and a thorough
analysis of verbatim comments was completed, it was concluded that customer concerns
over the financial stability of the bank’s particular cobrand partner were adversely
impacting satisfaction ratings with the rewards credit card.

Significant advantages

While ANCOVA has been in the researcher toolbox for many years, it may not be the
solution that immediately comes to mind when trying to sort out the issues presented in
this article. In fact, many researchers might first think of using weighting to try to level the
playing field between the two samples. However ANCOVA offers significant advantages over
the use of weighting, including: the absence of arbitrary decisions on which sample is
weighted, the ability to handle multiple differences in the samples at the same time
(eliminating the need for complex, tiered weighting schemes), and, as a result, the model
can be expanded to multiple degrees of complexity.

In short, the ANCOVA technique provides a telling way of describing true differences
between respondent groups - controlling for compositional variation in the samples in a
reliable manner.

Importantly, the ANCOVA technique can also be quite helpful in monadic research designs
where the sample is split into various groups - with each group evaluating only one
particular item (product concept, company positioning, etc.). This is done to reduce
respondent fatigue. However, differences in the sample composition among the groups can
obviously adversely impact the ability to make reliable conclusions. In this case, ANCOVA
can be used to successfully sort out whether the differences in evaluations seen across the
cells in a monadic design are real or based on differences in the composition of the
samples.

Regression Analysis:
Regression regression
Author - Gary M. Mullet

Article Abstract

Much has been written recently about using regression analysis in marketing research. This
article addresses some of the fundamental underpinnings of regression analysis, irrespective
of particular applications.

Much has recently been written, on these pages and elsewhere, about using regression
analysis in marketing research. In fact, awhile back in Quirk's Marketing Research Review
there was a very enlightening, informative and somewhat heated series on using (or not
using) regression analysis in determining derived attribute importance weights in
customer/client satisfaction measurement studies. (cf. "Regression-based satisfaction
analyses: Proceed with caution" October 1992 QMRR, "Appropriate use of regression in
customer satisfaction analyses: a response to William McLauchlan" February 1993 QMRR.)

The following few paragraphs will not enter further into that fray. Instead, we'll go back into
some of the more fundamental underpinnings of regression analysis, irrespective of
particular applications. This is motivated by some recent questions, comments and
observations from a variety of sources, all aimed at a better understanding of this
fundamental statistical analysis tool. Of course, a lot of these issues have been treated
elsewhere, too, but the time seems opportune for a memory refresher.

Missing data (or How come I paid for 750 interviews and your regression analysis only used
37 of 'em?)

Unlike when you took your introductory statistics course (or vice versa), real respondents
frequently fail to answer every question on a survey. Consequently, many times when we
run a regression analysis using canned software programs, we end up with many fewer
respondents than anticipated. Why? Because, the packages assume regression analysis to
be a multivariate procedure (many statisticians don't, by the way) and drop any respondent
from the analysis who fails to answer even a single statement from your set of independent
or dependent variables.

What to do, what to do? I'm not sure that there is a definitive answer, but here's what some
analysts do. First, there's always the option of using just the respondents who've answered
everything. This has the effect of you basing your research report on 50 percent or so of the
respondents, sometimes less. Many times this is (subjectively) appealing, since you have the
assurance that your model is based on those who answered each and every question. It's
not so appealing, however, when you start with a sample of 1,000 or so and end up with
only 100 of these dictating your regression results - especially if the model is to determine
compensation or to forecast sales.

Many analysts prefer to use mean substitution for any values which are missing. The
software packages automatically, if you tell them to, substitute the mean of everyone who
did answer a particular question for the missing answer of those who, for whatever reason,
didn't. Here's what can happen when you use this option:

If there is a given question that, say, 90% don't answer, the mean answer of the remaining
10% who did is substituted and used as if that's what the other 90% said.

A respondent who answers few, or no, questions can still be included in the analysis, unless
the analyst overrides the automatic substitution of means, since a mean value is also used
for the dependent regression variable. Oops!

Now, both of these things happen, probably rarely, but certainly more than they should. It
seems obvious that maybe we should look at each question to see what the item
nonresponse is and if there are questions that are particularly high on item non-response,
try the regression without them. Common sense also says to exclude respondents who fail
to answer the question we're using as a dependent variable (overall opinion, overall
satisfaction, or some such) and also to exclude any who don't answer a specified minimum
number of independent variable ratings, 75% or whatever criterion you decide on. Luckily,
both of these can be easily done with the current software. Sadly, as noted, they aren't
always.

An alternative that seems to be gaining favor is to use only those who've answered the
dependent variable (kind of a no-brainer in most circles) and then substitute the
respondent's own mean on the ones that he or she answered for the ones he or she didn't.
Again, you'll probably only want to do this for those who've answered a majority of the
items. This variation takes scale usage into account and is appealing because to some
respondents "there ain't no tens."

Significance testing (or If they're all significant how come no single one is?)

Regression software packages generally test two types of statistical hypotheses
simultaneously. The first type has to do with all of the independent variables as a group. It
can be worded in several equivalent ways: Is the percentage of variance of the dependent
variable that is explained by all of the independent variables (taken as a bunch) greater than
zero or can I do a better job of predicting/explaining the dependent variable using all of
these independent variables than not using any of them? Anyway, you'll generally see an F-
statistic and its attendant significance, which helps you make a decision about whether or
not all of the variables, as a group, help out. This, by the way, is a one-sided alternative
hypothesis, since you can't explain a negative portion of variance.

Next, there's usually a table which shows a regression coefficient for each variable in the
model, a t-statistic for each and a two-sided significance level. (This latter can be converted
to a one-sided significance level by dividing by two, which you'll need to do if, for example,
you've posited a particular direction or sign, a priori, for a given regression coefficient.)

Now here's the funny thing: You will sometimes see regression results in which the overall
regression (the F-statistic, above) is significant, but none of the individual coefficients are
accompanied by a t-statistic which is even remotely significant. This is especially common
when you are not using stepwise regression and are forcing the entire set of independent
variables into the equation. How can this be? It can be because of the correlation between
the independent variables. If they are highly correlated, then as a set they can have a
significant effect on the dependent variable. Individually they may not.

Look at it this way. Let's say that we are measuring temperature in both degrees Fahrenheit
and degrees Celsius, measuring with thermometers, rather than measuring one and
calculating the other using the formula most of us hoped we'd never have to remember
beyond high school chemistry. (By measuring, given the inaccuracies of most thermometers,
the computer won't give us nasty messages. It probably would if we measured only one and
calculated the other.) Next let's say we're going to use these two temperatures to predict
something else we've experimentally determined, like pressure. Clearly, the two
temperatures together will explain a significant proportion of the pressure in a closed
container. Also, maybe not as clearly, neither will individually be significant because they
are correlated with each other and each is redundant given the other. That last clause is the
kicker.

When you look at the significance of a regression coefficient (or the coefficient itself, for
that matter) you are seeing the effect of that particular variable, given all of the others in
the model. This is properly called the significance of a partial regression coefficient and the
B is the partial regression coefficient itself. Either degrees Fahrenheit or degrees Celsius,
alone, would be a significant predictor of pressure (this is the total regression) but either
given the other, the partial effects, will not be significant.
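The temperature story is easy to reproduce. Here is a sketch with simulated data: two nearly redundant predictors jointly explain pressure very well, yet neither partial coefficient need look significant on its own.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
true_temp = rng.uniform(10, 40, 30)
deg_c = true_temp + rng.normal(0, 0.3, 30)                # Celsius, measured with error
deg_f = true_temp * 9 / 5 + 32 + rng.normal(0, 0.5, 30)   # Fahrenheit, measured with error
pressure = 100 + 2 * true_temp + rng.normal(0, 2, 30)

fit = sm.OLS(pressure, sm.add_constant(np.column_stack([deg_c, deg_f]))).fit()
print(fit.f_pvalue)     # the overall F test: the pair is highly significant
print(fit.pvalues[1:])  # the partial t tests: individually, often not significant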

This makes sense, I hope, and will help explain some seemingly strange regression outputs.
It also leads us to the next issue.

Wrong signs (or How can the slope be negative when the correlation isn't?)

This happens all the time. You know that overall satisfaction and convenience are positively
correlated -higher ratings on one of these go with higher ratings on the other and the lower
ratings go together, too. Yet in a multiple regression, the sign of the coefficient for
convenience is negative. How come?

There are a couple of things which can be going on. First, the t-statistic for the coefficient
may not be statistically significant. We interpret this as an indication that the coefficient is
not significantly different from zero and, hence, the sign (and magnitude for that matter) of
the coefficient are spurious. Fully half the time, for a truly nonsignificant effect, the sign will
be "wrong."

The other thing that can be happening is the partialling effect noted above. It could be that
the slope is negative given the effect of the other variables in the regression (partial) even
though all by itself the variable shows a positive correlation and slope (total).

Try this. Here's a small data set.

Resp. #   OVERALL   Package   Value   Taste   Color
1         5         6         9       13      11
2         3         8         9       13      9
3         9         8         9       11      11
4         7         10        9       11      9
5         13        10        11      9       11
6         11        12        11      9       9
7         17        12        11      7       11
8         15        14        11      7       9

Let OVERALL be the dependent variable and run various regressions, first with each
independent variable by itself and then with various combinations of the independent
variables. Some things to look for are:

Correlations between all variables, both in magnitude and sign;

Sign and magnitude of regression coefficients (B) when each variable is by itself as a
predictor;

Sign and magnitude of regression coefficients for the variables when they are working with
the other variable(s).
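One convenient way to run that exercise is sketched below with statsmodels; any regression package will do the same job.

import pandas as pd
import statsmodels.formula.api as smf

data = pd.DataFrame({
    "OVERALL": [5, 3, 9, 7, 13, 11, 17, 15],
    "Package": [6, 8, 8, 10, 10, 12, 12, 14],
    "Value":   [9, 9, 9, 9, 11, 11, 11, 11],
    "Taste":   [13, 13, 11, 11, 9, 9, 7, 7],
    "Color":   [11, 9, 11, 9, 11, 9, 11, 9],
})

print(data.corr()["OVERALL"].round(2))        # total correlations with OVERALL
for formula in ("OVERALL ~ Package", "OVERALL ~ Taste",
                "OVERALL ~ Package + Taste", "OVERALL ~ Package + Value + Taste"):
    fit = smf.ols(formula, data=data).fit()
    print(formula, "->", fit.params.round(2).to_dict())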

You should see some interesting things with respect to both the magnitude and sign of
some of the regression coefficients. Are the signs sometimes "wrong?" If so, are they wrong
when the variable is used alone or wrong in later models? Possibly they are wrong neither
time, they just happen to not agree. By the way, checking the F-statistic for the models
including more than one predictor and the individual t-statistics will also hammer home
some of the earlier points.

Conclusions

The preceding paragraphs have addressed three or four pragmatic issues about using
regression analysis. While they do more than scratch the surface, there are certainly other
regression fundamentals that were neglected. However, taking the small data set provided
and doing a thorough set of regressions should allay many of the qualms that regression
analysts face.

To progress you must first regress


Author - Joseph L. Kreitzer

Article Abstract

This article discusses the advantages of using regression analysis to analyze data. It
distinguishes this technique from analysis of variance and goodness of fit, which also strive
to determine the relevance of one measure to another.

Regression. The very word glazes the eyes of capable researchers. It fills their minds with
thoughts of their former selves and conjures dark, Freudian images among those with
psychology backgrounds.

The reality of regression analysis, however, is that it provides a tool offering all of the
analysis potential of ANOVA (Analysis of Variance), but with the added ability of answering
important questions ANOVA is ill-equipped to address.

If regression isn't a Freudian term, then just what is it?

There are many statistical techniques for determining the relevance of one measure, e.g.
purchases, to another, e.g. price. The strongest of these techniques are analysis of variance,
goodness of fit (GOF) (e.g. cross-tabulations), and regression analysis. Each is capable of
answering the same basic question of whether or not variation in one measure can be
statistically related to variation in one or more other measures. Only regression analysis,
however, can specify just how the two measures are related. That is, only regression can
provide quantitative as well as qualitative information about the relationship.

For example, suppose that you need to address the question of whether or not sales (in
numbers of units sold) can be related to your price. Suppose further that you use ANOVA
and find a statistically significant relationship does exist. That's it. You've gone as far as
ANOVA can take you.

Regression analysis offers additional information which neither ANOVA nor GOF can
provide. Specifically, regression analysis can tell you how much an additional dollar (i.e.
change in price) can be expected to change sales. This information provides an objective
basis for the infamous "what if" problems so central to Lotus-type simulations.

A second advantage of regression analysis is its ready application to graphic imagery.


Regression is sometimes referred to as "curve" or "line" fitting. The regression output,
indeed, yields an equation for a line which can be plotted. The old adage about "a picture is
worth a thousand words" holds especially true for regression lines. The image of actual data
plotted against predicted values is instantly accessible to even the staunchest statistical
cynic. A regression which has been well-specified provides its own pictorial justification.

In the beginning there are numbers

Suppose that you have measured the numbers of units sold in various weeks and kept track
of the price you had charged during the same weeks. Table 1 contains some hypothetical
pairs of sales and prices. A scatter plot, depicted in Figure 1, provides some immediate
sense of the relationship between the two measures.

Table 1
Hypothetical Sales and Prices

Week    Sales (Units)    Price
1       33,800           389
2       37,000           383
3       36,300           368
4       39,700           361
5       36,800           354
6       37,600           352
7       37,600           364
8       37,000           365
9       38,100           346
10      40,100           350

Both correlation analysis and regression analysis work from the respective average values of
the two measures. In effect, they attempt to determine whether deviations of one measure
from its mean correspond systematically to deviations of the other measure from its mean.
The dotted lines which note the location of the respective mean values split the scatter
diagram into four quadrants, labeled I, II, III, IV.

In our example, we would like to know if above average sales could be systematically
related to less than average prices (and therefore less than average sales could be related to
above average prices). If this situation exists, then the preponderance of points in our
scatter diagram should lie in quadrants II and IV.

If we demonstrate that this relationship exists, we can evaluate the wisdom of decreasing
the price as a tool for increasing sales. This scenario is illustrated in Figure 1. If no systematic
deviation can be shown, then one could not count on any predictable response in sales to
variations in the price. Essentially, the numbers of sales would be unpredictable (by price).
Knowledge of the price in relation to average price would not cause you to change your
projected numbers of sales from the average of all past weeks. This case is illustrated in
Figure 2.

In the former case, a line drawn through the points would have a negative slope, i.e. an
increase in price would be associated with a decrease in sales. The line forms the basis for
projecting likely sales for any price. In the latter case the best fitting line would be a
horizontal line, i.e. a line with a slope of zero.

(This case does occur, by the way. In one instance a firm's price had no statistically
significant impact on their sales. The reason, as it turned out, is that the firm was a price
"follower" who adjusted price in relation to the industry leader. The significant price turned
out to be the competitor's, not their own.)

In the middle there is methodology

I remember my introduction to regression. After carefully plotting points on a scatter
diagram, we were instructed by the prof to take out a ruler and place it across the diagram
in what we considered to be the best location. After sketching in our "fitted" lines we were
to come up with the equation of the line we had drawn in, which was simple enough,
although altogether subjective.

After everyone reported their variants of the same equation, we learned the unambiguous
technique of least squares. Least squares is a mathematical solution to the problem of
finding the "best" line. It does so by finding the equation of the line which has the least
residual (remaining) variation of actual values from the fitted line. Any other line you draw
in will have a larger typical error.

In addition to its unambiguous solutions, the least squares method is able to incorporate
more than just one explanatory variable and has known statistical properties. Without the
knowledge of statistical properties one could not definitively note dependence of one
measure on another measure. The inability to do the former yields an inappropriately naive
view of the world.

In the end there will be more numbers

Fortunately, knowledge of the mechanics of the regression algorithms is unnecessary for
successful application and interpretation of a regression line. Regress, if you will, to sixth
grade math, when you learned the intricacies of graphing lines. A simple line consists of two
parts. The intercept of a line gives the value of a dependent measure, sales, when the
independent measure, price, has the value zero.

In our example, the intercept would provide the number of sales which are likely to occur
when the product was given away free. Notice the apparent silliness of this interpretation.
Don't fret about it, however, as this intercept should not be literally interpreted this way. It
is necessary for the statistical interpretation to have an intercept. The regression algorithm
would tell us that this number is 69,438.

The second part of the line is the interesting one. It shows how to transform one measure
into the first by means of a "slope." The slope gives the change in the dependent measure
for a one unit change in the independent measure. In our example, it shows the additional
sales related to the change in price of one unit, e.g., $1. The regression-supplied estimate of
this number is -88. (The package I used reported this number to be -88.2120, but for clarity
of illustration we needn't bother with all of the extra digits. The same is true of the intercept
reported above.)

The regression would then take on the form:

Q = 69,438 - 88 P

where: Q is sales in units, P is price in dollars

If you wanted to know the likely number of sales when the price is set at 300, you would
solve the equation:

Q = 69,438 - (88 * 300) = 69,438 - 26,400 = 43,038

Similarly, sales at a price of 350 would likely be:

Q = 69,438 - (88 * 350) = 69,438 - 30,800 = 38,638

How much will a $1 change (which implies a positive change, or increase) in price affect
sales? Simply read the slope coefficient. A $1 change in price will decrease sales by 88 units.
A $10 change in price will decrease sales by 880 units. A $5 decrease in price will increase
sales by 440 units.

Finally, there is reality

It would be a rare day in the real world when variation in a measure could be explained
solely by variation in only one other measure. In our example we might well expect, and
find, that our advertising and our competitors' prices influence sales of our product.
Regression analysis can incorporate these new measures in the same manner as our price.

Each new explanatory measure is equipped with its own, unique slope which transforms the
variations of the new measure into variations of the dependent measure. The interpretation
and arithmetic for each additional measure are as above.

Suppose that we included information about our advertising when estimating our regression
line. The comparable advertising figures, for example, numbers of column inches published
per week, are listed in Table 2. It seems only reasonable that the more we advertise, the
higher the likely number of sales. The regression algorithm reports the advertising coefficient to be 145.
This means that one additional column inch of ad space increases sales by 145 units.

Table 2
Hypothetical Advertising Amounts

Week    Advertising
1       995
2       1010
3       995
4       998
5       982
6       981
7       992
8       992
9       976
10      978

Since our advertising now can explain part of the sales variations, the role of price in sales
can become clearer. We see this in a different coefficient for the price variable, -178, as
well as a different intercept, -41,794.

The revised regression now looks like:

Q = -41,794 - 178 P + 145 A


where: A is our advertising space in column inches
If we charged a price of $350 and purchased 1000 column inches of advertising then our
likely sales could be estimated as:

Q = -41,794 - (178 * 350) + (145 * 1000) = -41,794 - 62,300 + 145,000 = 40,906

The slope coefficients are interpreted in the same way as above, with one important
warning. The coefficients yield the expected change in the dependent measure for a one
unit change in each respective independent measure, assuming that the other independent
measure(s) do not change in value. Specifically, a one dollar change in the price will likely
decrease sales by 178 units, assuming no change in advertising. Similarly, a one unit change
in advertising will increase sales by 145 units, assuming price remains constant.

If both price and advertising are to be changed then the effects are simply added together.
If price is to be reduced by 5 (i.e. changed by -5) and advertising increased by 2, then the net
change in sales would be given by:

Change in Q = (-178 * -5) + (145 * 2) = 890 + 290 = 1,180
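Encoded as a tiny function, the fitted equation makes these what-if calculations mechanical (a sketch; the coefficients are simply the ones reported above).

def predicted_sales(price, advertising):
    # Q = -41,794 - 178 * P + 145 * A, the two-variable model fitted above
    return -41794 - 178 * price + 145 * advertising

print(predicted_sales(350, 1000))                                  # 40,906, as before
print(predicted_sales(345, 1002) - predicted_sales(350, 1000))     # +1,180 for the combined change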

Adding additional information, such as competitors' prices and advertising, causes similar
modifications in the equation and its interpretation.

Evaluating a regression's fit

Anyone can draw a line through a set of points. But clearly a (straight) line drawn through a
circle does not describe the circle. One must be able to evaluate the reasonableness of a
regression to use it wisely. Effective use of regression is clearly an acquired skill, but even a
novice should be able to make some preliminary judgments regarding a regression's fit.
There are basically three diagnostic processes in evaluating a regression: comparing
coefficients' signs, determining significance of each variable, and evaluating the overall fit.

Signs. The first step in evaluating a regression is to consider the signs of the parameters. If
you have reason to believe two measures are positively correlated then the regression
coefficient should be positive. If a negative coefficient showed up then you basically have
two conclusions: you were wrong in expecting the positive correlation or your regression
equation is misspecified, i.e. contains redundancies or inadequate information.
Misspecification is a serious problem, and is discussed in greater detail below.

If the regression coefficients have the expected sign, as ours do, then you have some
assurance that you have chosen the "correct" set of independent measures. The next
question to address is whether or not those coefficients are meaningful. The problem is that
we have only sampled the relationship between sales, price, and advertising. As is true with
any sample, the observed value needn't, and in all probability won't, coincide with the
"true" value.

Significance. One coefficient value, in particular, is of great concern. If the true coefficient
value is zero then the two measures are not related. (Their correlation coefficient would be
zero and the partial F statistics insignificant.) Even if the two are unrelated, unfortunately, a
sample coefficient would not likely be zero. In most cases a Student's t test is employed to
determine the significance of the coefficient. If sufficient numbers of observations are
available then a normal distribution can be used.

In our case, the estimated t values for the price and advertising measures are - 4.06 and
2.48, respectively. The regression has 7 degrees of freedom (10 observations minus three
estimated coefficients). Using a .05 level of significance, the critical t value is 2.36. Since
both estimated t values exceed the critical value we can conclude that the measures are
significant in explaining variation in sales. (If you used ANOVA to test for a relationship you
would obtain significant F values for both measures.)
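Should you wish to verify the critical value yourself, a short check with SciPy, assuming a two-tailed test at the .05 level with 7 degrees of freedom, reproduces the 2.36 figure:

from scipy import stats

# Two-tailed test at the .05 level with 10 - 3 = 7 degrees of freedom
t_crit = stats.t.ppf(1 - 0.05 / 2, df=7)
print(round(t_crit, 2))                              # 2.36

# Both estimated t values from the example exceed the critical value
for name, t_val in [("price", -4.06), ("advertising", 2.48)]:
    print(name, abs(t_val) > t_crit)                 # True for both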

Note that the magnitude of the coefficient isn't sufficient information to determine
significance. Large coefficients might not be "large enough" to be significantly different from
zero while some very small coefficients might be more than "large enough" for significance.

Fit. One could have a regression for which the signs were correct and the coefficients
significantly different from zero, but which has little practical value.

The most intuitive explanation of overall fit is the "coefficient of determination" most
commonly called R2, or "R squared." It reports the percentage of the total variation of the
dependent measure which is explained by the regression equation. The number ranges from
0 to 1, where one hopes for larger values. A regression with an R2 of .1 is not very complete,
casting doubts on the accuracy of the estimated coefficients. An R2 approaching 1 suggests
that the regression's independent measures can explain virtually all of the variations of the
dependent measure.

A mathematical quirk allows this R2 measure to increase as more independent measures are added. To counteract this inflationary tendency a second R2 called the "adjusted R2" is
calculated which handicaps the R2 by the number of independent measures used in the
regression. It is interpreted the same way as the R2, and is considered the more accurate of
the two statistics.

In our case these numbers are 72.9 and 65.2, respectively. The regression can be said to
explain 65.2 percent of the variation in sales. Is 65.2 percent "enough" to justify the
technique? Fortunately there is a statistical test, you ANOVA users have been thinking about
for several paragraphs, which can answer this question. The R2 statistic is simply a variant of
the F statistic. Indeed the F statistics generated by an ANOVA routine and by a regression
routine are identical. If the F statistic is significant, then the explained variation is
significantly greater than that left unexplained. In our case the F statistic is 9.45, greater
than the critical value of 4.74.
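These summary statistics are easy to recompute by hand. A brief sketch, assuming the example's R2 of .729 with 10 observations and two predictors, shows how the adjusted R2 and the F statistic follow from it (small rounding differences aside):

from scipy import stats

n, k, r2 = 10, 2, 0.729        # observations, predictors, R-squared from the example

adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)          # handicapped for extra predictors
f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))           # explained vs. unexplained variation
f_crit = stats.f.ppf(0.95, dfn=k, dfd=n - k - 1)

print(round(adj_r2, 3))        # about 0.652
print(round(f_stat, 2))        # about 9.4 (9.45 in the example, a rounding difference)
print(round(f_crit, 2))        # about 4.74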

One additional statistic generated by a regression is the standard error of the regression. It
is a measure of the residual variance. In our example the remaining variance is 1039,
representing an average percentage error of only 1.7 percent.

Figure 3 shows the actual and fitted values for each of the 10 weeks, using both price and
advertising to explain variations in the sales volume.

If only it were this simple

There are pitfalls in regression analysis that can undermine the veracity of the entire
estimated equation. Basically there are two categories of problems, only one of which
typically poses major problems for the researcher.

The more benign types of problems generally lead to inflated error terms - and therefore
decreased accuracy. They do not, fortunately, lead to biased estimates. There are problems
in reliably interpreting individual coefficients in these cases but not for using the equation as
a whole. These problems are:

Multicollinearity. The independent measures are, themselves, closely correlated. This causes confusion in determining the importance of any single measure. For example, if one
firm "followed" the pricing of a competitor then inclusion of both prices would be
redundant.

Autocorrelation. The error terms are related sequentially. This type of problem is common
in time series data, where sales might fluctuate around some long-term growth path. Much
like the concept of a business cycle, you might find the regression consistently over-
estimating values for sequential periods, only to begin a pattern of underestimation.

Heteroscedasticity. This problem arises when a regression fits "better" for some values of an
independent measure than for other values. An example would be an equation for sales, which seems to compound over time. By virtue of its compounding, the equation will
generate smaller absolute errors early in the series rather than later, when the magnitudes
of the measure are much larger.

At this point it is worth noting simply that techniques have evolved which neutralize many
of the effects these complications generate. Failure to deal with these problems yields
greater uncertainty than is necessary, but does not, in general, tend to seriously mislead the
analyst.
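For readers who want to screen for these three conditions in their own data, the sketch below shows one common diagnostic for each, using Python's statsmodels library; the arrays X and y are placeholders for your own independent and dependent measures:

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor
from statsmodels.stats.stattools import durbin_watson
from statsmodels.stats.diagnostic import het_breuschpagan

# Placeholder data: X is an n-by-k matrix of independent measures, y the dependent measure
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = 5 + X @ np.array([2.0, -1.0]) + rng.normal(size=50)

X_const = sm.add_constant(X)
fit = sm.OLS(y, X_const).fit()

# Multicollinearity: variance inflation factors well above ~10 signal trouble
vifs = [variance_inflation_factor(X_const, i) for i in range(1, X_const.shape[1])]

# Autocorrelation: a Durbin-Watson statistic near 2 suggests little serial correlation
dw = durbin_watson(fit.resid)

# Heteroscedasticity: a small Breusch-Pagan p-value suggests non-constant error variance
bp_stat, bp_pvalue, _, _ = het_breuschpagan(fit.resid, X_const)

print(vifs, dw, bp_pvalue)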

By contrast, specification errors are serious. Failure to deal with these problems can cause
major difficulties. There are three major types of misspecification.

Failure to include relevant information, omitted variables, forces the regression to allocate
explanatory power among too few independent measures. It is much like trying to explain
sales only by using price, as we did originally. Notice the difference in coefficients in the two
equations. When price was forced to explain "everything," the coefficient was only - 88
compared to -178 when part of the explanatory burden was assumed by the advertising
measure.

High R2 statistics suggest the regression does include most of the relevant information.
Lower R2 values, 65 % in our example, suggest, but do not conclusively show, there are
other relevant measures which have not been included. Were we to include additional
relevant measures the regression coefficients would take on still different values.

Failure to include related information, omitted equations, has a similar effect on the
coefficients. Suppose that the amount of advertising was at least partially determined by the
level of sales, e.g. a scheme which set the advertising budget as some base figure plus 10 %
of the sales level. This is, in effect, a second equation that is clearly related to our sales
equation. Advertising is dependent on, not independent of, sales and therefore our sales
regression will give biased estimates of our coefficients.

Incorrect form of the equation is the last misspecification type. Regression only works for
"straight" lines, but the real world is rarely linear. In many cases the straight line
approximation of regression is entirely satisfactory. In others, e.g. learning curves or product
life cycles, a linear relationship is altogether unrealistic.

Without going into detail, suffice it to say, again, that techniques exist to mitigate the
effects of these problems. The remedies are generally easy to employ, but the initial
detection of the problem is less obvious.

So why would anyone not use regression analysis?

There are basically two reasons why regression analysis might not be the first choice of
technique for a researcher.

Regression analysis assumes, indeed requires, the error terms to be normally distributed.
The normality requirement is sometimes more than a researcher is willing to take on. Non-
parametric techniques, which do not make such a restrictive assumption, are better suited
to the temperament of these individuals. In general, however, the normality assumption is
not outrageous if a sufficiently large sample can be obtained.

The second reason regression might be avoided is paradoxically related to regression's strong suit - quantification of relationships. In some cases one might wish to determine
whether or not two measures are significantly related, yet it doesn't make immediate sense
to quantify the relationship between qualitative measures, e.g. gender and location
preference. More advanced regression techniques exist (LOGIT, PROBIT, Discriminant
Analysis) which can be useful in analyzing qualitative models, but discussion of their
characteristics is beyond this article. Basically they convert qualitative problems into one of
probability estimation, e.g. finding the probability that a male would choose to locate in
area X.

A variant of this concern is that regression equations will be inappropriately interpreted and/or glorified. Given the number of potential pitfalls, any regression should be considered
suspect until carefully scrutinized. There is a tendency on the part of some decision makers
to give undue credence to any tool which has numbers associated with it. Should your
audience consist of individuals with this affliction then you may be well advised to introduce
regression analysis slowly and only in conjunction with education about regression's uses
and misuses.

Where might you go from here?

There are a number of excellent texts on regression analysis and almost every major
statistical package offers a regression routine or two. Economists, who have played a
disproportionate role in the development of regression tools, fondly and conceitedly talk of
econometrics. You might look for other texts using this term in their titles.

Here are some of my favorites:

"Using Econometrics: A Practical Guide." A.H. Studenmund and H.J. Cassidy. Little Brown
and Company (Boston, 1987). One of the more accessible, mildly rigorous works, it includes
several "cookbook" features which help beginners evaluate their regressions.

"The Application of Regression Analysis." D.R. Wittink. Allyn and Bacon (Boston, 1988). This
is written at a fairly low level of rigor. It doesn't cover many of the remedies alluded to
above, but for a complete novice it might be a good introduction. If you go this route, please
follow it up with one of the other texts.

"Forecasting: Methods and Applications, 2nd ea." S. Makridakis, S.C. Wheelwright and V.E.
McGee. John Wiley and Sons (New York, 1983). The book is written, obviously, with

120
Advanced Marketing Research@ Dr. Vikas Goyal, IIM Indore
forecasting in mind. It is somewhere between the two previous works in rigor. About 1/4 of
the book is devoted to regression analysis. The remainder of the book deals with other
forecasting techniques, both quantitative and qualitative. It is a classic worth acquiring.

"Econometric Statistics and Econometrics, 2nd. ea." T.W. Mirer. Macmillian Publishing Co.
(New York, 1988). Written at a mildly rigorous level, it is fairly accessible, has many
examples and discusses remedies for the problems noted above.

Have you ever wondered...


Author - Gary M. Mullet

Article Abstract

This article addresses multiple data-related topics, including regression analysis.

In response to a reporter’s question, a member of one of the Sweet 16 basketball teams in this year’s NCAA tourney discussed and defined regression analysis. His answer had to do
with the idea that if he (or an opponent) had a bad game, his next game would probably be
better, due to the regression effect. The question was posed to the student-athlete after it
was learned that he had to take an examination on regression analysis while on the road for
this session of March Madness. As most readers know, his example of the regression effect
closely parallels that of the relationship between fathers’ and sons’ heights, which some
sources say gave rise to the term regression analysis over 100 years ago.

What follows are answers to several other questions, some dealing with regression analysis,
some not. The pages of this column have covered a wide variety of topics and some of the
answers below are borrowed liberally from them and some aren’t. (Remember: stealing
from one author is plagiarism; stealing from several is research. Since I stole that statement
from only one source, it’s plagiarism. I’d love to give credit but honestly don’t know the
original source.)

Why do some regression coefficients have the wrong sign?

What exactly is meant by the "wrong sign?" Computationally, hardware and software are at
the stage where, for a given data set, the signs are undeniably calculated correctly. (Such
was not necessarily the case back in the days of punched cards, about which more later.)
There could be a couple of things going on. First, your theory could be wrong; that is, the
sign might not really be wrong. Second, it could be a statistically non-significant result, in
which case the sign of the coefficient is meaningless. Third, it might be that, due primarily to
collinearity in the data set, you are comparing the sign of a partial regression coefficient
with expectations from a total regression relationship. The partial coefficient involves the
relationship of the particular independent variable of interest accounting for what goes on
with other independent variables. The total coefficient ignores what goes on with the other independent variables and looks at only the relationship between the criterion variable and
a single predictor.

It’s not at all uncommon to see a positive sign attached to a correlation coefficient involving
a single predictor and single dependent variable (predictee?) and yet the regression
coefficient for this same predictor, when other predictors are in the equation, will be
negative. Run a handful of regressions with a larger handful of predictors and you will
almost assuredly see several such "wrong" signs. They may or may not be cause for concern,
depending on the intent of the study for which you are doing regression analysis in the first
place. If you are among those who use beta coefficients to allocate relative "importance,"
you might be in for a headache due to these sign reversals.
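A small simulation makes the partial-versus-total distinction concrete. In the hypothetical data below, the simple correlation between x1 and the criterion is positive, yet the partial regression coefficient for x1 turns negative once the collinear predictor x2 enters the equation:

import numpy as np

rng = np.random.default_rng(1)
n = 500

# Two highly collinear predictors
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + np.sqrt(1 - 0.9 ** 2) * rng.normal(size=n)

# The criterion depends positively on x2 and negatively on x1
y = 2.0 * x2 - 1.0 * x1 + rng.normal(size=n)

# Total (zero-order) correlation between x1 and y is positive...
print(np.corrcoef(x1, y)[0, 1])                # roughly +0.5

# ...but the partial regression coefficient for x1 is negative
X = np.column_stack([np.ones(n), x1, x2])
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
print(coefs)                                   # intercept, about -1 for x1, about +2 for x2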

How does sample size impact regression and multiple correlation?

Well, we see a couple of things going in opposite directions here. A smaller sample will
usually show a larger R2 and a smaller number of statistically significant predictors than will
a larger one from the same population. It’s a degrees-of-freedom phenomenon and makes at least a modicum of sense.

Remembering back to when you took geometry (or, in my case, vice versa), you saw that
two points perfectly determine a straight line, three points determine a plane, four points
determine a hyperplane in four-dimensional space, and so on. Regression analysis is really
doing nothing other than estimating the coefficients of the equations for those lines, planes
and hyperplanes. It should be fairly easy to see that the fewer data points we have, the
better the fit of the planes to the data, usually. Thus, R2 will be larger with fewer data
points, generally speaking. That’s why some of your models are "better," if you use R2 to
determine goodness of the model as many are wont to do, when you look at small subsets
of the sample and compare the results with the total sample. Economists were among the
first to recognize this and in most introductory econometrics texts you’ll find a definition of
R2-adjusted-for-degrees-of-freedom. This adjusted R2 is routinely shown as part of the
output of most current software packages. It’s very disconcerting when this value shows up
as negative. If it does, you are woefully short on sample!
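To illustrate the degrees-of-freedom point, the following sketch (using purely hypothetical, unrelated data) fits the same number of useless predictors to a small and a large sample; the raw R2 flatters the small sample while the adjusted R2 tells the real story, sometimes even dipping below zero:

import numpy as np

def r2_and_adjusted(n, k, seed=0):
    """Fit k useless predictors to n observations of pure noise."""
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.ones(n), rng.normal(size=(n, k))])
    y = rng.normal(size=n)                    # unrelated to X by construction
    coefs, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coefs
    r2 = 1 - resid.var() / y.var()
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
    return round(r2, 3), round(adj_r2, 3)

print(r2_and_adjusted(n=12, k=5))     # inflated R2; adjusted R2 may even be negative
print(r2_and_adjusted(n=500, k=5))    # both near zero, as they should be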

As for the number of significant predictors, as the number of observations increases, the
denominator in the statistic that determines whether a regression coefficient is significant
decreases, other things remaining constant. Thus, it’s "easier" for a coefficient to be
statistically significant and, with bigger samples, more will be declared significant than when
you are analyzing smaller samples.

As you are no doubt aware, samples that are inordinately large are troublesome in other
statistical analyses, too. Even with simple t-tests for independent means, big samples will
show that even minuscule sample mean differences are significant. In these cases, as well as those above, the real issue is, are the results substantive in addition to being statistically
significant? The answer may be that they are not, just because the sample was too large.
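The point about inordinately large samples is easy to demonstrate. In the sketch below, two simulated groups differ by a trivial one-hundredth of a standard deviation, yet the t-test will declare the difference "significant" once the samples are big enough:

import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

for n in (100, 1_000_000):
    a = rng.normal(loc=0.00, scale=1.0, size=n)
    b = rng.normal(loc=0.01, scale=1.0, size=n)    # a trivial 0.01-SD mean difference
    t_stat, p_value = stats.ttest_ind(a, b)
    print(n, p_value)
# With n huge, the minuscule difference comes out "statistically significant" anyway.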

Why are my R2 values so lousy when I use yes-no predictors?

Yes-no predictors, or dummy variables, come about when we use qualitative (e.g., gender,
brand used most often, education category, etc.) rather than quantitative variables as
predictors in a regression analysis. You won’t find much about this in print, but Michael
Greenacre wrote about it, maybe with a proof, a few years ago in a Journal of the American
Statistical Association paper. For our purpose here, it’s something to acknowledge and, in
part, it points out what many consider the folly of comparing the goodness of regression
models by using R2.

If you have some statistically significant regression coefficients and your regression equation
makes a degree of substantive sense, then you might want to ignore the magnitude of R2
when using dummy predictors. See immediately below for more on small correlation
coefficients.

Why do performance items with higher means commonly seem less "important" than those
with lower means in CSM studies where we find derived importance?

While this is not always the case, what you’ll see as often as not is that items with higher
variance are more correlated with the overall measure than those with lower variance. This
is because of the whole idea of correlation having to do with joint variation. Thus, a measure
with low variance usually can’t explain or account for as much variation in the criterion or
overall measure as will one with a higher variance. Finally, then, an item on which the
performance is universally high in the minds of respondents can’t explain much variance in
(or can’t be strongly correlated with) the overall measure just because it (the item) doesn’t
vary. However, an item on which mean performance is so-so usually has a lot of individual
variation (some respondents think you perform great, others think just the opposite, hence
a middling mean) and, thus, can account for a lot of shared variation in the overall measure.
Correlation is really non-directional and is looking at nothing other than shared variation,
much as we’d like to imply a dependence relationship to it.

The same thing can occur when you correlate a lot of stated importance items with an
overall measure. The higher the stated mean importance, the lower the correlation. Again,
recognize that this is not a universal phenomenon, but something those of you doing lots of
CSM work might want to chew on.
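A quick simulation illustrates the variance point. In the hypothetical data below, two performance items feed the overall rating with equal weight, but the item whose ratings are compressed near the top of the scale ends up with the weaker correlation:

import numpy as np

rng = np.random.default_rng(3)
n = 1_000

# Item A: universally high performance, so little respondent-to-respondent variance
item_a = rng.normal(loc=9.0, scale=0.3, size=n)
# Item B: middling mean but a lot of individual variation
item_b = rng.normal(loc=6.0, scale=1.5, size=n)

# Both items feed the overall rating with the same unit weight
overall = item_a + item_b + rng.normal(scale=1.0, size=n)

print(round(np.corrcoef(item_a, overall)[0, 1], 2))   # low derived "importance"
print(round(np.corrcoef(item_b, overall)[0, 1], 2))   # high derived "importance"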

Some say never use individual respondent data in multiple correspondence analysis (MCA).
Why?

Beats me, although it’s undeniably easier to use a big crosstab table when doing an MCA
than using the proper individual respondent data. In fact, without a lot of inventiveness there’s no way that some data sets can even be put into an appropriate crosstab table prior
to MCA. The major reason that one should use individual respondent level data is that you
then see the effect of correlated answers in the resulting perceptual map. If the various
categories (age, sex, BUMO, etc.) are truly statistically independent, then it’s O.K. to use
summary data for an MCA. Otherwise, be very careful.

Does respondent order really make a difference in segmentation studies?

This can be answered with an unequivocal, "maybe, maybe not." If you’re using an older
clustering program, you might see some differences in your segmentation results depending
on the order of the respondent data. On the other hand, if you are using a newer program
that does iterations (such as the k-means program found in PC-MDS and elsewhere) input
order is pretty much immaterial. We’ve run several analyses where we’ve taken a data set,
clustered it, randomized the order of the data, reclustered, and so on. The differences in the
segments were so minor as to be negligible. They certainly weren’t substantive. Using the
data in the order collected probably works as well as any other order.

What is "card image" data?

This question really makes me feel old and I hear it frequently. It seems like only yesterday
when data were input into a computer via a card reader. The card reader read information
that was punched into cards, commonly called IBM-cards but more properly known as
Hollerith cards (although Hollerith was one of the founders of IBM), which had 80 columns.
Into each column we could put either a number, a letter or a special symbol by punching a
small rectangle out of one or more of 12 possible positions down the column by using a
machine with a keyboard similar to a typewriter (another archaic instrument). No wonder
the cards were/are also sometimes called punch(ed) cards. These cards were also used to
enter the computer programs. For the budget-conscious, one could correct errors in the
cards by filling unwanted holes with a wax-like substance - less expensive but harder than
just retyping the whole thing.

Anyway, in marketing research, then, card image data refers to data that consists of one or
more "cards" of 80 columns per respondent. Of course, we can’t call them cards since they
aren’t, so the new nomenclature is "records." Thus, a study using card-image data with five
records per respondent means that we have up to 400 columns of data per respondent,
arranged in blocks of 80 columns each. If we arranged this same data into one long record of
400 columns per respondent, then we have what some call string data, which is not the
same as what some computer programs mean by the term "string data." In either case, the
tough job is to tell the computer what can be found where so you can get the answers that
you seek before the Friday afternoon deadline.

Also, you should note that while most programs used in marketing research data processing
can handle card image data, data which are not card image cannot be analyzed by all of
them.

By the way, punched cards were run through a card sorter, sorting/matching on particular
punches in particular columns, as early versions of computer matching of potential dates
(for social activities, not for calendars). Finally, an interesting item to put on a list for a
scavenger hunt is a "punch card." Good luck finding one!

Any ideas for some readable reference material on statistical analysis?

Most anything that Jim Myers has ever published is well worth reading. His papers and
books are very pragmatic, offer alternative viewpoints and make the reader think. I
personally like materials that are not dogmatic and recognize that, particularly in marketing
research, there may be two or more ways of looking at a particular type of analysis. (Jim also
remembers what punched cards were, I’m sure.)

In our business, applications articles are interesting and sometimes directly useful.
However, recognize that if a company or individual comes up with a truly unique way to
solve a thorny data analysis problem it will probably never be shown to the general research
community, instead remaining proprietary. That said, of course, not every technique that is
put forth as proprietary is necessarily a unique problem solving tool.

The heavy duty technical articles and books are probably more cutting edge, but difficult for
most of us to read and even harder to directly apply. This is not to say that they should be
discontinued, but there may be a long time between the publication of such an article and
when you can use it in an ATU study, say.

I enjoy the articles in columns such as this one, but the reader has to recognize that these
papers are not peer reviewed and so may contain some things that are not 100 percent
factual. That’s O.K. as long as the reader doesn’t take them as gospel, but notes that they
are merely opinions about how to solve certain problems. You should also use a large salt
shaker when pulling material from the Internet. I don’t hesitate to ask others where to find
information on such-and-such; you shouldn’t either. There is no perfect source of
information.

Factor Analysis:
Factor analysis: A useful tool but not a panacea
Author - Randy Hanson

Article Abstract

This article describes the purpose of using factor analysis and the four-step process required
to complete this type of quantitative analysis. It also describes potential shortfalls of using
this method, including possible misapplications and problems related to subjectivity.

Factor analysis is a generic term for techniques that analyze interrelationships among
variables. Its purpose is to reduce a large set of variables to a smaller set of unifying
concepts, or "factors." Factor analysis accomplishes this reduction through a statistical
model that attempts to explain the correlation between variables. Many widely available
data analysis packages such as SAS® and SPSS® contain programs to conduct factor analysis.

In marketing research studies, ratings are often collected on a large number of products,
attributes, attitudes or behaviors. Usually, it's reasonable to assume these ratings are
correlated because the items measured are typically different facets of a few common,
underlying dimensions.

For example, assume respondents are asked to rate a product or service on 20 attributes.
Several of the attributes may actually measure the same dimension, such as quality, cost, or
usefulness. While these dimensions are neither well-defined nor easily measured, factor
analysis can help determine the factors which underlie the 20 original variables. Factor
analysis can both simplify the description and increase the understanding of complex
phenomena such as purchase intentions and consumer evaluations.

Four steps

There are usually four steps in a factor analysis: 1. Compute the correlation matrix; 2.
Extract the factors; 3. Rotate the factors; and 4. Calculate factor scores.

These steps will be discussed in the context of a specific example: A service company wants
to know how customers perceive its organization and competitor companies. Customers are
asked to rate each company on a list of 10 attributes. Results of this factor analysis are
shown in Table 1.

Step one: Once the data are collected, the correlation matrix is computed and examined to
determine if a factor analysis is appropriate. If nearly all the correlations are small, there is
probably not much point in carrying out the analysis. The correlation matrix can also provide
a preview of the factor analysis results by identifying separate groups of highly correlated
variables.

Step two: The extraction phase of factor analysis requires several decisions by the analyst.
First, a method of factor extraction (principal components, principal axis factoring,
maximum likelihood, or a host of others) must be selected. In our example, the widely-used
method of principal components analysis is used. Second, the number of factors to be
retained must be decided. The most frequently used criterion is to keep all factors with
eigenvalues greater than one. (An eigenvalue is simply the portion of total variation
explained by each factor). In our example, the first three factors are retained because their
eigenvalues are greater than one.

Step three: The factor solution is then rotated to make the factors more interpretable.
Choices include varimax, quartimax, equamax and oblique rotation, with varimax being the
most frequently used. The rotated factor analysis results for our example, based on varimax
rotation, produce the factor loadings and eigenvalues for our 10 original attributes (see
Table 1). Factor loadings show the degree of association of each attribute with each
underlying factor and range from -1 to + 1 (only loadings with absolute values greater than
.5 are shown in the table).

Naming factors

At this point, some time should be spent naming factors. This process will highlight the
criteria used by customers to evaluate the companies. Looking at the attributes with higher
factor loadings we might call Factor One in our example, "The Basics," or "Comfort Level;"
Factor Two, "Quality," or "Status," and Factor Three, "Money's Worth." Thus, the company's
future advertising and sales presentations may be more effective if they stress "The Basics,"
"Quality," or "Money's Worth."

Step four: After naming the rotated factors, scores for each respondent are calculated. The
factor scores are simply summary ratings for each underlying factor. We now have three
variables per customer for analysis instead of the original 10. This reduced set of data can
then be used in a variety of subsequent analyses.
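As a rough illustration of the four steps, here is a minimal Python sketch; the ratings array stands in for real attribute data, and scikit-learn's maximum likelihood FactorAnalysis with a varimax rotation substitutes for the principal components routine used in the example:

import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(4)
ratings = rng.normal(size=(300, 10))        # placeholder for real attribute ratings

# Step 1: compute and inspect the correlation matrix
corr = np.corrcoef(ratings, rowvar=False)

# Step 2: decide how many factors to retain (eigenvalue-greater-than-one rule)
eigenvalues = np.linalg.eigvalsh(corr)[::-1]
n_factors = int((eigenvalues > 1).sum())

# Steps 2-3: extract the factors and apply a varimax rotation
fa = FactorAnalysis(n_components=n_factors, rotation="varimax").fit(ratings)
loadings = fa.components_.T                 # attributes by factors, used for naming

# Step 4: factor scores, one row per respondent and one column per factor
scores = fa.transform(ratings)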

For example: 1. Customers can be segmented (clustered) based on factor scores to reveal
subgroups with similar evaluative styles; 2. If a nonoblique rotation is used, the factor scores
can serve as independent variables in a subsequent regression analysis, and 3. The factor
scores can be used as input for a factor-based perceptual map.

Technique usefulness

There are a growing number of factor analysis practitioners who have doubts about the
general usefulness of the technique. While factor analysis is relatively sound from a
mathematical perspective, it is criticized for:

Misapplication: The question here is whether underlying factors exist at all. While a factor
analysis can be applied to any database, it may not be appropriate. The evidence suggests
the concept of factors may be valid within psychology but is, in other circumstances, open
to debate. Separate from this, the researcher may use factor analysis to "group" attributes
as they already exist, not to discover underlying factors. In these situations, clustering based
on variables is a more appropriate technique.

Ambiguity: There is a great deal of subjectivity in choosing the number of factors to retain,
the extraction method and the rotation method. Because of this, two honest researchers
analyzing the same data independently may find different factors and reach divergent
conclusions. At worst, this ambiguity can be used by an unscrupulous analyst to try different
combinations of methods until a preconceived hypothesis comes up. A defensible solution
to this "data snooping" is to set objective standards and procedures before the data are
analyzed.

In summary, factor analysis is a tool to be included in every marketing researcher's repertoire. It is not, however, a panacea. Factor analysis is a useful technique when
conditions are appropriate, when it is conscientiously applied and most importantly, when it
provides a deeper and clearer understanding of the data.

Discriminant Analysis:
A walk through discriminant analysis
Author - Michael Lieberman

Article Abstract

Discriminant analysis is used in situations where you want to build a predictive model of
group membership based on observed data. This article discusses discriminant analysis,
including some basics, output and predictive and descriptive aspects of the analysis.

Recently, after conducting a successful market segmentation for a client (we were able to
identify high-likelihood customers who are price-insensitive), my client phoned me.
"Michael," he said, "my client loves the segments. He wants to be able to run that banner
point in the next study. Is there a way to add a few questions to the survey and come up
with the classifications?"

"Yes," I answered. "What you need is a discriminant analysis."

At first glance this is not what my client requested. He wants to identify people, not classify
them. What, then, was he asking for?

What my client's client wanted to do, in essence, was to discriminate between segment
members and non-segment members. Once identified, segment members will be used in a
future banner for analysis in the next study.

Discriminant analysis is used in situations where you want to build a predictive model of
group membership based on observed data - characteristics, attitudes, demographic
attributes, etc. The analysis produces a linear equation of variables that can be used to
explain which attribute best discriminates between the two groups and, as an extension,
build a powerful predictive model for future classification.

Sometimes clients confuse discriminant analysis with cluster analysis. In fact, they are
conceptually similar. However, one uses cluster analysis to form groups. Discriminant flows
in the opposite direction: You have the groups, you want to know why.

Basics of discriminant analysis

Discriminant analysis is an a priori technique. That is, you have the groups defined before
you begin. Multiple discriminant analysis, from which discriminant maps are drawn, is a case
where you have membership in more than two groups. For ease of understanding, we are
going to restrict our case to a simple discriminant with definition of two groups.

Characteristics of the grouping variable are simple. They are distinct, mutually exclusive, and
exhaustive. In the case of my client's request, either a respondent is in the target group or
he isn't. No fence-sitting. No overlapping.
Basic data assumptions of the predictor variables are that they are normally distributed and
independent.

Choosing which predictor variables will be included in the analysis requires a bit of
marketing sense. For example, our client seeks to distinguish between high-probability
customers and low-probability customers. Within the survey, respondents are asked to rate
the company on a given array of attributes - rankings of importance, performance, company
image, and firm demographics such as size, revenue, number of employees, and geographic
area. A good analysis, especially if it is going to be used for back classification, cannot use all
the data available. The results would be murky and there would be a good deal of variation
error, commonly referred to as noise.

Therefore, it is vital to choose which predictors go into the equation. In our fictitious
example, similar to the case above, attitudes toward the technical prowess of the firm,
marketing support, customer service, size of firm, and revenues were chosen.

The output

The analysis produces a discriminant function. That is, a linear equation where coefficients
are multiplied against the values of the predictor variables to produce a discriminant score.
Derived from the discriminant score, a likelihood of each group membership is calculated
based on past group membership. To put it simply, the respondent fills out the form and
gets a score, which is then compared to a chart to see if he qualifies for the group.

As in all sophisticated statistical analysis, a blizzard of output accompanies the procedure.


There are five outcomes that I examine and report: the beta scores of the discriminant
function (known as the raw coefficients), the standardized coefficients, Wilks' Lambda, the
discriminant score, and the percentage of respondents correctly reclassified based on the
function once it is rerun.

The raw and standardized coefficients are used for descriptive and classification purposes.
The discriminant score, when calculated afterwards, is the instrument used for future
classification. Wilks' Lambda is a statistic that gives us the robustness of the model. Wilks' Lambda includes a chi-square test, which, if significant, says that the model has tested well
and can be assumed strong and reasonably accurate. The percentage of correctly classified
respondents tells us how many people returned to where they belong once rerun through
the model. It, like Wilks' Lambda, is a measure of how good the model is.

The analysis - descriptive aspects

The groups are defined (potential customers and non-potential customers), the predictor
variables are chosen and properly recoded and the analysis is run. My firm uses SPSS to
perform the function. This, as I mentioned, produces a large amount of output. To make the
outcome simple and actionable, we often transfer them to an Excel spreadsheet or PowerPoint slides which are easy for our clients to understand and incorporate into their
reports.

Figure 1 shows the five parameters of our fictitious example, ranked in descending order.
Figure 2 is a graphic display.

From here I will walk through the example in the same order as if I were delivering it to a
client. The first output I would report, for descriptive purposes, is the standardized
coefficients.

For interpreting the standardized coefficients it is more useful to look at them relative to
each other. "Availability of training and educational services" has a coefficient of .590. The
next attribute, "Strong marketing support" has a coefficient of .537. What this means is that
these two attributes are the strongest indicators of membership to the group. "Solid
technical support" (.032), "Size of firm in annual revenue" (.032), and "High-quality
customer service" (-.103), are near zero and, thus, not strong indicators.

The strengths of the standardized coefficients are relative to each other. A rule is that if a
predictor has twice the standardized coefficient of another predictor, it is twice as good a
discriminator for the group. Predictors near zero have little effect. Figure 2 graphically
displays these results.

The marketing interpretation for this model is clear. Technical support, size of the company,
and customer service are not a major concern to customers when approaching potential
suppliers. Though it could only be determined conclusively through regression analysis, another conclusion is that if the respondent believes the company has good training and marketing support, he
is probably a good candidate to become a customer.

Figure 3 displays raw coefficients. These can be descriptive too, though I tend to use the
standardized coefficients for pure descriptive purposes (due to differing scales among the
predictor variables; "standardized" gives each predictor a mean of 0 and a standard
deviation of 1). Unlike standardized, raw coefficients serve a dual function. They are also
used in the re-classification phase.

Finally, it is useful to assess the model itself. Shown also in Figure 4 is the Wilks' Lambda and the percentage reclassified correctly. The Wilks' Lambda is clearly significant. Look at the
"Sig." column, which reads something like "This is the chance the model is zero, or
meaningless." In our example the Sig. is 0.000, or 0 percent chance. Generally I accept any
model with a Wilks' Lambda Sig. less than 10 percent. With more than two-thirds, 67.4 percent, of respondents being correctly reclassified, we can be confident that the model is
robust and the process a good fit.

The analysis - predictive aspects

Great. We have the discriminators. The client now wants to be able to reclassify future
studies according to the groups already in existence. In addition, for a purely promotional
application, the client wants to be able to phone a potential customer, ask him a few
questions, and determine if he is a good candidate for a follow-up.

First fact: In order to successfully perform a re-classification, you must ask exactly the same
questions that are present in the model. Also, you must use the same scales.

The process is as follows: Ask the questions and plug the answers back into the equation.
The model will produce a discriminant score. The prior run has produced a look-up table of
sorts which shows discriminant scores and the likelihood of a person with that score joining
the group. In practice, if a given respondent has a score with a corresponding likelihood
higher than 50 percent, put him in the group.
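A rough equivalent of this classify-by-probability step can be sketched with scikit-learn's linear discriminant analysis; the predictor data and group labels below are hypothetical, and the 50 percent rule corresponds to taking the class with the larger posterior probability:

import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(5)

# Hypothetical training data: five predictor ratings, known group membership (1 = target)
X = np.vstack([rng.normal(0.0, 1.0, size=(150, 5)),
               rng.normal(1.0, 1.0, size=(150, 5))])
y = np.array([0] * 150 + [1] * 150)

lda = LinearDiscriminantAnalysis().fit(X, y)

# Percentage correctly reclassified, one gauge of how good the model is
print(round(lda.score(X, y) * 100, 1))

# Classify a new prospect from his answers to the same five questions, same scales
new_respondent = np.array([[0.8, 1.2, 0.5, 1.0, 0.7]])
prob_in_group = lda.predict_proba(new_respondent)[0, 1]
print(prob_in_group > 0.5)      # True means: call him back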

For a large number of respondents, my firm will write a small SPSS syntax program so that
the process of re-classification for a large number of data will become automated. That is,
the banner points can be re-created from the previous study.

For individual respondents an Excel spreadsheet calculator is built. Shown in Figure 5, it is used to calculate the discriminant score and then compare the derived value to the values
that are presented in the look-up table. This is used for a client who wishes, say, to phone a
prospective customer, ask him a few questions, and then decide if he has a reasonable
chance of becoming a real customer.

In our example, the respondent has a score of 7.774. We go to Figure 6 - output from SPSS
which gives calculated, existing discriminant scores and the probability a respondent with
that score will end up in the group - and find that this score corresponds with a likelihood
between 67 percent and 72 percent to belong to our target group. Conclusion: call him
back.

Useful and popular

Discriminant analysis is one of a number of statistical techniques that we offer to our clients in
order to add value to existing projects or pre-plan for a larger data delivery within the
context of expected output. The power and efficiency of the process allows strategic
planners to capitalize on the existing data to explain and predict consumer conduct without
consulting a mystic. It is among our more useful and popular techniques given its power and
ease of use.

Clustering:
Latent class modeling as a probabilistic extension of k-means clustering
Author - Jay Magidson and Jeroen Vermunt

Article Abstract

Recent developments in latent class (LC) modeling offer an alternative approach to cluster
analysis, which can be viewed as a probabilistic extension of the k-means approach to
clustering. This article introduces the LC model and compares its performance with
traditional cluster analysis in various simulated settings.

Cluster analysis has been one of the primary tools that marketing researchers have used to
analyze their survey and other data to help identify different market segments. According to
Kaufman and Rousseeuw (1990), cluster analysis is "the classification of similar objects into
groups, where the number of groups, as well as their forms are unknown." Recent
developments in model-based clustering, especially using latent class (LC) modeling offer
major improvements in the ability to identify important segments and to classify persons
into the relevant segment (Vermunt and Magidson, 2001). This article introduces the LC
cluster model and compares its performance with traditional cluster analysis in various
simulated settings.

In LC analysis, a k-class latent variable is used to explain the associations among a set of
observed variables. Each latent class, like each cluster, groups together cases that are
similar (homogeneous) with respect to the classification variables (attitudes, preferences,
behavior, etc.). In fact, from a statistical perspective, persons in the same latent class are
indistinguishable from each other in that the response patterns that describe their attitudes,
preferences, etc., are assumed to be characterized by exactly the same probabilities. This
differs markedly from the traditional approach used in cluster analysis of grouping together
persons whose responses are "close" according to some ad hoc measure of distance
(hierarchical approaches) or those that attempt to minimize within-cluster variation (e.g., k-
means clustering).

The fundamental assumption underlying LC models is that of local independence, which states that objects (persons, cases) in the same latent class share a common joint
probability distribution among the observed variables. Persons are classified into that class
having the highest (modal) posterior membership probability of belonging given their
responses. Bayes theorem is used to compute class membership probabilities, and all LC
model parameters are estimated by the method of maximum likelihood (ML). Thus, the LC
approach to clustering and classification moves traditional cluster analysis onto a solid
statistical framework.

LC is most similar to the k-means approach to cluster analysis in which cases that are "close"
to one of k centers are grouped together. In fact, LC clustering can be viewed as a probabilistic variant of k-means clustering where probabilities are used to define
"closeness" to each center (McLachlan and Basford, 1988). As such, LC clustering provides a
way not only to formalize the k-means approach in terms of a statistical model, but also
extends the k-means approach in several directions.

LC extensions of the k-means approach

1. Probability-based classification. While the k-means clustering algorithm utilizes an ad-hoc approach for classification, the LC approach allows cases to be classified into clusters using
model-based posterior membership probabilities estimated by maximum likelihood (ML)
methods. This approach also yields ML estimates for misclassification rates.

2. Determination of number of clusters. K-means provides no assistance in determining the number of clusters. In contrast, LC clustering provides diagnostics such as the BIC statistic,
which can be useful in determining the number of clusters.

3. Inclusion of variables of mixed scale types. K-means clustering is limited to quantitative variables having interval scales. In contrast, LC clustering can be performed on variables of
mixed metrics. Classification variables may be continuous, categorical (nominal or ordinal),
or counts or any combination of these.

4. No need to standardize variables. Prior to performing k-means clustering, variables must be standardized to have equal variance to avoid obtaining clusters that are derived primarily
by those variables having the largest amounts of variation. In contrast, the LC clustering
solution is invariant of linear transformations on the variables; thus, no standardization of
variables is required.

5. Inclusion of demographics and other exogenous variables. A common practice following a k-means clustering is to use discriminant analysis to describe differences that may exist
between the clusters on one or more exogenous variables. In contrast, the LC cluster model
is easily extended to include exogenous variables (covariates). This allows both classification
and cluster description to be performed simultaneously using a single uniform ML
estimation algorithm.

The general LC cluster model

The basic LC cluster model can be expressed as:

f(yi) = ∑k p(x=k) f(yi|x=k)

while the LC cluster model with covariates is:

f(yi|zi) = ∑k p(x=k|zi) f(yi|x=k)

or

f(yi|zi) = ∑k p(x=k|zi) f(yi|x=k,zi)

where:

yi: vector of dependent/endogenous/indicators for case i

zi: vector of independent/exogenous/covariates for case i

x: nominal latent variable (k denotes a class, k=1,2,...,K)

and f(yi|x=k) denotes the joint distribution specified for the yi given latent class x=k.

For yi continuous, the multivariate normal distribution is used with class-specific means. In
addition, the within-class covariance matrices can be assumed to be equal or unequal across
classes (i.e., class-independent or class-dependent), and the local independence assumption
can be relaxed by applying various structures to the within-class covariance matrices:

- diagonal (local independence)

- free or partially free - allow non-zero correlations (direct effects) between selected
variables

For variables of other/mixed scale types, local independence among the variables imposes
restrictions on second-order as well as higher-order moments. Within a latent class, the
likelihood function under the assumption of independence is specified using the product of
the following distributions:

- continuous: normal

- nominal: multinomial

- ordinal: restricted multinomial

- count: Poisson/binomial

LC cluster vs. k-means - comparisons with simulated data

To examine the kinds of differences that might be expected in practice between LC and k-
means clustering, we generated data of the type most commonly assumed when using k-
means clustering. Specifically, we generated several data sets containing two normally
distributed variables Y1 and Y2 within each of k=2 hypothetical populations (clusters). For
data sets 1, 2 and 3, the first cluster consists of 200 cases centered at (3,4), the second 100
cases with center at (7,1).

In Data Set 1 within each cluster the variables were generated to be independent with
standard deviation equal to one. By fixing the variables to have the same standard
deviation, Data Set 1 was generated to be especially favorable to the k-means approach
where the variables are typically standardized to have the same variance prior to analysis.

We used the Latent GOLD program (Vermunt and Magidson, 2000) to estimate various
latent class models for each data set. Table 1 shows that the LC models correctly identify
Data Set 1 as arising from two clusters, having equal within-cluster covariance matrices (i.e.,
the "two-cluster, equal" model has the lowest value for the BIC statistic, the criterion most
widely used in choosing among several LC models). The ML estimate for the expected
misclassification rate is 1.1 percent. Classification based on the modal posterior
membership probability resulted in all 200 Cluster 1 cases being classified correctly and only
one of the 100 Cluster 2 cases, (y1,y2) = (5.08,2.43), being misclassified into Class 1. For
Data Set 1, use of k-means clustering with two clusters produced a comparable result - all
100 Cluster 2 cases were classified correctly and only one of the 200 Cluster 1 cases was
misclassified, (y1,y2) = (4.32,1.49).

Data Set 2 was identical to Data Set 1 except that the standard deviation for Y2 was doubled, making it twice that of Y1, to reflect the more usual situation in practice of unequal variances. Figure 2 shows the greater overlap between the clusters
which is caused by increasing the variability in the data.

Table 2 shows that the LC models again correctly identified this data set as arising from
two clusters and having equal within-cluster covariance matrices (i.e., the "two-cluster,
equal" model has the lowest BIC). The ML estimate for the expected misclassification rate is
0.9 percent and classification based on the modal posterior membership probability
resulted in only three of the Cluster 1 cases and one of the Cluster 2 cases being
misclassified.

For Data Set 2, k-means performed much worse than LC clustering. Overall, 24 (8 percent) of
the cases were misclassified (18 Cluster 1 cases and six Cluster 2 cases). When the variables
were standardized to have equal variances prior to the k-means analysis, the number of
misclassifications dropped to 15 (5 percent), 10 of the Cluster 1 and five of the Cluster 2
cases, but was still markedly worse than the LC clustering.
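To see the contrast for yourself, the sketch below generates data in the spirit of Data Set 2 (not the article's actual simulated values) and compares k-means with a Gaussian mixture model, the form LC clustering takes for continuous indicators, using the BIC to choose the number of clusters:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(6)

# Two clusters in the spirit of Data Set 2: centers (3,4) and (7,1), sd of Y2 doubled
c1 = np.column_stack([rng.normal(3, 1, 200), rng.normal(4, 2, 200)])
c2 = np.column_stack([rng.normal(7, 1, 100), rng.normal(1, 2, 100)])
X = np.vstack([c1, c2])
truth = np.array([0] * 200 + [1] * 100)

# Model-based clustering: choose the number of clusters by the BIC (expect 2)
bics = {k: GaussianMixture(n_components=k, random_state=0).fit(X).bic(X) for k in (1, 2, 3, 4)}
best_k = min(bics, key=bics.get)

gmm_labels = GaussianMixture(n_components=best_k, random_state=0).fit_predict(X)
km_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

def misclassified(labels):
    """Count classification errors, allowing for arbitrary cluster labeling."""
    errors = int((labels != truth).sum())
    return min(errors, len(truth) - errors)

print(best_k, misclassified(gmm_labels), misclassified(km_labels))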

Data Set 3 threw in a new wrinkle by constructing different amounts of variability in each cluster. To accomplish this and to remove the overlap between the clusters, the standard deviations for both variables in Cluster 1 were reduced to 0.5, while for Cluster 2 the data remained the same as used in Data Set 2.

Table 3 shows that the LC models correctly identify this data set as arising from two clusters
and having unequal within-cluster covariance matrices (i.e., the "two-cluster, unequal"
model has the lowest BIC). The ML estimate for the expected misclassification rate was 0.1
percent, and use of the modal posterior membership probabilities results in perfect
classification. K-means correctly classified all Cluster 1 cases for these data but misclassified
six Cluster 2 cases. When the variables were standardized to have equal variances prior to a
k-means analysis, the six cases misclassified based on the analysis with the unstandardized
variables remained misclassified.

For Data Set 4 we added some within-class correlation to the variables so that the local
independence assumption no longer held true. For Class 1 the correlation added was
moderate, while for Class 2 only a slight amount of correlation was added.

In addition to the usual LC models, we also estimated models that allowed a "free"
covariance structure which relaxes the local independence assumption. While such models
were not required for the earlier analyses (i.e., for the earlier analyses the BIC values were
higher than that obtained using comparable models having a fixed covariance structure),
such models provided an improved fit to these data. Table 4 shows that the LC models
correctly identify this data set as arising from two clusters, having a "free" covariance
structure (i.e., the "two-cluster, free" model has the lowest BIC). The ML estimate for the
expected misclassification rate was 3.3 percent, and use of the modal posterior membership
probabilities resulted in 10 misclassifications among the 300 cases.

K-means performed very poorly for these data. While all 100 Cluster 2 cases were classified
correctly, 44 Cluster 1 cases were misclassified, for an overall misclassification rate of almost
15 percent. If the recommended standardization procedure is followed prior to a k-means
analysis, the results turn out to be even worse - 14 of the Cluster 1 and 66 of the Cluster 2
cases are now misclassified, an error rate of over 26 percent!

Comparison with discriminant analysis

Since Data Set 2 satisfies the assumptions made in discriminant analysis, if we now pretend
that the true class membership is known for all cases, the linear discriminant function can
be calculated and used as the gold standard. We computed the equi-probability line from the linear discriminant function and appended it to the data set in Figure 5. Remarkably, it can
be seen that the results are identical to that of latent class analysis - the same four cases are
misclassified! These results suggest that it is not possible to obtain better classification
results for these data than that given by the LC model. For a more detailed analysis of these
data see www.latentclass.com.

Summary and conclusion

Recent developments in LC modeling offer an alternative approach to cluster analysis, which can be viewed as a probabilistic extension of the k-means approach to clustering. Using four
data sets, each generated from two homogeneous populations, we compared LC with k-
means clustering to determine which could do better at classifying cases into the
appropriate population. For all situations considered the LC approach does exceptionally
well. In contrast, the k-means approach only does well when the variables have equal
variance and the assumption of local independence holds true. Further research is
recommended to explore other simulated settings.

While this article was limited to the use of LC models for cluster analysis, LC models have
shown promise in many other areas of multivariate analysis such as factor analysis
(Magidson and Vermunt 2001), regression analysis, as well as in applications of conjoint and
choice modeling. Future articles will address each of these areas.

Note: Interested readers may obtain a copy of the simulated data used for these examples
(including the formulae used in their construction) at www.latentclass.com.

Multi-Dimensional Scaling:
Exploring marketing ideas with perceptual maps
Author - Susie Li

Article Abstract

Making marketing strategies is a complex process requiring research, judgment and creativity. Perceptual mapping is a powerful tool for exploring data and generating
hypotheses. This article discusses three types of perceptual maps: preference,
multidimensional scaling (MDS) and correspondence.

Making marketing strategies is a complex process requiring research, judgment, and
creativity. Marketing research helps marketers establish an objective (or simulated/virtual)
marketplace to understand their customers and their products, and to answer questions like:
How do my customers use my product? What are the strengths and weakness of my
product relative to my competition? Where does my product fit in the overall market
consuming such products? Who are the targeted customers for my product?

Once this structured framework is established and understood, it then becomes a guide and
analytic platform for creative strategists to design innovative, targeted strategies (to fill the
gaps, or to raise the existing product to a higher ground, etc.).

A comprehensive product research project, namely, a portfolio analysis, should provide all
the following information:

1) What (is the overall competitive market structure for this product)? Systematically map
the entire market, partition it into a competitive hierarchy by studying consumer
preferences, consumers’ product usage, product substitution and switching behaviors in the
past.

2) Why (do consumers buy our or our competitors’ products)? Study the desired benefits of
products and fulfillment of those benefits, from the perspective of the consumers.

3) Who (are the customers for our or our competitors’ products)? Develop consumer
segmentation based on their lifestyle, life cycle, and product usage pattern.

4) How (do we better manage or market our product)? Identify consumers’ unmet needs
and desirable product features or growth/niche opportunities; optimize pricing or
promotional strategies; market to targeted customers or segments; create well-defined,
consumer-focused product-positioning strategies to maximize volume and profits.

Steps 1, 2, and 3 are intensive analytic work calling for various quantitative or qualitative
models, whereas Step 4 is a guided creative process. Perceptual mapping is one of the many
techniques used in the analytic steps, and an extremely popular one.

Its beauty is in its graphical display: Simpler to interpret than a listing of numerical results, it
quickly points to potential relationships, connections, and patterns in the data. Its deficiency
is that the graph is only an approximate representation of the real data, because of the
amount of data condensation/transformation the procedure requires. Therefore, perceptual
mapping should not be used alone to reach any conclusions, and must be accompanied by
other mathematical means to verify its findings. In general, perceptual mapping is a
powerful tool for exploring data, and for coming up with hypotheses.

Consumer researchers especially appreciate perceptual mapping's ability to compact
complex consumer behavioral data (usually a vast amount of multi-dimensional
psychometric measurements) into a concise, easy-to-show format. Simple techniques like
this not only help researchers avoid taking the wrong paths, but also open them to fresh
possibilities not obvious from traditional methods.

There are three ways of producing perceptual maps, although most people are familiar with
only one: the MDS map. The three types of maps are produced by three different
techniques and have different usages:

1. Preference map

2. Multidimensional scaling (MDS) map

3. Correspondence map

Each map requires a different view of the input data, and the maps are used to study
different aspects of the marketing problem. In the following sections, I will explain in
general how to create the preference map and the MDS map, how to examine the results,
and how to use the results to generate new ideas. Then I will present a correspondence map
example in more detail, linking the input data with the output map, and a creative
(somewhat ad-hoc) application of the correspondence analysis.

1. Preference map (for study of consumer preferences)

A basic preference map shows consumers’ preferences for a set of products. It is more
useful than presenting a table of mean ratings. In a typical preference analysis, consumers
are surveyed for their preferences for a set of products. For example, 15 consumers are
asked to rate their preferences for 10 U.S.-made cars on a scale from 1 to 10 (1 is the least
preferred, 10 is the most preferred).

The data is shown in the table below. Preference analysis performs a principal component
analysis on the rating data, and then plots the first two principal components from the
analysis to create an approximate two-dimensional display of the consumer preferences for
the 10 cars.
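A minimal sketch of how such a preference map could be produced follows; since the 15-by-10 ratings table is not reproduced here, the car names and random ratings below are purely illustrative.

```python
# A minimal sketch of a preference map: PCA (via SVD) on consumer ratings,
# with cars plotted as points and consumers drawn as vectors. The ratings
# here are random placeholders for the article's consumers-by-cars table.
import numpy as np
import matplotlib.pyplot as plt

cars = ["Taurus", "Contour", "Sable", "Marquis", "Cavalier", "Intrepid"]
rng = np.random.default_rng(0)
ratings = rng.integers(1, 11, size=(15, len(cars))).astype(float)  # consumers x cars

# Treat cars as observations and consumers as variables: center each
# consumer's ratings and take the first two principal components.
X = ratings.T - ratings.T.mean(axis=0)
U, s, Vt = np.linalg.svd(X, full_matrices=False)
car_points = U[:, :2] * s[:2]            # car coordinates (points)
consumer_vectors = Vt[:2].T              # consumer directions (arrows)

fig, ax = plt.subplots()
ax.scatter(car_points[:, 0], car_points[:, 1])
for name, (x, y) in zip(cars, car_points):
    ax.annotate(name, (x, y))
scale = np.abs(car_points).max()
for vx, vy in consumer_vectors * scale:
    ax.arrow(0, 0, vx, vy, alpha=0.3, head_width=0.05 * scale)
ax.set_xlabel("Dimension 1")
ax.set_ylabel("Dimension 2")
plt.show()
```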

Reading the basic map: The points on the map above are cars. The placement of the points
has everything to do with consumers’ ratings. The arrows are the individual consumers who
rated the cars. Cars that project farther along a consumer’s vector are more strongly
preferred by that consumer.

Interpreting the two axes (i.e., the principal components or dimensions) can be tricky.
The first thing to remember is that cars at the positive end of either dimension are preferred
to those at the negative end. Dimension 1 is usually related to consumers’ overall
preference. However, it takes judgment to interpret the meaning of Dimension 2, usually by
observing the placement of the cars and knowledge of those cars. Dimension 2 in the car
example appears to be related to vehicle ride or fuel economy (we will confirm that later).

Variation

Sometimes, in order to confirm the meaning of Dimension 2, a researcher may ask the
consumers to rate three more attributes, like vehicle ride, miles per gallon, and reliability.
The researcher then projects the new attribute information onto the original scatter plot of
cars to produce an “ideal-point” model of preference mapping.

The overlay plot shows you which cars are closest to the ideal level of vehicle ride/miles per
gallon/reliability. For our car example, Marquis has the shortest distance from the ideal
point for the attribute Ride, therefore closest to ideal ride; similarly, Taurus is closest to the
ideal point of Reliability; Contour is closest to the ideal point of MPG.

For insights and ideas, ask the following:

What products do most consumers like?

Where is my product positioned relative to my competitors’ products?

What new consumers should I target for my product?

What new products should I create for consumer segments where there is interest but
currently few products available?

Comparing the first map with the second map, you can postulate that there is a segment of
consumers interested in upscale cars which are reliable and ride well (consumer vectors
pointing to Ride and Reliability, where there is no car). These are potential buyers for luxury
cars which will not break down easily (think Lexus or Acura).

2. Multidimensional scaling map (for analysis of product competitiveness)

Multidimensional scaling is a graphic technique for analyzing the similarities (or
dissimilarities) between products. It is not meant for studying consumer preference, but for
analyzing competitive positioning of the products in the minds of the consumers. In this
exercise, you will create an approximate plot of product points such that distances between
points mirror the degree of their similarity. You can also use this plot to learn something
about the unknown attributes that may underlie consumers’ perception of these products’
similarities.

The data: For a multidimensional scaling survey, it would be ideal, but highly impractical, to
ask every consumer to rate the degree of similarity (or dissimilarity) between all possible
pairs of products, because the number of pairs of products to rate would be too large if
there are many products. Alternatively, each consumer is asked to place the products into
groups of similar products. Consumers can decide as many or as few groups as they like (see
chart below).

Each row of the data contains the groups of similar cars perceived by a particular consumer.
For example, Consumer 2 created three groups of similar cars: group one contains Taurus,
Contour, Grand Prix; group two contains Cavalier, Intrepid, and Concord, and so on.

Multidimensional scaling performs an initial principal component analysis of the original
data, and then improves on the solution iteratively. When the solution can no longer be
improved, the procedure stops and produces an optimal two-dimensional map of product
distances (below).
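A minimal sketch of this workflow follows; the cars, the two consumers' groupings, and the co-occurrence-based dissimilarity measure are illustrative choices rather than the article's actual data or procedure.

```python
# A minimal sketch of building a dissimilarity matrix from grouping data and
# fitting a two-dimensional MDS map. Cars and groupings are illustrative.
from itertools import combinations
import numpy as np
from sklearn.manifold import MDS

cars = ["Taurus", "Contour", "Grand Prix", "Cavalier", "Intrepid", "Concord"]
groupings = [  # one entry per consumer: lists of cars judged similar
    [["Taurus", "Contour", "Grand Prix"], ["Cavalier", "Intrepid", "Concord"]],
    [["Taurus", "Contour"], ["Grand Prix", "Intrepid"], ["Cavalier", "Concord"]],
]

idx = {c: i for i, c in enumerate(cars)}
together = np.zeros((len(cars), len(cars)))
for consumer in groupings:
    for group in consumer:
        for a, b in combinations(group, 2):
            together[idx[a], idx[b]] += 1
            together[idx[b], idx[a]] += 1

# Dissimilarity = share of consumers who did NOT place the pair in one group.
dissim = 1.0 - together / len(groupings)
np.fill_diagonal(dissim, 0.0)

coords = MDS(n_components=2, dissimilarity="precomputed",
             random_state=0).fit_transform(dissim)
for car, (x, y) in zip(cars, coords):
    print(f"{car:11s} {x:7.3f} {y:7.3f}")
```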

Reading the basic map: Points on the map are actual products. Points close together are
perceived by the consumers as being similar. In general:

Points that are closely clustered together are competing against each other.

Points that share the same point or are almost on top of each other are substitute products
for each other.

Study (or estimate) the hidden attributes that describe the dimensions of the plot. These
attributes/dimensions can help explain how consumers judge the degree of similarity
between products. What you learn from consumers may also suggest how to redesign your
product so that it is perceived more favorably.

If the dimensions are not directly interpretable then perhaps the directions as pointed to by
the products, through the space defined by the dimensions, may be interpretable.

Again, for insights and ideas, consider the following:

What products are substitutes for each other?

What products compete with each other?

How do consumers view the competitive positioning of my product? (Which products
compete directly against my product? Which products can my new product hope to
compete against?)

What is the consumers’ overall perception of the competitive marketplace?

How should I reposition my product to better compete in this market?

3. Correspondence map to explore information in any frequency table

Correspondence analysis is an ingenious device to explore the associative relationships and
clustering patterns in the frequency data. For example, you can use the correspondence
map to examine the association between a categorical variable that identifies a group of
customers and another categorical variable that distinguishes your product. It is even
equipped to display multiple categorical variables simultaneously (such as in multi-way
tables of frequency), each having a large number of levels, although with some sacrifice (i.e.,
the distances between all points in the plot become meaningless).

Simple correspondence map

Visual inspection of the following two-variable frequency table of car model by income
suggests these associations:

the lower income level is associated with Cavalier and Contour;

the middle income level is associated with Sable;

the upper income level is associated with Intrepid, Grand Am, and Grand Prix.

                        Income
Car Model            Lower   Middle   Upper   Row Total
Dodge Intrepid           2        7      16          25
Chevrolet Cavalier      49        7       3          59
Pontiac Grand AM         4        5      23          32
Mercury Sable            4       49       5          58
Ford Contour            15        2       5          22
Pontiac Grand Prix       1        7      14          22
Column Total            75       77      66         218

Using these frequency counts, we can construct row and column profiles (see above): row
(or car model) profiles are simply row percentages divided by 100; similarly, column (or
income) profiles are column percentages divided by 100.

These row and column profiles can be thought of as points in a higher-dimensional space.
For example, the six-row (car model) profiles form points in three (column- or income-)
dimensions:

(Lower, Middle, Upper)

1. Intrepid (0.08, 0.28, 0.64)

2. Cavalier (0.83, 0.12, 0.05)

3. Grand AM (0.13, 0.16, 0.72)

4. Sable (0.07, 0.84, 0.09)

5. Contour (0.68, 0.09, 0.23)

6. Grand Prix (0.05, 0.32, 0.64)

Similarly, the three column (income) profiles form points in six-row (car model) dimensions.

(Intrepid, Cavalier, Grand AM, Sable, Contour, Grand Prix)

1. Lower (0.03, 0.65, 0.05, 0.05, 0.20, 0.01)

2. Middle (0.09, 0.09, 0.06, 0.64, 0.03, 0.09)

3. Upper (0.24, 0.05, 0.35, 0.08, 0.08, 0.21)

*Numbers in boldface are more significant than others in influencing the associations.

Note that to accurately describe these profile points in a plot, we would need at least a
three-dimensional plot (the lesser of the three columns and six rows) - which would be
difficult to visualize. We can, however, use correspondence analysis to
perform a variation of principal component analysis appropriate for categorical data on
these row and column profiles, and retain only the first two dimensions (or principal
components) for plotting an approximate representation of the row and column profiles.
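A minimal sketch of that computation follows, using an SVD of the standardized residuals of the table above; this reproduces the mechanics of simple correspondence analysis, not necessarily the exact software or scaling the author used.

```python
# A minimal sketch of simple correspondence analysis on the car-by-income
# table: SVD of the standardized residuals, with rows and columns mapped
# into the same two-dimensional space.
import numpy as np

cars = ["Intrepid", "Cavalier", "Grand Am", "Sable", "Contour", "Grand Prix"]
income = ["Lower", "Middle", "Upper"]
N = np.array([[ 2,  7, 16],
              [49,  7,  3],
              [ 4,  5, 23],
              [ 4, 49,  5],
              [15,  2,  5],
              [ 1,  7, 14]], dtype=float)

P = N / N.sum()                           # correspondence matrix
r, c = P.sum(axis=1), P.sum(axis=0)       # row and column masses
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))   # standardized residuals
U, sv, Vt = np.linalg.svd(S, full_matrices=False)

row_coords = (U * sv) / np.sqrt(r)[:, None]      # principal coordinates
col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]

for name, (x, y) in zip(cars + income, np.vstack([row_coords, col_coords])[:, :2]):
    print(f"{name:11s} {x:7.3f} {y:7.3f}")
```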

Each row or column profile is now displayed as a point in this plot. The plot shows the
association between various levels of the row (car model) profiles with those of the column
(income) profiles. However, owing to data transformation, absolute distances between the
row and column profiles have lost meaning. We can only examine the “cluster pattern” and
“relative distances between clusters.” The more clustered the points are, the more
associated the row or column levels are with each other; conversely, the further apart the
clusters are from each other, the more distinct their relationships are. This correspondence
map graphically confirms the relationships we have observed earlier.

One word of caution: Association does not imply causation. While the points appear
clustered together, they are not necessarily linked in a cause-and-effect manner. For
example, we know that certain income levels tend to own certain cars from our
correspondence map, but we can’t be sure if those income levels caused those cars to be
purchased. A correspondence map can describe a phenomenon, but cannot tell if one
variable causes the other. You will need mathematical modeling, like logistic regression, to
investigate the causal relationships.

Multiple correspondence map

Things get a lot more interesting when you try to make sense of multi-way frequency tables.
Looking at a crosstabulation to figure out the relationship between two variables is easy.
Beyond that, the task becomes much more difficult. For example, if you have four variables,
you may have to examine six crosstabs to guess at the intertwined relationships. Multiple
correspondence analysis is especially effective at simplifying these complex multi-way
tables, and making them into a single display similar to that generated by the simple
correspondence analysis (except there will be more points on the map). You read the
multiple correspondence map much the same way as you would a simple correspondence
map.
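Under the hood, multiple correspondence analysis can be carried out as simple correspondence analysis applied to the indicator (dummy-coded) matrix of all the categorical variables. A minimal sketch with three illustrative variables follows; the data are made up for the example.

```python
# A minimal sketch of multiple correspondence analysis: simple CA applied to
# the indicator matrix of several categorical variables. Data are illustrative.
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "car_type": ["sedan", "suv", "sporty", "van", "sedan", "suv", "sporty", "van"],
    "income":   ["middle", "upper", "upper", "middle", "lower", "upper", "middle", "lower"],
    "segment":  ["S1", "S2", "S3", "S1", "S2", "S3", "S4", "S1"],
})

dummies = pd.get_dummies(df)              # indicator matrix, one column per level
Z = dummies.astype(float).values

P = Z / Z.sum()
r, c = P.sum(axis=1), P.sum(axis=0)
S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
U, sv, Vt = np.linalg.svd(S, full_matrices=False)
category_coords = (Vt.T * sv) / np.sqrt(c)[:, None]   # one point per category level

for name, (x, y) in zip(dummies.columns, category_coords[:, :2]):
    print(f"{name:18s} {x:7.3f} {y:7.3f}")
```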

An example of using a simple correspondence map to solve a puzzle

Situation: The marketing department of a major car manufacturer would like to refine their
customer-targeting strategy. A consumer segmentation was done by a department based on
consumers’ lifestyle demographics only, without regard to the types of cars they drive. The
marketing department wants to know what types of cars these segments are most likely to
buy without redoing the segmentation.

Solution: To quickly find out the types of cars most likely owned by the individual segments,
the analyst tabulates the frequency of car types by consumer segment. The table is
shown below.

A simple correspondence map (below) is produced based on this frequency table.

This plot provides the analyst with some valuable insight into the relationship between the
types of cars and segments:

Segment 4 tends to own performance-types of cars, be they luxury or popular sporty cars.
This is a young, stylish and performance-conscious group of customers.

Segment 6 tends toward sedan types of cars, be they luxury or popular sedans. This
segment values the comfort, stability, and safety of a traditional large car.

Segments 2 and 3 tend to own station wagons or vans. These segments are family-,
children-, or cargo-oriented, and value cars with ample luggage/cargo space.

Segment 7 prefers SUVs, be they luxury or popular SUVs. This segment of consumers may
like the dominant, rugged, and protected road feel of the SUV.

In this example, car personality also matches nicely with owner personality (which is
described by the demographic characteristics of the relevant consumer segments), thereby
increasing the “face validity” of this analysis.

Perceptual mapping and cluster analysis: some problems and solutions
Author - Charles I. Stannard

Article Abstract

This article discusses common issues involved with using perceptual mapping and cluster
analysis and how the author dealt with each of them in specific studies. It discusses three
areas: the potential problem of owners and nonowners of a brand producing spurious or
misleading maps, evaluating market segments based on cluster analysis and using maps
in advertising research.

This article will discuss some common problems and issues analysts have to deal with in
studies using perceptual mapping and cluster analysis. It will describe the problems and the
various ways we dealt with each of them in specific studies.

The first problem, common to much marketing and advertising research, is how to deal with
owners and nonowners of a brand. The context for this discussion is a large study of the
appliance category. The second issue concerns evaluating segments based on cluster
analysis. The data are also from the appliance study. The third issue concerns the use and
interpretation of maps in advertising research. Here the data come from a study of the
automotive category.

Dealing with owners and nonowners

The context for the discussion of owners and nonowners is a large positioning study
conducted for a major maker of appliances. The study was done a year ago to assist the
development of image objectives for the brand. We wanted to understand the
characteristics (attributes and benefits) by which purchasers of major appliances distinguish
manufacturers, and to determine the importance of these characteristics in the purchase
decision. The project had two phases: a qualitative phase to learn which attributes and
benefits consumers use to differentiate among manufacturers, and to understand
qualitatively the process by which consumers purchase major appliances; and a quantitative
phase in which we quantified and tested what we learned in the qualitative phase.

Any time we seek information from consumers, the problem of to whom do we talk
confronts us. This problem is very similar to the problem confronting anthropologists
studying a strange culture. In anthropology it is called the problem of the informed
informant. Whether we are anthropologists in New Guinea or market researchers in the
United States, the problem is the same; while just about everyone we question will respond
with an answer, not all "answers" are equally valid and valuable. Naturally we want good
answers, but since the canons of objectivity, not to mention feasibility, prevent us from
ruling on the "goodness" of each and every answer, we move from evaluating answers to
evaluating "answerers."

In choosing to whom we talk, we estimate whether it is reasonable to expect that a given
person will provide us with good answers. The key criterion we use to judge whether a
person can give us good answers is whether he or she is likely to be knowledgeable about
the subject we are investigating. Typically, we make these judgments based on whether a
person can indicate experience with the subject in which we are interested. An important
indicator of experience is ownership or use of specific products or brands. We codify the
criteria for judging the likely value of a respondent in the qualifications he or she must meet
to enter the study.

The major appliance category (refrigerators, ovens and ranges, dishwashers, clothes
washers, dryers, and microwaves) has several characteristics that determined the
requirements people had to meet to participate in the study. The first is that the repurchase
cycle is quite long - 10 or more years. Moving and remodeling can shorten the cycle, but
typically consumers are only sporadically in the market, usually after a long absence.
Second, while consumers frequently have several different brands of appliances in their
homes, they often cannot list them by brand when asked in an interview. Finally, except for
moving or remodeling, it appears that many purchases are unanticipated, being a quick
response to the actual or expected failure of the product.

These characteristics indicate that most consumers have little current knowledge about
manufacturers and brands. They become interested in the category when they are about to
purchase one or more appliances. Sometimes, as in remodeling and moving, the purchase is
foreseen and the search for information about products and manufacturers can be leisurely
and thoughtful. In other instances, when a current product fails, the search process is much
more hurried and even haphazard. In either case, we think people move from a state of
relatively low awareness and knowledge of manufacturers and their product offerings to
one of relatively high awareness and knowledge in a short period of time, which is
characterized by a comparatively vigorous search for brand and product knowledge.

For our purposes, therefore, we wanted a sample of people who would have greater than
average knowledge of and involvement in the appliance category. They would better
represent those people who are in the market for an appliance and thus would provide a
better picture of the market from their point of view. Therefore, in addition to the usual
demographic, appliance and brand ownership qualifications, we wanted people who were
recently in the market for a major appliance, or anticipated being in the market in the near
future. They best represented the state of mind and knowledge of consumers at the time of
purchase and are the target of the advertising and marketing efforts.

In addition to choosing the right people, we also have to ask them the right questions. In a
positioning study, we typically ask respondents to do two things. First, we ask them what
attributes are important in distinguishing between brands. Then we have them rate brands
on the attributes. Choosing the right questions means asking them to rate brands they are
familiar with on attributes that are important to them in choosing between brands in the
purchase decision. Since the number of attributes and brands of interest was too large-7
brands and 28 attributes-for any one person to rate all combinations of brands and
attributes, we had each respondent rate 4 brands on 12 attributes. The brands and
attributes were selected as follows: Each person rated the client's brand and three other
brands with which he or she was familiar. The attributes were classified a priori into six
categories based on their content. Each person rated two attributes in each category. The
specific attributes were the two he or she rated most highly in each of the six categories.

The major part of the analysis of the appliance market involved producing a map of the
market that located brands and attributes. The advantages of maps are well known. They
provide an economical summary of a great deal of data on brands and attributes, in our case
7 brands and 28 attributes. Another advantage of maps is that the audience, usually
managers, often finds them easier to understand, more revealing, and certainly more
interesting than other ways of presenting the same data. From the analyst's and presenter's
points of view, maps are often easier to present and interpret for an audience than are
complex tables of numbers and coefficients.

Having thoroughly considered - or so we thought - the important issue of product and brand
ownership, we were chagrined to discover that our initial map did not make a great deal of
sense. In analyzing it we found much less discrimination among brands than we expected,
and what appeared to be some odd juxtapositions of brands. We found some of the large,
middle-range brands were positioned very close to smaller, expensive and high-quality
brands. Everything we knew about the market suggested that consumers perceive the
smaller brands as different from the larger/middle range brands.

Thus, instead of shouting "Eureka!," we invoked the first rule of nonsensical analysis:
whenever we find something truly new and unexpected in an analysis, look for an error -
either in the logic of the analysis or in the data themselves. We know from experience that
the odds favoring an error are much greater than those favoring the discovery of something
truly new.

We identified two related aspects of brand ownership as possibly causing the strange map.
First, the proportion of the sample owning specific brands varied greatly, mirroring the
reality of the marketplace. Second, as is usually the case, people rated more highly the
appliance brands they owned than the brands they did not own. In fact, the differences
between brand owners and nonowners were greater in many instances than the differences
among brands, when ownership was controlled. In combination, these two aspects of brand
ownership in our sample could be the reason the larger brands ended up in close proximity
to some of the smaller, more expensive and higher quality brands.

The obvious solution, if these were the cause of the problem, was to separate owners and
nonowners. We did this by creating 14 brands, seven as seen by owners and seven as seen
by nonowners, and estimated the space using 14 brands-the seven original owner brands
plus the seven nonowner brands. This approach has the advantage of using all the
information (i.e., the total sample of ratings) in the sample, rather than a portion of it, as
would be the case if the space were created using only owners. The disadvantage is that it
can be difficult to create mutually exclusive and exhaustive groups of owners and
nonowners when there is extensive multiple ownership of brands.
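A minimal sketch of the data step this implies follows, assuming a long-format table of ratings with an ownership flag for each respondent-brand pair; the column names and toy values are illustrative.

```python
# A minimal sketch of splitting each brand into "owner" and "nonowner"
# versions before estimating the perceptual space. Column names are made up.
import pandas as pd

ratings = pd.DataFrame({
    "respondent": [1, 1, 2, 2, 3, 3],
    "brand":      ["A", "B", "A", "B", "A", "B"],
    "owns_brand": [True, False, False, True, False, False],
    "rating":     [9, 5, 6, 8, 7, 6],
})

# Relabel each rating by ownership status, turning 7 brands into 14 "brands";
# the space is then estimated on brand_split rather than brand.
ratings["brand_split"] = ratings["brand"] + ratings["owns_brand"].map(
    {True: " (owners)", False: " (nonowners)"})
print(ratings[["brand", "owns_brand", "brand_split", "rating"]])
```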

We then re-estimated the space, this time using the 14 brands. The first function or
dimension captured the differences between owners and nonowners. It grouped at one end
the owners of the various brands and placed the nonowners of the brands at the other end.
Furthermore, there was no overlap between brand owners and nonowners of the brands. In
effect, the first dimension accounted for the effects of ownership on the brand ratings.

We based our map on the next two dimensions, which successfully described the
marketplace. One dimension was price/value. Brands at one end of the dimension were
characterized as offering the lowest prices for comparably featured appliances; brands at
the other end of the dimension were seen as saving money in the long run. The third
dimension described quality in two different ways. One was called "promised quality."
Brands offering promised quality were highly recommended by others and promised to
honor warranty claims without hassle or difficulty. The other end of the third dimension was
"experienced quality." Brands offering experienced quality were seen as a pleasure to own
and extremely durable and long lasting.

Thus, we eliminated the negative effects of brand ownership by creating brands to
represent nonowners, estimating the space with the owner and nonowner brands and
discarding the first dimension. This worked because owners rate their brands higher than
they rate brands they do not own, and these differences, in addition to being consistent, are
also substantial, generally being greater than the differences among brands. This explains
why it was the first dimension. The fact that in most categories owners rate their brands
higher than do nonowners suggests that what we did in the appliance category may have
greater utility and generality as a solution to the problem of owners and nonowners.

Segments in positioning research

In addition to mapping the usual groups or segments of consumers like users and non-users,
men and women, and so on, it is also possible to map segments derived from psychographic
data. The latter segments emerge from the data in cluster analysis, as opposed to the
former, which are predetermined according to explicit criteria. Because the segments in
cluster analysis are based on psychographics, they can provide richer and fuller explanations
of behavior and market structure. That is, instead of saying, based on the interpretation of a
map, people choose brand A because it is low priced and readily available, we can, in the
ideal case, elaborate on the reasons people choose Brand A. For example, we might find
that the people choosing Brand A really make up two different segments, one which is very
price sensitive because of low family income, and another which has very little interest in
the category and therefore opts for the low-priced, convenient brand in this category. Of
course, this is the promise of psychographically-based segments. In reality, though, we know
promises are not always kept.

In our appliance positioning study we also included 12 psychographic statements relating to
appliances and shopping in addition to the 28 appliance manufacturer attributes. The
psychographic statements included items like "In buying major appliances the reputation of
the store is more important than the brand name," "When buying appliances, it pays to buy
the best model even though it is more expensive," "It is more important to have good
appliances in the home than good furniture."

Clustering the items produced four consumer types. We named them "Flashy Flora and
Fred," "Needy Nan and Neil," "Classy Carl and Cristy," and "Apathetic Al and Ann." The
demographic and psychographic portraits of the groups appeared to have integrity and
make sense. For instance, Needy Nan and Neil, as their name implied, had the lowest total
family income, with 46 percent of them having total incomes of less than $25,000.
Concomitantly, they were the least educated, with 40 percent having a high school
education or less. They also had the largest families and were the second youngest of the
clusters.

Their attitudes towards appliances fit their demographics. Nan and Neil were very price
sensitive. They wanted to buy the lowest priced appliances from among similar makes and
models. At the same time, they had to have appliances that lasted, more so than any of the
other clusters. Their extreme price sensitivity created a problem for them. They could not
rely on the brand name-an important indicator of quality and durability-to help them
choose the best and lowest-priced brand of appliance. As a result they had to look to other
sources of information to help them choose among brands. They, more than the other
segments, relied on two sources to help them do this. One was Consumer Reports. The
other was whether they thought the manufacturer was a specialist in kitchen or laundry
appliances. They took specialization as an indication of durability, an important attribute in
appliances for them.

Classy Carl and Cristy, by way of contrast, were the wealthiest segment. They had the
highest household income (7 percent were over $35,000), were the oldest on average, and
had the largest homes as indicated by the number of bedrooms and bathrooms. This
segment also had the largest number of college graduates of any segment. Their views
about appliances were quite different from Needy Nan and Neil's. Classy Carl and Cristy
were not very price sensitive. They were the least likely of all segments to look for the least
expensive brand of appliance. Rather, they thought it paid to buy the best model appliance,
even though it was more expensive. For them, however, having appliances that were a
pleasure to own was also very important, as were appliances that were easy to clean and
keep clean. Their attitudes towards appliances were echoed in their views about their
kitchens: they were very proud of them and the way they looked. Indeed, it is likely that Carl
and Cristy judged kitchen appliances for their looks as well as for their quality and features.
Perhaps because they bought the best appliances, Carl and Cristy, of all the segments, had
the most positive attitudes towards appliance makers. This was manifest in their agreement
with those statements that implied a willingness of manufacturers to value customers and
stand behind their products.

While the portraits that cluster analysis creates can be interesting and plausible, it is
important that they relate to product ownership and usage in intuitively meaningful ways.
In the appliance category, we expected to find sharp differences among the clusters in brand
penetration. For instance, we expected to find penetration of the more expensive brands to
be greater for Classy Carl and Cristy than for Needy Nan and Neil. And we expected the
opposite penetration for the lower-priced brands.

In fact, however, we did not find the expected pattern of brand penetration among the
segments. Instead, we found that brand penetration was relatively flat among the
segments. This was puzzling and demanded an explanation. In thinking about the purchase
process, however, an explanation of the lack of differential brand penetration suggested
itself. The explanation focused on two aspects of the retail side of the appliance business.
First, retail sales are increasingly dominated by "power retailers" that continually have sales
featuring specific brands. Second, appliances are as much sold as they are bought.
Salespeople often receive "spiffs" or special sales inducements above the regular
commission from manufacturers for sales of their brand or specific models of their brand.
When this occurs, salespeople work hard to steer people toward these brands, with a fair
amount of success, according to them. When we take these two aspects of the market into
account, the lack of differential brand penetration among segments might reflect a retail
reality that is working against the manufacturers' efforts at creating
and sustaining brand character and differentiation.

This certainly was a plausible explanation of our findings. The question now was whether to
show the segmentation results in conjunction with the perceptual maps. After some
discussion we decided not to show the segmentation results. We thought that they would be
hard to interpret; essentially, explaining the absence of differences is much harder than
showing and explaining differences. In this case, it would be even harder since the argument
was both long and subtle. And, since the results of the segmentation added little to our
overall understanding of the appliance category, not presenting them could be done with little loss.

Parenthetically, I would argue against advancing very subtle explanations of data except
when absolutely necessary. While we may appreciate our subtlety and cleverness in teasing
out implications and formulating explanations, they can be lost on our audiences and can
confuse them as well.

Assessing advertising with perceptual mapping

Perceptual mapping is often used to determine the actual or desired positioning of brands.
The results of such analyses, as was the case in the appliance category, frequently become
the basis for efforts at repositioning a brand in consumers' minds. We use perceptual
mapping much less often to assess whether advertising is in fact positioning brands in the
desired ways. This section will present results from a study that uses perceptual mapping to
assess how advertising is positioning manufacturers.

The data come from an ongoing study of the automotive category. The study is designed to
assess the effect of advertising on the images or positionings of various manufacturers. The
aim of the study is to determine in which direction on a map the advertising for specific
manufacturers is moving the images of these manufacturers. Of course, not all directions
are equal; the desire is that the advertising will move in a direction consonant with the
desired and agreed upon positioning of the specific manufacturer, and this will be the only
manufacturer moving in that direction.

In the study, respondents first rate several automobile manufacturers on 15 image
attributes (quality, sporty, technologically advanced, and so on). They then see six
commercials and read two print ads for several manufacturers and rate each manufacturer
based on what the commercial or ad communicates about the manufacturer. The research
is unique in that it attempts to assess simultaneously the effects of many campaigns, as
opposed to individual commercials and ads, on the perceptions of many manufacturers.

There are two ways to determine the impact of advertising on the images or positions of
automobile makers. Both ways begin with a map showing the structure of the market prior
to exposure to the advertising. The structure is shown in Figure 1.

This map shows that people distinguish among manufacturers in the following ways. On one
dimension they see cars that offer value and appeal to younger people; M best exemplifies
this type of manufacturer. At the other end of this dimension they see cars that appeal to
older people and offer more power and luxury; E is an example of such a maker. The other
dimension has "technologically-advanced" and "high-quality" as its defining characteristics
on one end, and family cars on the other end of the dimension. Both the dimensions and
the placement of the makers make sense to people familiar with the automotive category.
From the map it appears as though consumers have fairly clear pictures of a number of cars.
Where there is confusion in images, it is primarily among the American manufacturers who
are the largest producers in the United States market and have had the greatest difficulty in
differentiating the many models and brands they produce. The classification analysis bears
this out. Overall we correctly classify 33 percent of the respondents, but the correct
classification by maker varies from 13 percent for a domestic manufacturer to 74 percent
for a foreign maker.

It is after exposure to the advertising that we have alternative ways of looking at and
portraying the structure of the market. One option is to apply the original structure to the
post-advertising ratings of each car; the other is to re-estimate the structure using only the
post-advertising ratings. We have done this and the results of these two options are quite
different.

Figure 2 presents the results of applying the original structure to the post-advertising
ratings. This map is radically different from the market represented in Figure 1. All of the
makes are now located in the lower left quadrant, whereas originally only makes I, J, K, and
M were there. Clearly, drastic changes have occurred, changes that most advertisers would
not be pleased with. Figure 2 implies much less differentiation among brands based on the
advertising.

This is apparent when we look at our ability to correctly classify people based on their post-
advertising ratings of manufacturers. Whereas originally we could correctly classify 33
percent of the respondents, our ability drops to 6 percent based on the post-advertising
ratings. Looking at the map, it appears as though every maker's advertising is following the
same strategy and communicating the same message.

Figure 3 presents the results of re-estimating the structure using the post-advertising
ratings. This produces a very different picture from Figure 2. There is greater dispersion
among the manufacturers than in Figure 2. With the exception of maker I, which in the
original map was away from the center and closest to H, the general pattern seems similar
to the original map. At least, we could all agree that this one might be based on the original
structure, whereas we would be much harder pressed to agree with this statement
regarding Figure 2. And our ability to correctly classify makers is not different from our
ability in Figure 1, 34 percent versus 33 percent.

What do these two very different maps tell us? Should we use both to understand what is
happening, or should we choose between them? I think each tells us something important
about what the advertising is communicating about manufacturers and how it is working.

Figure 2 tells us two things. First, it says that the original structure does not adequately
describe the market based on the exposure to the advertising. There is virtually no
differentiation among makers, and only 25 percent of the space is being used. At the same
time, this map tells us something very important from a marketing sense. It says that
everybody seems to be singing the same song about their cars. Everybody wants people to
think their cars are youthful, offer good value for the money, and are technologically
advanced and high quality. The net result is that manufacturers are blurring, rather than
sharpening, their images.

Figure 3 tells us how the structure has changed based on the advertising, and thus provides
insights into how the advertising is working. The horizontal dimension, the first dimension in
this and the original solution, remains basically the same, describing characteristics of cars
that are seen to appeal to older and younger people.

It is the second dimension that changes after advertising. Instead of being a
family/affordable car versus high-quality and technologically-advanced car dimension, it
now runs from family/affordable car on one end to exciting, powerful, quality car on the other
end. This suggests that the advertising is attempting to change the relative importance of
the criteria people use to judge cars. Another way of saying this is that the model of
advertising as agenda setting appears to describe the way advertising is working in the
automotive category.

By examining both maps, we have learned some important things about automotive
advertising. There may be a lesson in this for mapping studies that are repeated at regular
intervals. The lesson is that perceptual mapping can demonstrate the direction of change as
well as the changes in the underlying structure of consumers' perceptions of the
marketplace. Each complements the other and adds to our understanding of consumers and
the structure of the marketplace.

Summary

This article discussed three different issues in positioning research. It offered a way of
dealing with the potential problem of owners and nonowners producing spurious or
misleading maps. The solution was to create owner and nonowner brands and estimate the
space using owner and nonowner brands. We suggested that it was likely that one
dimension would differentiate owners and nonowners and thereby eliminate their effects
from the other dimensions. The article also discussed using segments based on cluster
analysis in maps. The example discussed showed that there may be occasions when it is
better not to display the segments. Finally, the article showed how maps can be used to
assess advertising.

Quadrant analysis (perceptual maps)
Author - Norman Frendberg

Article Abstract

This article describes quadrant analysis, one way to simultaneously analyze what attributes
are important to consumers and how consumers rate particular brands according to those
attributes.

Successful products and services usually satisfy one or more consumer needs. Both
manufacturers and service providers face the questions: "What human needs are being
met?" and "Which features should be emphasized when designing products and
advertising?" The answers to both queries are critically important since future sales volume
depends upon these issues.

Studies are frequently conducted to select the "best" product/concept/advertising from
several choices, where "best" is defined as the highest scorer on both of two dimensions.
For example, the "best" attributes to emphasize in positioning new products are those that
are perceived as both important to the consumer and not adequately provided by existing
products.

If we developed a comprehensive list of the human
needs a product could satisfy, some of them would be perceived as more important than
others, while some needs would be met by products currently on the market. The human
needs that represent the most direct opportunity are those which are concurrently most
important to consumers and perceived as lacking in existing products.

The relevant data includes two respondent evaluations for each of the appropriate human
needs. One evaluation is an importance rating and the other rates the brand used most
often. Quadrant analysis is one way to simultaneously analyze the two dimensions of brand
rating and importance.

This analysis is accomplished by plotting the importance and brand rating scores for each
attribute on one graph. One axis represents the importance measure, while the other axis
illustrates the brand rating. By plotting the coordinates of importance and brand rating for
each attribute, we are able to visually depict their relationship. This visualization will enable
us to both analyze the data and communicate insightful evaluations to management.

Let's continue with a quadrant analysis example based on importance and brand rating. The
data used to create this plot come from two questions such as the following:

1. I'm going to read you a list of statements that you might consider when buying fresh
melons. As I read each one, please tell me if that statement is "extremely important,"
"somewhat important, " "slightly important" or "not at all important" to you when selecting
fresh melons.

2. Now, I'm going to read the same list of statements. This time, please tell me how well you
feel each statement describes "Brand X" fresh melons, which you told me you purchase
most frequently. Does this statement [read first statement] describe "Brand X" melons
"completely," "very well," "somewhat" or "not at all"?

Table 1 below shows the hypothetical attribute scores on both measures: the importance of
each attribute for fresh melons and the corresponding "Brand X" fresh melon ratings.

Table 1

Attribute       Importance                  Rating of Brand X
                (% extremely important)     (% describes completely)
Base            (200)                       (200)
Tastes good     80%                         80%
Feels good      40%                         40%
Looks good      80%                         20%
Sounds good     20%                         40%
Smells good     10%                         60%

Table 1 provides the data to create the following quadrant plot for our hypothetical melon
study:
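A minimal sketch of how such a quadrant plot could be drawn from the Table 1 figures follows; the 50 percent cut-offs used for the quadrant boundaries are an illustrative choice, not a rule from the article.

```python
# A minimal sketch of a quadrant plot for the hypothetical melon data in
# Table 1, with importance on the x-axis and the Brand X rating on the y-axis.
import matplotlib.pyplot as plt

attributes   = ["Tastes good", "Feels good", "Looks good", "Sounds good", "Smells good"]
importance   = [80, 40, 80, 20, 10]   # % "extremely important"
brand_rating = [80, 40, 20, 40, 60]   # % "describes Brand X completely"

fig, ax = plt.subplots()
ax.scatter(importance, brand_rating)
for name, x, y in zip(attributes, importance, brand_rating):
    ax.annotate(name, (x, y), textcoords="offset points", xytext=(5, 5))

# Illustrative quadrant boundaries at 50% on each axis; the lower-right
# quadrant (important but not delivered) is the "direct opportunity" area.
ax.axvline(50, linestyle="--", color="grey")
ax.axhline(50, linestyle="--", color="grey")
ax.set_xlim(0, 100)
ax.set_ylim(0, 100)
ax.set_xlabel("Importance (% extremely important)")
ax.set_ylabel("Brand X rating (% describes completely)")
plt.show()
```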

The attributes with the highest "extremely important" scores are located on the right half of
the map, while those with the lowest "describes completely" association are located on the bottom half.
The quadrant on the lower right side of the map could be referred to as the "direct
opportunity quadrant," since it represents those attributes that are perceived as very
important, but lacking in the current brand.

This plot facilitates the analysis and communication of the results to marketing
management, i.e., it identifies "feels good" and "looks good" as offering special opportunity
because they are considered important and rated low for the existing brand.

The data and corresponding quadrant analysis are not always as straightforward as the
previous melon example. Occasionally, the attribute measures are highly correlated, i.e.,
those attributes that are rated high on importance also score high for the brand rating. This
situation results in a quadrant map that may look like this:

The map in Exhibit 2a is less useful in the creation of actionable marketing insights.
However, with a slightly different calculation method, we are able to use quadrant analysis
even when there is a high correlation between the measures.

The calculation method changes only for the brand rating. The new "describes completely"
percentage for a particular attribute is calculated only among those respondents who rated
that attribute "extremely important." Using the melon example - 40% of the 200 (or 80)
rated "feels good" as "extremely important." Among that 80, we would calculate the
percent who rated "Brand X" as "describes completely" on "feels good."

Thus, the new calculation approach for each attribute is as follows (a sketch follows the list):

- Importance - among the total sample, the percent rating the attribute "extremely important."

- Brand rating - among only those rating the attribute "extremely important," the percent
rating the brand "describes completely."

The resulting quadrant map will appear more like Exhibit 2b, which is more useful in
generating marketing insights. Furthermore, this exhibit may enhance the market
researcher's ability to understand "truth," since the brand ratings are taken only among
those users who regard the attribute as important.

Conjoint Analysis:
A short history of conjoint analysis
Author - Bryan Orme

Article Abstract

From the early 1960s to today, the author charts the growth of and change to the practice
of conjoint analysis.

The field of marketing research has rarely been the genesis for new statistical models.
We’ve mainly borrowed from other fields. Conjoint analysis and the more recent discrete
choice (choice-based conjoint) are no exception, and were developed based on work in the
’60s by mathematical psychologists Luce and Tukey, and in the ’70s by McFadden (2000
Nobel Prize winner in economics).

Marketers sometimes have thought (or been taught) that the word “conjoint” refers to
respondents evaluating features of products or services CONsidered JOINTly. In reality, the
adjective conjoint derives from the verb conjoin, meaning “to join together.” The key nature
of conjoint analysis is that respondents evaluate product profiles composed of multiple
conjoined elements (attributes or features). Based on how respondents evaluate the
combined elements (the product concepts), we deduce the preference scores that they
might have assigned to individual components of the product that would have resulted in
those overall evaluations. Essentially, it is a “back-door” approach (decompositional) to
estimating people’s preferences for features rather than an explicit (compositional)
approach of simply asking respondents to rate the various components. The fundamental
premise is that people cannot reliably express how they weight separate features of the
product, but we can tease this information out using the more realistic approach of asking
for evaluations of product concepts through conjoint analysis.

Let’s not deceive ourselves. Human decision-making and the formation of preferences is
complex, capricious and ephemeral. Traditional conjoint analysis makes some heroic
assumptions, including the proposition that the value of a product is equal to the sum of the
value of its parts (i.e., simple additivity), and that complex decision-making can be explained
using a limited number of dimensions. Despite the leaps of faith, conjoint analysis tends to
work well in practice, and gives managers, engineers and marketers great insight to reduce
uncertainty when facing important decisions. Conjoint analysis isn’t perfect, but we don’t
need it to be. With all its assumptions and imperfections, it still trumps other methods.

Early conjoint analysis (1960s and 1970s)

Just prior to 1970, marketing professor Paul Green recognized that Luce and Tukey’s 1964
article on conjoint measurement (published in a non-marketing journal) might be applied to
marketing problems to understand how buyers made complex purchase decisions, to
estimate preferences and importances for product features, and to predict buyer behavior.
Green couldn’t have envisioned the profound impact his work on full-profile “card-sort”
conjoint analysis would eventually achieve when he and co-author Rao published their
historic 1971 article, “Conjoint Measurement for Quantifying Judgmental Data” in
the Journal of Marketing Research (JMR).

With early full-profile conjoint analysis, researchers carefully constructed (based on
published catalogs of orthogonal design plans) a deck of conjoint “cards.” Each card
described a product profile, such as the one shown in Exhibit 1 for automobiles.

Respondents evaluated each of perhaps 18 separate cards, and sorted them in order from
best to worst. Based on the observed orderings, researchers could statistically deduce for
each individual which attributes were most important, and which levels were most
preferred. The card-sort approach seemed to work quite well, as long as the number of
attributes studied didn’t exceed about six. And, researchers soon found that slightly better
data could be obtained by asking respondents to rate each card (say, on a 10-point scale of
desirability) and using ordinary least squares (regression) analysis to derive the respondent
preferences. In the mid-1970s, Green and Wind published an article in the Harvard Business
Review on measuring consumer judgments for carpet cleaners, and business leaders soon
took notice of this new method.
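A minimal sketch of that estimation step for a single respondent follows, with dummy-coded attribute levels and ordinary least squares; the attributes, levels, design, and ratings are illustrative, not Green's original design.

```python
# A minimal sketch of deriving part-worth utilities from one respondent's
# full-profile card ratings via ordinary least squares. Data are illustrative.
import numpy as np
import pandas as pd

cards = pd.DataFrame({
    "brand":  ["A", "A", "B", "B", "C", "C", "A", "B", "C"],
    "price":  ["low", "high", "low", "high", "low", "high", "high", "low", "low"],
    "rating": [9, 6, 7, 3, 8, 4, 5, 8, 9],   # 10-point desirability ratings
})

# Dummy-code the levels (dropping one reference level per attribute) and
# regress the card ratings on them.
X = pd.get_dummies(cards[["brand", "price"]], drop_first=True).astype(float)
X.insert(0, "intercept", 1.0)
betas, *_ = np.linalg.lstsq(X.values, cards["rating"].to_numpy(float), rcond=None)

# Each coefficient is the estimated part-worth of that level relative to the
# dropped reference level of its attribute.
print(dict(zip(X.columns, np.round(betas, 2))))
```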

Also just prior to 1970, a practitioner named Rich Johnson at Market Facts was working
independently to solve a difficult client problem involving a durable goods product and
trade-offs among 28 separate product features, each having about five different realizations
(levels). The problem was much more complex than those being solved by Green and co-
authors with full-profile card-sort conjoint analysis, and Johnson invented a clever method
of pairwise trade-offs using “trade-off matrices,” which he published in JMR in 1974. Rather
than asking respondents to evaluate all attributes at the same time (in “full profile”),
Johnson broke the problem down into focused trade-offs involving just two attributes at a
time. Respondents were asked to rank-order the cells within each table, in terms of
preference, for the conjoined levels (Exhibit 2).

Respondents completed a number of these pairwise tables, covering all attributes in the
study (but not all possible combinations of attributes). By observing the rank-ordered
judgments across the trade-off matrices, Johnson was able to estimate a set of preference
scores and attribute importances across the entire list of attributes, again for each
individual.

Conjoint analysis in the 1980s

By the early 1980s, conjoint analysis was spreading (at least among researchers and
academics possessing statistical knowledge and computer programming skills). Another
influential case study had been published by Green and Wind regarding a successful
application of conjoint analysis to help Marriott design its new Courtyard hotels. When
commercial software became available in 1985, the floodgates were opened. Based on
Green’s work with full-profile conjoint analysis, Steve Herman and Bretton-Clark software
released a software system for the IBM standard.

Also in 1985, Johnson and his new company, Sawtooth Software, released a software
system (also for the IBM standard) called ACA (adaptive conjoint analysis). Over many years
of working with trade-off matrices, Johnson had discovered that respondents had difficulty
dealing with the numerous tables and in providing realistic answers. He discovered that he
could program a computer to administer the survey and collect the data. The computer
could adapt the survey to each individual in real time, asking only the most relevant trade-
offs in an abbreviated, more user-friendly way that encouraged more realistic responses.
Respondents seemed to enjoy taking computer surveys, and they often commented that
taking an ACA survey was like “playing a game of chess with the computer.”

One of the most exciting aspects of these commercial conjoint analysis programs (traditional
full-profile conjoint or ACA) was the inclusion of “what-if” market simulators. Once the
preferences of typically hundreds of respondents for an array of product features and levels
had been captured, researchers or business managers could test the market acceptance of
competitive products in a simulated competitive environment. One simply scored the
various product offerings for each individual by summing the preference scores associated
with each product alternative. Respondents were projected to “choose” the alternative with
the highest preference score. The results reflected the percent of respondents in the sample
that preferred each product alternative, termed “share of preference.” Managers could
make any number of slight modifications to their products and immediately test the likely
market response by pressing a button. Under the proper conditions, these shares of
preference were fairly predictive of actual market shares. The market simulator took
esoteric preference scores (part worth utilities) and converted them into something much
more meaningful and actionable for managers (product shares).
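To make the mechanics concrete, here is a minimal Python sketch of how such a first-choice ("share of preference") simulator might work. The data structures, attribute names and numbers are hypothetical and purely illustrative; they are not taken from any actual study or commercial package.

# Minimal sketch of a first-choice ("share of preference") market simulator.
# Each respondent is a dict mapping attribute levels to part-worth utilities;
# each product is a dict mapping attributes to the level it carries.

def product_utility(partworths, product):
    # Total utility of one product for one respondent: sum of its level part-worths.
    return sum(partworths[level] for level in product.values())

def share_of_preference(respondents, products):
    # Percent of respondents whose highest-utility alternative is each product.
    wins = {name: 0 for name in products}
    for partworths in respondents:
        best = max(products, key=lambda name: product_utility(partworths, products[name]))
        wins[best] += 1
    return {name: 100.0 * count / len(respondents) for name, count in wins.items()}

# Hypothetical example: two respondents, two competing configurations.
respondents = [
    {"Brand A": 60, "Brand B": 10, "Low price": 20, "High price": 0},
    {"Brand A": 5, "Brand B": 25, "Low price": 40, "High price": 0},
]
products = {
    "Our product": {"brand": "Brand A", "price": "High price"},
    "Competitor": {"brand": "Brand B", "price": "Low price"},
}
print(share_of_preference(respondents, products))  # {'Our product': 50.0, 'Competitor': 50.0}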

Conjoint analysis quickly became the most broadly-used and powerful survey-based
technique for measuring and predicting consumer preference. But the mainstreaming of
conjoint analysis wasn’t without its critics, who argued that making conjoint analysis
available to the masses through user-friendly software was akin to “giving dynamite to
babies.”

Those who experienced conjoint analysis in the late 1980s are familiar with the often
acrimonious debates that ensued between two polarized camps: those advocating full-
profile conjoint analysis and those in favor of ACA. In hindsight, the controversy had both
positive and negative consequences. It certainly inspired research into the different merits
of the approaches. But it also dampened some of the enthusiasm and probably was a drag
on accelerating use of the technique, as some researchers and business managers alike
paused to assess the fallout.

Even prior to the release of the first two commercial conjoint analysis systems, Jordan
Louviere and colleagues were adapting the idea of choice analysis among available
alternatives and multinomial logit to, among other things, transportation and marketing
problems. The groundwork for modeling choice among multiple alternatives had been laid
by McFadden in the early 1970s. The concept of choice analysis was attractive: buyers didn’t
rank or rate a series of products prior to purchase, they simply observed a set of available
alternatives (again described on conjoined features) and made a choice. A representative
discrete choice question involving automobiles is shown in Exhibit 3.

Discrete choice analysis seemed more realistic, natural for respondents, and offered
powerful benefits, such as the ability to better model interaction terms (i.e., brand-specific
demand curves), cross-effects (i.e., availability effects and cross-elasticities), and the
flexibility to incorporate alternative-specific attributes and multiple constant alternatives.
But the benefits came at considerable cost: discrete choice questions were an inefficient
way to ask respondents questions. Respondents needed to read quite a bit of information
before making a choice, and a choice only indicated which alternative was preferred rather
than strength of preference. As a result, there wasn’t enough information to separately
model each respondent’s preferences. Rather, aggregate (summary) models of preference
were developed across groups of respondents, and these were subject to various problems
such as IIA (independence of irrelevant alternatives, commonly known as the “red bus/blue bus” problem) and ignorance of the
separate preference functions for latent subgroups. Overcoming the problems of
aggregation required building ever more complex models to account for availability and
cross-effects (“mother logit” models), and most conjoint researchers either didn’t have the
desire, stomach or ability to build them - not to mention that no easy-to-use commercial
software existed for start-to-finish discrete choice analysis. As a result, discrete choice
analysis was used by a relatively small and elite group throughout the 1980s.

Conjoint analysis in the 1990s

Whereas the 1980s was characterized by a polarization of conjoint analysts into ideological
camps, researchers in the 1990s largely came to recognize that no one conjoint method was
the best approach for every problem, and expanded their repertoire. Sawtooth Software
influenced and facilitated this movement by publishing research (much of it forwarded by its
users at the Sawtooth Software Conference) demonstrating under what conditions different
conjoint methods performed best, and then by developing additional commercial software
systems for full-profile conjoint analysis and discrete choice.

Based on industry usage studies conducted by leading academics, ACA was the most widely
used conjoint technique and software system worldwide. By the end of the decade, ACA
would yield that position to the surging discrete choice analysis. Two main factors are
responsible for discrete choice analysis overtaking ACA and other ratings-based conjoint
methods by the turn of the century:

1) The release of commercial software for discrete choice (CBC or choice-based conjoint) by
Sawtooth Software in 1993.

2) The application of hierarchical Bayes (HB) methods to estimate individual-level models
from discrete choice (principally due to articles and tutorials led by Allenby of Ohio State
University).

Discrete choice experiments are typically more difficult to design and analyze than
traditional full-profile conjoint or ACA. Commercial software made it much easier to design
and field studies, while HB made the analysis of choice data seem nearly as straightforward
and familiar as for ratings-based conjoint. With individual-level models under HB, the IIA
issues and other problems due to aggregation were controlled or entirely solved. This has
helped immensely with CBC studies, especially for those designed to investigate the
incremental value of line extensions or “me-too” imitation products. While HB transformed
the way discrete choice studies were analyzed, it also provided incremental benefits in
accuracy for traditional ratings-based conjoint methods that had always been analyzed at
the individual level.

Other important developments during the 1990s included:

latent class models for segmenting respondents into relatively homogeneous groups, based
on preferences;

Web-based data collection for all main flavors of conjoint/choice analysis;

improvements in computer technology for rendering and presenting graphics;

dramatic increases in computing speed and memory made techniques such as HB feasible
for common data sets;

greater understanding of efficient conjoint and choice designs: level balance, level overlap,
orthogonality, and utility balance;

SAS routines developed by Kuhfeld, especially for design of discrete choice plans using
computerized searches;

advances in the power and ease of use of market simulators (due to commercial software
developers, or consultants building simulators within common spreadsheet applications).

The 1990s represented a decade of strong growth for conjoint analysis and its application in
a fascinating variety of areas. Conjoint analysis had traditionally been applied to fast-moving
consumer goods, technology products and electronics, durables (especially automotive),
and a variety of service-based products (such as cell phones, credit cards, banking services).
Some other interesting areas of growth for conjoint analysis included design of Web sites,
litigation and damages assessment, human resources and employee research, and Web-
based sales agents for helping buyers search and make decisions about complex products
and services.

Analysts had become so trusting of the technique that the author became aware of some
who used conjoint analysis to help them personally decide among cars to buy or even
members of the opposite sex to date!

Year 2000 and beyond

Much of the recent research and development in conjoint analysis has focused on doing
more with less: stretching the research dollar using IT-based initiatives, reducing the
number of questions required of any one respondent with more efficient design plans and
HB (“data borrowing”) estimation, and reducing the complexity of conjoint questions using
partial-profile designs.

Researchers have recently gone to great lengths to make conjoint analysis interviews more
closely mimic reality: using animated 3D renditions of product concepts rather than static
2D graphics or pure text descriptions, and designing virtual shopping environments with
realistic store aisles and shelves. In some cases the added expense of virtual reality has paid
off in better data, in other cases it has not.

Since 2000, academics have been using HB-related methods to develop more complex
models of consumer preference: relaxing the assumptions of additivity by incorporating
non-compensatory effects, incorporating other descriptive and motivational variables,
modeling the interlinking web of multiple influencers and decision-makers, and linking
survey-based discrete choice data with sales data, to name just a few. Additional efforts
toward real-time (adaptive) customization of discrete choice designs to reduce the length of
surveys and increase the precision of estimates have been published or are underway.

Software developers are continuing to make it easier, faster, more flexible and less
expensive to carry out conjoint analysis projects. These software systems often support
multiple interviewing formats, including paper-based, PC-based, Web-based and handheld
device interviewing. Developers keep a watchful eye on the academic world for new ideas
and methods that gain traction and are shown to be reliable and useful in practice.

Commercially-available market simulators are becoming more actionable as they
incorporate price and cost information, leading to market simulations based on revenues
and profitability rather than just “shares of preference.” To reduce the amount of manual
effort involved in specifying successive market simulations to find optimal products,
automated search routines are now available. These find optimal or near-optimal solutions
when dealing with millions of possible product configurations and dozens of competitors -
usually within seconds or minutes. This has expanded opportunities for academics in game
theory who can study the evolution of markets as they achieve equilibrium, given a series of
optimization moves by dueling competitors.

Importantly, more people are becoming proficient in conjoint analysis as the trade is being
taught to new analysts, as academics are including more units on conjoint analysis in
business school curricula, as a growing number of seminars and conferences are promoting
conjoint training and best practices, and as research is being published and shared more
readily over the Internet.

Continues to evolve

Yes, conjoint analysis is 30-plus years old. But rather than stagnating in middle age, it
continues to evolve - transformed by new technology and methodologies, infused by new
intellectual talent, and championed by business leaders. It is very much in the robust growth
stage of its life cycle. In retrospect, very few would disagree that conjoint analysis
represents one of the great success stories in quantitative marketing research.

Conducting full-profile conjoint analysis over the Internet


Author - Bryan Orme and W. Christopher King

Article Abstract

This article discusses pros and cons of various types of text-based e-mail surveys and online
surveys. It also reports on an online full-profile conjoint survey dealing with credit card
preferences. This study used an Internet survey to compare the pairwise and single-concept
approach for computerized FP conjoint analysis.

The advent of the World Wide Web (WWW) is changing the way we communicate in
business. Over the past 20 years, a similar impact was felt with personal computers and
software, overnight delivery services, fax machines, e-mail, and voice mail/answering
machines. The WWW is building on the strengths of these advances.

The growth in Internet usage is truly astounding. According to IntelliQuest, Inc. of Austin,
Texas, as of the first quarter 1998, 32 percent of the U.S. population age 16 and older (or
66.5 million individuals) is on-line. In the period of only a year (fourth quarter 1996 to fourth
quarter 1997), the number of Internet users in the U.S. grew by 32 percent. And if
projections hold, 38 percent of the U.S. population age 16 and older (or 78.4 million
individuals) will be on-line as of third quarter 1998.

As market researchers begin to use the Internet to conduct surveys, they shouldn’t feel
completely disoriented. Internet surveys share much in common with traditional
computerized surveys. The trick is to leverage what we’ve already learned about computer
interviewing and computerized conjoint surveys and apply it to this new and exciting
medium.

This article is organized in two parts. First, we’ll cover general WWW survey research issues,
and then we’ll report on an on-line full-profile conjoint survey conducted over the Web
dealing with credit card preferences.

Computer interviewing: historical perspective

Until recently, the WWW had been largely experimental in the marketing research industry.
Control and access were primitive, limiting the kind of information one could collect. We see
many parallels between early Web research and what was felt when computerized
interviewing first appeared in the ’70s.

The first computerized interviewing was done using terminals connected to large computers
in the mid ’70s. Later, Dr. Richard M. Johnson, chairman of Sawtooth Software, pioneered
PC-based interviewing in 1979 using Apple II computers. He found that he could customize
each interview, not just with programmed skip patterns, but using adaptive heuristics to
formulate efficient preference questions for collecting conjoint data. The computer would
"learn" about a respondent’s preferences and customize each interview to focus on the
most important attributes. In 1985, Sawtooth Software released Ci2 (Computer
Interviewing) and ACA (Adaptive Conjoint Analysis) for the IBM PC to the marketing research
community.

Widespread use of disk-by-mail (DBM) was still many years in the future when PCs became
commonplace in businesses and homes. Today we face similar issues and opportunities with
the Internet. Fortunately, advances in software and the booming popularity of the Internet
means that WWW interviewing is rapidly becoming practical and feasible as an additional
tool for the market researcher.

There are two modalities for collecting market research data over the WWW: e-mailed
surveys and on-line surveys.

E-mail surveys

The text-based e-mail survey is perhaps the easiest method for conducting marketing
research surveys on the Web. Respondents type answers into pre-specified blanks with their
e-mail editor or word processor, and return the completed form to the sender.

Text-based e-mail survey pros:

Low cost: quick and easy to put together.

Text-based e-mail survey cons:

Lots of data cleaning.

Respondents may delete part of the survey with their word processor.

Questionnaires are not very attractive: no graphics, font control or colors.

Respondent sees all questions at once: no automatic skip patterns.

The second form of e-mail survey involves a program executable (usually in a zipped file)
which respondents install on their computers. The data file is e-mailed back to the sender.

E-mailed survey executable pros:

Control of skip patterns and data entry verification.

Attractive surveys, including graphics, font control and colors.

E-mailed survey executable cons:

Many users fear installing software e-mailed to them.

Installation can be time-consuming: best for computer-literate respondents.

Software compatibility across different computers -- on some computers it may not work at
all.

On-line surveys

The other form of Web-based survey is the on-line survey: respondents connect directly to
the Web site which displays the questionnaire. On-line surveys can be formatted as a single
form (page). The respondent scrolls down the page from question to question, then clicks
the submit button to send the information to a server.

Single-form on-line survey pros:

Only a single download required at connection and a single upload when the form is
completed.

Relatively inexpensive to program and administer.

Attractive surveys, including graphics, font control and colors.

Single-form on-line survey cons:

No automatic skip logic.

Data verification only possible at end of survey.

Long forms can seem overwhelming and may not be completed.

Long download time if survey is long, includes complex graphics, and/or your connection is
slow.

An entire interview might be lost if the computer, modem or net connection fails.

Respondents cannot complete part of the form, terminate, and restart at a later time
without losing all their work.

The second type of on-line WWW survey is the multi-form survey. Questions are presented
on different pages (forms), and the data are saved when the respondent clicks the submit
button at the bottom of each page.

Multi-page on-line survey pros:

Permits skip logic and question-specific data verification.

User doesn’t face entire task at once.

Attractive surveys, including graphics, font control and colors.

Multi-page on-line survey cons:

Complex to program without the aid of WWW survey software.

Delay between pages if you have a slow connection or your server has limited bandwidth.

Using passwords to control access to your Web survey

It is usually critical with Web-based surveys to limit access to your survey. Assigning
passwords prevents unauthorized access to your survey and "ballot stuffing." Benefits also
include control over quota cells and restarting of incomplete interviews.

Software compatibility and availability

Incompatibility among browsers and servers remains a major software issue. With the
introduction of the Java programming language and Visual Basic (VB) scripting, additional
functionality can be added to on-line surveys that far exceeds the restrictions of HTML.
Unfortunately, Java standards are still elusive and VB is not supported by all browsers. Very
little is common on the server side, and some software must be customized for each server
configuration.

But, all is not hopeless. New PC-based software makes it possible to construct, administer,
and host your own survey on either your own Web server, your ISP’s (Internet Service
Provider) server, or the server belonging to the manufacturer of the survey software. The
advantage of hosting your own site or using an ISP is that you have control over the study.
You also avoid the per-interview costs that are frequently associated with hosting on
someone else’s marketing research site. It also means that you can easily test your
questionnaire, add questions while a study is in progress, and monitor its progress on-line.

Is the Web appropriate for your research?

Much has been said about the representativeness of data collected over the Internet. We
trust you have studied the arguments to determine that the Internet is the right vehicle for
your research study. We won’t spend time addressing these arguments, but will proceed
under the assumption that the Internet is appropriate for your research study.

We’ll now focus our attention on conducting full-profile conjoint analysis on the Internet.

Conjoint analysis usage

In a 1997 survey of conjoint analysis usage in the marketing research industry, ACA
(Adaptive Conjoint Analysis) was found to be the most widely used conjoint methodology in
both the U.S. and Europe (Vriens, Huber and Wittink, 1997). Traditional full-profile (FP)
conjoint was also reported as a popular method. In general, we believe traditional FP
conjoint is an excellent approach when the number of attributes is around six or fewer,
while ACA is generally preferred for larger problems.

Paper vs. computerized full-profile conjoint

FP conjoint analysis studies can be done either as paper-based or as computerized surveys
(Internet surveys, disk-by-mail, or CAPI). Because they typically involve fixed designs and,
unlike ACA, are not adaptive, computerized FP surveys offer no real benefit over the
paper-based approach in terms of the reliability or validity of the results. In fact, paper-
based FP may work better than computerized FP. With traditional paper-based card-sort,
respondents can examine many cards at the same time, comparing and manipulating them
into piles. This helps respondents learn the range of possibilities and settle on a reliable
response strategy. With computerized approaches, respondents see only one isolated
question at a time. It may take a few questions for respondents to learn about the range of
possibilities and settle on a reliable response strategy. It is probably beneficial with
computerized FP, therefore, to show the best and worst profiles early on in the survey.

Even though computerized FP probably offers no significant benefit over paper-based
surveys in terms of reliability or validity, real benefits might be realized in survey
development and data collection costs.

Pairwise versus single-concept approach

Pairwise and single-concept presentation are two popular approaches for FP conjoint. A
pairwise FP conjoint question administered over the Internet is shown below.

The single-concept approach is represented below.

With pairwise questions, respondents make comparative judgements regarding the relative
acceptability of competing products. The single-concept approach probes the acceptability
of a product, and de-emphasizes the competitive context. Both methods have proven to
work well in practice, but we are unaware of any study other than this one that has directly
compared these two approaches.

Purchase likelihood ratings reflect the absolute desirability of product profiles. With
pairwise ratings, we only gain relative information. This potentially can be a critical
distinction, depending upon the aim of the research. Consider the person who takes a
pairwise conjoint interview designed to find the optimal blend for tofu. The conjoint utilities
might appear reasonable, even though he finds tofu disgusting and has absolutely no desire
to ever buy it. If we use single-concept profiles, we can both derive utilities and learn about
a respondent’s overall interest in the category. Respondents who have no desire to
purchase can be given less weight in simulations, or be thrown out of the data set entirely.
The danger with single-concept ratings is that if a person gives most of the profiles the
lowest (or highest) rating, there is limited variation in the dependent variable, and we may
not be able to estimate very stable utilities.

One need not give up the benefit of measuring purchase likelihood when using the pairwise
approach. Both pairwise and single-concept conjoint questions can be included in the same
survey. Single-concept purchase likelihood questions could be used to calibrate (scale)
pairwise utilities (as is done in ACA). We can get the benefit of the comparative emphasis of
pairwise questions while including information on purchase intent.

An experiment

We designed an Internet survey to compare the pairwise and single-concept approach for
computerized FP conjoint analysis.

The subject for our study was credit cards, with four attributes: brand (Visa, Mastercard, Discover), annual fee (no annual fee, $20, $40), interest rate (10%, 14%, 18%) and credit limit ($5,000, $2,000, $1,000).

Respondents completed both pairwise and single-concept conjoint questions (in rotated
order). Enough conjoint questions (nine) were included to estimate utilities (12 part-worths)
for both the pairwise and single-concept designs at the individual level. These designs had
only one degree of freedom. In general, we would not recommend conjoint designs with so
few observations relative to estimated parameters. For the purposes of our methodological
study (respondents were required to complete both designs in the same interview) these
saturated designs seemed satisfactory. Additionally, holdout choice sets were administered
both before and after the traditional conjoint questions.

A total of 280 respondents completed the survey. Respondents self-selected for the
survey, which was launched from a hyperlink on Sawtooth Software’s home page. This
sampling strategy would admittedly have been poor had we been interested in collecting a
representative sample. But the purpose of our study was not to achieve outwardly projectable results, but
rather to compare the within-respondent reliability of alternative approaches to asking FP
computerized conjoint.

We took three steps to help ensure the quality of our data: 1) we required respondents to
give their name and telephone number for follow-up verification; 2) we included repeated
holdout choice tasks for measuring reliability and flagging "suspect" respondents; and 3) we
examined the data for obvious patterned responses.

Measuring the reliability of conjoint methods

Reliability and validity are two terms often used to characterize response scales or
measurement methods. Reliability refers to getting a consistent result in repeated trials.
Validity refers to achieving an accurate prediction. Our study focuses only on issues of
reliability.

Holdout conjoint (or choice) tasks are a common way to measure reliability in conjoint
studies. We call them holdout tasks because we don’t use them for estimating utilities. We
use holdouts to check how well conjoint utilities can predict answers to observations not
used in utility estimation. If we ask some of the holdout tasks twice (at different points in
the interview), we also gain a measure of test-retest reliability.
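As a simple illustration, test-retest reliability on repeated holdouts reduces to percent agreement. The Python sketch below assumes the first and second answers to the same tasks have been pooled into two parallel lists of chosen-concept indices; this is an illustrative structure, not the authors' actual code.

# Percent agreement between the first and repeated administrations of the same holdout tasks.
def test_retest_agreement(first_answers, second_answers):
    # Both lists hold chosen-concept indices, matched task for task across respondents.
    matches = sum(a == b for a, b in zip(first_answers, second_answers))
    return 100.0 * matches / len(first_answers)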

We included a total of three different holdout choice questions in our Internet survey, which
looked like:

These questions came at the beginning of the interview, and then the same ones (after
rotating the product concepts within set) were repeated at the end of the survey.
Respondents on average answered these holdouts the same way 83.0 percent of the time.
This test-retest reliability is in line with those reported for other methodological studies
we’ve seen that were not collected over the Internet. But one can argue that our
respondents (marketing and market research professionals) were a well-educated and
careful group. We cannot conclude from our study that Internet interviewing is as reliable as
other methods of data collection.

We use the holdout choice tasks to test the reliability of our conjoint utilities. We would
hope that the conjoint utilities can accurately predict answers to the holdout questions. We
call the percent of correct predictions the holdout hit rate. Some have referred to hit rates
as a validity measurement, but prediction of holdout concepts asked in the same conjoint
interview probably says more about reliability than validity.
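A holdout hit rate can be computed along the lines of the Python sketch below. The data structures are assumed for illustration: one dict of level part-worths per respondent, and for each respondent a list of holdout tasks recording the concepts shown and the concept chosen.

# Sketch of a holdout hit-rate calculation (hypothetical data structures).
def hit_rate(utilities, holdouts):
    hits = total = 0
    for partworths, tasks in zip(utilities, holdouts):
        for concepts, chosen_index in tasks:
            # Score each concept by summing the part-worths of the levels it contains,
            # then predict that the respondent chooses the highest-scoring concept.
            scores = [sum(partworths[level] for level in concept) for concept in concepts]
            predicted = scores.index(max(scores))
            hits += (predicted == chosen_index)
            total += 1
    return 100.0 * hits / total  # percent of holdout choices predicted correctly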

Comparing different conjoint methods using holdouts will usually favor the conjoint method
that most resembles the holdouts. The comparative nature of the pairwise approach seems
to more closely resemble the choice tasks (showing three concepts at a time) than does
single-concept presentation.

Holdout predictions are not the only way to measure reliability. We can also examine
whether part-worth utilities conform to a priori expectations. Three of the attributes
(annual fee, interest rate, and credit limit) were ordered attributes (i.e., low interest rates
are preferred to high interest rates). When part-worth utilities violate known relationships,
we refer to these as reversals.
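Counting reversals is mechanical once each ordered attribute's levels are listed from worst to best, as in the Python sketch below; the level names mirror this study, but the data structure is assumed for illustration.

# Count reversals: adjacent part-worths that violate the a priori (worst-to-best) order.
def count_reversals(partworths, ordered_attributes):
    # partworths: dict mapping level name to utility for one respondent.
    # ordered_attributes: dict mapping attribute to its levels, listed worst to best.
    reversals = 0
    for levels in ordered_attributes.values():
        values = [partworths[level] for level in levels]
        reversals += sum(1 for lower, higher in zip(values, values[1:]) if higher < lower)
    return reversals

ordered_attributes = {
    "annual fee": ["$40 annual fee", "$20 annual fee", "No annual fee"],
    "interest rate": ["18% interest rate", "14% interest rate", "10% interest rate"],
    "credit limit": ["$1,000 credit limit", "$2,000 credit limit", "$5,000 credit limit"],
}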

Reliability of pairwise versus single-concept approach

The holdout hit rates for the pairwise and single-concept approach were 79.3 percent and
79.7 percent, respectively. This is a virtual tie; the difference is not statistically significant.

These findings suggest that both methods perform equally well in predicting holdout choice
sets.

The average number of reversals per respondent was 1.5 and 1.3 for pairwise and single-
concept designs, respectively. The difference was significant at the 90 percent confidence
level. These findings suggest that utilities from pairs questions may contain a bit more noise
than singles. The difference was small, however, and we caution against drawing general
conclusions without more corroborating evidence.

Qualitative evidence

In addition to completing conjoint tasks, we asked for qualitative evaluations of the pairwise
versus the single-concept approach. Respondents perceived that the pairwise questions
took only 13 percent longer than the singles. We asked a battery of questions such as
whether respondents felt the conjoint questions were enjoyable, easy, frustrating, or
whether the questions asked about too many features at once. We found no significant
differences between any of the qualitative dimensions for pairwise vs. single-concept
presentation.

Conjoint importances and utilities

We calculated attribute importances in the standard way, by percentaging the differences
between the best and worst levels for each attribute. Conjoint importances describe how
much impact each attribute has on the purchase decision, given the range of levels we
specified for the attributes.

We constrained the utilities to conform to a priori order for annual fee, interest rate and
credit limit. Further, we scaled the conjoint utilities (at the individual level) so that the worst
level was equal to zero, and the sum of the utility points across all attributes was equal to
400 (the number of attributes times 100). Importances were computed at the individual-
level, then aggregated.
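The scaling and importance calculations described above can be expressed compactly. The Python sketch below is one plausible reading of that procedure for a single respondent; the nested dictionary structure is assumed purely for illustration.

# partworths: dict of attribute -> {level: raw utility} for one respondent (assumed structure).

def scale_utilities(partworths, points_per_attribute=100):
    # Shift each attribute so its worst level equals zero, then rescale so the sum of all
    # shifted part-worths equals the number of attributes times 100 (400 in this study).
    shifted = {attr: {lvl: u - min(levels.values()) for lvl, u in levels.items()}
               for attr, levels in partworths.items()}
    total = sum(u for levels in shifted.values() for u in levels.values())
    target = points_per_attribute * len(partworths)
    return {attr: {lvl: u * target / total for lvl, u in levels.items()}
            for attr, levels in shifted.items()}

def importances(partworths):
    # Each attribute's importance is its best-minus-worst range as a percent of the sum
    # of ranges across all attributes (computed per respondent, then averaged).
    ranges = {attr: max(levels.values()) - min(levels.values())
              for attr, levels in partworths.items()}
    total = sum(ranges.values())
    return {attr: 100.0 * r / total for attr, r in ranges.items()}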

Importances and utilities for pairs vs. single-concept presentation were as follows:

Conjoint Importances

                       Pairs    Single-Concept
Brand                   18%          19%
Annual fee              37%          37%
Interest rate           21%          20%
Credit limit            24%          24%

Conjoint Utilities

                       Pairs    Single-Concept
Visa                     36           38
Mastercard               27           31
Discover                 13           12
No annual fee           104          104
$20 annual fee*          44           34
$40 annual fee            0            0
10% interest rate        55           55
14% interest rate        30           30
18% interest rate         0            0
$5,000 credit limit      64           67
$2,000 credit limit      27           29
$1,000 credit limit       0            0

*statistically significant difference at 99% confidence level

The only significant difference for either conjoint importances or utilities between the two
full-profile methods occurred in the utility for the middle level of annual fee ($20). In a
presentation at our 1997 Sawtooth Software Conference, Joel Huber of Duke University
argued that respondents may adopt different response strategies for sets of products versus
single-concept presentation. He argued that when faced with comparisons, respondents
may simplify the task by avoiding products with particularly bad levels of attributes. Annual
fee was the most important attribute. The larger gap between the worst and middle level
(44-0) for pairs versus single-concept (34-0) is statistically significant at the 99 percent
confidence level (t=3.93) and supports Huber’s "undesirable levels avoidance" hypothesis.

Pairwise versus single-concept FP conjoint: conclusions and suggestions

Our data tell a comforting story, suggesting that both computerized pairwise and single-
concept FP ratings-based conjoint are equally reliable and result in the same importances
and roughly the same utilities. Computerized FP conjoint seems to have worked well for a
small design such as our credit card study. Given that the researcher has determined that
the Internet is an appropriate vehicle for interviewing a given population, our findings
suggest that FP conjoint can be successfully implemented via the Internet for a small study
including four attributes.

Benefit impact analysis (Alternative to Conjoint)


Author - Ed Cohen

Article Abstract

Conjoint analysis is incredibly useful to managers. This article outlines benefit impact
analysis, a relatively simple technique for exploring product elements that produces a
measure analogous to conjoint’s utility values when a standard conjoint analysis is not feasible.

With the advent of conjoint analysis and other sophisticated modeling techniques,
considerable progress has been made in giving management the kind of information it
needs to make tactical and strategic decisions about a product or service. These decisions
are based on evaluating numerous and complex marketing issues such as competitive
frame, brand positioning, product design, and packaging and pricing -- each with its own
almost bewildering array of alternatives.

It is beyond the intent and scope of this article to discuss the many very useful techniques
available today. Rather, we will outline one relatively simple technique, benefit impact
analysis, for exploring a series of product elements that produces a measure analogous to
conjoint's utility values in circumstances where a standard conjoint analysis may not be
possible.

Application

Benefit impact analysis warrants consideration in any of the following situations:

As a preliminary to a conjoint study to help define the range of variables, such as quantity,
size, capacity, price, etc., to be included in the conjoint matrices.

Where variables cannot be precisely quantified. For example, a discrete value can be
assigned to price, quantity, certain physical attributes, interest payout levels and others.
Inches, ounces, dollars, cents and primary colors are concrete and readily understood by
consumers. On the other hand, many sensory variables are less clearly quantifiable in terms
that respondents comprehend. These might include such elements as "degree of softness,"
"strength of fragrance," and "carbonation level."

In cases where, for any reason (such as low incidence categories or market targets),
personal interviews may be prohibitively costly, BIA data may be collected by telephone.

The following case history illustrates one application of BIA in a situation involving both
easily quantified and more qualitative types of variables. This particular study was done with
personal (central location) interviews.

Study background

The client, a manufacturer of household paper products, was battling several strongly
competitive brands, some of which were uniquely positioned and continually chipping away
at the company's brand share. To thwart the erosion of brand share, management felt it
necessary to modify its own brand in some way and considered four possibilities. Each of
the alternatives would have some impact on the others and confusion reigned.

The attributes

Four variables relating to the category were candidates for modification: quantity per
package, price per package, product absorbency, product softness.

The first two, quantity and price, are clearly definable in precise terms easily understood by
consumers. Absorbency and softness are not. Think about what 10% softer means to the
average respondent. We decided after discussions with the client that, although imperfect
and admittedly still ambiguous, respondents would relate more easily to purely verbal
descriptors, e.g., a little softer, a lot softer.

Method

The BIA technique was utilized to determine the relative appeal of hypothetical
modifications in the four product benefit areas. Two levels for each benefit were
considered:

Quantity
-25 more per package
-50 more per package

Price
-5¢ less per package
-10¢ less per package

Softness
-A little softer
-A lot softer

Absorbency
-A little more absorbent
-A lot more absorbent

Each benefit level or option was paired with every other in the array, except that the two
options within benefits were not paired for obvious reasons, e.g., 5 cents less vs. 10 cents
less. Thus, there were 24 "cross-benefit" pairs.

Note that one is limited to a relatively few variables, since the number of combinations
(pairs) increases dramatically as we add benefits and/or levels. For example, the addition of
a fifth benefit, maintaining two levels for each, yields a total of 40 cross-benefit pairs.
Adding one level to each of the four benefits produces 54 such pairs. In both cases,
respondent judgments are likely to become fuzzy long before the final few choices are
made.
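The pair counts quoted above follow from simple combinatorics: all pairs of options, minus the pairs of options that belong to the same benefit. A quick Python check (the function name is ours, for illustration):

from math import comb

def cross_benefit_pairs(levels_per_benefit):
    # All pairs of options minus the within-benefit pairs, which are never shown.
    total_options = sum(levels_per_benefit)
    within = sum(comb(n, 2) for n in levels_per_benefit)
    return comb(total_options, 2) - within

print(cross_benefit_pairs([2, 2, 2, 2]))     # 24 cross-benefit pairs, as in this study
print(cross_benefit_pairs([2, 2, 2, 2, 2]))  # 40 pairs after adding a fifth benefit
print(cross_benefit_pairs([3, 3, 3, 3]))     # 54 pairs with three levels per benefit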

Respondents were presented with the series of 24 benefit/level pairs on a rotated basis, and
given the following instruction:

"Please read each pair of alternatives and select the one choice you would prefer over the
other, according to which you personally would rather have in your (product category)."
Respondents then made their selections on a self-administered basis. Had the study been
conducted by telephone, instructions would have been modified to accommodate the
reading of each pair by interviewers to elicit verbal choices.

BIA analysis

A. Share of preference. The analytic model calculates a "share of preference" for each of the
eight benefit levels, along with statistical significance of the differences among the
respective items.

Among the eight alternatives, the one with the greatest impact is the 10-cent reduction in
price (15.12 share), while a close second position is held by 50 more per package (14.70
share). Clearly, the desire for economy is stronger than qualitative considerations, but these
data suggest substantial absorbency improvements are likely to induce greater interest in
the brand than more modest changes in quantity or pricing.

B. Benefit leverage. Using share of preference data we can answer the following type of
question:
"What is the relative leverage value (or elasticity) of each type of benefit investigated?"
A simple calculation provides an estimate of the leverage/elasticity value for each of the
benefit areas.

Exhibit B
Benefit Impact Scores

                          Share of      Difference (benefit
                          preference    impact score)
Price
  10¢ less                15.12
  5¢ less                 12.15         2.97
Quantity
  50 more per package     14.70
  25 more per package     12.44         2.26
Absorbency
  Lot more absorbent      13.29
  Little more absorbent   11.99         1.30
Softness
  Lot softer              10.36
  Little softer            9.95          .41

As shown in Exhibit B, leverage seems to be greatest for price, followed by quantity. This
may be interpreted to mean that consumers are more sensitive to these benefits than to
the others. Absorbency, while intrinsically important to the category, offers more modest
leverage value, possibly because most brands in the category offer at least acceptable
absorbency benefits. Softness, too, at the bottom of the benefit share hierarchy, seems to
be meeting consumers' basic expectations and offers the least opportunity for marketing
leverage.
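The article does not spell out the analytic model behind the shares, but one plausible way to approximate them from the paired choices is to credit each option with the proportion of all choices it wins; the benefit impact score is then simply the difference between the two levels of the same benefit, as in Exhibit B. The Python sketch below reflects that assumption and should not be read as the author's actual model.

# Assumed tabulation: share of preference = percent of all pooled choices won by each option.
def shares_of_preference(choices, options):
    # choices: list of (option_a, option_b, winner) tuples pooled across respondents.
    wins = {opt: 0 for opt in options}
    for a, b, winner in choices:
        wins[winner] += 1
    total = sum(wins.values())
    return {opt: 100.0 * wins[opt] / total for opt in options}

def impact_score(shares, stronger_level, weaker_level):
    # e.g., "10 cents less" (15.12) minus "5 cents less" (12.15) = 2.97 for price.
    return shares[stronger_level] - shares[weaker_level]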

Summary

BIA offers the researcher a fairly simple but useful technique which estimates the relative
consumer appeal of certain changes in product attributes/benefits. It also provides a
reading of the relative impact of benefit variables.

The potential applications of BIA are not limited solely to products, nor is the method
limited to personal interviewing. The technique is quite versatile and warrants consideration
in working towards a solution for your next configuration problem, be it for a new or
established product or service.

Segmentation:
Ten guidelines for a good segmentation
Author - Peter Flannery

Article Abstract

The author provides five methodological and five applied-marketing guidelines to help
readers craft better segmentations.

Look for the similarities

Very little has been written about what makes a good segmentation. In this article, I will
tackle that topic by exploring 10 guidelines. The first five are methodological goals. The
remaining five are marketing goals.

Five methodological guidelines

1. Similarity within segments

A good segmentation must find a set of objects (whether individuals, companies or
products) that are similar to each other. Finding similar objects is not always easy. For one,
there is the issue of “On what basis (topics) are the objects similar?” In addition, there is the
issue of defining similarity: How similar is similar? For example, say you have decided to
segment your market by company size. You will still face the issue of defining what counts
as similar size ranges. Even with a concrete topic like size, it can be hard to make a clear
decision rule about size breaks. Is a company with 25 employees more like a five-employee
company or more like a 250-employee company? The solution is often to hope for a logical
break in one’s database. For example, one may find that most companies with 25
employees do not have a separate HR benefits manager. If one is marketing HR benefits
(health insurance, 401k plans, legal assistance), the 25-employee company will probably be
seen as more similar to a five-employee company.

Number of employees is a fairly objective topic. When the segmentation topics are
intangible attributes, such as attitudes or preferences, defining similarity can get even
messier. Fortunately, multivariate algorithms can automatically investigate the covariance
among dozens of input variables to see where they clump together and to find bumps or
piles in a multidimensional space.

2. Differences across segments

The second goal of a good segmentation is to find groups that are clearly distinct from each
other. Groups with fuzzy boundaries are the blight of good segmentation models, at least as
they are usually conceived. Finding differences across segments is connected to the first
goal of finding similarity within segments. The two goals are corollaries. In statistical terms,
it is sometimes even said that the variance (distance) across groups should be maximized
while the variance within groups should be minimized. In other words, people within a
group will look similar, but people in other groups will look different.
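That statistical notion can be made concrete with within-group and between-group sums of squares. The Python sketch below assumes respondents are described by numeric profile vectors and already carry a segment label; it is illustrative only.

import numpy as np

def within_between_ss(data, labels):
    # data: list of numeric profile vectors; labels: segment assignment per respondent.
    X = np.asarray(data, dtype=float)
    labels = np.asarray(labels)
    grand_mean = X.mean(axis=0)
    within = between = 0.0
    for g in np.unique(labels):
        members = X[labels == g]
        centroid = members.mean(axis=0)
        within += ((members - centroid) ** 2).sum()                      # spread inside each segment
        between += len(members) * ((centroid - grand_mean) ** 2).sum()   # spread across segments
    return within, between  # a good solution keeps "within" small relative to "between"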

3. Interpretability of segments

Once groups have been found, it is common to interpret and name the groups. Sometimes
interpreting a segment is easy. Consider, for example, a group of individuals whose data
shows that they love all types of food, both diet and regular foods. They frequently visit
restaurants, buy upscale kitchen appliances and watch cooking shows. The interpretation of
this group is straightforward. Thus, marketers are free to wordsmith on a catchy name for
the group, such as Yummies, Food Lovers or Foodies. Whatever its final name, this segment
gets an A+ on interpretability. It is internally consistent, coherent and compelling.

Unfortunately, statistical algorithms can also come up with segments that don’t make sense.
Segments may be non-interpretable. For example, consider a hypothetical Adventurers
segment. Their data show that they like to take risks on outdoor hobbies, drive fast and play
Lotto more than other segments. So far, so good. However, elsewhere their data say that
they are low on watching dramas and fright shows, they will not experiment with new
products, and they index higher on wanting airbags than on wanting horsepower. One can
slap the name Adventurers on this group, but the overall interpretation of the group is
suspect. This segment gets a C- on interpretability.

Technically, a segment does not need a coherent interpretation to be valuable. A portfolio
of stocks, for example, can lack a consistent theme and be poorly named, yet it may still be
profitable. For most marketers, however, it is difficult to accept any segment that lacks a
common-sense interpretation, let alone a catchy name.

4. Measurability of segments

Segments differ on how well they can be measured. Sometimes, segments can be identified
with relatively objective measures such as gross revenue, vehicle ownership, shoe size or
type of hospitalization (e.g., acute vs. rehab). These are cut-and-dried topics that are fairly
easy to measure. Other times, however, segments can only be identified with subjective
topics such as new technology attitudes or computer brand loyalty or health care services
knowledge. The latter topics are more difficult to measure.

Of course, questions about such attitudes, loyalty and knowledge can always be asked. But
respondents may struggle with these questions. The same respondent may even give
different answers to the same question, just because the question is ambiguous. Ideally, you
will pilot-test and refine any ambiguous questions before you build a segmentation model
on those questions. Alas, time does not always permit such pilot testing and refinement.
Time or not, a good segmentation still requires good measurement.

5. Stability of segments

There are many types of cluster analysis. Most cluster analysis techniques will always make
segments. When there are clear and natural breaks (real divisions) in the data, most
techniques tend to get the same answer. But watch out: when the data are flat, and when
there are few real divisions in the data, the various techniques will still make segments. The
problem is that the resulting segments are arbitrary. They are neither reliable nor stable.

Methodologically, there are a couple of ways to assess segment stability. Neither method is
perfect. Some experts track the number of weakly-classified respondents - that is, the
number of respondents who sit on the border between segments. If the number of
borderline respondents is too high, the segmentation solution is deemed unstable. Other
experts prefer to use cross-validation techniques to index stability. Here, for example, one
may split the sample into odd- versus even-numbered records (or ID numbers). Do the split
samples share the same cluster solution? Sample sizes are often small in B2B research,
potentially ruling out the split-sample approach. Fortunately, both methods of measuring
stability are acceptable. In fact, either method would be an improvement for most
segmentation studies.
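A split-sample check might look like the Python sketch below, which clusters the odd- and even-numbered records separately and then compares how the full sample would be assigned under each half's solution. K-means from scikit-learn is used purely for illustration; any clustering routine could be substituted.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

def split_sample_stability(X, k, seed=0):
    X = np.asarray(X, dtype=float)
    odd, even = X[::2], X[1::2]  # split the file by record number
    km_odd = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(odd)
    km_even = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(even)
    # Assign the full sample under each half's solution and measure agreement with the
    # adjusted Rand index (1.0 = identical partitions, near 0 = chance-level agreement).
    return adjusted_rand_score(km_odd.predict(X), km_even.predict(X))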

Five applied-marketing guidelines

6. Size of segments

Size matters when it comes to segments. Your key segments must be large enough to
support revenue generation. By default, junior marketers often look for segments that are
at least 20 percent of consumers, hoping that their 20 percent target segment will provide
80 percent of the available profits. They are following an idealized Pareto rule, which
indexes at 400 percent. There is no harm in following this expectation, but it rarely works.

More often, marketers are lucky enough to find that 15 percent of consumers generate 45
percent of profits (a 300 percent index on a smaller base). In hyper-segmented markets,
one’s target segments are often much smaller. For example, one may have to settle for a
segment that accounts for 10 percent of consumers, 16 percent of profits, and thus, indexes
at 160 percent. Obviously, the hypothetical segment sizes and index scores illustrated above
are not benchmarks. Rather, the acceptable size for segments depends on your business
model and industry.

7. Availability (accessibility) of segments

Just because a segment is easy to measure (per Guideline 4) does not guarantee that the
segment is easy to find. Segments can be inaccessible simply because they are defined on
non-public topics - that is, on topics that are not available in syndicated databases. A shoe
company can easily define consumer groups by shoe width, but it will be hard to find a
database that provides access to double-E-width consumers. Likewise, C-level executives
(CEO, CFO, CMO) are easy to define as a segment. However, C-level executives are
notoriously inaccessible. They often hide from marketers, let alone from marketing
researchers.

The two guidelines of measurability and availability interact in the future recruitment of
segments. After making a successful segmentation model, marketers often want to create a
short survey (or segmentation screener) to find more segment members, either for future
marketing research or for sales calls. The fewer the questions, the easier it is to implement
this segmentation screener. In some organizations, the segmentation screener takes on a
life of its own. It becomes the de facto segmentation for years to come, long after
researchers have forgotten the original segmentation study. In such cases, the screener
must work well with all the normal and easily accessible sample sources, whether phone,
mail or Internet sample sources.

8. Brandability of segments

By brandability, I simply mean that a brand does well in a key segment. By now, if you have
made it as far as Guideline 8, it is possible to evaluate whether a specific segment can be
adopted as a priority or target segment for your brand.

Ideally, your brand will score high in your proposed target segment. Actually, your brand
does not need to score high in absolute terms. Rather, your brand just needs to index higher
in its target segment. For example, if your brand has 14 percent purchase interest across the
whole sample, you may be satisfied with a target segment that has 22 percent purchase
interest in your brand.

All the better if your main competitor indexes poorly in the target segment. Besides scoring
well on purchase interest, your brand should also score well on brand metrics such as
awareness, uniqueness, favorability, loyalty, etc. Brandability can also require strong
performance on brand imagery ratings such as quality, dependability, safety, efficacy,
luxury, friendship, fun, etc. This is a matter of brand positioning. Within your target
segment, your brand’s imagery ratings should lean toward your brand’s prior stated
positioning.

9. Profitability of segments

In the past, researchers seldom attempted to estimate the profit of segments. Nowadays, it
is becoming common to estimate segment profitability, even with survey data. Segment
profitability can be calculated many ways. Here is a simple method.

Eq. 1: Relative Profit Index = “Size” x “Income” x “Brand A Purchase Consideration”

where, for each segment, there is a measure of:

a) Size = Size (of the segment)
b) Income = Average income
c) Brand A Purchase Consideration = Definitely Will Buy Brand A
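Equation 1 can be computed directly from segment-level summaries, as in the Python sketch below. The segment names, sizes, incomes and consideration levels are hypothetical, and indexing each segment against the best-scoring one is our own convenience, not part of the equation.

# Topline relative profit index per segment (Eq. 1), using hypothetical segment summaries.
segments = {
    "Foodies": {"size": 0.15, "income": 85000, "consideration": 0.22},
    "Value Seekers": {"size": 0.30, "income": 55000, "consideration": 0.10},
}

def relative_profit_index(segment):
    return segment["size"] * segment["income"] * segment["consideration"]

scores = {name: relative_profit_index(seg) for name, seg in segments.items()}
best = max(scores.values())
for name, score in scores.items():
    print(name, round(100 * score / best))  # each segment indexed against the best segment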

Profitability indices differ on their degree of complexity and completeness. Equation 1 is
admittedly simplistic and limited. It should not be used to forecast sales volume. It can be
used, however, for a topline financial analysis of segments. Even better indices are available,
if one includes terms or adjustments for d) disposable income or purchasing power, e) brand
loyalty, f) willingness to switch out of a competitive brand. To develop a ROI analysis, it is
necessary to include terms for g) marginal costs of production and marketing and h)
marginal gains from incremental sales, that is, sales beyond one’s current portfolio of
products.

10. Communicability of segments

Even if all nine guidelines above are met, a good segmentation still needs one more asset. A
good segmentation must be easy to communicate. Even the best segmentation scheme can
fall flat, if it cannot be easily understood and communicated within a company. The name of
a segment (e.g., Foodies, Adventurers) is the main way that a segment gets communicated.
Obviously, care must be taken to select a name that is descriptive. Beyond the segment
name, the main way to increase communication is with eye-popping visuals. Here, it is
important to create and use multiple pictures or collages. No one visual should be allowed
to represent a whole segment, lest the future interpretation of that segment become too
pigeonholed by a single photo.

To optimize communication, researchers may have to sub-optimize other guidelines. For
example, to improve communication a researcher may select a five-cluster solution even
though a 12-cluster solution performs better on some of the guidelines mentioned above.

Meet all 10

I have shared 10 guidelines for a good segmentation. These guidelines are balanced
between methodological guidelines and applied marketing guidelines.

Few segmentation schemes can meet all 10 guidelines without encountering some degree
of trade-off. To improve the communication of the segmentation (Guideline 10), one may
have to select segments that are slightly harder to measure (Guideline 4). To improve the
brandability of segments (Guideline 8), you may have to adopt segments that are slightly
less differentiated across segments (Guideline 2). Such trade-offs are endemic to
segmentation research. Assuming that the segmentation scheme performs satisfactorily on
both guidelines, researchers will have to decide which guideline gets priority. Some
companies will seek maximally profitable segments (Guideline 9), sacrificing a bit of
interpretative clarity (Guideline 3). Other companies will need crystal-clear interpretations,
knowing that extra profits will never be tapped unless the segments sound compelling. Just
as long as performance on both guidelines stays above acceptable levels, you are free to
optimize the guideline that best meets your business needs.

Trade-offs among the guidelines will exist, regardless of whether researchers acknowledge
the trade-offs. A novice researcher can sometimes make good trade-offs just by intuition or
luck. Most often, however, the best trade-offs are made by experienced researchers who
are conscious of the 10 guidelines and their implicit trade-offs. A bad segmentation will
ignore and stumble through the 10 guidelines. A good segmentation will acknowledge and
balance all 10 guidelines. A great segmentation will foresee and optimize the 10 guidelines
for your exact marketing needs.

Q-Factors or K-Means? A market segmentation dilemma


Author - Stanely Cohen

Article Abstract

In a constantly changing marketplace, marketers must focus on subgroups, tailoring their
product and message to achieve their goals. Demographics often aren’t enough; consumer
attitudes and perceptions must be understood to gain a competitive edge.

In the recent history of marketing, competition in the marketplace has become more and
more intense as new products, product refinements and line extensions have proliferated.
Demographic profiles and levels have changed dramatically, with change as the expected
norm. Levels of income, education and age have grown geometrically. Geographic
distributions are in a state of flux. An accelerated growth of actual and perceived needs has
kept pace with the expansion of lifestyles. Consumption horizons have broadened
dramatically and the consumer is demanding an ever-widening range of wants.

These marketing trends are mutually reinforcing so that the expansion of one contributes to
the acceleration of the others. The intensity of competition must keep pace with the growth
of the demand. The marketers are constantly "on the spot" to gain insight into the
consumer process of change so that their products can maintain existing and gain new
positions.

As a result, marketers must look for the subgroups and submarkets which will add to a
sharper definition of the consumer needs and wants. They must find ways to reach and
appeal to these diverse groups. They must tailor their product and their message to achieve
the "sale."

It is not sufficient for marketers to simply look at the demographics of the consumer and to
get a picture, as accurate as necessary, of consumption patterns of the individual and family.
They must understand the reported needs, attitudes, perceptions, lifestyle and
psychological profiles of the consumer in order to gain the required response to marketing
efforts.

Market segments

Their first task, therefore, is to derive meaningful discriminating market groups and to
decipher the characteristics that make the discrimination actionable. These groupings have
been identified with the marketing labels "segments" and/or "clusters." They must isolate these
groups in which the consumer members are as homogenous as possible, while
simultaneously being different from other groups.

In order to achieve this goal, the marketer must rely on highly sophisticated statistical and
mathematical methods. In most cases, the marketer is relatively unsophisticated in the
mechanics of performing these operations. The marketer is usually only concerned with the
bottom line, the resulting groupings, while taking the technical validity for granted.
However, lack of sophistication notwithstanding, the marketer is responsible for the
conceptual integrity of the research.

The selection of the appropriate grouping technique requires the understanding of the basis
of the methodology. Too often, the selection of the technique is based upon exogenous
criteria (e.g., "Do we have the program?" "Will the data fit within the limits of the program?"
"Does anyone know how to use the program?" And, sadly enough, "I've always used this
technique and it will do the job!").

In selecting among the existing popular methodologies, there are two very different criteria,
each based upon the internal structure of the data. The diversity of these two measures
cannot be ignored when deciding upon the method to be used. This diversity is manifested
in the distinct labeling of the techniques. They are:

Segmentation. This is based on the relationship between subjects within a group definition.
This relationship seeks to maximize the "correlation" measure between group members.

The groups are appropriately called segments. The technique used to achieve this
segmentation is Q-Segmentation Analysis, which is a factor analytic-based technique.

Clustering. This is based on the proximity of group members to one another. The method
seeks to minimize the distance between group members while maximizing the distance
between the different groups.

These groups are called clusters. The technique used to achieve this clustering is the K-
Means Analysis, which is a form of variance reduction methodology.

The programs used to compute these two different analyses are insensitive to the
differences in the data. The selection of the method is the decision of the researcher. The
program will obediently perform its task on the data it is given. The results, however, could
be dramatically different.

Before we get into the discussion of the dilemma of making the selection, let us illustrate
the difference with a very oversimplified example.

Isolating subgroups

Suppose we have a problem in which only two items (variables) have been measured and
we are interested in isolating two subgroups. The scales will be 1-6 and the raw data is
distributed with one concentration around 1-3 for both variables and another concentration
around 3-6 for both variables (see illustration 1).

If we were to perform the two techniques on this set of data, we would get the results for
clustering as shown in illustration 2 and for segmentation as shown in illustration 3. Both
techniques would result in a unique assignment of each case to one group or the other.

What do these illustrations tell us? The clustering process tells us: Cluster 1 is where
variable 1 is low while variable 2 is also low; cluster 2 is where variable 1 is high while
variable 2 is also high. Both of these groups are dependent on the location of the variables
while they are independent of the relationship between the two variables.

The segmentation process tells us: Segment 1 is where variable 1 is consistently higher than
variable 2; segment 2 is where variable 1 is consistently lower than variable 2. Both of these
groups are dependent on the relationship between the two variables and independent of the
location of the variables.

The question is: Which is right? The answer is: That depends! Therein lies the dilemma.

First the reader is cautioned that this is a case in which the data has been constructed to
illustrate a point. The issue has been sharpened by the structure of the data. In the "real
life" situation, the data will not be as sharp and the issue will be more diffuse.

Again, for the purpose of illustrative understanding, let us flesh out these variables and give
them some substantive meaning.

Let us say that variable 1 is the scale of coffee consumption and variable 2 is the scale of
cigarette smoking. The clustering tells us that people who are low coffee consumers are also
low cigarette consumers. The two clusters will isolate markets in which one high/low level
accompanies the other. Are we selling to a market level? (Do you smoke and drink coffee
lightly/heavily?) That is the question!

High or low consumption

The segmentation tells us that people's cigarette consumption is either higher or lower than
their coffee consumption. Are we selling to a consumption tendency? (How much more/less
do you smoke/drink coffee?) That is the question.

Which one will we use? The one that answers the marketing question at hand. The main
point is that they are both technically accurate and substantively valid, depending on the
marketing question. Market groupings, be they segments or clusters, are never inherently
defined. They will be useful only as far as they will answer the marketing question. For the
most part, behavioral questions will be level-oriented while perceptual questions will be
correlation-oriented.
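To make the level-versus-relationship distinction concrete, here is a small Python sketch on invented 1-6 scale data; a simple within-respondent centering stands in for a full Q-type factor analysis, so treat it as an illustration of the logic rather than the actual procedure.

```python
# Toy contrast: level-based clustering (K-Means) vs. a relationship-based
# grouping in the spirit of Q-type segmentation. All data are invented.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Two variables on a 1-6 scale: coffee consumption and cigarette smoking.
low_group = rng.integers(1, 4, size=(50, 2))    # both variables low (1-3)
high_group = rng.integers(4, 7, size=(50, 2))   # both variables high (4-6)
X = np.vstack([low_group, high_group]).astype(float)

# Clustering: groups are driven by the LEVEL of the two variables.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print("K-Means cluster centers (levels):\n", kmeans.cluster_centers_)

# Q-type grouping: groups are driven by the RELATIONSHIP between the two
# variables within each respondent, ignoring overall level. Centering each
# respondent's profile and comparing the variables is a crude stand-in.
profile = X - X.mean(axis=1, keepdims=True)
segment = (profile[:, 0] > profile[:, 1]).astype(int)
print("Respondents whose coffee score runs above their cigarette score:",
      (segment == 1).sum())
```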

The next step is up to the researcher. The questions must be understood conceptually. The
research must be designed to collect the pertinent data. And finally, the appropriate
program to do the analysis must be selected.

The programs to do these analyses are available from a number of sources: Software firms,
consultants and universities. We will address ourselves to the PC market where much of the
emphasis is going today.

The K-Means Cluster methodology is available for the PC from several sources: SPSS/PC+
from SPSS Inc., Chicago; SYSTAT, from SYSTAT Inc., Evanston, Ill., and PC-MDS from Brigham
Young University, Provo, Utah. The Segmentation methodology is available from Pulse
Analytics, Inc., Ridgewood, N.J.

As a final footnote to the dilemma, we must acknowledge that exogenous conditions may
still play a part in the use of the techniques. One significant condition is that all the K-Means
programs mentioned here are handicapped by the limitation on the number of variables
and cases that they can handle.

My experience shows that the usual research project is ambitious beyond the limits of these
programs. Therefore, the researcher should try to use data reduction methods rather than
select an alternative method by default.

Multivariate Analyses:
Multivariate analysis - some vocabulary
Author - Gary M. Mullet

Article Abstract

People new to multivariate analysis can sometimes feel as though coworkers are speaking a
foreign language. Gary Mullet, of the Georgia Institute of Technology, explains some of the
requisite vocabulary for multivariate statistics and analysis.

If you've been in marketing research as a client or a vendor for any longer than five minutes,
you've undoubtedly heard (or thought that you did) something that sounded like, "After we
regressed the eigenvalues on the discriminated clusters from the principal components
maps, the factor loadings were clustered conjointly on the razzenfritzed centroidal variated
hyacinths."

Well, to the neophyte in multivariate statistics, the above might as well have been what was
actually said. It seems as if there are more buzzwords in statistics than in any other science,
and it also seems that some researchers try to use as many of them at once as possible.
Even when we're not really trying to impress someone, we're often forced to use several
confused and confusing terms, just because there are no convenient alternatives.

Anyway, below you'll find several multivariate techniques listed, and I hope, defined for the
user of marketing research (as opposed to the professional statistician). Within each broad
topic, I'll try to tell you what the technique will do for you and also define some of the tool-
specific words. Who knows, with a little practice you, too, may be able to say things like,
"We really didn't need to consider the razzenfritzed centroidal variated hyacinths in this
factor analysis." In each case, we're assuming that a sample of respondents have answered
several questions on your survey.

Regression analysis

Regression analysis seems to be the grandfather of all multivariate analytical techniques.


What it usually does is to find an equation which relates a variable of interest, such as
amount consumed in the past 30 days, purchase intent, number of items owned or any
other numeric variable, to one or more other demographic, psychographic or behavioral
variables. The variable of interest is called the dependent or criterion variable, the others
are the independent or predictor variables.

When the dependent variable is either purchase interest or overall opinion of a product,
some researchers say that they are building a "driver model." They're trying to find which
product attributes "drive" overall opinion of the test product, say.

The major thing to recognize in regression analysis is that the dependent variable is
supposed to be a quantity such as how much, how many, how often or how far. The
computer won't tell you if you've defined the variable of interest wrong, either, so it's up to
you or your colleagues. Most regression models will leave you with an equation that shows
only the predictor variables which are statistically significant. One misconception that many
people have is that the statistically significant variables are also those which are substantive
from a marketing perspective. They won't necessarily be. It's up to you to decide which are
which. A couple of buzzwords that come primarily from regression analysis are:

Multicollinearity. The degree to which your predictor variables are correlated or redundant.
In a nutshell, it's a measure of the extent to which two or more variables are telling you the
same thing.

R-squared. A measure of the proportion of variance in, say, amount consumed that is
accounted for by the variability in the other measures that are in your final equation. You
shouldn't ignore it, but it's probably overemphasized.

There are a variety of ways to get to the final equation for your data but the thing to
recognize for now is that if you want to build a relationship between a quantitative variable
and one or more other variables (either quantitative or qualitative), regression analysis will
probably get you started.
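As a hedged illustration of these ideas, the sketch below fits a small driver model on simulated data; the variable names (taste, price_value, convenience, amount_consumed) are hypothetical, and variance inflation factors are shown as one common way to check multicollinearity.

```python
# A minimal driver-model sketch with statsmodels on simulated survey data.
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
n = 300
df = pd.DataFrame({
    "taste": rng.normal(5, 1, n),          # hypothetical attribute ratings
    "price_value": rng.normal(5, 1, n),
    "convenience": rng.normal(5, 1, n),
})
# Simulated dependent variable: amount consumed in the past 30 days.
df["amount_consumed"] = (2 + 1.5 * df["taste"] + 0.5 * df["price_value"]
                         + rng.normal(0, 1, n))

X = sm.add_constant(df[["taste", "price_value", "convenience"]])
model = sm.OLS(df["amount_consumed"], X).fit()
print(model.summary())   # coefficients, p-values and R-squared

# Multicollinearity check: variance inflation factor for each predictor.
for i, name in enumerate(X.columns):
    print(name, round(variance_inflation_factor(X.values, i), 2))
```

Statistical significance in the output is still not the same thing as marketing substance; that judgment stays with the researcher.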

Discriminant analysis

Discriminant analysis is very similar to regression analysis, except that here the dependent
variable will be a category: Brand used most often, product usage (heavy, medium, light, not
aware). The output from a discriminant analysis will be one or more equations which can be
used to put people (usually) with a given profile into the appropriate slot or pigeonhole. As
with regression, the predictor variables can be a mixed bag of both qualitative and
quantitative.

Again, the computer packages around won't save you from yourself and tell you when you
should use regression analysis and when to use discriminant analysis, so you'll have to be on
your toes. Also you should be aware that the IRS is a big user of discriminant analysis. The
categories of interest to them are "Audit" and "Don't Bother." You can imagine what the
predictors are, especially if you're starting to fret over the new tax forms.

Marketing researchers frequently use discriminant analysis to profile users of various brands
within a given product category. It's also used to determine what, if any, differences there
are between, say, "Trier-acceptors," "Trier-rejectors" and "Non-triers." In the past it was
heavily used in credit scoring. It probably still is. As with regression, you need to be
concerned with statistical vs. substantive significance, multicollinearity and R-squared (or
its equivalent). Used correctly, it's a powerful tool since so much marketing research data is
categorical.
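A minimal sketch of this kind of profiling, using scikit-learn's linear discriminant analysis on simulated respondents (the group labels and predictor names are invented):

```python
# Profiling usage groups with linear discriminant analysis; data simulated.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(2)
# Predictors: e.g., age, income (in $000s), category-usage rating.
X = np.vstack([
    rng.normal([35, 50, 3.0], [8, 15, 1], size=(100, 3)),  # trier-acceptors
    rng.normal([45, 60, 2.0], [8, 15, 1], size=(100, 3)),  # trier-rejectors
    rng.normal([30, 40, 1.0], [8, 15, 1], size=(100, 3)),  # non-triers
])
y = np.repeat(["acceptor", "rejector", "non-trier"], 100)

lda = LinearDiscriminantAnalysis().fit(X, y)
print("In-sample classification accuracy:", round(lda.score(X, y), 3))
print("Discriminant coefficients (group profiles):\n", lda.coef_.round(2))

# Put a new respondent with a given profile into the appropriate pigeonhole.
print("Predicted group:", lda.predict([[40, 55, 2.5]]))
```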

Logistic regression

Logistic regression does the same things as regression analysis as far as sorting out the
significant predictor variables from the chaff, but the dependent variable is usually a 0-1
type, similar to discriminant analysis. However, rather than the usual regression type
equation as output, a logistic regression gives the user an equation with all of the predicted
values constrained to be between 0 and 1.

Why bother? Most users of logistic regression use it to develop such things as probability of
purchase from concept tests. If a given respondent gives positive purchase intent, they're
coded as "1" in the input data set; a negative intent yields a "0" for the input. Now looking
at both the demographics of the respondents and their product evaluation data, a model is
built that allows the researcher to say things like, "Males aged 35-49 have a .87 probability
to buy this product, females who are between 18-35 have a .43 chance ..." It can also be
used instead of discriminant analysis when there are only two categories of interest.
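A minimal purchase-probability sketch with logistic regression; the respondent data and the two profiles scored at the end are simulated, and the point is simply that the predicted values are constrained to lie between 0 and 1.

```python
# Probability-of-purchase model from a simulated concept test.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 400
age = rng.integers(18, 65, n)
concept_rating = rng.normal(6, 2, n)             # 0-10 concept evaluation
# Simulated 1/0 purchase intent, more likely with higher ratings.
intent = (rng.random(n) < 1 / (1 + np.exp(-(concept_rating - 6)))).astype(int)

X = np.column_stack([age, concept_rating])
model = LogisticRegression().fit(X, intent)

# Score hypothetical respondent profiles: [age, concept rating].
profiles = [[40, 8.0], [25, 4.0]]
print("P(purchase):", model.predict_proba(profiles)[:, 1].round(2))
```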

Factor analysis

There are several different methodologies which wear the guise of factor analysis.
Generally, they're all attempting to do the same thing. Find groups, chunks, clumps or
segments of variables which are correlated within the chunk and uncorrelated with those in
the other chunks. The chunks are called factors.

Most factor analyses depend on the correlation matrix of all pairs of variables across all of
the respondents in the sample. Also, as it is commonly used, factor analysis refers to
grouping the variables or items in your questionnaire together. However, Q-factor refers to
putting the respondents together, again by similarity of their answers to a given set of
questions. Two of the troublesome terms from factor analysis are:

Eigenvalue. Although mathematicians would blanch, all you really need to know about
eigenvalues in a factor analysis is that they add up to the number of variables that you
started with and each one is proportional to the amount of variance explained by a given
factor. Analysts use eigenvalues to help decide when a factor analysis is a good one and also
how many factors they'll use in a given analysis.

Rotation. In addition to doing it to your tires, doing it to an initial set of factors will give a
result that will be much easier to interpret. It's a result of rotation that labels such as "price
sensitive," "convenience" and so on are applied to the factor.

Although the literature says that factor analysis should only be done on quantitative
variables, we've seen some that are very understandable when conducted on yes-no type
variables as well. As with most multivariate procedures, that seems to be the bottom line
for factor analysis: Does it make sense? If yes, it's a good one; otherwise it's probably not,
irrespective of what the eigenvalues say.
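The sketch below, on simulated rating items, shows the two terms in action: the eigenvalues of the correlation matrix (which add up to the number of items) and a varimax rotation of the loadings. It assumes a recent scikit-learn release, where FactorAnalysis accepts a rotation argument.

```python
# Eigenvalues and rotated loadings on six simulated rating items driven by
# two latent attitudes.
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
n = 500
price_sensitivity = rng.normal(size=n)
convenience = rng.normal(size=n)
items = np.column_stack(
    [price_sensitivity + rng.normal(0, 0.5, n) for _ in range(3)] +
    [convenience + rng.normal(0, 0.5, n) for _ in range(3)]
)

Z = StandardScaler().fit_transform(items)
eigenvalues = np.linalg.eigvalsh(np.corrcoef(Z, rowvar=False))[::-1]
print("Eigenvalues (they sum to the number of items):", eigenvalues.round(2))

fa = FactorAnalysis(n_components=2, rotation="varimax").fit(Z)
print("Rotated loadings (items x factors):\n", fa.components_.T.round(2))
```

In a real study, the rotated loadings are what get labeled "price sensitive," "convenience" and so on; the final check is still whether the grouping makes sense.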

Cluster analysis

Now the clumps of interest are respondents, instead of variables. As with factor analysis,
there are a number of algorithms around to do cluster analysis. Also, clusters are usually not
formed on the basis of correlation coefficients. They usually look at squared differences
between respondents on the actual variables you're using to cluster. If two respondents
have a large squared difference (relative to other pairs of respondents) they end up in
different clusters. If the squared differences are small, they go into the same cluster.

One word of caution. Not all cluster software can easily handle categorical variables. For
instance, if you're trying to cluster using brand used most often, which has four categories,
you need to be sure to use a program which will cluster such nominal scale responses.
Otherwise, you'll get a cluster mean on brand of 2.34 or some such, which is tough to
interpret, at best. Most cluster programs do OK on either quantitative data or yes-no type
data. A couple do handle multiple categories as well.
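A minimal respondent-clustering sketch on simulated data: the quantitative variables are standardized and a nominal "brand used most often" variable is one-hot encoded so that K-Means' squared distances stay interpretable. (Variable names and data are invented; for heavily categorical data, a program designed for nominal responses is still the better choice.)

```python
# K-Means respondent clustering with mixed data types (simulated).
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(5)
n = 300
df = pd.DataFrame({
    "spend": rng.gamma(2, 20, n),
    "visits_per_month": rng.poisson(4, n),
    "brand_most_often": rng.choice(["A", "B", "C", "D"], n),
})

numeric = StandardScaler().fit_transform(df[["spend", "visits_per_month"]])
brand_dummies = pd.get_dummies(df["brand_most_often"]).to_numpy(dtype=float)
X = np.hstack([numeric, brand_dummies])

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
df["cluster"] = km.labels_

# Profile the clusters on the original variables, including brand shares,
# instead of reading uninterpretable means like "brand = 2.34".
print(df.groupby("cluster")[["spend", "visits_per_month"]].mean().round(1))
print(pd.crosstab(df["cluster"], df["brand_most_often"], normalize="index").round(2))
```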

Perceptual mapping

A perceptual map can be used to show relative similarities between such things as:

Brands

Product attributes

Both

Cluster groups

Factor scores

Most anything else of interest in marketing research.

An appropriate map can serve as an excellent data summary and presentation device.
Several of the mapping programs do much the same as factor analysis. Some use regression.
You can also map the results of a discriminant analysis.

One major thing to remember when you're faced with a perceptual map: What you see is
only a two- dimensional picture of the interrelationships in your data set. It may take three
or more dimensions to adequately represent your data; hence, your two-dimensional view
might be leading you astray.

Most mapping procedures provide a measure or two of how well the two-dimensional map
captures the data relationships. Be sure that you are given these measures.

Another thing to keep in mind is that many maps are going to show you relative positioning
or differences and not absolutes. Factor analysis, being based on correlations, does this too.
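One simple way to build such a map is to run a principal components analysis on brand-by-attribute mean ratings and position the brands on the first two components; the explained-variance ratio then serves as the "how well does two dimensions capture the data" measure mentioned above. The brands, attributes and ratings below are invented.

```python
# A PCA-based perceptual map of brands rated on four attributes (invented).
import pandas as pd
from sklearn.decomposition import PCA

ratings = pd.DataFrame(
    [[7.2, 3.1, 5.0, 6.5],
     [6.8, 3.4, 4.7, 6.9],
     [3.0, 7.8, 6.2, 2.5],
     [2.7, 8.1, 5.9, 2.2],
     [5.0, 5.1, 8.0, 4.8]],
    index=["Brand A", "Brand B", "Brand C", "Brand D", "Brand E"],
    columns=["quality", "value", "convenience", "prestige"],
)

pca = PCA(n_components=2)
coords = pca.fit_transform(ratings)
print(pd.DataFrame(coords, index=ratings.index,
                   columns=["Dim 1", "Dim 2"]).round(2))
# How much of the attribute variation the two-dimensional map captures:
print("Explained variance ratio:", pca.explained_variance_ratio_.round(2))
```

As the text warns, the map shows relative positions only; if the explained-variance figure is low, two dimensions are not enough.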

Combinations

At the risk of going overboard, again, on jargon, some studies use combinations of
techniques. For instance, each brand might be scored on the factor results. Then, brands are
used as criterion variables with factor scores to discriminate between them. A driver model
might be evaluated for each brand using the raw data (not factored) and respondents could
be clustered on their perceptions of a single brand plus their demographics. A perceptual
map is constructed showing the cluster groups and brand ratings, another from the
discriminant analysis. This is not an atypical scenario.

Ask, then invest

It's easy to overwhelm and be overwhelmed by the vocabulary alone of multivariate data
analysis, let alone the interpretation of the same. Adding to the problem is computer
literacy without attendant statistical literacy. Most programs/packages will do whatever
analyses you request on whichever data you feed them. With the above information, I hope
that as a minimum, you'll be able to ask the right questions before investing in an
unwarranted multivariate procedure.

A marketing researcher's guide to multivariate analysis


Author - Charles J. Schwartz

Article Abstract

Marketing researchers are regularly faced with a variety of challenges, including tight
deadlines, demands for results that are concise and easy to understand, and an abundance
of data. This article discusses multivariate analysis as a body of statistical techniques that
helps researchers isolate actionable findings quickly and reduce them to a couple of charts
and graphs.

As marketing researchers, we have all been faced with tight deadlines, demands for concise,
easy to understand results, and survey tabs as thick as the Manhattan phone book.
Burdened with an 80- or 90-question survey and breakdowns by every conceivable
demographic, who hasn't found it difficult to isolate actionable findings quickly and boil
them down into a couple of charts and graphs?

Multivariate analysis is a body of statistical techniques that do precisely this job. They were
specifically developed to isolate the important relationships between variables and highlight
the structure behind what might seem to be a chaotic mass of data. In the hands of a
competent analyst, they can simplify interpretation, provide innovative graphic
presentations and give insights that would be impossible to obtain by simple one- and two-
way tabulations. In any large or complex study, these are not esoteric frills, but essential
tools to speed up and enhance analysis.

These techniques are applicable not only to surveys but to a broad range of data such as
demographics, sales and CIF information. A multivariate analysis might show that a set of
detailed demographics reflects only one or two significant aspects of a population. Another
analysis might derive simple customer segments from complex cross-sell data. These are
just two examples of the potential of multivariate analysis to increase the value of both
internal business information and publicly available data for marketing research purposes.

While they may be essential tools, multivariate techniques demand a fairly sophisticated
statistical background to apply correctly. Still, their results can be used by researchers at
almost any level of technical sophistication.

Data reduction, scaling and perceptual mapping

Rather than summarize the statistical techniques themselves, it is probably better to look at
some of the ways they might be used.

One of the most common situations a researcher faces is scaling. Respondents may be asked
questions about multiple product attributes or may rate the importance of several product
or service characteristics. Often these questions come in groups of 20 or 30 and sometimes
(in the case of one client) up to several hundred. While a manufacturer may have strong
opinions about 200 of his or her product's attributes, it is almost certain that the customers
look at the product on only a few dimensions. In these situations, tabs often show little
difference from question to question. Even when differences are significant, it may be
difficult to summarize them by customer characteristics like demographics as these
differences would involve multiple questions and appear over many pages of tabulations.
Whole families of multivariate techniques have sprung up to deal with just this kind of
problem. Used in marketing research under the rubric of perceptual analysis, techniques like
factor analysis and discriminant analysis can boil dozens of attributes down to two or three
significant, easily interpreted attitudes. Respondents can be scored on these attitudes,
differences between respondent groups can be identified and the differences can be easily
graphed.

The example of the client with 200 attribute questions is a good case study of this use of the
techniques. Here it turned out that the attributes represented only three customer attitudes
- suitability to the task, workmanship and prestige. The client received a set of three
dimensional charts that graphically differentiated the market niches of several brands based
on the three attitude dimensions. Faced with 200 independent attributes, it is questionable
whether these differences could have been identified, let alone displayed concisely on a few
graphs. If this client was planning to do further research, she could have benefited in
another way. The analysis showed that approximately 25 of the 200 questions served to
identify the three attitudes. A future survey could have dropped 175 questions, saving a
significant chunk of the research costs.

Market segmentation

Market segmentation is the area that most clearly shows the accessibility of multivariate
analysis. Almost any clustering scheme is the result of the application of one and often
several multivariate techniques. Claritas' PRIZM and Donnelley's Cluster Plus, for example,
are the products of this kind of analysis. The huge success of these products stands as
testimonial to the usefulness and clarity of multivariate results.

Any well-designed survey can be subjected to a variety of multivariate clustering techniques
to develop custom segmentation schemes based on the questions included in the survey.
The geographic identifiers of survey respondents or a customer information file allow
records to be linked to census demographics and those demographics can be used to
develop a customized product or service specific clustering scheme similar to the generic
model set by Cluster Plus or PRIZM. These customized schemes will provide insights into
specific markets that the more general clustering systems cannot. In some cases, like
business-to-business marketing or marketing to niches such as clinic groups or seniors,
these techniques are almost the only way to obtain statistically sound segmentation
information.

Prediction and forecasting

Prediction and forecasting are inherently multivariate. Future sales, for example, are
dependent on a host of factors such as the economy, demographic changes or changing
tastes. Even in a trend analysis, future activity is generally not a simple function of a straight
line projection or moving average. It can be cyclical, have seasonal components or have
complicated lag times, all of which can and must be modeled through multivariate
techniques.

Multivariate econometric techniques have been developed to deal specifically with the
problems of forecast and projection. These techniques have been highly optimized to obtain
mathematically based forecasts with minimum error given the input data. There are widely
accepted techniques that deal with interdependencies between predictor variables and
between those variables and the passage of time that may not even be apparent in the most
detailed tabulations. If not controlled, these interdependencies can lead to very misleading
results. While Chase and others use these techniques in very complex models to predict the
economy, in most business situations a simple, understandable model is enough to produce
clear improvements in predictability over more basic trending or moving average
projections.
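As a hedged illustration, the sketch below compares a naive moving-average projection with a regression that adds a lag term and monthly seasonal dummies, using a simulated sales series; it is a toy stand-in for the econometric techniques described, not a full forecasting model.

```python
# Trend + seasonality + lag vs. a simple moving average (simulated sales).
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(6)
periods = pd.period_range("2018-01", periods=60, freq="M")
season = 10 * np.sin(2 * np.pi * (np.arange(60) % 12) / 12)
sales = 100 + 0.8 * np.arange(60) + season + rng.normal(0, 5, 60)
df = pd.DataFrame({"sales": sales, "month": periods.month}, index=periods)

df["lag1"] = df["sales"].shift(1)
df = df.dropna()
X = pd.get_dummies(df["month"], prefix="m", drop_first=True).astype(float)
X["lag1"] = df["lag1"]
X = sm.add_constant(X)
model = sm.OLS(df["sales"], X).fit()

# Naive benchmark: last three months' average as the "forecast".
ma_fit = df["sales"].rolling(3).mean().shift(1)
print("MAE, moving average:", round((df["sales"] - ma_fit).abs().mean(), 2))
print("MAE, lag + seasonal regression:",
      round((df["sales"] - model.fittedvalues).abs().mean(), 2))
```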

Causal analysis

One of the most highly developed areas of multivariate analysis is causal analysis. There is a
battery of powerful techniques designed specifically to model and test theories about
causation. These techniques can prove their value even when there are as few as three
interrelated causes and certainly when causation is two-way or multifaceted. In these
situations, even the largest sample may be too small to isolate important causal factors
through a tabular analysis. By applying well developed statistical theories, multivariate
techniques can leverage the data from even a relatively small sample to provide a way to
test detailed hypotheses about the marketplace. If a survey is done to determine the cause
of a drop in sales, for example, multivariate techniques provide an objective way to model
what those causes might be and determine which among them is most important. If
management has a theory concerning the drop in sales, multivariate techniques provide an
objective means to evaluate the theory and to elaborate on it.

Multivariate analysis includes a wide range of techniques that can be used in almost any
research situation. As such, no simple article can cover all their uses. The purpose here has
been more limited. First, it has been to give the reader a taste of the kind of practical
questions that multivariate techniques can answer in the marketing research situation.
Second, it has been to stress the cost effectiveness of incorporating multivariate analysis in
the research effort from the ground up. By planning for this kind of analysis from the design
phase on, results will be enhanced and ultimately the entire cost of the research effort
could be reduced. Finally, and probably more importantly, it has been to impress upon the
reader that although they are powerful statistical techniques, multivariate analyses provide
results that are accessible to researchers and management alike. Rather than adding
complexity, multivariate techniques clarify, simplify, and increase the actionability of any
results a researcher can provide to his or her clients.

(Sub-) optimal test designs for multivariable marketing testing


Author - Gordon H. Bell and Roger Longbotham

Article Abstract

Multivariable tests are valuable when used to their fullest advantage. Guidelines for getting
the most from these tests are offered.

Multivariable testing in marketing is like the gold rush of the 1800s. New “discoveries” hit
the press and we rush off to mine the next breakthrough technique. But the reality is not
quite so glamorous - or chaotic. This “new” field of multivariable testing is actually the result
of decades of academic research and statistical practice, with impressive depth beyond the
basic terms and concepts that reach the marketing press.

In testing - as in marketing - clarity and efficiency should take precedence over technical
showmanship. Statistical complexity on its own has little inherent value unless it achieves an
obvious increase in ROI. The key is to find the right balance between powerful statistics, a
user-friendly approach and clear, actionable results.

Efficient and flexible

Some multivariable test designs are both powerful and easy to understand. Full-factorial,
fractional-factorial and Plackett-Burman designs provide a solid foundation for efficient and
flexible multivariable testing in marketing. You can use versions of these to test two or two
dozen variables, analyze main effects alone or in combination with interactions and adjust
the size and layout of the test design to meet your marketing objectives and constraints.
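For concreteness, here is a small sketch of a 2^3 full-factorial layout for three two-level test elements, with main effects computed from simulated response rates; the element names and numbers are invented. A fractional-factorial or Plackett-Burman design would use a carefully chosen subset of these rows to test more variables in fewer cells.

```python
# A 2^3 full-factorial test layout and main-effect estimates (simulated).
import itertools
import numpy as np
import pandas as pd

elements = {"headline": ["creative", "offer"],
            "envelope": ["white", "blue"],
            "price": ["current", "+20%"]}

design = pd.DataFrame(list(itertools.product(*elements.values())),
                      columns=list(elements.keys()))
print(design)   # 8 test cells, one per combination of levels

# Simulated response rate per cell (in practice, measured from the test).
rng = np.random.default_rng(7)
design["response"] = (0.02 + 0.01 * (design["headline"] == "offer")
                      + rng.normal(0, 0.002, len(design)))

# Main effect of each element: mean response at one level minus the other.
for name, levels in elements.items():
    effect = (design.loc[design[name] == levels[1], "response"].mean()
              - design.loc[design[name] == levels[0], "response"].mean())
    print(f"Main effect of {name}: {effect:+.4f}")
```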

Other statistical designs sacrifice ease of use in order to achieve a specialized objective.
Computer-generated “optimal” test designs are one example. First developed in the late
1950s for manufacturing experiments, optimal designs offer a way to run experiments
under non-standard conditions. For example, in a manufacturing test of machine speed and
flow rate, the combination of high speed and low flow may burn out the machine, so this
combination must be avoided. Optimal designs allow you to test under sub-optimal
conditions where certain combinations are constrained, the cost of testing is immense or
the “response surface” has abnormal characteristics. Fortunately, these constraints are
seldom necessary in marketing tests.

The D-optimality criterion is one method for defining optimal designs. For this approach, a
design is set up to minimize the volume of the confidence region of the effect estimates
(considering the variances and covariances of these estimates). Other optimality criteria will
result in different test designs. In addition to D-optimality, statisticians have defined A-, C-,
E-, G-, I- and S- (and other) optimality criteria. Even the same optimality criterion may result
in different test designs depending on the optimization software. Simply put, if none of this
paragraph makes sense to you, then these optimal test designs become a “black box.” If you
cannot use your marketing experience to interpret the results, then the statistical output
should be implemented with great care.

In addition, optimal designs create a number of challenges:

These designs are applicable when your test variables are continuous (like temperature and
pressure). However, when you have discrete variables, as we normally do in marketing tests,
they either don’t work or provide little or no benefit.

The computer creates a design based solely on the input criteria and the underlying
assumptions implicit in the approach. Optimal designs generally assume some form for the
model, relationship or range of influence for the variables being studied. If these
assumptions are not met, the design is no longer optimal. Rarely do we know this much
about the relationships prior to conducting the test - that’s why we are doing it!

The complex analyses require advanced statistical skills. Unless (or even if) you have a Ph.D.
statistician on staff, the analysis can be challenging and the results can be very confusing.

Results are not only difficult to interpret but may change based on the selected criteria and
assumed effects. Forcing constraints is like removing boards from the framing of a house - a
few changes may be OK, but you never quite know when you have weakened the structure
too much.

The small increase in statistical power comes with a large increase in complexity. These
designs make great journal articles but are not very practical for most real-world
applications.

In multivariable marketing testing, the most “optimal” test design is usually one with a
straightforward execution, clear analysis and easily understood results. From both a
marketing and statistical perspective, esoteric designs like D-optimal designs are often a
sub-optimal choice.

The right techniques at the right time

Multivariable testing is most effective as a strategic marketing tool. Test designs offer an
efficient framework for testing your new ideas. But just as the framing of a house is only the
first step towards a beautiful home, what you place upon the statistical framework is what
ultimately determines the attractiveness of your test results.

Strategic testing means using the right techniques at the right time. What is your biggest
opportunity to increase marketing ROI? What are the primary questions you want to answer
in each test? Once you answer those questions, then you can follow a logical, structured
approach:

1. Plan a series of tests. One test cannot answer every question. Consider a cycle of testing,
where you build upon results from each test and refocus your marketing programs as you
gain new insights. You can test many creative elements to determine which are important,
or refine price and offer variables to quantify key interactions, or test your contact strategy
to pinpoint profitable touchpoints, but testing all of these together quickly becomes
unmanageable.

2. Answer the big questions before fine-tuning your programs. Find out which marketing-
mix elements are important before testing the details. For example, you can find out if
envelope color makes a difference (perhaps testing a white versus blue envelope) before
testing five different shades of color. Or you can quantify the impact of a 20 percent price
increase, before testing 5 percent, 8 percent, 10 percent, 15 percent and 20 percent
changes all at once.

This also means that two-level test elements are frequently more efficient than multilevel
designs. Especially with creative elements, two levels can provide more useful and
actionable information. For example, a test of two headlines, creative and offer-focused,
can show a) if different headlines have a different impact and b) what type of headline is
most effective. If the offer headline is more effective, then the next test can focus on
different wording. In addition, multiple levels usually require larger sample size, more test
cells and more complex analyses and create real difficulty in analyzing interactions.

3. Find the most powerful and efficient method for testing your ideas. The simplest solution
is often the best. Real-world tests in dynamic markets with limited resources are much
different than theoretical experiments in a controlled laboratory environment managed by
Ph.D. statisticians. Putting powerful tools into the hands of marketers is more important
than using the most theoretically advanced statistics. Testing bold new ideas, executing
clean and fast marketing tests and rapidly improving performance is where the real power
of multivariable techniques rests.

4. Understand the statistical rules you need to follow. The statistics encourage some self-
restraint. Every test requires a balance between creative freedom and statistical structure.
Part of the art of testing is finding a way to leverage your team’s brainpower within the
statistical constraints required to achieve reliable test results.

Gain useful insights

When the test elements, execution and results are clear and understandable, the marketing
team is much more likely to gain useful insights and implement the results. If the test is a
black box with confusing data, then the results may never be understood or implemented.

Although the underlying statistical theory is daunting, every marketer can understand the
basic pros and cons of each scientific test design they execute. Your guide should be able to
explain the alternatives and why the selected test design is the optimal choice for your
unique marketing program and objectives.

Multivariable testing is a powerful tool to help you learn more, faster, with fewer resources.
Yet like every tool, how you use it is the key to success. With a full toolbox of test designs
you can find the most efficient technique for each situation. When the statistics become
transparent to your marketing team, you free their creative energy for explosive growth.

A survey of multivariate methods useful for market research


Author - Susie Sangren

Article Abstract

Most researchers are already familiar with univariate statistical methods. This article
discusses multivariate statistical methods, including key characteristics of multivariate
procedures and examples.

Most researchers are already familiar with univariate statistical methods. Multivariate
statistics are developed for analysis of data which has more complex, multi-dimensional,
dependence/interdependence structures. For example:

Data is divided into a dependent group, as well as an independent group of variables.
Researchers are interested in finding the causal relationship between the independent and
dependent groups of variables. They would choose such multivariate methods as:
multivariate multiple regression, discriminant analysis, conjoint analysis, crosstab ANOVA
analysis, categorical analysis, and logistic regression.

Data may be viewed as one big group of variables serving the same purpose. Researchers
are interested in the interdependence structure of the data. Their focus is to restate the
original variables in an alternative way to better interpret the meaning, or to group
observations into similar patterns. They would choose such multivariate methods as:
principal component analysis, factor analysis, canonical correlation analysis, cluster analysis,
and multidimensional scaling.

Data may not be normally distributed. Researchers are not concerned about making
broader inferences about the population, but about the analysis of the specific data at hand.
They would need multivariate methods that are tolerant of non-standard, non-metric data
which is less likely to be normally distributed.

Multivariate methods are derived from univariate principles, but are more empirical
because they work backward from data to conceptualization. For many marketing
applications, multivariate methods can outperform univariate methods.

Marketing problems are inherently multi-dimensional, and solutions often inexact. For
example, customer types are classified along a range of customer characteristics; stores and
brands are perceived and evaluated with respect to many different attributes;
creditworthiness of a credit card applicant is judged on a variety of financial information.
Multivariate methods are versatile tools allowing researchers to explore for fresh
knowledge in huge consumer databases. They are used for market segmentation studies,
customer choice and product preference studies, market share forecasts, and new product
testing. Results are used in making decisions about: strategy for target-marketing
campaigns, new product or service design, and existing product refinement.

Multivariate methods are popular among marketing professionals also because of their
tolerance of less-than-perfect data. The data may violate too many univariate assumptions;
they may be survey data with too much variable information and not enough observations
(e.g., researchers ask too many redundant questions with too few respondents); or they
may have problems resulting from poor sample or questionnaire designs.

Key characteristics of multivariate procedures

The research objective should determine the selection of a multivariate method. In this
article, "observations" refers to entities such as people, subjects; "variables" (sometimes
called "dimensions") is the characteristics of these entities measured in quantitative or
qualitative terms. Data consists of both observations and variables.

1. Principal component analysis


Principal component analysis restates the information in one set of variables in an alternate
set of variables on the same observations. Principal components are linear combinations of
the original variables such that they are mutually independent. This method applies
orthogonal rotation of data along the axes that represent the original set of variables.
Orthogonality ensures the independence of all components, while preserving 100 percent of
the variance (synonymous with information) in the original variables. There can be as many
principal components as there are original variables.

1.1 Applications

Variable reduction. Because principal components are extracted in decreasing order of
variance importance, much of the information in the original set of variables can be
summarized in just the first few components. Therefore you can often drop the last few
components without losing much.

Principal component scores can be used as independent predictors in regressions, thereby
avoiding collinearity problems. Because principal components are orthogonal to
(independent of) each other, they are an excellent choice for regressions where the original
independent variables may be highly correlated.

Outlier detection. Outliers are observations with different behavior from the rest of the
observations. They may be caused by measurement errors, but they can exert undue
influence on the regression. It is easy to find outliers in a one- or two-dimensional space
defined by one or two variables. With higher dimensions defined by more variables, it
becomes difficult to find their joint outliers. Principal components analysis can help locate
outliers in a higher dimensional space.

1.2 Example
Do a regression analysis predicting the number of baseball wins from the following baseball
statistics (many of them redundant), from the 1990 professional baseball season:

Dependent variable - number of wins.

Independent variables - batting average; number of runs; number of doubles; number of
home runs; number of walks; number of strikeouts; number of stolen bases; earned run
average; number of complete games; number of shutouts; number of saves; number of hits
allowed; number of walks allowed; number of strikeouts; league.
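A minimal principal-component-regression sketch in this spirit appears below; the team statistics are simulated stand-ins rather than the actual 1990 season data.

```python
# Principal component regression on simulated, deliberately redundant stats.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(8)
n_teams = 26
offense = rng.normal(size=(n_teams, 1))    # latent offensive strength
defense = rng.normal(size=(n_teams, 1))    # latent pitching/defensive strength
stats = np.hstack([offense + rng.normal(0, 0.3, (n_teams, 4)),
                   defense + rng.normal(0, 0.3, (n_teams, 4))])
wins = 81 + 10 * offense[:, 0] - 8 * defense[:, 0] + rng.normal(0, 3, n_teams)

Z = StandardScaler().fit_transform(stats)
pca = PCA()
scores = pca.fit_transform(Z)
print("Variance explained per component:", pca.explained_variance_ratio_.round(2))

# Keep only the first few mutually independent components as predictors.
k = 2
reg = LinearRegression().fit(scores[:, :k], wins)
print(f"R-squared using {k} principal components:",
      round(reg.score(scores[:, :k], wins), 3))
```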

1.3 Limitations
It is impossible to interpret the principal component scores computed for the observations.
They are merely mathematical quantities for variable transformation. In regression, use of
principal components in place of original variables is solely for prediction purposes. The
resulting R2 and coefficients may be significant, but the principal component scores do not
have clear meanings themselves. If you want to give meaningful interpretation to the
components, you are better off doing a factor analysis instead.

2. Clustering of variables or observations


Clustering is a collection of ad hoc techniques for grouping entities (either observations or
variables) according to a distance measure specified by the researcher. The distance
measure is a pairwise proximity between observations based on all available variables. If
this distance measures similarity, such as the squared correlation, then it is the "similarity
proximity." If this distance measures dissimilarity, such as the Euclidean distance, then it is
the "dissimilarity proximity."

Once the choice of distance measure is made, a clustering algorithm groups members in all
possible ways, each round calculating the values for an objective function (e.g., sum-of-
squared-error, SSE, between clusters) using the predetermined distance measures. Finally it
settles on a cluster configuration that optimizes this objective function (e.g., giving the
highest SSE between clusters for separating them).

2.1 Variable clustering method


Variable clustering, like principal components analysis, is a technique for investigating the
correlation among variables. The goal is to reduce a large number of variables to a handful
of meaningful, interpretable, non-overlapping ones for further analysis. Clustering uses an
oblique rotation of the variables along the principal-component axes to assign each of the
variables individually to one of the rotated component axis-clusters with which it has the
highest squared multiple correlation. Oblique rotation, contrary to orthogonal rotation,
permits variables to be somewhat correlated.

2.2 Clustering applications

Variable reduction. Variable clustering is often used for variable reduction purposes. After
dividing the variables into clusters, you can calculate a cluster score for each observation for
each of the clusters. In a regression where you have an inordinate number of potential
independent variables, you can do a cluster analysis first to reduce the number of
independent variables. Once the variable clusters are found, you can regress your
dependent variable on the clusters instead of the original variables. Better yet, you can even
pick the variables that best represent the clusters, and use the reduced set of variables for
your regression.

Unlike principal components scores where each score is a linear combination of all variables
(and there are many scores), cluster scores are simpler to interpret. A variable either
belongs or doesn’t belong in a cluster, making the interpretation a lot cleaner.

Grouping entities. Observations or variables are grouped based on their overall similarity.
Although clustering makes no attempt to look inside the cluster members, the resulting
clusters need names or labels for identification. It doesn’t have to be precise: You inspect
the within-cluster values of the variables, and compare them with the between-cluster
values of the same variables to differentiate the cluster characters. Since the cluster labels
should reflect the larger differences in those variable values, you may even discover
interesting patterns in the groupings.
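Common Python libraries have no direct equivalent of this oblique variable-clustering procedure, so the sketch below uses a rough stand-in: hierarchical clustering of variables on a one-minus-squared-correlation distance, followed by a simple cluster score per respondent. The survey items are simulated.

```python
# Variable clustering via correlation-based hierarchical clustering (rough
# stand-in for an oblique variable-clustering procedure; data simulated).
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

rng = np.random.default_rng(9)
n = 400
service = rng.normal(size=n)
price = rng.normal(size=n)
items = np.column_stack(
    [service + rng.normal(0, 0.4, n) for _ in range(3)] +
    [price + rng.normal(0, 0.4, n) for _ in range(3)]
)

corr = np.corrcoef(items, rowvar=False)
dist = 1 - corr ** 2                    # highly correlated items -> close
np.fill_diagonal(dist, 0.0)
Z = linkage(squareform(dist, checks=False), method="average")
labels = fcluster(Z, t=2, criterion="maxclust")
print("Variable cluster assignments:", labels)   # e.g., [1 1 1 2 2 2]

# One simple cluster score per respondent: the mean of that cluster's items.
cluster_1_score = items[:, labels == 1].mean(axis=1)
```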

2.3 Example of clustering observations


The Air Force trains recruits for many jobs. It is expensive to design and administer a training
program for every job. Therefore, data (variables) is collected on each of the jobs
(observations), and then the jobs are clustered. Clustering enables the Air Force to design
training programs for the entire clusters of jobs, not just for specific jobs.

3. Factor analysis
Factor analysis is useful for exploring and understanding the internal structure of a set of
variables. It describes the linkages among a set of "observable" variables in terms of
"unobservable" or "underlying" constructs called factors.

In principal component analysis, you construct new variables from all the observed
variables; whereas in factor analysis, you reconstruct the observed variables from two new
types of underlying factors: the estimated common factors (to all observations) and the
unique factors (to individual observations). The importance is in the study of the factor
loadings - the coefficients for estimating the observable variables from the underlying
factors.

The general form for factor analysis looks like a linear ANOVA model:

Yij = µi + Eij,

µi = Σk ßk Xik

where Yij is the observed value of subject i on variable j,

µi is the subject i mean value built up from the common underlying factors Xk,

Eij is the error term unique to subject i on variable j, after accounting for µi,

ßk is the factor loading estimate for the unknown common factor Xk.

ßk is the same for all subjects on factor k.

In a linear model, you are given the values for X and Y, so that it is possible to solve for a
unique ß and E. With a factor model, X (the common factors), ß (the factor loadings), and E
(the unique factors) are all unknown and have to be estimated. You can have an infinite
number of solutions to factor loadings, all of which fit the data well. There lies the first
indeterminacy of factor analysis. To obtain a unique solution, you must impose the
constraint that all the common and unique factors be mutually uncorrelated.

After you estimate the factor loadings, any orthogonal rotation of these estimates to derive
the X factors would work equally well to preserve the variance information. For
convenience, researchers often rotate the factor axes so that they can better interpret the
resulting axes - factors. There lies the second indeterminacy of the factor analysis.

Because of the amount of guesswork involved in doing a factor analysis (it is a fishing
expedition), you can have different findings with different groups of respondents, with
different ways of obtaining data, and with different mixes of variables. The technique should
only be used as an exploratory tool to help untangle badly tangled data, which should then
be followed up by a confirmatory analysis like the regular ANOVA.

3.1 Application

Exploratory variable analysis. If you can gain insights into the underlying factors, you may be
able to learn something interesting about the observable variables, or you may even derive
causal relations of how these factors influence the observed variables. Also, if you can show
that a small number of underlying variables can explain a large number of observed
variables, then you can significantly simplify your research.

3.2 Example
A car dealer has asked for several customers’ preference ratings on a variety of car models
made by Mercedes, BMW and Toyota. Using factor analysis, the researcher is able to
identify three common factors that underlie all customers’ preference ratings for these cars:
style-consciousness, price-consciousness, and performance-consciousness. The common
factor loadings are estimated for each car showing the extent to which customer ratings for
this car depend on the degree of their preferences for these three underlying factors. For
example, a researcher may discover that the rating of the Mercedes sedan would load
heavily on customers’ propensity toward style-consciousness and performance-
consciousness.

4. Multidimensional scaling (MDS)


MDS is a descriptive procedure for converting pair-wise proximity measurements among
objects (observations mostly, variables occasionally) into a geometric representation in a
two-dimensional space. The goal is to plot it.

The method requires a single input: a set of pair-wise proximity measures on the objects,
computed using all the variable information. MDS then applies iterative
optimization/transformation procedures to derive new configurations, projecting the
proximity onto a lower dimensional space. MDS deals with configurations rather than
groupings of the objects. It finds the relative locations of objects in a two-dimension space
for plotting purposes.

The proximity measure can take many forms: either as an absolute value (e.g., distance
between two points), or as a calculated value (e.g., correlation between two variables).
These are examples of proximity measures:

physical distance between two locations;

psychological measure of similarity or dissimilarity between two products, as viewed by the
subjects;

a measure that reflects how well two things go together; for example, two kinds of foods
served in a meal.

4.1 Application

Geometric representation of objects, and outlier detection. MDS enables you to create a
plot of points in a two-dimensional space such that distances between points reflect the
degree of their similarity or dissimilarity. These points can be objects representing anything
you want (e.g., brands of product, groups of people, geographic locations, political
candidates). Data for MDS analysis can be metric or non-metric, and they need not be
absolutely precise (in that case, you would be drawing an approximate map).

By studying the spread of the data points on a plane, you may discover unknown variables
(or dimensions) that affect the similarity and dissimilarity values, or the outliers that are
distant from all other points.

4.2 Example
A market research firm was interested in knowing how customers perceive the similarities
between various snack foods. They selected 14 popular snack foods and asked six subjects
to rank every pair of snacks. MDS was used to transform the proximity data and plot the
points (snacks) on a two-dimensional space. Points that were relatively close represent the
snacks that were judged to be similar by the customers.

You can expand the research if you have existing sales data for each of the snacks. You can
build a regression model of sales on the properties represented by the two new dimensions
(e.g., saltiness and crunchiness). Each snack food receives a score for each of the
dimensions. The results of the regression can help you design a new snack food with
properties found in an area of the plot where there are no snack foods, but promises high
sales as predicted by the regression.
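A minimal MDS sketch of such a study, using a precomputed dissimilarity matrix; the snack "data" here is simulated from two hidden properties rather than taken from real rankings, and the stress value reports how faithfully two dimensions reproduce the proximities.

```python
# Metric MDS on a simulated 14 x 14 snack dissimilarity matrix.
import numpy as np
from scipy.spatial.distance import pdist, squareform
from sklearn.manifold import MDS

rng = np.random.default_rng(10)
snacks = [f"snack_{i + 1}" for i in range(14)]
hidden_properties = rng.uniform(0, 10, size=(14, 2))   # e.g., salty, crunchy
dissimilarity = squareform(pdist(hidden_properties))

mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
coords = mds.fit_transform(dissimilarity)
print("Stress (lower means a better two-dimensional fit):", round(mds.stress_, 2))
for name, (x, y) in zip(snacks, coords):
    print(f"{name}: ({x:+.2f}, {y:+.2f})")
```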

5. Discriminant analysis
Discriminant analysis is a model-based regression technique for classifying individual
observations into one of several groups based on a set of "continuous" discriminator
(independent) variables. The modeled (calibrated) relationship can be applied to new
members to predict their group membership.

In a discriminant analysis, the researcher selects the observations first before measuring
their values on the discriminator variables to avoid violations of method assumptions.
Discriminant analysis is not appropriate for situations where the researcher wants to select
observations to guarantee that wide ranges of values are included for the independent
variables (in such a case, use logistic regression instead).

5.1 Applications

Discovering the discriminant function that optimally discriminates among groups, and
learning how they work. Discriminant analysis and cluster analysis both form groups. In a
discriminant analysis, the group membership in the sample data is known, and the
procedure is concerned with finding the meaningful relationship between group
memberships and the discriminators. In a cluster analysis, the groupings are not known
ahead of time, and the sole purpose is to find group memberships based on a composite
distance measure on all possible variables.

Grouping of observations. Discriminant analysis uses the linear discriminant function to
predict group memberships for a set of data points.

5.2 Example
Before granting a credit card to a customer, a bank would want the assurance that the
potential customer is a good credit risk. The bank may build a discriminant model of people
whose credit behavior is known, based on such discriminator variables as their income,
amount of previous credit, length of time employed. The modeled relationship can be
applied to new applicants’ values on the discriminator variables, which are known, to
predict their future credit behavior.

5.3 Limitations
With discriminant analysis, you can examine the extent to which the groups differ on the
discriminators, which you cannot with ad-hoc procedures like cluster analysis.

It is very important to cross-validate the results of a discriminant analysis. A discriminant
analysis would give spurious results when it classifies the same observations it used to
develop the functional relationship. (The misclassification rate of new data would be higher
than what the model predicts.) To properly cross-validate the discriminant function, you
should have sufficient observations in your sample so that a portion of them can be used to
develop the function, and the other portion used to cross-validate your result.
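A minimal sketch of that holdout approach follows, reusing the simulated X and y from the credit sketch above; the apparent (resubstitution) error will usually look better than the honest holdout error.

# Minimal sketch of holdout validation for a discriminant function: develop
# the function on one portion of the sample and check the misclassification
# rate on the portion that was held out.
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

X_train, X_holdout, y_train, y_holdout = train_test_split(
    X, y, test_size=0.3, random_state=0)

lda = LinearDiscriminantAnalysis().fit(X_train, y_train)

apparent_error = 1 - lda.score(X_train, y_train)       # same data: optimistic
holdout_error = 1 - lda.score(X_holdout, y_holdout)    # honest estimate
print(f"Apparent error: {apparent_error:.2%}  Holdout error: {holdout_error:.2%}")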

6. Canonical correlation analysis, or multivariate multiple regression
As in univariate multiple regression, you have several independent variables in multivariate multiple regression; unlike univariate regression, however, you also have several dependent variables. The goal of multivariate
multiple regression is to find the joint effect of independent variables on all dependent
variables simultaneously.

You can do a separate univariate regression for each of the dependent variables. The
problems with this approach are:

you would have a separate series of prediction equations for each dependent variable;

you would have no multivariate information about the relationship of the set of
independent variables with the set of dependent variables;

you would have no information about the relationship within the dependent variables or
the relationship within the independent variables.

6.1 Canonical correlation analysis


A canonical correlation analysis enables you to discover the linear relationship between two
sets of variables, without regard to which set is the independent variables and which set is
dependent variables. In a multivariate multiple regression, the canonical correlation (or multivariate R2) plays the same role that the multiple correlation (univariate R2) plays in a univariate multiple regression.

Canonical correlation analysis is able to simplify the following problems:

The dependent variables may be measuring redundant information. (A subset of them, or a smaller set of their linear combinations, may be sufficient.)

The independent variables may be measuring redundant information. (A subset of them, or a smaller set of their linear combinations, may be sufficient.)

The linear combinations of the independent variables may serve as predictors of the linear
combinations of the dependent variables. This would reduce the complexity of the analysis, which may actually give you better insights into the data.

Canonical correlation analysis does this by a redundancy analysis, finding successive linear
combinations of the variables in each of the two sets such that:

Each linear combination of the variables in one set is independent of the previous linear
combinations of variables in the same set.

The correlation of any pair of linear combinations between two separate sets is the highest
correlation there can be, subject to the constraint that they have to be orthogonal to
previously selected pairs.
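The following minimal sketch illustrates the idea of successive canonical variate pairs using scikit-learn's CCA on two simulated variable sets; the data and the choice of two components are assumptions made purely for illustration.

# Minimal sketch: extracting canonical variate pairs from two sets of
# variables. Rows are observations; columns are the variables in each set.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                                        # first set
Y = X[:, :3] @ rng.normal(size=(3, 4)) + rng.normal(size=(200, 4))   # second set

cca = CCA(n_components=2)
X_scores, Y_scores = cca.fit_transform(X, Y)    # successive canonical variates

# Correlation of each pair of canonical variates (largest first)
canonical_corrs = [np.corrcoef(X_scores[:, i], Y_scores[:, i])[0, 1]
                   for i in range(2)]
print("Canonical correlations:", canonical_corrs)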

6.2 Applications of canonical correlation analysis

Redundancy analysis. The result of the analysis is a set of canonical variables: linear
combinations of the original variables that optimally correlate with the corresponding linear
combinations in another set. By examining the coefficient and correlation structures of the
variables used in forming the canonical variables, you know the proportion of variance in
one set of variables as explained by the other set. These proportions, considered together,
are called redundancy statistics.

Redundancy statistics are in fact the multivariate R2 in a multivariate multiple regression. By performing a canonical analysis on two sets of variables, one set identified as independent
variables and the other set as dependent variables, you can calculate the redundancy
statistics that estimate the proportions of variance in the dependent variable set that the
independent variable set can explain.

6.3 Example

A pharmaceutical company is interested in comparing the efficacy of a new psychiatric drug against an old drug, each at three different dosage levels. Six patients are randomly assigned
to one of the six drug-dose combinations.

There is a set of three dependent variables: the gain scores from three psychological tests conducted on the patients before and after the trial. These scores are: HDRS (Hamilton gain score), YBOCS (Yale-Brown gain score), and NIHS (National Institute of Health gain score). There is a set of three independent variables in the model: drug (new or old), dosage level (50, 100, or 200 mg), and prior physical condition.

7. Conjoint analysis
Conjoint analysis is used to analyze product preferences and to simulate consumer choice. It
is also used to study the factors that influence consumers’ purchasing decisions. Products
can be characterized by attributes such as price, color, guarantee, environmental impact,
reliability, etc. Consumers typically do not have the option of buying the product that is best
in every attribute, particularly when one of those attributes is price. Consumers are
constantly making trade-off decisions when they purchase products (e.g., large car size
means increased safety and comfort, which must be traded off with higher cost and
pollution). Conjoint analysis studies these trade-offs, under the realistic condition that many
attributes in a product are presented to consumers together.

Conjoint analysis is based on an additive, simple main-effect analysis-of-variance model. This model assumes no interactions among the attributes (which may be unrealistic). Data is
collected by asking participants about their preference ratings for the overall products
defined by a set of attributes at specified levels. Conjoint analyses are performed for each
customer, but usually the goal is to summarize or average the results across all participating
consumers.

For each consumer, conjoint analysis decomposes his original overall ratings into part-worth
utility scores for each of the attribute levels. Total utility for a particular product (viewed as
a combination of attributes and their levels) is the sum of the relevant part-worth utilities.
Large utilities indicate the preferred combinations, and small utilities the less preferred
combinations. The attributes with the widest utility range are considered to be the most
important in predicting this consumer's preference. The average importance of an attribute across all consumers indicates the overall importance of this attribute.

A consumer's total utilities estimated for each of the attribute-level combinations of a product are then used to simulate expected market share - the proportion of times that a
product combination would be purchased by him. The maximum utility model is often used
to simulate market share. The model assumes that a customer will buy with 100 percent
probability the product combination for which he has the highest utility, and 0 percent for
all other combinations. The probabilities for each of the product combinations are averaged
across consumers to get an overall market share.
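A minimal sketch of the maximum utility rule is given below, using a small hypothetical matrix of total utilities (rows are consumers, columns are product combinations):

# Maximum utility rule: each consumer "buys" only the combination with his or
# her highest total utility; the 0/1 choices are then averaged across consumers.
import numpy as np

utilities = np.array([        # hypothetical total utilities
    [6.2, 4.1, 7.5, 5.0],
    [3.3, 8.0, 2.1, 6.7],
    [5.5, 5.9, 6.1, 4.8],
])

chosen = utilities.argmax(axis=1)               # index of each consumer's top choice
choices = np.zeros_like(utilities)
choices[np.arange(len(utilities)), chosen] = 1  # 100 percent to the top choice

market_share = choices.mean(axis=0)             # average across consumers
print("Simulated market shares:", market_share)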

7.1 Example
A consumer is asked to rate his preference for eight chocolate candies. The covering is
either dark or milk chocolate, the center is either hard or soft, and the candy does or does
not contain nuts. Ratings are on a 1 to 9 scale where 1 indicates the lowest preference and 9
the highest. Conjoint analysis is used to determine the importance of each attribute of the
product to the consumer, and his utility score for each level of an attribute.
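A minimal sketch of this kind of single-consumer estimation appears below, using a dummy-coded main-effects regression in Python (statsmodels). The eight profiles follow the example above, but the ratings and the importance calculation are illustrative assumptions rather than results from the article.

# Part-worth estimation for one consumer via a main-effects-only regression.
import pandas as pd
import statsmodels.formula.api as smf

profiles = pd.DataFrame({
    "chocolate": ["dark", "dark", "dark", "dark", "milk", "milk", "milk", "milk"],
    "center":    ["hard", "hard", "soft", "soft", "hard", "hard", "soft", "soft"],
    "nuts":      ["yes",  "no",   "yes",  "no",   "yes",  "no",   "yes",  "no"],
    "rating":    [7, 5, 9, 6, 4, 2, 6, 3],     # hypothetical 1-9 preference ratings
})

# Additive main-effects model; coefficients are the part-worth contrasts
model = smf.ols("rating ~ C(chocolate) + C(center) + C(nuts)", data=profiles).fit()
print(model.params)

# Attribute importance: range of part-worths for an attribute, relative to the
# sum of the ranges across all attributes (two-level attributes, so the range
# equals the absolute value of the single contrast coefficient)
ranges = {
    "chocolate": abs(model.params["C(chocolate)[T.milk]"]),
    "center":    abs(model.params["C(center)[T.soft]"]),
    "nuts":      abs(model.params["C(nuts)[T.yes]"]),
}
importance = {k: v / sum(ranges.values()) for k, v in ranges.items()}
print(importance)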

7.2 Limitations
The main deficiency of conjoint analysis is its lack of error degrees of freedom (too few
observations, and too many variables). It is like conducting an ANOVA analysis on one
subject over all levels of the attributes. The R2 with data points extracted from one
individual would always be high, but that does not guarantee good fit or the model’s
predictive power. Researchers prefer the main-effects model because it requires the
fewest parameters to estimate, alleviating the burden of not having enough error degrees
of freedom.

The second problem is the complexity of designing the experiment. The simplest design
would be the full-factorial design, requiring all possible combinations of the levels of the
attributes given to the consumers for rating. When the number of attributes is large (say six
or more), and the number of levels for each attribute is large (say four or more), it is tiring
for anyone to rate that many combinations. To reduce the number of combinations presented, you can instead choose an orthogonal (fractional factorial) design that still leads to uncorrelated part-worth estimates.

While conjoint analysis is ideal for new product design, researchers are advised to confirm the conjoint results with a standard ANOVA, even though the latter works better with continuous measurements.

Report Writing:
Mastering the art of writing quantitative research reports
Authors - Ron Weidemann II and Albert Fitzgerald

Article Abstract

Quantitative research report-writing is really about telling a story. The authors provide tips
and guidance on how to structure that story to create reports that enlighten and enthrall
the end users.

Let’s face it: we’ve all seen those dreaded reports, full of mind-numbing tables. Data here;
data there; data, data everywhere! But aren’t quantitative research reports all about
numbers? Isn’t it necessary to show the results of questions with numerical tables? Maybe
we can show a few pie charts or a bar chart or two, but isn’t the core of the research the
numbers? Our answer is no. Numbers do not tell the story. As researchers, that’s our job.

Quantitative research reports are really about telling a story and using the data as
supporting information. Of course this is easier said than done. Numbers are often just as
tedious for the analysts as they are for the person reading the report. The good news is
there are a number of things an analyst can do to ensure that a quantitative report will be
readable, tell a story and allow the consumers of the information (our clients) to make
critically important business decisions.

First of all, know your market. Stay up to date on current events, macro/micro trends and
the competition. This information can be invaluable when giving insights into why the
market has shifted in an unexpected way.

Next, use study objectives as a guide. Review the objectives and match them to specific
question batteries. This makes certain you are meeting the critical needs of the study. Write
down each objective and review the questions that address them. Then write a single
sentence that answers each objective. For example, if the objective is to identify the optimal
price point for a new product, write down the ideal price point. Keep it simple and short.
Don’t elaborate or try to explain methodology or give supporting materials at this point. We
need a 10,000-foot view before we address the details.

Then draft a one- to two-page summary. This always helps get out of the trees and see the
big picture. Once you have your summary, the storyline should take shape. By rearranging
your points you can find an ideal flow - a way of presenting the critical information that
communicates the most important results of the study.

Road map

Now it’s time for reporting. The first step is to lay out a road map of the report. Using your
executive summary, ask yourself, “How do I want my client to consume this information?”
This is a critical step, one often overlooked due to time pressure and the availability of
cookie-cutter report templates. Remember, each story you tell is different, and your report
should reflect its uniqueness. Once you have the report laid out it is time to tell the story.

Executive summaries are always a good way to begin a report. Keep the insights tight and
pithy. No matter how long the survey was, executive summaries should never be more than
one or two slides. If yours has more than this, you are sure to have superfluous information.
Next follow your road map by building slides that support your insights from the summary.
Also, keep in mind that each slide should be able to stand on its own with both graphical
and text-based content.

When writing your report, ignore the order in which the survey questions were asked. Few
reports flow well and few stories are compelling when the information in the final report
follows in the exact order of the questions in the questionnaire. Instead, present the most
important information up front. Follow the flow that made sense in the summary. This will
help the final report tell a compelling story.

Today many reports are crammed with data. Take time and ask yourself: Why did I put that
data there? What is its purpose? What point does each number convey? If the data is not
essential for communicating the point, leave it out. Use flow charts, arrows and other
graphical tools to walk people through your slide. You want the reader to spend their time
absorbing the information rather than trying to figure out what is going on in the slide.

Other tips: use color to denote differences and carry readers through your report. Also, omit
data cuts or segments where differences do not exist. Few things clutter up a report more
than slides showing comparisons between data cuts which then conclude that there are no
differences worth noting!

Educate and enthrall

At the end of the day it is important that you effectively and efficiently communicate the
information to your client. But realize that your client wants information to help make
critical business decisions, not mountains of numbers that are difficult to sift through. Put
yourself in their shoes and you’ll write reports that enlighten, educate and enthrall.

Charting and graphing software comes of age
Author - Steven Struhl

Article Abstract

This article describes six different charting and graphing software packages that run in the Microsoft
Windows environment. Each package was reviewed according to the variety of charting
options; speed, efficiency and demands made on the PC; ease and smoothness of operation;
drawing and embellishing; transferring files and graphics; and value for the cost.

As the title of this review suggests, these programs prove that the days of good, even
outstanding, software have arrived. As much as any software on the market, these packages
collectively show how powerfully computers can perform in the Microsoft Windows
environment. I would rate each of these packages as at least "good" overall, as well as in the
areas of creating charts and graphs, and generating presentations. Each program has a
different "personality," however, and appeals to widely differing sets of users. If you can
find the program that meets your specific needs, you're likely to be very pleased.

That enthusiastic opening out of the way, let's get to the details. There is plenty to say about
these feature-rich programs. They perform so many specialized functions it was hard to
decide what to include. Suffice it to say that each program does everything discussed here
and more.

This review falls into several sections:

a three-minute summary of the programs;

a review of ground rules;

a sidebar on graphics and related formats;

review areas in detail; and

recommendations.

Overview

Any of these products will handle the basics with ease. All can produce remarkably
professional results. All provide excellent file import and export capabilities, and most allow
you to do advanced analyses along with charting and graphing data from a file or entered
from the keyboard. All are loaded with features, including many that would have seemed
incredible even a few years ago. Each has some area (or many areas) that it handles
particularly well.

This sophistication has its price, though: the programs range in size from large to huge, and
most require a powerful PC to operate efficiently. While you can run all of them on a less
powerful PC (that is, anything below a 486-based system), most of them will perform with
infuriating slowness. Charisma 2.1 (the predecessor of the upcoming Charisma 4.0) was
actually fleet enough to run at a reasonable speed on a 386DX, which ran at 16 megahertz
(MHz). Freelance Graphics 2.01, nearly as fast, ran at a barely tolerable rate on the slower
machine. The rest of these new giants really need more speed, however. You will also need
plenty of hard disk space (from 13 to 34 megabytes for each program).

As your needs become more specialized, you may start to find that each package has a few
gaps. Your best bet is getting the literature from each company and reading about the
packages in detail before you buy. For a preview that points out great features (in my
idiosyncratic view), as well as omissions or rough spots, just read the rest of the review.

Below are thumbnail sketches describing each program and its "personality."

Charisma 2.1 and 4.0 for Windows


The last version of Charisma was 2.1, now a venerable program at two years old. Charisma
has long been a personal favorite of mine for its speed, ease of use, and the strength of its
drawing and editing tools. Micrografx, which produces Charisma, has never received the
recognition it deserves for producing excellent software, but the package boasts many little
touches that show that its creators really understand what goes into producing precise images
quickly. Perhaps many of their packages do not "win" in sheer number of features, but they
always have provided the things you need to produce great-looking work - in programs that
run quickly, smoothly and "intuitively" (they almost always do what you expect).

Version 4.0 was provided in "beta release" (not finished) form. About 30% of the final
program did not work, so it was difficult to assess its actual performance. Given the features
described in the product's manual, and the Micrografx track record, though, the finished
product is likely to be excellent. Based on what was sent, I would expect something similar
to Freelance Graphics 2.01 and Harvard 2.0, but with more charting options and more fully
integrated advanced drawing and image modification features. Micrografx also offers the
best-looking clip-art for PCs that I have ever seen (although Corel is quite close in quality).
The images they provide can add a truly professional touch to your work.

Corel Draw 4.0 for Windows


Roughly a year ago, Corel set new standards for graphics software when it released Corel
Draw 3.0. The latest version is even more advanced. You can do nearly anything you can
imagine with an image using Corel. Corel Chart, the package's charting program and a full-
scale program in its own right, is simply one module of many in this amazing package. Other
features include advanced precision drawing, photo manipulation, desktop publishing,
remarkable graphics format conversion capabilities, fractal object textures (which often look

like natural textures), animation and completely professional prepress image preparation. It
can make you look like an artist in spite of yourself. The program is a colossus in both
features and size, and even comes with a CD-ROM disk holding some 750 (yes, really!) high-
quality True Type typefaces and some 18,000 high-quality pieces of clip-art.

Perhaps the single largest drawback to Corel is that if you start exploring the program's
features and capabilities, you may take weeks to get to the charts you want to make.
Beyond this, you should be aware that the images that Corel can produce may become too
complex for your printer to handle, and the program itself is likely to give even a powerful
computer, like a 486 DX2-50, a hard workout.

The charting module did have a few drawbacks, chiefly speed. At times, it seemed to move
more slowly than the other programs and the other parts of Corel. Adding labels and
annotations seemed to take the most time. I hope Corel will work on this, because it can do
remarkable things. Also, unlike the drawing portion of the program, the charting module
does not allow you to interrupt it while it redraws the screen, to enter another command, or
change what is happening. It tended to redraw the entire screen even after small changes
(like changing the point size of a label), but did not always do this. This module also lacked
the great flexibility of the drawing portion of the program, although you can always copy a
chart into the drawing module, and embellish with all of Corel's powerful tools.

DeltaGraph Professional
DeltaGraph started as a Macintosh program and migrated to Windows last year. This
program is a real analytical heavyweight, able to fit many types of curves and surfaces to
your data. If you have a scientific bent and want a program that will quickly perform many
analyses and convey the results, this may be the choice for you. A particular strength of this
program is its ability to create labeled scatter points in one step.

On the negative side, DeltaGraph had a somewhat less integrated feel than most of the
other programs. Backgrounds, for instance, are kept separately from charts, and applied to
them. Most of the other programs let you start with a basic "look" for all the pages you are
making, and go from there. Also, some of the complex options in DeltaGraph moved a little
slowly. The program has not adopted the "intelligent redraw" that many others use, so even
minor changes require regeneration of the entire image. If you are working with relatively
simple graphs, this should not matter. It can be irritating when working with more complex
images, however. For instance, it redrew a complex image several times as I tried to get axis
labels exactly the way I wanted them.

Lotus Freelance Graphics 2.01 for Windows


This is my product of choice for a presentation that's mostly word-oriented. The program
does a good job with charts, too, although with fewer fancy options than the others. This
program excels at putting together sharp, professional-looking paper or slide presentations

with a uniform look from page to page. Its handling of bullet points at multiple levels is
superb. It has a good collection of clip-art (which it calls "symbols") that you can easily add
to or modify.

Freelance Graphics 2.01 runs smoothly and quickly. The only time I noticed the least
slowdown was when moving around a magnified view of the screen. Its operation is highly
"intuitive," showing how far Freelance Graphics 2.01 has advanced the idea of "software
usability." They really seem to know how users will do things with software. The program
would almost always respond as I guessed it would when I was using a new feature or
producing a new effect. I remember looking something up in the manual only once, and
needed only a quick glance at the computer-run tutorial. This is the product that I return to
when I need to produce a truly professional presentation quickly.

Harvard 2.0 for Windows


This product is really good - amazingly good, especially after some moderately unhappy
experiences with versions of Harvard for DOS. SPC, the makers of Harvard, have gone the
distance to make a product that produces truly professional results with tremendous ease.
Harvard's ability to prepare word charts trails Freelance Graphics 2.01 only slightly, and the program has more options for graphs and adds advanced drawing and image modification tools. Harvard
goes beyond the other programs in the reminders and intelligent advice it offers. You can
turn on a screen that gives pointers (all well-taken) on good charting and graphing
practices.

Perhaps most intelligently, Harvard tells you what all the little icons scattered on the screen
actually do. (Windows programs now make extensive use of "icons," or small pictographs,
which you click on with the mouse to accomplish tasks.) As you pull the mouse across the
spot occupied by an icon, the text on the top bar on the screen (normally devoted to the
program's name) changes to explain what the icon does. Bravo, Harvard!

Harvard also includes an add-on "F-X" module that can enhance your charts and
presentations with an ample sampling of the many remarkable effects you can expect from
Corel. You can make two-dimensional objects (including text) look three-dimensional in
various ways, add shadows and many special fill patterns that look like chrome, steel,
leather and so on. The results look excellent. Corel provides many more options (probably in
the billions - no kidding), but Harvard makes this type of wizardry easier.

Harvard looks like an excellent choice for more chart-intensive presentations, or for those of
us who like to add a few extra fancy touches.

Harvard also has some advanced multi-media features, meaning you can make your
presentation a real spectacular, including sound clips, animation and so on. Harvard even

allows you to set up a tele-conference presentation with up to 64 networked computers
(requiring only VGA monitors).

Stanford Graphics
This program provides the most remarkable range of charting options, and genuinely
advanced analytical capabilities. Stanford claims to produce more than 140 types of charts.
Some of these types are closely related, but the variety still is incredible. If you want to work
with your data in detail, do more technical and scientific charting, perform various types of
what-if analyses, or just produce some absolutely amazing charts, Stanford may well be your
choice.

Stanford's speed is good, even with more than one complex chart on the page. It has
become much more flexible than earlier versions, but it's still not quite up to most of the
other programs in drawing and on-screen editing. It sometimes makes you work a little
harder than the others do, and on a few occasions, it was not clear how to get a desired
result. You ultimately can do nearly anything you want with a chart, although you may need
to "work around" to a solution in few instances. For instance, it takes three steps to find the
goodness of fit of a line or curve drawn through points; most other packages produce this
automatically. Help is available if you get stumped. Stanford's technical support proved
quite helpful with the questions I posed for them.

Areas reviewed
We looked at the packages from the perspective of survey and database data, the kind of information that market researchers, marketers and planners are likely to use. We evaluated each for:

variety of charting options;

speed, efficiency and demands made on your PC;

ease and smoothness of operation;

drawing and embellishing;

transferring files and graphics; and

value for the money.

Performance in each area is summarized by a system of stars, ranging from one star for
"poor" to six stars for "outstanding," as follows:

*.......................................................poor
** ......................................................fair
*** .................................................good
****........................................ very good

*****........................................ excellent
****** ..................outstanding, wonderful

Test equipment

We tried these programs on an IBM-compatible 80486-based PC. This machine (although already at the advanced age of 11 months) still can be counted as a fairly up-to-date "heavy
duty" (or "hot rod") machine, with most of the latest features. It is powered by a DX-2 type
chip running at 50 MHz, has 8 megabytes (MB) of RAM (random access memory-the
computer's working space for running programs), and a 212 MB hard drive with an access
time of 12 milliseconds. Its video is handled by an ATI Graphics Ultra Pro card, a highly
respectable card for handling the demands of Microsoft Windows.

Nearly all these programs demand a PC about as powerful as our test unit. Windows itself
tends to run more slowly with anything less than 8MB of RAM. Strictly, Windows requires
only 2 MB of RAM, but you will see performance suffer with 4 MB or less of memory.

Review area 1: Variety of charting options

When considering chart variety, a large part of your evaluation will revolve around what you
expect graphics to do. One school that has long held some sway, as exemplified by Tufte and
his "Visual Display of Quantitative Information," maintains that simplicity is the highest

good. "Minimize the ink-to-information ratio" is their war cry. An ideal chart for this faction
is spare, with lots of white space.

Meanwhile, a few in and around the academic community grumbled about all this austerity,
and practitioners continued to notice that audiences liked color, 3-D effects, and so on.
Recently, Tukey, a star in the data-analytical pantheon, and long in Tufte's camp, fired a
strong salvo against the purist approach. Graphs and charts, he stated, should be used more
as a qualitative aid to understanding than to display information precisely. Expect graphic
representations of the data to give a feeling for its patterns, underline key points, and arrest
the attention. Leave the analysis of the data to the numbers and accompanying text.

Particularly if you believe in arresting the attention, all these packages have plenty to offer.
Even Freelance Graphics 2.01, which has somewhat fewer charting options than the others,
can frame your chart in a presentation format that will compel, amuse or startle.

Stanford, however, is the clear winner in charting options. The number of different charts it
offers is nothing short of extraordinary. You can get an idea of the options by browsing
through a section of the program called the "Gallery." This shows every basic chart type
Stanford makes, divided into 2-D and 3-D sets. Some of these charts look so incredible you
may find yourself trying to contrive some data to fit into them.

Like Stanford, Corel and DeltaGraph do 3-D graphs, with many of the same chart-handling
features (including rotation of the graph in three dimensions, and changing the chart's
perspective). Charisma 4.0 will produce 3-D graphs as well.

Both Stanford and DeltaGraph produce scatter plots with labels, a highly useful feature for
perceptual mapping. With this feature, the programs take the labels from your data (which
appear in spreadsheet form in all the programs), and put them on the chart near the points.
With DeltaGraph, you can then drag and drop any overlapping labels on the screen. With
Stanford, you need to change the distance in a dialogue box. Stanford, though, will put all
the labels you change in any way at just one distance. (If you move one above a point and
one below another point, and if you set one, for instance, 18 units from the point, the other
will move to a distance of 18 units from its point.) You may need to delete a few labels and
"overlay" another data series, which can then have other custom distances, on very
crowded scatter plots. This program also allows you to use any symbol in any typeface for
markers, even providing a special symbol set for this purpose. The upcoming Charisma 4.0
also will make labeled scatter plots.

Stanford also will produce vector maps, with vectors pointing toward the origin and labels at
the ends. This is also useful for perceptual maps. Getting the vectors to radiate directly from
the origin (0,0 point) of the chart may prove tricky, though.

Review area 2: Speed, efficiency and demands on your PC

Corel is a heavyweight among heavyweights. Aside from Charisma 2.1 (which Charisma 4.0
is about to replace), none of these programs requires less than 10 MB of disk space for a full
installation. Corel requires 34 MB, although you can run it using an included CD ROM disk,
keeping only a portion of the program on your hard drive (assuming you have a CD ROM
drive). Corel also includes another CD ROM disk, as mentioned, with 750 additional
typefaces (some come with the basic program), and some 18,000 pieces of clip art. The final
size of Charisma 4.0, still in development, is not certain. Charisma 4.0 also will include a CD
ROM disk full of images, and perhaps other materials.

Most of the programs, besides being large, make heavy demands on your PC. Only Charisma
2.1 runs somewhat comfortably on an older 386-based machine. Freelance Graphics 2.01,
not quite as fast as Charisma, will run on a slower 386-based PC, but requires plenty of
patience on this platform. Harvard would probably run about as fast as Freelance Graphics
2.01. Corel, Stanford and DeltaGraph required too much speed and power to run
comfortably on an older, slower PC.

All the programs ran at least acceptably on our test model, a 486-based PC. Charisma 2.1
ran very quickly. Charisma 4.0, still filled with "beta test code," seemed slower, but part of
the "beta development" cycle is making the program run more quickly. Charisma's
operations seem subjectively faster because of the wealth of "shortcut key" combinations
that can quickly execute common operations. These key combinations (for instance, using
Shift + L to left-align objects, from the object alignment menu) can save a great deal of time

compared with clicking through two or three levels of menus. Charisma also has a "set"
check-box on most of its larger dialog boxes, which keeps them in view until you decide
otherwise. This way, you easily can try out a few effects without having to call up the dialog
box from a menu repeatedly.

Harvard, Freelance Graphics 2.01, and Stanford operated at excellent speeds, overall.
Harvard sometimes seemed a little slow opening or saving charts, and could use a little
more speed in saving new "master styles," consisting of backgrounds and layouts, for use
with presentations. Freelance Graphics 2.01, otherwise quite quick, moved slowly when
panning around a magnified view of a presentation page. This tended to discourage exact
editing of objects on the page. Stanford, while usually quite quick, often relies on entering
data in dialog boxes to move things on the screen. This seems slower than simply pulling the
object to the location you would like.

DeltaGraph did well with simple charts, but it redraws the entire screen every time you
change any detail. If you are working with heavily detailed charts or special effects (like
gradient shading that changes gradually from one color to another), then the redraw time
can seem slow.

Corel often moved quite quickly, particularly if you simply entered data and used one of its
(many) preset graph types. However, like DeltaGraph, it tended to redraw the entire screen
after small changes. With complex patterns, like the fancy stars in the "Drawing and
Embellishing" rating chart, all these redraws turned editing into a time-consuming process.
In that particular chart, I didn't like the default placement of labels, and it was here that
Corel moved most slowly. Finally I copied this one chart to the Windows clipboard and
pasted it into Charisma 2.1 for final editing. I hope Corel will work on the speed of changing
annotations on graphs, since this renders an otherwise rich and versatile program harder to
use.

Review area 3: Ease and smoothness of operation

Some of these aspects were covered in the last section. The speed and ease champion, as
may be apparent, is Charisma 2.1. Freelance Graphics 2.01 and Harvard follow closely. All
move quickly, perform as expected nearly all the time, and have a well-integrated "feel."
Note though, that Harvard and Freelance Graphics 2.01 use a "presentation" metaphor, in
which you start by choosing a set of basic page-layouts with a common background, uniform
text fonts and colors, and so on. Both Harvard and Freelance Graphics 2.01 then allow you
to easily modify these "master styles" to taste. All chart and text pages will then change
accordingly. Charisma 2.1 puts charts into large 12-page workspaces. You can make the look
of these pages uniform for a presentation, but this is a more labor-intensive approach than
with the other two. Charisma 4.0 will go over to a presentation-style metaphor, but starts
with an initial question about whether you will be preparing something for paper, slides or
screen, with the default choices it offers modified accordingly.

Harvard is probably the most helpful of all the programs, with an advice screen that you can
keep on as needed (including all the time). The program provides plenty of sound pointers
on displaying data. As mentioned, Harvard keeps you informed about what the various icons
do by changing the text in the top bar on the screen. This is a great feature.

Charisma 2.1 uses the bottom corner of the screen to explain menu choices more fully.
When you run the cursor over a menu item, an extra explanation appears. Charisma 4.0
should extend this system to icons, and add reminders about the equivalent short-cut keys
for menu operations. Reminders about short-cut keys now appear directly on the menus in
Charisma 2.1.

DeltaGraph, in addition to seeming somewhat slower than the others, had a somewhat less
cohesive feel. Rather than using a presentation metaphor, it allows you to keep
backgrounds in a library, and apply these to presentations. The backgrounds did not always
appear until I ran the "slide show" feature - and why this happened was not apparent.
Making the slide show run was not entirely intuitive, and the default between-page waiting
times and transitions were far too slow for my tastes. Nonetheless, the program will get the
job done, with professional results. DeltaGraph also provides advice on which of its many
charting options to choose. You select the type of audience, what you want to convey, and
how fancy you want the chart to look, and the program makes a suggestion.

Stanford offers the option of working on a single chart or an entire presentation. It works
reasonably well but not quite with the same smoothness and ease of modification as
Harvard or Freelance Graphics 2.01. Stanford offers a broad range of analytical options, but
you likely will need to read the manual to use some of them. As mentioned, Stanford has
not yet automated certain features that the other packages have, particularly providing the
goodness of fit (such as the r-squared) for a line or curve fitted to the data. On the other
hand, Stanford has helped ease of use a great deal by adding a feature that highlights the
portions of a spreadsheet corresponding to the various areas that will appear on a chart.
Stanford also has a pop-up "advisor" (a professorial-looking character) that can provide
extra guidance about using the program.

Corel does not use a presentation metaphor per se, but makes it easy to keep pages
uniform. Any slide or chart can serve as a template for all others, so once you set things up
the way you like, all subsequent pages can "inherit" layouts, colors, chart placement, and so
on, recreating a "look" in its entirety. Corel in its Draw module includes desktop publishing
features so powerful that some users reportedly use it to do text entry, processing, and so
on. The other programs lend themselves best to presentations, rather than to intricate page
layouts.

Perhaps not surprisingly, Corel's tremendous depth of features requires a lot of learning
time. Each feature is relatively simple, but there are so many of them! One of the CD ROM
disks that comes with Corel has a huge tutorial, explaining all aspects of the program. Some
users apparently have found this so valuable they installed CD ROM drives on all office PCs,
just so everyone could use it.

Review area 4: Drawing and embellishing

All of these programs have some advanced drawing features, and all include "clip art" or symbols of at least good quality. For instance, all these programs allow you to do Bezier-curve
editing on drawings. In this feature, lines or objects that you select can be reshaped by
pulling on "control" points. You also can start with a rather lumpy drawing done with a
mouse, and smooth it by eliminating various control points. This form of editing allows you
to produce much more professional-looking results than you would otherwise.
Corel takes this feature a step further, with procedures that can make you look like an artist
in spite of yourself. So while drawing with a mouse still may feel no better than drawing
with a potato, now you may well make the final output look like the work of an illustrator.

Corel, in its Draw module, also goes far beyond all the other programs in drawing and
embellishing. It has so many amazing features that it would take another review like this
one to explain them all. I can scarcely imagine anything you would ever want to do with an
image that falls outside Corel's capabilities.

Harvard Graphics provides a substantial subset of Corel's special-effects magic in an included companion program called Harvard F-X. You can "extrude" two-dimensional
objects (make them three-dimensional), bend and warp things, put text on irregular curves,
fill objects with many interesting textures, and so on. Want your logo redone in stainless
steel? No problem.

Harvard also makes it easy to use intricate "bitmap" fills inside objects. You can quickly
choose how the bitmap gets handled - whether the object in question gets filled with many

small copies of the bitmap, or whether the bitmap should be stretched to fit horizontally or
vertically, or clipped in either direction, and so on.

Charisma 2.1 has an excellent set of basic drawing tools, as well as some advanced ones,
including Bezier-curve editing, joining lines into closed or open figures, and rotation of text
and objects. Charisma 4.0 promises to include many of the advanced drawing and image
manipulation features found in Harvard and Corel. Micrografx, Charisma's parent company,
also makes the finest looking clip art for PCs I have ever seen.

Freelance Graphics 2.01's drawing tools are powerful, simple, and get the job done. Along
with the other alignment commands, it includes options for evenly spacing objects
horizontally and vertically. Freelance calls its clip art "symbols." It keeps the images
organized by subject, so if you want an arrow, you simply open the "Arrows" group, browse
until you find what you want, and paste it into the drawing. Freelance Graphics 2.01 makes
it simple to add clip art to a group, or modify the symbols already in a group. You simply
open a group, just as you might a presentation, and modify whatever you wish.

DeltaGraph, and particularly Stanford, have fewer drawing tools than the others. Stanford
can't align objects with each other (at their left edges, right edges, and so on), except by a
rather difficult system of entering coordinates in a dialog box. Also, Stanford, for all its
amazing ability to rotate 3-D charts, does not rotate text that is not attached to a chart.
DeltaGraph limits text rotations to 90-degree increments. One surprising strength of
Stanford is its ability to fill an object with either a bitmapped or vector image. This feature
works smoothly and quickly, but with fewer controls than Harvard has. Harvard, though, can
fill only with bitmaps.

Review area 5: Transferring files and graphics

Although the programs still have proprietary file formats, they have learned how to talk to
other programs. All come with a wide variety of import and export filters for images. Should
you find another program that does not communicate directly with one of these, you should
almost always be able to copy an entire chart or page onto the Windows clipboard, then
paste it into the other application. (This was exactly what I did to touch up the Corel rating
chart with Charisma.) Most will read in Lotus spreadsheets, dBase files and ASCII data. Data
exporting options can be more limited, but the Windows clipboard can come to the rescue
again. I was able to cut and paste a large Stanford spreadsheet, in which the program had
done many calculations, directly into Microsoft Excel.

Corel again goes the other programs one better. Not only can Corel import and export
bitmap pictures, but it has a conversion program, Corel Trace, that will convert a bitmap
image into a vector image. This means you can, for instance, scan in a logo, and convert the
resulting image (which always is a bitmap) into a vector image. The vector image can then
be smoothed, processed, and so on; when it prints, it will be at the maximum resolution of
your printer. If this is not enough, you can also ask Corel to turn the image into something
looking like a woodcut or engraving along the way. Corel Trace even has an OCR (optical
character recognition) module, so you can scan in text and make the resulting bitmap into
actual letters that you can edit and manipulate just like any other text.

Review area 6: Value for the money

Ratings for Value for the Money

Charisma 2.1 ................... *****
Corel 4.0 ...................... *****
DeltaGraph Pro ................. *****
Freelance Graphics 2.01 ........ *****
Harvard 2.0 .................... *****
Stanford 2.1 ................... *****

If you have gotten this far, you know that these programs have very different
"personalities," so their value to you depends on how you want to use the programs. All
these programs do great things in their own ways. For the right user, each would represent
an excellent value. The recommendations below summarize what I judge are the best uses
for each.

Recommendations

If you need speed above all else, and want a program that does splendid-looking charts and
high precision editing, find Charisma 2.1 before it disappears from the shelves. Recall,
though, that Charisma 2.1 does not have as many presentation-oriented capabilities as
Harvard 2.0 or Freelance Graphics 2.01. Charisma is the program I choose when I want a
basic chart or graph done "exactly so" in the least time.

If you want to prepare professional-looking presentations with a lot of text, and some
charts, as quickly and easily as possible, Freelance Graphics 2.01 would be an excellent
choice. This is the program I turn to first for putting together bullet-point style presentations
in nearly no time.

If you want presentations with a little more charting power, and more in the way of special
effects and drawing, Harvard 2.0 would be an outstanding choice. Freelance Graphics 2.01
seems a little better with bullet-point text, but Harvard has surprising depth in many other
areas. If you want sound advice on how to display data, Harvard can give this to you
continuously and interactively.

If you do not need a program immediately, you might want to wait for Charisma 4.0. If it
lives up to the promise of the "beta" prerelease program, and follows in the footsteps of its
predecessor, it should be outstanding. It promises to have about as much depth as
Freelance Graphics 2.01 handling words, and to handle drawing, special effects, and multi-
media with all the aplomb Harvard shows. And it should offer more charting options than
either, including 3-D charts you can rotate. Given the excellent track record Micrografx has
quietly achieved, I would expect Charisma 4.0 to emerge from final development as a real
winner.

Any of these three programs would make an excellent first program for general use,

whether mostly for charting (Charisma 2.1), or presentations (Freelance Graphics 2.01 or
Harvard 2.0).

Any of the other three programs would make an excellent additional program, adding depth
in special areas to the three above. Again, the one that is "best" will depend upon your
needs.

Corel, of course, could be an excellent first choice also, if your needs go more toward
desktop publishing and advanced image manipulation. I know a few users who start with
Corel in the morning and stay with it nearly the whole day. Certainly, if you want the
ultimate in handling pictures, drawings, photos, words and charts, you could scarcely do
better than this program.

DeltaGraph can provide substantial analytical and 3-D charting capabilities, and at a
reasonable cost. If you need to fit complex functions (description of the curve) and see the
results quickly, this is an outstanding choice. It has recently been available at very good
discounts, so it could be your choice for adding advanced technical capabilities to your
charting repertoire for less than $200 at retail. DeltaGraph also is strictly 100% compatible
with its Macintosh counterparts, which makes trading charts across platforms very easy.

Stanford is even more of an analytical powerhouse than DeltaGraph, so if you want an absolutely astounding variety of charts and graphs, and many analytical functions you will
not otherwise find outside a large statistics program, this would be the choice for you. With
all this power, though, you may need to do a little more work with Stanford to get to the
same result you get more easily from the other programs. However, if you want a charting
program that really lets you analyze the data and present it in as many ways as possible,
look to Stanford.

Overall, as long as you have a good grasp of your charting and presentation needs, you can
scarcely make a poor choice among these programs. Once you find a program with a
personality that fits your needs, you can expect excellent performance and professional
results. If you have not used a charting and presentation package yet, you should be more
than pleasantly surprised. I cannot think of another software category with so many
distinguished offerings.

Sidebar: Bitmaps, vectors, sounds and movies

In the old days - around 1990 - when you used a computer program designed to create or
modify images, you mostly had to worry whether an image was a bitmap or vector-based.
Bitmap images, while often colorful and detailed, never exceeded the resolution at which
they appeared on the PC screen. So if you were working with a VGA screen, the image
would consist of 640 dots (pixels) horizontally and 480 dots vertically. Bitmaps rarely

looked sharp when printed, since standard printer resolution (for a laser printer) is 300 dots per inch (DPI).
The new HP Laserjet IV and many of its competitors now pump out 600 DPI.
So, for a bitmap that fills an entire VGA computer screen to print at 300 DPI, it would need to be about 2.1 inches by 1.6 inches on the page (see the quick check after this paragraph). Sometimes you can tell if a file is
bitmapped by the suffix in its file name. Some popular bitmap formats include .BMP, .PCX,
.PIC and .IMG.
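A quick check of the print-size arithmetic above (screen pixels divided by printer resolution), written as a short Python snippet purely for convenience:

# Pixel dimensions of a VGA screen divided by printer resolution
vga_width_px, vga_height_px = 640, 480
printer_dpi = 300
print(f"{vga_width_px / printer_dpi:.2f} x {vga_height_px / printer_dpi:.2f} inches")
# roughly 2.13 x 1.60 inches at 300 DPI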

Vector-based images, however, always print at the maximum resolution possible for the
output device (usually a printer). Vector-based images may not have the subtle gradations
of color and shading that bitmaps do on the screen, but they usually look much sharper
when printed. In short, vector images always avoid (as much as is possible) the jagged,
rough-looking output that was so characteristic of early efforts from PCs.

In the old days, bitmaps and vectors could not mix. If you wanted to play with bitmaps, you
got a painting program. If your interest was in vectors, you went in for drawing. One of the
earliest versions of Corel Draw caused quite a stir with its ability to mix bitmaps and vectors
in the same image.

Of course, this distinction is not so clear any more. Now PCs can handle huge bitmaps that
are stored at higher resolution than the screen can display. Some programs, like Corel, can
produce images that have qualities like bitmaps and like vectors. For instance, the fractal
textures that Corel produces are detailed and realistic-looking, but print at the highest
resolution of your printer.

In addition, now many other types of objects can go into a presentation, like sound, music
and film clips. Windows now handles specific file types for sounds (.WAV), which can be
attached to presentations. Similarly, you can incorporate film clips and animation directly
into the work you show. All this goes under the heading of multi-media. Perhaps this is the
way things will go, but for the moment, most serious marketers, market researchers,
planners, etc., do not seem too disappointed if the methods section of their report cannot
sing "Like a Virgin," or the concluding summary doesn't chime in with the slow movement of
Haydn's "Surprise Symphony."

Anyhow, if you need such things, Harvard is reputed to be a real multimedia spectacular.
Charisma 4.0 likely will be a strong challenger. Also, Corel 4.0 has a module, Corel Move,
that allows the artistic among us to do their own animation, right on the PC, with the PC
doing much of the hard work.

Additional Readings:
A survey of analysis methods
Rajan Sambandam

Article Abstract

This first of two articles about analysis methods examines key driver analysis, including
single dependent variable, multiple dependent variables, non-linearity, artificial intelligence,
recent advances and tools.

Part I: key driver analysis


Practical marketing research deals with two major problems: identifying key drivers and
developing segments. In the first article of this two-part series we will look at key driver
analysis and in the second part we will look at segmentation.

Key driver analysis is a broad term used to cover a variety of analytical techniques. It always
involves at least one dependent or criterion variable and one or (typically) multiple
independent or predictor variables whose effect on the dependent variable needs to be
understood. The dependent variable is usually a measure on which the manager is trying to
improve the organization’s performance. Examples include overall satisfaction, loyalty,
value and likelihood to recommend.

When conducting a key driver analysis, there is a very important question that needs to be
considered: Is the objective of the analysis explanation or prediction?

Answering this question before starting the analysis is very useful because it not only helps
in choosing the analytical method to be used but also, to some extent, the choice of
variables. When the objective of the analysis is explanation, we try to identify a group of
independent variables that can explain variations in the dependent variable and that are
actionable. For example, overall satisfaction with a firm can be explained by attribute
satisfaction scores. By improving the performance on those attributes identified as key
drivers, overall satisfaction can be improved. If the predictors used are not actionable, then
the purpose of the analysis is defeated.

In the case of prediction, we try to identify variables that can best predict an outcome. This
is different from explanation because the independent variables here do not have to be
actionable, since we are not trying to change the dependent variable. As long as the
independent variables can be measured, predictions can be made. For example, in the
financial services industry, it is important to be able to predict (rather than change) the
creditworthiness of a prospective customer from the customer’s profile.

Beyond the issue of explanation versus prediction, there are two other questions that help
in the choice of analytical technique to be used:

1) Is there one, or more than one, dependent variable?

2) Is the relationship being modeled linear or non-linear?

In the remainder of this article we will discuss analytical methods that would be appropriate
if one or both of these questions is answered in the affirmative.

Single dependent variable

Scaled values
Key driver analyses often use a single dependent variable and the most commonly used
method is multiple regression analysis. A single scaled dependent variable is explained using
multiple independent variables. Typically, the scale for the dependent variable ranges from
five points to 10 points and is usually an overall measure such as satisfaction or likelihood to
recommend.

The independent variables are some measures of attribute satisfaction usually measured on
the same scale as the dependent variable, but not necessarily. There are two main parts to
the output that are of interest to the manager: the overall fit of the model and the relative
importance.

The overall fit of the model is often expressed as R2 or the total variance in the dependent
variable that can be explained by the independent variables in the model. R2 values range
from 0 to 1, with higher values indicating better fit. For attitudinal research, values in the
range of 0.4-0.6 are often considered to be good. Relative importance of the independent
variables is expressed in the form of coefficients or beta weights. A weight of 0.4 associated
with a variable means that a unit change in that variable can lead to a 0.4 unit change in the
dependent variable. Thus, beta weights are used to identify the variables that have the most
impact on the dependent variable.
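
To make the mechanics concrete, here is a minimal Python sketch using the statsmodels library; the data frame, variable names and values are purely hypothetical, not data from any study discussed here.

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical survey data: overall satisfaction plus three attribute
    # ratings, all on a 10-point scale.
    df = pd.DataFrame({
        "overall": [8, 7, 9, 5, 6, 10, 4, 7, 8, 6],
        "service": [9, 6, 9, 4, 6, 10, 3, 7, 8, 5],
        "price":   [7, 7, 8, 5, 5,  9, 4, 6, 7, 6],
        "quality": [8, 7, 9, 6, 7, 10, 5, 8, 8, 7],
    })

    # Fit overall ~ attributes; R2 gives the overall fit and the coefficients
    # give the relative impact of each attribute on overall satisfaction.
    model = smf.ols("overall ~ service + price + quality", data=df).fit()
    print(model.rsquared)   # overall model fit
    print(model.params)     # coefficients (importance weights)

    # To compare beta weights on a common footing, standardize the variables
    # first, e.g., df.apply(lambda s: (s - s.mean()) / s.std()).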

While regression models are quite robust and have been used for many years they do have
some drawbacks. The biggest (and perhaps most common) is the problem of
multicollinearity. This is a condition where the independent variables have very high
correlations among them and hence their impact on the dependent variable is distorted.
Different approaches can be taken to address this problem.

A data reduction technique such as factor analysis can be used to create factors out of the
variables that are highly correlated. Then the factor scores (which are uncorrelated with
each other) can be used as independent variables in the regression analysis. Of course, this
would make interpretation of the coefficients harder than when individual variables are
used. Another method of combating multicollinearity is to identify and eliminate redundant
variables before running the regression. But this can be an arbitrary solution that may lead
to the elimination of important variables. Other solutions such as ridge regression have also
been used. But, if in fact the independent variables truly are related to each other, then

242
Advanced Marketing Research@ Dr. Vikas Goyal, IIM Indore
suppressing the relationship would be a distortion of reality. In this situation other methods,
such as structural equation modeling, that use multiple dependent variables may be more
helpful and will be discussed later in this article.
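
As an illustration of the diagnostics and the factor-score remedy just described, here is a hedged sketch on synthetic data; principal components are used as a simple stand-in for factor analysis, since component scores are uncorrelated by construction.

    import numpy as np
    import pandas as pd
    import statsmodels.api as sm
    from statsmodels.stats.outliers_influence import variance_inflation_factor
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(0)
    n = 200
    service = rng.normal(7, 1.5, n)
    staff = service + rng.normal(0, 0.3, n)    # deliberately collinear with service
    price = rng.normal(6, 1.5, n)
    overall = 0.5 * service + 0.2 * price + rng.normal(0, 1, n)
    X = pd.DataFrame({"service": service, "staff": staff, "price": price})

    # 1) Diagnose multicollinearity: VIFs well above roughly 5-10 signal trouble.
    X_const = sm.add_constant(X)
    vifs = {col: variance_inflation_factor(X_const.values, i)
            for i, col in enumerate(X_const.columns)}
    print(vifs)

    # 2) One remedy: replace the correlated attributes with uncorrelated
    #    component scores and regress the overall measure on those instead.
    scores = PCA(n_components=2).fit_transform(X)
    factor_model = sm.OLS(overall, sm.add_constant(scores)).fit()
    print(factor_model.params)   # coefficients now refer to components, not attributes

As the article notes, the trade-off is interpretability: the coefficients now describe composite factors rather than individual attributes.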

Categorical values
What if the dependent variable to be used is not scaled, but categorical? This situation
arises frequently in loyalty research and examples include classifications such as
customer/non-customer and active/inactive/non-customer. Using regression analysis would
not be appropriate because of the scaling of the dependent variable. Instead, a classification
method such as linear discriminant analysis (or its equivalent, logistic regression) is required.
This method can identify the key drivers and also provide the means to classify data not
used in the analysis into the appropriate categories.
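
A brief sketch of the classification approach with scikit-learn's logistic regression, again on synthetic data with hypothetical column names:

    import numpy as np
    import pandas as pd
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(1)
    n = 300
    X = pd.DataFrame({"service": rng.normal(7, 1.5, n),
                      "price": rng.normal(6, 1.5, n),
                      "quality": rng.normal(7, 1.5, n)})
    # 1 = customer, 0 = non-customer (synthetic rule plus noise)
    y = ((0.8 * X["service"] + 0.4 * X["quality"]
          + rng.normal(0, 1, n)) > 8.5).astype(int)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                        random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

    print(clf.coef_)                   # direction and size of each driver's effect
    print(clf.score(X_test, y_test))   # accuracy when classifying data not used in the fit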

Key driver analyses with categorical dependent variables are often used for both
explanation and prediction. An example of the former is when a health care organization is
trying to determine the reasons behind its customers dis-enrolling from the health plan.
Once these reasons are identified, the company can take steps to address the problems and
reduce dis-enrollment.

An example of the latter is when a bank is trying to predict to whom it should offer the new
type of account it is introducing. Rather than trying to change the characteristics of the
consumers, it seeks to identify consumers with the right combination of characteristics that
would indicate profitability.

Multiple dependent variables

As mentioned above, one problem with multiple regression models is that relationships
between independent variables cannot be incorporated. It is possible to overcome this by
running a series of regression models. For example, if respondents answer multiple modules
in a questionnaire relating to customer service, pricing etc., individual models can be run for
each module. Following this an overall model that uses the dependent variables from each
model as independents can be run. However, this process can be both cumbersome and
statistically inefficient.
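
A sketch of that chained-regression workaround on synthetic data (all module and attribute names are hypothetical):

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(2)
    n = 250
    df = pd.DataFrame({
        "wait_time": rng.normal(7, 1, n),
        "staff_courtesy": rng.normal(8, 1, n),
        "price_fairness": rng.normal(6, 1, n),
        "fees_clarity": rng.normal(6, 1, n),
    })
    df["service_overall"] = (0.6 * df["wait_time"] + 0.3 * df["staff_courtesy"]
                             + rng.normal(0, 1, n))
    df["pricing_overall"] = (0.5 * df["price_fairness"] + 0.4 * df["fees_clarity"]
                             + rng.normal(0, 1, n))
    df["overall"] = (0.7 * df["service_overall"] + 0.3 * df["pricing_overall"]
                     + rng.normal(0, 1, n))

    # Stage 1: one model per questionnaire module.
    service_model = smf.ols("service_overall ~ wait_time + staff_courtesy", data=df).fit()
    pricing_model = smf.ols("pricing_overall ~ price_fairness + fees_clarity", data=df).fit()

    # Stage 2: the module-level overalls become predictors of the top-line measure.
    overall_model = smf.ols("overall ~ service_overall + pricing_overall", data=df).fit()
    print(overall_model.params)

The structural equation modeling approach described next estimates all of these relationships in a single pass.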

A better approach would be to use structural equation modeling techniques such as LISREL
or EQS. In these methods, a single model can be specified with as many variables and
relationships as desired and all the importance weights can be calculated at once. This can
be done for both scaled and binary variables.

By specifying the links between the independent variables, their inherent relationships are
acknowledged and thus the problem of multicollinearity is eliminated. But the drawback in
this case is that the nature of the relationships needs to be known up front. If this
theoretical knowledge is absent, then these methods are not capable of identifying the
relationships between the variables.

Non-linearity

All of the methods discussed so far have been traditionally used as linear methods. Linearity
implies that each independent variable has a linear (or straight-line) relationship with the
dependent variable. But what if the relationship between the independent and dependent
variables is non-linear? Research has shown that in many situations, linear models provide
reasonable approximations of non-linear relationships and thus tend to be used since they
are easier to understand. There are situations however, where the level of non-linearity or
the predictive accuracy required is so high that non-linear models may need to be used.

The simplest extensions to linear models use products (or interactions) of independent
variables. When two independent variables are multiplied and the product is used as an
independent variable in the model, its relationship with the dependent variable is no longer
linear. Similarly, other non-linear effects can be obtained by squaring a variable (multiplying
it with itself), cubing it or raising it to higher powers. Such models are referred to as
polynomial regression models and they have useful properties. For example, squaring a
variable can help model a U-shaped relationship such as the one between a fruit juice’s
tartness rating and the overall taste rating. Other variations such as logarithmic (or
exponential) transformations can also be used if there is a curved relationship between the
dependent and independent variables.
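
A short sketch of a polynomial (squared-term) model for the tartness example, on simulated data:

    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(3)
    tartness = rng.uniform(1, 10, 200)
    # Taste peaks at moderate tartness (an inverted U), plus noise.
    taste = 2 + 1.6 * tartness - 0.15 * tartness ** 2 + rng.normal(0, 0.5, 200)
    juice = pd.DataFrame({"tartness": tartness, "taste": taste})

    # I(tartness ** 2) adds the squared term inside the formula; a negative
    # coefficient on it confirms the curved (inverted-U) relationship.
    quad_model = smf.ols("taste ~ tartness + I(tartness ** 2)", data=juice).fit()
    print(quad_model.params)

    # An interaction (product) of two predictors is written with ':' or '*',
    # e.g., "taste ~ tartness * sweetness".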

The methods described above are not strictly considered to be non-linear methods. In real
non-linear models the relationship between the dependent and independent variables is
much more complex. It is usually in a product form and linearity cannot be achieved by
transforming the variables. Further, the user needs to specify the nature of the non-linear
relationship to be modeled. This can be a very important drawback, especially when there
are many independent variables. The relationship between the dependent and independent
variables can be very complicated, making it extremely hard to specify the type of non-
linear model required. A recent development in non-linear models that can help in this
regard is the multivariate adaptive regression splines (MARS) approach that can model non-
linear relationships automatically with minimal input from the user.
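
To make the "specify the form yourself" drawback concrete, the sketch below fits one particular user-chosen multiplicative form with SciPy; the functional form, data and starting values are all assumptions, and MARS-style automatic fitting would instead come from a third-party package (not shown).

    import numpy as np
    from scipy.optimize import curve_fit

    # A user-specified multiplicative form: y = a * x1^b * x2^c.
    # The analyst must choose this functional form up front.
    def response(X, a, b, c):
        x1, x2 = X
        return a * np.power(x1, b) * np.power(x2, c)

    x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
    x2 = np.array([2.0, 1.5, 3.0, 2.5, 4.0, 3.5])
    y = np.array([2.1, 2.9, 6.2, 5.8, 10.1, 9.0])      # illustrative values

    params, _ = curve_fit(response, (x1, x2), y, p0=[1.0, 1.0, 1.0])
    print(params)   # fitted a, b, c; harder to interpret than linear betas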

Non-linear models are particularly useful if prediction rather than explanation is the
objective. The reason for this is that the coefficients from a non-linear regression are much
harder to interpret than those from a linear regression. The more complicated the model,
the harder the coefficients can be to interpret. This is not really a problem for prediction
because the issue is only whether an observation’s value can be predicted, not so much how
the prediction can be accomplished. Hence, if explanation is the objective, it is better to use
linear models as much as possible.

Artificial intelligence

The title of artificial intelligence covers several topic areas including artificial neural
networks, genetic algorithms, fuzzy logic and expert systems. In this article we will discuss
artificial neural networks as they have recently emerged as useful tools in the area of
marketing research. Although they have been used for many years in other disciplines,
marketing research is only now beginning to realize the potential of these tools. Artificial
neural networks were originally conceived as tools that could mathematically emulate the
decision-making processes of the human brain. Their algorithm is set up in such a way that
they “learn” the relationships in the data by looking at one (or a group of) observation(s) at
a time.

Neural networks can model arbitrarily complex relationships in the data. This means that
the user really doesn’t need to know the precise nature of the relationships in the data. If a
network of a reasonable size is used as a starting point, it can learn the relationships on its
own. Often, the challenge is to stop the network from learning the data too well as this
could lead to a problem known as overfitting. If this happens, then the model would fit the
data on which it is trained extremely well, but would fit new (or test) data poorly.

While complex relationships can be modeled with neural networks, obtaining coefficients or
importance weights from them is not straightforward. For this reason, neural networks are
much more useful for prediction rather than explanation.

There are many types of neural networks, but the most commonly used distinction is
between supervised and unsupervised networks. We will look at supervised networks here
and at unsupervised networks in the next article. Supervised neural networks are similar to
regression/classification type models in that they have dependent and independent
variables.

Back-propagating networks are probably the most common supervised learning networks.
Typically they contain an input layer, output layer and hidden layer. The input and output
layers correspond to the independent and dependent variables in traditional analysis. The
hidden layer allows us to model non-linearities. In a back-propagating network the input
observations are multiplied by random weights and compared to the output. The error or
difference in the output is sent back over the network to adjust the weights appropriately.
Repeating this process continuously leads to an optimal solution. A holdout (or test) dataset
is used to see how well the network can predict observations it has not seen before.
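
A minimal sketch of a small supervised network with scikit-learn, using simulated data and a holdout split to check for overfitting (the architecture and settings are illustrative assumptions):

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPRegressor
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import StandardScaler

    rng = np.random.default_rng(4)
    n = 400
    X = rng.normal(7, 1.5, size=(n, 3))                 # three attribute ratings
    y = 0.5 * X[:, 0] + 0.3 * X[:, 1] * X[:, 2] / 7 + rng.normal(0, 0.5, n)

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                        random_state=0)

    # One hidden layer of five units; early_stopping holds back part of the
    # training data so learning stops before the network overfits.
    net = make_pipeline(
        StandardScaler(),
        MLPRegressor(hidden_layer_sizes=(5,), max_iter=5000,
                     early_stopping=True, random_state=0),
    )
    net.fit(X_train, y_train)
    print(net.score(X_train, y_train))   # fit on the training data
    print(net.score(X_test, y_test))     # a much lower holdout score signals overfitting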

Recent advances

Several recent advances have been made in key driver methodology. The first of these
relates to regression analysis and is called hierarchical Bayes regression. Consider an
example where consumers provide attribute and overall ratings for different companies in
the marketplace. Different consumers may rate different companies based on their
familiarity with the companies. An overall market-level model can be obtained by combining
all of the ratings and running a single regression model across everybody. But if one
could run a separate model for each consumer and then combine all of that information, the
resulting coefficients would be much more accurate than what we get from a regular
regression analysis. This is what hierarchical Bayes regression does and is hence able to
produce more accurate information. Of course, this type of analysis can be used only in
situations where respondents provide multiple responses.
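
A compact sketch of the partial-pooling idea using the PyMC library on simulated ratings; the priors, dimensions and variable names are illustrative assumptions rather than a prescribed specification.

    import numpy as np
    import pymc as pm

    rng = np.random.default_rng(5)
    n_resp, n_ratings, n_attr = 50, 4, 3
    resp_idx = np.repeat(np.arange(n_resp), n_ratings)        # respondent for each row
    X = rng.normal(7, 1.5, size=(n_resp * n_ratings, n_attr)) # attribute ratings per company
    true_beta = rng.normal(0.4, 0.15, size=(n_resp, n_attr))  # respondent-level weights
    y = (X * true_beta[resp_idx]).sum(axis=1) + rng.normal(0, 0.5, n_resp * n_ratings)

    with pm.Model() as hb_model:
        # Market-level (pooled) coefficients...
        mu_beta = pm.Normal("mu_beta", mu=0.0, sigma=1.0, shape=n_attr)
        sigma_beta = pm.HalfNormal("sigma_beta", sigma=1.0, shape=n_attr)
        # ...and respondent-level coefficients shrunk toward them.
        beta = pm.Normal("beta", mu=mu_beta, sigma=sigma_beta,
                         shape=(n_resp, n_attr))
        sigma_y = pm.HalfNormal("sigma_y", sigma=1.0)
        mu = (X * beta[resp_idx]).sum(axis=1)
        pm.Normal("y_obs", mu=mu, sigma=sigma_y, observed=y)
        trace = pm.sample(1000, tune=1000, target_accept=0.9, random_seed=0)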

For classification problems, there have been a series of recent advances such as stacking,
bagging and boosting. In stacking, a variety of different analytical techniques are used to
obtain classification information and then the final results are based on the most frequent
classification of data points into groups in each of those methods. Bagging is a procedure
where the same technique is used on many samples drawn from the same data and the final
classifications are made based on the frequencies observed in each sample. Finally, boosting
is a method of giving higher weights to observations that are mis-classified and repeating
the analysis several times. The final classifications are based on a weighted combination of
the results from the various iterations.
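
scikit-learn ships ready-made versions of these ideas; a brief sketch on synthetic data (note that its StackingClassifier combines base models through a meta-model rather than the simple majority vote described above):

    import numpy as np
    from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                                  StackingClassifier)
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(6)
    X = rng.normal(size=(500, 4))
    y = (X[:, 0] + 0.5 * X[:, 1] ** 2 + rng.normal(0, 0.5, 500) > 0.8).astype(int)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # Bagging: the same learner refit on many bootstrap samples of the data.
    bagged = BaggingClassifier(n_estimators=100, random_state=0)
    # Boosting: misclassified observations get higher weight in each new round.
    boosted = AdaBoostClassifier(n_estimators=100, random_state=0)
    # Stacking: several different techniques combined.
    stacked = StackingClassifier(
        estimators=[("tree", DecisionTreeClassifier()),
                    ("nb", GaussianNB()),
                    ("lr", LogisticRegression(max_iter=1000))],
        final_estimator=LogisticRegression(max_iter=1000))

    for name, clf in [("bagging", bagged), ("boosting", boosted), ("stacking", stacked)]:
        clf.fit(X_train, y_train)
        print(name, clf.score(X_test, y_test))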

Variety of tools

This article has touched upon both traditional methods and recent developments in key
driver methodology that may be of interest to marketing research professionals. The
particular method to be used often hinges on the primary objective - explanation or
prediction. Once this determination is made, there are a variety of tools that can be used
that include linear and non-linear methods, as well as those that employ multiple
dependent variables.

Part II: Segmentation analysis


Article Abstract

Part II of a survey of analysis methods, this article examines segmentation analysis, including
cluster analysis, neural networks, self-organizing maps and mixture models.

Segmentation analysis has been a part of marketing research for decades. It continues to be
useful in a variety of different situations, even when the primary objective of the study is
not segmentation. Since segmentation divides the data into comparatively homogenous
groups, marketing efforts such as targeting, positioning, retention and product development
can be more efficiently performed. While the value of segmentation analysis is rarely
questioned, the methods of developing segments have always given rise to considerable
debate.

One of the simplest ways of segmenting the data is basic crosstabulation analysis.
Respondents can be divided into, say, age or income groups and their differences studied
across a variety of questions. This approach of pre-defining the respondent is often referred
to as a priori segmentation.

Use of a priori segments, while attractive, is often not sufficient given the need to obtain
complex segments based on multiple variables. Therefore, most of the time segments need
to be developed after data have been collected. In this article we will consider various
segmentation methods, both traditional and recent, that can be used to address marketing
research problems. The three methods we will consider are: cluster analysis, neural
networks and mixture models.

It should be noted at the outset that regardless of the method used for analysis, the quality
of the segmentation scheme is determined by its usefulness to the manager. Even if the
statistics indicate that a particular solution is the best one, if it is not useful to the manager
then the segmentation analysis should be seen as a failure. This condition is not as harsh as
it seems, because not only are many different solutions possible with a given set of
variables, but changing the variable set can lead to more solutions. Further, using different
analytical methods can also provide new solutions. Finally, there is also the option of
dividing some of the segments obtained into sub-segments, if that would make them more
actionable.

Next, we will look at each of the segmentation methods mentioned above and how they
work. This will be followed by a discussion on ideas for developing good segments.

Cluster analysis

Cluster analysis is the traditional method used for segmentation in marketing research. This
is actually a family of methods that subsumes many variations and can be broadly classified
under two distinct groups: hierarchical and non-hierarchical (or partitioning) methods.

Hierarchical clustering includes methods where the basic idea is to start with each
observation as one cluster. Each observation is located on an n-dimensional space where n
is the number of attributes used in the analysis. The distances between observations are
measured using some form of distance metric such as Euclidean distance. Based on these
distances, observations that are closest to one another are joined together to form a new
cluster. This process continues until all observations have been merged into a single cluster.
The optimal number of clusters can be determined by looking at standard measures of fit
(statistics such as the cubic clustering criterion, pseudo-f and pseudo-t2) provided for each
cluster solution.

Conversely, it is possible to start with all observations together as one cluster and work
backwards until each observation becomes a cluster by itself. With both variants of the
hierarchical method, the analyst will have to study the results of the analysis to determine
the appropriate number of clusters.
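
A minimal sketch of agglomerative (bottom-up) clustering with SciPy on simulated respondents; the SAS-style fit statistics mentioned above are not computed here, so the cut point of three clusters is simply assumed.

    import numpy as np
    from scipy.cluster.hierarchy import fcluster, linkage

    rng = np.random.default_rng(7)
    # Three loose groups of respondents on four standardized attributes.
    X = np.vstack([rng.normal(loc, 1.0, size=(60, 4)) for loc in (-2, 0, 2)])

    Z = linkage(X, method="ward")                     # bottom-up merging of observations
    labels = fcluster(Z, t=3, criterion="maxclust")   # cut the tree at 3 clusters
    print(np.bincount(labels)[1:])                    # cluster sizes (labels start at 1)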

In the non-hierarchical methods (such as k-means clustering), random observations are
chosen as seeds (or cluster centers) for a pre-specified number of clusters. Thus, the initial
ordering of the data can dictate the formation of clusters. Observations that are closest to a
particular seed are assigned to that seed, thus giving rise to clusters. The analyst then
obtains the fit statistics for a variety of solutions in order to determine the optimal number
of clusters.

Choosing the appropriate number of clusters is never easy even with data sets that are
reasonably well behaved. In commonly used methods like k-means clustering, the analyst
needs to specify the number of clusters desired. This can be problematic, because the
algorithm will assign observations to clusters regardless of whether there are bona fide
segments in the data. The fit statistics that indicate the optimal number of clusters are often
unclear. Sometimes the optimal number of clusters may not make operational sense. In
such cases actionability should be considered before deciding on the optimal number of
clusters. Hence, the process of developing segments from data using cluster analysis has a
high interpretive content.
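
A sketch of that workflow with scikit-learn's k-means, comparing several pre-specified cluster counts on simulated data; silhouette scores stand in for the fit statistics, and the final choice should still weigh actionability.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.metrics import silhouette_score

    rng = np.random.default_rng(8)
    X = np.vstack([rng.normal(loc, 1.0, size=(60, 4)) for loc in (-2, 0, 2)])

    # Try several pre-specified cluster counts and compare a fit statistic for each.
    for k in range(2, 7):
        km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
        print(k, round(silhouette_score(X, km.labels_), 3))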

Neural networks

Artificial neural networks are a recent addition to the variety of techniques used for data
analysis. There are two basic types of neural networks: supervised learning and
unsupervised learning networks. Supervised learning networks can be used in place of
traditional methods like regression and discriminant analysis and were discussed in the
previous article in this series. Unsupervised learning networks are the subject of our
discussion here.

Unsupervised learning networks are generally used when there are no clear distinctions
between dependent and independent variables in the data and when pattern or structure
recognition is required. Since pattern recognition is really what is needed in segmentation
analysis, unsupervised neural networks can be used for this purpose. The type of
unsupervised learning network most appropriate for the problem of segmentation is the
self-organizing map (SOM) developed by Teuvo Kohonen.

Self-organizing map

A typical SOM consists of an input layer and a grid-like structure known as the Kohonen
layer. The input layer contains the variables that are going to be used in the analysis, while
the Kohonen layer is a grid of processing elements. Each of the variables in the input layer is
connected to each of the processing elements in the Kohonen layer. These connections have
random starting weights attached to them before the start of the analysis.

When the information from the first respondent is presented to the network, the processing
elements “compete” with each other. By mathematically combining the first respondent’s
score on each input variable with the weight of each connection, the processing element
with the “winning” score can be determined. Winning implies that this particular processing
element is the one that most closely resembles the input scores of the respondent. This
processing element is called the “winner.” The weights associated with the winner will then
be adjusted to more closely resemble the respondent. The network can be thought of as
learning the response pattern of the respondent.

Not only are the weights associated with the winning processing element changed, but the
weights of the neighboring processing elements are also changed. In other words an area of
the grid is learning the response tendencies of the respondent.

When the second respondent’s data are presented to the network the process is repeated.
If the second respondent is similar to the first, then a processing element from the same
area of the grid wins. Whether it is the same processing element as the last time will depend
on whether the second respondent is exactly similar to the first one. If the second
respondent is very different, then a processing element in a different part of the network
will win.

At the end of this process the grid will show a two-dimensional representation of the data
with different segments showing up as different neighborhoods on the map. Because of the
iterative process described above, substantial segments cannot be formed around outliers.
This is a clear advantage this method enjoys over traditional k-means cluster analysis.
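
A minimal sketch using the third-party MiniSom package (one of several possible SOM implementations) on simulated, standardized inputs:

    import numpy as np
    from minisom import MiniSom

    rng = np.random.default_rng(9)
    X = np.vstack([rng.normal(loc, 1.0, size=(60, 4)) for loc in (-2, 0, 2)])

    # A 10 x 10 Kohonen grid; input_len must equal the number of input variables.
    som = MiniSom(10, 10, X.shape[1], sigma=1.0, learning_rate=0.5, random_seed=0)
    som.random_weights_init(X)
    som.train_random(X, 5000)                 # present respondents in random order

    # Each respondent lands on a "winning" grid cell; neighborhoods of winners
    # are the candidate segments on the two-dimensional map.
    winners = [som.winner(row) for row in X]
    print(winners[:5])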

SOMs also have an advantage in that they were initially developed not just as a data-
reduction tool, but also as a data visualization tool. This capability allows the SOM to
provide a more intuitive understanding of the relationship between the variables and the
segments, hence making the process of developing segments easier. However, some experts
feel the reduction of a multidimensional problem to a two-dimensional space for
visualization can actually be a disadvantage because of the constraints it may impose on the
segmenting process. A further disadvantage in the case of large datasets is the amount of
time required to run the analysis as compared to k-means cluster analysis.

Mixture models

This is another broad category of segmentation methods. The basic idea linking methods in
this category is that the data contain many distributions or segments which are mixed
together. The task of the analysis then becomes one of unmixing the distributions and for
this reason they are also called unmixing models.

One of the major differences between the cluster methods described previously and
mixture models is the prior specification of the number of segments in the data. In non-
hierarchical cluster analysis we have to explicitly specify the number of clusters in the data.
In hierarchical cluster analysis the results are presented for every possible cluster solution
(with the limit being each observation treated as a cluster), thus effectively making the
analyst choose the optimal number of clusters. In mixture models, the assumption of
underlying distributions allows the use of optimization approaches that can automatically
identify the number of segments (distributions) in the data.
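
A sketch of the mixture-model idea with scikit-learn's Gaussian mixture on simulated data, letting an information criterion (BIC) suggest the number of underlying distributions:

    import numpy as np
    from sklearn.mixture import GaussianMixture

    rng = np.random.default_rng(10)
    X = np.vstack([rng.normal(loc, 1.0, size=(60, 4)) for loc in (-2, 0, 2)])

    # Fit mixtures of increasing size and compare BIC for each solution.
    bics = {}
    for k in range(1, 7):
        gm = GaussianMixture(n_components=k, random_state=0).fit(X)
        bics[k] = gm.bic(X)

    best_k = min(bics, key=bics.get)          # lowest BIC wins
    segments = GaussianMixture(n_components=best_k, random_state=0).fit_predict(X)
    print(best_k, np.bincount(segments))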

Another variation of the mixture model approach to segmentation is known as latent
segmentation analysis. While it belongs to the mixture model family, it has some advantages
that might be very useful in a marketing research context. For example, latent segmentation
analysis makes it possible to simultaneously conduct a segmentation and key driver analysis,
where each segment can have its own unique key driver analysis. Thus if a manager is
interested in not just identifying segments but also understanding the key drivers of, say,
satisfaction within each segment, this would be an appropriate method to use. This process
is more efficient than running a segmentation analysis first, followed by separate key driver
runs for each segment.

While mixture models can be very useful in creating segments, they also have some
disadvantages. The primary disadvantage is with the large amount of time required to run
the analysis, especially when compared to k-means cluster analysis. There are also other
disadvantages such as sensitivity to the presence of outliers.

More than one method

While different types of approaches to segmentation analysis have been discussed here it is
not clear that there is one approach that is the best in every situation. Segmentation
analysis often involves trying more than one method to obtain the best result. The main
reason for this is that unlike key driver analysis, segmentation analysis is quite unstructured.
The final solution depends on the number and nature of variables included in the analysis.
Changing even one variable can have a strong impact on the results. Without seeing the
results, however, it is hard to identify the variables that can be useful in the analysis. This
type of circular problem implies that the most important step in a segmentation analysis is
the choice of variables to use. The more thought we put into selecting the variables, the
more likely it is that the results will be useful.

There are a few other steps that can be taken (with any of the methods described here) to
increase the chances of developing good segments. These are:

eliminating outliers;

using as few input variables as possible; and

using input variables with low correlation between them.

Eliminating outliers not only ensures that segments don’t center on them, it also results in
tighter, better-defined segments. Using as few input variables as possible is hard to do, but
very important for deriving useful and timely solutions. Beyond the fact that irrelevant
variables can sabotage the analysis, using too many variables complicates the analysis,
leading to solutions that are not useful. One way of reducing the number of input variables
is to remove those that are highly correlated with other input variables. Further, since
segmentation methods don’t work as well when there is a collinearity problem in the input
variable set, it makes sense to eliminate collinearity as much as possible.
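
A small sketch of that last suggestion, dropping one variable from each highly correlated pair before clustering; the 0.8 cutoff is an arbitrary assumption.

    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(11)
    X = pd.DataFrame(rng.normal(size=(200, 4)), columns=["a", "b", "c", "d"])
    X["e"] = X["a"] + rng.normal(0, 0.1, 200)     # deliberately redundant with "a"

    corr = X.corr().abs()
    # Keep only the upper triangle so each pair is checked once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > 0.8).any()]
    print("Dropped:", to_drop)
    X_reduced = X.drop(columns=to_drop)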

McCullough’s Laws: first principles of commercial data analysis


Author - Richard "Dick" McCullough

Article Abstract

A long-time statistician recalls and reviews a number of data analysis laws and their impact
on the intrepid analysts who are doing their best to make sense of the numbers before
them.

In a career that is quickly - too quickly - approaching 30 years in length, I have stumbled
upon a series of general principles that have been proven, usually by my not employing
them, to successfully guide the earnest analyst as he or she tentatively picks his or her way
through that dark and tangled forest that we often refer to as a commercial data set - all
this in his or her quest for the Holy Grail of data analysis: Truth.

Thus, this article humbly serves to summarize these laws and their accompanying theorems
and corollaries in much the same way as Maxwell summarized the laws of electricity and
magnetism over 140 years ago. Yes, I know. I’m a bit behind.

If you still have trouble understanding the difference between accepting the null hypothesis
and failing to reject it, you will find the first law extremely useful. Forget all that conceptual
nonsense and apply the first law with vigor. You’ll be fine.

McCullough’s First Law of Statistical Analysis: If the statistics say an effect is real, it
probably is. If the statistics say an effect is not real, it might be anyway.

This is true because none of you bother to look at beta errors (don’t worry, I don’t either). I
mean, who’s got the sample size, anyway? If you do worry about such things as beta errors
and power curves (and you know who you are), either you are an academic (and should
have stopped reading this article long ago) or you are in desperate need of an appropriate
12-step program. When in doubt, see your nearest HR representative.

Douglas MacLachlan, a distinguished professor at the University of Washington, was kind
enough to let me repeat a law he often shares with his graduate students, which captures
the spirit and intent of my first law very well:

MacLachlan’s Law: Torture any data set long enough, and it will confess.

The point being, of course, don’t quit! Data don’t yield themselves up to the dashing data
monger like some chambermaid from a gothic novel. No, data are reluctant lovers that must
be coaxed and wooed (and occasionally slapped around a bit). The successful data analyst is
the one who is tenacious. Remember, data are not people. If they say no (and even if they
mean it), you don’t have to listen. Pretend they meant yes. Forge ahead.

It turns out, rather unfortunately, that Professor MacLachlan means exactly the opposite to
the above when he quotes MacLachlan’s Law. His point is you can artificially manufacture
from your data set virtually any story you want by exhaustive and indiscriminate analysis
(not to be confused with discriminant analysis, which is an entirely different cup of tea). But
that quickly leads us into a philosophical discussion of theory-driven versus data-driven
models. Nobody wants to go there, believe me. Let me just say if you follow, with the
dedicated fervor of an English soccer hooligan, McCullough’s Second Law (see below), or
more specifically, the Corollary to the Second Law (see below), you will safely steer clear of
any trouble with MacLachlan’s Law (as interpreted by MacLachlan).

McCullough’s Second Law of Statistical Analysis: Never, ever confuse significance with
importance.

Imagine a battery of 100 brand imagery statements. Now imagine the master you serve
wants you to test for significant differences between right-handed respondents and left-
handed respondents (don’t pretend you haven’t faced similarly mind-numbing requests
with a toothy smile and a perky “Good idea, sir!”). Dutifully, you conduct 100 pairwise tests.
Not surprisingly, you find five statements have significantly different mean ratings for right-
and left-handed persons. This finding is likely unimportant for two entirely different
reasons.

Of course, 100 pairwise tests are apt to generate some statistically significant differences by
accident. I mean, by definition the tests are only accurate 95 percent of the time, right? So
they’re inaccurate 5 percent of the time. There are easy ways around this. Go look them up
(see if you can find Fisher’s pooled alpha - obscure but cool). In the meantime, ignore these
five differences because they are very probably spurious.
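
One standard way around the problem is to adjust the p-values for multiple comparisons before declaring any difference real; a minimal sketch with statsmodels, using stand-in p-values:

    import numpy as np
    from statsmodels.stats.multitest import multipletests

    rng = np.random.default_rng(12)
    p_values = rng.uniform(0, 1, size=100)    # stand-in for the 100 raw p-values

    reject, p_adjusted, _, _ = multipletests(p_values, alpha=0.05, method="holm")
    print(reject.sum(), "differences survive the multiple-comparison correction")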

These differences are likely to be unimportant in a second, more important way. At least in
my example, these differences are likely to not tell you anything that you can use in your
business practice. Suppose left-handed respondents truly do think Brand A is very slightly
more “frivolous” (or “playful” or “precocious” or “insouciant,” ad nauseam) than Brand B
(7.8 vs. 7.7 out of 10, say). Now what? Shall we hang a $10 million ad campaign on this
statistically significant finding? How?

The careful reader will note that I said these five differences are likely to be unimportant.
What if all five differences are consistent with one another? That is, what if they provide
face validity for each other? Tell a compelling story that makes sense and is actionable? In
that unlikely event, skip the next corollary and law and proceed directly to McCullough’s
Third Law (do not pass Go; do not collect $200).

Corollary to the Second Law: If it doesn’t make sense, don’t do it.

The Corollary to the Second Law could also be called the First and Most Important Principle
of Confirmatory Analysis because it is the cornerstone of confirmatory modeling, such as
structural equations or confirmatory factor analysis. Statistical principles, like religious ones,
should stand the test of common sense. Otherwise, you run the risk of committing atrocious
crimes against humanity (or numbers) that violate the very principles you sought to uphold.
To be blunt, if the sign of your regression coefficient is opposite to what you know to be
true, deal with it! Check for collinearity, double-check the data set for coding errors, toss the
bugger out if you have to. But don’t just leave the absurdity in your model because the data
said so. The data didn’t say so any more than a rock tells you to throw it through a window.

Don’t get me wrong: It’s alright to explore a data set. Try things out. Play around.
Experiment. Just examine your conclusions with a healthy dose of reality, a.k.a. cynicism.
The first time you try telling a grizzled veteran of the trenches that his or her sales are not
affected by a key competitor’s drop in price, you’ll appreciate the wisdom (painfully earned,
I might add) of the Corollary to the Second Law.

The Rolling Stones Law: You can’t always get what you want. But if you try sometimes, you
just might find, you get what you need.

Besides being lyrics from a great song, the Rolling Stones Law reminds us that we are not,
despite being the high priests and holy gatekeepers of Information, in control (and given
my track record, this is a very good thing). It is natural to approach a data set with some
preconceived ideas about what is going on. But we can’t let those prejudices influence our
search through the data (remember MacLachlan’s interpretation of MacLachlan’s Law?). We
must accept what the data god gives us, even if it isn’t the answer that our president’s wife
thinks that it should be. We must strive for Zen-like detachment and accept what is. Mick
and the gang assure us it will be enough. Keep the faith, brothers and sisters!

McCullough’s Third Law of Statistical Analysis: If you can’t tell what it is, it ain’t art.

Apologies to Jackson Pollock, et al. All the hoity toity statistics are nice, particularly as
entertainment for us quant jocks, but the ultimate goal, as stated earlier, is to find Truth. If
you have a seemingly random collection of statistically significant differences, a la Pablo
Picasso, please see the second law above because you haven’t found Truth. If you have a
coherent picture, a la Leonardo da Vinci, confirmed and/or corroborated by numerous
independent data points, you’re likely to be barking up Truth’s tree, whether you have
official significance or not. Remember:

McCullough’s Law of Small Samples: Give me a sample small enough and all means will be
statistically equal.

Sample size is like dirt on a window pane. The smaller the sample size, the dirtier the
window. If there are two sprawling, old oak trees majestically parked side by side in front of
the window, a small sample size will make them harder to see but it won’t have any effect
on whether or not there are two trees instead of one. If you stare through five dirty
windows and think you see the hazy outline of two big oak trees each time, there’s probably
two different trees out there. Through one window alone, it might just be random patterns
in the dirt.

Not wishing to confuse you, but there is a flip side to the Law of Small Samples:

McCullough’s Law of Large Samples: Give me a sample large enough and all means will be
statistically significant.

The problem with data analysis (survey data, anyway) is that it comes with some
assumptions that are usually well hidden. And, in the real world, these assumptions are
almost never fully realized. As sample size gets larger, the window gets clearer (see above).
And when staring at an apparently statistically significant difference, we know we’re seeing
something. The question is what.

Let’s say we did an online survey with a million people. They rated five brands on
preference. Just preference. There may be the slightest bit of fatigue that sets in by Brand
No. 5. Or maybe some respondents have a very small computer screen and have to scroll
over to see Brand 5. Scrolling annoys them and they subconsciously take it out on Brand 5.
These very tiny influences may show up as statistically significantly lower brand preference
ratings for Brand 5 even if Brand 5 is equally preferred to the other brands. The large sample
size allows us to identify a non-random effect. It just might not be the effect we are
expecting. Can you say “measurement error”?

It’s sad, isn’t it? All of us who enter the Brotherhood do so with the fervent desire of first
finding then standing on firm ground. It’s what attracts us in the first place. All these rules,
equations, tedium. There must be a reward. And that reward is certainty. Alas, no.

But let’s not dwell on this too long, ok?

McCullough’s Law of Cluster Analysis: The name of the cluster is always more important
than the cluster.

Keep in mind that finding Truth (or your somewhat imperfect version of it) won’t do much
good if you can’t explain it to someone else, most likely someone in marketing. You can’t
expect a marketing person to digest a series of distribution comparisons. If he or she could,
they wouldn’t be in marketing. No, the sum total of learning to be gleaned from your three
weeks of segmentation analysis will be contained in the names of your clusters, whether
you like it or not. So name them carefully. Ditto for any other golden nuggets of knowledge
you wish to share with the unwashed masses. In sum: Be brief. Be simple. Be clear.

Research dollars come straight off the bottom line. If we don’t help the guys on the firing
line make more money, then we’ve helped them make less. So let’s bend over backwards to
help them “get it.” Speak plain English (or whatever).

My last two laws may be the most important. They are both yellow caution flags telling the
avid analyst to beware the twin pitfalls of enthusiasm and earnestness. I’ve saved them for
the Big Finish (drum roll, please).

McCullough’s Law of Statistical Fashion: Give a kid a hammer, and the world becomes a
nail.

Been there, done that. Like a kid at Christmas with a new toy, the analyst, armed with a
recently mastered (or not) statistical technique, sees every marketing problem as an
opportunity to practice his newly acquired vocation. We analysts must view the ever-
increasing portfolio of powerful statistical techniques available to us in user-friendly drop-
down menus (and soon portable to your cell phone and/or wrist watch) as a selection of
beautiful arrows in our analytical quiver. Each arrow serves its own unique purpose. The
competent analyst will be prepared to use whatever arrow is right for his or her business
problem, not vice versa.

McCullough’s Law of Marketing Impact: Any direction is better than no direction.

Analysts seek Truth. It is what we do. Finding it with certainty is problematic, however, for
numerous reasons. If marketing is a voyage, research is the compass. If you were lost in a
mountain range full of iron ore, would you sit still until you were certain your compass was
accurate or would you start walking while there was still light? If you’re a true stat geek,
you’ll be frozen solid by morning while the marketing guys will be sipping lattés at a
sidewalk café.

When you study a data set, the easiest (and most cowardly) path is “We didn’t see anything
significant.” Sometimes, if you’ve done an extraordinarily poor job designing your study in
the first place, this may be true. But most of the time, there are stories in there. Some of
them take a little more digging to get to, but they are there. Dig. Paint a picture. Suffer for
your art. The more independent data points you can find that sing the same song, the more
likely you are at least getting warm (remember the dirty windows). Businesses rarely
succeed by inaction. And most marketers rarely remain marketers through inaction. Get in
there and help. So you’re stretching the data set a little bit. It’s not like you’re pulling the
wings off butterflies. It’s OK. Data don’t have feelings. Data don’t cry. Search your data with
an open mind, a pure heart and a ruthless spirit.

Do you really want those marketing guys stumbling down the mountain without a
compass?

Time series analysis: what it is and what it does


Author- Kevin Gray

Article Abstract

Most marketing research is cross-sectional but time series analysis is an often-overlooked
but valuable tool. This article offers an overview of univariate analysis, causal modeling,
multiple time series and more.

Most marketing research is cross-sectional, meaning our data represent one slice in time.
However, we also have data collected over many periods, such as weekly sales data for our
brands and competitors' brands. This is an example of time series data. Time series analysis
is a specialized area of statistics to which many marketing researchers have had limited
exposure, despite it having many important applications in MR.

Why is the distinction between cross-sectional and time series analysis important? For
several reasons - one being that research objectives are usually different. Another is that
most of the statistical methods we learn in college and use in marketing research are
intended for cross-sectional data and if we apply them to time series data, results may be
misleading. Time is a dimension in the data we need to take into account.

Time series analysis is a complex topic but to put it simply, when we use our usual cross-
sectional techniques (e.g., regression) on time series data, one or more of the following
outcomes can occur.

Standard errors can be far off. More often than not, p-values will be too small and variables
can appear more significant than they really are.

Regression coefficients can be seriously biased.

We do not maximize the information provided by the serial correlation in the data.

Table 1 shows a simple illustration of what a time series data file looks like. The column
labeled "Date" is the date variable and corresponds to a respondent ID in survey research
data. "Week," the sequence number of each week, is included because using it (rather than
actual date) reduces graph clutter. The sequence number can also serve as a trend variable
in certain time series models. In this illustration, "Sales" are the number of units sold each
week.

Univariate analysis

To build on the example in Table 1, a possible objective would be to forecast sales. There
are many ways to accomplish this and the most straightforward would be through
univariate analysis, in which we basically extrapolate future data from past data. Two
popular univariate time series methods are exponential smoothing (e.g., Holt-Winters) and
ARIMA (autoregressive integrated moving average). In Figure 1, one year of historical sales
data have been used to forecast sales one quarter ahead with an ARIMA model.
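
A minimal sketch of the univariate approach with statsmodels, fitting an ARIMA model to a simulated weekly sales series and forecasting 13 weeks ahead; the (1, 1, 1) order is an assumption that would normally come from diagnostics or an automated order search.

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.arima.model import ARIMA

    rng = np.random.default_rng(13)
    weeks = pd.date_range("2023-01-01", periods=52, freq="W")
    # Simulated weekly unit sales with a gentle upward drift.
    sales = pd.Series(1000 + np.cumsum(rng.normal(5, 40, 52)), index=weeks)

    model = ARIMA(sales, order=(1, 1, 1)).fit()
    print(model.forecast(steps=13))           # one quarter (13 weeks) ahead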

Causal modeling

Obviously, there are risks in assuming the future will be like the past but, fortunately, we
can also include causal or predictor variables to mitigate these risks. Besides improving the
accuracy of our forecasts, another objective may be to understand which marketing
activities - ours and competitors' - most influence sales. Causal variables will typically
include data such as GRPs and price and also may incorporate data from consumer surveys
or exogenous variables such as GDP. These kinds of analyses are called market response or
marketing-mix modeling and are a central component of ROI analysis. They can be thought
of as key driver analysis for time series data. The findings are often used in simulations to
find the optimal marketing mix.

Transfer function models and dynamic regression are two popular approaches to time series
causal analysis. Essentially, they refer to specialized regression procedures developed for
time series data. There are also more sophisticated methods and I'll highlight a few in just a
bit.
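
A hedged sketch of a regression-with-ARIMA-errors model (one way to implement the dynamic-regression idea) using statsmodels, with simulated GRPs and price as causal variables:

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.statespace.sarimax import SARIMAX

    rng = np.random.default_rng(14)
    weeks = pd.date_range("2023-01-01", periods=52, freq="W")
    grps = pd.Series(rng.uniform(50, 150, 52), index=weeks, name="grps")
    price = pd.Series(rng.uniform(4.5, 5.5, 52), index=weeks, name="price")
    sales = 800 + 2.0 * grps - 60 * price + rng.normal(0, 30, 52)

    exog = pd.concat([grps, price], axis=1)
    mod = SARIMAX(sales, exog=exog, order=(1, 0, 0)).fit(disp=False)
    print(mod.params)                          # includes the GRP and price effects

    # Forecasting needs assumed future values (a scenario) for the causal series.
    future_exog = exog.tail(13).to_numpy()
    print(mod.forecast(steps=13, exog=future_exog))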

Multiple time series

There are situations in which you might prefer to analyze multiple time series
simultaneously (e.g., sales of your brands and key competitors). Figure 2 shows weekly sales
data for three brands over a one-year period. Since sales movements of brands competing
with each other will typically be correlated over time, it often will make sense - and be more
statistically rigorous - to include data for all key brands in one model. Vector autoregression
(VAR) and the more general state space framework are two frequently-used methods for
multiple time series analysis. Causal data can be included and market response/marketing-
mix modeling carried out.
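
A minimal sketch of a vector autoregression on three simulated brand-sales series with statsmodels:

    import numpy as np
    import pandas as pd
    from statsmodels.tsa.api import VAR

    rng = np.random.default_rng(15)
    weeks = pd.date_range("2023-01-01", periods=104, freq="W")
    noise = rng.normal(0, 30, size=(104, 3))
    brand_sales = pd.DataFrame(1000 + np.cumsum(noise, axis=0), index=weeks,
                               columns=["brand_a", "brand_b", "brand_c"])

    var_results = VAR(brand_sales).fit(maxlags=8, ic="aic")   # AIC picks the lag order
    last_obs = brand_sales.values[-var_results.k_ar:]
    print(var_results.forecast(last_obs, steps=13))           # joint 13-week forecast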

Other variations of time series models

There are several additional time series methods relevant to marketing research.

Panel models include cross-sections in a time series analysis. Series for several brands, for
instance, can be stacked on top of one another and analyzed simultaneously. Why not use
VAR or another multiple time series method instead? We may not have a sufficient number
of observations (points in time) for these approaches to be feasible or perhaps we're mainly
interested in looking at the product category as a whole.

In some instances, one model will not fit an entire series well because of structural changes
within the series and model parameters varying across time. There are numerous
breakpoint tests and models (e.g., state space, switching regression) available for these
circumstances.

You may also notice that sales, call center activity or other tracked data series exhibit
clusters of volatility. That is, there may be periods in which the figures move up and down in
a more extreme fashion than other periods and you may not be able to explain why this is
happening on the basis of seasonality or other business or economic reasons. Figure 3
illustrates this kind of pattern.

In these cases, there is a class of models under the name of GARCH (generalized
autoregressive conditional heteroskedasticity) you might consider. ARCH and GARCH
models were originally developed for financial markets but can be used for other time series
data when volatility is of interest. Volatility can fall into many patterns and accordingly there
are many flavors of GARCH models. Causal variables can be included. There are also
multivariate extensions for situations in which you have two or more series you wish to
analyze jointly.
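
A brief sketch using the third-party arch package to fit a GARCH(1, 1) model to a simulated series whose volatility shifts partway through:

    import numpy as np
    import pandas as pd
    from arch import arch_model

    rng = np.random.default_rng(16)
    # Simulated weekly changes with a calm stretch followed by a volatile one.
    changes = pd.Series(np.concatenate([rng.normal(0, 1, 100),
                                        rng.normal(0, 3, 56)]))

    am = arch_model(changes, mean="Constant", vol="GARCH", p=1, q=1)
    res = am.fit(disp="off")
    print(res.params)
    print(res.conditional_volatility.tail())   # the estimated volatility path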

The methods I've mentioned are time domain techniques and often employed in operations
research and econometrics, as well as in marketing research. Another family of methods,
known as the frequency domain, plays a more limited role in MR.

Scratched the surface

I've barely scratched the surface of a rich and multifaceted set of topics that are new to
many marketing researchers but are more and more becoming an important part of our
world. Fortunately for readers wishing to learn more about these methods, there are many
excellent introductory textbooks available (e.g., William W.S. Wei), as well as those covering
specific topics in depth.

Social Media Data Analysis:
Cracking the code of social media data analysis
Author - Thomas Malkin

Article Abstract

While analyzing data from social media is something new to researchers, one way of
extracting meaning from the data isn’t. Old-fashioned coding - with an assist from
technology, of course - is an approach that can generate useful insights and bring clarity to a
seemingly murky process.

Making sense of the chatter

For some researchers, information generated by social media has become a worthwhile
source of complementary data. When the directional insights from social media are
integrated with the representative opinions garnered from traditional marketing research,
they give full voice to the needs, wants and viewpoints of the customer and can help
achieve optimal decision-making. But for many in our industry, the task of making sense of
the flood of words generated by the social media outlets is daunting, to say the least.

One way to extract insights from the torrent of consumer opinions is to apply the
methodology market researchers use to analyze open-ended survey questions: coding. The
coding approach to analyzing unstructured comments involves readers quantifying positive,
negative or neutral opinions after they’ve matched the subjects (i.e., “Cell Phone Product
X”) in the text with categories or classifications. These categories typically represent
consumer passions or issues that drive purchasing decisions (i.e., opinions around apps or
multimedia for the cell phone product category).

The challenge in adapting coding to social media is that reading through the vast amount of
constantly-changing data on the Internet is overwhelming and expensive. Technology is thus
needed to automate coding so the high volume of historical and daily social media data is
not a barrier to obtaining insights from this valuable trove of data.

For automated coding to achieve accuracy, especially in social media where people
communicate with familiarity in blogs, forums and social utility sites like Facebook, one
approach is to go beyond keyword-based technology and take into account the implicit
subject, implicit issue and context-dependent sentiment. An example of an implicit subject
is, “This phone has a great battery life,” in which the type of phone isn’t explicitly
mentioned. An example of an implicit issue is, “This camera is too large,” in which the
category or classification - size - is not explicitly mentioned. And an example of a context-
dependent sentiment is, “This battery life is long,” in which “long” makes this opinion a
positive sentiment whereas long means something negative in the opinion, “This movie line
is long.”
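
A toy sketch of that coding logic (entirely hypothetical rules, far simpler than a production system), showing how the sentiment of "long" flips with the issue it is attached to:

    # Context-dependent sentiment: the same descriptor scores differently by issue.
    SENTIMENT_RULES = {
        ("battery life", "long"): "positive",
        ("battery life", "short"): "negative",
        ("movie line", "long"): "negative",
    }

    def code_comment(issue, descriptor):
        """Return positive/negative/neutral for an (issue, descriptor) pair."""
        return SENTIMENT_RULES.get((issue, descriptor), "neutral")

    print(code_comment("battery life", "long"))   # positive
    print(code_comment("movie line", "long"))     # negative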

This type of rubric allows researchers to compare brands or products to each other based
on the issues that resonate with consumers rather than just brand buzz. For example, if
Brand X has an 83 percent positive sentiment on the topic of customer satisfaction (the 83
percent representing the share of positive opinions among all positive and negative opinions
expressed), that could very
well meet the prior expectations of Brand X. However, if Brand X’s competitors are
achieving a positive sentiment on customer satisfaction greater than 87 percent, Brand X
now has a decision to make as to whether it wishes to do something about this disparity.
Doing a competitive trend analysis over 12 months on customer satisfaction would give
direction to Brand X and its competitors as to when each brand experienced more or less
positive and negative opinions (i.e., daily, weekly, monthly, etc.) and where in social media
such opinions were expressed. The advantage that social media provides over other data
sources is that Brand X can efficiently learn why consumers consider its competition to be
better on customer satisfaction by simply drilling down to read the actual comments about
Brand X or all of its competitors from their original online sources.

See the correlations

With the availability of social media-generated input, a starting point for decision-making is
to get a measurable, story-style read on a product’s or brand’s positioning relative to a
competitive set on the issues that drive purchasing decisions. Researchers can now not only
see how many opinions have been expressed on three dimensions of data - the subjects,
issues and sentiment - but also see the correlations among those three variables as well. In
addition, issues or subjects that may not be talked much about and thus typically
overlooked by decision makers can become directionally insightful if they are spoken of in
the context of or relative to something else.

Let’s say a beer brand in the craft beer category is trying to win market share by going
green. In a 12-month study on the category, the issue or code “green initiative” is not
spoken of very much among the several brands in the competitive set. But when opinions
on this issue are expressed, they are usually done so in the context of a couple of brands in
particular, and both brands have distinctly different volumes of social media. Being able to
visualize how the opinions correlate (see Figures 1 and 2 in sidebar) and then drill down to
understand the actual story around “green initiative” provides directional insights for any
brand in the craft beer category. This is especially true if the brand in question is not being
acknowledged for its efforts by consumers across millions of social media outlets.
Additionally, insights can be obtained by seeing stories emerge amongst the other issues
measured in that study.

Once a historical read on a product category is obtained, monitoring becomes more
impactful. One new form of measurement is the monitoring of changes in social media
thought leadership on issues that resonate with consumers before and after an event.
Emerging trend analysis and crisis management also becomes more thoughtful and less
reactionary because researchers can get a better read on what course of action to take once
they know the issues discussed, how they’re evolving over time and how consumers are
talking about those issues (especially in the context of and relevant to other issues or
subjects). Such granular monitoring can also provide greater insights for communication
strategies since opinions are quantified and weighted by source based on discussions on
both the subject and issue.

More breadth

Gaining an understanding of product categories based on the voice of the crowd in such a
granular manner creates many applications for market researchers. They can benchmark
the insights obtained against prior quantitative and qualitative research and see more
breadth in the data that may otherwise have been missed. Typically, many more questions
than answers are raised, leading to further exploration using traditional marketing research
and, one hopes, a clearer and more complete picture of the market segment in question.

Using social media to track the Tiger Woods saga

For a more detailed example of how a marketing research firm has used social media
research, Thomas Malkin spoke to Jon Last, president of Sports and Leisure Research Group,
White Plains, N.Y.

How have you used social media in conjunction with your traditional marketing research?

“We tracked the magnitude and tonality of Web conversation and opinion about Tiger
Woods over 1,100 disparate and relevant Web sites from January 2009 through mid-March
2010, right after Woods’ public statement in February, to see if it was consistent with our
attitudinal survey research that was part of our winter 2010 omnibus study. In the omnibus
study, a national sample of nearly 1,000 avid golfers agreed that the rancor regarding the
transgressions of golf’s greatest player would dissipate significantly by the summer months.
Further, this study suggested that for the most engaged and passionate fans, Tiger’s on-
course achievements far outweighed any personal shortcomings.”

What were your findings upon benchmarking the insights obtained from social media with
those from your prior attitudinal research?

“As one might expect, the level and tonality of buzz regarding Tiger was at its peak
immediately after his November accident. But this chatter quickly and precipitously dropped
in the first month of the new year, spiking again, though at nowhere near the level of
November, around his mid-February statement. By March, the level of online conversation
was back to pre-scandal levels.

“Upon further assessment, we looked at the tonality of the conversations pertaining to
those opinions on Tiger. We developed codes that could be used to measure the tonality
relative to some of the findings of our prior attitudinal research, selecting ‘admiration,’
‘apology accepted,’ ‘credibility,’ ‘disappointment,’ ‘doubtful,’ ‘inspiration,’ ‘marketability’
and ‘trustworthiness,’ among others. Our analysis demonstrated that conversations in social
media were on par with the conclusions that we drew from the traditional quantitative
study.

“From an illustrative standpoint, if you take a look at an issues trend analysis, you’ll see that
before the crisis emerged on November 20, 2009, conversations on Tiger’s character were
focused on two issues: admiration and inspiration. From November to January, new
conversations emerged on the issues of character, namely trustworthiness, marketability
and credibility. Conversations around disappointment and doubt, which previously included
opinions solely about his golf game, now included those about his character as well. In the
second peak period in February, disproportionally fewer conversations were occurring
under inspiration and trustworthy with new conversations emerging under apology
accepted. Thus we were able to use the social media analysis to validate the earlier
hypotheses drawn by our quantitative research. Golf fans were beginning to quickly move
past the issue!

“Another illustration of the findings can be seen in the heat maps [Figures 1 and 2], both
prior to the crisis and at the tail end of it. Prior to November 20 [Figure 1], you can visualize
the intensity of conversations on Tiger himself, the issues or pre-determined codes and
sentiment by looking at both the size and the color of the circles. [The smaller the circle and
the lighter the color, the lower the frequency of conversation; the larger the circle and
darker - i.e., green - or hotter the color - i.e., yellow - the higher the frequency of
conversation, with the reddish colors showing the highest intensity.] You can also see the
statistical correlations of the three data dimensions - subject, issue and sentiment - as
indicated by how close the issues are to each other. [The closer they are to each other, the
higher the correlation.] Thus, it’s apparent that conversations around admiration and
inspiration are closer to positive sentiment and spoken of in the same context relative to
other issues before the crisis emerged. As you can see on the second heat map [Figure 2], by
March 22, the story was quite different.”

Figure 1: Tiger Woods Social Media Story Before November 20, 2009

Figure 2: Tiger Woods Social Media Story From March 10 to March 22, 2010

What are your conclusions about benchmarking insights from social media with those from
your prior attitudinal research?

“While such analysis is not as representative as a well-designed quantitative attitudinal study, it does yield strong directional insights and provides breadth to our earlier findings. So, while I won’t go as far as Ad Age did in suggesting the imminent demise of traditional survey research, I will assert that this capability presents us with a valuable new tool to enhance our understanding of fan sentiment. At the risk of further propagating the sound-bite society, I’d maintain that social media in concert with formal research is a conversation that all market researchers should be paying attention to.”

Analyzing the content of social media data


Author - Ann Veeck

Article Abstract

The author provides a framework for analyzing social media data, drawing comparisons to
qualitative data analysis while also outlining how and where social media data requires a
specialized approach.

Beyond monitoring

Among the many new sources of consumer information that have emerged in the last
decade, social media data are among the most potent and game-changing for effective
marketing research. Social media platforms offer a powerful opportunity to gain immediate
access to the unfettered opinions of consumers. Many companies are aware of the value of
using social media data to gain marketing insights. But there is so much information out
there. How can businesses tap this source to obtain deep, actionable insights?

A number of excellent programs and services – some free and some commercial – have
been developed for the analysis of social media data. Yet, the focus of the vast majority of
these tools is to provide summary statistics of the data. Web analytics – for example, word
counts, reach, word clouds, volume, sentiment analysis – can provide valuable, up-to-the-
minute snapshots of Web content. Still, no algorithm is an adequate replacement for the in-
depth analysis of consumer-generated feedback that can be conducted by a skilled analyst
with a deep understanding of a brand and its challenges and opportunities.
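To make concrete what these summary statistics look like, the following is a minimal sketch in Python (a language chosen purely for illustration; the posts and the tiny sentiment word lists are invented) of the kind of word-count and sentiment tallies that Web analytics tools automate. It also shows why such output is only a snapshot: the tallies say nothing about context or meaning.

    import re
    from collections import Counter

    # Invented consumer posts; in practice these would come from a
    # monitoring tool or an API export.
    posts = [
        "Love the new flavor, will definitely buy again!",
        "Terrible customer service, never ordering from them again.",
        "The packaging is great but the price is too high.",
    ]

    # Toy sentiment word lists -- real services use far richer models.
    positive_words = {"love", "great", "definitely"}
    negative_words = {"terrible", "never"}

    word_counts = Counter()
    sentiment = {"positive": 0, "negative": 0}

    for post in posts:
        words = re.findall(r"[a-z']+", post.lower())
        word_counts.update(words)
        sentiment["positive"] += sum(w in positive_words for w in words)
        sentiment["negative"] += sum(w in negative_words for w in words)

    print(word_counts.most_common(5))  # the raw material for a word cloud
    print(sentiment)                   # a crude sentiment tally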

So, how can analysts move beyond reporting superficial summary data to acquire strong and
actionable insights from consumers? Fortunately, a model for the in-depth analysis of social
media data already exists in the example of the best practices that have been used by
research analysts for decades to analyze qualitative data. With an understanding of the
differences between social media data and traditional forms of qualitative data, qualitative
data analysis can be applied with modifications to the analysis of social media data.

Important differences

While the process for analyzing the content of social media data is similar to that used for
qualitative analysis, important differences must be taken into consideration. The following
are the steps for analyzing social media data.

Step 1: Develop a problem definition and research objectives.

For most research, developing focused research objectives is usually the most important
step. What decisions will be made with this information? This guideline holds particularly
true for social media analysis where a clear direction is needed to make sense of the copious
amount of data. Limiting the focus to a defined topic and specific objectives will make the
analysis more manageable. Still, to take full advantage of social media data analysis, the
research objectives should also allow for an element of discovery. The data may lead to
unexpected places.

The following are examples of objectives that social media analysis is particularly suited to
address: competitive analysis; product extensions; product strengths and weaknesses; new
uses of products; and reactions to advertising and promotions.

Step 2: Identify key search terms

The identification of the proper key search terms is a crucial step to the successful analysis
of social media data. The process is often an iterative one, with broader searches being
followed by searches using combinations of terms or newly discovered synonyms or
tangential phrases. Obvious terms to start a search include the product’s brand name,
competitors’ brand names and the product class. More exploratory analyses might
investigate activities, events and emotions related to a brand.
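As a simple illustration of this iterative build-out of search terms, here is a minimal sketch in Python; the brand and issue terms are invented placeholders, and the exact query syntax would depend on the search tool being used.

    from itertools import product

    # Invented starting terms; real ones come from the Step 1 objectives.
    brand_terms = ["AcmeCola", "Acme Cola", "AcmeZero"]
    issue_terms = ["taste", "price", "recall", "new flavor"]

    # First pass: broad, single-term searches.
    broad_queries = brand_terms + issue_terms

    # Second pass: brand-by-issue combinations, quoted for exact matching
    # where the search tool supports it.
    combined_queries = [f'"{b}" {i}' for b, i in product(brand_terms, issue_terms)]

    for query in broad_queries + combined_queries:
        print(query)

    # As the results surface synonyms or tangential phrases, add them to
    # issue_terms and rerun the searches.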

Step 3: Identify social media data sources

The identification of the most useful data sources is another important step to social media
data analysis. Online aggregator tools, such as TweetDeck and Scout Labs, can aid in this
process. Still, sometimes these tools can miss some important types of social media
platforms.

Depending on the research objectives, some types of social media sites that can provide
consumer-generated data include the following:

social network sites (e.g., Facebook),

video-sharing sites (e.g., YouTube),

photo-sharing sites (e.g., Flickr),

product and service review sites (e.g., Yelp),

Web-based communities (e.g., Chowhound),

blogs (e.g., Gardenista), and

microblogs (e.g., Twitter).

Finding the most current and germane sites is a moving target, since social media-oriented
data sources ebb and flow in popularity. While this makes the task of identifying the best
sites from which to gather data more difficult, it also means that new forms of exciting and
relevant consumer-generated feedback are always emerging and can be uncovered with a
bit of persistence.

Step 4: Organize data

Some of the most important consumer-generated data will not necessarily be in the form of
text. Photos, videos, artwork, literature and other forms of data might provide new insights
into product feedback. As a result, organization of the data should be flexible and allow for
diverse forms of media. A number of commercial services (e.g., HootSuite, Radian6) and
software (e.g., NVivo) are available to assist in this process, as well as free online tools (e.g.,
SocialMention, Google Alerts). However, some analysts will prefer to replace or supplement these options with more of a do-it-yourself approach to organizing data to ensure versatility and comprehensiveness. Analysts will also need to decide whether to view the data online, via hard copy or through a combination of paper and electronic sources when conducting the analysis, based on personal preferences and on the extent to which the data analysis will involve collaboration among team members.

With the abundance of data available on the Web, and with all the twists and turns that can
be encountered in the process of organizing data, it is important to know when to stop
seeking new sources. The rule of thumb is that when a saturation point is reached – that is,
when little new information is being acquired relative to the effort – it is time to end the
searches.

Step 5: Analyze data

Once the social media data have been gathered and organized, the best practices for
analyzing social media data are the same as those used for traditional qualitative data. First
the analysts should review the data thoroughly. As with all research, insightful analysis
depends on a comprehensive knowledge and understanding of the data. Then the analysts
should begin identifying key themes that emerge from the findings – beliefs, ideas,
concepts, definitions, behaviors. The data should be coded according to themes, either by
hand or via software (e.g., NVivo) and then compared and integrated. To repeat: This step
parallels content analysis of traditional types of qualitative data.
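For analysts who want to speed up the mechanical part of this step, the following is a minimal sketch in Python of a keyword-assisted coding pass; the themes, keywords and posts are invented, and the output would still need to be reviewed and refined by a human coder, exactly as with traditional qualitative data.

    from collections import defaultdict

    # Invented codebook: each theme maps to indicative keywords. Themes
    # should emerge from a careful first reading of the data.
    codebook = {
        "price":   ["expensive", "cheap", "cost", "price"],
        "taste":   ["flavor", "taste", "sweet", "bland"],
        "service": ["service", "support", "delivery", "staff"],
    }

    posts = [
        "The delivery was late and support never answered.",
        "Great flavor but way too expensive for what you get.",
    ]

    coded = defaultdict(list)
    for post in posts:
        text = post.lower()
        for theme, keywords in codebook.items():
            if any(k in text for k in keywords):
                coded[theme].append(post)   # one post can carry several codes

    for theme, examples in coded.items():
        print(theme, len(examples), examples)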

Step 6: Present findings

Following analysis of the data, the findings will be presented via oral and written
presentation, using concrete examples and illustrations. Here is where social media data really stands out. Quotes can be presented from Twitter, reviews and blogs, just as verbatim
quotes would be used to illustrate findings from focus groups and interviews. But consumer-
generated social media data offer much more. Photos found online can illustrate exactly
where, when and how a consumer is using a product or service. Consumer-produced videos
can demonstrate perceived advantages and disadvantages of products. Even textual quotes
praising or criticizing products can be much more colorful when found online with the
opinions offered spontaneously and not prompted by a moderator.

Step 7: Outline limitations

When using social media data, it is at least as important as with other research methods – and probably more so – to outline the limitations of the data. Explicitly stating the problems
and gaps encountered when gathering and analyzing the data helps to provide a more
complete understanding of the findings.

The following are some of the limitations that are most commonly encountered with social
media data:

The online consumers are not necessarily demographically representative of the product’s
target consumers.

Self-selection bias is inherent with social media data.

Advocates and detractors can distort online conversations.

The demographic and geographic information of the consumers is often not traceable.

Step 8: Strategize

As with all research, the final and most important step of the analysis is to use the findings to
develop research-based, actionable recommendations related to the research objectives.
Then, based on the project’s results, the next stage of research should be planned.

Challenges and opportunities

Many of the basic steps used for the content analysis of text from structured data collection
methods – such as interviews, focus groups, diaries, and managed online communities – can
be generalized to social media data. However, social media data is different in a number of
fundamental ways, representing both challenges and opportunities for analyses. It is useful
to consider these differences.

Overwhelming amount of data. Traditional interviews or focus groups offer a discrete amount of material to organize and present. Social media data, on the other hand, is
available in abundance. Often much more social media data related to a topic exists than
can be reasonably analyzed. Analysts must place limits, by topics or time periods, on their
search efforts.

Unrestricted comments. With focus groups, interviews and even online communities,
participants are responding to directed questions. The users of social media state whatever
is on their minds. This represents a great opportunity to gain new understandings about
consumers’ motives, needs, behaviors and emotions. It also means that the problem
definitions and research objectives that researchers identify prior to analysis may miss the
mark and require revision.

Much more noise. Because social media data is not generally managed, many, if not most,
of the comments that analysts sort through will be useless. For every insightful comment
found, there are likely to be numerous useless posts, such as sales pitches (“My friend made
$1,200 at home last month…”), empty comments (“So true.” “What he said.” “Yes.”), and
non-contextualized obscenities (no examples necessary).

Multiple languages. Because social media is on the World Wide Web, relevant comments
are frequently posted in multiple languages. As a result, depending on the objectives of the
research, it may be beneficial to assemble a multilingual team for targeted projects.

Multiple forms. Consumer-generated data found online can take many different forms. In
addition to text, data might appear as videos, audio clips, photos, artwork, slideshows and other formats.

Lacks context. Traditional qualitative methods allow quotes to be identified with specific
individuals, providing key information such as gender, age, location and income. It is much
more difficult to ascribe demographics to social media quotes. Even if the information can
be traced to a user profile, there are no assurances that the profile is factual.

Cannot ignore

Social media allows access to up-to-date, candid consumer insights as never before.
Companies seeking to make sound, data-driven decisions cannot ignore social media as a
data source. Conducting a content analysis of online consumer-generated data, guided by
targeted objectives, can yield actionable strategic recommendations. Used in conjunction
with ongoing monitoring of Web analytics, and as a supplement to traditional research
methods, social media content analysis can provide new strategic directions for companies.

BIG Data Analysis:
Big data no big deal
Author - George Stephan

Article Abstract

A look at the findings of an April 2013 survey about the relationship between big data and
marketing research.

Big data has been a hot topic for a few years now and as it grows ever larger – literally and
figuratively – researchers are still trying to define the relationship between big data and
marketing research: Is there a relationship? Is it working? What are the issues?

To seek answers to these questions, WebLife Research and Quirk’s conducted an online
survey in April 2013. The survey was completed by 246 respondents – most of whom are
market researchers at large corporations. About two-thirds work with big data to some
extent.

Not nearly as insightful

Based on the survey results, it’s obvious that both big data and market research are here to
stay. Big data is valuable as a Web analytical tool but thus far not nearly as insightful as
market research. Responses to the question “How actionable are the online behavioral
insights you get from big data vs. from qualitative and quantitative market research?” show
that big data’s actionable insights significantly lag behind market research, with a top three-
box rating of 18 percent vs. 56 percent. In fact, one-third say that market research is more
important now, as it helps understand the whys behind big data.

“Web analytics are used intensively.”

“Better access to big data has made it possible for us to group and track (to some extent) our
most desirable visitors.”

“Big data is mostly used to explain held beliefs. Not for insight. Certainly not to question
anything.”

“Right now big data is used on a very operational level. The research team is working to
integrate this data to now tell customer ‘stories’ and journeys.”

“Big data tells me what visitors are doing but not why!”

It appears that it will take more than two years for big data to truly impact marketing efforts,
as only 29 percent feel that it is very likely to happen in the next two years. If researchers
and marketers worked together, this timeline could be accelerated. Other barriers to the better use of big data are budgets, inability to collect needed data and lack of experienced staff (Table 1).

Isn’t happening yet

Big data and market research teams don’t work much together (Table 2). Researchers
indicate a desire to integrate the two but repeatedly say it just isn’t happening yet. Only
one-third say that the two disciplines are part of a cohesive team. Very few companies have
figured out how to get users of big data and market research together to develop actionable
insights (only 19 percent use them together frequently).

“I don’t use big data – it is used by our marketing department but not by market research.”

“We are just beginning to determine how market research and big data can work together,
trying to form a cohesive team.”

“They work together ‘sporadically’ – no time, no budget, too thinly resourced to be any
good.”

“We understand that data can only give us the what and we need to do qualitative research
to understand the why.”

“Big data results focused, market research directional.”

“Big data free with Google Analytics, market research expensive.”

Oddly, innovative qualitative tools designed to help understand online behaviors are generally not being used for that purpose right now. In fact, existing qualitative
research techniques are still used much more often and most corporate researchers (64
percent) say market research does not take a back seat to big data and they are not worried
that big data will threaten their jobs.

Great opportunity

There is a great opportunity to integrate big data and market research to better understand
online behavior to grow businesses. Corporate management should consider training their
staff, integrating the disciplines and exploiting this opportunity before competition does.

If you’re looking to take action, consider making integration part of corporate policy; asking
for an insights integration budget and staff; training market researchers and analysts to
understand each other’s disciplines; hiring a team leader who can ensure that both
disciplines work cohesively together; and trying out the new qualitative tools designed to
understand online behavior.

Big data matters: Why you should be using it and how others already are
Author - Helen Strong

Article Abstract

The author argues that researchers should include big data in their research design and
offers several real-life examples of how companies have successfully applied big data
analytics for marketing.

With the increasing pace of today's innovations, academics and educators are constantly
trying to keep up with business advances. In the research industry, we are faced with the
emergence of big data and electronic measurement and the impact that these processes
should be having on marketing research design theory.

As academics, we are failing our students in that it is difficult, if not impossible, to find a
textbook that goes beyond a discussion of the use of e-mails and the Internet for gathering
data via interviews.

With the impact that big data and electronic measurement methodologies are having on
providing decision data, there is no doubt that these approaches need to be integrated into
the market researchers' design arsenal in the near future.

Already recognized

One could argue that the method is already recognized. There are even qualification courses
based on the use of computers in research (Hunter, 1998). However, all references
concentrate on the role of electronic collection of data and its role in surveys and literature
reviews. It is agreed that new data collection methods have contributed to the researcher's
ability to accelerate the research process and perhaps have improved access to respondents
who would otherwise be inaccessible. What has not been recognized is that big
data/electronic measurement are alternative or additional methodologies that need to be
incorporated into the design portfolio.

Big data is already being used by companies with great financial resources and vast
computing power. As health care quality is under scrutiny, clinical records are being shared
between physicians1 who want to conduct research and look for better ways of treating
patients (Electronic Data Methods Forum, 2013).

Some may argue that big data is an extreme form of secondary information. However, if we
apply the test for secondary data we see that this type of data a) has not necessarily been
collected by someone else; b) is in its raw form; c) has not been processed into information;
and d) is often in the form of communication from one person or entity to another.

Trends, patterns and correlations

Essentially, big data analysis examines extremely large amounts of data - looking for trends,
patterns and correlations between variables captured on a computer system or via some
electronic transmission. The source can be internal (global) company data or perhaps even
external trawling of the Web. The cyber-region is interrogated for information about an
event, product, brand, etc., and analysts can gather (in real time) what people have to say,
how they say it and on which networks of people are saying it.

The type of information that can be gathered via big data/electronic collection combines
several time-honored traditional methods. It can consider transaction data, Web page log-ins and site activity (e.g., how long people spend on a page and whether searches are
converted into purchases). It can monitor e-mails, mobile phone interaction - anything that
occurs in the ether. This capacity raises ethics questions but that is a discussion for another
time.

Data reduction

To cope with the sheer volume, new database tools and software are available to the
computer nerds who can make sense out of data that is sitting anywhere in the world (or in
the clouds) on one or many computers. Data reduction into meaningful information is
achieved through algorithms and heuristics or intelligent guesswork. The analysts work with such elements as MPP (massively parallel processing), NoSQL databases, Hadoop and MapReduce to achieve their alchemy2.
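To give a flavor of what the MapReduce pattern actually does, here is a toy, single-machine sketch in Python of the map, shuffle and reduce phases applied to a word count; real Hadoop or MapReduce jobs distribute the same logic across many computers, and the example documents are invented.

    from collections import defaultdict

    documents = [
        "big data big insights",
        "big noise little insight",
    ]

    # Map phase: emit a (word, 1) pair for every word in every document.
    mapped = [(word, 1) for doc in documents for word in doc.split()]

    # Shuffle phase: group the pairs by key (the word).
    grouped = defaultdict(list)
    for word, count in mapped:
        grouped[word].append(count)

    # Reduce phase: sum the counts for each word.
    reduced = {word: sum(counts) for word, counts in grouped.items()}
    print(reduced)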

IBM3 provides some insight into the characteristics of big data research problems. IT
practitioners need to consider IBM's Four V's of Big Data: volume, velocity, variety and
veracity. To put it into perspective, IBM estimates that 90 percent of data in existence has
come into being during the past two years (2011 and 2012). IBM believes that whilst
decision makers wanted to harness the power that big data analytics can provide, they have
been skeptical about its ability to provide the answers. Until now, that is.

Organizations such as IBM, Oracle, SAS, Cisco and others have demonstrated beyond a
shadow of a doubt that they can provide decision solutions. The power of big data analysis
can provide answers to questions about the attitudes, language and feelings associated with a product or organization. And with judicious analysis of who is doing the saying, it can provide a profile of supporters and detractors.

A responsible expert

As with traditional qualitative and quantitative market research, a company needs to appoint a responsible expert who is going to ask the right questions. What decisions could
be supported through analysis of our data sources? Where should we be trawling? What
type of data needs to be available? Who is going to collect it? And how can we ensure
complex manipulation of data and its reduction to a useful and workable level?

This curator of information also needs to be conscious of the validity achieved through what
could be considered by some to be arbitrary data. Are the exchanges being measured a true
reflection of the opinions and feelings of the people making them? Can they be applied to
the problem at hand? Of course the difficult question also arises regarding reliability. If the
same search is completed tomorrow, will the data be comparable to that captured today?
Perhaps the case studies have the answer.

Real-life situations

A white paper by Cisco4 identifies real-life situations that have used big data analysis to
provide solutions to business and global problems. For example, Cisco cites Amazon's retail
pricing market research where iPhone and Android users uploaded photographic proof of
retail prices; e-mail monitoring by insurance companies to anticipate litigation and fraud;
and even a law enforcement application where police can monitor the status of offenders.

Electronic methods are powerful when combined with personal interviewing. Within the
retail environment we are seeing electronic answers to the previously unanswered
questions as to how people react to a product display and what elements prompt them to
lift and buy an article. Previously, retailers knew that special positions worked for them but
they did not know how or why. Now with research based on measurement of brain activity,
they can see whether the liking and decision portions of the brain are lit up prior to a
product being put into the basket.

A study by POPAI5 used traditional research techniques in combination with electronic monitoring of shoppers as they selected goods off the shelves. They measured which areas of the brain were activated at specific moments and associated the impact of design elements with the efficacy of displays (i.e., the combination of research methods was quantitative and electronic).

Still in the retail field, some companies6 are allowing marketers to view their shelf presence
and reactions to promotional displays in real time via electronic transmissions. Nielsen's
retail audit data is open to threats of substitutes that can offer online and constant flows of
information. Out-of-stocks and empty shelves that contribute to poor performance of
promotions could be things of the past. From the retailers' point of view, they will be able to
track the number and type of customers frequenting stores at different times of the day.

A new philosophy

All this adds up to a new philosophy of information gathering and exciting new areas for
academic investigation. There are definitely more than two design methodologies for
market research. Hence the challenge today is to ensure that fledgling market researchers
and marketing practitioners keep in touch with global trends. Educators need to include the
big data and electronic measurement design methodologies in the curricula to prepare their
students for the new real world.

Big data: boon to improving customer experience, bane of researchers?


Author - John Goodman, Peter North and David Beinhacker

Article Abstract

The authors explore examples of the use of big data to improve the customer experience
and offer tips for researchers to avoid being left in big data’s wake.

Taking the good with the bad

While big data (BD) is a hot topic in the realms of marketing segmentation, marketing
research and customer experience, your view of it likely depends on your vantage point. For
those in segmentation and customer experience, big data offers many exciting and varied
ways to listen to, learn about and analyze your customers and non-customers alike. For
those in marketing research, however, big data looms as a threat that could put you out of
business, as technology vendors have created data-driven utilities that ostensibly replace
much of the satisfaction-tracking functions traditionally performed by market research
companies.

Big data is most often used for marketing purposes but it has powerful customer experience
applications that are often overlooked and underutilized. This article first suggests a broad
definition of BD. It then suggests incorporating a broader range of data inputs into BD than
most companies use. These expanded inputs, in turn, lead to valuable opportunities for
improving the customer experience process. The article then suggests the biggest
opportunities and threats created by the broader definition and range of inputs. Finally, it
addresses how to best play the politics of big data and recommends specific actions you can
take to assure you are driving the BD bus as opposed to being run over by it.

New forms

In 2012, Gartner updated its definition of big data to “high volume, high velocity and/or high
variety information assets that require new forms of processing to enable enhanced
decision-making, insight discovery and process optimization.” While there are dozens of
definitions, Gartner’s is adequate because it highlights the complexity stemming from many
sources and new forms of processing to make decisions and discover insights.

The term “many sources” means that in the customer experience context, this data can be
produced by any electronic source but it also exists on pads of paper in call centers and
within stories related in focus groups. So, in the customer experience context, any
information or data describing an instance or aspect of a customer experience is a possible
part of the customer experience BD constellation. The challenge is recognizing the source
and translating it into a compatible format and classification. The new approach to
processing implies sorting through the mass of data to find the nuggets that are important.
Making decisions and discovering insights requires more than data-crunching; actionable
outcomes arrive from what Shah, Horne and Capella call “informed skeptics” who apply
informed judgment to analysis.1

Customer experience BD consists of the usual suspects – contact data, customer survey
data, employee survey data, purchase data, customer demographics and economic data
(usually purchased from aggregators). In addition, there are now many “squishier” data
sources such as digital recordings, social media and videos, as well as ad hoc employee input. Finally, there is a massive amount of transaction and quality data, such as process
failure data (e.g., the part was not in stock or the package was not delivered on time), that,
in our experience, is not even viewed as customer experience data. However, it is often
some of the richest in terms of describing the key facets of the customer experience.

In the past, the squishy data was only sampled by quality analysts or researchers because it
was too massive. Similarly, the transaction and quality data was only reported in aggregate
and in most cases was downplayed. For example, to say that 95 percent of all appointments
were met sounds impressive until you know that there were 3,000 appointments in a day so
150 customers waited at home in vain each day. If you then could tie those missed
appointments to call data, survey data and repurchase data, you could start calculating how
much revenue and word-of-mouth damage occurred in addition to the extra cost of
rescheduling or expediting visits.
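A back-of-the-envelope calculation shows how such linked data can be turned into a money figure. The sketch below, in Python, uses the 3,000-appointments example from above; every other number (revenue per customer, defection rate, rescheduling cost) is an invented assumption that would in practice come from the company’s own survey and transaction data.

    appointments_per_day = 3_000
    missed_rate = 0.05                     # 95 percent of appointments were met
    avg_annual_revenue_per_customer = 900  # assumed
    defection_rate_after_miss = 0.10       # assumed share who leave after a miss
    rescheduling_cost = 40                 # assumed cost per expedited revisit

    missed_per_day = appointments_per_day * missed_rate   # 150 customers a day

    annual_revenue_at_risk = (missed_per_day * 365 *
                              defection_rate_after_miss *
                              avg_annual_revenue_per_customer)
    annual_rescheduling_cost = missed_per_day * 365 * rescheduling_cost

    print(missed_per_day, annual_revenue_at_risk, annual_rescheduling_cost)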

A number of opportunities

Big data can be used in a number of ways and presents a number of opportunities for the
customer experience – almost all involving making business more proactive. It is
traditionally used to segment customers for marketing offerings, such as how Amazon uses
it, suggesting cross-sell and up-sell. The new innovation is proactively setting and resetting
expectations. In a November 2012 survey of more than 600 companies, conducted as part of
World Quality Month, the American Society for Quality found that the single biggest
concern of businesses was setting proper customer expectations2.

The survey polled more than 600 quality and customer service experts worldwide, who said
that managing customer expectations (29 percent) and communicating with customers (20
percent) are the top challenges in maintaining quality service.

Other challenges include: educating customers about products and services (16 percent);
providing customers with timely service (13 percent); and training and retaining good staff
(12 percent). What is fascinating with all five of these items is that they all can be at least
partially remedied by big data.

Given that no one reads contracts or directions anymore (we find that less than 2 percent of
any audience says they fully read their homeowners insurance policy, for example),
expectations-setting must be achieved via highly tailored and “just-in-time” communication.
One insurance company tailors the welcome letter to customers to highlight only the three
provisions that have proved to be the most likely to create unpleasant surprises for that
specific type of customer. Likewise, a utility uses integrated data to communicate via each
customer’s preferred communication channel (text, e-mail or cell phone) to update power
outage information by individual neighborhood. Market research should be working to
identify these information needs (including exactly when they will be needed) and provide
this information to the operations side of the business – that is, steering the big data bus.

The second and third issues (communication and education) have huge opportunities for
enhancement via big data. Customers can be contacted through their preferred
communication channel about the most relevant information, which will get their attention.
Both of these issues were combined in a recent test at a West Coast power utility. We found
that sending an e-mail saying, in effect “Your bill is going to be an unpleasant surprise and
we’re concerned!” got a 54 percent open rate across more than 30,000 customers – an
extremely high open rate (most industry e-mail campaigns have open rates of no more than
10 percent). Once opened, the e-mail offered energy usage data for the first 10 days of the
billing period and projections showing that the ultimate bill would be $50 or 30 percent
higher than expected. It then offered three energy-saving tips and links to sign up for more
support tools. Click-through rates and adoption of the offered tools were very high and a
number of consumers called with compliments, saying it shows the utility really cares about
helping consumers conserve and keep their bill down. Getting such a positive response from
customers to such negative news is a testament to the vast potential of customer
communication and education.

Providing timely service via data is what one of the authors (Goodman) calls delivering
“psychic pizza.” That is, ringing the doorbell and saying, “Here is the pizza you were about to
order!” This process preemptively addresses predictable needs or service questions that
customers haven’t even asked yet. For example, an East Coast utility has addressed the
perennial point of pain of arranging a home repair visit by getting the consumer’s preferred
channel of communication, confirming the afternoon before in an automated manner via
that preferred channel and informing the customer that the service tech will call at 8:00
a.m. to tell the consumer where they stand in the daily queue. This delights the customer
and eliminates the cost of up to three inbound phone calls: 4 p.m. the day before (“Is he
really coming tomorrow?"); 8 a.m. the day of the visit (“When is he coming?”); and the
frantic call if he is not there by 10:30 a.m. (“Is he really coming before noon because I have
to go back to the office!”).

Retaining good staff

Big data can also assist in retaining and training good service staff by both preventing
frustratingly recurring customer “dumb questions and inquiries” (which are reduced by
voice-of-the-customer and customer education activities) and by providing easily accessible,
flexible answers to frontline employees so that they can be more successful in handling
difficult issues. One financial services company provides service staff with flexible solution
spaces and talking points based on customer history and value and the specific
circumstances – there can be four answers to the same problem based on the big data
algorithm.

Half of all voluntary turnover among good employees is due to the employee being
frustrated by their own lack of effective tools and answers. They often say, “I’m not getting
paid enough to take flack for things that are not my fault and that I can’t explain.” Big data can provide them with clear, believable, defensible explanations that leave the customer
feeling that they were treated fairly.

Market research should be identifying where education is needed and evaluating the
effectiveness of customer education as well as the effectiveness of response guidance in
producing customers who are satisfied and feel that they have been treated fairly and with
respect.

Serious threats

But for all of its beneficial applications, big data also presents some serious potential threats
to the quality of the customer experience and to the job security of market research
professionals and vendors.

The threat to the customer experience is using the data in a way that offends or scares
customers. One hotel chain found that too much personalization offended customers. A
customer noted “The fact that I ordered scotch on one trip does NOT mean I want extra
scotch stocked in my mini-bar – in fact I’m not sure that I even want what and how much I
drink in your records!” The point is to be responsive without being creepy. To avoid making
a mess of the experience you cannot take the data and mechanistically act on it. As Shah et
al. say in their Harvard Business Review article, leaven data with judgment and avoid
formulaic, one-size-fits-all actions.

The threats to researchers’ job security are much more serious and certain. First, most CRM
systems execute automatic e-mail satisfaction tracking campaigns, which is fast becoming
the response medium of choice. This eliminates the need for outside survey firms to
conduct satisfaction tracking surveys. Additionally, the CRM systems can automatically
integrate the survey data back into the customer records for use in both analysis and future
interactions with the customer. Likewise, most automated telephone call distributors are
equipped with computer telephone integration (CTI), which facilitates measurement of first-
contact resolution by allowing easy identification of repeat calls from the same phone
number – again, no need to ask the same introductory questions of a customer who has
already provided the answers.

A second threat is that CRM, when tied to operational data, can also report customer
experience more completely and accurately than the customers themselves can on a survey.
For example, a delivery company has operational data indicating exactly how many
packages missed their connection and therefore were not delivered on time. The company
also has call center data describing the incidents where the customer called in to complain.
Finally, it has satisfaction survey data collected after the fact. The CRM data, with the
operational data included, will provide a more complete record of the total number of
customers encountering any particular problem and, after the fact, allow analysis of the
impact of the incident on loyalty and actual sales – which is often more credible than customer-stated intention to repurchase. Again, this suggests a possible decline in the need
for surveys.

The third threat lies in the contact center and is embodied in the speech analytics tools now
being brought online. These tools, when properly tuned, can ascertain satisfaction, replacing
both call-quality monitoring and satisfaction surveying. While this is troubling for survey
companies and the market research department, it is disastrous for the quality-monitoring
staff, because we predict that within three years, their jobs will completely disappear.
Finally, these same speech analytics tools can analyze phrases and sequences to identify the
best ways to pitch a product to improve sales and close rates.

You are probably reacting angrily, in one of two ways. First, you say that these tools are not
accurate and won’t be any time soon. Three years ago you were right. Now, some of the
tools are becoming very accurate. Second, you are confident that the tools will be too
expensive to be adopted. Again, in the past, they were a half-million dollars and up. The
prices have dropped dramatically and now compete with the costs of labor-intensive
activities like call monitoring. Further, once they are purchased for one function, they will be
available to perform all the other functions as well.

Now, before you start redrafting your résumé, take heart. Most of the vendors don’t know
how to effectively use the brilliant tools they are being offered – so you have a few years.
But you should prepare to incorporate them into your toolkit.

Learn to drive it

So how can you take the wheel of the big data bus and learn to drive it rather than getting
run over by it? First, some perspective. Big data has been around for almost 20 years,
starting with Amazon personalizing product recommendations based on previous purchases.
In the late 1990s, CRM systems promised a customer experience nirvana almost before
customer experience was first introduced by Joe Pine in the mid-1990s. The ongoing
challenge of BD from a customer experience perspective is to get the BD tools to truly (and
cost-effectively) enhance the customer experience and to do so without seeming creepy
and/or violating privacy. Most of the vendors cannot provide the savvy judgment needed to
capitalize on these challenges, which provides the primary opportunity for marketing
research.

For marketing research to gain a leadership position in the use of big data, researchers
must:

explore and understand all the capabilities of BD to provide and take action on business
intelligence and how it can be practically used for enhancing the customer experience, like
the delivery of psychic pizza to both improve loyalty and word-of-mouth as well as decrease
costs;

understand at least topline approaches to the technical integration of the various tools like
CRM, CTI and speech analytics;

provide the cross-functional bridge between marketing, operations and finance, exerting
the necessary political finesse to keep everyone focused on the goal of enhancing the
customer experience;

evaluate how effective the implementations are in really impacting customer experience;
and

create the capability to analyze how the tools impact costs and revenue in a transparent
manner, one that the CFO and CMO will accept.

Capitalizing on big data will require hard work and political savvy and there is a good chance
you will end up with a smaller staff. But the work you do will have more impact and the jobs
that remain will be more sophisticated. And, as you gain experience working with and
synthesizing all of the disparate data sources available to you, you’ll enhance your own skill
set in ways that will position you for whatever comes next.

Dealing with External Research Providers:
Ten research industry secrets and how to handle them
Author - Vince T. Migliore

Article Abstract

Research is increasingly being done by vendors rather than an in-house research department. This article discusses 10 industry secrets, including tips for handling vendors.

Even large firms nowadays don’t have an in-house research department, choosing rather to
farm out this function to vendors. But how much do you know about these research firms?
What kinds of questions should you ask to find out more about the company that’s
processing your data? The following is a list of 10 industry secrets, including tips on how to
handle them.

1. You may not need primary research.

Very often there is no need at all for primary research. Much of the information you require
is readily available from secondary sources. It’s usually free, or can be purchased for a
fraction of the cost of conducting a survey.

An example: a small software company was enjoying rapid growth for its product in a
narrow niche market with only four other competitors. The company was a success even
without a thorough understanding of its position in the industry, and it wanted to get
market share and growth trend information. It was prepared to spend over $20,000 for a
telephone survey. Instead, we downloaded the sales and investor information of the four
competitors from the Web, gathered data from the library, and made a call to an industry
analyst for a major stock brokerage firm. The result: we had just about everything we
needed for less than 10 hours of work.

The fix: Do your homework! In the Information Age, just about anything you need to know is
available if you know where to look. Start by surfing the Internet. Get in touch with a good
research librarian - they are worth their weight in gold. Many firms, such as DataQuest,
Standard & Poor’s, or Dun & Bradstreet have huge resources that you can tap into for a
relatively small fee. (Mention of firms and brand names should not be construed as an
endorsement of their products or services.)

2. Random selection? I don’t think so!

The whole idea of conducting market research is to gather data that is representative of the
entire population that you are targeting. This requires you to use a random sample, which
by definition means every person has an equal chance of being selected. All too often the
sample is composed of people who happen to be home when you call, or people who filled
out their E-mail address, or some other convenience factor. Studies show, for example, that the first round of daytime calling of a random telephone list yields mostly retired people,
students, and the unemployed. Is that your target audience?

There is also a popular trend called panel research, where the sample is composed of
volunteers who agree to be called and surveyed over and over again. They are enticed to
participate in surveys by the lure of cash awards, prizes, and a chance to express their
opinions. There are many instances where a panel sample is adequate and appropriate, but
this selection method does not constitute a random sample.

The fix: Know how your sample is being drawn. If it’s a telephone survey, where did the list
come from? Learn how many attempts are made to contact each person on the list. The
more, the better. Selecting every nth name from a master list is a good way to generate a
random sample.
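A minimal sketch of the every-nth-name approach (systematic sampling with a random start) is shown below in Python; the master list is invented, and the approach assumes the list is not sorted in a way that lines up with the sampling interval.

    import random

    # Invented master list of 10,000 names.
    master_list = [f"customer_{i:05d}" for i in range(10_000)]

    sample_size = 400
    interval = len(master_list) // sample_size   # every 25th name here
    start = random.randrange(interval)           # random starting point

    sample = master_list[start::interval][:sample_size]
    print(len(sample), sample[:3])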

3. You don’t always get a representative sample.

Another tenet of research is that you want to be able to project the findings from the survey
sample to the entire population. To accomplish this you need a representative sample,
something you don’t always achieve, even with a random sample. For some types of
research, broad, ballpark measures are sufficient. The industry standard for most surveys,
however, is to achieve a reliability of ±5 percent at the 95 percent level of confidence. This is
a technical way of saying if you did the same survey 100 times, using the same sampling
method, that 95 times out of 100, the results would be within 5 percent of the "true"
findings, which are those you would get if you surveyed everybody in the target audience.

The fix: Have a plan. Define your objectives. First, decide if you need a high level of accuracy.
If you are simply trying to poll the general sentiments of your retail customers, then a small
sample will often be adequate. On the other hand, if the purpose of your research is to
make a multi-million dollar decision on corporate strategy, then you’d better have accurate
results that can be projected to the entire population. To accomplish this task, you must
start with a large and representative sample. Sampling is a complex subject, and the laws of
probability dictate very specific minimum sample sizes. A rule of thumb, though, is that for
target populations of 10,000 or more, you need a sample of at least 400 people. Further, to
reach the level of reliability mentioned above, you need a random sample, or what’s called a
stratified probability sample. Finally, you must include techniques for verifying the sample
reliability. To do that, include demographic questions that establish multiple profiles of
those responding to the survey. For example, if you’re conducting a general population
survey, include age, gender, ethnicity, and ZIP code questions on the survey instrument,
then compare your survey results to U.S. Census data. If you’re surveying customers, and
you know from sales data that 15 percent are in the education field, then your survey
findings should reflect that.
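For readers who want to see where the 400 rule of thumb comes from, here is a short worked calculation in Python using the standard sample-size formula for a proportion at ±5 percent and 95 percent confidence, with the most conservative assumption p = 0.5; the finite population correction is included for a target population of 10,000.

    import math

    z = 1.96    # z-score for 95 percent confidence
    p = 0.5     # most conservative assumption about the proportion
    e = 0.05    # +/-5 percent margin of error

    n0 = (z ** 2) * p * (1 - p) / (e ** 2)   # about 385 for a very large population
    print(math.ceil(n0))

    # Finite population correction for a target population of 10,000.
    N = 10_000
    n = n0 / (1 + (n0 - 1) / N)              # about 370, hence "at least 400"
    print(math.ceil(n))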

4. Your questionnaire may be flawed.

Everyone has good intentions, but even a well-designed and easy-flowing questionnaire will
often contain useless questions. "How many times have you gone to a movie theater this
year?" "How many times did you go last year?" Such questions are fraught with problems.
By "this year" do you mean the calendar year, or the last 12 months? If you get an average
attendance of 4.2 times a year, does that convey any actionable response from your
company, or are you simply going to use the results to classify your audience into high,
medium, and low attenders? Can people really remember the number of movie visits they
made a year ago? Finally, if you find average attendance is 4.2 times a year, does that really
convey the full picture to you?

The fix: Study your questionnaire. A good way to check it is to write in the percentages that
you expect to find. Then ask yourself what would happen if the survey responses were
significantly higher or lower than what you expect. If there is nothing you could do or would
do about such surprise findings, then why ask the question?

Finally, give the survey to friends and relatives outside of work, and see if they can detect
any biased or difficult questions. Keep an open mind!

5. Our interviewers are underqualified.

Due to competitive pressures, the interviewers that conduct your survey are likely the
lowest paid employees in the field service. There is generally a high turnover rate in this
business. For questionnaires with highly technical content, they will often not know what
the questions mean.

The fix: Demand an orientation meeting and follow-up visits. Use these meetings to educate
the interviewers and give background material on the purpose of the survey. Ask that the
same interviewers be assigned for the duration of the project. Ask to monitor calls, and
observe the interviewing process. Provide a glossary of terms and definitions. Provide cheat-
sheets and reference material to answer the most frequently asked questions from the
interviewers.

6. Our data entry is shaky.

As with interviewers, data entry clerks are often overworked. Besides keystroke errors,
there are many transposition errors and missing data errors. For instance, there are 35
questions but only 34 entries, with the answer for question 21 placed in the slot reserved
for question 20, etc. Most researchers will tell you that data integrity is the most daunting
task in all of the research process.

The fix: Ask for involvement with and oversight of the data entry process. If you can afford
it, double data entry with documented conflict resolution is the best bet. One of the better
schemes I’ve seen is to assign a code for every variable, whether or not a response is
required. For example, use negative numbers for non-responses: 1=Yes, 2=No, 3=Don’t Know/Unsure, -1=Refusal, -2=No Response/Interviewer error, -3=No Response/Skip pattern, etc.
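A small sketch in Python of how such a codebook makes errors visible during data entry follows; the variable name, codes and records are invented for illustration.

    # Invented codebook: every variable gets a code for every possible state,
    # with negative numbers reserved for non-responses.
    codebook = {
        "q21_recommend": {1: "Yes", 2: "No", 3: "Don't Know/Unsure",
                          -1: "Refusal", -2: "No Response/Interviewer error",
                          -3: "No Response/Skip pattern"},
    }

    # Invented data-entry records for three respondents.
    records = [
        {"q21_recommend": 1},
        {"q21_recommend": -3},
        {"q21_recommend": 4},    # keystroke error: 4 is not a valid code
    ]

    for row, record in enumerate(records, start=1):
        for variable, value in record.items():
            if value not in codebook[variable]:
                print(f"Record {row}: invalid code {value} for {variable}")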

7. Crosstabulation tables are deceptive.

Crosstabulations have to be viewed with caution. Let’s say you’re crossing an important
yes/no question by age groups, and the smallest age group has only 14 respondents. If four
of those 14 say yes, then the corresponding percentage is 28.6 percent. Let’s assume further
that for the total population 12.3 percent of all respondents say yes to that question. It’s
easy to assume then that this age group is more than twice as likely to say yes. Not so fast!
First of all, the 28.6 percent tenth-of-a-decimal-point format implies an accuracy level that is
simply not justified by the number of cases it relies on. Second, the 28.6 percent is based on
only four respondents, so you should suspect a reliability problem. Finally, many research
firms supply crosstabulation and banner tables that do not show the statistical tests that
would tell you the probability of these percentage differences being "real" or simply due to
chance.

The fix: Study the total population frequencies before you order crosstabulation tables. If
there are only 14 people in the youngest age group, 18- to 24-year-olds, then consider
combining that group with the adjacent one, say 25- to 34-year-olds. By forcing larger
numbers of respondents into fewer age groupings, you can increase the reliability of the
percentages in those groups. Also, ask for the appropriate statistical tests with crosstabs.
For category questions, use the Chi-square test, and for differences in averages on a scale,
use the Student’s T-test, or ANOVA. A good rule of thumb is that there should be at least five expected cases in the smallest cell for the Chi-square test to be accurate. Last, use some common
sense and good judgment when reviewing crosstabs. If the percentage of respondents
saying yes goes up in a stepwise fashion as the age groups get older, then most likely the
trend is real. If the age groups show only minor variations with no apparent pattern, then
the differences are probably due to chance.
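As an illustration of requesting the right test, the sketch below runs a chi-square test on an invented crosstab of a yes/no question by two (already combined) age groups, using Python’s scipy library; the observed counts are made up, and the expected counts are printed so the at-least-five rule can be checked.

    from scipy.stats import chi2_contingency

    # Invented crosstab: rows are age groups, columns are [yes, no].
    observed = [[40, 160],   # 18-34: 20 percent yes
                [30, 170]]   # 35+:   15 percent yes

    chi2, p_value, dof, expected = chi2_contingency(observed)

    print(round(chi2, 2), round(p_value, 3))
    print(expected)   # every expected count should be at least five

    # A p-value above 0.05 means the difference between the age groups
    # could easily be due to chance.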

Here is a crosstab technique I’ve found useful. Make a copy of all the crosstabs that you can
mark up. Flag those pages that contain the survey’s crucial questions, like "Would you
recommend our service to your friends?" Let’s say 80.5 percent of all respondents say yes to
that question. Now, scan across the subgroup categories in the crosstabs and see which
subgroups are higher than that 80.5 percent. If any of the subgroups is substantially higher,
and has a good number of respondents, then highlight the percentage in yellow. These are
your happy customers. If a subgroup shows a very high rating, say females at a 91.5 percent
yes rating, and that 91.5 percent is higher than any of the other demographic subgroups
(age, ZIP code, ethnicity, etc.), then highlight that percentage and also circle it with a red
pen. Repeat that for all the crucial survey questions. (Time consuming, yes, but this is why
research analysts get the big bucks!) Now go back and count how many red circled
percentages you find under gender, age, etc. If there are 10 red circles under male/female,
and only one under ZIP code, then you know gender is more important than geography.

Meanwhile, as you’re busy highlighting, you can get a feel for how much variation there is in
each subgroup, and how much is required to reach statistical significance in the Chi-square
tests (if you’ve run them).
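The highlighting exercise can also be partly automated. The following is a minimal sketch in Python using pandas, with invented respondent-level data, that flags the subgroups running above the total percentage on a crucial question; the analyst would still apply the judgment described above before calling any flag meaningful.

    import pandas as pd

    # Invented respondent-level data: a crucial yes/no question plus two
    # demographic variables.
    df = pd.DataFrame({
        "recommend": [1, 1, 0, 1, 1, 1, 0, 1, 1, 0],   # 1 = yes
        "gender":    ["F", "F", "M", "F", "M", "F", "M", "F", "F", "M"],
        "region":    ["N", "S", "N", "S", "N", "S", "N", "S", "N", "S"],
    })

    total_pct = df["recommend"].mean() * 100
    print(f"Total: {total_pct:.1f} percent yes")

    # Flag subgroups above the total -- the ones you would highlight.
    for demo in ["gender", "region"]:
        subgroup_pct = df.groupby(demo)["recommend"].mean() * 100
        flagged = subgroup_pct[subgroup_pct > total_pct]
        print(demo, flagged.round(1).to_dict())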

8. Sorry, we don’t do that.

The standards in the market research industry are changing, and not everyone is keeping
up. On-line and E-mail surveys are just two recent examples. Many research firms have
relied on telephone and personal interviewing, and have not acquired the skills needed for
these new forms of research. Likewise, there are powerful and important statistical
methods available that may be crucial to your project, but you won’t hear about them
because the company you’re using doesn’t have the software program, or the computer
hardware, or the intellectual know-how to perform them.

Conjoint analysis is a great example. Here is a potent and decisive tool for deciding which
new features your customers like best for improving your product. Conjoint analysis,
though, requires a dedicated software program, computer-assisted interviewing, and lots of
brain power in the planning and analysis stages.

The fix: Shop around, and again, do your homework. Read the trade journals for recent developments, and break out the old statistics text to brush up on some of the less well-known statistical procedures. You should at least know when to use these methods: Chi-square, Student's t-test, analysis of variance, factor analysis, and conjoint analysis. For
Internet surveys, you should be able to define the following: Spam, HTML, CGI-bin, radio-
button, forms-retrieval, and Web hosting.
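
As a pocket reference for two of the methods named above, here is a minimal sketch in Python using SciPy: a Student's t-test for comparing the mean rating of two groups, and a one-way ANOVA for three or more groups. The satisfaction scores and group labels are hypothetical.

from scipy import stats

# Hypothetical 1-10 satisfaction ratings from three age groups.
under_35  = [7, 8, 6, 9, 7, 8, 7, 6]
age_35_54 = [6, 7, 5, 6, 7, 6, 8, 5]
over_54   = [5, 6, 4, 5, 6, 5, 4, 6]

# Student's t-test: is the difference in mean rating between two groups real?
t_stat, p_two = stats.ttest_ind(under_35, over_54)
print(f"t = {t_stat:.2f}, p = {p_two:.3f}")

# One-way ANOVA: do the means differ across three or more groups at once?
f_stat, p_all = stats.f_oneway(under_35, age_35_54, over_54)
print(f"F = {f_stat:.2f}, p = {p_all:.3f}")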

9. Survey analysis is a voodoo science.

There is no comprehensive, one-size-fits-all method of survey analysis. Much of it is based on crosstabulations that are not always trustworthy, as we've seen above. Meanwhile,
research companies like to convey the impression that they are experienced in your
industry, but a good research analyst is rarely a subject matter expert. In order to get a
meaningful report, you need an analyst who is intimately familiar with the strengths and
weaknesses of statistical procedures, and who also has the ability to recognize which
findings are significant to the survey objectives. This is a difficult task.

The fix: Use teamwork to bridge the knowledge gap. It’s extremely rare that one person
knows all the answers. Fortunately, most research projects are conducted in an atmosphere
of cooperation and friendly interdependence. It may help to schedule a brainstorming
session after the survey results are in, but before the report is written. As an example, the
statistician may find that males over 40 rate your product significantly lower than other
groups, but the industry analyst says, "We know that, it’s the nature of our product, and we
don’t expect it to change." In other words, not all survey findings are important for strategic
business decisions. Discernment in this area requires input from all players on the team.

10. Follow-up? Forget about it!

"Here’s your report. Good-bye and good luck!" How many times do we hear that? All too
often thousands of dollars are spent on a research project only to have the report sit on a
shelf without an implementation plan. Just as likely, there is little review of the survey
process, and no evaluation of the benefits it has provided.

The fix: Integrate the presentation of findings with a plan for implementation that conforms
to the survey objectives. Instead of one presentation event, plan on a multi-step process of
disseminating and evangelizing the survey findings. Fortunately there are usually several key
players in your firm who will appreciate and champion the project suggestions. Use them.
Meanwhile, mark your calendar for a day about six months down the road, when you take
some time for an objective review of the survey. Did it help business? Did it provide key
insights? Would you use this research firm again?

Working with a statistical expert and surviving


Author- Paul M. Gurwitz

Article Abstract

The author gives advice, aid and comfort to all of those marketing professionals who have
required the services of a statistician. While the prospect of hiring a statistician is not always
appealing, the author offers a few tips to improve the relationship between statistician and marketing professional.

This piece is an attempt to give aid and comfort to all of those marketing professionals who
have, at one time or another, needed to use the services of a statistician.

For those trained in the disciplines of conventional market research, who know all there is
to know about sampling, questionnaire design and field methods, but may be quite
unfamiliar with some of the more advanced analytic techniques, this position can be a most
uncomfortable one.

First of all, there is the feeling of lack of knowledge. If you need to hire a statistical expert,
it's because he/she knows something that you A) Don't know, and B) Need to know.
Researchers who are accustomed to dealing with problems of their discipline with a firm
hand born of easy familiarity may be quite intimidated by this situation. How, after all, can
you properly evaluate the expert's work without being an expert yourself? Is it possible to
judge the quality of the work being produced? Some may fear being cheated, or otherwise
badly served, without knowing it.

Along with this comes the feeling of loss of control. Researchers who are accustomed to
exercising a great degree of control over their projects may hesitate to call in an expert for analytical advice for fear of losing that control. Some may resent the idea of another person
making major input into their project.

There can also be a feeling of personal inadequacy. "After all," you say, "I'm a market
research professional; I've been in the business for many years. I should know all there is to
know. How come I have to call an outsider in to help me? Doesn't this mean that I am less
than a complete researcher?"

If you must hire

Given this uncomfortable situation, how can the researcher hire and work with a statistical
expert so as to accomplish the task at hand without lowering his/her comfort level? Here
are some suggestions:

1. You're the expert on your problem. The overall principle to bear in mind is that you really
are the expert on your project. You know the business, the study, and its objectives. The
person you hire has knowledge you can use to help accomplish those objectives but you are
the one who can best explain them and judge when they are accomplished.

2. Bring the statistician in early. The best time to start analyzing a study is before the
questionnaire is written; that goes double when it comes to statistical analysis. Once the
questionnaire is designed and out in the field, many of your analytical options are already foreclosed;
too often, the best solution to a given problem turns out to be impossible to execute
because the required data are not there, or are in a form that cannot be used. You can make
more efficient use of the analyst's time and effort if that person is there to suggest
approaches to your problem at the beginning of the study.

3. Talk objectives, not techniques. As in any other area, you will get no more than what you
expect to get. If you ask a statistician if he/she can produce a regression or a perceptual
map or any other technique, the answer will almost always be yes and you will have learned
almost nothing about the person's real capabilities. On the other hand, if you describe the
study, its objectives, and the specific problem to be solved and ask what the expert can do
to help you solve the problem, you will probably learn a great deal from the answers. You
will learn not only about his/her own analytical capabilities, but also his/her ability to relate
to you and your problem.

4. Ask for explanations. Some of the techniques used by statistical analysts are
mathematically very complex; however, the explanation of what they do should not be. You have a right to know how the expert proposes to solve your problem; furthermore, you have a right to hear it in plain English. If the analyst you are considering cannot articulate what it is he/she plans to do
and what it will do for you in a way you can understand and evaluate, the odds are very
good that you won't be satisfied with the outcome later on.

5. Agree on what you expect to get up front. Many of the problems in working with a
statistician can be avoided by agreeing at the beginning of the project (or the statistician's
involvement in it) on what is to come out of it. Often, problems in this relationship crop up
because there was a basic misunderstanding about what the analyst was expected to
produce; this is compounded if both parties come to the same project with very different
expectations. The clearer you can express your expectations to the person you hire, the
more likely you will get what you expect.

6. Allow enough time. It's a truism in the research business that the client always wants it
yesterday. However, in your rush to give the client something, don't cheat yourself out of
good analysis. Statistical analysis is not a shortcut. Most good analysis does not roll right out
of the computer the first time out; there is a degree of trial and error to the correct
application of most statistical procedures. Because of this, there is usually a direct
relationship between the amount of time given an analysis and the quality of the results.
Now it's true that this principle can be (and has been) taken to extremes: The best
marketing advice in the world is worthless if the problem is moot by the time it's delivered.
By the same token, bad advice quickly given can be worse than none at all. So, don't assume
instantaneous results. Give the analyst enough time to do right by you.

7. Be available throughout the project. In an endeavor like a research project, it is impossible to anticipate all of the possible questions and contingencies that can arise. For this reason, it
is essential that you communicate to the expert you hire that you are available to consult
with him/her as needed. While this person is expert in the analytical methods he/she is
using, you cannot reasonably expect him/her to be as knowledgeable in the background of
your project or your business as you are. (By the way, it is the nature of the consulting
business that this will often be true even of analysts who have a track record in your
category. Even experts who have done numerous projects in a particular category usually
lack the day-to-day depth of the line researcher). If you simply turn over the data and say,
"Come back in a week with results," be prepared to agree with the judgment decisions the
analyst will make on the basis of partial knowledge. Otherwise, offer a continuing dialogue
throughout the project.

Faced with today's virtual onslaught of marketing information (scanner data, people meters, multi-million-record marketing databases, to name a few), the market researcher has increasingly little choice but to turn to statistical methods to make sense of it. Because of
this, the researcher and the statistical expert are seen together more and more often. It
may not always be a marriage made in heaven; however, I hope that following these
suggestions might help both parties ward off the divorce court.
