
Boost Your Marketing ROI with Experimental Design

by Eric Almquist and Gordon Wyner

Reprint r0109k
October 2001

Tool Kit

Most marketing executives will admit that their discipline involves a lot of guesswork. But by borrowing a statistical technique long applied in other fields, marketers can develop campaigns that target customers with uncanny accuracy.

Eric Almquist and Gordon Wyner are vice presidents of Mercer Management Consulting, based in Boston.

Copyright © 2001 by Harvard Business School Publishing Corporation. All rights reserved.

Consumers are bombarded daily with hundreds, perhaps thousands, of marketing messages. Delivered through all manner of media, from television commercials to telephone solicitations to supermarket circulars to Internet banner ads, these stimuli may elicit the desired response: The consumer clips a coupon, clicks on a link, or adds a product to a shopping cart. But the vast majority of marketing messages fail to hit their targets. Obviously, it would be valuable for companies to be able to anticipate which stimuli would prompt a response since even a small improvement in the browse-to-buy conversion rate can have a big impact on profitability. But it has been very difficult to isolate what drives consumer behavior, largely because there are so many possible combinations of stimuli.

Now, however, marketers have easier access, at relatively low cost, to experimental design techniques long applied in other fields such as pharmaceutical research. Experimental design, which quantifies the effects of independent stimuli on behavioral responses, can help marketing executives analyze how the various components of a marketing campaign influence consumer behavior. This approach is much more precise and cost-effective than traditional market testing. And when you know how customers will respond to what you have to offer, you can target marketing programs directly to their needs – and boost the bottom line in the process.

Traditional Testing

The practice of testing various forms of a marketing or advertising stimulus isn't new. Direct marketers, in particular, have long used simple techniques such as split mailings to compare how customers react to different prices or promotional offers. But if they try to evaluate more than just a couple of campaign alternatives, traditional testing techniques quickly grow prohibitively expensive.

Consider the "test and control cell" method, which is the basis for almost all direct mail and e-commerce testing done today. It starts with a control cell for, say, a base price, then adds test cells for higher and lower prices. To test five price points, six promotions, four banner ad colors, and three ad placements, you'd need a control cell and 360 test cells (5 × 6 × 4 × 3 = 360). And that's a relatively simple case. In credit card marketing, the possible combinations of brands, cobrands, annual percentage rates, teaser rates, marketing messages, and mail packaging can quickly add up to hundreds of thousands of possible bundles of attributes. Clearly, you cannot test them all.

There's another problem with this brute force approach: It typically does not reveal which individual variables are causing higher (or lower) responses from customers, since most control-cell tests reflect the combined effect of more than two simple variables. Is it the lower price that prompted the higher response? The promotional deal? The new advertising message? There's no way to know.

The problem has been magnified recently as companies have gained the ability to change their marketing stimuli much more quickly. Just a few years ago, changing prices and promotions on a few cans of food in the supermarket, for example, required the time-consuming application of sticky labels and the distribution of paper coupons. Today, a store can adjust prices and promotions electronically by simply reprogramming its checkout scanners. The Internet has further heightened marketing complexity by reducing the physical constraints on pricing, packaging, and communications. In the extreme, an on-line retailer could change the prices and promotion of every product it offers every minute of the day. It can also change the color of banner ads, the tone of promotional messages, and the content of outbound e-mails with relative ease.

The increasing complexity of the stimulus-response network, as we call it, means that marketers have more communication alternatives than ever before – and that the portion of alternatives they actually test is growing even smaller. But this greater complexity can also mean greater flexibility in your marketing programs – if you can uncover which changes in the stimulus-response network actually drive customer behavior. One way to do this is through scientific experimentation.

A New Marketing Science

The science of experimental design lets people project the impact of many stimuli by testing just a few of them. By using mathematical formulas to select and test a subset of combinations of variables that represent the complexity of all the original variables, marketers can model hundreds or even thousands of stimuli accurately and efficiently.

This is not the same thing as an after-the-fact analysis of consumer behavior, sometimes referred to as data mining. Experimental design is distinguished by the fact that you define and control the independent variables before putting them into the marketplace, trying out different kinds of stimuli on customers rather than observing them as they have naturally occurred. Because you control the introduction of stimuli, you can establish that differences in response can be attributed to the stimulus in question, such as packaging or color of a product, and not to other factors, such as limited availability of the product. In other words, experimental design reveals whether variables caused a certain behavior as opposed to simply being correlated with the behavior.

While experimental design itself isn't new, few marketing executives have used the technique – either because they haven't understood it or because day-to-day marketing operations have gotten in the way. But new technologies are making experimental design more accessible, more affordable, and easier to administer. (For more information on the genesis of this type of testing, see the sidebar "The Origins of Experimental Design.") Companies today can collect detailed customer information much more easily than ever before and can use those data to build models that predict customer response with greater speed and accuracy.

Today's most popular experimental-design methods can be adapted and customized using guidelines from standard reference textbooks such as Statistics for Experimenters by George E. P. Box, J. Stuart Hunter, and William G. Hunter; and from off-the-shelf software packages such as the Statistical Analysis System, the primary product of SAS Institute. A handful of companies have already applied some form of experimental design to marketing. They include financial firms such as Chase, Household Finance, and Capital One, telecommunications provider Cable & Wireless, and Internet portal America Online.

Applying experimental-design methods requires business judgment and a degree of mathematical and statistical sophistication – both of which are well within the reach of most large corporations and many smaller organizations. The experimental design technique is particularly useful for companies that have large numbers of customers and that face rapid and constant change in their markets and product offers. Internet retailers, for instance, benefit greatly from experimentation because on-line customers tend to be fickle. Attracting browsers to a Web site and then converting them into buyers has proved very expensive and largely ineffective. Getting it right the first time is nearly impossible, so experimentation is critical. The rigorous and robust nature of experimental design, combined with the increasing challenges of marketing to oversaturated consumers, will make widespread adoption of this new marketing science only a matter of time in most industries.

The Origins of Experimental Design

Experimental design methodologies – some dating as far back as the nineteenth century – have been used for years across many fields, including process manufacturing, psychology, and pharmaceutical clinical trials, and they are well known to most statisticians. Sir Ronald A. Fisher was among the first statisticians to introduce the concepts of randomization and analysis of variance. In the early 1900s, he worked at the Rothamsted Agricultural Experimental Station outside London. His focus was on increasing agricultural yields.

Another major breakthrough in the field came with the work of U.S. economist and Nobel laureate Daniel L. McFadden in the 1970s, who drew on psychological theories to explain that consumer choices are a function of the available alternatives and other consumer characteristics. In helping to design San Francisco's BART commuter rail system, McFadden analyzed the way people evaluate trade-offs in travel time and cost and how those trade-offs affect their decisions about means of transportation. He was able to help forecast demand for BART and determine where best to locate stations. The model was quite accurate, predicting a 6.4% share of commuter travel for BART, which was close to the actual 6.2% share the system achieved.

The ABCs of Experimental Design

To illustrate how experimental design works, let's consider the following simple case. A company, which we'll call Biz Ware, is marketing a software product to other companies. Before launching a national campaign, Biz Ware wants to test three different variables, or attributes, of a sales message for the product: price, message, and promotion. Each of the three attributes can have a number of variations, or levels. Suppose the three attributes and their various levels are as follows:

Price
LEVEL  (1)   (2)   (3)   (4)
       $150  $160  $170  $180

Message
(1) SPEED: "Biz Ware lets you manage customer relationships in just minutes a day."
(2) POWER: "Biz Ware can be expanded to handle a virtually infinite number of customer files."

Promotion
(1) 30-DAY TRIAL: "You can try Biz Ware now for 30 days at no risk."
(2) FREE GIFT: "Buy Biz Ware now and receive our contact manager software absolutely free."

The total number of possible combinations can be determined by multiplying the number of levels of each attribute. The three attributes Biz Ware wants to test yield a total of 16 possible combinations since 4 × 2 × 2 = 16. All 16 combinations can be mapped in the cells of a simple chart like the one below.

Biz Ware's Universe of Possible Combinations
PROMOTION (1) (1) (2) (2)
MESSAGE (1) (2) (1) (2)
PRICE (1) X X X X
PRICE (2) X X X X
PRICE (3) X X X X
PRICE (4) X X X X

It's not necessary to test them all. Instead, using what's called a fractional factorial design, Biz Ware selects a subset of eight combinations to test.

"Factorial" means Biz Ware "crosses" each attribute (price, promotion, and message) with each of the others in grid-like fashion, as in the universe chart above. "Fractional" means Biz Ware then chooses a subset of those combinations in which the attributes are independent (either totally or partially) of each other. The following chart shows the resulting experimental design, with Xs marking the cells to be tested. Note that each level of each attribute is paired in at least one instance with each level of the other attributes. So, for example, price at $150 is matched at some point with each promotion and each message. This makes it possible to unambiguously separate the influence of each variable on customer response.

Biz Ware's Experimental Design
PROMOTION (1) (1) (2) (2)
MESSAGE (1) (2) (1) (2)
PRICE (1) X X
PRICE (2) X X
PRICE (3) X X
PRICE (4) X X

The eight chosen combinations are now tested, using one of several media: scripts at a call center, banner ads on Biz Ware's Web site, e-mail messages to prospective customers, and direct mail solicitations. (In general, you should test using the medium you ultimately expect to use for your marketing campaign, although you can also choose multiple media and treat the choice of media as an attribute in the experiment.)

How big should the sample size be to make the experiment valid? The answer depends on several characteristics of the test and the target market. These may include the expected response rate, based on the results of past marketing efforts; the expected variation among subgroups of the market; and the complexity of the design, including the number of attributes and levels. In any event, the sample size should be large enough so that marketers can statistically detect the impact of the attributes on customer response. Since increasing the complexity and size of an experiment generally adds cost, marketers should determine the minimum sample size necessary to achieve a degree of precision that is useful for making business decisions. (There are standard guidelines in statistics that can help marketers answer the question of sample size.) We've conducted complex experiments by sending e-mail solicitations to lists of just 20,000 names, where 1,250 people each receive one of 16 stimuli.
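The selection of a balanced half of Biz Ware's 16 combinations can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the parity rule in `in_half_fraction` is an assumed way of picking eight cells, and the function and variable names are hypothetical. The cells it selects need not be the same eight that Biz Ware tested.

```python
from itertools import product

# Attribute levels from the Biz Ware example.
prices = [150, 160, 170, 180]
messages = ["SPEED", "POWER"]
promotions = ["30-DAY TRIAL", "FREE GIFT"]

# Full factorial: every price crossed with every message and promotion.
# Cells are stored as (price index, message index, promotion index).
full_factorial = list(
    product(range(len(prices)), range(len(messages)), range(len(promotions)))
)

def in_half_fraction(cell):
    """Keep a cell when its three level indices sum to an even number.

    Assumed selection rule for illustration only; it yields a balanced
    half of the grid but is not taken from the article.
    """
    price_i, message_i, promo_i = cell
    return (price_i + message_i + promo_i) % 2 == 0

design = [cell for cell in full_factorial if in_half_fraction(cell)]

def pairs_covered(cells, a, b):
    """Return the set of level pairs seen for attribute positions a and b."""
    return {(cell[a], cell[b]) for cell in cells}

for price_i, message_i, promo_i in design:
    print(f"${prices[price_i]}  {messages[message_i]:<6} {promotions[promo_i]}")
```

With this rule, every price is matched at least once with each message and each promotion, and every message is matched with each promotion, which is the property the text describes as making the attributes separable.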

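One of the standard statistical guidelines for sample size is the normal-approximation formula for comparing two response rates. The sketch below applies it under stated assumptions: the 5% significance level, 80% power, and the example rates are illustrative choices, and `sample_size_per_cell` is a hypothetical helper name. Because it compares two cells directly, it will tend to overstate what a factorial design needs, since each attribute level there is observed across several cells.

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_cell(p_base, p_alt, alpha=0.05, power=0.80):
    """Approximate names needed per cell to tell p_base from p_alt.

    Uses the usual two-sided, two-proportion normal approximation.
    alpha and power defaults are conventional values, assumed here.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    variance = p_base * (1 - p_base) + p_alt * (1 - p_alt)
    return ceil((z_alpha + z_power) ** 2 * variance / (p_base - p_alt) ** 2)

# Illustrative inputs: a 10% expected response rate and a lift to 13%.
n = sample_size_per_cell(0.10, 0.13)
print(f"names per cell: {n}")
```

Smaller expected differences push the required sample up quickly, which is why the expected response rate from past campaigns matters so much in planning.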

Within a few days or weeks, the experiment's results come in. Biz Ware's marketers note the number and percentage of positive responses to each of the eight tested offers.

Biz Ware's Design Results
PROMOTION (1) (1) (2) (2)
MESSAGE (1) (2) (1) (2)
PRICE (1) 14% 40%
PRICE (2) 9% 13%
PRICE (3) 6% 10%
PRICE (4) 1% 7%

At a glance, you might intuitively understand that price has a significant impact on the response to the various offers, since the lower price offers (Price 1 and Price 2) generally drew much better response rates than the higher price offers (Price 3 and Price 4). But statistical modeling, using standard software, makes it possible to assess the impact of each variable with far greater precision. Indeed, by using a method known as logistic regression analysis, Biz Ware can extrapolate from the results of the experiment the probable response rates for all 16 cells.

Biz Ware's Modeled Responses
PROMOTION (1) (1) (2) (2)
MESSAGE (1) (2) (1) (2)
PRICE (1) 14% 23% 28% 42%
PRICE (2) 7% 12% 15% 24%
PRICE (3) 3% 6% 7% 12%
PRICE (4) 1% 3% 3% 6%

Note that the percentages shown above don't precisely match the original percentages from the test. That's because Biz Ware used the original percentages to create a general equation for estimating the results in all the cells. When the new equation is then applied to the cells already tested, the results usually vary somewhat from the original numbers. The important thing is that the tester ends up with a full set of consistent results for all possible permutations. (For more about how these calculations were made, see the sidebar "Estimating a Response Model.")

With this complete picture, it becomes clear that some combinations are far more likely to be effective than others. The combination of Price 1 ($150), Message 2 (Power), and Promotion 2 (Free Gift) is clearly the most attractive to consumers. But is it the right combination for Biz Ware? That's where business judgment comes in: The company's management will need to analyze the experiment's implications for its resources, revenue, and profitability.

Experimentation at Crayola

Let's look at an actual example of how experimental design can enhance a marketing campaign. Last year, Crayola, a division of Binney & Smith and Hallmark, launched a creative arts and activities portal on the Internet called Crayola.com. The site's target customers include parents and educators, and it sells art supplies and offers arts-and-crafts project ideas and classroom lesson plans. We conducted an experimental design to help Crayola attract people to the site and convert browsers into buyers.

Based on Crayola's experience and market knowledge, we identified a set of stimuli that could be varied to drive traffic to Crayola.com and induce purchases. One of these stimuli was an e-mail to parents and teachers. To test various components of the e-mail content and format, we relied on the best judgment of Crayola's marketing staff about the messages that were most likely to appeal to the target markets. The e-mail included five key attributes that seemed likely to affect the customer response rate, which would be measured by click-throughs to the Crayola Web site. These attributes and their related levels were:
• Two subject lines: "Crayola.com Survey" and "Help Us Help You."
• Three salutations: "Hi [user name] :-)," "Greetings!" and "[user name]."
• Two calls to action: "As Crayola.com grows, we'd like to get your thoughts about the arts and how you use art materials" and "Because you as an educator have a special understanding of the arts and how art materials are used, we invite you to help build Crayola.com."
• Three promotions: a chance to participate in a monthly drawing to win $100 worth of Crayola products; a monthly drawing for one of ten $25 Amazon.com gift certificates; and no promotion.

Estimating a Response Model

Logistic regression analysis is a statistical technique that allows an experimenter to analyze the impact of each stimulus in an experiment. The formula assumes that the outcome – in Biz Ware's case, the customer response rate – is a function of the attributes – in Biz Ware's case, price, message, and promotion. Here's what Biz Ware's generic equation looks like:

$$\log\left(\frac{\text{response rate}}{1 - \text{response rate}}\right) = a + b_1(\text{price}) + b_2(\text{message}) + b_3(\text{promotion})$$

We plug Biz Ware's customer response data into this equation, using SAS software to estimate the coefficients (a, b1, b2, and b3). For price we can drop a number into the formula. For message and promotion, which are qualitative attributes, we assign a dummy code – zero or one, since there are only two levels for each attribute. It does not matter which level is assigned which number. For Biz Ware, the equation looks like this:

$$\log\left(\frac{\text{response rate}}{1 - \text{response rate}}\right) = 10.3 - 8.1(\text{price}) + 0.6(\text{message}) + 0.9(\text{promotion})$$

The coefficients tell us a few things: Higher price has a negative impact on demand (hence, the coefficient b1 is −8.1) and the effect of promotion is greater than the effect of message (because 0.9 is greater than 0.6). But more important, these coefficients allow us to apply the equation to extrapolate from the data collected and predict responses for all 16 cells.
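The Biz Ware equation in the sidebar can be applied to all 16 cells with a short Python sketch. Two inputs here are assumptions rather than statements from the article: price is entered in hundreds of dollars (1.50 to 1.80), and level 1 of message and promotion is coded 0 while level 2 is coded 1. The logarithm is taken to be the natural log. The function and variable names are hypothetical.

```python
from math import exp

# Coefficients as published in the sidebar.
INTERCEPT, B_PRICE, B_MESSAGE, B_PROMOTION = 10.3, -8.1, 0.6, 0.9

# Assumption (not stated in the article): price in hundreds of dollars.
PRICES = {1: 1.50, 2: 1.60, 3: 1.70, 4: 1.80}

def response_rate(price_level, message_level, promotion_level):
    """Invert the log-odds equation to get a predicted response rate."""
    log_odds = (
        INTERCEPT
        + B_PRICE * PRICES[price_level]
        + B_MESSAGE * (message_level - 1)      # assumed dummy code: level 2 -> 1
        + B_PROMOTION * (promotion_level - 1)  # assumed dummy code: level 2 -> 1
    )
    return 1 / (1 + exp(-log_odds))

# Column order of the article's charts, as (promotion, message) pairs.
COLUMNS = [(1, 1), (1, 2), (2, 1), (2, 2)]

modeled = {
    price: [response_rate(price, msg, promo) for promo, msg in COLUMNS]
    for price in PRICES
}

for price, rates in modeled.items():
    print(f"PRICE ({price})", "  ".join(f"{rate:5.1%}" for rate in rates))
```

Under these assumptions the predictions agree with the modeled-responses chart to within about one percentage point. The small gaps are what one would expect from coefficients published to a single decimal place.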


Crayola Draws Results

Sample E-mail

Crayola.com marketers wanted to measure how customer response would be affected by different variations (or levels) of five main e-mail attributes: two subjects, three salutations, two calls to action, three promotions, and two closings. Below is one of the 72 possible combinations. An experimental design was developed so that only 16 combinations had to be tested.

subject: Help Us Help You
salutation: Greetings!
call to action: Because you as an educator have a special understanding of the arts and how art materials are used, we invite you to help build Crayola.com.
By answering ten quick questions, you'll be helping to make sure improvements we make to Crayola.com meet your needs. Simply follow this link: http://___________. (If this does not work, please cut and paste the link into your browser's address bar.)
promotion: As a thank you, you will be entered into our monthly drawing to win one of ten $25 Amazon.com gift certificates. By completing our survey, you'll be automatically entered.
Thanks for your assistance. Be sure to check back often for new fun, creative, and colorful ideas and solutions.
closing: Yours, Crayola.com

Parents' Response Rates

The company measured the responses it received and through statistical modeling could quickly pinpoint which stimuli appealed most to its target customers – in this case, parents.

The subject line "Crayola.com Survey," for example, was more effective at creating positive responses than "Help Us Help You." The response rate of the former was 7.5% higher than that of the latter, all else being equal.

ATTRIBUTE        LEVEL                                      CHANGE IN RESPONSE RATE
subject          Crayola.com Survey                         7.5%
                 Help Us Help You                           0.0%
salutation       Hi [user name] :-)                         2.7%
                 Greetings!                                 0.0%
                 [user name]                                3.4%
call to action   As Crayola.com grows…                      0.0%
                 Because you are…                           3.5%
promotion        $100 product drawing                       8.4%
                 $25 Amazon.com gift certificate drawing    5.2%
                 No offer                                   0.0%
closing          Crayola.com                                0.0%
                 EducationEditor@Crayola.com                1.2%

Script Attributes

The combination of attributes that got the best response from parents was more than three times as effective as the combination of attributes that got the worst response. Below are the best and worst script attributes of the 72 possible combinations.

ATTRIBUTE        BEST RESPONSE                  WORST RESPONSE
subject          Crayola.com Survey             Help Us Help You
salutation       User name                      Greetings!
call to action   Because you are…               As Crayola.com grows…
promotion        $100 product drawing           No offer
closing          EducationEditor@Crayola.com    Crayola.com
response rate    33.7%                          9.7%

The best script is 3.5 times more effective than the worst.
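The figures in this exhibit fit a simple additive reading: start from the worst script's 9.7% and add the change in response rate for each chosen level. The Python sketch below scores all 72 scripts that way. Treating the changes as additive on the response-rate scale is an assumption made for illustration, though the exhibit's best and worst figures are consistent with it. The names `lifts` and `predicted_rate` are hypothetical.

```python
from itertools import product
from math import prod

# Change in response rate (percentage points) per level, from the exhibit.
lifts = {
    "subject": {"Crayola.com Survey": 7.5, "Help Us Help You": 0.0},
    "salutation": {"Hi [user name] :-)": 2.7, "Greetings!": 0.0, "[user name]": 3.4},
    "call to action": {"As Crayola.com grows...": 0.0, "Because you are...": 3.5},
    "promotion": {
        "$100 product drawing": 8.4,
        "$25 Amazon.com gift certificate drawing": 5.2,
        "No offer": 0.0,
    },
    "closing": {"Crayola.com": 0.0, "EducationEditor@Crayola.com": 1.2},
}

WORST_RATE = 9.7  # response rate of the worst script, from the exhibit

# Number of possible scripts: the product of the level counts.
n_combinations = prod(len(levels) for levels in lifts.values())

def predicted_rate(script):
    """Assumed additive model: baseline plus the change for each level."""
    return WORST_RATE + sum(lifts[attr][level] for attr, level in script.items())

# Every script as a dict mapping attribute name to chosen level.
scripts = [dict(zip(lifts, combo)) for combo in product(*lifts.values())]

best = max(scripts, key=predicted_rate)
worst = min(scripts, key=predicted_rate)

print(n_combinations, "possible scripts")
print("best :", round(predicted_rate(best), 1), best)
print("worst:", round(predicted_rate(worst), 1), worst)
```

The same scoring could rank any of the untested scripts, which is the practical payoff of testing 16 e-mails rather than all 72.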


• Two closings: "Crayola.com" and "EducationEditor@Crayola.com."

Taking into account all the levels of each attribute, there were a total of 72 possible versions of the e-mail (2 × 3 × 2 × 3 × 2 = 72). While Crayola might have been able to test all 72 variations, the process would have been cumbersome and expensive. Instead, we created a subset of 16 e-mails to represent the 72 possible combinations. Over a two-week period, we sent the 16 types of e-mail to randomly selected samples of customers and tracked and analyzed their responses. The results were compelling.

The "best" e-mail of the 72 possible scripts yielded a positive response rate of about 34% and was more than three times as effective at attracting parents as the "worst" e-mail, which yielded a positive response rate of only about 10%. (See the exhibit "Crayola Draws Results.") Among educators, the best e-mail script was nearly twice as effective as the worst script, with response rates of 35% for the best e-mail versus 20% for the worst.

We also conducted similar experiments with Crayola to test the effects of three different banner ads on the home page, as well as product, price, and promotions offered at the on-line store. The best combination was nearly four times as effective in converting shoppers into buyers as the worst combination and nearly doubled revenues per buyer.

Uncovering the Unexpected

In the Crayola experiments, as well as in other tests, our results have yielded surprising insights. When we ask experienced marketers to predict which stimuli are likely to elicit the best responses, few get it right. Crayola, for example, was surprised to find that a price reduction on a product or line of products generated sufficient volume to create higher revenues while also maintaining the site's profitability. Conventional wisdom would have suggested that raising prices would be a more effective way to increase revenue.

Of course, the findings from tests like these can't be generalized. For instance, compare the e-mail tests we conducted at Crayola with similar tests we performed for Cable & Wireless. At Crayola, e-mails containing no promotional offer drew poorly compared with e-mails containing either of the two promotional offers tested (a $100 product drawing and a $25 Amazon.com gift certificate). At Cable & Wireless, though, e-mails with no promotional offer drew the best click-through rate. But that's part of the value of experimental design: It allows marketers to move beyond rules of thumb or experience to pinpoint the marketing approaches that work best with a particular audience in a particular marketplace at a particular moment in time.

The Expanding Marketing Universe

In the world of marketing experimentation, the Crayola tests are relatively simple. We tested only a handful of marketing attributes, with a relatively small number of levels for each. Even so, the customer impact was impressive, with the best combinations of stimuli drawing double, triple, or quadruple response rates compared with the worst combinations.

The approach we took with Crayola can be extended and applied to more attributes and more levels. It's not unusual for a company to test ten or more attributes, including some with as many as eight levels. A credit card company, for example, might be interested in testing six teaser rates, six cobrands, four different annual percentage rates, six promotions, four insurance packages, four modes of communication, eight direct mail packages, and four mailing schedules. This represents a possible set of 442,368 distinct marketing stimuli, obviously too large a universe for a test-and-control-cell approach. But by using experimental design to select and test a manageable number – say 128 combinations of these variables – the credit card company could estimate with great accuracy the customer reaction to all 442,368 combinations.

And this is by no means the upper limit of the usefulness of experimental design. The responses being sought from customers can be more complex as well. Customers may have multiple options to choose from rather than a simple "yes" or "no" response – for example, a choice between one-year, two-year, and three-year subscriptions or no purchase.

Different types of experimental designs can be used when the experimental objectives vary. For example, so-called screening designs can efficiently test very large numbers of attributes to select a smaller number to investigate in more detail. Subsequent testing can employ more levels for each of a smaller collection of attributes. Response surface designs are used in food testing in which multiple dimensions such as sweet, salty, crunchy, and sour have an ideal level somewhere between "too much" and "not enough." The design lets testers estimate the ideal combination of tastes and textures.

Getting Results

Naturally, there are limits to the power of experimental design. This approach requires thoughtful planning to hypothesize what you are looking for and to rule out other possibilities before the experiment can begin.

One caution is that many experimental designs rely on "main effects" models. That is, they assume that interaction effects – the impact that one variable can have on another variable – are negligible. This is usually a reasonable assumption when you're dealing with complex combinations of three, four, and five variables at a time. However, interactions between two variables can be important and can be tested. For example, suppose you find that free samples have a more positive impact on a product's sales than do coupons. You may also learn that free samples provide an even greater lift when they are handed out in the stores as opposed to being sent through the mail. One variable, the chosen distribution channel, interacts meaningfully with another variable, the promotion itself.

Experimental design also calls for substantive knowledge to frame the problem, careful application of theoretically sound methods, and skillful interpretation of the results in the appropriate context. Some knowledge of basic statistical methods is necessary, of course. But even more important is a nuanced understanding of customers and the ability to form reasonable assumptions concerning which attributes and levels should be tested and which shouldn't. Experimental design should be one part of a continuous test-and-learn cycle.

Marketing is, and always will be, a creative endeavor. But it doesn't have to be so mysterious. As marketing noise and advertising clutter continue to increase, marketers will find that scientific experimentation will allow them to better communicate with their customers – and substantially raise the odds that their marketing efforts will pay off.

Reprint r0109k
To place an order, call 1-800-988-0886.
