Boost Your Marketing ROI With Experimental Design: by Eric Almquist and Gordon Wyner
Reprint r0109k
October 2001
Copyright © 2001 by Harvard Business School Publishing Corporation. All rights reserved.
TOOLKIT • Experimental Design
Eric Almquist and Gordon Wyner are vice presidents of Mercer Management Consulting, based in Boston.

Most marketing [...] promotional offers. But if they try to evaluate more than just a couple of campaign alternatives, traditional testing techniques quickly grow prohibitively expensive.

Consider the “test and control cell” method, which is the basis for almost all direct mail and e-commerce testing done today. It starts with a control cell for, say, a base price, then adds test cells for higher and lower prices. To test five price points, six promotions, four banner ad colors, and three ad placements, you’d need a control cell and 359 test cells (5 × 6 × 4 × 3 = 360). And that’s a relatively simple case. In credit card marketing, the possible combinations of brands, cobrands, annual percentage rates, teaser rates, marketing messages, and mail packaging can quickly add up to hundreds of thousands of possible bundles of attributes. Clearly, you cannot test them all.

There’s another problem with this brute force approach: It typically does not reveal which individual variables are causing higher (or lower) responses from customers, since most control-cell tests reflect the combined effect of more than one simple variable. Is it the lower price that prompted the higher response? The promotional deal? The new advertising message? There’s no way to know.

The problem has been magnified recently as companies have gained the ability to change their marketing stimuli much more quickly. Just a few years ago, changing prices and promotions on a few cans of food in the supermarket, for example, required the time-consuming application of sticky labels and the distribution of paper coupons. Today, a store can adjust prices and promotions electronically by simply reprogramming its checkout scanners. The Internet has further heightened marketing complexity by reducing the physical constraints on pricing, packaging, and communications. In the extreme, an on-line retailer could change the prices and promotion of every product it offers every minute of the day. It can also change the color of banner ads, the tone of promotional messages, and the content of outbound e-mails with relative ease.

The increasing complexity of the stimulus-response network, as we call it, means that marketers have more communication alternatives than ever before – and that the portion of alternatives they actually test is growing even smaller. But this greater complexity can also mean greater flexibility in your marketing programs – if you can uncover which changes in the stimulus-response network actually drive customer behavior. One way to do this is through scientific experimentation.

A New Marketing Science

The science of experimental design lets people project the impact of many stimuli by testing just a few of them. By using mathematical formulas to select and test a subset of combinations of variables that represent the complexity of all the original variables, marketers can model hundreds or even thousands of stimuli accurately and efficiently.

This is not the same thing as an after-the-fact analysis of consumer behavior, sometimes referred to as data mining. Experimental design is distinguished by the fact that you define and control the independent variables before putting them into the marketplace, trying out different kinds of stimuli on customers rather than observing them as they have naturally occurred. Because you control the introduction of stimuli, you can establish that differences in response can be attributed to the stimulus in question, such as packaging or color of a product, and not to other factors, such as limited availability of the product. In other words, experimental design reveals whether variables caused a certain behavior as opposed to simply being correlated with the behavior.

While experimental design itself isn’t new, few marketing executives have used the technique – either because they haven’t understood it or because day-to-day marketing operations have gotten in the way. But new technologies are making experimental design more accessible, more affordable, and easier to administer. (For more information on the genesis of this type of testing, see the sidebar “The Origins of Experimental Design.”) Companies today can collect detailed customer information much more easily than ever before and can use those data to build models that predict customer response with greater speed and accuracy.

Today’s most popular experimental-design methods can be adapted and customized using guidelines from standard reference textbooks such as Statistics for Experimenters by George E. P. Box, J. Stuart Hunter, and William G. Hunter; and from off-the-shelf software packages such as the Statistical Analysis System, the primary product of SAS Institute. A handful of companies have already applied some form of experimental design to marketing. They include financial firms such as Chase, Household Finance, and Capital One, telecommunications provider Cable & Wireless, and Internet portal America Online.

Applying experimental-design methods requires business judgment and a degree of mathematical and statistical sophistication – both of which are well within the reach of most large corporations and many smaller organizations. The experimental design technique is particularly useful for companies that have large numbers of customers and that face rapid and constant change in their markets and product offers. Internet retailers, for instance, benefit greatly from experimentation because on-line customers tend to be fickle. Attracting browsers to a Web site and then converting them into buyers has proved very expensive and largely ineffective. Getting it right the first time is nearly impossible, so experimentation is critical. The rigorous and robust nature of experimental design, combined with the increasing challenges of marketing to oversaturated consumers, will make widespread adoption of this new marketing science only a matter of time in most industries.

The Origins of Experimental Design

Experimental design methodologies – some dating as far back as the nineteenth century – have been used for years across many fields, including process manufacturing, psychology, and pharmaceutical clinical trials, and they are well known to most statisticians. Sir Ronald A. Fisher was among the first statisticians to introduce the concepts of randomization and analysis of variance. In the early 1900s, he worked at the Rothamsted Agricultural Experimental Station outside London. His focus was on increasing agricultural yields.

Another major breakthrough in the field came with the work of U.S. economist and Nobel laureate Daniel L. McFadden in the 1970s, who drew on psychological theories to explain that consumer choices are a function of the available alternatives and other consumer characteristics. In helping to design San Francisco’s BART commuter rail system, McFadden analyzed the way people evaluate trade-offs in travel time and cost and how those trade-offs affect their decisions about means of transportation. He was able to help forecast demand for BART and determine where best to locate stations. The model was quite accurate, predicting a 6.3% share of commuter travel for BART, which was close to the actual 6.2% share the system achieved.

The ABCs of Experimental Design

To illustrate how experimental design works, let’s consider the following simple case. A company, which we’ll call Biz Ware, is marketing a software product to other companies. Before launching a national campaign, Biz Ware wants to test three different variables, or attributes, of a sales message for the product: price, message, and promotion. Each of the three attributes can have a number of variations, or levels. Suppose the three attributes and their various levels are as follows:

Price
LEVEL   (1)    (2)    (3)    (4)
        $150   $160   $170   $180

Message
(1) SPEED: “Biz Ware lets you manage customer relationships in just minutes a day.”
(2) POWER: “Biz Ware can be expanded to handle a virtually infinite number of customer files.”

Promotion
(1) 30-DAY TRIAL: “You can try Biz Ware now for 30 days at no risk.”
(2) FREE GIFT: “Buy Biz Ware now and receive our contact manager software absolutely free.”

The total number of possible combinations can be determined by multiplying the number of levels of each attribute. The three attributes Biz Ware wants to test yield a total of 16 possible combinations since 4 × 2 × 2 = 16. All 16 combinations can be mapped in the cells of a simple chart like the one below.

Biz Ware’s Universe of Possible Combinations
PROMOTION   (1)   (1)   (2)   (2)
MESSAGE     (1)   (2)   (1)   (2)
PRICE (1)    X     X     X     X
PRICE (2)    X     X     X     X
PRICE (3)    X     X     X     X
PRICE (4)    X     X     X     X

It’s not necessary to test them all. Instead, using what’s called a fractional factorial design, Biz Ware selects a subset of eight combinations to test.

“Factorial” means Biz Ware “crosses” each attribute (price, promotion, and message) with each of the others in gridlike fashion, as in the universe chart above. “Fractional” means Biz Ware then chooses a subset of those combinations in which the attributes are independent (either totally or partially) of each other. The following chart shows the resulting experimental design, with Xs marking the cells to be tested. Note that each level of each attribute is paired in at least one instance with each level of the other attributes. So, for example, price at $150 is matched at some point with each promotion and each message. This makes it possible to unambiguously separate the influence of each variable on customer response.

Biz Ware’s Experimental Design
PROMOTION (1) (1) (2) (2)
MESSAGE (1) (2) (1) (2)
PRICE (1) X X
PRICE (2) X X
PRICE (3) X X
PRICE (4) X X

The eight chosen combinations are now tested, using one of several media: scripts at a call center, banner ads on Biz Ware’s Web site, e-mail messages to prospective customers, and direct mail solicitations. (In general, you should test using the medium you ultimately expect to use for your marketing campaign, although you can also choose multiple media and treat the choice of media as an attribute in the experiment.)

How big should the sample size be to make the experiment valid? The answer depends on several characteristics of the test and the target market. These may include the expected response rate, based on the results of past marketing efforts; the expected variation among subgroups of the market; and the complexity of the design, including the number of attributes and levels. In any event, the sample size should be large enough so that marketers can statistically detect the impact of the attributes on customer response. Since increasing the complexity and size of an experiment generally adds cost, marketers should determine the minimum sample size necessary to achieve a degree of precision that is useful for making business decisions. (There are standard guidelines in statistics that can help marketers answer the question of sample size.) We’ve conducted complex experiments by sending e-mail solicitations to lists of just 20,000 names, where 1,250 people each receive one of 16 stimuli.
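One common textbook way to size each test cell, offered here as a rough sketch rather than the specific guideline the authors have in mind, is the normal-approximation formula for comparing two response rates. The function name, the example rates (10% versus 14%), the significance level, and the power below are illustrative assumptions, not figures from the Biz Ware case.

```python
import math
from statistics import NormalDist


def cell_size_for_two_rates(p_base, p_test, alpha=0.05, power=0.80):
    """Approximate names needed per cell to tell two response rates apart.

    Uses the standard normal-approximation formula for a two-sided test of
    two proportions. All default values are illustrative assumptions.
    """
    if p_base == p_test:
        raise ValueError("the two rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)          # quantile for desired power
    variance = p_base * (1 - p_base) + p_test * (1 - p_test)
    n = (z_alpha + z_power) ** 2 * variance / (p_base - p_test) ** 2
    return math.ceil(n)


# Hypothetical example: detect a lift from a 10% to a 14% response rate.
per_cell = cell_size_for_two_rates(0.10, 0.14)
print(per_cell, "names per cell;", 16 * per_cell, "names for 16 cells")
```

Because the required size grows with the inverse square of the difference to be detected, halving that difference roughly quadruples the number of names each cell needs.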
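The multiplication rule and the fractional factorial idea from “The ABCs of Experimental Design” can be sketched in a few lines. The parity rule used to pick the half fraction below is one simple, assumed way to get a balanced subset of eight cells; it is not necessarily the set of cells Biz Ware tested, and the variable and function names are hypothetical.

```python
from itertools import product
from math import prod

# Levels from the Biz Ware example, written as 0-based indices.
levels = {"price": 4, "promotion": 2, "message": 2}

# Full factorial: every level of every attribute crossed with every other.
full = list(product(*(range(n) for n in levels.values())))
assert len(full) == prod(levels.values())  # 4 x 2 x 2 = 16

# Assumed selection rule: keep cells whose level indices sum to an even number.
half = [cell for cell in full if sum(cell) % 2 == 0]


def every_pair_of_levels_meets(design, n_attributes):
    """True if each level of each attribute appears with each level of the others."""
    for a in range(n_attributes):
        for b in range(a + 1, n_attributes):
            seen = {(cell[a], cell[b]) for cell in design}
            wanted = {(cell[a], cell[b]) for cell in full}
            if seen != wanted:
                return False
    return True


print(len(full), "possible cells;", len(half), "cells in the half fraction")
print("balanced:", every_pair_of_levels_meets(half, len(levels)))

# The same multiplication rule gives the size of the test-and-control example:
# five prices, six promotions, four banner colors, three ad placements.
print(prod([5, 6, 4, 3]), "cells in total")
```

Any selection rule that keeps every pairing of levels represented would serve the same purpose; the parity rule is used here only because it is easy to read.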
Within a few days or weeks, the experiment’s results come in. Biz Ware’s marketers note the number and percentage of positive responses to each of the eight tested offers.

Biz Ware’s Design Results
PROMOTION (1) (1) (2) (2)
MESSAGE (1) (2) (1) (2)
PRICE (1) 14% 40%
PRICE (2) 9% 13%
PRICE (3) 6% 10%
PRICE (4) 1% 7%

At a glance, you might intuitively understand that price has a significant impact on the response to the various offers, since the lower price offers (Price 1 and Price 2) generally drew much better response rates than the higher price offers (Price 3 and Price 4). But statistical modeling, using standard software, makes it possible to assess the impact of each variable with far greater precision. Indeed, by using a method known as logistic regression analysis, Biz Ware can extrapolate from the results of the experiment the probable response rates for all 16 cells.

Biz Ware’s Modeled Responses
PROMOTION   (1)   (1)   (2)   (2)
MESSAGE     (1)   (2)   (1)   (2)
PRICE (1)   14%   23%   28%   42%
PRICE (2)    7%   12%   15%   24%
PRICE (3)    3%    6%    7%   12%
PRICE (4)    1%    3%    3%    6%

With this complete picture, it becomes clear that some combinations are far more likely to be effective than others. The combination of Price 1 ($150), Message 2 (Power), and Promotion 2 (Free Gift) is clearly the most attractive to consumers. But is it the right combination for Biz Ware? That’s where business judgment comes in: The company’s management will need to analyze the experiment’s implications for its resources, revenue, and profitability.

Experimentation at Crayola

Let’s look at an actual example of how experimental design can enhance a marketing campaign. Last year, Crayola, a division of Binney & Smith and Hallmark, launched a creative arts and activities portal on the Internet called Crayola.com. The site’s target customers include parents and educators, and it sells art supplies and offers arts-and-crafts project ideas and classroom lesson plans. We conducted an experimental design to help Crayola attract people to the site and convert browsers into buyers.

Based on Crayola’s experience and market knowledge, we identified a set of stimuli that could be varied to drive traffic to Crayola.com and induce purchases. One of these stimuli was an e-mail to parents and teachers. To test various components of the e-mail content and format, we relied on the best judgment of Crayola’s marketing staff about the messages that were most likely to appeal to the target markets. The e-mail included five key attributes that seemed likely to affect the customer response rate, which would be measured by click-throughs to the Crayola Web site. These attributes and their related levels were:
• Two subject lines: “Crayola.com Survey” and “Help Us Help You.”
• Three salutations: “Hi [user name] :-),” “Greetings!” and “[user name].”
• Two calls to action: “As Crayola.com grows, we’d like to get your thoughts about the arts and how you use art materials” and “Because you as an educator have a special understanding of the arts and how art materials are used, we invite you to help build Crayola.com.”
• Three promotions: a chance to participate in a monthly drawing to win $100 worth of Crayola products; a monthly drawing for one of ten $25 Amazon.com gift certificates; and no promotion.
• Two closings: “Crayola.com” and “EducationEditor@Crayola.com.”

Taking into account all the levels of each attribute, there were a total of 72 possible versions of the e-mail (2 × 3 × 2 × 3 × 2 = 72). While Crayola might have been able to test all 72 variations, the process would have been cumbersome and expensive. Instead, we created a subset of 16 e-mails to represent the 72 possible combinations. Over a two-week period, we sent the 16 types of e-mail to randomly selected samples of customers and tracked and analyzed their responses. The results were compelling.

The “best” e-mail of the 72 possible scripts yielded a positive response rate of about 34% and was more than three times as effective at attracting parents as the “worst” e-mail, which yielded a positive response rate of only about 10%. (See the exhibit “Crayola Draws Results.”) Among educators, the best e-mail script was nearly twice as effective as the worst script, with response rates of 35% for the best e-mail versus 20% for the worst.

Crayola Draws Results

Sample E-mail

Crayola.com marketers wanted to measure how customer response would be affected by different variations (or levels) of five main e-mail attributes: two subjects, three salutations, two calls to action, three promotions, and two closings. Below is one of the 72 possible combinations. An experimental design was developed so that only 16 combinations had to be tested.

subject: Help Us Help You
salutation: Greetings!
call to action: Because you as an educator have a special understanding of the arts and how art materials are used, we invite you to help build Crayola.com.
By answering ten quick questions, you’ll be helping to make sure improvements we make to Crayola.com meet your needs. Simply follow this link: http://___________. (If this does not work, please cut and paste the link into your browser’s address bar.)
promotion: As a thank you, you will be entered into our monthly drawing to win one of ten $25 Amazon.com gift certificates. By completing our survey, you’ll be automatically entered.
Thanks for your assistance. Be sure to check back often for new fun, creative, and colorful ideas and solutions.
closing: Yours,
Crayola.com

The best script is 3.5 times more effective than the worst.

We also conducted similar experiments with Crayola to test the effects of three different banner ads on the home page, as well as product, price, and promotions offered at the on-line store. The best combination was nearly four times as effective in converting shoppers into buyers as the worst combination and nearly doubled revenues per buyer.

Uncovering the Unexpected

In the Crayola experiments, as well as in other tests, our results have yielded surprising insights. When we ask experienced marketers to predict which stimuli are likely to elicit the best responses, few get it right. Crayola, for example, was surprised to find that a price reduction on a product or line of products generated sufficient volume to create higher revenues while also maintaining the site’s profitability. Conventional wisdom would have suggested that raising prices would be a more effective way to increase revenue.

Of course, the findings from tests like these can’t be generalized. For instance, compare the e-mail tests we conducted at Crayola with similar tests we performed for Cable & Wireless. At Crayola, e-mails containing no promotional offer drew poorly compared with e-mails containing either of the two promotional offers tested (a $100 product drawing and a $25 Amazon.com gift certificate). At Cable & Wireless, though, e-mails with no promotional offer drew the best click-through rate. But that’s part of the value of experimental design: It allows marketers to move beyond rules of thumb or experience to pinpoint the marketing approaches that work best with a particular audience in a particular marketplace at a particular moment in time.

The Expanding Marketing Universe

In the world of marketing experimentation, the Crayola tests are relatively simple. We tested only a handful of marketing attributes, with a relatively small number of levels for each. Even so, the customer impact was impressive, with the best combinations of stimuli drawing double, triple, or quadruple response rates compared with the worst combinations.

The approach we took with Crayola can be extended and applied to more attributes and more levels. It’s not unusual for a company to test ten or more attributes, including some with as many as eight levels. A credit card company, for example, might be interested in testing six teaser rates, six cobrands, four different annual percentage rates, six promotions, four insurance packages, four modes of communication, eight direct mail packages, and four mailing schedules. This represents a possible set of 442,368 distinct marketing stimuli, obviously too large a universe for a test-and-control-cell approach. But by using experimental design to select and test a manageable number – say 128 combinations of these variables – the credit card company could estimate with great accuracy the customer reaction to all 442,368 combinations.

And this is by no means the upper limit of the usefulness of experimental design. The responses being sought from customers can be more complex as well. Customers may have multiple options to choose from rather than a simple “yes” or “no” response – for example, a choice between one-year, two-year, and three-year subscriptions or no purchase.

Different types of experimental designs can be used when the experimental objectives vary. For example, so-called screening designs can efficiently test very large numbers of attributes to select a smaller number to investigate in more detail. Subsequent testing can employ more levels for each of a smaller collection of attributes. Response surface designs are used in food testing in which multiple dimensions such as sweet, salty, crunchy, and sour have an ideal level somewhere between “too much” and “not enough.” The design lets testers estimate the ideal combination of tastes and textures.

Getting Results

Naturally, there are limits to the power of experimental design. This approach requires thoughtful planning to hypothesize what you are looking for and to rule out other possibilities before the experiment can begin.

One caution is that many experimental designs rely on “main effects” models. That is, they assume that interaction effects – the impact that one variable can have on another variable – are negligible. This is usually a reasonable assumption when you’re dealing with complex combinations of three, four, and five variables at a time. However, interactions between two variables can be important and can be tested. For example, suppose you find that free samples have a more positive impact on a product’s sales than do coupons. You may also learn that free samples provide an even greater lift when they are handed out in the stores as opposed to being sent through the mail. One variable, the chosen distribution channel, interacts meaningfully with another variable, the promotion itself.

Experimental design also calls for substantive knowledge to frame the problem, careful application of theoretically sound methods, and skillful interpretation of the results in the appropriate context. Some knowledge of basic statistical methods is necessary, of course. But even more important is a nuanced understanding of customers and the ability to form reasonable assumptions concerning which attributes and levels should be tested and which shouldn’t. Experimental design should be one part of a continuous test-and-learn cycle.

Marketing is, and always will be, a creative endeavor. But it doesn’t have to be so mysterious. As marketing noise and advertising clutter continue to increase, marketers will find that scientific experimentation will allow them to better communicate with their customers – and substantially raise the odds that their marketing efforts will pay off.

Estimating a Response Model

Logistic regression analysis is a statistical technique that allows an experimenter to analyze the impact of each stimulus in an experiment. The formula assumes that the outcome – in Biz Ware’s case, the customer response rate – is a function of the attributes – in Biz Ware’s case, price, message, and promotion. Here’s what Biz Ware’s generic equation looks like: [...]
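As a rough illustration of the main-effects logistic form described in “Estimating a Response Model,” the sketch below backs out effects on the log-odds scale from the first row and first column of the “Biz Ware’s Modeled Responses” chart and then rebuilds all 16 cells. This is an after-the-fact reconstruction under the assumption that the attributes simply add up on the log-odds scale, with no interactions; it is not the authors’ fitted equation, and the helper names are hypothetical.

```python
import math

# "Biz Ware's Modeled Responses" (percent). Columns are (promotion, message):
# (1,1), (1,2), (2,1), (2,2); rows are price levels 1-4.
modeled = [
    [14, 23, 28, 42],
    [7, 12, 15, 24],
    [3, 6, 7, 12],
    [1, 3, 3, 6],
]


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1 - p))


def inv_logit(x):
    """Probability corresponding to a log-odds value."""
    return 1 / (1 + math.exp(-x))


# Baseline cell: price 1, promotion 1, message 1.
intercept = logit(modeled[0][0] / 100)
# Main effects, each read off a single published cell relative to the baseline.
message_effect = logit(modeled[0][1] / 100) - intercept    # message 2 vs. 1
promotion_effect = logit(modeled[0][2] / 100) - intercept  # promotion 2 vs. 1
price_effects = [logit(row[0] / 100) - intercept for row in modeled]

rebuilt = []
for price_effect in price_effects:
    row = []
    for promo, msg in [(0, 0), (0, 1), (1, 0), (1, 1)]:
        log_odds = (intercept + price_effect
                    + promo * promotion_effect + msg * message_effect)
        row.append(100 * inv_logit(log_odds))
    rebuilt.append(row)

for published, estimate in zip(modeled, rebuilt):
    print(published, [round(v, 1) for v in estimate])

largest_gap = max(abs(p - e)
                  for prow, erow in zip(modeled, rebuilt)
                  for p, e in zip(prow, erow))
print("largest gap (percentage points):", round(largest_gap, 2))
```

Because the published figures are rounded to whole percentages, the rebuilt cells can only be expected to land near the chart, not exactly on it.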
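The caution about “main effects” models in “Getting Results” can be made concrete with a small calculation. The response rates below are invented purely for illustration, following the free-samples-versus-coupons example, and the log-odds contrast is one conventional way to express an interaction, assumed here rather than taken from the article.

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1 - p))


# Hypothetical response rates, invented purely for illustration.
rates = {
    ("coupon", "mail"): 0.05,
    ("coupon", "store"): 0.06,
    ("sample", "mail"): 0.08,
    ("sample", "store"): 0.14,
}

# Lift from switching the channel to in-store, separately for each promotion.
channel_lift = {
    promo: logit(rates[(promo, "store")]) - logit(rates[(promo, "mail")])
    for promo in ("coupon", "sample")
}

# Interaction contrast on the log-odds scale: zero would mean the channel
# matters equally for both promotions, so a main-effects model would suffice.
interaction = channel_lift["sample"] - channel_lift["coupon"]

for promo, lift in channel_lift.items():
    print(f"{promo}: in-store lift in log-odds = {lift:.2f}")
print(f"interaction contrast = {interaction:.2f}")
```

Estimating this contrast requires all four promotion-and-channel cells, which is why a design has to plan for an interaction in advance in order to test it.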