
Research: When A/B Testing Doesn’t Tell You the Whole Story


by Eva Ascarza
June 23, 2021



Summary. When it comes to churn prevention, marketers traditionally start by identifying which customers
are most likely to churn, and then running A/B tests to determine whether a proposed retention intervention
will be effective at retaining those high-risk customers.

Every year, marketers spend billions of dollars on campaigns meant to attract, retain, and upsell
customers. Yet despite this massive investment, it can be extremely challenging to determine
how effective these initiatives actually are, and how they can be improved. One common method
of measuring a campaign’s Return on Investment (ROI) is to run an A/B test: Marketers will
target customers with two different interventions, and then compare results between the two
groups. With the right approach to analysis, these A/B tests can provide useful insights — but
they also have the potential to be highly misleading.
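
For concreteness, the standard readout of such a test is a comparison of retention proportions between the
two groups. Here is a minimal sketch using the proportions_ztest function from the statsmodels library; the
counts are hypothetical:

    # A minimal A/B test readout: compare renewal (retention) rates
    # between treatment and control. All counts here are hypothetical.
    from statsmodels.stats.proportion import proportions_ztest

    renewed = [430, 418]    # renewals in the treatment / control group
    totals = [1000, 1000]   # members assigned to each group

    z_stat, p_value = proportions_ztest(count=renewed, nobs=totals)
    print(f"treatment retention: {renewed[0] / totals[0]:.1%}")
    print(f"control retention:   {renewed[1] / totals[1]:.1%}")
    print(f"z = {z_stat:.2f}, p = {p_value:.3f}")

A non-significant result from a readout like this is exactly the "no overall effect" conclusion that, as the
example below shows, can mask real subgroup effects.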

To understand the shortcomings of how A/B tests are often employed, it’s helpful to consider a
hypothetical example. Imagine you work for a large arts organization that is concerned about
declining retention rates among its members. You’re thinking about sending a small gift along
with the renewal notification to the members you’ve determined are at a higher risk of canceling
their memberships, but since that comes at a cost, you want to make sure the intervention is
effective before rolling it out more broadly. So you decide to run a small pilot campaign,
randomly choosing one group of “at risk” members to receive a gift and one not to, in order to
see if those who receive the gift are more likely to renew.

Now, say you don’t find any difference in retention rates between members who receive the gift
and those in the control group. If you ended your analysis there, it would likely lead you to
cancel the gift program, since the data seems to suggest that sending gifts has no impact on
retention. But upon closer examination of the data, you might find that for a certain subgroup of
customers — such as those who had visited the venue in the last year — the gift did in fact
significantly increase their chances of renewing, while for customers who had not visited the
venue in a long time, the gift actually made them less likely to renew, perhaps because it served
as a more salient reminder of how infrequently they had been using their membership. Using an
A/B test to evaluate the average effect of an intervention can cover up important insights around
which customers are likely to be more or less receptive to that campaign (whether the analysis
suggests the intervention has a positive, negative, or, as in this example, an insignificant effect),
leading marketers to make the wrong decisions around which campaigns to run with which
customers.
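
To see how a near-zero average can decompose into offsetting subgroup effects, the sketch below simulates a
membership base of this kind (all effect sizes are hypothetical) and contrasts the overall lift with the lift
inside each subgroup:

    # Simulated membership base: whether each member visited the venue in
    # the last year, whether they were randomly sent a gift, and whether
    # they renewed. All effect sizes are hypothetical.
    import numpy as np
    import pandas as pd

    rng = np.random.default_rng(0)
    n = 2_000
    visited = rng.random(n) < 0.4    # visited the venue in the last year
    gift = rng.random(n) < 0.5       # randomly assigned to receive the gift
    # The gift helps recent visitors (+10 pts) and hurts lapsed ones (-8 pts).
    p_renew = 0.40 + 0.10 * (visited & gift) - 0.08 * (~visited & gift)
    renewed = rng.random(n) < p_renew
    df = pd.DataFrame({"visited": visited, "gift": gift, "renewed": renewed})

    # Overall lift: roughly zero, so the naive conclusion is "no effect."
    overall = df.groupby("gift")["renewed"].mean()
    print("overall lift:", overall[True] - overall[False])

    # The same contrast within each subgroup tells a different story.
    by_group = df.groupby(["visited", "gift"])["renewed"].mean().unstack("gift")
    print(by_group[True] - by_group[False])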

Optimizing Churn Prevention Campaigns


This isn’t just hypothetical — in fact, this example is based on a real organization I worked with
as part of my research. When it comes to increasing retention, companies typically identify “high
risk” customers — that is, customers whose recent behavior or other characteristics suggest they
are particularly likely to cancel their subscriptions or stop purchasing a company’s product —
and then run A/B tests to determine if their retention campaigns will be effective with this group.
While this is an understandable strategy (certainly you don’t want to waste marketing resources
on customers who weren’t going to churn anyway), my research suggests that it can seriously
backfire, as it can lead marketers to make flawed decisions that actually reduce overall retention
rates and ROI on marketing spend.

Specifically, I conducted field experiments with two large companies that were implementing
retention campaigns. In the first part of my study, the companies both developed churn reduction
interventions and then ran A/B tests tracking churn rates for a total of over 14,000 customers,
where one randomly assigned group of customers received the interventions, and the other did
not. Next, I collected a rich dataset of customer information, including recent activity and
engagement with the company, tenure as a customer of the company, location, and other metrics
that were used to predict churn risk, and examined which of these characteristics correlated with
a positive response to the retention campaigns.
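
The study's own methodology is more involved, but a simple stand-in for that last step is a logistic
regression in which the treatment indicator interacts with each candidate characteristic: a meaningful
interaction term signals that the characteristic predicts sensitivity to the campaign, not just baseline
churn risk. A sketch with simulated data and hypothetical column names:

    # Which customer traits moderate the campaign's effect on churn?
    # A logistic regression with treatment-by-trait interactions.
    import numpy as np
    import pandas as pd
    import statsmodels.formula.api as smf

    rng = np.random.default_rng(1)
    n = 5_000
    df = pd.DataFrame({
        "treated": rng.integers(0, 2, n),          # A/B assignment
        "tenure_months": rng.integers(1, 60, n),   # hypothetical features
        "recent_activity": rng.random(n),
        "data_gb": rng.gamma(2.0, 2.0, n),
    })
    # Hypothetical ground truth: only heavy data users respond to treatment.
    logit = -0.5 - 0.2 * df["treated"] * (df["data_gb"] - df["data_gb"].mean())
    df["churned"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

    model = smf.logit(
        "churned ~ treated * (tenure_months + recent_activity + data_gb)",
        data=df,
    ).fit()
    print(model.summary())  # inspect the treated:trait interaction terms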

Across both companies, I found that the customers who had been identified as having the highest
risk of churning were not necessarily the best targets for the retention programs — in fact, there
was little correlation between customers’ churn risk level and their sensitivity to the
interventions. The data showed that there was a distinct group of customers who responded
strongly to each intervention (customers with particular behavioral or demographic
characteristics that consistently correlated with being much less likely to churn after receiving
the interventions), but that “high-sensitivity” group had almost no overlap with the people
identified as “high churn risk.” And this had serious implications for ROI: My analysis found
that if the two companies were to spend the same amount of marketing budget targeting the high-
sensitivity group rather than the high-churn-risk group, it would reduce their churn rates by an
additional 5% and 8% respectively.
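
That ROI comparison comes down to simple accounting: for a fixed budget, the churn reduction a targeting
rule buys is the sum of treatment effects over the customers it contacts. The hypothetical sketch below
draws risk and uplift independently, mirroring the weak correlation found in the study, and assumes the true
uplift is known for scoring purposes:

    # Compare two targeting policies on the same budget: contact the top-N
    # customers by predicted churn risk vs. the top-N by uplift (the
    # reduction in churn probability if contacted). All values simulated.
    import numpy as np

    rng = np.random.default_rng(2)
    n, budget = 10_000, 1_000          # customers, contacts we can afford
    churn_risk = rng.beta(2, 5, n)     # hypothetical churn-risk scores
    uplift = rng.normal(0.02, 0.03, n) # hypothetical treatment effects

    by_risk = np.argsort(-churn_risk)[:budget]
    by_uplift = np.argsort(-uplift)[:budget]

    # Expected churns averted = sum of uplift over whoever we contact.
    print("target by risk:  ", uplift[by_risk].sum())
    print("target by uplift:", uplift[by_uplift].sum())
    print("overlap between the two target lists:",
          len(set(by_risk) & set(by_uplift)) / budget)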

Of course, the specific factors that make a customer more likely to be receptive to a retention
campaign will vary from organization to organization and even from campaign to campaign, but running
pilots like the ones described above can help you identify the characteristics that will be the best
predictors of your customers’ sensitivity to a specific intervention. For example, one of the
organizations in my study was a telecommunications company with access to detailed data on
behavioral metrics such as the number of calls customers had made in the last month, the number
of texts they had sent, the gigabytes of data they had downloaded, and more. For this company, the data showed
that how recently a customer had last engaged with the company predicted their level of churn
risk, but had no impact on their sensitivity to the churn intervention. What did predict sensitivity
was their data usage — suggesting that to maximize ROI, the company should consider targeting
their retention campaign not at the customers who hadn’t engaged in a long time, but at the
customers who used the most data.

Moving from Prediction to Prescription


So what does this mean for marketers? The key insight is that marketing interventions should be
targeted based on each customer’s expected response to that intervention, not on what customers
are expected to do in the absence of that intervention. In a sense, marketers are like doctors:
Doctors don’t just give random treatments to the patients who are most likely to die — they
prescribe specific treatments to the patients who are most likely to respond positively to those
treatments.

Rather than trying to predict what customers will do (i.e., trying to determine their risk of
churning), marketers should focus on how different types of customers will respond to particular
campaigns, and then design campaigns that are most likely to be effective at reducing churn
among a given group of customers. Companies should leverage A/B test data not simply to
attempt to measure the overall effectiveness of a campaign among all customers, but to explore
which types of customers will be most sensitive to certain interventions. That means combining
customers’ historical transaction and demographic data with the data collected through A/B tests
to identify the behaviors and traits that make a customer most likely to respond to a particular
intervention. Luckily, many companies already collect all this data — it’s merely a matter of
leveraging it in a new way.
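
One common way to make that combination concrete, offered here as an illustration rather than as the
study's own method, is a two-model ("T-learner") uplift approach: fit one churn model on the treated
customers from the A/B test and another on the controls, then score every customer on the difference in
predicted churn probability. A sketch with simulated data and scikit-learn:

    # Two-model ("T-learner") uplift sketch: estimate each customer's
    # sensitivity to the intervention from A/B test data, then rank
    # customers by their predicted response rather than their risk.
    import numpy as np
    import pandas as pd
    from sklearn.ensemble import GradientBoostingClassifier

    rng = np.random.default_rng(3)
    n = 5_000
    df = pd.DataFrame({
        "treated": rng.integers(0, 2, n),          # A/B assignment
        "tenure_months": rng.integers(1, 60, n),   # hypothetical features
        "recent_activity": rng.random(n),
        "data_gb": rng.gamma(2.0, 2.0, n),
    })
    # Hypothetical ground truth, as before: heavy data users respond.
    logit = -0.5 - 0.2 * df["treated"] * (df["data_gb"] - df["data_gb"].mean())
    df["churned"] = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

    features = ["tenure_months", "recent_activity", "data_gb"]
    treat, ctrl = df[df["treated"] == 1], df[df["treated"] == 0]

    # One churn model per experimental arm.
    m_treat = GradientBoostingClassifier().fit(treat[features], treat["churned"])
    m_ctrl = GradientBoostingClassifier().fit(ctrl[features], ctrl["churned"])

    # Predicted uplift: how much the intervention cuts each customer's
    # churn probability. Target the campaign at the highest-uplift customers.
    df["uplift"] = (m_ctrl.predict_proba(df[features])[:, 1]
                    - m_treat.predict_proba(df[features])[:, 1])
    targets = df.sort_values("uplift", ascending=False).head(500)
    print(targets[features + ["uplift"]].describe())

The point of fitting two models is that their difference isolates the intervention's per-customer effect,
which is exactly the quantity a targeting decision should rank on.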

***

The concept of targeted marketing campaigns is nothing new — but it’s critical to think carefully
about how you’re making those targeting decisions. Rather than just guessing about what factors
might indicate that someone is a strong target, or focusing on a group that’s been deemed high
priority (such as high churn risk customers), firms should target the customers who will be the
most sensitive to the specific intervention they’re implementing. To maximize ROI, marketers
need to stop asking, “Is this intervention effective?” and start asking, “For whom is this
intervention most effective?” — and then target their campaigns accordingly.

Reader Notes

Retention rate.
Randomized controlled trial.

Churn rate: the share of customers who will likely not buy any more.

-> Split members into two groups (A/B testing): one group is sent a gift, the other is not (the gift is the
intervention).
-> Compare effectiveness between the groups: if the gift group renews at a higher rate, the intervention is good.
-> However, there are cases in which the data suggests no impact.
-> Dropping the intervention program may then be the wrong decision, because some people will churn whether
or not they receive the intervention, while another group is highly sensitive to the intervention.
-> Ask smart questions: qualitative -> descriptive -> predictive (identify who will churn, and who may be
highly sensitive to the intervention) -> prescriptive (choose a suitable intervention).
-> Once the relationship between high sensitivity and the intervention is known, act prescriptively: stop
asking “Is this intervention effective?” and start asking “For whom is this intervention most effective?”
