
Class 2: Market segmentation and targeting with RFM

Sridhar Moorthy
Rotman School of Management
University of Toronto

RSM 8522 Analytics for Marketing Strategy

©Moorthy 2024
Last class …
• The Pilgrim Bank case showed that
― consumers are heterogeneous.
― 80% of profits might be generated by 20% of consumers

• Taking a long-term customer equity perspective means moving from customer profits to
customer lifetime value (CLV)

• By examining CLV, management can understand where the problem lies:


― Acquisition, development, or retention?
― Escalating COGS or escalating marketing costs?

• We can use CLV calculations to evaluate and plan marketing actions for acquisition,
retention, and development.
― For example, in the Tuscan Lifestyles case, we saw how to evaluate whether a marketing
action geared toward increasing acquisitions is worth pursuing from a CLV perspective.

Today we will begin our discussion of how to do market segmentation,
starting with RFM

• We already know from the Pilgrim Bank case that not all customers are equally
profitable.

• But can we identify which customers will be more profitable, i.e., identify observable
variables that predict profitability?

• If the answer is yes, then we can segment the market based on those observable predictors
and target only those customers who are likely to be most profitable.
― This targeting strategy is likely to yield higher profits than “mass marketing.”

There are many ways to segment markets, as you learnt in RSM 8901

What are some of those ways?

What does RFM stand for?

R: Recency of last purchase (“how long ago was your last purchase”)
F: Frequency of past purchases in a given time-period
M: Monetary value of past purchases in a given time-period

• RFM is based on past behavior. It can only be used to segment customers for whom R, F, and M data
are available. In other words, RFM segmentation is potentially useful for developmental marketing,
rather than acquisition marketing.

• RFM is based on the premise that customers with a “high” RFM index—made a purchase recently, buy
often, and spend a lot—are our best targets for (development) marketing efforts.

[Diagram: Recency (more recent), Frequency (more frequent), and Monetary value (larger purchase amount) each have a positive (+) effect on the likelihood of buying again in response to marketing effort.]

Empirical evidence for the “recency effect”: meal preparation service

Let us see how we can do an RFM analysis in an actual case-study.

A bookstore wants to use RFM analysis to decide whom to target with a
book offer

• Its database of past customers has information on:


― Recency of last purchase (“last”—months since last purchase)
― Purchase frequency (“purch”—total number of purchases in the chosen period)
― Total expenditure (“total_”—total expenditure in the chosen period)

• Stan Lawton, marketing director, pulls a random sample of 50,000 customers from the
database and mails the book in question, Art History of Florence, to the sample
― 4,522 customers out of 50,000 end up buying the book (“buyer”)

• Based on this test Lawton wants to determine which of the remaining 500,000 customers
in his database should be approached with the Art History of Florence offer.

We will first explore the assumptions of RFM analysis …

• Do buyers and non-buyers in the sample differ on R, F, and M variables?

• How predictive are R, F, and M variables for the likelihood of purchasing?

• Are R, F, and M independent?

In the data set we have a response variable and RFM variables

• Response: buyer
• Recency: last
• Frequency: purch
• Monetary value: total_

The dataset looks like this …

acctnum  last  total_  purch  buyer
10001      29     357     10  no
10002      27     138      3  no
10003      15     172      2  no
10004       7     272      1  no
10005      15     149      1  no
10006       7     113      1  yes
10007      25      15      1  no
10008       1     238     11  no
10009       5     418     11  yes
10010      11     123      1  no
10011       5     294      2  no
10012      13     173      2  no
10013      13     226      1  no
10014       5     288      3  no
10015      25      15      1  no

We will use the following R packages in this exercise

• ggplot2 or tidyverse (tidyverse includes ggplot2)

• psych: we will use describeBy from this package to create summary statistics by category

• Hmisc: rcorr is a useful function to produce correlation matrix with p-values
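
The code on later slides averages a 0/1 variable called buyerdummy, which does not appear among the columns in the dataset excerpt. A minimal sketch of how it could be built follows, assuming it is simply 1 when buyer is "yes" and 0 otherwise. The data frame holds only the 15 rows printed on the slide, not the full 50,000-customer sample.

```r
# The 15 rows shown in the dataset excerpt (not the full sample)
data <- data.frame(
  acctnum = 10001:10015,
  last    = c(29, 27, 15, 7, 15, 7, 25, 1, 5, 11, 5, 13, 13, 5, 25),
  total_  = c(357, 138, 172, 272, 149, 113, 15, 238, 418, 123, 294, 173, 226, 288, 15),
  purch   = c(10, 3, 2, 1, 1, 1, 1, 11, 11, 1, 2, 2, 1, 3, 1),
  buyer   = c("no", "no", "no", "no", "no", "yes", "no", "no", "yes",
              "no", "no", "no", "no", "no", "no")
)

# Assumption: buyerdummy is 1 for buyers and 0 for non-buyers
data$buyerdummy <- as.integer(data$buyer == "yes")

# The mean of a 0/1 variable is the response rate
print(mean(data$buyerdummy))
```

Because buyerdummy is 0/1, its mean within any group of customers is that group's response rate, which is how it is used in the rest of the analysis.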

Do buyers and non-buyers differ on R, F, and M variables?

Indeed:

1. Buyers’ last purchase was 8.61 months ago on average, versus 12.73 months for non-buyers.
2. Buyers purchased 5.22 times on average since their initial purchase, versus 3.76 times for non-buyers.
3. Buyers spent a total of $234.30 on average, versus $205.73 for non-buyers.

In short, the RFM variables are behaving as we expect them to.

Note: describeBy(x, y) provides summary statistics of x by category y.
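
When only the group means are needed, base R's aggregate gives the same buyer versus non-buyer comparison without the psych package. This is a sketch on the 15 rows from the dataset excerpt, so its means will not match the full-sample figures quoted above.

```r
# The 15 rows shown in the dataset excerpt (not the full sample)
data <- data.frame(
  last   = c(29, 27, 15, 7, 15, 7, 25, 1, 5, 11, 5, 13, 13, 5, 25),
  total_ = c(357, 138, 172, 272, 149, 113, 15, 238, 418, 123, 294, 173, 226, 288, 15),
  purch  = c(10, 3, 2, 1, 1, 1, 1, 11, 11, 1, 2, 2, 1, 3, 1),
  buyer  = c("no", "no", "no", "no", "no", "yes", "no", "no", "yes",
             "no", "no", "no", "no", "no", "no")
)

# Mean of each RFM variable, separately for buyers and non-buyers
group_means <- aggregate(cbind(last, purch, total_) ~ buyer, data = data, FUN = mean)
print(group_means)
```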

Are the RFM variables independent?

[Output: correlation matrix of the R, F, and M variables, followed by the matrix of p-values (statistical significance).]

What does this say? Is this what you expected to see?
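
The correlation output itself is not reproduced here. As a rough illustration of how such a check could be run, the sketch below uses base R's cor and cor.test as stand-ins for Hmisc's rcorr, assuming Pearson correlations. It runs on the 15 rows from the dataset excerpt, so the values are illustrative only.

```r
# R, F and M columns for the 15 rows shown in the dataset excerpt
rfm <- data.frame(
  last   = c(29, 27, 15, 7, 15, 7, 25, 1, 5, 11, 5, 13, 13, 5, 25),
  purch  = c(10, 3, 2, 1, 1, 1, 1, 11, 11, 1, 2, 2, 1, 3, 1),
  total_ = c(357, 138, 172, 272, 149, 113, 15, 238, 418, 123, 294, 173, 226, 288, 15)
)

# Pairwise Pearson correlations
r_matrix <- cor(rfm)

# Matching matrix of p-values, one cor.test per pair (NA on the diagonal)
vars <- names(rfm)
p_matrix <- sapply(vars, function(a) {
  sapply(vars, function(b) {
    if (a == b) NA else cor.test(rfm[[a]], rfm[[b]])$p.value
  })
})

print(round(r_matrix, 2))
print(round(p_matrix, 3))
```

If the three variables were independent, the off-diagonal correlations would be close to zero and their p-values large.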

How to segment customers based on RFM profile: 3 methods

1. Seat-of-the-pants
• Example: one-time buyers vs. repeat-buyers, people
who bought less than 6 months ago, 6mo-1yr ago,
more than 1yr ago, etc.

2. Independent N-tile
• Classify customers into recency quintile/decile
• Independently classify customers into frequency
quintile/decile
• Independently classify customers into monetary
quintile/decile
• For each customer, aggregate the three indices

3. Sequential N-tile
• Classify customers into recency quintile/decile
• Within each recency quintile/decile, classify customers into
frequency quintile/decile
• Within each recency-frequency quintile/decile, classify
customers into monetary quintile/decile
• For each customer, aggregate the three indices

The three methods compared

1. The seat-of-the-pants approach is the easiest, but it relies on intuition—which is not always reliable.

2. The independent N-tile approach, being based on data, is likely to yield better predictions than seat-
of-the-pants. Also:
― Quite easy to execute: only 3 sorts required.
― Interpretation of the three RFM components is unambiguous: for example, a frequency score of 5 for one
customer means the same as a frequency score of 5 for another customer, regardless of their recency scores.
― However: in small samples, especially with skewed distributions, this approach might result in empty cells
and an uneven distribution of aggregate RFM scores.

3. The sequential N-tile approach is likely to yield the best predictions because of finer sorting:
― The finer sorting also tends to produce a more even distribution of aggregate RFM scores, especially when the
underlying distributions are skewed
― However: Harder to execute than independent n-tiles: for example, with quintiles at each stage, you need to do
one sort for R, 5 sorts for F, and 25 sorts for M.
― Also: the frequency and monetary scores are harder to interpret. For example, a frequency rank of 5 for a
customer with a recency rank of 5 may not mean the same thing as a frequency rank of 5 for a customer with a
recency rank of 4, since the frequency rank is dependent on the recency rank.
Let us execute the independent n-tiles method on our bookstore data.

We begin by classifying each customer into the recency quintile s/he belongs to.
The result is a rec_iq for each customer.

data$rec_iq <- .bincode(data$last, quantile(data$last, probs = seq(0, 1, 0.2)), right = TRUE,
                        include.lowest = TRUE)

The .bincode function assigns to each observation the “bin number” (in this case, the quintile) to which
its value of last belongs. To change from quintiles to deciles (10 segments), specify probs = seq(0, 1, 0.1).

The vector rec_iq will have the same length as last and will contain numbers from 1 to 5, with 1
referring to the most recent purchasers (i.e., the smallest values of last).
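
To make the binning rule concrete, here is a small sketch of .bincode on hand-picked cut points. The values and breaks are hypothetical and chosen only so the result is easy to verify by eye.

```r
# Hypothetical months-since-last-purchase values
months_since_last <- c(1, 5, 10)

# Two bins: [0, 5] and (5, 10].
# right = TRUE closes each interval on the right;
# include.lowest = TRUE also admits the lowest break (0) into bin 1.
bins <- .bincode(months_since_last, breaks = c(0, 5, 10),
                 right = TRUE, include.lowest = TRUE)
print(bins)  # 1 1 2
```

A value exactly equal to a cut point (here, 5) falls into the lower bin because the intervals are closed on the right.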

describeBy(data$last, data$rec_iq): summary stats for last by
recency quintile

Visually: ggplot(data = data, aes(x = rec_iq, y = last)) + geom_bar(stat = "summary", fun = "mean")

Note: Quintile 1 groups consumers with the lowest number of months since last purchase,
which is what we want.

Are the most recent purchasers most likely to buy?

ggplot(data = data, aes(x = rec_iq, y = buyerdummy)) + geom_bar(stat = "summary", fun = "mean")

Yes!

Next we assign people to frequency quintiles

data$freq_iq <- .bincode(data$purch, quantile(data$purch, probs = seq(0, 1, 0.2)), right = TRUE,
                         include.lowest = TRUE)

ggplot(data = data, aes(x = freq_iq, y = purch)) + geom_bar(stat = "summary", fun = "mean")

Why are there only four quintiles? Looking at the histogram of purch
provides a clue

ggplot(data=data,aes(x=purch)) + geom_histogram()

• The distribution of purch is very skewed: most people make only 1 or 2 purchases.
• Customers with the same value of purch must be assigned to the same group.
• Keeping approximately the same number of people in each group while satisfying the above
requirement yields 4 bins instead of 5 (quintile 3 is empty).
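
The effect of ties on quantile-based binning can be reproduced on a small example. The vector below is hypothetical, not the bookstore data; it is merely skewed in the same way, with most customers making a single purchase.

```r
# Hypothetical, heavily skewed purchase counts (not the bookstore data)
purch_toy <- c(1, 1, 1, 1, 1, 1, 2, 2, 3, 10)

# Quintile cut points: several of them coincide because of the ties at 1
breaks <- quantile(purch_toy, probs = seq(0, 1, 0.2))
print(breaks)

# Same binning call as on the slide
bins <- .bincode(purch_toy, breaks, right = TRUE, include.lowest = TRUE)

# Fewer than five bins are occupied: tied values cannot be split across bins
print(table(bins))
```

When neighbouring cut points are equal, the bin between them has zero width and stays empty, which is why a nominal quintile split can produce fewer than five groups.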

Are the most frequent purchasers (the ones with the highest probability of
purchase) in the first quintile?

ggplot(data = data, aes(x = freq_iq, y = buyerdummy)) + geom_bar(stat = "summary", fun = "mean")

No!

Reorder freq_iq so that group 1 is most likely to buy
data$freq_iq <- 6 - data$freq_iq
ggplot(data = data, aes(x = freq_iq, y = buyerdummy)) + geom_bar(stat = "summary", fun = "mean")

Yes!

Finally, we assign people to monetary value quintiles and, anticipating the
same issue as with frequency, reorder the bin numbers …

# Create monetary value quintile
data$mv_iq <- .bincode(data$total_, quantile(data$total_, probs = seq(0, 1, 0.2)), right = TRUE,
                       include.lowest = TRUE)

# Reorder
data$mv_iq <- 6 - data$mv_iq
ggplot(data = data, aes(x = mv_iq, y = total_)) + geom_bar(stat = "summary", fun = "mean")

Are people in group 1 most likely to buy?
ggplot(data = data, aes(x = mv_iq, y = buyerdummy)) + geom_bar(stat = "summary", fun = "mean")

Yes!

Aggregate the individual R, F, and M indices into a composite RFM
index and predict average response rate by this composite index

• Create RFM index based on R, F, and M indices
― data$rfmindex_iq <- 100 * data$rec_iq + 10 * data$freq_iq + data$mv_iq
― If R=1, F=2, and M=3, then the RFM index is 123
― There are 5 x 4 x 5 = 100 RFM segments

• Predict average response rate by RFM index
― data$RFM_response <- ave(data$buyerdummy, data$rfmindex_iq)
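
The sketch below strings the independent n-tile steps together on the 15 rows from the dataset excerpt. The helper quintile_bin is a hypothetical wrapper around the .bincode call used on the earlier slides, and buyerdummy is assumed to be 1 for buyers and 0 otherwise. With so few rows, the resulting response rates are purely illustrative.

```r
# The 15 rows shown in the dataset excerpt, with buyer coded 0/1 (assumption)
data <- data.frame(
  last       = c(29, 27, 15, 7, 15, 7, 25, 1, 5, 11, 5, 13, 13, 5, 25),
  total_     = c(357, 138, 172, 272, 149, 113, 15, 238, 418, 123, 294, 173, 226, 288, 15),
  purch      = c(10, 3, 2, 1, 1, 1, 1, 11, 11, 1, 2, 2, 1, 3, 1),
  buyerdummy = c(0, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0)
)

# Hypothetical helper: quintile number (1-5) for each element of x
quintile_bin <- function(x) {
  .bincode(x, quantile(x, probs = seq(0, 1, 0.2)),
           right = TRUE, include.lowest = TRUE)
}

data$rec_iq  <- quintile_bin(data$last)        # 1 = most recent
data$freq_iq <- 6 - quintile_bin(data$purch)   # reversed so 1 = most frequent
data$mv_iq   <- 6 - quintile_bin(data$total_)  # reversed so 1 = highest spend

# Composite three-digit index, e.g. R=1, F=2, M=3 gives 123
data$rfmindex_iq <- 100 * data$rec_iq + 10 * data$freq_iq + data$mv_iq

# ave() returns, for every row, the mean of buyerdummy within that row's RFM cell
data$RFM_response <- ave(data$buyerdummy, data$rfmindex_iq)

print(data[order(data$rfmindex_iq), c("rfmindex_iq", "buyerdummy", "RFM_response")])
```

Because ave returns a vector as long as the data, every customer carries the response rate of his or her own cell, which makes the later comparison with the break-even rate a simple row-wise test.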

Relationship between response rate and RFM index

ggplot(data = data, aes(x = as.factor(rfmindex_iq), y = buyerdummy)) +
  geom_bar(stat = "summary", fun = "mean") +
  labs(title = "% Buyers by RFM Index", x = "RFM Index", y = "% Buyer") +
  scale_x_discrete()

[Bar chart of response rate by RFM index, annotated from “Most likely to buy” to “Least likely to buy”.]

Calculate the break-even response rate using the cost of the marketing
effort and the bookstore’s margin

• Cost of mailing an offer = $0.50

• Selling price of Art History of Florence (with free shipping) = $18

• Cost of Art History of Florence for bookstore (wholesale price) = $9

• Shipping costs borne by the bookstore = $3

These data are available to Lawton.

What is the break-even response rate?
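
One way to work this out is sketched below, under the assumption that the margin on each sale is the selling price less the wholesale cost and the shipping cost, and that the only marketing cost is the mailing. The variable names are illustrative; break_even matches the name used in the targeting code.

```r
mail_cost <- 0.50   # cost of mailing one offer
price     <- 18     # selling price (with free shipping)
wholesale <- 9      # wholesale cost of the book
shipping  <- 3      # shipping cost borne by the bookstore

# Profit margin on each sale, before the mailing cost
margin <- price - wholesale - shipping

# Response rate at which the expected margin just covers the mailing cost
break_even <- mail_cost / margin
print(round(break_even, 3))  # 0.083
```

A segment is worth mailing only if its expected response rate exceeds this value, i.e. if more than about 1 in 12 recipients buys.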


We only target segments with response rate greater than the
breakeven response rate

# Create “target_iq” to indicate whether we will be targeting this customer or not;
# we target only those consumers whose RFM_response > break_even = .083
data$target_iq[data$RFM_response > break_even] <- 1
data$target_iq[data$RFM_response <= break_even] <- 0

# Calculate the fraction of customers targeted
table(as.integer(data$target_iq))/nrow(data)
##       0       1
## 0.53464 0.46536

describeBy(data$buyerdummy, data$target_iq)
## group: 0
##    vars     n mean   sd median trimmed mad min max range skew kurtosis se
## X1    1 26732 0.05 0.21      0       0   0   0   1     1 4.29     16.4  0
## --------------------------------------------------------
## group: 1
##    vars     n mean   sd median trimmed mad min max range skew kurtosis se
## X1    1 23268 0.14 0.35      0    0.05   0   0   1     1 2.07     2.28  0

The average response rate of the target group is higher than the break-even rate and higher than the
response rate for the entire sample (.0904).

Visualization of targeting strategy

# of prospects to target and number of expected buyers

# Calculate the number of mails sent under the targeting policy.
exp_mail_iq <- 500000 * mean(data$target_iq)

Mailing to 46.54% of the database means mailing to 500,000 × 46.54% = 232,680 prospects.

# Calculate the expected number of responses under the targeting policy.
exp_res_iq <- exp_mail_iq * mean(data[data$target_iq == 1, ]$buyerdummy)

From these we expect to net 14.05% × 232,680 = 32,692 buyers.

Profits under RFM-based targeting strategy

• Gross sales: $18 × 32,692 = $588,456

• Net profit: ($18 − 9 − 3) × 32,692 − $0.5 × 232,680 = $79,812

• Net profit/sales = 13.6%

• Return on marketing expenditure: $79,812/$116,340 = 68.6%
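
The arithmetic in these bullets can be checked with a small helper. The function campaign_profit and its argument names are hypothetical; the defaults are the price, cost, and mailing figures from the bookstore case.

```r
# Hypothetical helper: campaign economics for a given mailing size and buyer count
campaign_profit <- function(n_mailed, n_buyers, price = 18, wholesale = 9,
                            shipping = 3, mail_cost = 0.50) {
  gross_sales <- price * n_buyers
  mail_spend  <- mail_cost * n_mailed
  net_profit  <- (price - wholesale - shipping) * n_buyers - mail_spend
  c(gross_sales         = gross_sales,
    net_profit          = net_profit,
    profit_over_sales   = net_profit / gross_sales,
    return_on_marketing = net_profit / mail_spend)
}

# RFM-targeted mailing: 232,680 offers, 32,692 expected buyers
targeted <- campaign_profit(n_mailed = 232680, n_buyers = 32692)
print(round(targeted, 3))
```

The same function can be reused for any other mailing scenario by changing n_mailed and n_buyers.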

Suppose, instead, we had not followed the RFM approach and sent the
offer to all 500,000 consumers, i.e., no targeting

• Our response rate would have been the average response rate for the entire sample

# Response rate without any targeting
prop.table(table(data$buyerdummy))
##
##       0       1
## 0.90956 0.09044

The average response rate for the entire sample is 9.04%.

Our expected number of buyers would have been 9.04% × 500,000 = 45,200.

And our profits and ROI would have been …

• Gross sales: $18 × 45,200 = $813,600

• Net profit: ($18 − 9 − 3) × 45,200 − $0.5 × 500,000 = $21,200 (instead of $79,812 with RFM
targeting)

• Net profit/sales = $21,200/$813,600 = 2.6%

• Return on marketing expenditure: $21,200/$250,000 = 8.5% (instead of 68.6% with
RFM targeting)

In short, RFM-based market segmentation and targeting adds a lot of value!


Recap of RFM-based segmentation using the independent n-tiles approach.

Step 1: assign R, F, and M indices to each customer based on quintiles or deciles

1. For each variable of interest, recency, frequency and monetary value

― Sort customer database from “best” to “worst” on the variable (here “best” refers to variable
value that corresponds to the highest probability of purchase)

― Decide how many segments to classify customers into; then calculate “cut-off values” for the
variable that demarcate each segment
• Five segments: quintiles; ten segments: deciles.

― Classify each customer into the quintile or decile s/he belongs to, using the cut-off values
calculated in the previous step

― Confirm that customers in the “top group” have the highest probability of buying. If not, reverse
the index so that they do.

Step 2: combine the three indices into a composite 3-digit index

2. Assign every customer a 3-digit composite index, e.g., 125, 555, etc., based on his/her
recency, frequency, and monetary value indices
― For example, with quintile classification in the previous step, a customer who was in the top-most
quintile on recency, the second quintile on frequency, and the bottom quintile on monetary value
would get a composite index of 125.
― With quintiles for R, F, and M, we will normally end up with 125 segments; with deciles we will
normally have 1000 segments.
― Given the classification rule above, a customer with a composite index 125 should have a higher
probability of buying than a customer with index 555.

Step 3: estimate the average response rate for each RFM cell

3. This is the percentage of customers within each 3-digit composite index who responded to the
marketing action (e.g., catalog mailing)
― i.e., the average of the 0/1 outcome variable (did not respond/responded)

Step 4: calculate the break-even response rate for the marketing initiative

4. Calculate break-even response rate for the marketing initiative


― A marketing initiative is profitable if and only if
   Probability of response × Profit margin on each sale − Cost of marketing initiative (per customer contacted) ≥ 0

― Therefore, the break-even response rate for the marketing initiative is:

   Break-even response rate = Cost of marketing initiative (per customer contacted) / Profit margin on each sale
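
The step from the profitability condition to the break-even rate can be written out explicitly. The symbols are introduced here only for illustration: p is the probability of response, m the profit margin on each sale (assumed positive), and c the cost of the marketing initiative per customer contacted.

```latex
\[
  p\,m - c \;\ge\; 0
  \quad\Longleftrightarrow\quad
  p \;\ge\; \frac{c}{m},
  \qquad\text{so}\qquad
  p^{*} \;=\; \frac{c}{m}.
\]
```

Dividing by m preserves the direction of the inequality only because m > 0; a segment is worth contacting when its response rate is at least p*.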

Step 5: target the marketing initiative to the RFM cells with average
response rate greater than the breakeven rate

5. Select the 3-digit segments with average response rate above the break-even response
rate and target the offer only to customers in those groups

Now let us turn to the sequential n-tile approach

Sample code for implementing sequential RFM

# R index (same as before)
data$rec_sq <- .bincode(data$last, quantile(data$last, probs = seq(0, 1, 0.2)),
                        right = TRUE, include.lowest = TRUE)

# F index (not the same as before: quintiles are computed within each recency quintile)
data$freq_sq <- 0
for (i in 1:5) {
  in_r <- data$rec_sq == i
  data$freq_sq[in_r] <- .bincode(data$purch[in_r],
                                 quantile(data$purch[in_r], probs = seq(0, 1, 0.2)),
                                 right = TRUE, include.lowest = TRUE)
}

# M index (not the same as before: quintiles are computed within each recency-frequency cell)
data$mv_sq <- 0
for (i in 1:5) {
  for (j in 1:5) {
    in_rf <- data$rec_sq == i & data$freq_sq == j
    if (!any(in_rf)) next  # skip empty recency-frequency cells
    data$mv_sq[in_rf] <- .bincode(data$total_[in_rf],
                                  quantile(data$total_[in_rf], probs = seq(0, 1, 0.2)),
                                  right = TRUE, include.lowest = TRUE)
  }
}

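
The nested loops can also be expressed with ave, which applies a function within groups and returns a vector aligned with the original rows. This is an alternative sketch, not the code from the slides: quintile_bin is a hypothetical helper, and the data are the 15 rows from the dataset excerpt, so the groups are very small.

```r
# R, F and M columns for the 15 rows shown in the dataset excerpt
data <- data.frame(
  last   = c(29, 27, 15, 7, 15, 7, 25, 1, 5, 11, 5, 13, 13, 5, 25),
  total_ = c(357, 138, 172, 272, 149, 113, 15, 238, 418, 123, 294, 173, 226, 288, 15),
  purch  = c(10, 3, 2, 1, 1, 1, 1, 11, 11, 1, 2, 2, 1, 3, 1)
)

# Hypothetical helper: quintile number (1-5) for each element of x
quintile_bin <- function(x) {
  .bincode(x, quantile(x, probs = seq(0, 1, 0.2)),
           right = TRUE, include.lowest = TRUE)
}

# R index: quintiles over the whole sample
data$rec_sq <- quintile_bin(data$last)

# F index: quintiles of purch computed separately within each recency quintile
data$freq_sq <- ave(data$purch, data$rec_sq, FUN = quintile_bin)

# M index: quintiles of total_ within each recency-frequency cell.
# A pasted key is used so that only cells that actually occur are processed.
rf_cell <- paste(data$rec_sq, data$freq_sq)
data$mv_sq <- ave(data$total_, rf_cell, FUN = quintile_bin)

print(data[order(data$rec_sq, data$freq_sq, data$mv_sq), ])
```

Grouping on the pasted key avoids calling the helper on recency-frequency combinations that contain no customers.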
The composite RFM index for sequential RFM is calculated in the same
way as before
data$rfmindex_sq <- 100 * data$rec_sq + 10 * data$freq_sq + data$mv_sq
ggplot(data = data, aes(x = as.factor(rfmindex_sq), y = buyerdummy)) +
  geom_bar(stat = "summary", fun = "mean") +
  labs(title = "% Buyers by RFM Index (Sequential)",
       x = "RFM Index", y = "% Buyer")

The targeting decision is again made in the same way as before

# Average response rate by sequential RFM index (computed as for the independent index)
data$RFM_response_sq <- ave(data$buyerdummy, data$rfmindex_sq)

# whether to target
data$target_sq[data$RFM_response_sq > break_even] <- 1
data$target_sq[data$RFM_response_sq <= break_even] <- 0

# Calculate the fraction of customers targeted
table(as.integer(data$target_sq))/nrow(data)
##       0       1
## 0.52594 0.47406

describeBy(data$buyerdummy, data$target_sq)
## group: 0
##    vars     n mean   sd median trimmed mad min max range skew kurtosis se
## X1    1 26297 0.05 0.21      0       0   0   0   1     1 4.33    16.78  0
## --------------------------------------------------------
## group: 1
##    vars     n mean   sd median trimmed mad min max range skew kurtosis se
## X1    1 23703 0.14 0.35      0    0.05   0   0   1     1 2.08     2.32

Mail to 47.4% of the database: 500,000 × 47.4% = 237,000
Expected response rate: 14%
Expected number of buyers: 13.97% × 237,000 = 33,109

Profitability of targeting using sequential RFM

• Gross sales: $18 × 33,109 = $595,962

• Net profit: ($18 − 9 − 3) × 33,109 − $0.5 × 237,000 = $80,154

• Net profit/sales = 13.4%

• Return on marketing expenditure: $80,154/$118,500 = 67.6%


― Again, way bigger than with “no targeting”
― Not very different from what we got in the independent N-tile RFM analysis

Comparing the three methods: no targeting, independent n-tile,
sequential n-tile

RFM is widely used in industry

• For example, FedEx used RFM analysis “to separate growing customers with additional
upside potential from those who had reached the limit of their growth. The key differences
between the two clusters, such as sales contact rates, automation status, discounts levels,
etc. could then be worked into promotional programs designed to continue to grow these
customers with untapped upside potential.” (Sellers and Hughes 2009)

• Relatively easy to implement (no heavy-duty analytics required)

• Works for B2C and B2B

Note: how the “recency effect” operates depends on product category

• Strongest for frequently-purchased goods, e.g., things you buy in a grocery store.
― For these products, if your last purchase was a long time ago (longer than the inter-purchase time
in the category), then it is likely that “you have moved on” and are unlikely to be a buyer.

• For durable goods, recency of purchase may be a negative indicator.


― Recent purchase “takes you out of the market for a significant amount of time.”

Essentially, you have to weigh recency of purchase against the normal interpurchase
time in the category.

Limitations of RFM analysis

• Need past customer behavior information


― Cannot use for acquiring completely new customers (such as for a start-up company)

• Need relatively large customer base to effectively implement

• Being based solely on “behavior,” it doesn’t use any other information, such as
demographic characteristics of the consumer or information on the past purchase
environment (such as prices), which, if incorporated, could lead to better
predictions.

Takeaways

• RFM is a simple approach to market segmentation and targeting

• It is based on three dimensions of past behavior


1. Recency
2. Frequency
3. Monetary value

• Three variants of RFM


1. Seat-of-the-pants (based on intuition)
2. Independent N-tile
3. Sequential N-tile
