Week 3 BTYD Model Fader Et Al MKSC 10
Peter S. Fader
The Wharton School of the University of Pennsylvania, Philadelphia, Pennsylvania 19104,
faderp@wharton.upenn.edu
Bruce G. S. Hardie
London Business School, London NW1 4SA, United Kingdom, bhardie@london.edu
Jen Shang
School of Public and Environmental Affairs, Indiana University, Bloomington, Indiana 47405,
jenshang@indiana.edu
Many businesses track repeat transactions on a discrete-time basis. These include (1) companies for whom
transactions can only occur at fixed regular intervals, (2) firms that frequently associate transactions with
specific events (e.g., a charity that records whether supporters respond to a particular appeal), and (3) organizations
that choose to utilize discrete reporting periods even though the transactions can occur at any time.
Furthermore, many of these businesses operate in a noncontractual setting, so they have a difficult time
differentiating between those customers who have ended their relationship with the firm versus those who are
in the midst of a long hiatus between transactions. We develop a model to predict future purchasing patterns
for a customer base that can be described by these structural characteristics. Our beta-geometric/beta-Bernoulli
(BG/BB) model captures both of the underlying behavioral processes (i.e., customers' purchasing while "alive"
and time until each customer permanently "dies"). The model is easy to implement in a standard spreadsheet
environment and yields relatively simple closed-form expressions for the expected number of future transactions
conditional on past observed behavior (and other quantities of managerial interest). We apply this discrete-time
analog of the well-known Pareto/NBD model to a data set on donations made by the supporters of a nonprofit
organization located in the midwestern United States. Our analysis demonstrates the excellent ability of the
BG/BB model to describe and predict the future behavior of a customer base.
Key words: BG/BB; beta-geometric; beta-binomial; customer-base analysis; customer lifetime value; CLV; RFM;
Pareto/NBD
History: Received: March 24, 2009; accepted: March 31, 2010; accepted by Scott A. Neslin, acting
editor-in-chief. Published online in Articles in Advance August 11, 2010.
Table 1 Annual Donation Behavior by the 1995 Cohort of First-Time Supporters

ID       1995  1996  1997  1998  1999  2000  2001
100001     1     0     0     0     0     0     0
100002     1     0     0     0     0     0     0
100003     1     0     0     0     0     0     0
100004     1     0     1     0     1     1     1
100005     1     0     1     1     1     0     1

framework to accommodate business settings characterized by discrete-time purchasing (see pp. 16–17 and Table 3 in their paper), yet no one to date has presented such a model.
As another example, consider attendance at the INFORMS Marketing Science Conference. The conference occurs at a discrete point in time and an individual can either attend or not. Similarly, consider
Additional information, including rights and permission policies, is available at http://journals.informs.org/.
INFORMS holds copyright to this article and distributed this copy as a courtesy to the author(s).
behavior in terms of whether or not each customer went on a cruise in 2000, 2001, 2002, etc. (Berger et al. 2003). Once again, purchasing behavior is more conveniently described as a Bernoulli process rather than as a Poisson process. An example of this in a consumer packaged goods setting is the work of Chatfield and Goodhardt (1970), who model the purchasing of a product not in terms of the number of purchases [...]

[...] holdout period. We then examine the relative performance of the Pareto/NBD model when applied to this same data set. Next we present an extension to the basic model in which the consequences of relaxing one of the model assumptions are explored. We conclude with a discussion of several additional issues that arise from this work.
Assumption 5. Heterogeneity in θ follows a beta distribution with pdf

$$f(\theta \mid \gamma, \delta) = \frac{\theta^{\gamma-1}(1-\theta)^{\delta-1}}{B(\gamma, \delta)}, \quad 0 < \theta < 1,\ \gamma, \delta > 0. \tag{2}$$

Assumption 6. The transaction probability p and the dropout probability θ vary independently across customers.

Assumptions (2) and (4) yield the beta-Bernoulli model (i.e., the beta-binomial model without the binomial coefficient, since we explicitly account for the ordering of the transactions). Similarly, Assumptions (3) and (5) yield the beta-geometric (BG) distribution. We therefore call this the beta-geometric/beta-Bernoulli (BG/BB) model of buyer behavior.

2.1. Derivation of Model Likelihood Function
Consider a customer with repeat purchase string 1 0 1 0 0. What is P(Y1 = 1, Y2 = 0, Y3 = 1, Y4 = 0, Y5 = 0 | p, θ)? The fact that the customer made a purchase at the third transaction opportunity means that he must have been alive for t = 1, 2, 3. However, Y4 = 0, Y5 = 0 could be the result of one of three scenarios: (i) he died at the beginning of the fourth transaction opportunity (AAADD), (ii) he was alive at the fourth transaction opportunity and died at the beginning of the fifth transaction opportunity (AAAAD), or (iii) he was alive at both the fourth and fifth transaction opportunities (AAAAA). We therefore compute P(Y1 = 1, Y2 = 0, Y3 = 1, Y4 = 0, Y5 = 0 | p, θ) by computing the probability of the purchase string conditional on each scenario and multiplying it by the probability of that scenario:

$$\begin{aligned}
f(10100 \mid p, \theta) &= f(10100 \mid p, \theta, \text{AAADD})\,P(\text{AAADD} \mid \theta) \\
&\quad + f(10100 \mid p, \theta, \text{AAAAD})\,P(\text{AAAAD} \mid \theta) \\
&\quad + f(10100 \mid p, \theta, \text{AAAAA})\,P(\text{AAAAA} \mid \theta) \\
&= p(1-p)p \cdot \theta(1-\theta)^3 + p(1-p)p(1-p) \cdot \theta(1-\theta)^4 \\
&\quad + p(1-p)p(1-p)(1-p) \cdot (1-\theta)^5
\end{aligned} \tag{3}$$

Here θ(1−θ)^3 = P(AAADD | θ), θ(1−θ)^4 = P(AAAAD | θ), and (1−θ)^5 = P(AAAAA | θ); the leading factor p(1−p)p in each term is P(Y1 = 1, Y2 = 0, Y3 = 1).

Note that the zero-order nature of purchasing while the customer is alive means that the exact order of any given number of transactions prior to the last observed transaction does not matter. For example, it should be clear that f(10100 | p, θ) = f(01100 | p, θ). Therefore, we do not need the complete binary-string representation of a customer's transaction history. Rather, all we need to know for n transaction opportunities are frequency and recency: the number of transactions across the calibration period ($x = \sum_{t=1}^{n} y_t$) and the transaction opportunity at which the last observed transaction occurred ($t_x$).[2] We therefore go from $2^n$ binary string representations of all the possible purchase patterns to $n(n+1)/2 + 1$ possible recency/frequency patterns.

This realization that recency and frequency are sufficient summary statistics offers significant benefits for model implementation, particularly as the number of transaction opportunities becomes sizeable. For instance, in the case of our nonprofit organization, we can compress the number of necessary binary strings from 64 down to 22 recency/frequency combinations, making it a bit easier to visualize and manipulate the data set. However, in another recent application with n = 10, we saw a reduction from 1,024 binary strings down to 56 recency/frequency combinations. Furthermore, these numbers are not affected by the size of the customer base being modeled; see Table 2 for a complete characterization of the nonprofit data set partially presented in Table 1. Whether we have 11,000 customers or 11 million customers, the data structure would be identical: the numbers in the "No. of donors" columns would grow, but the computational demands for data storage and manipulation would be unaffected.

Returning to the likelihood function, we generalize the logic behind the construction of (3), so it follows that

$$L(p, \theta \mid x, t_x, n) = p^{x}(1-p)^{n-x}(1-\theta)^{n} + \sum_{i=0}^{n-t_x-1} p^{x}(1-p)^{t_x-x+i}\,\theta(1-\theta)^{t_x+i} \tag{4}$$

To arrive at the likelihood function for a randomly chosen customer with purchase history (x, t_x, n), we remove the conditioning on p and θ by taking the expectation of (4) over their respective mixing distributions:

$$\begin{aligned}
L(\alpha, \beta, \gamma, \delta \mid x, t_x, n) &= \int_0^1\!\!\int_0^1 L(p, \theta \mid x, t_x, n)\, f(p \mid \alpha, \beta)\, f(\theta \mid \gamma, \delta)\, dp\, d\theta \\
&= \frac{B(\alpha + x, \beta + n - x)}{B(\alpha, \beta)}\,\frac{B(\gamma, \delta + n)}{B(\gamma, \delta)} \\
&\quad + \sum_{i=0}^{n-t_x-1} \frac{B(\alpha + x, \beta + t_x - x + i)}{B(\alpha, \beta)}\,\frac{B(\gamma + 1, \delta + t_x + i)}{B(\gamma, \delta)}
\end{aligned} \tag{5}$$

(The solution to the double integral follows naturally from the integral representation of the beta function.)

[2] If x = 0, then t_x = 0. Note that this measure of recency differs from that normally used by the direct marketing community, who measure recency as the time from the last observed transaction to the end of the observation period (i.e., n − t_x).
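The scenario argument behind (3) and the recency/frequency form (4) can be checked numerically. The Python sketch below is an illustration only, not code from the paper: the function names are hypothetical, and the values p = 0.4 and θ = 0.2 used in the usage note are arbitrary assumptions. It enumerates the possible death times for a binary purchase string, evaluates (4) from x and t_x alone, and counts the distinct recency/frequency patterns among all binary strings of length n.

```python
from itertools import product


def rf_summary(string):
    """Return (x, t_x): number of purchases and position of the last one (0 if none)."""
    x = sum(string)
    t_x = max((t for t, y in enumerate(string, start=1) if y == 1), default=0)
    return x, t_x


def likelihood_by_scenarios(string, p, theta):
    """Sum over death scenarios as in (3): alive throughout, or dead from opportunity d on."""
    n = len(string)
    _, t_x = rf_summary(string)

    def purchases(upto):
        # Bernoulli probability of the observed outcomes at opportunities 1..upto
        prob = 1.0
        for y in string[:upto]:
            prob *= p if y == 1 else 1.0 - p
        return prob

    total = purchases(n) * (1.0 - theta) ** n  # alive at all n opportunities
    for d in range(t_x + 1, n + 1):  # death at the beginning of opportunity d
        total += purchases(d - 1) * theta * (1.0 - theta) ** (d - 1)
    return total


def likelihood_rf(x, t_x, n, p, theta):
    """Equation (4), written in terms of frequency x and recency t_x only."""
    total = p**x * (1.0 - p) ** (n - x) * (1.0 - theta) ** n
    for i in range(n - t_x):
        total += p**x * (1.0 - p) ** (t_x - x + i) * theta * (1.0 - theta) ** (t_x + i)
    return total


def count_rf_patterns(n):
    """Number of distinct (x, t_x) pairs among all 2**n binary strings of length n."""
    return len({rf_summary(s) for s in product((0, 1), repeat=n)})
```

Under these assumptions, `likelihood_by_scenarios((1, 0, 1, 0, 0), 0.4, 0.2)` and `likelihood_rf(2, 3, 5, 0.4, 0.2)` agree up to floating-point error, the strings 10100 and 01100 give the same value, and `count_rf_patterns(6)` returns 22, matching n(n + 1)/2 + 1.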
Fader, Hardie, and Shang: Customer-Base Analysis in a Discrete-Time Noncontractual Setting
Marketing Science 29(6), pp. 1086–1108, © 2010 INFORMS
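Table 2

x   t_x   No. of donors
6    6     1,203
5    6       728
4    6       512
3    6       357
2    6       234
1    6       129
5    5       335
4    5       284
3    5       225
2    5       173
1    5       119
4    4       240
3    4       181
2    4       155
1    4        78
3    3       322
2    3       255
1    3       129
2    2       613
1    2       277
1    1     1,091
0    0     3,464

[...] possible recency/frequency patterns, each containing f_j customers. The sample log-likelihood function is given by

$$LL(\alpha, \beta, \gamma, \delta) = \sum_{j=1}^{J} f_j \ln\bigl[L(\alpha, \beta, \gamma, \delta \mid x_j, t_{x_j}, n)\bigr] \tag{6}$$

where x_j and t_{x_j} are the frequency and recency, respectively, [...]

$$E(X(n) \mid \alpha, \beta, \gamma, \delta) = \left(\frac{\alpha}{\alpha + \beta}\right)\left(\frac{\delta}{\gamma - 1}\right)\left\{1 - \frac{\Gamma(\gamma + \delta)}{\Gamma(\gamma + \delta + n)}\,\frac{\Gamma(1 + \delta + n)}{\Gamma(1 + \delta)}\right\} \tag{8}$$

More generally, let the random variable $X(n, n + n^*) = \sum_{t=n+1}^{n+n^*} Y_t$ denote the number of transactions in the interval (n, n + n*]. The BG/BB probability of x* transactions occurring in this interval is given by

$$\begin{aligned}
P(X(n, n + n^*) = x^* \mid \alpha, \beta, \gamma, \delta) &= \mathbf{1}_{x^* = 0}\left[1 - \frac{B(\gamma, \delta + n)}{B(\gamma, \delta)}\right] \\
&\quad + \binom{n^*}{x^*}\frac{B(\alpha + x^*, \beta + n^* - x^*)}{B(\alpha, \beta)}\,\frac{B(\gamma, \delta + n + n^*)}{B(\gamma, \delta)} \\
&\quad + \sum_{i=x^*}^{n^*-1}\binom{i}{x^*}\frac{B(\alpha + x^*, \beta + i - x^*)}{B(\alpha, \beta)}\,\frac{B(\gamma + 1, \delta + n + i)}{B(\gamma, \delta)}
\end{aligned} \tag{9}$$

where the indicator equals 1 if x* = 0 and 0 otherwise, with mean E(X(n, n + n*) | α, β, γ, δ) [...]

[Equations (7) and (12): only scattered fragments survive in the extracted text.]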
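The unconditional quantities above can be cross-checked against the model assumptions directly. The following Python sketch is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, and the closed form is the expression for the expected number of transactions in n opportunities as it reads once the dropped symbols are restored. It compares that closed form with a term-by-term sum of E[p] · E[(1 − θ)^t], and evaluates the interval probability so that it can be checked to sum to one over x* = 0, ..., n*.

```python
from math import comb, exp, lgamma


def lbeta(a, b):
    """Log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def expected_transactions_sum(n, alpha, beta, gamma, delta):
    """E[X(n)] as a direct sum: E[p] times the sum over t of E[(1 - theta)^t]."""
    mean_p = alpha / (alpha + beta)
    survival = sum(
        exp(lbeta(gamma, delta + t) - lbeta(gamma, delta)) for t in range(1, n + 1)
    )
    return mean_p * survival


def expected_transactions_closed(n, alpha, beta, gamma, delta):
    """Closed form for E[X(n)] in terms of gamma functions; requires gamma != 1."""
    ratio = exp(
        lgamma(gamma + delta) - lgamma(gamma + delta + n)
        + lgamma(1 + delta + n) - lgamma(1 + delta)
    )
    return alpha / (alpha + beta) * delta / (gamma - 1) * (1 - ratio)


def pmf_interval(x_star, n, n_star, alpha, beta, gamma, delta):
    """P(X(n, n + n*) = x*): probability of x* transactions in the next n* opportunities."""

    def bb(k, m):
        # beta-binomial piece: k purchases in m opportunities while alive
        return comb(m, k) * exp(lbeta(alpha + k, beta + m - k) - lbeta(alpha, beta))

    b_gd = lbeta(gamma, delta)
    prob = 0.0
    if x_star == 0:
        prob += 1 - exp(lbeta(gamma, delta + n) - b_gd)  # already dead by the end of n
    prob += bb(x_star, n_star) * exp(lbeta(gamma, delta + n + n_star) - b_gd)
    for i in range(x_star, n_star):  # alive for exactly i of the n* opportunities
        prob += bb(x_star, i) * exp(lbeta(gamma + 1, delta + n + i) - b_gd)
    return prob
```

With the BG/BB estimates for the 1995 cohort (α = 1.204, β = 0.750, γ = 0.657, δ = 2.783), the two versions of the expectation should agree for any n, and the mean implied by `pmf_interval` over (6, 11] should equal the difference between the expectations at n = 11 and n = 6.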
     A       B           D                  E
1    alpha   1.204       B(alpha, beta)     1.146
2    beta    0.750
3    gamma   0.657       B(gamma, delta)    0.729
4    delta   2.783
5
6    LL      -33,225.6

Cell formulas shown in the worksheet:
E1:  =EXP(GAMMALN(B1)+GAMMALN(B2)-GAMMALN(B1+B2))
B6:  =SUM(E9:E30)
Likelihood term:  =EXP(GAMMALN($B$1+A9)+GAMMALN($B$2+C9-A9)-GAMMALN($B$1+$B$2+C9))/$E$1*EXP(GAMMALN($B$3)+GAMMALN($B$4+C9)-GAMMALN($B$3+$B$4+C9))/$E$3
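The spreadsheet calculation can be mirrored outside Excel. The Python sketch below is an illustration, not the authors' implementation: it assumes the 22 recency/frequency counts have been transcribed correctly from Table 2, uses `math.lgamma` in place of GAMMALN, and the function names are hypothetical. It evaluates the BG/BB likelihood for each (x, t_x) pattern and sums the weighted log-likelihoods.

```python
from math import exp, lgamma, log

# (x, t_x, number of donors) for n = 6, transcribed from Table 2
TABLE_2 = [
    (6, 6, 1203), (5, 6, 728), (4, 6, 512), (3, 6, 357), (2, 6, 234), (1, 6, 129),
    (5, 5, 335), (4, 5, 284), (3, 5, 225), (2, 5, 173), (1, 5, 119),
    (4, 4, 240), (3, 4, 181), (2, 4, 155), (1, 4, 78),
    (3, 3, 322), (2, 3, 255), (1, 3, 129),
    (2, 2, 613), (1, 2, 277),
    (1, 1, 1091),
    (0, 0, 3464),
]


def lbeta(a, b):
    """Log of the beta function, the analog of the GAMMALN expression in cell E1."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bgbb_likelihood(x, t_x, n, alpha, beta, gamma, delta):
    """BG/BB likelihood for frequency x, recency t_x, and n opportunities."""
    denom = lbeta(alpha, beta) + lbeta(gamma, delta)
    # customer still alive after all n opportunities
    total = exp(lbeta(alpha + x, beta + n - x) + lbeta(gamma, delta + n) - denom)
    # customer died at the beginning of opportunity t_x + i + 1
    for i in range(n - t_x):
        total += exp(
            lbeta(alpha + x, beta + t_x - x + i)
            + lbeta(gamma + 1, delta + t_x + i)
            - denom
        )
    return total


def sample_log_likelihood(data, n, alpha, beta, gamma, delta):
    """Sum of f_j * ln(L_j) over the recency/frequency patterns."""
    return sum(
        f * log(bgbb_likelihood(x, t_x, n, alpha, beta, gamma, delta))
        for x, t_x, f in data
    )
```

Calling `sample_log_likelihood(TABLE_2, 6, 1.204, 0.750, 0.657, 2.783)` should land close to the LL value shown in the worksheet, with small differences expected because the parameters are rounded to three decimals. Maximizing this function over the four parameters (for example with a general-purpose optimizer) would play the role of Solver in the spreadsheet.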
We use (13) to compute the expected number of transactions in the 2002–2006 period (n* = 5) conditional on each of the 22 (x, t_x) patterns associated with n = 6. These conditional expectations are reported in Table 5 as a function of recency (the year of the individual's last transaction) and frequency (the number of repeat transactions).
In Figure 6(a) we report these conditional expectations, along with the average of the number of the transactions that actually occurred in the 2002–2006 forecast period, broken down by the number of repeat transactions in 1996–2001. (For each x, we are averaging over customers with different values of t_x.) Similarly, Figure 6(b) reports these conditional expectations along with the average of the number of the transactions that actually occurred in the 2002–2006 forecast period, broken down by the year of the individual's last transaction. (For each t_x, we are averaging over customers with different values of x.) We [...]

[...] being an underestimation of expected purchasing by those individuals whose last repeat purchase occurred before 1998.
Referring back to Table 5, we can now address the questions about different kinds of customers raised at the outset of the paper.
A donor who has made a repeat transaction every year is expected to make only 3.75 transactions over the next five years. Of course, such donors are still extremely valuable, but the possibility of death plus the fact that they might have been somewhat lucky in the past make them a bit less valuable than they might have otherwise seemed. (With reference to Figure 6(a), we see that this conditional expectation overestimates the actual mean (3.53) by only 6%.)
Donor 100009, who had had a perfect record until the most recent year, is expected to make 1.81 transactions over the next five years. In contrast, donor [...]
[...] are identical (x = 4, t_x = 6); thus, they have the same conditional expectation. Minor, remote differences in purchase histories are deemed to be irrelevant when making predictions using the BG/BB model.
A donor who has been completely absent since making his or her initial transaction is expected to make only 0.07 repeat transactions over the next five years. However, although each such donor is not particularly valuable alone, it is important to note, as per Table 2, that over 30% of the entire cohort of donors is in this recency/frequency group. Taken together, these donors are expected to make over 240 transactions over the next five years, making them collectively more valuable than about half of the other recency/frequency groups.
Beyond these specific analyses, Table 5 offers additional insights about the broader interplay between recency and frequency. First, note that for any row (i.e., value of x), the expected number of transactions in the forecast period decreases as we move from right to left (i.e., the less recent the last observed transaction). This is as we would expect, because the longer the hiatus in making a purchase, the more likely it is that the customer is "dead." Looking down the columns, however, we see a somewhat different pattern. We first look at 2001 and note that the conditional expectation is clearly an increasing [...]

Figure 3 Predicted vs. Actual Frequency of Repeat Transactions
[Plot: No. of people vs. No. of repeat transactions (0 to 6); series: Actual, Model]

Figure 4 Predicted vs. Actual (a) Cumulative and (b) Annual Repeat Transactions
[Plot: repeat transactions by year, 1996 to 2006]

Figure 5 Predicted vs. Actual Frequency of Repeat Transactions in 2002–2006
[Plot: No. of people; series: Actual, Model]

Table 5 Expected Number of Repeat Transactions in 2002–2006 as a Function of Recency and Frequency
[Table body not present in the extracted text]

Table 4 Parameter Estimates, 1995 Cohort

Model    α       β       γ       δ       LL
BB       0.487   0.826                   -35,516.1
BG/BB    1.204   0.750   0.657   2.783   -33,225.6
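The conditional expectations quoted above (3.75, 1.81, and 0.07) can be reproduced with a short calculation. The Python sketch below is an illustration and does not use the paper's closed-form expression (13), which is not shown in this section; instead it sums, one future opportunity at a time, the joint probability that the observed history occurred, the customer survived to that opportunity, and a purchase was made, and then divides by the likelihood of the history. Function names are hypothetical, and the parameters are the BG/BB estimates from Table 4.

```python
from math import exp, lgamma


def lbeta(a, b):
    """Log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bgbb_likelihood(x, t_x, n, alpha, beta, gamma, delta):
    """BG/BB likelihood for frequency x, recency t_x, and n opportunities."""
    denom = lbeta(alpha, beta) + lbeta(gamma, delta)
    total = exp(lbeta(alpha + x, beta + n - x) + lbeta(gamma, delta + n) - denom)
    for i in range(n - t_x):
        total += exp(
            lbeta(alpha + x, beta + t_x - x + i)
            + lbeta(gamma + 1, delta + t_x + i)
            - denom
        )
    return total


def conditional_expectation(x, t_x, n, n_star, alpha, beta, gamma, delta):
    """Expected transactions in (n, n + n*] given (x, t_x, n), by direct summation."""
    denom = lbeta(alpha, beta) + lbeta(gamma, delta)
    joint = 0.0
    for t in range(1, n_star + 1):
        # history observed, alive through opportunity n + t, purchase at n + t
        joint += exp(
            lbeta(alpha + x + 1, beta + n - x) + lbeta(gamma, delta + n + t) - denom
        )
    return joint / bgbb_likelihood(x, t_x, n, alpha, beta, gamma, delta)


PARAMS = (1.204, 0.750, 0.657, 2.783)  # alpha, beta, gamma, delta from Table 4
```

For example, `conditional_expectation(6, 6, 6, 5, *PARAMS)` corresponds to the donor who gave every year, `conditional_expectation(5, 5, 6, 5, *PARAMS)` to a donor with a perfect record until the most recent year, and `conditional_expectation(0, 0, 6, 5, *PARAMS)` to a donor who never repeated. Multiplying the last value by the 3,464 donors in that group gives the collective figure of roughly 240 transactions mentioned in the text.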
Figure 6 Predicted vs. Actual Conditional Expectations of Repeat Transactions in 2002–2006 as a Function of (a) Frequency and (b) Recency
[Plots: (a) No. of repeat transactions (2002–2006) vs. No. of repeat transactions (1996–2001); (b) No. of repeat transactions (2002–2006) by recency; series: Actual, Model]

[...] those customers who made only one repeat transaction will have a lower value of p than those who have made a repeat purchase in all five years, and therefore the fact that no transaction occurred in 2001 can be attributed more to their low probability of making a purchase in any given year than to the possibility of them being dead.
Table 7 reports the mean of the marginal posterior distribution of P. Looking at this table col- [...]

[Rows of a further recency/frequency table, apparently the conditional penetration values of Table 8; columns are year of last transaction, rows 0 to 2 are missing]
No. of rpt transactions   1998   1999   2000   2001
3                         0.09   0.40   0.69   0.84
4                                0.19   0.66   0.88
5                                       0.51   0.91
6                                              0.92

Table 6 P(Alive in 2002) as a Function of Recency and Frequency

No. of rpt transactions   Year of last transaction
(1996–2001)               1995   1996   1997   1998   1999   2000   2001
0                         0.11
1                                0.07   0.25   0.48   0.68   0.83   0.93
2                                       0.07   0.30   0.59   0.80   0.93
3                                              0.10   0.44   0.77   0.93
[Rows 4 to 6 are not present in the extracted text]
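The P(alive) entries can be recomputed from the model assumptions. The Python sketch below is an illustration rather than the paper's own expression, which is not reproduced in this section: it takes the probability of the "alive throughout" branch of the likelihood, extends survival by one more opportunity, and divides by the full likelihood. Function names are hypothetical, and the parameters are the BG/BB estimates for the 1995 cohort.

```python
from math import exp, lgamma


def lbeta(a, b):
    """Log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bgbb_likelihood(x, t_x, n, alpha, beta, gamma, delta):
    """BG/BB likelihood for frequency x, recency t_x, and n opportunities."""
    denom = lbeta(alpha, beta) + lbeta(gamma, delta)
    total = exp(lbeta(alpha + x, beta + n - x) + lbeta(gamma, delta + n) - denom)
    for i in range(n - t_x):
        total += exp(
            lbeta(alpha + x, beta + t_x - x + i)
            + lbeta(gamma + 1, delta + t_x + i)
            - denom
        )
    return total


def p_alive(x, t_x, n, alpha, beta, gamma, delta):
    """P(alive at opportunity n + 1 | x, t_x, n)."""
    denom = lbeta(alpha, beta) + lbeta(gamma, delta)
    # observed history with the customer surviving all of 1..n and also n + 1
    alive = exp(lbeta(alpha + x, beta + n - x) + lbeta(gamma, delta + n + 1) - denom)
    return alive / bgbb_likelihood(x, t_x, n, alpha, beta, gamma, delta)


PARAMS = (1.204, 0.750, 0.657, 2.783)  # alpha, beta, gamma, delta
```

When t_x = n the likelihood has only the "alive" branch, so the ratio reduces to (δ + n)/(γ + δ + n) regardless of x. That is why every entry in the 2001 column of Table 6 is the same: `p_alive(1, 6, 6, *PARAMS)` and `p_alive(6, 6, 6, *PARAMS)` coincide at about 0.93.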
[Figure residue: panel (a) has horizontal axis p; panel (b) shows f(θ) against θ on [0, 1], annotated E(Θ) = 0.07, E(Θ) = 0.19, and E(Θ) = 0.20]

[...] the prior is the plot of a beta distribution with parameters γ = 0.657 and δ = 2.783; the overall mean of Θ across the whole sample is 0.19. The posterior distribution of P for an individual who made three consecutive repeat purchases with the last one in 1998 has most of its mass to the right; the observed sequence of purchases reflects the high mean of this distribution (E(P) = 0.80). At the same time, the three-year hiatus suggests that the supporter is dead as a result of their θ coming from a posterior distribution with an interior mode and with E(Θ) = 0.20.
On the other hand, someone who made three repeat purchases with the last one in 2001 had to be alive over the whole period, which is a result of their θ coming from a beta distribution with most of its mass piled to the left, with E(Θ) = 0.07. The fact that transactions did not occur in three of the six years reflects the fact that their p comes from a distribution with a lower mean (E(P) = 0.53).
These relationships between P and Θ suggest that there may be some correlation in the joint posterior distribution (despite the fact we assume independent priors). This is indeed the case, and we explore it with two analyses in Appendix B. (We discuss a model with correlated priors in §5.)

3.1.2. Conditional Penetration. Ever since the publication of Schmittlein et al. (1987), researchers have shown interest in the P(alive) measure. Although we have reported this quantity as a means of understanding patterns of conditional expectations, we feel that the measure is of limited diagnostic value when viewed by itself. It is a prediction of something that is, by definition, unobservable (i.e., whether or not a customer is still alive at a particular point in time), and thus it is impossible to directly assess its validity. A useful companion measure is a prediction of whether or not the customer will be active in the future, that is, whether or not the customer undertakes any transactions in a specified future period of time.[5]
The probability that a customer is active in the 2002–2006 period (n* = 5) is computed as 1 − P(X(n, n + n*) = 0 | x, t_x, n) using (12), conditional on each of the 22 (x, t_x) patterns associated with n = 6. This conditional penetration is reported in Table 8 as a function of recency (the year of the individual's last transaction) and frequency (the number of repeat transactions).
Comparing Tables 6 and 8, we note that the estimated probabilities of being alive in 2002 are strictly higher than the corresponding conditional 2002–2006 penetration numbers. This makes intuitive sense, but the differences between these measures reflect several factors. First, the P(alive) numbers are just for one year, whereas the penetration numbers are for a five-year period. Second, the mere fact that someone is alive does not mean she will be active, because the latter state depends on the person's underlying transaction probability p. This is very clear when we look at the rightmost column of both tables. Although those people who made a purchase in 2001 have the same probability of being alive, irrespective of frequency, their corresponding probabilities of making at least one transaction in the next five years clearly (and logically) increase as a function of frequency, reflecting in part the associated probabilities of making a purchase at any given transaction opportunity given alive (Table 7). Third, the lower penetration numbers also reflect the fact that inactivity may be due to the person dying in 2003–2006, even if they had been alive in 2002.
In summary, we encourage researchers who might be attracted by the P(alive) measure to also utilize the [...]

[5] Many authors, including Schmittlein et al. (1987), have used the terms "alive" and "active" as synonyms. We feel that this should not be the case, with the term "alive" referring to an unobservable state and the term "active" referring to observable behavior.

Figure 8 Predicted vs. Actual Frequency of Repeat Transactions by the 1995–2000 Cohorts
[Plot: No. of people; series: Actual, Model]

[Further plot residue: horizontal axis Year, 1996 to 2006]

[Parameter estimates table; caption not present in the extracted text]

Model    α       β       γ       δ       LL
BB       0.501   0.753                   -115,615.0
BG/BB    1.188   0.749   0.626   2.331   -110,521.0
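The conditional penetration described in §3.1.2 can be sketched without the paper's expression (12), which is not legible in this section. The Python illustration below relies on an observation that follows from the model structure but is an assumption of this sketch rather than a statement from the paper: a customer with history (x, t_x, n) who makes no transactions in the next n* opportunities has history (x, t_x, n + n*), so the probability of inactivity is the ratio of the two likelihoods. Function names are hypothetical, and the parameters are the 1995 cohort estimates from Table 4.

```python
from math import exp, lgamma


def lbeta(a, b):
    """Log of the beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def bgbb_likelihood(x, t_x, n, alpha, beta, gamma, delta):
    """BG/BB likelihood for frequency x, recency t_x, and n opportunities."""
    denom = lbeta(alpha, beta) + lbeta(gamma, delta)
    total = exp(lbeta(alpha + x, beta + n - x) + lbeta(gamma, delta + n) - denom)
    for i in range(n - t_x):
        total += exp(
            lbeta(alpha + x, beta + t_x - x + i)
            + lbeta(gamma + 1, delta + t_x + i)
            - denom
        )
    return total


def conditional_penetration(x, t_x, n, n_star, alpha, beta, gamma, delta):
    """P(at least one transaction in (n, n + n*] | x, t_x, n)."""
    # zero purchases in the holdout period is the same pattern observed over n + n*
    inactive = bgbb_likelihood(x, t_x, n + n_star, alpha, beta, gamma, delta)
    return 1.0 - inactive / bgbb_likelihood(x, t_x, n, alpha, beta, gamma, delta)


PARAMS = (1.204, 0.750, 0.657, 2.783)  # alpha, beta, gamma, delta, 1995 cohort
```

For donors whose last transaction was in 2001 (t_x = 6), the penetration rises with frequency, for example from `conditional_penetration(3, 6, 6, 5, *PARAMS)` to `conditional_penetration(6, 6, 6, 5, *PARAMS)`, while P(alive) stays constant across those rows. This is the contrast between the rightmost columns of the two tables drawn in the text.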
[...] 56,847 potential repeat supporters.) The conditional expectation plots, omitted in the interests of space, are similarly impressive.
This pooled analysis provides a further illustration of the remarkable ability of the BG/BB model to describe and predict the future behavior of a customer base. It is encouraging to see how one set of parameters can capture the behavior of different cohorts [...]

4. Comparison with the Pareto/NBD Model
Our empirical analysis has focused on the number of repeat "transactions." The alert reader will have questioned our use of the term "transactions" because this is not a "necessarily discrete" setting (Figure 1). Strictly speaking, we have been modeling whether or not the supporter has made any donation to the organization each year; we have ignored the fact that some supporters may make more than one donation in a given year.
We feel that such an approach is perfectly appropriate for two reasons. First, the majority of the supporter base (71%) made only one donation for each of the years during which a transaction occurred. Second, this is the way the nonprofit organization thinks about its donor base; they focus more on whether or not each person has made a donation in any given year (0/1), not as much on the number of donations made. Thus, the 0/1 indicator is the primary behavioral measure recorded in the database provided to us (just as it was for Netzer et al. 2008).
Nevertheless, the fact that 29% of the supporter base made more than one donation in at least one of the years during which a transaction occurred may lead some to argue that we should be modeling the number of donations over time rather than annual incidence; the natural model to use for such an approach to the data would be the Pareto/NBD.
Returning to the 1995 cohort, we obtained data on the number of repeat donations made by each supporter within each year (i.e., the binary string characterization of behavior is replaced by a string of nonnegative integers). Given the interval-censored nature of these data, we estimate the parameters of [...]
[...] the Pareto/NBD model (e.g., Fader et al. 2005), we note that the Pareto/NBD provides a poor fit to the observed donation data.

Figure 10 Comparing the Number of Repeat Donations as Predicted by the Pareto/NBD Model with the Actual Numbers
[Plot: No. of people vs. No. of repeat donations (0 to 10+); series: Actual, Pareto/NBD]

Another test of the Pareto/NBD as a model of the donation process is to estimate the implied flow of annual transactions (i.e., annual incidence) and then examine how well the model captures and predicts the observed transaction patterns. The expected number of people making 0, 1, ..., 6 repeat transactions between 1996 and 2001 is compared to the actual frequency distribution in Figure 11. In contrast to the fit observed for the BG/BB model in Figure 3, we see that the Pareto/NBD fails to capture the observed annual incidence of donations.

Figure 11 Comparing the Number of Repeat Transactions (i.e., Annual Incidence) as Predicted by the Pareto/NBD Model with the Actual Numbers
[Plot: No. of people; series: Actual, Pareto/NBD]

We can also examine how well the model tracks repeat transactions over time, both cumulatively (Figure 12(a)) and year by year (Figure 12(b)). In contrast to the equivalent plots for the BG/BB model (Figures 4(a) and 4(b), respectively), we see that Pareto/NBD fails to track the actual data. The initial
underprediction follows naturally from the overestimation of the number of people making zero donations between 1996 and 2001. We also note that the Pareto/NBD fails to capture the overall rate of decline in transactions over time.

Figure 12 Predicted vs. Actual (a) Cumulative and (b) Annual Repeat Transactions
[Plots: (a) Cumulative no. of repeat transactions by year, (b) No. of repeat transactions by year, 1996 to 2006; series: Actual, Pareto/NBD]

Finally, we examine how well the BG/BB and Pareto/NBD models track (and predict) the evolution of the number of cohort members that ever make a repeat transaction; see Figure 13. Once again we see a strong performance by the BG/BB model and a poor performance by the Pareto/NBD model.

Figure 13 Comparing the Number of Ever-Repeaters as Predicted by the BG/BB and Pareto/NBD Models with the Actual Number
[Plot: No. of ever-repeaters by year, 1996 to 2006; series: Actual, BG/BB, Pareto/NBD]

To summarize, this analysis has demonstrated that the Pareto/NBD model fails to capture the flow of donations. Treating the data as discrete, even though the underlying process is not "necessarily discrete," and modeling the flow of transactions (i.e., incidence, rather than the overall number within each discrete time interval) using the BG/BB model is clearly superior.
Why does the Pareto/NBD perform so poorly in this case? The assumption of exponential interpurchase times between donations (which yields the Poisson count model) is a dubious one in this setting. Donations are made too regularly (e.g., in December of each year) to be accommodated by the memorylessness of the exponential/Poisson. Consider, for example, the 1,203 customers who made a donation every year (Table 2). An individual-level Poisson model would take such a high donation rate and (because of its equi-dispersion property) would predict a fairly large number of years with multiple donations. However, each of these customers made, on average, a total of only 1.3 donations per year across the calibration period. The Pareto/NBD simply cannot cope with such a low level of persistent behavior. Schmittlein et al. (1987, p. 17) explicitly acknowledged this limitation as well: "For processes like church attendance and television viewing the opportunities for a transaction occur regularly, so our model is inappropriate." In contrast, directly modeling annual incidence, as opposed to continuous-time purchasing, as a memoryless process (while the customer is alive) is a much more reasonable approach.

5. Extending the Basic Model
Of all the assumptions associated with the BG/BB model, the one that many readers will have the most problem with is Assumption (6), that the transaction probability p and the dropout probability θ vary independently across customers. This is not nearly as restrictive as it may seem; more formally, we are assuming independent priors, which does not imply independence in the joint posterior distribution of P and Θ. (In fact, we can see some fairly strong correlations in the posterior distributions; see Appendix B.) Nevertheless, we now relax this assumption.
An extremely attractive consequence of Assumptions (4)–(6) (i.e., independent beta-mixing distributions) is that we arrive at simple analytical expressions for all the model quantities of interest, which greatly reduces the barriers to model implementation (e.g., being able to perform all the analysis in an Excel spreadsheet). Ideally, we would like to be able to relax
the independence assumption without losing the ability to derive simple analytical expressions.
The Sarmanov family of distributions, as introduced [...]

Table 10 Results of the Model That Replaces Independent Beta-Mixing Distributions with an $S_{BB}$ Distribution for Heterogeneity in P and Θ
[Table body and the displayed definition of the mixing distribution are not recoverable; surviving fragments: logit(P), MVN]

Because the individual-level process has not changed, the likelihood function for a randomly chosen customer is obtained by taking the expectation of (4) over the joint distribution of P and Θ:

$$L(\,\cdot \mid x, t_x, n) = \int_0^1\!\!\int_0^1 L(p, \theta \mid x, t_x, n)\, f(p, \theta \mid \cdot\,)\, dp\, d\theta$$

where the dots stand for the parameters of the joint mixing distribution, whose symbols are lost in the extracted text.
The major downside of using this distribution is that there is no analytic solution to this double integral. We therefore evaluate the integrals using Monte Carlo simulation; that is, we estimate the model parameters using the method of maximum simulated likelihood (making use of MATLAB). We call this the $S_{BB}$-G/B model.
We first estimate a constrained version of the model in which p and θ are assumed to be uncorrelated. With reference to Table 10, we see that model fit is almost identical to that of the original BG/BB model. The associated moments in the (P, Θ) space are also very close to those associated with the BG/BB model. Allowing for a correlation results in a significant improvement in model fit: an increase of 15 log-likelihood points at the cost of one extra parameter. The estimated (prior) correlation between P and Θ is 0.361 (versus the limit of 0.042 associated with using a Sarmanov bivariate beta distribution).
The big question is whether this improvement in model fit leads to any meaningful improvement in the associated predictions. We first consider how well it tracks aggregate repeat transactions over time. The cumulative and year-by-year numbers are plotted in Figure 14.

Figure 14 Comparing Predicted (a) Cumulative and (b) Annual Repeat Transactions from the BG/BB and $S_{BB}$-G/B Models vs. Actual
[Plots: (a) Cumulative no. of repeat transactions by year, (b) No. of repeat transactions by year, 1996 to 2006; series: Actual, BG/BB, $S_{BB}$-G/B]

We note that the differences in the predictions associated with the BG/BB and $S_{BB}$-G/B models are negligible. However, when we look at the distribution of holdout period transactions (Figure 15), it is clear that the $S_{BB}$-G/B model provides a better prediction of the distribution than the already
excellent prediction associated with the BG/BB model.[7]

Figure 15 Predicted (from the BG/BB and $S_{BB}$-G/B Models) vs. Actual Frequency of Repeat Transactions in 2002–2006
[Plot: No. of people vs. No. of repeat transactions (0 to 5); series: Actual, BG/BB, $S_{BB}$-G/B]

Turning our attention to the conditional expectations, we first look at the expected number of transactions in the 2002–2006 period (n* = 5) conditional on each of the 22 (x, t_x) patterns associated with n = 6. These conditional expectations are reported in Table 11; they are the $S_{BB}$-G/B model equivalents of the numbers reported in Table 5. We note that these conditional expectations are highly correlated with [...]

Table 11 Expected Number of Repeat Transactions in 2002–2006 as a Function of Recency and Frequency, as Predicted by the $S_{BB}$-G/B Model

No. of rpt transactions   Year of last transaction
(1996–2001)               1995   1996   1997   1998   1999   2000   2001
0                         0.10
1                                0.10   0.44   0.75   0.93   1.04   1.11
[Rows 2 to 6 are not present in the extracted text]

Figure 16 Predicted (from the BG/BB and $S_{BB}$-G/B Models) vs. Actual Conditional Expectations of Repeat Transactions in 2002–2006 as a Function of (a) Frequency and (b) Recency
[Plots: No. of repeat transactions (2002–2006); panel (b) horizontal axis: Year of last transaction, 1995 to 2001]

[7] Assessing the relative fit using the chi-squared goodness-of-fit measure, we note that it reduces from 47.9 for the BG/BB model to 4.8 for the $S_{BB}$-G/B model.

[...] this assumption comes at a cost. Whereas the basic BG/BB model can be implemented in Excel, the $S_{BB}$-G/B model requires a less accessible computing environment (e.g., MATLAB). Although allowing for this correlation does lead to some improvements in the model's predictive performance, the numbers are sufficiently similar for us to conclude that the cost-benefit
trade-off is not immediately obvious. We will revisit this issue in the following section.

6. Discussion
We have developed a new model that can be used to answer standard customer-base analysis questions in noncontractual settings where opportunities for transactions occur at discrete intervals. Using a data set on annual donations made by the supporters of a nonprofit organization located in the midwestern United States, we have demonstrated how the model can be used to compute a number of managerially relevant quantities such as future purchasing patterns, both collectively and individually (conditional on past behavior). In examining these quantities we have observed some interesting effects of past behavior (as summarized by recency and frequency) on predictions about future behavior.

The contractual versus noncontractual distinction that lies at the heart of this work is very similar to Jackson's (1985a, b) "lost-for-good" versus "always-a-share" framework. Rust et al. (2004) observe that such a distinction is important, because the estimates of CLV generated by applying a lost-for-good model to data best characterized by the always-a-share assumption will systematically underestimate true CLV. In a discrete-time always-a-share setting, the BB is the natural benchmark model for purchasing from the firm. However, as shown earlier, it substantially overforecasts cumulative repeat transactions; it fails to capture the "leakage" of customers over time typically observed in an always-a-share setting, as also observed by East and Hammond (1996). By allowing for an unobserved "death" component, the BG/BB can be viewed as a "leaky" version of an always-a-share model.

As we mentioned from the outset of this paper, the BG/BB is the direct analog of the Pareto/NBD as one moves from a continuous-time setting to a discrete-time domain. We have brought up a number of specific examples where this distinction is critically important, as well as some situations (characterized as "discretized by recording process" in Figure 1) where the analyst might intentionally convert a continuous-time setting into a discrete-time one, primarily to be able to use the BG/BB model instead of the Pareto/NBD. We are aware of several organizations (including hotel chains, financial services firms, and a variety of nonprofits) that have chosen to focus on discretized data, either on their own (such as the organization that provided the data used here) or specifically to utilize the BG/BB framework. The fact that they have approached their data management/analysis in such a manner is an indication of the direct applicability of this new model.

Various benefits associated with the BG/BB have been mentioned throughout this paper, and we summarize them here.
• The BG/BB offers tremendous advantages in terms of the required data structures. The size of the data summary required for model estimation is purely a function of the number of transaction opportunities, not the number of customers, and therefore the model is highly scalable to customer bases of different sizes. Furthermore, in recognizing that recency and frequency are sufficient summary statistics, the relationship between the number of transaction opportunities and the size of the data set is on the order of $n^2$, which is a significant reduction compared to using the full binary strings (order $2^n$).
• Besides the efficient data requirements, the calculations associated with the model are much simpler than those of the Pareto/NBD. No unconventional or computationally demanding functions are required for parameter estimation or for most of the diagnostic statistics that emerge from the model. Taken together with the aforementioned data advantages, this means that the model is easy to fully implement and utilize within a standard spreadsheet environment, as illustrated in Figure 2. This is very appealing to practitioners, because this reduction in space/effort can be accomplished at virtually no cost (i.e., without sacrificing anything in model performance, as shown in our empirical analyses).
• Pragmatic considerations aside, we see that the Pareto/NBD can fail to capture the flow of donations, be it the actual number or annual incidence. We suspect that there are many settings (particularly when periodic transactions tend to occur during a relatively limited range of time) when these shortcomings of the Pareto/NBD will be quite evident.
• The discrete nature of the data and the associated behavioral story lead to model diagnostics that are convenient to display and are readily interpretable. For instance, it is very easy to see and appreciate the nonlinear pattern associated with high frequency and low recency, shown in Table 5. Likewise, a simple examination of that table instantly answers the managerial questions raised in the introduction.
• Finally, it is relatively easy to build and analyze the BG/BB model across multiple cohorts of customers, something that has been done rarely (if ever) in the Pareto/NBD literature. Not only does this make the model even more practical, but the multiyear empirical results shown here offer much stronger support for the model's validity than a single-cohort analysis can provide.

Although the BG/BB is an excellent starting point for modeling discrete-time noncontractual data, there are several natural extensions worth investigating in future research. First, as is the case with the
Pareto/NBD model, the BG/BB model will need to be augmented by a model of purchase amounts when we are interested in the overall monetary value of each customer. A natural candidate would be the gamma-gamma mixture (Colombo and Jiang 1999) that Fader et al. (2005) use in conjunction with the Pareto/NBD model. In situations (such as the data set used here) that are not necessarily discrete and where there is the possibility that more than one transaction could occur in each discrete-time interval, we should derive the monetary-value multiplier by first modeling the number of transactions (conditional on the fact that at least one transaction occurred) and then multiply this by the average value per transaction. A logical model would be the shifted beta-geometric distribution (as used by Morrison and Perry 1970 to model purchase quantity, conditional on purchase incidence).

Second, we may want to allow for a non-zero-order purchasing process at the individual level. A good historical starting point would be the Brand Loyal Model (Massy et al. 1970). This would effectively be an extension of the Markov chain model of retail customer behavior at Merrill Lynch by Morrison et al. (1982), an extension in which the exit parameter is allowed to be heterogeneous and is estimated directly from the data (as opposed to being derived from other data sources).

The research presented in this paper is clearly anchored in the "probability models for customer-base analysis" tradition, of which the Pareto/NBD is a central model. As Fader and Hardie (2009) note, this stream of research uses combinations of basic probability distributions to develop simple models of customer behavior that can be used to make predictions of future behavior conditional on customers' past behavior. It is perhaps useful to reflect on how this fits within the broader customer profitability/CLV/customer equity literature, as exemplified by a number of top managerially oriented books (e.g., Blattberg et al. 2001, Gupta and Lehmann 2005, Kumar 2008, Rust et al. 2000) and the large academic literature (e.g., as reviewed in Blattberg et al. 2008), especially in light of the fact that the effects of factors such as marketing activities are completely ignored.

If one takes an evolutionary model-building view of embedding analytics in an organization (Urban and Karash 1971), models such as the BG/BB represent a natural first step. These models can be implemented by an organization at very low cost. For example, no new software is required and the model can be coded up in a blank spreadsheet in a matter of minutes; furthermore, the data requirements are minimal and do not require the merging of databases, as is typically the case when wanting to incorporate the effects of marketing activities, assuming such data are readily available in the first case.8 If some of the underlying modeling assumptions are unappealing (e.g., the assumption of independence between the transaction and dropout probabilities), we can create a "version 2.0" of the model that comes at some increased computational cost.

8 In the nonprofit example considered in this paper, we know that marketing activities were undertaken but the data were not available. There was no indication that these activities were customized at the donor level.

Implicit in these basic models is the assumption that future marketing activities will be basically the same as past marketing activities. The impressive predictive performance of the BG/BB model suggests that this is not an overly restrictive assumption. If there has been some customization of marketing activities on the basis of outputs generated from this model (e.g., after scoring the customer database on the basis of P(alive) or the conditional expectations), then all we would need to do is reestimate the model on an updated data set when it is time to apply the model again in the future. (Given that this can be done in Excel, such reestimation comes at very low cost.) Furthermore, the forecasts generated by the model provide a natural (and low-cost) baseline for examining the performance of the customized marketing activities.

Beyond efforts to use the BG/BB for customized marketing activities, a similar iterative approach can be applied to better understand other kinds of time-varying marketing activities. In ongoing field applications of the model, we encourage organizations to rerun the model on a periodic basis to try to detect notable deviations from its baseline predictions, as well as to make inferences about the changing nature of the underlying "buy" and "die" processes. Likewise, we encourage organizations to run the model separately for different cohorts of customers, e.g., based on their date and/or channel of acquisition. It is often possible to detect systematic shifts across these incoming customer groups, which can help refine expectations and acquisition tactics for newly acquired customers. Although these efforts admittedly fall short of a full-blown optimization strategy, they help organizations gain a much better feel for the evolving patterns of their customer base and the effectiveness of their marketing efforts.

As this kind of analytics culture gets embedded into a marketing organization, we can expect managers to begin to ask deeper kinds of "what-if" and resource allocation questions tied to marketing variables. Assuming all the data are readily available in the organization, it is possible to develop models that incorporate these effects (e.g., Kumar et al. 2008; also see the review by Blattberg et al. 2009). As covariates
are incorporated, data structures and model estimation issues become more complex. To the extent that customers have been targeted with different marketing activities on the basis of their past behavior, we must also account for endogeneity. This is clearly a major step up the evolutionary ladder of marketing analytics in the organization. We feel that it is important that any organization embarking on such a journey should learn to walk before they can run, and the BG/BB seems to be a solid way to start the journey.

Acknowledgments
The authors thank the anonymous nonprofit organization for making the data set available, Paul Berger for his extensive input into an earlier version of this paper, and Katie Palusci for her capable research assistantship. The first author acknowledges the support of the Wharton Interactive Media Initiative. The second author acknowledges the support of the London Business School Centre for Marketing and the hospitality of the Department of Marketing at the University of Auckland Business School. The authors thank the acting editor-in-chief, the area editor, and both reviewers for their encouragement and insightful comments. A good paper has gotten even better as a result of their careful reading throughout the review process.

Appendix A. Derivations
In this appendix we present derivations of the key results presented in Section 2.2. Before starting, we first recall that for $0 < k < 1$,
• The sum of the first $n$ terms of a geometric series is
\[ a + ak + ak^2 + \cdots + ak^{n-1} = a\,\frac{1 - k^n}{1 - k} \tag{A1} \]
• The sum of an infinite geometric series is
\[ \sum_{n=0}^{\infty} ak^n = \frac{a}{1 - k} \tag{A2} \]
and note the following transformation of Euler's integral representation of the Gaussian hypergeometric function (${}_2F_1(a, b; c; z)$):
\[ \int_0^1 t^{b-1}(1-t)^{c-b-1}(1-zt)^{-a}\,dt = B(b, c-b)\,{}_2F_1(a, b; c; z), \quad c > b. \tag{A3} \]

A.1. Derivation of (7)
An individual making $x$ purchases had to be alive for at least $x$ transaction opportunities. The probability of making $x$ purchases over the $i$ transaction opportunities for which the customer is alive is
\[ \binom{i}{x} p^x (1-p)^{i-x}. \]
Removing the conditioning on being alive for $i$ transaction opportunities by multiplying this by the probability that the individual is alive for that length of time gives us
\[ P(X(n) = x \mid p, \theta) = \binom{n}{x} p^x (1-p)^{n-x} (1-\theta)^n + \sum_{i=x}^{n-1} \binom{i}{x} p^x (1-p)^{i-x}\, \theta (1-\theta)^i. \tag{A4} \]
Taking the expectation of this over the mixing distributions for $P$ and $\Theta$ ((1) and (2), respectively) gives us (7).

A.2. Derivation of (8)
Conditional on $p$ and $\theta$, the expected number of transactions over $n$ transaction opportunities is computed as
\[ E(X(n) \mid p, \theta) = \sum_{t=1}^{n} P(Y_t = 1 \mid p, \text{alive at } t)\, P(\text{alive at } t \mid \theta) = p \sum_{t=1}^{n} (1-\theta)^t = p(1-\theta) \sum_{s=0}^{n-1} (1-\theta)^s, \]
which, recalling (A1) and performing some further algebra,
\[ = \frac{p(1-\theta)}{\theta} - \frac{p(1-\theta)^{n+1}}{\theta}. \tag{A5} \]
Taking the expectation of this over the mixing distributions for $P$ and $\Theta$ gives us
\[ E(X(n) \mid \alpha, \beta, \gamma, \delta) = \frac{\alpha}{\alpha+\beta} \cdot \frac{B(\gamma-1, \delta+1) - B(\gamma-1, \delta+n+1)}{B(\gamma, \delta)}. \]
(Strictly speaking, the use of the integral representation of the beta function to solve the integral associated with taking the expectation over $\Theta$ only holds for $\gamma > 1$. However, it can be shown that we arrive at the same result when $0 < \gamma < 1$.) Representing the beta functions in terms of gamma functions and recalling the recursive property of gamma functions gives us (8). Reflecting on the bracketed term in (8) as $n \to \infty$, we note that $E(X(n))$ grows to a limit of
\[ \frac{\alpha}{\alpha+\beta} \cdot \frac{\delta}{\gamma-1} \]
when $\gamma > 1$. When $\gamma < 1$, there is no limit on $E(X(n))$. (The Pareto/NBD model shares this property regarding the existence of a limit.)

A.3. Derivation of (9) and (10)
Recalling (A4), it follows from the memoryless nature of the death process that
\[ P(X(n, n+n^*) = x^* \mid p, \theta, \text{alive at } n) = \binom{n^*}{x^*} p^{x^*} (1-p)^{n^*-x^*} (1-\theta)^{n^*} + \sum_{i=x^*}^{n^*-1} \binom{i}{x^*} p^{x^*} (1-p)^{i-x^*}\, \theta (1-\theta)^i. \tag{A6} \]
Noting that the probability that someone is alive at $n$ is $(1-\theta)^n$, we have
\[ P(X(n, n+n^*) = x^* \mid p, \theta) = \delta_{x^*=0}\left[1 - (1-\theta)^n\right] + \binom{n^*}{x^*} p^{x^*} (1-p)^{n^*-x^*} (1-\theta)^{n+n^*} + \sum_{i=x^*}^{n^*-1} \binom{i}{x^*} p^{x^*} (1-p)^{i-x^*}\, \theta (1-\theta)^{n+i}. \]
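The individual-level results in A.1 to A.3 can be checked numerically. The sketch below is an illustration rather than part of the original derivation: it evaluates (A4) and the unconditional distribution of $X(n, n+n^*)$ for one pair $(p, \theta)$, then confirms that each distribution sums to one and that the means agree with the closed form (A5). The function names and the values of `p`, `theta`, `n`, and `n_star` are assumptions chosen for illustration, not estimates from the paper.

```python
from math import comb, isclose


def pmf_transactions(x, n, p, theta):
    """P(X(n) = x | p, theta), following (A4)."""
    # Customer survives all n opportunities and buys at x of them.
    alive_throughout = comb(n, x) * p**x * (1 - p) ** (n - x) * (1 - theta) ** n
    # Customer is alive for exactly i < n opportunities, then drops out.
    dropped_out = sum(
        comb(i, x) * p**x * (1 - p) ** (i - x) * theta * (1 - theta) ** i
        for i in range(x, n)
    )
    return alive_throughout + dropped_out


def expected_transactions(n, p, theta):
    """E[X(n) | p, theta], the closed form (A5)."""
    return p * (1 - theta) / theta - p * (1 - theta) ** (n + 1) / theta


def pmf_future_transactions(x_star, n, n_star, p, theta):
    """P(X(n, n + n*) = x* | p, theta), the unconditional result in A.3."""
    # Anyone who has dropped out by n can only contribute to x* = 0.
    dead_by_n = 1 - (1 - theta) ** n if x_star == 0 else 0.0
    alive_throughout = (
        comb(n_star, x_star) * p**x_star * (1 - p) ** (n_star - x_star)
        * (1 - theta) ** (n + n_star)
    )
    dropped_out = sum(
        comb(i, x_star) * p**x_star * (1 - p) ** (i - x_star)
        * theta * (1 - theta) ** (n + i)
        for i in range(x_star, n_star)
    )
    return dead_by_n + alive_throughout + dropped_out


# Illustrative values only (hypothetical, not taken from the paper).
p, theta, n, n_star = 0.3, 0.2, 6, 5

total = sum(pmf_transactions(x, n, p, theta) for x in range(n + 1))
mean = sum(x * pmf_transactions(x, n, p, theta) for x in range(n + 1))
future_total = sum(
    pmf_future_transactions(k, n, n_star, p, theta) for k in range(n_star + 1)
)
future_mean = sum(
    k * pmf_future_transactions(k, n, n_star, p, theta) for k in range(n_star + 1)
)
# Mean over (n, n + n*] should equal E[X(n + n*)] - E[X(n)].
mean_difference = expected_transactions(n + n_star, p, theta) - expected_transactions(n, p, theta)

print(isclose(total, 1.0), isclose(mean, expected_transactions(n, p, theta)))
print(isclose(future_total, 1.0), isclose(future_mean, mean_difference))
```

The last comparison mirrors, at the individual level, the identity $E(X(n, n+n^*)) = E(X(n+n^*)) - E(X(n))$ that is used to obtain (10); mixing over the beta distributions for $P$ and $\Theta$ is deliberately left out of this sketch.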
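(The first term accounts for the fact that anyone not alive at $n$ will, by definition, not make any purchases in the interval $(n, n+n^*]$.) Taking the expectation of this over the mixing distributions for $P$ and $\Theta$ gives us (9).
By definition, $X(n, n+n^*) = X(n+n^*) - X(n)$; it follows that $E(X(n, n+n^*)) = E(X(n+n^*)) - E(X(n))$. Substituting (8) in this gives us (10).

A.4. Derivation of (11)
\[ P(\text{alive at } n+1 \mid p, \theta, x, t_x, n) = \frac{p^x (1-p)^{n-x} (1-\theta)^{n+1}}{L(p, \theta \mid x, t_x, n)}. \tag{A8} \]
By Bayes' theorem, the joint posterior distribution of $P$ and $\Theta$ is given by
\[ f(p, \theta \mid \alpha, \beta, \gamma, \delta, x, t_x, n) = \frac{L(p, \theta \mid x, t_x, n)\, f(p \mid \alpha, \beta)\, f(\theta \mid \gamma, \delta)}{L(\alpha, \beta, \gamma, \delta \mid x, t_x, n)}, \tag{A9} \]
where the individual elements are given in (1), (2), (4), and (5). Taking the expectation of (A8) over the joint posterior distribution of $P$ and $\Theta$ gives us (11).
By the same logic, we can derive an expression for the probability that a customer with purchase history $(x, t_x, n)$ is alive at transaction opportunity $n+m$. Conditional on $p$ and $\theta$,
\[ P(\text{alive at } n+m \mid p, \theta, x, t_x, n) = \frac{p^x (1-p)^{n-x} (1-\theta)^{n+m}}{L(p, \theta \mid x, t_x, n)}. \]
Taking the expectation of this over the joint posterior distribution of $P$ and $\Theta$ yields
\[ P(\text{alive at } n+m \mid \alpha, \beta, \gamma, \delta, x, t_x, n) = \frac{B(\alpha+x, \beta+n-x)}{B(\alpha, \beta)} \cdot \frac{B(\gamma, \delta+n+m)}{B(\gamma, \delta)} \cdot \frac{1}{L(\alpha, \beta, \gamma, \delta \mid x, t_x, n)}. \tag{A10} \]

A.5. Derivation of (12)
By definition,
\[ P(X(n, n+n^*) = x^* \mid p, \theta, x, t_x, n) = \delta_{x^*=0}\left[1 - P(\text{alive at } n \mid p, \theta, x, t_x, n)\right] + P(X(n, n+n^*) = x^* \mid p, \theta, \text{alive at } n)\, P(\text{alive at } n \mid p, \theta, x, t_x, n). \]
Substituting (A6) and (A7) in this, and taking the expectation over the joint posterior distribution of $P$ and $\Theta$, (A9), gives us (12).

A.6. Derivation of (13)
Conditional on $p$ and $\theta$, the expected number of transactions across the next $n^*$ transaction opportunities (i.e., in the interval $(n, n+n^*]$) by a customer with purchase history $(x, t_x, n)$ is
\[ E(X(n, n+n^*) \mid p, \theta, x, t_x, n) = E(X(n, n+n^*) \mid p, \theta, \text{alive at } n)\, P(\text{alive at } n \mid p, \theta, x, t_x, n), \]
where
\[ E(X(n, n+n^*) \mid p, \theta, \text{alive at } n) = p \sum_{s=1}^{n^*} (1-\theta)^s = \frac{p(1-\theta)}{\theta} - \frac{p(1-\theta)^{n^*+1}}{\theta}. \tag{A11} \]
Taking the expectation of the product of (A7) and (A11) over the joint posterior distribution of $P$ and $\Theta$, (A9), and simplifying (i.e., representing certain beta functions in terms of gamma functions and exploiting the recursive property of gamma functions) gives us (13).

A.7. Derivation of (14)
The number of discounted expected residual transactions for a customer alive at $n$ is
\[ DERT(d \mid p, \theta, \text{alive at } n) = \sum_{t=n+1}^{\infty} \frac{P(Y_t = 1 \mid p, \text{alive at } t)\, P(\text{alive at } t \mid \theta, t > n)}{(1+d)^{t-n}} = p \sum_{t=n+1}^{\infty} \frac{(1-\theta)^{t-n}}{(1+d)^{t-n}} = p\, \frac{1-\theta}{1+d} \sum_{s=0}^{\infty} \left(\frac{1-\theta}{1+d}\right)^s, \]
which, recalling (A2),
\[ = \frac{p(1-\theta)}{d+\theta}. \tag{A12} \]
Multiplying this by the probability that a customer with purchase history $(x, t_x, n)$ (and latent transaction and dropout probabilities $p$ and $\theta$) is still alive at transaction opportunity $n$, (A7), gives us
\[ DERT(d \mid p, \theta, x, t_x, n) = \frac{p^{x+1} (1-p)^{n-x} (1-\theta)^{n+1}}{(d+\theta)\, L(p, \theta \mid x, t_x, n)}. \tag{A13} \]
Taking the expectation of this over the joint posterior distribution of $P$ and $\Theta$, (A9), gives us
\[ DERT(d \mid \alpha, \beta, \gamma, \delta, x, t_x, n) = \frac{B(\alpha+x+1, \beta+n-x)}{B(\alpha, \beta)} \cdot \frac{1}{L(\alpha, \beta, \gamma, \delta \mid x, t_x, n)} \int_0^1 \frac{\theta^{\gamma-1} (1-\theta)^{\delta+n}}{B(\gamma, \delta)\,(d+\theta)}\, d\theta, \]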
which, recalling (A3),
\[ = \frac{B(\alpha+x+1, \beta+n-x)}{B(\alpha, \beta)} \cdot \frac{B(\gamma, \delta+n+1)}{B(\gamma, \delta)(1+d)} \cdot \frac{{}_2F_1\!\left(1, \delta+n+1; \gamma+\delta+n+1; \frac{1}{1+d}\right)}{L(\alpha, \beta, \gamma, \delta \mid x, t_x, n)}, \]
giving us the expression in (14).
It is interesting to note that this expression for DERT differs from that for the conditional expectation, (13), by a factor of
\[ \frac{(\gamma-1)\, {}_2F_1\!\left(1, \delta+n+1; \gamma+\delta+n+1; \frac{1}{1+d}\right)}{(\gamma+\delta+n)(1+d)} \left[ 1 - \frac{\Gamma(\gamma+\delta+n)}{\Gamma(\gamma+\delta+n+n^*)} \cdot \frac{\Gamma(1+\delta+n+n^*)}{\Gamma(1+\delta+n)} \right]^{-1}. \]
For any given analysis setting, this is a constant, independent of the customer's exact purchase history. Therefore, any ranking of customers on the basis of DERT will be exactly the same as that derived using the conditional expectation of purchasing over the next $n^*$ periods. When $\gamma > 1$ and $d = 0$ (i.e., there is no discounting of future purchases), this converges to 1 as $n^* \to \infty$.
Because $L(\alpha, \beta, \gamma, \delta \mid x, t_x, n) = 1$ when $x = t_x = n = 0$, it follows that the number of discounted expected transactions (DET) for a just-acquired customer is
\[ DET(d \mid \alpha, \beta, \gamma, \delta) = \frac{\alpha}{\alpha+\beta} \cdot \frac{\delta}{\gamma+\delta} \cdot \frac{{}_2F_1\!\left(1, \delta+1; \gamma+\delta+1; \frac{1}{1+d}\right)}{1+d}. \tag{A14} \]
To compute DET for a yet-to-be-acquired customer, we need to add 1 to this quantity (i.e., the purchase at time $t = 0$ that corresponds to the customer's first-ever purchase with the firm and therefore starts the transaction opportunity clock).

A.8. Derivation of (15)–(17)
We obtain (15) and (16) by integrating (A9) over $\theta$ and $p$, respectively.
By definition, the $(l, m)$th product moment ($l, m = 0, 1, 2, \ldots$) of the joint posterior distribution of $P$ and $\Theta$ is
\[ E(P^l \Theta^m \mid \alpha, \beta, \gamma, \delta, x, t_x, n) = \int_0^1 \!\! \int_0^1 p^l \theta^m f(p, \theta \mid \alpha, \beta, \gamma, \delta, x, t_x, n)\, dp\, d\theta \]

distributions, as we show here in two separate analyses that demonstrate how these correlations can be estimated and interpreted.

First, following an analysis shown in Abe (2009), Figure B.1 is a scatter plot of the means of the marginal posterior distributions of $P$ and $\Theta$. Each circle represents the pair of means for a particular purchase history $(x, t_x, n)$ (computed using (17) with $l = 1, m = 0$ and $l = 0, m = 1$, respectively), and the area of each circle is directly proportional to the number of customers who share the same purchase history (i.e., using the numbers from Table 2). The weighted correlation across the 22 pairs of numbers is −0.42. This implies, as common intuition would suggest, that customers who purchase more frequently (while alive) tend to live longer than light purchasers (but of course we do not want to imply any kind of causal connection here).

Figure B.1   Scatter Plot of the Marginal Posterior Means of $P$ and $\Theta$ for the 22 $(x, t_x)$ Patterns Associated with $n = 6$
[Figure: y-axis: $E(\Theta)$.]

However, this analysis tells only part of the story because it only considers the posterior means. When we take into account the full posterior distribution for a given customer, a different correlation analysis emerges. Suppose for each customer in a given recency/frequency group we made a number of draws from their joint posterior distribution. What would be the correlation between $p$ and $\theta$ across these draws? The joint posterior distribution of $P$ and $\Theta$ is given by (A9). For the special case where $t_x = n$, this collapses to
\[ f(p, \theta \mid \alpha, \beta, \gamma, \delta, x, t_x, n) = f(p \mid \alpha+x, \beta+n-x)\, f(\theta \mid \gamma, \delta+n), \]
i.e., the posterior distribution of $P$ is independent of the posterior distribution of $\Theta$. (Equivalently, the marginal posterior distributions of $P$ and $\Theta$, (15) and (16), collapse to the updated beta distributions $f(p \mid \alpha+x, \beta+n-x)$ and $f(\theta \mid \gamma, \delta+n)$, respectively.) In all other cases, the posterior distribution of an individual's transaction probability is not independent of the posterior distribution of her dropout probability. The joint posterior correlation is given by
\[ \mathrm{corr}(P, \Theta \mid \alpha, \beta, \gamma, \delta, x, t_x, n) = \frac{E(P\Theta) - E(P)E(\Theta)}{\sqrt{\left[E(P^2) - E(P)^2\right]\left[E(\Theta^2) - E(\Theta)^2\right]}}, \tag{B1} \]
where the individual terms are computed using (17). This correlation is reported in Table B.1 as a function of recency (the year of the individual's last transaction) and frequency (the number of repeat transactions).

Table B.1
No. of rpt transactions     Year of last transaction
                            1998    1999    2000    2001
3                           0.174   0.214   0.071   0.000
4                                   0.214   0.114   0.000
5                                           0.190   0.000
6                                                   0.000

This table shows that the intracustomer correlations are strictly positive (except when $t_x = n$), or, equivalently, if we were to draw from the joint posteriors across all the individuals that are represented within each cell of this table, we would see these positive correlations. In the most extreme case, i.e., when $t_x = n = 0$, we see a fairly strong relationship between $p$ and $\theta$. This makes sense: customers in this cell with a higher purchasing propensity are even more likely (than light purchasers) to be dead. However, across cells, the overall correlation is a fairly strong negative one, as discussed previously. In some sense, this combined analysis (within and across each type of customer) represents a form of Simpson's paradox (Simpson 1951, Wagner 1982).

Taken together, these two analyses provide a more complete picture of the correlations than shown by Abe (2009) and other researchers, who have limited themselves to a simple correlation across the posterior means. More importantly, these analyses put to rest any concerns that a simple empirical Bayesian model with independent priors will be unable to capture and reveal correlations in the underlying processes. To the contrary, these analyses arise quite naturally from the BG/BB model, and the same is true for the Pareto/NBD and other related models.

References
Abe, M. 2009. Counting your customers one by one: A hierarchical Bayes extension to the Pareto/NBD model. Marketing Sci. 28(3) 541–553.
Blattberg, R. C., E. C. Malthouse, S. A. Neslin. 2009. Customer lifetime value: Empirical generalizations and some conceptual questions. J. Interactive Marketing 23(2) 157–168.
Chatfield, C., G. J. Goodhardt. 1970. The beta-binomial model for consumer purchasing behaviour. Appl. Statist. 19(3) 240–250.
Colombo, R., W. Jiang. 1999. A stochastic RFM model. J. Interactive Marketing 13(3) 2–12.
Danaher, P. J., B. G. S. Hardie. 2005. Bacon with your eggs? Applications of a new bivariate beta-binomial distribution. Amer. Statistician 59(November) 282–286.
East, R., K. Hammond. 1996. The erosion of repeat-purchase loyalty. Marketing Lett. 7(2) 163–171.
Easton, G. 1980. Stochastic models of industrial buying behaviour. OMEGA 8(1) 63–69.
Ehrenberg, A. S. C. 1988. Repeat-Buying, 2nd ed. Charles Griffin & Company, London.
Fader, P. S., B. G. S. Hardie. 2005. Implementing the Pareto/NBD model given interval-censored data. Retrieved June 26, 2010, http://brucehardie.com/notes/011/.
Fader, P. S., B. G. S. Hardie. 2009. Probability models for customer-base analysis. J. Interactive Marketing 23(1) 61–69.
Fader, P. S., B. G. S. Hardie, K. L. Lee. 2005. RFM and CLV: Using iso-value curves for customer base analysis. J. Marketing Res. 42(4) 415–430.
Gupta, S., D. R. Lehmann. 2005. Managing Customers as Investments: The Strategic Value of Customers in the Long Run. Wharton School Publishing, Upper Saddle River, NJ.
Jackson, B. B. 1985a. Build customer relationships that last. Harvard Bus. Rev. 63(November–December) 120–128.
Jackson, B. B. 1985b. Winning and Keeping Industrial Customers. Lexington Books, New York.
Johnson, N. L. 1949. Bivariate distributions based on simple translation systems. Biometrika 36(3–4) 297–304.
Kumar, V. 2008. Managing Customers for Profit. Wharton School Publishing, Upper Saddle River, NJ.
Kumar, V., R. Venkatesan, T. Bohling, D. Beckmann. 2008. Practice Prize Report: The power of CLV: Managing customer lifetime value at IBM. Marketing Sci. 27(4) 585–599.
Mason, C. H. 2003. Tuscan lifestyles: Assessing customer lifetime value. J. Interactive Marketing 17(4) 54–60.
Massy, W. F., D. B. Montgomery, D. G. Morrison. 1970. Stochastic Models of Buying Behavior. MIT Press, Cambridge, MA.
Morrison, D. G., A. Perry. 1970. Some data based models for analyzing sales fluctuations. Decision Sci. 1(3–4) 258–274.
Morrison, D. G., D. C. Schmittlein. 1988. Generalizing the NBD model for customer purchases: What are the implications and is it worth the effort? J. Bus. Econom. Statist. 6(2) 145–159.
Morrison, D. G., R. D. H. Chen, S. L. Karpis, K. E. A. Britney. 1982. Modelling retail customer behavior at Merrill Lynch. Marketing Sci. 1(2) 123–141.
Netzer, O., J. M. Lattin, V. Srinivasan. 2008. A hidden Markov model of customer relationship dynamics. Marketing Sci. 27(2) 185–204.
Park, Y.-H., P. S. Fader. 2004. Modeling browsing behavior at multiple websites. Marketing Sci. 23(3) 280–303.
Pfeifer, P. E., M. E. Haskins, R. M. Conroy. 2005. Customer lifetime value, customer profitability, and the treatment of acquisition spending. J. Managerial Issues 17(1) 11–25.
Piersma, N., J.-J. Jonker. 2004. Determining the optimal direct mailing frequency. Eur. J. Oper. Res. 158(1) 173–182.
Rosset, S., E. Neumann, U. Eick, N. Vatnik. 2003. Customer lifetime value models for decision support. Data Mining Knowledge
Schmittlein, D. C., R. A. Peterson. 1994. Customer base analysis: An industrial purchase process application. Marketing Sci. 13(1) 41–67.
Schmittlein, D. C., D. G. Morrison, R. Colombo. 1987. Counting your customers: Who are they and what will they do next? Management Sci. 33(1) 1–24.
Simpson, E. H. 1951. The interpretation of interaction in contingency tables. J. Roy. Statist. Soc. Ser. B 13(2) 238–241.
Skellam, J. G. 1948. A probability distribution derived from the