
MARCH 2, 2016

PREDICTING CUSTOMER CHURN: ANALYSIS RESULTS

SUXIN DENG, ALISON GOLENSKY, XINYU LIU, YUAN ZHAO AND BLAIR ZIMELIS

ANALYTIC TECHNIQUES IMC 451


MEDILL, NORTHWESTERN UNIVERSITY

1. OVERVIEW

The model helped determine that the customers most likely to churn were younger in age
and less engaged with QWE Inc.

This report recommends a predictive model for identifying the customers at greatest risk (80%)
of churning from QWE Inc.'s solution services over the next two months (after February 2012).
Using the data provided, we built a model that QWE Inc. can use to predict which specific
customers are most likely to end their relationship with the company (churn). The model also
identified the top three drivers of churn for those customers. It was built from a regression
decision tree and uses the following information to predict a customer's likelihood of leaving
the company (churn rate):

Days Since Last Login
Change in Number of Monthly Logins
Age

PREDICTING CUSTOMER CHURN BASED ON BEHAVIORAL DATA

The objective of QWE Inc. is to estimate how likely a customer is to leave in the next two
months based on behavioral data, including customer age, CHI score, etc. Currently, 5.1% of
all customers churn within two months.
We took two approaches to the prediction: logistic regression and a decision tree.
2. RECOMMENDATION

Based on the model built from a decision tree and verified by a logistic regression, the top
three drivers of customer churn are days since last login, change in monthly logins and
account age. Using these drivers, the model identified the customers with the highest
probability of churning (see Figure A in Appendix). The model predicts with 80% precision that
customers will leave under the following circumstances:
CHANGE IN DAYS SINCE LAST LOGIN    CHANGE IN MONTHLY LOGINS    ACCOUNT AGE (Months)
18                                 2.5                         22


Therefore, we recommend that QWE Inc. continue to monitor these top three drivers of churn.
The model will allow the company to identify customers at risk of churning ahead of time,
engage them in a more targeted way and keep them from leaving.
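
The three conditions above can be read as a simple screening rule. The Python sketch below is illustrative only and is not the fitted tree itself. The table lists threshold values without comparison signs, so the directions used here (the gap since last login grew by at least 18 days, monthly logins fell by at least 2.5, the account is younger than 22 months) are assumptions, and the `Customer` fields, function names and sample customers are hypothetical.

```python
from dataclasses import dataclass

# Threshold values come from the table above. The comparison directions
# (>=, <=, <) are assumptions, because the table lists only the numbers.
MIN_INCREASE_DAYS_SINCE_LOGIN = 18
MIN_DROP_MONTHLY_LOGINS = 2.5
MAX_ACCOUNT_AGE_MONTHS = 22


@dataclass
class Customer:
    """Hypothetical record holding the three drivers for one customer."""
    customer_id: int
    change_days_since_last_login: float
    change_monthly_logins: float
    account_age_months: int


def is_high_risk(c: Customer) -> bool:
    """Return True only when all three conditions hold at once."""
    return (
        c.change_days_since_last_login >= MIN_INCREASE_DAYS_SINCE_LOGIN
        and c.change_monthly_logins <= -MIN_DROP_MONTHLY_LOGINS
        and c.account_age_months < MAX_ACCOUNT_AGE_MONTHS
    )


def flag_high_risk(customers):
    """Return the ids of customers who meet all three conditions."""
    return [c.customer_id for c in customers if is_high_risk(c)]


# Made-up customers, for illustration only.
sample = [
    Customer(1, 20, -3, 14),   # meets all three conditions
    Customer(2, -1, -4, 13),   # login gap did not grow enough
    Customer(3, 25, -5, 30),   # account is too old
]
print(flag_high_risk(sample))
```

A customer who fails any one of the conditions is left off the list, which is why the rule produces a short but precise set of names.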
TESTING OUR MODEL

Change in Days Since Last Login | Change in Monthly Logins | Account Age | Meeting 3 Conditions Above | Churn (Yes/No)

672   17   16   No   No
354   -1   -4   13   No   No
5203   No   No


3. MERITS AND DEMERITS OF THE MODEL

LOGISTIC REGRESSION

Logistic Regression can give us a broader list (as broad as we want) and the level of
precision is relatively stable with a larger sample size.

Different communication methods would be designed to improve customers' performance on
different characteristics. Assuming that we had a limited budget and could implement only one
method, we ran a logistic regression on each variable individually. This allowed us to identify
the single variable that would find the most customers who actually churned among the 100
predicted to leave. The chart below shows the precision of our prediction with each variable
(how many customers did leave out of the 100 chosen by the prediction). Days Since Last Login,
CHI score, Views and Logins are the four most effective single variables for prediction.
Var          Age   CHI   CHI-Dec   SP-Dec   SP   Logins   Views   Blogs   Days
Precision    2%    14%   6.5%      4.5%     3%   8%       10%     4%      14.8%
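
The precision figures above measure how many of the top-ranked customers actually churned. The following Python sketch shows one way to compute that measure. The function name, the scores and the outcomes are hypothetical, and the scores could come from any model, including a single-variable logistic regression.

```python
def precision_at_k(scores, churned, k):
    """Fraction of the k highest-scored customers who actually churned.

    scores  -- predicted churn scores, one per customer (higher = riskier)
    churned -- matching booleans, True if the customer really left
    k       -- how many top-ranked customers to contact
    """
    if k <= 0:
        raise ValueError("k must be positive")
    if len(scores) != len(churned):
        raise ValueError("scores and churned must have the same length")
    if not scores:
        raise ValueError("at least one customer is required")
    # Rank customers from highest to lowest predicted risk.
    ranked = sorted(zip(scores, churned), key=lambda pair: pair[0], reverse=True)
    top = ranked[:k]
    hits = sum(1 for _, did_churn in top if did_churn)
    return hits / len(top)


# Made-up scores and outcomes for four customers, for illustration only.
example_scores = [0.9, 0.1, 0.8, 0.4]
example_churned = [True, False, False, True]
print(precision_at_k(example_scores, example_churned, k=2))
```

With k set to 100, this is the quantity reported in the table for each variable.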

The logistic regression model does not include customer account age because the probability
that a customer will churn does not simply increase or decrease as the account age goes up.

Chart 1 below shows a clear pattern of fluctuation. Customers with accounts younger than 12
months rarely left QWE Inc., perhaps because they were still learning about the service. The
probability peaks at an account age of 12 months, indicating that customers who signed a
12-month contract were very likely to leave when the first-year contract expired. Customers
whose accounts are 18 months or older seem less likely to leave the company, but the contract
expiration date is still a high-risk time. However, a regression cannot reflect that trend.
Chart 1: Average Churn Rate Displayed by Customer Account Age
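
A chart of this kind can be produced by grouping customers by account age and averaging the churn flag in each group. The Python sketch below is a minimal illustration. The function name and the sample records are hypothetical and are not taken from the QWE data.

```python
from collections import defaultdict


def churn_rate_by_age(records):
    """Average churn rate per account age.

    records -- iterable of (account_age_months, churned) pairs,
               where churned is True if the customer left.
    Returns a dict mapping each age to its churn rate, sorted by age.
    """
    totals = defaultdict(int)
    churners = defaultdict(int)
    for age, churned in records:
        totals[age] += 1
        churners[age] += int(churned)
    return {age: churners[age] / totals[age] for age in sorted(totals)}


# Made-up records, for illustration only.
sample_records = [(11, False), (12, True), (12, False), (18, False)]
print(churn_rate_by_age(sample_records))
```

Because each age gets its own rate, a peak at one particular age shows up directly, whereas a single regression coefficient on age can only describe a steady rise or fall.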

A regression on a single variable does not make much practical sense because many other
factors are likely to influence a customer's decision. Therefore, we decided to include other
variables in our regression model. After trying different combinations, we developed the most
precise model with Days Since Last Login, Difference in Customer Happiness Index Score,
Customer Happiness Index Score in December, and Difference in Monthly Views. These variables
are all correlated with each other to some degree for the 100 customers chosen by the
regression model. Therefore, not all of the variables were necessary to generate our list of
100 customers; two of them were sufficient to generate the most precise result.
However, the precision was only 21%, which means only 21% of our investment would reach
customers who actually churn and 79% of the money would be spent ineffectively. Therefore,
logistic regression may not be the best model for selecting those 100 likely-to-churn customers
because of its low precision. Logistic regression might be a better option when either a larger
budget or a more cost-effective communication method is available. It can give us a broader
list (as broad as we want), and the level of precision is relatively stable with a larger
sample size.


Chart 2: Precision of Potential Sample Sizes

While the logistic regression model has theoretical merits shown above, we thought that given
the limited resources of QWE to contact only the top 100 customers out of its full database
(which we assume has over 60K customers) each month, the tree model was better for this
business situation.
DECISION TREE (APPENDIX FIGURE B.)


In comparison, the decision tree model offers a shorter list of potential churning customers
with higher precision. Based on two months of customer data, we chose a list of 10 customers
(primary target) with an 80% precision rate and another list of 33 customers (secondary target)
with a 67% precision rate.
The top four drivers are Days Since Last Login, Change in Number of Monthly Logins, Customer
Account Age and Views. The top 10 customers with the highest risk of leaving QWE Inc. are
those who have been with the company for less than 22 months. This customer group is our
primary target. The tree also indicated that customers whose accounts are 12 months old are
likely to churn, so they are our secondary target group.
In terms of usage over the past two months, for these customers the time between logins was 18
days longer this month than last month, monthly logins decreased by 2.5, and views decreased
by 140. Generally speaking, churners are newer customers with decreasing engagement.
The merit of this method is that our precision is at least 67%. It is also more cost-efficient
than the logistic regression model. If we have a very limited budget, we may want to target
the small group of people, determined by the decision tree, with the highest churn rate and
retain them in the long run. However, a decision tree can only give us a short list of the
customers who are most likely to churn. This limits the number of people we can communicate
with. As a result, some customers with an above-average churn rate are neglected.
Therefore, if we want a small but very accurate target list and a cost-efficient communication
decision, the decision tree is the better choice. With the full database of 60K customers, we
would most likely find 100 likely-to-leave customers who meet all four conditions with 80%
precision. With a relatively small database, we could also take the other branch (67%
precision) into consideration. The precision would be a little lower, but overall it is much
better than the result from the logistic regression model.
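
One way to put the two models on the same footing is to compare each list's precision with the overall 5.1% churn rate. The Python sketch below does that arithmetic using only figures stated in this report. The function and variable names are hypothetical, and "lift" here simply means precision divided by the base churn rate.

```python
BASE_CHURN_RATE = 0.051  # overall two-month churn rate stated in the overview


def lift(precision, base_rate=BASE_CHURN_RATE):
    """How many times better than random selection a target list is."""
    return precision / base_rate


def expected_churners(list_size, precision):
    """Expected number of true churners on a list of the given size."""
    return list_size * precision


# (label, list size, precision) as reported in this section.
target_lists = [
    ("logistic regression, top 100", 100, 0.21),
    ("decision tree, primary target", 10, 0.80),
    ("decision tree, secondary target", 33, 0.67),
]

for label, size, precision in target_lists:
    print(
        f"{label}: lift {lift(precision):.1f}, "
        f"expected churners {expected_churners(size, precision):.1f}"
    )
```

On these figures the logistic regression list is roughly 4 times better than random selection, while the two decision tree lists are roughly 16 and 13 times better, although they cover far fewer customers.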
4. ACTIONABLE STRATEGIES

Ultimately, the company should make customer engagement a priority to prevent customer
attrition.

Considering that customers younger than 22 months are more likely to leave, QWE Inc. might
contact them regularly by phone or email. This reminder can communicate QWE's contribution to
the growth of the client's business over the past few months. For customers who are on a
monthly contract or in the last month of their contracts, a link to renew the subscription
should be included in the promotional emails, ideally in a conspicuous place.
To reduce the customer churn rate, we suggest that QWE Inc. allocate its proactive investments
according to customer priority to generate the highest returns. For example, with a limited
budget, we might focus only on our primary target customers. If some money is left after
taking good care of our primary target, we might adopt some cost-efficient strategies to
address our secondary target.
In terms of strategies, the service center in the company should make a phone call to the
primary target group to gain feedback on their experience with QWE Inc. and reiterate the
value of our services for business solutions. For secondary customers, it would be better to
send some promotional emails to inform them of new features and services that will satisfy
their needs.
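
The priority-based allocation described above can be sketched as a simple tiered rule: spend on the primary target first, then use whatever remains on the secondary target. The Python example below is a minimal illustration. The function name, the budget and the per-contact costs are hypothetical placeholders, since the report does not state actual costs.

```python
def allocate_budget(budget, tiers):
    """Spend a fixed budget on target tiers in priority order.

    budget -- total amount available
    tiers  -- list of (name, n_customers, cost_per_contact),
              highest priority first
    Returns a dict mapping tier name to number of customers contacted.
    """
    plan = {}
    remaining = budget
    for name, n_customers, cost_per_contact in tiers:
        if cost_per_contact <= 0:
            raise ValueError("cost_per_contact must be positive")
        # Contact as many customers in this tier as the remaining money allows.
        affordable = int(remaining // cost_per_contact)
        contacted = min(n_customers, affordable)
        plan[name] = contacted
        remaining -= contacted * cost_per_contact
    return plan


# Group sizes come from the decision tree lists; costs are made up.
example_plan = allocate_budget(
    budget=100,
    tiers=[
        ("primary (phone call)", 10, 5.0),
        ("secondary (email)", 33, 2.0),
    ],
)
print(example_plan)
```

In this made-up case the budget covers every primary customer and part of the secondary group, which matches the order of priority recommended above.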
In addition, QWE Inc. could also try to provide a better customer login experience for all by
sending automatic messages to customers about new company information or company
improvements. A gamification system could also be introduced to motivate customers to use
our service and even interact with other users.

5. APPENDIX

A. Customers Most Likely to Default

Churned | Days Since Last Login | # Logins | Views | Age

Yes   31   -175   16
Yes   31   -246   14
Yes   31   -517   18
Yes   31   -196   16
Yes   31   -180
Yes   31   -1846
Yes   22   -4   -148
Yes   20   -5   -142   12
No   31   -285   14   10
No   31   -189   11


B. Decision Tree
