
UNIVERSITY OF ECONOMICS AND LAW

FACULTY OF INFORMATION SYSTEMS


______________________

FINAL PROJECT REPORT


INTERDISCIPLINARY RESEARCH METHOD COURSE

TOPIC: RESEARCH THE CHURN RATE OF EACH REGION IN THE RETAIL SECTOR BY USING RFM MODEL

Lecturer:
1. Ho Trung Thanh, Ph.D.
2. Nguyen Van Ho, MA
Group 8:
1. Vũ Hồng Thanh Trúc
2. Trần Ngọc Bích Trân
3. Nguyễn Huyền Trang
4. Nguyễn Phương Oanh
5. Phùng Nguyễn Đăng Khoa
Members of Group 8

No. | Full name | Student ID | Point / 10 (Individual Contribution) | Signature
1 | Vũ Hồng Thanh Trúc | K214111966 | |
2 | Trần Ngọc Bích Trân | K214111964 | |
3 | Nguyễn Huyền Trang | K214111965 | |
4 | Nguyễn Phương Oanh | K214110846 | |
5 | Phùng Nguyễn Đăng Khoa | K214111323 | |
Acknowledgements
For the successful completion of the course, our group would like to thank our instructor, Dr. Ho Trung Thanh of the Faculty of Information Systems, University of Economics and Law, for sharing valuable knowledge and guiding us carefully. The suggestions he made while we carried out the thesis helped our group complete it in the best way.
After studying the topic and finishing the course, our group gained knowledge and experience from our lecturers that will help us improve and develop ourselves. This has also been an opportunity to realize what we need to improve to prepare for the long journey ahead.
Due to our limited knowledge and lack of practical experience, shortcomings in this research paper are difficult to avoid. We look forward to receiving more advice and instruction from our lecturers.
Thanks and best regards!
Commitment
Our group has read and understood what constitutes a violation of academic honesty. We hereby declare that the thesis "Research the churn rate of each region in the retail sector by using RFM model" is our group's own research work, which we researched, read, translated, collected, and implemented under the guidance of Dr. Ho Trung Thanh.
Apart from referenced results and citations from other studies, which have been duly cited, this study has never been published in any previous work.
TABLE OF CONTENTS

GANTT CHART
Project Overview
  Reasons
  Objectives
  Objects and scopes
  Research method
  Tools and Programming language
  Structure of project
Chapter 1: Literature Review
  Chapter overview
  1.1 Domestic Overview
  1.2 Overseas Overview
  1.3 Summary
Chapter 2: Theoretical Background
  Chapter overview
  2.1 The rationale of RFM:
    2.1.1 Consumer behavior segmented by RFM:
    2.1.2 RFM Scoring:
  2.2 Market segment definition:
  2.3 CLV - customer lifetime value:
  2.4. Churn rate - customer churn rate:
Chapter 3: Proposed Model
  Chapter overview
  3.1. Proposed model
  3.2. Field Description
  3.3. Exploratory Data Analysis
    3.3.1. Variables information
    3.3.2. Descriptive analysis:
    3.3.3. Histogram:
    3.3.4. Scatter Plot:
  3.4. Data preprocessing
  3.5. RFM Scoring
Chapter 4: Experimental Results
  Chapter overview
  4.1. Results
  4.2. Churn rate
Chapter 5. Solution
  5.1. Why do customers churn?
  5.2. Effective solution to reduce customer churn
  5.3. What are the various channels you can use to regain customers?
Conclusion and Future Works
References
List of Tables
Table 2.1.2. Score description according to RFM
Table 3.1. Field Description
Table 3.5.1. Customer segmentation
List of Figures
Figure 3.1 The proposed model
Figure 3.2 Sales Module

Figure 3.3.2.Descriptive data analysis

Figure 3.3.3. Numerical Histogram


Figure 3.3.4. Correlation between Order Quantity and Total Amount
Figure 3.4.1. RFM values in Excel
Figure 3.4.2. Calculating of RFM
Figure 3.5.1. The RFM score of each customer
Figure 3.5.2. The customer segmentation matrix
Figure 4.1.1. The distribution of Recency
Figure 4.1.2. The distribution of Frequency
Figure 4.1.3. The distribution of customer segments
Figure 4.2. Customers churn rate by region
List of Acronyms

DB Digital Business

MIS Management Information Systems

CLV Customer Lifetime Value

RFM Recency - Frequency - Monetary


GANTT CHART

ABSTRACT
Customer segmentation is necessary because a company cannot treat all customers the same way, with the same content, the same channels, and the same priorities; RFM was therefore developed as a tool to analyze customer behavior. RFM is part of marketing analysis and has been used to analyze customer value. In recent years, many modern tools such as K-means clustering, EM clustering, and Fuzzy C-means clustering have been developed, but RFM is still used because of its simplicity.
The topic “Research the churn rate of each region in the retail sector by using RFM model” was conducted with the aim of finding the churn rate of customers in the retail sector and, based on that, suggesting which customer groups have potential for the company. To answer the research question, we used qualitative and quantitative research methods on 6400 customer samples, adapted the scales to the context and the group of research subjects, and used RFM analysis to segment customers based on scoring. The tool our group chose to support this research was Excel.
This research presents a set of solutions to enhance the company's competitiveness in the retail sector, to better understand customer behavior and purchasing trends, and to identify the company's potential customer segments from different aspects. Moreover, the research proposes specific recommendations for the company to develop its marketing strategy and customer retention strategy and to achieve greater success in the future.
Keywords: RFM model, customer segmentation, churn rate, CLV, retail sector.

Project Overview
Reasons
All businesses that offer products and services to consumers are considered part of the retail sector. Worldwide, retail sales and store types vary widely. The retail sector employs a sizable workforce globally and grows consistently year over year. Competition in this fast-paced industry has become more apparent in recent years. By 2022, retail establishments had been forced for years to re-evaluate their established procedures. The changes in management and supply chain practices that many well-known businesses have made on a worldwide scale only highlight how crucial retail sales are to the economy. Businesses that want to stay ahead of the competition must keep up with the latest trends as the retail sector undergoes ongoing change. By understanding and using these trends, businesses can stay ahead of the curve and satisfy their clients. In such a fiercely competitive environment, retaining existing customers is important because it costs less than acquiring new ones. To achieve this goal, businesses must understand the churn rate of each customer segment and have a different strategy for each.

Objectives
The objectives of this study are to determine the churn rate of each customer segment in the retail sector using the RFM model and to recommend which groups of customers deserve the company's attention. There are therefore two main problems to solve in this study:
- Categorize the customer segments using the RFM model.
- Calculate the churn rate of each customer segment.
Objects and scopes
Objects
The goal of this research is to segment customers. Companies can use that
information to plan and allocate money for marketing efforts.

Scopes
Time scope: The project is carried out from September 23rd to November 23rd. The data cover the period from 01/07/2017 to 15/06/2020.
Space scope: The study covers the RFM model and the Customer Lifetime Value model.

Research method
We conducted qualitative research by synthesizing and selecting previous studies and determining the relationships between their variables to arrive at the proposed model, and we adapted the scales to the context and the group of research subjects. In addition, the research team applied quantitative research to 18485 customer samples and used RFM analysis to segment customers based on scoring.
Tools and Programming language
In this study, we use Excel, a spreadsheet application created by Microsoft, to segment customers and draw statistical charts for qualitative research.
Structure of project
Chapter 1: Literature Review
Chapter 2: Theoretical Background
Chapter 3: Proposed Model
Chapter 4: Experimental Results
Chapter 5: Solution

Chapter 1: Literature Review
Chapter overview
Before going into the research, this chapter reviews previous domestic and foreign studies related to customer segmentation. Through these studies, we identify the weaknesses that remain in the existing research papers. From there, we determine new research directions and suitable research methods.
1.1 Domestic Overview
In this subject area, there are fewer domestic studies on customer segmentation and the RFM model than foreign ones. However, the existing papers do quite well within their particular areas.
Ho Bach Nhat and Nguyen Huu Ngam used descriptive statistical methods in their research on credit customer segments in Long Xuyen City, and Dinh Tien Minh and Le Vu Lan Oanh used the same method in their research on shopping customer segments in Ho Chi Minh City. These studies offer suggestions and guidance for businesses; however, they are restricted to a particular region and thus cannot be extrapolated to other locations with different characteristics.
The studies "Thái độ và ý định mua rau Vietgap của người tiêu dùng TPHCM" (Attitudes and intentions of Ho Chi Minh City consumers toward buying VietGAP vegetables), "Các nhân tố ảnh hưởng đến sự hài lòng của khách hàng cá nhân về dịch vụ Internet Banking" (Factors affecting individual customers' satisfaction with Internet Banking services), and "Phân tích hành vi khách hàng hướng đến phân khúc thị trường từ dữ liệu big data: trường hợp của Sacombank" (Analyzing customer behavior for market segmentation from big data: the case of Sacombank) by Phạm Văn Hậu also use in-depth analytical models. However, the application is quite simple and ineffective, so there are many limitations: the studies could not identify customers' consumption habits and did not classify potential customers correctly.
Pham Kien Trung, Nguyen Duc Thang, Le Van Chien, and Nguyen Van Thuong applied the K-means algorithm to cluster target customers. Their research provides more information about customers, which supports effective customer care and helps the company target the right customers for new products and services. However, the research is still incomplete due to a lack of information on customer behavior, habits, and preferences.
1.2 Overseas Overview

Today, targeting customer groups is becoming more and more popular among companies, both in Vietnam and abroad, as a way to optimize benefits and profits. One of the most widely applied grouping models is RFM. In terms of application, the studies indicate that RFM customer grouping can be applied in many different ways depending on the strategy and vision of the business: some businesses use it to reach customers effectively, others to build marketing strategies. The limitation of these studies is that they generally apply RFM only in specific cases that do not carry over to large and diverse data sets: Selin Yilmaz, Cédric Chanez, Peter Cuony, Martin Kumar Patel (2022); Selin Yilmaz, Cédric Chanez, Peter Cuony, Martin Kumar Patel (2019); Tianyuan Zhang, Sérgio Moro, Ricardo F. Ramos (2014); Tianyuan Zhang, Sérgio Moro, Ricardo F. Ramos; P. Anitha, Malini M. Patil (2019); Tingzhong Wang (2022); Jun Wu, Li Shi, Wen-Pin Lin, Sang-Bing Tsai, Yuanyuan Li, Liping Yang, and Guangshu Xu (2020); Ching-Hsue Cheng, You-Shyang Chen (2009); Daqing Chen, Sai Laing Sain, Kun Gou (2012); Rendra Gustriansyah, Nazori Suhandi, Fery Antony (2021); María Teresa Ballestar, Pilar Grau-Carles, Jorge Sainz (2018). These studies show that the RFM model can be applied in practice; however, they apply RFM only to a small industry or a small data set, so it is difficult to assess its practical effectiveness on a very large data set.

The practical application of the RFM model also faces several problems. First, it requires collecting a large amount of customer data, which is difficult for large companies that operate in many business areas and must decide how to select the data: E Ernawati, S S K Baharin and F Kasmin (2021); Tingzhong Wang (2022). For that reason, some studies introduce improvements in the exploitation of customer data, but one study points to 44 things that need to be considered and classified. The difficulties do not stop there: the algorithms for calculating RFM are also extremely diverse, and there is no in-depth study confirming the effectiveness of those algorithms. The choice of clustering algorithm greatly affects the accuracy of RFM results (Jun Wu, Li Shi, Wen-Pin Lin, Sang-Bing Tsai, Yuanyuan Li, Liping Yang, and Guangshu Xu (2020)).

Some studies, such as Willem Verbeke, Bart Dietz & Ernst Verwaal (2010); Chu Fang and Haiming Liu (2021); and Wafa Qadadeh & Sherief Abdallah (2003), have introduced improved customer grouping algorithms, but these are generally not yet highly effective, and it is still not guaranteed that the improved models and algorithms work effectively in scientific research. However, we already know that the K-Means algorithm suits a large data set better than calculations based on conventional quartiles, although K-Means is very sensitive to outliers.

1.3 Summary

In summary, domestic and foreign studies have certain limitations: constraints in data processing, a limited research scope, and a lack of broader information about the research subjects, such as consumer abandonment behavior.
In this study, our research team applies the original RFM model to address the limitation in data processing and to show the customer churn rate by segment in the retail sector. From there, we suggest new solutions for the business's marketing strategies.

Chapter 2: Theoretical Background
Chapter overview
Vietnam's retail market is booming, and more and more businesses are entering this promising market. As a result, retail companies compete fiercely for customers. The RFM model is increasingly widely used; many companies use it to identify potential customers and then devise effective marketing strategies. This chapter presents the theoretical foundations, definitions, operating principles, and scientific hypotheses, as well as the tools used in the thesis. We define the RFM model, market segmentation, and CLV (customer lifetime value), and describe how customer consumption behavior is segmented according to the RFM model, how customers are clustered by RFM Scoring, and how the customer churn rate is calculated.
2.1 The rationale of RFM:
2.1.1 Consumer behavior segmented by RFM:
RFM-based customer segmentation has been used by direct marketers for over 60 years to target the right group of customers within their customer base, to save on mailing costs, and to improve sales and profits.
An RFM model is a behavior-based model used to analyze customer behavior and then make predictions from the customer-related data collected and stored at an enterprise. It was first widely introduced by Arthur M. Hughes in work published in 1996; originally called RFM analysis, it was later commonly referred to as RFM modeling. RFM stands for Recency, Frequency and Monetary. The meaning of each factor is as follows:
- Recency: the time elapsed since the customer's last transaction. This metric tells us whether a customer is still active near the time of the review. The larger this value, the higher the customer's tendency to leave. A large value is therefore a warning that the business should offer new products or policies to better serve customers' needs and tastes.
- Frequency: how often the customer purchases (the number of purchases in the period). The more often customers buy, the greater the sales value they bring to the business. However, this value alone is not enough to be the main basis for evaluation, because the value of each order must also be considered when assessing the customer's potential.
  For example, in this research method the data period is from 7/2017 to 5/2021, and the total number of purchases by each customer is determined over this period. Customers who shop regularly are the ones most likely to become highly loyal customers in the future. This information also helps in determining customer satisfaction with the company's retail services.
- Monetary: the total amount the customer has spent. This is the factor that most directly affects the sales of the business. It tells the business how much money customers have spent on the company's products, and so provides data for classifying customers. This is the basis for the company to build appropriate programs and strategic solutions for products or services according to customers' ability to pay.
RFM is a marketing technique used to gauge the behavior of a company's customers. According to a domestic research paper on the banking sector, the higher a customer's R and F scores, the more likely the customer is to keep using the bank's services. In addition, the higher the M score, the more likely the customer is to make further transactions.
RFM analysis provides fundamental benefits to a retail company:
- Use objective scales – provide a superior and concise description of the customer.
- Simple – the company can use it effectively without the need for sophisticated
software or data analysts.
- Intuitive – the output of this segmentation method is easy to understand and
interpret.

2.1.2 RFM Scoring:


According to the research method, the company takes the collected data and clusters customers using the RFM Scoring method.
All customers are analyzed on three criteria: time of last purchase (R), frequency of purchase (F), and total amount spent (M). Each criterion is rated on a scale of 1 to 5.
Customer clustering process using the RFM Scoring method:
A recency score is assigned to each customer based on the date of the most recent purchase. Scores are generated by binning recency values into several categories (the default is 5). For example, if you use four categories, customers with the most recent purchase dates receive a recency score of 4, and those with the oldest purchase dates receive a recency score of 1.
Frequency scores are assigned in a similar way. Customers with a high frequency of purchases are given a higher score (4 or 5), and those with the lowest frequency of purchases are given a score of 1.
The monetary score is assigned on the basis of the total revenue generated by the customer in the period under analysis. Customers with the highest sales are given a higher score, while those with the lowest sales are given a score of 1.
The fourth score, the combined RFM score, is simply the three individual scores concatenated into a single value.
The customer segmentation by RFM score is described in the table below:

Point | Group | R (Recency) | F (Frequency) | M (Monetary)
1 | Leaving service (not sure) | Very low | Very low | Very low
2 | Risk | Low | Low | Low
3 | Normal | Normal | Normal | Normal
4 | Potential | High | High | High
5 | The best | Very high | Very high | Very high

Table 2.1.2. Score description according to RFM
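
The scoring procedure described in this section can be sketched in Python. This is only a minimal illustration, assuming rank-based quintiles and made-up customer values; the function name quintile_scores and the sample data are hypothetical, and the scoring in this report was carried out in Excel.

```python
def quintile_scores(values, higher_is_better=True, bins=5):
    """Assign each customer a score from 1 (worst) to `bins` (best) by rank."""
    # Order customers from worst to best on this metric.
    ordered = sorted(values, key=values.get, reverse=not higher_is_better)
    n = len(ordered)
    # Position i in the ordering falls into bin (i * bins) // n, counted from 0.
    return {customer: (i * bins) // n + 1 for i, customer in enumerate(ordered)}

# Hypothetical inputs: recency in days, frequency in orders, monetary in currency.
recency = {"A": 3, "B": 40, "C": 200, "D": 15, "E": 700}
frequency = {"A": 12, "B": 4, "C": 1, "D": 7, "E": 2}
monetary = {"A": 5000.0, "B": 900.0, "C": 60.0, "D": 1500.0, "E": 300.0}

# A recent purchase (small recency) is better, so recency is scored in reverse.
r = quintile_scores(recency, higher_is_better=False)
f = quintile_scores(frequency)
m = quintile_scores(monetary)

# Concatenate the three scores into one RFM code per customer, e.g. "555".
rfm_code = {c: f"{r[c]}{f[c]}{m[c]}" for c in recency}
print(rfm_code)
```

With this rank-based rule, customers with equal values can fall into different bins, so a real analysis would need an explicit policy for ties.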
2.2 Market segment definition:
Around the world, there are many definitions of market segmentation.
According to Philip Kotler, market segmentation is the subdivision of a market into homogeneous subsets of customers, where any subset may be selected as a market target to be reached with a distinct marketing mix.
According to William J. Stanton, “Market segmentation is the process of dividing the total heterogeneous market for a good or service into several segments. Each of these segments tends to be homogeneous in all important respects.”
Market segmentation in retail is understood as dividing the market into different segments, each of which has a certain product or set of services for a certain group of people. A market segment is a group of consumers who respond in the same way to the same set of marketing stimuli, and market segmentation is the process of grouping consumers on the basis of differences such as needs, personality, or search propensity.
The goal of market segmentation in the retail sector is to divide the market into smaller markets of customers with similar needs that are easier to recognize, capture, and respond to effectively. Market segmentation helps companies create products and services that meet the needs of specific customers and focus marketing resources more efficiently. Through it, the company determines its target market.
Criteria for market segmentation:
Market segmentation by age: choosing a set of potential customers by age makes it easier to choose products suitable for that age.
Market segmentation by object: the company needs to determine the audience it wants to target in order to choose the right products.
Market segmentation by demand: retail companies should research and understand users' needs in order to choose suitable business items.
Market segmentation by item: identify potential products (in terms of users' long-term consumption, the uniqueness of products and services, etc.) and find out the level of competition from other retail businesses to guide the choice of products.
2.3 CLV - customer lifetime value:
CLV (Customer Lifetime Value) is discussed here together with the strategies a business uses to attract, convert, and retain customers and to build on their satisfaction in order to increase revenue and develop its brand. These customer lifecycle strategies aim to engage customers throughout their entire buying journey; put simply, that journey runs from point A to point B, where the customer actually makes the final purchase.
CLV is the value a customer contributes to the company during their lifetime as a customer. Loyal customers bring long-term, sustainable profits to the business because of their high lifetime value. The concept was identified early by Kotler (1974), who defined it as the present value of the future profit stream expected over a given time horizon of transactions with the customer, with an emphasis on profitable customers.
Four stages of a customer life cycle:
Awareness: at this stage the target customer has a need for the product and is converted into a potential customer.
Purchase: after the first stage, customers move on to the second, their first purchase, which they make from the business they trust most.
Develop and maintain relationships with customers: after the first purchase, the company needs a plan to keep in touch and develop its relationship with the customer.
Loyal customers: when customers feel satisfied every time they make a purchase, they voluntarily become loyal customers.
Three important pieces of information for measuring customer lifetime value:
Average Purchase Value: the average amount that a certain group of customers spends on a single purchase.
Purchase Frequency: the number of times a customer purchases from the company in a given period of time. A customer who purchases regularly is likely to return for more.
Customer Value: how much money a customer has spent over a certain period of time. A customer who spends more is more likely to spend again in the future.
In general, organizations in different sectors use different methods to calculate CLV. In this paper, segmentation is value-based and uses recency, frequency, and monetary value (RFM), so the characteristics selected are the last purchase date, the number of purchases (the customer's buying frequency), and the total amount spent.
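
To make the three quantities above concrete, the short Python sketch below computes them for one period. It assumes the common convention that Customer Value equals Average Purchase Value multiplied by Purchase Frequency; that formula and the sample orders are assumptions for illustration and are not taken from this report's data.

```python
# Hypothetical orders for one period: (customer_id, order_total).
orders = [("C1", 100.0), ("C1", 50.0), ("C2", 30.0), ("C3", 20.0)]

total_revenue = sum(total for _, total in orders)
n_orders = len(orders)
n_customers = len({customer for customer, _ in orders})

# Average amount spent on a single purchase.
average_purchase_value = total_revenue / n_orders
# Average number of purchases per customer in the period.
purchase_frequency = n_orders / n_customers
# Average amount spent per customer in the period (assumed formula).
customer_value = average_purchase_value * purchase_frequency

print(average_purchase_value, purchase_frequency, customer_value)
```

Under this convention, Customer Value reduces to total revenue divided by the number of customers, which is a quick check on the arithmetic.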
2.4. Churn rate - customer churn rate:
Customer churn rate is the percentage of a business's customers that no longer purchase from or interact with the business, either for a given time or at all. A high customer churn rate means that many customers no longer want to buy goods and services from the business. Customer churn rate, also called customer attrition rate, is a calculation of the percentage of customers who are unlikely to make another purchase from a business.
According to Avery, customer churn rate is a measure of the percentage of people who ended a customer relationship with a business during a particular period. Typically, churn rates are calculated on a monthly, quarterly, or yearly basis, depending on the industry and product of the business. The annual rate is the default unit for most companies; however, companies that price products on a monthly basis, such as cell phone service providers, gyms, or software companies, often track churn on a monthly basis. Some companies with rapid customer churn, or significant customer loss, also evaluate this rate monthly.
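
The definition above can be written as a simple ratio. The form below is the commonly used period-based one, given as a sketch; the symbols are introduced only for illustration and do not come from this report's own calculation.

```latex
\text{Churn rate} = \frac{C_{\text{lost}}}{C_{\text{start}}} \times 100\%
```

Here C_start is the number of customers at the start of the period and C_lost is the number of those customers who ended their relationship with the business during the period. As a purely illustrative example, a business that starts a quarter with 500 customers and loses 40 of them has a quarterly churn rate of 40 / 500 × 100% = 8%.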

Chapter 3: Proposed Model
Chapter overview
3.1. Proposed model

Figure 3.1. The proposed model

The model above was applied in this research and is divided into 5 main stages. In the first stage (collection), we use the data available in AdventureWorks from Microsoft and identify the elements and subjects that the research needs. In the next stage (selection), we select from the database tables the items needed to calculate the factors and objects under study, and separate them to begin calculating and converting the necessary data. In the next stage, we use those data to calculate score ranges based on quintiles and obtain the results. In the following stage, we classify those results into separate groups, each with its own score, and evaluate the real effectiveness of the model.
3.2. Field Description
AdventureWorks is a sample database provided by Microsoft, updated to 2021. We apply the RFM model to this dataset and therefore focus on only one module, the Sales module.

Figure 3.2. Sales Module

The Sales module contains information about carts, orders, special offers, places of delivery, and salespeople. We extract data from three tables, Sales.SalesTerritory, Sales.SalesOrderHeader, and Sales.SalesOrderDetail, then select the appropriate columns as below:

Column | Class | Table | Description
SalesOrderID | int | Sales.SalesOrderHeader | Auto Increment
OrderDate | datetime | Sales.SalesOrderHeader | Date on which an order was created
Status | tinyint | Sales.SalesOrderHeader | The current status, labeled from 1 to 6: 1 = Processing, 2 = Approved, 3 = Pre-ordered, 4 = Rejected, 5 = Shipped, 6 = Canceled
ClientID | int | Sales.SalesOrderHeader | Customer Identification Number
OrderQty | smallint | Sales.SalesOrderDetail | Invalid q'ty allocated
UnitPrice | money | Sales.SalesOrderDetail | The price of a product
UnitPriceDiscount | money | Sales.SalesOrderDetail | Discount amount
Line Total | number(38, 6) | Sales.SalesOrderHeader | Subtotal per product, excluding UnitPriceDiscount
TerritoryID | int | Sales.SalesTerritory | Place of delivery

Table 3.1. Field Description

3.3. Exploratory Data Analysis


3.3.1. Variables information

CustomerKey | Customer ID | Customer | City | State-Province | Country-Region | Postal Code
-1 | [Not Applicable] | [Not Applicable] | [Not Applicable] | [Not Applicable] | [Not Applicable] | [Not Applicable]

The original dataset has 121253 rows and 15 columns in total. Some rows contain
error values: the customer ID variables are marked as not applicable, possibly because
those customers did not provide their information or were one-time buyers. After these
rows were eliminated, 60398 rows remained.
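
The filtering step above could look roughly like the following sketch. The function name `drop_unidentified_customers`, the dictionary keys and the sample rows are assumptions made for illustration; only the `-1` key and the `[Not Applicable]` marker come from the sample row shown above.

```python
NOT_APPLICABLE = "[Not Applicable]"


def drop_unidentified_customers(rows):
    """Keep only rows that belong to an identifiable customer."""
    return [
        row
        for row in rows
        if row["CustomerKey"] != -1 and row["CustomerID"] != NOT_APPLICABLE
    ]


# Hypothetical rows: one placeholder customer and one real customer.
sample_rows = [
    {"CustomerKey": -1, "CustomerID": NOT_APPLICABLE, "UnitPrice": 34.99},
    {"CustomerKey": 11000, "CustomerID": "AW00011000", "UnitPrice": 3578.27},
]
cleaned = drop_unidentified_customers(sample_rows)
print(len(sample_rows), "->", len(cleaned))
```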

3.3.2. Descriptive analysis:

Table 3.2. Descriptive analysis result

For Order Quantity and Unit Price Discount Pct, the min value equals the max value
(1 and 0, respectively), so both variables are constant. That means the UnitPrice value
equals the Sales Amount value on every row.
The min and max of UnitPrice are 2.29 and 3578.27 respectively, which means
product pricing varies greatly. This large range may be caused by a few extreme values.
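
The short sketch below illustrates why a quantity of 1 and a discount of 0 make the two columns coincide. The formula (unit price times quantity times one minus the discount percentage) is an assumption about how Sales Amount is derived, and the function name `sales_amount` is hypothetical.

```python
def sales_amount(unit_price, order_quantity, discount_pct):
    """Line amount after discount (assumed formula, not confirmed by the dataset)."""
    return unit_price * order_quantity * (1 - discount_pct)


# With quantity fixed at 1 and discount fixed at 0, the amount equals the unit price.
for price in (2.29, 3578.27):
    print(price, sales_amount(price, order_quantity=1, discount_pct=0))
```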

3.3.3. Histogram:

Figure 3.3.1. Distribution of OrderQuantity

Figure 3.3.2. Distribution of UnitPrice

Figure 3.3.3. Distribution of Unit Price Discount Pct

Histograms were used to display the distribution of 3 numerical variables:
OrderQuantity, UnitPrice and Unit Price Discount Pct. In every case the observations are
clustered in the first bin, with very few values in the other bins. UnitPrice is
therefore right-skewed, while OrderQuantity and Unit Price Discount Pct are essentially
constant, consistent with the descriptive statistics in Section 3.3.2.

3.3.4. Scatter Plot:

Figure 3.3.4. Correlation between Order Quantity and Sales Amount

The scatter chart of order quantity against total amount shows two trends. When the
order quantity is around 0-20, the total amount is positively correlated with it: the
more customers order, the higher the total amount.

However, once the order quantity reaches 20 or more, the total amount tends to stay
the same, although the points in this range are quite sparse.

3.4. Data preprocessing


RFM (Recency – Frequency – Monetary) is used to analyze customer value, helping
businesses understand each group of customers they have and design marketing
campaigns or special care for them. Using the provided dataset, the team calculates R, F
and M for each customer in the dataset.

First, the team prepared an Excel sheet of data consisting of CustomerID, Order ID,
Order date and Money, then inserted a Pivot Table.
For Recency (R), we select Max on the order date to find the most recent date on
which each customer ordered. We then use the Max function again to find the latest date
in the data file, and subtract each customer's most recent order date from it to obtain
the number of days since their last order.
For Frequency (F), we use the Distinct Count function to count the number of non-
duplicate order codes, which gives F.
For Monetary (M), we use the Sum function to sum up each customer's money, where
the amount of money equals the unit price multiplied by the order quantity, minus the
discount.
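
The same three aggregations can be sketched outside Excel. The Python below is a minimal illustration, not the team's workbook: the function name `compute_rfm`, the tuple layout and the sample order lines are assumptions, and the reference date is taken to be the latest order date in the data.

```python
from collections import defaultdict
from datetime import date


def compute_rfm(order_lines):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    order_lines: iterable of (customer_id, order_id, order_date, amount).
    """
    last_order = {}
    order_ids = defaultdict(set)
    monetary = defaultdict(float)
    for customer_id, order_id, order_date, amount in order_lines:
        # Recency input: most recent order date per customer (Max in the pivot table).
        if customer_id not in last_order or order_date > last_order[customer_id]:
            last_order[customer_id] = order_date
        # Frequency input: distinct order codes (Distinct Count).
        order_ids[customer_id].add(order_id)
        # Monetary input: running total of money (Sum).
        monetary[customer_id] += amount
    if not last_order:
        return {}
    reference_date = max(last_order.values())  # latest date in the data file
    return {
        customer_id: (
            (reference_date - last_order[customer_id]).days,
            len(order_ids[customer_id]),
            round(monetary[customer_id], 2),
        )
        for customer_id in last_order
    }


# Hypothetical order lines: two lines of one order count as a single order.
sample_lines = [
    (11000, "SO1", date(2021, 1, 1), 100.0),
    (11000, "SO1", date(2021, 1, 1), 50.0),
    (11000, "SO2", date(2021, 3, 1), 25.0),
    (11001, "SO3", date(2021, 3, 31), 10.0),
]
rfm = compute_rfm(sample_lines)
print(rfm)
```

In this toy data, customer 11000 last ordered 30 days before the latest date in the file, placed 2 distinct orders and spent 175.0 in total.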

Figure 3.4.1. RFM values in Excel


The data file contains 29483 customers. The team applies the quintile scoring
method, treating the data as continuous from 0 to the max value. From there, the
research team evaluated each customer's score and used the results to segment
customers into eleven groups, including lost customers.

Figure 3.4.2. Calculating of RFM


3.5. RFM Scoring
After we have the values for the Recency, Frequency and Monetary parameters, we
derive the R, F and M Scores respectively. Each R, F and M Score takes a value between 1
and 5. The scores are calculated using quintiles, each of which contains 20% of the
population. We use quintiles instead of fixed ranges based on expected customer
behavior because quintiles are more flexible: the ranges adapt to the actual data, which
works better when customer behavior is diverse.
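
A minimal sketch of quintile scoring is given below. It assumes that ties share a score and that a value sitting exactly on a quintile boundary falls into the lower quintile; the report does not state which boundary rule was used, and the names `quintile_scores` and `rfm_code` are hypothetical. Recency is scored in reverse (fewer days since the last order gives a higher score), which matches the pattern in Table 3.5.1.

```python
from bisect import bisect_left


def quintile_scores(values, higher_is_better=True):
    """Score each value from 1 to 5 according to the quintile it falls into."""
    if not values:
        return []
    ordered = sorted(values)
    n = len(ordered)
    # Upper bounds of quintiles 1..4; each quintile holds roughly 20% of the values.
    cuts = [ordered[(k * n + 4) // 5 - 1] for k in range(1, 5)]
    scores = []
    for value in values:
        quintile = 1 + bisect_left(cuts, value)
        scores.append(quintile if higher_is_better else 6 - quintile)
    return scores


def rfm_code(r_score, f_score, m_score):
    """Concatenate the three scores into a code such as '334'."""
    return f"{r_score}{f_score}{m_score}"


# Hypothetical values for ten customers.
recency_days = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
frequency = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
r_scores = quintile_scores(recency_days, higher_is_better=False)
f_scores = quintile_scores(frequency)
print(r_scores)
print(f_scores)
print(rfm_code(r_scores[0], f_scores[0], 3))
```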

CustomerKey  R    F   M        R Score  F Score  M Score  RFM Score
11000        255  8   8248.99  3        3        4        334
11001        34   11  6383.88  5        3        3        533
11002        324  4   8114.04  3        2        4        324
11003        248  9   8139.29  3        3        4        334
11004        257  6   8196.01  3        3        4        334
11005        256  6   8121.33  3        3        4        334

Table 3.5.1. The RFM score of each customer


We divide the customers into 11 segments based on their R, F and M scores. The
segments are described as follows:

Segmentation | RFM Score | Description
Champions | 555, 554, 544, 545, 454, 455, 445 | Customers bought recently, frequently and spent a lot of money.
Loyal | 543, 444, 435, 355, 354, 345, 344, 335 | Customers spend good money and buy regularly, responsive to promotions.
Potential Loyalist | 553, 551, 552, 541, 542, 533, 532, 531, 452, 451, 442, 441, 431, 453, 433, 432, 423, 353, 352, 351, 342, 341, 333, 323 | Customers bought recently with average frequency, or buy regularly with average monetary value.
New Customers | 512, 511, 422, 421, 412, 411, 311 | Customers bought recently but seldom buy, with low spend.
Promising | 525, 524, 523, 522, 521, 515, 514, 513, 425, 424, 413, 414, 415, 315 | Customers with high overall R and M scores who have not bought frequently.
Need Attention | 535, 534, 443, 434, 343, 334, 325, 324 | Customers have above average RFM scores but may not have bought very frequently.
About To Sleep | 331, 321, 312, 221, 213, 231, 241, 251 | Customers have below average recency and monetary values. We will lose them if not reactivated.
At Risk | 255, 254, 245, 244, 253, 252, 243, 242, 235, 234, 225, 224, 153, 152, 145, 143, 142, 135, 134, 133, 125, 124 | Similar to "Cannot Lose Them" but with smaller monetary and frequency values.
Cannot Lose Them | 155, 154, 144, 214, 215, 115, 114, 113 | Customers did not buy often and spent big money, but it was a long time ago.
Hibernating customers | 332, 322, 233, 232, 223, 222, 132, 123, 122, 212, 211 | Customers whose last purchase was long ago and who have a low number of orders.
Lost customers | 111, 112, 121, 131, 141, 151 | Customers buy rarely with small spend, and a long time has passed since their last purchase.

Table 3. 5. Customer segmentation
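
The segmentation table above is in effect a lookup from RFM code to segment name, which could be sketched as below. The codes are copied from the table; the names `SEGMENT_CODES`, `SEGMENT_BY_CODE` and `segment_of` are hypothetical, and the "Unclassified" fallback is an assumption for codes the table does not list (313 and 314 do not appear in it).

```python
# Segment name -> RFM codes, copied from the segmentation table.
SEGMENT_CODES = {
    "Champions": "555 554 544 545 454 455 445",
    "Loyal": "543 444 435 355 354 345 344 335",
    "Potential Loyalist": (
        "553 551 552 541 542 533 532 531 452 451 442 441 "
        "431 453 433 432 423 353 352 351 342 341 333 323"
    ),
    "New Customers": "512 511 422 421 412 411 311",
    "Promising": "525 524 523 522 521 515 514 513 425 424 413 414 415 315",
    "Need Attention": "535 534 443 434 343 334 325 324",
    "About To Sleep": "331 321 312 221 213 231 241 251",
    "At Risk": (
        "255 254 245 244 253 252 243 242 235 234 225 224 "
        "153 152 145 143 142 135 134 133 125 124"
    ),
    "Cannot Lose Them": "155 154 144 214 215 115 114 113",
    "Hibernating customers": "332 322 233 232 223 222 132 123 122 212 211",
    "Lost customers": "111 112 121 131 141 151",
}

# Invert the mapping so that each code points at its segment.
SEGMENT_BY_CODE = {
    code: segment
    for segment, codes in SEGMENT_CODES.items()
    for code in codes.split()
}


def segment_of(rfm_code):
    """Return the segment for an RFM code, or 'Unclassified' if it is not listed."""
    return SEGMENT_BY_CODE.get(str(rfm_code), "Unclassified")


# The RFM codes of the six customers shown in Table 3.5.1.
for code in ("334", "533", "324"):
    print(code, segment_of(code))
```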


We visualize the result as a matrix. Hibernating customers clearly account for the
highest percentage (48.38%), whereas the At Risk and Cannot Lose Them segments share
the lowest percentage (0%).

Figure 3.5.2. The customer segmentation matrix

Chapter 4: Experimental Results
Chapter overview
4.1. Results

Figure 4.1.1. The distribution of Recency

Figure 4.1.2. The distribution of Frequency
We can see that the recency scores are in stark contrast to the frequency scores.
While recency scores are concentrated at points 4 and 5, frequency scores are
concentrated at points 1 and 2.
Finally, we look at the distribution of our segments with a bar chart, because it is
a better fit for comparing quantities.

Figure 4.1.3. The distribution of customer segments


This data shows that a large share of the company's customers tend to buy
infrequently and spend little money (48.38%, nearly half of the total). Lost customers
account for a small percentage (1.99%), but still more than the Champions and Loyal
segments.
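
Percentages like these are each segment's count divided by the total number of customers. A minimal sketch follows; the function name `segment_shares` and the sample labels are invented for illustration and do not reproduce the figures above.

```python
from collections import Counter


def segment_shares(segments):
    """Return {segment: percentage of customers}, largest segment first."""
    counts = Counter(segments)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {
        segment: round(100 * count / total, 2)
        for segment, count in counts.most_common()
    }


# Hypothetical segment labels for eight customers.
labels = ["Hibernating customers"] * 4 + ["New Customers"] * 3 + ["Lost customers"]
shares = segment_shares(labels)
print(shares)
```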
4.2. Churn rate

Based on the above RFM score, we can calculate customer churn rate by region.
The graph below visualizes this data.

Figure 4. 2. Customers churn rate by region

According to experts, companies should aim for a churn rate of less than 3%. The
customer churn rate at this company is low, with the lowest rate of 0.00% in the
Southeast, Northeast and Central regions of the United States and the highest, 2.60%,
in the Northwest of the United States. This suggests that the business has rather
effective policies and good quality products to retain customers.
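
One way to compute a churn rate per region from the segmentation is sketched below. It assumes that a customer counts as churned when they fall into the "Lost customers" segment; the report does not spell out its exact definition, so this is an assumption, as are the names `CHURNED_SEGMENTS` and `churn_rate_by_region` and the sample data.

```python
from collections import defaultdict

# Assumption: only the "Lost customers" segment is treated as churned.
CHURNED_SEGMENTS = {"Lost customers"}


def churn_rate_by_region(customers):
    """Return {region: churned customers as a percentage of that region's customers}.

    customers: iterable of (region, segment) pairs, one pair per customer.
    """
    totals = defaultdict(int)
    churned = defaultdict(int)
    for region, segment in customers:
        totals[region] += 1
        if segment in CHURNED_SEGMENTS:
            churned[region] += 1
    return {
        region: round(100 * churned[region] / totals[region], 2)
        for region in totals
    }


# Hypothetical customers; the values do not reproduce the report's figures.
sample_customers = [
    ("Northwest", "Lost customers"),
    ("Northwest", "Champions"),
    ("Northwest", "Loyal"),
    ("Northwest", "Hibernating customers"),
    ("Southeast", "Loyal"),
    ("Southeast", "Hibernating customers"),
]
rates = churn_rate_by_region(sample_customers)
print(rates)
```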

Chapter 5. Solution
Chapter Overview
5.1. Why do customers churn?
We recommend that companies answer the following questions to recognize the major
problems that cause customers to abandon them:

• Is the pricing strategy really the root of the problem?
• Is it because the products don't offer the value that clients want?
• Have your clients found a more reasonably priced alternative?
• Is it the result of a bad experience they had with the business, such as persistent
outages or inadequate support?
• Is there another issue at hand?
The next stage is to explicitly ask these questions to your clients who are about to
leave, though you can always add more to the list. Online survey platforms can be used to
build straightforward polls that reveal a clear client turnover pattern.
A root cause analysis (RCA) is a fantastic tool for better understanding the mindset of
your target audience. You ask a series of "why" inquiries in a progressive manner to
identify the root cause of an issue. RCA looks further into a reported issue to identify the
underlying cause rather than stopping at the first indication of a problem.
5.2. Effective solutions to reduce customer churn
Provide superior customer service and support.
A big reason why customers leave is poor-quality customer service. According to
an Oracle study, 89% of customers turn to a competitor after a poor customer service
experience with the original brand. Customers want to feel that your organization is
listening to them, so prove to them that it is. The company can provide superior service
and support by being proactive: don't wait until a customer complains before
communicating with them. If an upgrade is now available for their purchased product, or
a glitch was detected in the product they purchased, try to contact customers privately.
Provide value beyond the purchase.
Besides providing excellent service and support, there are other ways you can
provide value to your existing customers and prevent them from leaving. It's all about
making customers feel that the company has more to offer than a single product that
meets a single need. Show your customers that your organization provides unparalleled
value. Some examples include sending out daily or weekly newsletters, sharing relevant
blog posts with them, or updating them on upcoming events and programs hosted by your
company. Encourage them to sign up for your email newsletter.
Experience personalization reduces customer churn
All of us want special treatment and to be remembered. That's why most
customers are satisfied when they are addressed personally at restaurants or by online
call centers. Businesses should therefore organize customer appreciation events, invest
in customer care programs and learn about their customers' needs. Sending customized
communications with incentives may persuade consumers to do business with the
company again.
Customer feedback allows the company to learn what customers like and dislike
about its products and services. Rather than sending out surveys and forms, use a
chatbot to collect feedback and communicate with customers. After that, take action on
the unfavorable comments. Create a win-back strategy based on the customers'
complaints.
Build customer loyalty
Customer satisfaction is an invaluable asset to every business, and loyal
customers are less inclined to leave. The company therefore needs to find ways to build
credibility and trust with its customers, keep them in a long-term relationship by
providing optimal benefits, and always keep its promises. Let customers know what the
company is doing for them: these are values and benefits that they can't find anywhere
but at this business.
Restore customer trust in your business by strengthening relationships
Losing customers means losing trust in a variety of ways. Therefore, the company
needs to research methods in advance to avoid this happening. Create brand engagement
to keep customers and improve brand loyalty to keep them from giving up on the
company.

It's important for companies to reflect on the reasons customers left and come up
with strategies to restore their trust. To rebuild their trust in your company, use the
methods listed above and continue to develop new ones of the company's own. To have a
greater impact on lost customers, track and tailor the company's re-engagement
campaigns. WhatsApp marketing is a great approach to developing long-term
relationships with customers and improving your brand, and it is quite useful for small
businesses and aspiring entrepreneurs who don't have a website. For greater results,
branch out from WhatsApp and become an authority on all the mediums your customers
use. Use Telegram and Facebook to reinforce the company's marketing strategy alongside
email notifications and web promotion campaigns.
5.3. What are the various channels you can use to regain customers?
A live chat service:
Live chat is the perfect tool for facilitating real-time engagement with clients on
your website or mobile app. Ideally, you should make the most of live chat even before
customers leave. Suppose they click the "unsubscribe" or "cancel membership" buttons,
or move their cursor toward leaving the page. In that case, you can immediately send
them a well-chosen chat message to dissuade them from leaving.
To get a deeper grasp of customer sentiment and reduce churn in real time, you
may also regularly perform a CSAT (Customer Satisfaction) survey or check your brand's
NPS (Net Promoter Score) via live chat. Live chat has the additional benefit that your
staff are immediately studying the issue, looking into the potential attrition's root cause,
and searching for user-friendly remedies.
Using live chat in conjunction with emails or retargeting ads is another option
for re-engaging lost customers. For instance, if you have made a convincing case for
lost customers to return to your website via banner adverts or customized emails, give
them a warm welcome when they return. Personalize your live chat messaging and offer
an enticing, immediate reward to lure them back. When it comes to a successful
marketing campaign, it's not just about what you're selling; it's also about how you're
selling it.
Retargeting advertisements:
Since retargeting advertisements can assist in converting up to 70% of new
website visitors into customers, marketers frequently use them to increase consumer
conversion rates. Retargeting advertisements work by placing web cookies on a visitor's
browser (with their consent) and following them across the internet. This keeps the
brand in the visitor's memory and nudges them from thought to action.
Retargeting ads could, if done correctly, help you recover lost customers. You
must ensure that everything is packaged tastefully, watch out for intrusive advertising,
and offer a strong reason for people to come back to your establishment. There is a
fine line between a successful remarketing plan and bombarding people with ads. The
more individuals are exposed to your commercials, the less interested they will be in
your brand.
Emails:
Email can be a more effective, direct, and personalized way to win back lost
customers. You can create a one-time email campaign or handle it like a drip campaign
for clients who have given up on your business. This approach can be tested with text
messages as well, or it can be used in conjunction with an email re-engagement
campaign.
While the message concept (offer an alluring incentive, create urgency, or remind
them of what they're missing out on) applies to all techniques, your email strategy
should go above and beyond. Your copy should not be generic advertising masquerading
as a marketing email. It must inspire awe, make customers feel special, and urge them
to think twice about leaving your business.
Do it elegantly and use the chance to inject some humor into your email.
Messaging and social media channels:
Social media marketing is one of the best strategies to address customer issues. By
publishing blogs, videos, and other pertinent information, you may keep churned users
interested in your content. The educational value you offer keeps them engaged even
after they churn.
Retargeting and social media ads are great ways to hyper-target and interact with both
current and potential clients. They give you a platform to interact with clients in a
special way. You can experiment with several ad formats:

• Image ads
• Carousel ads
• Video ads
• Story ads
• Collection ads

Numerous sites, including Facebook, Instagram, and Twitter, are available for running
these advertisements. By using them, you can reach your target audience where they are
and encourage them to come back to your establishment. They are also particularly
helpful for reducing cart abandonment.
When it comes to interacting directly with your customers, messaging channels are
the future. You can use Facebook Messenger and WhatsApp in conjunction with live chat
to send messages to your customers on their favorite messaging platform.

Conclusion and Future Works
The study provides a summary of the data analysis. The data shows that a large
share of the company's customers tend to buy infrequently and spend little money:
hibernating customers make up the largest customer segment (48.38%). On the other
hand, the customer churn rate at this company is low. This suggests that the company
is still building a foothold in the market, and is quite successful in doing so. In this
research, we offer some recommendations for this company and for companies in similar
situations.
However, this study still has many limitations. Traditional RFM modeling using
quintile analysis still requires a lot of manual effort. The RFM model itself is a simple
model, which may not fully describe the complexity of customer segmentation.
Our team will update this research in the future by including more cutting-edge
clustering techniques such as K-means clustering, EM clustering, Fuzzy C-means
clustering, and OPTICS. In order to better practice categorizing the consumer groups, we
will also expand our knowledge of customer segmentation and marketing strategies.
