Bisecting K-Means: An Efficient Approach to Customer Segmentation
Divanshu Nayan
Computer Science and Engineering, C.V. Raman Global University, Bhubaneswar, India

Pranshu Mishra
Computer Science and Engineering, C.V. Raman Global University, Bhubaneswar, India

Shahoor Ahmed
Computer Science and Engineering, C.V. Raman Global University, Bhubaneswar, India

Dr. Bichitrananda Behra
Computer Science and Engineering, C.V. Raman Global University, Bhubaneswar, India

Abstract—Organizations can efficiently segment their consumer base by leveraging RFM (Recency, Frequency, and Monetary) values derived from analyzing transactional data over a specified period. This segmentation approach facilitates the identification of groups with similar behaviors, enabling a deeper understanding of customer needs and the exploration of potential clients for the business. Additionally, segmenting the client base has a positive impact on revenue generation. Emphasizing the retention of current consumers over acquiring new ones is widely acknowledged as a priority; for instance, businesses can employ marketing strategies tailored to specific market niches to cultivate client loyalty and enhance retention efforts. The study applies classic K-means and hierarchical clustering algorithms to cluster transactional data after conducting an RFM analysis, and introduces the Bisecting K-Means method as a novel alternative. The efficacy of these techniques is evaluated based on cluster compactness, execution time, and the average similarity between each cluster and its most similar counterpart.

Index Terms—Customer segmentation, RFM analysis, K-Means, Hierarchical Clustering, Bisecting K-Means

I. INTRODUCTION

The current business environment has become more competitive, requiring new approaches to maintain a competitive edge. Implementing a customer segmentation model can greatly increase business earnings. The Pareto principle, under which roughly 2 customers out of 10 contribute disproportionately to revenue, emphasizes the need to prioritize client retention over acquiring new clients. By utilizing customer segmentation, which capitalizes on a variety of distinctive client attributes, business experts can customize marketing strategies, identify trends, plan product development, coordinate advertising campaigns, and offer pertinent items. Customer segmentation ensures successful communication with specific groups by tailoring messages to them, and it often makes use of variables such as location, age, gender, income, lifestyle, and past purchasing patterns.

In this context, behavioral data is utilized for segmentation due to its widespread availability, dynamic nature, and foundation in past purchase behaviors. Recency, Frequency, and Monetary (RFM) analysis emerges as a prominent method for evaluating clients based on their purchasing patterns. To quantify the Recency, Frequency, and Monetary aspects, a scoring system is developed. These scores are then amalgamated to generate an RFM score ranging from 111 to 555 (Haiying and Yu, 2010). This composite score serves as a tool for examining customers' historical and current behaviors to predict future patterns. Notably, within this framework, the Recency, Frequency, and Monetary scores exhibit a direct correlation with customers' lifetime value and retention rates.

Following the computation of recency, frequency, and monetary values, the K-Means technique is applied to group the customers into clusters. Examining the behavior of each cluster then makes it possible to identify which consumer group contributes most significantly to the company's profitability. Additionally, two other clustering algorithms, Bisecting K-Means and Hierarchical Clustering, are employed. The objective of this study is to introduce a method for enhancing the interpretability of clusters, improving compactness, and reducing cluster spread and processing time. Once customer clusters are established, understanding the distinctions among these groupings becomes imperative. A thorough examination of the clusters is conducted to identify target clients and tailor offers and promotions relevant to their needs and preferences. Marketing professionals will find the proposed consumer segmentation methodology valuable for its potential to enhance targeted marketing efforts. The remainder of the paper compares and contrasts the three clustering techniques, evaluating them on similarity, cluster compactness, execution time, and other pertinent variables. This comparative analysis provides insight into the strengths and weaknesses of each technique, enabling marketing professionals to make informed decisions about which clustering approach best suits their needs and objectives.
II. LITERATURE REVIEW

According to Jiang and Tuzhilin (2009), enhancing marketing success requires both buyer targeting and customer segmentation. Although these two tasks are combined into a methodical process, unified optimization remains the challenge. The authors suggested the K-Classifiers Segmentation algorithm as a solution to this issue. The goal of the strategy is to provide more resources to clients who generate higher profits for the business. Numerous authors have written on various techniques for client segmentation.

He and Li (2016) proposed a three-pronged strategy built around customer happiness, customer behavior, and customer lifetime value (CLV). The authors conclude that customers differ from one another, and so do their needs; segmentation makes it easier to identify those needs and expectations and to provide a decent service.

Cho and Moon (2013) suggested using weighted frequent pattern mining to create a personalized recommendation system. Using the RFM concept, customer profiling is performed to identify potential customers. To create weighted association rules through mining, the authors assign a different weight to every transaction. By using the RFM model, the consumer receives a more accurate recommendation, increasing the company's profit.

A novel clustering algorithm that functions similarly to the K-means and K-medoids algorithms was proposed by Shah and Singh (2012). Both of these approaches are partitional in nature. Although the suggested approach lowers the cluster error threshold, it does not always offer the best solution. According to the authors, the novel method executes faster than the conventional methods as the number of clusters increases.

Sheshasaayee and Logeshwari (2017) combined the RFM and LTV (Lifetime Value) approaches into a new, integrated strategy. They employed a two-phase approach in which a statistical method is applied in the first phase and clustering in the second. After the two-phase model, they perform K-means clustering and employ a neural network to improve the segmentation.

A direct clustering strategy is presented by Jiang and Tuzhilin (2009), which groups customers by merging transactional data from several customers instead of using computed statistics. The authors also showed that the problem of finding an ideal segmentation solution is NP-hard, and therefore developed a variety of suboptimal clustering techniques. The client segments derived by direct grouping were then empirically investigated and found to be superior to those of the statistical method.

In 2001, Michael Steinbach, George Karypis, and Vipin Kumar [5] examined the two primary approaches to document clustering: agglomerative hierarchical clustering and standard K-means. Although limited by its quadratic time complexity, hierarchical clustering is frequently thought to be the superior approach. K-means and its variants, such as bisecting K-means, also generate good clusters and have a time complexity that is linear in the number of documents. To maximize results, agglomerative hierarchical approaches and standard K-means are sometimes combined. According to their data, however, the bisecting K-means technique outperforms both the hierarchical and ordinary K-means approaches.

In 2010, Samah Fodeh, Bill Punch, and Pang-Ning Tan [8] put forward a model that relies on the presence of polysemous and synonymous terms inside a collection of documents. They claim that document clustering can be performed with a significantly smaller number of features by utilizing an ontology. Nouns that are synonymous or polysemous are both comparatively common and crucial for the creation of document clusters, and identifying them improves document clustering.

In 2010, Rekha Baghel [6] proposed a novel method of document grouping based on frequently occurring concepts. Instead of using frequent items as in typical text mining techniques, the clustering algorithm employed in FCDC (Frequent Concepts based Document Clustering) works on frequently occurring concepts. Numerous clustering algorithms treat documents as a "bag of words" and disregard significant word associations, such as synonyms. The proposed FCDC algorithm uses the semantic association between words to generate concepts, employing the WordNet ontology to produce a low-dimensional feature vector that enables an effective clustering method.

III. ALGORITHM DESCRIPTION

The segmentation process utilizes the transactional dataset of a company's clients, employing three distinct algorithms to group clients based on RFM analysis. Initially, the data undergoes pre-processing to remove outliers and filter significant occurrences. The z-score method is employed to identify outliers by assessing how far data points lie from the mean in units of standard deviation; after standardization, the mean and standard deviation become 0 and 1, respectively, and outliers are identified as data points that deviate significantly from the mean of zero. Subsequently, the recency, frequency, and monetary values are computed by feeding the preprocessed data into the RFM model. The three clustering algorithms, namely K-Means, Hierarchical Clustering, and Bisecting K-Means, are then applied to the three attributes (recency, frequency, and monetary values). These algorithms partition the clients into distinct groups based on their RFM characteristics. Following this, the cluster compactness, similarity index, and execution time of each clustering method are scrutinized to assess their effectiveness. For quick reference, a summarized depiction of the suggested client segmentation strategy is presented in Figure 1.
A. RFM analysis

In database marketing, Recency, Frequency, and Monetary (RFM) analysis stands as a potent and widely recognized method. Ranking clients based on their historical purchasing behavior is a prevalent practice in this realm. RFM analysis finds numerous applications across various domains, including online shopping and e-commerce, particularly in scenarios involving a large number of clients. This strategy entails segmenting customers along three dimensions: Recency (R), Frequency (F), and Monetary (M).

1) Recency: How recently did the client make a purchase? The time elapsed since a customer's most recent purchase is their recency value. A lower recency value suggests that the client visits the business often; a higher value suggests a lower likelihood of the customer visiting the business soon.

2) Frequency: How many times did the customer make a purchase? The number of purchases a consumer makes in a certain time frame is their frequency. The greater the frequency value, the more devoted the client.

3) Monetary: What was the customer's expenditure? The amount of money spent by the consumer during a specific time period is their monetary value. The more money spent, the more revenue the company receives from them.

B. K-Means clustering

K-Means is a common algorithm that divides the data into a specified number of clusters so that intra-cluster similarity is high. It takes the data attributes and the number of clusters as inputs. The iterative K-Means method recalculates the centroid values before each iteration; the centroids determined at each iteration decide which cluster each data point is shifted into. The procedure is iterated until the total within-cluster distance can no longer be reduced. Algorithm 1 displays the K-Means algorithm.

Min-max normalization is used to normalize the recency, frequency, and monetary values, because skewed values could be troublesome. The scaled data is then subjected to the clustering method. To determine which customer category generates the most revenue for the business, the amount of money earned from each is calculated. K-means has complexity O(n · k · i), where k is the number of clusters, i denotes the number of iterations, and n denotes the number of instances.

Algorithm 1: K-Means Algorithm
1: Input:
2: - Dataset D = {x1, x2, ..., xn} with n data points in d dimensions.
3: - Number of clusters k.
4: Output:
5: - Cluster centroids {c1, c2, ..., ck}.
6: Initialization:
7: Randomly select k data points as initial cluster centroids {c1^(0), c2^(0), ..., ck^(0)}.
8: for t = 1 to T (maximum iterations) do
9:   Assignment step: assign each data point xi to the nearest cluster centroid under the chosen distance metric (e.g., Euclidean distance):
     assign xi to cluster j = arg min over l in {1, 2, ..., k} of ||xi − cl^(t−1)||²
10:  Update step: recompute the centroid of each cluster cj as the mean of the data points assigned to it:
     cj^(t) = (1 / |Cj|) · Σ over xi in Cj of xi
11:  Termination criterion: if the centroids have not changed significantly between iterations (or the maximum number of iterations is reached), terminate; otherwise, return to the assignment step.
12: end for
C. Hierarchical Clustering

Hierarchical clustering is an unsupervised learning technique that organizes data points into a nested structure, similar to a family tree. It begins by treating each data point as its own individual cluster, then iteratively merges the most similar clusters based on a chosen distance metric (such as Euclidean distance) until a single cluster encompassing all data points is formed. This process creates a visual representation called a dendrogram, which depicts the merging hierarchy and allows the optimal number of clusters to be determined. However, the computational cost of hierarchical clustering can be significant: in the worst case, its time complexity scales with the cube of the number of data points (O(n³)), making it less suitable for massive datasets than other clustering algorithms.

As in the preceding procedure, min-max normalization is used to scale the variables. The clients are then grouped according to their recency, frequency, and monetary values using hierarchical clustering.

Algorithm 2: Agglomerative Hierarchical Clustering
1: Input:
2: - Dataset D = {x1, x2, ..., xn} with n data points in d dimensions.
3: - Distance metric (e.g., Euclidean distance).
4: - Linkage function (e.g., Single Linkage, Complete Linkage).
5: Output:
6: - Dendrogram representing the hierarchical cluster structure.
7: Initialization:
8: Consider each data point as an individual cluster, and compute a proximity matrix storing the distance between all data points.
9: for t = 1 to n − 1 (merging iterations) do
10:  Find closest clusters: identify the two most similar clusters based on the chosen linkage function and the proximity matrix.
11:  Merge clusters: combine the identified clusters into a new cluster.
12:  Update proximity matrix: recalculate distances between the new cluster and all remaining clusters.
13: end for
14: The final set of clusters and their hierarchy is represented by the dendrogram.
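A minimal sketch of the agglomerative procedure using SciPy's hierarchical-clustering routines; the toy points and the choice of complete linkage are assumptions for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy RFM-like points forming two obvious groups.
X = np.array([[1.0, 1.0], [1.1, 0.9], [0.9, 1.1],
              [8.0, 8.0], [8.1, 7.9], [7.9, 8.1]])

# Agglomerative clustering: build the merge tree (dendrogram structure)
# with complete linkage on Euclidean distances.
Z = linkage(X, method="complete", metric="euclidean")

# Cut the tree into a fixed number of flat clusters.
labels = fcluster(Z, t=2, criterion="maxclust")
```

`Z` encodes every merge and can be rendered as a dendrogram with `scipy.cluster.hierarchy.dendrogram`; cutting at a different `t` yields a different number of flat clusters without re-running the algorithm.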
D. Bisecting K-means
Bisecting K-means offers a unique clustering perspective,
merging the top-down logic of divisive hierarchical clustering
with the iterative splitting of K-means. It starts with all
data points in a single cluster and strategically divides the
cluster with the most significant internal differences using
K-means (K=2). This selective splitting continues until the
desired number of clusters is reached. This approach can
be advantageous for large cluster counts due to its focus
on the most informative splits and its tendency to produce
clusters with more balanced sizes compared to standard K-means. The time complexity of the bisecting K-means algorithm is O((K − 1) · I · N), where I is the number of iterations to converge and N is the number of data points; bisecting K-means is thus linear in the size of the dataset.
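The splitting loop described above can be sketched as follows. This uses scikit-learn's KMeans for the two-way splits and the within-cluster variance selection criterion; it is one possible reading, not the paper's exact implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def bisecting_kmeans(X, k, seed=0):
    """Sketch of bisecting K-means: repeatedly 2-way-split the cluster
    with the largest within-cluster variance."""
    clusters = [np.arange(len(X))]          # start: one cluster holding all points
    while len(clusters) < k:
        # Pick the cluster with the highest total within-cluster variance (SSE).
        sse = [((X[idx] - X[idx].mean(axis=0)) ** 2).sum() for idx in clusters]
        target = clusters.pop(int(np.argmax(sse)))
        # Split it in two with standard K-means (K = 2).
        split = KMeans(n_clusters=2, n_init=10, random_state=seed).fit(X[target])
        clusters.append(target[split.labels_ == 0])
        clusters.append(target[split.labels_ == 1])
    labels = np.empty(len(X), dtype=int)
    for j, idx in enumerate(clusters):
        labels[idx] = j
    return labels
```

Selecting by diameter instead of variance only changes the `sse` line; the rest of the loop is identical.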
Algorithm 3: Bisecting K-Means Clustering
1: Input:
2: - Dataset D = {x1, x2, ..., xn} with n data points in d dimensions.
3: - Number of clusters k.
4: Output:
5: - Cluster centroids {c1, c2, ..., ck}.
6: Initialization:
7: Start with all data points in a single cluster.
8: for t = 1 to k − 1 (bisection steps) do
9:   Splitting step: apply the K-Means algorithm (often with a single iteration) to the selected cluster to split it into two sub-clusters.
10:  Choose one of the clusters for further splitting in the next iteration. Common strategies include:
11:  (a) selecting the sub-cluster with the larger centroid distance (larger diameter);
12:  (b) selecting the sub-cluster with the higher within-cluster variance (more spread).
13: end for
14: The final set of clusters consists of the k clusters remaining after bisection.

IV. EXPERIMENTATION AND RESULT DISCUSSION

The effectiveness of the suggested methodology is assessed using one year of transactional data from the customers of an online retailer, sourced from the University of California, Irvine (UCI) repository. This section outlines the consumer segmentation process step by step. The dataset has eight attributes, including the customer ID, product code, name, price, and the date and time of purchase. There are 541910 instances with eight attributes in the original data set, covering consumer purchases made between December 1, 2010, and December 9, 2011. During data pre-processing, any cases with missing values in significant attributes, unit prices or quantities less than 0, or dates later than the current date are eliminated. As an extra pre-processing step, Z-score analysis is carried out to detect outliers. Only records that pass the filtering process, such as invoice date and time, product quantity per transaction, and product price per unit, have been fed into the benchmark algorithms. The three extra attributes produced by the RFM computation, recency, frequency, and monetary, are present in the 4067 instances of the amended dataset. Table 2 displays a description of the original dataset.

It is found that, because of its lower computational cost, the suggested Bisecting K-Means method runs faster than the other two. The average distance between the generated clusters is studied using the silhouette width, and the average similarity of each cluster with its most similar cluster is measured by the Davies-Bouldin score. The silhouette plot is a visual analysis of the clustering result that shows the number of customers in each cluster as well as the shortest distance between a point in one cluster and a point in another. The larger the average silhouette width, the closer the data points within a cluster are to one another and the farther they are from points in other clusters, and vice versa. Likewise, the smaller the Davies-Bouldin score, the less similar the data points of one cluster are to those of other clusters, and vice versa. The average silhouette width is computed for the final clusters produced by K-Means, Hierarchical Clustering, and Bisecting K-Means. The average silhouette width of Bisecting K-Means is found to be larger than that of both K-Means clustering and Hierarchical clustering.
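Both evaluation measures can be computed directly with scikit-learn; the synthetic blobs below are placeholders for the scaled RFM values, not the paper's data.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score, davies_bouldin_score

# Toy stand-in for the scaled R, F, M values: two well-separated groups in 3-D.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (40, 3)), rng.normal(4, 0.3, (40, 3))])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

sil = silhouette_score(X, labels)       # larger = tighter, better-separated clusters
dbi = davies_bouldin_score(X, labels)   # smaller = less similarity between clusters
```

Running the same two lines on the labelings produced by each of the three algorithms gives the compactness and similarity comparison described in this section.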

Fig. 2 displays the result plots produced by bisecting K-means, hierarchical clustering, and K-means clustering. Each algorithm's execution time is computed using the system time.

V. CONCLUSION

Customer segmentation strengthens customer relationships. While acquiring new clients is significant for the business, keeping the current clientele is even more crucial (Tong et al., 2017). This work uses RFM analysis for segmentation and then extends it with several clustering methods: K-Means clustering, Hierarchical Clustering, and a new technique, Bisecting K-Means, obtained by slightly altering the standard K-Means clustering. The operation of these methods is examined. After analyzing how long each algorithm takes to run, it is found that the suggested Bisecting K-Means strategy takes the least time; because of its simplicity and lower computation cost, the suggested algorithm is more efficient. Because segmentation is carried out according to monetary, frequency, and recency values, the business is able to tailor its marketing campaigns to the clients' purchasing habits. Future research will examine consumer behavior in each segment, including the products that members of that segment purchase regularly, which would make it easier to give particular products greater promotional incentives.

REFERENCES

[1] Phan Duy Hung, Nguyen Thi Thuy Lien, and Nguyen Duc Ngoc. 2019. Customer Segmentation Using Hierarchical Agglomerative Clustering. In Proceedings of the 2nd International Conference on Information Science and Systems (ICISS '19). Association for Computing Machinery, New York, NY, USA, 33–37.
[2] Chihli Hung, Chih-Fong Tsai, Market segmentation based on hierarchical self-organizing map for markets of multimedia on demand, Expert Systems with Applications, Volume 34, Issue 1, 2008, Pages 780–787, ISSN 0957-4174.
[18] Zahrotun, L., 2017. Implementation of data mining technique for customer relationship management (CRM) on online shop tokodiapers.com with fuzzy c-means clustering. In: 2017 2nd International Conference on Information Technology, Information Systems and Electrical Engineering (ICITISEE), Yogyakarta, pp. 299–303.
[19] Tong, L., Wang, Y., Wen, F., Li, X., Nov. 2017. The research of customer loyalty improvement in telecom industry based on NPS data mining. China Commun. 14 (11), 260–268. https://doi.org/10.1109/CC.2017.8233665.
[20] Shah, S., Singh, M., 2012. Comparison of a Time Efficient Modified K-mean Algorithm with K-Mean and K-Medoid Algorithm. In: 2012 International
[3] I. Maryani, D. Riana, R. D. Astuti, A. Ishaq, Sutrisno and E. A. Pratama,
”Customer Segmentation based on RFM model and Clustering Tech-
niques With K-Means Algorithm,” 2018 Third International Conference
on Informatics and Computing (ICIC), Palembang, Indonesia, 2018, pp.
1-6, doi: 10.1109/IAC.2018.8780570.
[4] A. Joy Christy, A. Umamakeswari, L. Priyatharsini, A. Neyaa, RFM
ranking – An effective approach to customer segmentation, Journal of
King Saud University - Computer and Information Sciences, Volume
33, Issue 10, 2021, Pages 1251-1257, ISSN 1319-1578,
[5] Chongkolnee Rungruang, Pakwan Riyapan, Arthit Intarasit, Khanchit
Chuarkham, Jirapond Muangprathub, RFM model customer segmenta-
tion based on hierarchical approach using FCA, Expert Systems with
Applications, Volume 237, Part B, 2024, 121449, ISSN 0957-4174,
[6] M. Aryuni, E. Didik Madyatmadja and E. Miranda, ”Customer Seg-
mentation in XYZ Bank Using K-Means and K-Medoids Cluster-
ing,” 2018 International Conference on Information Management and
Technology (ICIMTech), Jakarta, Indonesia, 2018, pp. 412-416, doi:
10.1109/ICIMTech.2018.8528086.
[7] R. Kashef, M.S. Kamel, Enhanced bisecting k-means clustering using
intermediate cooperation, Pattern Recognition, Volume 42, Issue 11,
2009, Pages 2557-2569, ISSN 0031-3203.
[8] V. Rohilla, M. S. S. kumar, S. Chakraborty and M. S. Singh,
”Data Clustering using Bisecting K-Means,” 2019 International Con-
ference on Computing, Communication, and Intelligent Systems (IC-
CCIS), Greater Noida, India, 2019, pp. 80-83, doi: 10.1109/ICC-
CIS48478.2019.8974537.
[9] S. Banerjee, A. Choudhary and S. Pal, ”Empirical evaluation of K-
Means, Bisecting K-Means, Fuzzy C-Means and Genetic K-Means clus-
tering algorithms,” 2015 IEEE International WIE Conference on Elec-
trical and Computer Engineering (WIECON-ECE), Dhaka, Bangladesh,
2015, pp. 168-172, doi: 10.1109/WIECON-ECE.2015.7443889.
[10] He X., Li, C., 2016. The research and application of customer seg-
mentation one-commerce websites. In: 2016 6th International Con-
ference on Digital Home(ICDH), Guangzhou, pp. 203–208. doi:
10.1109/ICDH.2016.050.
[11] Haiying, M., Yu, G., 2010. Customer Segmentation Study of Col-
lege Students Based on the RFM. In: 2010 International Conference
on E-Business and EGovernment, Guangzhou, pp. 3860-3863. doi:
10.1109/ICEE.2010.968.
[12] Sheshasaayee, A., Logeshwari, L., 2017. An efficiency analysis on
the TPA clustering methods for intelligent customer segmentation. In:
2017 International Conference on Innovative Mechanisms for Industry
Applications (ICIMIA), Bangalore, pp. 784–788.
[13] Liu, C.C., Chu, S.W., Chan, Y.K., Yu, S.S., 2014. A Modified K-
Means Algorithm – Two-Layer K-Means Algorithm. In: 2014 Tenth
International Conference on Intelligent Information Hiding and Multi-
media Signal Processing, Kitakyushu, pp. 447–450. doi: 10.1109/IIH-
MSP.2014.118.
[14] Cho, Young, Moon, S.C., 2013. Weighted mining frequent pattern-based
customer’s RFM score for personalized u-commerce recommendation
system. J. Converg. 4, 36–40.
[15] Jiang, T., Tuzhilin, A., March 2009. Improving personal-
ization solutions through optimal segmentation of customer
bases. IEEE Trans. Knowledge Data Eng. 21(3), 305–320.
https://doi.org/10.1109/TKDE.2008.163.
[16] Lu, H., Lin, J.Lu., Zhang, G., May 2014. A customer churn prediction
model in telecom industry using boosting. IEEE
[17] Memon, K.H., Lee, D.H., 2017. Generalised fuzzy c-means clustering
algorithm with local information. In: IET Image Processing, vol. 11, no.
1, pp. 1-12, 1.
