Onugu Memory Christian Project
BY
ONUGU MEMORY C.
FUO/17/CSI/6679
DECEMBER, 2022
CUSTOMER SEGMENTATION USING MACHINE LEARNING: A CASE
STUDY OF MARKET SQUARE, CHOBA, PORT HARCOURT, RIVERS
STATE
DECLARATION
I, Onugu Memory C., declare that this project on “Customer Segmentation Using Machine
Learning: A Case Study of Market Square, Choba, Port Harcourt, Rivers State” was carried out
by me, that it is my original work, and that it has not been submitted wholly or in part for the
CERTIFICATION
This is to certify that this project was carried out by Onugu Memory C. with matriculation
number FUO/17/CSI/6679, under full supervision and in accordance with the requirements of the
Department of Computer Science and Informatics, Federal University Otuoke, Bayelsa State, for
the degree of Bachelor of Science (B.Sc). This work is original and has not been submitted in
part or in full for any other diploma or degree of this or any other university.
___________ __________
Head of Department Signature Date
___________ __________
Dean, Faculty of Science Signature Date
___________ __________
External Examiner Signature Date
DEDICATION
This work is dedicated to the Almighty God, my only source of knowledge, power, strength and
inspiration.
ACKNOWLEDGEMENTS
My gratitude goes to God Almighty, whose abundant grace, mercy and unmerited favour have
been the source of my inspiration and success all through my period of studies. His underlying
love and supernatural favour have always been a source of great strength and without this, I would
I want to express my deep gratitude to my able supervisor Mrs. Moko for her tireless effort to
read through and correct me in all relevant passages and chapters of this work.
My special appreciation goes to all the lecturers in the Department of Computer Science and
Informatics and lecturers in other departments who have imparted knowledge to me in one way
or the other.
My family remains a steady and permanent point of contact. I salute the love, understanding and
all-round support of my dad, Onugu Christian, my Step-Mom, Iyartodum Philip and my siblings.
They have remained a strong pillar of strength, courage, guidance and prayers. The support of my
family has been my great strength through this journey of academic success.
To the wonderful friends God gave me here at Federal University Otuoke, Anwakobe Joy
Passover, Akelemor Bright Clever, Felicity Nwachukwu, Charles Dorathy Talent, Mahoney
Okon and others too numerous to mention, I appreciate you for your love, support and
encouragement throughout my stay at Federal University Otuoke. Each of you
affected me positively in one way or another. To my course mates, I appreciate you for being a
wonderful family.
Finally, let me stress that this research work, like all human efforts, has several limitations and
shortcomings. The responsibility for all errors and shortcomings is entirely mine.
ABSTRACT
TABLE OF CONTENTS
Title Page
Declaration
Certification
Dedication
Acknowledgement
Abstract
Table of Contents
CHAPTER THREE: SYSTEM DESIGN AND ANALYSIS
3.1 Research methodology
3.1.1 Rapid application development methodology
3.1.2 Agile development methodology
3.1.3 Waterfall methodology
3.1.4 Adopted methodology
3.2 General analysis of existing system
3.3 Method of data collection used
3.4 System investigation
3.5 Data analysis
3.6 Data requirements
3.7 Existing system
3.7.1 Analysis of the existing system
3.8 Proposed system
3.8.1 Support for the new system
CHAPTER FIVE: SUMMARY, CONCLUSIONS AND RECOMMENDATIONS
5.1 Summary
5.2 Conclusion
5.3 Recommendations
References
Appendix
CHAPTER ONE
INTRODUCTION
1.1 Background to the Study
Over the years, increased competition among businesses and the availability of large-scale
historical data have resulted in the extensive use of data mining techniques to
uncover critical and strategic information hidden in organizations' data (Blanchard et al., 2019). The
business world has become more competitive over time, since organizations need
to satisfy their customers' wants and needs and attract new customers
in order to improve their business (Puwanenthiren, 2012). Identifying and addressing
every individual customer's needs and requirements is very difficult in the corporate world,
because customers differ in
their needs, wants, preferences, demographics, size, taste, characteristics, and so on. In business,
treating all customers alike is bad practice. The idea of customer segmentation has been
Customer segmentation, also called market segmentation, is the process of
dividing people into groups, organized into subgroups or segments, where each has its
Customers can be defined in sales, commerce, and economics (sometimes known
idea - obtained from a seller, vendor, or supplier through a financial transaction or exchange for
analyzing a customer base and grouping customers into categories or segments which share
by using clustering, a technique that falls under unsupervised
machine learning. Segmentation allows prospects to be grouped based on their wants and needs. Customer
segmentation means grouping customers into marketing groups whose members share
similar characteristics. More precisely, it means segmenting customers who share
common attributes, which is the most effective way of marketing. Customer segmentation involves gathering
data about each customer and analyzing it to identify the patterns used to create the
segments. Some of the best methods for gathering data are face-to-face interviews,
telephone interviews, surveys, and research using published data
related to customer categories. The basic data includes billing information,
shipping information, products purchased, promotion codes, payment method, and so on. Beyond
these, some organizations also collect data such as the reason for the purchase, the advertising channel that
led to the purchase, age, gender, and so on. In B2B (business-to-business) marketing, customers
are grouped by factors such as industry, number of managers, products previously purchased from the
company, and location. On the other hand, in B2C (business-to-consumer) marketing,
companies segment customers by age, gender, marital
status, and life stage, such as single, married, divorced, or retired. One of the main factor
practiced by all businesses regardless of size or industry. Common segmentation types
customer), customer status, behavioral, psychographic, etc. Some of the major benefits of
1.2 Statement of the Problem
Customer experience is becoming a major trend in winning online customers. In fact, it is well on
the way to overtaking price and product as the main brand differentiator. People value the
experience more than money, because they do not want to spend money with businesses that
do not provide the experience they expect, let alone with those that treat them badly. Recently, this
concept has been shifting: instead of just “not bad treatment,” customers want “exceptionally
personalized treatment.” Many of them are quick to leave a business that does not provide that.
For this reason, there is a need for customer segmentation using machine learning in business. In
general, customers are willing to pay a premium for a product that meets their needs more
specifically than a competing product does. Thus, marketers who successfully carry
There are several important reasons why customer segmentation needs to be done carefully for
better matching of customer needs – customer needs differ. Creating separate offers for each
This study is aimed at the development of a customer segmentation system using machine
learning for Market Square, Choba Branch, Port Harcourt. The specific objectives are:
1. To develop a cluster segmentation system for Market Square, Choba Branch, Port
3. To implement data overview and data cleaning, exploratory data analysis, an unsupervised
machine learning task (cluster analysis), a customer segmentation report and supervised
This study will provide a better understanding of how the customers of Market Square, Choba branch,
can easily be segmented using a machine learning algorithm, the k-means clustering method, which
will help the enterprise to better understand its target audience and to be used to begin
The study focuses on developing a machine learning algorithm that segments or groups
customers of Market Square, Choba, based on common characteristics such as demographics
Customer: In sales, commerce, and economics, a customer is the recipient of a good,
Customer segmentation: the process of dividing a broad consumer or business
market, normally consisting of existing and potential customers, into subgroups of consumers in
Machine learning: the study of computer algorithms that can improve automatically through
1.7 Limitations of the Study
The limitations of this study include time constraints and a lack of sufficient or relevant data, which the
researcher would have used to give a sufficiently new approach to this form of study.
CHAPTER TWO
LITERATURE REVIEW
The literature review provides a detailed examination of the areas that contribute to this
project, such as the concept of customer segmentation, machine learning, and the theoretical
development of a machine learning algorithm that helps in customer segmentation. All of these will,
in one way or another, form part of the analysis of this project topic.
The term "market segmentation" refers to dividing a market along some commonality,
similarity, or kinship. That is, the members of a market segment share
something in common. The purpose of segmentation is the concentration of marketing energy and
force on the subdivision (or the market segment) to gain a competitive advantage within the segment
(Thomas, 2007). Smith (1956) is widely cited as providing the basis for the concept of market
segmentation as it is applied today. Wind (1978) describes market segmentation as a proactive process
study before segmentation can begin." Market segments can be described in different ways by means of
customers that roughly have the same preferences. Furthermore, there are diffused preferences,
which mean that customers vary in their preferences, and lastly clustered preferences, which
mean that natural market segments emerge from groups of customers with shared
preferences (Kotler et al., 2009). The basic premise of market segmentation is that a
heterogeneous group of customers can be grouped into homogeneous clusters or segments, each
requiring different applications of the marketing mix to serve its needs. When
discussing market segmentation, it is necessary to briefly mention the three areas
of marketing to be considered when marketing a product. The first area is mass
marketing. It covers mass production, mass distribution, and mass promotion of one
product to all buyers (Gunter et al., 1992). However, marketers have recognized the
great variety among individual customers, and market
segmentation is therefore a useful tool for marketers to customize their marketing programs for each
individual customer (Dibb et al., 1996). The second area is product-differentiated marketing. The
marketer produces two or more products that exhibit different features, styles, quality, and sizes.
The process of customer segmentation involves the creation of customer segments, parts, or subsets. A
the whole customer base and is identified or created by the marketing department in such a way that the individuals
(or organizations) in that segment would demand a particular set of goods and services
that have similar features. In short, a segment is a section of customers, the
1. Geographically, product-wise, or even need-wise, a single customer segment is distinct from
other segments, although one can also count on the presence of brief
similarities.
2. Products demanded by the buyers are homogeneous and at times also will
The top-down (or first-level) segmentation approach uses customer attribute data, commonly
known as customer reference data, to determine customer clusters. The goal of the top-down
approach is to combine and group these customers based on their characteristics, such as Nigerian
Industry Classification System (NICS) code, geographic footprint, and line-of-business information.
Once business information has been established at the first level, it is important to validate it using
customer data. This check is referred to as the refinement process (to avoid confusing it
with the bottom-up approach discussed later). The refinement process involves using the
statistical descriptions of the populations established by top-down data, comparing them, and discussing
them with business information to accept or refine the top-down level. To validate business
information through customer attributes, metrics such as density distribution, count, mean,
maximum, and distinct values can be derived from customers' data, including reference
data and historical activity data when available. Accordingly, further data
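The validation metrics named above (count, mean, maximum, and distinct values per segment) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the field layout, and the function name `describe_segments` are hypothetical and do not come from the Market Square data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical customer reference records: (top-down segment, monthly spend, state).
customers = [
    ("retail", 120.0, "Rivers"),
    ("retail", 80.0, "Rivers"),
    ("retail", 100.0, "Bayelsa"),
    ("wholesale", 900.0, "Rivers"),
    ("wholesale", 1100.0, "Rivers"),
]


def describe_segments(records):
    """Return count, mean, maximum and distinct states for each segment."""
    grouped = defaultdict(list)
    for segment, spend, state in records:
        grouped[segment].append((spend, state))

    summary = {}
    for segment, rows in grouped.items():
        spends = [spend for spend, _ in rows]
        summary[segment] = {
            "count": len(rows),
            "mean": mean(spends),
            "max": max(spends),
            "distinct_states": sorted({state for _, state in rows}),
        }
    return summary


summary = describe_segments(customers)
for segment, stats in summary.items():
    print(segment, stats)
```

Comparing such per-segment summaries against what the business expects of each segment is one simple way to decide whether a top-down grouping should be accepted or refined.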
2.1.3 Bottom-up segmentation
The bottom-up (or second-level) segmentation approach relies on the segments established by the
basis of similar transaction behavior such as wire, cash, check, and automated clearing house
transactions. The bottom-up approach essentially applies unsupervised ML techniques to the top-down
population, and it requires at least a year of transactional activity to work effectively. Bottom-up
segmentation techniques such as k-means clustering can involve using a fixed
number (k) of clusters to characterize every main product. Data points are then assigned to a cluster
based on proximity to the center of the cluster. The main objective of the bottom-up
segmentation approach is to improve inferences about whether any activity can be
considered anomalous, specific to a customer's cluster. The final segmentation is a combination of the
top-down and the bottom-up segments. Depending on the transaction monitoring strategy,
customer risk ratings (CRRs) are usually added as well to form a complete segmentation model.
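The assignment rule described above, in which each data point joins the cluster whose center is nearest, can be illustrated with a minimal pure-Python sketch of k-means (Lloyd's iteration). The two-dimensional points, the choice of initial centers, and the function names are assumptions made for illustration only; a production system would more likely rely on an established library.

```python
import math


def assign(points, centers):
    """Return, for each point, the index of the nearest center (Euclidean distance)."""
    return [
        min(range(len(centers)), key=lambda c: math.dist(point, centers[c]))
        for point in points
    ]


def update(points, labels, centers):
    """Move each center to the mean of its members; keep it in place if it has none."""
    new_centers = []
    for c, old_center in enumerate(centers):
        members = [p for p, label in zip(points, labels) if label == c]
        if not members:
            new_centers.append(old_center)
            continue
        new_centers.append(tuple(sum(dim) / len(members) for dim in zip(*members)))
    return new_centers


def kmeans(points, centers, max_iter=100):
    """Alternate assignment and update steps until the centers stop moving."""
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = update(points, labels, centers)
        if new_centers == centers:
            break
        centers = new_centers
    return assign(points, centers), centers


# Hypothetical customers described by two features, e.g. (visits, spend in thousands).
points = [(1.0, 2.0), (2.0, 1.0), (1.5, 1.5), (10.0, 11.0), (11.0, 10.0), (10.5, 10.5)]
labels, centers = kmeans(points, centers=[points[0], points[3]])
print(labels)   # cluster index for each customer
print(centers)  # final cluster centers
```

The number of clusters k is fixed in advance here by the number of initial centers, which mirrors the fixed k mentioned in the text.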
Because of factors such as data availability, data quality, and the complex nature of
customer behavior within sometimes complex products, it is important to combine the top-down and the
effective segmentation model can be achieved by only a top-down approach unless the
customer base is small and products are very simple. When dealing with large
numbers of customers and a variety of complex products, the bottom-up approach also ought
"orthogonality" is a mathematical concept that refers to the perpendicularity between two concepts; in this
case, to maintain orthogonality is to make sure top-down and bottom-up ideas are kept independent
of each other, which means avoiding using the bottom-up observations to alter or overwrite the
top-down segments.
In the event that strong differences are observed between top-down consensus and bottom-up evidence, a
deep-dive analysis should be conducted to understand why. Of course, analysis applied to
dissect such conflicts should follow the established model validation framework and governance
practices. When conflicts persist, the top-down logic should be kept intact,
which means bottom-up inconsistencies will likely result in alerts. The goal is to
make sure that the alert investigation and established tuning feedback loop can be used to better
The primary reasons for carrying out customer segmentation for customer trend analysis are:
To target each profitable segment in a unique way that suits that particular segment, and
When it comes to the practical application of customer segmentation analysis, there must be a
few fixed parameters that one should adopt and enforce to achieve the best results and
maximum profits. The following are the various factors that determine how the different customer
Demographic Segmentation
Demographic segmentation is one of the simplest segmentation techniques for tapping the
potential customer without wasting resources. In business, it is very difficult for a single
organization to satisfy the needs of all consumers, and hence the organization needs
to resort to customer segmentation. Through customer segmentation, the organization fulfils the
needs of all consumers belonging to a particular niche rather than trying to
fulfil the needs of the entire market, which is practically
impossible.
factors, such as age, gender, social class, and so on, into consideration. This helps to
divide the customers into several groups, each having a common variable, and to target each
of these groups to improve the performance of the organization. This customer segmentation
methodology aims at understanding the prospective customer, and taking necessary steps to
Segmentation variables are basically factors which help the organization to determine the target
group. Variables mainly consist of demographic factors such as age, ethnicity, and occupation.
Below are the variables which are commonly used to divide customers into smaller segments
Age
Gender
Family size
Income
Occupation
Education
Ethnicity
Nationality
Religion
Social standards
Based on these variables, an organization can decide which group they would cater to.
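As a small illustration of how such variables can define groups, the sketch below assigns each customer to a segment formed from an age band and a gender and then counts the segment sizes. The age bands, field names, and sample records are hypothetical and chosen only to make the idea concrete.

```python
from collections import Counter

# Hypothetical age bands: (lowest age, highest age, label).
AGE_BANDS = [
    (0, 17, "under 18"),
    (18, 34, "18-34"),
    (35, 54, "35-54"),
    (55, 200, "55+"),
]


def age_band(age):
    """Return the label of the age band that contains the given age."""
    for low, high, label in AGE_BANDS:
        if low <= age <= high:
            return label
    raise ValueError(f"age out of range: {age}")


def demographic_segment(customer):
    """Combine two demographic variables into one segment key."""
    return (age_band(customer["age"]), customer["gender"])


# Hypothetical customer records.
customers = [
    {"name": "A", "age": 22, "gender": "female"},
    {"name": "B", "age": 31, "gender": "female"},
    {"name": "C", "age": 45, "gender": "male"},
    {"name": "D", "age": 60, "gender": "male"},
]

segment_sizes = Counter(demographic_segment(c) for c in customers)
for segment, size in segment_sizes.items():
    print(segment, size)
```

Other variables from the list, such as income or occupation, could be added to the segment key in the same way.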
Demographic segmentation has several advantages which make it the best choice in the customer
An organization can easily classify the needs of the buyers based on
Demographic segmentation factors are much easier to obtain and measure
Geographic Segmentation
divided on the basis of geographic units, such as cities, states, countries, and so on.
Customer segmentation can be based on any factor, such as culture, economic status,
geographic differences, and so on. If the customer segmentation is based on
which the target group for a given product is divided by geographic units, such as countries, states,
Geographic segmentation and profiling are very important processes of marketing strategy, as
they are worked out after conducting detailed studies of the customers who
belong to different regional units. This kind of customer segmentation can be beneficial for
identifying the preferences and needs of customers in a particular region, according to the
Psychographic Segmentation
psychology and lifestyle habits of customers. Marketing a product requires a deep
understanding of the customer's psychology, along with their needs, for the product to be
there are a lot of differences between customers of different regions, ages and ethnicities. So
he needs to divide the customers into different segments, and target each segment
separately in order to maximize sales. These segments are divided on a variety of factors
like age, sex, lifestyle, income level and psychology. Psychographic segmentation plays on the
psychology of the potential customers and helps the seller determine how he should
Interests
Activities
Opinions
Habits
Lifestyle
Hobbies
Using these factors as a base, a marketer can determine how a particular group of
Apart from the obvious benefit of increased sales, there are a few other complex
Better inputs for the design of new products that the customer will like.
It is the study of different kinds of algorithms that improve their performance at some
specific task through their own experience. These algorithms improve their performance by
analyzing past data and tasks. We can say that a smart computer learns from its
own experience, much like people who become experts in their work through their previous
1) Machine learning is used to build systems that change and adjust
their behavior according to the needs of the user, e.g. personalized mailing and message filtering.
2) It helps to find information in data sets, so that the organization can adopt new
business ideas and improve its performance. This concept is known as data mining.
3) It helps to build systems that recognize individual handwriting,
speech and much more. For example, unlocking a system by matching a password
4) It improves systems that require more knowledge and skills to perform
Data processing and real-time predictions: Here the system consumes more data and
makes predictions at different levels. For example, on a social website, when customers add a product
to their cart, the site offers them discounts and other gifts from time to time (Emir, 2012).
Accept data from various sources: It accepts data from a large number of
sources in different forms because it can handle big data, so it produces optimal results by
analyzing the data.
Provide a multi-dimensional view of data: It provides different views of data for different
kinds of queries. Data is dynamic in nature, so results also change according to need.
It is used in a variety of applications, such as the banking and financial sector, healthcare,
retail, publishing and social media, robot locomotion, game playing and so on.
Easy decision-making: It helps to make decisions based on the analysis of past data. For example,
if a soap company wants to launch a new product in the market, machine learning helps it to
learn about past sales of older products so it can decide whether or not their
Flexibility: It can make changes in the system according to need and changes in the environment.
Web search: Machine learning is used in almost every part of the system at
major web search engines and sites such as Google, Bing, Yahoo, and Facebook. Anything that requires some kind
of "intelligence" is often addressed using machine learning. It learns from customers'
queries so that the service can satisfy them. Today, intelligent
search systems offer search by speech, image and characters (Narges, 2014).
Medical: Machine learning is used in the medical field; it helps to predict a patient's illness
from their past medical history of some disease. It helps the doctor estimate how long a patient can
fight a disease. Many applications of machine learning in medicine help in laboratories with blood
testing, tissue analysis and much more. It helps to maintain very important data about
the patient on a daily basis. Many systems are available in the medical world that
use machine learning to monitor the patient's condition 24 hours a day in hospitals.
E-commerce: Many applications are launched nowadays to support e-commerce.
If it seems like every technology company is throwing around buzzwords
like "big data," "artificial intelligence," and "machine learning," well, you're not
wrong. The thing is, e-commerce companies have a lot of data at their fingertips. However,
using that data is a challenge. AI can make sense of digital data at a much faster rate than any human
is capable of. Choosing how to apply AI tends to be a matter of priorities. Of course, you could
use AI to do a lot of things, but what will have the biggest impact? In an ideal world, we
could choose everything, let the machines take over, and relax. However, this is far
from reality. Organizations work with limited resources and have to prioritize which
ML technology to adopt. It is fair to say that the priority would be the technology
that has the greatest impact. With this in mind, we should review the most powerful applications of
• Personalization: by using a website, customers can search for products according to their
preferences.
• Price optimization: different websites offer different prices for products so
that the customer can choose the best product at the best price.
• Search ranking: Machine learning ranks search results by keeping track of the customer's
areas of interest, so that it can offer relevant information regarding those
interests.
they can get all the rules, and also follow new trends over time.
• Customer service: ML has many applications that are used for customer care; it
helps to take questions from the customer in natural language and resolve
issues quickly.
Data extraction: The machine learning concept is very useful in data extraction.
Large data warehouses hold historical data, and machine
learning techniques help to extract the useful data, so any organization can learn
about its performance and find new ways to improve it (Lacie, 2015).
algorithm provides a systematic approach to debugging because it
helps to achieve the best result in debugging, which is mostly helpful
Customer segmentation theory is a modern theory that
brings together potential buyers into segments with common needs that will respond to a marketing
action.
The first and most important point of customer segmentation theory is that there is no
point in spending money on marketing your product to certain people if
these people will not buy the product. You need to decide who your target
group is and then try hard to promote or market your product to that particular
group. This is customer segmentation. You would gain many more sales if you
To sum up, customer segmentation theory is about dividing the market into
smaller groups of customers and then marketing your product only to the groups that are
In the e-business world, online shopping has become the most popular trading
pattern in Nigeria. Statistics show that the national online retail sales reached RMB 10,632.4
billion in 2019. In such an online environment, customer purchase behaviors change
dynamically. An effective customer-oriented marketing strategy for predicting customer online
behaviors based on data mining is therefore much needed by sales enterprises. Data mining,
which can discover hidden, highly relevant information from huge amounts of online
transaction data, is the most suitable method for customer purchase behavior analysis.
In particular, in the current era of big data, machine learning is considered to have
broad application prospects across the industry. There have been many excellent
studies of data mining with wide industrial applications in the past twenty years. Chen
et al. (1996), Shaw et al. (2001), Chen et al. (2003) and Ngai et al. (2009) give comprehensive reviews of
data mining techniques and their industrial applications. The applications
include banking and finance, retail, telecommunications, and insurance. In the research of
Ngai et al. (2009), data mining tools were used to analyze customer data within a CRM
framework. ML can reveal useful information for analyzing customer behaviors and characteristics. It is
therefore of great significance to enterprises needing to acquire and retain potential customers,
helping them to maximize customer value and supporting their customer management and
market strategy decisions. Indeed, the application of data mining in the CRM domain is an emerging
trend in the era of the big data economy. One of the most widely used Machine
groups based on similarity (Chau, 2009). He based his research on a real-world data
management strategies by combining RFM and K-means methods. With online transaction data
collected from November 2019 to April 2022, I create a standardized dataset for further
analysis. On this basis, I use an RFM model and the K-means algorithm to conduct customer
segmentation and value analysis. A PCA method is then used to determine the weight of the
RFM indicators. Customers are classified into four groups based on their purchase behaviors. On this
basis, different CRM strategies are introduced to achieve a high level of customer
loyalty. Changes in some key performance indicators resulting from adoption of the strategy
proposed in this paper are given, including increases in total purchase volume and total
consumption amount, thereby showing the clear effectiveness of this method. The
remainder of the paper is organized as follows. Relevant research studies are reviewed
in Section 1. In Section 2, the methodology and the model used for the present study are
described. Results of empirical analyses are given in Sections 3-4. Section 5 concludes
The RFM model was first proposed by Hughes of the American Database Institute in 1994. As a
well-known tool for customer value analysis, it has been widely used for
measuring customer lifetime value (Cheng et al., 2009) and in customer segmentation and
behavior analysis (Chen et al., 2012). In the following paragraphs, I give a short description
of the RFM model in the above literature. RFM is short for recency, frequency, and monetary,
which refer to recency of the last purchase, purchase frequency, and monetary value of
purchase, respectively. R (recency) represents the time interval between a customer's last purchase
date and the end date of a statistical period. The shorter the interval, the greater the value of R. F
(frequency) indicates the number of purchases made by the customer during the statistical period. The
larger the value of F, the higher the customer loyalty and the stronger the intention to purchase
again. M (monetary) represents the total amount the customer spends on purchases during the
statistical period. In general, the higher the total purchase amount, the more loyal the customer. It can
studies show that the greater the value of R or F, the greater the
likelihood that the corresponding customer will conduct another transaction with the seller.
Moreover, the larger M is, the more likely the corresponding customer will buy products or
services from the seller again. While Hughes (1994) attached equal
importance to these three variables, Stone believed that the importance of the three variables varies
among industries because of their different characteristics, suggesting unequal weights for these
variables. RFM is widely used in customer value analysis, and researchers have extended
it according to different perspectives. Liu and Shih used an analytic hierarchy process
(AHP) to determine the weight of RFM variables, a clustering method to group customers, and an
association rule method to recommend products to customers in different groups. Cheng and Chen
combined RFM analysis with rough set theory to establish rules for customer classification
(Cheng et al., 2009). Chiang proposed an RFMDR model (based on an RFM/RFMD model), an
extended version of RFM analysis, to identify valuable online shopping customers for
the business and to generate fuzzy association rules. Kolarovszki et al. have proposed a novel
CRM design model useful in postal service companies. Song et al. proposed
a statistics-based approach to evaluate potential customers via time series. With
this approach, it is possible to segment time periods in a large-scale dataset. Considering the
fact that most RFM models are developed from a customer perspective rather than a product
one, Heldt et al. proposed an RFM per product (RFM/P) model. In this model, customer values for
all products are estimated separately first and then added together to obtain an overall customer
performed on this premise. Adnan Amin et al. studied the prediction of customer churn
in the telecom industry under different conditions by using rough set, classification, and
The K-means algorithm, one of the most popular clustering algorithms, was first used by
MacQueen in 1967, and it has been used widely in different fields including machine
learning, statistical data analysis, and other business applications. The literature shows that
one of the major applications of K-means is customer segmentation. The K-means algorithm is
widely used to effectively identify valuable customers and develop suitable marketing
strategies (Arunachalam et al, 2018). Specifically, Cheng and Chen used an RFM model and
K-means to perform customer relationship management, and experimental results show that the model
(2012) proposed a hybrid soft computing approach based on clustering, rule extraction, and
organizations. This approach was applied in two case studies in the fields of insurance
and telecom, respectively, aiming to predict potentially profitable leads and to outline the
most influential features available to customers during such prediction. With the RFM
applied data mining techniques, such as c-means, transfer learning, and multi-view learning in
brain CT, EEG image segmentation, and multi-view clustering research. Compared with other
clustering algorithms, the K-means algorithm is not only faster in computation but can
also reduce the misclassification rate of data. Therefore, we use the K-means
algorithm to cluster according to the R-F-M attributes. The accuracy of this algorithm depends on
the initialization conditions and the number of clusters. The popular elbow method is generally
used to determine the value of K. In the following section, we present our method step by step.
This section explains the proposed process of customer value analysis. The process
consists of the following four steps shown in section 1: (1) data preparation and
preprocessing; (2) standardization of the RFM model indices; (3) index weight analysis;
and (4) customer clustering by the K-means algorithm, where each dimension of customer data is
analyzed using the RFM model and the K-means algorithm to cluster target customers. The
research analysis process is presented step by step as follows. Step 1: data preprocessing. At
the outset, an original dataset for the empirical case study based on the RFM model
parameters is selected. The original dataset is then cleaned to remove outliers and erroneous
values and to produce an initial dataset. Then, by removing redundant attributes, the data is
transformed into a format that is easier and more efficient to process for customer value
analysis. Step 2: standardization of the RFM model indices. Given the large differences in the
value ranges of the three indicators of the RFM model, i.e., time since last purchase, purchase frequency, and
total purchase amount, and to eliminate the impact of numerical magnitudes on the classification results, the
min-max standardization method is used to normalize the data and obtain the
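As an illustration of the min-max step described above, the following minimal sketch rescales each RFM index to the range [0, 1]. The sample values and the column names `recency`, `frequency`, and `monetary` are assumptions made for this example; they are not data from the study.

```python
import pandas as pd

# Hypothetical RFM values for four customers (illustrative only)
rfm = pd.DataFrame({
    "recency": [5, 40, 12, 90],
    "frequency": [12, 3, 7, 1],
    "monetary": [2500.0, 300.0, 980.0, 45.0],
})


def min_max_normalize(df: pd.DataFrame) -> pd.DataFrame:
    """Rescale every column to [0, 1] using (x - min) / (max - min)."""
    return (df - df.min()) / (df.max() - df.min())


rfm_normalized = min_max_normalize(rfm)
print(rfm_normalized.round(3))
```

After this step the three indices share the same scale, so no single index dominates a distance-based method such as K-means merely because of its units.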
Authors: V. Vijilesh, A. Harini, M. Hari Dharshini, R. Priyadharshini et al 2021
Title: Customer segmentation using machine learning
Summary: K-Means is an unsupervised learning algorithm used for clustering tasks which works really well with complex datasets. It is an iterative algorithm that partitions the dataset into “k” pre-defined non-overlapping subgroups (clusters) where each data point belongs to only one group.
Remarks: Our dataset is limited to sales records, so we can use an RFM-based model for finding segments, where R is Recency (how recently a purchase happened), F is Frequency (how frequently transactions are made), and M is Monetary value (value of all transactions). The Recency, Frequency and Monetary score for each customer is calculated. The latest date is assigned as a placeholder to calculate recent purchases.

Authors: Patel Monil, Patel Darshan, Rana Jecky, Chauhan Vimarsh, Prof. B. R. Bhatt et al 2020
Title: Customer Segmentation using Machine Learning
Summary: Customer segmentation, also known as market segmentation, refers to the process of dividing a market into different buyers with different behaviours and characteristics [5]. Customer segmentation refers to a way of dividing according to different characteristics of consumer groups. This theory proposes to study and predict the future consumption trend of customers by segmenting customer information and consumption behaviour, as well as the profit market planning of enterprises.

Authors: Karin Kelley et al 2015
Title: Customer Relationship Management (CRM) in Information System
Summary: Customer Relationship Management (CRM) in information systems is one type of enterprise software, alongside Enterprise Resource Planning (ERP) and Supply Chain Management (SCM). Enterprise software is used in large organizations and is considered an essential part of a computer-based information system. It provides business-oriented tools such as online payment processing and automated billing systems. It is also referred to as enterprise application software.
CHAPTER THREE
SYSTEM ANALYSIS AND DESIGN
and examine data about a topic. In a research paper, this section allows the reader to
critically evaluate a study's overall validity and reliability. The following techniques
both a general term, used to refer to adaptive software development approaches, as well
as the name for James Martin's approach to rapid development. In general, RAD approaches to
software development put less emphasis on planning and more emphasis
on an adaptive process. Prototypes are often used in addition to or sometimes even
RAD is especially well suited for (although not limited to) developing software that is driven
development tools.
requirements and solutions evolve through the collaborative effort of self-organizing and
cross-functional teams and their customer(s)/end user(s). It advocates adaptive planning,
evolutionary development, early delivery, and continual improvement, and it encourages rapid
There is anecdotal evidence that adopting agile practices and values improves the agility of
each phase depends on the deliverables of the previous one and corresponds to a specialization of tasks.
development, it tends to be among the less iterative and flexible approaches, as
progress flows in largely one direction ("downwards" like a waterfall) through the
phases of conception, initiation, analysis, design, construction, testing, deployment and maintenance.
The waterfall development model originated in the manufacturing and construction industries, where the
highly structured physical environments meant that design changes became prohibitively
expensive much sooner in the development process. When first adopted for
software development, there were no recognized alternatives for knowledge-based creative
work.
adopted as it is well suited for developing software that encourages rapid and flexible response to
Fig 3.1: Agile Methodology (DevTeam.Space)
In analyzing the system of customer segmentation, the Galaxy Business Center (GBC) will be used. It
deals with the buying and selling of all kinds of electronic appliances. Information gathered by the business
owner could come from a regular customer or from potential customers for whom a strategy is being
contemplated or planned. Information of this nature could reach the customer's data record in any of the
following ways:
Such information is brought to the knowledge of the company in the form of complaints or suggestions from
customers; the customer care department might have gathered the customer's data through one-on-one
contact with the customers. The company, having received the information, sorts the customer's data
into groups and treats them separately, so that a particular unit handles a particular group of customer
complaints. In cases where a customer is not satisfied, adjustments are made to resolve
the issue and serve the customer better. The segmentation policies used will be based on observation and
practical business sense; there is no evidence of formal market research. Research would have been less
useful than it is today. The trade and its markets were small, thus they gained the first-hand experience
with consumers which their present-day counterparts, isolated in large bureaucracies, must experience
Unlike modern marketers, however, they did not actively look for new market segments. Like other businessmen
of the time, they assumed their markets to be fixed in size, and believed that vigorous marketing would steal
the rightful market shares of their fellow businessmen. Existing segmentation practice reinforced the
prevalent passivity. The habit of thinking in terms of small segments contributed to an under-serving of
customers' needs, which clogged the trade and made the company lose its customers as they fail to
During this project research work, the data needed for the project was gathered from various
sources. In gathering the data and information needed for the system
analysis, two major fact-finding techniques were used in this work, and they are:
(a) Primary Source: This refers to the source of original data, for which the
researcher made use of empirical approaches such as personal interviews and questionnaires
(b) Secondary Source: The secondary data were obtained by the researcher from magazines,
journals, newspapers, library sources and internet downloads. The data collected from this
3.4 System Investigation
In order for the researcher to fully ascertain the extent to which the study will be beneficial to
Market Square Choba and other companies, the researcher had to ascertain the existing system of
The researcher found that market segmentation is more or less obsolete in the public sector,
whereas most Nigerian business owners and organizations use the bookkeeping method to keep
track of customer details, which is not only stressful but also unreliable.
Data analysis is a process of inspecting, cleansing, transforming and modeling data with the goal
analysis has multiple facets and approaches, encompassing diverse techniques under a variety of
names, and is used in different business, science, and social science domains. In today's business
world, data analysis plays a role in making decisions more scientific and helping businesses
Data Analysis refers to breaking a whole into its separate components for individual
examination. Data analysis is a process for obtaining raw data and converting it into information
useful for decision-making by users. Data are collected and analyzed to answer questions, test
For the purpose of this study, the data expected for this study on complaints
3.7 Existing System
Most businesses have not been in a position to satisfy all of their customers every time. It
proves difficult to meet the exact requirements of every individual customer. Business
More specifically, organizations must be able to ingest and analyze historical
customer behavior to establish future "predicted" behavior and create the best segmentation model
possible. However, this is not realistic in many organizations, as they depend on simple methods
for gathering and analyzing information about their customers. This has never been effective,
nor has it helped organizations achieve the aim of reaching a targeted customer. Some
marketers use bookkeeping to keep the records or information of their customers, for which
data entry becomes very tedious in the end; this process can take weeks or
months before the business realizes it has lost its customers.
[Flowchart nodes: Start; Clean Data; Data Modeling; decision "Complete" (NO: Reprocess Modeling); Data visualization; Stop]
3.7.1 Analysis of The Existing System
1. Time Consuming: In the current system, information about the customer is stored in
record books. Whenever information about a particular customer is required, or when some
changes are required in the record, one has to search through many record books
manually. Likewise, the current system requires a great deal of time to accomplish the
task of filing a complaint report, and other updating or changing of the existing data.
at many places; for example, in various records. Likewise, the complaint forms and the
other relevant information about customers also produce redundant data.
record, the marketers need to make changes in many documents. To remove or change
the data, they must change it in every place where it is
kept.
4. Storage Media: For handling data and related information, several records are
used, i.e. the same data is stored at multiple locations, which wastes a
5. Information Updating: It is a fact that with the passage of time, old
information needs modification. The process of modification, updating and addition of new data
6. Backup and Recovery: In the current manual system, there is always a risk of
7. Integrity: It has been shown that the manual process for gathering and storing
8. Burden of Work: In the Market office all of the paperwork is done by the
The proposed system permits marketers to use data mining techniques to observe essential
and vital information. It permits organizations to better cluster customers and set more
accurate thresholds for grouping similarly behaving customers. Finally, this system will close
the gap left by the existing system, which does not capture the various complaints of customers.
The proposed system, named customer segmentation using machine learning, is not only
applicable to Market Square Choba but also to private organizations/companies, as the
proposed system will enable business owners to carry out market segmentation and
also have an idea of the type of customer they are dealing with.
Fig 3.3: Flowchart of the Proposed System
[Flowchart nodes: Start; Input; Bisecting K-Means algorithm; Training the Machine Learning Model; Clustering category of algorithms; Stop]
CHAPTER FOUR
SYSTEM IMPLEMENTATION
phase. After design phase we can reduce the time required to create the implementation.
The research project Customer segmentation allows admin to track down the customers’
of the language that are suitable for the problem at hand. The important factors to be considered in
the selection of a programming language include the target operating system and the
Notebook IDE.
4.1.1. Python
Python is dynamically typed and garbage-collected. It supports multiple programming
standard library.
Faster: It is often considered faster to develop in than other scripting languages, e.g. ASP and PHP.
Open Source: Open source means you do not need to pay to use Python; you can free
Platform Independent: Python code will run on every major platform: Linux, Unix, Mac OS X,
Windows.
Case Sensitive: Python is a case-sensitive language, including in variable names. In
Python, all keywords (e.g. if, else, while, etc.), classes, functions, and user-defined
Error Reporting: Python has built-in exception classes and a warnings mechanism to generate a warning
or error notice.
Real-Time Access Monitoring: Python provides access logging by creating the summary of
Dynamically Typed Language: Python supports variable usage without declaring a data type. The
type is determined at execution time, based on the value assigned to the variable.
The Jupyter Notebook is the original web application for creating and sharing computational
Language of choice: Jupyter supports over 40 programming languages, including Python, R,
Share notebooks: Notebooks can be shared with others using email, Dropbox, GitHub and the
Interactive output: Your code can produce rich, interactive output: HTML, images, videos,
Big data integration: Leverage big data tools, such as Apache Spark, from Python, R, and
Scala. Explore that same data with pandas, scikit-learn, ggplot2, and TensorFlow.
Error Handling: Error handling refers to the response and recovery procedures from error
communication errors. Error handling helps in maintaining the normal flow of program
execution. In fact, many applications face numerous design challenges when considering error-
handling techniques.
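The idea of keeping the normal flow of execution when something goes wrong can be sketched as follows. This is a minimal illustration, not the project's actual loading routine: the helper name `load_transactions` and the file name are hypothetical, and the `unicode_escape` encoding simply mirrors the one used in the program listing in Appendix III.

```python
import pandas as pd


def load_transactions(path: str) -> pd.DataFrame:
    """Return the transactions stored at `path`, or an empty frame if loading fails."""
    try:
        return pd.read_csv(path, encoding="unicode_escape")
    except FileNotFoundError:
        # The file does not exist: report it instead of stopping the program
        print(f"File not found: {path}")
    except pd.errors.ParserError as exc:
        # The file exists but is not valid CSV
        print(f"Could not parse {path}: {exc}")
    return pd.DataFrame()


# Hypothetical file name; a missing file is reported and execution continues
transactions = load_transactions("missing_file.csv")
print(transactions.empty)
```

Returning an empty data frame lets later steps check `transactions.empty` and decide how to proceed, rather than crashing at the first failed read.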
Not all customers are profitable, and some customers are much more profitable than others. For
instance, it is reported that in the e-commerce industry, 20 percent of the customers account for
80 percent of the purchases. This indicates that a minority of customers in the e-commerce market
represent the majority of value. Companies, hence, need to segment their customers in terms of
their profitability, so that they can focus on the small number of most profitable customers that
contribute to their major profit pools. Customer segmentation is employed across many
industries. A typical example is the retail industry. The retail industry is one of the oldest
industries, dating back to when the notion of trade was invented. Retailers perform the task of middleman and
have served consumers from the barter economy to the new tech-based economy. As
competition in the retailing sector intensifies, retailers now require their own marketing
strategies to retain existing customers and acquire new customers to remain ahead of the
competition. In a recent survey, most retailers are shown to base their strategies on special
services to enhance customer loyalty. However, the development of new products and services
should be based on a better understanding of the customer base. One of the most useful tools for
know precisely where they have to concentrate their efforts. One of the major strategies
recommended to retailers by the Retail Council of Nigeria is to focus their efforts on niche
markets and special customers. Customer segmentation, hence, is a crucial element of retail
strategy. Without accurate segmentation of customers in the light of their profitability, strategic
decision makers are not able to gather the correct information they need to evaluate and execute
companies, given that they often lack the expertise and specific utilities to make sense of the vast
volumes of customer data that exist throughout the business. Besides the need for expertise and
1. Understanding segmentation objectives: Each customer segmentation task has
segmentation objectives (e.g. maximize profit, minimize churn) that serve the business
needs. The understanding of business needs and segmentation objectives is the first step
of a segmentation procedure.
2. Deciding what data should be collected and where it can be collected: Customer data is
available throughout the enterprise and stored in various databases. Some data are
valuable for segmentation whereas some are not. Hence it is necessary to consider what
3. Integrating and cleaning collected data: The data collected from various databases is
frequently inconsistent. Some data may also miss values in certain fields. Hence the
4. Deciding on the methods and technologies used for segmenting the data: e.g. statistical
methods, online analytical processing (OLAP), and data mining, can be used for
segmentation. Each method or technology has its own advantages and disadvantages.
segmentation operation.
5. Implementing the applications and tools for segmentation: After the segmentation method
has been chosen, the corresponding applications and tools, which implement the chosen
Fig 4.1: Flow chart of the System
Output is the information obtained from processing data that has been fed into the computer.
The input of the system includes the list of resource materials, using the Market
Square data set that contains transaction information from around 4,000 customers.
A Python IDE was installed on the device before running the program to import and display
the data set. Jupyter Notebook was used to easily run the code and display visualizations at each
step.
The following libraries must be installed: NumPy, pandas, Matplotlib, Seaborn,
Because of the bulkiness of the data set, only the head of the data is displayed, which is
shown below:
The data frame consists of 7 variables:
With the transaction data above, we build different customer segments based on each user’s
purchase behavior.
The raw data we collected from Market square is complex and in a format that cannot be easily
The informative features in this dataset that tell us about customer buying behavior include
“Quantity”, “InvoiceDate” and “UnitPrice.” Using these variables, we are going to derive a
Monetary Value: How much money do they spend on average when making purchases?
With the variables in this e-commerce transaction dataset, we will calculate each customer’s
recency, frequency, and monetary value. These RFM values will then be used to build the
segmentation model.
4.4.2 Recency
Let’s start by calculating recency. To identify a customer’s recency, we pinpoint when each user
was last seen making a purchase. In the dataframe we just created, we only kept rows with the
most recent date for each customer. We now need to rank every customer based on what time
For example, if customer X was last seen acquiring an item 3 months ago and customer Y did the
same 2 days ago, customer Y must be assigned a higher recency score. The dataframe now has a
new column called “recency” that tells us when each customer last bought something from the
platform:
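A minimal sketch of such a recency calculation is given below. The sample transactions are hypothetical, and the scoring choice (days between each customer's last purchase and the earliest last-purchase date in the data, so that more recent customers get higher values) is an assumption chosen to match the example of customers X and Y above.

```python
import pandas as pd

# Hypothetical transactions (illustrative only)
transactions = pd.DataFrame({
    "CustomerID": [1, 1, 2, 3, 3],
    "InvoiceDate": ["2022-01-05", "2022-03-01", "2022-02-10",
                    "2022-01-20", "2022-03-28"],
})
transactions["Date"] = pd.to_datetime(transactions["InvoiceDate"])

# Most recent purchase date for each customer
last_seen = transactions.groupby("CustomerID")["Date"].max().reset_index()

# Days between each customer's last purchase and the earliest "last purchase"
# in the data; a higher value means the customer was seen more recently
last_seen["recency"] = (last_seen["Date"] - last_seen["Date"].min()).dt.days
print(last_seen)
```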
4.4.3 Frequency
Next, we calculate frequency, that is, how many times each customer has made a purchase on the
platform:
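One way to obtain such a count is sketched below with hypothetical sample rows. It counts transaction rows per customer; if a single invoice can span several rows, counting unique invoices instead may be more appropriate.

```python
import pandas as pd

# Hypothetical transactions (illustrative only)
transactions = pd.DataFrame({
    "CustomerID": [1, 1, 2, 3, 3, 3],
    "InvoiceDate": ["2022-01-05", "2022-03-01", "2022-02-10",
                    "2022-01-20", "2022-02-14", "2022-03-28"],
})

# Number of purchase records per customer
customers_freq = (
    transactions.groupby("CustomerID")["InvoiceDate"]
    .count()
    .reset_index(name="frequency")
)
print(customers_freq)
```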
The new data frame we created consists of two columns: “CustomerID” and “frequency.” Let’s
spent on the platform. The new data frame we created consists of each CustomerID and its
associated monetary value. Let’s merge this with the main data frame. Now, let’s select only the
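The monetary value step and the merge can be sketched as follows. The sample rows and the pre-computed `rec_freq` values are hypothetical, and monetary value is taken here as the total amount spent per customer, which is an assumption about the intended definition.

```python
import pandas as pd

# Hypothetical transactions (illustrative only)
transactions = pd.DataFrame({
    "CustomerID": [1, 1, 2, 3],
    "Quantity": [2, 1, 10, 3],
    "UnitPrice": [5.0, 20.0, 1.5, 4.0],
})

# Amount spent on each transaction row, then summed per customer
transactions["total"] = transactions["Quantity"] * transactions["UnitPrice"]
monetary = (
    transactions.groupby("CustomerID")["total"]
    .sum()
    .reset_index(name="monetary_value")
)

# Hypothetical recency and frequency values computed in earlier steps
rec_freq = pd.DataFrame({
    "CustomerID": [1, 2, 3],
    "recency": [19, 0, 46],
    "frequency": [2, 1, 1],
})

# One row per customer with all three RFM variables
rfm = rec_freq.merge(monetary, on="CustomerID")
print(rfm)
```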
4.5 Removing Outliers
We have successfully derived three meaningful variables from the raw, uninterpretable
Before building the customer segmentation model, we first need to check the dataframe for
To get a visual representation of outliers in the data frame, let’s create a boxplot of each variable:
Fig. 4.3: Frequency
Observe that “recency” is the only variable with no visible outliers. “Frequency” and
“monetary_value”, on the other hand, have many outliers that must be removed before we
To identify outliers, we will compute a measurement called a Z-Score. Z-Scores tell us how far
away from the mean a data point is. A Z-Score of 3, for instance, means that a value is 3 standard
Looking at the head of the dataframe again, we notice that a few extreme values have been
removed:
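The Z-Score filter described in this section can be sketched as follows. The data is synthetic and the cut-off of 3 standard deviations is the conventional choice assumed here; the thresholds used in the project itself are not shown in the text.

```python
import numpy as np
import pandas as pd

# Synthetic RFM table: 50 ordinary customers plus one extreme spender
rfm = pd.DataFrame({
    "recency": np.arange(51),
    "frequency": [(i % 5) + 1 for i in range(51)],
    "monetary_value": list(np.linspace(80, 120, 50)) + [1000.0],
})

# Z-Score: distance of each value from its column mean, in standard deviations
z_scores = (rfm - rfm.mean()) / rfm.std(ddof=0)

# Keep only rows where every variable lies within 3 standard deviations
within_limits = (z_scores.abs() < 3).all(axis=1)
cleaned = rfm[within_limits]

print(len(rfm), "rows before,", len(cleaned), "rows after")
```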
4.6 Standardization
We run lines of code to scale the dataset’s values so that each variable has a mean of 0 and a standard deviation of 1:
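A minimal sketch of this scaling step, assuming scikit-learn's `StandardScaler` is used (the text does not name the scaler) and using hypothetical RFM values:

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical outlier-free RFM values (illustrative only)
rfm = pd.DataFrame({
    "recency": [19, 0, 46, 30, 12],
    "frequency": [2, 1, 3, 5, 4],
    "monetary_value": [30.0, 15.0, 12.0, 80.0, 55.0],
})

# Subtract each column's mean and divide by its standard deviation
scaler = StandardScaler()
scaled_features = pd.DataFrame(scaler.fit_transform(rfm), columns=rfm.columns)
print(scaled_features.round(3))
```

Note that this rescales the values but does not change the shape of their distribution.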
Looking at the head of the standardized data frame:
We have now completed the data preparation stage and can finally start building the
segmentation model.
As mentioned above, we are going to use the K-Means clustering algorithm to perform customer
segmentation.
The goal of a K-Means clustering model is to segment all the data available into non-overlapping
Here is a simple visual representation of how K-Means clustering groups a dataset into different
segments
When building a clustering model, we need to decide how many segments we want to group the
We create a loop and run the K-Means algorithm from 1 to 10 clusters. Then, we plot the model results for
this range of values and select the elbow of the curve as the number of clusters to use.
The “elbow” of this graph is the point where the curve bends most sharply and the decrease in inertia
levels off; in this case it is at the 4-cluster mark.
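The loop described above can be sketched as follows. Synthetic data from `make_blobs` stands in for the standardized RFM features, so the inertia values printed here are not the project's results; `n_init` and `random_state` are set only to make the sketch repeatable.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Synthetic stand-in for the standardized RFM features (4 true groups)
features, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6,
                         random_state=42)

# Fit K-Means for 1 to 10 clusters and record the inertia (SSE) of each model
cluster_range = range(1, 11)
sse = []
for k in cluster_range:
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=42)
    model.fit(features)
    sse.append(model.inertia_)

for k, value in zip(cluster_range, sse):
    print(k, round(value, 1))
```

Plotting `sse` against `cluster_range` gives the elbow curve; the number of clusters is read off where the drop in inertia flattens.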
This means that the optimal number of clusters to use in this K-Means algorithm is 4. Let’s now
To evaluate the performance of this model, we will use a metric called the silhouette score. This
is a coefficient value that ranges from -1 to +1. A higher silhouette score is indicative of a better
model. The silhouette coefficient of this model is 0.44, indicating reasonable cluster separation.
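The silhouette score of a fitted model can be computed as in the sketch below. It again uses synthetic data in place of the project's features, so the printed score will differ from the 0.44 reported above.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic stand-in for the standardized RFM features
features, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.6,
                         random_state=42)

# Fit a 4-cluster model, then score how well separated its clusters are
kmeans = KMeans(n_clusters=4, init="k-means++", n_init=10, random_state=42)
kmeans.fit(features)
score = silhouette_score(features, kmeans.labels_, metric="euclidean")
print(round(score, 2))
```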
Having built our segmentation model, we assign a cluster to each customer in the dataset.
We then visualize the data to identify the distinct traits of customers in each segment.
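Besides charts, a simple per-cluster summary helps in reading the segments. The sketch below averages each RFM variable by cluster; the customers and cluster labels are hypothetical.

```python
import pandas as pd

# Hypothetical customers with an assigned cluster label
frame = pd.DataFrame({
    "recency": [10, 20, 300, 320],
    "frequency": [2, 4, 30, 50],
    "monetary_value": [100.0, 300.0, 5000.0, 7000.0],
    "cluster": [0, 0, 1, 1],
})

# Average recency, frequency and monetary value within each cluster
profile = frame.groupby("cluster")[["recency", "frequency", "monetary_value"]].mean()
print(profile)
```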
[Figures: Recency cluster; Frequency cluster; Monetary value cluster]
By looking at the charts above, we identified the following attributes of customers in each
segment
2 These customers are seen making purchases often and have visited the platform recently.
Their monetary value is extremely high, indicating that they spend a lot when shopping
online. This could mean that users in this segment are likely to make multiple purchases in a
single order and are highly responsive to cross-selling and up-selling. Resellers who
purchase products in bulk could also be part of this segment.
3 Customers in this segment have been seen making purchases very frequently in the past.
However, these are people who have stopped visiting the platform for some reason and
haven’t been seen shopping on the site recently. This could mean several things — they were
disappointed with the service and switched to a competitor platform, they no longer have
any interest in the products sold, or their customer ID changed as they re-registered onto the
platform with different credentials.
4 This cluster consists of users who are new to the platform. They have the potential to
become long-term consumers with high frequency and monetary value and should be
targeted with special “new-user promotions” to instill brand loyalty.
Real-world customer segmentation projects will require you to come up with actionable insights
that the marketing team can use to improve sales, just like we did above.
CHAPTER FIVE
It is very important that Market Square and other business organizations know their
customers’ behavior through customer segmentation and analysis. This will help the organization
to know whether to put more effort into promotion, introducing a new brand, or repackaging to
The reliability and efficiency of this project correct the weaknesses found in the existing
method of customer segmentation. The achievements recorded by this design can be summarized
as follows.
1. The design provides prompt and accurate customer behavior information to the organization as and
when due. With this, the business organization can evaluate the behavior of its customers
2. Improved customer retention, from sending customer retention emails to running targeted
customers.
5.2 Conclusion
The process of customer segmentation ensures that your brand is customer-centric and helps you
serve them better. It boosts conversions, brings your marketing efforts to fruition, and also helps
build everlasting customer relationships. The strategies discussed here will help you organize your
segments, but after you have them in place, continue to monitor and make sure your product is
still valuable to the groups. The key to successful customer segmentation is the constant research
it entails to ensure your brand and product stay relevant and indispensable.
5.3 Recommendations
Efforts have been made to design and develop software that implements customer
segmentation using a machine learning algorithm. However, there are still areas that may be
considered for further research. Some of the recommendations are listed below:
1. There is a need for the development of a customer satisfaction system in order to measure
expectations.
REFERENCES
Aaker, J. L., Anne M. Brumbaugh, and Sonya A. Grier. (2000). Customers in the selected market
are segmented into different groups based on their characteristics. International Journal
of Scientific Research in Science and Technology IJSRST. Retrieved from
https://www.gsb.stanford.edu/faculty-research/faculty/jennifer-aaker on 4/06/2022
Bhade, K., et al (2018) and
Blanchard, et al (2019). utilization of data mining methods: explain the need of organization to
segment or group their customer’s base on their traits.
Blanchard, P. A. (2019). History of customer segmentation
Chen, D. Sain, S. L. (2012),
Cheng, C. H. and Chen, Y. S. (2009). The RFM model was first proposed by Hughes of the
American Database Institute in 1994
Emir, L. et al (2012). Information handling and constant forecasts: In this the framework
consumes more data and makes expectations detached levels. EUR-Lex - 32012R0648 -
EN - EUR-Lex - Europa.
Gnanaraj et al (2007). Customer segmentation, otherwise called market segmentation: Defines
customer segmentation as a method of analyzing a customer base and grouping
customers into categories or segments which share particular attributes
Hoegele, D. Schmidt, S. L. and Torgler, B. (2016). Demographic segmentation variables:
Variables are commonly used to divide the customer into smaller segments.
https://www.semanticscholar.org/author/Daniel-Hoegele/97637075
Huang S, Wang Q, School B. (2014). Use of Customer Segmentation:
Kotler and Keller (2009)
Kotler et al (2009). The use of scientific methods to distinguish these sections: Market portions
can be portrayed in various ways on method for describing the inclinations of the
objective customers.
Lacie, L. (2015). Data extraction: Machine Learning idea is extremely valuable in Data
extraction. https://pubmed.ncbi.nlm.nih.gov/26073888
Mesforoush, A. and Tarokh, M. J. (2018).
Narges, R., (2014). Utilizations of machine learning. Machine Learning in Customer
Segmentation Series-Part 1: Story of Customers Data.
Puwanenthiren, et al (2012) utilization of data mining methods: explain the need of organization
to segment or group their customer’s base on their traits.
Riyaj, S. et al (2010). Calculations work on exhibition by investigation.
https://www.gulftalent.com/people/riyaz-shaikh-8492391
Smith (1956). Definition of customer segmentation. Hoboken, New Jersey: Wiley.
Thomas et al (2007).
Yizhang, J. et al (2019). RFM model and K-means algorithm. Business International Journal.
Published 2 March 2013
APPENDIX I
ALGORITHM OF EACH MODULE
[Flowchart nodes: Start; Customer segmentation; decision "When is customer’s last purchase?" (YES: B); decision "What is customer behavior like?" (YES: C); decision "Data decide" (YES: D); Prescribe]
DATA MODULE
(Gathering data, cleaning data, process data)
[Flowchart nodes: Gathering data; Cleaning data; decision "Customer’s behavior" (NO: Discarded); Stop]
SUPPORT SERVICES MODULE
[Flowchart nodes: Wants Support!; decision "Is support for item complete?" (NO: Eliminate support); Stop]
APPENDIX II
PROCEDURAL CHART/DESIGN
[Chart nodes: Start; Data collection; Data cleaning; Data analysis; Data visualization; Data processing; Customer’s bio]
APPENDIX III
THE PROGRAM
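import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Load the transaction data set and preview the first rows
customers = pd.read_csv(r'C:\star\Online Retail.csv', encoding='unicode_escape')
customers.head()

# Frequency: number of purchase records per customer.
# customers_rec is the recency data frame from Section 4.4.2; its
# construction is not reproduced in this listing.
freq = customers_rec.groupby('CustomerID')['Date'].count()
customers_freq = pd.DataFrame(freq).reset_index()
customers_freq.columns = ['CustomerID', 'frequency']
rec_freq = customers_freq.merge(customers_rec, on='CustomerID')

# Monetary value: total amount spent per customer.
# Both factors are taken from rec_freq so that the rows stay aligned.
rec_freq['total'] = rec_freq['Quantity'] * rec_freq['UnitPrice']
m = rec_freq.groupby('CustomerID')['total'].sum()
m = pd.DataFrame(m).reset_index()
m.columns = ['CustomerID', 'monetary_value']

# Combine recency, frequency and monetary value into one data frame
rfm = m.merge(rec_freq, on='CustomerID')
finaldf = rfm[['CustomerID', 'recency', 'frequency', 'monetary_value']]

# Boxplot of each RFM variable to reveal outliers
list1 = ['recency', 'frequency', 'monetary_value']
for i in list1:
    print(str(i) + ': ')
    ax = sns.boxplot(x=finaldf[str(i)])
    plt.show()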
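from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.decomposition import PCA
from mpl_toolkits.mplot3d import Axes3D

# Elbow method: fit K-Means for 1 to 10 clusters and record the inertia (SSE).
# scaled_features is assumed to hold the standardized RFM values from Section 4.6.
SSE = []
for cluster in range(1, 11):
    kmeans = KMeans(n_clusters=cluster, init='k-means++')
    kmeans.fit(scaled_features)
    SSE.append(kmeans.inertia_)

# converting the results into a dataframe and plotting them
frame = pd.DataFrame({'Cluster': range(1, 11), 'SSE': SSE})
plt.figure(figsize=(12, 6))
plt.plot(frame['Cluster'], frame['SSE'], marker='o')
plt.xlabel('Number of clusters')
plt.ylabel('Inertia')
plt.show()

# Refit with the 4 clusters chosen from the elbow plot before scoring
kmeans = KMeans(n_clusters=4, init='k-means++')
kmeans.fit(scaled_features)
print(silhouette_score(scaled_features, kmeans.labels_, metric='euclidean'))

# Attach the predicted cluster to each customer.
# new_customers is assumed to be the outlier-free data frame from Section 4.5.
pred = kmeans.predict(scaled_features)
frame = pd.DataFrame(new_customers)
frame['cluster'] = pred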