Mid Semester Project Review UditSoni

COMPARATIVE STUDY ON CLASSIFICATION
ALGORITHMS ON
CUSTOMER CHURN PREDICTION
Seminar Presentation by
Udit Soni
Roll No: 22MAC2R30
(SEMESTER 4)
under the supervision of
Prof. Deepika Neela

Department of Mathematics
National Institute of Technology Warangal
Telangana, India-506004
CONTENT
• Introduction of Project
• Classification Algorithms in Machine Learning
• Introduction to Customer Churn Dataset
• Performing Analysis on the dataset
• Logistic Regression
• Conclusion
• References
INTRODUCTION
• Customer churn prediction means detecting which customers are likely to leave
a service or cancel a subscription.
• Critical prediction for many businesses
• The telecommunications business has an annual churn rate of 15-25 percent in
this highly competitive market
• Can we reduce customer churn?
CLASSIFICATION ALGORITHMS
• Used to predict the class of a data instance based on its input features
• Aims to learn a mapping between the input features and the output class
labels, which can be binary (e.g., yes/no)
• Three important classification algorithms
• Logistic Regression
• Random Forest
• Support Vector Machine
CUSTOMER CHURN DATASET
• The dataset's categorical column values provide insights into various aspects of customer behavior and
preferences, facilitating exploratory data analysis and predictive modeling:
• Gender: Categorized as 'Female' or 'Male'
• Partner: Indicating whether the customer has a partner ('Yes' or 'No')
• Dependents: Reflecting the presence of dependents ('Yes' or 'No')
• PhoneService: Specifies if the customer has phone service ('Yes' or 'No')
• MultipleLines: Indicates whether the customer has multiple lines ('Yes', 'No', or 'No phone service')
• InternetService: Describes the type of internet service subscribed ('DSL', 'Fiber optic', or 'No')
• OnlineSecurity, OnlineBackup, DeviceProtection, TechSupport, StreamingTV, StreamingMovies: Representing
various additional services with values 'Yes', 'No', or 'No internet service'
• Contract: Specifies the contract type ('Month-to-month', 'One year', or 'Two year')
• PaperlessBilling: Reflects the preference for paperless billing ('Yes' or 'No')
• PaymentMethod: Specifies the payment method chosen by the customer ('Electronic check', 'Mailed check',
'Bank transfer (automatic)', or 'Credit card (automatic)')
• The customer churn dataset comprises 21 columns, including the target
variable 'Churn'; excluding the 'ID' column leaves 20 columns
• The dataset's dimensions are then (7043, 20), where '7043' represents the
number of customer records.
• Ensuring data integrity and completeness is crucial for building reliable
predictive models.
PAIR PLOT
• A pair plot is a useful tool for visualizing relationships
and patterns in multivariate data
• It allows us to quickly identify trends, correlations, and
potential outliers in our dataset
• Each scatterplot in the pair plot represents the
relationship between two variables, making it easy to
understand the pairwise interactions within our data
• It shows a histogram for each individual variable and a
scatter plot for each pair of variables

PERFORMING ANALYSIS ON THE
DATASET
• Visualization using Plots
• Data Label Encoding
• Data Scaling – Normalization
VISUALIZATION USING PLOTS
• Visualization plays a crucial role in understanding and interpreting the
customer churn dataset.
• Through plots, we can gain insights into the distribution, relationships, and
patterns present within the data.
• Histograms, pie charts, and pair plots are used to visualize different
attributes of the dataset.
• Customers with higher monthly charges are more likely to churn
• New customers are more likely to churn
• The fraction of senior citizens among the customers is very low, and senior
citizens churn at a higher rate than other customers.
• 26.6% of customers switched to another firm.
• Customers are 49.5% female and 50.5% male.
• Most customers who moved out were paying by electronic check.
• Customers who paid by credit card (automatic), bank transfer (automatic), or
mailed check were less likely to move out
• About 75% of customers with month-to-month contracts opted to move out, as
compared to 13% of customers with one-year contracts and 3% with two-year
contracts.
• Customers without dependents are more likely to churn
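The per-category observations above come down to computing a churn rate for each value of a column. A minimal sketch of that calculation, in plain Python, is given here. The function name `churn_rate_by` and the toy records are assumptions made for illustration; the records are not taken from the real dataset and the resulting rates are not the project's figures.

```python
from collections import defaultdict


def churn_rate_by(records, key):
    """Return {category: fraction of rows in that category with Churn == 'Yes'}."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in records:
        totals[row[key]] += 1
        if row["Churn"] == "Yes":
            churned[row[key]] += 1
    return {category: churned[category] / totals[category] for category in totals}


# Hypothetical toy records (illustration only, not real data).
toy_records = [
    {"Contract": "Month-to-month", "Churn": "Yes"},
    {"Contract": "Month-to-month", "Churn": "No"},
    {"Contract": "Month-to-month", "Churn": "Yes"},
    {"Contract": "One year", "Churn": "No"},
    {"Contract": "One year", "Churn": "No"},
    {"Contract": "Two year", "Churn": "No"},
]

rates = churn_rate_by(toy_records, "Contract")
for contract_type, rate in rates.items():
    print(f"{contract_type}: {rate:.1%} churned")
```

The same function can be applied to any categorical column (for example 'PaymentMethod' or 'Dependents') by changing the `key` argument.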
DATA LABEL ENCODING
• Data label encoding is a crucial preprocessing step that involves converting
categorical variables into numerical form.
• Categorical variables represent attributes with a limited number of distinct
values, such as subscription types, payment methods, or demographic
categories.
• The label encoding technique assigns a unique numerical value to each
category within a categorical variable.
• This transformation enables machine learning algorithms to process and
analyze categorical data, as many algorithms require numerical input.
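The label encoding step described above can be sketched in a few lines of plain Python. The helper names `fit_label_encoder` and `transform` are assumptions for illustration, and assigning codes in sorted order of the categories is a choice made here so that the mapping is deterministic. Note that the integer codes imply an ordering between categories that may not be meaningful.

```python
def fit_label_encoder(values):
    """Map each distinct category to an integer code, in sorted category order."""
    return {category: code for code, category in enumerate(sorted(set(values)))}


def transform(values, mapping):
    """Replace every category by its integer code."""
    return [mapping[value] for value in values]


# Example column using the 'Contract' categories listed in the dataset slide.
contract = ["Month-to-month", "Two year", "One year", "Month-to-month"]

mapping = fit_label_encoder(contract)
encoded = transform(contract, mapping)

print(mapping)  # {'Month-to-month': 0, 'One year': 1, 'Two year': 2}
print(encoded)  # [0, 2, 1, 0]
```

In practice a library encoder such as scikit-learn's `LabelEncoder` would normally be used; the sketch only makes the mapping explicit.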
DATA SCALING
• Data scaling is a crucial preprocessing step aimed at standardizing the range
and distribution of features to ensure consistent model performance.
• Normalization (Min-Max Scaling) is one of the key techniques used for scaling
data, and it involves transforming feature values to fit within a specified range,
typically [0, 1].
• $X_{\text{scaled}} = \frac{X - X_{\min}}{X_{\max} - X_{\min}}$
• Min-max scaling ensures that feature values are proportionally adjusted to fit
within the specified range
• Z-score standardization is another commonly used scaling technique that
transforms feature values to have a mean of zero and a standard deviation of
one. It shifts and rescales the values but does not change the shape of their
distribution.
• $X_{\text{scaled}} = \frac{X - \text{Mean}(X)}{\text{StandardDeviation}(X)}$
• Z-score standardization ensures that feature values are centered around zero
and measured on a common scale, in units of standard deviations
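Both scaling formulas above can be checked with a short plain-Python sketch. The function names and the sample values of `monthly_charges` are assumptions for illustration only. The population standard deviation is used here, and a constant column is mapped to zeros to avoid division by zero; both are design choices of this sketch.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lowest, highest = min(values), max(values)
    if highest == lowest:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lowest) / (highest - lowest) for v in values]


def z_score_scale(values):
    """Shift to mean 0 and rescale to (population) standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical feature values (illustration only).
monthly_charges = [20.0, 50.0, 80.0, 110.0]

normalized = min_max_scale(monthly_charges)
standardized = z_score_scale(monthly_charges)

print(normalized)
print(standardized)
```

After min-max scaling the values lie in [0, 1]; after standardization they have mean 0 and standard deviation 1, but their relative spacing is unchanged.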
LOGISTIC REGRESSION
• Logistic Regression is a type of statistical classification algorithm.
• It is used for Binary Classification tasks
• The algorithm establishes the relationship between the input features and the
probability of the binary output variable.
Graph of the Sigmoid Function

For the mathematical intuition behind Logistic Regression, let us consider the general equation of Multiple Linear Regression:
$y(x_1, x_2, x_3, \ldots, x_n) = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + \ldots + b_n x_n$
This equation can produce any real value, including values below 0 or above 1, so its output cannot be interpreted directly as a
probability. A function that maps the linear output into the interval between 0 and 1 is therefore required.
Let E be an event with probability P(E) = p, so that its complement E' has probability P(E') = 1 − p. The odds of the event are
$\text{odds} = \frac{P(E)}{P(E')} = \frac{p}{1 - p}$
The odds range over [0, ∞). Taking the logarithm maps this range onto (−∞, ∞), which matches the range of a linear function.
This gives the logit function:
$\text{logit}(p) = \log(\text{odds}) = \log\left(\frac{p}{1 - p}\right)$
Modelling the logit as a linear function of the input X:
$\text{logit}(p) = a + bX$
$\frac{p}{1 - p} = e^{a + bX}$
$p = \frac{e^{a + bX}}{1 + e^{a + bX}} = \frac{1}{1 + e^{-(a + bX)}}$
The above equation is the Sigmoid Function for Binary Logistic Regression. In general, a function of this kind, which applies a
non-linear transformation to a linear output, is known as an Activation Function.
Note: The Sigmoid function is the inverse of the logit function. Its range is the open interval
(0, 1).
Therefore, it is a vital tool for solving binary classification problems. Its smoothness and simple
derivative make it easy to compute, which helps to ensure efficient and effective training of the
model.
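The relationship between the logit and the Sigmoid function, and the simple derivative mentioned above, can be verified numerically with a small plain-Python sketch. The function names are assumptions for illustration. The derivative used is the standard identity: the derivative of the Sigmoid at z equals sigmoid(z) · (1 − sigmoid(z)).

```python
import math


def sigmoid(z):
    """Sigmoid 1 / (1 + e^(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def logit(p):
    """Log-odds log(p / (1 - p)) for a probability p strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def sigmoid_derivative(z):
    """Derivative of the sigmoid, using sigmoid(z) * (1 - sigmoid(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)


# The sigmoid undoes the logit and vice versa.
z = 1.7
print(logit(sigmoid(z)))  # approximately 1.7

# Compare the closed-form derivative with a central finite difference.
h = 1e-6
numeric = (sigmoid(z + h) - sigmoid(z - h)) / (2 * h)
print(sigmoid_derivative(z), numeric)
```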
ADVANTAGES OF LOGISTIC REGRESSION
• Logistic Regression is very easy to understand
• It requires little training time
• It gives good accuracy on many simple data sets and performs well when the
classes are linearly separable.
• It makes no assumptions about distributions of classes in feature space.
• Logistic regression is less inclined to over-fitting, but it can overfit on
high-dimensional datasets. Regularization (L1 and L2) techniques can be used to
avoid over-fitting in these scenarios.
• Logistic regression is easy to implement and interpret, and very efficient to train.
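To illustrate how simple the model is to implement and how L2 regularization affects it, the following is a minimal sketch of logistic regression trained by batch gradient descent in plain Python. It is not the implementation used in the project. The function names, the learning rate, the number of epochs, the penalty strength `l2`, and the toy data are all assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Sigmoid 1 / (1 + e^(-z)), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


def predict_proba(x, weights, bias):
    """Predicted probability of class 1 for a single feature vector x."""
    return sigmoid(bias + sum(w * xj for w, xj in zip(weights, x)))


def train_logistic_regression(X, y, lr=0.5, epochs=5000, l2=0.0):
    """Minimize mean log-loss plus (l2 / 2) * ||weights||^2 by gradient descent."""
    n_samples, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            error = predict_proba(xi, weights, bias) - yi
            for j, xij in enumerate(xi):
                grad_w[j] += error * xij
            grad_b += error
        for j in range(n_features):
            # The L2 penalty shrinks each weight towards zero; the bias is not penalized.
            weights[j] -= lr * (grad_w[j] / n_samples + l2 * weights[j])
        bias -= lr * grad_b / n_samples
    return weights, bias


# Hypothetical toy data: one feature already scaled to [0, 1], label 1 = churn.
X = [[0.1], [0.2], [0.3], [0.7], [0.8], [0.9]]
y = [0, 0, 0, 1, 1, 1]

w_plain, b_plain = train_logistic_regression(X, y, l2=0.0)
w_reg, b_reg = train_logistic_regression(X, y, l2=0.1)

print("no penalty:", w_plain, b_plain)
print("L2 penalty:", w_reg, b_reg)
print("P(churn | x = 0.9):", predict_proba([0.9], w_reg, b_reg))
```

On linearly separable data such as this toy set, the unpenalized weight keeps growing with more training, while the L2 penalty keeps it small; this is the over-fitting control referred to above.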
CONCLUSION
• Completed the data gathering and preprocessing steps
• Used several plots and graphs in the data visualization step to form
hypotheses about the factors that predict customer churn
• Gained a theoretical understanding of Logistic Regression
REFERENCES
• Data set: https://www.kaggle.com/datasets/blastchar/telco-customer-churn
• Shai Shalev-Shwartz, Shai Ben-David, Understanding Machine Learning:
From Theory to Algorithms, Third Edition, 2015
• John D. Kelleher, Brian Mac Namee, Aoife D'Arcy, Fundamentals of Machine
Learning for Predictive Data Analytics, Second Edition, 2020
