Churn Rate EDA
Prashant 180538
Problem Statement
Apply basic data-understanding techniques to the Telco-Churn-Rate dataset and draw interpretations from it.
Dataset
The dataset contains 7043 entries, each with a class label of Churn Yes or Churn No.
In total, the dataset has 3 numeric, 7 binary and 11 nominal attributes.
Data Preprocessing
Data Cleaning
Missing Values
The TotalCharges column is empty for 11 entries, so we drop those entries from the dataset.
The indexes of those 11 entries are [488, 753, 936, 1082, 1340, 3331, 3826,
4380, 5218, 6670, 6754].
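The drop described above could be sketched as follows. This is a minimal illustration, not the report's actual code: the helper name drop_blank_total_charges and the toy rows are made up, and it assumes the empty entries are stored as blank strings in TotalCharges.

```python
import pandas as pd

def drop_blank_total_charges(df):
    """Return (cleaned frame, list of dropped indexes) for blank TotalCharges rows."""
    # Assumption: empty entries are blank or whitespace-only strings.
    blank = df["TotalCharges"].astype(str).str.strip() == ""
    dropped_idx = df.index[blank].tolist()
    cleaned = df.loc[~blank].copy()
    # With the blanks gone, the column can be converted to a numeric type.
    cleaned["TotalCharges"] = cleaned["TotalCharges"].astype(float)
    return cleaned, dropped_idx

# Toy rows for illustration only (not taken from the Telco dataset).
toy = pd.DataFrame({
    "tenure": [0, 12, 0, 5],
    "TotalCharges": [" ", "350.5", "", "99.0"],
})
cleaned, dropped = drop_blank_total_charges(toy)
print(dropped)
print(cleaned)
```

Returning the dropped indexes alongside the cleaned frame makes it easy to report them, as done in the text above.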
Null Values
Apart from the empty TotalCharges entries noted above, there are no null values in the dataset.
Outlier Analysis
Boxplots were plotted for the 3 numeric attributes (tenure, MonthlyCharges, TotalCharges) and
an IQR analysis was performed; no outliers were found.
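An IQR check of this kind can be sketched as below. The report does not state which fence multiplier was used, so the conventional 1.5 × IQR rule is an assumption here, and the function name iqr_outliers and the toy values are illustrative only.

```python
import pandas as pd

def iqr_outliers(series, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return series[(series < lower) | (series > upper)]

# Toy values for illustration only; the 500.0 is planted to show a flagged point.
toy = pd.DataFrame({
    "tenure": [1, 5, 12, 24, 36, 48, 60, 72],
    "MonthlyCharges": [20.0, 25.0, 30.0, 35.0, 40.0, 45.0, 50.0, 500.0],
})
outlier_counts = {col: len(iqr_outliers(toy[col])) for col in toy.columns}
print(outlier_counts)
```

On the real data, a count of zero for tenure, MonthlyCharges and TotalCharges would correspond to the "no outliers" finding.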
EDA
1). The Churn Rate for Senior Citizens is higher than that for Non-Senior Citizens
(plot-1).
2). People having partners or dependents have a lower Churn Rate than
people with no partners or no dependents (plot-2).
3). Of the two Internet Services, people having Fiber Optic have a higher Churn
Rate than people having DSL (plot-4).
4). Customers with longer contract terms, i.e. one-year or two-year contracts, have a lower Churn
Rate than those with month-to-month contracts.
2). Customers having higher monthly charges have a higher Churn Rate: the median of monthly
charges for churning customers lies close to 80 dollars, whereas for non-churning customers the
median of monthly charges is close to 70 dollars (plot-2).
3). An interesting observation comes from plot-3, where we can see that the Total Charges of Non-
Churning Customers are higher than those of Churning Customers. The median for Non-Churning
Customers is close to 2000 dollars and that for Churning Customers is close to 1000 dollars (plot-3).
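The comparisons above rest on two kinds of summary: the churn rate within each level of a categorical attribute, and the median of a numeric attribute for churning versus non-churning customers. A minimal sketch of both follows, assuming Churn holds the strings "Yes" and "No"; the helper names and the toy rows are illustrative and not from the report.

```python
import pandas as pd

def churn_rate_by(df, column):
    """Share of customers with Churn == 'Yes' within each level of `column`."""
    return df["Churn"].eq("Yes").groupby(df[column]).mean()

def median_by_churn(df, column):
    """Median of a numeric column, split by the Churn label."""
    return df.groupby("Churn")[column].median()

# Toy rows for illustration only (not taken from the Telco dataset).
toy = pd.DataFrame({
    "Contract": ["Month-to-month", "Month-to-month", "Month-to-month",
                 "Month-to-month", "Two year", "Two year"],
    "MonthlyCharges": [80.0, 90.0, 70.0, 60.0, 50.0, 75.0],
    "Churn": ["Yes", "Yes", "Yes", "No", "No", "No"],
})
rates = churn_rate_by(toy, "Contract")
medians = median_by_churn(toy, "MonthlyCharges")
print(rates)
print(medians)
```

The same two helpers apply to the other attributes discussed, for example SeniorCitizen or InternetService for rates and TotalCharges for medians.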
Attribute Generation
Based on this observation, we create a new attribute called “count_of_services_used”, which
counts the number of additional services (out of OnlineSecurity, OnlineBackup,
DeviceProtection, TechSupport, StreamingTV, StreamingMovies, InternetService) used by the
customer.
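One way such an attribute might be derived is sketched below. The report does not spell out the counting rule, so this sketch assumes that the six add-on columns count as used when their value is "Yes" and that InternetService counts as used when its value is anything other than "No". The function name add_service_count and the toy rows are hypothetical.

```python
import pandas as pd

ADD_ON_COLUMNS = [
    "OnlineSecurity", "OnlineBackup", "DeviceProtection",
    "TechSupport", "StreamingTV", "StreamingMovies",
]

def add_service_count(df):
    """Return a copy of df with a count_of_services_used column."""
    out = df.copy()
    # Assumption: an add-on is "used" only when its value is exactly "Yes".
    add_ons = out[ADD_ON_COLUMNS].eq("Yes").sum(axis=1)
    # Assumption: InternetService is "used" unless its value is "No".
    has_internet = out["InternetService"].ne("No").astype(int)
    out["count_of_services_used"] = add_ons + has_internet
    return out

# Toy rows for illustration only (not taken from the Telco dataset).
toy = pd.DataFrame({
    "InternetService": ["Fiber optic", "No"],
    "OnlineSecurity": ["Yes", "No internet service"],
    "OnlineBackup": ["No", "No internet service"],
    "DeviceProtection": ["Yes", "No internet service"],
    "TechSupport": ["No", "No internet service"],
    "StreamingTV": ["Yes", "No internet service"],
    "StreamingMovies": ["No", "No internet service"],
})
result = add_service_count(toy)
print(result["count_of_services_used"].tolist())
```

Under these assumptions the new attribute ranges from 0 (no internet service at all) to 7 (internet plus all six add-ons).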
Attribute Importance