
Final Project Introduction

July 12, 2023


Financial dashboard in Power BI
• Importing data, modelling data, analysing and visualizing data in Power BI
• Preparing Financial Statements: Profit and Loss Statement (Income
Statement), Balance Sheet, Cash Flow Statement
• Calculating key Financial Ratios to analyse Profitability, Liquidity, Financial Risk, and
Operational Business Efficiency
• Preparing Dynamic Financial Dashboards
Dataset
• General ledger (GL), chart of accounts, territory, and calendar date
Stock market Price prediction

• Scraping data for the stock to be predicted


• Time series analysis
• Making predictions on stock prices
• Deployment
FMCG Sales Dashboard in Power BI Desktop and Sales Forecasting

The Sales Analytics and Consumer Consumption Insight dashboard is developed to improve the sales effectiveness of employees
and to provide consumer consumption insights for better segmentation and targeting.
Business Problem

Customer churn is a major challenge for banks, as it reduces their revenue and market
share. Acquiring new customers is also more costly and difficult than retaining existing
ones. Therefore, banks need to understand the factors that influence customer churn
and develop strategies to prevent it.
Business Objective and Business Metrics
The objective of this project is to create a machine learning model that can accurately
predict customer churn and provide insights into the characteristics of customers who
are likely to leave the bank.
• The model will help the bank to target its at-risk customers and offer them
personalized incentives or solutions to retain them.
• The model will also help the bank to improve its products and services based on the
feedback and preferences of its customers.
Goal
The goal is to build a machine learning model that can identify the customers who are
likely to leave the bank. This can help the bank to retain its valuable customers and
increase its revenue.
Data Description and Features
The dataset covers 10000 customers, with their features and whether or not they have left the bank.
• Row Number: The record number, which has no effect on the outcome.
• Customer ID: A random identifier, which has no effect on the outcome.
• Surname: The customer’s surname, which has no effect on the outcome.
• Credit Score: The customer’s credit score, which reflects their creditworthiness. A higher credit score means a
lower risk of defaulting on loans, and thus a lower likelihood of leaving the bank.
• Geography: The customer’s country of residence (Germany, France, or Spain). Different countries may have
different banking preferences and regulations, which can influence the customer’s decision to stay or leave.
• Gender: The customer’s gender (Female or Male). Gender may have some impact on the customer’s banking needs
and expectations, which can affect their satisfaction and loyalty.
• Age: The customer’s age. Older customers may have more stable and long-term relationships with the bank, while
younger customers may be more willing to switch to other banks for better offers or services.
• Tenure: The number of years that the customer has been with the bank. A longer tenure implies a stronger bond
and trust between the customer and the bank, and thus a lower chance of leaving.
• NumOfProducts: The number of products or services that the customer has purchased from the bank. A higher number of
products means a more diversified and comprehensive banking experience for the customer, which can increase their
retention. A lower number of products may indicate a lack of awareness or satisfaction with the bank’s offerings, which can
lead to churn.
• HasCrCard: Whether the customer has a credit card from the bank or not (0 = No, 1 = Yes). Having a credit card can enhance
the customer’s convenience and loyalty to the bank, as well as generate more revenue for the bank. Not having a credit card
may reduce the customer’s attachment and involvement with the bank.
• IsActiveMember: Whether the customer is an active member of the bank or not (0 = No, 1 = Yes). Active members are more
likely to use the bank’s products and services frequently, which can improve their satisfaction and retention. Inactive
members may have lower engagement and interest in the bank, which can increase their churn rate.
• EstimatedSalary: The customer’s estimated annual salary. A higher salary means a higher income and spending power for
the customer, which can make them more valuable and loyal to the bank. A lower salary may limit the customer’s ability and
willingness to use the bank’s products and services, which can make them more likely to leave.
• Exited: Whether the customer has left the bank or not (0 = No, 1 = Yes). This is the target variable that we want to predict
using the other features.
Data Preparation
Import Data:
Data Checking
First Observation
• The dataset consists of 10000 observations with 14 variables each
• The data is clean and complete, with no missing or duplicated values
• The target variable is called "Exited"
• Some data types might need changing
• Gender and Geography can be encoded
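The checks behind these observations can be sketched in pandas. This is a minimal illustration, not the project's actual notebook: the small DataFrame below is made-up sample data that only borrows the dataset's column names, and the encoding choices (0/1 for Gender, one-hot columns for Geography) are assumptions, since the slides only say that the two columns "can be encoded".

```python
import pandas as pd

# Hypothetical sample rows using the churn dataset's column names.
# The real data would be loaded from file instead.
df = pd.DataFrame({
    "CreditScore": [619, 608, 502, 699],
    "Geography": ["France", "Spain", "France", "Germany"],
    "Gender": ["Female", "Female", "Female", "Male"],
    "Age": [42, 41, 42, 39],
    "Exited": [1, 0, 1, 0],
})

n_rows, n_cols = df.shape                   # observations and variables
n_missing = int(df.isna().sum().sum())      # total number of missing cells
n_duplicates = int(df.duplicated().sum())   # fully duplicated rows

# Assumed encoding: Gender as 0/1, Geography as one-hot indicator columns.
encoded = df.copy()
encoded["Gender"] = encoded["Gender"].map({"Female": 0, "Male": 1})
encoded = pd.get_dummies(encoded, columns=["Geography"], dtype=int)

print(n_rows, n_cols, n_missing, n_duplicates)
print(encoded.columns.tolist())
```

One-hot encoding avoids implying an order between countries; a single integer code per country would be an alternative for tree-based models.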
Data Descriptive Statistics
Second Observation
• The average credit score of the customers is 650.53, which is considered fair.
• The credit score ranges from 350 to 850, with 850 being the highest possible score.
• The average age of the customers is 38.92 years, with a minimum of 18 and a maximum of 92.
• The average tenure of the customers is 5.01 years, so they have been with the bank for about 5 years on average. The
tenure ranges from 0 to 10 years.
• The average balance of the customers is 76485.89, with a large standard deviation of 62397.41. The balance ranges from 0 to
250898.09, with some customers having no balance at all.
• The average number of products used by the customers is 1.53, with a standard deviation of 0.58. The number of products ranges
from 1 to 4, with most customers using either 1 or 2 products.
• The average estimated salary of the customers is 100090.24, with a standard deviation of 57510.49. The estimated salary ranges
from 11.58 to 199992.48, with a wide distribution of values.
• The data also shows that 20.4% of the customers exited the bank, which is a high churn rate. This could indicate that the
bank is not meeting the needs or expectations of its customers, or that there are better alternatives in the market.
• 70.55% of the customers have a credit card, and 51.51% are active members.
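Figures like the ones above typically come from `describe()` plus the mean of each 0/1 column. The sketch below uses a made-up four-row sample, so its numbers are illustrative and do not reproduce the values reported in the slides.

```python
import pandas as pd

# Hypothetical sample; column names follow the churn dataset.
df = pd.DataFrame({
    "CreditScore": [600, 650, 700, 750],
    "HasCrCard": [1, 1, 0, 1],
    "IsActiveMember": [1, 0, 0, 1],
    "Exited": [0, 1, 0, 0],
})

# count, mean, std, min, quartiles, and max for a numeric column
credit_summary = df["CreditScore"].describe()

# For a 0/1 column, the mean is the share of ones, so these are percentages.
churn_rate_pct = df["Exited"].mean() * 100
card_share_pct = df["HasCrCard"].mean() * 100
active_share_pct = df["IsActiveMember"].mean() * 100

print(credit_summary)
print(churn_rate_pct, card_share_pct, active_share_pct)
```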
ANOVA (analysis of variance) is a statistical test that compares the means
of multiple groups to determine if there are significant differences
between them.
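As a rough sketch of how such a test can be run, the example below applies `scipy.stats.f_oneway` to a numeric predictor split by the target. The data is made up for illustration, and the use of SciPy is an assumption; the slides do not say which library produced the reported statistics.

```python
import pandas as pd
from scipy.stats import f_oneway

# Hypothetical sample: ages of customers who stayed (0) and exited (1).
df = pd.DataFrame({
    "Age": [30, 35, 32, 38, 50, 55, 48, 60],
    "Exited": [0, 0, 0, 0, 1, 1, 1, 1],
})

# Split the numeric predictor into one group per target class.
stayed = df.loc[df["Exited"] == 0, "Age"]
exited = df.loc[df["Exited"] == 1, "Age"]

# One-way ANOVA: do the group means differ more than chance would explain?
f_stat, p_value = f_oneway(stayed, exited)

alpha = 0.05
significant = p_value < alpha
print(f"F = {f_stat:.4f}, p-value = {p_value:.6f}, significant: {significant}")
```

With only two groups, one-way ANOVA is equivalent to a two-sample t-test with equal variances assumed (F equals t squared).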
Conclusion (Credit Score): the p-value, 0.006738, is less than alpha = 0.05 (F = 7.3445).
Mean credit score differs significantly between customers who exited and
those who stayed.

Observation:
Outliers are present.
A lower credit score may be associated with exiting, based on the statistical test.
Conclusion (Age): the p-value, 1.2399e-186, is less
than alpha = 0.05 (F = 886.0633).
Mean age differs significantly between customers who exited and those who stayed.

Observation:
Outliers are present.
Older customers are more likely to exit the bank.
Conclusion (Tenure): the p-value, 0.1615, is greater than alpha = 0.05
(F = 1.9602). Mean tenure does not differ significantly between customers
who exited and those who stayed.
Observation:
Tenure is unlikely to be associated with exiting.
Conclusion (Estimated Salary): the p-value, 0.2264, is greater
than alpha = 0.05 (F = 1.4633).
Mean estimated salary does not differ significantly between customers
who exited and those who stayed.
Observation:
Salary is not associated with exiting.
Geography Vs Churn Rate %

Chi-square statistic: 301.2553, p-value: 3.8303e-66, which is less than 0.05.
The churn rate differs significantly between countries.

Observation:
Customers from Germany are more likely to leave the bank.
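A chi-square test of independence like the one above can be sketched with `pandas.crosstab` and `scipy.stats.chi2_contingency`. The sample below is made up, so its statistic will not match the reported value, and SciPy is again an assumed choice of library.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical sample: six customers per country, with their Exited flag.
df = pd.DataFrame({
    "Geography": ["France"] * 6 + ["Germany"] * 6 + ["Spain"] * 6,
    "Exited": [0, 0, 0, 0, 0, 1,
               0, 0, 0, 1, 1, 1,
               0, 0, 0, 0, 0, 1],
})

# Contingency table of raw counts: rows are countries, columns are Exited 0/1.
counts = pd.crosstab(df["Geography"], df["Exited"])

# Churn rate per country: share of Exited == 1 within each row, as a percentage.
churn_rate_pct = pd.crosstab(df["Geography"], df["Exited"], normalize="index")[1] * 100

# Test whether Geography and Exited are independent.
chi2, p_value, dof, expected = chi2_contingency(counts)

print(churn_rate_pct.round(2))
print(f"chi2 = {chi2:.4f}, p-value = {p_value:.4f}, dof = {dof}")
```

The same pattern applies to HasCrCard, IsActiveMember, and Gender by swapping the first column passed to `crosstab`.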
HasCrCard Vs Churn Rate %
Chi-square statistic: 0.4713, p-value: 0.4924, which is greater than 0.05.
The churn rate does not differ significantly between customers with and
without a credit card.

Observation:
Owning a credit card has no significant impact on whether a customer
exits the bank.

IsActiveMember Vs Churn Rate %

Chi-square statistic: 242.9853, p-value: 8.7859e-55, which is less than 0.05.
The churn rate differs significantly between active and inactive members.
Observation:
Active members are less likely to exit the bank.
Gender Vs Churn Rate %
Chi-square statistic: 112.9186, p-value: 2.2482e-26, which is less than 0.05.
The churn rate differs significantly between female and male customers.
Observation:
Female customers are more likely to leave the bank.
Outlier Removal and Data Cleaning
Q1 = df[col].quantile(0.25)
Q3 = df[col].quantile(0.75)
IQR = Q3 - Q1
lower = Q1 - 1.5 * IQR
upper = Q3 + 1.5 * IQR

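For Age, the lower bound is 14.0 and the upper bound is 62.0. Since ages range from 18 to 92, only customers older than 62 are flagged.
For Credit Score, the lower bound is 383.0 and the upper bound is 919.0. Since scores range from 350 to 850, only scores below 383 are flagged.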
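The IQR rule above can be wrapped into reusable helpers, as in the sketch below. The function names `iqr_bounds` and `remove_outliers` are hypothetical, and the sample ages are made up; only the quartile and 1.5 * IQR logic comes from the snippet in the slides.

```python
import pandas as pd

def iqr_bounds(series, k=1.5):
    """Return the (lower, upper) outlier bounds using the IQR rule."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

def remove_outliers(df, col, k=1.5):
    """Keep only the rows whose value in `col` lies within the IQR bounds."""
    lower, upper = iqr_bounds(df[col], k)
    return df[df[col].between(lower, upper)]

# Hypothetical sample with one extreme age.
df = pd.DataFrame({"Age": [30, 32, 34, 36, 38, 40, 42, 44, 92]})

lower, upper = iqr_bounds(df["Age"])
cleaned = remove_outliers(df, "Age")

print(lower, upper)   # bounds for this sample
print(len(df), len(cleaned))
```

`between` is inclusive on both ends by default, so values exactly on a bound are kept.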
Feature Engineering
Correlation Analysis
