Credit EDA Case Study


CREDIT – EDA CASE STUDY

Mr. Murali Krishna Manala


Ms. Prachi Patil
PURPOSE

• Perform Exploratory Data Analysis on customer loan applications to help the bank assess the risk associated with customer default behavior.
• Provide inferences and decisions based on the data analysis that enable the company to steer the business toward making or scaling up profits.

APPROACH

1) Importing and cleaning the data provided.
2) Formatting or grouping for an effective analysis.
3) Performing univariate and bivariate analysis on categorical and numerical fields.
4) Drawing useful insights.
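As a rough illustration of the cleaning step, the sketch below measures the share of missing values per column and flags columns above a cut-off. The sample rows, the column names OCCUPATION and EXT_SCORE, and the 50% threshold are all assumptions made for this example; the slides do not state which rule was used.

```python
# Hypothetical rows as dictionaries; None marks a missing value.
rows = [
    {"AMT_CREDIT": 100.0, "OCCUPATION": None, "EXT_SCORE": None},
    {"AMT_CREDIT": 250.0, "OCCUPATION": "Drivers", "EXT_SCORE": None},
    {"AMT_CREDIT": 300.0, "OCCUPATION": None, "EXT_SCORE": None},
    {"AMT_CREDIT": None, "OCCUPATION": "IT staff", "EXT_SCORE": 0.4},
]


def missing_share(rows):
    """Return {column: fraction of rows where the value is missing}."""
    columns = rows[0].keys()
    return {c: sum(r[c] is None for r in rows) / len(rows) for c in columns}


DROP_THRESHOLD = 0.5  # assumed cut-off, not taken from the slides

shares = missing_share(rows)
# Columns whose missing share is strictly above the threshold are candidates to drop.
to_drop = [c for c, share in shares.items() if share > DROP_THRESHOLD]

for column, share in shares.items():
    print(f"{column}: {share:.0%} missing")
print("candidates to drop:", to_drop)
```

Columns that stay below the cut-off still need a decision on imputation or on a separate "Missing" category, which the later slides touch on.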
UNIVARIATE DISCRETE ANALYSIS FOR AGE GROUPS

• The number of applicants increases with age until age 40, after which the number of applications declines.
• The second plot shows that the default rate decreases as the age of the applicant increases.
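The two plots described above rest on one aggregation: bin the ages, then count applicants and defaults per bin. The following is a minimal sketch of that idea in plain Python. The sample records, the 10-year bin width, and the encoding of the target flag (1 = default) are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical sample records: (age in years, target) where target 1 = default.
applicants = [(23, 1), (27, 0), (34, 0), (38, 1), (45, 0), (52, 0), (58, 0), (63, 0)]


def age_group(age, width=10):
    """Map an age to a label such as '20-29' (the bin width is an assumption)."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def default_rate_by_group(records):
    """Return {group: (applicant count, default rate in percent)}."""
    counts = defaultdict(int)
    defaults = defaultdict(int)
    for age, target in records:
        group = age_group(age)
        counts[group] += 1
        defaults[group] += target
    return {g: (counts[g], 100.0 * defaults[g] / counts[g]) for g in sorted(counts)}


rates = default_rate_by_group(applicants)
for group, (n, rate) in rates.items():
    print(f"{group}: {n} applicants, {rate:.1f}% default")
```

Keeping the count next to the rate makes it easier to see when a high or low percentage comes from only a handful of applicants.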
UNIVARIATE DISCRETE ANALYSIS FOR FAMILY STATUS

• The chart shows that most applicants belong to the Married, Single and Civil Marriage categories, in that order. Of these, Single and Civil Marriage applicants tend to default more, and the Unknown category never defaulted.
UNIVARIATE DISCRETE ANALYSIS FOR OCCUPATION

• Missing is the most common occupation value. Among the recorded occupations, the top groups are Laborers and Sales Staff.
• However, the highest default percentage occurs in the Low-Skill Laborers group.
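The observation that the largest group and the riskiest group differ can be reproduced with a short sketch like the one below. It keeps missing occupations as their own "Missing" category rather than dropping them. The sample rows and the target encoding (1 = default) are assumptions for illustration.

```python
from collections import Counter

# Hypothetical sample: (occupation or None, target) where target 1 = default.
rows = [
    (None, 0), (None, 0), (None, 1), (None, 0),
    ("Laborers", 0), ("Laborers", 1), ("Laborers", 0),
    ("Sales staff", 0), ("Sales staff", 0),
    ("Low-skill Laborers", 1), ("Low-skill Laborers", 1),
]

# Keep missing values as their own category instead of dropping them.
labelled = [(occ if occ is not None else "Missing", target) for occ, target in rows]

counts = Counter(occ for occ, _ in labelled)
defaults = Counter(occ for occ, target in labelled if target == 1)
default_pct = {occ: 100.0 * defaults[occ] / counts[occ] for occ in counts}

most_common = counts.most_common(1)[0][0]          # largest group by applicants
riskiest = max(default_pct, key=default_pct.get)   # highest default percentage

print("largest group:", most_common)
print("highest default %:", riskiest)
```

The same pattern applies to the income type and organization type slides, with the grouping column swapped.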
UNIVARIATE DISCRETE ANALYSIS FOR INCOME TYPE

• Most applicants come from the Working and Commercial Associate categories; the fewest are Businessman and Student.
• However, the default percentage is higher in the Maternity Leave and Unemployed groups.
UNIVARIATE CATEGORICAL ANALYSIS FOR ORGANIZATION
TYPE

• Most applicants are from Business Entity Type 3, Missing and Self Employed.
• However, the highest default percentages are in the Transport Type 3, Industry Type 13 and Industry Type 8 groups.
ORDERED/CONT., NUMERICAL
VARIABLE ANALYSIS ON WORK
EXP

• Most applicants have more than 20 years of experience, and the default rate decreases as experience increases.
• Applicants with less than 1 year of experience have a 50% chance of default.
ORDERED/CONT., NUMERICAL
VARIABLE ANALYSIS

• Clients with more dependents tend to have a higher default percentage.
• Similarly, clients whose Region Rating is 3.0 tend to have a higher default percentage.
ORDERED/CONT., NUMERICAL
VARIABLE ANALYSIS – EXT
SOURCE

• The better the external score of the applicant, the lower the default rate. (Because null values in Ext_Source_3 were imputed with the mean, its distribution looks nearly binomial, whereas Ext_Source_2 is right-skewed.)
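The effect of mean imputation on the shape of a distribution can be illustrated with a small sketch: every missing value lands on the same point, which creates an artificial spike there while leaving the mean itself unchanged. The scores below are made-up values, not taken from Ext_Source_3.

```python
from statistics import mean

# Hypothetical external scores in [0, 1]; None marks a missing value.
scores = [0.1, 0.2, None, 0.7, None, 0.9, None, 0.6]

observed = [s for s in scores if s is not None]
fill = mean(observed)  # value used for every missing entry
imputed = [fill if s is None else s for s in scores]

# Share of rows that now sit exactly at the fill value (the artificial spike).
share_at_fill = sum(1 for s in imputed if s == fill) / len(imputed)

print(f"fill value: {fill:.2f}")
print(f"share of rows at the fill value: {share_at_fill:.1%}")
print(f"mean after imputation: {mean(imputed):.2f}")
```

When a large share of rows is missing, that spike can dominate a histogram or density plot of the imputed column.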
ORDERED/CONT., NUMERICAL
VARIABLE ANALYSIS – AMT
FIELDS

• AMT_GOODS_PRICE, AMT_CREDIT and AMT_ANNUITY do not seem to have any impact on the default rate.
BIVARIATE ANALYSIS -
EDUCATION VS GENDER VS
INCOME

• From the heatmap, we can infer that male applicants with Lower Secondary and Secondary education tend to default more, especially in the Low, Medium and High income groups.
• Applicants with an academic degree do not default, except for female applicants in the Very High income group.
BIVARIATE ANALYSIS –
OCCUPATION VS FAMILY
STATUS VS INCOME

• From the heatmap, widows working as HR Staff with Very Low and Medium income default more.
• Civil Married applicants who are Drivers
with and Low income default more.
BIVARIATE
ANALYSIS – ORG
TYPE VS HOUSING
TYPE VS INCOME

• From the heatmap, the following applicants have a higher default rate:
• Unemployed applicants living in a Municipal Apartment
• Working applicants in Transport Type 1 living in a Rented Apartment
• Applicants on Maternity Leave
• Working applicants in Industry Type 4 and Insurance living in an Office Apartment
• Working applicants in Industry Type 8 living with parents
• State servants in Industry Type 3 living in a Municipal Apartment
• Commercial Associates in Industry Type 1 living in a Co-op Apartment
BIVARIATE
ANALYSIS ON
AMT_CREDIT VS
AMT_ANNUITY

• From the scatter plot, we can observe that AMT_ANNUITY increases with AMT_CREDIT.
• When AMT_ANNUITY is higher, there is a slightly higher chance of default.
BIVARIATE
ANALYSIS – ON
AMT FIELDS

• From the pair plot, we can observe no relation between the AMT fields and defaulting behavior.
CORRELATION MATRIX FOR THE DEFAULT DATASET
TOP 10
CORRELATIONS

• The top 10 correlations between variables range from 0.33 to 0.99, and both datasets (defaulter and non-defaulter) show almost the same correlations, except for Region Rating Client vs Region Population Relative and Credit vs Annuity.

Feature 1                     Feature 2                     Correlation Score -   Correlation Score -
                                                            Defaulter             Non Defaulter
OBS_30_CNT_SOCIAL_CIRCLE      OBS_60_CNT_SOCIAL_CIRCLE      0.99827               0.99851
AMT_GOODS_PRICE               AMT_CREDIT                    0.983108              0.987255
REGION_RATING_CLIENT          REGION_RATING_CLIENT_W_CITY   0.956637              0.950149
CNT_FAM_MEMBERS               CNT_CHILDREN                  0.884153              0.876905
DEF_60_CNT_SOCIAL_CIRCLE      DEF_30_CNT_SOCIAL_CIRCLE      0.869016              0.859371
AMT_GOODS_PRICE               AMT_ANNUITY                   0.752895              0.776867
AMT_CREDIT                    AMT_ANNUITY                   0.752195              0.771317
REGION_RATING_CLIENT_W_CITY   REGION_POPULATION_RELATIVE    0.446977              0.539005
REGION_RATING_CLIENT          REGION_POPULATION_RELATIVE    0.443236              0.537301
DEF_30_CNT_SOCIAL_CIRCLE      OBS_60_CNT_SOCIAL_CIRCLE      0.337389              0.331726
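A ranking like the one in this table can be produced by computing the Pearson correlation for every pair of numeric columns and sorting by magnitude. The sketch below does this for one subset at a time; running it once on defaulters and once on non-defaulters would give the two score columns. The sample values are made up, and ranking by absolute value is an assumption, since the slides do not say how signs were handled.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (both must vary)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def top_correlations(columns, k=10):
    """Rank column pairs by absolute Pearson correlation, highest first."""
    pairs = [
        (a, b, abs(pearson(columns[a], columns[b])))
        for a, b in combinations(columns, 2)
    ]
    return sorted(pairs, key=lambda p: p[2], reverse=True)[:k]


# Hypothetical numeric columns for one subset (for example, defaulters only).
defaulters = {
    "AMT_CREDIT": [100.0, 200.0, 300.0, 400.0],
    "AMT_GOODS_PRICE": [90.0, 180.0, 270.0, 360.0],
    "CNT_CHILDREN": [2.0, 0.0, 1.0, 0.0],
}

top = top_correlations(defaulters, k=2)
for a, b, score in top:
    print(f"{a} vs {b}: {score:.3f}")
```

Using `combinations` visits each pair once, so mirrored duplicates and the trivial self-correlations never enter the ranking.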


PREVIOUS APPLICATION DATA ANALYSIS

• 63% of previous loans were approved, and those applicants have an 8% default rate, whereas previously refused applicants have a 12% default rate.
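Figures of this kind come from joining each previous application to the applicant's current default flag and then aggregating by the previous status. The sketch below shows one way to do that in plain Python. The ids, statuses, and target values are made up, and matching on a shared applicant id is an assumption about how the two datasets relate.

```python
from collections import defaultdict

# Hypothetical current applications: applicant id -> target (1 = default).
current_target = {1: 0, 2: 1, 3: 0, 4: 0, 5: 1}

# Hypothetical previous applications: (applicant id, previous status).
previous = [
    (1, "Approved"), (2, "Refused"), (3, "Approved"),
    (4, "Refused"), (5, "Approved"), (1, "Approved"), (9, "Approved"),
]

totals = defaultdict(int)
defaults = defaultdict(int)
for applicant_id, status in previous:
    target = current_target.get(applicant_id)
    if target is None:
        continue  # no matching current application, so the row is skipped
    totals[status] += 1
    defaults[status] += target

matched = sum(totals.values())
status_share = {s: 100.0 * totals[s] / matched for s in totals}
default_pct = {s: 100.0 * defaults[s] / totals[s] for s in totals}

for status in totals:
    print(f"{status}: {status_share[status]:.1f}% of previous loans, "
          f"{default_pct[status]:.1f}% default")
```

Because one applicant can have several previous applications, the percentages here are per previous application, not per applicant; counting per applicant would require deduplicating first.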
PREVIOUS
APPLICATION
DATA
ANALYSIS

• Applicants who were previously refused for Revolving Loans and Cash Loans have a higher default percentage.
• Applicants who had newly applied for Cash Loans and Revolving Loans previously also have a higher default rate.
DATA CLEANING

INFERENCE

Recommended Applicants
• IT Staff
• Applicants holding an academic degree

Risk Associated Applicants
• Applicants with Civil Marriage or Widow status in the Driver or HR Staff occupations
• Low-skilled Laborers
• Male applicants with Lower Secondary or incomplete higher education
• Applicants who are unemployed or on maternity leave
