Professional Documents
Culture Documents
LendingClub CaseStudy
LendingClub CaseStudy
Group Members:
1. Ankit Kumar
The problem
Lending Club is the largest Lending Club wants to As a data scientist working
online loan marketplace, understand the driving for Lending Club analyze the
facilitating personal loans, factors behind loan default, dataset containing
business loans, and i.e. the driver variables information about past loan
financing of medical which are strong indicators applicants using EDA to
procedures. of default. understand how consumer
attributes and loan attributes
Borrowers can easily access The company can utilise this influence the tendency of
lower interest rate loans knowledge for its portfolio default
through a fast online and risk assessment.
interface.
Analysis Approach
> Drop columns with null values,
all random values or single
> Analyze variables against
category value
> Convert values to proper int, segments of other variables Publish insights and
float, date representations > Create derived variables observations
Segmented
Univariate Bivariate Summarize
Clean Data Univariate
Analysis Analysis
Analysis Results
Lending Club charges higher interest rates as the grade of loan becomes worse.
However, as we will see on next slide - the driving variable for defaults is the higher interest rate.
Analysis - Defaults by Interest Rate
Percentage of Defaults increases monotonically with higher
interest rates. At rates of 19% and above, more than 33% of
loans are Charged Off.
Analysis - Defaults by Loan Purpose
More than a quarter of loans taken for the purpose of running a small
business see defaults.
Analysis - Defaults by Borrower’s Income
Borrowers having annual income less than 20000 default on their loans at much higher rates.
Loan default decreases with higher annual income.
As we will see on next slide – the ratio of amount to income is more important.
Analysis - Defaults by ratio of amount to income