EDA Credit Case Study
CASE STUDY
PROBLEM STATEMENT:
Loan providers find it difficult to lend to people with insufficient or non-existent credit history.
Some consumers take advantage of this by defaulting on their loans.
This case study aims to identify patterns that indicate whether a client has difficulty paying their instalments. These patterns can inform
actions such as denying the loan, reducing the loan amount, or lending to risky applicants at a higher interest rate.
This will ensure that consumers capable of repaying the loan are not rejected.
This analysis will help the company identify the variables behind loan default, i.e. the variables that are strong indicators of default.
The company can use this knowledge for its portfolio and risk assessment.
1. The application data was read and cleaned. Columns with 50% missing data were deleted.
The remaining columns with missing values were analyzed using distribution plots and count plots, and their mean, mode and median
were computed.
2. After this analysis, columns whose missing values could not be replaced with the mean, median or mode were deleted.
Other missing values were substituted with the mean, median or mode.
5. Some numerical columns were converted into categorical columns for the analysis.
6. Univariate analysis was done for all categorical columns with the object data type.
7. The top 10 correlations were computed for defaulters and non-defaulters.
8. The previous application data was read and cleaned. Columns with 50% missing data were deleted.
9. The remaining columns with missing values were analyzed using distribution plots and count plots, and their mean, mode and median
were computed. Columns whose missing values could not be replaced with the mean, median or mode were deleted. Other
missing values were substituted with the mean, median or mode.
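The cleaning steps listed above can be sketched in pandas roughly as follows. This is a minimal illustration, not the code used in the case study: the helper names, the toy column names and values, and the reading of "50% missing" as "50% or more" are all assumptions, and the sketch always uses the median for numeric columns and the mode for the rest.

```python
import numpy as np
import pandas as pd


def drop_sparse_columns(df, threshold=0.5):
    """Drop columns whose share of missing values is at least `threshold`.

    Treating the 50% cut-off as inclusive is an assumption.
    """
    missing_share = df.isna().mean()
    return df.loc[:, missing_share < threshold]


def impute_missing(df):
    """Fill numeric gaps with the median and non-numeric gaps with the mode."""
    out = df.copy()
    for col in out.columns:
        if not out[col].isna().any():
            continue  # nothing to fill in this column
        if pd.api.types.is_numeric_dtype(out[col]):
            out[col] = out[col].fillna(out[col].median())
        else:
            out[col] = out[col].fillna(out[col].mode().iloc[0])
    return out


# Toy frame standing in for the application data (names and values are made up).
toy = pd.DataFrame({
    "AMT_INCOME": [100.0, np.nan, 300.0, 200.0],
    "OCCUPATION": ["Laborers", None, "Laborers", "Drivers"],
    "MOSTLY_EMPTY": [np.nan, np.nan, np.nan, 1.0],
})
cleaned = impute_missing(drop_sparse_columns(toy))
print(cleaned)
```

Whether the mean, median or mode suits a given column depends on its distribution, which is why the steps above inspect the distribution and count plots before substituting values.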
Inferences:
1. More applicants do not own a car than own one.
2. The graph shows that applicants who do not own a car are more likely
to default than applicants who own a car.
PLOT: FLAG_OWN_REALTY
Inferences:
1. More applicants own a house/flat than do not.
2. The graph shows that applicants who do not own a house/flat include
a higher proportion of defaulters than applicants who own a house/flat.
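The two inferences above rest on different quantities: the first on the number of applicants per category, the second on the share of defaulters within each category. A small sketch of how both can be read from one table, assuming a TARGET column where 1 marks a defaulter and Y/N values for FLAG_OWN_REALTY (both are assumptions, and the rows are made up):

```python
import pandas as pd

# Made-up rows; TARGET = 1 is assumed to mark a defaulter.
toy = pd.DataFrame({
    "FLAG_OWN_REALTY": ["Y", "Y", "Y", "Y", "N", "N"],
    "TARGET":          [0,   0,   0,   1,   1,   0],
})

summary = toy.groupby("FLAG_OWN_REALTY")["TARGET"].agg(
    applicants="count",    # how many applicants fall in each category
    default_rate="mean",   # share of defaulters within the category
)
print(summary)
```

Comparing rates rather than raw counts keeps a large category from looking riskier simply because it contains more applicants.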
PLOT: NAME_TYPE_SUITE
Inferences:
Inferences:
1. Applicants in the Business Entity Type 3 and self-employed categories have a high percentage of defaulters.
The top 10 correlations are the same for non-defaulters and defaulters.
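One way to obtain such a ranking is to compute the correlation matrix separately for each group and keep the strongest pairs. The sketch below is an assumption about the method, not the case study's own code: it ranks by absolute Pearson correlation, the function name is hypothetical, and FEATURE_A, FEATURE_B and FEATURE_C are placeholder columns with made-up values.

```python
import numpy as np
import pandas as pd


def top_correlations(df, n=10):
    """Return the n column pairs with the largest absolute Pearson correlation."""
    corr = df.select_dtypes("number").corr().abs()
    # Keep the strict upper triangle so each pair appears once
    # and the diagonal (self-correlation) is excluded.
    mask = np.triu(np.ones(corr.shape, dtype=bool), k=1)
    pairs = corr.where(mask).stack().dropna()
    return pairs.sort_values(ascending=False).head(n)


# Made-up data; TARGET = 1 is assumed to mark a defaulter.
toy = pd.DataFrame({
    "TARGET":    [0, 0, 0, 0, 1, 1, 1, 1],
    "FEATURE_A": [1, 2, 3, 4, 1, 2, 3, 4],
    "FEATURE_B": [2, 4, 6, 8, 4, 3, 2, 1],
    "FEATURE_C": [5, 1, 4, 2, 2, 5, 1, 3],
})

non_defaulters = top_correlations(toy[toy["TARGET"] == 0].drop(columns="TARGET"))
defaulters = top_correlations(toy[toy["TARGET"] == 1].drop(columns="TARGET"))
print(non_defaulters)
print(defaulters)
```

Taking absolute values means a strong negative correlation ranks alongside a strong positive one, so two groups can share the same top pairs even when the direction of the relationship differs.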
Inferences:
1. Applicants with secondary education had loans refused or cancelled more often in the past.
PLOT: CNT_CHILDREN
Inferences: