Hackathon Presentation - by Dhruba Barman


Problem Statement
● Loans are essential for any community. Today, banks are eager to lend to
individuals with a good, trackable financial/credit history. But many people are
outside this system: they have no trackable financial history, so banks are
reluctant to offer them loans, and they are often exploited by untrustworthy
local lenders.
● The challenge is to predict loan repayment ability using various statistical and
machine learning methods.
Why solve this problem?
- Business Impact
- Control losses
- Better fund management
- Timely service to customers
Data

Current dataset:
● TARGET status (0 for no claim, 1 for claim)

Additional data that could help:
● The types of claim (as mentioned in the assumption) would have helped to predict
better.
Exploratory Data Analysis
Realty ownership
• Realty owners have a high claim count.
• We can offer some discounts to non-owners so as to increase the chances
of claims for them.
Gender data
• Only 5% of all females had a claim.
• Only 8% of all males had a claim.
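
The gender percentages above are claim rates: the number of claims in a group divided by the size of that group. A minimal sketch of that calculation follows, assuming each record is a dict with a 0/1 `TARGET` field. The `gender` key, the helper name `claim_rate_by_group`, and the toy records are hypothetical and are not taken from the hackathon data.

```python
from collections import defaultdict


def claim_rate_by_group(records, group_key, target_key="TARGET"):
    """Return {group value: fraction of records in that group with target == 1}."""
    totals = defaultdict(int)
    claims = defaultdict(int)
    for row in records:
        group = row[group_key]
        totals[group] += 1
        claims[group] += row[target_key]  # target is 0 or 1, so this counts claims
    return {group: claims[group] / totals[group] for group in totals}


# Toy records invented for illustration only (not the hackathon data).
toy_records = [
    {"gender": "F", "TARGET": 0},
    {"gender": "F", "TARGET": 0},
    {"gender": "F", "TARGET": 0},
    {"gender": "F", "TARGET": 1},
    {"gender": "M", "TARGET": 0},
    {"gender": "M", "TARGET": 1},
]

rates = claim_rate_by_group(toy_records, "gender")
```

Comparing rates rather than raw claim counts keeps a group from looking riskier simply because it contains more records.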
Contract Type data
• Cash loans have higher claims. We can increase the premium for “Cash loans” contracts as demand is high.
• Revolving loans should be looked at to increase claim counts, by providing some lucrative offers.
• For Revolving loans, claims by females are high, so targeting females may increase the
claim count for this type.
Count of children
• As the count of children increases, the chance of a claim decreases.
• Claims decrease for males with a high count of children.
• So we can target males with more children and provide some lucrative offers so as to
increase the claims.
Family size:
• As the family size increases, the claim count decreases.
• Claims by males decrease as family size increases.
• So we can target males in big families here and provide some lucrative offers so as to
increase the claims.
Income type
• Females with income type “State servant” have higher claims.
• In general, the “Working” type has high claims.
Education type
• The Secondary/Secondary special type has a high chance of claims.
• We can target the Lower secondary type, as the claim ratio for both genders is the same under this
type.
Family status
• Married couples have a high probability of claims, for both males and females. We can
increase the premium for them as demand is higher.
• For females who are widows, we can offer some discounts or prizes so as to increase
the claims by them.
Housing type
• People in the House/apartment type have a higher chance of claims, and demand is high for
this type.
• Females in “Office apartments” have high claims and should be targeted to
increase claims under this type.
• Males in Co-op apartments have a high count of claims. So if we target the males
under this type, claims may increase.
Car age:
• Car owners whose car age is within 10-12 years have higher claim counts. Premiums can be
increased for this group.
• Attractive offers can be introduced for individuals whose car age is high.
Region rating
• Regions having rating ‘2’ have high claims.
Occupation type
• Laborers have a high chance of claims compared to other occupations.
• Females with the occupations “Secretaries”, “Realty agents”, and “Low skill laborers” have a high count of
claims compared to males.
Day wise analysis
• Sunday being a holiday, claim counts drop, as expected. #FamilyTime
• We can definitely provide some attractive offers for the weekends to increase claim counts.
• Weekdays are always in demand for any type of claim.
Hour wise analysis
• Claim count is high during business hours, especially from 10AM to 2PM.
Reg Region
• The bar plots depict the obvious region to target in this type.
• Individuals whose job registration and living region are the same can be targeted.
Region analysis
• Claim counts are higher where the job registration region is not the work region.
• If the living region is not the same as the work region, claim counts are high.
• So we can provide some discounts or other special offers for individuals whose job registration and
living regions are the same as the work region.
Total Income analysis

• The distribution of annual income is similar for both Claim and No Claim.


This is the correlation heatmap showing the correlation between different features. Correlations above 0.6 are marked
in red. Highly correlated columns can be dropped.
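
One way to act on the 0.6 threshold is sketched below: compute the Pearson correlation for each pair of numeric columns, and drop a column when it is strongly correlated with a column already kept. This is an illustrative sketch, not the method used for the slides. Comparing the absolute correlation, the greedy keep-first rule, the helper names, and the toy columns are all assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def columns_to_drop(columns, threshold=0.6):
    """Greedy pass: keep the first column seen, drop any later column whose
    absolute correlation with an already kept column exceeds the threshold."""
    kept, dropped = [], []
    for name, values in columns.items():
        if any(abs(pearson(values, columns[k])) > threshold for k in kept):
            dropped.append(name)
        else:
            kept.append(name)
    return dropped


# Toy columns invented for illustration only (not the hackathon data).
toy_columns = {
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "b": [2.0, 4.1, 5.9, 8.2, 9.9],  # roughly 2 * a, so strongly correlated with a
    "c": [3.0, 1.0, 4.0, 1.0, 5.0],  # only weakly correlated with a and b
}

dropped = columns_to_drop(toy_columns)
```

Dropping only one column from each correlated pair keeps the information the pair shares, which is why the sketch compares against kept columns rather than removing both.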
Distribution of TARGET Data

• Only 6% of records have a claim, while 94% have no claim.
• The target data is imbalanced.
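
To make the imbalance concrete, the sketch below computes the class shares, the accuracy of a baseline that always predicts "no claim", and inverse-frequency class weights, a common heuristic for imbalanced targets. The label list is synthetic, built only to match the 94%/6% split stated above. The helper names are hypothetical, and the slides do not say which remedy, if any, was applied.

```python
from collections import Counter


def class_shares(labels):
    """Return {class: fraction of all labels belonging to that class}."""
    counts = Counter(labels)
    n = len(labels)
    return {cls: counts[cls] / n for cls in counts}


def balanced_class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n = len(labels)
    k = len(counts)
    return {cls: n / (k * counts[cls]) for cls in counts}


# Synthetic labels matching the 94% / 6% split (not the real records).
labels = [0] * 94 + [1] * 6

shares = class_shares(labels)
weights = balanced_class_weights(labels)

# A model that always predicts the majority class is right this often,
# which is why plain accuracy is a misleading metric on this target.
majority_baseline_accuracy = max(shares.values())
```

With this split the always-"no claim" baseline is already 94% accurate, so metrics such as recall or precision on the claim class are more informative than accuracy.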
Data cleaning and preprocessing performed
● Columns with null values are dropped.
● Duplicate records are dropped.
● Columns marked as “ignore this field” in data_dictionary.csv
are dropped. Most of them are highly correlated.
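
The three cleaning steps above can be sketched in plain Python on a list of row dicts. This is a minimal illustration under stated assumptions, not the code used for the slides: the helper names, the toy rows, and the `legacy` column standing in for a field marked “ignore this field” are hypothetical, and a column is treated as having nulls if any row holds `None` in it.

```python
def drop_columns_with_nulls(rows):
    """Remove every column that holds None in at least one row."""
    null_columns = {
        key for row in rows for key, value in row.items() if value is None
    }
    return [
        {key: value for key, value in row.items() if key not in null_columns}
        for row in rows
    ]


def drop_duplicate_rows(rows):
    """Keep the first occurrence of each distinct row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        fingerprint = tuple(sorted(row.items()))  # hashable view of the row
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(row)
    return unique


def drop_named_columns(rows, names):
    """Remove the listed columns, e.g. those flagged in a data dictionary."""
    names = set(names)
    return [
        {key: value for key, value in row.items() if key not in names}
        for row in rows
    ]


# Toy rows invented for illustration only (not the hackathon data).
toy_rows = [
    {"id": 1, "income": 100.0, "note": None, "legacy": "x"},
    {"id": 1, "income": 100.0, "note": "ok", "legacy": "x"},
    {"id": 2, "income": 250.0, "note": "ok", "legacy": "y"},
]

# Same order as the bullets: null columns, duplicates, ignored columns.
cleaned = drop_columns_with_nulls(toy_rows)
cleaned = drop_duplicate_rows(cleaned)
cleaned = drop_named_columns(cleaned, ["legacy"])
```

Dropping a column on a single null is the strictest reading of the first bullet. The slides do not say whether a missing-value threshold was used instead.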
Backup Slides
(bar plots of some of the dropped features)
FONDKAPREMONT Mode:
• Reg Oper Account has the highest claims.
Emergencystate
• Claim count is high for No emergency mode. Demand here will obviously remain high,
and hence premiums can be increased.
Housetype analysis
• Block of flats has high demand in terms of claims.
• Males have high claim counts for terraced house. So we can have more attractive offers for males
to increase the claim counts for “terraced house”.
Wallmaterial analysis
• Individuals living in houses of Panel or Stone, brick have higher claims.
• Claim count is high for females living in houses with Wooden and Mixed wall material, although
analysis of other features is also needed to reach a final conclusion.
Thank You
