
Spotle.ai Data Science Final


CAPSTONE PROJECT

Building a credit card analyser using a
Decision Tree classifier

By: Shazib Rahman

Objective:
To build a custom Credit Risk Analyzer, using a decision tree model, which will predict the
suspected credit card defaulters among the new applicants.

WHAT IS CREDIT ANALYSIS?

Credit analysis is a type of analysis an investor or bond
portfolio manager performs on companies or other debt-issuing
entities to measure the entity's ability to meet its debt
obligations. The credit analysis seeks to identify the
appropriate level of default risk associated with investing
in that particular entity.

Why credit card risk evaluation?

As the recent financial crisis deepened, financial
globalization and financial market volatility drew wide
attention, especially from banks and investors, who faced
unprecedented credit risk challenges. The credit crisis that
began in America showed that international banking was
challenged by a lack of effective methods for assessing and
controlling credit risk.

Why decision tree?

A decision tree is used to develop the credit risk analyzer.
This model suits the format of the data we have on previous
credit card holders.
METHODOLOGY USED

As mentioned above, we are using a decision tree for this
dataset. A random forest could also be used, because it is
itself built from decision trees. So what is a decision tree?
Decision Trees (DTs) are a non-parametric supervised learning
method used for classification and regression. The goal is to
create a model that predicts the value of a target variable by
learning simple decision rules inferred from the data features.
For instance, a decision tree can learn from data to
approximate a sine curve with a set of if-then-else decision
rules. The deeper the tree, the more complex the decision
rules and the more closely the model fits the training data.
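
The sine-curve illustration can be sketched roughly as follows. This is a minimal example on synthetic data, not the project's code; the sample size, the depths 2 and 5, and the helper name mse are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic data: 80 sorted points on [0, 5) and their sine values.
rng = np.random.RandomState(0)
X = np.sort(5 * rng.rand(80, 1), axis=0)
y = np.sin(X).ravel()

# A shallow tree learns a few coarse rules; a deeper tree learns more, finer rules.
shallow = DecisionTreeRegressor(max_depth=2, random_state=0).fit(X, y)
deep = DecisionTreeRegressor(max_depth=5, random_state=0).fit(X, y)


def mse(model):
    """Mean squared error of the model on the training points."""
    return float(np.mean((model.predict(X) - y) ** 2))


shallow_error = mse(shallow)
deep_error = mse(deep)

print("leaves:", shallow.get_n_leaves(), deep.get_n_leaves())
print("training MSE:", shallow_error, deep_error)
```

The deeper tree follows the training points more closely. A closer fit on training data does not by itself mean better predictions on new data, so depth is usually tuned against held-out data.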

So, we are using the sklearn package, which already provides
a decision tree algorithm.

DATA:
The dataset provided consists of 13 columns. The first 12
columns are independent variables and the last one is the
dependent variable, which tells us whether the card user is
a defaulter or not.
In the above picture you can see the values and the
column names.
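
Separating the 12 independent columns from the dependent column might look like the sketch below. The DataFrame here is a small hypothetical stand-in built in memory; the column names (feature_1 ... feature_12, default) and the values are placeholders, not the real dataset.

```python
import pandas as pd

# Hypothetical stand-in for the project data: 12 feature columns plus a target.
# Column names and values are placeholders, not the real ones.
columns = [f"feature_{i}" for i in range(1, 13)] + ["default"]
rows = [
    list(range(12)) + [0],
    list(range(12, 24)) + [1],
    list(range(24, 36)) + [0],
]
df = pd.DataFrame(rows, columns=columns)

X = df.iloc[:, :-1]  # first 12 columns: independent variables
y = df.iloc[:, -1]   # last column: dependent variable (defaulter or not)

print(X.shape, y.shape)
```

Selecting by position with iloc works here only because the target is known to be the last column; selecting by column name would be safer if the column order could change.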

Techniques used

I used the decision tree classifier, which can be imported
directly from the sklearn.tree module, along with the following:
1. Label encoder for converting the categorical values
to numeric values.
2. pandas to read the CSV files.
3. matplotlib, indirectly, through the built-in pandas hist
method, to draw histograms.
4. NumPy.
5. accuracy_score, classification_report and
confusion_matrix from sklearn.metrics for better analysis
of my model.
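
The tools listed above could fit together roughly as in this sketch. It runs on synthetic data: the columns income, housing and default, the rule that generates the labels, and the 75/25 split are all assumptions made for illustration, and train_test_split is an addition not mentioned in the list.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data; column names and values are made up.
rng = np.random.RandomState(42)
n = 200
df = pd.DataFrame({
    "income": rng.randint(20, 120, size=n),
    "housing": rng.choice(["own", "rent", "free"], size=n),
})
# Made-up labelling rule so the tree has a pattern to learn.
df["default"] = ((df["income"] < 50) & (df["housing"] == "rent")).astype(int)

# Convert the categorical column into integer codes.
encoder = LabelEncoder()
df["housing"] = encoder.fit_transform(df["housing"])

X = df[["income", "housing"]]
y = df["default"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit the classifier and evaluate it on the held-out rows.
model = DecisionTreeClassifier(criterion="entropy", random_state=0)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

accuracy = accuracy_score(y_test, y_pred)
matrix = confusion_matrix(y_test, y_pred)

print("accuracy:", accuracy)
print(matrix)
print(classification_report(y_test, y_pred))
```

Scores from this toy data say nothing about the real dataset; the sketch only shows the order of the steps.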

Result
1. I used two decision tree criteria: entropy first, and
then gini.
2. The accuracy score is 81% for entropy, whereas for gini
it is approximately 80%.
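
For reference, the two criteria measure the impurity of a set of class labels in different ways. The sketch below computes both using the standard definitions (Shannon entropy in bits and Gini impurity); the function names and the example label lists are illustrative and do not come from the project.

```python
import math
from collections import Counter


def class_proportions(labels):
    """Fraction of the labels belonging to each class."""
    counts = Counter(labels)
    total = len(labels)
    return [count / total for count in counts.values()]


def entropy(labels):
    """Shannon entropy in bits: sum of -p * log2(p) over the classes."""
    return sum(-p * math.log2(p) for p in class_proportions(labels))


def gini(labels):
    """Gini impurity: 1 minus the sum of p squared over the classes."""
    return 1.0 - sum(p * p for p in class_proportions(labels))


balanced = [0, 1, 0, 1]  # evenly mixed node: maximum impurity for two classes
pure = [1, 1, 1, 1]      # single-class node: no impurity

print(entropy(balanced), gini(balanced))
print(entropy(pure), gini(pure))
```

Both measures are zero for a pure node and largest for an evenly mixed one, which is why the two criteria often produce similar trees and similar accuracy.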
