
Credit Card Fraud Detection

By
Chandan Dutta
Anirban Chattaraj
Sreya Singh
Rajat Thakur
What Is Credit Card Fraud?
Credit card fraud is the unauthorized use of a credit card for personal
gain. It is a growing financial crime that causes losses for both
cardholders and financial institutions.
What Does Credit Card Fraud Detection Do?

Credit card fraud detection involves modeling past credit card
transactions, including those known to be fraudulent. The model is
then used to decide whether a new transaction is fraudulent. Our aim
is to detect as many fraudulent transactions as possible while
minimizing the number of legitimate transactions incorrectly flagged
as fraud.
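The two goals above correspond to the standard metrics recall (share of real fraud that is caught) and precision (share of fraud alerts that are correct). A minimal sketch follows; the counts are hypothetical and are not results from this project.

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of actual fraud cases that the model flagged."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp: int, fp: int) -> float:
    """Fraction of flagged transactions that were actually fraud."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical counts, for illustration only.
tp, fn, fp = 80, 20, 10   # caught fraud, missed fraud, false alarms
fraud_recall = recall(tp, fn)
fraud_precision = precision(tp, fp)
```

Raising recall usually costs some precision, so both numbers are worth reporting together.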
About the Dataset

We gathered the dataset from Kaggle
(https://www.kaggle.com/datasets/ealtman2019/credit-card-transactions/data?select=User0_credit_card_transactions.csv).
It has 13 columns and 691920 observations.
The target variable is treated as a factor (categorical) variable.
Phases of the Project
Data Collection and Preprocessing
Balancing the Dataset using Undersampling
Feature Selection
Exploratory Data Analysis
Model Creation and Evaluation
Model Deployment
Data Collection
Credit card fraud detection models rely on large datasets of transaction history.
Data collection involves gathering information from various sources, such as
transaction logs, customer profiles, and external databases.
Once data is collected, it is preprocessed to prepare it for model training.
This involves cleaning data, handling missing values, and transforming features
into a suitable format.
Preprocessing
In this dataset, only the "Merchant state" and "Zip" columns contain
NaN values.
Using SimpleImputer, we replace NaN values with the mean for numeric
columns and with the most frequent value for categorical columns.
After preprocessing, the result is saved to a CSV file.
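The imputation step could be sketched as follows, using scikit-learn's SimpleImputer. The helper name impute_missing and the toy rows are assumptions made for illustration; they are not taken from the project code or the real data.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer


def impute_missing(df: pd.DataFrame) -> pd.DataFrame:
    """Fill NaNs with the column mean (numeric) or most frequent value (other)."""
    out = df.copy()
    numeric_cols = out.select_dtypes(include="number").columns
    categorical_cols = out.columns.difference(numeric_cols)
    if len(numeric_cols) > 0:
        out[numeric_cols] = SimpleImputer(strategy="mean").fit_transform(
            out[numeric_cols]
        )
    if len(categorical_cols) > 0:
        out[categorical_cols] = SimpleImputer(strategy="most_frequent").fit_transform(
            out[categorical_cols].astype(object)
        )
    return out


# Toy rows standing in for the real data (values are made up).
toy = pd.DataFrame({
    "Merchant State": ["CA", np.nan, "CA", "NY"],
    "Zip": [91750.0, np.nan, 91754.0, 10001.0],
})
clean = impute_missing(toy)
```

The cleaned frame can then be written out with clean.to_csv(path, index=False), where path is whatever file name the project uses.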
Balancing the Dataset using UnderSampling

In this dataset, the is_fraud? column has two classes (Yes or No).
The Yes class has 872 rows and the No class has 691048 rows, so fraud
makes up only about 0.13% of the data.
We use undersampling to convert the imbalanced data into balanced
data.
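A minimal sketch of random undersampling with pandas, assuming a 1:1 class ratio is the target (the source does not state the ratio used). The function name undersample and the toy data are illustrative.

```python
import pandas as pd


def undersample(df: pd.DataFrame, target: str, random_state: int = 42) -> pd.DataFrame:
    """Randomly keep as many rows of every class as the smallest class has."""
    n_min = df[target].value_counts().min()
    parts = [
        group.sample(n=n_min, random_state=random_state)
        for _, group in df.groupby(target)
    ]
    # Shuffle so the classes are not stored in contiguous blocks.
    return (
        pd.concat(parts)
        .sample(frac=1, random_state=random_state)
        .reset_index(drop=True)
    )


# Toy data: 5 fraud rows and 95 legitimate rows (made up for illustration).
toy = pd.DataFrame({
    "Amount": range(100),
    "is_fraud?": ["Yes"] * 5 + ["No"] * 95,
})
balanced = undersample(toy, "is_fraud?")
```

Under the same 1:1 assumption, the real dataset would shrink to 872 rows per class, or 1744 rows in total.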
Feature Selection
Feature engineering is crucial for building an effective credit card
fraud detection model. It involves transforming raw data into
meaningful features that capture the underlying patterns of
fraudulent transactions.
We use a correlation matrix and an extra-trees classifier to identify
the most important features.
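Both rankings can be sketched with pandas and scikit-learn's ExtraTreesClassifier. The data below is synthetic, with column names invented for illustration, so only the mechanics carry over to the real dataset.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import ExtraTreesClassifier

# Synthetic data: one feature tied to the label, two pure-noise features.
rng = np.random.default_rng(0)
n = 400
y = rng.integers(0, 2, size=n)
X = pd.DataFrame({
    "signal": y + rng.normal(0, 0.3, size=n),
    "noise_a": rng.normal(size=n),
    "noise_b": rng.normal(size=n),
})

# Absolute Pearson correlation of each feature with the target.
corr = X.corrwith(pd.Series(y, name="target")).abs().sort_values(ascending=False)

# Impurity-based feature importances from an extra-trees ensemble.
model = ExtraTreesClassifier(n_estimators=100, random_state=0).fit(X, y)
importance = pd.Series(model.feature_importances_, index=X.columns)
importance = importance.sort_values(ascending=False)

# Keep the top-ranked feature (how many to keep is a project choice).
top_features = importance.head(1).index.tolist()
```

Correlation only captures linear relationships with the target, while tree-based importances can also pick up non-linear ones, which is one reason to look at both.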
Correlation Matrix
Feature Importance
EDA

Based on this analysis, most fraudulent transactions occurred in
swipe transactions, compared with other transaction types.
Analysis by transaction type of the year in which most fraudulent
transactions occurred.
Model Creation and Evaluation
In this project, I used the Random Forest classifier algorithm for the
best accuracy.
Random Forest is an ensemble learning method that combines the
predictions of multiple individual decision trees to make more
accurate and robust predictions.
The trained model is then saved.
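A minimal sketch of training, evaluating, and saving a Random Forest with scikit-learn. The data is synthetic with 8 features, the hyperparameters are library defaults, and saving with joblib under the file name fraud_model.joblib is an assumption; the source does not say how the model was persisted.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the balanced dataset: 8 features, binary label.
X, y = make_classification(n_samples=600, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# Rows of the confusion matrix are true classes, columns are predictions.
y_pred = model.predict(X_test)
cm = confusion_matrix(y_test, y_pred)

# Persist the fitted model and load it back (hypothetical file name).
model_path = os.path.join(tempfile.mkdtemp(), "fraud_model.joblib")
joblib.dump(model, model_path)
reloaded = joblib.load(model_path)
```

Evaluating on a held-out test split, as here, keeps the confusion matrix from being inflated by rows the model has already seen.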
Confusion Matrix
Credit Card Fraud Detection Web App

Flowchart: Start -> Run app.py -> Open Web App Homepage -> User Input
(8 feature values) -> Predicts -> Prints Fraud or Not
Web App
Conclusion
Credit cards are a convenient way to pay, but as with other payment tools, reliability is an
issue because they are subject to breaches and other fraud. To counter this problem, a
solution is needed that identifies patterns in transactions and flags the fraudulent ones, so
that such transactions can be caught early in the future. Machine learning is well suited to
this work because it finds patterns in data, and it can produce strong results when given
enough data. As the technology advances, machine learning will improve as well, making it
possible to predict more accurately whether a transaction is fraudulent.
