CREDIT CARD APPROVAL

A Course Project report submitted
in partial fulfillment of the requirements for the award of the degree

BACHELOR OF TECHNOLOGY
in
ARTIFICIAL INTELLIGENCE AND MACHINE LEARNING
by
Sai Teja Thatikonda (2103A52035)
Mithrasri Kadarla (2103A52050)
Gayapu Rishitha (2103A52016)
Mirupuri Siri Chandhana (2103A52057)

Under the guidance of


Ms. Neelima
Assistant Professor, Department of CSE.

Department of Computer Science and Artificial Intelligence



CERTIFICATE
This is to certify that the project entitled "CREDIT CARD APPROVAL" is the bonafide work carried out
by Sai Teja, Mithrasri, Rishitha, and Siri Chandhana as a Course Project in partial fulfillment of the
requirements for the award of the degree BACHELOR OF TECHNOLOGY in ARTIFICIAL INTELLIGENCE AND MACHINE
LEARNING during the academic year 2023-2024 under our guidance and supervision.

Mr. Prawin
Asst. Professor,
S R University,
Ananthasagar, Warangal.

Dr. M. Sheshikala
Assoc. Prof. & HOD (CSE),
S R University,
Ananthasagar, Warangal.

ACKNOWLEDGMENT
We express our thanks to the Course co-coordinator, Mr. Prawin, Asst. Prof., for guiding us from the
beginning through the end of the Course Project. We express our gratitude to the Head of the Department of
CS&AI, Dr. M. Sheshikala, Associate Professor, for her encouragement, support, and insightful suggestions.
We truly value their consistent feedback on our progress, which was always constructive and encouraging and
ultimately drove us in the right direction.

We wish to take this opportunity to express our sincere gratitude and a deep sense of respect to our
beloved Dean, School of Computer Science and Artificial Intelligence, Dr. C. V. Guru Rao, for his
continuous support and guidance in completing this project in the institute.

Finally, we thank all the teaching and non-teaching staff of the department for their suggestions and timely
support.
ABSTRACT
This study delves into credit card approval data through exploratory data analysis (EDA) to uncover influential
factors. We meticulously analyze income distribution, credit history, and debt-to-income ratios to identify
patterns and indicators of approval likelihood. Transitioning to presentation, we utilize Power BI for dynamic
visualizations, offering comprehensive insights. Our findings highlight approval rate trends and credit-related
variables, aiding financial stakeholders. Through EDA, we establish a robust understanding of the credit
approval landscape. Leveraging Power BI, we create interactive dashboards and trend analyses, facilitating
informed decision-making. This research enhances understanding of credit risk assessment and decision-
making processes. By distilling complex data into actionable insights, we empower stakeholders in the
financial sector.
About the Organization
The 45-year-old Sri Rajeshwara Educational Society, the parent organization of SR University, is a
conglomerate of educational institutions with over 90,000 students and 10,000 non-teaching staff members.
SR Educational Academy manages 95 educational institutions in Telangana and Andhra Pradesh.

The mission of SR University is to establish a cutting-edge learning environment that produces graduates
who will have a major impact on the development of Telangana and India. We intend to use three crucial
differentiators to completely overhaul the educational system. Everyone has the opportunity to participate,
flourish, and leave a lasting impression through the co-curricular, extracurricular, and curricular options
offered by the collaborative entrepreneurial environment. The system has close connections to both global
academic institutions and business.
Table of Contents

Chapter No. Title

1. Introduction
1.1. Overview
1.2. Problem Statement
1.3. Existing system
1.4. Proposed system
1.5. Objectives
1.6. Architecture

2. Literature survey

3. Data pre-processing
3.1. Dataset description
3.2. Data Wrangling, Data Acquisition
3.3. Data cleaning
3.4. Data Visualization

4. Methodology
4.1. Proposed Approach/ Algorithm/Model
4.2. Implementation
4.3. Model Architecture
4.4. Software Description

5. Results and discussion

6. Conclusion and Future scope

7. References

1. INTRODUCTION

1.1 Overview

Description:
1. Objective: The main purpose of this project is to build a system that evaluates various factors and
determines whether an individual's credit card application should be approved.
2. Technologies Used:
• HTML and CSS: Used to create the user interface (UI) of the system, providing a visually appealing
and interactive platform for users to enter their information.
• JavaScript: Used for client-side scripting to enhance user interface interactivity and validate user
input.
• Exploratory Data Analysis (EDA): Used to explore and understand the historical credit card
application records. The steps include data analysis, data visualization, and data cleaning.
• Power BI: Used to create dashboards and visualizations based on the EDA results. These
visualizations help reveal patterns and trends in the data.
3. Data Collection: The project requires a dataset of past credit card applications, with features such as
income, credit score, employment status, and debt-to-income ratio.
4. Data Preprocessing: Before data is fed into the prediction model, steps such as handling missing
values, encoding categorical variables, and scaling numeric features are performed to ensure data quality
and consistency.
5. Model Development: Machine learning algorithms will be used to create predictive models. The
model will be trained on historical data to learn the patterns and relationships between the input features
and the target variable (the card approval decision).
6. Model Evaluation: The performance of the prediction model will be evaluated using appropriate
metrics such as accuracy, precision, recall, and F1 score. This step checks that the model makes
reliable predictions that generalize to unseen data.
7. Integration: Once the predictive model is trained and evaluated, it is integrated into the web interface
built with HTML, CSS, and JavaScript.
8. Deployment: Finally, the finished system will be deployed on a platform suitable for its users.
Regular maintenance and updates are required to ensure efficient and correct operation.
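
The metrics named in the Model Evaluation step can be computed directly from the counts of correct and incorrect predictions. The following is a minimal sketch, assuming binary labels where 1 means "approved" and 0 means "rejected". The function name and the sample labels are illustrative only and are not taken from the project.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision, recall and F1 for binary labels (1 = approved)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # approved, predicted approved
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # rejected, predicted rejected
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # rejected, predicted approved
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # approved, predicted rejected

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Illustrative labels only (not project data).
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Precision and recall are reported alongside accuracy because a wrongly approved application and a wrongly rejected one carry different costs, and accuracy alone does not separate the two.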

1.2. Problem Statement
Processing credit card applications often involves manual checks, which lead to inefficiencies, delays, and
inconsistent decisions. The manual process is also prone to errors, resulting in negative
consequences for both applicants and financial institutions.

The system is intended to improve the following aspects:

1. Efficiency

2. Accuracy

3. Risk Management

4. Customer Experience

5. Scalability

6. User-Friendliness

Requirement Analysis:

1. Functional Requirements:

• User Registration and Login:
  • Users must be able to register an account and log in to a secure system.

• Credit Card Application Submission:
  • Applicants must be able to apply for a credit card by providing the necessary information.

• Automated Approval Process:
  • The system should automate the credit card approval process based on predefined criteria.

• Application Status Tracking:
  • Applicants must be able to track the status of their credit card applications.

• Data Analysis and Visualization:
  • Perform exploratory data analysis (EDA) on application data and credit history.
  • Use Power BI to create interactive visualizations and dashboards that provide insights into the data.

• Machine Learning Model Integration:
  • Use machine learning models to optimize the credit card approval process.
  • Estimate creditworthiness and approval probability based on historical data.
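
The "predefined criteria" requirement can be illustrated with a small rule-based check. This is only a sketch: the field names and every threshold below are hypothetical values chosen for illustration, not criteria defined in this project.

```python
from dataclasses import dataclass


@dataclass
class Application:
    annual_income: float
    credit_score: int
    debt_to_income: float  # monthly debt payments divided by monthly income
    employed: bool


# Hypothetical thresholds, for illustration only.
MIN_INCOME = 25_000
MIN_CREDIT_SCORE = 650
MAX_DEBT_TO_INCOME = 0.40


def evaluate(app):
    """Return (approved, reasons); reasons lists every failed criterion."""
    reasons = []
    if app.annual_income < MIN_INCOME:
        reasons.append("income below minimum")
    if app.credit_score < MIN_CREDIT_SCORE:
        reasons.append("credit score below minimum")
    if app.debt_to_income > MAX_DEBT_TO_INCOME:
        reasons.append("debt-to-income ratio too high")
    if not app.employed:
        reasons.append("applicant not employed")
    return (not reasons, reasons)


approved, reasons = evaluate(Application(60_000, 720, 0.25, True))
print(approved, reasons)
```

Returning the failed criteria together with the decision lets the status-tracking page tell an applicant why an application was rejected.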

2. Non-Functional Requirements:

• Security:

• Ensure appropriate security to protect sensitive client information.


• Scalability:

• Design systems that can accommodate increasing numbers of users and data volumes.
• Performance:

• Make sure the system can handle multiple users simultaneously without performance
degradation.
• User Experience:

• Create a user-friendly interface that is intuitive and easy to navigate.


Risk Analysis:

1. Data Security Risks:

• Risk: Unauthorized access to, or disclosure of, the sensitive information that applicants provide.

• Mitigation: Use encryption, access control, and regular security audits to protect data.
2. Model Accuracy Risks:

• Risk: Machine learning models may provide inaccurate predictions, resulting in incorrect credit
card approvals or rejections.

• Mitigation: Continuously validate and update machine learning models based on new data. Use
model interpretation techniques to understand the models' predictions.
3. Regulatory Compliance Risks:

• Risk: Failure to comply with data protection laws (e.g. GDPR, CCPA) and industry standards.

• Mitigation: Ensure systems comply with all applicable regulations and standards. Conduct regular
audits to identify and resolve compliance issues.
4. Scalability Risks:

• Risk: The system cannot handle increasing data volumes and user traffic.

• Mitigation: Design the system with scalability in mind. Leverage cloud-based solutions and
monitor system performance to detect scalability issues early.

Feasibility Analysis:

1. Technical Feasibility:

• Availability of Required Technologies:
  • Check the availability of required technologies such as Python, Power BI, and data
    management tools.

• System Integration:
  • Evaluate the feasibility of integrating the different components, including data analytics,
    machine learning, and visualization.

• Scalability and Performance:
  • Evaluate the feasibility of building a scalable system that can handle large amounts
    of data and user traffic.

2. Operational Feasibility:

• User Acceptance:
  • Check users' willingness to accept and use the system effectively.

• Training and Support:
  • Evaluate the feasibility of providing training and support to users so that they can use
    the system correctly.

3. Financial Feasibility:

• Cost Analysis:
  • Estimate costs associated with development, implementation, and maintenance.

• Return on Investment (ROI):
  • Evaluate the potential return on investment by considering the benefits of automating
    the credit card approval process, improving the decision process, and improving the
    user experience.

1.3. Existing System

Some examples of existing solutions:

1. Credit Karma:

Similarity: Credit Karma is a popular platform that offers free credit scores and credit monitoring services to
users.

Differentiation:

• While Credit Karma provides users with information about their credit scores and credit status,
our project focuses specifically on streamlining the credit card approval process.
• Unlike Credit Karma, which mainly provides information to users, our project involves creating a
system with which financial institutions can evaluate credit card applications for profitability and accuracy.
• Our project incorporates technologies such as deep learning-based credit scoring models and
blockchain-based authentication to deliver solutions beyond what Credit Karma's services offer.

2. FICO Score:

Similarity: FICO Score is a well-known credit scoring system used by many financial institutions to assess
creditworthiness.

Differentiation:

• While the FICO Score provides individuals with a numerical credit score, our project goes one step
further and simplifies the entire credit card approval process.
• Unlike the FICO Score, which relies on traditional credit scoring, our project uses technologies like
deep learning and explainable AI to make better-informed decisions transparently.
• Our project also addresses the need for secure authentication through blockchain technology,
increasing the security and reliability of the credit card approval process.

3. LendingClub:

Similarity: LendingClub is a peer-to-peer lending platform that connects borrowers with investors.

Differentiation:

• While LendingClub helps people get loans, our project focuses on the decision to approve credit cards
at financial institutions.
• Our project focuses on credit cards, which have different risk and approval standards, unlike
LendingClub, which primarily services personal loans.
• Our project is designed to improve the credit card approval process for applicants and financial
institutions by providing better and more transparent solutions than traditional methodologies.

In general, although these solutions differ in terms of specific products and target audiences, they all share
the goal of providing credit-related services and assessing creditworthiness in the financial industry.

1.4. Proposed System
The Credit Card Approval project aims to create a web application for approving or rejecting credit card
applications based on various criteria. The system integrates HTML, CSS, JavaScript, and Power BI for data
analysis and visualization to facilitate decision-making.

Proposed System:

1. Frontend Development:

• HTML & CSS: Used to build the user interface (UI) of the web application, including the
application forms and page templates that shape the user experience.

• JavaScript: Implements client-side scripts that add interactivity to the user interface, such as
form validation and dynamic updates based on user input.

2. Exploratory Data Analysis (EDA):

• EDA is used to understand the credit card application data, identify patterns, and understand the
factors that influence approval decisions.

• Power BI is used for in-depth visualization of the data, its distributions, relationships, and
results, and helps identify important features for credit evaluation.

3. Backend Development:

• Python: Used for backend development, data processing, analysis, and the machine learning
algorithms for credit card approval.

• Flask or Django: A framework for creating the backend server and API endpoints that enable
communication between frontend and backend components.

4. Machine Learning Model:

• It is built using Python libraries such as scikit-learn, pandas, and NumPy.

• A classification model estimates the likelihood of approval for a credit card application from
applicant characteristics such as income, credit score, and employment.

5. Database Management:

• SQLite: A lightweight and easy-to-use database management system used to store application
data securely.

• Application data is stored in a relational database, ensuring efficient data retrieval
and management.

6. Integration:

• The front end calls the back-end APIs to submit application data for analysis.

• The results of the machine learning model are sent back to the front end for display, indicating
whether the credit card application has been approved or not.

7. Security:

• Use encryption techniques and secure coding practices to protect sensitive customer information.

• Use the HTTPS protocol to provide secure communication between client and server.

8. Testing and Deployment:

• Conduct tests to ensure the correctness and accuracy of the system.

• Deploy on local or cloud-based web servers to provide easy user access while maintaining
performance.

By implementing this design, the Credit Card Approval project aims to automate and optimize the
credit card application process, provide applicants with a user-friendly interface, and use data analysis and
machine learning to make well-founded decisions.
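
The database component of the proposed system can be sketched with Python's built-in sqlite3 module. The table name, column names, and helper functions below are assumptions made for illustration, and an in-memory database stands in for the real database file.

```python
import sqlite3


def create_schema(conn):
    # Hypothetical schema: one row per submitted application.
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS applications (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            applicant_name TEXT NOT NULL,
            annual_income REAL NOT NULL,
            status TEXT NOT NULL DEFAULT 'pending'
                CHECK (status IN ('pending', 'approved', 'rejected'))
        )
        """
    )


def submit_application(conn, name, income):
    # Parameterized query: user input is never concatenated into the SQL text.
    cur = conn.execute(
        "INSERT INTO applications (applicant_name, annual_income) VALUES (?, ?)",
        (name, income),
    )
    conn.commit()
    return cur.lastrowid


def set_status(conn, app_id, status):
    conn.execute("UPDATE applications SET status = ? WHERE id = ?", (status, app_id))
    conn.commit()


def get_status(conn, app_id):
    row = conn.execute(
        "SELECT status FROM applications WHERE id = ?", (app_id,)
    ).fetchone()
    return row[0] if row else None


conn = sqlite3.connect(":memory:")  # in-memory database for the example
create_schema(conn)
app_id = submit_application(conn, "Test Applicant", 52000.0)
initial_status = get_status(conn, app_id)
set_status(conn, app_id, "approved")
final_status = get_status(conn, app_id)
print(initial_status, final_status)
```

The `?` placeholders keep form input out of the SQL string, which is one of the secure coding practices listed under Security. The same `get_status` lookup could back the application status tracking page.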

1.5. Objectives
1. Automate Credit Card Approval Process:

• Develop a system that automates the credit card approval process based on various criteria,
such as applicant's income, credit history, employment status, etc.

• The system should efficiently process applications and provide real-time approval or rejection
decisions.

2. Enhance User Experience:

• Design a user-friendly interface using HTML, CSS, and JavaScript to facilitate smooth
interaction with the system.

• Ensure that applicants can easily submit their information and track the status of their credit
card application.

3. Perform Exploratory Data Analysis (EDA):

• Conduct comprehensive exploratory data analysis (EDA) to gain insights into the credit card
application dataset.

• Identify patterns, trends, and correlations within the data to inform decision-making in the
approval process.

4. Implement Interactive Visualizations:

• Utilize Power BI to create interactive visualizations that provide stakeholders with intuitive
insights into the credit card application data.

• Visualize key metrics, such as approval rates, application demographics, and credit score
distributions, to facilitate data-driven decision-making.

5. Ensure Data Security and Privacy:

• Implement robust security measures to protect sensitive applicant information throughout the
credit card approval process.

• Adhere to regulatory standards and best practices to safeguard data privacy and prevent
unauthorized access or data breaches.

6. Optimize Decision-Making with Machine Learning:

• Explore machine learning algorithms to optimize the credit card approval process.

• Train predictive models using historical credit card application data to predict the likelihood of
default and assess creditworthiness accurately.

7. Facilitate Collaboration and Integration:

• Integrate different components of the system, including data analysis, visualization, and
decision-making, to streamline workflows and enhance efficiency.

8. Monitor Performance and Compliance:

• Implement monitoring mechanisms to track system performance, application throughput, and
approval accuracy.

• Ensure compliance with regulatory guidelines and internal policies governing credit card
approval processes.

9. Continuous Improvement and Iteration:

• Foster a culture of continuous improvement by soliciting feedback from stakeholders and
iterating on the system based on insights gained from data analysis and user experience.

• Regularly update machine learning models and decision-making algorithms to adapt to
changing market conditions and improve predictive accuracy.
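
One of the key metrics named in the objectives, the approval rate per applicant group, can be computed before it is visualized. This is a minimal sketch with made-up records. The `approved` field and the helper name are assumptions for illustration.

```python
from collections import defaultdict


def approval_rate_by(records, key):
    """Return {group value: share of approved applications} for the given field."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for record in records:
        group = record[key]
        totals[group] += 1
        if record["approved"]:
            approved[group] += 1
    return {group: approved[group] / totals[group] for group in totals}


# Illustrative records only (not project data).
records = [
    {"income_type": "Working", "approved": True},
    {"income_type": "Working", "approved": False},
    {"income_type": "Pensioner", "approved": True},
    {"income_type": "Working", "approved": True},
]
rates = approval_rate_by(records, "income_type")
print(rates)
```

A table of this shape, with one row per group and its approval rate, is the kind of summary a dashboard bar chart would display.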

1.6. Architecture
Presentation Layer

• HTML/CSS/JavaScript

Application Layer

• Backend Logic
• Data Visualization
• Integration

Data Analysis

• Exploratory Data Analysis
• Machine Learning

Visualization Layer

• Power BI Visualizations
• Dashboards

Database

• MySQL

2. LITERATURE SURVEY
Paper Title: Credit Card Approval Prediction: A Systematic Literature Review
Authors: Indrajani Sutedja, Jacky Lim, Erick Setiawan, Frans Rexy Adiputra
Year: 2024
Approach: A systematic literature review aimed at finding the best machine learning algorithms for credit card approval.
Journal/Publisher: Journal of Theoretical and Applied Information Technology, Little Lion Scientific
Research Gap: Lack of detailed literature review on existing studies using machine learning for credit card approval prediction.
Future Scope: Exploring the potential of other machine learning algorithms for credit card approval prediction.
Dataset Used: Accurate Loan Approval Prediction Based on Machine Learning Approach

Paper Title: Credit Card Approval Prediction using Machine Learning
Authors: Sawan Nihatkar, Aditya Shahari, Tushar Patil, Atharva Kulkarni, Ms. Suvarna Potdukhe
Year: 2023
Approach: Implementing machine learning algorithms like logistic regression and random forest for credit card approval prediction.
Journal/Publisher: International Journal of Advanced Research in Science, Communication and Technology (IJARSCT)
Research Gap: The need for improved accuracy and transparency in credit card approval processes.
Future Scope: Enhanced machine learning models for more accurate credit card approval prediction.
Dataset Used: Not provided (literature review)

Paper Title: Credit Card Analytics: A Review of Fraud Detection and Risk Assessment Techniques
Author: Kaushikkumar Patel
Year: 2023
Approach: Review article summarizing methodologies and techniques in credit card fraud detection and risk assessment.
Journal/Publisher: International Journal of Computer Trends and Technology (IJCTT)
Research Gap: Despite advancements, challenges persist in fraud detection and credit risk assessment, such as imbalanced datasets and evolving fraudulent techniques.
Future Scope: Future directions include proactive fraud detection using technologies like artificial intelligence and integrating multiple data sources for enhanced fraud detection capabilities.
Dataset Used: Fraud Detection and Risk Assessment Techniques

Paper Title: Credit Card Score Prediction using Machine Learning Models: A New Dataset
Authors: Anas Arram, Masri Ayob, Mustafa Abbas Abbood Albadr
Year: 2023
Approach: Investigating machine learning models for credit card default prediction using a new dataset.
Journal/Publisher: Not provided in the text.
Research Gap: Lack of previous studies utilizing the newly proposed dataset for credit card score prediction.
Future Scope: The future work aims to enhance the MLP model's accuracy and recall values to make it a more powerful model for credit card scoring.
Dataset Used: A new dataset from an American bank is used, which includes credit card transaction histories and customer profiles.

Paper Title: Predicting Credit Card Approval Using Machine Learning Models
Author: Semasuka
Year: 2023
Approach: Comparing the accuracy of machine learning algorithms in predicting credit card approval using a dataset from Kaggle.
Journal/Publisher: Not provided
Research Gap: Lack of detailed literature review on the existing studies using machine learning for credit card approval prediction.
Future Scope: Exploring the potential of other machine learning algorithms for credit card approval prediction.
Dataset Used: Credit card approval dataset from Kaggle, preprocessed for model training and evaluation.

Paper Title: Credit Card Approval Prediction by Using Machine Learning Techniques
Author: M.P. C. Peiris
Year: 2022
Approach: Artificial Neural Network (ANN), Linear SVM
Results: Artificial Neural Network (ANN) accuracy: 0.78. Linear SVM accuracy: 0.71. Nonlinear SVM accuracy: 0.88.
Research Gap: While previous studies have investigated predictive models for credit risk assessment, there is a lack of comparative analysis specifically focusing on the Sri Lankan context.
Future Scope: Exploring additional predictive models for credit risk assessment. Investigating the impact of additional features on model performance.
Dataset Used: UCI Repository credit card defaulter

Paper Title: Comparative Analysis of XGBoost Classifier and Decision Tree for Credit Card Approval Prediction
Author: Pathipati Yasasvi
Year: 2022
Approach: XGBoost Classifier accuracy, Decision Tree accuracy, statistical analysis
Results: XGBoost Classifier accuracy: 87.96%. Decision Tree accuracy: 79.38%. Accuracy of XGBoost Classifier and Decision Tree (p = 0.001, p < 0.05).
Research Gap: Although previous studies have investigated the recognition of credit cards using various machine learning algorithms, there is no comparison between XGBoost classifiers and decision trees specifically for this task.
Future Scope: Future research could explore the performance of other machine learning algorithms and ensemble methods for credit card approval prediction.
Dataset Used: Kaggle. Link: https://www.kaggle.com/code/wisartthongyoy/credit-card-approval-prediction

Paper Title: Predicting Credit Card Approval: A Comparative Analysis of Machine Learning Models
Author: Harsha Vardhan Peela
Year: 2022
Approach: Random Forest and Logistic Regression models
Results: Both Random Forest and Logistic Regression models achieved an accuracy of 86%.
Research Gap: There is a lack of comparative analysis between different machine learning models, such as random forest and logistic regression, specifically for this task.
Future Scope: Exploring advanced machine learning techniques for credit card approval prediction.
Dataset Used: http://localhost:8888/notebooks/Downloads/MajorProjectBM/Predicting%20Credit%20Card%20Approvals/notebook.ipynb

Paper Title: An empirical study for credit card approvals in the Greek banking sector
Authors: Maria Mavri and George Ioannou
Year: 2021
Approach: Logistic Regression Analysis
Results: Model's overall correctness of estimated percentages: 71.87% (Accepted and Rejected)
Research Gap: Absence of specific accuracy metrics or comparisons with other predictive models.
Future Scope: Explore the integration of additional factors to enhance the model's predictive capabilities. Investigate other machine learning techniques for credit card approval prediction.
Dataset Used: Data sample from a large Greek bank (private dataset)

Paper Title: CREDIT CARD APPROVAL VERIFICATION MODEL
Author: Umabhanu Tanikella
Year: 2021
Approach: Logistic Regression, Random Forest Classifier
Results: The percentage of both approved (55.5%) and denied (44.5%) records in the dataset.
Research Gap: Specific accuracy values and details of dataset characteristics are not fully presented. The study lacks a comprehensive comparison of the Logistic Regression and Random Forest models with other methods.
Future Scope: Validate the model on diverse datasets and real-world scenarios. Investigate ensemble methods to enhance prediction robustness.
Dataset Used: Irvine machine learning repository. Link: https://www.icpsr.umich.edu/web/NAHDAP/studies/2751/versions/V1

Paper Title: MACHINE ASSISTANCE FOR CREDIT CARD APPROVAL Random Wheel can Recommend and Explain
Author: ANUPAM KHAN
Year: 2021
Approach: Random Wheel Classifier
Results: Random wheel classifier: Accuracy: 0.8681, Precision: 0.8763, F-measure: 0.8685, Kappa: 0.7368
Research Gap: Specific accuracy or effectiveness values are not mentioned. The paper lacks a comparison with other specific classifiers.
Future Scope: Explore the application of the Random Wheel Classifier in other sensitive areas beyond credit card approvals.
Dataset Used: Irvine (UCI) machine learning repository. Link: https://www.icpsr.umich.edu/web/NAHDAP/studies/2751/versions/V1

Paper Title: Comparison of Data Mining Algorithms in Credit Card Approval
Author: Wilson Muange Musyoka
Year: 2021
Approach: Bayesian Networks, Decision Table Algorithm
Results: Bayesian Networks: correctly classified instances: 595 (86.2139%). Decision Table Algorithm: correctly classified instances: 575 (83.333%). Average Kappa score: 0.786563
Research Gap: No comparison with state-of-the-art classifiers or more advanced techniques.
Future Scope: Explore other advanced data mining algorithms for credit card approval. Investigate ensemble methods to improve classification accuracy.
Dataset Used: Machine-learning repository of University of California. Link: https://archive.ics.uci.edu/dataset/27/credit+approval

Paper Title: A Comparative Study on Credit Card Fraud Detection
Authors: S. Ghoshal, D. Bose, K. Deb
Year: 2021
Approach: Self Organising Map (SOM), K-means clustering, PCA vs t-SNE
Results: SOM produces better results than other methodologies discussed.
Research Gap: Achieving 90% accuracy for fraud detection lacks context and specific details about false positives and false negatives.
Future Scope: Enhance system security with the use of certificates for both merchants and customers. Continuously update the system to incorporate new checks and patterns of fraudulent transactions.
Dataset Used: Statlog Australian Credit Card Approval Dataset. Link: https://archive.ics.uci.edu/dataset/143/statlog+australian+credit+approval

Paper Title: Research on Default Prediction for Credit Card Users Based on XGBoost-LSTM Model
Author: Jing Gao
Year: 2021
Approach: LSTM and the XGBoost-LSTM fusion model
Results: LSTM and the XGBoost-LSTM fusion model achieved high accuracy in predicting credit card default without manual feature extraction.
Research Gap: Lack of comparative analysis between XGBoost, LSTM, and hybrid models specifically focusing on transaction flow data.
Future Scope: Further improving prediction accuracy by incorporating additional features. Exploring other deep learning architectures for credit card default prediction.
Dataset Used: Not available

Paper Title: Investigating the Performance of Naïve-Bayes Classifiers and K-Nearest Neighbor Classifiers
Author: Mohammed J. Islam
Year: 2020
Approach: Bayesian Inference, Naive Bayes Classifier, K-Nearest Neighbor Classifier
Results: Naive Bayes Classifier: 12.43% misclassification error on the credit card approval testing dataset. K-Nearest Neighbor (k=5): around 9.45% classification error on the credit card approval testing dataset.
Research Gap: 12.43% misclassification error on the credit card approval testing dataset. Around 9.45% classification error on the credit card approval testing dataset.
Future Scope: Consider additional evaluation metrics beyond classification error, such as precision, recall, and F1-score.
Dataset Used: Credit card approval (source not mentioned)

Paper Title: EXPLORATORY DATA ANALYSIS FOR LOAN PREDICTION
Author: Vinuthna Ramavarapu
Year: 2020
Approach: Automated Prediction System for Loan Classification
Results: The application efficiently classifies loan applicants to prevent financial losses. It meets banker requirements, though exact performance metrics are not mentioned.
Research Gap: The paper lacks specific details about performance metrics, technical implementation, and validation on real-world data.
Future Scope: The system can be enhanced for security, reliability, and dynamic weight adjustments. Integration with automated processing systems is possible.
Dataset Used: No information; the dataset was private (the data is collected by mining the data), using the Weka tool.

Paper Title: Accurate Loan Approval Prediction Based on Machine Learning Approach
Authors: J. Tejaswini, T. Mohana Kavya, R. Devi Naga Ramya, P. Sai Triveni, Venkata Rao Maddumala
Year: 2020
Approach: Logistic Regression (LR), Decision Tree (DT), and Random Forest (RF).
Results: The Decision Tree (DT) algorithm provides better accuracy in loan approval prediction compared to Logistic Regression and Random Forest. The accuracy, precision
Research Gap: The paper doesn't elaborate on the potential limitations or challenges faced during the study.
Future Scope: Further research could explore the integration of real-time data to enhance prediction accuracy. More advanced machine learning techniques, such as ensemble methods or deep learning, could be investigated for loan approval prediction.
Dataset Used: Not mentioned; described only as data selection using info gain of features.

Paper Title: Credit Card Approval Prediction using Classification Algorithms
Authors: Naman Dalsania, Devang Punatar, and Deep Kothari
Year: 2019
Approach: Support Vector Classifier (SVC), Adaboost Classifier, and Gradient Boosting Classifier.
Results: The performance of the models is evaluated using several metrics, including accuracy, precision, recall, and F1-score. The highest accuracy achieved is 90% by the Gradient Boosting Classifier.
Research Gap: The evaluation metrics provided are limited to accuracy, precision, recall, and F1-score, potentially missing other important aspects of model performance.
Future Scope: The paper suggests the possibility of using deep learning models, such as neural networks, to improve prediction accuracy further.
Dataset Used: UCI Machine Learning Repository. Link: https://archive.ics.uci.edu/dataset/27/credit+approval

Paper Title: Improve Accuracy in Prediction of Credit Card Approval Using Novel XGBoost Compared with Random Forest
Authors: Pathipati Yasasvi and S. Magesh Kumar
Year: 2019
Approach: XGBoost Classifier, Random Forest Algorithm
Results: XGBoost Classifier: mean accuracy = 87.97%, mean loss = 12.03%. Random Forest Algorithm: mean accuracy = 82.86%, mean loss = 17.14%.
Research Gap: The paper doesn't explicitly mention demerits, but it seems that Random Forest had lower accuracy and higher loss compared to XGBoost.
Future Scope: There is also potential for further research and improvement in credit card approval prediction using different machine learning algorithms and techniques.
Dataset Used: Kaggle. Link: https://www.kaggle.com/code/wisartthongyoy/credit-card-approval-prediction

Paper Title: An Approach for Prediction of Loan Approval using Machine Learning Algorithm
Authors: Yash Diwate, Prashant Rana, and Pratik Chavan
Year: 2019
Approach: Logistic Regression
Results: Model performance measures such as sensitivity and specificity were computed.
Research Gap: Not explicitly mentioned.
Future Scope: Recommends evaluating other attributes of customers for better credit granting decisions.
Dataset Used: Kaggle. Link: https://www.kaggle.com/datasets/atulmittal199174/credit-risk-analysis-for-extending-bank-loans

3. DATA PRE-PROCESSING
3.1. Dataset Description

In this project we are using 2 datasets they are application_record and credit_record.

1. application_record:

• ID: Unique identifier for each record.

• CODE_GENDER: Gender of the applicant (e.g., Male, Female).

• FLAG_OWN_CAR: Binary indicator (0 or 1) indicating whether the applicant owns a car.

• FLAG_OWN_REALTY: Binary indicator (0 or 1) indicating whether the applicant owns real estate.

• CNT_CHILDREN: Number of children the applicant has.

• AMT_INCOME_TOTAL: Total income of the applicant.

• NAME_INCOME_TYPE: Type of income (e.g., Working, Commercial Associate, Pensioner, etc.).

• NAME_EDUCATION_TYPE: Highest level of education attained by the applicant.

• NAME_FAMILY_STATUS: Family status of the applicant (e.g., Married, Single, Widow, etc.).

• NAME_HOUSING_TYPE: Type of housing the applicant resides in (e.g., House/Apartment, With parents, Municipal apartment, etc.).

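• DAYS_BIRTH: Applicant's date of birth, stored as a negative day count relative to the application
date; the absolute value is the applicant's age in days.

• DAYS_EMPLOYED: Start of the applicant's employment, stored as a negative day count relative to the
application date; the absolute value is the number of days employed before the application.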

• FLAG_MOBIL: Binary indicator (0 or 1) indicating whether the applicant has a mobile phone.

• FLAG_WORK_PHONE: Binary indicator (0 or 1) indicating whether the applicant has a work phone.

• FLAG_PHONE: Binary indicator (0 or 1) indicating whether the applicant has a phone.

• FLAG_EMAIL: Binary indicator (0 or 1) indicating whether the applicant has an email.

• OCCUPATION_TYPE: Occupation of the applicant.

• CNT_FAM_MEMBERS: Number of family members.

2. credit_record:

• ID: Unique identifier for each record, linking to the corresponding applicant in the
application_record dataset.

• months: Number of months since the record was created.

• status: Credit status of the applicant at the given month (e.g., 0: No DPD (Days Past Due), 1:
DPD 1-30 days, 2: DPD 31-60 days, etc.).

These datasets provide detailed information about applicants, including demographic, financial, and
employment details as well as credit histories. The application_record dataset contains static information
about applicants, while the credit_record dataset contains credit-related information that is updated monthly.
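
The relationship between the two datasets can be illustrated with a minimal pandas sketch. The rows below are made-up stand-ins, not project data; the `months` and `status` column names follow the description above, and the derived names (`AGE_YEARS`, `YEARS_EMPLOYED`, `months_on_record`, `worst_status`) are hypothetical names chosen for this illustration.

```python
import pandas as pd

# Tiny stand-in frames (made-up values); the real data would be read from the project's files.
application_record = pd.DataFrame({
    "ID": [1, 2, 3],
    "DAYS_BIRTH": [-12005, -21474, -19110],
    "DAYS_EMPLOYED": [-4542, -1134, -3051],
})
credit_record = pd.DataFrame({
    "ID": [1, 1, 2, 2, 3],
    "months": [0, -1, 0, -1, 0],
    "status": ["0", "1", "0", "0", "2"],
})

# Day counts are stored as negative numbers, so negate them before converting to years.
application_record["AGE_YEARS"] = (-application_record["DAYS_BIRTH"] / 365.25).round(1)
application_record["YEARS_EMPLOYED"] = (-application_record["DAYS_EMPLOYED"] / 365.25).round(1)

# Collapse the monthly history into one row per applicant.
# Non-numeric status codes, if any, become NaN and are ignored by max().
history = (
    credit_record
    .assign(status_num=pd.to_numeric(credit_record["status"], errors="coerce"))
    .groupby("ID")
    .agg(months_on_record=("months", "count"), worst_status=("status_num", "max"))
    .reset_index()
)

# Join the static applicant attributes to the summarized credit history on ID.
merged = application_record.merge(history, on="ID", how="inner")
print(merged)
```

Summarizing the monthly records before the join keeps one row per applicant, which is the shape most classifiers expect.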

3.2. Data Wrangling and Acquisition:

• Data wrangling for credit card approval prediction involves cleaning and preparing raw data by addressing
missing values, errors, and inconsistencies. This includes encoding categorical variables, scaling numerical
features, and selecting relevant features for model training. Data splitting and validation are
important to ensure the quality and integrity of the data used for model evaluation.

• Data acquisition involves collecting or retrieving raw data from various sources such as repositories,
APIs, databases, or manual entries. This process ensures that the data needed for analysis or
modeling is readily available for further processing. Data acquisition is an
important step in the data analysis and modeling process and forms the basis for subsequent data
cleaning, preparation, and review activities.
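
The encoding and scaling steps mentioned above could look like the following sketch, assuming scikit-learn's `ColumnTransformer` is used. The four applicant rows are invented for illustration, and the choice of `StandardScaler` and `OneHotEncoder` is an assumption rather than the project's confirmed configuration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Made-up applicant rows using column names from the dataset description.
applicants = pd.DataFrame({
    "AMT_INCOME_TOTAL": [112500.0, 270000.0, 157500.0, 90000.0],
    "CNT_CHILDREN": [0, 1, 0, 2],
    "NAME_INCOME_TYPE": ["Working", "Commercial associate", "Pensioner", "Working"],
    "NAME_FAMILY_STATUS": ["Married", "Single", "Married", "Widow"],
})

numeric_cols = ["AMT_INCOME_TOTAL", "CNT_CHILDREN"]
categorical_cols = ["NAME_INCOME_TYPE", "NAME_FAMILY_STATUS"]

# Scale numeric columns to zero mean and unit variance; one-hot encode categorical columns.
# sparse_threshold=0 forces a dense NumPy array so the result is easy to inspect.
preprocess = ColumnTransformer(
    [
        ("num", StandardScaler(), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0,
)

features = preprocess.fit_transform(applicants)
print(features.shape)  # 2 scaled columns + 3 income types + 3 family statuses
```

Setting `handle_unknown="ignore"` means a category that was not seen during fitting is encoded as all zeros instead of raising an error at prediction time.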

3.3. Data Cleaning:

• Data cleaning finds and fixes errors, inconsistencies, and missing values so that the dataset is
reliable enough for analysis or modeling.
• This includes identifying and addressing outliers, imputing or removing missing values, and
standardizing formats and correcting inconsistencies to guarantee data consistency.
• Data cleaning is crucial for obtaining precise and meaningful insights from the data and for avoiding
biases or mistakes in later analysis or predictive models.
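
A minimal pandas sketch of these cleaning steps follows. The rows are invented, and the specific choices (median imputation, an "Unknown" placeholder, capping at 1.5 times the interquartile range) are assumptions made for illustration, not the project's documented settings.

```python
import numpy as np
import pandas as pd

# Made-up rows with a duplicate ID, missing values, untidy text, and one extreme income.
raw = pd.DataFrame({
    "ID": [1, 2, 2, 3, 4, 5],
    "AMT_INCOME_TOTAL": [112500.0, 270000.0, 270000.0, np.nan, 157500.0, 9000000.0],
    "OCCUPATION_TYPE": ["Laborers", None, None, "Managers", "  drivers ", "Managers"],
})

# Remove duplicate applicants, keeping the first record for each ID.
clean = raw.drop_duplicates(subset="ID").copy()

# Standardize text formatting, then fill missing occupations with a placeholder.
clean["OCCUPATION_TYPE"] = (
    clean["OCCUPATION_TYPE"].str.strip().str.title().fillna("Unknown")
)

# Impute missing income with the median of the observed values.
clean["AMT_INCOME_TOTAL"] = clean["AMT_INCOME_TOTAL"].fillna(
    clean["AMT_INCOME_TOTAL"].median()
)

# Cap extreme incomes at the upper IQR fence instead of dropping the rows.
q1, q3 = clean["AMT_INCOME_TOTAL"].quantile([0.25, 0.75])
upper_fence = q3 + 1.5 * (q3 - q1)
clean["AMT_INCOME_TOTAL"] = clean["AMT_INCOME_TOTAL"].clip(upper=upper_fence)

print(clean)
```

Capping rather than deleting outliers keeps every applicant in the dataset while limiting the influence of extreme values on scale-sensitive models.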

3.4. Data Visualization:

Power BI Results:

4. Methodology
4.1. Proposed Approach/ Algorithm/Model

The aim of the project is to build a predictive model that determines whether a credit card application will
be approved, based on the attributes provided by the applicant. The work involves exploratory data
analysis (EDA) to understand underlying patterns and relationships in the data, followed by developing
and evaluating different machine learning models for the prediction task.

1. Exploratory Data Analysis (EDA):


• Explore the distribution of credit statuses to understand the prevalence of different credit
behaviors.

• Analyze the demographic and financial attributes of applicants, such as income, education,
family status, etc., to identify any correlations with credit card approval.

2. Data Preprocessing:

• Handle missing values, duplicates, and convert categorical variables into numerical format.

• Engineer new features such as duration of credit history, income stability, etc., which may
provide additional predictive power.

3. Model Selection:

• Train various machine learning models such as Logistic Regression, Decision Trees, Random
Forest, Support Vector Machines (SVM), etc., to predict credit card approval.

• Experiment with different hyperparameters and techniques such as ensemble methods to
improve model performance.

4. Model Evaluation:

• Evaluate models using metrics like accuracy, precision, recall, F1-score, and ROC AUC to
assess their predictive performance.

• Identify the best-performing model based on evaluation metrics and fine-tune it further if
necessary.

5. Deployment and Monitoring:

• Once the best model is identified, deploy it into a production environment where it can be used
to predict credit card approvals in real time.

• Implement monitoring mechanisms to track model performance over time and retrain/update
the model as needed to maintain accuracy and reliability.

Model: The final model used for credit card approval prediction is a random forest classifier. This ensemble
learning algorithm combines multiple decision trees to make predictions, which gives stable and accurate
predictions of credit card approval based on the applicant's attributes.
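
To make the "multiple decision trees" idea concrete, here is a minimal sketch using scikit-learn's `RandomForestClassifier`. The data comes from `make_classification` as a synthetic stand-in for the merged applicant features, and the hyperparameters shown are illustrative assumptions, not the project's tuned values.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for applicant features (X) and approval labels (y).
X, y = make_classification(n_samples=500, n_features=8, n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Each of the 50 trees is trained on a bootstrap sample; predictions are combined across trees.
forest = RandomForestClassifier(n_estimators=50, random_state=0)
forest.fit(X_train, y_train)

print("trees in the ensemble:", len(forest.estimators_))
print("test accuracy:", round(forest.score(X_test, y_test), 3))
print("feature importances:", forest.feature_importances_.round(3))
```

The `feature_importances_` attribute gives a rough ranking of which inputs the ensemble relied on, which can help explain approval decisions.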

Input data/Tool used:

The input data for this project consists of two main datasets:

1. Credit Record Data: Contains information about individuals' credit histories, including their credit
statuses, months of balance, and unique identifiers (ID).
2. Application Record Data: Contains demographic and financial attributes of credit card applicants, such
as income, education, family status, etc., along with their unique identifiers (ID).
Tools Used:

• Python Programming Language: Utilized for data preprocessing, exploratory data analysis (EDA), and
model development.

• Libraries such as pandas, numpy, matplotlib, and seaborn for data manipulation, visualization, and
analysis.

• Scikit-learn library for implementing machine learning models and evaluation metrics.

• XGBoost and LightGBM libraries for gradient boosting models.

• Keras and PyTorch for neural network implementations.

• Jupyter Notebook: Used as an interactive environment for code development, data exploration, and
analysis.

• RandomizedSearchCV: Employed for hyperparameter tuning and model selection using randomized
search cross-validation.

• Various Machine Learning Models:

• Logistic Regression

• Decision Trees

• Random Forest

• Support Vector Machines (SVM)

• Neural Networks (implemented using Keras and PyTorch)

• Other utility libraries for data preprocessing, model evaluation, and deployment.
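
As an illustration of the randomized search mentioned in the tools list, the following sketch tunes a random forest with `RandomizedSearchCV`. The search space, the number of iterations, the F1 scoring choice, and the synthetic data are all assumptions made for this example.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

# Synthetic stand-in data; the real search would run on the prepared applicant features.
X, y = make_classification(n_samples=300, n_features=8, n_informative=4, random_state=0)

# Distributions (or lists) to sample hyperparameter values from.
param_distributions = {
    "n_estimators": randint(20, 80),
    "max_depth": [3, 5, None],
    "min_samples_leaf": randint(1, 5),
}

# Try 5 random combinations, each scored with 3-fold cross-validation.
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions=param_distributions,
    n_iter=5,
    cv=3,
    scoring="f1",
    random_state=0,
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best cross-validated F1:", round(search.best_score_, 3))
```

Randomized search samples a fixed number of combinations, so its cost is controlled by `n_iter` rather than by the size of the search space.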

4.2. Implementation
1. Data Loading and Exploration:

• Load the credit record data and application record data into pandas data frames.

• Explore the datasets to understand their structure, and check for missing values, duplicates, and
anomalies.

2. Data Preprocessing:

• Handle missing values by imputation or removal based on the context.

• Convert categorical variables into numerical format using techniques like one-hot encoding or
label encoding.

• Perform feature engineering to create new relevant features, such as duration of credit history,
income stability, etc.

3. Data Integration:

• Merge the two datasets based on the unique identifier (ID) to create a unified dataset for
analysis.

4. Exploratory Data Analysis (EDA):

• Analyze the distribution of credit statuses, demographic attributes, and financial variables.

• Visualize relationships between different features and the target variable (credit card approval)
using plots and graphs.

5. Model Development:

• Split the dataset into training and testing sets.

• Train various machine learning models such as Logistic Regression, Decision Trees, Random
Forest, Support Vector Machines (SVM), and Neural Networks.

• Use techniques like cross-validation and hyperparameter tuning to optimize model
performance.

6. Model Evaluation:

• Evaluate the trained models using appropriate evaluation metrics such as accuracy, precision,
recall, F1-score, and ROC AUC.

• Compare the performance of different models and select the best-performing one for credit
card approval prediction.

7. Deployment:

• Deploy the selected model into a production environment where it can be used to make
real-time predictions on credit card approvals.

• Implement monitoring mechanisms to track model performance and retrain/update the model
as needed to maintain accuracy and reliability.

8. Documentation and Reporting:

• Document the entire implementation process, including data preprocessing steps, model
development, evaluation, and deployment.

• Prepare a comprehensive report summarizing key findings, insights, and recommendations for
stakeholders.

9. Continuous Improvement:

• Monitor the deployed model's performance over time and gather feedback from users.

• Continuously refine and improve the model based on new data, feedback, and emerging trends
in credit card approval processes.

By following these implementation steps, the credit card approval prediction system can be effectively
developed, deployed, and maintained to assist financial institutions in making informed decisions.
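
The model development and evaluation steps above (split, train, compare) can be sketched as follows. This is an illustration on synthetic data from `make_classification`, covering three of the listed model families with assumed default-like settings; it does not reproduce the project's actual pipeline or results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the unified dataset produced by the integration step.
X, y = make_classification(n_samples=600, n_features=10, n_informative=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Fit every model on the same split and record the same metrics for a fair comparison.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    scores = model.predict_proba(X_test)[:, 1]  # probability of the positive class
    results[name] = {
        "accuracy": accuracy_score(y_test, predictions),
        "f1": f1_score(y_test, predictions),
        "roc_auc": roc_auc_score(y_test, scores),
    }

best = max(results, key=lambda name: results[name]["f1"])
for name, metrics in results.items():
    print(name, {key: round(value, 3) for key, value in metrics.items()})
print("best by F1:", best)
```

Selecting by F1 here is an arbitrary choice for the sketch; in a credit setting, the relative cost of wrongly approving versus wrongly rejecting an applicant would guide which metric to prioritize.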

4.3. Model Architecture:


• Logistic Regression: A simple linear model that works well for binary classification tasks like credit
card approval prediction.
• Random Forest: An ensemble of decision trees that improves accuracy and generalization by reducing
overfitting.
• Gradient Boosting Machines (e.g., XGBoost, LightGBM): Boosting algorithms that combine weak
learners (usually decision trees) to create a strong predictive model.
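
For reference, the three model families above can be summarized by their standard textbook forms. The notation is introduced here for illustration and is not taken from the project code: x is the applicant's feature vector, w and b are the logistic regression weights and bias, T is the number of trees, p̂_t is the probability estimate of tree t, h_m are the weak learners, η is the learning rate, and M is the number of boosting rounds. Averaging per-tree probabilities is one common convention for random forests; majority voting is another.

```latex
% Logistic regression: linear score passed through the sigmoid
\hat{p}_{\mathrm{LR}}(y = 1 \mid \mathbf{x}) = \sigma\!\left(\mathbf{w}^{\top}\mathbf{x} + b\right),
\qquad \sigma(z) = \frac{1}{1 + e^{-z}}

% Random forest: average of T independently grown trees
\hat{p}_{\mathrm{RF}}(y = 1 \mid \mathbf{x}) = \frac{1}{T} \sum_{t=1}^{T} \hat{p}_{t}(y = 1 \mid \mathbf{x})

% Gradient boosting: additive model built one weak learner at a time
F_{M}(\mathbf{x}) = F_{0}(\mathbf{x}) + \eta \sum_{m=1}^{M} h_{m}(\mathbf{x}),
\qquad \hat{p}_{\mathrm{GB}}(y = 1 \mid \mathbf{x}) = \sigma\!\left(F_{M}(\mathbf{x})\right)
```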

4.4. Software Architecture:

• Frontend: Power BI Atom
• Backend: Flask
• Data Analysis and Model Development: Jupyter Notebook

5. Results and Analysis:
1. Data Exploration:

• The credit record data and application record data were loaded and explored.

• The distribution of credit statuses, demographic attributes, and financial variables was analyzed.

• Anomalies, missing values, and duplicates were addressed through data preprocessing.

2. Model Development:

• Various machine learning models were trained and evaluated for credit card approval prediction.

• Models such as Logistic Regression, Decision Trees, Random Forest, Support Vector Machines
(SVM), and Neural Networks were implemented.

• Hyperparameters were tuned using techniques like cross-validation and randomized search to
optimize model performance.

3. Model Evaluation:

• The trained models were evaluated using multiple evaluation metrics, including accuracy,
precision, recall, F1-score, and ROC AUC.

• The performance of each model was compared, and the best-performing model was selected
based on evaluation results.

4. Deployment:

• The selected model was deployed into a production environment for real-time credit card
approval prediction.

• Monitoring mechanisms were implemented to track model performance and ensure its reliability
over time.

• The deployed model provided predictions on credit card approvals, assisting financial institutions
in decision-making processes.

5. Continuous Improvement:

• The deployed model's performance was monitored continuously, and feedback from users was
collected.

• Refinements and updates were made to the model based on new data, feedback, and evolving
credit card approval processes.

• The system remained flexible and adaptable to changes, ensuring its effectiveness in predicting
credit card approvals.

Outcomes:

Logistic Regression:

Logistic Model Train Accuracy : 61.17259550280924 %

Logistic Model Test Accuracy : 61.129202158092575 %

Classification Report:

                  Precision   Recall   F1-Score   Support
0.0 (Rejected)      0.61       0.61      0.61       93356
1.0 (Accepted)      0.61       0.13      0.61       93661
Accuracy                                 0.61      187017
Macro avg           0.61       0.61      0.61      187017
Weighted avg        0.61       0.61      0.61      187017

Random Forest:

Accuracy: 0.9918

                  Precision   Recall   F1-Score   Support
0.0 (Rejected)      0.99       1.00      0.99       93356
1.0 (Accepted)      1.00       0.99      0.99       93661
Accuracy                                 0.99      187017
Macro avg           0.99       0.99      0.99      187017
Weighted avg        0.99       0.99      0.99      187017
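
Reports in this layout match what scikit-learn's `classification_report` prints. The sketch below uses ten made-up labels (not project data) to show how the per-class precision, recall, F1-score, and support are derived, including a manual check of the F1 formula F1 = 2PR / (P + R).

```python
from sklearn.metrics import (
    accuracy_score,
    classification_report,
    precision_recall_fscore_support,
)

# Made-up labels for illustration only; 0 = rejected, 1 = accepted.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0]

# Text report with one row per class plus accuracy, macro average, and weighted average.
print(classification_report(y_true, y_pred, target_names=["0.0 (Rejected)", "1.0 (Accepted)"]))

# The same numbers as arrays, indexed by class label.
precision, recall, f1, support = precision_recall_fscore_support(y_true, y_pred)

# F1 is the harmonic mean of precision and recall, so it can be recomputed by hand.
manual_f1 = 2 * precision * recall / (precision + recall)
accuracy = accuracy_score(y_true, y_pred)

print("accuracy:", accuracy)
print("F1 recomputed from precision and recall:", manual_f1.round(4))
```

Because F1 is determined by precision and recall, recomputing it this way is a quick consistency check on any reported row.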

6. Conclusion:
In summary, the credit card approval system aims to improve the approval process by making systematic
use of applicant data. By completing the requirements analysis, risk assessment, and feasibility analysis,
we ensure that the system meets users' needs, reduces risk, and is technically, operationally,
and financially feasible. The interface that allows applicants to submit credit card applications and track their
status improves user experience and satisfaction. Leveraging exploratory data analysis (EDA) and machine
learning models, the system provides data-driven insights to optimize the credit card approval process,
enable more accurate decisions, and improve risk assessment. Data requests are handled in compliance with
data protection laws and industry standards. Additionally, the system is designed to scale to
more data and more business users, using cloud solutions and monitoring to address
scalability issues. Development, operation, and maintenance costs and the potential return on investment (ROI)
were also analyzed to determine the effectiveness of the credit card approval process. Together, these measures
improve decision-making and user experience, ultimately increasing customer satisfaction and business success.

Future Scope:

• Advanced Machine Learning Techniques
• Real-time Decision Support
• Predictive Analytics for Fraud Detection
• Regulatory Compliance and Ethical AI

7. References:
[1] Sutedja, I., Lim, J., Setiawan, E., & Adiputra, F. R. (2024). Credit Card Approval Prediction:
A Systematic. Journal of Theoretical and Applied Information Technology, 102(3).

[2] Devi, D. A. S., SM, L. A., Phanindra, P. K. S., Vamsi, T., & Sai, K. R. N. M. (2023). Credit Card Approval Prediction
using Machine Learning. i-Manager's Journal on Information Technology, 12(3), 39.

[3] Patel, K. (2023). Credit Card Analytics: A Review of Fraud Detection and Risk Assessment
Techniques. International Journal of Computer Trends and Technology, 71(10), 69-79.

[4] Arram, A., Ayob, M., Albadr, M. A. A., Sulaiman, A., & Albashish, D. (2023). Credit card score prediction using
machine learning models: A new dataset. arXiv preprint arXiv:2310.02956.

[5] Devi, D. A. S., SM, L. A., Phanindra, P. K. S., Vamsi, T., & Sai, K. R. N. M. (2023). Credit Card Approval Prediction
using Machine Learning. i-Manager's Journal on Information Technology, 12(3), 39.

[6] Devi, D. A. S., SM, L. A., Phanindra, P. K. S., Vamsi, T., & Sai, K. R. N. M. (2023). Credit Card Approval Prediction
using Machine Learning. i-Manager's Journal on Information Technology, 12(3), 39.

[7] Yasasvi, P., & Kumar, S. M. (2022, December). Improve Accuracy in Prediction of Credit Card Approval using a Novel Xgboost
compared with Decision Tree Algorithm. In 2022 4th International Conference on Advances in Computing, Communication Control
and Networking (ICAC3N) (pp. 674-679). IEEE.

[8] Peela, H. V., Gupta, T., Rathod, N., Bose, T., & Sharma, N. Prediction of Credit Card Approval. International Journal of Soft
Computing and Engineering, 11(2), 1-6.

[9] Mavri, M., & Ioannou, G. (2004). An empirical study for credit card approvals in the Greek banking
sector. Operational Research, 4, 29-44.

[10] Tanikella, U. (2020). Credit Card Approval Verification Model (Doctoral dissertation, California State University
San Marcos).

[11] Khan, A., & Ghosh, S. K. (2021). Machine Assistance for Credit Card Approval? Random Wheel can Recommend
and Explain. arXiv preprint arXiv:2105.06255.

[12] Musyoka, W. M. (2018). Comparison of Data Mining Algorithms in Credit Card Approval. Inte. Jour. Comp. Info.
Technology, 7, 78-86.

[13] Deb, K., Ghosal, S., & Bose, D. (2021). A comparative study on credit card fraud detection.

[14] Gao, J., Sun, W., & Sui, X. (2021). Research on default prediction for credit card users based on XGBoost-LSTM
model. Discrete Dynamics in Nature and Society, 2021, 1-13.

[15] Islam, M. J., Wu, Q. J., Ahmadi, M., & Sid-Ahmed, M. A. (2007, November). Investigating the performance of
naive-bayes classifiers and k-nearest neighbor classifiers. In 2007 international conference on convergence information
technology (ICCIT 2007) (pp. 1541-1546). IEEE.

[16] Jency, X. F., Sumathi, V. P., & Sri, J. S. (2018). An exploratory data analysis for loan prediction based on nature
of the clients. International Journal of Recent Technology and Engineering (IJRTE), 7(4), 17-23.

[17] Tejaswini, J., Kavya, T. M., Ramya, R. D. N., Triveni, P. S., & Maddumala, V. R. (2020). Accurate loan approval
prediction based on machine learning approach. Journal of Engineering Science, 11(4), 523-532.

[18] Dalsania, N., Punatar, D., & Kothari, D. (2022). Credit Card Approval Prediction using Classification
Algorithms. International Journal For Research in Applied Science and Engineering Technology, 10(11).

[19] Yasasvi, P., & Magesh Kumar, S. (2022). Improve Accuracy in Prediction of Credit Card Approval Using Novel
XGboost Compared with Random Forest. In Advances in Parallel Computing Algorithms, Tools and Paradigms (pp.
582-588). IOS Press.

[20] Sheikh, M. A., Goel, A. K., & Kumar, T. (2020, July). An approach for prediction of loan approval using
machine learning algorithm. In 2020 international conference on electronics and sustainable communication systems
(ICESC) (pp. 490-494). IEEE.
