
Bicol University

College of Science
Bachelor of Science in Information Technology
Rizal Avenue, Legazpi City, Albay

Home Loan Approval Prediction using Data Mining

Members:
Acosta, Barbhea
Bañares, Marife
Boral, Niña Jhayne
Buban, Lyka
Orlanda, Denise Ann

BS Information Technology 3B

Jennifer Laraya-Llovido
Professor
Introduction

- Home Loan, advantages


- With the growing number of home loan applicants, lenders must handle a large volume of
data. If applications are processed manually, mistakes and errors can occur, which is
costly for the bank or mortgage company.
- Now, data mining is … which is useful … involves prediction. Companies have found that
they can use it to make their processes and transactions more efficient and accurate.
- The research aims to show how data mining can be used to predict whether a home loan
application is likely to be approved and to identify the applicants who are most likely
to have their home loans approved.
- The purpose of this paper is to present a solution that banks and other financial
institutions can use to easily determine and predict home loan approvals, which lessens
the chances of mistakes and makes the process more efficient. It will focus on
- Checking and processing home loan applications is error-prone and time-consuming, which
affects not only the financial institutions or lenders but also the applicants waiting
for their applications to be approved. With data mining, we can predict whether an
applicant is able to pay their loan on time. Both the lender and the applicant can
benefit from it: the bank gains security, lower risk, and assurance.

Problem:
If you are like most people, buying a home is one of the biggest financial decisions you will ever
make. And for many people, taking out a home loan is the best way to afford that dream home.
Obtaining a home loan is a time-consuming procedure. Banks face a significant problem in
approving loans. Every day there are so many applications that they are challenging for bank
employees to manage, and the chance of mistakes is high. Most banks earn profit from loans, but
it is risky to choose deserving customers from the large number of applications. One mistake can
cause a massive loss to a bank.

Banks and financial institutions provide several services, including housing loans. A housing
loan, or home mortgage, is a loan given by a bank, a mortgage company or another financial
institution for the purchase of a residential property. It gives the borrower a chance to own
their dream house or other real estate property, as it offers more payment options for buying
the property. These financial institutions collect data on their home loan applicants.
However, given the amount of data gathered, checking it manually can be time-consuming and can
result in mistakes and errors. In this paper, we present how data mining can be used to predict
home loan approvals, which can make the process more efficient and accurate.

The data was gathered from the Kaggle website [1]; the dataset was originally taken from the
DataHack website [2].
The purpose (focus) of this paper is to
- include loan criteria (specify best criteria, summarize)

Related Works
- Define data mining
- Look for other studies related to loans (like predicting loan risk etc., at least 3 studies)
(include the reference)
Methodology

Data mining is the process of extracting patterns from large datasets. It involves
collecting and analyzing data from various sources in order to identify patterns and trends. The
data mining process includes business understanding, data understanding, data preparation,
modeling, evaluation and deployment. It is an essential tool for businesses and organizations
such as those in the lending and banking industry, as it helps them make better decisions about
lending out money. Data mining can help identify trends and patterns in loan repayments,
specifically in predicting home loan approvals, which lessens the chances of mistakes and makes
the process more efficient. Additionally, it can be used to identify potential risks and fraud
related to loan applications.

Figure 1: Data Processes

Data Understanding

The primary dataset used in the Home Loan Approval Prediction was collected from the
Dream Housing Finance Company. The dataset has 614 rows and 13 columns. The data contains
information about the applicant, such as the loan ID, gender, marital status, number of
dependents, educational attainment, job, income and co-applicant’s income. It also contains
details about the loan, such as the loan amount, loan term, credit history, property area and
loan status. The loan status is the dependent variable, to be predicted from the preceding
attributes.
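
As a minimal sketch of this step, the snippet below reads a few rows in the layout described above and reports the row count, column count and the number of missing cells per column. The three sample rows are made up for illustration and are not taken from the real dataset, and the column names Loan_ID, Education and Property_Area are assumed spellings of fields that are only named in prose here. It uses only the Python standard library.

```python
import csv
import io

# Hypothetical sample rows in the layout described above; the values are
# invented for illustration and do not come from the real dataset.
SAMPLE_CSV = """Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Loan_Status
LP000001,Male,No,0,Graduate,No,5000,0,,360,1,Urban,Y
LP000002,Female,Yes,1,Graduate,No,3000,1500,120,360,1,Rural,N
LP000003,Male,Yes,3+,Not Graduate,,4000,0,100,360,,Semiurban,Y
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per application."""
    return list(csv.DictReader(io.StringIO(text)))


def count_missing(rows):
    """Count empty cells per column."""
    counts = {column: 0 for column in rows[0]}
    for row in rows:
        for column, value in row.items():
            if value.strip() == "":
                counts[column] += 1
    return counts


rows = load_rows(SAMPLE_CSV)
columns = list(rows[0].keys())
missing = count_missing(rows)

print(f"{len(rows)} rows, {len(columns)} columns")
for column, n in missing.items():
    if n:
        print(f"{column}: {n} missing")
```

Run against the full file instead of the sample, the same counts would be expected to show the 614 rows and 13 columns stated above.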

Data Preparation
Data preparation includes filtering and cleaning the Home Loan Approval dataset.
Filtering is done by removing unnecessary fields, namely the loan ID, the applicant’s education
and the property area, leaving only 10 columns.

Attribute            Description                         Data Type

Gender               Male/Female                         String
Married              Applicant’s Marital Status (Y/N)    String
Dependents           Number of Dependents                String
Self_Employed        Self-employed (Y/N)                 String
ApplicantIncome      Applicant’s Income                  Integer
CoapplicantIncome    Co-applicant’s Income               Integer
LoanAmount           Loan Amount                         Integer
Loan_Amount_Term     Loan Term in Months                 Integer
Credit_History       Record of Credit Activity           Integer
Loan_Status          Loan Approval (Y/N)                 String

Table 1: Home Loan Dataset
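
The filtering step can be sketched as follows. This is an illustration only: the ten kept names come from Table 1, while Loan_ID, Education and Property_Area are assumed spellings of the three removed fields, and drop_columns is a hypothetical helper rather than part of any library.

```python
# The ten kept names come from Table 1; the three dropped names are
# assumed spellings of the loan ID, education and property area fields.
ALL_COLUMNS = [
    "Loan_ID", "Gender", "Married", "Dependents", "Education",
    "Self_Employed", "ApplicantIncome", "CoapplicantIncome", "LoanAmount",
    "Loan_Amount_Term", "Credit_History", "Property_Area", "Loan_Status",
]
DROPPED = {"Loan_ID", "Education", "Property_Area"}


def drop_columns(row, dropped):
    """Return a copy of one record without the unnecessary fields."""
    return {name: value for name, value in row.items() if name not in dropped}


# An empty placeholder record is enough to show the effect on the columns.
example = dict.fromkeys(ALL_COLUMNS, "")
filtered = drop_columns(example, DROPPED)
kept = list(filtered)

print(len(ALL_COLUMNS), "->", len(kept), "columns")
print(kept)
```

Applying drop_columns to every record leaves the ten attributes listed in Table 1.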

Cleaning includes fixing data type values, data type conversion, missing value imputation and
data normalization. Additionally, feature engineering can be done to generate new variables
based on existing ones.
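
The cleaning steps named above can be sketched as below. Several choices here are assumptions made for illustration, since the paper does not specify them: the median is used to fill numeric gaps, the most frequent value to fill categorical gaps, min-max scaling for normalization, and TotalIncome is a hypothetical engineered feature. The three records are invented.

```python
from statistics import median, mode

# Hypothetical records after filtering; None marks a missing value.
records = [
    {"Gender": "Male", "ApplicantIncome": "5000",
     "CoapplicantIncome": "0", "LoanAmount": None},
    {"Gender": None, "ApplicantIncome": "3000",
     "CoapplicantIncome": "1500", "LoanAmount": "120"},
    {"Gender": "Male", "ApplicantIncome": "4000",
     "CoapplicantIncome": "0", "LoanAmount": "100"},
]
NUMERIC = ["ApplicantIncome", "CoapplicantIncome", "LoanAmount"]
CATEGORICAL = ["Gender"]


def convert_types(rows):
    """Data type conversion: numeric strings become floats, gaps stay None."""
    for row in rows:
        for col in NUMERIC:
            if row[col] is not None:
                row[col] = float(row[col])


def impute(rows):
    """Fill numeric gaps with the median and categorical gaps with the mode."""
    for col in NUMERIC:
        fill = median([r[col] for r in rows if r[col] is not None])
        for r in rows:
            if r[col] is None:
                r[col] = fill
    for col in CATEGORICAL:
        fill = mode([r[col] for r in rows if r[col] is not None])
        for r in rows:
            if r[col] is None:
                r[col] = fill


def add_total_income(rows):
    """Feature engineering: combine both incomes (hypothetical feature)."""
    for r in rows:
        r["TotalIncome"] = r["ApplicantIncome"] + r["CoapplicantIncome"]


def min_max_normalize(rows, col):
    """Rescale one numeric column to the range [0, 1]."""
    values = [r[col] for r in rows]
    low, span = min(values), max(values) - min(values)
    for r in rows:
        r[col] = (r[col] - low) / span if span else 0.0


convert_types(records)
impute(records)
add_total_income(records)
for column in ("LoanAmount", "TotalIncome"):
    min_max_normalize(records, column)

for record in records:
    print(record)
```

The engineered feature is built before normalization so that it is computed from the original income values rather than from rescaled ones.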

Data Modelling
Evaluation

Development

Results and Discussion


- discuss the result
- demonstrate the work accomplished and explain its significance.

Conclusion
- The conclusion should give your reader the points to “take home” from your paper.
- state clearly what your results demonstrate about the problem you were tackling in the paper.
- generalize your findings, putting them into a useful context that can be built upon. All
generalizations should be supported by your data, however; the discussion should prove these
points, so that when the reader gets to the conclusion, the statements are logical and seem
self-evident.
Recommendation
- explain to your readers where you think the results can lead you.
- What do you think are the next steps to take?
- Recommendations are the added suggestions that you want people to follow when performing
future studies.
References
[1] Ergün, B. (2018, November 8). Loan Data Set. Kaggle. Retrieved December 1, 2022, from
https://www.kaggle.com/datasets/burak3ergun/loan-data-set
[2] Loan prediction. Analytics Vidhya. (n.d.). Retrieved December 1, 2022, from
https://datahack.analyticsvidhya.com/contest/practice-problem-loan-prediction-iii/
