Case Study2
College of Science
Bachelor of Science in Information Technology
Rizal Avenue, Legazpi City, Albay
Members:
Acosta, Barbhea
Bañares, Marife
Boral, Niña Jhayne
Buban, Lyka
Orlanda, Denise Ann
BS Information Technology 3B
Jennifer Laraya-Llovido
Professor
Introduction
Problem:
For most people, buying a home is one of the biggest financial decisions they will ever make,
and for many, taking out a home loan is the best way to afford that dream home.
Obtaining a home loan is a time-consuming procedure, and loan approval is a significant problem
for banks. Bank employees must handle a large number of applications every day, which makes
mistakes likely. Most banks earn their profit from loans, but selecting deserving customers from
so many applications is risky: a single wrong decision can cause a massive loss for a bank.
Banks and financial institutions provide several services, including housing loans.
A housing loan, or home mortgage, is a loan given by a bank, a mortgage company or another
financial institution for the purchase of a residential property. It gives borrowers a
chance to own their dream houses and real estate properties by offering more payment options
when buying the property. These financial institutions collect data on their home loan applicants.
However, given the amount of data gathered, checking it manually is time-consuming and
can result in mistakes. In this paper, we present how data mining can be used to
predict home loan approvals, which can make the process more efficient and accurate.
The data was gathered from the Kaggle website [1]; the dataset was originally taken from the
DataHack website [2].
The purpose (focus) of this paper is to
- include loan criteria (specify best criteria, summarize)
Related Works
- Define data mining
- Look for other studies related to loans (like predicting loan risk etc.; at least 3 studies)
(include the references)
Methodology
Data mining is the process of extracting patterns from large datasets. It involves
collecting and analyzing data from various sources in order to identify patterns and trends. The
data mining process, as described by the CRISP-DM model, consists of business understanding,
data understanding, data preparation, modeling, evaluation and deployment. It is an essential tool
for businesses and organizations such as those in the loan and banking industry, as it helps them
make better lending decisions. Data mining can help identify trends and patterns in loan
repayments and, specifically, predict home loan approvals, which lessens the chance of mistakes
and makes the process more efficient.
Additionally, it can be used to identify potential risks and fraud related to loan applications.
Data Understanding
The primary dataset used for the home loan approval prediction was collected from the
Dream Housing Finance Company. The dataset has 614 rows and 13 columns. It
contains information about the applicant, such as the loan ID, gender, marital status, number of
dependents, educational attainment, job, income and co-applicant’s income. It also contains
details about the loan, such as the loan amount, loan term, credit history, property area and
loan status. The loan status is the dependent variable, which is predicted from the preceding
attributes.
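As a rough illustration of this data understanding step, the sketch below counts rows, columns and missing values using only the Python standard library. The column names are assumed to match the public Kaggle file [1], and the four sample rows (including the loan IDs) are invented for the example; they are not taken from the actual dataset.

```python
import csv
import io

# Hypothetical four-row sample in the assumed 13-column layout.
# The real dataset has 614 rows; these values are made up for illustration.
SAMPLE_CSV = """Loan_ID,Gender,Married,Dependents,Education,Self_Employed,ApplicantIncome,CoapplicantIncome,LoanAmount,Loan_Amount_Term,Credit_History,Property_Area,Loan_Status
LP000001,Male,No,0,Graduate,No,5000,0,,360,1,Urban,Y
LP000002,Male,Yes,1,Graduate,No,4500,1500,130,360,1,Rural,N
LP000003,Female,No,0,Not Graduate,Yes,3000,0,70,360,,Urban,Y
LP000004,,Yes,2,Graduate,No,6000,2000,150,180,0,Semiurban,N
"""


def summarize(csv_text):
    """Return the row count, the column names and the number of blank cells per column."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    columns = list(rows[0].keys()) if rows else []
    missing = {c: sum(1 for r in rows if r[c].strip() == "") for c in columns}
    return len(rows), columns, missing


n_rows, columns, missing = summarize(SAMPLE_CSV)
print(f"{n_rows} rows, {len(columns)} columns")
print("Dependent variable:", columns[-1])
print("Columns with missing values:", {c: n for c, n in missing.items() if n})
```

To run the same summary on the downloaded file, the text of that file can be passed to `summarize` in place of `SAMPLE_CSV`.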
Data Preparation
Data preparation includes filtering and cleaning the Home Loan Approval dataset.
Filtering is done by removing unnecessary fields such as the loan ID, the applicant’s education
and the property area, leaving only 10 columns.
Cleaning includes data type conversion, missing value imputation and data normalization.
Additionally, feature engineering can be done to generate new variables based on existing ones.
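The preparation steps above could be sketched as follows. This is a minimal example under stated assumptions, not the procedure actually used in the study: the column names are assumed from the public dataset, the records are invented, and median imputation and min-max scaling are chosen here only as common examples of "missing value imputation" and "data normalization".

```python
from statistics import median

# Fields the text says are filtered out (column names are assumptions).
DROPPED = ("Loan_ID", "Education", "Property_Area")


def drop_columns(records, names=DROPPED):
    """Return copies of the records without the unwanted fields."""
    return [{k: v for k, v in r.items() if k not in names} for r in records]


def impute_median(records, column):
    """Convert a numeric column to float, filling blanks with the median of the observed values."""
    observed = [float(r[column]) for r in records if r[column] != ""]
    fill = median(observed)
    for r in records:
        r[column] = float(r[column]) if r[column] != "" else fill


def min_max_normalize(records, column):
    """Rescale an already numeric column to the range [0, 1]."""
    values = [r[column] for r in records]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant column
    for r in records:
        r[column] = (r[column] - lo) / span


# Hypothetical records; values are read as strings, as they would be from a CSV file.
records = [
    {"Loan_ID": "LP000001", "Education": "Graduate", "Property_Area": "Urban",
     "ApplicantIncome": "5000", "LoanAmount": "", "Loan_Status": "Y"},
    {"Loan_ID": "LP000002", "Education": "Graduate", "Property_Area": "Rural",
     "ApplicantIncome": "4500", "LoanAmount": "130", "Loan_Status": "N"},
    {"Loan_ID": "LP000003", "Education": "Not Graduate", "Property_Area": "Urban",
     "ApplicantIncome": "3000", "LoanAmount": "70", "Loan_Status": "Y"},
    {"Loan_ID": "LP000004", "Education": "Graduate", "Property_Area": "Semiurban",
     "ApplicantIncome": "6000", "LoanAmount": "150", "Loan_Status": "N"},
]

prepared = drop_columns(records)
for col in ("ApplicantIncome", "LoanAmount"):
    impute_median(prepared, col)      # type conversion + imputation
    min_max_normalize(prepared, col)  # normalization

for row in prepared:
    print(row)
```

The median is used here because income and loan amounts tend to be skewed, so it is less sensitive to a few very large values than the mean; other imputation or scaling choices would fit the same structure.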
Data Modelling
Evaluation
Deployment
Conclusion
- The conclusion should give your reader the points to “take home” from your paper.
- state clearly what your results demonstrate about the problem you were tackling in the paper.
- generalize your findings, putting them into a useful context that can be built upon. All
generalizations should be supported by your data, however; the discussion should prove these
points, so that when the reader gets to the conclusion, the statements are logical and seem
self-evident.
Recommendation
- explain to your readers where you think the results can lead you.
- What do you think are the next steps to take?
- Recommendations are the added suggestions that you want people to follow when performing
future studies.
References
[1] Ergün, B. (2018, November 8). Loan Data Set. Kaggle. Retrieved December 1, 2022, from
https://www.kaggle.com/datasets/burak3ergun/loan-data-set
[2] Loan prediction. Analytics Vidhya. (n.d.). Retrieved December 1, 2022, from
https://datahack.analyticsvidhya.com/contest/practice-problem-loan-prediction-iii/