Presentation On Data Science

Uploaded by

Aaditya Singh


Data Science Presentation and assignment useful in academic projects

Housing Loan Approval Prediction
Group 1

Ansh Tulsyan – 200103033

Bhushan Agrawal - 200103050

Mansi Ramrakhyani - 200103089

Sabari Santosh S - 200101130

Vikrant Beniwal – 200103089

The Problem Statement

• Predict approval of home loans from historical data by identifying the customer segments that are eligible for a loan

• The Independent Variables: Gender, Marital Status, Education, Number of Dependents, Income, Loan Amount, Credit History

• The Dependent Variable: Loan Status

The Dataset

• Dataset Source: Analytics Vidhya

• Some features contain outliers: Applicant Income and Co-Applicant Income

• Some rows of the Dependents feature contain extra characters such as ‘+’

• Missing data and NAs: blank fields in Gender, Married, Dependents and Self_Employed, and NAs in LoanAmount, Loan_Amount_Term and Credit_History

• The data is also skewed
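The slides do not show how these issues were handled. The following is a minimal sketch of one possible cleaning step in Python with pandas. The sample rows are hypothetical, the column spelling `Loan_Amount_Term` is assumed from the public dataset, and the imputation choices (mode for categorical fields, median for numeric fields) are assumptions rather than the group's documented method.

```python
import numpy as np
import pandas as pd

# Hypothetical rows that mimic the issues listed above; not taken from the real dataset.
raw = pd.DataFrame({
    "Gender": ["Male", None, "Female", "Male"],
    "Married": ["Yes", "No", None, "Yes"],
    "Dependents": ["0", "3+", None, "1"],
    "Self_Employed": ["No", None, "Yes", "No"],
    "LoanAmount": [128.0, np.nan, 66.0, 120.0],
    "Loan_Amount_Term": [360.0, 360.0, np.nan, 360.0],
    "Credit_History": [1.0, np.nan, 1.0, 0.0],
})


def clean(df: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy: no '+' in Dependents and no missing values."""
    out = df.copy()
    # Strip the trailing '+' so that "3+" becomes "3".
    out["Dependents"] = out["Dependents"].str.replace("+", "", regex=False)
    # Assumption: fill categorical blanks with the most frequent value.
    for col in ["Gender", "Married", "Dependents", "Self_Employed"]:
        out[col] = out[col].fillna(out[col].mode().iloc[0])
    # Assumption: fill numeric NAs with the column median, which is robust to outliers.
    for col in ["LoanAmount", "Loan_Amount_Term"]:
        out[col] = out[col].fillna(out[col].median())
    # Credit_History is binary, so the mode is used instead of the median.
    out["Credit_History"] = out["Credit_History"].fillna(
        out["Credit_History"].mode().iloc[0]
    )
    out["Dependents"] = out["Dependents"].astype(int)
    return out


cleaned = clean(raw)
print(cleaned)
```

The median is used for the numeric columns because the slides report outliers and skew, both of which pull the mean away from typical values.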

Exploratory Data Analysis
Key Observations – Exploratory Data Analysis
• 7 variables had missing data

• The distributions of loan amount and applicant income contain extreme values, which may be outliers

• Graduates have more outliers, and their loan distribution is wider than that of non-graduates

• Credit_History contains only 0s and 1s, so its mean of 0.8422 is the share of recorded values equal to 1 (about 84%)

• None of the features was normally distributed, which suggests that scaling and normalization are needed
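The checks behind these observations can be sketched as follows. This is an illustrative example on hypothetical values, not the group's analysis: the 1.5 × IQR rule for flagging outliers and the log transform for reducing skew are common choices assumed here.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; the values are illustrative only.
df = pd.DataFrame({
    "ApplicantIncome": [2500, 3000, 3200, 4000, 4100, 5000, 81000],
    "LoanAmount": [100.0, 110.0, np.nan, 120.0, 130.0, 150.0, 700.0],
    "Credit_History": [1.0, 1.0, 0.0, 1.0, np.nan, 1.0, 1.0],
})

# Number of missing values per column.
missing = df.isna().sum()

# Sample skewness; values far from 0 indicate an asymmetric distribution.
skewness = df[["ApplicantIncome", "LoanAmount"]].skew()


def iqr_outliers(s: pd.Series) -> pd.Series:
    """Flag values more than 1.5 * IQR outside the quartiles (Tukey's rule)."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return (s < q1 - 1.5 * iqr) | (s > q3 + 1.5 * iqr)


income_outliers = df.loc[iqr_outliers(df["ApplicantIncome"]), "ApplicantIncome"]

# For a 0/1 column the mean is the share of non-missing rows equal to 1.
share_with_history = df["Credit_History"].mean()

# Assumption: a log transform is one common way to reduce right skew.
log_income = np.log1p(df["ApplicantIncome"])

print(missing, skewness, income_outliers, share_with_history, sep="\n")
```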
Distribution After Cleaning The Data
The Solution
• Logistic Regression: It is used to predict the probability of certain classes or events

• GLM: Model used is GLM (Generalized Linear Model)

• MLE: Maximum Likelihood Estimation is used to fit the data to the model

• Feature Selection: Manual feature selection was done based on the exploratory data analysis and feature correlation matrix

• Independent Variables: Credit History, Education, Self Employed, Property Area, Loan Amount, Income

• Dependent Variable: Loan Status

Workflow: Loading The Data → Exploratory Data Analysis → Cleaning The Data → Train/Test Split → Build Model and Fit Train Data → Measure Results and Prediction on Test Data
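The slides name the model (a logistic-regression GLM fitted by maximum likelihood) but include no code. The sketch below shows how such a fit works in principle, using Newton-Raphson iterations (equivalent to IRLS for the logit link) in NumPy. The data are synthetic, the two features are hypothetical stand-ins for the selected variables, and the 80/20 split is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the cleaned data (hypothetical): a binary credit-history
# flag and a standardized log-income feature, with approval driven mostly by the flag.
n = 400
credit_history = rng.binomial(1, 0.84, size=n)
log_income = rng.normal(0.0, 1.0, size=n)
true_logit = -1.5 + 3.0 * credit_history + 0.5 * log_income
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit)))
X = np.column_stack([np.ones(n), credit_history, log_income])  # intercept first

# Train/test split (assumed 80/20) on shuffled row indices.
idx = rng.permutation(n)
train, test = idx[:320], idx[320:]


def predict_proba(X, beta):
    """Predicted probability of class 1 under the logit link."""
    return 1.0 / (1.0 + np.exp(-X @ beta))


def fit_logistic(X, y, n_iter=25, tol=1e-8):
    """Maximum likelihood fit of a binomial GLM with logit link via Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = predict_proba(X, beta)
        gradient = X.T @ (y - p)                        # score of the log-likelihood
        hessian = X.T @ (X * (p * (1.0 - p))[:, None])  # negative Hessian (information)
        step = np.linalg.solve(hessian, gradient)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta


beta = fit_logistic(X[train], y[train])
test_pred = (predict_proba(X[test], beta) >= 0.5).astype(int)
test_accuracy = np.mean(test_pred == y[test])
print("coefficients:", beta)
print("test accuracy:", test_accuracy)
```

At the maximum likelihood estimate the gradient of the log-likelihood is zero, which is the condition the loop iterates toward.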
The Results
• Train Data: Accuracy 82%

• Test Data: Accuracy 84%

• Misclassification Error: 0.16

• AUC ROC: 0.4912406

• Confusion Matrix:
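The metrics listed above can be computed from the true labels and the predicted probabilities. The following sketch uses hypothetical labels and scores (not the group's results) and a 0.5 classification threshold, which is an assumption. The misclassification error is 1 minus the accuracy, which is consistent with the reported 0.16 and 84% on the test data.

```python
import numpy as np

# Hypothetical labels and predicted probabilities, for illustration only.
y_true = np.array([1, 1, 1, 1, 1, 1, 0, 0, 0, 0])
y_prob = np.array([0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.65, 0.3, 0.2, 0.1])
y_pred = (y_prob >= 0.5).astype(int)  # assumed threshold of 0.5


def confusion_matrix(y_true, y_pred):
    """2x2 matrix with rows = actual class (0, 1) and columns = predicted class (0, 1)."""
    cm = np.zeros((2, 2), dtype=int)
    for actual, predicted in zip(y_true, y_pred):
        cm[actual, predicted] += 1
    return cm


def roc_auc(y_true, y_prob):
    """AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = y_prob[y_true == 1]
    neg = y_prob[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (len(pos) * len(neg))


cm = confusion_matrix(y_true, y_pred)
accuracy = np.trace(cm) / cm.sum()
misclassification_error = 1.0 - accuracy
auc = roc_auc(y_true, y_prob)

print(cm)
print(accuracy, misclassification_error, auc)
```

Under this rank-based definition, a scorer that orders positives and negatives at random has an expected AUC of 0.5, which is a useful reference point when reading an AUC value.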

Model Summary
Thank you!
