Assignment-2: Submitted By: Name: Vipul Kumar Singh Roll No: 133118 Submitted To: Prof. Kuldeep Baishya

AMR-2
Assignment-2
Submitted to: Prof. Kuldeep Baishya
Submitted by: Vipul Kumar Singh
Roll No: 133118

Step 1: We first apply binary logistic regression, because the dependent variable (Loan_Status) has two possible outcomes (Y and N). Excluding the row identifier Loan_ID and the variable Credit_History, we treat all remaining factors as independent variables.
Variables in the Equation

                   B      S.E.   Wald     df   Sig.   Exp(B)
Step 0  Constant   .794   .088   81.997   1    .000   2.212

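The constant is significant (Sig. = .000), which indicates that the Y and N categories of the dependent variable Loan_Status are not equally frequent. Exp(B) = 2.212 means that, before any independent variable is entered, the odds of Y versus N are about 2.2 to 1.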
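
As a quick check on the Step 0 row, the odds and the implied share of Y cases can be recovered from B alone. The sketch below uses Python, which the assignment itself does not use; the variable names are illustrative.

```python
import math

# Step 0 constant (B) from the "Variables in the Equation" table.
b0 = 0.794

odds_y = math.exp(b0)            # Exp(B): odds of Y versus N
share_y = odds_y / (1 + odds_y)  # implied proportion of Y cases

print(f"odds of Y  = {odds_y:.3f}")
print(f"share of Y = {share_y:.3f}")
```

The implied share of roughly 69% is consistent with the 418 observed Y cases out of 607 in the Step 1 classification table.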
Model Summary

Step   -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
1      722.105a            .049                   .070

a. Estimation terminated at iteration number 20 because maximum iterations has been reached. Final solution cannot be found.

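The Nagelkerke R2 statistic is the one usually reported because, unlike the Cox & Snell R2, it can reach a maximum of 1.0. It is a pseudo-R2 that indicates how much of the variation in the dependent variable is explained by the independent variables. Here it is only .070, so this model explains very little.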
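
The relationship between the two R2 values can be sketched in Python. This is an illustration under one stated assumption: the -2 log likelihood of the intercept-only model is not reported in the output, so it is computed here from the class counts in the Step 1 classification table (418 Y and 189 N). The function name pseudo_r2 is hypothetical.

```python
import math

def pseudo_r2(n_yes, n_no, neg2ll_model):
    """Return (Cox & Snell R2, Nagelkerke R2) for a binary logistic model."""
    n = n_yes + n_no
    # Assumption: -2LL of the intercept-only model, which assigns every case
    # the observed share of Y. This value is not printed in the source output.
    neg2ll_null = -2 * (n_yes * math.log(n_yes / n) + n_no * math.log(n_no / n))
    cox_snell = 1 - math.exp(-(neg2ll_null - neg2ll_model) / n)
    max_cox_snell = 1 - math.exp(-neg2ll_null / n)  # upper bound of Cox & Snell
    return cox_snell, cox_snell / max_cox_snell

# Step 1 model: 418 Y and 189 N cases, -2 Log likelihood = 722.105.
cs, nk = pseudo_r2(418, 189, 722.105)
print(f"Cox & Snell R2 = {cs:.3f}, Nagelkerke R2 = {nk:.3f}")
```

Dividing by the upper bound of the Cox & Snell statistic is what lets the Nagelkerke version scale up to 1.0. The printed values should agree with the reported .049 and .070 up to rounding.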
Classification Table

                                  Predicted Loan_Status
        Observed                  N      Y      Percentage Correct
Step 1  Loan_Status   N           14     175    7.4
                      Y           12     406    97.1
        Overall Percentage                      69.2

a. The cut value is .500

The classification (or truth) table shows that the model correctly predicts 97.1% of the Y cases (true positives) but only 7.4% of the N cases (true negatives). This is not ideal for a bank, which needs the model to identify the loans it should reject.
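
The percentages in the classification table follow directly from the four cell counts. A minimal Python sketch (variable names are illustrative, not from the source):

```python
# Step 1 classification table: one dict per observed class,
# keyed by the predicted class.
observed_n = {"N": 14, "Y": 175}
observed_y = {"N": 12, "Y": 406}

# Share of each observed class that the model predicts correctly.
pct_n_correct = 100 * observed_n["N"] / sum(observed_n.values())
pct_y_correct = 100 * observed_y["Y"] / sum(observed_y.values())

# Overall share of correct predictions across both classes.
total = sum(observed_n.values()) + sum(observed_y.values())
pct_overall = 100 * (observed_n["N"] + observed_y["Y"]) / total

print(f"N correct: {pct_n_correct:.1f}%")
print(f"Y correct: {pct_y_correct:.1f}%")
print(f"Overall:   {pct_overall:.1f}%")
```

Only 14 of the 189 observed N cases are predicted as N, which is where the 7.4% figure comes from.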
Categorical Variables Coding

                                         Parameter coding
                             Frequency   (1)     (2)     (3)     (4)
Dependents      (blank)      1           1.000   .000    .000    .000
                0            349         .000    1.000   .000    .000
                1            103         .000    .000    1.000   .000
                2            103         .000    .000    .000    1.000
                3+           51          .000    .000    .000    .000
Property_Area   Rural        177         1.000   .000
                Semiurba     231         .000    1.000
                Urban        199         .000    .000
Education       Graduate     476         1.000
                Not Grad     131         .000
Married         No           210         1.000
                Yes          397         .000
Self_Employed   No           513         1.000
                Yes          94          .000
Gender          Female       116         1.000
                Male         491         .000

Variables in the Equation

                              B          S.E.         Wald     df   Sig.    Exp(B)
Step 1a  Gender(1)            -.013      .249         .003     1    .958    .987
         Married(1)           -.509      .214         5.633    1    .018    .601
         Dependents                                   2.559    4    .634
         Dependents(1)        -21.980    40192.970    .000     1    1.000   .000
         Dependents(2)        .305       .341         .798     1    .372    1.356
         Dependents(3)        -.025      .377         .004     1    .948    .976
         Dependents(4)        .362       .386         .882     1    .348    1.436
         Education(1)         .492       .218         5.084    1    .024    1.636
         Self_Employed(1)     -.076      .253         .089     1    .765    .927
         ApplicantIncome      .000       .000         .000     1    .997    1.000
         CoapplicantIncome    .000       .000         2.162    1    .141    1.000
         LoanAmount           -.001      .001         .592     1    .442    .999
         Loan_Amount_Term     -.001      .001         .648     1    .421    .999
         Property_Area                                11.448   2    .003
         Property_Area(1)     -.170      .222         .583     1    .445    .844
         Property_Area(2)     .558       .222         6.311    1    .012    1.748
         Constant             .939       .645         2.122    1    .145    2.558

a. Variable(s) entered on step 1: Gender, Married, Dependents, Education, Self_Employed, ApplicantIncome, CoapplicantIncome, LoanAmount, Loan_Amount_Term, Property_Area.

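The model shows the following significant independent variables (Sig. < .05):
• Married (Sig. = .018)
• Education (Sig. = .024)
• Property_Area (Sig. = .003 overall). With Urban as the reference category, only Property_Area(2), shown as "Semiurba" in the output, differs significantly (Sig. = .012); Property_Area(1), Rural, does not (Sig. = .445).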
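
The Exp(B) and Sig. columns for the significant rows can be reproduced from B and the Wald statistic. The Python sketch below treats each Wald value as a chi-square statistic with 1 degree of freedom, matching the df column; the helper name wald_p_value is hypothetical.

```python
import math

# (B, Wald) pairs copied from the Step 1 "Variables in the Equation" table.
rows = {
    "Married(1)": (-0.509, 5.633),
    "Education(1)": (0.492, 5.084),
    "Property_Area(2)": (0.558, 6.311),
}

def wald_p_value(wald):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(wald / 2))

results = {}
for name, (b, wald) in rows.items():
    # Exp(B) is the odds ratio relative to the reference category.
    results[name] = (math.exp(b), wald_p_value(wald))
    odds_ratio, p_value = results[name]
    print(f"{name}: Exp(B) = {odds_ratio:.3f}, Sig. = {p_value:.3f}")
```

Reading the odds ratios against the coding table: Married(1) codes "No", so Exp(B) = .601 means unmarried applicants have roughly 40% lower odds of Y than married applicants, and Education(1) codes "Graduate", so graduates have about 1.6 times the odds of Y of non-graduates. Because B is rounded to three decimals in the table, the recomputed Exp(B) can differ from the printed value in the last digit.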
Step 2: We now add Credit_History as an independent variable alongside the significant factors identified above (Married, Education and Property_Area) to build a new model.
Categorical Variables Codings

                                          Parameter coding
                              Frequency   (1)     (2)
Credit_History   (blank)      2           1.000   .000
                 0            103         .000    1.000
                 1            509         .000    .000
Property_Area    Rural        179         1.000   .000
                 Semiurba     233         .000    1.000
                 Urban        202         .000    .000
Education        Graduate     480         1.000
                 Not Grad     134         .000
Married          No           214         1.000
                 Yes          400         .000

Model Summary

Step   -2 Log likelihood   Cox & Snell R Square   Nagelkerke R Square
1      585.044a            .251                   .354

a. Estimation terminated at iteration number 20 because maximum iterations has been reached. Final solution cannot be found.

The Nagelkerke R2 statistic has risen substantially, from .070 to .354, so the new model explains far more of the variation in Loan_Status.
Classification Table

                                  Predicted Loan_Status
        Observed                  N      Y      Percentage Correct
Step 1  Loan_Status   N           87     105    45.3
                      Y           16     406    96.2
        Overall Percentage                      80.3

a. The cut value is .500

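The share of N cases predicted correctly (true negatives) has also increased, from 7.4% to 45.3%, while the share of Y cases predicted correctly remains high at 96.2%. Overall accuracy rises from 69.2% to 80.3%.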
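
One way to put the two overall percentages in context is to compare each with a majority-class baseline, that is, the accuracy obtained by predicting Y for every applicant. This comparison is an addition for illustration and is not part of the original analysis; the function name is hypothetical.

```python
def accuracy_vs_baseline(n_as_n, n_as_y, y_as_n, y_as_y):
    """Return (model accuracy %, majority-class baseline accuracy %).

    Arguments are the four cells of a classification table:
    observed N predicted N, observed N predicted Y,
    observed Y predicted N, observed Y predicted Y.
    """
    total = n_as_n + n_as_y + y_as_n + y_as_y
    model = 100 * (n_as_n + y_as_y) / total
    # Baseline: always predict whichever observed class is larger.
    baseline = 100 * max(n_as_n + n_as_y, y_as_n + y_as_y) / total
    return model, baseline

first_model = accuracy_vs_baseline(14, 175, 12, 406)   # without Credit_History
second_model = accuracy_vs_baseline(87, 105, 16, 406)  # with Credit_History

for label, (model, baseline) in [("First model", first_model),
                                 ("Second model", second_model)]:
    print(f"{label}: accuracy {model:.1f}% vs baseline {baseline:.1f}%")
```

Under this comparison the first model is barely better than always predicting Y, whereas the model that includes Credit_History improves on the baseline by more than ten percentage points.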
