Data Analysis Powerpoint 11


Credit 85

MACHINE LEARNING
NAME:
INSTITUTION:

• The histogram of credit amount shows that
smaller credits make up the majority of the
credits issued.
• Larger amounts are less common.
Credit Score Density Analysis
General Overview

Most applicants were between 30 and
40 years old.
Older individuals submit fewer
loan applications.
Applicants with lower savings
have higher loans than
applicants with higher
savings.
Decision Tree

A decision tree is a supervised
learning algorithm that can be
used for both classification and
regression tasks.

The algorithm builds a tree-like model
of decisions and their possible
consequences, with each decision
leading to a new set of decisions or an
outcome.
The decision tree begins with a single
node, known as the root node, which
represents the entire dataset.
The algorithm then splits the dataset
into smaller subsets based on the
values of a chosen feature or attribute.
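
The splitting step described above can be sketched in a few lines. This is a minimal illustration, not the code used for the analysis: it assumes Gini impurity as the split criterion (one common choice) and a single numeric feature. The `rows`, `labels`, and `amount` names and all values are hypothetical toy data.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels, feature):
    """Return (threshold, weighted_gini) for the best binary split on one numeric feature.

    The threshold is None when no split improves on the unsplit node.
    """
    best = (None, gini(labels))  # start from the impurity of the unsplit node
    for threshold in sorted({row[feature] for row in rows}):
        left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
        right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
        if not left or not right:
            continue  # a split must send at least one row to each side
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best


# Hypothetical toy data: a credit amount and a 0/1 class label per applicant.
rows = [{"amount": 500}, {"amount": 800}, {"amount": 1200},
        {"amount": 4000}, {"amount": 6000}, {"amount": 9000}]
labels = [1, 1, 1, 0, 0, 0]

threshold, impurity = best_split(rows, labels, "amount")
print(threshold, impurity)
```

A full tree repeats this search over every feature at every node and recurses into the two subsets until a stopping rule is met.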
Decision Tree
Analysis
The decision tree shows that
housing and current credit
are predictors of future
credit.
Decision Tree
Analysis
• The list of coefficients to the right
shows the significance of each variable in
predicting Creditability.
• The most significant are type of
apartment and duration at the current
address. Sex and marital status is
another important factor in
determining customer creditworthiness.
Decision Tree
Analysis
Techniques such as pruning or
ensemble methods like random
forests can be used to mitigate
overfitting.
In summary, decision trees are a
versatile and powerful machine
learning algorithm that can be
used for both classification and
regression tasks.
They are easy to interpret and
computationally efficient, but can
be prone to overfitting.
Decision Tree Analysis

From the results, the single
most important factor in
predicting Creditability is
Bank.Amount.
Age is an 80% predictor for
Creditability.
Logistic Regression Analysis

• In logistic regression, the probability
of the event (e.g., success, default,
fraud) is modeled as a function of one
or more predictor variables, often
referred to as independent variables
or features.
• The relationship between the
predictors and the probability of the
event is modeled using a logistic
function, which maps the input
variables onto the output probability.
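
As a rough sketch of the mapping described above, the logistic function turns a weighted sum of predictors into a probability between 0 and 1. The weights, intercept, and input values below are hypothetical and chosen only for illustration; they are not fitted to the credit data.

```python
import math


def logistic(z):
    """Map any real-valued score z to a probability in (0, 1)."""
    # Two branches avoid overflow in math.exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(features, weights, intercept):
    """Probability of the event given predictor values and coefficients."""
    z = intercept + sum(w * x for w, x in zip(weights, features))
    return logistic(z)


# Hypothetical coefficients and predictor values (not from the analysis).
p = predict_probability([2.0, 1.0], weights=[0.5, -1.0], intercept=0.0)
print(p)
```

A score of zero corresponds to a probability of 0.5, positive scores push the probability toward 1, and negative scores push it toward 0.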
Logistic Regression Analysis

• Logistic regression is widely used in various fields, including finance, healthcare,
marketing, and social sciences, for predicting outcomes such as customer churn, disease
diagnosis, and credit risk assessment.
• It is a powerful tool for analyzing and understanding the relationships between variables
and predicting the likelihood of future events.
Logistic Regression Analysis

• The accuracy of the logistic regression is 0.729. This is the share of cases the model
classifies correctly, a measure of how well it fits the data.
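
For reference, accuracy is the number of correct predictions divided by the total number of cases. The sketch below shows the calculation on hypothetical labels; it does not reproduce the 0.729 reported above, which depends on the actual data and fitted model.

```python
def accuracy(actual, predicted):
    """Share of cases where the predicted class equals the actual class."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("at least one case is required")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Hypothetical labels for eight applicants (not from the analysis).
actual = [1, 1, 0, 1, 0, 1, 1, 0]
predicted = [1, 0, 0, 1, 1, 1, 1, 0]

score = accuracy(actual, predicted)
print(score)
```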
Discriminant Analysis

The graph below shows the
result of the discriminant
analysis.
Creditability level 2
occupies the upper end of
the graph, while level 1
occupies the lower end.
Discriminant Analysis

Age in years has a coefficient
of 9.1, while Occupation has a
coefficient of 9.9. The two are the
most significant indicators for
Creditability.
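
To illustrate how a linear discriminant assigns an observation to a class, here is a minimal sketch for one numeric predictor and two classes. It assumes the textbook form of LDA with a pooled variance shared by both classes, and the function names and data are hypothetical. It is not the analysis behind the coefficients reported above.

```python
import math


def fit_lda_1d(values, labels):
    """Estimate class means, class priors, and the pooled variance."""
    classes = sorted(set(labels))
    n = len(values)
    means, priors = {}, {}
    pooled_ss = 0.0
    for c in classes:
        xs = [x for x, y in zip(values, labels) if y == c]
        means[c] = sum(xs) / len(xs)
        priors[c] = len(xs) / n
        pooled_ss += sum((x - means[c]) ** 2 for x in xs)
    variance = pooled_ss / (n - len(classes))  # shared across classes
    return means, priors, variance


def predict_lda_1d(x, means, priors, variance):
    """Return the class with the largest linear discriminant score."""
    def score(c):
        return (x * means[c] / variance
                - means[c] ** 2 / (2 * variance)
                + math.log(priors[c]))
    return max(means, key=score)


# Hypothetical predictor values and class levels (1 or 2), for illustration only.
values = [22, 25, 27, 30, 41, 45, 48, 52]
levels = [1, 1, 1, 1, 2, 2, 2, 2]

means, priors, variance = fit_lda_1d(values, levels)
print(predict_lda_1d(28, means, priors, variance))
print(predict_lda_1d(50, means, priors, variance))
```

With equal priors the decision boundary sits midway between the two class means, and unequal priors shift it toward the less frequent class.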
Discriminant Analysis

• The loan status was divided into three groups: below 200 million, between 200 and
400 million, and above a certain threshold. The highest indicator of creditworthiness,
as determined by the LDA, was found in the group with loan status below 200 million,
at 47.04%.
• The group with loan status between 200 and 400 million had a slightly lower
percentage, at 40.6%. The group with the highest loan status had the lowest
percentage, at 12.3%.
• These results are specific to the dataset and variables used in the analysis and
should not be generalized to other situations without further investigation.
R CODE for the Analysis