
Why Linear Regression is not suitable for Classification


Two reasons why linear regression is not suitable for classification:

- The predicted value is continuous and unbounded, so it cannot be read as a probability.

- It is sensitive to imbalanced data.
Source: https://jinglescode.github.io/2019/05/07/why-linear-regression-is-not-suitable-for-classification/#:~:text=This%20article%20explains%20why%20logistic,using%20linear%20regression%20for%20classification
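
As a minimal sketch of the first reason, the snippet below fits a least-squares line to a small made-up dataset and checks the range of its predictions. The `hours` and `passed` values and the `fit_line` helper are illustrative assumptions, not taken from the article. Some predictions fall below 0 and some above 1, so they cannot be interpreted as probabilities.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data: hours studied and a 0/1 pass label.
hours = [1, 2, 3, 4, 5, 6, 7, 8]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

b0, b1 = fit_line(hours, passed)
predictions = [b0 + b1 * x for x in hours]

# The fitted line is unbounded, so predictions can leave the [0, 1] range.
print(min(predictions), max(predictions))
```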

When new data points arrive, the best-fit line may have to be refitted, and the refitted line can change the predicted category of the earlier data points.
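
The note above can be illustrated with a small sketch, assuming a 0.5 threshold on the fitted line and made-up data (the values and the helper names `fit_line` and `cutoff_x` are hypothetical). Adding a few extreme positive points shifts the line, and a point that was classified as 1 before the refit is classified as 0 afterwards.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def cutoff_x(intercept, slope, threshold=0.5):
    """Value of x at which the fitted line crosses the threshold."""
    return (threshold - intercept) / slope


# Original (hypothetical) data: the line crosses 0.5 between x=4 and x=5.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
b0, b1 = fit_line(xs, ys)
cutoff_before = cutoff_x(b0, b1)

# New positive points far to the right pull the line down near x=5.
xs_new = xs + [20, 25, 30]
ys_new = ys + [1, 1, 1]
c0, c1 = fit_line(xs_new, ys_new)
cutoff_after = cutoff_x(c0, c1)

# x=5 is a positive example; compare its score before and after the refit.
score_before = b0 + b1 * 5
score_after = c0 + c1 * 5
print(cutoff_before, cutoff_after, score_before, score_after)
```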
Now we have two models trained on the same dataset, one by linear regression and another by logistic regression. We can compare both models' performance using the root mean squared error (RMSE) and the coefficient of determination (R² score).
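
For reference, both metrics can be computed by hand. The sketch below uses the standard definitions; the labels and predictions are made-up values rather than outputs of the two models discussed here.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Hypothetical labels and model outputs.
y_true = [0, 0, 1, 1]
y_pred = [0.1, 0.4, 0.6, 0.9]

print(rmse(y_true, y_pred), r2_score(y_true, y_pred))
```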
Problem #2: Sensitive to imbalanced data

1. Clean up the data

2. Convert categorical variables to dummies

3. Split into train and test sets

4. Build the model

   Get the predicted probabilities on the train set

5. Use the train probabilities to identify the cutoff

   Use the KS statistic or the F-beta score to find the cutoff

6. Apply the model to the test data

7. Apply that cutoff to convert the soft classes (probabilities) to 1 or 0

8. Calculate all metrics


Sn (sensitivity)

Sp (specificity)
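
Steps 5 to 8 can be sketched as follows. This is only an illustration under stated assumptions: the probabilities are made-up values standing in for a fitted model's output, KS is taken as the largest gap between true positive rate and false positive rate over the candidate cutoffs, and the helper names (`confusion`, `best_cutoff`, and so on) are hypothetical. Both classes must be present in the data, otherwise the rates divide by zero.

```python
def confusion(labels, probs, cutoff):
    """Return (tp, fp, tn, fn) when predicting 1 for prob >= cutoff."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probs):
        pred = 1 if p >= cutoff else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def ks_gap(tp, fp, tn, fn):
    """True positive rate minus false positive rate at one cutoff."""
    return tp / (tp + fn) - fp / (fp + tn)


def f_beta(tp, fp, tn, fn, beta=1.0):
    """F-beta score from confusion counts (tn is unused, kept for a uniform signature)."""
    denom = (1 + beta ** 2) * tp + beta ** 2 * fn + fp
    return (1 + beta ** 2) * tp / denom if denom else 0.0


def best_cutoff(labels, probs, score):
    """Try every observed probability as a cutoff and keep the best-scoring one."""
    candidates = sorted(set(probs))
    return max(candidates, key=lambda c: score(*confusion(labels, probs, c)))


# Step 5: hypothetical train labels and train probabilities.
train_y = [0, 0, 0, 0, 0, 1, 0, 1, 1, 1]
train_p = [0.05, 0.1, 0.2, 0.25, 0.3, 0.4, 0.45, 0.6, 0.7, 0.9]
cutoff_ks = best_cutoff(train_y, train_p, ks_gap)
cutoff_f1 = best_cutoff(train_y, train_p, f_beta)

# Steps 6 and 7: hypothetical test probabilities, converted to 1 or 0.
test_y = [0, 1, 0, 1]
test_p = [0.2, 0.5, 0.45, 0.35]
test_pred = [1 if p >= cutoff_ks else 0 for p in test_p]

# Step 8: sensitivity (Sn) and specificity (Sp) on the test set.
tp, fp, tn, fn = confusion(test_y, test_p, cutoff_ks)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print(cutoff_ks, cutoff_f1, test_pred, sensitivity, specificity)
```

The cutoff is chosen on the train probabilities only and then reused unchanged on the test set, which matches the order of the steps above.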

Odds = P/(1-P)

Log(P/(1-P)) = Z = B0 + B1X1 + B2X2 + ….

Taking the exponential of both sides:

P/(1-P) = exp(Z)

P = exp(Z)(1-P)

P(1+exp(Z)) = exp(Z)

P = exp(Z)/(1+exp(Z))

Evaluating the linear equation with the fitted coefficients gives the Z value.

We then compute exp(Z)/(1+exp(Z)) to get the probability P.
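
A minimal sketch of this calculation is shown below. The intercept `b0`, the coefficient list `betas`, and the input value are made-up numbers chosen for illustration, not fitted values. The last function takes the log of the odds as a check that it recovers Z.

```python
import math


def linear_predictor(intercept, coefs, xs):
    """Z = B0 + B1*X1 + B2*X2 + ..."""
    return intercept + sum(b * x for b, x in zip(coefs, xs))


def probability(z):
    """P = exp(Z) / (1 + exp(Z))"""
    return math.exp(z) / (1 + math.exp(z))


def log_odds(p):
    """Log(P / (1 - P)), which should give back Z."""
    return math.log(p / (1 - p))


# Hypothetical coefficients and a single input row.
b0 = -4.0
betas = [0.8]
z = linear_predictor(b0, betas, [7.0])
p = probability(z)

print(z, p, log_odds(p))
```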

Z = Anand__ Wajirabaad

The cutoff determines the precision, recall, and other metrics such as accuracy.

Precision and recall are in turn used to calculate the F-beta score.
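
The relationship can be written out as a short sketch using the standard formulas. The confusion counts are made-up values, and the function name `classification_metrics` is hypothetical. It assumes at least one predicted positive and one actual positive, otherwise precision or recall is undefined.

```python
def classification_metrics(tp, fp, tn, fn, beta=1.0):
    """Precision, recall, accuracy and F-beta from confusion counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    # F-beta weights recall beta times as much as precision.
    f_beta = (1 + beta ** 2) * precision * recall / (beta ** 2 * precision + recall)
    return {
        "precision": precision,
        "recall": recall,
        "accuracy": accuracy,
        "f_beta": f_beta,
    }


# Hypothetical counts obtained after applying some cutoff.
f1_metrics = classification_metrics(tp=4, fp=1, tn=5, fn=0, beta=1.0)
f2_metrics = classification_metrics(tp=4, fp=1, tn=5, fn=0, beta=2.0)
print(f1_metrics)
print(f2_metrics)
```

Changing the cutoff changes the counts, which in turn changes every value returned here.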
