Logistic Regression Example Illustrated
Z = 2
P = 0.88
A student who studied for 33 hours has an 88% chance of passing the course.
The logistic function is an S-shaped curve that can take any real-valued number and map it
into a value between 0 and 1:
1 / (1 + e^-value)
Where e is the base of the natural logarithms (Euler’s number or the EXP() function in
your spreadsheet) and value is the actual numerical value that you want to transform.
Below is a plot of the numbers between -5 and 5 transformed into the range 0 and 1
using the logistic function.
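As a stand-in for the plot, here is a minimal Python sketch of the logistic function 1 / (1 + e^-value), applied to the integers from -5 to 5. The function name `sigmoid` is chosen only for illustration.

```python
import math

def sigmoid(value: float) -> float:
    """Logistic function: 1 / (1 + e^-value)."""
    return 1.0 / (1.0 + math.exp(-value))

# Transform the integers from -5 to 5 into the range (0, 1).
for value in range(-5, 6):
    print(f"{value:>3} -> {sigmoid(value):.4f}")
```

An input of 0 maps to exactly 0.5, large negative inputs approach 0, and large positive inputs approach 1. An input of 2 gives roughly 0.88.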
The odds equal the probability that Y=1 divided by the probability that Y=0.
For example, if the probability that Y=1 is 0.8, then the probability that Y=0 is 1 - 0.8, or 0.2.
Odds = P(Y=1)/P(Y=0) = 0.8/0.2 = 4
Log odds = Ln(Odds) = Ln(P(Y=1)/P(Y=0)) = Ln(P(Y=1)/[1-P(Y=1)])
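The odds and log-odds formulas can be checked with a short Python sketch. The helper names `odds` and `log_odds` are illustrative and do not come from the original text.

```python
import math

def odds(p: float) -> float:
    """Odds of an event with probability p: P(Y=1) / P(Y=0)."""
    return p / (1.0 - p)

def log_odds(p: float) -> float:
    """Natural log of the odds (the logit)."""
    return math.log(odds(p))

p = 0.8
print(f"odds     = {odds(p):.3f}")      # approximately 4
print(f"log odds = {log_odds(p):.3f}")  # approximately ln(4)
```

Floating-point arithmetic may give a value a hair away from exactly 4, which is why the output is rounded.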
Sigmoid function:
P(Y=1) = 1 / (1 + e^-Z)
where Z is the log odds.
The crosstab of the variable hon with female shows that there are 109
females and 91 males; 32 of those 109 females secured honours.
Probability:
The probability of an event is the number of instances of that event divided
by the total number of instances present.
P(honours | female) = 32/109 = 0.29
Odds:
Odds = 32/77 = 0.42
32/77 => For every 32 females that secure honours, there are 77 females
that do not secure honours.
32/109 => There are 32 females that secure honours for every 109 (i.e.
32+77) females. This is the probability, not the odds.
Log odds:
The logit or log-odds of an event is the log of the odds. This refers to the
natural log (base e). Thus,
log-odds = ln(32/77) = ln(0.416) = -0.877
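The three quantities for the female group can be recomputed from the crosstab counts quoted above (109 females, 32 with honours). This is a minimal sketch, and the variable names are illustrative.

```python
import math

females_total = 109
females_honours = 32
females_no_honours = females_total - females_honours  # 77

probability = females_honours / females_total      # events / all instances
odds = females_honours / females_no_honours        # events / non-events
log_odds = math.log(odds)                          # natural log of the odds

print(f"probability = {probability:.2f}")
print(f"odds        = {odds:.2f}")
print(f"log-odds    = {log_odds:.3f}")
```

Computed from the unrounded ratio 32/77, the log-odds come out near -0.878. Rounding the odds to 0.416 before taking the log gives -0.877.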
Q: Find the odds ratio of graduating with honours for females and males.
Calculations:
B1 = 0.593
Thus, the logistic regression equation becomes
log-odds(hon) = B0 + 0.593 * female
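For a binary predictor, exponentiating its coefficient gives the odds ratio between the two groups. The sketch below assumes `female` is coded 1 for females and 0 for males; the text does not state the coding explicitly.

```python
import math

b1 = 0.593  # coefficient for 'female', as quoted in the text

# Assumption: female = 1, male = 0, so e^B1 is the female-to-male odds ratio.
odds_ratio = math.exp(b1)
print(f"odds ratio (female vs male) = {odds_ratio:.2f}")
```

Under that assumption, the odds of graduating with honours are roughly 1.8 times higher for females than for males.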
Now, let us find the probability of a female securing honours when
there is only one input feature present, 'female'.
The log-odds are -0.877.
Thus, odds = e^(B^T X) = e^(-0.877) = 0.416
And the probability is calculated as:
probability = odds / (1 + odds) = 0.416 / 1.416 = 0.29
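The conversion from log-odds to odds to probability can be sketched in a few lines of Python, using the -0.877 figure quoted in the text.

```python
import math

log_odds = -0.877                   # log-odds for females, as quoted in the text
odds = math.exp(log_odds)           # odds = e^(log-odds)
probability = odds / (1.0 + odds)   # probability = odds / (1 + odds)

print(f"odds        = {odds:.3f}")
print(f"probability = {probability:.2f}")
```

The result agrees with the direct count of 32 out of 109 females, about 0.29.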
Suppose we want to calculate the effect of being female on the probability
of graduating with honours.
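One way to sketch this effect is to compare the predicted probability with and without the `female` indicator switched on. The intercept is not given in the text, so the code below infers it. It assumes the model has the form log-odds = B0 + B1 * female with female coded 1, which makes B0 the female log-odds minus B1.

```python
import math

def sigmoid(z: float) -> float:
    """Map log-odds z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

b1 = 0.593                  # coefficient for 'female', quoted in the text
female_log_odds = -0.877    # log-odds for females, quoted in the text

# Assumption: intercept implied by the two quoted figures (not stated in the text).
b0 = female_log_odds - b1

p_male = sigmoid(b0)          # female = 0
p_female = sigmoid(b0 + b1)   # female = 1

print(f"P(honours | male)   = {p_male:.3f}")
print(f"P(honours | female) = {p_female:.3f}")
print(f"difference          = {p_female - p_male:.3f}")
```

The coefficient is constant on the log-odds scale, but the resulting change in probability depends on the baseline, so the difference printed here applies only to this single-feature model.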
A key difference from linear regression is that the output value being modeled is a binary
value (0 or 1) rather than a numeric value.
Below is an example logistic regression equation:
y = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))
Where y is the predicted output, b0 is the bias or intercept term and b1 is the coefficient
for the single input value (x).
Let's say we have a model that can predict whether a person is male or female based on
their height (completely fictitious). Given a height of 150cm, is the person male or female?
We have learned the coefficients b0 = -100 and b1 = 0.6. Using the equation above,
we can calculate the probability of male given a height of 150cm, or more formally
P(male|height=150). We will use EXP() for e, because that is what you can use if you
type this example into your spreadsheet:
y = e^(b0 + b1*x) / (1 + e^(b0 + b1*x))
y = EXP(-100 + 0.6*150) / (1 + EXP(-100 + 0.6*150))
y = 0.0000453978687
Or a probability of near zero that the person is male.
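The same calculation can be reproduced outside a spreadsheet. This is a minimal Python sketch using the fictitious coefficients from the example, with `math.exp` playing the role of EXP().

```python
import math

b0 = -100.0        # intercept from the example
b1 = 0.6           # coefficient for height from the example
height_cm = 150.0

z = b0 + b1 * height_cm                 # linear part: -100 + 0.6 * 150 = -10
y = math.exp(z) / (1.0 + math.exp(z))   # logistic transform

print(f"P(male | height=150) = {y:.13f}")
```

The result should reproduce the value of about 0.0000454 quoted above.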