Lecture 10 - Logistic Regression (Cont.)


Machine Learning

Logistic regression (cont.)


Lecture – 10

Instructor: Qamar Askari


Lecture headlines
• Formulating cost function for logistic regression
• Gradient descent for logistic regression
• Multi-class classification and one vs. all method
Formulating the cost function for logistic regression
Cost function of linear regression:

    J(θ) = (1/2m) · Σ_{i=1..m} ( h_θ(x^(i)) − y^(i) )²

What if we use the same cost function for logistic regression?


Answer: We can use it, but the resulting cost function J becomes non-convex for
logistic regression, because the hypothesis involves the sigmoid function. The
effect is shown below:

[Figure: J for linear regression (convex, single global minimum) alongside J for logistic regression with the squared-error cost (non-convex)]

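To make the non-convexity concrete, here is a small NumPy sketch (my own illustrative example with a made-up one-feature dataset, not from the lecture): it evaluates the squared-error cost with a sigmoid hypothesis on a grid of θ values and checks the discrete second differences; a convex curve would never produce a negative second difference.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Made-up one-feature dataset (no intercept term, purely for illustration).
x = np.array([0.5, 1.0, 2.0, 3.0])
y = np.array([0.0, 0.0, 1.0, 1.0])

thetas = np.linspace(-10, 10, 401)
# Squared-error cost J(theta) = (1/2m) * sum( (sigmoid(theta * x_i) - y_i)^2 )
J = np.array([np.mean((sigmoid(t * x) - y) ** 2) / 2.0 for t in thetas])

# Discrete second differences: negative values indicate the curve bends
# downward somewhere, i.e. the squared-error cost is not convex in theta.
second_diff = np.diff(J, n=2)
print("min second difference:", second_diff.min())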

Formulating the cost function for logistic regression
What is a convex function and what are its benefits?
In mathematics, a real-valued function defined on an n-dimensional interval is
called convex (or convex downward or concave upward) if the line segment
between any two points on the graph of the function lies above or on the graph.

Convex functions play an important role in many areas of mathematics. They are
especially important in the study of optimization problems where they are
distinguished by a number of convenient properties. For instance, a strictly convex
function on an open set has no more than one minimum.
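Stated symbolically, the definition above says that for every pair of points x_1, x_2 in the domain and every weight t in [0, 1]:

\[
f\bigl(t\,x_1 + (1 - t)\,x_2\bigr) \;\le\; t\,f(x_1) + (1 - t)\,f(x_2)
\]

In other words, the chord joining (x_1, f(x_1)) and (x_2, f(x_2)) never dips below the graph, which is exactly the geometric condition described above.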

For further reading, please visit the following link:


https://en.wikipedia.org/wiki/Convex_function
Formulating the cost function for logistic regression
Drawback of a non-convex cost function:
Gradient descent converges to a local optimum. If the function is non-convex,
there may be many local optima, and the algorithm may fail to reach the global
optimum. For example, in the situation shown below, the algorithm may converge
to the upper blue point, a local optimum, instead of the global optimum.

[Figure: non-convex J for logistic regression, with several local optima]


Formulating the cost function for logistic regression
Making the cost function convex:
The cost function can be made convex by replacing the squared error with a logarithm-based cost, as sketched below.
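Concretely, the standard log-based per-example cost for logistic regression (and, as far as can be reconstructed, what the original slide illustrated) is:

\[
\mathrm{Cost}\bigl(h_\theta(x),\, y\bigr) =
\begin{cases}
-\log\bigl(h_\theta(x)\bigr) & \text{if } y = 1\\
-\log\bigl(1 - h_\theta(x)\bigr) & \text{if } y = 0
\end{cases}
\]

Each branch is convex in θ, and the cost grows without bound when the model is confidently wrong (e.g. h_θ(x) → 0 while y = 1).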
Formulating the cost function for logistic regression
Final cost function (J):
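The standard combined form, which averages the per-example log cost over the m training examples, is:

\[
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m} \Bigl[\, y^{(i)} \log h_\theta\bigl(x^{(i)}\bigr) + \bigl(1 - y^{(i)}\bigr) \log\bigl(1 - h_\theta\bigl(x^{(i)}\bigr)\bigr) \Bigr]
\]

Because y^(i) is either 0 or 1, exactly one of the two terms is active for each example, so this is the same piecewise cost written as a single expression.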
Gradient descent for logistic regression

Want: min_θ J(θ)

Repeat {
    θ_j := θ_j − α · ∂J(θ)/∂θ_j
}
(simultaneously update all θ_j)

Challenge: finding the derivative of J with respect to the parameters θ_j.


Good news: the derivative of this cost function has exactly the same form as the derivative of the cost function used
for linear regression. For the justification, please see Handout 10.2.
Gradient descent for logistic regression

Want: min_θ J(θ)

Repeat {
    θ_j := θ_j − α · (1/m) · Σ_{i=1..m} ( h_θ(x^(i)) − y^(i) ) · x_j^(i)
}
(simultaneously update all θ_j)

Algorithm looks identical to linear regression!
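For concreteness, here is a minimal NumPy sketch of this update rule (the dataset, learning rate alpha, and iteration count are made-up illustration values, not from the lecture):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y):
    # J(theta) = -(1/m) * sum( y*log(h) + (1-y)*log(1-h) )
    m = len(y)
    h = sigmoid(X @ theta)
    return -(1.0 / m) * np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradient_descent(X, y, alpha=0.1, num_iters=1000):
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(num_iters):
        h = sigmoid(X @ theta)
        grad = (1.0 / m) * (X.T @ (h - y))   # same form as in linear regression
        theta = theta - alpha * grad          # simultaneous update of all theta_j
    return theta

# Tiny made-up dataset: the first column of ones is the intercept term x0.
X = np.array([[1.0, 0.5], [1.0, 1.5], [1.0, 2.5], [1.0, 3.5]])
y = np.array([0, 0, 1, 1])
theta = gradient_descent(X, y)
print("learned theta:", theta)
print("final cost:", cost(theta, X, y))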


Multi-class classification
What if there are more than two classes?

Example 1:
Email foldering/tagging: Work, Friends, Family, Hobby

Example 2:
Medical diagnosis: Not ill, Cold, Flu

Example 3:
Weather: Sunny, Cloudy, Rain, Snow
What if there are more than two classes?
[Figure: two scatter plots over features x1 (horizontal axis) and x2 (vertical axis) — left: binary classification with two classes; right: multi-class classification with three or more classes]
Solution: One-vs-all method
Train a logistic regression classifier h_θ^(i)(x) for each class i to predict the
probability that y = i, i.e. h_θ^(i)(x) = P(y = i | x; θ).

On a new input x, to make a prediction, pick the class i that maximizes h_θ^(i)(x).
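A minimal sketch of this procedure, built on the same gradient-descent trainer shown earlier (the helper names train_one_vs_all and predict_one_vs_all are my own, and the three-class dataset is made up for illustration):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logistic(X, y, alpha=0.1, num_iters=2000):
    # Batch gradient descent for a single binary logistic regression classifier.
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(num_iters):
        grad = (1.0 / m) * (X.T @ (sigmoid(X @ theta) - y))
        theta -= alpha * grad
    return theta

def train_one_vs_all(X, y, num_classes):
    # One classifier per class i, trained on the binary labels (y == i).
    return np.array([train_logistic(X, (y == i).astype(float)) for i in range(num_classes)])

def predict_one_vs_all(all_theta, X):
    # Pick, for each example, the class whose classifier gives the highest probability.
    return np.argmax(sigmoid(X @ all_theta.T), axis=1)

# Made-up three-class dataset; the first column of ones is the intercept term x0.
X = np.array([[1, 0.0, 0.2], [1, 0.3, 0.1],   # class 0
              [1, 2.0, 2.2], [1, 2.3, 1.9],   # class 1
              [1, 4.1, 0.1], [1, 4.3, 0.4]])  # class 2
y = np.array([0, 0, 1, 1, 2, 2])

all_theta = train_one_vs_all(X, y, num_classes=3)
print(predict_one_vs_all(all_theta, X))   # should largely recover [0 0 1 1 2 2]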
