


Linear Regression

A study of the Coursera course

All Rights Reserved: Andrew Ng
Linear Regression with One Variable
[Figure: Housing Prices (Portland, OR): scatter plot of Price (in 1000s of dollars) against Size (feet²)]
Supervised Learning: the “right answer” is given for each example in the data.
Regression Problem: predict a real-valued output.
Training set of housing prices:

Size in feet² (x)    Price ($) in 1000s (y)
2104                 460
1416                 232
1534                 315
 852                 178
 …                   …
Notation:
m = number of training examples
x’s = “input” variable / features
y’s = “output” variable / “target” variable

Workflow: Training Set → Learning Algorithm → hypothesis h
The hypothesis h maps the size of a house to an estimated price:
Size of house → h → Estimated price
Question: how do we describe h?


Training Set: the same housing data as above (size x in feet², price y in $1000s).

Hypothesis: hθ(x) = θ0 + θ1·x

θ0, θ1: parameters
How do we choose θ0, θ1?

Idea: choose θ0, θ1 so that hθ(x) is close to y for our training examples (x, y).
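
As a minimal sketch (the parameter values are illustrative, not from the slides), the hypothesis is just a straight line in code:

```python
# One-variable hypothesis: h_theta(x) = theta0 + theta1 * x
def h(theta0, theta1, x):
    return theta0 + theta1 * x

# Illustrative parameters: predict the price (in $1000s) of a 2104 ft^2 house
print(h(50.0, 0.2, 2104))   # 50 + 0.2 * 2104 = 470.8
```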
Cost Function

Hypothesis: hθ(x) = θ0 + θ1·x
(Simplified case: fix θ0 = 0, so hθ(x) = θ1·x and the cost depends on θ1 alone.)

Parameters: θ0, θ1

Cost function: J(θ0, θ1) = (1/2m) · Σ_{i=1}^{m} (hθ(x^(i)) − y^(i))²

Goal: minimize J(θ0, θ1) over θ0, θ1

[Figure: Price ($ in 1000s) against Size in feet² (x), with a candidate hypothesis line through the training points]

Question: how do we minimize J?
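
A small Python sketch of computing J on the four training examples above (numpy is a choice made here; the course itself works in Octave/MATLAB):

```python
import numpy as np

# Training data from the slides: size in ft^2 (x), price in $1000s (y)
x = np.array([2104.0, 1416.0, 1534.0, 852.0])
y = np.array([460.0, 232.0, 315.0, 178.0])

def J(theta0, theta1):
    """Squared-error cost: J = (1/2m) * sum over i of (h(x^(i)) - y^(i))^2."""
    m = len(x)
    predictions = theta0 + theta1 * x      # h_theta(x) for every example
    return np.sum((predictions - y) ** 2) / (2 * m)

print(J(0.0, 0.0))    # cost of the all-zero hypothesis
print(J(0.0, 0.2))    # a better line has a visibly lower cost
```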
Gradient Descent

Have some function J(θ0, θ1).
Want min over θ0, θ1 of J(θ0, θ1).

Outline:
• Start with some θ0, θ1.
• Keep changing θ0, θ1 to reduce J(θ0, θ1), until we hopefully end up at a minimum.
Gradient descent algorithm:

repeat until convergence {
    θj := θj − α · ∂/∂θj J(θ0, θ1)    (for j = 0 and j = 1)
}

Correct (simultaneous update):
    temp0 := θ0 − α · ∂/∂θ0 J(θ0, θ1)
    temp1 := θ1 − α · ∂/∂θ1 J(θ0, θ1)
    θ0 := temp0
    θ1 := temp1

Incorrect:
    temp0 := θ0 − α · ∂/∂θ0 J(θ0, θ1)
    θ0 := temp0
    temp1 := θ1 − α · ∂/∂θ1 J(θ0, θ1)    (evaluated at the already-updated θ0)
    θ1 := temp1
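
The distinction matters whenever one partial derivative depends on the other parameter. A runnable sketch, using a toy cost J(θ0, θ1) = (θ0 + θ1)² chosen only for illustration:

```python
# Toy cost J(t0, t1) = (t0 + t1)^2; both partial derivatives are 2*(t0 + t1).
def dJ_dt0(t0, t1):
    return 2 * (t0 + t1)

def dJ_dt1(t0, t1):
    return 2 * (t0 + t1)

alpha = 0.1

# Correct: both partials are evaluated at the OLD parameter values
theta0, theta1 = 1.0, 2.0
temp0 = theta0 - alpha * dJ_dt0(theta0, theta1)
temp1 = theta1 - alpha * dJ_dt1(theta0, theta1)
theta0, theta1 = temp0, temp1
print(theta0, theta1)   # 0.4 1.4

# Incorrect: theta0 is overwritten before theta1's partial is computed,
# so the second partial sees a mixture of new and old parameters
theta0, theta1 = 1.0, 2.0
theta0 = theta0 - alpha * dJ_dt0(theta0, theta1)
theta1 = theta1 - alpha * dJ_dt1(theta0, theta1)
print(theta0, theta1)   # 0.4 1.52, a different (wrong) result
```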


Gradient descent algorithm

Notice: α is the learning rate.

If α is too small, gradient descent can be slow.
If α is too large, gradient descent can overshoot the minimum; it may fail to converge, or even diverge.
At a local optimum, the derivative term is zero, so the gradient descent update leaves the current value of θ1 unchanged.

Gradient descent can converge to a local minimum even with the learning rate α fixed. As we approach a local minimum, the derivative term shrinks, so gradient descent automatically takes smaller steps; there is no need to decrease α over time.
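
A one-dimensional illustration of this point (a toy example, not from the slides): minimizing J(θ) = θ² with a fixed α, the step α · dJ/dθ shrinks by itself as θ approaches the minimum at 0:

```python
# Toy 1-D cost J(theta) = theta^2, with derivative dJ/dtheta = 2*theta.
alpha = 0.1
theta = 4.0
for i in range(5):
    step = alpha * 2 * theta        # alpha times the current derivative
    theta -= step
    print(f"step {step:.4f} -> theta {theta:.4f}")
# Steps shrink (0.8000, 0.6400, 0.5120, ...) even though alpha never changes.
```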
Gradient Descent for Linear Regression

Plugging the linear regression model into the gradient descent algorithm (i.e. working out the partial derivatives of J) gives:

repeat until convergence {
    θ0 := θ0 − α · (1/m) · Σ_{i=1}^{m} (hθ(x^(i)) − y^(i))
    θ1 := θ1 − α · (1/m) · Σ_{i=1}^{m} (hθ(x^(i)) − y^(i)) · x^(i)
}
(update θ0 and θ1 simultaneously)
[Figure: surface plot of J(θ0, θ1) over the (θ0, θ1) plane, followed by a sequence of paired panels: on the left, hθ(x) for fixed θ0, θ1 as a function of x; on the right, J as a function of the parameters, with the descent path traced on its contours]
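
A runnable sketch of these updates on the four housing examples (the feature scaling step is an addition for numerical stability, not part of this section; without it, a single fixed α is hard to pick for raw square-footage values):

```python
import numpy as np

# Housing data from the slides: size in ft^2 (x), price in $1000s (y)
x = np.array([2104.0, 1416.0, 1534.0, 852.0])
y = np.array([460.0, 232.0, 315.0, 178.0])
m = len(x)

# Scale the feature to zero mean / unit variance (an added step, see above)
x_s = (x - x.mean()) / x.std()

alpha = 0.1
theta0, theta1 = 0.0, 0.0
for _ in range(1000):
    err = theta0 + theta1 * x_s - y      # h(x^(i)) - y^(i), vectorized
    grad0 = err.mean()                   # (1/m) * sum of errors
    grad1 = (err * x_s).mean()           # (1/m) * sum of errors * x
    # simultaneous update of both parameters
    theta0, theta1 = theta0 - alpha * grad0, theta1 - alpha * grad1

print(theta0, theta1)   # fitted parameters (on the scaled feature)
```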
Linear Regression with Multiple Variables

Hypothesis: hθ(x) = θ0·x0 + θ1·x1 + … + θn·xn = θᵀx   (with the convention x0 = 1)

Parameters: θ0, θ1, …, θn, collected into an (n+1)-dimensional vector θ

Cost function: J(θ) = (1/2m) · Σ_{i=1}^{m} (hθ(x^(i)) − y^(i))²

Gradient descent:
Repeat {
    θj := θj − α · ∂/∂θj J(θ)
}
(simultaneously update θj for every j = 0, …, n)
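
A minimal sketch of the vectorized hypothesis; the parameter values are illustrative, not from the slides:

```python
import numpy as np

# Multivariate hypothesis: h_theta(x) = theta^T x, with x0 = 1 prepended.
theta = np.array([80.0, 0.1, 10.0, 3.0, -2.0])   # theta0 .. theta4 (illustrative)
x = np.array([1.0, 2104.0, 5.0, 1.0, 45.0])      # x0 = 1, then the four features
print(theta @ x)   # predicted price in $1000s: 80 + 210.4 + 50 + 3 - 90 = 253.4
```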


Gradient Descent

Previously (n = 1):
Repeat {
    θ0 := θ0 − α · (1/m) · Σ_{i=1}^{m} (hθ(x^(i)) − y^(i))
    θ1 := θ1 − α · (1/m) · Σ_{i=1}^{m} (hθ(x^(i)) − y^(i)) · x^(i)
}
(simultaneously update θ0, θ1)

New algorithm (n ≥ 1):
Repeat {
    θj := θj − α · (1/m) · Σ_{i=1}^{m} (hθ(x^(i)) − y^(i)) · xj^(i)
}
(simultaneously update θj for j = 0, …, n)

With x0^(i) = 1, the previous rules are exactly the j = 0 and j = 1 cases of the new one.
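
The general rule vectorizes into a single line. A sketch, assuming X is the m×(n+1) design matrix with a leading column of ones and y is the target vector:

```python
import numpy as np

def gradient_descent(X, y, alpha=0.01, iters=1000):
    """Batch gradient descent for multivariate linear regression.

    X: (m, n+1) design matrix whose first column is all ones (x0 = 1).
    y: (m,) vector of targets.
    Each iteration performs the simultaneous update
        theta := theta - alpha * (1/m) * X^T (X theta - y),
    i.e. theta_j := theta_j - alpha*(1/m)*sum(err * x_j) for every j at once.
    """
    m = X.shape[0]
    theta = np.zeros(X.shape[1])
    for _ in range(iters):
        theta = theta - (alpha / m) * (X.T @ (X @ theta - y))
    return theta

# Usage sketch on toy data (values illustrative): y = 1 + 2x exactly
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])
print(gradient_descent(X, y, alpha=0.3, iters=2000))   # approaches [1, 2]
```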
Normal Equation in Linear Regression

Example (m = 4 training examples; a column x0 = 1 has been prepended):

x0   Size (feet²) x1   Number of bedrooms x2   Number of floors x3   Age of home (years) x4   Price ($1000) y
1    2104             5                      1                     45                      460
1    1416             3                      2                     40                      232
1    1534             3                      2                     30                      315
1     852             2                      1                     36                      178

Collect the inputs row by row into the m × (n+1) design matrix X and the targets into the vector y. The normal equation then gives the minimizing parameters in closed form, with no iteration:

θ = (XᵀX)⁻¹ · Xᵀ · y
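
A numpy sketch of the normal equation on the table above. Note that with only m = 4 examples and n + 1 = 5 columns, XᵀX is singular, so the pseudoinverse (np.linalg.pinv) is used where a plain inverse would fail:

```python
import numpy as np

# Design matrix from the table above: x0 = 1, then the four features
X = np.array([
    [1.0, 2104.0, 5.0, 1.0, 45.0],
    [1.0, 1416.0, 3.0, 2.0, 40.0],
    [1.0, 1534.0, 3.0, 2.0, 30.0],
    [1.0,  852.0, 2.0, 1.0, 36.0],
])
y = np.array([460.0, 232.0, 315.0, 178.0])

# Normal equation: theta = (X^T X)^(-1) X^T y.
# Here X^T X is singular (4 examples, 5 columns), so pinv returns the
# minimum-norm least-squares solution where inv would raise an error.
theta = np.linalg.pinv(X.T @ X) @ (X.T @ y)
print(theta)
```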
Differences

Gradient Descent                         Normal Equation
• Need to choose α                       • No need to choose α
• Needs many iterations                  • No need to iterate
• Works well even when n is large        • Slow if n is large (computes (XᵀX)⁻¹)
