05 Multivariate Regression IV
[Figure: Housing prices in Portland, OR — price (in 1000s of dollars) plotted against size (feet²)]
Supervised Learning: the "right answer" is given for each example in the data.
Regression Problem: predict a real-valued output.
Dr. Hashim Yasin Applied Machine Learning (CS4104)
Regression Example
Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$
$\theta_j$'s: parameters
How do we choose the $\theta_j$'s?
Linear Regression with one Variable

Simplified:
Hypothesis: $h_\theta(x) = \theta_1 x$
Parameter: $\theta_1$
Cost function: $J(\theta_1) = \dfrac{1}{2m}\displaystyle\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$
Goal: $\min_{\theta_1} J(\theta_1)$
Setting the derivative of the cost to zero gives a closed form for $\theta_1$:

\[
\frac{\partial J(\theta_1)}{\partial \theta_1}
= \frac{\partial}{\partial \theta_1}\,\frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2
= \frac{\partial}{\partial \theta_1}\,\frac{1}{2m}\sum_{i=1}^{m}\left(\theta_1 x^{(i)} - y^{(i)}\right)^2
\]
\[
= 2 \cdot \frac{1}{2m}\sum_{i=1}^{m}\left(\theta_1 x^{(i)} - y^{(i)}\right)\frac{\partial}{\partial \theta_1}\left(\theta_1 x^{(i)} - y^{(i)}\right)
= \frac{1}{m}\sum_{i=1}^{m}\left(\theta_1 x^{(i)} - y^{(i)}\right) x^{(i)} = 0
\]

Hence

\[
\theta_1 \sum_{i=1}^{m} \left(x^{(i)}\right)^2 = \sum_{i=1}^{m} x^{(i)} y^{(i)}
\qquad\Longrightarrow\qquad
\theta_1 = \frac{\sum_{i=1}^{m} x^{(i)} y^{(i)}}{\sum_{i=1}^{m} \left(x^{(i)}\right)^2}
= \frac{\mathrm{covar}(X, Y)}{\mathrm{var}(X)}
\quad \text{if } \mathrm{mean}(X) = \mathrm{mean}(Y) = 0
\]
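As a quick numerical sanity check, this closed form is easy to evaluate directly. The data below is illustrative (not from the lecture):

```python
import numpy as np

# Illustrative data (not from the lecture): y is roughly 3 * x
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 5.9, 9.2, 11.8])

# Closed form for the single-parameter model h(x) = theta1 * x:
# theta1 = sum(x * y) / sum(x ** 2)
theta1 = np.sum(x * y) / np.sum(x ** 2)
print(theta1)  # approximately 2.99
```

Because this simplified model has no intercept term, the fitted line is forced through the origin.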
Hypothesis: $h_\theta(x) = \theta_0 + \theta_1 x$
Parameters: $\theta_0, \theta_1$ (with the convention $x_0 = 1$)
Cost function: $J(\theta_0, \theta_1) = \dfrac{1}{2m}\displaystyle\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2$
Gradient descent:
Repeat {
  $\theta_j := \theta_j - \alpha \dfrac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)$
} (simultaneously update $j = 0$ and $j = 1$)
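The cost function above translates directly into code. This is a minimal sketch; the variable names and data are illustrative, not from the lecture:

```python
import numpy as np

def cost(theta, X, y):
    """Squared-error cost J(theta) = (1/2m) * sum((X @ theta - y)^2).

    X is the m x (n+1) design matrix whose leading column of ones
    plays the role of x0 = 1.
    """
    m = len(y)
    residuals = X @ theta - y
    return residuals @ residuals / (2 * m)

# Sanity check: if theta reproduces y exactly, the cost is zero
X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([2.0, 3.0, 4.0])            # exactly y = 1 + x
print(cost(np.array([1.0, 1.0]), X, y))  # 0.0
```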
Two parameters $(\theta_0, \theta_1)$:

\[
\frac{\partial J(\theta_0)}{\partial \theta_0}
= \frac{\partial}{\partial \theta_0}\,\frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2
= \frac{\partial}{\partial \theta_0}\,\frac{1}{2m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right)^2
\]
\[
= 2 \cdot \frac{1}{2m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right)\frac{\partial}{\partial \theta_0}\left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right)
= \frac{1}{m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right) = 0
\]

With the sample means $\bar{x} = \frac{1}{m}\sum_{i=1}^{m} x^{(i)}$ and $\bar{y} = \frac{1}{m}\sum_{i=1}^{m} y^{(i)}$, this becomes

\[
\theta_0 + \theta_1 \cdot \frac{1}{m}\sum_{i=1}^{m} x^{(i)} = \frac{1}{m}\sum_{i=1}^{m} y^{(i)}
\qquad\Longrightarrow\qquad
\theta_0 = \bar{y} - \theta_1 \bar{x}
\]
Two parameters $(\theta_0, \theta_1)$:

\[
\frac{\partial J(\theta_1)}{\partial \theta_1}
= \frac{\partial}{\partial \theta_1}\,\frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2
= \frac{\partial}{\partial \theta_1}\,\frac{1}{2m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right)^2
\]
\[
= 2 \cdot \frac{1}{2m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right)\frac{\partial}{\partial \theta_1}\left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right)
= \frac{1}{m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right) x^{(i)} = 0
\]
Two parameters $(\theta_0, \theta_1)$:

Substituting $\theta_0 = \bar{y} - \theta_1 \bar{x}$ into

\[
\frac{\partial J(\theta_1)}{\partial \theta_1} = \frac{1}{m}\sum_{i=1}^{m}\left(\theta_0 + \theta_1 x^{(i)} - y^{(i)}\right) x^{(i)} = 0
\]
\[
\theta_0 \cdot \frac{1}{m}\sum_{i=1}^{m} x^{(i)} + \theta_1 \cdot \frac{1}{m}\sum_{i=1}^{m}\left(x^{(i)}\right)^2 = \frac{1}{m}\sum_{i=1}^{m} x^{(i)} y^{(i)}
\]
\[
\theta_1 \sum_{i=1}^{m}\left(x^{(i)}\right)^2 + \bar{y}\sum_{i=1}^{m} x^{(i)} - \theta_1 \bar{x}\sum_{i=1}^{m} x^{(i)} = \sum_{i=1}^{m} x^{(i)} y^{(i)}
\]
\[
\theta_1 \sum_{i=1}^{m} x^{(i)}\left(x^{(i)} - \bar{x}\right) = \sum_{i=1}^{m} x^{(i)}\left(y^{(i)} - \bar{y}\right)
\]
Multivariate Regression
Two parameters $(\theta_0, \theta_1)$:

Starting from

\[
\theta_1 \sum_{i=1}^{m} x^{(i)}\left(x^{(i)} - \bar{x}\right) = \sum_{i=1}^{m} x^{(i)}\left(y^{(i)} - \bar{y}\right),
\qquad
\bar{x} = \frac{1}{m}\sum_{i=1}^{m} x^{(i)}, \quad \bar{y} = \frac{1}{m}\sum_{i=1}^{m} y^{(i)}
\]

we solve for $\theta_1$:

\[
\theta_1
= \frac{\sum_{i=1}^{m} x^{(i)}\left(y^{(i)} - \bar{y}\right)}{\sum_{i=1}^{m} x^{(i)}\left(x^{(i)} - \bar{x}\right)}
= \frac{\sum_{i=1}^{m} x^{(i)} y^{(i)} - \bar{y}\sum_{i=1}^{m} x^{(i)}}{\sum_{i=1}^{m} \left(x^{(i)}\right)^2 - \bar{x}\sum_{i=1}^{m} x^{(i)}}
= \frac{\sum_{i=1}^{m} x^{(i)} y^{(i)} - m\,\bar{x}\bar{y}}{\sum_{i=1}^{m} \left(x^{(i)}\right)^2 - m\,\bar{x}^2}
\]
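These two closed-form expressions are easy to check numerically. The data below is illustrative (not from the lecture):

```python
import numpy as np

# Illustrative data (not from the lecture): y is roughly 2 * x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.2, 3.9, 6.1, 8.0, 9.8])
m = len(x)

x_bar, y_bar = x.mean(), y.mean()

# theta1 = (sum(x*y) - m*x_bar*y_bar) / (sum(x^2) - m*x_bar^2)
theta1 = (np.sum(x * y) - m * x_bar * y_bar) / (np.sum(x ** 2) - m * x_bar ** 2)
# theta0 = y_bar - theta1 * x_bar
theta0 = y_bar - theta1 * x_bar
print(theta0, theta1)  # approximately 0.21 and 1.93
```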
Two parameters $(\theta_0, \theta_1)$:

\[
\theta_0 = \bar{y} - \theta_1 \bar{x},
\qquad
\theta_1 = \frac{\sum_{i=1}^{m} x^{(i)} y^{(i)} - m\,\bar{x}\bar{y}}{\sum_{i=1}^{m} \left(x^{(i)}\right)^2 - m\,\bar{x}^2},
\qquad
\bar{x} = \frac{1}{m}\sum_{i=1}^{m} x^{(i)}, \quad \bar{y} = \frac{1}{m}\sum_{i=1}^{m} y^{(i)}
\]

These equations can be summarized by the following matrix equation (also known as the normal equation):

\[
\begin{pmatrix}
m & \sum_{i=1}^{m} x^{(i)} \\
\sum_{i=1}^{m} x^{(i)} & \sum_{i=1}^{m} \left(x^{(i)}\right)^2
\end{pmatrix}
\begin{pmatrix} \theta_0 \\ \theta_1 \end{pmatrix}
=
\begin{pmatrix}
\sum_{i=1}^{m} y^{(i)} \\
\sum_{i=1}^{m} x^{(i)} y^{(i)}
\end{pmatrix}
\]
Two parameters $(\theta_0, \theta_1)$:

\[
\begin{pmatrix}
m & \sum_{i=1}^{m} x^{(i)} \\
\sum_{i=1}^{m} x^{(i)} & \sum_{i=1}^{m} \left(x^{(i)}\right)^2
\end{pmatrix}
\begin{pmatrix} \theta_0 \\ \theta_1 \end{pmatrix}
=
\begin{pmatrix}
\sum_{i=1}^{m} y^{(i)} \\
\sum_{i=1}^{m} x^{(i)} y^{(i)}
\end{pmatrix}
\]

This equation can be written in a more compact form. Suppose $\mathbf{X} = (\mathbf{1}\;\; \mathbf{x})$, where $\mathbf{1} = (1, 1, 1, \ldots)^T$ and $\mathbf{x} = (x^{(1)}, x^{(2)}, \ldots, x^{(m)})^T$. Then

\[
\mathbf{X}^T \mathbf{X}
= \begin{pmatrix} \mathbf{1}^T \mathbf{1} & \mathbf{1}^T \mathbf{x} \\ \mathbf{x}^T \mathbf{1} & \mathbf{x}^T \mathbf{x} \end{pmatrix}
= \begin{pmatrix} m & \sum_{i=1}^{m} x^{(i)} \\ \sum_{i=1}^{m} x^{(i)} & \sum_{i=1}^{m} \left(x^{(i)}\right)^2 \end{pmatrix}
\]
\[
\mathbf{X}^T \mathbf{y} = (\mathbf{1}\;\; \mathbf{x})^T \mathbf{y}
= \begin{pmatrix} \mathbf{1}^T \mathbf{y} \\ \mathbf{x}^T \mathbf{y} \end{pmatrix}
= \begin{pmatrix} \sum_{i=1}^{m} y^{(i)} \\ \sum_{i=1}^{m} x^{(i)} y^{(i)} \end{pmatrix}
\]
\[
\mathbf{X}^T \mathbf{X}\,\boldsymbol{\theta} = \mathbf{X}^T \mathbf{y},
\qquad \text{where } \boldsymbol{\theta} = (\theta_0, \theta_1)^T
\]
\[
\boldsymbol{\theta} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}
\]

In general, with $n$ features, $\mathbf{X}$ is the $m \times (n+1)$ design matrix (the leading column of ones corresponds to $x_0 = 1$):

\[
\mathbf{X} =
\begin{pmatrix}
1 & x_{11} & x_{12} & \cdots & x_{1n} \\
1 & x_{21} & x_{22} & \cdots & x_{2n} \\
\cdots & \cdots & \cdots & \cdots & \cdots \\
1 & x_{m1} & x_{m2} & \cdots & x_{mn}
\end{pmatrix},
\qquad
\boldsymbol{\theta} = (\theta_0, \theta_1, \theta_2, \cdots, \theta_n)^T
\]
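A minimal sketch of the normal equation in NumPy; the data and names are illustrative. Solving the linear system $\mathbf{X}^T\mathbf{X}\,\boldsymbol{\theta} = \mathbf{X}^T\mathbf{y}$ with `np.linalg.solve` is preferred over forming the inverse explicitly:

```python
import numpy as np

# Illustrative design matrix: 4 examples, one feature, leading column of ones
X = np.array([[1.0, 1.0],
              [1.0, 2.0],
              [1.0, 3.0],
              [1.0, 4.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])   # exactly y = 1 + 2x

# Normal equation: solve (X^T X) theta = X^T y
theta = np.linalg.solve(X.T @ X, X.T @ y)
print(theta)  # [1. 2.]
```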
Examples:

    x0   Size (feet²)   Number of bedrooms   Number of floors   Age of home (years)   Price ($1000)
    1    2104           5                    1                  45                    460
    1    1416           3                    2                  40                    232
    1    1534           3                    2                  30                    315
    1    852            2                    1                  36                    178

where $\boldsymbol{\theta} = (\theta_0, \theta_1, \theta_2, \theta_3, \theta_4)^T$
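With only the four table rows as data, there are more parameters (5) than examples (4), so $\mathbf{X}^T\mathbf{X}$ is singular and the inverse in the normal equation does not exist. A sketch using `np.linalg.lstsq`, which handles this case via a pseudo-inverse:

```python
import numpy as np

# The four examples from the table; x0 = 1 is the first column
X = np.array([[1.0, 2104.0, 5.0, 1.0, 45.0],
              [1.0, 1416.0, 3.0, 2.0, 40.0],
              [1.0, 1534.0, 3.0, 2.0, 30.0],
              [1.0,  852.0, 2.0, 1.0, 36.0]])
y = np.array([460.0, 232.0, 315.0, 178.0])

# X^T X is 5x5 but only has rank 4 here, so it is not invertible;
# lstsq returns the minimum-norm least-squares solution instead.
theta, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)

# With 4 linearly independent rows and 5 parameters the fit is exact
predictions = X @ theta
```

In the usual case of more examples than parameters, $\mathbf{X}^T\mathbf{X}$ is typically invertible and `lstsq` agrees with the normal equation.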
Example 2
\[
\boldsymbol{\theta} = (\mathbf{X}^T \mathbf{X})^{-1} \mathbf{X}^T \mathbf{y}
=
\begin{pmatrix} \theta_0 \\ \theta_1 \\ \theta_2 \end{pmatrix}
=
\begin{pmatrix} -153.51 \\ 1.24 \\ 12.08 \end{pmatrix}
\]
Outline:
• Start with some initial $\theta_0, \theta_1$
• Keep changing $\theta_0, \theta_1$ to reduce $J(\theta_0, \theta_1)$, until we hopefully end up at a minimum
Gradient Descent
Repeat until convergence {

\[
\theta_j := \theta_j - \alpha \cdot \frac{\partial}{\partial \theta_j} J(\theta_0, \theta_1)
\qquad \text{for } j = 0 \text{ and } j = 1
\]

} — update $\theta_0$ and $\theta_1$ simultaneously:

\[
\text{temp}_0 := \theta_0 - \alpha \cdot \frac{\partial}{\partial \theta_0} J(\theta_0, \theta_1),
\qquad
\text{temp}_1 := \theta_1 - \alpha \cdot \frac{\partial}{\partial \theta_1} J(\theta_0, \theta_1),
\qquad
\theta_0 := \text{temp}_0, \quad \theta_1 := \text{temp}_1
\]
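The simultaneous update can be sketched as follows: both gradients are computed before either parameter is overwritten, which is exactly what "simultaneous" means here. The data and learning rate are illustrative, not from the lecture:

```python
import numpy as np

# Illustrative data (not from the lecture): y is roughly 1 + 2 * x
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 7.1, 8.9])
m = len(x)

theta0, theta1 = 0.0, 0.0   # initial guess
alpha = 0.05                # learning rate, chosen by hand for this data

for _ in range(5000):
    err = theta0 + theta1 * x - y
    # Evaluate BOTH partial derivatives before updating either
    # parameter -- this is the simultaneous update.
    grad0 = err.sum() / m          # dJ/dtheta0
    grad1 = (err * x).sum() / m    # dJ/dtheta1
    theta0 -= alpha * grad0
    theta1 -= alpha * grad1

print(theta0, theta1)  # converges to about 1.08, 1.98
```

For this small, well-conditioned problem the loop converges to the same $(\theta_0, \theta_1)$ as the closed form $\theta_0 = \bar{y} - \theta_1\bar{x}$ derived earlier.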