Gradient Descent Algorithm
Gradient Descent
Gradient Descent is just like Agile Methodology
- Build something quickly
- Make changes depending upon the feedback
Algorithm:
- initialize θ's randomly
- keep changing θ's to reduce J(θ) until we hopefully end up at a minimum
Gradient Descent
Let's have some function J(θ)
Algorithm:
- initialize θ's randomly
- repeat until convergence {
      θi := θi - α · ∂/∂θi J(θ)
  }
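The update rule above can be sketched in Python roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the name `gradient_descent`, the fixed iteration count `num_steps` standing in for "until convergence", the fixed (rather than random) starting point, and the example cost J(θ) = θ0² + θ1² are all introduced here for illustration only.

```python
def gradient_descent(grad, theta, alpha=0.1, num_steps=100):
    """Apply θi := θi - α · ∂J/∂θi to every θi for a fixed number of steps.

    grad(theta) must return the list of partial derivatives ∂J/∂θi.
    """
    theta = list(theta)
    for _ in range(num_steps):
        g = grad(theta)  # all partial derivatives at the current θ
        theta = [t - alpha * gi for t, gi in zip(theta, g)]
    return theta


# Hypothetical cost J(θ) = θ0² + θ1², whose gradient is (2·θ0, 2·θ1).
def example_grad(theta):
    return [2 * t for t in theta]


result = gradient_descent(example_grad, [4.0, -2.0])
print(result)  # both values end up very close to 0, the minimum of this J
```

All partial derivatives are computed before any θi is changed, so every parameter is updated from the same old values.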
Gradient Descent
Algorithm:
- initialize θ1 randomly
- keep changing θ1 to reduce J(θ1) until we hopefully end up at a minimum
Gradient Descent
Let's have some function J(θ1)
Algorithm:
- initialize θ1 randomly
- repeat until convergence {
      θ1 := θ1 - α · ∂/∂θ1 J(θ1)
  }
Gradient Descent
J(θ1) = (θ1 - 3)² + 5
∂/∂θ1 J(θ1) = 2(θ1 - 3)
Update rule: θ1 := θ1 - α · ∂/∂θ1 J(θ1), with α = 0.1
Two cases: if θ1 = 10, and if θ1 = -5

 θ1    J(θ1)
 -6     86
 -5     69
 -4     54
 -3     41
 -2     30
 -1     21
  0     14
  1      9
  2      6
  3      5
  4      6
  5      9
  6     14
  7     21
  8     30
  9     41
 10     54
 11     69
 12     86
 13    105
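To make the example concrete, the first few updates can be traced in code. The sketch below uses the slide's J(θ1), its derivative 2(θ1 - 3), and α = 0.1; the helper name `trace` and the choice of five steps are assumptions made for illustration. The first update moves θ1 = 10 to 10 - 0.1 · 14 = 8.6, and θ1 = -5 to -5 - 0.1 · (-16) = -3.4, so both starting points move toward θ1 = 3.

```python
def J(theta1):
    return (theta1 - 3) ** 2 + 5


def dJ(theta1):
    return 2 * (theta1 - 3)


alpha = 0.1


def trace(theta1, steps=5):
    """Return the list of θ1 values visited, starting value included."""
    history = [theta1]
    for _ in range(steps):
        theta1 = theta1 - alpha * dJ(theta1)
        history.append(theta1)
    return history


for start in (10, -5):
    for step, value in enumerate(trace(start)):
        print(f"start={start:>3}  step={step}  theta1={value:.4f}  J={J(value):.4f}")
```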
Gradient Descent
Q&A
Impact of learning rate in Gradient Descent
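One way to see the effect of the learning rate is to rerun the same update with different values of α. The sketch below reuses the deck's example J(θ1) = (θ1 - 3)² + 5; the function name `run`, the starting point θ1 = 10, the 20-step budget, and the particular α values are assumptions chosen for illustration. For this specific J, each update multiplies the distance to the minimum, θ1 - 3, by (1 - 2α), so a small α converges slowly, α = 0.5 lands on the minimum in one step, α = 1.0 bounces between 10 and -4 forever, and α above 1 moves further away on every step.

```python
def run(alpha, theta1=10.0, steps=20):
    """Return θ1 after `steps` updates of θ1 := θ1 - α · 2(θ1 - 3)."""
    for _ in range(steps):
        theta1 = theta1 - alpha * 2 * (theta1 - 3)
    return theta1


for alpha in (0.01, 0.1, 0.5, 1.0, 1.1):
    print(f"alpha={alpha:<4}  theta1 after 20 steps = {run(alpha):.4f}")
```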
Q&A
How to implement Gradient Descent
J(θ1) = (θ1 - 3)² + 5
∂/∂θ1 J(θ1) = 2(θ1 - 3)

Algorithm:
- initialize θ's randomly
- repeat until convergence {
      θ1 := θ1 - α · ∂/∂θ1 J(θ1)
  }

With the derivative substituted, for initialization θ1 = 10 and for initialization θ1 = -5:
Repeat until convergence {
      θ1 := θ1 - α · 2(θ1 - 3)
}

 θ1    J(θ1)
 -6     86
 -5     69
 -4     54
 -3     41
 -2     30
 -1     21
  0     14
  1      9
  2      6
  3      5
  4      6
  5      9
  6     14
  7     21
  8     30
  9     41
 10     54
 11     69
 12     86
 13    105
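The loop above might be implemented along these lines. The slides do not say how convergence is detected, so the stopping test used here (the size of the update falling below a tolerance `tol`), the safety cap `max_iters`, and the function name `minimize` are assumptions of this sketch rather than part of the original algorithm.

```python
def minimize(theta1, alpha=0.1, tol=1e-8, max_iters=10_000):
    """Repeat θ1 := θ1 - α · 2(θ1 - 3) until the update is smaller than tol."""
    for iteration in range(max_iters):
        step = alpha * 2 * (theta1 - 3)
        theta1 = theta1 - step
        if abs(step) < tol:  # assumed convergence test, not from the slides
            return theta1, iteration + 1
    return theta1, max_iters


for start in (10, -5):
    theta1, iters = minimize(start)
    print(f"start={start:>3}: theta1={theta1:.6f} after {iters} iterations")
```

Both initializations should end up close to θ1 = 3, where J(θ1) takes its smallest tabulated value of 5.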
How to implement Gradient Descent
Goal: minimize J(θ0,θ1) over θ0, θ1
Algorithm:
- initialize θ's randomly
- repeat until convergence {
      θi := θi - α · ∂/∂θi J(θ0,θ1)    (for i = 0 and i = 1)
  }
How to implement Gradient Descent
Cost function: J(θ0,θ1)
Goal: minimize J(θ0,θ1) over θ0, θ1
Algorithm:
- initialize θ's randomly
- repeat until convergence {
      θi := θi - α · ∂/∂θi J(θ0,θ1)    (for i = 0 and i = 1)
  }

Correct: Simultaneous Update
      temp0 := θ0 - α · ∂/∂θ0 J(θ0,θ1)
      temp1 := θ1 - α · ∂/∂θ1 J(θ0,θ1)
      θ0 := temp0
      θ1 := temp1

Incorrect (temp1 is computed with the already-updated θ0)
      temp0 := θ0 - α · ∂/∂θ0 J(θ0,θ1)
      θ0 := temp0
      temp1 := θ1 - α · ∂/∂θ1 J(θ0,θ1)
      θ1 := temp1
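The difference between the two orderings shows up after a single step. The sketch below assumes a hypothetical cost J(θ0,θ1) = θ0² + θ0·θ1 + θ1², chosen only because each partial derivative depends on both parameters; the function names, the starting point (1, 1), and α = 0.1 are likewise illustrative assumptions, not taken from the slides.

```python
# Partial derivatives of the hypothetical cost J = θ0² + θ0·θ1 + θ1².
def dJ_dtheta0(theta0, theta1):
    return 2 * theta0 + theta1


def dJ_dtheta1(theta0, theta1):
    return theta0 + 2 * theta1


alpha = 0.1


def simultaneous_step(theta0, theta1):
    """Correct: both temps are computed from the old θ0 and θ1."""
    temp0 = theta0 - alpha * dJ_dtheta0(theta0, theta1)
    temp1 = theta1 - alpha * dJ_dtheta1(theta0, theta1)
    return temp0, temp1


def sequential_step(theta0, theta1):
    """Incorrect: θ0 is overwritten before temp1 is computed."""
    temp0 = theta0 - alpha * dJ_dtheta0(theta0, theta1)
    theta0 = temp0
    temp1 = theta1 - alpha * dJ_dtheta1(theta0, theta1)  # sees the new θ0
    theta1 = temp1
    return theta0, theta1


print(simultaneous_step(1.0, 1.0))  # roughly (0.7, 0.7)
print(sequential_step(1.0, 1.0))    # roughly (0.7, 0.73)
```

From (1, 1) both partial derivatives equal 3, so the simultaneous update moves both parameters by the same amount. In the sequential version, θ1 is updated with the derivative evaluated at θ0 = 0.7, which is 2.7 instead of 3, so it takes a different step.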
Q&A