Professional Documents
Culture Documents
DSE 2020-21 1st Sem DL Problem Solving
DSE 2020-21 1st Sem DL Problem Solving
𝑧12
𝑦11 𝑦21
𝑧11 𝑧21
𝑦10 𝑦20
𝑧11 = 0.25*1+0.75*1=1.0
𝑧21 = 0.75*1+0.25*1=1.0
𝑧12
𝑦11 = ReLU(𝑧11 )= 1 𝑦21 = ReLU(𝑧21 )= 1 𝑦11 𝑦21
𝑧12 = 0.67*1+0.33*1=1.0 𝑧11 𝑧21
𝑦12 = ReLU(𝑧12 )= 1.0
𝑦10 𝑦20
1
𝑑𝑖𝑣 = 𝑑 − 𝑦 2
2
𝛿𝑑𝑖𝑣
= y − d = 1.0
𝛿𝑦
𝛿𝑑𝑖𝑣 𝛿𝑑𝑖𝑣 𝛿𝑦12
= * = 1*1=1 𝛿𝑑𝑖𝑣 𝛿𝐷𝑖𝑣 𝛿𝑧12
𝛿𝑧12 𝛿𝑦12 𝛿𝑧12 = 𝛿𝑧 2 *𝛿𝑦 1 =1*w31
𝛿𝑦11 1 1
𝛿𝑑𝑖𝑣 𝛿𝐷𝑖𝑣 𝛿𝑧12 𝛿𝑑𝑖𝑣 𝛿𝑑𝑖𝑣 𝛿𝑦11
= * 𝛿𝑤 = 1 ∗ 𝑦11 =1 = ∗ 𝛿𝑧 1 =0.67*1=0.67
𝛿𝑤31 𝛿𝑧12 31
𝛿𝑧11 𝛿𝑦11 1
𝛿𝑑𝑖𝑣 𝛿𝑑𝑖𝑣 𝛿𝑑𝑖𝑣 𝛿𝑧11 0
𝑤31 = 𝑤31 − 𝜂*𝛿𝑤 =0.67-0.1=0.57 = 1 *
𝛿𝑤12 𝛿𝑧1 𝛿𝑤12
=0.67*𝑦2 =0.67
31
𝛿𝑑𝑖𝑣
𝑤12 = 𝑤12 − 𝜂∗ =0.75−.067=0.683
𝛿𝑤12
Weight Updates with ReLU Hidden, sigmoid output
and binary cross-entropy 𝑦 = 𝑦12
𝑧11 = 0.25*1+0.75*1=1.0
𝑧21 = 0.75*1+0.25*1=1.0
𝑧12
𝑦11 = ReLU(𝑧11 )= 1 𝑦21 = ReLU(𝑧21 )= 1 𝑦11 𝑦21
𝑧12 = 0.67*1+0.33*1=1.0 𝑧11 𝑧21
𝑦12 = sigmoid(𝑧12 )=e/(1+e)
𝑦10 𝑦20
div = -d*log(y)-(1-d)*log(1-y)
𝛿𝑑𝑖𝑣
=-d/y+(1-d)/(1-y)=1/(1-y)= (1+e)
𝛿𝑦
(k ) (k )
(k ) (k ) i ij j
i j j
h1 h2
(k ) (k )
i j
Softmax
(k ) (k ) ij
i j j Output Layer
h1 h2
h1 h2
𝛿𝑦1 1 1 𝑒
= ∗ 1− =
𝛿𝑧1 1 + 𝑒 −1 1 + 𝑒 −1 1+𝑒 2
𝛿𝑑𝑖𝑣
𝑤11 𝑡 + 1 = 𝑤11 𝑡 −𝜂
𝛿𝑤11
𝛿𝑑𝑖𝑣 𝛿𝑦1 𝛿𝑧1
= 𝑤11 𝑡 − 0.1 ∗
𝛿𝑦1 𝛿𝑧1 𝛿𝑤11
𝑒
= 𝑤11 𝑡 − 0.1 ∗ 0 ∗ ∗ ℎ1
1+𝑒 2
= 1.0
Example Problems
L1 and L2 Regularization
𝑙2 regularization
𝛼
max 𝜃𝑖∗ − ,0 if 𝜃𝑖∗ ≥ 0
𝐻𝑖𝑖
(𝜃𝑅∗ )𝑖 ≈
∗
𝛼
min 𝜃𝑖 + ,0 if 𝜃𝑖∗ < 0
𝐻𝑖𝑖
L1 Regularization
• Loss 𝑥, 𝑦 = 3𝑥 2 − 6𝑥 + 3 + 𝑦 2
• What is the value of (xopt , yopt) that minimizes the loss, if L1
regularization constant 0.1 is applied?
𝛼
max 𝜃𝑖∗ − ,0 if 𝜃𝑖∗ ≥ 0
𝐻𝑖𝑖
(𝜃𝑅∗ )𝑖 ≈
𝛼
min 𝜃𝑖∗ + ,0 if 𝜃𝑖∗ < 0
𝐻𝑖𝑖
ɳx,opt = 1/6
ɳy,opt = 1/4
ɳz,opt = 1/8
• 1,opt 2,opt
• 2,opt
• 2,opt
• 2,opt
• 2,opt
• 2,opt
Minimization of Quadratic Error Function
Weight Updates – Ordinary Gradient Descent
Weight Updates – Ordinary Gradient Descent
Weight Updates – Momentum Method
w2(t+1)= 2.058+0.9*(2.0-1.0)=2.958
Weight Updates – RProp
At time t-1,
dE/dw1 =0.5*(1-3)-(1-4)/6=-0.5
dE/w2=2/9*(1-4)-(1- 3)/6=-0.333
At time t,
dE/dw1 =0.5*(1.5-3)-(2.0-4)/6=0.4167
dE/w2=2/9*(2-4)-(1.5- 3)/6=-0.194