
19EEE362 Deep Learning for Visual Computing
3-0-0 (3)
Dr. Lekshmi R. R., Asst. Prof.
Department of Electrical & Electronics Engineering
Amrita School of Engineering
Residual Network (ResNet)
Plain Network

a^{[l]} → Linear → ReLU → a^{[l+1]} → Linear → ReLU → a^{[l+2]}

z^{[l+1]} = W^{[l+1]} a^{[l]} + b^{[l+1]}
a^{[l+1]} = g(z^{[l+1]})
z^{[l+2]} = W^{[l+2]} a^{[l+1]} + b^{[l+2]}
a^{[l+2]} = g(z^{[l+2]})
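A minimal sketch of this plain forward pass, assuming fully connected layers and NumPy (the function and variable names are illustrative, not from the slide):

import numpy as np

def relu(z):
    # g(z) = max(0, z), the activation used on the slides
    return np.maximum(0.0, z)

def plain_block(a_l, W1, b1, W2, b2):
    # Two stacked Linear -> ReLU layers, no shortcut.
    z1 = W1 @ a_l + b1   # z^{[l+1]} = W^{[l+1]} a^{[l]} + b^{[l+1]}
    a1 = relu(z1)        # a^{[l+1]} = g(z^{[l+1]})
    z2 = W2 @ a1 + b2    # z^{[l+2]} = W^{[l+2]} a^{[l+1]} + b^{[l+2]}
    return relu(z2)      # a^{[l+2]} = g(z^{[l+2]})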
Residual Network

A shortcut skips a^{[l]} over the two layers, passing its information deeper into the network:

Main path: a^{[l]} → Linear → ReLU → a^{[l+1]} → Linear → ReLU → a^{[l+2]}
Shortcut:  a^{[l]} is added to z^{[l+2]} just before the final ReLU.

z^{[l+1]} = W^{[l+1]} a^{[l]} + b^{[l+1]}
a^{[l+1]} = g(z^{[l+1]})
z^{[l+2]} = W^{[l+2]} a^{[l+1]} + b^{[l+2]}
a^{[l+2]} = g(z^{[l+2]} + a^{[l]})
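The same sketch with the shortcut added; the only change from the plain block is the a_l term inside the final activation. This assumes a^{[l]} and z^{[l+2]} have the same dimension so they can be added (real ResNets apply a projection on the shortcut when the dimensions differ):

import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def residual_block(a_l, W1, b1, W2, b2):
    z1 = W1 @ a_l + b1     # main path, first Linear
    a1 = relu(z1)
    z2 = W2 @ a1 + b2      # main path, second Linear
    return relu(z2 + a_l)  # shortcut: a^{[l+2]} = g(z^{[l+2]} + a^{[l]})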
Plain to Residual Network

The residual structure helps mitigate the vanishing gradient problem and lets much deeper networks train better.


[Figure: training error vs. # layers, two panels. Plain network: in theory training error keeps falling as layers are added, but in reality it rises again beyond a point. Residual network: training error keeps decreasing with depth.]
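One standard way to see why the residual structure trains better (a worked step added here, not on the slide): the two extra layers can easily learn the identity. If regularization drives W^{[l+2]} and b^{[l+2]} toward zero, then

a^{[l+2]} = g(z^{[l+2]} + a^{[l]}) = g(a^{[l]}) = a^{[l]},

since a^{[l]} is itself a ReLU output and hence non-negative. Adding residual blocks therefore cannot hurt training error, whereas a plain network must learn the identity through its weights, which is hard, so its training error can rise with depth.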
Thank you
