DL21 - Lecture 1 - Intro
The course will include:
• 12 lectures
• 12 seminars
• 4 homeworks
• 1 project
• 3.5 guest lectures
2006
0.989^12 ≈ 0.875
2014
Natural feature mapping:
• Highly non-smooth w.r.t. jitter
• Requires lots of training samples
[Figure: feature values f(x1) and f(x2) for two jittered inputs]
Haar features for face detection
Viola-Jones features:
• Smooth w.r.t. jitter
• Fewer training examples needed
• (also fast to compute)
[Figure: Haar feature responses f(x1) and f(x2); rectangle corners A, B, C, D]
[Diagram: Haar feature extractor → linear classifier + thresholding → per-window “face” vs. “background” decisions]
[Diagram: a chain of four parametric layers, x0 → f1(·; w1) → x1 → f2(·; w2) → x2 → f3(·; w3) → x3 → f4(·; w4) → z]
z = f4( f3( f2( f1(x0; w1); w2); w3); w4)
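A minimal sketch of this forward pass in Python, assuming toy scalar layers (the concrete f's below are hypothetical placeholders, not the lecture's layers):

# Forward pass as plain function composition.
# Toy scalar layers; real layers would be convolutions, matmuls, etc.
def f1(x, w): return w * x            # scale
def f2(x, w): return x + w            # shift
def f3(x, w): return max(0.0, w * x)  # scaled ReLU-like step
def f4(x, w): return w * x            # scale

def forward(x0, w1, w2, w3, w4):
    x1 = f1(x0, w1)
    x2 = f2(x1, w2)
    x3 = f3(x2, w3)
    return f4(x3, w4)                 # z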
• We can implement derivatives of elementary functions
• But how do we implement the derivative of the whole composition, e.g. ∂z/∂w1?
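The missing step is the chain rule: differentiating the composition above layer by layer gives

∂z/∂w1 = (∂f4/∂x3) · (∂f3/∂x2) · (∂f2/∂x1) · (∂f1/∂w1)

so each layer only needs its own local derivatives; everything else arrives as an incoming gradient.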
General approach: layer abstraction
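A minimal sketch of the abstraction, assuming each layer caches whatever its backward pass needs (the names and the toy affine layer are illustrative, not from the slides):

# Every layer implements forward() and backward(); chaining backward()
# calls in reverse order implements the chain rule automatically.
class AffineLayer:
    def __init__(self, w):
        self.w = w

    def forward(self, x):
        self.x = x                        # cache input for backward
        return self.w * x

    def backward(self, grad_out):         # grad_out = ∂z/∂(this layer's output)
        self.grad_w = grad_out * self.x   # ∂z/∂w for this layer
        return grad_out * self.w          # ∂z/∂x, sent to the previous layer

layers = [AffineLayer(w) for w in (0.5, 2.0, -1.0, 0.1)]

x = 3.0                                   # forward: x0 -> x1 -> x2 -> x3 -> z
for layer in layers:
    x = layer.forward(x)

grad = 1.0                                # backward: start from ∂z/∂z = 1
for layer in reversed(layers):
    grad = layer.backward(grad)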
Typical use case:
• Two related tasks
• Limited labeled data for the main task
• Lots of labeled data for the auxiliary task (see the sketch under “Learning intermediate representations” below)
Backward propagation:
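For the chain above, this amounts to two local rules per layer (a short derivation; write x4 ≡ z, so ∂z/∂x4 = 1):

∂z/∂xk-1 = (∂z/∂xk) · (∂fk/∂xk-1)
∂z/∂wk = (∂z/∂xk) · (∂fk/∂wk)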
McCulloch-Pitts model:
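i.e. a thresholded weighted sum of binary inputs:

y = 1 if Σi wi·xi ≥ θ, otherwise y = 0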
The reality is much messier
Operations:
• generalized convolutions
• pooling (image resizing)
• elementwise non-linearities
• matrix multiplication
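A minimal sketch of these four operations, assuming PyTorch (the framework and all shapes are illustrative choices, not specified on the slide):

import torch
import torch.nn.functional as F

x = torch.randn(1, 3, 32, 32)                # one 3-channel 32x32 image

weight = torch.randn(8, 3, 3, 3)             # 8 filters of size 3x3
conv = F.conv2d(x, weight, padding=1)        # generalized convolution -> (1, 8, 32, 32)

pooled = F.max_pool2d(conv, kernel_size=2)   # pooling / resizing -> (1, 8, 16, 16)

activated = F.relu(pooled)                   # elementwise non-linearity

flat = activated.reshape(1, -1)              # (1, 8*16*16)
w = torch.randn(8 * 16 * 16, 10)
logits = flat @ w                            # matrix multiplication -> (1, 10)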
Representations
[Diagram: learn representations on a task where we have a lot of data, then transfer them to the final problem]
Learning intermediate representations
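A minimal sketch of this transfer recipe, again assuming PyTorch (the architecture and the two-phase split are illustrative assumptions, not the slide's prescription):

import torch
import torch.nn as nn

# Pretrain a feature extractor on the data-rich auxiliary task,
# then reuse the learned representation for the final problem.
features = nn.Sequential(            # shared intermediate representation
    nn.Linear(784, 256), nn.ReLU(),
    nn.Linear(256, 64), nn.ReLU(),
)
aux_head = nn.Linear(64, 100)        # auxiliary task: lots of labels
main_head = nn.Linear(64, 10)        # main task: few labels

# Phase 1: train features + aux_head on the auxiliary task (not shown).
# Phase 2: freeze the representation, train only the small main head.
for p in features.parameters():
    p.requires_grad = False

optimizer = torch.optim.SGD(main_head.parameters(), lr=1e-2)

x = torch.randn(32, 784)             # a batch from the main task
y = torch.randint(0, 10, (32,))
loss = nn.functional.cross_entropy(main_head(features(x)), y)
loss.backward()
optimizer.step()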
Test: [diagram: the outputs of several networks are averaged]
• Matrix multiplication
• Softmax
• ReLU
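For reference, the two non-linear blocks in this list are

softmax(z)i = exp(zi) / Σj exp(zj)
ReLU(z) = max(0, z)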