CS437 5317 EE414 L2 LinearRegression
Deep Learning
Murtaza Taj
murtaza.taj@lums.edu.pk
Lecture 2: Introduction
Wed 20th Jan 2021
Recap: Datasets over Algorithms
Perhaps the most important news of our day is that datasets—not algorithms—might be the key limiting factor to development of human-level artificial intelligence.
Alexander Wissner-Gross, 2016.
http://www.spacemachine.net/views/2016/3/datasets-over-algorithms
Transfer Learning
! Low-budget deep learning: less data and less compute power
[Figure: transfer learning: train on a large dataset, then freeze the early layers and retrain only the later ones on the new data]
Machine Learning
! Supervised learning
  ! Training data includes desired outputs
! Unsupervised learning
  ! Training data does not include desired outputs
! Reinforcement learning
  ! Rewards from a sequence of actions
• http://cloudcv.org/classify/
[Demo output: top predicted classes for an input image: scuba diver, tiger shark, hammerhead shark]
! Vision: hand-crafted features (SIFT/HoG, textons) [fixed] → your favorite classifier [learned] → "car"
! Speech: hand-crafted features (MFCC) [fixed] → your favorite classifier [learned] → \ˈdēp\
! NLP: "This burrito place is yummy and fun!" → hand-crafted features (Bag-of-words) [fixed] → your favorite classifier [learned] → "+"
Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
Hierarchical Compositionality
! Vision
! Speech: sample → spectral band → formant → motif → phone → word
! NLP: character → word → NP/VP/.. → clause → sentence → story
! Compose simple functions into a complicated function
(C) Dhruv Batra. Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
Building A Complicated Function
! Idea 2: Compositions
  • Deep Learning
  • Grammar models
  • Scattering transforms…
! Compose simple functions into a complicated function
(C) Dhruv Batra. Slide Credit: Marc'Aurelio Ranzato, Yann LeCun
Linear Combination
Composition
Deep Learning = Hierarchical Compositionality
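The contrast between a linear combination and a composition can be sketched in a few lines of Python (the particular building blocks below are illustrative choices of mine, not from the slides):

```python
def f(x):
    """A simple linear map."""
    return 2.0 * x + 1.0

def g(x):
    """A simple non-linearity (ReLU)."""
    return max(0.0, x)

def linear_combination(x):
    """A flat, one-level mix of the building blocks."""
    return 0.5 * f(x) + 0.5 * g(x)

def composition(x):
    """Stacking the blocks, as a deep network does: g(f(g(f(x))))."""
    return g(f(g(f(x))))

print(linear_combination(1.0))  # 2.0
print(composition(1.0))         # 7.0
```

Composing the same two blocks twice already produces a piecewise function that no single-level mix of f and g can express, which is the point of hierarchical compositionality.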
[Figure: a neuron: inputs x1, x2, x3, …, xj arrive at the dendrites, the cell body computes y = wᵀx with weights w, and the output travels along the axon to the synaptic terminal]
Linear Regression
Parameter Optimization: Least Squared Error Solutions
! Let us first consider the ‘simpler’ problem of fitting a line to a set of data points…
x y
1.3 5.7
2.4 7.3
3.4 10.5
4.6 11.8
5.3 13.9
6.6 16.3
6.4 15.3
8.0 17.9
8.9 20.8
9.2 20.9
! Error for whole data: E = \sum_i (t^{(i)} - y^{(i)})^2
! With the line model y^{(i)} = m x^{(i)} + c:
  E = \sum_i (t^{(i)} - m x^{(i)} - c)^2
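The error function is easy to evaluate directly; a minimal NumPy sketch (the data arrays are taken from the table above; the function and variable names are mine):

```python
import numpy as np

x = np.array([1.3, 2.4, 3.4, 4.6, 5.3, 6.6, 6.4, 8.0, 8.9, 9.2])
t = np.array([5.7, 7.3, 10.5, 11.8, 13.9, 16.3, 15.3, 17.9, 20.8, 20.9])

def error(m, c):
    """E = sum_i (t_i - m*x_i - c)^2, the sum of squared residuals."""
    return float(np.sum((t - m * x - c) ** 2))

# A line close to the data gives a much smaller error than a poor one
print(error(2.0, 3.0))  # ≈ 2.72
print(error(0.0, 0.0))  # ≈ 2218.12
```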
! Step 3: Differentiate the error w.r.t. the parameters, set the derivatives equal to zero, and solve for the minimum
! For the data above:

E = \sum_i (t^{(i)} - m x^{(i)} - c)^2

\frac{\partial E}{\partial m} = -\sum_i (t^{(i)} - m x^{(i)} - c) x^{(i)}

\frac{\partial E}{\partial c} = -\sum_i (t^{(i)} - m x^{(i)} - c)

! Setting both derivatives to zero gives the normal equations:

\begin{bmatrix} \sum_i (x^{(i)})^2 & \sum_i x^{(i)} \\ \sum_i x^{(i)} & \sum_i 1 \end{bmatrix}
\begin{bmatrix} m \\ c \end{bmatrix} =
\begin{bmatrix} \sum_i x^{(i)} t^{(i)} \\ \sum_i t^{(i)} \end{bmatrix}

Ax = b

\begin{bmatrix} 380.63 & 56.1 \\ 56.1 & 10 \end{bmatrix}
\begin{bmatrix} m \\ c \end{bmatrix} =
\begin{bmatrix} 914.68 \\ 140.4 \end{bmatrix}
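The 2×2 system can be built and solved directly; a minimal NumPy sketch (array names are mine, the numbers come from the data table):

```python
import numpy as np

x = np.array([1.3, 2.4, 3.4, 4.6, 5.3, 6.6, 6.4, 8.0, 8.9, 9.2])
t = np.array([5.7, 7.3, 10.5, 11.8, 13.9, 16.3, 15.3, 17.9, 20.8, 20.9])

# Build A and b exactly as on the slide
A = np.array([[np.sum(x ** 2), np.sum(x)],
              [np.sum(x),      len(x)]])
b = np.array([np.sum(x * t), np.sum(t)])
# A ≈ [[380.63, 56.1], [56.1, 10]], b ≈ [914.68, 140.4]

m, c = np.linalg.solve(A, b)
print(m, c)  # m ≈ 1.93, c ≈ 3.23
```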
! Generalizing from the line parameters (m, c) to a weight vector w:

\frac{\partial E}{\partial m} = -\sum_i (t^{(i)} - m x^{(i)} - c) x^{(i)}

\frac{\partial E}{\partial c} = -\sum_i (t^{(i)} - m x^{(i)} - c)

\frac{\partial E}{\partial w_j} = -\sum_i (t^{(i)} - y^{(i)}) x_j^{(i)}
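These derivatives also give a gradient-descent alternative to solving the normal equations; a sketch on the same line-fitting data (the learning rate and iteration count are hand-tuned guesses of mine for this dataset, not from the slides):

```python
import numpy as np

x = np.array([1.3, 2.4, 3.4, 4.6, 5.3, 6.6, 6.4, 8.0, 8.9, 9.2])
t = np.array([5.7, 7.3, 10.5, 11.8, 13.9, 16.3, 15.3, 17.9, 20.8, 20.9])

# Treat the intercept as a weight by appending a constant-1 feature:
# y = w[0]*x + w[1]*1
X = np.stack([x, np.ones_like(x)], axis=1)

w = np.zeros(2)
lr = 0.004
for _ in range(20000):
    y = X @ w
    grad = -(t - y) @ X  # grad[j] = -sum_i (t_i - y_i) * x_j^(i)
    w = w - lr * grad

print(w)  # ≈ [1.93, 3.23], matching the normal-equation solution
```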
! Targets: t^{(i)} \in \{-1, +1\}

x1  x2  x3  y
1   3   2   +ve
1   4   3   +ve
1   4   8   -ve
1   8   9   -ve

E = \sum_i (t^{(i)} - w^T x^{(i)})^2

\frac{\partial E}{\partial w_j} = -\sum_i (t^{(i)} - y^{(i)}) x_j^{(i)}

! Classify by the sign of y: class = -1 if y \le 0, +1 if y > 0
! w = {10, -1, -1}
! 1x10 + (-1)x3 + (-1)x2 = 5 > 0
! 1x10 + (-1)x4 + (-1)x8 = -2 < 0
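The sign check can be run for all four table rows at once; a small NumPy sketch (variable names are mine):

```python
import numpy as np

# Rows of the table; x1 = 1 acts as the bias feature
X = np.array([[1, 3, 2],
              [1, 4, 3],
              [1, 4, 8],
              [1, 8, 9]])
w = np.array([10, -1, -1])

y = X @ w                        # [5, 3, -2, -7]
labels = np.where(y > 0, 1, -1)  # [+1, +1, -1, -1] matches the table
print(labels)
```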
Linear Regression for Classification
! w = \{w_0, w_1, w_2, \cdots\}^T
! x = \{1, x_1, x_2, \cdots\}^T
! y = w^T x = \sum_j w_j x_j
! t^{(i)} \in \{-1, +1\}
! Classify by the sign of y: class = -1 if y \le 0, +1 if y > 0
! w = {10, -1, -1}
! 1x10 + (-1)x3 + (-1)x2 = 5 > 0
! 1x10 + (-1)x4 + (-1)x8 = -2 < 0
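Instead of guessing w, it can be fitted by minimizing the squared error on the four labelled rows; a sketch using NumPy's standard least-squares solver (solver choice is mine, not specified in the slides):

```python
import numpy as np

X = np.array([[1.0, 3, 2],
              [1, 4, 3],
              [1, 4, 8],
              [1, 8, 9]])
t = np.array([1.0, 1, -1, -1])  # +ve -> +1, -ve -> -1

# Minimizes E = sum_i (t_i - w^T x_i)^2
w, *_ = np.linalg.lstsq(X, t, rcond=None)

pred = np.where(X @ w > 0, 1, -1)
print(pred)  # all four training rows classified correctly
```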
Neuron
! x = \{1, x_1, x_2, \cdots\}^T
[Figure: a neuron: inputs x_1, x_2, x_3, …, x_j arrive at the dendrites, the cell body computes y = w^T x with weights w, and the output travels along the axon to the synaptic terminal]