顏安孜 An-Zi Yen azyen@nlg.csie.ntu.edu.tw: National Taiwan University
Neural Networks
顏安孜 An-Zi Yen
azyen@nlg.csie.ntu.edu.tw
Neural Network = simulate the human brain?
Logistic regression as a single neuron:
$z = \sum_{i=1}^{I} w_i x_i + b$
$P_{w,b}(C_1 \mid x) = \sigma(z)$
Sigmoid function: $\sigma(z) = \dfrac{1}{1 + e^{-z}}$
• http://speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/Logistic%20Regression%20(v4).pdf
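A minimal Python sketch of this neuron: compute the weighted sum z, then pass it through the sigmoid to get P(C1 | x). The weights, bias, and input are made-up illustrative values, not taken from the slides.

```python
import math

def sigmoid(z):
    """Logistic sigmoid: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_regression(x, w, b):
    """Return P(C1 | x) = sigmoid(w . x + b)."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

# Hypothetical parameters and input, chosen only for illustration.
w = [0.5, -1.0]
b = 0.25
x = [2.0, 1.0]

# z = 0.5*2.0 + (-1.0)*1.0 + 0.25 = 0.25
p_c1 = logistic_regression(x, w, b)
print(f"P(C1 | x) = {p_c1:.3f}")
```

The output is always strictly between 0 and 1, and z = 0 gives exactly 0.5.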
Logistic Regression vs. Linear Regression
Cross entropy:
$l(f(x^n), \hat{y}^n) = -\left[\hat{y}^n \ln f(x^n) + (1 - \hat{y}^n) \ln\left(1 - f(x^n)\right)\right]$
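A short sketch of the cross entropy for a single sample, assuming a prediction f(x) strictly between 0 and 1 and a target of 0 or 1. The function name and the two example predictions are illustrative only.

```python
import math

def binary_cross_entropy(f_x, y_hat):
    """Cross entropy between prediction f_x in (0, 1) and target y_hat in {0, 1}."""
    return -(y_hat * math.log(f_x) + (1 - y_hat) * math.log(1 - f_x))

# Hypothetical predictions for a sample whose target is y_hat = 1.
loss_good = binary_cross_entropy(0.9, 1)  # confident and correct: small loss
loss_bad = binary_cross_entropy(0.1, 1)   # confident and wrong: large loss
print(loss_good, loss_bad)
```

With target 1 only the first term survives, so the loss reduces to -ln f(x).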
C1: $w^1, b_1$    $z_1 = w^1 \cdot x + b_1$
C2: $w^2, b_2$    $z_2 = w^2 \cdot x + b_2$
C3: $w^3, b_3$    $z_3 = w^3 \cdot x + b_3$
Softmax:
$y_i = \dfrac{e^{z_i}}{\sum_{j=1}^{3} e^{z_j}}$
Probability: $1 > y_i > 0$, $\sum_i y_i = 1$, $y_i = P(C_i \mid x)$
Example:
  $z_1 = 3$     $e^{z_1} \approx 20$      $y_1 \approx 0.88$
  $z_2 = 1$     $e^{z_2} \approx 2.7$     $y_2 \approx 0.12$
  $z_3 = -3$    $e^{z_3} \approx 0.05$    $y_3 \approx 0$
[Bishop, P209-210]
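The softmax example with z = (3, 1, -3) can be checked with a few lines of Python. Subtracting max(z) before exponentiating is a common numerical-stability step that is not on the slide; it does not change the result.

```python
import math

def softmax(z):
    """Return e^(z_i) / sum_j e^(z_j) for every entry of z."""
    m = max(z)  # stability shift (an addition, not from the slides)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

y = softmax([3.0, 1.0, -3.0])
print([round(v, 2) for v in y])  # [0.88, 0.12, 0.0]
```

The three outputs are positive and sum to 1, so they can be read as P(C_i | x).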
Multi-class Classification (3 classes as example)
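$z_i = w^i \cdot x + b_i$ for $i = 1, 2, 3$
$x \rightarrow (z_1, z_2, z_3) \rightarrow \text{Softmax} \rightarrow y = (y_1, y_2, y_3)$
Cross entropy between the output $y$ and the target $\hat{y}$:
$-\sum_{i=1}^{3} \hat{y}_i \ln y_i$
Target:
If x ∈ class 1: $\hat{y} = [1, 0, 0]^T$, cross entropy $= -\ln y_1$
If x ∈ class 2: $\hat{y} = [0, 1, 0]^T$, cross entropy $= -\ln y_2$
If x ∈ class 3: $\hat{y} = [0, 0, 1]^T$, cross entropy $= -\ln y_3$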
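A small sketch showing that, with a one-hot target, the cross entropy reduces to minus the log of the probability assigned to the true class. The output vector y below is a hypothetical softmax output, not a value from the slides.

```python
import math

def cross_entropy(y, y_hat):
    """-sum_i y_hat_i * ln(y_i), where y is a softmax output and y_hat is one-hot."""
    return -sum(t * math.log(p) for p, t in zip(y, y_hat) if t > 0)

# Hypothetical softmax output for some input x.
y = [0.7, 0.2, 0.1]

loss_class1 = cross_entropy(y, [1, 0, 0])  # equals -ln(0.7)
loss_class2 = cross_entropy(y, [0, 1, 0])  # equals -ln(0.2)
loss_class3 = cross_entropy(y, [0, 0, 1])  # equals -ln(0.1)
print(loss_class1, loss_class2, loss_class3)
```

The loss is smallest when the target is the class the model already favors.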
Limitation of Logistic Regression
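$z = w_1 x_1 + w_2 x_2 + b$, $y = \sigma(z)$
Class 1: $y \ge 0.5$ ($z \ge 0$)
Class 2: $y < 0.5$ ($z < 0$)
Input Feature      Label
x1      x2
0       0          Class 2
0       1          Class 1
1       0          Class 1
1       1          Class 2
Can we? No, we can't ...
The boundary z = 0 is a straight line in the (x1, x2) plane, and no straight line puts (0, 1) and (1, 0) on one side and (0, 0) and (1, 1) on the other.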
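As a rough check of this limitation, the sketch below tries every (w1, w2, b) on a coarse grid and records the best accuracy on the four points. The grid range and step are arbitrary choices for illustration; the conclusion holds for any real-valued parameters.

```python
from itertools import product

# The four labelled points from the table: 1 = Class 1, 0 = Class 2.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def accuracy(w1, w2, b):
    """Fraction of points classified correctly by the rule z >= 0 -> Class 1."""
    correct = 0
    for (x1, x2), label in data:
        z = w1 * x1 + w2 * x2 + b
        pred = 1 if z >= 0 else 0
        correct += int(pred == label)
    return correct / len(data)

# Arbitrary search grid: -5.0 to 5.0 in steps of 0.5.
grid = [i / 2 for i in range(-10, 11)]
best = max(accuracy(w1, w2, b) for w1, w2, b in product(grid, repeat=3))
print(best)  # 0.75
```

At most three of the four points can be classified correctly by a single linear boundary.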
Limitation of Logistic Regression
• Cascading logistic regression models
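Feature transformation: two logistic regression units map $(x_1, x_2)$ to $(x_1', x_2')$.
$(x_1, x_2) \rightarrow z_1 \rightarrow x_1'$
$(x_1, x_2) \rightarrow z_2 \rightarrow x_2'$
Classification: a third unit works on the transformed features.
$(x_1', x_2') \rightarrow z \rightarrow y$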
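[Figure: the (x1, x2) plane annotated with first-layer outputs ($x_1'$ values 0.73, 0.27, 0.27, 0.05; $x_2'$ values 0.27, 0.73), and the transformed ($x_1'$, $x_2'$) plane showing the points (0.27, 0.27) and (0.05, 0.73). A final unit with weights $w_1$, $w_2$ and bias $b$ maps ($x_1'$, $x_2'$) to $z$ and $y$.]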
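The cascade can be sketched in Python. All weights and biases below are assumptions chosen by hand: the first-layer values happen to reproduce the 0.73, 0.27 and 0.05 shown in the figure, and the second-layer values are just one choice that separates the transformed points.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def unit(x, w, b):
    """One logistic regression unit: sigmoid(w . x + b)."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

# ASSUMED first-layer parameters (not given in the text).
W1, B1 = (-2.0, 2.0), -1.0   # produces x1'
W2, B2 = (2.0, -2.0), -1.0   # produces x2'
# ASSUMED second-layer parameters, picked by hand.
W_OUT, B_OUT = (1.0, 1.0), -0.66

# 1 = Class 1, 0 = Class 2, as in the table of four points.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

transformed = {}
predictions = {}
for x, label in data:
    x_prime = (unit(x, W1, B1), unit(x, W2, B2))   # feature transformation
    y = unit(x_prime, W_OUT, B_OUT)                # classification
    transformed[x] = tuple(round(v, 2) for v in x_prime)
    predictions[x] = 1 if y >= 0.5 else 0
    print(x, transformed[x], predictions[x], label)
```

After the transformation, both Class 2 points land on (0.27, 0.27), so a single line can separate them from the two Class 1 points.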
Deep Learning
Neural Network: many "neurons" connected together
Different connections lead to different network structures.
Network parameters $\theta$: all the weights and biases in the "neurons"
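Fully Connected Feedforward Network
Example with input $x = (1, -1)$ and sigmoid activations:
Layer 1: weights (1, -2) and (-1, 1), biases 1 and 0; z = 4 and -2; outputs 0.98 and 0.12
Layer 2: weights (2, -1) and (-2, -1), biases 0 and 0; outputs 0.86 and 0.11
Layer 3: weights (3, -1) and (-1, 4), biases -2 and 2; outputs 0.62 and 0.83
Sigmoid function: $\sigma(z) = \dfrac{1}{1 + e^{-z}}$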
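The numbers in this example can be reproduced with a short forward pass. The grouping of the weights and biases into per-layer matrices below is my reading of the diagram, so treat it as an assumption; the printed activations agree with the values shown in the figure.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(W, b, a):
    """One fully connected layer: sigmoid(W a + b), with W given as a list of rows."""
    return [sigmoid(sum(w * v for w, v in zip(row, a)) + bias)
            for row, bias in zip(W, b)]

# (weights, biases) per layer, as read from the diagram (assumed layout).
layers = [
    ([[1, -2], [-1, 1]], [1, 0]),
    ([[2, -1], [-2, -1]], [0, 0]),
    ([[3, -1], [-1, 4]], [-2, 2]),
]

a = [1, -1]          # input vector
activations = []
for W, b in layers:
    a = layer(W, b, a)
    activations.append([round(v, 2) for v in a])
    print(activations[-1])
# [0.98, 0.12]
# [0.86, 0.11]
# [0.62, 0.83]
```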
Fully Connected Feedforward Network
Input layer: $x_1, x_2, \ldots, x_N$
Hidden layers: Layer 1, Layer 2, ... (each node is a neuron)
Output layer: Layer L, producing $y_1, y_2, \ldots, y_M$
Neural Network
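Input $x = (x_1, \ldots, x_N)$, output $y = (y_1, \ldots, y_M)$; layer $l$ has weight matrix $W^l$ and bias vector $b^l$.
$a^1 = \sigma(W^1 x + b^1)$
$a^2 = \sigma(W^2 a^1 + b^2)$
...
$y = \sigma(W^L a^{L-1} + b^L)$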
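In matrix form the whole network is a loop over layers. The sketch below uses NumPy with randomly drawn parameters; the layer sizes and the random seed are arbitrary assumptions made only to show the shapes involved.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    """Apply a = sigmoid(W a + b) layer by layer and return the final output y."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a

# Hypothetical sizes: N = 4 inputs, hidden layers of 5 and 3 neurons, M = 2 outputs.
sizes = [4, 5, 3, 2]
rng = np.random.default_rng(0)
weights = [rng.standard_normal((n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(n_out) for n_out in sizes[1:]]

x = rng.standard_normal(sizes[0])
y = forward(x, weights, biases)
print(y.shape)  # (2,)
```

Each weight matrix has one row per neuron in its layer and one column per input to that layer, which is why the shapes chain from N to M.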
Deep Learning
Demo
Input features: Age, Workclass, Education, Occupation, Sex
Model output: Income >50K or Income <=50K
https://archive.ics.uci.edu/ml/datasets/Adult
https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
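A minimal sketch of how such a demo might look with scikit-learn's `LogisticRegression`. The rows below are hypothetical toy values, not records from the Adult dataset, and only two numeric features are used; categorical columns such as Workclass or Occupation would first need to be encoded as numbers.

```python
from sklearn.linear_model import LogisticRegression

# HYPOTHETICAL toy rows (not from the Adult dataset): [age, years_of_education]
X = [[25, 10], [32, 12], [47, 16], [51, 14],
     [23, 9], [45, 13], [38, 16], [29, 11]]
# Labels: 1 = income >50K, 0 = income <=50K (also made up).
y = [0, 0, 1, 1, 0, 1, 1, 0]

model = LogisticRegression(max_iter=1000)
model.fit(X, y)

new_person = [[40, 14]]
proba = model.predict_proba(new_person)[0]  # [P(<=50K), P(>50K)]
label = model.predict(new_person)[0]
print(proba, label)
```

`predict_proba` returns one probability per class in the order given by `model.classes_`, and `predict` returns the more probable class.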