
National Taiwan University

類神經網路 (Neural Networks)
顏安孜 An-Zi Yen
azyen@nlg.csie.ntu.edu.tw
Neural Network = simulate the human brain?

From Logistic Regression to Deep Learning

Reference: Machine Learning (2017, Spring), Professor Hung-Yi Lee
http://speech.ee.ntu.edu.tw/~tlkagk/courses.html
Logistic Regression

z = Σ_i w_i x_i + b

Sigmoid function: σ(z) = 1 / (1 + e^(−z))

f_{w,b}(x) = P_{w,b}(C1 | x) = σ(z)
• If the probability P_{w,b}(C1 | x) > 0.5, output class C1
• Otherwise, output class C2

• http://speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/Logistic%20Regression%20(v4).pdf
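The model above can be sketched in a few lines of NumPy; a minimal sketch, where `logistic_predict` is an illustrative helper name, not something from the slides:

```python
import numpy as np

def sigmoid(z):
    # Sigmoid function: sigma(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def logistic_predict(x, w, b):
    # z = sum_i w_i x_i + b, and P(C1 | x) = sigma(z)
    z = np.dot(w, x) + b
    p = sigmoid(z)
    # Probability > 0.5 -> C1, otherwise -> C2
    return p, "C1" if p > 0.5 else "C2"
```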
Logistic Regression vs. Linear Regression

Step 1 (model):
  Logistic regression: f_{w,b}(x) = σ(Σ_i w_i x_i + b); output between 0 and 1
  Linear regression:   f_{w,b}(x) = Σ_i w_i x_i + b; output can be any value

Step 2 (loss), with training data (x^n, ŷ^n):
  Logistic regression: ŷ^n = 1 for class 1, 0 for class 2
    L(f) = Σ_n l(f(x^n), ŷ^n), where the cross entropy is
    l(f(x^n), ŷ^n) = −[ŷ^n ln f(x^n) + (1 − ŷ^n) ln(1 − f(x^n))]
  Linear regression: ŷ^n is a real number
    L(f) = (1/2) Σ_n (f(x^n) − ŷ^n)²
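The two Step 2 losses can be written out directly; a minimal sketch (the function names are illustrative):

```python
import numpy as np

def binary_cross_entropy(p, y_hat):
    # Logistic regression loss:
    # l(f(x), y_hat) = -[y_hat ln f(x) + (1 - y_hat) ln(1 - f(x))]
    return -(y_hat * np.log(p) + (1 - y_hat) * np.log(1 - p))

def squared_error(p, y_hat):
    # Linear regression loss: (1/2)(f(x) - y_hat)^2
    return 0.5 * (p - y_hat) ** 2
```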
Logistic Regression vs. Linear Regression

Step 3 (gradient descent update): the two models end up with exactly the same update rule:

  Logistic regression: w_i ← w_i − η Σ_n −(ŷ^n − f_{w,b}(x^n)) x_i^n
  Linear regression:   w_i ← w_i − η Σ_n −(ŷ^n − f_{w,b}(x^n)) x_i^n
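Since the update rule is identical for both models, a single function covers both; a sketch assuming the model outputs f_{w,b}(x^n) have already been computed:

```python
import numpy as np

def gradient_step(w, b, X, y_hat, preds, eta):
    # Shared update: w_i <- w_i - eta * sum_n -(y_hat^n - f(x^n)) * x_i^n
    # X: (N, d) inputs, y_hat: (N,) targets, preds: (N,) model outputs f(x^n)
    err = y_hat - preds
    w_new = w - eta * np.sum(-err[:, None] * X, axis=0)
    b_new = b - eta * np.sum(-err)
    return w_new, b_new
```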
Multi-class Classification (3 classes as example)

  C1: w^1, b^1   z_1 = w^1 · x + b^1
  C2: w^2, b^2   z_2 = w^2 · x + b^2
  C3: w^3, b^3   z_3 = w^3 · x + b^3

Softmax: y_i = e^(z_i) / Σ_{j=1}^{3} e^(z_j)

The outputs behave as probabilities: 1 > y_i > 0, Σ_i y_i = 1, and y_i = P(C_i | x).

Example: (z_1, z_2, z_3) = (3, 1, −3) gives (e^(z_1), e^(z_2), e^(z_3)) ≈ (20, 2.7, 0.05), so (y_1, y_2, y_3) ≈ (0.88, 0.12, ≈0).

[Bishop, P209-210]
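The softmax step, and the numeric example above with z = (3, 1, −3), can be checked directly; a minimal sketch:

```python
import numpy as np

def softmax(z):
    # y_i = e^(z_i) / sum_j e^(z_j); subtracting max(z) is a standard
    # numerical-stability trick and does not change the result
    e = np.exp(z - np.max(z))
    return e / e.sum()

y = softmax(np.array([3.0, 1.0, -3.0]))  # ~ (0.88, 0.12, ~0)
```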
Multi-class Classification (3 classes as example)

The network computes z_i = w^i · x + b^i, passes (z_1, z_2, z_3) through the softmax to get y = (y_1, y_2, y_3), and compares y against the target ŷ with the cross entropy:

  −Σ_{i=1}^{3} ŷ_i ln y_i

The targets are one-hot vectors:
• If x ∈ class 1: ŷ = (1, 0, 0), and the loss is −ln y_1
• If x ∈ class 2: ŷ = (0, 1, 0), and the loss is −ln y_2
• If x ∈ class 3: ŷ = (0, 0, 1), and the loss is −ln y_3
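With a one-hot target, the cross entropy reduces to −ln of the probability assigned to the correct class; a minimal sketch:

```python
import numpy as np

def cross_entropy(y, y_hat):
    # -sum_i y_hat_i ln y_i; with one-hot y_hat this is -ln y_c
    # for the true class c
    return -np.sum(y_hat * np.log(y))
```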
Limitation of Logistic Regression

z = w_1 x_1 + w_2 x_2 + b, y = σ(z)
• Class 1 if y ≥ 0.5 (i.e., z ≥ 0)
• Class 2 if y < 0.5 (i.e., z < 0)

Can logistic regression fit the following (XOR-like) data? No, it can't:

  x_1  x_2 | Label
   0    0  | Class 2
   0    1  | Class 1
   1    0  | Class 1
   1    1  | Class 2

The decision boundary z = 0 is a single straight line in the (x_1, x_2) plane, and no straight line puts both Class 1 points on one side and both Class 2 points on the other.
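The claim can be checked numerically: over a grid of candidate (w_1, w_2, b), no linear decision rule classifies all four points correctly (a brute-force sketch, not from the slides):

```python
import numpy as np

# XOR-like data from the slide:
# (0,0)->Class 2, (0,1)->Class 1, (1,0)->Class 1, (1,1)->Class 2
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])  # 1 = Class 1, 0 = Class 2

def accuracy(w1, w2, b):
    z = w1 * X[:, 0] + w2 * X[:, 1] + b
    pred = (z >= 0).astype(int)  # sigma(z) >= 0.5  <=>  z >= 0
    return (pred == y).mean()

grid = np.linspace(-3, 3, 13)
best = max(accuracy(w1, w2, b) for w1 in grid for w2 in grid for b in grid)
# best tops out at 3 of 4 points correct
```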
Limitation of Logistic Regression
• Cascading logistic regression models: a first stage of logistic units transforms the inputs (x_1, x_2) into new features (x_1′, x_2′) (feature transformation), and a final logistic unit does the classification in the transformed space (bias ignored in the figure).
With suitable weights, the transformed features at the four input points are:

  x_1′: 0.27 at (0, 0), 0.73 at (0, 1), 0.05 at (1, 0), 0.27 at (1, 1)
  x_2′: 0.27 at (0, 0), 0.05 at (0, 1), 0.73 at (1, 0), 0.27 at (1, 1)

In the (x_1′, x_2′) plane the data becomes (0.27, 0.27) for both Class 2 points and (0.73, 0.05), (0.05, 0.73) for the Class 1 points, so a final logistic unit z = w_1 x_1′ + w_2 x_2′ + b can now separate the two classes with a single line.
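One choice of hidden weights that reproduces the figure's values is x_1′ = σ(−2x_1 + 2x_2 − 1) and x_2′ = σ(2x_1 − 2x_2 − 1), since 0.73 = σ(1), 0.27 = σ(−1), and 0.05 ≈ σ(−3); these exact weights are an assumption, as the slide only shows the resulting values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def transform(x1, x2):
    # Assumed weights, chosen to match the values in the figure:
    # 0.73 = sigma(1), 0.27 = sigma(-1), 0.05 ~= sigma(-3)
    x1p = sigmoid(-2 * x1 + 2 * x2 - 1)
    x2p = sigmoid(2 * x1 - 2 * x2 - 1)
    return x1p, x2p
```

Note that both Class 2 inputs, (0, 0) and (1, 1), map to the same transformed point, which is why the transformed data becomes linearly separable.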
Deep Learning

Each node computes z = Σ_i w_i x_i + b followed by σ(z); such a node is called a "neuron".

Neural Network
Different connections lead to different network structures.
Network parameter θ: all the weights and biases in the "neurons".
Fully Connected Feedforward Network

Worked example (figure): the input (1, −1) passes through two hidden layers of two sigmoid neurons each and an output layer. The first layer's activations are (0.98, 0.12), the second layer's are (0.86, 0.11), and the outputs are (0.62, 0.83).

Sigmoid function: σ(z) = 1 / (1 + e^(−z))
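The figure's numbers can be reproduced with the weights below; the weight matrices are reconstructed from the partly garbled figure and should be treated as an assumption:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Reconstructed weights/biases (assumed; the extracted figure is ambiguous)
W1, b1 = np.array([[1., -2.], [-1., 1.]]), np.array([1., 0.])
W2, b2 = np.array([[2., -1.], [-2., -1.]]), np.array([0., 0.])
W3, b3 = np.array([[3., -1.], [-1., 4.]]), np.array([-2., 2.])

x = np.array([1., -1.])
a1 = sigmoid(W1 @ x + b1)   # first hidden layer: ~(0.98, 0.12)
a2 = sigmoid(W2 @ a1 + b2)  # second hidden layer: ~(0.86, 0.11)
y = sigmoid(W3 @ a2 + b3)   # output layer: ~(0.62, 0.83)
```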
Fully Connected Feedforward Network

Input layer: x_1, …, x_N → Hidden layers: Layer 1, Layer 2, …, Layer L (each a row of neurons) → Output layer: y_1, …, y_M
Neural Network

With weight matrices W^1, …, W^L and bias vectors b^1, …, b^L, the network computes layer by layer:

  a^1 = σ(W^1 x + b^1)
  a^2 = σ(W^2 a^1 + b^2)
  …
  y = σ(W^L a^(L−1) + b^L)
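The layer-by-layer computation above is just a loop over the (W^l, b^l) pairs; a minimal sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    # y = sigma(W^L ... sigma(W^2 sigma(W^1 x + b^1) + b^2) ... + b^L)
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a
```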
Deep Learning

Demo
Adult Income Prediction

• The Adult dataset is from the 1994 Census database. It is also known as the "Census Income" dataset.
• The prediction task is to determine whether a person makes over 50K a year; the two classes are >50K and <=50K.
• Input features include Age, Workclass, Education, Occupation, and Sex; the model predicts Income >50K or Income <=50K.

https://archive.ics.uci.edu/ml/datasets/Adult
Logistic Regression Demo

https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
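A minimal sketch of such a demo, using a small synthetic stand-in for the Adult data (the features and labels below are illustrative, not the real dataset):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Two illustrative numeric features (e.g., standardized age and education-num)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # 1 = ">50K", 0 = "<=50K"

clf = LogisticRegression().fit(X, y)
acc = clf.score(X, y)  # training accuracy
```

For the real demo, the Adult CSV would be loaded and the categorical columns (Workclass, Education, Occupation, …) one-hot encoded before fitting.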
