
National Taiwan University

類神經網路 (Neural Networks)
顏安孜 An-Zi Yen
azyen@nlg.csie.ntu.edu.tw
Neural Network = simulate the human brain?

From Logistic Regression to Deep Learning

Reference: Machine Learning (2017, Spring), Professor Hung-Yi Lee
http://speech.ee.ntu.edu.tw/~tlkagk/courses.html
Logistic Regression

z = Σ_i w_i x_i + b

Sigmoid function: σ(z) = 1 / (1 + e^(−z))

f_{w,b}(x) = P_{w,b}(C1 | x) = σ(z)
• If the probability P_{w,b}(C1 | x) > 0.5, output class C1
• Otherwise, output class C2

• http://speech.ee.ntu.edu.tw/~tlkagk/courses/ML_2017/Lecture/Logistic%20Regression%20(v4).pdf
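The model above can be sketched in a few lines of NumPy; a minimal sketch, where `logistic_predict` is an illustrative helper name, not something from the slides:

```python
import numpy as np

def sigmoid(z):
    # Sigmoid function: sigma(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def logistic_predict(x, w, b):
    # z = sum_i w_i x_i + b, and P(C1 | x) = sigma(z)
    z = np.dot(w, x) + b
    p = sigmoid(z)
    # Probability > 0.5 -> C1, otherwise -> C2
    return p, "C1" if p > 0.5 else "C2"
```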
Logistic Regression vs. Linear Regression

Step 1 (model):
  Logistic regression: f_{w,b}(x) = σ(Σ_i w_i x_i + b); output between 0 and 1
  Linear regression:   f_{w,b}(x) = Σ_i w_i x_i + b; output can be any value

Step 2 (loss), with training data (x^n, ŷ^n):
  Logistic regression: ŷ^n = 1 for class 1, 0 for class 2
    L(f) = Σ_n l(f(x^n), ŷ^n), where the cross entropy is
    l(f(x^n), ŷ^n) = −[ŷ^n ln f(x^n) + (1 − ŷ^n) ln(1 − f(x^n))]
  Linear regression: ŷ^n is a real number
    L(f) = (1/2) Σ_n (f(x^n) − ŷ^n)²
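The two Step 2 losses can be written out directly; a minimal sketch (the function names are illustrative):

```python
import numpy as np

def binary_cross_entropy(p, y_hat):
    # Logistic regression loss:
    # l(f(x), y_hat) = -[y_hat ln f(x) + (1 - y_hat) ln(1 - f(x))]
    return -(y_hat * np.log(p) + (1 - y_hat) * np.log(1 - p))

def squared_error(p, y_hat):
    # Linear regression loss: (1/2)(f(x) - y_hat)^2
    return 0.5 * (p - y_hat) ** 2
```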
Logistic Regression vs. Linear Regression

Step 3 (gradient descent update): the two models end up with exactly the same update rule:

  Logistic regression: w_i ← w_i − η Σ_n −(ŷ^n − f_{w,b}(x^n)) x_i^n
  Linear regression:   w_i ← w_i − η Σ_n −(ŷ^n − f_{w,b}(x^n)) x_i^n
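Since the update rule is identical for both models, a single function covers both; a sketch assuming the model outputs f_{w,b}(x^n) have already been computed:

```python
import numpy as np

def gradient_step(w, b, X, y_hat, preds, eta):
    # Shared update: w_i <- w_i - eta * sum_n -(y_hat^n - f(x^n)) * x_i^n
    # X: (N, d) inputs, y_hat: (N,) targets, preds: (N,) model outputs f(x^n)
    err = y_hat - preds
    w_new = w - eta * np.sum(-err[:, None] * X, axis=0)
    b_new = b - eta * np.sum(-err)
    return w_new, b_new
```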
Multi-class Classification (3 classes as example)

  C1: w^1, b^1   z_1 = w^1 · x + b^1
  C2: w^2, b^2   z_2 = w^2 · x + b^2
  C3: w^3, b^3   z_3 = w^3 · x + b^3

Softmax: y_i = e^(z_i) / Σ_{j=1}^{3} e^(z_j)

The outputs behave as probabilities: 1 > y_i > 0, Σ_i y_i = 1, and y_i = P(C_i | x).

Example: (z_1, z_2, z_3) = (3, 1, −3) gives (e^(z_1), e^(z_2), e^(z_3)) ≈ (20, 2.7, 0.05), so (y_1, y_2, y_3) ≈ (0.88, 0.12, ≈0).

[Bishop, P209-210]
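The softmax step, and the numeric example above with z = (3, 1, −3), can be checked directly; a minimal sketch:

```python
import numpy as np

def softmax(z):
    # y_i = e^(z_i) / sum_j e^(z_j); subtracting max(z) is a standard
    # numerical-stability trick and does not change the result
    e = np.exp(z - np.max(z))
    return e / e.sum()

y = softmax(np.array([3.0, 1.0, -3.0]))  # ~ (0.88, 0.12, ~0)
```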
Multi-class Classification (3 classes as example)

The network computes z_i = w^i · x + b^i, passes (z_1, z_2, z_3) through the softmax to get y = (y_1, y_2, y_3), and compares y against the target ŷ with the cross entropy:

  −Σ_{i=1}^{3} ŷ_i ln y_i

The targets are one-hot vectors:
• If x ∈ class 1: ŷ = (1, 0, 0), and the loss is −ln y_1
• If x ∈ class 2: ŷ = (0, 1, 0), and the loss is −ln y_2
• If x ∈ class 3: ŷ = (0, 0, 1), and the loss is −ln y_3
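With a one-hot target, the cross entropy reduces to −ln of the probability assigned to the correct class; a minimal sketch:

```python
import numpy as np

def cross_entropy(y, y_hat):
    # -sum_i y_hat_i ln y_i; with one-hot y_hat this is -ln y_c
    # for the true class c
    return -np.sum(y_hat * np.log(y))
```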
Limitation of Logistic Regression

z = w_1 x_1 + w_2 x_2 + b, y = σ(z)
• Class 1 if y ≥ 0.5 (i.e., z ≥ 0)
• Class 2 if y < 0.5 (i.e., z < 0)

Can logistic regression fit the following (XOR-like) data? No, it can't:

  x_1  x_2 | Label
   0    0  | Class 2
   0    1  | Class 1
   1    0  | Class 1
   1    1  | Class 2

The decision boundary z = 0 is a single straight line in the (x_1, x_2) plane, and no straight line puts both Class 1 points on one side and both Class 2 points on the other.
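The claim can be checked numerically: over a grid of candidate (w_1, w_2, b), no linear decision rule classifies all four points correctly (a brute-force sketch, not from the slides):

```python
import numpy as np

# XOR-like data from the slide:
# (0,0)->Class 2, (0,1)->Class 1, (1,0)->Class 1, (1,1)->Class 2
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 0])  # 1 = Class 1, 0 = Class 2

def accuracy(w1, w2, b):
    z = w1 * X[:, 0] + w2 * X[:, 1] + b
    pred = (z >= 0).astype(int)  # sigma(z) >= 0.5  <=>  z >= 0
    return (pred == y).mean()

grid = np.linspace(-3, 3, 13)
best = max(accuracy(w1, w2, b) for w1 in grid for w2 in grid for b in grid)
# best tops out at 3 of 4 points correct
```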
Limitation of Logistic Regression
• Cascading logistic regression models: a first stage of logistic units transforms the inputs (x_1, x_2) into new features (x_1′, x_2′) (feature transformation), and a final logistic unit does the classification in the transformed space (bias ignored in the figure).
With suitable weights, the transformed features at the four input points are:

  x_1′: 0.27 at (0, 0), 0.73 at (0, 1), 0.05 at (1, 0), 0.27 at (1, 1)
  x_2′: 0.27 at (0, 0), 0.05 at (0, 1), 0.73 at (1, 0), 0.27 at (1, 1)

In the (x_1′, x_2′) plane the data becomes (0.27, 0.27) for both Class 2 points and (0.73, 0.05), (0.05, 0.73) for the Class 1 points, so a final logistic unit z = w_1 x_1′ + w_2 x_2′ + b can now separate the two classes with a single line.
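One choice of hidden weights that reproduces the figure's values is x_1′ = σ(−2x_1 + 2x_2 − 1) and x_2′ = σ(2x_1 − 2x_2 − 1), since 0.73 = σ(1), 0.27 = σ(−1), and 0.05 ≈ σ(−3); these exact weights are an assumption, as the slide only shows the resulting values:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def transform(x1, x2):
    # Assumed weights, chosen to match the values in the figure:
    # 0.73 = sigma(1), 0.27 = sigma(-1), 0.05 ~= sigma(-3)
    x1p = sigmoid(-2 * x1 + 2 * x2 - 1)
    x2p = sigmoid(2 * x1 - 2 * x2 - 1)
    return x1p, x2p
```

Note that both Class 2 inputs, (0, 0) and (1, 1), map to the same transformed point, which is why the transformed data becomes linearly separable.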
Deep Learning

Each node computes z = Σ_i w_i x_i + b followed by σ(z); such a node is called a "neuron".

Neural Network
Different connections lead to different network structures.
Network parameter θ: all the weights and biases in the "neurons".
Fully Connected Feedforward Network

Worked example (figure): the input (1, −1) passes through two hidden layers of two sigmoid neurons each and an output layer. The first layer's activations are (0.98, 0.12), the second layer's are (0.86, 0.11), and the outputs are (0.62, 0.83).

Sigmoid function: σ(z) = 1 / (1 + e^(−z))
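The figure's numbers can be reproduced with the weights below; the weight matrices are reconstructed from the partly garbled figure and should be treated as an assumption:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Reconstructed weights/biases (assumed; the extracted figure is ambiguous)
W1, b1 = np.array([[1., -2.], [-1., 1.]]), np.array([1., 0.])
W2, b2 = np.array([[2., -1.], [-2., -1.]]), np.array([0., 0.])
W3, b3 = np.array([[3., -1.], [-1., 4.]]), np.array([-2., 2.])

x = np.array([1., -1.])
a1 = sigmoid(W1 @ x + b1)   # first hidden layer: ~(0.98, 0.12)
a2 = sigmoid(W2 @ a1 + b2)  # second hidden layer: ~(0.86, 0.11)
y = sigmoid(W3 @ a2 + b3)   # output layer: ~(0.62, 0.83)
```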
Fully Connected Feedforward Network

Input layer: x_1, …, x_N → Hidden layers: Layer 1, Layer 2, …, Layer L (each a row of neurons) → Output layer: y_1, …, y_M
Neural Network

With weight matrices W^1, …, W^L and bias vectors b^1, …, b^L, the network computes layer by layer:

  a^1 = σ(W^1 x + b^1)
  a^2 = σ(W^2 a^1 + b^2)
  …
  y = σ(W^L a^(L−1) + b^L)
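The layer-by-layer computation above is just a loop over the (W^l, b^l) pairs; a minimal sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    # y = sigma(W^L ... sigma(W^2 sigma(W^1 x + b^1) + b^2) ... + b^L)
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)
    return a
```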
Deep Learning

Demo
Adult Income Prediction

• The Adult dataset is from the 1994 Census database. It is also known as the "Census Income" dataset.
• The prediction task is to determine whether a person makes over 50K a year; the two classes are >50K and <=50K.
• Input features include Age, Workclass, Education, Occupation, and Sex; the model predicts Income >50K or Income <=50K.

https://archive.ics.uci.edu/ml/datasets/Adult
Logistic Regression Demo

https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html
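A minimal sketch of such a demo, using a small synthetic stand-in for the Adult data (the features and labels below are illustrative, not the real dataset):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Two illustrative numeric features (e.g., standardized age and education-num)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)  # 1 = ">50K", 0 = "<=50K"

clf = LogisticRegression().fit(X, y)
acc = clf.score(X, y)  # training accuracy
```

For the real demo, the Adult CSV would be loaded and the categorical columns (Workclass, Education, Occupation, …) one-hot encoded before fitting.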
