Download as pdf or txt
Download as pdf or txt
You are on page 1of 8

Artificial Intelligence

Softmax function

Pham Viet Cuong


Dept. Control Engineering & Automation, FEEE
Ho Chi Minh City University of Technology
Softmax function
ü Usual output layer for classification tasks
ü Example: Image classification problem ILSVRC2012
(Large Scale Visual Recognition Challenge 2012)
v 1000 categories
v 1.2 million images

Pham Viet Cuong - Dept. Control Eng. & Automation, FEEE, HCMUT 2
Softmax function
ü Ordinary classifier:
v Some rule, or function, that assigns to a sample x a class label ŷ
yˆ  f  x 
ü Probabilistic classifier:
v Conditional distributions PY | X : for a given x  X , assign
probabilities to all y  Y (these probabilities sum to one).
 y1  0.20
   
y
 2   0 . 35 
 y 3  0.05
   
 y 4  0.40
Pham Viet Cuong - Dept. Control Eng. & Automation, FEEE, HCMUT 3
Softmax function
ü Example: Image classification problem
v 4 categories: dog, cat, kite, car 1  0 0  0 
v One-hot encoding 0  1 0  0 
       
0  0 1  0 
       
0  0 0  1 
Black box z y
 z1   0 . 8   y1  0.28
Input Output     Softmax  y   0.51
 z 2    1 .4   2   
 z 3    0 .8   y 3  0.06
       
 z 4   0.2   y 4  0.15
Pham Viet Cuong - Dept. Control Eng. & Automation, FEEE, HCMUT 4
Softmax function
ü Softmax function:
v Takes as input a vector of k real numbers  0 .8 
v Normalizes it into a probability distribution consisting of k  1 .4 
probabilities proportional to the exponentials of the input numbers  
ü Input:   0 .8 
v Some vector components could be negative, or greater than one  0.2 
v Might not sum to 1
ü Output: 0.28
v Each component in the interval (0,1)  0.51
 
v The components add up to 1 0.06
v Larger input components correspond to larger probabilities  
v Can be interpreted as probabilities  0.15 
Pham Viet Cuong - Dept. Control Eng. & Automation, FEEE, HCMUT 5
Softmax function
ü Softmax function:

e zi

yi  C
, i  1, 2,, C
e
j 1
zj

 z1   0 . 8   y1  0.28
       
y 0 . 51
 z 2    1 .4   2   
 z 3    0 .8   y 3  0.06
       
 z 4   0.2   y 4  0.15

Pham Viet Cuong - Dept. Control Eng. & Automation, FEEE, HCMUT 6
Softmax function
ü Where comes the name “softmax”?
v Smooth approximation to the maximum function?
v Smooth approximation to the arg max function --> softargmax

 z1   0 . 8  0   y1  0.28
    1    
z
 2   1 . 4  y
 2   0 . 51
arg maxz    
 z 3    0 .8  0   y 3  0.06
         
 z 4   0.2  0   y 4  0.15
f x   max 0, x 
Pham Viet Cuong - Dept. Control Eng. & Automation, FEEE, HCMUT 7
Softmax function
ü References
v https://cs231n.github.io/linear-classify/#softmax
v http://neuralnetworksanddeeplearning.com/chap3.html
v https://en.wikipedia.org/wiki/Softmax_function
v https://en.wikipedia.org/wiki/Probabilistic_classification
v A. Krizhevsky, I. Sutskever, G.E. Hinton, ImageNet Classification with
Deep Convolutional Neural Networks, 2012
v https://www.thehappycatsite.com/funny-cat-quotes/

Pham Viet Cuong - Dept. Control Eng. & Automation, FEEE, HCMUT 8

You might also like