
19EEE362: Deep Learning For Visual Computing

Dr. T. Ananthan

Department of EEE,
School of Engineering,
AMRITA Vishwa Vidyapeetham,
Coimbatore.

April 22, 2023

Deep Feedforward Networks

Deep feedforward networks, often called feedforward networks or multilayer perceptrons (MLPs), are the typical example of deep learning models.
Feedforward neural networks are called networks because they are typically represented by composing together many different functions.
For example, we might have three functions 𝑓⁽¹⁾, 𝑓⁽²⁾, and 𝑓⁽³⁾ connected in a chain, to form 𝑓(𝑥) = 𝑓⁽³⁾(𝑓⁽²⁾(𝑓⁽¹⁾(𝑥))).
In this case, 𝑓⁽¹⁾ is called the first layer of the network, 𝑓⁽²⁾ is called the second layer, and so on.
The overall length of the chain gives the depth of the model. It is from this terminology that the name deep learning arises. The final layer of a feedforward network is called the output layer.
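As a minimal NumPy sketch (not from the slides; the layer sizes, weights, and tanh nonlinearity are arbitrary illustrative choices), the chain 𝑓(𝑥) = 𝑓⁽³⁾(𝑓⁽²⁾(𝑓⁽¹⁾(𝑥))) can be written as three nested layer functions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Arbitrary layer sizes chosen for illustration: 4 -> 5 -> 3 -> 1
W1, b1 = rng.standard_normal((5, 4)), np.zeros(5)
W2, b2 = rng.standard_normal((3, 5)), np.zeros(3)
W3, b3 = rng.standard_normal((1, 3)), np.zeros(1)

def f1(x):
    # first layer: affine transform followed by a nonlinearity
    return np.tanh(W1 @ x + b1)

def f2(h):
    # second layer
    return np.tanh(W2 @ h + b2)

def f3(h):
    # output layer (left linear here)
    return W3 @ h + b3

x = rng.standard_normal(4)
y = f3(f2(f1(x)))   # f(x) = f3(f2(f1(x))), a chain of depth 3
print(y)
```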

Deep Learning

Activation Functions

Threshold
Sigmoid
Tanh
ReLU (Rectified Linear Unit)
Leaky ReLU
Softmax

Threshold Activation Function

𝜑(𝑣) = 1, if 𝑣 ≥ 0
𝜑(𝑣) = 0, if 𝑣 < 0

1 It assumes only the values 0 or 1
2 It is not differentiable
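A one-line NumPy sketch of this threshold (step) function, for illustration:

```python
import numpy as np

def threshold(v):
    # 1 where v >= 0, 0 where v < 0
    return np.where(v >= 0, 1.0, 0.0)

print(threshold(np.array([-2.0, 0.0, 3.5])))  # [0. 1. 1.]
```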

Sigmoid Activation Function

𝜑(𝑣) = 1 / (1 + exp(−𝑎𝑣))

Sigmoid Activation Function.....

1 It is a nonlinear function
2 It assumes a continuous range of values from 0 to 1
3 It is differentiable
4 At the tails of the curve, the gradient values become almost zero during training, and hence the most important aspect of learning in a neural network (gradient-based weight updates) is inhibited
5 It is undesirable to have values squashed near the tails where the gradient is zero
6 Its output is not zero centered
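A small NumPy sketch of the sigmoid and its derivative (the slope parameter 𝑎 from the formula above is kept, defaulting to 1):

```python
import numpy as np

def sigmoid(v, a=1.0):
    # phi(v) = 1 / (1 + exp(-a*v)); output lies in (0, 1)
    return 1.0 / (1.0 + np.exp(-a * v))

def sigmoid_derivative(v, a=1.0):
    # derivative a * phi(v) * (1 - phi(v)); nearly zero at the tails
    s = sigmoid(v, a)
    return a * s * (1.0 - s)

print(sigmoid(np.array([-5.0, 0.0, 5.0])))            # near 0, 0.5, near 1
print(sigmoid_derivative(np.array([-5.0, 0.0, 5.0]))) # tiny at the tails, 0.25 at v = 0
```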

Tanh Activation Function

𝜑(𝑣) = tanh(𝑣)

This is similar to the sigmoid function except that:

It assumes a continuous range of values from −1 to +1
It is zero centered
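For illustration, NumPy provides tanh directly:

```python
import numpy as np

def tanh_activation(v):
    # phi(v) = tanh(v); output lies in (-1, 1) and is zero centered
    return np.tanh(v)

print(tanh_activation(np.array([-3.0, 0.0, 3.0])))  # approximately [-0.995, 0., 0.995]
```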

Deep Network

Deep networks suffer from the vanishing gradient problem.

Vanishing Gradient Problem

In the backpropagation algorithm, the weights are updated during the backward pass.
These weights are updated using the local gradient in each layer.
The local gradient of each layer is computed using the derivative of the sigmoid or tanh activation function.
During the backward pass, as the number of layers increases, the local gradients are multiplied together and the resulting gradient becomes very small or approaches zero.
Hence little or no training can take place in the earlier layers.
This happens because the output of the activation function lies in the range 0 to 1 or −1 to +1, so its derivative is small, and multiplying many small derivatives drives the gradient towards zero.
This is called the vanishing gradient problem.
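A rough numerical sketch (illustrative values only, not from the slides) of why stacking many sigmoid layers shrinks the gradient: the sigmoid derivative is at most 0.25, so the product of many such local gradients decays rapidly with depth. Weights are ignored here for simplicity.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def sigmoid_grad(v):
    s = sigmoid(v)
    return s * (1.0 - s)  # maximum value 0.25, reached at v = 0

# Product of local gradients across an increasing number of layers,
# evaluated at pre-activations of 0 (the most favourable case).
for depth in (2, 5, 10, 20):
    product = np.prod([sigmoid_grad(0.0) for _ in range(depth)])
    print(depth, product)   # 0.0625, ~0.00098, ~9.5e-07, ~9.1e-13
```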

To overcome this problem in deep networks, the ReLU (Rectified Linear Unit) activation can be used.

ReLU Activation Function
Rectified Linear Unit

𝜑(𝑣) = 𝑣, if 𝑣 ≥ 0
𝜑(𝑣) = 0, if 𝑣 < 0

It is nonlinear.
It is continuous, and differentiable everywhere except at 𝑣 = 0.
It assumes a continuous range of values from 0 to ∞.
However, the problem is that its output (and hence its gradient) is zero for negative values of 𝑣.
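A NumPy sketch of ReLU, for illustration:

```python
import numpy as np

def relu(v):
    # phi(v) = v for v >= 0, and 0 for v < 0
    return np.maximum(0.0, v)

print(relu(np.array([-2.0, 0.0, 3.0])))  # [0. 0. 3.]
```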
Leaky ReLU Activation Function

𝜑(𝑣) = 𝑣, if 𝑣 ≥ 0
𝜑(𝑣) = 0.01𝑣, if 𝑣 < 0
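A NumPy sketch of Leaky ReLU using the 0.01 slope from the definition above (the slope is a hyperparameter; other small values are also used in practice):

```python
import numpy as np

def leaky_relu(v, slope=0.01):
    # phi(v) = v for v >= 0, slope*v for v < 0; the small slope keeps a
    # non-zero gradient for negative inputs
    return np.where(v >= 0, v, slope * v)

print(leaky_relu(np.array([-2.0, 0.0, 3.0])))  # [-0.02  0.    3.  ]
```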

Binary Classification Problem

Problem statement: Given the data, predict whether the flower is iris setosa or not.

Dataset

Neural Network Architecture
Binary classification
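A minimal NumPy sketch of one possible architecture for this binary task. The input size of 4 (sepal/petal length and width), the hidden layer of 5 ReLU units, and the single sigmoid output are assumptions for illustration; the network is untrained, so its output is not a meaningful prediction:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: 4 inputs -> 5 hidden units -> 1 sigmoid output
W1, b1 = rng.standard_normal((5, 4)) * 0.1, np.zeros(5)
W2, b2 = rng.standard_normal((1, 5)) * 0.1, np.zeros(1)

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def predict_setosa_probability(x):
    # forward pass: hidden layer with ReLU, output layer with sigmoid
    h = np.maximum(0.0, W1 @ x + b1)
    return sigmoid(W2 @ h + b2)

x = np.array([5.1, 3.5, 1.4, 0.2])   # one illustrative feature vector
print(predict_setosa_probability(x)) # value in (0, 1) read as "iris setosa or not"
```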

Multiple class Problem

Problem statement: Given the data, predict whether the flower is iris setosa, iris versicolor, or iris virginica.

Dataset

Neural Network Architecture
Multiple Classes
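A corresponding sketch for the three-class case, again with assumed sizes (4 inputs, 5 hidden units); the output layer now has 3 units, one per class, passed through softmax (defined on the following slides):

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: 4 inputs -> 5 hidden -> 3 outputs (setosa, versicolor, virginica)
W1, b1 = rng.standard_normal((5, 4)) * 0.1, np.zeros(5)
W2, b2 = rng.standard_normal((3, 5)) * 0.1, np.zeros(3)

def softmax(z):
    e = np.exp(z - z.max())          # subtract the max for numerical stability
    return e / e.sum()

def predict_class_probabilities(x):
    h = np.maximum(0.0, W1 @ x + b1) # hidden layer with ReLU
    return softmax(W2 @ h + b2)      # three probabilities that sum to 1

x = np.array([5.1, 3.5, 1.4, 0.2])
print(predict_class_probabilities(x))
```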

Target Vector for Multiple Classes

iris setosa = [1 0 0]
iris versicolor = [0 1 0]
iris virginica = [0 0 1]

In general, for 𝐾 classes,

𝐶₁ = [1 0 . . . 0]
𝐶₂ = [0 1 . . . 0]
⋮
𝐶_𝐾 = [0 0 . . . 1]
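A small sketch of building these one-hot target vectors; the helper name one_hot is an illustrative choice, not from the slides:

```python
import numpy as np

def one_hot(class_index, num_classes):
    # target vector with a 1 at the class position and 0 elsewhere
    t = np.zeros(num_classes)
    t[class_index] = 1.0
    return t

classes = ["iris setosa", "iris versicolor", "iris virginica"]
for i, name in enumerate(classes):
    print(name, one_hot(i, len(classes)))
# iris setosa [1. 0. 0.], iris versicolor [0. 1. 0.], iris virginica [0. 0. 1.]
```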

Softmax Activation Function

For multiple-class problems, the softmax activation function is used only in the output layer.
Softmax is a function that converts a vector of numbers into a vector of probabilities.

Softmax-Example

Softmax-Example.....

𝑓(𝑧ᵢ) = 𝑒^𝑧ᵢ / Σⱼ₌₁ᵏ 𝑒^𝑧ⱼ

𝑓(𝑧₁) = 𝑒^𝑧₁ / (𝑒^𝑧₁ + 𝑒^𝑧₂ + 𝑒^𝑧₃ + 𝑒^𝑧₄ + 𝑒^𝑧₅)
      = 𝑒^1.3 / (𝑒^1.3 + 𝑒^5.1 + 𝑒^2.2 + 𝑒^0.7 + 𝑒^1.1)
      = 3.66 / (3.66 + 164.02 + 9.02 + 2.01 + 3.00)
      = 3.66 / 181.71
      = 0.02

Softmax-Example.....

𝑓(𝑧₂) = 𝑒^5.1 / 181.71 = 0.90
𝑓(𝑧₃) = 𝑒^2.2 / 181.71 = 0.05
𝑓(𝑧₄) = 𝑒^0.7 / 181.71 = 0.01
𝑓(𝑧₅) = 𝑒^1.1 / 181.71 = 0.02

𝑓(z) = [0.02 0.90 0.05 0.01 0.02]

The outputs sum to 1: 0.02 + 0.90 + 0.05 + 0.01 + 0.02 = 1
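As a cross-check, a short NumPy sketch that reproduces this worked example; the input vector z = [1.3, 5.1, 2.2, 0.7, 1.1] is taken from the slides:

```python
import numpy as np

def softmax(z):
    # f(z_i) = exp(z_i) / sum_j exp(z_j)
    e = np.exp(z)
    return e / e.sum()

z = np.array([1.3, 5.1, 2.2, 0.7, 1.1])
p = softmax(z)
print(np.round(p, 2))   # [0.02 0.9  0.05 0.01 0.02]
print(p.sum())          # 1.0 (up to floating-point rounding)
```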
