
Multilayer Perceptron


The multilayer perceptron (MLP) belongs to the class of feedforward networks, meaning that information flows through the network nodes exclusively in the forward direction, from the input layer toward the output layer.

Figure: Multilayer perceptron with an input layer, three hidden layers, and an output layer
Backpropagation learning algorithm
The algorithm is based on the gradient descent technique for solving an optimization problem, which involves the minimization of the network cumulative error Ec over the n training patterns presented to the network for learning purposes. The error contributed by the k-th pattern is the square of the Euclidean norm of the vectorial difference between the k-th target output vector t(k) and the k-th actual output vector o(k) of the network, where the index i runs over the q neurons of the output layer.
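The expressions for Ec and E(k) did not survive extraction; a standard formulation consistent with the definitions above (the factor 1/2 is a common convention that simplifies the derivative and is assumed here) is:

$$
E_c = \sum_{k=1}^{n} E(k), \qquad E(k) = \frac{1}{2}\,\lVert t(k) - o(k)\rVert^{2} = \frac{1}{2}\sum_{i=1}^{q}\bigl(t_i(k) - o_i(k)\bigr)^{2}
$$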

The algorithm is designed in such a way as to update the weights in the direction of the gradient descent of the cumulative error, that is, along the negative gradient of Ec with respect to the weight vector. Training can be carried out off-line (all the training patterns are presented to the system at once) or on-line (training is made pattern by pattern). The gradient is taken with respect to the vector w(l) comprising all interconnection weights between layer (l) and the preceding layer (l − 1).
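The update rule itself was lost in extraction; a standard gradient descent step matching this description, with η denoting the learning rate (a symbol assumed here, not taken from the slides), is:

$$
\Delta w^{(l)} = -\eta\,\frac{\partial E_c}{\partial w^{(l)}}
$$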

The signal tot_i(l) represents the sum of all signals reaching node (i) at hidden layer (l) coming from the previous layer (l − 1).
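Written out explicitly, with w_ij(l) denoting the weight connecting node (j) of layer (l − 1) to node (i) of layer (l), o_j(l − 1) the output of node (j), and f the node activation function (notation inferred from the context):

$$
tot_i^{(l)} = \sum_{j} w_{ij}^{(l)}\, o_j^{(l-1)}, \qquad o_i^{(l)} = f\bigl(tot_i^{(l)}\bigr)
$$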
Using chain rule differentiation, we obtain the expression of the weight update Δwij(l):
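The resulting expression was lost in extraction; the standard chain rule result consistent with the definitions above, with δ_i(l) denoting the error signal of node (i) at layer (l), is:

$$
\Delta w_{ij}^{(l)} = -\eta\,\frac{\partial E}{\partial w_{ij}^{(l)}} = -\eta\,\frac{\partial E}{\partial tot_i^{(l)}}\,\frac{\partial tot_i^{(l)}}{\partial w_{ij}^{(l)}} = \eta\,\delta_i^{(l)}\, o_j^{(l-1)}, \qquad \delta_i^{(l)} \triangleq -\frac{\partial E}{\partial tot_i^{(l)}}
$$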

For the case where layer (l) is the output layer (L), the above equation can be expressed as:
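Reconstructed from the error definition E(k) above (hedged, since the original equation is missing):

$$
\delta_i^{(L)} = \bigl(t_i - o_i\bigr)\, f'\bigl(tot_i^{(L)}\bigr)
$$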
Considering the case where f is the sigmoid function, the error signal becomes expressed as:
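The sigmoid satisfies f′(x) = f(x)(1 − f(x)), so the missing expression reduces, in its standard form, to:

$$
f(x) = \frac{1}{1 + e^{-x}}, \qquad \delta_i^{(L)} = \bigl(t_i - o_i\bigr)\, o_i\,\bigl(1 - o_i\bigr)
$$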

Propagating the error backward now, and for the case where (l) represents a hidden layer (l < L), the expression of Δwij(l) keeps the same form, where the error signal δi(l) is now expressed as a function of the node outputs and of the error signals backpropagated from the following layer (l + 1):
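The missing pair of expressions, reconstructed in the standard backpropagation form implied by the text:

$$
\Delta w_{ij}^{(l)} = \eta\,\delta_i^{(l)}\, o_j^{(l-1)}, \qquad \delta_i^{(l)} = f'\bigl(tot_i^{(l)}\bigr)\sum_{k}\delta_k^{(l+1)}\, w_{ki}^{(l+1)}
$$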
To illustrate this powerful algorithm, we apply it to the training of the network shown in the figure. The following three training pattern pairs are used, with x and t being the input and the target output data, respectively:
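The figure and the numerical values of the three pattern pairs did not survive extraction and are not reproduced here. As a stand-in, the sketch below implements the update rules derived above for a small 2-2-1 sigmoid network trained on-line; the architecture, the learning rate, the number of epochs, and the pattern pairs are illustrative placeholders rather than the values from the original example, and biases are omitted for brevity:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Placeholder training pairs: the actual (x, t) values from the example
# were lost in extraction; these are illustrative only.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]])  # inputs x(k)
T = np.array([[0.0], [1.0], [1.0]])                 # targets t(k)

rng = np.random.default_rng(0)
W1 = rng.uniform(-0.5, 0.5, (2, 2))  # weights: input layer -> hidden layer
W2 = rng.uniform(-0.5, 0.5, (2, 1))  # weights: hidden layer -> output layer
eta = 0.5                            # learning rate (assumed value)

for epoch in range(5000):
    for x, t in zip(X, T):           # on-line mode: pattern by pattern
        # Forward pass: tot_i(l) = sum_j w_ij(l) o_j(l-1), o_i(l) = f(tot_i(l))
        o1 = sigmoid(x @ W1)         # hidden-layer outputs
        o2 = sigmoid(o1 @ W2)        # network output o(k)

        # Backward pass: error signals from the derivation above
        delta2 = (t - o2) * o2 * (1 - o2)       # output layer: (t - o) o (1 - o)
        delta1 = o1 * (1 - o1) * (W2 @ delta2)  # hidden layer: f' * sum(w * delta)

        # Gradient descent updates: delta_w_ij = eta * delta_i * o_j
        W2 += eta * np.outer(o1, delta2)
        W1 += eta * np.outer(x, delta1)
```

After training, sigmoid(sigmoid(x @ W1) @ W2) gives the network's output for a new input x.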
Momentum
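Only the slide title survives here. The momentum technique it refers to augments each weight change with a fraction of the previous change, which damps oscillations and accelerates convergence along shallow directions of the error surface. A common formulation, with momentum coefficient α (symbol assumed), is:

$$
\Delta w_{ij}(t) = \eta\,\delta_i\, o_j + \alpha\,\Delta w_{ij}(t-1), \qquad 0 \le \alpha < 1
$$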
Effect of Hidden Nodes on Function Approximation
Effect of Training Patterns on Function Approximation
