Aula 3 T
Learning
António Ruano
aruano@ualg.pt
Characteristics of Neural Networks
• Neural network models can be characterized by:
• the model employed for the individual neurons,
• the way these neurons are interconnected,
• and the learning mechanism(s) employed.
Interconnecting neurons
• As already mentioned, biological neural networks are densely interconnected. This
also happens with artificial neural networks.
• According to the flow of the signals within an ANN, we can divide the architectures
into:
• feedforward networks, if the signals flow only from input to output,
• recurrent networks, if feedback loops are allowed.
• According to the existence of hidden neurons, we can have:
• multilayer NNs, if there are hidden neurons,
• single-layer NNs, if no hidden neurons (and therefore no hidden layers) exist.
• Finally, NNs are:
• fully connected, if every neuron in one layer is connected to every neuron in the layer immediately above,
• partially connected, otherwise.
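The distinction between a single-layer and a multilayer feedforward network can be sketched in code. This is an illustrative example, not from the slides: the layer sizes, the sigmoid activation, and the function names are my own choices.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, weights, biases):
    """Propagate input x through successive fully connected layers."""
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)  # signals flow only from input to output
    return a

rng = np.random.default_rng(0)
# Single-layer network: 3 inputs connect directly to 2 output neurons.
W1 = [rng.normal(size=(2, 3))]
b1 = [np.zeros(2)]
# Multilayer network: one hidden layer of 4 neurons between inputs and outputs.
W2 = [rng.normal(size=(4, 3)), rng.normal(size=(2, 4))]
b2 = [np.zeros(4), np.zeros(2)]

x = np.array([0.5, -1.0, 2.0])
print(forward(x, W1, b1).shape)  # (2,)
print(forward(x, W2, b2).shape)  # (2,)
```

A recurrent network would differ only in allowing a layer's output to feed back into an earlier layer, which this purely feedforward loop does not do.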
Interconnecting neurons
[Diagrams: single-layer feedforward network (input layer → output layer); multilayer network with a hidden layer; recurrent network]
Learning
• In ANNs, learning is the process of modifying the weights of the NN so that a desired goal is reached.
• Learning can be classified with respect to the learning mechanism, with respect to when the weight modification takes place, and according to the manner in which the adjustment is performed.
Learning mechanism
• Learning can be divided into three classes:
• supervised learning - this scheme assumes that the network is used as an input-output system. For each input pattern there is a desired output, or target, and a cost function is used to update the parameters;
• reinforcement learning - as in supervised learning, a cost function exists, but it is only given to the network from time to time. The network does not receive a teaching signal at every training pattern, only a score that tells it how it performed over a training sequence;
• unsupervised learning - in this case there is no teaching signal from the environment.
Supervised learning
• Associated with an input matrix, I, there is a matrix of desired outputs, or teacher signals, T. The dimensions of these two matrices are m×ni and m×no, respectively, where m is the number of patterns in the training set, ni is the number of network inputs, and no is the number of outputs.
• The aim of the learning phase is to find values of the weights and biases in the network such that, using I as input data, the corresponding output matrix, O, is as close as possible to T.
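The matrices above can be made concrete with a small sketch. The sizes and the single-layer linear model O = I·W + b are illustrative assumptions of mine, chosen only to show the shapes involved.

```python
import numpy as np

m, ni, no = 5, 3, 2           # patterns, inputs, outputs (example sizes)
rng = np.random.default_rng(1)
I = rng.normal(size=(m, ni))  # input matrix, one pattern per row
T = rng.normal(size=(m, no))  # target (teacher) matrix

# Assumed single-layer linear model, just for illustration.
W = rng.normal(size=(ni, no))
b = np.zeros(no)
O = I @ W + b                 # network outputs, shape m x no
E = T - O                     # error matrix, same shape as T
print(O.shape, E.shape)       # (5, 2) (5, 2)
```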
Supervised learning (cont)
• The most commonly used minimization criterion is

Ω = ½ tr(EᵀE)

• where tr(·) denotes the trace operator and E is the error matrix, defined as

E = T − O
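A quick numerical check, with a made-up error matrix of my own, shows why this criterion is convenient: tr(EᵀE) is simply the sum of all squared entries of E, so Ω is half the sum of squared errors.

```python
import numpy as np

# Illustrative 2x2 error matrix (not from the slides).
E = np.array([[1.0, -2.0],
              [0.5,  3.0]])
omega_trace = 0.5 * np.trace(E.T @ E)  # trace form of the criterion
omega_sse   = 0.5 * np.sum(E**2)       # half the sum of squared errors
print(omega_trace, omega_sse)          # 7.125 7.125
```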
• There are two major classes within supervised learning:
• Gradient descent learning: these methods reduce the cost criterion through the use of gradient methods. In this case the activation functions must be differentiable. Typically, the update mechanism is

w[k+1] = w[k] − η g[k]

where g[k] is the gradient of the criterion with respect to the weights and η is the learning rate.
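Gradient descent on the trace criterion can be sketched for an assumed single-layer linear network O = I·W (a simplification of mine; the data, learning rate, and iteration count are illustrative). For this model the gradient of ½ tr(EᵀE) with respect to W is −IᵀE, so the descent update is W ← W + η·IᵀE.

```python
import numpy as np

rng = np.random.default_rng(2)
m, ni, no = 20, 3, 2
I = rng.normal(size=(m, ni))
W_true = rng.normal(size=(ni, no))
T = I @ W_true                # targets generated by a known linear model

W = np.zeros((ni, no))
eta = 0.01                    # learning rate (illustrative value)
for _ in range(2000):
    E = T - I @ W             # error matrix for the current weights
    W += eta * I.T @ E        # gradient-descent weight update

print(0.5 * np.trace(E.T @ E))  # criterion driven close to zero
```

Because the targets here are exactly realizable by the model, the criterion can be driven arbitrarily close to zero; with noisy data it would instead settle at a minimum.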