
Neural Networks and Deep Learning
António Ruano
aruano@ualg.pt
Characteristics of Neural Networks
• Neural network models can be broadly characterized by
• the models employed for the individual neurons,
• the way these neurons are interconnected,
• and the learning mechanism(s) employed.
Interconnecting neurons
• As already mentioned, biological neural networks are densely interconnected. This
also happens with artificial neural networks.
• According to the flow of the signals within an ANN, we can divide the architectures
into:
• feedforward networks, if the signals flow just from input to output,
• recurrent networks, if loops are allowed.
• According to the existence of hidden neurons, we can have:
• multilayer NNs, if there are hidden neurons,
• single-layer NNs, if no hidden neurons (and hence no hidden layers) exist.
• Finally, NNs are:
• fully connected, if every neuron in one layer is connected to every neuron in the layer immediately above,
• partially connected, if not.
Interconnecting neurons
• Single-layer feedforward network [figure: input layer → output layer]
• Multilayer feedforward network ([3 2 1]) [figure: input layer → hidden layer → output layer]
• Recurrent network [figure]
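A minimal sketch (assuming NumPy and a sigmoid activation, neither of which is specified on the slide) of a forward pass through the fully connected [3 2 1] multilayer feedforward network above: 3 inputs, one hidden layer of 2 neurons, and 1 output neuron.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 2))   # input-to-hidden weights
b1 = np.zeros(2)               # hidden-layer biases
W2 = rng.normal(size=(2, 1))   # hidden-to-output weights
b2 = np.zeros(1)               # output bias

x = np.array([0.5, -1.0, 2.0])   # one input pattern (3 inputs)
h = sigmoid(x @ W1 + b1)         # hidden-layer activations
o = sigmoid(h @ W2 + b2)         # network output
print(o)

In a recurrent network, by contrast, h would also depend on activations from a previous step, creating the loops mentioned above.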
Learning
• Learning in ANNs is the process of modifying the weights of the NN so that a desired goal is reached.
• Learning can be classified with respect to the learning mechanism, to when the weight modification takes place, and to the manner in which the adjustment is performed.
Learning mechanism
• Learning can be divided into three classes:
• supervised learning - this scheme assumes that the network is used as an input-output system. For each input pattern there is a desired output, or target, and a cost function is used to update the parameters;
• reinforcement learning - in contrast with supervised learning, the cost function is only given to the network from time to time. The network does not receive a teaching signal at every training pattern, but only a score that tells it how it performed over a training sequence;
• unsupervised learning - in this case there is no teaching signal from the
environment.
Supervised learning
• Associated with some input matrix, I, there is a matrix of desired outputs, or teacher signals, T. The dimensions of these two matrices are m×ni and m×no, respectively, where m is the number of patterns in the training set, ni is the number of inputs of the network and no is the number of outputs.
• The aim of the learning phase is to find values of the weights and
biases in the network such that, using I as input data, the
corresponding output values, O, are as close as possible to T.
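A minimal sketch (NumPy assumed; the linear network_output model is a hypothetical placeholder, only there to make the shapes concrete) of the data layout described above:

import numpy as np

m, ni, no = 4, 3, 2                                 # patterns, inputs, outputs
I = np.random.default_rng(1).normal(size=(m, ni))   # input matrix, m x ni
T = np.ones((m, no))                                # target matrix, m x no

def network_output(I, W):
    return I @ W            # placeholder linear model; W is ni x no

O = network_output(I, np.zeros((ni, no)))   # network outputs, m x no
assert O.shape == T.shape                   # O is compared with T element-wise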
Supervised learning (cont)
• The most commonly used minimization criterion is:
Ω = tr(EᵀE)
• where tr(·) denotes the trace operator and E is the error matrix defined as:
E = T − O
• There are two major classes within supervised learning:
• Gradient descent learning: these methods reduce a cost criterion through the use of gradient methods, so the activation functions must be differentiable. Typically, the update mechanism is:
W(k+1) = W(k) − η ∂Ω/∂W
• forced Hebbian or correlative learning: this is a variant of Hebbian learning. Suppose that you have m input patterns, and consider the pth input pattern, Ip,·. The corresponding target output pattern is Tp,·. Then the weight matrix W is computed as the sum of m superimposed pattern matrices Wp, where each is computed as a correlation matrix:
Wp = Ip,·ᵀ Tp,· ,  W = Σp Wp
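A minimal sketch (NumPy assumed; a linear single-layer model is used so that the gradient of Ω has a closed form, and η = 0.01 is an arbitrary choice) of both update rules:

import numpy as np

rng = np.random.default_rng(2)
m, ni, no = 8, 3, 2
I = rng.normal(size=(m, ni))    # input patterns
T = rng.normal(size=(m, no))    # target patterns

# Gradient descent on Omega = tr(E'E), with E = T - I W for the linear model:
W = np.zeros((ni, no))
eta = 0.01                      # learning rate
for _ in range(100):
    E = T - I @ W               # error matrix, m x no
    grad = -2.0 * I.T @ E       # dOmega/dW for this model
    W -= eta * grad             # W(k+1) = W(k) - eta * gradient

# Forced Hebbian / correlative learning: sum of per-pattern correlations.
W_hebb = sum(np.outer(I[p], T[p]) for p in range(m))   # Wp = Ip' Tp, summed
assert np.allclose(W_hebb, I.T @ T)                    # equivalent matrix form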
Unsupervised learning
• The two most important classes of unsupervised learning are:
• Hebbian learning - this type of learning mechanism, due to Donald Hebb, updates each weight locally, according to the states of activation of its input and output. Typically, if both coincide, the strength of the weight is increased; otherwise, it is decreased;
• competitive learning - in this type of learning an input pattern is
presented to the network and the neurons compete among themselves.
The processing element (or elements) that emerge as winners of the
competition are allowed to modify their weights (or modify their
weights in a different way from those of the non-winning neurons).
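A minimal sketch (NumPy assumed; Euclidean distance as the competition criterion and η = 0.1 are illustrative choices) of winner-take-all competitive learning, where only the winning neuron adapts:

import numpy as np

rng = np.random.default_rng(3)
n_neurons, ni = 4, 3
W = rng.normal(size=(n_neurons, ni))   # one weight vector per neuron
eta = 0.1                              # learning rate

def competitive_step(W, x, eta):
    distances = np.linalg.norm(W - x, axis=1)   # each neuron's distance to x
    winner = np.argmin(distances)               # the closest neuron wins
    W[winner] += eta * (x - W[winner])          # only the winner moves towards x
    return winner

x = rng.normal(size=ni)   # one input pattern
print(competitive_step(W, x, eta))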
Off-line and on-line; deterministic or
stochastic learning
• If the weight adjustment is only performed after a complete set of
patterns has been presented to the network, we denote this process as
off-line learning, epoch learning, batch learning or training.
• On the other hand, if a weight update takes place (or can take place) upon the presentation of each pattern, we are in the presence of on-line learning, instantaneous learning, pattern learning, or adaptation.
• In all the techniques described so far, the weight adjustment is
deterministic. In some networks, like the Boltzmann Machine, the states
of the units are determined by a probabilistic distribution. As learning
depends on the states of the units, the learning process is stochastic.
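A minimal sketch (NumPy assumed, with the same hypothetical linear model as before) contrasting off-line (batch) and on-line (pattern) deterministic weight updates:

import numpy as np

rng = np.random.default_rng(4)
m, ni, no = 8, 3, 2
I, T = rng.normal(size=(m, ni)), rng.normal(size=(m, no))
eta = 0.01

# Off-line / batch / epoch learning: one update after the whole pattern set.
W_batch = np.zeros((ni, no))
E = T - I @ W_batch                    # errors for all m patterns
W_batch += eta * I.T @ E               # single update per epoch

# On-line / pattern learning: one update after every presented pattern.
W_online = np.zeros((ni, no))
for p in range(m):
    e = T[p] - I[p] @ W_online         # error for pattern p only
    W_online += eta * np.outer(I[p], e)

After one epoch the two weight matrices generally differ, because each on-line step uses weights already changed by the preceding patterns.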
Applications
• ANNs have been applied to almost every field of science. Examples:
• Combinatorial problems
• Content Addressable Memories
• Data Compression
• Modelling
• Forecasting
• Control
• Pattern recognition
• NILM (non-intrusive load monitoring)
