
Artificial Neural Networks
Agenda

• Neural Networks
• Neural Network Learning
• Network Architectures
• Dimensions of a Neural Network
• Backpropagation Training Algorithm
• Over-Training Prevention
• Successful Applications
Neural Networks
• A NN is a machine learning approach inspired by the way in which the brain performs a particular learning task:
– Knowledge about the learning task is given in the form of examples.
– Inter-neuron connection strengths (weights) are used to store the acquired information (the training examples).
– During the learning process the weights are modified in order to model the particular learning task correctly on the training examples.
Learning
• Supervised Learning
– Recognizing hand-written digits, pattern recognition,
regression.
– Labeled examples
(input, desired output)
– Neural Network models: perceptron, feed-forward, radial
basis function, support vector machine.
• Unsupervised Learning
– Find similar groups of documents in the web, content
addressable memory, clustering.
– Unlabeled examples
(different realizations of the input alone)
– Neural Network models: self-organizing maps, Hopfield
networks.
Network architectures

• Three different classes of network architectures:
– single-layer feed-forward
– multi-layer feed-forward
– recurrent
In the two feed-forward classes, neurons are organized in acyclic layers.
• The architecture of a neural network is linked with the learning algorithm used to train it.
Single Layer Feed-forward

[Diagram: an input layer of source nodes projecting onto an output layer of neurons]
Multi-layer feed-forward

[Diagram: a 3-4-2 network, with an input layer of 3 source nodes, a hidden layer of 4 neurons, and an output layer of 2 neurons]
Recurrent network
• Recurrent network with hidden neuron(s): the unit-delay operator z⁻¹ implies a dynamic system.

[Diagram: input, hidden, and output units connected with z⁻¹ unit-delay feedback loops]
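As an illustration of the unit-delay idea, here is a minimal Python/NumPy sketch (the names recurrent_step, W_in, and W_rec are illustrative, and tanh is just one common choice of activation): the output of the z⁻¹ operator, i.e. the previous hidden state, is fed back as an extra input at each time step.

```python
import numpy as np

def recurrent_step(x_t, h_prev, W_in, W_rec, b):
    """One step of a simple recurrent network: the previous hidden state
    h_prev (the z^-1 delayed signal) is fed back alongside the new input."""
    return np.tanh(W_in @ x_t + W_rec @ h_prev + b)

# Unrolling over a sequence: the hidden state carries the dynamics.
rng = np.random.default_rng(0)
W_in, W_rec, b = rng.normal(size=(3, 2)), rng.normal(size=(3, 3)), np.zeros(3)
h = np.zeros(3)
for x_t in rng.normal(size=(5, 2)):      # a sequence of 5 two-dim inputs
    h = recurrent_step(x_t, h, W_in, W_rec, b)
```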
Neural Network Architectures

[Summary diagram of the network architectures above]
Real Neurons

• Cell structures
– Cell body
– Dendrites
– Axon
– Synaptic terminals

Real Neural Learning

• Synapses change size and strength with experience.
• Hebbian learning: when two connected neurons are firing at the same time, the strength of the synapse between them increases.
• “Neurons that fire together, wire together.”
The Artificial Neuron
• The neuron is the basic information processing unit of a NN. It consists of:
1. A set of synapses or connecting links, each link characterized by a weight: $w_1, w_2, \ldots, w_m$
2. An adder function (linear combiner) which computes the weighted sum of the inputs: $u = \sum_{j=1}^{m} w_j x_j$
3. An activation function (squashing function) $\varphi$ for limiting the amplitude of the output of the neuron: $y = \varphi(u + b)$
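A minimal Python/NumPy sketch of this computation (the sigmoid below is one common choice of squashing function φ; the function name is illustrative):

```python
import numpy as np

def neuron_output(x, w, b):
    """y = phi(u + b), where u is the weighted sum computed by the adder."""
    u = np.dot(w, x)                     # adder: u = sum_j w_j * x_j
    v = u + b                            # induced local field
    return 1.0 / (1.0 + np.exp(-v))      # sigmoid squashing function phi

# Example: a neuron with m = 3 inputs
x = np.array([0.5, -1.0, 2.0])           # input signals x1..x3
w = np.array([0.4, 0.3, -0.2])           # synaptic weights w1..w3
print(neuron_output(x, w, b=0.1))        # output y
```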
The Artificial Neuron
[Diagram: input signals x₁ … xₘ, weighted by synaptic weights w₁ … wₘ, feed a summing function; together with the bias b this produces the local field v, which passes through the activation function φ(·) to give the output y]
Bias of a Neuron
• The bias b has the effect of applying an affine transformation to u: $v = u + b$
• v is the induced local field of the neuron, where $u = \sum_{j=1}^{m} w_j x_j$
Bias as extra input
• The bias is an external parameter of the neuron. It can be modeled by adding an extra input: $v = \sum_{j=0}^{m} w_j x_j$ with $x_0 = +1$ and $w_0 = b$ (see the sketch below).

[Diagram: the same neuron as before, with an extra input x₀ = +1 carrying the weight w₀ = b into the summing function]
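A small sketch confirming the two formulations agree, continuing the NumPy example style used above (all values are made up for illustration):

```python
import numpy as np

x = np.array([0.5, -1.0, 2.0])           # original inputs x1..xm
w = np.array([0.4, 0.3, -0.2])           # original weights w1..wm
b = 0.1                                  # bias

x_aug = np.concatenate(([1.0], x))       # prepend x0 = +1
w_aug = np.concatenate(([b], w))         # prepend w0 = b

# v computed with the extra input equals u + b from the previous slide.
assert np.isclose(w_aug @ x_aug, w @ x + b)
```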
Dimensions of a Neural Network
• Various types of neurons

• Various network architectures

• Various learning algorithms

• Various applications

Backpropagation Training Algorithm

• Create the 3-layer network with H hidden units, with full connectivity between layers. Set weights to small random real values.
• Until all training examples produce the correct value (within ε), mean squared error ceases to decrease, or another termination criterion is met:

Begin epoch
For each training example d:
Calculate the network output for d’s input values
Compute the error between the current output and the correct output for d
Update the weights by backpropagating the error using the learning rule
End epoch
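Below is a minimal runnable sketch of this procedure in Python/NumPy, assuming sigmoid units, online (per-example) weight updates, and the squared-error learning rule; the helper names (forward, train_one_epoch) are illustrative, not from any library, and are reused in the sketches on the following slides. XOR serves as a toy training set.

```python
import numpy as np

def sigmoid(v):
    return 1.0 / (1.0 + np.exp(-v))

def forward(W1, W2, x):
    """Forward pass through a 3-layer sigmoid network (x0 = +1 bias input)."""
    x1 = np.concatenate(([1.0], x))          # input with bias term
    h = sigmoid(W1 @ x1)                     # hidden activations
    h1 = np.concatenate(([1.0], h))          # hidden layer with bias term
    return x1, h, h1, sigmoid(W2 @ h1)       # ..., network output y

def train_one_epoch(W1, W2, X, T, lr=0.5):
    """One backpropagation epoch (online updates); returns weights and MSE."""
    sq_err = 0.0
    for x, t in zip(X, T):
        x1, h, h1, y = forward(W1, W2, x)
        delta_out = (t - y) * y * (1 - y)                   # output error term
        delta_hid = h * (1 - h) * (W2[:, 1:].T @ delta_out) # backpropagated
        W2 = W2 + lr * np.outer(delta_out, h1)              # learning rule
        W1 = W1 + lr * np.outer(delta_hid, x1)
        sq_err += np.sum((t - y) ** 2)
    return W1, W2, sq_err / len(X)

# Create the 3-layer network: H hidden units, small random weights.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)  # toy task: XOR
T = np.array([[0], [1], [1], [0]], dtype=float)
H = 4
rng = np.random.default_rng(0)
W1 = rng.uniform(-0.1, 0.1, (H, X.shape[1] + 1))
W2 = rng.uniform(-0.1, 0.1, (T.shape[1], H + 1))
for epoch in range(10000):                   # terminate on low MSE or max epochs
    W1, W2, mse = train_one_epoch(W1, W2, X, T)
    if mse < 0.01:
        break
```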
Comments on Training Algorithm
• Not guaranteed to converge to zero training error; may converge to local optima or oscillate indefinitely.
• However, in practice it does converge to low error for many large networks on real data.
• Many epochs (thousands) may be required; hours or days of training for large networks.
• To avoid local-minima problems, run several trials starting with different random weights (random restarts), as sketched below:
– Take the results of the trial with the lowest training-set error.
– Build a committee of results from multiple trials (possibly weighting votes by training-set accuracy).
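A sketch of the random-restarts idea, reusing train_one_epoch from the backpropagation sketch above (run_trial is a hypothetical helper):

```python
def run_trial(seed, X, T, H=4, epochs=5000, lr=0.5):
    """Train from one random initialization; return final training MSE and weights."""
    rng = np.random.default_rng(seed)
    W1 = rng.uniform(-0.1, 0.1, (H, X.shape[1] + 1))
    W2 = rng.uniform(-0.1, 0.1, (T.shape[1], H + 1))
    mse = float("inf")
    for _ in range(epochs):
        W1, W2, mse = train_one_epoch(W1, W2, X, T, lr)
    return mse, W1, W2

# Ten trials from different random weights; keep the lowest training error.
trials = [run_trial(seed, X, T) for seed in range(10)]
best_mse, W1, W2 = min(trials, key=lambda tr: tr[0])
```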
Over-Training Prevention
• Running too many epochs can result in over-fitting.

[Plot: error versus number of training epochs; training error keeps decreasing while test error eventually rises]

• Keep a hold-out validation set and test accuracy on it after every epoch. Stop training when additional epochs actually increase validation error (sketched below).
• To avoid losing training data to validation:
– Use internal 10-fold CV on the training set to compute the average number of epochs that maximizes generalization accuracy.
– Train the final network on the complete training set for this many epochs.
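A sketch of this early-stopping rule, reusing forward and train_one_epoch from the backpropagation sketch; X_tr/T_tr and X_val/T_val stand for a hypothetical split of the available training data, and W1/W2 start as freshly initialized weights:

```python
def validation_error(W1, W2, X_val, T_val):
    """Mean squared error on the held-out validation set."""
    return sum(np.sum((t - forward(W1, W2, x)[3]) ** 2)
               for x, t in zip(X_val, T_val)) / len(X_val)

best_err, best_weights = float("inf"), None
for epoch in range(1000):
    W1, W2, _ = train_one_epoch(W1, W2, X_tr, T_tr)
    err = validation_error(W1, W2, X_val, T_val)
    if err < best_err:                       # keep the best-so-far weights
        best_err, best_weights = err, (W1.copy(), W2.copy())
W1, W2 = best_weights                        # roll back to that epoch
```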
Determining the Best Number of Hidden Units

• Too few hidden units prevent the network from adequately fitting the data.
• Too many hidden units can result in over-fitting.

[Plot: error versus number of hidden units; training error keeps decreasing while test error eventually rises]

• Use internal cross-validation to empirically determine an optimal number of hidden units (sketched below).
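A sketch of choosing H this way, reusing train_one_epoch and validation_error from the earlier sketches; for brevity it uses a single hold-out split (the hypothetical X_tr/T_tr and X_val/T_val) rather than full 10-fold cross-validation:

```python
def holdout_error(H, X_tr, T_tr, X_val, T_val, epochs=2000, seed=0):
    """Train a network with H hidden units; return its validation error."""
    rng = np.random.default_rng(seed)
    W1 = rng.uniform(-0.1, 0.1, (H, X_tr.shape[1] + 1))
    W2 = rng.uniform(-0.1, 0.1, (T_tr.shape[1], H + 1))
    for _ in range(epochs):
        W1, W2, _ = train_one_epoch(W1, W2, X_tr, T_tr)
    return validation_error(W1, W2, X_val, T_val)

# Pick the candidate H with the lowest held-out error.
best_H = min([2, 4, 8, 16],
             key=lambda H: holdout_error(H, X_tr, T_tr, X_val, T_val))
```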
Successful Applications
• Text to Speech (NetTalk)
• Fraud detection
• Financial Applications
– HNC (eventually bought by Fair Isaac)
• Chemical Plant Control
– Pavillion Technologies
• Automated Vehicles
• Game Playing
– Neurogammon
• Handwriting recognition
Thanks ☺
