Neural Networks - Presentation
NEURAL NETWORKS
Data Mining PhD Seminar 31 October 2011
Gabriela Sava
Mathematical definition
A neural network is a directed graph G = (V, A), with vertices V and arcs A, with the following restrictions:
1. V is partitioned into a set of input nodes, hidden nodes and output nodes
2. The vertices are partitioned into layers 1, ..., k, with all input nodes in layer 1 and all output nodes in layer k. The hidden nodes are in the layers in between, 2 to k-1, the hidden layers
3. Any arc <i, j> must have node i in some layer h-1 and node j in layer h
4. Any arc <i, j> is labeled with a numeric value w_ij, called a weight
5. Any node i is labeled with an activation function f_i
Characteristics
Terminology
Functions
Sigmoid - an S-shaped curve with output values between -1 and 1 (or 0 and 1), which is monotonically increasing. Although there are several types of sigmoid function, a common one is the logistic function:
f(x) = 1 / (1 + e^(-x))
Gaussian - a bell-shaped curve with output values between 0 and 1
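The two activation shapes above can be sketched as follows; this is a minimal illustration, and the Gaussian width parameter sigma is an assumption, not fixed by the slides:

```python
import math

def logistic(x):
    """Logistic sigmoid: S-shaped, monotonically increasing, output in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def gaussian(x, sigma=1.0):
    """Gaussian activation: bell-shaped curve centered at 0, output in (0, 1]."""
    return math.exp(-(x * x) / (2.0 * sigma * sigma))

print(logistic(0.0))   # 0.5 (midpoint of the S-curve)
print(gaussian(0.0))   # 1.0 (peak of the bell)
```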
Design issues
Architecture
Feed-forward networks - connections go only to layers later in the structure; the signal propagates in one direction only, and the nodes in each layer use the values produced by the previous layer as input values
Classification
Unsupervised Learning Models
The learning process does not require an example of the desired output (in most models the target output is the same as the input)
Objective: to categorize or discover features or patterns in the training data
Used in a wide variety of fields under different names; the best known is cluster analysis
The most common variety is Hebbian learning; dimensionality reduction is another typical use
For more about the Hebb laws: T. Kohonen, Self-Organizing Maps, 3rd edition, Springer, 2001, pp. 91-96
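The Hebb rule ("neurons that fire together wire together") can be sketched as a simple outer-product weight update; the learning rate and vector shapes below are illustrative assumptions:

```python
# Plain Hebbian update: delta_w[i][j] = eta * x[i] * y[j]
def hebbian_update(w, x, y, eta=0.1):
    """w: weight matrix with len(x) rows and len(y) columns; returns an updated copy."""
    return [[w[i][j] + eta * x[i] * y[j] for j in range(len(y))]
            for i in range(len(x))]

w = [[0.0, 0.0], [0.0, 0.0]]
w = hebbian_update(w, x=[1.0, 0.0], y=[0.0, 1.0])
# only the weight between the co-active input 0 and output 1 grows
print(w)  # [[0.0, 0.1], [0.0, 0.0]]
```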
Feedback Nets
Brain-State-in-a-Box (BSB)
Fuzzy Cognitive Map (FCM)
Boltzmann Machine (BM)
Backpropagation Through Time (BPTT)
Real-Time Recurrent Learning (RTRL)
Feed-forward Nets
Perceptron
Backpropagation (BP)
Adaptive Logic Network (ALN)
Learning Vector Quantization (LVQ)
Probabilistic Neural Networks (PNN)
Backpropagation Neural
Network
Network Architecture
Input layer input variables which use
the linear transformation function
Hidden layer represents the interaction
among the input nodes; uses the
sigmoid transformation
function
Output layer
represents the output
variables
Algorithm
Learning Process
1. Set the parameters of the network
2. Set the uniform random values for:
- the weights matrix between the input
layer and the hidden layer
- the weights matrix between the hidden
layer and the output layer
- the bias vector in the hidden layer
- the bias vector in the output layer
3. Obtain an input training vector X and
the desired output vector T
4. Calculate the output vector Y as follows:
4.a Calculate the output of the hidden layer
4.b Calculate the output of the output layer, the vector Y
5. Calculate the sensitivity value (output
error)
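Steps 2-5 above can be sketched for a single training example as follows; the layer sizes, logistic activations, and the squared-error form are illustrative assumptions, not fixed by the slides:

```python
import math, random

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

random.seed(0)
n_in, n_hid, n_out = 2, 3, 1

# Step 2: uniform random weight matrices and bias vectors
W_xh = [[random.uniform(-1, 1) for _ in range(n_hid)] for _ in range(n_in)]
W_hy = [[random.uniform(-1, 1) for _ in range(n_out)] for _ in range(n_hid)]
b_h = [random.uniform(-1, 1) for _ in range(n_hid)]
b_y = [random.uniform(-1, 1) for _ in range(n_out)]

# Step 3: an input training vector X and the desired output vector T
X, T = [1.0, 0.0], [1.0]

# Step 4: forward pass, hidden layer first, then the output vector Y
H = [logistic(sum(X[i] * W_xh[i][j] for i in range(n_in)) + b_h[j])
     for j in range(n_hid)]
Y = [logistic(sum(H[j] * W_hy[j][k] for j in range(n_hid)) + b_y[k])
     for k in range(n_out)]

# Step 5: sensitivity value (output error)
error = 0.5 * sum((T[k] - Y[k]) ** 2 for k in range(n_out))
print(Y, error)
```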
Recall Process
1. Set the network parameters
2. Read in the weight matrices and the bias vectors
3. Read in the test vector X
4. Calculate the output vector Y as follows:
4.a Calculate the output of the hidden layer
4.b Calculate the output of the output layer, the vector Y
Limitations
Local minima - occur because the algorithm always changes the weights in such a way as to make the error fall, but the error might briefly have to rise as part of a more general fall; in this case the algorithm gets stuck (because it cannot go uphill) and the error will not decrease further
Solution:
Reset the weights and start the training again
with other random values
XOR problem
Input A  Input B  Output F
   0        0        0
   0        1        1
   1        0        1
   1        1        0
XOR representation
Solving XOR
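XOR is not linearly separable, so a single-layer perceptron cannot compute it; one hidden layer solves it. A hand-wired sketch with threshold units (the particular weights below are one illustrative solution, not the only one):

```python
def step(v):
    """Threshold (Heaviside) activation."""
    return 1 if v > 0 else 0

def xor_net(a, b):
    h_or  = step(a + b - 0.5)        # hidden unit computing A OR B
    h_and = step(a + b - 1.5)        # hidden unit computing A AND B
    return step(h_or - h_and - 0.5)  # output: OR but not AND, i.e. XOR

for a in (0, 1):
    for b in (0, 1):
        print(a, b, xor_net(a, b))  # 0 0 0 / 0 1 1 / 1 0 1 / 1 1 0
```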
Examples
Pattern classification
Adaptive control
Noise filtering
Data compression
Expert systems
Probabilistic Neural Network (PNN)
Strengths:
High computation capability - saves time and effort when working with huge databases and improves the accuracy of the computation results
Learning - a PNN is a dynamic system which can learn quickly from the data source; the decision boundaries can also be updated in real time as new data becomes available
Fault tolerance - damage to the connections only slightly decreases functionality; incomplete or noisy input information will not stop the network from processing
Weaknesses:
Large memory requirements - because the information is stored in matrix form, and as the number of training samples increases, the matrices become very large
Slower recall process - due to the processing of the large matrices
Network Architecture
Input layer - receives the vector which needs to be classified
Pattern layer - has one neuron for each training vector sample
Summation layer - has one neuron for each population class
Output layer - a threshold discriminator which selects the summation neuron with the maximum output
Basic concepts
Assume k possible classifications: C_1, ..., C_k
Classification rule: the input vector X is assigned to the class whose discriminant function gives the maximum value
The probability of classifying the input vector into each class is determined by a function which has a Gaussian form:
f_k(X) = 1 / ((2*pi)^(m/2) * sigma^m * n_k) * sum_{j=1..n_k} exp( -||X - X_kj||^2 / (2*sigma^2) )
where:
X - input vector
n_k - total number of training patterns for category k
j - pattern number
m - space dimension
sigma - smoothing parameter
X_kj - j-th training pattern for category k
Algorithm
Learning Process
1. Use random numbers to initialize the
original network weights and set the
smoothing parameter
2. Input the vector X of the training
sample and the target vector T
3. Set the weight matrices:
3.a matrix W_xh is between the input layer
and the hidden layer
3.b matrix W_hy is between the hidden
layer and the output layer
Recall process
1. Set the smoothing parameter sigma by an educated guess based on knowledge of the data, or by using a heuristic technique (e.g. jackknifing)
2. Read the matrices W_xh and W_hy
3. Input the vector X of one of the testing
examples
4. Compute the deductive output vector Y
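The recall steps above amount to evaluating the Gaussian kernel sum per class and picking the maximum. A minimal sketch of that decision rule (the training data and smoothing parameter are made-up illustrations, and the (2*pi)^(m/2)*sigma^m normalization is dropped since it is the same for every class):

```python
import math

def pnn_classify(x, classes, sigma=0.5):
    """classes: dict mapping class label -> list of training vectors.
    Returns the label whose averaged Gaussian kernel response is largest."""
    best, best_score = None, -1.0
    for label, patterns in classes.items():
        score = sum(
            math.exp(-sum((xi - pi) ** 2 for xi, pi in zip(x, p))
                     / (2 * sigma ** 2))
            for p in patterns) / len(patterns)
        if score > best_score:
            best, best_score = label, score
    return best

training = {"A": [[0.0, 0.0], [0.1, 0.1]], "B": [[1.0, 1.0], [0.9, 1.1]]}
print(pnn_classify([0.2, 0.1], training))  # A
print(pnn_classify([0.8, 0.9], training))  # B
```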
Examples
document classification
information retrieval
image processing
decision support systems
engineering
gaming
law
Kohonen Networks
Self-Organising Maps (SOM)
It is a self-organizing network - the correct output cannot be defined a priori, and therefore a numerical measure of the magnitude of the mapping error cannot be used
Main characteristic - it transforms the input space into a 1-D or 2-D discrete map (for visualization and dimensionality reduction) in a topology-preserving way (neighboring neurons respond to similar input patterns)
For more details see: T. Kohonen, Self-Organizing Maps, 3rd edition, Springer, 2001, ch. 3, 4 and 5
Network Architecture
Competition
Each neuron in a SOM is assigned a
weight vector with the same
dimensionality as the input space
Any given input pattern is compared to
the weight vector of each neuron and
the closest neuron is declared the winner
The Euclidean norm is
commonly used to measure
distance
Cooperation
Mathematical implementation
The neighborhood function is shift-invariant
Its amplitude must decrease monotonically to zero with increasing lateral distance
Its width must be adjustable
Adaptation
During training, the winner neuron and
its topological neighbors are adapted to
make their weight vectors more similar
to the input pattern that caused the
activation
Neurons that are closer to the
winner will adapt more heavily
than neurons that are
further away
Mathematical implementation
Algorithm
1. Initialize the weights to some small, random values
2.a Select the next input pattern from the database
2.b Find the unit that best matches the input pattern
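One SOM training step - competition by Euclidean distance, then adaptation of the winner and its neighbors - might look like the sketch below; the 1-D map, learning rate, and Gaussian neighborhood width are illustrative assumptions:

```python
import math

def som_step(weights, x, eta=0.5, width=1.0):
    """weights: list of weight vectors of units on a 1-D map; x: input pattern.
    Moves each unit toward x, weighted by its map distance to the winner."""
    # Competition: winner = unit with smallest Euclidean distance to x
    dists = [math.dist(w, x) for w in weights]
    winner = dists.index(min(dists))
    # Cooperation + adaptation: Gaussian neighborhood around the winner,
    # so closer units adapt more heavily than distant ones
    for i, w in enumerate(weights):
        h = math.exp(-((i - winner) ** 2) / (2 * width ** 2))
        weights[i] = [wj + eta * h * (xj - wj) for wj, xj in zip(w, x)]
    return winner

weights = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
winner = som_step(weights, [0.9, 0.9])
print(winner)  # 2: the unit closest to the input wins
```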
Examples
Density estimation
Inverse kinematics
Hopfield Network
Network Architecture
A single-layered recurrent network
Every neuron receives feedback from all the others
The states of the neurons are binary: -1 and 1
The connections are symmetric
No self-connections
The information is stored in
fixpoint attractors
Algorithm
1. Train the network using a standard pattern
2. Update the state vector of the network according to the following thresholding rule (activation function):
s_i = +1 if sum_j w_ij * s_j >= 0, otherwise -1
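The storage and thresholding update above can be sketched as follows; storing a single pattern and updating all states synchronously are illustrative simplifications:

```python
def train_hopfield(patterns, n):
    """Hebbian storage: w_ij = sum over patterns of p_i * p_j, no self-connections."""
    return [[0 if i == j else sum(p[i] * p[j] for p in patterns)
             for j in range(n)] for i in range(n)]

def recall(W, state, steps=5):
    """Synchronous thresholding update: s_i = sign(sum_j w_ij * s_j)."""
    for _ in range(steps):
        state = [1 if sum(W[i][j] * state[j] for j in range(len(state))) >= 0
                 else -1 for i in range(len(state))]
    return state

stored = [1, -1, 1, -1]
W = train_hopfield([stored], 4)
noisy = [1, 1, 1, -1]        # stored pattern with one bit flipped
print(recall(W, noisy))      # [1, -1, 1, -1]: the fixpoint attractor is restored
```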
Examples
Pattern reconstruction
Limitations
Bidirectional Associative
Memory (BAM)
Network Architecture
The network has only two layers: input and output
The training vectors take values in {-1, 1}, and each is divided into two parts: the front part (input layer) and the rear part (output layer)
BAM rule: the network can remember the relationships from the front part to the rear part
Algorithm
Learning Process
1. Set the network parameters
2. Calculate the weight matrix W
Recall Process
1. Read the weight matrix W
2. Input a test vector X
3. Calculate the output vector Y:
4. Calculate the new input vector X from the output Y
5. Repeat steps 3 and 4 until the network converges - the output nodes are associated with the input ones
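The learning and recall steps above can be sketched as follows; the bipolar values, the single stored pair, and the tie-keeps-previous-state threshold are illustrative choices:

```python
def sign(v, prev):
    """Threshold; keep the previous state on a tie (v == 0)."""
    return 1 if v > 0 else (-1 if v < 0 else prev)

def bam_weights(pairs, n, m):
    """Learning: W = sum over pairs of outer product front x rear."""
    return [[sum(f[i] * r[j] for f, r in pairs) for j in range(m)]
            for i in range(n)]

def bam_recall(W, x, iters=5):
    """Recall: bounce the signal between the two layers until it settles."""
    y = [1] * len(W[0])
    for _ in range(iters):
        y = [sign(sum(W[i][j] * x[i] for i in range(len(x))), y[j])
             for j in range(len(y))]
        x = [sign(sum(W[i][j] * y[j] for j in range(len(y))), x[i])
             for i in range(len(x))]
    return x, y

front, rear = [1, -1, 1], [1, -1]
W = bam_weights([(front, rear)], 3, 2)
print(bam_recall(W, [1, -1, -1]))  # recovers ([1, -1, 1], [1, -1])
```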
Examples
Character recognition
Competitive Learning
Network architecture
Algorithm
1. Normalize all input patterns to get values between 0 and 1
2. Randomly select a pattern
2.a Find the winner neuron - the one with the maximum value given by the activation function
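One competitive (winner-take-all) step can be sketched as follows; using the dot product as the activation, unit-length normalization, and the learning rate are illustrative assumptions:

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def competitive_step(weights, pattern, eta=0.5):
    """Winner-take-all: only the most activated neuron adapts its weights."""
    x = normalize(pattern)
    # activation value = dot product between the input and each weight vector
    acts = [sum(wi * xi for wi, xi in zip(w, x)) for w in weights]
    winner = acts.index(max(acts))
    # move only the winner's weights toward the pattern, then renormalize
    weights[winner] = normalize(
        [wi + eta * (xi - wi) for wi, xi in zip(weights[winner], x)])
    return winner

weights = [normalize([1.0, 0.0]), normalize([0.0, 1.0])]
print(competitive_step(weights, [0.9, 0.1]))  # 0: the first neuron wins
```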
Examples
Advantages
Disadvantages