CS407 Neural Computation: Associative Memories and Discrete Hopfield Network. Lecturer: A/Prof. M. Bennamoun
Lecture 6:
Associative Memories and Discrete
Hopfield Network.
Fausett
Introduction…
To a significant extent, learning is the process of forming
associations between related patterns.
Aristotle observed that human memory connects items
(ideas, sensations, etc.) that are
– Similar
– Contrary
– In close (spatial) proximity
– In close (temporal) succession
The patterns we associate together may be
– of the same type or sensory modality (e.g. a visual
image may be associated with another visual image)
– or of different types (e.g. a fragrance may be
associated with a visual image or a feeling).
Memorization of a pattern (or a group of patterns) may be
considered to be associating the pattern with itself.
Introduction…
In this lecture, we will consider some relatively simple
(single layer) NNs that can learn a set of pattern pairs (or
associations).
An associative memory (AM) net may serve as a highly
simplified model of human memory; however, we shall not
address the question of whether such nets are realistic models.
AMs provide a way of storing and retrieving data based on
content rather than on a storage address (information in a NN
is distributed throughout the net's weights, so a pattern does
not have a storage address).
Content-addressable memory vs. address-addressable memory (see figure).
Introduction…
Each association is an input/output vector pair (s, f).
There are two types of associations. For two patterns s and f:
– If s = f, the net is called an auto-associative memory.
– If s ≠ f, the net is called a hetero-associative memory.
Introduction…
In each of these cases, the net not only learns the specific
pattern pairs that were used for training, but is also able to
recall the desired response pattern when given an input
stimulus that is similar, but not identical, to the training input.
Associative recall
– evoke associated patterns
– recall a pattern from part of it
– evoke/recall with incomplete or noisy patterns
Introduction…
Before training an AM NN, the original patterns must
be converted to an appropriate representation for
computation.
– Iterative hetero-associative nets
– Iterative auto-associative nets
Basic Concepts
B. Chen & Zurada
Retrieval: v = M[x]
Basic Concepts
M is a general matrix-type operator.
– Memory paradigms (taxonomy):
• Dynamic systems
• Static/feedforward systems
The memory realizes the mapping x(i) → v(i), with v(i) ≠ x(i), for i = 1, ..., p.
Basic Concepts
The most important classes of associative memories
are static and dynamic memories. The taxonomy is
entirely based on their recall principles.
Static Memory
– Recalls an output response after an input has been
applied in one feedforward pass, theoretically without delay.
– Non-recurrent (no feedback, no delay)

v^k = M1[ x^k ]

where k denotes the index of recursion and M1 is an operator symbol.
This equation represents a system of nonlinear algebraic equations.
Basic Concepts
Dynamic Memory
– Produces recall as a result of output/input feedback
interactions, which requires time
– Recurrent, time-delayed
– Evolves dynamically and finally converges to an
equilibrium state according to the recursive formula

v^(k+1) = M2[ x^k, v^k ]

The operator M2 acts at the present instant k on the present
input x^k and output v^k to produce the output at the next
instant k+1. Δ is a unit delay needed for the cyclic operation.
When the pattern is associated with itself (auto-association),
the net is a recurrent autoassociative memory.
Basic Concepts
The fig. below shows the block diagram of a recurrent
heteroassociative memory net.
Basic Concepts
Dynamic Memory
– Hopfield model: a recurrent autoassociative network for
which the input x^0 is used to initialize v^0, i.e. x^0 = v^0,
and the input is then removed for the following evolution

v^(k+1) = M2[ v^k ]
v^(k+1) = Γ[ W v^k ]     (nonlinear mapping)

The operator M2 consists of multiplication by a weight matrix
followed by the ensemble of nonlinear mapping operations
v_i = f(net_i) performed by the layer of neurons.
Basic Concepts
Γ = diag[ sgn(·), sgn(·), ..., sgn(·) ]

i.e. Γ is a diagonal nonlinear operator whose entries are the
sgn(·) function, so that

v_i = f(net_i) = sgn(net_i)
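A minimal sketch of this recurrence in Python/NumPy may help; W and the
initial state v0 are assumed to be given, and keeping the previous output
when the net input is exactly zero is an implementation convention, not
something prescribed by the slide:

import numpy as np

def recall_sync(W, v0, max_iters=100):
    """Synchronous recurrent recall: v^(k+1) = sgn(W v^k)."""
    v = np.asarray(v0, dtype=float)
    for _ in range(max_iters):
        net = W @ v
        # sgn with a convention: keep the previous value when net == 0
        v_new = np.where(net > 0, 1.0, np.where(net < 0, -1.0, v))
        if np.array_equal(v_new, v):   # fixed point (equilibrium state) reached
            return v_new
        v = v_new
    return v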
Basic Concepts
Equilibrium states
Regarding the vector v^(k+1) as the state of the network at the (k+1)-th
instant, we can consider the recurrent equation v^(k+1) = Γ[ W v^k ] as
defining a mapping of the vector v into itself.
An example state transition map for a memory network is shown in the
figure below.
Each node of the graph is equivalent to a state and has one and only one
edge leaving it.
If the transitions terminate with a state mapping into itself, as in the
case of node A, then A is the fixed-point solution (an equilibrium).
Linear Associative Memory
Belongs to the class of static networks
No nonlinear or delay operations
Dummy neurons with identity activation functions:

v_i = f(net_i) = net_i,   so   v = W x
Linear Associative Memory
Given p associations { s(i), f(i) }, i = 1, ..., p, where

s(i) = [ s1(i), s2(i), ..., sn(i) ]^t    — input stimuli (e.g. patterns)
f(i) = [ f1(i), f2(i), ..., fm(i) ]^t    — output forced responses
                                           (e.g. images of the input patterns)

Apply the Hebbian learning rule:

w'_ij = w_ij + f_i s_j        or, in matrix form,
W' = W + f s^t                (outer-product learning rule)

Starting from W0 = 0 and presenting all p pairs,

W' = Σ_{i=1}^p f(i) s(i)^t = F S^t,
with F = [ f(1) f(2) ... f(p) ]  and  S = [ s(1) s(2) ... s(p) ].

(This is the general Hebbian rule Δw_i = c · o · x = c f(w_i^t x) x,
or componentwise Δw_ij = c f(w_i^t x) x_j, with a linear activation.)
Linear Associative Memory
Perform the associative recall using s(j):

v = W' s(j) = Σ_{i=1}^p f(i) s(i)^t s(j)
  = f(1) s(1)^t s(j) + ... + f(j) s(j)^t s(j) + ... + f(p) s(p)^t s(j)

– If the desired response is v = f(j):
• Necessary conditions: the stimuli must be orthonormal, i.e.
  s(i)^t s(j) = 0  for i ≠ j,   and   s(j)^t s(j) = 1
• These conditions are rather strict and may not always hold.
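A short NumPy sketch of the storage and recall steps; the stimulus vectors
below are hypothetical (the standard basis, which is orthonormal by
construction), and the responses are the f-patterns used in the worked
example later in this lecture:

import numpy as np

# Hypothetical orthonormal stimuli: the standard basis vectors of R^4
S = np.eye(4)
# Responses f(1)..f(4) (taken from the worked example later in the lecture)
F = np.array([[ 1., -1., -1.],
              [ 1., -1.,  1.],
              [-1.,  1., -1.],
              [-1.,  1.,  1.]])

# Hebbian outer-product storage: W' = sum_i f(i) s(i)^t
W = sum(np.outer(F[i], S[i]) for i in range(len(S)))

# Recall with a stored stimulus: v = W' s(j) equals f(j) exactly,
# because the s(i) are orthonormal
j = 2
print(np.allclose(W @ S[j], F[j]))   # True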
Linear Associative Memory
If a distorted input is presented,

s(j)' = s(j) + Δ(j)

then

v = W' s(j)' = f(j) + f(j) s(j)^t Δ(j) + Σ_{i≠j} f(i) s(i)^t Δ(j)

If s(j) and Δ(j) are statistically independent, and thus orthogonal,

v = f(j) + Σ_{i≠j} f(i) s(i)^t Δ(j)

where the remaining sum is the cross-talk noise.

Numerical example:

v = M' s(2) = [ 2 1 3 ] [ 0 ]   [ 1 ]
              [ 1 2 1 ] [ 1 ] = [ 2 ]
              [ 2 3 2 ] [ 0 ]   [ 3 ]
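The decomposition of the recalled vector into the ideal response, the term
aligned with f(j), and the cross-talk term can be checked numerically. A
small sketch with made-up data (orthonormal stimuli and an arbitrary
distortion Δ):

import numpy as np

rng = np.random.default_rng(0)
S = np.eye(4)                               # orthonormal stimuli (hypothetical)
F = rng.choice([-1.0, 1.0], size=(4, 3))    # arbitrary bipolar responses
W = sum(np.outer(F[i], S[i]) for i in range(4))

j = 1
delta = np.array([0.2, 0.0, -0.1, 0.3])     # distortion added to s(j)
v = W @ (S[j] + delta)

# v = f(j) + f(j) s(j)^t delta + sum_{i != j} f(i) s(i)^t delta
crosstalk = sum(F[i] * (S[i] @ delta) for i in range(4) if i != j)
print(np.allclose(v, F[j] + F[j] * (S[j] @ delta) + crosstalk))   # True
print(crosstalk)   # the cross-talk noise contributed by the other pairs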
Dan St. Clair
Alternative View
Goal: Associate an input vector with a
specific output vector in a neural net
Weight Matrix
To store more than one association in a neural net
using Hebb's rule:
– add the individual weight matrices.

The output units use the activation function

y_i =  1  if y_in_i > 0
       0  if y_in_i = 0
      -1  if y_in_i < 0

For binary targets:

y_i =  1  if y_in_i > 0
       0  if y_in_i <= 0

(Figure: inputs x_i, ..., x_n connected through weights w_ij to
summation/threshold output units.)
Example
GOAL: build a neural network which will
associate the following two sets of patterns
using Hebb’s Rule:
s1 = ( 1 -1 -1 -1) f1 = ( 1 -1 -1)
s2 = (-1 1 -1 -1) f2 = ( 1 -1 1)
s3 = (-1 -1 1 -1) f3 = (-1 1 -1)
s4 = (-1 -1 -1 1) f4 = (-1 1 1)
Weight Matrix
Add all four individual weight matrices (one per training pair)
to produce the final weight matrix:

[  1 -1 -1 ]   [ -1  1 -1 ]   [  1 -1  1 ]   [  1 -1 -1 ]   [  2 -2 -2 ]
[ -1  1  1 ] + [  1 -1  1 ] + [  1 -1  1 ] + [  1 -1 -1 ] = [  2 -2  2 ]
[ -1  1  1 ]   [ -1  1 -1 ]   [ -1  1 -1 ]   [  1 -1 -1 ]   [ -2  2 -2 ]
[ -1  1  1 ]   [ -1  1 -1 ]   [  1 -1  1 ]   [ -1  1  1 ]   [ -2  2  2 ]
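The same sum can be verified with a few lines of NumPy; under Hebb's rule
each individual matrix is the outer product of a stimulus with its response:

import numpy as np

S = np.array([[ 1, -1, -1, -1],     # s1..s4
              [-1,  1, -1, -1],
              [-1, -1,  1, -1],
              [-1, -1, -1,  1]])
F = np.array([[ 1, -1, -1],         # f1..f4
              [ 1, -1,  1],
              [-1,  1, -1],
              [-1,  1,  1]])

# Hebb's rule: add the four individual outer-product matrices
W = sum(np.outer(S[i], F[i]) for i in range(4))
print(W)
# [[ 2 -2 -2]
#  [ 2 -2  2]
#  [-2  2 -2]
#  [-2  2  2]]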
Example Architecture
General Structure

The resulting weight matrix is

W = [  2 -2 -2 ]
    [  2 -2  2 ]
    [ -2  2 -2 ]
    [ -2  2  2 ]

with output activation

y_i =  1  if y_in_i > 0
       0  if y_in_i = 0
      -1  if y_in_i < 0

(Figure: four inputs x1, ..., x4 fully connected through W to three
summation/threshold output units.)
Example Run I
Try the first input pattern: s1 = ( 1 -1 -1 -1), expected f1 = ( 1 -1 -1).

y_in1 =  2 - 2 + 2 + 2 =  4, so y1 =  1
y_in2 = -2 + 2 - 2 - 2 = -4, so y2 = -1
y_in3 = -2 - 2 + 2 - 2 = -4, so y3 = -1

The output ( 1 -1 -1) matches f1.
Where is this information stored? In the weights: the association is
distributed over the whole weight matrix rather than held at any single
storage address.
Example Run II
Try the second input pattern: s2 = (-1 1 -1 -1), expected f2 = ( 1 -1 1).

y_in1 = -2 + 2 + 2 + 2 =  4, so y1 =  1
y_in2 =  2 - 2 - 2 - 2 = -4, so y2 = -1
y_in3 =  2 + 2 + 2 - 2 =  4, so y3 =  1

The output ( 1 -1 1) matches f2.
Example Run III
Try the third input pattern: s3 = (-1 -1 1 -1), expected f3 = (-1 1 -1).

y_in1 = -2 - 2 - 2 + 2 = -4, so y1 = -1
y_in2 =  2 + 2 + 2 - 2 =  4, so y2 =  1
y_in3 =  2 - 2 - 2 - 2 = -4, so y3 = -1

The output (-1 1 -1) matches f3.
Example Run IV
Try the fourth input pattern: s4 = (-1 -1 -1 1), expected f4 = (-1 1 1).

y_in1 = -2 - 2 + 2 - 2 = -4, so y1 = -1
y_in2 =  2 + 2 - 2 + 2 =  4, so y2 =  1
y_in3 =  2 - 2 + 2 + 2 =  4, so y3 =  1

The output (-1 1 1) matches f4.
Example Run V
Try a non-training pattern, the input (-1 -1 -1 -1):

y_in1 = -2 - 2 + 2 + 2 = 0, so y1 = 0
y_in2 =  2 + 2 - 2 - 2 = 0, so y2 = 0
y_in3 =  2 - 2 + 2 - 2 = 0, so y3 = 0

What do the 0's mean? Every net input is exactly zero, so the net cannot
commit to +1 or -1 for any output: this input gives no evidence for either
value of any stored association.
Example Run VI
Try another non-training pattern, the input (-1 1 1 1):

y_in1 = -2 + 2 - 2 - 2 = -4, so y1 = -1
y_in2 =  2 - 2 + 2 + 2 =  4, so y2 =  1
y_in3 =  2 + 2 - 2 + 2 =  4, so y3 =  1

What does this result mean? Although (-1 1 1 1) was never used for
training, the net still produces one of the stored responses,
(-1 1 1) = f4: an input that is similar, but not identical, to the
training inputs evokes a stored response.
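The six runs above can be reproduced with a short script; np.sign returns 0
for a zero net input, which happens to match the activation rule used on
these slides:

import numpy as np

W = np.array([[ 2, -2, -2],
              [ 2, -2,  2],
              [-2,  2, -2],
              [-2,  2,  2]])

def recall(x):
    return np.sign(x @ W)          # 1, -1 or 0 per output unit

inputs = [( 1, -1, -1, -1),        # s1 -> ( 1 -1 -1)
          (-1,  1, -1, -1),        # s2 -> ( 1 -1  1)
          (-1, -1,  1, -1),        # s3 -> (-1  1 -1)
          (-1, -1, -1,  1),        # s4 -> (-1  1  1)
          (-1, -1, -1, -1),        # non-training input: all outputs 0
          (-1,  1,  1,  1)]        # non-training input: recalled as f4
for x in inputs:
    print(x, '->', recall(np.array(x)))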
AM and Discrete Hopfield Network.
Introduction
Basic Concepts
Linear Associative Memory (Hetero-associative)
Hopfield's Autoassociative Memory
Performance Analysis for Recurrent Autoassociation Memory
References and suggested reading
B. Chen & Zurada
Hopfield's Autoassociative Memory
Storage rule:

w_ij = w_ji   (symmetric weights)
w_ii = 0      (no self-connections)

W = Σ_{m=1}^p s(m) s(m)^t − p I

– In other words:

w_ij = Σ_{m=1}^p s_i(m) s_j(m),  for i ≠ j
w_ii = 0

– The diagonal elements are set to zero to avoid self-feedback.
Hopfield’s Autoassociative Memory
– The matrix only represents the correlation terms among the
vector entries
– It is invariant with respect to the order in which the
patterns are stored
– Additional autoassociations can be added at any time by
superposition
– Autoassociations can also be removed
Hopfield’s Autoassociative Memory
Retrieval Algorithm
– The output update rule can be expressed as:

v_i^(k+1) = sgn( Σ_{j=1}^n w_ij v_j^k )

(Worked example on the slide: starting from a test input v^0, the
successive updates v^1, v^2, ..., v^6 converge to a stored pattern.)
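A compact sketch of the storage rule together with asynchronous retrieval;
the random update order and the keep-the-previous-value rule for a zero net
input are implementation choices (not prescribed on the slides), and the
stored patterns are assumed to be bipolar:

import numpy as np

def store(patterns):
    """W = sum_m s(m) s(m)^t - p I  (symmetric, zero diagonal)."""
    S = np.asarray(patterns, dtype=float)
    p, n = S.shape
    return S.T @ S - p * np.eye(n)

def recall_async(W, v0, sweeps=20, seed=0):
    """Asynchronous recall: v_i <- sgn(sum_j w_ij v_j), one neuron at a time."""
    rng = np.random.default_rng(seed)
    v = np.asarray(v0, dtype=float).copy()
    for _ in range(sweeps):
        for i in rng.permutation(len(v)):
            net = W[i] @ v
            if net != 0:                 # keep the previous value when net == 0
                v[i] = 1.0 if net > 0 else -1.0
    return v

# Usage sketch with two hypothetical (orthogonal) stored patterns
s1 = np.array([ 1, -1,  1, -1,  1, -1])
s2 = np.array([ 1,  1, -1, -1,  1,  1])
W = store([s1, s2])
noisy = s1.copy(); noisy[0] = -noisy[0]      # flip one bit of s1
print(recall_async(W, noisy))                # recovers s1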
Hopfield’s Autoassociative Memory
Example: Autoassociation for numbers (Lippmann
1987)
Hopfield’s Autoassociative Memory
For binary {0, 1} patterns the storage rule becomes

w_ij = (1 − δ_ij) Σ_{m=1}^p ( 2 s_i(m) − 1 )( 2 s_j(m) − 1 )

– In other words:

w_ij = Σ_{m=1}^p ( 2 s_i(m) − 1 )( 2 s_j(m) − 1 ),  for i ≠ j
w_ii = 0
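In code this amounts to mapping the binary patterns to their bipolar
versions 2s(m) − 1 and applying the same outer-product construction; a
minimal sketch:

import numpy as np

def store_binary(patterns):
    """w_ij = sum_m (2 s_i(m) - 1)(2 s_j(m) - 1) for i != j, w_ii = 0."""
    S = 2.0 * np.asarray(patterns, dtype=float) - 1.0   # {0,1} -> {-1,+1}
    W = S.T @ S
    np.fill_diagonal(W, 0.0)                            # no self-connections
    return W

# Two hypothetical binary patterns
W = store_binary([[1, 0, 1, 0],
                  [1, 1, 0, 0]])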
Hopfield’s Autoassociative Memory
Performance Consideration
– The Hopfield autoassociative memory is also referred to as an
"error-correcting decoder": given an input equal to a stored
memory corrupted by random errors, it produces as output the
original stored memory that is close to the input.
(Example shown on the slide.)
Hopfield’s Autoassociative Memory
Performance Consideration
– Let us assume that s(m') has been stored in the memory as one of
the p patterns.
– This pattern is now at the memory input. The activation value of
the i-th neuron for retrieval of pattern s(m') is:

net_i = Σ_{j=1}^n w_ij s_j(m')
      ≅ Σ_{j=1}^n Σ_{m=1}^p s_i(m) s_j(m) s_j(m')      (if the effect of the zero diagonal is neglected)
      = Σ_{m=1}^p s_i(m) Σ_{j=1}^n s_j(m) s_j(m')

If s(m') and s(m) are statistically independent (also when they are
orthogonal), the inner product Σ_j s_j(m) s_j(m') vanishes (is zero).
Otherwise, in the limit case of identical vectors, the product reaches
n (note that it is the dot product of vectors with entries +1 or -1):
the major-overlap case. Hence

net_i ≈ s_i(m') ñ,   where ñ ≤ n

i.e. sgn(net_i) = s_i(m'): the pattern s(m') does not produce any
update and is therefore stable.
Hopfield’s Autoassociative Memory
Performance Consideration
The equilibrium state for the retrieval of pattern s(m') is considered:

net = W s(m') = ( Σ_{m=1}^p s(m) s(m)^t − p I ) s(m')

1. If the stored prototypes are statistically independent or
orthogonal, and n > p, i.e.

s(i)^t s(j) = 0  for i ≠ j
s(i)^t s(j) = n  for i = j

then

net = ( s(1) s(1)^t + s(2) s(2)^t + ... + s(p) s(p)^t − p I ) s(m')
    = n s(m') − p s(m') + Σ_{m≠m'} s(m) s(m)^t s(m')
    = (n − p) s(m') + Σ_{m≠m'} s(m) s(m)^t s(m')

The first term, (n − p) s(m'), is the equilibrium-state term (as on the
previous slide). The second term is the noise vector

η = Σ_{m≠m'} s(m) s(m)^t s(m') = ( W − s(m') s(m')^t + p I ) s(m')

– If the i-th component of the noise vector is larger in magnitude than
the corresponding component (n − p) s_i(m') of the equilibrium term,
and has the opposite sign, then s_i(m') will not be the network's
equilibrium.
– The noise term obviously increases for an increased number of stored
patterns, and also becomes relatively significant when the factor
(n − p) decreases.
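The split of net into the equilibrium term (n − p) s(m') and the noise
vector η can be checked numerically; the sketch below assumes mutually
orthogonal bipolar prototypes, obtained here from a Sylvester-constructed
Hadamard matrix (a hypothetical choice):

import numpy as np

# Rows of an 8x8 Hadamard matrix: mutually orthogonal +-1 vectors
H = np.array([[1.0]])
for _ in range(3):
    H = np.block([[H, H], [H, -H]])

S = H[:3]                                    # p = 3 prototypes of dimension n = 8
p, n = S.shape
W = S.T @ S - p * np.eye(n)                  # Hopfield storage rule

m = 1                                        # retrieve prototype s(m')
net = W @ S[m]
eta = sum(np.outer(S[k], S[k]) @ S[m] for k in range(p) if k != m)   # noise vector
print(np.allclose(net, (n - p) * S[m] + eta))   # True: net = (n - p) s(m') + eta
print(eta)                                      # all zeros here, since the prototypes are orthogonal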
AM and Discrete Hopfield Network.
Introduction
Basic Concepts
Linear Associative Memory (Hetero-associative)
Hopfield's Autoassociative Memory
Performance Analysis for Recurrent Autoassociation Memory
References and suggested reading
Performance Analysis for Recurrent
Autoassociation Memory
• When the net is allowed to update, it evolves through certain states,
which are shown on the state transition diagram on the next slide.
• The states are vertices of the 4-dimensional cube.
• Only transitions between states have been marked on the graph; the
self-sustaining (stable) state transitions have been omitted for
clarity of the picture.
Performance Analysis for Recurrent
Autoassociation Memory
• Since only asynchronous transitions are allowed for the
autoassociative recurrent memory, every vertex is connected by an edge
only to the neighbouring vertices differing by a single bit.
• Thus there are only 4 transition paths connected to every state. The
paths indicate the directions of possible transitions.

(Worked example on the slide: a test vector v^0 and its successive
asynchronous updates v^1, v^2, v^3, v^4, ..., which settle on one of
the stored patterns.)

• Let us look at a different starting vector, v(0) = [-1 1 1 1]^t: the
transitions carried out in ascending node order lead to s(1) in a
single step.
Performance Analysis for Recurrent
Autoassociation Memory
v^1 = v^0,   v^2 = [-1 -1 1 1]^t,   v^3 = v^2,   v^4 = v^3,  etc.

• You can also verify that, by carrying out the transitions initialized
at v(0) but in descending node order, pattern s(2) can be recovered.
Advantages and Limits for Associative
Recurrent Memories
Limits
– Limited storage capacity
– May converge to spurious memories (states)
Advantage
– The recurrences through the thresholding layer of processing
neurons (threshold functions) tend to eliminate noise superimposed
on the initializing input vector
AM and Discrete Hopfield Network.
Introduction
Basic Concepts
Linear Associative Memory (Hetero-associative)
Hopfield's Autoassociative Memory
Performance Analysis for Recurrent Autoassociation Memory
References and suggested reading
Suggested Reading.
L. Fausett, Fundamentals of Neural Networks, Prentice-Hall, 1994,
Chapter 3.
J. Zurada, Introduction to Artificial Neural Systems, West Publishing,
1992, Chapter 6.
References:
These lecture notes were based on the references listed on the previous
slide, together with the additional references shown on the original
slide.