Module 1 - Fundamentals of NN
Soft Computing
Books
• Textbooks:
1. “Neural Networks, Fuzzy Logic and Genetic Algorithms”, S. Rajasekaran, G. A. Vijayalakshmi Pai, PHI.
• References:
1. MIT-OCW
2. “Introduction to the Theory of Neural Computation”, Hertz, Krogh, Palmer.
3. “Artificial Neural Networks”, B. Yegnanarayana, PHI.
4. “Genetic Algorithms”, David E. Goldberg, Addison-Wesley.
Course Learning Outcomes
[Table: course learning outcomes. Columns: CO | After the completion of the course the student should be able to | Bloom's Cognitive Level | Descriptor]
[Figure: a simple two-input neuron, with inputs x1 and x2, weights w1 and w2, and output neuron Y.]

$$y_{in} = x_1 w_1 + x_2 w_2$$
$$y = f(y_{in})$$
Characteristics of NNs
• Exhibit mapping capabilities
• Consist of two basic elements:
1. Neurons (nodes)
2. Synapses (weights)
Basic Models of ANN
The basic models of an ANN are characterized by three entities:
1. Interconnections
2. Learning rules
3. Activation function
Classification based on interconnections
[Figure: tree diagram. Interconnections are classified into feed-forward and feedback networks, each of which may be single-layer or multilayer.]
1. Feed-forward nets
• Information is distributed
• Information processing is parallel
• Often used in data mining
1.2 Multilayer feed-forward network
[Figure: a 3-4-2 network, with an input layer of 3 nodes, a hidden layer of 4 nodes, and an output layer of 2 nodes.]
2. Feedback networks
• When outputs are directed back as inputs to the same or preceding layer nodes, a feedback network is formed.
• Used in associative memories and optimization problems, where the network looks for the best arrangement of interconnected factors.
3. Recurrent networks
• Feedback networks with closed loops are called recurrent networks.
• The response at the (k+1)-th instant depends on the entire history of the network, starting at k = 0.
[Figure: a recurrent network with input, hidden, and output layers; z^-1 (unit-delay) elements appear in the feedback paths.]
Example of a Recurrent NN
• First we look at a simple example of an NN.
• Suppose we have a roommate who is a perfect cook.
• He cooks three types of food, based on the weather, and he cooks every day.
• He cooks orange juice, pakora, and Manchurian.
• We assign vectors to the foods and the weather states.
[Figure: one-hot vector assignments. Each food (orange juice, pakora, Manchurian) and each weather state is encoded as a vector with a single 1 and 0s elsewhere.]
• Now for the next example, consider that the roommate cooks a different food every day, e.g. if today he cooks orange juice, the next day he cooks pakora, the day after Manchurian, and so on.

More complex RNN

Day:     M   T   W   Th  F   Sa
Food:    OJ  OJ  P   M   M   OJ  P
Weather: S   R   R   S   R   R

• The food matrix tells which food is to be cooked, and the weather matrix tells the weather for the day on which the food is cooked.
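The fixed daily cycle in this example can be written as a recurrence in which tomorrow's food vector is a fixed matrix times today's food vector. Below is a minimal Python sketch, assuming the hypothetical one-hot encodings OJ = (1,0,0), pakora = (0,1,0), Manchurian = (0,0,1); the matrix P is an illustration, not taken from the slides:

```python
import numpy as np

# Cycle OJ -> pakora -> Manchurian -> OJ as a permutation matrix:
# column i holds the one-hot vector of the dish cooked after dish i.
P = np.array([[0, 0, 1],
              [1, 0, 0],
              [0, 1, 0]])

foods = ["orange juice", "pakora", "Manchurian"]
today = np.array([1, 0, 0])  # today: orange juice

for day in range(4):
    print(foods[int(np.argmax(today))])
    today = P @ today  # the recurrent step: next state from current state
```

Running this prints orange juice, pakora, Manchurian, orange juice: the state cycles exactly as described.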
Basic Models of ANN
1. Interconnections
2. Learning rules
3. Activation function
Learning
• It is a process by which an NN adapts itself to a stimulus by making proper parameter adjustments, resulting in the production of the desired response.
• There are two kinds of learning:
– Parameter learning: connection weights are updated
– Structure learning: the network structure is changed
1. Training (Parameter Learning)
• The process of modifying the weights in the
connections between network layers with the
objective of achieving the expected output is
called training a network.
• This is achieved through
– Supervised learning
– Unsupervised learning
– Reinforcement learning
Classification of learning
• Supervised learning
• Unsupervised learning
• Reinforcement learning
1.1 Supervised Learning
• A child learns from a teacher.
• Each input vector requires a corresponding target vector.
• Training pair = [input vector, target vector]
[Figure: supervised learning block diagram. The input X is applied to the neural network with weights W, producing the actual output Y; an error signal generator compares Y with the desired output D and feeds the error signal (D-Y) back to the network.]
Supervised learning (contd.)
• Supervised learning performs minimization of the error between the actual and desired outputs.
1.2 Unsupervised Learning
• All similar input patterns are grouped together as clusters.
• If a matching input pattern is not found, a new cluster is formed.
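The grouping rule just described can be sketched as a simple "leader" clustering procedure. A minimal Python sketch, assuming Euclidean distance and a hypothetical match threshold tau (neither detail is specified in the slides):

```python
import numpy as np

def leader_cluster(patterns, tau=0.5):
    """Group similar input patterns into clusters; if no existing
    cluster matches a pattern, a new cluster is formed."""
    leaders = []          # one representative pattern per cluster
    labels = []
    for x in patterns:
        x = np.asarray(x, dtype=float)
        # distance from x to each existing cluster leader
        dists = [np.linalg.norm(x - l) for l in leaders]
        if dists and min(dists) <= tau:
            labels.append(int(np.argmin(dists)))   # matching cluster found
        else:
            leaders.append(x)                      # no match: new cluster
            labels.append(len(leaders) - 1)
    return labels

print(leader_cluster([[0, 0], [0.1, 0], [5, 5], [5.2, 4.9]]))
# -> [0, 0, 1, 1]: the two nearby pairs form two clusters
```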
Unsupervised learning: self-organizing
• In unsupervised learning there is no feedback.
• The network must discover patterns, regularities, and features in the input data on its own, without target outputs.
• While doing so, the network may change its parameters.
• This process is called self-organizing.
1.3 Reinforcement Learning
[Figure: reinforcement learning block diagram. The input X is applied to the NN with weights W, producing the actual output Y; an error signal generator derives error signals from the reinforcement signal R.]
Reinforcement Learning (contd.)
• Though a teacher is available, it does not provide the expected answer.
• It only indicates whether the computed answer is correct or incorrect: a reward is given for a correct answer and a penalty for a wrong one.
• Feedback is provided in terms of reinforcement signals.
• This helps the network in its learning process.
Basic Models of ANN
1. Interconnections
2. Learning rules
3. Activation function
Activation Functions
• Also known as squashing functions or transfer functions.
• An activation function defines the output of a neuron and is used to calculate its output response: the sum of the weighted input signals is passed through an activation function to obtain the response.
• Activation functions can be linear or nonlinear.
Activation Functions
Binary step function (threshold θ):
$$f(x) = \begin{cases} 1 & \text{if } x \geq \theta \\ 0 & \text{if } x < \theta \end{cases}$$
Bipolar step function:
$$f(x) = \begin{cases} 1 & \text{if } x \geq \theta \\ -1 & \text{if } x < \theta \end{cases}$$
Ramp function:
$$f(x) = \begin{cases} 1 & \text{if } x > 1 \\ x & \text{if } 0 \leq x \leq 1 \\ 0 & \text{if } x < 0 \end{cases}$$
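A minimal Python sketch of these three functions, taking θ = 0 as an illustrative default:

```python
import numpy as np

def binary_step(x, theta=0.0):
    """Binary step: 1 if x >= theta, else 0."""
    return np.where(x >= theta, 1, 0)

def bipolar_step(x, theta=0.0):
    """Bipolar step: 1 if x >= theta, else -1."""
    return np.where(x >= theta, 1, -1)

def ramp(x):
    """Ramp: 0 for x < 0, x on [0, 1], saturates at 1 for x > 1."""
    return np.clip(x, 0.0, 1.0)

x = np.array([-0.5, 0.3, 1.7])
print(binary_step(x))    # [0 1 1]
print(bipolar_step(x))   # [-1  1  1]
print(ramp(x))           # [0.   0.3  1. ]
```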
The Neuron
• The neuron is the basic information processing unit of an NN. It consists of:
1. A set of synapses or connecting links, each link characterized by a weight: $w_1, w_2, \ldots, w_m$
2. An adder function (linear combiner), which computes the weighted sum of the inputs:
$$u = \sum_{j=1}^{m} w_j x_j$$
3. An activation function (squashing function) $\varphi$ for limiting the amplitude of the output of the neuron:
$$y = \varphi(u + b)$$
where $b$ is the bias.
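Putting the three components together, here is a minimal Python sketch of a single neuron (the weights, bias, and bipolar step activation are illustrative choices, not values from the slides):

```python
import numpy as np

def neuron(x, w, b, phi):
    """Basic neuron: y = phi(u + b), where u = sum_j w_j x_j."""
    u = np.dot(w, x)          # adder / linear combiner
    return phi(u + b)         # activation limits the output amplitude

phi = lambda v: 1 if v >= 0 else -1   # bipolar step activation
x = np.array([0.5, -1.0, 2.0])        # inputs x1..xm
w = np.array([0.4, 0.3, 0.1])         # synaptic weights w1..wm
b = -0.2                              # bias
print(neuron(x, w, b, phi))           # u = 0.1, y = phi(-0.1) = -1
```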
The Neuron
[Figure: model of a neuron. Input signals x1, x2, …, xm with synaptic weights w1, w2, …, wm enter a summing function; together with the bias b this produces the local field v, and the activation function φ(·) then gives the output y.]
Bias
• For example, suppose we have the equation of a line, Y = mX + C.
Why is a bias required?
• The relationship between input and output is given by the equation of a straight line, y = mx + c.
[Figure: input X mapped to output Y through y = mx + C; the constant C plays the role of the bias.]
Bias
• The bias is a constant that helps the model fit the given data as well as possible.
• It gives the model the freedom to shift its response so it can perform at its best.
• Bias acts like another weight. It is included by adding a component x0 = 1 to the input vector X:
X = (1, X1, X2, …, Xi, …, Xn)
• Bias is of two types:
– Positive bias: increases the net input
– Negative bias: decreases the net input
Bias as extra input
• The bias is an external parameter of the neuron. It can be modeled by adding an extra input $x_0 = +1$ with weight $w_0 = b$:
$$v = \sum_{j=0}^{m} w_j x_j, \qquad x_0 = +1, \quad w_0 = b$$
[Figure: the neuron redrawn with the extra input x0 = +1 and weight w0 = b; the inputs x1, …, xm with weights w1, …, wm feed the summing function, the local field v passes through the activation function φ(·), and the output is y.]
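A short Python check that the explicit-bias and extra-input formulations compute the same local field (the values are illustrative):

```python
import numpy as np

x = np.array([0.5, -1.0, 2.0])   # inputs x1..xm
w = np.array([0.4, 0.3, 0.1])    # weights w1..wm
b = -0.2                          # bias

# Formulation 1: explicit bias, v = sum_j w_j x_j + b
v1 = np.dot(w, x) + b

# Formulation 2: bias folded in as weight w0 = b on extra input x0 = +1
x_aug = np.concatenate(([1.0], x))
w_aug = np.concatenate(([b], w))
v2 = np.dot(w_aug, x_aug)

print(v1, v2)   # both print approximately -0.1
```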
Comparison between brain versus computer
[Table: comparison between the brain and an ANN.]

Example: consider a neuron deciding whether to eat an object, based on two features: is it purple, and is it round?

Object      Purple?   Round?   Eat?
Blueberry   Yes       Yes      Yes
Golf ball   No        Yes      No
Violet      Yes       No       No
Hot Dog     No        No       No
Situation if the threshold is set at 1:

Object      Purple?   Round?   Total (combined input x)   x > 1?   Eat? (Output)
Blueberry   1         1        2                          Yes      1
Golf ball   0         1        1                          No       0
Violet      1         0        1                          No       0
Hot Dog     0         0        0                          No       0
Situation if the threshold is set at 2:

Object      Purple?   Round?   Total (combined input x)   x > 2?   Eat? (Output)
Blueberry   1         1        2                          No       0
Golf ball   0         1        1                          No       0
Violet      1         0        1                          No       0
Hot Dog     0         0        0                          No       0
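The tables above are exactly the behavior of a single threshold unit with unit weights; a minimal Python sketch:

```python
objects = {              # (purple?, round?) encoded as 0/1
    "Blueberry": (1, 1),
    "Golf ball": (0, 1),
    "Violet":    (1, 0),
    "Hot Dog":   (0, 0),
}

def eat(purple, round_, theta):
    """Fire (eat) when the combined input exceeds the threshold."""
    x = purple + round_          # combined input with unit weights
    return 1 if x > theta else 0

for theta in (1, 2):
    print(f"threshold = {theta}:")
    for name, (p, r) in objects.items():
        print(f"  {name}: eat = {eat(p, r, theta)}")
# threshold 1: only Blueberry fires; threshold 2: nothing fires
```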
Assignment Example 1 (MP Model)
OR function: W1 = 1, W2 = 1, Threshold (θ) = 0

I1   I2   Output
0    0
0    1
1    0
1    1

[Figure: MP neuron with two inputs I1 and I2.]
Assignment Example 2 (MP Model)
OR function: W1 = 0.5, W2 = 0.5, Threshold (θ) = 0

I1   I2   Output
0    0
0    1
1    0
1    1

[Figure: MP neuron with two inputs I1 and I2.]
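A quick Python check of both assignments, assuming the MP neuron fires when the net input strictly exceeds θ (the same x > θ convention as the tables above):

```python
def mp_neuron(i1, i2, w1, w2, theta):
    """McCulloch-Pitts unit: fire if the weighted net input exceeds theta."""
    net = i1 * w1 + i2 * w2
    return 1 if net > theta else 0

for w1, w2, theta in [(1.0, 1.0, 0.0), (0.5, 0.5, 0.0)]:
    rows = [(i1, i2, mp_neuron(i1, i2, w1, w2, theta))
            for i1 in (0, 1) for i2 in (0, 1)]
    print(f"W1={w1}, W2={w2}, theta={theta}: {rows}")
    # both settings reproduce OR: output 1 unless both inputs are 0
```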
How to “test” a Hopfield network
• Testing algorithm:
8. Now feed back the obtained output Vi to all other units; the activation vectors are updated.
9. Finally, test the network for convergence.
Example
• We have a 5-node Hopfield network and we want it to recognize the pattern (0 1 1 0 1).
• The nodes are updated one at a time, in random order; the update sequence might go 3, 2, 1, 5, 4, 2, 3, 1, 5, 4, etc.

When to stop updating the network
• Basically, if you go through all the nodes and none of them changes, you can stop: the network has converged.

Example: update order (3, 1, 5, 2, 4) for input pattern (1 1 1 1 1)
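A minimal Python sketch of this test, under stated assumptions: the stored pattern is converted to bipolar (±1) form, the weights come from the Hebbian outer-product rule with a zero diagonal, and a unit keeps its current state when its net input is exactly zero (the slides do not fix these details):

```python
import numpy as np

stored = np.array([0, 1, 1, 0, 1])          # pattern to recognize
V = 2 * stored - 1                           # bipolar form (+1/-1)

N = len(V)
W = np.outer(V, V) / N                       # Hebb rule weights
np.fill_diagonal(W, 0)                       # no self-connections

def run(state, order):
    """Asynchronously update units in the given order until a full
    pass leaves every unit unchanged (convergence)."""
    s = 2 * np.asarray(state) - 1            # to bipolar
    changed = True
    while changed:
        changed = False
        for i in order:
            h = W[i] @ s                     # local field of unit i
            new = 1 if h > 0 else (-1 if h < 0 else s[i])
            if new != s[i]:
                s[i] = new
                changed = True
    return (s + 1) // 2                      # back to 0/1

order = [2, 0, 4, 1, 3]                      # nodes 3, 1, 5, 2, 4 (1-indexed)
print(run([1, 1, 1, 1, 1], order))           # -> [0 1 1 0 1], the stored pattern
```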
Assignment
• Solve the above example for the following update sequence: (2, 4, 3, 5, 1)

Example: update order (2, 4, 3, 5, 1) for input pattern (1 1 1 1 1)
Hopfield Model
• We will study the memorization (i.e., finding a set of suitable w_ij) of a set of random patterns, which are made up of independent bits V_i that can each take on the values +1 and -1 with equal probability.
• Our procedure for testing whether a proposed form of w_ij is acceptable is first to see whether the patterns are themselves stable, and then to check whether small deviations from these patterns are corrected as the network evolves.
• We distinguish two cases:
– One pattern
– Many patterns
Hopfield Model: Storage of One Pattern
• The condition for a single pattern to be stable is:
$$V_i = \operatorname{sgn}\Big(\sum_j w_{ij} V_j\Big) \quad \text{(for all } i)$$
• This holds if we choose the weights as
$$w_{ij} = \frac{1}{N} V_i V_j ,$$
since the net input to unit i is then
$$h_i = \sum_j w_{ij} V_j .$$
• Even if a number of the bits of the starting pattern are wrong, they will be outweighed by the majority that are right, and sgn(h_i) will still give V_i. This means that the network will correct errors as desired, and we can say that the pattern is an attractor.
Hopfield Model: Storage of Many Patterns
• In the case of many patterns, the weights are taken to be a superposition of terms like the one for a single pattern:
$$w_{ij} = \frac{1}{N} \sum_{\mu=1}^{p} V_i^{\mu} V_j^{\mu}$$
where p is the number of patterns, labeled by μ.
• An associative memory model with these weights for all possible pairs ij, with binary units and asynchronous updating, is usually called a Hopfield model.
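A short Python sketch of this superposition (Hebb) rule; the zeroed diagonal (no self-connections) is the usual convention, assumed here since the slides do not state it:

```python
import numpy as np

def hebb_weights(patterns):
    """w_ij = (1/N) * sum_mu V_i^mu V_j^mu  (superposition over patterns)."""
    P = np.asarray(patterns, dtype=float)   # shape (p, N), entries +1/-1
    N = P.shape[1]
    W = P.T @ P / N                          # sum of outer products, divided by N
    np.fill_diagonal(W, 0)                   # conventional: no self-coupling
    return W

patterns = [[1, -1, 1, -1, 1],
            [1, 1, -1, -1, 1]]
print(hebb_weights(patterns))
```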
Hopfield Model: Stability of a Particular Pattern
• Let us examine the stability of a particular pattern V_i^s. The stability condition generalizes to:
$$V_i^s = \operatorname{sgn}(h_i^s) \quad \text{(for all } i), \qquad h_i^s = \sum_j w_{ij} V_j^s = \frac{1}{N} \sum_j \sum_{\mu} V_i^{\mu} V_j^{\mu} V_j^s$$
• Now we separate the sum on μ into the special term μ = s and all the rest:
$$h_i^s = V_i^s + \frac{1}{N} \sum_j \sum_{\mu \neq s} V_i^{\mu} V_j^{\mu} V_j^s$$
Hopfield Model: Stability of a Particular Pattern (contd.)
• If the second term were zero, we could immediately conclude that pattern s is stable according to the previous stability condition. This is still true if the second term is small enough: if its magnitude is smaller than 1, it cannot change the sign of h_i^s, and the stability condition will still be satisfied.
• The second term is called crosstalk. It turns out to be less than 1 in many cases of interest if p is small enough.
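A small numeric sketch of the crosstalk term for random patterns; it merely illustrates that the term is typically well below 1 when p is much smaller than N (an illustration, not a proof):

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 500, 5                                # many units, few patterns
V = rng.choice([-1, 1], size=(p, N))         # random bipolar patterns

s = 0                                        # examine pattern s
# crosstalk at unit i: (1/N) * sum_j sum_{mu != s} V_i^mu V_j^mu V_j^s
others = [mu for mu in range(p) if mu != s]
overlaps = V[others] @ V[s] / N              # (1/N) sum_j V_j^mu V_j^s
crosstalk = V[others].T @ overlaps           # sum over mu of V_i^mu * overlap

print(np.max(np.abs(crosstalk)))             # typically well below 1 for p << N
```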
Optimization Problems