
COMPUTATIONAL INTELLIGENCE

Adaline and Madaline Networks

Prof. B. R. Suthar
Learning rule/Learning process

➢ A learning rule is a method or mathematical logic that
improves the Artificial Neural Network's performance; the
rule is applied over the network.
➢ Thus a learning rule updates the weights and bias levels
of a network as the network is simulated in a specific data
environment.
➢ Applying a learning rule is an iterative process. It helps a
neural network learn from the existing conditions and
improve its performance.
Learning Laws
https://data-flair.training/blogs/learning-rules-in-neural-network

➢ Hebb's Rule: This was introduced by Donald Hebb in
'The Organization of Behavior'.
➢ If two neighboring neurons are activated at the same time,
the weight connecting them should increase. For neurons
operating in opposite phases, the weight between them
should decrease. If there is no signal correlation, the weight
should not change.
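As a rough illustration (not part of the original slides), Hebb's sign-correlation idea can be written as a one-line weight update; the function and variable names below are only illustrative.

```python
def hebb_update(weight, pre, post, rate=0.1):
    # Hebbian update for one connection with bipolar (+1/-1) activations:
    # same sign -> pre*post = +1, weight grows; opposite sign -> weight shrinks;
    # a zero activation (no correlation) leaves the weight unchanged.
    return weight + rate * pre * post

print(hebb_update(0.0, +1, +1))   # both active: 0.0 -> 0.1
print(hebb_update(0.0, +1, -1))   # opposite phase: 0.0 -> -0.1
```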
Learning Laws
• Hopfield Law:
If the desired output and the input are both active or
both inactive, increment the connection weight by
the learning rate, otherwise decrement the weight
by the learning rate.
Learning Laws
• The Delta Rule: This rule is based on the simple idea
of continuously modifying the strengths of the input
connections to reduce the difference (the delta)
between the desired output value and the actual
output of a processing element.
• The result is a least mean squared error (LMS error).

• Both the Adaptive Linear Neuron (Adaline) and the Multilayer
Adaline (Madaline) use the Delta rule.
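A minimal sketch of one Delta-rule update, assuming a linear (identity) output as in the Adaline, so the error is taken against the net input itself. Function and variable names are illustrative.

```python
def delta_rule_step(weights, bias, x, target, rate=0.2):
    # One Delta (LMS / Widrow-Hoff) update on a single training pair.
    net = bias + sum(w * xi for w, xi in zip(weights, x))
    delta = target - net                      # the "delta" between desired and actual output
    new_weights = [w + rate * delta * xi for w, xi in zip(weights, x)]
    new_bias = bias + rate * delta
    return new_weights, new_bias, delta ** 2  # squared error, for monitoring the LMS error
```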
Learning Laws
• The Gradient Descent Rule: This is similar to the Delta
Rule in that the derivative of the transfer function is
used to modify the delta error before it is applied
to the connection weights.
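To make the contrast with the plain Delta rule concrete, the sketch below scales the error by the derivative of the transfer function before applying it to the weights; the sigmoid here is only an assumed example of a differentiable transfer function, and all names are illustrative.

```python
import math

def gradient_descent_step(weights, bias, x, target, rate=0.2):
    # Delta error modified by f'(net) before it is applied to the weights.
    net = bias + sum(w * xi for w, xi in zip(weights, x))
    out = 1.0 / (1.0 + math.exp(-net))    # f(net): logistic sigmoid (example choice)
    d_out = out * (1.0 - out)             # f'(net)
    delta = (target - out) * d_out
    new_weights = [w + rate * delta * xi for w, xi in zip(weights, x)]
    new_bias = bias + rate * delta
    return new_weights, new_bias
```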
Learning Laws
• Kohonen's Law: In this, the processing elements
compete for the opportunity to learn or update their
weights. The element with the largest output is declared
the winner and has the capability of inhibiting its
competitors as well as exciting its neighbors. Only
the winner is permitted an output and only the
winner plus its neighbors are allowed to adjust their
connection weights.
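A compact winner-take-all sketch of this competition (neighbourhood updates omitted for brevity); the names and the use of NumPy are illustrative.

```python
import numpy as np

def kohonen_step(weight_matrix, x, rate=0.2):
    # Each row of weight_matrix holds one processing element's weights.
    outputs = weight_matrix @ x
    winner = int(np.argmax(outputs))                              # largest output wins
    weight_matrix[winner] += rate * (x - weight_matrix[winner])   # only the winner learns
    return winner, weight_matrix
```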
Introduction
• The Perceptron modifies weights to reduce the number of
misclassifications, but perfect classification using a linear
element may not be possible.

• A better result may be achieved by minimizing the mean
squared error (MSE) instead of the number of misclassified
samples.

• MSE: a function that corresponds to the expected value of
the squared error loss.
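In symbols, MSE = (1/N) Σ (t_k − y_k)² over the N training samples; a small helper (illustrative names) computes it.

```python
def mean_squared_error(targets, outputs):
    # Average of the squared differences between desired and actual outputs.
    return sum((t - y) ** 2 for t, y in zip(targets, outputs)) / len(targets)

print(mean_squared_error([1, -1, 1], [0.8, -0.6, 0.9]))   # ~0.07
```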
Difference Between Adaline & Perceptron

• The difference between Adaline and the standard
perceptron is that in the learning phase the weights
are adjusted according to the weighted sum of the
inputs (the net). In the standard perceptron, the net
is passed to the activation (transfer) function and the
function's output is used for adjusting the weights.
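The difference can be made concrete with two error computations on the same unit: the Adaline error uses the net input directly, while the perceptron error uses the thresholded output. A rough sketch with illustrative names:

```python
def net_input(weights, bias, x):
    return bias + sum(w * xi for w, xi in zip(weights, x))

def adaline_error(weights, bias, x, target):
    # Error against the raw net input (used by the Delta/LMS rule).
    return target - net_input(weights, bias, x)

def perceptron_error(weights, bias, x, target):
    # Error against the thresholded (bipolar) activation output.
    y = 1 if net_input(weights, bias, x) >= 0 else -1
    return target - y
```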
Introduction
• Widrow and Hoff developed a learning rule that is
very closely related to the Perceptron learning rule. The
rule, called the Delta rule, adjusts the weights to reduce the
difference between the net input to the output unit
and the desired output, which results in a least mean
squared error (LMS error).

• Adaline and Madaline networks use this LMS learning
rule and have been applied to various neural network
applications. The weights on the interconnections
in Adaline and Madaline networks are adjustable.
Adaline Network
• The Adaline, developed by Widrow and Hoff [1960], uses bipolar
activations for its input signals and target outputs.
• The weights and the bias of the Adaline are adjustable.
• The learning rule used is called the Delta rule, the Least Mean
Square (LMS) rule, or the Widrow-Hoff rule.
• The activation of the unit is its net input.

• When the Adaline is to be used for pattern classification,
after training, a threshold function is applied to the net input
to obtain the activation.
• The Adaline unit can solve a problem only if it is linearly
separable.
Adaline Architecture
• The Adaline has only one output unit. This output
unit receives input from several input units and also from
a bias unit, whose activation is always +1.
• The bias weight is trained in the same manner
as the other weights.
Algorithm
• The initial weights of an Adaline network have to be set to small random
values, not to zero as in the Perceptron network, because zero weights would
influence the error factor to be considered.

• After this, the activations of the input units are set.

• The net input is calculated based on the training input patterns and the
weights.

• The weights are then updated by applying the delta learning rule.

• The training process is continued until the error, which is the difference
between the target and the net input, becomes minimum.
Algorithm Steps
1. Initialize weights (not zero; small random values are used).
Set the learning rate α.
2. While the stopping condition is false, do Steps 3-7.
3. For each bipolar training pair s:t, perform Steps 4-6.
4. Set activations of input units xi = si, for i = 1 to n.
5. Compute the net input to the output unit: y-in = b + ∑ xiwi, for i = 1 to n.
6. Update the bias and weights, for i = 1 to n:
wi(new) = wi(old) + α * (t - y-in) * xi
b(new) = b(old) + α * (t - y-in)

Calculate the error E = (t - y-in) ^ 2

7. Test for the stopping condition.

The stopping condition may be that the weight change falls below a
small threshold, that a set number of iterations has been reached,
etc.
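Steps 1-7 translate directly into a short training loop. The sketch below assumes bipolar training pairs and stops when the largest weight change in an epoch falls below a small tolerance; the names, tolerance and epoch limit are illustrative choices.

```python
import random

def train_adaline(samples, targets, rate=0.2, tol=1e-3, max_epochs=100):
    n = len(samples[0])
    weights = [random.uniform(-0.1, 0.1) for _ in range(n)]   # Step 1: small random values
    bias = random.uniform(-0.1, 0.1)
    for _ in range(max_epochs):                               # Step 2
        biggest_change = 0.0
        for x, t in zip(samples, targets):                    # Steps 3-4
            y_in = bias + sum(w * xi for w, xi in zip(weights, x))   # Step 5
            delta = t - y_in
            for i in range(n):                                # Step 6
                change = rate * delta * x[i]
                weights[i] += change
                biggest_change = max(biggest_change, abs(change))
            bias += rate * delta
            biggest_change = max(biggest_change, abs(rate * delta))
        if biggest_change < tol:                              # Step 7: stopping condition
            break
    return weights, bias
```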
Applications
• Adaptive filtering
• Pattern recognition
Adaline Network Example
• Develop an Adaline network for the given
function with bipolar inputs and bipolar
targets.
Solution
• Assume small random values for the initial
weights and bias:
• w1 = w2 = b = 0.2
• Learning rate α = 0.2
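The target function itself is not reproduced in this extract; assuming it is the two-input AND function with bipolar inputs and targets (a common choice with these initial values), one epoch of Delta-rule updates can be traced as below. The training set shown is that assumption, not taken from the slides.

```python
samples = [(1, 1), (1, -1), (-1, 1), (-1, -1)]   # assumed bipolar AND inputs
targets = [1, -1, -1, -1]                        # assumed bipolar AND targets

w1 = w2 = b = 0.2    # given initial weights and bias
alpha = 0.2          # given learning rate

for (x1, x2), t in zip(samples, targets):
    y_in = b + x1 * w1 + x2 * w2
    delta = t - y_in
    w1 += alpha * delta * x1
    w2 += alpha * delta * x2
    b += alpha * delta
    print(f"x=({x1:2d},{x2:2d}) t={t:2d} y_in={y_in:6.3f} "
          f"w1={w1:.3f} w2={w2:.3f} b={b:.3f} E={delta**2:.3f}")
```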
Madaline Network
• A Madaline is a combination of Adalines.

• If Adalines are combined such that the outputs of some of
them become inputs for others, the net becomes a
multilayer network. This forms the Madaline.

• Madaline has two training algorithms:
1. MRI
2. MRII
Madaline Architecture
• The architecture is explained with 2 input neurons, 2 hidden
neurons and 1 output neuron.
• It is a multilayer feed-forward network.
• Both hidden units have separate bias connections, along with
input connections.
MRI Algorithm
• Only the weights of the hidden Adaline units are adjusted; the
weights of the output unit are fixed.
• First, the weights between the input and the hidden
units are initialized (small positive random values).
• The input is presented, and based on the weighted interconnections the
net input is calculated for both the hidden units.
• Then, by applying the activations, the outputs are obtained for both the
hidden units.
• Using the hidden-unit outputs as the input to the output layer, together with
the fixed weighted interconnections between the hidden and output layers,
the net input to the output neuron is found. Applying the activation to this
net input, the final output of the net is calculated.
• This output is then compared with the target, and suitable weight updates
are performed.
MRI Algorithm Steps
• The weights into y, v1 and v2 are fixed as 0.5 with bias b3 as 0.5.
• The activation function for units z1, z2 and y is given by

f(p) = 1, if p >= 0
     = -1, if p < 0
1. Initialize weights, bias and set learning rate as α.
v1=v2=0.5 and b3=0.5. Other weights may be small random values.
2. While the stopping condition is false, do Steps 3-9.
3. For each bipolar training pair s:t, do steps 4-8.
4. Set activations of input units:
xi=si for i=1 to n
5. Calculate net input of hidden Adaline units.
z-in1 = b1 + x1w11 + x2w21
z-in2 = b2 + x1w12 + x2w22
MRI Algorithm Steps (cont…)
6. Find output of hidden Adaline unit using activation mentioned above.
z1 = f(z-in1)
z2 = f(z-in2)
7. Calculate net input to output
Y-in = b3 + z1v1 + z2v2
Apply activation to get the output of net.
Y = f (y-in)
8. Find the error and update the weights.
If t = y, no weight updating is performed.
If t != y, then:
If t = 1, update the weights on zj, the unit whose net input is closest to 0:
wij(new) = wij(old) + α*(1-zinj)*xi
bj(new) = bj(old) + α*(1-zinj)
If t = -1, update the weights on all units zk which have positive net input:
wik(new) = wik(old) + α*(-1-zink)*xi
bk(new) = bk(old) + α*(-1-zink)

9. Test for the stopping condition.

NOTE: The stopping condition may be that the weight changes have become
negligible, or that a set number of epochs has been reached, etc.
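A rough sketch of Steps 1-9 for the 2-2-1 architecture described earlier, with the output weights fixed at v1 = v2 = b3 = 0.5. The learning-rate value, the random initialisation and all names are illustrative; the demo data is the bipolar XOR set used in the example that follows.

```python
import random

def bipolar_step(p):
    # Activation for z1, z2 and y: +1 if p >= 0, else -1.
    return 1 if p >= 0 else -1

def train_madaline_mri(samples, targets, rate=0.5, max_epochs=50):
    # Step 1: hidden weights/biases get small random values; output weights are fixed.
    w = [[random.uniform(0.0, 0.1) for _ in range(2)] for _ in range(2)]  # w[i][j]: input i -> hidden j
    b = [random.uniform(0.0, 0.1) for _ in range(2)]
    v1 = v2 = b3 = 0.5
    for _ in range(max_epochs):                                 # Step 2
        changed = False
        for x, t in zip(samples, targets):                      # Steps 3-4
            z_in = [b[j] + x[0] * w[0][j] + x[1] * w[1][j] for j in range(2)]  # Step 5
            z = [bipolar_step(p) for p in z_in]                 # Step 6
            y = bipolar_step(b3 + z[0] * v1 + z[1] * v2)        # Step 7
            if t == y:                                          # Step 8: no update needed
                continue
            changed = True
            if t == 1:
                j = min(range(2), key=lambda k: abs(z_in[k]))   # unit whose net input is closest to 0
                for i in range(2):
                    w[i][j] += rate * (1 - z_in[j]) * x[i]
                b[j] += rate * (1 - z_in[j])
            else:                                               # t == -1
                for j in range(2):
                    if z_in[j] > 0:                             # all units with positive net input
                        for i in range(2):
                            w[i][j] += rate * (-1 - z_in[j]) * x[i]
                        b[j] += rate * (-1 - z_in[j])
        if not changed:                                         # Step 9
            break
    return w, b

xor_samples = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
xor_targets = [-1, 1, 1, -1]
print(train_madaline_mri(xor_samples, xor_targets))
```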
Example
2. Form a Madaline network for the XOR function with bipolar inputs and targets
using the MRI algorithm.
Solution
Given architecture
• In Epoch 3, t = y for all training pairs. If further
iterations are performed, the weights remain the
same, so no further processing is needed.
