Lecture 3: ANN
Graphical solution

Target values:

Learning rate $\alpha = 1$.

Dimension check: a $(2\times 2)$ weight matrix applied to a $(2\times 1)$ input vector yields a $(2\times 1)$ output: $(2\times 2)(2\times 1) = (2\times 1)$.
At this point the algorithm has converged, since all input patterns will be
correctly classified.
Now we are almost ready to train an ADALINE network using the LMS rule. We will use a learning rate of 0.04, and we will present the input vectors in order according to their subscripts.
First iteration:
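As a sketch of what each LMS iteration computes (the patterns and targets below are hypothetical, not the lecture's; the update $\Delta \mathbf{w} = \alpha\, e\, \mathbf{x}$ follows from the $\tfrac{1}{2}$-scaled cost used later in this lecture):

```python
import numpy as np

# Hypothetical training pairs for illustration (not the lecture's data).
patterns = [np.array([1.0, -1.0]), np.array([-1.0, 1.0])]
targets = [1.0, -1.0]

alpha = 0.04        # learning rate from the slide
w = np.zeros(2)     # initial weights
b = 0.0             # initial bias

# One epoch: present the input vectors in order of their subscripts.
for x, t in zip(patterns, targets):
    y = w @ x + b            # ADALINE output (linear)
    e = t - y                # error e = t - y
    w = w + alpha * e * x    # LMS update (some texts scale this by 2*alpha)
    b = b + alpha * e

print(w, b)
```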
Activation Surfaces

[Figure: three decision lines L1, L2, L3 in the x-y input plane (boundaries of the form x − 1 = 0, y − 1 = 0, x + y − 4 = 0) defining the activation surface, with a small truth table of the resulting activations]
Activation Surfaces

[Figure: each region of the x-y plane receives a region code (010, 011, 110, 111, ...) according to which side of L1, L2, L3 it lies on; a second-layer unit z, with decision line L4, then labels each region z = 0 or z = 1]
[Figure: the second-layer unit z combines the three first-layer outputs with weights 1, 1, 1 and threshold 2.5, giving z = 1 inside the region bounded by L1, L2, L3 and z = 0 outside]
MULTILAYER PERCEPTRON

[Figure: a multilayer perceptron with input layer x1, x2, ..., xm, a hidden layer, and output layer y1, y2, ..., yn]
Example: XOR

[Figure: the four XOR patterns in the x1-x2 plane, with axis ticks at 0 and 1]

XOR is not linearly separable. Is a single-layer perceptron workable?
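A minimal sketch of why one hidden layer fixes this (the weights below are hand-picked for illustration, not taken from the slides): two threshold units computing OR and NAND feed an AND output unit, which together realize XOR.

```python
import numpy as np

def step(v):
    # hard-limit (threshold) activation
    return (v >= 0).astype(int)

# Hypothetical hand-picked weights (not the lecture's): the hidden units
# compute OR(x1, x2) and NAND(x1, x2); the output unit ANDs them -> XOR.
W1 = np.array([[ 1.0,  1.0],    # OR unit
               [-1.0, -1.0]])   # NAND unit
b1 = np.array([-0.5, 1.5])
W2 = np.array([1.0, 1.0])
b2 = -1.5

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    h = step(W1 @ np.array(x) + b1)   # hidden layer output
    y = step(W2 @ h + b2)             # output unit: AND of the hidden bits
    print(x, "->", y)                 # prints 0, 1, 1, 0: XOR
```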
Is it linearly separable? Is a single-layer perceptron workable?
The weight matrix and bias vector for the first layer are
The weight matrix and bias vector for the second layer are
The weight matrix and bias vector for the third layer are
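To make the three-layer computation concrete, here is a sketch of the forward pass through three threshold layers (the values of W1, b1, W2, b2, W3, b3 below are placeholders, not the lecture's):

```python
import numpy as np

def hardlim(v):
    # threshold activation used by the perceptron layers
    return (v >= 0).astype(float)

# Placeholder weights/biases for a 2-2-2-1 network (not the lecture's values).
W1, b1 = np.array([[1.0, 0.0], [0.0, 1.0]]), np.array([-0.5, -0.5])
W2, b2 = np.array([[1.0, 1.0], [1.0, -1.0]]), np.array([-1.5, 0.5])
W3, b3 = np.array([[1.0, 1.0]]), np.array([-0.5])

def forward(x):
    a1 = hardlim(W1 @ x + b1)    # first layer
    a2 = hardlim(W2 @ a1 + b2)   # second layer
    a3 = hardlim(W3 @ a2 + b3)   # third (output) layer
    return a3

print(forward(np.array([1.0, 1.0])))
```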
δ-LEARNING RULE

Adaline

[Figure: a single Adaline unit with inputs x1, x2, ..., xm and weights w_i1, w_i2, ..., w_im; the net input is $\mathbf{w}_i^T\mathbf{x}^{(k)}$ and the output is linear:]

$$y_i^{(k)} = \mathbf{w}_i^T\mathbf{x}^{(k)}$$
Unipolar Sigmoid

[Figure: the same unit with a sigmoid activation, $y_i^{(k)} = a(\mathbf{w}_i^T\mathbf{x}^{(k)})$]

$$a(net_i) = \frac{1}{1 + e^{-net_i}}$$

Bipolar Sigmoid

[Figure: as above, with $y_i^{(k)} = a(\mathbf{w}_i^T\mathbf{x}^{(k)})$]

$$a(net_i) = \frac{2}{1 + e^{-net_i}} - 1$$
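In code, the two activation functions are a direct transcription of the formulas above:

```python
import numpy as np

def unipolar_sigmoid(net):
    # a(net) = 1 / (1 + e^(-net)), output in (0, 1)
    return 1.0 / (1.0 + np.exp(-net))

def bipolar_sigmoid(net):
    # a(net) = 2 / (1 + e^(-net)) - 1, output in (-1, 1); equals tanh(net/2)
    return 2.0 / (1.0 + np.exp(-net)) - 1.0

net = np.linspace(-5, 5, 5)
print(unipolar_sigmoid(net))
print(bipolar_sigmoid(net))
```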
Minimize

$$E(\mathbf{w}) = \frac{1}{2}\sum_{k=1}^{p}\left(d^{(k)} - y^{(k)}\right)^2 = \frac{1}{2}\sum_{k=1}^{p}\left(d^{(k)} - a(\mathbf{w}^T\mathbf{x}^{(k)})\right)^2$$

The gradient is

$$\nabla_{\mathbf{w}} E(\mathbf{w}) = \left[\frac{\partial E(\mathbf{w})}{\partial w_1}, \frac{\partial E(\mathbf{w})}{\partial w_2}, \ldots, \frac{\partial E(\mathbf{w})}{\partial w_m}\right]^T$$

and the steepest-descent update is

$$\mathbf{w} \leftarrow \mathbf{w} - \eta\,\nabla_{\mathbf{w}} E(\mathbf{w})$$
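A minimal sketch of this steepest-descent loop for the Adaline case, where $a(net) = net$ (the dataset below is hypothetical):

```python
import numpy as np

# Hypothetical data: p = 4 patterns, m = 2 inputs (not the lecture's values).
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])   # rows are x^(k)
d = np.array([0., 1., 1., 2.])                            # targets d^(k)

eta = 0.1
w = np.zeros(2)

for _ in range(200):
    y = X @ w                # Adaline: y^(k) = w^T x^(k)
    grad = -(d - y) @ X      # dE/dw_j = -sum_k (d^(k) - y^(k)) x_j^(k)
    w = w - eta * grad       # steepest-descent step
print(w)                     # converges to [1, 1] for this data
```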
The Gradient

Minimize

$$E(\mathbf{w}) = \frac{1}{2}\sum_{k=1}^{p}\left(d^{(k)} - y^{(k)}\right)^2, \qquad y^{(k)} = a\!\left(\mathbf{w}^T\mathbf{x}^{(k)}\right)$$

$$\frac{\partial E(\mathbf{w})}{\partial w_j} = -\sum_{k=1}^{p}\left(d^{(k)} - y^{(k)}\right)\frac{\partial y^{(k)}}{\partial w_j} = -\sum_{k=1}^{p}\left(d^{(k)} - y^{(k)}\right)\frac{\partial a(net^{(k)})}{\partial net^{(k)}}\,\frac{\partial net^{(k)}}{\partial w_j}$$

The first derivative factor depends on the activation function used. For the second factor, since

$$net^{(k)} = \mathbf{w}^T\mathbf{x}^{(k)} = \sum_{i=1}^{m} w_i x_i^{(k)},$$

we get

$$\frac{\partial net^{(k)}}{\partial w_j} = x_j^{(k)}$$
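A quick finite-difference check of this chain-rule gradient (a sketch using the unipolar sigmoid, whose derivative $y(1-y)$ is derived below; the data is random and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))        # hypothetical patterns x^(k)
d = rng.normal(size=5)             # hypothetical targets d^(k)
w = rng.normal(size=3)

a = lambda net: 1 / (1 + np.exp(-net))           # unipolar sigmoid
E = lambda w: 0.5 * np.sum((d - a(X @ w))**2)    # the cost above

# Analytic gradient: dE/dw_j = -sum_k (d - y) * y(1 - y) * x_j
y = a(X @ w)
analytic = -((d - y) * y * (1 - y)) @ X

# Central finite differences along each coordinate
eps = 1e-6
numeric = np.array([(E(w + eps * np.eye(3)[j]) - E(w - eps * np.eye(3)[j]))
                    / (2 * eps) for j in range(3)])
print(np.allclose(analytic, numeric, atol=1e-6))   # True
```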
With $y^{(k)} = a(net^{(k)})$ and $\delta^{(k)} = d^{(k)} - y^{(k)}$, minimizing $E(\mathbf{w}) = \frac{1}{2}\sum_{k=1}^{p}(d^{(k)} - y^{(k)})^2$ gives

$$\frac{\partial E(\mathbf{w})}{\partial w_j} = -\sum_{k=1}^{p}\left(d^{(k)} - y^{(k)}\right)\frac{\partial a(net^{(k)})}{\partial net^{(k)}}\,x_j^{(k)}$$

Learning rule:

Batch: $\Delta w_j = \eta \displaystyle\sum_{k=1}^{p} \delta^{(k)}\,\frac{\partial a(net^{(k)})}{\partial net^{(k)}}\,x_j^{(k)}$

Incremental: $\Delta w_j = \eta\,\delta^{(k)}\,\dfrac{\partial a(net^{(k)})}{\partial net^{(k)}}\,x_j^{(k)}$
With $y^{(k)} = a(net^{(k)})$, minimizing $E(\mathbf{w}) = \frac{1}{2}\sum_{k=1}^{p}(d^{(k)} - y^{(k)})^2$ gives

$$\frac{\partial E(\mathbf{w})}{\partial w_j} = -\sum_{k=1}^{p}\left(d^{(k)} - y^{(k)}\right)\frac{\partial a(net^{(k)})}{\partial net^{(k)}}\,x_j^{(k)}$$

The activation-dependent factor:

                  Adaline    Unipolar Sigmoid        Bipolar Sigmoid
  a(net)          net        1/(1 + e^{-net})        2/(1 + e^{-net}) - 1
  da(net)/dnet    1          y^{(k)}(1 - y^{(k)})    Exercise
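A quick numeric confirmation of the unipolar column (a sketch; it checks $\partial a/\partial net = y(1-y)$ by central differences):

```python
import numpy as np

a = lambda net: 1 / (1 + np.exp(-net))   # unipolar sigmoid

net = np.linspace(-4, 4, 9)
y = a(net)
eps = 1e-6
numeric = (a(net + eps) - a(net - eps)) / (2 * eps)   # finite-difference slope
print(np.allclose(numeric, y * (1 - y)))              # True: da/dnet = y(1-y)
```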
For the unipolar sigmoid, with $\delta^{(k)} = d^{(k)} - y^{(k)}$:

$$\frac{\partial E(\mathbf{w})}{\partial w_j} = -\sum_{k=1}^{p}\left(d^{(k)} - y^{(k)}\right)x_j^{(k)}\,y^{(k)}\!\left(1 - y^{(k)}\right) = -\sum_{k=1}^{p}\delta^{(k)}\,x_j^{(k)}\,y^{(k)}\!\left(1 - y^{(k)}\right)$$

Weight Modification Rule:

$$\Delta w_j = \eta\sum_{k=1}^{p}\delta^{(k)}\,x_j^{(k)}\,y^{(k)}\!\left(1 - y^{(k)}\right)$$
Comparisons

Adaline:
  Batch:        $\Delta w_j = \eta \sum_{k=1}^{p} \delta^{(k)} x_j^{(k)}$
  Incremental:  $\Delta w_j = \eta\, \delta^{(k)} x_j^{(k)}$

Sigmoid:
  Batch:        $\Delta w_j = \eta \sum_{k=1}^{p} \delta^{(k)} x_j^{(k)}\, y^{(k)}(1 - y^{(k)})$
  Incremental:  $\Delta w_j = \eta\, \delta^{(k)} x_j^{(k)}\, y^{(k)}(1 - y^{(k)})$
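A minimal sketch contrasting the two modes for a unipolar sigmoid unit (the OR-style dataset, learning rate, and epoch counts below are arbitrary choices for illustration):

```python
import numpy as np

a = lambda net: 1 / (1 + np.exp(-net))   # unipolar sigmoid

# Hypothetical training set: learn OR; the last input column is a
# constant 1 acting as the bias input.
X = np.array([[0., 0., 1.], [0., 1., 1.], [1., 0., 1.], [1., 1., 1.]])
d = np.array([0., 1., 1., 1.])
eta = 0.5

def train_batch(epochs=2000):
    w = np.zeros(3)
    for _ in range(epochs):
        y = a(X @ w)
        w += eta * ((d - y) * y * (1 - y)) @ X   # one step per epoch (sum over k)
    return w

def train_incremental(epochs=2000):
    w = np.zeros(3)
    for _ in range(epochs):
        for x, t in zip(X, d):                   # one step per pattern
            y = a(w @ x)
            w += eta * (t - y) * y * (1 - y) * x
    return w

w_b, w_i = train_batch(), train_incremental()
print(np.round(a(X @ w_b), 2))   # batch outputs approach 0, 1, 1, 1
print(np.round(a(X @ w_i), 2))   # incremental outputs approach the same targets
```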
Adaline vs. Sigmoid

For the Adaline, $a(net) = net$ (i.e. $f(x) = x$), so

$$\frac{\partial a(net)}{\partial net} = 1 \qquad \text{(constant)}$$

For the sigmoid,

$$\frac{\partial a(net)}{\partial net} = y(1 - y) \qquad \text{(depends on the output)}$$

[Figure: plots of a(net) and a'(net) for both units; the Adaline has y = net with constant slope 1, while the sigmoid's derivative is bell-shaped]
The sigmoid will saturate if its output nears either extreme: since $\partial a(net)/\partial net = y(1 - y)$, the derivative goes to 0 as $y \to 0$ or $y \to 1$, so the weight updates vanish even when the error is large. The Adaline has $y = a(net) = net$, whose derivative is the constant 1, so it does not saturate.

[Figure: y = a(net) for both units, with the sigmoid's derivative y(1 − y) shrinking near the extremes 0 and 1]
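A tiny demonstration of the saturation effect (the net values are arbitrary):

```python
import numpy as np

a = lambda net: 1 / (1 + np.exp(-net))   # unipolar sigmoid

for net in [0.0, 2.0, 5.0, 10.0]:        # arbitrary net inputs
    y = a(net)
    print(f"net={net:5.1f}  y={y:.5f}  y*(1-y)={y * (1 - y):.6f}")
# The gradient factor y*(1-y) collapses toward 0 as y nears 1 (saturation).
```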