
Artificial Neural Networks


Lecture 3

Two-dimensional plots of basic logical operations


[Figure: three plots of the two-dimensional input space, with axes x1 and x2 and inputs at 0 and 1: (a) AND (x1 ∧ x2), (b) OR (x1 ∨ x2), (c) Exclusive-OR (x1 ⊕ x2).]

A perceptron can learn the operations AND and OR, but not Exclusive-OR: AND and OR are linearly separable, while no single straight line can separate the two classes of Exclusive-OR.
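To see this concretely, here is a minimal sketch (not from the lecture) of a single hard-limit neuron trained with the perceptron rule: it converges on AND and OR but never on XOR.

```python
import numpy as np

def train_perceptron(X, t, alpha=1.0, epochs=100):
    """Perceptron rule: w <- w + alpha*e*x, b <- b + alpha*e, with e = t - a."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in zip(X, t):
            a = 1 if w @ x + b >= 0 else 0       # hard-limit activation
            e = target - a
            w, b = w + alpha * e * x, b + alpha * e
            errors += abs(e)
        if errors == 0:
            return True                          # converged: all patterns correct
    return False                                 # never converged

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
for name, t in [("AND", [0, 0, 0, 1]), ("OR", [0, 1, 1, 1]), ("XOR", [0, 1, 1, 0])]:
    print(name, train_perceptron(X, t))          # AND True, OR True, XOR False
```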


Logical operations in a perceptron network


[Figure: perceptron decision boundaries in the (x1, x2) plane realizing the AND and OR operations.]


Example 4.3 (solved examples)


• Design a perceptron network to solve the following classification problem with four classes of input vectors. The four classes are given in the accompanying figure.

Example 4.3 (solved examples)

 Graphical solution

We can now select the weight vectors:

Then solve $\mathbf{W}^{\mathsf T}\mathbf{P} + \mathbf{b} = \mathbf{0}$ to get $\mathbf{b}$.

Target values:


Example 4.3 (solved examples)


• Solution by the perceptron learning rule, with learning rate $\alpha = 1$. Each presentation updates $\mathbf{W} \leftarrow \mathbf{W} + \alpha\,\mathbf{e}\,\mathbf{p}^{\mathsf T}$ and $\mathbf{b} \leftarrow \mathbf{b} + \alpha\,\mathbf{e}$, where $\mathbf{e} = \mathbf{t} - \mathbf{a}$.

Dimension check: $\mathbf{W}\,(2\times 2)$ times $\mathbf{p}\,(2\times 1)$ gives $\mathbf{a}\,(2\times 1)$.
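A sketch of this training loop, assuming illustrative class vectors and 2-bit target codes (the lecture's actual vectors and targets appear only in the figures):

```python
import numpy as np

def hardlim(n):
    return (n >= 0).astype(float)

# Assumed example data: 8 input vectors (columns) and 2-bit class codes.
P = np.array([[1, 1, 2, 2, -1, -2, -1, -2],
              [1, 2, -1, 0, 2, 1, -1, -2]], dtype=float)
T = np.array([[0, 0, 0, 0, 1, 1, 1, 1],
              [0, 0, 1, 1, 0, 0, 1, 1]], dtype=float)

W, b, alpha = np.zeros((2, 2)), np.zeros((2, 1)), 1.0
for epoch in range(100):
    total_error = 0.0
    for k in range(P.shape[1]):
        p = P[:, [k]]                            # 2x1 input vector
        e = T[:, [k]] - hardlim(W @ p + b)       # 2x1 error
        W += alpha * e @ p.T                     # (2x1)(1x2) -> 2x2, matches W
        b += alpha * e
        total_error += np.abs(e).sum()
    if total_error == 0:                         # every pattern classified
        break
print(W, b, sep="\n")
```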


Example 4.3 (solved examples)


New epoch

At this point the algorithm has converged, since all input patterns will be
correctly classified.

Example 4.3 (solved examples)

• Now we are almost ready to train an ADALINE network using the LMS rule. We will use a learning rate of 0.04, and we will present the input vectors in order according to their subscripts. The LMS update at each step is $\mathbf{W}(k+1) = \mathbf{W}(k) + 2\alpha\,\mathbf{e}(k)\,\mathbf{p}^{\mathsf T}(k)$ and $\mathbf{b}(k+1) = \mathbf{b}(k) + 2\alpha\,\mathbf{e}(k)$.

First iteration:
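A minimal sketch of this LMS pass, with placeholder input/target pairs since the example's actual pairs are given only in the figures:

```python
import numpy as np

alpha = 0.04
P = np.array([[1.0, 1.0],
              [-1.0, 1.0]])        # placeholder patterns p1, p2 (one per column)
T = np.array([[1.0, -1.0]])        # placeholder targets t1, t2
W = np.zeros((1, 2))
b = np.zeros((1, 1))

for k in range(P.shape[1]):        # present inputs in subscript order
    p = P[:, [k]]
    a = W @ p + b                  # ADALINE output: purelin(Wp + b)
    e = T[:, [k]] - a
    W = W + 2 * alpha * e @ p.T    # LMS: W(k+1) = W(k) + 2*alpha*e(k)*p(k)^T
    b = b + 2 * alpha * e          # LMS: b(k+1) = b(k) + 2*alpha*e(k)
    print(f"after p{k + 1}: W = {W.ravel()}, b = {b.ravel()}")
```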


Example 4.3 (solved examples)

Note that the perceptron rule stops training as soon as all the patterns are classified correctly, whereas the LMS algorithm keeps moving the boundaries as far from the patterns as possible.


Example: Activation Surfaces

[Figure: three decision lines L1, L2, L3 partition the (x, y) input plane (left); they are realized by a network with inputs x and y feeding three threshold units L1, L2, L3 (right).]


Activation Surfaces

[Figure: the same network annotated with its parameters. The boundary lines are L1: x − 1 = 0, L2: y − 1 = 0, and L3: x − y + 4 = 0, with unit thresholds θ1 = 1, θ2 = 1, θ3 = −4.]


Activation Surfaces

[Figure: the three lines partition the (x, y) plane into regions, each labeled with a 3-bit region code giving the outputs of L1, L2, L3: 010, 011, 110, 111, 001, 101, 100.]


Example: Activation Surfaces

[Figure: a second-layer unit L4 combines the outputs of L1, L2, L3 into a single output z; z = 1 inside the enclosed region and z = 0 outside.]


Example: Activation Surfaces

[Figure: the same network with second-layer parameters shown: L4 has weights (1, 1, 1) on the outputs of L1, L2, L3 and threshold θ4 = 2.5, so z = 1 exactly where all three first-layer units are active.]
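A small sketch of the computation this network performs, assuming the boundary lines and threshold reconstructed above (L1: x − 1 = 0, L2: y − 1 = 0, L3: x − y + 4 = 0, θ4 = 2.5):

```python
def step(v):                       # hard threshold
    return 1 if v >= 0 else 0

def region_code(x, y):
    l1 = step(x - 1)               # which side of L1: x - 1 = 0
    l2 = step(y - 1)               # which side of L2: y - 1 = 0
    l3 = step(x - y + 4)           # which side of L3: x - y + 4 = 0
    return l1, l2, l3

def z(x, y):
    l1, l2, l3 = region_code(x, y)
    return step(l1 + l2 + l3 - 2.5)    # L4: weights (1, 1, 1), threshold 2.5

print(region_code(2.0, 2.0), z(2.0, 2.0))   # code (1, 1, 1) -> z = 1
print(region_code(0.0, 2.0), z(0.0, 2.0))   # code (0, 1, 1) -> z = 0
```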


MULTILAYER PERCEPTRON

[Figure: a multilayer perceptron with an input layer (x1, x2, …, xm), a hidden layer, and an output layer (y1, y2, …, yn); each layer is fully connected to the next.]


Multilayer Perceptron (MLP)


How Does an MLP Work?

Example: XOR

• Not linearly separable.
• Is a single-layer perceptron workable?

[Figure: the four XOR input points in the (x1, x2) plane at (0,0), (0,1), (1,0), and (1,1); no single straight line separates the two classes.]


MLP : Pattern Classification

The input/target pairs for the XOR gate are
$$\left\{\mathbf{p}_1 = \begin{bmatrix}0\\0\end{bmatrix},\ t_1 = 0\right\},\ \left\{\mathbf{p}_2 = \begin{bmatrix}0\\1\end{bmatrix},\ t_2 = 1\right\},\ \left\{\mathbf{p}_3 = \begin{bmatrix}1\\0\end{bmatrix},\ t_3 = 1\right\},\ \left\{\mathbf{p}_4 = \begin{bmatrix}1\\1\end{bmatrix},\ t_4 = 0\right\}$$

A two-layer network can solve the XOR problem:
• One solution is to use two neurons in the first layer to create two decision boundaries (see the sketch below).
• The second layer then combines the two boundaries with an AND operation.

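A runnable sketch of such a two-layer hard-limit network; these particular weights (an OR boundary and a NAND boundary, ANDed in the second layer) are one standard choice and not necessarily the ones on the slides:

```python
import numpy as np

def hardlim(n):
    return (n >= 0).astype(float)

W1 = np.array([[1.0, 1.0],     # boundary 1: x1 + x2 - 0.5 = 0  (OR)
               [-1.0, -1.0]])  # boundary 2: -x1 - x2 + 1.5 = 0 (NAND)
b1 = np.array([[-0.5], [1.5]])
W2 = np.array([[1.0, 1.0]])    # second layer: AND of the two boundaries
b2 = np.array([[-1.5]])

for p in ([0, 0], [0, 1], [1, 0], [1, 1]):
    p = np.array(p, dtype=float).reshape(2, 1)
    a1 = hardlim(W1 @ p + b1)           # first-layer outputs
    a2 = hardlim(W2 @ a1 + b2)          # AND of the boundaries
    print(p.ravel(), "->", int(a2[0, 0]))   # prints 0, 1, 1, 0: the XOR table
```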



Example 2 : Pattern Classification

• Design a multilayer network to correctly classify the two different classes shown in the figure. Class I vectors are represented by light circles, and Class II vectors are represented by dark circles.

• Is it linearly separable?
• Is a single-layer perceptron workable?
• How many layers?
• How many neurons in each layer?

Example 2 : Pattern Classification

• The classes are not linearly separable, so a single-layer perceptron is not workable. One network that solves the problem has three layers: 11 neurons in the first layer, 4 neurons in the second layer, and 1 neuron in the third layer.

Example 2 : Pattern Classification

 The weight matrix and bias vector for the first layer are


Example 2 : Pattern Classification

 The weight matrix and bias vector for the second layer are

 The weight matrix and bias vector for the third layer are



δ-LEARNING RULE


Adaline

[Figure: Adaline unit with inputs x1, …, xm, weights wi1, …, wim, a summing junction, and a linear output]

$$y_i^{(k)} = \mathbf{w}_i^{\mathsf T}\,\mathbf{x}^{(k)}$$


Unipolar Sigmoid

[Figure: neuron with inputs x1, …, xm, weights wi1, …, wim, net input $net_i = \mathbf{w}_i^{\mathsf T}\mathbf{x}^{(k)}$, and a unipolar sigmoid output]

$$y_i^{(k)} = a\!\left(\mathbf{w}_i^{\mathsf T}\,\mathbf{x}^{(k)}\right), \qquad a(net_i) = \frac{1}{1 + e^{-net_i}}$$


Bipolar Sigmoid

[Figure: the same unit with a bipolar sigmoid activation]

$$y_i^{(k)} = a\!\left(\mathbf{w}_i^{\mathsf T}\,\mathbf{x}^{(k)}\right), \qquad a(net_i) = \frac{2}{1 + e^{-net_i}} - 1$$


Gradient Descent Algorithm

Minimize
$$E(\mathbf{w}) = \frac{1}{2}\sum_{k=1}^{p}\left(d^{(k)} - y^{(k)}\right)^2 = \frac{1}{2}\sum_{k=1}^{p}\left(d^{(k)} - a\!\left(\mathbf{w}^{\mathsf T}\mathbf{x}^{(k)}\right)\right)^2$$

$$\nabla_{\mathbf{w}} E(\mathbf{w}) = \left(\frac{\partial E(\mathbf{w})}{\partial w_1},\ \frac{\partial E(\mathbf{w})}{\partial w_2},\ \ldots,\ \frac{\partial E(\mathbf{w})}{\partial w_m}\right)^{\mathsf T}$$

$$\Delta\mathbf{w} = -\eta\,\nabla_{\mathbf{w}} E(\mathbf{w})$$
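A minimal sketch of this batch gradient descent loop for a unipolar sigmoid unit; the data (an OR problem), learning rate, and epoch count are illustrative assumptions:

```python
import numpy as np

def unipolar(net):
    return 1.0 / (1.0 + np.exp(-net))

eta = 0.5
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])  # patterns (rows)
d = np.array([0.0, 1.0, 1.0, 1.0])                              # targets (OR)
w = np.zeros(2); b = 0.0

for epoch in range(2000):
    y = unipolar(X @ w + b)                 # y^(k) = a(w^T x^(k))
    delta = (d - y) * y * (1.0 - y)         # (d - y) * da/dnet per pattern
    grad_w = -X.T @ delta                   # dE/dw_j = -sum_k delta^(k) x_j^(k)
    grad_b = -delta.sum()
    w -= eta * grad_w                       # w <- w - eta * grad E(w)
    b -= eta * grad_b
print(np.round(unipolar(X @ w + b)))        # should approach [0, 1, 1, 1]
```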

The Gradient

1 p (k )
Minimize E (w)   (d  y ( k ) ) 2
y ( k )  a  wT x( k ) 
2 k 1
E (w ) p
y ( k )
  (d ( k )  y ( k ) ) Depends on the
w j k 1 w j activation function
used.

p a  net ( k )  net ( k )
  ( d (k )
y )
(k )

k 1 net ( k ) w j m
net ( k )  wT x( k )   wi xi( k )
? ? i 1

net (k )
  x (jk )
w j

Weight Modification Rule

With $y^{(k)} = a\!\left(net^{(k)}\right)$ and $\delta^{(k)} = d^{(k)} - y^{(k)}$, minimize
$$E(\mathbf{w}) = \frac{1}{2}\sum_{k=1}^{p}\left(d^{(k)} - y^{(k)}\right)^2, \qquad \frac{\partial E(\mathbf{w})}{\partial w_j} = -\sum_{k=1}^{p}\left(d^{(k)} - y^{(k)}\right)\frac{\partial a\!\left(net^{(k)}\right)}{\partial net^{(k)}}\,x_j^{(k)}$$

Batch learning rule:
$$\Delta w_j = \eta\sum_{k=1}^{p}\delta^{(k)}\,\frac{\partial a\!\left(net^{(k)}\right)}{\partial net^{(k)}}\,x_j^{(k)}$$

Incremental learning rule:
$$\Delta w_j = \eta\,\delta^{(k)}\,\frac{\partial a\!\left(net^{(k)}\right)}{\partial net^{(k)}}\,x_j^{(k)}$$

Weight Modification Rule

With $y^{(k)} = a\!\left(net^{(k)}\right)$, minimize $E(\mathbf{w}) = \frac{1}{2}\sum_{k=1}^{p}\left(d^{(k)} - y^{(k)}\right)^2$, where
$$\frac{\partial E(\mathbf{w})}{\partial w_j} = -\sum_{k=1}^{p}\left(d^{(k)} - y^{(k)}\right)\frac{\partial a\!\left(net^{(k)}\right)}{\partial net^{(k)}}\,x_j^{(k)}$$

$$\text{Adaline: } a(net) = net, \qquad \frac{\partial a(net)}{\partial net} = 1$$
$$\text{Unipolar sigmoid: } a(net) = \frac{1}{1 + e^{-net}}, \qquad \frac{\partial a(net)}{\partial net} = y^{(k)}\left(1 - y^{(k)}\right)$$
$$\text{Bipolar sigmoid: } a(net) = \frac{2}{1 + e^{-net}} - 1, \qquad \frac{\partial a(net)}{\partial net} = \text{(exercise)}$$

Learning Rule: Unipolar Sigmoid

Define $\delta^{(k)} = d^{(k)} - y^{(k)}$ and minimize $E(\mathbf{w}) = \frac{1}{2}\sum_{k=1}^{p}\left(d^{(k)} - y^{(k)}\right)^2$:
$$\frac{\partial E(\mathbf{w})}{\partial w_j} = -\sum_{k=1}^{p}\left(d^{(k)} - y^{(k)}\right)x_j^{(k)}\,y^{(k)}\left(1 - y^{(k)}\right) = -\sum_{k=1}^{p}\delta^{(k)}\,x_j^{(k)}\,y^{(k)}\left(1 - y^{(k)}\right)$$

Weight modification rule:
$$\Delta w_j = \eta\sum_{k=1}^{p}\delta^{(k)}\,x_j^{(k)}\,y^{(k)}\left(1 - y^{(k)}\right)$$

Comparisons

Adaline:
$$\text{Batch: } \Delta w_j = \eta\sum_{k=1}^{p}\delta^{(k)}\,x_j^{(k)}, \qquad \text{Incremental: } \Delta w_j = \eta\,\delta^{(k)}\,x_j^{(k)}$$

Sigmoid:
$$\text{Batch: } \Delta w_j = \eta\sum_{k=1}^{p}\delta^{(k)}\,x_j^{(k)}\,y^{(k)}\left(1 - y^{(k)}\right), \qquad \text{Incremental: } \Delta w_j = \eta\,\delta^{(k)}\,x_j^{(k)}\,y^{(k)}\left(1 - y^{(k)}\right)$$
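The four rules side by side as code; the arrays and learning rate are placeholders:

```python
import numpy as np

def adaline_updates(X, d, w, eta):
    y = X @ w                               # linear output
    delta = d - y
    batch_dw = eta * X.T @ delta            # sum over all k at once
    incr_dw_k0 = eta * delta[0] * X[0]      # single pattern, k = 0
    return batch_dw, incr_dw_k0

def sigmoid_updates(X, d, w, eta):
    y = 1.0 / (1.0 + np.exp(-(X @ w)))
    delta = d - y
    factor = delta * y * (1.0 - y)          # extra y(1 - y) derivative factor
    batch_dw = eta * X.T @ factor
    incr_dw_k0 = eta * factor[0] * X[0]
    return batch_dw, incr_dw_k0

X = np.array([[1.0, 2.0], [2.0, 1.0]])      # placeholder patterns (rows)
d = np.array([1.0, 0.0])                    # placeholder targets
w = np.zeros(2)
print(adaline_updates(X, d, w, 0.1))
print(sigmoid_updates(X, d, w, 0.1))
```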


The Learning Efficacy

[Figure: activation curves. Adaline: y = a(net) = net, a straight line. Sigmoid: y = a(net), an S-shaped curve.]

$$\text{Adaline: } \frac{\partial a(net)}{\partial net} = 1 \ \text{(constant)}, \qquad \text{Sigmoid: } \frac{\partial a(net)}{\partial net} = y(1 - y) \ \text{(depends on the output)}$$

The Learning Efficacy

• The learning efficacy of the Adaline is constant, meaning the Adaline never saturates.

[Figure: y = a(net) = net and its derivative a′(net): $\frac{\partial a(net)}{\partial net} = 1$, a horizontal line, constant for all net.]

The Learning Efficacy

• The sigmoid saturates when its output approaches either extreme, because its derivative $\frac{\partial a(net)}{\partial net} = y(1 - y)$ vanishes as $y \to 0$ or $y \to 1$, whereas the Adaline's derivative stays constant at 1.

[Figure: the linear and sigmoid activations side by side with their derivatives; a′(net) = y(1 − y) peaks at y = 0.5 and falls to 0 at y = 0 and y = 1.]

The Learning Efficacy

• Initialization for Sigmoid Neurons

Before training, the weights must be initialized sufficiently small to keep the neuron out of the saturation region.

[Figure: sigmoid neuron with inputs x1, …, xm, weights wi1, …, wim, and output $y_i^{(k)} = a\!\left(\mathbf{w}_i^{\mathsf T}\,\mathbf{x}^{(k)}\right)$.]
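One common way to meet this requirement, sketched below; the uniform range ±0.1 is an illustrative assumption, not a value from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)
m = 8                                   # number of inputs
w = rng.uniform(-0.1, 0.1, size=m)      # small random initial weights (assumed range)
b = 0.0
x = rng.uniform(-1.0, 1.0, size=m)      # a typical input vector
net = w @ x + b                         # |net| stays near 0
y = 1.0 / (1.0 + np.exp(-net))
print(net, y * (1 - y))                 # derivative y(1 - y) near its 0.25 maximum
```

Keeping |net| near zero starts every neuron where the sigmoid's derivative is largest, so learning begins at full speed rather than in a saturated region.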