
Single-Layer Perceptrons

(3.7 ~ 3.9)
Learning-Rate Annealing Schedules (I)
- When the learning rate is large, the trajectory may follow a zigzagging path
- When it is small, the procedure may be slow
- Simplest learning-rate parameter: a constant
  η(n) = η₀   (η₀ a constant)
- Stochastic approximation: time-varying
  η(n) = c/n   (c a constant)
  - when c is large, there is a danger of parameter blowup for small n
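A minimal sketch of these two schedules in Python; the values of eta0 and c below are illustrative assumptions, not taken from the slides.

def constant_schedule(n, eta0=0.1):
    # Simplest choice: the learning rate stays fixed at eta0 for all n.
    return eta0

def stochastic_approximation_schedule(n, c=1.0):
    # Time-varying c/n schedule; a large c risks parameter blowup for small n.
    return c / n

for n in (1, 10, 100):
    print(n, constant_schedule(n), stochastic_approximation_schedule(n))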
Learning-Rate Annealing Schedules (II)
- Search-then-converge schedule
  η(n) = η₀ / (1 + n/τ)   (η₀, τ are constants)
  - in the early stages, the learning-rate parameter is approximately equal to η₀
  - for a number of iterations n large compared with the search time constant τ, the learning-rate parameter approximates c/n
Perceptron (I)
- Goal
  - classify the applied inputs x₁, x₂, ..., x_m into one of two classes
- Procedure
  - if the output of the hard limiter is +1, assign the input to class C₁; if it is -1, assign it to class C₂
  - input of the hard limiter: weighted sum of the inputs
    v = Σ_{i=1}^{m} w_i x_i + b
  - the effect of the bias b is merely to shift the decision boundary away from the origin
  - the synaptic weights are adapted on an iteration-by-iteration basis
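A minimal sketch of this forward pass; the weights, bias, and input below are illustrative assumptions.

def perceptron_output(x, w, b):
    # v = sum_i w_i * x_i + b, passed through the hard limiter.
    v = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if v > 0 else -1   # +1 -> class C1, -1 -> class C2

print(perceptron_output([1.0, 2.0], w=[0.5, -0.25], b=0.1))   # prints 1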
Perceptron (II)
- Decision regions are separated by a hyperplane
  Σ_{i=1}^{m} w_i x_i + b = 0
  - a point (x₁, x₂) above the boundary line is assigned to class C₁
  - a point (y₁, y₂) below the boundary line is assigned to class C₂
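For the two-input case the boundary is the line w₁x₁ + w₂x₂ + b = 0; a small sketch, with weights, bias, and test points that are made-up assumptions:

w1, w2, b = 1.0, 1.0, -1.0

def assign_class(p):
    # Which side of the boundary line w1*x1 + w2*x2 + b = 0 the point p falls on.
    return "C1" if w1 * p[0] + w2 * p[1] + b > 0 else "C2"

print(assign_class((1.0, 1.0)))   # above the line -> C1
print(assign_class((0.0, 0.0)))   # below the line -> C2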
Perceptron Convergence Theorem (I)
- Linearly separable
  - if two classes are linearly separable, there exists a decision surface consisting of a hyperplane
  - if so, there exists a weight vector w such that
    wᵀx > 0 for every input vector x belonging to class C₁
    wᵀx ≤ 0 for every input vector x belonging to class C₂
  - only for linearly separable classes does the perceptron work well
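A small check of this condition for a candidate weight vector; the vectors and w below are illustrative assumptions, not data from the slides.

def separates(w, class1, class2):
    # True if w^T x > 0 for every x in class1 and w^T x <= 0 for every x in class2.
    dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
    return all(dot(w, x) > 0 for x in class1) and all(dot(w, x) <= 0 for x in class2)

C1 = [(2.0, 1.0), (1.5, 2.0)]
C2 = [(-1.0, -0.5), (-2.0, 0.0)]
print(separates((1.0, 1.0), C1, C2))   # True for this toy data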
Perceptron Convergence Theorem (II)
- Using the modified signal-flow graph
  - the bias b(n) is treated as a synaptic weight driven by a fixed input +1
  - w₀(n) is b(n)
  - linear combiner output
    v(n) = Σ_{i=0}^{m} w_i(n) x_i(n) = wᵀ(n) x(n)
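A minimal sketch of the augmented linear combiner, with the bias folded in as w₀ driven by x₀ = +1; the numeric values are illustrative assumptions.

def linear_combiner(x, w_aug):
    # w_aug = [b, w1, ..., wm]; the input is augmented with x0 = +1.
    x_aug = [1.0] + list(x)
    return sum(wi * xi for wi, xi in zip(w_aug, x_aug))   # v(n) = w^T(n) x(n)

print(linear_combiner([1.0, 2.0], w_aug=[0.1, 0.5, -0.25]))   # 0.1 + 0.5 - 0.5 = 0.1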
Perceptron Convergence Theorem (III)
- Weight adjustment
  - if x(n) is correctly classified, no correction is made:
    w(n+1) = w(n)   if wᵀx(n) > 0 and x(n) belongs to class C₁
    w(n+1) = w(n)   if wᵀx(n) ≤ 0 and x(n) belongs to class C₂
  - otherwise:
    w(n+1) = w(n) - η(n)x(n)   if wᵀx(n) > 0 and x(n) belongs to class C₂
    w(n+1) = w(n) + η(n)x(n)   if wᵀx(n) ≤ 0 and x(n) belongs to class C₁
  - the learning-rate parameter η(n) controls the adjustment applied to the weight vector
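A single update step implementing these four cases, as a sketch; the label convention (+1 for C₁, -1 for C₂) and the example values are assumptions.

def update_step(w, x, label, eta=1.0):
    # label = +1 for class C1, -1 for class C2; eta is the learning-rate parameter eta(n).
    dot = sum(wi * xi for wi, xi in zip(w, x))
    if label == +1 and dot <= 0:                     # misclassified C1 sample
        return [wi + eta * xi for wi, xi in zip(w, x)]
    if label == -1 and dot > 0:                      # misclassified C2 sample
        return [wi - eta * xi for wi, xi in zip(w, x)]
    return list(w)                                   # correctly classified: no change

print(update_step([0.0, 0.0], [1.0, 2.0], label=+1))   # [1.0, 2.0]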
Perceptron Convergence Theorem (IV)
- Fixed-increment convergence theorem
  - for linearly separable sets of training vectors X₁ and X₂, the perceptron converges after some n₀ iterations, in the sense that
    w(n₀) = w(n₀+1) = w(n₀+2) = ...
    is a solution vector for n₀ ≤ n_max
  - proof sketch in the case η(n) = 1, with w₀ a solution vector and α = min_{x(k)∈X₁} w₀ᵀx(k):
    n²α² / ‖w₀‖² ≤ ‖w(n+1)‖² ≤ Σ_{k=1}^{n} ‖x(k)‖² ≤ n · max_{x(k)∈X₁} ‖x(k)‖²
  - see the textbook for details


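Equating the two outer bounds gives the standard limit n_max = β‖w₀‖² / α² on the number of corrections, with β = max ‖x(k)‖². A numeric sketch of these quantities on made-up data (the vectors and w₀ are assumptions, not from the slides):

vectors = [(2.0, 1.0), (1.5, 2.0), (1.0, 0.5), (2.0, 0.0)]   # training vectors, all satisfying w0^T x > 0
w0 = (1.0, 1.0)                                              # assumed solution vector

dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
alpha = min(dot(w0, x) for x in vectors)    # smallest margin against w0
beta = max(dot(x, x) for x in vectors)      # largest squared input norm
n_max = beta * dot(w0, w0) / alpha ** 2
print(alpha, beta, n_max)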
Summary
- Initialization
  - set w(0) = 0
- Activation
  - at time step n, activate the perceptron by applying the continuous-valued input vector x(n) and the desired response d(n)
- Computation of actual response
  y(n) = sgn[wᵀ(n)x(n)]
- Adaptation of weight vector
  w(n+1) = w(n) + η[d(n) - y(n)]x(n)
  d(n) = +1 if x(n) belongs to class C₁, -1 if x(n) belongs to class C₂
- Continuation
  - increment the time step n and go back to step 2
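Putting the five steps together, a minimal runnable sketch; the training data, learning rate, and epoch count are illustrative assumptions.

def sgn(v):
    return 1 if v > 0 else -1

def train_perceptron(samples, eta=1.0, epochs=20):
    # samples: list of (x, d) with x the augmented input [+1, x1, ..., xm]
    # and d the desired response (+1 for C1, -1 for C2).
    w = [0.0] * len(samples[0][0])                    # step 1: w(0) = 0
    for _ in range(epochs):                           # step 5: continuation
        for x, d in samples:                          # step 2: apply x(n), d(n)
            y = sgn(sum(wi * xi for wi, xi in zip(w, x)))          # step 3: y(n)
            w = [wi + eta * (d - y) * xi for wi, xi in zip(w, x)]  # step 4: adapt
    return w

data = [([1.0, 2.0, 1.0], +1), ([1.0, -1.0, -1.0], -1), ([1.0, 1.5, 0.5], +1)]
print(train_perceptron(data))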
