
CSE 473

Pattern Recognition

Instructor:
Dr. Md. Monirul Islam
Linear Classifier

Learning a Linear Classifier

[Figure: training samples of two classes in the (x1, x2) plane, separated by the decision boundary $w^T x + w_0 = 0$, where $x = [x_1, x_2]^T$ and $w = [w_1, w_2]^T$.]
The Perceptron Algorithm
– Use the training file to learn w
– Steps
  – Initialize w
  – Classify all training samples using the current w
  – Find the new w using
    $$w(t+1) = w(t) - \rho_t \sum_{x \in Y} \delta_x x$$
    where Y is the set of training samples misclassified by w(t), $\rho_t$ is the learning rate, and $\delta_x = -1$ if $x \in \omega_1$, $\delta_x = +1$ if $x \in \omega_2$
  – Repeat until w converges
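Below is a minimal NumPy sketch of the batch update above. It is illustrative only: the function name, the label convention (classes coded 1 and 2), and the fixed learning rate rho are assumptions, not part of the slides; each sample is augmented with a trailing 1 so that the bias w0 is learned as the last component of w.

```python
import numpy as np

def train_perceptron_batch(X, y, rho=0.1, max_iter=1000):
    """Batch perceptron (a minimal sketch, not the official course code).

    X : (N, l) array of feature vectors, y : (N,) array of labels in {1, 2}.
    """
    Xa = np.hstack([X, np.ones((len(X), 1))])      # augmented vectors [x, 1]
    delta = np.where(y == 1, -1.0, 1.0)            # delta_x = -1 for class 1, +1 for class 2
    w = np.zeros(Xa.shape[1])                      # initialize w
    for t in range(max_iter):
        g = Xa @ w
        misclassified = delta * g >= 0             # delta_x * w^T x >= 0 -> wrong side
        if not misclassified.any():
            break                                   # converged: every sample correctly classified
        # w(t+1) = w(t) - rho_t * sum over misclassified samples of delta_x * x
        w = w - rho * (delta[misclassified][:, None] * Xa[misclassified]).sum(axis=0)
    return w
```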
Classification by a Trained Perceptron
$g(x) = w^T x + w_0$
If g(x) > 0 : x ∈ ω1
Otherwise : x ∈ ω2

[Figure: the trained decision boundary $w^T x + w_0 = 0$ in the (x1, x2) plane, with $x = [x_1, x_2]^T$ and $w = [w_1, w_2]^T$; points on the positive side are assigned to ω1, the rest to ω2.]
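A tiny helper matching the rule above (a sketch; here w and w0 are kept separate, whereas the training sketches in this handout fold w0 into the last component of an augmented w, so you would split it back out before calling this):

```python
import numpy as np

def classify(x, w, w0):
    """Return 1 if g(x) = w^T x + w0 > 0, else 2 (the rule on this slide)."""
    return 1 if np.dot(w, x) + w0 > 0 else 2
```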
Variants of Perceptron Algorithm (1)

$w(t+1) = w(t) + \rho\, x(t)$, if $w(t)^T x(t) \le 0$ and $x(t) \in \omega_1$   (update)
$w(t+1) = w(t) - \rho\, x(t)$, if $w(t)^T x(t) \ge 0$ and $x(t) \in \omega_2$   (update)
$w(t+1) = w(t)$, otherwise   (no update)

– It is a reward and punishment type of algorithm
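A minimal online (reward-and-punishment) sketch under the same assumptions as before: NumPy, labels in {1, 2}, augmented samples, and a constant learning rate rho. The epoch-based cycling through the samples is my own choice, not prescribed by the slides.

```python
import numpy as np

def train_perceptron_reward_punish(X, y, rho=0.1, max_epochs=100):
    """Online perceptron: visit samples one at a time and correct mistakes."""
    Xa = np.hstack([X, np.ones((len(X), 1))])   # augment with 1 for the bias term
    w = np.zeros(Xa.shape[1])
    for epoch in range(max_epochs):
        updated = False
        for xt, label in zip(Xa, y):
            g = w @ xt
            if label == 1 and g <= 0:           # class-1 sample on the wrong side: reward
                w = w + rho * xt
                updated = True
            elif label == 2 and g >= 0:         # class-2 sample on the wrong side: punish
                w = w - rho * xt
                updated = True
        if not updated:                          # a full pass with no update -> converged
            break
    return w
```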
Variants of Perceptron Algorithm (2)

– Initialize the weight vector w(0)
– Define a pocket weight vector w_s and its history h_s
– Generate the next w(t+1) with the usual perceptron update. If it is better than the pocketed w_s (e.g., it misclassifies fewer training samples, or survives a longer run of correct classifications), store w(t+1) in w_s and update h_s

– It is the pocket algorithm: it keeps the best weight vector seen so far, which is useful when the classes are not linearly separable
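A minimal pocket-algorithm sketch. The concrete "better than" criterion used here (fewest misclassified training samples) is my own illustrative choice; the slides leave the criterion and the history bookkeeping open.

```python
import numpy as np

def train_perceptron_pocket(X, y, rho=0.1, max_epochs=100):
    """Pocket algorithm: run the online perceptron, keep the best w seen so far."""
    Xa = np.hstack([X, np.ones((len(X), 1))])
    delta = np.where(y == 1, -1.0, 1.0)            # -1 for class 1, +1 for class 2

    def n_errors(w):
        return int(np.sum(delta * (Xa @ w) >= 0))  # samples on the wrong side

    w = np.zeros(Xa.shape[1])
    w_pocket, best_errors = w.copy(), n_errors(w)
    for epoch in range(max_epochs):
        for xt, d in zip(Xa, delta):
            if d * (w @ xt) >= 0:                  # misclassified: apply the usual correction
                w = w - rho * d * xt
                errors = n_errors(w)
                if errors < best_errors:           # better than the pocketed vector: keep it
                    w_pocket, best_errors = w.copy(), errors
        if best_errors == 0:
            break
    return w_pocket
```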
Generalization of Perceptron Algorithm for M-Class Case
• Let there be M classes ω1, ω2, ω3, . . ., ωM
• Let there be M linear discriminant functions with weight vectors $w_i$, i = 1, . . ., M
• The object x is classified to ωi if
  $$w_i^T x > w_j^T x \quad \text{for all } j \ne i$$
  which can be written as
  $$[w_i^T - w_j^T]\, x > 0 \quad \text{for all } j \ne i$$
• This condition can be rewritten as a dot product of two long block vectors:
  $$[0^T, \ldots, 0^T, w_i^T, 0^T, \ldots, w_j^T, \ldots, 0^T] \cdot [0^T, \ldots, 0^T, x^T, 0^T, \ldots, -x^T, \ldots, 0^T]^T > 0$$
  with $w_i^T$ and $x^T$ in the i-th block position and $w_j^T$ and $-x^T$ in the j-th block position
• Since the zero blocks cancel everything else, the first vector can be replaced by the full concatenation of all weight vectors:
  $$[w_1^T, w_2^T, \ldots, w_M^T] \cdot [0^T, \ldots, 0^T, x^T, 0^T, \ldots, -x^T, \ldots, 0^T]^T > 0$$
• Let
  $$w = [w_1^T, w_2^T, \ldots, w_M^T]^T$$
  and
  $$x_{i,j} = [0^T, \ldots, 0^T, x^T, 0^T, \ldots, -x^T, \ldots, 0^T]^T$$
  (with $x^T$ in the i-th position and $-x^T$ in the j-th position)
• Then the condition is $w^T x_{i,j} > 0$

Generalization of Perceptron Algorithm for M-Class Case (contd.)
• For each training vector x of class ωi, construct
  $$x_{i,j} = [0^T, \ldots, 0^T, x^T, 0^T, \ldots, -x^T, \ldots, 0^T]^T, \quad j \ne i$$
  each of (l+1)M dimensions
• Concatenate the weight vectors:
  $$w = [w_1^T, w_2^T, \ldots, w_M^T]^T$$
• Use a single perceptron to solve the resulting two-class problem
• Parameters:
  • (l+1)M feature dimension
  • All N(M−1) training vectors are required to be on the positive side
• This reorganization is known as Kesler’s construction
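A minimal sketch of building the extended vectors. Assumptions beyond the slides: NumPy, class labels coded 1..M, and augmentation of each x with a trailing 1 (so every block has l+1 components).

```python
import numpy as np

def kesler_vectors(X, y, n_classes):
    """Build the (l+1)*M dimensional vectors x_{i,j} of Kesler's construction.

    For a training vector x of class i, x_{i,j} holds the augmented x in block i,
    -x in block j (j != i), and zeros elsewhere. Returns an array with N*(M-1)
    rows, all of which must end up on the positive side of w.
    """
    l = X.shape[1]
    block = l + 1                                   # augmented block size
    out = []
    for x, label in zip(X, y):
        xa = np.append(x, 1.0)                      # augmented sample [x, 1]
        i = label - 1                               # 0-based block index of the true class
        for j in range(n_classes):
            if j == i:
                continue
            v = np.zeros(block * n_classes)
            v[i * block:(i + 1) * block] = xa       # +x in the i-th block
            v[j * block:(j + 1) * block] = -xa      # -x in the j-th block
            out.append(v)
    return np.array(out)
```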
Kesler’s Algorithm: Training
• For each training vector x of class ωi, construct
  $$x_{i,j} = [0^T, \ldots, 0^T, x^T, 0^T, \ldots, -x^T, \ldots, 0^T]^T, \quad j \ne i$$
• Concatenate the weight vectors:
  $$w = [w_1^T, w_2^T, \ldots, w_M^T]^T$$
• Use the basic algorithm so that all new training samples satisfy
  $$w^T x_{i,j} > 0$$
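Training then reduces to the two-class batch perceptron run on the extended vectors, with every extended vector treated as a positive sample. A sketch under the same assumptions as above; it reuses the hypothetical kesler_vectors() helper from the previous block.

```python
import numpy as np

def train_kesler(X, y, n_classes, rho=0.1, max_iter=1000):
    """Single-perceptron training on Kesler's extended vectors (illustrative sketch)."""
    Z = kesler_vectors(X, y, n_classes)             # N*(M-1) rows, (l+1)*M columns
    w = np.zeros(Z.shape[1])
    for t in range(max_iter):
        wrong = Z @ w <= 0                          # vectors not (yet) on the positive side
        if not wrong.any():
            break                                    # converged: w^T x_{i,j} > 0 for all of them
        w = w + rho * Z[wrong].sum(axis=0)          # push w toward the violated constraints
    return w.reshape(n_classes, -1)                  # one row per class: [w_i, w_i0]
```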
Kesler’s Algorithm: Testing
• Isolate each weight vector $w_i$ from the combined
  $$w = [w_1^T, w_2^T, \ldots, w_M^T]^T$$
• A new object x is classified to ωi if
  $$w_i^T x > w_j^T x \quad \text{for all } j \ne i$$
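A minimal classification sketch: split the learned w back into per-class rows and take the arg-max of the discriminants (the reshape convention matches the hypothetical train_kesler sketch above).

```python
import numpy as np

def classify_kesler(W, x):
    """W: (M, l+1) array, one augmented weight vector per class; x: (l,) sample.

    Returns the 1-based class label with the largest discriminant w_i^T x + w_i0.
    """
    xa = np.append(x, 1.0)                 # augment the sample with a trailing 1
    scores = W @ xa                        # g_i(x) for every class i
    return int(np.argmax(scores)) + 1      # classes are labelled 1..M
```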
Sample Data for Sessional on Perceptron Algorithms (Week 3 and 4)

Classes  Features  Samples
3        3         300

Sample data for the Perceptron:

Feature1   Feature2   Feature3   Class
11.0306     9.0152     8.0199      1
11.4008     8.7768     6.7652      1
11.2489     9.5744     8.0812      1
 9.3157     7.4360     5.6128      1
15.7777     1.5879    11.4440      2
15.8685     2.7902    11.2532      2
14.9448     0.7798    12.7481      2
15.9801     1.0142    14.2029      2
 2.3979     5.6525     2.7566      3
 2.5103     6.3484     1.4272      3
 2.7527     4.6571     3.1138      3
-0.0195     4.5524     0.0118      3
What to do?
• Use the training file to train (1) the basic perceptron, (2) the reward-and-punishment variant, (3) the pocket algorithm, and (4) Kesler’s construction

• The number of features will be variable

• The number of classes will be > 3 for Kesler’s algorithm

• Use the test file to evaluate the performance and identify the misclassified samples

• Use different training and testing data files during evaluation
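A minimal sketch of how the training/test files could be read and evaluated. The whitespace-separated layout (Feature1 ... FeatureN Class) mirrors the sample table above, but the actual file names and exact format on Moodle are assumptions.

```python
import numpy as np

def load_dataset(path):
    """Read a whitespace-separated file whose last column is the class label."""
    data = np.loadtxt(path)
    return data[:, :-1], data[:, -1].astype(int)    # features, labels

def evaluate(classify_fn, X_test, y_test):
    """Report accuracy and the indices of misclassified test samples."""
    predictions = np.array([classify_fn(x) for x in X_test])
    wrong = np.flatnonzero(predictions != y_test)
    accuracy = 1.0 - len(wrong) / len(y_test)
    return accuracy, wrong

# Hypothetical usage, reusing the sketches above:
# X_train, y_train = load_dataset("train.txt")
# X_test,  y_test  = load_dataset("test.txt")
# w = train_perceptron_reward_punish(X_train, y_train)
# acc, wrong_idx = evaluate(lambda x: classify(x, w[:-1], w[-1]), X_test, y_test)
```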
What to do?

• This instruction and the data files (both linearly separable and non-separable) will be uploaded to Moodle as well

• Try to verify your code using other available data sets

• Upload your source code to Moodle by 10 pm next Sunday

• A copy checker will be used. Be careful to avoid negative marking.
Convergence Proof of Perceptron Algorithm

Let w* be an optimal weight vector (one that classifies all training samples correctly),
w(t) be the weight vector at the t-th iteration,
w(t+1) be the weight vector at the (t+1)-th iteration.

We will prove that $\|w(t+1) - \alpha w^*\| \le \|w(t) - \alpha w^*\|$ for a suitably chosen positive scaling α of w*, and that this distance becomes zero after a finite number of iterations.
Convergence Proof of Perceptron Algorithm

We know that
$$w(t+1) = w(t) - \rho_t \sum_{x \in Y} \delta_x x$$
where Y is the set of training samples misclassified by w(t).

Let α be a positive real number. Then, subtracting αw* from both sides,
$$w(t+1) - \alpha w^* = w(t) - \alpha w^* - \rho_t \sum_{x \in Y} \delta_x x$$

Squaring (taking the squared norm of) both sides,
$$\|w(t+1) - \alpha w^*\|^2 = \|w(t) - \alpha w^*\|^2 + \rho_t^2 \Big\|\sum_{x \in Y} \delta_x x\Big\|^2 - 2 \rho_t \sum_{x \in Y} \delta_x x^T \big(w(t) - \alpha w^*\big)$$
Convergence Proof of Perceptron Algorithm

However, every x ∈ Y is misclassified by w(t), so $\delta_x x^T w(t) \ge 0$ and therefore
$$-2 \rho_t \sum_{x \in Y} \delta_x x^T w(t) \le 0$$

Hence,
$$\|w(t+1) - \alpha w^*\|^2 \le \|w(t) - \alpha w^*\|^2 + \rho_t^2 \Big\|\sum_{x \in Y} \delta_x x\Big\|^2 + 2 \rho_t \alpha \sum_{x \in Y} \delta_x x^T w^*$$
Convergence Proof of Perceptron Algorithm

Now, define
$$\beta^2 = \max_{Y' \subseteq X} \Big\|\sum_{x \in Y'} \delta_x x\Big\|^2$$
and
$$\gamma = \max_{Y' \subseteq X} \sum_{x \in Y'} \delta_x x^T w^*$$

Here, $\delta_x x^T w^*$ is always negative (w* classifies every training sample correctly), so is γ.

We can write,
$$\|w(t+1) - \alpha w^*\|^2 \le \|w(t) - \alpha w^*\|^2 + \rho_t^2 \beta^2 - 2 \rho_t \alpha |\gamma|$$
Convergence Proof of Perceptron Algorithm

If we choose
$$\alpha = \frac{\beta^2}{2|\gamma|}$$
we can write,
$$\|w(t+1) - \alpha w^*\|^2 \le \|w(t) - \alpha w^*\|^2 + \beta^2 \big(\rho_t^2 - \rho_t\big)$$

Here, $\beta^2 \rho_t^2 - \beta^2 \rho_t \le 0$.

How? (It holds whenever the learning rate satisfies $0 < \rho_t \le 1$.)
Convergence Proof of Perceptron Algorithm

Applying the above equation successively for steps t, t−1, . . ., 0, we get
$$\|w(t+1) - \alpha w^*\|^2 \le \|w(0) - \alpha w^*\|^2 + \beta^2 \Big(\sum_{k=0}^{t} \rho_k^2 - \sum_{k=0}^{t} \rho_k\Big)$$

However, the learning-rate sequence is chosen so that
$$\sum_{k=0}^{\infty} \rho_k = \infty \quad \text{and} \quad \sum_{k=0}^{\infty} \rho_k^2 < \infty$$
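As a concrete illustration (not on the slides): the standard schedule $\rho_k = c/(k+1)$ satisfies both conditions, which is why the bracketed term above is eventually driven to $-\infty$.

```latex
% Example learning-rate schedule satisfying the two conditions above
% (an illustrative choice, not prescribed by the slides):
\rho_k = \frac{c}{k+1}, \quad c > 0
\;\Rightarrow\;
\sum_{k=0}^{\infty} \rho_k = c \sum_{k=0}^{\infty} \frac{1}{k+1} = \infty,
\qquad
\sum_{k=0}^{\infty} \rho_k^2 = c^2 \sum_{k=0}^{\infty} \frac{1}{(k+1)^2} = \frac{c^2 \pi^2}{6} < \infty .
```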
Convergence Proof of Perceptron Algorithm

However, $\sum_{k=0}^{t} \rho_k \to \infty$ and $\sum_{k=0}^{t} \rho_k^2$ stays bounded.

This means, after some constant time t0 the R. H. S. will be non-positive.

But, the L. H. S. cannot be negative.

Therefore, there must exist a finite iteration t0 at which
$$\|w(t_0) - \alpha w^*\|^2 = 0$$
Convergence Proof of Perceptron Algorithm

$\|w(t_0) - \alpha w^*\|^2 = 0$ is equivalent to
$$w(t_0) = \alpha w^*$$
i.e., the perceptron algorithm converges to a (scaled) solution weight vector in a finite number of iterations.
