
CSE 473

Pattern Recognition

Instructor:
Dr. Md. Monirul Islam
Linear Classifier

Learning a Linear Classifier

[Figure: training samples of two classes in the (x1, x2) plane, separated by the decision boundary $w^T x + w_0 = 0$, where $x = [x_1, x_2]^T$ and $w = [w_1, w_2]^T$.]
The Perceptron Algorithm
– Use the training file to learn w
– Steps
  – Initialize w
  – Classify all training samples using the current w
  – Find the new w using
    $$w(t+1) = w(t) - \rho_t \sum_{x \in Y} \delta_x x$$
    where Y is the set of training samples misclassified by w(t), $\rho_t$ is the learning rate, and $\delta_x = -1$ if $x \in \omega_1$, $\delta_x = +1$ if $x \in \omega_2$
  – Repeat until w converges
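Below is a minimal NumPy sketch of the batch update above. It is illustrative only: the function name, the label convention (classes coded 1 and 2), and the fixed learning rate rho are assumptions, not part of the slides; each sample is augmented with a trailing 1 so that the bias w0 is learned as the last component of w.

```python
import numpy as np

def train_perceptron_batch(X, y, rho=0.1, max_iter=1000):
    """Batch perceptron (a minimal sketch, not the official course code).

    X : (N, l) array of feature vectors, y : (N,) array of labels in {1, 2}.
    """
    Xa = np.hstack([X, np.ones((len(X), 1))])      # augmented vectors [x, 1]
    delta = np.where(y == 1, -1.0, 1.0)            # delta_x = -1 for class 1, +1 for class 2
    w = np.zeros(Xa.shape[1])                      # initialize w
    for t in range(max_iter):
        g = Xa @ w
        misclassified = delta * g >= 0             # delta_x * w^T x >= 0 -> wrong side
        if not misclassified.any():
            break                                   # converged: every sample correctly classified
        # w(t+1) = w(t) - rho_t * sum over misclassified samples of delta_x * x
        w = w - rho * (delta[misclassified][:, None] * Xa[misclassified]).sum(axis=0)
    return w
```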
Classification by a Trained Perceptron
$g(x) = w^T x + w_0$
If g(x) > 0 : x ∈ ω1
Otherwise : x ∈ ω2

[Figure: the trained decision boundary $w^T x + w_0 = 0$ in the (x1, x2) plane, with $x = [x_1, x_2]^T$ and $w = [w_1, w_2]^T$; points on the positive side are assigned to ω1, the rest to ω2.]
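A tiny helper matching the rule above (a sketch; here w and w0 are kept separate, whereas the training sketches in this handout fold w0 into the last component of an augmented w, so you would split it back out before calling this):

```python
import numpy as np

def classify(x, w, w0):
    """Return 1 if g(x) = w^T x + w0 > 0, else 2 (the rule on this slide)."""
    return 1 if np.dot(w, x) + w0 > 0 else 2
```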
Variants of Perceptron Algorithm (1)

$w(t+1) = w(t) + \rho\, x(t)$, if $w(t)^T x(t) \le 0$ and $x(t) \in \omega_1$   (update)
$w(t+1) = w(t) - \rho\, x(t)$, if $w(t)^T x(t) \ge 0$ and $x(t) \in \omega_2$   (update)
$w(t+1) = w(t)$, otherwise   (no update)

– It is a reward and punishment type of algorithm
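A minimal online (reward-and-punishment) sketch under the same assumptions as before: NumPy, labels in {1, 2}, augmented samples, and a constant learning rate rho. The epoch-based cycling through the samples is my own choice, not prescribed by the slides.

```python
import numpy as np

def train_perceptron_reward_punish(X, y, rho=0.1, max_epochs=100):
    """Online perceptron: visit samples one at a time and correct mistakes."""
    Xa = np.hstack([X, np.ones((len(X), 1))])   # augment with 1 for the bias term
    w = np.zeros(Xa.shape[1])
    for epoch in range(max_epochs):
        updated = False
        for xt, label in zip(Xa, y):
            g = w @ xt
            if label == 1 and g <= 0:           # class-1 sample on the wrong side: reward
                w = w + rho * xt
                updated = True
            elif label == 2 and g >= 0:         # class-2 sample on the wrong side: punish
                w = w - rho * xt
                updated = True
        if not updated:                          # a full pass with no update -> converged
            break
    return w
```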
Variants of Perceptron Algorithm (2)

– Initialize the weight vector w(0)
– Define a pocket weight vector w_s and its history h_s
– Generate the next w(t+1) with the usual perceptron update. If it is better than the pocketed w_s (e.g., it misclassifies fewer training samples, or survives a longer run of correct classifications), store w(t+1) in w_s and update h_s

– It is the pocket algorithm: it keeps the best weight vector seen so far, which is useful when the classes are not linearly separable
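A minimal pocket-algorithm sketch. The concrete "better than" criterion used here (fewest misclassified training samples) is my own illustrative choice; the slides leave the criterion and the history bookkeeping open.

```python
import numpy as np

def train_perceptron_pocket(X, y, rho=0.1, max_epochs=100):
    """Pocket algorithm: run the online perceptron, keep the best w seen so far."""
    Xa = np.hstack([X, np.ones((len(X), 1))])
    delta = np.where(y == 1, -1.0, 1.0)            # -1 for class 1, +1 for class 2

    def n_errors(w):
        return int(np.sum(delta * (Xa @ w) >= 0))  # samples on the wrong side

    w = np.zeros(Xa.shape[1])
    w_pocket, best_errors = w.copy(), n_errors(w)
    for epoch in range(max_epochs):
        for xt, d in zip(Xa, delta):
            if d * (w @ xt) >= 0:                  # misclassified: apply the usual correction
                w = w - rho * d * xt
                errors = n_errors(w)
                if errors < best_errors:           # better than the pocketed vector: keep it
                    w_pocket, best_errors = w.copy(), errors
        if best_errors == 0:
            break
    return w_pocket
```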
Generalization of Perceptron Algorithm for M-Class Case
• Let there be M classes ω1, ω2, ω3, . . ., ωM
• Let there be M linear discriminant functions with weight vectors $w_i$, i = 1, . . ., M
• The object x is classified to ωi if
  $$w_i^T x > w_j^T x \quad \text{for all } j \ne i$$
  which can be written as
  $$[w_i^T - w_j^T]\, x > 0 \quad \text{for all } j \ne i$$
• This condition can be rewritten as a dot product of two long block vectors:
  $$[0^T, \ldots, 0^T, w_i^T, 0^T, \ldots, w_j^T, \ldots, 0^T] \cdot [0^T, \ldots, 0^T, x^T, 0^T, \ldots, -x^T, \ldots, 0^T]^T > 0$$
  with $w_i^T$ and $x^T$ in the i-th block position and $w_j^T$ and $-x^T$ in the j-th block position
• Since the zero blocks cancel everything else, the first vector can be replaced by the full concatenation of all weight vectors:
  $$[w_1^T, w_2^T, \ldots, w_M^T] \cdot [0^T, \ldots, 0^T, x^T, 0^T, \ldots, -x^T, \ldots, 0^T]^T > 0$$
• Let
  $$w = [w_1^T, w_2^T, \ldots, w_M^T]^T$$
  and
  $$x_{i,j} = [0^T, \ldots, 0^T, x^T, 0^T, \ldots, -x^T, \ldots, 0^T]^T$$
  (with $x^T$ in the i-th position and $-x^T$ in the j-th position)
• Then the condition is $w^T x_{i,j} > 0$

Generalization of Perceptron Algorithm for M-Class Case (contd.)
• For each training vector x of class ωi, construct
  $$x_{i,j} = [0^T, \ldots, 0^T, x^T, 0^T, \ldots, -x^T, \ldots, 0^T]^T, \quad j \ne i$$
  each of (l+1)M dimensions
• Concatenate the weight vectors:
  $$w = [w_1^T, w_2^T, \ldots, w_M^T]^T$$
• Use a single perceptron to solve the resulting two-class problem
• Parameters:
  • (l+1)M feature dimension
  • All N(M−1) training vectors are required to be on the positive side
• This reorganization is known as Kesler’s construction
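A minimal sketch of building the extended vectors. Assumptions beyond the slides: NumPy, class labels coded 1..M, and augmentation of each x with a trailing 1 (so every block has l+1 components).

```python
import numpy as np

def kesler_vectors(X, y, n_classes):
    """Build the (l+1)*M dimensional vectors x_{i,j} of Kesler's construction.

    For a training vector x of class i, x_{i,j} holds the augmented x in block i,
    -x in block j (j != i), and zeros elsewhere. Returns an array with N*(M-1)
    rows, all of which must end up on the positive side of w.
    """
    l = X.shape[1]
    block = l + 1                                   # augmented block size
    out = []
    for x, label in zip(X, y):
        xa = np.append(x, 1.0)                      # augmented sample [x, 1]
        i = label - 1                               # 0-based block index of the true class
        for j in range(n_classes):
            if j == i:
                continue
            v = np.zeros(block * n_classes)
            v[i * block:(i + 1) * block] = xa       # +x in the i-th block
            v[j * block:(j + 1) * block] = -xa      # -x in the j-th block
            out.append(v)
    return np.array(out)
```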
Kesler’s Algorithm: Training
• For each training vector x of class ωi, construct
  $$x_{i,j} = [0^T, \ldots, 0^T, x^T, 0^T, \ldots, -x^T, \ldots, 0^T]^T, \quad j \ne i$$
• Concatenate the weight vectors:
  $$w = [w_1^T, w_2^T, \ldots, w_M^T]^T$$
• Use the basic algorithm so that all new training samples satisfy
  $$w^T x_{i,j} > 0$$
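Training then reduces to the two-class batch perceptron run on the extended vectors, with every extended vector treated as a positive sample. A sketch under the same assumptions as above; it reuses the hypothetical kesler_vectors() helper from the previous block.

```python
import numpy as np

def train_kesler(X, y, n_classes, rho=0.1, max_iter=1000):
    """Single-perceptron training on Kesler's extended vectors (illustrative sketch)."""
    Z = kesler_vectors(X, y, n_classes)             # N*(M-1) rows, (l+1)*M columns
    w = np.zeros(Z.shape[1])
    for t in range(max_iter):
        wrong = Z @ w <= 0                          # vectors not (yet) on the positive side
        if not wrong.any():
            break                                    # converged: w^T x_{i,j} > 0 for all of them
        w = w + rho * Z[wrong].sum(axis=0)          # push w toward the violated constraints
    return w.reshape(n_classes, -1)                  # one row per class: [w_i, w_i0]
```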
Kesler’s Algorithm: Testing
• Isolate each weight vector $w_i$ from the combined
  $$w = [w_1^T, w_2^T, \ldots, w_M^T]^T$$
• A new object x is classified to ωi if
  $$w_i^T x > w_j^T x \quad \text{for all } j \ne i$$
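A minimal classification sketch: split the learned w back into per-class rows and take the arg-max of the discriminants (the reshape convention matches the hypothetical train_kesler sketch above).

```python
import numpy as np

def classify_kesler(W, x):
    """W: (M, l+1) array, one augmented weight vector per class; x: (l,) sample.

    Returns the 1-based class label with the largest discriminant w_i^T x + w_i0.
    """
    xa = np.append(x, 1.0)                 # augment the sample with a trailing 1
    scores = W @ xa                        # g_i(x) for every class i
    return int(np.argmax(scores)) + 1      # classes are labelled 1..M
```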
Sample Data for Sessional on Perceptron Algorithms (Week 3 and 4)

Classes  Features  Samples
3        3         300

Sample data for the Perceptron:

Feature1   Feature2   Feature3   Class
11.0306     9.0152     8.0199      1
11.4008     8.7768     6.7652      1
11.2489     9.5744     8.0812      1
 9.3157     7.4360     5.6128      1
15.7777     1.5879    11.4440      2
15.8685     2.7902    11.2532      2
14.9448     0.7798    12.7481      2
15.9801     1.0142    14.2029      2
 2.3979     5.6525     2.7566      3
 2.5103     6.3484     1.4272      3
 2.7527     4.6571     3.1138      3
-0.0195     4.5524     0.0118      3
What to do?
• Use the training file to train (1) the basic perceptron, (2) the reward-and-punishment variant, (3) the pocket algorithm, and (4) Kesler’s construction

• The number of features will be variable

• The number of classes will be > 3 for Kesler’s algorithm

• Use the test file to evaluate the performance and identify the misclassified samples

• Use different training and testing data files during evaluation
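A minimal sketch of how the training/test files could be read and evaluated. The whitespace-separated layout (Feature1 ... FeatureN Class) mirrors the sample table above, but the actual file names and exact format on Moodle are assumptions.

```python
import numpy as np

def load_dataset(path):
    """Read a whitespace-separated file whose last column is the class label."""
    data = np.loadtxt(path)
    return data[:, :-1], data[:, -1].astype(int)    # features, labels

def evaluate(classify_fn, X_test, y_test):
    """Report accuracy and the indices of misclassified test samples."""
    predictions = np.array([classify_fn(x) for x in X_test])
    wrong = np.flatnonzero(predictions != y_test)
    accuracy = 1.0 - len(wrong) / len(y_test)
    return accuracy, wrong

# Hypothetical usage, reusing the sketches above:
# X_train, y_train = load_dataset("train.txt")
# X_test,  y_test  = load_dataset("test.txt")
# w = train_perceptron_reward_punish(X_train, y_train)
# acc, wrong_idx = evaluate(lambda x: classify(x, w[:-1], w[-1]), X_test, y_test)
```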
What to do?

• This instruction and the data files (both linearly separable and non-separable) will be uploaded to Moodle as well

• Try to verify your code using other available data sets

• Upload your source code to Moodle by 10 pm next Sunday

• A copy checker will be used. Be careful to avoid negative marking.
Convergence Proof of Perceptron Algorithm

Let w* be an optimal weight vector (one that classifies all training samples correctly),
w(t) be the weight vector at the t-th iteration,
w(t+1) be the weight vector at the (t+1)-th iteration.

We will prove that $\|w(t+1) - \alpha w^*\| \le \|w(t) - \alpha w^*\|$ for a suitably chosen positive scaling α of w*, and that this distance becomes zero after a finite number of iterations.
Convergence Proof of Perceptron Algorithm

We know that
$$w(t+1) = w(t) - \rho_t \sum_{x \in Y} \delta_x x$$
where Y is the set of training samples misclassified by w(t).

Let α be a positive real number. Then, subtracting αw* from both sides,
$$w(t+1) - \alpha w^* = w(t) - \alpha w^* - \rho_t \sum_{x \in Y} \delta_x x$$

Squaring (taking the squared norm of) both sides,
$$\|w(t+1) - \alpha w^*\|^2 = \|w(t) - \alpha w^*\|^2 + \rho_t^2 \Big\|\sum_{x \in Y} \delta_x x\Big\|^2 - 2 \rho_t \sum_{x \in Y} \delta_x x^T \big(w(t) - \alpha w^*\big)$$
Convergence Proof of Perceptron Algorithm

However, every x ∈ Y is misclassified by w(t), so $\delta_x x^T w(t) \ge 0$ and therefore
$$-2 \rho_t \sum_{x \in Y} \delta_x x^T w(t) \le 0$$

Hence,
$$\|w(t+1) - \alpha w^*\|^2 \le \|w(t) - \alpha w^*\|^2 + \rho_t^2 \Big\|\sum_{x \in Y} \delta_x x\Big\|^2 + 2 \rho_t \alpha \sum_{x \in Y} \delta_x x^T w^*$$
Convergence Proof of Perceptron Algorithm

Now, define
$$\beta^2 = \max_{Y' \subseteq X} \Big\|\sum_{x \in Y'} \delta_x x\Big\|^2$$
and
$$\gamma = \max_{Y' \subseteq X} \sum_{x \in Y'} \delta_x x^T w^*$$

Here, $\delta_x x^T w^*$ is always negative (w* classifies every training sample correctly), so is γ.

We can write,
$$\|w(t+1) - \alpha w^*\|^2 \le \|w(t) - \alpha w^*\|^2 + \rho_t^2 \beta^2 - 2 \rho_t \alpha |\gamma|$$
Convergence Proof of Perceptron Algorithm

If we choose
$$\alpha = \frac{\beta^2}{2|\gamma|}$$
we can write,
$$\|w(t+1) - \alpha w^*\|^2 \le \|w(t) - \alpha w^*\|^2 + \beta^2 \big(\rho_t^2 - \rho_t\big)$$

Here, $\beta^2 \rho_t^2 - \beta^2 \rho_t \le 0$.

How? (It holds whenever the learning rate satisfies $0 < \rho_t \le 1$.)
Convergence Proof of Perceptron Algorithm

Applying the above equation successively for steps t, t−1, . . ., 0, we get
$$\|w(t+1) - \alpha w^*\|^2 \le \|w(0) - \alpha w^*\|^2 + \beta^2 \Big(\sum_{k=0}^{t} \rho_k^2 - \sum_{k=0}^{t} \rho_k\Big)$$

However, the learning-rate sequence is chosen so that
$$\sum_{k=0}^{\infty} \rho_k = \infty \quad \text{and} \quad \sum_{k=0}^{\infty} \rho_k^2 < \infty$$
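As a concrete illustration (not on the slides): the standard schedule $\rho_k = c/(k+1)$ satisfies both conditions, which is why the bracketed term above is eventually driven to $-\infty$.

```latex
% Example learning-rate schedule satisfying the two conditions above
% (an illustrative choice, not prescribed by the slides):
\rho_k = \frac{c}{k+1}, \quad c > 0
\;\Rightarrow\;
\sum_{k=0}^{\infty} \rho_k = c \sum_{k=0}^{\infty} \frac{1}{k+1} = \infty,
\qquad
\sum_{k=0}^{\infty} \rho_k^2 = c^2 \sum_{k=0}^{\infty} \frac{1}{(k+1)^2} = \frac{c^2 \pi^2}{6} < \infty .
```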
Convergence Proof of Perceptron Algorithm

However, $\sum_{k=0}^{t} \rho_k \to \infty$ and $\sum_{k=0}^{t} \rho_k^2$ stays bounded.

This means, after some constant time t0 the R. H. S. will be non-positive.

But, the L. H. S. cannot be negative.

Therefore, there must exist a finite iteration t0 at which
$$\|w(t_0) - \alpha w^*\|^2 = 0$$
Convergence Proof of Perceptron Algorithm

$\|w(t_0) - \alpha w^*\|^2 = 0$ is equivalent to
$$w(t_0) = \alpha w^*$$
i.e., the perceptron algorithm converges to a (scaled) solution weight vector in a finite number of iterations.
