Pattern Recognition
Instructor: Dr. Md. Monirul Islam
Bayesian Classifier and its Variants
Classification Example 1
• Given:
– A doctor knows that meningitis causes stiff neck 50% of the time
– One of every 50,000 persons has meningitis
– One of every 20 persons has stiff neck
• If a patient has a stiff neck, what is the probability that he/she has meningitis?
Classification Example 2
• Training data: Tid, Refund (categorical), Marital Status (categorical), Taxable Income (continuous), Evade (class label)
[Table: training records with the above columns]
Bayes Classifier
• A probabilistic framework for solving classification problems
• Conditional probabilities:
P(C | A) = P(A, C) / P(A)
P(A | C) = P(A, C) / P(C)
• Bayes theorem:
P(A, C) = P(C | A) P(A) = P(A | C) P(C)
⇒ P(C | A) = P(A | C) P(C) / P(A)
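As a quick numeric sanity check of these identities, here is a minimal Python sketch; the joint probabilities are made-up illustrative values, not from the slides:

# Illustrative joint distribution over a binary attribute A and class C.
p_joint = {(0, 0): 0.30, (0, 1): 0.10,
           (1, 0): 0.20, (1, 1): 0.40}   # hypothetical values, keys are (a, c)

p_A = {a: sum(p for (ai, _), p in p_joint.items() if ai == a) for a in (0, 1)}
p_C = {c: sum(p for (_, ci), p in p_joint.items() if ci == c) for c in (0, 1)}

# Conditional probabilities from the definitions above.
p_C_given_A = {(c, a): p_joint[(a, c)] / p_A[a] for a in (0, 1) for c in (0, 1)}
p_A_given_C = {(a, c): p_joint[(a, c)] / p_C[c] for a in (0, 1) for c in (0, 1)}

# Bayes theorem: P(C | A) = P(A | C) P(C) / P(A)
for a in (0, 1):
    for c in (0, 1):
        assert abs(p_C_given_A[(c, a)]
                   - p_A_given_C[(a, c)] * p_C[c] / p_A[a]) < 1e-12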
Example 1
• Given:
– A doctor knows that meningitis causes stiff neck 50% of the time
– Prior probability of any patient having meningitis is 1/50,000
– Prior probability of any patient having stiff neck is 1/20
• If a patient has a stiff neck, what is the probability that he/she has meningitis?
P(M | S) = P(S | M) P(M) / P(S) = (0.5 × 1/50,000) / (1/20) = 0.0002
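A minimal Python sketch of this computation, using the numbers from the example:

# Example 1: probability of meningitis (M) given a stiff neck (S), via Bayes theorem.
p_s_given_m = 0.5      # meningitis causes stiff neck 50% of the time
p_m = 1 / 50_000       # prior: P(M)
p_s = 1 / 20           # prior: P(S)

p_m_given_s = p_s_given_m * p_m / p_s
print(p_m_given_s)     # 0.0002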
Example 3
• The sea bass/salmon example
Example 3
• Decision rule with only the prior information:
– Decide ω1 (sea bass) if P(ω1) > P(ω2); otherwise decide ω2 (salmon)
Example 3
• Use of the class-conditional information
– Use lightness
Example 3
[Figure: class-conditional probability densities of the lightness feature for the two fish classes]
Example 3
• Given the lightness evidence x, calculate the posterior from the likelihood and the evidence:
– P(ωj | x) = p(x | ωj) P(ωj) / p(x)
Example 3
[Figure: posterior probabilities P(ω1 | x) and P(ω2 | x) as a function of lightness x]
Example 3
Decision rule with the posterior probabilities:
Decide ω1 if P(ω1 | x) > P(ω2 | x); otherwise decide ω2
Therefore, whenever we observe a particular x, the probability of error is:
P(error | x) = P(ω1 | x) if we decide ω2
P(error | x) = P(ω2 | x) if we decide ω1
• Minimizing the probability of error
Therefore:
P(error | x) = min [P(ω1 | x), P(ω2 | x)]
(Bayes decision)
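A small Python sketch of this minimum-error rule; the posterior values below are hypothetical:

def decide_min_error(posteriors):
    # Decide the class with the largest posterior P(w_i | x).
    # In the two-class case, P(error | x) = min[P(w1 | x), P(w2 | x)] = 1 - max.
    best = max(posteriors, key=posteriors.get)
    return best, 1.0 - posteriors[best]

# Hypothetical posteriors for one observed x.
decision, p_err = decide_min_error({"w1": 0.75, "w2": 0.25})
print(decision, p_err)   # w1 0.25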
Bayesian Decision Theory – Continuous Features
• Generalization of the preceding ideas:
– use of more than one feature
– use of more than two states of nature
– allowing actions other than merely deciding the state of nature
– introducing a loss function more general than the probability of error
Classification to Minimize Loss
Let {ω1, ω2, …, ωm} be the set of m states of nature (or "categories" or "classes")
Classification to Minimize Loss
Let {α1, α2, …, αa} be the set of a possible actions, and let λ(αi | ωj) be the loss incurred for taking action αi when the true state of nature is ωj
Conditional risk (expected loss) of taking action αi after observing x:
R(αi | x) = Σj λ(αi | ωj) P(ωj | x)
Classification to Minimize Loss
Overall risk:
R = sum of all R(αi | x) for i = 1, …, n
Minimizing R requires minimizing the conditional risk R(αi | x) for every x
Classification to Minimize Loss
• Two-category classification
α1 : deciding ω1
α2 : deciding ω2
λij = λ(αi | ωj) : loss incurred for deciding ωi when the true state of nature is ωj
Conditional risk:
R(α1 | x) = λ11 P(ω1 | x) + λ12 P(ω2 | x)
R(α2 | x) = λ21 P(ω1 | x) + λ22 P(ω2 | x)
Classification to Minimize Loss
Our rule is the following:
if R(α1 | x) < R(α2 | x), take action α1: "decide ω1"
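A small Python sketch computing the two conditional risks directly and applying this rule; the loss values and posteriors are hypothetical:

def conditional_risks(post1, post2, lam):
    # R(a1 | x) = l11 P(w1|x) + l12 P(w2|x);  R(a2 | x) = l21 P(w1|x) + l22 P(w2|x)
    r1 = lam[0][0] * post1 + lam[0][1] * post2
    r2 = lam[1][0] * post1 + lam[1][1] * post2
    return r1, r2

# Hypothetical losses (lam[i][j] = loss for deciding w_{i+1} when truth is w_{j+1}).
lam = [[0.0, 2.0],
       [1.0, 0.0]]
r1, r2 = conditional_risks(post1=0.6, post2=0.4, lam=lam)
print("decide w1" if r1 < r2 else "decide w2")   # r1 = 0.8, r2 = 0.6 -> decide w2

Note how the asymmetric loss makes the rule decide ω2 here even though ω1 has the larger posterior.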
Equivalently, decide ω1 if:
(λ21 − λ11) P(x | ω1) P(ω1) > (λ12 − λ22) P(x | ω2) P(ω2)
i.e., if the likelihood ratio satisfies
P(x | ω1) / P(x | ω2) > [(λ12 − λ22) / (λ21 − λ11)] · [P(ω2) / P(ω1)]
Then take action α1 (decide ω1); otherwise take action α2 (decide ω2)
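A minimal Python sketch of this likelihood-ratio rule; the losses, priors, and likelihood values below are made-up placeholders:

def decide_min_risk(p_x_w1, p_x_w2, prior1, prior2, lam):
    # Decide w1 when the likelihood ratio exceeds the threshold
    # [(l12 - l22) / (l21 - l11)] * [P(w2) / P(w1)].
    threshold = ((lam[0][1] - lam[1][1]) / (lam[1][0] - lam[0][0])) * (prior2 / prior1)
    return "w1" if p_x_w1 / p_x_w2 > threshold else "w2"

lam = [[0.0, 2.0],     # lam[i][j]: loss for deciding w_{i+1} when truth is w_{j+1}
       [1.0, 0.0]]
print(decide_min_risk(p_x_w1=0.4, p_x_w2=0.3, prior1=0.5, prior2=0.5, lam=lam))
# likelihood ratio 1.33 < threshold 2.0 -> "w2"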
Classification to Minimize Loss
Optimal decision property: the rule compares the likelihood ratio against a threshold that is independent of the observation x:
P(x | ω1) / P(x | ω2) > [(λ12 − λ22) / (λ21 − λ11)] · [P(ω2) / P(ω1)]
Minimum Error Rate Classification
Assume the zero-one loss function for the two-class case:
λ(αi | ωj) = 0 if i = j, and 1 if i ≠ j
Then R(αi | x) = 1 − P(ωi | x), so minimizing the risk amounts to maximizing the posterior P(ωi | x)
Classifiers, Discriminant Functions and Decision Surfaces
[Figure: discriminant functions computed from a d-dimensional feature vector x]
Decision Surface
• Let Ri and Rj be two regions identifying classes ωi and ωj
• Decision rule: decide ωi if P(ωi | x) > P(ωj | x)
equivalently, if P(ωi | x) − P(ωj | x) > 0
i.e., if g(x) ≡ P(ωi | x) − P(ωj | x) > 0
• Ri : P(ωi | x) > P(ωj | x) (the side where g(x) > 0, marked +)
• Rj : P(ωj | x) > P(ωi | x) (the side where g(x) < 0, marked −)
• The surface g(x) = 0 separating Ri and Rj is the decision surface
[Figure: regions Ri and Rj separated by the decision surface g(x) = 0]
Decision Surface in Multi-categories
• Define a discriminant function gi(x) for each of the c classes; decide ωi if gi(x) > gj(x) for all j ≠ i
Decision Surface in Two-categories
• Instead of two discriminant functions g1 and g2, define a single g(x) ≡ g1(x) − g2(x) and decide ω1 if g(x) > 0, otherwise ω2
– The computation of g(x):
g(x) = P(ω1 | x) − P(ω2 | x)
or, equivalently,
g(x) = ln [P(x | ω1) / P(x | ω2)] + ln [P(ω1) / P(ω2)]
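A short Python sketch of this discriminant, using hypothetical univariate Gaussian class-conditional densities (the parameters are illustrative, not from the slides):

import math

def gauss_pdf(x, mu, sigma):
    # Univariate normal density (defined in the next section).
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2 * math.pi) * sigma)

def g(x, mu1, s1, mu2, s2, prior1, prior2):
    # g(x) = ln[p(x | w1) / p(x | w2)] + ln[P(w1) / P(w2)]; decide w1 iff g(x) > 0
    return (math.log(gauss_pdf(x, mu1, s1) / gauss_pdf(x, mu2, s2))
            + math.log(prior1 / prior2))

value = g(1.2, mu1=1.0, s1=0.5, mu2=2.0, s2=0.5, prior1=0.6, prior2=0.4)
print("decide w1" if value > 0 else "decide w2")   # decide w1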
The Normal Density
• Density which is analytically tractable
• Continuous density
• Many processes are asymptotically Gaussian
– Handwritten characters, speech sounds, and many more
– Any prototype pattern corrupted by a random process
The Normal Density
• Univariate density
P(x) = [1 / (√(2π) σ)] exp[ −(1/2) ((x − μ) / σ)^2 ]
where:
μ = mean (or expected value) of x
σ^2 = expected squared deviation, or variance
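A minimal Python check of this density, numerically verifying that it integrates to 1 for the sample parameters μ = 0, σ = 1:

import math

def normal_pdf(x, mu, sigma):
    # P(x) = 1 / (sqrt(2 pi) sigma) * exp(-0.5 * ((x - mu) / sigma)**2)
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (math.sqrt(2 * math.pi) * sigma)

# Riemann sum over [-8, 8]; should be very close to 1.
dx = 0.001
total = sum(normal_pdf(-8.0 + k * dx, mu=0.0, sigma=1.0) * dx for k in range(16000))
print(round(total, 6))   # 1.0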
• Multivariate density
P(x) = [1 / ((2π)^(d/2) |Σ|^(1/2))] exp[ −(1/2) (x − μ)^t Σ^-1 (x − μ) ]
where:
x = (x1, x2, …, xd)^t (t stands for the transpose vector form)
μ = (μ1, μ2, …, μd)^t : the mean vector
Σ : the d×d covariance matrix
|Σ| and Σ^-1 are its determinant and inverse, respectively
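A minimal NumPy sketch of this density; the μ and Σ values below are arbitrary examples:

import numpy as np

def mvn_pdf(x, mu, cov):
    # Multivariate normal density for a d-dimensional point x.
    d = len(mu)
    diff = x - mu
    norm = 1.0 / (((2 * np.pi) ** (d / 2)) * np.sqrt(np.linalg.det(cov)))
    return norm * np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff)

# Example: a 2D Gaussian with positively correlated components.
mu = np.array([0.0, 0.0])
cov = np.array([[2.0, 1.0],
                [1.0, 2.0]])
print(mvn_pdf(np.array([0.5, -0.5]), mu, cov))   # ~0.0716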
2D Gaussian Example - 1
[Figure: surface and contour plots of a 2D Gaussian density]
2D Gaussian Example - 2
[Figure: a 2D Gaussian density with a different covariance matrix Σ]
2D Gaussian Example - 3
[Figure: a 2D Gaussian density with another covariance matrix Σ]
Computer Exercise