Cours FLD
Lazhar.labiod@parisdescartes.fr
Discriminant Functions for the Normal Density
▪ Suppose that for class ci the class-conditional density p(x|ci) is normal: $N(\mu_i, \Sigma_i)$
$$p(x \mid c_i) = \frac{1}{(2\pi)^{d/2}\,|\Sigma_i|^{1/2}} \exp\left\{ -\frac{1}{2}(x-\mu_i)^t \Sigma_i^{-1} (x-\mu_i) \right\}$$
▪ Taking $g_i(x) = \ln p(x \mid c_i) + \ln P(c_i)$ and dropping the class-independent constant $-\frac{d}{2}\ln 2\pi$ gives

$$g_i(x) = -\frac{1}{2}(x-\mu_i)^t \Sigma_i^{-1} (x-\mu_i) - \frac{1}{2}\ln|\Sigma_i| + \ln P(c_i)$$
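To make this concrete, here is a minimal NumPy sketch of the discriminant above; the function name gaussian_discriminant and the example parameters are illustrative, not from the course:

import numpy as np

def gaussian_discriminant(x, mu, Sigma, prior):
    # g_i(x) = -1/2 (x-mu)^t Sigma^{-1} (x-mu) - 1/2 ln|Sigma| + ln P(c_i)
    diff = x - mu
    maha = diff @ np.linalg.inv(Sigma) @ diff  # squared Mahalanobis distance
    return -0.5 * maha - 0.5 * np.log(np.linalg.det(Sigma)) + np.log(prior)

# decide the class with the largest g_i(x)
x = np.array([1.0, 1.5])
mus = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
Sigmas = [np.eye(2), 2.0 * np.eye(2)]
priors = [0.5, 0.5]
scores = [gaussian_discriminant(x, m, S, p) for m, S, p in zip(mus, Sigmas, priors)]
print("decide c%d" % (np.argmax(scores) + 1))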
Case Σi = σ²I

▪ That is, the covariance matrix of each class is σ² times the identity matrix, so the features are independent and each has the same variance σ²
Case Σi = σ²I

▪ $\det(\Sigma_i) = \sigma^{2d}$ and $\Sigma_i^{-1} = \frac{1}{\sigma^2} I$

▪ Dropping the class-independent term $-\frac{1}{2}\ln|\Sigma_i|$, the discriminant function becomes

$$g_i(x) = -\frac{\|x-\mu_i\|^2}{2\sigma^2} + \ln P(c_i)$$
Case Σi = σ²I: Geometric Interpretation

▪ If $\ln P(c_i) = \ln P(c_j)$ for all i, j, then the prior term and the factor $\frac{1}{2\sigma^2}$ are constant for all classes and can be dropped:

$$g_i(x) = -\|x-\mu_i\|^2$$

so x is assigned to the class with the nearest mean

▪ If $\ln P(c_i) \neq \ln P(c_j)$, then the priors must be kept:

$$g_i(x) = -\frac{1}{2\sigma^2}\|x-\mu_i\|^2 + \ln P(c_i)$$
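The equal-priors case above is just a nearest-mean rule; a minimal sketch, with assumed example means:

import numpy as np

def nearest_mean(x, mus):
    # equal priors, Sigma_i = sigma^2 I: pick the class whose mean is closest
    return int(np.argmin([(x - mu) @ (x - mu) for mu in mus])) + 1

mus = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
print(nearest_mean(np.array([1.0, 1.5]), mus))  # prints 1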
Case Σi = σ²I

▪ Discriminant function is linear in x:

$$g_i(x) = w_i^t x + w_{i0}, \qquad w_i^t x = \sum_{j=1}^{d} w_{ij}\, x_j$$

where $w_i = \frac{1}{\sigma^2}\mu_i$ gives the part linear in x and $w_{i0} = -\frac{1}{2\sigma^2}\mu_i^t \mu_i + \ln P(c_i)$ is constant in x
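A sketch of these weights in NumPy (the helper name spherical_weights and the numbers are mine, following the wi and wi0 above):

import numpy as np

def spherical_weights(mu, sigma2, prior):
    # Sigma_i = sigma^2 I: g_i(x) = w_i^t x + w_i0
    w = mu / sigma2
    w0 = -(mu @ mu) / (2.0 * sigma2) + np.log(prior)
    return w, w0

w, w0 = spherical_weights(np.array([3.0, 3.0]), sigma2=2.0, prior=0.5)
x = np.array([1.0, 1.5])
print(w @ x + w0)  # ranks classes identically to -||x - mu||^2/(2 sigma^2) + ln P(c_i)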
Case Σi = σ²I: Example

▪ Need to find out where $g_i(x) > g_j(x)$ for i, j = 1, 2, 3
▪ Can be done by solving $g_i(x) = g_j(x)$ for each pair i, j
▪ Let's take $g_1(x) = g_2(x)$ first
▪ Simplifying, since each $g_i$ is linear, this gives a line equation: the decision boundary between c1 and c2 (a worked instance follows)
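For instance, under assumed example means mu1 = (0, 0), mu2 = (3, 3), equal priors, and sigma^2 = 1, the boundary g1(x) = g2(x) works out as follows:

import numpy as np

mu1, mu2, sigma2 = np.array([0.0, 0.0]), np.array([3.0, 3.0]), 1.0
# g1(x) = g2(x)  <=>  (w1 - w2)^t x + (w10 - w20) = 0, a line a^t x + b = 0
a = (mu1 - mu2) / sigma2
b = (mu2 @ mu2 - mu1 @ mu1) / (2.0 * sigma2)  # prior terms cancel (equal priors)
print(a, b)  # a = [-3, -3], b = 9, i.e. the line x1 + x2 = 3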
Case Σi = σ²I: Example

▪ Next solve $g_2(x) = g_3(x)$
Case Σi = σ²I: Example

▪ Priors $P(c_1) = P(c_2) = \frac{1}{4}$ and $P(c_3) = \frac{1}{2}$

[Figure: decision regions for c1, c2, c3; the lines connecting the class means are perpendicular to the decision boundaries]
Case Σi = Σ

▪ Since $|\Sigma_i| = |\Sigma|$ is constant for all classes, the term $-\frac{1}{2}\ln|\Sigma_i|$ can be dropped

▪ Discriminant function becomes

$$g_i(x) = -\frac{1}{2}(x-\mu_i)^t \Sigma^{-1} (x-\mu_i) + \ln P(c_i)$$

▪ The quadratic form $(x-\mu_i)^t \Sigma^{-1} (x-\mu_i)$ is the squared Mahalanobis distance between x and $\mu_i$

▪ If $\Sigma = I$, the Mahalanobis distance becomes the usual Euclidean distance:

$$\|x-y\|^2_{I^{-1}} = (x-y)^t (x-y)$$
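A quick numeric illustration of the two distances (the covariance Sigma below is an arbitrary example):

import numpy as np

x, y = np.array([2.0, 0.0]), np.array([0.0, 0.0])
Sigma = np.array([[4.0, 0.0],
                  [0.0, 1.0]])  # large variance along x1

eucl2 = (x - y) @ (x - y)                         # squared Euclidean: 4.0
maha2 = (x - y) @ np.linalg.inv(Sigma) @ (x - y)  # squared Mahalanobis: 1.0
print(eucl2, maha2)  # directions of high variance count for less under Mahalanobis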
Euclidean vs. Mahalanobis Distances

$$\|x-\mu\|^2 = (x-\mu)^t (x-\mu) \qquad\qquad \|x-\mu\|^2_{\Sigma^{-1}} = (x-\mu)^t \Sigma^{-1} (x-\mu)$$

[Figure: two panels showing the decision region for c1 under each distance; Euclidean contours are spherical, Mahalanobis contours are ellipses aligned with the eigenvectors of Σ, and points in each cell are closer to that cell's mean than to any other mean under the corresponding distance]
Case Σi = Σ

▪ Can simplify the discriminant function: expanding $(x-\mu_i)^t \Sigma^{-1} (x-\mu_i)$, the term $x^t \Sigma^{-1} x$ is the same for all classes and can be dropped, leaving a linear discriminant

$$g_i(x) = w_i^t x + w_{i0}, \qquad w_i = \Sigma^{-1}\mu_i, \quad w_{i0} = -\frac{1}{2}\mu_i^t \Sigma^{-1} \mu_i + \ln P(c_i)$$
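A sketch of the resulting linear machine (the helper name shared_cov_weights and the example parameters are mine):

import numpy as np

def shared_cov_weights(mu, Sigma_inv, prior):
    # Sigma_i = Sigma: g_i(x) = w_i^t x + w_i0, linear in x
    w = Sigma_inv @ mu
    w0 = -0.5 * (mu @ Sigma_inv @ mu) + np.log(prior)
    return w, w0

Sigma_inv = np.linalg.inv(np.array([[2.0, 0.5],
                                    [0.5, 1.0]]))
g = []
for mu, prior in [(np.array([0.0, 0.0]), 0.5), (np.array([3.0, 3.0]), 0.5)]:
    w, w0 = shared_cov_weights(mu, Sigma_inv, prior)
    g.append(w @ np.array([1.0, 1.5]) + w0)
print("decide c%d" % (np.argmax(g) + 1))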
Case Σi = Σ: Example

▪ 3 classes, each a 2-dimensional Gaussian
Case Σi = Σ: Example

▪ Let's solve $g_i(x) = g_j(x)$ in general first
Case Σi = Σ: Example

[Figure: decision regions for c1, c2, c3; the lines connecting the class means are not, in general, perpendicular to the decision boundaries]
General Case: Σi are arbitrary

▪ Covariance matrices for each class are arbitrary

▪ In this case, the features $x_1, x_2, \ldots, x_d$ are not necessarily independent
General Case: Σi are arbitrary

▪ From the previous discussion,

$$g_i(x) = -\frac{1}{2}(x-\mu_i)^t \Sigma_i^{-1} (x-\mu_i) - \frac{1}{2}\ln|\Sigma_i| + \ln P(c_i)$$
General Case: Σi are arbitrary

▪ This discriminant function is quadratic:

$$g_i(x) = x^t W_i x + w_i^t x + w_{i0}$$

where $w_i^t x$ is linear in x, $w_{i0}$ is constant in x, and $x^t W_i x$ is quadratic in x since

$$x^t W x = \sum_{j=1}^{d} \sum_{i=1}^{d} w_{ij}\, x_i x_j = \sum_{i,j=1}^{d} w_{ij}\, x_i x_j$$
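A sketch of this quadratic discriminant, using the standard parameter values obtained by expanding the previous slide's formula, W_i = -1/2 Sigma_i^{-1}, w_i = Sigma_i^{-1} mu_i, and w_i0 = -1/2 mu_i^t Sigma_i^{-1} mu_i - 1/2 ln|Sigma_i| + ln P(c_i); the example numbers are assumed:

import numpy as np

def quadratic_discriminant(x, mu, Sigma, prior):
    # general case: g_i(x) = x^t W_i x + w_i^t x + w_i0, quadratic in x
    Sigma_inv = np.linalg.inv(Sigma)
    W = -0.5 * Sigma_inv
    w = Sigma_inv @ mu
    w0 = (-0.5 * (mu @ Sigma_inv @ mu)
          - 0.5 * np.log(np.linalg.det(Sigma)) + np.log(prior))
    return x @ W @ x + w @ x + w0

x = np.array([1.0, 1.5])
print(quadratic_discriminant(x, np.array([0.0, 0.0]),
                             np.array([[2.0, 0.0], [0.0, 0.5]]), 0.5))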
General Case: Σi are arbitrary: Example

▪ 3 classes, each a 2-dimensional Gaussian

[Figure: decision regions for c1, c2, c3; the boundaries are quadratic, and the region for c1 appears in two disconnected pieces]
Important Points

▪ The Bayes classifier when classes are normally distributed is, in general, quadratic

▪ If the covariance matrices are equal and proportional to the identity matrix, the Bayes classifier is linear

▪ If, in addition, the priors on the classes are equal, the Bayes classifier is the minimum Euclidean distance classifier

▪ If the covariance matrices are equal, the Bayes classifier is linear

▪ If, in addition, the priors on the classes are equal, the Bayes classifier is the minimum Mahalanobis distance classifier

▪ Popular classifiers (minimum Euclidean and minimum Mahalanobis distance) are therefore optimal only when the data distribution is appropriate (normal, with the corresponding covariance structure)
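As a sanity check of the last two points, under assumed example parameters, the full Bayes discriminant with equal covariances and equal priors picks the same class as the minimum Mahalanobis distance rule:

import numpy as np

rng = np.random.default_rng(0)
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
Sigma_inv = np.linalg.inv(Sigma)
mus = [np.array([0.0, 0.0]), np.array([3.0, 3.0]), np.array([0.0, 4.0])]

for _ in range(100):
    x = rng.normal(scale=3.0, size=2)
    g = [-0.5 * (x - m) @ Sigma_inv @ (x - m)
         - 0.5 * np.log(np.linalg.det(Sigma)) + np.log(1.0 / 3.0)
         for m in mus]                                  # Bayes discriminant
    d = [(x - m) @ Sigma_inv @ (x - m) for m in mus]    # squared Mahalanobis
    assert np.argmax(g) == np.argmin(d)
print("Bayes rule = minimum Mahalanobis distance (equal Sigma, equal priors)")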