Overview

• Minimum Enclosing Ball (MEB) problem
• Support Vector Clustering (SVC)
• Rough Support Vector Clustering (RSVC)

Hard MEB problem

Given a set of points $S = \{x_1, \ldots, x_m\}$ where $x_i \in \mathbb{R}^d$, the Minimum Enclosing Ball (MEB) of $S$ is the smallest ball that contains all the points in $S$.

Figure 1: Minimum Enclosing Ball

[Figure: the map φ(x) takes points from Data Space to Feature Space]

Figure 2: MEB in Kernel Induced Feature Space

The Primal is

$$\min_{R,\,\mu} \; R^2 \quad \text{s.t.} \quad \|\phi(x_i) - \mu\|^2 \le R^2 \quad \forall i \tag{1}$$

The Lagrangian

$$L = R^2 + \sum_{i=1}^{m} \alpha_i \left( \|\phi(x_i) - \mu\|^2 - R^2 \right)$$

where the Lagrange multipliers satisfy

$$\alpha_i \ge 0 \quad \forall i \tag{2}$$

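Karush-Kuhn-Tucker conditions

$$\frac{\partial L}{\partial R} = 0 \implies 2R - 2R \sum_{i=1}^{m} \alpha_i = 0 \implies \sum_{i=1}^{m} \alpha_i = 1 \tag{3}$$

$$\frac{\partial L}{\partial \mu} = 0 \implies -2 \sum_{i=1}^{m} \alpha_i \left( \phi(x_i) - \mu \right) = 0 \implies \mu = \sum_{i=1}^{m} \alpha_i \phi(x_i) \tag{4}$$

Complementary Slackness conditions

$$\alpha_i \left( \|\phi(x_i) - \mu\|^2 - R^2 \right) = 0 \tag{5}$$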
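
The primal variables can be eliminated by substituting conditions (3) and (4) back into the Lagrangian. The following is a sketch of that step, assuming only the kernel identity $K(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$:

```latex
\begin{align*}
L &= R^2 \Bigl( 1 - \sum_{i} \alpha_i \Bigr) + \sum_{i} \alpha_i \|\phi(x_i) - \mu\|^2 \\
  &= \sum_{i} \alpha_i K(x_i, x_i) - 2\, \mu \cdot \sum_{i} \alpha_i \phi(x_i) + \|\mu\|^2 \sum_{i} \alpha_i \\
  &= \sum_{i} \alpha_i K(x_i, x_i) - \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j)
\end{align*}
```

The first line groups the $R^2$ terms, which vanish by (3); the last line uses (4), so that $\mu \cdot \sum_i \alpha_i \phi(x_i) = \|\mu\|^2 = \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j)$. Maximising this expression over the multipliers is the same as minimising its negative.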

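Wolfe Dual Form

$$\min_{\alpha} \; \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j) - \sum_{i=1}^{m} \alpha_i K(x_i, x_i)$$

$$\text{s.t.} \quad \sum_{i=1}^{m} \alpha_i = 1, \quad \alpha_i \ge 0 \quad \forall i \tag{6}$$

Here $K(x_i, x_j)$ is the kernel function giving $\phi(x_i) \cdot \phi(x_j)$ in the high dimensional space, and the $\alpha_i$ are the Lagrange multipliers.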
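
The dual (6) is a quadratic program over the probability simplex, so it can be approached with a very simple iterative scheme. The sketch below uses a Frank-Wolfe style update with step size 1/(t+1) (the same update as the Badoiu-Clarkson core-set iteration): at every step it moves weight towards the point currently farthest from the centre. This solver is a swapped-in illustration, not a method named in the slides; `solve_meb_dual`, `linear_kernel` and the sample points are hypothetical names and data.

```python
def solve_meb_dual(points, kernel, iterations=2000):
    """Approximate the multipliers alpha of the hard MEB dual.

    Only kernel evaluations are used, so the centre mu = sum_i alpha_i phi(x_i)
    is never formed explicitly.
    """
    m = len(points)
    K = [[kernel(points[i], points[j]) for j in range(m)] for i in range(m)]
    alpha = [0.0] * m
    alpha[0] = 1.0  # start at a vertex of the simplex
    for t in range(1, iterations + 1):
        K_alpha = [sum(K[i][j] * alpha[j] for j in range(m)) for i in range(m)]
        # Squared distance of phi(x_i) to the centre, up to a term that is
        # the same for every i and therefore does not affect the argmax.
        score = [K[i][i] - 2.0 * K_alpha[i] for i in range(m)]
        far = max(range(m), key=lambda i: score[i])
        step = 1.0 / (t + 1)
        alpha = [(1.0 - step) * a for a in alpha]
        alpha[far] += step  # alpha stays non-negative and sums to 1
    return alpha


def linear_kernel(x, y):
    """Plain dot product, so the feature space equals the data space."""
    return sum(a * b for a, b in zip(x, y))


points = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.5)]
alpha = solve_meb_dual(points, linear_kernel)
# With a linear kernel the centre can be read off directly from alpha.
centre = [sum(a * p[d] for a, p in zip(alpha, points)) for d in range(2)]
```

With the linear kernel the result can be checked by eye: the smallest ball around these three points is centred near (1, 0), and the third point, which lies strictly inside, keeps a multiplier of zero. The iteration converges slowly, so a dedicated quadratic programming solver would be the more practical choice for real data.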

Gaussian Kernel

$$K(x_i, x_j) = e^{-q \|x_i - x_j\|^2} \tag{7}$$

where $q$ is a user defined parameter.
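
As a small illustration of (7), the kernel can be written in a few lines. This is a minimal sketch in plain Python; the function name `gaussian_kernel` and the sample values are illustrative.

```python
import math


def gaussian_kernel(x, y, q):
    """K(x, y) = exp(-q * ||x - y||^2) for points given as sequences of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-q * sq_dist)


# A point compared with itself always gives 1; the value decays with distance.
k_same = gaussian_kernel([0.0, 0.0], [0.0, 0.0], q=0.5)
k_far = gaussian_kernel([0.0, 0.0], [3.0, 4.0], q=0.5)
```

Because $K(x, x) = 1$ for every $x$, all images $\phi(x)$ have unit norm in the feature space. Larger values of $q$ make the kernel decay faster with distance.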

[Figure: Feature Space and Data Space panels, related by the map φ(x)]

Figure 3: Support Vector Clustering

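The Primal is

$$\min_{R,\,\mu,\,\xi} \; R^2 + C \sum_{i=1}^{m} \xi_i$$

$$\text{s.t.} \quad \|\phi(x_i) - \mu\|^2 \le R^2 + \xi_i, \quad \xi_i \ge 0 \quad \forall i \tag{8}$$

where $C$ is a constant and $C \sum_{i=1}^{m} \xi_i$ is the penalty term.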

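The Lagrangian

$$L = R^2 + C \sum_{i=1}^{m} \xi_i + \sum_{i=1}^{m} \alpha_i \left( \|\phi(x_i) - \mu\|^2 - R^2 - \xi_i \right) - \sum_{i=1}^{m} \beta_i \xi_i$$

where the Lagrange multipliers satisfy

$$\alpha_i \ge 0, \quad \beta_i \ge 0 \quad \forall i \tag{9}$$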

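Karush-Kuhn-Tucker conditions

$$\frac{\partial L}{\partial R} = 0 \implies 2R - 2R \sum_{i=1}^{m} \alpha_i = 0 \implies \sum_{i=1}^{m} \alpha_i = 1 \tag{10}$$

$$\frac{\partial L}{\partial \mu} = 0 \implies -2 \sum_{i=1}^{m} \alpha_i \left( \phi(x_i) - \mu \right) = 0 \implies \mu = \sum_{i=1}^{m} \alpha_i \phi(x_i) \tag{11}$$

$$\frac{\partial L}{\partial \xi_i} = 0 \implies C - \alpha_i - \beta_i = 0 \implies \alpha_i + \beta_i = C \tag{12}$$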

Complementary Slackness conditions

$$\alpha_i \left( \|\phi(x_i) - \mu\|^2 - R^2 - \xi_i \right) = 0 \tag{13}$$

$$\beta_i \xi_i = 0 \tag{14}$$
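
Conditions (9), (12), (13) and (14) together determine where each point sits relative to the sphere. A short worked derivation, using only those equations:

```latex
\begin{align*}
\beta_i = C - \alpha_i \ge 0 &\implies 0 \le \alpha_i \le C \\
\xi_i > 0 &\implies \beta_i = 0 \implies \alpha_i = C \\
0 < \alpha_i < C &\implies \beta_i > 0 \implies \xi_i = 0
  \implies \|\phi(x_i) - \mu\|^2 = R^2 \\
\|\phi(x_i) - \mu\|^2 < R^2 &\implies \alpha_i = 0
\end{align*}
```

The second line says that any point with positive slack, that is, any point outside the sphere, has its multiplier at the upper bound $C$. The last line follows from (13), since the bracket is strictly negative for a point strictly inside the sphere.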

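Wolfe Dual Form

$$\min_{\alpha} \; \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j) - \sum_{i=1}^{m} \alpha_i K(x_i, x_i)$$

$$\text{s.t.} \quad \sum_{i=1}^{m} \alpha_i = 1, \quad 0 \le \alpha_i \le C \quad \forall i \tag{15}$$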

Observations
• Points with $\alpha_i = 0$ lie inside the sphere.
• Points with $0 < \alpha_i < C$ lie on the sphere (Support Vectors, SVs).
• Points with $\alpha_i = C$ lie outside the feature space sphere (Bounded Support Vectors, BSVs).
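
The three cases above translate directly into a lookup on the multiplier values. A minimal sketch, assuming the multipliers come from some solver of the dual (15); the function name `classify_points`, the label strings and the tolerance `tol` are illustrative and not part of the source.

```python
def classify_points(alpha, C, tol=1e-8):
    """Label each point from its multiplier: inside, SV (on the sphere) or BSV."""
    labels = []
    for a in alpha:
        if a <= tol:
            labels.append("inside")
        elif a >= C - tol:
            labels.append("BSV")
        else:
            labels.append("SV")
    return labels


# Hypothetical multipliers that sum to 1, with C = 0.5.
labels = classify_points([0.0, 0.3, 0.5, 0.2], C=0.5)
```

A tolerance is used because numerical solvers rarely return multipliers that are exactly 0 or exactly C.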

The radius of the sphere enclosing the images of the data points is given by

$$R = G(x_i) \quad \text{for any } x_i \text{ with } 0 < \alpha_i < C$$

where

$$G^2(x_i) = \|\phi(x_i) - \mu\|^2 = K(x_i, x_i) - 2 \sum_{j=1}^{m} \alpha_j K(x_j, x_i) + \sum_{j,k} \alpha_j \alpha_k K(x_j, x_k) \tag{16}$$
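
Equation (16) needs only kernel evaluations, which the following sketch makes concrete for the Gaussian kernel. The helper names and the two-point data set with equal multipliers are assumptions chosen so the result can be checked by hand.

```python
import math


def gaussian_kernel(x, y, q):
    """K(x, y) = exp(-q * ||x - y||^2)."""
    return math.exp(-q * sum((a - b) ** 2 for a, b in zip(x, y)))


def squared_distance_to_centre(x, points, alpha, q):
    """G^2(x) from equation (16), using kernel values only."""
    m = len(points)
    cross = sum(alpha[j] * gaussian_kernel(points[j], x, q) for j in range(m))
    const = sum(
        alpha[j] * alpha[k] * gaussian_kernel(points[j], points[k], q)
        for j in range(m)
        for k in range(m)
    )
    return gaussian_kernel(x, x, q) - 2.0 * cross + const


# Two points with equal multipliers: both are support vectors by symmetry.
points = [(0.0,), (1.0,)]
alpha = [0.5, 0.5]
g2_first = squared_distance_to_centre(points[0], points, alpha, q=1.0)
g2_second = squared_distance_to_centre(points[1], points, alpha, q=1.0)
g2_middle = squared_distance_to_centre((0.5,), points, alpha, q=1.0)
```

For this symmetric example both data points are at the same squared distance $0.5\,(1 - e^{-1})$ from the centre, and the midpoint between them is closer, so it lies inside the sphere. The double sum does not depend on $x$ and could be computed once and reused.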

Contours that enclose the points in data space are defined by

$$\{\, x \mid G(x) = R \,\} \tag{17}$$

Thus, with the help of the kernel function, both the computation in the high dimensional space and the reverse mapping needed to find the contours in data space are avoided.

Cluster Assignment
Cluster assignment employs a geometric method involving $G(x)$, based on the following observation: given a pair of points that belong to different clusters, any path that connects them must exit from the sphere in feature space.
Define an adjacency matrix $M$ between pairs of points $x_i$ and $x_j$ whose images lie in or on the sphere in the feature space, by looking at the image of the path (the line segment $[x_i, x_j]$) that connects them:

$$M[i, j] = \begin{cases} 1 & \text{if } G(y) \le R \;\; \forall y \in [x_i, x_j] \\ 0 & \text{otherwise} \end{cases} \tag{18}$$

• Clusters are now defined as the connected components of the graph induced by $M$.
• The points that lie outside the sphere (Bounded Support Vectors) can be assigned to the closest clusters.
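
The adjacency test and the connected-components step can be sketched as follows. Two assumptions are made for illustration: the condition "for all y on the segment" is approximated by checking a finite number of sample points, and a toy distance function `toy_G` stands in for the kernel-based $G$ so that the example is self-contained. All function names are hypothetical.

```python
def segment_inside(G, R, xi, xj, samples=20):
    """Approximate 'G(y) <= R for all y in [xi, xj]' by sampling the segment."""
    for s in range(samples + 1):
        t = s / samples
        y = [a + t * (b - a) for a, b in zip(xi, xj)]
        if G(y) > R:
            return False
    return True


def find_clusters(points, G, R, samples=20):
    """Build the adjacency matrix M and label its connected components."""
    m = len(points)
    M = [
        [1 if segment_inside(G, R, points[i], points[j], samples) else 0
         for j in range(m)]
        for i in range(m)
    ]
    labels = [-1] * m
    n_clusters = 0
    for start in range(m):
        if labels[start] != -1:
            continue
        labels[start] = n_clusters
        stack = [start]
        while stack:  # depth-first search over the graph induced by M
            i = stack.pop()
            for j in range(m):
                if M[i][j] and labels[j] == -1:
                    labels[j] = n_clusters
                    stack.append(j)
        n_clusters += 1
    return labels


def toy_G(y):
    """Stand-in for G: distance to the nearer of two 1-D blobs at 0 and 10."""
    return min(abs(y[0] - 0.0), abs(y[0] - 10.0))


points = [(0.0,), (1.0,), (10.0,), (11.0,)]
labels = find_clusters(points, toy_G, R=1.5)
```

Here the first two points end up in one cluster and the last two in another, because every segment between the two groups passes through a region where the toy function exceeds R. Sampling can miss a narrow gap between clusters, so the number of samples trades accuracy for cost.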

Rough Support Vector Clustering

• A Rough Sphere is a sphere having an inner radius $R$ defining its lower approximation and an outer radius $T > R$ defining its upper approximation.
• RSVC tries to find the smallest Rough Sphere in the high dimensional space enclosing the images of all the points in the data set.

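$$\min \; R^2 + T^2 + \frac{1}{\upsilon m} \sum_{i=1}^{m} \xi_i + \frac{\delta}{\upsilon m} \sum_{i=1}^{m} \xi'_i$$

s.t.

$$\|\phi(x_i) - \mu\|^2 \le R^2 + \xi_i + \xi'_i$$

$$0 \le \xi_i \le T^2 - R^2$$

$$\xi'_i \ge 0 \quad \forall i$$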

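$$\begin{aligned}
L = {} & R^2 + T^2 + \frac{1}{\upsilon m} \sum_{i=1}^{m} \xi_i + \frac{\delta}{\upsilon m} \sum_{i=1}^{m} \xi'_i \\
& + \sum_{i=1}^{m} \alpha_i \left( \|\phi(x_i) - \mu\|^2 - R^2 - \xi_i - \xi'_i \right) - \sum_{i=1}^{m} \beta_i \xi_i \\
& + \sum_{i=1}^{m} \lambda_i \left( \xi_i - T^2 + R^2 \right) - \sum_{i=1}^{m} \eta_i \xi'_i
\end{aligned}$$

where the Lagrange multipliers satisfy

$$\alpha_i \ge 0, \quad \beta_i \ge 0, \quad \lambda_i \ge 0, \quad \eta_i \ge 0 \quad \forall i$$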

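$$\sum_{i=1}^{m} \alpha_i = 2 \tag{1}$$

$$\mu = \frac{1}{2} \sum_{i=1}^{m} \alpha_i \phi(x_i) \tag{2}$$

$$\beta_i - \lambda_i = \frac{1}{\upsilon m} - \alpha_i \tag{3}$$

$$\frac{\delta}{\upsilon m} - \alpha_i = \eta_i \tag{4}$$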
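
These four results follow from setting the partial derivatives of the RSVC Lagrangian to zero. The sketch below assumes the Lagrangian carries the terms $+\sum_i \lambda_i (\xi_i - T^2 + R^2)$, $-\sum_i \beta_i \xi_i$ and $-\sum_i \eta_i \xi'_i$; the intermediate identity $\sum_i \lambda_i = 1$ is not stated in the source and is derived here.

```latex
\begin{align*}
\frac{\partial L}{\partial T} = 0 &\implies 2T - 2T \sum_{i=1}^{m} \lambda_i = 0
  \implies \sum_{i=1}^{m} \lambda_i = 1 \\
\frac{\partial L}{\partial R} = 0 &\implies 2R - 2R \sum_{i} \alpha_i + 2R \sum_{i} \lambda_i = 0
  \implies \sum_{i} \alpha_i = 1 + \sum_{i} \lambda_i = 2 \\
\frac{\partial L}{\partial \mu} = 0 &\implies -2 \sum_{i} \alpha_i \bigl( \phi(x_i) - \mu \bigr) = 0
  \implies \mu = \frac{1}{2} \sum_{i} \alpha_i \phi(x_i) \\
\frac{\partial L}{\partial \xi_i} = 0 &\implies \frac{1}{\upsilon m} - \alpha_i - \beta_i + \lambda_i = 0
  \implies \beta_i - \lambda_i = \frac{1}{\upsilon m} - \alpha_i \\
\frac{\partial L}{\partial \xi'_i} = 0 &\implies \frac{\delta}{\upsilon m} - \alpha_i - \eta_i = 0
  \implies \eta_i = \frac{\delta}{\upsilon m} - \alpha_i
\end{align*}
```

The factor $\frac{1}{2}$ in the centre appears because the multipliers now sum to 2 rather than 1. Since $\eta_i \ge 0$, the last line also gives the upper bound $\alpha_i \le \frac{\delta}{\upsilon m}$.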

Complementary Slackness conditions are

$$\alpha_i \left( \|\phi(x_i) - \mu\|^2 - R^2 - \xi_i - \xi'_i \right) = 0 \tag{5}$$

$$\lambda_i \left( \xi_i - T^2 + R^2 \right) = 0 \tag{6}$$

$$\beta_i \xi_i = 0 \tag{7}$$

$$\eta_i \xi'_i = 0 \tag{8}$$

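The Wolfe Dual form can be written as

$$\min_{\alpha} \; \sum_{i,j} \alpha_i \alpha_j K(x_i, x_j) - \sum_{i=1}^{m} \alpha_i K(x_i, x_i)$$

$$\text{s.t.} \quad 0 \le \alpha_i \le \frac{\delta}{\upsilon m} \;\; \text{for } i = 1, \ldots, m, \quad \sum_{i=1}^{m} \alpha_i = 2$$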

• Points with $\alpha_i = 0$ lie in the lower approximation.
• Points with $0 < \alpha_i < \frac{1}{\upsilon m}$ form the Hard Support Vectors (Support Vectors which mark the boundary of the lower approximation).
• Points with $\alpha_i = \frac{1}{\upsilon m}$ lie in the boundary region (patterns that may be shared by more than one cluster).
• Points with $\frac{1}{\upsilon m} < \alpha_i < \frac{\delta}{\upsilon m}$ form the Soft Support Vectors (Support Vectors which mark the boundary of the upper approximation).
• Points with $\alpha_i = \frac{\delta}{\upsilon m}$ lie outside the sphere (Bounded Support Vectors).
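
The five ranges above can be turned into a small lookup on the multiplier values. This is a sketch under stated assumptions: the multipliers are taken as given (from some solver of the RSVC dual), and the function name `classify_rsvc`, the label strings, the tolerance `tol` and the sample values are all illustrative.

```python
def classify_rsvc(alpha, upsilon, delta, tol=1e-8):
    """Label each point by where its multiplier falls between 0, 1/(vm), d/(vm)."""
    m = len(alpha)
    lo = 1.0 / (upsilon * m)      # threshold 1 / (upsilon * m)
    hi = delta / (upsilon * m)    # threshold delta / (upsilon * m)
    labels = []
    for a in alpha:
        if a <= tol:
            labels.append("lower approximation")
        elif a < lo - tol:
            labels.append("hard SV")
        elif a <= lo + tol:
            labels.append("boundary region")
        elif a < hi - tol:
            labels.append("soft SV")
        else:
            labels.append("BSV")
    return labels


# Hypothetical multipliers for m = 8 points that sum to 2, with
# upsilon = 0.5 and delta = 2, so the thresholds are 0.25 and 0.5.
alpha = [0.0, 0.1, 0.25, 0.4, 0.5, 0.5, 0.25, 0.0]
labels = classify_rsvc(alpha, upsilon=0.5, delta=2.0)
```

In this example two points are Bounded Support Vectors, which respects the limit of $2 \upsilon m / \delta = 4$ implied by the dual constraints.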

Let us define

$$R = G(x_i) : \; 0 < \alpha_i < \frac{1}{\upsilon m}$$

$$T = G(x_i) : \; \frac{1}{\upsilon m} < \alpha_i < \frac{\delta}{\upsilon m}$$

algorithm find clusters
{
• Find the adjacency matrix $M$ as

$$M[i, j] = \begin{cases} 1 & \text{if } G(y) \le R \;\; \forall y \in [x_i, x_j] \\ 0 & \text{otherwise} \end{cases}$$

• Find the connected components of the graph represented by $M$. This gives the Lower Approximation of each cluster.
• Now find the Boundary Regions: for $x_i \in LA(C_i)$ and a pattern $x_k \notin LA(C_j)$ for any cluster $j$, if $G(y) \le T \;\; \forall y \in [x_i, x_k]$ then $x_k \in BR(C_i)$.
}
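
The boundary-region step of the algorithm can be sketched as follows, taking the lower approximations as already computed. As before, the segment condition is approximated by sampling, and a toy function `toy_G` stands in for the kernel-based $G$ so the example runs on its own; the function names, the label convention (-1 for "in no lower approximation") and the data are assumptions made for illustration.

```python
def segment_within(G, radius, xi, xk, samples=20):
    """Approximate 'G(y) <= radius for all y in [xi, xk]' by sampling."""
    for s in range(samples + 1):
        t = s / samples
        y = [a + t * (b - a) for a, b in zip(xi, xk)]
        if G(y) > radius:
            return False
    return True


def boundary_regions(points, lower_labels, G, T, samples=20):
    """Map each cluster index to the set of point indices in its boundary region.

    lower_labels[i] is the cluster whose lower approximation contains x_i,
    or -1 if x_i belongs to no lower approximation.
    """
    regions = {}
    for k, xk in enumerate(points):
        if lower_labels[k] != -1:
            continue  # only patterns outside every lower approximation
        for i, xi in enumerate(points):
            c = lower_labels[i]
            if c != -1 and segment_within(G, T, xi, xk, samples):
                regions.setdefault(c, set()).add(k)
    return regions


def toy_G(y):
    """Stand-in for G: distance to the nearer of two 1-D blobs at 0 and 10."""
    return min(abs(y[0] - 0.0), abs(y[0] - 10.0))


points = [(0.0,), (1.0,), (10.0,), (11.0,), (3.0,), (8.0,)]
lower_labels = [0, 0, 1, 1, -1, -1]
regions = boundary_regions(points, lower_labels, toy_G, T=3.5)
```

In this run the point at 3 joins the boundary region of the first cluster and the point at 8 that of the second. A pattern may appear in the boundary regions of several clusters when it can be reached from more than one lower approximation without leaving the outer radius.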

• Number of Bounded Support Vectors: $n_{bsv} < \frac{2 \upsilon m}{\delta}$.
• For $\delta = 1$, $n_{bsv} < 2 \upsilon m = \upsilon' m$ where $\upsilon' = 2\upsilon$. This corresponds to all the patterns $x_i$ with $\|\phi(x_i) - \mu\|^2 > R^2$.
• Since $\delta > 1$ for RSVC, we can say that $\frac{\upsilon}{\delta}$ is the upper bound on the fraction of points permitted to lie outside $T$ and $\upsilon$ is the upper bound on the fraction of points permitted to lie outside $R$.
• Hence $\upsilon$ and $\delta$ together give us control over the width of the boundary region and the number of Bounded Support Vectors.
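
The bound on the number of Bounded Support Vectors can be traced back to the dual constraints $0 \le \alpha_i \le \frac{\delta}{\upsilon m}$ and $\sum_i \alpha_i = 2$. A short worked step, counting only the points whose multiplier sits at the upper bound:

```latex
\[
n_{bsv} \cdot \frac{\delta}{\upsilon m} \;\le\; \sum_{i=1}^{m} \alpha_i = 2
\quad\Longrightarrow\quad
n_{bsv} \;\le\; \frac{2 \upsilon m}{\delta}
\]
```

The inequality is strict whenever at least one other point also has $\alpha_i > 0$, for example a support vector on either boundary. As an illustration with assumed values $m = 100$, $\upsilon = 0.1$ and $\delta = 2$, at most $2 \cdot 0.1 \cdot 100 / 2 = 10$ points can be Bounded Support Vectors.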
