Classification: Linear SVM
Huiping Cao
Support Vector Machines: find the hyperplane that maximizes the margin, so B1 is better than B2.

[Figure: two decision boundaries B1 and B2 with their margin hyperplanes (b11, b12 for B1; b21, b22 for B2); B1 has the wider margin.]

y = { +1 if w^T x + b ≥ 1
    { −1 if w^T x + b ≤ −1

Margin = 2 / ||w||

B1:  w^T x + b = 0
b11: w^T x + b = 1
b12: w^T x + b = −1

CS479/579: Data Mining, Huiping Cao, Slide 8/26
Linear Support Vector Machine
[Figure: decision boundary B1 (w^T x + b = 0) with margin hyperplanes b11 and b12 (w^T x + b = ±1), points X1 and X2 on opposite margins, and margin width d measured along the direction of w.]
We want to maximize the margin d = 2 / ||w||,
which is equivalent to minimizing L(w) = ||w||^2 / 2.
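As a quick numeric check of the margin formula (a minimal sketch; the `margin` helper name is an assumption, not from the slides):

```python
import math

def margin(w):
    """Margin of the hyperplane w^T x + b = 0: d = 2 / ||w||."""
    norm = math.sqrt(sum(wi * wi for wi in w))
    return 2.0 / norm

# For w = (3, 4), ||w|| = 5, so the margin is 2/5 = 0.4.
print(margin([3.0, 4.0]))  # 0.4
```

Maximizing d and minimizing ||w||^2 / 2 have the same optimizer; the square just makes the objective differentiable everywhere.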
Subject to the following constraints:
y_i = { +1 if w^T x_i + b ≥ 1
      { −1 if w^T x_i + b ≤ −1
which is equivalent to
y_i (w^T x_i + b) ≥ 1,  i = 1, ..., N
Minimize ||w||^2 / 2
subject to
y_i (w^T x_i + b) ≥ 1,  i = 1, ..., N
Convex optimization problem:
Objective function is quadratic
Constraints are linear
Can be solved using the standard Lagrange multiplier method.
Lagrange formulation
Lagrangian for the optimization problem (take the constraints into account by rewriting the objective function):

L_P(w, b, λ) = ||w||^2 / 2 − Σ_{i=1}^{N} λ_i (y_i (w^T x_i + b) − 1)

minimize w.r.t. w and b, and maximize w.r.t. each λ_i ≥ 0,
where λ_i is a Lagrange multiplier.
To minimize the Lagrangian, take the derivatives of L_P(w, b, λ) w.r.t. w and b and set them to 0:

∂L_P/∂w = 0  ⟹  w = Σ_{i=1}^{N} λ_i y_i x_i
∂L_P/∂b = 0  ⟹  Σ_{i=1}^{N} λ_i y_i = 0
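The first stationarity condition expresses w as a weighted sum of the training points. A minimal sketch (the helper name and the toy data are assumptions):

```python
def weights_from_multipliers(lams, ys, xs):
    """w = sum_i lambda_i * y_i * x_i (stationarity condition dL_P/dw = 0)."""
    dim = len(xs[0])
    w = [0.0] * dim
    for lam, y, x in zip(lams, ys, xs):
        for d in range(dim):
            w[d] += lam * y * x[d]
    return w

# Toy example: two support vectors with lambda = 0.5 each.
xs = [[1.0, 0.0], [-1.0, 0.0]]
ys = [1.0, -1.0]
lams = [0.5, 0.5]
# The second condition, sum_i lambda_i y_i = 0, holds for this lambda.
assert abs(sum(l * y for l, y in zip(lams, ys))) < 1e-12
print(weights_from_multipliers(lams, ys, xs))  # [1.0, 0.0]
```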
Linear Support Vector Machine
||w||^2 / 2 = (1/2) w^T w = (1/2) (Σ_{i=1}^{N} λ_i y_i x_i^T) (Σ_{j=1}^{N} λ_j y_j x_j)
            = (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} λ_i λ_j y_i y_j x_i^T x_j            (1)

− Σ_{i=1}^{N} λ_i (y_i (w^T x_i + b) − 1)
  = − Σ_{i=1}^{N} λ_i y_i w^T x_i − Σ_{i=1}^{N} λ_i y_i b + Σ_{i=1}^{N} λ_i     (2)
  = − Σ_{i=1}^{N} λ_i y_i (Σ_{j=1}^{N} λ_j y_j x_j^T) x_i + Σ_{i=1}^{N} λ_i     (3)
  = Σ_{i=1}^{N} λ_i − Σ_{i=1}^{N} Σ_{j=1}^{N} λ_i λ_j y_i y_j x_i^T x_j         (4)

(The b term drops out in (3) because Σ_{i} λ_i y_i = 0.)
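Steps (1)–(4) can be sanity-checked numerically: at w = Σ_i λ_i y_i x_i with Σ_i λ_i y_i = 0, the primal L_P collapses to the dual L_D. A sketch (the helper name and toy data are assumptions):

```python
def lp_equals_ld(lams, ys, xs, b):
    """Check that L_P(w, b, lambda) equals L_D(lambda) at the stationary w."""
    n = len(xs)
    dot = lambda u, v: sum(a * c for a, c in zip(u, v))
    dim = len(xs[0])
    # Stationary point: w = sum_i lambda_i y_i x_i
    w = [sum(lams[i] * ys[i] * xs[i][d] for i in range(n)) for d in range(dim)]
    lp = 0.5 * dot(w, w) - sum(
        lams[i] * (ys[i] * (dot(w, xs[i]) + b) - 1.0) for i in range(n))
    ld = sum(lams) - 0.5 * sum(
        lams[i] * lams[j] * ys[i] * ys[j] * dot(xs[i], xs[j])
        for i in range(n) for j in range(n))
    return abs(lp - ld) < 1e-9

print(lp_equals_ld([0.5, 0.5], [1.0, -1.0], [[1.0], [-1.0]], 0.0))  # True
```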
Finalized L_D(λ):

L_D(λ) = Σ_{i=1}^{N} λ_i − (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} λ_i λ_j y_i y_j x_i^T x_j
Solve L_D(λ) – QP
subject to constraints:
(1) λ_i ≥ 0 for i = 1, 2, ..., N, and
(2) Σ_{i=1}^{N} λ_i y_i = 0.
Translate the objective to minimization because QP packages generally come with minimization:

min_λ ( (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} λ_i λ_j y_i y_j x_i^T x_j − Σ_{i=1}^{N} λ_i )
Solve L_D(λ) – QP
subject to
y^T λ = 0        (linear constraint)
0 ≤ λ ≤ ∞        (lower and upper bounds)
Lagrange multiplier
λ_i ≥ 0
The constraint (zero form with extreme value):
λ_i (y_i (w^T x_i + b) − 1) = 0
Either λ_i is zero,
or y_i (w^T x_i + b) − 1 = 0.
Support vector x_i: y_i (w^T x_i + b) − 1 = 0 and λ_i > 0.
Training instances that do not reside along these hyperplanes have λ_i = 0.
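The complementary-slackness condition can be verified numerically. A sketch for a toy 1-D solution (w = 1, b = 0; the helper name and data are assumptions), where the third point lies strictly outside the margin and therefore must have λ = 0:

```python
def complementary_slackness_holds(lams, ys, xs, w, b, tol=1e-9):
    """Check lambda_i * (y_i (w^T x_i + b) - 1) = 0 for every training point."""
    for lam, y, x in zip(lams, ys, xs):
        fx = sum(wi * xi for wi, xi in zip(w, x)) + b
        if abs(lam * (y * fx - 1.0)) > tol:
            return False
    return True

# Support vectors at x = 1 and x = -1 (lambda > 0); x = 2 is outside
# the margin, so its lambda must be 0 for the condition to hold.
print(complementary_slackness_holds(
    [0.5, 0.5, 0.0], [1.0, -1.0, 1.0], [[1.0], [-1.0], [2.0]], [1.0], 0.0))  # True
```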
Linear Support Vector Machine
Get w and b
b value:
b = y_i − w^T x_i
Idea:
Given a support vector (x_i, y_i), we have y_i (w^T x_i + b) − 1 = 0.
Multiply both sides by y_i: y_i^2 (w^T x_i + b) − y_i = 0.
y_i^2 = 1 because y_i = 1 or y_i = −1.
Then, (w^T x_i + b) − y_i = 0, so b = y_i − w^T x_i.
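The b computation is a one-liner once a support vector is known; a sketch (helper name assumed):

```python
def bias_from_support_vector(w, x_sv, y_sv):
    """b = y_i - w^T x_i, valid for any support vector (x_i, y_i)."""
    return y_sv - sum(wi * xi for wi, xi in zip(w, x_sv))

# With w = (1, 0) and support vector x = (1, 0), y = +1: b = 1 - 1 = 0.
print(bias_from_support_vector([1.0, 0.0], [1.0, 0.0], 1.0))  # 0.0
```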
Classify a test instance z:
y_z = sign(w^T z + b) = sign((Σ_{i=1}^{N} λ_i y_i x_i^T) z + b)
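The decision function only needs the multipliers and inner products with the training points; a sketch (names and toy data assumed):

```python
def classify(lams, ys, xs, b, z):
    """sign(w^T z + b) with w expanded as sum_i lambda_i y_i x_i."""
    score = sum(lam * y * sum(a * c for a, c in zip(x, z))
                for lam, y, x in zip(lams, ys, xs)) + b
    return 1 if score >= 0 else -1

# Toy 1-D model: support vectors at x = 1 and x = -1 give w = 1, b = 0.
xs = [[1.0], [-1.0]]
ys = [1.0, -1.0]
lams = [0.5, 0.5]
print(classify(lams, ys, xs, 0.0, [3.0]))   # 1
print(classify(lams, ys, xs, 0.0, [-0.2]))  # -1
```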
SVM – discussions
What if the problem is not linearly separable?
SVM – discussions
Nonlinear Support Vector Machines
Do not work in X space. Transform the data into a higher-dimensional Z space such that the data are linearly separable.

L_D(λ) = Σ_{i=1}^{N} λ_i − (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} λ_i λ_j y_i y_j x_i^T x_j

→

L_D(λ) = Σ_{i=1}^{N} λ_i − (1/2) Σ_{i=1}^{N} Σ_{j=1}^{N} λ_i λ_j y_i y_j z_i^T z_j
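A concrete example of such a transformation (my own illustration, not from the slides): the quadratic map z = (x1^2, √2·x1·x2, x2^2) makes an XOR-like pattern linearly separable.

```python
import math

def phi(x):
    """Map 2-D x into 3-D z-space: z = (x1^2, sqrt(2)*x1*x2, x2^2)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

# XOR-like labels: same-sign points are +1, mixed-sign points are -1.
# Not linearly separable in X space.
pts = [((1, 1), 1), ((-1, -1), 1), ((1, -1), -1), ((-1, 1), -1)]
# In Z space the second coordinate sqrt(2)*x1*x2 alone separates the classes.
for x, y in pts:
    z = phi(x)
    print(z[1] > 0, y == 1)  # the two booleans agree for every point
```

This particular map also satisfies z_i^T z_j = (x_i^T x_j)^2, which is the idea behind kernels: L_D needs only inner products in Z space.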
https://www.gnu.org/software/octave/doc/interpreter/Quadratic-Programming.html
Solve the quadratic program
min_x (0.5 x^T H x + x^T q)
subject to
A x = b
lb ≤ x ≤ ub
Alb ≤ Ain x ≤ Aub
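The SVM dual maps directly onto this generic QP form. A sketch in Python (the helper name is an assumption) that builds H, q, the equality constraint, and the bounds for a toy training set:

```python
def svm_qp_arguments(xs, ys):
    """Arrange the SVM dual as min_x 0.5 x^T H x + q^T x:
    H_ij = y_i y_j x_i^T x_j, q = -1 (vector of -1s),
    equality constraint y^T lambda = 0 (A = y^T, b = 0),
    and bounds 0 <= lambda <= infinity."""
    n = len(xs)
    H = [[ys[i] * ys[j] * sum(a * c for a, c in zip(xs[i], xs[j]))
          for j in range(n)] for i in range(n)]
    q = [-1.0] * n
    A = [list(ys)]            # one equality row: y^T lambda = 0
    b = [0.0]
    lb = [0.0] * n
    ub = [float("inf")] * n
    return H, q, A, b, lb, ub

H, q, A, b, lb, ub = svm_qp_arguments([[1.0], [-1.0]], [1.0, -1.0])
print(H)  # [[1.0, 1.0], [1.0, 1.0]]
print(q)  # [-1.0, -1.0]
```

These are exactly the arguments a generic QP routine (such as Octave's qp) expects; the returned x is the vector of multipliers λ.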