SVM Hands-On Problem
Data setup
Optimal separating hyperplane
Solving the quadratic programming problem
Calculating w and b
In a previous post, I went over the theory of SVMs in great detail. Even though this chapter works standalone as well, I recommend checking it out first.
Data setup
We are given two data points, x1 = (2, 1) and x2 = (1, 2), with the labels y1 = 1 and y2 = −1 respectively.
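Since the inner products of these two points will show up repeatedly in the dual objective later on, here is a quick sketch in NumPy computing them (assuming NumPy is available):

```python
import numpy as np

# The two training points and their labels, as defined above.
x1 = np.array([2.0, 1.0])  # label y1 = +1
x2 = np.array([1.0, 2.0])  # label y2 = -1

# Inner products that will appear in the dual objective.
print(x1 @ x1)  # 5.0
print(x1 @ x2)  # 4.0
print(x2 @ x2)  # 5.0
```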
Optimal separating hyperplane

Our goal is to find a hyperplane that separates these points in the best possible way. This means we try to find the hyperplane that leaves the largest margin between the closest points and the hyperplane, i.e. we are looking for the optimal separating hyperplane.
We saw in the previous chapter that any hyperplane can be represented by

wx + b = 0    (1)

Recall also that the margin between the support vectors and the hyperplane is 1/∥w∥, so the total margin is 2/∥w∥. Support Vectors are the points (vectors) closest to the separating hyperplane. In our case, we only have two vectors, which will therefore both be Support Vectors. In the above representation of hyperplanes, both w and b are unknown.
In the previous chapter, we deduced the so-called dual form. The dual form contains a quadratic objective function with constraints. It reads:
maxα LD = ∑i=1…L αi − ½ ∑i,j αi αj yi yj ⟨xi, xj⟩

subject to ∑i=1…L αi yi = 0
αi ≥ 0 ∀i
Solving this optimization problem will give us α1 and α2.

Solving the quadratic programming problem
Quadratic programming (QP) is a special kind of mathematical optimization problem. It is about trying to find a vector
minimizing or maximizing an objective function subject to bounds, linear equality, and inequality constraints.
minx ( ½ xᵀHx + fᵀx )    (objective function)
Solving quadratic programming problems is not trivial, but there are many software packages containing solvers.
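As a small illustration of this standard form, here is a sketch using SciPy's general-purpose minimizer; H and f are made-up toy values, not part of our SVM problem:

```python
import numpy as np
from scipy.optimize import minimize

# A toy instance of the standard QP form: min 1/2 x^T H x + f^T x.
# H and f are illustration values only.
H = np.array([[2.0, 0.0],
              [0.0, 2.0]])
f = np.array([-2.0, -4.0])

def objective(x):
    return 0.5 * x @ H @ x + f @ x

# Unconstrained minimum: found where the gradient Hx + f vanishes.
res = minimize(objective, x0=np.zeros(2))
print(res.x)  # approximately [1, 2]
```

Dedicated QP solvers exploit the quadratic structure directly, but a general minimizer is enough for a problem this small.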
Let's now plug our data points into LD:
maxα LD = ∑i=1…L αi − ½ ∑i,j αi αj yi yj ⟨xi, xj⟩
= α1 + α2 − ½ ( α1α1 · 1 · 1 · ⟨(2, 1), (2, 1)⟩
+ 2 · α1α2 · 1 · (−1) · ⟨(2, 1), (1, 2)⟩
+ α2α2 · (−1) · (−1) · ⟨(1, 2), (1, 2)⟩ )
= α1 + α2 − ½ (5α1² − 8α1α2 + 5α2²)    (2)

subject to α1y1 + α2y2 = 0
α1 ≥ 0, α2 ≥ 0
In order to solve this quadratic programming problem, we make use of a solver that I made using Wolfram Alpha: it maximizes an objective function subject to arbitrary constraints. In the widget below, our optimization problem is already entered (note that α1 = x and α2 = y):

maximize x + y − ½ (5x² − 8xy + 5y²) subject to x − y = 0 and x >= 0 and y >= 0

[Interactive Wolfram Alpha widget: shows the input interpretation, the global maximum, and a 3D plot]
The solutions of (2) are therefore α1 = 1 and α2 = 1.
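If you don't have the widget at hand, the same maximum can be double-checked numerically, e.g. with SciPy (a sketch, assuming SciPy is installed; maximizing LD is done by minimizing its negative):

```python
import numpy as np
from scipy.optimize import minimize

# Maximize LD = a1 + a2 - 1/2 (5 a1^2 - 8 a1 a2 + 5 a2^2)
# by minimizing its negative, subject to a1 - a2 = 0 and a1, a2 >= 0.
def neg_dual(a):
    a1, a2 = a
    return -(a1 + a2 - 0.5 * (5 * a1**2 - 8 * a1 * a2 + 5 * a2**2))

constraints = [{"type": "eq", "fun": lambda a: a[0] - a[1]}]
bounds = [(0, None), (0, None)]

res = minimize(neg_dual, x0=[0.5, 0.5], bounds=bounds,
               constraints=constraints)
print(res.x)  # approximately [1, 1]
```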
Calculating w and b
The bulk of the work is done now! What's left to do is to finally calculate w and b. Here's how to do it:
w = ∑i=1…L αi yi xi = 1 · 1 · (2, 1) + 1 · (−1) · (1, 2)
= (2, 1) − (1, 2)
= (1, −1)
b = ys − ∑m∈S αm ym ⟨xm, xs⟩ = 1 − (1 · 1 · ⟨(2, 1), (2, 1)⟩ + 1 · (−1) · ⟨(2, 1), (1, 2)⟩)
= 1 − (5 − 4)
= 0
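The same arithmetic can be reproduced in a few lines of NumPy (a sketch; the variable names are my own):

```python
import numpy as np

x = np.array([[2.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, -1.0])
alpha = np.array([1.0, 1.0])  # solution of the dual problem

# w = sum_i alpha_i y_i x_i
w = (alpha * y) @ x
print(w)  # [ 1. -1.]

# b = y_s - sum_m alpha_m y_m <x_m, x_s>, using the support
# vector x_s = (2, 1) with y_s = 1 (index 0).
s = 0
b = y[s] - np.sum(alpha * y * (x @ x[s]))
print(b)  # 0.0
```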
Our hyperplane (1), therefore, is:

(1, −1) · x = 0
And that's it! By the way, you can rewrite that by evaluating the left-hand side:
x1 − x2 = 0 ⇔ x1 = x2
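As a final sanity check, one can verify that this hyperplane classifies both training points correctly (a small sketch):

```python
import numpy as np

w = np.array([1.0, -1.0])
b = 0.0

# Decision function w.x + b; the sign gives the predicted class.
for point, label in [((2, 1), 1), ((1, 2), -1)]:
    value = w @ np.array(point, dtype=float) + b
    print(point, value, np.sign(value) == label)
```

Both points land exactly on the margin boundaries (decision values +1 and −1), as expected for support vectors.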
Sharing is caring!