SVM Hands-On Problem


Machine Learning, Deep Learning, AI

Tutorials and blog posts

How Support Vector Machines work – an example.


December 18, 2016
Support Vector Machines are a common method for binary classification and regression. There are many resources online that attempt to explain how SVMs work, but few include an example with actual numbers. In this section, I explain how they work with a concrete example that folks can compare their own calculations to, in order to ensure they understand SVMs correctly.

 Data setup
 Optimal separating hyperplane
 Solving the quadratic programming problem
 Calculating w and b
In a previous post, I went over the theory of SVMs in great detail. Even though this chapter works standalone as well, I recommend checking it out first.

Data setup

We will try to separate the points

x1 = (2, 1)  and  x2 = (1, 2)

by hand. To that end, we assign the labels

y1 = 1  and  y2 = −1

as seen in the picture below.
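The data setup above can be sketched in a few lines of Python (a minimal sketch; the variable names `X` and `y` are my own, and matplotlib is only needed if you want to reproduce the picture):

```python
# Two points with opposite labels -- the whole dataset of this example.
import numpy as np

X = np.array([[2, 1], [1, 2]])   # x1 = (2, 1), x2 = (1, 2)
y = np.array([1, -1])            # assigned labels y1 = 1, y2 = -1

# Uncomment to visualize the two points:
# import matplotlib.pyplot as plt
# plt.scatter(X[:, 0], X[:, 1], c=y, cmap="bwr")
# plt.show()
print(X.shape, y.tolist())
```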

Optimal separating hyperplane

Our goal is to find the hyperplane that separates these points in the best possible way, i.e. the one that leaves the largest margin between the closest points and the hyperplane. This is called the optimal separating hyperplane.
We saw in the previous chapter that any hyperplane can be represented by

w ⋅ x + b = 0

where w is normal to the hyperplane and |b| / ∥w∥ is the perpendicular distance from the hyperplane to the origin.

Support Vectors are the points (vectors) closest to the separating hyperplane. In our case, we only have two vectors, which therefore will both be Support Vectors. In the above representation of hyperplanes, both w and b are unknowns. How do we retrieve them?

In the previous chapter, we deduced the so-called dual form. The dual form contains a quadratic objective function with linear constraints. It reads:

max_α  L_D = Σ_{i=1}^{L} αi − (1/2) Σ_{i,j} αi αj yi yj ⟨xi, xj⟩

subject to  Σ_{i=1}^{L} αi yi = 0

            αi ≥ 0  ∀i

Solving the quadratic programming problem in order to retrieve α1 and α2

This boils down to solving a QP (quadratic programming) problem.

Note: Quadratic Programming

Quadratic programming (QP) is a special kind of mathematical optimization problem. It is about trying to find a vector x minimizing or maximizing a quadratic objective function subject to bounds, linear equality, and inequality constraints:

min_x  (1/2) xᵀHx + fᵀx   (objective function)

subject to  Ax ≤ b   (inequality constraint)

            A_eq x = b_eq   (equality constraint)

            lb ≤ x ≤ ub   (bound constraint)

Solving quadratic programming problems is not trivial, but there are many software packages containing solvers.

Let's insert our numbers into L_D:

L_D = Σ_{i=1}^{L} αi − (1/2) Σ_{i,j} αi αj yi yj ⟨xi, xj⟩

    = α1 + α2 − (1/2) ( α1 α1 ⋅ 1 ⋅ 1 ⋅ ⟨(2, 1), (2, 1)⟩
                      + 2 ⋅ α1 α2 ⋅ 1 ⋅ (−1) ⋅ ⟨(2, 1), (1, 2)⟩
                      + α2 α2 ⋅ (−1) ⋅ (−1) ⋅ ⟨(1, 2), (1, 2)⟩ )

    = α1 + α2 − (1/2) (5α1² − 8α1α2 + 5α2²)

subject to  α1 y1 + α2 y2 = 0

            α1 ≥ 0,  α2 ≥ 0
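The inner products that produce the coefficients 5, −8, and 5 in the expanded objective can be checked in a couple of lines (a quick numeric sanity check, assuming numpy is available):

```python
import numpy as np

x1, x2 = np.array([2, 1]), np.array([1, 2])
y1, y2 = 1, -1

print(x1 @ x1)                  # <(2,1),(2,1)> = 5, coefficient of a1^2
print(2 * y1 * y2 * (x1 @ x2))  # cross term: 2 * 1 * (-1) * 4 = -8
print(x2 @ x2)                  # <(1,2),(1,2)> = 5, coefficient of a2^2
```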

In order to solve this quadratic programming problem, we make use of a solver widget I built with Wolfram Alpha: it maximizes an objective function subject to arbitrary constraints. Our optimization problem is entered into the widget as follows (note that α1 = x and α2 = y):

Objective function: x + y − 2.5x² + 4xy − 2.5y²

subject to x − y = 0, x ≥ 0, and y ≥ 0
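If you prefer to stay offline, the same QP can be solved numerically, for instance with SciPy (a minimal sketch, not the author's widget; it assumes scipy is installed and minimizes the negated objective, since SciPy solvers minimize by default):

```python
import numpy as np
from scipy.optimize import minimize

# Maximize L_D(a1, a2) = a1 + a2 - 0.5*(5*a1^2 - 8*a1*a2 + 5*a2^2)
# by minimizing its negative.
def neg_LD(a):
    a1, a2 = a
    return -(a1 + a2 - 0.5 * (5 * a1**2 - 8 * a1 * a2 + 5 * a2**2))

constraints = [{"type": "eq", "fun": lambda a: a[0] - a[1]}]  # a1*y1 + a2*y2 = 0
bounds = [(0, None), (0, None)]                               # a_i >= 0

res = minimize(neg_LD, x0=[0.5, 0.5], bounds=bounds, constraints=constraints)
print(res.x)  # both multipliers come out as approximately 1
```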

The Lagrange multipliers solving our quadratic programming problem (2) are therefore α1 = 1 and α2 = 1 (the widget lists the solution under "Global maximum").

Calculating w and b

The bulk of the work is done now! What's left to do is to finally calculate w and b. Here's how:

w = Σ_{i=1}^{L} αi yi xi = 1 ⋅ 1 ⋅ (2, 1) + 1 ⋅ (−1) ⋅ (1, 2)

  = (2, 1) − (1, 2)

  = (1, −1)

b = ys − Σ_{m∈S} αm ym ⟨xm, xs⟩ = 1 − ( 1 ⋅ 1 ⋅ ⟨(2, 1), (2, 1)⟩
                                      + 1 ⋅ (−1) ⋅ ⟨(2, 1), (1, 2)⟩ )

  = 0
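The two formulas above translate directly into code. A minimal sketch, assuming the multipliers α1 = α2 = 1 found earlier and using x1 as the support vector xs:

```python
import numpy as np

X = np.array([[2, 1], [1, 2]])   # x1, x2
y = np.array([1, -1])            # labels
alpha = np.array([1.0, 1.0])     # Lagrange multipliers from the QP

w = (alpha * y) @ X              # w = sum_i alpha_i * y_i * x_i
s = 0                            # index of the chosen support vector (x1)
b = y[s] - sum(alpha[m] * y[m] * (X[m] @ X[s]) for m in range(len(X)))

print(w, b)  # w = (1, -1), b = 0
```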

Our hyperplane (1), therefore, is:

(1, −1) ⋅ x + 0 = 0

And that's it! By the way, you can rewrite that by evaluating the left-hand side:

x1 − x2 = 0  ⇔  x1 = x2