Machine Learning Approaches - SVM



Machine Learning Approaches - SVM
B. Solaiman
Basel.solaiman@imt-atlantique.fr
Départ. Image & Traitement de l’Information
Brest, France

Introduction


Introduction 3

A pattern (sample, instance, example, ...) is represented by a set of K features, or
attributes, viewed as a K-dimensional feature vector X ∈ R^K.

The pattern “may” be associated with a target variable y = f(X) (in the case of
supervised learning).

Stored data (training set): (Xm, ym), with Xm ∈ R^K and ym = f(Xm),
where f is the unknown function associating X and y.

Machine learning: given the training set, estimate the prediction function f̃ by
minimizing “a” prediction error:  f̃ : Xm ∈ R^K  →  ỹm = f̃(Xm)


Introduction 4

No target variable y  →  unlabeled training data

y ∈ L = {C1, C2, …, Cn}  →  labeled training data (L: set of discrete labels)
→  classification problem

y ∈ R  →  labeled training data
→  regression problem: instead of predicting the class of an input pattern, we want to
predict continuous values


Introduction 5

Nature of a feature

Numerical, or quantitative, feature

Symbolic, or qualitative, feature
  Nominal feature: feature values are symbols (= and ≠ are the only operations we can
  conduct)
  Ordinal feature: feature values are “ordered” symbols (=, ≠, <, >, ≤ and ≥ are the
  only operations we can conduct)


Introduction 6

Distance between patterns: numerical features

X = [x1, x2, …, xn]  and  Y = [y1, y2, …, yn],  xi, yi ∈ R

Minkowski distance:  d_p(X, Y) = ( Σ_{i=1..n} |xi − yi|^p )^(1/p)

Manhattan distance:  d_1(X, Y) = Σ_{i=1..n} |xi − yi|             (p = 1)

Euclidean distance:  d_2(X, Y) = ( Σ_{i=1..n} (xi − yi)² )^(1/2)  (p = 2)

Max distance:        d_∞(X, Y) = max_{i=1..n} |xi − yi|           (p = ∞)


Introduction 7

Distance between patterns: nominal features

X = [x1, x2, …, xn]  and  Y = [y1, y2, …, yn],  with X, Y ∈ Ω = Ω1 × Ω2 × … × Ωn

dist(X, Y) = (n − Q) / n

Q: number of features having the same modality in X and Y
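A minimal Python sketch of this distance (the function name and example values are
mine):

```python
def nominal_distance(x, y):
    """dist(X, Y) = (n - Q) / n for two nominal feature vectors of equal length."""
    q = sum(1 for xi, yi in zip(x, y) if xi == yi)   # Q: features with the same modality
    return (len(x) - q) / len(x)

print(nominal_distance(["red", "round", "small"],
                       ["red", "square", "small"]))   # 1/3 ≈ 0.333
```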


Introduction 8

Distance between patterns: nominal binary features

X = [x1, x2, …, xn]  and  Y = [y1, y2, …, yn],  xi, yi ∈ Ω = {0, 1}

Confusion matrix (counts over the n features):
           Y = 1   Y = 0
  X = 1      a       b
  X = 0      c       d

a: number of features equal to 1 in X and 1 in Y
b: number of features equal to 1 in X and 0 in Y
c: number of features equal to 0 in X and 1 in Y
d: number of features equal to 0 in X and 0 in Y


Introduction 9

Distance between patterns: nominal binary features

Case 1: binary symmetric features (0 and 1 have the same importance)

Simple matching coefficient:
dist(X, Y) = (b + c) / (a + b + c + d)

Example (a = 2, b = 2, c = 1, d = 2):
dist(X, Y) = (2 + 1) / (2 + 2 + 1 + 2) = 3 / 7 = 0.429

Introduction 10

Distance between patterns: nominal binary features

Case 2: binary non-symmetric features (1 is more important than 0)

Jaccard coefficient:
dist(X, Y) = (b + c) / (a + b + c)

Example (a = 2, b = 2, c = 1):
dist(X, Y) = (2 + 1) / (2 + 2 + 1) = 3 / 5 = 0.6
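A minimal Python sketch covering both binary cases; the example vectors are my own
choice, picked so that a = 2, b = 2, c = 1, d = 2 as in the slides:

```python
def binary_counts(x, y):
    """Counts a, b, c, d of the confusion matrix between two binary vectors."""
    a = sum(xi == 1 and yi == 1 for xi, yi in zip(x, y))
    b = sum(xi == 1 and yi == 0 for xi, yi in zip(x, y))
    c = sum(xi == 0 and yi == 1 for xi, yi in zip(x, y))
    d = sum(xi == 0 and yi == 0 for xi, yi in zip(x, y))
    return a, b, c, d

def simple_matching_distance(x, y):        # Case 1: symmetric binary features
    a, b, c, d = binary_counts(x, y)
    return (b + c) / (a + b + c + d)

def jaccard_distance(x, y):                # Case 2: non-symmetric binary features
    a, b, c, _ = binary_counts(x, y)
    return (b + c) / (a + b + c)

X = [1, 1, 1, 1, 0, 0, 0]
Y = [1, 1, 0, 0, 1, 0, 0]                  # a = 2, b = 2, c = 1, d = 2
print(simple_matching_distance(X, Y))      # 3/7 ≈ 0.429
print(jaccard_distance(X, Y))              # 3/5 = 0.6
```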

Support Vector Machines (SVM)


Support Vector Machines (SVM)


Linear Models for Classification
Perceptron
A Bit of Geometry
Hard (Linear) Support Vector Machines (H-SVM)

Soft Margin Support Vector Machines (SM-SVM)


Kernel-based Support Vector Machines


Support Vector Machines (SVM) 13

Linear Models for Classification

[Figure: training set with two classes C1 and C2 in the (x1, x2) plane, separated by a
hyperplane H]

Model for classification: the model assigns (x1, x2) to C1 or (x1, x2) to C2.

Training set: binary classification (two classes).
Linear model: the separation surface H is a hyperplane.

Support Vector Machines (SVM) 14

How does it work?

[Figure: hyperplane H separating C1 and C2, with half-spaces H+ and H−]

Hyperplane equation:  w · X + b = 0
X = (x1, x2, …, xn)
w = (w1, w2, …, wn)
w · X = x1·w1 + x2·w2 + … + xn·wn

If X is “on” H:  w · X + b = 0
If X ∈ H+:       w · X + b > 0
If X ∈ H−:       w · X + b < 0

Support Vector Machines (SVM) 15

[Figure: hyperplane H separating C1 and C2 in the (x1, x2) plane]

Linear model, classification decision rule: given X, compute w · X + b.

If w · X + b > 0, then decide C1
If w · X + b < 0, then decide C2
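A minimal Python sketch of this decision rule (the weights and points below are
hypothetical):

```python
import numpy as np

def linear_decision(w, b, X):
    """Decide C1 if w . X + b > 0, else C2."""
    return "C1" if np.dot(w, X) + b > 0 else "C2"

w, b = np.array([1.0, -2.0]), 0.5          # hypothetical hyperplane
print(linear_decision(w, b, [3.0, 1.0]))   # 3 - 2 + 0.5 = 1.5 > 0  -> C1
print(linear_decision(w, b, [0.0, 2.0]))   # 0 - 4 + 0.5 < 0        -> C2
```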


Support Vector Machines (SVM) 16

Linear Model for Binary Classification

Input: training dataset of pairs (X, y), with X = (x1, x2, …, xn) and y ∈ {+1, −1}

Find w = (w1, w2, …, wn) and b such that:
w · X + b > 0 if y = +1
w · X + b < 0 if y = −1
(separating hyperplane: w · X + b = 0)


Support Vector Machines (SVM) 17

Perceptron, Rosenblatt - 1958

[Figure: perceptron unit with inputs x1, x2, …, xn, weights w1, w2, …, wn and bias b]

f(X) = w · X + b = x1·w1 + x2·w2 + … + xn·wn + b  ≥ 0 ?

With b = w0 and x0 = 1:  f(X) = w · X  (dot product)

Decision rule: if f(X) = w · X > 0, then +1
               if f(X) = w · X < 0, then −1


Support Vector Machines (SVM) 18

A Bit of Geometry

Dot product:  <w, X> = w · X = ||w|| ||X|| cos θ = x1·w1 + x2·w2 + … + xn·wn

[Figure: when the angle θ between w and X is < 90°, w · X > 0; when θ > 90°, w · X < 0]

Support Vector Machines (SVM) 19

Perceptron, Rosenblatt - 1958

Decision rule: if f(X) = w · X > 0, then +1
               if f(X) = w · X < 0, then −1

Types of error:
- f(X) = w · X > 0 whereas y = −1
- f(X) = w · X < 0 whereas y = +1

[Figure: in both cases the weight vector is corrected from wt to wt+1 using X]

Weight update:  Δw ∝ y · X


Support Vector Machines (SVM) 21

Perceptron, Rosenblatt - 1958

Perceptron algorithm:

Random initialization of (w1, w2, …, wn), b
Pick the training samples X one by one
Predict the class of X using the current (w1, w2, …, wn), b:  y’ = sign(w · X)
If y’ is correct (i.e., y = y’): no change, wt+1 = wt
If y’ is wrong: adjust wt:  wt+1 = wt + η · yt · Xt
► η is the learning rate parameter
► Xt is the t-th training example
► yt is the true class label of the t-th example ({+1, −1})
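A minimal Python sketch of this algorithm (the toy dataset is my own; the bias b is
absorbed as w0 with a constant input x0 = 1, as on slide 17):

```python
import numpy as np

def perceptron_train(X, y, eta=1.0, epochs=50):
    """Rosenblatt perceptron with the bias absorbed as w0 (constant input x0 = 1)."""
    X = np.hstack([np.ones((len(X), 1)), np.asarray(X, dtype=float)])   # prepend x0 = 1
    w = 0.01 * np.random.randn(X.shape[1])                              # random initialization
    for _ in range(epochs):
        for Xt, yt in zip(X, y):
            y_pred = 1 if np.dot(w, Xt) > 0 else -1
            if y_pred != yt:                  # wrong prediction: adjust the weights
                w = w + eta * yt * Xt
    return w

# Hypothetical linearly separable data: class +1 roughly where x1 + x2 is large
X = [[0, 0], [1, 0], [0, 1], [2, 2], [1.5, 1.0], [0.2, 0.3]]
y = [-1, -1, -1, +1, +1, -1]
w = perceptron_train(X, y)
preds = [1 if np.dot(w, np.hstack([[1.0], x])) > 0 else -1 for x in np.asarray(X, float)]
print(preds)   # matches y once the data have been separated (guaranteed for separable data)
```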


Support Vector Machines (SVM) 22

Perceptron, Rosenblatt - 1958


Support Vector Machines (SVM) 23

A Bit of Geometry

Projection of X onto w  =  ||X|| cos θ  =  w · X / ||w||

[Figure: vectors w and X with angle θ between them]


Support Vector Machines (SVM) 24

A Bit of Geometry: Parallel Hyperplanes

All “parallel hyperplanes” have:
- the same form of equation, w · X + b = 0;
- the same coefficients vector w;
- different origin translation values b.

Is the coefficients vector w perpendicular to H?
For any X1, X2 on H:  w · (X1X2) = w · (X2 − X1) = w · X2 − w · X1 = −b − (−b) = 0

[Figure: hyperplane H in the (x1, x2) plane, two points X1 and X2 on H, and the normal
vector w]


Support Vector Machines (SVM) 25

A Bit of Geometry: Distance between an instance X and H

D = dist(X, H) = ?

Take any point X0 on H. Then:
D = ||X0X|| · |cos θ|
  = |w · X0X| / ||w||
  = |w · X − w · X0| / ||w||
  = |w · X + b| / ||w||        (since w · X0 = −b)

[Figure: point X at distance D from the hyperplane H, with X0 on H and normal vector w]
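A minimal Python sketch of this distance (the hyperplane x1 + x2 − 1 = 0 used below is
just an example of mine):

```python
import numpy as np

def distance_to_hyperplane(w, b, X):
    """D = |w . X + b| / ||w||"""
    w = np.asarray(w, dtype=float)
    return abs(np.dot(w, X) + b) / np.linalg.norm(w)

print(distance_to_hyperplane([1.0, 1.0], -1.0, [2.0, 2.0]))   # |2 + 2 - 1| / sqrt(2) ≈ 2.121
```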

Support Vector Machines (SVM) 26

Hard (Linear) Support Vector Machines (H-SVM)


Which of the linear separators is optimal?


Support Vector Machines (SVM) 27

Hard (Linear) Support Vector Machines (H-SVM)

Largest margin concept:
The distance from the separating hyperplane H corresponds to the “confidence” of the
decision.

Example: we are more sure and confident about the class of A and B than about the
class of C.
[Figure: points A and B far from H, point C close to H]


Support Vector Machines (SVM) 28

Hard (Linear) Support Vector Machines (H-SVM)


Largest Margin Concept:

The margin of a linear classifier is the width by which the boundary could be
increased before hitting a data instance.


Support Vector Machines (SVM) 29

Hard (Linear) Support Vector Machines (H-SVM)

Largest margin concept (margin: D):
The decision boundary should be as far away from the data of both classes as possible.
“H-SVM”: the maximum margin linear classifier is the linear classifier with the
maximum margin.

Support vectors: the data points that the margin pushes up against.
Separating hyperplane:  w · X + b = 0
[Figure: classes C1 and C2 separated by H, with margin D and support vectors on the
margin boundaries]

Support Vector Machines (SVM) 30

Hard (Linear) Support Vector Machines (H-SVM)

Largest margin concept: margin width D = ?

The margin boundaries are the hyperplanes w · X + b = +1 (class C1 side) and
w · X + b = −1 (class C2 side). For a support vector X on either boundary,
|w · X + b| = 1, so:

D = 2 · |w · X + b| / ||w|| = 2 / ||w||

Support Vector Machines (SVM) 31

Hard (Linear) Support Vector Machines (H-SVM)

Vapnik, 1965; Vapnik, 1995:
Training set {(Xm, ym), m = 1, …, M},  Xm ∈ Rn,  ym ∈ {+1, −1}

H-SVM formulation: find w and b such that
Maximize D = 2 / ||w||,  i.e. minimize ||w||² / 2
Subject to the support vector (margin) constraints:
ym · (w · Xm + b) ≥ +1  for all m

This is a quadratic optimization problem, subject to linear constraints.
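A minimal sketch of this formulation using scikit-learn (assumed available). SVC solves
the soft-margin problem, so the hard margin is only approximated here by a very large
C; the toy points are my own:

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[1.0, 1.0], [2.0, 2.5], [0.0, 2.0],        # class +1 (hypothetical points)
              [-1.0, -1.0], [-2.0, -0.5], [0.0, -2.0]])  # class -1
y = np.array([+1, +1, +1, -1, -1, -1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)               # near-hard-margin linear SVM
w, b = clf.coef_[0], clf.intercept_[0]
print("w =", w, ", b =", b)
print("margin width D = 2 / ||w|| =", 2.0 / np.linalg.norm(w))
print("support vectors:\n", clf.support_vectors_)
```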


Support Vector Machines (SVM) 32

1D Example:  a·x + b = 0

Dataset: {(-3, -1), (-1, -1), (+2, +1)}
[Figure: points at x = -3 and x = -1 (class -1) and x = +2 (class +1) on the number
line]

H: all lines whose crossing point lies between -1 and +2

Margin constraints ym · (w · Xm + b) ≥ +1, m = 1, …, M:
a · (-3) + b ≤ -1  →  b ≤ 3a - 1
a · (-1) + b ≤ -1  →  b ≤ a - 1      (support vector)
a · (+2) + b ≥ +1  →  b ≥ -2a + 1    (support vector)

Minimize ||w||² = a²  →  a = 2/3 ≈ 0.66,  b = -1/3 ≈ -0.33
(the decision boundary a·x + b = 0 sits at x = 0.5, the midpoint between the support
vectors -1 and +2)
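The same approach checks the 1D example numerically (scikit-learn assumed, with a large
C approximating the hard margin):

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[-3.0], [-1.0], [+2.0]])
y = np.array([-1, -1, +1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)
print(clf.coef_[0, 0], clf.intercept_[0])   # expected a ≈ 0.66, b ≈ -0.33
print(clf.support_vectors_)                 # expected [[-1.], [ 2.]]
```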

Support Vector Machines (SVM) 33

Hard (Linear) Support Vector Machines (H-SVM)

Reminder: the quadratic programming problem

Quadratic objective function:
F(w, b) = b + Σ_i ai·wi + Σ_i Σ_j q_{i,j}·wi·wj,   with w = (w1, w2, …, wn) and b

M linear constraints:
Σ_i c_{ij}·wi ≤ dj,   j = 1, 2, …, M

Quadratic programming (numerical analysis) finds the optimal w = (w1, w2, …, wn) and b.


Support Vector Machines (SVM) 34

Hard (Linear) Support Vector Machines (H-SVM)

Optimization using Lagrange multipliers

Lagrangian:  L = ||w||² / 2 + Σ_{m=1..M} λm · { 1 − [ym · (w · Xm + b)] }

The constraints are added into the optimization by adding extra variables λm (called
Lagrange multipliers).

||w||² = (w1)² + (w2)² + … + (wn)²


Support Vector Machines (SVM) 35

Hard (Linear) Support Vector Machines (H-SVM)

Optimization using Lagrange multipliers

Objective function:  L = ||w||² / 2 + Σ_{m=1..M} λm · { 1 − [ym · (w · Xm + b)] }

Setting the gradient of L w.r.t. w and b to zero, we have:

∂L/∂w = w − Σ_{m=1..M} λm·ym·Xm = 0   →   w* = Σ_{m=1..M} λm·ym·Xm
                                          (a linear sum of the samples)
∂L/∂b = − Σ_{m=1..M} λm·ym = 0        →   Σ_{m=1..M} λm·ym = 0


Support Vector Machines (SVM) 36

Substituting w* = Σ_{m=1..M} λm·ym·Xm into
L = ||w||² / 2 + Σ_{m=1..M} λm · { 1 − [ym · (w · Xm + b)] } :

L = (1/2)·[Σ_m λm·ym·Xm]·[Σ_m λm·ym·Xm] − Σ_m λm·ym·Xm·{Σ_k λk·yk·Xk} − Σ_m λm·ym·b + Σ_m λm

The optimization problem becomes (the dual problem): find the λm such that

L = −(1/2)·Σ_{m=1..M} Σ_{k=1..M} λm·λk·ym·yk·[Xm · Xk] + Σ_{m=1..M} λm   is maximized,

subject to  Σ_{m=1..M} λm·ym = 0  and  λm ≥ 0,  m = 1, …, M

Support Vector Machines (SVM) 37

L = −(1/2)·Σ_{m=1..M} Σ_{k=1..M} λm·λk·ym·yk·[Xm · Xk] + Σ_{m=1..M} λm

- Quadratic optimization problems are well-known mathematical programming problems for
  which several algorithms exist;

- Once solved, most of the λm are 0; only a small number have λm > 0, and the
  corresponding Xm are the support vectors.


Support Vector Machines (SVM) 38

Given a solution λ1, …, λM to the dual problem, the solution to the primal problem
(i.e., the coefficients vector w* and the origin shift factor b*) is given by:

Learned weight (a linear combination of the support vectors):
w* = Σ_{m ∈ Support Vectors} λm·ym·Xm

b*(k) = yk − Σ_{m=1..M} λm·ym·(Xm · Xk),   for any k such that λk > 0

b* = (1/K) · Σ_k b*(k)    (K denotes the number of support vectors)

Support Vector Machines (SVM) 39

Geometrical Interpretation

[Figure: classes C1 and C2 separated by w · X + b = 0; most points have λ = 0
(λ2 = λ3 = λ4 = λ5 = λ7 = λ9 = λ10 = 0), while the support vectors have λ ≠ 0
(λ1 = 0.8, λ6 = 1.4, λ8 = 0.6)]

Support vectors: the instances with λ values different from zero (they hold up the
separating plane)!

Support Vector Machines (SVM) 40

Decision rule: given a new data instance X, the classifying function is
(note that we don’t need w* explicitly):

f(X) = Σ_{m ∈ Support Vectors} λm·ym·(Xm · X) + b*

If f(X) > 0, then decide C1; else decide C2.
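A minimal sketch of this decision rule on top of a fitted scikit-learn linear SVC
(refitted here on toy points of my own); in scikit-learn, dual_coef_ already stores the
products λm·ym for the support vectors:

```python
import numpy as np
from sklearn.svm import SVC

def decision_from_support_vectors(clf, X_new):
    """f(X) = sum over support vectors of (lambda_m * y_m) * (X_m . X) + b*"""
    return np.dot(clf.dual_coef_[0], clf.support_vectors_ @ np.asarray(X_new)) + clf.intercept_[0]

X = np.array([[1.0, 1.0], [2.0, 2.5], [-1.0, -1.0], [-2.0, -0.5]])
y = np.array([+1, +1, -1, -1])
clf = SVC(kernel="linear", C=1e6).fit(X, y)

x_new = np.array([0.5, 1.5])
print(decision_from_support_vectors(clf, x_new))
print(clf.decision_function([x_new])[0])     # same value: decide C1 if > 0, else C2
```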


Support Vector Machines (SVM) 41

Notice that f(X) relies on the dot product between the unknown instance X and the
support vectors Xm.

Also keep in mind that solving the optimization problem involved computing the dot
products Xm · Xk between all training data instances.


Support Vector Machines (SVM) 42

Soft Margin Support Vector Machines (SM-SVM)

Non-linear separability:
- Slightly non-separable data  →  Soft Margin Support Vector Machines
- Highly non-separable data    →  Kernel-based Support Vector Machines


Support Vector Machines (SVM) 43

Soft Margin Support Vector Machines (SM-SVM)

To allow errors in the data, we relax the margin constraints by introducing slack
variables ξm (≥ 0):
w · Xm + b ≥ +1 − ξm,  if ym = +1
w · Xm + b ≤ −1 + ξm,  if ym = −1

The new constraints:
ym · (w · Xm + b) ≥ 1 − ξm,   m = 1, …, M
ξm ≥ 0,   m = 1, …, M

ξm: measure of the margin violation by an erroneous instance Xm


Support Vector Machines (SVM) 44

Assigning an “extra cost” for errors in the objective function that we are going to
optimize:

Minimize:    L = ||w||² / 2 + C · Σ_{m=1..M} ξm

Subject to:  ym · (w · Xm + b) ≥ 1 − ξm,   m = 1, …, M
             ξm ≥ 0,   m = 1, …, M

Σ_{m=1..M} ξm : total margin violation
C: a large C means a higher penalty for errors
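A minimal sketch (scikit-learn assumed, synthetic overlapping data of my own) showing
the role of C: a small C tolerates margin violations and keeps a wide margin, a large C
penalizes them heavily:

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal([+1.5, +1.5], 1.0, size=(50, 2)),    # class +1
               rng.normal([-1.5, -1.5], 1.0, size=(50, 2))])   # class -1 (classes overlap)
y = np.array([+1] * 50 + [-1] * 50)

for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    print(f"C={C:<7} margin width={2.0 / np.linalg.norm(clf.coef_[0]):.3f}  "
          f"support vectors={len(clf.support_vectors_)}")
```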



Support Vector Machines (SVM) 45

Optimization using Lagrange multipliers

L = ||w||² / 2 + C·Σ_{m=1..M} ξm − Σ_{m=1..M} λm·{ym·(w · Xm + b) − 1 + ξm} − Σ_{m=1..M} μm·ξm

- ||w||² / 2: minimizing it ↔ maximizing the margin
- Σ ξm: total margin violation error
- C: tradeoff (i.e., relative importance) between error and margin
- λm, μm ≥ 0: Lagrange multipliers
- L: objective function, to be minimized w.r.t. w, b and ξm

Support Vector Machines (SVM) 46

L = ||w||² / 2 + C·Σ_{m=1..M} ξm − Σ_{m=1..M} λm·{ym·(w · Xm + b) − 1 + ξm} − Σ_{m=1..M} μm·ξm

Setting the gradient of L w.r.t. w, b and ξm to zero, we have:

∂L/∂w = w − Σ_{m=1..M} λm·ym·Xm = 0
∂L/∂b = − Σ_{m=1..M} λm·ym = 0            (same solution as for the H-SVM)

∂L/∂ξm = C − λm − μm = 0  →  with μm ≥ 0 and λm ≥ 0, this gives λm ≤ C
(otherwise, C − λm < 0 and no μm ≥ 0 would verify C − λm − μm = 0)

Support Vector Machines (SVM) 47

w* = Σ_{m=1..M} λm·ym·Xm
Σ_{m=1..M} λm·ym = 0
with the only difference w.r.t. the H-SVM:   0 ≤ λm ≤ C

Decision rule: given a new data instance X, the classifying function is

f(X) = Σ_{m ∈ Support Vectors} λm·ym·(Xm · X) + b*

If f(X) > 0, then decide C1; else decide C2.

Support Vector Machines (SVM) 48

Kernel-based Support Vector Machines

General idea: the original feature space is mapped to some higher-dimensional feature
space where the training set is linearly separable.

Φ: Rn → RN
X → Z = Φ(X)
X ∈ Rn,  Z ∈ RN,  N > n

Support Vector Machines (SVM) 49

Kernel-based Support Vector Machines

[Figure: a dataset in the (x1, x2) plane that is not linearly separable]


Support Vector Machines (SVM) 50

Kernel-based Support Vector Machines

Φ: Rn → RN
X → Z = Φ(X)

Nonlinearly separable dataset in Rn:  {(Xm, ym)},  Xm ∈ Rn,  ym ∈ {+1, −1}
Linearly separable dataset in RN:     {(Zm = Φ(Xm), ym)},  Zm ∈ RN,  ym ∈ {+1, −1}

Do we need to know the explicit expression of Z = Φ(X) to apply the SVM machinery?


Support Vector Machines (SVM) 51

The Kernel trick

Lagrangian to maximize:
L = −(1/2)·Σ_{m=1..M} Σ_{k=1..M} λm·λk·ym·yk·[Zm · Zk] + Σ_{m=1..M} λm

Constraints:   Σ_{m=1..M} λm·ym = 0,   λm ≥ 0,  m = 1, …, M

Decision rule:  f(X) = Σ_{m ∈ Support Vectors} λm·ym·(Zm · Z) + b*

Bias term:  b* = yk − Σ_{m=1..M} λm·ym·(Zm · Zk)

We only deal with Z as far as the Z-dot product is concerned.



Support Vector Machines (SVM) 52

The Kernel trick

The SVM machinery (quadratic programming, decision rule) only needs the Z-dot product,
not the Z transformation itself:

Xm → Zm = Φ(Xm),   Xm’ → Zm’ = Φ(Xm’)

K(Xm, Xm’) = Zm · Zm’   (the Z-dot product, obtained without the explicit computation
of Φ(Xm) and Φ(Xm’))

K(·,·) is called the transformation kernel.


Support Vector Machines (SVM) 53

Example 1

X = (x1, x2),  X ∈ R²

Φ(X) = (1, x1, x2, (x1)², (x2)², x1·x2):  full 2nd-order polynomial transformation

K(X, X’) = Z · Z’
         = 1 + x1·x’1 + x2·x’2 + (x1)²·(x’1)² + (x2)²·(x’2)² + x1·x’1·x2·x’2


Support Vector Machines (SVM) 54

Example 2

X = (x1, x2),  X ∈ R²

Computing the kernel without transforming X and X’:

K(X, X’) = (1 + X · X’)²
         = (1 + x1·x’1 + x2·x’2)²
         = 1 + (x1)²·(x’1)² + (x2)²·(x’2)² + 2·x1·x’1 + 2·x2·x’2 + 2·x1·x’1·x2·x’2

This is the inner product of Z = Φ(X) and Z’ = Φ(X’) where
Φ(X)  = [1, (x1)², (x2)², √2·x1, √2·x2, √2·x1·x2]
Φ(X’) = [1, (x’1)², (x’2)², √2·x’1, √2·x’2, √2·x’1·x’2]
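A minimal numerical check of this identity (the function names and test points below
are my own):

```python
import numpy as np

def poly_kernel(x, xp):
    return (1.0 + np.dot(x, xp)) ** 2

def phi(x):
    x1, x2 = x
    return np.array([1.0, x1**2, x2**2,
                     np.sqrt(2) * x1, np.sqrt(2) * x2, np.sqrt(2) * x1 * x2])

x, xp = np.array([1.0, 2.0]), np.array([3.0, -1.0])
print(poly_kernel(x, xp))            # (1 + 3 - 2)^2 = 4.0
print(np.dot(phi(x), phi(xp)))       # same value, computed in the 6-dimensional Z-space
```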

Support Vector Machines (SVM) 55

Example 2

Think about the computational complexity if:
- X = (x1, x2, …, xd),  X ∈ Rd,  d >> 2, and
- K(X, X’) = (1 + X · X’)^Q,  Q >> 2


Support Vector Machines (SVM) 56

Example 3 (Radial Basis Function Kernel)

Show that K(X, X’) = e^(−a·||X − X’||²) corresponds to an “inner product” in an
infinite-dimensional Z-space (i.e., to a transformation of the X instances).

Hint, for d = 1 (X = x, a scalar) and a = 1:
- Expand e^(−||x − x’||²) = e^(−x²) · e^(−x’²) · e^(2·x·x’)
- Expand the cross exponential term e^(2·x·x’) using its Taylor series
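A minimal numerical check of the hint for d = 1 and a = 1 (the truncation level K and
the helper names are my own): truncating the Taylor series of the cross exponential
term yields a finite-dimensional approximation of the infinite-dimensional feature map.

```python
import numpy as np
from math import factorial

def rbf_kernel_1d(x, xp):
    return np.exp(-(x - xp) ** 2)

def phi_truncated(x, K=20):
    # k-th feature: exp(-x^2) * sqrt(2^k / k!) * x^k, for k = 0 .. K-1
    return np.array([np.exp(-x ** 2) * np.sqrt(2.0 ** k / factorial(k)) * x ** k
                     for k in range(K)])

x, xp = 0.7, -0.3
print(rbf_kernel_1d(x, xp))                          # exact kernel value
print(np.dot(phi_truncated(x), phi_truncated(xp)))   # converges to it as K grows
```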


Support Vector Machines (SVM) 57

Example 4: XOR

Index m    Xm         ym
1          (1, 1)     +1
2          (1, -1)    -1
3          (-1, -1)   +1
4          (-1, 1)    -1

K(X, X’) = (1 + X · X’)²

Support Vector Machines (SVM) 58

Example 4 XOR
Optimal Lagrange multipliers: λ1 = λ2 = λ3 = λ4 = 1/8
All 4 data instances are support vectors.
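A minimal check of this result with scikit-learn (assumed available): its polynomial
kernel with degree 2, gamma = 1 and coef0 = 1 matches K(X, X’) = (1 + X·X’)², and a very
large C approximates the hard margin.

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 1], [1, -1], [-1, -1], [-1, 1]], dtype=float)
y = np.array([+1, -1, +1, -1])

clf = SVC(kernel="poly", degree=2, gamma=1.0, coef0=1.0, C=1e6).fit(X, y)
print(clf.n_support_)    # expected [2 2]: all 4 instances are support vectors
print(clf.dual_coef_)    # expected lambda_m * y_m = +/- 1/8 = +/- 0.125
print(clf.predict(X))    # expected [ 1 -1  1 -1]
```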


Support Vector Machines (SVM) 59

Example 5 Face Detection application


Training images database:
266 pictures: 150 faces + 116 non-faces

...
Preprocessing
- Gray scale transformation
- Histogram equalization
- Adjust resolution to 30×40 pixels
Training the SVM:
based on the 266 training instances, with a polynomial kernel


Support Vector Machines (SVM) 60

Example 5 Face Detection application


Test phase, given an input image:
- Move a 30×40 pixel sub-window over the input image
- Histogram equalization of each sub-window
- Classification by the SVM
- Removal of intersections (overlapping detections)


Support Vector Machines (SVM) 61

How many faces?

[Figure: test image with detected face windows numbered 1 to 11]


Support Vector Machines (SVM) 62

Example 6 Learning gender with SVMs

Processed faces


Support Vector Machines (SVM) 63

Example 6 Learning gender with SVMs


Training dataset: 1044 males, 713 females
Experiment with various kernels, select Gaussian RBF


Support Vector Machines (SVM) 64

Example 6 Learning gender with SVMs


Support Vector Machines (SVM) 65

Example 7: Two spirals benchmark problem

Carnegie Mellon AI Repository

Data generation, for i = 1, 2, …, n (n: number of patterns, ρ: density, R: radius):
x₊ = R·cos(αᵢ),   y₊ = R·sin(αᵢ)
x₋ = −x₊,         y₋ = −y₊      (the second spiral is the point reflection of the first)

(x, y) generated with R = 3.5,  ρ = 1.0,  n₊ = 100,  n₋ = 100
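A minimal sketch (scikit-learn assumed) that generates a two-spirals dataset and
separates it with an RBF-kernel SVM. The slide's exact angle parametrization is not
reproduced here: the spiral below (three turns, radius growing to R = 3.5, 100 points
per class, second spiral by point reflection) is my own choice in the same spirit, and
gamma and C are illustrative values.

```python
import numpy as np
from sklearn.svm import SVC

def two_spirals(n=100, R=3.5, turns=3.0):
    t = np.linspace(0.05, 1.0, n)
    alpha = turns * 2.0 * np.pi * t                # angle grows along the spiral
    r = R * t                                      # radius grows up to R
    pos = np.column_stack([r * np.cos(alpha), r * np.sin(alpha)])   # class +1
    neg = -pos                                     # class -1: point reflection of class +1
    return np.vstack([pos, neg]), np.array([+1] * n + [-1] * n)

X, y = two_spirals()
clf = SVC(kernel="rbf", gamma=10.0, C=10.0).fit(X, y)
print("training accuracy:", clf.score(X, y))       # close to 1.0 with a suitable gamma
```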


Support Vector Machines (SVM) 66

Example 7 Two spirals benchmark problem

[Figure: the two spirals dataset: class –1 and class +1]


Support Vector Machines (SVM) 67

Example 7 Two spirals benchmark problem

[Figure: the two spirals: class +1 and class –1]
