Machine Learning Approaches - SVM
B. Solaiman
Basel.solaiman@imt-atlantique.fr
Départ. Image & Traitement de l’Information
Brest, France
Introduction
10/02/2020
No target variable y: unlabeled training data.
Regression problem: instead of predicting the class of an input pattern, we want to predict continuous values.
Natures of a feature
- Numerical, or quantitative, feature
- Ordinal feature: feature values are “ordered” symbols (=, ≠, <, >, ≤ and ≥ are the only operations we can conduct)
Euclidean distance: d2(X, Y) = sqrt( Σ i=1..n (xi − yi)² )   (P = 2)
Max distance: d∞(X, Y) = max i=1..n | xi − yi |   (P = ∞)
dist(X, Y) = (n − Q) / n
Q: number of features having the same modality in X and Y
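These distances can be sketched in Python (function names are illustrative):

```python
import math

def euclidean_distance(x, y):
    # d2(X, Y) = sqrt(sum_i (xi - yi)^2), the Minkowski distance with P = 2
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def max_distance(x, y):
    # d(X, Y) = max_i |xi - yi|, the Minkowski distance with P = infinity
    return max(abs(xi - yi) for xi, yi in zip(x, y))

def matching_distance(x, y):
    # dist(X, Y) = (n - Q) / n, where Q counts features with the same modality
    n = len(x)
    q = sum(1 for xi, yi in zip(x, y) if xi == yi)
    return (n - q) / n
```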
Training set (binary classification, two classes).
Linear model: the separation surface H is a hyperplane.
How does it work?
Hyperplane equation: w . X + b = 0
with X = (x1, x2, …, xn), w = (w1, w2, …, wn) and
w . X = x1.w1 + x2.w2 + … + xn.wn
IF X “on” H: w . X + b = 0
IF X ∈ H+: w . X + b > 0
IF X ∈ H-: w . X + b < 0
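A minimal sketch of this test, assuming X and w are plain tuples (the function name is illustrative):

```python
def hyperplane_side(w, x, b):
    # Evaluate w . X + b and report the side of the hyperplane H:
    # +1 for the half-space H+, -1 for H-, 0 when X lies on H.
    value = sum(wi * xi for wi, xi in zip(w, x)) + b
    if value > 0:
        return +1
    if value < 0:
        return -1
    return 0
```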
Linear model classification decision rule:
if w . X + b > 0 decide C1, else decide C2.
f(X) = w . X + b = x1.w1 + x2.w2 + … + xn.wn + b, compared with 0.
A Bit of Geometry
Dot product: < w, X > = w . X = ||w|| ||X|| cos θ = x1.w1 + x2.w2 + … + xn.wn
θ < 90°: w . X > 0;  θ > 90°: w . X < 0
Weight update: Δw ~ y . X
A Bit of Geometry
Projection of X onto w = ||X|| cos θ = w . X / ||w||
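A sketch of the projection formula in Python (the function name is illustrative):

```python
import math

def projection_onto_w(w, x):
    # ||X|| cos(theta) = (w . X) / ||w||, the signed length of X projected onto w
    dot = sum(wi * xi for wi, xi in zip(w, x))
    return dot / math.sqrt(sum(wi * wi for wi in w))
```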
Distance from a point X to the hyperplane H (X0 is any point on H):
D = dist(X, H) = || X0X || cos θ
= | w . X0X | / || w ||
= | w . X − w . X0 | / || w ||
= | w . X + b | / || w ||   (since X0 ∈ H, w . X0 = −b)
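The point-to-hyperplane distance D = |w . X + b| / ||w|| as a Python sketch:

```python
import math

def distance_to_hyperplane(w, b, x):
    # D = |w . X + b| / ||w||
    dot = sum(wi * xi for wi, xi in zip(w, x))
    return abs(dot + b) / math.sqrt(sum(wi * wi for wi in w))
```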
Example:
The margin of a linear classifier is the width by which the boundary could be increased before hitting a data point instance.
10/02/2020
C2 Support vectors
(datapoints that the margin pushes up against)
Separating hyperplane : w . X + b = 0
Machine Learning Approaches ----------- B. Solaiman
w.X+b=+1 D 2 / || w ||
w.X+b=-1
Machine Learning Approaches ----------- B. Solaiman
15
10/02/2020
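The margin width 2 / ||w|| in code form (a trivial sketch):

```python
import math

def margin_width(w):
    # D = 2 / ||w||: the distance between the hyperplanes
    # w.X + b = +1 and w.X + b = -1
    return 2 / math.sqrt(sum(wi * wi for wi in w))
```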
Maximize the margin 2 / ||w|| (equivalently, minimize ||w||² / 2) subject to:
ym . (w . Xm + b) ≥ +1 for all m
This is a quadratic optimization problem, subject to linear constraints.
1D Example: a.x + b = 0
Linear constraints: Σ i cij wi ≤ dj, j = 1, 2, …, M
Lagrangian: L = || w ||² / 2 + Σ m=1..M λm { 1 − [ym . (w . Xm + b)] }
Lagrange multipliers: the constraints are added into the optimization by adding extra variables (called Lagrange multipliers).
∂L/∂b = − Σ m=1..M λm ym = 0   ⟹   Σ m=1..M λm ym = 0
∂L/∂w = 0 for L = || w ||² / 2 + Σ m λm { 1 − [ym . (w . Xm + b)] }   ⟹   w* = Σ m=1..M λm ym . Xm
Substituting w* back into L:
L = (1/2) [Σ m λm ym . Xm] . [Σ m λm ym . Xm] − [Σ m λm ym . Xm] . [Σ k λk yk . Xk] − b Σ m λm ym + Σ m λm
The optimization problem becomes (called: Dual Problem):
Find λm such that
L = − (1/2) Σ m Σ k λm λk ym yk [Xm . Xk] + Σ m λm   is maximized,
where Σ m λm ym = 0 and λm ≥ 0, m = 1, …, M
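A toy check of the dual on two points X1 = (1, 0), y1 = +1 and X2 = (−1, 0), y2 = −1 (an illustrative sketch: the constraint Σ λm ym = 0 forces λ1 = λ2, and a coarse grid search stands in for a real quadratic-programming solver):

```python
X = [(1.0, 0.0), (-1.0, 0.0)]
y = [1.0, -1.0]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def dual_objective(lam):
    # L = -(1/2) sum_m sum_k lam_m lam_k y_m y_k (Xm . Xk) + sum_m lam_m,
    # with lam_1 = lam_2 = lam imposed by sum_m lam_m y_m = 0
    lams = [lam, lam]
    quad = sum(lams[m] * lams[k] * y[m] * y[k] * dot(X[m], X[k])
               for m in range(2) for k in range(2))
    return -0.5 * quad + sum(lams)

# Maximize over a grid of candidate lam >= 0
best_lam = max((i / 1000 for i in range(2001)), key=dual_objective)

# Recover w* = sum_m lam_m y_m Xm
w_star = [sum(best_lam * y[m] * X[m][i] for m in range(2)) for i in range(2)]
```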
w* = Σ m ∈ Support Vectors λm ym . Xm
(the support vectors are the Xm with λm > 0)
b*(k) = yk − Σ m=1..M λm ym . Xm . Xk, for any k such that λk > 0
b* = (1/K) Σ k b*(k)   (K denotes the number of SVs)
Geometrical Interpretation
The support vectors are the points whose λ values are different from zero (they hold up the separating plane): here λ1 = 0.8, λ6 = 1.4, λ8 = 0.6, and all other λm = 0.
Separating hyperplane: w . X + b = 0
f(X) = Σ m ∈ Support Vectors λm ym . Xm . X + b*
Soft-margin SVM (with slack variables ξm):
Minimize: L = || w ||² / 2 + C Σ m=1..M ξm
∂L/∂w = w − Σ m=1..M λm ym . Xm = 0   (same solution as for the H-SVM)
∂L/∂b = − Σ m=1..M λm ym = 0
∂L/∂ξm = C − λm − μm = 0, with μm ≥ 0 and λm ≥ 0   ⟹   λm ≤ C
(otherwise C − λm < 0, and no μm ≥ 0 would verify C − λm − μm = 0)
w* = Σ m=1..M λm ym . Xm
with the only difference w.r.t. the H-SVM:
Σ m=1..M λm ym = 0 and 0 ≤ λm ≤ C
f(X) = Σ m ∈ Support Vectors λm ym . Xm . X + b*
If f(X) > 0 then decide C1; else decide C2.
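The decision function can be sketched directly from the support-vector expansion; the (λm, ym, Xm) triple layout below is an illustrative assumption:

```python
def svm_decide(support_vectors, b_star, x):
    # f(X) = sum_{m in SVs} lam_m y_m (Xm . X) + b*; decide C1 if f(X) > 0
    f = sum(lam * ym * sum(a * c for a, c in zip(xm, x))
            for lam, ym, xm in support_vectors) + b_star
    return "C1" if f > 0 else "C2"
```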
Nonlinear mapping into a higher-dimensional space:
Φ: Rn → RN
X ↦ Z = Φ(X), with X ∈ Rn, Z ∈ RN, N > n
Constraints: Σ m=1..M λm ym = 0, λm ≥ 0, m = 1, …, M
Z-transformation: Xm ↦ Zm = Φ(Xm), Xm’ ↦ Zm’ = Φ(Xm’)
Kernel (Z-dot product): K(Xm, Xm’) = Zm . Zm’, computed without explicit computation of Φ(Xm) and Φ(Xm’)
Example 1
K(X, X’) = Z . Z’ = 1 + x1.x’1 + x2.x’2 + (x1)²(x’1)² + (x2)²(x’2)² + x1.x’1.x2.x’2
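This kernel can be checked against one explicit mapping Φ(X) = (1, x1, x2, x1², x2², x1.x2), which is consistent with the expansion above (the mapping is an assumption for illustration):

```python
def phi(x):
    # Explicit feature map: Phi(X) = (1, x1, x2, x1^2, x2^2, x1*x2)
    x1, x2 = x
    return (1.0, x1, x2, x1 ** 2, x2 ** 2, x1 * x2)

def kernel(x, xp):
    # K(X, X') from the expanded form, with no explicit Phi computation
    x1, x2 = x
    y1, y2 = xp
    return (1 + x1 * y1 + x2 * y2
            + (x1 * y1) ** 2 + (x2 * y2) ** 2
            + x1 * y1 * x2 * y2)

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))
```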
Example 2: Gaussian kernel K(X, X’) = e^(−a ||X − X’||²)
- Expand e^(−a ||X − X’||²)
- Expand the cross exponential term using a Taylor series
Example 4: XOR

Index m | Xm | ym
1 | (1, 1) | +1
2 | (1, -1) | -1
3 | (-1, -1) | +1
4 | (-1, 1) | -1

K(X, X’) = (1 + X . X’)²
Optimal Lagrange multipliers: λ1 = λ2 = λ3 = λ4 = 1/8
The 4 instance data are SVs.
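The XOR solution can be verified numerically; b* is recovered from the first support vector, and the resulting f reproduces the XOR labels (a sketch):

```python
def kernel(x, xp):
    # K(X, X') = (1 + X . X')^2
    return (1 + sum(a * b for a, b in zip(x, xp))) ** 2

# The four XOR training points (Xm, ym)
points = [((1, 1), 1), ((1, -1), -1), ((-1, -1), 1), ((-1, 1), -1)]
lambdas = [1 / 8] * 4  # the optimal multipliers from the slide

def b_star():
    # b*(k) = yk - sum_m lambda_m y_m K(Xm, Xk), here with k = 1
    xk, yk = points[0]
    return yk - sum(lam * ym * kernel(xm, xk)
                    for lam, (xm, ym) in zip(lambdas, points))

def f(x):
    # f(X) = sum_m lambda_m y_m K(Xm, X) + b*
    return sum(lam * ym * kernel(xm, x)
               for lam, (xm, ym) in zip(lambdas, points)) + b_star()
```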
...
Preprocessing:
- Gray-scale transformation
- Histogram equalization
- Adjust resolution to 30x40 pixels
Training the SVM: based on the 266 training instances, with a polynomial kernel.
Processed faces (figure)
Classification results: examples assigned to Class +1 and Class –1 (figures)