Support Vector Machines
Utkarsh Kulshrestha
Artificial Intelligence Engineer
kuls.utkarsh1205@gmail.com
History of SVM

SVM became popular because of its success in handwritten digit recognition, where it achieved a 1.1% test error rate. SVM is now regarded as an important example of “kernel methods”, one of the key areas in machine learning.
Linear Classifiers

A linear classifier maps an input x through f to an estimated label y_est:

f(x, w, b) = sign(w · x − b)

where w is the weight vector, x is the data vector, and b is the bias.

[Figure: a scatter of datapoints labelled +1 and −1. How would you classify this data?]
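A minimal sketch of the decision rule above in Python (the slides give no code; w, b, and the test points here are made-up illustrative values):

import numpy as np

def linear_classify(x, w, b):
    # f(x, w, b) = sign(w . x - b): returns +1 or -1
    return 1 if np.dot(w, x) - b >= 0 else -1

# Hypothetical weight vector and bias, purely for illustration.
w = np.array([2.0, -1.0])
b = 0.5
print(linear_classify(np.array([1.0, 0.0]), w, b))   # w.x - b = 1.5  -> +1
print(linear_classify(np.array([-1.0, 1.0]), w, b))  # w.x - b = -3.5 -> -1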
[Figures: the same scatter shown three more times, each with a different candidate separating line for f(x, w, b) = sign(w · x − b).]
Linear Classifiers

f(x, w, b) = sign(w · x − b)

Any of these separating lines would be fine... but which is best?
Classifier Margin

f(x, w, b) = sign(w · x − b)

Define the margin of a linear classifier as the width that the boundary could be increased by before hitting a datapoint.
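One step the slides leave implicit, added here as the standard derivation: scale w and b so that the datapoints closest to the boundary satisfy w · x − b = ±1; then, with x₊ and x₋ denoting closest points on the +1 and −1 sides,

\[
\text{margin}
  = \frac{(w \cdot x_{+} - b) - (w \cdot x_{-} - b)}{\lVert w \rVert}
  = \frac{2}{\lVert w \rVert},
\]

so maximizing the margin is equivalent to minimizing ‖w‖.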
Maximum Margin

f(x, w, b) = sign(w · x − b)

The maximum margin linear classifier is the linear classifier with the maximum margin. This is the simplest kind of SVM, called a Linear SVM (LSVM).
Maximum Margin

f(x, w, b) = sign(w · x − b)

Support vectors are the datapoints that the margin pushes up against.
Why Maximum Margin?

Intuitively, the maximum margin boundary is the safest: if the location of the boundary has been estimated with a small error, the widest margin gives the least chance that a datapoint falls on the wrong side.
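A minimal sketch of fitting a maximum margin linear classifier, assuming scikit-learn is available (the slides name no library); the six datapoints are made up for illustration:

import numpy as np
from sklearn.svm import SVC

# Tiny made-up 2-D dataset: two linearly separable clusters.
X = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 2.0],
              [4.0, 4.0], [5.0, 4.5], [4.5, 5.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

# A very large C approximates the hard-margin LSVM from the slides.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

# Note: scikit-learn's decision function is w . x + b, not w . x - b.
w = clf.coef_[0]
print("support vectors:\n", clf.support_vectors_)
print("margin width = 2/||w|| =", 2 / np.linalg.norm(w))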
A Geometrical Interpretation

[Figure: Class 1 and Class 2 separated by the maximum margin boundary. Each datapoint i carries a multiplier αi; only the support vectors on the margin have αi > 0 (here α1 = 0.8, α6 = 1.4, α8 = 0.6), while all other points have αi = 0.]
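The αi in the figure can be read off a fitted model; in scikit-learn (the same assumed library as above), dual_coef_ stores yi · αi for the support vectors only:

import numpy as np
from sklearn.svm import SVC

X = np.array([[1.0, 1.0], [2.0, 1.5], [1.5, 2.0],
              [4.0, 4.0], [5.0, 4.5], [4.5, 5.0]])
y = np.array([-1, -1, -1, 1, 1, 1])
clf = SVC(kernel="linear", C=1e6).fit(X, y)

# clf.support_ lists the indices of the support vectors; every
# datapoint not listed there has alpha_i = 0, as in the figure.
for idx, y_alpha in zip(clf.support_, clf.dual_coef_[0]):
    print(f"datapoint {idx}: y_i * alpha_i = {y_alpha:.3f}")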
Non-linear SVMs: Feature spaces

General idea: the original input space can always be mapped to some higher-dimensional feature space where the training set is separable:

Φ: x → φ(x)
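A minimal sketch of the idea with a hypothetical quadratic feature map: the 1-D inputs below are not linearly separable, but their images under φ(x) = (x, x²) are:

import numpy as np

# 1-D inputs: the -1 class sits between the two halves of the +1 class,
# so no single threshold on x separates them.
x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y = np.array([1, -1, -1, -1, 1])

# Hypothetical feature map phi(x) = (x, x^2). In the (x, x^2) plane
# the horizontal line x^2 = 2.5 separates the two classes.
phi = np.column_stack([x, x ** 2])
for xi, fi, yi in zip(x, phi, y):
    print(f"x = {xi:+.1f} -> phi(x) = ({fi[0]:+.1f}, {fi[1]:.1f}), label {yi:+d}")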
Choosing the Kernel Function

Probably the trickiest part of using an SVM.
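One hedged way to approach the choice (not prescribed by the slides) is to compare the standard built-in kernels by cross-validation; the bundled iris dataset is used here purely for illustration:

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Score each standard kernel with 5-fold cross-validation.
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    scores = cross_val_score(SVC(kernel=kernel), X, y, cv=5)
    print(f"{kernel:>8}: mean accuracy = {scores.mean():.3f}")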
Choosing the C Parameter Value

C controls the trade-off between a wide margin and classifying every training point correctly: a small C tolerates some margin violations in exchange for a wider margin, while a large C approaches the hard-margin classifier.
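A small sketch of that trade-off, again assuming scikit-learn; the overlapping blobs are generated data, not from the slides:

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two slightly overlapping clusters (fixed seed for repeatability).
X, y = make_blobs(n_samples=100, centers=2, cluster_std=2.0, random_state=0)

# Smaller C -> wider margin -> more datapoints become support vectors.
for C in (0.01, 1.0, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    width = 2 / np.linalg.norm(clf.coef_[0])
    print(f"C = {C:>6}: {clf.n_support_.sum()} support vectors, "
          f"margin width = {width:.3f}")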
Effect of C & Gamma

For the RBF kernel, gamma controls how far the influence of a single training example reaches: a large gamma gives a very local, wiggly decision boundary (prone to overfitting), while a small gamma gives a smoother one (prone to underfitting). C, as above, trades margin width against training errors.
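A sketch of that overfitting/underfitting effect, using generated moon-shaped data and a train/test split:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=300, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# As gamma grows, training accuracy keeps rising while test accuracy
# eventually falls: the boundary becomes too local and overfits.
for gamma in (0.01, 1.0, 100.0):
    clf = SVC(kernel="rbf", C=1.0, gamma=gamma).fit(X_tr, y_tr)
    print(f"gamma = {gamma:>6}: train = {clf.score(X_tr, y_tr):.3f}, "
          f"test = {clf.score(X_te, y_te):.3f}")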
Summary: Steps for Classification

Select the kernel function and the value of C. You can use the values suggested by the SVM software, or you can set apart a validation set to determine the values of the parameters.
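One way to implement the validation advice above, sketched with scikit-learn's GridSearchCV (an assumed tool; any held-out search would do):

from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Cross-validated grid search over C and gamma for the RBF kernel.
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1, 1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best CV accuracy:", round(search.best_score_, 3))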
Thank You !!!