Support Vector Machine

Support Vector Machine
Sari, Mary Joy C.
Pateno, Paloma
Trugeder, Sanchez
Ugdiman, Mishelle
Agenda

• What is Machine Learning?
• What is Supervised Learning?
• So what’s an Algorithm?
• What is a Support Vector Machine?
• How does a Support Vector Machine work?
• Types of Support Vector Machine
• Pros and Cons of Support Vector Machine
• Executable Sample of Support Vector Machine
What is Machine Learning?

• Machine Learning is broadly divided into Supervised Learning, Unsupervised Learning, and Reinforcement Learning.
• Supervised Learning further splits into Classification (which is where the Support Vector Machine belongs) and Regression.

Supervised Learning
A machine learning model learns from past input data and makes predictions on new data as its output.

[Diagram: Teach the Model → Model is trained! → Got it!]
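A minimal sketch of this fit-then-predict loop using scikit-learn; the tiny dataset and labels below are made-up illustrations, not taken from the slides:

from sklearn.svm import SVC

# "Teach the model": past input data with known labels (toy example).
X_train = [[1, 2], [2, 3], [3, 1], [6, 5], [7, 7], [8, 6]]
y_train = ["cat", "cat", "cat", "dog", "dog", "dog"]

model = SVC(kernel="linear")   # any supervised classifier would do here
model.fit(X_train, y_train)    # "Model is trained!"

# Make a prediction on new, unseen input.
print(model.predict([[7, 6]]))  # expected: ['dog']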
What is a Support Vector Machine?
• The Support Vector Machine is one of the most popular Supervised Learning algorithms. It can be used for both Classification and Regression problems, but it is primarily used for classification in Machine Learning.
• The goal of the SVM algorithm is to create the best line or decision boundary that can segregate n-dimensional space into classes, so that we can easily place a new data point in the correct category in the future.
• SVM differs from other classification algorithms in how it chooses the boundary: it picks the one that maximizes the distance to the nearest data points of every class.
• The best decision boundary created by SVM is called a hyperplane.
Important Terms

• Support Vectors: the points that are closest to the hyperplane. The separating line is defined with the help of these data points.
• Margin: the distance between the hyperplane and the nearest support vectors. In SVM a large margin is considered a good margin. There are two types of margin: hard margin and soft margin.

• SVM chooses the extreme points/vectors that help in creating the hyperplane. These extreme cases are called support vectors. Consider the diagram in which two different categories are classified using a decision boundary, or hyperplane.
Example diagram

• Suppose we see a strange cat that also has some features of a dog. If we want a model that can accurately identify whether it is a cat or a dog, such a model can be created using the SVM algorithm.
• We will first train our model with lots of images of cats and dogs so that it can learn their different features, and then we test it with this strange creature.
Types of Support Vector Machine?

1. Linear SVM
• Is used for linearly separable data, which means that if a dataset can be classified into two classes using a single straight line, then such data is termed linearly separable data, and the classifier used is called a Linear SVM classifier.

2. Non-Linear SVM
• Is used for non-linearly separable data, which means that if a dataset cannot be classified using a straight line, then such data is termed non-linear data, and the classifier used is called a Non-linear SVM classifier.
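A minimal sketch (an assumed scikit-learn setup, not from the slides) showing that the two types differ only in the kernel choice:

from sklearn.datasets import make_blobs, make_circles
from sklearn.svm import SVC

# Linearly separable data: a single straight line can split the classes.
X_lin, y_lin = make_blobs(n_samples=100, centers=2, random_state=0)
linear_clf = SVC(kernel="linear").fit(X_lin, y_lin)

# Data that a straight line cannot split: use a non-linear (RBF) kernel.
X_non, y_non = make_circles(n_samples=100, factor=0.3, noise=0.05, random_state=0)
nonlinear_clf = SVC(kernel="rbf").fit(X_non, y_non)

print(linear_clf.score(X_lin, y_lin), nonlinear_clf.score(X_non, y_non))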
Linear Support Vector Machine
The SVM classifier is defined in terms of the support vectors only, so we do not have to worry about the other observations: the margin is built using only the points closest to the hyperplane, whereas in logistic regression the classifier is defined over all the points.

Let’s understand the working of SVM using an example. Suppose we have a dataset with two classes (green and blue), and we want to classify a new data point as either blue or green. To classify these points we could draw many decision boundaries, but which one is the best, and how do we find it?

Linear SVM
Linear Support Vector Machine
NOTE: Since we are plotting the data points in a 2-dimensional graph, we call this decision boundary a straight line, but if we have more dimensions we call it a hyperplane.

• The best hyperplane is the plane that has the maximum distance from both classes, and finding it is the main aim of SVM.
• This is done by examining different hyperplanes that classify the labels correctly and choosing the one that is farthest from the data points, i.e. the one with the maximum margin.
Linear Support Vector Machine

• The SVM algorithm helps find the best line or decision boundary; this best boundary or region is called a hyperplane. The algorithm finds the closest points of the lines from both classes. These points are called support vectors.
• The distance between these vectors and the hyperplane is called the margin, and the goal of SVM is to maximize it. The hyperplane with the maximum margin is called the optimal hyperplane.
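A small sketch of these quantities on generated toy data (an assumed scikit-learn setup, not the slides' dataset); for a linear SVM the margin width is 2 / ||w||:

import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Two well-separated clusters of points.
X, y = make_blobs(n_samples=60, centers=2, cluster_std=0.8, random_state=6)

clf = SVC(kernel="linear", C=1000)   # a large C approximates a hard margin
clf.fit(X, y)

# The closest points that define the separating hyperplane.
print("support vectors:\n", clf.support_vectors_)

# Margin width for a linear SVM: 2 / ||w||.
w = clf.coef_[0]
print("margin width:", 2 / np.linalg.norm(w))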
Non-Linear Support Vector Machine
If data is linearly arranged, then we can separate it by using a straight line, but for non-linear data we cannot draw a single straight line. Consider the below image:

• To separate these data points, we need to add one more dimension. For linear data we have used two dimensions, x and y, so for non-linear data we will add a third dimension, z. It can be calculated as:

z = x² + y²
Non-Linear Support Vector Machine
By adding the third dimension, the sample space becomes as shown in the image below:
Non-Linear Support Vector Machine
So now SVM will divide the dataset into classes in the following way. Consider the image below:
Non-Linear Support Vector Machine
Since we are in 3-D space, the decision boundary looks like a plane parallel to the x-axis. If we convert it back to 2-D space with z = 1, it becomes:

Hence we get a circle of radius 1 in the case of non-linear data.
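A hedged sketch of this trick on generated circular data (an assumed setup, not from the slides): adding the extra feature z = x² + y² lets a plain linear SVM separate the classes with a plane, which corresponds to a circle back in 2-D:

import numpy as np
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Two concentric rings of points: not separable by a straight line in 2-D.
X, y = make_circles(n_samples=200, factor=0.4, noise=0.05, random_state=0)

# Explicit feature map: (x, y) -> (x, y, x^2 + y^2).
z = (X ** 2).sum(axis=1)
X3 = np.column_stack([X, z])

# A linear SVM in 3-D now finds a separating plane (roughly z = constant).
clf = SVC(kernel="linear").fit(X3, y)
print("training accuracy:", clf.score(X3, y))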


Python Implementation of Support Vector
Machine
• Now we will implement the SVM algorithm using Python. Here we will use the same user_data dataset that we used for Logistic Regression and KNN classification.

Data pre-processing step:
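A sketch of this step, assuming user_data is a user_data.csv file whose Age and EstimatedSalary columns are the features and Purchased is the label (the slide does not show the actual code or columns):

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

dataset = pd.read_csv("user_data.csv")
X = dataset.iloc[:, [2, 3]].values   # assumed feature columns: Age, EstimatedSalary
y = dataset.iloc[:, 4].values        # assumed label column: Purchased

# Split into training and test sets.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

# Feature scaling so both features contribute comparably to the margin.
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)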


Python Implementation of Support Vector
Machine
After executing the code above, the data will be pre-processed. The code then displays the dataset, followed by the scaled output for the test set.
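A hedged continuation sketch of training and evaluating the classifier on the scaled data (a linear kernel is assumed; the training code itself is not shown above):

from sklearn.svm import SVC
from sklearn.metrics import confusion_matrix, accuracy_score

# Fit the classifier to the scaled training data from the previous step.
classifier = SVC(kernel="linear", random_state=0)
classifier.fit(X_train, y_train)

# Predict the test set and summarise the results.
y_pred = classifier.predict(X_test)
print(confusion_matrix(y_test, y_pred))
print("accuracy:", accuracy_score(y_test, y_pred))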
Advantages of Support Vector Machine

• SVM works relatively well when there is a clear margin of separation between
classes.
• SVM is more effective in high dimensional spaces.
• SVM is effective in cases where the number of dimensions is greater than the
number of samples.
• SVM is relatively memory efficient.
Disadvantages of Support Vector Machine

• SVM algorithm is not suitable for large data sets.


• SVM does not perform very well when the data set has more noise, i.e. when the target classes are overlapping.
• In cases where the number of features for each data point exceeds the
number of training data samples, the SVM will underperform.
• As the support vector classifier works by placing data points above or below the classifying hyperplane, there is no direct probabilistic explanation for the classification.
THANK YOU!
