Download as pptx, pdf, or txt
Download as pptx, pdf, or txt
You are on page 1of 11

NLP REVIEW CLASSFICATION

KNOWLEDGE SOLUTIONS INDIA

Group 21
Savita G
SVM DEFINITION
 “Support Vector Machine” (SVM) is a supervised
machine learning algorithm which can be used for both
classification or regression challenges.
 However, it is mostly used in classification problems.

 In this algorithm, we plot each data item as a point in n-


dimensional space (where n is number of feature you
have) with the value of each feature being the value of a
particular coordinate.
 Then, we perform classification by finding the hyper-
plane that differentiate the two classes very well.
WORKING OF SVM
An SVM model is basically a representation of different classes in a
hyperplane in multidimensional space. The hyperplane will be generated in
an iterative manner by SVM so that the error can be minimized. The goal of
SVM is to divide the datasets into classes to find a maximum marginal
hyperplane (MMH).
The followings are important concepts in SVM −
Support Vectors − Data points that are closest to the hyperplane is called
support vectors. Separating line will be defined with the help of these data
points.
Hyperplane − It is a decision plane or space which is divided between a set
of objects having different classes.
 Margin − It may be defined as the gap between two lines on the closet data
points of different classes. It can be calculated as the perpendicular
distance from the line to the support vectors. Large margin is considered as
a good margin and small margin is considered as a bad margin
WORKING OF SVM
Thumb rule to identify the right hyper-plane
 Select the hyper-plane which segregates the two classes
better.
 Maximizing the distances between nearest data point
(either class) and hyper-plane. This distance is called as
Margin.
SVM MODEL
 Hyperplanes are decision boundaries that help classify
the data points.
 Support vectors are data points that are closer to the
hyperplane and influence the position and orientation of
the hyperplane.
SPLITING THE DATASET

•For splitting the dataset into train set and test set, we need to use
train_test_split() function available in model_selection module of sklearn
package.

•Here in our model we use a test size of 5%.


SVM MODEL AND PREDICTION
 We build the SVM model and predict our model using
the predict() function. It’s implementation is shown in
the following figure.
ACCURACY SCORE AND
CONFUSION MATRIX
 Accuracy classification score: In multilabel
classification, this function computes subset accuracy:
the set of labels predicted for a sample
must exactly match the corresponding set of labels in
y_true.
 Confusion Matrix is a performance measurement for
machine learning classification.
ACCURACY SCORE AND
CONFUSION MATRIX
ADVANTAGES
 The main strength of SVM is that they work well even
when the number of SVM feature is much larger than the
number of instances.
 It can work on datasets with huge features spaces, such is
the case in spam filtering, where a large number of
words are the potential signifiers of a message being
spam.
 SVMs are conceptually easy to understand. They create
an easy-to-understand linear classifier.
 SVMs are now available with almost all data anaytics
toolsets.
THANK YOU

You might also like