Support Vector Machine (SVM)

Goals: In this session, we will try to understand the concept of the Support Vector Machine and how it works, using the Python programming language. We'll use the Scikit-learn library and the Iris dataset to give a simple, illustrated explanation of SVMs.

from IPython.display import Image
# Image(filename="Capture11.PNG", width=400, height=400)

Understanding the Support Vector Machine algorithm

"Support Vector Machine" (SVM) is a supervised machine learning algorithm that can be used for both classification and regression challenges. However, it is mostly used in classification problems. In the SVM algorithm, we plot each data item as a point in n-dimensional space (where n is the number of features you have), with the value of each feature being the value of a particular coordinate. Then, we perform classification by finding the hyperplane that best separates the two classes. A minimal code sketch of this idea follows the figures below.

Image(filename="Capture12.PNG", width=88, height=80)

[Figure: an SVM separating hyperplane with its margin and support vectors]

[Figure: effect of the tuning parameters: high vs. low Gamma, and high vs. low Regularization (C)]
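To make the hyperplane idea concrete, here is a minimal sketch (our addition, not part of the original notebook) that fits a linear SVM on just two of the Iris classes and two features, then reads back the learned hyperplane; the variable names and the sample point are illustrative choices.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

iris = load_iris()
mask = iris.target < 2                            # keep two classes: setosa, versicolor
X2, y2 = iris.data[mask, :2], iris.target[mask]   # first two features only, so the hyperplane is a line

clf = SVC(kernel='linear')
clf.fit(X2, y2)

w, b = clf.coef_[0], clf.intercept_[0]   # hyperplane: w[0]*x0 + w[1]*x1 + b = 0
print("w =", w, "b =", b)
print("number of support vectors:", len(clf.support_vectors_))

# The sign of w.x + b decides the class of a new point:
x_new = np.array([[5.0, 3.4]])
print(clf.decision_function(x_new), clf.predict(x_new))

Points with a positive decision value lie on one side of the hyperplane and points with a negative value on the other; the support vectors are the training points closest to it.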
Let's start coding!

Step 1: Import packages

import matplotlib.pyplot as plt
%matplotlib inline
import pandas as pd
import numpy as np
from sklearn.datasets import load_iris
from sklearn import svm, datasets

Step 2: Load data

iris = load_iris()
print(iris.feature_names)   # print feature names
print(iris.target_names)    # print target (class) names

['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
['setosa' 'versicolor' 'virginica']

Step 3: Convert the data into a dataframe

df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['target'] = iris.target
df.head()

   sepal length (cm)   sepal width (cm)   petal length (cm)   petal width (cm)   target
0  5.1                 3.5                1.4                 0.2                0
1  4.9                 3.0                1.4                 0.2                0
2  4.7                 3.2                1.3                 0.2                0
3  4.6                 3.1                1.5                 0.2                0
4  5.0                 3.6                1.4                 0.2                0

df[df.target==0].head()   # print first five rows of setosa
df[df.target==1].head()   # print first five rows of versicolor
df[df.target==2].head()   # print first five rows of virginica

     sepal length (cm)   sepal width (cm)   petal length (cm)   petal width (cm)   target
100  6.3                 3.3                6.0                 2.5                2
101  5.8                 2.7                5.1                 1.9                2
102  7.1                 3.0                5.9                 2.1                2
103  6.3                 2.9                5.6                 1.8                2
104  6.5                 3.0                5.8                 2.2                2

# apply a lambda function to create a column that contains target names
df['flower_name'] = df.target.apply(lambda x: iris.target_names[x])
df.head()

   sepal length (cm)   sepal width (cm)   petal length (cm)   petal width (cm)   target   flower_name
0  5.1                 3.5                1.4                 0.2                0        setosa
1  4.9                 3.0                1.4                 0.2                0        setosa
2  4.7                 3.2                1.3                 0.2                0        setosa
3  4.6                 3.1                1.5                 0.2                0        setosa
4  5.0                 3.6                1.4                 0.2                0        setosa

df[45:55]   # print rows 45 to 54

    sepal length (cm)   sepal width (cm)   petal length (cm)   petal width (cm)   target   flower_name
45  4.8                 3.0                1.4                 0.3                0        setosa
46  5.1                 3.8                1.6                 0.2                0        setosa
47  4.6                 3.2                1.4                 0.2                0        setosa
48  5.3                 3.7                1.5                 0.2                0        setosa
49  5.0                 3.3                1.4                 0.2                0        setosa
50  7.0                 3.2                4.7                 1.4                1        versicolor
51  6.4                 3.2                4.5                 1.5                1        versicolor
52  6.9                 3.1                4.9                 1.5                1        versicolor
53  5.5                 2.3                4.0                 1.3                1        versicolor
54  6.5                 2.8                4.6                 1.5                1        versicolor

# split the dataframe into 3 sub-dataframes
df0 = df[:50]       # setosa
df1 = df[50:100]    # versicolor
df2 = df[100:]      # virginica

Step 4: Plot data

Petal Length vs Petal Width (setosa vs versicolor vs virginica):

plt.xlabel('Petal Length')
plt.ylabel('Petal Width')
plt.scatter(df0['petal length (cm)'], df0['petal width (cm)'], color="green", marker='+')
plt.scatter(df1['petal length (cm)'], df1['petal width (cm)'], color="blue", marker='.')
plt.scatter(df2['petal length (cm)'], df2['petal width (cm)'], color="red", marker='*')
plt.legend()

[Figure: scatter plot of petal length vs. petal width; the three species form visibly separable clusters]

Step 5: Split the data into training and testing sets

from sklearn.model_selection import train_test_split

X = df.drop(['target', 'flower_name'], axis='columns')
y = df.target
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

len(X_train)
120

len(X_test)
30

Step 6: Create the model: Support Vector Machine classifier (SVC)

from sklearn.svm import SVC

model = SVC()
model.fit(X_train, y_train)
SVC()

model.score(X_test, y_test)
0.9333333333333333

model.predict([[4.8, 3.0, 2.5, 0.3]])
array([0])

Tune parameters

1. Regularization (C)

model_C = SVC(C=10)
model_C.fit(X_train, y_train)
model_C.score(X_test, y_test)
0.9333333333333333

model_C = SVC(C=100)
model_C.fit(X_train, y_train)
model_C.score(X_test, y_test)
0.9333333333333333

2. Gamma

model_g = SVC(gamma=100)
model_g.fit(X_train, y_train)
model_g.score(X_test, y_test)
0.43333333333333335

3. Kernel

model_linear_kernal = SVC(kernel='linear')
model_linear_kernal.fit(X_train, y_train)
SVC(kernel='linear')

model_linear_kernal.score(X_test, y_test)
0.9

model_rbf_kernal = SVC(kernel='rbf')
model_rbf_kernal.fit(X_train, y_train)
SVC()

model_rbf_kernal.score(X_test, y_test)
0.9333333333333333

Question: try different values of C, gamma and kernel and interpret the results. One systematic way to do this is sketched below.
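As a possible answer to the question above (this sketch is our addition, not part of the original notebook, and the grid values are arbitrary illustrations), scikit-learn's GridSearchCV can try every combination of C, gamma and kernel with cross-validation and report the best one:

from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# candidate values for each parameter; every combination is evaluated
param_grid = {
    'C': [1, 10, 100],
    'gamma': ['scale', 0.1, 1, 100],
    'kernel': ['linear', 'rbf', 'poly'],
}

search = GridSearchCV(SVC(), param_grid, cv=5)   # 5-fold cross-validation
search.fit(X_train, y_train)                     # X_train, y_train from Step 5

print(search.best_params_)                       # best combination found
print(search.best_score_)                        # its mean cross-validated accuracy
print(search.best_estimator_.score(X_test, y_test))

A very large gamma makes each training point influence only its immediate neighbourhood, so the model overfits; this is consistent with the drop to about 0.43 observed above with gamma=100.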
Plot data with different parameters

# import some data to play with
X = iris.data[:, :2]   # we only take the first two features
y = iris.target

h = .02   # step size in the mesh

# We create an instance of SVM and fit our data. We do not scale our
# data since we want to plot the support vectors.
C = 1.0   # SVM regularization parameter
svc = svm.SVC(kernel='linear', C=C).fit(X, y)
rbf_svc = svm.SVC(kernel='rbf', gamma=0.7, C=C).fit(X, y)
poly_svc = svm.SVC(kernel='poly', degree=3, C=C).fit(X, y)
lin_svc = svm.LinearSVC(C=C).fit(X, y)

# create a mesh to plot in
x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, h),
                     np.arange(y_min, y_max, h))

# titles for the plots
titles = ['SVC with linear kernel',
          'LinearSVC (linear kernel)',
          'SVC with RBF kernel',
          'SVC with polynomial (degree 3) kernel']

for i, clf in enumerate((svc, lin_svc, rbf_svc, poly_svc)):
    # Plot the decision boundary. For that, we will assign a color to each
    # point in the mesh [x_min, x_max] x [y_min, y_max].
    plt.subplot(2, 2, i + 1)
    plt.subplots_adjust(wspace=0.4, hspace=0.4)

    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()])

    # Put the result into a color plot
    Z = Z.reshape(xx.shape)
    plt.contourf(xx, yy, Z, cmap=plt.cm.coolwarm, alpha=0.8)

    # Plot also the training points
    plt.scatter(X[:, 0], X[:, 1], c=y, cmap=plt.cm.coolwarm)
    plt.xlabel('Sepal length')
    plt.ylabel('Sepal width')
    plt.xlim(xx.min(), xx.max())
    plt.ylim(yy.min(), yy.max())
    plt.xticks(())
    plt.yticks(())
    plt.title(titles[i])

plt.show()

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\svm\_base.py:985: ConvergenceWarning: Liblinear failed to converge, increase the number of iterations.
  warnings.warn("Liblinear failed to converge, increase "

[Figure: 2x2 grid of decision boundaries over sepal length vs. sepal width, one panel per model: SVC with linear kernel, LinearSVC (linear kernel), SVC with RBF kernel, SVC with polynomial (degree 3) kernel]
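The ConvergenceWarning comes from LinearSVC hitting its default iteration cap (max_iter=1000) on this unscaled data. A simple possible fix (our suggestion, not part of the original notebook) is to allow more iterations; standardizing the features first also helps:

# give liblinear more iterations to converge on the unscaled data
lin_svc = svm.LinearSVC(C=C, max_iter=10000).fit(X, y)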
