
WEEK 6 BUSINESS DATA MINING

Support Vector Machines (SVMs) are a class of supervised learning algorithms used for
classification and regression tasks. For classification, an SVM finds the hyperplane that
separates the classes with the largest margin; for regression, it fits a function that
predicts a continuous target variable. Several variants exist, each suited to different
data and tasks. The main types of support vector machines are:

1. Linear SVM:
- Linear SVM is the simplest form of SVM and is used for linearly separable datasets,
where the classes can be separated by a hyperplane in the feature space (a straight line
when there are two features).
- In linear SVM, the goal is to find the hyperplane that maximizes the margin, which is
the distance between the hyperplane and the closest data points (support vectors) from
each class.
- The decision function of a linear SVM is represented as:
\[f(x) = w^T x + b\]
where \(w\) is the weight vector, \(x\) is the input feature vector, and \(b\) is the bias
term. The predicted class is given by the sign of \(f(x)\).
- Linear SVM is computationally efficient and works well for high-dimensional
datasets with a large number of features.
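
The decision function and the margin distance described above can be sketched in a few lines of plain Python. This is only an illustration: the weight vector `w`, bias `b`, and the sample point are hand-picked, hypothetical values rather than parameters learned from data.

```python
import math

def decision_value(w, x, b):
    """Return f(x) = w^T x + b for plain Python lists w and x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, x, b):
    """Classify by the sign of f(x): +1 on one side of the hyperplane, -1 on the other."""
    return 1 if decision_value(w, x, b) >= 0 else -1

def distance_to_hyperplane(w, x, b):
    """Geometric distance from x to the hyperplane: |f(x)| / ||w||."""
    norm_w = math.sqrt(sum(wi * wi for wi in w))
    return abs(decision_value(w, x, b)) / norm_w

# Hypothetical parameters chosen for easy arithmetic (not learned from data).
w = [3.0, 4.0]
b = -5.0
point = [3.0, 4.0]

score = decision_value(w, point, b)         # 3*3 + 4*4 - 5
label = predict(w, point, b)                # sign of the score
dist = distance_to_hyperplane(w, point, b)  # |score| divided by ||w|| = 5
```

Training an SVM amounts to choosing `w` and `b` so that the smallest such distance over the training points is as large as possible.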

2. Nonlinear SVM:
- Nonlinear SVM is used for datasets that are not linearly separable, meaning that the
classes cannot be separated by a hyperplane in the original feature space.
- Nonlinear SVM uses kernel functions to implicitly map the input features into a
higher-dimensional space where the classes become separable by a hyperplane. The kernel
computes inner products in that space directly, so the mapping itself is never computed.
- Common kernel functions used in nonlinear SVM include:
  - Polynomial kernel: \(K(x, x') = (x^T x' + c)^d\)
  - Gaussian (Radial Basis Function) kernel: \(K(x, x') = e^{-\frac{\|x - x'\|^2}{2\sigma^2}}\)
  - Sigmoid kernel: \(K(x, x') = \tanh(\alpha x^T x' + c)\)
- Nonlinear SVM allows for more flexible decision boundaries and can capture
complex relationships in the data.
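
As a rough sketch, the three kernels listed above can be written directly from their formulas. The default values for `c`, `d`, `sigma`, and `alpha` below are arbitrary assumptions for illustration, not recommended settings.

```python
import math

def dot(x, y):
    """Inner product x^T y of two equal-length lists."""
    return sum(a * b for a, b in zip(x, y))

def polynomial_kernel(x, y, c=1.0, d=2):
    """K(x, x') = (x^T x' + c)^d"""
    return (dot(x, y) + c) ** d

def gaussian_kernel(x, y, sigma=1.0):
    """K(x, x') = exp(-||x - x'||^2 / (2 sigma^2))"""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2 * sigma ** 2))

def sigmoid_kernel(x, y, alpha=0.5, c=0.0):
    """K(x, x') = tanh(alpha * x^T x' + c)"""
    return math.tanh(alpha * dot(x, y) + c)

# Two hypothetical feature vectors.
x = [1.0, 2.0]
x_prime = [2.0, 0.0]

poly_value = polynomial_kernel(x, x_prime)   # (2 + 1)^2
rbf_value = gaussian_kernel(x, x_prime)      # exp(-5 / 2)
sigmoid_value = sigmoid_kernel(x, x_prime)   # tanh(0.5 * 2)
```

Each function returns a similarity score for a pair of points; the Gaussian kernel, for instance, equals 1 when the two points coincide and decays toward 0 as they move apart.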

3. Multiclass SVM:
- Multiclass SVM is an extension of binary SVM to handle datasets with more than two
classes. It can classify instances into multiple classes by training multiple binary
classifiers, typically using the one-vs-rest (OvR) or one-vs-one (OvO) strategy.
- In the OvR strategy, a separate binary classifier is trained for each class, which learns
to distinguish that class from all other classes. The final prediction is made based on the
classifier with the highest confidence score.
- In the OvO strategy, a binary classifier is trained for each pair of classes, which learns
to distinguish between instances of those two classes. The final prediction is made based
on a majority voting scheme.
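
The two prediction rules can be illustrated with a small sketch. The class names, decision scores, and pairwise winners below are made-up inputs standing in for the outputs of already-trained binary classifiers; tie-breaking is not handled in any special way.

```python
from collections import Counter

def predict_ovr(scores):
    """One-vs-rest: pick the class whose 'class vs rest' classifier scores highest.

    scores maps each class to the decision score of its binary classifier.
    """
    return max(scores, key=scores.get)

def predict_ovo(pairwise_winners):
    """One-vs-one: each pairwise classifier casts one vote; the majority wins.

    pairwise_winners maps a (class_a, class_b) pair to the class that classifier chose.
    """
    votes = Counter(pairwise_winners.values())
    return votes.most_common(1)[0][0]

def n_ovo_classifiers(k):
    """Number of pairwise classifiers needed for k classes: k(k - 1) / 2."""
    return k * (k - 1) // 2

# Hypothetical outputs for a three-class problem.
ovr_scores = {"cat": 0.3, "dog": 1.2, "bird": -0.7}
ovo_winners = {
    ("cat", "dog"): "dog",
    ("cat", "bird"): "cat",
    ("dog", "bird"): "dog",
}

ovr_prediction = predict_ovr(ovr_scores)
ovo_prediction = predict_ovo(ovo_winners)
```

OvR needs one classifier per class, while OvO needs one per pair of classes, so the number of OvO classifiers grows quadratically with the number of classes.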

4. Probabilistic SVM:
- Probabilistic SVM extends the binary SVM to output probability estimates for class
membership rather than just binary predictions.
- One common approach for probabilistic SVM is Platt scaling, which fits a logistic
regression model to the SVM decision scores to estimate class probabilities.
- Probabilistic SVM provides a measure of confidence or uncertainty in the predictions,
which can be useful for decision-making and evaluating model performance.
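
A minimal sketch of the Platt scaling step is shown below. It only applies the sigmoid mapping from decision scores to probabilities; the parameters `A` and `B` are hypothetical values chosen by hand, whereas in practice they would be fitted by logistic regression on held-out decision scores.

```python
import math

def platt_probability(score, a, b):
    """Map an SVM decision score to P(y = 1 | score) = 1 / (1 + exp(a * score + b))."""
    return 1.0 / (1.0 + math.exp(a * score + b))

# Hypothetical sigmoid parameters (normally fitted, not hand-picked).
# A negative A means larger decision scores give higher probabilities.
A = -2.0
B = 0.0

# Hypothetical decision scores: far on the negative side, on the boundary, far on the positive side.
scores = [-1.5, 0.0, 1.5]
probs = [platt_probability(s, A, B) for s in scores]
```

With these values, a point on the decision boundary (score 0) maps to a probability of 0.5, and points farther from the boundary map to probabilities closer to 0 or 1.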

5. SVM for Regression (Support Vector Regression, SVR):
- SVM can also be used for regression tasks, where the goal is to predict continuous
target variables rather than discrete class labels.
- Support Vector Regression (SVR) works by fitting a function such that as many data
points as possible lie within a specified margin (epsilon) around the predicted values.
- Errors smaller than epsilon are ignored. SVR penalizes only the margin violations
(instances that fall outside the epsilon margin), in proportion to how far outside they
fall, while keeping the fitted function as flat as possible.
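
The way SVR treats errors can be sketched with the epsilon-insensitive loss, which is zero inside the margin and grows linearly outside it. The targets, predictions, and `epsilon` value below are made up for illustration.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Return max(0, |y_true - y_pred| - epsilon).

    Errors no larger than epsilon cost nothing; larger errors cost the excess.
    """
    return max(0.0, abs(y_true - y_pred) - epsilon)

# Hypothetical targets and predictions.
targets = [1.0, 2.0, 3.0]
predictions = [1.05, 2.5, 2.0]
epsilon = 0.1

losses = [
    epsilon_insensitive_loss(t, p, epsilon)
    for t, p in zip(targets, predictions)
]
```

The first prediction is off by about 0.05, which is inside the margin and incurs no loss; the other two fall outside the margin and are penalized by the amount they exceed epsilon.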

6. SVM with Weighted Classes:
- SVM can be extended to handle imbalanced datasets by assigning different weights to
the classes based on their importance or prevalence in the dataset.
- By adjusting the class weights, SVM can give more emphasis to minority classes and
reduce bias towards the majority class during training.
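
One common way to choose class weights is to make them inversely proportional to class frequency. The sketch below assumes that heuristic (the same idea as the "balanced" class-weight option in scikit-learn) and a made-up label set; the base penalty `C` is likewise an assumed value, and other weighting schemes are possible.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Hypothetical imbalanced dataset: 90 negative and 10 positive examples.
labels = ["neg"] * 90 + ["pos"] * 10
weights = balanced_class_weights(labels)

# Assumed base penalty; each class's misclassification penalty is scaled by its weight.
C = 1.0
per_class_penalty = {cls: C * weight for cls, weight in weights.items()}
```

Here the minority class receives a weight nine times larger than the majority class, so misclassifying a positive example costs correspondingly more during training.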

These are the main types of support vector machines, each tailored to different types of
data and tasks. Depending on the characteristics of the dataset and the objectives of the
analysis, practitioners may choose the most appropriate type of SVM and kernel function
to build accurate and effective models.
