LDA
Dimensionality reduction techniques have become critical in machine learning since many high-
dimensional datasets exist these days. Linear Discriminant Analysis is one such dimensionality reduction technique.
Linear Discriminant Analysis was developed as early as 1936 by Ronald A. Fisher. The original linear discriminant applied only to 2-class problems. It was only in 1948 that C.R. Rao generalized it to apply
to multi-class problems.
Linear Discriminant Analysis helps project a high-dimensional dataset onto a lower-dimensional space. The goal is to do this while maintaining a decent separation between classes and reducing the resources and cost of computing.
The original technique that was developed was known as the Linear Discriminant or Fisher’s
Discriminant Analysis. This was a two-class technique. The multi-class version, as generalized by C.R.
Rao, was called Multiple Discriminant Analysis. Today, they are all known simply as Linear Discriminant Analysis.
To understand Linear Discriminant Analysis better, let's begin by understanding what multi-dimensional data is.
Multi-dimensional data is data that has multiple features which have a correlation with one another.
An alternative to dimensionality reduction is plotting the data using scatter plots, boxplots,
histograms, and so on. We can then use these graphs to identify the pattern in the raw data.
However, with charts, it is difficult for a layperson to make sense of the data that has been presented.
Moreover, if there are many features in the data, thousands of charts will need to be analyzed to
identify patterns.
Dimensionality reduction algorithms solve this problem by plotting the data in 2 or 3 dimensions. This
allows us to present the data explicitly, in a way that can be understood by a layperson.
Linear Discriminant Analysis For Dummies
Linear Discriminant Analysis works on a simple step-by-step basis. Here are the three steps involved:
(i) Calculate the separability between the different classes. This is also known as the between-class variance and is the distance between the means of the different classes.
(ii) Calculate the within-class variance. This is the distance between the mean and the samples of each class.
Within-Class Variance:
S_W = \sum_{i=1}^{c} \sum_{x \in D_i} (x - \mu_i)(x - \mu_i)^T
(iii) Construct the lower-dimensional space that maximizes the between-class variance from Step 1 and minimizes the within-class variance from Step 2. In the equation below, P is the projection onto the lower-dimensional space; maximizing this ratio is known as Fisher's criterion.
Fisher's Criterion:
P_{\mathrm{lda}} = \arg\max_{P} \frac{|P^T S_B P|}{|P^T S_W P|}
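To make these three steps concrete, here is a minimal from-scratch sketch in NumPy. It is an illustration of the definitions above, not a library API; the function name lda_projection is hypothetical.
# From-scratch sketch of the three steps: build S_B and S_W, then find the
# projection P from the top eigenvectors of S_W^{-1} S_B.
import numpy as np

def lda_projection(X, y, n_components=2):
    classes = np.unique(y)
    n_features = X.shape[1]
    overall_mean = X.mean(axis=0)

    S_B = np.zeros((n_features, n_features))   # Step 1: between-class scatter
    S_W = np.zeros((n_features, n_features))   # Step 2: within-class scatter
    for c in classes:
        X_c = X[y == c]
        mean_c = X_c.mean(axis=0)
        diff = (mean_c - overall_mean).reshape(-1, 1)
        S_B += len(X_c) * diff @ diff.T
        S_W += (X_c - mean_c).T @ (X_c - mean_c)

    # Step 3: the eigenvectors of S_W^{-1} S_B with the largest eigenvalues span
    # the space that maximizes between-class and minimizes within-class variance
    eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)
    order = np.argsort(eigvals.real)[::-1]
    P = eigvecs[:, order[:n_components]].real   # the projection matrix P
    return X @ P                                # data in the lower-dimensional space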
Representation Of Linear Discriminant Analysis Models
The representation of Linear Discriminant Analysis models consists of the statistical properties of the dataset. These are calculated separately for each class. For instance, for a single input variable, it is the mean and the variance of the variable for each class.
If there are multiple variables, the same statistical properties are calculated over the multivariate
Gaussian. This includes the means and the covariance matrix. All these properties are directly
estimated from the data. They go straight into the Linear Discriminant Analysis equation.
The statistical properties are estimated on the basis of certain assumptions. These assumptions help
simplify the process of estimation. One such assumption is that every class has the same variance, i.e., the values of each variable vary around their class mean by the same amount on average.
Another assumption is that the data is Gaussian. This means that each variable, when plotted, is
shaped like a bell curve. Using these assumptions, the mean and variance of each variable are
estimated.
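As a small sketch of this estimation step for a single input variable (the function name estimate_lda_properties is illustrative), the per-class means and one pooled variance can be computed like this:
# Sketch: estimating LDA's statistical properties for a single input variable x
# under the assumptions above: Gaussian data, one variance shared by all classes.
import numpy as np

def estimate_lda_properties(x, y):
    classes = np.unique(y)
    # one mean per class
    means = {c: x[y == c].mean() for c in classes}
    # one pooled variance, as the equal-variance assumption requires
    pooled_var = sum(((x[y == c] - means[c]) ** 2).sum()
                     for c in classes) / (len(x) - len(classes))
    return means, pooled_var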
Linear Discriminant Analysis estimates the probability that a new set of inputs belongs to each class. The output class is the one that has the highest probability. That is how LDA makes its prediction.
LDA uses Bayes' Theorem to estimate the probabilities. If the output class is (k) and the input is (x), Bayes' theorem gives the probability that the data belongs to each class as

P(Y = k \mid X = x) = \frac{\pi_k f_k(x)}{\sum_{l=1}^{K} \pi_l f_l(x)}

where:
\pi_k – the prior probability. This is the base probability of class k as observed in the training data.
f_k(x) – the estimated probability that x belongs to that particular class. f_k(x) uses a Gaussian distribution function.
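Here is an illustrative sketch of this prediction rule for a single input variable (not the scikit-learn implementation; the function name predict_lda is hypothetical). The prior \pi_k is taken from training-set frequencies and f_k(x) is a Gaussian density with a class-specific mean and one pooled variance, matching the assumptions above.
# Illustrative sketch: class prediction via Bayes' theorem for one input variable.
import numpy as np

def predict_lda(x_new, x_train, y_train):
    classes = np.unique(y_train)
    means = {c: x_train[y_train == c].mean() for c in classes}
    # pooled variance shared by all classes
    var = sum(((x_train[y_train == c] - means[c]) ** 2).sum()
              for c in classes) / (len(x_train) - len(classes))

    posteriors = {}
    for c in classes:
        prior = (y_train == c).mean()                             # pi_k
        density = (np.exp(-(x_new - means[c]) ** 2 / (2 * var))   # f_k(x)
                   / np.sqrt(2 * np.pi * var))
        posteriors[c] = prior * density
    # the denominator of Bayes' theorem is the same for every class,
    # so the class with the largest numerator wins
    return max(posteriors, key=posteriors.get)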
Two dimensionality-reduction techniques that are commonly used for the same purpose as Linear
Discriminant Analysis are Logistic Regression and PCA (Principal Components Analysis). However,
Linear Discriminant Analysis has certain unique features that make it the technique of choice in many
cases. Here are some differences between Linear Discriminant Analysis and these other techniques.
Linear Discriminant Analysis vs PCA
(i) PCA is an unsupervised algorithm. It ignores class labels altogether and aims to find the principal
components that maximize variance in a given set of data. Linear Discriminant Analysis, on the other
hand, is a supervised algorithm that finds the linear discriminants representing the axes that maximize the separation between multiple classes.
(ii) Linear Discriminant Analysis often outperforms PCA in a multi-class classification task when the
class labels are known. In some of these cases, however, PCA performs better. This is usually when the
sample size for each class is relatively small. A good example is the comparison of classification accuracies in image recognition tasks.
(iii) Many times, the two techniques are used together for dimensionality reduction (see the sketch below). PCA is used first,
followed by LDA.
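A minimal sketch of that combined use with scikit-learn's Pipeline; the component counts here are illustrative choices, not prescribed values.
# Sketch: PCA for an initial reduction, then LDA for class separation.
from sklearn.pipeline import Pipeline
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

pca_lda = Pipeline([
    ('pca', PCA(n_components=10)),                  # illustrative choice
    ('lda', LinearDiscriminantAnalysis(n_components=2)),
])
# X_reduced = pca_lda.fit_transform(X, y)          # X, y: features and labels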
Linear Discriminant Analysis vs Logistic Regression
Logistic regression is both simple and powerful. However, it is traditionally used only in binary
classification problems. While it can be extrapolated and used in multi-class classification problems,
this is rarely done. When it’s a question of multi-class classification problems, linear discriminant
analysis is usually the go-to choice. In fact, even with binary classification problems, both logistic regression and linear discriminant analysis are applied at times.
Logistic regression can become unstable when the classes are well-separated. This is where Linear Discriminant Analysis comes to the rescue.
If there are just a few examples from which the parameters need to be estimated, logistic regression also tends to become unstable. In this situation too, Linear Discriminant Analysis is the superior option, as it tends to remain stable.
Of course, you can use a step-by-step approach to implement Linear Discriminant Analysis. However,
the more convenient and more often-used way to do this is by using the Linear Discriminant Analysis
class in the scikit-learn machine learning library. Here is an example of the code, illustrated here on the Iris dataset.
# LDA with scikit-learn on the Iris dataset (3 classes, 4 features)
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis as LDA

X, y = load_iris(return_X_y=True)
label_dict = {0: 'Setosa', 1: 'Versicolor', 2: 'Virginica'}

# LDA
sklearn_lda = LDA(n_components=2)
X_lda_sklearn = sklearn_lda.fit_transform(X, y)

def plot_step_lda(X_lda, y, title='LDA: projection onto the first 2 linear discriminants'):
    ax = plt.subplot(111)
    for label, marker, color in zip((0, 1, 2), ('^', 's', 'o'), ('blue', 'red', 'green')):
        # plot both projected coordinates for the samples of this class
        plt.scatter(x=X_lda[y == label, 0],
                    y=X_lda[y == label, 1],
                    marker=marker,
                    color=color,
                    alpha=0.5,
                    label=label_dict[label])
    plt.xlabel('LD1')
    plt.ylabel('LD2')
    leg = plt.legend(loc='upper right', fancybox=True)
    leg.get_frame().set_alpha(0.5)
    plt.title(title)
    # remove the axis spines for a cleaner look
    for side in ('top', 'right', 'bottom', 'left'):
        ax.spines[side].set_visible(False)
    plt.grid()
    plt.tight_layout()
    plt.show()

plot_step_lda(X_lda_sklearn, y)
Due to its simplicity and ease of use, Linear Discriminant Analysis has seen many extensions and variations. These have all been designed with the objective of improving its efficacy. Here are some common extensions and variations of Linear Discriminant Analysis:
(i) Flexible Discriminant Analysis (FDA): Regular Linear Discriminant Analysis uses only linear combinations of inputs. Flexible Discriminant Analysis allows non-linear combinations of inputs, such as splines.
(ii) Quadratic Discriminant Analysis (QDA): Each class uses its own estimate of variance when there is a single input variable. In the case of multiple input variables, each class uses its own estimate of covariance.
(iii) Regularized Discriminant Analysis (RDA): This method moderates the influence of different variables on Linear Discriminant Analysis. It does so by regularizing the estimate of the covariance.
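The last two variants are available in scikit-learn, where shrinkage plays the role of the regularization described above. A minimal sketch, with the Iris dataset as an illustrative choice:
# Sketch: two of the variants above in scikit-learn.
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                           QuadraticDiscriminantAnalysis)

X, y = load_iris(return_X_y=True)

# QDA: each class gets its own covariance estimate
qda = QuadraticDiscriminantAnalysis().fit(X, y)

# Regularized LDA: shrinkage regularizes the covariance estimate
# ('auto' shrinkage requires the 'lsqr' or 'eigen' solver)
rda = LinearDiscriminantAnalysis(solver='lsqr', shrinkage='auto').fit(X, y)

print(qda.score(X, y), rda.score(X, y))   # training accuracy of each variant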
Conclusion
Linear Discriminant Analysis has become very popular because it's simple and easy to
understand. While other dimensionality reduction techniques like PCA and logistic regression are also
widely used, there are several specific use cases in which LDA is more appropriate. Thorough
knowledge of Linear Discriminant Analysis is a must for all data science and machine learning enthusiasts.