PCA

1. PCA stands for Principal Component Analysis.


2. The aim is to reduce the dimensionality of the predictor variables.
3. Suppose the dataset has n features; the aim is to reduce the number of
features to k such that k <= n (typically k << n).

(Figure: projecting data from 2 dimensions to 1 dimension.)


4. Mathematically, we project the n-dimensional data onto a linear
subspace of k dimensions, as in the sketch below.
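
A minimal sketch of this projection for the 2-D to 1-D case (NumPy is
assumed; the sample points and the direction u are made up for illustration):

    import numpy as np

    X = np.array([[2.0, 1.9],
                  [1.0, 1.1],
                  [3.0, 3.2]])             # 3 examples, 2 features
    u = np.array([1.0, 1.0]) / np.sqrt(2)  # unit vector spanning the 1-D subspace

    z = X @ u                # 1-D coordinate of each example along u
    X_proj = np.outer(z, u)  # the projected points, back in 2-D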
5. Before applying PCA (a sketch of both steps follows this list):
 • Perform mean normalization of all features so that each feature's mean is reduced to zero (mandatory).
 • Do feature scaling by dividing the normalized features by either the range or the standard deviation of the corresponding feature.
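
A sketch of both preprocessing steps, assuming NumPy (the function name
normalize_features is hypothetical):

    import numpy as np

    def normalize_features(X):
        # X is an m x n matrix: m examples, n features.
        mu = X.mean(axis=0)        # per-feature mean
        sigma = X.std(axis=0)      # per-feature standard deviation (the range also works)
        X_norm = (X - mu) / sigma  # zero mean (mandatory), comparable scale
        return X_norm, mu, sigma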
6. Algorithm (a sketch follows this list):
 • Compute the covariance matrix Sigma = (1/m) * X^T X of the mean-normalized data.
 • Compute the eigenvectors of the covariance matrix, e.g. via SVD: [U, S, V] = svd(Sigma).
 • Form the Ureduce matrix from U, whose n columns are the directions onto which the entire data can be projected; select the first k columns.
 • Now any example is compressed as z(i) = Ureduce^T * x(i).
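
A compact sketch of these steps, assuming NumPy and the X_norm matrix from
the preprocessing sketch above:

    import numpy as np

    def pca(X_norm, k):
        m = X_norm.shape[0]
        Sigma = (X_norm.T @ X_norm) / m  # n x n covariance matrix
        U, S, _ = np.linalg.svd(Sigma)   # columns of U are the eigenvectors
        U_reduce = U[:, :k]              # keep the first k directions
        Z = X_norm @ U_reduce            # row i is z(i) = Ureduce^T x(i)
        return Z, U_reduce, S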
7. Reconstruction from the compressed representation:
x(i)_approx = Ureduce * z(i)
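
In the NumPy sketch above this is a single line, mapping every row of Z back
to the original n-dimensional space:

    X_approx = Z @ U_reduce.T  # row i is x(i)_approx = Ureduce @ z(i)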
8. To select the best possible k, ensure:
(variance retained with k dimensions) / (variance retained with n dimensions) >= 0.99
i.e. pick the smallest k that retains at least 99% of the variance (a sketch for computing this follows).
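
With the singular values S returned by the pca sketch above, this ratio
equals sum(S[:k]) / sum(S), so the smallest acceptable k can be read off
directly (a sketch under the same assumptions):

    import numpy as np

    def choose_k(S, threshold=0.99):
        ratios = np.cumsum(S) / np.sum(S)  # variance retained for k = 1..n
        return int(np.argmax(ratios >= threshold)) + 1  # smallest k meeting the threshold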

9. Applications of PCA:
 • Reduces the memory/disk space required to store the data.
 • Speeds up the learning algorithm.
 • Makes visualization easier.
10. Using PCA to prevent overfitting is a bad use of PCA, since some
information is lost in the projection; use regularization instead.
