JI-AIDS-ML-03-Unsup-01-Principal Components Analysis
ISB / MLSL / Machine Learning
Dr. Shailesh Kumar
Orthogonal Basis Functions
$$x = x_1 i + x_2 j + x_3 k = (i^T x)\, i + (j^T x)\, j + (k^T x)\, k$$
UNORDERED: all bases are equally important!
[Figure: a vector x in 3-D space with components x1, x2, x3 along the orthogonal axes i, j, k.]
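As a minimal sketch of the expansion on this slide, the MATLAB/Octave snippet below rebuilds a vector from its projections onto an orthonormal basis. The example vector, the matrix used to generate the basis, and all variable names are illustrative assumptions, not taken from the slides.

```matlab
% Build an orthonormal basis of 3-D space from an arbitrary invertible
% matrix (hypothetical values); the columns of B play the role of i, j, k.
[B, ~] = qr([1 2 0; 0 1 1; 1 0 1]);
x = [2; -1; 3];                 % example vector (hypothetical values)

% Rebuild x as the sum of its projections onto each basis vector.
x_rebuilt = zeros(3, 1);
for m = 1:3
    b = B(:, m);
    x_rebuilt = x_rebuilt + (b' * x) * b;    % (b^T x) b
end

reconstruction_error = norm(x - x_rebuilt);  % ~0 for an orthonormal basis
```

The order of the loop does not matter here, which is the sense in which the basis is unordered.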
[Figure: RAW data in 3-D space is projected along a chosen DIRECTION of projection to give PROJECTED data in 2-D space.]
Is there a “Best” Projection?
[Figure: PROJECTED data in 1-D space for four candidate directions, A, B, C, and D.]
Which is the “better” projection here?
Which is the “best” projection?
[Figure: projection D]
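One way to make "better" concrete is to compare how much variance each candidate direction retains. The sketch below does this for synthetic 2-D data; the data, the four angles, and the variable names are assumptions chosen for illustration and do not correspond to the directions A to D on the slide.

```matlab
% Synthetic 2-D data stretched along the 45-degree diagonal (assumed example).
t = linspace(-1, 1, 200)';
data = [t + 0.05*sin(37*t), t - 0.05*sin(37*t)];   % rows = samples

% Four hypothetical candidate directions, given as angles in degrees.
angles = [0, 45, 90, 135];
proj_var = zeros(size(angles));
for m = 1:numel(angles)
    w = [cosd(angles(m)); sind(angles(m))];   % unit direction vector
    y = data * w;                             % 1-D projection of every sample
    proj_var(m) = var(y);                     % spread retained by this direction
end

[~, best] = max(proj_var);
best_angle = angles(best);    % direction with the largest projected variance
```

With this data the 45-degree direction keeps the most variance and the 135-degree direction keeps almost none, since the points lie close to the diagonal.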
Two Complete and Orthogonal Projections
[Figure: data with the FIRST Principal Component and the SECOND Principal Component marked.]
PCA | Finding the Best Dimensions to project data
Principal Components Analysis in Nature
[Figure: five panels labeled 1st through 5th Principal Component.]
Principal Components Analysis
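Project the data so as to preserve maximum variance
Projection: $x(n) \to w^T x(n) = y(n)$
Variance in projected space: $\frac{1}{N}\sum_{n}\left(y(n) - \mu_Y\right)^2$, where $\mu_Y$ is the mean of the projected values $y(n)$
[Figure: data points x(n) projected onto two candidate directions, w1 and w2, with projected values y(n) and their mean $\mu_Y$.]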
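The variance of the projected values can also be obtained from the covariance matrix of the original data, without projecting anything. The following sketch checks this numerically; the data values and the direction w are hypothetical. MATLAB's `cov` normalizes by N-1, so the direct computation below does the same; the identity holds under either normalization as long as both sides use the same one.

```matlab
% Small example data set, rows = samples (hypothetical values).
data = [2.0 1.0; 3.0 4.0; 5.0 3.5; 6.0 7.0; 8.0 6.5];
w = [3; 4] / 5;                 % unit-length direction (assumed)

y = data * w;                   % projected values y(n) = w' * x(n)
mu_Y = mean(y);                 % mean of the projected values
var_direct = sum((y - mu_Y).^2) / (numel(y) - 1);   % variance of y(n)

C = cov(data);                  % covariance matrix of the original data
var_from_cov = w' * C * w;      % same quantity, computed without projecting
```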
Principal Components Analysis
Project the data into the direction in which its variance is maximum
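Projection: $x(n) \to w^T x(n) = y(n)$
Variance in projected space: $w^T C w$, where $C$ is the covariance matrix of the data
$$w^* = \arg\max_{w,\ \|w\| = 1} w^T C w = \text{EigenVector}(C)$$
(the eigenvector of $C$ with the largest eigenvalue; without the unit-length constraint the maximum would be unbounded)
[Figure: data points x(n) projected onto direction w, with projected values y(n) and mean $\mu_Y$.]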
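The step from "maximize the projected variance" to "take an eigenvector of C" can be sketched as follows. The symbols $\mu_X$ (data mean), $N$ (number of samples), and $\lambda$ (Lagrange multiplier), as well as the unit-length constraint on $w$, are standard assumptions that the slide does not spell out.

```latex
\begin{aligned}
\sigma_Y^2 &= \frac{1}{N}\sum_{n}\bigl(w^T x(n) - w^T \mu_X\bigr)^2 = w^T C w,
\qquad C = \frac{1}{N}\sum_{n}\bigl(x(n)-\mu_X\bigr)\bigl(x(n)-\mu_X\bigr)^T \\
L(w,\lambda) &= w^T C w - \lambda\,\bigl(w^T w - 1\bigr) \\
\frac{\partial L}{\partial w} &= 2Cw - 2\lambda w = 0 \;\Longrightarrow\; Cw = \lambda w \\
w^T C w &= \lambda\, w^T w = \lambda
\end{aligned}
```

Every stationary point is therefore an eigenvector of $C$, and the variance it captures equals its eigenvalue, so the maximum is attained at the eigenvector with the largest eigenvalue.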
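d = 2;                                         % number of components to keep (example value)
cv = cov(data);                                % covariance matrix; rows of data are samples
[eig_vectors, eig_values] = eig(cv);           % eigenvectors in columns, eigenvalues on the diagonal
[~, q] = sort(diag(eig_values), 'descend');    % component order, largest eigenvalue first
pca_proj = data * eig_vectors(:, q(1:d));      % project the data onto the top d components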
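A short end-to-end run of these steps, as a sketch on synthetic data, shows two properties of the result: the projected features are uncorrelated, and their variances equal the leading eigenvalues. The data set, the name `num_keep`, and the symmetrization step are assumptions added for illustration.

```matlab
% Synthetic data: 100 samples, 3 features, built deterministically (assumed).
t = linspace(0, 1, 100)';
data = [t, 2*t + 0.1*cos(40*t), 0.05*sin(25*t)];

num_keep = 2;                               % components to keep (assumed)
cv = cov(data);
cv = (cv + cv') / 2;                        % guard against rounding asymmetry
[eig_vectors, eig_values] = eig(cv);
[sorted_vals, q] = sort(diag(eig_values), 'descend');
pca_proj = data * eig_vectors(:, q(1:num_keep));

% Covariance of the projected data: diagonal, with the leading
% eigenvalues on the diagonal.
proj_cov = cov(pca_proj);
```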
PCA for IRIS data
cov(data) =
    0.6811  -0.0390   1.2652   0.5135
   -0.0390   0.1868  -0.3196  -0.1172
    1.2652  -0.3196   3.0924   1.2877
    0.5135  -0.1172   1.2877   0.5785

eig_vectors =
   -0.3173   0.5810   0.6565   0.3616
    0.3241  -0.5964   0.7297  -0.0823
    0.4797  -0.0725  -0.1758   0.8566
   -0.7511  -0.5491  -0.0747   0.3588

eig_values =
    0.0235   0.0780   0.2406   4.1967
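The printed numbers can be checked against the eigenvalue relation directly. The sketch below transcribes the values from the slide, reading each stray minus sign at the end of a row as the sign of the first entry of the following row of the same matrix (an assumption about the extraction, though it makes the covariance matrix symmetric). Because the values are rounded to four decimals, the checks only hold approximately.

```matlab
% Values transcribed from the slide (rounded to four decimals).
C = [ 0.6811, -0.0390,  1.2652,  0.5135;
     -0.0390,  0.1868, -0.3196, -0.1172;
      1.2652, -0.3196,  3.0924,  1.2877;
      0.5135, -0.1172,  1.2877,  0.5785];
V = [-0.3173,  0.5810,  0.6565,  0.3616;
      0.3241, -0.5964,  0.7297, -0.0823;
      0.4797, -0.0725, -0.1758,  0.8566;
     -0.7511, -0.5491, -0.0747,  0.3588];
lambda = [0.0235, 0.0780, 0.2406, 4.1967];

% Each column v of V should satisfy C*v = lambda*v, up to rounding.
residual = max(max(abs(C*V - V*diag(lambda))));

% The eigenvectors should be orthonormal: V'*V close to the identity.
orth_error = max(max(abs(V'*V - eye(4))));

% The eigenvalues should add up to the total variance, trace(C).
trace_gap = abs(trace(C) - sum(lambda));
```

Note that `eig` lists the eigenvalues in ascending order here, so the first principal component is the last column of `eig_vectors`.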
Data = Signal (Structure) + Noise (Background)
$$x = (w_1^T x)\, w_1 + (w_2^T x)\, w_2 + (w_3^T x)\, w_3 + (w_4^T x)\, w_4$$
The leading terms are labeled SIGNAL and the trailing terms NOISE.
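The split can be sketched in code by expanding one sample in the eigenvector basis and grouping the terms. The synthetic data and the choice of how many components count as signal (`num_signal`) are assumptions; the slides do not state where the boundary lies.

```matlab
% Deterministic synthetic data, rows = samples (assumed example).
t = linspace(0, 1, 50)';
data = [t, 2*t, -t, 0.5*t] + 0.01 * [sin(30*t), cos(17*t), sin(11*t), cos(23*t)];

C = cov(data);
C = (C + C') / 2;                  % guard against rounding asymmetry
[W, D] = eig(C);
[~, q] = sort(diag(D), 'descend');
W = W(:, q);                       % columns w1..w4, strongest direction first

num_signal = 1;                    % components treated as signal (assumed)
x = data(10, :)';                  % one sample as a column vector

coeffs = W' * x;                   % the coefficients w_m' * x
signal = W(:, 1:num_signal) * coeffs(1:num_signal);
noise  = W(:, num_signal+1:end) * coeffs(num_signal+1:end);
```

Because the eigenvectors form a complete orthonormal basis, `signal + noise` reproduces `x` exactly; dropping `noise` gives the reduced representation.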
Eigen Values: 4.1967 0.2406 0.0780 0.0235
[Figure: eigenvalues and percentage of variance captured.]
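The percentage of variance captured by each component is its eigenvalue divided by the sum of all eigenvalues. A minimal sketch using the four eigenvalues from the IRIS example (the variable names are assumptions):

```matlab
% Eigenvalues from the IRIS example, largest first.
eig_vals = [4.1967, 0.2406, 0.0780, 0.0235];

pct_variance = 100 * eig_vals / sum(eig_vals);   % share of each component
cum_pct = cumsum(pct_variance);                  % running total
```

For these values the first component alone accounts for roughly 92% of the total variance, and the first two together for close to 98%.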