EE769-11 Dimension Reduction
Dimension Reduction
Amit Sethi
Faculty member, IIT Bombay
Learning objectives
Dimensionality reduction: what & why?
• Objective:
• For x ∈ R^D, find a mapping f : x → y, with y ∈ R^d and d < D
• Such that there exists an f⁻¹ that can (almost) reconstruct x
• Advantages:
• Less redundancy, easier classification
• Smaller storage, faster search
• Unravel meaningful latent variables
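As a concrete illustration of the mapping f and its approximate inverse f⁻¹, here is a minimal sketch using NumPy and scikit-learn's PCA; the toy data, the choices D = 20 and d = 5, and the variable names are assumptions for illustration, not part of the slides.

```python
# Minimal sketch: a linear f mapping x in R^D to y in R^d (d < D),
# with an approximate inverse f^-1 used for reconstruction.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 5))                   # 5 hidden latent variables
W = rng.normal(size=(5, 20))
X = latent @ W + 0.1 * rng.normal(size=(500, 20))    # observed data in D = 20 dims

f = PCA(n_components=5)                  # d = 5 (an assumed choice)
Y = f.fit_transform(X)                   # y = f(x): reduced representation
X_hat = f.inverse_transform(Y)           # f^-1(y): (almost) reconstructs x

print(Y.shape)                           # (500, 5): smaller storage, less redundancy
print(np.mean((X - X_hat) ** 2))         # small average reconstruction error
```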
Example: Only two latent variables
Example 2: Many latent variables, but still fewer than the number of pixels
Deviations from the underlying hyperplane
Principal component analysis
Objective:
• Select the top d < D orthogonal directions that explain the maximum possible variance of the data
Outline:
• Mean-center the data
• Loop:
• Find the direction of maximum variance that is orthogonal to all previously found directions
• Until the desired percentage of variance is captured or the desired number of dimensions is reached (see the sketch after this list)
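A minimal sketch of this greedy loop, using power iteration with deflation; the function name, toy-data handling, and iteration count are assumptions for illustration.

```python
import numpy as np

def greedy_pca_directions(X, d, n_iter=500):
    """Illustrative sketch of the outline above: repeatedly find the direction
    of maximum variance that is orthogonal to previously found directions."""
    Z = X - X.mean(axis=0)                  # mean-center the data
    C = Z.T @ Z / len(Z)                    # sample covariance
    rng = np.random.default_rng(0)
    U = []
    for _ in range(d):
        u = rng.normal(size=C.shape[0])
        for _ in range(n_iter):             # power iteration -> leading eigenvector of C
            u = C @ u
            u /= np.linalg.norm(u)
        U.append(u)
        C -= (u @ C @ u) * np.outer(u, u)   # deflation: remove the variance already captured
    return np.column_stack(U)               # columns are the top-d principal directions
```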
Complete algorithm
• Mean centering: z = x − μ_x
• Covariance: N C = Z^T Z
• Eigendecomposition: C = U Λ U^T ; C u_j = λ_j u_j
• Selection: C_d = U Λ_d U^T = U_d Λ_d U_d^T, with d < D
• Drop dimensions: set the smallest D − d eigenvalues to zero
• Projection: Y = Z U_d
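The same steps as a minimal NumPy sketch; the function name and returned values are assumptions for illustration, but each line mirrors a step of the algorithm above.

```python
import numpy as np

def pca_project(X, d):
    """Sketch of the complete algorithm: mean-center, eigendecompose the
    covariance, keep the top-d eigenvectors, and project."""
    mu = X.mean(axis=0)
    Z = X - mu                               # mean centering: z = x - mu_x
    N = len(Z)
    C = Z.T @ Z / N                          # covariance: N C = Z^T Z
    lam, U = np.linalg.eigh(C)               # eigendecomposition: C = U Lambda U^T
    order = np.argsort(lam)[::-1]            # eigh returns ascending order, so sort descending
    U_d = U[:, order[:d]]                    # keep top-d directions (drop the D - d smallest)
    Y = Z @ U_d                              # projection: Y = Z U_d
    return Y, U_d, lam[order]
```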
Data often lies around a nonlinear manifold
Kernel PCA introduces nonlinearity by mapping data to a feature space
Kernel PCA avoids calculating features directly by using the kernel trick
• Recall PCA: C u_i = λ_i u_i ; N C = Z^T Z
• Substituting features: ∑_n φ(z_n) φ(z_n)^T u_i = N λ_i u_i
• Substitute: u_i = ∑_n a_{i,n} φ(z_n)
• This means: ∑_n k_{l,n} ∑_m a_{i,m} k_{n,m} = N λ_i ∑_n a_{i,n} k_{l,n}
• In matrix form: K² a_i = λ_i N K a_i
• Solve the eigenvalue problem: K a_i = λ_i N a_i
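A minimal sketch of this derivation in NumPy; the RBF kernel, its gamma parameter, and the Gram-matrix centering step (standard in kernel PCA, and needed because the derivation assumes mean-centered features) are assumptions for illustration.

```python
import numpy as np

def kernel_pca_project(X, d, gamma=1.0):
    """Sketch of kernel PCA: build a Gram matrix, center it, solve
    K a_i = N lambda_i a_i, and project onto the top-d components."""
    # RBF kernel matrix (assumed choice): k(x_n, x_m) = exp(-gamma * ||x_n - x_m||^2)
    sq = np.sum(X**2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))

    # Center the Gram matrix, since the derivation assumes mean-centered features
    N = len(X)
    one = np.ones((N, N)) / N
    Kc = K - one @ K - K @ one + one @ K @ one

    lam, A = np.linalg.eigh(Kc)              # eigenvectors a_i with K a_i = (N lambda_i) a_i
    order = np.argsort(lam)[::-1][:d]        # keep the top-d eigenvalues
    A, lam = A[:, order], lam[order]
    A = A / np.sqrt(lam)                     # normalize so each u_i has unit norm in feature space
    return Kc @ A                            # projections of the training points
```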