London
Nov 9, 2010
Overview of Talk
Review principal components
Introduce independent components
[Figure: time-series chart spanning 2006–2007]
Goal:
Fit a linear model that minimizes the squared error between the
model and observations
Imagine regressing many stocks against a single independent variable
PCA finds the independent variable that explains the most on average
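As an illustration of this setup, here is a small simulation: many stocks are generated from one common factor plus noise, then each is regressed against that factor. The sizes, factor, and noise levels are invented for the sketch, not data from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 60, 50                      # 60 months, 50 stocks (hypothetical sizes)
market = rng.normal(0, 4, T)       # a single candidate explanatory series
betas = rng.uniform(0.5, 1.5, N)
Y = np.outer(market, betas) + rng.normal(0, 1, (T, N))  # simulated returns

# Regress each stock on the candidate factor and measure the average fit
b = market @ Y / (market @ market)           # per-stock slopes
resid = Y - np.outer(market, b)
r2 = 1 - resid.var(axis=0) / Y.var(axis=0)
print(f"average R^2 across stocks: {r2.mean():.2f}")
```

A good common factor is one that makes this average explained variance high; PCA searches for the series that maximizes it.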
[Figure: % Return of Stock 1 and Stock 2, raw and centered, Jul-07 through Apr-08]
Regression
SE_i = squared error for observation i = Σ_{t=1..T} [y_i(t) − β_i x(t)]²
Given x and y_i, regression sets β_i to minimize the squared error
β_i* = xᵀy_i / xᵀx
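A quick numerical check of the closed-form slope, on made-up data, compared against NumPy's least-squares solver:

```python
import numpy as np

rng = np.random.default_rng(1)
T = 60
x = rng.normal(size=T)                  # common factor (hypothetical data)
y = 1.3 * x + rng.normal(0, 0.5, T)     # one stock's returns, true slope 1.3

beta = (x @ y) / (x @ x)                # beta* = x^T y / x^T x
beta_lstsq = np.linalg.lstsq(x[:, None], y, rcond=None)[0][0]
assert np.isclose(beta, beta_lstsq)     # same answer as the generic solver
print(beta)
```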
PCA
TSE = total error over all observations = Σ_{i=1..N} SE_i
PCA finds the vector x (and the β_i's) that minimize TSE
To get the next component, repeat on the residuals {y_i − β_i x}
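One way to sketch this procedure: alternate between the two regression steps (fit the β_i's given x, then fit x given the β_i's), then repeat on the residuals. The data and iteration count below are illustrative; the recovered factor should line up with the leading SVD component.

```python
import numpy as np

rng = np.random.default_rng(2)
Y = rng.normal(size=(60, 5)) @ rng.normal(size=(5, 30))  # 60 obs x 30 series

def first_component(Y, iters=200):
    """Alternate between regressing each series on x (the betas) and
    updating x to best fit the data given the betas."""
    x = Y[:, 0].copy()
    for _ in range(iters):
        b = Y.T @ x / (x @ x)           # per-series loadings beta_i
        x = Y @ b / (b @ b)             # best common factor given the loadings
    return x, b

x1, b1 = first_component(Y)
resid = Y - np.outer(x1, b1)            # next component: repeat on residuals
x2, b2 = first_component(resid)

# The factor found this way matches the top SVD component (up to sign)
U, S, Vt = np.linalg.svd(Y, full_matrices=False)
print(abs(np.corrcoef(x1, U[:, 0])[0, 1]))
```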
YᵀY and YYᵀ have the same non-zero eigenvalues and yield the same PCs
i.e. the covariance of 5000 stocks over 60 months can be analyzed as a 60×60 matrix instead of a 5000×5000
u_i = Y v_i / σ_i (where v_i is an eigenvector of the small matrix and σ_i the corresponding singular value)
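A small demonstration of the trick, with 500 series standing in for the 5000 stocks: diagonalize the small 60×60 matrix, then map its eigenvector up to the large one.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 500, 60                      # 500 stocks (standing in for 5000) x 60 months
Y = rng.normal(size=(N, T))

small = Y.T @ Y                     # 60 x 60 instead of 500 x 500
vals, vecs = np.linalg.eigh(small)
i = np.argmax(vals)                 # leading eigenpair of the small matrix
sigma = np.sqrt(vals[i])
u = Y @ vecs[:, i] / sigma          # eigenvector of the big Y Y^T, unit norm

# Check: u is an eigenvector of Y Y^T with the same eigenvalue
assert np.allclose(Y @ (Y.T @ u), vals[i] * u)
print(np.linalg.norm(u))
```

This works because YYᵀ(Yv) = Y(YᵀY)v = λ(Yv), and dividing by σ = √λ restores unit length.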
PCA (cont.)
2 views of PCA
Model in operation
Person looks into camera
Compare the image's eigenface loadings to the reference photos
If close enough, accept as a match
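A toy sketch of this pipeline; the image sizes, reference set, component count, and threshold are all hypothetical choices, and random vectors stand in for real photos.

```python
import numpy as np

rng = np.random.default_rng(4)
# Hypothetical data: 20 reference "photos", each a flattened 32x32 image
refs = rng.normal(size=(20, 1024))
mean = refs.mean(axis=0)
# Eigenfaces: top principal components of the centered reference set
U, S, Vt = np.linalg.svd(refs - mean, full_matrices=False)
eigenfaces = Vt[:10]                         # keep 10 components

def loadings(img):
    return eigenfaces @ (img - mean)         # project onto the eigenfaces

ref_loadings = np.array([loadings(r) for r in refs])

def match(img, threshold=10.0):
    """Accept if the closest reference in loading space is near enough."""
    d = np.linalg.norm(ref_loadings - loadings(img), axis=1)
    j = int(np.argmin(d))
    return (j, bool(d[j] <= threshold))

# A slightly noisy version of reference 3 should come back as a match
noisy = refs[3] + rng.normal(0, 0.1, 1024)
print(match(noisy))
```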
Principal Components
Independent Components
Goal
Notions of independence:
1. Nonlinear decorrelation
2. Non-Gaussianity
Model is Y = AS
where the columns of S are independent
A measure of non-Gaussianity:
Kurtosis
Entropy
Mutual Information
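A quick numerical check of how excess kurtosis separates distributions (sample sizes and distributions chosen for illustration):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 100_000

def excess_kurtosis(x):
    x = (x - x.mean()) / x.std()
    return (x ** 4).mean() - 3           # zero for a Gaussian

k_norm = excess_kurtosis(rng.normal(size=n))    # ~0: Gaussian
k_lap = excess_kurtosis(rng.laplace(size=n))    # ~3: super-Gaussian (peaked)
k_uni = excess_kurtosis(rng.uniform(size=n))    # ~-1.2: sub-Gaussian (flat)
print(k_norm, k_lap, k_uni)
```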
Using 3 weighings of a balance, can you identify the odd coin and whether it's heavier or lighter?
Each use of the balance returns 1 of 3 values: =, <, >
24 equally likely configurations (12 coin positions × 2 states)
Entropy = −Σ_x p(x) log₃ p(x) = −Σ_{i=1..24} (1/24) log₃(1/24) = 2.89
It would be impossible if entropy > # of weighings
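The arithmetic above can be checked directly:

```python
import math

configs = 24                  # 12 coin positions x 2 states (heavy or light)
p = 1 / configs
# Measure entropy in trits: each weighing yields one of 3 outcomes (=, <, >)
H = -sum(p * math.log(p, 3) for _ in range(configs))
print(round(H, 2))            # 2.89 < 3, so 3 weighings can suffice; 2 cannot
```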
A measure of non-Gaussianity:
Negentropy
More robust than kurtosis, which is sensitive to outliers
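For reference, negentropy (as defined in Hyvärinen et al., stated here for completeness rather than copied from the slide) compares the entropy of y with that of a Gaussian of the same variance:

```latex
J(y) = H(y_{\mathrm{gauss}}) - H(y) \ge 0,
\qquad J(y) = 0 \iff y \text{ is Gaussian}
```

Since the Gaussian has maximal entropy among distributions of a given variance, J is always non-negative and measures distance from Gaussianity.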
Mutual information
Minimize the mutual information among the signals
dep(M) = mutual information of the columns of M
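In the standard formulation (e.g. Hyvärinen et al.), the mutual information of a set of signals is the gap between the sum of marginal entropies and the joint entropy, and it vanishes exactly when the signals are independent:

```latex
I(y_1,\dots,y_N) = \sum_{i=1}^{N} H(y_i) - H(y_1,\dots,y_N)
```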
Summary
Incredibly clever and powerful tool for extracting information
Fundamental: its results can be motivated from many different starting points
References
Hyvärinen, A., J. Karhunen, and E. Oja. Independent Component Analysis. 2001.
Stone, James. Independent Component Analysis: A Tutorial Introduction. 2004.
Bishop, Christopher. Pattern Recognition and Machine Learning. 2007.