L6 7 Discriminant Analysis PCA
SMA 2437
MULTIVARIATE METHODS: DISCRIMINANT ANALYSIS
Lecture 6 and 7
@tkaranjah 2023
01/03/2023
The discriminant function is the linear combination z = a'y of the observation vector y whose coefficient vector a is chosen so that the separation between the two group means, relative to the within-group variability, is maximized.
Discriminant Analysis

The values of the projected points are found by calculating z for each observation vector y in the two groups. The results are given in Table 8.2, where the separation provided by the discriminant function is clearly evident.

PRINCIPAL COMPONENT ANALYSIS
We can construct p linear combinations

Y1 = a'1 X = a11 X1 + a12 X2 + ... + a1p Xp
Y2 = a'2 X = a21 X1 + a22 X2 + ... + a2p Xp
...
Yp = a'p X = ap1 X1 + ap2 X2 + ... + app Xp

It is easy to show that

Var(Yi) = a'i Σ ai,    i = 1, ..., p
Cov(Yi, Yk) = a'i Σ ak,    i, k = 1, ..., p

The principal components are those uncorrelated linear combinations Y1, ..., Yp whose variances are as large as possible. Thus the first principal component is the linear combination of maximum variance, i.e., we wish to solve the nonlinear optimization problem

max_{a1}  a'1 Σ a1        (Σ is the source of the nonlinearity)
s.t.  a'1 a1 = 1          (restricts to coefficient vectors of unit length)

The second principal component is the linear combination of maximum variance that is uncorrelated with the first principal component, i.e., we wish to solve the nonlinear optimization problem

max_{a2}  a'2 Σ a2
s.t.  a'2 a2 = 1
      a'1 Σ a2 = 0        (restricts the covariance to zero)

The third principal component is the solution to the nonlinear optimization problem

max_{a3}  a'3 Σ a3
s.t.  a'3 a3 = 1
      a'1 Σ a3 = 0        (restricts the covariances to zero)
      a'2 Σ a3 = 0

In general, the i-th principal component solves

max_{ai}  a'i Σ ai
s.t.  a'i ai = 1
      a'k Σ ai = 0,   k < i
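As a minimal sketch (not part of the original lecture, and assuming NumPy), the constrained maximization above is solved by the eigenvectors of Σ: the i-th principal component coefficient vector ai is the eigenvector belonging to the i-th largest eigenvalue, and Var(Yi) = λi. Using the 3×3 covariance matrix from the worked example later in the lecture:

```python
import numpy as np

# Covariance matrix from the lecture's worked example
Sigma = np.array([[2.00, 3.33, 1.33],
                  [3.33, 8.00, 4.67],
                  [1.33, 4.67, 7.00]])

# For a symmetric matrix, eigh returns eigenvalues in ascending order;
# reorder so the largest eigenvalue (first component) comes first.
eigvals, eigvecs = np.linalg.eigh(Sigma)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

a1 = eigvecs[:, 0]                 # first PC coefficient vector
print(a1 @ a1)                     # unit-length constraint a'1 a1 = 1
print(a1 @ Sigma @ a1, eigvals[0]) # Var(Y1) = a'1 Σ a1 = λ1
```

The unit-length constraint and the zero-covariance constraints are satisfied automatically because eigenvectors of a symmetric matrix are orthonormal.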
The principal components preserve the total population variance:

∑_{i=1}^p Var(Xi) = λ1 + ... + λp = ∑_{i=1}^p Var(Yi) = ∑_{i=1}^p λi
Example: Suppose we have the following population of four observations made on three random variables X1, X2, and X3:

X1    X2    X3
1.0   6.0   9.0
4.0  12.0  10.0
3.0  12.0  15.0
4.0  10.0  12.0

First we need the covariance matrix:

      2.00  3.33  1.33
Σ =   3.33  8.00  4.67
      1.33  4.67  7.00

and the corresponding eigenvalue-eigenvector pairs:
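This covariance matrix can be reproduced directly from the four observations (a quick check, not from the lecture; it assumes NumPy, whose `np.cov` uses the n − 1 divisor that matches the entries above):

```python
import numpy as np

# The four observations from the example, one row per observation
X = np.array([[1.0,  6.0,  9.0],
              [4.0, 12.0, 10.0],
              [3.0, 12.0, 15.0],
              [4.0, 10.0, 12.0]])

# rowvar=False: columns are variables; default divisor is n - 1
Sigma = np.cov(X, rowvar=False)
print(np.round(Sigma, 2))
# → [[2.   3.33 1.33]
#    [3.33 8.   4.67]
#    [1.33 4.67 7.  ]]
```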
The proportion of total population variance due to each principal component is λi / (λ1 + ... + λp).

Next we obtain the correlations between the original random variables Xi and the principal components Yi:
We can display these results in a correlation matrix:

        X1          X2          X3
Y1   0.5290853   0.3337024   0.3185683
Y2  -0.3814819  -0.1104624   0.2028517
Y3   0.2730207  -0.0379573   0.0149166

- the first principal component (Y1) is a mixture of all three random variables (X1, X2, and X3)
- the second principal component (Y2) is a trade-off between X1 and X3

We could standardize the variables X1, X2, and X3, then work with the resulting covariance matrix, but it is much easier to proceed directly with the correlation matrix:

      1.000  0.833  0.356
ρ =   0.833  1.000  0.624
      0.356  0.624  1.000

and the corresponding eigenvalue-eigenvector pairs. For example, the proportion of total variance due to the third principal component is

λ3 / ∑_{i=1}^3 λi = 0.1624235 / 3 = 0.054141167

and the correlation between the second component and the first standardized variable is

ρ(Y2, Z1) = e12 √λ2 = (-0.5449250) √0.6226418 = -0.429987538
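The correlation-matrix route can be sketched as follows (an illustration assuming NumPy, not part of the lecture; note that for a correlation matrix the eigenvalues sum to p, here 3, so the proportion of variance per component is simply λi / p):

```python
import numpy as np

# Correlation matrix from the example (entries rounded as in the lecture)
rho = np.array([[1.000, 0.833, 0.356],
                [0.833, 1.000, 0.624],
                [0.356, 0.624, 1.000]])

eigvals, eigvecs = np.linalg.eigh(rho)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]  # descending order

# Proportion of total variance per component; trace(rho) = p = 3
prop = eigvals / eigvals.sum()

# Correlation between component Y_i and standardized variable Z_j:
# rho(Y_i, Z_j) = e_{ji} * sqrt(lambda_i)
loadings = eigvecs * np.sqrt(eigvals)
print(np.round(prop, 4))
```

Because the printed ρ is rounded to three decimals, the computed eigenvalues will agree with the lecture's values (e.g. λ3 ≈ 0.1624) only approximately.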
Interpretation of components:

• If you multiply one variable by a scalar you get different results when using a covariance matrix
• However, the correlation matrix is invariant to scale
• PCA should therefore be applied to data that have approximately the same scale in each variable

Assess the weight of each of the original variables in the component; for example, suppose we have five original variables and the first principal component is

Y1 = 0.897 X1 + 0.023 X2 − 0.768 X3 + 0.169 X4 − 0.324 X5

Then X1 and X3 have the highest weights and are the most important variables in the component.

Another way of assessing the relationship is through the correlation of the original variable Xj and the component Yi:

r_ij = a_ij λ_i / (√λ_i √σ_jj) = a_ij √λ_i / √σ_jj
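The correlation formula r_ij = a_ij √λ_i / √σ_jj can be computed for all components and variables at once. A short sketch (assuming NumPy, and reusing the covariance matrix from the earlier example; not part of the original lecture):

```python
import numpy as np

# Covariance matrix from the lecture's worked example
Sigma = np.array([[2.00, 3.33, 1.33],
                  [3.33, 8.00, 4.67],
                  [1.33, 4.67, 7.00]])

lam, A = np.linalg.eigh(Sigma)
lam, A = lam[::-1], A[:, ::-1]   # descending eigenvalues; columns of A are the a_i

# r[i, j] = A[j, i] * sqrt(lam[i]) / sqrt(Sigma[j, j])
r = (A * np.sqrt(lam)).T / np.sqrt(np.diag(Sigma))
print(np.round(r, 4))
```

Each column j of r contains the correlations of variable Xj with all p components; since the components jointly account for all of Var(Xj), the squared entries in each column sum to 1.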
END OF LECTURE 6