Projection Matrices
2.1 Definition
Definition 2.1 Let x ∈ E^n = V ⊕ W. Then x can be uniquely decomposed into
x = x_1 + x_2 (where x_1 ∈ V and x_2 ∈ W).
The transformation that maps x into x_1 is called the projector onto V along W and is denoted by φ. This is a linear transformation; that is,
φ(a_1 y_1 + a_2 y_2) = a_1 φ(y_1) + a_2 φ(y_2) (2.1)
for any y_1, y_2 ∈ E^n. This implies that it can be represented by a matrix. This matrix is called a projection matrix and is denoted by P_{V·W}. The vector transformed by P_{V·W} (that is, x_1 = P_{V·W} x) is called the projection (or the projection vector) of x onto V along W.
Theorem 2.1 The necessary and sufficient condition for a square matrix P of order n to be the projection matrix onto V = Sp(P) along W = Ker(P) is given by
P^2 = P. (2.2)
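To make Theorem 2.1 concrete, here is a small numerical check (a sketch using NumPy; the particular 2 × 2 matrix is a made-up example, not from the text): an idempotent P acts as the identity on Sp(P) and annihilates Ker(P).

```python
import numpy as np

# A hypothetical oblique projector in E^2:
# P fixes (1, 0)' (so Sp(P) = Sp((1,0)')) and kills (1, 1)' (so Ker(P) = Sp((1,1)')).
P = np.array([[1.0, -1.0],
              [0.0,  0.0]])

assert np.allclose(P @ P, P)                               # idempotent: P^2 = P
assert np.allclose(P @ np.array([1.0, 0.0]), [1.0, 0.0])   # identity on Sp(P)
assert np.allclose(P @ np.array([1.0, 1.0]), [0.0, 0.0])   # zero on Ker(P)
```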
Lemma 2.1 Let P be a square matrix of order n, and assume that (2.2) holds. Then
E^n = Sp(P) ⊕ Ker(P) (2.3)
(H. Yanai et al., Projection Matrices, Generalized Inverse Matrices, and Singular Value Decomposition, Statistics for Social and Behavioral Sciences, DOI 10.1007/978-1-4419-9887-3_2, © Springer Science+Business Media, LLC 2011)
and
Ker(P) = Sp(I_n − P). (2.4)
Proof of Lemma 2.1. (2.3): Let x ∈ Sp(P) and y ∈ Ker(P). From x = Pa, we have Px = P^2 a = Pa = x and Py = 0. Hence, from x + y = 0 ⇒ Px + Py = 0, we obtain Px = x = 0 ⇒ y = 0. Thus, Sp(P) ∩ Ker(P) = {0}. On the other hand, from dim(Sp(P)) + dim(Ker(P)) = rank(P) + (n − rank(P)) = n, we have E^n = Sp(P) ⊕ Ker(P).
(2.4): We have Px = 0 ⇒ x = (I_n − P)x ⇒ Ker(P) ⊂ Sp(I_n − P) on the one hand, and P(I_n − P) = P − P^2 = O ⇒ Sp(I_n − P) ⊂ Ker(P) on the other. Thus, Ker(P) = Sp(I_n − P). Q.E.D.
(This is the necessity argument of Theorem 2.1: for any x ∈ E^n, the projection y = Px satisfies
P(Px) = Py = y = Px ⟹ P^2 x = Px ⟹ P^2 = P.)
P_{V·W} x + P_{W·V} x = (P_{V·W} + P_{W·V}) x. (2.5)
Because the equation above has to hold for any x ∈ E^n, it must hold that
I_n = P_{V·W} + P_{W·V}.
Setting Q = I_n − P, we have
PQ = P(I_n − P) = P − P^2 = O, (2.6)
implying that Sp(Q) constitutes the null space of P (i.e., Sp(Q) = Ker(P )).
Similarly, QP = O, implying that Sp(P ) constitutes the null space of Q
(i.e., Sp(P ) = Ker(Q)).
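The complementary-projector relations PQ = QP = O of (2.6) can be checked numerically as well (again a NumPy sketch with a hypothetical 2 × 2 projector):

```python
import numpy as np

# Hypothetical projector P onto V = Sp((1,0)') along W = Sp((1,1)') in E^2.
P = np.array([[1.0, -1.0],
              [0.0,  0.0]])
Q = np.eye(2) - P   # projector onto W along V

assert np.allclose(P + Q, np.eye(2))         # I_n = P_{V.W} + P_{W.V}
assert np.allclose(Q @ Q, Q)                 # Q is itself idempotent
assert np.allclose(P @ Q, np.zeros((2, 2)))  # PQ = O, eq. (2.6)
assert np.allclose(Q @ P, np.zeros((2, 2)))  # QP = O
```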
Example 2.1 In Figure 2.1, the vector OA indicates the projection of z onto Sp(x) along Sp(y) (that is, OA = P_{Sp(x)·Sp(y)} z), where P_{Sp(x)·Sp(y)} indicates the projection matrix onto Sp(x) along Sp(y). Clearly, OB = (I_2 − P_{Sp(x)·Sp(y)}) z.
[Figure 2.1: the projection OA = P_{{x}·{y}} z of z onto Sp(x) = {x} along Sp(y) = {y}, with OB the complementary component. Figure 2.2: the projection OA = P_{V·{y}} z of z onto V = {α_1 x_1 + α_2 x_2} along Sp(y) = {y}.]
Theorem 2.3 The necessary and sufficient condition for a square matrix P of order n to be a projector onto V of dimensionality r (dim(V) = r) is given by
P = T Δ_r T^{-1}, (2.8)
where T is a square nonsingular matrix of order n and
Δ_r = \begin{pmatrix} I_r & O \\ O & O \end{pmatrix},
that is, the diagonal matrix of order n with r ones followed by n − r zeros on the diagonal.
Thus, we obtain
Px = x ⟹ PT(α′, 0′)′ = T(α′, 0′)′ = T Δ_r (α′, 0′)′,
Py = 0 ⟹ PT(0′, β′)′ = 0 = T Δ_r (0′, β′)′.
Adding the two equations above, we obtain
PT(α′, β′)′ = T Δ_r (α′, β′)′.
Since (α′, β′)′ is an arbitrary vector in the n-dimensional space E^n, it follows that
PT = T Δ_r ⟹ P = T Δ_r T^{-1}.
Furthermore, T can be an arbitrary nonsingular matrix since V = Sp(A)
and W = Sp(B) such that E n = V ⊕ W can be chosen arbitrarily.
(Sufficiency) P is a projection matrix, since P 2 = P , and rank(P ) = r
from Theorem 2.1. (Theorem 2.2 can also be used to prove the theorem
above.) Q.E.D.
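Theorem 2.3 suggests a simple recipe for generating projectors of a prescribed rank: pick any nonsingular T and form T Δ_r T^{-1}. A NumPy sketch (the random T is illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
n, r = 4, 2

# Any nonsingular T together with Delta_r = diag(1,...,1,0,...,0) yields a projector.
T = rng.standard_normal((n, n))                 # nonsingular with probability 1
Delta_r = np.diag([1.0] * r + [0.0] * (n - r))
P = T @ Delta_r @ np.linalg.inv(T)

assert np.allclose(P @ P, P)          # P^2 = P (Theorem 2.1)
assert np.linalg.matrix_rank(P) == r  # dim(V) = r
```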
P^2 = P, (2.10)
Corollary
P^2 = P ⟺ Ker(P) = Sp(I_n − P). (2.13)
Proof. (⇒): It is clear from Lemma 2.1.
(⇐): Ker(P) = Sp(I_n − P) ⟺ P(I_n − P) = O ⟹ P^2 = P. Q.E.D.
(x, Py) = (Px + x_2, Py) = (Px, Py)
= (Px, Py + y_2) = (Px, y) = (x, P′y)
[Figure: orthogonal projection. y is decomposed into Py ∈ Sp(A) = Sp(P) and Qy ∈ Sp(A)^⊥ = Sp(Q), where Q = I − P.]
Note A projection matrix that does not satisfy P′ = P is called an oblique projector, as opposed to an orthogonal projector.
Example 2.3 Let 1_n = (1, 1, ···, 1)′ (the vector with n ones). Let P_M denote the orthogonal projector onto V_M = Sp(1_n). Then
P_M = 1_n (1_n′ 1_n)^{-1} 1_n′ = (1/n) 1_n 1_n′, (2.16)
the n × n matrix whose every element is 1/n.
The orthogonal projector onto V_M^⊥ = Sp(1_n)^⊥, the ortho-complement subspace of Sp(1_n), is given by
I_n − P_M = \begin{pmatrix} 1 − 1/n & −1/n & \cdots & −1/n \\ −1/n & 1 − 1/n & \cdots & −1/n \\ \vdots & \vdots & \ddots & \vdots \\ −1/n & −1/n & \cdots & 1 − 1/n \end{pmatrix}. (2.17)
Let
QM = I n − P M . (2.18)
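Given the book's statistical orientation, it may help to note that Q_M is the familiar centering matrix: P_M replaces every element of a vector by the mean, and Q_M subtracts the mean. A NumPy sketch (the data vector is hypothetical):

```python
import numpy as np

n = 5
ones = np.ones((n, 1))
P_M = ones @ ones.T / n   # eq. (2.16): every entry equals 1/n
Q_M = np.eye(n) - P_M     # eq. (2.18): the centering matrix

x = np.array([3.0, 1.0, 4.0, 1.0, 5.0])
assert np.allclose(P_M @ x, np.full(n, x.mean()))  # replaces x by its mean
assert np.isclose((Q_M @ x).sum(), 0.0)            # centered scores sum to zero
assert np.allclose(P_M, P_M.T) and np.allclose(Q_M, Q_M.T)  # both symmetric
```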
Clearly, P_M and Q_M are both symmetric, and P_M Q_M = Q_M P_M = O.
E n = V1 ⊕ W1 = V2 ⊕ W2 , (2.21)
V1 + (W1 ∩ W2 ) = (V1 + W1 ) ∩ W2 = E n ∩ W2 = W2 .
Corollary When V1 ⊂ V2 or W2 ⊂ W1 ,
Proof. In the proof of Lemma 2.3, exchange the roles of W2 and V2 . Q.E.D.
Theorem 2.7 Let P 1 and P 2 denote the projection matrices onto V1 along
W1 and onto V2 along W2 , respectively. Then the following three statements
are equivalent:
(i) P 1 + P 2 is the projector onto V1 ⊕ V2 along W1 ∩ W2 .
(ii) P 1 P 2 = P 2 P 1 = O.
(iii) V1 ⊂ W2 and V2 ⊂ W1 . (In this case, V1 and V2 are disjoint spaces.)
Proof. (i) → (ii): From (P_1 + P_2)^2 = P_1 + P_2, P_1^2 = P_1, and P_2^2 = P_2, we obtain P_1 P_2 + P_2 P_1 = O.
Theorem 2.9 When the decompositions in (2.21) and (2.22) hold, and if
P_1 P_2 = P_2 P_1, (2.24)
then P_1 P_2 is the projector onto V_1 ∩ V_2 along W_1 ⊕ W_2.
Proof. P 1 P 2 = P 2 P 1 implies (P 1 P 2 )2 = P 1 P 2 P 1 P 2 = P 21 P 22 = P 1 P 2 ,
indicating that P 1 P 2 is a projection matrix. On the other hand, let x ∈
V1 ∩ V2 . Then, P 1 (P 2 x) = P 1 x = x. Furthermore, let x ∈ W1 + W2
and x = x1 + x2 , where x1 ∈ W1 and x2 ∈ W2 . Then, P 1 P 2 x =
P 1 P 2 x1 +P 1 P 2 x2 = P 2 P 1 x1 +0 = 0. Since E n = (V1 ∩V2 )⊕(W1 ⊕W2 ) by
the corollary to Lemma 2.3, we know that P 1 P 2 is the projector onto V1 ∩V2
along W1 ⊕W2 . Q.E.D.
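Theorem 2.9 can be illustrated with commuting coordinate projectors (a NumPy sketch; the subspaces are toy choices, not from the text):

```python
import numpy as np

# Orthogonal projectors onto coordinate subspaces of E^3:
P1 = np.diag([1.0, 1.0, 0.0])   # onto V1 = Sp(e1, e2)
P2 = np.diag([0.0, 1.0, 1.0])   # onto V2 = Sp(e2, e3)

assert np.allclose(P1 @ P2, P2 @ P1)   # commutativity, eq. (2.24)
P12 = P1 @ P2
assert np.allclose(P12, np.diag([0.0, 1.0, 0.0]))  # projector onto V1 ∩ V2 = Sp(e2)
assert np.allclose(P12 @ P12, P12)                 # and it is idempotent
```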
Note Using the theorem above, (ii) → (i) in Theorem 2.7 can also be proved as
follows: From P 1 P 2 = O
Q_1 Q_2 = (I_n − P_1)(I_n − P_2) = I_n − P_1 − P_2 + P_1 P_2 = I_n − P_1 − P_2 = Q_2 Q_1.
E^n = V_1 ⊕ V_2 ⊕ ··· ⊕ V_m. (2.25)
P_1 + P_2 + ··· + P_m = I_n. (2.26)
P_i P_j = O (i ≠ j). (2.27)
P_i^2 = P_i (i = 1, ···, m). (2.28)
rank(P_1) + rank(P_2) + ··· + rank(P_m) = n. (2.29)
P_1 P_i + P_2 P_i + ··· + P_i(P_i − I_n) + ··· + P_m P_i = O.
Since Sp(P 1 ), Sp(P 2 ), · · · , Sp(P m ) are disjoint, (2.27) and (2.28) hold from
Theorem 1.4. Q.E.D.
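Conditions (2.26) through (2.29) can be verified numerically for the simplest direct-sum decomposition, into the coordinate axes (a NumPy sketch):

```python
import numpy as np

# A direct-sum decomposition E^3 = V1 ⊕ V2 ⊕ V3 via coordinate projectors.
P = [np.diag([1.0, 0.0, 0.0]),
     np.diag([0.0, 1.0, 0.0]),
     np.diag([0.0, 0.0, 1.0])]

assert np.allclose(sum(P), np.eye(3))                    # eq. (2.26)
assert all(np.allclose(Pi @ Pj, 0.0)
           for i, Pi in enumerate(P)
           for j, Pj in enumerate(P) if i != j)          # eq. (2.27)
assert all(np.allclose(Pi @ Pi, Pi) for Pi in P)         # eq. (2.28)
assert sum(np.linalg.matrix_rank(Pi) for Pi in P) == 3   # eq. (2.29)
```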
Note Let V_{(i)} = V_1 ⊕ ··· ⊕ V_{i−1} ⊕ V_{i+1} ⊕ ··· ⊕ V_m. Then E^n = V_i ⊕ V_{(i)}. Let P_{i·(i)} denote the projector onto V_i along V_{(i)}. This matrix coincides with the P_i that satisfies the four equations given in (2.26) through (2.29).
Corollary 2 Let P_{(i)·i} denote the projector onto V_{(i)} along V_i. Then the following relation holds:
P_{(i)·i} = I_n − P_{i·(i)}.
Note The projection matrix P_{i·(i)} onto V_i along V_{(i)} is uniquely determined. Assume that there are two possible representations, P_{i·(i)} and P*_{i·(i)}. Then, for any x ∈ E^n,
x = P_{1·(1)} x + ··· + P_{m·(m)} x = P*_{1·(1)} x + ··· + P*_{m·(m)} x,
from which
(P_{1·(1)} − P*_{1·(1)}) x + (P_{2·(2)} − P*_{2·(2)}) x + ··· + (P_{m·(m)} − P*_{m·(m)}) x = 0.
Each term in the equation above belongs to one of the respective subspaces V_1, V_2, ···, V_m, which are mutually disjoint. Hence, from Theorem 1.4, we obtain P_{i·(i)} = P*_{i·(i)}. This indicates that when a direct-sum decomposition of E^n is given, the identity matrix I_n of order n is decomposed accordingly, and the projection matrices that constitute the decomposition are uniquely determined.
P = P_1 + P_2 + ··· + P_m. (2.35)
Proof. That (i) and (ii) imply (iii) is obvious. To show that (ii) and (iii) imply (iv), we may use
P^2 = P_1^2 + P_2^2 + ··· + P_m^2 and P^2 = P,
P_1 + P_2 + ··· + P_m + (I_n − P) = I_n
E^n = V_k ⊃ V_{k−1} ⊃ ··· ⊃ V_2 ⊃ V_1 = {0},
Note The theorem above does not presuppose that P_i is an orthogonal projector. However, if W_i = V_i^⊥, P_i and P*_i are orthogonal projectors. The latter, in particular, is the orthogonal projector onto V_i ∩ V_{i−1}^⊥.
We first consider the case in which there are two direct-sum decomposi-
tions of E n , namely
E n = V1 ⊕ W1 = V2 ⊕ W2 ,
as given in (2.21). Let V12 = V1 ∩ V2 denote the product space between V1
and V2 , and let V3 denote a complement subspace to V1 + V2 in E n . Fur-
thermore, let P 1+2 denote the projection matrix onto V1+2 = V1 + V2 along
V3 , and let P j (j = 1, 2) represent the projection matrix onto Vj (j = 1, 2)
along Wj (j = 1, 2). Then the following theorem holds.
Theorem 2.16 (i) The necessary and sufficient condition for P 1+2 = P 1 +
P 2 − P 1 P 2 is
(V1+2 ∩ W2 ) ⊂ (V1 ⊕ V3 ). (2.36)
(ii) The necessary and sufficient condition for P 1+2 = P 1 + P 2 − P 2 P 1 is
(V1+2 ∩ W1 ) ⊂ (V2 ⊕ V3 ). (2.37)
Proof. (i): Since V1+2 ⊃ V1 and V1+2 ⊃ V2 , P 1+2 − P 1 is the projector
onto V1+2 ∩ W1 along V1 ⊕ V3 by Theorem 2.8. Hence, P 1+2 P 1 = P 1 and
P 1+2 P 2 = P 2 . Similarly, P 1+2 − P 2 is the projector onto V1+2 ∩ W2 along
V2 ⊕ V3 . Hence, by Theorem 2.8,
P 1+2 − P 1 − P 2 + P 1 P 2 = O ⇐⇒ (P 1+2 − P 1 )(P 1+2 − P 2 ) = O.
Furthermore,
(P 1+2 − P 1 )(P 1+2 − P 2 ) = O ⇐⇒ (V1+2 ∩ W2 ) ⊂ (V1 ⊕ V3 ).
(ii): Similarly, P 1+2 − P 1 − P 2 + P 2 P 1 = O ⇐⇒ (P 1+2 − P 2 )(P 1+2 −
P 1 ) = O ⇐⇒ (V1+2 ∩ W1 ) ⊂ (V2 ⊕ V3 ). Q.E.D.
Corollary Assume that the decomposition (2.21) holds. The necessary and
sufficient condition for P 1 P 2 = P 2 P 1 is that both (2.36) and (2.37) hold.
The following theorem can readily be derived from the theorem above.
Theorem 2.17 Let E n = (V1 +V2 )⊕V3 , V1 = V11 ⊕V12 , and V2 = V22 ⊕V12 ,
where V12 = V1 ∩ V2 . Let P ∗1+2 denote the projection matrix onto V1 + V2
along V3 , and let P ∗1 and P ∗2 denote the projectors onto V1 along V3 ⊕ V22
and onto V2 along V3 ⊕ V11 , respectively. Then,
P ∗1 P ∗2 = P ∗2 P ∗1 (2.38)
and
P ∗1+2 = P ∗1 + P ∗2 − P ∗1 P ∗2 . (2.39)
Proof. Since V11 ⊂ V1 and V22 ⊂ V2 , we obtain
V1+2 ∩ W2 = V11 ⊂ (V1 ⊕ V3 ) and V1+2 ∩ W1 = V22 ⊂ (V2 ⊕ V3 )
by setting W1 = V22 ⊕ V3 and W2 = V11 ⊕ V3 in Theorem 2.16.
Another proof. Let y = y 1 + y 2 + y 12 + y 3 ∈ E n , where y 1 ∈ V11 ,
y 2 ∈ V22 , y 12 ∈ V12 , and y 3 ∈ V3 . Then it suffices to show that (P ∗1 P ∗2 )y =
(P ∗2 P ∗1 )y. Q.E.D.
Corollary P 1+2 = P 1 + P 2 − P 1 P 2 ⇐⇒ P 1 P 2 = P 2 P 1 .
P_{1+(2∩3)} = P_1 + P_2 P_3 − P_1 P_2 P_3
respectively, and so P_{1+2} P_{1+3} = P_{1+3} P_{1+2} holds. Hence, the orthogonal projector onto (V_1 + V_2) ∩ (V_1 + V_3) is given by
(P_1 + P_2 − P_1 P_2)(P_1 + P_3 − P_1 P_3) = P_1 + P_2 P_3 − P_1 P_2 P_3,
The three identities from (2.40) to (2.42) indicate the distributive law of
subspaces, which holds only if the commutativity of orthogonal projectors
holds.
We now present a theorem on the decomposition of the orthogonal pro-
jectors defined on the sum space V1 + V2 + V3 of V1 , V2 , and V3 .
The necessary and sufficient condition for
P_{1+2+3} = P_1 + P_2 + P_3 − P_1 P_2 − P_2 P_3 − P_3 P_1 + P_1 P_2 P_3 (2.43)
to hold is
P_1 P_2 = P_2 P_1, P_2 P_3 = P_3 P_2, and P_1 P_3 = P_3 P_1. (2.44)
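Formula (2.43) under the commutativity conditions (2.44) can be checked with diagonal, hence mutually commuting, orthogonal projectors (a NumPy sketch with toy subspaces):

```python
import numpy as np

# Mutually commuting orthogonal projectors in E^4 (toy example):
P1 = np.diag([1.0, 1.0, 0.0, 0.0])
P2 = np.diag([0.0, 1.0, 1.0, 0.0])
P3 = np.diag([0.0, 0.0, 1.0, 1.0])

# Here V1 + V2 + V3 = E^4, so the projector onto the sum space is I_4.
lhs = np.eye(4)
rhs = (P1 + P2 + P3
       - P1 @ P2 - P2 @ P3 - P3 @ P1
       + P1 @ P2 @ P3)
assert np.allclose(lhs, rhs)   # eq. (2.43) under (2.44)
```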
P_{1̃} = P_1 − P_1 P_2 − P_1 P_3 + P_1 P_2 P_3,
P_{2̃} = P_2 − P_2 P_3 − P_1 P_2 + P_1 P_2 P_3,
P_{3̃} = P_3 − P_1 P_3 − P_2 P_3 + P_1 P_2 P_3,
P_{12(3)} = P_1 P_2 − P_1 P_2 P_3,
P_{13(2)} = P_1 P_3 − P_1 P_2 P_3,
P_{23(1)} = P_2 P_3 − P_1 P_2 P_3,
and
P_{123} = P_1 P_2 P_3.
Then,
Lemma 2.4
and
[Q_2 P_1, P_2] = [P_1, P_2] \begin{bmatrix} I_n & O \\ −P_1 & I_n \end{bmatrix} = [P_1, P_2] T.
which implies
V1 + V2 = Sp(P 1 , Q1 P 2 ) = Sp(Q2 P 1 , P 2 ).
and
V1[2] = {x|x = Q2 y, y ∈ V1 }. (2.52)
Let Q_j = I_n − P_j (j = 1, 2), where P_j is the orthogonal projector onto V_j, and let P*, P*_1, P*_2, P_{1[2]}, and P_{2[1]} denote the projectors onto V_1 + V_2 along W, onto V_1 along V_{2[1]} ⊕ W, onto V_2 along V_{1[2]} ⊕ W, onto V_{1[2]} along V_2 ⊕ W, and onto V_{2[1]} along V_1 ⊕ W, respectively. Then,
P* = P*_1 + P_{2[1]} (2.53)
or
P* = P_{1[2]} + P*_2 (2.54)
holds.
P = P 1 + P 2. (2.55)
(MP)′ = MP. (2.60)
Note The theorem above implies that with an oblique projector P (P^2 = P, but P′ ≠ P) it is possible to have ||Px|| ≥ ||x||. For example, let
P = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} and x = \begin{pmatrix} 1 \\ 1 \end{pmatrix}.
Then ||Px|| = 2 and ||x|| = √2.
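The 2 × 2 example can be replayed numerically (a NumPy sketch):

```python
import numpy as np

P = np.array([[1.0, 1.0],
              [0.0, 0.0]])    # oblique: P^2 = P, but P' != P
x = np.array([1.0, 1.0])

assert np.allclose(P @ P, P)                        # idempotent
assert not np.allclose(P, P.T)                      # not symmetric (oblique)
assert np.isclose(np.linalg.norm(P @ x), 2.0)       # ||Px|| = 2
assert np.isclose(np.linalg.norm(x), np.sqrt(2.0))  # ||x|| = sqrt(2) < ||Px||
```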
The above identity and Theorem 2.23 lead to the following corollary.
Corollary
(i) If V2 ⊂ V1 , tr(X 0 P 2 X) ≤ tr(X 0 P 1 X) ≤ tr(X 0 X).
(ii) Let P denote an orthogonal projector onto an arbitrary subspace in E n .
If V1 ⊃ V2 ,
tr(P 1 P ) ≥ tr(P 2 P ).
Proof. (i): Obvious from Theorem 2.23. (ii): We have tr(P_j P) = tr(P_j P^2) = tr(P P_j P) (j = 1, 2), and (P_1 − P_2)^2 = P_1 − P_2, so that
tr(P_1 P) − tr(P_2 P) = tr(P(P_1 − P_2)^2 P) = ||(P_1 − P_2)P||^2 ≥ 0.

However, (2.65) is more general than (2.66) because √(tr(P_1)tr(P_2)) ≥ min(tr(P_1), tr(P_2)).
Lemma 2.6
||A|| ≥ 0. (2.68)
||CA|| ≤ ||C|| · ||A||, (2.69)
Let both A and B be n by p matrices. Then,
||A + B|| ≤ ||A|| + ||B||. (2.70)
Let U and V be orthogonal matrices of orders n and p, respectively. Then
||U AV || = ||A||. (2.71)
Proof. Relations (2.68) and (2.69) are trivial. Relation (2.70) follows im-
mediately from (1.20). Relation (2.71) is obvious from
tr(V′A′U′UAV) = tr(A′AVV′) = tr(A′A).
Q.E.D.
Note Let M be a symmetric nnd matrix of order n. Then the norm defined in
(2.67) can be generalized as
||A||_M = {tr(A′MA)}^{1/2}. (2.72)
This is called the norm of A with respect to M (sometimes called a metric matrix).
Properties analogous to those given in Lemma 2.6 hold for this generalized norm.
There are other possible definitions of the norm of A. For example,
(i) ||A||_1 = max_j Σ_{i=1}^n |a_{ij}|,
(ii) ||A||_2 = μ_1(A), where μ_1(A) is the largest singular value of A (see Chapter 5), and
(iii) ||A||_3 = max_i Σ_{j=1}^p |a_{ij}|.
All of these norms satisfy (2.68), (2.69), and (2.70). (However, only ||A||_2 satisfies (2.71).)
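The norms mentioned here correspond to standard options of numpy.linalg.norm, which makes them easy to compare on a concrete matrix (a NumPy sketch; the matrix A is arbitrary):

```python
import numpy as np

A = np.array([[1.0, -2.0],
              [3.0,  4.0]])

fro   = np.sqrt(np.trace(A.T @ A))             # eq. (2.67), the Frobenius norm
norm1 = np.abs(A).sum(axis=0).max()            # (i)   max absolute column sum
norm2 = np.linalg.svd(A, compute_uv=False)[0]  # (ii)  largest singular value
norm3 = np.abs(A).sum(axis=1).max()            # (iii) max absolute row sum

assert np.isclose(fro, np.linalg.norm(A, 'fro'))
assert np.isclose(norm1, np.linalg.norm(A, 1))
assert np.isclose(norm2, np.linalg.norm(A, 2))
assert np.isclose(norm3, np.linalg.norm(A, np.inf))
```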
Proof. (2.73): Square both sides and subtract the right-hand side from the
left. Then,
where P B is the orthogonal projector onto Sp(B). The equality holds if and
only if BX = P B A. We also have
or
Note Relations (2.75), (2.76), and (2.77) can also be shown by the least squares method. Here we show this only for (2.77). Differentiating the least squares criterion with respect to Y and setting the result equal to zero, we obtain C(A − BX)′ = CC′Y′ or (A − BX)C′ = YCC′. Postmultiplying the latter by (CC′)^{-1}C′, we obtain (A − BX)P_{C′} = YC. Substituting this into P_B(A − YC) = BX, we obtain P_B A(I_p − P_{C′}) = BX(I_p − P_{C′}) after some simplification. If, on the other hand, BX = P_B(A − YC) is substituted into (A − BX)P_{C′} = YC, we obtain (I_n − P_B)AP_{C′} = (I_n − P_B)YC. (In the derivation above, the regular inverses can be replaced by the respective generalized inverses. See the next chapter.)
P*_j x = x ∀x ∈ V_j (j = 1, ···, m) (2.80)
and
P*_j x = 0 ∀x ∈ V_{(j)} (j = 1, ···, m). (2.81)
x = x_1 + x_2 + ··· + x_m = (P*_1 + P*_2 + ··· + P*_m) x.
P*_i P*_j x = 0 (i ≠ j) and (P*_j)^2 x = P*_j x (j = 1, ···, m) (2.82)
P 1+2+3 = P 1 + P 2 + P 3 − P 1 P 2 − P 2 P 3 − P 1 P 3 + P 1 P 2 P 3 .
6. Show that
Q[A,B] = QA QQA B ,
where Q[A,B] , QA , and QQA B are the orthogonal projectors onto the null space of
[A, B], onto the null space of A, and onto the null space of QA B, respectively.
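Exercise 6 can be checked numerically. The sketch below assumes that the orthogonal projector onto a column space may be formed with the Moore–Penrose inverse (NumPy's pinv); the random matrices A and B are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 2))
B = rng.standard_normal((6, 2))

def orth_proj(X):
    # Orthogonal projector onto Sp(X), via the Moore-Penrose inverse.
    return X @ np.linalg.pinv(X)

Q_A   = np.eye(6) - orth_proj(A)                  # projector onto Ker(A')
Q_AB  = np.eye(6) - orth_proj(np.hstack([A, B]))  # projector onto Ker([A, B]')
Q_QAB = np.eye(6) - orth_proj(Q_A @ B)            # projector onto Ker((Q_A B)')

assert np.allclose(Q_AB, Q_A @ Q_QAB)   # Q_[A,B] = Q_A Q_{Q_A B}
```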