Professional Documents
Culture Documents
M504Lect5 - Eigenvalues of Hermitians Matrices
M504Lect5 - Eigenvalues of Hermitians Matrices
This lecture takes a closer look at Hermitian matrices and at their eigenvalues. After a few
generalities about Hermitian matrices, we prove a minimax and maximin characterization of
their eigenvalues, known as Courant–Fischer theorem. We then derive some consequences
of this characterization, such as Weyl theorem for the sum of two Hermitian matrices, an
interlacing theorem for the sum of two Hermitian matrices, and an interlacing theorem for
principal submatrices of Hermitian matrices.
1
2 Variational characterizations of eigenvalues
We now recall that, according to the spectral theorem, if A ∈ Mn is Hermitian, there exists
a unitary matrix U ∈ Mn and a real diagonal matrix D such that A = U DU ∗ . The diagonal
entries of D are the eigenvalues of A, which we sort as
We utilize this notation for the rest of the lecture, although we may sometimes just write λ↑j
instead of λ↑j (A) when the context is clear. Note also that the columns u1 , . . . , un of U form an
orthonormal system and that Auj = λ↑j (A)uj for all j ∈ [1 : n]. We start by observing that
with the leftmost inequality becoming an equality if x = u1 and the rightmost inequality
becoming an equality if x = un . The argument underlying the observation (1) will reappear
several times (sometimes without explanation), so we spell it out here. It is based on the
expansion of x ∈ Cn on the orthonormal basis (u1 , . . . , un ), i.e.,
n
X n
X
x= cj uj with c2j = kxk22 .
j=1 j=1
The inequalities (1) (with the cases of equality) can also be expressed as λ↑1 = minkxk2 =1 hAx, xi
and λ↑n = maxkxk2 =1 hAx, xi, which is known as Rayleigh–Ritz theorem. It is a particular case
of Courant–Fischer theorem stated below.
(3) λ↑k (A) = min max hAx, xi = max min hAx, xi.
dim(V )=k x∈V dim(V )=n−k+1 x∈V
kxk2 =1 kxk2 =1
Remark. This can also be stated with dim(V ) ≥ k and dim(V ) ≥ n − k + 1, respectively, or
(following the textbook) as
2
Proof of Theorem 3. We only prove the first equality — the second is left as an exercise. To
begin with, we notice that, with U := span[u1 , . . . , uk ], we have
where the last inequality follows from an argument similar to (2). For the inequality in the
other direction, we remark that our objective is to show that for any k-dimensional linear
subspace V of Cn , there is x ∈ V with kxk2 = 1 and hAx, xi ≥ λ↑k (A). Considering the subspace
W := span[uk , . . . , un ] of dimension n − k + 1, we have
Hence we may pick x ∈ V ∪ W with kxk2 = 1. The inequality hAx, xi ≥ λ↑k (A) follows from an
argument similar to (2).
We continue with some applications of Courant–Fischer theorem, starting with Weyl theorem.
This establishes the rightmost inequality. We actually use this result to prove the leftmost
inequality by replacing A with A + B and B with −B, namely
λ↑k (A) = λ↑k (A + B + (−B)) ≤ λ↑k (A + B) + λ↑n (−B) = λ↑k (A + B) − λ↑1 (B).
Weyl theorem turns out to be the particular case k = 1 of Lidskii’s theorem stated below
with the sequences of eigenvalues arranged in nonincreasing order rather than nondecreasing
order.
3
Theorem 6. For Hermitian matrices A, B ∈ Mn , if 1 ≤ j1 < . . . < jk ≤ n, then
k k k
λ↓j` (A λ↓j` (A) λ↓` (B).
X X X
+ B) ≤ +
`=1 `=1 `=1
Proof. Replacing B by B − λ↓k+1 (B)I, we may assume that λ↓k+1 (B) = 0. By the spectral
theorem, there exists a unitary matrix U ∈ Mn such that
B = U diag λ↓1 (B), . . . , λ↓k (B), λ↓k+1 (B), . . . , λ↓n (B) U ∗ .
| {z } | {z }
≥0 ≤0
3 Interlacing theorems
The two results of this section locate the eigenvalues of a matrix derived from a matrix A
relatively to the eigenvalues of A. They are both consequences of Courant–Fischer theorem.
Theorem 7. Let A ∈ Mn be a Hermitian matrix and As be an s × s principal submatrix of A,
s ∈ [1 : n]. Then, for k ∈ [1 : s],
λ↑k (A) ≤ λ↑k (As ) ≤ λ↑k+n−s (A).
Remark. The terminology of interlacing property is particularly suitable in the case of an
(n − 1) × (n − 1) principal submatrix A,
e since we then have
4
Proof of Theorem 7. Suppose that the rows and columns of A kept in As are indexed by a
set S of size s. For a vector x ∈ Cs , we denote by x e ∈ Cn the vector whose entries on S
equal those of x and whose entries outside S equal zero. For a linear subspace V of Cs , we
x, x ∈ V }, which is a subspace of Cn with dimension equal the dimension of V .
define Ve := {e
Given k ∈ [1 : s], Courant–Fischer theorem implies that, for all linear subspace V of Cs with
dim(V ) = k,
max hAs x, xi = max hAe ei = max hAe
x, x ei ≥ λ↑k (A).
x, x
x∈V x∈V e∈Ve
x
kxk2 =1 kxk2 =1 ke
xk2 =1
Taking the minimum over all k-dimensional subspaces V gives λ↑k (As ) ≥ λ↑k (A). Similarly, for
all linear subspace V of Cs with dim(V ) = s − k + 1 = n − (k + n − s) + 1,
min hAs x, xi = min hAe ei = min hAe
x, x ei ≤ λ↑k+n−s (A).
x, x
x∈V x∈V e∈Ve
x
kxk2 =1 kxk2 =1 ke
xk2 =1
Taking the maximum over all (s−k+1)-dimensional subspaces V gives λ↑k (As ) ≤ λ↑k+n−s (A).
Before turning to the proof, we observe that the n×n Hermitian matrices of rank r are exactly
the matrices of the form
X r
(4) B= µj vj vj∗ , µ1 , . . . , µr ∈ R \ {0}, (v1 , . . . , vr ) orthonormal system.
j=1
Indeed, by the spectral theorem, the n × n Hermitian matrices of rank r are exactly the
matrices of the form
B = V diag µ1 , . . . , µr , 0, . . . , 0 V ∗ , µ1 , . . . , µr ∈ R \ {0}, V ∗ V = I,
(5)
which is just another way of writing (4).
= λ↑k+r (A + B).
This establishes the rightmost inequality. This inequality can also be used to establish the
leftmost inequality. Indeed, for k ∈ [1 : k − 2r], we have k + r ∈ [1 : k − r], and it follows that
λ↑k+r (A + B) ≤ λ↑k+r+r (A + B + (−B)) = λ↑k+2r (A).
5
4 Exercises