
Chair for Mathematical Information Science

Verner Vlačić
Sternwartstrasse 7
CH-8092 Zürich

Mathematics of Information
Hilbert spaces and linear operators

These notes are based on [1, Chap. 6, Chap. 8], [2, Chap. 1], [3, Chap. 4], [4, App. A] and [5].

1 Vector spaces
Let us consider a field F, which can be, for example, the field of real numbers R or the field of
complex numbers C. A vector space over F is a set whose elements are called vectors and in
which two operations, addition and multiplication by any of the elements of the field F (referred
to as scalars), are defined with some algebraic properties. More precisely, we have

Definition 1 (Vector space). A set X together with two operations (+, ·) is a vector space over
F if the following properties are satisfied:

(i) the first operation, called vector addition: X × X → X denoted by + satisfies

• (x + y) + z = x + (y + z) for all x, y, z ∈ X (associativity of addition)


• x + y = y + x for all x, y ∈ X (commutativity of addition)

(ii) there exists an element 0 ∈ X , called the zero vector, such that x + 0 = x for all x ∈ X

(iii) for every x ∈ X , there exists an element in X , denoted by −x, such that x + (−x) = 0.

(iv) the second operation, called scalar multiplication: F × X → X denoted by · satisfies

• 1 · x = x for all x ∈ X
• α · (β · x) = (αβ) · x for all α, β ∈ F and x ∈ X (associativity for scalar multiplication)
• (α+β)·x = α·x+β ·x for all α, β ∈ F and x ∈ X (distributivity of scalar multiplication
with respect to field addition)
• α · (x + y) = α · x + α · y for all α ∈ F and x, y ∈ X (distributivity of scalar multiplication
with respect to vector addition).

We refer to X as a real vector space when F = R and as a complex vector space when F = C.

Examples. 1. C is both a real and a complex vector space.

2. The set F^N := {(x1, x2, . . . , xN) : xk ∈ F} of all N-tuples forms a vector space over F.

3. The set F[x] of all polynomials with coefficients in F is a vector space over F.

4. The space FZ of all sequences of F is a vector space over F.


5. The set FM ×N of all matrices of size M × N with entries in F forms a vector
space over F under the laws of matrix addition and scalar multiplication.

6. If X is an arbitrary set and Y an arbitrary vector space over F, the set F(X , Y)
of all functions X → Y is a vector space over F under pointwise addition and
multiplication.

A subspace of a vector space X over F is a subset of X which is itself a vector space over F
(with respect to the same operations). One can easily verify that a subset Y of X is a subspace
using the following proposition.

Proposition 1.1. Let X be a vector space over F and Y a nonempty subset of X . Then Y is a
subspace of X if it contains 0 and if it is stable under linear combinations, that is,

α·y+β·z ∈Y

for all α, β ∈ F and y, z ∈ Y.

Examples. 1. R and iR are subspaces of the real vector space C.

2. The set of N -tuples (x1 , x2 , . . . , xN −1 , 0) with xk ∈ R is a subspace of RN .

3. The set ℓ^p(Z), p ∈ [1, ∞], of all complex-valued sequences {uk}k∈Z which satisfy

   Σ_{k=−∞}^{+∞} |uk|^p < ∞,   if p ∈ [1, ∞),
   sup_{k∈Z} |uk| < ∞,          if p = ∞,

is a subspace of C^Z.

4. If X is an arbitrary set, Y a vector space over F, and Z a subspace of Y, then


the set F(X , Z) of functions X → Z is a subspace of F(X , Y).

5. The space C^n[a, b] of all complex-valued functions with continuous derivatives of order 0 ≤ k ≤ n on the closed, bounded interval [a, b] of the real line is a subspace of F([a, b], C).

6. If (S, Σ, µ) is a measure space, we let L^p(S, Σ, µ) denote the space of all measurable functions mapping S to C whose absolute value raised to the p-th power is µ-integrable, that is,

   L^p(S, Σ, µ) = { f : S → C measurable : ∫_S |f|^p dµ < ∞ },

two elements of L^p(S, Σ, µ) being considered as equivalent if they differ only on a set whose measure is zero. L^p(S, Σ, µ) is a subspace of F(S, C). For simplicity, we often use the notation L^p(S) when S is a subset of R^N, Σ is the Borel σ-algebra over S, and µ the Lebesgue measure on S. In this case, two functions are equivalent when they are equal except possibly on a set of Lebesgue measure zero (e.g., a finite or countable¹ set of points of R^N).

Theorem 1 (Intersection of vector spaces). The intersection of any collection of subspaces of a
vector space X over F is again a subspace of X .

Definition 2. If S and T are linear subspaces of a vector space X with S ∩ T = {0}, then we
define the direct sum S ⊕ T by

S ⊕ T = {x + y | x ∈ S, y ∈ T }.

Note that the union of two subspaces is in general not a subspace. Take for example X = R^2. The lines xR and yR, with x = (1, 0) and y = (0, 1), are both subspaces of R^2, but xR ∪ yR is not a subspace, given that x + y = (1, 1) ∉ xR ∪ yR.

Definition 3 (Subspace spanned by S). Let S be a (possibly infinite) subset of a vector space X over F. The subspace spanned by S is the intersection of all subspaces containing S. It is the smallest subspace containing S. It is denoted by span(S) and may be written as the set of all finite linear combinations of elements of S, that is,

   span(S) := { Σ_{ℓ=1}^{k} λℓ xℓ : k ∈ N, xℓ ∈ S, λℓ ∈ F }.

Examples. 1. If S is empty, then the subspace spanned by S is {0}.

2. In the real vector space C, we have the following:

• the subspace spanned by {1} is R,


• the subspace spanned by {i} is iR,
• the subspace spanned by {1, i} is C.

3. In F[x], the subspace spanned by {1, x, . . . , x^N} is the space F_N[x] of all polynomials whose degree is at most N.

Definition 4 (Linear independence). Let X be a vector space over F. A finite set {x1 , x2 , . . . , xN }
of vectors of X is said to be linearly independent if for every λ1 , λ2 , . . . , λN ∈ F, the equality

λ1 x1 + λ2 x2 + . . . + λN xN = 0

implies that λ1 = λ2 = . . . = λN = 0. An infinite set S of vectors of X is linearly independent if every finite subset of S is linearly independent.

Examples. 1. Any set of vectors containing the zero vector is linearly dependent.

2. In the real vector space C, the set {1, i} is linearly independent.

3. The basic monomials {1, x, x2 , . . . , xN } form a linearly independent set of F[x].

4. The set of trigonometric functions {1, cos(x), sin(x), cos2 (x), cos(x) sin(x), sin2 (x)}
is linearly dependent.
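As a quick numerical illustration of example 4 (a minimal sketch assuming NumPy is available; it is not part of the original notes), one can sample the six functions on a grid and compute the rank of the resulting matrix. The rank 5 < 6 reflects the dependency 1 = cos²(x) + sin²(x).

import numpy as np

# Sample the six functions of example 4 on a grid; a rank smaller than 6
# confirms linear dependence (indeed, 1 = cos^2(x) + sin^2(x)).
x = np.linspace(0.0, 2.0 * np.pi, 200)
F = np.column_stack([
    np.ones_like(x), np.cos(x), np.sin(x),
    np.cos(x) ** 2, np.cos(x) * np.sin(x), np.sin(x) ** 2,
])
print(np.linalg.matrix_rank(F))  # prints 5, so the set is linearly dependent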

¹Note, however, that there exist uncountable sets (e.g., the Cantor set) of Lebesgue measure zero.
Definition 5 (Dimension). A vector space X is N-dimensional if there exist N linearly independent vectors in X and any N + 1 vectors in X are linearly dependent.

Definition 6 (Finite-dimensional space). A vector space X is finite-dimensional if X is N -


dimensional for some integer N . Otherwise, X is infinite dimensional.

Examples. 1. The spaces F^N and F_N[x] are finite-dimensional (of dimension N and N + 1, respectively).

2. The spaces F(X, X), F[x], C^n[a, b], and ℓ^p(Z) are infinite-dimensional.

Definition 7 (Basis). A basis³ of a vector space X over F is a set of linearly independent and spanning vectors of X.

Examples. 1. The set {e1, e2, e3}, where e1 = [1 0 0]^T, e2 = [0 1 0]^T, and e3 = [0 0 1]^T, forms a basis for R^3.

2. The set {1, x, x2 , . . . , xN } forms a basis for FN [x].

3. The set {1, x, x2 , x3 , . . .} forms a basis for F[x].

Theorem 2. Let X be a finite dimensional vector space. Any set of linearly independent vectors
can be extended to a basis of X .

2 Inner products and norms


From here on, we will assume that F = R or F = C.

Definition 8 (Norm). Let X be a vector space over F. A norm on X is a function which maps
X to R and satisfies the following properties:

(i) for all x ∈ X, we have ‖x‖ ≥ 0, and ‖x‖ = 0 implies x = 0 (positivity)

(ii) for all x ∈ X and for all α ∈ F, we have ‖αx‖ = |α| ‖x‖ (homogeneity)

(iii) for all x, y ∈ X, we have ‖x + y‖ ≤ ‖x‖ + ‖y‖ (triangle inequality).

Definition 9. A vector space X over F is a normed vector space (or a pre-Banach space) if it is
equipped with a norm k·k.

³This type of basis is sometimes referred to as an algebraic basis or a Hamel basis to highlight the difference with Schauder bases in infinite-dimensional Banach spaces and orthonormal bases in infinite-dimensional Hilbert spaces. In finite dimensions, all these concepts coincide (cf. notes on bases in infinite-dimensional spaces for more details).

Examples. 1. We can provide the space C^0[a, b] with the norm

   ‖f‖ = ∫_a^b |f(t)| dt.

2. We can define a norm on F^N as follows:

   ‖x‖2 = ( Σ_{k=1}^{N} |xk|^2 )^{1/2}.

3. If we write u = {uk}k∈Z, then the following defines a norm on ℓ^p(Z), p ∈ [1, ∞]:

   ‖u‖p = ( Σ_{k=−∞}^{+∞} |uk|^p )^{1/p},   p ∈ [1, ∞),
   ‖u‖∞ = sup_{k∈Z} |uk|,                    p = ∞.

4. We can define the following norm on L^p(S, Σ, µ), p ∈ [1, ∞):

   ‖f‖p = ( ∫_S |f|^p dµ )^{1/p}.

Definition 10 (Inner product). Let X be a vector space over F. An inner product (or scalar product) over F is a function X × X → F such that, for all x, y, z ∈ X and α ∈ F, the following holds:

(i) ⟨y, x⟩ = conj(⟨x, y⟩), where conj denotes complex conjugation (conjugate symmetry)
(ii) ⟨αx, y⟩ = α ⟨x, y⟩ (sesquilinearity if F = C, bilinearity if F = R)
(iii) ⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩ (sesquilinearity if F = C, bilinearity if F = R)
(iv) ⟨x, x⟩ ∈ R and ⟨x, x⟩ ≥ 0 (positivity)
(v) ⟨x, x⟩ = 0 implies that x = 0 (positivity).

The inner product on a vector space X induces a norm on X . But note that not all norms
come from an inner product (see Problem 3 of Homework 1).
Theorem 3. Let X be a vector space. If ⟨·, ·⟩ is an inner product on X, then ‖x‖ = √⟨x, x⟩ is a norm on X fulfilling the parallelogram law:

   2 ‖x‖^2 + 2 ‖y‖^2 = ‖x + y‖^2 + ‖x − y‖^2,   x, y ∈ X.

On the other hand, every norm ‖·‖ on X fulfilling the parallelogram law is the induced norm of exactly one inner product, which is defined as

   ⟨x, y⟩ = (1/4)(‖x + y‖^2 − ‖x − y‖^2),   F = R,   (1)
   ⟨x, y⟩ = (1/4)(‖x + y‖^2 − ‖x − y‖^2 + i ‖x + iy‖^2 − i ‖x − iy‖^2),   F = C.   (2)

Formulas (1) and (2) are called polarization formulas.
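The parallelogram law gives a quick numerical test of whether a norm can come from an inner product. The following minimal sketch (assuming NumPy; the vectors are arbitrary) checks the law for the Euclidean norm and shows that it generically fails for the 1-norm, which therefore is not induced by any inner product.

import numpy as np

rng = np.random.default_rng(0)
x, y = rng.standard_normal(5), rng.standard_normal(5)

def gap(norm):
    # parallelogram defect 2||x||^2 + 2||y||^2 - ||x+y||^2 - ||x-y||^2
    return 2 * norm(x) ** 2 + 2 * norm(y) ** 2 - norm(x + y) ** 2 - norm(x - y) ** 2

print(gap(lambda v: np.linalg.norm(v, 2)))  # ~0: the Euclidean norm is induced by an inner product
print(gap(lambda v: np.linalg.norm(v, 1)))  # generically nonzero: the 1-norm is not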

Definition 11 (Inner product space). A vector space X over F is an inner product space or a
pre-Hilbert space if it is equipped with an inner product h·, ·i.

Theorem 4 (Cauchy-Schwarz inequality). Let X be an inner product space. Then, we have

   |⟨x, y⟩| ≤ ‖x‖ ‖y‖,   x, y ∈ X,

with equality if and only if x and y are linearly dependent.

In what follows, we will always assume that for an inner product space, k · k is the norm
induced by the corresponding inner product.

Examples. 1. The space C^N can be equipped with the inner product

   ⟨x, y⟩ := Σ_{k=1}^{N} xk conj(yk),

where x = (x1, x2, . . . , xN) and y = (y1, y2, . . . , yN) are in C^N. The induced norm is the one given in the previous set of examples.

2. The space ℓ^2(Z) of square-summable sequences is an inner product space. If we write u = {uk}k∈Z and v = {vk}k∈Z, the inner product on ℓ^2(Z) and the induced norm are

   ⟨u, v⟩ := Σ_{k∈Z} uk conj(vk)   and   ‖u‖2 := ( Σ_{k=−∞}^{+∞} |uk|^2 )^{1/2}.

3. When (S, Σ, µ) is a measure space, the space L^2(S, Σ, µ) is an inner product space if it is equipped with the inner product

   ⟨f, g⟩ := ∫_S f conj(g) dµ,

with the induced norm

   ‖f‖2 := ( ∫_S |f|^2 dµ )^{1/2}.

4. The space C[0, 1] of all continuous functions on [0, 1] can be equipped with the inner product

   ⟨f, g⟩ = ∫_0^1 f(x) conj(g(x)) dx.

Definition 12 (Convergence of a sequence). Let X be a normed space equipped with the norm ‖·‖. A sequence {xn}n∈N of elements of X converges to x if for every ε > 0 there exists an integer N(ε) such that ‖xn − x‖ ≤ ε whenever n ≥ N(ε).

Theorem 5 (Norm is continuous). Let X be a normed vector space. Then, k · k is continuous,


i.e., if a sequence {xn }n∈N of elements of X converges to x ∈ X , then kxn k converges to kxk.

Theorem 6 (Inner product is continuous). Let X be an inner product space. Then, h·, ·i is
continuous, i.e., if two sequences {xn }n∈N and {yn }n∈N of elements of X converge to x ∈ X and
y ∈ X , respectively, then hxn , yn i converges to hx, yi.

3 Banach and Hilbert spaces

Definition 13 (Cauchy sequence). Let X be a normed space equipped with the norm ‖·‖. A sequence {xn}n∈N of elements of X is called a Cauchy sequence if for every ε > 0 there exists an integer N(ε) such that ‖xn − xm‖ ≤ ε whenever n, m ≥ N(ε).

Theorem 7. Every convergent sequence is Cauchy.

Definition 14 (Complete space). A normed space X is complete if every Cauchy sequence in


X converges in X .

Definition 15 (Banach space). A Banach space is a complete normed space.

Definition 16 (Hilbert space). A Hilbert space is a complete inner product space.

Definition 17. A subset S ⊆ X of a normed space X is called a closed set if it contains all its limit points, i.e., if {xn}n∈N is a sequence of points xn ∈ S with lim_{n→∞} xn = x ∈ X, then x ∈ S. We write S̄ for the smallest closed subset of X containing S, in the following sense: if A is any other closed subset of X such that A ⊃ S, then A ⊃ S̄. We say S̄ is the closure of S in X.

Examples. 1. Consider R^2 with the norm ‖·‖2. The open disk D = {x ∈ R^2 | ‖x‖2 < 1} is not a closed set (consider the sequence {(1 − 1/n, 0)}n∈N). The closure of D is the closed disk D̄ = {x ∈ R^2 | ‖x‖2 ≤ 1}.

2. Let X = L^2[0, 1] be the space of square-integrable functions [0, 1] → C with the inner product

   ⟨f, g⟩ = ∫_0^1 f(x) conj(g(x)) dx,   (3)

and let S = C[0, 1] ⊂ X be the set of all continuous functions [0, 1] → C. Then S̄ = X. To see this, fix a function f ∈ X and let f̂n = ⟨f, e^{2πin·}⟩ denote its n-th Fourier coefficient. Then we have ‖f − fN‖ → 0, where fN = Σ_{n=−N}^{N} f̂n e^{2πin·} is the N-th partial Fourier sum. But, since fN is a linear combination of the continuous functions x ↦ e^{2πinx}, fN is itself continuous, for each N ∈ N. Therefore f ∈ S̄. Since f was arbitrary, we obtain X ⊂ S̄. But S̄ ⊂ X in general, so S̄ = X, as desired.

3. To say that a set S is closed only makes sense relative to the space X that S is a subset of. To see this, let X1 = C[0, 1] be the space of all continuous functions from [0, 1] to C with the inner product (3), and let X2 = L^2[0, 1] be the space of all square-integrable functions f : [0, 1] → C with the same inner product. Let S = {f ∈ C[0, 1] : f(x) = 0 for x ∈ [0, 1/2]} and note that S is a subset of both X1 and X2. However, S is closed in X1, but it is not closed in X2. Indeed, suppose {fn}n∈N is a sequence in S that converges to some f ∈ X1 = C[0, 1]. Then

   ∫_0^{1/2} |f(x)|^2 dx = ∫_0^{1/2} |f(x) − fn(x)|^2 dx ≤ ∫_0^1 |f(x) − fn(x)|^2 dx = ‖f − fn‖^2 → 0,

as n → ∞, so f(x) = 0 for all x ∈ [0, 1/2]. Since we have assumed f ∈ C[0, 1], we have f ∈ S and so S is closed in X1. To see that S is not closed in X2, consider the following sequence of functions:

   fn(x) := 0,            x ∈ [0, 1/2],
            n(x − 1/2),   x ∈ (1/2, 1/2 + 1/n],
            1,            x ∈ (1/2 + 1/n, 1].

Note that fn ∈ S for all n ∈ N. Moreover, we have

   ∫_0^1 |fn(x) − 1_{[1/2,1]}(x)|^2 dx ≤ ∫_{1/2}^{1/2+1/n} 1 dx = 1/n → 0,

as n → ∞, so fn → 1_{[1/2,1]} in X2. But 1_{[1/2,1]} ∉ S, so S is not closed in X2.

4. Since the concept of closedness depends on the ambient space, so does the concept of closure. In the above example, the closure of S in X1 is the set S itself, whereas the closure of S in X2 is the set S̄ = {f ∈ L^2[0, 1] : f(x) = 0 for x ∈ [0, 1/2]} ⊋ S.

Definition 18. Let A ⊂ B be subsets of a normed space X. We say that A is dense in B if Ā = B. Equivalently, A is dense in B if, for every x ∈ B and every ε > 0, there exists a y ∈ A such that ‖x − y‖ < ε.

Theorem 8 (Completion of a normed space). Let X be a normed space with norm ‖·‖. Then, there exists a Banach space X̂ with norm ‖·‖_X̂ such that the following properties hold.

• X is a subspace of X̂;

• On X, ‖·‖_X̂ = ‖·‖;

• X̂ is the closure of X.

Theorem 9 (Completion of inner product spaces). Let X be an inner product space with inner product ⟨·, ·⟩. Then, there exists a Hilbert space X̂ with inner product ⟨·, ·⟩_X̂ such that the following properties hold.

• X is a subspace of X̂;

• On X, ⟨·, ·⟩_X̂ = ⟨·, ·⟩;

• X̂ is the closure of X.

Furthermore, X̂ is unique up to linear isometries.

Examples. 1. The set of rational numbers Q is an inner product space with inner
product equal to pq for p, q ∈ Q. This space is not complete.

2. The set of real numbers R is a Hilbert space with inner product equal to rs for r, s ∈ R.

3. CN is a Hilbert space.

4. `2 (Z) is a Hilbert space.

5. When (S, Σ, µ) is a measure space, L2 (S, Σ, µ) is a Hilbert space.

6. The space C[0, 1] equipped with the inner product

   ⟨f, g⟩ = ∫_0^1 f(x) conj(g(x)) dx

is not a Hilbert space, because it is not complete. The completion of C[0, 1] is L^2([0, 1]).

4 Orthogonality in Hilbert spaces


Definition 19. Two vectors x and y in an inner product space X are called orthogonal if ⟨x, y⟩ = 0. In this case we write x ⊥ y. Two sets U, V ⊆ X are called orthogonal if ⟨x, y⟩ = 0 for all x ∈ U and y ∈ V. We write U ⊥ V if the sets U, V are orthogonal.

Definition 20. Let S be a nonempty subset of the inner product space X . We define the
orthogonal complement S ⊥ of S as S ⊥ = {x ∈ X : x ⊥ S}.

Lemma 1. Let S be a nonempty subset of the inner product space X . Then S ⊥ is a closed
linear subspace of X .

Proof. Follows from the linearity and continuity of the inner product.

The following theorem expresses one of the fundamental geometrical properties of Hilbert
spaces. While the result may appear obvious (see Figure 1), the proof is not trivial.

Theorem 10 (Projection on closed subspaces). Let S be a closed subspace of a Hilbert space


X . Then, the following properties hold.

(a) For each x ∈ X there is a unique closest point y ∈ S such that

   ‖x − y‖ = min_{z∈S} ‖x − z‖.

(b) The point y ∈ S closest to x ∈ X is the unique element of S such that the “error” (x−y) ∈ S ⊥ .

Corollary 4.1. Let S be a closed subspace of a Hilbert space X . Then X = S ⊕ S ⊥ and


S ⊥⊥ = S.

[Figure 1: y ∈ S is the point in the closed subspace S closest to x and the unique point such that the “error” (x − y) ∈ S⊥.]
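In finite dimensions, the closest point of Theorem 10 can be computed by least squares. The following minimal sketch (assuming NumPy; the subspace S is the column span of an illustrative matrix A) finds the closest point y ∈ S to x and verifies that the error x − y is orthogonal to S.

import numpy as np

# S = column span of A (a closed subspace of R^3); find the closest point y in S to x.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
x = np.array([1.0, 2.0, 0.0])

coeff, *_ = np.linalg.lstsq(A, x, rcond=None)  # least-squares coefficients
y = A @ coeff                                  # closest point in S
print(y)
print(A.T @ (x - y))                           # ~[0, 0]: the error x - y is orthogonal to S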

Lemma 2. Let S be a subset of a Hilbert space X. Then, the following properties hold.

(a) S⊥ = span(S)⊥;

(b) S⊥⊥ is the closure of span(S);

(c) S⊥ = {0} if and only if span(S) is dense in X.

Proof. • Proof of (a): span(S)⊥ ⊆ S⊥ follows from S ⊆ span(S). It remains to show that S⊥ ⊆ span(S)⊥, which follows from the linearity and continuity of the inner product.

• Proof of (b): Follows from (a) and Corollary 4.1.

• Proof of (c): Follows from (a) and Corollary 4.1.

Corollary 4.1 and Lemma 2 tell us that subspaces of Hilbert spaces behave analogously to
subspaces of finite-dimensional vector spaces familiar to us. In these results the completeness of
X is crucial, as illustrated by the following example:

Example.
Let X = C[0, 1] equipped with the inner product

   ⟨f, g⟩ = ∫_0^1 f(x) conj(g(x)) dx.

As we have already seen, X is an inner product space, but not a Hilbert space. Let S = {f ∈ C[0, 1] : f(x) = 0 for x ∈ [0, 1/2]} and note that this is a closed subspace of X. Its orthogonal complement is S⊥ = {f ∈ C[0, 1] : f(x) = 0 for x ∈ [1/2, 1]}, but X ≠ S ⊕ S⊥.

5 Orthonormal bases in Hilbert spaces
Definition 21 (Orthogonal/orthonormal set). A subset S of nonzero vectors of a Hilbert space
X is orthogonal if any two distinct elements in S are orthogonal. An orthogonal set S is called
orthonormal if kxk = 1 for all x ∈ S.

Examples. 1. The set {t ↦ e^{2iπnt}}n∈Z of complex exponentials is orthonormal in L^2[0, 1].

2. The set {en}_{n=0}^{N−1}, with en[k] = e^{2iπkn/N}/√N for 0 ≤ k ≤ N − 1, is an orthonormal set in C^N (a numerical check of this appears after this list of examples).

3. Consider the space

   L^2_w(R) = { f : R → R : ∫_{−∞}^{+∞} |f(x)|^2 w(x) dx < ∞ }

equipped with the inner product

   ⟨f, g⟩ = ∫_{−∞}^{+∞} f(x) g(x) w(x) dx

with w(x) = e^{−x²/2}. We define the polynomials Hn, n ∈ N, through

   (d^n/dx^n) e^{−x²/2} = (−1)^n Hn(x) e^{−x²/2}.

The polynomials Hn, n ∈ N, are the Hermite polynomials, and they form an orthogonal set in L^2_w(R).

4. A function that is a sum of finitely many periodic functions is said to be quasiperiodic. If the ratios of the periods of the terms in the sum are rational, then the sum is itself periodic, but if at least one of the ratios is irrational, then the sum is not periodic. For example,

   f(t) = e^{it} + e^{iπt}

is quasiperiodic but not periodic. Let X be the space of quasiperiodic functions f : R → C of the form

   f(t) = Σ_{k=1}^{n} ak e^{iωk t},   n ∈ N, ak ∈ C, ωk ∈ R.

Then X is a vector space. Furthermore, X is an inner product space with inner product

   ⟨f, g⟩ = lim_{T→∞} (1/(2T)) ∫_{−T}^{T} f(t) conj(g(t)) dt.   (4)

The set of functions

   {e^{iωt} | ω ∈ R}   (5)

is orthonormal in X. Note that X is an inner product space but not complete. The completion of X is the space of L^2-almost periodic functions and consists of equivalence classes of functions of the form

   f(t) = Σ_{k=1}^{∞} ak e^{iωk t},   ak ∈ C, ωk ∈ R,

where Σ_{k=1}^{∞} |ak|^2 < ∞. The set in (5) is an uncountable orthogonal subset of this Hilbert space.
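The following is the numerical check of example 2 announced above (a minimal sketch assuming NumPy): the Gram matrix of the vectors en[k] = e^{2iπkn/N}/√N is the identity, so they form an orthonormal set in C^N.

import numpy as np

N = 8
k = np.arange(N)
# Columns are the vectors e_n[k] = exp(2*pi*i*k*n/N)/sqrt(N), n = 0, ..., N-1.
E = np.exp(2j * np.pi * np.outer(k, k) / N) / np.sqrt(N)
G = E.conj().T @ E                 # Gram matrix of inner products <e_m, e_n>
print(np.allclose(G, np.eye(N)))   # True: the set is orthonormal (in fact an ONB of C^N)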

We next introduce the concept of unconditional convergence, which allows us to define sums over an uncountable number of terms.

Definition 22 (Unconditional convergence). Let {xα ∈ X | α ∈ I} be an indexed subset of a normed space X, where I may be countable or not. For each finite subset J of I, we define the partial sum

   S_J = Σ_{α∈J} xα.

The unordered sum Σ_{α∈I} xα converges to x ∈ X, written as

   x = Σ_{α∈I} xα,

if for every ε > 0 there exists a finite subset Jε ⊆ I such that ‖S_J − x‖ < ε for all finite subsets J ⊆ I containing Jε.

Note that unconditional convergence is independent of permutations of the index set I.

Example.
Let {xα ∈ X | α ∈ I} be an indexed set of non-negative real numbers xα ≥ 0 and set

   M = sup { Σ_{α∈J} xα | J ⊆ I, J finite }.

• If M = ∞, then Σ_{α∈J} xα is arbitrarily large for sufficiently large finite J ⊆ I. Therefore, Σ_{α∈I} xα does not converge unconditionally.

• Suppose that 0 ≤ M < ∞. Then, for each ε > 0 there exists a finite set Jε ⊆ I such that

   M − ε < Σ_{α∈Jε} xα ≤ M.

It follows that for each finite set J ⊆ I containing Jε we have

   M − ε < Σ_{α∈J} xα ≤ M,   (6)

showing that the unordered sum converges to M.

Definition 23. An unordered sum Σ_{α∈I} xα is Cauchy if for every ε > 0 there exists a finite subset Jε ⊆ I such that ‖S_K‖ < ε for all finite sets K ⊆ I \ Jε.

Proposition 5.1. An unordered sum in a Banach space converges if and only if it is Cauchy. If an unordered sum in a Banach space converges, then {α ∈ I | xα ≠ 0} is at most countable.

Lemma 3. Let S = {xα ∈ X | α ∈ I} be an indexed orthogonal subset of a Hilbert space X. Then, the sum Σ_{α∈I} xα converges unconditionally to some x ∈ X if and only if Σ_{α∈I} ‖xα‖^2 < ∞. In that case, we have

   ‖ Σ_{α∈I} xα ‖^2 = Σ_{α∈I} ‖xα‖^2.   (7)

Proof. For every finite set K ⊆ I we have

   ‖ Σ_{α∈K} xα ‖^2 = Σ_{α,β∈K} ⟨xα, xβ⟩ = Σ_{α∈K} ‖xα‖^2.

Therefore, Σ_{α∈I} xα is Cauchy if and only if Σ_{α∈I} ‖xα‖^2 is Cauchy. Thus, one of the sums converges unconditionally if and only if the other does. Finally, (7) follows from the continuity of the norm and the fact that the sum is the limit of a sequence of finite partial sums.

Proposition 5.2 (Bessel’s inequality). Let X be a Hilbert space, S = {xα ∈ X | α ∈ I} an orthonormal subset of X, and x ∈ X. Then

(a) Σ_{α∈I} |⟨x, xα⟩|^2 ≤ ‖x‖^2;

(b) xS := Σ_{α∈I} ⟨x, xα⟩ xα converges unconditionally;

(c) x − xS ∈ S⊥.

Proof. Let J ⊆ I be a finite set. It follows immediately from the orthonormality that

   Σ_{α∈J} |⟨x, xα⟩|^2 = ‖x‖^2 − ‖ x − Σ_{α∈J} ⟨x, xα⟩ xα ‖^2 ≤ ‖x‖^2.   (8)

Therefore (see the example following Definition 22), (a) holds. Furthermore, (a) implies (b) because of Lemma 3. Finally, the statement (c) means that ⟨x − xS, xα⟩ = 0 for all α ∈ I, which is true by the continuity of the inner product and the fact that x − Σ_{β∈K} ⟨x, xβ⟩ xβ ⊥ xα for all finite subsets K of I and all α ∈ I.

Definition 24 (Closed linear span). Given a subset S of a Hilbert space X, we define the closed linear span [S] of S by

   [S] = { Σ_{x∈S} cx x | cx ∈ F, Σ_{x∈S} cx x converges unconditionally }.

Lemma 3 implies that the closed linear span [S] of an orthonormal subset S = {xα ∈ X | α ∈ I} of a Hilbert space X can be written as

   [S] = { Σ_{α∈I} cα xα | cα ∈ F, Σ_{α∈I} |cα|^2 < ∞ }.

Theorem 11. If S = {xα ∈ X | α ∈ I} is an orthonormal subset of a Hilbert space X, then the following conditions are equivalent:

(a) ⟨x, xα⟩ = 0 for all α ∈ I implies that x = 0;

(b) x = Σ_{α∈I} ⟨x, xα⟩ xα for all x ∈ X;

(c) ‖x‖^2 = Σ_{α∈I} |⟨x, xα⟩|^2 for all x ∈ X;

(d) [S] = X;

(e) S is a maximal orthonormal set, i.e., it has the property that if S′ ⊃ S is another orthonormal set, then necessarily S′ = S.

Proof. • (a) implies (b): (a) states that S ⊥ = {0}. Therefore, Proposition 5.2 implies (b);
• (b) implies (c): Follows from Lemma 3;
• (c) implies (d): (c) implies that S ⊥ = {0}. Therefore, we also have [S]⊥ = {0};
• (d) implies (e): (d) implies that x = Σ_{α∈I} cα xα for all x ∈ X. Therefore, if x ⊥ xα for all α ∈ I, then x = 0.
• (e) implies (a): Suppose for a contradiction that there exists an x ∈ X such that hx, xα i = 0
for all α ∈ I, but x 6= 0. Then S ∪ {x/kxk} is an orthonormal set and is a strict superset
of S.

Definition 25. An orthonormal subset S = {xα ∈ X | α ∈ I} of a Hilbert space X is complete if it satisfies any of the equivalent conditions (a)–(e) in Theorem 11. A complete orthonormal subset of X is called an orthonormal basis of X.

Condition (a) is the easiest to verify. Condition (b) is used most often. Condition (c) is called
Parseval’s identity. Condition (d) simply expresses completeness. Condition (e) can be used to
prove that every Hilbert space has an orthonormal basis using Zorn’s lemma.
If a Hilbert space X has an orthonormal basis S = {xα ∈ X | α ∈ I}, then it is isomorphic to the sequence space ℓ^2(I). In what follows, we are mostly interested in Hilbert spaces that have a countable orthonormal basis; such spaces are called separable.

Examples. 1. The set {t ↦ e^{2iπnt}}n∈Z of complex exponentials forms an orthonormal basis for L^2[0, 1].

2. The set {Hn}n∈N of Hermite polynomials forms an orthogonal basis for L^2_w(R). It can be transformed into an orthonormal basis by noting that ‖Hn‖ = (√(2π) n!)^{1/2} for all n ∈ N.
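To illustrate Theorem 11 in C^N, the following sketch (assuming NumPy; gram_schmidt is an illustrative helper, not a library function) orthonormalizes a random family and verifies the expansion (b) and Parseval's identity (c).

import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a linearly independent family (illustrative helper)."""
    basis = []
    for v in vectors:
        w = v - sum(np.vdot(b, v) * b for b in basis)  # remove components along previous basis vectors
        basis.append(w / np.linalg.norm(w))
    return basis

rng = np.random.default_rng(2)
vectors = [rng.standard_normal(4) + 1j * rng.standard_normal(4) for _ in range(4)]
onb = gram_schmidt(vectors)

x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
coeffs = [np.vdot(b, x) for b in onb]                       # <x, x_alpha>
print(np.isclose(sum(abs(c) ** 2 for c in coeffs), np.linalg.norm(x) ** 2))  # Parseval's identity
print(np.allclose(sum(c * b for c, b in zip(coeffs, onb)), x))               # x = sum <x, x_alpha> x_alpha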

6 Bounded linear operators on a Hilbert space


Definition 26. Let X , Y be vector spaces over F. A linear operator (or linear transformation,
or linear map) from X to Y is a function T : X → Y that satisfies
   T(αx + βy) = αTx + βTy

for all x, y ∈ X and α, β ∈ F. The set R(T) = {Tx : x ∈ X} is called the range of T. The dimension of R(T) is called the rank of T. The set N(T) = {x ∈ X : Tx = 0} is called the null space or kernel of T.

Example.
If X is N-dimensional with basis {xn}_{n=1}^{N} and Y is M-dimensional with basis {ym}_{m=1}^{M}, then T is completely determined by its matrix representation U = {Um,n}_{1≤m≤M, 1≤n≤N} with respect to these two bases:

   Txn = Σ_{m=1}^{M} Um,n ym,   n = 1, 2, . . . , N.

Therefore, if x = Σ_{n=1}^{N} αn xn and y = Tx, we have

   y = Σ_{m=1}^{M} βm ym   with   βm = Σ_{n=1}^{N} Um,n αn.

That is, if α = (α1, . . . , αN) ∈ F^N and β = (β1, . . . , βM) ∈ F^M are the coefficient vectors of x and y with respect to the bases {xn}_{n=1}^{N} and {ym}_{m=1}^{M}, respectively, then we have the matrix-vector relation β = Uα.

Proposition 6.1. N (T) is a subspace of X and R(T) is a subspace of Y. Furthermore, if T is


continuous then N (T) is closed.

Definition 27 (Bounded linear operators). A linear operator T : X → Y is bounded if there exists a constant K ≥ 0 such that

   ‖Tx‖_Y ≤ K ‖x‖_X ,   x ∈ X.   (9)

The set of bounded linear operators T : X → Y is denoted by B(X, Y).

Theorem 12. Let X, Y be normed spaces. Then, B(X, Y) is a normed space with norm

   ‖T‖ = sup_{‖x‖_X = 1} ‖Tx‖_Y .

This is called the operator norm of an operator between two normed vector spaces. Furthermore, if Y is a Banach space, then B(X, Y) is a Banach space.
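For a matrix operator T : (F^N, ‖·‖2) → (F^M, ‖·‖2), the operator norm is the largest singular value. A minimal sketch (assuming NumPy) compares it with a crude Monte Carlo estimate of sup_{‖x‖=1} ‖Tx‖.

import numpy as np

rng = np.random.default_rng(3)
T = rng.standard_normal((3, 5))

op_norm = np.linalg.norm(T, 2)     # largest singular value = operator norm w.r.t. Euclidean norms
# Crude lower bound: evaluate ||Tx|| over random unit vectors x.
xs = rng.standard_normal((5, 10000))
xs /= np.linalg.norm(xs, axis=0)
estimate = np.linalg.norm(T @ xs, axis=0).max()
print(op_norm, estimate)           # the estimate approaches (and never exceeds) the operator norm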

Proposition 6.2. Let X , Y be normed spaces and T : X → Y a linear operator. The following
properties are equivalent.

(a) T is continuous;

(b) kTk < ∞;

(c) T is bounded.

Proof.

(a) implies (b): Suppose that (b) does not hold. Then, there exists a sequence {xn}_{n=1}^{∞} of elements xn ∈ X with ‖xn‖_X = 1 such that ‖Txn‖_Y > n. Set yn = xn/n. Then, yn → 0, but ‖Tyn‖_Y > 1. Therefore, T is not continuous, as lim_{n→∞} Tyn ≠ T0 = 0.

(b) implies (c): Obvious for x = 0. For x ≠ 0, we have

   ‖Tx‖_Y = ‖x‖_X ‖T(x/‖x‖_X)‖_Y ≤ ‖T‖ ‖x‖_X .

(c) implies (a): (c) implies that T is uniformly continuous.

Lemma 4. Let X, Y be normed spaces and T : X → Y a linear operator. Then,

   ‖Tx‖_Y ≤ ‖T‖ ‖x‖_X ,   x ∈ X.

Proof. Obvious for x = 0. For x ≠ 0, we have

   ‖Tx‖_Y = ‖x‖_X ‖T(x/‖x‖_X)‖_Y ≤ ‖T‖ ‖x‖_X .

Lemma 5. If T1 and T2 are linear operators between normed spaces for which the range of T2
is contained in the domain of T1 , we can consider the composed operator T1 T2 . If T1 and T2 are
bounded, then also T1 T2 is bounded and

kT1 T2 k 6 kT1 k kT2 k .

Proof.

   ‖T1T2‖ = sup_{‖x‖=1} ‖T1(T2 x)‖
          ≤ sup_{‖x‖=1} ‖T1‖ ‖T2 x‖
          = ‖T1‖ ‖T2‖.

Definition 28. Let X, Y be normed spaces with norms ‖·‖_X and ‖·‖_Y, and let T : X → Y be a linear operator. We say that T is a linear isometry if ‖Tx‖_Y = ‖x‖_X for all x ∈ X.

6.1 (Orthogonal) projections


Definition 29. A projection on a vector space X is a linear operator P : X → X such that
P2 = P.

Theorem 13. Let X be a vector space.

(a) If P : X → X is a projection, then X = R(P) ⊕ N (P);

(b) If X = S ⊕ T , then there exists a unique projection P : X → X such that S = R(P) and
T = N (P), called the projection onto S along T .

Proof. We first show that x ∈ R(P) if and only if x = Px. Clearly, Px ∈ R(P). Suppose that
x ∈ R(P). Then x = Pz for some z ∈ X and, therefore, Px = PPz = Pz = x.
Now we prove (a). Let x ∈ N (P) ∩ R(P). Then x = Px and Px = 0, which implies that x = 0.
Therefore, N (P) ∩ R(P) = {0}. Furthermore, we can decompose each x ∈ X as x = Px + (I − P)x
with Px ∈ R(P) and (I − P)x ∈ N (P). Therefore, X = R(P) ⊕ N (P).
Now we show that (b) holds. Each x ∈ X has a unique decomposition x = s + t with s ∈ S
and t ∈ T , and P (x) = s defines the required projection.

Example.
Let

   S = { (x, 0)^T | x ∈ R } ⊆ R^2,
   T = { (x, x)^T | x ∈ R } ⊆ R^2.

Then, S and T are linear subspaces of R^2 with S ∩ T = {0}. Furthermore, we can decompose each vector in R^2 as

   (x, y)^T = (x − y, 0)^T + (y, y)^T,

where the first summand lies in S and the second in T. Therefore, R^2 = S ⊕ T. Define the operator

   P : R^2 → R^2,   (x, y)^T ↦ (x − y, 0)^T.

This operator is linear and P^2 = P. Therefore, P is a projection.
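In matrix form, the projection of this example reads as follows (a minimal sketch assuming NumPy): P is idempotent but not symmetric, so it is a projection onto S along T, not an orthogonal one.

import numpy as np

P = np.array([[1.0, -1.0],
              [0.0,  0.0]])   # (x, y) -> (x - y, 0), projection onto S along T

print(np.allclose(P @ P, P))   # True: P is idempotent, hence a projection
print(np.allclose(P, P.T))     # False: P is not self-adjoint, so not an orthogonal projection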

Definition 30. An orthogonal projection on a Hilbert space X is a linear operator P : X → X such that P^2 = P and

   ⟨Px, y⟩ = ⟨x, Py⟩,   x, y ∈ X.

Theorem 14. Let X be a Hilbert space.

(a) If P : X → X is an orthogonal projection, then R(P) is closed and R(P)⊥ = N (P). Therefore,
X = R(P) ⊕ N (P) is an orthogonal direct sum;

(b) If S is a closed subspace, then there exists a unique orthogonal projection P : X → X such
that S = R(P) and S ⊥ = N (P).

Proof. We first prove (a). Let P : X → X be an orthogonal projection. According to Property (a)
in Theorem 13, X = R(P) ⊕ N (P). If x = Pz ∈ R(P) and y ∈ N (P), we have

   ⟨x, y⟩ = ⟨Pz, y⟩ = ⟨z, Py⟩ = 0.

Therefore, R(P) = N(P)⊥ and, in particular, R(P) is a closed subspace.
Now we show that (b) holds. Let S be a closed subspace of X . Then, Corollary 4.1 implies
that X = S ⊕ S ⊥ . We define the projection P by

Px = s, where x = s + t with s ∈ S, t ∈ S ⊥ .

Then, R(P) = S and N (P) = S ⊥ . Now let x1 , x2 ∈ X and xi = si +ti be the unique decomposition
of xi with si ∈ S and ti ∈ S ⊥ , i = 1, 2. Then, we have

hx1 , Px2 i = hx1 , P(s2 + t2 )i = hx1 , s2 i = hs1 + t1 , s2 i = hs1 , s2 i = hs1 , s2 + t2 i = hPx1 , s2 + t2 i .

Therefore, P is an orthogonal projection. The uniqueness follows from Theorem 13(b).

Example.
Let

   S = { (x, 0)^T | x ∈ R } ⊆ R^2.

Then, S is a (closed) linear subspace of R^2. Furthermore, we have

   S⊥ = { (0, y)^T | y ∈ R }.

Define the operator

   P : R^2 → R^2,   (x, y)^T ↦ (x, 0)^T.

This operator is linear, P^2 = P, and

   ⟨P(x2, y2)^T, (x1, y1)^T⟩ = x1 x2 = ⟨(x2, y2)^T, P(x1, y1)^T⟩,   (x1, y1)^T, (x2, y2)^T ∈ R^2.

Therefore, P is an orthogonal projection with S = R(P) and S⊥ = N(P).

Lemma 6. Let P be a non-zero orthogonal projection. Then ‖P‖ = 1.

Proof. Recall that

   ‖P‖ = sup_{‖x‖=1} ‖Px‖.

If Px ≠ 0, the Cauchy-Schwarz inequality implies that

   ‖Px‖ = ⟨Px, Px⟩/‖Px‖ = ⟨x, Px⟩/‖Px‖ ≤ ‖x‖.   (10)

Therefore, ‖P‖ ≤ 1. For Px ≠ 0, we have ‖P(Px)‖ = ‖Px‖, which implies that ‖P‖ ≥ 1.

Proposition 6.3. Let S = {xα : α ∈ I} be an orthonormal system (but not necessarily an ONB) in a Hilbert space X. Let V = [S] be the closed linear span of S. Then the orthogonal projection P_V onto V is given explicitly by

   P_V x = Σ_{α∈I} ⟨x, xα⟩ xα.

Proof. The set S can be extended to an ONB B of X (if X is separable, then this can be done by
applying the Gram-Schmidt orthogonalization procedure, or in the general case, by using Zorn’s
Lemma). Thus we can enumerate B = {xα : α ∈ J } by an index set J containing I. Now, given
an arbitrary x ∈ X, define s, t ∈ X according to

   s = Σ_{α∈I} ⟨x, xα⟩ xα,   t = Σ_{α∈J\I} ⟨x, xα⟩ xα.

Then, by Theorem 11, we have

   x = Σ_{α∈J} ⟨x, xα⟩ xα = Σ_{α∈I} ⟨x, xα⟩ xα + Σ_{α∈J\I} ⟨x, xα⟩ xα = s + t.

Moreover,

   ⟨s, t⟩ = ⟨ Σ_{α∈I} ⟨x, xα⟩ xα , Σ_{β∈J\I} ⟨x, xβ⟩ xβ ⟩ = Σ_{α∈I, β∈J\I} ⟨x, xα⟩ conj(⟨x, xβ⟩) ⟨xα, xβ⟩ = 0,

since ⟨xα, xβ⟩ = 0 for α ≠ β. Therefore, by Theorem 14, we have P_V x = s, as desired.
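In finite dimensions, Proposition 6.3 can be checked directly (a minimal sketch assuming NumPy; the orthonormal system is obtained from a QR factorization): P_V x = Σ_α ⟨x, xα⟩ xα equals Q Qᵀ x, and the residual is orthogonal to V.

import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((5, 2))
Q, _ = np.linalg.qr(A)                     # columns of Q: an orthonormal system spanning V = R(A)

x = rng.standard_normal(5)
PVx = sum(np.dot(q, x) * q for q in Q.T)   # sum_alpha <x, x_alpha> x_alpha
print(np.allclose(PVx, Q @ Q.T @ x))       # same thing in matrix form
print(np.allclose(Q.T @ (x - PVx), 0))     # the residual is orthogonal to V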

7 The dual of a Hilbert space


Definition 31. Let X be a Hilbert space. An element of B(X , F) is called a bounded linear
functional.

Example.
Let X be a Hilbert space. For every y ∈ X, the mapping

   X → F   (11)
   x ↦ ⟨x, y⟩   (12)

is a bounded linear functional, which we denote by fy.

Theorem 15 (Riesz representation theorem). Let f : X → F be a bounded linear functional on


a Hilbert space X . Then there exists a unique yf ∈ X such that f (x) = hx, yf i for all x ∈ X .

Therefore, the mapping

   θX : X → B(X, F),   y ↦ fy = ⟨·, y⟩   (13)

is a bijection. In the case of F = C, this mapping is antilinear (as can be seen from θX(αy) = ⟨·, αy⟩ = conj(α) ⟨·, y⟩ = conj(α) θX(y)). Furthermore, the mapping θX is an isometry:

   ‖fy‖ = sup_{‖x‖=1} |fy(x)| = sup_{‖x‖=1} |⟨x, y⟩| = ‖y‖,   y ≠ 0.

What is more, B(X, F) is a Hilbert space with inner product

   ⟨fx, fy⟩ = ⟨y, x⟩.

Theorem 16. Let X, Y be Hilbert spaces and T ∈ B(X, Y) a bounded linear operator. Then, there exists a unique operator T∗ ∈ B(Y, X) such that

   ⟨Tx, y⟩ = ⟨x, T∗y⟩,   x ∈ X, y ∈ Y,   (14)

with ‖T∗‖ = ‖T‖. Furthermore, we have T∗∗ = T.

Proof. We first prove the existence. Let y ∈ Y be arbitrary but fixed. The mapping x ↦ ⟨Tx, y⟩ is a bounded linear functional on X. Therefore, according to Riesz’s representation theorem, there exists a unique zy ∈ X such that

   ⟨Tx, y⟩ = ⟨x, zy⟩,   x ∈ X.

We set T∗y = zy.
We now show that T∗ is unique. Suppose that

   ⟨Tx, y⟩ = ⟨x, T1∗y⟩ = ⟨x, T2∗y⟩,   x ∈ X, y ∈ Y.

Then, we have

   ⟨x, (T1∗ − T2∗)y⟩ = 0,   x ∈ X, y ∈ Y.

Therefore, (T1∗ − T2∗)y ∈ X⊥ = {0} for all y ∈ Y, which implies that T1∗ − T2∗ = 0 and, in turn, that T1∗ = T2∗.
The linearity of T∗ follows from the uniqueness and the linearity of the inner product. Furthermore, we have

   ‖T∗‖ = sup_{‖y‖=1} ‖T∗y‖
        = sup_{‖y‖=1} ‖zy‖
        = sup_{‖y‖=1} ‖⟨T·, y⟩‖   (15)
        = sup_{‖y‖=1} sup_{‖x‖=1} |⟨Tx, y⟩|
        ≤ sup_{‖y‖=1} sup_{‖x‖=1} ‖Tx‖ ‖y‖   (16)
        = ‖T‖,   (17)

where (15) follows from the fact that the mapping θX in (13) is an isometry and in (16) we applied the Cauchy-Schwarz inequality. T∗∗ = T follows immediately from (14). Furthermore, we have

   ‖T‖ = ‖T∗∗‖ ≤ ‖T∗‖ ≤ ‖T‖,

which shows that ‖T∗‖ = ‖T‖.
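For a matrix operator on C^N, the adjoint is the conjugate transpose. A small sketch (assuming NumPy) verifies (14) and ‖T∗‖ = ‖T‖.

import numpy as np

rng = np.random.default_rng(5)
T = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
Tstar = T.conj().T                      # adjoint of a matrix operator: conjugate transpose

x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
y = rng.standard_normal(3) + 1j * rng.standard_normal(3)

lhs = np.vdot(y, T @ x)                 # <Tx, y>
rhs = np.vdot(Tstar @ y, x)             # <x, T* y>
print(np.isclose(lhs, rhs))             # True
print(np.isclose(np.linalg.norm(T, 2), np.linalg.norm(Tstar, 2)))  # ||T*|| = ||T||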

Definition 32. Let X , Y be Hilbert spaces and T ∈ B(X , Y). The operator T∗ ∈ B(Y, X ) in
Theorem 16 is called the adjoint of T ∈ B(X , Y).

Definition 33. Let X be a Hilbert space and T ∈ B(X , X ). Then, T is

1. self-adjoint if T = T∗ ;

2. unitary if T is invertible and T−1 = T∗ ;

3. normal if TT∗ = T∗ T.

Examples. 1. Every orthogonal projection P : X → X is a self-adjoint operator.

2. If T : X → X is a unitary operator, then it is a linear isometry. Indeed,

   ‖Tx‖^2 = ⟨Tx, Tx⟩ = ⟨T∗Tx, x⟩ = ⟨x, x⟩ = ‖x‖^2   for all x ∈ X.

The converse is not true, that is, a linear isometry need not be unitary. To see this, consider the Hilbert space X = {x = (x1, x2, . . .) : xn ∈ R, Σ_{n∈N} xn^2 < ∞} with inner product ⟨x, y⟩ = Σ_{n∈N} xn yn, and let T(x1, x2, x3, . . .) = (0, x1, x2, . . .) be the right-shift operator. Then T is a linear isometry, but it is not surjective, hence not invertible, so it cannot be unitary.

Lemma 7. Let X be a Hilbert space and T ∈ B(X, X) self-adjoint. Then, we have

   ‖T‖ = sup {|⟨Tx, x⟩| : ‖x‖ = 1}.

Proof. Let

   α = sup {|⟨Tx, x⟩| : ‖x‖ = 1}.

From |⟨Tx, x⟩| ≤ ‖Tx‖ ‖x‖ ≤ ‖T‖ ‖x‖^2 it follows that α ≤ ‖T‖. It remains to show that ‖T‖ ≤ α. It follows from the Cauchy-Schwarz inequality that

   |⟨Tx, y⟩| ≤ ‖Tx‖ ‖y‖,   x, y ∈ X,

with equality if and only if Tx and y are collinear. Therefore, we can rewrite

   ‖Tx‖ = sup {|⟨Tx, y⟩| : ‖y‖ = 1},   (18)

which implies that

   ‖T‖ = sup_{‖x‖=1} ‖Tx‖
       = sup {|⟨Tx, y⟩| : ‖x‖ = 1, ‖y‖ = 1}
       = sup {|⟨Tx, y⟩| : ‖x‖ = 1, ‖y‖ = 1, ⟨Tx, y⟩ ∈ R}.   (19)

It follows from the polarization formula ((2) for F = C and (1) for F = R) in Theorem 3 for ⟨Tx, y⟩ ∈ R that

   |⟨Tx, y⟩| = (1/4) |‖Tx + y‖^2 − ‖Tx − y‖^2|
             = (1/4) |⟨Tx + y, Tx + y⟩ − ⟨Tx − y, Tx − y⟩|
             = (1/4) |⟨T(x + y), x + y⟩ − ⟨T(x − y), x − y⟩|
             ≤ (1/4) |⟨T(x + y), x + y⟩| + (1/4) |⟨T(x − y), x − y⟩|
             ≤ (α/4) (‖x + y‖^2 + ‖x − y‖^2)
             = (α/2) (‖x‖^2 + ‖y‖^2),   x, y ∈ X with ⟨Tx, y⟩ ∈ R.

Using this result in (19) gives ‖T‖ ≤ α.
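For a symmetric matrix, Lemma 7 says that the operator norm equals the supremum of |⟨Tx, x⟩| over unit vectors, which in turn is the largest eigenvalue magnitude. A sketch (assuming NumPy; the sampling gives only a crude lower estimate of the supremum):

import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((4, 4))
T = (A + A.T) / 2                                # a self-adjoint (symmetric) operator on R^4

op_norm = np.linalg.norm(T, 2)
eigs = np.linalg.eigvalsh(T)
print(np.isclose(op_norm, np.abs(eigs).max()))   # ||T|| = max |eigenvalue|

# Monte Carlo estimate of sup_{||x||=1} |<Tx, x>|; it approaches ||T|| from below.
xs = rng.standard_normal((4, 20000))
xs /= np.linalg.norm(xs, axis=0)
print(np.abs(np.einsum('ij,ij->j', xs, T @ xs)).max())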

Definition 34. Let X be a Hilbert space. A self-adjoint operator T ∈ B(X, X) is

1. positive semidefinite if ⟨Tx, x⟩ ≥ 0 for all x ∈ X. In this case, we write T ≥ 0.

2. positive definite if it is positive semidefinite and ⟨Tx, x⟩ = 0 implies that x = 0. In this case, we write T > 0.

Definition 35. Let X be a Hilbert space and T ∈ B(X , X ), T > 0. We call a self-adjoint
operator A ∈ B(X , X ) a square root of T if T = A2 .

Proposition 7.1. Let X be a Hilbert space and T ∈ B(X , X ), T > 0. Then, T has a unique
positive definite square root.

Definition 36. An operator T : X → Y is invertible if it is surjective and injective.

A Proof of Theorem 10
Proof of Theorem 10. We first prove the existence of y in (a). Set

d = inf{kx − zk | z ∈ S}. (20)

From the definition of d, there exists a sequence {yn }n∈N of elements yn ∈ S such that

lim kx − yn k = d. (21)
n→∞

Suppose that this sequence is Cauchy. Since X is complete, there is a y ∈ X such that
limn→∞ yn = y, and since S is closed, we have y ∈ S. Finally, it follows from the continuity of
k · k that kx − yk = limn→∞ kx − yn k = d.
It remains to show that {yn }n∈N is Cauchy. The parallelogram law applied to (x − ym ) and
(x − yn ) implies that

2kx − yn k2 + 2kx − ym k2 = k2x − ym − yn k2 + kym − yn k2


= 4kx − (ym + yn )/2k2 + kym − yn k2
> 4d2 + kym − yn k2 , (22)

where we used (20) and the fact that (ym + yn)/2 ∈ S in the last step. Furthermore, (21) implies that for each ε > 0 there exists an N(ε) such that ‖x − yn‖ ≤ d + ε for all n ≥ N(ε). Combining this with (22), we find that, for all m, n ≥ N(ε),

kym − yn k2 6 4(d + ε)2 − 4d2


= 4ε(2d + ε).

Therefore, {yn }n∈N is Cauchy.


Now let x ∈ X and y ∈ S be the unique closest point from (a). Then, for each λ ∈ F and
z ∈ S, we have

kx − yk2 6 kx − y + λzk2
= kx − yk2 + |λ|2 kzk2 + hx − y, λzi + hλz, x − yi
= kx − yk2 + |λ|2 kzk2 + 2<(λ̄ hx − y, zi).

It follows that

−2<(λ̄ hx − y, zi) 6 |λ|2 kzk2 , λ ∈ C, z ∈ S.

We can write hx − y, zi = |hx − y, zi| eiϕz (with ϕz ∈ {0, π} in the case of F = R). Therefore, for
λ = −εeiϕz , we get

2 |hx − y, zi| 6 εkzk2 , ε > 0, z ∈ S.

Taking the limit ε → 0 gives

|hx − y, zi| = 0, z ∈ S.

Now we prove the uniqueness in (a). Suppose y1 and y2 are elements of S such that
kx − y1 k = kx − y2 k = d. Then since x − y1 ∈ S ⊥ and y1 − y2 ∈ S, we have

d2 = kx − y2 k2 = kx − y1 k2 + ky1 − y2 k2
= d2 + ky1 − y2 k2 ,

so y1 = y2 .
Finally, we prove the uniqueness in (b). Let y1 and y2 be points in S such that x − y1 ∈ S ⊥ ,
x − y2 ∈ S ⊥ . Since x − y1 ⊥ y1 − z for all z ∈ S, we have

d2 = inf kx − zk2 = inf kx − y1 k2 + ky1 − zk2



z∈S z∈S
= kx − y1 k2 + inf ky1 − zk2
z∈S
2
= kx − y1 k .

Therefore kx − y1 k = d, and by the same argument kx − y2 k = d. By the proof of uniqueness in


(a) it follows that y1 = y2 .

B Proof of Theorem 15
Proof. If N(f) = X, we can take yf = 0. Suppose that N(f) ≠ X. Since f is continuous, N(f) is closed. Therefore, according to Corollary 4.1, we can decompose X = N(f) ⊕ N(f)⊥. Since N(f) ≠ X, there exists a non-zero z ∈ N(f)⊥. We can rewrite any x ∈ X as

   x = ( x − (f(x)/f(z)) z ) + (f(x)/f(z)) z,

where the first summand lies in N(f). Therefore, for any x ∈ X there exist α ∈ F and n ∈ N(f) such that

   x = αz + n.

Taking the inner product with z implies that

   α = ⟨x, z⟩ / ‖z‖^2.

Evaluating f on x = αz + n gives

   f(x) = αf(z) = f(z) ⟨x, z⟩ / ‖z‖^2.

Therefore, we can set yf = (conj(f(z))/‖z‖^2) z.


To prove uniqueness, suppose that

f (x) = hx, y1 i = hx, y2 i , x ∈ X.

In particular, we have

f (y1 − y2 ) = hy1 − y2 , y1 i = hy1 − y2 , y2 i ,

which implies that ky1 − y2 k = 0. Therefore, y1 = y2 .

C Spectrum of self-adjoint operators


Definition 37. Let X be a Hilbert space and T : X → X . The set

σ(T) = {λ ∈ F | (T − λI) is not invertible}

is called the spectrum of T.

We can split the spectrum into σ(T) = σ1(T) ∪ σ2(T), where

   σ1(T) = {λ ∈ F | (T − λI) is not injective},
   σ2(T) = {λ ∈ F | (T − λI) is not surjective}.

Lemma 8. Let X be a Hilbert space and T ∈ B(X , X ) self-adjoint. Then, we have

(a) For all x ∈ X we have hTx, xi ∈ R and σ1 (T) ⊆ R;

(b) if λ ∉ σ1(T), then R(T − λI) is dense in X;

(c) if there exists an A > 0 such that ‖(T − λI)x‖ ≥ A‖x‖ for all x ∈ X, then λ ∉ σ(T).

Proof. • Proof of (a): Let x ∈ X. Then

   ⟨Tx, x⟩ = ⟨x, Tx⟩ = conj(⟨Tx, x⟩),

so ⟨Tx, x⟩ ∈ R. Now let λ ∈ σ1(T). Then there exists an x ∈ X, x ≠ 0, such that Tx = λx, and λ‖x‖^2 = ⟨Tx, x⟩ ∈ R, so λ ∈ R.

• Proof of (b): We show that R(T − λI)⊥ = {0}. Let x ∈ R(T − λI)⊥. Then, we have

   0 = ⟨(T − λI)y, x⟩ = ⟨y, (T − conj(λ)I)x⟩,   y ∈ X.

Therefore, (T − conj(λ)I)x = 0. If x ≠ 0, this implies that conj(λ) ∈ σ1(T) ⊆ R and, in turn, that λ = conj(λ) ∈ σ1(T), which is a contradiction. Therefore, x = 0.

• Proof of (c): It follows immediately that λ ∉ σ1(T). Therefore, it remains to show that λ ∉ σ2(T). According to (b), it suffices to show that R(T − λI) is closed. Let {yn = (T − λI)xn}n∈N be a sequence in R(T − λI) such that y = lim_{n→∞} yn. Then, we have

   ‖yn − ym‖ = ‖(T − λI)(xn − xm)‖ ≥ A ‖xn − xm‖.

Therefore, {xn}n∈N is a Cauchy sequence in a Hilbert space and, therefore, converges to some x. By the continuity of T − λI, we have y = (T − λI)x ∈ R(T − λI).

Lemma 9. Let X be a Hilbert space and T ∈ B(X , X ) self-adjoint. Then σ(T) ⊆ R.

Proof. Let λ = α + iβ ∈ C with β ≠ 0. We have to show that λ ∉ σ(T). We have

   ⟨(T − λI)x, x⟩ = ⟨Tx, x⟩ − λ ‖x‖^2,   x ∈ X,

where ⟨Tx, x⟩ ∈ R and ‖x‖^2 ∈ R. This implies that

   2i Im(⟨(T − λI)x, x⟩) = (conj(λ) − λ) ‖x‖^2 = −2iβ ‖x‖^2,   x ∈ X,

and, in turn, that

   ‖x‖ ‖(T − λI)x‖ ≥ |⟨(T − λI)x, x⟩| ≥ |β| ‖x‖^2,   x ∈ X,

where we used the Cauchy-Schwarz inequality in the first step. Therefore, Lemma 8 implies that λ ∉ σ(T).

Theorem 17. Let T ∈ B(X, X) be self-adjoint and set

   m = inf_{‖x‖=1} ⟨Tx, x⟩   and   M = sup_{‖x‖=1} ⟨Tx, x⟩.

Then σ(T) ⊆ [m, M].

Proof. First note that

   m ‖x‖^2 ≤ ‖x‖^2 ⟨T(x/‖x‖), x/‖x‖⟩ ≤ M ‖x‖^2,   x ∈ X \ {0},

where the middle term equals ⟨Tx, x⟩.

Suppose that λ = m − c with c > 0. Then, the Cauchy-Schwarz inequality implies that

   ‖x‖ ‖(T − λI)x‖ ≥ |⟨(T − λI)x, x⟩| ≥ ⟨(T − λI)x, x⟩ ≥ ‖x‖^2 (m − λ) = c ‖x‖^2,   x ∈ X.

Therefore, Lemma 8 implies that λ ∉ σ(T).
Now suppose that λ = M + c with c > 0. Then, the Cauchy-Schwarz inequality implies that

   ‖x‖ ‖(T − λI)x‖ ≥ |⟨(T − λI)x, x⟩| ≥ −⟨(T − λI)x, x⟩ ≥ ‖x‖^2 (λ − M) = c ‖x‖^2,   x ∈ X.

Again, Lemma 8 implies that λ ∉ σ(T).
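For a symmetric matrix, m and M are the smallest and largest eigenvalues, and Theorem 17 reduces to the statement that all eigenvalues lie in [m, M]. A sketch (assuming NumPy; the quadratic form is sampled over random unit vectors):

import numpy as np

rng = np.random.default_rng(7)
A = rng.standard_normal((5, 5))
T = (A + A.T) / 2                           # self-adjoint operator on R^5

eigs = np.linalg.eigvalsh(T)                # spectrum of T
# Sample the quadratic form <Tx, x> over unit vectors x.
xs = rng.standard_normal((5, 20000))
xs /= np.linalg.norm(xs, axis=0)
quad = np.einsum('ij,ij->j', xs, T @ xs)

# For a symmetric matrix, m = lambda_min and M = lambda_max, so sigma(T) lies in [m, M].
print(eigs.min(), quad.min())               # quad.min() approaches m = lambda_min from above
print(eigs.max(), quad.max())               # quad.max() approaches M = lambda_max from below
print(np.all((quad >= eigs.min() - 1e-9) & (quad <= eigs.max() + 1e-9)))  # True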

D Pseudo-inverse of a bounded linear operator


Let X and Y be Hilbert spaces. It is often desirable to find some kind of “inverse” for an operator
T : X → Y that is not invertible in the strict sense.

Theorem 18 (Inverse mapping theorem). Let X , Y be Banach spaces. If T ∈ B(X , Y) is


invertible, then its inverse T−1 ∈ B(Y, X ). Consequently, T(S) is closed if and only if S is closed.

Theorem 19. Let X, Y be Hilbert spaces and T ∈ B(X, Y). Then, we have

   cl(R(T)) = N(T∗)⊥,   (23)
   N(T) = R(T∗)⊥,   (24)

where cl(·) denotes the closure.

Proof. We first show (23). If y ∈ R(T), there exists a z ∈ X such that y = Tz. Then, we have

   ⟨y, ỹ⟩ = ⟨Tz, ỹ⟩ = ⟨z, T∗ỹ⟩ = 0,   ỹ ∈ N(T∗),

which implies that R(T) ⊆ N(T∗)⊥. Since N(T∗)⊥ is closed, it follows that cl(R(T)) ⊆ N(T∗)⊥. On the other hand, if y ∈ R(T)⊥, we have

   0 = ⟨Tx, y⟩ = ⟨x, T∗y⟩,   x ∈ X,

which implies that T∗y = 0. This means that R(T)⊥ ⊆ N(T∗). By taking orthogonal complements we get N(T∗)⊥ ⊆ R(T)⊥⊥ = cl(R(T)). To prove (24), we first apply (23) to T∗, use T∗∗ = T, and take orthogonal complements.

An equivalent formulation of this theorem is that if T ∈ B(X, Y) for two Hilbert spaces X, Y, then Y can be written as the orthogonal direct sum

   Y = cl(R(T)) ⊕ N(T∗).

The equation y = Tx has a solution x ∈ X if and only if y ∈ R(T). Therefore, if T ∈ B(X, Y) has closed range, the equation y = Tx has a solution x ∈ X if and only if y ⊥ N(T∗).

Theorem 20. Let X , Y be Hilbert spaces and T ∈ B(X , Y). Then, R(T) is closed if and only
if R(T∗ ) is closed.

Proof. We show that R(T∗) closed implies that R(T) is closed. We have the following situation:

   X --T--> cl(R(T)) ⊕ R(T)⊥ = Y --T∗--> R(T∗) ⊆ X.

Using (23), it follows that the mapping

   U : cl(R(T)) → R(T∗),   y ↦ T∗(y)

is a bounded invertible linear operator. Since R(T∗) is closed in X, which is complete, R(T∗) is itself complete and hence a Hilbert space in its own right. Therefore, the adjoint

   U∗ : R(T∗) → cl(R(T))

exists and is a bounded invertible linear operator (with inverse (U∗)⁻¹ = (U⁻¹)∗). We now show that cl(R(T)) ⊆ R(T), which implies that R(T) is closed. Let y ∈ cl(R(T)) and denote by P the orthogonal projection onto cl(R(T)). Then, y = U∗(x) for some x ∈ R(T∗), and we have

   ⟨y, z⟩ = ⟨y, Pz⟩ = ⟨U∗x, Pz⟩ = ⟨x, UPz⟩ = ⟨x, T∗Pz⟩ = ⟨x, T∗z⟩ = ⟨Tx, z⟩,   z ∈ Y.

Therefore, y = Tx ∈ R(T).

The following lemma gives a condition that ensures the existence of a right-inverse.

Lemma 10. Let X, Y be Hilbert spaces and T ∈ B(X, Y) with closed range R(T). Then there exists a unique operator T† ∈ B(Y, X) satisfying

   N(T†) = R(T)⊥,   (25)
   R(T†) = N(T)⊥,   (26)
   TT†y = y,   y ∈ R(T).   (27)

Proof. We have the orthogonal direct sums

   X = N(T) ⊕ N(T)⊥,
   Y = R(T) ⊕ R(T)⊥.

To prove the existence of T†, we consider the linear operator

   T̃ = T|_{N(T)⊥} : N(T)⊥ → R(T).

It is clear that T̃ is linear and bounded. T̃ is injective because if T̃x = 0, it follows that x ∈ N(T) ∩ N(T)⊥ = {0}. To show that T̃ is surjective, let y ∈ R(T). Then, there exists an x ∈ X such that y = T(x). We can write x = u + v with u ∈ N(T) and v ∈ N(T)⊥. Therefore, y = T(u + v) = T(u) + T(v) = T(v) = T̃(v) ∈ R(T̃). It follows that T̃⁻¹ exists and T̃⁻¹ ∈ B(R(T), N(T)⊥) by Theorem 18. We define the operator T† by

   T†(y) = T̃⁻¹(y),   y ∈ R(T),   (28)
   T†(y) = 0,   y ∈ R(T)⊥,   (29)

extended linearly to Y = R(T) ⊕ R(T)⊥.

It follows immediately that T† has the desired properties. To prove uniqueness, suppose that T†1 and T†2 fulfill Properties (25)–(27) and let y ∈ Y. We can decompose y = w + z with w ∈ R(T) and z ∈ R(T)⊥. Then, we have

   T†i(y) = T†i(w) + T†i(z) = T†i(w) = T̃⁻¹(w),   i = 1, 2.

Therefore, T†1 = T†2.

Definition 38 (Pseudo-inverse). Let X, Y be Hilbert spaces and T ∈ B(X, Y) with closed range R(T). The unique operator T† ∈ B(Y, X) satisfying Properties (25)–(27) is called the pseudo-inverse of T.

Proposition D.1 (Properties of T† ). Let X , Y be Hilbert spaces and T ∈ B(X , Y) with closed
range R(T). Then, the following properties hold.

(a) The orthogonal projection of Y onto R(T) is given by TT† .

(b) The orthogonal projection of X onto R(T† ) is given by T† T.

(c) T∗ has closed range and (T∗ )† = (T† )∗ .

(d) On R(T), the operator T† is given explicitly by T† = T∗ (TT∗ )−1 .

(e) If T is surjective, then given y ∈ Y, the equation Tx = y has a unique solution of minimal
norm, namely x = T† y.

Proof. • (a) Follows from Theorem 14 and Properties (25) and (27);

• (b): Property (26) implies that T† Tx = 0 for all x ∈ R(T† )⊥ . Now let x ∈ R(T† ). Then,
x = T† y for some y ∈ Y. Decomposing y = u + v with u ∈ R(T) and v ∈ R(T)⊥ , Property
(25) implies that x = T† u. Therefore, Property (26) implies that T† Tx = T† TT† u = T† u = x.
Theorem 14 now shows that T† T is an orthogonal projection onto R(T† );

• (c): It follows from Theorem 20 that T∗ has closed range. This implies that T∗ † is well
defined and is the unique operator satisfying
N (T∗ † ) = R(T∗ )⊥
R(T∗ † ) = N (T∗ )⊥
T∗ T∗ † x = x, x ∈ R(T∗ ).

Furthermore, it follows from (25)–(27) and (23) together with (24), that T† satisfies the
same properties. Therefore, (T∗ )† = (T† )∗ ;
• (d): It follows from R(T) = N(T∗)⊥ and N(T) = R(T∗)⊥ that the operator U : R(T) → R(T), y ↦ TT∗y, is invertible, and the operator defined as T∗(TT∗)⁻¹ on R(T) and as 0 on R(T)⊥ fulfills (25)–(27);

• (e): x = T†y is a solution of y = Tx, i.e., y = TT†y. Suppose that y = Tz for some z ∈ X and set e = T†y − z. It follows that e ∈ N(T). Since T†y ∈ N(T)⊥, we have e ⊥ T†y, which implies that ‖z‖^2 = ‖T†y − e‖^2 = ‖T†y‖^2 + ‖e‖^2, which is minimal for e = 0.
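In finite dimensions with T of full row rank, these properties can be checked against np.linalg.pinv. A sketch (assuming NumPy; the specific matrix is illustrative) verifying property (d) and the minimal-norm property (e):

import numpy as np

rng = np.random.default_rng(8)
T = rng.standard_normal((2, 4))            # full row rank, hence surjective with closed range

T_dag = np.linalg.pinv(T)
print(np.allclose(T_dag, T.T @ np.linalg.inv(T @ T.T)))   # property (d): T^dagger = T* (T T*)^{-1}
print(np.allclose(T @ T_dag, np.eye(2)))                  # T T^dagger = I on R(T) = R^2

y = rng.standard_normal(2)
x = T_dag @ y                               # candidate minimal-norm solution of T x = y
# Any other solution differs from x by an element of N(T); such solutions have larger norm.
_, _, Vt = np.linalg.svd(T)
n = Vt[-1]                                  # a unit vector in the null space of T
x_other = x + 0.7 * n
print(np.allclose(T @ x_other, y))          # still a solution
print(np.linalg.norm(x) < np.linalg.norm(x_other))        # True: T^dagger y has minimal norm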

E Application to frame theory


Let {fk}_{k=0}^{∞} be a frame with frame bounds 0 < A ≤ B < ∞ for a Hilbert space X. It has been seen in the lecture that the frame operator

   S : X → X,   x ↦ Σ_{k=0}^{∞} ⟨x, fk⟩ fk

is bounded, invertible, self-adjoint, and positive, and that {S⁻¹fk}_{k=0}^{∞} is a frame with frame bounds 0 < B⁻¹ ≤ A⁻¹ < ∞. In fact, {Tfk}_{k=0}^{∞} is a frame for a larger class of operators T, as stated in the following theorem.

Theorem 21. Let X be a Hilbert space, T ∈ B(X, X) with closed range R(T), and let {fk}_{k=0}^{∞} be a frame with

   A ‖x‖^2 ≤ Σ_{k=0}^{∞} |⟨x, fk⟩|^2 ≤ B ‖x‖^2,   x ∈ X.

Then, {Tfk}_{k=0}^{∞} is a frame for cl(span{Tfk}_{k=0}^{∞}) with

   A ‖T†‖⁻² ‖y‖^2 ≤ Σ_{k=0}^{∞} |⟨y, Tfk⟩|^2 ≤ B ‖T‖^2 ‖y‖^2,   y ∈ cl(span{Tfk}_{k=0}^{∞}).   (30)

Proof. In order to prove Theorem 21, we make use of the following lemma, which shows that it is
enough to check the frame condition on a dense set.

Lemma 11. Suppose that {fk}_{k=0}^{∞} is a sequence of elements in X and there exist constants 0 < A ≤ B < ∞ such that

   A ‖x‖^2 ≤ Σ_{k=0}^{∞} |⟨x, fk⟩|^2 ≤ B ‖x‖^2   (31)

for all x in a dense subset of X. Then (31) holds for all x ∈ X.

The upper frame bound in (30) follows from

   Σ_{k=0}^{∞} |⟨y, Tfk⟩|^2 = Σ_{k=0}^{∞} |⟨T∗y, fk⟩|^2 ≤ B ‖T∗y‖^2 ≤ B ‖T‖^2 ‖y‖^2,   y ∈ X.

Now we prove that the lower frame bound in (30) holds for all y ∈ span{Tfk}_{k=0}^{∞}. We know that the operator TT† is the orthogonal projection onto R(T) and, in particular, it is self-adjoint. Let y ∈ span{Tfk}_{k=0}^{∞}. Then, y ∈ R(T), and we obtain that

   y = (TT†)y = (TT†)∗y = (T†)∗T∗y.

This gives

   ‖y‖^2 ≤ ‖(T†)∗‖^2 ‖T∗y‖^2   (32)
         = ‖T†‖^2 ‖T∗y‖^2.   (33)

Furthermore, we have

   ‖T∗y‖^2 ≤ (1/A) Σ_{k=0}^{∞} |⟨T∗y, fk⟩|^2 = (1/A) Σ_{k=0}^{∞} |⟨y, Tfk⟩|^2.   (34)

Plugging (34) into (32) gives

   ‖y‖^2 ≤ (‖T†‖^2 / A) Σ_{k=0}^{∞} |⟨y, Tfk⟩|^2,

which shows that the lower frame condition is satisfied for all y ∈ span{Tfk}_{k=0}^{∞}. Using Lemma 11, we can conclude that {Tfk}_{k=0}^{∞} forms a frame for cl(span{Tfk}_{k=0}^{∞}).

In the special case where T is a surjective bounded operator, T has closed range. Furthermore, every y ∈ Y can be written as y = Tx, where x ∈ X. Since {fk}_{k=0}^{∞} is a frame, one has

   x = Σ_{k=0}^{∞} ⟨x, S⁻¹fk⟩ fk,

and it follows that

   y = Σ_{k=0}^{∞} ⟨x, S⁻¹fk⟩ Tfk,

which shows that Y = cl(span{Tfk}_{k=0}^{∞}). This result leads to the following corollary.

Corollary E.1. Assume that {fk}_{k=0}^{∞} is a frame for X with frame bounds 0 < A ≤ B < ∞ and that T : X → Y is a bounded surjective operator. Then {Tfk}_{k=0}^{∞} is a frame for Y with frame bounds A ‖T†‖⁻² and B ‖T‖^2.

As a consequence, if we have at our disposal a frame, we can construct many other frames by applying surjective operators to it.
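A finite-dimensional sketch of these results (assuming NumPy; the so-called Mercedes-Benz frame of three unit vectors in R^2 is used purely for illustration): it shows the frame operator, the canonical dual frame, perfect reconstruction, and the frame obtained by applying an invertible (hence surjective) operator T as in Corollary E.1.

import numpy as np

# Mercedes-Benz frame: three unit vectors in R^2, a tight frame with A = B = 3/2.
angles = np.array([np.pi / 2, np.pi / 2 + 2 * np.pi / 3, np.pi / 2 + 4 * np.pi / 3])
F = np.stack([np.cos(angles), np.sin(angles)], axis=0)    # columns f_0, f_1, f_2

S = F @ F.T                                # frame operator S x = sum_k <x, f_k> f_k
print(S)                                   # (3/2) * identity, so A = B = 3/2
dual = np.linalg.inv(S) @ F                # canonical dual frame S^{-1} f_k

x = np.array([0.3, -1.1])
print(np.allclose(F @ (dual.T @ x), x))    # reconstruction: x = sum_k <x, S^{-1}f_k> f_k

# Corollary E.1: the image of a frame under a bounded surjective operator is again a frame.
T = np.array([[2.0, 1.0], [0.0, 1.0]])     # invertible, hence surjective
TF = T @ F
ST = TF @ TF.T                             # frame operator of {T f_k}
print(np.linalg.eigvalsh(ST))              # its eigenvalues give frame bounds of the new frame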

References
[1] J. K. Hunter and B. Nachtergaele, Applied Analysis. Singapore: World Scientific Publishing,
2005.

[2] S. B. Damelin and W. Miller, The Mathematics of Signal Processing, ser. Cambridge texts in
applied mathematics. Cambridge, UK: Cambridge University Press, 2012.

[3] E. M. Stein and R. Shakarchi, Real Analysis: Measure Theory, Integration, and Hilbert Spaces, ser. Princeton Lectures in Analysis. Princeton, NJ, USA: Princeton University Press, 2005.

[4] O. Christensen, An Introduction to Frames and Riesz Bases, ser. Applied and Numerical
Harmonic Analysis. Boston, MA, USA: Birkhäuser, 2003.

[5] C. Heil, “Operators on Hilbert spaces,” Functional analysis lecture notes, February 2006.

