
yybook 2012/7/9 page 59
Chapter 3
Inner Product and Hilbert Spaces
Most readers may already be familiar with the concept of inner products. An inner product space is a vector space equipped with a special kind of metric (a concept measuring length, etc.) called an inner product. This naturally induces a norm, so an inner product space also becomes a normed linear space. Generally speaking, however, norms measure only the size of a vector, whereas inner products introduce more elaborate concepts related to angles and directions, such as orthogonality, and thus allow for much more detailed study.

In this chapter we start by introducing inner product spaces and Hilbert spaces and then discuss their fundamental properties, particularly the projection theorem, the existence of the orthogonal complement, and orthogonal expansion, and their application to best (optimal) approximation. The orthogonal expansion is later seen to give a basis for Fourier expansion, and the best approximation in this context is important for various engineering applications.
3.1 Inner Product Spaces
We start with the following definition without much intuitive justification.

Definition 3.1.1. Let $X$ be a vector space over $K$. An inner product is a mapping from the product space $X \times X$ into $K$,$^{38}$
$$(\cdot, \cdot) : X \times X \to K,$$
that satisfies the following properties for all $x, y, z \in X$ and every scalar $\alpha \in K$:
1. $(x, x) \ge 0$, and equality holds only when $x = 0$.
2. $(y, x) = \overline{(x, y)}$.
3. $(\alpha x, y) = \alpha (x, y)$.
4. $(x + y, z) = (x, z) + (y, z)$.

$^{38}$A product of sets $X$, $Y$ is the set of all ordered pairs $\{(x, y) : x \in X,\ y \in Y\}$. See Appendix A for more details (p. 234).
Downloaded 11/20/13 to 190.144.171.70. Redistribution subject to SIAM license or copyright; see http://www.siam.org/journals/ojsa.php
The reader may already be familiar with this definition, but let us add some explanation for its background. In a more elementary situation, an inner product may be introduced as
$$(x, y) := \|x\|\|y\|\cos\theta.$$
Here $\|x\|$ is supposed to represent the length of the vector $x$, and $\theta$ is the angle between $x$ and $y$. A drawback of this more intuitive definition is that concepts such as length and angle are not defined in general vector spaces, so it is not feasible to attempt to define an inner product based on these undefined quantities. The right procedure is rather to derive these concepts from that of inner products.
A popular example is the following standard inner product defined on $\mathbb{R}^n$:
$$(x, y) := \sum_{i=1}^{n} x_i y_i. \qquad (3.1)$$
The space $\mathbb{R}^n$ equipped with this inner product is called the $n$-dimensional Euclidean space. While this gives a perfect example of an inner product in $\mathbb{R}^n$, it is not the only one. In fact, let $A = (a_{ij})$ be a positive definite matrix, i.e., a symmetric matrix with positive eigenvalues only. Then
$$(x, y)_A := (Ax, y) = \sum_{i,j=1}^{n} a_{ji} x_i y_j \qquad (3.2)$$
also induces an inner product. Here $(Ax, y)$ on the right-hand side denotes the standard inner product. Unless $A$ is the identity matrix, this defines an inner product different from the standard one.
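The weighted inner product (3.2) is easy to check numerically. A minimal NumPy sketch (the matrix $A$ and the vectors below are arbitrary choices, not from the text) verifying symmetry and positivity:

```python
import numpy as np

# A symmetric positive definite matrix (both eigenvalues are positive).
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])

def ip_A(x, y, A=A):
    """Weighted inner product (x, y)_A := (Ax, y) on R^n."""
    return (A @ x) @ y

x = np.array([1.0, -2.0])
y = np.array([0.5, 4.0])

# Symmetry: (x, y)_A == (y, x)_A because A is symmetric.
print(ip_A(x, y), ip_A(y, x))   # -20.0 -20.0
# Positivity: (x, x)_A > 0 for x != 0.
print(ip_A(x, x) > 0)           # True
```

For a nonsymmetric or indefinite $A$ these checks fail, which is why positive definiteness is required.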
A space $X$ equipped with inner product $(\cdot, \cdot)$ is called an inner product space.$^{39}$ Let $(X, (\cdot, \cdot))$ be an inner product space. Then we can define the length of the vector $x$ as
$$\|x\| := \sqrt{(x, x)}. \qquad (3.3)$$
As we see below in Proposition 3.1.8, this definition satisfies the axioms of norms (2.2)--(2.4). Hence an inner product space can also be regarded as a normed space.

When two vectors $x, y$ satisfy $(x, y) = 0$, we say that $x, y$ are mutually orthogonal, and we write $x \perp y$. If $x \perp m$ for every $m$ in a subset $M$, we also write $x \perp M$.

If we introduce a different inner product in a vector space, then the notion of the length of a vector will naturally change, so the resulting inner product space should be regarded as a different inner product space.
Example 3.1.2. The $n$-dimensional complex vector space $\mathbb{C}^n$ endowed with the inner product
$$(x, y) := \sum_{i=1}^{n} x_i \overline{y_i} = y^{*} x \quad \text{(note the order of $x$ and $y$)},$$
where $x = [x_1, \ldots, x_n]^T$ and $y = [y_1, \ldots, y_n]^T$. This is called the $n$-dimensional complex Euclidean space or the unitary space. The inner product here is an extension of the standard inner product in $\mathbb{R}^n$ and is also referred to under the same terms.
$^{39}$If this space is complete, then it is called a Hilbert space. An inner product space (not necessarily complete) is sometimes called a pre-Hilbert space.
Example 3.1.3. Let us give a less trivial example of an inner product space. Let $C[0, 1]$ be the vector space consisting of all $K$-valued ($K = \mathbb{C}$ or $\mathbb{R}$) continuous functions on the interval $[0, 1]$. Then
$$(f, g) := \int_0^1 f(t)\overline{g(t)}\,dt \qquad (3.4)$$
defines an inner product in $C[0, 1]$.

Let us first show some basic properties of inner products. We then show that an inner product space naturally becomes a normed linear space.
Proposition 3.1.4. Let $X$ be an inner product space.$^{40}$ For every $x, y, z$ in $X$ and every scalar $\alpha \in K$, the following properties hold:
1. $(x, y + z) = (x, y) + (x, z)$.
2. $(x, \alpha y) = \overline{\alpha}(x, y)$.
3. $(x, 0) = 0 = (0, x)$.
4. If $(x, z) = 0$ for every $z \in X$, then $x = 0$.
5. If $(x, z) = (y, z)$ for every $z \in X$, then $x = y$.

Proof. The proofs for 1 and 2 are easy and left as an exercise. One needs only use property 2 in Definition 3.1.1 to switch the order of elements and then invoke properties 3 and 4 in the same definition.

Property 3 above follows directly from $(0x, x) = 0 \cdot (x, x) = 0$. Property 4 reduces to property 1 of Definition 3.1.1 by putting $z = x$.

Finally, property 5 follows by applying property 4 to $x - y$.
Theorem 3.1.5 (Cauchy--Schwarz).$^{41}$ Let $(X, (\cdot, \cdot))$ be an inner product space. For every $x, y \in X$,
$$|(x, y)| \le \|x\|\|y\|, \qquad (3.5)$$
where $\|x\| = \sqrt{(x, x)}$ as defined by (3.3). The equality holds only when $x, y$ are linearly dependent.

Proof. If $y = 0$, then the equality clearly holds. So let us assume $y \ne 0$. For every scalar $\lambda$,
$$0 \le (x - \lambda y, x - \lambda y) = (x, x) - \lambda(y, x) - \overline{\lambda}(x, y) + |\lambda|^2 (y, y). \qquad (3.6)$$
To see the idea, let us temporarily assume that $(x, y)$ is real-valued and also that $\lambda$ is real. Then the right-hand side of (3.6) becomes
$$\|y\|^2 \lambda^2 - 2(x, y)\lambda + \|x\|^2 \ge 0.$$
$^{40}$It is often bothersome to write $(X, (\cdot, \cdot))$ to explicitly indicate the dependence on the equipped inner product, so we often write $X$. When we want to show the dependence on the inner product explicitly, we write $(X, (\cdot, \cdot))$.
$^{41}$Often called Schwarz's inequality.
Completing the square, this is equal to
$$\left( \|y\|\lambda - \frac{(x, y)}{\|y\|} \right)^2 - \left( \frac{(x, y)}{\|y\|} \right)^2 + \|x\|^2.$$
This attains its minimum
$$-\left( \frac{(x, y)}{\|y\|} \right)^2 + \|x\|^2$$
when $\lambda = (x, y)/\|y\|^2$, and hence $|(x, y)|^2 \le \|x\|^2 \|y\|^2$.

Considering this in the general case (i.e., $\lambda$ and $(x, y)$ are not necessarily real-valued), let us substitute $\lambda = (x, y)/\|y\|^2$ into (3.6). This yields
$$0 \le \|x\|^2 - \frac{|(x, y)|^2}{\|y\|^2},$$
and hence
$$|(x, y)| \le \|x\|\|y\|$$
follows.

Conversely, suppose that $x, y$ are linearly dependent and that $y \ne 0$. Then $x = \alpha y$ for some scalar $\alpha$. Then $|(x, y)| = |(\alpha y, y)| = |\alpha||(y, y)| = |\alpha|\|y\|^2 = \|x\|\|y\|$, and hence the equality holds.
Schwarz's inequality readily yields the following triangle inequality.

Proposition 3.1.6. Let $X$ be an inner product space. Then
$$\|x + y\| \le \|x\| + \|y\|$$
for every $x, y$ in $X$.

Proof. Apply Schwarz's inequality to
$$\|x + y\|^2 = \|x\|^2 + 2\operatorname{Re}(x, y) + \|y\|^2 \le \|x\|^2 + 2|(x, y)| + \|y\|^2.$$
Proposition 3.1.7 (parallelogram law). Let $X$ be an inner product space. For every $x, y$ in $X$,
$$\|x + y\|^2 + \|x - y\|^2 = 2\|x\|^2 + 2\|y\|^2.$$

Proof. The proof is immediate from
$$\|x + y\|^2 = (x + y, x + y) = \|x\|^2 + (x, y) + (y, x) + \|y\|^2,$$
$$\|x - y\|^2 = \|x\|^2 - (x, y) - (y, x) + \|y\|^2.$$
Now let $(X, (\cdot, \cdot))$ be an inner product space. As we have already seen in (3.3), we can introduce a norm into this space by
$$\|x\| := \sqrt{(x, x)}. \qquad (3.7)$$
When we wish to explicitly indicate that this is a norm in the space $X$, it may be denoted as $\|x\|_X$ with subscript $X$.

Let us ensure that this indeed satisfies the axioms for norms (Definition 2.1.2).

Proposition 3.1.8. The norm (3.7) defined on the inner product space $(X, (\cdot, \cdot))$ satisfies the axioms of norms (2.2)--(2.4). Thus an inner product space can always be regarded as a normed linear space.

Proof. That $\|x\| \ge 0$ and $\|x\| = 0$ only when $x = 0$ readily follow from the axioms of inner products (Definition 3.1.1).

The triangle inequality $\|x + y\| \le \|x\| + \|y\|$ has already been proven in Proposition 3.1.6. For scalar products, the axioms of inner products and Proposition 3.1.4 yield
$$\|\alpha x\|^2 = (\alpha x, \alpha x) = \alpha\overline{\alpha}(x, x) = |\alpha|^2 (x, x) = |\alpha|^2 \|x\|^2.$$
Notice that the inner product is continuous with respect to the topology induced by the norm as defined above. That is, the following proposition holds.

Proposition 3.1.9 (continuity of inner products). Let $X$ be an inner product space. Suppose that sequences $\{x_n\}$ and $\{y_n\}$ satisfy $\|x_n - x\| \to 0$, $\|y_n - y\| \to 0$ as $n \to \infty$. Then $(x_n, y_n) \to (x, y)$.

Proof. Since $x_n \to x$, there exists $M$ such that $\|x_n\| \le M$ for all $n$ (Chapter 2, Problem 2). Then the Cauchy--Schwarz inequality (Theorem 3.1.5, p. 61) implies
$$|(x_n, y_n) - (x, y)| = |(x_n, y_n - y) + (x_n - x, y)| \le |(x_n, y_n - y)| + |(x_n - x, y)| \le M\|y_n - y\| + \|x_n - x\|\|y\|.$$
The right-hand side converges to 0.
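A quick numerical illustration of this continuity (a sketch; the vectors and the $1/n$ perturbation are arbitrary choices, not from the text):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
y = np.array([-1.0, 0.5, 2.0])
e = np.ones(3)

target = x @ y        # (x, y) = 6.0
gaps = []
for n in (1, 10, 100, 1000):
    xn = x + e / n    # ||x_n - x|| -> 0
    yn = y - e / n    # ||y_n - y|| -> 0
    gaps.append(abs(xn @ yn - target))

print(gaps)           # |(x_n, y_n) - (x, y)| shrinks toward 0
```

The gap here is exactly $4.5/n + 3/n^2$, mirroring the bound $M\|y_n - y\| + \|x_n - x\|\|y\|$ in the proof.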
As we have noted, an inner product space can always be regarded as a normed space by introducing a norm by (3.7). This normed space is, however, not necessarily complete, i.e., not necessarily a Banach space. For example, the space $C[0, 1]$ in Example 3.1.3 with inner product (3.4) is not complete$^{42}$ (cf. Problem 4). When it is complete, it is called a Hilbert space. This is the topic of the next section.
3.2 Hilbert Space

The following definition arises naturally from what we have seen so far.

Definition 3.2.1. An inner product space $(X, (\cdot, \cdot))$ that becomes a Banach space with respect to the norm naturally induced by (3.7) is called a Hilbert space.

We will see in due course that a Hilbert space exhibits various properties that resemble those in finite-dimensional spaces. But before going into detail, let us give two typical examples of a Hilbert space.
$^{42}$To be precise, with respect to the norm induced by it.
Example 3.2.2. The space $L^2[a, b]$ of all square integrable functions on the interval $[a, b]$ in the sense of Lebesgue, given in Example 2.1.18 (p. 45) by setting $p = 2$. Define an inner product by
$$(f, g) := \int_a^b f(t)\overline{g(t)}\,dt. \qquad (3.8)$$
Clearly
$$|f(t)\overline{g(t)}| \le |f(t)|^2 + |g(t)|^2,$$
so the integral on the right-hand side of (3.8) exists and is finite. For $f, g \in L^2[a, b]$,
$$|f(t) + g(t)|^2 \le |f(t)|^2 + 2|f(t)g(t)| + |g(t)|^2$$
(at each $t$), so that $f + g$ also belongs to $L^2[a, b]$.

The crucial step is in guaranteeing that (3.8) is well defined for $L^2[a, b]$ and that $L^2[a, b]$ constitutes a vector space; verifying that (3.8) indeed satisfies the axioms of inner products is a rather easy exercise and hence is left to the reader.

Thus $L^2[a, b]$ is an inner product space. Further, it is clear that the norm
$$\|\varphi\|_2 := \left( \int_a^b |\varphi(t)|^2\,dt \right)^{1/2}$$
defined by setting $p = 2$ in (2.22) is indeed induced from this inner product via (3.7).

It remains only to show the completeness. But, as with $L^1[a, b]$, this requires knowledge of the Lebesgue integral, so we omit its proof here.
Example 3.2.3. The sequence space $\ell^2$ defined in Example 2.1.17. Introduce an inner product by
$$(\{x_n\}, \{y_n\}) := \sum_{n=1}^{\infty} x_n \overline{y_n}. \qquad (3.9)$$
Let us show that this is indeed an inner product. Note that $|ab| \le a^2 + b^2$ for every real $a, b$. Then
$$\sum_{n=1}^{\infty} |x_n \overline{y_n}| \le \sum_{n=1}^{\infty} \left( |x_n|^2 + |y_n|^2 \right) < \infty,$$
so that (3.9) always converges. It is almost trivial to see that the axioms for inner products (Definition 3.1.1, p. 59) are satisfied. Further, as with Example 3.2.2 above, for $\{x_n\}, \{y_n\} \in \ell^2$,
$$|x_n + y_n|^2 \le |x_n|^2 + 2|x_n y_n| + |y_n|^2,$$
and hence $\{x_n + y_n\} \in \ell^2$. This shows that $\ell^2$ constitutes a vector space.

It remains to show the completeness. Let $x^{(k)} := \{x_n^{(k)}\}_{n=1}^{\infty}$, $k = 1, 2, \ldots$, be a Cauchy sequence of elements in $\ell^2$. Note that
$$|x_n^{(k)} - x_n^{(\ell)}| \le \left\| x^{(k)} - x^{(\ell)} \right\|_2$$
for all $n, k, \ell$. Then for each $n$, $x_n^{(k)}$, $k = 1, 2, \ldots$, constitute a Cauchy sequence in $K$. Hence there exists $x_n \in K$ such that $|x_n^{(k)} - x_n| \to 0$ as $k \to \infty$. Define $x := \{x_n\}_{n=1}^{\infty}$. Showing $x \in \ell^2$ and $\|x^{(k)} - x\|_2 \to 0$ ($k \to \infty$) would complete the proof.

Take any $\varepsilon > 0$. There exists $N$ such that whenever $k, \ell \ge N$, $\|x^{(k)} - x^{(\ell)}\|_2 \le \varepsilon$. Take any positive integer $L$. Then
$$\left( \sum_{n=1}^{L} |x_n^{(k)} - x_n^{(\ell)}|^2 \right)^{1/2} \le \left\| x^{(k)} - x^{(\ell)} \right\|_2 \le \varepsilon$$
whenever $k, \ell \ge N$. Letting $\ell$ go to infinity shows that
$$\left( \sum_{n=1}^{L} |x_n^{(k)} - x_n|^2 \right)^{1/2} \le \varepsilon$$
for each fixed $L$ and for $k \ge N$. Since this estimate is independent of $L$, we can let $L$ tend to infinity to obtain
$$\left\| x^{(k)} - x \right\|_2 = \left( \sum_{n=1}^{\infty} |x_n^{(k)} - x_n|^2 \right)^{1/2} \le \varepsilon \qquad (3.10)$$
whenever $k \ge N$. Then $x = x - x^{(k)} + x^{(k)}$ ($k \ge N$) implies $x \in \ell^2$ because $x - x^{(k)}$ and $x^{(k)}$ both belong to $\ell^2$. It also follows from (3.10) that $\|x^{(k)} - x\|_2 \to 0$.
We will subsequently see that in a Hilbert space there always exists a perpendicular, as well as its foot, from an arbitrary element to a given, arbitrary, closed subspace. This in turn implies the existence of a direct sum decomposition for an arbitrary closed subspace. These properties are certainly valid in finite-dimensional Euclidean spaces but not necessarily so for Banach spaces. It is these properties that enable us to solve the minimum distance (and hence the minimum-norm approximation) problem, and this in turn (see section 3.3, p. 73) gives rise to a fairly complete treatment of Fourier expansion in the space $L^2$ (see subsection 3.2.2 (p. 69) and section 7.1 (p. 141)).

The following theorem gives the foundation for this fact.
Theorem 3.2.4. Let $X$ be a Hilbert space, and $M$ a nonempty closed convex subset of $X$. For every $x \in X$, there exists $m_0 \in M$ that minimizes $\|x - m\|$ among all $m \in M$. Furthermore, such an $m_0$ is unique.

A set $M$ is said to be convex if for every pair $x, y \in M$ and $0 < t < 1$, $tx + (1 - t)y \in M$ holds.
Proof of Theorem 3.2.4. Set
$$d := \inf_{m \in M} \|x - m\|. \qquad (3.11)$$
Since $M$ is nonempty, $d < \infty$. By the definition of infimum (see Lemma A.2.1 in Appendix A), for every $\varepsilon > 0$ there always exists $m \in M$ such that
$$d \le \|x - m\| < d + \varepsilon.$$
Hence for each positive integer $n$ there exists $m_n \in M$ such that
$$\|x - m_n\|^2 < d^2 + \frac{1}{n}. \qquad (3.12)$$
Then by the parallelogram law, Proposition 3.1.7 (p. 62), we have
$$\|(x - m_k) - (x - m_\ell)\|^2 + \|(x - m_k) + (x - m_\ell)\|^2 = 2\|x - m_k\|^2 + 2\|x - m_\ell\|^2 < 4d^2 + 2\left( \frac{1}{k} + \frac{1}{\ell} \right).$$
This yields
$$\|m_k - m_\ell\|^2 < 4d^2 + 2\left( \frac{1}{k} + \frac{1}{\ell} \right) - 4\left\| x - \frac{m_k + m_\ell}{2} \right\|^2.$$
Since $M$ is convex, $(m_k + m_\ell)/2$ is also an element of $M$, so that by (3.11)
$$\left\| x - \frac{m_k + m_\ell}{2} \right\|^2 \ge d^2$$
holds. Hence
$$\|m_k - m_\ell\|^2 < 4d^2 + 2\left( \frac{1}{k} + \frac{1}{\ell} \right) - 4d^2 = 2\left( \frac{1}{k} + \frac{1}{\ell} \right).$$
Since the last expression approaches 0 as $k, \ell \to \infty$, the sequence $\{m_n\}$ is a Cauchy sequence. The completeness of the Hilbert space then implies that there exists a limit $m_0 \in X$ such that $m_n \to m_0$ as $n \to \infty$. Furthermore, since $M$ is closed, $m_n \to m_0$ implies $m_0 \in M$. Hence by (3.11) we have
$$\|x - m_0\| \ge d.$$
On the other hand, letting $n \to \infty$ in (3.12), we obtain by the continuity of a norm (Lemma 2.1.7) that
$$\|x - m_0\| \le d.$$
Hence $\|x - m_0\| = d$. Thus there exists a point $m_0$ that attains the infimum of the distance.
Let us show the uniqueness. Suppose that there exists another $m'_0 \in M$ such that $\|x - m'_0\| = d$. Since $M$ is convex, $(m_0 + m'_0)/2 \in M$, and hence
$$\left\| x - \frac{m_0 + m'_0}{2} \right\| \ge d.$$
Applying the parallelogram law (Proposition 3.1.7) to $x - m_0$ and $x - m'_0$, we have
$$\|m_0 - m'_0\|^2 = 2\|x - m_0\|^2 + 2\|x - m'_0\|^2 - 4\left\| x - \frac{m_0 + m'_0}{2} \right\|^2 \le 2d^2 + 2d^2 - 4d^2 = 0.$$
Hence $m_0 = m'_0$.
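The theorem can be illustrated concretely. A minimal sketch (NumPy; the set and the point are arbitrary choices, not from the text) takes the closed convex set $M = \{m \in \mathbb{R}^2 : m \ge 0\}$, for which the unique minimizer is obtained by clipping negative coordinates, a standard fact about the nonnegative orthant:

```python
import numpy as np

# M = nonnegative orthant in R^2: a closed convex set.
x = np.array([1.5, -2.0])

# The closest point of M to x clips negative coordinates to 0.
m0 = np.maximum(x, 0.0)          # [1.5, 0.0]
d = np.linalg.norm(x - m0)       # 2.0

# Compare against a brute-force grid search over M.
grid = np.linspace(0.0, 4.0, 401)
best = min(np.hypot(x[0] - a, x[1] - b) for a in grid for b in grid)
print(m0, d, best)
```

The grid search never beats the clipped point, in line with the uniqueness asserted by Theorem 3.2.4; for a nonconvex set (say, the two isolated points $(0, 0)$ and $(3, -4)$ with $x$ midway between them) the minimizer need not be unique.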
3.2.1 Orthogonal Complements

Definition 3.2.5. Let $E$ be a subset of a Hilbert space $X$. A subset of $X$ given by
$$\{x \in X : (x, y) = 0 \text{ for all } y \in E\} \qquad (3.13)$$
is called the orthogonal complement of $E$ and is denoted by $E^\perp$ or by $X \ominus E$.

Proposition 3.2.6. The orthogonal complement $E^\perp$ of any subset $E$ of a Hilbert space $X$ is always a closed subspace.
Proof. This is easy and left as an exercise for the reader.
As a consequence of the completeness, there always exists the perpendicular foot to a closed subspace. Let us prove the following theorem.

Theorem 3.2.7 (projection theorem). Let $X$ be a Hilbert space and $M$ a closed subspace. For any $x \in X$ there exists a unique $x_0 \in M$ such that
$$(x - x_0, m) = 0 \quad \forall m \in M. \qquad (3.14)$$
The element $x_0$ is called the foot of the perpendicular from $x$ to $M$. Conversely, when (3.14) holds,
$$\|x - x_0\| \le \|x - m\| \quad \forall m \in M$$
follows. That is, the perpendicular from $x$ to $M$ gives the shortest distance from $x$ to $M$.
Proof. Let us first prove the uniqueness (if it exists) of such $x_0$ satisfying (3.14). Suppose that there exists another $x_1 \in M$ such that
$$(x - x_1, m) = 0 \quad \forall m \in M.$$
This readily implies $(x_0 - x_1, m) = 0$ for all $m \in M$. Since $x_0 - x_1$ is itself an element of $M$, we can substitute $x_0 - x_1$ into $m$ to obtain $(x_0 - x_1, x_0 - x_1) = 0$. Then by the first axiom of inner products (p. 59), $x_0 - x_1 = 0$ follows.
Let us now prove the existence. Since every subspace is clearly convex, Theorem 3.2.4 implies that there exists $x_0 \in M$ such that
$$\|x - x_0\| \le \|x - y\| \quad \forall y \in M.$$
For every $m \in M$ and a complex number $\lambda$, we have
$$\|x - x_0 - \lambda m\|^2 = \|x - x_0\|^2 - \overline{\lambda}(x - x_0, m) - \lambda(m, x - x_0) + |\lambda|^2 \|m\|^2. \qquad (3.15)$$
Since $(x - x_0, m) = e^{i\theta}|(x - x_0, m)|$ for some $\theta$, we may set $\lambda = t e^{i\theta}$ ($t$ is a real parameter here) in the above and obtain
$$0 \le \|x - x_0 - \lambda m\|^2 - \|x - x_0\|^2 = -2t|(x - x_0, m)| + t^2 \|m\|^2.$$
Hence, for $t > 0$, we have
$$|(x - x_0, m)| \le \frac{t}{2}\|m\|^2.$$
Letting $t \to 0$, we obtain $(x - x_0, m) = 0$.

Conversely, suppose (3.14) holds. Then setting $\lambda = 1$ in (3.15) and using (3.14), we obtain
$$\|x - x_0 - m\|^2 = \|x - x_0\|^2 + \|m\|^2 \ge \|x - x_0\|^2.$$
Hence $x - x_0$ gives the shortest distance from $x$ to $M$.
This theorem yields the following theorem, which states that for every closed subspace $M$ of $X$, $X$ can be decomposed into the direct sum of $M$ and its orthogonal complement $M^\perp$.

Theorem 3.2.8. Let $X$ be a Hilbert space and $M$ a closed subspace. For any $x \in X$, take $x_0 \in M$ uniquely determined by Theorem 3.2.7, and then define
$$p_M(x) := x_0, \qquad p_{M^\perp}(x) := x - x_0.$$
Then
$$x = p_M(x) + p_{M^\perp}(x) \qquad (3.16)$$
gives a direct sum decomposition of $X$ into $M$ and $M^\perp$. (See Figure 3.1.)
Proof. That every $x$ is expressible as (3.16) and that $x_0 \in M$, $x - x_0 \in M^\perp$ are already shown in Theorem 3.2.7. Also, $M \cap M^\perp = \{0\}$ follows because $x \in M \cap M^\perp$ readily implies $(x, x) = 0$ by definition.
The projection theorem may often be used in the following shifted form.

Corollary 3.2.9. Let $X$ be a Hilbert space and $M$ a closed subspace. Take any $x$ in $X$ and set
$$V := x + M = \{x + m : m \in M\}.$$
Then there uniquely exists an element $x_0$ in $V$ with minimum norm, and it is characterized by $x_0 \perp M$.

The proof reduces to the projection theorem, Theorem 3.2.7, by shifting $V$ by $x$. This is left as an exercise for the reader.

Exercise 3.2.10. Prove Corollary 3.2.9.
Figure 3.1. Orthogonal complement (the decomposition $x = p_M(x) + p_{M^\perp}(x)$, with $p_M(x) \in M$ and $p_{M^\perp}(x) \in M^\perp$).
3.2.2 Orthogonal Expansion

Definition 3.2.11. A family $\{e_n\}_{n=1}^{\infty}$ of nonzero elements in a Hilbert space $X$ is said to be an orthogonal system if
$$(e_i, e_j) = 0, \quad i \ne j.$$
If, further, $\|e_i\| = 1$ for all $i$, then $\{e_n\}_{n=1}^{\infty}$ is called an orthonormal system.
If the family $\{e_1, \ldots, e_n\}$ is an orthonormal system, then it is a linearly independent family. Indeed, if
$$\sum_{j=1}^{n} \alpha_j e_j = 0,$$
then by taking the inner product of both sides with $e_k$, we obtain that
$$0 = \left( \sum_{j=1}^{n} \alpha_j e_j,\ e_k \right) = \alpha_k$$
holds for each $k$, and this shows the linear independence. Likewise, we can easily modify the proof to show that an orthogonal system is linearly independent.
Example 3.2.12. The family
$$e_n(t) := \frac{1}{\sqrt{2\pi}} e^{int}, \quad n = 0, \pm 1, \pm 2, \ldots,$$
in $L^2[-\pi, \pi]$ forms an orthonormal system. One can easily check by direct calculation that $(e_n, e_m) = \delta_{nm}$ ($\delta_{nm}$ denotes the Kronecker delta, i.e., the function that takes value 1 when $n = m$ and 0 otherwise).
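This orthonormality can also be verified numerically. A minimal sketch (assuming NumPy; the grid size is an arbitrary choice) approximates the $L^2[-\pi, \pi]$ inner product by a Riemann sum, which is exact up to rounding for these periodic integrands on a uniform grid:

```python
import numpy as np

N = 4000
t = np.linspace(-np.pi, np.pi, N, endpoint=False)
dt = 2 * np.pi / N

def e(n):
    """e_n(t) = exp(i n t) / sqrt(2 pi)."""
    return np.exp(1j * n * t) / np.sqrt(2 * np.pi)

def ip(f, g):
    """Riemann-sum approximation of the L^2[-pi, pi] inner product."""
    return np.sum(f * np.conj(g)) * dt

print(abs(ip(e(1), e(1))))   # ~ 1: normalized
print(abs(ip(e(1), e(2))))   # ~ 0: orthogonal
print(abs(ip(e(0), e(3))))   # ~ 0
```

The off-diagonal sums vanish because the sampled exponentials are roots of unity summing to zero over a full period.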
As we have just seen in the above example, an orthonormal system typically appears in the theory of Fourier series. Let us adopt the following definition by generalizing this.

Definition 3.2.13. Let $\{e_n\}$ be a given orthonormal system in a Hilbert space $X$. For every $x \in X$, $(x, e_n)$ is called the $n$th Fourier coefficient of $x$, and the series $\sum (x, e_n) e_n$ the Fourier series of $x$.

We should note, however, that at this stage this series is only a formal object, and we say nothing about its convergence. However, the following fact can easily be verified.
Theorem 3.2.14. Let $X$ be a Hilbert space and $\{e_k\}_{k=1}^{n}$ an orthonormal system in it. Take any $x \in X$. The Fourier series $\sum (x, e_k) e_k$ gives the best approximant among all elements of the form $\sum c_k e_k$ in the sense of norm, i.e., making $\|x - \sum c_k e_k\|$ minimal. The square of the error $\|x - \sum (x, e_k) e_k\|$ is given by
$$d^2 = \|x\|^2 - \sum_{k=1}^{n} |(x, e_k)|^2. \qquad (3.17)$$

Proof. Let $M := \operatorname{span}\{e_k\}_{k=1}^{n}$, i.e., the finite-dimensional subspace consisting of all linear combinations of $\{e_k\}_{k=1}^{n}$. By Theorem 2.3.2 (p. 52), $M$ is a closed subspace. By Theorem 3.2.7, the minimum of $\|x - m\|$, $m \in M$, is attained when $(x - m) \perp M$. Set $m = \sum c_k e_k$, and write down this condition to obtain
$$\left( x - \sum c_k e_k,\ e_i \right) = 0, \quad i = 1, \ldots, n. \qquad (3.18)$$
According to the orthonormality $(e_k, e_i) = \delta_{ki}$, we have
$$(x, e_i) = c_i (e_i, e_i) = c_i.$$
This gives precisely the Fourier coefficients.

Finally, the orthonormality of $\{e_k\}$ readily yields
$$\left\| x - \sum (x, e_k) e_k \right\|^2 = \left( x - \sum (x, e_k) e_k,\ x - \sum (x, e_k) e_k \right) = \|x\|^2 - \sum_{k=1}^{n} \overline{(x, e_k)}(x, e_k) = \|x\|^2 - \sum_{k=1}^{n} |(x, e_k)|^2,$$
which is (3.17).
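A finite-dimensional sketch of Theorem 3.2.14 (NumPy; the orthonormal pair and $x$ are arbitrary choices, not from the text): the Fourier coefficients give the best approximation, and the error obeys (3.17):

```python
import numpy as np

# An orthonormal pair {e1, e2} in R^3.
e1 = np.array([1.0, 1.0, 0.0]) / np.sqrt(2)
e2 = np.array([1.0, -1.0, 0.0]) / np.sqrt(2)
x = np.array([2.0, -1.0, 3.0])

coeffs = [x @ e1, x @ e2]                 # Fourier coefficients (x, e_k)
approx = coeffs[0] * e1 + coeffs[1] * e2  # Fourier partial sum

err2 = np.linalg.norm(x - approx) ** 2
err2_formula = x @ x - sum(c ** 2 for c in coeffs)   # (3.17)
print(err2, err2_formula)                 # both ~ 9.0

# Any other choice of coefficients does at least as badly.
other = 0.9 * coeffs[0] * e1 + 1.1 * coeffs[1] * e2
assert err2 <= np.linalg.norm(x - other) ** 2 + 1e-12
```

The residual error 9.0 here is exactly the squared component of $x$ outside $\operatorname{span}\{e_1, e_2\}$.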
Theorem 3.2.15 (Bessel's inequality). Let $\{e_k\}_{k=1}^{\infty}$ be an orthonormal system in a Hilbert space $X$. For every $x \in X$, the following inequality, called Bessel's inequality, holds:
$$\sum_{k=1}^{\infty} |(x, e_k)|^2 \le \|x\|^2. \qquad (3.19)$$
Proof. Take any $n$ and set $y_n := \sum_{k=1}^{n} (x, e_k) e_k$. Apply (3.17) to $y_n$ to obtain
$$\sum_{k=1}^{n} |(x, e_k)|^2 \le \|x\|^2.$$
Letting $n \to \infty$, we obtain (3.19).
The convergence of a series is of course defined by the convergence of the sequence of its partial sums, as with real series. But unlike the case of the reals, testing its convergence is in general not easy. However, for the case of series of linear sums with respect to an orthonormal system, as with Fourier series, the following simple criterion is available.
Proposition 3.2.16. Let $X$ be a Hilbert space and $\{e_n\}_{n=1}^{\infty}$ an orthonormal system. A necessary and sufficient condition for the series $\sum_{k=1}^{\infty} \alpha_k e_k$ with complex coefficients $\alpha_k$ to be convergent in $X$ is that
$$\sum_{k=1}^{\infty} |\alpha_k|^2 < \infty.$$

Proof. Define the $n$th partial sum $s_n$ by
$$s_n := \sum_{k=1}^{n} \alpha_k e_k,$$
and suppose that $s = \lim_{n \to \infty} s_n$ is convergent. Since
$$\left( \sum_{k=1}^{n} \alpha_k e_k,\ e_j \right) = \alpha_j,$$
the continuity of inner products implies $(s, e_k) = \alpha_k$. Then by Bessel's inequality (3.19),
$$\sum_{k=1}^{\infty} |\alpha_k|^2 = \sum_{k=1}^{\infty} |(s, e_k)|^2 \le \|s\|^2 < \infty.$$
Conversely, if $\sum_{k=1}^{\infty} |\alpha_k|^2 < \infty$, then for $m \ge n$ we have
$$\|s_m - s_n\|^2 = \left\| \sum_{k=n+1}^{m} \alpha_k e_k \right\|^2 = \sum_{k=n+1}^{m} |\alpha_k|^2 \to 0 \quad (n \to \infty).$$
That is, the sequence $\{s_n\}$ is a Cauchy sequence and is convergent by the completeness of $X$.
Definition 3.2.17. An orthonormal system $\{e_n\}_{n=1}^{\infty}$ in a Hilbert space $X$ is said to be complete if it has the property that the only element orthogonal to all $e_n$ is 0.
Some remarks may be in order. First, the term complete is confusing, since it has no relevance to the completeness in the sense of convergence of Cauchy sequences. Second, the condition above is equivalent to the property that every element of $X$ is expressible as an (infinite) linear combination of $\{e_n\}_{n=1}^{\infty}$, and hence we have the term completeness$^{43}$ for $\{e_n\}_{n=1}^{\infty}$.
Theorem 3.2.18. Let $\{e_n\}_{n=1}^{\infty}$ be a complete orthonormal system in a Hilbert space $X$. Then for every $x \in X$, the following equalities hold:
$$1.\quad x = \sum_{n=1}^{\infty} (x, e_n) e_n, \qquad (3.20)$$
$$2.\quad \|x\|^2 = \sum_{n=1}^{\infty} |(x, e_n)|^2. \qquad (3.21)$$
The second equality is called Parseval's identity.
Proof. Let us first note that Bessel's inequality (3.19) along with Proposition 3.2.16 imply that $\sum_{n=1}^{\infty} (x, e_n) e_n$ converges. Observe that
$$\left( x - \sum_{n=1}^{\infty} (x, e_n) e_n,\ e_j \right) = (x, e_j) - \sum_{n=1}^{\infty} (x, e_n) (e_n, e_j) = (x, e_j) - (x, e_j) = 0.$$
That is, $x - \sum_{n=1}^{\infty} (x, e_n) e_n$ is orthogonal to all $e_j$. Hence by the completeness of $\{e_n\}_{n=1}^{\infty}$, we have $x - \sum_{n=1}^{\infty} (x, e_n) e_n = 0$, i.e., (3.20) follows.

Clearly
$$\left\| \sum_{n=1}^{N} (x, e_n) e_n \right\|^2 = \sum_{n=1}^{N} |(x, e_n)|^2.$$
Letting $N \to \infty$, we have by the continuity of a norm that the left-hand side converges to $\|x\|^2$, while the right-hand side converges to $\sum_{n=1}^{\infty} |(x, e_n)|^2$. This proves (3.21).
Examining the proof above, we see that the completeness is the key to showing that the error $x - \sum_{n=1}^{\infty} (x, e_n) e_n$ is indeed 0. Conversely, if the family is not complete, i.e., there exists a nonzero element $x$ that is orthogonal to all $e_n$, then this expansion theorem is no longer valid. In fact, for such an element $x$, its Fourier expansion $\sum_{n=1}^{\infty} (x, e_n) e_n$ is clearly 0, but $x \ne 0$.

This is not an exceptional case at all. Indeed, for a given orthonormal system $\{e_n\}_{n=1}^{\infty}$, extracting one element, say, $e_1$, out of this system makes it noncomplete, and the condition above will not be satisfied. For example, in the space $\ell^2$, the family
$$e_2 = (0, 1, 0, \ldots),\ e_3 = (0, 0, 1, 0, \ldots),\ \ldots,\ e_n = (0, \ldots, 1, 0, \ldots),\ \ldots$$
is not complete. The element $e_1 = (1, 0, 0, \ldots)$ is missing here.
$^{43}$In the sense that nothing is missing.
The next proposition gives a characterization of completeness.

Proposition 3.2.19. Each of the following statements gives a necessary and sufficient condition for an orthonormal system $\{e_n\}_{n=1}^{\infty}$ in a Hilbert space $X$ to be complete:
1. $\operatorname{span}\{e_n : n = 1, 2, \ldots\}$ is dense$^{44}$ in $X$.
2. Parseval's identity, i.e.,
$$\|x\|^2 = \sum_{n=1}^{\infty} |(x, e_n)|^2,$$
holds for every $x \in X$.

Proof. The necessity is already shown in Theorem 3.2.18.

The sufficiency of condition 1 can be seen as follows: Suppose that $x$ is orthogonal to all $e_n$. Then it is orthogonal to $\operatorname{span}\{e_n : n = 1, 2, \ldots\}$. Since this space is dense in $X$, this implies that $x$ is orthogonal to every element of $X$ by the continuity of the inner product. Then by Proposition 3.1.4 (p. 61), $x = 0$ follows; that is, $\{e_n\}_{n=1}^{\infty}$ is complete.

To show that Parseval's identity yields completeness, assume that $\{e_n\}_{n=1}^{\infty}$ is not complete. Then there exists a nonzero $x$ that is orthogonal to all $e_n$. Clearly we have $\sum_{n=1}^{\infty} |(x, e_n)|^2 = 0$. On the other hand, $\|x\|^2 \ne 0$, and this contradicts Parseval's identity.
3.3 Projection Theorem and Best Approximation

To state it simply, the notion of inner product is a generalization of bilinear forms. The projection theorem, Theorem 3.2.7, asserts that the minimization of such bilinear forms is given by the orthogonal projection onto a given subspace.

We here give a simple application related to such an approximation problem. Where there is a performance index (also called a penalty function or a utility function depending on the context and field) of second order, such as energy, its optimization (minimization) is mostly reduced to the projection theorem. Least squares estimation, Kalman filtering, and the Yule--Walker equation are typical examples. For details, the reader is referred to, for example, Luenberger [31].
Let $X$ be a Hilbert space and $M = \operatorname{span}\{v_1, \ldots, v_n\}$ its finite-dimensional subspace. We may assume, without loss of generality, that $\{v_1, \ldots, v_n\}$ is linearly independent by deleting redundant vectors if necessary. The problem here is to find $\hat{x} \in M$ such that
$$\|x - \hat{x}\| = \inf_{m \in M} \|x - m\|$$
for any given $x \in X$.

First note that $M$ is a closed subspace as shown in Theorem 2.3.2. Hence by the projection theorem, Theorem 3.2.7, $\hat{x}$ is given by the foot of the perpendicular to $M$, and it is characterized by
$$(x - \hat{x}, m) = 0, \quad m \in M.$$
$^{44}$A subset in a topological space $X$ is dense if its closure (Definition 2.1.23, p. 47) agrees with the whole space $X$; see Appendix A, section A.3 (p. 238).
Since M =span{v
1
, . . . , v
n
}, this is clearly equivalent to
(x x, v
i
) =0, i =1, . . . , n.
Setting x =

n
j=1

j
v
j
, this can be rewritten as
(v
1
, v
1
)
1
+ (v
2
, v
1
)
2
+ (v
n
, v
1
)
n
= (x, v
1
) ,
(v
1
, v
2
)
1
+ (v
2
, v
2
)
2
+ (v
n
, v
2
)
n
= (x, v
2
)
.
.
.
.
.
.
.
.
.
.
.
.
(v
1
, v
n
)
1
+ (v
2
, v
n
)
2
+ (v
n
, v
n
)
n
= (x, v
n
) ,
(3.22)
which in turn is written in matrix form as
$$\begin{bmatrix}
(v_1, v_1) & (v_2, v_1) & \cdots & (v_n, v_1)\\
(v_1, v_2) & (v_2, v_2) & \cdots & (v_n, v_2)\\
\vdots & \vdots & & \vdots\\
(v_1, v_n) & (v_2, v_n) & \cdots & (v_n, v_n)
\end{bmatrix}
\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{bmatrix}
=
\begin{bmatrix} (x, v_1) \\ (x, v_2) \\ \vdots \\ (x, v_n) \end{bmatrix}. \tag{3.23}$$
This equation is called the normal equation, and the transpose of the left-hand side matrix of (3.23) is called the Gram matrix and is denoted by $G(v_1, v_2, \ldots, v_n)$.
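In a concrete finite-dimensional setting the normal equation can be solved numerically. The sketch below is our own illustration (the function name `best_approximation` is not from the text): the columns of `V` play the role of $v_1, \ldots, v_n$ in $\mathbb{R}^k$ with the standard inner product, so `V.T @ V` is exactly the matrix of inner products appearing in (3.23).

```python
import numpy as np

def best_approximation(x, V):
    # Columns of V are v_1, ..., v_n; G[i, j] = (v_j, v_i) and c[i] = (x, v_i),
    # so solving G @ alpha = c is the normal equation (3.23).
    G = V.T @ V
    c = V.T @ x
    alpha = np.linalg.solve(G, c)
    return V @ alpha

# Project x = (1, 2, 3) onto the plane spanned by (1, 1, 0) and (0, 0, 1).
V = np.array([[1.0, 0.0],
              [1.0, 0.0],
              [0.0, 1.0]])
x = np.array([1.0, 2.0, 3.0])
x_hat = best_approximation(x, V)
# The residual x - x_hat is orthogonal to every column of V,
# as the characterization (x - x_hat, v_i) = 0 requires.
```

Note that the projection characterization is what makes this work: orthogonality of the residual to each $v_i$ is both necessary and sufficient for optimality.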
One can also solve its dual problem, namely, that of finding a vector of minimum norm satisfying finitely many linear constraints, according to Corollary 3.2.9 as follows. Set $M := \mathrm{span}\{v_1, \ldots, v_n\}$ as above, and suppose that $\{v_1, \ldots, v_n\}$ is linearly independent. Consider the constraints
$$\begin{aligned}
(x, v_1) &= c_1,\\
(x, v_2) &= c_2,\\
&\;\;\vdots\\
(x, v_n) &= c_n,
\end{aligned} \tag{3.24}$$
and find an optimal element with $\|x\|$ minimal. Denote such an element by $\hat{x}$.
Let $x_0$ be a solution of (3.24). Then every solution of (3.24) is expressible as (an element of) $x_0 + M^{\perp}$.
Hence, by Corollary 3.2.9, the solution $\hat{x}$ to the minimum norm problem belongs to $(M^{\perp})^{\perp}$, and $M = (M^{\perp})^{\perp}$ because $M$ is a closed subspace (Problem 3). Hence $\hat{x} \in M$, and we can set
$$\hat{x} = \sum_{j=1}^{n} \alpha_j v_j.$$
Writing down the conditions for $\hat{x} \in M$ to satisfy (3.24), we obtain
$$\begin{bmatrix}
(v_1, v_1) & (v_2, v_1) & \cdots & (v_n, v_1)\\
(v_1, v_2) & (v_2, v_2) & \cdots & (v_n, v_2)\\
\vdots & \vdots & & \vdots\\
(v_1, v_n) & (v_2, v_n) & \cdots & (v_n, v_n)
\end{bmatrix}
\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{bmatrix}
=
\begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}. \tag{3.25}$$
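Numerically, the dual problem reduces to the same Gram system. The following sketch (our own example; `min_norm_solution` is a hypothetical name) finds the minimum-norm vector satisfying the constraints by solving (3.25) and forming $\hat{x} = \sum_j \alpha_j v_j$.

```python
import numpy as np

def min_norm_solution(V, c):
    # The minimizer lies in M = span of the columns of V, so writing
    # x = V @ alpha and imposing (x, v_i) = c_i gives the Gram system (3.25).
    G = V.T @ V
    alpha = np.linalg.solve(G, c)
    return V @ alpha

# Single constraint (x, v1) = 2 with v1 = (1, 1, 0):
V = np.array([[1.0], [1.0], [0.0]])
c = np.array([2.0])
x = min_norm_solution(V, c)
# Among all vectors with x[0] + x[1] = 2, the result (1, 1, 0) has minimum norm.
```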
As a slight modification, consider the case where $M$ is given as the kernel of a linear mapping $L : X \to Y$. To give a solution to this problem, we need the concept of adjoint
mappings in Hilbert space. This is to be introduced later in Chapter 5, section 5.4 (p. 94),
and we here give a solution to the minimum norm problem assuming the knowledge of this
concept. The reader not familiar with this concept can skip the material in the rest of this
section without much loss.
Let $L : X \to Y$ be a linear map from a Hilbert space $X$ to another Hilbert space $Y$; actually we can assume $X$ and $Y$ are both just inner product spaces. The adjoint mapping $L^*$ of $L$ is the mapping $L^* : Y \to X$ defined by
$$(Lx, y)_Y = (x, L^* y)_X, \quad x \in X,\ y \in Y \tag{3.26}$$
(see section 5.4 (p. 94), Theorem 5.4.1). Then the following fact holds.
Theorem 3.3.1. Let $X$, $Y$, $L$ be as above, and suppose that $Y$ is finite-dimensional and that $LL^* : Y \to Y$ admits an inverse. Then for any $y_0 \in Y$, $Lx = y_0$ admits a solution, and the minimum-norm solution is given by
$$x_0 = L^*(LL^*)^{-1} y_0. \tag{3.27}$$
Proof. Since $Lx_0 = LL^*(LL^*)^{-1} y_0 = y_0$, $x_0$ is clearly a solution to $Lx = y_0$.
Define $M := \ker L$, and take any $x \in M$. Clearly
$$(x, x_0) = \left(x, L^*(LL^*)^{-1} y_0\right) = \left(Lx, (LL^*)^{-1} y_0\right) = \left(0, (LL^*)^{-1} y_0\right) = 0,$$
so that $x_0$ is orthogonal to $M$. Hence, by Corollary 3.2.9 (p. 68), $x_0$ is the solution with minimum norm.
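When $L$ is represented by a real matrix with full row rank and both spaces carry the standard inner product, $L^*$ is the transpose, and formula (3.27) coincides with applying the Moore–Penrose pseudoinverse to $y_0$. A small numerical check (our own example, not from the text):

```python
import numpy as np

# L : R^3 -> R^2 with full row rank, so L L^T is invertible.
L = np.array([[1.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
y0 = np.array([2.0, 2.0])

# Formula (3.27): x0 = L^* (L L^*)^{-1} y0, with L^* the transpose here.
x0 = L.T @ np.linalg.solve(L @ L.T, y0)
# x0 solves L @ x = y0, and it agrees with np.linalg.pinv(L) @ y0,
# the standard minimum-norm least-squares solution.
```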
Needless to say, a typical example of orthogonal expansion is the Fourier expansion. To actually prove its completeness, we will further need some analytic arguments. Furthermore, its essence is perhaps better appreciated through its connection with Schwartz distributions. We will thus discuss this topic in the later chapter on Fourier analysis.
Problems
1. (Gram–Schmidt orthogonalization). Let $(X, (\cdot, \cdot))$ be an inner product space and $\{x_1, \ldots, x_n, \ldots\}$ a sequence of linearly independent vectors (i.e., any finite subset is linearly independent). Show that one can choose an orthonormal family $\{e_1, \ldots, e_n, \ldots\}$ such that for every $n$, $\mathrm{span}\{e_1, \ldots, e_n\} = \mathrm{span}\{x_1, \ldots, x_n\}$ holds. (Hint: Set $e_1 := x_1/\|x_1\|$, and suppose that $\{e_1, \ldots, e_{n-1}\}$ have been chosen. Then define $z_n := x_n - \sum_{i=1}^{n-1} (x_n, e_i)\, e_i$ and $e_n := z_n/\|z_n\|$. This procedure is called the Gram–Schmidt orthogonalization.)
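The recursion in the hint translates directly into code. A minimal sketch for vectors in $\mathbb{R}^k$ with the standard inner product (the function name is our own):

```python
import numpy as np

def gram_schmidt(vectors):
    # Follows the hint: z_n = x_n - sum_i (x_n, e_i) e_i, then e_n = z_n / ||z_n||.
    basis = []
    for x in vectors:
        z = x - sum(np.dot(x, e) * e for e in basis)
        basis.append(z / np.linalg.norm(z))
    return basis

es = gram_schmidt([np.array([1.0, 1.0, 0.0]),
                   np.array([1.0, 0.0, 1.0])])
# es is an orthonormal family spanning the same plane as the two inputs.
```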
2. Let $Q$ be an $n \times n$ real positive definite matrix, let $A$ be an $m \times n$ real matrix such that $\mathrm{rank}\, A = m$, and let $b \in \mathbb{R}^m$. Find the minimum of $x^T Q x$ under the constraint $Ax = b$. (Hint: Define an inner product in $\mathbb{R}^n$ by $(x, y)_Q = y^T Q x$, and then reduce this problem to Theorem 3.3.1.)
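Carrying out the hint (our own worked sketch, not the text's solution): with the inner product $(x, y)_Q = y^T Q x$, the adjoint of $L = A$ is $A^* y = Q^{-1} A^T y$, so formula (3.27) yields the closed form $x = Q^{-1} A^T (A Q^{-1} A^T)^{-1} b$.

```python
import numpy as np

# Minimize x^T Q x subject to A x = b, via x = Q^{-1} A^T (A Q^{-1} A^T)^{-1} b.
Q = np.diag([1.0, 4.0])
A = np.array([[1.0, 1.0]])
b = np.array([5.0])
Qinv_At = np.linalg.solve(Q, A.T)
x = Qinv_At @ np.linalg.solve(A @ Qinv_At, b)
# x = (4, 1): it satisfies A @ x = b, and a Lagrange-multiplier check
# confirms it minimizes x^T Q x = 20 on the constraint set.
```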
3. Let $X$ be a Hilbert space and $M$ a closed subspace. Show that $(M^{\perp})^{\perp} = M$.
4. Show that $C[0, 1]$ is not complete with respect to the inner product (3.4). (Hint: Consider the sequence $f_n$ of functions given in Problem 3 of Chapter 2.)