A First Course in Functional Analysis Theory and Applications

A First Course in Functional Analysis

Theory and Applications

Rabindranath Sen

Anthem Press
An imprint of Wimbledon Publishing Company
www.anthempress.com

This edition first published in UK and USA 2013
by ANTHEM PRESS
75-76 Blackfriars Road, London SE1 8HA, UK
or PO Box 9779, London SW19 7ZG, UK
and
244 Madison Ave #116, New York, NY 10016, USA

Copyright © Rabindranath Sen 2013

The author asserts the moral right to be identified as the author of this work.

All rights reserved. Without limiting the rights under copyright reserved above,
no part of this publication may be reproduced, stored or introduced into
a retrieval system, or transmitted, in any form or by any means
(electronic, mechanical, photocopying, recording or otherwise),
without the prior written permission of both the copyright
owner and the above publisher of this book.

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.

Library of Congress Cataloging-in-Publication Data
A catalog record for this book has been requested.

ISBN-13: 978 0 85728 320 7 (Hbk)
ISBN-10: 0 85728 320 0 (Hbk)

This title is also available as an eBook.

Preface
This book is the outgrowth of the lectures delivered on functional
analysis and allied topics to the postgraduate classes in the Department
of Applied Mathematics, Calcutta University, India. I feel I owe an
explanation as to why I should write a new book, when a large number of
books on functional analysis at the elementary level are available. Behind
every abstract thought there is a concrete structure. I have tried to unveil
the motivation behind every important development of the subject matter.
I have endeavoured to make the presentation lucid and simple so that the
learner can read without outside help.
The first chapter, entitled ‘Preliminaries’, contains discussions on topics
of which knowledge will be necessary for reading the later chapters. The
first concepts introduced are those of a set, the cardinal number, the
different operations on a set and a partially ordered set respectively.
Important notions like Zorn’s lemma, Zermelo’s axiom of choice are stated
next. The concepts of a function and mappings of different types are
introduced and exhibited with examples. Next comes the notion of a linear
space and examples of different types of linear spaces. The definition of
subspace and the notion of linear dependence or independence of members
of a subspace are introduced. Ideas of partition of a space as a direct
sum of subspaces and quotient space are explained. ‘Metric space’ as an
abstraction of the real line ℝ is introduced. A broad overview of a metric
space including the notions of convergence of a sequence, completeness,
compactness and criterion for compactness in a metric space is provided in
the first chapter. Examples of a non-metrizable space and an incomplete
metric space are also given. The contraction mapping principle and its
application in solving different types of equations are demonstrated. The
concepts of an open set, a closed set and a neighbourhood in a metric
space are also explained in this chapter. The necessity for the introduction
of ‘topology’ is explained first. Next, the axioms of a topological space are
stated. It is pointed out that the conclusions of the Heine-Borel theorem
in a real space are taken as the axioms of an abstract topological space.
Next the ideas of openness and closedness of a set, the neighbourhood of
a point in a set, the continuity of a mapping, compactness, criterion for
compactness and separability of a space naturally follow.
Chapter 2 is entitled ‘Normed Linear Space’. If a linear space admits a
metric structure it is called a metric linear space. A normed linear space is a
type of metric linear space, and for every element x of the space there exists
a non-negative number, called the norm of x and denoted by ∥x∥, fulfilling certain axioms. A normed
linear space can always be reduced to a metric space by the choice of a
suitable metric. Ideas of convergence in norm and completeness of a normed
linear space are introduced with examples of several normed linear spaces,
Banach spaces (complete normed linear spaces) and incomplete normed
linear spaces.

Continuity of a norm and equivalence of norms in a finite dimensional
normed linear space are established. The definition of a subspace and its
various properties as induced by the normed linear space of which this
is a subspace are discussed. The notion of a quotient space and its role
in generating new Banach spaces are explained. Riesz’s lemma is also
discussed.
Chapter 3 dwells on Hilbert space. The concepts of inner product space,
complete inner product or Hilbert space are introduced. Parallelogram law,
orthogonality of vectors, the Cauchy-Bunyakovsky-Schwartz inequality, and
continuity of scalar (inner) product in a Hilbert space are discussed. The
notions of a subspace, orthogonal complement and direct sum in the setting
of a Hilbert space are introduced. The orthogonal projection theorem takes
a special place.
Orthogonality, various orthonormal polynomials and Fourier series are
discussed elaborately. Isomorphism between separable Hilbert spaces is
also addressed. Linear operators and their elementary properties, space
of linear operators, linear operators in normed linear spaces and the norm
of an operator are discussed in Chapter 4. Linear functionals, space of
bounded linear operators and the uniform boundedness principle and its
applications, uniform and pointwise convergence of operators and inverse
operators and the related theories are presented in this chapter. Various
types of linear operators are illustrated. In the next chapter, the theory of
linear functionals is discussed. In this chapter I introduce the notions of
a linear functional, a bounded linear functional and the limiting process,
and assert continuity in the case of boundedness of the linear functional
and vice-versa. In the case of linear functionals apart from different
examples of linear functionals, representation of functionals in different
Banach and Hilbert spaces are studied. The famous Hahn-Banach theorem
on the extension of a functional from a subspace to the entire space with
preservation of norm is explained and the consequences of the theorem
are presented in a separate chapter. The notions of adjoint operators and
conjugate space are also discussed. Chapter 6 is entitled ‘Space of Bounded
Linear Functionals’. The chapter dwells on the duality between a normed
linear space and the space of all bounded linear functionals on it. Initially
the notions of dual of a normed linear space and the transpose of a bounded
linear operator on it are introduced. The zero spaces and range spaces of a
bounded linear operator and of its duals are related. The duals of Lp ([a, b])
and C([a, b]) are described. Weak convergence in a normed linear space
and its dual is also discussed. A reflexive normed linear space is one for
which the canonical embedding in the second dual is surjective (onto). An
elementary proof of Eberlein’s theorem is presented. Chapter 7 is
entitled ‘Closed Graph Theorem and its Consequences’. At the outset the
definitions of a closed operator and the graph of an operator are given. The
closed graph theorem, which establishes the conditions under which a closed
linear operator is bounded, is provided. After introducing the concept of an

open mapping, the open mapping theorem and the bounded inverse theorem
are proved. Application of the open mapping theorem is also provided. The
next chapter bears the title ‘Compact Operators on Normed Linear Spaces’.
Compact linear operators are very important in applications. They play a
crucial role in the theory of integral equations and in various problems of
mathematical physics. Starting from the definition of compact operators,
the criterion for compactness of a linear operator with a finite dimensional
domain or range in a normed linear space and other results regarding
compact linear operators are established. The spectral properties of a
compact linear operator are studied. The notion of the Fredholm alternative
is discussed and the relevant theorems are provided. Methods of finding an
approximate solution of certain equations involving compact operators in
a normed linear space are explored. Chapter 9 bears the title ‘Elements of
Spectral Theory on Self-adjoint Operators in Hilbert Spaces’. Starting from
the definition of adjoint operators, self-adjoint operators and their various
properties are elaborated upon in the context of a Hilbert space. Quadratic
forms and quadratic Hermitian forms are introduced in a Hilbert space and
their bounds are discovered. I define a unitary operator in a Hilbert space
and the situation when two operators are said to be unitarily equivalent,
is explained. The notion of a projection operator in a Hilbert space is
introduced and its various properties are investigated. Positive operators
and the square root of operators in a Hilbert space are introduced and
their properties are studied. The spectrum of a self-adjoint operator in a
Hilbert space is studied and the point spectrum and continuous spectrum
are explained. The notion of invariant subspaces in a Hilbert space is
also brought within the purview of the discussion. Chapter 10 is entitled
‘Measure and Integration in Lp Spaces’. In this chapter I discuss the theory
of Lebesgue integration and p-integrable functions on ℝ. Spaces of these
functions provide very useful examples of many theorems in functional
analysis. It is pointed out that the concept of the Lebesgue measure is
a generalization of the idea of subintervals of given length in ℝ to a class
of subsets of ℝ. The ideas of the Lebesgue outer measure of a set E ⊂ ℝ, a
Lebesgue measurable set E and the Lebesgue measure of E are introduced.
The notions of measurable functions and integrable functions in the sense
of Lebesgue are explained. Fundamental theorems of Riemann integration
and Lebesgue integration, and Fubini and Tonelli’s theorems, are stated and
explained. Lp spaces (the spaces of functions p-integrable on a measurable
subset E of ℝ) are introduced, it is shown that Lp(E) is complete, and related properties are
discussed. Fourier series and then the Fourier integral for functions are
investigated. In the next chapter, entitled ‘Unbounded Linear Operators’,
I first give some examples of differential operators that are not bounded.
But these are closed operators, or at least have closed linear extensions. It
is indicated in this chapter that many of the important theorems that hold
for continuous linear operators on a Banach space also hold for closed linear
operators. I define the different states of an operator depending on whether

the range of the operator is the whole of a Banach space or the closure of
the range is the whole space or the closure of the range is not equal to
the space. Next the characterization of states of operators is presented.
Strictly singular operators are then defined and accompanied by examples.
Operators that appear in connection with the study of quantum mechanics
also come within the purview of the discussion. The relationship between
strictly singular and compact operators is explored. Next comes the study
of perturbation theory. The reader is given an operator ‘A’, certain
properties of which need to be found out. If ‘A’ is a complicated operator, we
sometimes express ‘A = T +B’ where ‘T ’ is a relatively simple operator and
‘B’ is related to ‘T ’ in such a manner that knowledge about the properties
of ‘T ’ is sufficient to gain information about the corresponding properties
of ‘A’. In that case, for knowing the specific properties of ‘A’, we can
replace ‘A’ with ‘T ’; in other words, we treat ‘A’ as a perturbation of ‘T ’. Here
we study perturbation by a bounded linear operator and perturbation by
strictly singular operator. Chapter 12 bears the title ‘The Hahn-Banach
Theorem and the Optimization Problems’. I first explain an optimization
problem. I define a hyperplane and describe what is meant by separating
a set into two parts by a hyperplane. Next the separation theorems for
a convex set are proved with the help of the Hahn-Banach theorem. A
minimum norm problem is posed and the Hahn-Banach theorem is applied
to the proving of various duality theorems. Said theorem is applied to prove
Chebyshev approximation theorems. The optimal control problem is posed
and Pontryagin’s problem is mentioned. Theorems on optimal control of
rockets are proved using the Hahn-Banach theorem. Chapter 13 is entitled
‘Variational Problems’ and begins by introducing a variational problem.
The aim is to investigate under which conditions a given functional in a
normed linear space admits of an optimum. Many differential equations are
often difficult to solve. In such cases a functional is built out of the given
equation and minimized. One needs to show that such a minimum solves
the given equation. To study those problems, a Gâteaux derivative and a
Fréchet derivative are defined as a prerequisite. The equivalence of solving
a variational problem and solving a variational inequality is established.
I then introduce the Sobolev space to study the solvability of differential
equations. In Chapter 14, entitled ‘The Wavelet Analysis’, I provide a
brief introduction to the origin of wavelet analysis. It is the outcome of
the confluence of mathematics, engineering and computer science. Wavelet
analysis has begun to play a serious role in a broad range of applications
including signal processing, data and image compression, the solving of
partial differential equations, the modeling of multiscale phenomena and
statistics. Starting from the notion of information, we discuss the scalable
structure of information. Next we discuss the algebra and geometry of
wavelet matrices like Haar matrices and Daubechies’s matrices of different
ranks. Thereafter come the one-dimensional wavelet systems where the
scaling equation associated with a wavelet matrix, the expansion of a

function in terms of wavelet system associated with a matrix and other
results are presented. The final chapter is concerned with dynamical
systems. The theory of dynamical systems has its roots in the theory of
ordinary differential equations. Henri Poincaré and later Ivar Bendixson
studied the topological properties of the solutions of autonomous ordinary
differential equations (ODEs) in the plane. They did so with a view to
studying the basic properties of autonomous ODEs without trying to find
out the solutions of the equations. The discussion is confined to one-
dimensional flow only.
Prerequisites The reader of the book is expected to have a knowledge
of set theory, elements of linear algebra as well as having been exposed to
metric spaces.
Courses The book can be used to teach two semester courses at the M.Sc.
level in universities (MS level in Engineering Institutes):

(i) Basic course on functional analysis. For this Chapters 2–9 may be
consulted.
(ii) Another course may be developed on linear operator theory. For
this Chapters 2, 3–5, 7–9 and 11 may be consulted. The Lebesgue
measure is discussed at an elementary level in Chapter 10; Chapters
2–9 can, however, be read without any knowledge of the Lebesgue
measure.

Those who are interested in applications of functional analysis may look


into Chapters 12 and 13.
Acknowledgements I wish to express my profound gratitude to my
advisor, the late Professor Parimal Kanti Ghosh, former Ghose professor
in the Department of Applied Mathematics, Calcutta University, who
introduced me to this subject. My indebtedness to colleagues and teachers
like Professor J. G. Chakraborty, Professor S. C. Basu is duly acknowledged.
Special mention must be made of my colleague and friend Professor A. Roy
who constantly encouraged me to write this book. My wife Mrs. M. Sen
offered all possible help and support to make this project a success, and
thanks are duly accorded. I am also indebted to my sons Dr. Sugata Sen
and Professor Shamik Sen for providing editorial support. Finally I express
my gratitude to the inhouse editors and the external reviewer. Several
improvements in form and content were made at their suggestion.

Contents

Introduction

I    Preliminaries
     Set
     Function, Mapping
     Linear Space
     Metric Spaces
     Topological Spaces
     Continuity, Compactness

II   Normed Linear Spaces
     Definitions and Elementary Properties
     Subspace, Closed Subspace
     Finite Dimensional Normed Linear Spaces and Subspaces
     Quotient Spaces
     Completion of Normed Spaces

III  Hilbert Space
     Inner Product Space, Hilbert Space
     Cauchy-Bunyakovsky-Schwartz (CBS) Inequality
     Parallelogram Law
     Orthogonality
     Orthogonal Projection Theorem
     Orthogonal Complements, Direct Sum
     Orthogonal System
     Complete Orthonormal System
     Isomorphism between Separable Hilbert Spaces

IV   Linear Operators
     Definition: Linear Operator
     Linear Operators in Normed Linear Spaces
     Linear Functionals
     The Space of Bounded Linear Operators
     Uniform Boundedness Principle
     Some Applications
     Inverse Operators
     Banach Space with a Basis

V    Linear Functionals
     Hahn-Banach Theorem
     Hahn-Banach Theorem for Complex Vector and Normed Linear Space
     Application to Bounded Linear Functionals on C([a, b])
     The General Form of Linear Functionals in Certain Functional Spaces
     The General Form of Linear Functionals in Hilbert Spaces
     Conjugate Spaces and Adjoint Operators

VI   Space of Bounded Linear Functionals
     Conjugates (Duals) and Transposes (Adjoints)
     Conjugates (Duals) of Lp([a, b]) and C([a, b])
     Weak and Weak* Convergence
     Reflexivity
     Best Approximation in Reflexive Spaces

VII  Closed Graph Theorem and Its Consequences
     Closed Graph Theorem
     Open Mapping Theorem
     Bounded Inverse Theorem
     Applications of the Open Mapping Theorem

VIII Compact Operators on Normed Linear Spaces
     Compact Linear Operators
     Spectrum of a Compact Operator
     Fredholm Alternative
     Approximate Solutions

IX   Elements of Spectral Theory of Self-Adjoint Operators in Hilbert Spaces
     Adjoint Operators
     Self-Adjoint Operators
     Quadratic Form
     Unitary Operators, Projection Operators
     Positive Operators, Square Roots of a Positive Operator
     Spectrum of Self-Adjoint Operators
     Invariant Subspaces
     Continuous Spectra and Point Spectra

X    Measure and Integration in Lp Spaces
     The Lebesgue Measure on ℝ
     Measurable and Simple Functions
     Calculus with the Lebesgue Measure
     The Fundamental Theorem for Riemann Integration
     The Fundamental Theorem for Lebesgue Integration
     Lp Spaces and Completeness
     Lp Convergence of Fourier Series

XI   Unbounded Linear Operators
     Definition: An Unbounded Linear Operator
     States of a Linear Operator
     Definition: Strictly Singular Operators
     Relationship between Singular and Compact Operators
     Perturbation by Bounded Operators
     Perturbation by Strictly Singular Operators
     Perturbation in a Hilbert Space and Applications

XII  The Hahn-Banach Theorem and Optimization Problems
     The Separation of a Convex Set
     Minimum Norm Problem and the Duality Theory
     Application to Chebyshev Approximation
     Application to Optimal Control Problems

XIII Variational Problems
     Minimization of Functionals in a Normed Linear Space
     Gâteaux Derivative
     Fréchet Derivative
     Equivalence of the Minimization Problem to Solving a Variational Inequality
     Distributions
     Sobolev Space

XIV  The Wavelet Analysis
     An Introduction to Wavelet Analysis
     The Scalable Structure of Information
     Algebra and Geometry of Wavelet Matrices
     One-dimensional Wavelet Systems

XV   Dynamical Systems
     A Dynamical System and Its Properties
     Homeomorphism, Diffeomorphism, Riemannian Manifold
     Stable Points, Periodic Points and Critical Points
     Existence, Uniqueness and Topological Consequences
     Bifurcation Points and Some Results

List of Symbols
Bibliography
Index
Introduction
Functional analysis is an abstract branch of mathematics that grew
out of classical analysis. It represents one of the most important
branches of the mathematical sciences. Together with abstract algebra and
mathematical analysis, it serves as a foundation of many other branches of
mathematics. Functional analysis is in particular widely used in probability
and random function theory, numerical analysis, mathematical physics and
their numerous applications. It serves as a powerful tool in modern control
and information sciences.
The development of the subject started from the beginning of the
twentieth century, mainly through the initiative of the Russian school
of mathematicians. The impetus came from the developments of linear
algebra, linear ordinary and partial differential equations, calculus of
variation, approximation theory and, in particular, those of linear integral
equations, the theory of which had the greatest impact on the development
and promotion of modern ideas. Mathematicians observed that problems
from different fields often possess related features and properties. This
allowed for an effective unifying approach towards the problems, the
unification being obtained by the omission of inessential details. Hence
the advantage of such an abstract approach is that it concentrates on the
essential facts, so that they become clearly visible.
Since any such abstract system will in general have concrete realisations
(concrete models), we see that the abstract method is quite versatile in
its applications to concrete situations. In the abstract approach, one
usually starts from a set of elements satisfying certain axioms. The nature
of the elements is left unspecified. The theory then consists of logical
consequences, which result from the axioms and are derived as theorems
once and for all. This means that in the axiomatic fashion one obtains a
mathematical structure with a theory that is developed in an abstract way.
For example, in algebra this approach is used in connection with fields,
rings and groups. In functional analysis, we use it in connection with
‘abstract’ spaces; these are all of basic importance.
In functional analysis, the concept of space is used in a very wide and
surprisingly general sense. An abstract space will be a set of (unspecified)
elements satisfying certain axioms, and by choosing different sets of axioms,
we obtain different types of abstract spaces.
The idea of using abstract spaces in a systematic fashion goes back to M.
Fréchet (1906) and is justified by its great success. With the introduction
of abstract space in functional analysis, the language of geometry entered
the arena of the problems of analysis. The result is that some problems of
analysis were subjected to geometric interpretations. Furthermore many
conjectures in mechanics and physics were suggested, keeping in mind
the two-dimensional geometry. The geometric methods of proof of many
theorems came into frequent use.

The generalisation of algebraic concepts took place side by side with that
of geometric concepts. The classical analysis, fortified with geometric and
algebraic concepts, became versatile and ready to cope with new problems
not only of mathematics but also of mathematical physics. Thus functional
analysis should form an indispensable part of the mathematics curricula at
the college level.

CHAPTER 1

PRELIMINARIES

In this chapter we recapitulate the mathematical preliminaries that will


be relevant to the development of functional analysis in later chapters.
This chapter comprises six sections. We presume that the reader has been
exposed to an elementary course in real analysis and linear algebra.

1.1 Set
The theory of sets is one of the principal tools of mathematics. One type of
study of set theory addresses the realm of logic, philosophy and foundations
of mathematics. The other study goes into the highlands of mathematics,
where set theory is used as a medium of expression for various concepts
in mathematics. We assume that the sets are ‘not too big’ to avoid any
unnecessary contradiction. In this connection one can recall the famous
‘Russell’s Paradox’ (Russell, 1959). A set is a collection of distinct and
distinguishable objects. The objects that belong to a set are called elements,
members or points of the set. If an object a belongs to a set A, then we
write a ∈ A. On the other hand, if a does not belong to A, we write
a∈ / A. A set may be described by listing the elements and enclosing them
in braces. For example, the set A formed out of the letters a, a, a, b, b, c can
be expressed as A = {a, b, c}. A set can also be described by some defining
properties. For example, the set of natural numbers can be written as
N = {x : x, a natural number} or {x|x, a natural number}. Next we
discuss set inclusion. If every element of a set A is an element of the set
B, A is said to be a subset of the set B or B is said to be a superset of
A, and this is denoted by A ⊆ B or B ⊇ A. Two sets A and B are said
to be equal if every element of A is an element of B and every element of
B is an element of A–in other words if A ⊆ B and B ⊆ A. If A is equal
to B, then we write A = B. A set is generally completely determined by
its elements, but there may be a set that has no element in it. Such a set
is called an empty (or void or null) set and the empty set is denoted by Φ


(Phi). Φ ⊂ A; in other words, the null set is included in any set A – this
fact is vacuously satisfied. Furthermore, if A is a subset of B, A ≠ Φ and
A ≠ B, then A is said to be a proper subset of B (or B is said to properly
contain A). The fact that A is a proper subset of B is expressed as A ⊂ B.
Let A be a set. Then the set of all subsets of A is called the power set
of A and is denoted by P (A). If A has three elements like letters p, q and
r, then the set of all subsets of A has 8 (= 2³) elements. It may be noted
that the null set is also a subset of A. A set is called a finite set if it is
empty or it has n elements for some positive integer n; otherwise it is said
to be infinite. It is clear that the empty set and the set A are members of
P (A). A set A is called denumerable or enumerable if it is in one-to-one
correspondence with the set of natural numbers. A set is called countable
if it is either finite or denumerable. A set that is not countable is called
uncountable.
We now state without proof a few results which might be used in
subsequent chapters:
(i) An infinite set is equivalent to a subset of itself.
(ii) A subset of a countable set is a countable set.
The following are examples of countable sets: a) the set J of all integers,
b) the set Q of all rational numbers, c) the set P of all polynomials with
rational coefficients, d) the set of all straight lines in a plane each of which
passes through (at least) two different points with rational coordinates and
e) the set of all rational points in ℝⁿ.
Examples of uncountable sets are as follows: (i) an open interval ]a, b[ and a
closed interval [a, b], where a ≠ b, (ii) the set of all irrational numbers, (iii)
the set of all real numbers, (iv) the family of all subsets of a denumerable
set.
1.1.1 Cardinal numbers
Let all the sets be divided into two families such that two sets fall into
one family if and only if they are equivalent. This is possible because the
relation ∼ between the sets is an equivalence relation. To every such family
of sets, we assign some arbitrary symbol and call it the cardinal number of
each set of the given family. If the cardinal number of a set A is α, A = α
or card A = α. The cardinal number of the empty set is defined to be
0 (zero). We designate the number of elements of a nonempty finite set
as the cardinal number of the finite set. We assign ℵ0 to the class of all
denumerable sets and as such ℵ0 is the cardinal number of a denumerable
set. c, the first letter of the word ‘continuum’ stands for the cardinal number
of the set [0, 1].
1.1.2 The algebra of sets
In the following section we discuss some operations that can be
performed on sets. By universal set we mean a set that contains all the sets

under reference. The universal set is denoted by U . For example, while


discussing the set of real numbers we take ℝ as the universal set. Once
again, for sets of complex numbers, the universal set is ℂ, the set of complex
numbers. Given two sets A and B, the union of A and B is denoted by
A ∪ B and stands for a set whose every element is an element of either A
or B (including elements of both A and B). A ∪ B is also called the sum
of A and B and is written as A + B. The intersection of two sets A and B
is denoted by A ∩ B, and is a set, the elements of which are the elements
common to both A and B. The intersection of two sets A and B is also
called the product of A and B and is denoted by A · B. The difference of
two sets A and B is denoted by A − B and is defined by the set of elements
in A which are not elements of B. Two sets A and B are said to be disjoint
if A ∩ B = Φ. If A ⊆ B, B − A will be called the complement of A with
reference to B. If B is the universal set, Ac will denote the complement of
A and will be the set of all elements which are not in A.
Let A, B and C be three non-empty sets. Then the following laws hold
true:
1. Commutative laws
A ∪ B = B ∪ A and A ∩ B = B ∩ A
2. Associative laws (these extend to any finite number of sets)
A ∪ (B ∪ C) = (A ∪ B) ∪ C and (A ∩ B) ∩ C = A ∩ (B ∩ C)
3. Distributive laws
A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
(A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C)
4. De Morgan’s laws
(A ∪ B)c = (Ac ∩ B c ) and (A ∩ B)c = (Ac ∪ B c )
Suppose we have a finite class of sets of the form {A1 , A2 , A3 , . . . , An },
then we can form A1 ∪ A2 ∪ A3 ∪ . . . An and A1 ∩ A2 ∩ A3 ∩ . . . An . We
can shorten the above expression by using the index set I = {1, 2, 3, . . . , n}.
The above expressions for union and intersection can be expressed in short
by ∪i∈I Ai and ∩i∈I Ai respectively.
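These identities are easy to check mechanically on small finite sets. The following short Python sketch (an added illustration, not part of the original text) verifies a distributive law, De Morgan’s laws and the indexed union and intersection; the particular sets A, B, C and the universal set U are arbitrary choices.

```python
# Illustrative sketch: checking the set identities of Section 1.1.2
# on concrete finite sets, using Python's built-in set type.
A = {1, 2, 3}
B = {2, 3, 4}
C = {3, 4, 5}
U = {1, 2, 3, 4, 5, 6}          # universal set containing A, B, C

complement = lambda S: U - S    # complement with reference to U

# Distributive law: A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
assert A & (B | C) == (A & B) | (A & C)

# De Morgan's laws: (A ∪ B)^c = A^c ∩ B^c and (A ∩ B)^c = A^c ∪ B^c
assert complement(A | B) == complement(A) & complement(B)
assert complement(A & B) == complement(A) | complement(B)

# Indexed union and intersection over I = {1, 2, 3}
family = {1: A, 2: B, 3: C}
union_all = set().union(*family.values())              # ∪_{i∈I} A_i
intersection_all = set.intersection(*family.values())  # ∩_{i∈I} A_i
print(union_all, intersection_all)                     # {1,...,5} and {3}
```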
1.1.3 Partially ordered set
Let A be a set of elements a, b, c, d, . . . of a certain nature. Let us
introduce between certain pairs (a, b) of elements of A the relation a ≤ b
with the properties:

(i) If a ≤ b and b ≤ c, then a ≤ c (transitivity)


(ii) a ≤ a (reflexivity)
(iii) If a ≤ b and b ≤ a then a = b

Such a set A is said to be partially ordered by ≤ and a and b, satisfying


a ≤ b and b ≤ a are said to be congruent. A set A is said to be totally
ordered if for each pair of its elements a, b, a ≤ b or b ≤ a.
A subset B of a partially ordered set A is said to be bounded above if
there is an element b such that y ≤ b for all y ∈ B, the element b is called
an upper bound of B. The smallest of all upper bounds of B is called the
least upper bound (l.u.b.) or supremum of B. The terms bounded below
and greatest lower bound (g.l.b.) or infimum can be analogously defined.
Finally, an element x0 ∈ A is said to be maximal if there exists in A no
element x ≠ x0 satisfying the relation x0 ≤ x. The natural numbers are
totally ordered but the branches of a tree are not. We next state a highly
important lemma known as Zorn’s lemma.
1.1.4 Zorn’s lemma
Let X be a partially ordered set such that every totally ordered subset
of X has an upper bound in X. Then X contains a maximal element.
Although the above statement is called a lemma it is actually an axiom.
1.1.5 Zermelo’s theorem
Every set can be well ordered by introducing certain order relations.
The proof of Zermelo’s theorem rests upon Zermelo’s axiom of arbitrary
choice, which is as follows:
If one system of nonempty, pair-wise disjoint sets is given, then there is
a new set possessing exactly one element in common with each of the sets
of the system.
Zorn’s Lemma, Zermelo’s Axiom of Choice and the well-ordering theorem
are equivalent.

1.2 Function, Mapping


Given two nonempty sets X and Y , the Cartesian product of X and Y ,
denoted by X × Y is the set of all ordered pairs (x, y) such that x ∈ X and
y ∈Y.
Thus X × Y = {(x, y) : x ∈ X, y ∈ Y }.
1.2.1 Example
Let X = {a, b, c} and let Y = {d, e}. Then, X × Y =
{(a, d), (b, d), (c, d), (a, e), (b, e), (c, e)}.
It may be noted that the Cartesian product of two countable sets is
countable.
1.2.2 Function
Let X and Y be two nonempty sets. A function f from X to Y is a subset
of X × Y with the property that no two members of f have the same first
coordinate. Thus (x, y) ∈ f and (x, z) ∈ f imply that y = z. The domain


of a function f from X to Y is the subset of X that consists of all first
coordinates of members of f . Thus x is in the domain of f if and only if
(x, y) ∈ f for some y ∈ Y .
The range of f is the subset of Y that consists of all second coordinates
of members of f . Thus y is in the range of f if and only if (x, y) ∈ f for
some x ∈ X. If f is a function and x is a point in the domain of f then f (x)
is the second coordinate of the unique member of f whose first coordinate
is x.
Thus y = f (x) if and only if (x, y) ∈ f . This point f (x) is called the
image of x under f .
1.2.3 Mappings: into, onto (surjective), one-to-one (injective)
and bijective
A function f is said to be a mapping of X into Y if the domain of f is
X and the range of f is a subset of Y . A function f is said to be a mapping
of X onto Y (surjective) if the domain of f is X and the range of f is Y .
The fact that f is a mapping of X onto Y is denoted by f : X → Y (onto).
A function f from X to Y is said to be one-to-one (injective) if distinct
points in X have distinct images under f in Y . Thus f is one-to-one if and
only if (x1 , y) ∈ f and (x2 , y) ∈ f imply that x1 = x2 . A function from X
to Y is said to be bijective if it is both injective and surjective.
1.2.4 Example
Let X = {a, b, c} and let Y = {d, e}. Consider the following subsets of
X ×Y:
F = {(a, d), (b, d), (c, d), (a, e)}, G = {(a, d), (b, d), (c, d)},
H = {(a, d), (b, e), (c, e)}, φ = {(a, d), (b, e)}
The set F is not a function from X to Y because (a, d) and (a, e) are
distinct members of F that have the same first coordinate. The domain of
both G and H is X and the domain of φ is {a, b}.
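The defining property of a function, together with the notions of injectivity and surjectivity of 1.2.3, can be tested mechanically for finite relations. The following Python sketch (an added illustration, not part of the original text) applies these definitions to the relations G and H of the example above.

```python
# Illustrative sketch: testing the definition of a function (1.2.2) and the
# injective/surjective notions (1.2.3) for finite relations on X x Y.
X = {'a', 'b', 'c'}
Y = {'d', 'e'}

G = {('a', 'd'), ('b', 'd'), ('c', 'd')}
H = {('a', 'd'), ('b', 'e'), ('c', 'e')}

def is_mapping_of_X_into_Y(f, domain):
    """f is a mapping of `domain` into Y if every element of the domain has
    exactly one image, i.e. no two members of f share a first coordinate."""
    firsts = [x for (x, _) in f]
    return set(firsts) == domain and len(firsts) == len(set(firsts))

def is_injective(f):
    seconds = [y for (_, y) in f]
    return len(seconds) == len(set(seconds))   # distinct points, distinct images

def is_surjective(f, codomain):
    return {y for (_, y) in f} == codomain     # range equals the codomain

print(is_mapping_of_X_into_Y(G, X), is_injective(G), is_surjective(G, Y))  # True False False
print(is_mapping_of_X_into_Y(H, X), is_injective(H), is_surjective(H, Y))  # True False True
```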

1.3 Linear Space


A nonempty set is said to be a space if the set is closed with respect to
certain operations defined on it. It is apparent that the elements of some
sets (i.e., set of finite matrices, set of functions, set of number sequences)
are closed with respect to addition and multiplication by a scalar. Such
sets have given rise to a space called linear space.
Definition. Let E be a set of elements of a certain nature satisfying the
following axioms:
(i) E is an additive abelian group. This means that if x and y ∈ E,
then their sum x + y also belongs to the same set E, where the operation

of addition satisfies the following axioms:

(a) x + y = y + x (commutativity);
(b) x + (y + z) = (x + y) + z (associativity);
(c) There exists a uniquely defined element θ, such that x + θ = x for
any x in E;
(d) For every element x ∈ E there exists a unique element (−x) of the
same space, such that x + (−x) = θ.
(e) The element θ is said to be the null element or zero element of E and
the element −x is called the inverse element of x.

(ii) A scalar multiplication is said to be defined if for every x ∈ E, for any


scalar λ (real or complex) the element λx ∈ E and the following conditions
are satisfied:

(a) λ(μx) = λμx (associativity)



λ(x + y) = λx + λy
(b) (distributivity)
(λ + μ)x = λx + μx
(c) 1 · x = x

The set E satisfying the axioms (i) and (ii) is called a linear or vector
space. This is said to be a real or complex space depending on whether the
set of multipliers are real or complex.
1.3.1 Examples
(i) Real line ℝ
The set of all real numbers, for which the ordinary additions and
multiplications are taken as linear operations, is a real linear space ℝ.
(ii) The Euclidean space ℝⁿ, unitary space ℂⁿ, and complex plane ℂ
Let X be the set of all ordered n-tuples of real numbers. If x =
(ξ1 , ξ2 , . . . , ξn ) and y = (η1 , η2 , . . . , ηn ), we define the operations of addition
and scalar multiplication as x + y = (ξ1 + η1 , ξ2 + η2 , . . . , ξn + ηn ) and
λx = (λξ1 , λξ2 , . . . , λξn ). In the above equations, λ is a real scalar. The
above linear space is called the real n-dimensional space and is denoted by ℝⁿ.
The set of all ordered n-tuples of complex numbers, ℂⁿ, is a linear space
with the operations of addition and scalar multiplication defined as above.
The complex plane ℂ is a linear space with addition and multiplication of
complex numbers taken as the linear operations over ℝ (or ℂ).

(iii) Space of m × n matrices, ℝ^{m×n}

ℝ^{m×n} is the set of all m × n matrices with real elements. Then ℝ^{m×n} is a
real linear space with addition and scalar multiplication defined as follows:

Let A = {aij } and B = {bij } be two m × n matrices. Then A + B =


{aij + bij }. αA = {αaij }, where α is a scalar. In this space −A = {−aij }
and the matrix with all its elements as zeroes is the zero element of the
space ℝ^{m×n}.

(iv) Sequence space l∞


Let X be the set of all bounded sequences of complex numbers, i.e., every
element of X is a complex sequence x = {ξi } such that |ξi | ≤ Cx for all i, where
Cx is a real number that may depend on x but not on i. If y = {ηi } then we define x + y = {ξi + ηi }
and λx = {λξi }. Thus, l∞ is a linear space, and is called a sequence space.

(v) C([a, b])


Let X be the set of all real-valued continuous functions x, y, etc, which
are functions of an independent variable t defined on a given closed interval
J = [a, b]. Then X is closed with respect to additions of two continuous
functions and multiplication of a continuous function by a scalar, i.e.,
(x + y)(t) = x(t) + y(t), (αx)(t) = αx(t), where α is a scalar.

(vi) Space lp , Hilbert sequence space l2


Let p ≥ 1 be a fixed real number. By definition each element in the space


lp is a sequence x = {ξi } = {ξ1 , ξ2 , . . . , ξn , . . .} such that Σ_{i=1}^{∞} |ξi |^p < ∞,
for p real and p ≥ 1. If y ∈ lp and y = {ηi }, then x + y = {ξi + ηi } and αx =
{αξi }, α ∈ ℝ. Since |ξi + ηi |^p ≤ 2^p max(|ξi |^p , |ηi |^p ) ≤ 2^p (|ξi |^p + |ηi |^p ), it
follows that Σ_{i=1}^{∞} |ξi + ηi |^p < ∞. Therefore, x + y ∈ lp . Similarly, we can show
that αx ∈ lp where α is a scalar. Hence, lp is a linear space with respect
to the algebraic operations defined above. If p = 2, the space lp becomes
l2 , a square summable space which possesses some special properties to be
revealed later.

(vii) Space Lp ([a, b]) of all Lebesgue pth integrable functions


Let f be a Lebesgue measurable function defined on [a, b] and 0 < p <
∞. Since f ∈ Lp ([a, b]), we have ∫_a^b |f (t)|^p dt < ∞. Again, if g ∈ Lp ([a, b]),
then ∫_a^b |g(t)|^p dt < ∞. Since |f + g|^p ≤ 2^p max(|f |^p , |g|^p ) ≤ 2^p (|f |^p + |g|^p ),
f ∈ Lp ([a, b]), g ∈ Lp ([a, b]) imply that (f + g) ∈ Lp ([a, b]) and αf ∈
Lp ([a, b]). This shows that Lp ([a, b]) is a linear space. If p = 2, we get
L2 ([a, b]), which is known as the space of square integrable functions. The
space possesses some special properties.

1.3.2 Subspace, linear combination, linear dependence, linear


independence
A subset X of a linear space E is said to be a subspace if X is a linear
space with respect to vector addition and scalar multiplication as defined
in E. The vector of the form x = α1 x1 + α2 x2 + · · · + αn xn is called a
linear combination of vectors x1 , x2 , . . . , xn , in the linear space E, where
α1 , α2 , . . . , αn are real or complex scalars. If X is any subset of E, then
the set of all linear combinations of vectors in X forms a subspace of E. The
subspace so obtained is called the subspace spanned by X and is denoted
by span X. It is, in fact, the smallest subspace of E containing X. In other
words it is the intersection of all subspaces of E containing X.
A finite set of vectors {x1 , x2 , . . . , xn } in X is said to be linearly
dependent if there exist scalars {α1 , α2 , . . . , αn }, not all zeroes, such
that α1 x1 + α2 x2 + · · · + αn xn = 0 where {α1 , α2 , . . . , αn } are scalars,
real or complex. On the other hand, if for all scalars {α1 , α2 , . . . , αn },
α1 x1 + α2 x2 + · · · + αn xn = 0 ⇒ α1 = 0, α2 = 0, . . . , αn = 0, then the
set of vectors is said to be linearly independent.
A subset X (finite or infinite) of E is linearly independent if every finite
subset of X is linearly independent. As a convention we regard the empty
set as linearly independent.
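For vectors in ℝⁿ, linear independence of a finite set can be checked numerically: a set of vectors is linearly independent exactly when the matrix built from them has rank equal to the number of vectors. The following Python sketch (an added illustration, assuming the NumPy library; the particular vectors are arbitrary examples) shows this check.

```python
# Illustrative sketch: testing linear independence of vectors in R^3
# via the rank of the matrix whose rows are the given vectors.
import numpy as np

x1 = np.array([1.0, 0.0, 2.0])
x2 = np.array([0.0, 1.0, 1.0])
x3 = np.array([2.0, 1.0, 5.0])        # x3 = 2*x1 + x2, so {x1, x2, x3} is dependent

def linearly_independent(vectors):
    A = np.vstack(vectors)
    return np.linalg.matrix_rank(A) == len(vectors)

print(linearly_independent([x1, x2]))      # True
print(linearly_independent([x1, x2, x3]))  # False: a non-trivial relation exists
```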
1.3.3 Hamel basis, dimension
A subset L of a linear space E is said to be a basis (or Hamel basis) for
E if (i) L is a linearly independent set, and (ii) L spans the whole space.
In this case, any non-zero vector x of the space E can be expressed
uniquely as a linear combination of finitely many vectors of L with the
scalar coefficients that are not all zeroes. Clearly any maximal linearly
independent set (to which no new non-zero vector can be added without
destroying linear independence) is a basis for E and any minimal set
spanning E is also a basis for E.
1.3.4 Theorem
Every linear space X ≠ {θ} has a Hamel basis.
Let L be the set of all linearly independent subsets of X. Since X ≠ {θ},
it has an element x ≠ θ and {x} ∈ L, therefore L ≠ Φ. Let the partial
ordering in L be denoted by ‘set inclusion’. We show that for every totally
ordered subset Lα , α ∈ A of L, the set L = ∪[Lα : α ∈ A] is also in L.
Otherwise, {L} would be generated by a proper subset T ⊂ L. Therefore,
for every α ∈ A, {Lα } is generated by Tα = T ∩ Lα . However, the linear
independence of Lα implies Tα = Lα . Thus, T = ∪[T ∩ Lα : α ∈ A] =
∪[Tα : α ∈ A] = ∪[Lα : α ∈ A] = L, contradicting the assumption that T
is a proper subset of L. Thus, the conditions of Zorn’s lemma having been
satisfied, there is a maximal M ∈ L. Suppose {M } is a proper subspace
of X. Let y ∈ X and y ∈ / {M }. The subspace Y of X generated by M
and y then contains {M } as a proper subspace. If, for any proper subset
T ⊂ M, T and also y generate Y , it follows that T also generates {M },


thus contradicting the concept that M is linearly independent. There is
thus no y ∈ X, y ∈/ {M }. Hence M generates X.
A linear space X is said to be finite dimensional if it has a finite basis.
Otherwise, X is said to be infinite dimensional.
1.3.5 Examples
(i) Trivial linear space
Let X = {θ} be a trivial linear space. We have assumed that Φ is a
linearly independent set. The span of Φ is the intersection of all subspaces
of X containing Φ. However, θ belongs to every subspace of X. Hence it
follows that the Span Φ = {θ}. Therefore, Φ is a basis for X.

(ii) ℝⁿ
Consider the real linear space ℝⁿ where every x ∈ ℝⁿ is an
ordered n-tuple of real numbers. Let e1 = (1, 0, 0, . . . , 0), e2 =
(0, 1, 0, 0, . . . , 0), . . . , en = (0, 0, . . . , 1). We may note that {ei }, i =
1, 2, . . . , n is a linearly independent set and spans the whole space ℝⁿ.
Hence, {e1 , e2 , . . . , en } forms a basis of ℝⁿ. For n = 1, we get ℝ¹ and any
singleton set comprising a non-zero element forms a basis for ℝ¹.
(iii) ℂⁿ
The complex linear space ℂⁿ is a linear space where every x ∈ ℂⁿ is
an ordered n-tuple of complex numbers and the space is finite dimensional.
The set {e1 , e2 , . . . , en }, where ei is the ith vector, is a basis for ℂⁿ.

(iv) C([a, b]), Pn ([a, b])


C([a, b]) is the space of continuous real functions in the closed interval
[a, b]. Let B = {1, x, x², . . . , xⁿ, . . .} be a set of functions in C([a, b]). It
is apparent that B is a basis for C([a, b]). Pn ([a, b]) is the space of real
polynomials of order n defined on [a, b]. The set Bn = {1, x, x², . . . , xⁿ} is
a basis in Pn ([a, b]).

(v) ℝ^{m×n} (ℂ^{m×n})
ℝ^{m×n} is the space of all matrices of order m × n. For i = 1, 2, . . . , m and j = 1, 2, . . . , n, let
Eij be the m × n matrix with (i, j)th entry 1 and all other entries zero.
Then, {Eij : i = 1, 2, . . . , m; j = 1, 2, . . . , n} is a basis for ℝ^{m×n} (ℂ^{m×n}).

1.3.6 Theorem
Let E be a finite dimensional linear space. Then all the bases of E have
the same number of elements.
Let {e1 , e2 , . . . , en } and {f1 , f2 , . . . , fn , fn+1 } be two different bases in
E. Then, any element fi can be expressed as a linear combination of
e1 , e2 , . . . , en ; i.e., fi = Σ_{j=1}^{n} aij ej . Since fi , i = 1, 2, . . . , n are linearly
independent, the matrix [aij ] has rank n. Therefore, we can express fn+1
as fn+1 = Σ_{j=1}^{n} a_{n+1,j} ej . Thus the elements f1 , f2 , . . . , fn+1 are not linearly
independent. Since {f1 , f2 , . . . , fn+1 } forms a basis for the space it must
contain a number of linearly independent elements, say m(≤ n). On the
other hand, since {fi }, i = 1, 2, . . . , n + 1 forms a basis for E, ei can be
expressed as a linear combination of {fj }, j = 1, 2, . . . , n + 1 such that
n ≤ m. Comparing m ≤ n and n ≤ m we conclude that m = n. Hence the
number of elements of any two bases in a finite dimensional space E is the
same.
The above theorem helps us to define the dimension of a finite
dimensional space.
1.3.7 Dimension, examples
The dimension of a finite dimensional linear space E is defined as the
number of elements of any basis of the space and is written as dim E.

(i) dim ℝ = dim ℂ = 1


(ii) dim ℝⁿ = dim ℂⁿ = n

For an infinite dimensional space it can be shown that all bases are
equivalent sets.
1.3.8 Theorem
If E is a linear space all the bases have the same cardinal number.
Let S = {ei } and T = {fi } be two bases. Suppose S is an infinite
set and has cardinal number α. Let β be the cardinal number of T . Every
fi ∈ T is a linear combination, with non-zero coefficients, of a finite number
of elements e1 , e2 , . . . , en of S and only a finite number of elements of T are
associated in this way with the same set e1 , e2 , . . . , en or some subset of it.
Since the cardinal number of the set of finite subsets of S is the same as
that of S itself, it follows that β ≤ ℵ0 , β ≤ α. Similarly, we can show that
α ≤ β. Hence, α = β. Thus the common cardinal number of all bases in
an infinite dimensional space E is defined as the dimension of E.
1.3.9 Direct sum
Here we consider the representation of a linear space E as a direct sum
of two or more subspaces. Let E be a linear space and X1 , X2 , . . . , Xn
be n subspaces of E. If every x ∈ E has a unique representation of the form
x = x1 + x2 + · · · + xn , xi ∈ Xi , i = 1, 2, . . . , n, then E is said to be the
direct sum of its subspaces X1 , X2 , . . . , Xn . The above representation is
called the decomposition of the element x into the elements of the subspaces

X1 , X2 , . . . , Xn . In that case we can write E = X1 ⊕ X2 ⊕ · · · ⊕ Xn = ⊕_{i=1}^{n} Xi .

1.3.10 Quotient spaces


Let M be a subspace of a linear space E. The coset of an element x ∈ E
with respect to M , denoted by x+M is defined as x+M = {x+m : m ∈ M }.
The set of all such cosets is written as E/M = {x + M : x ∈ E}. One observes that
M = θ + M, x1 + M = x2 + M if and only if x1 − x2 ∈ M and, as
a result, for each pair x1 , x2 ∈ E, either (x1 + M ) ∩ (x2 + M ) = Φ or
x1 + M = x2 + M . Further, if x1 , x2 , y1 , y2 ∈ E with
x1 − x2 ∈ M and y1 − y2 ∈ M, then (x1 + y1 ) − (x2 + y2 ) ∈ M and for any scalar
λ, (λx1 − λx2 ) ∈ M, because M is a linear subspace. We define the linear
operations on E/M by

(x + M ) + (y + M ) = (x + y) + M,
λ(x + M ) = λx + M, where x, y ∈ E and λ is a scalar (real or complex).
It is clearly apparent that E/M under the linear operations defined
above is a linear space over ℝ (or ℂ). The linear space E/M is called
a quotient space of E by M . The function φ : E → E/M defined
by φ(x) = x + M is called canonical mapping of E onto E/M . The
dimension of E/M is called the codimension (codim) of M in E. Thus,
codim M = dim (E/M ). The quotient space has a simple geometrical
interpretation. Let the linear space E = ℝ² and the subspace M be given
by the straight line shown in Fig. 1.1.

Fig. 1.1 Addition in quotient space

1.4 Metric Spaces


Limiting processes and continuity are two important concepts in classical
analysis. Both these concepts in real analysis, specifically in ℝ, are based on
distance. The concept of distance has been generalized in abstract spaces
yielding what are known as metric spaces. For two points x, y in an abstract

space, let d(x, y) be the distance between them; in ℝ, d(x, y) = |x − y|.
The concept of distance gives rise to the concept of limit, i.e., {xn } is said
to tend to x as n → ∞ if d(xn , x) → 0 as n → ∞. The concept of continuity
can be introduced through the limiting process. We replace the set of real
4
numbers underlying by an abstract set X of elements (all the attributes of
which are known, but the concrete forms are not spelled out) and introduce
on X a distance function. This will help us in studying different classes of
problems within a single umbrella and drawing some conclusions that are
universally valid for such different sets of elements.
1.4.1 Definition: metric space, metric
A metric space is a pair (X, ρ) where X is a set and ρ is a metric on
X (or a distance function on X) that is a function defined on X × X such
that for all x, y, z ∈ X the following axioms hold:

1. ρ is real-valued, finite and non-negative


2. ρ(x, y) = 0 ⇔ x = y
3. ρ(x, y) = ρ(y, x) (Symmetry)
4. ρ(x, y) ≤ ρ(x, z) + ρ(z, y) (Triangle Inequality)

These axioms obviously express the fundamental properties of the distance


between the points of the three-dimensional Euclidean space.
A subspace (Y, ρ̃) of (X, ρ) is obtained by taking a subset Y ⊂ X and
restricting ρ to Y × Y . Thus the metric on Y is the restriction ρ̃ = ρ|Y ×Y .
ρ̃ is called the metric induced on Y by ρ.
In the above, × denotes the Cartesian product of sets: A × B is the set
of ordered pairs (a, b), where a ∈ A and b ∈ B. Hence, X × X is the set of
all ordered pairs of elements of X.
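The four axioms above can be spot-checked numerically for a candidate distance function. The following Python sketch (an added illustration, not a proof) tests them on finitely many sample points of the plane with the Euclidean distance of example (ii) below; the sample points are arbitrary choices.

```python
# Illustrative sketch: spot-checking the metric axioms of 1.4.1
# for the Euclidean distance on a few points of R^2.
import itertools, math

def euclidean(x, y):
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def check_metric_axioms(rho, points):
    for x, y, z in itertools.product(points, repeat=3):
        assert rho(x, y) >= 0                              # real, finite, non-negative
        assert (rho(x, y) == 0) == (x == y)                # rho(x, y) = 0 iff x = y
        assert math.isclose(rho(x, y), rho(y, x))          # symmetry
        assert rho(x, y) <= rho(x, z) + rho(z, y) + 1e-12  # triangle inequality
    return True

sample = [(0.0, 0.0), (1.0, 2.0), (-3.0, 0.5), (1.0, -1.0)]
print(check_metric_axioms(euclidean, sample))   # True
```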
1.4.2 Examples
(i) Real line ℝ
This is the set of all real numbers for which the metric is taken as
ρ(x, y) = |x − y|. This is known as the usual metric in ℝ.

(ii) Euclidean space ℝⁿ, unitary space ℂⁿ, and complex plane ℂ
Let X be the set of all ordered n-tuples of real numbers. If x =
(ξ1 , ξ2 , . . . , ξn ) and y = (η1 , η2 , . . . , ηn ) then we set

ρ(x, y) = ( Σ_{i=1}^{n} (ξi − ηi )² )^{1/2}          (1.1)

It is easily seen that ρ(x, y) ≥ 0. Furthermore, ρ(x, y) = ρ(y, x).


Let z = (ζ1 , ζ2 , . . . , ζn ). Then,

ρ²(x, z) = Σ_{i=1}^{n} (ξi − ζi )²
         = Σ_{i=1}^{n} (ξi − ηi )² + Σ_{i=1}^{n} (ηi − ζi )² + 2 Σ_{i=1}^{n} (ξi − ηi )(ηi − ζi )

Now by the Cauchy-Bunyakovsky-Schwartz inequality [see 1.4.3]

Σ_{i=1}^{n} (ξi − ηi )(ηi − ζi ) ≤ ( Σ_{i=1}^{n} (ξi − ηi )² )^{1/2} ( Σ_{i=1}^{n} (ηi − ζi )² )^{1/2} ≤ ρ(x, y)ρ(y, z)
Thus, ρ(x, z) ≤ ρ(x, y) + ρ(y, z).
Hence, all the axioms of a metric space are fulfilled. Therefore, ℝⁿ
under the metric defined by (1.1) is a metric space and is known as the
n-dimensional Euclidean space. If x, y, z denote three distinct points in ℝ²,
then the inequality ρ(x, z) ≤ ρ(x, y) + ρ(y, z) implies that the length of any
side of a triangle is always less than the sum of the lengths of the other
two sides of the triangle obtained by joining x, y, z. Hence, axiom 4) of
the set of metric axioms is known as the triangle inequality. The n-dimensional
unitary space ℂⁿ is the space of all ordered n-tuples of complex numbers
with metric defined by ρ(x, y) = ( Σ_{i=1}^{n} |ξi − ηi |² )^{1/2}. When n = 1 this is the
complex plane ℂ with the usual metric defined by ρ(x, y) = |x − y|.
(iii) Sequence space l∞
Let X be the set of all bounded sequences of complex numbers, i.e.,
every element of X is a complex sequence x = (ξ1 , ξ2 , . . .) or x = {ξi }
such that |ξi | ≤ Cx for all i, where Cx is a real number that may depend on x but not on i. We define the
metric as ρ(x, y) = sup_{i∈N} |ξi − ηi |, where y = {ηi }, N = {1, 2, 3, . . .}, and
‘sup’ denotes the least upper bound (l.u.b.). l∞ is called a sequence space
because each element of X (each point in X) is a sequence.

(iv) C([a, b])


Let X be the set of all real-valued continuous functions x, y, etc, that
are functions of an independent variable t defined on a given closed interval
J = [a, b].
We choose the metric defined by ρ(x, y) = max_{t∈J} |x(t) − y(t)|, where max
denotes the maximum. We may note that ρ(x, y) ≥ 0 and ρ(x, y) = 0 if and
only if x(t) = y(t) for all t ∈ J. Moreover, ρ(x, y) = ρ(y, x). To verify the triangle
inequality, we note that, for every t ∈ J,
|x(t) − z(t)| ≤ |x(t) − y(t)| + |y(t) − z(t)|
≤ max_{t∈J} |x(t) − y(t)| + max_{t∈J} |y(t) − z(t)|
= ρ(x, y) + ρ(y, z)
Hence, ρ(x, z) ≤ ρ(x, y) + ρ(y, z). Thus, all the axioms of a metric space
are satisfied.

The set of all continuous functions defined on the interval [a, b] with the
above metric is called the space of continuous functions and is defined on
J and denoted by C([a, b]). This is a function space because every point of
C([a, b]) is a function.
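The metric of C([a, b]) can be approximated numerically by taking the maximum of |x(t) − y(t)| over a fine grid of points of J. The following Python sketch (an added illustration; the functions x, y and the interval [0, 1] are arbitrary choices) does exactly this.

```python
# Illustrative sketch: approximating rho(x, y) = max_{t in [a, b]} |x(t) - y(t)|
# for two continuous functions by sampling on a uniform grid.
import math

def sup_metric(x, y, a, b, n=10_001):
    grid = [a + (b - a) * k / (n - 1) for k in range(n)]
    return max(abs(x(t) - y(t)) for t in grid)

x = lambda t: math.sin(2 * math.pi * t)
y = lambda t: t

print(sup_metric(x, y, 0.0, 1.0))   # close to max |sin(2*pi*t) - t| on [0, 1]
```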

(v) Discrete metric space



Let X be a set and let ρ be defined by ρ(x, y) = 0 if x = y, and ρ(x, y) = 1 if x ≠ y.
The above is called a discrete metric and the set X endowed with the
above metric is a discrete metric space.

(vi) The space M ([a, b]) of bounded real functions


Consider a set of all bounded functions x(t) of a real variable t, defined
on the segment [a, b]. Let the metric be defined by ρ(x, y) = sup_t |x(t) − y(t)|.
All the metric axioms are fulfilled with the above metric. The set of real
bounded functions with the above metric is designated as the space M [a, b].
It may be noted that C[a, b] ⊆ M ([a, b]).

(vii) The space BV ([a, b]) of functions of bounded variation


Let BV ([a, b]) denote the class of all functions of bounded variation on

[a, b], i.e., all f for which the total variation V (f ) = sup Σ_{i=1}^{n} |f (xi ) − f (xi−1 )|
is finite, where the supremum is taken over all partitions a = x0 < x1 <
x2 < · · · < xn = b. Let us take ρ(f, g) = V (f − g). If f = g, then V (f − g) = 0.
However, V (f − g) = 0 if and only if f and g differ by a constant.

ρ(f, g) = ρ(g, f ) since V (f − g) = V (g − f ).


If h is a function of bounded variation,
ρ(f, h) = V (f − h) = sup Σ_{i=1}^{n} |(f (ti ) − h(ti )) − (f (ti−1 ) − h(ti−1 ))|
= sup Σ_{i=1}^{n} |(f (ti ) − g(ti ) + g(ti ) − h(ti )) − (f (ti−1 ) − g(ti−1 ) + g(ti−1 ) − h(ti−1 ))|
≤ sup Σ_{i=1}^{n} |(f (ti ) − g(ti )) − (f (ti−1 ) − g(ti−1 ))| + sup Σ_{i=1}^{n} |(g(ti ) − h(ti )) − (g(ti−1 ) − h(ti−1 ))|
≤ V (f − g) + V (g − h) = ρ(f, g) + ρ(g, h)
Thus all the axioms of a metric space are fulfilled.

If BV ([a, b]) is decomposed into equivalence classes according to the
equivalence relation defined by f ∼ g if f (t) − g(t) is constant on
[a, b], then ρ(f, g) determines a metric on the space BV ([a, b]) of
such equivalence classes in an obvious way. Alternatively we may modify
the definition of ρ so as to obtain a metric on the original class BV ([a, b]).
For example, ρ(f, g) = |f (a) − g(a)| + V (f − g) is a metric on BV ([a, b]).
The subspace of this metric space, consisting of all f ∈ BV ([a, b]) for which
f (a) = 0, can naturally be identified with the space of equivalence classes described above.

(viii) The space c of convergent numerical sequences


Let X be the set of convergent numerical sequences x =
{ξ1 , ξ2 , ξ3 , . . . , ξn , . . .}, where lim_{i→∞} ξi = ξ exists. Let x = {ξ1 , ξ2 , . . . , ξn , . . .} and
y = {η1 , η2 , . . . , ηn , . . .}. Set ρ(x, y) = sup_i |ξi − ηi |.

(ix) The space m of bounded numerical sequences


Let X be the set of bounded numerical sequences x =
{ξ1 , ξ2 , . . . , ξn , . . .}, implying that for every x there is a constant K(x)
such that |ξi | ≤ K(x) for all i. Let x = {ξi }, y = {ηi } belong to X.
Introduce the metric ρ(x, y) = sup_i |ξi − ηi |.
It may be noted that the space c of convergent numerical sequences is
a subspace of the space m of bounded numerical sequences.

(x) Sequence space s


This space consists of the set of all (not necessarily bounded) sequences
of complex numbers and the metric ρ is defined by
ρ(x, y) = Σ_{i=1}^{∞} (1/2^i) · |ξi − ηi | / (1 + |ξi − ηi |),   where x = {ξi } and y = {ηi }.

Axioms 1-3 of a metric space are satisfied. To see that ρ(x, y) also
satisfies axiom 4 of a metric space, we proceed as follows:
Let f (t) = t/(1 + t), t ∈ ℝ. Since f ′(t) = 1/(1 + t)² > 0,
f (t) is monotonically increasing.
Hence |a + b| ≤ |a| + |b| ⇒ f (|a + b|) ≤ f (|a|) + f (|b|).
Thus, |a + b|/(1 + |a + b|) ≤ (|a| + |b|)/(1 + |a| + |b|) ≤ |a|/(1 + |a|) + |b|/(1 + |b|).

Let a = ξi − ζi , b = ζi − ηi , where z = {ζi }.


|ξi − ηi | |ξi − ζi | |ζi − ηi |
Thus, ≤ +
1 + |ξi − ηi | 1 + |ξi − ζi | 1 + |zetai − ηi |
Multiplying by 2^{-i} and summing over i, we obtain ρ(x, y) ≤ ρ(x, z) + ρ(z, y), indicating that the axiom on the 'triangle inequality' has been satisfied. Thus s is a metric space.
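To see how this metric behaves in practice, here is a small Python sketch (an illustration added to this edition, not part of the original argument): the weight 2^{-i} keeps the series finite even for unbounded sequences, and the truncation level N is an arbitrary illustrative choice.

```python
# Illustrative sketch: the metric of the sequence space s, truncated to the
# first N coordinates (the neglected tail of the series is at most 2**(-N)).
def rho_s(xi, eta, N=50):
    """Approximate rho(x, y) = sum_{i>=1} 2**(-i) * |xi_i - eta_i| / (1 + |xi_i - eta_i|)."""
    total = 0.0
    for i in range(1, N + 1):
        d = abs(xi(i) - eta(i))          # |xi_i - eta_i| for the i-th coordinates
        total += (0.5 ** i) * d / (1.0 + d)
    return total

# Example: x_i = i (an unbounded sequence) and y_i = 0; the distance is still finite (< 1).
print(rho_s(lambda i: i, lambda i: 0.0))
```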

Problems

1. Show that ρ(x, y) = |x − y| defines a metric on the set of all real numbers.
2. Show that the set of all n-tuples of real numbers becomes a metric
space under the metric ρ(x, y) = max{|x1 − y1 |, . . . , |xn − yn |} where
x = {xi }, y = {yi }.
3. Let ℝ be the space of real or complex numbers. The distance ρ of two elements f, g shall be defined as ρ(f, g) = ϕ(|f − g|), where ϕ(x) is a function defined for x ≥ 0, twice continuously differentiable and strictly monotonic increasing (that is, ϕ′(x) > 0), and ϕ(0) = 0. Then show that ρ(f, g) = 0 if and only if f = g.
4. Let C(B) be the space of continuous (real or complex) functions f, defined on a closed bounded domain B of ℝⁿ. Define ρ(f, g) = ϕ(r), where r = max_B |f − g|. For ϕ(r) we make the same assumptions as in example 3. When ϕ″(r) < 0, show that the function space is metric, but when ϕ″(r) > 0 the space is no longer metric.
1.4.3 Theorem (Hölder’s inequality)
If p > 1 and q is defined by 1/p + 1/q = 1, then

(H1)  Σ_{i=1}^{n} |x_i y_i| ≤ (Σ_{i=1}^{n} |x_i|^p)^{1/p} (Σ_{i=1}^{n} |y_i|^q)^{1/q}

for any complex numbers x_1, x_2, x_3, . . . , x_n, y_1, . . . , y_n.

(H2) If x ∈ l_p, i.e. x is pth power summable, and y ∈ l_q, where p, q are defined as above and x = {x_i}, y = {y_i}, then

Σ_{i=1}^{∞} |x_i y_i| ≤ (Σ_{i=1}^{∞} |x_i|^p)^{1/p} (Σ_{i=1}^{∞} |y_i|^q)^{1/q}.

This inequality is known as Hölder's inequality for sums.

(H3) If x(t) ∈ L_p(0, 1), i.e. x is pth power integrable, and y(t) ∈ L_q(0, 1), i.e. qth power integrable, where p and q are defined as above, then

∫_0^1 |x(t)y(t)| dt ≤ (∫_0^1 |x(t)|^p dt)^{1/p} (∫_0^1 |y(t)|^q dt)^{1/q}.

The above inequality is known as Hölder's inequality for integrals. Here p and q are said to be conjugate to each other.

Proof: We first prove the inequality

a^{1/p} b^{1/q} ≤ a/p + b/q,  a ≥ 0, b ≥ 0.  (1.2)

In order to prove the inequality we consider the function

f(t) = t^α − αt + α − 1, defined for 0 < α < 1, t ≥ 0.

Then f′(t) = α(t^{α−1} − 1), so that f(1) = f′(1) = 0, and f′(t) > 0 for 0 < t < 1 while f′(t) < 0 for t > 1.

It follows that f(t) ≤ 0 for t ≥ 0. The inequality (1.2) is true for b = 0 since p > 1. Suppose b > 0 and let t = a/b and α = 1/p. Then

f(a/b) = (a/b)^{1/p} − (1/p)(a/b) + 1/p − 1 ≤ 0.

Multiplying by b, we obtain

a^{1/p} b^{1−1/p} ≤ a/p + b(1 − 1/p) = a/p + b/q,  since 1 − 1/p = 1/q.

Applying this to the numbers

a_j = |x_j|^p / Σ_{i=1}^{n} |x_i|^p,   b_j = |y_j|^q / Σ_{i=1}^{n} |y_i|^q

for j = 1, 2, . . . , n, we get

|x_j y_j| / [(Σ_{j=1}^{n} |x_j|^p)^{1/p} (Σ_{j=1}^{n} |y_j|^q)^{1/q}] ≤ a_j/p + b_j/q,  j = 1, 2, . . . , n.

By adding these inequalities, the right-hand side takes the form

(Σ_{j=1}^{n} a_j)/p + (Σ_{j=1}^{n} b_j)/q = 1/p + 1/q = 1,

while the left-hand side gives Hölder's inequality (H1):

Σ_{j=1}^{n} |x_j y_j| ≤ (Σ_{j=1}^{n} |x_j|^p)^{1/p} (Σ_{j=1}^{n} |y_j|^q)^{1/q},  (1.3)

which proves (H1).


To prove (H2), we note that

x ∈ l_p ⇒ Σ_{j=1}^{∞} |x_j|^p < ∞ [see H2],   y ∈ l_q ⇒ Σ_{j=1}^{∞} |y_j|^q < ∞ [see H2].

Taking

a_j = |x_j|^p / Σ_{i=1}^{∞} |x_i|^p,   b_j = |y_j|^q / Σ_{i=1}^{∞} |y_i|^q,  j = 1, 2, . . . ,

we obtain as above

|x_j y_j| / [(Σ_{i=1}^{∞} |x_i|^p)^{1/p} (Σ_{i=1}^{∞} |y_i|^q)^{1/q}] ≤ a_j/p + b_j/q.

Summing both sides over j = 1, 2, . . . we obtain Hölder's inequality for sums.
In case p = 2 (and hence q = 2), the above inequality reduces to the Cauchy–Bunyakovsky–Schwarz inequality, namely

Σ_{i=1}^{∞} |x_i y_i| ≤ (Σ_{i=1}^{∞} |x_i|²)^{1/2} (Σ_{i=1}^{∞} |y_i|²)^{1/2}.

The Cauchy–Bunyakovsky–Schwarz inequality has numerous applications in a variety of mathematical investigations and will find important applications in some of the later chapters.
To prove (H3), we note that

x(t) ∈ L_p(0, 1) ⇒ ∫_0^1 |x(t)|^p dt < ∞ [see H3],   y(t) ∈ L_q(0, 1) ⇒ ∫_0^1 |y(t)|^q dt < ∞ [see H3].

Taking

a = |x(t)|^p / ∫_0^1 |x(t)|^p dt  and  b = |y(t)|^q / ∫_0^1 |y(t)|^q dt

in the inequality a^{1/p} b^{1−1/p} ≤ a/p + b/q, and integrating from 0 to 1, we have

∫_0^1 |x(t)y(t)| dt / [(∫_0^1 |x(t)|^p dt)^{1/p} (∫_0^1 |y(t)|^q dt)^{1/q}] ≤ 1,  (1.4)

which yields Hölder's inequality for integrals.
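Before moving on, a quick numerical spot-check of (H1) may be reassuring. The following sketch is illustrative only; the exponent p = 3 and the random test vectors are arbitrary choices, not taken from the text.

```python
# Numerical spot-check of Hölder's inequality (H1) for finite sums.
import random

p = 3.0
q = p / (p - 1.0)                        # conjugate exponent: 1/p + 1/q = 1
x = [random.uniform(-1, 1) for _ in range(10)]
y = [random.uniform(-1, 1) for _ in range(10)]

lhs = sum(abs(a * b) for a, b in zip(x, y))
rhs = sum(abs(a) ** p for a in x) ** (1 / p) * sum(abs(b) ** q for b in y) ** (1 / q)
assert lhs <= rhs + 1e-12                # (H1) holds, up to rounding error
print(lhs, "<=", rhs)
```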

1.4.4 Theorem (Minkowski’s inequality)


(M1) If p ≥ 1, then

(Σ_{i=1}^{n} |x_i + y_i|^p)^{1/p} ≤ (Σ_{i=1}^{n} |x_i|^p)^{1/p} + (Σ_{i=1}^{n} |y_i|^p)^{1/p}

for any complex numbers x_1, . . . , x_n, y_1, y_2, . . . , y_n.

(M2) If p ≥ 1 and x = {x_i} ∈ l_p, y = {y_i} ∈ l_p, i.e. both are pth power summable, then

(Σ_{i=1}^{∞} |x_i + y_i|^p)^{1/p} ≤ (Σ_{i=1}^{∞} |x_i|^p)^{1/p} + (Σ_{i=1}^{∞} |y_i|^p)^{1/p}.

(M3) If x(t) and y(t) belong to L_p(0, 1), then

(∫_0^1 |x(t) + y(t)|^p dt)^{1/p} ≤ (∫_0^1 |x(t)|^p dt)^{1/p} + (∫_0^1 |y(t)|^p dt)^{1/p}.

Proof: If p = 1 or p = ∞, (M1) is easily seen to be true.

Suppose 1 < p < ∞. Then

(Σ_{i=1}^{n} |x_i + y_i|^p)^{1/p} ≤ (Σ_{i=1}^{n} (|x_i| + |y_i|)^p)^{1/p}.  (1.5)

Moreover, (|x_i| + |y_i|)^p = (|x_i| + |y_i|)^{p−1}|x_i| + (|x_i| + |y_i|)^{p−1}|y_i|.
Summing these identities for i = 1, 2, . . . , n and applying Hölder's inequality,

Σ_{i=1}^{n} (|x_i| + |y_i|)^{p−1}|x_i| ≤ (Σ_{i=1}^{n} |x_i|^p)^{1/p} (Σ_{i=1}^{n} ((|x_i| + |y_i|)^{p−1})^q)^{1/q}
= (Σ_{i=1}^{n} |x_i|^p)^{1/p} (Σ_{i=1}^{n} (|x_i| + |y_i|)^p)^{1/q}.

Similarly we have

Σ_{i=1}^{n} (|x_i| + |y_i|)^{p−1}|y_i| ≤ (Σ_{i=1}^{n} |y_i|^p)^{1/p} (Σ_{i=1}^{n} (|x_i| + |y_i|)^p)^{1/q}.

From the above two inequalities,

Σ_{i=1}^{n} (|x_i| + |y_i|)^p ≤ [(Σ_{i=1}^{n} |x_i|^p)^{1/p} + (Σ_{i=1}^{n} |y_i|^p)^{1/p}] · (Σ_{i=1}^{n} (|x_i| + |y_i|)^p)^{1/q}  (1.6)

or,

(Σ_{i=1}^{n} (|x_i| + |y_i|)^p)^{1/p} ≤ (Σ_{i=1}^{n} |x_i|^p)^{1/p} + (Σ_{i=1}^{n} |y_i|^p)^{1/p},

assuming that Σ_{i=1}^{n} (|x_i| + |y_i|)^p ≠ 0.

From (1.5) and (1.6) we have

(Σ_{i=1}^{n} |x_i + y_i|^p)^{1/p} ≤ (Σ_{i=1}^{n} |x_i|^p)^{1/p} + (Σ_{i=1}^{n} |y_i|^p)^{1/p}.  (1.7)

(M2) is true for p = 1 and p = ∞.

To prove (M2) for 1 < p < ∞, we note that

x = {x_i} ∈ l_p ⇒ Σ_{i=1}^{∞} |x_i|^p < ∞, and also y = {y_i} ∈ l_p ⇒ Σ_{i=1}^{∞} |y_i|^p < ∞.

We examine Σ_{i=1}^{∞} |x_i + y_i|^p. Let us note that z = {z_i} ∈ l_p ⇒ z′ = {|z_i|^{p−1}} ∈ l_q.
Applying Hölder's inequality twice, to the sequences {x_i} ∈ l_p and {|x_i + y_i|^{p−1}} ∈ l_q, and correspondingly to {y_i} ∈ l_p, we get

Σ_{i=1}^{∞} |x_i + y_i|^p ≤ Σ_{i=1}^{∞} |x_i + y_i|^{p−1}|x_i| + Σ_{i=1}^{∞} |x_i + y_i|^{p−1}|y_i|
≤ (Σ_{i=1}^{∞} |x_i + y_i|^{(p−1)q})^{1/q} [(Σ_{i=1}^{∞} |x_i|^p)^{1/p} + (Σ_{i=1}^{∞} |y_i|^p)^{1/p}]
= (Σ_{i=1}^{∞} |x_i + y_i|^p)^{1/q} [(Σ_{i=1}^{∞} |x_i|^p)^{1/p} + (Σ_{i=1}^{∞} |y_i|^p)^{1/p}].

Assuming Σ_{i=1}^{∞} |x_i + y_i|^p ≠ 0, the above inequality yields, on division by (Σ_{i=1}^{∞} |x_i + y_i|^p)^{1/q},

(Σ_{i=1}^{∞} |x_i + y_i|^p)^{1/p} ≤ (Σ_{i=1}^{∞} |x_i|^p)^{1/p} + (Σ_{i=1}^{∞} |y_i|^p)^{1/p}.  (1.8)

It is easily seen that (M3) is true for p = 1 and p = ∞. To prove (M3) for 1 < p < ∞ we proceed as follows.

Let x(t) ∈ L_p(0, 1), i.e. ∫_0^1 |x(t)|^p dt < ∞, and y(t) ∈ L_p(0, 1), i.e. ∫_0^1 |y(t)|^p dt < ∞.
If z(t) ∈ L_p(0, 1), i.e. ∫_0^1 |z(t)|^p dt < ∞, then ∫_0^1 (|z(t)|^{p−1})^{p/(p−1)} dt < ∞, i.e. |z(t)|^{p−1} ∈ L_q(0, 1).

Let us consider the integral ∫_0^1 |x(t) + y(t)|^p dt. For 1 < p < ∞,

|x(t) + y(t)|^p ≤ (|x(t)| + |y(t)|)^p ≤ 2^p (|x(t)|^p + |y(t)|^p).

Hence

∫_0^1 |x(t) + y(t)|^p dt ≤ 2^p (∫_0^1 |x(t)|^p dt + ∫_0^1 |y(t)|^p dt) < ∞,  since x(t), y(t) ∈ L_p(0, 1).

Furthermore, ∫_0^1 |x(t) + y(t)|^p dt < ∞ ⇒ ∫_0^1 (|x(t) + y(t)|^{p−1})^{p/(p−1)} dt < ∞ ⇒ |x(t) + y(t)|^{p−1} ∈ L_q(0, 1), where p and q are conjugate to each other.

Using Hölder's inequality we conclude

∫_0^1 |x(t) + y(t)|^p dt ≤ ∫_0^1 |x(t) + y(t)|^{p−1}|x(t)| dt + ∫_0^1 |x(t) + y(t)|^{p−1}|y(t)| dt
≤ (∫_0^1 |x(t) + y(t)|^{(p−1)q} dt)^{1/q} (∫_0^1 |x(t)|^p dt)^{1/p} + (∫_0^1 |x(t) + y(t)|^{(p−1)q} dt)^{1/q} (∫_0^1 |y(t)|^p dt)^{1/p}
= (∫_0^1 |x(t) + y(t)|^p dt)^{1/q} [(∫_0^1 |x(t)|^p dt)^{1/p} + (∫_0^1 |y(t)|^p dt)^{1/p}].

Assuming that ∫_0^1 |x(t) + y(t)|^p dt ≠ 0 and dividing both sides of the above inequality by (∫_0^1 |x(t) + y(t)|^p dt)^{1/q}, we get

(∫_0^1 |x(t) + y(t)|^p dt)^{1/p} ≤ (∫_0^1 |x(t)|^p dt)^{1/p} + (∫_0^1 |y(t)|^p dt)^{1/p}.  (1.9)
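As with Hölder's inequality, (M1) can be spot-checked numerically. The sketch below is illustrative only; p = 2.5 and the random test vectors are arbitrary choices, not taken from the text.

```python
# Numerical spot-check of Minkowski's inequality (M1) for finite sums.
import random

p = 2.5
x = [random.uniform(-1, 1) for _ in range(10)]
y = [random.uniform(-1, 1) for _ in range(10)]

lhs = sum(abs(a + b) ** p for a, b in zip(x, y)) ** (1 / p)
rhs = sum(abs(a) ** p for a in x) ** (1 / p) + sum(abs(b) ** p for b in y) ** (1 / p)
assert lhs <= rhs + 1e-12                # (M1) holds, up to rounding error
print(lhs, "<=", rhs)
```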

Problems
1. Show that the Cauchy-Bunyakovsky-Schwartz inequality implies that
(|ξ1 | + |ξ2 | + · · · + |ξn |)2 ≤ n(|ξ1 |2 + · · · + |ξn |2 ).

2. In the plane of complex numbers show that the points z on the open
unit disk |z| < 1 form a metric space if the metric is defined as
 
ρ(z_1, z_2) = (1/2) log((1 + u)/(1 − u)),  where u = |(z_1 − z_2)/(1 − z̄_1 z_2)|.
(n) (n)
1.4.5 The spaces lp , l∞ , lp , p ≥ 1, l∞
(n) (n)
(i) The spaces lp , l∞
Let X be an n-dimensional arithmetic space, i.e., the set of all possible
n-tuples of real numbers and let x = {x1 , x2 , . . . , xn }, y = {y1 , y2 , . . . , yn },
and p ≥ 1.
n
1/p

We define ρp (x, y) = |xi − yi | p
. Let max |xi −yi | = |xk −yk |.
1≤i≤n
i=1
Then,
⎛ ⎞1/p
n  p
⎜   xi − yi  ⎟
ρp (x, y) = |xk − yk | ⎝1 +  
 xk − y k  ⎠ .
i=1
i=k

Making p → ∞, we get ρ∞ (x, y) = max |xi − yi |. It may be noted that


1≤i≤n
ρp (x, y) ∀ x, y ∈ X satisfies the axioms 1-3 of a metric space. Since by
Minkowski’s inequality,
n
1/p n
1/p n
1/p
  
|xi − zi |p ≤ |xi − yi |p + |yi − zi |p
i=1 i=1 i=1

axiom 4 of a metric space is satisfied. Hence the set X with the metric
(n)
ρp (x, y) is a metric space and is called l∞ .
Preliminaries 23

(ii) The spaces lp , p ≥ 1, l∞


Let X be the set of sequences x = {x_1, x_2, . . . , x_n, . . .} of real numbers. x is said to belong to the space l_p if Σ_{i=1}^{∞} |x_i|^p < ∞ (p ≥ 1, p fixed).
In l_p we introduce the metric ρ_p(x, y) for x = {x_i} and y = {y_i} as

ρ_p(x, y) = (Σ_{i=1}^{∞} |x_i − y_i|^p)^{1/p}.

The metric is a natural extension of the metric in l_p^(n) when n → ∞. To see that the series for ρ_p converges for x, y ∈ l_p we use Minkowski's inequality (M2). It may be noted that the above metric satisfies axioms 1–3 of a metric space. If z = {z_i} ∈ l_p, then Minkowski's inequality (M2) yields

ρ_p(x, z) = (Σ_{i=1}^{∞} |x_i − z_i|^p)^{1/p} = (Σ_{i=1}^{∞} |(x_i − y_i) + (y_i − z_i)|^p)^{1/p} ≤ ρ_p(x, y) + ρ_p(y, z).

Thus l_p is a metric space.
If p = 2, we have the space l_2 with the metric ρ(x, y) = (Σ_{i=1}^{∞} |x_i − y_i|²)^{1/2}. Later chapters will reveal that l_2 possesses a special property in that it admits a scalar product and hence becomes a Hilbert space.
l_∞ is the space of all bounded sequences, i.e., all x = {x_1, x_2, . . .} for which sup_i |x_i| < ∞, with metric ρ(x, y) = sup_i |x_i − y_i|, where y = {y_1, y_2, . . .}.
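The following short sketch (an illustration, not part of the text) computes ρ_p and ρ_∞ for a pair of 3-tuples and shows numerically that ρ_p approaches ρ_∞ as p grows, in line with the limit noted above.

```python
# Illustrative sketch of the metrics rho_p and rho_infinity on n-tuples.
def rho_p(x, y, p):
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)

def rho_inf(x, y):
    return max(abs(a - b) for a, b in zip(x, y))

x, y = (1.0, -2.0, 0.5), (0.0, 1.0, 0.5)
for p in (1, 2, 10, 100):
    print(p, rho_p(x, y, p))             # tends to rho_inf(x, y) = 3.0 as p grows
print("sup metric:", rho_inf(x, y))
```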

1.4.6 The complex spaces, non-metrizable spaces

(i) Complex spaces
Together with the real spaces C([0, 1]), m, l_p it is possible to consider the corresponding complex spaces C([0, 1]), m, l_p. The elements of the complex space C([0, 1]) are complex-valued continuous functions of a real variable. Similarly, the elements of the complex space m are bounded sequences of complex numbers, and the complex l_p spaces consist of those complex sequences for which the series of pth powers of the moduli converges.

(ii) Non-metrizable spaces


Let us consider the set F ([0, 1]) of all real functions defined on the
interval [0,1]. A sequence {xn (t)} ⊂ F ([0, 1]) will converge to x(t) ∈
F ([0, 1]), if for any fixed t, we have xn (t) → x(t). Thus the convergence of a
sequence of functions in F([0, 1]) is pointwise convergence. We will show that F([0, 1]) is not metrizable. Let M be the set of continuous functions in F([0, 1]), regarded as a metric space. Using the properties of closure in a metric space, M̄ = M. Since M is a set of continuous functions, the limits defining M̄ are in the sense of uniform convergence. However, F([0, 1]) admits only pointwise convergence. This means int M = Φ, i.e., M is nowhere dense and is therefore a set of the first category [see 1.4.18]. On the other hand, under pointwise limits the closure of M is the set of all real functions, so M would have to be of the second category [see 1.4.18]. Hence pointwise convergence is non-metrizable.

Problems
1. Find a sequence which converges to 0, but is not in any space lp where
1 ≤ p < ∞.
|x − y|
2. Show that the real line with ρ(x, y) = is a metric space.
1 + |x − y|
3. If (X, ρ) is any metric space, show that another metric of X is defined
by
ρ(x, y)
ρ (x, y) = .
1 + ρ(x, y)

4. Find a sequence {x} which is in lp with p > 1 but x ∈ l1 .


5. Show that the set of continuous functions on (−∞, ∞) with
ρ(x, y) = Σ_{n=1}^{∞} (1/2^n) · max[|x(t) − y(t)| : |t| ≤ n] / (1 + max[|x(t) − y(t)| : |t| ≤ n])
is a metric space.
2

6. Diameter, bounded set: The diameter D(A) of a non-empty set A in a metric space (X, ρ) is defined to be D(A) = sup_{x,y∈A} ρ(x, y). A is said to be bounded if D(A) < ∞. Show that A ⊆ B implies that D(A) ≤ D(B).
7. Distance between sets: The distance D(A, B) between two non-empty sets A and B of a metric space (X, ρ) is defined to be D(A, B) = inf_{x∈A, y∈B} ρ(x, y). Show that D does not define a metric on the power set of X.
8. Distance of a point from a set: The distance D(x, A) from a point x to a non-empty subset A of (X, ρ) is defined to be D(x, A) = inf_{a∈A} ρ(x, a). Show that for any x, y ∈ X, |D(x, A) − D(y, A)| ≤ ρ(x, y).

1.4.7 Definition: ball and sphere


In this section we introduce certain concepts which are quite important
in metric spaces. When applied to Euclidean spaces these concepts can
be visualised as an extension of objects in classical geometry to higher
dimensions. Given a point x0 ∈ X and a real number r > 0, we define
three types of sets:

(a) B(x0 , r) = {x ∈ X|ρ(x, x0 ) < r} (open ball)


(b) B(x0 , r) = {x ∈ X|ρ(x, x0 ) ≤ r} (closed ball)
(c) S(x0 , r) = {x ∈ X|ρ(x, x0 ) = r} (sphere).
In all these cases x_0 is called the centre and r the radius. An open ball in X is the set of all points of X whose distance from the centre x_0 is less than the radius r.

Note 1.4.1. In working with metric spaces we borrow some terminology from Euclidean geometry. But we should remember that balls and spheres in an arbitrary metric space do not possess the same properties as balls and spheres in ℝ³. An unusual property of a sphere is that it may be empty. For example, a sphere in a discrete metric space is null, i.e., S(x_0, r) = Φ, if r ≠ 1. We next consider two related concepts.
1.4.8 Definition: open set, closed set, neighbourhood, interior
point, limit point, closure
A subset M of a metric space X is said to be open if it contains a
ball about each of its points. A subset K of X is said to be closed if its
complement (in X) is open- that is, K C = X − K is open.
An open ball B(x_0, ε) of radius ε is often called an ε-neighbourhood of x_0. By a neighbourhood of x_0 we mean any subset of X which contains an ε-neighbourhood of x_0. We see that every neighbourhood of x_0 contains x_0; in other words, x_0 is a point in each of its neighbourhoods. If N is a neighbourhood of x_0 and N ⊆ M, then M is also a neighbourhood of x_0.
We call x_0 an interior point of a set M ⊆ X if M is a neighbourhood of x_0. The interior of M is the set of all interior points of M and is denoted by M⁰ or int(M). int(M) is open and is the largest open set contained in M. Symbolically, int(M) = {x ∈ M : B(x, ε) ⊆ M for some ε > 0}.
If A ⊆ X and x_0 ∈ X, then x_0 is called a limit point of A if every neighbourhood of x_0 contains at least one point of A other than x_0. That is, x_0 is a limit point of A if and only if every neighbourhood N_{x_0} of x_0 satisfies (N_{x_0} − {x_0}) ∩ A ≠ Φ.
If for all neighbourhoods N_x of x, N_x ∩ A ≠ Φ, then x is called a contact point of A. For each A ⊂ X, the set Ā, consisting of all points which are either points of A or limit points of A, is called the closure of A. The closure of a set is a closed set and is the smallest closed set containing A.

Note 1.4.2. In what follows we show how different metrics yield different types of open balls. Let X = ℝ² be the Euclidean plane. Then the unit open ball B(0, 1) is given in Figure 1.2(a). If the l_∞ norm is used, the unit open ball B(0, 1) is the unit square, as given in Figure 1.2(b). If the l_1 norm is used, the unit open ball B(0, 1) becomes the 'diamond shaped' region shown in Figure 1.2(c). If we select p > 2, B(0, 1) becomes a figure with curved sides, shown in Figure 1.2(d). The unit ball in C[0, 1] is given in Figure 1.2(e).

[Fig. 1.2: unit open balls B(0, 1) in (ℝ², ρ), (ℝ², ρ_∞), (ℝ², ρ_{l_1}), (ℝ², ρ_{l_3}), and the ball ||f − f_0|| < r about f_0 in C[0, 1].]
Note 1.4.3. It may be noted that the closed sets of a metric space have
the same basic properties as the closed numerical point sets, namely:

(i) the closure of M ∪ N equals M̄ ∪ N̄;
(ii) M ⊂ M̄;
(iii) the closure of M̄ equals M̄;
(iv) The closure of an empty set is empty.

1.4.9 Theorem
In any metric space X, the empty set Φ and the full space X are open.
To show that Φ is open, we must show that each point in Φ is the centre
of an open ball contained in Φ; but since there are no points in Φ, this
requirement is automatically satisfied. X is clearly open since every open
ball centered on each of its points is contained in X. This is because X is
the entire space.

Note 1.4.4. It may be noted that an open ball B(0, r) on the real line is the bounded open interval ]−r, r[ with its centre at the origin and total length 2r. We may note that [0, 1[ on the real line is not an open set: since the interval is closed on the left, every bounded open interval with the origin as centre, in other words every open ball B(0, r), contains points of ℝ not belonging to [0, 1[. On the other hand, if we consider X = [0, 1[ as a space in itself, then the set X is open. There is no inconsistency in the above statement if we note that when X = [0, 1[ is considered as a space, there are no points of the space outside [0, 1[. However, when X = [0, 1[ is considered as a subspace of ℝ, there are points of ℝ outside X. One should take note of the fact that whether or not a set is open is relevant only with respect to a specific metric space containing it, never on its own.
1.4.10 Theorem
In a metric space X each open ball is an open set.
Let B(x0 , r) be a given ball in a metric space X. Let x be any point in
B(x0 , r). Now ρ(x0 , x) < r. Let r1 = r−ρ(x0 , x). Hence B(x, r1 ) is an open
ball with centre x and radius r1 . We want to show that B(x, r1 ) ⊆ B(x0 , r).
For if y ∈ B(x, r1 ), then
ρ(y, x_0) ≤ ρ(y, x) + ρ(x, x_0) < r_1 + ρ(x, x_0) = r, i.e., y ∈ B(x_0, r).
Thus B(x, r1 ) is an open ball contained in B(x0 , r). Since x is arbitrary, it
follows that B(x0 , r) is an open set. In what follows we state some results
that will be used later on:
Let X be a metric space.
(i) A subset G of X is open ⇔ it is a union of open balls.
(ii) (a) every union of open sets in X is open and (b) any finite
intersection of open sets in X is open.

Note 1.4.5. The two properties mentioned in (ii) are vital properties
of a metric space and these properties are established by using only the
‘openness’ of a set in a metric space. No use of distance or metric is required
in the proof of the above theorem. These properties are germane to the
development of ‘topology’ and ‘topological spaces’. We discuss them in the
next section.
We will next mention some properties of closed sets in a metric space.
We should recall that a subset K of X is said to be closed if its complement
K C = X − K is open.
(i) The null set Φ and the entire set X in a metric space are closed.
(ii) In a metric space a closed ball is a closed set.
(iii) In a metric space, (a) the intersection of closed sets is closed and
(b) the union of a finite number of closed sets is a closed set.

Note 1.4.6. This leads to what is known as closed set topology.


1.4.11 Convergence, Cauchy sequence, completeness
In real analysis we know that a sequence of real numbers {ξ_i} is said to tend to a limit l if the distance of ξ_i from l is arbitrarily small for all i except a finite number of terms. In other words, it is the metric on ℝ
that helps us introduce the concept of convergence. This idea has been
generalized in any metric space where convergence of a sequence has been
defined with the help of the relevant metric.

Definition: convergence of a sequence, limit


A sequence {x_n} in a metric space X = (X, ρ) is said to converge or to be convergent if there is an x ∈ X such that lim_{n→∞} ρ(x, x_n) = 0. x is called the limit of {x_n} and we write x = lim_{n→∞} x_n. In other words, given ε > 0, ∃ n_0(ε) such that ρ(x, x_n) < ε ∀ n > n_0(ε).

Note 1.4.7. The limit of a convergent sequence must be a point of the


space X.
For example, let X be the open interval ]0, 1[ in ℝ with the usual metric ρ(x, y) = |x − y|. Then the sequence (1/2, 1/3, . . . , 1/n, . . .) is not convergent, since '0', the point to which the sequence is supposed to converge, does not belong to the space X.
1.4.12 Theorem
A sequence {xn } of points of a metric space X can converge to one limit
at most.
If the limit is not unique, let x_n → x and x_n → y as n → ∞, with x ≠ y. Then ρ(x, y) ≤ ρ(x_n, x) + ρ(x_n, y) < ε + ε for n ≥ n_0(ε). Since ε is an arbitrary positive number, it follows that ρ(x, y) = 0, i.e., x = y, a contradiction.
1.4.13 Theorem
If a sequence {xn } of points of X converges to a point x ∈ X, then the
set of numbers ρ(xn , θ) is bounded for every fixed point θ of the space X.

Note 1.4.8. In some spaces the limit of a sequence of elements is directly


defined. If we can introduce in this space a metric such that the limit
induced by the metric coincides with the initial limit, the given space is
called metrizable.

Note 1.4.9. It is known that in ℝ the Cauchy convergence criterion ensures the existence of the limit. Yet in an arbitrary metric space the fulfillment of the Cauchy convergence criterion does not ensure the existence of the limit. This necessitates the introduction of the notion of completeness.
1.4.14 Definition: Cauchy sequence, completeness
A sequence {x_n} in a metric space X = (X, ρ) is said to be a Cauchy sequence or fundamental sequence if, given ε > 0, ∃ n_0(ε), a positive integer, such that ρ(x_n, x_m) < ε for n, m > n_0(ε).

Note 1.4.10. Every convergent sequence is a Cauchy sequence, but the converse is not true for an arbitrary metric space, since there exist metric spaces that contain a Cauchy sequence with no element of the space as its limit.

1.4.15 Examples
(i) The space of rational numbers
Let X be the set of rational numbers, in which the metric is taken as ρ(r_1, r_2) = |r_1 − r_2|. Thus X is a metric space. Let us take r_1 = 1, r_2 = 1/2, . . . , r_n = 1/n, . . .; {r_n} is a Cauchy sequence and r_n → 0 as n → ∞. On the other hand, let us take r_n = (1 + 1/n)^n, where n is a positive integer. {r_n} is a Cauchy sequence. However, lim_{n→∞} (1 + 1/n)^n = e, which is not a rational number.

(ii) The space of polynomials P (t)(0 ≤ t ≤ 1)


Let X be the set of polynomials P(t) (0 ≤ t ≤ 1) and let the metric be defined by ρ(P, Q) = max_t |P(t) − Q(t)|. It can be seen that with the above metric the space X is a metric space. Let {P_n(t)} be a sequence of polynomials converging uniformly to a continuous function that is not a polynomial. Then the above sequence of polynomials is a Cauchy sequence with no limit in the space (X, ρ). In what follows, we give some examples of complete metric spaces.
examples of complete metric spaces.

(iii) Completeness of ℝⁿ and ℂⁿ

Let us consider x_p ∈ ℝⁿ. Then we can write x_p = (ξ_1^(p), ξ_2^(p), . . . , ξ_n^(p)). Similarly, x_q = (ξ_1^(q), ξ_2^(q), . . . , ξ_n^(q)). Then

ρ(x_p, x_q) = (Σ_{i=1}^{n} |ξ_i^(p) − ξ_i^(q)|²)^{1/2}.

Now, if {x_m} is a Cauchy sequence, for every ε > 0, ∃ n_0(ε) such that

ρ(x_p, x_q) = (Σ_{i=1}^{n} |ξ_i^(p) − ξ_i^(q)|²)^{1/2} < ε for p, q > n_0(ε).  (1.10)

Squaring, we have for p, q > n_0(ε) and i = 1, 2, . . . , n, |ξ_i^(p) − ξ_i^(q)|² < ε², so that |ξ_i^(p) − ξ_i^(q)| < ε. This shows that for each fixed i (1 ≤ i ≤ n) the sequence {ξ_i^(1), ξ_i^(2), . . .} is a Cauchy sequence of real numbers. Therefore ξ_i^(m) → ξ_i ∈ ℝ as m → ∞. Let us denote by x the vector x = (ξ_1, ξ_2, . . . , ξ_n). Clearly, x ∈ ℝⁿ. It follows from (1.10) that ρ(x_m, x) ≤ ε for m ≥ n_0(ε). This shows that x is the limit of {x_m}, and this proves completeness because {x_m} was an arbitrary Cauchy sequence. Completeness of ℂⁿ can be proven in a similar fashion.

(iv) Completeness of C([a, b]) and incompleteness of S([a, b])


Let {x_n(t)} ⊂ C([a, b]) be a Cauchy sequence, i.e., ρ(x_n(t), x_m(t)) → 0 as n, m → ∞. Thus, given ε > 0, ∃ n_0(ε) such that max_{t∈[a,b]} |x_n(t) − x_m(t)| < ε for n, m ≥ n_0(ε). Hence, for every fixed t = t_0 ∈ J = [a, b], |x_n(t_0) − x_m(t_0)| < ε for m, n > n_0(ε). Thus {x_n(t_0)} is a Cauchy sequence of real numbers. Since ℝ is complete, {x_n(t_0)} → x(t_0) ∈ ℝ. In this way we can associate with each t ∈ J a unique real number x(t) as the limit of the sequence {x_n(t)}. This defines (pointwise) a function x on J. Making n → ∞ in the above inequality, max_{t∈J} |x_m(t) − x(t)| ≤ ε for m ≥ n_0(ε) and every t ∈ J. Therefore {x_m(t)} converges uniformly to x(t) on J. Since the x_m(t) are continuous functions of t and the convergence is uniform, the limit x(t) is continuous on J. Hence x(t) ∈ C([a, b]), i.e., C([a, b]) is complete.

Note 1.4.11. We call C([a, b]) the real C([a, b]) if each member of C([a, b]) is real-valued. On the other hand, if each member of C([a, b]) is complex-valued, then we call the space the complex C([a, b]).
By arguing analogously as above we can show that the complex C([a, b]) is complete. We next consider the set X of all continuous real-valued functions on J = [a, b]. Let us define the metric ρ(x(t), y(t)) for x(t), y(t) ∈ X as
ρ(x, y) = ∫_a^b |x(t) − y(t)| dt.
We can easily see that the set X with the metric defined above is a metric space S[a, b] = (X, ρ). We next show that S[a, b] is not complete. Let us construct a sequence {x_n} as follows. Fix a < c < b and consider every n so large that a < c − 1/n. We define
x_n(t) = 0            if a ≤ t ≤ c − 1/n,
x_n(t) = nt − nc + 1  if c − 1/n ≤ t ≤ c,
x_n(t) = 1            if c ≤ t ≤ b.
[Fig. 1.3: graphs of x_m(t) and x_n(t), rising from 0 to 1 on the intervals [c − 1/m, c] and [c − 1/n, c].]
For n > m

 b    
1 1 1 1 1 1
|xn (t) − xm (t)|dt = ΔAOB = · 1 · c − − c + < +
a 2 n m 2 n m

Thus ρ(xn , xm ) → 0 as n, m → ∞. Hence(xn ) is a Cauchy sequence.


x(t) = 0, t ∈ [a, c)
Now, let x ∈ S[a, b], then lim ρ(xn , x) = 0 ⇒
n→i x(t) = 1, t ∈ (c, b]
Since it is impossible for a continuous function to have the property,
(xn ) does not have a limit.

(v) m, the space of bounded number sequences, is complete.



(vi) Completeness of lp , 1 ≤ p < ∞.


(a) Let {x_n} be any Cauchy sequence in the space l_p, where x_n = {ξ_1^(n), ξ_2^(n), . . . , ξ_i^(n), . . .}. Then given ε > 0, ∃ n_0(ε) such that ρ(x_n, x_m) < ε for n, m ≥ n_0(ε), i.e.,
(Σ_{i=1}^{∞} |ξ_i^(n) − ξ_i^(m)|^p)^{1/p} < ε.
It follows that for every i = 1, 2, . . ., |ξ_i^(n) − ξ_i^(m)| < ε (n, m ≥ n_0(ε)). We choose a fixed i. The above inequality yields {ξ_i^(1), ξ_i^(2), . . .} as a Cauchy sequence of numbers. The space ℝ being complete, ξ_i^(n) → ξ_i ∈ ℝ as n → ∞. Using these limits, we define x = {ξ_1, ξ_2, . . .} and show that x ∈ l_p and x_m → x as m → ∞. Since ε is an arbitrarily small positive number,
ρ(x_n, x_m) < ε ⇒ Σ_{i=1}^{k} |ξ_i^(n) − ξ_i^(m)|^p < ε^p (k = 1, 2, . . .).
Making n → ∞, we obtain for m > n_0(ε), Σ_{i=1}^{k} |ξ_i − ξ_i^(m)|^p ≤ ε^p. We may now let k → ∞; then for m > n_0(ε),
Σ_{i=1}^{∞} |ξ_i − ξ_i^(m)|^p ≤ ε^p.
This shows that x_m − x = {ξ_i^(m) − ξ_i} ∈ l_p. Since x_m ∈ l_p, it follows by the Minkowski inequality that x = (x − x_m) + x_m ∈ l_p. It also follows from the above inequality that (ρ(x_m, x))^p ≤ ε^p. Further, since ε is an arbitrarily small positive number, x_m → x as m → ∞. Since {x_m} is an arbitrary Cauchy sequence in l_p, this proves the completeness of l_p, 1 ≤ p < ∞.

(b) Let {x_n} be a Cauchy sequence in l_∞, where x_n = {ξ_1^(n), ξ_2^(n), . . . , ξ_i^(n), . . .}. Thus for each ε > 0 there is an N such that for m, n > N we have sup_i |ξ_i^(n) − ξ_i^(m)| < ε. It follows that for each i, {ξ_i^(n)} is a Cauchy sequence. Let ξ_i = lim_{n→∞} ξ_i^(n) and let x = {ξ_1, ξ_2, . . .}. Now for each i and n > N, it follows that |ξ_i^(n) − ξ_i| ≤ ε. Therefore |ξ_i| ≤ |ξ_i^(n)| + |ξ_i − ξ_i^(n)| ≤ |ξ_i^(n)| + ε for n > N. Hence the ξ_i are uniformly bounded, i.e., x ∈ l_∞, and {x_n} converges to x in the metric of l_∞. Hence l_∞ is complete under the metric defined for l_∞.
Problems
1. Show that in a metric space an ‘open ball’ is an open set and a ‘closed
ball’ is a closed set.
2. What is an open ball B(x_0; 1) in ℝ? In ℂ? In l_1? In C([0, 1])? In l_2?
3. Let X be a metric space. If {x} is a subset of X consisting of a single
point, show that its complement {x}c is open. More generally show
that AC is open if A is any finite subset of X.

4. Let X be a metric space and B(x, r) the open ball in X with centre x
and radius r. Let A be a subset of X with diameter less than r that
intersects B(x, r). Prove that A ⊆ B(x, 2r).
5. Show that the closure of an open ball B(x_0, r) in a metric space can differ from the closed ball B̄(x_0, r).
6. Describe the interior of each of the following subsets of the real line:
the set of all integers; the set of rationals; the set of all irrationals;
]0, 1]; [0, 1]; and [0, 1[∪{1, 2}.
7. Give an example of an infinite class of closed sets, the union of which
is not closed. Give an example of a set that (a) is both open and
closed, (b) is neither open nor closed, (c) contains a point that is not
a limit point of the set, and (d) contains no points that are not limit
points of the set.
8. Describe the closure of each of the following subsets of the real line;
the integers; the rationals; ]0, +∞[; ] − 1, 0[∪]0, 1[.
9. Show that the set of all real numbers constitutes an incomplete metric
space if we choose ρ(x, y) = | arctan x − arctan y|.
10. Show that the set of continuous real-valued functions on J = [0, 1]
do not constitute a complete metric space with the metric ρ(x, y) =
 1
|x(t) − y(t)|dt.
0
11. Let X be the metric space of all real sequences x = {ξ_i}, each of which has only finitely many nonzero terms, with ρ(x, y) = Σ_i |ξ_i − η_i|, where y = {η_i}. Show that {x_n} with x_n = {ξ_j^(n)}, ξ_j^(n) = j^{−2} for j = 1, 2, . . . , n, and ξ_j^(n) = 0 for j > n, is a Cauchy sequence but does not converge.
12. Show that {xn } is a Cauchy sequence if and only if ρ(xn+k , xn )
converges to zero uniformly in k.
13. Prove that the sequence 0.1, 0.101, 0.101001, 0.1010010001,. . . is a
Cauchy sequence of rational numbers that does not converge in the
space of rational numbers.
14. In the space l_2, let A = {x = (x_1, x_2, . . .) : |x_n| ≤ 1/n, n = 1, 2, . . .}. Prove that A is closed.

1.4.16 Criterion for completeness


Definition (dense set, everywhere dense set and nowhere dense set)
Given A and B subsets of X, A is said to be dense in B if B ⊆ Ā. A is said to be everywhere dense in X if Ā = X. A is said to be nowhere dense in X if the complement of Ā is everywhere dense, i.e. the closure of X − Ā is X, or equivalently int Ā = Φ. The set of rational numbers is dense in ℝ.

As an example in two-dimensional Euclidean space (the plane), any set of points whose coordinates are both rational is a set of the first category [see 1.4.18]: it is the union of countably many 'one-point' sets, each of which is nowhere dense. Although this set is of the first category, it is nevertheless dense in ℝ².
We state without proof the famous Cantor’s intersection theorem.
1.4.17 Theorem
Let a nested sequence of closed balls [i.e., each of which contains all that follow: B̄_1 ⊇ B̄_2 ⊇ · · · ⊇ B̄_n ⊇ · · ·] be given in a complete metric space X. If the radii of the balls tend to zero, then these balls have a unique common point.
1.4.18 Definition: first category, second category
A set M is said to be of the first category if it can be written as a
countable union of nowhere dense sets. Otherwise, it is said to be of the
second category. The set of rational points of a straight line is of the first
category while that of the irrational points is of the second category as
borne out by the following.
1.4.19 Theorem
A nonempty complete metric space is a set of the second category.
As an application of theorem 1.4.19, we prove the existence of nowhere differentiable functions on [0, 1] that are continuous in the said interval. Let us consider the metric space C_0([0, 1]) of continuous functions f for which f(0) = f(1), with ρ(f, g) = max{|f(x) − g(x)| : x ∈ [0, 1]}. Then C_0([0, 1]) is a complete metric space. We would like to show that those functions in C_0([0, 1]) that are somewhere differentiable form a subset of the first category. C_0([0, 1]), being complete, is of the second category and so cannot consist only of functions which are somewhere differentiable. Therefore C_0([0, 1]) must contain functions that are nowhere differentiable. For convenience we extend the functions of C_0([0, 1]) to the entire axis by periodicity and treat the space Γ of such extensions with the metric ρ defined above.
Let K ⊂ Γ be the set of functions f such that for some ξ, the set of numbers {(f(ξ + h) − f(ξ))/h : h > 0} is bounded. K contains the set of functions that are somewhere differentiable. We want to show that K is of the first category in Γ.
Let K_n = {f ∈ Γ : for some ξ, |(f(ξ + h) − f(ξ))/h| ≤ n for all h > 0}.
Then K = ∪_{n=1}^{∞} K_n. We shall show that for every n = 1, 2, . . ., (i) K_n is closed, and (ii) Γ ∼ K_n is everywhere dense in Γ. If (i) and (ii) are both true, then the closure of Γ ∼ K_n is Γ, i.e. int K_n = Φ. Since K_n is a closed set, it follows that K_n is nowhere dense in Γ. Hence K is of the first category in Γ.

For (i), let f be a limit point of K_n and let {f_k} be a sequence in K_n converging to f. For each k = 1, 2, . . . let ξ_k be in [0, 1] such that |f_k(ξ_k + h) − f_k(ξ_k)|/h ≤ n for all h > 0. Let ξ be a limit point of {ξ_k} and let {ξ_{k_j}} converge to ξ. For h > 0 and ε > 0,

|f(ξ + h) − f(ξ)|/h ≤ |f_{k_j}(ξ_{k_j} + h) − f_{k_j}(ξ_{k_j})|/h + (1/h){|f(ξ + h) − f(ξ_{k_j} + h)| + |f(ξ_{k_j} + h) − f_{k_j}(ξ_{k_j} + h)| + |f_{k_j}(ξ_{k_j}) − f(ξ_{k_j})| + |f(ξ_{k_j}) − f(ξ)|}.

There is an N = N(εh) such that k > N implies sup_t |f_k(t) − f(t)| ≤ εh/4. Since f is continuous and lim_{j→∞} ξ_{k_j} = ξ, there is an M > N such that for k_j > M we have |f(ξ + h) − f(ξ_{k_j} + h)| < εh/4 and |f(ξ_{k_j}) − f(ξ)| < εh/4. Hence, if k_j > M, we have

|f(ξ + h) − f(ξ)|/h ≤ |f_{k_j}(ξ_{k_j} + h) − f_{k_j}(ξ_{k_j})|/h + ε ≤ n + ε.

It follows that |f(ξ + h) − f(ξ)|/h ≤ n for all h > 0. Thus f ∈ K_n and K_n is closed.
For (ii), let us suppose that g ∈ Γ. Let ε > 0 and let us partition [0, 1] into k equal intervals such that if x, x′ are in the same interval of the partitioning, |g(x) − g(x′)| < ε/2 holds. Let us consider the ith subinterval (i − 1)/k ≤ x ≤ i/k and the rectangle erected on it whose ordinates satisfy g((i − 1)/k) − ε/2 ≤ y ≤ g((i − 1)/k) + ε/2. Thus (i/k, g(i/k)) lies on the right-hand side of the rectangle and ((i − 1)/k, g((i − 1)/k)) lies on the left-hand side of the rectangle. By joining these two points by a polygonal graph that remains within the rectangle and whose line segments have slopes exceeding n in absolute value, we obtain a continuous function that is within ε of g and, because its slopes exceed n, belongs to Γ ∼ K_n. Thus Γ ∼ K_n is dense in Γ. Combining (i) and (ii) we can say that each K_n is nowhere dense in Γ, and hence K is of the first category in Γ.
[Fig. 1.4: the rectangle of height ε over the subinterval [(i − 1)/k, i/k] together with the graph of g(x) and the steep polygonal approximation.]

1.4.20 Isometric mapping, isometric spaces, metric completion,


necessary and sufficient conditions
Definition: isometric mapping, isometric spaces
Let X1 = (X1 , ρ1 ) and X2 = (X2 , ρ2 ) be two metric spaces. Then
(a) A mapping f of X_1 into X_2 is said to be isometric or an isometry if f preserves distance, i.e., for all x, y ∈ X_1, ρ_2(fx, fy) = ρ_1(x, y), where fx and fy are the images of x and y respectively.
(b) The space X_1 is said to be isometric to X_2 if there exists a one-to-one and onto (bijective) isometry of X_1 onto X_2. The spaces X_1 and X_2 are then called isometric spaces.
In what follows we aim to show that every metric space can be embedded in a complete metric space in which it is dense. If the metric space X is dense in X̂, then X̂ is called the metric completion of X. For example, the space ℝ of real numbers is the completion of the space X of rational numbers corresponding to the metric ρ(x, y) = |x − y|, x, y ∈ X.
1.4.21 Theorem
Any metric space admits of a completion.
1.4.22 Theorem
If X ⊆ X̂, X is dense in X̂, and any fundamental sequence of points of
X has a limit in X̂, then X̂ is a completion of X.
1.4.23 Theorem
A subspace of a complete metric space is complete if and only if it is
closed.
1.4.24 Theorem
Given a metric space X_1, assume that this space is incomplete, i.e., there exists in this space a Cauchy sequence that has no limit in X_1. Then there exists a complete space X_2 containing a subset which is everywhere dense in X_2 and isometric to X_1.
Problems
1. Let X be a metric space. If (xn ) and (yn ) are sequences in X such
that xn → x and yn → y, show that ρ(xn , yn ) → ρ(x, y).
2. Show that a Cauchy sequence is convergent ⇔ it has a convergent
subsequence.
3. Exhibit a non-convergent Cauchy sequence in the space of
polynomials on [0,1] with uniform metric.
4. If ρ1 and ρ2 are metrics on the same set X and there are positive
numbers a and b such that for all x, y ∈ X, aρ1 (x, y) ≤ ρ2 (x, y) ≤
bρ1 (x, y), show that the Cauchy sequences in (X, ρ1 ) and (X, ρ2 ) are
the same.

5. Using the completeness of ℝ, prove the completeness of ℂ.


6. Show that the set of real numbers constitute an incomplete metric
space, if we choose ρ(x, y) = | arctan x − arctan y|.
7. Show that a discrete metric space is complete.
8. Show that the space c of convergent numerical sequences is complete with respect to a metric which you are to specify.
9. Show that convergence in c implies coordinate-wise convergence.
10. Show that the set of rational numbers is dense in ℝ.
11. Let X be a metric space and A a subset of X. Prove that A is everywhere dense in X ⇔ the only closed superset of A is X ⇔ the only open set disjoint from A is Φ.
12. Prove that a closed set F is nowhere dense if and only if it contains no non-empty open set.
13. Prove that if E is of the first category and A ⊆ E, then A is also of
the first category.
14. Show that a closed set is nowhere dense ⇔ its complement is
everywhere dense.
15. Show that the notion of being nowhere dense is not the opposite of being everywhere dense. [Hint: Let ℝ be a metric space with the usual metric and consider the subset consisting of the open interval ]1, 2[. The interior of the closure of this set is non-empty, whereas the closure of ]1, 2[ is certainly not all of ℝ.]
1.4.25 Contraction mapping principle
1.4.26 Theorem
In a complete metric space X, let A be a mapping that maps the elements
of the space X again into the elements of this space. Further for all x and
y in X, let ρ(A(x), A(y)) ≤ αρ(x, y) with 0 ≤ α < 1 independent of x and
y. Then, there exists a unique point x∗ such that A(x∗ ) = x∗ . The point
x∗ is called a fixed point of A.
Proof: Starting from an arbitrary element x_0 ∈ X, we build up the sequence {x_n} such that x_1 = A(x_0), x_2 = A(x_1), . . . , x_n = A(x_{n−1}), . . .. It is to be shown that {x_n} is a Cauchy or fundamental sequence. For this we note that

ρ(x_1, x_2) = ρ(A(x_0), A(x_1)) ≤ α ρ(x_0, x_1) = α ρ(x_0, A(x_0)),
ρ(x_2, x_3) = ρ(A(x_1), A(x_2)) ≤ α ρ(x_1, x_2) ≤ α² ρ(x_0, A(x_0)),
· · · · · · · · ·
ρ(x_n, x_{n+1}) ≤ αⁿ ρ(x_0, A(x_0)).

Further,

ρ(x_n, x_{n+p}) ≤ ρ(x_n, x_{n+1}) + ρ(x_{n+1}, x_{n+2}) + · · · + ρ(x_{n+p−1}, x_{n+p})
≤ (αⁿ + α^{n+1} + · · · + α^{n+p−1}) ρ(x_0, A(x_0))
= (αⁿ − α^{n+p})/(1 − α) · ρ(x_0, A(x_0)).

Therefore ρ(x_n, x_{n+p}) ≤ αⁿ/(1 − α) · ρ(x_0, A(x_0)), and since 0 ≤ α < 1, ρ(x_n, x_{n+p}) → 0 as n → ∞, for every p > 0. Thus {x_n} is a Cauchy sequence. Since the space is complete, there is an element x* ∈ X, the limit of the sequence, x* = lim_{n→∞} x_n.

We shall show that A(x*) = x*.

ρ(x*, A(x*)) ≤ ρ(x*, x_n) + ρ(x_n, A(x*))
= ρ(x*, x_n) + ρ(A(x_{n−1}), A(x*))
≤ ρ(x*, x_n) + α ρ(x_{n−1}, x*).

For n sufficiently large we can write ρ(x*, x_n) < ε/2 and ρ(x*, x_{n−1}) < ε/2α, for any given ε. Hence ρ(x*, A(x*)) < ε. Since ε > 0 is arbitrary, ρ(x*, A(x*)) = 0, i.e., A(x*) = x*.

Let us assume that there exist two elements x*, y* ∈ X, x* ≠ y*, satisfying A(x*) = x* and A(y*) = y*. Then ρ(x*, y*) = ρ(A(x*), A(y*)) ≤ α ρ(x*, y*). Since x* ≠ y* and α < 1, this inequality is impossible unless ρ(x*, y*) = 0, i.e., x* = y*. Making p → ∞ in the inequality ρ(x_n, x_{n+p}) ≤ (αⁿ − α^{n+p})/(1 − α) · ρ(x_0, A(x_0)), we obtain ρ(x_n, x*) ≤ αⁿ/(1 − α) · ρ(x_0, A(x_0)).
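The proof is constructive: starting from any x_0, the iterates x_{n+1} = A(x_n) converge to the fixed point, with the a priori error bound αⁿ ρ(x_0, A(x_0))/(1 − α) obtained above. The sketch below illustrates this; the particular map A(x) = cos x on [0, 1], for which |A′(x)| ≤ sin 1 =: α < 1, is my own illustrative choice and does not appear in the text.

```python
# Illustrative fixed-point iteration for a contraction: A(x) = cos(x) on [0, 1],
# where |A'(x)| <= sin(1) =: alpha < 1.  The a priori bound
# alpha**n / (1 - alpha) * rho(x0, A(x0)) comes from the proof above.
import math

A = math.cos
alpha = math.sin(1.0)
x0 = 1.0
bound0 = abs(x0 - A(x0)) / (1.0 - alpha)

x = x0
for n in range(1, 31):
    x = A(x)                             # x_n = A(x_{n-1})
    a_priori = (alpha ** n) * bound0     # guaranteed distance from the fixed point

print(x, a_priori)                       # x is close to the fixed point ~0.7390851332
```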
Note 1.4.12. Given an equation F(x) = 0, where F : ℝⁿ → ℝⁿ, we can write the equation F(x) = 0 in the form x = x − F(x). Denoting x − F(x) by A(x), we see that the problem of finding a solution of F(x) = 0 is equivalent to finding a fixed point of A(x), and vice versa.
1.4.27 Applications
(i) Solution of a system of linear equations by the iterative
method
Let us consider the real n-dimensional space. If x = (ξ_1, ξ_2, . . . , ξ_n) and y = (η_1, η_2, . . . , η_n), let us define the metric as ρ(x, y) = max_i |ξ_i − η_i|. Let us consider y = Ax, where A is an n × n matrix, i.e., A = (a_ij). The system of linear equations is given by η_i = Σ_{j=1}^{n} a_ij ξ_j, i = 1, 2, . . . , n. Then ρ(y_1, y_2) = ρ(Ax_1, Ax_2) yields

max_i |η_i^(1) − η_i^(2)| = max_i |Σ_j a_ij (ξ_j^(1) − ξ_j^(2))| ≤ max_i (Σ_j |a_ij|) ρ(x_1, x_2).

Now if it is assumed that Σ_j |a_ij| < 1 for all i, then the contraction mapping principle becomes applicable and consequently the mapping x ↦ Ax has a unique fixed point.
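A hedged sketch of the resulting iteration follows. The passage above does not write the system in fixed-point form, so the constant term b and the particular matrix below are illustrative additions: writing x = Ax + b and iterating x_{k+1} = Ax_k + b converges in the metric ρ(x, y) = max_i |x_i − y_i| whenever max_i Σ_j |a_ij| < 1.

```python
# Illustrative iteration x_{k+1} = A x_k + b, assuming max_i sum_j |a_ij| < 1.
# The matrix A and vector b are arbitrary example data, not from the text.
A = [[0.1, 0.2, 0.0],
     [0.0, 0.3, 0.1],
     [0.2, 0.0, 0.4]]          # every row sum of |a_ij| is below 1
b = [1.0, 2.0, 3.0]

x = [0.0, 0.0, 0.0]
for _ in range(100):
    x = [sum(A[i][j] * x[j] for j in range(3)) + b[i] for i in range(3)]
print(x)                        # approximate fixed point of x = Ax + b
```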

(ii) Existence and uniqueness of the solution of an integral


equation

1.4.28 Theorem

Let k(t, s) be a real valued function defined in the square a ≤ t, s ≤ b such that ∫_a^b ∫_a^b k²(t, s) dt ds < ∞. Let f(t) ∈ L_2([a, b]), i.e., ∫_a^b |f(t)|² dt < ∞. Then the integral equation

x(t) = f(t) + λ ∫_a^b k(t, s) x(s) ds

has a unique solution x(t) ∈ L_2([a, b]) for every sufficiently small value of the parameter λ.
 b
Proof: Consider the operator Ax(t) = f(t) + λ ∫_a^b k(t, s) x(s) ds. Let x(t) ∈ L_2([a, b]), i.e., ∫_a^b x²(t) dt < ∞. We first show that for x(t) ∈ L_2([a, b]), Ax ∈ L_2([a, b]):

∫_a^b (Ax)² dt = ∫_a^b f²(t) dt + 2λ ∫_a^b f(t) (∫_a^b k(t, s) x(s) ds) dt + λ² ∫_a^b (∫_a^b k(t, s) x(s) ds)² dt.

Using Fubini's theorem [th. 10.5] and the square integrability of k(t, s) we can show that

∫_a^b f(t) (∫_a^b k(t, s) x(s) ds) dt = ∫_a^b ∫_a^b k(t, s) x(s) f(t) dt ds
≤ (∫_a^b ∫_a^b k²(t, s) dt ds)^{1/2} (∫_a^b f²(t) dt)^{1/2} (∫_a^b x²(s) ds)^{1/2} < +∞.

Similarly we have ∫_a^b (∫_a^b k(t, s) x(s) ds)² dt < ∞. Thus A(x) ∈ L_2([a, b]), and therefore A : L_2([a, b]) → L_2([a, b]). Using the metric in L_2([a, b]), i.e., given x(t), y(t) ∈ L_2([a, b]),

ρ(Ax, Ay) = (∫_a^b |Ax − Ay|² dt)^{1/2}
= |λ| [∫_a^b (∫_a^b k(t, s)(x(s) − y(s)) ds)² dt]^{1/2}
≤ |λ| (∫_a^b ∫_a^b |k(t, s)|² dt ds)^{1/2} (∫_a^b |x(s) − y(s)|² ds)^{1/2}
= |λ| (∫_a^b ∫_a^b |k(t, s)|² dt ds)^{1/2} ρ(x, y) = α ρ(x, y),

where α = |λ| (∫_a^b ∫_a^b |k(t, s)|² dt ds)^{1/2}, and α < 1 if |λ| < (∫_a^b ∫_a^b |k(t, s)|² dt ds)^{−1/2}. Thus the contraction mapping principle holds, proving the existence and uniqueness of the solution of the given integral equation for values of λ satisfying the above inequality.
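The contraction argument again suggests successive approximations x_{k+1}(t) = f(t) + λ ∫_a^b k(t, s) x_k(s) ds. The rough sketch below (illustrative only: the kernel, the free term f, the value of λ and the midpoint-rule quadrature are all my own choices, not taken from the text) discretises one such iteration.

```python
# Rough numerical illustration of the successive approximations
# x_{k+1}(t) = f(t) + lam * \int_a^b k(t,s) x_k(s) ds, on a uniform grid.
import math

a, b, N = 0.0, 1.0, 200
h = (b - a) / N
s = [a + (i + 0.5) * h for i in range(N)]           # midpoints of the grid

kernel = lambda t, u: math.exp(-abs(t - u))          # an example square-integrable kernel
f = lambda t: math.sin(math.pi * t)                  # an example free term
lam = 0.3                                            # small enough that alpha < 1 here

x = [f(t) for t in s]                                # x_0 = f
for _ in range(50):                                  # Picard iterations
    x = [f(t) + lam * h * sum(kernel(t, u) * xu for u, xu in zip(s, x)) for t in s]

print(x[0], x[N // 2], x[-1])                        # sample values of the approximate solution
```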

(iii) Existence and uniqueness of solution for ordinary differential


equations

Definition: Lipschitz condition


Let E be a connected open set in the plane ℝ² of the form E = ]s_0 − a, s_0 + a[ × ]t_0 − b, t_0 + b[, where a > 0, b > 0, (s_0, t_0) ∈ E. Let f be a real function defined on E. We shall say that f satisfies a Lipschitz condition in t on E, with Lipschitz constant M, if for every (s, t_1) and (s, t_2) in E, with s ∈ ]s_0 − a, s_0 + a[, we have |f(s, t_1) − f(s, t_2)| ≤ M|t_1 − t_2|.
Let (s_0, t_0) ∈ E. By a local solution passing through (s_0, t_0) we mean a function ϕ defined on ]s_0 − a, s_0 + a[ such that ϕ(s_0) = t_0, (s, ϕ(s)) ∈ E for every s ∈ ]s_0 − a, s_0 + a[, and ϕ′(s) = f(s, ϕ(s)) for every s ∈ ]s_0 − a, s_0 + a[.
1.4.29 Theorem
If f is continuous on the open connected set E = ]s_0 − a, s_0 + a[ × ]t_0 − b, t_0 + b[ and satisfies a Lipschitz condition in t on E, then for every (s_0, t_0) ∈ E the differential equation dt/ds = f(s, t) has a unique local solution passing through (s_0, t_0).
Proof: We first show that a function ϕ defined on the interval ]s_0 − a, s_0 + a[ such that ϕ(s_0) = t_0 and ϕ′(s) = f(s, ϕ(s)) for every s in the said interval is of the form

ϕ(s) = t_0 + ∫_{s_0}^{s} f(t′, ϕ(t′)) dt′.

It may be observed from the above form that ϕ(s_0) = t_0, ϕ(s) is differentiable, and ϕ′(s) = f(s, ϕ(s)).
Let E_1 ⊆ E = ]s_0 − a, s_0 + a[ × ]t_0 − b, t_0 + b[, a > 0, b > 0, be an open connected set containing (s_0, t_0). Let f be bounded on E_1 and let |f(s, t)| ≤ A for all (s, t) ∈ E_1. Let d > 0 be such that (a) the rectangle R ⊆ E_1, where R = ]s_0 − d, s_0 + d[ × ]t_0 − dA, t_0 + dA[, and (b) Md < 1, where M is a Lipschitz constant for f in E.
Let J = ]s_0 − d, s_0 + d[. The set B of continuous functions ψ on J such that ψ(s_0) = t_0 and |ψ(s) − t_0| ≤ dA for every s ∈ J is a complete metric space under the uniform metric ρ.
Consider the mapping T defined by (Tψ)(s) = t_0 + ∫_{s_0}^{s} f(t, ψ(t)) dt for ψ ∈ B and s ∈ J.
Now (Tψ)(s_0) = t_0, Tψ is continuous, and for every s ∈ J,

|Tψ(s) − t_0| = |∫_{s_0}^{s} f(t, ψ(t)) dt| ≤ |∫_{s_0}^{s} |f(t, ψ(t))| dt| ≤ dA.

Hence Tψ ∈ B. Thus T maps B into B.
We now show that T is a contraction. Let ψ_1, ψ_2 ∈ B. Then for every s ∈ J,

|Tψ_1(s) − Tψ_2(s)| = |∫_{s_0}^{s} (f(t′, ψ_1(t′)) − f(t′, ψ_2(t′))) dt′| ≤ Md max[|ψ_1(t′) − ψ_2(t′)| : t′ ∈ J],

so that ρ(Tψ_1, Tψ_2) ≤ Md ρ(ψ_1, ψ_2). Hence T is a contraction.
We next show that the local solution can be extended across E_1. Let J = J_1, d = d_1, s_0 + d = s_1 and ϕ(s_1) = t_1. By theorem 1.4.29 applied to (s_1, t_1) we obtain J_2, d_2 and (s_2, t_2). The solution functions ϕ = ϕ_1 on J_1 and ϕ_2 on J_2 agree on an interval and so yield a solution on J_1 ∪ J_2. In this way we obtain a sequence {(s_n, t_n)} with s_{n+1} > s_n, n = 1, 2, . . .. We assume that E_1 is bounded and show that the distance of (s_n, t_n) from the boundary of E_1 converges to zero. If (s_n, t_n) ∈ E_1, we denote by δ_n the distance of (s_n, t_n) from the boundary. We take

d_n = min(δ_n/√(A² + 1), 1/(2M)),

so that M d_n < 1 and d_n ≤ δ_n/√(A² + 1). Thus s_{n+1} = s_n + d_n and ϕ(s_{n+1}) = t_{n+1}. Hence (s_{n+1}, t_{n+1}) ∈ E_1. Since d_n > 0 for all n and the points s_n remain in the bounded set E_1, Σ_{n=1}^{∞} d_n < ∞. If δ_n/√(A² + 1) is smaller than 1/(2M), then d_n = δ_n/√(A² + 1); since Σ d_n < ∞, Σ δ_n = √(A² + 1) Σ d_n < ∞. On the other hand, if 1/(2M) were smaller than δ_n/√(A² + 1), then d_n = 1/(2M); since Σ d_n < ∞, we would need lim_{n→∞} n/(2M) < ∞, so that M would have to be of the order Kn for some finite K. But the Lipschitz constant M is fixed and cannot be arbitrarily large. Hence δ_n/√(A² + 1) < 1/(2M) and Σ δ_n < ∞. Therefore δ_n → 0 as n → ∞. Keeping in mind that E is the union of an increasing sequence of sets, each having the above properties of E_1, we have the following theorem.

the following theorem.
1.4.30 Theorem
If f is continuous on an open connected set E and satisfies the Lipschitz
condition in t on E, then for every (s_0, t_0) ∈ E the differential equation dt/ds = f(s, t) has a unique solution t = ϕ(s) such that t_0 = ϕ(s_0) and such that the curve given by the solution passes through E from boundary to boundary.
1.4.31 Quasimetric space
If we relax the condition ρ(x, y) = 0 ⇔ x = y, we get what is known as
a quasimetric space. Formally, a quasimetric space is a pair of (X, q) where
X is a set and q (quasidistance) is a real function defined on X × X such
that for all x, y, z ∈ X we have q(x, x) = 0 and q(x, z) ≤ q(x, y) + q(z, y).
We next aim to show that the quasidistance is symmetric and non-negative.
If we take x = y, then q(x, z) ≤ q(x, x) + q(z, x). Since q(x, x) = 0,
q(x, z) ≤ q(z, x). Similarly, we can show q(z, x) ≤ q(x, z). This is only
possible if q(x, z) = q(z, x), which proves symmetry.
Taking x = z in the inequality, we have 0 ≤ 2q(z, y) or q(z, y) ≥ 0,
which shows non-negativity.
Combining q(x, y) ≥ q(x, z) − q(y, z) and q(x, y) ≥ q(y, z) − q(x, z), we
can write |q(x, z) − q(y, z)| ≤ q(x, y).
1.4.32 Example
Let ℝ² be the two-dimensional plane, and let x_1 = (ξ_1, η_1), x_2 = (ξ_2, η_2), where x_1, x_2 ∈ ℝ². The quasidistance between x_1 and x_2 is given by q(x_1, x_2) = |ξ_1 − ξ_2|. We will show that ℝ² with the above quasidistance between two points is a quasimetric space.
Firstly, q(x_1, x_1) = 0. If x_3 = (ξ_3, η_3),

q(x_1, x_3) = |ξ_1 − ξ_3| ≤ |ξ_1 − ξ_2| + |ξ_2 − ξ_3| = q(x_1, x_2) + q(x_3, x_2).

Hence ℝ² with the above quasidistance is a quasimetric space.


Note 1.4.13. q(x_1, x_2) = 0 does not imply x_1 = x_2. Thus a quasimetric space is not necessarily a metric space.
Problems
1. Show that theorem 1.4.26 fails to hold if T has only the property
ρ(T x, T y) < ρ(x, y).
2. If T : X → X satisfies ρ(T x, T y) < ρ(x, y) when x = y and T has
a fixed point, show that the fixed point is unique; here (X, ρ) is a
metric space.

3. Prove that if T is a contraction in a complete metric space and x ∈ X,


then T(lim_{n→∞} Tⁿx) = lim_{n→∞} T^{n+1}x.

4. If T is a contraction, show that T n (n ∈ N ) is a contraction. If T n is


a contraction for n > 1, show that T need not be a contraction.
5. Show that f defined by f(t, x) = |sin(x)| + t satisfies a Lipschitz condition on the whole tx-plane with respect to its second argument, but that ∂f/∂x does not exist when x = 0. What fact does this illustrate?
6. Does f defined by f (t, x) = |x|1/2 satisfy a Lipschitz condition?
7. Show that the differential equation d²u/dx² = −f(x), where u ∈ C²(0, 1), f(x) ∈ C(0, 1) and u(0) = u(1) = 0, is equivalent to the integral equation u(x) = ∫_0^1 G(x, t) f(t) dt, where G(x, t) is defined as
G(x, t) = x(1 − t) if x ≤ t,  and  G(x, t) = t(1 − x) if t ≤ x.
   
8. For the vector iteration x_{n+1} = 2x_n, y_{n+1} = (1/2)x_n, show that x = y = 0 is a fixed point.
9. Let X = {x ∈ ℝ : x ≥ 1} ⊂ ℝ and let the mapping T : X → X be defined by Tx = x/2 + x^{−1}. Show that T is a contraction.
10. Let the mapping T : [a, b] → [a, b] satisfy the condition |T x − T y| ≤
k|x − y|, for all x, y ∈ [a, b]. (a) Is T a contraction? (b) If
T is continuously differentiable, show that T satisfies a Lipschitz
condition. (c) Does the converse of (b) hold?
11. Apply the Banach fixed point theorem to prove that the following system of equations has a unique solution:

2ξ1 + ξ2 + ξ3 = 4

ξ1 + 2ξ2 + ξ3 = 4
ξ1 + ξ2 + 2ξ3 = 4

12. Show that x′ = 3x^{2/3}, x(0) = 0 has infinitely many solutions x, given by x(t) = 0 if t < c and x(t) = (t − c)³ if t ≥ c, where c > 0 is any constant. Does 3x^{2/3} on the right-hand side satisfy a Lipschitz condition?
13. Pseudometric: A finite pseudometric on a set X is a function ρ : X × X → ℝ satisfying, for all x, y ∈ X, conditions (1), (3) and (4) of Section 1.4.1 together with ρ(x, x) = 0 for all x ∈ X (in place of condition (2)).
What is the difference between a metric space and a pseudometric space? Show that ρ(x, y) = |ξ_1 − η_1| defines a pseudometric on the set of all ordered pairs of real numbers, where x = (ξ_1, ξ_2), y = (η_1, η_2).
4
14. Show that the (real or complex) vector space ℝⁿ of vectors x = {x_1, . . . , x_n}, y = {y_1, . . . , y_n} becomes a pseudometric space by introducing the distance as a vector: ρ(x, y) = (p_1|x_1 − y_1|, p_2|x_2 − y_2|, . . . , p_n|x_n − y_n|). The p_j (j = 1, 2, . . . , n) are fixed positive constants. The order is introduced as follows: for x, y ∈ ℝⁿ, x ≤ y ⇔ x_i ≤ y_i, i = 1, 2, . . . , n.
15. Show that the space of real or complex valued functions f(x_1, x_2, . . . , x_n) that are continuous on the closure B̄ of a domain B of ℝⁿ is pseudometric when the distance is the function ρ(f(x), g(x)) = p(x)|f(x) − g(x)|, and p(x) is a given positive function on B̄.
1.4.33 Separable space
Definition: separable space. A space X is said to be separable if it contains a countable everywhere dense set; in other words, if there is in X a sequence (x_1, x_2, . . . , x_n, . . .) such that for each x ∈ X we can find a subsequence (x_{n_1}, x_{n_2}, . . . , x_{n_k}, . . .) of the above sequence which converges to x. If X is a metric space, then separability can be defined as follows: there exists a sequence {x_1, x_2, . . . , x_n, . . .} in X such that for every x ∈ X and every ε > 0 we can find an element x_{n_0} of it satisfying ρ(x, x_{n_0}) < ε.

The separability of the n-dimensional Euclidean space ℝⁿ

The set ℝ₀ⁿ, which consists of all points of ℝⁿ with rational coordinates, is countable and everywhere dense in ℝⁿ.
The separability of the space C([0, 1]). In the space C([0, 1]), the set C_0, consisting of all polynomials with rational coefficients, is countable. Take any function x(t) ∈ C([0, 1]). By the Weierstrass approximation theorem [Theorem 1.4.34] there is a polynomial p(t) such that

max_t |x(t) − p(t)| < ε/2,

ε > 0 being any preassigned number. On the other hand, there exists another polynomial p_0(t) with rational coefficients such that

max_t |p(t) − p_0(t)| < ε/2.

Hence ρ(x, p_0) = max_t |x(t) − p_0(t)| < ε. Hence C([0, 1]) is separable.

1.4.34 The Weierstrass approximation theorem


If [a, b] is a closed interval on the real-line, then the polynomials with
real coefficients are dense in C([a, b]).

In other words, every continuous function on [a, b] is the limit of a uniformly


convergent sequence of polynomials.
The polynomials
B_n(x) = Σ_{k=0}^{n} C(n, k) x^k (1 − x)^{n−k} f(k/n),  0 ≤ x ≤ 1, where C(n, k) is the binomial coefficient,
are called Bernstein polynomials associated with f . We can prove our


theorem by finding a Bernstein polynomial with the required property.

Note 1.4.14. The Weierstrass theorem for C([0, 1]) says in effect that the real linear combinations of the functions 1, x, x², . . . , xⁿ, . . . are dense in C([0, 1]).
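A brief numerical illustration (added here; the test function f(x) = |x − 1/2| and the grid are arbitrary choices, not from the text) shows the Bernstein polynomials converging uniformly to a continuous, non-polynomial function.

```python
# Illustrative Bernstein polynomial B_n(x) = sum_k C(n,k) x^k (1-x)^(n-k) f(k/n).
from math import comb

def bernstein(f, n, x):
    return sum(comb(n, k) * x**k * (1 - x)**(n - k) * f(k / n) for k in range(n + 1))

f = lambda x: abs(x - 0.5)               # a continuous (non-polynomial) test function
for n in (5, 20, 80):
    err = max(abs(bernstein(f, n, i / 200) - f(i / 200)) for i in range(201))
    print(n, err)                         # the maximum error decreases as n grows
```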
The separability of the space l_p (1 < p < ∞)
Let E_0 be the set of all elements x of the form (r_1, r_2, . . . , r_n, 0, 0, . . .), where the r_i are rational numbers and n is an arbitrary natural number. E_0 is countable. We would like to show that E_0 is everywhere dense in l_p. Let us take an element x = {ξ_i} ∈ l_p and let an arbitrary ε > 0 be given. We find a natural number n_0 such that for n > n_0,

Σ_{k=n+1}^{∞} |ξ_k|^p < ε^p/2.

Next, take an element x_0 = (r_1, r_2, . . . , r_n, 0, 0, . . .) such that

Σ_{k=1}^{n} |ξ_k − r_k|^p < ε^p/2.

Then

[ρ(x, x_0)]^p = Σ_{k=1}^{n} |ξ_k − r_k|^p + Σ_{k=n+1}^{∞} |ξ_k|^p < ε^p/2 + ε^p/2 = ε^p,

whence ρ(x, x_0) < ε.

The space s is separable.

The space m of bounded numerical sequences is inseparable.


Problems
1. Which of the spaces ℝⁿ, ℂ, l_∞ are separable?
2. Using the separability property of C([a, b]), show that Lp ([a, b]), a < b
(the space of pth power integrable functions) is separable.

1.5 Topological Spaces


The predominant feature of a metric space is its metric or distance. We
have defined open sets and closed sets in a metric space in terms of a metric
or distance. We have proved certain results for open sets (see results (i)
and (ii) stated after theorem 1.4.10) in a metric space. The assertions of
the above results are taken as axioms in a topological space and are used to
define an open set. No metric is used in a topological space. Thus a metric
space is a topological space but a topological space is not always a metric
space.

Open sets play a crucial role in a topological space.


Many important concepts such as limit points, continuity and
compactness (to be discussed in later sections) can be characterised in terms
of open sets. It will be shown in this chapter that under a continuous mapping the inverse image of an open set is again an open set.
We think of deformation as stretching and bending without tearing.
This last condition implies that points that are neighbours in one
configuration are neighbours in another configuration, a fact that we
should recognize as a description of continuity of mapping. The notion
of ‘stretching and bending’ can be mathematically expressed in terms of
functions. The notion of ‘without tearing’ can be expressed in terms of
continuity. Let us transform a figure A into a figure A′ subject to the following conditions:
(1) To each distinct point p of A there corresponds one point p′ of A′, and vice versa.
(2) If we take any two points p, q of A and move p so that the distance between it and q approaches zero, then the distance between the corresponding points p′ and q′ of A′ will also approach zero, and vice versa.
If we take a circle made out of a rubber sheet and deform it subject to the
above two conditions, then we get an ellipse, a triangle or a square but not
a figure eight, a horseshoe or a single point.
These types of transformations are called topological transformations
and are different from the transformations of elementary geometry or of
projective geometry. A topological property is therefore a property that
remains invariant under such a transformation or in particular deformation.
In a more sophisticated fashion one may say that a topological property of
a topological space X is a property that is possessed by another topological
space Y homeomorphic to X (homeomorphism will be explained later in this
chapter). In this section we mention some elementary ideas of a topological space. The notions of neighbourhood, limit point and interior of a set, amongst others, will be discussed in this section.

1.5.1 Topological space, topology


Let X be a non-empty set. A class 𝒯 of subsets of X is called a topology on X if it satisfies the following conditions:
(i) Φ, X ∈ 𝒯.
(ii) The union of every class of sets in 𝒯 is a set in 𝒯.
(iii) The intersection of finitely many members of 𝒯 is a member of 𝒯.
Accordingly, one defines a topological space (X, 𝒯) as a set X together with a class 𝒯 of subsets of X such that 𝒯 satisfies the axioms (i) to (iii). The members of 𝒯 are called open sets.

1.5.2 Example
Let X = {α1, α2, α3}. Consider τ1 = {φ, X, {α1}, {α1, α2}}, τ2 = {φ, X, {α1}, {α2}}, τ3 = {φ, X, {α1}, {α2}, {α3}, {α1, α2}, {α1, α3}, {α2, α3}} and τ4 = {φ, X}.
Here, τ1, τ3 and τ4 are topologies, but τ2 is not a topology, because {α1} ∪ {α2} = {α1, α2} ∉ τ2.
1.5.3 Definition: indiscrete topology, discrete topology
An indiscrete topology denoted by ‘J’, has only two members, Φ and
X. The topological space (X, J) is called an indiscrete topological space.
Another trivial topology for a non-empty set X is the discrete topology
denoted by ‘D’. The discrete topology for X consists of all subsets of X.
A topological space (X, D) is called a discrete topological space.
1.5.4 Example
Let X = R. Consider the topology ‘S’ where Φ ∈ S. If G ⊆ R
and G = Φ, then G ∈ S if for each p ∈ G there is a set H of the form
{x ∈ R : a ≤ x < b}, a < b, such that p ∈ H and H ⊆ G. The set
H = {x ∈ R : a ≤ x < b} is called a right-half open interval. Thus a
nonempty set G is S-open if for each p ∈ G, there is a right-half open
interval H such that p ∈ H ⊆ G. The topology S defined above is called the lower limit topology.
1.5.5 Definition: usual topology, upper limit topology, lower
limit topology
Let X = ℝ (the set of reals). Let us consider the topology τ = {Φ, ℝ, the intervals ]a, b[ with a < b, and all unions of open intervals} on X = ℝ. This type of topology is called the usual topology.
Let X = ℝ and τ = {Φ, ℝ, the intervals ]a, b] with a < b, and all unions of left-open right-closed intervals}. This type of topology is called the upper limit topology.
Let X = ℝ and τ = {Φ, ℝ, the intervals [a, b[ with a < b, and all unions of left-closed right-open intervals}. This type of topology is called the lower limit topology.
1.5.6 Examples
(i) (Finite Complement Topology)
Let us consider an infinite set X and let τ = {Φ, X} ∪ {A ⊂ X : A^C is a finite subset of X}. Then we see that τ is a topology, and we call it the finite complement topology.
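One may verify the axioms directly (a short check). If A and B are non-empty members of τ, then (A ∩ B)^C = A^C ∪ B^C is finite, so finite intersections remain in τ; and if {Aα} is any family of members of τ with some Aα0 ≠ Φ, then (∪α Aα)^C ⊆ A^C_{α0} is finite, so arbitrary unions remain in τ.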
(ii) (Countable complement topology)
Let X be a non-enumerable set and τ = {Φ, X} ∪ {A ⊂ X : A^C is a countable subset of X}. Then τ is a topology and will be known as the countable complement topology.
(iii) In the usual topology in the real line, a single point set is closed.

1.5.7 Definition: T1 -space, closure of a set


A topological space is called a T1 -space if each set consisting of a single
point is closed.
1.5.8 Theorem
Let X be a topological space. Then (i) any intersection of closed sets in
X is a closed set and (ii) finite union of closed sets in X is closed.

Closure of a set
If A is a subset of a topological space, then the closure of A (denoted by Ā) is the intersection of all closed supersets of A. It is easy to see that Ā is a closed superset of A that is contained in every closed superset of A, and that A is closed ⇔ A = Ā.
A subset A of a topological space X is said to be dense or everywhere dense if Ā = X.
1.5.9 Definition: neighbourhood
Let (X, τ) be a topological space and x ∈ X be an arbitrary point. A subset Nx ⊆ X containing the point x ∈ X is said to be a neighbourhood of x if ∃ a τ-open set Gx such that x ∈ Gx ⊆ Nx.
Clearly, every open set Gx containing x is a neighbourhood of x.
1.5.10 Examples
(i) ]x − ε1, x + ε2[ is an open neighbourhood in (ℝ, U).
(ii) ]x, x + ε] is an open neighbourhood in (ℝ, U_U), where U_U is the upper limit topology defined above.
(iii) [x − ε, x[ is an open neighbourhood in (ℝ, U_L), where U_L is the lower limit topology defined above.
(iv) Let (X, τ) be a topological space and N1, N2 be any two neighbourhoods of the element x ∈ X. Then N1 ∪ N2 is also a neighbourhood of x.

Note 1.5.1. A τ-neighbourhood of a point need not be a τ-open set, but a τ-open set is a τ-neighbourhood of each of its points.
1.5.11 Theorem
A necessary and sufficient condition that a set G in a topological space (X, τ) be open is that G contains a neighbourhood of each x ∈ G.
Problems
1. Show that the indiscrete topology J satisfies all the conditions of 1.5.1.
2. Show that the discrete topology D satisfies all the conditions of 1.5.1.

3. If D represents the discrete topology for X, show that every subset


of X is both open and closed.
4. If {Ii, i = 1, 2, . . . , n} is a finite collection of open intervals such that ∩{Ii, i = 1, 2, . . . , n} ≠ Φ, show that ∩{Ii, i = 1, 2, . . . , n} is an open interval.
5. Show that any finite set of real numbers is closed in the usual topology for ℝ.
6. Which of the following subsets of ℝ are U-neighbourhoods of 2?
(i) ]1, 3[ (ii) [1, 3] (iii) ]1, 3] (iv) [1, 3[ (v) [2, 3[.
1.5.12 Bases for a topology
In 1.5.1 the topologies U and S for 4
and T for J were introduced.
These neighbourhoods for each point were specified and then a set was
declared to be a member of the topology if and only if the set contains a
neighbourhood of each of its points. This is an extremely useful way of
defining a topology. It should be clear, however, that neighbourhoods must
have certain properties. In what follows we present a characterization of
the neighbourhoods in a topological space.
1.5.13 Theorem
Let (X, τ) be a topological space, and for each p ∈ X let u_p be the family of τ-neighbourhoods of p. Then:
(i) If U ∈ u_p, then p ∈ U.
(ii) If U ∈ u_p and V ∈ u_p then, by the definition of neighbourhood, there are τ-open sets G1 and G2 such that p ∈ G1 ⊆ U and p ∈ G2 ⊆ V. Now, p ∈ G1 ∩ G2, where G1 ∩ G2 is a τ-open set. Since p ∈ G1 ∩ G2 ⊆ U ∩ V, it follows that U ∩ V is a τ-neighbourhood of p. Hence U ∩ V ∈ u_p.
(iii) If U ∈ u_p and U ⊆ V, then ∃ an open set G such that p ∈ G ⊆ U. Therefore p ∈ G ⊆ U ⊆ V, and V is a τ-neighbourhood of p. Hence V ∈ u_p.
(iv) If U ∈ u_p, then there is a τ-open set V such that p ∈ V ⊆ U. Since V is a τ-open set, V is a neighbourhood of each of its points. Therefore V ∈ u_q for each q ∈ V.
1.5.14 Theorem
Let X be a non-empty set and for each p ∈ X, let Bp be a non-empty
collection of subsets of X such that
(i) If B ∈ Bp , then p ∈ B.
(ii) If B ∈ Bp , and C ∈ Bp , then B ∩ C ∈ Bp .
If τ consists of the empty set together with all non-empty subsets G of X having the property that p ∈ G implies that there is a B ∈ Bp such that B ⊆ G, then τ is a topology for X.

1.5.15 Definition: base at a point, base of a topology


Base at a point
Let (X, τ) be a topological space and, for each x ∈ X, let Bx be a non-empty collection of τ-neighbourhoods of x. We shall say that Bx is a base for the τ-neighbourhood system of x if for each τ-neighbourhood Nx of x there is a B ∈ Bx such that x ∈ B ⊆ Nx.
If Bx is a base for the τ-neighbourhood system of x, then the members of Bx will be called basic τ-neighbourhoods of x.

Base for a topology

Let (X, τ) be a topological space and B be a non-empty collection of subsets of X such that
(i) B ⊆ τ;
(ii) ∀ x ∈ X and ∀ neighbourhoods Nx of x, ∃ Bx ∈ B such that x ∈ Bx ⊆ Nx.
Then B is called a base of the topology, and the sets belonging to B are called basic open sets.
1.5.16 Examples
(i) Consider the usual topology U for the set of real numbers ℝ. The set of all open intervals of lengths 2/n (n = 1, 2, . . .) is a base for (ℝ, U). The open intervals {x : |x − x0| < 1/n} (n = 1, 2, . . .) form a base at x0 for (ℝ, U).
(ii) In the case of a point in a metric space, an open ball centred at the point is a neighbourhood of the point, and the class of all such open balls is a base at the point. In the theorem below we give a characterization of 'openness' of a set in terms of members of a base B for a topological space (X, τ).
1.5.17 Theorem
Let (X, τ) be a topological space and B a base of the topology. Then a necessary and sufficient condition that a set G ⊆ X be open is that G can be expressed as a union of members of B.
1.5.18 Definition: first countable, second countable
A topological space (X, τ) that has a countable local base at each x ∈ X is called first countable. A topological space (X, τ) is said to be second countable if ∃ a countable base for the topology.
1.5.19 Lindelöf ’s theorem
Let X be a second countable space. If a non-empty open set G in X
is represented as the union of a class {Gi } of open sets, then G can be
represented as a countable union of Gi ’s.

1.5.20 Example
(ℝ, U) is first countable; it is also second countable, since the open intervals with rational end points form a countable base. (ℝ, U_L) is first countable but not second countable. It is to be noted that a second countable space is also first countable.
Problems
1. For each p ∈ ℝ, find a collection Bp such that Bp is a base for the D-neighbourhood system of p.
2. Let X = {a, b, c} and let τ = {Φ, X, {a}, {b, c}}. Show that τ is a topology for X.
3. For each p ∈ X, find a collection Bp of basic τ-neighbourhoods of p.
4. Prove that open rectangles in the Euclidean plane form an open base.

1.5.21 Limit points, closure and interior


We have characterized τ-open sets of a topological space (X, τ) in terms of τ-neighbourhoods. We now introduce and examine another concept that is conveniently described in terms of neighbourhoods.

Definition: limit point, contact point, isolated point, derived set, closure
Let (X, τ) be a topological space and let A be a subset of X. The point x ∈ X is said to be a limit point of A if every τ-neighbourhood of x contains at least one point of A other than x. That is, x is a limit point of A if and only if every τ-neighbourhood Nx of x satisfies the condition Nx ∩ (A − {x}) ≠ Φ.
If every neighbourhood Nx of x satisfies Nx ∩ A ≠ Φ, then x is called a contact point of A. D(A) = {x : x is a limit point of A} is called the derived set of A.
A ∪ D(A) is called the closure of A, denoted by Ā.
Problems
Let X = ℝ and A = ]0, 1[. Then find D(A) for the following cases:
(i) τ = U, the usual topology on ℝ;
(ii) τ = U_L, the lower limit topology on ℝ;
(iii) τ = U_U, the upper limit topology on ℝ.
1.5.22 Theorem
Let (X, τ) be a topological space. Let A be a subset of X and D(A) the set of all limit points of A. Then A ∪ D(A) is τ-closed.
1.5.23 Definition: τ-closure
Let (X, τ) be a topological space and A be a subset of X. The τ-closure of A, denoted by Ā, is the smallest τ-closed subset of X that contains A.

1.5.24 Theorem
Let (X, ) be a topological space and let A be a subset of X. Then
A = A ∪ D(A) where D(A) is the set of limit points of A. It follows from
the previous theorem that A ∪ D(A) is a closed set.
1.5.25 Definition: interior, exterior, boundary
Let (X, τ) be a topological space and let A be a subset of X. A point x is a τ-interior point of A if A is a τ-neighbourhood of x. The τ-interior of A, denoted by Int A, is the set of all interior points of A.
x ∈ X is said to be an exterior point of A if x is an interior point of A^C = X ∼ A.
A point x in X is called a boundary point of A if each τ-neighbourhood of x contains points both of A and of A^C. The τ-boundary of A is the set of all boundary points of A.
1.5.26 Example
Consider the space (ℝ, U). The point 1/2 is an interior point of [0, 1], but neither 0 nor 1 is an interior point of [0, 1]. The U-interior of [0, 1] is ]0, 1[. In the U_L-topology for ℝ, 0 is an interior point of [0, 1] but 1 is not. The U_L-interior of [0, 1] is [0, 1[.
1.5.27 Definition: separable space
Let (X, τ) be a topological space. If ∃ a denumerable (enumerable) subset A of X such that Ā = X, then X is called a separable space. Or, in other words, a topological space is said to be separable if it contains a denumerable everywhere dense subset.
1.5.28 Example
4
Let X = and A be the set of all intervals. Now let D(A) be the derived
4
set of A. It is clear that D(A) = . Hence A = A ∪ D(A) = A ∪ = X. 4
4
Hence A is everywhere dense in X = . Similarly, the set 3
of rational
4 4
points, which is also a subset of , is everywhere dense in . Again, since
3 is countable the topological space 4
is separable.

1.6 Continuity, Compactness


1.6.1 Definition: 'ε − δ' continuity
Let D ⊂ ℝ (or ℂ), f : D → ℝ (or ℂ) and a ∈ D. The function f is said to be continuous at a if lim_{x→a} f(x) = f(a). In other words, given ε > 0, ∃ a δ = δ(ε) such that |x − a| < δ ⇒ |f(x) − f(a)| < ε.

Note 1.6.1. f is said to be continuous in D if f is continuous at every


a ∈ D.

1.6.2 Definition: continuity in a metric space

Given two metric spaces (X, ρX ) and (Y, ρY ), let f : D ⊂ X → Y be a


mapping. f is said to be continuous at a ∈ D if for each ε > 0 there exists a δ > 0 such that, for all x ∈ D, ρX(a, x) < δ ⇒ ρY(f(a), f(x)) < ε.

1.6.3 Definition: continuity on topological spaces

Let (X, F) and (Y, V) be topological spaces and let f be a mapping of


X into Y . The mapping f is said to be continuous (or F -V continuous)
if f −1 (G) is F -open whenever G is V -open. That is, the mapping f is
continuous if and only if the inverse image under f of every V -open set is
an F -open set.

1.6.4 Theorem (characterization of continuity)

Let (X, ρX ) and (Y, ρy ) be metric spaces and let f : X → Y be a


mapping. Then the following statements are equivalent:
(i) f is continuous on X.
(ii) For each x ∈ X, f (xn ) → f (x) for every sequence {xn } ⊂ X with
xn → x.
(iii) f −1 (G) is open in X whenever G is open in Y .
(iv) f −1 (F ) is closed in X whenever F is closed in Y .
(v) f (A) ⊂ f (A) ∀ A ⊂ X
(vi) f −1 (B) ⊂ f −1 (B), ∀ B ⊂ Y .

1.6.5 Definition: homeomorphism

Let (X, ρX ) and (Y, ρY ) be metric spaces. A mapping f : X → Y is


said to be a homeomorphism if
(i) f is bijective
(ii) f is continuous
(iii) f −1 is continuous
If a homeomorphism from X to Y exists, we say that the spaces X and
Y are homeomorphic.

1.6.6 Theorem

Let I1 and I2 be any two open intervals. Then (I1 , U1 ) and (I2 , U2 ) are
homeomorphic, and a homeomorphism exists between the spaces (I1 , UI1 )
and (I2 , UI2 ).

Fig. 1.5 (the graph of a homeomorphism h between the intervals I_{u1} and I_{u2}, drawn in the (x1, x2)-plane)

1.6.7 Definition: covering, subcovering, τ -open covering


A collection C = {Sα : α ∈ λ} of subsets of a set X is said to be a
covering of X if ∪{Sα : α ∈ λ} = X. If C1 is a covering of X, and C2 is
a covering of X such that C2 ⊆ C1 , then C2 is called a subcovering of C1 .
Let (X, ζ) be a topological space. A covering C of X is said to be a ζ-open
covering of X if every member of C is a ζ-open set. A covering C is said to
be finite if C has only a finite number of members.
1.6.8 Definition: a compact topological space
A topological space (X, ζ) is said to be compact if every ζ-open covering
of X has a finite subcovering.

Note 1.6.2. The outcome of the Heine-Borel theorem on the real line is
taken as a definition of compactness in a topological space. The Heine-
Borel theorem reads as follows: If X is a closed and bounded subset of the
real line ℝ, then any class of open subsets of ℝ, the union of which contains
X, has a finite subclass whose union also contains X.
1.6.9 Theorem
4
The space ( , U ) is not compact. Therefore, no open interval is compact
w.r.t. the U -topology.
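For instance, the class {]−n, n[ : n = 1, 2, . . .} is a U-open covering of ℝ, but no finite subclass of it covers ℝ, since a finite subclass covers only a bounded interval. Hence (ℝ, U) is not compact.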
1.6.10 Definition: finite intersection property
Let (X, ζ) be a topological space and {Fλ |λ ∈ Λ} be a class of subsets
such that the intersection of any finite number of elements of {Fλ | λ ∈ Λ} is non-void, i.e., ∩_{k=1}^{n} F_{λk} ≠ Φ in whatever manner the finite set of indices {λk | k = 1, 2, . . . , n} is chosen from Λ. Then F = {Fλ | λ ∈ Λ} is said to have the Finite Intersection Property (FIP).
1.6.11 Theorem
The topological space (X, ζ) is compact only if every class of closed subsets {Fλ | λ ∈ Λ} possessing the finite intersection property has non-void intersection. In other words, if X is compact, all Fλ's are closed and ∩_{k=1}^{n} F_{λk} ≠ Φ for every finite subcollection {λ1, λ2, . . . , λn}, then ∩_{λ∈Λ} Fλ ≠ Φ. The converse result is also true, i.e., if every class of closed subsets of (X, ζ) having the FIP has non-void intersection, then (X, ζ) is compact.
1.6.12 Theorem
A continuous image of a compact space is compact.
1.6.13 Definition: compactness in metric spaces
A metric space being a topological space under the metric topology,
the definition of compactness as given in definition 1.6.8 is valid in metric
spaces. However, the concept of compactness in a metric space can also be
introduced in terms of sequences and can be related to completeness.
1.6.14 Definition: a compact metric space
A metric space (X, ρ) is said to be compact if every infinite subset of X
has at least one limit point.

Remark 1.6.1. A set K ⊆ X is then compact if the space (K, ρ) is


compact. K is compact if and only if every sequence with values in K has
a subsequence which converges to a point in K.
1.6.15 Definition: relatively compact
If X is a metric space and K is a subset of X such that its closure K is
compact, then K is said to be relatively compact.

Lemma 1.6.1. A compact subset of a metric space is closed and bounded.


The converse of the lemma is in general false.
1.6.16 Example
Consider the sequence {en} in l2, where e1 = (1, 0, 0, . . .), e2 = (0, 1, 0, . . .), e3 = (0, 0, 1, . . .), . . .. This sequence is bounded, since ρ(θ, en) = 1, where ρ stands for the metric in l2. Its terms constitute a point set that is closed, because the set has no limit point (ρ(em, en) = √2 for m ≠ n). Hence this set is closed and bounded, but it is not compact.

Lemma 1.6.2. Every compact metric space is complete.


1.6.17 Definition: sequentially compact, totally bounded
(i) A metric space X is said to be sequentially compact if every sequence
in X has a convergent subsequence.
(ii) A metric space X is said to be totally bounded if for every ε > 0, X contains a finite set of points, called an ε-net, such that the finite collection of open balls of radius ε with centres in the ε-net covers X.
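For example, the interval [0, 1] with the usual metric is totally bounded: for any ε > 0, the finite set {0, ε, 2ε, . . .} ∩ [0, 1] together with the point 1 is an ε-net, since every point of [0, 1] lies within ε of one of these points.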

Lemma 1.6.3. X is totally bounded if and only if every sequence in X


has a Cauchy subsequence.
1.6.18 Theorem
In a metric space (X, ρ), the following statements are equivalent:
(i) X is compact;
(ii) X is sequentially compact;
(iii) X is complete and totally bounded.
1.6.19 Corollary


For 1 ≤ p < ∞ and x, y ∈ ℝⁿ (ℂⁿ), consider ρp(x, y) = ( ∑_{j=1}^{n} |xj − yj|^p )^{1/p}.
(a) (Heine-Borel) A subset of ℝⁿ (ℂⁿ) is compact if and only if it is closed and bounded.
(b) (Bolzano-Weierstrass) Every bounded sequence in ℝⁿ (ℂⁿ) has a convergent subsequence.

Proof: Since a bounded subset of ℝⁿ (ℂⁿ) is totally bounded, part (a) follows from theorem 1.6.18. Since the closure of a bounded set in ℝⁿ (ℂⁿ) is complete and totally bounded, part (b) follows from theorem 1.6.18.
1.6.20 Theorem
In a metric space (X, ρ) the following statements are true:
(i) X is totally bounded ⇒ X is separable;
(ii) X is compact ⇒ X is separable;
(iii) X is separable ⇒ X is second countable.
1.6.21 Theorem
If f is a real-valued continuous function defined on a metric space (X, ρ),
then for any compact set A ⊆ X the values Sup[f (x), x ∈ A], Inf [f (x), x ∈
A] are finite and are attained by f at some points of A.

Remark 1.6.2. If a continuous function f(x) is defined on some set M that is not compact, then Sup_{x∈M} f(x) and Inf_{x∈M} f(x) need not be attained. For example, consider the set M of all continuous functions x(t) such that x(0) = 0, x(1) = 1 and |x(t)| ≤ 1. The functional f(x) = ∫_0^1 x²(t) dt, though continuous on M, does not attain its g.l.b. on M. For if x(t) = tⁿ, then f(x) = ∫_0^1 t^{2n} dt = 1/(2n + 1) → 0 as n → ∞. Hence Inf[f(x)] = 0. But the form of f(x) indicates that f(x) > 0 for every continuous curve x = x(t) that joins the points (0, 0) and (1, 1). The fallacy is that the set of curves considered is not compact, even though the set is closed and bounded in C([0, 1]).
1.6.22 Definition: uniformly bounded, equicontinuous
Let (X, ρ) be a compact metric space. The space of continuous real-valued functions on X, with the metric ρ(f, g) = max{|f(x) − g(x)| : x ∈ X}, is a complete metric space, which we denote by C(X). A collection F of functions on a set X is said to be uniformly bounded if there is an M > 0 such that |f(x)| ≤ M ∀ x ∈ X and all f ∈ F. For subsets of C(X), uniform boundedness agrees with boundedness in the metric space, i.e., a set F is uniformly bounded if and only if it is contained in a ball.
A collection F of functions defined on a metric space X is called equicontinuous if for each ε > 0 there is a δ > 0 such that ρ(x, x′) < δ ⇒ |f(x) − f(x′)| < ε for all x, x′ ∈ X and for all f ∈ F. It may be noted that the functions belonging to an equicontinuous collection are uniformly continuous.
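For example, the collection F = {fn(x) = xⁿ : n = 1, 2, . . .} on X = [0, 1] is uniformly bounded (with M = 1) but not equicontinuous: for points x, x′ close to 1 the difference |fn(x) − fn(x′)| can be made close to 1 for large n, no matter how small |x − x′| is. By the theorem below, F is therefore not relatively compact in C([0, 1]).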
1.6.23 Theorem (Arzela-Ascoli)
If (X, ρ) is a compact metric space, a subset K ⊆ C(X) is relatively
compact if and only if it is uniformly bounded and equicontinuous.
1.6.24 Theorem
If f is continuous on an open set D, then for every (x0, y0) ∈ D the differential equation dy/dx = f(x, y) has a local solution passing through (x0, y0).
Problems
1. Show that if (X, ζ) is a topological space such that X has only a finite
number of points, then (X, ζ) is compact.
2. Which of the following subspaces of (R, U ) are compact: (i) J, (ii)
[0, 1], (iii) [0, 1] ∪ [2, 3], (iv) the set of all rational numbers or (v) [2, 3[.
3. Show that a continuous real or complex function defined on a compact
space is bounded. More generally, show that a continuous real or
complex function mapping compact space into any metric space is
bounded.
4. Show that if D is an open connected set and a differential equation
dy/dx = f(x, y) is such that its solutions form a simple covering of D,
then f is the limit of a sequence of continuous functions.
5. Prove the Heine-Borel theorem: A subspace (Y, UY ) of (R, U ) is
compact if and only if Y is bounded and closed.
6. Show that the subset of l2 consisting of points {xn} such that |xn| ≤ 1/n, n = 1, 2, . . ., is compact.

7. Show that the unit ball in C([0, 1]) of points [x : max |x(t)| ≤ 1, t ∈
[0, 1]] is not compact.
8. Show that X is compact if and only if for any collection F of closed sets with the property that ∩_{i=1}^{n} Fi ≠ Φ for every finite subcollection of F, it follows that ∩{Fλ : Fλ ∈ F} ≠ Φ.
CHAPTER 2

NORMED LINEAR
SPACES

If a linear space is simultaneously a metric space, it is called a metric linear


space. The normed linear spaces form an important class of metric linear
spaces. Furthermore, in each space there is defined a notion of the distance
from an arbitrary element to the null element or origin, that is, the notion
of the size of an arbitrary element. This gives rise to the concept of the
norm of an element x or ||x|| and finally to that of the normed linear space.

2.1 Definitions and Elementary Properties

2.1.1 Definition
Let E be a linear space over ℝ (or ℂ).
To every element x of the linear space E, let there be assigned a unique real number, called the norm of this element and denoted by ||x||, satisfying the following properties (axioms of a normed linear space):

(a) ||x|| ≥ 0 and ||x|| = 0 ⇔ x = 0;

(b) ||x + y|| ≤ ||x|| + ||y|| (triangle inequality), ∀ x, y ∈ E;

(c) ||λx|| = |λ| ||x|| (homogeneity of the norm), ∀ λ ∈ ℝ (or ℂ) and ∀ x ∈ E.


The normed linear space E is also written as (E, || · ||).
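As a quick numerical illustration (a Python sketch, not part of the text; the random sample vectors and tolerances are our own choices), one may check axioms (a)-(c) for the Euclidean norm on ℝ³:

import numpy as np

# Check the norm axioms (a)-(c) of 2.1.1 for the Euclidean norm on R^3
# using randomly sampled vectors and scalars.
def euclidean_norm(x):
    return np.sqrt(np.sum(np.abs(x) ** 2))

rng = np.random.default_rng(0)
for _ in range(1000):
    x, y = rng.normal(size=3), rng.normal(size=3)
    lam = rng.normal()
    assert euclidean_norm(x) >= 0                                                   # (a) non-negativity
    assert euclidean_norm(x + y) <= euclidean_norm(x) + euclidean_norm(y) + 1e-12   # (b) triangle inequality
    assert abs(euclidean_norm(lam * x) - abs(lam) * euclidean_norm(x)) < 1e-12      # (c) homogeneity
print("norm axioms verified on the sampled vectors")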


Remark 2.1:

1. Properties (b) and (c) imply that ||x|| ≥ 0 for every x:

0 = ||x − x|| ≤ ||x|| + ||−x|| = ||x|| + |−1| ||x|| = 2||x||,

which yields ||x|| ≥ 0. Thus the non-negativity in (a) is automatic; only ||x|| = 0 ⇔ x = 0 needs to be assumed.


2. If we regard x in a normed linear space E as a vector, its length is ||x||, and the length ||x − y|| of the vector x − y is the distance between the end points of the vectors x and y. Thus, in view of (a) and definition 2.1.1, all vectors in a normed linear space E have positive lengths except the zero vector. The property (b) states that the length of one side of a triangle can never exceed the sum of the lengths of the other two sides (Fig. 2.1(a)).

Fig. 2.1(a), (b) (the vectors x, y and x + y, with lengths ||x||, ||y|| and ||x + y||, illustrating the triangle inequality)

3. In a normed linear space a metric (distance) can be introduced by


ρ(x, y) = ||x − y||. It is clear that this distance satisfies all the metric
axioms.

2.1.2 Examples
1. The n-dimensional Euclidean space ℝⁿ and the unitary space ℂⁿ are normed linear spaces.
Define the sum of elements and the product of an element by a scalar as in Sec. 1.4.2, Ex. 2. The norm of x = {ξi} ∈ ℝⁿ is defined by
    ||x|| = ( ∑_{i=1}^{n} |ξi|² )^{1/2}.
This norm satisfies all the axioms of 2.1.1. Hence ℝⁿ and ℂⁿ are normed linear spaces.
2. Space C([0, 1]).
We define the addition of functions and multiplication by a scalar
in the usual way.
We set ||x|| = max_{t∈[0,1]} |x(t)|. It is clear that the axioms of a normed linear space are satisfied.

3. Space lp: We define the addition of elements and multiplication of elements by a scalar as indicated earlier.
We set, for x = {ξi} ∈ lp,
    ||x|| = ( ∑_{i=1}^{∞} |ξi|^p )^{1/p}.
The axioms (a) and (c) of 2.1.1 are clearly satisfied.
If y = {ηi} ∈ lp, it can be shown, by making an appeal to Minkowski's inequality ([1.4.5]), that
    ( ∑_{i=1}^{∞} |ξi + ηi|^p )^{1/p} ≤ ( ∑_{i=1}^{∞} |ξi|^p )^{1/p} + ( ∑_{i=1}^{∞} |ηi|^p )^{1/p} < ∞,
where x = {ξi} and y = {ηi} are pth power summable sequences; this gives axiom (b).
Thus lp is a normed linear space.
4. Space Lp([0, 1]), p > 1
We set ||x|| = ( ∫_0^1 |x(t)|^p dt )^{1/p} for x(t) ∈ Lp([0, 1]).
If y(t) ∈ Lp([0, 1]), then ( ∫_0^1 |y(t)|^p dt )^{1/p} < ∞, where x(t) and y(t) are Riemann integrable. Then, by Minkowski's inequality for integrals, we obtain the triangle inequality (b) of 2.1.1.
5. Space l∞: Let x = {ξi} be such that |ξi| ≤ C for all i, where C is a real number (depending only on x). Setting ||x|| = sup_i |ξi|, we see that all the norm axioms are fulfilled. Hence l∞ is a normed linear space.
6. Space C^k([0, 1]): Consider the space of functions x(t) defined and continuous on [0, 1] and having continuous derivatives up to order k, i.e., x(t) ∈ C^k([0, 1]). The norm on this function space is defined by
    ||x|| = max{ max_t |x(t)|, max_t |x′(t)|, . . . , max_t |x^(k)(t)| }.
Then ||x + y|| = max{ max_t |x(t) + y(t)|, max_t |x′(t) + y′(t)|, . . . , max_t |x^(k)(t) + y^(k)(t)| } ≤ ||x|| + ||y||.
It is clear that all the axioms of 2.1.1 are fulfilled. Therefore C^k([0, 1]) is a normed linear space.
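As a small worked example: for x(t) = t², regarded as an element of C¹([0, 1]), we have max_t |x(t)| = 1 and max_t |x′(t)| = max_t |2t| = 2, so ||x|| = max{1, 2} = 2 in the norm just defined.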

2.1.3 Induced metric, convergence in norm, Banach space


Induced metric: In a normed linear space a metric (distance) can be
introduced by, ρ(x, y) = ||x − y||. It may be seen that the metric defined
above satisfies all the axioms of a metric space. Thus the normed linear
space (E, || · ||) can be treated as a metric space if we define the metric in
the above manner.
Convergence in norm: After introducing the metric, we define the
convergence of a sequence of elements {xn } to x namely, x = lim xn or
xn → x if ||xn − x|| → 0 as n → ∞. Such a convergence in a normed linear
space is called convergence in norm.
Banach Space: A real (complex) Banach space is a real (complex) normed
linear space that is complete in the sense of convergence in norm.
2.1.4 Examples
1. The n-dimensional Euclidean space ℝⁿ is a Banach space
Defining the sum of elements and the product of elements by a scalar (real or complex) in the usual manner, the norm is defined as
    ||x|| = ( ∑_{i=1}^{n} ξi² )^{1/2},
where x = {ξi}. By example 1 of 2.1.2 we see that ℝⁿ is a normed linear space. If {xm}, with xm = {ξi^(m)}, is a Cauchy sequence in ℝⁿ, then each coordinate sequence {ξi^(m)} is a Cauchy sequence in ℝ and hence ξi^(m) → ξi as m → ∞, ∀ i. Then
    ( ∑_{i=1}^{n} |ξi^(m) − ξi|² )^{1/2} → 0 as m → ∞,
or in other words ||xm − x|| → 0 as m → ∞, where x = {ξi}. Since x ∈ ℝⁿ, it follows that ℝⁿ is a complete normed linear space, i.e., ℝⁿ is a Banach space.

2. lp is a Banach space (1 ≤ p < ∞)

Defining the sum of elements and the product of elements by a scalar as in 1.3, and taking the norm as
    ||x|| = ( ∑_{i=1}^{∞} |ξi|^p )^{1/p},
where x = {ξi}, we proceed as follows.

Let {xn}, with xn = {ξi^(n)} ∈ lp, be a Cauchy sequence. For each i the coordinate sequence {ξi^(n)} is a Cauchy sequence in ℝ (ℂ), so ξi^(n) → ξi as n → ∞; one then verifies that ||xn − x|| → 0 as n → ∞, where x = {ξi}. Now, by Minkowski's inequality [1.4.5], we have
    ( ∑_{i=1}^{∞} |ξi|^p )^{1/p} ≤ ( ∑_{i=1}^{∞} |ξi^(n)|^p )^{1/p} + ( ∑_{i=1}^{∞} |ξi − ξi^(n)|^p )^{1/p},

or, ||x|| ≤ ||xn|| + ||x − xn|| < ∞ for n sufficiently large.

Hence x ∈ lp, 1 ≤ p < ∞, and the space is a Banach space.

3. C([0, 1]) is a Banach space

C([0, 1]) is called the function space. Consider the linear space of all scalar valued (real or complex) continuous functions defined on [0, 1].

Let {xn(t)} ⊂ C([0, 1]) and let {xn(t)} be a Cauchy sequence in C([0, 1]). C([0, 1]) being a normed linear space [see 2.1], given ε > 0 there is an N such that

||xm − xn|| = max_t |xm(t) − xn(t)| < ε   ∀ m, n ≥ N.   (2.1)

Therefore, for any fixed t = t0 ∈ [0, 1], we get

|xm(t0) − xn(t0)| < ε   ∀ m, n ≥ N.

This shows that {xm(t0)} is a Cauchy sequence in ℝ (ℂ). But ℝ (ℂ) being complete, the sequence converges to a limit in ℝ (ℂ). In this way, we can assign to each t ∈ [0, 1] a unique x(t) ∈ ℝ (ℂ). This defines a (pointwise) function x on [0, 1]. Now, we show that x ∈ C([0, 1]) and xm → x.

Letting n → ∞, we have from (2.1)

|xm(t) − x(t)| ≤ ε   ∀ m ≥ N and t ∈ [0, 1]   (2.2)

This shows that the sequence {xm } of continuous functions converges


uniformly to the function x on [0, 1], and hence the limit function x is
a continuous function on [0, 1]. As such, x ∈ C([0, 1]). Also from (2.2), we
have
max_t |xm(t) − x(t)| ≤ ε, ∀ m ≥ N
⇒ ||xm − x|| ≤ ε ∀ m ≥ N
⇒ xm → x in C([0, 1]).

Hence C[0, 1] is a Banach space.



4. l∞ is a Banach space

Let {xm} be a Cauchy sequence in l∞ and let xm = {ξi^(m)} ∈ l∞. Then, given ε > 0, there is an N such that

sup_i |ξi^(m) − ξi^(n)| ≤ ε,   ∀ m, n ≥ N.

This gives |ξi^(m) − ξi^(n)| ≤ ε ∀ m, n ≥ N (i = 1, 2, . . .).

This shows that for each i, {ξi^(m)} is a Cauchy sequence in ℝ (ℂ). Since ℝ (ℂ) is complete, {ξi^(m)} converges in ℝ (ℂ). Let ξi^(m) → ξi as m → ∞, and let x = (ξ1, ξ2, . . . , ξn, . . .).

Letting n → ∞ in the above inequality, we get
    |ξi^(m) − ξi| ≤ ε   ∀ m ≥ N (i = 1, 2, . . .)   (2.3)

Since xm ∈ l∞, there is a real number Mm such that |ξi^(m)| ≤ Mm, ∀ i.

Therefore, |ξi| ≤ |ξi^(m)| + |ξi − ξi^(m)| ≤ Mm + ε, ∀ m ≥ N, i = 1, 2, . . .

Since the bound on the right-hand side is independent of i, it follows that {ξi} is a bounded sequence of numbers and thus x ∈ l∞. Furthermore, it follows from (2.3) that
    ||xm − x||∞ ≤ ε   ∀ m ≥ N.

Hence xm → x ∈ l∞ and l∞ is a Banach Space.

5. Incomplete normed linear space

Let X be the set of continuous functions defined on a closed interval [a, b]. For x ∈ X let us take
    ||x|| = ∫_a^b |x(t)| dt.   (2.4)
The metric induced by (2.4) for x, y ∈ X is given by
    ρ(x, y) = ∫_a^b |x(t) − y(t)| dt.   (2.5)

In note 1.4.10, we have shown that the metric space (X, ρ) is not complete, i.e., there is a Cauchy sequence {xm} in (X, ρ) that does not converge to a point in X. Hence the normed linear space (X, || · ||) is not complete.

6. An incomplete normed linear space and its completion L2([a, b])

The linear space of all continuous real-valued functions on [a, b] forms a normed linear space X with norm defined by
    ||x|| = ( ∫_a^b (x(t))² dt )^{1/2}.   (2.6)

Let us consider {xm} as follows:
    xm(t) = 0              if t ∈ [0, 1/2],
    xm(t) = m(t − 1/2)     if t ∈ [1/2, am],
    xm(t) = 1              if t ∈ [am, 1],
where am = 1/2 + 1/m.
Let us take a = 0 and b = 1, and let n > m. Then
    ||xn − xm||² = ∫_0^1 [xn(t) − xm(t)]² dt
                 = ∫_{1/2}^{1/2 + 1/n} |xn(t) − xm(t)|² dt + ∫_{1/2 + 1/n}^{1/2 + 1/m} |xn(t) − xm(t)|² dt
                 ≤ area of the triangle ABC = (1/2)(1/m − 1/n) < 1/m   [see figure 2.1(c)],
so that {xm} is a Cauchy sequence in (X, || · ||).
The Cauchy sequence does not converge to a point in (X, || · ||). For every x ∈ X,
    ||xn − x||² = ∫_0^1 |xn(t) − x(t)|² dt
                = ∫_0^{1/2} |x(t)|² dt + ∫_{1/2}^{an} |xn(t) − x(t)|² dt + ∫_{an}^1 |1 − x(t)|² dt.
Since the integrands are non-negative, xn → x in the space (X, || · ||) implies that each integral on the right tends to zero, so that x(t) = 0 if t ∈ [0, 1/2[ and x(t) = 1 if t ∈ ]1/2, 1]. Since it is impossible for a continuous function to have this property, {xn} does not have a limit in X.

The space X can be completed by Theorem 1.4.5. The completion is denoted by L2([0, 1]). This is a Banach space. In fact, the norm on X and the operations of a linear space can be extended to the completion of X. This process can be seen in Theorem 2.1.10 in the next section. In general, for any p ≥ 1, the Banach space Lp([a, b]) is the completion of the normed linear space which consists of all continuous real-valued functions on [a, b] with the norm defined by
    ||x||p = ( ∫_a^b |x(t)|^p dt )^{1/p}.

With the help of the Lebesgue integral, the space Lp([0, 1]) can also be obtained directly, as the space of Lebesgue measurable functions x on [0, 1] such that the Lebesgue integral of |x|^p over [0, 1] exists and is finite. The elements of Lp([0, 1]) are equivalence classes of such functions, where x is equivalent to y if the Lebesgue integral of |x − y|^p over [0, 1] is zero. We discuss these (Lebesgue measures) in Chapter Ten. Until then the development will take place without the use of measure theory.

Fig. 2.1(c), (d) (sketches of the functions xm and xn on [0, 1] and the triangle ABC between their graphs over [1/2, am])

7. Space s

Every normed linear space can be treated as a metric space. However, a metric cannot always be obtained from a norm. Consider the space s of all numerical sequences with the metric ρ defined by
    ρ(x, y) = ∑_{i=1}^{∞} (1/2^i) · |ξi − ηi| / (1 + |ξi − ηi|),

where x = {ξi } and y = {ηi } belong to s. This metric cannot be obtained


from a norm. This is because this metric does not have the two properties
that a metric derived from a norm possesses. The following lemma identifies these two properties.

2.1.5 Lemma (translation invariance)

A metric ρ induced by a norm on a normed linear space E satisfies:

(a) ρ(x + a, y + a) = ρ(x, y) (b) ρ(αx, αy) = |α|ρ(x, y) for all


x, y, a ∈ E and every scalar α.

Proof: We have

ρ(x + a, y + a) = ||x + a − (y + a)|| = ||x − y|| = ρ(x, y)


ρ(αx, αy) = ||αx − αy|| = |α| ||x − y|| = |α|ρ(x, y).
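The failure of property (b) for the metric of the space s above can be illustrated numerically (a Python sketch; the truncation of the sum and the sample sequences are our own choices, not from the text):

def s_metric(x, y, terms=50):
    # Truncated form of rho(x, y) = sum_i 2^{-i} |xi - yi| / (1 + |xi - yi|)
    return sum((0.5 ** i) * abs(a - b) / (1 + abs(a - b))
               for i, (a, b) in enumerate(zip(x[:terms], y[:terms]), start=1))

x = [1.0] * 50
y = [0.0] * 50
alpha = 10.0
print(s_metric([alpha * t for t in x], [alpha * t for t in y]))  # rho(alpha x, alpha y), about 0.91
print(alpha * s_metric(x, y))                                    # |alpha| rho(x, y), about 5.0 -- different

Since the two printed values differ, ρ(αx, αy) ≠ |α| ρ(x, y) for this metric, so it cannot come from a norm.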

Problems

1. Show that the norm ||x|| of x is the distance from x to 0.


2. Verify that the usual length of a vector in the plane or in three
dimensional space has properties (a), (b) and (c) of a norm.
3. Show that for any element x of a normed linear space ||x|| ≥ 0 follows
from axioms (b) and (c) of a normed linear space.
4. Given x = {ξi} ∈ ℝⁿ, show that ||x|| = ( ∑_{i=1}^{n} |ξi|² )^{1/2} defines a norm on ℝⁿ.
5. Let E be the linear space of all ordered triplets x = {ξ1 , ξ2 , ξ3 }, y =
{η1 , η2 , η3 } of real numbers. Show that the norms on E are defined
by,
||x||1 = |ξ1 | + |ξ2 | + |ξ3 |, ||x||2 = {ξ12 + ξ22 + ξ32 }1/2 ,
||x||∞ = max{|ξ1 |, |ξ2 |, |ξ3 |}
6. Show that the norm is continuous on the metric space associated with
a normed linear space.
7. In the case 0 < p < 1, show with the help of an example that || · ||_p^(n) does not define a norm on lp^(n) unless n = 1, where ||x||_p^(n) = ( ∑_{i=1}^{n} |ξi|^p )^{1/p}, x = {ξi}.
8. Show that each of the following defines a norm on ℝ²:
(i) ||x||₁ = |x₁|/a + |x₂|/b,
(ii) ||x||₂ = ( x₁²/a² + x₂²/b² )^{1/2},
(iii) ||x||∞ = max{ |x₁|/a, |x₂|/b },
where a and b are two fixed positive real numbers and x = (x₁, x₂) ∈ ℝ². Draw the closed unit sphere (||x|| = 1) corresponding to each of these norms.
9. Let || · || be a norm on a linear space E. If x + y ∈ E and
||x + y|| = ||x|| + ||y||, then show that ||sx + ty|| = s||x|| + t||y||,
for all s ≥ 0, t ≥ 0.
10. Show that a non-empty subset A of ℝⁿ is bounded ⇔ there exists a
real number K such that for each x = (x1 , x2 , . . . , xn ) in A we have
|xi | ≤ K for each subscript i.
11. Show that the real linear space C([−1, 1]) equipped with the norm given by
    ||x||₁ = ∫_{−1}^{1} |x(t)| dt,
where the integral is taken in the sense of Riemann, is an incomplete normed linear space. [Note: ||x||₁ is precisely the area of the region enclosed between the curve y = |x(t)|, the t-axis and the lines t = −1 and t = 1.]
12. Let E be a linear space and ρ the metric on E such that

ρ(x, y) = ρ(x − y, 0)
and ρ(αx, 0) = |α|ρ(x, 0) ∀ x, y ∈ E and α ∈ ℝ (ℂ).
Define ||x|| = ρ(x, 0), x ∈ E. Prove that || · || is a norm on E and
that ρ is the metric induced by the norm || · || on E.
13. Let E be a linear space of all real valued functions defined on
[0, 1] possessing continuous first-order derivatives. Show that ||f|| = |f(0)| + ||f′||∞ is a norm on E that is equivalent to the norm ||f||∞ + ||f′||∞.

2.1.6 Lemma

In a normed linear space E,

| ||x|| − ||y|| | ≤ ||x − y||, x, y ∈ E.

Proof: ||x|| = ||(x − y) + y|| ≤ ||x − y|| + ||y||.


Hence ||x|| − ||y|| ≤ ||x − y||.

Interchanging x with y,
||y|| − ||x|| ≤ ||y − x||.
Hence | ||x|| − ||y|| | ≤ ||x − y||.

2.1.7 Lemma

The norm || · || is a continuous mapping of E into ℝ, where E is a Banach space.

Proof: Let {xn} ⊂ E and xn → x as n → ∞. It then follows from lemma 2.1.6 that
    | ||xn|| − ||x|| | ≤ ||xn − x|| → 0 as n → ∞.
Hence the result follows.

2.1.8 Corollary

Let E be a complete normed linear space over ℝ (ℂ). If {xn}, {yn} ⊂ E, {αn} ⊂ ℝ (ℂ), and xn → x, yn → y and αn → α as n → ∞, then (i) xn + yn → x + y and (ii) αn xn → αx.

Proof: Now, ||(xn + yn ) − (x + y)|| ≤ ||xn − x|| + ||yn − y|| → 0 as n → ∞

Hence xn + yn → x + y as n → ∞.
||αn xn − αx|| = ||αn(xn − x) + (αn − α)x|| ≤ |αn| ||xn − x|| + |αn − α| ||x|| → 0,
because {αn}, being a convergent sequence, is bounded and ||x|| is finite.

2.1.9 Summable sequence

Definition: A sequence {xn} in a normed linear space E is said to be summable to the limit (sum) s if the sequence {sm} of the partial sums of the series ∑_{n=1}^{∞} xn converges to s in E, i.e.,
    ||sm − s|| → 0 as m → ∞, or || ∑_{n=1}^{m} xn − s || → 0 as m → ∞.
In this case we write s = ∑_{n=1}^{∞} xn. {xn} is said to be absolutely summable if ∑_{n=1}^{∞} ||xn|| < ∞.

It is known that for a sequence of real (complex) numbers, absolute summability implies summability. This is not true in general for sequences in normed linear spaces. In a Banach space, however, every absolutely summable sequence is summable. The converse is also true. This may be regarded as a characterization of a Banach space.

2.1.10 Theorem

A normed linear space E is a Banach space if and only if every absolutely


summable sequence in E is summable in E.

Proof: Assume that E is a Banach space and that {xn} is an absolutely summable sequence in E. Then
    ∑_{n=1}^{∞} ||xn|| = M < ∞,
i.e., for each ε > 0 ∃ a K such that ∑_{n=K}^{∞} ||xn|| < ε. Hence, for n > m > K,
    ||sn − sm|| = || ∑_{k=m+1}^{n} xk || ≤ ∑_{k=m+1}^{n} ||xk|| ≤ ∑_{k=K}^{∞} ||xk|| < ε,
where sn = ∑_{k=1}^{n} xk.

Thus {sn} is a Cauchy sequence in E and must converge to some element s in E, since E is complete. Hence {xn} is summable in E.

Conversely, let us suppose that each absolutely summable sequence in


E is summable in E. We need to show that E is a Banach space. Let {xn }
be a Cauchy sequence in E. Then for each k, ∃ an integer nk such that
    ||xn − xm|| < 1/2^k   ∀ n, m ≥ nk.

We may choose nk such that nk+1 > nk. Then {x_{nk}} is a subsequence of {xn}.

Let us set y0 = x_{n1}, y1 = x_{n2} − x_{n1}, . . . , yk = x_{n_{k+1}} − x_{nk}, . . .. Then
    (a) ∑_{n=0}^{k} yn = x_{n_{k+1}},   (b) ||yk|| < 1/2^k, k ≥ 1.

Thus ∑_{k=0}^{∞} ||yk|| ≤ ||y0|| + ∑_{k=1}^{∞} 1/2^k = ||y0|| + 1 < ∞.

Thus the sequence {yk } is absolutely summable and hence summable


to some element x in E. Therefore by (a) xnk → x as k → ∞. Thus the
Cauchy sequence {xn } in E has a convergent subsequence {xnk } converging
to x. Now, if a subsequence in a Cauchy sequence converges to a limit then
the whole sequence converges to that limit. Thus, the space is complete
and is therefore a Banach space.

2.1.11 Ball, sphere, convex set, segment of a straight line

Since normed linear spaces can be treated as metric spaces, all concepts
introduced in metric spaces (e.g., balls, spheres, bounded set, separability,
compactness, linear dependence of elements, linear subspace, etc.) have
similar meanings in normed linear spaces. Therefore, theorems proved
in metric spaces using such concepts can have parallels in normed linear
spaces.

Definition: ball, sphere

Let (E, || · ||) be a normed linear space.

(i) The set {x ∈ E : ||x − x0 || < r}, denoted by B(x0 , r), is called the
open ball with centre x0 and radius r.
(ii) The set {x ∈ E : ||x − x0 || ≤ r}, denoted by B(x0 , r), is called a
closed ball with centre x0 and radius r.
(iii) The set {x ∈ E : ||x − x0 || = r}, denoted by S(x0 , r), is called a
sphere with centre x0 and radius r.
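For instance, in E = ℝ with ||x|| = |x| we have B(x0, r) = ]x0 − r, x0 + r[, the closed ball is [x0 − r, x0 + r], and S(x0, r) = {x0 − r, x0 + r}; in ℝ² with the Euclidean norm, S(x0, r) is the circle of radius r centred at x0.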

Note 2.1.1.

1. An open ball is an open set.


2. A closed ball is a closed set.

3. Given r > 0,
    B(0, r) = {x ∈ E : ||x|| < r} = {x ∈ E : ||x/r|| < 1} = {ry ∈ E : ||y|| < 1}, where y = x/r,
            = r B(0, 1).

Therefore, in a normed linear space, without any loss of generality, we can


consider B(0, 1), the ball centred at zero with a radius of 1. The ball B(0, 1)
is called the unit open ball in E.

Definition: convex set, segment of a straight line

A set of elements of a linear space E having the form y = tx, x ∈ E, x ≠ 0, −∞ < t < ∞, is called a real line defined by the given element x, and a set of elements of the form

y = λx1 + (1 − λ)x2,  x1, x2 ∈ E,  0 ≤ λ ≤ 1,

is called a segment joining the points x1 and x2. A set X in E is called a convex set if

x1, x2 ∈ X ⇒ λx1 + (1 − λ)x2 ∈ X for all 0 ≤ λ ≤ 1.

2.1.12 Lemma. In a normed linear space an open (closed) ball


is a convex set

Let x1 , x2 ∈ B(x0 , r), i.e., ||x1 − x0 || < r, ||x2 − x0 || < r.

Let us select any element of the form,

y = λx1 + (1 − λ)x2 , 0 < λ < 1


Then ||y − x0|| = ||λx1 + (1 − λ)x2 − x0|| = ||λ(x1 − x0) + (1 − λ)(x2 − x0)||
             ≤ λ ||x1 − x0|| + (1 − λ) ||x2 − x0|| < r.
Thus, y ∈ B(x0 , r).



Note 2.1.2

(i) For any point x ≠ θ, a ball of radius r > ||x|| with its centre at the origin contains the point x.
(ii) Any ball of radius r < ||x|| with centre at the origin does not contain this point.

In order to have geometrical interpretations of the abstract spaces lp^(n), the n-dimensional pth power summable spaces, we draw the shapes of unit balls for different values of p.

Examples: unit closed balls in ℝ² with different norms. Given x = (x1, x2):

(i) ||x||_{1/2} = (|x1|^{1/2} + |x2|^{1/2})²
(ii) ||x||₁ = |x1| + |x2|
(iii) ||x||₂ = (|x1|² + |x2|²)^{1/2}
(iv) ||x||₄ = (|x1|⁴ + |x2|⁴)^{1/4}
(v) ||x||∞ = max(|x1|, |x2|)
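The unit spheres for these norms can be sketched numerically (a rough Python/matplotlib sketch; the plotting details are our own and merely illustrate Fig. 2.2 below):

import numpy as np
import matplotlib.pyplot as plt

# Draw the unit spheres ||x||_p = 1 in R^2 for p = 1, 2, 4 and for the max norm.
theta = np.linspace(0, 2 * np.pi, 400)
u, v = np.cos(theta), np.sin(theta)
for p in (1, 2, 4):
    r = (np.abs(u) ** p + np.abs(v) ** p) ** (-1.0 / p)  # scale each direction onto the unit sphere
    plt.plot(r * u, r * v, label=f"p = {p}")
r_inf = 1.0 / np.maximum(np.abs(u), np.abs(v))
plt.plot(r_inf * u, r_inf * v, label="max norm")
plt.gca().set_aspect("equal")
plt.legend()
plt.show()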

Problems

1. Show that for the norms in examples (ii), (iii), (iv) and (v) the unit
spheres reflect what is shown in figure below:

Fig. 2.2 (the nested unit spheres ||x||₁ = 1, ||x||₂ = 1, ||x||₄ = 1 and ||x||∞ = 1 in ℝ²)

2. Show that the closed unit ball is a convex set.


3. Show that ϕ(x) = (√|ξ1| + √|ξ2|)² does not define a norm on the linear space of all ordered pairs x = {ξ1, ξ2} of real numbers. Sketch the curve ϕ(x) = 1 and compare it with figure 2.3.

Fig. 2.3 (the curve ϕ(x) = 1 in the (ξ1, ξ2)-plane, meeting the axes at ±1)

4. Let ρ be the metric induced by a norm on a linear space E ≠ Φ. If ρ1 is defined by
    ρ1(x, y) = 0 if x = y,   ρ1(x, y) = 1 + ρ(x, y) if x ≠ y,
then prove that ρ1 cannot be obtained from a norm on E.


5. Let E be a normed linear space. Let X be a convex subset of E.
Show that (i) the interior X⁰ and (ii) the closure X̄ of X are convex sets. Show also that if X⁰ ≠ Φ, then X̄ equals the closure of X⁰.

2.2 Subspace, Closed Subspace

Since the normed linear space E is a special case of linear space, all
the concepts introduced in a linear space (e.g., linear dependence and
independence of elements, linear subspace, decomposition of E into direct
sums, etc.) have a relevance for E.

Definition: subspace
A set X of a normed linear space E is called a subspace if

(i) X is a linear space with respect to vector addition and scalar


multiplication as defined in E (1.3.2).
(ii) X is equipped with the norm || · ||X induced by the norm || · || on E
i.e., ||x||X = ||x||, ∀ x ∈ X.

We may write this subspace (X, || · ||X ) simply as X.



Note 2.2.1. It is easy to see that X is a normed linear space. Furthermore


the metric defined on X by the norm coincides with the restriction to X
of the metric defined on E by its norm. Therefore X is a subspace of the
metric space E.

Definition: closed subspace


A subspace X of a normed linear space E is called a closed subspace of E
if X is a closed metric space.

Definition: subspace of a Banach space


Given E a Banach space, a subspace X of E is said to be a subspace of
Banach space E.

Examples

1. The space c of convergent numerical sequences in ℝ (or ℂ) is a closed subspace of l∞, the space of bounded numerical sequences in ℝ (ℂ).
2. c0 , the space of all sequences converging to 0, is a closed subspace of
c.
3. The space P[0, 1] is a subspace of C[0, 1], but it is not closed. P[0, 1] is spanned by the elements x0 = 1, x1 = t, . . . , xn = tⁿ, . . ., so P[0, 1] is the set of all polynomials; by the Weierstrass approximation theorem its closure is C[0, 1], so P[0, 1] ≠ its closure and P[0, 1] is not closed.

2.2.1 Theorem (subspace of a Banach space)

A subspace X of a Banach space E is complete if and only if the set X


is closed in E.

Proof: Let X be complete and let x be a limit point of X. Then every open ball B(x, 1/n), where n is a positive integer, contains a point xn of X other than x. Thus {xn} is a sequence in X such that

||xn − x|| < 1/n, ∀ n,
⇒ lim_{n→∞} xn = x
⇒ {xn} is a Cauchy sequence in E and therefore in X.

However, X being complete, it follows that x ∈ X. This proves that X


is closed.

On the other hand, let X be closed, so that it contains all of its limit points. A Cauchy sequence in X is also a Cauchy sequence in E and, E being complete, it converges to some x ∈ E. Then x belongs to the closure of X, and since X is closed, x ∈ X. Hence every Cauchy sequence in X converges to a point of X, i.e., X is complete.

Examples:

4. Consider the space Φ̂ of sequences x = (ξ1, ξ2, . . . , ξn, 0, 0, . . .) in ℝ (ℂ), where ξn ≠ 0 for only finitely many values of n. Clearly, Φ̂ ⊂ c0 ⊂ l∞ and Φ̂ ≠ c0. It may be noted that c0 is the closure of Φ̂ in (l∞, || · ||∞). Thus Φ̂ is not closed in l∞, and hence Φ̂ is an incomplete normed linear space when equipped with the norm induced by the norm || · ||∞ on l∞.

5. For every real number p ≥ 1, we have Φ̂ ⊂ lp ⊂ c0. It may be noted that c0 is the closure of lp in c0 and lp ≠ c0. Thus lp is not closed in c0, and hence lp is an incomplete normed linear space when equipped with the norm induced by the norm of c0.

Problems

1. Show that the closure X of a subspace X of a normed linear space E


is again a subspace of E.

2. If n ≥ m ≥ 0, prove that Cⁿ([a, b]) ⊂ Cᵐ([a, b]) and that the space Cⁿ([a, b]) with the norm induced by the norm on Cᵐ([a, b]) is not closed.

3. Prove that the intersection of an arbitrary collection of non-empty closed subspaces of a normed linear space E is a closed subspace of E.

4. Show that c ⊂ l∞ is a vector subspace of l∞ and so is c0 .

5. Let X be a subspace of a normed linear space E. Then show that


X is nowhere dense in E (i.e., the interior of the closure of X is
empty) if and only if X is nowhere dense in E.

6. Show that c is a nowhere dense subspace of m.



2.3 Finite Dimensional Normed Linear


Spaces and Subspaces

Although infinite dimensional normed linear spaces are more general than
finite dimensional normed linear spaces, finite dimensional normed linear
spaces are more useful. This is because in application areas we consider
the finite dimensional spaces as subspaces of infinite dimensional spaces.
Quite a number of interesting results can be derived in the case of finite
dimensional spaces.

2.3.1 Theorem

All finite-dimensional normed linear spaces of a given dimension n are


isomorphic to the n-dimensional Euclidean space En , and are consequently,
isomorphic to each other.

Proof: Let E be an n-dimensional normed linear space and let e1 , e2 , . . . , en


be the basis of the space. Then any element x ∈ E can be uniquely
expressed in the form x = ξ1 e1 +ξ2 e2 +· · · ξn en . Corresponding to x ∈ E, let
us consider the element x̃ = {ξ1 , ξ2 , . . . , ξn } in the n-dimensional Euclidean
space. The correspondence established in this manner between x and x̃ is
one-to-one. Moreover, let y ∈ E be of the form

y = η1 e1 + η2 e2 + · · · + ηn en .

Then y ∈ E is in one-to-one correspondence with y ∈ En where y =


{η1 , η2 , . . . , ηn } ∈ En . It is apparent that

x ↔ x̃ and y ↔ ỹ implies x + y ↔ x̃ + ỹ and λx ↔ λx̃, λ ∈ ℝ (ℂ).
To prove that E and En are isomorphic, we go on to show that the
linear mapping from E onto En is mutually continuous.

For any x ∈ E, we have
    ||x|| = || ∑_{i=1}^{n} ξi ei || ≤ ∑_{i=1}^{n} |ξi| ||ei||
          ≤ ( ∑_{i=1}^{n} ||ei||² )^{1/2} ( ∑_{i=1}^{n} |ξi|² )^{1/2}   (by the Cauchy-Schwarz inequality)
          = β ||x̃||,   where β = ( ∑_{i=1}^{n} ||ei||² )^{1/2}.

In particular, for all x, y ∈ E,

||x − y||E ≤ β||x̃ − ỹ||En (2.7)

Next we establish a reverse inequality. We note that the unit sphere


S(0, 1) = {||x̃ − 0|| = 1} in En is compact. We next prove that the function

f (x̃) = f (ξ1 , . . . , ξn ) = ||x|| = ||ξ1 e1 + ξ2 e2 + · · · + ξn en ||

defined on S(0, 1) is continuous. Now, a continuous function defined on a


compact set attains its extremum.

Since all the ξi ’s cannot vanish simultaneously on S and since


e1 , e2 , . . . , en are linearly independent, f (ξ1 , ξ2 , . . . , ξn ) > 0.

Now, |f (ξ1 , ξ2 , . . . , ξn ) − f (η1 , η2 , . . . , ηn )| = | ||x|| − ||y|| | ≤ ||x − y||E ≤


β||x̃ − ỹ||En .

The above shows that f is a continuous function.

Now, since the unit sphere S(0, 1) in Eⁿ is compact and the function f(ξ1, ξ2, . . . , ξn) defined on it is continuous, it follows that f attains its minimum on S. Hence

f(ξ1, ξ2, . . . , ξn) ≥ γ for some γ > 0,

or, f(x̃) = ||x|| ≥ γ for every x̃ ∈ S(0, 1).
Hence, for any x̃ ∈ Eⁿ with x̃ ≠ 0,
    f(x̃) = ||x|| = ||x̃|| · || ∑_{i=1}^{n} ξi ei / ( ∑_{i=1}^{n} ξi² )^{1/2} || ≥ γ ||x̃||,
or, in other words,
    ||x − y|| ≥ γ ||x̃ − ỹ||.   (2.8)

From (2.7) and (2.8) it follows that the mapping of E onto Eⁿ and its inverse are both continuous; the mapping is also one-to-one and onto. Thus, the mapping is a homeomorphism. The homeomorphism between E and Eⁿ implies that in a finite dimensional normed linear space convergence in norm reduces to coordinatewise convergence, and such a space is always complete.

The following lemma is useful in deriving various results. Very roughly speaking, it states that for a linearly independent set of vectors, we cannot find a linear combination whose scalar coefficients are large in total magnitude but which represents a small vector.

2.3.2 Lemma (linear combination)

Let {e1, e2, . . . , en} be a linearly independent set of vectors in a normed linear space E (of any dimension). Then there is a number c > 0 such that for every choice of scalars α1, α2, . . . , αn we have

||α1 e1 + α2 e2 + · · · + αn en|| ≥ c(|α1| + |α2| + · · · + |αn|), c > 0.   (2.9)

Proof: Let S = |α1| + |α2| + · · · + |αn|. If S = 0, all αj are zero, so that the above inequality holds for every c. Let S > 0. Writing βi = αi/S, (2.9) is equivalent to the inequality

||β1 e1 + β2 e2 + · · · + βn en|| ≥ c.   (2.10)

Note that ∑_{i=1}^{n} |βi| = 1.

Hence it suffices to prove the existence of a c > 0 such that (2.10) holds for every n-tuple of scalars β1, β2, . . . , βn with ∑_{i=1}^{n} |βi| = 1.

Suppose that this is false. Then there exists a sequence {ym} of vectors
    ym = β1^(m) e1 + · · · + βn^(m) en,   ∑_{i=1}^{n} |βi^(m)| = 1,
such that ||ym|| → 0 as m → ∞.

Since ∑_{i=1}^{n} |βi^(m)| = 1, we have |βi^(m)| ≤ 1. Hence for each fixed i the sequence
    (βi^(m)) = (βi^(1), βi^(2), . . .)
is bounded. Consequently, by the Bolzano-Weierstrass theorem, {β1^(m)} has a convergent subsequence. Let β1 denote the limit of that subsequence, and let {y1,m} denote the corresponding subsequence of {ym}. By the same

argument, {y1,m} has a subsequence {y2,m} for which the corresponding subsequence of scalars β2^(m) converges; let β2 denote the limit. Continuing in this way, after n steps we obtain a subsequence {yn,m} = {yn,1, yn,2, . . .} of {ym} whose terms are of the form
    yn,m = ∑_{i=1}^{n} γi^(m) ei,   ∑_{i=1}^{n} |γi^(m)| = 1,
with scalars γi^(m) → βi as m → ∞. Hence, as m → ∞,
    yn,m → y = ∑_{i=1}^{n} βi ei,
where ∑_i |βi| = 1, so that not all βi can be zero.

Since {e1, e2, . . . , en} is a linearly independent set, we thus have y ≠ θ. On the other hand, yn,m → y implies ||yn,m|| → ||y||, by the continuity of the norm. Since ||ym|| → 0 by assumption and {yn,m} is a subsequence of {ym}, we must have ||yn,m|| → 0. Hence ||y|| = 0, so that y = θ by (a) of 2.1.1. This contradicts the fact that y ≠ θ, and the lemma is proved.

Using the above lemma we prove the following theorem.

2.3.3 Theorem (completeness)

Every finite dimensional subspace of a normed linear space E is


complete. In particular, every finite dimensional normed linear space is
complete.

Let us consider an arbitrary Cauchy sequence {ym } in X, a subspace of


E and let the dimension of X be n. Let {e1 , e2 , . . . , en } be a basis for X.
Then each ym can be written in the form,
(m) (m)
ym = α1 e1 + α2 e2 + · · · + αn(m) en

Since {ym} is a Cauchy sequence, for every ε > 0 there is an N such that ||ym − yp|| < ε when m, p > N. From this and lemma 2.3.2, we have, for some c > 0,
    ε > ||ym − yp|| = || ∑_{i=1}^{n} (αi^(m) − αi^(p)) ei || ≥ c ∑_{i=1}^{n} |αi^(m) − αi^(p)| for all m, p > N.

On division of both sides by c, we get
    |αi^(m) − αi^(p)| ≤ ∑_{i=1}^{n} |αi^(m) − αi^(p)| < ε/c for all m, p > N.
This shows that {αi^(m)} is a Cauchy sequence in ℝ (ℂ) for i = 1, 2, . . . , n.
Hence the sequence converges. Let αi denote the limit. Using these n limits
α1 , α2 , . . . , αn , let us construct y as

y = α1 e1 + α2 e2 + · · · + αn en

Here y ∈ X and
    ||ym − y|| = || ∑_{i=1}^{n} (αi^(m) − αi) ei || ≤ ∑_{i=1}^{n} |αi^(m) − αi| ||ei||.

(m)
Since αi → αi as m → ∞ for each i, ym → y as m → ∞. This shows
that {ym } is convergent in X. Since {ym } is an arbitrary Cauchy sequence
in X it follows that X is complete.

2.3.4 Theorem (closedness)

Every finite dimensional subspace X of a normed linear space E is closed in E.
Proof: By theorem 2.3.3, X is complete. X being a complete subspace of the metric space E, it follows (as in theorem 2.2.1) that X is closed in E.

2.3.5 Equivalent norms

A norm on a linear space E induces a topology, called norm topology


on E.

Definition 1: Two norms on a normed linear space are said to be equivalent


if they induce the same norm topology or if any open set in one norm is also
an open set in the other norm. Alternatively, we can express the concept
in the following form:

Definition 2: Two norms || · || and || · ||′ on the same linear space E are said to be equivalent norms on E if the identity mapping I_E : (E, || · ||) → (E, || · ||′) is a topological homeomorphism of (E, || · ||) onto (E, || · ||′).

Theorem: Two norms || · || and || · ||′ on the same linear space E are equivalent if and only if ∃ positive constants α1 and α2 such that

α1 ||x|| ≤ ||x||′ ≤ α2 ||x||   ∀ x ∈ E.

Proof: In view of definition 2 above, || · || and || · ||′ are equivalent norms on E

⇔ the identity mapping I_E is a topological isomorphism of (E, || · ||) onto (E, || · ||′)

⇔ ∃ constants α1 > 0 and α2 > 0 such that α1 ||x|| ≤ ||I_E x||′ ≤ α2 ||x||, ∀ x ∈ E

⇔ α1 ||x|| ≤ ||x||′ ≤ α2 ||x||, ∀ x ∈ E.

This completes the proof.
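For instance, on ℝⁿ the norms ||x||∞ = max_i |ξi| and ||x||₁ = ∑_{i=1}^{n} |ξi| satisfy ||x||∞ ≤ ||x||₁ ≤ n ||x||∞ for every x, so they are equivalent, with α1 = 1 and α2 = n when || · || = || · ||∞ and || · ||′ = || · ||₁.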

Note 2.3.1. The relation ‘norm equivalence’ is an equivalence relation


among the norms on E. The special feature of a finite dimensional normed
linear space is that all norms on the space are equivalent, or in other words,
all norms on E lead to the same topology for E.

2.3.6 Theorem (equivalent norms)

On a finite dimensional normed linear space, any norm || · || is equivalent to any other norm || · ||′.

Proof: Let E be an n-dimensional normed linear space and let {e1, e2, . . . , en} be any basis in E. Then every x ∈ E can be written uniquely as

x = α1 e1 + α2 e2 + · · · + αn en.   (2.11)

By lemma 2.3.2, we can find a constant c > 0 such that
    ||x|| = || ∑_{i=1}^{n} αi ei || ≥ c ∑_{i=1}^{n} |αi|.   (2.12)
On the other hand,
    ||x||′ = || ∑_{i=1}^{n} αi ei ||′ ≤ ∑_{i=1}^{n} |αi| ||ei||′ ≤ k1 ∑_{i=1}^{n} |αi|,   (2.13)
where k1 = max_i ||ei||′. Combining (2.12) and (2.13),
    ||x||′ ≤ α2 ||x||, where α2 = k1/c.   (2.14)
Interchanging the roles of || · || and || · ||′ and arguing as above, we obtain
    ||x||′ ≥ α1 ||x||,
where α1 = (k2/c′)^{-1}, k2 = max_i ||ei||, and c′ is the constant supplied by lemma 2.3.2 for the norm || · ||′. Hence the two norms are equivalent.

Problems

1. Let E1 be a closed subspace and E2 be a finite dimensional subspace


of a normed linear space E. Then show that E1 + E2 is closed in E.
2. Show that equivalent norms on a vector space E induce the same topology on E.
3. If || · || and || · || are equivalent norms on a normed linear space E,
show that the Cauchy-sequences in (E, || · ||) and (E, || · || ) are the
same.
4. Show that a finite dimensional normed linear space is separable. (A
normed linear space is said to be separable if it is separable as a metric
space.)
5. Show that a subset L = {u1 , u2 , . . . , un } of a normed linear space E is
linearly independent if and only if for every x ∈ span L, there exists
a unique (α1, α2, . . . , αn) ∈ ℝⁿ (ℂⁿ) such that x = α1 u1 + α2 u2 + · · · + αn un.
6. Show that a Banach space is finite dimensional if and only if every
subspace is closed.
7. Let 1 ≤ p ≤ ∞. Prove that a unit ball in lp is convex, closed and
bounded but not compact.
8. Let ) be the vector space generated by the functions
1, sin x, sin2 x, sin3 x, . . . defined on [0, 1]. That is, f ∈ ) if and only
if there is a nonnegative integer k and real numbers α1 , α2 , α3 , . . . , αn
∞
(all depending on f such that f (x) = αn sinn x for each x ∈ [0, 1]).

Show that )is an algebra and ) n=0


is dense in C([0, 1]) with respect to
)
the uniform metric (A vector space of real-valued functions is called
an algebra of functions whenever the product of any two functions
in ) is in )(Aliprantis and Burkinshaw [1])).
Normed Linear Spaces 83

9. Let E1 be a compact subset and E2 be a closed subset of a normed


linear space such that E1 ∩ E2 = Φ. Then show that (E1 + B(0, r)) ∩
E2 = Φ for some r > 0.
10. Let 1 ≤ p ≤ ∞. Show that the closed unit ball in lp is convex, closed
and bounded, but not compact.
11. Let E denote the linear space of all polynomials in one variable with
4+
coefficients in ( ). For p ∈ E with p(t) = α0 +α1 t+α2 t2 +· · ·+αn tn ,
let
||p|| = sup{|p(t)| : 0 ≤ t ≤ 1},
||p||1 = |α0 | + |α1 | + · · · + |αn |,
||p||∞ = max{|α0 |, |α1 , . . . , |αn |}.

Then show that || ||, ||p||1 , ||p|||∞ are norms on E, ||p|| ≤ ||p||1 and
||p||∞ ≤ ||p||1 for all p ∈ E.
12. Show that equivalent norms on a linear space E induce the same
topology for E.
13. If two norms || · || and || · ||0 on a linear space are equivalent, show
that (i) ||xn − x|| → 0 implies (ii) ||xn − x||0 → 0 (and vice versa).

2.3.7 Riesz’s lemma

Let Y and Z be subspaces of a normed linear space X and suppose that


Y is closed and is a proper subset of Z. Then for every real number θ in
the interval (0, 1) there is a z ∈ Z such that ||z|| = 1, ||z − y|| ≥ θ for all
y ∈Y.

Proof: Take any v ∈ Z − Y and denote its distance from Y by d (fig. 2.4),

d = inf ||v − y||.


y∈Y

v
d y0

Fig. 2.4
84 A First Course in Functional Analysis

Clearly, d > 0 since Y is closed. We now take any θ ∈ (0, 1). By the
definition of an infinum there is a y0 ∈ Y such that

d
d ≤ ||v − y0 || ≤ (2.15)
θ
d
(note that θ > d since 0 < θ < 1).

1
Let, z = c(v − y0 ) where c = .
||v − y0 ||

Then ||z|| = 1 and we shall show that ||z − y|| ≥ θ for every y ∈ Y .

Now, ||z − y|| = ||c(v − y0 ) − y|| = c||(v − y0 ) − c−1 y|| = c||v − y1 ||


where y1 = y0 + c−1 y.

The form of y1 shows that y1 ∈ Y . Hence ||v −y1 || ≥ d, by the definition


of d. Writing c out and using (2.15), we have

d d
||z − y|| = c||v − y1 || ≥ cd = ≥ d = θ.
||v − y0 || θ

Since y ∈ Y was arbitrary, this completes the proof.

2.3.8 Lemma

Let Ex and Ey be Banach spaces, A be compact and R(A) be closed in


Ey , then the range of A is finite dimensional.

Proof: Let, if possible, {z1 , z2 , . . .} be an infinite linearly independent


subset of R(A) and let Zn = span{z1 , z2 , . . .}, n = 1, 2, . . ..

Zn is finite dimensional and is therefore a closed subspace of Zn+1 . Also,


Zn = Zn+1 , since {z1 , z2 , . . . , zn+1 } is linearly independent. By the Riesz
lemma (2.3.7), there is a y n ∈ Zn+1 , such that

1
||y n || = 1 and dist(y n , Zn ) ≥ .
2

Now, {y n } is a sequence in {y ∈ R(A) : ||y|| ≤ 1} having no convergent


subsequence. This is because ||y n − y m || ≥ 12 for all m = n. Hence the set
{y ∈ R(A) : ||y|| ≤ δ} cannot be compact. Hence R(A) is finite dimensional.
Normed Linear Spaces 85

Problems

1. Prove that if E is a finite dimensional normed linear space and X


is a proper subspace, there exists a point on the unit ball of E at a
distance from X.
2. Let E be a normed linear space. Show that the Riesz lemma with
θ = 1 holds if and only if for every closed proper subspace X of E,
there is a x ∈ E and y0 ∈ X such that ||x − y0 || = dist (x, X) > 0.
3. Let E = {x ∈ C([0, 1]) : x(0) = 0} with the sup norm and
  1 ,
X= x∈E: x(t)dt = 0 . Then show that X is a proper closed
0
subspace of E. Also show that there is no x ∈ E with ||x||∞ = 1 and
dist (x, X) = 1.

2.4 Quotient Spaces

In this section we consider an useful method of constructing a new Banach


space from a given Banach space. Earlier, in section 1.3.7 we constructed
quotient space over a linear space. Because a Banach space is also a linear
space, we can similarly construct a quotient space by introducing a norm
consistent with the norm of the given Banach space.

2.4.1 Theorem

Let E be a normed linear space over 4(+) and let L be a closed subspace
of E.

Define || · ||q : E/L → 4 by


||x + L||q = inf{||x + m|| : m ∈ L}.

Then (E/L, || · ||q ) is a normed linear space. Furthermore, if E is a


Banach space, then E/L is a Banach space.

Proof: We first show that || · ||q defines a norm on E/L. We note first that
||x + L||q ≥ 0, ∀ x ∈ E.

Next, if x + L = L, then ||x + L||q = ||O + L||q = 0

Conversely, let ||x + L||q = 0 for some x ∈ E. Then there exists a


86 A First Course in Functional Analysis

sequence {mk } ⊂ L such that


lim ||x + mk ||q = 0 i.e. − mk → x in E as k → ∞.
k→∞

Now x ∈ L as L is closed.

Hence x + L = L

Thus ||x + L||q = 0 ⇒ x + L = L.

Further, for x, y ∈ E, we have,


||(x + L) + (y + L)||q = ||(x + y) + L||q
= inf{||(x + y) + m|| : m ∈ L}
= inf{||(x + m1 ) + (y + m2 )||, m1 , m2 ∈ L}
< inf{||x + m1 || : m1 ∈ L}
+ inf{||y + m2 || : m2 ∈ L}
= ||x + L||q + ||y + L||q

This proves the triangle inequality.

Now, for x ∈ L and α ∈ 4(+), with α = 0, we have,


||α(x + L)||q = ||αx + αL||q = ||αx + L||q
= inf{||αx + m|| : m ∈ L}
1 m 2
= inf ||α(x + m )|| : m = ∈L
α
= |α| inf{||x + m || : m : L}
= |α| ||x + L||q

Thus we conclude that (E/L, || · ||q ) is a normed linear space.

We will next suppose that E is a Banach space. Let (xn + L) be a


Cauchy sequence in E/L.

We next show that {xn + L} contains a convergent subsequence {xnk +


L}. Let the subsequence {xnk + L} be such that
1
||(xn2 + L) − (xn1 + L)||q <
2
1
||(xn3 + L) − (xn2 + L)||q < 2
2
1
||(xnk+1 + L) − (xnk + L)||q <
2k
Normed Linear Spaces 87

Let us choose any vector y1 ∈ xn1 + L. Next choose y2 ∈ xn2 + L such


that ||y2 − y1 || < 12 . We then find y3 ∈ xn3 + L such that ||y3 − y2 || < 212 .
Proceeding in this way, we get a sequence {yk } in E such that

1
xnk + L = yk + L and ||yk+1 − yk || < (k = 1, 2, . . .)
2k
Then for p = 1, 2, . . .


p

p
1 1
||yk+p − yk ||  ||yk+i − yk+i−1 || < = k+p−1
i=1 i=1
2k+i−1 2

Therefore, it follows that {yk } is a Cauchy sequence in E. However


because E is complete, ∃ y ∈ E such that limk→∞ ||yk − y|| = 0. Since

||(xnk + L) − (y + L)||q = ||(yk + L) − (y + L)||q


= ||(yk − y)|L||q
≤ ||yk − y||(||L||q = 0)
it follows that
lim (xnk + L) = y + L ∈ E/L
k→∞
Hence {xn + L} has a subsequence {xnk + L} that converges to some
element in E/L.
Then ||(xn + L) − (y + L)||q ≤ ||(xn + L) − (xnk + L)||q
+ ||(xnk + L) − (y + L)||q
≤ ||(xn − xnk ) + L||q
+ ||(xnk − y) + L||q → 0 as k → ∞.

Hence the Cauchy sequence {xn + L} converges in E/L and thus E/L is
complete.

Problems

1. Let M be a closed subspace of a normed linear space E. Prove that


the quotient mapping x ∈ x + M of E onto the quotient space E/M
is continuous and that it maps open subsets of E onto open subsets
of E/M .
2. Let X1 = (X1 , || · ||1 ) and X2 = (X2 , || · ||2 ) be Banach spaces over the
4+
same scalar field ( ). Let X = X1 × X2 be the Cartesian product
of X1 and X2 . Then show that X is a linear space over ( ). Prove 4+
that the following is a norm on X:
||(x1 , x2 )||∞ = max{||x1 ||1 ; ||x2 ||2 }
88 A First Course in Functional Analysis

3. Let M and N be subspaces of a linear space E and let E = M + N .


Show that the mapping y → y + M , which sends each y in N to y+M
in E/M , is an isomorphism of N onto E/M .
4. Let M be a closed subspace of a normed space E. Prove that if E is
separable, then E/M is separable (A space E is separable if it has a
denumerable everywhere dense set).
5. If E is a normed vector space and M ∈ E is a Banach space, then
show that if E/M is a Banach space, E itself is a Banach space.

2.5 Completion of Normed Spaces

Definition: Let E1 and E2 be normed linear spaces over 4(+).


(i) A mapping T : E1 → E2 (not necessarily linear) is said to be an
isometry if it preserves norms, i.e., if

||T x||E2 = ||x||E1 ∀ x ∈ E1

such an isometry is said to imbed E1 into E2 .


(ii) Two spaces E1 and E2 are said to be isometric if there exists an
one-one (bijective) isometry of E1 into E2 . The spaces E1 and E2
are then called isometric spaces.

Theorem 2.5.1. Let E1 = (E1 , || · ||E1 ) be a normed linear space. Then


there is a Banach space E2 and an isometry T from E1 onto a subspace E2
of E2 which is dense in E2 . The space E2 is unique, except for isometries.

Proof: Theorem 1.4.25 implies the existence of a complete metric space


X2 = (X2 , ρ2 ) and an isometry T : X1 → X2 = T (X1 ), where X2 is dense in
X2 and X2 is unique, except for isometries. In order to prove the theorem,
we need to make X2 a linear space E2 and then introduce a suitable norm
on E2 .

To define on X2 the two algebraic operations of a linear space, we


consider any x̃, ỹ ∈ X2 and any representatives {xn } ∈ x̃ and {yn } ∈ ỹ.
We recall that x̃, ỹ are equivalence classes of Cauchy sequences in E1 .

We set zn = xn + yn . Then {zn } is a Cauchy sequence in E1 since

||zn − zm || = ||xn + yn − (xm + ym )|| ≤ ||xn − xm || + ||yn − ym ||.


Normed Linear Spaces 89

We define the sum ẑ = x̂ + ŷ of x̂ and ŷ to be an equivalence class of


which {zn } is representative; thus {zn } ∈ ẑ. This definition is independent
of the particular choices of Cauchy sequences belonging to x̃ and ỹ. We
know that if {xn } ∼ {xn } and {yn } ∼ {yn }, then {xn + yn } ∼ {xn + yn },
because,

||xn + yn − (xn + yn )|| ≤ ||xn − xn || + ||yn − yn ||.

Similarly, we define αx̃ ∈ X2 , the product of a scalar α and x̃ to be


the equivalence class for which {αxn } is a representative. Moreover, this
definition is independent of the particular choice of a representative x̃. The
zero element of X2 is the equivalence class containing all Cauchy sequences
that converge to zero. We thus see that these algebraic operations have
all the properties required by the definition of a linear space and therefore
X2 is a linear space. Let us call it the normed linear space E2 . From the
definition it follows that on X2 [see theorem 1.4.25] the operations of linear
space induced from E2 agree with those induced from E1 by means of T .
We call X2 , a subspace of E2 as E2 .

Furthermore, T induces on E2 a norm || · ||1 , value of which at every


ỹ = T x ∈ E2 is ||ỹ||1 = ||x||. The corresponding metric on E2 is the
restrictions of ρ2 to E2 since T is isometric. We can extend the norm || · ||1
to E2 by setting ||x̃||2 = ρ(0, x̃) for every x̃ ∈ E2 . It is clear that || · ||2
satisfies axiom (a) of subsection 2.1.1 and that the other two axioms (b)
and (c) of the above follow from those for || · ||1 by a limiting process.

The space E2 constructed as above is sometimes called the completion


of the normed linear space E1 .

Definition: completion A completion of a normed linear space E1 is any


normed linear space E2 that contains a dense subspace that is isometric to
E1 .

Theorem 2.5.2. All completions of a normed linear space are isometric.

˜ be two completions of a normed linear


Proof: Let, if possible, Ẽ and Ẽ
space E. In particular, we assume that Ẽ and Ẽ ˜ are complete and both
contain E as a dense subset. We now define an isometry T between Ẽ and
˜ For each x̃ ∈ Ẽ, since E is dense in Ẽ, ∃ a sequence {x } of points of
Ẽ. n
E converging to x̃. But we may also consider {xn } as a Cauchy sequence
˜ and Ẽ
in Ẽ ˜ being complete, it must converge to x̃ ˜ Define T x̃ = x̃
˜ ∈ Ẽ. ˜
by the construction. In what follows we will show that this construction
is independent of the particular sequence {xn } converging to x̃ and gives
a one-to-one mapping of X̃ onto X̃.˜ Clearly T x = x ∀ x ∈ E. Now, if
90 A First Course in Functional Analysis

˜ then
˜ in Ẽ,
{xn } → x̃ in Ẽ and xn → x̃

||x||Ẽ = lim ||xn || and ˜ ˜ = lim ||xn ||.


||x̃||Ẽ
n→∞ n→∞

Thus ||T x̃||Ẽ˜ = ||x̃||Ẽ . Hence T is isometric.

Corollary 2.5.1. The space Ẽ in theorem 2.5.2 is unique except for


isometries.

Example: The completion of the normed linear space (P[a, b], || · ||∞ ) where
P[a, b] is the set of all polynomials with real coefficients defined on the
closed interval [a, b], is the space (C([a, b]), || · ||∞ ).
CHAPTER 3

HILBERT SPACE

A Hilbert space is a Banach space endowed with a dot product or scalar


product. A normed linear space has a norm, or the concept of distance,
but does not admit the concept of the angle between two elements or
two vectors. But an inner product space admits both the concepts such
as the concept of distance or norm and the concept of orthogonality–in
other words, the angle between two vectors. Just as a complete normed
linear space is called a Banach space, a complete inner product space is
called a Hilbert space. An inner product space is a generalisation of the
n-dimensional Euclidean space to infinite dimensions.
The whole theory was initiated by the work of D. Hilbert (1912)
[24] on integral equations. The currently used geometrical notation and
terminology is analogous to that of Euclidean geometry and was coined by
E. Schmidt (1908) [50]. These spaces have up to now been the most useful
spaces in practical applications of functional analysis.

3.1 Inner Product Space, Hilbert Space


3.1.1 Definition: inner product space, Hilbert space
An inner product space (pre-Hilbert Space) is a linear (vector) space
H with an inner product defined on H. A Hilbert space is a complete
inner product space (complete in the metric defined by the inner product)
(cf. (3.2) below). Hence an inner product on H is a mapping of H × H into
4
the scalar field K( or C) of H; that is, with every pair of elements x and
y there is associated a scalar which is written x, y and is called the inner
product (or scalar product) of x and y, such that for all elements x, y and
z and scalar α we have s

(a) x + y, z = x, z + y, z


(b) αx, y = αx, y

91
92 A First Course in Functional Analysis

(c) x, y = y, x


(d) x, x ≥ 0,
x, x = 0 ⇔ x = 0.

An inner product on H defines a norm on H given by


1
x = x, x 2 ≥ 0 (3.1)

and a metric on H given by


+
ρ(x, y) = x − y = x − y, x − y. (3.2)

Hence inner product spaces are normed linear spaces, and Hilbert
spaces are Banach spaces.
In (c) the bar denotes complex conjugation. In case,

x = (ξ1 , ξ2 , . . . ξn , . . .) and y = (η1 , η2 , . . . ηn , . . .)




x, y = ξi ηi . (3.3)
i=1

In case H is a real linear space

x, y = y, x.

The proof that (3.1) satisfies the axioms (a) to (d) of a norm [see 2.1] will
be given in section 3.2.
From (a) to (d) we obtain the formula,

(a ) αx + βy, z = αx, z + βy, z for all scalars α, β. ⎪

(b ) x, αy = αx, y (3.4)



(c ) x, αy + βz = αx, y + βx, z.

3.1.2 Observation
It follows from (a ) that the inner product is linear in the first argument,
while (b ) shows that the inner product is conjugate linear in the second
argument. Consequently, the inner product is sesquilinear, which means
that 1 12 times linear.
3.1.3 Examples
1. 4 n
, n dimensional Euclidean space
4
The space n is a Hilbert space with inner product defined by

n
x, y = ξi ηi (3.5)
i=1
Hilbert Space 93

where x = (ξi , ξ2 , . . . ξn ), and y = (η1 , η2 , . . . ηn )


In fact, from (3.5) it follows
n
12
1

2
x = x, x = 2 ξi2 .
i=1

The metric induced by the norm takes the form


1 1
ρ(x, y) = x − y = x − y, x − y 2 = {(ξ1 − η1 )2 + · · · + (ξn − ηn )2 } 2

The completeness was established in 1.4.16.


2. Unitary space n +
The unitary space + n
is a Hilbert space with inner product defined
by

n
x, y = ξi ηi (3.6)
i=1

where x = (ξ1 , ξ2 , . . . ξn ) and y = (η1 , η2 , . . . ηn ).


From (3.6) we obtain the norm defined by
n
12
1

2
x = x, x = 2 |ξi | .
i=1

The metric induced by the norm is given by


n
12

2
ρ(x, y) = x − y = |ξi − ηi | .
i=1

Completeness was shown in 1.4.16.


Note 3.1.1. In (3.6) we take the conjugate ηi so that we have y, x =
x, y which is the requirement of the condition c, so that x, x is real.
3. Space l2
l2 is a Hilbert space with inner product defined by


x, y = ξi ηi (3.7)
i=1

where x = (ξi , ξ2 , . . . ξn , . . .) ∈ l2 and y = (η1 , η2 , . . . ηn , . . .) ∈ l2 .


∞ ∞
Since x, y ∈ l2 , |ξi |2 < ∞ and |ηi |2 < ∞.
i=1 i=1

By Cauchy-Bunyakovsky-Schwartz inequality [see theorem 1.4.3]


94 A First Course in Functional Analysis



1/2 ∞

1/2
  
2 2
We have x, y = ξi ηi ≤ |ξi | |ηi | <∞ (3.8)
i=1 i=1 i=1
From (3.7) we obtain the norm defined by

12
1

x = x, x 2 = |ξi |2 .
i=1

Using the metric induced by the norm, for l2 , we see that all the axioms
3.1.1(a)–(d) are fulfilled.
4. Space L2 ([a, b])
The inner product is defined
 b
x, y = x(t)y(t)dt (3.9)
a

for x(t), y(t) ∈ L2 ([a, b]) i.e., x(t), y(t) are Riemann square integrable.
The norm is then

12
b
1
2
x = x, x 2 = x(t) dt where x(t) ∈ L2 ([a, b]).
a

Using the metric induced by the norm we can show that L2 ([a, b]) is a
Hilbert space.
In the above x(t) is a real-valued function.
In case x(t) and y(t) are complex-valued functions we can define the
 b
inner product x, y as x, y = x(t)y(t)dt with the norm given by
a

12
b
2
x(t) = |x(t)| dt because x(t)x(t) = |x(t)|2 .
a

Note 3.1.2. l2 is the prototype of the Hilbert space. It was introduced


and studied by D. Hilbert in his work on integral equations. The axiomatic
definition of a Hilbert space was given by J. Von Neumann in a paper on
the mathematical foundation of quantum mechanics [30].

3.2 Cauchy-Bunyakovsky-Schwartz (CBS)


Inequality
3.2.1 Lemma (Cauchy-Bunyakovsky-Schwartz inequality)
(CBS)
If x, y are two elements of an inner product space, then
|x, y| ≤ xy (3.10)
Hilbert Space 95

The equality occurs if and only if {x, y} is a linearly dependent set.


Proof: If y = θ then x, y = x, theta = θ, x = 0x, x = 0 and the
conclusion is clear. Let y = θ, then, for any scalar λ, we have

0 ≤ λx + y2 = λx + y, λx + y = λλx, x + λx, y + λy, x


+ y, y = |λ|2 x, x + λx, y + λx, y + y, y ∀ λ
y, x
Let λ=− .
x, x
Then the above inequality reduces to,
|x, y|2
y, y − ≥ 0 or, |x, y| ≤ x|y|.
x, x

Note 3.2.1. By using Cauchy-Bunyakovsky-Schwartz inequality we can


show that the norm defined by a scalar product of an inner product space
[cf. (3.1)] satisfied all the axioms of a normed linear space.
For x, y belonging to an inner product space, we obtain,

x + y2 = x + y, x + y = x, x + x, y + y, x + y, y


= x|2 + 2 Re x, y + y2
since y, x = x, y ≤ x y.
Cauchy-Bunyakovsky-Schwartz inequality reduces the above inequality to
x + y2 ≤ x2 + 2xy + y2 = (x + y)2

Hence, x + y ≤ x + y, which is the triangular inequality of a


normed linear space.
Thus, the distance introduced by the norm satisfies all the axioms of a
metric space.
Thus a Hilbert space ⇒ a Banach space ⇒ a complete metric space.

3.3 Parallelogram Law


3.3.1 Parallelogram law
The parallelogram law states that the norm induced by a scalar product
satisfies
x + y2 + x − y2 = 2(x2 + y2 ) (3.11)
where x, y are elements of an inner product space.

Proof: x + y2 = x + y, x + y = x, x + x, y + y, x + y, y.


x − y2 = x − y, x − y = x, x − x, y − y, x + y, y.
Therefore, x + y2 + x − y2 = 2(x2 + y2 ).
96 A First Course in Functional Analysis

The term parallelogram equality is suggested by elementary geometry,


as we shall see from the figure below. Since norm stands for the length
of a vector, the parallelogram law states an important property of a
parallelogram, i.e., the sum of the squares of the lengths of the diagonals is
equal to twice the sum of the squares of the lengths of the sides.
Thus the parallelogram law generalises a known property of elementary
geometry to an inner product space.
y
x+
y
x−
y

x
Fig. 3.1 Parallelogram with sides x and y in the plane

In what follows we give some examples of normed linear spaces which


are not Hilbert spaces.
3.3.2 Space lp
The space lp with p = 2 is not an inner product space, hence not a
Hilbert space. Hence we would like to show that the norm of lp with
p = 2 cannot be obtained from an inner product. We prove this by
showing that the norm does not satisfy the parallelogram law. Let us take
1
x = (1, 1, 0 . . . 0) ∈ lp and y = (1, −1, 0, 0 . . . 0 . . .) ∈ lp . Then x = 2 p .
1
y = 2 p . Now x+y = (2, 0, 0 . . .), x+y = 2. Again x−y = (0, 2, 0, 0 . . .).
Hence x − y = 2.
Thus the parallelogram law is not satisfied. Hence lp (p = 2) though a
Banach space (cf. 2.1.4) is not a Hilbert space.
3.3.3 Space C([0, 1])
The space C([0, 1]) is not an inner product space and hence not a Hilbert
space.
We show that the norm defined by

x = max |x(t)|


0≤t≤1

cannot be obtained from an inner product. Let us take x(t) = 1 and


y(t) = t. Hence x = 1, y = 1

x(t) + y(t) = 1 + t = 2 · x(t) − y(t) = 1 − t = 1.


Thus x + y + x − y = 3 = 4 = 2(x + y).

Hence the parallelogram law is not satisfied.


Thus C([0, 1]), although a complete normed linear space, i.e., a Banach
space [cf. 2.1.4], is not a Hilbert space.
Hilbert Space 97

We know that the norm of a vector in an inner


+ product space can be
expressed in terms of the inner product: x = x, x. The inner product
can also be recovered from the induced norm by the following formula
known as the polarization identity:

⎪ 1
⎨ (x + y2 − x − y|2 ) in
4
4
x, y =

⎩ 1 [(x + y2 − x − y2 ) + i(x + iy2 − x − iy2 )] in
4
+
(3.12)

Now, in ,4
1 1
(x + y2 − x − y2 ) = [x + y, x + y − x − y, x − y]
4 4
1
= [x,x + y,y + x, y + y, x − x,
 x − y,
 y + x, y + y, x].

4
4
= x, y since in , x, y = y, x.
In +
1
[(x + y2 − x − y2 ) + i(x + iy2 − x − iy2 )]
4
1
= [{x, x + y,
y + x, y + y, x − x,
x − y,
 y + x, y + y, x}

4
 y − x,
x + x, iy + iy, x + iiy,
+ i{x,  y + x, iy + iy, x}]
x − iiy,
1
= [2x, y + 2x, y + 2x, y − 2x, y] = x, y.
4
which is the polarization identity.

3.3.4 Theorem
A norm  ·  on a linear space E is induced by an inner product ,  on it
if and only the norm satisfies the parallelogram law. In that case the inner
product  ,  is given by the polarization equality.
Proof: Suppose that the norm is induced by the inner product. Then
the parallelogram law holds true. Furthermore the inner product can be
recovered from the norm by the polarization equality.
Conversely, let us suppose that  ·  obeys the parallelogram law and
 ,  is defined by the polarization equality as given in (3.11). We have to
show that  ,  is an inner product and generalize  ·  on E. Let us consider
the formula (3.11) for the complex space.

(i) Then for all x, y ∈ E


1
x, y = [(x + y2 − x − y2 + i(x + iy2 − x − iy2 )].
4
Putting y = x ∈ E we get
1
x, x = [(4x2 − 0 + i1 + i2 x2 − i1 − i2 x2 ]
4
98 A First Course in Functional Analysis

1
= [(4x2 + 2i(x|2 − x2 )]
4
= x2
Therefore inner product  ,  generates the norm  · .
(ii) For all x, y ∈ E we have,
1
y, x = [y + x2 − y − x2 − i(y + ix2 − y − ix2 )]
4
1
= [x + y2 − x − y2 + i(x + iy2 − x − iy2 )] = x, y.
4
(iii) Let u, v, w ∈ X. Then parallelogram law yields
3
(u + v) + w2 + (u + v) − w2 = 2(u + v2 + w2 )
(u − v) + w2 + (u − v) − w2 = 2(u − v2 + w2 ).
On substraction we get,

((u + w) + v2 − (u + w) − v2 ) + ((u − w) + v2 − (u − w) − v2 )


= 2(u + v2 − u − v2 ).
Using the polarization identity, we get
3
Re u + w, v + Re u − w, v = 2 Re u, v (3.13)
Imu + w, v + Imu − w, v = 2 Imu, v.
Hence, u + w, v + u − w, v = 2u, v (3.14).
Putting u + w = x u − v = y and v = z, we obtain from (3.14)
7 8
x+y
x, z + y, z = 2 , z = x + y, z
2
since on putting w = u, (3.13) reduces to Re2u, v = Re2u, v for all
u, v ∈ E.
Thus condition (a) of 3.1 is proved.

(iv) Next we want to prove condition (b), i.e., αx, y = αx, y, for
every complex scalar α and ∀ x, y ∈ E. We shall prove it in stages.
Stage 1. Let λ = m, a positive integer, m > 1.

mx, y = (m − 1)x + x, y = (m − 1)x, y + x, y


= (m − 2)x, y + 2x, y
..
.
= x, y + (m − 1)x, y = mx, y.
Also for any positive integer n, we have
9x :  
1
n , y = n x, y = x, y
n n
Hilbert Space 99

9x : 1
Hence , y = x, y.
n n
If m is a negative integer, splitting m as (m − 1) + 1 we can show that (b)
is true.
m
Stage 2. Let α = r = be a rational number, m and n be prime to each
n
other.
m 9x : 9m :
Then rx, y = x, y = m ,y = x, y = rx, y.
n n n
Stage 3. Let α be a real number. Then there exists a sequence {rn } of
rational numbers, such that rn → α as n → ∞.

Hence rn x, y → αx, y


But rn x, y = rn x, y and rn x + y → αx + y
Therefore, αx, y = αx, y for any real α.

Stage 4. Let α = i. Then, the polarization identity yields


1
ix, y = [ix + y2 − ix − y2 + i(i(x + y))2 − (i(x − y))2 ]
4
i
= [x + y2 − x − y2 + i(x + iy2 − x − iy2 )]
4
= ix, y.

Stage 5. Finally, let α = p + iq, be any complex number, then,

αx, y = px, y + iqx, y = px, y + iqx, y


= (p + iq)x, y = αx, y.

Thus we have shown that  ,  is the inner product inducing the norm  · 
on E.
3.3.5 Lemma
Let E be an inner product space with an inner product  , .
(i) The linear space E is uniformly convex in the norm  · , that is,
for every > 0, there is some δ > 0, such that for all x, y ∈ E with
x ≤ 1, y ≤ 1 and x − y ≤ , we have x + y ≤ 2 − 2δ.
(ii) The scalar product is a continuous function with respect to norm
convergence.
Proof: (i) Let > 0. Given x, y ∈ E with x ≤ 1, y ≤ 1 and x−y ≥ .
Then ≤ x − y ≤ x + y ≤ 2.
The parallelogram law gives

x + y2 = x2 + y2 + [x, y + y, x]


= 2(x2 + y2 ) − [x2 − {x, y + y, x} + y2 ]
100 A First Course in Functional Analysis

= 2(x2 + y2 ) − x − y2 ≤ 4 − 2 .


√  2
 12
Hence, x + y ≤ 4 − 2 = 2 − 2δ if δ = 1 − 1 − 4 .

(ii) Let x, y ∈ E and xn −→ x, yn −→ y where xn , yn are elements of


E. Therefore xn  and yn  are bounded above and let M be an upper
bound of both xn and yn .

Hence, |xn , yn  − x, y| = |xn , yn  − xn , y + xn , y − x, y|


= |xn , yn − y + xn − x, y| ≤ M yn − y + M xn − x

using Cauchy-Bunyakovsky-Schwartz inequality, since xn −→ x and yn −→


y as n −→ ∞ we have from the above inequality,

xn , yn  −→ x, y as n −→ ∞.

This shows that x, y, which is a function on E × E is continuous in both


x and y.

3.4 Orthogonality
3.4.1 Definitions (orthogonal, acute, obtuse)
Let x, y be vectors in an inner product space.
(i) Orthogonal: x is said to be orthogonal to y or written as x ⊥ y if
x, y = 0.
(ii) Acute: x is said to be acute to y if x, y ≥ 0.
(iii) Obtuse: x is said to be obtuse to y if x, y ≤ 0.
4 4
In 2 or 3 , x is orthogonal to y if the angle between the vectors is 90◦ .
Similarly when x is acute to y, the angle between x and y is less than or
equal to 90◦ . We can similarly explain when x, y ≤ 0 the angle between x
and y is greater than or equal to 90◦ . This geometrical interpretation can
be extended to infinite dimensions in an inner product space.
3.4.2 Definition: subspace
A non-empty subset X of the inner product space E is said to be a
subspace of E if
(i) X is a (linear) subspace of E considered as a linear space.
(ii) X admits of a inner product  , X induced by the inner product
 ,  on E, i.e.,
x, yX = x, y ∀ x, y ∈ E.
Note 3.4.1. A subspace X of an inner product E is itself an inner product
space and the induced norm  · X on X coincides with the induced norm
 ·  on E.
Hilbert Space 101

3.4.3 Closed subspace


A subspace X of an inner product space E is said to be a closed subspace
of E if X is closed with respect to  · X induced by the  ·  on E.
Note 3.4.2. Given a Hilbert space H, when we call X a subspace (closed
subspace) of H we treat H as an inner product space and X its subspace
(closed subspace).
Note 3.4.3. Every subspace of a finite dimensional inner product space
is closed.
This is not true in general, as the following example shows.
3.4.4 Example
Consider the Hilbert space l2 and let X be the subset of all finite
sequences in l2 given by
X = {x = {ξi } ∈ l2 : ξi = 0 for i > N, N is some positive integer}.
X is a proper subspace of l2 , but X is dense in l2 . Hence X = l2 = X.
Hence X is not closed in l2 .
Problems [3.1–3.4]
1. Let α1 , α2 , . . . αn be n strictly positive real numbers. Show that
4 4
the function of two variables ·, · : n × n −→ , defined by 4

4
n
x, y = αi xi yi , is an inner product on n .
i=1
2. Show that equality holds in the Cauchy-Bunyakovsky-Schwartz
inequality, (i.e., |x, y| = xy) if and only if x and y are linearly
dependent.
3. Let  ,  be an inner product on a linear space E. For x = θ, y = θ
in E define the angle between x and y as follows:

Re (x, y)
θx,y = arc cos + , 0 ≤ θx,y ≤ π
(x, x)(y, y)

Then show that θx,y is well-defined and satisfies the identity


1 1
x, x + y, y − x − y, x − y = 2x, x 2 y, y 2 cos θx,y .

4. Let · be a norm on a linear space E which satisfies the parallelogram


law
x + y2 + x − y2 = 2(x2 + y2 ), x, y ∈ E
1
For x, y ∈ E define x, y = [x + y2 − x − y2 + ix + iy2 − ix −
4
iy2 ]
Then show that  ,  is the unique inner product on E satisfying
+
x, y = x for all x ∈ E.
102 A First Course in Functional Analysis

4
5. [Limaye [33]] Let X be a normed space over . Show that the norm
satisfies the parallelogram law, if and only if in every plane through
the origin, the set of all elements having norm equal to 1 forms an
ellipse with its centre at the origin.
6. Let {xn } be a sequence in a Hilbert space H and x ∈ H such that
lim xn  = x, and lim xn , x = x, x. Show that lim xn = x.
n→∞ n→∞ n→∞
7. Let C be a convex set in a Hilbert space H, and d = inf{x, x ∈ C}.
If {xn } is a sequence in C such that lim xn  = d, show that {xn }
n→∞
is a Cauchy sequence.
8. (Pythagorean theorem) (Kreyszig [30]). If x ⊥ y is an inner
product on E, show that (fig. 3.2),

x + y2 = x2 + y2 .

y
y x+ y

x
Fig. 3.2

9. (Appolonius’ identity) (Kreyszig [30]). Verify by direct calculations


that for any three elements x, y and z in an inner product space E,
1 1
z − x2 + z − y2 = x − y2 + 2z − (x + y)2 .
2 2
Show that this identity can also be obtained from the parallelogram
law.

3.5 Orthogonal Projection Theorem


In 1.4.7 we have defined the distance of a point x from a set A in a metric
space E which runs as follows:

D(x, A) = inf ρ(x, y). (3.15)


y∈A

In case E is a normed linear space, thus:

D(x, A) = inf x − y (3.16)


y∈A

If ŷ is the value of y for which the infimum is attained then, D(x, A) =


x − ŷ, ŷ ∈ A. Hence ŷ is the element in A closest to x. The existence
Hilbert Space 103

of such an element ŷ is not guaranteed and even if it exists it may not be


unique. Such behaviour may be observed even if A happens to be a curve
4
in 2 . For example let A be an open line segment in 2 . 4

y2

A A y1

x x A
∧ ∧ ∧
No y (A unique y ) (infinitely many y ’s)
Fig. 3.3 Fig. 3.4 Fig. 3.5

Existence and uniqueness of points ŷ A satisfying (3.16) where the given


4
set E, A ⊂ 2 is an open segment [in fig. 3.3 and in fig. 3.4] and is a
circular arc [fig. 3.5].
The study of the problem of existence and uniqueness of a point in
a set closest to a given point falls within the purview of the theory of
optimization.
In what follows we discuss the orthogonal projection theorem in a
Hilbert space which partially answers the above problem.
3.5.1 Theorem (orthogonal projection theorem)
If x ∈ H (a Hilbert space) and L is some closed subspace of H, then x
has a unique representation of the form

x = y + z, (3.17)

with y ∈ L and z ⊥ L i.e. z is orthogonal to every element of L.


Proof: If x ∈ L, y = x and z = θ. Let us next say that x ∈ L. Let
d = inf x − y2 , i.e., d is the square of the distance of x from L. Let {yn }
y∈L
be a sequence in L such that dn = x − yn 2 and let dn → d as n → ∞.
Let h be any non-zero element of L. Then yn + h ∈ L for any complex
number .

Therefore, x − (yn + h)2 ≥ d i.e. x − (yn + h), x − (yn + h) ≥ d


or, x − yn , x − yn  − h, x − yn  − x − yn , h + h, h ≥ d
or, x − yn 2 − h, x − yn  − x − yn , h + | |2 h2 ≥ d,
x − yn , h
Let us put =
h2
The above inequality reduces to,
|x − yn , h|2
x − yn 2 − ≥ d,
h2
104 A First Course in Functional Analysis

|x − yn , h|2
or ≤ (dn − d).
h2

or |x − yn , h| ≤ h dn − d. (3.18)
Inequality (3.18) is evidently satisfied for h = 0.
It then follows that
|ym − yn , h| ≤ |x − yn , h| + |x − ym , h|
√ √
≤ ( dn − d + dm − d)h (3.19)
(3.19) yields
|yn − ym , h| + +
≤ ( dn − d + dm − d)
h
Taking supremum of LHS we obtain,
√ √
yn − ym  ≤ ( dn − d + dm − d) (3.20)

Since dn → d as n → ∞, the above inequality shows that {yn } is a


Cauchy sequence. H being complete, {yn } → some element y ∈ H. Since
L is closed, y ∈ L. It then follows from (3.18) that

x − yn , h = 0 (3.21)

where h is an arbitrary element of L.


Hence x − y is perpendicular to any element h ∈ L, i.e. x − y ⊥ L.
Setting z = x − y we have
x=y+z (3.22)
Next we want to show that the representation (3.22) is unique. If that
be not so, let us suppose there exist y and z  such that

x = y + z (3.23)

It follows from (3.22) and (3.23) that

y − y = z  − z. (3.24)

Since L is a subspace and y, y ∈ L ⇒ y − y  ∈ L.


Similarly z − z  ⊥ L. Hence (3.24) can be true if and only if y − y  =

z − z = θ showing that the representation (3.22) is unique. Otherwise we
may note that y − y  2 = y − y  , z  − z = 0 since y − y  is ⊥ to z  − z.
Hence y = y  and z = z  . In (3.22) y is called the projection of x on L.
3.5.2 Lemma
The collection of all elements z orthogonal to L(= Φ) forms a
closed subspace M (say).
Let z1 , z2 ⊥ to L but y ∈ L where L is non-empty.
Hilbert Space 105

Then, y, z1  = 0, y, z2  = 0.


Therefore, for scalars α, β, y, αz1 + βz2  = αy, z1  + βy, z2  = 0.
Again, let {zn } → z in H, where {zn } is orthogonal to L. Then, a scalar
product being a continuous function
|y, zn  − y, z| = |y, zn − z| ≤ yzn − z, y ∈ L
→ 0 since zn → z as n → ∞.
Therefore, y, z = lim y, zn  = 0.
n→α
Hence, z ⊥ L.

Thus the elements orthogonal to L form a closed subspace M . We write


L ⊥ M . M is called the orthogonal complements of L, and is written
as M = L⊥ .
4
Note 3.5.1. In 2 if the line L is the subspace, the projection of any
vector x with reference to a point 0 on L (the vector x not lying on L) is
given by,
A

O L
B
Fig. 3.6 Projection of x on L

Here, OB is the projection vector x on L. B is the point on L closest


from A, the end point of x.
Next, we present a theorem enumerating the conditions under which
the points in a set closest from a point (not lying on the set) can be found.
3.5.3 Theorem
Let H be a Hilbert space and L a closed, convex set in H and x ∈ H −L,
then there is a unique y0 ∈ L such that

x − y0  = inf [x − y]


y∈L

Proof: Let d = inf x0 − y2 . Let us choose {yn } in L such that
y∈L

lim x − yn 2 = d (3.25)
n→∞

Then by parallelogram law,

(ym − x) − (yn − x)2 + (ym − x) + (yn − x)2


= 2(ym − x2 + yn − x2 )
106 A First Course in Functional Analysis

ym + y n
or ym − yn 2 = 2(ym − x2 + yn − x2 ) − 4 − x2 .
2
ym + yn
Since L is convex, ym , yn ∈ L ⇒ ∈L
2
ym + yn
Hence  − x2 ≥ d.
2
Hence, ym − yn 2 ≤ 2ym − x2 + 2yn − x2 − 4d.
Using (3.25) we conclude from above that {yn } is Cauchy.
Since L is closed, {yn } → y0 (say) ∈ L as n → ∞.
Then x − y0 2 = d.
For uniqueness, let us suppose that x − z0 2 = d, where z0 ∈ L.
Then, y0 − z0 2 = (y0 − x) − (z0 − x)2
= 2y0 − x2 + 2z0 − x2 − (y0 − x) + (z0 − x)2
0 02
0 y0 + z0 0
= 4d − 4 0
0 2 − x 0 ≤ 4d − 4d = 0
0
Hence y0 = z0 .

3.5.4 Theorem
Let H be a Hilbert space and L be a closed convex set in H and
x ∈ H −L. Then there is a unique y0 ∈ L (i) such that x−y0  = inf x−y
y∈L
and (ii) x − y0 ∈ L⊥ .
Theorem 3.5.3 guarantees the existence of a unique y0 ∈ L such that
x − y0  = inf x − y.
y∈L
If y ∈ L and is a scalar, then y0 + y ∈ L, so that

x − (y0 + y)2 ≥ y0 − x2


and − x − y0 , y − y, x − y0  + | |2 y2 ≥ 0 (3.26)
2 2
or | | y ≥ 2 Re { (x − y0 ), y}.
If is real and = β > 0
2 Re x − y0 , y ≤ βy2 (3.27)
If = iβ, β real and > 0, and if x − y0 , y = γ + iδ (3.26) yields,
iβ(γ + iδ) − iβ(γ − iδ) + β 2 y|2 ≥ 0 or, 2βδ ≤ β 2 y2
or, 2 Imx − y0 , y ≤ βy2 (3.28)

Since β > 0 is arbitrary, it follows from (3.27) and (3.28) that


x − y0 , y = 0 and x − y0 is perpendicular to any y ∈ L. Thus x − y0 ⊥ L.
Therefore x − y0 = z (say), where z ∈ L⊥ . Thus, x = y0 + z, where
y0 ∈ L and z ∈ L⊥ . y0 is the vector in L at minimum distance from x. y0
is called the orthogonal projection of x on L.
Hilbert Space 107

3.5.5 Lemma
In order that a linear subspace L is everywhere dense in a Hilbert space
H, it is necessary and sufficient that an element exists which is different
from zero and orthogonal to all elements of M .
Proof: Since M is everywhere dense in H, x ⊥ M implies that x ⊥ M . By
hypothesis M = H and consequently, x ⊥ H, in particular x ⊥ x implying
x = θ.
Conversely, let us suppose M is not everywhere dense in H. Then
M = H and there is an x ∈ M and x ∈ H. By theorem 3.5.1 x = y + z,
y ∈ M , z ⊥ M , and since x ∈ M , it follows that z = θ, which is a
contradiction to our hypothesis. Hence M = H.

3.6 Orthogonal Complements, Direct Sum


In 3.5.2, we have defined the orthogonal complement of a set L in a Hilbert
space H as
L1 = {y|y ∈ H, y ⊥ x for every x ∈ L} (3.29)
We write (L⊥ )⊥ = L⊥⊥ , (L⊥⊥ )⊥ = L⊥⊥⊥ etc.
Note 3.6.1. (i) {θ}⊥ = X and X ⊥ = {θ}, i.e., θ is the only vector
orthogonal to every vector.
Note that {θ}⊥ = {x ∈ H : x, θ = 0} = H.
Since x, θ = 0 for all x ∈ H. Also if x = θ, then x, x = 0. Hence
a non-zero vector x cannot be orthogonal to the entire space. Therefore,
H ⊥ = {θ}.
(ii) If L = Φ is a subset of H, then the set L⊥ is a closed subspace of
H. Furthermore L ∩ L⊥ is either {θ} or empty (when θ ∈ L).
For the first part of the above see Lemma 3.5.2. For the second part,
let us suppose L ∩ L⊥ = Φ and let x ∈ A ∩ A⊥ . Then we may have x ⊥ x.
Hence x = θ.
(iii) If A is a subset of H, then A ⊆ A⊥⊥ . Let x ∈ A. Then, x ⊥ A⊥
which means that x ∈ (A⊥ )⊥ .
(iv) If A and B are subsets of H such that A ⊆ B, then A⊥ ⊃ B ⊥ .
Let x ∈ B ⊥ , then x, y = 0 ∀ y ∈ B and therefore ∀ y ∈ A since
A ⊆ B. Thus x ∈ B ⊥ ⇒ x ∈ A⊥ that is, B ⊥ ⊂ A⊥ .
(v) If A = Φ is a subset of H, then A⊥ = A⊥⊥⊥ . Changing A by A⊥ in
(iii) we get, A⊥ ⊆ A⊥⊥⊥ .
Since A ⊆ A⊥⊥ , it follows for (iv) that A⊥ ⊇ A⊥⊥⊥ or A⊥⊥⊥ ⊆ A⊥ . Hence
it follows from the above two inclusions that A⊥ = A⊥⊥⊥ .
3.6.1 Direct sum
In 1.3.9. we have seen that a linear space E can be expressed as the
direct sum of its subspaces X1 , X2 . . . Xn if every x ∈ E can be expressed
108 A First Course in Functional Analysis

uniquely in the form

x = x1 + x2 + · · · + xn , xi ∈ Xi .

In that case we write E = X1 ⊕ X2 ⊕ X3 · · · ⊕ Xn . In what follows we


mention that using the orthogonal projection theorem [cf. 3.5.1] we can
partition a Hilbert space H into two closed subspaces L and its orthogonal
complement L⊥ .

i.e., x=y+z where x ∈ H, y ∈ L and z ⊥ L.


This representation is unique.
Hence we can write H = L ⊕ L⊥ . (3.30)

Thus the orthogonal projection theorem can also be stated as


follows:
If L is a closed subspace of a Hilbert space H, then H = L⊕L⊥ .
Proof: L⊥ is a closed subspace of H (cf. Lemma 3.5.2). Therefore, L and
L⊥ are orthogonal closed subspace of H. We next want to show that L+L⊥
is a closed subspace of H. Let z ∈ L + L⊥ . Then there exists a sequence
{zn } in L + L⊥ such that lim zn = z.
n→∞
Since zn ∈ L + L⊥ can be uniquely represented as zn = xn + yn , xn ∈
L, yn ∈ L⊥ .
N
Since xn ⊥ yn , n ∈ , by virtue of Pythagorean theorem we have

zm − zn 2 = (xm − xn ) + (ym − yn )2


= xm − xn 2 + ym − yn 2 → 0 as n → ∞.

Hence {zn } is Cauchy and {xn } and {yn } are Cauchy sequences in L
and L⊥ respectively. Since L and L⊥ are closed subspaces of H, they are
complete. Hence {xn } → x an element in L and {yn } → y, an element in
L⊥ as n → ∞.

Thus, z = lim zn = lim (xn + yn ) = x + y.


n→∞ n→∞
Hence, x + y ∈ L + L⊥ i.e. L + L⊥ is a closed subspace of H.

Hence L + L⊥ is complete. We next have to prove that L + L⊥ = H.


If that be not so, let L + L⊥ be a proper subspace of H. If L = H then
L⊥ = Φ. On the other hand if L⊥ = H, L = Φ. Hence L + L⊥ = H is true
in the above cases. Let us suppose that L + L⊥ = Φ is a complete proper
subspace of H. We want to show that (L + L⊥ )⊥ is = Φ. Now L + L⊥ is
a convex set. Let θ = x ∈ H − [L⊥ ]. Then by theorem 3.5.4 there exists a
y0 ∈ L + L⊥ s.t.

x − y0  = inf x − y = d > 0
y∈L+L⊥
Hilbert Space 109

and x − y0 ∈ L + L⊥ . Let z0 = x − y0 , z0 = θ, z0 ∈ H and z0 ∈ L + L⊥ .


This gives,
z0 , y + z = 0 ∀ y ∈ M and z ∈ M ⊥ .
⇒ z0 , y + z0 , z = 0 ∀ y ∈ M and z ∈ M ⊥
In particular, taking z = θ and y = θ respectively,
3
z0 , y = 0 ∀ y ∈ M
z0 , z = 0 ∀ z ∈ M⊥
Consequently, z0 ∈ M ∩ M ⊥ .
But M ∩ M ⊥ = {θ}, because they are each closed subspaces.

Therefore its follows that z0 = θ, which contradicts the fact that


L + L⊥ = Φ is a proper subspace of H.
Hence H = L + L⊥ . Since L ∩ L⊥ = {θ}, H = L ⊕ L⊥ .
3.6.2 Projection operator
Let H be a Hilbert space. L is a closed subspace of H. Then by
orthogonal projection theorem for every x ∈ H, y ∈ L and z ∈ L⊥ , x can
be uniquely represented by x = y + z. y is called the projection of x on
L. We write y = P x, P is called the projection mapping of H onto L, i.e.,
onto
P : H −→ L.

Since z = x − y = x − P x = (I − P )x.
Thus PH = L (I − P )H = L⊥
Now, P y = y, y ∈ L and P z = 0, z ∈ L⊥ .

Thus the range of P and its null space are mutually orthogonal. Hence the
projection mapping P is called an orthogonal projection.
3.6.3 Theorem
A subspace L of a Hilbert space H is closed if and only if L = L⊥⊥ .
If L = L⊥⊥ , then L is a closed subspace of H, because L⊥⊥ is already
a closed subspace of H [see note 3.6.1].
Conversely let us suppose that L is a closed subspace of H. For any
subset L of H, we have L ⊆ L⊥⊥ [see note 3.6.1]. So it remains to prove
that L⊥⊥ ⊆ L.
Let x ∈ L⊥⊥ . By projection theorem, x = y + z, y ∈ L and z ∈ L⊥ .
Since L ⊆ L⊥⊥ , it follows that y ∈ L⊥⊥ .
L⊥⊥ being a subspace of H, z = x − y ∈ L⊥⊥ .
Hence z ∈ L⊥ ∩ L⊥⊥ . As such z ⊥ z, i.e., z = θ. Hence, z = x − y = θ.
Thus x ∈ L. Hence L⊥⊥ ⊆ L. This proves the theorem.
110 A First Course in Functional Analysis

3.6.4 Theorem
Let L be a non-empty subset of a Hilbert space H. Then, span L is
dense in H if and only if L⊥ = {θ}.
Proof: We assume that span L is dense in H. Let M = span L so that
M = H. Now, {θ} ⊂ L⊥ . Let x ∈ L⊥ and since x ∈ H = M , there exists
a sequence {xn } ⊆ M such that

lim xn = x.
n→∞

Now, since x ∈ L⊥ and L⊥ ⊥ M , we have, x, xn  = 0, ∀ n.


Since the inner product is a continuous function, proceeding to the limits
in the above, we have x, x = 0 ⇒ x = θ.
Thus, {L⊥ } ⊆ {θ}. Hence L⊥ = θ.
Conversely let us suppose that L⊥ = {θ}. Let x ∈ M ⊥ . Then x ⊥ M
and in particular x ⊥ L. This verifies that x ∈ L⊥ = {θ}.
Hence M ⊥ = {θ}. But M is a closed subspace of H. Hence by the
projection theorem H = M .
Problems [3.5 and 3.6]
1. Find the projections of the position vector of a point in a two-
dimensional plane, along the initial line and the line through the
origin, perpendicular to the initial line.
2. Let Ox , Oy , Oz be mutually perpendicular axes through a fixed point
O as origin. Let the spherical polar coordinates of P be (r, θ, φ). Find
the projections of the position vector of P WRT O as the fixed point,
on the axes Ox , Oy , Oz respectively.
3. Let x1 , x2 , . . . xn satisfy xi = θ and xi ⊥ xj if i = j, i, j =
1, 2, . . . n. Show that the xi ’s are linearly independent and extend
the Pythagorean theorem from 2 to n dimensions.
4. Show that if M and N are closed subspaces of a Hilbert space H,
then M + N is closed provided x ⊥ y for all x ∈ M and y ∈ N .
5. Let H be a Hilbert space, M ⊆ H a convex subset, and {xn } a
sequence in M such that xn  → d as n → ∞ where d = inf x.
x∈M
Show that {xn } converges in H.
6. If M ⊆ H, a Hilbert space, show that M ⊥⊥ is the closure of the span
of M .
7. If M1 and M2 are closed subspaces of a Hilbert space H, then show
that (M1 ∩ M2 )⊥ equals the closure of M1⊥ ⊕ M2⊥ .
8. In the usual Hilbert space 42 find L⊥ if
(i) L = {x} where x has two non-zero components x1 and x2 .
4
(ii) L = {x, y} ⊂ 2 is a linearly independent set.
Hilbert Space 111

Hints: (i) If y ∈ L⊥ with components y1 , y2 , then x1 y1 = x2 y2 = 0.


(ii) If L is a linearly independent set then x = ky, k = 0 and a scalar.
9. Show that equality holds in the Cauchy-Bunyakovsky-Schwartz
inequality (i.e. |x, y| = xy) if and only if x and y are linearly
dependent vectors.

3.7 Orthogonal System


Orthogonal system plays a great role in expressing a vector in a Hilbert
space in terms of mutually orthogonal vectors. The consequences of the
orthogonal projection theorem are germane to this approach. In the two-
dimensional plane any vector can be expressed as the sum of the projections
of the vector in two mutually perpendicular directions. If e1 and e2 are
two unit vectors along the perpendicular axes Ox and Oy and P has co-
ordinates (x1 , x2 ) [fig. 3.7], then the position vector r of P is given by
r = x1 e1 + x2 e2 .
y
P
r
e2

O x
e1
Fig. 3.7 Expressing a vector in terms of two perpendicular vectors

The above expression of the position vector can be extended to n


dimensions.
3.7.1 Definition: orthogonal sets and sequences
A set L of a Hilbert space H is said to be an orthogonal set if its elements
are pairwise orthogonal. An orthonormal set is an orthogonal subset L ⊂ H
whose elements ei , ej satisfy the following
 condition (ei , ej ) = δij where δij
0, i = j
is the Kronecker symbol in δij =
1, i = j.
If the orthogonal or orthonormal set is countable, then we can call the
said set an orthogonal or orthonormal sequence respectively.
3.7.2 Example
2πinx
{e }, n = 0, 1, 2, . . . is an example of an orthonormal sequence in
the complex space L2 ([0, 1]).
3.7.3 Theorem (Pythagorean)
If a1 , a2 , . . . am are mutually orthogonal elements of the Hilbert space
H, then
112 A First Course in Functional Analysis

a1 + a2 + · · · + ak 2 = a1 2 + · · · + ak 2


Proof:, Since a1 , a2 , . . . ak are mutually orthogonal elements,
ai , aj  = ai 2 if i = j
=0 if i = j.

a1 + a2 + · · · + ak 2 = a1 + a2 + · · · + ak , a1 + a2 + · · · + ak 

k 
k 
k 
k
= ai , aj  = ai , ai  = ai 2 .
i=1 j=1 i=1 i=1

since ai , aj  = 0 for i = j.

3.7.4 Lemma (linear independence)


An orthonormal set is linearly independent.
Proof: Let {e1 , e2 , . . . en } be an orthonormal set. If the set is not linearly
independent, we can find scalars α1 , α2 , . . . αn not all zeroes such that
α1 e1 + α2 e2 + · · · αn en = θ.
Taking scalar products of both sides with αj , we get αj (ei , ej ) = 0 since
(ei , ej ) = 0, i = j.
Therefore, αj = 0, for j = 1, 2, . . . n. Hence {ei } is linearly independent.
3.7.5 Examples
1. 4 n
(n dimensional Euclidean space): In n the orthonormal 4
set is given by e1 = (1, 0 · · · 0), e2 = (0, 1, . . . 0), . . . en = (0, 0, . . . 1)
respectively.
2. l2 space: In l2 space the orthonormal set {en } is given by,
e1 = (1, 0, . . . 0 · · · ), e2 = (0, 1, 0, . . . 0 · · · ), en = (0, 0, 0, . . . 0, 1 . . .)
respectively.
3. C([0, 2π]): The inner product space of all continuous real-valued
functions defined on [0, 2π] with the inner product given by: u, v =
 2π
0
u(t)v(t)dt.
We want to show that {un (t)} where un (t) = sin nt is an orthogonal
sequence.
 2π  2π
un (t), um (t) = un (t)um (t)dt = sin nt sin mt dt
0 0

0 n = m
=
π n = m = 1, 2, . . .

Hence un (t) = π.
 ,
1
Hence, √ sin nt is an orthonormal sequence.
π
On the other hand if we take vn = cos nt
Hilbert Space 113

 2π
vn , vm  = cos nt cos mt dt
⎧0
⎨ 0, m = n
= π, m = n = 1, 2 · · ·

2π, m = n = 0.

Hence vn  = π for n = 0.
1 cos t cos nt
Therefore, √ , √ , √ is an orthonormal sequence.
2π π π

3.7.6 Gram-Schmidt orthonormalization process


Theoretically, every element in a Hilbert space H can be expressed in
terms of any linearly independent set. But the orthonormal set in H has an
edge over other linearly independent sets in this regard. If {ei } is a linearly
independent set and x ∈ H, a Hilbert space, then x can be expressed in
terms of ei as follows:

n
x= ci ei ,
i=1

In case ei , ej , i, j = 1, 2, . . . n are mutually orthogonal, then ci = x, ei ,


since ei , ej  = 0 for i = j.
Thus, in the case of an orthonormal system it is very easy to find the
coefficients ci , i = 1, 2, . . . n.
Another advantage of the orthonormal set is as follows. Suppose
we want to add cn+1 en+1 to x so that x = x + cn+1 en+1 ∈ span
{e1 , e2 , . . . en+1 }.

Now, x, ej  = x, ej  + cn+1 en+1 , ej  = x, ej  = cj , j = 1, 2, . . . n.


x, en+1  = x, en+1  + cn+1 en+1 , en+1  = cn+1 ,

since x ∈ span {e1 , e2 , . . . en }. Thus determination of cn+1 does not depend


on the values c1 , c2 , . . . cn .
In what follows we explain the Gram-Schmidt ortho-
normalization process.
Let {an } be a (finite or countably infinite) set of vectors in a Hilbert
space H. The problem is to convert the set into an orthonormal set {en }
such that span {ei , e2 , . . . en } = span {a1 , a2 , . . . an }, for each n. A few
steps are written down.
a1
Step 1. Normalize a1 which is necessarily non-zero so that e1 = .
a1 
Step 2. Let g2 = a2 − a2 , e1  so that g2 , e1  = 0.
g2
Here take, e2 = , so that e2 ⊥ e1 .
g2 
114 A First Course in Functional Analysis

Here, g2  = 0, because otherwise g2 = 0 and hence a2 and a1


are linearly dependent, which is a contradiction. Since g2 is a linear
combination of a2 and a1 , we have span {a2 , a1 } = span {e1 , e2 }.
Let us assume by way of induction that

gm = am − am , e1 e1 − am , e2 e2 · · · am , em−1 em−1


and gm = 0
then gm , e1  = 0, . . . gm , em−1  = 0 i.e. gm ⊥ e1 , e2 , . . . em−1 .
gm
We take em = .
gm 
We assume span {a1 , a2 , . . . am−1 , am } = span {e1 , e2 , . . . em−1 , em }
Next we take
gm+1 = am+1 − am+1 , e1 e1 − am+1 , e2 e2 − em+1 , em em .
Now, gm+1 , e1  = gm+1 , e2  · · · = gm+1 , em  = 0.
Thus gm+1 ⊥ e1 , e2 , . . . em .
gm+1 = θ ⇒ am+1 is a linear combination of e1 , e2 , . . . em
⇒ am+1 is a linear combination of a1 , a2 , . . . am which contradicts the
hypothesis that {a1 , a2 , . . . am+1 } are linearly independent.
Hence, gm+1 = θ, i.e., gm+1  =
 0.
gm+1
We write, em+1 = .
gm+1 
Hence, e1 , e2 , . . . em+1 form an orthonormal system.

3.7.7 Examples
1. (Orthonormal polynomials) Let L2,ρ ([a, b]) be the space
of square-summable functions with weight functions ρ(t). Let us
take a linearly independent set 1, t, t2 · · · tn · · · in L2,ρ ([a, b]). If we
orthonormalize the above linearly independent set, we get Chebyshev
system of polynomials, p0 = const., p1 (t), p2 (t), . . . , pn (t), . . . which are
orthonormal with weight ρ(t), i.e.,
 b
ρ(t)pi (t)pj (t)dt = δij .
a

We mention below a few types of orthogonal polynomials.

[a, b] ρ(t) Polynomial


(i) a = −1, b = 1 1 Legendre polynomials
−t2
(ii) a = −∞, b = ∞ e Hermite polynomials
−t
(iii) a = 0, b = ∞ e Laguerre polynomials
Hilbert Space 115

(i) Legendre polynomials


 1 √
Let us take g1 = e1 = 1, g1 , g1  = dt = 2, i.e., g1  = 2
−1
;
g1 1 2n + 1 
e1 = =√ = Pn (t)n=0
g1  2 2
1 dn 2
where Pn (t) = n (t − 1)n . (3.31)
2 n! dtn

a2 , g1  1 1
Take, g2 = a2 − g1 = t − t dt = t.
g1  2 −1
 1 ;
2 2 2 g2 3
g2  = t dt = , e2 = = t.
−1 3 g2  2
;
2n + 1 
Thus e2 = P2 (t)n=1 .
2
It may be noted that
1 dn 2n n
Pn (t) = [t − C1 t2n−1 + · · · (−1)n ]
2n n! dtn
applying binomial theorem

N
(2n − 2j)!
= (−1)j tn−2j (3.32)
j=0
2n j!(n− j)!(n − 2j)!
n (n − 1)
where N = if n is even and N = if n is odd.
2 2
Next, let us take g3 = a3 − a3 , e1 e1 − a3 , e2 e2
  1
1 1 2 3 1
= t2 − t dt − t t3 dt = t2 − .
2 −1 2 −1 3
 1 2
1 8
g3 2 = t2 − dt = .
−1 3 45
; ;
5 1 2 2n + 1 
e3 = · (3t − 1) = Pn (t)n=2
2 2 2
We next want to show that
 1  12 ;
2 2
Pn  = Pn (t)dt = (3.33)
−1 2n +1

Let us write v = t2 − 1. The function v n and its derivatives


(v ) , . . . (v n )(n−1) are zero at t = ±1 and (v n )(2n) = (2n)!. Integrating
n 1

n times by parts, we thus obtain from (3.31)


 1
2 2
(2 n!) Pn  =
n
(v n )(n) (v n )(n) dt
−1
116 A First Course in Functional Analysis

1  1

= (v n )(n−1) (v n )(n)  − (vn )(n−1) (v n )(n+1) dt.
−1 −1
 1  1
= (−1)n (2n)! vn dt = 2(2n)! (1 − t2 )n dt
−1 −1
 π
2
= 2(2n)! cos2n+1 α dα (t = sin α)
0
22n+1 (n!)2
= .
2n + 1
0; 0
0 2n + 1 0
0 0
Thus 0 Pn 0 = 1.
0 2 0
 1
1
Next, Pm , Pn  = m+n ((t2 − 1)m )(m) ((t2 − 1)n )(n) dt
2 m!n! −1
 1
1
= m+n (v m )(m) (v n )(n) dt
2 m!n! −1
where m > n (suppose) and v = t2 − 1.
 +1
1 
= m+n (v m )(m) (v n )(n−1) 
2 m!n! −1
 1 
−2m (v m )(m−1) (vn )(n+1) dt
−1
 1
(−1)n 2m · · · 2(m − n)
= (v m )(m−n) dt = 0 m > n.
2m+n m!n! −1
A similar conclusion is drawn if n > m.

Thus, {Pn (t)} as given in (3.31) is an orthonormal sequence of


polynomials.

The Legendre polynomials are solutions of the important Legendre


differential equation

(1 − t2 )Pn − 2tPn + n(n + 1)Pn = 0 (3.34)

and (3.32) can also be obtained by applying the power series method to
(3.34).
Hilbert Space 117

y
P0
1

P1

P2
x
−1 1

−1

Fig. 3.8 Legendre polynomials

Remark 1. Legendre polynomials find frequent use in applied


mathematics, especially in quantum mechanics, numerical analysis, theory
of approximations etc.
(ii) Hermite polynomials
Since a = −∞ and b = ∞, we consider L2 ([−∞, ∞]) which is also
a Hilbert space. Since the interval of integration is infinite, we need
to introduce a weight function which will make the integral convergent.
t2
We take the weight function w(t) = e− 2 so that the Gram-Schmidt
orthonormalization process is to be applied to w, wt, wt2 . . . etc.
 ∞  ∞  ∞
2 2 √
a0 2 = w2 (t)dt = e−t dt = 2 e−t dt = π,
−∞ −∞ 0
t2
w e− 2
e0 = √ 1 = √ 1
( π) 2 ( π) 2
∞
a1 , g0  [ −∞ w2 tdt]
Take g1 = a 1 − g0 = wt − √
g0 2 π
 ∞ − 3t2
te 2 dt
= wt − −∞ √ = wt.
π
 ∞  ∞
2
2 2
2
g1  = w t dt = t2 e−t dt
−∞ 0
 ∞
1 3
= 2 z 2 −1 e−z dz putting z = t2 .
2 0
 
3 1√
=Γ = π since Γ(n + 1) = nΓ(n).
2 2
√ − t2 <
g1 2te 2 1 t2
so that e1 (t) = = √ 1 = √ e− 2 · 2t
g1  ( π) 2 2·1 π
118 A First Course in Functional Analysis

<
1 2
− t2
We want to show that en (t) = √ e Hn (t), n ≥ 1 (3.35)
2n n! π
t2
e− 2
and e0 (t) = √ 1 H0 (t)
( π) 2

dn −t2 2
where H0 (t) = 1, Hn (t) = (−1)n et (e ), n = 1, 2, 3, . . . (3.36)
dtn
Hn are called Hermite polynomials of order n. H0 (t) = 1.
Performing the differentiations indicated in (3.36) we obtain


N
2n−2j
Hn (t) = n! (−1)j tn−2j (3.37)
j=0
j!(n − 2j)!
n (n − 1)
where N = if n is even and N = if n is odd. The above form
2 2
can also be written as
N
(−1)j
Hn (t) = n(n − 1) · · · (n − 2j + 1)(2t)n−2j (3.38)
j=0
j!
t2
e− 2
Thus e0 (t) = √ 1 H0 (t)
( π) 2
√ − t2 t2
<
2te 2 e− 2 (2t) 1 t2 
e1 (t) = √ 1 = √ 1 = n
√ e− 2 Hn (t)n=1 .
( π) 2 (2 π) 2 2 n! π

(3.36) yields explicit expressions for Hn (t) as given below for a few values
of n.
H0 (t) = 1 H1 (t) = 2t
H2 (t) = 4t2 − 2 H3 (t) = 8t3 − 12t
H4 (t) = 16t4 − 48t2 + 12 H5 (t) = 32t5 − 160t3 + 120t.
We next want to show that {en (t)} is orthonormal where en (t) is given
by (3.35) and Hn (t) by (3.37),
<  ∞
1 2
en (t), em (t) = n+m
√ 2
e−t Hm Hn dt.
2 n!m!( π) −∞
Differentiating (3.38) we obtain for n ≥ 1,

M
(−1)j
Hn (t) = 2n (n − 1)(n − 2) · · · (n − 2j)(2t)n−1−2j
j=0
j!
= 2nHn−1 (t)
(n − 2) (n − 1)
where N = if n is even and N = if N is odd.
2 2
Hilbert Space 119

2
Let us assume m ≤ n and u = e−t . Integrating m times by parts we
obtain from the following integral,
 ∞
2
(−1) n
e−t Hm (t)Hn (t)dt
−∞
 ∞
= Hm (t)(u)(n) dt
−∞
∞  ∞

= Hm (t)(u)(n−1)  − 2m Hm−1 (u)n−1 dt
−∞ −∞
 ∞
= −2m Hm−1 (u)(n−1) dt
−∞
= ······
 ∞
= (−1)m 2m m! H0 (t)(u)(n−m) dt
−∞
∞

= (−1)m 2m m!(u)(n−m−1) 
−∞
=0 if n > m because as t → ±∞, t2 → ∞ and u → 0.
This proves orthogonality of {em (t)}, when m = n
 ∞  ∞
2 2
(−1)n e−t Hn2 (t)dt = (−1)n 2n n! H0 (t)e−t dt
−∞ −∞
n n √
= (−1) 2 n! π.
 ∞
2 √
Hence e−t Hn2 (t)dt = 2n n! π.
−∞
This proves orthonormality of {en }.

Hermite polynomials Hn satisfy the Hermite differential equations Hn −


2tHn + 2nHn = 0.
Like Legendre polynomials, Hermite polynomials also find applications
in numerical analysis, approximation theory, quantum mechanics etc.
(iii) Laguerre polynomials
We consider L2 ([0, ∞)) and apply Gram-Schmidt process to the
sequence defined by
t t t
e− 2 , te− 2 , t2 e− 2 .
 ∞
t
Take g0 = e− 2 , g0 2 = e−t dt = 1.
0
g0 t
e0 = = e− 2
g0 
t t t t
g1 = te− 2 − te− 2 , e− 2 e− 2
120 A First Course in Functional Analysis

 
− 2t − 2t

−t te−t ∞ ∞
te ,e = te dt =  + e−t dt = 1.
0 −1 0 0
t
g1 = (t − 1)e− 2
 ∞
t t
g1 2 = (t − 1)e− 2 , (t − 1)e− 2  = (t − 1)2 e−t dt = 1.
0
− 2t
e1 (t) = (t − 1)e .
− 2t
Let us take en (t) = e Ln (t), n = 0, 1, 2 (3.39)
where the Laguerre polynomials of order n is defined by
et dn n −t
L0 (t) = 1, Ln (t) = (t e ), n = 1, 2 (3.40)
n! dtn
 (−1)j
n  
n
i.e. Ln (t) = − tj (3.41)
j! j
j=0

Explicit expressions for the first few Laguerre polynomials are

L0 (t) = 1, L1 (t) = 1 − t
1 3 1
L2 (t) = 1 − 2t + t2 L3 (t) = 1 − 3t + t2 − t3 .
2 2 6
2 2 3 1 4
L4 (t) = 1 − 4t + 3t − t + t
3 24
The Laguerre polynomials Ln are solutions of the Laguerre differential
equations
tLn + (1 − t)Ln + nLn = 0. (3.42)
In what follows we find $e_2(t)$ by the Gram-Schmidt process. We know that
\[
g_2(t) = a_2 - \langle a_2, e_1\rangle e_1 - \langle a_2, e_0\rangle e_0, \qquad a_2(t) = t^2 e^{-t/2}.
\]
Now
\[
\langle a_2, e_1\rangle = \int_0^{\infty} t^2 (t-1) e^{-t}\, dt = \int_0^{\infty} (t^3 - t^2) e^{-t}\, dt = 2\int_0^{\infty} t^2 e^{-t}\, dt = 4,
\]
\[
\langle a_2, e_0\rangle = \int_0^{\infty} t^2 e^{-t}\, dt = 2.
\]
Hence
\[
g_2(t) = t^2 e^{-t/2} - 4(t-1)e^{-t/2} - 2e^{-t/2} = (t^2 - 4t + 2)e^{-t/2},
\]
\[
\|g_2(t)\|^2 = \int_0^{\infty} (t^2 - 4t + 2)^2 e^{-t}\, dt = \int_0^{\infty} (t^4 - 8t^3 + 20t^2 - 16t + 4) e^{-t}\, dt = 4,
\]
so that
\[
e_2(t) = \Bigl(1 - 2t + \tfrac{1}{2}t^2\Bigr) e^{-t/2}.
\]
The orthogonality of $\{L_m(t)\}$ can be proved as before.
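The Gram-Schmidt computation above lends itself to a short numerical sketch (ours, not the book's; it assumes NumPy, and the helper names are hypothetical). Applying the procedure to t^k e^{-t/2}, k = 0, 1, 2, recovers e_2(t) = (1 - 2t + t^2/2) e^{-t/2}.

import numpy as np
from numpy.polynomial.laguerre import laggauss

t, w = laggauss(40)   # nodes/weights for integrals of the form int_0^inf e^{-t} f(t) dt

def ip(p, q):
    # <p(t) e^{-t/2}, q(t) e^{-t/2}> in L^2([0, inf)) for coefficient arrays (highest power first)
    return float(np.sum(w * np.polyval(p, t) * np.polyval(q, t)))

basis = []
for k in range(3):                       # a_k(t) = t^k e^{-t/2}, k = 0, 1, 2
    p = np.zeros(k + 1)
    p[0] = 1.0                           # coefficients of t^k
    for q in basis:
        p = np.polysub(p, ip(p, q) * q)  # subtract the projections (Gram-Schmidt)
    basis.append(p / np.sqrt(ip(p, p)))  # normalise

print(np.round(basis[2], 6))             # -> [ 0.5 -2.  1. ], i.e. (t^2 - 4t + 2)/2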

3.7.8 Fourier series


Let $L$ be a linear subspace of a Hilbert space spanned by $e_1, e_2, \ldots, e_n, \ldots$ and let $x \in L$. Therefore, for every $\varepsilon > 0$ there is a linear combination $\sum_{i=1}^{n} \alpha_i e_i$ such that $\bigl\| x - \sum_{i=1}^{n} \alpha_i e_i \bigr\| < \varepsilon$. Hence
\[
\Bigl\| x - \sum_{i=1}^{n} \alpha_i e_i \Bigr\|^2
= \Bigl\langle x - \sum_{i=1}^{n} \alpha_i e_i,\; x - \sum_{i=1}^{n} \alpha_i e_i \Bigr\rangle
\]
\[
= \langle x, x\rangle - \Bigl\langle x, \sum_{i=1}^{n} \alpha_i e_i \Bigr\rangle - \Bigl\langle \sum_{i=1}^{n} \alpha_i e_i,\, x \Bigr\rangle + \Bigl\langle \sum_{i=1}^{n} \alpha_i e_i,\, \sum_{i=1}^{n} \alpha_i e_i \Bigr\rangle
\]
\[
= \|x\|^2 - \sum_{i=1}^{n} \overline{\alpha}_i \langle x, e_i\rangle - \sum_{i=1}^{n} \alpha_i \langle e_i, x\rangle + \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \overline{\alpha}_j \langle e_i, e_j\rangle
\]
\[
= \|x\|^2 - \sum_{i=1}^{n} \overline{\alpha}_i\, d_i - \sum_{i=1}^{n} \alpha_i\, \overline{d}_i + \sum_{i=1}^{n} |\alpha_i|^2, \qquad \text{where } d_i = \langle x, e_i\rangle.
\]
Therefore,
\[
\Bigl\| x - \sum_{i=1}^{n} \alpha_i e_i \Bigr\|^2 = \|x\|^2 - \sum_{i=1}^{n} |d_i|^2 + \sum_{i=1}^{n} |\alpha_i - d_i|^2.
\]
The numbers $d_i$ are called the Fourier coefficients of the element $x$ with respect to the orthonormal system $\{e_i\}$. The expression on the RHS, for different values of $\alpha_i$, takes its least value when $\alpha_i = d_i$, the $i$th Fourier coefficient of $x$. Hence,
\[
0 \le \Bigl\| x - \sum_{i=1}^{n} d_i e_i \Bigr\|^2 = \|x\|^2 - \sum_{i=1}^{n} |d_i|^2 < \varepsilon. \tag{3.43}
\]
$\varepsilon$ being arbitrarily small, it follows that
\[
x = \lim_{n \to \infty} \sum_{i=1}^{n} d_i e_i = \sum_{i=1}^{\infty} d_i e_i.
\]
The convergence of the series $\sum_{i=1}^{\infty} |d_i|^2$ also follows from (3.43), and moreover
\[
\sum_{i=1}^{\infty} |d_i|^2 = \|x\|^2. \tag{3.44}
\]
Next let x be an arbitrary element in the Hilbert space H. Let y be the
projection of x on L.
Then
\[
y = \sum_{i=1}^{\infty} d_i e_i, \quad \text{where } d_i = \langle y, e_i\rangle = \langle x, e_i\rangle, \qquad \text{and} \qquad \sum_{i=1}^{\infty} |d_i|^2 = \|y\|^2.
\]
Since $x = y + z$, $y \in L$, $z \perp L$, it follows from the Pythagorean theorem that
\[
\|x\|^2 = \|y\|^2 + \|z\|^2 \ge \|y\|^2. \tag{3.45}
\]
Consequently, for any element $x$ in $H$ the inequality
\[
\sum_{i=1}^{\infty} |d_i|^2 \le \|x\|^2, \qquad d_i = \langle x, e_i\rangle, \quad i = 1, 2, 3, \ldots, \tag{3.46}
\]
holds. The above inequality is called Bessel's inequality.
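As a numerical illustration of Bessel's inequality (3.46) (our own sketch, assuming NumPy; the choice of x(t) = 1 + t and of the deliberately incomplete sine system is arbitrary), the sum of squared Fourier coefficients stays strictly below the squared norm of x:

import numpy as np

N = 200001
t = np.linspace(-np.pi, np.pi, N)
dt = t[1] - t[0]
x = 1.0 + t                              # x(t) = 1 + t in L^2([-pi, pi])

def integral(values):
    return float(np.sum(values) * dt)    # simple Riemann-sum quadrature

norm_sq = integral(x * x)                # ||x||^2 = 2*pi + 2*pi^3/3
# d_n = <x, e_n> for the (incomplete) orthonormal system e_n(t) = sin(nt)/sqrt(pi)
d = [integral(x * np.sin(n * t) / np.sqrt(np.pi)) for n in range(1, 200)]

print(sum(di * di for di in d), "<", norm_sq)
# the left-hand side approaches 2*pi^3/3 (the odd part of x only), so the inequality is strict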

3.8 Complete Orthonormal System


It is known that in the two-dimensional Euclidean space $\mathbb{R}^2$, any vector can be uniquely expressed in terms of two mutually orthogonal vectors of unit norm; in other words, any non-zero vector $x$ can be expressed uniquely in terms of the basis $e_1, e_2$. Similarly, in $\mathbb{R}^3$ we have three vectors $e_1, e_2, e_3$, each of unit norm, and the basis vectors are pairwise orthonormal. This concept can easily be extended to $n$-dimensional Euclidean space, where we have $n$ pairwise orthonormal basis vectors $e_1, e_2, \ldots, e_n$. But the question that invariably comes up is whether this concept can be extended to infinite dimensions. In other words, the question arises as to whether an infinite dimensional space, such as an inner product space, can contain a 'sufficiently large' set of orthonormal vectors such that any element $x \in H$ (a Hilbert space) can be uniquely represented by that set of orthonormal vectors.
Let $H \ne \{\theta\}$ be a Hilbert space. Then the collection $C$ of all orthonormal subsets of $H$ is clearly non-empty. It can easily be seen that the class $C$ can be partially ordered under the 'set inclusion' relation.
3.8.1 Definition: complete orthonormal system
V.A. Steklov first introduced the concept of a complete orthonormal
system. An orthonormal system {ei } in H is said to be a complete
orthonormal system if there is no non-zero x ∈ H such that x is orthogonal
to every element in {ei }. In other words, a complete orthonormal system
cannot be extended to a larger orthonormal system by adding new elements
to {ei }. Hence, a complete orthonormal system is maximal with respect to
inclusion.
3.8.2 Examples
1. The unit vectors ei = (1, 0, 0), e2 = (0, 1, 0), e3 = (0, 0, 1) in the
directions of three axes of rectangular coordinate system form a complete
orthonormal system.

2. In the Hilbert space l2 , the sequence {en } where en = {δnj }, is a


complete orthonormal system.
3. The orthonormal system in example 3.7.5(3), i.e., the sequence
\[
u_n: \quad \frac{1}{\sqrt{2\pi}},\ \frac{\cos t}{\sqrt{\pi}},\ \frac{\cos 2t}{\sqrt{\pi}}, \ldots,
\]
although an orthonormal set, is not complete, for $\langle \sin t, u_n\rangle = 0$ for every $n$. But the system $\frac{1}{\sqrt{2\pi}}, \frac{1}{\sqrt{\pi}}\cos t, \frac{1}{\sqrt{\pi}}\sin t, \frac{1}{\sqrt{\pi}}\cos 2t, \ldots$ is a complete orthonormal system.
3.8.3 Theorem
Every Hilbert space H ≠ {θ} contains a complete orthonormal set.
Consider the family C of orthonormal sets in H, partially ordered by set inclusion. For any non-zero x ∈ H, the set $\bigl\{ \frac{x}{\|x\|} \bigr\}$ is an orthonormal set. Therefore, C ≠ Φ. Now let us consider any totally ordered subfamily
in C. The union of sets in this subfamily is clearly an orthonormal set
and is an upper bound for the totally ordered subfamily. Therefore, by
Zorn’s lemma (1.1.4), we conclude that C has a maximal element which is
a complete orthonormal set in H.
The next theorem provides another characterization of a complete
orthonormal system.
3.8.4 Theorem
Let {e_i} be an orthonormal set in a Hilbert space H. Then {e_i} is a complete orthonormal set if and only if it is impossible to adjoin an additional element e ∈ H, e ≠ θ, to {e_i} such that {e_i, e} is an orthonormal set in H.
Proof: Suppose {e_i} is a complete orthonormal set. Let it be possible to adjoin an additional vector e ∈ H of unit norm, e ≠ θ, such that {e, e_i} is an orthonormal set in H, i.e., e ⊥ {e_i} and e is a non-zero vector of unit norm. But this contradicts the fact that {e_i} is a complete orthonormal set. On the other hand, let us suppose that it is impossible to adjoin an additional element e ∈ H, e ≠ θ, to {e_i} such that {e, e_i} is an orthonormal set. In other words, there exists no non-zero e ∈ H of unit norm such that e ⊥ {e_i}. Hence the system {e_i} is a complete orthonormal system.
In what follows we define a closed orthonormal system in a Hilbert space
and show that it is the same as a complete orthonormal system.
3.8.5 Definition: closed orthonormal system
An orthonormal system {ei } in H is said to be closed if the subspace
L spanned by the system coincides with H.
3.8.6 Theorem
A Fourier series with respect to a closed orthonormal system,
constructed for any x ∈ H, converges to this element and for
every x ∈ H, the Parseval-Steklov equality
\[
\sum_{i=1}^{\infty} d_i^2 = \|x\|^2 \tag{3.47}
\]
holds.
Proof: Let {ei } be a closed orthonormal system. Then the subspace
spanned by {ei } coincides with H.
Let x ∈ H be any element. Then the Fourier series (c.f. 3.7.8) with respect to the closed system is given by
\[
x = \sum_{i=1}^{\infty} d_i e_i, \qquad d_i = \langle x, e_i\rangle.
\]
Then, by the relation (3.43), we have
\[
\|x\|^2 = \sum_{i=1}^{\infty} d_i^2, \qquad \text{where } d_i = \langle x, e_i\rangle \text{ and } \|e_i\|^2 = 1.
\]

3.8.7 Corollary
An orthonormal system is complete if and only if the system
is closed.
Let {e_i} be a complete orthonormal system in H. If {e_i} is not closed, let the subspace spanned by {e_i} be L, where L ≠ H. Then there is a non-zero x ∈ H not belonging to L; writing x = y + z with y the projection of x on L and z ⊥ L, the element z is non-zero and orthogonal to every e_i. This contradicts the completeness of {e_i}. Hence {e_i} is closed.
Conversely, let us suppose that the orthonormal system {ei } is closed.
Then, by theorem 3.8.6, we have for any x ∈ H,


x2 = d2i where di = x, ei .
i=1

If x ⊥ e_i, i = 1, 2, ..., that is, d_i = ⟨x, e_i⟩ = 0 for i = 1, 2, ..., then ‖x‖ = 0, i.e., x = θ. This implies that the system {e_i} is complete.
3.8.8 Definition: orthonormal basis
A closed orthonormal system in a Hilbert space H is also called an
orthonormal basis in H.
Note 3.8.1 We note that completeness of {e_n} is tantamount to the statement that each x ∈ L² has the Fourier expansion
\[
x(t) = \frac{1}{\sqrt{2\pi}} \sum_{n=-\infty}^{\infty} x_n e^{int}. \tag{3.48}
\]
It must be emphasized that the expansion is not to be interpreted as saying that the series converges pointwise to the function x(t). One can only conclude that the partial sums of (3.48), i.e., the vectors
\[
u_n(t) = \frac{1}{\sqrt{2\pi}} \sum_{k=-n}^{n} x_k e^{ikt},
\]
converge to the vector x in the sense of L², i.e.,
\[
\|u_n(t) - x(t)\| \longrightarrow 0 \quad \text{as } n \longrightarrow \infty.
\]
This situation is often expressed by saying that x is the limit in the mean of the u_n's.
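A small numerical sketch (ours, assuming NumPy) of the remark above: for a discontinuous square wave the partial sums u_n converge to x in the mean, i.e. in the L² norm, although the sup-norm error near the jump does not go to zero.

import numpy as np

t = np.linspace(-np.pi, np.pi, 40001)
dt = t[1] - t[0]
x = np.sign(np.sin(t))                     # square wave, discontinuous at 0 and +-pi

def coeff(k):
    # x_k = <x, e_k> with e_k(t) = e^{ikt}/sqrt(2*pi), via a simple Riemann sum
    return np.sum(x * np.exp(-1j * k * t)) * dt / np.sqrt(2 * np.pi)

def partial_sum(n):
    # u_n(t) = (1/sqrt(2*pi)) * sum_{k=-n}^{n} x_k e^{ikt}
    u = sum(coeff(k) * np.exp(1j * k * t) for k in range(-n, n + 1)) / np.sqrt(2 * np.pi)
    return u.real

for n in (5, 20, 80):
    err = partial_sum(n) - x
    print(n,
          round(float(np.sqrt(np.sum(err ** 2) * dt)), 4),   # L2 error: decreases towards 0
          round(float(np.max(np.abs(err))), 4))              # sup error near the jump: does not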
3.8.9 Theorem
In a separable Hilbert space H a complete orthonormal set is
enumerable.
Proof: Let H be a separable Hilbert space, then ∃ an enumerable set
M , everywhere dense in H, i.e., M = H. Let {eλ |λ∈Λ } be a complete
orthonormal sets in H. If possible let the set {eλ |λ∈Λ } be not enumerable
then ∃ an element eβ ∈ {eλ |λ∈Λ } such that eβ ∈ M . Since M = H, eβ ∈ M ,
since M is dense in H, ∃ a separable set {xn } ⊆ M such that lim xn = eβ .
n→∞
Since $\{e_\lambda \mid \lambda \in \Lambda\}$ is a complete set in H and $x_n \in M \subset H$, $x_n$ can be expressed as
\[
x_n = \sum_{\substack{\lambda \in \Lambda \\ \lambda \ne \beta}} C^{(n)}_{\lambda} e_\lambda, \qquad
e_\beta = \lim_{n\to\infty} x_n = \lim_{n\to\infty} \sum_{\substack{\lambda \in \Lambda \\ \lambda \ne \beta}} C^{(n)}_{\lambda} e_\lambda.
\]
The above relation shows that $e_\beta$ is a linear combination of $\{e_\lambda \mid \lambda \in \Lambda,\ \lambda \ne \beta\}$,
which is a contradiction, since {eλ |λ∈Λ } is an orthonormal set and hence
linearly independent.
Hence {eλ |λ∈Λ } cannot be non-enumerable.
Problems [3.7 and 3.8]
1. Let {a1 , a2 , . . . an } be an orthogonal set in a Hilbert space H,
and α1 , α2 , . . . αn be scalars such that their absolute values are
respectively 1. Show that

α1 a1 + · · · + αn an  = a1 + a2 + · · · + an .
2. Let {en } be an orthonormal sequence in a Hilbert space H. If {αn }


be a sequence of scalars such that |αi |2 converges then show that
i=1


αi ei converges to an x H and αn = x, en , ∀ n ∈ N.
i=1

3. Let {e1 , e2 , . . . en } be a finite orthonormal set in a Hilbert space H


and let x be a vector in H. If $p_1, p_2, \ldots, p_n$ are arbitrary scalars, show that $\bigl\| x - \sum_{i=1}^{n} p_i e_i \bigr\|$ attains its minimum value precisely when $p_i = \langle x, e_i\rangle$ for each $i$.
4. Let {en } be a orthonormal sequence in a Hilbert space H. Prove that


|x, en y, en | ≤ xy, x, y ∈ H.
n=1

5. Show that on the unit disk B(|z| < 1) of the complex plane z = x+iy,
the functions
  12
k
gk (x) = z k−1 (k = 1, 2, 3)
π
form an orthonormal system under the usual definition of a scalar
product in a complex plane.
6. Let {eα |λ ∈ Λ} be an orthonormal set in a Hilbert space H.

(a) If x belongs to the closure of span {en }, then show that



 ∞
  
x= x, en en and x2 = x, en 2
n=1 n=1

where {e1 , e2 , . . .} = {en : x, en  = 0}.


(b) Prove that the span {en } is dense in H if and only if every x in
H has a Fourier expression as above and for every x, y in H, the
identity
∞
x, y = x, en en , y
n=1

holds, where {ei , e2 , . . .} = {eα : x, eα  = 0 and y, eα  =


 0}.

7. Let L be a complete orthonormal set in a Hilbert space H. If


u, x = v, x for all x ∈ L, show that u = v.
8. The Haar system in L2([0, 1]) is defined as follows:
(a)
\[
\varphi_0^{(0)}(x) = 1, \qquad x \in [0, 1];
\]
\[
\varphi_1^{(0)}(x) =
\begin{cases}
1, & x \in \bigl[0, \tfrac{1}{2}\bigr) \\[2pt]
-1, & x \in \bigl(\tfrac{1}{2}, 1\bigr] \\[2pt]
0, & x = \tfrac{1}{2}
\end{cases}
\]
and for $m = 1, 2, \ldots$; $K = 1, \ldots, 2^m$,
\[
\varphi_m^{(K)}(x) =
\begin{cases}
\sqrt{2^m}, & x \in \Bigl(\dfrac{K-1}{2^m},\ \dfrac{K-\frac{1}{2}}{2^m}\Bigr) \\[8pt]
-\sqrt{2^m}, & x \in \Bigl(\dfrac{K-\frac{1}{2}}{2^m},\ \dfrac{K}{2^m}\Bigr) \\[8pt]
0, & x \in [0, 1] \setminus \Bigl[\dfrac{K-1}{2^m},\ \dfrac{K}{2^m}\Bigr]
\end{cases}
\]
and at that finite set of points at which $\varphi_m^{(K)}(x)$ has not yet been defined, let $\varphi_m^{(K)}(x)$ be the average of the left and right limits of $\varphi_m^{(K)}(x)$ as $x$ approaches the point in question. At 0 and 1, let $\varphi_m^{(K)}(x)$ assume the value of the one-sided limit. Show that the Haar system given by
\[
\{\varphi_0^{(0)},\ \varphi_1^{(0)},\ \varphi_m^{(K)},\ m = 1, 2, \ldots;\ K = 1, \ldots, 2^m\}
\]
is orthonormal in $L^2([0, 1])$.
9. If H has a denumerable orthonormal basis, show that every
orthonormal basis for H is denumerable.
10. Show that the Legendre differential equation can be written as
[(1 − t2 )Pn ] = −n(n + 1)Pn .
Multiply the above equation by Pn and the corresponding equation
in Pm by Pn . Then subtracting the two and integrating resulting
equation from −1 to 1, show that {Pn } in an orthogonal sequence in
L2 ([−1, 1]).
11. (Generating function) Show that
\[
\frac{1}{\sqrt{1 - 2tw + w^2}} = \sum_{n=0}^{\infty} P_n(t)\, w^n.
\]

12. (Generating function) Show that
\[
\exp(2wt - w^2) = \sum_{n=0}^{\infty} \frac{1}{n!}\, H_n(t)\, w^n.
\]

The function on the left is called a generating function of the Hermite


polynomials.
13. Given $H_0(t) = 1$, $H_n(t) = (-1)^n e^{t^2} \dfrac{d^n}{dt^n}\bigl(e^{-t^2}\bigr)$, $n = 1, 2, \ldots$, show that $H_{n+1}(t) = 2t H_n(t) - H_n'(t)$.
14. Differentiating the generating function in problem 12 with respect to t, show that $H_n'(t) = 2n H_{n-1}(t)$, $(n \ge 1)$, and using problem 13, show that $H_n$ satisfies the Hermite differential equation.

3.9 Isomorphism between Separable Hilbert


Spaces
Consider a separable Hilbert space H and let {ei } be a complete
orthonormal system in the space. If x is some element in H then we
can assign to this element a sequence of numbers $\{c_1, c_2, \ldots, c_n, \ldots\}$, the Fourier coefficients. As shown earlier, the series $\sum_{i=1}^{\infty} |c_i|^2$ is convergent and consequently the sequence $\{c_1, c_2, \ldots, c_n, \ldots\}$ can be treated as some element $\tilde{x}$ of the complex space $l_2$. Thus to every element $x \in H$ there can be assigned some element $\tilde{x} \in l_2$. Moreover, the assumption on the completeness of the system implies
\[
\|x\|_H = \Bigl( \sum_{i=1}^{\infty} |c_i|^2 \Bigr)^{1/2} = \|\tilde{x}\|_{l_2}, \tag{3.49}
\]
where the subscripts $H$ and $l_2$ denote the respective spaces whose norms are taken.
Moreover, it is clear that if $x \in H$ corresponds to $\tilde{x} \in l_2$, and $y \in H$ corresponds to $\tilde{y} \in l_2$, then $x \pm y$ corresponds to $\tilde{x} \pm \tilde{y}$. It then follows from (3.49) that
\[
\|x - y\|_H = \|\tilde{x} - \tilde{y}\|_{l_2}. \tag{3.50}
\]
Let us suppose that $\tilde{z} = \{\zeta_i\}$ is an arbitrary element in $l_2$. We next consider in $H$ the elements $z_n = \sum_{i=1}^{n} \zeta_i e_i$, $n = 1, 2, \ldots$. We then have
\[
\|z_n - z_m\|^2 = \Bigl\| \sum_{i=m+1}^{n} \zeta_i e_i \Bigr\|^2 = \sum_{i=m+1}^{n} |\zeta_i|^2,
\]
so that $\|z_m - z_n\| \to 0$ as $n, m \to \infty$.
Then $\{z_n\}$ is a Cauchy sequence in the sense of the metric in $H$ and, by virtue of the completeness of $H$, converges to some element $z$ of the space. Since
\[
\langle z, e_i\rangle = \lim_{n\to\infty} \langle z_n, e_i\rangle = \zeta_i, \qquad i = 1, 2, \ldots,
\]
it follows that the $\zeta_i$ are the Fourier coefficients of $z$ with respect to the chosen orthonormal system $\{e_i\}$. Thus, corresponding to every $\tilde{z} \in l_2$ we can find a $z \in H$, and a one-to-one correspondence between the elements of $H$ and $l_2$ is established. The formula (3.50) shows that the correspondence between $H$ and $l_2$ is an isometric correspondence. Now, if $x \in H$ corresponds to $\tilde{x} \in l_2$ and $y \in H$ corresponds to $\tilde{y} \in l_2$, then $x \pm y \in H$ corresponds to $\tilde{x} \pm \tilde{y} \in l_2$. Again, $\lambda x \in H$ corresponds to $\lambda \tilde{x} \in l_2$ for any scalar $\lambda$. Since $\|x \pm y\|_H = \|\tilde{x} \pm \tilde{y}\|_{l_2}$ and $\|\lambda x\|_H = \|\lambda \tilde{x}\|_{l_2}$, it follows that
scalar λ. Since x ± yH = x̃ ± ỹl2 and λxH = λx̃l2 , it follows that
the correspondence between H and l2 is both isometric and isomorphic.
We thus obtain the following theorem.
3.9.1 Theorem
Every complex (real) separable Hilbert space is isomorphic and isometric
to a complex (real) space l2 . Hence all complex (real) separable Hilbert
spaces are isomorphic and isometric to each other.
CHAPTER 4

LINEAR OPERATORS

There are many operators, such as matrix operator, differential operator,


integral operator, etc. which we come across in applied mathematics,
physics, engineering, etc. The purpose of this chapter is to bring the
above operators under one umbrella and call them ‘linear operators’. The
continuity, boundedness and allied properties are studied. If the range of the operators is $\mathbb{R}$, then they are called functionals. Bounded functionals
and space of bounded linear functionals are studied. The inverse of an
operator is defined and the condition of the existence of an inverse operator
is investigated. The study will facilitate an investigation into whether a
given equation has a solution or not. The setting is always a Banach space.

4.1 Definition: Linear Operator


We know of a mapping from one space onto another space. In the case of
vector spaces, and in particular, of normed spaces, a mapping is called an
operator.
4.1.1 Definition: linear operator
Given two topological linear spaces (Ex , τx ) and (Ey , τy ) over the same
scalar field (real or complex) an operator A is defined on Ex with range in
Ey .
We write y = Ax; x ∈ Ex and y ∈ Ey .
The operator A is said to be linear if:

(i) it is additive, that is,


A(x1 + x2 ) = Ax1 + Ax2 , for all x1 , x2 ∈ Ex (4.1)

(ii) it is homogeneous, that is,


A(λx) = λAx,

130
Linear Operators 131

for all x ∈ Ex and every real (complex) λ whenever Ex is real


(complex).

Observe the notation. Ax is written instead of A(x); this simplification


is standard in functional analysis.
4.1.1a Example
Consider a real square matrix $(a_{ij})$ of order $n$ $(i, j = 1, 2, \ldots, n)$. The equations
\[
\eta_i = \sum_{j=1}^{n} a_{ij}\, \xi_j \qquad (i = 1, 2, \ldots, n)
\]
can be written in the compact form
\[
y = Ax,
\]
where $y = (\eta_1, \eta_2, \ldots, \eta_n) \in E_y$, $A = (a_{ij})_{i,j = 1, \ldots, n}$ and $x = (\xi_1, \xi_2, \ldots, \xi_n) \in E_x$.
If $x^1 = (\xi_1^1, \xi_2^1, \ldots, \xi_n^1)$, $x^2 = (\xi_1^2, \xi_2^2, \ldots, \xi_n^2)$, $y^1 = (\eta_1^1, \eta_2^1, \ldots, \eta_n^1)$ and $y^2 = (\eta_1^2, \eta_2^2, \ldots, \eta_n^2)$ are such that
\[
Ax^1 = y^1 \;\Rightarrow\; \sum_{j=1}^{n} a_{ij}\xi_j^1 = \eta_i^1, \qquad
Ax^2 = y^2 \;\Rightarrow\; \sum_{j=1}^{n} a_{ij}\xi_j^2 = \eta_i^2, \qquad i = 1, 2, \ldots, n,
\]
then
\[
A(x^1 + x^2) = \Bigl( \sum_{j} a_{ij}(\xi_j^1 + \xi_j^2) \Bigr) = \Bigl( \sum_{j} a_{ij}\xi_j^1 \Bigr) + \Bigl( \sum_{j} a_{ij}\xi_j^2 \Bigr) = Ax^1 + Ax^2 = y^1 + y^2.
\]
The above shows that $A$ is additive. The fact that $A$ is homogeneous can be proven in a similar manner.
4.1.2a Example
Let $k(t, s)$ be a continuous function of $t$ and $s$, $a \le t, s \le b$. Consider the integral equation
\[
y(t) = \int_a^b k(t, s)\, x(s)\, ds.
\]
If $x(s) \in C([a, b])$, then $y(t) \in C([a, b])$. The above equation can be written as
\[
y = Ax, \qquad \text{where } Ax = \int_a^b k(t, s)\, x(s)\, ds.
\]
The operator $A$ maps the space $C([a, b])$ into itself. Let $x_1(s), x_2(s) \in C([a, b])$; then
\[
A(x_1 + x_2) = \int_a^b k(t, s)\bigl(x_1(s) + x_2(s)\bigr)\, ds
= \int_a^b k(t, s)\, x_1(s)\, ds + \int_a^b k(t, s)\, x_2(s)\, ds = Ax_1 + Ax_2.
\]
Moreover,
\[
A(\lambda x) = \int_a^b k(t, s)\bigl(\lambda x(s)\bigr)\, ds = \lambda \int_a^b k(t, s)\, x(s)\, ds = \lambda Ax.
\]
Thus $A$ is additive and homogeneous.


4.1.3a Example
Let $E_x = C^1([a, b]) = \{x(t) : x(t)$ is continuously differentiable in $a < t < b$, $\frac{dx}{dt}$ is continuous in $(a, b)\}$. Define the norm as
\[
\|x\| = \sup_{a \le t \le b} \Bigl( |x(t)| + \Bigl| \frac{dx}{dt} \Bigr| \Bigr). \tag{4.2}
\]
Let $x \in C^1([a, b])$; then $y = Ax = \dfrac{dx}{dt} \in C([a, b])$ and
\[
\|y\| = \sup_{a \le t \le b} |y(t)| = \sup_{a \le t \le b} \Bigl| \frac{dx}{dt} \Bigr|.
\]
Since the sup in (4.2) exists, $\sup |y(t)|$ also exists. Moreover, the operator $A = \dfrac{d}{dt}$ is linear, for
\[
x_1, x_2 \in C^1([a, b]) \;\Rightarrow\; A(x_1 + x_2) = Ax_1 + Ax_2, \qquad
\lambda \in \mathbb{R}\,(\mathbb{C}) \;\Rightarrow\; A(\lambda x) = \lambda Ax.
\]
4.1.4 Continuity
We know that the continuity of A in the case of a metric space means that, for every ε > 0, there is a δ > 0 such that the collection of images of elements in the ball B(x, δ) lies in B(Ax, ε).
4.1.1b Example
Let us suppose that in example 4.1.1a the sequence $\{\xi_i^{(m)}\}$ is convergent. Then
\[
\eta_i^{(m)} - \eta_i^{(p)} = \sum_{j=1}^{n} a_{ij}\bigl(\xi_j^{(m)} - \xi_j^{(p)}\bigr).
\]
Hence, by the Cauchy-Bunyakovsky-Schwartz inequality (1.4.3),
\[
\sum_{i=1}^{n} \bigl(\eta_i^{(m)} - \eta_i^{(p)}\bigr)^2 \le \Bigl( \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}^2 \Bigr) \sum_{j=1}^{n} \bigl(\xi_j^{(m)} - \xi_j^{(p)}\bigr)^2,
\]
where $x_m = \{\xi_i^{(m)}\}$, $y_m = Ax_m = \{\eta_i^{(m)}\}$. Since $\sum_{i=1}^{n}\sum_{j=1}^{n} a_{ij}^2$ is finite, convergence of $\{\xi_i^{(m)}\}$ implies convergence of $\{\eta_i^{(m)}\}$. Hence the continuity of $A$ is established.
4.1.2b Example
Consider Example 4.1.2a.
Let {xn (t)} converge to x(t) in the sense of convergence in C([0, 1]), i.e.,
converges uniformly in C[(0, 1]). Now in the case of uniform convergence
we can take the limit under the integral sign.
 b  b
It follows that lim K(t, s)xm (s)ds = K(t, s)x(s)ds i.e.
m a a
lim Axm = Ax and the continuity of A is proved.
m→∞

4.1.3b Example
We refer to the example 4.1.3a. The operator A, in this case, although
additive and homogeneous, is not continuous. This is because the
derivative of a limit element of a uniformly convergent sequence of functions
need not be equal to the limit of the derivative of these functions, even
though all these derivatives exist and are continuous.
4.1.4 Example
Let A be a continuous linear operator. Then
(i) A(θ) = θ (ii) A(−z) = −Az for any z ∈ Ex
Proof: (i) for any x, y, z ∈ Ex , put x = y + z and consequently y = x − z.
Now,

Ax = A(y + z) = Ay + Az = Az + A(x − z)
Hence, A(x − z) = Ax − Az (4.3)

Putting x = z, we get A(θ) = θ.


(ii) Taking x = θ in (4.3) we get

A(−z) = −Az

4.1.5 Theorem
If an additive operator A, mapping a real linear space Ex into a real
linear space Ey s.t. y = Ax, x ∈ Ex , y ∈ Ey , be continuous at a point
x∗ ∈ Ex , then it is continuous on the entire space Ex .
Proof: Let x be any point of Ex and let xn → x as n → ∞. Then
xn − x + x∗ → x∗ as n → ∞.
Since A is continuous at x∗ ,

lim A(xn − x + x∗ ) = Ax∗


n→∞
However, A(xn − x + x∗ ) = Axn − Ax + Ax∗ ,
since A is additive in nature.
Therefore, lim (Axn − Ax + Ax∗ ) = Ax∗ or lim Axn = Ax
n→∞ n→∞
where x is any element of Ex .

4.1.6 Theorem
An additive and continuous operator A defined on a real linear space is
homogeneous.
Proof: (i) Let $\alpha = m$, a positive integer. Then
\[
A(\alpha x) = \underbrace{Ax + Ax + \cdots + Ax}_{m \text{ terms}} = mAx.
\]
(ii) Let $\alpha = -m$, where $m$ is a positive integer. By sec. 4.1.4,
\[
A(\alpha x) = A(-mx) = -A(mx) = -mAx = \alpha Ax.
\]
(iii) Let $\alpha = \dfrac{m}{n}$ be a rational number, $m$ and $n$ prime to each other. Then
\[
A\Bigl( \frac{m}{n}\, x \Bigr) = m\, A\Bigl( \frac{x}{n} \Bigr).
\]
Let $\dfrac{x}{n} = \xi$, $n$ an integer. Then $x = n\xi$ and
\[
Ax = A(n\xi) = nA\xi = nA\Bigl( \frac{x}{n} \Bigr).
\]
Hence $A\Bigl( \dfrac{x}{n} \Bigr) = \dfrac{1}{n}\, Ax$, i.e.,
\[
A(\alpha x) = A\Bigl( \frac{m}{n}\, x \Bigr) = m\, A\Bigl( \frac{x}{n} \Bigr) = \frac{m}{n}\, Ax = \alpha Ax.
\]
If $\alpha = -\dfrac{m}{n}$ where $\dfrac{m}{n} > 0$, then also $A(\alpha x) = \alpha Ax$.
Let us next consider α to be an irrational number. Then we can find a
sequence of rational number {si } such that lim si = α.
i→∞
si being a rational number, then

A(si x) = si Ax (4.4)
since A is continuous at αx, α is a real number, and since lim si x = αx
i→∞
we have,

lim A(si x) = A(αx)


i→∞
Again, lim si = α.
i→∞
Hence taking limits in (4.4), we get,
A(αx) = αAx.

4.1.7 The space of operators


The algebraic operations can be introduced on the set of linear
continuous operators, mapping a linear space Ex into a linear space Ey .
Let A and B map Ex into Ey .
For any x ∈ Ex , we define the addition by

(A + B)x = Ax + Bx

and the scalar multiplication by

(λA)x = λAx.

Thus, we see the set of linear operators defined on Ex is closed w.r.t.


addition and scalar multiplication. Hence, the set of linear operators
mapping Ex into Ey is a linear space.
In particular if we take B = −A, thus

(A + B)x = (A − A)x = 0 · x = Ax − Ax = θ

Thus 0, the null operator, is an element of the said space. The limit of
a sequence is defined in a space of linear operators by assuming for example
that An → A if An x → Ax for every x ∈ Ex .
This space of continuous linear operators will be discussed later.
4.1.8 The ring of continuous linear operators
4+
Let E be a linear space over a scalar field ( ). We next consider the
space of continuous linear operators mapping E into itself. Such a space
we denote by (E → E). The product of two linear operators A and B in
(E → E) is denoted by AB = A(B), i.e., (AB)x = A(Bx) for all x ∈ E.
Let xn → x in E. A and B being continuous linear operators,
Axn → Ax and Bxn → Bx.
Since AB is the product of A and B in (E → E),

ABxn − ABx = A(Bxn − Bx).

Since Bxn → Bx as n → ∞ or Bxn − Bx → θ as n → ∞ and A is a


continuous linear operator, A(Bxn − Bx) → θ as n → ∞. Thus, AB is a
continuous linear operator.

Since A maps E → E and is continuous linear A2 = A · A ∈ (E → E).


Let us suppose An ∈ (E → E) for any finite n. Then

An+1 = A · An ∈ (E → E)

It can be easily seen that, if A, B and C ∈ (E → E) respectively, then

(AB)C = A(BC), (A + B)C = AC + BC and also C(A + B) = CA + CB.

Moreover, there exists an identity operator I, defined by Ix = x


for all x and such that AI = IA = A for every operator A. Since in general
AB ≠ BA, the set (E → E) is a non-commutative ring with identity.
4.1.9 Example
Consider the linear space of all polynomials $p(s)$ with real coefficients. The operators $A$ and $B$ are defined by
\[
y(t) = \int_0^1 t\, s\, p(s)\, ds = Ap \qquad \text{and} \qquad y(t) = t\, \frac{dp}{dt} = Bp.
\]
Then
\[
ABp = \int_0^1 t\, s^2\, \frac{dp(s)}{ds}\, ds = t\Bigl( \bigl[ s^2 p(s) \bigr]_0^1 - 2\int_0^1 s\, p(s)\, ds \Bigr) = t\Bigl( p(1) - 2\int_0^1 s\, p(s)\, ds \Bigr),
\]
\[
BAp = t\, \frac{d}{dt}\Bigl( t \int_0^1 s\, p(s)\, ds \Bigr) = t \int_0^1 s\, p(s)\, ds.
\]
Thus $AB \ne BA$.

4.1.10 Function of operator

The operator $A^n = \underbrace{A\, A \cdots A}_{n \text{ factors}}$ represents a simple example of an operator function. A more general example is the polynomial function of an operator,
\[
p_n(A) = a_0 I + a_1 A + a_2 A^2 + \cdots + a_n A^n.
\]
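For a concrete finite-dimensional illustration (ours, not the book's; the particular 2x2 matrix and polynomial are arbitrary), a polynomial function of an operator can be evaluated by accumulating matrix powers:

import numpy as np

def poly_of_operator(coeffs, A):
    # p(A) = a_0 I + a_1 A + ... + a_n A^n for a square matrix A; coeffs = [a_0, ..., a_n]
    result = np.zeros_like(A, dtype=float)
    power = np.eye(A.shape[0])        # A^0 = I
    for a in coeffs:
        result = result + a * power
        power = power @ A             # next power of A
    return result

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
print(poly_of_operator([1.0, 2.0, 1.0], A))   # equals (I + A)^2, since p(t) = 1 + 2t + t^2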

4.2 Linear Operators in Normed Linear


Spaces
Let Ex and Ey be two normed linear spaces. Since a normed linear space
is a particular case of a topological linear space, the definition of a linear
operator mapping a topological linear space Ex into a topological linear
space Ey holds good in case the spaces Ex and Ey reduce to normed linear
spaces. Theorems 4.1.5 and 4.1.6 also remain valid in normed linear spaces.

4.2.1 Definition: continuity of a linear operator mapping Ex


into Ey
Since convergence in a normed linear space is introduced through the
convergence in the induced metric space, we define convergence of an
operator A in a normed linear space as follows:
Given A, a linear operator mapping a normed linear space Ex into
a normed linear space Ey , A is said to be continuous at x ∈ Ex if
||xn − x||Ex → 0 as n → ∞ ⇒ ||Axn − Ax||Ey → 0 as n → ∞.

4.1.1c Example
(m) (m)
Now refer back to example 4.1.1a. Let, xm = {ξi }. ym = {ηi }.
Then ym = Axm . Let xm → x as m → ∞. We assume that xm ∈ Ex , the
n-dimensional Euclidean space. Then


n
(m) 2
(ξi ) < ∞.
i=1
Now, by Cauchy-Bunyakovsky-Schwartz’s inequality, (Sec. 1.4.3)
⎧ ⎫
⎨ n n ⎬
(m) (m)
y − ym = {ηi − ηi } = aij (ξj − ξj )
⎩ ⎭
i=1 j=1
  ⎛ ⎞1/2
  
 n  n
=  aij (ξj − ξj ) ≤ ⎝
(m) (m)
or |ηi − ηi | a2ij ⎠ ·
 j=1  j=1
⎛ ⎞1/2
 n
(m)
⎝ |ξj − ξj |2 ⎠
j=1


1/2 ⎛ ⎞1/2 ⎛ ⎞1/2

n
(m)
n 
n n
(m)
Thus |ηi − ηi |2 ≤⎝ a2ij ⎠ ⎝ |ξj − ξj |2 ⎠
i=1 i=1 j=1 j=1


n 
n
Since a2ij < ∞
i=1 j=1

 (m) 2
||xm − x||2 −→ 0 =⇒ |ξi − ξi | −→ 0
i=1


n
(m) 2
=⇒ |ηi − ηi | −→ 0
i=1

=⇒ ||Axm − Ax||2 −→ 0 as m → ∞.

This shows that A is continuous.



4.1.2c Example
We consider example 4.1.2a in the normed linear space C([a, b]). Since
x(s) ∈ C([a, b]) and K(t, s) is continuous in a ≤ t, s ≤ b, it follows that
y(t) ∈ C([a, b]).
Let xn , x ∈ C([a, b]) and ||xn − x|| −→ 0 as n → ∞, where ||x|| =
max |x(t)|.
a≤t≤b

Now, ||yn (t) − y(t)|| = max |yn (t) − y(t)|


a≤t≤b
 b
= max |K(t, s)(xn (s) − x(s))ds|
a≤t≤b a

≤ [b − a] max |K(t, s)| · max |xn (s) − x(s)|


a≤t,s≤b a≤s≤b

= [b − a] max |K(t, s)| ||xn (s) − x(s)|| −→ 0 as n → ∞.


a≤t,s≤b

or, ||Axn − Ax|| −→ 0 as ||xn − x|| → 0,


showing that A is continuous.
4.2.2 Example
Let A = (aij ), i, j = 1, 2, . . ., respectively.
Let x = {ξi }, i = 1, 2, . . . Then y = Ax yields


ηi = aij ξj where y = {ηi }
j=1

Let us suppose
∞ 
 ∞
K= |aij |q < ∞, q > 1 (4.5)
i=1 j=1


and x ∈ lp i.e. |ξi |p ≤ ∞.
i=1
(1) (2)
For x1 = {ξi } ∈ lp , x2 = {ξi } ∈ lp it is easy to show that A is linear.
Then, using Hölder’s inequality (1.4.3)
⎧⎛ ⎞1/q ⎛ ⎞1/p ⎫q
n n ⎪
⎨ ∞ ∞
 p ⎪

|ηi |q ≤ ⎝ |aij |q ⎠ ⎝ |ξj |⎠
⎪ ⎪
i=1 i=1 ⎩ j=1 j=1 ⎭

 ∞
n 
= ||x||q |aij |q
i=1 j=1
∞ 
 ∞
≤ ||x||q |aij |q
i=1 j=1


1/q ⎛ ⎞1/q

 ∞ 

1 1
Hence ||y|| = |ηi |q ≤⎝ |aij |q ⎠ · x where + = 1.
p q
i=1 i=1 j=1

Hence, using (4.5), we can say that x ∈ lp =⇒ y ∈ lq .


Let now ||xm − x||p −→ 0, i.e., xm −→ x in lp as m −→ ∞, where
(n)
xn = {ξi } and x = {ξi }.
Now, using Hölder’s inequality (sec. 1.4.3)
 q
 ∞  ∞ 
 
||Axm − Ax||qq =  a (ξ
(m)
− ξ ) 
 ij j j 
i=1  j=1 
⎛ ⎞ ⎛ ⎞q/p
 ∞
∞  ∞
(m)
≤⎝ |aij |q ⎠ · ⎝ |ξj − ξj |p ⎠
i=1 j=1 j=1
⎛ ⎞q/p

 (m)
=K ⎝ |ξj − ξj |p ⎠
j=1

= K||xm − x||qp
Hence ||xm − x||p −→ 0 =⇒ ||Axm − Ax||q −→ 0.

Hence A is linear and continuous.


4.2.3 Definition: bounded linear operator
Let A be a linear operator mapping Ex into Ey, where Ex and Ey are normed linear spaces over the scalar field $\mathbb{R}\,(\mathbb{C})$. A is said to be bounded if there exists a constant K > 0 s.t.
\[
\|Ax\|_{E_y} \le K \|x\|_{E_x} \quad \text{for all } x \in E_x.
\]

Note 4.2.1. The definition 4.2.3 of a bounded linear operator is not the
same as that of an ordinary real or complex function, where a bounded
function is one whose range is a bounded set.
We would next show that a bounded linear operator and a continuous
linear operator are one and the same.
4.2.4 Theorem
In order that an additive and homogeneous operator A be continuous, it
is necessary and sufficient that it is bounded.
Proof: (Necessity) Let A be a continuous operator. Assume that it is not
bounded. Then, there is a sequence {xn } of elements, such that

||Axn || > n||xn || (4.6)



Let us construct the elements


xn ||xn ||
ξn = , i.e., ||ξn || = −→ 0 as n → ∞
n||xn || n||xn ||

Therefore, ξn −→ ξ = θ as n → 0.
On the other hand,
1
||Aξn || = ||Axn || > 1 (4.7)
n||xn ||

Now, A being continuous, and since ξn → θ as n → ∞


ξn → θ as n → ∞ =⇒ Aξn −→ A · θ = θ
This contradicts (4.7). Hence, A is bounded.
(Sufficiency) Let the additive operator A be bounded i.e. ||Ax|| ≤
K||x|| ∀ x ∈ Ex .
Let xn −→ x as n → ∞ i.e. ||xn − x|| −→ 0 as n → ∞.
Now, ||Axn − Ax|| = ||A(xn − x)|| ≤ K||xn − x|| → 0 as n → ∞.
Hence, A is continuous at xn .
4.2.5 Lemma
Let a given linear (not necessarily bounded) operator A map a Banach
space Ex into a Banach space Ey . Let us denote by En the set of those

*
x ∈ Ex for which ||Ax|| < n||x||. Then Ex is equal to En and at least
n=1
one En is everywhere dense.
Proof: Since ||A·θ|| < n||θ||, the null element belongs to every En for every
n. Again, for every x, we can find a n say n such that ||Ax|| < n ||x||,
||Ax||
n > .
||x||
Therefore, every x belongs to same En .

*
Hence Ex = En . Ex , being a Banach space, can be reduced to a
n=1
complete metric space. By theorem 1.4.19, a complete metric space is a set
of the second category and hence cannot be expressed as a countable union
of nowhere dense sets. Hence, at least one of the sets of Ex is everywhere
dense.
To actually construct such a set in En we proceed as follows. Let En̂ be
a set which is everywhere dense in Ex . Consequently there is a ball B(x0 , r)
containing B(x0 , r) ∩ En̂ , everywhere dense.
Let us consider a ball B(x1 , r1 ) lying completely inside B(x0 , r) and
such that x1 ∈ En̂ . Take any element x with norm ||x|| = r1 . Now
x1 + x ∈ B(x1 , r1 ). Since B(x1 , r1 ) ⊆ E n̂ , there is a sequence {yk } of
elements in B(x1 , r1 ) ∩ En̂ such that yk −→ x1 + x as k → ∞. Therefore,

xk = yk − x1 −→ x. Since ||x|| = r1 , there is no loss of generality if we


assume r1 /2 < ||xk ||.
Since yk and x1 ∈ En̂ ,
||Axk || = ||Ayk − Ax1 || ≤ ||Ayk || + ||Ax1 ||
≤ n̂(||yk || + ||x1 ||)
Besides ||yk || = ||xk + x1 || ≤ ||xk || + ||x1 ||
≤ r1 + ||x1 ||
2n̂(r1 + ||x||) r1
Hence, ||Axk || ≤ n̂(r1 + 2||x1 ||) ≤ ||xk || since ≤ ||xk ||.
r1 2
2n̂(r1 + 2||x1 ||)
Let n be the least integer greater than , then ||Axk || ≤
r1
n̂||xk ||, implying that all xk ∈ En .
Thus, any element x with norm equal to r1 can approximate   elements
x
in En . Let x be any element in Ex . Then x̂ = r1 ||x1 || satisfies
||x̂|| = r1 . Hence
  is a sequence {x̂k } ∈ En , which converges to x̂.
there
x ||x1 ||
Then x̂ = r1 satisfies ||x̂|| = r1 . Then xk = x̂k → x, as
||x1 || r1
k → ∞.
||x1 || ||x1 ||
||Axk || = ||Ax̂k || ≤ · n||x̂k || = n||xk ||
r1 r1
Thus xk ∈ En . Consequently En is everywhere dense in Ex .
4.2.6 Definition: the norm of an operator
Let A be a bounded linear operator mapping Ex into Ey . Then we can
find K > 0 such that
||Ax||Ey ≤ K||x||Ex (4.8)
The smallest value of K, say M , for which the above inequality holds is
called the norm of A and is denoted by ||A||.
4.2.7 Lemma
The operator A has the following two properties:
(i) ‖Ax‖ ≤ ‖A‖ ‖x‖ for all x ∈ Ex; (ii) for every ε > 0, there is an element x_ε such that ‖Ax_ε‖ > (‖A‖ − ε)‖x_ε‖.
Proof: (i) By definition of the norm of A, we can write
\[
\|A\| = \inf\{K : K > 0 \text{ and } \|Ax\|_{E_y} \le K\|x\|_{E_x}\ \forall\, x \in E_x\}. \tag{4.9}
\]
Hence ‖Ax‖_{E_y} ≤ ‖A‖ ‖x‖_{E_x}.
(ii) Since ‖A‖ is the infimum of K in (4.9), there exists x_ε ∈ Ex s.t. ‖Ax_ε‖ > (‖A‖ − ε)‖x_ε‖.

4.2.8 Lemma
||Ax||Ey
(i) ||A|| = sup (4.10)
x =θ ||x||Ex
(ii) ||A|| = sup AxEy (4.11)
x ≤1

Proof: It follows from (4.9) that

||Ax||Ey
sup ≤ ||A|| (4.12)
x =θ ||x||Ex
Again (ii) of lemma 4.2.7 yields
||Ax ||
> (||A|| − )
||x ||
||Ax||Ey
Hence, sup ≥ ||A|| (4.13)
x =θ ||x||Ex
It follows from (4.12) and (4.13) that
||Ax||Ey
sup = ||A||
x =θ ||x||Ex
Next, if ||x||Ex ≤ 1, it follows from (i) of lemma 4.2.7,
sup AxEy ≤ A (4.14)
x ≤1

Again, we obtain from (ii) of lemma 4.2.7


||Ax || > (||A|| − )||x ||
x
Put ζ = ||x || , then
1 1
||Aζ || = ||Ax || > (||A|| − )||x ||
||x || ||x ||
Since ||ζ || = 1, it follows that
sup ||Ax|| ≥ ||Aζ || > ||A|| −
||x||≤1

Hence, sup ||Ax|| ≥ ||A||. (4.15)


x≤1

Using (4.14) and (4.15) we prove the result.

4.2.9 Examples of operators


Examples of operators include the identity operator, the zero operator,
the differential operator, and the integral operator. We discuss these
operators in more detail below.

4.2.10 Identity operator


The identity operator IEx : Ex → Ex is defined by IEx x = x for all
x ∈ Ex . In case Ex is a non-empty normed linear space, the operator I is
bounded and the norm ||I|| = 1.
4.2.11 Zero operator
The zero operator 0 : Ex → Ey where Ex and Ey are normed linear
spaces, is bounded and the norm ||0||Ey = 0.
4.2.12 Differential operator
Let Ex be the normed linear space of all polynomials on J = [0, 1] with
norm given by ||x|| = maxt∈J |x(t)|. A differential operator A is defined on
Ex by Ax(t) = x (t) where the prime denotes differentiation WRT t.
The operator is linear for differentiable x(t) and y(t) ∈ Ex for we have,

A(x(t) + y(t)) = (x(t) + y(t)) = x (t) + y  (t) = Ax(t) + Ay(t)


Again A(λx(t)) = (λx(t)) = λx (t) = λA(x(t)) where λ is a scalar.
If $x_n(t) = t^n$, $t \in J$, then $\|x_n(t)\| = 1$ and $Ax_n(t) = x_n'(t) = n t^{n-1}$, so that $\|Ax_n\| = n$ and $\|Ax_n\|/\|x_n\| = n$. Since $n$ is any positive integer, we cannot find any fixed number $M$ s.t.
\[
\|Ax_n\|/\|x_n\| \le M.
\]

Hence A is not bounded.
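A quick numerical illustration (ours, assuming NumPy) of the unboundedness just shown: for x_n(t) = t^n on J the ratio ‖Ax_n‖/‖x_n‖ equals n and grows without bound, so no single constant K can work.

import numpy as np

t = np.linspace(0.0, 1.0, 10001)
for n in (1, 5, 25, 125):
    x = t ** n                              # x_n(t) = t^n, sup norm 1
    dx = n * t ** (n - 1)                   # (A x_n)(t) = n t^(n-1)
    print(n, np.max(np.abs(dx)) / np.max(np.abs(x)))   # ratio equals n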


4.2.13 Integral operator
Refer to example 4.1.2.

A : C([a, b]) → C([a, b]), for x(s) ∈ C([a, b]),


 b
Ax = K(t, s)x(s)ds
a

where K(t, s) is continuous in a ≤ t, s ≤ b. Therefore


\[
y(t) = Ax(s) \in C([a, b]),
\]
\[
\|y(t)\| = \max_{a \le t \le b} |y(t)| = \max_{a \le t \le b} \Bigl| \int_a^b K(t, s)\, x(s)\, ds \Bigr|
\le \max_{a \le t \le b} \int_a^b |K(t, s)|\, ds \cdot \max_{a \le s \le b} |x(s)|
= \max_{a \le t \le b} \int_a^b |K(t, s)|\, ds \cdot \|x\|.
\]
Thus,
\[
\|A\| \le \max_{a \le t \le b} \int_a^b |K(t, s)|\, ds. \tag{4.16}
\]

 b
Since |K(t, s)|ds is a continuous function, it attains the maximum
a
at some point t0 of the interval [a, b]. Let us take,
z0 (s) = sgn K(t0 , s)
where sgn z = z/|z|, z ∈ . +
Let xn (s) be a continuous function, such that |xn (s)| ≤ 1 and xn (s) =
z0 (s) everywhere except on a set En of measure less than 1/2M n, where
M = max |K(t, s)|. Then, |xn (s) − z0 (s)| ≤ 2 everywhere on En .
t,s
  b 
 b 
 
We have  K(t, s)z0 (s)ds − K(t, s)xn (s)ds
 a a 
 b
≤ |K(t, s)| |xn (s) − z0 (s)|ds
a

= |K(t, s)| |xn (s) − z0 (s)|ds
En
1 1
 2 max |K(t, s)| ·=
2M n
t,s n
 b  b
1
Thus K(t, s)z0 (s)ds ≤ K(t, s)xn (s)ds +
a a n
1
≤ ||A|| ||xn || +
n
for t ∈ [a, b], putting t = t0
 b
1
|K(t0 , s)|ds ≤ ||A|| ||xn || +
a n
Since ||xn || ≤ 1, the preceding inequality in the limit as n → ∞ gives
rise to
 b
|K(t, s)|ds ≤ ||A||,
a
 b
i.e., max |K(t, s)|ds ≤ ||A|| (4.17)
t a

From (4.16) and (4.17) it follows that
\[
\|A\| = \max_{a \le t \le b} \int_a^b |K(t, s)|\, ds.
\]
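The formula just obtained can be checked numerically (our sketch, assuming NumPy; the kernel K(t, s) = e^{-|t-s|} on [0, 1] is an arbitrary choice): discretising t and s, the maximum row sum of |K| times the mesh size approximates max_t ∫|K(t, s)| ds.

import numpy as np

a, b, m = 0.0, 1.0, 400
s = np.linspace(a, b, m)
h = (b - a) / (m - 1)
K = np.exp(-np.abs(s[:, None] - s[None, :]))    # kernel K(t, s) = e^{-|t - s|}

row_integrals = np.sum(np.abs(K), axis=1) * h   # approximates int_a^b |K(t, s)| ds for each t
print(np.max(row_integrals))                     # ~ max_t int |K(t, s)| ds = ||A||
print(2.0 * (1.0 - np.exp(-0.5)))                # exact value for this kernel, attained at t = 1/2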

4.2.14 Theorem
A bounded linear operator A0 defined on a linear subset X, which
is everywhere dense in a normed linear space Ex with values in a
complete normed linear space Ey can be extended to the entire space with
preservation of norm.

Proof: A can be defined on Ex such that Ax = A0 x, x ∈ X and


||A||Ex = ||A0 ||X .
Let x be any element in Ex not belonging to X.
Since X is everywhere dense in Ex , there is a sequence {xn } ⊂ X s.t.
||xn − x|| → 0 as n → ∞ and hence ||xn − xm || → 0 as n, m → ∞. However
then

||A0 xn − A0 xm || = ||A0 (xn − xm )|| ≤ ||A0 ||X ||xn − xm ||


→ 0 as n, m → ∞, {A0 xn }

is a Cauchy sequence and converges by the completeness of Ey to some


limit denoted by Ax. Let {ξn } ⊂ X be another sequence convergent to
x. Evidently ||xn − ξn || → θ as n → ∞. Hence, ||A0 xn − A0 ξn || → 0
as n → ∞. Consequently A0 ξn −→ Ax, implying that A is defined
uniquely by the elements of Ex . If x ∈ X select xn = x for all n. Then
Ax = lim A0 xn = A0 x.
n→∞
The operator A is additive, since

A(x1 + x2 ) = lim A0 (x(1) (2) (1) (2)


n + xn ) = lim A0 xn + lim A0 xn = Ax1 + Ax2
n→∞ n→∞ n→∞

We will next show that the operator is bounded

||A0 xn || ≤ ||A0 ||X ||xn ||

Making n → ∞ we have

||Ax|| ≤ ||A0 ||X ||x||

Dividing both sides by ||x|| and taking the supremum

||A||Ex ≤ ||A0 ||X

But the norm of A over Ex cannot be smaller than the norm of A0 over
X, therefore we have
||A||Ex = ||A||X
The above process exhibits the completion by continuity of a bounded
linear operator from a dense subspace to the entire space.
Problems [4.1 & 4.2]
1. Let A be an nth order square matrix, i.e., A = (aij ) i=1,...n . Prove
j=1,...n
that A is linear, continuous and bounded.
2. Let B be a bounded, closed domain in 4

and let x =
T
4
(x1 , x2 , . . . , xn ) . We denote by 
the space C([B]) of functions f (x)
which are continuous on B. The function φ(x) and the n-dimensional

vector p(x) are fixed members of C([B]). The values of p(x) lie in B
for all x ∈ B. Show that T1 and T2 given by T1 f (x) = φ(x)f (x) and
T2 (f (x)) = f (p(x)) are linear operators.
3. Show that the matrix
⎛ ⎞
a11 a11 ··· a1n
⎜ a21 a22 ··· a2n ⎟
⎜ ⎟
A=⎜ . ⎟
⎝ .. ⎠
an1 an2 ··· ann

is a bounded linear operator in 4np for p = 1, 2, ∞ and that for



n
p = 1, ||A||1 ≤ max |aij |
j
i=1
⎛ ⎞1/2
⎜ n ⎟
⎜ ⎟
p = 2, ⎜
||A||2 ≤ max ⎜ 2⎟
|aij | ⎟
j
⎝ ⎠
i=1
j=1


n
p = ∞, ||A||∞ = max |aij |
j
j=1

4. Let E = C([a, b]) with || · ||∞ . Let t1 , . . . , tn be different points in


[a, b] and φ1 , . . . , φn be such that φj (ti ) = δij for i, j = 1, 2, . . . , n.
Let P : E =−→ E be denoted by


n
Px = x(tj )φj , x ∈ E
j=1

Then show that P is a projection operator onto span {φ1 , φ2 , . . . , φk },


P ∈ (E → E) and
 n
||P || = sup |uj (t)|
a≤t≤b j=1

The projection operator P above is called the interpolatory projection


onto the span of {φ1 , φ2 , . . . , φn } corresponding to the ‘nodes’
{t1 , . . . , tn }.
[For ‘Projection operator’ see Lemma 4.5.3]

4.3 Linear Functionals


If the range of an operator consists of real numbers then the operator
is called a functional. In particular if the functional is additive and
homogeneous it is called a linear functional.
Thus, a functional f (x) defined on a linear topological space E is said
to be linear, if

(i) f (x1 + x2 ) = f (x1 ) + f (x2 ) ∀ x1 , x2 ∈ E and


(ii) f (xn ) −→ f (x) as xn −→ x and xn , x ∈ E in the sense of convergence
in a linear space E.

4.3.1 Similarity between linear operators and linear functionals


Since the range of a linear functional f(x) is the real line $\mathbb{R}$, which is
a Banach space, the following theorems which hold for a linear operator
mapping a Banach space into another Banach space are also true for linear
functionals defined on a Banach space.
4.3.2 Theorem
If an additive functional f (x), defined on a normed linear space E on
4(+), is continuous at a single point of this space, then it is also continuous
linear on the whole space E.
4.3.3 Theorem
Every linear functional is homogeneous.
4.3.4 Definition
A linear functional defined on a normed linear space E over 4(+) is
said to be bounded if there exists a constant M > 0 such that

|f (x)| ≤ M ||x|| ∀ x ∈ E.

4.3.5 Theorem
An additive functional defined on a linear space E over 4(+) is linear
if and only if it is bounded.
The smallest of the constants M in the above inequality is called the
norm of the functional f (x) and is denoted by ||f ||.
Thus |f (x)| ≤ ||f || ||x||
Thus ||f || = sup |f (x)|
||x||≤1

|f (x)|
or in other words ||f || = sup |f (x)| = sup
||x||=1 x =θ ||x||

4.3.6 Examples
4
1. Norm: The norm || · || : Ex → on a normed linear space (Ex , || · ||)
is a functional on Ex which is not linear.
2. Dot Product: Dot Product, with one factor kept fixed, defines a
4
functional f : n −→ by means of 4

n
f (x) = a · x = a i xi
i=1

where a = {ai } and x = {xi }, f is linear and bounded.

|f (x)| = |a · x| ≤ ||a|| ||x||

|f (x)|
Therefore, sup ≤ ||a|| for x = a, f (a) = a · a = ||a||2
x =θ ||x||
f (a)
Hence, = ||a||, i.e., ||f || = ||a||.
||a||
3. Definite Integral: The definite integral is a number when we take the
integral of a single function. But when we consider the integral over a class
of functions in a function space, then the integral becomes a functional.
Let us consider the functional
 b
f (x) = x(t)dt, x ∈ C([a, b])
a

f is additive and homogeneous.


 b  b
Now, |f (x)| = | x(t)dt| ≤ sup |x(t)| dt = (b − a)||x||
a t a

Therefore, |f (x)| ≤ (b−a)||x|| or, sup |f (x)| = ||f (x)|| ≤ (b−a)||x||


x∈C[(a,b)]
showing that f is bounded.
|f (x)|
Now, sup = ||f (x)|| ≤ (b − a)
x∈C[(a,b)] ||x||

Next we choose x = x0 = 1, so that ||x0 || = 1.


 b  b
f (x0 )
Again, f (x0 ) = x0 (t)dt = 1 · dt = (b − a) or, = (b − a).
a a ||x0 ||
||f (x)||
Hence, sup = (b − a).
x =θ ||x||
4. Definite integral on C([a, b]);
If K(·, ·) is a continuous function on [a, b] × [a, b]

 b
and F (x)(s) = K(s, t)x(t)dt, x ∈ C([a, b]), s ∈ [a, b]
a
 b
then |F (x)(s)| ≤ |K(s, t)| |x(t)|dt
a
 b
≤ |K(s, t)| sup |x(t)|dt
a t
 b  b
= ||x|| |K(s, t)|dt ≤ sup |K(s, t)|dt||x||
a s a
3 
|F x(s)| b
||F || = sup ≤ sup |K(s, t)|dt; s ∈ [a, b]
x∈θ ||x|| s∈[a,b] a

Since |K(s, t)| is a continuous function of s defined over a compact


interval [a, b], ∃ a s0 ∈ [a, b] s.t.
|K(s0 , t)| = sup |K(s, t)|
s∈[a,b]
 b
Thus, ||F || = sup |K(s0 , t)|dt
x∈[a,b] a

4.3.7 Geometrical interpretation of norm of a linear functional


Consider in a normed linear space E over ( ) a linear functional f (x).

4+
The equation f (x) = c, where f (x) = i=1 ci xi , is called a hyperplane.
This is because in n-dimensional Euclidean space En , such an equation of
the form f (x) = c represents a n-dimensional plane.
Now, |f (x)| ≤ ||f || ||x||. If ||x|| ≤ 1, i.e., x lies in a unit ball, then
f (x) ≤ ||f ||. Thus the hyperplane f (x) = ||f || has the property that all
the unit balls ||x|| ≤ 1 lie completely to the left of this hyperplane (because
f (x) < ||f || holds for the points of the ball ||x|| < 1). The plane f (x) = ||f ||
is called the support of the ball ||x|| ≤ 1. The points on the surface of
the ball ||x|| = 1 lie on the hyperplane. All other points within the ball
||x|| = 1 lie on one side of the hyperplane. Such a hyperplane is also called
a supporting hyperplane.
Problems
1. Find the norm of the linear functional f defined on C([−2, 2]) by
 0  2
f (x) = x(t)dt − x(t)dt
−2 0

[Ans. ||f || = 4]
2. Let f be a bounded linear functional on a complex normed linear
space. Show that f , although bounded, is not linear (the bar denotes
the complex conjugate).

3. The space C  ([a, b]) is the normed linear space of all continuously
differentiable functions on J = [a, b] with norm defined by
||x|| = max |x(t)| + max |x (t)|
t∈J t∈J

Show that all the axioms of a norm are satisfied.


Show that f (x) = x (α), α = (a + b)/2 defines a bounded linear
functional on C  ([a, b]).
 1
st
4. Given F (x)(s) = x(t)dt, x ∈ C([0, 1]), s ∈ [0, 1], show that
0 2 − st
||F || = 1.
5. If X is a subspace of a vector space E over and f is a linear4
functional on E such that f (X) is not the whole of , show that 4
f (y) = 0 for all y ∈ X.

4.4 The Space of Bounded Linear


Operators
In what follows we want to show that the set of bounded linear operators
mapping a normed linear space (Ex , || · ||Ex ) into another normed linear
space (Ey , || · ||Ey ) forms again a normed linear space.
4.4.1 Definition: space of bounded linear operators
Let two bounded linear operators, A1 and A2 , map a normed linear
space (Ex , || · ||Ex ) into the normed linear space (Ey , || · ||Ey ).
We can define addition and scalar multiplication by
(A1 + A2 )x = A1 x + A2 x,
A1 (λx) = λAx for all scalars λ ∈ 4 (+ )
This set of linear operators forms a linear space denoted by (Ex → Ey ).
We would next show that (Ex → Ey ) is a normed linear space. Let us
define the norm of A as ||A|| = sup ||Ax||Ey . Then ||A|| ≥ 0. If ||A|| = 0,
||x||<1
i.e., if sup ||Ax||Ey = 0 then ||Ax||Ey = 0 for all x s.t. ||x||Ex ≤ 1.
||x||Ex ≤1
However because of the homogeneity of A, Ax = 0 for all x and therefore
A = 0.
Now, ||λA|| = sup ||λAx|| = |λ| sup ||Ax|| = |λ| ||A||
||x||≤1 ||x||≤1

||A + B|| = sup ||(A + B)x|| ≤ sup ||Ax|| + sup ||Bx||


||x||≤1 ||x||≤1 ||x||≤1

= ||A|| + ||B||
Thus, the space of bounded linear operators is a normed linear space.

4.4.2 Theorem
If Ey is complete, the space of bounded linear operators is also complete
and is consequently a Banach space.
Proof: Let us be given a Cauchy sequence {An } of linear operators.
Then with respect to the norm is a sequence of linear operators such that
||An −Am || → 0 as n, m → ∞. Hence, ||An x−Am x|| ≤ ||An −Am || ||x|| → 0,
as n, m → ∞, for any x.
Therefore the sequence {An x} of elements of Ey is a Cauchy sequence
for any fixed x. Now, since Ey is complete {An x} has some limit, y. Thus,
every y ∈ Ey is associated with some x ∈ Ex and we obtain some operator
A defined by the equation Ax = y. Such an operator A is additive and
homogeneous. Because

(i) A(x1 + x2 ) = lim An (x1 + x2 ) = lim An x1 + lim An x2


n→∞ n→∞ n→∞
= Ax1 + Ax2 , for x1 , x2 ∈ Ex
(ii) A(λx1 ) = lim An (λx1 ) = λ lim An x1
4 (+ )
n→∞ n→∞
= λAx1 , ∀ λ ∈

To show that A is bounded, we note that

||An − Am || → 0 as n → ∞

Hence ||An || − ||Am || → 0, i.e., {||An ||} is a Cauchy sequence and is


therefore bounded. Then there is a constant k s.t. ||An || ≤ k for all n.
Consequently, ||An x||Ey ≤ k||x||Ex for all n and hence

||Ax||Ey = lim ||An x||Ey ≤ k||x||Ex


n→∞

Hence, it is proved that A is bounded. Since in addition, A is additive


and homogeneous, A is a bounded linear operator.
Next, we shall prove that A is the limit of the sequence {An } in the
sense of norm convergence, in a space of linear operators.
Because of the convergence of {An x}, there is an index n0 for every
> 0 s.t.
||An+p x − An x||Ey < (4.18)
for n ≥ n0 , p ≥ 1 and all x with ||x|| ≤ 1.
Taking the limit in (4.18) as p → ∞, we get

||Ax − An x||Ey < for n ≥ n0 ,

and all x with ||x|| ≤ 1. Hence for n ≥ n0 ,

||An − A|| = sup ||(An − A)x||Ey <


||x||≤1

Consequently, A = lim An in the sense of norm convergence in a space


n→∞
of bounded linear operators and completeness of the space is proved.
We next discuss the composition of two bounded linear operators, each
mapping Ex → Ex , Ex being a normed linear space.
4.4.3 Theorem
4+
Let Ex be a normed linear space over ( ). If A, B ∈ (Ex −→ Ex ),
then
AB ∈ (Ex −→ Ex ) and ||AB|| ≤ ||A|| ||B||

Proof: Since A, B : (Ex −→ Ex ), we have

(AB)(αx + βy) = A(B(αx + βy)) = A(αBx + βBy)


= α(AB)x + β(AB)y ∀ x, y ∈ Ex , ∀ α, β ∈ 4 (+ )
Furthermore, A and B being bounded operators

xn −→ x ⇒ AB(xn ) = A(Bxn ) −→ ABx

showing that AB is bounded and hence continuous.

||(AB)(x)||Ex = ||A(Bx)||Ex ≤ ||A|| ||Bx||Ex ≤ ||A|| ||B|| ||x||Ex


||(AB)x||Ex
Hence, ||AB|| = sup ≤ ||A|| ||B||
x =θ ||x||Ex

4.4.4 Example
1. Let $A \in (\mathbb{R}^n \longrightarrow \mathbb{R}^m)$, where both $\mathbb{R}^n$ and $\mathbb{R}^m$ are normed by the $l_1$ norm. Then
\[
\|A\|_{l_1} = \max_{1 \le j \le n} \sum_{i=1}^{m} |a_{ij}|. \tag{4.19}
\]
Proof: For any $x \in \mathbb{R}^n$,
\[
\|Ax\|_{l_1} = \sum_{i=1}^{m} \Bigl| \sum_{j=1}^{n} a_{ij} x_j \Bigr|
\le \sum_{i=1}^{m} \sum_{j=1}^{n} |a_{ij}|\, |x_j|
\le \sum_{j=1}^{n} |x_j| \sum_{i=1}^{m} |a_{ij}|
\le \max_{1 \le j \le n} \Bigl( \sum_{i=1}^{m} |a_{ij}| \Bigr) \|x\|_1,
\]
or
\[
\|A\| = \sup_{x \ne \theta} \frac{\|Ax\|_1}{\|x\|_1} \le \max_{1 \le j \le n} \sum_{i=1}^{m} |a_{ij}|.
\]

We next have to show that there exists some $x \in \mathbb{R}^n$ for which the RHS in the above inequality is attained. Let $k$ be an index for which the maximum in (4.19) is attained; then
\[
\|Ae_k\|_1 = \sum_{i=1}^{m} |a_{ik}| = \max_{1 \le j \le n} \sum_{i=1}^{m} |a_{ij}|,
\]
where $e_k = (0, \ldots, 0, 1, 0, \ldots, 0)^T$ (with 1 in the $k$th place), i.e., the maximum in (4.19) is attained for the $k$th coordinate vector.
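A short numerical check of (4.19) (ours, assuming NumPy; the random matrix is arbitrary): the induced l1 norm of a matrix equals its maximum absolute column sum and is attained at the corresponding coordinate vector.

import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(5, 4))

col_sums = np.sum(np.abs(A), axis=0)     # column sums of |a_ij|
k = int(np.argmax(col_sums))

e_k = np.zeros(4)
e_k[k] = 1.0                             # k-th coordinate vector, ||e_k||_1 = 1
print(col_sums.max(), np.sum(np.abs(A @ e_k)))   # equal: the bound is attained
print(np.linalg.norm(A, 1))                       # NumPy's induced 1-norm agrees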
Problem
1. Let A ∈ ( 4n −→ 4m ) where 4n are normed by l∞ norm.

n
Show that (i) ||A||l∞ = max |aij |
1≤i≤m
j=1

(ii) ||A||l2 = λ where λ is the maximum eigenvalue of AT A.
[Hint. ||Ax||2 = (Ax)T Ax = (xT AT Ax)]
2. (Limaye [33]) Let A = (aij ) be an infinite matrix with scalar entries
and

⎧ ∞
1/r

⎪ 

⎪ |aij |r if p = 1, 1 ≤ r < ∞

⎪ sup

⎪ j=1,2,...

⎪ i=1
⎨ ⎛ ⎞1/q
 ∞
αp,r = 1 1

⎪ sup ⎝ |aij |q ⎠ if 1 < p ≤ ∞, + = 1, r = ∞

⎪ p q

⎪ i=1,2,...


j=1


⎩ sup |aij | if p = 1, r = ∞
i,j=1,2,...

If αpr < ∞ then show that A defines a continuous linear map from
lp to lr and its operator norm equals αp,r .


[Note that, α1,1 = α1 = sup |aij | and α∞,∞ = α∞ =
j=1,2,...
⎧ ⎫ i=1
⎨∞ ⎬
sup |aij | ].
i=1,2,... ⎩ ⎭
j=1

4.5 Uniform Boundedness Principle


4.5.1 Definition: uniform operator convergence
Let {An } ⊂ (Ex −→ Ey ) be a sequence of bounded linear operators,
where Ex and Ey are complete normed linear spaces. {An } is said to
converge uniformly if {An } converges in the sense of norm, i.e.,
||An − Am || → 0 as n, m −→ ∞.
4.5.2 Lemma
{An } converges uniformly if and only if {An } converges uniformly for
every x ∈ Ex .
Proof: Let us suppose that {An } converges uniformly, i.e., ||An −Am || −→
0 as n, m → ∞.
||(An − Am )x||
Hence, sup → 0 as n, m → ∞.
x =θ ||x||
or in other words, given > 0, there exists an r > 0, s.t.

||An x − Am x|| < ·r
r
where x ∈ B(0, r) and n, m ≥ n0 ( /r).
Hence the uniform convergence of {An x} for any x ∈ B(0, r) is
established. Conversely, let us suppose that {An x} is uniformly convergent
for any x ∈ B(0, 1). Hence, sup ||An x − Am x|| → 0 as n, m → ∞. Or, in
||x||≤1
other words, ||An − Am || → 0 as n, m → ∞. Using theorem 4.4.2, we can
say
An −→ A ∈ (Ex → Ey ) as n → ∞

4.5.3 Definition: pointwise convergence


A sequence of bounded linear operators {An } is said to converge
pointwise to a linear operator A if, for every fixed x, the sequence {An x}
converge to Ax.
4.5.4 Lemma
If {An } converge uniformly to A then {An } converge pointwise.
||An − A)x||
Proof: ||An − A|| → 0 as n → ∞ ⇒ sup → 0 as n → ∞.
x =θ ||x||
Hence, if ||x|| ≤ r, for /r > 0, ∃ as n0 s.t. for all n ≥ n0

||An x − Ax|| < ||x|| ≤ · r = ,
r r
i.e., {An } is pointwise convergent.
On the other hand if {An } is pointwise convergent, {An } is not
necessarily uniformly convergent as is evident from the example below.

Let H be a Hilbert space with basis {e1 , e2 , . . . , en } and for x ∈ H let An


denote the projection of x on Hn , the n-dimensional subspace spanned
by e1 , e2 , . . . , en . Then

n ∞

An x = x, ei ei −→ x, ei ei = x
i=1 i=1

for every x ∈ H. Hence, An → I is the sense of pointwise convergence.


On the other hand, for < 1, any n and p > 0,

||An+p en+1 − An en+1 || = ||en+1 − 0|| = ||en+1 || = 1 >

Hence, the uniform convergence of the sequence {An } in the unit ball
||x|| < 1 of the space H does not hold.
4.5.5 Theorem
If the spaces Ex and Ey are complete, then the space of bounded linear
operators is also complete in the sense of pointwise convergence.
Proof: Let {An } of bounded linear operators converge pointwise. Since
{An } is a Cauchy sequence for every x, there exists a limit y = lim An x
n→∞
for every x.
Since Ey is complete y ∈ Ey . This asserts the existence of an operator
A such that Ax = y. That A is linear can be shown as follows
for, x1 , x2 ∈ Ex ,
A(x1 + x2 ) = lim An (x1 + x2 ) = lim (An x1 + An x2 )
n→∞ n→∞
= Ax1 + Ax2 , which shows A is additive.

Again, for λ ∈ 4(+),


A(λx) = lim An (λx) = λ lim An x
n→∞ n→∞
= λAx, showing A is homogeneous.

That A is bounded can be proved by making an appeal to the Banach-Steinhaus theorem (4.5.6).
4.5.6 Uniform boundedness principle
The uniform boundedness principle was discovered by Banach and Steinhaus. It is one of the basic props of functional analysis. Lebesgue had earlier encountered the principle in his investigations of Fourier series; Banach and Steinhaus isolated and developed it as a general principle.
4.5.7 Theorem (Banach and Steinhaus)
If a sequence of bounded linear operators is a Cauchy sequence at every
point x of a Banach space Ex , the sequence {||An ||} of these operators is
uniformly bounded.

Proof: Let us suppose the contrary. We show that the assumption implies
that the set {||An x||} is not bounded on any closed ball B(x0 , ).
Any x ∈ B(x0 , ) can be written as

x0 + ξ for any ξ ∈ Ex
||ξ||
In fact, if ||An x|| ≤ C for all n and if all x is in some ball B(x0 , ), then
 

||An x0 + ξ || ≤ C
||ξ||

or, ||An ξ|| − ||An x0 || ≤ C
||ξ||
C + ||An x0 ||
or, ||An ξ|| ≤ ||ξ||

Since the norm sequence {||An x0 ||} is bounded due to {An x0 } being a
convergent sequence, it follows that
C + ||An x0 ||
||An ξ|| ≤ C1 ||ξ|| where C1 = .

The above inequality yields
||An ξ||
||An || = sup ≤ C1
ξ =0 ||ξ||

But this contradicts the hypothesis, because


||An x|| ≤ C ∀ x ∈ B(x0 , ) ⇒ ||An || ≤ C1

Next, let us suppose $B(x_0, \varepsilon_0)$ is any closed ball in $E_x$. The sequence $\{\|A_n x\|\}$ is not bounded on it. Hence, there is an index $n_1$ and an element $x_1 \in B(x_0, \varepsilon_0)$ s.t. $\|A_{n_1} x_1\| > 1$. By continuity of the operator $A_{n_1}$, the above inequality then holds in some closed ball $\overline{B}_1(x_1, \varepsilon_1) \subset B(x_0, \varepsilon_0)$ (see Fig. 4.1, which sketches the nested balls inside the ball of radius $\varepsilon_0$ about $x_0$). The sequence $\{\|A_n x\|\}$ is again not bounded on $\overline{B}_1(x_1, \varepsilon_1)$, and therefore there is an index $n_2 > n_1$ and an element $x_2 \in \overline{B}_1(x_1, \varepsilon_1)$ s.t. $\|A_{n_2} x_2\| > 2$. Since $A_{n_2}$ is continuous, the above inequality must hold in some ball $\overline{B}_2(x_2, \varepsilon_2) \subseteq \overline{B}_1(x_1, \varepsilon_1)$, and so on. If we continue this process and let $\varepsilon_n \to 0$ as $n \to \infty$, there is a point $x$ belonging to all the balls $\overline{B}_n(x_n, \varepsilon_n)$. At this point
\[
\|A_{n_k} x\| \ge k,
\]



which contradicts the hypothesis that {An x} converges for all x ∈ Ex .


Hence the theorem.
Now we revert to the operator

Ax = lim An x
n→∞

The inequality ||An x|| ≤ M ||x||, n = 1, 2, . . . holds good.


Now, given > 0, there exists n0 (> 0) s.t. for n ≥ n0

||Ax|| ≤ ||An x|| + ||(A − An )x|| ≤ (M + )||x|| for n ≥ n0

Making → 0, we get
||Ax|| ≤ M ||x||

4.5.8 Theorem (uniform boundedness principle)


Let Ex and Ey be two Banach spaces. Let {An } be a sequence of
bounded linear operators mapping Ex −→ Ey s.t.
(i) {An x} is a bounded subset of Ey for each x ∈ Ex ,
Then the sequence {||An ||} of norms of these operators is bounded.
Proof: Let Sk = {x : ||An x|| ≤ k, x ∈ Ex }.
Clearly, Sk is a subset of Ex .
Since {||An x||} is bounded for any x, Ex can be expressed as

*
Ex = Sk
k=1

Since Ex is a Banach space, it follows by the Baire’s Category theorem


1.3.7 that there exists at least one Sk0 with non-empty interior and thus
contains a closed ball B0 (x0 , r0 ) with centre x0 and radius r0 > 0.

Thus ||An x0 || ≤ k0
ξ
Now, x = x0 + r0 would belong to the B0 (x0 , r0 ) for every ξ ∈ Ex .
||ξ||
0 0
0 An ξ 0
Thus, 0 r
0 0 ||ξ|| + A 0
n 0 0 ≤ k0
x

r0
Hence, ||An ξ|| − ||An x0 || ≤ k0
||ξ||
k0 + ||An x0 || 2k0
or ||An ξ|| ≤ ||ξ|| ≤ ||ξ||
r0 r0
||An ξ|| 2k0
Thus, ||An || = sup ≤
ξ =0 ||ξ|| r0
Hence, {||An ||} is bounded.

4.5.9 Remark
The theorem does not hold unless Ex is a Banach space, as it follows
from the following example.

4.5.10 Examples
1. Let Ex = {x = {xj } : only a finite number of xj ’s are non-zero}.
||x|| = sup |xj |
j
Consider the mapping, which is a linear functional defined by fn (x) =

n
xj then
j=1

 
 
 n   n m

|fn (x)| =  xj  ≤ |xj | ≤ |xj |, since xi = 0 for i > m.
 j=1  j=1 j=1


m
|fn (x)| ≤ |xj | ≤ m||x||
j=1

Hence, {|fn (x)|} is bounded.


 
  
 n  n

Now, |fn (x)| =  
xj  ≤ |xj | ≤ n||x||.
j=1  j=1

Hence ||fn || ≤ n
Next, consider the element ξ = {ξ1 , ξ2 , . . . , ξi , . . .}
where ξi = 1 1 ≤ i ≤ n
= 0 i > n.
||ξ|| = 1

f (ξi ) = ξi = n = n||ξ||
n

|fn (ξ)|
=n
||ξ||

Thus {fn (x)} is bounded but {||fn ||} is not bounded. This is because
Ex = {x = {xj } : only for a finite number of xj ’s is non-zero} is not a
Banach space.


2. Let Ex be the set of polynomials x = x(t) = pn tn where pn = 0 for
n=0
n > Nx .
Let ||x|| = max[|pn |, n = 1, 2, . . .]


n−1
Let fn (x) = pk . The functionals fn are continuous linear functionals
k=0
on Ex . Moreover, for every x = p0 + p1 t + · · · + pm tm , it is clear that for
every n, |fn (x)| ≤ (m + 1)||x||, so that {|fn (x)|} is bounded. For fn (x) we
choose x(t) = 1 + t + · · · + tn .
|fn (x)|
Now, ||fn || = sup ≥ n since ||x|| = 1 and |fn (x)| = n.
x =θ ||x||
Hence {||fn ||} is unbounded.
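Example 1 above can also be traced numerically (our sketch, assuming NumPy; the particular fixed element x is arbitrary): {f_n(x)} stays bounded for each fixed finitely supported x, while the ratios |f_n(ξ)|/‖ξ‖ grow like n, so {‖f_n‖} is unbounded.

import numpy as np

def f(n, x):
    # f_n(x) = x_1 + ... + x_n for a finitely supported sequence x stored as an array
    return float(np.sum(x[:n]))

x = np.pad(np.array([1.0, -2.0, 3.0, 0.5]), (0, 100))   # zero beyond the 4th entry
print([f(n, x) for n in (1, 5, 10, 100)])                # bounded in n for this fixed x

# ||f_n|| >= |f_n(xi)| / ||xi|| = n for xi = (1, ..., 1, 0, ...) with n ones and sup-norm 1
for n in (1, 5, 10, 100):
    xi = np.ones(n)
    print(n, f(n, xi) / np.max(np.abs(xi)))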

4.6 Some Applications


4.6.1 Lagrange's interpolation polynomial
Lagrange's interpolation formula finds an approximating polynomial form of a given function on a given interval, when the values of the function are known at (not necessarily equidistant) interpolating points within the said interval. In what follows, we want to show that although the Lagrangian operator converges pointwise to the identity operator, it is not uniformly convergent.
For any function f defined on the interval [0, 1] and any partition $0 \le t_1 < t_2 < \cdots < t_n \le 1$ of [0, 1], there is a polynomial of degree $(n-1)$ which interpolates to f at the given points, i.e., takes the values $f(t_i)$ at $t = t_i$, $i = 1, 2, \ldots, n$. This is called the Lagrangian interpolation polynomial and is given by
\[
L_n f = \sum_{k=1}^{n} w_k^{(n)}(t)\, f(t_{nk}), \tag{4.20}
\]
where
\[
w_k^{(n)}(t) = \frac{p_n(t)}{p_n'(t_{nk})\,(t - t_{nk})} \tag{4.21}
\]
and
\[
p_n(t) = \prod_{k=1}^{n} (t - t_{nk}). \tag{4.22}
\]

4.6.2 Theorem
We are given some points on the segment [0, 1] forming the infinite
triangular matrix,
        ⎡ t_1^1                          ⎤
        ⎢ t_1^2   t_2^2                  ⎥
    T = ⎢ t_1^3   t_2^3   t_3^3          ⎥ .                                   (4.23)
        ⎣  ···     ···     ···    ···    ⎦
For a given function f(t) defined on [0, 1], we construct the Lagrangian
interpolation polynomial L_n f whose partition points are the points of the nth
row of (4.23),

    L_n f = Σ_{k=1}^{n} w_k^{(n)}(t) f(t_k^{n}),

    where w_k^{(n)}(t) = p_n(t) / [ p_n′(t_k^{n}) (t − t_k^{n}) ],   p_n(t) = ∏_{k=1}^{n} (t − t_k^{n}).

For every choice of the matrix (4.23), there is a continuous function f (t)
s.t. Ln f does not uniformly converge to f (t) as n → ∞.
Proof: Let us consider Ln as an operator mapping the function f (t) ∈
C([0, 1]) into the elements of the same space and put


    λ_n = max_t λ_n(t),   where λ_n(t) = Σ_{k=1}^{n} |w_k^{(n)}(t)|.

Now,  ||L_n f|| = max_t | Σ_{k=1}^{n} w_k^{(n)}(t) f(t_k^{n}) |

              ≤ max_t Σ_{k=1}^{n} |w_k^{(n)}(t)| · max_k |f(t_k^{n})|

              ≤ λ_n ||f||,

since on C([0, 1]), ||f|| = max_t |f(t)|.  Hence

    ||L_n|| = sup_{f≠θ} ||L_n f|| / ||f|| ≤ λ_n.

Since λ_n(t) is a continuous function defined on the closed and bounded
set [0, 1], the supremum is attained; in fact, ||L_n|| = λ_n.
On the other hand, the Bernstein inequality (see Natanson [40])

    λ_n > (ln n) / (8√π)

holds.
Consequently ||Ln || −→ ∞ as n → ∞.
This proves the said theorem, because if Ln f −→ f uniformly for all
f (t) ∈ C([0, 1]), then the norm ||Ln || must be bounded.
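As a numerical illustration (not part of the proof above), one can compute λ_n = max_t Σ_k |w_k^{(n)}(t)| for a concrete node matrix. The sketch below uses equally spaced nodes on [0, 1] and a sampling grid, both of which are illustrative assumptions; the computed values grow without bound, consistent with ||L_n|| = λ_n → ∞.

```python
import numpy as np

def lebesgue_constant(nodes, num_samples=2001):
    """lambda_n = max_t sum_k |w_k(t)| for the Lagrange basis on the given nodes."""
    t = np.linspace(0.0, 1.0, num_samples)
    lam = np.zeros_like(t)
    for k, tk in enumerate(nodes):
        others = np.delete(nodes, k)
        # w_k(t) = prod_{j != k} (t - t_j) / (t_k - t_j)
        wk = np.prod((t[:, None] - others) / (tk - others), axis=1)
        lam += np.abs(wk)
    return lam.max()

for n in (5, 10, 15, 20):
    nodes = np.linspace(0.0, 1.0, n)      # nth row of the node matrix: equispaced points
    print(n, lebesgue_constant(nodes))    # grows rapidly with n
```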
4.6.3 Divergence of Fourier series of continuous functions
In section 7 of Chapter 3 we introduced Fourier series and Fourier
coefficients in a Hilbert space. In L²([−π, π]) the orthonormal set can be
taken as

    e_n(t) = e^{int} / √(2π),   n = 0, ±1, ±2, . . .

If x(t) ∈ L²([−π, π]) then x(t) can be written as

    Σ_{n=−∞}^{∞} x_n e_n

    where x_n = ⟨x, e_n⟩ = (1/√(2π)) ∫_{−π}^{π} x(t) e^{−int} dt

              = (1/√(2π)) ∫_{−π}^{π} x(t) cos nt dt − i (1/√(2π)) ∫_{−π}^{π} x(t) sin nt dt

              = c_n − i d_n   (say)

Then, x_{−n} = c_n + i d_n, and

    Σ_{n=−∞}^{∞} x_n e_n = x_0 e_0 + Σ_{n=1}^{∞} (2c_n cos nt)/√(2π) + Σ_{n=1}^{∞} (2d_n sin nt)/√(2π)        (4.24)

Here, c_n and d_n are called the Fourier coefficients of x(t).


Note 4.6.1. It may be noted that the completeness of {e_n} is equivalent
to the statement that each x ∈ L² has the Fourier expansion

    x(t) = (1/√(2π)) Σ_{n=−∞}^{∞} x_n e^{int}                                   (4.25)

It must be emphasized that this expansion is not to be interpreted as
saying that the series converges pointwise to the function.
One can only conclude that the partial sums in (4.25), i.e., the vectors

    u_n(t) = (1/√(2π)) Σ_{k=−n}^{n} x_k e^{ikt}                                 (4.26)

converges to the vector x in the sense of L2 , i.e.,

||un (t) − x(t)|| −→ 0 as n → ∞

This situation is often expressed by saying that x is the limit in the


mean of un ’s.
4.6.4 Theorem
Let E = {x ∈ C([−π, π]) : x(π) = x(−π)} with the sup norm. Then
the Fourier series of every x in a dense subset of E diverges at 0. We recall
(see equation (4.25)) that

    x(t) = x_0 + Σ_{n=1}^{∞} (2c_n cos nt)/√(2π) + Σ_{n=1}^{∞} (2d_n sin nt)/√(2π)       (4.27)

    where x_0 = (1/√(2π)) ∫_{−π}^{π} x(t) dt                                             (4.28)

          c_n = ∫_{−π}^{π} x(t) cos nt dt                                                (4.29)

          d_n = ∫_{−π}^{π} x(t) sin nt dt                                                (4.30)

For x(t) ∈ E,
    ||x|| = max_{−π≤t≤π} |x(t)|                                                          (4.31)

Let us take the operator A_n = u_n, where u_n(x) is the value at t = 0 of the
nth partial sum of the Fourier series of x. Since at t = 0 the sine terms are
zero and the cosines are one, we see from (4.28) and (4.29) that

    u_n(x) = (1/√(2π)) [ x_0 + 2 Σ_{m=1}^{n} c_m ]

           = (1/√(2π)) ∫_{−π}^{π} x(t) · 2 ( 1/2 + Σ_{m=1}^{n} cos mt ) dt               (4.32)

Now,  2 sin(t/2) Σ_{m=1}^{n} cos mt = Σ_{m=1}^{n} 2 sin(t/2) cos mt

                                    = Σ_{m=1}^{n} [ sin(m + 1/2)t − sin(m − 1/2)t ]

                                    = − sin(t/2) + sin(n + 1/2)t

It may be noted that, except for the end terms, all the intermediate
terms in the summation cancel in pairs.
Dividing both sides by sin(t/2) and adding 1 to both sides, we have

    1 + 2 Σ_{m=1}^{n} cos mt = sin(n + 1/2)t / sin(t/2)

Consequently, the expression for u_n(x) can be written in the simple form:

    u_n(x) = (1/√(2π)) ∫_{−π}^{π} x(t) D_n(t) dt,   where D_n(t) = sin(n + 1/2)t / sin(t/2)

It should be noted that un (x) is a linear functional in x(t).


We next show that u_n is bounded. It follows from the above integral
representation that

    |u_n(x)| ≤ (1/√(2π)) ∫_{−π}^{π} |x(t)| |D_n(t)| dt

             ≤ (1/√(2π)) [ ∫_{−π}^{π} |D_n(t)| dt ] ||x||

Therefore,  ||u_n|| = sup_{x≠θ} |u_n(x)| / ||x|| ≤ (1/√(2π)) ∫_{−π}^{π} |D_n(t)| dt = (1/√(2π)) ||D_n(t)||_1

where || · ||1 denotes L1 -norm.
Actually the equality sign holds, as we shall prove.
Let us write |D_n(t)| = y(t)D_n(t), where y(t) = +1 at every t at which
D_n(t) ≥ 0 and y(t) = −1 otherwise. y(t) is not continuous, but for any given
ε > 0 it may be modified to a continuous x of norm 1 such that for this x
we have

    | u_n(x) − (1/√(2π)) ∫_{−π}^{π} |D_n(t)| dt | = (1/√(2π)) | ∫_{−π}^{π} (x(t) − y(t)) D_n(t) dt | < ε

It follows that

    ||u_n|| = (1/√(2π)) ∫_{−π}^{π} |D_n(t)| dt

We next show that the sequence {||un ||} is unbounded. Since sin u ≤ u,
0 ≤ u ≤ π, we note that
    ∫_{−π}^{π} | sin(n + 1/2)t / sin(t/2) | dt ≥ 4 ∫_{0}^{π/2} | sin(2n + 1)u / u | du

        = 4 ∫_{0}^{(2n+1)π/2} |sin v| / v dv

        = 4 Σ_{k=0}^{2n} ∫_{kπ/2}^{(k+1)π/2} ( |sin v| / v ) dv

        ≥ 4 Σ_{k=0}^{2n} 1/((k + 1)π/2) ∫_{kπ/2}^{(k+1)π/2} |sin v| dv

        = (8/π) Σ_{k=0}^{2n} 1/(k + 1) → ∞ as n → ∞,

because the harmonic series Σ_{k=1}^{∞} 1/k diverges.

Hence {||u_n||} is unbounded. Since E is complete, the uniform boundedness
principle implies that {|u_n(x)|} cannot be bounded for every x ∈ E. Hence
there must be an x̃ ∈ E such that {|u_n(x̃)|} is unbounded. This implies
that the Fourier series of that x̃ diverges at t = 0.
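A short numerical check of the key estimate (a sketch under simple quadrature assumptions; grid sizes are illustrative): the norms ||u_n|| = (1/√(2π)) ∫_{−π}^{π} |D_n(t)| dt indeed grow without bound (like a constant multiple of ln n).

```python
import numpy as np

def dirichlet_l1_norm(n, num_points=200001):
    """Approximate (1/sqrt(2*pi)) * integral of |sin((n+1/2)t)/sin(t/2)| over [-pi, pi]."""
    t = np.linspace(-np.pi, np.pi, num_points)
    with np.errstate(divide='ignore', invalid='ignore'):
        Dn = np.sin((n + 0.5) * t) / np.sin(t / 2)
    # Removable singularity at t = 0: D_n(0) = 2n + 1.
    Dn = np.where(np.abs(t) < 1e-12, 2 * n + 1.0, Dn)
    return np.trapz(np.abs(Dn), t) / np.sqrt(2 * np.pi)

for n in (1, 10, 100, 1000):
    print(n, dirichlet_l1_norm(n))   # unbounded, growing roughly logarithmically in n
```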
Problems [4.5 & 4.6]
1. Let Ex be a Banach space and Ey a normed linear space. If {Tn } is
a sequence in (Ex → Ey ) such that T x = limn→∞ Tn x exists for each
x in Ex , prove that T is a continuous linear operator.
2. Let Ex and Ey be normed linear spaces and A : Ex → Ey be a linear
operator with the property that the set {||A x_n|| : n ∈ ℕ} is bounded
whenever x_n → θ in Ex. Prove that A ∈ (Ex → Ey).
3. Given that Ex and Ey are Banach spaces and A : Ex → Ey is a
bounded linear operator, show that either A(Ex ) = Ey or is a set of
the first category in Ey .
[Hint: A set X ⊆ Ex is said to be of the first category in Ex if it is
the union of countably many nowhere dense sets in Ex ].
4. If Ex is a Banach space, and {fn (x)} a sequence of continuous linear
functionals on Ex such that {|fn (x)|} is bounded for every x ∈ Ex ,
then show that the sequence {||fn ||} is bounded.
[Hint: Consult theorem 4.5.7]
5. Let Ex and Ey be normed linear spaces and let E be a bounded, complete,
convex subset of Ex. A mapping A from Ex to Ey is called affine if

A(λa + (1 − λ)b) = λA(a) + (1 − λ)A(b) for all 0 < λ < 1,


and a, b ∈ E.

Let F be a set of continuous affine mappings from Ex to Ey . Then


show that either the set {||A(x)|| : A ∈ F} is unbounded for each x
in some dense subset of E or else F is uniformly bounded in Ey .
6. Let Ex and Ey be Banach spaces and An ∈ (Ex −→ Ey ), n = 1, 2, . . ..
Then show that there is some A ∈ (Ex −→ Ey ) such that An x → Ax
for every x ∈ Ex if and only if {An } converges for every x in some
set whose span is dense in Ex and the set {||An || : n = 1, 2, . . .} is
bounded.
7. Show that there exists a dense set Ω of Ex = {x ∈ C([−π, π]):
x(π) = x(−π)} such that the Fourier series of every x ∈ Ω diverges
at every rational number in [−π, π].

4.7 Inverse Operators


In what follows, we introduce the notion of the inverse of a linear operator
and investigate the conditions for its existence and uniqueness. This is,
in other words, a search for the conditions under which a given system of
equations has a solution and, if a solution exists, under which it is unique.
4.7.1 Definition: domain of an operator
Let a linear operator A map a subspace E of a Banach space Ex into
a Banach space Ey . The subspace E on which the operator A is defined is
called the domain of A and is denoted by D(A).
4.7.2 Example
Let A : C²([0, 1]) −→ C([0, 1]), where Au = f, A = −d²/dx², u(0) = u(1) = 0
and f ∈ C([0, 1]).

Here, D(A) = {u(x) : u(x) ∈ C²([0, 1]), u(0) = u(1) = 0}
4.7.3 Definition: range of an operator
Let a linear operator A map D(A) ⊂ Ex −→ Ey, where Ex and Ey are
Banach spaces. The range of A is the subspace of Ey onto which D(A) is
mapped, and it is denoted by R(A).
4.7.4 Example
Let (Ax)(s) = y(s) = ∫_0^s K(s, t)x(t) dt, where x(t) ∈ C([0, 1]) and K(s, t) is
continuous on [0, 1] × [0, 1].
R(A) = {y : y ∈ C([0, 1]), y = Ax}

4.7.5 Definition: null space of a linear operator


The null space of a linear operator A is defined as the set of elements
of E which are mapped into the null element, and it is denoted by N(A).
Thus N(A) = {x ∈ E : Ax = θ}.
4.7.6 Example
Let A : ℝ² −→ ℝ²,

    A = ⎡ 2  1 ⎤
        ⎣ 4  2 ⎦

    N(A) = { x ∈ ℝ² : x_1 = −(1/2) x_2 },   where x = (x_1, x_2)ᵀ

4.7.7 Definition: left inverse and right inverse of a linear


operator
A linear continuous operator B is said to be a left inverse of A if
BA = I.
A linear continuous operator C is said to be a right inverse of A if
AC = I.

4.7.8 Lemma
If A has a left inverse B and a right inverse C, then B = C.
For B = B(AC) = (BA)C = C.
If A has a left inverse as well as a right inverse, then A is said to have
an inverse and the inverse operator is denoted by A−1 .
Thus if A−1 exists, then by definition A−1 A = AA−1 = I.
4.7.9 Inverse operators and algebraic equations
Let Ex and Ey be two Banach spaces and A be an operator s.t.
A ∈ (Ex −→ Ey ).
We want to know when one can solve

Ax = y (4.33)

Here y is a known element of the linear space Ey and x ∈ Ex is unknown.


If R(A) = Ey, we can solve (4.33) for each y ∈ Ey.
If N(A) consists of only one element, then the solution is unique. Thus
if R(A) = Ey and N (A) = {θ}, we can assign to each y ∈ Ey the
unique solution of (4.33). This assignment gives the inverse operator
A−1 of A. We next show that A−1 if it exists is linear. Let x =
A−1 (y1 + y2 ) − A−1 y1 − A−1 y2 . Then A being linear

Ax = AA−1 (y1 + y2 ) − A · A−1 y1 − A · A−1 y2


= y1 + y2 − y1 − y2 = θ

Thus, x = A−1 Ax = A−1 θ = θ, i.e., A−1 (y1 + y2 ) = A−1 y1 + A−1 y2 ,


proving A−1 to be additive. Analogously the homogeneity of A−1 is
established.
Note 4.7.1. It may be noted that the continuity of the operator A in
some topology does not necessarily imply the continuity of its inverse,
i.e., an operator inverse to a bounded linear operator is not necessarily a
bounded linear operator. In what follows we investigate sufficient conditions
for the existence of the inverse to a linear operator.
4.7.10 Theorem (Banach)
Let a linear operator A map a normed linear (Banach) space Ex onto a
normed linear (Banach) space Ey , satisfying for every x ∈ Ex the condition

||Ax|| ≥ m||x||, m > 0 (4.34)

m, some constant. Then the inverse bounded linear operator A−1 exists.
Proof: The condition (4.34) implies that A maps Ex onto Ey in a one-
to-one fashion. If Ax1 = y and Ax2 = y, then A(x1 − x2 ) = θ yields
m||x_1 − x_2|| ≤ ||A(x_1 − x_2)|| = 0, whence x_1 = x_2. Hence, there is a
linear operator A⁻¹. The operator is bounded, as is evident from (4.34):

    ||A⁻¹y|| ≤ (1/m) ||A · A⁻¹y|| = (1/m) ||y||,
for every y ∈ Ey .
4.7.11 Theorem (Banach)
Let A and B be two bounded linear operators mapping a normed linear
space E into itself, so that A and B are conformable for multiplication.
Then
||AB|| ≤ ||A|| ||B||.

If further {An } → A and {Bn } → B as n → ∞, then An Bn −→ AB as


n → ∞.
Proof: For any x ∈ E,

||ABx|| ≤ ||A|| ||Bx|| ≤ ||A|| ||B|| ||x||


or,  ||AB|| = sup_{x≠θ} ||ABx|| / ||x|| ≤ ||A|| ||B||.

Hence ||AB|| ≤ ||A|| ||B||.
Now, since {A_n} and {B_n} are bounded linear operators,
    ||A_n B_n − AB|| ≤ ||A_n B_n − A_n B|| + ||A_n B − AB||
                    ≤ ||A_n|| ||B_n − B|| + ||A_n − A|| ||B||
                    → 0 as n → ∞, since A_n → A and B_n → B as n → ∞.

4.7.12 Theorem
Let a bounded linear operator A map E into E and let ||A|| ≤ q < 1.
Then the operator I +A has an inverse, which is a bounded linear operator.
Proof: In the space of operators with domain E and range as well in E,
we consider the series

I − A + A2 − A3 + · · · + (−1)n An + · · · (4.35)

Since ||A2 || ≤ ||A|| · ||A|| = ||A||2 and analogously ||An || ≤ ||A||n , it


follows for the partial sums Sn of the series (4.35), that

||Sn+p − Sn || = ||(−1)n+1 An+1 + (−1)n+2 An+2 + · · · + (−1)n+p An+p ||


≤ ||An+1 || + ||An+2 || + · · · + ||An+p ||
    ≤ q^{n+1} + q^{n+2} + · · · + q^{n+p}

    = q^{n+1} (1 − q^p)/(1 − q) → 0 as n → ∞, since p > 0, 0 < q < 1.

Hence, {Sn } is a Cauchy sequence and the space of operators being


complete, the sequence {Sn } converges to a limit.
Let S be the sum of the series.

Then   S(I + A) = lim_{n→∞} S_n (I + A)

                = lim_{n→∞} (I + A)(I − A + A² − · · · + (−1)ⁿAⁿ)

                = lim_{n→∞} ( I + (−1)ⁿ A^{n+1} ) = I,

since ||A^{n+1}|| ≤ q^{n+1} → 0.  Hence S = (I + A)⁻¹.
Let x1 = (I + A)−1 y1 , x2 = (I + A)−1 y2 , x1 , x2 , y1 , y2 ∈ E
Then y1 + y2 = (I + A)(x1 + x2 )
or, x1 + x2 = (I + A)−1 y1 + (I + A)−1 y2 = (I + A)−1 (y1 + y2 ).
Hence, S is a linear operator. Moreover,
    ||S|| ≤ Σ_{n=0}^{∞} ||Aⁿ|| ≤ Σ_{n=0}^{∞} qⁿ = 1/(1 − q),   0 < q < 1        (4.36)

Then (I + A)−1 is a bounded linear operator.
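A minimal numerical sketch of theorem 4.7.12 (the matrix, number of terms, and tolerance below are illustrative assumptions, not part of the theorem): when ||A|| ≤ q < 1, the partial sums of I − A + A² − · · · converge to (I + A)⁻¹.

```python
import numpy as np

def neumann_inverse(A, terms=60):
    """Approximate (I + A)^{-1} by the partial sums I - A + A^2 - ... (valid when ||A|| < 1)."""
    n = A.shape[0]
    S = np.eye(n)
    term = np.eye(n)
    for _ in range(terms):
        term = -term @ A          # next term (-1)^k A^k
        S += term
    return S

A = np.array([[0.2, 0.1],
              [0.0, 0.3]])
print(np.linalg.norm(A, 2))                           # spectral norm, well below 1
S = neumann_inverse(A)
print(np.allclose(S @ (np.eye(2) + A), np.eye(2)))    # True: S approximates (I + A)^{-1}
```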


4.7.13 Theorem
Let A ∈ (Ex −→ Ey), let A have a bounded inverse with ||A⁻¹|| ≤ α, and let
C be such that ||A − C|| ≤ β with αβ < 1. Then C has a bounded inverse and

    ||C⁻¹|| ≤ α / (1 − αβ)

Proof: Since ||I − A−1 C|| = ||A−1 (A − C)|| ≤ αβ < 1 and A−1 C =
I − (I − A−1 C), it follows from theorem 4.7.12 that A−1 C has a bounded
inverse and hence C has a bounded inverse.

Hence,  ||C⁻¹|| = ||(A⁻¹C)⁻¹ A⁻¹|| = ||(I − (I − A⁻¹C))⁻¹ A⁻¹||

              ≤ ||(I − (I − A⁻¹C))⁻¹|| ||A⁻¹||

              ≤ α · Σ_{n=0}^{∞} (αβ)ⁿ,   using (4.35) and noting that ||I − A⁻¹C|| ≤ αβ,

              = α / (1 − αβ)

4.7.14 Example
Consider the integral operator
    Cx = x(s) − ∫_0^1 K(s, t)x(t) dt                                            (4.36′)

with continuous kernel K(s, t) which maps the space C([0, 1]) into C([0, 1]).
Let K_0(s, t) be a degenerate kernel close to K(s, t); that is, K_0(s, t) is
of the form Σ_{i=1}^{n} a_i(s) b_i(t). In such a case, the equation

    Ax = x(s) − ∫_0^1 K_0(s, t)x(t) dt = y                                      (4.37)

can be reduced to a system of algebraic equations and finding the solution


of equation (4.37) can be reduced to finding the solution of the concerned
algebraic equations. Let us assume that equation (4.37) has a solution.
In order to know whether the integral equation
    Cx = y                                                                      (4.38)
has a solution, we frame the equation (4.37) in such a manner that
    w = max_{t,s} |K(t, s) − K_0(t, s)| < 1/r                                   (4.39)
where r = ||R||, R = (r_{ij}) being the matrix associated with the solution
x_0(s) = Ry of the linear algebraic system generated from equation (4.37).
It follows from (4.38), (4.37) and (4.39) that
    ||C − A|| ≤ w < 1/r                                                         (4.40)
It follows from theorem 4.7.13 that equation (4.38) with the continuous
kernel has a solution; if x(t) is the solution, then
    ||x(t) − x_0(t)|| ≤ r² w / (1 − wr)
The above inequality gives an estimate of how much the solution of
equation (4.38) differs from the solution of (4.37), the explicit form of the
solution of (4.38) not being known.
Finally, we obtain the following theorem.
4.7.15 Theorem (Banach)
If a bounded linear operator A maps the whole of the Banach space Ex
onto the whole of the Banach space Ey in a one-to-one manner, then there
exists a bounded inverse operator A⁻¹, which maps Ey onto Ex.
Proof: The operator A being one-to-one and onto has an inverse A−1 . We
need to show that A−1 is bounded.
Let Sk = {y ∈ Ey : ||A−1 y|| ≤ k||y||}.
Ey can be represented as

    Ey = ⋃_{k=1}^{∞} S_k

Since Ey is a complete metric space, by Baire's category theorem (th.
1.4.19) at least one of the sets S_k is everywhere dense. Let this set be S_n.
Let us take an element y ∈ Ey. Let ||y|| = l. Then there exists y_1 ∈ S_n
such that
    ||y − y_1|| ≤ l/2,   ||y_1|| ≤ l.
This is possible, since B(0, l) ∩ S_n is everywhere dense in B(0, l) and
y ∈ B(0, l).
Moreover, we can find an element y_2 ∈ S_n such that
    ||(y − y_1) − y_2|| ≤ l/2²,   ||y_2|| ≤ l/2.
Continuing this process, we can find elements y_k ∈ S_n such that
    ||y − (y_1 + y_2 + · · · + y_k)|| ≤ l/2ᵏ,   ||y_k|| ≤ l/2^{k−1}

Making k −→ ∞, we have  y = lim_{k→∞} Σ_{i=1}^{k} y_i.

Let x_k = A⁻¹y_k; then
    ||x_k||_{Ex} = ||A⁻¹y_k||_{Ex} ≤ n ||y_k||_{Ey} ≤ nl/2^{k−1}

Writing s_k = Σ_{i=1}^{k} x_i,

    ||s_{k+p} − s_k|| = || Σ_{i=k+1}^{k+p} x_i || ≤ (nl/2^{k−1})(1 − 1/2^p) < nl/2^{k−1}

Ex being complete, {s_k} converges to some element x ∈ Ex as k → ∞.

Hence,  x = lim_{k→∞} Σ_{i=1}^{k} x_i = Σ_{i=1}^{∞} x_i.

Moreover, A being continuous,

    Ax = A ( lim_{k→∞} Σ_{i=1}^{k} x_i ) = lim_{k→∞} Σ_{i=1}^{k} A x_i

       = lim_{k→∞} Σ_{i=1}^{k} y_i = y

Hence,  ||A⁻¹y|| = ||x|| = lim_k || Σ_{i=1}^{k} x_i || ≤ lim_k Σ_{i=1}^{k} ||x_i||

                ≤ Σ_{i=1}^{∞} nl/2^{i−1} = 2nl = 2n ||y||

Since y is an arbitrary element of Ey, this proves that A⁻¹ is a bounded
linear operator.
Note 4.7.2. This is further to the note 4.7.1.

(i) A bounded linear operator A mapping a Banach space Ex into a


Banach space Ey may have an inverse which is not bounded.
(ii) An unbounded linear operator mapping Ex −→ Ey may have a
bounded inverse.

4.7.16 Examples
1. Let E = C([0, 1]) and let (Au)(t) = ∫_0^t u(ζ) dζ.
Thus, A is a bounded linear operator mapping C([0, 1]) into C([0, 1]).

A⁻¹, given by  A⁻¹u = (d/dt) u(t),

is an unbounded operator defined on the linear subspace of continuously
differentiable functions u such that u(0) = 0.
2. Let E = C([0, 1]). The operator B is given by

    Bu = f(x)                                                                   (4.41)

where B = −d²/dx² and

    D(B) = {u : u ∈ C²([0, 1]), u(0) = u(1) = 0}

Integrating equation (4.41) twice, we obtain

    u(x) = − ∫_0^x ds ∫_0^s f(t) dt + C_1 x + C_2

The condition u(0) = 0 immediately gives C_2 = 0 and consequently

    u(x) = − ∫_0^x ds ∫_0^s f(t) dt + C_1 x

We change the order of integration and use Fubini's theorem [see 10.5].
This leads to

    u(x) = − ∫_0^x [ ∫_t^x ds ] f(t) dt + C_1 x

         = − ∫_0^x (x − t) f(t) dt + C_1 x                                      (4.42)

    u(1) = 0 ⇒ C_1 = ∫_0^1 (1 − t) f(t) dt                                      (4.43)

Using (4.43), (4.42) reduces to

    u(x) = ∫_0^x t(1 − x) f(t) dt + ∫_x^1 x(1 − t) f(t) dt

         = ∫_0^1 K(x, t) f(t) dt = B⁻¹f                                         (4.44)

K(x, t), the kernel of the integral equation (4.44), is given by

    K(x, t) = { t(1 − x),  0 ≤ t ≤ x
              { x(1 − t),  x ≤ t ≤ 1
We note that B is not a bounded operator. For example, take
u(x) = sin nπx, u(x) ∈ D(B).

Now,  (d²/dx²) sin nπx = −n²π² sin nπx.

Then,  || (d²/dx²)(sin nπx) || = n²π² −→ ∞ as n → ∞.

On the other hand,

    ||B⁻¹f|| ≤ [ max_x ∫_0^1 |K(x, t)| dt ] ||f|| ≤ (1/8) ||f||

Hence B⁻¹ is bounded.
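A short numerical sketch of example 2 (the quadrature grid and the test function are illustrative assumptions): B⁻¹f computed through the kernel K(x, t) reproduces the solution of −u″ = f, u(0) = u(1) = 0.

```python
import numpy as np

def B_inverse(f, num_points=2001):
    """u(x) = int_0^1 K(x,t) f(t) dt with K(x,t) = t(1-x) for t<=x, x(1-t) for t>=x."""
    t = np.linspace(0.0, 1.0, num_points)
    ft = f(t)
    u = np.empty_like(t)
    for i, x in enumerate(t):
        K = np.where(t <= x, t * (1.0 - x), x * (1.0 - t))
        u[i] = np.trapz(K * ft, t)
    return t, u

# Test with f(t) = pi^2 sin(pi t); the exact solution of -u'' = f, u(0)=u(1)=0 is u(x) = sin(pi x).
t, u = B_inverse(lambda s: np.pi**2 * np.sin(np.pi * s))
print(np.max(np.abs(u - np.sin(np.pi * t))))   # small quadrature error
```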
4.7.17 Operators depending on a parameter
Let us consider the equation of the form
Ax − λx = y or, (A − λI)x = y (4.45)

Here, λ is a parameter. Such equations occur frequently in Applied


Mathematics and Theoretical Physics.
4.7.18 Definition: homogeneous equation, trivial solution
In case y = θ in equation (4.45) then it is called a homogeneous
equation. Thus
(A − λI)x = θ (4.46)
is a homogeneous equation. This equation always has a solution x = θ,
called the trivial solution.
4.7.19 Definition: resolvent operator, regular values
In case (A−λI) has an inverse (A−λI)−1 , the operator Rλ = (A−λI)−1
is called the resolvent of (4.45). Equation (4.46) will then have a unique
solution x = θ. Those λ for which equation (4.45) has a unique solution for
every y and the operator Rλ is bounded, are called regular values.
4.7.20 Definition: eigenvector, eigenvalue or characteristic
value, spectrum
If the homogeneous equation (4.46), or the equation Ax = λx, has a non-
trivial solution x, then that x is called an eigenvector. The values of
λ corresponding to non-trivial solutions (eigenvectors) are called the
eigenvalues or characteristic values.
The collection of all non-regular values of λ is called the spectrum of
the operator A.
4.7.21 Theorem
If in the equation (A − λI)x = y we have (1/|λ|)||A|| = q < 1, then
A − λI has an inverse operator; moreover,

    R_λ = −(1/λ) ( I + A/λ + A²/λ² + · · · )
If λ is a regular value, then λ + Δλ for |Δλ| < ||(A − λI)−1 ||−1 is also a
regular value. This implies that the collection of regular values is an open
set and hence the spectrum of an operator is a closed set.
Proof: The equation (A − λI)x = y can be written as

    − ( I − A/λ ) x = (1/λ) y

Thus, by theorem 4.7.12 we can say that if (1/|λ|)||A|| = q < 1, then (I − A/λ)
has an inverse and the concerned values of λ are regular values.
If in theorem 4.7.13 we take C = A − (λ + Δλ)I, then, since ||C − (A − λI)|| =
|Δλ| < ||(A − λI)⁻¹||⁻¹, we conclude that C = A − (λ + Δλ)I has an
inverse and (λ + Δλ) is also a regular value.
4.7.22 Example
Let us consider the Fredholm integral equation of the second kind:

    φ(x) − λ ∫_a^b K(x, s)φ(s) ds = f(x)                                        (4.47)

If we denote Aφ by ∫_a^b K(x, s)φ(s) ds, equation (4.47) can be written as

    (I − λA)φ = f                                                               (4.48)
The solution of the equation (4.47) can be expressed in the form of an
infinite series

    φ(x) = f(x) + Σ_{m=1}^{∞} λᵐ ∫_a^b K_m(x, s) f(s) ds   (see Mikhlin [37])   (4.49)

where the mth iterated kernel K_m(x, s) is given by

    K_m(x, s) = ∫_a^b K_{m−1}(x, t) K(t, s) dt                                  (4.50)

By making an appeal to theorem 4.7.21, we can obtain the resolvent
operator R_λ as

    R_λ f = (I − λA)⁻¹ f = [ I + λA + λ²A² + · · · + λᵖAᵖ + · · · ] f

where Aᵖf = ∫_a^b K_p(x, s) f(s) ds involves the pth iterated kernel of K(x, s).

Thus if ||A|| < 1/|λ|, we get

    φ(x) = f(x) + λ ∫_a^b R(x, s, λ) f(s) ds

Here R(x, s, λ), the resolvent kernel of K(x, s), is defined by

    R(x, s, λ) = K(x, s) + λK_2(x, s) + · · · + λᵖK_{p+1}(x, s) + · · ·
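The resolvent series above can be mimicked numerically. The following sketch (the kernel, the value of λ, the grid, and f are illustrative assumptions) solves φ − λAφ = f by the successive approximations φ ← f + λAφ, which sum the Neumann series φ = Σ λᵐ Aᵐ f.

```python
import numpy as np

a, b, N = 0.0, 1.0, 401
s = np.linspace(a, b, N)
w = np.full(N, (b - a) / (N - 1)); w[0] = w[-1] = w[0] / 2   # trapezoidal weights

K = np.exp(-np.abs(s[:, None] - s[None, :]))   # a sample continuous kernel K(x, t)
lam = 0.3                                       # chosen so that |lam| * ||A|| < 1 here
f = np.sin(np.pi * s)

def A(phi):
    """(A phi)(x) = int_a^b K(x, t) phi(t) dt, by trapezoidal quadrature."""
    return K @ (w * phi)

phi = f.copy()
for _ in range(100):                            # phi <- f + lam * A(phi)  (Neumann iteration)
    phi = f + lam * A(phi)

residual = phi - lam * A(phi) - f
print(np.max(np.abs(residual)))                 # ~ 0: phi solves (I - lam*A) phi = f
```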

Problems
1. Let Ex and Ey be normed linear spaces over ℝ (ℂ). A : Ex −→ Ey is
a given linear operator. Show that

   (i) A is bijective =⇒ A has an inverse =⇒ N(A) = {θ}

   (ii) A⁻¹, if it exists, is a linear operator.

2. L(ℝⁿ) denotes the space of linear operators mapping ℝⁿ → ℝⁿ.
Suppose that the mapping A : D ⊂ ℝᵐ → L(ℝⁿ) is continuous
at a point x_0 ∈ D for which A(x_0) has an inverse. Then show that
there are a δ > 0 and a ν > 0 such that A(x) has an inverse and

    ||A(x)⁻¹|| ≤ ν   ∀ x ∈ D ∩ B(x_0, δ)

[Hint: Use theorem 4.7.13]


3. (Sherman-Morrison-Woodbury Formula) Let A ∈ L(ℝⁿ) have
an inverse and let U, V map ℝᵐ → ℝⁿ, m ≤ n. Show that A + UVᵀ
has an inverse if and only if I + VᵀA⁻¹U has an inverse and that

    (A + UVᵀ)⁻¹ = A⁻¹ − A⁻¹U(I + VᵀA⁻¹U)⁻¹VᵀA⁻¹

(A small numerical check of this identity is sketched after these problems.)

[Hint: Use theorem 4.7.13]



4. If L is a bounded linear operator mapping a Banach space E into E,
show that L⁻¹ exists if and only if there is a bounded linear operator
K in E such that K⁻¹ exists and

    ||I − KL|| < 1

If L⁻¹ exists, then show that

    L⁻¹ = Σ_{n=0}^{∞} (I − KL)ⁿ K   and   ||L⁻¹|| ≤ ||K|| / (1 − ||I − KL||)

5. Use the result of problem 4 to find the solution of the linear differential
equation

    du/dt − λu = f,   u(t) ∈ C¹([0, 1]), f ∈ C([0, 1]) and |λ| < 1
6. Let m be the space of bounded number sequences, i.e., x ∈ m means
x = {ξ_i}, |ξ_i| ≤ K_x, with ||x|| = sup_i |ξ_i|.
In m the shift operator E is defined by

    Ex = (0, ξ_1, ξ_2, ξ_3, . . .)   for x = (ξ_1, ξ_2, ξ_3, . . .)

Find ||E|| and discuss the inversion of the difference operator Δ = E − I.
[Hint: Show that Δx = θ =⇒ x = θ]
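As referred to in problem 3 above, here is a small numerical check of the Sherman-Morrison-Woodbury identity (the random matrices, sizes, and seed are illustrative assumptions, not part of the problem):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 6, 2
A = np.eye(n) + 0.1 * rng.standard_normal((n, n))    # an invertible n x n matrix
U = rng.standard_normal((n, m))
V = rng.standard_normal((n, m))

Ainv = np.linalg.inv(A)
small = np.linalg.inv(np.eye(m) + V.T @ Ainv @ U)     # the m x m matrix (I + V^T A^{-1} U)^{-1}
woodbury = Ainv - Ainv @ U @ small @ V.T @ Ainv

print(np.allclose(woodbury, np.linalg.inv(A + U @ V.T)))   # True
```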

4.8 Banach Space with a Basis


If a space E has a denumerable basis, then it is a separable space.
A denumerable everywhere dense set in a space with a basis is given by the
linear combinations Σ_{i=1}^{n} r_i e_i with rational coefficients r_i. Though many
separable Banach spaces have bases, it is not true that every separable
Banach space has a basis.
Note 4.8.1.
1. It can be shown that a Banach space E is either finite dimensional
or else it has a Hamel basis which is not denumerable, i.e., which is
uncountable.
2. An infinite dimensional separable Banach space has, in fact, a Hamel
basis which is in one-to-one correspondence with the set of real numbers.
Note 4.8.1 exposes a severe limitation of the Hamel basis, namely
that every element of the Banach space E must be a finite linear
combination of the basis elements; this has given rise to the concept
of a new basis known as the Schauder basis.

4.8.1 Definition: Schauder basis


Let E be a normed linear space. A denumerable subset {e_1, e_2, . . .} of
E is called a Schauder basis for E if ||e_n|| = 1 for each n and if for every
x ∈ E, there are unique scalars α_1, α_2, . . . in ℝ (ℂ) such that

    x = Σ_{i=1}^{∞} α_i e_i.

In case E is finite dimensional and {a_1, a_2, . . . , a_n} is a Hamel basis,
then {a_1/||a_1||, a_2/||a_2||, . . . , a_n/||a_n||} is a Schauder basis for E.
If {e_1, e_2, . . .} is a Schauder basis for E, then for n = 1, 2, . . ., let us
define functionals f_n : E → ℝ (ℂ) by f_n(x) = α_n for

    x = Σ_{n=1}^{∞} α_n e_n ∈ E.                                                (4.51)

The uniqueness condition in the definition of a Schauder basis yields


that each fn is well-defined and linear on E. It is called the nth coefficient
functional on E.
4.8.2 Definition: biorthogonal sequence
Putting x = e_j in (4.51) we have

    e_j = Σ_{i=1}^{∞} f_i(e_j) e_i,

and since the e_i are linearly independent,

    f_i(e_j) = { 1 if i = j
               { 0 if i ≠ j                                                     (4.52)

The two sequences {e_i} and {f_i}, of elements and of functionals, for which
(4.52) holds are called biorthogonal sequences.
4.8.3 Lemma
For every functional f defined on the Banach space E, we can find
coefficients ci = f (ei ), where {ei } is a Schauder basis in E, such that

    f = Σ_{i=1}^{∞} f(e_i) f_i = Σ_{i=1}^{∞} c_i f_i,

{f_i} being the sequence of coefficient functionals defined on E and satisfying (4.51).


Proof: For any functional f defined on E, it follows from (4.51), using the
linearity and continuity of f, that

    f(x) = Σ_{i=1}^{∞} f_i(x) f(e_i)

Writing f(e_i) = c_i,

    f(x) = Σ_{i=1}^{∞} c_i f_i(x)   or,   f = Σ_{i=1}^{∞} c_i f_i               (4.53)

The representation (4.53) is unique. The series (4.53) converges for


every x ∈ E.
Problems
1. Let E = C([0, 1]). Consider in C([0, 1]) the sequence of elements

    t, (1 − t), u_{00}(t), u_{10}(t), u_{11}(t), u_{20}(t), u_{21}(t), u_{22}(t), u_{23}(t), . . .      (4.54)

where the u_{kl}(t), k = 0, 1, 2, . . ., 0 ≤ l ≤ 2ᵏ − 1, are defined in the following
way: u_{kl}(t) = 0 if t lies outside the interval (l/2ᵏ, (l + 1)/2ᵏ), while inside
this interval the graph of u_{kl}(t) is an isosceles triangle of height 1 [see
figure 4.2]. Take a function x(t) ∈ C([0, 1]) representable in the form of the
series

    x(t) = a_0(1 − t) + a_1 t + Σ_{k=0}^{∞} Σ_{l=0}^{2ᵏ−1} a_{kl} u_{kl}(t)

where a_0 = x(0), a_1 = x(1) and the coefficients a_{kl} admit a unique
geometric construction as shown in figure 4.3.

[Fig. 4.2: the graph of u_{22}(t); Fig. 4.3: the geometric construction of the
coefficients a_0, a_1, a_{00}, a_{01}, a_{11}, . . . on the curve x = x(t).]

The graph of the partial sum

    a_0(1 − t) + a_1 t + Σ_{k=0}^{s−1} Σ_{l=0}^{2ᵏ−1} a_{kl} u_{kl}(t)

of the above series is an open polygon with 2ˢ + 1 vertices lying on the curve
x = x(t) at the points with equidistant abscissae.
Show that the collection of functions in (4.54) forms a basis in
C([0, 1]).

2. Let 1 ≤ p < ∞. For t ∈ [0, 1], let x_1(t) = 1,

       x_2(t) = {  1  if 0 ≤ t ≤ 1/2
                { −1  if 1/2 < t ≤ 1

   and for n = 1, 2, . . ., j = 1, . . . , 2ⁿ,

       x_{2ⁿ+j}(t) = {  2^{n/2}  if (2j − 2)/2^{n+1} ≤ t ≤ (2j − 1)/2^{n+1}
                    { −2^{n/2}  if (2j − 1)/2^{n+1} < t ≤ 2j/2^{n+1}
                    {  0        otherwise

Show that the Haar system {x1 , x2 , x3 , . . .} is a Schauder basis for


L2 ([0, 1]).
Note each xn is a step function.

[Fig. 4.4: the graphs of the Haar functions x_1(t), x_2(t), x_3(t) (n = 1, j = 1, height √2)
and x_4(t) (n = 1, j = 2).]
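A small numerical sketch of the Haar system of problem 2 above (the grid and the number of functions checked are illustrative assumptions): the functions below are pairwise orthonormal in L²([0, 1]).

```python
import numpy as np

def haar(index, t):
    """Haar function x_index(t) on [0, 1], following the definition in problem 2."""
    if index == 1:
        return np.ones_like(t)
    if index == 2:
        return np.where(t <= 0.5, 1.0, -1.0)
    n = int(np.floor(np.log2(index - 1)))      # index = 2^n + j with 1 <= j <= 2^n
    j = index - 2**n
    left = (2 * j - 2) / 2**(n + 1)
    mid = (2 * j - 1) / 2**(n + 1)
    right = (2 * j) / 2**(n + 1)
    return np.where((t >= left) & (t <= mid), 2**(n / 2),
                    np.where((t > mid) & (t <= right), -2**(n / 2), 0.0))

t = np.linspace(0.0, 1.0, 200001)
G = np.array([[np.trapz(haar(i, t) * haar(k, t), t) for k in range(1, 9)]
              for i in range(1, 9)])
print(np.round(G, 3))     # approximately the 8 x 8 identity matrix (Gram matrix)
```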
CHAPTER 5

LINEAR
FUNCTIONALS

In this chapter we explore some simple properties of functionals defined on


a normed linear space. We indicate how linear functionals can be extended
from a subspace to the entire normed linear space and this makes the
normed linear space richer by new sets of linear functionals. The stage
is thus set for an adequate theory of conjugate spaces, which is an essential
part of the general theory of normed linear spaces. The Hahn-Banach
extension theorem plays a pivotal role in extending linear functionals from
a subspace to an entire normed linear space. The theorem was discovered by
H. Hahn (1927) [23], rediscovered in its present, more general form (5.2.2)
by S. Banach (1929) [5]. The theorem was further generalized to complex
spaces (5.1.8) by H.F. Bohnenblust and A. Sobczyk (1938) [8].
Besides the Hahn-Banach extension theorem, there is another
important theorem discovered by Hahn-Banach which is known as Hahn-
Banach separation theorem. While the Hahn-Banach extension
theorem is analytic in nature, the Hahn-Banach separation theorem is
geometric in nature.

5.1 Hahn-Banach Theorem


4.3 introduced ‘linear functionals’ and 4.4 ‘the space of bounded linear
operators’. Next comes the notion of ‘the space of bounded linear
functionals’.
5.1.1 Definition: conjugate or dual space
The space of bounded linear functionals mapping a Banach space Ex
into ℝ is called the conjugate (or dual) space of Ex and is denoted by Ex*.
In theorem 4.2.13 we have seen how a bounded linear operator A0


defined on a linear subspace X, which is everywhere dense in a complete


normed linear space Ex with values in a complete normed linear space, can
be extended to the entire space with preservation of norm. Hahn-Banach
theorem considers such an extension even if X is not necessarily dense in
Ex .
What follows is a few results as a prelude to proving the main theorem.
5.1.2 Lemma
Let L be a linear subspace of a normed linear space Ex and f be a
functional defined on L. If x0 is a vector not in L, and if L1 = (L, x0 ), a
set of elements of the form x + tx0 , x ∈ L and t any real number, then f
can be extended to a functional f0 defined on L1 such that ||f0 || = ||f ||.
Proof: We assume that Ex is a real normed linear space. It is seen
that L1 is a linear subspace because x1 + t1 x0 and x2 + t2 x0 ∈ L1 ⇒
(x_1 + x_2) + (t_1 + t_2)x_0 ∈ L_1 and λ(x_1 + t_1 x_0) ∈ L_1, etc. Suppose u ∈ L_1 has
two representations of the form x_1 + t_1 x_0 and x_2 + t_2 x_0. Then t_1 = t_2; otherwise

    x_1 + t_1 x_0 = x_2 + t_2 x_0   or,   x_0 = (x_1 − x_2)/(t_2 − t_1),

showing that x_0 ∈ L since x_1, x_2 ∈ L, which is impossible. Hence t_1 = t_2, and then
x_1 = x_2, i.e., the representation of u is unique. Let us take two elements x′, x″ ∈ L.

We have,  f(x′) − f(x″) = f(x′ − x″)
                        ≤ ||f|| ||x′ − x″||
                        ≤ ||f|| ( ||x′ + x_0|| + ||x″ + x_0|| )

Thus,  f(x′) − ||f|| ||x′ + x_0|| ≤ f(x″) + ||f|| ||x″ + x_0||

Since x′ and x″ are arbitrary in L, independent of each other,

    sup_{x∈L} { f(x) − ||f|| ||x + x_0|| } ≤ inf_{x∈L} { f(x) + ||f|| ||x + x_0|| }

Consequently, there is a real number c satisfying the inequality

    sup_{x∈L} { f(x) − ||f|| ||x + x_0|| } ≤ c ≤ inf_{x∈L} { f(x) + ||f|| ||x + x_0|| }     (5.1)

Now any element u ∈ L_1 has the form u = x + t x_0, x ∈ L and t ∈ ℝ.

Let us define a new functional f_0(u) on L_1 such that

    f_0(u) = f(x) − tc                                                                      (5.2)

where c is some fixed real number satisfying (5.1).

Note that u = x + t x_0, x ∈ L, x_0 ∉ L.

If in particular u ∈ L then t = 0 and u = x.


Hence f and f0 coincide on L.
Now let u1 = x1 + t1 x0 , u2 = x2 + t2 x0 .

Then, f0 (u1 + u2 ) = f (x1 + x2 ) − (t1 + t2 )c


= f (x1 ) − t1 c + f (x2 ) − t2 c
= f0 (u1 ) + f0 (u2 )

Thus, f0 (u) is additive. To show that f0 (u) is bounded and has the
same norm as that of f (x) we consider two cases:

(i) For t > 0, it follows from (5.1) and (5.2) that

    |f_0(u)| = t | f(x/t) − c | ≤ t ||f|| ||x/t + x_0||

             = ||f|| ||x + t x_0|| = ||f|| ||u||

Hence,  |f_0(u)| ≤ ||f|| ||u||                                                  (5.3)

(ii) For t < 0, (5.1) yields

    f(x/t) − c ≥ −||f|| ||x/t + x_0|| = −(1/|t|) ||f|| ||x + t x_0|| = (1/t) ||f|| ||u||

Hence,  f_0(u) = t ( f(x/t) − c ) ≤ ||f|| ||u||

That is, we get back (5.3).


Hence, inequality (5.3) remains valid for all u ∈ (L, x_0) = L_1. Thus it
follows from (5.3) that ||f_0|| ≤ ||f||. However, since the functional f_0 is an
extension of f from L to L_1, we also have ||f_0|| ≥ ||f||. Hence ||f_0|| = ||f||.
Note that we have determined the norm of the functional f_0 with respect
to the linear subspace on which f_0 is defined. Thus, the functional f(x) is
extended to L_1 = (L, x_0) with preservation of norm.
5.1.3 Theorem (the Hahn-Banach theorem)
Every linear functional f (x) defined on a linear subspace L of a normed
linear space E can be extended to the entire space with preservation of
norm. That is, we can construct a linear functional F (x) defined on E such
that
(i) F (x) = f (x) for x ∈ L, (ii) ||F ||E = ||f ||L
Proof: Let us first suppose the space E is separable. Let N be a countable
everywhere dense set in E. Let us select those elements of this set which
do not fall in L and arrange them in the sequence x0 , x1 , x2 , . . . , xn , . . .

By virtue of lemma 5.1.2 we can extend the functional f (x) successively


to the subspaces (L, x0 ) = L1 , (L1 , x1 ) = L2 and so on and ultimately
construct a certain functional fw defined on the linear subspace Lw , which
is everywhere dense in E and is equal to the union of all Ln .
Moreover, ||fw || = ||f ||. Since Lw is everywhere dense in E, we can
apply theorem 4.2.13 to extend the functional fw by continuity to the entire
space E and obtain the functional F defined on E such that

F (x)|L = f (x) and ||F ||E = ||f ||L

Alternatively, in case the space is not separable, we can proceed as


follows. Consider all possible extensions of the functional with preservation
of the norm. Such extensions always exist. We next consider the set Φ of
these extensions and introduce a partial ordering as detailed below. We
will say f′ < f″ if the linear subspace L′ on which f′ is defined is contained
in the linear subspace L″ on which f″ is defined and if f″(x) = f′(x) for
x ∈ L′. Evidently f′ < f″ has all the properties of an ordering.
Now, let {fα } be an arbitrary, totally ordered subset of the set Φ. This
subset has an upper bound, which is the functional f ∗ , defined on a linear
subspace L∗ = ∪α Lα . Lα is the domain of fα and f ∗ (x) = fα0 (x) if x ∈ L∗
is an element of Lα0 . Hence f ∗ is a linear functional and ||f ∗ || = ||f ||, that
is, f ∗ ∈ Φ. Thus, it is seen that all the hypotheses of Zorn’s lemma (1.1.4)
are satisfied and Φ has a maximal element F . This functional is defined on
the entire space E. If that were not so, the functional could be further extended,
contradicting the fact that F is the maximal element in Φ.
Hence, the proof is completed.
Note 5.1.1 Since the constant c satisfying (5.1) may be chosen arbitrarily,
there may be more than one maximal element in Φ; hence the extension
of a linear functional given by the Hahn-Banach theorem is generally
not unique.
Note that Hahn-Banach theorem is a potential source of generating
different linear functionals in a Banach space or a normed linear space.
The two theorems which are offshoots of the Hahn-Banach theorem give
rise to various applications.
5.1.4 Theorem
Let E be a normed linear space and x0 = θ be a fixed element in E.
There exists a linear functional f (x), defined on the entire space E, such
that
(i) ||f || = 1 and (ii) f (x0 ) = ||x0 ||
Proof: Consider the set of elements L = {t x_0}, where t runs through all
real numbers.
The set L is a subspace of E, spanned by x_0.

A functional f_0(x) defined on L has the following form: if x = t x_0, then

    f_0(x) = t ||x_0||                                                          (5.4)

Then f_0(x_0) = ||x_0|| and |f_0(x)| = |t| ||x_0|| = ||x||.

Thus, |f_0(x)| / ||x|| = 1 for all x ≠ θ, i.e., sup_{x≠θ} |f_0(x)| / ||x|| = 1, i.e., ||f_0|| = 1.
Now, if the functional f0 (x) is extended to the entire space with
preservation of norm, we get a functional f (x) having the required
properties.
5.1.5 Theorem
Given a subspace L and an element x0 ∈ / L in a normed linear space E.
Let d > 0 be the distance from x0 to L, i.e.,
    d = inf_{x∈L} ||x − x_0||

Then there is a functional f (x) defined everywhere on E and such that


(i) f(x) = 0, x ∈ L,   (ii) f(x_0) = 1   and   (iii) ||f|| = 1/d
Proof: Consider a set (L; x0 ). Each of its elements is uniquely
representable in the form u = x + tx0 , where x ∈ L and t is real. Let
us construct the functional f0 (u) by the following rule. If u = x + tx0 ,
define f0 (u) = t. Evidently f0 (x) = 0, if x ∈ L and f0 (x0 ) = 1.
To determine ||f_0||, we have, for t ≠ 0,

    |f_0(u)| = |t| = |t| ||u|| / ||u|| = |t| ||u|| / ||x + t x_0|| = ||u|| / ||x/t + x_0|| ≤ ||u||/d,

since ||x/t + x_0|| = ||x_0 − (−x/t)|| ≥ d.

Thus,  ||f_0|| ≤ 1/d                                                            (5.5)
Furthermore, there is a sequence {xn } ⊂ L s.t.
lim ||xn − x0 || = d
n→∞

Then we have |f0 (xn − x0 )| ≤ ||f0 || ||xn − x0 ||


Since |f0 (xn − x0 )| = |f0 (xn ) − f0 (x0 )| = 1, xn ∈ L
1 ≤ ||f0 || ||xn − x0 ||
Hence, by taking the limit we get

    1 ≤ ||f_0|| d   or   ||f_0|| ≥ 1/d                                          (5.6)
Then, by extending f0 (x) to the entire space with preservation of norm
we obtain the functional f (x) with the required property.

5.1.6 Geometric interpretation


The conclusion of the above theorem admits geometric interpretation
as follows.
Through every point x0 on the surface of the ball ||x|| ≤ r a supporting
hyperplane can be drawn.
Note that ||f|| = sup_{x≠θ} |f(x)| / ||x||.

For a functional f(x) on the ball ||x|| ≤ r,

    ||f|| = sup_{x≠θ} |f(x)| / ||x|| ≥ ( sup_{||x||≤r} |f(x)| ) / r

Consider the hyperplane f (x) = r||f ||.


For points x0 on the surface of the ball, ||x0 || = r, f (x0 ) = r||f || and
for other points of the ball

f (x) ≤ r||f ||

Thus, f (x) = r||f || is a supporting hyperplane.


5.1.7 Definition: sublinear functional
A real valued function p on a normed linear space E is said to be
sublinear, if it is
(i) subadditive, i.e.,

p(x + y) ≤ p(x) + p(y), ∀ x, y ∈ E (5.7)

and (ii) positive homogeneous, i.e.,

p(αx) = αp(x), ∀ α ≥ 0 in ℝ and x ∈ E                                           (5.8)

5.1.8 Example
|| · || is a sublinear functional.

For, ||x + y|| ≤ ||x|| + ||y||, ∀ x, y ∈ E


||αx|| = α||x||, ∀ α ≥ 0 in ℝ and x ∈ E
The Hahn-Banach theorem (5.1.3) can be further generalized in the
following theorem. Let a functional f(x) be defined on a subspace L of a
normed linear space E and be majorized on L by a sublinear functional
p(x) defined on E. Then f can be extended from L to E without losing
the linearity and the majorization, so that the extended functional F on E
is still linear and majorized by p. Here we have taken E to be real.

5.1.9 Theorem Hahn-Banach theorem (using sublinear


functional)
Let E be a normed linear space and p a sublinear functional defined on
E. Furthermore, let f be a linear functional defined on a subspace L of E
and let f satisfy,
f (x) ≤ p(x), ∀ x ∈ L (5.9)
Then f can be extended to a linear functional F satisfying,

F (x) ≤ p(x), ∀ x ∈ E (5.10)

F (x) is a linear functional on E and

F(x) = f(x), ∀ x ∈ L

Proof: The proof comprises the following steps:

(i) The set F of all linear extensions g of f satisfying g(x) ≤ p(x) on


D(g) can be partially ordered and Zorn’s lemma yields a maximal
element F on F.
(ii) F is defined on the entire space E.

To show that D(F ) is all of E, the arguments will be as follows. If D(F )


is not all of E, choose a y1 ∈ E − D(F ) and consider the subspace Z1 of
E spanned by D(F ) and y1 , and show that any x ∈ Z1 can be uniquely
represented by x = y + αy1 . A functional g1 on Z1 , defined by

g1 (y + αy1 ) = F (y) + αc

is linear and a proper extension of F, i.e., D(F) is a proper subset of D(g_1).


If, in addition, we show that g1 ∈ F , i.e.,

g1 (x) ≤ p(x) ∀ x ∈ D(g1 ),

then the fact that F is the maximal element of F is contradicted.


For details of the proof see Kreyszig [30].
5.1.10 Theorem
Let E be a normed linear space over ℝ and let f be a linear functional
defined on a subspace L of E. Then theorem 5.1.9 =⇒ theorem 5.1.3.
Proof: Let p(x) = ||f || ||x||, x ∈ E.
Then p is a sublinear functional on E and f(x) ≤ p(x), x ∈ L.
Hence, by theorem 5.1.9, it follows that there exists a real linear
functional F on E such that
    F(x) = f(x) ∀ x ∈ L   and   F(x) ≤ ||f|| ||x||, ∀ x ∈ E

Since also −F(x) = F(−x) ≤ ||f|| ||x||, we get |F(x)| ≤ ||f|| ||x|| ∀ x ∈ E =⇒ ||F||_E ≤ ||f||_L

On the other hand, take x_1 ∈ L, x_1 ≠ θ. Then

    ||F|| ≥ |F(x_1)| / ||x_1|| = |f(x_1)| / ||x_1||   =⇒   ||F|| ≥ ||f||

This completes the proof when E is a real normed linear space.

5.2 Hahn-Banach Theorem for Complex


Vector and Normed Linear Space
5.2.1 Lemma
Let E be a normed linear space over ℂ. Regarding E as a linear space
over ℝ, consider a real-linear functional u : E → ℝ. Define

    f(x) = u(x) − iu(ix),   x ∈ E

Then f is a complex-linear functional on E.


Proof: Since u is real-linear, f is additive and real-homogeneous.
Now, since u is linear,

    f(ix) = u(ix) − iu(i · ix) = u(ix) + iu(x)
          = i[u(x) − iu(ix)] = if(x) ∀ x ∈ E

Hence, f is complex-linear.
5.2.2 Hahn-Banach theorem (generalized)
Let E be a real or complex vector space and p be a real-valued functional
on E which is subadditive, i.e., for all x, y ∈ E
p(x + y) ≤ p(x) + p(y) (5.11)
and for every scalar α satisfies,
p(αx) = |α|p(x) (5.12)

Moreover, let f be a linear functional defined on a subspace L of E and


satisfy,
|f (x)| ≤ p(x) ∀ x ∈ L (5.13)
Then f has a linear extension F from L to E satisfying
|F (x)| ≤ p(x) ∀x∈E (5.14)

Proof: (a) Real Vector Space: Let E be real. Then (5.13) yields f (x) ≤
p(x) ∀ x ∈ L. It follows from theorem 5.1.9 that f can be extended to a
linear functional F from L to E such that
F (x) ≤ p(x) ∀ x ∈ E (5.15)

(5.12) and (5.15) together yield


−F (x) = F (−x) ≤ p(−x) = | − 1|p(x) = p(x) ∀x∈E (5.16)
From (5.15) and (5.16) we get (5.14).
(b) Complex Vector Space: Let E be complex. Then L is a complex
vector space, too. Hence, f is complex-valued and we can write,
f (x) = f1 (x) + if2 (x) ∀x∈L (5.17)
where f1 (x) and f2 (x) are real valued. Let us assume, for the time being, E
and L as real vector spaces and denote them by Er and Lr respectively. We
thus restrict scalars to real numbers. Since f is linear on L and f1 and f2
are real-valued, f1 and f2 are linear functionals on L. Also, f1 (x) ≤ |f (x)|
because the real part of a complex quantity cannot exceed the absolute
value of the whole complex quantity.
Hence, by (5.13),
f1 (x) ≤ p(x) ∀ x ∈ Lr
Hence, by the Hahn-Banach theorem 5.1.9, f_1 can be extended to a
functional F1 from Lr to Er , such that
F1 (x) ≤ p(x) ∀ x ∈ Lr (5.18)
We next consider f2 . For every x ∈ L,
i[f1 (x) + if2 (x)] = if (x) = f (ix) = f1 (ix) + if2 (ix)
The real parts on both sides must be equal.
Hence, f2 (x) = −f1 (ix) ∀ x ∈ L (5.19)
Just as in (5.18) f_1(x) has been extended to F_1(x), ∀ x ∈ E,
f_2(x) = −f_1(ix) can be extended, exploiting the Hahn-Banach theorem,
to F_2(x) = −F_1(ix) ∀ x ∈ E.
Thus we can write
F (x) = F1 (x) − iF1 (ix) ∀x∈E (5.20)
(i) We would next prove that F is a linear functional on the complex
vector space E.
For real a, b, a + ib is a complex scalar.
F ((a + ib)x) = F1 (ax + ibx) + iF2 (ax + ibx)
= F1 (ax + ibx) − iF1 (iax − bx)
= aF1 (x) + bF1 (ix) − i[aF1 (ix) − bF1 (x)]
= (a + ib)F1 (x) − i[(a + ib)F1 (ix)]
= (a + ib)[F1 (x) − iF1 (ix)]
= (a + ib)[F1 (x) + iF2 (x)]
= (a + ib)F (x)

(ii) Next to be shown is that |F (x)| ≤ p(x), ∀ x ∈ E.


It follows from (5.12) that p(θ) = 0. Taking y = −x in (5.11), we get

0 ≤ p(x) + p(−x) = 2p(x), i.e., p(x) ≥ 0 ∀ x ∈ E

Thus, F (x) = 0 =⇒ F (x) ≤ p(x).


Let F(x) ≠ 0. Then we can write, using polar coordinates,

F (x) = |F (x)|eiθ ,

thus |F (x)| = F (x)e−iθ = F (e−iθ x)


Since |F (x)| is real, the last expression is real and is equal to its real
part.
Hence, by (5.12)

|F(x)| = F(e^{−iθ}x) = F_1(e^{−iθ}x) ≤ p(e^{−iθ}x) = |e^{−iθ}| p(x) = p(x)

This completes the proof.


We next consider the Hahn-Banach theorem (generalized) in the setting of
a normed linear space E over ℝ (ℂ).
5.2.3 Hahn-Banach theorem (generalized form in a normed
linear space)
Every bounded linear functional f defined on a subspace L of a normed
linear space E over ℝ (ℂ) can be extended with preservation of norm to the
entire space, i.e., there exists a linear functional F(x) defined on E such
that
(i) F(x) = f(x) for x ∈ L;   (ii) ||F||_E = ||f||_L
Proof: If L = {θ} then f = θ and the extension is F = θ. So let L ≠ {θ}.
We want to use theorem 5.2.2. For that purpose we have to find a
suitable p.
We know that for all x ∈ L, |f(x)| ≤ ||f||_L ||x||.
Let us take p(x) = ||f||_L ||x|| ∀ x ∈ E.
Thus, p is defined on all of E.

Furthermore,  p(x + y) = ||f||_L ||x + y|| ≤ ||f||_L (||x|| + ||y||) = p(x) + p(y), ∀ x, y ∈ E

              p(αx) = ||f||_L ||αx|| = |α| ||f||_L ||x|| = |α| p(x), ∀ x ∈ E

Thus conditions (5.11) and (5.12) of theorem 5.2.2 are satisfied. Hence,
the above theorem can be applied, and we get a linear functional F on E
which is an extension of f and satisfies.

    |F(x)| ≤ p(x) = ||f||_L ||x|| ∀ x ∈ E

Hence,  ||F||_E = sup_{x≠θ} |F(x)| / ||x|| ≤ ||f||_L                            (5.21)
On the other hand, F being an extension of f ,
||F ||E ≥ ||f ||L (5.22)
Combining (5.21) and (5.22) we get,
||F ||E = ||f ||L
5.2.4 Hyperspace and related results
5.2.5 Definition: Hyperspace
A proper subspace E0 of a normed linear space E is called a hyperspace
in E if it is a maximal proper subspace of E. It may be noted that a
proper subspace E0 of E is maximal if and only if the span of E0 ∪ {a}
equals E for each a ∈
/ E0 .
5.2.6 Remark 1
A hyperplane H is a translation of a hyperspace by a vector, i.e., H is
of the form
H = x + E0
where E0 is a hyperspace and x ∈ E.
5.2.7 Theorem
A subspace of a normed linear space E is a hyperspace if and only if it
is the null space of a non-zero functional.
Proof: We first show that null spaces of non-zero linear functionals are
hyperspaces.
Let f : E −→ ℝ (ℂ) be a non-zero linear functional and x_0 ∈ E be such
that f(x_0) ≠ 0. Then, for every x ∈ E, there exists a u ∈ N(f) such that

    x = u + ( f(x)/f(x_0) ) x_0,   u ∈ N(f)

In particular, since x_0 ∈ E \ N(f),

    E = span {x_0; N(f)}

Thus we see that the null space of f is a hyperspace.


Conversely, let us suppose that H is a hyperspace of E and x_0 ∈ E \ H is
such that E = span {x_0; H}. Then for every x ∈ E, there exists a unique
pair (λ, u) in ℝ (ℂ) × H such that x = λx_0 + u.
Let us define f(λx_0 + u) = λ ∈ ℝ (ℂ), u ∈ H.
Now, f((λ + μ)x_0 + (u_1 + u_2)) = λ + μ = f(λx_0 + u_1) + f(μx_0 + u_2), λ, μ ∈ ℝ (ℂ),
and f(p(λx_0 + u)) = pλ = p f(λx_0 + u), p ∈ ℝ (ℂ).

Thus, f is a linear functional.


Again, taking λ = 1, f (x0 ) = 1, and when λ = 0, f (u) = 0 i.e. u ∈ N (f ),
i.e., N (f ) = H.
5.2.8 Remark
A subset H ⊆ E is a hyperplane in E if and only if there exists a
non-zero linear functional f and a scalar ‘λ’ such that
H = {x ∈ E : f (x) = λ}
since {x ∈ E : f (x) = λ} = xλ + N (f ) for some xλ ∈ E with f (xλ ) = λ.
Thus, hyperplanes are of the form H for some non-zero linear functional f
and for some λ ∈ ℝ (ℂ).
5.2.9 Definition
If the scalar field is ℝ, a set X is said to be on the left side of a hyperplane
H if
X ⊆ {x ∈ E : f (x) ≤ λ}
and it is strictly on the left side of H if
X ⊆ {x ∈ E : f (x) < λ}

Similarly, X is said to be on the right side of a hyperplane H if


X ⊆ {x ∈ E : f (x) ≥ λ}
and strictly on the right side of a hyperplane H if
X ⊆ {x ∈ E : f (x) > λ}.

5.2.10 Theorem (Hahn-Banach separation theorem)


Let E be a normed linear space and X1 and X2 be nonempty disjoint
convex sets with X1 being an open set. Then there exists a functional
f ∈ E ∗ and a real number β, such that
X1 ⊆ {x ∈ E : Ref (x) < β}, X2 ⊆ {x ∈ E : Ref (x) ≥ β}

Before we prove the theorem we introduce a few terminologies and a


lemma.
5.2.11 Definition: absorbing set
Let E be a linear space. A set X ⊆ E is said to be an absorbing set
if for every x ∈ E, there exists t > 0 such that t−1 x ∈ X.
5.2.12 Definition: Minkowski functional
Let X ⊆ E be a convex, absorbing set. Then μ_X : E −→ ℝ defined by

    μ_X(x) = inf{ t > 0 : t⁻¹x ∈ X }

is called the Minkowski functional of X.

5.2.13 Remark
(i) If X is an absorbing set, μ_X(x) < ∞ for every x ∈ E.
(ii) If X is an absorbing set, then θ ∈ X.
(iii) If E is a normed linear space, then every open set containing θ is an
absorbing set.

Proof: We prove (iii). An open set U containing θ contains an open ball
B(θ, ε). For any x ∈ E, choose α > 0 so large that ||x||/α < ε. Then

    ||α⁻¹x|| = ||x||/α < ε,

i.e., α⁻¹x ∈ B(θ, ε) ⊆ U. Hence, every open set containing θ is an absorbing set.


5.2.14 Lemma
Let X be a convex, absorbing subset of a linear space E and let μX be
the corresponding Minkowski functional. Then μX is a sublinear functional,
[see 5.1.7] and

{x ∈ E : μX (x) < 1} ⊆ X ⊆ {x ∈ E : μX (x) ≤ 1}. (5.23)

Proof: For μX to be a sublinear functional, it has to satisfy the properties,

μ_X(x + y) ≤ μ_X(x) + μ_X(y),   μ_X(ζx) = ζμ_X(x)

for all x, y ∈ E and for all ζ ≥ 0.


Let x, y ∈ E. Let p > 0, q > 0 be such that p⁻¹x ∈ X, q⁻¹y ∈ X. Then,
using the convexity of X, we have

    (p + q)⁻¹(x + y) = ( p/(p + q) ) · p⁻¹x + ( q/(p + q) ) · q⁻¹y ∈ X

Hence, μ_X(x + y) ≤ p + q.
Taking the infimum over all such p and q, it follows that

    μ_X(x + y) ≤ μ_X(x) + μ_X(y)
Next, we show that μ_X(ζx) = ζμ_X(x) for all x ∈ E and for all ζ ≥ 0.
Let x ∈ E and ζ > 0. Let p > 0 be such that p⁻¹x ∈ X. Since
(ζp)⁻¹(ζx) = p⁻¹x ∈ X, we have μ_X(ζx) ≤ ζp. Taking the infimum over
all p > 0 with p⁻¹x ∈ X, we have

    μ_X(ζx) ≤ ζμ_X(x)                                                           (5.24)

Let us take ζx in place of x and let p > 0 be such that p⁻¹(ζx) ∈ X.
Since p⁻¹(ζx) = (ζ⁻¹p)⁻¹x, we have

    μ_X(x) ≤ ζ⁻¹p

Taking the infimum over all such p > 0, we obtain

    μ_X(x) ≤ ζ⁻¹μ_X(ζx)

Thus, ζμX (x) ≤ μX (ζx) (5.25)


It follows from (5.24) and (5.25) that
μX (ζx) = ζμX (x), ζ ≥ 0

To prove the last part of the lemma we proceed as follows. Let x ∈ X.


Then 1 ∈ {t > 0 : t−1 x ∈ X}
Then μX (x) ≤ 1.
Next, let us suppose that x ∈ E is such that μ_X(x) < 1.
Then there exists p_0 > 0 with p_0 < 1 and p_0⁻¹x ∈ X. Since X is
convex and θ ∈ X, we have

    x = p_0 (p_0⁻¹x) + (1 − p_0) · θ ∈ X

Thus, μX (x) < 1 implies x ∈ X.


Hence (5.23) is proved.
5.2.15 Proof of Theorem 5.2.10
We prove the theorem for ℝ.
Let x_1 ∈ X_1 and x_2 ∈ X_2. Then X_1 − X_2 = {x_1 − x_2 : x_1 ∈ X_1, x_2 ∈ X_2}.
Since X_1 and X_2 are each convex, X_1 − X_2 is convex. Thus X_1 − X_2 is
non-empty and convex. We next show that X_1 − X_2 is open, given that X_1
is open. Since X_1 is open, it contains an open ball B(x_1, ε) around each x_1 ∈ X_1.
For x_2 ∈ X_2,

    B(x_1 − x_2, ε) = B(x_1, ε) − x_2 ⊂ X_1 − x_2

Hence,  X_1 − X_2 = ⋃_{x_2∈X_2} (X_1 − x_2)  is open in E.

Also, θ ∈
/ X1 − X2 , since X1 ∩ X2 = Φ.
Let X = X1 − X2 + u0 where u0 = x2 − x1 . Then X is an open convex
set with θ ∈ X. Hence, X is an absorbing set as well. Let μX be the
Minkowski functional of X.
In order to obtain the required functional, we apply theorem 5.1.9.
Let E_0 = span {u_0}, p = μ_X, and define the linear functional f_0 : E_0 −→ ℝ
by
    f_0(λu_0) = λ,   λ ∈ ℝ                                                      (5.26)

Since X_1 ∩ X_2 = Φ and u_0 = x_2 − x_1 ∉ X, by lemma 5.2.14 we have
μ_X(u_0) ≥ 1. Hence, for λ ≥ 0,

    f_0(λu_0) = λ ≤ λμ_X(u_0) = μ_X(λu_0),

while for λ < 0, f_0(λu_0) = λ < 0 ≤ μ_X(λu_0). Thus f_0(x) ≤ μ_X(x) for all x ∈ E_0.



Therefore, by theorem 5.1.9 f0 has a linear extension f : E → R such


that
f (x) ≤ μX (x), ∀ x ∈ E (5.27)

Lemma 5.2.14 yields that μ_X(x) ≤ 1 for every x ∈ X.


Hence, it follows from (5.27) that f (x) ≤ 1 for every x ∈ X.
Thus, f (x) ≥ −1 for every x ∈ (−X). Thus we have,

|f (x)| ≤ 1 ∀ x ∈ X ∩ (−X)

Since X ∩ (−X) is an open set containing θ on which |f| ≤ 1, f is bounded
in a neighbourhood of θ, and hence f is continuous.
Next, we show that there exists β ∈ ℝ such that
f (x1 ) < β ≤ f (x2 ), ∀ x1 ∈ X1 , x2 ∈ X2

Since f : E −→ ℝ is linear and X_1, X_2 ⊂ E are convex, f(X_1), f(X_2) are intervals in ℝ.

Given that X_1 is open, we next show that f(X_1) is open.
Since f is non-zero, there is some a ∈ E such that f(a) = 1, a ≠ θ. Let
x ∈ X_1. Since X_1 is open, it contains an open ball B(x, ε), ε > 0.
If |k| < ε/||a||, then x − ka ∈ X_1, so that f(x − ka) = f(x) − k ∈ f(X_1). Then

    { k_1 ∈ ℝ : |f(x) − k_1| < ε/||a|| } ⊂ f(X_1)

Hence, f(X_1) is open in ℝ.


Hence it is enough to show that f (x1 ) ≤ f (x2 ) for every x1 ∈ X1 and
every x2 ∈ X2 . Since x1 − x2 + u0 ∈ X, and taking note (5.26) by lemma
5.2.14, we have

f (x1 ) − f (x2 ) + 1 = f (x1 − x2 + u0 ) = μX (x1 − x2 + u0 ) ≤ 1

Thus, we have f (x1 ) ≤ f (x2 ) for all x1 ∈ X1 , x2 ∈ X2 .


In case the scalar field is ℂ, we can prove the theorem by using lemma
5.2.1.
5.2.16 Remark
Geometrically, the above separation theorem says that the set X1 lies
on one side of the real hyperplane {x ∈ E : Ref (x) = β} and the set X2
lies on the other, since

X1 ⊂ {x ∈ E : Ref (x) < β} and X2 ⊂ {x ∈ E : Ref (x) ≥ β}



Problems [5.1 and 5.2]


1. Show that a norm of a vector space E is a sublinear functional on E.
2. Show that a sublinear functional p satisfies (i) p(0) = 0 (ii) p(−x) ≥
−p(x).
3. Let L be a closed linear subspace of a normed linear space E, and x_0
be a vector not in L. Given that d is the distance from x_0 to L, show that
there exists a functional f_0 ∈ E* such that f_0(L) = 0, f_0(x_0) = 1 and
||f_0|| = 1/d.
4. (i) Let L be a linear subspace of the normed linear space E over ℝ (ℂ).
(ii) Let f : L −→ ℝ (ℂ) be a linear functional such that |f(x)| ≤ α||x||
for all x ∈ L and a fixed α > 0.
Then show that f can be extended to a continuous linear functional
F : E −→ ℝ (ℂ) such that

    |F(x)| ≤ α||x|| ∀ x ∈ E
5. Every linear functional f (x) defined on a linear subspace L of a
normed linear space E can be extended to the entire space with
preservation of the norm, that is, we can construct a linear functional
F (x), defined on E such that
(i) F (x) = f (x) for x ∈ L (ii) ||F ||E = ||f ||L .
Prove the above theorem in case E is separable without using Zorn’s
lemma (1.1.4).
6. Let L be a closed subspace of a normed linear space E such that
f (L) = 0 =⇒ f (E) = 0, ∀ f ∈ E ∗

Prove that L = E.
7. Let E be a normed linear space. For every subspace L of E and every
bounded linear functional f defined on L, prove that there is a unique
Hahn-Banach extension of f to E if and only if E* is strictly convex, that is,
for f_1 ≠ f_2 in E* with ||f_1|| = 1 = ||f_2|| we have ||f_1 + f_2|| < 2.
[Hint: If F_1 and F_2 are distinct extensions of f, show that (F_1 + F_2)/2 is also
a continuous linear extension of f and the strict convexity condition
is violated.]

5.3 Application to Bounded Linear


Functionals on C([a, b])
In this section we shall use theorem 5.1.3 for obtaining a general
representation formula for bounded linear functionals on C([a, b]), where
[a, b] is a fixed compact interval. In what follows, we use representations in


terms of the Riemann-Stieltjes integral. As a sort of recapitulation, we mention
a few definitions and properties of Riemann-Stieltjes integration, which is a
generalization of Riemann integration.
5.3.1 Definitions: partition, total variation, bounded variation
A collection of points P = [t0 , t1 , . . . , tn ] is called a partition of an
interval [a, b] if
a = t0 < t1 < · · · < tn = b holds (5.28)
Let w : [a, b] → ℝ be a function. Then the (total) variation Var(w)
of w over [a, b] is defined to be

    Var(w) = sup { Σ_{j=1}^{n} |w(t_j) − w(t_{j−1})| : P = [t_0, t_1, . . . , t_n] is a partition of [a, b] }      (5.29)

the supremum being taken over all partitions (5.28) of the interval [a, b].
If Var (w) < ∞ holds, then w is said to be a function of bounded
variation.
All functions of bounded variation on [a, b] form a normed linear space.
A norm on this space is given by

||w|| = |w(a)| + Var (w) (5.30)

The normed linear space thus defined is denoted by BV ([a, b]), where
BV suggests ‘bounded variation’.
We now obtain the concept of a Riemann-Stieltjes integral as follows.
Let x ∈ C([a, b]) and w ∈ BV ([a, b]). Let Pn be any partition of [a, b] given
by (5.28) and denote by η(Pn ) the length of a largest interval [tj−1 , tj ] that
is,
η(Pn ) = max(t1 − t0 , t2 − t1 , . . . , tn − tn−1 ).
For every partition P_n of [a, b], we consider the sum

    S(P_n) = Σ_{j=1}^{n} x(t_j)[w(t_j) − w(t_{j−1})]                            (5.31)

There exists a number I with the property that for every ε > 0 there is
a δ > 0 such that

    η(P_n) < δ =⇒ |I − S(P_n)| < ε

I is called the Riemann-Stieltjes integral of x over [a, b] with respect to
w and is denoted by

    ∫_a^b x(t) dw(t)                                                            (5.32)

Thus, we obtain (5.32) as the limit of the sum (5.31) for a sequence
{Pn } of partitions of [a, b] satisfying η(Pn ) → 0 as n → ∞.
In case w(t) = t, (5.32) reduces to the familiar Riemann integral of x
over [a, b].
Also, if x is continuous on [a, b] and w has a derivative which is integrable
on [a, b] then
    ∫_a^b x(t) dw(t) = ∫_a^b x(t) w′(t) dt                                      (5.33)
We show that the integral (5.32) depends linearly on x, i.e., given
x_1, x_2 ∈ C([a, b]),

    ∫_a^b [p x_1(t) + q x_2(t)] dw(t) = p ∫_a^b x_1(t) dw(t) + q ∫_a^b x_2(t) dw(t)

where p, q ∈ ℝ.
The integral also depends linearly on w ∈ BV([a, b]), because for all
w_1, w_2 ∈ BV([a, b]) and scalars r, s,

    ∫_a^b x(t) d(rw_1 + sw_2)(t) = r ∫_a^b x(t) dw_1(t) + s ∫_a^b x(t) dw_2(t)

5.3.2 Lemma
For x(t) ∈ C([a, b]) and w(t) ∈ BV([a, b]),

    | ∫_a^b x(t) dw(t) | ≤ max_{t∈[a,b]} |x(t)| · Var(w)                        (5.34)

Proof: If P_n is any partition of [a, b],

    S(P_n) = Σ_{j=1}^{n} x(t_j)(w(t_j) − w(t_{j−1}))

    |S(P_n)| = | Σ_{j=1}^{n} x(t_j)(w(t_j) − w(t_{j−1})) |

             ≤ max_{t_j∈[a,b]} |x(t_j)| · Σ_{j=1}^{n} |w(t_j) − w(t_{j−1})|

             ≤ max_{t∈[a,b]} |x(t)| · Var(w)

Hence, making η(P_n) → 0, we get

    | ∫_a^b x(t) dw(t) | ≤ max_{t∈[a,b]} |x(t)| · Var(w)                        (5.35)

The representation theorem for bounded linear functionals on C([a, b])


by F. Riesz (1909) [30] is discussed next.

5.3.3 Theorem (Riesz's representation theorem for functionals
on C([a, b]))
Every bounded linear functional f on C([a, b]) can be represented by a
Riemann-Stieltjes integral

    f(x) = ∫_a^b x(t) dw(t)                                                     (5.36)

where w is of bounded variation on [a, b] and has the total variation

    Var(w) = ||f||                                                              (5.37)
Proof: Let M ([a, b]) be the space of functions bounded in the closed
interval [a, b]. By making an appeal to Hahn-Banach theorem 5.1.3. we can
extend the functional f from C([a, b]) to the normed linear space M ([a, b])
that is defined by
||x|| = sup |x(t)|
t∈[a,b]

Furthermore, F is a bounded linear


functional and
1
||f ||C([a,b]) = ||F ||M ([a,b])
We define the function w needed in
(5.36). For this purpose, we consider the
function ut (ξ) as follows, a t b
Fig. 5.1 The function ut
3
1 for a ≤ ξ ≤ t
ut (ξ) = [see figure 5.1] (5.38)
0 otherwise

Clearly, ut (ξ) ∈ M ([a, b]). We mention that ut (ξ) is called the


characteristic function of the interval [a, t]. Using ut (ξ) and the functional
F , we define w on [a, b] by
w(a) = 0 w(t) = F (ut (ξ)) t ∈ [a, b]

We show that this function w is of bounded variation and Var (w) ≤


||f ||.
For a complex quantity we can use the polar form. If fact setting,
θ = argξ, we may write,
ξ = |ξ|e(ξ)
3
1 if ξ = 0
where e(ξ) =
eiθ if ξ = 0
We see that if ξ = 0, then |ξ| = ξe−iθ . Hence, for any ξ, zero or not, we
have,
|ξ| = ξe(ξ) (5.39)
198 A First Course in Functional Analysis

where the bar indicates complex conjugation.


In what follows we write,

j = e(w(tj ) − w(tj−1 )) and utj (ξ) = uj (ξ)

Then by (5.34), for any partition (5.26) we obtain,



n 
n
|w(tj ) − w(tj−1 )| = |F (u1 )| + |F (uj ) − F (uj−1 )|
j=1 j=2
n
= 1 F (u1 ) + j [F (uj ) − F (uj−1 )]
⎛ j=2 ⎞
n
= F ⎝ 1 u1 + j [uj − uj−1 ]⎠
0 j=2
0
0 0
0  n
0
≤ ||F || 0 u + [u − u ]
j−1 0
0
0 1 1 j j
0 j=2 0

Now, uj (ξ) = utj (ξ) = 1, tj−1 < ξ ≤ tj . Hence, on the right-hand


side of the above inequality ||F || = ||f || and the other factor || · · · || equals
1 because | j | = 1 and from the definition of uj (ξ)’s we see that for each
t ∈ [a, b] only one of the terms u1 , u2 · · · ui , . . . is not zero (and its norm is
1). On the left we can now take the supremum over all partitions of [a, b].
Then we have,
Var (w) ≤ ||f || (5.40)
Hence w is of bounded variation on [a, b].
We prove (5.36) when x ∈ C([a, b]). For every partition Pn of the form
(5.28) we define a function, which we denote simply by zn (t), keeping in
mind that zn depends on Pn , and not merely on n. zn (t) will be as follows:

n
(t) (t)
zn (t) = x(t0 )x(ut1 (t)) + x(tj−1 )[utj − utj−1 ] (5.41)
j=2

zn (t) is a step function.


Then zn ∈ M ([a, b]). By the definition of w,

n
F (zn ) = x(t0 )F (xt1 ) + x(tj−1 )[F (xtj ) − F (xtj−1 )]
j=2


n
= x(t0 )w(t1 ) + x(tj−1 )[w(tj ) − w(tj−1 )]
j=2


n
= x(tj−1 )[w(tj ) − w(tj−1 )] (5.42)
j=1
Linear Functionals 199

where the last equality follows from w(t0 ) = w(a) = 0. We now choose any
sequence {Pn } of partitions of [a, b] such that η(Pn ) → 0 (It is to be kept in
mind that tj depends on the particular partition Pn ). As n → ∞ the sum
on the right-hand side of (5.42) tends to the integral in (5.26) and (5.26)
follows provided F (zn ) −→ F (x), which equals f (x) since x ∈ C([a, b]).
We need to prove that F (zn ) −→ F (x). Keeping in mind the definition
of ut (ξ) (fig. (5.1)), we note that (5.41) yields zn (a) = x(a) · 1 since the sum
in (5.41) is zero at t = a. Hence zn (a) − x(a) = 0. Moreover by (5.41) if
tj−1 ≤ ξ < tj , then we obtain zn (t) = x(tj−1 ) · 1. It follows that for those
t,
|zn (t) − x(t)| = |x(tj−1 ) − x(t)|

Consequently, if η(Pn ) → 0, then ||zn − x|| → 0 because x is continuous


on [a, b], hence uniformly continuous on [a, b], since [a, b] is compact in . 4
The continuity of F now implies that F (zn ) → F (x) and F (x) = f (x) so
that
 b
f (x) = x(t)dw(t)
a

It follows from (5.34) and (5.36) that


 
 b 
 
|f (x)| =  x(t)dw(t) ≤ max |x(t)|Var (w) = ||x||Var (w)
 a  t∈[a,b]

Therefore, ∀ x ∈ C([a, b])

|f (x)|
||f || = sup ≤ Var (w) (5.43)
x =θ ||x||

It follows from (5.35) and (5.37) that

||f || = Var (w)

Note 5.3.1. We note that w in the theorem is not unique. Let us impose
on w the following conditions
(i) w is zero at a and continuous from the right

w(a) = 0, w(t + 0) = w(t) (a < t < b)

Then, w will be unique [see A.E. Taylor [55]].


200 A First Course in Functional Analysis

5.4 The General Form of Linear


Functionals in Certain Functional
Spaces
5.4.1 Linear functionals on the n-dimensional Euclidean space
4 n

Let f be a linear functional defined on En .


Now, x ∈ En can be written as

n
x= ξi e i where x = {ξ1 , ξ2 , . . . , ξn }
i=1

f being a linear functional


n

 
n
f (x) = f ξi ei = ξf (ei ) (5.44)
i=1 i=1


n
= ξ i fi where fi = f (ei )
i=1

For x = {ξi }, let us suppose,



n
φ(x) = ξi φi
i=1

where φi are arbitrary.


For y = {ηi }, we note that

n 
n 
n
φ(x + y) = (ξi + ηi )φi = ξi φi + ηi φi = φ(x) + φ(y)
i=1 i=1 i=1

n 
n
φ(λx) = λξi φi = λ ξi φi = λφ(x)
i=1 i=1

for all sectors λ.


Hence φ is a linear functional defined on an n-dimensional space. Since
φi can be regarded as the components of an n-dimensional vector φ, the
4 4
space ( n )∗ , the dual of n , is also an n-dimensional space with a metric,
generally speaking, different from the metric of n . 4
Let ||x|| = max |ξi |; then
i
 n  n

   n 
 
|φ(x)| =  ξi φi  ≤ |ξi ||φi | ≤ |φi | ||x||,
 
i=1 i=1 i=1
Linear Functionals 201

|φ(x)| 
n
Hence, ||φ|| = sup ≤ |φi | (5.45)
x =θ ||x|| i=1


n
On the other hand, if we select an element x0 = sgn φi ei ∈ 4n, then
i=1
||x0 || = 1 and

n 
n
φ(x0 ) = sgn φi · φi (ei ) = sgn φi · φi
i=1 i=1
n
= |φi |||x0 ||
i=1

n
Hence, ||φ|| ≥ |φi | (5.46)
i=1
From (5.45) and (5.46), it follows that

n
||φ|| = |φi |
i=1

If an Euclidean metric is introduced, we can verify that the metric in


( 4n)∗ is also Euclidean.
5.4.2 The general form of linear functional on s
s is the space of all sequences of numbers.
Let f (x) be a linear functional defined on s. Put

en = {ξin } where ξin = 0, i = n and ξnn = 1

Further, let f (en ) = un . The convergence in s is coordinatewise.


Therefore,

n ∞

x = lim ξk ek = ξk ek
n→∞
k=1 n=1

holds where x = {ξi }.


Because f is continuous,

m ∞

f (x) = lim ξk f (ek ) = ξk uk
m→∞
k=1 k=1

Since this series must converge for every number sequence {ξk }, the uk
must be equal to zero from a certain index onwards and consequently

m
f (x) = ξk uk
k=1
202 A First Course in Functional Analysis

Conversely, for any x = {ξi }, let φ(x) be given by



m
φ(x) = ξk uk (5.47)
k=1

where uk ’s are real and arbitrary. For if y = {ηi },



m 
m 
m
φ(x + y) = (ξk + ηk )uk = ξ k uk + ηk uk = φ(x) + φ(y)
k=1 k=1 k=1


m 
m
Moreover, φ(λx) = λξk uk = λ ξk uk = λφ(x).
k=1 k=1
Hence, φ(x) is a linear functional on s.
It therefore follows that every linear functional defined on s has the
general form given by (5.47) where m and uk , k = 1, 2, . . . , m are uniquely
defined by (5.47).
5.4.3 The general form of linear functionals on lp
Let f (x) be a bounded linear functional defined on lp . Since the elements
ei = {ξij } where ξij = 1 for i = j and ξij = 0 for i = j, form basis of lp ,
every element x ∈ lp can be written in the form


x= ξi ei
i=1

Since f (x) is bounded linear,




f (x) = ξi f (ei )
i=1

Writing ui = f (ei ), f (x) takes the form,




f (x) = ui ξ i (5.48)
i=1

(n)
Let us put xn = {ξi }, where
3
(n)
|ui |q−1 sgn ui , if i ≤ n
ξi =
0 if i > n
q is chosen such that the equality [(1/p) + (1/q)] = 1 holds

n
(n)

n 
n
f (xn ) = ui ξi = |ui |q−1 ui sgn ui = |ui |q (5.49)
i=1 i=1 i=1

On the other hand,


Linear Functionals 203



1/p
 (n)
f (xn ) ≤ ||f || ||xn || = ||f || |ξi |p
i=1

1/p
1/p

n 
n
= ||f || |ui | p(q−1)
= ||f || |ui | q

i=1 i=1
n
1/p

n 
Thus |ui |q ≤ ||f || |ui |q
i=1 i=1
1 1
Since + = 1
p q
n
1/q

Thus, |ui |q ≤ ||f ||.
i=1

Since the above is true for every n, it follows that



1/q

|ui |q ≤ ||f || (5.50)
i=1
Thus {ui } ∈ lq

Conversely, let us take an arbitrary sequence {vi } ∈ lq . Then, for


x = {ξi } let us write,


φ(x) = vi ξi
i=1

To show that φ is a linear functional, we proceed as follows. For


y = {ηi },


φ(x + y) = vi (ξi + ηi ) (5.51)
i=1

1/p

Since x, y ∈ lp , x + y ∈ lp , i.e., |ξ + ηi | p
< ∞.
i=1


1/q ∞

1/p
 
Since {vi } ∈ lq , |φ(x + y)| ≤ |vi | q
|ξi + ηv |
p
< ∞.
i=1 i=1
Hence, φ is additive and homogeneous.

1/q ∞

1/p
 
|φ(x)| ≤ |vi | q
|ξi | p
= M ||x||
i=1 i=1


1/q

where M = |vi | q

i=1
204 A First Course in Functional Analysis

Thus, φ is a bounded linear functional. For calculating the norm of the


functional we proceed as follows


1/q

|f (x)| ≤ |ui | q
||x||
i=1


1/q

Consequently, ||f || ≤ |ui |q (5.52)
i=1
It follows from (5.50) and (5.52)


1/q

||f || = |ui | q

i=1

5.4.4 Corollary
Every linear functional defined on l2 can be written in the general form


f (x) = ui ξ i
i=1



1/2
 
2 2
where |ui | < ∞ and ||f || = |ui |
i=1 i=1

5.5 The General Form of Linear Functionals


in Hilbert Spaces
4+
Let H be a Hilbert space over ( ) and f (x) a linear bounded functional
defined on H. Let N (f ) denote the null space of f (x), i.e., the space of
zeroes of f (x). Let x1 , x2 ∈ N (f ). Then

f (αx1 + βx2 ) = αf (x1 ) + βf (x2 ) = 0 ∀ scalars α, β

Hence, N (f ) is a subspace of H. Since f is a bounded linear functional


for a convergent sequence {xn }, i.e.,

xn → x ⇒ |f (xn ) − f (x)| ≤ ||f || ||xn − x|| → 0 as n → ∞

Hence, N (f ) is a closed subspace. Let x ∈ H and x ∈ / N (f ). Let x0 be


the projection of x on the subspace H − N (f ), the orthogonal complement
of H. Let f (x0 ) = α, obviously α = 0. Put x1 = x0 /α. Then
x  1
0
f (x1 ) = f = f (x0 ) = 1
α α
Linear Functionals 205

If now x ∈ H is arbitrary and f (x) = β, then f (x − βx1 ) =


f (x) − βf (x1 ) = 0. Let us put x − βx1 = z then z ∈ N (f ) and we
have x = βx1 + z. This equality shows that H is the orthogonal sum of
N (f ) and the one-dimensional subspace spanned by x1 .
Since z ∈ N (f ) and x1 ∈ H − N (f ),
z⊥x1 or x − βx1 , x1  = 0 where  ,  stands for the scalar product.
Hence,
x, x1  = βx1 , x1  = β||x1 ||2

Since β = f (x), we have


7 8
x1
f (x) = x,
||x1 ||2

x1
If = u, then
||x1 ||2
f (x) = x, u, (5.53)
i.e., we get the representation of an arbitrary functional as an inner product
of the element x and a fixed element u. The element u is defined uniquely
by f because if f (x) = x, v, then x, u − v = 0 for every x ∈ H, implying
u = v.
Further (5.53) yields,

|f (x)| = |x, u| ≤ ||x|| ||u||

which implies that

|f (x)|
sup ≤ ||u|| or ||f || ≤ ||u|| (5.54)
x =θ ||x||

Since, on the other hand, f (u) = u, u = ||u||2 , it follows that ||f ||
cannot be smaller than ||u||, hence ||f || = ||u||.
Thus, every linear functional f (x) in a Hilbert space H can be
represented uniquely in the form f (x) = x, u, where the element u is
uniquely defined by the functional f . Moreover, ||f || = ||u||.
Problem


1. If l1 is the space of real elements x = {ξi } where |ξi | < ∞, show
i=1
that a linear functional f on l1 can be represented in the form


f (x) = ck ξk
k=1

where {ck } is a bounded sequence of real numbers.


206 A First Course in Functional Analysis

5.6 Conjugate Spaces and Adjoint Operators


In 5.1.1 we have defined the conjugate (or dual) space E ∗ of a Banach
space E. We may recall that the conjugate (or dual) space E ∗ is the space
of bounded linear functionals mapping the Banach space E → . The idea 4
that comes next is to find the characterisation, if possible, of a conjugate
or dual space. In this case isomorphism plays a great role. We recall
that two spaces E and E  are said to be isomorphic if, between their
elements, there can be established a one-to-one correspondence, preserving
the algebraic structure, that is such that
, 3
x ←→ x x + y ←− x + y 
⇒ (5.55)
y ←→ y  λx ←− λx for scalar λ

5.6.1 Space 4n: the dual space of 4n is 4n


Let {e1 , e2 , . . . , en } be a basis in 4n . Then any x ∈ 4n can be written
as

n
x= ξi ei , where x = {ξi }
i=1

Let f be a linear functional defined on n . 4


n 
n
Then f (x) = ξi f (ei ) = ξi ai where ai = f (ei ).
i=1 i=1
Now, by Cauchy-Bunyakovsky-Schwartz inequality (sec. 1.4.3)
n
1/2 n
1/2 n
1/2
  
|f (x)| ≤ |ξi |2 |ai |2 = |ai |2 ||x||
i=1 i=1 i=1

1/2

n
where ||x|| = |ξi |2
i=1
n
1/2
|f (x)| 
2
Hence, ||f || = sup ≤ |ai | (5.56)
x =θ ||x|| i=1
Taking x = {ai } we see that
n
1/2
n 
f (x) = |ai |2 = |ai |2 ||x||
i=1 i=1
Hence, the upper bound in (5.56) is attained. That is
n
1/2

2
||f || = |ai |
i=1

This shows that the norm of f is the Euclidean norm and ||f || = ||a||
where a = {ai } ∈ . 4
Linear Functionals 207

4 4
Hence, the mapping of n onto n defined by f −→ a = {ai } where
ai = f (ei ), is norm preserving and, since it is linear and bijective, it is an
isomorphism.
5.6.2 Space l1 : the dual space of l1 is l∞
Let us take a Schauder basis {ei } for l1 , where ei = (δij ), δij stands for
the Kronecker δ-symbol.
Thus every x ∈ l1 has a unique representation of the form


x= ξi ei (5.57)
i=1

For any bounded linear functional f defined on l1 i.e. for every f ∈ l1∗
we have

 ∞

f (x) = ξi f (ei ) = ξi ai (5.58)
i=1 i=1

where ai = f (ei ) are uniquely defined by f . Also, ||ei || = 1, i = 1, 2, . . .


and
|ai | = ||f (ei )|| ≤ ||f || ||ei || = ||f || (5.59)

Hence sup |ai | ≤ ||f ||. Therefore {ai } ∈ l∞ .


i
Conversely, for every b = {bi } ∈ l∞ we can obtain a corresponding
bounded linear functional φ on l1 . We can define φ on l1 by


φ(x) = ξi bi , where x = {ξi } ∈ l1
i=1

If y = {ηi } ∈ l1 , then

 ∞
 ∞

φ(x + y) = (ξi + ηi )φ(ei ) = ξ i bi + η i bi
i=1 i=1 i=1
= φ(x) + φ(y) showing φ is additive

For all scalars λ,



 ∞

φ(λx) = (λξi )φ(ei ) = λ ξi φ(ei )
i=1 i=1
= λφ(x), i.e., φ is homogeneous.

Thus φ is homogeneous. Hence, φ is linear.



 ∞

Moreover, |φ(x)| ≤ |ξi · bi | ≤ sup |bi | |ξi | = ||x|| sup |bi |
i i
i=1 i=1
208 A First Course in Functional Analysis

|φ(x)|
Therefore, ||φ|| = sup ≤ sup |bi | < ∞ since b = {bi } ∈ l∞ . Thus
x =θ ||x|| i

φ is bounded linear and φ ∈ l1 .
We finally show that the norm of f is the norm on the set l∞ . From
(5.58), we have,
∞ 
  ∞

 
|f (x)| =  ξi ai  ≤ sup |ai | |ξi | = ||x|| sup |ai |
  i i
i=1 i=1

f (x)
Hence, ||f || = sup ≤ sup |ai |.
x =θ ||x|| i
It follows from (5.59) and this above inequality,
||f || = sup |ai |,
i

which is the norm on l∞ . Hence, we can write ||f || = ||a|| where


a = {ai } ∈ l∞ . It shows that the bijective linear mapping of l1∗ onto
l∞ defined by f → a = {ai } is an isomorphism.
5.6.3 Space lp theorem
The dual space of lp is lq , here, 1 < p < ∞ and q is the conjugate of p,
1 1
that is, + = 1
p q
Proof: A Schauder basis for lp is {ei } where ei = {δij }, δij is the Kronecker
δ symbol. Thus for every x = {ξi } ∈ lp we can find a unique representation
of the form ∞

x= ξi ei (5.60)
i=1

We consider any f ∈ where lp∗ lp2 is the conjugate (or dual) space of lp .
Since f is linear and bounded,

 ∞

f (x) = ξi f (ei ) = ξi ai (5.61)
i=1 i=1

where ai = f (ei ).
1 1
Let q be the conjugate of p i.e. + = 2.
p q
(n)
Let xn = {ξ i } with
3
(n) |ai |q /ai if i ≤ n and ai = 0
ξi = (5.62)
0 if i > n or ai = 0

 
n
(n)
Then f (xn ) = ξ i ai = |ai |q
i=1 i=1
Using (5.62) and that (q − 1)p = q, it follows from the above,
Linear Functionals 209


1/p
 (n) p
f (xn ) ≤ ||f || ||xn || = ||f || |ξ |
i=1

1/p

n
= ||f || |ai | p(q−1)

i=1

1/p

n
= ||f || |ai |q
i=1

1/p

n 
n
Hence, f (xn ) = |ai | ≤ ||f ||
q
|ai | q

i=1 i=1

Dividing both sides by the last factor, we get,


n
1−p−1 n
1/q
 
|ai | q
= |ai | q
= ||f || (5.63)
i=1 i=1
Hence, on letting n → ∞, we prove that,
{ai } ∈ lq

Conversely, for b = {bi } ∈ lq , we can get a corresponding bounded linear


functional Φ on lp . For x = {ξi } ∈ lp , let us define Φ as,


Φ(x) = ξ1 bi (5.64)
i=1

For y = {ηi } ∈ lp , we have,



 ∞
 ∞

Φ(x + y) = (ξi + ηi )bi = ξi bi + ηi bi
i=1 i=1 i=1

1/p ∞
1/p ⎤
∞ 
= Φ(x) + Φ(y) ≤ ⎣ |ξi |p + |ηi |p ⎦×
i=1 i=1

1/q

|bi | q
<∞
i=1

 
α
Also, Φ(αx) = (αξi )bi = α ξi bi = αΦ(x), for all scalars α.
i=1 i=1
Hence φ is linear.
To prove that Φ is bounded, we note that,
  ∞
1/p ∞
1/q
∞   
 
|Φ(x)| =  ξi bi  ≤ |ξi | p
|bi | q
 
i=1 i=1 i=1
210 A First Course in Functional Analysis



1/q

= |bi | q
||x||
i=1

Hence, Φ is bounded since {bi } ∈ lq .


Thus, Φ is a bounded linear functional. Finally, to show that the norm
of f is the norm on the space lq we proceed as follows: (5.61) yields,
 
1/p ∞
1/q
∞  ∞
 
 
|f (x)| =  ξi ai  ≤ |ξi | p
|ai |q
 
i=1 i=1 i=1

1/q

= ||x|| |ai |q
i=1

1/p
|f (x)| 
Hence, ||f || = sup ≤ |ai | q
(5.65)
x =θ ||x|| i=1
It follows from (5.63) and (5.65) that

1/q

||f || = |ai | q
= ||a||q (5.66)
i=1

Thus, ||f || = ||a||q where a = {ai } ∈ lq and ai = f (ei ). The mapping of


lp∗ onto lq as defined f → a is linear and bijective, and from (5.66) we see
that it is norm preserving. Therefore, it is an isomorphism.
Note 5.6.1.
(i) It can be shown that lq∗ is isomorphic to lp . Hence
(ii) l2∗ = l2 , i.e., l2 is called a self-conjugate space.
(iii) A linear functional in a Hilbert space is spanned by elements of the
same space. A Hilbert space is, therefore, self-conjugate.
5.6.4 Reflexive space
Let E be a normed linear space. In 5.1.2, we have defined E ∗ , the
conjugate space of E as the space of bounded linear functionals defined on
E. In the same manner we can introduce the concept of a conjugate space
(E ∗ )∗ of a Banach space E ∗ and call the space E ∗∗ as the second conjugate
of the normed linear space E. To be more specific, consider a bounded
linear f defined on E, so that in f (x), f remains fixed and x varies over E.
We can also think of a situation where x is kept fixed and f is varying in
E. For example, let  1
f (x) = x(t)dg(t)
0

Then we have two cases: namely (i) g(t) is fixed and x(t) varying or
(ii) x(t) is fixed and g(t) varies. Now, since f (x) ∈ , f (x) can be treated 4
Linear Functionals 211

as a functional Fx , defined on E ∗ , for fixed x and variable f . Hence, it is


possible to write f (x) = Fx (f ). In what follows, we shall show that the
mapping Fx is an isometric isomorphism of E onto a subspace of E ∗∗ .
5.6.5 Theorem
Let E = {Φ} be a normed linear space over 4(+). Given x ∈ E, let
Fx (f ) = f (x) ∀ f ∈ E ∗ (5.67)

Then Fx is a bounded linear functional on E ∗ , i.e., Fx ∈ E ∗∗ .


Further, the mapping Fx is an isometric isomorphism of E onto the
subspace Ê = {Fx : x ∈ E} of E ∗∗ .
Proof: The mapping Fx satisfies,
Fx (αf1 + βf2 ) = (αf1 + βf2 )(x) = αf1 (x) + βf2 (x)
= αFx (f1 ) + βFx (f2 ) (5.68)
4+
∀ f1 , f2 ∈ E ∗ and α, β ∈ ( )
Hence Fx is linear.
Also Fx is bounded, since
|Fx (f )| = |f (x)| ≤ ||f || ||x||, ∀ f ∈ E ∗ . (5.69)

Consequently, Fx ∈ E ∗∗ . If Fx ∈ E ∗∗ is not unique, let us suppose


F1x (f ) = F2x (f ) or (F1x − F2x )f = 0. We keep x fixed and vary f . Since
f is arbitrary F1x = F2x , showing that Fx is unique.
Thus to every x ∈ E, ∃ a unique Fx ∈ E ∗∗ given by (5.67). This defines
a function φ : E → E ∗∗ given by
φ(x) = Fx .
(i) We show that φ is linear.
For, x, y ∈ E and α, β ∈ 4(+),
(φ(αx + βy))(f ) = Fαx+βy (f ) = f (αx + βy)
= (αFx + βFy )(f )
= (αφ(x) + βφ(y))(f ), ∀ f ∈ E ∗
Hence, φ(αx + βy) = αφ(x) + βφ(y)

(ii) We next show that φ preserves norm.


For each x ∈ E, we have
 ,
|Fx (f )| ∗
||φ(x)|| = ||Fx || = sup :f ∈E
f =θ ||f ||
 ,
|f (x)| ∗
= sup :f ∈E
f =θ ||f ||
212 A First Course in Functional Analysis

Now, by theorem 5.1.4, for every x ∈ E, ∃ a functional g ∈ E ∗ such


that ||g|| = 1 and g(x) = ||x||.
|f (x)| |g(x)|
Therefore, ||φ(x)|| = ||Fx || = sup ≥ = ||x||
f =θ ||f || ||g||
Using (5.69) we prove ||φ(x)|| = ||x||.
(iii) We next show that φ is injective. Let x, y ∈ E. Then

x − y = θ ⇒ ||x − y|| = 0 ⇒ ||φ(x − y)|| = 0


⇒ ||φ(x) − φ(y)|| = 0 ⇒ φ(x) = φ(y)

We thus conclude that φ is an isometric isomorphism of E onto the


subspace Ê(φ(E)) of E.
5.6.6 Definition
4+
Let E be a normed linear space over ( ). The isometric isomorphism
φ : E → E ∗∗ defined by φ(x) = Fx is called the natural embedding (or
the canonical embedding) of E into the second conjugate space E ∗∗ . The
functional Fx ∈ E ∗∗ is called the functional induced by the vector x. We
refer to the functional of this type as induced functional.
φ

f
x Fx

E E* E**

Fig. 5.2

5.6.7 Definition: reflexive normed linear space


A normed linear space E is said to be reflexive if the natural embedding
φ maps the space E onto its second conjugate space E ∗∗ , i.e., φ(E) = E ∗∗ .
Note 5.6.2.

(i) If E is a reflexive normed linear space, then E is isometrically


isomorphic to E ∗∗ under the natural embedding.
(ii) If E is a reflexive normed linear space, since the second conjugate
space E ∗∗ is always complete, the space E must be complete. Hence,
completeness of the second conjugate space is a necessary condition
for a normed linear space to be complete. However, this condition
need not be sufficient (Example 4 (sec. 5.6.4)). Thus, it is clear that
if E is not a Banach space, then we must have φ(E) = E ∗∗ and hence
E is not reflexive.
Linear Functionals 213

5.6.8 Definitions: algebraic dual, topological dual


Algebraic dual (conjugate)
Given Ex , a topological linear space, the space of linear functionals
4
mapping Ex → is called the algebraic dual (conjugate) of Ex .
Topological dual (conjugate)
On the other hand the space of continuous linear functionals mapping
4
Ex → is called the topological dual (conjugate) of Ex .
5.6.9 Examples
1. 4 n
, n-dimensional Euclidean space is reflexive.
4 4
In 5.4.1 we have seen that n∗ = n . Then,

(4n )∗∗ = (4n∗ )∗ = (4n )∗ = 4n

4
Hence, n is reflexive.
Note 5.6.3 Every finite dimensional normed linear space is reflexive. We
know that in a finite dimensional normed linear space E, every linear
functional on E is bounded, so that the reflexivity of E follows.
2. The space lp (p > 1)
1 1
In 5.6.3, we have seen that lp∗ = lq , + = 1
p q
∗∗ ∗ ∗ ∗
Therefore, lP = (lp ) = (lq ) = lp
Hence, lp is reflexive.
3. The space C([0, 1]) is non-reflexive. For that see 6.2.
5.6.10 Theorem
A normed linear space is isometrically isomorphic to a dense subspace
of a Banach space.
Proof: Let E be a normed linear space. If φ : E → E ∗∗ be the natural
embedding, then E and φ(E) are isometrically isomorphic spaces. But
φ(E) is a dense subspace of φ(E) and φ(E) is a closed subspace of the
Banach space E ∗∗ , it follows that φ(E) itself is a Banach space. Hence E
is isometrically isometric to the dense subspace φ(E) of the Banach space
φ(E).
We next discuss the relationship between separability and reflexivity of
a normed linear space.
5.6.11 Theorem
Let E be a normed linear space and E ∗ be its dual. Then E ∗ is separable
⇒ E is separable.
N
Proof: Since E ∗ is separable, ∃ a countable set S = {fn : fn ∈ E ∗ , n ∈ }
such that S is dense in E ∗ , i.e., S = E ∗ .
214 A First Course in Functional Analysis

For each n ∈ N, choose xn ∈ E such that


1
||xn || = 1 and |fn (x)| ≥ ||fn ||
2
Let X be a closed subspace of E generated by the sequence {xn }, i.e.,
X = span{xn ∈ E, n ∈ }. N
Suppose X = E then ∃ a point x0 ∈ E − X. Theorem 5.1.5 yields that
we can find a functional θ = g ∈ E ∗ such that g(x0 ) = 0 and g(X) = 0.

⎨g(xn ) = 0, n ∈ N
Thus 1
⎩ ||fn || ≤ |fn (xn )| = |(fn − g)(xn )| ≤ ||fn − g||
2
Therefore, ||g|| ≤ ||fn − g|| + ||fn || ≤ 3||fn − g|| ∀ n ∈ N
But since S = E ∗ , it follows that g = 0, which contradicts the
assumption that X = E. Hence, X = E and thus E is separable.
5.6.12 Theorem
Let E be a separable normed linear space. If the dual E ∗ is non-
separable then E is non-reflexive.
Proof: Let E be reflexive if possible. Then, E ∗∗ is isometrically isomorphic
to E under the natural embedding. Given E is separable, E ∗∗ will be
separable. But, by theorem 5.6.11, E ∗ is separable, which contradicts our
assumption. Hence, E is non reflexive.
5.6.13 Example
The space (l1 , || · ||1 ) is not reflexive.
The space l1 is separable.
Now, (l1 )∗ = l∞ . But l∞ is not separable. By theorem 5.6.12 we can
say that l1 is non-reflexive.
5.6.14 Adjoint operator
We have, so far, talked about bounded linear operators and studied their
properties. We also have discussed bounded linear functionals. Associated
with linear operators are adjoint linear operators. Adjoint linear operators
find much use in the solution of equations involving operators. Such
equations arise in Physics, Applied Mathematics and in other areas.
Let A be a bounded linear operator mapping a Banach space Ex into a
Banach space Ey , and let us consider the equation Ax = y, x ∈ Ex , y ∈ Ey .
4
If g : Ey → be a linear functional, then
g(y) = g(Ax) = a functional of x = f (x) (say) (5.70)
f (x) is a functional on Ex .
We can see that f is linear. Let x1 , x2 ∈ Ex and y1 , y2 ∈ Ey , such that
y1 = Ax1 , y2 = Ax2
Linear Functionals 215

Then g(y1 + y2 ) = g(Ax1 + Ax2 ) = g(A(x1 + x2 )) = f (x1 + x2 ).


Since g is linear,

f (x1 + x2 ) = g(y1 + y2 ) = g(y1 ) + g(y2 ) = f (x1 ) + f (x2 ) (5.71)

Thus, f is linear. Hence the functional f ∈ Ex∗ corresponds to


some g ∈ Ey∗ . This sets the definition of an adjoint operator. The
correspondence so obtained forms a certain operator with domain Ey∗ and
range contained in Ex∗ .
5.6.15 Definition: adjoint operator
Let A be a bounded linear operator mapping a normed linear space Ex
into a normed linear space Ey , let f ∈ Ex∗ and g ∈ Ey∗ be given linear
functionals, then the operator adjoint to A is denoted by A∗ and is given
by f = A∗ g [see figure 5.3]
A
Ex Ey

f = A*g g

( )

Fig. 5.3

5.6.16 Examples
4 4
1. Let A be an operator in ( n → n ), where n is an n-dimensional 4
space. Then A is defined by a matrix (aij ) of order n and equality y = Ax
where x = {ξ1 , ξ2 , . . . , ξn } and y = {η1 , η2 , . . . , ηn } such that

n
ηi = aij ξj
j=1

Consider a functional f ∈ n∗
(=4 4n ) since 4n is self-conjugate;

n
f = (f1 , f2 , . . . , fn ), f (x) = fi ξi .
i=1


n 
n 
n
Hence, f (Ax) = fi ηi = fi aij · ξj
i=1 i=1 j=1
n


n 
n 
n 
= aij fi ξj = aij fi ξj
i=1 j=1 j=1 i=1


n
= φ j ξj (5.72)
j=1
216 A First Course in Functional Analysis


n
where φj = aij fi (5.73)
i=1

The vector φ = (φ1 , φ2 , . . . , φn ) is an element of n and is obtained 4


from the vector f = (f1 , f2 , . . . , fn ) of the same space by the linear
transformation
φ = A∗ f
where A∗ is the transpose of the matrix A. Therefore, the transpose of A
corresponds to the adjoint of the matrix A in the n-dimensional space.
2. Let us consider in L2 ([0, 1]) the integral
 1
T f = g(s) = K(s, t)f (t)dt
0

K(s, t) is a continuous kernel.


An arbitrary linear functional φ(g) ∈ L2 ([0, 1]) will be of the form g, v
where v ∈ L2 ([0, 1]) and  ,  denotes scalar product.
This is because L2 ([0, 1]) is a Hilbert space.
 1 1
φ(g) = g, v = K(s, t)f (t)dtv(s)ds
0 0
 1  1 
= K(s, t)v(s)ds f (t)dt
0 0
(on change of order of integration by Fubini’s theorem 10.5.3)
 1
= (T ∗ v)(t)f (t)dt
0

= T ∗ v, f .
 1
where, T ∗ v(s) = K(t, s)v(t)dt
0

Thus, in the given case, the adjoint operator is also an integral operator,
the kernel K(t, s) which is obtained by interchanging the arguments of
K(s, t). K(t, s) is called the transpose of the kernel K(s, t).
5.6.17 Theorem
Given A, a bounded linear operator mapping a normed linear space Ex
into a normed linear space Ey , its adjoint A∗ is also a bounded linear
operator, and ||A|| = ||A∗ ||.
Let f1 = A∗ g1 and f2 = A∗ g2 .
Hence, g1 (y) = g1 (Ax) = f1 (x), x ∈ Ex , f1 ∈ Ex∗ , y ∈ Ey , g1 ∈ Ey∗ .
Also g2 (y) = g2 (Ax) = f2 (x).
Now, g1 and g2 are linear functionals and hence f1 and f2 are linear
functionals.
Linear Functionals 217

Now, (g1 + g2 )(y) = g1 (y) + g2 (y) = f1 (x) + f2 (x) = (f1 + f2 )(x)


or f1 + f2 = A∗ (g1 + g2 ) or A∗ g1 + A∗ g2 = A∗ (g1 + g2 )
Thus, A∗ is a linear functional.
Moreover, |A∗ g(x)| = |f (x)| = |g(Ax)| ≤ ||g|| ||A|| ||x||
|A∗ g(x)|
or, ||A∗ g|| = sup ≤ ||g|| ||A||
x =θ ||x||

||A g||
Hence, ≤ ||A||
||g||
||A∗ g||
Therefore, ||A∗ || = sup = ≤ ||A|| (5.74)
g =θ ||g||

Let x0 be an arbitrary element of Ex . Then, by theorem 5.1.4, there


exists a functional g0 ∈ Ey∗ such that ||g0 || = 1 and g0 (Ax0 ) = ||Ax0 ||.

Hence, ||Ax0 || = g0 (Ax0 ) = f0 (x0 ) ≤ ||f0 || ||x0 ||


= ||A∗ g0 || ||x0 || ≤ ||A∗ || ||g0 || ||x0 ||
||Ax0 ||
or, ||A|| = sup ≤ ||A∗ || [since ||g0 || = 1] (5.75)
x =θ ||x0 ||
It follows from (5.74) and (5.75) that
||A|| = ||A∗ ||

5.6.18 Adjoint operator for an unbound linear operator


Let A be an unbounded linear operator defined on a subspace Lx dense
in Ex with range in the space Ey . The notion of an adjoint to such an
unbounded operator can be introduced. Let g ∈ Ey∗ and let

g(Ax) = f0 (x), x ∈ Lx

Let x1 , x2 ∈ Lx .

Then g(A(x1 + x2 )) = g(Ax1 + Ax2 )


= g(Ax1 ) + g(Ax2 )
since g is a linear functional defined on Ey .
= f0 (x1 ) + f0 (x2 )

On the other hand, g(A(x1 + x2 )) = f0 (x1 + x2 ), showing that f0 is


additive. Similarly, we can show that f0 is homogeneous. Thus, f0 is
linear. But f0 is not in general bounded. In case f0 is bounded, since Lx
is everywhere dense in Ex , f0 can be extended to the entire space Ex .
In case A∗ is not defined on the whole space Ey∗ which contains θ, it
must be defined on some subspace L∗y ⊂ Ey∗ . This will lead to the linear
218 A First Course in Functional Analysis

functional f ∈ Ex∗ being set in correspondence to the linear functional


g ∈ Ey∗ . This operator A∗ is also called the adjoint of the unbounded
linear operator A.
Thus we can write f0 = A∗ g, g ∈ L∗y .
Let g1 , g2 ∈ L∗y . Then, for fixed x ∈ Lx .

g1 (Ax) = f1,0 (x)
(5.76)
Similarly, g2 (Ax) = f2,0 (x)
Therefore, (g1 + g2 )(Ax) = (f1,0 + f2,0 )(x) (5.77)
Thus, (g1 + g2 ) ∈ L∗y , showing that L∗y is a subspace. It follows form
(5.74) that f1,0 = A∗ g1 , f2,0 = A∗ g2 . Hence (5.76) gives

A∗ (g1 + g2 ) = f1,0 + f2,0 = A∗ g1 + A∗ g2

This shows that A∗ is a linear operator, but generally not bounded.


5.6.19 The matrix form of operators in space with basis and the
adjoint
Let E be a Banach space with a basis and A a bounded linear operator
mapping E into itself.
Let {ei } be a basis in E and x ∈ E can be written as


x= αi ei
i=1
Thus, A being bounded,
∞
y = Ax = αi Aei
i=1
Since Aei is again an element of E, it can be represented by
∞
Aei = pki ek
k=1
Then we can write,


n 
n 
y = Ax = lim αi Aei = lim αi pki ek (5.78)
n n
i=1 i=1 k=1


Thus, y = βk ek (5.79)
k=1


where, βk = pki αi (5.80)
i=1

Let {φj } be a sequence of functionals biorthogonal to the sequence {ei },


i.e.,
Linear Functionals 219

3
1 if j = k
φj (ek ) = (5.81)
0 if j = k
Then (5.79) and (5.80) imply,
3 ∞



n 
βm = φm (y) = φm lim αi pki ek
n
i=1 k=1
3 ∞



n 
= lim φm αi pki ek
n
i=1 k=1


n 
n
= lim αi pki φm (ek )
n
i=1 k=1


n
= lim pmi αi (5.82)
n
i=1

Equation (5.82) shows that the operator A is uniquely defined by the


infinite matrix (pki ). Thus, the components of the element y = Ax are
uniquely defined by the components of the element x. Thus a finite matrix
gets extended to an infinite dimensional matrix.
5.6.20 Adjoint A∗ of an operator A represented by an infinite
matrix
Let A∗ denote the operator adjoint to A and A∗ map E ∗ into itself. Let
f = A∗ g, i.e., g(y) = g(Ax) = f (x) for every x ∈ E.

 ∞

Furthermore, let g = ci fi and f = di f i
i=1 i=1
3 ∞
 3 3 n ∞

  
Then g(Ax) = g A αi ei = g lim pki αi ek
n
i=1 k=1 i=1
3 ∞



n 
= lim pki αi g(ek )
n→∞
k=1 i=1


n


n   
= lim pki αi ck = lim pki ck αi
n→∞ n→∞
k=1 i=1 i=1 k=1

On the other hand,



 ∞

g(Ax) = f (x) = αi fi (x) = di α i
i=1 i=1
∞ ∞
n

  
Consequently, di αi = lim pki ck αi (5.83)
n→∞
i=1 i=1 k=1
220 A First Course in Functional Analysis

Let x = em , i.e., αm = 1, αi = 0, for i = m. Thus (5.83) gives,


n ∞
dm = lim pkm ck = pkm ck
n
k=1 k=1

Thus, dm = A∗ cm where cm is the mth component of g. A∗ = (aji ) is the


transpose of A = (aij ). Thus, in the case of a matrix with infinite number of
elements, the adjoint operator is the transpose of the corresponding matrix.
Such representation of operator and their adjoints hold for instance in
the space l2 .
Note 5.6.4. Many equations of mathematical physics are converted into
algebraic equations so that numerical methods can be adopted to solve
them.
5.6.21 Representation of sum, product, inverse on adjoints
of such operator which admit of infinite matrix
representation
Given A, an operator which admits of infinite matrix representation in
a Banach space with a basis, we have seen that the adjoint operator A∗
admits of a similar matrix representation.
By routine manipulation we can show that
(i) (A + B)∗ = A∗ + B ∗ .
(ii) (AB)∗ = B ∗ A∗ where A and B are conformable for multiplication.
(iii) (A−1 )∗ = (A∗ )−1 , where A−1 exists.
Problems
1. Prove that the dual space of +n is +n.
2. Prove that the dual space of (+n , || · ||∞ ) is the space (+n , || · ||1 ).
3. Prove that the dual space of l2 is l2 .
4. Show that, although the sequence space l1 is separable, its dual (l1 )∗
is not separable.
5. Show that if E is a normed linear space its conjugate is a Banach
space.
6. Show that the space lp . 1 < p < ∞ is reflexive but l1 is not reflexive.
7. If E, a normed linear space is reflexive and X ⊂ E is a closed
subspace, then show that X is reflexive.
8. Show that a Banach space E is reflexive if and only E ∗ is reflexive.
9. If E is a Banach space and E ∗ is reflexive, then show that φ(E) is
closed and dense in E ∗∗ .
10. Let E be a compact metric space. Show that C(E) with the sup norm
is reflexive if and only if, E has only a finite number of points.
CHAPTER 6

SPACE OF BOUNDED
LINEAR
FUNCTIONALS

In the previous chapter, the notion of functionals and their extensions


was introduced. We have also talked about the space of functionals or
conjugate space and adjoint operator defined on the conjugate space. In
this chapter, the notion of the conjugate of a normed linear space and
its adjoints has been revisited. The null space and the range space of a
bounded linear operator and its transpose (adjoint) are related. Weaker
concept of convergence in a normed linear space and its dual (conjugate)
are considered. The connection of the notion of reflexivity with weak
convergence and with the geometry of the normed linear spaces is explored.

6.1 Conjugates (Duals) and Transposes


(Adjoints)
In 5.6 we have seen that the conjugate (dual) space E ∗ of a Banach space
E, as the space of bounded linear functionals mapping the Banach space
4
E → . Thus, if f ∈ E ∗ ,

|f (x)|
||f || = sup , x ∈ E.
x =θ ||x||

If, f1 , f2 ∈ E ∗ , then f1 = f2 =⇒ f1 (x) = f2 (x) ∀ x ∈ E


Again, f1 (x) = f2 (x) =⇒ (f1 − f2 )(x) = 0 =⇒ ||f1 − f2 || = 0 =⇒ f1 = f2 .
On the other hand, the consequence of the Hahn-Banach extension
theorem 5.1.4 shows that x1 = x2 in E if and only if f (x1 ) = f (x2 ) for

221
222 A First Course in Functional Analysis

all f ∈ E ∗ . This shows that

|f (x)|
||x|| = sup , x ∈ E.
f =θ ||f ||

in analogy with the definition of ||f || above.


This interchangeability between E and E ∗ explains the nomenclature
‘conjugate or dual’ for E ∗ .
6.1.1 Definition: restriction of a mapping
If F : E1 → E2 , E1 and E2 being normed linear spaces, and if E0 ⊆ E1 ,
then F |E0 defined for all x ∈ E0 is called the restriction of F to E0 .
6.1.2 Theorem
Let E be a normed linear space.

(a) Let E0 be a dense subset of E. For f ∈ E ∗ let F (f ) denote


the restriction of f to E0 . Then the map F is a linear
isometry from E ∗ onto E0∗ .
(b) IF E ∗ is separable then so is E.

Proof: Let f ∈ E ∗ . Now F (f ), being defined on E0 ⊆ E, belongs to


E0∗ . ||F (f )|| = ||f || and that the map is linear. Then by theorem 5.1.3,
F (f ) defined on E0 can be extended to the entire space with preservation
of norm. Hence, F is onto (surjective).
(b) [See theorem 5.6.11.]
6.1.3 Theorem
1 1
Let 1 ≤ p < ∞ and + = 1. For a fixed y ∈ lq , let
p q


fy (x) = ξ i yi where x = {ξi } ∈ lp .
i=1

Then fy ∈ (lp ) and ||fy || = ||y||q
The map f : lq → lp∗ defined by
F (y) = fy y ∈ lq
is a linear isometry from lq into (lp )∗ .
If 1 ≤ p < ∞. Then F is surjective (onto).
In fact, if f ∈ lp∗ and y = (f (e1 ), f (e2 ) . . .) then y ∈ lq and f = F (y).

Proof: Let y ∈ lq . For x ∈ lp , we have




|ξi yi | ≤ ||x||p ||y||q .
i=1
Space of Bounded Linear Functionals 223

For p = 1 or ∞ the above is true and follows by letting n → ∞ in Holder’s


inequality (sec. 1.4.3) if 1 ≤ p < ∞. Hence fy is well-defined, linear and
||fy || ≤ ||y||q . Next, to prove ||y||q ≤ ||fy ||. If y = θ, there is nothing to
prove. Assume, therefore, that y = θ. The above inequality can be proved
by following arguments as in 5.6.3.
If we let F (y) = fy , y ∈ lq , Then, F is a Linear isometry from lq into
(lp )∗ for 2 ≤ p < ∞.
Let 1 ≤ p < ∞. To show that F is surjective consider f ∈ (lp )∗ and
and let y = (f (e1 ), f (e2 ), . . .).
If, p = 1, we show from the expression for y that y ∈ l∞ . Let 1 ≤ p < ∞
and for n = 1, 2, . . . define y n = (y1 , y2 , . . . yn , 0 . . . 0).
Thus, y n ∈ lq . ||yn ||q ≤ ||fyn ||. Now,
3 ∞  
 
 
||fyn || = sup  xi yi  : x ∈ lp , |x||p ≤ 1
 
i=1

Let us consider x ∈ lp with ||x||p ≤ 1 and define xn = (x1 , x2 , . . . xn , 0 . . . 0).


Then xn belongs to lp , ||xn ||p ≤ ||x||p ≤ 1 and

n 
n
n
f (x ) = xi f (ei ) = xi yi = fyn (x).
i=1 i=1
Thus, ||fyn || ≤ ||f || = sup{|f (x)| : x ∈ lp , ||x||p ≤ 1}
⎛ ⎞ q1


so that ⎝ |yj |q ⎠ = lim ||y n ||q ≤ lim sup ||fyn ||
n→∞ n→∞
j=1

≤ ||f || < ∞, that is y ∈ lp .



n
Now, let x ∈ lp . Since p < ∞, we see that x = lim xi ei . Hence, by
n→∞
i=1
the continuity and the linearity of f ,
n
∞ ∞
  
f (x) = lim f xi ei = xi f (ei ) = xi yi = fy (x).
n→∞
i=1 i=1 i=1
Thus, f = fy that is F (y) = f showing that F is surjective.
In what follows we take c0 as the space of scaler sequences converging
to zero and c00 , as the space of scalar sequences having only finitely many
non-zero terms.
6.1.4 Corollary
1 1
Let 1 ≤ p < ∞ and + =1
p q
4 +
(i) The dual of n ( n ) with the norm || · ||p is linearly isometric to
4 +
n
( n ) with the norm || · ||q .
224 A First Course in Functional Analysis

(ii) The dual of c00 with the norm || · ||p is linearly isometric to lq .
(iii) The dual of c0 with the norm || ||∞ is linearly isometric to l1 .

 
n
Proof: (i) If we replace the summation ξi yi with the summation ξ i yi
i=1 i=1
in theorem 6.1.3 and follow its argument we get the result.
(ii) If 1 ≤ p < ∞. Then c00 is a dense subspace of lp , so that the dual
of c00 is linearly isometric to lq by theorems 6.1.2(a) and 6.1.3.
Let p = ∞, so that q = 1. Consider y ∈ l1 and define,
∞
fy (x) = xj yj , x ∈ c00 .
j=1
Following 6.1.3 we show that fy ∈ (c00 )∗ and ||fy || ≤ ||y||1 . Next, we


show that ||fy || = |yj | = ||y||1 and that the map F : l1 → (c00 )∗ given
j=1
by F (y) = fy is a linear isometry from l1 into (c00 )∗ .
To prove F is surjective, we consider f in (c00 )∗ and let y =
(f (e1 ), f (e2 ), . . .). Next we define for n = 1, 2, . . .
 ,
sgn yj if 1 ≤ j ≤ n
xnj =
0 if j > n

n 
n
so that ||f || ≥ f (xn ) = xnj yj = |yj |, n = 1, 2, . . .
j=1 j=1

n
so that y ∈ l1 . If x ∈ c00 then x = xi e i
i=1
for some n and hence

n 
n
f (x) = xj f (ej ) = xi yi = fy (x).
j=1 i=1

Thus, f = fy that is F (y) = f , showing that f is surjective.


(iii) Since c00 is dense in c0 , we use theorem 6.1.1(a) and (b) above.
Note 6.1.1. Having considered the dual of a normed linear space E, we
now turn to a similar concept for a bounded linear operator on Ex , a normed
linear space.
Let Ex and Ey be two normed linear spaces and A ∈ (Ex → Ey ).
Define a map A∗ : Ey∗ → Ex∗ as follows. For φ ∈ Ey∗ and x ∈ Ex , let
xφ(y) = φ(Ax) = f (x), where x ∈ Ex , y ∈ Ey and f ∈ Ex∗ . Then we can
write
f = A∗ φ.
A∗ is called adjoint or transpose of A. A∗ is linear and bounded [see
5.6.13 to 5.6.16].
Space of Bounded Linear Functionals 225

6.1.5 Theorem

Let Ex ,Ey and Ez be normed linear spaces.


(i) Let A, B ∈ (Ex → Ey ) and k ∈ 4(+).Then (A + B)∗ = A∗ + B∗ ,
and (kA)∗ = kA∗ .
(ii)Let A ∈ (Ex → Ey ) and C ∈ (Ey → Ez ). Then (CA)∗ = A∗ C ∗ .
(iii) Let A ∈ (Ex → Ey ). Then ||A∗ || = ||A|| = ||A∗∗ ||.
Proof:
(i)For proof of (A + B)∗ = A∗ + B ∗ and (kA)∗ = kA∗ see 5.7.13.
(ii) Since A∗ maps Ey∗ into Ex∗ we can find f ∈ Ex∗ and φ ∈ Ey∗ such
that φ(y) = φ(Ax) = f (x). Next, since C ∗ maps Ez∗ into Ey∗ we can find
ψ ∈ Ez∗ for φ ∈ Ey∗ such that ψ(z) = ψ(Cy) = φ(y).
Thus ψ(z) = ψ(CAx) = f (x).
Thus f = (CA)∗ ψ.
Now f = A∗ φ = A∗ (C ∗ ψ).
Hence (CA)∗ = A∗ C ∗ .
(iii) To show that ||A|| = ||A∗ || see theorem 5.6.17.
Now, A∗∗ = (A∗ )∗ . Hence the above result yields ||A∗∗ || = ||A∗ ||.
We have f = A∗ φ, i.e., A∗ : Ey∗ → Ex∗ since φ ∈ Ey∗ and f ∈ Ex∗ .
Since A∗∗ is the adjoint of A∗ , A∗∗ maps Ex∗∗ → Ey∗∗ .
If we write f (x) = Fx (f ), then for fixed x, Fx can be treated as a functional
defined on Ex∗ . Therefore, Fx ∈ Ex∗∗ .
Thus, for Fx ∈ Ex∗∗ , φ ∈ Ey∗ , we have A∗∗ (Fx )(φ) = Fx (A∗ (φ)).
G G
In particular, let x ∈ Ex and Fx = Ex (x), where Ex (x) is the canonical
embedding of Ex into Ex∗∗ . Thus, for every φ ∈ Ey∗ , we obtain
G G
A∗∗ ( Ex (x))(φ) = Ex (x)(A∗ (φ)) = A∗ (φ)(x)
G
= φ(Ax) = Ey (y)(A(x))(φ).
G G
Hence A∗∗ Ex (x) = E(y) (y)A. Schematically,

A
Ex Ey

∏Ex(x) ∏Ey(y)

A**
E x** E y**
Fig. 6.1
226 A First Course in Functional Analysis

6.1.6 Example
Let Ex = c00 = Ey , with the norm || · ||∞ . Then, by 6.1.4, Ex∗ is linearly
∗∗
isometric to l1 and by 6.1.3. EG
x is linearly isometric to l∞ . The completion
of c00 (that is the closure of c00 in (c∗∗00 ) is linearly isometric to c0 . Let
A ∈ (c00 → c00 ). Then A∗ can be thought of as a norm preserving linear
extension of A to l∞ .
We next explore the difference between the null spaces and the range
spaces of A, A∗ respectively.
6.1.7 Theorem
Let Ex and Ey be normed linear spaces and A ∈ (Ex → Ey ). Then

(i) N (A) = {x ∈ Ex : f (x) = 0 for all f ∈ R(A∗ )}.


(ii) N (A∗ ) = {φ ∈ Ey∗ : φ(y) = 0 for all y ∈ R(A)}.
In particular, A∗ is one-to-one if and only if R(A) dense in Ey .
(iii) R(A) ⊂ {y ∈ Ey : φ(y) = 0 for all φ ∈ N (A∗ )}, where equality holds
if and only if R(A) is closed in Ey .
(iv) R(A∗ ) ⊂ {f ∈ Ex∗ : f (x) = 0 for all x ∈ N (A)}, where equality holds
if Ex and Ey are Banach spaces and R(A) is closed in Ey .

In the above, N (A) denotes the null space of A. R(A) denotes the range
space of A, N (A∗ ) and R(A∗ ) will have similar meanings.
Proof: (i) Let x ∈ Ex . Let f ∈ Ex∗ and φ ∈ Ey∗ .
Then A∗ φ(x) = f (x) = φ(Ax).
Therefore, Ax = 0 if and only f (x) = 0 ∀ f ∈ R(A∗ ).
(ii) Let φ ∈ Ex∗ Then A∗ φ = 0 if and only if φ(Ax) = A∗ φ(x) = 0 for
every x ∈ Ex .
Now, A∗ is one-to-one, that is, N (A∗ ) = {θ} if and only if φ = θ
wherever φ(y) = 0 for every y ∈ R(A). Hence, by theorem 5.1.5, this
happens if and only if the closure of R(A) = Ey , i.e., R(A), is dense in Ey .
(iii) Let y ∈ R(A) and y = Ax for some x ∈ Ex . If φ ∈ N (A∗ ) then
φ(y) = φ(Ax) = f (x) = A∗ φ(x) = 0.
Hence R(A) ⊂ {y ∈ Ey : φ(y) = 0 for all φ ∈ N (A∗ )}.
If equality holds in this inclusion, then R(A) is closed in Ey . Since
R(A) = ∩{N (φ) : φ ∈ N (A∗ )}, and each N (φ) is a closed subspace of Ey .
Conversely, let us assume that R(A) is closed in Ey . Let y0 ∈ R(A), then
by 5.1.5 there is some φ ∈ Ey∗ such that φ(y0 ) = 0 but φ(y) = 0 for every
y ∈ R(A). In particular, A∗ (φ)(x) = f (x) = φ(Ax) = 0 for all x ∈ Ex i.e.,
φ ∈ N (A∗ ). This shows that y0 ∈ {y ∈ Ey : φ(y) = 0 for all φ ∈ N (A∗ )}.
Thus, equality holds in the inclusion mentioned above.
(d) Let f ∈ R(A∗ ) and f = A∗ φ for some φ ∈ Ey∗ . If x ∈ N (A), then
f (x) = A∗ φ(x) = φ(Ax) = φ(0) = 0. Hence, R(A∗ ) ⊂ {f ∈ Ex : f (x) = 0,
for all x ∈ N (A)}.
Space of Bounded Linear Functionals 227

Let us assume that R(A) is closed in Ey , and that Ex and Ey are


Banach spaces, we next want to show that the above inclusion reduces to
an equality. Let f ∈ Ex∗ be such that f (x) = φ(Ax) = 0 wherever Ax = 0.
We need to find φ ∈ Ey∗ such that A∗ φ = f , that is, φ(A(x)) = f (x) for
4+
every x ∈ Ex . Let us define ψ : R(A) → ( ) by ψ(y) = f (x), if y = Ax.

Since f (x) = 0 for all x ∈ N (A), ψ(y1 + y2 ) = f (x1 + x2 ), if y1 = Ax1


and y2 = f (x2 ). Since f is linear,
ψ(y1 + y2 ) = f (x1 + x2 ) = f (x1 ) + f (x2 ) = ψ(y1 ) + ψ(y2 ).
ψ(α y) = f (α x) = αf (x), α ∈ 4(+).
Thus ψ is well defined and linear.

Also, the map A : Ex → R(A), is linear, bounded and surjective, where


Ex is a Banach space and so is the closed subspace R(A) of the Banach
space Ey . Hence, by the open mapping theorem [see 7.3], there is some
r > 0 such that for every y ∈ R(A), there is some x ∈ Ex with Ax = y and
||x|| ≤ γ||y||, so that

|ψ(y)| = |f (x)| ≤ ||f || ||x|| ≤ γ||f || ||y||.

This shows that ψ is a continues linear functional on R(A). By the


Hahn-Banach extension theorem 5.2.3, there is some φ ∈ Ey∗ such that
φ|R(A) = ψ. Then A∗ (φ)(x) = φ(Ax) = ψ(Ax) = f (x) for every x ∈ Ex , as
desired.
Problems
1. Prove that the dual space of ( +n, || · ||1) is isometrically isomorphic
+
to ( n , || · ||∞ ).
2. Show that the dual space of (c0 , || · ||∞ ) is (l1 , || · ||1 ).
3. Let || · ||1 and || · ||2 be two norms on the normed linear space E
with ||x||1 ≤ K||x||2 , ∀ x ∈ E and K > 0, prove that (E ∗ , || · ||1 ) ⊆
(E ∗ , || · ||2 ).
4. Let Ex and Ey be normed spaces. For F ∈ (Ex → Ey ), show that

||F || = sup{|φ(F (x))| : x ∈ Ex , ||x|| ≤ 1, φ ∈ Ey∗ , ||φ|| ≤ 1}.

5. If S in a linear subspace of a Banach space E, define the annihilator


S 0 of S to be the subset S 0 = {φ ∈ E ∗ : φ(s) = 0 for all s ∈ S}. If
T is a subspace of E ∗ , define T 0 = {x ∈ E : f (x) = 0 for all f ∈ T }.
Show that

(a) S 0 is a closed linear subspace of E ∗ ,


(b) S 00 = S where S is the closure of S,
(c) If S is a closed subspace of E, then S ∗ is isomorphic to E ∗ /S 0 .
228 A First Course in Functional Analysis

6. ‘c’ denotes the vector subspace of l∞ consisting of all convergent


sequences. Define the limit functional φ : c → by φ(x) = 4
φ(x1 , x2 , . . .) = lim xn and ψ : l∞ →
n→∞
4
by ψ(x1 , x2 . . .) =
lim sup xn .
n→∞

(i) Show that φ is a continuous linear functional where ‘c’ is


equipped with the sup norm.
(ii) Show that ψ is sublinear and φ(x) = ψ(x) holds for all x ∈ c.

6.2 Conjugates (Duals) of Lp ([a, b]) and


C([a, b])
The problem of finding the conjugate (dual) of Lp ([a, b]) is deferred until
Chapter 10.
6.2.1 Conjugate (dual) of C([a, b])
Riesz’s representation theorem on functionals on C([a, b]) has already
been discussed. We have seen that bounded linear functional f on [a, b] can
be represented by a Riemann-Stieljes integral
 b
f (x) = x(t)dw(t) (6.1)
a

where w is a function of bounded variation on [a, b] and has the total


variation
Var(w) = ||f || (6.2)

Note 6.2.1. Let BV ([a, b)] denote the linear space of ( )-valued 4+
functions of bounded variation on [a, b]. For w ∈ BV ([a, b]) consider

||w|| = |w(a)| + Var(w).

Thus || · || is a norm on BV ([a, b]). For a fixed w ∈ BV ([a, b]), let us define
4+
fw : C[a, b] → ( ) by
 b
fw (x) = xdw. x ∈ C([a, b]).
a

Then fw ∈ C ∗ ([a, b]) and ||fw || ≤ ||w||. However, ||fw || may not be equal
to ||w||. For example, if z = w + 1, then fz = fw , but ||z|| = ||w|| + 1, so
that either ||fw || =
 ||w|| or ||fz || = ||z||.
This shows that distinct functions of bounded variation can give rise to
the same linear functional on C([a, b]). In order to overcome this difficulty
a new concept is introduced.
Space of Bounded Linear Functionals 229

6.2.2 Definition [normalized function of bounded variation]


A function w of bounded variation on [a, b] is said to be normalised
if w(a) = 0 and w is right continuous on ]a, b[. We denote the set of all
normalized functions of bounded variation on [a, b] by N BV ([a, b]). It is a
linear space and the total variation gives rise to a norm on it.
6.2.3 Lemma
Let w ∈ BV ([a, b]). Then there is a unique y ∈ N BV ([a, b]) such that
 b  b
xdw = xdy
a a

for all x ∈ C([a, b]). In fact,



⎨ 0, if t = a
y(t) = w(t+ ) − w(a), if t ∈]a, b[ .

w(b) − w(a), if t = b
Moreover, Var(y) ≤ Var(w).
4+
Proof: Let y : [a, b] → ( ) be defined as above. Note that the right
limit w(t+ ) exists for every t ∈]a, b[, because Rew, and Imw are real valued
functions of bounded variation and hence each of them is a difference of
two monotonically increasing functions. This also shows that w has only a
countable number of discontinuities in [a, b].
Let > 0. We show that Var(y) ≤ Var(w) + . Consider a partition
a = t0 < t1 < t2 < · · · tn−1 < tn = b.
s0 sn

a = t0 t1 s1 t2 s2 t3 tn = b
Fig. 6.2

Choose point s1 , s2 , . . . , sn−1 in ]a, b[, at which w is continuous and which


satisfy.

tj < sj , [w(t+ j ) − w(sj )] < , j = 1, 2, . . . , n − 1.
2n
Let s0 = a and sn = b. Then,
|y(t1 ) − y(t0 )| ≤ |w(t+
1 ) − w(s1 )| + |w(s1 ) − w(s0 )|
|y(tj ) − y(tj−1 )| ≤ |w(t+
j ) − w(sj )| + |w(sj ) − w(sj−1 )|

+ |w(sj−1 ) − w(t+
j−1 )|, j = 2, . . . (n − 2).

|y(tn ) − y(tn−1 )| ≤ |w(sn ) − w(sn−1 )| + |w(sn−1 ) − w(t+ n−1 )|


n n
+ (n − 2) +
Hence, |y(tj ) − y(tj−1 )| ≤ |w(sj ) − w(sj−1 )| +
j=1 j=1
2n

n
< |w(sj ) − w(sj−1 )| + .
j=1
230 A First Course in Functional Analysis

Since the above is true for every partition Pn of [a, b], Var(y) ≤
Var(w) + . As > 0 is arbitrary, Var(y) ≤ Var(w). In particular, y is
of bounded variation on [a, b]. Hence y ∈ N BV [a, b].
Next, let x ∈ C([a, b]). Apart from the subtraction of the constant w(a),
the function y agrees with the function w, except possibly at the points of
discontinuities of w. Since these points are countable, they can be avoided
while calculating the Riemann-Stieljes sum

n
x(tj )[w(tj ) − w(tj−1 )],
j=1
 b 
n
which approximates xdw, since each sum is equal to x(tj )[y(tj ) −
a j=1
 b  b  b
y(tj−1 )] and is approximately equal to xdy. Hence xdw = xdy.
a a a
To prove the uniqueness of y, let y0 ∈ N BV ([a, z]) be such that
 b  b
xdw = xdy for all x ∈ C([a, b]) and z = y − y0 . Thus z(a) =
a a
y(a) − y0 (a) = 0 − 0 = 0.
 b  b  b
Also, since z(b) = z(b) − z(a) = dz = dy − dy0 = 0.
a a a
Now, let ξ ∈]a, b[. For a sufficiently small positive h, let


⎨ 1 if a ≤ t ≤ ξ
t−ξ
x(t) = 1− if ξ < t ≤ ξ + h

⎩ h
0 if ξ + h < t ≤ b.
Then x ∈ C([a, b]) and (x(t)) ≤ 1 for all t ∈ [a, b].
 b  b  b
Since 0= xdy − xdy0 = xdz
a a a
   
ξ
t−ξ ξ+h
= dz + 1− dz
a ξ h
 ξ  ξ+h  
t−ξ
we have z(ξ) = dz = − 1− dz.
a ξ h
It follows that |z(ξ)| ≤ Varξ ξ + h,

where Varξ ξ + h denotes the total variation of z on [ξ, ξ + h]. As z is


right continuous at ξ, its total variation function v(t) = Vara t, t ∈ [a, b]
is also right continuous at ξ. Let > 0, there is some δ > 0 such that for
0 < h < δ,
|z(ξ)| ≤ Varξ ξ + h = v(ξ + h) − v(ξ) < .
Hence, z(ξ) = 0. Thus z = 0, that is, y0 = y.
Space of Bounded Linear Functionals 231

6.2.4 Theorem
Let E = C([a, b]). Then E is isometrically isomorphic to the subspace
of BV ([a, b]), consisting of all normalized functions of bounded variation.
If y is such a normalized function (y ∈ N BV ([a, b])), the corresponding f
is given by
 b
f (x) = x(t)dy(t). (6.3)
a

Proof: Formula (6.3) defines a linear mapping f = Ay, where y is


normalized and f ∈ C ∗ ([a, b]). We evidently have ||f || ≤ Var(y). For a
normalized y, Var(y) is the norm of y because y(a) = 0. Now consider any
g ∈ C ∗ ([a, b]). The theorem 5.3.3 then tells us that there is a w ∈ BV ([a, b])
such that  b
g(x) = x(t)dw(t) and Var(w) = ||g||.
a
The integral is not changed if we replace w by the corresponding normalized
function y of bounded variation. Then by lemma 6.2.3
 b  b
g(x) = x(t)dw(t) = x(t)dy(t)
a a
and g = Ty and ||g|| ≤ Var(y).
Also Var(y) ≤ Var(w) = ||g||.

Therefore, ||g|| = Var(y). Since by lemma 6.2.3 there is just one normalized
function, corresponding to the functional g a one-to-one correspondence
exists between the set of all linear functionals of C ∗ ([a, b]) and the set of
all elements of N BV ([a, b]).
It is evident that the sum of functions y1 , y2 ∈ N BV ([a, b]) corresponds
to the sum of functionals g1 , g2 ∈ C ∗ ([a, b]) and the function λy corresponds
to the functional λg, in case the functionals g1 , g2 correspond to the
functions y1 , y2 ∈ N BV ([a, b]). It therefore follows that the association
between C ∗ ([a, b]) and the space of normalized functions of bounded
variation (N BV ([a, b]) is an isomorphism. Furthermore, since

||g|| = Var(y) = ||y||

the correspondence is isometric too.


Thus, the dual of a space of continuous functions is a space of normalized
functions of bounded variation.
6.2.5 Moment problem of Hausdroff or the little moment
problem
Let us consider the discrete analogue of the Laplace transform
 ∞
μ(s) = e−su dα(u) s ∈ ( ) 4+ (6.4)
0
232 A First Course in Functional Analysis

4+
where α : [0, ∞] → ( ) is of bounded variation on every subinterval of
[0, ∞). If we put t = e−u and put s = n, a positive integer, the above
integral gives rise to the form
 1
μ(n) = tn dz(t), n = 0, 1, 2, (6.5)
0

where α(u) = −z(e−u ).


The integral (6.4), where z is a function of bounded variation of [0,1], is
called the nth moment of z. The moments of a distribution of a random
variable play an important role in statistics. For example, in the case of a
rectangular distribution


⎪ 0 x<a
⎨ x−a
F (x) = , a≤x≤b

⎪ b−a

L x>b

the frequency density function is given by the following step function [fig.
6.3]

1
—–—
b −a

a b
Fig. 6.3

Hence, the nth moment function for a rectangular distribution is


 ∞  n
μ(n) = dF (t) = tn f (t)dt.
−∞ a

A sequence of scalars μ(n), n = 0, 1, 2, . . . is called a moments sequence


if there is some z ∈ BV ([0, 1]) whose nth moments is μ(x), n = 0, 1, 2, . . .

For example, if α is a positive integer, then taking z(t) = , t ∈ [0, 1],
α
we see that
 1  1
n 1
μ(n) = t dz(t) = tn+α−1 dt = , n = 0, 1, 2 . . .
0 0 n + α

Similarly, if 0 < r ≤ 1, then (r n ), n = 0, 1, 2, . . . is a moment sequence since


if z is the characteristic functions of [r, 1] [see Chapter 10], then
 1
tn dz(t) = rn , n = 0, 1, 2, . . .
0
Space of Bounded Linear Functionals 233

If μ(x) is the nth moment of z ∈ BV ([0, 1]), then

|μ(n)| ≤ Var(z), n = 0, 1, 2, . . .

Hence, every moment sequence is bounded. To prove that μ(n) is


convergent [see Limaye [33]]. Thus every scalar sequence need not be a
moment sequence. The problem of determining the criteria that a sequence
must fulfil in order to become a moment sequence is known as the moment
problem of Hausdroff or the little moment problem. We next discuss
some mathematical preliminaries relevant to the discussion.
6.2.6 The shift operator, the forward difference operator
Let X denote the linear space of all scalar sequences μ(n)), n =
0, 1, 2, . . . and let E: X → X be defined by,
E(μ(n)) = μ(n + 1), μ ∈ X, n = 0, 1, 2, . . .
E is called the shift operator.
Let I denote the identity operator from X to X.
Define Δ = E − I. Δ is called then forward difference operator.

Thus, for all μ ∈ X and n = 0, 1, 2, . . .


Δ(μ(n)) = μ(n + 1) − μ(n) (6.6)
For r = 0, 1, 2, . . . we have
r  
r−j r
Δ = (E − I) =
r r
(−1) Ej,
j=0
j
r  
r
so that Δr (μ(n)) = (−1)r−j μ(n + j)
j=0
j
r  
r r−j r
In particular, Δ (μ(0)) = (−1) μ(j). (6.7)
j=0
j

6.2.7 Definition: P ([0, 1])


Let P([0, 1]) denote the linear space of all scalar-valued polynomials on
[0,1]. For m = 0, 1, 2, . . . let

pm (t) = tm , t ∈ [0, 1].

We next prove the Weierstrass approximations theorem.


6.2.8 The Weierstrass approximations theorem (RALSTON
[43])
The Weierstrass approximation theorem asserts that the set of
polynomials on [0,1] is dense in C([0, 1]). Or in other words, P([0, 1]) is
dense in C([0, 1]) under the sup norm.
234 A First Course in Functional Analysis

In order to prove the above we need to show that for every continuous
function f ∈ C([0, 1]) and > 0 there is a polynomial p ∈ P([0, 1]) such
that
max {|f (x) − p(x)| < }.
x∈[0,1]
 
n n!
In what follows, we denote by , where n is a positive integer
k k!(n − k)!
and k an integer such that 0 < k ≤ n. The polynomial Bn (f )(x) defined
by
n  
  
n k k
Bn (f )(x) = x (1 − x) n−k
f (6.8)
k n
k=0
is called the Bernstein polynomial associated with f . We prove our theorem
by finding a Bernstein polynomial with the required property. Before we
take up the proof, we mention some identities which will be used:
n  
n k
(i) x (1 − x)n−k = [x + (1 − x)]n = 1 (6.9)
k
k=0
n  
n k
(ii) x (1 − x)n−k (k − nx) = 0 (6.10)
k
k=0
(6.10) is obtained by differentiating both sides of (6.9) w.r.t. x and
multiplying both sides by x(1 − x).
On differentiating (6.10) w.r.t. x, we get
n  
n
[−nxk (1 − x)n−k + xk−1 (1 − x)n−k−1 (k − nx)2 ] = 0
k
k=0
Using (6.9), (6.10) reduces to
 n  
n k−1
x (1 − x)n−k−1 (k − nx)2 = n (6.11)
k
k=0
Multiplying both sides by x(1 − x) and dividing by n2 , we obtain,
n    2
n k k x(1 − x)
(iii) x (1 − x) n−k
−x = (6.12)
k n n
k=0
(6.12) is the third identity to be used in proving the theorem.
It then follows from (6.8) and (6.9) that
n     
n k k
f (x) − Bn (f )(x) = x (1 − x)n−k f (x) − f (6.13)
k n
k=0
 n
n     
k
or |f (x) − Bn (f )(x)| ≤ x (1 − x)
k n−k
f (x) − f (6.14)
k n
k=0
Since f is uniformly continuous on [0,1], we can find a
Space of Bounded Linear Functionals 235

     ⎫
 k   

δ > 0 and M s.t. x −  < δ ⇒ f (x) − f k  < ⎪

n  n  2
⎪ (6.15)

and |f (x) < M for x ∈ [0, 1].

Let us partition the sum on the RHS of (6.13) into two parts, denoted by Σ′ and Σ′′. Σ′ stands for the sum over those k for which |x − k/n| < δ (x is fixed but arbitrary), and Σ′′ is the sum of the remaining terms.
Thus,   Σ′ = \sum_{|x − k/n| < δ} \binom{n}{k} x^k (1 − x)^{n−k} \left| f(x) − f\!\left(\frac{k}{n}\right) \right|
< \frac{ε}{2} \sum_{k=0}^{n} \binom{n}{k} x^k (1 − x)^{n−k} = \frac{ε}{2}.   (6.16)

We next show that if n is sufficiently large then Σ′′ can be made less than ε/2 independently of x. Since f is bounded, using (6.15) we get
Σ′′ ≤ 2M Σ′′′,   where Σ′′′ = \sum_{|x − k/n| ≥ δ} \binom{n}{k} x^k (1 − x)^{n−k},
the sum being taken over all k s.t. |x − k/n| ≥ δ.
(6.12) yields   δ^2 Σ′′′ ≤ \frac{x(1 − x)}{n},
or   Σ′′′ ≤ \frac{1}{4δ^2 n},   since \max x(1 − x) = \frac{1}{4} for x ∈ [0, 1].
Taking n > \frac{M}{εδ^2},   Σ′′ ≤ \frac{2M}{4δ^2 n} < \frac{2Mεδ^2}{4Mδ^2} = \frac{ε}{2}.
Hence,   |f(x) − B_n(f)(x)| ≤ \sum_{k=0}^{n} \binom{n}{k} x^k (1 − x)^{n−k} \left| f(x) − f\!\left(\frac{k}{n}\right) \right| < \frac{ε}{2} + \frac{ε}{2} = ε.
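The construction used in the proof is easy to carry out numerically. The Python sketch below (an added illustration) forms the Bernstein polynomials (6.8) for the hypothetical test function f(t) = |t − 1/2| and prints the sup-norm error on a fine grid; it decreases as n grows.

```python
import numpy as np
from math import comb

def bernstein(f, n, x):
    """B_n(f)(x) = sum_{k=0}^{n} C(n,k) x^k (1-x)^(n-k) f(k/n), cf. (6.8)."""
    x = np.asarray(x, dtype=float)
    return sum(comb(n, k) * x**k * (1 - x)**(n - k) * f(k / n)
               for k in range(n + 1))

f = lambda t: abs(t - 0.5)            # a continuous (non-smooth) test function
xs = np.linspace(0.0, 1.0, 1001)
for n in (5, 20, 80, 320):
    err = np.max(np.abs(f(xs) - bernstein(f, n, xs)))
    print(n, err)                     # sup-norm error shrinks as n grows
```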
2 2

6.2.9 Definition: P(m)


Let us define, for a nonnegative integer m,

P(m) = {p ∈ P([0, 1]) : p is of degree ≤ m}.

6.2.10 Lemma: B_n(p) ∈ P(m) where p is a polynomial of degree ≤ m
The Bernstein polynomial Bn (f ) is given by


B_n(f)(x) = \sum_{k=0}^{n} f\!\left(\frac{k}{n}\right) \binom{n}{k} x^k (1 − x)^{n−k},   x ∈ [0, 1], n = 1, 2, . . .
We express Bn (f ) as a linear combination of p0 , p1 , p2 , . . .
    
Since   (1 − x)^{n−k} = \sum_{j=0}^{n−k} (−1)^j \binom{n−k}{j} x^j = \sum_{j=0}^{n−k} (−1)^j \binom{n−k}{j} p_j(x),
and p_k p_j = p_{j+k}, we have
B_n(f) = \sum_{k=0}^{n} f\!\left(\frac{k}{n}\right) \binom{n}{k} \sum_{j=0}^{n−k} (−1)^j \binom{n−k}{j} p_{j+k}.
     
As \binom{n}{k}\binom{n−k}{j} = \binom{j+k}{k}\binom{n}{j+k}, we put j + k = r and obtain
B_n(f) = \sum_{r=0}^{n} \left[ \sum_{k=0}^{r} (−1)^{r−k} \binom{r}{k} f\!\left(\frac{k}{n}\right) \right] \binom{n}{r} p_r .
In particular, Bn (f ) is a polynomial of degree at most n. Also,
B_n(p_0)(x) = \sum_{k=0}^{n} \binom{n}{k} x^k (1 − x)^{n−k} = [x + (1 − x)]^n = 1 = p_0(x),   x ∈ [0, 1].
If n ≤ m then clearly Bn (p) ∈ P(n) ⊂ P(m) .
Next, fix n ≥ m + 1. Consider the sequence (μ_p(k)) defined by
μ_p(k) = p\!\left(\frac{k}{n}\right),   k = 0, 1, 2, . . .
Noting the expressions for Δ^r(μ)(0) and B_n(f), we obtain
B_n(p) = \sum_{r=0}^{n} (Δ^r μ_p)(0) \binom{n}{r} p_r .
Since p is a polynomial of degree at most m, it follows that
(Δμ_p)(k) = p\!\left(\frac{k+1}{n}\right) − p\!\left(\frac{k}{n}\right) = λ_0 + λ_1 k + · · · + λ_{m−1} k^{m−1},   k = 0, 1, 2, . . .
for some scalars λ_0, . . . , λ_{m−1}. Proceeding similarly, we conclude that (Δ^m μ_p)(k) equals a constant for k = 0, 1, 2, . . ., and for each r ≥ m + 1 we have (Δ^r μ_p)(k) = 0, k = 0, 1, 2, . . .
In particular, (Δ^r μ_p)(0) = 0 for all r ≥ m + 1, so that
B_n(p) = \sum_{r=0}^{m} (Δ^r μ_p)(0) \binom{n}{r} p_r .
Hence, B_n(p) ∈ P(m).

6.2.11 Lemma
Let h be a linear functional on P([0, 1]) with μ(n) = h(p_n) for n = 0, 1, 2, . . . . If f_{kl}(x) = x^k (1 − x)^l for x ∈ [0, 1], then
h(f_{kl}) = (−1)^l Δ^l(μ)(k),   k, l = 0, 1, 2, . . .

Proof: We have f_{kl} = p_k (1 − x)^l, where p_k(x) = x^k. Thus
f_{kl} = p_k \sum_{i=0}^{l} (−1)^i \binom{l}{i} p_i = \sum_{i=0}^{l} (−1)^i \binom{l}{i} p_{i+k} .
Hence,   h(f_{kl}) = \sum_{i=0}^{l} (−1)^i \binom{l}{i} h(p_{i+k}) = \sum_{i=0}^{l} (−1)^i \binom{l}{i} μ(i + k),
which equals (−1)^l Δ^l(μ)(k).
We next frame the criterion for a sequence to be a moment sequence.

6.2.12 Theorem (Hausdorff, 1921) [Limaye [33]]
Let (μ(n)), n = 0, 1, 2, . . . be a sequence of scalars. Then the following conditions are equivalent:
(i) (μ(n)) is a moment sequence.
(ii) For n = 0, 1, 2, . . . and k = 0, 1, 2, . . . , n, let
d_{n,k} = (−1)^{n−k} \binom{n}{k} Δ^{n−k}(μ)(k).
Then \sum_{k=0}^{n} |d_{n,k}| ≤ d for all n and some d > 0.
(iii) The linear functional h : P([0, 1]) → ℝ (or ℂ) defined by h(λ_0 p_0 + λ_1 p_1 + · · · + λ_n p_n) = λ_0 μ(0) + · · · + λ_n μ(n) is continuous, where n = 0, 1, 2, . . . and λ_0, λ_1, . . . , λ_n ∈ ℝ (or ℂ).
Further, there is a non-decreasing function on [0, 1] whose nth moment is μ(n) if and only if d_{n,k} ≥ 0 for all n = 0, 1, 2, . . . and k = 0, 1, 2, . . . , n. This happens if and only if the linear functional h is positive.
Proof: (i) ⇒ (ii). Let z ∈ BV([0, 1]) be such that the nth moment of z is μ(n), n = 0, 1, 2, . . . . Then
h(p) = \int_0^1 p \, dz,   p ∈ P([0, 1]),
defines a linear functional h on P([0, 1]) such that h(p_n) = μ(n) for n = 0, 1, 2, . . . . By lemma 6.2.11,
d_{n,k} = (−1)^{n−k} \binom{n}{k} Δ^{n−k}(μ)(k) = \binom{n}{k} h(f_{k,n−k})   for n = 0, 1, 2, . . .

and k = 0, 1, 2, . . . , n. Since f_{k,n−k} ≥ 0 on [0, 1], it follows that
|d_{n,k}| ≤ \binom{n}{k} \int_0^1 f_{k,n−k} \, dv_z ,
where v_z(x) is the total variation of z on [0, x].
But, for n = 0, 1, 2, . . . ,
\sum_{k=0}^{n} \binom{n}{k} f_{k,n−k} = B_n(1) = 1.
Hence,   \sum_{k=0}^{n} |d_{n,k}| ≤ \int_0^1 B_n(1) \, dv_z = Var z,
where Var z is the total variation of z on [0, 1].
Note that if z is non-decreasing, then since f_{k,n−k} ≥ 0 we have
d_{n,k} = \binom{n}{k} \int_0^1 f_{k,n−k} \, dz ≥ 0   for all n = 0, 1, 2, . . . and k = 0, 1, 2, . . . , n.
(ii) ⇒ (iii) For a nonnegative integer m, let h_m denote the restriction of h to P(m). Since h is linear on P([0, 1]) and P(m) is a finite dimensional subspace of P([0, 1]), it follows that h_m is continuous, since every linear map on a finite dimensional normed linear space is continuous.
Let p ∈ P([0, 1]). Since
B_n(p) = \sum_{k=0}^{n} p\!\left(\frac{k}{n}\right) \binom{n}{k} x^k (1 − x)^{n−k}
and h(p_n) = μ(n) for n = 0, 1, 2, . . ., we have
h(B_n(p)) = \sum_{k=0}^{n} p\!\left(\frac{k}{n}\right) \binom{n}{k} h\big(x^k (1 − x)^{n−k}\big)
= \sum_{k=0}^{n} p\!\left(\frac{k}{n}\right) \binom{n}{k} (−1)^{n−k} Δ^{n−k}(μ)(k)
= \sum_{k=0}^{n} p\!\left(\frac{k}{n}\right) d_{n,k},
by lemma 6.2.11. Hence,
|h(B_n(p))| ≤ \sum_{k=0}^{n} \left| p\!\left(\frac{k}{n}\right) \right| |d_{n,k}| ≤ ||p||_∞ \sum_{k=0}^{n} |d_{n,k}| ≤ d ||p||_∞ .
Now, let the degree of p be m. Then Bn (p) ∈ P(m) , for all n = 0, 1, 2, . . .
as proved earlier.
Since ||Bn (p) − p||∞ → 0 as n → ∞ and hm is continuous,

|h(p)| = |h_m(p)| = \lim_{n→∞} |h_m(B_n(p))| ≤ d ||p||_∞ .
This shows that h is continuous on P([0, 1]).
If d_{n,k} ≥ 0 for all n and k, and if p ≥ 0 on [0, 1], then
h(p) = \lim_{n→∞} h(B_n(p)) = \lim_{n→∞} \sum_{k=0}^{n} p\!\left(\frac{k}{n}\right) d_{n,k} ≥ 0,
i.e., h is a positive functional. In this case,
\sum_{k=0}^{n} |d_{n,k}| = \sum_{k=0}^{n} d_{n,k} = \int_0^1 B_n(1) \, dz = \int_0^1 dz = μ(0).

(iii) ⇒ (i) Since P([0, 1]) is dense in C([0, 1]) with the sup norm || ||_∞, there is some F ∈ C∗([0, 1]) with F|_{P([0,1])} = h and ||F|| = ||h||.
By the Riesz representation theorem for C([0, 1]) there is some z ∈ NBV([0, 1]) such that
F(f) = \int_0^1 f \, dz,   f ∈ C([0, 1]).
In particular, for n = 0, 1, 2, . . . ,
μ(n) = h(p_n) = F(p_n) = \int_0^1 t^n \, dz(t),
that is, μ(n) is the nth moment of z.


If the functional h is positive and f ∈ C([0, 1]) with f ≥ 0 on [0,1],
then Bn (f ) ≥ 0 on [0,1] for all n and we have F (f ) = lim F (Bn (f )) =
n→∞
lim h(Bn (f )) ≥ 0, i.e., F is a positive functional on C([0, 1]).
n→∞
By Riesz representation theorem for C([0, 1]), we can say that there is a
non-decreasing function z such that F = Fz . In particular μ(n) is the nth
moment of a non-decreasing function z.
Thus, if (μ(n)) is a moment sequence, then there exists a unique
y ∈ N BV ([0, 1]) such that μ(n) is the nth moment of y. This follows
from lemma 6.2.3 by noting that P([0, 1]) is dense in C([0, 1]) with the sup
norm || ||∞ .
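Theorem 6.2.12 can be illustrated with the moment sequence μ(n) = 1/(n + 1) of the non-decreasing function z(t) = t. The following Python sketch (an added check, in exact rational arithmetic) computes the numbers d_{n,k} of condition (ii); for this hypothetical choice they are all nonnegative and each row sums to μ(0) = 1.

```python
from fractions import Fraction
from math import comb

def d(n, k, mu):
    """d_{n,k} = (-1)^(n-k) C(n,k) Δ^(n-k)(μ)(k), cf. theorem 6.2.12 (ii)."""
    r = n - k
    diff = sum((-1) ** (r - j) * comb(r, j) * mu(k + j) for j in range(r + 1))
    return (-1) ** r * comb(n, k) * diff

mu = lambda n: Fraction(1, n + 1)     # moments of z(t) = t on [0, 1]
for n in range(6):
    row = [d(n, k, mu) for k in range(n + 1)]
    assert all(v >= 0 for v in row) and sum(row) == 1   # = μ(0)
    print(n, row)
```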
Problems
1. For a fixed x ∈ [a, b], let Fx ∈ C ∗ ([a, b]) be defined by Fx (f ) =
f (x), f ∈ C([a, b]).
Let yx be the function in N BV ([a, b]) which represents Fx as in
theorem 5.3.3. If x = a, then yx is the characteristic function
of [a, b], and if a < x ≤ b, then yx is the characteristic function of
[x, b].

If a ≤ x1 < x2 < · · · xn−1 < xn ≤ b, and

F(f) = k_1 f(x_1) + · · · + k_n f(x_n),   f ∈ C([a, b]),

then show that the function in N BV ([a, b]) corresponding to F ∈


C ∗ ([a, b]) is a step function.
2. Prove the inequality
\left| \int_a^b f \, dg \right| ≤ \max\{|f(x)| : x ∈ [a, b]\} · Var g.

3. Show that for any g ∈ BV([a, b]) there is a unique g̃ ∈ BV([a, b]), continuous from the right, such that
\int_a^b f \, dg = \int_a^b f \, dg̃   for all f ∈ C([a, b]),   and Var(g̃) ≤ Var(g).

4. Show that a sequence of scalars μ(n), n = 0, 1, 2, . . ., is a moment sequence if and only if
μ(n) = μ_1(n) − μ_2(n) + iμ_3(n) − iμ_4(n),   where i = \sqrt{−1},
and (−1)^{n−k} Δ^{n−k} μ_j(k) ≥ 0
for all k = 0, 1, 2, . . . , n, j = 1, 2, 3, 4 and n = 0, 1, 2, . . . .

1
5. Let y ∈ NBV([0, 1]) and μ(n) = \int_0^1 t^n \, dy(t) for n = 0, 1, 2, . . . . Then show that
Var(y) = \sup\left\{ \sum_{k=0}^{n} \binom{n}{k} |Δ^{n−k}(μ)(k)| : n = 0, 1, 2, . . . \right\},
where Δ is the forward difference operator.

6.3 Weak∗ and Weak Convergence


6.3.1 Definition: weak∗ convergence of functionals
Let E be a normed linear space. A sequence {fn } of linear functionals
in E ∗ is said to be weak∗ convergent to a linear functional f0 ∈ E ∗ , if
fn (x) → f0 (x) for every x ∈ E.
Thus, for linear functionals the notion of weak∗ convergence is equivalent to pointwise convergence on E.

6.3.2 Theorem
If a sequence {fn} of functionals in E∗ is Cauchy in norm, then {fn} converges weak∗ to some linear functional f0 ∈ E∗.
For the notion of pointwise convergence see 4.5.2. Theorem 4.4.2 asserts that E∗ is complete, where E∗ is the space conjugate to the normed linear space E. Therefore if {fn} ⊂ E∗ is Cauchy, fn → f0 ∈ E∗. Therefore, for every x ∈ E, fn(x) → f0(x) as n → ∞.
6.3.3 Theorem
Let {fn } be a sequence of bounded linear functionals defined on the
Banach space Ex .
A necessary and sufficient condition for {fn } to converge weakly to f
as n → ∞ is
(i) {||fn ||} is bounded
(ii) fn (x) → f (x) ∀ x ∈ M where the subspace M is everywhere dense
in Ex .
Proof: Let fn → f weakly, i.e., fn (x) → f (x) ∀ x ∈ Ex .
It follows from theorem 4.5.6 that {||fn ||} is bounded. Since M is a
subspace of Ex , condition (ii) is valid.
We next show that the conditions (i) and (ii) are sufficient. Let {||fn||} be bounded and let L = sup_n ||fn||.
Let x ∈ Ex. Since M is everywhere dense in Ex, given an arbitrary ε > 0, ∃ x0 ∈ M s.t.
||x − x0|| < ε/(4L).
Condition (ii) yields, for the above ε > 0, an n0 depending on ε s.t. for n > n0,
|fn(x0) − f(x0)| < ε/2.
The linear functional f is defined on M. Hence, by the Hahn-Banach extension theorem (5.1.3), we can extend f from M to the whole of Ex. Moreover, ||f|| = ||f||_M ≤ sup_n ||fn|| = L.
Now,   |fn(x) − f(x)| ≤ |fn(x) − fn(x0)| + |fn(x0) − f(x0)| + |f(x0) − f(x)|
= |fn(x − x0)| + ε/2 + |f(x − x0)|
< L||x − x0|| + ε/2 + L||x − x0||
< L · ε/(4L) + ε/2 + L · ε/(4L) = ε   for n > n0(ε).
Hence, fn(x) → f(x) ∀ x ∈ Ex, i.e., fn converges weakly (pointwise) to f as n → ∞.
6.3.4 Application to the theory of quadrature formulae
Let x(t) ∈ C([a, b]). Then
f(x) = \int_a^b x(t) \, dt   (6.17)
is a bounded linear functional on C([a, b]).
Consider   f_n(x) = \sum_{k=1}^{k_n} C_k^{(n)} x(t_k^{(n)}),   n = 1, 2, 3, . . .   (6.18)
The C_k^{(n)} are called weights and the t_k^{(n)} are the nodes,
a ≤ t_1^{(n)} ≤ · · · ≤ t_{k_n−1}^{(n)} ≤ t_{k_n}^{(n)} ≤ b.
(6.18) is a linear functional on C([a, b]).


6.3.5 Definition: quadrature formula
Let the C_k^{(n)} be so chosen in (6.18) that f(x) and f_n(x) coincide for all polynomials of degree less than or equal to n, i.e.,
f(x) ≈ f_n(x)   if x(t) = \sum_{p=0}^{n} a_p t^p.   (6.19)
The relation f(x) ≈ f_n(x), which becomes an equality for all polynomials of degree less than or equal to n, is called a quadrature formula. For example, in the case of Gaussian quadrature on [−1, 1], the nodes t_k^{(n)} (k = 1, 2, . . . , n) are the n roots of the Legendre polynomial
P_n(t) = \frac{1}{2^n n!} \frac{d^n}{dt^n}(t^2 − 1)^n.
Consider the sequence of quadrature formulae
f(x) ≈ f_n(x),   n = 1, 2, 3, . . .
The problem that arises is whether the sequence {f_n(x)} converges to the value of f(x) as n → ∞ for every x(t) ∈ C([0, 1]). The theorem below answers this question.
6.3.6 Theorem
The necessary and sufficient condition for the convergence of a sequence of quadrature formulae, i.e., in order that
\lim_{n→∞} \sum_{k=1}^{k_n} C_k^{(n)} x(t_k^{(n)}) = \int_0^1 x(t) \, dσ(t)
holds for every continuous function x(t), is that
\sum_{k=1}^{k_n} |C_k^{(n)}| ≤ K = const
for every n.

Proof: By definition of the quadrature formula, the functional f_m satisfies
f_m(x) = f(x)   (6.20)
for every polynomial x(t) of degree n ≤ m, where
f_m(x) = \sum_{k=1}^{k_m} C_k^{(m)} x(t_k^{(m)}).   (6.21)
Each f_m is bounded, since |x(t_k^{(m)})| ≤ ||x|| by the definition of the norm. Consequently,
|f_m(x)| ≤ \sum_{k=1}^{k_m} |C_k^{(m)} x(t_k^{(m)})| ≤ \left( \sum_{k=1}^{k_m} |C_k^{(m)}| \right) ||x||.
For later use we show that f_m has the norm
||f_m|| = \sum_{k=1}^{k_m} |C_k^{(m)}|,   (6.22)
i.e., ||f_m|| cannot exceed the right-hand side of (6.22), and equality holds if we take an x_0 ∈ C([0, 1]) with |x_0(t)| ≤ 1 on [0, 1] and
x_0(t_k^{(m)}) = sgn C_k^{(m)} = 1 if C_k^{(m)} ≥ 0,   −1 if C_k^{(m)} < 0
(such a continuous x_0 exists, e.g. by joining the prescribed node values piecewise linearly), since then ||x_0|| = 1 and
f_m(x_0) = \sum_{k=1}^{k_m} C_k^{(m)} sgn C_k^{(m)} = \sum_{k=1}^{k_m} |C_k^{(m)}|.

For a given x ∈ Ex, (6.21) yields an approximate value f_m(x) for f(x) in (6.20).
We know that the set P of all polynomials with real coefficients is dense in the real space Ex = C([0, 1]) by the Weierstrass approximation theorem (th. 1.4.32). Thus, the sequence of functionals {f_n} converges to the functional f on the set of all polynomials, which is everywhere dense in C([0, 1]).
Since \sum_{k=1}^{k_m} |C_k^{(m)}| ≤ K = const., it follows from (6.22) that {||f_m||} is bounded. Hence, by theorem 6.3.3, f_n(x) → f(x) for every continuous function x(t).
6.3.7 Theorem
(n)
If all the coefficients Ck of quadrature formulae are positive, then the
sequence of quadrature formulae f (x) fn (x), n = 1, 2, . . . is convergent
for every continuous function x(t).
In fact, fn (x0 ) = f (x0 ) for any n and x0 (t) ≡ 1.


kn 
kn  1
(n) (n)
Hence, ||fn (x0 )|| = |Ck | = Ck = dσ = σ(1) − σ(0).
k=1 k=1 0

Therefore, the hypothesis of theorem 6.3.6 is satisfied.
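As an added numerical illustration of theorem 6.3.7 (under the hypothetical choice dσ(t) = dt), the Gauss–Legendre rules, whose weights are all positive, are mapped from [−1, 1] to [0, 1] and applied to the continuous integrand x(t) = e^t; the computed values converge to ∫_0^1 e^t dt.

```python
import numpy as np

def gauss_on_01(x, n):
    """n-point Gauss-Legendre rule mapped from [-1, 1] to [0, 1]; all weights positive."""
    t, w = np.polynomial.legendre.leggauss(n)   # nodes/weights on [-1, 1]
    t01 = 0.5 * (t + 1.0)                       # affine map to [0, 1]
    w01 = 0.5 * w
    return np.sum(w01 * x(t01))

x = np.exp                                      # a continuous test integrand
exact = np.e - 1.0                              # ∫_0^1 e^t dt
for n in (2, 4, 8, 16):
    approx = gauss_on_01(x, n)
    print(n, approx, abs(approx - exact))       # errors shrink as n grows
```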


6.3.8 Weak convergence of sequence of elements of a space
6.3.9 Definition: Weak convergence of a sequence of elements
Let E be a normed linear space, {xn} a sequence of elements in E and x ∈ E. {xn} is said to converge weakly to the element x if for every bounded linear functional f ∈ E∗, f(xn) → f(x) as n → ∞; in symbols we write xn →w x. We say that x is the weak limit of the sequence of elements {xn}.
6.3.10 Lemma: A sequence cannot converge weakly to two limits
Let {xn} ⊂ E, a normed linear space, converge weakly to both x0 and y0 in E. Then, for any linear functional f ∈ E∗,
f(x0) = f(y0) = \lim_{n→∞} f(xn).
f being linear, f(x0 − y0) = 0. The above is true for every functional belonging to E∗. Hence, x0 − y0 = θ, i.e., x0 = y0, and the weak limit is unique.
It is easy to see that any subsequence {xnk } also converges weakly to x
w
if xn −→ x.
6.3.11 Definition: strong convergence of sequence of elements
The convergence of a sequence of elements (functions) with respect to
the norm of the given space is called strong convergence.
6.3.12 Lemma
The strong convergence of a sequence {xn } in a normed linear space E
to an element x ∈ E implies weak convergence.
For any functional f ∈ E∗,
|f(xn) − f(x)| = |f(xn − x)| ≤ ||f|| ||xn − x|| → 0
as n → ∞, since ||f|| is finite. Thus xn → x strongly ⇒ xn →w x.


Note 6.3.1. The converse is not always true. Let us consider the sequence of elements {sin nπt} in L2([0, 1]).
Put xn(t) = sin nπt, so that f(xn) = \int_0^1 α(t) \sin nπt \, dt, where α(t) is a square integrable function uniquely determined by the functional f. Obviously, f(xn) is (apart from a constant factor) the nth Fourier coefficient of α(t) relative to {sin nπt}. Consequently, f(xn) → 0 as n → ∞, so that xn →w θ as n → ∞.
On the other hand, for n ≠ m,
||xn − xm||^2 = \int_0^1 (\sin nπt − \sin mπt)^2 dt = \int_0^1 \sin^2 nπt \, dt − 2\int_0^1 \sin nπt \sin mπt \, dt + \int_0^1 \sin^2 mπt \, dt = 1.
Thus, {xn} does not converge strongly.
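A crude numerical illustration of this note is sketched below (added here; it uses a grid approximation of the L2([0, 1]) inner product and the hypothetical choice α(t) = t): the coefficients f(xn) tend to 0 while the mutual distances ||xn − xm|| stay near 1.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 20001)
alpha = t                                   # a hypothetical α ∈ L2([0,1])

def inner(u, v):
    """Approximate L2([0,1]) inner product on the grid (Riemann sum)."""
    return np.sum(u * v) * (t[1] - t[0])

xs = [np.sin(n * np.pi * t) for n in range(1, 41)]
print([round(inner(x, alpha), 4) for x in xs[:8]])           # f(x_n) -> 0
print([round(np.sqrt(inner(xs[n] - xs[m], xs[n] - xs[m])), 4)
       for n, m in [(0, 1), (5, 17), (20, 39)]])              # stays ≈ 1
```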

6.3.13 Theorem
In a finite dimensional space, notions of weak and strong convergence
are equivalent.
Proof: Let E be a finite dimensional space and {xn} a given sequence such that xn →w x0. Since E is finite dimensional, there is a finite system of linearly independent elements e1, e2, . . . , em s.t. every x ∈ E can be represented as
x = ξ1 e1 + ξ2 e2 + · · · + ξm em.
Let xn = ξ1^{(n)} e1 + ξ2^{(n)} e2 + · · · + ξm^{(n)} em and x0 = ξ1^{(0)} e1 + ξ2^{(0)} e2 + · · · + ξm^{(0)} em.
Now, consider the functionals fi such that fi(ei) = 1, fi(ej) = 0 for i ≠ j.
Then fi(xn) = ξi^{(n)}, fi(x0) = ξi^{(0)}. But since f(xn) → f(x0) for every functional f, we also have fi(xn) → fi(x0), i.e., ξi^{(n)} → ξi^{(0)}.
Hence,
||xn − x0|| = \left\| \sum_{i=1}^{m} (ξi^{(n)} − ξi^{(0)}) ei \right\| ≤ \sum_{i=1}^{m} |ξi^{(n)} − ξi^{(0)}| \, ||ei|| → 0 as n → ∞,
showing that in a finite dimensional normed linear space, weak convergence of {xn} ⟺ strong convergence of {xn}.

6.3.14 Remark
There also exist infinite dimensional spaces in which strong and weak
convergence of elements are equivalent.
Let E = l1, the space of sequences {ξ1, ξ2, . . . , ξn, . . .} s.t. the series \sum_{i=1}^{∞} |ξi| converges.
We note that in l1 , strong convergence of elements implies co-
ordinatewise convergence.

6.3.15 Theorem
If the normed linear space E is separable, then we can find an equivalent norm such that the weak convergence xn →w x0 and ||xn|| → ||x0|| in the new norm imply the strong convergence of the sequence {xn} to x0.
Proof: Let E be a normed linear space. Since the space is separable, it has a countable everywhere dense set {ei} with ||ei|| = 1.
Let xn = ξ1^{(n)} e1 + · · · + ξi^{(n)} ei + · · · and x0 = ξ1^{(0)} e1 + · · · + ξi^{(0)} ei + · · ·, where xn →w x0 as n → ∞.
Let us consider the functionals fi ∈ E∗ such that fi(ei) = 1, fi(ej) = 0 for i ≠ j.
Now fi(xn) = ξi^{(n)} and fi(x0) = ξi^{(0)}. Since fi(xn) → fi(x0) as n → ∞,
ξi^{(n)} → ξi^{(0)},   i = 1, 2, 3, . . .
If ||xn|| → ||x0|| as n → ∞, then
\left\| \sum_{i=1}^{∞} ξi^{(n)} ei \right\| → \left\| \sum_{i=1}^{∞} ξi^{(0)} ei \right\|   as n → ∞.
Let us introduce in E a new norm || · ||_1 as follows:
||xn − x0||_1 = \left\| \sum_{i=1}^{∞} (ξi^{(n)} − ξi^{(0)}) ei \right\|_1 = \frac{||xn − x0||}{1 + ||xn − x0||}.   (6.23)
Since ||xn − x0|| ≥ 0,
||xn − x0||_1 ≤ ||xn − x0||.
Again, since {||xn||_1} is convergent and hence bounded, ||xn||_1 ≤ M (say), and
||xn − x0|| ≤ (1 − M)^{−1} ||xn − x0||_1.
Thus ||xn − x0||_1 ≤ ||xn − x0|| ≤ (1 − M)^{−1} ||xn − x0||_1, so that || · ||_1 and || · || are equivalent norms.
(6.23) yields that ||xn|| ≤ M/(1 − M) = L (say). Hence
\left\| \sum_{i=1}^{∞} ξi^{(n)} ei \right\| ≤ L.
Let S_m = \sum_{i=1}^{m} ξi^{(n)} ei and S = \sum_{i=1}^{∞} ξi^{(n)} ei, and let ε > 0 with ||S_m − S|| < ε for m ≥ m_0(ε).
Now, xn − x0 = \sum_{i=1}^{∞} (ξi^{(n)} − ξi^{(0)}) ei.
Let ε′ = ε/(2L); then for n ≥ n_0(ε′) and m ≥ m_0(ε),
\left\| \sum_{i ≤ m} (ξi^{(n)} − ξi^{(0)}) ei \right\| ≤ \sum_{i ≤ m} |ξi^{(n)} − ξi^{(0)}| \, ||ei|| < ε′ · 2L = ε.
Thus, ||xn − x0|| → 0 as n → ∞, proving strong convergence of {xn}.

6.3.16 Theorem
If the sequence {xn} of a normed linear space E converges weakly to x0, then there is a sequence of linear combinations \left\{ \sum_{k=1}^{k_n} C_k^{(n)} x_k \right\} which converges strongly to x0.
In other words, x0 belongs to the closed linear subspace L spanned by the elements x1, x2, . . . , xn, . . . .
Proof: Let us assume that the theorem is not true, i.e., x0 does not
belong to the closed subspace L. Then, by theorem 5.1.5, there is a linear
functional f ∈ E ∗ , such that f (x0 ) = 1 and f (xn ) = 0, n = 1, 2, . . ..
But this means that f (xn ) does not converge to f (x0 ), contradicting the
w
hypothesis that xn −→ x0 .
6.3.17 Theorem
Let A be a bounded linear operator with domain Ex and range in Ey ,
both normed linear spaces. If the sequence {xn } ⊂ Ex converges weakly to
x0 ∈ Ex, then the sequence {Axn} ⊂ Ey converges weakly to Ax0 ∈ Ey.
Proof: Let φ ∈ Ey∗ be any functional. Then φ(Axn) = f(xn), where f = φ ∘ A ∈ Ex∗.
Analogously φ(Ax0 ) = f (x0 ).
w
Since xn −→ x0 , f (xn ) → f (x0 ) i.e., φ(Axn ) → φ(Ax0 ). Since φ is
w
an arbitrary functional in Ey∗ , it follows that Axn −→ Ax0 . Thus, every
bounded linear operator is not only strongly, but also weakly continuous.
6.3.18 Theorem
If a sequence {xn} in a normed linear space converges weakly to x0, then the norms of the elements of this sequence are bounded.
We regard the xn (n = 1, 2, . . .) as elements of E∗∗, the space conjugate to E∗; then the weak convergence of {xn} to x0 means that the sequence of functionals xn(f) converges to x0(f) for all f ∈ E∗. But by theorem 4.5.7 (the Banach-Steinhaus theorem) the norms {||xn||} are bounded; this completes the proof.
6.3.19 Remark
If x0 is the weak limit of the sequence {xn }, then
||x0 || ≤ lim inf ||xn ||;
Moreover, the existence of this finite inferior limit follows from the preceding
theorem.
Proof: Let us assume that ||x0|| > lim inf ||xn||. Then there is a number α such that ||x0|| > α > lim inf ||xn||. Hence, there is a subsequence {xni} such that ||x0|| > α > ||xni||. Let us construct a linear functional f0 such that
||f0|| = 1 and f0(x0) = ||x0|| > α.
Then |f0(xni)| ≤ ||f0|| ||xni|| = ||xni|| < α for all i.
Consequently, f0 (xn ) does not converge to f0 (x0 ), contradicting the
w
hypothesis that xn −→ x0 .
Note 6.3.2. The following example shows that the inequality ||x0|| < lim inf ||xn|| can actually hold.
In the space L2([0, 1]) we consider the functions
xn(t) = \sqrt{2} \sin nπt.   Now, ||xn(t)||^2 = ⟨xn(t), xn(t)⟩ = \int_0^1 xn^2(t) dt = 2\int_0^1 \sin^2 nπt \, dt = 1.
Thus, \lim_n ||xn|| = 1. On the other hand, for every linear functional f,
f(xn) = \sqrt{2} \int_0^1 g(t) \sin nπt \, dt = \sqrt{2} c_n,
where the c_n are the Fourier coefficients of g(t) ∈ L2([0, 1]).
Thus, f(xn) → 0 as n → ∞ for every linear functional f, i.e., xn →w θ; consequently, x0 = θ
and ||x0|| = 0 < 1 = \lim_n ||xn||.

6.3.20 Theorem
In order that a sequence {xn } of a normed linear space E converges
weakly to x0 , it is necessary and sufficient that
(i) the sequence {||xn ||} is bounded and
(ii) f(xn) → f(x0) for every f of a certain set Ω of linear functionals, linear combinations of whose elements are everywhere dense in E∗.
Proof: This theorem is a particular case of theorem 6.3.3. This is because weak convergence of {xn} ⊂ E to x0 ∈ E is equivalent to the pointwise convergence on E∗ of the linear functionals {xn} ⊂ E∗∗ to x0 ∈ E∗∗.

6.3.21 Weak convergence in certain spaces


(a) Weak convergence in lp .
6.3.22 Theorem
In order that a sequence {xn}, xn = {ξi^{(n)}} ∈ lp, converges weakly to x0 = {ξi^{(0)}} ∈ lp, it is necessary and sufficient that
(i) the sequence {||xn||} be bounded and
(ii) ξi^{(n)} → ξi^{(0)} as n → ∞, for each i (in general, however, non-uniformly in i).
Proof: We note that the linear combinations of the functionals fi = (0, 0, . . . , 1, . . . , 0), i = 1, 2, . . . (with 1 in the ith place), are everywhere dense in lq = lp∗. Hence, by theorem 6.3.20, in order that xn →w x0 it is necessary and sufficient that (i) {||xn||} is bounded and (ii) fi(xn) = ξi^{(n)} → fi(x0) = ξi^{(0)} for every i.
Thus, weak convergence in lp is equivalent to coordinate-wise convergence together with the boundedness of norms.
(b) Weak convergence in Hilbert spaces
Let H be a Hilbert space. Any bounded linear functional f defined on H can be expressed in the form f(x) = ⟨x, y⟩, x ∈ H, where y ∈ H is the element corresponding to f.
Now, xn →w x0 ⇒ f(xn) → f(x0) ⇒ ⟨xn, y⟩ → ⟨x0, y⟩ for every y ∈ H.
6.3.23 Lemma
In a Hilbert space H, if xn → x and yn → y strongly as n → ∞ then
xn , yn  → x, y as n → ∞, where  ,  denotes a scalar product in H.

Proof: |⟨xn, yn⟩ − ⟨x, y⟩| = |⟨xn, yn⟩ − ⟨xn, y⟩ + ⟨xn, y⟩ − ⟨x, y⟩|
= |⟨xn, yn − y⟩ + ⟨xn − x, y⟩| ≤ ||xn|| ||yn − y|| + ||xn − x|| ||y|| → 0 as n → ∞,
because ||xn|| and ||y|| are bounded.
Note 6.3.3. If, however, xn →w x and yn →w y, then in general ⟨xn, yn⟩ does not converge to ⟨x, y⟩.
For example, if xn = yn = en, {en} an arbitrary orthonormal sequence, then en →w θ but
⟨en, en⟩ = ||en||^2 = 1 does not converge to ⟨θ, θ⟩ = 0.
However, if xn → x strongly and yn →w y, then ⟨xn, yn⟩ → ⟨x, y⟩, provided ||yn|| is bounded.
Let P = sup_n ||yn||. Then
|⟨xn, yn⟩ − ⟨x, y⟩| ≤ |⟨xn − x, yn⟩| + |⟨x, yn − y⟩| ≤ P ||xn − x|| + |⟨x, yn − y⟩| → 0 as n → ∞.


Finally, we note that if xn →w x and ||xn|| → ||x||, then xn → x. This is because
||xn − x||^2 = ⟨xn − x, xn − x⟩ = [⟨xn, xn⟩ − ⟨x, x⟩] + [⟨x, x⟩ − ⟨x, xn⟩] + [⟨x, x⟩ − ⟨xn, x⟩]
= [||xn||^2 − ||x||^2] + [⟨x, x⟩ − ⟨x, xn⟩] + [⟨x, x⟩ − ⟨xn, x⟩] → 0 as n → ∞.

Problems
1. Let E be a normed linear space.
(a) If X is a closed convex subset of E, {xn } is a sequence in X and
w
xn −→ x in E, then prove that x ∈ X (6.3.20).
w
(b) Let Y be a closed subspace of E. If xn −→ x in E, then show
w
that xn + Y −→ x + Y in E/Y .
w
2. In a Hilbert space H, if {xn } −→ x and ||xn || → ||x|| as n → ∞,
show that {xn } converges to x strongly.

3. Let f_n(x) = \sum_{m=1}^{p_n} C_{n,m} x(t_{n,m}) be a sequence of quadrature formulae for f(x) = \int_a^b x(t) \, dt on the Banach space Ex = C([a, b]) (the C_{n,m} are the weights and the t_{n,m} are the nodes).
Show that ||f_n|| = \sum_{m=1}^{p_n} |C_{n,m}|.
Further, if (i) f_n(x_k) → f(x_k), k = 0, 1, 2, . . .
and (ii) C_{n,m} ≥ 0 ∀ n, m,
show that f_n(x) → f(x) ∀ x ∈ C([a, b]).
Hence, show that the sequence of Gaussian quadrature formulae satisfies G_n(x) → f(x) as n → ∞.
4. Show that a sequence {xn } in a normed linear space is norm bounded
wherever it is weakly convergent.
5. Given {fn } ⊂ E ∗ where E is a normed linear space, show that {fn }
is weakly convergent to f ∈ E ∗ implies that ||f || ≤ lim inf ||fn ||.
n→∞
w
6. In l1 , show that xn −→ x iff ||xn − x|| → 0.
7. A space E is called weakly sequentially complete if the existence of
lim f (xn ) for each f ∈ E ∗ implies the existence of x ∈ E such that
n→∞

{xn } converges weakly to x. Show that the space C([a, b]) is not
weakly sequentially complete.
8. If xn →w x0 in a normed linear space E, show that x0 ∈ \overline{Y}, where Y = span{xn}. (Use theorem 5.1.5.)
9. Let {xn} be a sequence in a normed linear space E such that xn →w x in E. Prove that there is a sequence {yn} of linear combinations of elements of {xn} which converges strongly to x. (Use the Hahn-Banach theorem.)
10. In the space l2, we consider a sequence {Tn}, where Tn : l2 → l2 is defined by
Tn x = (0, 0, . . . , 0, ξ1, ξ2, . . .)   (n zeros),   x = {ξn} ∈ l2.

Show that
(i) Tn is linear and bounded
(ii) {Tn } is weakly operator convergent to 0, but not strongly.
(Note that l2 is a Hilbert space).
11. Let E be a separable Banach space and M ⊂ E ∗ a bounded set. Show
that every sequence of elements of M contains a subsequence which
is weak∗ convergent to an element of E ∗ .
12. Let E = C([a, b]) with the sup norm. Fix t0 ∈ (a, b). For each positive integer n with t0 + 4/n < b, let
xn(t) = 0 if a ≤ t ≤ t0 or t0 + 4/n ≤ t ≤ b,
xn(t) = n(t − t0) if t0 ≤ t ≤ t0 + 2/n,
xn(t) = n(t0 + 4/n − t) if t0 + 2/n ≤ t ≤ t0 + 4/n.
Then show that xn →w θ in E but xn does not converge to θ in the norm of E.
13. Let Ex be a Banach space and Ey be a normed space. Let {Fn } be
a sequence in (Ex → Ey ) such that for each fixed x ∈ Ex , {Fn (x)} is
w
weakly convergent in Ey . If Fn (x) −→ y in Ey , let F (x) = y. Then
show that F ∈ (Ex → Ey ) and
||F || ≤ lim inf ||Fn || ≤ sup ||Fn || < ∞, n = 1, 2, . . .
n→∞

(Use theorem 4.5.7 and example 14).


14. Let E be a normed linear space and {xn} be a sequence in E. Then show that {xn} is weakly convergent in E if and only if (i) {xn} is a bounded sequence in E and (ii) there is some x ∈ E such that f(xn) → f(x) for every f in some subset of E∗ whose span is dense in E∗.

6.4 Reflexivity
In 5.6.6 the notion of canonical or natural embedding of a normed linear space E into its second conjugate E∗∗ was introduced. We have discussed when a normed linear space is said to be reflexive, together with some relevant theorems. Since the conjugate spaces E, E∗, E∗∗ often appear in discussions on reflexivity of spaces, there may be some relationship between weak convergence and reflexivity. In what follows, some results which were not discussed in 5.6 are presented.

6.4.1 Theorem
Let E be a normed linear space and {f1 , . . . fn } be a linearly independent
subset of E ∗ . Then there are e1 , e2 , . . . , en in E such that fj (ei ) = δij for
i, j = 1, 2, . . . n.
Proof: We prove the result by induction. If n = 1, then since {f1} is linearly independent, there is a0 ∈ E with f1(a0) ≠ 0; let e1 = a0 / f1(a0), so that f1(e1) = 1. Next let us assume that the result is true for n = k. Let {f1, f2, . . . , fk+1} be a linearly independent subset of E∗. Since {f1, f2, . . . , fk} is linearly independent, there are a1, a2, . . . , ak in E such that fj(ai) = δij for 1 ≤ i, j ≤ k. We claim that there is some a0 ∈ E such that fj(a0) = 0 for 1 ≤ j ≤ k but fk+1(a0) ≠ 0. Suppose not; then \bigcap_{j=1}^{k} N(fj) ⊂ N(fk+1), where N(fj) stands for the nullspace of fj. For x ∈ E, let
a = f1(x)a1 + f2(x)a2 + · · · + fk(x)ak.
Then (x − a) ∈ \bigcap_{j=1}^{k} N(fj), and so
fk+1(x) = fk+1(x − a) + fk+1(a) = f1(x)fk+1(a1) + · · · + fk(x)fk+1(ak),
so that fk+1 = fk+1(a1)f1 + · · · + fk+1(ak)fk ∈ span{f1, f2, . . . , fk},
violating the linear independence of {f1, . . . , fk+1}. Hence our claim is justified. Now let
ek+1 = a0 / fk+1(a0)   and, for i = 1, 2, . . . , k,   ei = ai − fk+1(ai) ek+1.
Then fj(ek+1) = fj(a0)/fk+1(a0) = 0 for j = 1, 2, . . . , k, since fj(a0) = 0, while fk+1(ek+1) = 1. Also, for i = 1, 2, . . . , k,
fj(ei) = fj(ai) − fk+1(ai) fj(ek+1) = fj(ai) = δij   for j = 1, 2, . . . , k,
and fk+1(ei) = fk+1(ai) − fk+1(ai) fk+1(ek+1) = 0.
Hence fj(ei) = δij, i, j = 1, 2, . . . , k + 1.

6.4.2 Theorem (Helly, 1912) [33]
Let E be a normed linear space.
(a) Consider f1, f2, . . . , fm in E∗, k1, k2, . . . , km in ℝ (or ℂ) and α ≥ 0. Then, for every ε > 0, there is some x_ε ∈ E such that fj(x_ε) = kj for each j = 1, 2, . . . , m and ||x_ε|| < α + ε if and only if
\left| \sum_{j=1}^{m} h_j k_j \right| ≤ α \left\| \sum_{j=1}^{m} h_j f_j \right\|
for all h1, h2, . . . , hm in ℝ (or ℂ).
(b) Let S be a finite dimensional subspace of E∗ and Fx ∈ E∗∗. If ε > 0, then there is some x_ε ∈ E such that
Fx|_S = φ(x_ε)|_S   and   ||x_ε|| < ||Fx|| + ε.

Proof: (a) Suppose that for every ε > 0, there is some x_ε ∈ E such that fj(x_ε) = kj for each j = 1, . . . , m and ||x_ε|| < α + ε. Let us fix h1, h2, . . . , hm in ℝ (or ℂ). Then
\left| \sum_{j=1}^{m} h_j k_j \right| = \left| \sum_{j=1}^{m} h_j f_j(x_ε) \right| = \left| \left( \sum_{j=1}^{m} h_j f_j \right)(x_ε) \right| ≤ \left\| \sum_{j=1}^{m} h_j f_j \right\| ||x_ε|| < (α + ε) \left\| \sum_{j=1}^{m} h_j f_j \right\|.
As this is true for every ε > 0, we conclude that
\left| \sum_{j=1}^{m} h_j k_j \right| ≤ α \left\| \sum_{j=1}^{m} h_j f_j \right\|.
Conversely, suppose that for all h1, h2, . . . , hm in ℝ (or ℂ),
\left| \sum_{j=1}^{m} h_j k_j \right| ≤ α \left\| \sum_{j=1}^{m} h_j f_j \right\|.
It may be noted that {f1, f2, . . . , fm} can be assumed to be a linearly independent set. If that is not so, let f1, f2, . . . , fn with n ≤ m be a maximal linearly independent subset of {f1, f2, . . . , fm}. Given ε > 0, let x_ε ∈ E be such that ||x_ε|| ≤ α + ε and fj(x_ε) = kj for j = 1, 2, . . . , n. If n < l ≤ m, then fl = h1 f1 + · · · + hn fn for some h1, h2, . . . , hn in ℝ (or ℂ). Hence, fl(x_ε) = h1 f1(x_ε) + · · · + hn fn(x_ε) = h1 k1 + · · · + hn kn.
But
\left| k_l − \sum_{j=1}^{n} h_j k_j \right| ≤ α \left\| f_l − \sum_{j=1}^{n} h_j f_j \right\| = 0,
so that fl(x_ε) = kl as well.
Consider the map F : E → ℝ^m (or ℂ^m) given by F(x) = (f1(x), . . . , fm(x)). Clearly, F is a linear map. Next, we show that it is a surjective (onto) mapping. To this end consider (h1, h2, . . . , hm) ∈ ℝ^m (or ℂ^m). Since f1, f2, . . . , fm are linearly independent, it follows from theorem 6.4.1 that there exist e1, e2, . . . , em in E such that fj(ei) = δij, 1 ≤ i, j ≤ m. If we take x = h1 e1 + · · · + hm em, then it follows that F(x) = (h1, h2, . . . , hm).
We next want to show that F maps each open subset of E onto an open subset of ℝ^m (or ℂ^m). Since F is onto, we can find a non-zero vector a in E s.t. F(a) = (1, 1, . . . , 1) ∈ ℝ^m (or ℂ^m). Let P be an open set in E and x ∈ P. Then there exists an open ball U(x, r) ⊂ P with r > 0. For every scalar k with 0 < |k| < r/||a|| we have x − ka ∈ U(x, r), hence x − ka ∈ P. Therefore, F(x − ka) = F(x) − kF(a) = F(x) − k(1, . . . , 1) ∈ F(P). Thus
\{ F(x) − k′(1, . . . , 1) : |k′| < r/||a|| \} ⊂ F(P),
showing that F(P) is open in ℝ^m (or ℂ^m).


Thus F maps each open subset of E onto an open subset of ℝ^m (or ℂ^m).
Let ε > 0 and let us consider U = {x ∈ E : ||x|| < α + ε}. We want to show that there is some x_ε ∈ U with F(x_ε) = (k1, k2, . . . , km). If that be not the case, then (k1, . . . , km) does not belong to the open convex set F(U). By the Hahn-Banach separation theorem (5.2.10) for ℝ^m (or ℂ^m) there is a continuous linear functional g on ℝ^m (or ℂ^m) such that
Re g((f1(x), . . . , fm(x))) ≤ Re g((k1, . . . , km))
for all x ∈ U. By 5.4.1, there is some (h1, h2, . . . , hm) ∈ ℝ^m (or ℂ^m) such that
g(c1, c2, . . . , cm) = c1 h1 + · · · + cm hm
for all (c1, c2, . . . , cm) ∈ ℝ^m (or ℂ^m). Hence,
Re[h1 f1(x) + · · · + hm fm(x)] ≤ Re(h1 k1 + · · · + hm km)
for all x ∈ U. If h1 f1(x) + · · · + hm fm(x) = re^{iθ} with r ≥ 0 and −π < θ ≤ π, then by considering xe^{−iθ} in place of x, it follows that
|h1 f1(x) + · · · + hm fm(x)| ≤ Re(h1 k1 + · · · + hm km)
for all x ∈ U. But
\sup\left\{ \left| \sum_{j=1}^{m} h_j f_j(x) \right| : x ∈ U \right\} = (α + ε) \left\| \sum_{j=1}^{m} h_j f_j \right\|.
Hence,   (α + ε) \left\| \sum_{j=1}^{m} h_j f_j \right\| ≤ Re \sum_{j=1}^{m} h_j k_j ≤ \left| \sum_{j=1}^{m} h_j k_j \right| ≤ α \left\| \sum_{j=1}^{m} h_j f_j \right\|.
This contradiction shows that there must be some x_ε ∈ U with F(x_ε) = (k1, k2, . . . , km), as wanted.

(b) Let {f1, f2, . . . , fm} be a basis for the finite dimensional subspace S of E∗ and let kj = Fx(fj), j = 1, 2, . . . , m. Then for all h1, h2, . . . , hm in ℝ (or ℂ), we have
\left| \sum_{j=1}^{m} h_j k_j \right| = \left| \sum_{j=1}^{m} h_j Fx(f_j) \right| = \left| Fx\left( \sum_{j=1}^{m} h_j f_j \right) \right| ≤ ||Fx|| \left\| \sum_{j=1}^{m} h_j f_j \right\|.
Taking α = ||Fx|| in (a) above, we see that for every ε > 0, there is some x_ε ∈ E such that ||x_ε|| ≤ ||Fx|| + ε and, for j = 1, 2, . . . , m,
φ(x_ε)(fj) = fj(x_ε) = kj = Fx(fj),
i.e., φ(x_ε)|_S = Fx|_S, as desired.

6.4.3 Remark
(i) It may be noted that if we restrict ourselves to a finite dimensional subspace of E∗, then we are close to reflexivity.
The relationship between reflexivity and weak convergence is
demonstrated in the following theorem.
6.4.4 Theorem (Eberlein, 1947)
Let E be a normed linear space. Then E is reflexive if and only if every
bounded sequence has a weakly convergent subsequence.
Proof: For proof see Limaye [33].
6.4.5 Uniform convexity
We next explore some geometric condition which implies reflexivity. In
2.1.12 we have seen that a closed unit ball of a normed linear space E is a
convex set of E. In the case of the strict convexity of E, the mid-point of
the segment joining two points on the unit sphere of E does not lie on the

unit sphere of E. Next comes a concept in E which implies reflexivity of


E.
A normed space E is said to be uniformly convex if, for every ε > 0, there exists some δ > 0 such that for all x and y in E with ||x|| ≤ 1, ||y|| ≤ 1 and ||x − y|| ≥ ε, we have ||x + y|| ≤ 2(1 − δ).
This idea admits of a geometrical interpretation as follows: given ε > 0, there is some δ > 0 such that if x and y are in the closed unit ball of E, and if they are at least ε apart, then their mid-point lies at a distance of at least δ from the unit sphere. Here δ may depend on ε. In what follows the relationship between a strictly convex and a uniformly convex space is discussed.
6.4.6 Definition
A normed space E is said to be strictly convex if, for x ≠ y in E with ||x|| = 1 = ||y||, we have
||x + y|| < 2.

6.4.7 Lemma
A uniformly convex space is strictly convex. This is evident from
the definition itself.
6.4.8 Lemma
If E is finite dimensional and strictly convex, then E is uniformly
convex.
Proof: For ε > 0, let
Λ = {(x, y) ∈ E × E : ||x|| ≤ 1, ||y|| ≤ 1, ||x − y|| ≥ ε}.
Then Λ is a closed and bounded subset of E × E. We next show that Λ is compact, i.e., that every sequence in Λ has a subsequence converging in Λ. Let u_n = (x^n, y^n) be a sequence in Λ, where (·, ·) denotes an ordered pair. Let {e1, . . . , em} be a basis in E. Then we can write x^n = \sum_{j=1}^{m} p_j^n e_j and y^n = \sum_{j=1}^{m} q_j^n e_j, where the p_j^n and q_j^n are scalars for j = 1, 2, . . . , m. Since {(x^n, y^n)} is a bounded sequence in E × E, {p_j^n} and {q_j^n} are bounded for j = 1, 2, . . . , m.
By the Bolzano–Weierstrass theorem (Cor. 1.6.19), each of the scalar sequences {p_j^n} and {q_j^n} has a convergent subsequence; passing to a common subsequence, {u_{n_k}} = {(x^{n_k}, y^{n_k})} converges to some element u ∈ E × E as k → ∞. Since u_{n_k} ∈ Λ and Λ is closed, u ∈ Λ ⊆ E × E. Therefore, if x^{n_k} → x and y^{n_k} → y as k → ∞, we have u = (x, y) ∈ Λ. Thus Λ is compact.

For, (x, y) ∈ Λ, let


f (x, y) = 2 − ||x + y||.

Now, f is a continuous and strictly positive function on Λ. Hence there


is some δ > 0 such that f (x, y) ≥ 2δ for all (x, y) ∈ Λ. This implies the
uniform convexity of E.
6.4.9 Remark
A strictly convex normed linear space need not in general be uniformly
convex. Let E = c00 , a space of numerical sequences with finite number of
non-zero terms. Let
Λn = {x ∈ Λ : xj = 0 for all j > n}.
For x ∈ Λ1 , let ||x|| = |x1 |.
Let us assume that ||x|| is defined for all x ∈ Λn−1 .
If x ∈ Λn , then x = zn−1 + xn en , for some zn−1 ∈ Λn−1 . Define
||x|| = (||zn−1 ||n + |xn |n )1/n .
By making an appeal to induction we can verify that || · || is a strictly
convex norm on E.
For n = 1, 2, . . . let
e1 + en −e1 + en
xn = 1/n
and zn = ,
2 21/n
Then ||xn || = 1 = |zn ||
||xn + zn || = 2(n−1)/n = ||xn − zn ||.
Thus, ||xn − zn || ≥ 1 for all n. But ||xn + zn || → 2 as n → ∞. Hence, E
is not uniformly convex.
6.4.10 Remark
It is noted that the normed spaces l1 , l∞ , C([a, b]) are not strictly convex.
It was proved by Clarkson [12] that the normed spaces lp and Lp ([a, b]) with
1 < p < ∞ are uniformly convex.
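For a Hilbert space (such as l2) the modulus δ can in fact be computed from the parallelogram law: ||x + y||² = 2||x||² + 2||y||² − ||x − y||² ≤ 4 − ε², so one may take δ(ε) = 1 − √(1 − ε²/4). The Python sketch below (an added illustration in ℝ^5, a hypothetical finite-dimensional stand-in for l2) checks this bound on random points of the closed unit ball.

```python
import numpy as np

def delta(eps):
    """Modulus of convexity of a Hilbert space: ||x+y|| <= 2(1 - delta(eps))
    whenever ||x||, ||y|| <= 1 and ||x - y|| >= eps (parallelogram law)."""
    return 1.0 - np.sqrt(1.0 - eps**2 / 4.0)

rng = np.random.default_rng(0)
eps, worst = 0.5, 0.0
for _ in range(20000):
    x, y = rng.normal(size=(2, 5))
    x /= max(1.0, np.linalg.norm(x))      # force x into the closed unit ball
    y /= max(1.0, np.linalg.norm(y))
    if np.linalg.norm(x - y) >= eps:
        worst = max(worst, np.linalg.norm(x + y))
print(worst, 2 * (1 - delta(eps)))        # observed worst <= 2(1 - delta(eps))
```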
6.4.11 Lemma
Let E be a uniformly convex normed linear space and {xn } be a sequence
in E such that ||xn || → 1 and ||xn + xm || → 2 as n, m → ∞. Then
lim ||xn − xm || = 0. That is, {xn } is a Cauchy sequence.
n,m→∞
Proof: If {xn} is not Cauchy in E, there is an ε > 0 such that for every positive integer n0 there are n, m ≥ n0 with
||xn − xm|| ≥ ε.
This implies that, for a given x ∈ E and a positive integer m0, there are n, m > m0 with
||xn − x|| + ||xm − x|| ≥ ||xn − xm|| ≥ ε.

Hence, for at least one of these two indices, call it m, we have
||x_m − x|| ≥ ε/2.
Since ||x_n|| → 1 as n → ∞, we see that for each k = 1, 2, . . . there is a positive integer n_k such that
||x_n|| ≤ 1 + \frac{1}{k}   for all n ≥ n_k.
Choose m_1 = n_1; then ||x_{m_1}|| ≤ 1 + 1 = 2. Let m_0 = max{m_1, n_2} and x = x_{m_1}. We see that there is some m_2 > m_0 with
||x_{m_2} − x_{m_1}|| ≥ \frac{ε}{2}.
We note that ||x_{m_2}|| ≤ 1 + \frac{1}{2}, since m_2 > m_0 ≥ n_2. Proceeding in this way we can find a subsequence {x_{m_k}} of {x_m} such that for k = 1, 2, . . .,
||x_{m_{k+1}} − x_{m_k}|| ≥ \frac{ε}{2}   and   ||x_{m_k}|| ≤ 1 + \frac{1}{k}.
Let us put y_k = x_{m_k} for k = 1, 2, . . .; then
\left\| \frac{y_k}{1 + 1/k} \right\| ≤ 1,   ||y_{k+1}|| ≤ 1 + \frac{1}{k+1} ≤ 1 + \frac{1}{k},   i.e.,   \left\| \frac{y_{k+1}}{1 + 1/k} \right\| ≤ 1,
and
\left\| \frac{y_{k+1} − y_k}{1 + 1/k} \right\| ≥ \frac{ε/2}{1 + 1/k} ≥ \frac{ε}{4} = ε′ (say).
By the uniform convexity of E, there is a δ > 0 such that ||x + y|| ≤ 2(1 − δ) whenever x and y are in E, ||x|| ≤ 1, ||y|| ≤ 1 and ||x − y|| ≥ ε′. Hence,
\left\| \frac{y_{k+1} + y_k}{1 + 1/k} \right\| ≤ 2(1 − δ).
Thus, \limsup_{k→∞} ||y_{k+1} + y_k|| ≤ 2(1 − δ) < 2, i.e., \limsup_{k→∞} ||x_{m_{k+1}} + x_{m_k}|| < 2.
The above contradicts the fact that
||x_m + x_n|| → 2 as m, n → ∞.
Hence, {xn} is a Cauchy sequence in E.

6.4.12 Theorem (Milman, 1938) [33]


Let E be a Banach space which is uniformly convex in some equivalent
norm. Then E is reflexive.
Proof: We first show that a reflexive normed space remains reflexive in an
equivalent norm.

From theorem 4.4.2, we can conclude that the space of bounded linear
functionals defined on a normed linear space E is complete and hence a
Banach space. Thus the dual E ∗ and in turn the second dual E ∗∗ of
the normed linear space E are Banach spaces. Since E is reflexive, E
is isometrically isomorphic to E ∗∗ and hence E is a Banach space. Also, in
any equivalent norm on E, the dual E ∗ and the second dual E ∗∗ remains
unchanged, so that E remains reflexive.
Hence, we can assume without loss of generality that E is a uniformly
convex Banach space in the given norm || || on E.
Let F ∈ E∗∗. Without loss of generality we assume that ||F|| = 1. We show that there is some x ∈ E with φ(x) = F, φ : E → E∗∗ being the canonical embedding. First, we find a sequence {fn} in E∗ such that ||fn|| = 1 and |F(fn)| > 1 − 1/n for n = 1, 2, . . . .
For a fixed n, let Sn = span{f1, f2, . . . , fn}.
We put ε = 1/n in Helly's theorem (6.4.2) and find xn ∈ E such that
F|_{Sn} = φ(xn)|_{Sn}   and   ||xn|| < 1 + \frac{1}{n}.
Then for n = 1, 2, . . . and m = 1, 2, . . . , n,
F(fm) = φ(xn)(fm) = fm(xn),
so that   1 − \frac{1}{n} < |F(fn)| = |fn(xn)| ≤ ||xn|| < 1 + \frac{1}{n}
and, for m ≥ n,
2 − \frac{2}{n} < |2F(fn)| = |fn(xn) + fn(xm)| ≤ ||xm + xn|| ≤ 2 + \frac{1}{n} + \frac{1}{m}.
Then we have
\lim_{n→∞} ||xn|| = 1   and   \lim_{n,m→∞} ||xn + xm|| = 2.
By lemma 6.4.11, {xn} is a Cauchy sequence in E. Since E is a Banach space, let xn → x in E. Then ||x|| = 1. Also, since F(fm) = fm(xn) for all n ≥ m, the continuity of fm shows that
F(fm) = fm(x),   m = 1, 2, . . . .
Let us next consider f ∈ E∗. Replacing Sn by the span of {f, f1, f2, . . . , fn}, we find in the same way some z ∈ E such that ||z|| = 1 and
F(f) = f(z)   and   F(fm) = fm(z),   m = 1, 2, . . . .
We want to show that x = z. Now,
||x + z|| ≥ |fm(x + z)| = 2|F(fm)| > 2 − \frac{2}{m}
for all m = 1, 2, . . ., so that ||x + z|| ≥ 2. Since ||x|| = 1, ||z|| = 1, the strict convexity of E implies that x = z and F(f) = f(z) = f(x) for all f ∈ E∗, i.e., F = φ(x). Hence, E is reflexive.

6.4.13 Remark
The converse of Milman’s theorem is false [see Limaye[33]].
Problems
1. Let E be a reflexive normed linear space. Then show that E is strictly
convex (resp. smooth) if and only if E ∗ is smooth (resp. strictly
convex).
(Hint: A normed linear space E is said to be smooth if, for every
x0 ∈ E with ||x0 || = 1, there is a unique supporting hyperplane [see
4.3.7] for B(θ, 1) at x0 .)
2. [Weak Schauder bases. Let E be a normed linear space. A countable subset {a1, a2, . . .} of E is called a weak Schauder basis for E if ||ai|| = 1 for each i and for every x ∈ E there are unique αi ∈ ℝ (or ℂ), i = 1, 2, . . ., such that
\sum_{i=1}^{n} α_i a_i →w x   as n → ∞.
Weak∗ Schauder bases. A countable subset {f1, f2, . . .} of E∗ is called a weak∗ Schauder basis if ||fi|| = 1 for all i and for every g ∈ E∗ there are unique βi ∈ ℝ (or ℂ), i = 1, 2, . . ., such that
\sum_{i=1}^{n} β_i f_i →w∗ g   as n → ∞.]
Let E be a reflexive normed linear space and {a1, a2, . . .} be a Schauder basis for E with coefficient functionals {g1, g2, . . .}. If fn = gn/||gn||, n = 1, 2, . . ., then show that {f1, f2, . . .} is a Schauder basis for E∗ with coefficient functionals (||g1||F_{a1}, ||g2||F_{a2}, . . .).
3. Let E be a separable normed linear space. Let {xn } be a dense subset
of {x ∈ E : ||x|| = 1}.

(a) Then show that there is a sequence {fn} in E∗ such that ||fn|| = 1 for all n and, for every x ≠ θ in E, fn(x) ≠ 0 for some n; and that if, for x ∈ E,
||x||_0 = \left( \sum_{n=1}^{∞} \frac{|f_n(x)|^2}{2^n} \right)^{1/2},
then || ||_0 is a norm on E in which E is strictly convex, and ||x||_0 ≤ ||x|| for all x ∈ E.
(b) There is an equivalent norm on E in which E is strictly convex
(Hint: consider ||x||1 = ||x|| + ||x||0 ).

(c) Show that l1 is strictly convex but not reflexive in some norm
which is equivalent to the norm || ||1 .
4. Let E be a uniformly convex normed linear space, x ∈ E, and {xn }
be a sequence in E.
(a) If ||x|| = 1, ||xn || → 1 and ||xn + x|| → 2 then show that
||xn − x|| → 0.
w
(b) Show that xn → x in E if and only if xn −→ x in E and
lim sup ||xn || ≤ ||x||.
n→∞

6.5 Best Approximation in Reflexive Spaces


The problem of best approximation of functions concerns finding a proper combination of known functions such that the said combination is closest to a given function. P.L. Chebyshev [43] was the first to address this problem. Let E be a normed linear space and x ∈ E an arbitrary element. We want to approximate x by a finite linear combination of linearly independent elements x1, x2, . . . , xn ∈ E.
6.5.1 Lemma

n
If \sum_{i=1}^{n} α_i^2 increases indefinitely, then φ(α_1, α_2, . . . , α_n) = ||x − α_1 x_1 − α_2 x_2 − · · · − α_n x_n|| → ∞.
Proof: We have

φ(α1 , α2 , . . . , αn ) ≥ ||α1 x1 + · · · + αn xn || − ||x||.

The continuous function
ψ(α_1, α_2, . . . , α_n) = ||α_1 x_1 + α_2 x_2 + · · · + α_n x_n||
of the parameters α_1, α_2, . . . , α_n assumes its minimum m on the unit sphere
S = \left\{ (α_1, α_2, . . . , α_n) ∈ E_n : \sum_{i=1}^{n} α_i^2 = 1 \right\}
in E_n, where E_n denotes the n-dimensional Euclidean space.
Since the unit sphere in E_n is compact, the continuous function ψ(α_1, α_2, . . . , α_n) assumes its minimum on S. Since x_1, x_2, . . . , x_n are linearly independent, the value of ψ on S is always positive.
Therefore m > 0.
Given an arbitrary K > 0,
φ(α_1, α_2, . . . , α_n) ≥ ||α_1 x_1 + α_2 x_2 + · · · + α_n x_n|| − ||x||
= \sqrt{\sum_{i=1}^{n} α_i^2} \cdot \left\| \frac{α_1 x_1 + · · · + α_n x_n}{\sqrt{\sum_{i=1}^{n} α_i^2}} \right\| − ||x||
≥ \sqrt{\sum_{i=1}^{n} α_i^2} \cdot m − ||x||,
since \left( \frac{α_1}{\sqrt{\sum_i α_i^2}}, \frac{α_2}{\sqrt{\sum_i α_i^2}}, · · · , \frac{α_n}{\sqrt{\sum_i α_i^2}} \right) lies on the unit sphere.
Thus if \sqrt{\sum_{i=1}^{n} α_i^2} > \frac{1}{m}(K + ||x||), then φ(α_1, α_2, . . . , α_n) > K,
which proves the lemma.

6.5.2 Theorem
(0) (0) (0)
There exist real numbers α1 , α2 , . . ., αn , such that φ(α1 , α2 , . . .,
(0)
αn ) = ||x−α1 x1 −α2 x2 . . . αn xn || assumes its minimum for α1 = α1 , α2 =
(0) (0)
α2 . . . αn = αn .
Proof: If x depends linearly on x1 , x2 , . . . , xn , then the theorem is true
immediately. Let us assume that x does not lie in the subspace spanned by
x 1 , x2 , . . . , xn .
We first show that φ(α1 , α2 , . . . , αn ) is a continuous function of its
arguments.

Now |φ(α1 , α2 , . . . , αn ) − φ(β1 , β2 , . . . , βn )|


0 0 0 0
0 n 0 0 n 0
0 0 0 0
= 0x − α1 xi 0 − 0x − β i xi 0
0 0 0 0
i=1 i=1
0 n 0
0 0  n
0 0
≤ 0 (αi − βi )xi 0 ≤ max |αi − βi | ||xi ||.
0 0 |≤i≤n
i=1 i=1
3 
 n
 2
If, S = (λ1 , λ2 , . . . λn ) ∈ En : λi ≤ r ,
i=1

then outside S , the previous lemma yields, φ(α1 , α2 , . . . , αn ) ≥ ||x||.

Now, the ball S  ⊂ En being compact, φ(α1 , α2 , . . . αn ) being


a continuous function assumes its minimum r at some point
(0) (0) (0)
(α1 , α2 , . . . , αn ). But r ≤ φ(0, 0, . . . , 0) = ||x||. Hence, r is the least
value of the function φ(α1 , α2 , . . . αn ) on the entire space of the points
α1 , α2 , · · · αn , which proves the theorem.
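When the norm comes from an inner product (so the space is strictly normed, cf. 6.5.6(i)), the best approximation of theorem 6.5.2 can be computed by least squares. The following Python sketch (an added illustration with hypothetical vectors in ℝ^10) finds the coefficients α_i^{(0)} and verifies that the residual is orthogonal to span{x1, x2, x3}.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(10, 3))          # columns x1, x2, x3: linearly independent elements
x = rng.normal(size=10)               # the element to be approximated

alpha, *_ = np.linalg.lstsq(X, x, rcond=None)   # minimizes ||x - X @ alpha||_2
best = X @ alpha
print(alpha)                          # the coefficients alpha_i^(0) of theorem 6.5.2
print(np.linalg.norm(x - best))       # the minimum value of phi
# the residual is orthogonal to span{x1, x2, x3}:
print(np.allclose(X.T @ (x - best), 0.0))
```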

6.5.3 Remark
(i) The linear combination \sum_{i=1}^{n} α_i^{(0)} x_i giving the best approximation of the element x is, in general, not unique.
(ii) Let Y be a finite dimensional subspace of C([0, 1]). Then the best
approximation out of Y is unique for every x ∈ C([a, b]) if and only if Y
satisfies the Haar condition [see Kreyszig [30]].
(iii) However, there exist certain spaces in which the best approximation
is everywhere uniquely defined.
6.5.4 Definition: strictly normed
A space E is said to be strictly normed if the equality ||x + y|| = ||x|| + ||y|| for x ≠ θ, y ≠ θ is possible only when y = ax with a > 0.
6.5.5 Theorem
In a strictly normed linear space the best approximation of an arbitrary
x in terms of a linear combination of a given finite system of linearly
independent elements is unique.
n
Proof: Let us suppose that there exist two linear combinations α i xi
i=1

n
and βi xi such that
i=1
0 0 0 0
0 n 0 0 n 0
0 0 0 0
0x − α i xi 0 = 0x − βi xi 0 = d,
0 0 0 0
i=1 i=1
0 0
0  n 0
0 0
where d = min 0x − ri xi 0 > 0,
ri 0 0
i=1
0 0 0 0 0 0
0  α i + βi 0 10  0 10  0
n n n
0 0 0 0 0 0
then 0x − x i 0 ≤ 0x − α i x i 0 + 0x − β i xi 0
0 2 0 2 0 0 20 0
i=1 i=1 i=1
1 1
= d + d = d.
0 0 2 2
0  n 0
0 αi + βi 0
and since 0x − xi 0 ≥ d,
0 2 0
i=1
0 0
0  n
α i + βi 0
0 0
we have 0x − xi 0 = d,
0 2 0
i=1
0 0 0
0
0  n
αi + βi 0 0 n 0
0 0 01 0
Consequently, 0x − xi 0 = 0 x− α1 xi 0
0 2 0 02 0
i=1 i=1 0
0
01 n 0
0 0
+0 x− βi xi 0.
02 0
i=0

The space being strictly normed,
x − \sum_{i=1}^{n} β_i x_i = a\left( x − \sum_{i=1}^{n} α_i x_i \right)   for some a > 0.
If a ≠ 1, then x would be a linear combination of the elements x_1, x_2, . . . , x_n, which is a contradiction. Thus a = 1, but then \sum_{i=1}^{n} (α_i − β_i) x_i = 0 and x_1, x_2, . . . , x_n are linearly independent. Hence we get α_i = β_i, i = 1, 2, . . . , n.
6.5.6 Remark
(i) Lp ([0, 1]) and lp for p > 1 are strictly normed.
(ii) C([0, 1]) is not strictly normed.
Let us take x(t) and y(t) as two non-negative linearly independent functions taking their maximum values at one and the same point t̂ of the interval.
Then,   ||x + y|| = \max_t |x(t) + y(t)| = |x(t̂) + y(t̂)| = x(t̂) + y(t̂) = ||x|| + ||y||.
But y ≠ ax for any a > 0. Hence, C([0, 1]) is not strictly normed.

6.5.7 Lemma
Let E be a reflexive normed linear space and M be a non-empty closed
convex subset of E. Then, for every x ∈ E, there is some y ∈ M such that
||x − y|| = dist (x, M ), that is there is a best approximation to x from M .
Proof: Let x ∈ E and d = dist(x, M). If x ∈ M then the result is trivially satisfied. Let x ∉ M. Then there is a sequence {yn} in M such that ||x − yn|| → d as n → ∞. Since x is fixed and {x − yn} is bounded, {yn} is bounded, and since E is reflexive, {yn} contains a weakly convergent subsequence {y_{n_p}} (6.4.4). Now, {y_{n_p}} ⊂ M and, M being closed and convex, its weak limit y^{(1)} belongs to M; thus
\lim_{p→∞} φ_1(y_{n_p}) = φ_1(y^{(1)}),   y^{(1)} ∈ M,
for every linear functional φ_1 ∈ E∗. Therefore,
\lim_{p→∞} φ_1(x − y_{n_p}) = φ_1(x − y^{(1)}).
Since {y_{n_p}} is a subsequence of {yn},
\lim_{p→∞} ||x − y_{n_p}|| = \lim_{n→∞} ||x − yn|| = d.
Thus, by theorem 6.3.15, {x − y_{n_p}} is strongly convergent to x − y^{(1)}, so that ||x − y^{(1)}|| = d; hence {y_{n_p}} is strongly convergent to y^{(1)}, and y^{(1)} is a best approximation to x from M. Similarly, if {yn} contains another weakly convergent subsequence {y_{n_q}}, then we can find a y^{(2)} ∈ M s.t. {y_{n_q}} is strongly convergent to y^{(2)} with ||x − y^{(2)}|| = d. Since M is convex, y^{(1)}, y^{(2)} ∈ M ⇒ [λ y^{(1)} + (1 − λ) y^{(2)}] ∈ M, 0 ≤ λ ≤ 1, and thus ||x − (λ y^{(1)} + (1 − λ) y^{(2)})|| = d.

6.5.8 Minimization of functionals


The problem of best approximation is just a particular case of a wider
and more comprehensive problem, namely the problem of minimization of
a functional.
The classical Weierstrass existence theorem tells the following:
(W ) The minimum problem

F (u) = min!, u ∈ M (6.25)

has a solution provided the functional F : M → R is continuous on the


nonempty compact subset M of the Banach space E.
Unfortunately, this result is not useful for many variational problems
because of the following crucial drawback:
In infinite-dimensional Banach spaces, closed balls are not compact.
This is the main difficulty in calculus of variations. To overcome this
difficulty the notion of weak convergence is introduced.
The basic result runs as follows:
(C) In a reflexive Banach space, each bounded sequence has a weakly
convergent subsequence. (6.4.4).
If H is a Hilbert space, it is reflexive and the convergence condition (C)
is a consequence of the Riesz theorem.
In a reflexive Banach space, the convergence principle (C) implies the
following fundamental generalization of the classical Weierstrass theorem (W):
(W ∗ ) The minimum problem (6.25) has a solution provided the
functional F : M → R is weakly sequentially lower semicontinuous on
the closed ball M of the reflexive Banach space E.
More generally, this is also true if M is a nonempty bounded closed
convex set in the reflexive Banach space E. These things will be discussed
in Chapter 13.
Problems
1. Prove that a one-to-one continuous linear mapping of one Banach
space onto another is a homeomorphism. In particular, if a one-to-
one linear mapping A of a Banach space onto itself is continuous,
prove that its inverse A−1 is automatically continuous.
2. Let A : Ex → Ey be a linear continuous operator, where Ex and Ey
4 +
are Banach spaces over ( ). If the inverse operator A−1 : Ey → Ex
exist, then show that it is continuous.
3. Let A : Ex → Ey be a linear continuous operator, where Ex and
4 +
Ey are Banach spaces over ( ). Then show that the following two
conditions are equivalent:

(i) Equation Au = v, u ∈ Ex , is well-posed that is, by definition,


for each given v ∈ Ey , Au = v has a unique solution u, which
depends continuously on v.
(ii) For each v ∈ Ey, Au = v has a solution u, and Aw = θ implies w = θ.
CHAPTER 7

CLOSED GRAPH
THEOREM AND ITS
CONSEQUENCES

7.1 Closed Graph Theorem


Bounded linear operators are discussed in chapter 4. But in 4.2.11 we have seen that differential operators defined on a normed linear space are not bounded. They belong, however, to a class of operators known as closed operators. In what follows, the relationship between closed and bounded linear operators and related concepts is discussed.
Let Ex and Ey be normed linear spaces, let T : D(T) ⊂ Ex → Ey be a linear operator and let D(T) stand for the domain of T.
7.1.1 Definition: graph
The graph of an operator T is denoted by G(T ) and defined by

G(T ) = {(x, y) : x ∈ D(T ), y = T x}. (7.1)

7.1.2 Definition: closed linear operator


A linear operator T : D(T ) ⊆ Ex → Ey is said to be a closed operator
if its graph G(T ) is closed in the normed space Ex × Ey .
The two algebraic operations of the vector space Ex × Ey are defined as usual, that is: for (x_1, y_1), (x_2, y_2) ∈ Ex × Ey and a scalar α,
(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2),
α(x, y) = (αx, αy),   x ∈ Ex, y ∈ Ey,   (7.2)
and the norm on Ex × Ey is defined by
||(x, y)|| = ||x|| + ||y||.   (7.3)

Under what conditions will a closed linear operator be bounded? An


answer is given by the following theorem which is known as the closed
graph theorem.
7.1.3 Theorem: closed graph theorem
Let Ex and Ey be Banach spaces and let T : D(T ) → Ey be a closed
linear operator where D(T ) ⊂ Ex . Then, if D(T ) is closed in Ex , the
operator T is bounded.
Proof: We first show that Ex × Ey with the norm defined by (7.3) is complete. Let {zn} be Cauchy in Ex × Ey, where zn = (xn, yn). Then, for every ε > 0, there is an N = N(ε) such that
||zn − zm||_{Ex×Ey} = ||(xn, yn) − (xm, ym)|| = ||xn − xm|| + ||yn − ym|| < ε   (7.4)
∀ n, m ≥ N(ε).
Hence, {xn } and {yn } are Cauchy sequences in Ex and Ey respectively.
Ex being complete,
xn → x (say) ∈ Ex as n → ∞.
Similarly, Ey being complete, yn → y (say) ∈ Ey .
Hence, {zn } → z = (x, y) ∈ Ex × Ey , as n → ∞.
Hence, Ex × Ey is complete.
By assumption, G(T ) is closed in Ex × Ey and D(T ) is closed in Ex .
Hence, D(T ) and G(T ) are complete.
We now consider the mapping P : G(T ) → D(T ).
(x, T x) → x
We see that P is linear, because
P [(x1 , T x1 ) + (x2 , T x2 )] = P [(x1 + x2 , T (x1 + x2 ))]
= x1 + x2 = P (x1 , T x1 ) + P (x2 , T x2 ),
where x1 , x2 ∈ Ex .
P is bounded, because
||P [(x, T x)]|| = ||x|| ≤ ||x|| + ||T x|| = ||(x, T x)||.
P is bijective, the inverse mapping
P −1 : D(T ) → G(T ) i.e. x → (x, T x).

Since G(T ) and D(T ) are complete, we can apply the bounded inverse

theorem (theorem 7.3) and see that P −1 is bounded, say ||(x, T x)|| ≤ b||x||,
for some b and all x ∈ D(T ).

Therefore, ||x|| + ||T x|| ≤ b||x||


Hence, ||T x|| ≤ ||x|| + ||T x|| ≤ b||x||
for all x ∈ D(T ). Thus, T is bounded.

7.1.4 Remark
G(T) is closed if and only if z ∈ \overline{G(T)} implies z ∈ G(T). Now, z = (x, y) ∈ \overline{G(T)} if and only if there are zn = (xn, Txn) ∈ G(T) such that zn → z, hence xn → x and Txn → y.
This leads to the following theorem, where an important criterion for
an operator T to be closed is discovered.
7.1.5 Theorem (closed linear operator)
Let T : D(T) ⊂ Ex → Ey be a linear operator, where Ex and Ey are normed linear spaces. Then T is closed if and only if it fulfils the following condition: xn → x, where xn ∈ D(T), and Txn → y together imply that x ∈ D(T) and Tx = y.
7.1.6 Remark
(i) If T is a continuous linear operator, then T is closed.
Since T is continuous, xn → x in Ex implies that T xn → T x in Ey .
(ii) A closed linear operator need not be continuous. For example, let Ex = Ey = ℝ and Tx = 1/x for x ≠ θ, Tθ = θ. If xn → x and Txn → y, then either x ≠ θ, in which case y = 1/x = Tx, or x = θ, in which case xn = θ eventually and y = θ = Tθ; hence T is closed. But T is not continuous at θ.
(iii) Suppose T is closed, and two sequences {xn} and {x̃n} in the domain converge to the same limit x; if the corresponding sequences {Txn} and {Tx̃n} both converge, then the latter have the same limit.
T being closed, xn → x and Txn → y1 imply x ∈ D(T) and Tx = y1. Since x̃n → x, T being closed, x̃n → x and Tx̃n → y2 imply that x ∈ D(T) and y2 = Tx.
Thus, {Txn} and {Tx̃n} have the same limit.
7.1.7 Example (differential operator)
We refer to example 4.2.11.
We have seen that the operator A given by Ax(t) = x′(t), where Ex = C([0, 1]) and D(A) ⊂ Ex is the subspace of functions having continuous derivatives, is not bounded. We show now that A is a closed operator. Let xn ∈ D(A) be such that
xn → x and Axn = x′_n → y.
Since convergence in the norm of C([0, 1]) is uniform convergence on [0, 1], from x′_n → y we have
\int_0^t y(τ) \, dτ = \int_0^t \lim_{n→∞} x′_n(τ) \, dτ = \lim_{n→∞} \int_0^t x′_n(τ) \, dτ = x(t) − x(0),
i.e., x(t) = x(0) + \int_0^t y(τ) \, dτ.
This shows that x ∈ D(A) and x′ = y. Theorem 7.1.5 now implies that A is closed.

Note 7.1.1. Here, D(A) is not closed in Ex = C([0, 1]), for otherwise A
would be bounded by the closed graph theorem.
7.1.8 Theorem
Closedness does not imply boundedness of a linear operator. Conversely,
boundedness does not imply closedness.
Proof: The first statement is shown to be true by examples 7.1.6 (ii) and
7.1.7. The second statement is demonstrated by the following example. Let
T : D(T ) → D(T ) ⊂ Ex be the identity operator on D(T ), where D(T ) is
a proper dense subspace of a normed linear space Ex . It is evident that T
is linear and bounded. However, T is not closed. Take x ∈ Ex − D(T ) and
a sequence {xn } in D(T ) which converges to x; then T xn = xn → x, but
x ∉ D(T ), so T fails the condition of theorem 7.1.5.
7.1.9 Lemma (closed operator)
Since a broad class of operators in mathematical and theoretical physics
are differential operators and hence unbounded operators, it is important
to determine the domain and extensions of such operators. The following
lemma will be an aid in investigation in this direction.
Let T : D(T ) → Ey be a bounded linear operator with domain
D(T ) ⊆ Ex , where Ex and Ey are normed linear spaces. Then:
(a) If D(T ) is a closed subset of Ex , then T is closed.
(b) If T is closed and Ey is complete, then D(T ) is a closed subset of
Ex .
Proof: (a) If {xn } is in D(T ) and converges, say, xn → x and is such that
{T xn } also converges, then x ∈ D(T ) = D(T ), since D(T ) is closed.
T xn → T x since T is continuous.
Hence, T is closed by theorem 7.1.5.
(b) For x in the closure of D(T ) there is a sequence {xn } in D(T ) such that
xn → x. Since T is bounded,

    ||T xn − T xm || = ||T (xn − xm )|| ≤ ||T || ||xn − xm ||.

This shows that {T xn } is Cauchy; hence {T xn } converges, say T xn → y ∈ Ey ,
because Ey is complete. Since T is closed, x ∈ D(T ) by theorem 7.1.5 and
T x = y. Hence, D(T ) is closed because x in the closure of D(T ) was arbitrary.

7.1.10 Projection mapping


We next discuss the partition of a Banach space into two subspaces and
the related question of the existence of operators, which are projections
onto subspaces. These ideas help very much in the analysis of the structure
of a linear transformation. We provide here an illustration of the use of
closed graph theorem.
7.1.11 Definition: direct sum
A vector space E is said to be the direct sum of two of its subspaces M
and N , i.e.,
E =M ⊕N (7.5)
if every x ∈ E has a unique decomposition

x = y + z, (7.6)

with y ∈ M and z ∈ N .
Thus, if E = M ⊕ N then M ∩ N = {θ}.
7.1.12 Definition: projection
A linear map P from a linear space E to itself is called a projection if
P2 = P.

If P is a projection then (I − P ) is also a projection.


For (I − P )2 = (I − P )(I − P ) = I − P − P + P 2
             = I − 2P + P = I − P.
Moreover, P (I − P ) = P − P 2 = P − P = 0.
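As a small concrete check (the particular 3 × 3 matrix below is my own choice), the identity P 2 = P and the facts just derived about I − P can be verified numerically:

import numpy as np

# Sketch (my own illustration, not the book's) of definition 7.1.12:
# P projects R^3 onto span{e1, e2} along the direction (1, 1, 1).
P = np.array([[1.0, 0.0, -1.0],
              [0.0, 1.0, -1.0],
              [0.0, 0.0,  0.0]])
I = np.eye(3)

assert np.allclose(P @ P, P)                  # P^2 = P
assert np.allclose((I - P) @ (I - P), I - P)  # (I - P)^2 = I - P
assert np.allclose(P @ (I - P), 0)            # P(I - P) = 0

x = np.array([2.0, 3.0, 5.0])
y, z = P @ x, (I - P) @ x                     # x = y + z, y in R(P), z in N(P)
print(y, z, np.allclose(x, y + z))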

7.1.13 Lemma
If a normed linear space E is the direct sum of two subspaces M and
N and if P is a projection of E onto M , then
(i) P x = x if and only if x ∈ M ;
(ii) P x = θ if and only if x ∈ N .
Proof: If x ∈ E, then x = y + z where y ∈ M and z ∈ N .
Since P is a projection of E onto M ,

    P x = y.

Hence, if P x = x, then y = x and z = θ, so that x ∈ M ; conversely, if x ∈ M ,
then P x = x.
If P x = θ, then y = θ and hence x = z ∈ N ; conversely, if x ∈ N , then P x = θ.
If R(P ) and N (P ) denote respectively the range space and null space
of P , then
R(P ) = N (I − P ), N (P ) = R(I − P ).
Therefore, E = R(P ) + N (P ) and R(P ) ∩ N (P ) = {θ} for every
projection P defined on E.

The closedness and the continuity of a projection can be determined by


the closedness of its range space and null space respectively.
7.1.14 Theorem
Let E be a normed linear space and P : E → E be a projection. Then
P is a closed map, if and only if, the subspaces R(P ) and N (P ) are closed
in E. In that case, P is in fact, continuous if E is a Banach space.
Proof: Let P be a closed map, yn ∈ R(P ), zn ∈ N (P ). Further, let
yn → y, zn → z in E. Then P yn = yn → y, P zn = θ → θ in E so that
P y = y and P z = θ. Then y ∈ R(P ) and z ∈ N (P ). The above shows that
R(P ) and N (P ) are closed subspaces in E.
Conversely, let R(P ) and N (P ) be closed in E. Let xn → x and
P xn → y in E. Since R(P ) is closed and P xn ∈ R(P ) we see that y ∈ R(P ).
Also, since N (P ) is closed and xn −P xn ∈ N (P ), we see that x−y ∈ N (P ).
Thus, x − y = z with z ∈ N (P ). Thus, x = y + z, with y ∈ R(P ) and
z = x − y ∈ N (P ). Hence, P x = y, showing that P is a closed mapping.
If E is a Banach space and R(P ) and N (P ) are closed, then by the closed
graph theorem (theorem 7.1.3) the closed mapping P is in fact continuous.
7.1.15 Remark
(i) Let E be a normed linear space and M a subspace of E. Then
there exists a projection P defined on E such that R(P ) = M . Let {fi }
be a (Hamel basis) for M . Let {fi } be extended to a basis {hi } such that
{hi } = {fi } ∪ {gi } for the space E. Let N = span {gi }; then E = M + N
and M ∩ N = {θ}. The above shows that there is a projection of E onto
M along N .
(ii) A question that arises is that, given E is a normed linear space and
M a closed subspace of E, does there exist a closed projection defined on
E such that R(P ) = M ? By theorem 7.1.14, such a projection exists if
and only if, there is a closed subspace N of E such that E = M + N and
M ∩ N = {θ}. In such a case, N is called a closed complement of M in
E.

7.2 Open Mapping Theorem


7.2.1 Definition: open mapping
Let Ex and Ey be two Banach spaces and T be a linear operator mapping
Ex → Ey . Then, T : D(T ) → Ey with D(T ) ⊆ Ex is called an open
mapping if for every open set in D(T ) the image is an open set in Ey .
Note 7.2.1. A continuous mapping, T : Ex → Ey has the property that
for every open set in Ey the inverse image is an open set. This does not
imply that T maps open sets in Ex into open sets in Ey . For example, the
mapping ℝ → ℝ given by t → sin t is continuous but maps the open interval
]0, 2π[ onto the closed interval [−1, 1].
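A two-line numerical check of this observation (the sampling grid is my own choice) runs as follows:

import numpy as np

# Sketch: t -> sin t is continuous but not open; the open interval ]0, 2*pi[
# is mapped onto the closed interval [-1, 1].
print(np.sin(np.pi / 2), np.sin(3 * np.pi / 2))   # 1.0 and -1.0 are attained inside ]0, 2*pi[
t = np.linspace(1e-9, 2 * np.pi - 1e-9, 100001)   # points of the open interval
print(np.sin(t).min(), np.sin(t).max())           # the image fills [-1, 1], which is not open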

7.2.2 Theorem: open mapping theorem


A bounded linear operator T mapping a Banach space Ex onto all of a
Banach space Ey is an open mapping.
Before proving the above theorem, the following lemma will be proved.
7.2.3 Lemma (open unit ball)
A bounded linear operator T from a Banach space Ex onto all of a
Banach space Ey has the property that the image T (B(0, 1)) of the unit
ball B(0, 1) ⊆ Ex contains an open ball about 0 ∈ Ey .
Proof: The proof comprises three parts:
(i) The closure of the image of the open ball B(0, 1/2) contains an open
ball B ∗ .
(ii) T (Bn ) contains an open ball Vn about 0 ∈ Ey where Bn =
B(0, 2−n ) ⊂ Ex .
(iii) T (B(0, 1)) contains an open ball about 0 ∈ Ey .

(i) Given a set A ⊆ Ex we shall write [see figs 7.1(a) and 7.1(b)]

αA = {x ∈ Ex : x = αa, α a scalar, a ∈ A} (7.7)

A + g = {x ∈ Ex : x = a + g, a ∈ A, g ∈ Ex } (7.8)
and similarly for subsets of Ey .

[Fig. 7.1(a): illustration of formula (7.7). Fig. 7.1(b): illustration of formula (7.8)]

We consider the open ball B1 = B(0, 1/2) ⊆ Ex .
Any fixed x ∈ Ex is in kB1 with real k sufficiently large (k > 2||x||).
Hence, Ex = ∪_{k=1}^{∞} kB1 .
Since T is surjective and linear,

    Ey = T (Ex ) = T ( ∪_{k=1}^{∞} kB1 ) = ∪_{k=1}^{∞} kT (B1 ) = ∪_{k=1}^{∞} k cl T (B1 ),      (7.9)

where cl T (B1 ) denotes the closure of T (B1 ) in Ey .
Since Ey is complete, it is a set of the second category by Baire's category
theorem (theorem 1.4.20). Hence Ey = ∪_{k=1}^{∞} k cl T (B1 ) cannot be expressed
as a countable union of nowhere dense sets, so at least one set k cl T (B1 )
must contain an open ball. This means that cl T (B1 ) must contain an open
ball B ∗ = B(y0 , ε) ⊂ cl T (B1 ). It therefore follows from (7.8) that

    B ∗ − y0 = B(0, ε) ⊂ cl T (B1 ) − y0 .      (7.10)

We show now that B ∗ − y0 ⊆ cl T (B(0, 1)).

This we do by showing that

    cl T (B1 ) − y0 ⊆ cl T (B(0, 1)).      (7.11)

Let y ∈ cl T (B1 ) − y0 . Then y + y0 ∈ cl T (B1 ), and we remember that
y0 ∈ cl T (B1 ) too.
Since y + y0 ∈ cl T (B1 ), there exist wn ∈ B1 such that un = T wn ∈ T (B1 )
and un → y + y0 . Similarly, we can find zn ∈ B1 such that vn = T zn ∈ T (B1 )
and vn → y0 .
Since wn , zn ∈ B1 and B1 has radius 1/2, we obtain ||wn − zn || ≤
||wn || + ||zn || < 1.

Hence wn − zn ∈ B(0, 1).

Now, T (wn − zn ) = T wn − T zn → y.
Thus, y ∈ cl T (B(0, 1)). Since y ∈ cl T (B1 ) − y0 was arbitrary, it follows that

    cl T (B1 ) − y0 ⊆ cl T (B(0, 1)),

and (7.11) is proved. Using (7.10) and (7.11) we obtain

    B ∗ − y0 = B(0, ε) ⊆ cl T (B1 ) − y0 ⊆ cl T (B(0, 1)).      (7.12)

(ii) Let Bn = B(0, 2^{−n}) ⊆ Ex . Since T is linear, T (Bn ) = 2^{−n} T (B(0, 1)).
It follows from (7.12) that

    Vn = 2^{−n} B(0, ε) = B(0, ε/2^n ) ⊆ cl T (Bn ).      (7.13)
(iii) We finally prove that V1 = B(0, ε/2) ⊆ T (B(0, 1)).

For that, let y ∈ V1 . By (7.13), V1 = B(0, ε/2) ⊆ cl T (B1 ). Hence
y ∈ cl T (B1 ), so there is a v ∈ T (B1 ) close to y such that

    ||v − y|| < ε/4.

Now, v ∈ T (B1 ) implies that there is an x1 ∈ B1 such that v = T x1 .

Hence ||y − T x1 || < ε/4.

From the above and (7.13), putting n = 2, we see that
y − T x1 ∈ V2 ⊆ cl T (B2 ).
We can again find an x2 ∈ B2 such that

    ||(y − T x1 ) − T x2 || < ε/8.

Hence y − T x1 − T x2 ∈ V3 ⊆ cl T (B3 ), and so on.
Proceeding in the above manner we get at the nth stage an xn ∈ Bn
such that

    ||y − Σ_{k=1}^{n} T xk || < ε/2^{n+1} ,    (n = 1, 2, . . .).      (7.14)
Writing zn = x1 + x2 + · · · + xn , and since xk ∈ Bk so that ||xk || < 1/2^k ,
we have, for n > m,

    ||zn − zm || ≤ Σ_{k=m+1}^{n} ||xk || < Σ_{k=m+1}^{n} 1/2^k
                = (1/2^{m+1}) (1 + 1/2 + 1/2^2 + · · · + 1/2^{n−m−1}) → 0 as m → ∞.

Hence, {zn } is a Cauchy sequence and, Ex being complete, zn → z ∈ Ex .
Also z ∈ B(0, 1) since B(0, 1) has radius 1 and

    Σ_{k=1}^{∞} ||xk || < Σ_{k=1}^{∞} 1/2^k = 1.      (7.15)

Since T is continuous, T zn → T z, and (7.14) shows that T z = y. Hence
y ∈ T (B(0, 1)).

Proof of theorem 7.2.2


We have to prove that for every open set A ⊆ Ex the image T (A) is
open in Ey .
Let y = T z ∈ T (A), where z ∈ A. Since A is open, it contains an open
ball with centre z. Hence A − z contains an open ball with centre 0. Let
the radius of this ball be r and set k = 1/r. Then k(A − z) contains the open
ball B(0, 1). By lemma 7.2.3, T (k(A − z)) = k[T (A) − T z] contains an
open ball about 0, and therefore T (A) − T z contains an open ball about 0.
Hence, T (A) contains an open ball about T z = y. Since y ∈ T (A) is
arbitrary, T (A) is open.

7.3 Bounded Inverse Theorem


Let Ex and Ey be Banach spaces and let T be a bounded linear operator
mapping Ex onto Ey . If T is bijective (i.e., injective and surjective), then
T −1 is continuous and thus bounded.
Indeed, by the open mapping theorem 7.2.2, T is open: given A ⊆ Ex open,
T (A) is open. Since T is bijective, T −1 exists. For every open set A in Ex ,
i.e., in the range of T −1 , the inverse image of A under T −1 is T (A), which is
open in Ey , the domain of T −1 . Hence by theorem 1.6.4, T −1 is continuous
and linear and hence bounded (4.2.4).
7.3.1 Remark
(i) The inverse of a bijective closed mapping from a complete metric
space to a complete metric space is closed.
(ii) The inverse of a bijective, linear, continuous mapping from a Banach
space to a Banach space is linear and continuous.
(iii) If the normed linear spaces are not complete then the above ((i) and
(ii)) may not be true.

Let Ex = c00 with || · ||1 and Ey = c00 with || · ||∞ . If P (x) = x for
x ∈ Ex , then P : Ex → Ey is bijective, linear and continuous. But P −1 is
not continuous since, for xn = (1, 1, 1, . . . , 1, 0, 0, 0, . . .) with n ones, we have
||xn ||∞ = 1 and ||P −1 (xn )||1 = ||xn ||1 = n for all n = 1, 2, . . ..
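The growth of the ratio ||xn ||1 /||xn ||∞ can be displayed directly; the following is a small sketch with my own choice of n values:

import numpy as np

# Sketch of remark 7.3.1(iii): on c00 the identity map from (c00, ||.||_1)
# to (c00, ||.||_inf) is bounded, but its inverse is not, because
# ||x_n||_inf = 1 while ||x_n||_1 = n for x_n = (1, ..., 1, 0, 0, ...).
for n in [1, 10, 100, 1000]:
    x_n = np.ones(n)                               # the n non-zero coordinates of x_n
    print(n, np.max(np.abs(x_n)), np.sum(np.abs(x_n)))
# The ratio ||x_n||_1 / ||x_n||_inf = n is unbounded, so no constant beta
# with ||x||_1 <= beta ||x||_inf can exist; note that c00 is not complete.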
7.3.2 Definition: stronger norm, comparable norm
Given a linear space E with two norms || · || and || · ||′ , the norm || · || is said
to be stronger than the norm || · ||′ if for every x ∈ E and every ε > 0, there
is some δ > 0 such that B||·|| (x, δ) ⊆ B||·||′ (x, ε). Here, B||·|| (x, δ) denotes an
open ball in E w.r.t. || · ||, and B||·||′ (x, ε) denotes an open ball in E w.r.t.
|| · ||′ . In other words, || · || is stronger than || · ||′ if and only if every open
subset of E with respect to || · ||′ is also an open subset with respect to || · ||.
The norms || · || and || · ||′ are said to be comparable if one of them is
stronger than the other. For the definition of two equivalent norms || · || and
|| · ||′ , see 2.3.5.
7.3.3 Theorem
Let || · || and || · ||′ be norms on a linear space E. Then the norm || · || is
stronger than || · ||′ if and only if there is some α > 0 such that ||x||′ ≤ α||x||
for all x ∈ E.
Proof: Let || · || be stronger than || · ||′ . Then there is some r > 0 such that

    {x ∈ E : ||x|| < r} ⊆ {x ∈ E : ||x||′ < 1}.

Let θ ≠ x ∈ E and ε > 0. Since || rx/((1 + ε)||x||) || = r/(1 + ε) < r,

    we have || rx/((1 + ε)||x||) ||′ < 1, i.e., ||x||′ < ((1 + ε)/r) ||x||.

Since ε > 0 is arbitrary, ||x||′ ≤ (1/r) ||x||,
or ||x||′ ≤ α||x|| with α = 1/r.

Conversely, let ||x||′ ≤ α||x|| for all x ∈ E.
Let {xn } be a sequence in E such that ||xn − x|| → 0.
Then ||xn − x||′ ≤ α||xn − x|| → 0.
Hence, the norm || · || is stronger than the norm || · ||′ .
7.3.4 Two-norm theorem
Let E be a Banach space in the norm || · ||. Then a norm || · ||′ on the
linear space E is equivalent to the norm || · || if and only if E is also a
Banach space in the norm || · ||′ and the norm || · ||′ is comparable to the
norm || · ||.
Proof: If the norms || · || and || · ||′ are equivalent, then clearly they are
comparable.
Moreover, α1 ||x|| ≤ ||x||′ ≤ α2 ||x||, with α1 , α2 > 0, for all x ∈ E. Let {xn }
be a Cauchy sequence in E with the norm || · ||′ . By the first inequality, {xn }
is Cauchy in the norm || · || as well, and E being a Banach space in || · ||,
xn → x in the norm || · ||. Then

    ||xn − x||′ ≤ α2 ||xn − x|| → 0,

and hence xn → x in the norm || · ||′ .

Therefore, E is a Banach space with respect to the norm || · ||′ .
Conversely, let us suppose that E is a Banach space in the norm || · ||′
and that the norm || · ||′ is comparable to the norm || · ||.
Let us suppose, without loss of generality, that || · || is stronger than || · ||′ .
Then, by theorem 7.3.3, we can find an α > 0 such that ||x||′ ≤ α||x|| for all
x ∈ E. Let E ′ denote the linear space E with the norm || · ||′ and let us
consider the identity map I : E → E ′ . Clearly I is bijective, linear and
continuous. By the bounded inverse theorem 7.3, I −1 : E ′ → E is also
continuous, that is, ||x|| ≤ β||x||′ for all x ∈ E and some β > 0.
Letting α′ = 1/β, we have

    α′ ||x|| ≤ ||x||′ ≤ α||x||.

Therefore, it follows from 2.3.5 that || · || and || · ||′ are equivalent.


7.3.5 Remark
(i) The result above shows that two comparable complete norms on a
normed linear space are equivalent.

Problem [7.1, 7.2 and 7.3]


1. Given that E is a Banach space, D(T ) ⊆ E is closed, and the linear
operator T is bounded, show that T is closed.
2. If T is a linear transformation from a Banach space Ex into a Banach
space Ey , find a necessary and sufficient condition for a subspace G of
Ex × Ey to be the graph of T .
3. Given three normed linear spaces Ex , Ey and Ez :
(i) If F : Ex → Ey is continuous and G : Ey → Ez is closed, then
show that G ◦ F : Ex → Ez is closed.
(ii) If F : Ex → Ey is continuous and G : Ex → Ey is closed, then
show that F + G : Ex → Ey is closed.
4. Let Ex and Ey be normed linear spaces and T : Ex → Ey be linear.
Let T̂ : Ex /N (T ) → Ey be defined by T̂ (x + N (T )) = T (x), x ∈ Ex .
Show that T is closed if and only if N (T ) is closed in Ex and T̂ is
closed.
5. Let Ex and Ey be normed linear spaces and A : Ex → Ey be linear,
such that the range R(A) of A is finite dimensional. Then show that
A is continuous if and only if the null space N (A) of A is closed in
Ex .
In particular, show that a linear functional f on Ex is continuous if
and only if N (f ) is closed in Ex .
6. Let Ex be a normed linear space and f : Ex → R be linear. Then
show that f is closed if and only if f is continuous.
(Hint: Problems 7.4 and 7.5).
7. Let Ex and Ey be Banach spaces and let T : Ex → Ey be a closed
linear operator, then show that
(i) if C is compact in Ex , T (C) is closed in Ey and
(ii) if K is compact in Ey , T −1 (K) is closed in Ex .
8. Give an example of a discontinuous operator A from a Banach space
Ex to a normed linear space Ey , such that A has a closed graph.
9. Show that the null space N (A) of a closed linear operator A : Ex →
Ey , Ex , Ey being normed linear spaces, is a closed subspace of Ex .
10. Let Ex and Ey be normed linear spaces. If A1 : Ex → Ey is a closed
linear operator and A2 ∈ (Ex → Ey ), show that A1 + A2 is a closed
linear operator.
11. Show that A : ℝ2 → ℝ, defined by (x1 , x2 ) → x1 , is open. Is the
mapping ℝ2 → ℝ2 given by (x1 , x2 ) → (x1 , 0) an open mapping?

12. Let A : c00 → c00 be defined by

    y = Ax = (ξ1 , ξ2 /2, ξ3 /3, . . .)T ,

where x = {ξi }. Show that A is linear and bounded but A−1 is
unbounded.
13. Let Ex and Ey be Banach spaces and A : Ex → Ey be an injective
bounded linear operator. Show that A−1 : R(A) → Ex is bounded if
and only if, R(A) is closed in Ey .
14. Let A : Ex → Ey , be a bounded linear operator where Ex and Ey
are Banach spaces. If A is bijective, show that there are positive real
numbers α and β such that α||x|| ≤ ||Ax|| ≤ β||x|| for all x ∈ Ex .
15. Prove that the closed graph theorem can be deduced from the open
mapping theorem.
(Hint: Ex × Ey is a Banach space and the map (x, A(x)) → x ∈ Ex
is one-to-one and onto, A : Ex → Ey ).
16. Let Ex and Ey be Banach spaces and A ∈ (Ex → Ey ) be surjective.
Let yn → y in Ey . If Ax = y, show that there is a sequence {xn } in
Ex , such that Axn = yn for each n and xn → x in Ex .
17. Show that the uniform bounded principle for functionals [see 4.5.5]
can be deduced from the closed graph theorem.
18. Let Ex and Ey be Banach spaces and Ez be a normed linear space.
Let A1 ∈ (Ex → Ez ) and A2 ∈ (Ey → Ez ). Suppose that for every
x ∈ Ex there is a unique y ∈ Ey such that A1 x = A2 y, and define
Bx = y. Then show that B ∈ (Ex → Ey ).
19. Let Ex and Ey be Banach spaces and A ∈ (Ex → Ey ). Show that
R(A) is linearly homeomorphic to Ex /N (A) if and only if R(A) is
closed in Ey .
(Hint: Two metric spaces are said to be homeomorphic to each
other if there is a homeomorphism from Ex onto Ey . Use theorem
7.1.3.)
20. Let Ex denote the sequence space lp (1 ≤ p ≤ ∞). Let || · || be a
complete norm on Ex such that if ||xn − x|| → 0 then xnj → xj for
every j = 1, 2, . . .. Show that || · || is equivalent to the usual norm
|| ||p on Ex .
21. Let || · || be a complete norm on C([a, b]) such that if ||xn − x|| → 0
then xn (t) → x(t) for every t ∈ [a, b]. Show that || · || is equivalent
to the supremum norm || · ||∞ on C([a, b]).
22. Give an example of a bounded linear operator mapping C 1 ([0, 1]) →
C 1 ([0, 1]) which is not a closed operator.

7.4 Applications of the Open Mapping Theorem
Recall the definition of a Schauder basis in 4.8.3. Let E be a normed linear
space. A denumerable subset {e1 , e2 , . . .} of E is called a Schauder basis
for E if ||en || = 1 for each n and if for every x ∈ E there are unique scalars
α1 , α2 , . . . , αn , . . . in ℝ (ℂ) such that x = Σ_{i=1}^{∞} αi ei .
In case {e1 , e2 , . . .} is a Schauder basis for E, then for n = 1, 2, . . . let us
define functionals fn : E → ℝ (ℂ) by fn (x) = αn (x) for x = Σ_{n=1}^{∞} αn en ∈ E.
fn is well-defined and linear on E. It is called the nth coefficient functional
on E.

7.4.1 Theorem
The coefficient functionals fn , n = 1, 2, . . . , are bounded.
Proof: We consider the vector space E ′ of all sequences y = (α1 , α2 , . . . , αn , . . .)
for which Σ_{n=1}^{∞} αn en converges in E. The norm

    ||y|| = sup_n || Σ_{i=1}^{n} αi ei ||

converts E ′ into a normed vector space. We show that E ′ is a Banach space. Let

ym = {αn^(m) }, m = 1, 2, . . . , be a Cauchy sequence in E ′ . Let ε > 0; then
there is an N such that m, p > N implies

    ||ym − yp || = sup_n || Σ_{i=1}^{n} (αi^(m) − αi^(p) ) ei || < ε.

Since ||en || = 1, this implies |αn^(m) − αn^(p) | < 2ε for every n. Hence for every
n, lim_{m→∞} αn^(m) = αn exists. It remains to be shown that
y = (α1 , α2 , . . . , αn , . . .) ∈ E ′ and lim_{m→∞} ym = y.
Letting p → ∞ in the inequality above we obtain, for m > N ,

    ||ym − y|| = sup_n || Σ_{i=1}^{n} (αi^(m) − αi ) ei || ≤ ε,

so that ym → y in the norm of E ′ and, in particular, y = (α1 , α2 , . . . , αn , . . .) ∈ E ′ .
Thus every Cauchy sequence in E ′ converges, i.e., E ′ is a Banach space.
Let us next consider the mapping P : E ′ → E which sends y = (α1 , α2 , . . .) ∈
E ′ to P y = Σ_{n=1}^{∞} αn en ∈ E. If z = (β1 , β2 , . . .) ∈ E ′ , so that
P z = Σ_{n=1}^{∞} βn en ∈ E, then P (y + z) = Σ_{n=1}^{∞} (αn + βn )en = P y + P z,
showing that P is linear. Since the {ei } are linearly independent,
P y = θ ⟺ y = θ. Hence P is one-to-one. Now, {en } being a Schauder basis in
E, every element of E is representable in the form Σ_{n=1}^{∞} rn en where
(r1 , r2 , . . . , rn , . . .) ∈ E ′ . Hence P is onto. P is bounded since

    ||y|| = sup_n || Σ_{i=1}^{n} αi ei || ≥ lim_{n→∞} || Σ_{i=1}^{n} αi ei || = ||P y||.

By the bounded inverse theorem (which rests on the open mapping theorem),
the inverse P −1 of P is bounded.
Now, writing y = P −1 x,

    |αn (x)| = |αn | ||en || = ||αn en || = || Σ_{i=1}^{n} αi ei − Σ_{i=1}^{n−1} αi ei || ≤ 2 sup_n || Σ_{i=1}^{n} αi ei ||
             = 2||y|| = 2||P −1 x|| ≤ 2||P −1 || ||x||.

This proves the boundedness of αn (x), i.e., of the coefficient functionals fn .
CHAPTER 8

COMPACT OPERATORS ON NORMED LINEAR SPACES

This chapter focusses on a natural and useful generalisation of bounded


linear operators having a finite dimensional range. The concept of a
compact linear operator is introduced in section 8.1. Compact linear
operators often appear in applications. They play a crucial role in the
theory of integral equations and in various problems of mathematical
physics. The relation of compactness with weak convergence and reflexivity
is highlighted. The spectral properties of a compact linear operator are
studied in section 8.2. The notion of the Fredholm alternative and the
relevant theorems are provided in section 8.3. Section 8.4 shows how to
construct finite rank approximations of a compact operator. A reduction
of the finite rank problem to a finite dimensional problem is also given.

8.1 Compact Linear Operators


8.1.1 Definition: compact linear operator
A linear operator mapping a normed linear space Ex into a normed
linear space Ey is said to be compact if it maps every bounded set of Ex
into a relatively compact set of Ey (a set whose closure in Ey is compact).
8.1.2 Remark
(i) A linear map A from a normed linear space Ex into a normed linear
space Ey is continuous if and only if it sends the open unit ball B(0, 1) in


Ex to a bounded subset of Ey .
(ii) Compactness of a linear operator A is a stronger requirement than
boundedness, in the sense that the closure of A(B(0, 1)) is a compact subset
of Ey , where B(0, 1) is the open unit ball of Ex .
(iii) A compact linear operator is also known as a completely
continuous operator in view of a result we shall prove in 8.1.14(a).
8.1.3 Remark
(i) A compact linear operator is continuous, but the converse is not
always true. For example, if Ex is an infinite dimensional normed linear
space, then the identity map I on Ex is clearly linear and continuous, but
it is not compact. See example 1.6.16.
8.1.4 Lemma
Let Ex and Ey be normed linear spaces and let A1 , A2 be compact linear
operators mapping Ex into Ey .
Then (i) A1 + A2 is a compact operator;
(ii) if Ex = Ey , the product A1 A2 is a compact operator.
Proof: (i) Let {xn } be a sequence with ||xn || ≤ 1. Since the closure of
A1 (B(0, 1)) is compact, and hence sequentially compact (1.6), {A1 xn } contains
a convergent subsequence {A1 xnp }.
A2 being compact, we can extract from {xnp } a further subsequence {xnr }
such that {A2 xnr } is convergent.
Along {xnr } both {A1 xnr } and {A2 xnr } converge, so (A1 + A2 )(xnr ) is
convergent. Hence A1 + A2 is compact.
(ii) Let {xn } be a sequence with ||xn || ≤ 1. A2 being compact, {A2 xn }
contains a convergent, hence bounded, subsequence {A2 xnp }. A1 being
compact, {A1 (A2 xnp )} contains a convergent subsequence. Hence A1 A2 maps
the bounded sequence {xn } into a sequence having a convergent subsequence,
and so A1 A2 is compact.
8.1.5 Examples
(1) Let Ex = Ey = C([0, 1]) and let

    Ax = y(t) = ∫₀¹ K(t, s)x(s)ds.

The kernel K(t, s) is continuous on 0 ≤ t, s ≤ 1. We want to show that A
is compact. Let {x(t)} be a bounded set of functions of C([0, 1]), ||x|| ≤ α,
and let L = max_{t,s} |K(t, s)|. Then y(t) satisfies

    |y(t)| ≤ ∫₀¹ |K(t, s)| |x(s)|ds ≤ Lα,
284 A First Course in Functional Analysis

showing that the functions y(t) are uniformly bounded. Furthermore, we
show that the functions y(t) are equicontinuous.
Given ε > 0, we can find a δ > 0, on account of the uniform continuity
of K(t, s), such that

    |K(t2 , s) − K(t1 , s)| < ε/α

for |t2 − t1 | < δ and every s ∈ [0, 1].
Therefore,

    |y(t2 ) − y(t1 )| = | ∫₀¹ (K(t2 , s) − K(t1 , s))x(s)ds |
                     ≤ max_s |K(t2 , s) − K(t1 , s)| · ||x||
                     < (ε/α) · α = ε whenever |t2 − t1 | < δ,

for all y(t). Hence {y(t)} is equicontinuous.
By the Arzelà-Ascoli theorem (1.6.23) the set of functions {y(t)} is
relatively compact in the metric of the space C([0, 1]). Hence the operator
A is compact.
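The uniform boundedness and equicontinuity used in this argument can be observed on a discretized version of the operator. The sketch below uses an assumed kernel K(t, s) = exp(−|t − s|) and random bounded samples standing in for the unit ball of C([0, 1]); these choices are mine and only illustrate the Arzelà-Ascoli mechanism:

import numpy as np

# Discretized sketch of example 8.1.5: images of bounded functions under
# (Ax)(t) = \int_0^1 K(t,s) x(s) ds are uniformly bounded and equicontinuous.
m = 400
t = np.linspace(0.0, 1.0, m)
K = np.exp(-np.abs(t[:, None] - t[None, :]))   # an assumed continuous kernel
w = 1.0 / m                                    # simple quadrature weight

rng = np.random.default_rng(0)
max_val, max_osc = 0.0, 0.0
for _ in range(200):                           # 200 random samples with |x| <= 1
    x = rng.uniform(-1.0, 1.0, size=m)
    y = K @ x * w                              # y(t) ~ \int K(t,s) x(s) ds
    max_val = max(max_val, np.max(np.abs(y)))
    max_osc = max(max_osc, np.max(np.abs(np.diff(y))))

print("uniform bound on |y(t)|    :", max_val)   # <= max|K| = 1
print("largest step |y(t+h)-y(t)| :", max_osc)   # small: equicontinuity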
8.1.6 Lemma
If a sequence {xn } is weakly convergent to x0 and compact, then the
sequence is strongly convergent to x0 .
Proof: Let us assume by way of contradiction that {xn } is not strongly
convergent to x0 . Then there is an ε > 0 and an increasing sequence of
indices n1 , n2 , . . . , nk , . . . such that ||xnk − x0 || ≥ ε. Since the sequence
{xni } is compact, it contains a convergent subsequence {xnij }; let
{xnij } converge strongly to u0 .
Moreover, xnij converges weakly to u0 as j → ∞. Since at the same time
xnij converges weakly to x0 , u0 = x0 . Thus on one hand ||xnij − x0 || ≥ ε,
whereas on the other ||xnij − x0 || → 0, a contradiction. Hence the lemma.
8.1.7 Theorem
A compact operator A maps a weakly convergent sequence into
a strongly convergent sequence.
Proof: Let the sequence {xn } converge weakly to x0 . Then the norms of the
elements of this sequence are bounded (theorem 6.3.3). Thus A maps the
bounded sequence {xn } into a compact sequence {Axn }. Let yn = Axn .
Since a compact linear operator is bounded, and since {xn } is weakly
convergent to x0 , by theorem 6.3.17 Axn converges weakly to Ax0 = y0 .
Given that A is compact and {xn } is bounded, {yn } = {Axn } is
compact. Now, since {yn } converges weakly to y0 and {yn } is compact,
lemma 8.1.6 gives yn → y0 .

8.1.8 Theorem
Let A be a linear compact operator mapping an infinite dimensional
space E into itself and let B be an arbitrary bounded linear operator acting
in the same space. Then AB and BA are compact.
Proof: See lemma 8.1.4.
Note 8.1.1. If A is a compact linear operator mapping an infinite dimensional
normed linear space E into E and admits an inverse A−1 , then A · A−1 = I.
Since I is not compact on an infinite dimensional space, A−1 is not bounded.
8.1.9 Theorem
If a sequence {An } of compact linear operators mapping a normed linear
space Ex into a Banach space Ey converges in the operator norm to the
operator A, that is, if ||An − A|| → 0, then A is also a compact operator.
Proof: Let M be a bounded set in Ex and α a constant such that ||x|| ≤ α
for every x ∈ M . For given ε > 0, there is an index n0 = n0 (ε) such that
||An − A|| < ε/α for n ≥ n0 . Let A(M ) = L and An0 (M ) = N .
We assert that the set N = An0 (M ) is an ε-net of L. Take, for every y ∈ L,
one of its pre-images x ∈ M and put y0 = An0 x ∈ N , so that
||y − y0 || = ||Ax − An0 x|| ≤ ||A − An0 || ||x|| < (ε/α) · α = ε.
On the other hand, since An0 is compact and M is bounded, the set N
is relatively compact. It follows that L, for every ε > 0, has a compact ε-net
and is therefore itself relatively compact (theorem 1.6.18). Thus, the operator
A maps an arbitrary bounded set into a set whose closure is compact, and
hence the operator A is compact.
8.1.10 Example
1. If Ex = Ey = L2 ([0, 1]), then the operator Ax = y = ∫₀¹ K(t, s)x(s)ds,
with ∫₀¹ ∫₀¹ K 2 (t, s) dt ds < ∞, is compact.
Proof: Let us assume first that K(t, s) is a continuous kernel. Let M be a
bounded set in L2 ([0, 1]) and let

    ∫₀¹ x2 (t)dt ≤ α2 for all x(t) ∈ M.

Consider the set of functions

    y(t) = ∫₀¹ K(t, s)x(s)ds,    x(s) ∈ M.

It is to be shown that the functions y(t) are uniformly bounded and


equicontinuous (theorem 1.6.22). This implies the compactness of the
set {y(t)} in the sense of uniform convergence and also in the sense
of convergence in the mean square. By the Cauchy-Bunyakovsky-Schwarz
inequality (1.4.3) we have

    |y(t)| = | ∫₀¹ K(t, s)x(s)ds | ≤ ( ∫₀¹ K 2 (t, s)ds )^{1/2} ( ∫₀¹ x2 (s)ds )^{1/2} ≤ Lα,

where L = max_{t,s} |K(t, s)|. Consequently the functions y(t) are
uniformly bounded. Furthermore,

    |y(t2 ) − y(t1 )| ≤ ( ∫₀¹ [K(t2 , s) − K(t1 , s)]2 ds )^{1/2} ( ∫₀¹ x2 (s)ds )^{1/2} < ε

for |t2 − t1 | < δ, where δ is chosen such that

    |K(t2 , s) − K(t1 , s)| < ε/α

for |t2 − t1 | < δ.

The estimate |y(t2 ) − y(t1 )| < ε does not depend on the positions of t1 , t2
in [0,1] and also does not depend on the particular y(t) (i.e., on the choice of
x ∈ M ). Hence, the functions y(t) are equicontinuous.
Thus, in the case of a continuous kernel the operator is compact.
Next let us assume K(t, s) to be an arbitrary square-integrable kernel.
We select a sequence of continuous kernels {Kn (t, s)} which converges in
the mean to K(t, s), i.e., a sequence such that

    ∫₀¹ ∫₀¹ (K(t, s) − Kn (t, s))2 dt ds → 0 as n → ∞.
Set An x = ∫₀¹ Kn (t, s)x(s)ds.
Then,

    ||Ax − An x|| = { ∫₀¹ [ ∫₀¹ K(t, s)x(s)ds − ∫₀¹ Kn (t, s)x(s)ds ]2 dt }^{1/2}
                  ≤ { ∫₀¹ ( ∫₀¹ [K(t, s) − Kn (t, s)]2 ds ) ( ∫₀¹ x2 (s)ds ) dt }^{1/2}
                  = { ∫₀¹ ∫₀¹ [K(t, s) − Kn (t, s)]2 ds dt }^{1/2} ||x||.

Hence,

    ||A − An || = sup_{x≠0} ||Ax − An x|| / ||x|| ≤ { ∫₀¹ ∫₀¹ [K(t, s) − Kn (t, s)]2 dt ds }^{1/2} .

Since Kn (t, s) → K(t, s) in the mean as n → ∞, ||A − An || → 0 as
n → ∞. Since all the An are compact, A is also compact by theorem 8.1.9.
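Numerically, the approximation of A by finite rank (hence compact) operators can be seen through the singular values of a discretized kernel. The kernel K(t, s) = min(t, s) and the grid size in the sketch below are my own choices:

import numpy as np

# Sketch: discretize the square-integrable kernel K(t,s) = min(t,s) on [0,1]^2.
# The tail of the singular values bounds ||A - A_n|| for the best rank-n
# approximation A_n, so A is a norm limit of finite-rank operators, in the
# spirit of theorem 8.1.9 and example 8.1.10.
m = 300
t = np.linspace(0.0, 1.0, m)
K = np.minimum(t[:, None], t[None, :])
A = K / m                                    # discretized integral operator on L2

s = np.linalg.svd(A, compute_uv=False)
print("first singular values:", np.round(s[:6], 5))
print("||A - A_5|| <= s_6   :", s[5])
# For this kernel the exact eigenvalues are 1/((n - 1/2)^2 pi^2), decaying to 0.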

8.1.11 Remark
The limit of a weakly convergent sequence {An } of compact operators is
not necessarily compact.
Let us consider an infinite dimensional Banach space E with a Schauder basis
{ei }. Then every x ∈ E can be written in the form

    x = Σ_{i=1}^{∞} ξi ei .

Let Sn x = Σ_{i=1}^{n} ξi ei , so that Sn x is the projection of x onto a finite
dimensional subspace.
Let us consider the unit ball B(0, 1) = {x : x ∈ E, ||x|| ≤ 1}.
Then Sn (B(0, 1)) is closed and bounded in the n-dimensional space En
and hence compact.
Thus, Sn is compact.
As n → ∞, Sn x → x for every x ∈ E, so Sn → I pointwise and hence also in
the weak operator sense; but the identity operator I is not compact.
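This can be illustrated numerically. The sketch below (the truncation dimension and the particular x are my own choices) shows Sn x → x for a fixed x while ||(I − Sn )en+1 || = 1 for every n, so Sn does not approach I in the operator norm and the pointwise limit I is not compact:

import numpy as np

# Sketch of remark 8.1.11 on l^2: S_n x = (xi_1, ..., xi_n, 0, 0, ...) is a
# finite-rank (hence compact) operator with S_n x -> x pointwise, yet
# ||S_n - I|| = 1 for all n.
N = 5000                                    # working truncation of l^2
x = 1.0 / np.arange(1, N + 1)               # a fixed x = (1, 1/2, 1/3, ...) in l^2
for n in [10, 100, 1000]:
    Sn_x = np.where(np.arange(N) < n, x, 0.0)
    print(n, np.linalg.norm(x - Sn_x))      # -> 0 : pointwise convergence
    e = np.zeros(N); e[n] = 1.0             # the unit vector e_{n+1}
    Sn_e = np.where(np.arange(N) < n, e, 0.0)
    print("   ||(I - S_n) e_{n+1}|| =", np.linalg.norm(e - Sn_e))   # always 1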
8.1.12 Theorem (Schauder, 1930) [49]
Let Ex and Ey be normed linear spaces and A ∈ (Ex → Ey ). If A is
compact then A∗ is a compact linear operator mapping Ey∗ into Ex∗ . The
converse holds if Ey is a Banach space.
Proof: Let A be a compact linear operator mapping Ex into Ey . Let us
consider a bounded sequence {φn } in Ey∗ . For y1 , y2 ∈ Ey

|φn (y1 ) − φn (y2 )| ≤ ||φn || ||y1 − y2 || ≤ α||y1 − y2 ||.

Let L = A(B(0, 1)) (more precisely, the closure of the image of the unit ball),
which is a compact metric space since A is compact. Then {φn|L : n = 1, 2, . . .}
is a set of uniformly bounded, equicontinuous functions on L. By the
Arzelà-Ascoli theorem (1.6.23), {φn|L } has a subsequence {φnj |L } which
converges uniformly on L.

For i, j = 1, 2, . . ., we have
||A∗ (φni ) − A∗ (φnj )|| = sup{|A∗ (φni − φnj )(x)| : ||x|| ≤ 1}
= sup{|(φni − φnj )(Ax)| : ||x|| ≤ 1}
≤ sup{|φni (y) − φnj (y)| : y ∈ L}.

Since the sequence {φnj |L } is uniformly Cauchy on L; we see that


(A∗ (φnj )) is a Cauchy sequence in Ex∗ . It must converge in Ex∗ since Ex∗ is
a complete normed linear space and hence a Banach space (theorem 4.4.2).
We have thus shown that (A∗ (φn )) has a convergent subsequence. Thus A∗
maps a bounded sequence in Ey∗ into a convergent subsequence in Ex∗ and
is hence a compact operator.

Conversely, let us assume that Ey is a Banach space and A ∈ (Ex → Ey )


and A∗ is a compact operator mapping Ey∗ into Ex∗ . Then we can show that
A∗∗ is a compact operator mapping Ex∗∗ into Ey∗∗ by the argument put
forward above. Now let us consider the canonical embeddings
φEx : Ex → Ex∗∗ and φEy : Ey → Ey∗∗ introduced in sec. 5.6.6. Since
A∗∗ φEx = φEy A by sec. 6.1.5, we see that φEy (A(B(0, 1))) = {A∗∗ φEx (x) :
x ∈ B(0, 1) ⊆ Ex } is contained in {A∗∗ (f ) : f ∈ Ex∗∗ , ||f || < 1}.
This last set is totally bounded in Ey∗∗ since A∗∗ is a compact map.
As a result, φEy (A(B(0, 1)) is a totally bounded subset of Ey∗∗ . Since φEy
is an isometry, A(B(0, 1)) is a totally bounded subset of Ey . As Ey is a
Banach space, and A(B(0, 1)) is a totally bounded subset of it, then its
closure A(B(0, 1)) is complete and totally bounded. Hence by theorem
1.6.18, A(B(0, 1)) is compact. Hence, A is a compact operator mapping Ex
into Ey .
8.1.13 Theorem
Let Ex and Ey be normed linear spaces and A : Ex → Ey be linear. If
A is continuous and the range of A is finite dimensional then A is compact
and R(A) is closed in Ey .
Conversely, if Ex and Ey are Banach spaces, A is compact and R(A) is
closed in Ey , then A is continuous and its range is of finite dimensions.
Proof: Since A is linear and continuous, it is bounded by theorem 4.2.4.
Since R(A) is a finite dimensional subspace of Ey , it is closed [see theorem
2.3.4]. Thus if {xn } is a bounded sequence in Ex , {Axn } is a bounded
subset of the finite dimensional space R(A). We next show that A is
compact. By th. 2.3.1 every finite dimensional normed linear space of a
given dimension n is isomorphic to the n-dimensional Euclidean space Rn .
By the Heine-Borel theorem the closure of {Axn }, being closed and bounded
in R(A), is compact. Therefore A is a compact operator.
Conversely, let us assume that Ex and Ey are Banach spaces, A is
compact such that R(A) is closed in Ey . Then A is continuous. Also, R(A)
is a Banach space and A : Ex → R(A) is onto. Then by the open mapping
theorem (7.2.2), A(B(0, 1)) is open. Hence there is some δ > 0 such that
X = {y ∈ R(A) : ||y|| ≤ δ} ⊂ A(B(0, 1)).
Since R(A) is closed, we have
{y ∈ R(A) : ||y|| ≤ δ} = X ⊆ A(B(0, 1)) ⊆ R(A).
As the closure of A(B(0, 1)) is compact, the closed ball of radius δ about zero
in the normed linear space R(A) is compact. We can then show, using 2.3.8,
that R(A) is finite dimensional.
8.1.14 Remark
An operator A on a linear space E is said to be of finite rank if the
range of A is finite dimensional.

8.1.15 Theorem

Let Ex and Ey be normed linear spaces and A : Ex → Ey be linear.


(a) Let A be a compact operator mapping Ex into Ey . If xn → x weakly in
Ex , then Axn → Ax in Ey .
(b) Let Ex be reflexive and suppose that Axn → Ax in Ey whenever xn → x
weakly in Ex . Then A is a compact linear operator mapping Ex → Ey .
Proof: (a) Let xn → x weakly in Ex . By theorem 6.3.3, {xn } is a bounded
sequence in Ex . Let us suppose, by way of contradiction, that Axn does not
converge to Ax. Then there is an ε > 0 and a subsequence {xni } such that
||Axni − Ax|| ≥ ε for all i = 1, 2, . . .. Since A is compact and {xni } is a
bounded sequence, there is a subsequence {xnij } of {xni } such that Axnij
converges, as j → ∞, to some element y in Ey . Then ||y − Ax|| ≥ ε, so that
y ≠ Ax. On the other hand, if f ∈ Ey∗ then f ◦ A ∈ Ex∗ and, since xn → x
weakly in Ex , we have

    f (Ax) = lim_{j→∞} f (Axnij ) = f ( lim_{j→∞} Axnij ) = f (y).

Thus f (y − Ax) = 0 for every f ∈ Ey∗ . Then by 5.1.4 we must have


y = Ax. This contradiction proves that A(xn ) → Ax in Ey .
(b) Let {xn } be a bounded sequence in Ex . Since Ex is reflexive,
Eberlein's theorem (6.4.4) shows that {xn } has a weakly convergent
subsequence {xni }. Let xni → x weakly in Ex . Then by our hypothesis
Axni → Ax in Ey . Thus, for every bounded sequence {xn } in Ex , {Axn }
contains a subsequence which converges in Ey . Hence, A is a compact map
by 8.1.1.

8.1.16 Remark

(a) The requirement of reflexivity of Ex cannot be dropped from


8.1.15(b). For example, if A denotes the identity operator from l1 to l1 , then
Schur's lemma [see Limaye [33]] shows that Axn → Ax whenever xn → x
weakly in l1 . However, the identity operator is not compact.

8.1.17 Theorem

The range of a compact operator A is separable.


Proof: Let Kn be the image of the ball {x : ||x|| ≤ n}. Since A is compact,
the closure of Kn is compact and therefore also a separable set [see 1.6.19].
Let Ln be a countable everywhere dense set in the closure of Kn . Since
K = ∪_{n=1}^{∞} Kn is the range of A, L = ∪_{n=1}^{∞} Ln is a countable,
everywhere dense set in K.
n=1

8.2 Spectrum of a Compact Operator


In this section we develop the Riesz-Schauder theory of the spectrum of
a compact operator on a normed linear space Ex over ℝ (or ℂ). We show
that this spectrum resembles the spectrum of a finite matrix except for the
that this spectrum resembles the spectrum of a finite matrix except for the
number 0. We begin the study by referring to some preliminary results
(4.7.17–4.7.20).
8.2.1 Theorem
Let Ex be a normed linear space, A a compact linear operator mapping
Ex into Ex , and 0 ≠ k ∈ ℝ (ℂ). If {xn } is a bounded sequence in Ex such
that Axn − kxn → y in Ex , then there is a subsequence {xni } of {xn } such
that xni → x in Ex and Ax − kx = y.
Proof: Since {xn } is bounded and A is a compact operator, {xn } has a
subsequence {xni } such that {A(xni )} converges to some z in Ex , then

kxni = kxni − Axni + Axni → −y + z,

so that xni → (z − y)/k = x (say). Also, since A is continuous,


Ax − kx = lim {Axni − kxni } = z − {z − y} = y.
i→∞

8.2.2 Remark
The above result shows that if A is a compact linear operator mapping
Ex into Ex , and if {xn } is a bounded sequence of approximate solution of
Ax − kx = y, then a subsequence of {xn } converges to an exact solution
of the above equation. The following result, which is based on Riesz
lemma (2.3.7), is instrumental in analysing the spectrum of a compact
operator.
8.2.3 Lemma
Let Ex be a normed linear space and A : Ex → Ex .
(a) Let 0 ≠ k ∈ ℝ (ℂ) and let Ey be a proper closed subspace of Ex such
that (A − kI)Ex ⊆ Ey . Then there is some x ∈ Ex such that ||x|| = 1 and,
for all y ∈ Ey ,

    ||Ax − Ay|| ≥ |k|/2 .
(b) Let A be a compact linear operator mapping Ex → Ex and let
k0 , k1 , . . . be scalars with |kn | ≥ δ for some δ > 0 and n = 0, 1, 2, . . . .
Let E0 , E1 , . . . and E^0 , E^1 , . . . be closed subspaces of Ex such that, for
n = 0, 1, 2, . . . ,

    En+1 ⊆ En ,        (A − kn I)(En ) ⊆ En+1 ,
    E^n ⊆ E^{n+1} ,    (A − kn+1 I)(E^{n+1}) ⊆ E^n .

Then there are non-negative integers p and q such that Ep+1 = Ep and
E^{q+1} = E^q .


Proof: (a) First, we note that A(Ey ) ⊆ Ey , since
Ay = [Ay − ky] + ky ∈ Ey for all y ∈ Ey .
Now by the Riesz lemma (2.3.7), there is some x ∈ Ex such that ||x|| = 1
and dist (x, Ey ) ≥ 1/2.
Let us consider y ∈ Ey . Since Ax − kx ∈ Ey and Ay ∈ Ey , we have

    ||Ax − Ay|| = ||kx − [kx − Ax + Ay]|| = |k| || x − (1/k)[kx − Ax + Ay] ||
                ≥ |k| dist (x, Ey ) ≥ |k|/2 .
(b) Let us suppose that Ep+1 is a proper closed subspace of Ep for
each p = 0, 1, 2, . . .. By (a) above we can find yp ∈ Ep such that ||yp || = 1
and, for all y ∈ Ep+1 ,

    ||Ayp − Ay|| ≥ |kp |/2 ≥ δ/2 ,    p = 0, 1, 2, . . . .

It follows that {yp } is a bounded sequence in Ex and

    ||Ayp − Ayr || ≥ δ/2 ,    p, r = 0, 1, . . . , with p ≠ r.

The above shows that {Ayp } cannot have a convergent subsequence. But
this contradicts the fact that A is compact. Hence there is some nonnegative
integer p such that Ep = Ep+1 .
It can similarly be proved that there is some nonnegative integer q such
that E^{q+1} = E^q .
8.2.4 Definitions: ρ(A), δ(A), σe (A), σa (A)
In view of the discussion in 4.7.17–4.7.20, we write the following
definitions:
(i) Resolvent set: ρ(A) = {λ ∈ ℝ (ℂ) : A − λI is invertible}.
(ii) Spectrum: σ(A) = {λ ∈ ℝ (ℂ) : A − λI does not have an
inverse}. A scalar belonging to σ(A) is known as a spectral value of A.
(iii) Eigenspectrum: σe (A) consists of all λ in ℝ (ℂ) such that
A − λI is not injective (one-to-one). Thus, λ ∈ σe (A) if and only if there is
some non-zero x in Ex such that Ax = λx. λ is called an eigenvalue of A and
x is called a corresponding eigenvector of A. The subspace N (A − λI) is
known as the eigenspace of A corresponding to the eigenvalue λ.
(iv) The approximate eigenspectrum σa (A) consists of all λ in
ℝ (ℂ) such that (A − λI) is not bounded below. Thus, λ ∈ σa (A) if
and only if there is a sequence {xn } in Ex such that ||xn || = 1 for each n
and ||Axn − λxn || → 0 as n → ∞. Then λ is called an approximate

eigenvalue of A. If λ ∈ σe (A) and x is a corresponding eigenvector, then


letting xn = x/||x|| for all n, we conclude that λ ∈ σa (A). Hence,

σe (A) ⊂ σa (A) ⊂ σ(A).

(v) An operator A on a linear space Ex is said to be of finite rank if


the range of A is finite dimensional.
8.2.5 Theorem
Let Ex be a normed linear space and A be a compact linear operator
mapping Ex into Ex .
(a) Every non-zero spectral value of A is an eigenvalue of A, so that

    {λ : λ ∈ σe (A), λ ≠ 0} = {λ : λ ∈ σ(A), λ ≠ 0}.

(b) If Ex is infinite dimensional, then 0 ∈ σa (A).

(c) σa (A) = σ(A).
Proof: (a) Let 0 ≠ λ ∈ ℝ (ℂ). If λ is not an eigenvalue, then A − λI is
one-to-one. We prove that λ is not a spectral value of A, i.e., (A − λI)
is invertible. We first show that (A − λI) is bounded below. Otherwise,
we can find a sequence {xn } in Ex such that ||xn || = 1 for each n and
||(A − λI)(xn )|| → 0 as n → ∞. Then, by theorem 8.2.1, there is a
subsequence {xni } of {xn } such that xni → x in Ex and Ax − λx = 0.
Since A − λI is one-to-one, we have x = θ. But ||x|| = lim_{i→∞} ||xni || = 1.
This leads to a contradiction. Thus, A − λI is bounded below.
Next we show that A − λI is onto, i.e., R(A − λI) = Ex . First, we show
that R(A − λI) is a closed subspace of Ex . Let (Axn − λxn ) be a sequence
in R(A − λI) which converges to some element y ∈ Ex . Then ((A − λI)xn )
is a bounded sequence in Ex and since (A − λI) is bounded below, i.e.,
||(A − λI)x|| ≥ m||x||, we see that {xn } is also a bounded sequence in Ex .
By theorem 8.2.1, there is a subsequence {xni } of (xn ), such that xni → x
in Ex and Ax − λx = y. Thus, y ∈ R(A − λI) showing that the range of
(A − λI) is closed in Ex .
Now, let En = R((A − λI)n ) for n = 0, 1, 2, . . .. Then we show by
induction that each En is closed in Ex . For n = 0, E0 = Ex ; for n = 1, E1
is closed. For n ≥ 2,

    (A − λI)n = An − nλAn−1 + · · · + nCr (−1)r λr An−r + · · ·
                + (−1)n−1 nλn−1 A + (−1)n λn I
              = Pn (A) − λn I,

where λn = −(−1)n λn and Pn (A) is a polynomial in A of degree n with no
constant term.

Then, by lemma 8.1.4, Pn (A) is a compact operator and clearly λn ≠ 0.

Further, since (A − λI) is one-to-one, (A − λI)n is also one-to-one.

If we replace A with Pn (A) and λ with λn and follow the arguments put
forward above, we conclude that R(Pn (A) − λn I) = En is a closed subspace
of Ex .
Since En+1 ⊆ En and En+1 = (A − λI)(En ), part (b) of lemma
8.2.3 shows that there is a non-negative integer p with Ep+1 = Ep . If
p = 0 then E1 = E0 . If p > 0, we show that Ep = Ep−1 .
Let y ∈ Ep−1 , that is, y = (A − λI)p−1 x for some x ∈ Ex . Then
(A − λI)y = (A − λI)p x ∈ Ep = Ep+1 , so that there is some x′ ∈ Ex
with (A − λI)y = (A − λI)p+1 x′ . Since (A − λI)(y − (A − λI)p x′ ) = θ
and since (A − λI) is one-to-one, it follows that y − (A − λI)p x′ = θ, i.e.,
y = (A − λI)p x′ ∈ Ep . Thus, Ep = Ep−1 . Proceeding as above, if
p > 1, we see that Ep+1 = Ep = Ep−1 = Ep−2 = · · · = E1 = E0 . But
E1 = R(A − λI) and E0 = Ex . Hence A − λI is onto.
Being bounded below and onto, (A − λI) has an inverse. Hence, every
non-zero spectral value of A is an eigenvalue of A. Since σe (A) ⊂ σ(A)
always, the proof (a) is complete.
(b) Let Ex be infinite dimensional. Let us consider an infinite linearly
independent set {e1 , e2 , . . .} of Ex and let E^n = span {e1 , e2 , . . . , en }, n =
1, 2, . . .. Then E^n is a proper subspace of E^{n+1}. E^n is of finite dimension
and is closed by theorem 2.3.4. By the Riesz lemma (theorem 2.3.7), there
is some element an+1 ∈ E^{n+1} such that ||an+1 || = 1 and dist (an+1 , E^n ) ≥ 1/2.
Let us assume that A is bounded below, i.e., ||Ax|| ≥ m||x|| for all x ∈ Ex
and some m > 0. Then for all p, q = 1, 2, . . ., with p ≠ q, we have

    ||Aap − Aaq || ≥ m||ap − aq || ≥ m/2 ,

so that {Aap } cannot have a convergent subsequence, which contradicts the
fact that A is compact.
Hence, A is not bounded below, i.e., 0 ∈ σa (A).
(c) If Ex is finite dimensional and D(A) = Ex , then the operator A
can be represented by a matrix (aij ); then A − λI is also represented by a
matrix, and σ(A) is composed of those scalars λ which are the roots of the
characteristic equation

    det(aij − λ δij ) = 0    [see Taylor, [55]].

Hence σa (A) = σ(A).


If Ex is infinite dimensional, then 0 ∈ σa (A) by (b) above. Also, since
σe (A) ⊆ σa (A) ⊆ σ(A) always, it follows from (a) above that σa (A) = σ(A).
8.2.6 Lemma
Let Ex be a linear space, A : Ex → Ex linear, and Axn = λn xn for some
θ ≠ xn ∈ Ex and λn ∈ ℝ (ℂ), n = 1, 2, . . . .

(a) Let λn ≠ λm whenever n ≠ m. Then {x1 , x2 , . . .} is a linearly
independent subset of Ex .
independent subset of Ex .
(b) Let Ex be a normed linear space, A a compact linear operator
mapping Ex → Ex , and let the set {x1 , x2 , . . .} be linearly independent and
infinite. Then λn → 0 as n → ∞.
Proof: (a) Since x1 ≠ θ, the set {x1 } is linearly independent. Let
n = 2, 3, . . . and assume that {x1 , x2 , . . . , xn } is linearly independent. Let,
if possible, xn+1 = α1 x1 + α2 x2 + · · · + αn xn for some α1 , α2 , . . . , αn in ℝ (ℂ).
Then λn+1 xn+1 = α1 λn+1 x1 + α2 λn+1 x2 + · · · + αn λn+1 xn and also

    λn+1 xn+1 = A(xn+1 ) = Σ_{i=1}^{n} αi Axi = α1 λ1 x1 + α2 λ2 x2 + · · · + αn λn xn .

Thus we get on subtraction

    α1 (λn+1 − λ1 )x1 + α2 (λn+1 − λ2 )x2 + · · · + αn (λn+1 − λn )xn = θ.

Since x1 , x2 , . . . , xn are linearly independent, αj (λn+1 − λj ) = 0 for each j. As
xn+1 ≠ θ, we see that αj ≠ 0 for some j, 1 ≤ j ≤ n, so that λn+1 = λj .
But this is impossible. Thus the set {x1 , x2 , . . . , xn+1 } is linearly
independent. Using mathematical induction we conclude that {x1 , x2 , . . .}
is linearly independent.
(b) For n = 1, 2, . . ., let E n = span {x1 , x2 , . . . , xn }. Since xn+1 does
not belong to E n , E n is a proper subspace of E n+1 . Also, E n is closed in
Ex by th. 2.3.4, and (A − λn+1 I)(E n+1 ) ⊂ E n since (A − λn+1 I)xn+1 = θ.
If λn −→/ 0 as n → ∞, we can assume by passing to a subsequence that
|λn | ≥ δ > 0 for all n = 1, 2, . . .. Now 8.2.3(b) yields that E q+1 = E q
for some positive integer q which contradicts the fact that E q is a proper
subspace of E q+1 . Hence λn → 0 as n → ∞.
8.2.7 Theorem
Let Ex be a normed linear space and A be a compact linear operator
mapping Ex into Ex .
(a) The eigenspectrum and the spectrum of A are countable sets and
have ‘0’ as the only possible limiting point. In particular, if {λ1 , λ2 , . . .} is
an infinite set of eigenvalues of A, then λn → 0 as n → ∞.
(b) Every eigenspace of A corresponding to a non-zero eigenvalue of A
is finite dimensional.
Proof: Since {λ : λ ∈ σ(A), λ ≠ 0} = {λ : λ ∈ σe (A), λ ≠ 0} by 8.2.5(a),
we have to show that the set σe (A) is countable and 0 is the only possible
limit point of it.
For δ > 0, let

    Lδ = {λ ∈ σe (A) : |λ| ≥ δ}.

Suppose that Lδ is an infinite set for some δ > 0. Let λn ∈ Lδ for
n = 1, 2, . . . with λn ≠ λm whenever n ≠ m. If xn is an eigenvector of
A corresponding to the eigenvalue λn , then by theorem 8.2.6(a) the set
{x1 , x2 , . . .} is linearly independent, and consequently λn → 0 as n → ∞
by 8.2.6(b). But this is impossible since |λn | ≥ δ for each n. Hence Lδ
is a finite set for every δ > 0. Since σe (A) \ {0} = ∪_{n=1}^{∞} L1/n , it follows
that σe (A) is a countable set and that σe (A) has no limit points except
possibly the number 0.
Furthermore, σe (A) is a bounded subset of ℝ (ℂ) since |λ| ≤ ||A|| for
every λ ∈ σe (A). If {λ1 , λ2 , . . .} is an infinite subset of σe (A), then it must
have a limit point by the Bolzano-Weierstrass theorem for ℝ (ℂ) (theorem
1.6.19). As the only possible limit point is 0, we see that λn → 0 as n → ∞.
(b) Let 0 ≠ λ ∈ σe (A). Suppose that the eigenspace N (A − λI)
corresponding to λ is infinite dimensional, and choose an infinite linearly
independent set {x1 , x2 , . . .} of eigenvectors corresponding to λ. With
λ1 = λ2 = · · · = λ, lemma 8.2.6(b) gives λn → 0 as n → ∞, i.e., λ = 0. But
this is impossible since λ ≠ 0.
Thus the eigenspace of A corresponding to λ is finite dimensional.
We next consider the spectrum of the transpose of a compact operator.
We next consider the spectrum of the transpose of a compact operator.
8.2.8 Theorem
Let Ex be a normed linear space and A ∈ (Ex → Ex ). Then
σ(A∗ ) ⊆ σ(A).
If Ex is a Banach space, then

σ(A) = σa (A) ∪ σe (A∗ ) = σ(A∗ ).

Proof: Let λ ∈ ℝ (ℂ) be such that (A − λI) is invertible, i.e., (A − λI)
has a bounded inverse. If (A − λI)B = I = B(A − λI) for some bounded
linear operator B mapping Ex → Ex , then by 6.1.5(ii) B ∗ (A∗ − λI) = I =
(A∗ − λI)B ∗ , where A∗ , B ∗ stand for adjoints of A and B respectively.
Hence σ(A∗ ) ⊆ σ(A).
Let Ex be a Banach space. Then λ ∈ σ(A) if and only if either
A − λI is not bounded below or R(A − λI) is not dense in Ex . If
(A − λI) is not bounded below, then λ ∈ σa (A).
Let f ∈ Ex∗ . Then (A∗ − λI)f = 0 if and only if ((A∗ − λI)f )(x) =
f ((A − λI)x) = 0 for every x ∈ Ex .
Now, (A∗ − λI) is one-to-one, i.e., N (A∗ − λI) = {θ}, if and only if f = θ
whenever f (y) = 0 for every y ∈ R(A − λI). This happens if and only if the
closure of R(A − λI) is Ex , i.e., R(A − λI) is dense in Ex . Hence, if
R(A − λI) is not dense in Ex , then λ ∈ σe (A∗ ).
Thus σ(A) = σa (A) ∪ σe (A∗ ).

Finally, to conclude σ(A) = σ(A∗ ), it will suffice to show that σa (A) ⊆
σ(A∗ ). Let λ ∉ σ(A∗ ), that is, A∗ − λI is invertible. If x ∈ Ex then, by
5.1.4, there is some f ∈ Ex∗ such that f (x) = ||x|| and ||f || = 1, so that

    ||x|| = |f (x)| = |((A∗ − λI)(A∗ − λI)−1 f )(x)|
          = |((A∗ − λI)−1 f )((A − λI)x)|
          ≤ ||(A∗ − λI)−1 || ||Ax − λx||.

Thus, (A − λI) is bounded below, that is, λ ∉ σa (A). Hence σa (A) ⊆ σ(A∗ ).
If A is a compact operator we get some interesting results.

8.2.9 Theorem
Let Ex be a normed linear space and A be a compact operator mapping
Ex into Ex . Then
(a) dim N (A∗ − λI) = dim N (A − λI) < ∞ for 0 ≠ λ ∈ ℝ (ℂ),
(b) {λ : λ ∈ σe (A∗ ), λ ≠ 0} = {λ : λ ∈ σe (A), λ ≠ 0},
(c) σ(A∗ ) = σ(A).
Proof: (a) By theorem 8.1.12, A∗ is a compact linear operator mapping
Ex∗ into Ex∗ . Then theorem 8.2.7 yields that the dimension r of N (A − λI)
and the dimension s of N (A∗ − λI) are both finite.
First we show that s ≤ r.
If r = 0, that is, λ ∉ σe (A), then by theorem 8.2.5(a) we see that
λ ∉ σ(A). Since σ(A∗ ) ⊆ σ(A) by theorem 8.2.8, we have λ ∉ σ(A∗ ). In
particular, (A∗ − λI) is one-to-one, i.e., s = 0.
Next, let r ≥ 1. Consider a basis {e1 , e2 , . . . , er } of N (A − λI). Then
from 4.8.3 we can find f1 , . . . , fr in Ex∗ such that fj (ei ) = δij , i, j =
1, 2, . . . , r.
Let, if possible, {φ1 , φ2 , . . . , φr+1 } be a linearly independent subset of
N (A∗ − λI) containing (r + 1) elements. By 4.8.2 there are y1 , y2 , . . . , yr+1
in Ex such that
φj (yi ) = δij , i, j = 1, 2, . . . , r + 1.
Consider the map B : Ex → Ex given by

    B(x) = Σ_{i=1}^{r} fi (x)yi ,    x ∈ Ex .

Since each fi ∈ Ex∗ , B is a bounded linear operator mapping Ex → Ex and


B is of finite rank. Therefore B is a compact operator on Ex by theorem
8.1.13.
Since A is also compact, lemma 8.1.4 shows that A − B is a compact
operator. We show that A − B − λI is one-to-one but not onto and obtain
a contradiction.

We note that φj ∈ N (A∗ − λI) and hence,

    φj ((A − B − λI)(x)) = ((A∗ − λI)(φj ))(x) − φj ( Σ_{i=1}^{r} fi (x)yi )
                         = 0 − Σ_{i=1}^{r} fi (x)φj (yi )
                         = −fj (x)   if 1 ≤ j ≤ r,
                           0         if j = r + 1.

Now, let x ∈ Ex satisfy (A − B − λI)x = θ. Then it follows that
−fj (x) = φj ((A − B − λI)x) = φj (θ) = 0 for 1 ≤ j ≤ r, and in turn
B(x) = θ. Hence (A − λI)x = θ, i.e., x ∈ N (A − λI). Since {e1 , e2 , . . . , er }
is a basis of N (A − λI), we have x = α1 e1 + · · · + αr er for some
α1 , α2 , . . . , αr in ℝ (ℂ). But

0 = fj (x) = fj (α1 e1 + α2 e2 + · · · + αr er ) = αj , j = 1, 2, . . . , r

so that x = 0 · e1 + 0 · e2 + · · · + 0 · er = θ.
Thus A − B − λI is one-to-one because

(A − B − λI)x = θ ⇒ x = θ.

Next we assert that yr+1 ∉ R(A − B − λI). For if yr+1 = (A − B − λI)x


for some x ∈ Ex , then

1 = φr+1 (yr+1 ) = φr+1 ((A − B − λI)x) = 0,

as we have noted above. Hence (A − B − λI) is not onto.
But A − B is a compact operator (lemma 8.1.4) and λ ≠ 0, so by theorem
8.2.5(a) the one-to-one operator A − B − λI would have to be invertible, and
in particular onto, which is a contradiction.
Thus a linearly independent subset of N (A∗ − λI) can have at most r
elements, i.e., s ≤ r.
To obtain r ≤ s we proceed as follows. Let t denote the dimension
of N (A∗∗ − λI). Considering the compact operator A∗ in place of A, we
find that t ≤ s. If ΠEx denotes the canonical embedding of Ex into Ex∗∗
considered in sec. 5.6.6, then by theorem 6.1.5. A∗∗ ΠEx = ΠEx A. Hence
ΠEx (N (A − λI)) ⊆ N (A∗∗ − λI), so that r ≤ t. Thus r ≤ t ≤ s. Hence
r = s.
(b) Let 0 ≠ λ ∈ ℝ (ℂ). Part (a) shows that N (A − λI) = {θ} if and
only if N (A∗ − λI) = {θ}, that is, λ ∈ σe (A) if and only if λ ∈ σe (A∗ ).
(c) Since A and A∗ are compact operators, we have by theorem 8.2.5

    {λ : λ ∈ σ(A), λ ≠ 0} = {λ : λ ∈ σe (A), λ ≠ 0},
    {λ : λ ∈ σ(A∗ ), λ ≠ 0} = {λ : λ ∈ σe (A∗ ), λ ≠ 0}.

It follows from (b) above that

    {λ : λ ∈ σ(A∗ ), λ ≠ 0} = {λ : λ ∈ σ(A), λ ≠ 0}.

If Ex is finite dimensional, then det(A − λI) = det(A∗ − λI). Hence


0 ∈ σ(A∗ ) if and only if 0 ∈ σ(A). If Ex is infinite dimensional then
Ex∗ is infinite dimensional and hence 0 ∈ σa (A) as well as 0 ∈ σa (A∗ ) by
theorem 8.2.5(b). Thus, in both cases, σ(A∗ ) = σ(A).
It follows from the above that the spectrum of a compact operator is very
much like the spectrum of a finite matrix, except for the number zero.

8.2.10 Examples
1. Let Ex = lp , 1 ≤ p ≤ ∞, and let

    Ax = (ξ1 /1, ξ2 /2, ξ3 /3, . . .),   where x = {ξ1 , ξ2 , ξ3 , . . .} ∈ lp .

Let An x = (ξ1 /1, ξ2 /2, . . . , ξn /n, 0, 0, . . .). Since An is of finite rank, An is a
compact linear operator [see theorem 8.1.13].
Furthermore, writing y = (A − An )x = {ηi }, for 1 ≤ p < ∞,

    ||(A − An )x||p^p = Σ_{i=n+1}^{∞} |ηi |^p = Σ_{i=n+1}^{∞} (1/i^p ) |ξi |^p
                     ≤ (1/(n + 1)^p ) Σ_{i=n+1}^{∞} |ξi |^p ≤ ||x||p^p /(n + 1)^p ,

and the case p = ∞ is similar. Hence

    ||A − An || = sup_{x ≠ 0} ||(A − An )x||p / ||x||p ≤ 1/(n + 1).

Hence ||A − An || → 0 as n → ∞, and each An being compact, by theorem
8.1.9 A is also a compact operator. A is clearly one-to-one, so 0 is not an
eigenvalue of A; but since A is not bounded below, 0 is a spectral value of A.
Also, λn = 1/n is an eigenvalue of A (with eigenvector en ) and λn → 0 as
n → ∞.
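A truncated numerical version of this diagonal operator (the truncation level is my own choice) shows the bound ||A − An || ≤ 1/(n + 1) and the eigenvalues 1/n accumulating only at 0:

import numpy as np

# Sketch of example 8.2.10(1): the diagonal operator A x = (xi_1/1, xi_2/2, ...)
# truncated to N coordinates.  ||A - A_n|| equals the largest remaining
# diagonal entry, namely 1/(n+1), and the eigenvalues 1/n tend to 0.
N = 2000
diag = 1.0 / np.arange(1, N + 1)
for n in [1, 10, 100, 1000]:
    tail_norm = np.max(diag[n:])            # sup_{i > n} 1/i = 1/(n+1)
    print(n, tail_norm, 1.0 / (n + 1))
# 0 is in the spectrum (A is not bounded below) but is not an eigenvalue,
# since A x = 0 forces x = 0.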
2. The eigenspace of a compact operator corresponding to the eigenvalue
0 can be infinite dimensional. The easiest example is the zero operator on
an infinite dimensional normed linear space.
3. λ = 0 can be an eigenvalue of a compact operator A, but λ = 0 may
not be an eigenvalue of the transpose A∗ and vice versa.
4. Let Ex = lp and A denote the compact operator on Ex defined by

    Ax = (x3 , x4 /3, x5 /4, . . .)T    for x = (x1 , x2 , . . .)T ∈ lp ,

i.e., A is represented by the infinite matrix whose successive rows are
(0, 0, 1, 0, 0, . . .), (0, 0, 0, 1/3, 0, . . .), (0, 0, 0, 0, 1/4, . . .), . . . .

1 1
Hence A∗ can be identified with B on lp , + = 1, so that
p q
⎛ ⎞
0 0 0···
⎜ 0 0 0··· ⎟
⎜ ⎟
⎜ 0··· ⎟
B=⎜ 1 0 ⎟
⎜ 0 1
0··· ⎟
⎝ 3 ⎠
.. .. 1
. . 4
···
 x2 x3 T
Hence Bx = 0, 0, x1 , , ··· for x = (x1 , x2 , x3 , . . .) ∈ lq .
⎛ ⎞ 3⎛ 4 ⎞
1 1
⎜ 0 ⎟ ⎜ 0 ⎟
⎜ ⎟ ⎜ ⎟
Since A ⎜ . ⎟ = 0 ⎜ . ⎟ we see that 0 is an eigenvalue of A. But
⎝ .. ⎠ ⎝ .. ⎠
0 0
since B is one-to-one, 0 is not an eigenvalue of B. Also since B ∗ = A, we
see that not only the compact operator B does not have an eigenvalue 0,
its adjoint B ∗ does not have an eigenvalue 0 too.
(c) Let Ex = C([0, 1]). For x ∈ Ex , let

    Ax(s) = (1 − s) ∫₀ˢ t x(t)dt + s ∫ₛ¹ (1 − t)x(t)dt ,    s ∈ [0, 1].      (8.1)

Since the kernel is continuous, A is a compact operator mapping Ex into
Ex [see example 8.1.10].
The above is a Fredholm integral operator with a continuous kernel
given by

    K(s, t) = (1 − s)t ,  if 0 ≤ t ≤ s ≤ 1,
              s(1 − t) ,  if 0 ≤ s ≤ t ≤ 1.      (8.2)
Let x ∈ Ex and 0 ≠ λ ∈ ℝ (ℂ) be such that

    Ax = λx.

Then for all s ∈ [0, 1],

    λx(s) = (1 − s) ∫₀ˢ t x(t)dt + s ∫ₛ¹ (1 − t)x(t)dt .      (8.3)

Putting s = 0 and s = 1, we note that x(0) = 0 = x(1). Since t x(t) and
(1 − t)x(t) are integrable functions of t ∈ [0, 1], it follows that the right-
hand side of the equation given above is an absolutely continuous function
of s ∈ [0, 1]. Hence x is (absolutely) continuous on [0,1]. This implies that
t x(t) and (1 − t)x(t) are continuous functions of t on [0,1]. Thus the right-
hand side is, in fact, a continuously differentiable function of s, and we have,
for all s ∈ [0, 1],
    λx′ (s) = (1 − s)s x(s) − ∫₀ˢ t x(t)dt − s(1 − s)x(s) + ∫ₛ¹ (1 − t)x(t)dt
            = − ∫₀ˢ t x(t)dt + ∫ₛ¹ (1 − t)x(t)dt .

This shows that x is in fact twice continuously differentiable, and for all
s ∈ [0, 1] we have

    λx′′ (s) = −s x(s) − (1 − s)x(s) = −x(s).

Thus, the differential equation λx′′ + x = 0 has a non-zero solution
satisfying x(0) = 0 = x(1) if and only if λ = 1/(n2 π 2 ), n = 1, 2, . . ., and in
such a case its most general solution is given by x(s) = c sin nπs, s ∈ [0, 1],
where c ∈ ℝ (ℂ).
Let λn = 1/(n2 π 2 ), n = 1, 2, . . ., and xn (s) = sin nπs for s ∈ [0, 1]. Thus each
λn is an eigenvalue of A and the corresponding eigenspace N (A − λn I) =
span {xn } is one dimensional.
Next, let 0 be not an eigenvalue of A. For, if Ax = θ for some x ∈ Ex ,
then by differentiating the expression for Ax(s) with respect to s two times,
we see that x(s) = 0 for all s ∈ [0, 1]. On the other hand, since A is
compact and Ex is infinite dimensional, 0 is an approximate eigenvalue of
A by theorem 8.2.5. Thus,
   ,
1 1 1 1
σe (A) = , ,··· and σa (A) = σ(A) = 0, 2 , 2 2 , · · · .
π 2 22 π 2 π 2 π
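The eigenvalues 1/(n²π²) can be checked numerically by discretizing the integral operator with a simple quadrature rule (a Python/numpy sketch; the midpoint rule and the number of quadrature points m are illustrative choices, not prescribed by the example):

```python
import numpy as np

# Discretize the Fredholm operator with kernel
#   K(s, t) = (1 - s) t  for t <= s,   s (1 - t)  for s <= t
# on [0, 1] by the midpoint rule (a simple Nystrom-type approximation).
m = 400
s = (np.arange(m) + 0.5) / m          # midpoints of [0, 1]
w = 1.0 / m                           # equal quadrature weights
S, T = np.meshgrid(s, s, indexing="ij")
K = np.where(T <= S, (1 - S) * T, S * (1 - T))

eigs = np.sort(np.linalg.eigvals(K * w).real)[::-1]
exact = 1.0 / (np.arange(1, 6) ** 2 * np.pi ** 2)
print("computed    :", np.round(eigs[:5], 6))
print("1/(n^2 pi^2):", np.round(exact, 6))
```

The largest computed eigenvalues agree with 1/π², 1/(4π²), . . . to several digits, and the remaining ones cluster near 0, as the theory predicts.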

8.2.11 Problems [8.1 and 8.2]


1. Show that the zero operator on any normed linear space is compact.
2. If A1 and A2 are two compact linear operators mapping a normed
linear space Ex into a normed linear space Ey , show that A1 + A2 is
a compact linear operator.
3. If Ex is finite dimensional and A is a linear operator mapping Ex into Ey, then
show that A is compact.
4. If A ∈ (Ex → Ey ) and Ey is finite dimensional, then show that A is
compact.
5. If A, B ∈ (Ex → Ex ) and A is compact, then show that AB and BA
are compact.
6. Let Ex be a Banach space and P ∈ (Ex → Ex ) be a projection. Then
show that P is a compact linear operator if and only if P is of finite
rank.
7. Given Ex is an infinite dimensional normed linear space and A a
compact linear operator mapping Ex into Ex , show that λI − A is
not a compact operator where λ is a non-zero scalar.

8. Let A = (aij) be an infinite matrix with aij ∈ ℝ (ℂ), i, j ∈ ℕ. If x ∈ lp and Ax ∈ lr, where Ax = (Σ_{j=1}^∞ aij xj), show that in the following cases A : lp → lr is a compact operator:
(i) 1 ≤ p < ∞, 1 ≤ r < ∞ and Σ_{j=1}^∞ |aij| → 0 as i → ∞;
(ii) 1 ≤ p < ∞, 1 ≤ r < ∞ and Σ_{i=1}^∞ (Σ_{j=1}^∞ |aij|^r) < ∞.

9. Let Ex = C([a, b]) with || · ||∞ and A : Ex → Ex be defined by Ax(s) = ∫_a^b K(s, t)x(t) dt, x ∈ Ex, where K(·, ·) ∈ C([a, b] × [a, b]). Let {An} be the Nyström approximation of A corresponding to a convergent quadrature formula with nodes t1,n, t2,n, . . . , tn,n in [a, b] and weights w1,n, w2,n, . . . , wn,n in ℝ (ℂ), i.e., An x(s) = Σ_{j=1}^n K(s, tj,n) x(tj,n) wj,n, x ∈ Ex, n ∈ ℕ, where the nodes and weights are such that Σ_{j=1}^n x(tj,n) wj,n → ∫_a^b x(t) dt as n → ∞ for every x ∈ C([a, b]). Then show that (i) ||Ax − An x|| → 0 for every x ∈ C([a, b]); (ii) ||(An − A)A|| → 0 and ||(An − A)An|| → 0 as n → ∞.
For the quadrature formula see 6.3.5. In order to solve the integral equation Ax = y, x, y ∈ C([a, b]) numerically, the given equation is approximately reduced to a system of algebraic equations by using 'quadrature'.
(Hint: (i) Show that for each u ∈ C([a, b]), {An u(s)} converges to Au(s) and {An u : n ∈ ℕ} is equicontinuous. (ii) Use the result in (i) and the fact that {Au : ||u||∞ ≤ 1} and {An u : ||u||∞ ≤ 1, n ∈ ℕ} are equicontinuous.)

8.3 Fredholm Alternative


In this section, linear equations with compact operators will be considered.
F. Riesz has shown that such equations admit the applications of basic
consequences from the Fredholm theory of linear integral equations.
8.3.1 A linear equation with compact operator and its adjoint
Let A be a compact operator which maps a Banach space E into itself.
Consider the equation

Au − u = v (8.4)
or, Pu = v (8.5)
where P = A − I. Together with equation (8.5), consider
A∗ f − f = g (8.6)

or, P∗f = g (8.7)
where A∗ is the adjoint operator of A and acts into the space E ∗ . By
theorem 8.1.12, A∗ is a compact operator.

8.3.2 Lemma
Let N be the null space of the operator P, that is, the collection of elements u such that P u = θ. Then N is a finite-dimensional subspace of E.
Proof: Let M be an arbitrary bounded set in N . For every u ∈ N, Au = u,
that is, the operator A leaves the element of the subspace N invariant and
in particular, carries the set M into itself. The subspace N of E is then
said to be invariant with respect to A.
As A is a compact operator, A carries M into a compact set.
Consequently, every bounded set M ⊆ N is compact, implying by theorem
2.3 that N is finite dimensional.
8.3.3 Remark
The elements of the subspace N are eigenvectors of the operator A
corresponding to the eigenvalue λ0 = 1. The above conclusion remains
valid if λ0 is replaced by any non-zero eigenvalue.
Thus a compact linear operator can have only a finite number of linearly
independent eigenvectors corresponding to the same non-zero eigenvalue.
8.3.4 Lemma
Let L = P (E), that is, L be a collection of elements v ∈ E representable
in the form Au − u = v. Then L is a subspace.
To prove L is linear we note that if Au1 − u1 = v1 and Au2 − u2 = v2, then α1v1 + α2v2 = A(α1u1 + α2u2) − (α1u1 + α2u2), α1, α2 ∈ ℝ (ℂ). Thus, v1, v2 ∈ L ⇒ α1v1 + α2v2 ∈ L. We next prove that L is closed. We first show that there is a constant m, depending only on A − I, such that whenever the equation P u = v is solvable, at least one of the solutions satisfies the inequality
m||u|| ≤ ||v||, m > 0 (8.8)

Let u0 be a solution of P u = v. Then every other solution of P u = v is


expressible in the form u = u0 + w where w is a solution of the homogenous
equation
Pu = θ (8.9)

Let us consider F(w) = ||u0 + w||, a continuous functional bounded below.
Let d = inf F(w) and {wn} ⊆ N be a minimizing sequence, that is,

F (wn ) = ||u0 + wn || → d (8.10)

The sequence {||u0 + wn ||} has a limit and is hence bounded. However, the
sequence {||wn ||} is also bounded, since,

||wn || = ||(u0 + wn ) − u0 || ≤ ||u0 + wn || + ||u0 ||

Thus {wn } is a bounded sequence in a finite-dimensional space N and hence,


by Bolzano-Weierstrass theorem (theorem 1.6.19) has a convergent
subsequence. Hence, we can find a subsequence {wnp } such that

wnp → w0 . Then F (wnp ) → F (w0 ). (8.11)

From (8.10) and (8.11) it follows that

F (w0 ) = ||u0 + w0 || = d.

Therefore, the equation P u = v always has a solution ũ = u0 + w0 of minimal norm. In order to show that (8.8) holds for ũ, we consider the ratio ||ũ||/||v||, and let us assume that this ratio is not bounded. Then there exist sequences vn and ũn (ũn being the minimal solution of P u = vn) such that
||ũn||/||vn|| → ∞.
Since λvn evidently corresponds to the minimal solution λũn, we can assume, without loss of generality, that ||ũn|| = 1; then ||vn|| → 0.
Since the sequence {ũn} is bounded and A is compact, the sequence {Aũn} is compact and consequently contains a convergent subsequence. Again, without loss of generality, let us assume that
Aũn → ũ0. (8.12)
However, since ũn = Aũn − vn,
ũn → ũ0, since vn → θ,
and consequently,
Aũn → Aũ0. (8.13)
From (8.12) and (8.13) it follows that
Aũ0 = ũ0, that is, ũ0 ∈ N.
However, because of the minimality of the norm of the solutions ũn (and since ũn − ũ0 is another solution of P u = vn), it follows that ||ũn − ũ0|| ≥ ||ũn|| = 1, contradicting the convergence of {ũn} to ũ0.

Thus ||ũ||/||v|| is bounded, and if m = inf{||v||/||ũ||}, the inequality (8.8) is proved.
Now, suppose we are given a sequence vn ∈ L convergent to v0. Passing to a subsequence, we can assume that
||v_{n_{p+1}} − v_{n_p}|| < 1/2^{n_p}.
Let u_{n_0} be a minimal solution of the equation P u = v_{n_1} and u_{n_p}, p = 1, 2, . . ., a minimal solution of the equation P u = v_{n_{p+1}} − v_{n_p}.
Then m||u_{n_p}|| ≤ ||v_{n_{p+1}} − v_{n_p}|| < 1/2^{n_p}.
This estimate yields that Σ_{p=0}^∞ u_{n_p} converges, and if ũ is the sum of the series, then
P ũ = P lim_{k→∞} Σ_{p=0}^k u_{n_p} = lim_{k→∞} Σ_{p=0}^k P u_{n_p}
= lim_{k→∞} [P u_{n_0} + Σ_{p=1}^k (v_{n_{p+1}} − v_{n_p})]
= lim_{k→∞} v_{n_{k+1}} = v0,
exhibiting v0 ∈ L. Hence, L is closed.

8.3.5 Theorem
The equation (8.4) is solvable for given v ∈ E, a Banach space, if and
only if f (v) = 0 for every linear functional f , such that

A∗ f − f = θ (8.14)

Proof: Suppose that the equation Au − u = v is solvable, that is, v is


expressible in the form v = Au0 − u0 , for some u0 ∈ E. Let f be any linear
functional satisfying A∗ f − f = θ. Then

f (v) = f (Au0 −u0 ) = f (Au0 )−f (u0 ) = A∗ f (u0 )−f (u0 ) = (A∗ f −f )(u0 ) = 0.

Next we have to show that if v satisfies the hypothesis of the theorem, then v ∈ L = P(E). Let us suppose v ∉ L. Since L is closed, v lies at a distance d > 0 from L, and by theorem 5.1.5 there exists a linear functional f0 such that f0(v) = 1 and f0(z) = 0 for every z ∈ L. Hence f0(Au − u) = (A∗f0 − f0)(u) = 0 for all u ∈ E, that is, A∗f0 − f0 = θ. By the hypothesis this would give f0(v) = 0, whereas by construction f0(v) = 1, a contradiction. Hence v ∈ L, proving the sufficiency.

8.3.6 Remark
An equation P u = v with the property that it has a solution u if
f (v) = 0 for every f , satisfying P ∗ f = θ, is said to be normally solvable.
The essence of theorem 8.3.5 is that the closedness of L = P(E) is a sufficient condition for P u = v to be normally solvable.
8.3.7 Corollary
If a conjugate homogeneous equation A∗ f − f = 0 has only a trivial
solution, then the equation Au − u = v has a solution for any right-hand
side.
8.3.8 Theorem
In order that equation (8.6) be solvable for g ∈ E ∗ given, it is necessary
and sufficient that g(u) = 0 for every u ∈ E, such that

Au − u = θ. (8.15)

Proof: To prove that the condition is necessary, we note that

g(u) = (A∗ f − f )u = f (Au − u) = 0 (8.16)

For proving that the condition is sufficient we proceed as follows. Let us define the functional f0(v) on the subspace L by means of the equality f0(v) = g(u), u being one of the pre-images of the element v (i.e., u ∈ P⁻¹v) under the mapping P. The functional f0 is well defined: if u′ is another pre-image of the same element v, then
Au − u = Au′ − u′, i.e., A(u − u′) − (u − u′) = θ,
whence, by the hypothesis on g, g(u − u′) = 0, i.e., g(u) = g(u′).
If u1 and u2 are pre-images of v1 and v2 respectively, then u1 + u2 is a pre-image of v1 + v2, and since g ∈ E∗,
f0(v1 + v2) = g(u1 + u2) = g(u1) + g(u2) = f0(v1) + f0(v2);
similarly f0(αv) = αf0(v). This shows that f0 is additive and homogeneous. To prove the boundedness of f0 we proceed as follows. We can show, as in lemma 8.3.4, that the inequality m||u|| ≤ ||v|| is satisfied for at least one of the pre-images u of the element v. Therefore, |f0(v)| = |g(u)| ≤ ||g|| ||u|| ≤ (1/m)||g|| ||v||, and the boundedness of f0 is proved. We can extend f0 by the Hahn-Banach

theorem 5.1.3 to the entire space E to obtain a linear functional f such that
f(Au − u) = f(v) = f0(v) = g(u), or (A∗f − f)(u) = g(u),
so that f is a solution of (8.6).


8.3.9 Corollary
If the equation Au − u = θ has only the null solution u = θ, then the equation A∗f − f = g is solvable for every g on the right-hand side.
We next want to show that the solvability of the homogeneous and the non-homogeneous equations in the same space are also closely related.
8.3.10 Theorem
In order that the equation

Au − u = v (8.4)

be solvable for every v, where A is a compact operator mapping a Banach


space E into itself, it is necessary and sufficient that the corresponding
homogeneous equation
Au − u = θ (8.15)
has only a trivial solution u = θ. In this case, the solution of equation (8.4)
is uniquely defined, and the operator T = A − I has a bounded inverse.
Proof: We first prove that the condition is necessary. Let us denote by Nk the null space of the operator T^k. It is clear that T^k u = θ ⇒ T^{k+1} u = θ, that is, Nk ⊂ Nk+1.
Let the equation Au − u = v be solvable for every v, and let us assume that the homogeneous equation Au − u = θ has a non-trivial solution u1. Let u2 be a solution of the equation Au − u = u1, and in general let uk+1 be a solution of the equation Au − u = uk, k = 1, 2, 3, . . ..
We have T uk = uk−1, T² uk = uk−2, . . . , T^{k−1} uk = u1 ≠ θ, whereas T^k uk = T u1 = θ. Hence uk ∈ Nk and uk ∉ Nk−1, that is, each subspace
Nk−1 is a proper subspace of Nk . Then, by Riesz lemma 2.3.7 there is in the
subspace Nk , an element vk with norm 1, such that ||vk − u|| ≥ 12 for every
u ∈ Nk−1 . Consider the sequence {Avk }, which is compact since ||vk || = 1
(i.e., {vk } is bounded) and A is a compact operator. On the other hand,
let vp and vq be two such elements with p > q.
Since T p−1 (vq − T vp + T vq ) = T p−1 vq − T p vp + T p vq = θ noting that
p − 1 ≥ q, then vq − T vp + T vq ∈ Np−1 and hence

1
||Avp − Avq || = ||vp − (vq − T vp + T vq )|| ≥ .
2
Thus a contradiction arises from the assumption that equation (8.4) is solvable for every v while the equation T u = θ has a non-trivial solution. This proves the necessary part. Next we show that the condition is sufficient.
Suppose that the equation T u = θ has only a trivial solution. Then, by
corollary 8.3.9, the equation

A∗ f − f = g (8.6)

is solvable for any right side. Since A∗ is also a compact operator and E ∗ a
Banach space, we can apply the necessary part of the theorem just proved
to equation (8.6). Hence the equation

A∗ f − f = θ (8.14)

has only a trivial solution. But then, by corollary 8.3.7, equation (8.4) has a solution for every v, and it is proved that the condition is sufficient.
Since equation (8.4) now has a unique solution for every v, the inverse T⁻¹ of T = A − I exists. Because of the uniqueness, the unique solution is at the same time the minimal one, and hence
m||(A − I)⁻¹v|| ≤ ||v||,
so that T⁻¹ = (A − I)⁻¹ is bounded.

8.3.11 Theorem
Let us consider the pair of equations,
Au − u = θ (8.15)
and A∗ f − f = θ (8.14)

where A and A∗ are compact operators mapping respectively the Banach


space E into itself and the Banach space E ∗ into itself. Then the above
pair of equations have the same number of linearly independent solutions.
Proof: Let u1 , u2 , . . . , un be a basis of the subspace N of solutions of
equation (8.15). Similarly, let f1 , f2 , . . . , fm be a basis of the subspace of
solutions of equation (8.14).
Let us construct a system of functionals φ1, φ2, . . . , φn biorthogonal to u1, u2, . . . , un, that is, such that
φi(uj) = δij, i, j = 1, 2, . . . , n.

Let us also construct a system of elements w1 , w2 , . . . , wm biorthogonal to


f1 , f2 , . . . , fm .
Let us assume n < m. We consider the operator V given by
V u = Au + Σ_{i=1}^n φi(u) wi.

Since A is a compact operator and the right-hand side of V u contains a


finite number of terms, V is a compact operator. We next want to show
that the equation V u − u = θ has only a trivial solution.
Let u0 be a solution of V u − u = θ.
Then fk(V u0 − u0) = 0, or fk(Au0 − u0 + Σ_{i=1}^n φi(u0) wi) = 0,
or (A∗fk)(u0) − fk(u0) + Σ_{i=1}^n φi(u0) fk(wi) = 0,
or (A∗fk − fk)(u0) + Σ_{i=1}^n φi(u0) fk(wi) = 0.
Since {fi} and {wi} are biorthogonal to each other, we have from the above equation,
(A∗fk − fk)(u0) + φk(u0) = 0.
Since {fk} is a basis of the subspace of solutions of equation (8.14), A∗fk − fk = θ.
Hence, φk(u0) = 0, k = 1, 2, . . . , n (n < m).
Hence we have V u0 = Au0 or, Au0 − u0 = V u0 − u0 = θ.
Since u0 ∈ N and {ui} is a basis of N,
u0 = Σ_{i=1}^n ξi ui.
However, φj(u0) = Σ_{i=1}^n ξi φj(ui) = ξj.
Since φj(u0) = 0, j = 1, 2, . . . , n, we get ξj = 0.
Hence, u0 = θ.
Since the equation V u − u = θ has only a trivial solution, the equation
V u − u = v is solvable for any v, and in particular for v = wn+1. Let u′ be a solution of this equation. Then we can write
fn+1(wn+1) = fn+1(V u′ − u′) = fn+1(Au′ − u′ + Σ_{i=1}^n φi(u′) wi)
= (A∗fn+1 − fn+1)(u′) + Σ_{i=1}^n φi(u′) fn+1(wi) = 0,
whereas on the other hand fn+1(wn+1) = 1. The contradiction obtained


proves the inequality n < m to be impossible.

Let us assume, conversely, that m < n. Consider in the space E ∗ , the


operator


V∗f = A∗f + Σ_{i=1}^m f(wi) φi. (8.16)
This operator is adjoint to the operator V .
It is to be shown that the equation V ∗ f − f = θ has only a trivial
solution.
For k = 1, 2, . . . , m, taking note of the biorthogonality of {φi} and {ui},
(V∗f − f)(uk) = (A∗f − f)(uk) + Σ_{i=1}^m f(wi) φi(uk)
= f(Auk − uk) + f(wk)
= f(wk), (8.17)
since {uk } is one of the bases of the subspace of solutions of (8.15). Thus,
if f0 is a solution of the the equation V ∗ f − f = θ then from (8.17) it
follows that f0 (wk ) = 0, k = 1, 2, . . . , m.
Hence (8.16) yields V ∗ f0 = A∗ f0 .
Hence 0 = V ∗ f 0 − f 0 = A∗ f 0 − f 0 ,
i.e., f0 is a solution of A∗ f − f = 0.
However, f0 = Σ_{i=1}^m βi fi = Σ_{i=1}^m f0(wi) fi = θ, since f0(wi) = 0, i = 1, 2, . . . , m.

Since V is a compact operator, so is V∗ (theorem 8.1.12); and since V∗f − f = θ has only the trivial solution, by theorem 8.3.10 the equation V∗f − f = g has a solution for any g, in particular for g = φm+1. Therefore, if f′ is a solution of this equation, we have V∗f′ − f′ = φm+1, and hence
φm+1(um+1) = (V∗f′ − f′)(um+1)
= (A∗f′ − f′)(um+1) + Σ_{i=1}^m f′(wi) φi(um+1)
= f′(Aum+1 − um+1) = 0.
On the other hand, we have by construction φm+1 (um+1 ) = 1.
The contradiction obtained proves the inequality m < n to be impossible.
Thus m = n.

In what follows, we observe that if we combine the theorems 8.3.5,


8.3.8, 8.3.10 and 8.3.11 we obtain a theorem which generalizes the famous
Fredholm theorem for linear integral equations to any linear equation with
compact operator.

8.3.12 Theorem
Let us consider the equations
Au − u = v (8.4)
and A∗f − f = g (8.6)
where A is a compact operator mapping a Banach space E into itself and A∗ is its adjoint, a compact operator mapping E∗ into itself. Then either equations (8.4) and (8.6) have a solution for any element on the right-hand side, in which case the homogeneous equations
Au − u = θ (8.15)
A∗f − f = θ (8.14)
have only trivial solutions; or the homogeneous equations have the same finite number of linearly independent solutions u1, u2, . . . , un; f1, f2, . . . , fn. In the latter case equation (8.4) (resp. (8.6)) will have a solution if and only if
fi(v) = 0 (resp. g(ui) = 0), i = 1, 2, . . . , n.
The general solution of equation (8.4) then takes the form
u = u0 + Σ_{i=1}^n ai ui,
where u0 is any particular solution of equation (8.4) and a1, a2, . . . , an are arbitrary constants. Correspondingly, the general solution of equation (8.6) has the form
f = f0 + Σ_{i=1}^n bi fi,
where f0 is any particular solution of equation (8.6) and b1, b2, . . . , bn are arbitrary constants.
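A finite-dimensional sketch of the alternative (Python/numpy; the rank-one operator Au = ⟨u, b⟩a and the vectors a, b are illustrative assumptions, and A^T plays the role of the adjoint): with ⟨a, b⟩ = 1, the homogeneous equations Au − u = θ and A^Tf − f = θ have the one-dimensional solution spaces span{a} and span{b}, and Au − u = v is solvable exactly when ⟨b, v⟩ = 0.

```python
import numpy as np

# Finite-dimensional illustration of the Fredholm alternative for Au - u = v
# with a rank-one (hence compact) operator  A u = <u, b> a.
rng = np.random.default_rng(0)
n = 6
a = rng.standard_normal(n)
b = rng.standard_normal(n)
b /= a @ b                      # normalize so that <a, b> = 1, making Au = u solvable by u = a
A = np.outer(a, b)

# Null spaces: (A - I)u = 0 has solutions span{a};  (A^T - I)f = 0 has span{b}.
# The alternative: (A - I)u = v is solvable  iff  <f, v> = 0 for every such f, i.e. <b, v> = 0.
def residual(v):
    u, *_ = np.linalg.lstsq(A - np.eye(n), v, rcond=None)
    return np.linalg.norm((A - np.eye(n)) @ u - v)

v_ok = rng.standard_normal(n)
v_ok -= (b @ v_ok) / (b @ b) * b        # project v onto {b}^perp, so <b, v_ok> = 0
v_bad = b                               # violates the solvability condition

print("residual, <b,v>=0 :", residual(v_ok))    # ~ 0: solvable
print("residual, v = b   :", residual(v_bad))   # > 0: no solution
```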


We next consider an equation containing a parameter:

Au − λu = v, λ ≠ 0. (8.18)
Since the equation can be expressed in the form
(1/λ)Au − u = (1/λ)v, and (1/λ)A
is compact (completely continuous) together with A, the theorems proved for equation (8.4) remain valid for equation (8.18).
Theorem 8.3.10 implies that for a given λ ≠ 0, either the equation Au − λu = v is solvable for any element on the right-hand side, or the homogeneous equation Au − λu = θ has a non-trivial solution. Hence, every value of the parameter λ ≠ 0 is either regular or an eigenvalue, and the operator A has no non-zero spectral values other than eigenvalues.

8.3.13 Theorem
If A is a compact operator, then its spectrum consists of a finite or countable set of points. All eigenvalues are located in the interval [−||A||, ||A||], and in the case of a countable spectrum they have only one limit point, λ = 0.
Proof: Let us consider the operator Tλ = A − λI. Now, for λ ≠ 0, Tλ = −λ(I − (1/λ)A), and by theorem 4.7.12 the operator I − (1/λ)A, and hence Tλ, has an inverse when (1/|λ|)||A|| < 1, i.e., when |λ| > ||A||. Hence the spectrum of the operator A lies in [−||A||, ||A||]. Let 0 < m < ||A||.
For a conclusive proof it will suffice to show that there can exist only a finite number of eigenvalues λ such that |λ| ≥ m. If that be not true, it is possible to select an infinite sequence λ1, λ2, . . . , λn, . . . of distinct eigenvalues with |λi| ≥ m. Let u1, u2, . . . , un, . . . be a sequence of eigenvectors corresponding to these eigenvalues, so that
Aun = λn un.
We first show that the elements u1, u2, . . . , uk are linearly independent for every k. For k = 1 this is trivial. Suppose that u1, u2, . . . , uk are linearly independent, and assume that
uk+1 = Σ_{i=1}^k ci ui. (8.19)
Then we have
λk+1 uk+1 = Auk+1 = Σ_{i=1}^k ci Aui = Σ_{i=1}^k λi ci ui. (8.20)
From (8.19) and (8.20) it follows (since λk+1 ≠ 0) that
Σ_{i=1}^k (1 − λi/λk+1) ci ui = 0.
Since u1, u2, . . . , uk are linearly independent and 1 − λi/λk+1 ≠ 0, every ci = 0, whence uk+1 = θ, which is impossible for an eigenvector. Hence u1, u2, . . . , uk+1 are linearly independent. Arguing now as in theorem 8.2.7, the distinct eigenvalues λ with |λ| ≥ m are finite in number. For the proof in the case of a countable spectrum see theorem 8.2.7.

8.4 Approximate Solutions


In the last section we have seen that, given A, a linear compact operator
mapping a Banach space E into itself and that if the homogeneous equation

u − Au = θ has only a trivial solution, then the equation u − Au = v has a


unique solution.
In this section we consider the question of finding approximate solution
to the unique solution. We consider here operators with finite rank, i.e.,
operators having finite dimensional range. The process of finding such an
approximate solution has a deep relevance. In numerical analysis, in case
we cannot find the solution of an equation in a closed form, we find an
approximation to such operator equation, so that the approximations can
be reduced to finite dimensional equations. To make the analysis complete,
it is imperative in this case that the approximate operator equations have a
unique solution and this solution tends to the exact solution of the original
equation in the limit. Thus, if A is a bounded linear operator mapping E into E, Ã is an approximation to A, and v0 ∈ E is an approximation to v, then the element u0 ∈ E satisfying u0 − Ãu0 = v0 is a close approximation to the u satisfying u − Au = v.
8.4.1 Theorem
Let E be a Banach space and A be a compact operator on E, such that
x = θ is the only solution of x − Ax = θ. Then (I − A) is invertible. Let
Ã ∈ (E → E) satisfy
ε = ||(A − Ã)(I − A)⁻¹|| < 1.
Then for given v, v0 ∈ E, there are unique u, u0 ∈ E such that
u − Au = v, u0 − Ãu0 = v0,
and ||u − u0|| ≤ (||(I − A)⁻¹||/(1 − ε)) (ε||v|| + ||v − v0||).
Proof: Since A is compact and I − A is one-to-one, it follows from theorem 8.2.5(a) that (I − A) is invertible. As E is a Banach space and
||[(I − Ã) − (I − A)](I − A)⁻¹|| = ε < 1,
it follows from theorem 4.7.12 that (I − Ã) is invertible and
||(I − Ã)⁻¹|| ≤ ||(I − A)⁻¹||/(1 − ε), ||(I − A)⁻¹ − (I − Ã)⁻¹|| ≤ ε||(I − A)⁻¹||/(1 − ε).
Let v, v0 ∈ E. Since I − A and I − Ã are invertible, there are unique u, u0 ∈ E such that
u − Au = v and u0 − Ãu0 = v0.
Also, u − u0 = (I − A)⁻¹v − (I − Ã)⁻¹v0
= [(I − A)⁻¹ − (I − Ã)⁻¹]v + (I − Ã)⁻¹(v − v0).
Hence, ||u − u0|| ≤ (ε||(I − A)⁻¹||/(1 − ε))||v|| + (||(I − A)⁻¹||/(1 − ε))||v − v0||.

We next show how the operator Ã can be constructed. We also show that if Ã is an operator of finite rank, then the solution of the equation u0 − Ãu0 = v0 can be reduced to the solution of a finite system of linear equations, which can be solved by standard methods. Further, when the operator A is compact, we can find several ways of constructing a bounded linear operator Ã of finite rank such that ||A − Ã|| is arbitrarily small.

8.4.2 Theorem
Let Ã be an operator of finite rank on a normed linear space E over ℝ (ℂ), given by
Ãu = f1(u)u1 + · · · + fm(u)um, u ∈ E,
where u1, u2, . . . , um are in E and f1, f2, . . . , fm are linear functionals on E. Let
M = [fi(uj)], i, j = 1, . . . , m,
be the m × m matrix whose (i, j)-th entry is fi(uj).
(a) Consider v0 ∈ E and let v̂ = (f1(v0), f2(v0), . . . , fm(v0))^T. Then
u0 − Ãu0 = v0 and û = (f1(u0), . . . , fm(u0))^T
if and only if
û − M û = v̂ and u0 = v0 + û(1)u1 + û(2)u2 + · · · + û(m)um,
where û(i), i = 1, . . . , m, is the i-th component of û.
(b) Let 0 ≠ λ ∈ ℝ (ℂ). Then λ is an eigenvalue of Ã if and only if λ is an eigenvalue of M. Furthermore, if û (resp. u0) is an eigenvector of M (resp. Ã) corresponding to λ, then
u0 = û(1)u1 + û(2)u2 + · · · + û(m)um (resp. û = (f1(u0), . . . , fm(u0))^T)
is an eigenvector of Ã (resp. M) corresponding to λ.
Proof: Let u0 − Ãu0 = v0 and û = (f1(u0), . . . , fm(u0))^T. Then for i = 1, 2, . . . , m,
(M û)(i) = fi(u1)f1(u0) + · · · + fi(um)fm(u0)
= fi(f1(u0)u1 + · · · + fm(u0)um)
= fi(Ãu0) = fi(u0 − v0) = fi(u0) − fi(v0)
= û(i) − v̂(i).
Hence û − M û = v̂.

Also, u0 = v0 + Ãu0 = v0 + f1(u0)u1 + · · · + fm(u0)um = v0 + û(1)u1 + · · · + û(m)um.
Conversely, let û − M û = v̂ and u0 = v0 + û(1)u1 + · · · + û(m)um. Then
Ãu0 = f1(u0)u1 + · · · + fm(u0)um
= [f1(v0) + Σ_{j=1}^m û(j) f1(uj)] u1 + · · · + [fm(v0) + Σ_{j=1}^m û(j) fm(uj)] um
= [v̂(1) + (M û)(1)] u1 + · · · + [v̂(m) + (M û)(m)] um
= û(1)u1 + · · · + û(m)um = u0 − v0.
Also, for i = 1, 2, . . . , m,
û(i) = v̂(i) + (M û)(i) = v̂(i) + fi(u1)û(1) + fi(u2)û(2) + · · · + fi(um)û(m)
= v̂(i) + fi(û(1)u1 + û(2)u2 + · · · + û(m)um)
= v̂(i) + fi(u0 − v0) = fi(u0),
i.e., û = (f1(u0), . . . , fm(u0))^T.
(b) Since λ ≠ 0, let μ = 1/λ. Replacing Ã by μÃ and letting v0 = θ in (a) above, we see that
u0 − μÃu0 = θ and u0 = û(1)u1 + · · · + û(m)um.
Hence Ãu0 = λu0 with u0 ≠ θ if and only if M û = λû with û ≠ 0. Thus, λ is an eigenvalue of Ã if and only if λ is an eigenvalue of M. Also, the eigenvectors of Ã and M corresponding to λ are related by
û = (f1(u0), . . . , fm(u0))^T and u0 = û(1)u1 + û(2)u2 + · · · + û(m)um.
Letting λ = 1 in (b) above, we see that Ãu = u has a non-zero solution in E if and only if M û = û has a non-zero solution in ℝ^m (ℂ^m). Also, for a given v0 ∈ E, the general solution of u − Ãu = v0 is given by u = v0 + û(1)u1 + · · · + û(m)um, where û = (û(1), û(2), . . . , û(m))^T is the general solution of û − M û = (f1(v0), . . . , fm(v0))^T. Thus, the problem of solving the operator equation u − Ãu = v0 is reduced to solving the matrix equation
û − M û = v̂, where v̂ = (f1(v0), . . . , fm(v0))^T.
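The reduction above can be carried out directly on a computer. The following Python/numpy sketch does so for an illustrative finite-rank operator with u_i(s) = s^i and f_i(u) = ∫_0^1 t^i u(t) dt (these choices, the quadrature grid and the right-hand side v0 are assumptions made only for the illustration):

```python
import numpy as np

# Theorem 8.4.2 in action (a sketch): solve u - Au = v0 for the finite-rank operator
#   A u (s) = sum_{i=1}^m f_i(u) u_i(s),  with u_i(s) = s^i,  f_i(u) = \int_0^1 t^i u(t) dt.
m = 4
t = np.linspace(0.0, 1.0, 2001)                      # grid for trapezoidal quadrature

def integral(vals):                                  # \int_0^1 vals(t) dt (trapezoidal rule)
    return float(np.sum(0.5 * (vals[1:] + vals[:-1]) * np.diff(t)))

def f(i, u_vals):                                    # f_i(u) = \int_0^1 t^i u(t) dt
    return integral(t ** i * u_vals)

v0 = np.cos(np.pi * t)                               # right-hand side v0(s), an arbitrary choice

# M_ij = f_i(u_j) = \int_0^1 t^{i+j} dt = 1/(i+j+1), and v_hat_i = f_i(v0).
M = np.array([[f(i, t ** j) for j in range(1, m + 1)] for i in range(1, m + 1)])
v_hat = np.array([f(i, v0) for i in range(1, m + 1)])

u_hat = np.linalg.solve(np.eye(m) - M, v_hat)        # solve u_hat - M u_hat = v_hat

# Reconstruct u(s) = v0(s) + sum_j u_hat_j s^j and verify that u - Au = v0.
u_vals = v0 + sum(u_hat[j - 1] * t ** j for j in range(1, m + 1))
Au_vals = sum(f(i, u_vals) * t ** i for i in range(1, m + 1))
print("max |(u - Au) - v0| =", np.max(np.abs(u_vals - Au_vals - v0)))
```

The printed residual is of the order of the quadrature error, confirming that solving the m × m matrix equation recovers the solution of the operator equation.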

We next describe some methods of approximating a compact operator


by bounded operators of finite rank. First, we describe some methods
related to projections.

8.4.3 Theorem
Let E be a Banach space and A be a compact operator on E. For
n = 1, 2, . . ., let Pn ∈ (E → E) be a projection of finite rank and

A_n^P = Pn A, A_n^S = A Pn, A_n^G = Pn A Pn.
If Pn u → u in E for every u ∈ E, then ||A_n^P − A|| → 0. If, in addition, Pn∗ u∗ → u∗ in E∗ for every u∗ ∈ E∗, then ||A_n^S − A|| → 0 and ||A_n^G − A|| → 0.
Proof: Let Pn u → u in E for every u in E. Then it follows that A_n^P u → Au in E for every u ∈ E. Since A is a compact linear operator mapping E → E, the set G = {Au : u ∈ E, ||u|| ≤ 1} is totally bounded. As E is a Banach space, we show below that {Pn v} converges to v uniformly on G. Since G is totally bounded, given ε > 0, there are v1, . . . , vm in G such that
G ⊆ B(v1, ε) ∪ B(v2, ε) ∪ · · · ∪ B(vm, ε).
Now, Pn vj → vj as n → ∞ for each j = 1, 2, . . . , m. Find n0 such that ||Pn vj − vj|| < ε for all n ≥ n0 and j = 1, 2, . . . , m. Let v ∈ G, and choose vj such that ||v − vj|| < ε. Then, for all n ≥ n0, we have
||Pn v − v|| ≤ ||Pn(v − vj)|| + ||(Pn − I)vj|| + ||vj − v||
≤ (||Pn|| + 1)||v − vj|| + ||Pn vj − vj|| ≤ (c + 2)ε,
where c = sup_n ||Pn|| < ∞ by theorem 4.5.7. Thus Pn v converges to v uniformly on G. Hence,
||A_n^P − A|| = ||(Pn − I)A|| = sup_{||u||≤1} ||(Pn − I)Au|| → 0 as n → ∞.
Let us next assume that Pn∗ u∗ → u∗ in E∗ for every u∗ ∈ E∗ as well. By theorem 8.1.12, A∗ is a compact operator on E∗ and
(A_n^S)∗ = (A Pn)∗ = Pn∗ A∗.
Replacing A by A∗ and Pn by Pn∗ and recalling theorem 6.1.5 (ii), we see that
||A_n^S − A|| = ||(A_n^S − A)∗|| = ||Pn∗ A∗ − A∗|| → 0, as before.
Also, ||A_n^G − A|| = ||Pn A Pn − Pn A + Pn A − A||
≤ ||Pn(A Pn − A)|| + ||Pn A − A||
≤ ||Pn|| ||A_n^S − A|| + ||A_n^P − A||,
which tends to zero as n → ∞, since the sequence {||Pn||} is bounded by theorem 4.5.7.

8.4.4 Remark
Definitions of A_n^P, A_n^S, A_n^G:
(i) A_n^P = Pn A is called the projection of A on the n-dimensional subspace.
(ii) A_n^S = A Pn is called the Sloan projection, after the mathematician Sloan.
(iii) A_n^G = Pn A Pn is called the Galerkin projection, after the mathematician Galerkin.

8.4.5 Example of projections


We next describe several ways of constructing bounded projections Pn
of finite rank such that Pn x → x as n → ∞.
1. Truncation of Schauder expansion
Let E be a Banach space with a Schauder basis {e1 , e2 , . . .}. Let
f1 , f2 , . . . be the corresponding coefficient functionals. For n = 1, 2, . . .
define
Pn u = Σ_{k=1}^n fk(u) ek, u ∈ E.
Now each fk ∈ E∗ and hence each Pn ∈ (E → E). Also, Pn² = Pn and each Pn is of finite rank.
The very definition of the Schauder basis implies that Pn u → u in E for every u ∈ E. Hence ||A − A_n^P|| → 0 if A is a compact operator mapping E → E.
2. Projection of an element in a Hilbert space
Let H be a separable Hilbert space and {u1 , u2 , . . .} be an orthonormal
basis for H [see 3.8.8]. Then for n = 1, 2, . . .,


Pn u = Σ_{k=1}^n ⟨u, uk⟩ uk, u ∈ H,
where ⟨·, ·⟩ is the inner product on H. Note that each Pn is obtained by truncating the Fourier expansion of u ∈ H [see 3.8.6]. Since H∗ can be identified with H (Note 5.6.1) and Pn∗ can be identified with Pn, we obtain ||A − A_n^S|| → 0 and ||A − A_n^G|| → 0, in addition to ||A − A_n^P|| → 0, as n → ∞.

Piece-wise linear interpolations


Let E = C([a, b]) with the sup norm. For n = 1, 2, . . ., consider n nodes t1^(n), t2^(n), . . . , tn^(n) in [a, b], with a = t0^(n) ≤ t1^(n) < · · · < tn^(n) ≤ t_{n+1}^(n) = b.
For j = 1, 2, . . . , n, let uj^(n) ∈ C([a, b]) be such that
(i) uj^(n)(ti^(n)) = δij, i = 1, . . . , n;
(ii) u1^(n)(a) = 1, uj^(n)(a) = 0 for j = 2, . . . , n; un^(n)(b) = 1, uj^(n)(b) = 0 for j = 1, 2, . . . , n − 1;
(iii) uj^(n) is linear on each of the subintervals [tk^(n), t_{k+1}^(n)], k = 0, 1, 2, . . . , n.
The functions u1^(n), u2^(n), . . . , un^(n) are known as the hat functions because of the shapes of their graphs. Let t ∈ [a, b]. Then uj^(n)(t) ≥ 0 for all j = 1, 2, . . . , n. If t ∈ [tk^(n), t_{k+1}^(n)], then uk^(n)(t) + u_{k+1}^(n)(t) = 1 and uj^(n)(t) = 0 for all j ≠ k, k + 1. Thus u1^(n)(t) + u2^(n)(t) + · · · + un^(n)(t) = 1.
For x ∈ C([a, b]), define
Pn(x) = Σ_{j=1}^n x(tj^(n)) uj^(n).
Then Pn is called a piecewise linear interpolatory projection. Let
hn = max{t_{j+1}^(n) − tj^(n) : j = 0, 1, 2, . . . , n}
denote the mesh of the partition of [a, b] by the given nodes. We show that Pn x → x in C([a, b]), provided hn → 0 as n → ∞. Let us fix x ∈ C([a, b]) and let ε > 0. By the uniform continuity of x on [a, b], there is some δ > 0 such that |x(s) − x(t)| < ε whenever |s − t| < δ. Let us choose N such that hn < δ for all n ≥ N. Consider n ≥ N and t ∈ [a, b].
If uj^(n)(t) ≠ 0, then t ∈ [t_{j−1}^(n), t_{j+1}^(n)], so that |tj^(n) − t| ≤ hn < δ and |x(tj^(n)) − x(t)| < ε.
Hence, |Pn x(t) − x(t)| = |Σ_{j=1}^n (x(tj^(n)) − x(t)) uj^(n)(t)|
≤ Σ_{j=1}^n |x(tj^(n)) − x(t)| uj^(n)(t)
≤ ε Σ_{j=1}^n uj^(n)(t) = ε.
Thus ||Pn x − x||∞ → 0, provided hn → 0.
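The convergence ||Pn x − x||∞ → 0 is easy to observe numerically; np.interp evaluates exactly the piecewise linear interpolant Σ_j x(tj)uj (a Python/numpy sketch with an arbitrarily chosen continuous x and equally spaced nodes):

```python
import numpy as np

# Piecewise linear interpolatory projection P_n built from hat functions on [0, 1]:
# illustration of ||P_n x - x||_inf -> 0 as the mesh h_n -> 0.
x = lambda t: np.sin(3 * t) + t ** 2          # an arbitrary continuous function
grid = np.linspace(0, 1, 5001)                # fine grid on which to measure the sup norm

for n in (4, 8, 16, 32, 64):
    nodes = np.linspace(0, 1, n)              # equally spaced nodes t_1, ..., t_n
    # np.interp evaluates the piecewise linear interpolant sum_j x(t_j) u_j(t).
    Pn_x = np.interp(grid, nodes, x(nodes))
    err = np.max(np.abs(Pn_x - x(grid)))
    print(f"n={n:3d}  h_n={1/(n-1):.4f}  ||P_n x - x||_inf = {err:.6f}")
```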

3. Linear integral equation with degenerate kernels


Let E denote either C([a, b]) or L2([a, b]). We consider the integral equation
∫_a^b k(t, s)x(s) ds = y(t). (8.21)
The kernel k(t, s) is said to be degenerate if
k(t, s) = Σ_{i=1}^m ai(t) bi(s), t, s ∈ [a, b]. (8.22)
Thus the kernel can be expressed as a sum of products of functions, one depending exclusively on t and the other depending exclusively on s; ai and bi belong to E, i = 1, 2, . . . , m.
Thus, equation (8.21) can be reduced to the form
∫_a^b k(t, s)x(s) ds = ∫_a^b Σ_{i=1}^m ai(t) bi(s) x(s) ds = y(t).
Hence, Σ_{i=1}^m ai(t) ∫_a^b bi(s)x(s) ds = y(t) for t ∈ [a, b].
We write Ax(t) = Σ_{i=1}^m (∫_a^b bi(s)x(s) ds) ai(t) = y(t).
We note that A ∈ (E → E). Also A is of finite rank because R(A) ⊆ span {a1, . . . , am}.
8.4.6 Theorem
Let E = C([a, b]) (resp. L2([a, b])) and k(·, ·) ∈ C([a, b] × [a, b]). Let (kn(·, ·)) be a sequence of degenerate kernels in C([a, b] × [a, b]) (respectively, L2([a, b] × [a, b])) such that ||k − kn||∞ → 0 (resp. ||k − kn||2 → 0). If A and A_n^D are the Fredholm integral operators with kernels k(·, ·) and kn(·, ·) respectively, then ||A − A_n^D|| → 0, where || · || denotes the operator norm in (E → E).
Proof: Let E = C([a, b]). Then
||(A − A_n^D)x||∞ = sup_t |∫_a^b (k − kn)(t, s)x(s) ds| ≤ (b − a)||k − kn||∞ ||x||∞.
Hence, ||A − A_n^D|| ≤ (b − a)||k − kn||∞ → 0 as n → ∞.
Next, let E = L2([a, b]). By the Cauchy-Schwarz inequality,
|∫_a^b (k − kn)(t, s)x(s) ds|² ≤ (∫_a^b [k(t, s) − kn(t, s)]² ds) (∫_a^b x²(s) ds),
so that
||(A − A_n^D)x||2 ≤ ||k − kn||2 ||x||2 [see example 8.1.10].
Hence, ||A − A_n^D|| ≤ ||k − kn||2 → 0 as n → ∞.
4. Truncation of a Fourier expansion
Let k(·, ·) ∈ L2([a, b] × [a, b]) and {e1, e2, . . .} be an orthonormal basis for L2([a, b]). For i, j = 1, 2, . . . let wi,j(t, s) = ei(t)ej(s), t, s ∈ [a, b]. Then {wi,j : i, j = 1, 2, . . .} is an orthonormal basis for L2([a, b] × [a, b]). Then by 3.8.6,
k = Σ_{i,j} ⟨k, wi,j⟩ wi,j.
For n = 1, 2, . . . and s, t ∈ [a, b], let
kn(t, s) = Σ_{i,j=1}^n ⟨k, wi,j⟩ wi,j(t, s) = Σ_{i,j=1}^n ⟨k, wi,j⟩ ei(t)ej(s),
where ⟨k, wi,j⟩ = ∫_a^b ∫_a^b k(t, s) ei(t) ej(s) dt ds, i, j = 1, 2, . . . .
Thus kn(·, ·) is a degenerate kernel and ||k − kn||2 → 0 as n → ∞.
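As a numerical check (a Python/numpy sketch; the kernel of (8.2), the sine basis e_i(t) = √2 sin iπt and the midpoint-rule discretization are illustrative choices), the truncated Fourier expansion gives degenerate kernels kn with ||k − kn||2 decreasing to 0:

```python
import numpy as np

# Degenerate-kernel approximation by a truncated Fourier expansion.
# e_i(t) = sqrt(2) sin(i pi t) is an orthonormal basis of L2([0,1]); k_n is the
# n x n truncation of the expansion of the kernel K of equation (8.2).
m = 400
t = (np.arange(m) + 0.5) / m
w = 1.0 / m                                            # midpoint-rule weights
S, T = np.meshgrid(t, t, indexing="ij")
k = np.where(T <= S, (1 - S) * T, S * (1 - T))         # kernel K(s, t) of (8.2)

def truncated(n):
    e = np.sqrt(2) * np.sin(np.pi * np.outer(np.arange(1, n + 1), t))  # e_1..e_n sampled
    coeff = w * w * e @ k @ e.T                        # <k, w_ij> = double integral of k e_i e_j
    return e.T @ coeff @ e                             # k_n(s, t) on the grid

for n in (1, 2, 4, 8, 16):
    kn = truncated(n)
    err = np.sqrt(w * w * np.sum((k - kn) ** 2))       # ||k - k_n||_2
    print(f"n={n:2d}  ||k - k_n||_2 = {err:.6f}")
```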

8.4.7 Examples
Let us consider the infinite dimensional homogeneous system of
equations


xi − Σ_{j=1}^∞ ai,j xj = 0, i = 1, 2, . . . . (8.23)
Let the ai,j ∈ ℝ (ℂ) be such that Σ_{i,j=1}^∞ |ai,j|² < ∞, and let the only square-summable solution of (8.23) be zero.
Let A = (ai,j)_{i,j=1}^∞ (8.24)
and let X = (x1, x2, . . . , xn, . . .)^T ∈ l2, Y = (y1, y2, . . . , yn, . . .)^T ∈ l2.
Consider the infinite dimensional equation
X − AX = Y. (8.25)
If we now truncate the equation to an n-dimensional subspace, then (8.25) reduces to
Xn − An Xn = Yn (8.26)
where An = (ai,j)_{i,j=1}^n, Xn = (x1, . . . , xn)^T and Yn = (y1, . . . , yn)^T.
Since the homogeneous equation (8.23) has only the zero solution, the equation (8.25) has a unique solution. If we let Xn^(i) = 0 for i = n + 1, . . ., then the sequence {Xn} converges in l2 to the unique solution X̂ = (x̂1, x̂2, . . . , x̂n, . . .)^T of the denumerable system
xi − Σ_{j=1}^∞ ai,j xj = yi, i = 1, 2, . . . . (8.27)
In fact, ||X̂ − Xn||2 ≤ (||(I − A)⁻¹||/(1 − εn)) (εn ||Y||2 + ||Y − Yn||2),
provided εn = ||A − An|| < 1/||(I − A)⁻¹||,
where AX = (Σ_{j=1}^∞ ai,j xj)_{i=1}^∞, X ∈ l2, and
An X has i-th component Σ_{j=1}^n ai,j xj for i = 1, 2, . . . , n and 0 for i > n.
These results follow from theorem 8.4.1 and theorem 8.4.3 if we note that A is a compact operator and that, if
Pn X = (x1, . . . , xn, 0, 0, . . .)^T, X ∈ l2,
then An = Pn A Pn. Since Pn is obtained by truncating the Fourier series of X ∈ l2, we see that Pn X → X for every X in l2 and Pn∗ X∗ → X∗ for every X∗ in (l2)∗. Hence ||A − A_n^G||2 → 0.
We note that (x1, . . . , xn)^T is a solution of the system (8.26) if and only if (x1, . . . , xn, 0, 0, . . .)^T is a solution of the system
X − An X = (y1, . . . , yn, 0, 0, . . .)^T, X ∈ l2.
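The following Python/numpy sketch illustrates the truncation procedure with the illustrative choice ai,j = 2^{-(i+j)} (so that Σ|ai,j|² < ∞ and the homogeneous system has only the zero solution); a large but finite N models l²:

```python
import numpy as np

# Truncation of the infinite system X - AX = Y:  a_ij = 2^{-(i+j)}, Y_i = 1/i.
# N models l^2; n is the truncation level of system (8.26).
N = 2000
i = np.arange(1, N + 1)
c = 2.0 ** (-i)
A = np.outer(c, c)                            # a_ij = 2^{-(i+j)}
Y = 1.0 / i                                   # a right-hand side in l^2

X_ref = np.linalg.solve(np.eye(N) - A, Y)     # "exact" solution on the large model

for n in (2, 4, 8, 16):
    Xn = np.zeros(N)
    Xn[:n] = np.linalg.solve(np.eye(n) - A[:n, :n], Y[:n])   # truncated system (8.26)
    print(f"n={n:3d}  ||X_hat - X_n||_2 = {np.linalg.norm(X_ref - Xn):.3e}")
```

The error decreases rapidly with n, as predicted by the bound involving εn = ||A − An||.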

Problems [8.3 and 8.4]


1. Let Ex = C([0, 1]) and define A : Ex → Ex by Ax(t) = u(t)x(t), where u ∈ Ex is fixed. Find σ(A) and show that it is closed. (Hint: σ(A) = {u(t) : t ∈ [0, 1]}, the range of the continuous function u on the compact set [0, 1].)
2. Let A be a compact linear operator mapping Ex into Ex and λ = 0.
Then show that A − λI is injective if and only if it is surjective.
3. If λ = λi for some i, show that the range of λ − T consists of all
vectors orthogonal to the eigenspace corresponding to λi , where T is
self-adjoint and compact. For such a vector show that the general
solution of (λ − T )u = f is

u = (1/λ) f + (1/λ) Σ_{λj ≠ λ} [λj ⟨f, uj⟩/(λ − λj)] uj + g,

where g is an arbitrary element of the eigenspace corresponding to λi


[see Taylor [55], ch. VI].
4. If A maps a normed linear space Ex into Ey and A is compact, then
show R(A) is separable.

(Hint: R(A) = ∪_{n=1}^∞ A(Zn), where Zn = {x : ||x|| ≤ n}.)
n=1

Since A is compact, it may be seen that every infinite subset of A(Zn )


has a limit point in Ey . Consequently, for each positive integer m
1
there is a finite set of points in A(Zn ) such that the balls of radius m
with center at these points cover A(Zn ).)
5. Show that the operator C, such that
d2 u
Cu = − ,
dx2
subject to the boundary conditions
u (0) = u (1) = 0,
is positive but not positive-definite.
 1
(Hint: If u, v = u(x)v(x)dx, show that Cu, u ≥ 0. If u = 1
0
then Cu, u = 0 [see 9.5].)
6. Find the eigenvalues and eigenvectors of the operator C given in
problem 5.

(Hint: Take u(x) = Cn cos nπx, where ∫_0^1 cos nπx cos mπx dx = 0 for n ≠ m.)
7. Suppose Ex and Ey are infinite dimensional normed linear spaces.
Show that if A : Ex → Ex is a surjective linear operator, then A is
not a compact operator.

8. Let P0 = (1/(ij)), i, j = 1, 2, . . ., and v0 ∈ l2. Then show that the unique u0 ∈ l2 satisfying u0 − P0 u0 = v0 is given by
u0 = v0 + [6/(6 − π²) Σ_{j=1}^∞ v0,j/j] (1, 1/2, 1/3, · · ·)^T.
(Hint: P0 u0 = (Σ_{j=1}^∞ u0,j/j) (1, 1/2, 1/3, · · ·)^T for u0 ∈ l2.)
j=1

9. Let Ex = C([0, 1]), k(·, ·) ∈ C([0, 1] × [0, 1]) and P be the Fredholm integral operator with kernel k(·, ·).
(a) If |μ| < 1/||k||∞, then for every v ∈ C([0, 1]) show that there is a unique u ∈ C([0, 1]) such that u − μP u = v. Further, let
un(s) = v(s) + Σ_{j=1}^n μ^j ∫_0^1 k^(j)(s, t)v(t) dt, s ∈ [0, 1],
where k^(j)(·, ·) is the j-th iterated kernel; then show that
||un − u||∞ ≤ (|μ|^{n+1} ||k||∞^{n+1}/(1 − |μ| ||k||∞)) ||v||∞.
(b) If k(s, t) = 0 for all s ≤ t and 0 ≠ μ ∈ ℝ (ℂ), then show that
||un − u||∞ ≤ Σ_{j=n+1}^∞ (|μ|^j ||k||∞^j/j!) ||v||∞.
CHAPTER 9

ELEMENTS OF
SPECTRAL THEORY
OF SELF-ADJOINT
OPERATORS IN
HILBERT SPACES

A Hilbert space has some special properties: it is an inner product space and it can be identified with its own conjugate (dual) space. Besides adjoint operators, this gives rise to self-adjoint operators, which have immense applications in analysis and theoretical physics. This chapter is devoted to a study of self-adjoint bounded linear operators.

9.1 Adjoint Operators


In 5.6.14 we defined adjoint operators in a Banach space. In a Hilbert
space we can use inner product to obtain an adjoint to a given linear
operator. Let H be a Hilbert space and A a bounded linear operator
defined on H with range in the same space. Let us consider the functional

fy(x) = ⟨Ax, y⟩, y ∈ H (9.1)
As a linear functional in the Hilbert space, fy(x) can always be written in the form fy(x) = ⟨x, y∗⟩, where y∗ is some element in H. Let us suppose that there exists another element z∗ ∈ H such that fy(x) = ⟨x, z∗⟩. Then we have ⟨x, y∗ − z∗⟩ = 0. Since x is arbitrary and x ⊥ (y∗ − z∗), we have y∗ = z∗.
Thus y ∗ ∈ H can be uniquely associated to the element y, identifying the


functional fy. Thus we can find a correspondence between y and y∗, given by y∗ = A∗y. A∗ is an operator defined on H with range in H. The operator A∗ is associated with A by
⟨Ax, y⟩ = ⟨x, A∗y⟩, (9.2)
and is called the adjoint operator of A. To see that A∗ is unique, let us suppose
⟨Ax, y⟩ = ⟨x, A∗y⟩ = ⟨x, A∗1y⟩.
Hence ⟨x, (A∗ − A∗1)y⟩ = 0, i.e., x ⊥ (A∗ − A∗1)y. Since x is arbitrary, (A∗ − A∗1)y = θ. Again the result is true for any y, implying that A∗ = A∗1.

It can be easily seen that the definition of adjoint operator derived here
formally coincides with the definition given in 5.6.14, for the case of Banach
spaces. It can be easily proved that the theorems on adjoint operators for
Banach spaces developed in 5.6 remain valid in complex Hilbert space too.

Note 9.1.1. In 5.6.18, we defined the adjoint of an unbounded linear


operator in a space Ex . Let H be a Hilbert space. Let A be a linear
operator (unbounded) with domain D(A) everywhere dense in H. If the
scalar product Ax, y for a given fixed y and every x ∈ D(A) can be
represented in the form
Ax, y = x, y ∗ ,
then it can be seen that y belongs to the domain DA∗ of the operator,
adjoint of A. The adjoint operator A∗ itself is thus defined by
A∗ y = y∗ .
It can be argued as in above, that y ∗ is unique and A∗ is a linear operator.
Here, y ∈ D(A∗ ).

9.1.1 Lemma
Given a complex Hilbert space H, the operator A∗ adjoint to a bounded
linear operator A is bounded and ||A|| = ||A∗ ||.
Proof: ||Ax||² = ⟨Ax, Ax⟩ = ⟨x, A∗Ax⟩ ≤ ||x|| ||A∗Ax||.
Or, sup_{x≠θ} ||Ax||/||x|| ≤ sup_{Ax≠θ} ||A∗Ax||/||Ax|| ≤ ||A∗||. Hence, ||A|| ≤ ||A∗||.
Similarly, considering ||A∗ x||2 we can show that ||A∗ || ≤ ||A||.
Hence, ||A|| = ||A∗ ||, showing that A∗ is bounded.
9.1.2 Lemma
In H, A∗∗ = A.
Proof: The operator adjoint to A∗ is denoted by A∗∗.
We have ⟨A∗x, y⟩ = conjugate of ⟨y, A∗x⟩ = conjugate of ⟨Ay, x⟩ = ⟨x, Ay⟩, ∀ x, y ∈ H.
Therefore, ⟨A∗x, y⟩ = ⟨x, A∗∗y⟩ = ⟨x, Ay⟩, showing that A∗∗ = A.

9.1.3 Remark
(i) A∗∗∗ = A∗.
(ii) For A and B linear operators, (A + B)∗ = A∗ + B∗.
(iii) (λA)∗ = λ̄A∗, λ ∈ ℂ.
(iv) (AB)∗ = B∗A∗.
(v) If A has an inverse A⁻¹, then A∗ has an inverse and (A∗)⁻¹ = (A⁻¹)∗.
Proof: (i) A∗∗∗ = (A∗∗)∗ = A∗, using lemma 9.1.2.
(ii) For x, y ∈ H, ⟨(A + B)∗x, y⟩ = ⟨x, (A + B)y⟩ = ⟨x, Ay⟩ + ⟨x, By⟩ = ⟨A∗x, y⟩ + ⟨B∗x, y⟩,
or ⟨[(A + B)∗ − A∗ − B∗]x, y⟩ = 0.
Since x and y are arbitrary, (A + B)∗ = A∗ + B∗.
(iii) ⟨λAx, y⟩ = λ⟨Ax, y⟩ = λ⟨x, A∗y⟩ = ⟨x, λ̄A∗y⟩, and also ⟨λAx, y⟩ = ⟨x, (λA)∗y⟩, ∀ x, y ∈ H.
Hence, (λA)∗ = λ̄A∗.
(iv) ⟨ABx, y⟩ = ⟨Bx, A∗y⟩ = ⟨x, B∗A∗y⟩ = ⟨x, (AB)∗y⟩, where x, y ∈ H.
Hence, (AB)∗ = B∗A∗.
(v) Let A, mapping H into H, have an inverse A⁻¹. Taking adjoints in AA⁻¹ = I = A⁻¹A and using (iv), we get
(A⁻¹)∗A∗ = I (9.3)
and A∗(A⁻¹)∗ = I. (9.4)
Hence it follows from (9.3) and (9.4) that (A∗)⁻¹ = (A⁻¹)∗.

9.2 Self-Adjoint Operators


9.2.1 Self-adjoint operators
A bounded linear operator A is said to be self-adjoint if it is equal to
its adjoint, i.e., A = A∗ . Self-adjoint operators on a Hilbert space H
are also called Hermitian.

Note 9.2.1. A linear (not necessarily bounded) operator A with


domain D(A) dense in H is said to be symmetric, if for all x, y ∈ D(A),
the equality
Ax, y = x, Ay

holds. If A is unbounded, it follows from Note 9.1.1 that

y ∈ D(A) =⇒ y ∈ D(A∗ ).

Hence D(A) ⊆ D(A∗ ). In other words, A ⊆ A∗ or A∗ is an extension of A.


For A bounded, D(A) = D(A∗ ) = H. For, A = A∗ and D(A) dense in H,
A is called self-adjoint.

9.2.2 Examples
1. In an n-dimensional complex Euclidean space, a linear operator A can be identified with the matrix (aij) with complex numbers as elements. The operator adjoint to A = (aij) is A∗ = (āji), the conjugate transpose. A is self-adjoint if it is a Hermitian matrix, i.e., if aij = āji.
If (aij) is real, then a Hermitian matrix becomes a symmetric matrix.
2. Adjoint operator corresponding to a Fredholm operator in L2([0, 1]).
If T f = g(s) = ∫_0^1 k(s, t)f(t) dt (5.6.15), the kernel of the adjoint operator T∗ in complex L2([0, 1]) is the conjugate kernel k̄(t, s).
T is self-adjoint if k(s, t) = k̄(t, s).
3. In L2([0, 1]) let the operator A be given by (Ax)(t) = t x(t) for every function x(t) ∈ L2([0, 1]). It can be seen that A is self-adjoint.

9.2.3 Remark
Given A, a self-adjoint operator, then
(i) λA is self-adjoint where λ is real.
(ii) (A + B) is self-adjoint if A and B are respectively self-adjoint.
(iii) AB is self-adjoint if A and B are respectively self-adjoint and AB =
BA.
(iv) If An → A in the sense of norm convergence in the space of operators
and all An are self-adjoints, then A is also self-adjoint.

Proof: For (i)–(iii) see 9.1.3 (ii)–(iv).


(iv) Let An → A as n → ∞ in the space of bounded linear operators.

Using 5.6.16, ||An − A|| = ||A∗n − A∗ ||.


Since An → A, as n → ∞, limn→∞ A∗n = A∗ . Since An is self-adjoint, An = A∗n .
Thus, An tends to both A and A∗ as n → ∞.
Hence, A = A∗ .

9.2.4 Definition: bilinear hermitian form


A functional is said to be of bilinear form if it is a functional of two
vectors and is linear in both the vectors.
Bilinear Hermitian Form
Let us consider ⟨Ax, y⟩ where A is self-adjoint.
Now, ⟨A(αx1 + βx2), y⟩ = α⟨Ax1, y⟩ + β⟨Ax2, y⟩.
Moreover, ⟨Ax, y⟩ = ⟨x, Ay⟩.
Thus, ⟨Ax, y⟩ is a bilinear functional.
If A is self-adjoint, then we denote ⟨Ax, y⟩ by A(x, y). Thus, A(x, y) is the complex conjugate of A(y, x).
This form is bounded in the sense, that |A(x, y)| ≤ CA ||x|| ||y|| where
CA is some constant.
9.2.5 Lemma
Thus every self-adjoint operator A generates some bounded bilinear
hermitian form
A(x, y) = Ax, y = x, Ay.
Conversely if a bounded linear Hermitian form A(x, y) is given, then it
generates some self-adjoint operator A, satisfying the equality

A(x, y) = Ax, y. (9.5)

Proof: The first part follows from the definition in 9.2.4. Let us
consider the bilinear hermitian form given by (9.5). Let us keep y fixed
in A(x, y) and obtain a linear functional of x. Consequently, A(x, y) =
x, y ∗ , y ∗ is a uniquely defined element. Thus we get an operator A,
defined by Ay = y∗ , and such that x, Ay = A(x, y).
Now x, A(y1 + y2 ) = A(x, y1 + y2 ) = A(x, y1 ) + A(x, y2 ) = x, Ay1  +
x, Ay2 
Thus A is linear. Moreover, |x, Ay| = |A(x, y)| ≤ CA ||x|| ||y||.
Putting x = Ay, we get from the above, |Ay, Ay| ≤ CA ||Ay||
||y|| or ||Ay|| ≤ CA ||y||.
Hence ||A|| ≤ CA , showing that A is bounded. To prove self-adjointness
of A, we note that for x, y ∈ H, we have, x, Ay = A(y, x) = (y, Ax) =
Ax, y, implying that A = A∗ and A(x, y) = Ax, y.

9.3 Quadratic Form


9.3.1 Definition: quadratic form
A bilinear hermitian form A(x, y) given by (9.5) is said to be a
quadratic form if x = y so that A(x, x) is a quadratic form.

Further, (i) A(x, x) is real since A(x, y) = A(y, x).


(ii) A(αx + βy, αx + βy) = ααA(x, x) + αβA(x, y)
+ αβA(y, x) + ββA(y, y)
9.3.2 Lemma
Every bilinear hermitian form A(x, y) can be uniquely defined by a
quadratic hermitian form.
The bilinear form A(x, y) is defined by
1
A(x, y) = {[A(x1 , x1 ) − A(x2 , x2 )] + i[A(x3 , x3 ) − A(x4 , x4 )]}
4
where x1 = x + y, x2 = x − y and x3 = x + iy, x4 = x − iy.
The quadratic form A(x, x) is bounded, that is, |A(x, x)| ≤ CA ||x||2 , if
and only if the corresponding bilinear hermitian form is bounded. Moreover,
||A|| = max(|m|, |M |) = sup |(Ax, x)| where m and M are defined below.
||x||=1

Proof: Let m = inf_{||x||=1} ⟨Ax, x⟩ and M = sup_{||x||=1} ⟨Ax, x⟩. The numbers m and M are called the greatest lower bound and the least upper bound respectively of the self-adjoint operator A.
Let ||x|| = 1. Then
|⟨Ax, x⟩| ≤ ||Ax|| · ||x|| ≤ ||A|| ||x||² = ||A||,
and, consequently, CA = sup_{||x||=1} |⟨Ax, x⟩| ≤ ||A||. (9.6)
On the other hand, for every y ∈ H, we have |⟨Ay, y⟩| ≤ CA||y||².
Let z be any element in H different from zero. We put
t = (||Az||/||z||)^{1/2} and u = (1/t)Az;
we get ||Az||² = ⟨Az, Az⟩ = ⟨A(tz), u⟩
= (1/4){⟨A(tz + u), tz + u⟩ − ⟨A(tz − u), tz − u⟩}
≤ (1/4)CA[||tz + u||² + ||tz − u||²]
= (1/2)CA[||tz||² + ||u||²]
= (1/2)CA[t²||z||² + (1/t²)||Az||²]
= (1/2)CA[||Az|| ||z|| + ||z|| ||Az||] = CA||Az|| ||z||.
Hence, ||Az|| ≤ CA||z||.
Therefore, ||A|| = sup_{z≠θ} ||Az||/||z|| ≤ CA = sup_{||x||=1} |⟨Ax, x⟩|. (9.7)

It follows from (9.6) and (9.7) that
||A|| = max{|m|, |M|} = sup_{||x||=1} |⟨Ax, x⟩|.
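For a Hermitian matrix this equality is easy to verify numerically (a Python/numpy sketch; the random matrix and the sampling of unit vectors are illustrative choices):

```python
import numpy as np

# Check ||A|| = max(|m|, |M|) = sup_{||x||=1} |<Ax, x>| for a random Hermitian matrix.
# For self-adjoint A, m and M are the smallest and largest eigenvalues.
rng = np.random.default_rng(1)
B = rng.standard_normal((5, 5)) + 1j * rng.standard_normal((5, 5))
A = (B + B.conj().T) / 2                       # self-adjoint

eigs = np.linalg.eigvalsh(A)                   # real eigenvalues of a Hermitian matrix
m, M = eigs.min(), eigs.max()

# Estimate sup |<Ax, x>| over random unit vectors (a lower estimate of the sup).
xs = rng.standard_normal((20000, 5)) + 1j * rng.standard_normal((20000, 5))
xs /= np.linalg.norm(xs, axis=1, keepdims=True)
quad = np.abs(np.einsum("ki,ij,kj->k", xs.conj(), A, xs))

print("||A||            :", np.linalg.norm(A, 2))
print("max(|m|, |M|)    :", max(abs(m), abs(M)))
print("sup |<Ax, x>| ~  :", quad.max())
```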

9.4 Unitary Operators, Projection Operators


In this section we study some well-behaved bounded linear operators in a
Hilbert space which commute with their adjoints.
9.4.1 Definitions: normal, unitary operators
(i) Normal operator: Let A be a bounded linear operator mapping a
Hilbert space H into itself. A is called a normal operator if A∗ A = AA∗ .
(ii) Unitary operator: The bounded linear operator mapping a Hilbert
space into itself is called unitary if AA∗ = I = A∗ A. Hence A has an
inverse and A−1 = A∗ .
9.4.2 Example 1
Let H = ℝ² (ℂ²) and x = (x1, x2)^T. Let
Ax = (x1 − x2, x1 + x2)^T, A∗x = (x1 + x2, −x1 + x2)^T.
Then A∗Ax = (2x1, 2x2)^T = AA∗x.
Thus A is a normal operator.

9.4.3 Example 2
 
Let A = ( cos θ  −sin θ ; sin θ  cos θ ).
Then A∗ = ( cos θ  sin θ ; −sin θ  cos θ ).
Therefore, AA∗ = A∗A = ( 1  0 ; 0  1 ) = I.
Thus A is unitary.
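Examples 9.4.2 and 9.4.3 can be verified directly (a Python/numpy sketch; the angle θ is an arbitrary choice):

```python
import numpy as np

# Quick numerical check of examples 9.4.2 and 9.4.3.
A = np.array([[1.0, -1.0],
              [1.0,  1.0]])          # Ax = (x1 - x2, x1 + x2)^T
theta = 0.7
U = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

def is_normal(T):
    return np.allclose(T @ T.T.conj(), T.T.conj() @ T)

print("A normal :", is_normal(A))                          # True: A*A = AA* = 2I
print("A unitary:", np.allclose(A @ A.T, np.eye(2)))       # False: A*A = 2I, not I
print("U normal :", is_normal(U))                          # True
print("U unitary:", np.allclose(U @ U.T, np.eye(2)))       # True
```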

9.4.4 Remark
(i) If A is unitary or self-adjoint then A is normal.
(ii) The converse is not always true.
(iii) The operator A in example 9.4.2 although normal is not unitary.
(iv) The operator A in example 9.4.3 is unitary and necessarily normal.
9.4.5 Remark
If B is a normal operator and C is a bounded operator, such that
C ∗ C = I, then operator A = CBC ∗ is normal.

For A∗ = CB ∗ C ∗ . Now, AA∗ = CBC ∗ · CB ∗ C ∗ = CBB ∗ C ∗


Again A∗ A = CB ∗ C ∗ · CBC ∗ = CB ∗ BC ∗ . Hence AA∗ = A∗ A.
9.4.6 Example
Let H = l2 and, for x = (x1, x2, . . .)^T in H, let Cx = (0, x1, x2, . . .)^T (the right-shift operator).
Then C∗x = (x2, x3, . . .)^T for x ∈ H.
Hence C∗Cx = (x1, x2, . . .)^T = x and CC∗x = (0, x2, x3, . . .)^T for all x ∈ H.
Here C is represented by the infinite matrix with 1's on the first subdiagonal and zeros elsewhere, and C∗ by its transpose.
Thus C∗C = I but CC∗ ≠ I.
9.4.7 Definition
Given a linear operator A and a unitary operator U , the operator
B = U AU −1 = U AU ∗ is called an operator unitarily equivalent to
A.
9.4.8 Projection Operator
Let H be a Hilbert space and L a closed subspace of H. Then, by
orthogonal projection theorem for every x ∈ H, y ∈ L, and z ∈ L⊥ , x
can be uniquely represented by x = y + z.
Then P x = y and (I − P )x = z.
This motivates us to define the projection operator, see 3.6.2.
9.4.9 Theorem
P is a self-adjoint operator with its norm equal to one and P satisfies
P2 = P.
We first show that P is a linear operator.
Let, x1 = y1 + z1 , y1 ∈ L, z1 ∈ L⊥ and x2 = y2 + z2 , y2 ∈ L, z2 ∈ L⊥ .
Now, y1 = P x1 , y2 = P x2 .
Since αx1 + βx2 = [αy1 + βy2] + [αz1 + βz2], therefore
P(αx1 + βx2) = αy1 + βy2 = αPx1 + βPx2, α, β ∈ ℝ (ℂ).
Hence P is linear.
Since y ⊥ z, we have ||x||² = ||y + z||² = ⟨y + z, y + z⟩ = ⟨y, y⟩ + ⟨z, y⟩ + ⟨y, z⟩ + ⟨z, z⟩ = ||y||² + ||z||².
Thus, ||y||² = ||Px||² ≤ ||x||², i.e., ||Px|| ≤ ||x|| for every x.
Hence, ||P || ≤ 1.

Since for x ∈ L, P x = x and consequently ||P x|| = ||x||, it follows that


||P || = 1.
Next we want to show that P is a self-adjoint operator. Let x1 =
y1 + z1 and x2 = y2 + z2

We have P x1 = y1 , P x2 = y2 .
Therefore, P x1 , x2  = y1 , P x2  = y1 , y2 .
Similarly, x1 , P x2  = P x1 , y2  = y1 , y2 .
Consequently, P x1 , x2  = x1 , P x2 .
Hence, P is self-adjoint.
Since, x = y + z, with y ∈ L and z ⊥ L, P x ∈ L for every x ∈ H.
Hence, P 2 x = P (P x) = P x for every x ∈ H.
Hence, projection P in a Hilbert space satisfies P 2 = P .
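These properties of an orthogonal projection are easily checked numerically (a Python/numpy sketch; the ambient dimension and the spanning vectors of L are arbitrary choices):

```python
import numpy as np

# Orthogonal projection onto a subspace L of R^6 spanned by two random vectors:
# P = Q Q^T, with Q an orthonormal basis of L.
rng = np.random.default_rng(2)
V = rng.standard_normal((6, 2))
Q, _ = np.linalg.qr(V)              # orthonormal basis of L = span(V)
P = Q @ Q.T

print("P^2 = P        :", np.allclose(P @ P, P))
print("P = P^T        :", np.allclose(P, P.T))
print("||P||          :", np.linalg.norm(P, 2))             # = 1
x = rng.standard_normal(6)
print("||Px|| <= ||x||:", np.linalg.norm(P @ x) <= np.linalg.norm(x) + 1e-12)
```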

9.4.10 Theorem
Every self-adjoint operator P satisfying P 2 = P is an orthogonal
projection on some subspace L of the Hilbert space H.

Proof: Let L have the element y where y = P x, x being any element


of H. Now, if y1 = P x1 ∈ L and y2 = P x2 ∈ L for x1 , x2 ∈ H, then
y1 + y2 = P x1 + P x2 = P (x1 + x2 ) ∈ L. Similarly, αy = αP x = P (αx) ∈ L
for α ∈ ℝ (ℂ). Hence, L is a linear subspace. Now, let yn → y0, with yn ∈ L, say yn = Pxn for some xn ∈ H. Then Pyn = P²xn = Pxn = yn. Since P is continuous, yn → y0 ⇒ Pyn → Py0. However, since Pyn = yn, yn → Py0. Consequently, y0 = Py0, i.e., y0 ∈ L, showing that L is closed. Finally, x − Px ⊥ Px, since P is self-adjoint and P² = P, as is shown below: ⟨x − Px, Px⟩ = ⟨Px − P²x, x⟩ = 0.
Thus, it follows from the definition of L that P is the projection of H
onto this subspace. Moreover, corresponding to an element x ∈ H, P x ∈ L
is unique. For if otherwise, let P1 x be another projection on L. But that
violates the orthogonal projection theorem [see 3.5].
9.4.11 Remark
(i) L mentioned above consists of all elements of the form P x = x, x ∈
L.
(ii) By the orthogonal projection theorem, we can write x = y +z, where
y ∈ L, z ∈ L⊥ and x ∈ H. If we write y = P x, then z = (I − P )x. Thus
(I − P ) is a projection on L⊥ .

Moreover, (I − P )2 = I − 2P + P 2 = I − 2P + P = I − P .
(I − P )x, y = x, (I − P )∗ y = x, y − x, P y
= x, (I − P )y for all x, y ∈ H.

Thus (I − P ) is a projection operator.

9.4.12 Theorem
For the projections P1 and P2 to be orthogonal, it is necessary and
sufficient that the corresponding subspace L1 and L2 are orthogonal.
Let y1 = P1x and y2 = P2x, x ∈ H. Let P1 be orthogonal to P2 (i.e., P1P2 = 0). Then ⟨y1, y2⟩ = ⟨P1x, P2x⟩ = ⟨x, P1P2x⟩ = 0, since P1 is orthogonal to P2. Since y1 is any element of L1 and y2 is any element of L2, we conclude that L1 ⊥ L2. Conversely, let L1 ⊥ L2. Then for y1 ∈ L1 and y2 ∈ L2 we have ⟨y1, y2⟩ = 0, or ⟨P1x, P2x⟩ = ⟨x, P1P2x⟩ = 0 for all x, showing that P1 ⊥ P2.
9.4.13 Lemma
The necessary and sufficient condition that the sum of two projection
operators PL1 and PL2 be a projection operator is that PL1 and PL2 must
be mutually orthogonal. In this case PL1 + PL2 = PL1 +L2 .

Proof: Let PL1 + PL2 be a projection operator P .


Then (PL1 + PL2)² = PL1 + PL2.
Therefore PL1² + PL1PL2 + PL2PL1 + PL2² = PL1 + PL2.
Hence PL1PL2 + PL2PL1 = 0.
Multiplying this equation on the left by PL1, we have PL1PL2 + PL1PL2PL1 = 0; multiplying it on the right by PL1, we have PL1PL2PL1 + PL2PL1 = 0. Subtracting, PL1PL2 − PL2PL1 = 0, which together with PL1PL2 + PL2PL1 = 0 gives PL1PL2 = 0 = PL2PL1, i.e., PL2 ⊥ PL1.
Next, let us suppose that PL1 PL2 = 0.
Then (PL1 + PL2 )2 = (PL1 + PL2 )(PL1 + PL2 )
= PL21 + PL1 PL2 + PL2 PL1 + PL22
= PL21 + PL22 = PL1 + PL2 .
Thus PL1 + PL2 is a projection operator.
Since PL1 ⊥ PL2, L1 ⊥ L2. (9.8)
If x ∈ H, then Px = PL1x + PL2x = x1 + x2 with x1 + x2 ∈ L1 + L2.
Further, if x = x1 + x2 is an element of L1 + L2, then
x = x1 + x2 = PL1(x1 + x2) + PL2(x1 + x2) = PL1+L2(x1 + x2), (9.9)
since PL1x2 = 0 and PL2x1 = 0.
It follows from (9.8) and (9.9) that, for x ∈ L1 + L2, Px = x = PL1+L2x.

Hence P is a projection.

9.4.14 Lemma
The necessary and sufficient condition for the product of two projections
PL1 and PL2 to be a projection is that the projection operator i.e. PL1 PL2 =
PL2 PL1 . In this case PL1 PL2 = PL1 ∩L2 .

Proof: Since P = PL1 PL2 is self-adjoint, we have


PL1 PL2 = (PL1 PL2 )∗ = PL∗2 PL∗1 = PL2 PL1 ,
taking note that PL1 and PL2 are self-adjoint.
Hence PL1 commutes with PL2 .
Conversely, if PL1 PL2 = PL2 PL1 , then
P ∗ = (PL1 PL2 )∗ = PL∗2 PL∗1 = PL2 PL1 = PL1 PL2 = P .
Thus P is self-adjoint.

Furthermore, (PL1 PL2 )2 = PL1 PL2 PL1 PL2 = PL1 PL1 PL2 PL2
= PL21 PL22 = PL1 PL2 .

Hence P = PL1 PL2 is a projection.


Let x ∈ H be arbitrary. Then, P x = PL1 PL2 x = PL2 PL1 x.
Thus P x belongs to L1 and L2 , that is to L1 ∩ L2 .
Now, let y ∈ L1 ∩ L2 . Then P y = PL1 (PL2 y) = PL1 y = y.
Thus P is a projection on L1 ∩ L2 . This proves the lemma.
9.4.15 Definition
The projection P2 is said to be a part of the projection P1 , if
P1 P2 = P2 .
9.4.16 Remark
(i) P1 P2 = P2 ⇒ (P1 P2 )∗ = P2∗ ⇒ P2 P1 = P2 .
(ii) PL2 is a part of PL1 if and only if L2 is a subspace of L1 .
9.4.17 Theorem
The necessary and sufficient condition for a projection operator PL2 to
be a part of the projection operator PL1 is the inequality ||PL2 x|| ≤ ||PL1 x||
being satisfied for all x ∈ H.
Proof: PL2 PL1 x = PL2 x yields

||PL2 x|| = ||PL2 PL1 x|| ≤ ||PL2 || ||PL1 x|| ≤ ||PL1 x|| (9.10)

Conversely, if (9.10) be true, then for every x ∈ L2,
||PL1x|| ≥ ||PL2x|| = ||x||, and since ||PL1x|| ≤ ||x||, we have ||PL1x|| = ||x||.
Therefore, ||x − PL1x||² = ||x||² − ||PL1x||² = 0 and hence x ∈ L1, i.e., L2 ⊆ L1.
Therefore, PL2x ∈ L1 for every x ∈ H, which implies that PL1PL2x = PL2x, i.e., PL1PL2 = PL2.

9.4.18 Theorem
The difference P1 − P2 of two projections is a projection operator, if and
only if P2 is a part of P1 . In this case, LP1 −P2 is the orthogonal complement
of LP2 in LP1 .

Proof: If P1 − P2 is a projection operator, then so is I − (P1 − P2 ) =


(I − P1 ) + P2 .
Then, by lemma 9.4.13, (I − P1 ) and P2 are mutually orthogonal, i.e.,
(I − P1 )P2 = 0 i.e. P1 P2 = P2 showing that P2 is a part of P1 .
Conversely, let P2 be a part of P1 i.e. P1 P2 = P2 or (I − P1 )P2 = 0, i.e.,
I − P1 is orthogonal to P2 .
Therefore, by lemma 9.4.13, (I − P1 ) + P2 is a projection operator and
I − [(I − P1 ) + P2 ] = P1 − P2 is aslo a projection operator. The condition
P1 P2 = P2 implies that P1 − P2 and P2 are orthogonal. Then, because of
lemma 9.4.13, LP1 = LP1 −P2 + LP2 .

9.5 Positive Operators, Square Roots of a


Positive Operator
9.5.1 Definition: A non-negative operator, a positive operator
4+
A self-adjoint operator, A in H over ( ) is said to be non-negative if
Ax, x ≥ 0 for all x ∈ H and is denoted by A ≥ 0. A self-adjoint operator
4+
A in H over ( ) is said to be positive if Ax, x ≥ 0 for all x ∈ H and
Ax, x = 0 for at least one x and is written as A > 0.
9.5.2 Definition: stronger and smaller operator
If A and B are self-adjoint operators and A − B is positive, i.e.,
A − B > 0, then A is said to be greater then B or B is smaller then
A and expressed as A > B.
9.5.3 Remark
The relation ≥ on the set of self-adjoint operators on H is a partial
order. The relation is
(i) reflexive, i.e., A ≥ A
(ii) transitive, i.e., A ≥ B and B ≥ C ⇒ A ≥ C
(iii) antisymmetric, i.e., A ≥ B and B ≥ A ⇒ A = B.
Elements of Spectral Theory of Self-Adjoint Operators. . . 335

(iv) A ≥ B, C ≥ D ⇒ A + C ≥ B + D.
(v) For any A, AA∗ and A∗ A are non-negative.

For x ∈ H, AA∗ x, x = A∗ x, A∗ x = ||A∗ x||2 ≥ 0


A∗ Ax, x = Ax, Ax = ||Ax||2 ≥ 0.

(vi) If A ≥ 0 and A−1 exists, then A−1 > 0. A ≥ 0 ⇒ Ax, x ≥ 0.


Now, A−1 exists ⇒ Ax = θ ⇒ x = θ.
Hence, A−1 x, x = 0 ⇒ x = θ, for if x is non-null, A−1 x is non-null
and < A−1 x, x > 0. Thus, A−1 x, x > 0 for non-null x.
(vii) If A and B are positive operators and the composition AB exists, then
AB may not be a positive operator. For example, let H = 2 ( 2 ), and 4 +
A(x1 , x2 )T = (x1 + x2 , x1 + 2x2 )
B(x1 , x2 )T = (x1 + x2 , x1 + x2 ).
AB(x1 , x2 )T= (2x1 + 2x2 , 3x1 + 3x2 )
BA(x1 , x2 )T = (2x1 + 3x2 , 2x1 + 3x2 )
for all (x1 , x2 ) ∈ 4 × 4 (+ × + )
AB is not a positive operator since AB is not self-adjoint.

9.5.4 Example
Let us consider the symmetric operator B
d2 u
Bu = −
dx2
the functions u(x) being subject to the boundary conditions u(0) = u(1) =
0, the field Ω being the segment 0 < x < 1.
D(B) = {u(x) : u(x) ∈ C 2 (0, 1), u(0) = u(1) = 0}. Take H = L2 ([0, 1]).
Then, for all u, v ∈ D(B),
 1  1
d2 x d2 v
Bu, v = v(x)dx = − u(x)dx = Bv, u
0 dx2 0 dx2

Hence, B is symmetric. Therefore,


 1  2  x=1  1  2
du du du
Bu, u = dx − u = dx ≥ 0.
0 dx dx x=0 0 dx
9.5.5 Theorem
If two positive self-adjoint operators A and B commute, then their
product is also a positive operator.
Proof: Let us put
A
A1 = , A2 = A1 − A21 , . . . An+1 = An − A2n , . . . and show that
||A||
0 ≤ An ≤ I for every n. (9.11)
336 A First Course in Functional Analysis

The above is true for n = 1. Let us suppose that (9.11) is true for n = k.
Then A2k (I − Ak )x, x = (I − Ak )Ak x, Ak x ≥ 0 since (I − Ak ) is a
positive operator. Hence A2k (I − Ak ) ≥ 0.
Analogously Ak (I − Ak )2 ≥ 0.
Hence, Ak+1 = A2k (I − Ak ) + Ak (I − Ak )2 ≥ 0
and I − Ak+1 = (I − Ak ) + A2k ≥ 0.
Consequently, (9.11) holds for n = k + 1.
Moreover, A1 = A21 + A2 = A21 + A22 + A3 = · · ·
= A21 + A22 + · · · + A2n + An+1 ,

n
whence A2k = A1 − An+1 ≤ A1 , since An+1 ≥ 0,
k=1
n
that is, Ak x, Ak x ≤ A1 x, x.
k=1


Consequently, the series ||Ak x||2 converges and ||Ak x|| → 0 as
k=1
k → ∞.


n
Hence, A2k x = A1 x − An+1 x → A1 x as n → ∞ (9.12)
k=1
since B commutes with A and hence with A1 .
BA2 = B(A1 − A21 ) = BA1 − BA21
= A1 B − A21 B = (A1 − A21 )B,
i.e., B commutes with A2 .
Let B commute with Ak , k = 1, 2, . . . , n.
BAn+1 = B(An − A2n ) = An B − An BAn
= (An − A2n )B = An+1 B.
Hence B commutes with Ak , k = 1, 2, . . . , n, . . ..


n
ABx, x = ||A||BA1 x, x = ||A|| lim BA2k x, x .
n→∞
k=1


n
= ||A|| lim BAk x, Ak x ≥ 0
n→∞
k=1
Using (9.12) ABx, x = ||A||BA1 x, x = BAx, x.
9.5.6 Theorem
If {An } is a monotone increasing sequence of mutually commuting self-
adjoint operators, bounded above by a self-adjoint operator B commuting
Elements of Spectral Theory of Self-Adjoint Operators. . . 337

with all the


An : A1 ≤ A2 ≤ · · · ≤ An ≤ · · · ≤ B (9.13)
then the sequence {An } converges pointwise to a self-adjoint operator A and
A ≤ B.
Proof: Let Cn = B −An . Since B and An for all n are self-adjoint operator
{Cn } is a sequence of self-adjoint operators. Cn x, x = (B − An )x, x ≥ 0
because of (9.13). Hence, Cn is a non-negative operator.
Moreover, Cn Cm = (B − An )(B − Am ) = B 2 − BAm − An B + An Am
= B 2 − Am B − BAn + Am An = (B − Am )(B − An )
since B commutes with An for all n and An Am commute. Hence,
Cn Cm = Cm Cn . Moreover, {Cn } forms a monotonic decreasing sequence.
Consequently, for m < n, the operator (Cm − Cn )Cm and Cn (Cm − Cn ) are
also positive. Moreover,
2
Cm x, x ≥ Cm Cn x, x ≥ Cn2 x, x ≥ 0
This implies that the monotonic decreasing non-negative numerical
sequence {Cn2 x, x} has a limit. Hence, it follows from the above inequality,
{Cm Cn x, x} also tends to the same limit as n, m → ∞. Therefore,
||Cm x − Cn x||2 = (Cm − Cn )2 x, x) = Cm
2
x, x − 2Cm Cn x, x
+Cn2 x, x → 0 as n, m → ∞.
Thus, the sequence {Cn x} and thereby also {An x} converges to some limit
Ax for arbitrary x, that is Ax = lim An x. Hence A is a self-adjoint
n→∞
operator, satisfying A ≤ B.
9.5.7 Remark
If, in theorem 9.5.6, the inequality (9.13) is replaced by A1 ≥ A2 ≥
· · · ≥ An ≥ · · · B and the other conditions remain unchanged, then the
conclusion of theorem 9.5.6 remains unchanged, except that A ≥ B.
9.5.8 Square roots of non-negative operators
9.5.9 Definition: square root
The self-adjoint operator B is called a square root of the non-negative
operator A, if B 2 = A.
9.5.10 Theorem
There exists a unique positive square root B of every positive self-adjoint
operator A; it commutes with every operator commuting with A.

Proof: Without loss of generality, it can be assumed that A ≤ I. Let us


put B0 = 0, and
1
Bn+1 = Bn + (A − Bn2 ), n = 0, 1, 2, . . . (9.14)
2
338 A First Course in Functional Analysis

Suppose that Bk is self-adjoint, positive and commutes with every


operator commuting with A, for k = 1, 2, . . . n.
1
Then Bn+1 x, x = [Bn + (A − Bn2 )]x, x
2
1 1
= Bn x, x + Ax, x − (Bn2 x, x)
2 2
1 1
= x, Bn x + x, Ax − x, Bn2 x
2 2
= x, Bn+1 x.
1
Bn+1 x, x = Bn x, x + (A − Bn2 )x, x (9.15)
2
1 2 1
Now, (I − Bn+1 ) = (I − Bn ) + (I − A) (9.16)
2 2
1
and Bn+1 − Bn = [(I − Bn−1 ) + (I − Bn )](Bn − Bn−1 ) (9.17)
2
1
Now, B0 = 0, B1 = A ≤ I. Also B1 > 0.
2
Therefore, it follows from (9.16) and (9.17) that Bn ≤ I and Bn ≤ Bn+1
for all n. Thus it follows from (9.15) that Bn+1 is self-adjoint, positive
and commutes with every operator commuting with A. Thus {Bn } is a
monotonic increasing sequence bounded above. This sequence converges in
limit to some self-adjoint positive operator B. Taking limits in (9.14) we
have,
1
B = B + (A − B 2 ) that is B 2 = A.
2
Finally, B commutes with every operator that commutes with A. This
is because that Bn possesses the above property. Thus B is the positive
square root of A. If B is not unique let B1 be another square root of A.
Then B 2 − B12 = 0.

Therefore, (B 2 − B12 )x, y = 0 or (B + B1 )(B − B1 )x, y = 0.


Let us take y such that (B − B1 )x = y.
Then 0 = (B + B1 )y, y = By, y + B1 y, y.
Since B and B1 are positive, By, y = B1 y, y = 0.

However, since the roots are positive, we have B = C 2 where C is a


self-adjoint operator. Since

||Cy||2 = C 2 y, y = By, y = 0, hence Cy = 0.

Consequently, By = C(Cy) = 0 and analogously B1 y = 0.


However, then, ||B1 x − Bx||2 = (B − B1 )2 x, x = (B − B1 )y, x = 0, that
is, Bx = B1 x for every x ∈ H and the uniquences of the square root is
proved.
Elements of Spectral Theory of Self-Adjoint Operators. . . 339

9.5.11 Example
Let H = L2 ([0, 1]). Let the operator A be defined by Ax(t) =
1
tx(t), x(t) ∈ L2 ([0, 1]). Then, ||Ax||2 = Ax, Ax = 0 t2 x2 (t)dt ≤ ||x||2 .
Hence, A is bounded.
 1  1  1 √ √
2 2
Ax, x = tx(t)·x(t)dt = tx (t)dt = ( tx(t))( tx(t))dt = (Bx, Bx)
0 0 0


where Bx(t) = + tx(t).
Problems
1. Suppose A is linear and maps a complex Hilbert space H into itself.
Then, if Ax2 , x ≥ 0andAx, x = 0 for each x ∈ H, show that A = 0.
2. Let {u1 , u2 , . . .} be an orthonormal system in a Hilbert space H, T ∈
(H → H) and ai,j = T uj , ui , i, j = 1, 2, . . . Then show that the
matrix {ai,j } defines a bounded linear operator Q on H with respect
to u1 , u2 , u3 · · · . Show further that Q = P T P , where

Px = x, uj uj , x ∈ H.
j

If u1 , u2 , u3 . . . constitute an orthonormal basis for H, then prove that


Q = T.
3. Let P and Q denote Fredholm integral operators on H = L2 ([a, b])
with kernels p(·, ·) and q(·, ·) in L2 ([a, b] × [a, b]), respectively. Then
show that P = Q if and only if p(·, ·) and q(·, ·) are equal almost
everywhere on [a, b] × [a, b]. Further, show that P Q is a Fredholm
integral operator with kernel
 b
p ◦ q(s, t) = p(s, u)q(u, t)dμ(u), (s, t) ∈ [a, b] × [a, b]
a
and that ||P Q|| ≤ ||p ◦ q(s, t)||2 ≤ ||p||2 ||q||2

(Hint: To find P Q use Fubini’s theorem (sec. 10.5).)


4. Consider the shift operator A and a multiplication operator B on l2
such that

0 if n = 0
Ax(n) =
x(n − 1) if n ≥ 1.
Bx(n) = (n + 1)−1 x(n) if n ≥ 0.

Put C = AB. Show that C is a compact operator which has no


eigenvalue and whose spectrum consists of exactly one point.
340 A First Course in Functional Analysis

5. Let λ1 ≤ λ2 ≤ · · · λn be the first n consecutive eigenvalues of


a self-adjoint coercive operator ‘A’ mapping a Hilbert space H
into itself and let u1 , u2 , . . . , un be the corresponding orthonormal
eigenfunctions. Let there exist a function u = un+1 = 0 which
maximizes the functional
Au, u
, u ∈ DA ⊆ H,
u, u

under the supplementary conditions, u, u1  = 0, u, u2  =


0 · · · u, un  = 0.
Then show that un+1 is the eigenfunction corresponding to the
eigenvalue
Aun+1 , un+1 
λn+1 = .
un+1 , un+1 
Show further that λn ≤ λn+1 .
(Hint: A symmetric non-negative operator ‘A’ is said to be coercive
if there exists a non-negative number α such that Au, u ≥
αu, u ∀ u ∈ D(A), α > $.)
6. Let D(A) be the subspace of a Hilbert space H 1 ([0, 1]) of functions
u(x) with continuous first derivatives on [0,1] with u(0) = u(1) = 0
d2
and A = − 2 .
dx
Find the adjoint A∗ and show that A is symmetric.
1
(Hint: u, v = 0 u(x)v(x)dx for all u, v ∈ H 2 .)
7. (Elastic bending of a clamped beam) Let
 
d2 d2 u
Au = 2 b(x) 2 + ku = f (x), k > 0, 0 < x < L.
dx dx

subject to the boundary conditions,

du
u= = 0 at x = 0 and x = L
dx
Show that the operator A is symmetric on its domain.
8. For x ∈ L2 [0, ∞[, consider
;  ∞
2 d sin us
U1 (x)u = x(s)dμ(s) [see ch. 10]
π du 0 s
;  ∞
2 d 1 − cos us
U2 (x)u = x(s)dμ(s)
π du 0 s
Elements of Spectral Theory of Self-Adjoint Operators. . . 341

Show that
(i) U1 (x)u and U2 (x)u are well-defined for almost all u ∈ [0, ∞)
(ii) U1 (x), U2 (x) ∈ L2 [0, ∞[
(iii) The mappings U1 and U2 are bounded operators on L2 [0, ∞[,
which are self-adjoint and unitary.
9. Let A ∈ (H → H) be self-adjoint. Then show that
(i) A2 ≥ 0 and A ≤ ||A||I.
(ii) if A2 ≤ A then 0 ≤ A ≤ I.

9.6 Spectrum of Self-Adjoint Operators


Let us consider the operator Aλ = A − λI, where A is self-adjoint and λ a
complex number.
In sec. 4.7.17 we defined a resolvent operator and regular values of
an operator. By theorem 4.7.13, if ||(1/λ)A|| < 1 (that is, if |λ| > ||A||),
then λ is a regular value of A and consequently, then entire spectrum of
A lies inside and on the boundary of the disk |λ| ≤ ||A||. This is true for
arbitrary linear operators acting into a Banach space. For a self-adjoint
operator defined on a Hilbert space, the plane comprising the spectrum of
the operator is indicated more precisely below.
9.6.1 Lemma
Let A be a self-adjoint linear operator in a Hilbert space over ( ). 4+
Then all of its eigenvalues are real.
Let x = θ be an eigenvector of A and λ the corresponding eigenvalue.

Then, Ax = λx.
Pre-multiplying both sides with x∗
we have x∗ Ax = λx∗ x. (9.18)
Taking adjoint of both sides we have,
x∗ A∗ x = λx∗ x. (9.19)
From (9.18) and (9.19) it follows that
x∗ Ax = λx∗ x = λx∗ x, showing that λ = λ i.e. λ is real.

9.6.2 Lemma
Eigenvectors belonging to different eigenvalues of a self-adjoint operator
4+
in a Hilbert space H over ( ) are orthogonal.
Let x1 , x2 be two eigenvectors of a self-adjoint operator corresponding
to different eigenvalues λ1 and λ2 .
342 A First Course in Functional Analysis

Then we have
Ax1 = λ1 x1 , (9.20)
Ax2 = λx2 (9.21)
Premultiplying (9.20) by x∗2 and (9.21) with x∗1 we have,

x∗2 Ax1 = λ1 x∗2 x1


x∗1 Ax2 = λ2 x∗1 x2
Therefore, x∗2 Ax1 = x∗1 Ax2 = λ1 x∗1 x2 = λ2 x∗2 x1 .
Since A is self-adjoint λ1 , λ2 are real.
Therefore, (λ1 − λ2 )x∗1 x2 = 0.
Since λ1 = λ2 , x∗1 x2 = 0 i.e. x1 ⊥ x2 .

9.6.3 Theorem
For the point λ to be a regular value of the self-adjoint operator A, it is
necessary and sufficient that there is a positive constants C, such that

||(A − λI)x|| = ||Aλ x|| = ||Ax − λx|| ≥ C||x|| (9.22)

for every x ∈ H over 4(+).


Proof: Suppose that Rλ = A−1 λ is bounded and ||Rλ || = K. For every
x ∈ H, we have ||x|| = ||Rλ Aλ x|| ≤ K||Aλ x|| whence ||Aλ x|| ≥ (1/K)||x||,
proving the conditions is necessary.
We next want to show that the condition is sufficient. Let y = Ax − λx
and x run through H. Then y runs through some linear subspace L. By
(9.22) there is a one-to-one correspondence between x and y. For if x1 and
x2 correspond to the same element y,

we have A(x1 − x2 ) − λ(x1 − x2 ) = 0,


 
1
whence ||x1 − x2 || ≤ ||Aλ (x1 − x2 )| = 0 (9.23)
C

We next show that L is everywhere dense in H. If it were not so, then


there would exist a non-null element x0 ∈ H such that x0 , y = 0 for
every y ∈ H. Hence x0 , Ax − λx = 0 for every x ∈ H. In other words,
(A − λ)x0 , x = 0, A being self-adjoint. Hence Ax0 − λx0 , x = 0 for
non-zero x0 and for every x ∈ H. It then follows that

Ax0 − λx0 , x = 0.

The above equality is impossible, either for complex λ because the


eigenvalues of a self-adjoint operator A are real. If λ is real, i.e., λ = λ,
then we have from (9.23)
Elements of Spectral Theory of Self-Adjoint Operators. . . 343

||x0 || ≤ (1/C)||Ax0 − λx0 || = 0.


Next, let {yn } ⊂ L, yn = Aλ xn and {yn } → y0 .
   
1 1
By (9.22) ||xn − xm || ≤ ||Aλ xn − Aλ xm || = ||yn − ym ||.
C C
{yn } is a Cauchy sequence and hence ||yn − ym || → 0 as n, m → ∞.
However, then ||xn − xm || → 0 as n, m → ∞. Since H is a complete space,
there exists a limit for {xn } : x = lim xn . Moreover,
r
Aλ x = lim Aλ xn = lim yn = y i.e. y ∈ L.
n n
Thus L is a closed subspace everywhere dense in H, i.e., L = H. In
addition, since the correspondence y = Aλ x is one-to-one, there exists an
inverse operator x = A−1 λ y = Rλ y defined on the entire H. Inequality
(9.22) yields
   
1 1
||Rλ y|| = ||x|| ≤ ||Aλ x|| = ||y||,
C C

1
i.e. Rλ is a bounded operator and ||Rλ || ≤ .
C
9.6.4 Corollary
The point λ belongs to the spectrum of a self-adjoint operator A if and
only if there exists a sequence {xn } such that

||Axn − λxn || ≤ Cn ||xn ||, Cn → 0 as n → ∞. (9.24)

If we take ||xn || = 1, then (9.24) yields

||Axn − λxn || → 0, ||xn || = 1 (9.25)

9.6.5 Theorem
The spectrum of a self-adjoint operator A lies entirely on a segment
[m, M ] of the real axis, where

M = sup Ax, x and m = inf Ax, x.


||x||=1 ||x||=1

Proof: Since A is self-adjoint


Ax, x = x, Ax = Ax, x i.e. Ax, x is real.
|Ax, x| ||Ax|| ||x||
Also, ≤ ≤ ||A||
||x|| 2 ||x||2
Then, CA = sup Ax, x| ≤ ||A||.
||x||=1

On the other hand, for every y ∈ H it follows from Lemma 9.3.2 that
Ay, y ≤ CA ||y||2 .
344 A First Course in Functional Analysis

Ax, x Ax, x
Let m = inf , M = sup (9.26)
x =θ ||x||2 x =θ ||x||
2

Ax, x
i.e., m≤ ≤M
||x||2
Let λ1 be any eigenvalue of A and x1 the corresponding eigenvector.
Then Ax1 = λ1 x1 and m ≤ λ1 ≤ M .

Now if y = Aλ x = Ax − λx, then y, x = Ax, x − λx, x;


x, y = y, x = Ax, x − λx, x.
Hence, x, y − y, x = (λ − λ)x, x = 2iβ||x||2 , where λ = α + iβ
or, 2|β|| ||x||2 = |x, y − y, x| ≤ |x, y| + |y, x| ≤ 2||x|| ||y||
and therefore, ||y|| ≥ |β|||x||, that is, ||Aλ x|| ≥ |β| ||x|| (9.27)

Since β = 0, it follows from theorem 9.6.3, that λ = α + iβ with β = 0 is a


regular value of the self-adjoint operator A.
In view of the above result we can say that the spectrum can lie on the
real axis. We next want to show that if λ lie outside [m, M ] on the real line
then it is a regular value.

For example, if λ > M, then λ = M + k with k > 0.


We have Aλ x, x = Ax, x − λx, x ≤ M x, x − λx, x = −k||x||2
where |Aλ x, x| ≥ k||x||2
On the other hand, |Aλ x, x| ≤ ||Aλ x|| ||x||. Thus ||Aλ x|| ≥ k||x||
showing that λ is regular. Similar arguments can be put forward if λ ≤ m.

9.6.6 Theorem
M and m belong to the point spectrum.
Proof: If A is replaced by Aμ = A − μI, then the spectrum is shifted
by μ to the left and M and m change to M − μ and m − μ respectively.
Thus without loss of generality it can be assumed that 0 ≤ m ≤ M . Then
M = ||A|| [see lemma 9.3.2].
We next want to show that M is in the point spectrum. Since M = ||A||,
we can consider a sequence {xn } with ||xn || = 1 such that

Axn , xn  = M − n , n → 0 as n → ∞.
Further, ||Axn || ≤ ||A|| ||xn || = ||A|| = M .
Therefore, ||Axn − M xn ||2 = Axn − M xn , Axn − M xn 
= Axn , Axn  − 2M Axn , xn  + M 2 ||xn ||2 .
= ||Axn ||2 − 2M (M − n ) + M 2 ≤ M 2 − 2M (M − n ) + M 2 .
= 2M n .
Elements of Spectral Theory of Self-Adjoint Operators. . . 345


Hence, ||Axn − M xn || = 2M n .
Therefore, ||Axm − M xn || → 0 as n → ∞ and ||xn || = 1.

Using corollary 9.6.4, we can conclude from the above that M belongs to
the spectrum. Similarly, we can prove that m belongs to the spectrum.
9.6.7 Examples
1. If A is the identity operator I, then the spectrum consists of
the single
 eigenvalue
 1 for which the corresponding eigenspace H1 = H.
1
Rλ = I is a bounded operator for λ = 1.
(λ − 1)
2. The operator A : L2 ([0, 1]) → L2 ([0, 1]) is defined by Ax = tx(t), 0 ≤
t ≤ 1.
Example 9.5.11 shows that A is a non-negative operator. Here m = 0
and M ≤ 1. Let us show that all the points of the segment [0, 1] belong to
the spectrum of A, implying that M = 1.
Let 0 ≤ λ ≤ 1 and > 0. Let us consider the interval [λ, λ + ] or
[λ − , λ] lying in [0, t].

⎨ √1 for t ∈ [λ, λ + ]
x (t) =
⎩ 0 for t ∈ [λ, λ + ]
 1  λ+
1
Since x2 (t)dt = dt = 1.
0 λ
Hence, x (t) ∈ L2 ([0, 1]), ||x || = 1.
Furthermore, Aλ x (t) = (t − λ)x (t).

2 1 λ+ 2
Therefore, ||Aλ x (t)|| = (t − λ)2 dt = .
λ 3

We have ||Aλ x || → 0 as → 0. Consequently, for λ, 0 ≤ λ ≤ 1 is in the


point spectrum.
At the same time, the operator has no eigenvalues. In fact, Aλ x(t) =
(t − λ)x(t).
If Aλ x(t) = 0, then (t − λ)x(t) → 0 almost everywhere on [0, 1] and
thus x(t) is also equal to zero, almost everywhere.

9.7 Invariant Subspaces


9.7.1 Definition: invariant subspace
A subspace L of H is called invariant under an operator A, if x ∈ L ⇒
Ax ∈ L.
346 A First Course in Functional Analysis

9.7.2 Example
Let λ be the eigenvalue of A, and Nλ the collection of eigenvectors
corresponding to this eigenvalue which includes zero as well.
Since Ax = λx, x ∈ Nλ ⇒ Ax ∈ Nλ . Hence Nλ is an invariant
subspace.
9.7.3 Remark
If the subspace L is invariant under A, we say that L reduces the
operator A.
9.7.4 Lemma
For self-adjoint A, the invariance of L implies the invariance of its
orthogonal complements, M = H − L.
Let x ∈ M , implying x, y = 0 for every y ∈ L. However, Ay ∈ L for
y ∈ L, and x, Ay = 0, i.e., Ax, y = 0 for every y ∈ L. Hence x ⊥ L and
Ax ⊥ L implies M is invariant under A. Moreover, M = H − L. Let Gλ
denote the range of the operator Aλ , i.e., the collection of all elements of
the form y = Ax − λx, λ an eigenvalue. We want to show that

H = Gλ + Nλ . Let y ∈ Gλ , u ∈ Nλ ,
then y, u = Ax − λx, u = x, Au − λu = x, 0 = 0.
Consequently, Gλ ⊥ Nλ . If y ∈ Gλ and y ∈ Gλ ,
then y = lim yn , where yn ∈ Gλ · yn , u = 0
n
⇒ y, u = limn yn , u = 0.
Consequently, Gλ ⊥ Nλ .
Now, let y, u = 0 for every y ∈ Gλ . For any x ∈ H,
0 = Ax − λx, u = x, Au − λu ⇒ Au = λu
since x is arbitrary. Therefore, u ∈ Nλ .
Consequently, Nλ = H − Gλ = H − Gλ .

9.7.5 Lemma
Gλ is an invariant subspace under a self-adjoint operator A where Gλ
stands for the range of the operator Aλ .
Proof: Let N denote the orthogonal sum of all the subspaces Nλ , i.e., a
closed linear span of all the eigenvectors of the operator A. If H is separable,
then it is possible to construct in every Nλ a finite or countably orthonormal
system of eigenvectors which span Nλ for a particular λ. Since the
eigenvectors of distinct members of Nλ are orthogonal, by combining these
systems, we obtain an orthogonal system of eigenvectors {en }, contained
completely in the span N .
The operator A defines in the invariant subspace L an operator AL in
Elements of Spectral Theory of Self-Adjoint Operators. . . 347

(L → L); namely AL x = Ax for x ∈ L. It can be easily seen that AL is


also a self-adjoint operator.
9.7.6 Lemma
If the invariant subspace L and M are orthogonal complements of each
other, then the spectrum of A is the set-theoretic union of the spectra of
operators AL and AM .
Proof: Let λ belong to the point spectrum of AL (or AM ). Then,
there is a sequence of elements {xn } ⊆ L(or M ) such that ||xn || =
1, ||AL,λ xn || → 0(||AM,λ xn || → 0). However, ||AL,λ xn || = ||Aλ xn ||
(||AM,λ xn || → ||Aλ xn ||). Hence, λ belongs to the spectrum of A.
Now, let λ belong to the spectrum of neither AL nor AM . Then, there
is a positive number C, such that ||Aλ y|| = ||AL,λ y|| ≥ C||y||, ||AM,λ z|| ≥
C||z||, for any y ∈ L and z ∈ M . However, every x ∈ H has the form
x = y + z with y ∈ L and z ∈ M , and ||x||2 = ||y||2 + ||z||2 . Hence,
1
||Aλ x|| = ||Aλ y + Aλ z|| = (||Aλ y||2 + ||Aλ z||2 ) 2
1
≥ C(||y||2 + ||z||2 ) 2 = C||x||.

Thus λ is not in the point spectrum of A.

9.8 Continuous Spectra and Point Spectra


It has already been shown that a Hilbert space H can be represented as
the orthogonal sum of two spaces, N , a closed linear hull of the set of all
eigenvectors of a self-adjoint operator A, and its orthogonal complement G.
Thus H = N ⊕ G.
9.8.1 Definition: discrete or point spectrum
The spectrum of AN is called discrete or point spectrum if N is the
closed linear hull of all eigenvectors of a self-adjoint operator A.
9.8.2 Definition: continuous spectrum
The spectrum of the operator AG is called continuous spectrum of
A if G is the orthogonal complement of N in H.
9.8.3 Remark
(i) If N = H, then A has no continuous spectrum and A has a pure
point spectrum.
This happen in the case of compact operators.
(ii) If H = G, then A has no eigenvalues and the operator A has a
purely continuous spectrum. The operator in example 2 of section
9.6.7 has a purely continuous spectrum.
348 A First Course in Functional Analysis

9.8.4 Spectral radius


Let A be a bounded linear operator mapping a Banach space Ex into
itself. The spectral radius of A is denoted by rσ (A) and is defined as
rσ (A) = Sup {|λ|, λ ∈ σ(A)}.
Thus, all the eigenvalues of the operator A lie within the disc with origin
as centre and rσ (A) as radius.
9.8.5 Remark
Knowledge of spectral radius is very useful in numerical analysis.
We next find the value of the spectral radius in terms of the norm of the
operator A.
9.8.6 Theorem
Let Ex be a complex Banach space and let A ∈ (Ex → Ex ).
1
Then rσ (A) = lim ||An || n .
n→∞
Proof: Note that for any 0 = λ ∈ 4(+), we have the factorization
A − λ I = (A − λI)p(A) = p(A)(A − λI)
n n

where p(A) is a polynomial in A. If follows from the above that if An − λn I


has a bounded universe in Ex , then A − λI has bounded inverse in Ex .
Therefore, λn ∈ ρ(An ) ⇒ λ ∈ ρ(A).
and so, λ ∈ σ(A) ⇒ λn ∈ σ(An ). (9.28)
Hence, if λ ∈ σ(A), then |λ| ≤ ||A || (by (9.28) and lemma 9.3.2).
n n

1
⇒ |λ| ≤ ||An || n for λ ∈ σ(A).
1
Hence, rσ (A) = Sup {|λ| : λ ∈ σ(A) ≤ ||An || n }.
1
This gives rσ (A) ≤ lim inf ||An || n (9.29)
n→∞
Further, in view of theorems 4.7.21, the resolvent operator is represented
by

Rλ (A) = −λ−1 λ−k Ak , |λ| ≥ ||A||.
Also, we have Ax = λx where λ is an eigenvalue and x the corresponding
eigenvector. Therefore
|λ| ||x|| = ||Ax|| ≤ ||A|| ||x||
or, |λ| ≤ ||A|| for any eigenvalue λ.
Hence, rσ (A) ≤ ||A||. Also Rσ (A) is analytic at every point λ ∈ σ(A). Let
x ∈ Ex and f ∈ Ex∗ . Then the function


g(λ) = f (Rλ (A)x) = −λ−1 f (λ−n An x).
n=0
Elements of Spectral Theory of Self-Adjoint Operators. . . 349

is analytic for |λ| > rσ (A). Hence the singularitics of the function g all be


in the disc {λ : |λ| ≤ rσ (A)}. Therefore, the series f (λ−n An x) forms
n=1
a bounded sequence. Since this is true for every f ∈ Ex∗ , an application
of uniform boundedness principle (theorem 4.5.6) shows that the elements
λ−n An form a bounded sequence in (Ex → Ex ). Thus,
||λ−n An || ≤ M < ∞
for some positive constant M (depending on λ).
1 1 1
Hence, ||An || n ≤ M n |λ| ⇒ lim Sup ||An || n ≤ |λ|.
n→∞
Since λ is arbitrary with |λ| ≤ rσ (A), it follows that
1
lim Sup ||An || n ≤ rσ (A). (9.30)
n→∞
1
It follows from (9.29) and (9.30) that lim ||An || n = rσ (A).
n→∞

9.8.7 Remark
The above result was proved by I. Gelfand [19].
9.8.8 Operator with a pure point spectrum
9.8.9 Theorem
Let A be a self-adjoint operator in a complex Hilbert space and let A
have a pure point spectrum.
Then the resolvent operator Rλ = (A − λI)−1 can be expressed as
 1
Pn .
n
λn − λ
Proof: In this case, N = H and therefore there exists a closed orthonormal
system of eigenvectors {en }, such that
A en = λn en (9.31)
where λn is the corresponding eigenvalue.
Every x ∈ H can be written as


x= cn en (9.32)
n=1

where the Fourier coefficients cn are given by


cn = x, en  (9.33)
The projection operator Pn is given by
Pn x = x, en en = cn en , (9.34)
Pn denotes the projection along en .
The series (9.32) can be written as,
350 A First Course in Functional Analysis

 
x = Ix = n Pn x or in the form I = Pn (9.35)
n
We know Pn Pm = 0, m = n (9.36)
 
By (9.31) and (9.35) Ax = cn Aen = λn Pn x (9.37)
n n
We can write A in the operator form. Then, (9.36) yields,

A= λn Pn (9.38)
n
  
Thus, Ax, x =  λn cn en , cm em  = λn c2n (9.39)
n n n

Thus the quadratic form Ax, x can be reduced to a sum of squares.


Using (9.37), (9.39) can be written as,

Ax, x = λn Pn x, x (9.40)
n

If λ does not belong to the closed set {λn } of eigenvalues, then there is a
d > 0 such that |λ − λn | > d.

We have Aλ x = (A − λI)x = (λn − λ)Pn x. Since Aλ has an inverse
n
and Pn commutes with A−1
λ , we have
 
x= (λn − λ)Pn A−1
λ x= Pn x.
n n
Premultiplying with Pm we have
Pm x = (λm − λ)Pm A−1 λ x.
 1
Hence Rλ x = A−1 λ x= Pn x. (9.41)
n
λn − λ
 cn
Since Pn x = cn en , Rλ x = en (9.42)
n
λn − λ
 
 cn   cn 
 
Since  λ n − λ  ≤  d ,

12
1  2 ||x|| 1
||Rλ x|| ≤ cn = or ||Rλ || ≤ .
d n
d d

Consequently, λ does not belong to the spectrum. Now it is possible to


write (9.42) in the form
 1
Rλ = Pn . (9.43)
n
λn − λ
Elements of Spectral Theory of Self-Adjoint Operators. . . 351

9.8.10 Remark
For n dimensional symmetric (hermitian) matrices we have similar
expressions for Rλ , with the only difference that for n-dimensional matrices
the sum is finite.
Hilbert demonstrated that the class of operators with a pure point
spectrum is the class of compact operators.
Problems

1. A : H → H be a coercive operator (see 9.5, problem 5).


(i) Show that [u, v] = Au, v, ∀ u, v ∈ D(A) defines a scalar product
in H. If [un , u] → 0 as n → ∞, un is said to tend u in energy and
the above scalar product is called energy product.
(ii) If {φn } be a eigenfunction of the operator A and λn the
corresponding eigenvalue, show that the solution u0 of the equation
Au = f can be written in the form,

∞
f, φn 
u0 = φn .
n=1
λn

(Hint: Note that Aφn , φn  = [φn , φn ] = λn and Aφn , φm  =


[φn , φm ] = 0, for n = m, {φn } is a system of functions which is
orthogonal in energy and is complete in energy.)
2. Let A be a compact operator on H. If {un } is an infinite dimensional
orthonormal sequence in H then show that Aun → 0 as n → ∞. In
particular, if a sequence of matrices {ai,j } defines a compact operator
on l2 ,

 ∞

and Πj = |ai,j |2 and Δi = |ai,j |2 ,
i=1 j=1

show that Πj → 0 as j → ∞ and Δ → 0 as i → ∞. i

3. Let A ∈ (H → H) where H is a Hilbert space. Then show that


(i) A is normal if and only if ||Ax|| = ||A∗ x|| for every x ∈ H.
(ii) if A is normal then N (A) = N (A∗ ) = R(A)⊥ .
4. Let P ∈ (H → H) be normal, H being a Hilbert space over 4(+).
(i) Let X be the set of eigenvectors of P and Y the closure of the
span of X. Then show that Y and Y ⊥ are closed invariant subspaces
for P .
(Hint: Show that σ(P ) is a closed and bounded subset of 4(+) and
that σ(P ) = σe (P ) ∪ {μ : μ ∈ σa (P ∗ ).)
352 A First Course in Functional Analysis

5. Let A ∈ (H → H) be self-adjoint, where H is a Hilbert space over +.


Then show that its Cayley transform
(i) T (A) = (A − iI)(A + iI)−1 is unitary and 1 ∈ σ(T (A)).
6. Let a be a non-zero vector, v be a unit vector, and α = ||a||. Define
 
2 1
μ = 2α(α − v a) and u =
T
(a − αv).
μ
(i) Show that u is a unit vector and that (I − 2u uT )a = αv.
(ii) If v1 and v2 are vectors and σ is a constant, show that det (I −
σv1 v2T ) = 1 − σv1T v2 , and that det (I − 2u uT ) = −1.
7. Let x(t) ∈ C([a, b]) and K(s, t) ∈ C([a, b] × [a, b]).
 s
K n (b − a)n
If Ax(s) = K(s, t)x(t)dt, show that ||An || ≤
a n!
where K = max |K(s, t)|.
a≤s,t≤b

(Hint: For finding A2 use Fubini’s theorem (sec. 10.5).)


8. Let A denote a Fredholm integral operator on L2 ([a, b]) with kernel
K(·, ·) ∈ L2 ([a, b] × [a, b]). Then show that
(i) A is self-adjoint if and only if K(t, s) = K(s, t) for almost all (s, t)
in [a, b] × [a, b].
(ii) A is normal if and only if
 b  b
|K(u, s)K(u, t)dμ(u) = K(s, u)K(t, u)dμ(u)
a a

for almost all (s, t) in [a, b] × [a, b].


(Hint: Use Fubini’s theorem (sec. 10.5).)
9. Fix m ∈ 4(+). For (x1, x2 ) ∈ 42 (+2 ), define
A(x1 , x2 ) = (mx1 + x2 , mx2 ).
+ 12
2|m|2 + 1 + 4|m|2 + 1
Then show that ||A|| = while σ(A) =
2
{m}, so that rσ (A) = |m| < ||A||.
10. Let A ∈ (H → H) be normal. Let X be a set of eigenvectors of A,
and let Y denote the closure of the span of X. Then show that Y
and Y ⊥ are closed invariant subspaces of A.
11. Let A ∈ (H → H), H a Hilbert space.
Then show that
(i) λ ∈ ω(A) if and only if λ ∈ ω(A∗ )
Elements of Spectral Theory of Self-Adjoint Operators. . . 353

(ii) σe (A) ⊂ ω(A) and σ(A) is contained in the closure of ω(A).


ω(A) is defined as ω(A) = {Ax, x : x ∈ H, ||x|| = 1}. ω(A) is
known as the numerical range of A.
CHAPTER 10

MEASURE AND
INTEGRATION IN Lp
SPACES

In this chapter we discuss the theory of Lebesgue measure and p-integrable


4
functions on . Spaces of these functions provide some of the most concrete
and useful examples of many theorems in functional analysis. This theory
will be utilized to study some elements of Fourier series and Fourier
integrals. Before we introduce the Lebesgue theory in a proper fashion,
we point out some of the lacuna of the Riemann theory which prompted a
new line of thinking.

10.1 The Lebesgue Measure on 4


Before we introduce ‘Lebesgue measure’ and associated concepts we present
some examples.
10.1.1 Examples
1. Let S be the set of continuous functions defined on a closed interval
[a, b]. Let
 b
ρ(x, y) = |x(t) − y(t)|dt. (10.1)
a

(X, ρ) is a metric space. But it is not complete [see example in note 1.4.11].

⎪ 1

⎪ 0 if a ≤ t ≤ c −

⎨ n
Let xn (t) = 1

⎪ nt − nc + 1 if c − ≤ t ≤ c

⎪ n

1 if c ≤ t ≤ b

354
Measure and Integration in LP Spaces 355

{xn } is a Cauchy sequence. Let xn −→ x as n → ∞, then xn −→ x as


n → ∞ ⇒ x(t) = 0, t ∈ [a, c) and x(t) = 1 for t ∈ (c, b], (see note 1.4.11).
 b
The above example shows that |x(t)−y(t)|dt can be used as a metric
a
of a wider class of functions, namely the class of absolutely integrable
functions.
2. Consider a sequence {fn (t)} of functions defined by

fn (t) = lim [cos(πn!)t]2m (10.2)


m→∞

k
Thus, fn (t) = 1 if t = , k = 0, 1, 2, . . . , n!
n!
= 0 otherwise

 1
Define, ρ(fn , fm ) = |fn (t) − fm (t)|dt.
0
Now, for any value of n, ∃ some common point at which the functional
values of fn , fm take the value 1 and hence their difference is zero. On
the other hand, at the remaining points {(n! + 1) − (m! + 1)} the value
fn (t) − fm (t) is equal to 1 or −1. But the number of such types of functions
is finite and hence fn − fm = 0 only for a finite number of points. Thus
ρ(fn , fm ) = 0.
Hence, {fn } is a Cauchy sequence. Hence, {fn (t)} tends to a function
f (t) s.t.

f (t) = 1, at all rational points in 0 ≤ t ≤ 1
(10.3)
= 0, irrational points in (0, 1)

Therefore,
 1 if we consider the integration in the Riemann sense, the
integral |f (t)|dt does not exist. Hence the space is not complete. We
0
would show later that if the integration is taken in the Lebesgue sense, then
the integral exists.
3. Let us define a sequence {Ωn } of sets as follows:

Ω0 = [0, 1]

1
Ω1 = Ω0 with middle open interval of length 4 removed.
Ω2 = Ω1 with middle open interval of the component intervals of Ω1
removed, each of length 412 .
Then by induction we have already defined Ωn so as to consist of 2n
disjoint closed intervals of equal length. Let Ωn+1 = Ωn with middle open
1
intervals of the component intervals of Ωn removed, each of length 4n+1 .
356 A First Course in Functional Analysis

For each n = 1, 2, . . ., the sum of the lengths of the component open


intervals of Ωn is given by

n−1
1 i 1 1
m(Ωn ) = 1 − 2 = + n+1 . (10.4)
i=0
4i+1 2 2

For every n = 1, 2, . . ., let xn be defined by


3
1 if t ∈ Ωn
xn (t) = (10.5)
0 if t ∈
/ Ωn

It may be seen that {xn } is a Cauchy sequence. Let m > n. Then


 1
ρ(xm , xn ) = |xm (t) − xn (t)|dt
0
= m(Ωn ) − m(Ωm )
1 1
= − m+1
2n+1 2
We shall show that {xn } does not converge to any Riemann integrable
function. Let us suppose that there exists a Riemann integrable function x
and that  1
lim |x(t) − xn (t)|dt = 0. (10.6)
n→∞ 0
Let J1 be the open interval removed in forming Ω1 , J1 , J2 , J3 the open
intervals removed in forming Ω2 etc.
For each l = 1, 2, . . . there is an N so that n > N implies xn (t) = 0,
t ∈ Jl .
It follows x is equivalent to a function which is identically zero on

*
V = Jl
l=1

But the lower Riemann integral of such a function is zero. Since x is


integrable,  1
x(t)dt = 0.
0
But (10.4) yields that
 1
1
xn (t)dt >
0 2
This contradicts (10.6).
Thus, the space of absolutely integrable functions when integration is
used in the Riemann sense is not complete. This may be regarded as a major
defect of the Riemann integration. The definition of Lebesgue integration
overcomes this defect and other defects of the Riemann integration.
Measure and Integration in LP Spaces 357

10.1.2 Remark
In example 2 (10.1.1) it may seen that {fn } → 1 at all rational points
in [0, 1] and −→ 0 at all irrational points in [0, 1].
It is known that the set of rational points in [0, 1] can be put into one-to-
one correspondence with the set of positive integers, i.e., the set of natural
numbers [see Simmons [53]]. Hence the set of rational numbers in [0, 1]
forms a countable set. Thus, the set of rational numbers in [0, 1] can be
written as a sequence {r1 , r2 , r3 , . . .}.
Let be any positive real number. Suppose we put an open interval
of width about the first rational number r1 , an interval of width /2
about r2 and so on. About rn we put an open-interval of width /2n−1 .
Then we have an open interval of some positive width about every rational
number in [0, 1]. The sum of the widths of these open intervals is
+ 2 + 22 + · · · + 2n + · · · = 2 . We conclude from all this that all
rational numbers in [0, 1] can be covered with open intervals, the sum of
whose length is an arbitrarily small positive number.
We say that the Lebesgue measure of the R of rational numbers in
[0, 1] is l · m · (R) = 0. This means that the greatest lower bound of the
total lengths of a set of open intervals covering the rational number is zero.
The Lebesgue measure of the entire interval [0, 1] is l · m · [0, 1] = 1. This
is because the greatest lower bound of the total length of any set of open
intervals covering the whole set [0, 1] is 1.
Now if we remove the rational numbers in [0, 1] from [0, 1] we are left
with the set of irrational numbers the Lebesgue measure of which is 1.
Thus, if we delete from [0, 1] the set M of rational numbers, whose
 1
Lebesgue measure is zero, we can find L f (t)dt in example 2 above, i.e.,
0

  1
f (t)dt = 1 · L f (t)dt
[0,1]−M 0

denotes the integration in the Lebesgue sense. The above discussion may
be treated as a prelude to a more formal treatment ahead.
10.1.3 The Lebesgue outer measure of a set E ⊂ R
4
The Lebesgue outer measure of a set E ⊆ is denoted by m∗ (E) and
is defined as
3∞ ∞

 *
m∗ (E) = g · l · b l(In ) : E ⊂ In ,
n=1 n=1

where In is an open interval in 4 and l(In) denotes the length of the interval
In .
358 A First Course in Functional Analysis

10.1.4 Simple results



(i) m (φ) = 0
(ii) m∗ (A) ≥ 0 for all A ⊂ 4
∗ ∗
(iii) m (A1 ) ≤ m (A2 ) for A1 ⊆ A2 ⊂ 4


 
(iv) m ∗
An ≤ m∗ (An ) for all subsets A1 , A2 , . . . , An . . . ⊂ 4
n=1 n=1

(v) m∗ (I) = l(I) for any interval I ⊆ 4


(vi) Even when A1 , A2 , . . . , An . . . are pairwise disjoint subsets of 4, we
may not have


* 

m An = m∗ (An ).
n=1 n=1

10.1.5 Definition: Lebesgue measurable set, Lebesgue measure


of such a set
A set S ⊆ 4 is said to be Lebesgue Measurable if
m∗ (A) = m∗ (A ∩ S) + m∗ (A ∩ S C ) for every A ⊆ 4
Since we have always m∗ (A) ≤ m∗ (A ∩ S) + m∗ (A ∩ S C ) we see that S
is measurable (if and only if) for each A we have

m∗ (A) ≥ m∗ (A ∩ S) + m∗ (A ∩ S C ).

10.1.6 Remark
(i) Since the definition of measurability is symmetric in S and S C , we
have S C measurable whenever S is.
(ii) Φ and the set 4 of all real numbers are measurable.
10.1.7 Lemma

If m (S) = 0 then S is measurable.
Proof: Let A be any set. Then A ∩ S ⊂ S and so m∗ (A ∩ S) ≤ m∗ (S) = 0.
Also A ⊇ A ∩ S C .

Hence, m∗ (A) ≥ m∗ (A ∩ S C ) = m∗ (A ∩ S) + m∗ (A ∩ S C ).
But m∗ (A) ≤ m∗ (A ∩ S) + m∗ (A ∩ S C ).

Hence S is measurable.
Measure and Integration in LP Spaces 359

10.1.8 Lemma
If S1 and S2 are measurable, so is S1 ∪ S2 .
Proof: Let A be any set. Since S2 is measurable, we have

m∗ (A ∩ S1C ) = m∗ (A ∩ S1C ∩ S2 ) + m∗ (A ∩ S1C ∩ S2C )


Since A ∩ (S1 ∪ S2 ) = (A ∩ S1 ) ∪ (A ∪ S2 ∩ S1C ), we have
m∗ (A ∩ (S1 ∪ S2 )) ≤ m∗ (A ∩ S1 ) + m∗ (A ∩ S2 ∩ S1C ).
Thus, m∗ (A ∩ (S1 ∪ S2 )) + m∗ (A ∩ S1C ∪ S2C )
≤ m∗ (A ∩ S1 ) + m∗ (A ∩ S2 ∩ S1C ) + m∗ (A ∩ S1C ∩ S2C )
= m∗ (A ∩ S1 ) + m∗ (A ∩ S1C ) = m∗ A

Since (S1 ∪S2 )C = S1C ∩S2C . Hence, S∪S2 is measurable, since the above
4
equality is valid for every set A ⊆ , where S C denotes the complement of
4
S in . If S is measurable then m∗ (S) is called the Lebesgue measure
of S and is denoted simply by m(S).
10.1.9 Remark
(i) Φ and 4 are measurable subsets.
(ii) The complements and countable union of measurable sets are
measurable.

10.1.10 Lemma
Let A be any set and S1 , S2 , . . . , Sn , a finite sequence of disjoint
measurable sets. Then
n

* n

m A∩ Si = m∗ (A ∪ Si )
i=1 i=1

Proof: The lemma can be proved by making an appeal to induction on n.


It is true for n = 1. Let us next assume that the lemma is true for
m = n − 1 sets Si . Since Si are disjoint sets, we have
n

*
A∩ Si ∩ Sn = A ∩ Sn (10.7)
i=1

n−1

*
n *
A∩ Si ∩ SnC =A∩ Si (10.8)
i=1 i=1

Hence the measurability of Sn implies


n

* *
∗ ∗
m A∩ Si =m A∩ Si ∩ Sn
i=1 i=1

*
n

+m A∩ Si ∩ SnC
i=1
360 A First Course in Functional Analysis

Using (10.7) and (10.8) we have,


n

n−1

* *
m∗ A ∩ Si = m∗ (A ∩ Sn ) + m∗ A ∩ Si
i=1 i=1


n−1
= m∗ (A ∩ Sn ) + m∗ (A ∩ Si ) (10.9)
i=1

(10.9) is true since by assumption the lemma is true for m = n − 1.


Thus the lemma is true for m = n and the induction is complete.
10.1.11 Remark [see Royden [47]]
It can be proved that the Lebesgue measure m∗ is countably additive
on measurable sets, i.e., if S1 , S2 , . . . are pairwise disjoint measurable sets,
then ∞

* 

m Sn = m∗ (Sn ).
n=1 n=1

10.2 Measurable and Simple Functions


10.2.1 Definition: Lebesgue measurable function
An extended real-valued function f on 4
is said to be Lebesgue
measurable if f −1 (S) is a measurable subset for every open subset S
of4 and if the subsets f −1 (∞) and f −1 (−∞) of 4
are measurable.
10.2.2 Definition: complex-valued Lebesgue measurable
function
4
A complex-valued function f on is said to be Lebesgue measurable
if the real and imaginary part Ref and Imf are both measurable.
10.2.3 Lemma
Let f be an extended real-valued function whose domain is measurable.
Then the following statements are equivalent:

(i) For each real number α the set {x : f (x) > α} is measurable.
(ii) For each real number α the set {x : f (x) ≥ α} is measurable.
(iii) For each real number α the set {x : f (x) < α} is measurable.
(iv) For each real number α the set {x : f (x) ≤ α} is measurable.
If (i)–(iv) are true, then
(v) For each extended real number α the set {x : f (x) = α} is
measurable.

Proof: Let the domain of f be D, which is measurable.


Measure and Integration in LP Spaces 361

(i) =⇒ (iv) {x : f (x) ≤ α} = D ∼ {x : f (x) > α}


and the difference of two measurable sets is measurable. Hence (i) =⇒ (iv).
Similarly (iv) =⇒ (i). This is because

{x : f (x) > α} = D ∼ {x : f (x) ≤ α}.

Next to show that (ii) =⇒ (iii) since

{x : f (x) < α} = D ∼ {x : f (x) ≥ α}.

(iii) =⇒ (ii) by arguments similar as in above.


(ii) =⇒ (i) since
*∞  ,
1
{x : f (x) > α} = x : f (x) ≥ α + ,
n=1
n

and the union of a sequence of measurable sets is measurable. Hence (ii)


=⇒ (i).
(i) =⇒ (ii) since
∞ 
- ,
1
{x : f (x) ≥ α} = x : f (x) > α −
n=1
n

and the intersection of a sequence of measurable sets is measurable. Hence,


(i) =⇒ (ii).
Thus the first four statements are equivalent.
If α is a real number,

{x : f (x) = α} = {x : f (x) ≥ α} ∩ {x : f (x) ≤ α}

and so (ii) and (iv) =⇒ (v) for α real. Since



-
{x : f (x) = ∞} = {x : f (x) ≥ n}
n=1

(ii) =⇒ (v) for α = ∞. Similarly, (iv) =⇒ (v) for α = −∞, and we have
(ii) and (iv) =⇒ (v).
10.2.4 Remark
(i) It may be noted that an extended real valued function f is (Lebesgue)
measurable if its domain is measurable and if it satisfies one of the
first four statements of the lemma 10.2.3.
(ii) A continuous function (with a measurable domain) is measurable,
because the preimage of any open set in 4
is an open set.
(iii) Each step function is measurable.
362 A First Course in Functional Analysis

10.2.5 Lemma
Let K be a constant and f1 and f2 be two measurable real-valued
functions defined on the same domain. Then the functions f1 + K, Kf1 ,
f1 + f2 , f2 − f1 and f1 f2 are measurable.
Proof: Let {x : f1 (x) + K < α} = {x : f1 (x) < α − K}.
Therefore by condition (iii) of lemma 10.2.3, since f1 is measurable,
f1 + K is measurable.
If f1 (x) + f2 (x) < α, then f1 (x) < α − f2 (x) and by the corollary
to the axiom of Archimedes [see Royden [47]] we have a rational number
between two real numbers. Hence there is a rational number p such that
f1 (x) < p < α − f2 (x).
Hence, {x : f1 (x) + f2 (x) < α} = ∪{x : f1 (x) < p} ∩ {x : f2 (x) < α − p}.
Since the rational numbers are countable, this set is measurable and so
f1 + f2 is measurable.
Since −f2 = (−1)f2 is measurable, when f2 is measurable f1 − f2 is
measurable.
√ √
Now, {x : f 2 (x) > α} = {x : f (x) > α} ∪ {x : f (x) < − α} for α > 0
and if α < 0
{x : f 2 (x) > α} = D,
where D is the domain of f . Hence f 2 (x) is measurable. Moreover,
1
f1 f2 = [(f1 + f2 )2 − f12 − f22 ].
2
Given f1 , f2 measurable functions, (f1 +f2 )2 , f12 and f22 are respectively
measurable functions.
Hence, f1 f2 is a measurable function.
10.2.6 Remark
Given f1 and f2 are measurable,

(i) max{f1 , f2 } is measurable


(ii) min{f1 , f2 } is measurable
(iii) |f1 |, |f2 | are measurable.

10.2.7 Theorem
Let {fn } be a sequence of measurable functions (with the same domain
of definition). Then the functions sup{f1 , . . . , fn } and inf{f1 , f2 , . . . , fn },
sup fn , inf fn , inf sup fk , sup inf fk are all measurable.
n n n k≥n n k≥n
Proof: If q is defined by q(x) = sup{f1 (x), f2 (x), . . . , fn (x)} then {x :
*∞
q(x) > α} = {x : fn (x) > α}. Since fi for each i is measurable, g is
n=1
measurable.
Measure and Integration in LP Spaces 363

H∞Similarly, if p(x) is defined by p(x) = sup fn (x) then {x : p(x) > α} =


n=1 {x : fn (x) > α} and so p(x) is measurable. Similar arguments can be
put forward for inf.
10.2.8 Remark
(x)
If {fn } is a sequence of measurable functions such that fn −→ f (x)
4
for each x ∈ then f is measurable.
10.2.9 Almost everywhere (a.e.)
If f and g are measurable functions, f is said to be equal to g almost
everywhere (abbreviated as a.e.) on a measurable set S, if

m{x ∈ S : f (x) = g(x)} = 0.

10.2.10 Characteristic function of a set E, simple function


If we refer to (10.3) in example 2 of 10.1 we see that
3   1
f (x) = 1 at all rational points in [0, 1]
, R f (x)dx = 1 and
f (x) = 0 at all irrational points in [0, 1] 0
 1
R f (x)dx = 0.
0
 1
Thus the integral f (x)dx = 0 is not Riemann integrable. This has
0
led to the introduction of a function which is 1 on a measurable set and
is zero elsewhere. Such a function is integrable and has as its integral the
measure of the set.
Definition: characteristic function of E
The function χE defined by
3
1 x∈E
χE (x) =
0 x∈ /E

is called the characteristic function of E.


The characteristic function is measurable if and only if E is measurable.
Definition: simple function
A simple function is a scalar-valued function on 4
whose range is
finite. If a1 , a2 , . . . , an are the distinct values of such a function φ, then


n
φ(x) = ai χEi (x) (10.10)
i=1

is called a simple function if the sets Ei are measurable, Ei is given by

Ei = {x ∈ 4 : φ(x) = ai}
364 A First Course in Functional Analysis

10.2.11 Remark
(i) The representation for φ is not unique.
(ii) The function φ is simple if and only if it is measurable and assumes
only a finite number of values.
(iii) The representation (10.10) is called the canonical representation and
it is characterised by the fact that Ei are disjoint and the ai distinct
and non-zero.

10.2.12 Example
Let f : 4 −→ [0, ∞[. Consider the simple function for n = 1, 2, . . .
3 (i−1) (i−1)
2n if 2n ≤ f (x) < i
2n for i = 1, 2, . . . , n2n
φn (x) =
n if f (x) ≥ n

Then 0 ≤ φ1 (x) ≤ φ2 (x) · · · ≤ f (x) and φn (x) −→ f (x) for each x ∈ . 4


If f is bounded, the sequence {φn } converges to f uniformly on . 4
If f : 4 −→ [−∞, ∞], then by considering f = f + − f − , where
f + = max{f, 0} and f − = min{f, 0}, we see that f = f + where f (x) ≥ 0
and f = f − when f (x) = 0.
Thus, there exists a sequence of simple functions which converges to f
at every point of . 4
It may be noted that if f is measurable, each of the simple functions is
measurable.
10.2.13 The Lebesgue integral φ
If φ vanishes outside a set of finite measure, we define the integral of φ
by
 
n
φ(x)dm(x) = ai μ(Ei ) (10.11)
i=1


n
when φ has the canonical representation φ = ai χEi .
i=1

10.2.14 Lemma

n
Let φ = ai χEi , with Ei ∩ Ej = ∅ for i = j. Suppose each set Ei is a
i=1
measurable set of finite measure. Then
 
n
φdm = ai m(Ei ).
i=1
*
Proof: The set Aa = {x : φ(x) = a} = Ei
ai =a
Measure and Integration in LP Spaces 365


Hence am(Aa ) = ai m(Ei ) by the additivity of m, and hence
ai =a

  
n
φ(x)dm(x) = a m(Aa ) = ai m(Ei ).
i=1

10.2.15 Theorem
Let φ and ψ be simple functions, which vanish outside a set of finite
measure. Then
  
(αφ + βψ)dm = α φdm + β ψdm

and if φ ≥ ψ a.e. then  


φdm ≥ ψdm.

Proof: Let {Ei } and {Ei } be the set occurring in canonical representations
of φ and ψ. Let E0 and E0 be the sets where φ and ψ are zero. Then the
set Fk obtained by taking the intersections of Ei ∩ Ei are members of a
finite disjoint collection of measurable sets and we may write

n 
n
φ= ai χFi ψ= bi χFi
i=1 i=1

and so αφ + βψ = (αai + βbi )χFi
Hence, using lemma 10.2.13
  
(αφ + βψ)dm = a φdm + b ψdm

Again φ ≥ ψ a.e. =⇒ (φ − ψ) ≥ 0 a.e.



=⇒ (φ − ψ)dm ≥ 0

since the integral of a simple function which is greater than or equal to


zero a.e. is non-negative. Hence, the first part of the theorem yields
 
φdm ≥ ψdm.

10.2.16 The Lebesgue integral of a bounded function over a set


of finite measure
Let f be a bounded real-valued function and E a measurable set of finite
measure. Keeping in mind the case of Riemann integral, we consider for
simple functions φ and ψ the numbers

inf ψ (10.12)
ψ≥f E
366 A First Course in Functional Analysis


and sup φ. (10.13)
φ≤f E

It can be proved that if f is bounded on a measurable set E with m(E)


finite, then the integrals (10.12) and (10.13) will be equal where φ and ψ
are simple functions if and only if f is measurable [see Royden [47]].
10.2.17 Definition: Lebesgue integral of f
If f is a bounded measurable function defined on a measurable set E
with m(E) finite, the Lebesgue integral of f over E is defined as
 
f (x)dm = inf ψ(x)dx (10.14)
E ψ≥f E

for all simple functions ψ(≥ f ).


Note 10.2.1. If f is a bounded function defined on [0, 1] and f is Riemann
integrable on [0, 1], then it is measurable and
 1  1
R f (x)dx = f (x)dm.
0 0

10.2.18 Definition
If f is a complex-valued measurable function over , then we define 4
  
f dm = Re f dm + i Im f dm
4 4 4
 
whenever Ref dm and Imf dm are well-defined.
4 4
10.2.19 Definition

If f is a measurable function on 4 and |f |dm < ∞, we say f is an
integrable function on . 4 4
In what follows we state without proof some important convergence
theorems.
10.2.20 Theorem
Let {fn } be a sequence of measurable functions on a measurable subset
E of . 4
(a) Monotone convergence theorem: If 0 ≤ f1 (x) ≤ f2 (x) ≤ · · · and
fn (x) −→ f (x) for all x ∈ E, then
 
fn dm −→ f dm. (10.15)
E E
Measure and Integration in LP Spaces 367

(b) Dominated convergence theorem: If |fn (x)| ≤ g(x) for all


n = 1, 2, . . . and x ∈ E, where g is an integrable function on E
and if fn (x) −→ f (x) for all x ∈ E then fn , f are integrable on E
n→∞
and  
fn (x)dm −→ f (x)dm (10.16)
E E

If in particular, m(E) < ∞ and |fn (x)| ≤ K for all n = 1, 2, . . .,


x ∈ E and some K > 0, then the result in 10.2.20(b) is known as the
bounded convergence theorem.
Note 10.2.2. If f1 and f2 are integrable functions on E, then
  
(f1 + f2 )dm = f1 dm + f2 dm.
E E E

Proof: The above result is true where f1 and f2 are simple measurable
functions defined on E [see theorem 10.2.15]. We write f1 = f1+ −f1− , where
f1+ = max{f1 , 0} and f1− = min{f1 , 0}. Similarly we take f2 = f2+ − f2− ,
f2+ and f2− will have similar meanings as those of f1+ , f1− . It may be noted
that f1+ , f1− , f2+ , f2− are nonnegative functions.
We now approximate f1+ , f1− , f2+ , f2− by non-decreasing sequence
of simple measurable functions and applying the monotone convergence
theorem (10.2.20(a)).

10.3 Calculus with the Lebesgue Measure


4
Let E = [a, b], a finite closed interval in . We first recapitulate a few
definitions pertaining to the Riemann integral. Let f be a bounded real-
valued function defined on the interval [a, b] and let a = ξ0 < ξ < · · · <
ξn = b be a subdivision of [a, b]. Then for each subdivision we can define
the sums
n n
S= (ξi − ξi−1 )Mi and s = (ξi − ξi−1 )mi
i=1 i=1

where Mi = sup f (x), mi = inf f (x).


ξi−1 <x≤ξi ξi−1 <x≤ξi

We then define the upper Riemann integral of f by


 b
f (x)dx = inf S (10.17)
a

with the infimum taken over all possible subdivisions of [a, b]. Similarly, we
define the lower integral
 b
f (x)dx = sup s (10.18)
a
368 A First Course in Functional Analysis

The upper integral is always at least at large as the lower integral and
if the two are equal we say f is Riemann integrable and call the common
value, the Riemann integral of f . We shall denote it by
 b
R f (x)dx (10.19)
a

For the definition of the Lebesgue integral see 10.2.16.


10.3.1 Remark
4+
Consider a ( )-valued bounded function f on [a, b]. f is Riemann
integrable on [a, b] if and only if the set of discontinuities of f on [a, b] is
of (Lebesgue) measure zero. In that case, f is Lebesgue integrable on [a, b]
 b  b
and integral f (x)dx is equal to the integral f (x)dm [see Rudin [48]].
a a

10.4 The Fundamental Theorem for Riemann


Integration
4+
A ( )-valued function F is differentiable on [a, b] and its derivative f is
continuous on [a, b] if and only if
 x
F (x) = F (a) + f (s)ds a≤x≤b
a

for some continuous function f on [a, b]. In that case F  (x) = f (x) for all
x ∈ [a, b]. For a proof see Rudin [48].
10.4.1 Absolutely continuous function
4+
A ( )-valued function F on [a, b] is said to be absolutely continuous
on [a, b] if for every > 0, there is some δ > 0 such that


n
|F (xi ) − F (yi )| <
i=1


n
whenever a ≤ y1 < x1 < · · · < yn < xn ≤ b and (xi − yi ) < δ.
i=1

10.4.2 Remark
(i) Every absolutely continuous function is uniformly continuous on
[a, b].
(ii) If F is differentiable on [a, b] and its derivative F  is bounded on
[a, b], then F is absolutely continuous by the mean value theorem.
Measure and Integration in LP Spaces 369

10.5 The Fundamental Theory for Lebesgue


Integration
A 4(+)-valued function F is absolutely continuous on [a, b] if and only if
 x
F (x) = F (a) + f dm a≤x≤b
a

for some (Lebesgue) integrable function; f on [a, b]. In that case F  (x) =
f (x) for almost all x ∈ [a, b] [see Royden [47]].
10.5.1 Total variation, bounded variation
4
Let f : [a, b] −→ (C) be a function. Then the (total) variation Var
(f ) of f over [a, b] is defined as
n

Var (f ) = sup |f (ti ) − f (ti−1 )| : P = [t0 , t1 , . . . , tn ]
i=1

where a = t0 and b = tn is a partition of [a, b].


The supremum is taken over all partitions of [a, b]. If Var (f ) < ∞
holds, f is said to be a function of bounded variation.
10.5.2 Remark
(i) An absolutely continuous function on [a, b] is of bounded variation
on [a, b].
(ii) If f is of bounded variation on [a, b], then f  (x) exists for almost all
x ∈ [a, b] and f  is (Lebesgue) integrable on [a, b] [see Royden [47]].
(iii) A function of bounded variation on [a, b] need not be continuous on
[a, b].
. /
For example, the characteristic function of the set 0, 13 is of bounded
variation on [0, 1], but it is not continuous on [0, 1]
Although our discussion is confined to Lebesgue measure on , we 4
4
sometimes need Lebesgue measure on 2 to apply some results. The
4
Lebesgue measure on 2 generalizes the idea of area of a rectangle, while
the Lebesgue measure on 4generalizes the idea of length of an interval.
10.5.3 Theorem (Fubini and Tonelli) [see Limaye [33]]
Let m × m denote the Lebesgue measure on 2
4
and k(·, ·)
be
  a ( 4+
)-valued measurable function on [a, b] × [c, d]. If either
|K(s, t)|d(m × m)(s, t) < ∞ or if K(s, t) ≥ 0 for all (s, t) ∈
[a,b]×[c,d]
 d
[a, b] × [c, d], then K(s, t)dm(t) exists for almost every s ∈ [a, b] and
 b c

K(s, t)dm(s) exists for almost every t ∈ [c, d].


a
370 A First Course in Functional Analysis

The functions defined by these integrals are integrable on [a, b] and [c, d]
respectively.
 
Moreover, K(s, t)d(m × m)(s, t)
[a,b]×[c,d]
 b ? d )
= K(s, t)dm(t) dm(s)
a c
 d ? b )
= K(s, t)dm(s) dm(t).
c a

10.6 Lp Spaces and Completeness


We first recapitulate the two important inequalities, namely Hölder’s
inequality and Minkowski’s inequality, before taking up the case of pth
power Lebesgue integrable functions defined on a measurable set E.
10.6.1 Theorem (Hölder’s inequality) [see 1.4.3]
If p > 1 and q is defined by p1 + 1q = 1, then the following inequalities
hold true
n 1/p n 1/q
n  
(H 1) |xi yi | ≤ |xi |p |yi |q
i=1 i=1 i=1
for complex numbers x1 , x2 , . . . , xn , y1 , y2 , . . . , yn
(H 2) In case x ∈ lp , i.e., pth power summable, y ∈ lq , where p and q are
defined as above x = {xi }, y = {yi }, we have



1/p n 1/q
  
|xi yi | ≤ |xi | p
|yi | q

i=1 i=1 i=1

The inequality is known as the Hölder’s inequality for sum.


(H 3) If f (x) ∈ Lp (]0, 1[), i.e., pth power integrable, g(x) ∈ Lq (]0, 1[), i.e.,
qth power integrable, where p and q are defined as above, then
 
1/p 
1/q
b b b
|f (x)g(x)|dx ≤ |f (x)| dx
p
|g(x)| dx q
a a a

The above inequality is known as Hölder’s inequality for integrals.


10.6.2 Theorem (Minkowski’s inequality)[see 1.4.4]
(M 1) If p ≥ 1, then
1/p n 1/p 1/p

n  
n
|xi + yi | p
≤ |xi | p
+ |yi | p

i=1 i=1 i=1


Measure and Integration in LP Spaces 371

for complex numbers x1 , x2 , . . . , xn , y1 , y2 , . . . , yn .


(M 2) If p ≥ 1, x = {xi } ∈ lp , the pth power summable y = {yi } ∈ lq ,
where p and q are conjugate to each other, then

1/p ∞
1/p ∞
1/p
  
|xi + yi | p
≤ |xi | p
+ |yi | p

i=1 i=1 i=1

(M 3) If f (x) and g(x) belong to Lp (0, 1), then


 1 1/p  1 1/p  1 1/p
|f (x) + g(x)| dx p
≤ |f (x)| p
+ |g(x)| p
0 0 0

We next consider E to be a measurable subset of and 1 ≤ p < ∞. 4


Let f be a measurable or p-integrable function on E and be pth power
integrable on E i.e. |f |p is integrable on E. Then the following inequalities
hold true.
10.6.3 Theorem
Let f and g be measurable functions on 4.
1 1
(a) Hölder’s inequality Let 1 < p < ∞ and p + q = 1. Then
  1/p  1/q
|f g|dm ≤ |f | dm
p
|g| dm
q
E E E

(b) Minkowski’s inequality Let 1 ≤ p < ∞.


Assume that m(f −1 (∞)) ∩ g −1 (−∞)) ≡ 0 ≡ m(f −1 (−∞) ∩ g −1 (∞)).
 1/p  1/p  1/p
Then |f + g|p dm ≤ |f |p dm + |g|p dm
E E E

Proof: (a) Since f and g are 


measurable functions on E, f g is measurable
on E (Sec. 10.2.5) and hence |f g|dm is well-defined. Let
E
 1/p  1/p
a= |f | dm
p
and b= |g| dm
p
E E

 If a = 0 or b = 0, then f g = 0 almost everywhere on E and hence


|f g|dm = 0. Therefore the inequality holds. If a = ∞ or b = ∞, then
E
the inequality obviously holds. Next we consider 0 < a, b < ∞. We replace
xi by f and yi by g and the summation from i = 0 to n by integral over
E with respect to Lebesgue measure in (H 1) of theorem 10.6.1. Then the
proof of theorem 10.6.3 is obtained by putting forward arguments exactly
as in the proof of theorem 10.6.1.
372 A First Course in Functional Analysis

(b) Since f and g are measurable functions on E, f + g is a measurable


function on E (sec. 10.2.5).
 Moreover since f and g are each p-integrable,
f + g is p-integrable i.e. |f + g|p dm is well-defined. The proof proceeds
E
exactly as in (M 1) of theorem 10.6.2.
10.6.4 Definition: f ∼ g, Metric in Lp (E)
f ∼ g: For measurable functions f and g on E, a measurable set, we
write f ∼ g if f = g almost everywhere on E.
It may be noted that ∼ is an equivalence relation on the set of
measurable functions on E.
Metric: Let 1 ≤ p < ∞. For any p-integrable functions f and g on E,
define,
 1/p
ρp (f, g) = |f − g| dm
p
E

 Note that m({x : |f(x)| = ∞}) = 0 = m({x : |g(x)| = ∞}) since


|f (x)|p dm < ∞ and |g(x)|p dm < ∞. Hence
E E

(i) ρp (f, g) is well-defined, non-negative and


(ii) ρp (f, g) = ρp (g, f ) (symmetric)
(iii) By Minkowski’s inequality
 1/p  1/p  1/p
|f − g|p dm ≤ |f − h|p dm + |h − g|p dm
E E E

where h is a p-integrable function on E.


Hence ρp (f, g) ≤ ρp (f, h) + ρp (h, g).
Thus the triangle inequality is fulfilled.
Lp (E): Let Lp (E) denote the set of all equivalence classes of p-integrable
functions. ρp induces a metric on Lp (E).
10.6.5 Definition: essentially bounded, essential supremum
Let p = ∞ A measurable function f is said to be essentially bounded
on E if there exists some β > 0 such that m{x ∈ E : |f (x)| ≥ β} = 0 and
β is called an essential bound for |f | on E.
Essential supremum
If α = inf{β : β an essential bound for |f | on E}, then α is itself an
essential bound for |f | on E. Such an ‘α’ is called the essential supremum
of |f | on E and will be denoted by essupE |f |.

⎨n at x = 1 , n = 1, 2, . . . .
Let us consider f (x) = n is essentially

x otherwise 0 < x ≤ 1.
bounded and essupE |f | = 1.
Measure and Integration in LP Spaces 373

If f and g are essentially bounded functions on E, then it can be seen


that
essupE |f + g| ≤ essupE |f | + essupE |g|
Lp (E): Let Lp (E) denote the set of all equivalence classes of essentially
bounded functions on E under the equivalence relation ∼.
Then ρ∞ (f, g) = essupE |f − g|, f, g ∈ E induces a metric on L∞ (E).
10.6.6 Theorem: For 1 ≤ p < ∞, the metric space Lp (E) is
complete
Proof: For 1 ≤ p < ∞, let {fn } be a Cauchy sequence in Lp (E), where
E is a measurable set.
To prove the space Lp (E) to be complete, it is sufficient if we can show
that a subsequence of {fn } converges to some point in Lp (E). Hence, by
passing to a subsequence if necessary, we may have that
1
ρp (fn+1 , fn ) ≤ , n = 1, 2, . . .
2n
Let f0 = 0 and for x ∈ E and n = 1, 2, . . . we denote,

n ∞

gn (x) = |fi+1 (x) − fi (x)| and g(x) = |fi+1 (x) − fi (x)|
i=0 i=0

By Minkowski’s inequality
  
p
1/p
1/p 
n
|gn | dm
p
= |fi+1 − fi |
E E i=0
n 
 1/p 
n
≤ |fi+1 − fi |p = ρp (fi+1 , fi )
i=0 E i=0
n
1
= ρp (f1 , f0 ) + i
i=1
2

If we apply the monotone convergence theorem (10.2.18(a)) to the above


inequality, we obtain,
 1/p  1/p
|g| dm
p
= lim |gn | dm
p
≤ ρp (f1 , f0 ) + 1 < ∞
E n→∞ E

Hence the function g is finite almost everywhere on E. Now the series




|fi+1 (x)−fi (x)| is absolutely continuous and hence summable for almost
i=0
all x ∈ E. For such x ∈ E, we let


f (x) = [fi+1 (x) − fi (x)]
i=0
374 A First Course in Functional Analysis


n−1
We know that fn (x) = [fi+1 (x) − fi (x)] for all x ∈ E, we have
i=0


n
lim fn (x) = f (x) and |fn (x)| ≤ |fi+1 (x) − fi (x)| ≤ g(x)
n→∞
i=1

for almost all x ∈ E. Since the function g p is integrable, the dominated


convergence theorem (10.2.20(b)) yields
  
|f | dm = lim
p
|fn | dm ≤
p
g p dm < ∞
E n→∞ E E

Hence f ∈ Lp (E). By Minkowski’s inequality, |f | + g ∈ Lp (E) and


|f − fn |p ≤ (|f | + g)p for all n = 1, 2, . . . Again the dominated convergence
theorem (10.2.20(b)) yields
 1/p
ρp (fn , f ) = |fn − f | p
−→ 0 as n → ∞.
E

Thus, the sequence {fn } converges to f in Lp (E) showing that the


metric space is complete.
Case p = ∞ Let us consider a Cauchy sequence {fn } in L∞ (E).
Let Mj = {x ∈ E : |fj (x)| > essupE |fj |} Thus except for the set Mj ;
|fj (x)| is bounded.
Moreover, let Nm,n = {x ∈ |fm (x) − fn (x)| > essupE |fm − fn |}.
Thus except for the set Nm,n , |fm (x) − fn (x)| is bounded.
Let G be the union of all Mj and Nm,n . Then m(G) = 0, and the
sequence of {fn } converges uniformly to a bounded function f on the
complement of G in E. Hence, f ∈ L∞ (E) and ρ∞ (fn , f ) −→ 0 as n → ∞.
Thus the sequence {fn } converges in L∞ (E), showing that the metric
space L∞ (E) is complete.
10.6.7 The general form of linear functionals in Lp ([a, b])
Let us consider an arbitrary linear functional f (x) defined on Lp [0, 1]
(p > 1).
3
1 for 0 ≤ ζ < t
Let ut (ζ) = (10.20)
0 for t ≤ ζ < 1
Let h(t) = f (ut (ξ)). We first show that h(t) is an absolutely continuous
function. To this end, we take δi = (si , ti ), i = 1, 2, . . . , n to be an arbitrary
system of non-overlapping intervals in [0, 1].
Let i = sign (h(ti ) − h(si )).
3 n 
n 
Then |h(ti ) − h(si )| = f i [uti (ζ) − usi (ζ)]
i=1 i=1
Measure and Integration in LP Spaces 375

0 0
0n 0
0 0
≤ ||f || 0 i [uti (ζ) − usi (ζ)]0
0 0
i=1
  p
1/p
1 n 
 
= ||f ||  i [uti (ζ) − usi (ζ)] dζ
0  
i=1
n 

1/p
1/p
 
n
≤ ||f || dζ = ||f || m(δi )
i=1 δi i=1

Hence, h(t) is absolutely continuous. Thus h(t) has an a.e. Lebesgue


integrable derivative and is equal to the Lebesgue integral of this derivative.
Let h (t) = α(t), so that
 t
h(t) − h(0) = α(s)ds
0

Now h(0) = f [u0 (ζ)] = 0, since u0 (ζ) ≡ 0 is the null element of Lp [0, 1].
We have,  t
h(t) = α(s)ds
0

It follows from (10.20), that


 t  t  1
f [ut (s)] = h(t) = α(s)ds = ut (s)α(s)ds + ut (s)α(s)ds
0 0 t
 1
= ut (s)α(s)ds
0

Since f is a linear functional, we have


 1  1
f (u K (s)) − f (u k−1 (s)) = u k (s)α(s)ds − u k−1 (s)α(s)ds
n n n n
0 0
 1 ? )
= u k (s) − u k−1 (s) α(s)ds
n n
0


n  1
If vn (s) = Ck [u k (s) − u k−1 (s)], then f (vn ) = vn (s)α(s)ds.
n n
k=1 0

Let x(t) be an arbitrary, bounded and measurable function. Then there


exists a sequence of step functions {vm (t)}, such that vm (t) −→ x(t) a.e.
as m → ∞, where {vm (t)} can be assumed to be uniformly bounded.
By the Lebesgue dominated convergence theorem (10.2.20(b)), we get
 1  1  1
lim f (vm ) = lim vm (t)α(t)dt = lim vm (t)α(t)dt = x(t)α(t)dt
m m 0 0 m→∞ 0
376 A First Course in Functional Analysis

Since, on the other hand, vm (t) −→ x(t) a.e. and vm (t) is uniformly
bounded, it follows that
 1 1/p
||vm − x||p = |vm (t) − x(t)| dt p
−→ 0 as m → ∞
0

Therefore, f (vm ) → f (x) and consequently,


 1
f (x) = x(t)α(t)dt
0

Consider now the function xn (t) defined as follows


3
|α(t)|q−1 sgn α(t) if |α(t)| ≤ n
xn (t) =
0 if |α(t)| > n

where q is conjugate to p i.e. 1p + 1q = 1. The function xn (t) is bounded and


measurable.
 1  1
Therefore, f (xn ) = xn (t)α(t)dt = |α(t)|q−1 |α(t)|dt
0 0
 1 1/p
≤ ||f || ||xn ||p = ||f || · |xn (t)| dt p
0

On the other hand,


 1
|f (xn )| = f (xn ) = |xn (t)||α(t)|dt
0
 1  1
1 q
≥ |xn (t)||xn (t)| q−1 dt = |xn (t)| q−1 dt
0 0
 1
= |xn (t)|p dt
0
 1  1  p1
Hence, |xn (t)| dt ≤ ||f || ||xn || = ||f ||
p
|xn (t)| dt
p
0 0
 1 1/q
Therefore, |xn (t)|p dt ≤ ||f || (10.21)
0

Now, α(t) is Lebesgue integrable and becomes infinite only on a set of


measure zero. Hence,

xn (t) −→ |α(t)|q−1 a.e. on [0, 1]

Therefore, by the dominated convergence theorem (10.2.20(b))


Measure and Integration in LP Spaces 377

 1  q1  1  q1
(q−p)p (q−1)p
|xn (t)| dt −→ |α(t)| dt ≤ ||f || as n → ∞
0 0
 1 1/q
or, |α(t)|q ≤ ||f || i.e. α(t) ∈ Lq [0, 1] (10.22)
0
1
Now, let x(t) be any function in Lp [0, 1]. Then there exists 0 x(t)α(t)dt.
Furthermore, there exists a sequence {xm (t)} of bounded functions, such
that
 1
|x(t) − xm (t)|p dt → 0 as m → ∞
0
 1  1  1
Therefore, xm (t)α(t)dt − x(t)α(t)dt ≤ |(xm (t) − x(t)| |α(t)|dt
0 0 0
 1 1/p  1  1q
≤ |(xm (t) − x(t)| dt · p
|α(t)| dt
q
0 0

Using the fact that xm (t) − x(t) ∈ Lp [0, 1] and α(t) ∈ Lq ([0, 1]), the
above inequality is obtained by making an appeal to Hölder’s inequality.
Since the sequence {xm (t)} are bounded and measurable functions, in
Lp ([0, 1]) and α(t) ∈ Lq ([0, 1]),
 1
xm (t)α(t)dt = f (xm )
0
 1
Hence, f (xm ) −→ x(t)α(t)dt as m → ∞
0

On the other hand f (xm ) −→ f (x). It then follows that


 1
f (x) = x(t)α(t)dt (10.23)
0

Thus every functional defined on Lp ([0, 1]) can be represented in the


form (10.23).
Conversely, if β(t) be an arbitrary function belonging to Lq ([0, 1]), then
 1
g(x) = x(t)β(t)dt can be shown to be a linear functional defined on
0
Lp ([0, 1]).
 1  1
If g(x1 ) = x1 (t)β(t)dt and g(x2 ) = x2 (t)β(t)dt then
0 0
 1
g(x1 + x2 ) = (x1 (t) + x2 (t))β(t)dt
0
 1  1
= x1 (t)β(t)dt + x2 (t)β(t)dt = g(x1 ) + g(x2 )
0 0
378 A First Course in Functional Analysis

 1 1/p  1 1/q
Moreover, ||g(x)||p ≤ |x(t)| dt
p
|β(t)| dt
q
< ∞,
0 0

showing that g(x) ∈ Lp ([0, 1]).


Thus g is additive, homogeneous, i.e., linear and bounded.
The norm of the functional f given by (10.23) can be determined in
terms of α(t).
It follows from (10.23) with the use of Hölder’s inequality (1.4.4)
   1/p  1/q
 1  1 1
f (x) =  x(t)α(t)dt ≤ |x(t)| dt
p
|α(t)| dt
q
0 0 0

≤ ||x||p ||α||q
Hence ||f || ≤ ||α||q (10.24)
It follows from (10.23) and (10.24) that
 1 1/q
||f || = ||α(t)||q = |α(t)|q dt
0

10.7 Lp Convergence of Fourier Series


In 3.7.8 we have seen that in a Hilbert space H if {ei } is a complete
orthonormal system, then every x ∈ H can be written as


x= ci ei (10.25)
i=0
where ci = x, ei , i = 0, 1, 2, . . . (10.26)

i.e., the series (10.25) converges.


ci given by (10.26) are called Fourier Coefficients.
In particular, if H = L2 ([−π, π]) and
 int ,
e
en (t) = √ , n = 0, ±1, ±2 (10.27)

then any x(t) ∈ L2 [−π, π] can be written uniquely as

 ∞
2cn cos nt  2dn sin nt
(c0 − id0 ) + √ + √ (10.28)
n=1
2π n=0

 π  π
1
where cn = √ x(t) cos ntdt (10.29)
2π −π −π
 π  π
1
dn = √ x(t) sin ntdt (10.30)
2π −π −π
Measure and Integration in LP Spaces 379

Let us consider a more general problem of representing any integrable


function of period 2π on 4in terms of the special 2π-periodic function
eint
√ , n = 0, ±1, ±2, . . .. Let x ∈ Lp ([−π, π]).

For n = 0, ±1, ±2, . . . the nth Fourier coefficients of x is defined by
 π
1
ĉn = √ (x(t))e−int dm(t) (10.31)
2π −π
and the formal series ∞
1 
√ ĉn eint (10.32)
2π n=−∞
is called the Fourier series of x.
For n = 0, 1, 2, . . . consider the nth partial sum,

n
sn (t) = ĉk eikt , t ∈ [−π, π]
k=−n

10.7.1 Remark
(i) Kolmogoroff [29] gave an example of a function x in L1 ([−π, π]) such
that the corresponding sequence {sn (t)} diverges for each t ∈ [−π, π].
(ii) If x ∈ Lp ([−π, π]) for some p > 1, then {sn (t)} converges for almost
all t ∈ [−π, π] (see Carleson, A [10]).

A relevant theorem in this connection is the following.


10.7.2 Theorem
In order that a sequence {xn (t)} ⊂ Lp ([0, 1]) converges weakly to
x(t) ∈ Lp ([0, 1]), it is necessary and sufficient that

(i) the sequence {||xn ||} is bounded,


 1  t
(ii) xn (s)ds −→ x(s)ds for any t ∈ [0, 1]
0 0

Proof: The assumption (i) is the same as that of theorem 6.3.22.


Therefore, we examine assumption (ii).
For this purpose, let us define,
3
1 for 0 ≤ t ≤ s
αs (t) =
0 for s ≤ t ≤ 1

Then the sums



n
ci [αsi (t) − αsi−1 (t)],
i=1
380 A First Course in Functional Analysis

where 0 = s0 < s1 < · · · < sn−1 < sn = 1 are everywhere dense in


Lq ([0, 1]) = L∗p ([0, 1]).
w
Hence, in order that xn (t) −→ x(t), it is necessary and sufficient that
assumption (i) is satisfied and that
 1  1
xn (t)αs (t)dt −→ x(t)αs (t)dt
0 0
 s  s
or, xn (t)dt −→ x(t)dt,
0 0

as n → ∞ and for every s ∈ [0, 1].


10.7.3 Remark
Therefore, if sn (t) ∈ Lp ([0, 1]) and fulfils the conditions of the theorem
10.7.2, then
w
{sn (t)} −→ s(t) as n → ∞.
CHAPTER 11

UNBOUNDED LINEAR
OPERATORS

In 4.2.3 we defined a bounded linear operator in the setting of two normed


linear spaces Ex and Ey and studied several interesting properties of
bounded linear operators. But if said operator ceases to be bounded, then
we get an unbounded linear operator.

The class of unbounded linear operators include a rich class of operators,


notably the class of differential operators. In 4.2.11 we gave an example
of an unbounded differential operator. There are usually two different
approaches to treating a differential operator in the usual function space
setting. The first is to define a new topology on the space so that the
differential operators are continuous on a nonnormable topological linear
space. This is known as L. Schwartz’s theory of distribution (Schwartz, L
[52]). The other approach is to retain the Banach space structure while
developing and applying the general theory of unbounded linear operators
(Browder, F [9]). We will use the second approach. We have already
introduced closed operators in Chapter 7. The linear differential operators
are usually closed operators, or at least have closed linear extensions.
Closed linear operators and continuous linear operators have some common
features in that many theorems which hold true for continuous linear
operators are also true for closed linear operators. In this chapter we point
out some salient features of the class of unbounded linear operators.

381
382 A First Course in Functional Analysis

11.1 Definition: An Unbounded Linear


Operator
11.1.1 Let Ex and Ey be two normed linear spaces
Let a linear operator A : D(A) ⊂ Ex −→ R(A) ⊂ Ey , where D(A) and
R(A) stand for the domain and range of A, respectively.
If it does not fulfil the condition

AxEy ≤ KxEx , for all x ∈ Ex (11.1)

where K is a constant (4.2.3), the operator becomes unbounded.


See 4.2.11 for an example of an unbounded linear operator.
11.1.2 Theorem
Let A be a linear operator with domain Ex and range in Ey . The
following statements are equivalent :
(i) A is continous at a point
(ii) A is uniformly continuous on Ex
(iii) A is bounded, i.e., there exists a constant K such that (11.1) holds
true for all x ∈ Ex [see theorem 4.1.5 and theorem 4.2.4].

11.2 States of a Linear Operator


Definition: State diagram is a table for keeping track of theorems between
the ranges and the inverses of linear operators A and A∗ . This diagram
was constructed by S. Goldberg ([21]). In what follows, A : Ex → Ey , Ex
and Ey being normed linear (Banach) spaces.
We can classify the range of an operator into three types :
I. R(A) = Ey
II. R(A) = Ey , R(A) = Ey ,
III. R(A) = Ey .
Similarly, A−1 may be of the following types:
(a) A−1 exists and is continuous and hence bounded
(b) A−1 exists but is not continuous
(c) A has no inverse.
If R(A) = Ey , we say A is in state I or that A is surjective written as
A ∈ I. Similarly we say A is in state b written as A ∈ b if R(A) = Ey but
R(A) = Ey . Listed below are some theorems that show the impossibility
of certain states for (A, A∗ ). For example, if A fulfills conditions I and b,
then A will be said to belong to Ib . Similar meaning for A∗ ∈ IIc .
Now (A, A∗ ) will be said to be in state (Ib , IIc ) if A ∈ Ib then A∗ ∈ IIc .
Unbounded Linear Operators 383

11.2.1 Theorem
If A has a bounded inverse, then R(A∗ ) is closed.

Proof: Let us suppose that A∗ : fp ∈ Ey∗ → g ∈ Ex∗ . Since A∗ has a


bounded inverse, there exists an m > 0 such that

A∗ fp − A∗ fq  ≥ mfp − fq .

Thus, {fp } is a Cauchy sequence which converges to some f in some


Banach space Ey∗ . Since A∗ is closed, f is in D(A∗ ) and A∗ f = g. Hence
R(A∗ ) is closed.
11.2.2 Remark
The above theorem shows that if A∗ ∈ II then A∗ cannot belong to a
i.e., A∗ ∈
/ IIa .
11.2.3 Definition: orthogonal complements K ⊥ , ⊥ C
In 5.1.1. we have defined a conjugate Ex∗ of Banach space Ex . Let A
map a Banach space Ex into a Banach space Ey . In 5–6 an adjoint, A∗
of an operator A mapping Ey∗ −→ Ex∗ was introduced. We next introduce
the notion of an orthogonal complement of a set in Banach space.
The orthogonal complement of a set K ⊆ Ex is denoted by K ⊥ and
is defined as
K ⊥ = {f ∈ Ex∗ : f (x) = 0, ∀ x ∈ K} (11.2)

Orthogonal: A set K ⊆ Ex is said to be orthogonal to a set F ⊆ Ex∗ ,


if f (k) = 0 for f ∈ F ⊆ Ex∗ and ∀ k ∈ K.
Thus, K ⊥ is called an orthogonal complement of K, because K ⊥ is
orthogonal to K.
Even if K is not closed. K ⊥ is a closed subspace of Ex∗ .

C: If C is a subset of Ex∗ , the orthogonal complement of C in Ex
is denoted by ⊥ C and defined by

C = {x : x ∈ Ex , Fx (f ) = 0 ∀ f ∈ C} (11.3)

For notion of Fx see theorem 5.6.5.


11.2.4 Remarks
K and ⊥ C are closed subspaces respectively of Ex∗ and Ex . Also


K ⊥ = K and ⊥ C = ⊥ C.
11.2.5 Theorem
If L is a subspace of Ex , then ⊥ (L⊥ ) = L.
Proof: Let {xn } ⊆ L be convergent and lim xn = x ∈ L. Let {fm } ⊆ L⊥
n→∞
be convergent and L⊥ being closed lim fm = f ∈ L⊥ .
m→∞
384 A First Course in Functional Analysis

Now, |fm (xn ) − f (x)| ≤ |fm (xn ) − f (xn )| + |f (xn ) − f (x)|.


Now, lim fm (xn ) = f (xn ).
m→∞
Since x ∈ Ex , f (x) = 0 for f ∈ L⊥ . Hence f (xm ) − f (x) = 0.
Thus, fm (xn ) −→ f (x) = 0 as m, n −→ ∞.
Hence, x ∈ ⊥ (L⊥ ). Thus ⊥
(L)⊥ ⊆ L.
On the other hand, L ⊆ ⊥ (L)⊥ since ⊥
(L)⊥ is a closed subspace.

Thus (L)⊥ = L.

11.2.6 Theorem
If M is a subspace of Ex∗ , then (⊥ M )⊥ ⊃ M . If Ex is reflexive, then
( M )⊥ = M .

Proof: Let {fn } ⊆ M ⊆ Ex∗ be a convergent sequence. Let fn −f Ex∗ → 0


as n → ∞, where f ∈ M .

Now, M = {x0 : x ∈ Ex , Fx (f ) = 0 ∀ f ∈ M }
Since fn ∈ M and if x ∈ Ex we have Fx (fn ) = 0.
Hence, fn (x) = Fx (fn ) = 0 for x ∈ ⊥M [see 5.6.5].
Thus, |fn (x) − f (x)| ≤ fn − f  x, x ∈ ⊥ M .
→ 0 as n → ∞.
or |0 − f (x)| → 0 as n → ∞.
Hence f (x) = 0 for x ∈ ⊥ M . Thus f ∈ (⊥ M )⊥ .
Hence f ∈ M ⇒ f ∈ (⊥ M )⊥ proving that (⊥ M )⊥ ⊇ M .
For the second part see Remark 11.2.7.

11.2.7 Remark
If Ex is reflexive, i.e., Ex = Ex∗∗ , then (⊥ M )⊥ = M .
11.2.8 Definition: domain of A∗
Domain of A∗ is defined as
D(A∗ ) = {φ : φ ∈ Ey∗ , φA is continuous on D(A)}.
For φ ∈ D(A∗ ), let A∗ be the operator which takes φ ∈ D(A∗ ) to φA,
where φA is the unique continuous linear extension of φA to all of Ex .
11.2.9 Remark
(i) D(A∗ ) is a subspace of Ey∗ and A∗ is linear.
(ii) A∗ φ is taken to be φA rather than φA in order that R(A∗ ) is
contained in Ex∗ [see 5.6.15].
Unbounded Linear Operators 385

11.2.10 Theorem
(i) R(A)⊥ = R(A)⊥ = N (A∗ ).
(ii) R(A) = ⊥ N (A∗ ).
In particular, A has a dense range if and only if A∗ is one-to-one.

Proof: (i) R(A)⊥ is a closed subspace of Ey∗ . Hence, R(A)⊥ = R(A)⊥ .


R(A)⊥ = {φ : φ ∈ Ey∗ , φ(v) = 0, v = Au ∈ R(A)}.
Now, φ(v) = φ(Au) = A∗ φ(u).
Therefore if φ ∈ R(A)⊥ , φ ∈ N (A∗ ).
Hence, R(A)⊥ ⊆ N (A∗ ). (11.4)
∗ ∗
On the other hand, N (A ) = {ψ : ψ ∈ D(A ) ⊇ Ey∗ , ∗
A ψ = 0},

Now, A ψ = 0,
Hence, ψ(Au) = 0, ∀ u ∈ D(A) i.e. ψ ∈ R(A)⊥ .
Thus, N (A∗ ) ⊆ R(A)⊥ . (11.5)
⊥ ∗
(11.4) and (11.5) together imply that R(A) = N (A ).

(ii) It follows from (i) of theorem 11.2.6 that (R(A)⊥ ) = R(A).
Again (i) of this theorem yields R(A)⊥ = N (A∗ ).

Hence, N (A∗ ) = ⊥ (R(A)⊥ ) = R(A).
If R(A) is dense in Ey , we have
R(A) = Ey = ⊥ N (A∗ )
= {y : y ∈ Ey , Gy (φ) = φ(y) = 0 where A∗ φ = 0}
Thus, A∗ φ = 0 ⇒ φ = 0 showing that A∗ is one-to-one.

11.2.11 Theorem
If A and A∗ each has an inverse then (A−1 )∗ = (A∗ )−1 .

Proof: By theorem 11.2.10. D(A−1 ) = R(A) is dense in Ey . Hence


(A−1 )∗ is defined. Suppose f ∈ D((A∗ )−1 ) = R(A∗ ). Then there exists a
φ ∈ D(A∗ ), such that A∗ φ = f . To show φ ∈ D((A−1 )∗ ) we need to prove
that f (A−1 ) is continuous on R(A). Now,
f ((A−1 A)x) = A∗ φ(x) = φ(Ax), x ∈ D(A)
Thus, (A−1 )f = φ on R(A), where φ = (A−1 )∗ f = (A−1 )∗ A∗ φ since
R(A) is dense in Ey . Hence, (A−1 )∗ = (A∗ )−1 on D((A∗ )−1 ). It remains
to prove that D((A−1 )∗ ) ⊆ D((A∗ )−1 ).
Let us next suppose that ψ ∈ D((A−1 )∗ ). We want to show that
ψ ∈ D((A∗ )−1 ) = R(A∗ ). For that we show that there exists an element
v ∗ ∈ D(A∗ ) such that A∗ v ∗ = ψ or equivalently v ∗ A = ψ on D(A).
386 A First Course in Functional Analysis

Keeping in mind the definition of D(A∗ ) we define v ∗ as the continuous


linear extension of ψA−1 to all of Ey , thereby obtaining A∗ v ∗ = ψ. Thus,
D((A−1 )∗ ) ⊂ D((A∗ )−1 ).

11.3 Definition: Strictly Singular Operators


The concept of a strictly singular operator was first introduced by T.
Kato [28] in connection with the development of perturbation theory. He
has shown that there are many properties common between A and A + B,
where B is strictly singular. In what follows we take Ex and Ey as two
normed linear spaces.

Definition: strictly singular operator


Let B be a bounded linear operator with domain in Ex and range in Ey .
B is called strictly singular if it does not have a bounded inverse on any
infinite dimensional subspace contained in its domain.
11.3.1 Example
The most important examples of strictly singular operators are compact
operators. These play a significant role in the study of differential and
integral equations.
In cases where the normed linear spaces are not assumed complete, it
is convenient to also consider precompact operators.
11.3.2 Definition: precompact operator
Let A be a linear operator mapping Ex into Ey .
If A(B(0, 1)) is totally bounded in Ey , then A is called precompact.
11.3.3 Theorem
Every precompact operator is strictly singular.

Proof: Let B be a precompact operator with domain in Ex and range in


Ey . B is bounded since a totally bounded set is bounded. Let us assume
that B has a bounded inverse on a subspace M ⊆ D(B). If BM (0, 1) is a
unit ball in M , then B[BM (0, 1)] is totally bounded. Since B has a bounded
inverse on M , it follows that BM (0, 1) is totally bounded in M . Since unit
ball in M is totally bounded, it has a finite -net, i.e., it has finite number of
points x1 , x2 , . . . xn in the unit ball in M such that for every x ∈ BM (0, 1),
there is an xi such that
x − xi  < 1 (11.6)

Let us assume that the finite dimensional space N spanned by x1 , x2 , . . . xk


is M .
Unbounded Linear Operators 387

Suppose this assertion is false. Since N is a finite dimensional proper


subspace of the normed linear space M , we will show that there exists an
element in the unit ball of M whose distance from N is 1. Let z be a point
in M but not in N . Then there exists a sequence {mk } in N such that
z − mk  → d(z, N ).
Since N is finite dimensional and {mk } is bounded, N must also be
compact (1.6.19). Hence, {mk } has a convergent subsequence {mkp } which
converges to m ∈ N (say). Hence,

z − m = lim z − mkp  = d(z, N )


p→∞
= d(z − m, N )
Since z − m = 0,
0 0  
0 z − m 0 d(z − m, N ) z−m
1=0 0 0 = =d ,N .
z − m0 z − m z − m
This contradicts (11.6).

Hence M is finite-dimensional. Therefore, B does not have an inverse


on an infinite dimensional subspace.
11.3.4 Definition: finite deficiency
A subspace L of a vector space E is said to have finite deficiency in
E if the dimension of E/L is finite. This is written as

dim E/L < ∞.

Even though L is not contained in D(A), or restriction of A to L will mean


a restriction of A to L ∩ D(A).
11.3.5 Theorem
Let A be a linear operator from a subspace of Ex into Ey . Assume
that A does not have a bounded inverse when restricted to any closed
subspace having finite deficiency in Ex . Then, given an arbitrarily small
number > 0, there exists an infinite dimensional subspace L( ) contained
in D(A), such that A restricted to L( ) is precompact and has norm not
exceeding .

Proof: Since A does not have a bounded inverse on a closed subspace Ex


having finite deficiency, we can not find a m > 0 such that Ax ≥ mx
on such a subspace. Therefore, there is no loss of generality in assuming
that there exists an x1 ∈ Ex such that x1  = 1 and Ax1  < 3 . There
is an f1 ∈ Ex∗ such that f1  = 1 and f1 (x2 ) = x2  = 1. Since N (f1 )
has a deficiency 1 in Ex , there exists an element x2 ∈ N (f1 ) such that
x2  = 1 and Ax2  ≤ 32 . There exists an f2 ∈ Ex∗ such that f2  = 1
and f2 (x2 ) = x2  = 1. Since, N (f1 ) ∩ N (f2 ) has finite deficiency in Ex ,
there exists an x3 ∈ N (f2 ) ∩ N (f1 ) such that x3  = 1 and Ax3  ≤ 33 .
388 A First Course in Functional Analysis

Hence, by induction, we can construct sequences {xk } and {fk } having the
following properties :

xk  = fk  = fk (xk ) = 1, Axk  < , 1≤k<∞ (11.7)
3k
xk ∈ ∩k−1
i=1 N (fi ) or equivalently, fi (xk ) = 0 1 ≤ i < ∞ (11.8)
i = k

We next show that {xk } is linearly independent.


If that is not so, we can find α1 , α2 , . . . αk not all zeroes, such that

α1 x1 + α2 x2 + · · · + αk xk + · · · = 0.
or α1 fi (x1 ) + α2 fi (x2 ) + · · · + αk fi (xk ) + · · · = 0.

Using (11.7) and (11.8) we get αi = 0 for 1 ≤ i < k = 2, 3, . . .


Hence, {xk } is a linearly independent set. Let L = span {x1 , x2 , . . .}.
L is an infinite dimensional subspace of D(A). It will now be shown that
the restriction AL of A to L has norm not exceeding .
l
Suppose x = αi xi . Then from (11.7) and (11.8),
i=1

α1 = |f1 (x1 )|  f1  x = x

We next want to establish that

|αk | ≤ 2k−1 x 1≤k≤l (11.9)

Let us suppose that (11.9) is true for k ≤ j < l.


Then we get from (11.7) and (11.8) that


j
fj+1 (x) = αi fj+1 (xi ) + αj+1 (11.10)
i=1

Hence, by (11.10) and the induction hypothesis,


j
|αj+1 | ≤ |fj+1 (x)| + |αi | |fj+1 (xi )|
i=1


j
≤ x + 2i−1 x = 2j x
i=1
Hence, (11.9) is true by induction.

l 
l
Thus, Ax ≤ |αi |Axi  ≤ 2i−1 3−i x ≤ x
i=1 i=1
Hence, Ax ≤ .
Unbounded Linear Operators 389

To prove that AL is precompact, we would show that AL is the limit


in (L −→ Ey ) of a sequence of precompact operators [see 8.1.9]. For each
positive integer n, we define AL n : L −→ Ey to be A on span {x1 , . . . xn }
and 0 on span {xn+1 , xn+2 , . . .}. It may be noted that ALn is linear and has

n+k
finite dimensional range. Moreover, AL n is bounded on L for if x = α i xi
i=1
then by (11.7) and (11.9)


n 
n
AL
n x ≤ |αi | Axi  ≤ 2i−1 3−i x.
i=1 i=1

Thus AL n is bounded and finite dimensional.


Hence, AL n is precompact.

∞ 

Since, AL x − AL n x ≤ |αi | Axi  ≤ x 2i−1 3−i → 0 as
i=n+1 i=n+1
n → ∞, it follows that AL
n converges to AL in (L −→ Ey ). Hence, AL is
precompact [see 8.1.9].

11.4 Relationship between Singular and


Compact Operators
The following theorem reveals the connection between the class of strictly
singular and the class of compact operators.
11.4.1 Theorem
Suppose B ∈ (Ex −→ Ey ). The following statements are equivalent :
(a) B is strictly singular
(b) For every infinite dimensional subspace L ⊂ Ex , there exists an
infinite dimensional subspace M ⊆ L such that B is precompact on M .
(c) Given > 0 and given L an infinite dimensional subspace of Ex ,
there exists an infinite dimensional subspace M ⊆ L such that B restricted
to M and has norm not exceeding , an arbitrary small positive number.

Proof: We first show that (a) implies (b). Let us suppose that B is strictly
singular and L is an infinite-dimensional subspace of Ex . Then BL , the
restriction of B to L, is strictly singular. Therefore, BL does not have a
bounded inverse on an infinite dimensional subspace M ⊆ L. Hence, by
theorem 11.3.4, B is precompact on such a M ⊆ L.
Next, we show that (b) ⇒ (c). If (b) is true then we assert that B does
not have a bounded inverse on an infinite dimensional subspace, having
finite deficiency in L. If that is not so, then B would be precompact and
390 A First Course in Functional Analysis

would have a bounded inverse at the same time. This violates the conclusion
of theorem 11.3.3. Hence by applying theorem 11.3.4 to BL (c) follows.
Finally, we show that (c) ⇒ (a). It follows from (c) that BL  ≤ , i.e.,
BL x ≤ x for an arbitrary small > 0, i.e., BL x  >mx, m > 0
and finite, for all x belonging to an infinite dimensional subspace M ⊆ L.
Hence, BL does not have a bounded inverse on M ⊆ L. Thus B is strictly
singular.

11.5 Perturbation by Bounded Operators


Suppose we want to study the properties of a given operator A mapping a
normed linear space Ex into a normed linear space Ey . But if the operator
A turns out to be involved, then A is replaced by, say, T + V , where T
is a relatively simple operator and V is such that the knowledge about
properties of T is enough to gain information about the corresponding

n
properties of A. For example, if A is a differential operator ak D k , T is
k=s
chosen as an D n and V is the remaining lower order terms. The concern
of the penturbation theory is to find the conditions that V should fulfil so
that the properties of T can help us determine the properties of A.
In what follows, V is a linear operator with domain a subspace of Ex
and range a subspace of Ey .
11.5.1 Definition: kernel index of A, deficiency index of A and
index of A
Kernel index of A: The dimension of N (A) will be defined as the kernel
index of A and will be denoted by α(A).

Deficiency index of A: The deficiency of R(A) in Ey written as β(A), will


be called the deficiency index of A.
Then α(A) and β(A) will be either a non-negative integer or ∞.

Index of A: If α(A) and β(A) are not both infinite, we say A has an inverse.
The index κ(A) is defined by κ(A) = α(A) − β(A).

It is understood as in the real number system, if p is any real number,

∞−p=∞ and p − ∞ = −∞.

11.5.2 Examples
1. Let Ex = Lp ([a, b]) Ey = Lq ([a, b]) where 1 ≤ p, q < ∞.
Let us define A as follows :
D(A) = {u : u(n−1) exists and is absolutely continuous on [a, b], u(n) ∈ Ey }.
Unbounded Linear Operators 391

Au = u(n) , u(n) stands for the n-th derivative of u in [a, b]. It may
be recalled that an absolutely continuous function is differentiable almost
everywhere (10.5). Here, N (A) is the space of polynomials of degree at
most (n − 1). Hence, α(A) = n, β(A) = 0.
2. Let Ex = Ey = lp , 1 ≤ p ≤ ∞. Let {λκ } be a bounded sequence of
numbers and A be defined on all of Ex by A({xκ }) = {λκ xκ }.
α(A) are the members of λk which are 0. β(A) = 0 if {1/λκ } is a
bounded sequence. β(A) = ∞ if infinitely many of the λκ are 0.
11.5.3 Lemma
Let L and M be subspace of Ex with dim L > dim M (thus dim M <
∞). Then, there exists a l = 0 in L such that
l = dist (l, M ).
Note 11.5.1 This lemma does not hold if dim L = dim M < ∞.
4
For example, if Ex = 2 and L and M are two lines through the origin
which are not perpendicular to each other.
If Ex is a Hilbert space, the lemma has the following easy proof.

Proof: First we show that dim M = dim(M ⊕ M ⊥ /M ⊥ ) = dim(H/M ⊥ )


where H/M ⊥ stands for the quotient space.
! ⊥
"
Let x, y ∈ M . We consider
! the
" mapping x → x + M in H/M .

Similarly, y → y + M in H/M . Thus, x + y → (x + M ) + (y + M ) =
x+y+M in (H/M ⊥ ). Similarly, for any scalar λ, λx → λ(x+M ) = λx+M .
Thus, there is an isomorphism between M and (H/M ⊥ ). Let us assume
that L ∩ M ⊥ = Φ. Hence, dim M = dim(M + M ⊥ /M ⊥ ) = dim(H/M ⊥ ) ≥
lim(L + N ⊥ /N ⊥ ) = dim L. The above contradicts the hypothesis that
L ∩ M ⊥ = Φ.
Let x ∈ L ∩ M ⊥ and let x = θ. Then
x − m2 = x2 + m2 ≥ x2 for m ∈ M .
Thus, d(x, M ) = x.
11.5.4 Definition: minimum module of A
Let N (A), the null manifold of A, be closed. The minimum module of
A is written as γ(A) and is defined by

Ax
γ(A) = inf (11.11)
xD(A) d(x, N (A))

11.5.5 Definition
The one-to-one operator  of A induced by A is the operator from
D(A)/N (A) into Ey defined by
Â[x] = Ax,
392 A First Course in Functional Analysis

where the coset [x] denotes the set of elements equivalent to x and
belongs to D(A)/N (A).
 is one-to-one and linear with same range as that of A. We next state
without proof the following theorem.
11.5.6 Theorem (Goldberg [21])
Let N (A) be closed and let D(A) be dense in Ex . If γ(A) > 0, then
γ(A) = γ(A∗ ) and A∗ has a closed range.
11.5.7 Theorem
Suppose γ(A) > 0. Let V be bounded with D(V ) ⊃ D(A). If
V  < γ(A), then
(a) α(A + V ) ≤ α(A)
   
(b) dim Ey /R(A + V ) ≤ dim Ey /R(A)

Proof: (a) For x = θ in N (A + V ) and V  < γ = γ(A),


since x = θ ∈ N (A + V ), Ax + V x = θ i.e., Ax = V x.
γ[x] ≤ Ax = V x ≤ V  x < γx where [x] ∈ Ex /N (A).
Thus, x > [x] = d(x, N (A)).

Therefore, by lemma 11.5.3, the dimension of N (A + V ) < dimension


of N (A), or α(A + V ) ≤ α(A).
(b) Let Ex1 = D(A) and let V1 be V restricted to D(A). Let us consider
A and V1 as operators with domain dense in Ex1 , since f1∗ = A∗ φ∗1 where
φ∗1 ∈ Ey∗ and f ∗ ∈ Ex∗ . Therefore, the domain of A∗ is in Ey∗ and range in
Ex1∗ . Therefore, by theorem 11.5.6,

γ(A∗ ) = γ(A) > V  ≥ V1  = V1∗ .

We next show that

dim(Ey /R(A + V )) = dim(Ey /R(A + V1 )) = dim R(A + V )⊥ .

For g ∈ (Ey /R(A + V1 ))∗ , and the map W defined by:

(W g(x)) = g[x], g ∈ (Ey /R(A + V1 ))∗ ,


we observe |(W g(x)| = |g[x]|  g [x] ≤ g x, x ∈ Ex
and W g(m) = g[m] = 0, m ∈ R(A + V1 ).

Thus, W g is in R(A + V1 ) with
W g ≤ g (11.12)
Since |g[x]| = |W g(y)| ≤ V g y, y ∈ [x]
It follows that |g[x]| ≤ V g [x]
Unbounded Linear Operators 393

Thus, g ≤ V g (11.13)


(11.13) together with (11.12) proves that W is an isometry.
⊥ ⊥
Given f ∈ R(A + V ) , let g be a linear functional on Ex1 /R(A + V )
defined g[x] = g(x).
Now, |g[x]| = |g(y)| ≤ g y, y ∈ [x].
It follows that |g[x]| ≤ g [x]|. Hence, g is in (Ex1 /R(A + V ))⊥ .
Furthermore, V g = f , proving that R(V ) = (R(A + V ))⊥ .
Thus, (Ex1 /R(A + V1 ))∗ = (R(A + V1 ))⊥ (11.14)
Hence, it follows from theorem 11.2.10, definition 11.5.1 and (11.14)

dim(Ex1 /R(A + V1 ))∗ = dim(R(A + V1 ) )
= dim(N (A + V1 )∗ ) = α(A∗ + V1∗ )
≤ α(A∗ ) = dim(Ey /R(A)).

11.6 Perturbation by Strictly Singular


Operators
11.6.1 Definition: normally solvable
A closed linear operator with closed range is called normally
solvable.
11.6.2 Theorem
Let Ex and Ey be complete. If A is closed but R(A) is not closed, then
for each > 0 there exists an infinite-dimensional closed subspace L( )
contained in D(A), such that A restricted to L( ) is compact with norm
not exceeding , an arbitrarily small number.

Proof: Let U be a closed subspace having finite deficiency in Ex . Assume


that A has a bounded inverse on U . Since A is closed, Axn → y, xn ∈ U ⇒
{xn } is a Cauchy sequence and therefore converges to x in Banach space
U . Thus, AU is closed.
Moreover, A being closed, x ∈ D(A) and Ax = y. By hypothesis,
there exists a finite dimensional subspace N of Ex such that Ex = U + N .
Hence AEx = AU + AN ⊆ Ey . Thus AU is a closed subspace and AN
is a finite dimensional subspace of Ey . Define a linear map B from Ey
onto Ey /AU by By = [y]. Since By = [y] ≤ y, B is continuous.
Moreover, the linearity of B and the finite dimensionality of N imply
the finite dimensionality of BN . Now a finite dimensional subspace of
a normed linear space is complete and hence closed. Since B is continuous
B −1 BN = U + N is closed (B −1 is used in the set theoretic sense). Thus
394 A First Course in Functional Analysis

AEx is closed. But this contradicts the hypothesis that R(A) is not closed.
Therefore, A does not have a bounded inverse on U . Hence there exists
an L = L( ) with the properties described in theorem 11.3.4. Since Ey is
complete and A is closed and bounded on L, it follows that L is contained
in D(A). Moreover, AL ≤ and ABL = AB L , where BL and BL are
unit balls in L and L respectively and AL is the restriction of A to L.
The precompactness of A and the completeness of Ey imply that AB L is
compact. Thus, AL is compact.
11.6.3 Theorem
Suppose that A1 is a linear extension of A such that

dim(D(A1 )/D(A)) = n < ∞,

(a) If A is closed then A1 is closed


(b) If A has a closed range, then A1 has a closed range.

Proof: (a) By hypothesis, D(A1 ) = D(A) + N , where N is a finite


dimensional subspace. Hence, G(A1 ) = G(A) + H, where G(A1 ) and G(A)
are the graphs of A1 and A respectively and H = {(n, A1 n)| : n ∈ N }.
Thus, if G(A) is closed, then G(A1 ) is closed since H is finite
dimensional.
(b) R(A1 ) = R(A) + A1 N , A1 N is finite dimensional and hence closed.
Also R(A) is given to be closed. Hence, R(A1 ) is closed.
11.6.4 Theorem
Let Ex and Ey be complete and let A be normally solvable. If L is a
subspace (not necessarily closed) of Ex such that L + N (A) is closed, then
AL is closed. In particular, if L is closed and N (A) is finite dimensional,
then AL is closed.

Proof: Let A1 be the operator A restricted to D(A) ∩ (LN (A)). Then A1


is closed and N (A1 ) = N (A).
Hence γ(A1 ) ≥ γ(A) > 0. Therefore, A1 has a closed range, i.e.,
AL = A1 (L + N (A)) is closed.
If V is strictly singular with no restriction on its norm, then we get an
important stability theorem due to Kato (Goldberg [21]).
11.6.5 Theorem
Let Ex and Ey be complete and let A be normally solvable with
α(A) < ∞.
If V is strictly singular and D(A) < D(V ), then
(a) A + V is normally solvable
(b) κ(A + V ) = κ(A)
Unbounded Linear Operators 395

(c) α(A + λV ) and β(A + λV ) have constant values p1 and p2 ,


respectively, except perhaps for isolated points. At the isolated points,
p1 < λ(A + λV ) < ∞ and β(A + λV ) > p2 .

Proof: (a) Since α(A) < ∞, i.e., the null space of A is finite dimensional
i.e., closed, there exists a closed subspace L of Ex such that Ex = L⊕N (A).
Let AL be the operator A restricted to L ∩ D(A). Then A being closed, AL
is closed with R(AL ) = R(A). Let us suppose that A + V does not have a
closed range. Now A + V is an extension of AL + V . Then it follows from
theorem 11.6.3(b) that AL + V does not have a closed range. Moreover,
11.6.3(a) yields that AL + V is closed since AL is closed. Thus, AL + V is a
closed operator but its range is not closed. It follows from theorem 11.6.2
that there exists a closed infinite–dimensional subspace L0 contained in
D(AL ) = D(AL + V ) such that

γ(AL )
(AL + V )x < x, x ∈ L0 (11.15)
2
Thus, since AL is one-to-one, it follows for all x in L0 ,

V x ≥ AL x − (AL + V )x


 
γ(AL ) γ(AL )
≥ γ(AL ) − x = x.
2 2
The above shows that V has a bounded inverse on the infinite dimensional
space L0 . This, however, contradicts the hypothesis that V is strictly
singular. We next show that α(A+V ) < ∞. There exists a closed subspace
M1 such that
N (A + V ) = N (A + V ) ∩ N (A) ⊕ M1 (11.16)
Let A1 be the operator A restricted to M1 . Since N (A) is finite dimensional,
i.e., closed, N (A) + M1 is closed and A is normally solvable. Hence, by
theorem 11.6.4 AM1 is closed. Thus, R(A1 ) = AM1 is closed. Moreover, A1
is one-to-one. Hence, its inverse is bounded. Since M1 ⊆ N (A + V )#V =
−A1 on M1 and V is strictly singular, M1 must be finite dimensional.
Therefore, 11.16 implies that N (A + V ) is finite dimensional.
(b) We have shown above that for all scalars λ, A + λV is normally
solvable and α(A + λV ) < ∞.
Let I denote the closed interval [0, 1] and let Z be the set of integers
together with the ‘ideal’ elements ∞ and −∞. Let us define φ : I → Z
by φ(x) = κ(A + λV ). Let I have the usual topology and let Z have the
discrete topology, i.e., points are open sets. To prove (b) it suffices to show
that φ is continuous. If φ is continuous, then φ(I) is a connected set which
therefore consists of only one point. In particular,

κ(A) = φ(0) = φ(1) = κ(A + V ) (11.17)


396 A First Course in Functional Analysis

In order to show the continuity of φ, we first prove that


κ(A + V ) = κ(A)for V sufficienty small.
We refer to (a) and note that AL is closed, one-to-one and R(AL ) =
R(A). Hence, AL has a bounded inverse. Then, by theorem 11.5.7, we have

α(AL + V ) = α(AL ) = 0 (11.18) Since AL has a bounded inverse


dim(Ey /R(AL + V )) = dim(Ey /R(AL )) provided V  < γ(AL )
Hence, β(AL + V ) = β(AL ). (11.19)
Now, D(A + V ) = D(A) = D(A) ∩ L ⊕ N (A).
D(AL + V ) = D(AL ) = D(A) ∩ L.
Thus, D(A) = D(AL ) ⊕ N (A).
Thus, dim(D(A)/D(AL )) = dim(N (A)) = α(A).
Hence, κ(A) = κ(AL ) + α(A)
= κ(AL + V ) + α(A) = κ(A + V ), (11.20)
where V  < γ(AL ).

Hence, the continuity of φ is established and (11.17) is true.


(c) For proof, see Goldberg [21].

11.7 Perturbation in a Hilbert Space and


Applications
11.7.1 A linear operator A defined in a Hilbert space is said to be
symmetric if it is contained in its adjoint A∗ and is called self-adjoint if
A = A∗ .
11.7.2 Definition: coercive operator
A symmetric operator A with domain D(A) dense in a Hilbert space H
is said to be coercive if there exists an α > 0 such that

∀ u ∈ D(A), Au, u ≥ αu, u (11.21)

11.7.3 Definition: scalar product [ , ]


Let A be a coercive linear operator with domains D(A) dense in a Hilbert
space H. Let A satisfy (11.21) ∀ u ∈ D(A).

We define [u, v] = Au, v.


Since A is self-adjoint [u, v] = [v, u] ∀ u, v ∈ D(A).
0 ≤ [u, u] = |u|2 (11.22)
Unbounded Linear Operators 397

It may be seen that [ , ] defines a new inner product in D(A). We


complete D(a) w.r.t. the new product and call the new Hilbert space as
HA , where | · | defined by (11.22) will be the norm of HA .
11.7.4 Perturbation
Our concern is to solve a complicated differential equation

A1 u = f, u ∈ D(A1 ) (11.23)

A1 is coercive in the Hilbert space H1 .


We often replace the above equation by a simpler equation of the form

A2 u = f (11.24)

The question that may arise is to what extent the replacement is


justified, or in other words we are to determine how close is the solution of
equation (11.24) to the solution of equation (11.23). For this, let us assume
that both the operators A1 and A2 are symmetric and coercive on their
respective domains D(A1 ) and D(A2 ) respectively. We complete D(A1 )
and D(A2 ) respectively w.r.t. to the products [u, u]A1 , u ∈ D(A1 ) ⊆ H and
[v, v]A2 , v ∈ D(A2 ) ⊆ H. We call the Hilbert spaces so generated as HA1
and HA2 respectively.

Let |u|2A1 = A1 u, u ≥ α1 u, u = α1 u2A1 , u ∈ HA1 (11.25)


Also, let |u|2A2 = A2 v, v ≥ α2 v, v = α2 v2A2 v ∈ HA2 (11.26)

11.7.5 Theorem
Let the symmetric and coercive operators A1 and A2 fulfill respectively
the inequalities (11.25) and (11.26). Moreover, let HA1 and HA2 coincide
and are each seperable. If u0 and u1 are the solutions of equations (11.23)
and (11.24), then there exists some constant η such that

|u1 − u0 |A2 ≤ η|u1 |A2 (11.27)

Before we prove theorem 11.7.5, a lemma is proved.


11.7.6 Lemma
Let A1 be a symmetric coercive operator fulfilling condition (11.25). Let
ΔA1 be a symmetric nonnegative bounded linear operator and satisfy the
condition

0 ≤ ΔA1 u, u ≤ α3 u, u, α3 > 0, ∀u ∈ D(ΔA1 ) ⊇ D(A1 ) (11.28)

Let α1 > α3 and α1 − α3 = α2 .


Then A2 = A1 − ΔA1 is a symmetric linear coercive operator satisfying
the condition (11.26).
398 A First Course in Functional Analysis

Proof: A2 u, u = (A1 − ΔA1 )u, u = A1 u, u − ΔA1 u, u


≥ (α1 − α3 )u, u = α2 u, u
∀ u ∈ D(A2 ).

Moreover, since A1 and ΔA1 are symmetric linear operators, A2 = A1 −


ΔA1 is symmetric and coercive.

Proof (th. 11.7.5) HA1 being a Hilbert space, we can define


1
|u|A1 = [u, u]A2 1 = A1 u, u1/2 , u ∈ D(A) ⊆ HA1
1
Similarly, |u|A2 = [u, u]A2 2 = A2 u, u1/2 , u ∈ D(A2 ) ⊆ HA2
HA1 and HA2 being seperable are isomorphic.
It follows from (11.27) (A1 − A2 )u, u ≤ α3 |u|2
 
α3
or A1 u, u ≤ α3 |u|2 + |u|2A2 ≤ 1 + |u|2A2
α2
 
2 α3
or |u|A1 ≤ 1 + |u|2A2 .
α2
;
α3
or |u|A1 ≤ 1 + |u|A2 (11.29)
α2
Again, since ΔA1 is non-negative, we have
|u|2A1 ≥ |u|2A2 (11.30)
Hence, we can find positive constants β1 , β2 , such that
β1 |u|2A2 ≤ |u|2A1 ≤ β2 |u|2A2 ∀ u ∈ HA1 = H2 (11.31)
If u0 and v 0 are the respective unique solutions of (11.23) and (11.24),
then the inequality
|v 0 − u0 |A2 ≤ η|v0 |A2 (11.32)
holds, in which the constant η is defined by the formula
 
|β1 − 1| |β2 − 1|
η = max ; (11.33)
β1 β2
Formulas (11.32) and (11.33) solve the problem (11.23) approximately with
an estimation of the error involved.
11.7.7 Example
1. For error estimate due to perturbation of a second order elliptic
differential equation see Mikhlin [36].
2. In the theory of small vibrations and in many problems of quantum
mechanics it is important to determine how the eigenvalues and the
n
eigenvectors of a quadratic form K(x, x) = bij , xi xj , are changed if
i=1,j=1
Unbounded Linear Operators 399

both the form K(x, x) and the unit form E(x, x) are altered. Perturbation
theory is applied in this case. See Courant and Hilbert [15].
3. In what follows, we consider a differential equation where perturbation
method is used. We consider the differential equation

d2 y
+ (1 + x2 )y + 1 = 0, y(±1) = 0 (11.34)
dx2
We consider the perturbed equation

d2 y
+ (1 + x2 )y + 1 = 0, y(±1) = 0 (11.35)
dx2

d2 y0
For = 0, the equation + y0 + 1 = 0, y0 (±1) = 0 has the solution
dx2
cos x
y0 = − 1.
sin x
Let y(x, ) be the solution of the equation (11.33) and we expand y(x, )
in terms of
y(x1 ) = y0 (x) + y1 (x) + 2 y2 (x) · · · (11.36)
Substituting the power series (11.36) for y in the differential equation,
we obtain,
∞ ∞


n (yn + yn ) + n x2 yn−1 + 1 = 0
n=0 n=1

and since the coefficients of the powers of must vanish we have,

yn + yn + x2 yn−1 = 0 (11.37)

with the boundary conditions

yn (±1) = 0 (11.38)

Thus, we have a sequence of boundary value problems of the type (11.37)


subject to (11.38) from which y1 , y2 , y3 , . . . etc., can be found [see Collatz
[14]].
CHAPTER 12

THE HAHN-BANACH
THEOREM AND
OPTIMIZATION
PROBLEMS

It was mentioned at the outset that we put emphasis both on the theory and
on its application. In this chapter, we outline some of the applications of
the Hahn-Banach theorem on optimization problems. The Hahn-Banach
theorem is the most important theorem about the structure of linear
continuous functionals on normed linear spaces. In terms of geometry, the
Hahn-Banach theorem guarantees the separation of convex sets in normed
linear spaces by hyperplanes. This separation theorem is crucial to the
investigation into the existence of an optimum of an optimization problem.

12.1 The Separation of a Convex Set


In what follows, we state a theorem which asserts the existence of a
hyperplane separating two disjoint convex sets in a normed linear space.
12.1.1 Theorem (Hahn-Banach separation theorem)
Let E be a normed linear space and X1 , X2 be two non-empty disjoint
convex sets, with X1 being an open set. Then there exists a functional
f ∈ E ∗ and a real number β such that

X1 ⊆ {x ∈ E : Ref (x) < β}, X2 ⊆ {x ∈ E : Ref (x) ≥ β}.

For proof see 5.2.10.


The following theorems are in the setting of 4.
n

400
The Hahn-Banach Theorem and Optimization Problems 401

12.1.2 Theorem (intersection)


4
In the space n , let X1 , X2 , . . . , Xm be compact convex sets, whose
union is a convex set. If the intersection of any (m − 1) of them is non-
empty, then the intersection of all Xj is non-empty.
Proof: We shall first prove the theorem for m = 2.
Let X1 and X2 be non-empty compact convex sets, such that X1 ∪ X2
is convex. Let X1 and X2 be disjoint, then
there is a plane P which separates them
strictly. Since there exist points of X1 ∪X2 P
X1
on both sides of P , and since X1 ∪ X2 is
convex, there exist points of X1 ∪ X2 on
both sides of P . But this is impossible
since P separates strictly X1 and X2 .
Let us next suppose that the result is
true for m = r convex sets. We shall X2
prove this implies that the result holds for
m = r + 1 convex sets X1 , X2 , . . . , Xr+1 . Fig. 12(a)
Put X = ∩j=1 Xj . Then X = Φ by our premise. Now, X = Φ,
r

Xr+1 = Φ. Suppose the two sets are disjoint. Then there exists a plane P 
which separates them strongly. Writing Xj = Xj ∩ P  we have

*
r
Xj = ∪(Xj ∩ P  ) ∪ (Xr+1 ∩ P  )
j=1 ⎛ ⎞
*
r+1
= P ∩ ⎝ Xj ⎠
j=1

Therefore, the union of the sets X1 , X2 , . . . , Xr is convex. Also, the
intersection of any (r − 1) of X1 , X2 , . . . , Xr meets X and Xr+1 and hence
meets P  . Therefore, the intersection of any (r − 1) of X1 , X2 , . . . , Xr is
not empty.
But by hypothesis ∩jr=1 Xj = X ∩ P = Φ contradicting the fact that P 
is a hyperplane which separates X and Xr+1 strictly.
It follows that X ∩ Xr+1 = Φ and so the result holds for m = r + 1.
12.1.3 Theorem
A closed convex set is equal to the intersection of the half-spaces which
contain it.
Proof: Let X be a closed convex set and let A be the intersection of
4
the half-spaces which contains it. If Ef = {x : x ∈ n , f (x) = α} is
4 4
a hyperplane in n , then Hf = {x : x ∈ n , f (x) ≥ α} is called a
half-space.
If x0 ∈ X, then {x0 } is a compact convex set not meeting X. Therefore,
402 A First Course in Functional Analysis

there exists a plane Ef separating {x0 } and set X s.t.

f (x0 ) < α ≤ inf f (x).


x∈X

We thus have Hf ⊃ X and x0 ∈ Hf .


Consequently, x0 does not belong to the intersection of the half-spaces
Hf containing X, i.e., x0 ∈ A. Hence, X ⊃ A and since A ⊇ X, A = X.
12.1.4 Definition: plane of support of X
4
Let X be any set in n . A plane Ef containing at least one point of X
and s.t. all points of X are on one side of Ef is called a plane of support
or supporting hyperplane of X.
12.1.5 Observation
If X is compact, then for any linear functional f which is not identically
zero, there exists a plane of support having equation f (x) = α (it is
sufficient to take α = minx∈X f (x)).
12.1.6 Plane of support theorem
If X is a compact non-empty convex set, it admits of an extreme point;
in fact, every plane of support contains an extreme point of X.
Proof: (i) The theorem is true in 4 for, compact convex set in 4 is a
closed segment [α, β] and contains two extreme points α and β; the planes
4 4
of support {x : x ∈ , x = α} and {x : x ∈ , x = β}.
4
(ii) Suppose that the theorem holds for r . We shall prove that it holds
4 4
in r+1 . Let X be a compact convex set in r+1 and let Ef be a plane of
support. The intersection Ef ∩ X is a non-empty closed convex set; since
Ef ∩ X is contained in the compact set X, it is also a compact set. The set
4
Ef ∩ X can be regarded as a compact set in r and so by hypothesis, it
admits of an extreme point x0 . Let [x1 , x2 ] be a line segment of centre x0
with x1 = x0 and x2 = x0 . Since x0 is an extreme point of Ef ∩ X, we have
[x1 , x2 ] ⊂ Ef ∩ X. Therefore, if x1 and x2 ∈ X, we have x1 , x2 ∈ Ef and
hence x1 , x2 are separated by Ef but this contradicts the definition of Ef
as a plane of support of X. It follows that there is no segment [x1 , x2 ] of
centre x0 contained in X and so x0 is an extreme point of X; by definition
x0 is in Ef .
Thus, if the theorem holds for n = r, it holds for n = r + 1. But we
have seen that it holds for n = 1. Hence, by induction, the theorem is true
for all n ≥ 1.

12.2 Minimum Norm Problem and the


Duality Theory
Let E be a real normed linear space and X be a linear subspace of E.
The Hahn-Banach Theorem and Optimization Problems 403

12.2.1 Definition: primal problem


To find u0 ∈ X s.t.

inf ||u0 − u|| = α, u ∈ X (12.1)


u

12.2.2 Definition: dual problem


To find f ∈ X ⊥ , ||f || ≤ 1 (12.2)
s.t. sup f (u0 ) = β (12.3)
⊥ ∗
where X = {f ∈ E : f (u) = 0 ∀ u ∈ X} (12.4)

12.2.3 Theorem: minimal norm problem on the normed space


E
Let X be a linear subspace of the real normed space E. Let u0 ∈ E.
Then the following results are true:
(i) Extremal values: α = β
(ii) Dual problem: The dual problem (12.3) has a solution u∗ .
(iii) Primal problem: Let fˆ be a fixed solution of the dual problem
(12.3). Then the point u0 ∈ X is a solution of the primal problem (12.1) if
and only if,
fˆ(u0 − u) = ||u − u0 || (12.5)
12.2.4 Lemma
If dim X < α, then the primal problem (12.1) always has a solution.
Let v ∈ X and φ ∈ X ⊥ with ||φ|| ≤ 1. Therefore, (i) we obtain the two
sided error estimate for the minimum value α:

||v − u0 || ≥ α ≥ φ(u0 ).

Proof of theorem: (i) and (ii) Since α is the infimum in (12.1), for each
> 0, there is a point u ∈ X such that,

||u − u0 || ≤ α + .

Thus, for all f ∈ X ⊥ with ||f || ≤ 1,

f (u0 ) = f (u0 − u) ≤ ||f ||||u0 − u|| ≤ α + .

Hence β ≤ α + , for all > 0, that is, β ≤ α. Let α > 0. Now theorem
5.1.4 yields that there is a functional fˆ ∈ X ⊥ with ||fˆ|| = 1 such that

fˆ(u0 ) = α (12.6)

Along with β ≤ α, this implies β = α.


If α = 0 then (12.6) holds with fˆ = 0 and hence we again have α = β.
404 A First Course in Functional Analysis

(iii) This follows from α = β and fˆ(u) = 0 ∀ u ∈ X.


Proof of lemma: Since u0 ∈ X, ||u0 || ≤ α. Thus problem (12.1) is
equivalent to the finite-dimensional minimum problem

min Z = ||u − u0 || (12.7)


u∈X0

where the set X0 = {u ∈ X : ||u|| ≤ ||u0 ||} is compact. By the Bolzano-


Weierstrass theorem (1.6.19) this problem has a solution.
12.2.5 Minimum norm problems on the dual space E ∗
Let us consider the modified primal problem:
To find fˆ ∈ X ⊥ s.t.

inf Z̃ = (||fˆ − fˆ0 ||) = α (12.8)


along with the dual problem.


To find u ∈ X,
sup z̃(= fˆ0 (u)) = β (12.9)
u
||u||=1

X ⊥ = {fˆ ∈ E ∗ : fˆ(u) = 0 for all u ∈ X}.


Thus, the primal problem (12.8) refers to the dual space E ∗ , where as
the dual problem (12.9) refers to the original space E.
12.2.6 Theorem
Let X be a linear subspace of the real normed linear space E. Given
fˆ0 ∈ E ∗ , the following results hold good.
(a) Extreme values: α = β
(b) Primal problem: The primal problem (12.8) has a solution fˆ.
(c) Dual problem: Let fˆ be a fixed solution of the primal problem
(12.8). Then, the point u ∈ X with ||u|| ≤ 1 is a solution of the dual
problem (12.9) if and only if,

(fˆ0 − fˆ)(u) = ||fˆ0 − fˆ|| (12.10)

Proof: (i) For all fˆ ∈ X ⊥ ,

||fˆ − fˆ0 || = sup (|fˆ(u) − fˆ0 (u)|)


||u||≤1
≥ sup fˆ0 (u) = β
||u||≤1,
u∈X

Since fˆ(u) = 0 for all u ∈ X. Hence, α ≥ β.


The Hahn-Banach Theorem and Optimization Problems 405

4
Let fˆr : X → be the restriction of fˆ0 : E → 4 to X.
Then ||fˆr || = sup fˆ0 (u) = β.
||u||≤1
u∈X

By Hahn-Banach theorem (theorem 5.1.3) there exists an extension


4
F̂ : E → of fˆr with ||F̂ || = ||fˆr ||.
This implies ĝ := fˆ0 − F̂ = 0 on X, that is, ĝ ∈ X ⊥ . Since α ≥ β and
||ĝ − fˆ0 || = ||F̂ || = ||fˆr || = β, ĝ ∈ X ⊥ ,
we get α = β. Hence, (12.8) has a solution.
(ii) This follows from α = β with fˆ(u) = 0.
12.2.7 Definition: δx
Let −∞ < a ≤ x ≤ b < ∞. Set δx (u) : = u(x) for all u ∈ C([a, b]).
Obviously, δx ∈ C([a, b])∗ and ||δC || = 1.
12.2.8 Lemma
Let f ∈ C([a, b])∗ be such that ||fˆ|| = 0.
ˆ
Suppose that fˆ(u) = ||fˆ|| ||u||, where ||u|| = max |u(x)| and u : [a, b] →
a≤x≤b
R is a continuous function, such that |u(x)| achieves its maximum at
precisely N points of [a, b] denoted by x1 , x2 , . . . , xN . Then there exist
real numbers a1 , a2 , . . . , aN , such that
fˆ = a1 δx1 + · · · + aN δxN
and |a1 | + |a2 | + · · · + |aN | = ||fˆ||.
Proof: By Riesz representation theorem (5.3.3) there exists a function
4
h : [a, b] → of bounded variation, such that
 b
fˆ(u) = u(x)h(x)dx for all u ∈ C([a, b])
a
and Var(h) = ||fˆ||,
where Var(h) stands for the total variation of h on the interval [a, b]. We
assume that h(a) = 0. For simplicity, we take N = 1, ±u(x1 ) = ||u|| and
a < x1 < b.
Let J = [a, b] − [x1 − , x1 + ] for fixed > 0, and let VarJ (h) denote
the total variation of h on J. Then
VarJ (h) + |h(x1 + ) − h(x1 − )| ≤ Var(h) (12.11)

Case I Let VarJ (h) = 0 for all > 0. Then, by (12.11), h is a step function
of the following form,
3
0 if a ≤ x < x1
h(x) =
±Var(h) if x1 < x ≤ b
406 A First Course in Functional Analysis

and by the relation


 b
fˆ(u) = u(x)dh(x) = ±u(x1 )Var(h)
a

for all u ∈ C([a, b]|.


Hence, fˆ = ±Var(h)δx1 .
Case II Let VarJ (h) > 0 for some > 0. We want to show that this is
impossible. By the mean value theorem, there is a point ξ ∈ [x − , x + ]
such that
  x1 +
ˆ
f (u) = u(x)dh(x) + u(x)dh(x)
J x1 −
≤ max |u(x)|VarJ (h) + |u(ξ)||h(x + ) − h(x − )|.
x∈J

Since |u(x)| achieves its maximum exactly at the point x1 , we get


max |u(x)| ≤ ||u||.
x∈J
Thus, it follows from (12.11) that

fˆ(u) < ||u||Var(h).

Hence, fˆ(u) < ||u||||fˆ||. This is a contradiction.


For N > 1 we use a similar argument.

12.3 Application to Chebyshev


Approximation
It is convenient in practice to approximate a continuous function by a
polynomial for various reasons. Let u0 be a continuous function mapping a
compact interval [a, b] → R. Let us consider the following approximation
problem:
max |u0 (x) − u(x)| = min!, u ∈ 2 (12.11)
a≤x≤b

where 2 denotes the set of real polynomials of degree ≤ N for fixed N ≥ 1.


Problem (12.11) corresponds to the famous Chebyshev approximation of
the function u0 by polynomials.
12.3.1 Theorem
Problem (12.11) has a solution. If p(x) is a solution of (12.11), then
|u0(x) − p(x)| achieves its maximum at at least N + 2 points of [a, b].
Proof: Let E = C([a, b]) and ||v|| = max_{a≤x≤b} |v(x)|. Then, (12.11) can be
written as
    min_{p ∈ P_N} ||u0 − p||        (12.12)
Since dim P_N < ∞, i.e., finite, the problem (12.12) has a solution by 12.2.4.
If u0 ∈ P_N, then (12.12) is immediately solved. Let us assume that u0 ∉ P_N.
Let p̂ be a solution of (12.12). Then, since u0 ∉ P_N and p̂ ∈ P_N, we have
||u0 − p̂|| > 0. By the duality theory from theorem 12.2.3 there exists a
functional f ∈ C([a, b])∗ such that
    f(u0 − p̂) = ||u0 − p̂||        (12.13)
along with ||f|| = 1 and
    f(p) = 0   for all p ∈ P_N        (12.14)
Let us suppose that |u0(x) − p̂(x)| achieves its maximum on [a, b] at
precisely the points x1, x2, . . . , xM, where 1 ≤ M < N + 2. It follows
from (12.13) and lemma 12.2.8 that there are real numbers a1, a2, . . . , aM
with |a1| + · · · + |aM| = 1, such that
    f(u) = a1 u(x1) + a2 u(x2) + · · · + aM u(xM)   for all u ∈ C([a, b]),
that is, f = a1 δx1 + · · · + aM δxM.
Assume that aM ≠ 0. Let us choose a real polynomial p̃(x) of degree ≤ N,
such that
    p̃(x1) = p̃(x2) = · · · = p̃(xM−1) = 0   and   p̃(xM) ≠ 0.
This is possible since M − 1 ≤ N. Then p̃ ∈ P_N and f(p̃) = aM p̃(xM) ≠ 0,
contradicting (12.14).
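The equioscillation behaviour described in theorem 12.3.1 can also be observed numerically. The following short sketch (in Python, assuming the NumPy library is available; the choices u0(x) = e^x on [−1, 1] and N = 3 are purely illustrative) interpolates u0 at Chebyshev nodes, which is known to give a near-minimax polynomial, and then counts the alternating extrema of the error u0(x) − p(x); one finds roughly N + 2 = 5 of them.

    import numpy as np

    N = 3
    u0 = np.exp                                           # illustrative choice of u0 on [-1, 1]
    k = np.arange(N + 1)
    nodes = np.cos((2 * k + 1) * np.pi / (2 * (N + 1)))   # Chebyshev nodes
    p = np.polyfit(nodes, u0(nodes), N)                   # interpolating polynomial of degree N
    x = np.linspace(-1.0, 1.0, 2001)
    err = u0(x) - np.polyval(p, x)
    interior = np.where(np.diff(np.sign(np.diff(err))) != 0)[0] + 1
    extrema = np.concatenate(([0], interior, [len(x) - 1]))
    print("number of alternating extrema:", len(extrema))      # roughly N + 2
    print("error at the extrema:", np.round(err[extrema], 5))  # alternating signs, nearly equal size

A Remez-type exchange algorithm would equalize the ripple heights exactly; the sketch above only illustrates the qualitative picture.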

12.4 Application to Optimal Control


Problems
We want to study the motion of a vertically ascending rocket that reaches
a given altitude H with minimum fuel expenditure [see figure 12(b)].
The mathematical model for the system is given by
    d²x/dt² = u(t) − g,        (12.15)
where x is the height of the rocket above the ground level and g is the
acceleration due to gravity. u(t) is the thrust exerted by the rocket.
[Fig. 12(b): a rocket ascending vertically from the Earth]
Let h be the height attained at time T. Then the initial and boundary
conditions of the equation (12.15) are
    x(0) = x′(0) = 0,   x(T) = h        (12.16)

We neglect the loss of mass by the burning of fuel.



Let us measure the fuel expenditure during the time interval [0, T]
through the integral ∫_0^T |u(t)| dt of the rocket thrust. Let T > 0
be fixed. Then the minimal fuel expenditure α(T) during the time interval
[0, T] is given by a solution of the following minimum problem
    min_u ∫_0^T |u(t)| dt = α(T)        (12.17)
where we vary u over all the integrable functions u : [0, T] → ℝ. Integrating
(12.15) we get
    x′(t) = ∫_0^t u(s) ds − gt.
Integrating further,
    x(t) = ∫_0^t dτ ∫_0^τ u(s) ds − (1/2)gt²
         = ∫_0^t (t − s)u(s) ds − (1/2)gt²    [see 4.7.16 Ex. 2]
Thus,
    h = ∫_0^T (T − s)u(s) ds − (1/2)gT².        (12.18)
Thus for given h > 0, we have to determine the optimal thrust u(·) and the
final time T as a solution of (12.17).
This formulation has the following shortcoming. If we consider only
classical force functions u, then an impulse at time t of the form ‘u = δt ’ is
excluded.
However, such types of thrusts are of importance. Therefore, we
consider the generalized problem for functionals:
(a) For a fixed altitude h and fixed final time T > 0, we are looking for
a solution U of the following minimum problem:
    min ||U|| = α(T),   U ∈ C([0, T])∗        (12.19)
along with the side condition
    h = U(w) − T²/2        (12.20)
where we write w(t) = T − t.
(b) We determine the final time T in such a way that
    α(T) = min!        (12.21)
It may be noted that (12.19) generalizes (12.17). In fact, if the functional
U ∈ C([0, T])∗ has the following special form:
    U(v) = ∫_0^T v(t)u(t) dt   for all v ∈ C([0, T]),
where the fixed function u : [0, T] → ℝ is continuous, then we show that
    ||U|| = ∫_0^T |u(t)| dt        (12.22)
Let us define h(t) = ∫_0^t u(s) ds for all t ∈ [0, T].
Then
    U(w) = ∫_0^T w(t) dh(t),
and ||U|| = Var(h), by 12.2.8.        (12.23)
For 0 = t0 < t1 < · · · < tn = T, a partition of the interval [0, T],
    Δ = Σ_{j=1}^n |h(tj) − h(tj−1)| ≤ Σ_{j=1}^n ∫_{tj−1}^{tj} |u(t)| dt = ∫_0^T |u(t)| dt.
Hence, Var(h) ≤ ∫_0^T |u(t)| dt.
By the mean value theorem,
    Δ = Σ_{j=1}^n |u(ξj)| (tj − tj−1),   where tj−1 ≤ ξj ≤ tj.
Making the partition arbitrarily fine as n → ∞, we get
    Δ → ∫_0^T |u(t)| dt   as n → ∞,
and hence
    Var(h) = ∫_0^T |u(t)| dt.        (12.24)
Thus, (12.23) and (12.24) yield
    ||U|| = ∫_0^T |u(t)| dt.
Thus (12.22) is true.


Theorem 12.4.1 Problems (a), (b) have the following solution:
    U = T δ0   and   T = (2h)^{1/2}
with the minimal fuel expenditure ||U|| = T [see Zeidler, E. [56]].
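The claimed solution can at least be checked for consistency by direct simulation. The sketch below (Python with NumPy; it assumes, as in the side condition (12.20), that g is normalized to 1, and it smears the impulse U = Tδ0 over a short interval of length τ, a purely numerical device) verifies that a total impulse of strength T delivered at t = 0 reaches the altitude h = T²/2 at time T = (2h)^{1/2} with fuel expenditure ∫|u| dt = T.

    import numpy as np

    g = 1.0                      # gravity normalized to 1, matching (12.20)
    h = 2.0                      # illustrative target altitude
    T = np.sqrt(2 * h)           # claimed optimal final time
    tau = 1e-4                   # width of the smeared impulse
    t = np.linspace(0.0, T, 200001)
    dt = t[1] - t[0]
    u = np.where(t < tau, T / tau, 0.0)       # approximation of the thrust T*delta_0
    v = np.cumsum(u) * dt - g * t             # x'(t)
    x = np.cumsum(v) * dt                     # x(t)
    print("x(T) =", x[-1], "  target h =", h)
    print("fuel  =", np.trapz(np.abs(u), t), "  = T =", T)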


CHAPTER 13

VARIATIONAL
PROBLEMS

13.1 Minimization of Functionals in a


Normed Linear Space
In this chapter we first introduce a variational problem. The purpose is to
explore the conditions under which a given functional in a normed linear
space admits of an optimum. Many differential equations arising out of
problems of physics or of mathematical physics are difficult to solve. In
such a case, a functional is built up out of the given equations and is
minimized.
Let H be a Hilbert space and A be a symmetric linear operator with
domain D(A) dense in H. Then ⟨Au, u⟩, for all u ∈ D(A), is a symmetric
bilinear functional and is denoted by a(u, u). a(u, u) is a quadratic
functional (9.3.1). Let L(u) be a linear functional. The minimization
problem can be stated as
    Min_{u∈H} J(u) = (1/2) a(u, u) − L(u)        (13.1)

In general, we consider a vector space E and U an open set of E. Let


J : u ∈ U ⊂ E −→ ℝ. Then the minimization problem is
    Min_{u∈U} J(u)        (13.2)

13.2 Gâteaux Derivative


Let E1 and E2 be normed linear spaces and P : U ⊂ E1 −→ E2 be a
mapping of an open subset U of E1 into E2. We shall call a vector φ ∈ E1,
φ ≠ 0, a direction in E1.
13.2.1 Definition
The mapping P is said to be differentiable in the sense of Gâteaux,
or simply G-differentiable at a point u ∈ U in the direction φ if the
difference quotient [P(u + tφ) − P(u)]/t has a limit P′(u, φ) in E2 as t → 0 in ℝ.
The (unique) limit P′(u, φ) is called the Gâteaux derivative of P at u
in the direction of φ.
P is said to be G-differentiable in a direction φ in a subset of U if it is
G-differentiable at every point of the subset in the direction φ.
13.2.2 Remark
The operator E1 ∋ φ −→ P′(u, φ) ∈ E2 is homogeneous for positive scalars.
For
    P′(u, αφ) = lim_{t→0} [P(u + tαφ) − P(u)]/(tα) · α = αP′(u, φ)   for α > 0.
13.2.3 Remark
The operator P′(u, φ) is not, in general, linear in φ.
Example 1. Let f : ℝ² −→ ℝ be defined by
    f(x1, x2) = 0                            if (x1, x2) = (0, 0),
                x1^5 / ((x1 − x2)² + x1^4)   if (x1, x2) ≠ (0, 0).
If u = (0, 0) ∈ ℝ² and the direction φ = (h1, h2) ∈ ℝ² (φ ≠ 0), we have
    [f(th1, th2) − f(0, 0)]/t = t²h1^5 / ((h1 − h2)² + t²h1^4),
which has a limit as t → 0 and we have
    f′(u, φ) = f′((0, 0), (h1, h2)) = 0    if h1 ≠ h2,
                                      h1   if h1 = h2.
It can be easily verified that f is G-differentiable in ℝ².
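The limits found in Example 1 can be checked numerically from the definition of the Gâteaux derivative. The sketch below (plain Python; the directions (1, 2) and (1.5, 1.5) are arbitrary illustrative choices) evaluates the difference quotient [f(tφ) − f(0)]/t for decreasing t and shows the limit 0 when h1 ≠ h2 and the limit h1 when h1 = h2.

    def f(x1, x2):
        # the function of Example 1
        if (x1, x2) == (0.0, 0.0):
            return 0.0
        return x1**5 / ((x1 - x2)**2 + x1**4)

    for h1, h2 in [(1.0, 2.0), (1.5, 1.5)]:           # h1 != h2, then h1 == h2
        quotients = [f(t * h1, t * h2) / t for t in (1e-1, 1e-2, 1e-3, 1e-4)]
        print((h1, h2), [round(q, 6) for q in quotients])
    # expected limits: 0 in the first case, h1 = 1.5 in the second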


Example 2. Let Ω be an open set in ℝⁿ and E = Lp(Ω), p > 1.
Suppose f : ℝ ∋ t −→ f(t) ∈ ℝ is a continuously differentiable function,
such that (i) |f(t)| ≤ K|t|^p and (ii) |f′(t)| ≤ K|t|^{p−1} for some constant
K > 0. Then
    J(u) = ∫_Ω f(u(x)) dx        (13.3)
defines a functional J on Lp(Ω) = E which is G-differentiable everywhere
in all directions, and we have
    J′(u, φ) = ∫_Ω f′(u(x)) φ(x) dx        (13.4)

We first show that the RHS of (13.4) exists.


Since u ∈ Lp(Ω) and since f satisfies (i) we have
    |J(u)| ≤ ∫_Ω |f(u(x))| dx ≤ K ∫_Ω |u|^p dx < ∞,
which means that J is well-defined on Lp(Ω).
On the other hand, for any u ∈ Lp(Ω), since f′ satisfies (ii), f′(u) ∈
Lq(Ω), where 1/p + 1/q = 1. This is because
    ∫_Ω |f′(u)|^q dx ≤ K^q ∫_Ω |u|^{(p−1)q} dx = K^q ∫_Ω |u|^p dx < ∞.
Thus for any u, φ ∈ Lp(Ω), we have, by using Hölder's inequality
(theorem 1.4.3),
    |∫_Ω f′(u)φ dx| ≤ ||f′(u)||_{Lq(Ω)} ||φ||_{Lp(Ω)} ≤ K ||u||_{Lp(Ω)}^{p/q} ||φ||_{Lp(Ω)} < ∞.

This proves the existence of the RHS of (13.4).


If t ∈ ℝ, we define g : [0, 1] −→ ℝ by setting
    g(θ) = f(u + θtφ).
Then g is continuously differentiable in ]0, 1[ and, for each fixed x,
    g(1) − g(0) = ∫_0^1 g′(θ) dθ = tφ(x) ∫_0^1 f′(u(x) + θtφ(x)) dθ,
so that
    [J(u + tφ) − J(u)]/t = ∫_Ω φ(x) ∫_0^1 f′(u(x) + θtφ(x)) dθ dx.
Now,
    ∫_Ω |φ(x) f′(u(x) + θtφ(x))| dx ≤ K ∫_Ω |φ(x)| |u(x) + θtφ(x)|^{p−1} dx
      ≤ K (∫_Ω |φ(x)|^p dx)^{1/p} (∫_Ω |u(x) + θtφ(x)|^{(p−1)q} dx)^{1/q} < ∞.
Hence φ(x) f′(u(x) + θtφ(x)) ∈ L1(Ω × [0, 1]),
and by Fubini's theorem (10.5.3)
    [J(u + tφ) − J(u)]/t = ∫_0^1 dθ ∫_Ω φ(x) f′(u(x) + θtφ(x)) dx.

The continuity of f′ implies that f′(u + θtφ) −→ f′(u) as t → 0 (and
hence as θt → 0) uniformly for θ ∈ ]0, 1[.
Moreover, for |t| ≤ 1, the condition (ii) and the triangle inequality yield
    |φ(x) f′(u(x) + θtφ(x))| ≤ K |φ(x)| (|u(x)| + |φ(x)|)^{p−1}        (13.5)


The RHS of (13.5) is integrable by Hölder's inequality [see 1.4.3]. Then,
by the dominated convergence theorem (10.2.20(b)), we conclude
    J′(u, φ) = ∫_Ω f′(u) φ dx.
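Formula (13.4) is easy to test numerically in a concrete case. The sketch below (Python with NumPy; the choices Ω = (0, 1), p = 2, f(t) = t², u(x) = x and φ(x) = sin πx are illustrative and satisfy the growth conditions (i) and (ii)) compares the difference quotient [J(u + tφ) − J(u)]/t with ∫_Ω f′(u)φ dx.

    import numpy as np

    x = np.linspace(0.0, 1.0, 20001)
    u, phi = x.copy(), np.sin(np.pi * x)      # u in L2(0,1) and a direction phi
    f = lambda t: t**2                        # |f(t)| <= |t|^2, |f'(t)| <= 2|t|, so p = 2
    J = lambda w: np.trapz(f(w), x)           # J(w) = int f(w(x)) dx
    analytic = np.trapz(2 * u * phi, x)       # formula (13.4) with f'(t) = 2t
    for t in (1e-2, 1e-3, 1e-4):
        print(t, (J(u + t * phi) - J(u)) / t, analytic)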

13.2.4 Definition
An operator P : U ⊂ E1 −→ E2 (U being an open set in E1) is said
to be twice differentiable in the sense of Gâteaux at a point u ∈ U
in the directions φ, ψ (φ, ψ ∈ E1, φ ≠ 0, ψ ≠ 0 given) if the operator
u −→ P′(u, φ) : U ⊂ E1 −→ E2 is once G-differentiable at u in the
direction ψ. The G-derivative of u −→ P′(u, φ) is called the second
G-derivative of P and is denoted by P″(u, φ, ψ) ∈ E2, i.e.,
    P″(u, φ, ψ) = lim_{t→0} [P′(u + tφ, ψ) − P′(u, ψ)]/t        (13.6)
13.2.5 Gradient
Let J : U ⊂ E1 −→ ℝ be a functional on an open set of a normed
linear space E1 which is once G-differentiable at a point u ∈ U. If the
functional φ −→ J′(u, φ) is continuous and linear on E1, then there exists a
(unique) element G(u) ∈ E1∗ (6.1), such that
    J′(u, φ) = G(u)(φ)   for all φ ∈ E1.
Similarly, if J is twice G-differentiable at a point u ∈ U, and if the form
(φ, ψ) −→ J″(u, φ, ψ) is a bilinear (bi-)continuous form on E1 × E1, then
there exists a (unique) element H(u) ∈ (E1 −→ E1∗) such that
    J″(u, φ, ψ) = H(u)(φ, ψ),   (φ, ψ) ∈ E1 × E1        (13.7)
13.2.6 Definitions: gradient, Hessian
Gradient: G(u) ∈ E1∗ is called the gradient of J at u
Hessian: H(u) ∈ (E1 −→ E1∗ ) is called the Hessian of J at u.
13.2.7 Mean value theorem
Let J be a functional as in 13.1. Let us assume that {u + tφ : t ∈ [0, 1]}
is contained in U. Let the function g : [0, 1] → ℝ be defined as
    t −→ g(t) = J(u + tφ).
Let J′(u + tφ, φ) exist. Then
    lim_{θ→0} [J(u + (t + θ)φ) − J(u + tφ)]/θ = lim_{θ→0} [g(t + θ) − g(t)]/θ = g′(t),
showing g is once differentiable in [0, 1].
Thus,
    g′(t) = J′(u + tφ, φ)        (13.8)
Similarly, if J″(u + tφ, φ, φ) exists and J is twice differentiable, then
    g″(t) = J″(u + tφ; φ, φ)        (13.9)

13.2.8 Lemma
Let J be as in 13.1. Let u ∈ U, φ ∈ E1 be given. If {u + tφ : t ∈ [0, 1]} ⊂ U
and J is once G-differentiable on this set in the direction φ, then there exists
a t0 ∈ ]0, 1[, such that
    J(u + φ) = J(u) + J′(u + t0φ, φ)        (13.10)
Proof: g : [0, 1] −→ ℝ is differentiable in ]0, 1[. The classical mean
value theorem yields:
    g(1) = g(0) + 1 · g′(t0),   t0 ∈ ]0, 1[        (13.11)
13.2.9 Lemma
Let U and J be as in lemma 13.2.8. If J is twice G-differentiable on the
set [u + tφ; t ∈ [0, 1]] in the directions φ, ψ, then there exists a t0 ∈]0, 1[
such that
    J(u + φ) = J(u) + J′(u, φ) + (1/2) J″(u + t0φ, φ, φ)        (13.12)
Proof: The classical Taylor's theorem applied to g on ]0, 1[ yields
    g(1) = g(0) + 1 · g′(0) + (1/2!) · 1² · g″(t0)        (13.13)
Using the definition of g(t), (13.8) and (13.9), (13.12) follows from
(13.13).
13.2.10 Lemma
Let E1 and E2 be two normed linear spaces, U an open subset of
E1 and let φ ∈ E1 be given. If the set {u + tφ : t ∈ [0, 1]} ⊂ U and
P : U ⊂ E1 −→ E2 is a mapping which is G-differentiable everywhere on
the set {u + tφ : t ∈ [0, 1]} in the direction φ, then, for any h ∈ E2∗, there
exists a th ∈ ]0, 1[, such that
    h(P(u + φ)) = h(P(u)) + h(P′(u + thφ, φ))        (13.14)
Proof: Let g : [0, 1] −→ ℝ be defined by
    t −→ g(t) = h(P(u + tφ))        (13.15)
where h : E2 −→ ℝ. Then g′(t) exists in ]0, 1[ and
    g′(t) = lim_{t′→0} [g(t + t′) − g(t)]/t′
          = lim_{t′→0} [h(P(u + (t + t′)φ)) − h(P(u + tφ))]/t′
          = h(P′(u + tφ, φ))   for t ∈ ]0, 1[,
since h is a continuous linear functional defined on E2.
Now, (13.14) follows immediately on applying the classical mean value
theorem to the function g.

13.2.11 Theorem
Let E1, E2, u, φ and U be as above. If P : U ⊂ E1 −→ E2 is
G-differentiable in the set {u + tφ : t ∈ [0, 1]} in the direction φ, then there
exists a t0 ∈ ]0, 1[, such that
    ||P(u + φ) − P(u)||_{E2} ≤ ||P′(u + t0φ, φ)||        (13.16)
Proof: Let v = P(u + φ) − P(u); then v ∈ E2. By th. 5.1.4, which is a
consequence of the Hahn-Banach theorem, we can find a functional h ∈ E2∗
such that
    ||h|| = 1,   h(v) = h(P(u + φ) − P(u)) = ||P(u + φ) − P(u)||.
Since P satisfies the assumptions of lemma 13.2.10, it follows that there
exists t0 = th ∈ ]0, 1[, such that
    ||P(u + φ) − P(u)|| = h(P(u + φ) − P(u)) = h(P′(u + t0φ, φ))
                        ≤ ||h|| ||P′(u + t0φ, φ)|| = ||P′(u + t0φ, φ)||.

13.2.12 Convexity and Gâteaux differentiability


Earlier we saw that a subset U of a vector space E is convex if
    u, v ∈ U =⇒ λu + (1 − λ)v ∈ U,   0 ≤ λ ≤ 1.
13.2.13 Definition
A functional J : U ⊂ E1 −→ 4 on a convex set U of a vector space E1
is said to be convex if

    J((1 − λ)u + λv) ≤ (1 − λ)J(u) + λJ(v)        (13.17)
for all u, v ∈ U and λ ∈ [0, 1].
J is said to be strictly convex if strict inequality holds in (13.17) for all
u, v ∈ U with u ≠ v and λ ∈ ]0, 1[.
13.2.14 Theorem
If a functional J : U ⊂ E1 −→ 4
on a convex set is G-differentiable
everywhere in U in all directions then

(i) J is convex if and only if,

J(v) ≥ J(u) + J  (u, v − u) for all u, v ∈ U (13.18)

(ii) J is strictly convex if and only if,

J(v) > J(u) + J  (u, v − u) for all u, v ∈ U (13.19)

with u ≠ v.

Proof: J is convex
    =⇒ J(v) − J(u) ≥ [J(u + λ(v − u)) − J(u)]/λ,   λ ∈ ]0, 1].
Since J′(u, v − u) exists, proceeding to the limit as λ → 0 on the RHS,
we get
    J(v) − J(u) ≥ J′(u, v − u), which is inequality (13.18).
For proving the converse, we note that (13.18) yields
    J(u) ≥ J(u + λ(v − u)) + J′(u + λ(v − u), u − (u + λ(v − u))),
or, J(u) ≥ J(u + λ(v − u)) − λ J′(u + λ(v − u), v − u)        (13.20)
by the homogeneity of the mapping φ −→ J′(u, φ).
Similarly we can write
    J(v) ≥ J(u + λ(v − u)) + J′(u + λ(v − u), v − (u + λ(v − u))),
or, J(v) ≥ J(u + λ(v − u)) + (1 − λ) J′(u + λ(v − u), v − u)        (13.21)
Multiplying (13.20) by (1 − λ) and (13.21) by λ and adding, we get
back (13.17) for any u, v ∈ U ⊂ E1.
If J is strictly convex, we can write
    J(v) − J(u) > λ⁻¹[J(u + λ(v − u)) − J(u)]        (13.22)
On the other hand, the mean value theorem yields
    J(u + λ(v − u)) = J(u) + J′(u, λ0(v − u)),   0 < λ0 < λ ∈ ]0, 1[,
or, J(u + λ(v − u)) − J(u) = λ0 J′(u, v − u)        (13.23)
Using (13.23), (13.22) reduces to (13.19).
The converse can be proved exactly in the same way as in the first part.
13.2.15 Weak lower semicontinuity
Definition: weak lower semicontinuity
Let E be a normed linear space.
A functional J : E −→ ℝ is said to be weakly lower semicontinuous
if, for every sequence vn ⇀ u weakly in E (6.3.9), we have
    lim inf_{n→∞} J(vn) ≥ J(u)        (13.24)

13.2.16 Theorem
4
If a functional J : E −→ is convex and admits of a gradient G(u) ∈ E ∗
at every u ∈ E, then J is weakly lower semicontinuous.
Proof: Let {vn} be a sequence in E such that vn ⇀ u weakly in E. Then
G(u)(vn − u) −→ 0 as n → ∞, since vn converges weakly to u. On
the other hand, since J is convex, we have by theorem 13.2.14,
    J(vn) ≥ J(u) + G(u)(vn − u).



On taking limits we have

lim inf J(vn ) ≥ J(u).


n→∞

13.2.17 Theorem
If a functional J : U ⊂ E −→ 4 on the open convex set of a normed
linear space, E is twice G-differentiable everywhere in U in all directions,
and if the form (φ, ψ) −→ J  (u, φ, ψ) is non-negative, i.e., if J  (u, φ, φ) ≥ 0
for all u ∈ U and φ ∈ E with φ = 0, then J is convex.
If the form (φ, ψ) −→ J  (u, φ, ψ) is positive, i.e., if J  (u, φ, φ) > 0 for
all u ∈ U and φ ∈ E with φ = 0, then J is strictly convex.
Proof: Since U is convex, the set [u + λ(v − u), λ ∈ [0, 1]] is contained in
U whenever u, v ∈ U . Then, by Taylor’s theorem [see Cea [11]], we have,
with φ = v − u,

1
J(v) = J(u) + J  (u, v − u) + J  (u + λ0 (v − u), v − u, v − u) (13.25)
2

for some λ0 ∈]0, 1[. Then the non-negativity of J  implies

J(v) ≥ J(u) + J  (u, v − u)

from which the convexity of J follows by 13.2.14. Similarly, the strict
convexity of J follows from the positivity of J″(u, φ, φ).
13.2.18 Theorem
If a functional J : E −→ 4 is twice G-differentiable everywhere in E in
all directions and satisfies

(a) J has a gradient G(u) ∈ E ∗ at all points u ∈ E,


(b) (φ, ψ) −→ J″(u, φ, ψ) is non-negative, i.e., J″(u; φ, φ) ≥ 0 for all
u ∈ E and φ ∈ E with φ ≠ θ;
then J is weakly lower semicontinuous.
Proof: By theorem 13.2.17, the condition (b) implies that J is convex.
Then, the conditions of theorem 13.2.16 being fulfilled, J is weakly lower
semicontinuous.

13.3 Fréchet Derivative


Let E1 and E2 be two normed linear space.

13.3.1 Definition: Fréchet derivative


A mapping P : U ⊂ E1 −→ E2 from an open set U in E1 to E2 is
said to be Fréchet differentiable, or simply F-differentiable, at a point
u ∈ U if there exists a bounded linear operator P  (u) : E1 −→ E2 , i.e.,
P  (u) ∈ (E1 −→ E2 ) such that

    lim_{φ→θ} ||P(u + φ) − P(u) − P′(u)φ|| / ||φ|| = 0        (13.26)

Clearly, P  (u), if it exists, is unique and is called the Fréchet


derivative of P at u.
13.3.2 Examples
1. Let f be a function defined on an open set U ⊂ ℝ² and f : U −→ ℝ. Then
f is F-differentiable if it is once differentiable in the usual sense.
Let u = (u1, u2)^T, φ = (φ1, φ2)^T.
Then,
    f(u + φ) − f(u) = f(u1 + φ1, u2 + φ2) − f(u1, u2)
                    = (∂f/∂u1) φ1 + (∂f/∂u2) φ2 + O(||φ||²),
where O(||φ||²) denotes terms of order ||φ||² and of higher orders.
Therefore,
    lim_{||φ||→0} ||f(u + φ) − f(u) − f′(u)φ|| / ||φ||
      = lim_{||φ||→0} |(∂f/∂u1) φ1 + (∂f/∂u2) φ2 − f′(u)φ| / ||φ|| = 0.
Hence, f′(u) = (grad f)^T = (∂f/∂u1, ∂f/∂u2).

2. Let (u, v) −→ a(u, v) be a symmetric bilinear form on a Hilbert space
H and v → L(v) a linear form on H. Let us take J : H −→ ℝ (sec 13.1) given by
    J(v) = (1/2) a(v, v) − L(v).
For φ ∈ H, φ ≠ θ,
    J(v + φ) − J(v) = (1/2) a(v + φ, v + φ) − L(v + φ) − (1/2) a(v, v) + L(v)
                    = (1/2) a(v, φ) + (1/2) a(φ, v) + (1/2) a(φ, φ) − L(φ)
                    = a(v, φ) − L(φ) + (1/2) a(φ, φ).
Hence
    lim_{||φ||→0} ||J(v + φ) − J(v) − J′(v)φ|| / ||φ||
      = lim_{||φ||→0} |a(v, φ) − L(φ) − J′(v)φ + (1/2) a(φ, φ)| / ||φ||        (13.27)
Let us suppose that
(i) a(·, ·) is bi-continuous: there exists a constant K > 0 such that
    |a(u, v)| ≤ K ||u|| ||v||   for all u, v ∈ H;
(ii) a(·, ·) is coercive (11.7.2), i.e.,
    a(v, v) ≥ α ||v||²_H   for all v ∈ H;
(iii) L is bounded, i.e., there exists a constant M, such that
    |L(v)| ≤ M ||v||   for all v ∈ H.
Using condition (i), it follows from (13.27) that
    lim_{||φ||→0} ||J(v + φ) − J(v) − J′(v)φ|| / ||φ||
      ≤ lim_{||φ||→0} [ |a(v, φ) − L(φ) − J′(v)φ| / ||φ|| + (K/2) ||φ|| ].
The limit will be zero if J′(v)φ = a(v, φ) − L(φ).
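In the finite-dimensional Hilbert space ℝⁿ the conclusion of Example 2 can be verified directly. The sketch below (Python with NumPy; the matrix A and vector b are random illustrative data) takes a(u, v) = uᵀAv with A symmetric positive definite and L(v) = bᵀv, and checks that the remainder ||J(v + φ) − J(v) − J′(v)φ|| / ||φ|| tends to 0 with J′(v)φ = a(v, φ) − L(φ).

    import numpy as np

    rng = np.random.default_rng(0)
    n = 5
    M = rng.standard_normal((n, n))
    A = M @ M.T + n * np.eye(n)          # a(u, v) = u @ A @ v, symmetric and coercive
    b = rng.standard_normal(n)           # L(v) = b @ v
    J = lambda v: 0.5 * v @ A @ v - b @ v
    v = rng.standard_normal(n)
    dJ = A @ v - b                       # candidate derivative: J'(v)phi = a(v, phi) - L(phi)
    for eps in (1e-1, 1e-2, 1e-3):
        phi = eps * rng.standard_normal(n)
        print(np.linalg.norm(phi), abs(J(v + phi) - J(v) - dJ @ phi) / np.linalg.norm(phi))
    # the remainder quotient shrinks like (K/2)*||phi||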
13.3.3 Remark
If an operator P : U ⊂ E1 −→ E2 , where E1 and E2 are normed linear
spaces, is F-differentiable then it is also G-differentiable and its G-derivative
coincides with the F-derivative.
Proof: If P has a F-derivative P  (u) at u ∈ U , then

||P (u + φ) − P (u) − P  (u)φ||


lim = 0.
||φ||→0 ||φ||

Since φ ≠ θ, we put φ = te, where t = ||φ|| and e is a unit vector in the
direction of φ.
The above limit yields
    lim_{t→0+} ||P(u + te) − P(u) − tP′(u)e|| / t = 0.
This shows that P is G-differentiable at u and that the F-derivative also
serves as the G-derivative, i.e., P′(u, φ) = P′(u)φ.

13.3.4 Remark
The converse is not always true.
Example one in 13.2.3 has a G-derivative at (0, 0), but is not F-
differentiable.

13.4 Equivalence of the Minimizing


Problem for Solving Variational
Inequality
13.4.1 Definition
A functional J : U ⊂ E −→ ℝ, where U is an open set in a normed
linear space, is said to have a local minimum at a point u ∈ U if there is
a neighbourhood Nu of u in E such that
    J(u) ≤ J(v)   for all v ∈ U ∩ Nu        (13.28)

13.4.2 Definition
A functional J on U is said to have a global minimum in U if there exists
a u ∈ U, such that
J(u) ≤ J(v) for all v ∈ U (13.29)
13.4.3 Theorem
Suppose E, U and J : U −→ 4 fulfil the following conditions:
1. E is a reflexive Banach space
2. U is weakly closed
3. U is weakly bounded and
4. J : U ⊂ E −→ 4 is weakly lower semicontinuous.
Then J has a global minimum in U .
Proof: Let m denote inf_{v∈U} J(v). If {vn} is a minimizing sequence for J, i.e.,
    m = inf_{v∈U} J(v) = lim_{n→∞} J(vn),
then, by the boundedness of U (from (3)), {vn} is a bounded sequence in
E, i.e., there exists a constant k > 0, such that ||vn|| < k for all n.
By the reflexivity of E, this bounded sequence is weakly compact [see
theorem 6.4.4]. Hence, {vn} contains a weakly convergent subsequence, i.e.,
a sequence {vnp}, such that vnp ⇀ u ∈ E as p → ∞. U being weakly closed,
u ∈ U. Finally, since vnp ⇀ u and J is weakly lower semicontinuous,
    J(u) ≤ lim inf_{p→∞} J(vnp),
which implies that
    J(u) ≤ lim_{p→∞} J(vnp) = m ≤ J(v)
for all v ∈ U.
13.4.4 Theorem
If E, U and J satisfy the conditions (1), (2), (4) and J satisfy the
condition (5)
lim J(v) = +∞,
v →∞

then J admits of a global minimum in U .


Proof: Let z ∈ U be arbitrarily fixed.
Let us consider the subset U ◦ of U as follows,

U ◦ = {v : v ∈ U and J(v) ≤ J(z)}.

Thus, the existence of a minimum in U ◦ ensures the existence of a


minimum in U .
We would show that U ◦ satisfies the conditions (2) and (3). If U0 is not
bounded we can find a sequence vn ∈ U 0 , such that ||vn || −→ +∞. Then
condition (5) yields J(vn ) −→ +∞ which is impossible since vn ∈ U ◦ =⇒
J(vn ) ≤ J(z). Hence U 0 is bounded. To, show that U ◦ is weakly closed,
w
let un ∈ U 0 be a sequence such that un −→ u in E. Since U is weakly
0
closed, u ∈ U . Since un ∈ U , J(un ) ≤ J(z) and since |J(un ) − J(u)| <
for n ≥ n0 ( ), it follows that J(u) ≤ J(z) showing that U ◦ is weakly closed.
w
On the other hand, since J is weakly lower semicontinuous un −→ u in
E implies that
J(u) ≤ lim inf J(un ) ≤ J(z)
proving that u ∈ U ◦ . Now, U ◦ and J satisfy all the conditions of theorem
13.4.3, hence J has a global minimum in U 0 and hence in U .
13.4.5 Theorem
Let J : E −→ 4 be a functional on E, U a subset of E satisfying the
following conditions:

1. E is a reflexive Banach space,


2. J has a gradient G(u) ∈ E ∗ everywhere in U ,
3. J is twice G-differentiable in all directions φ, ψ ∈ E and satisfies the
condition
    J″(u, φ, φ) ≥ ||φ|| E(||φ||)   for all φ ∈ E,
where E(t) is a function on {t ∈ ℝ : t ≥ 0} such that
    E(t) ≥ 0 and lim_{t→∞} E(t) = +∞.

4. U is a closed convex set


Then there exists at least one minimum u ∈ U of J. Furthermore, if
in condition (3),
5. E(t) > 0 for t > 0 is satisfied by E then there exists a unique minimum
of J in U

Proof: First of all, by condition (3), J″(u, φ, φ) ≥ 0 and hence, by Taylor's
formula (13.25),
    J(v) = J(u) + J′(u, v − u) + (1/2) J″(u + λ0(v − u), v − u, v − u),   0 < λ0 < 1,
we have
    J(v) ≥ J(u) + J′(u, v − u)        (13.30)
Application of theorem 13.2.17 asserts the convexity of J. Similarly,
condition (5) implies that J is strictly convex, by theorem 13.2.17. Then,
by conditions (2) and (3), and keeping in mind that
    J″(u + λ0(v − u), v − u, v − u) ≥ 0   for 0 < λ0 < 1,
we conclude from theorem 13.2.16 that J is weakly lower semicontinuous.


We next show that J(v) −→ +∞ as ||v|| −→ +∞.
For this, let z ∈ U be arbitrarily fixed. Then, because of conditions (2)
and (3), we can apply Taylor's formula (13.25) to get, for v ∈ E,
    J(v) = J(z) + G(z)(v − z) + (1/2) J″(z + λ0(v − z), v − z, v − z)
for some λ0 ∈ ]0, 1[.        (13.31)
Now,
    |G(z)(v − z)| ≤ ||G(z)|| ||v − z||.        (13.32)
Condition (3) yields
    J″(z + λ0(v − z), v − z, v − z) ≥ ||v − z|| E(||v − z||)        (13.33)
Using (13.32) and (13.33), (13.31) reduces to
    J(v) ≥ J(z) + ||v − z|| [ (1/2) E(||v − z||) − ||G(z)|| ].
Here, since z ∈ U is fixed, ||v − z|| −→ +∞ as ||v|| −→ +∞;
J(z) and ||G(z)|| are constants and E(||v − z||) −→ +∞ by condition (3).
Thus, J(v) −→ +∞ as ||v|| → ∞. The theorem thus follows by virtue
of theorem 13.4.4.

13.4.6 Theorem
Suppose U is a convex subset of a Banach space E and J : U ⊂ E −→ ℝ
is a G-differentiable (in all directions) convex functional.
Then, u ∈ U is a minimum for J (i.e., J(u) ≤ J(v) for all v ∈ U) if,
and only if, J′(u, v − u) ≥ 0 for all v ∈ U.
Proof: Let u ∈ U be a minimum for J. Then, since U is convex,
u, v ∈ U =⇒ u + ε(v − u) ∈ U for every ε ∈ [0, 1]. Hence
    J(u) ≤ J(u + ε(v − u)).
Therefore,
    lim_{ε→0+} [J(u + ε(v − u)) − J(u)]/ε ≥ 0,
i.e., J′(u, v − u) ≥ 0 for any v ∈ U.
Conversely, since J is convex and G-differentiable, by condition (i) of
theorem 13.2.14 we have
    J(v) ≥ J(u) + J′(u, v − u)   for any v ∈ U.
Now, using the assumption that J′(u, v − u) ≥ 0, we get
    J(v) ≥ J(u)   for all v ∈ U.

In what follows we refer to the problem posed in (13.1).


13.4.7 Problem (PI) and problem (PII) and their equivalence
Let K be a closed convex set of normed linear space E.
Problem (PI): To find u ∈ K such that

J(u) ≤ J(v) for all v ∈ K (13.33)


    J(v) = (1/2) a(v, v) − L(v)
implies
(i) J′(v, φ) = a(v, φ) − L(φ), and
(ii) J″(v; φ, φ) = a(φ, φ).
The coercivity of a(·, ·) implies that
    J″(v; φ, φ) = a(φ, φ) ≥ α ||φ||².

If we choose E(t) = αt, then all the assumptions of theorem 13.4.5 are
fulfilled by E, J and K so that problem (PI) has a unique solution.
Also, by theorem 13.4.6 the problem (PI) is equivalent to
(PII): To find

u ∈ K; a(u, v − u) ≥ L(v − u) for all v ∈ K (13.34)

We thus obtain the following theorem:



13.4.8 Theorem
(1) There exists a unique solution u ∈ K of the problem (PI).
(2) Problem (PI) is equivalent to problem (PII). The problem (PII)
is called a variational inequality associated to the closed, convex set and
the bilinear form a(·, ·).
Theorem 13.4.8 was generalized by G. Stampacchia (Cea [11]) to the
non-symmetric case. This generalizes and uses the classical Lax-Milgram
theorem [see Reddy [45]]. We state without proof the theorem due to
Stampacchia.
13.4.9 Theorem (Stampacchia)
Let K be a closed convex subset of a Hilbert space H and a(·, ·) be a
bilinear, bi-continuous, coercive form (sec 11.7.2) on H. Then, for any given
L ∈ H∗, the variational inequality (13.34) has a unique solution u ∈ K.
For proof see Cea [11].
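For a finite-dimensional illustration of the variational inequality (13.34), one can use the projection (or projected-gradient) iteration u ← P_K(u − ρ(Au − b)), a standard fixed-point scheme for such problems and not a method taken from the present text. The sketch below (Python with NumPy; A, b and the cone K = {v ≥ 0} are illustrative choices) runs the iteration and spot-checks the inequality a(u, v − u) ≥ L(v − u).

    import numpy as np

    rng = np.random.default_rng(1)
    n = 6
    M = rng.standard_normal((n, n))
    A = M @ M.T + n * np.eye(n)             # a(u, v) = v @ A @ u, symmetric and coercive
    b = rng.standard_normal(n)              # L(v) = b @ v
    proj = lambda v: np.maximum(v, 0.0)     # projection onto K = {v : v >= 0}
    rho = 1.0 / np.linalg.norm(A, 2)        # step small enough for a contraction
    u = np.zeros(n)
    for _ in range(5000):
        u = proj(u - rho * (A @ u - b))
    print("fixed-point residual:", np.linalg.norm(u - proj(u - rho * (A @ u - b))))
    v = np.abs(rng.standard_normal(n))      # an arbitrary point of K
    print("a(u, v-u) - L(v-u) =", (v - u) @ A @ u - b @ (v - u))   # should be >= 0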

13.5 Distributions
13.5.1 Definition: Support
The support of a function f(x), x ∈ Ω ⊂ ℝⁿ, is defined as the closure
of the set of points in ℝⁿ at which f is non-zero.
13.5.2 Definition: smooth function
A function φ : ℝⁿ −→ ℝ is said to be smooth or infinitely differentiable
if its derivatives of all orders exist and are continuous.
13.5.3 Definition: C0∞ (Ω)
The set of all smooth functions with compact support in Ω ⊂ ℝⁿ is
denoted by C0∞(Ω).
13.5.4 Definition: test function
A test function φ is a smooth function with compact support,
φ ∈ C0∞ (Ω).
13.5.5 Definition: generalized derivative
A function u ∈ L₂(Ω) is said to have the αth generalized derivative
D^α u, 1 ≤ |α| ≤ m, if the following relation (generalized Green's
formula) holds:
    ∫_Ω (D^α u) φ dx = (−1)^{|α|} ∫_Ω u D^α φ dx   for every φ ∈ C0∞(Ω)        (13.35)

For u ∈ C0∞ (Ω), the generalized derivatives are derivatives in the


ordinary (classical) sense.

13.5.6 Definition: distribution


A set of test functions {φn } is said to converge to a test function φ0
in C0∞ (Ω) if there is a bounded set Ω0 ⊂ Ω containing the supports of
φ0 , φ1 , φ2 , . . . and if φn and all its generalized derivatives converge to φ0
and its derivatives respectively. A functional f on C0∞ (Ω) is continuous if
it maps every convergent sequence in C0∞ (Ω) into a convergent sequence in
4 , i.e., if f (φn ) −→ f (φ0 ) whenever φn −→ φ in C0∞ (Ω).
A continuous linear functional on C0∞ (Ω) is called a distribution or
generalized function.
Example. An example of a distribution is provided by the delta
distribution, defined by
    ∫_{−∞}^{∞} δ(x)φ(x) dx = φ(0)   for all φ ∈ C0∞(Ω)        (13.36)

Addition and scalar multiplication of distributions:
If f and g are distributions and α, β are scalars, then αf + βg is the
distribution defined by (αf + βg)(φ) = αf(φ) + βg(φ) for every test function φ.

13.6 Sobolev Space


C∞(Ω) is an inner product space with respect to the L2(Ω)-inner product.
But it is not complete with respect to the norm generated by the inner
product
    ⟨u, v⟩_p = Σ_{|α|≤p} ∫_Ω D^α u D^α v dm        (13.37)
where u and v, along with their derivatives up to order p, are square integrable
in the Lebesgue sense [see chapter 10]:
    ∫_Ω |D^α u|² dm < ∞   for all |α| ≤ p.

The space C ∞ (Ω) can be completed by adding the limit points of all
Cauchy sequences in C ∞ (Ω). It turns out that the distributions are those
limits points.
We can thus introduce the Sobolev space H¹(Ω) as follows:
    H¹(Ω) = { v : v ∈ L2(Ω), ∂v/∂xj ∈ L2(Ω), j = 1, 2, . . . , n },        (13.38)
where the derivatives Djv = ∂v/∂xj are taken in the sense of distributions,
i.e.,
    ∫_Ω (Djv) φ dx = − ∫_Ω v Djφ dx   for all φ ∈ D(Ω),        (13.39)
where D(Ω) denotes the space of all C∞-functions with compact support
on Ω. H¹(Ω) is provided with the inner product
    ⟨u, v⟩ = ⟨u, v⟩_{L2(Ω)} + Σ_{j=1}^{n} ⟨Dju, Djv⟩_{L2(Ω)}        (13.40)
          = ∫_Ω { uv + Σ_{j=1}^{n} (Dju)(Djv) } dx        (13.41)
for which it becomes a Hilbert space.


13.6.1 Remark
D(Ω) ⊂ C 1 (Ω) ⊂ H 1 (Ω).
We introduce the space

H01 (Ω) = the closure of D(Ω) in H 1 (Ω), (13.42)

We state without proof some well-known theorems.


13.6.2 Theorem of density
If Γ, the boundary of Ω, is regular (for instance, Γ is a C¹ (or C∞) manifold
of dimension n − 1), then C¹(Ω̄) (respectively C∞(Ω̄)) is dense in H¹(Ω).
13.6.3 Theorem of trace
If Γ is regular then the linear mapping v → v|Γ of C¹(Ω̄) −→ C¹(Γ)
(respectively of C∞(Ω̄) −→ C∞(Γ)) extends to a continuous linear map of
H¹(Ω) into L²(Γ), denoted by γ, and for any v ∈ H¹(Ω), γv is called the
trace of v on Γ.
Moreover, H01(Ω) = {v : v ∈ H¹(Ω), γv = 0}.
13.6.4 Green’s formula for Sobolev spaces
Let Ω be a bounded open set with sufficiently regular boundary Γ; then
there exists a unique outer normal vector n(x). We define the operator of
exterior normal derivation formally by
    ∂/∂n = Σ_{j=1}^{n} nj(x) Dj        (13.43)
Now, if u, v ∈ C¹(Ω̄) then, by the classical Green's formula [see Mikhlin
[36]], we have
    ∫_Ω (Dju) v dx = − ∫_Ω u (Djv) dx + ∫_Γ u v nj dσ,
where dσ is the area element on Γ.



This formula remains valid also if u, v ∈ H 1 (Ω) in view of the trace


theorem and density theorem .
Next, if u, v ∈ C²(Ω̄), then applying the above formula to Dju and v and
summing over j = 1, 2, . . . , n, we get
    Σ_{j=1}^{n} ⟨Dju, Djv⟩_{L2(Ω)} = − Σ_{j=1}^{n} ∫_Ω ((Dj)²u) v dx + ∫_Γ (∂u/∂n) v dσ,        (13.44)
i.e.,
    Σ_{j=1}^{n} ⟨Dju, Djv⟩ = − ∫_Ω (Δu) v dx + ∫_Γ (∂u/∂n) v dσ        (13.45)

13.6.5 Remark
(i) u ∈ H²(Ω) =⇒ Δu ∈ L²(Ω).
(ii) Since Dju ∈ H¹(Ω), by the trace theorem (13.6.3) γ(Dju) exists and
belongs to L²(Γ), so that
    ∂u/∂n = Σ_{j=1}^{n} nj γ(Dju) ∈ L²(Γ).
Hence, by using the density and trace theorems (13.6.2 and 13.6.3),
the formula (13.45) remains valid for u ∈ H²(Ω) and v ∈ H¹(Ω).

13.6.6 Weak (or variational formulation of BVPs)


Example 1. Let Γ = Γ1 ∪ Γ2 , where Γj are open subsets of Γ such that
Γ1 ∩ Γ2 = Φ
Consider the space

E = {v : v ∈ H 1 (Ω); γv = 0 on Γ1 } (13.46)

E is clearly a closed subspace of H 1 (Ω) and is provided with the inner


product induced from that in H 1 (Ω) and hence it is a Hilbert space.
Moreover,
H01 (Ω) ⊂ E ⊂ H 1 (Ω) (13.47)
and the inclusions are continuous and linear. If f ∈ L²(Ω) we consider the
functional
    J(v) = (1/2)⟨v, v⟩ − ⟨f, v⟩_{L2(Ω)},        (13.48)
i.e., a(u, v) = ⟨u, v⟩ (the H¹(Ω)-inner product) and L(v) = ⟨f, v⟩_{L2(Ω)}.
Then a(·, ·) is bilinear, bicontinuous and coercive:
    |a(u, v)| ≤ ||u||_E ||v||_E = ||u||_{H¹(Ω)} ||v||_{H¹(Ω)}   for u, v ∈ E,
    a(v, v) = ||v||²_{H¹(Ω)}   for v ∈ E,
    |L(v)| ≤ ||f||_{L2(Ω)} ||v||_{L2(Ω)} ≤ ||f||_{L2(Ω)} ||v||_{H¹(Ω)}   for v ∈ E.
Then the problems (PI) and (PII) respectively become
(PIII) To find u ∈ E such that J(u) ≤ J(v) for all v ∈ E        (13.49)
(PIV) To find u ∈ E such that ⟨u, v⟩ = ⟨f, v⟩_{L2(Ω)} for all v ∈ E        (13.50)

Theorem 13.4.8 asserts that these two equivalent problems have unique
solutions.
The problem (PIV) is the Weak (or Variational) formulation of the

(i) Dirichlet problem (if Γ2 = Φ)


(ii) Neumann problem (if Γ1 = Φ)
(iii) Mixed boundary problem in the general case.

13.6.7 Equivalence of problem (PIV) to the corresponding


classical problems
Suppose u ∈ C²(Ω̄) ∩ E and v ∈ C¹(Ω̄) ∩ E.
Using Green's formula (13.45),
    a(u, v) = ⟨u, v⟩ = ∫_Ω (−Δu + u) v dx + ∫_Γ (∂u/∂n) v dσ = ∫_Ω f v dx,
i.e.,
    ∫_Ω (−Δu + u − f) v dx + ∫_Γ (∂u/∂n) v dσ = 0        (13.51)
We note that this formula remains valid if u ∈ H²(Ω) ∩ E, for any v ∈ E.
If we choose v ∈ D(Ω) ⊂ E, then the boundary integral vanishes, so that
we get
    ∫_Ω (−Δu + u − f) v dx = 0   ∀ v ∈ D(Ω).

Since D(Ω) is dense in L2 (Ω), this implies that (if u ∈ H 2 (Ω)) u is a


solution of the differential equation,

−Δu + u − f = 0 in Ω (in the sense of L2 (Ω)).

More generally, without the strong regularity assumption as above, u is


a solution of the differential equation.

−Δu + u − f = 0 in the sense of distribution in Ω (13.52)

Next we choose v ∈ E arbitrary. Since u satisfies (13.52) in Ω, we find
from (13.51) that
    ∫_Γ (∂u/∂n) v dσ = 0   ∀ v ∈ E,
which means that ∂u/∂n = 0 on Γ2 in some generalised sense. In fact, by the
trace theorem, γv ∈ H^{1/2}(Γ) and hence ∂u/∂n = 0 in H^{−1/2}(Γ) [see Lions
and Magenes [34]]. Thus, if the problem (PIV) has a regular solution, then it
is the solution of the classical problem
    −Δu + u − f = 0   in Ω,
    u = 0             on Γ1,
    ∂u/∂n = 0         on Γ2.        (13.53)

13.6.8 Remark
The variational formulation (PIV) is very much used in the Finite
Elements Method.
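As a one-dimensional illustration of how the weak formulation (PIV) is used in practice, the sketch below (Python with NumPy; a hypothetical analogue with Ω = (0, 1), Γ1 = {0}, Γ2 = {1} and f = 1, not an example taken from the text) assembles piecewise-linear finite elements for a(u, v) = ∫(u′v′ + uv) dx = ∫ f v dx with the essential condition u(0) = 0, and compares the result with the exact solution of −u″ + u = 1, u(0) = 0, u′(1) = 0.

    import numpy as np

    n_el = 32
    nodes = np.linspace(0.0, 1.0, n_el + 1)
    h = nodes[1] - nodes[0]
    K = np.zeros((n_el + 1, n_el + 1))                          # matrix of a(.,.)
    F = np.zeros(n_el + 1)                                      # load vector, f = 1
    k_loc = (1.0 / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])    # element "stiffness"
    m_loc = (h / 6.0) * np.array([[2.0, 1.0], [1.0, 2.0]])      # element "mass"
    for e in range(n_el):
        idx = [e, e + 1]
        K[np.ix_(idx, idx)] += k_loc + m_loc
        F[idx] += h / 2.0
    u = np.zeros(n_el + 1)
    u[1:] = np.linalg.solve(K[1:, 1:], F[1:])                   # impose u(0) = 0
    exact = 1.0 - np.cosh(nodes - 1.0) / np.cosh(1.0)           # -u'' + u = 1, u(0)=0, u'(1)=0
    print("max nodal error:", np.abs(u - exact).max())

Note that the natural condition u′(1) = 0 (the analogue of ∂u/∂n = 0 on Γ2) is not imposed anywhere; it is built into the weak formulation, exactly as in 13.6.7.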
CHAPTER 14

THE WAVELET
ANALYSIS

14.1 An Introduction to Wavelet Analysis

The concept of wavelet was first introduced around 1980. It came out
as a synthesis of ideas borrowed from disciplines including mathematics
(Calderón-Zygmund operators and Littlewood-Paley theory), physics
(the coherent states formalism in quantum mechanics and the renormalization
group) and engineering (quadrature mirror filters, subband coding in signal
processing and pyramidal algorithms in image processing) (Debnath [17]).
Wavelet analysis provides a systematic new way to represent and analyze
multiscale structures. The special feature of Wavelet analysis is to
generalize and expand the representations of functions by orthogonal
basis to infinite domains. For this purpose, compactly supported
[see 13.5] basis functions are used and this linear combination represents
the function. These are the kinds of functions that are realized by physical
devices.
There are many areas in which wavelets play an important role, for
example

(i) Efficient algorithms for representing functions in terms of a wavelet


basis,

(ii) Compression algorithms based on the wavelet


expansion representation that concentrate most of the energy of a
signal in a few coefficients (Resnikoff and Wells [46]).


14.2 The Scalable Structure of Information


14.2.1 Good approximations
Every measurement, be it with the naked eye or with a sophisticated
instrument, is at best approximate. Even a computer can store only a
finite number of decimal places, that is, a rational number. Heisenberg's
uncertainty principle corroborates this type of limitation.
The onus on the technologist is thus to make a measurement or a
representation as accurate as possible. In the case of speech transmission,
codes are used to transmit and at the destination the codes are decoded.
Any transmitted signal is sampled at a number of uniformly spaced times.
The sample measurements can be used to construct a Fourier series
expansion of the signal. It will also interpolate values for unmeasured
instants. But the drawback of the Fourier series is that it cannot take
care of local phenomenon, for example, abrupt transitions. To overcome
this difficulty, compactly supported [see 13.5] wavelets are used. A simple
example is to consider a time series that describes a quantity that is zero
for a long time, ramps up linearly to a maximum value and falls instantly
to zero where it remains thereafter (fig. 14.1).

Fig. 14.1 Continuous ramp transient

Here, wavelet series approximate abrupt transitions much more


accurately than Fourier series [see Resnikoff and Wells [46]]. Wavelet series
expansion is less expansive too.
14.2.2 Special Features of wavelet series
That wavelet analysis provides good approximation for transient or
localized phenomenon is due to following:
(a) Compact support
(b) Orthogonality of the basis functions
(c) Multiresolution representation
14.2.3 Compact support
Each term in a wavelet series has a compact support [13.5]. As a
result, however short an interval is, there is a basis function whose support
is contained within that interval. Hence, compactly supported wavelet basis
function can capture local phenomenon and is not bothered by properties
of the data far away from the area of interest.

14.2.4 Orthogonality
The terms in a wavelet series are orthogonal to one another, just like
the terms in a Fourier series. This means that the information carried by
one term is independent of the information carried by any other.
14.2.5 Multiresolution representation
Multiresolution representation describes what is called a hierarchical
structure. Hierarchical structures classify information into several
categories called levels or scales so that, the higher in the hierarchy a level
is, the fewer members it has. This hierarchy is prevalent
in social and political organization. Biological sensory
systems, such as vision, also have this hierarchy built in. The human
vision system provides wide aperture detection (so events can be detected
early) and high-resolution detection (so that the detailed structure of the
visual event can be seen). Thus, a multiresolution or scalable mathematical
representation provides a simpler or more efficient representation than the
usual mathematical representation.
14.2.6 Functions and their representations
Representation of continuous functions
Suppose we consider a function of a real variable, namely,
    f(x) = cos x,   x ∈ ℝ,
where ℝ denotes the continuum of real numbers. For each x ∈ ℝ, we
have a definite value for cos x. Since cos x is periodic, all of its values are
determined by its values on [0, 2π[.
can represent the value of the function cos x at any point x ∈ [0, 2π[. There
is an uncountable number of points in [0, 2π[ and an uncountable number
of values of cos x, as x varies. If we represent
    cos x = Σ_{n=0}^{∞} (−1)ⁿ x^{2n} / (2n)!        (14.1)
for any x ∈ ℝ, we see that the sequence of numbers
    1, 0, −1/2!, 0, 1/4!, 0, −1/6!, 0, . . . ,        (14.2)
which is countable, along with the sequence of power functions
    x⁰ = 1, x, x², x³, . . . , xⁿ, . . . ,
is sufficient information to determine cos x at any point x. Thus we
represent the uncountable number of values of cos x in terms of the
countable discrete sequence (14.2), which are the coefficient of a power
series representation for cos x. This is the basic technique in representing a
function or a class of functions in terms of more elementary or more easily
computed functions.
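The sketch below (plain Python, using only the standard library; x = 1.2 and eight terms are illustrative choices) computes the partial sums of the series (14.1) and shows how the countable coefficient sequence (14.2) reproduces cos x.

    import math

    x = 1.2
    terms = [(-1)**n * x**(2 * n) / math.factorial(2 * n) for n in range(8)]
    partial_sums = [sum(terms[:k + 1]) for k in range(len(terms))]
    print(partial_sums)        # partial sums of (14.1), converging rapidly
    print(math.cos(x))         # the value being represented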

14.2.7 Fourier series and the Fourier transform


The concept of a Fourier series was introduced in 3.7.8.
14.2.8 Definition: discrete Fourier transform
We note that if f is an integrable periodic function of period 1, then
the Fourier series of f is given by
    f(x) = Σ_{n∈ℤ} cn e^{2πinx}        (14.3)
with Fourier coefficients {cn} [see 3.7.8] given by
    cn = ∫_0^1 f(x) e^{−2πinx} dx.        (14.4)
Suppose that {cn} is a given discrete sequence of complex numbers in l²(ℤ),
that is, Σ |cn|² < ∞; then we define the Fourier transform of the sequence
f = {cn} to be the Fourier series [see (14.3)]
    fˆ(ξ) = Σ_n cn e^{2πinξ},
which is a periodic function.


14.2.9 Inverse Fourier transform of a discrete function
The inverse of the Fourier transform is the mapping from a periodic
function (14.3) to its Fourier coefficients (14.4).
14.2.10 Continuous Fourier transform
If f ∈ L²(ℝ), then the Fourier transform of f is given by
    fˆ(ξ) = ∫_{−∞}^{∞} f(x) e^{2πiξx} dx        (14.5)
with the inverse Fourier transform given by
    f(x) = ∫_{−∞}^{∞} fˆ(ξ) e^{−2πixξ} dξ        (14.6)

where both formulas have to be taken in a suitable limiting sense, but


for nicely behaved functions that decrease sufficiently rapidly, the formulas
hold (Resnikoff and Wells [46]).

14.3 Algebra and Geometry of Wavelet


Matrices
A wavelet matrix is a generalisation of unitary matrices [see 9.4.1] to a
larger class of rectangular matrices. Each wavelet matrix contains the basic
information to define an associated wavelet system. Let F be a subfield of
the field ℂ of complex numbers. F could be the rational numbers ℚ, the
real numbers ℝ or the field ℂ itself.
Consider an array A = (a^s_r), consisting of m rows of possibly infinite
vectors, of the form
    ( ···  a^0_{−1}      a^0_0      a^0_1      a^0_2      ···
      ···  a^1_{−1}      a^1_0      a^1_1      a^1_2      ···
      ···  ···           ···        ···        ···        ···
      ···  a^{m−1}_{−1}  a^{m−1}_0  a^{m−1}_1  a^{m−1}_2  ··· )        (14.7)
In the above, a^s_r is an element of F ⊆ ℂ and m ≥ 2. We call such an array
A a matrix even though the number of columns may not be finite.
A as a matrix even though the number of columns (rows) may not be finite.
Define submatrices Ap of A of size m × m in the following manner,
    Ap = (a^s_{pm+q}),   q = 0, 1, . . . , m − 1,   s = 0, 1, . . . , m − 1,        (14.8)
for p an integer. Thus, A can be expressed in terms of submatrices in the
form
    A = ( · · · , A−1, A0, A1, · · · ),        (14.9)
where
    Ap = ( a^0_{pm}        a^0_{pm+1}        ···   a^0_{pm+m−1}
           ⋮
           a^{m−1}_{pm}    a^{m−1}_{pm+1}    ···   a^{m−1}_{pm+m−1} ).
From the matrix, a power series of the following form is constructed:
    A(z) = Σ_{p=−∞}^{∞} Ap z^p.        (14.10)
We call the above series the Laurent series of the matrix A.
Thus, A(z) is a Laurent series with matrix coefficients. We can write
A(z) as an m × m matrix with Laurent series entries:
    A(z) = ( Σ_j a^0_{mj} z^j        Σ_j a^0_{mj+1} z^j        ···   Σ_j a^0_{mj+m−1} z^j
             ⋮
             Σ_j a^{m−1}_{mj} z^j    Σ_j a^{m−1}_{mj+1} z^j    ···   Σ_j a^{m−1}_{mj+m−1} z^j )        (14.11)
Both (14.10) and (14.11) will be referred to as the Laurent series
representation A(z) of the matrix A.

14.3.1 Definition: genus of the Laurent series A(z)


Suppose A(z) has a finite number of non-zero coefficient matrices, i.e.,
    A(z) = Σ_{p=n1}^{n2} Ap z^p,        (14.12)
where we assume that A_{n1} and A_{n2} are both non-zero matrices. Then
    g = n2 − n1 + 1,        (14.13)
i.e., the number of terms in the series (14.12), is called the genus of the
Laurent series A(z) and of the matrix A.
14.3.2 Definition: adjoint Ã(z) of the Laurent series A(z)
Let
    Ã(z) = A∗(z^{−1}) = Σ_p A∗_p z^{−p}.        (14.14)
Ã(z) is called the adjoint of the Laurent matrix A(z).
In the above, A∗_p = (Āp)^T is the hermitian conjugate of the m × m matrix
Ap.
Ap .
14.3.3 Definition: the wavelet matrix
The matrix A, as defined in (14.7), is said to be a wavelet matrix of
rank m if
    (1) A(z) Ã(z) = m I,        (14.15)
    (2) Σ_{j=−∞}^{∞} a^s_j = m δ^{s,0},   0 ≤ s ≤ m − 1,        (14.16)
where δ^{s,0} = 1 for s = 0 and is zero otherwise.
14.3.4 Lemma: a wavelet matrix with m rows has rank m
Let A be a wavelet matrix with m rows and an infinite number of
columns. Let
    A(1) = ( a⁰ ; a¹ ; ⋯ ; a^{m−1} ),
where a^s stands for the rows of A(1).
If the second and third rows, say, are multiples of each other, then we can
write a² = λa¹, where λ is a scalar. In that case two rows of A(1) are
multiples of each other, and therefore the determinant of A(1) would be
zero. This contradicts (14.15), since putting z = 1 in (14.15) gives
A(1) Ã(1) = m I, so that A(1) is invertible.

14.3.5 Definition: wavelet space WM(m, g; F)
The wavelet space WM(m, g; F) denotes the set of all wavelet
matrices of rank m and genus g with coefficients in the field F.
Quadratic orthogonality relations for the rows of A
Comparison of coefficients of corresponding powers of z in (14.15) yields:
    Σ_j a^s_{j+mp} ā^{s′}_{j+mp′} = m δ^{s s′} δ_{p p′}        (14.17)
We will refer to (14.15) and (14.16), or equivalently (14.17) and
(14.16), as the quadratic and linear conditions defining a wavelet matrix,
respectively.
Scaling vector: The vector a⁰ is called the scaling vector.
Wavelet vector: a^s for 0 < s < m is called a wavelet vector.
14.3.6 Remark
1. The quadratic condition asserts that the rows of a wavelet matrix
have length equal to √m and that distinct rows, as well as a row and any
of its shifts by a non-zero multiple of m, are pairwise orthogonal.
2. The linear condition (14.16) implies that the sum of the components
of the scaling vector is equal to the rank of A.
3. The sum of the components of each of the wavelet vector is zero.

14.3.7 Examples
1. Haar matrix of rank 2
Let A1 = ( 1  1 ; 1  −1 ). Here A1 A1^T = 2I.
Sum of the elements of the first row = 2 = rank of A1.
Sum of the elements of the second row = 0.
Hence, A1 is a wavelet matrix of rank 2.
Similarly, for A2 = ( 1  1 ; −1  1 ), we have A2 A2^T = 2I and A2 fulfils
conditions (14.15) and (14.16).
Hence, A2 is a wavelet matrix of rank 2 too.
The general complex Haar wavelet matrix of rank 2 has the form
    ( 1        1
      −e^{iθ}  e^{iθ} ),   θ ∈ ℝ.
2. Daubechies' wavelet matrix of rank 2 and genus 2
Let
    D2 = (1/4) ( 1+√3    3+√3    3−√3    1−√3
                 −1+√3   3−√3    −3−√3   1+√3 )        (14.18)
Then D2 D2^T = 2I.
Sum of the elements of the first row = 2 = rank of D2.
Sum of the elements of the wavelet vector = 0.
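The quadratic condition (14.15) (equivalently (14.17)) and the linear condition (14.16) are easy to test numerically for the two examples above. The sketch below (Python with NumPy; the helper name is_wavelet_matrix is introduced only for this illustration) checks both conditions for the Haar matrix A1 and for D2.

    import numpy as np

    m = 2
    A1 = np.array([[1.0, 1.0], [1.0, -1.0]])                    # Haar, genus 1
    s3 = np.sqrt(3.0)
    D2 = 0.25 * np.array([[1 + s3, 3 + s3, 3 - s3, 1 - s3],
                          [-1 + s3, 3 - s3, -3 - s3, 1 + s3]])  # Daubechies, genus 2

    def is_wavelet_matrix(A, m):
        g = A.shape[1] // m
        for p in range(-(g - 1), g):                 # quadratic condition (14.17)
            shifted = np.zeros_like(A)
            if p >= 0:
                shifted[:, p * m:] = A[:, :A.shape[1] - p * m]
            else:
                shifted[:, :A.shape[1] + p * m] = A[:, -p * m:]
            target = m * np.eye(m) if p == 0 else np.zeros((m, m))
            if not np.allclose(A @ shifted.T, target):
                return False
        return np.allclose(A.sum(axis=1), [m] + [0] * (m - 1))   # linear condition (14.16)

    print(is_wavelet_matrix(A1, m), is_wavelet_matrix(D2, m))    # True  True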
14.3.8 Definition: Haar wavelet matrices
A Haar wavelet matrix of rank m is denoted by H(m; F) and is defined
by
    H(m; F) = WM(m, 1; F)        (14.19)
Thus, a Haar wavelet matrix is a wavelet matrix of genus 1.


14.3.9 The Canonical Haar matrix
In what follows, we provide a characterization of the Haar wavelet
matrix.
14.3.10 Theorem
An m × m complex matrix H is a Haar wavelet matrix if and only if
    H = ( 1  0 ; 0  U ) H_c        (14.20)
where U ∈ U(m − 1) is a unitary matrix and H_c is the canonical Haar matrix
of rank m, defined row by row as follows: the 0-th row of H_c is (1, 1, . . . , 1),
and for j = 1, 2, . . . , m − 1 the j-th row, with s = m − j, is
    ( 0, . . . , 0, −s √(m/(s² + s)), √(m/(s² + s)), . . . , √(m/(s² + s)) ),        (14.21)
having m − s − 1 leading zeros followed by s + 1 non-zero entries. In
particular, the last row of H_c is (0, . . . , 0, −√(m/2), √(m/2)).
In the above, U(m − 1) is the group of (m − 1) × (m − 1) complex
matrices U such that U∗U = I.
Before we prove the theorem, we prove the following lemma.
14.3.11 Lemma
If H = (h^s_r) is a Haar wavelet matrix, then
    hr := h^0_r = 1   for 0 ≤ r < m.        (14.22)
Proof: From (14.16) and (14.15) we have
    Σ_{j=0}^{m−1} hj h̄j = m   and   Σ_{j=0}^{m−1} hj = m.
It follows that Σ_{j=0}^{m−1} h̄j = m.
Now,
    Σ_{j=0}^{m−1} |hj − 1|² = Σ_{j=0}^{m−1} (hj h̄j − hj − h̄j + 1)
                            = Σ_j hj h̄j − Σ_j hj − Σ_j h̄j + Σ_j 1
                            = m − m − m + m = 0,
which implies that hj = 1 for j = 0, . . . , m − 1.


Proof of theorem 14.3.10
We have seen that the elements of the first row of a Haar matrix are all
equal to 1.
For the remaining m − 1 rows, we proceed as follows. ( 1  0 ; 0  U ) H is a
Haar matrix whenever H is a Haar matrix and U ∈ U(m − 1). Hence the
operation of U can be employed to develop a canonical form for H. The
first step rotates the last row of H so that its first m − 2 entries are zero.
Since the rows of a Haar matrix are pairwise orthogonal and of length equal
to √m, the orthogonality of the first and last rows implies that the last row
can be normalized to have the form
    ( 0, 0, 0, . . . , −√(m/2), √(m/2) ).
Using the same argument for the preceding rows, the result can be
obtained.
14.3.12 Remarks
I. If H1, H2 ∈ H(m; ℂ) are two Haar matrices, then there exists a
unitary matrix U ∈ U(m − 1), such that
    H1 = ( 1  0 ; 0  U ) H2.
II. If A is a real wavelet matrix, that is, if a^s_j ∈ ℝ, then A is a Haar
matrix if and only if
    A = ( 1  0 ; 0  O ) H_c,        (14.23)
where O ∈ O(m − 1) is an orthogonal matrix and H_c is the canonical
Haar matrix of rank m.

14.4 One-Dimensional Wavelet Systems


In this section, we introduce the basic scaling and wavelet functions of
wavelet analysis. The principal result is that for any wavelet matrix A ∈
WM(m, g; ℂ), there is a scaling function φ(x) and (m − 1) wavelet functions
ψ¹(x), . . . , ψ^{m−1}(x) which satisfy specific scaling relations defined in terms
of the wavelet matrix A. These functions are all compactly supported and
square-integrable.
14.4.1 The scaling equation
Let A ∈ WM(m, g; ℂ) be a wavelet matrix and consider the functional
difference equation
    φ(x) = Σ_{j=0}^{mg−1} a^0_j φ(mx − j)        (14.24)

This equation is called the scaling equation associated with the


wavelet matrix A = (asj ).
14.4.2 The scaling function
If φ ∈ L²(ℝ) is a solution of the equation (14.24), then φ is called a
scaling function.
It may be noted that ( I  0 ; 0  U ) H is a Haar matrix whenever H is a
Haar matrix and U belongs to the group of (m − 1) × (m − 1) complex
matrices U such that U∗U = I. Hence the action of U can be employed
to develop a canonical form for H. Let the first step be to rotate the last
row of H so that the first m − 2 elements of the row are zeroes. Let the
last two elements be α and β respectively. Then α + β = 0 by (14.16) and
α² + β² = m by (14.15). Hence, α = −β and α² = m/2. Therefore, the last
row is
    ( 0, 0, . . . , 0, −√(m/2), √(m/2) ).
The next step is to rotate the matrix so that only the last three
elements in the (m − 1)-th row are non-zero. If these elements are α, β, γ,
then
    α + β + γ = 0,   α² + β² + γ² = m.
If we take β = γ, then α + 2β = 0 and
    α² + 2β² = 6β² = m,
i.e., β = √(m/6), α = −2√(m/6).
Hence, the last but one row is
    ( 0, 0, . . . , 0, −2√(m/6), √(m/6), √(m/6) ).
Similarly, let the rotation yield, for the (m − s)-th row, the first
m − s − 1 elements as zeroes and the last s + 1 elements as non-zeroes.
Let these be α1, α2, . . . , αs, αs+1.
Then
    Σ_i αi = 0,   Σ_i αi² = m.
Taking α2 = α3 = · · · = αs = αs+1, we have α1 = −sα2 and
    s²α2² + sα2² = m,   or   α2 = √(m/(s² + s)).
Hence, the (m − s)-th row is
    ( 0, 0, 0, . . . , −s√(m/(s² + s)), √(m/(s² + s)), . . . , √(m/(s² + s)) ).
Thus we get the expression for H_c [see (14.21)].
14.4.3 The wavelet function
If φ is a scaling function for the wavelet matrix A, then the wavelet
functions {ψ 1 , ψ 2 , . . . , ψ m−1 } associated with matrix A and the scaling
function φ are defined by the formula
    ψ^s(x) = Σ_{j=0}^{mg−1} a^s_j φ(mx − j)        (14.25)

14.4.4 Theorem
Let A ∈ WM(m, g; ℂ) be a wavelet matrix. Then, there exists a unique
φ ∈ L²(ℝ), such that
(i) φ satisfies (14.24);
(ii) ∫_ℝ φ(x) dx = 1;
(iii) supp φ ⊂ [0, (g − 1) m/(m − 1) + 1].
For proof see Resnikoff and Wells [46].
In what follows we give some examples and develop the notion of the
wavelet system associated with the wavelet matrix A.

14.4.5 Examples
 
1. The Haar functions. If H_c = ( 1  1 ; −1  1 ) is the canonical Haar wavelet
matrix of rank 2, then the scaling function φ satisfies the equation
    φ(x) = φ(2x) + φ(2x − 1)        (14.26)
Hence φ(x) = χ_{[0,1[}, where χ_K, the characteristic function of a subset K
[see 10.2.10], is a solution of the equation (14.26) [see Fig. 14.2(b)].
The wavelet function is ψ(x) = −φ(2x) + φ(2x − 1), where φ(x) = χ_{[0,1[}.
For its graph, see figure 14.2(c).
[Fig. 14.2(b): The Haar scaling function for rank 2]
[Fig. 14.2(c): The Haar wavelet function for rank 2]

2. Daubechies wavelets for rank 2 and genus 2.
Let
    D2 = (1/4) ( 1+√3    3+√3    3−√3    1−√3
                 −1+√3   3−√3    −3−√3   1+√3 ).
This is a wavelet matrix of rank 2 and genus 2, discovered by
Daubechies. For graphs of the corresponding scaling and wavelet functions,
see Resnikoff and Wells [46]. The common support of φ and ψ is [0, 3].
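The scaling function of D2 can be computed approximately from the scaling equation (14.24) by the so-called cascade iteration φ ← Σ_j a⁰_j φ(2x − j), a standard device not developed in this text. The sketch below (Python with NumPy; the dyadic resolution 2⁻⁸ and 30 iterations are illustrative choices) starts from the box function and recovers, for example, the known value φ(1) = (1 + √3)/2 ≈ 1.366 and the normalization ∫φ = 1 of theorem 14.4.4(ii).

    import numpy as np

    s3 = np.sqrt(3.0)
    a = 0.25 * np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3])   # scaling vector of D2
    L = 8                                                   # dyadic resolution 2**-L
    N = 3 * 2**L                                            # supp phi = [0, 3]
    idx = np.arange(N + 1)                                  # grid points k / 2**L
    phi = np.where(idx < 2**L, 1.0, 0.0)                    # initial guess: box on [0, 1)

    for _ in range(30):                                     # cascade iteration
        new = np.zeros_like(phi)
        for j, aj in enumerate(a):
            src = 2 * idx - j * 2**L                        # index of 2x - j on the same grid
            ok = (src >= 0) & (src <= N)
            new[ok] += aj * phi[src[ok]]
        phi = new

    print("phi(1) ~", phi[2**L])                            # about (1 + sqrt 3)/2 = 1.366
    print("Riemann sum of phi ~", phi.sum() / 2**L)         # about 1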
We end by stating a theorem due to Lawton, (Lawton [46]).
14.4.6 Theorem
Let A ∈ WM(m, g; ℂ). Let W(A) be the wavelet system associated
with A and let f ∈ L²(ℝ) (3.1.3). Then, there exists an L²-convergent
expansion
    f(x) = Σ_{j=−∞}^{∞} cj φj(x) + Σ_{s=1}^{m−1} Σ_{i=0}^{∞} Σ_{j=−∞}^{∞} d^s_{ij} ψ^s_{ij}(x)        (14.27)
where the coefficients are given by
    cj = ∫_{−∞}^{∞} f(x) φj(x) dx,        (14.28)
    d^s_{ij} = ∫_{−∞}^{∞} f(x) ψ^s_{ij}(x) dx.        (14.29)

For proof, see Resnikoff and Wells [46].


14.4.7 Remark
1. For most wavelet matrices, the wavelet system W[A] will be a
complete orthonormal system and an orthonormal basis for L²(ℝ).
2. However, for some wavelet matrices the system W [A] is not
orthonormal and yet the theorem 14.4.6 is still true.
CHAPTER 15

DYNAMICAL SYSTEMS

15.1 A Dynamical System and Its Properties


Let us consider a first order o.d.e. (ordinary differential equation) of the
form
    ẋ = dx/dt = sin x        (15.1)
The solution of the above equation is
    t = log [ (cosec x0 + cot x0) / (cosec x + cot x) ],        (15.2)
where x(t)|_{t=0} = x0.
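Relation (15.2) can be verified numerically by integrating (15.1) with a standard Runge-Kutta scheme and substituting the computed x(t) back into the closed-form expression. The sketch below (Python with NumPy; step counts and sample times are illustrative) does this for x0 = π/4.

    import numpy as np

    def rk4(f, x0, t_end, n):
        # classical fourth-order Runge-Kutta integration of dx/dt = f(x)
        h, x = t_end / n, x0
        for _ in range(n):
            k1 = f(x); k2 = f(x + h * k1 / 2); k3 = f(x + h * k2 / 2); k4 = f(x + h * k3)
            x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        return x

    x0 = np.pi / 4
    for t in (0.5, 1.0, 2.0):
        x = rk4(np.sin, x0, t, 4000)
        t_from_formula = np.log((1 / np.sin(x0) + 1 / np.tan(x0)) /
                                (1 / np.sin(x) + 1 / np.tan(x)))
        print(t, x, t_from_formula)      # the last column reproduces t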
From (15.2), we can find the values of x for different values of t. But the
determination of the values of x for different values of t is quite difficult.
On the other hand, we can get a lot of information about the solution
by plotting ẋ against x, i.e., from the graph of ẋ = sin x.

[Fig. 15.1: the graph of ẋ = sin x for 0 ≤ x ≤ 2π]

Let us suppose x0 = π/4.


Then the graph 15.1 describes the qualitative features of the solution
x(t) for all t > 0. We think of t as time, x as the position of an imaginary
particle moving along the real line, and ẋ as the velocity of that particle.


Then the differential equation ẋ = sin x represents a vector field on


the line. The arrows indicate the directions of the corresponding velocity
vector at each x. The arrows point to the right when ẋ > 0 and to the
left when ẋ < 0. ẋ > 0 for π/2 > x > 0 and ẋ < 0 for x < 0. At x > π,
ẋ < 0 and for π/2 < x < π, ẋ > 0. At points where ẋ = 0, there is no
flow. Such points are therefore called fixed points. This type of study of
o.d.e.-s to obtain the qualitative properties of the solution was pioneered
by H. Poincaré in the late 19th century [6]. What emerged was the theory
of dynamical systems. This may be treated as a special topic of the theory
of o.d.e.-s. Poincaré, followed by I. Bendixson [Bhatia and Szegő [6]], studied
topological properties of solutions of autonomous o.d.e.-s in the plane.
Almost simultaneously with Poincaré, A.M. Lyapunov [32] developed his
theory of stability of a motion (solution) for a system of n first order o.d.e.-s.
He defined, in a precise form, the notion of stability; asymptotic stability
and instability, and developed a method for the analysis of the stability
properties of a given solution of an o.d.e.. But all his analysis was strictly
in a local setting. On the other hand, Poincaré studied the global properties
of differential equations in a plane.
As pointed out in the example above, Poincaré introduced the concept of
a trajectory, i.e., a curve in the x, ẋ plane, parametrized by the time variable
t, which can be found by eliminating the variable t from the given equations,
thus reducing these to first order differential equations connecting x and
ẋ. In this way, Poincaré set up a convenient geometric framework in which
to study the qualitative behaviour of planar differential equations. He was
not interested in the integration of particular types of equations, but in
classifying all possible behaviours of the class of all second order differential
equations. Great impetus to the theory of dynamical systems came from
the work of G.D. Birkoff [7]. There are many other authors who contributed
to a large extent to this qualitative theory of differential equations. In this
chapter we give exposure to the basic elements in a dynamical system.
15.1.1 Definition: dynamical system
Let X denote a metric space with metric ρ. A dynamical system on X
is the triplet (X, ℝ, π), where π is a map from the product space X × ℝ
into the space X satisfying the following axioms:
(i) π(x, 0) = x for every x ∈ X (identity axiom)
(ii) π(π(x, t1), t2) = π(x, t1 + t2) for every x ∈ X and t1, t2 in ℝ (group
axiom)
(iii) π is continuous (continuity axiom).

Given a dynamical system on X, the space X and the map π are respectively
called the phase space and the phase map (of the dynamical system).
Unless otherwise stated, X is assumed to be given.

15.1.2 Example: ordinary autonomous differential systems


The differential system
    dx/dt = ẋ = f(x, t)        (15.3)
is called an autonomous system if the RHS in (15.3) does not contain t
explicitly.
We consider the equation
    dx/dt = ẋ = f(x)        (15.4)
where f : ℝⁿ → ℝⁿ is continuous and, moreover, let us assume that for
each x ∈ ℝⁿ there exists a unique solution ψ(t, x) of (15.4) which is defined
on ℝ and satisfies
ψ(0, x) = x. Then, following Coddington and Levinson ([13], chapters 1 and
2) it can be said that the uniqueness of the solution implies that

    ψ(t1, ψ(t2, x)) = ψ(t1 + t2, x)        (15.5)
and that ψ, considered as a function from ℝ × ℝⁿ into ℝⁿ, is
continuous in its arguments [see section 4, chapter II, Coddington and
Levinson [13]].
We assume that f satisfies the global Lipschitz condition, i.e.,

    ||f(x1) − f(x2)|| ≤ M ||x1 − x2||,        (15.5)
for all x1, x2 ∈ ℝⁿ and a given positive number M, so that the conditions
on solutions of (15.4) are obtained. We next want to show that the map
π : ℝⁿ × ℝ → ℝⁿ such that π(x, t) = ψ(t, x) defines a dynamical system
on ℝⁿ.
For that, we note that

π(x, 0) = ψ(0, x) = x
π(π(x, t1), t2) = ψ(t2, ψ(t1, x)) = ψ(t1 + t2, x) = π(x, t1 + t2)

Moreover, π(x, t) is continuous in its arguments. Thus all the axioms
(i), (ii) and (iii) of 15.1.1 are fulfilled. Hence π(x, t) is a phase map.
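The phase map is rarely available in closed form, but for an explicitly solvable field the three axioms can be checked directly. The following sketch is an illustration only (the field f(x) = −x and its flow ψ(t, x) = x e^(−t) are not taken from the text); it verifies the identity and group axioms of 15.1.1 numerically.

```python
import math

def psi(t, x):
    # Closed-form solution of x' = -x with psi(0, x) = x.
    return x * math.exp(-t)

def pi_map(x, t):
    # Phase map pi(x, t) = psi(t, x), as in Example 15.1.2.
    return psi(t, x)

x0, t1, t2 = 2.5, 0.7, -1.3

# Identity axiom: pi(x, 0) = x.
assert abs(pi_map(x0, 0.0) - x0) < 1e-12

# Group axiom: pi(pi(x, t1), t2) = pi(x, t1 + t2).
lhs = pi_map(pi_map(x0, t1), t2)
rhs = pi_map(x0, t1 + t2)
assert abs(lhs - rhs) < 1e-12
print("identity and group axioms hold numerically:", lhs, rhs)
```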
Example 2: ordinary autonomous differential systems
Let us consider the system

dx/dt = ẋ = F(x)     (15.6)

where F : D → ℝⁿ is a continuous function on an open set D ⊂ ℝⁿ and
for each x ∈ D (15.6) has a unique solution ψ(t, x), ψ(0, x) = x, defined on
a maximal interval (a(x), b(x)), −∞ ≤ a(x) < 0 < b(x) ≤ +∞. For each
x ∈ D, define Γ+(x) = {ψ(t, x) : 0 ≤ t < b(x)} and Γ−(x) = {ψ(t, x) :
a(x) < t ≤ 0}. Γ+(x) and Γ−(x) are respectively called the positive and
the negative trajectories through the point x ∈ D.
We will show that to each system (15.6), there corresponds a system

dx/dt = ẋ = F1(x), x ∈ D     (15.7)

where F1 : D → ℝⁿ, such that (15.7) defines a dynamical system on D with
the property that for each x ∈ D the systems (15.6) and (15.7) have the
same positive and the same negative trajectories.
If D = ℝⁿ, then, given the equation (15.4), we set

dx/dt = ẋ(t) = F1(x) = F(x)/(1 + ||F(x)||)     (15.8)

where || · || is the Euclidean norm. If D ≠ ℝⁿ, then the boundary ∂D of
D is non-empty (∂D ≠ Φ) and closed. We next consider the system

dx/dt = ẋ(t) = F1(x) = [F(x)/(1 + ||F(x)||)] · [ρ(x, ∂D)/(1 + ρ(x, ∂D))]     (15.9)

where ρ(x, ∂D) = inf{||x − y|| : y ∈ ∂D}.
In other words, ρ(x, ∂D) is the distance of x from ∂D.
Since f satisfies a Lipschitz condition, equation (15.4) has a unique
solution.
Now,

||F1(x) − F1(y)|| = || F(x)/(1 + ||F(x)||) − F(y)/(1 + ||F(y)||) ||

   ≤ [1/((1 + ||F(x)||)(1 + ||F(y)||))] [ ||F(x) − F(y)|| + ||F(x)|| ||F(y)|| · || F(x)/||F(x)|| − F(y)/||F(y)|| || ]

   < K||x − y|| + ||e(x) − e(y)||

where K is the Lipschitz constant of F and e(x) is a vector of unit norm in the direction of F(x).
Assuming ||F(x)|| > m > 0 we have

||F1(x) − F1(y)|| < (K + k)||x − y||

where ||e(x) − e(y)|| ≤ k||x − y||.
Thus, F1 satisfies the global Lipschitz condition. Thus, (15.8) defines a
dynamical system. Similarly, (15.9) also defines a dynamical system. (15.8)
and (15.9) have the same positive and negative trajectories as (15.6).
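The point of the rescaling in (15.8) is that F1 is bounded (so its solutions exist for all time) while pointing in the same direction as F at every point; the two systems therefore trace out the same curves and differ only in the speed of traversal. The sketch below illustrates this for an arbitrarily chosen planar field (the field F is illustrative and not taken from the text).

```python
import numpy as np

def F(x):
    # An illustrative vector field on R^2: a rotation combined with a weak contraction.
    return np.array([-x[1] - 0.1 * x[0], x[0] - 0.1 * x[1]])

def F1(x):
    # Rescaled field of (15.8): norm strictly below 1, same direction as F.
    return F(x) / (1.0 + np.linalg.norm(F(x)))

x = np.array([2.0, -1.0])
v, w = F(x), F1(x)

# F1 = lambda(x) * F with lambda(x) > 0, so positive and negative trajectories
# through x coincide as point sets; only the time parametrization changes.
cos_angle = np.dot(v, w) / (np.linalg.norm(v) * np.linalg.norm(w))
print("||F1(x)|| =", np.linalg.norm(w))                 # always < 1
print("cosine of angle between F and F1 =", cos_angle)  # 1 up to rounding
```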
15.2 Homeomorphism, Diffeomorphism, Riemannian Manifold
15.2.1 Definition: homeomorphism
See 1.6.5.
15.2.2 Definition: manifold
A topological space Y [see 1.5.1] is called an n-dimensional manifold
(or n-manifold) if every point of Y has a neighbourhood homeomorphic
to an open subset of ℝⁿ. Since one can clearly take this subset to be an
open ball, and since an open ball in ℝⁿ is homeomorphic to ℝⁿ itself, the
condition for a space to be a manifold can also be expressed by saying
that every point has a neighbourhood homeomorphic to ℝⁿ.
15.2.3 Remark
A homeomorphism from an open subset V of Y to an open subset of
ℝⁿ allows one to transfer the cartesian coordinate system of ℝⁿ to V. This
gives a local coordinate system or chart on Y [see Schwartz [51]].
15.2.4 Differentiability, C^r-, C^0-maps
Let U be an open subset of ℝⁿ.
For the definition of differentiability of J : L ⊂ ℝⁿ → ℝ see 13.2 and 13.3.
C^r: J is said to belong to class C^r if J is r times continuously
differentiable on an open set U ⊆ ℝⁿ.
C^r-map: Let F : U ⊆ ℝⁿ → V ⊆ ℝᵐ. For the definition of differentiability
of F and the concrete form of the derivative see Ortega and Rheinboldt [42].
If (x1, . . . , xn)^T ∈ U and (y1, y2, . . . , ym)^T ∈ V then

yi = Fi(x1, x2, . . . , xn), i = 1, 2, . . . , m     (15.10)

The map F is called a C^r-map if each Fi is r times continuously differentiable for
some 1 ≤ r ≤ ∞.
Smooth: F : U ⊂ ℝⁿ → ℝᵐ is said to be smooth if it is a C^∞-map.
C^0-map: Maps that are continuous but not differentiable will be
referred to as C^0-maps.
15.2.5 Definition: diffeomorphism
F : U ⊂ ℝⁿ → ℝᵐ is said to be a diffeomorphism if it fulfils the
following conditions:

(i) F is a bijection (one-to-one and onto) [see 1.2.3].

(ii) Both F and F⁻¹ are differentiable mappings.

F is said to be a C^k-diffeomorphism if both F and F⁻¹ are C^k-maps.



15.2.6 Remark
Note that G : U → V is a diffeomorphism if and only if m = n and the
matrix of partial derivatives,

G′(x1, . . . , xn) = (∂Gi/∂xj), i, j = 1, . . . , n,

is non-singular at every x ∈ U.
Example: Let G(x, y) = (exp y, exp x)^T with U = ℝ² and V = {(x, y) : x > 0, y > 0}.

G′(x, y) = [ 0      exp y ]
           [ exp x    0   ],   det G′(x, y) = −exp(x + y) ≠ 0 for each (x, y) ∈ ℝ².

Thus, G is a diffeomorphism.
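The non-singularity of G′ can also be checked numerically. The sketch below approximates the matrix of partial derivatives of the example map by central differences and compares its determinant with the exact value −exp(x + y); the step size h is an arbitrary choice.

```python
import numpy as np

def G(p):
    # The example map G(x, y) = (exp y, exp x)^T of Remark 15.2.6.
    x, y = p
    return np.array([np.exp(y), np.exp(x)])

def jacobian(f, p, h=1e-6):
    # Central-difference approximation of the matrix of partial derivatives.
    p = np.asarray(p, dtype=float)
    n = p.size
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (f(p + e) - f(p - e)) / (2.0 * h)
    return J

for p in [(0.0, 0.0), (1.0, -2.0), (-3.0, 0.5)]:
    J = jacobian(G, p)
    # det G'(x, y) = -exp(x + y) never vanishes, so G is a diffeomorphism onto V.
    print(p, "numerical det:", np.linalg.det(J), "exact det:", -np.exp(sum(p)))
```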
15.2.7 Two types of dynamical systems
We note that in a dynamical system the state changes with time t.
The two types of dynamical system encountered in practice are as follows:
(i) x_{t+1} = G(x_t), t ∈ Z or N     (15.11)
Such a system is called a discrete system.
(ii) When t is continuous, the dynamics are usually described by a
differential equation,

dx/dt = ẋ = F(x)     (15.12)

In (15.11) and (15.12), x represents the state of the system and takes values in the
state space or phase space X [see 15.1.1]. Sometimes the phase space is a
Euclidean space or a subspace of it. But it can also be a non-Euclidean
structure such as a circle, sphere, a torus or some other differential
manifold.
15.2.8 Advantages of taking the phase space as a differential
manifold
If the phase space X is Euclidean, then it is easy to analyse. But if the
phase space X is non-Euclidean yet a differential manifold, there is also an
advantage. This is because a differential manifold is ‘locally Euclidean’, and
this allows us to extend the idea of differentiability to functions defined on
it. If Y is a manifold of dimension n, then for any x ∈ Y we can find a
neighbourhood Nx ⊆ Y containing x and a homeomorphism h : Nx → ℝⁿ
which maps Nx onto a neighbourhood of h(x) ∈ ℝⁿ. Since we can define
coordinates in U = h(Nx) ⊆ ℝⁿ (the coordinate curves of which can be
mapped back onto Nx), we can think of h as defining local coordinates on
the patch Nx of Y [see figure 15.2].
The pair (U, h) is called a chart and we can use it to define differentiability
on Nx. Let us assume that G : Nx → Nx; then G induces a mapping
Ĝ = h · G · h⁻¹ : U → U [see figure 15.3]. We say G is a C^k-map on Nx if
Ĝ is a C^k-map on U.
Fig. 15.2 Cylinder: example of a differential manifold

Fig. 15.3 The induced map Ĝ = h · G · h⁻¹ : U → U
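As a concrete illustration of a chart and of the induced map Ĝ = h · G · h⁻¹, the following sketch works with the cylinder of figure 15.2, embedded in ℝ³ as the set of points (cos θ, sin θ, z). The patch, the chart h and the map G are hypothetical choices made only for the illustration: h returns the local coordinates (θ, z) on the patch away from the seam θ = ±π, and G rotates the cylinder about its axis while stretching it along the axis.

```python
import numpy as np

def h(p):
    # Chart on a patch N_x of the cylinder {(cos t, sin t, z)}: local coordinates
    # (theta, z) in U = (-pi, pi) x R, valid away from the seam theta = ±pi.
    x, y, z = p
    return np.array([np.arctan2(y, x), z])

def h_inv(q):
    theta, z = q
    return np.array([np.cos(theta), np.sin(theta), z])

def G(p):
    # A hypothetical smooth map of the cylinder to itself: rotate by 0.3 radians
    # about the axis and stretch the axial coordinate by the factor 1.5.
    x, y, z = p
    c, s = np.cos(0.3), np.sin(0.3)
    return np.array([c * x - s * y, s * x + c * y, 1.5 * z])

def G_hat(q):
    # Induced map G_hat = h . G . h^(-1) on the chart domain U.
    return h(G(h_inv(q)))

q = np.array([0.8, -1.2])        # a point of the patch, written in local coordinates
print("G_hat(q) =", G_hat(q))    # smoothness of G on N_x is read off from G_hat on U
```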

15.3 Stable Points, Periodic Points and Critical Points
Let Y be a differential manifold and G : Y → Y be a diffeomorphism. For
x ∈ Y, the iteration (15.11) generates a sequence {Gᵏ(x)}. The distinct
points of the sequence define the orbit or trajectory of x under G. More
generally, the orbit of x under G is {Gᵐ(x) : m ∈ Z}. For m ∈ Z+, Gᵐ
is the composition of G with itself m times. Since G is a diffeomorphism,
G⁻¹ exists, G⁻ᵐ = (G⁻¹)ᵐ, and G⁰ = Id_Y, the identity map on Y. Thus,
the orbit of x is an infinite (on both sides) sequence of distinct points of Y.
15.3.1 Definition: fixed point
A point x∗ ∈ Y is called a fixed point of G if Gᵐ(x∗) = x∗ for all
m ∈ Z.
15.3.2 Example 1
To find the fixed points of ẋ = F(x), where F(x) = x² − 1.
Now, x(t + 1) − x(t) ≈ F(x(t)).
If x∗ is a fixed point of G, i.e., of the map x ↦ x + F(x) suggested by the above approximation, then

x∗ − x∗ = F(x∗), i.e., F(x∗) = 0, i.e., x∗ = ±1.

It may be noted that at the fixed points, x∗ = ±1, there is no flow.
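Locating the fixed points amounts to solving F(x) = 0. For F(x) = x² − 1 this can be done by inspection; the bisection routine below is a generic sketch (not a method prescribed in the text) that recovers the same two points.

```python
def F(x):
    # Vector field of Example 15.3.2.
    return x * x - 1.0

def bisect(f, a, b, tol=1e-12):
    # Simple bisection; assumes f(a) and f(b) have opposite signs.
    fa = f(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        if fa * f(m) <= 0:
            b = m
        else:
            a, fa = m, f(m)
    return 0.5 * (a + b)

# The flow is stationary exactly where F vanishes.
print(bisect(F, -2.0, 0.0))   # ≈ -1.0
print(bisect(F,  0.0, 2.0))   # ≈ +1.0
```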

15.3.3 Definition: periodic points


A point x∗ ∈ Y is said to be a periodic point of G if Gᵖ(x∗) = x∗ for
some integer p ≥ 1.
The least value of p for which the definition of a periodic point holds is
called the period of the point x∗, and the orbit of x∗,

{x∗, G(x∗), G²(x∗), . . . , Gᵖ⁻¹(x∗)},     (15.13)

is said to be a periodic orbit of period p or a p-cycle of G.

15.3.4 Remark
1. A fixed point is a periodic point of period one.
2. Since (Gᵖ)^q(x∗) = x∗ for every q ∈ Z when x∗ is a periodic point of G with
period p, x∗ is a fixed point of Gᵖ.
3. If x∗ is a periodic point of period p for G, then all other points in the
orbit of x∗ are periodic points of period p of G. For if Gᵖ(x∗) = x∗,
then Gᵖ(Gⁱ(x∗)) = Gⁱ(Gᵖ(x∗)) = Gⁱ(x∗), i = 0, 1, 2, . . . , p − 1.
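The period of a point can be found by direct iteration. The sketch below uses the illustrative diffeomorphism G(x) = −x of the real line (not an example from the text): the origin is a fixed point (a periodic point of period one), and every other point lies on a 2-cycle {x, −x}.

```python
def G(x):
    # Illustrative diffeomorphism of the real line.
    return -x

def minimal_period(G, x, p_max=100, tol=1e-12):
    # Least p >= 1 with G^p(x) = x, or None if no period up to p_max is found.
    y = x
    for p in range(1, p_max + 1):
        y = G(y)
        if abs(y - x) < tol:
            return p
    return None

print(minimal_period(G, 0.0))   # 1
print(minimal_period(G, 2.5))   # 2
```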

15.3.5 Definition: stability according to Lyapunov (stable point)

A fixed point x∗ is said to be stable if, for every neighbourhood Nx∗ of
x∗, there is a neighbourhood N′x∗ ⊆ Nx∗ of x∗ such that if x ∈ N′x∗ then
Gᵐ(x) ∈ Nx∗ for all m > 0. The above definition implies that iterates
of points near to a stable fixed point remain near to it for m ∈ Z+. This
is in conformity with the definition of a stable equilibrium point for a
moving particle.

15.3.6 Remark
1. If a fixed point x∗ is stable and lim_{m→∞} Gᵐ(x) = x∗ for all x
in some neighbourhood of x∗, then the fixed point is said to be
asymptotically stable.
2. Unstable point. A fixed point x∗ is said to be unstable if it is not stable,
i.e., there is a neighbourhood Nx∗ of x∗ such that in every neighbourhood
of x∗ one can find a point x and an integer m > 0 with Gᵐ(x) ∉ Nx∗.

15.3.7 Example
We refer to the example in 15.3.2. To determine stability, we plot x² − 1
and then sketch the vector field [see figure 15.4]. The flow is to the right
where x² − 1 > 0 and to the left where x² − 1 < 0. Thus, x∗ = −1 is stable
and x∗ = 1 is unstable.


Fig. 15.4 The vector field of ẋ = x² − 1 on the x-axis
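The same conclusion can be reached by integrating the flow numerically. The sketch below uses a crude forward-Euler scheme (an illustration only; the step size and the time horizons are arbitrary choices): initial points near x∗ = −1 are attracted to it, while a point just above x∗ = 1 moves away to the right, exactly as the sign of x² − 1 dictates.

```python
def F(x):
    return x * x - 1.0

def flow(x0, t_end=8.0, dt=1e-3):
    # Forward-Euler integration of x' = x^2 - 1, enough to see the direction of flow.
    x = x0
    for _ in range(int(t_end / dt)):
        x += dt * F(x)
    return x

print(flow(-0.5), flow(-1.5))            # both approach the stable point -1
print(flow(0.9), flow(1.1, t_end=1.0))   # 0.9 drifts towards -1; 1.1 runs away from +1
```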

We end this section with an important theorem.

15.4 Existence, Uniqueness and Topological Consequences

15.4.1 Theorem: existence and uniqueness


Consider ẋ = F(x), where x ∈ ℝⁿ and F : U ⊂ ℝⁿ → ℝⁿ is a C^1 map, where
U is an open connected set in ℝⁿ.
Then, for x0 ∈ U, the initial value problem with x(0) = x0 has a solution x(t) on some
time interval about t = 0 and the solution is unique. (Strogatz [54])

15.4.2 Remark
1. This theorem has deep implications. Different trajectories do
not intersect.
For if two trajectories intersected at some point in the phase space, then
starting from the crossing point we would get two solutions along the two
trajectories. This contradicts the uniqueness of the solution.
2. In a two-dimensional phase space, let us consider a closed orbit C in the
phase plane. Then any trajectory starting inside C will always lie within
C. If there are fixed points inside C, then the trajectory may approach one
of them.
But if there are no fixed points inside C, then by intuition we can
say that the trajectory cannot wander around inside the orbit endlessly
without settling down to something. This is supported by the following
famous theorem.
Poincaré-Bendixson theorem:
If a trajectory is confined to a closed, bounded region and there are no
fixed points in the region, then the trajectory must eventually approach a
closed orbit (Arrowsmith and Place [3]).
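A standard illustration of the theorem (a textbook example, not one discussed in this chapter) is the planar system ṙ = r(1 − r²), θ̇ = 1 written in polar coordinates: the annulus 1/2 ≤ r ≤ 2 is forward invariant and contains no fixed points, so every trajectory entering it must approach a closed orbit, here the circle r = 1. The sketch below integrates the radial equation with a forward-Euler step.

```python
def step(r, theta, dt):
    # One Euler step of r' = r(1 - r^2), theta' = 1 (polar coordinates).
    return r + dt * r * (1.0 - r * r), theta + dt

r, theta, dt = 0.5, 0.0, 1e-3
for _ in range(20000):          # integrate up to t = 20
    r, theta = step(r, theta, dt)

# No fixed points lie in the annulus 1/2 <= r <= 2, so the trajectory is forced
# onto the closed orbit r = 1, in agreement with the Poincaré-Bendixson theorem.
print("r after t = 20:", r)     # ≈ 1.0
```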
15.5 Bifurcation Points and Some Results


The bifurcation phenomenon is the outcome of the presence of a parameter in
a dynamical system. A physical example may motivate the study of
bifurcation theory. Suppose a body is resting on a vertical iron pillar. If the
weight is gradually increased, a stage may come when the pillar becomes
unstable and buckles. Here the weight plays the role of a control parameter
and the deflection of the pillar from the vertical plays the role of the dynamical
variable x. The bifurcation of fixed points for flows on the line occurs
in several physical phenomena, such as the onset of coherent radiation
in a laser and the outbreak of an insect population, etc. [see Strogatz
[54]]. Against the above backdrop, we can give the formal definition of a
bifurcation as follows. Let F : ℝᵐ × ℝⁿ → ℝⁿ (G : ℝᵐ × ℝⁿ → ℝⁿ) be
an m-parameter, C^r-family of vector fields (diffeomorphisms) on ℝⁿ, i.e.,
(μ, x) ↦ F(μ, x) (G(μ, x)), μ ∈ ℝᵐ, x ∈ ℝⁿ. The family F (G) is said to
have a bifurcation point at μ = μ∗ if, in every neighbourhood of μ∗, there
exist values of μ such that the corresponding vector fields F(μ, ·) = Fμ(·)
(diffeomorphisms G(μ, ·) = Gμ(·)) show topologically distinct behaviour.
For details see Arrowsmith and Place [3]. We provide here some examples.
Example 1. F(μ, x) = μ − x²     (15.14)
Let us sketch the phase portraits for ẋ = Fμ(x), with (μ, x) near (0, 0).
Since our study is confined to a neighbourhood of (0, 0), the bifurcation
study is local in nature [see figure 15.5].

Fig. 15.5 Phase portraits of ẋ = μ − x² for μ < 0, μ = 0 and μ > 0

For μ < 0, the equation (15.14) yields ẋ < 0 for all x ∈ ℝ. When μ = 0
there is a non-hyperbolic fixed point at x = 0, but ẋ < 0 for all x ≠ 0
(Arrowsmith and Place [3]). For μ > 0, μ − x² = 0 ⇒ x = ±μ^{1/2}; for
x > μ^{1/2}, ẋ < 0, for −μ^{1/2} < x < μ^{1/2}, ẋ > 0, and for x < −μ^{1/2}, ẋ < 0.
Hence, x = μ^{1/2} is a stable fixed point. On the other hand, x = −μ^{1/2} is
an unstable fixed point.
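The dependence of the fixed-point set on μ can be tabulated directly. The sketch below is an illustration; the stability labels are read off from the sign of the derivative d(μ − x²)/dx = −2x at each fixed point, which agrees with the flow-direction argument above.

```python
import math

def fixed_points_saddle_node(mu):
    # Fixed points of x' = mu - x^2 (Example 1): none for mu < 0, ±sqrt(mu) for mu > 0.
    if mu < 0:
        return []
    if mu == 0:
        return [(0.0, "non-hyperbolic")]
    r = math.sqrt(mu)
    # Stability from the sign of d/dx (mu - x^2) = -2x at the fixed point.
    return [(r, "stable"), (-r, "unstable")]

for mu in (-1.0, 0.0, 1.0):
    print(mu, fixed_points_saddle_node(mu))
```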
Example 2. F(μ, x) = μx − x²     (15.15)
If μ > 0, x = μ is stable and x = 0 is unstable. The stabilities are
reversed when μ < 0. At μ = 0, there is one singularity at x = 0 and ẋ < 0
for all x ≠ 0. This leads to a bifurcation as depicted below [see figure 15.6].

Fig. 15.6 Phase portraits of ẋ = μx − x² for μ < 0, μ = 0 and μ > 0
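The same tabulation for (15.15) shows the exchange of stability between x = 0 and x = μ as μ passes through 0, the behaviour usually called a transcritical bifurcation. The sketch below is again an illustration; stability is read off from the sign of the derivative d(μx − x²)/dx = μ − 2x at each fixed point.

```python
def fixed_points_transcritical(mu):
    # Fixed points of x' = mu*x - x^2 (Example 2): x = 0 and x = mu.
    def label(x):
        slope = mu - 2.0 * x
        if slope < 0:
            return "stable"
        if slope > 0:
            return "unstable"
        return "non-hyperbolic"
    return [(0.0, label(0.0)), (mu, label(mu))]

for mu in (-1.0, 0.0, 1.0):
    print(mu, fixed_points_transcritical(mu))
# mu = -1: x = 0 stable, x = -1 unstable; mu = +1: the roles are exchanged.
```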
List of Symbols

Φ (phi) Null set


∈ Belongs to
A⊆B A is a subset of B
B⊇A B is a superset of A
P (A) Power set of A
J Set of all integers
ℚ Set of all rational numbers
N = {1, 2, . . . , n, . . .} The set of natural numbers
P Set of all polynomials
ℝⁿ n-dimensional real Euclidean space
]a, b[ An open interval with end points a and b
[a, b] A closed interval containing a and b
ℵ0 Class of all denumerable sets
ℝ Set of real numbers
ℂ Set of complex numbers
A∪B Union of sets A and B
A+B The sum of sets A and B
A∩B The intersection of A and B
AB The product of A and B
A−B The set of elements in A which are not elements
of B
Ac The complement of the set A
g.l.b (infimum) Greatest lower bound
l.u.b (supremum) Least upper bound
X ×Y Cartesian product of sets X and Y
(x, y) Ordered pair of elements x and y
ℝᵐˣⁿ Space of m × n matrices
A = {aij } A, a matrix
l∞ Sequence space
C([a, b]) Space of functions continuous in [a, b]
lp pth power summable space
l2 Hilbert sequence space
Lp ([a, b]) Lebesgue pth integrable functions

Pn([a, b]) Space of real polynomials of order n defined on [a, b]
X1 ⊕ X2 Direct sum of subspaces
X/Y Quotient space of a linear space X by a
subspace Y
Codimension (codim) of Y in X Dimension of the quotient space X/Y
ρ(x, y) Distance between x and y
(X, ρ) Metric space where X is a set and ρ a metric
M ([a, b]) Space of bounded real functions
c Space of convergent numerical sequences
m Space of bounded numerical sequences
s Space of not necessarily bounded sequences
⊂ Set inclusion
D(A) Diameter of a set A
D(A, B) Distance between two sets A and B
D(x, A) Distance of a point x from a set A
B(x0 , r) Open ball with centre x0 and radius r
B(x0 , r) Closed ball with centre x0 and radius r
S(x0 , r) Sphere with centre x0 and radius r
Kᶜ The complement of the set K
Nx0 Neighbourhood of x0
A Closure of the set A
ℂⁿ n-dimensional complex space
S([a, b]) Space of continuous real valued functions on [a, b] with the metric ρ(x, y) = ∫ₐᵇ |x(t) − y(t)| dt
Bx Basic F-neighbourhood of x
D(A) Derived set of A
Int A Interior of A
∩ᵢ₌₁ⁿ Fi Intersection of all sets Fi for i = 1, 2, . . . , n

Inf Infimum
Sup Supremum
· Norm
xm → x {xm } tends to x
lim sup Limit supremum
lim inf Limit infimum
 · 1 l1 —norm
 · 2 l2 —norm
 · ∞ l∞ —norm


∑_{n=1}^{∞} xn Summation of xn for n = 1, . . . , ∞
∑_{n=1}^{m} xn Summation of xn for n = 1, 2, . . . , m (finite)
lp^(n) n-dimensional pth power summable space
X0 The interior of the set X
c0 Space of all sequences converging to 0
x + Lq Quotient norm of x + L
Span L Set of linear combinations of elements in L
σa (A) Approximate eigenspectrum of an operator A
rσ (A) Spectral radius of an operator A
Aᵀ Transpose of a matrix A
D(A) Domain of an operator A
R(A) Range of an operator A
N (A) Null space of an operator A
BV ([a, b]) Space of scalar-valued functions of bounded
variation on [a, b]
N BV ([a, b]) Space of scalar-valued normalized functions of
bounded variation on [a, b]
Δ Forward difference operator
xn −w→ x {xn} is weakly convergent to x
x, y Inner product of x and y
x⊥y x is orthogonal to y
E⊥F E is orthogonal to F
M̄ᵀ Conjugate transpose of a matrix M
A≥0 A is a non-negative operator
w(A) Numerical range of an operator A
m(E) Lebesgue measure of a subset E of ℝ
∫_E x dm Lebesgue integral of a function x over a set E
Var (x) Total variation of a scalar valued function x
essupE |x| Essential supremum of a function |x| over a
set E
L∞ (E) The set of all equivalent classes of essentially
bounded functions on E
Span L Closure of the span of L

M⊥ Set orthogonal to M
M⊥⊥ Set orthogonal to M⊥
(Ex → Ey ) Space of all bounded linear operators mapping
n.l.s. Ex → n.l.s. Ey
H Hilbert space
A⁻¹ Inverse of an operator A
A* Adjoint of an operator A
Ā Closure of A
Aλ Operator of the form A − λI
||A|| Norm of A
A(x, x) Quadratic Hermitian form
A(x, y) Bilinear Hermitian form
C^k([0, 1]) Space of continuous functions x(t) on [0, 1] having derivatives up to the k-th order
Ex∗ Conjugate (dual) of Ex
Ex∗∗ Conjugate (dual) of Ex∗
ΠEx (x) Canonical embedding of Ex into Ex∗∗
χE Characteristic function of a set E
δij Kronecker delta
sgn z Signum of z ∈ ℂ
{e1, e2, . . .} Standard Schauder basis for ℝⁿ or ℂⁿ or lp, 1 ≤ p < ∞
Gr(F ) Graph of a map F
I Identity operator
ρ(A) Resolvent set of an operator A
σ(A) Spectrum of an operator A
σe (A) Eigenspectrum of an operator A

Abbreviations
BVP Boundary value problems
LHS Left hand side
ODE Ordinary differential equations
RHS Right hand side
s.t. Such that
WRT With respect to
a.e. Almost everywhere
Bibliography

[1] Aliprantis, C.D. and Burkinshaw, O. (2000) : Principles of Real


Analysis, Harcourt Asia Pte Ltd, Englewood Cliffs, N.J.
[2] Anselone, P.M. (1971) : Collectively Compact Operator
Approximation Theory, Prentice-Hall.
[3] Arrowsmith, D.K. and Place, C.M. (1994) : An Introduction to
Dynamical Systems, Cambridge University Press, Cambridge.
[4] Bachman, G. and Narici, L. (1966) : Functional Analysis, Academic
Press, New York.
[5] Banach, S. (1932) : Théorie des opérations linéaires, Monografje
Matematyczne, Warsaw.
[6] Bhatia, N.P. and Szegö, G.P. (1970) : Stability Theory of Dynamical
Systems, Springer-Verlag, New York.
[7] Birkhoff, G.D. (1927) : Dynamical Systems (Amer. Math. Soc.
Colloquium Publications Vol. 9), New York.
[8] Bohnenblust, H.F. and Sobczyk, A. (1938) : Extension of Functionals
on Complex Linear Spaces, Bull. Amer. Math. Soc. 44, 91–3.
[9] Browder, F.E. (1961) : On the Spectral Theory of Elliptic Differential
Operators I, Math. Ann., Vol. 142, 22–130.
[10] Carleson, L. (1966) : On the Convergence and Growth of Partial
Sums of Fourier Series, Acta Math. 116, 135–57.
[11] Cea, J. (1978) : Lectures on Optimization—Theory and Algorithms,
Tata Institute of Fundamental Research, Narosa Publishing House,
New Delhi.
[12] Clarkson, J.A. (1936) : Uniformly Convex Spaces, Trans. Amer.
Math. Soc. 40, 396–414.
[13] Coddington, E.A., Levinson, N. (1955) : Theory of Ordinary
Differential Equations, McGraw-Hill Book Company, New York.


[14] Collatz, L. (1966) : Functional Analysis and Numerical


Mathematics, Academic Press, New York.
[15] Courant, R. and Hilbert, D. (1953) : Methods of Mathematical
Physics, Interscience, New York.
[16] Daubechies, I. (1988) : Orthonormal Bases of Compactly Supported
Wavelets, Commun. Pure Appl. Math., 41, 909–996.
[17] Debnath, L. (1998) : Wavelet Transforms and their Applications,
Pinsa-A, 64, A, 6.
[18] Dieudonné, J. (1969) : Foundations of Modern Analysis, Academic
Press, New York.
[19] Gelfand, I. (1941) : Normierte Ringe, Mat. Sbornik, N.S. 9 (51), 3–24.
[20] Goffman, C and Pedrick, G. (1974) : First Course in Functional
Analysis, Prentice-Hall of India Private Ltd., New Delhi.
[21] Goldberg, S. (1966) : Unbounded Linear Operators with
Applications, McGraw-Hill Book Company, New York.
[22] Haar, A. (1910) : Zur Theorie der orthogonalen Funktionensysteme,
Math. Ann. 69, 331–371.
[23] Hahn, H. (1927) : Über lineare Gleichungssysteme in linearen
Räumen, J. Reine Angew. Math. 157, 214–29.
[24] Hilbert, D. (1912) : Grundzüge einer allgemeinen Theorie der
linearen Integralgleichungen, Repr. 1953, New York.
[25] Jain, P.K., Ahuja, O.P. and Ahmed, K. (1997) : Functional Analysis,
New Age International (P) Limited, New Delhi.
[26] James, R.C. (1950) : Bases and Reflexivity in Banach Spaces, Ann.
Math., 52, 518–27.
[27] Kantorovich, L.V. (1948) : Functional Analysis and Applied
Mathematics, (Russian) Uspekhi Matem. Nauk, 3, 6, 89–185.
[28] Kato, T. (1958) : Perturbation Theory for Nullity, Deficiency and
Other Quantities of Linear Operators, J. Analyse Math. vol. 6, pp.
273–322.
[29] Kolmogoroff, A. and Fomin, S. (1954) : Elements of the Theory of
Functions and Functional Analysis, Izdat. Moscow Univ., Moscow;
transl. by L. Boron, Graylock Press, Rochester, New York, 1957.
[30] Kreyszig, E. (1978) : Introductory Functional Analysis with
Applications, John Wiley & Sons. New York.

[31] Lahiri, B.K. (1982) : Elements of Functional Analysis, World Press,


Kolkata.
[32] Liapunov, A.M. (1966) : Stability of Motion (English translation),
Academic Press, New York.
[33] Limaye, B.V. (1996) : Functional Analysis, New Age International
Ltd., New Delhi.
[34] Lions, J.L. and Magenes, E. (1972) : Non-homogeneous Boundary
Value Problems, vol. I, Springer-Verlag, Berlin.
[35] Lusternik, L.A. and Sobolev, V.J. (1985) : Elements of Functional
Analysis, Hindusthan Publishing Corporation, New Delhi.
[36] Mikhlin, S. (1964) : Variational Methods in Mathematical Physics,
Pergamon Press, New York.
[37] Mikhlin, S. (1965) : The Problem of the Minimum of a Quadratic
Functional, Holden-day, San Francisco.
[38] Mansfield, M.J. (1963) : Introduction to Topology, Litton
Educational Publishing Inc., New York.
[39] Nair, M.T. (2002) : Functional Analysis, Prentice-Hall of India
Private Limited, New Delhi.
[40] Natanson, I.P. (1955) : Konstruktive Funktionentheorie (translated
from Russian), Akademie-Verlag, Berlin.
[41] Neumann, J. Von. (1927) : Mathematische Begründung der
Quantenmechanik, Nachr. Ges. Wiss. Göttingen, Math.-Phys. Kl., 1–37.
[42] Ortega, J.M. and Rheinboldt, W.C. (1970) : Iterative Solution of
Nonlinear Equations in Several Variables, Academic Press, New York.
[43] Ralston, A. (1965) : A First Course in Numerical Analysis, McGraw-
Hill Book Company, New York.
[44] Rall, L.B. (1962) : Computational Solution of Nonlinear Operator
Equations, John Wiley & Sons, New York.
[45] Reddy, J.N. (1986) : Applied Functional Analysis and Variational
Methods in Engineering, McGraw-Hill Book Company, New York.
[46] Resnikoff, H.L. and Wells, R.O. Jr. (1998) : Wavelet Analysis, Springer,
New York.
[47] Royden, H.L. (1988) : Real Analysis, Macmillan, New York.
[48] Rudin, W. (1976) : Principles of Mathematical Analysis, 3rd ed.,
McGraw-Hill Book Company, New York.

[49] Schauder, J. (1930) : Über lineare, vollstetige Funktionaloperationen,
Studia Math. 2, 1–6.
[50] Schmidt, E. (1908) : Über die Auflösung Linearer Gleichungen mit
unendlich vielen Unbekannten, Rend. Circ. Mat. Palermo 25, 53–
77.
[51] Schwartz, A.S. (1996) : Topology for Physicists, Springer-Verlag,
Berlin.

[52] Schwartz, L (1951) : Théorie des Distributions, vols. I and II.


Hermann & Cie, Paris.
[53] Simmons, G.F. (1963) : Introduction to Topology and Modern
Analysis, McGraw-Hill Book Company, Tokyo.

[54] Strogatz, S.H. (2007) : Nonlinear Dynamics and Chaos, Levant


Books, Kolkata.
[55] Taylor, A.E. (1958) : Introduction to Functional Analysis, John Wiley
& Sons, New York.
[56] Zeidler, E. (1995) : Applied Functional Analysis: Main Principles
and their Applications, Springer-Verlag, Berlin.
Index
A linear functional 147

absolutely continuous function 368 C


absorbing set 190
acute 100 canonical or natural embedding 212
almost everywhere (a.e.) 363 cardinal numbers 2
approximate eigenspectrum 291 cartesian product 4
approximate eigenvalue 291 category
approximate solutions 311 first 33
asymptotically stable 450 second 33
Cauchy sequence 28
B Cauchy-Bunyakovsky-Schwartz
inequality 18
ball characteristic function of a set E 363
closed 25 characteristic function 197
open 25 characteristic value 173
Banach and Steinhaus theorem 155 Chebyshev approximation 406
base at a point 49 closed complement 272
base of a topology 49 closed graph theorem 268
Bases for a topology 48 closed orthonormal system 123
basis closure 25, 50
Hamel 8 -closure 50
Schauder 176 compact support 424
Bernstein Polynomial 234 compact
Bessel inequality 122 metric space 54
best approximation 261 compactness 51
bifurcation point 452 comparable norm 276
bilinear hermitian form 327 complete orthonormal system 122
biorthogonal sequence 176 completeness of
Bolzano-Weierstrass theorem 55 completeness of
boundary 51 lp 31
bounded below 4 + n
29
bounded convergence theorem 367 4 n
29
bounded inverse theorem 276 completeness 27
bounded linear operators 150 conjugate (dual)
bounded variation 195, 369 C([a, b]) 228
bounded continuity 51


in a metric space 52 equicontinuous 56


of a linear operator equivalent norms 80
mapping Ex into Ey 137 essential supremum 372
on topological spaces 52 essentially bounded 372
continuous Fourier transform 433 everywhere dense 47
contraction mapping principle 36 exterior 51
convergence in norm 61
convergence 27 F
strong 244
weak∗ 240, 244 finite basis 9
weak 240 finite deficiency 387
covering 53 finite intersection property 53
fixed point 449
D Fourier coefficients 248
Fourier series 121
deficiency index of A 390 Fredholm alternative 301
definite integral 148 Fréchet derivative 417
dense set 32 Fubini’s theorem 369
everywhere 32 function 5
nowhere 32 bijective 6
diffeomorphism 447 domain 5
difference of two sets 3 range 5
n-dimensional manifold 447
direct sum 12, 107 G
discrete Fourier transform 433
distance between sets 24 G-differentiable 411
distance of a point from a set 24 general form of linear functionals
distribution 424 in a Hilbert space 204
dot product 148 generalized derivative 424
dual problem 403 generalized function 425
dual space of l1 207 genus of the Laurent series 435
dual gradient 413
algebraic 213 Gram-Schmidt orthonormalization
conjugate 213 process 113
topological 213 graph 267
dynamical system 443 greatest lower bound (g.l.b.) 4
Green’s formula for Sobolev spaces
E 426
Gâteaux derivative 410
eigenvalue 172
eigenvector 172 H
eigenspectrum 291
energy product 351 Haar wavelet matrices 437
equation Hahn-Banach separation theorem
homogeneous 172 190
linear integral 317 Hahn-Banach theorem
using sublinear functional 185 linear functionals 147


half-space 401 on s 201
Heine-Borel theorem 55 on Lp ([a, b]) 374
Hermite polynomials 119 on the n-dimensional Euclidean
hessian 413 4
space n 200
Hilbert space 91, 204 linear independence 112
homeomorphism 52, 447 linear operator
hyperplane 149, 400 left inverse of a 165
hyperspace 188 null space of a 165
Hölder’s inequality 16 right inverse of a 165
unbounded 217
I linearly dependent 9
linearly independent 9
induced metric 61 Lipschitz condition 39
inequality
inner product 92 M
integrable function 366
integral of mappings
Riemann-Stieltjes 195 interior 50, 51
interior 50, 51 onto (surjective) 5
intersection of two sets 3 one-to-one (injective) 5
invariant subspace 345 bijective 5
inverse Fourier transform 433 continuous 52
isometric mapping 35 homeomorphism 52
isometric spaces 35 projection 271
isomorphism 128 C r -,C 0 -maps 447
metric completion 35
K Minkowski functional 190
Minkowski’s inequality 19
kernel index 390 moment problem of Hausdorff 231
kernel multiresolution representation 432
degenerate 317
N
L
natural embedding 212
Lagrange’s interpolation polynomial neighbourhood of a point 25
159 normalized function of bounded
Laguerre polynomials 117 variation 229
Lebesgue integral 364 normally solvable 393
Lebesgue measurable function 360 norm 58
Lebesgue measurable set 358
Lebesgue measure 359 O
Lebesgue outer measure 357
Legendre polynomials 116 obtuse 100
Lindelöf’s theorem 49 τ -open covering 53
linear combination of vectors 9 open mapping 272
open mapping theorem 272 partition 195


operator 130 periodic points 449
adjoint 217, 221, 323 piecewise linear interpolations 317
bounded linear 150 plane of support 402
closed linear 267, 269 point
coercive 396 contact 50
compact linear 283 interior 25
completely continuous 283 isolated 50
differential 143, 269 limit 25, 50
domain of an 165 pointwise convergence 154
forward difference 233 polarization identity 97, 99
function 136 polynomials
identity 143 Hermite 117
integral 143 Laguerre 119
linear 130 Legendre 115
non-negative 334 primal problem 403
normal 329 product
norm 141 inner 92
positive 334 scalar 92
precompact 386 projection of finite rank 315
projection 109
resolvent 172, 348 Q
self-adjoint 326
shift 233 quadratic form 327
spectrum of a compact 290 quadrature formula 242
strictly singular 386 quotient spaces 85
stronger 334
smaller 334 R
inverse 166
unbounded linear 382 relatively compact 54
unitary 329 restriction of a mapping 222
zero 143 Riemann-Stieltjes integral 195
ordinary autonomous differential Riesz’s lemma 83
systems 445 Riemannian manifold 447
orthogonal complements 107, 383
in a Banach space 383 S
orthogonal projection theorem 102
scaling equation 439
orthogonal set 112
Schauder theorem 287
orthogonal 100
second G-derivative 413
orthonormal basis 124
self-conjugate space 210
orthonormal set 112
sequence space 7
P sequence
strongly convergent 284
parallelogram law 95 weakly convergent 284
partially ordered set 4 sequentially compact 54
set spectral radius 348


closed 25 spectrum of self-adjoint operators
countable 2 341
denumerable 2 spectrum 173
derived 50 continuous 347
diameter 24 discrete or point 347
empty (or void or null) 2 pure point 347
enumerable 2 sphere 25
finite 2 square roots of
inclusion 1 non-negative operators 337
open 25 stable points 449
power 2 stable 450
resolvent 291 strictly convex 256
uncountable 2 strictly normed 263
universal 3 stronger norm 276
simple function 363 subcovering 53
smooth function 424 sublinear functional 184
Sobolev space 425 subspace 8, 100
space closed 101
l2 7 summable sequence 68
lp 7 supporting hyperplane 149, 402
Lp [a, b] 7 support 424
n dimensional Euclidean 92
T
Banach 61
compact topological 53 test function 424
conjugate (or dual) 179, 206, the canonical embedding 212
221 the general form of linear functionals
discrete metric 16 on lp 202
Euclidean 7 the limit of a convergent sequence 28
Hilbert 91 the space of bounded linear
inner product 91 operators 150
linear 6 the space of operators 135
metric 13 theorem (Arzela-Ascoli) 56
normed linear 91 theorem (Banach and Steinhaus)
null 165 155
quasimetric 41 theorem (Eberlein) 255
reflexive 210 theorem (Helly) 253
separable 47, 51 theorem (Milman) 258
sequence 15, 17 theorem (Pythagorean) 102
unitary 7, 93 theorem (Schauder) 287
spaces theorem of density 426
complete 23 theorem of trace 426
isometric 35 theorem
non-metrizable 23 (Fubini and Tonelli) 369
topological 44 bounded convergence 367
bounded inverse 276 Weierstrass existence theorem 265


closed graph 268
dominated convergence 367 Z
Hahn-Banach theorem 179
Hahn-Banach (generalized) 186 Zermelo’s axiom of choice 4
monotone convergence 367 Zermelo’s theorem 4
open mapping 273 Zorn’s lemma 4
plane of support 402
two-norm 277
topology 45
discrete 46
indiscrete 46
lower limit 46
upper limit 46
usual 46
totally bounded 54
translation invariance 66
truncation
of a Fourier expansion 318

uniform boundedness principle 154,


155
uniform convexity 255
uniform operator convergence 154
uniformly bounded 56

(total) variation 195

wavelet analysis 430


wavelet function 440
wavelet matrix 435
wavelet space 436
weak convergence
lp 249
Hilbert spaces 249
weak lower semicontinuity 416
weakly bounded 420
weakly closed 420
weakly lower semicontinuous 420
Weierstrass approximation theorem
43, 233
