S. J. Taylor - Introduction To Measure and Integration-University Press (1973)





Professor of Mathematics at Westfield College,
University of London


Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, Sao Paulo, Delhi

Cambridge University Press

The Edinburgh Building, Cambridge CB2 8RU, UK

Published in the United States of America by Cambridge University Press, New York

Information on this title: www.cambridge.org/9780521098045

© Cambridge University Press 1966, 1973

This publication is in copyright. Subject to statutory exception

and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.

First published as Chs. 1-9 of Kingman and Taylor

Introduction to Measure and Probability 1966
Reprinted as Introduction to Measure and Integration 1973
Re-issued in this digitally printed version 2008

A catalogue record for this publication is available from the British Library

Library of Congress Catalogue Card Number: 73-84325

ISBN 978-0-521-09804-5 paperback


Preface

1 Theory of sets
1.1 Sets
1.2 Mappings
1.3 Cardinal numbers
1.4 Operations on subsets
1.5 Classes of subsets
1.6 Axiom of choice

2 Point set topology
2.1 Metric space
2.2 Completeness and compactness
2.3 Functions
2.4 Cartesian products
2.5 Further types of subset
2.6 Normed linear space
2.7 Cantor set

3 Set functions
3.1 Types of set function
3.2 Hahn-Jordan decompositions
3.3 Additive set functions on a ring
3.4 Length, area and volume of elementary figures

4 Construction and properties of measures
4.1 Extension theorem; Lebesgue measure
4.2 Complete measures
4.3 Approximation theorems
4.4* Geometrical properties of Lebesgue measure
4.5 Lebesgue-Stieltjes measure

5 Definitions and properties of the integral
5.1 What is an integral?
5.2 Simple functions; measurable functions
5.3 Definition of the integral
5.4 Properties of the integral
5.5 Lebesgue integral; Lebesgue-Stieltjes integral
5.6* Conditions for integrability

6 Related spaces and measures
6.1 Classes of subsets in a product space
6.2 Product measures
6.3 Fubini's theorem
6.4 Radon-Nikodym theorem
6.5 Mappings of measure spaces
6.6* Measure in function space
6.7 Applications

7 The space of measurable functions
7.1 Point-wise convergence
7.2 Convergence in measure
7.3 Convergence in pth mean
7.4 Inequalities
7.5* Measure preserving transformations from a space to itself

8 Linear functionals
8.1 Dependence of $\mathcal{L}_2$ on the underlying $(\Omega, \mathcal{F}, \mu)$
8.2 Orthogonal systems of functions
8.3 Riesz-Fischer theorem
8.4* Space of linear functionals
8.5* The space conjugate to $\mathcal{L}_p$
8.6* Mean ergodic theorem

9 Structure of measures in special spaces
9.1 Differentiating a monotone function
9.2 Differentiating the indefinite integral
9.3 Point-wise differentiation of measures
9.4* The Daniell integral
9.5* Representation of linear functionals
9.6* Haar measure

Index of notation

General Index


There are many ways of developing the theory of measure and inte-
gration. In the present book measure is studied first as the primary
concept and the integral is obtained later by extending its definition
from the special case of `simple' functions using monotone limits. The
theory is presented for general measure spaces though at each stage
Lebesgue measure and the Lebesgue integral in $R^n$ are considered as
the most important example, and the detailed properties are estab-
lished for the Lebesgue case.
The book is designed for use either in the final undergraduate year
at British universities or as a basic text in measure theory at the post-
graduate level. Though the subject is developed as a branch of pure
mathematics, it is presented in such a way that it has immediate
application to any branch of applied mathematics which requires the
basic theory of measure and integration as a foundation for its
mathematical apparatus. In particular, our development of the
subject is a suitable basis for modern probability theory - in fact this
book first appeared as the initial section of the book Introduction to
measure and probability (Cambridge University Press, 1966) written
jointly with J. F. C. Kingman.
The book is largely self-contained. The first two chapters contain
the essential parts of set theory and point set topology; these could
well be omitted by a reader already familiar with these subjects.
Chapters 3 and 4 develop the theory of measure by the usual process
of extension from `simple sets' to those of a larger class, and the
properties of Lebesgue measure are obtained. The integral is defined
in Chapter 5, again by extending its definition stage by stage, using
monotone sequences. Chapter 6 includes a discussion of product
measures and a definition of measure in function space. Convergence
in function space is considered in Chapter 7, and Chapter 8 includes
a treatment of complete orthonormal sets in Hilbert space. Chapter 9
deals with special spaces; differentiation theory for real functions of
a real variable is developed and related to Lebesgue measure theory,
and the Haar measure on a locally compact group is defined.
Starred sections contain more advanced material and can be
omitted at a first reading.
It will be clear to any reader familiar with the standard treatises
that this book owes much to what has gone before. I do not claim any
particular originality for the treatment, but the form of presentation
owes much to my experience of teaching this subject - at Birmingham
University, Cornell University and the University of London - and I
readily acknowledge the stimulus received from this source. I am
grateful to Dr B. Fishel and Professor G. E. H. Reuter who made
helpful criticisms of an early draft, and to a great number of students
and colleagues who pointed out misprints and errors in the first
edition. However my main debt of gratitude is to Professor J. F. C.
Kingman who was co-author of the first edition of this book, and who
was much involved in every detail of it.
December 1972

1.1 Sets
We do not want to become involved in the logical foundations of
mathematics. In order to avoid these we will adopt a rather naive
attitude to set theory. This will not lead us into difficulties because in
any given situation we will be considering sets which are all contained
in (are subsets of) a fixed set or space or suitable collections of such sets.
The logical difficulties which can arise in set theory only appear when
one considers sets which are `too big', like the set of all sets, for
instance. We assume the basic algebraic properties of the positive
integers, the real numbers, and Euclidean spaces and make no attempt
to obtain these from more primitive set theoretic notions. However,
we will give an outline development (in Chapter 2) of the topological
properties of these sets.
In a space X a set E is well defined if there is a rule which determines,
for each element (or point) x in X, whether or not it is in E. We write
$x \in E$ (read `x belongs to E') whenever x is an element of E, and the
negation of this statement is written $x \notin E$. Given two sets E, F we
say that E is contained in F, or E is a subset of F, or F contains E,
and write $E \subset F$ if every element x in E also belongs to F. If $E \subset F$
and there is at least one element in F but not in E, we say that E is a
proper subset of F.
Two sets E, F are equal if and only if they contain the same elements;
i.e. if and only if $E \subset F$ and $F \subset E$. In this case we write
E = F. This means that if we want to prove that E = F we must prove
both $x \in E \Rightarrow x \in F$ and $x \in F \Rightarrow x \in E$ (the symbol $\Rightarrow$ should be read
`implies').
Since a set is determined by its elements, one of the commonest
methods of describing a set is by means of a defining sentence: thus
E is the set of all elements (of X) which have the property P (usually
delineated). The notation of `braces' is often used in this situation
E = {x: x has property P}
but when we use this notation we will always assume that only
elements x in some fixed set X are being considered, as otherwise
logical paradoxes can arise. When a set has only a finite number of
elements we can write them down between braces E = {x, y, z, a, b}.
In particular {x} stands for the set containing the single element x.
One must always distinguish between the element x and the set {x};
for example, the empty set $\emptyset$ defined below is not the same as the class
$\{\emptyset\}$ containing the empty set.

Empty set (or null set)

The set which contains no elements is called the empty set and will
be denoted by $\emptyset$. Clearly
$$\emptyset = \{x : x \neq x\}, \quad \text{and } \emptyset \subset E \text{ for all sets } E.$$
In fact since $\emptyset$ contains no element, any statement made about the
elements of $\emptyset$ is true (as well as its negation).
There are some sets which will be considered very frequently, and
we consistently use the following notation:
Z, for the set of positive integers,
Q, for the set of rationals,
R = $R^1$, for the set of all real numbers,
C, for the set of complex numbers,
$R^n$, for Euclidean n-dimensional space, i.e. the set of ordered n-tuples
$(x_1, x_2, \ldots, x_n)$ where all the $x_i$ are in R.
We assume that the reader is familiar with the algebraic and order
properties of these sets. In particular we will use the fact that Z
is well ordered, that is, that every non-empty set of positive integers
has a least member: this is equivalent to the principle of mathematical
induction.
We frequently have to consider sets of sets, and occasionally sets
of sets of sets. It is convenient to talk of classes of sets and collections
of classes to distinguish these types of set, and we will use italic
capitals A, B, ... for sets, script capitals $\mathcal{A}, \mathcal{B}, \mathcal{C}, \ldots$ for classes and
Greek capitals $\Delta, \Gamma, \ldots$ for collections. Thus $C \in \mathcal{C}$ is read `the set C
belongs to the class $\mathcal{C}$'; and $\mathcal{A} \subset \mathcal{B}$ means that every set in the class $\mathcal{A}$
is also in the class $\mathcal{B}$.

Cartesian product
Given two sets E, F we define the Cartesian (or direct) product $E \times F$
to be the set of all ordered pairs (x, y) whose first element $x \in E$ and
whose second element $y \in F$. This clearly extends immediately to the
product $E_1 \times E_2 \times \cdots \times E_n$ of any finite number of sets. In particular
it is immediate that $R^n$, Euclidean n-space, is the Cartesian product
of n copies of R. For an infinite indexed class $\{E_i, i \in I\}$ of sets, the
product $\prod_{i \in I} E_i$ is the set of elements of the form $\{a_i, i \in I\}$ with $a_i \in E_i$
for each $i \in I$.
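
As a small illustration of the definition, the following Python sketch forms the Cartesian product of two finite sets and of n copies of one set. The sets E, F and the helper name `power` are illustrative assumptions and not notation from the text; `itertools.product` is the standard-library routine that generates ordered tuples.

```python
from itertools import product

# Two small finite sets chosen only for illustration.
E = {1, 2}
F = {"a", "b", "c"}

# E x F: the set of all ordered pairs (x, y) with x in E and y in F.
E_times_F = set(product(E, F))


def power(S, n):
    """Cartesian product of n copies of S (a finite analogue of R^n)."""
    return set(product(S, repeat=n))


cube_corners = power({0, 1}, 3)  # the 8 ordered triples of 0s and 1s
```

For finite sets the number of ordered pairs is the product of the two cardinals, which is the origin of the name.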

Exercises 1.1
1. Describe in words the following sets:
(i) $\{t \in R : 0 \le t \le 1\}$;
(ii) $\{(x, y) \in R^2 : x^2 + y^2 \le 1\}$;
(iii) $\{k \in Z : k = n^2 \text{ for some } n \in Z\}$;
(iv) $\{k \in Z : n \mid k \Rightarrow n = 1 \text{ or } k\}$;
(vi) $\{B : B \subset E\}$.
2. Show that the relation $\subset$ is reflexive and transitive, but not in general
symmetric.
3. The sets $X \times (Y \times Z)$ and $(X \times Y) \times Z$ are different but there is a
natural correspondence between them.
4. Suppose x is an element of X and A = {x}. Which of the following
statements are correct: $x \in A$, $x \in X$, $x \subset A$, $x \subset X$, $A \in X$, $A \subset X$, $A \in x$?
5. Suppose P(a) and Q(a) are two propositions about the element a such
that $P(a) \Rightarrow Q(a)$. Show that $\{a : P(a)\} \subset \{a : Q(a)\}$.

1.2 Mappings
Suppose A and B are any two sets: a function from A to B is a
rule which, for each element in A, determines a unique element in B.
We talk of the function f and use the notation $f : A \to B$ to denote a
function f defined on A and taking values in B. For any $x \in A$, $f(x)$
means the value of the function f at the point x and is therefore an
element of the set B: we therefore avoid the terminology (common
in older text books) `the function $f(x)$'. The words mapping and
transformation are often used as synonyms for function.
For a given function $f : A \to B$, we call A the domain of f and the
subset of B consisting of the set of values $f(x)$ for x in A is called the
range of f and may be denoted $f(A)$. When $f(A) = B$ we say that f
is a function from A onto B. Given a function $f : A \to B$, by definition
$f(x)$ is a uniquely determined element of B for each $x \in A$; if in addition
for each y in $f(A)$ there is a unique $x \in A$ (we know there is at least
one) with $y = f(x)$ we say that the function f is (1, 1). Another shorter
way of saying this is that $f : A \to B$ is (1, 1) if and only if for $x_1, x_2 \in A$,
$$x_1 \neq x_2 \Rightarrow f(x_1) \neq f(x_2).$$
Given $f : A \to B$ there is an associated $f : \mathcal{A} \to \mathcal{B}$, where $\mathcal{A}$ is the class
of all subsets of A and $\mathcal{B}$ is the class of all subsets of B, defined by
$$f(E) = \{y \in B : \exists\, x \in E \text{ with } y = f(x)\}$$
for each $E \subset A$ (the symbol $\exists$ should be read `there exists': i.e. the
set described by $\{x \in E : y = f(x)\}$ is not empty). There is also a function
$f^{-1} : \mathcal{B} \to \mathcal{A}$ defined by
$$f^{-1}(F) = \{x \in A : f(x) \in F\}$$
for each $F \subset B$. The set $f^{-1}(F)$ is called the inverse image of F under f.
Note that if $y \in B - f(A)$, then the inverse image $f^{-1}(\{y\})$ of the one
point set {y} is the empty set. If $f : A \to B$ is (1, 1) and $y \in f(A)$, then
it is clear that $f^{-1}(\{y\})$ is a one point subset of A, so that in this case
(only) we can think of $f^{-1}$ as a function from $f(A)$ to A. In particular,
if $f : A \to B$ is (1, 1) and onto there is a function $f^{-1} : B \to A$ called the
inverse function of f such that $f^{-1}(y) = x$ if and only if $y = f(x)$.
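
For finite sets the image and inverse image can be computed directly from the definitions. The sketch below is only an illustration: the helper names `image` and `inverse_image`, the sets A, B and the squaring map are assumptions made for the example.

```python
def image(f, E):
    """f(E) = {y : there exists x in E with y = f(x)}."""
    return {f(x) for x in E}


def inverse_image(f, A, F):
    """f^{-1}(F) = {x in A : f(x) in F}; the domain A is passed explicitly."""
    return {x for x in A if f(x) in F}


# A map f : A -> B that is neither (1, 1) nor onto.
A = {-2, -1, 0, 1, 2}
B = {0, 1, 2, 3, 4}


def f(x):
    return x * x


range_f = image(f, A)                      # the range f(A), a proper subset of B
empty_preimage = inverse_image(f, A, {3})  # 3 lies in B - f(A)
pre_of_one = inverse_image(f, A, {1})      # two points, because f is not (1, 1)
```

Because this f is not (1, 1), the inverse image of a one point set can contain more than one point, so here $f^{-1}$ cannot be regarded as a function from $f(A)$ to A.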
Now suppose $f : A_1 \to B$, $g : A_2 \to B$ are functions such that $A_1 \supset A_2$
and $f(x) = g(x)$ for all x in $A_2$: under these conditions we say that f
is an extension of g (from $A_2$ to $A_1$) and g is the restriction of f (to $A_2$).
For example, if
$$g(x) = \cos x \quad (x \in R);$$
$$f(x + iy) = \cos x \cosh y - i \sin x \sinh y \quad (x + iy \in C);$$
then $f : C \to C$ is an extension of $g : R \to C$ from R to C, and the usual
convention of designating both f and g by `cos' obscures the differences
in their domains.
If we have two functions $f : A \to B$, $g : B \to C$ the result of applying
the rule for g to the element $f(x)$ defines an element in C for all $x \in A$.
Thus we have defined a function $h : A \to C$ which is called the composition
of f and g and denoted $g \circ f$ or $g(f)$. Thus, for $x \in A$
$$h(x) = (g \circ f)(x) = g(f(x)) \in C.$$
Note that, if $f : A \to B$ is (1, 1) and onto we could define the inverse
function $f^{-1} : B \to A$ as the unique function from B to A such that
$$(f \circ f^{-1})(y) = y \quad \text{for all } y \in B,$$
$$(f^{-1} \circ f)(x) = x \quad \text{for all } x \in A.$$

Given any set X a finite sequence of n points of X is a function from
$\{1, 2, \ldots, n\}$ to X. This is usually denoted by $x_1, x_2, \ldots, x_n$ where
$x_i \in X$ is the value of the function at the integer i. Similarly, an infinite
sequence in X is a function from Z to X (where Z is the set of positive
integers). This is denoted $x_1, x_2, \ldots$, or $\{x_i\}$ $(i = 1, 2, \ldots)$, or just $\{x_i\}$
where $x_i$ is the value of the function at i, and is called the ith element
of the sequence. Given a sequence $\{n_i\}$ of positive integers (that is, a
function $f : Z \to Z$ where $f(i) = n_i$) such that $n_i > n_j$ for $i > j$, and a
sequence $\{x_i\}$ of elements of X (a function $g : Z \to X$) it is clear that the
composite function $g \circ f : Z \to X$ is again a sequence. Such a sequence
is called a subsequence of $\{x_i\}$ and is denoted $\{x_{n_i}\}$ $(i = 1, 2, \ldots)$. Thus
$\{x_{n_i}\}$ is a subsequence of $\{x_i\}$ if $n_i \in Z$ for all $i \in Z$, and $i > j \Rightarrow n_i > n_j$.
We can think of a sequence as a point in the product space $\prod_{i=1}^{\infty} X_i$
where $X_i = X$ for all i. More generally a point in the product space
$\prod_{i \in I} X_i$ with $X_i = X$ for $i \in I$ can be identified as a function $f : I \to X$.
Exercises 1.2
1. Suppose $f : R \to R$ is defined by $f(x) = \sin x$. Describe each of the
following sets:
$$f^{-1}\{0\}, \quad f^{-1}\{1\}, \quad f^{-1}\{2\}, \quad f^{-1}\{y : 0 \le y < \ldots\}.$$
2. Suppose $f : A \to B$ is any function. Prove
(i) $E \subset f^{-1}(f(E))$, for each $E \subset A$;
(ii) $F \supset f(f^{-1}(F))$, for each $F \subset B$;
and give examples in which there is not equality in (i), (ii).
3. Suppose $f : A \to B$, $g : B \to C$ are functions and $h = g \circ f$: show that
$h^{-1}(E) = f^{-1}[g^{-1}(E)]$ for each $E \subset C$.
4. If $A \subset B \subset C$, $f : A \to X$, $g : B \to X$, $h : C \to X$ are such that h is an
extension of g and g is an extension of f, prove that f is the restriction of h
to A.
5. Show that the restriction of a (1,1) mapping is (1, 1).
6. Suppose m, n E Z, A is a set with m distinct elements and B is a set
with n distinct elements. How many distinct functions are there from A
to B?

1.3 Cardinal numbers

If there is a mapping $f : A \to B$ which is (1, 1) and onto, then it is
reasonable to say that there are the same number of elements in A
as there are in B. In fact, for finite sets, the elementary process of
counting sets up such a mapping from the set being counted to the
integers {1, 2, ..., n}, and from experience we know that if the same
finite set of objects is counted in different ways we always end up with
the same integer n. (This fact can also be deduced from primitive
axioms about the integers.) We say that the set A is equivalent to the
set B, and write $A \sim B$ if there is a mapping $f : A \to B$ which is (1, 1)
and onto. It is clear that $\sim$ is an equivalence relation between sets
in the sense that it is reflexive, symmetric and transitive, and we can
therefore form equivalence classes of sets with respect to this relation.
Such an equivalence class of sets is called a cardinal number, but by
noting that the equivalence class is determined by any one of its mem-
bers, we see that the easiest way to specify a cardinal number is to
specify a representative set. Thus any set which can be mapped (1, 1)
onto the representative set will have the same cardinal. As is usual
we shall use the following notation:
the cardinal of the empty set $\emptyset$ is 0;
the cardinal of the set of integers $\{1, 2, \ldots, n\}$ is n;
the cardinal of the set Z of positive integers is $\aleph_0$;
the cardinal of the set R of real numbers is c.
Since Z is ordered we can clearly order the cardinals of finite sets
by saying that A has a smaller cardinal than B if A is equivalent to a
proper subset of B. This definition does not work for infinite sets as
the mappings
$$n \to 2n \quad \text{or} \quad n \to n^2$$
map Z onto a proper subset of Z and are (1, 1). Instead we say that the
cardinal of a set A is less than the cardinal of the set B if there is a
subset $B_1 \subset B$ such that $A \sim B_1$ but no subset $A_1 \subset A$ such that
$A_1 \sim B$.
From this definition of ordering we consider the following statements,
where m, n, p denote cardinals:
(i) $m < n$, $n < p \Rightarrow m < p$;
(ii) at most one of the relations m < n, m = n, n < m holds so
that $m \le n$, $n \le m \Rightarrow m = n$;
(iii) at least one of the relations m < n, m = n, n < m holds.
Now (i) follows easily from the definition, for let M, N, P be sets with
cardinals m, n, p and suppose $N_1 \subset N$, $P_1 \subset P$ with $M \sim N_1$, $N \sim P_1$.
The mapping $f : N \to P_1$ when restricted to $N_1$ gives an equivalence
$N_1 \sim P_2 \subset P_1$ so that $M \sim P_2 \subset P$. Further if $P \sim M_1 \subset M$ the mapping
$g : M \to N_1$ when restricted to $M_1$ shows $P \sim M_1 \sim N_2 \subset N$ which
contradicts n < p. (ii) can also be deduced from the definition (see
exercise 1.3 (5)), though this requires quite a complicated argument:
(ii) is known as the Schröder-Bernstein theorem. However, the truth
of (iii), that all cardinals are comparable, cannot be proved without
the use of an additional axiom (known as the axiom of choice) which
we will discuss briefly in § 1.6. If we assume the axiom of choice or
something equivalent, then (iii) is also true.
A set of cardinal $\aleph_0$ is said to be enumerable. Thus such a set
$A \sim Z$ so that the elements of A can be `enumerated' as a sequence
$a_1, a_2, \ldots$ in which each element of A occurs once and only once. A set
which has a cardinal $m \le \aleph_0$ is said to be countable. Thus E is countable
if there is a subset $A \subset Z$ such that $E \sim A$, and a set is countable if it
is either finite or enumerable.
Given any infinite set B we can choose, by induction, a sequence
$\{b_i\}$ of distinct elements in B and if $B_1$ is the set of elements in $\{b_i\}$
the cardinal of $B_1$ is $\aleph_0$. Hence if m is an infinite cardinal we always
have $m \ge \aleph_0$. By using the equivalence
$$b_i \leftrightarrow b_{2i}$$
between $B_1$ and the proper subset $B_2 \subset B_1$ where $B_2$ contains the even
elements of $\{b_i\}$ and the identity mapping
$$b \leftrightarrow b \quad \text{for } b \in B - B_1$$
we have an equivalence between $B = B_1 \cup (B - B_1)$ and $B_2 \cup (B - B_1)$,
a proper subset of B. This shows that any infinite set B contains a
proper subset of the same cardinal.
In order to see that some infinite sets have cardinal $> \aleph_0$ it is
sufficient to recall that the set $\{x \in R : 0 < x < 1\}$ cannot be arranged
as a sequence.† Now $\frac{1}{\pi} \tan^{-1} x + \frac{1}{2} = f(x)$, $x \in R$ defines a mapping
$f : R \to (0, 1)$ which is (1, 1) and onto so that R has the same cardinal as
the interval (0, 1) and we have $c > \aleph_0$. It is worth remarking that a
famous unsolved problem of mathematics concerns the existence or
otherwise of cardinals m such that $c > m > \aleph_0$. The axiom that no
such exist, that is that $m > \aleph_0 \Rightarrow m \ge c$, is known as the continuum
hypothesis.
The fact that there are infinitely many different infinite cardinals
follows from the next theorem, which compares the cardinal of a set
E with the cardinal of the class of subsets of E.
Theorem 1.1. For any set E, the class $\mathcal{C} = \mathcal{C}(E)$ of all subsets of E
has a cardinal greater than that of E.
Proof. For sets E of finite cardinal n, one can prove directly that
the cardinal of $\mathcal{C}(E)$ is $2^n$, and an induction argument easily yields
$n < 2^n$ for $n \in Z$. However, the case of finite sets E is included in the
general proof, so there is nothing gained by this special argument.
† See, for example, J. C. Burkill, A First Course in Mathematical Analysis
(Cambridge, 1962).
Suppose $\mathcal{D}$ is the class of one point sets {x} with $x \in E$. Then
$\mathcal{D} \subset \mathcal{C}$ and $E \sim \mathcal{D}$ because of the mapping $x \mapsto \{x\}$. Therefore it is
sufficient to prove, by (ii) above, that $\mathcal{C}$ is equivalent to no subset
$E_1 \subset E$. Suppose then that $\phi : \mathcal{C} \to E_1$ is (1, 1) and onto and let
$\chi : E_1 \to \mathcal{C}$ denote the inverse function. Let A be the subset of $E_1$
defined by
$$A = \{x \in E_1 : x \notin \chi(x)\}.$$
Then $A \in \mathcal{C}$ so that $\phi(A) = x_0 \in E_1$. Now if $x_0 \in A$, $\chi(x_0) = A$ does not
contain $x_0$ which is impossible, while if $x_0 \notin A$, then $x_0$ is not in $\chi(x_0)$
so that $x_0 \in A$. In either case we have a contradiction.
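
The diagonal construction in the proof can be checked exhaustively for a small finite set. The sketch below, whose helper names `subsets` and `diagonal_set` are hypothetical, runs through every map from a three-element set E into its class of subsets and confirms that the set A of the proof is never a value of the map. It illustrates the argument and is no substitute for it.

```python
from itertools import chain, combinations, product


def subsets(E):
    """All subsets of the finite set E, as frozensets."""
    items = sorted(E)
    return [
        frozenset(c)
        for c in chain.from_iterable(
            combinations(items, r) for r in range(len(items) + 1)
        )
    ]


def diagonal_set(chi, E1):
    """A = {x in E1 : x not in chi(x)} for a map chi from E1 into the subsets."""
    return frozenset(x for x in E1 if x not in chi[x])


E = {0, 1, 2}
all_subsets = subsets(E)

# Try every map chi : E -> class of subsets of E (8**3 = 512 maps in all).
never_hit = True
for values in product(all_subsets, repeat=len(E)):
    chi = dict(zip(sorted(E), values))
    if diagonal_set(chi, E) in chi.values():
        never_hit = False
```

Since no map from E into its class of subsets takes the value A, no such map can be onto, which is the finite shadow of the contradiction obtained above.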
It is possible to build up systematically an arithmetic of cardinals.
This will only be needed for finite cardinals and $\aleph_0$ in this book, so
we restrict the results to these cases and discuss them in the next
section.

Exercises 1.3
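1. Show that $(0, 1] \sim (0, 1)$ by considering $f : (0, 1] \to (0, 1)$ defined by
$$f(x) = \begin{cases} \frac{3}{2} - x, & \text{for } \frac{1}{2} < x \le 1; \\ \frac{3}{4} - x, & \text{for } \frac{1}{4} < x \le \frac{1}{2}; \\ \frac{3}{8} - x, & \text{for } \frac{1}{8} < x \le \frac{1}{4}; \\ \quad \vdots \\ \frac{3}{2^n} - x, & \text{for } \frac{1}{2^n} < x \le \frac{1}{2^{n-1}}. \end{cases}$$
Deduce that all intervals (a, b), (a, b], [a, b] or [a, b) with a < b have the same
cardinal c.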
2. Every function $f : [a, b] \to R$ which is monotonic, i.e.
$$a \le x_1 < x_2 \le b \Rightarrow f(x_1) \le f(x_2),$$
is discontinuous at the points of a countable subset of [a, b].
Hint. Consider the sets of points x where the size of the discontinuity
$d(x) = f(x+0) - f(x-0)$ satisfies $1/(n+1) < d(x) < 1/n$ and prove this is
finite for all n in Z.
3. Show that $R^2 \sim R$.

defines a (1, 1) mapping between pairs of decimal expansions and single

expansions of numbers in (0,1). Modify this mapping to eliminate the
difficulty caused by the fact that decimal expansions are not quite unique.
4. Prove that a finite set E of cardinal m has $2^m$ distinct subsets.
5. Suppose $A_1 \subset A$, $B_1 \subset B$, $A_1 \sim B$ and $A \sim B_1$. Construct a mapping
to show that $A \sim B$.
Hint. Suppose $f : A \to B_1$, $g : B \to A_1$ are (1, 1) and onto. Say x (in either
A or B) is an ancestor of y if and only if y can be obtained from x by successive
applications of f and g. Decompose A into 3 sets $A_o$, $A_e$, $A_i$ according
as to whether the element x has an odd, even or infinite number of ancestors
and decompose B similarly. Consider the mapping which agrees with f
on $A_e$ and $A_i$, and with $g^{-1}$ on $A_o$.

1.4 Operations on subsets

For two sets A, B we define the union of A and B (denoted $A \cup B$)
to be the set of elements in either A or B or both. The intersection
of A and B (denoted $A \cap B$) is the set of elements in both A and B.

Fig. 1

If $A \subset X$, the complement of A with respect to X (denoted $X - A$)
is the set of those elements in X which are not in A. We also use
$A - B$ to denote the set of elements in A which are not in B for
arbitrary sets A, B. For any two sets A, B the symmetric difference
(denoted $A \triangle B$) is $(A - B) \cup (B - A)$, that is the set of elements which
are in one of A, B but not in both. Note that $A \triangle B = B \triangle A$.
These finite operations on sets are best illustrated by means of a
Venn diagram. In this some figure (like a rectangle) denotes the whole
space X and suitable geometrical figures inside denote the subsets
A, B, etc. It is well known that drawing does not prove a theorem, but
the reader is advised to illustrate the results of the next paragraph
by means of suitable Venn diagrams (see Figure 1).
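The operations $\cup$, $\cap$, $\triangle$ satisfy algebraic laws, some of which are
listed below. We assume the reader is familiar with these, so proofs
are omitted.
(i) $A \cup B = B \cup A$, $A \cap B = B \cap A$;
(ii) $(A \cup B) \cup C = A \cup (B \cup C)$, $(A \cap B) \cap C = A \cap (B \cap C)$;
(iii) $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$,
$A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$;
(iv) $A \cup \emptyset = A$, $A \cap \emptyset = \emptyset$;
(v) if $A \subset X$, then $A \cup X = X$, $A \cap X = A$;
(vi) if $A \subset X$, $B \subset X$, then $X - (A \cup B) = (X - A) \cap (X - B)$,
$X - (A \cap B) = (X - A) \cup (X - B)$;
(vii) $A \cup B = (A \triangle B) \triangle (A \cap B)$, $A - B = A \triangle (A \cap B)$.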
A similarity between the laws satisfied by $\cap$, $\cup$ and the usual algebraic
laws for multiplication and addition can be observed (in fact the older
notation for these operations is product and sum) but the differences
should also be noted: in particular the distributive laws, (iii) above,
are different in the algebra of sets. (vi) above will be generalized and
proved as a lemma; it is known as de Morgan's law.
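
Like a Venn diagram, a numerical check proves nothing, but it can make the laws concrete. The sketch below tests laws (iii), (vi) and (vii) on one arbitrary choice of finite sets, using Python's built-in set operators `|` (union), `&` (intersection), `-` (difference) and `^` (symmetric difference). The particular sets X, A, B, C are assumptions made for the example.

```python
# A small space X and three subsets, chosen arbitrarily.
X = set(range(10))
A = {0, 1, 2, 3}
B = {2, 3, 4, 5}
C = {3, 5, 7, 9}

# (iii) both distributive laws hold for sets.
distributive_1 = (A & (B | C)) == ((A & B) | (A & C))
distributive_2 = (A | (B & C)) == ((A | B) & (A | C))

# (vi) de Morgan's law for complements with respect to X.
de_morgan_1 = (X - (A | B)) == ((X - A) & (X - B))
de_morgan_2 = (X - (A & B)) == ((X - A) | (X - B))

# (vii) union and difference expressed through symmetric difference.
union_via_sym = (A | B) == ((A ^ B) ^ (A & B))
diff_via_sym = (A - B) == (A ^ (A & B))
```

The second distributive law has no counterpart for numbers: $a + bc = (a + b)(a + c)$ fails in general, which is one of the differences mentioned above.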
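Given a class $\mathcal{C}$ of subsets A, the union $\bigcup \{A ; A \in \mathcal{C}\}$ is the set of
elements which are in at least one set A belonging to $\mathcal{C}$ and the intersection
$\bigcap \{A ; A \in \mathcal{C}\}$ is the set of elements which are in every set A
of $\mathcal{C}$. If the class $\mathcal{C}$ is indexed so that $\mathcal{C}$ consists precisely of the sets
$A_\alpha$ $(\alpha \in I)$, then we use the notations $\bigcup_{\alpha \in I} A_\alpha$, $\bigcap_{\alpha \in I} A_\alpha$ for the union
and intersection of the class. In particular when $\mathcal{C}$ is finite or enumerable
it is usual to assume that it is indexed by $\{1, 2, \ldots, n\}$ or Z respectively
and the notation is
$$\bigcup_{i=1}^{n} A_i, \quad \bigcap_{i=1}^{n} A_i, \quad \bigcup_{i=1}^{\infty} A_i, \quad \bigcap_{i=1}^{\infty} A_i.$$
When the class $\mathcal{C}$ is empty, that is $I = \emptyset$, we adopt the conventions
$$\bigcup_{\alpha \in \emptyset} E_\alpha = \emptyset, \quad \bigcap_{\alpha \in \emptyset} E_\alpha = X, \text{ the whole space.}$$
This ensures that certain identities are valid without restriction on I.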
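Lemma. Suppose $E_\alpha$, $\alpha \in I$ is a class of subsets of X, and $E_\beta$ is one set
of the class, then
(i) $\bigcap_{\alpha \in I} E_\alpha \subset E_\beta \subset \bigcup_{\alpha \in I} E_\alpha$;
(ii) $X - \bigcup_{\alpha \in I} E_\alpha = \bigcap_{\alpha \in I} (X - E_\alpha)$;
(iii) $X - \bigcap_{\alpha \in I} E_\alpha = \bigcup_{\alpha \in I} (X - E_\alpha)$.
Proof. (i) This is immediate from the definition.
(ii) Suppose $x \in X - \bigcup E_\alpha$, then $x \in X$ and x is not in $\bigcup E_\alpha$, that
is x is not in any $E_\alpha$, $\alpha \in I$ so that $x \in X - E_\alpha$ for every $\alpha$ in I, and
$x \in \bigcap (X - E_\alpha)$. Conversely if $x \in \bigcap (X - E_\alpha)$, then for every $\alpha \in I$,
x is in X but not in $E_\alpha$, so $x \in X$ but x is not in $\bigcup E_\alpha$; that is, $x \in X - \bigcup E_\alpha$.
(iii) Similar to (ii).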
Two sets A, B are said to be disjoint if they have no elements in
common; that is, if $A \cap B = \emptyset$. A disjoint class is a class $\mathcal{C}$ of sets such
that any two distinct sets of $\mathcal{C}$ are disjoint. The union of a disjoint
class is sometimes called a disjoint union.
Lemma. Given a finite or enumerable union of sets $\bigcup_{i=1}^{p} E_i$ (where p
can be $+\infty$), there are subsets $F_i \subset E_i$ such that the sets $F_i$ are disjoint
and
$$\bigcup_{i=1}^{p} E_i = \bigcup_{i=1}^{p} F_i.$$
Proof. We write out the details for $p = \infty$. Only obvious changes
are needed for $p \in Z$. Put $C = \bigcup_{i=1}^{\infty} E_i$ and define $F_1 = E_1$,
$$F_n = E_n - \bigcup_{i=1}^{n-1} E_i \quad (n = 2, 3, \ldots).$$
Then $F_n \subset E_n$ for all n, and if i > j, $F_i$ and $E_j$ are disjoint, so that
$F_i$, $F_j$ must be disjoint. Further if $x \in C$, and n is the smallest integer
(which exists because Z is well ordered) such that $x \in E_n$, then x belongs to $E_n$
but not to $E_i$ for i < n. Thus $x \in F_n$ and so $x \in \bigcup_{i=1}^{\infty} F_i$. Thus
$C \subset \bigcup_{i=1}^{\infty} F_i$, and the reverse inclusion is immediate.
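
For a finite list of finite sets the construction in the proof can be carried out mechanically. The following sketch is an illustration only; the function name `disjointify` and the sample sets are assumptions.

```python
def disjointify(sets):
    """Return F_1, F_2, ... where F_1 = E_1 and F_n = E_n - (E_1 u ... u E_{n-1})."""
    seen = set()    # running union E_1 u ... u E_{n-1}
    result = []
    for E in sets:
        result.append(set(E) - seen)
        seen |= set(E)
    return result


E = [{1, 2, 3}, {2, 3, 4}, {1, 4, 5}, {5}]
F = disjointify(E)

# The F_n are pairwise disjoint subsets of the E_n with the same union.
same_union = set().union(*F) == set().union(*E)
pairwise_disjoint = all(
    not (F[i] & F[j]) for i in range(len(F)) for j in range(i)
)
```

Some of the sets $F_n$ may be empty, as happens for the last set in this example; the lemma does not exclude this.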
Theorem 1.2. The union of a countable class of countable sets is a countable
set.
Proof. By the process of the above lemma we can replace the countable
union by a countable disjoint union of sets which are subsets of
those in the original class, each of which is therefore countable.
Each countable set can be enumerated as a finite or infinite sequence.
So we have
$$C = \bigcup_{i=1}^{\infty} E_i \quad \text{a disjoint union,}$$
$$E_i = \{x_{ij}\} \quad (j = 1, 2, \ldots),$$
where the infinite union may be a finite one and some (or all) of the
sequences $\{x_{ij}\}$ may be finite. Put $F_n = \{x_{ij} : i + j = n\}$, then $F_n$ is a
finite set containing at most (n + 1) elements. The sets $F_n$ are disjoint,
and C can be enumerated by first enumerating $F_1$, then $F_2$, and so on.
If $F_n = \emptyset$ for n > N then C is finite; otherwise it is enumerable. ∎
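
The order of enumeration used in the proof, diagonal by diagonal, can be sketched as a Python generator of index pairs (i, j). This is a minimal illustration under the assumption that both indices start at 1, so that the first non-empty diagonal is n = 2; the name `diagonal_pairs` is hypothetical.

```python
from itertools import count, islice


def diagonal_pairs():
    """Yield index pairs (i, j) with i, j >= 1, grouped by n = i + j = 2, 3, ..."""
    for n in count(2):
        for i in range(1, n):
            yield (i, n - i)


# The first three diagonals: n = 2, 3, 4.
first_six = list(islice(diagonal_pairs(), 6))

# Every pair is reached after finitely many steps; (3, 4) lies on diagonal n = 7.
position_of_3_4 = next(
    k for k, pair in enumerate(diagonal_pairs(), start=1) if pair == (3, 4)
)
```

Replacing each pair (i, j) by the element $x_{ij}$, and skipping pairs for which no such element exists, gives the enumeration of C described in the proof.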
Corollary. The set Q of rational numbers is enumerable.
Proof. $Q = \bigcup_{n=1}^{\infty} E_n$, where $E_n$ is the set of real numbers of the form
p/n where p is an integer. $E_n$ is enumerable since
$$0, +1, -1, +2, -2, \ldots, +p, -p, \ldots$$
is an enumeration of the set of integers. ∎
For a sequence $E_1, E_2, \ldots$ of sets, we put
$$\limsup E_i = \bigcap_{n=1}^{\infty} \bigcup_{i=n}^{\infty} E_i, \quad \liminf E_i = \bigcup_{n=1}^{\infty} \bigcap_{i=n}^{\infty} E_i,$$
and if $\{E_i\}$ is such that $\limsup E_i = \liminf E_i$ we say that the sequence
converges to the set $E = \limsup E_i = \liminf E_i$. For any sequence
$\{E_i\}$, $\limsup E_i$ is the set of those elements which are in $E_i$ for infinitely
many i and $\liminf E_i$ is the set of those elements which are in all but
a finite number of the sets $E_i$.
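
An infinite sequence of sets cannot be stored in full, but for a sequence that repeats a finite block of sets for ever both limits can be computed exactly: every tail $\bigcup_{i \ge n} E_i$ equals the union over one period, and every tail $\bigcap_{i \ge n} E_i$ equals the intersection over one period. The sketch below rests on that periodicity assumption; the function names are hypothetical.

```python
def lim_sup_periodic(period):
    """lim sup of the sequence that repeats the non-empty list `period` for ever.

    Each tail union equals the union over one period, so the intersection
    of all tail unions is that same union.
    """
    result = set()
    for E in period:
        result |= set(E)
    return result


def lim_inf_periodic(period):
    """lim inf of the same periodic sequence: the intersection over one period."""
    result = set(period[0])
    for E in period[1:]:
        result &= set(E)
    return result


# The alternating sequence A, B, A, B, ...
A = {1, 2, 3}
B = {2, 3, 4}
upper = lim_sup_periodic([A, B])   # points lying in E_i for infinitely many i
lower = lim_inf_periodic([A, B])   # points lying in all but finitely many E_i
converges = upper == lower
```

Here the two limits differ, so the alternating sequence does not converge unless A = B.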
A sequence $\{E_i\}$ is said to be increasing if, for each positive integer n,
$E_n \subset E_{n+1}$; it is said to be decreasing if, for each positive integer n,
$E_n \supset E_{n+1}$. A monotone sequence of sets is one which is either
increasing or decreasing. Note that any monotone sequence converges
to a limit for
(i) If $\{E_i\}$ is increasing,
$$\bigcup_{i=n}^{\infty} E_i = \bigcup_{i=1}^{\infty} E_i, \quad \bigcap_{i=n}^{\infty} E_i = E_n \quad \text{for all } n,$$
so that $\limsup E_i = \liminf E_i = \bigcup_{i=1}^{\infty} E_i$; while
(ii) If $\{E_i\}$ is decreasing,
$$\bigcup_{i=n}^{\infty} E_i = E_n, \quad \bigcap_{i=n}^{\infty} E_i = \bigcap_{i=1}^{\infty} E_i \quad \text{for all } n,$$
so that $\limsup E_i = \liminf E_i = \bigcap_{i=1}^{\infty} E_i$.

Indicator function
Given a subset E of a space X, the function $\chi_E : X \to R$ defined by
$$\chi_E(x) = \begin{cases} 1 & \text{for } x \in E, \\ 0 & \text{for } x \in X - E \end{cases}$$
is called the indicator function of E (many books use the term `characteristic
function' for $\chi_E$, but we will avoid this term because characteristic
function has a different meaning in probability theory). The
correspondence between subsets of X and indicator functions is clearly
(1, 1) for
$$E = \{x : \chi_E(x) = 1\},$$
and we will use indicator functions as a convenient tool for carrying
out operations on sets.
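
A minimal sketch of the correspondence for a finite space follows. The helper name `indicator` and the sets chosen are assumptions, and the two pointwise identities checked (for the intersection and for the complement) are standard facts that are not stated in the text at this point.

```python
def indicator(E):
    """Return the indicator function chi_E, taking the value 1 on E and 0 off E."""
    return lambda x: 1 if x in E else 0


X = set(range(8))
E = {1, 2, 3}
F = {3, 4}
chi_E = indicator(E)
chi_F = indicator(F)

# The set is recovered from its indicator function: E = {x : chi_E(x) = 1}.
recovered = {x for x in X if chi_E(x) == 1}

# Intersection corresponds to the pointwise product of indicators.
product_rule = all(indicator(E & F)(x) == chi_E(x) * chi_F(x) for x in X)

# The complement X - E corresponds to 1 - chi_E.
complement_rule = all(indicator(X - E)(x) == 1 - chi_E(x) for x in X)
```

Identities of this kind are what make indicator functions a convenient tool: a statement about sets becomes an arithmetic statement about functions with values 0 and 1.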

Exercises 1.4
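1. Prove each of the following set identities:
$A \cap (B - C) = (A \cap B) - (A \cap C)$,
$(A - B) - C = A - (B \cup C)$,
$A - (B - C) = (A - B) \cup (A \cap C)$,
$(A - B) \cap (C - D) = (A \cap C) - (B \cup D)$,
$E \triangle (F \triangle G) = (E \triangle F) \triangle G$,
$E \cap (F \triangle G) = (E \cap F) \triangle (E \cap G)$,
$E \triangle \emptyset = E$, $E \triangle X = X - E$,
$E \triangle E = \emptyset$, $E \triangle (X - E) = X$,
$E \triangle F = (E \cup F) - (E \cap F)$.
2. With respect to which of the operations $\triangle$, $\cup$, $\cap$ does the class of all
subsets of X form a group?
3. Show that $E \subset F$ if and only if $X - F \subset X - E$.
4. Prove that $A \triangle B = C \triangle D$ if and only if $A \triangle C = B \triangle D$, by showing
that either equality is equivalent to the statement that every point of X
is in 0, 2 or 4 of the sets A, B, C, D.
5. Show that if $I_1 \subset I_2$, then
$$\bigcap_{\alpha \in I_1} E_\alpha \supset \bigcap_{\alpha \in I_2} E_\alpha, \quad \bigcup_{\alpha \in I_1} E_\alpha \subset \bigcup_{\alpha \in I_2} E_\alpha.$$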
6. A real number is said to be algebraic if it is a zero of a polynomial
$a_n x^n + a_{n-1} x^{n-1} + \cdots + a_0$ where the coefficients $a_i$ are integers. Defining
the `height' of a polynomial to be the integer
$$h = n + |a_n| + |a_{n-1}| + \cdots + |a_0|,$$
show that there are only finitely many polynomials of height h, and deduce
that the set of all algebraic real numbers is enumerable. Deduce that the
set of transcendental numbers (real numbers which are not algebraic) has
cardinal $> \aleph_0$.
7. Show that in $R^n$, the set $Q^n$ of points $(x_1, x_2, \ldots, x_n)$, where each
coordinate $x_i$ is rational, is an enumerable set.
Further, the class of all spheres with centres at points of $Q^n$ and rational
radii is an enumerable class.
8. Show that any sequence of disjoint sets converges to $\emptyset$. Show that
$\{E_n\}$ is a convergent sequence if and only if there is no point x of X such that
each of $x \in E_n$, $x \in X - E_n$ holds for infinitely many n. If
$$E_n = \begin{cases} \{x : 0 < x \le 1 - (1/n)\} & n \text{ odd}, \\ \{x : (1/n) \le x < 1\} & n \text{ even}; \end{cases}$$
show that $\{E_n\}$ converges but is not monotone.
9. For any sequence $\{E_n\}$ of sets prove
(i) $\limsup E_n$, $\liminf E_n$ are unaltered by the omission or alteration of
any finite number of sets in the sequence.
(ii) for any set F,
$$F - \limsup E_n = \liminf (F - E_n),$$
$$F - \liminf E_n = \limsup (F - E_n).$$
10. If $E_n = A$ for n even, $E_n = B$ for n odd, show that $\limsup E_n = A \cup B$,
$\liminf E_n = A \cap B$.
11. Can an uncountable union of distinct sets be countable?
12. If {En} is a sequence of sets and
D1 = E1, D .n = Dn_10 En for n=2,3,...,
show that the sequence {DJ converges to a limit if and only if lim En = o.
13. Show that $\chi_E(x) \le \chi_F(x)$ for all x in X if and only if $E \subset F$. Suppose
$A = E \cup F$, $B = E \cap F$, $C = E \Delta F$: show that

Generalise the first two of these identities to finite unions and intersections.
14. If $\chi_n$ is the indicator function of $E_n$ (n = 1, 2, ...) and $A = \limsup E_n$,
$B = \liminf E_n$, show that, for all x in X,
$\chi_A(x) = \limsup_{n \to \infty} \chi_n(x)$, $\chi_B(x) = \liminf_{n \to \infty} \chi_n(x)$.

1.5 Classes of subsets

Up to the present our operations have been defined on the class $\mathscr{C}$
of all subsets of a given set X. This class is too large for many purposes
and it is usual to restrict attention to subclasses of $\mathscr{C}$. However
it is important that the subclasses considered have sufficient structure,
and we now define various types of class starting with the simplest.
1. Semi-ring
A class $\mathscr{S}$ of subsets such that
(i) $\emptyset \in \mathscr{S}$;
(ii) $A, B \in \mathscr{S} \Rightarrow A \cap B \in \mathscr{S}$;
(iii) $A, B \in \mathscr{S} \Rightarrow A - B = \bigcup_{i=1}^{n} E_i$, where the $E_i$ are disjoint sets in $\mathscr{S}$, is
called a semi-ring. (Note that many authors, following Von Neumann,
who first defined the concept, have an additional condition in the
definition of a semi-ring: instead of (iii) they assume that if $A, B \in \mathscr{S}$
and $B \subset A$ there is a finite class $C_0, C_1, \ldots, C_n$ of sets of $\mathscr{S}$ such that
$B = C_0 \subset C_1 \subset \ldots \subset C_n = A$ and $D_i = C_i - C_{i-1} \in \mathscr{S}$ for i = 1, 2, ..., n.
This stronger condition causes complications and we weaken it since
it is unnecessary.) An important example of a semi-ring of subsets of
$\mathbf{R}$ is the class $\mathscr{P} = \mathscr{P}^1$ of finite intervals (a, b] which are open on the left
and closed on the right. Similarly, $\mathscr{P}^n$ consisting of the rectangles
in $\mathbf{R}^n$ of the form $\{(x_1, x_2, \ldots, x_n) : a_i < x_i \le b_i\}$ is a semi-ring in $\mathbf{R}^n$.
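
The semi-ring conditions for half-open intervals can be checked by direct computation. The following is a minimal sketch, not part of the original text: the helper names `normalise`, `intersect` and `difference` are our own, and a pair (a, b) is assumed to stand for the interval (a, b]. It shows that an intersection is again of the form (a, b] or empty, and that a difference splits into at most two disjoint intervals of the same type.

```python
from typing import List, Optional, Tuple

# A pair (a, b) stands for the half-open interval (a, b]; it is empty when a >= b.
Interval = Tuple[int, int]


def normalise(iv: Interval) -> Optional[Interval]:
    """Return the interval itself, or None if it is empty."""
    a, b = iv
    return iv if a < b else None


def intersect(p: Interval, q: Interval) -> Optional[Interval]:
    """(a, b] ∩ (c, d] = (max(a, c), min(b, d)], again half-open or empty."""
    return normalise((max(p[0], q[0]), min(p[1], q[1])))


def difference(p: Interval, q: Interval) -> List[Interval]:
    """Write (a, b] - (c, d] as a list of at most two disjoint half-open intervals."""
    if normalise(p) is None:
        return []
    if normalise(q) is None:
        return [p]
    a, b = p
    c, d = q
    pieces = [(a, min(b, c)), (max(a, d), b)]  # part to the left of q, part to the right
    return [iv for iv in pieces if normalise(iv) is not None]


print(intersect((0, 3), (2, 5)))   # (2, 3)
print(difference((0, 3), (1, 2)))  # [(0, 1), (2, 3)]
```

The second call also shows why this class is only a semi-ring: the difference (0, 3] - (1, 2] is a union of two intervals and is not itself an interval.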

2. Ring
This is any non-empty class $\mathscr{R}$ of subsets such that
$A, B \in \mathscr{R} \Rightarrow A \Delta B \in \mathscr{R}$ and $A \cap B \in \mathscr{R}$.
Since $\emptyset = A \Delta A$, $A \cup B = (A \Delta B) \Delta (A \cap B)$, and $A - B = A \Delta (A \cap B)$
we see that a ring is a class of sets closed under the operations of union,
intersection, and difference and $\emptyset \in \mathscr{R}$. Thus a ring is certainly also a
semi-ring. As examples the system $\{\emptyset, X\}$ is a ring as is the class of all
subsets of X. However, the class $\mathscr{P}$ of half-open intervals in $\mathbf{R}$ is not
a ring, for it is not closed under the operation of difference.
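
The three identities used above are easy to confirm by brute force on a small space. The sketch below is an illustration only (the function names are ours); it uses Python's built-in set operators, where `^` is symmetric difference, `&` intersection, `|` union and `-` difference, and checks every pair of subsets of a four-point set.

```python
from itertools import combinations


def power_set(points):
    """All subsets of a finite collection of points, as frozensets."""
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]


def ring_identities_hold(points) -> bool:
    """Check the three identities for every pair of subsets A, B."""
    subsets = power_set(points)
    for A in subsets:
        if A ^ A != frozenset():            # empty set = A Δ A
            return False
        for B in subsets:
            if A | B != (A ^ B) ^ (A & B):  # A ∪ B = (A Δ B) Δ (A ∩ B)
                return False
            if A - B != A ^ (A & B):        # A − B = A Δ (A ∩ B)
                return False
    return True


print(ring_identities_hold({1, 2, 3, 4}))  # True
```

A finite check of this kind is of course no substitute for the one-line proofs, but it is a quick guard against a misremembered identity.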

3. Field (or algebra)

Any class $\mathscr{A}$ of subsets of X which is a ring and contains X is called
a field. Thus a ring is a field if and only if it is closed under the operation
of taking the complement. The class of all finite subsets of a space X
is a ring, but is not a field unless X is finite. In R the class of all bounded
subsets is a ring but not a field.

4. Sigma ring
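A ring $\mathscr{R}$ is called a $\sigma$-ring if it is closed under countable unions, i.e.
if
$A_i \in \mathscr{R}$ (i = 1, 2, ...) $\Rightarrow \bigcup_{i=1}^{\infty} A_i \in \mathscr{R}$.
Now put $A = \bigcup_{i=1}^{\infty} A_i$ and use the identity $\bigcap_{i=1}^{\infty} A_i = A - \bigcup_{i=1}^{\infty} (A - A_i)$ to
see that a $\sigma$-ring is also closed under countable intersections. Hence
if $\mathscr{R}$ is a $\sigma$-ring and $\{A_n\}$ is a sequence of sets from $\mathscr{R}$ then $\limsup A_n$
and $\liminf A_n$ both belong to $\mathscr{R}$.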
5. Sigma field ($\sigma$-field, Borel field, $\sigma$-algebra)
Any class $\mathscr{F}$ of sets which contains the whole space X and is a $\sigma$-ring
is called a $\sigma$-field. Alternatively, a $\sigma$-field is a field
which is closed under countable unions. For any space X, the class of
all countable subsets will be a $\sigma$-ring, but will only be a $\sigma$-field if X
is countable.
6. Monotone class
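Any class $\mathscr{M}$ of subsets such that, for any monotone sequence $\{E_n\}$
of sets in $\mathscr{M}$ we have $\lim E_n \in \mathscr{M}$ is called a monotone class. It is clear
that a $\sigma$-ring is a monotone class, and any monotone class which is a
ring is also a $\sigma$-ring since
$\bigcup_{i=1}^{n} E_i \in \mathscr{M}$ for each n,
and $\bigcup_{i=1}^{n} E_i$ is monotone so that $\bigcup_{i=1}^{\infty} E_i = \lim_{n \to \infty} \bigcup_{i=1}^{n} E_i$ is in $\mathscr{M}$.
We now use the term z-class to denote any one of the types 2, 3, 4,
5, 6 above (but not a semi-ring), and we consider a collection of
z-classes of the same type.
Lemma. If $\mathscr{C}_\alpha$, for $\alpha \in I$, is a z-class, then $\mathscr{C} = \bigcap_{\alpha \in I} \mathscr{C}_\alpha$ is a z-class.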
Proof. Each of these z-classes is defined in terms of closure with
respect to specified operations. If these operations are applied to sets
of $\mathscr{C}$ then, since each $\mathscr{C}_\alpha$ is closed with respect to the
operations, the resulting set will be in $\mathscr{C}_\alpha$ for all $\alpha \in I$ and therefore
in $\mathscr{C}$, so that $\mathscr{C}$ is also a z-class. ∎
Note. The intersection of a collection of semi-rings need not be a
semi-ring.
Theorem 1.3. Given any class $\mathscr{C}$ of subsets of X there is a unique z-class
$\mathscr{S}$ containing $\mathscr{C}$ such that, if $\mathscr{Q}$ is any other z-class containing $\mathscr{C}$ we must
have $\mathscr{Q} \supset \mathscr{S}$.
Remark. The z-class $\mathscr{S}$ obtained in this theorem is called the z-class
generated by $\mathscr{C}$. It is clearly the smallest z-class of subsets which
contains $\mathscr{C}$.
Proof. The class of all subsets of X is a z-class containing $\mathscr{C}$. Put
$\mathscr{S} = \bigcap \{\mathscr{Q} : \mathscr{Q} \supset \mathscr{C}$ and $\mathscr{Q}$ is a z-class$\}$. Then $\mathscr{S}$ is a z-class by the lemma
and it clearly satisfies the conditions of the theorem.
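
For a finite space the generated ring can be computed explicitly. The sketch below is an illustration under stated assumptions and is not the construction used in the proof: instead of intersecting all rings which contain the given class, it builds the ring from below by repeatedly adding symmetric differences and intersections until nothing new appears. This terminates only because the space is assumed finite, and the name `generated_ring` is our own.

```python
def generated_ring(classes):
    """Smallest class containing `classes` and closed under Δ and ∩ (finite case only)."""
    ring = {frozenset(s) for s in classes}
    while True:
        new = set()
        for A in ring:
            for B in ring:
                new.add(A ^ B)  # symmetric difference
                new.add(A & B)  # intersection
        if new <= ring:         # nothing new: the class is closed
            return ring
        ring |= new


ring = generated_ring([{1, 2}, {2, 3}])
print(sorted(sorted(s) for s in ring))
```

Every set added by the loop must belong to any ring containing the original class, so the result is contained in the intersection described in the proof; since the result is itself a ring, the two agree. For the two sets {1, 2} and {2, 3} the generated ring consists of all eight subsets of {1, 2, 3}.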
In certain special cases one can specify the nature of the z-class
generated by a given class.
Theorem 1.4. The ring $\mathscr{R}(\mathscr{S})$ generated by a semi-ring $\mathscr{S}$ consists
precisely of the sets which can be expressed in the form
$E = \bigcup_{k=1}^{n} A_k$
of a finite disjoint union of sets of $\mathscr{S}$.

Proof. (i) The ring $\mathscr{R}(\mathscr{S})$ certainly must contain all sets of this form,
since it has to be closed under finite unions.
(ii) To see that the system $\mathscr{Q}$ of sets of this type form a ring suppose
$A = \bigcup_{i=1}^{n} A_i$, $B = \bigcup_{j=1}^{m} B_j$,
and put $C_{ij} = A_i \cap B_j \in \mathscr{S}$. Then since the sets $C_{ij}$ are disjoint and
$A \cap B = \bigcup_{i=1}^{n} \bigcup_{j=1}^{m} C_{ij}$,
the system $\mathscr{Q}$ is closed under intersections. Now from the definition
of a semi-ring, an induction argument shows that
$A_i = \bigcup_{j=1}^{m} C_{ij} \cup \bigcup_{k=1}^{r_i} D_{ik}$ (i = 1, ..., n),
$B_j = \bigcup_{i=1}^{n} C_{ij} \cup \bigcup_{k=1}^{s_j} E_{kj}$ (j = 1, 2, ..., m);
where the finite sequences $\{D_{ik}\}$ ($k = 1, \ldots, r_i$) and $\{E_{kj}\}$ ($k = 1, \ldots, s_j$)
consist of disjoint sets in $\mathscr{S}$. It follows now that
$A \Delta B = \bigcup_{i=1}^{n} \left( \bigcup_{k=1}^{r_i} D_{ik} \right) \cup \bigcup_{j=1}^{m} \left( \bigcup_{k=1}^{s_j} E_{kj} \right)$
so that the system $\mathscr{Q}$ is also closed under the operation of taking the
symmetric difference.
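Example. We have already seen that $\mathscr{P}$, the class of intervals
(a, b] in $\mathbf{R}$, is a semi-ring. The generated ring is the class $\mathscr{E}$ of finite
unions of disjoint half-open intervals. $\mathscr{E}$ is called the class of
elementary figures in $\mathbf{R}$. Similarly, the elementary figures in $\mathbf{R}^n$ form the
class $\mathscr{E}^n$ of finite disjoint unions of half-open rectangles from $\mathscr{P}^n$.
The next theorem is often important in proving that a given class is
a $\sigma$-ring.
Theorem 1.5. If $\mathscr{R}$ is any ring, the monotone class $\mathscr{M}(\mathscr{R})$ generated by
$\mathscr{R}$ is the same as the $\sigma$-ring $\mathscr{S}(\mathscr{R})$ generated by $\mathscr{R}$.
Corollary. Any monotone class $\mathscr{M}$ which contains a ring $\mathscr{R}$ contains
the $\sigma$-ring $\mathscr{S}(\mathscr{R})$ generated by $\mathscr{R}$.
Proof. Since a $\sigma$-ring is always a monotone class and $\mathscr{S}(\mathscr{R}) \supset \mathscr{R}$
we must have $\mathscr{S}(\mathscr{R}) \supset \mathscr{M}(\mathscr{R})$; we denote $\mathscr{M}(\mathscr{R})$ by $\mathscr{M}$.
Hence it is sufficient to show that $\mathscr{M}$ is a $\sigma$-ring, and this will follow
if we can prove that $\mathscr{M}$ is a ring. For any set F, let $\mathscr{Q}(F)$ be the class
of sets E for which $E - F$, $F - E$, $E \cup F$ are all in $\mathscr{M}$. Then if $\mathscr{Q}(F)$
is not empty it is easy to check that it is a monotone class. It is clear
that $\mathscr{Q}(F) \supset \mathscr{R}$ for any $F \in \mathscr{R}$ so that $\mathscr{Q}(F) \supset \mathscr{M}$. Hence, $E \in \mathscr{M}$,
$F \in \mathscr{R} \Rightarrow E \in \mathscr{Q}(F) \Rightarrow F \in \mathscr{Q}(E)$ by the symmetry of the definition of the
class $\mathscr{Q}$, and it follows that $\mathscr{M} \subset \mathscr{Q}(E)$ since $\mathscr{Q}(E)$ is a monotone class.
But the truth of this for every $E \in \mathscr{M}$ implies that $\mathscr{M}$ is a ring. ∎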
In § 1.2 we discussed mappings $f : X \to Y$ and saw that any such
mapping induced a set mapping $f^{-1}$ on the class of all subsets of Y.
If $f^{-1}$ is restricted to a special class $\mathscr{C}$ of subsets in Y, then the image of
$\mathscr{C}$ under $f^{-1}$ will be a class of subsets in X. The interesting thing is that
the structure of the class $\mathscr{C}$ is often preserved by such a mapping $f^{-1}$.
Theorem 1.6. Suppose $\mathscr{C}$ is a z-class of subsets of Y, $f : X \to Y$ is any
mapping and $f^{-1}(\mathscr{C})$ denotes the class of subsets of X of the form $f^{-1}(E)$,
$E \in \mathscr{C}$. Then $f^{-1}(\mathscr{C})$ is a z-class of subsets of X.
Proof. It is easy to check that the mapping $f^{-1} : \mathscr{C} \to f^{-1}(\mathscr{C})$ commutes
with each of the set operations union, symmetric difference, countable
union and monotone limit. The closure of $\mathscr{C}$ with respect to any of
these operations therefore implies the closure of $f^{-1}(\mathscr{C})$ with respect to
the same operation. ∎
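
The key step of this proof, that the inverse image commutes with the finite set operations, can be tested on a small example. The sketch below is illustrative only: the mapping is encoded as a Python dict, the particular map `f` is hypothetical, and only the finite operations are checked (countable unions and monotone limits have no finite analogue worth testing).

```python
from itertools import combinations


def preimage(f: dict, E) -> frozenset:
    """f^{-1}(E) = {x in X : f(x) in E}, where the dict f encodes a map X -> Y."""
    return frozenset(x for x, y in f.items() if y in E)


def subsets(points):
    """All subsets of a finite collection of points, as frozensets."""
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1) for c in combinations(pts, r)]


def preimage_commutes(f: dict, Y) -> bool:
    """Check that f^{-1} commutes with ∪, ∩, − and Δ on all subsets of Y."""
    for A in subsets(Y):
        for B in subsets(Y):
            pA, pB = preimage(f, A), preimage(f, B)
            if preimage(f, A | B) != pA | pB:
                return False
            if preimage(f, A & B) != pA & pB:
                return False
            if preimage(f, A - B) != pA - pB:
                return False
            if preimage(f, A ^ B) != pA ^ pB:
                return False
    return True


# A hypothetical map from X = {1, 2, 3, 4} into Y = {a, b, c, d}; it is
# neither one-to-one nor onto, which is the general situation in the theorem.
f = {1: "a", 2: "a", 3: "b", 4: "c"}
print(preimage_commutes(f, {"a", "b", "c", "d"}))  # True
```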
Exercises 1.5

1. Give an example of two semi-rings $\mathscr{S}_1$, $\mathscr{S}_2$ whose intersection is not a
semi-ring.
2. Prove that any finite field is also a $\sigma$-field.
3. If $\mathscr{R}$ is a ring of sets and we define operations $\odot$ = multiplication and
$\oplus$ = addition by $E \odot F = E \cap F$, $E \oplus F = E \Delta F$
show that $\mathscr{R}$ becomes a ring in the algebraic sense.
4. If $\mathscr{R}$ is a ring and $\mathscr{C}$ is the class of all subsets E of X such that either
E or (X - E) is in $\mathscr{R}$, show that $\mathscr{C}$ is a field.
5. What is the ring $\mathscr{R}(\mathscr{C})$ generated by each of the following classes:
(i) for a single fixed E, $\mathscr{C} = \{E\}$;
(ii) for a single E, $\mathscr{C}$ is the class of all subsets of E;
(iii) $\mathscr{C}$ is the class of all sets with precisely 2 points?
6. Prove that if A is any subset of a space X, $A \ne \emptyset$ or X, then the
$\sigma$-field $\mathscr{F}(A)$ generated by the set A is the class $\{\emptyset, A, X - A, X\}$.
7. If $\mathscr{C}$ is a non-empty class of sets show that every set in the $\sigma$-ring
generated by $\mathscr{C}$ is a subset of a countable union of sets of $\mathscr{C}$.
8. For each of the following classes $\mathscr{C}$ describe the $\sigma$-field, $\sigma$-ring and
monotone class generated by $\mathscr{C}$.
(i) P is any permutation of the points of X, i.e. any transformation from
X to itself which is (1, 1) and onto, and $\mathscr{C}$ is the class of subsets of X left
invariant by P.
(ii) X is $\mathbf{R}^3$, Euclidean 3-space, $\mathscr{C}$ is the class of all cylinders in X, i.e.
sets E such that $(x, y, z_1) \in E \Rightarrow (x, y, z_2) \in E$ for all $z_2 \in \mathbf{R}$.
(iii) $X = \mathbf{R}^2$, the plane, $\mathscr{C}$ is the class of all sets which are subsets of a
countable union of horizontal lines.
9. Suppose X is the set of rational numbers in 0 < x < 1, and let $\mathscr{Q}$ be
the set of intervals of the form $\{x \in X : a < x \le b\}$ where $0 \le a \le b < 1$;
$a, b \in X$. Show that $\mathscr{Q}$ is a semi-ring and every set in $\mathscr{Q}$ is either empty or
infinite. Show that the $\sigma$-ring generated by $\mathscr{Q}$ contains all subsets of X.
10. Given a function $f : X \to Y$, and a class of subsets $\mathscr{A}$ of X, $f(\mathscr{A})$
will denote the class of subsets of Y of the form f(A), $A \in \mathscr{A}$.
What is the relation between f(A - B) and f(A) - f(B)? Give an example
in which $f(A \cap B) \ne f(A) \cap f(B)$. Show that it is possible to have a ring $\mathscr{A}$
such that $f(\mathscr{A})$ is not a ring.
Give an example of a mapping $f : X \to Y$ and a semi-ring $\mathscr{S}$ in Y such that
$f^{-1}(\mathscr{S})$ is not a semi-ring. For any class $\mathscr{C}$ of sets in Y show that
$\mathscr{R}(f^{-1}(\mathscr{C})) = f^{-1}(\mathscr{R}(\mathscr{C}))$, $\mathscr{F}(f^{-1}(\mathscr{C})) = f^{-1}(\mathscr{F}(\mathscr{C}))$,
where $\mathscr{R}(\mathscr{C})$ is the ring generated by $\mathscr{C}$, and $\mathscr{F}(\mathscr{C})$ is the $\sigma$-field generated
by $\mathscr{C}$.

1.6 Axiom of choice

Any non-empty set A contains at least one element x, and in the
ordinary process of logic one can choose a particular element from a
non-empty set. By using the principle of induction it follows that one
can choose an element from each of a sequence of non-empty sets,
but difficulty arises if one has to make the simultaneous choice of an
element from each set of a non-countable class $\mathscr{C}$. The assumption
that such a choice is possible can be formulated in the following
equivalent forms, known as the axiom of choice:
(1) Given a non-empty class $\mathscr{C}$ of disjoint non-empty sets $E_\alpha$, there
is a set $G \subset \bigcup \{E_\alpha : E_\alpha \in \mathscr{C}\}$ such that $G \cap E_\alpha$ is a single point set for
each $E_\alpha \in \mathscr{C}$.
(2) For a non-empty class $\mathscr{C}$ of non-empty sets $E_\alpha$, there is a function
(called a choice function) $f : \mathscr{C} \to \bigcup \{E_\alpha : E_\alpha \in \mathscr{C}\}$ such that, for each $E_\alpha$
in $\mathscr{C}$, $f(E_\alpha) \in E_\alpha$.
The difficulty in proofs using the axiom of choice is that only the
existence of a choice function is postulated, and if $\mathscr{C}$ is uncountable,
one has no information about its nature. However, we will find it
convenient at times to use this axiom (or something equivalent).
It has recently been shown that both the axiom of choice, and its
negation, are consistent with the other axioms of set theory, so that
one has to postulate this as an axiom. Although part of our theory will
be valid without this axiom we will not trouble to discover how much
and we will use the axiom of choice throughout when it is convenient.
There are a large number of other apparently different axioms
which turn out to be logically equivalent to the axiom of choice. We
will formulate just two of these, as they will be convenient later.
Various new concepts will be needed before we can state them precisely.

Partial ordering
Suppose V is a set with elements a, b, ... and $\prec$ is a relation defined
between some but not necessarily all pairs $a, b \in V$ such that
(i) $\prec$ is transitive, i.e. $a \prec b$, $b \prec c \Rightarrow a \prec c$;
(ii) $\prec$ is reflexive, i.e. $a \prec a$ for all a in V;
(iii) $a \prec b$, $b \prec a \Rightarrow a = b$;
then V is said to be partially ordered by the relation $\prec$. V is said to be
simply (or totally) ordered if,
(iv) for each pair $a, b \in V$ at least one of $a \prec b$, $b \prec a$ is valid.
Any partial ordering in a set V induces automatically a partial
ordering in every subset of V. If $W \subset V$ and the induced ordering in
W is a simple ordering, then W is said to be a chain in V.
For example, in $\mathbf{R}$ the usual $\le$ relation defines a total ordering of
$\mathbf{R}$. However, in $\mathbf{R}^2$, if we say $(x_1, y_1) \prec (x_2, y_2)$ if and only if $y_1 \le y_2$
and $x_1 \le x_2$ we have an example of a partial ordering which is not
simple. A more useful example is the class $\mathscr{C}$ of all subsets of a fixed
set X with $A \prec B$ meaning $A \subset B$.
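
The conditions (i) to (iv) can be checked mechanically on a finite sample. The sketch below is an illustration with names of our own choosing: it tests the coordinatewise ordering on a 3 by 3 grid of points of the plane, which is reflexive, antisymmetric and transitive but fails (iv) because, for example, (0, 1) and (1, 0) are not comparable.

```python
from itertools import product


def is_partial_order(elements, leq) -> bool:
    """Check reflexivity, antisymmetry and transitivity of leq on a finite set."""
    elems = list(elements)
    reflexive = all(leq(a, a) for a in elems)
    antisymmetric = all(
        a == b for a, b in product(elems, repeat=2) if leq(a, b) and leq(b, a)
    )
    transitive = all(
        leq(a, c) for a, b, c in product(elems, repeat=3) if leq(a, b) and leq(b, c)
    )
    return reflexive and antisymmetric and transitive


def is_total(elements, leq) -> bool:
    """Condition (iv): every pair is comparable in at least one direction."""
    elems = list(elements)
    return all(leq(a, b) or leq(b, a) for a, b in product(elems, repeat=2))


def componentwise(p, q) -> bool:
    """(x1, y1) precedes (x2, y2) when x1 <= x2 and y1 <= y2."""
    return p[0] <= q[0] and p[1] <= q[1]


grid = list(product(range(3), repeat=2))
print(is_partial_order(grid, componentwise))    # True
print(is_total(grid, componentwise))            # False
print(is_total(range(5), lambda a, b: a <= b))  # True
```

A check on a finite sample cannot prove that a relation on an infinite set is a partial ordering, but a single failure is enough to show that it is not.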
A chain W in a partially ordered set V is called a maximal chain if
it is not possible to obtain a larger chain by the addition of an element
in (V - W). We can now state
Kuratowski's lemma. Every partially ordered set V contains a maximal
chain.
This means that there is a totally ordered subset $W \subset V$ such that
for every $x \in V - W$, there is some element $y \in W$ such that neither of
$x \prec y$, $y \prec x$ is true.
For a partially ordered set V, the element a is said to be an upper
bound for the subset $C \subset V$ if $c \prec a$ for every $c \in C$. The element a
is said to be the least upper bound or supremum of the subset C if
(i) a is an upper bound for C;
(ii) if b is an upper bound for C, then $a \prec b$.
It is easy to check that, in any partially ordered set V, it is impossible
for two distinct elements $a_1$, $a_2$ to satisfy the above conditions (i),
(ii) so that the supremum of a set C is unique when it exists. However,
even when a set V is totally ordered, not all its subsets need have a
supremum. With the usual ordering $\mathbf{R}$ has the property that any
non-empty subset C which is bounded above has a supremum (this
is known as the least upper bound axiom), but $\mathbf{Q}$ does not have this
property.
Finally, we say that the element $m \in V$ is a maximal element of V
if $m \prec a \Rightarrow m = a$. We can now state
Zorn's lemma. If V is partially ordered and each chain W in V has a
supremum, then V has a maximal element.
Both Zorn's lemma and Kuratowski's lemma can be deduced from
the axiom of choice,† but we will not give the details as these are
complicated and outside the mainstream of our argument. However,
the next theorem shows that, if we assume Zorn's lemma as an
additional axiom, then both the axiom of choice and Kuratowski's
lemma will be valid. This means that, in our subsequent work, we will
assume whichever of these three results happens to be most convenient.
† For a discussion of these and other axioms equivalent to the axiom of choice see, for
example, J. L. Kelley, General Topology (Van Nostrand, 1955).
Theorem 1.7. The statements of (A) Kuratowski's lemma and (B) Zorn's
lemma are equivalent. Either of them implies (C) the axiom of choice.
Proof. (A) $\Rightarrow$ (B). Suppose W is a maximal chain in V, then by the
hypothesis of (B) there is a supremum m for W so that $a \prec m$ for all
$a \in W$. If m is not a maximal element of V, then there is a $b \in V$ such
that $b \ne m$ and $m \prec b$. Then b is not in W as this would imply $b \prec m$,
and b = m. Hence we may add b to the chain W and the new set
obtained is still a chain. This would contradict the fact that W is a
maximal chain.
(B) $\Rightarrow$ (A). The chains in V form a class $\mathscr{C}$ which is partially ordered
by inclusion. If now $\mathscr{W}$ is a chain in $\mathscr{C}$ with elements W (each of which
is a chain in V), then the union $\bigcup \{W : W \in \mathscr{W}\}$ is a chain in V so that
it is an element of $\mathscr{C}$ which can only be the supremum of $\mathscr{W}$. Hence
by hypothesis $\mathscr{C}$ contains a maximal element, i.e. V contains a
maximal chain.
(B) $\Rightarrow$ (C). We now suppose given a class $\mathscr{N}$ of sets $E_\alpha$. There are
clearly some subsets (in fact any finite subset) $\mathscr{N}_0 \subset \mathscr{N}$ on which
it is possible to define a choice function $g : \mathscr{N}_0 \to \bigcup \{E_\alpha : E_\alpha \in \mathscr{N}_0\}$ such
that $g(E_\alpha) \in E_\alpha$. The set V of all such functions g is therefore
non-empty and it is partially ordered if we say $g_1 \prec g_2$ if $g_1$ is defined on
$\mathscr{N}_1$, $g_2$ is defined on $\mathscr{N}_2$, $\mathscr{N}_1 \subset \mathscr{N}_2$ and $g_1(E_\alpha) = g_2(E_\alpha)$ for $E_\alpha \in \mathscr{N}_1$
(i.e. $g_2$ is an extension of $g_1$). If now W is a chain in V containing
functions $g_i$ defined on $\mathscr{N}_i$, the supremum of W is the function defined
on $\bigcup \mathscr{N}_i$ which has the value $g_i(E_\alpha)$ on any set $E_\alpha \in \mathscr{N}_i$. If we now
assume (B) it follows that the set V has a maximal element f. Then this
function f must be defined on all the sets $E_\alpha$, for otherwise if f is not
defined on $E_1$ we could choose an element $x_1 \in E_1$, put $f(E_1) = x_1$ and
this would be a proper extension of f and therefore contradict the fact
that f is maximal. ∎
Exercises 1.6

1. Show that Z is partially ordered if $a \prec b$ means that a is a divisor
of b.
2. Suppose $\alpha$ is a decomposition of the non-empty set X into disjoint
subsets; $X = \bigcup A_i$, all the $A_i$ disjoint. Show that the collection of such
decompositions is partially ordered if $\alpha \prec \beta$ means that $\beta$ is a refinement
of $\alpha$, i.e. if $\beta$ is the decomposition $X = \bigcup B_j$ then each $B_j$ is a subset of
some $A_i$.
3. A partially ordered set V is said to be well ordered if each non-empty
subset $W \subset V$ has a least element, i.e. there is a $w_0 \in W$ such that $w_0 \prec w$
for all $w \in W$. Show that, if V is well ordered, then it is simply ordered, and
by considering the natural ordering of $\mathbf{R}$ show that there exist simply
ordered sets which are not well ordered.
4. Assuming Zorn's lemma, show that any set X can be well ordered.
Hint. Consider the class $\mathscr{C}$ of well ordered subsets $V \subset X$ with the partial
ordering $V_1 \prec V_2$ if: (i) $V_1 \subset V_2$, (ii) the ordering in $V_1$ is the same as that
induced by the ordering in $V_2$, (iii) $V_1$ is an initial segment of $V_2$ in the sense
that $a \in V_1$, $b \in V_2$, $b \prec a \Rightarrow b \in V_1$. Show that each chain in $\mathscr{C}$ has a supremum
and show that the maximal element $V_0$ in $\mathscr{C}$ must be X.


2.1 Metric space
In the first chapter we were concerned with abstract sets where no
structure in the set was assumed or used. In practice, most useful
spaces do have a structure which can be described in terms of a class
of subsets called `open'. By far the most convenient method of
obtaining this class of open sets is to quantify the notion of nearness
for each pair of points in the space. A non-empty set X together with
a `distance' function $\rho : X \times X \to \mathbf{R}$ is said to form a metric space
provided that
(i) $\rho(y, x) = \rho(x, y) \ge 0$ for all $x, y \in X$;
(ii) $\rho(x, y) = 0$ if and only if x = y;
(iii) $\rho(x, y) \le \rho(x, z) + \rho(y, z)$ for all $x, y, z \in X$.
The real number $\rho(x, y)$ should be thought of as the distance from
x to y. Note that it is possible to deduce conditions (i), (ii) and (iii)
from a smaller set of axioms: this has little point as all the conditions
agree with the intuitive notion of distance. Condition (iii) for $\rho$ is
often called the triangle inequality because it says that the lengths of
two sides of a triangle sum to at least that of the third. Condition (ii)
ensures that $\rho$ distinguishes distinct points of X, and (i) says that the
distance from y to x is the same as the distance from x to y. When
we speak of a metric space X we mean the set X together with a
particular $\rho$ satisfying conditions (i), (ii) and (iii) above. If there is
any danger of ambiguity we will speak of the metric space $(X, \rho)$.
In the set $\mathbf{R}$ of real numbers, it is not difficult to check (i), (ii) and
(iii) for the usual distance function
$\rho(x, y) = |x - y|$,
and similarly in $\mathbf{R}^n$, with $x = (x_1, \ldots, x_n)$, $y = (y_1, \ldots, y_n)$ and
$\rho(x, y) = \left( \sum_{i=1}^{n} (x_i - y_i)^2 \right)^{1/2}$
(one always assumes the positive square root) the conditions for a
metric are satisfied. Thus $\mathbf{R}$ and $\mathbf{R}^n$ are metric spaces with the usual
Euclidean distance for $\rho$.
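
Conditions (i), (ii) and (iii) can be tested numerically on a finite sample of points. The sketch below is an illustration, not a proof: `check_metric` is a name of our own, the sample points are arbitrary, and a small tolerance is assumed to absorb floating-point rounding. It uses `math.dist` (available from Python 3.8) for the Euclidean distance, and contrasts it with the squared distance, which fails the triangle inequality.

```python
import math
from itertools import product


def check_metric(points, rho, tol=1e-12) -> bool:
    """Test conditions (i), (ii), (iii) for rho on a finite sample of points."""
    for x, y in product(points, repeat=2):
        if rho(x, y) < 0 or abs(rho(x, y) - rho(y, x)) > tol:  # (i)
            return False
        if (rho(x, y) == 0) != (x == y):                         # (ii)
            return False
    for x, y, z in product(points, repeat=3):
        if rho(x, y) > rho(x, z) + rho(y, z) + tol:              # (iii)
            return False
    return True


sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
print(check_metric(sample, math.dist))                          # True
print(check_metric(sample, lambda x, y: math.dist(x, y) ** 2))  # False
```

The squared distance fails on the three collinear points (0, 0), (1, 0), (2, 0), where 4 > 1 + 1.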

Open sphere
In a metric space $(X, \rho)$, if $x \in X$, $r > 0$, then
$S(x, r) = \{y : \rho(x, y) < r\}$,
the set consisting of those points of X whose distance from x is less
than r, is called an open sphere (spherical neighbourhood) centre x,
radius r. Clearly, in $\mathbf{R}^n$, S(x, r) is the inside of the usual Euclidean
n-sphere centre x, radius r (for n = 2, the `sphere' is the interior of a
circle while for n = 1 it reduces to the interval (x - r, x + r)).

Open set
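A subset E of a metric space X is said to be open if, for each point x
in E there is an r > 0 such that the open sphere $S(x, r) \subset E$. Note
that the open spheres defined above are examples of open sets since
$y \in S(x, r) \Rightarrow \rho(x, y) = r_1 < r$,
so that, for $0 < r_2 \le r - r_1$, $S(y, r_2) \subset S(x, r)$.
Theorem 2.1. In a metric space X, the class $\mathscr{G}$ of open sets satisfies
(i) $\emptyset, X \in \mathscr{G}$;
(ii) $A_1, A_2, \ldots, A_n \in \mathscr{G} \Rightarrow \bigcap_{i=1}^{n} A_i \in \mathscr{G}$;
(iii) $A_\alpha \in \mathscr{G}$ for $\alpha$ in I $\Rightarrow \bigcup_{\alpha \in I} A_\alpha \in \mathscr{G}$.
Proof. (i) Since any statement about the elements of $\emptyset$ is true, $\emptyset \in \mathscr{G}$,
and it is clear that $S(x, r) \subset X$ for any $x \in X$, r > 0 so certainly $X \in \mathscr{G}$.
(ii) If $x \in \bigcap_{i=1}^{n} A_i$, then $x \in A_i$ for i = 1, ..., n and each $A_i$ is open
so there are real numbers $r_i > 0$ for which $S(x, r_i) \subset A_i$. If we
put $r = \min r_i$, then $0 < r \le r_i$ so that $S(x, r) \subset S(x, r_i) \subset A_i$ for
i = 1, ..., n; and $S(x, r) \subset \bigcap_{i=1}^{n} A_i$.

(iii) For any $x \in \bigcup_{\alpha \in I} A_\alpha$, there must be a particular $\alpha$ in I such that
$x \in A_\alpha$. Since this $A_\alpha$ is open, there is an r > 0 such that
$S(x, r) \subset A_\alpha \subset \bigcup_{\alpha \in I} A_\alpha$. ∎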

Remark. The condition (ii) says that $\mathscr{G}$ is closed for finite
intersections, while (iii) says it is closed under arbitrary unions. One
cannot extend (ii) to give closure for infinite intersections for, in $\mathbf{R}$
the intervals (0, 1 + (1/n)) are open sets, but
$\bigcap_{n=1}^{\infty} (0, 1 + 1/n) = (0, 1]$
is not open as it contains no open sphere centre 1.
It is more general to start with a set X and a class $\mathscr{G}$ of subsets
of X satisfying (i), (ii), (iii) of theorem 2.1 and to call these `the
open sets' in X. Such a class $\mathscr{G}$ and set X are said to form a topological
space, and $\mathscr{G}$ is said to determine the topology in X. A topological
space $(X, \mathscr{G})$ is said to be metrisable if there is a distance function $\rho$
defined on it which determines the class $\mathscr{G}$ for its open sets. Most
topological spaces $(X, \mathscr{G})$ of interest satisfy the rather weak conditions
which are sufficient to ensure metrisability, so that little is lost by
assuming in the first place that we have a metric space $(X, \rho)$. Of
course two different metrics $\rho_1$, $\rho_2$ on a set X may define the same class
$\mathscr{G}$ of open sets, so that even when a topological space is metrisable,
the metric $\rho$ is not uniquely determined; see exercise 2.4 (1).
In this chapter we will define most of the further concepts which
depend on the topology of X in terms of the class $\mathscr{G}$ of open sets in X:
this means that the definitions will make sense either in a metric space
$(X, \rho)$ or in a topological space $(X, \mathscr{G})$. However, when it simplifies the
proof, we will assume that X has a metric $\rho$ determining $\mathscr{G}$ and use
this metric, so that some theorems will be stated and proved for metric
spaces even though they are true more generally.

Closed set

A subset E of X is said to be closed if (X - E) is open. If we apply
this definition, with de Morgan's laws, to the conditions (i), (ii),
(iii) of theorem 2.1 satisfied by the class of open sets, we see that the
class $\mathscr{C}$ of closed sets satisfies
(i) $\emptyset, X \in \mathscr{C}$;
(ii) $A_1, A_2, \ldots, A_n \in \mathscr{C} \Rightarrow \bigcup_{i=1}^{n} A_i \in \mathscr{C}$;
(iii) $A_\alpha \in \mathscr{C}$, $\alpha$ in I $\Rightarrow \bigcap_{\alpha \in I} A_\alpha \in \mathscr{C}$,
so that the class $\mathscr{C}$ is closed for finite unions and arbitrary
intersections.
In a metric space $(X, \rho)$, for $x \in X$, r > 0 the set
$\bar{S}(x, r) = \{y : \rho(x, y) \le r\}$
is called the closed sphere centre x, radius r. It is always a closed set
according to our definition for
$y \in G = X - \bar{S}(x, r) \Rightarrow \rho(x, y) = r_1 > r$
so that $S(y, r_2) \subset G$ for $0 < r_2 \le r_1 - r$.
In a topological space $(X, \mathscr{G})$, any open set containing $x \in X$ is
said to be a neighbourhood of x.
Limit point of a set
Given a subset E of X, a point $x \in X$ is said to be a limit point (or
point of accumulation) of E if every neighbourhood of x contains a
point of E other than x. Note that the point x may or may not be in E.
In a metric space it is easy to see that x is a limit point of E only if
every neighbourhood N of x contains infinitely many points of E:
for, if N contains only the points $x_1, x_2, \ldots, x_n$ of E (all different from x),
then S(x, r) where $r = \min_{1 \le i \le n} \rho(x, x_i)$ is a neighbourhood of x which
contains no point of E other than x.
Lemma. A set $E \subset X$ is closed if and only if E contains all its limit
points.
Proof. Suppose E is closed, then X - E = G is open, so that if $x \in G$
there is a neighbourhood N of x with $N \subset G$. This means that N
contains no point of E so that x is not a limit point of E. Conversely,
if E is a set which contains its limit points and $x \in G = X - E$, then
x is not a limit point of E so there is a neighbourhood $N_x$ of x containing
no point of E. Since $N_x$ is open, so is $H = \bigcup_{x \in G} N_x$. But $N_x \subset G$ for all
$x \in G$ so $H \subset G$, and every point x of G is in the corresponding $N_x$ so
$H \supset G$. Thus H = G and G is open. ∎
For any set $E \subset X$, the closure of E, denoted by $\bar{E}$, is the
intersection of all the closed subsets of X which contain E. It is immediate
that $\bar{E}$ is a closed set, and $E = \bar{E}$ if and only if E is closed. Further
since a closed set contains its limit points, $\bar{E}$ must contain all the limit
points of E: in fact
$\bar{E} = E \cup E'$,
where $E'$ is the set of limit points of E, known as the derived set of E;
for if $x \notin E \cup E'$, there is a neighbourhood N of x which contains no
point of $E \cup E'$ so that (X - N) is a closed set containing E and $x \notin \bar{E}$.

Limit of a sequence
Given a sequence $\{x_i\}$ of points in a metric space $(X, \rho)$ we say that
the sequence converges to the point $x \in X$ if each neighbourhood of
x contains all but a finite number of points of the sequence. Thus
$\{x_i\}$ converges to x if given $\epsilon > 0$, there is an integer N such that
$i \ge N \Rightarrow \rho(x, x_i) < \epsilon$.
We then write $x = \lim_{i \to \infty} x_i$ or $x = \lim x_i$ and say that x is the limit of the
sequence $\{x_i\}$. Note that, in a metric space, the limit of a sequence is
unique; see exercise 2.1 (7).
In a metric space X, given a point x and a set E, the distance from
x to E, denoted by d(x, E) is defined by
$d(x, E) = \inf \{\rho(x, y) : y \in E\}$.
This is always defined since $\{\rho(x, y) : y \in E\}$ is a set of non-negative
real numbers. If $E \subset S(x, r)$ for some open sphere, we say that E is
bounded and define the diameter of E, denoted diam (E), by
$\mathrm{diam}(E) = \sup \{\rho(x, y) : x, y \in E\}$.
If E is not bounded then the set $\{\rho(x, y) : x, y \in E\}$ is not bounded above
and we put $\mathrm{diam}(E) = +\infty$. Note that diam (E) is finite if and only
if E is bounded. Finally, if E, F are two subsets of X, we define
d(E, F) by
$d(E, F) = \inf \{\rho(x, y) : x \in E, y \in F\}$
$= \inf \{d(x, E) : x \in F\}$
and call d(E, F) the distance from E to F. Note that if $E \cap F \ne \emptyset$,
then d(E, F) = 0 but there is no converse to this statement.
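
The failure of the converse can be made concrete. The following sketch is our own illustration and not an example from the text: in the real line take E to be the points 1 - 1/k and F the points 1 + 1/k (k = 1, 2, ...). These sets are disjoint, yet d(E, F) = 0. The code works with the first N points of each set, for which the infimum is a minimum, and uses exact rational arithmetic; the function names are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import product


def dist_point_set(x, E):
    """d(x, E) = inf{|x - y| : y in E}; for a finite set the infimum is a minimum."""
    return min(abs(x - y) for y in E)


def diam(E):
    """diam(E) = sup{|x - y| : x, y in E}; a maximum for a finite set."""
    return max(abs(x - y) for x, y in product(E, repeat=2))


def dist_sets(E, F):
    """d(E, F) = inf{d(x, E) : x in F}."""
    return min(dist_point_set(x, E) for x in F)


def truncated_pair(N):
    """First N points of E = {1 - 1/k} and of F = {1 + 1/k}."""
    E = [1 - Fraction(1, k) for k in range(1, N + 1)]
    F = [1 + Fraction(1, k) for k in range(1, N + 1)]
    return E, F


for N in (1, 10, 100):
    E, F = truncated_pair(N)
    print(N, dist_sets(E, F))  # the distance between the truncated sets is 2/N
```

The truncated sets are at distance 2/N, which tends to 0 as N increases, so the full sets E and F satisfy d(E, F) = 0 although they have no point in common.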

Many readers will be familiar with the concepts of this section for
R and R2. Usually the proofs given in these special cases can be
generalised to a general metric space, and often even to a topological
space. The reader who has difficulty in working in an abstract situa-
tion should visualise the argument in the plane R2, but not use any of
the special properties of R2.

Exercises 2.1
1. In any set X, the class $\mathscr{C}$ of all subsets of X satisfies the conditions for
a class of open sets. Show that this topology can be generated by the metric
$\rho(x, y) = 1$ for $x \ne y$, $x, y \in X$.
(This is called the discrete topology in X.) At the opposite end of the scale,
the indiscrete topology in X is that for which $\mathscr{G} = \{\emptyset, X\}$: in this case the
space is non-metrisable if X contains at least two points.
2. Show that in a metric space X containing at least 2 points
(i) single point sets {x} are closed, but they are open only if
$d(x, X - \{x\}) > 0$;
(ii) finite sets are closed;
(iii) any open set G is the union of the class of open spheres contained
in G;
(iv) any open set G is the union of the class of closed spheres contained
in G.
3. In the topological space X, given a set $E \subset X$ a point x is said to be
an interior point of E if there is a neighbourhood N of x with $N \subset E$. Prove
(i) the set $E^\circ$ consisting of the interior points of E is open;
(ii) E is open if and only if $E = E^\circ$.
4. In a topological space X show that
$\overline{A_1 \cup A_2 \cup \ldots \cup A_n} = \bar{A}_1 \cup \bar{A}_2 \cup \ldots \cup \bar{A}_n$
but this does not extend to arbitrary unions. Give an example in which
$\overline{\bigcup_{i=1}^{\infty} A_i} \ne \bigcup_{i=1}^{\infty} \bar{A}_i$.
5. If X is a 2-point space $\{x_1, x_2\}$, $\rho(x_1, x_2) = 1$ show that $(X, \rho)$ is a metric
space in which the closure $\overline{S(x_1, 1)}$ of the open sphere $S(x_1, 1)$ is not the same
as the closed sphere $\bar{S}(x_1, 1)$. However, if X is a normed linear space (see
§2.6) then $\overline{S(x, r)} = \bar{S}(x, r)$.
6. Suppose $A \subset \mathbf{R}$ and A is closed and bounded below. Show that the
infimum of A is an element of A.
7. In a metric space, suppose $\{x_n\}$ is a sequence converging to x and E
is the set of points in this sequence. Show
(i) every subsequence converges to x;
(ii) either x is the only limit point of E or there is an integer N such that
$x_n = x$ for n > N. Deduce that a sequence $\{x_n\}$ cannot converge to two
different limit points.
8. Suppose E is the set of points in a sequence $\{x_n\}$ in a metric space and
x is a limit point of E. Show there is a subsequence $\{x_{n_i}\}$ which converges
to x.
9. For any set E in a metric space, show that
$\bar{E} = \{x : d(x, E) = 0\}$.
10. If E, F are subsets of a metric space X, $x, y \in X$, show
(i) $\rho(x, y) \ge |d(x, E) - d(y, E)|$;
(ii) $\rho(x, y) \le d(x, E) + \mathrm{diam}(E)$, if $y \in E$;
(iii) $|\rho(x, y_1) - \rho(x, y_2)| \le \mathrm{diam}(E)$, if $y_1, y_2 \in E$;
(iv) d(E, F) = d(F, E).
Is d a metric on the space of subsets of X?
11. In $\mathbf{R}$ show that a bounded open set E is uniquely expressible as a
countable union of disjoint open intervals.
Hint. For each $x \in E$, put $a = \inf \{y : (y, x) \subset E\}$, $b = \sup \{y : (x, y) \subset E\}$;
and show that the open interval $I_x = (a, b)$ contains x, is contained in E and
is such that any open interval I satisfying $x \in I \subset E$ satisfies $I \subset I_x$. Deduce
that for $x_1, x_2 \in E$, either $I_{x_1} = I_{x_2}$ or $I_{x_1} \cap I_{x_2} = \emptyset$, so that $E = \bigcup I_x$ is a
disjoint union. Enumerate the intervals $I_x$ by considering those of length
greater than 1/n (n = 1, 2, ...).
In $\mathbf{R}^n$ ($n \ge 2$) show that a bounded open set can be expressed as a
disjoint union of a countable number of half-open rectangles in $\mathscr{P}^n$ (but that
this expression is never unique). Show that in general an open set in $\mathbf{R}^n$
($n \ge 2$) cannot be expressed as a disjoint union of open spheres, or of open
rectangles.

2.2 Completeness and compactness

In a metric space $(X, \rho)$ a sequence $\{x_n\}$ is said to be a Cauchy
sequence if given $\epsilon > 0$, there is an integer N such that
$n, m \ge N \Rightarrow \rho(x_n, x_m) < \epsilon$.
It is immediate that any sequence $\{x_n\}$ in a metric space which
converges to a point $x \in X$, is a Cauchy sequence.
Complete metric space
A metric space $(X, \rho)$ is said to be complete if, for each Cauchy
sequence $\{x_n\}$ in X, there is a point $x \in X$ such that $x = \lim x_n$. For
example, the set $\mathbf{Q}$ of rationals is a metric space with the usual
distance, but it is not complete for $\sqrt{2} \notin \mathbf{Q}$, but one can easily define a
Cauchy sequence $\{x_n\}$ of rationals which converges to $\sqrt{2}$ (in $\mathbf{R}$),
and this sequence cannot converge to any rational. One of the
important properties of the space $\mathbf{R}$ is that it is complete. This property
is equivalent to the assumption that, in the usual ordering, every
non-void subset of $\mathbf{R}$ which is bounded above has a supremum or least
upper bound. We now give a proof of the completeness of $\mathbf{R}$ by a
method which will turn out to be useful in more complicated situations.
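
One way to write down such a Cauchy sequence of rationals is sketched below. This is an illustration of our own, not the author's construction: it assumes Newton's iteration $x_{k+1} = \tfrac{1}{2}(x_k + 2/x_k)$ started at $x_0 = 1$, and uses exact rational arithmetic so that every term is visibly a rational number whose square is never equal to 2.

```python
from fractions import Fraction


def newton_sqrt2(steps: int):
    """Rational terms x_0 = 1, x_{k+1} = (x_k + 2/x_k)/2 approximating sqrt(2)."""
    x = Fraction(1)
    terms = [x]
    for _ in range(steps):
        x = (x + 2 / x) / 2
        terms.append(x)
    return terms


terms = newton_sqrt2(5)
for x in terms:
    # Each term is rational; x*x - 2 is non-zero and shrinks rapidly.
    print(x, float(x * x - 2))
```

The first few terms are 1, 3/2, 17/12, 577/408. Successive terms draw together very quickly, which is the Cauchy property, while the limit in $\mathbf{R}$ is $\sqrt{2}$ and so lies outside $\mathbf{Q}$.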
Lemma. The space $\mathbf{R}$ is complete.
Proof. Let $\{x_n\}$ be a Cauchy sequence in $\mathbf{R}$. Define a sequence of
integers $\{n_i\}$ by $n_0 = 1$; if $n_{i-1}$ is defined, let $n_i$ be such that $n_i > n_{i-1}$
and $n, m \ge n_i \Rightarrow |x_n - x_m| < 1/2^i$. Then the series
$\sum_{i=1}^{\infty} (x_{n_i} - x_{n_{i-1}})$
is absolutely convergent, and therefore convergent,† say to y. But
$\sum_{i=1}^{p} (x_{n_i} - x_{n_{i-1}}) = x_{n_p} - x_1$,
so the subsequence $\{x_{n_p}\}$ (p = 1, 2, ...) must converge to $x = x_1 + y$.
Given $\epsilon > 0$, choose integers P, N such that
$p \ge P \Rightarrow |x - x_{n_p}| < \tfrac{1}{2}\epsilon$,
$n, m \ge N \Rightarrow |x_n - x_m| < \tfrac{1}{2}\epsilon$.
Now, if $m \ge N$, we can take $n_p \ge N$ with $p \ge P$ to obtain
$|x - x_m| \le |x - x_{n_p}| + |x_{n_p} - x_m| < \epsilon$,
so the sequence $\{x_n\}$ must converge to x.

Covering systems
A class ' of subsets of X is said to cover the set E c X or form a
covering for E, if E c U {S: S Eon }. If all the sets of ' are open, and
le covers E, then we say that' is an open covering of E.

Compact set
A subset E of X is said to be compact if, for each open covering 𝒞
of E, there is a finite subclass 𝒞₁ ⊂ 𝒞 such that 𝒞₁ covers E. For
example, the celebrated Heine-Borel theorem states that any finite
closed interval [a, b] is compact. Though this is proved in most
elementary text-books we include a proof which starts from the least
upper bound property.
Lemma. If a, b are real numbers, the closed interval [a, b]
is compact.
† The fact that absolute convergence implies convergence is a consequence of the
least upper bound axiom for R, that is, it follows from the assumption that every
non-empty subset of R which is bounded above has a supremum. (See, for example,
Burkill, A First Course in Mathematical Analysis, Cambridge, 1962.)
Proof. Let 𝒞 be any open cover of [a, b] and let c be the supremum
of the set of x in [a, b] for which some finite subfamily 𝒞₁ ⊂ 𝒞 covers
[a, x]. (This set is non-void since it contains a.) Choose a set G ∈ 𝒞
with c ∈ G and choose a point d ∈ (a, c) such that the closed interval
[d, c] ⊂ G. Then there is a finite subfamily covering [a, d] and the
addition of G to this family gives a finite subfamily covering [a, c].
But unless c = b, since G is open, we have covered by a finite subfamily
the interval [a, e] for some e > c, which contradicts the choice of c. ∎
It is also possible to prove directly that any closed rectangle
in R^n is compact, but we will be able to deduce this from theorem 2.6.
We can use this to show that, in R^n, every closed bounded set is
compact. This will follow from the following:
Lemma. If E is compact, and F is a closed subset of E, then F is
compact.
Proof. Suppose 𝒞 is an open covering for F. Then 𝒞, together with
(X − F), which is open, forms an open covering for E. This has a
finite subcovering 𝒞₁ of E, and 𝒞 ∩ 𝒞₁ must be a finite subclass of
𝒞 which covers F. ∎
It is not true in a general metric space that closed bounded subsets
are compact; see exercise 2.2 (3). However, we can prove:
Lemma. In a metric space X, every compact subset is closed and bounded.
Proof. If E is not closed, there is a point x_0 ∈ X which is a limit point
of E but is not in E. For every x ∈ E, put S_x = S(x, r) with r = ½ρ(x, x_0).
Then the collection of all such open spheres covers E, but every finite
subclass S(x_1, r_1), ..., S(x_n, r_n) has a void intersection with S(x_0, r),
where r = min r_i, and so cannot cover E, for S(x_0, r) contains points
of E. On the other hand, if E is not bounded, the class of open
spheres of radius 1 and centres in E covers E, but no finite subclass
can cover E. ∎
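The first half of this proof can be tried out on a concrete non-closed set. The sketch below assumes E = (0, 1] in R with limit point x_0 = 0 (an illustrative choice, not taken from the text) and shows that any finite family of the spheres S(x, ½ρ(x, 0)) leaves a point of E uncovered.

```python
from fractions import Fraction

def uncovered_point(centres):
    """Given finitely many centres x_i in E = (0, 1], return a point of E
    lying outside every sphere S(x_i, x_i / 2)."""
    radii = [x / 2 for x in centres]   # r_i = (1/2) * rho(x_i, 0)
    r = min(radii)
    return r / 2                       # a point of E inside S(0, r)

def covered(y, centres):
    """True if y lies in at least one of the spheres S(x_i, x_i / 2)."""
    return any(abs(y - x) < x / 2 for x in centres)

# hypothetical finite subfamily: centres 1, 1/2, ..., 1/49
centres = [Fraction(1, k) for k in range(1, 50)]
y = uncovered_point(centres)
```

Whatever finite list of centres is supplied, `uncovered_point` returns a point of (0, 1] that `covered` rejects, which is exactly why no finite subclass can cover E.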
Whenever the whole space X is compact, we talk of a compact
space. The above lemma shows that R^n is not compact because it
is not bounded. A space X is said to be locally compact if every point
x in X has a neighbourhood N such that the closure N̄ is compact. It is
clear that R^n is locally compact.
There are various other properties in a topological space which are
equivalent to compactness under suitable conditions.
Weierstrass property
A set E is said to have property (W) if every infinite subset of E
has at least one limit point.

Finite intersection property

A class 𝒜 of subsets of E is said to have the finite intersection
property if every finite intersection ∩_{i=1}^{n} A_i, where A_i ∈ 𝒜
(i = 1, 2, ..., n), is non-void.

Theorem 2.2. (i) A closed subset E of a topological space X is compact
if and only if every class 𝒜 of closed subsets of E with the finite
intersection property has a non-void intersection.
(ii) In a metric space X, a subset E is compact if and only if it has
property (W).
Remark. In a general topological space, property (W) is equivalent
to sequential compactness, the property that every countable open
covering has a finite subcovering. A space in which arbitrary open
coverings can be replaced by countable subcoverings is called Lindelöf.
Thus in any Lindelöf space, property (W) is equivalent to compactness.
Proof. (i) Suppose E is compact and F_α, α ∈ I is a class of closed
subsets of E with ∩ F_α void. Then the class of sets G_α = X − F_α,
α ∈ I is an open covering of E. Choose a finite subcovering
G_{α_1}, G_{α_2}, ..., G_{α_n}; then

$$\bigcap_{i=1}^{n} F_{\alpha_i} = \varnothing$$

so that the class of sets F_α, α ∈ I has not got the finite intersection
property. This proves that compactness implies that any class 𝒜
of closed subsets with the finite intersection property has a non-void
intersection. Conversely suppose a closed set E is such that any class
𝒜 of closed subsets with the finite intersection property has a
non-void intersection, and suppose G_α, α ∈ I is an open covering of E,
so that E ∩ ∩(X − G_α) = ∅. Since E is closed, E ∩ (X − G_α), α ∈ I is a
family of closed subsets of E with void intersection, so it cannot have
the finite intersection property: there must be a finite set
α_1, α_2, ..., α_n such that

$$\bigcap_{i=1}^{n} (X - G_{\alpha_i}) \cap E = \varnothing .$$

This means that G_{α_1}, ..., G_{α_n} form a finite subcovering for E.
(ii) Suppose first that E has not got property (W). Let A be an
infinite subset of E with no limit point. If A is not enumerable, choose
an enumerable subset B of A. Then B is closed and (X − B) is open.
Enumerate B as a sequence of distinct points {x_i}, and for each x_i
choose a neighbourhood N_i which contains x_i but no other point of B.
Then the sequence {N_i} together with the set (X − B) form an open
covering of E, which has no finite subcovering as none of the open
sets N_i can be omitted without `uncovering' the corresponding x_i.
Conversely suppose E has property (W). Then there is a finite class
𝒞_n of spherical neighbourhoods of radius 1/n which covers E; for
otherwise we could find an infinite subset of E all of whose points were
distant at least 1/n apart, and such a subset can have no limit
point. Let 𝒞_0 = ∪_{n=1}^{∞} 𝒞_n, so that 𝒞_0 is countable. Now if G is
any open set, for each x in G ∩ E we can find a sphere S ∈ 𝒞_0 containing
x with S ⊂ G: for we can first choose δ > 0 so that S(x, δ) ⊂ G and if
n > 2/δ, the sphere of 𝒞_n which contains x will be contained in S(x, δ).
Given any open covering 𝒟 of E, carry out the above process for each set
D of 𝒟 which intersects E, and each x in D ∩ E, and let 𝒞′ ⊂ 𝒞_0 be the
countable collection of open spheres obtained. For each S ∈ 𝒞′, choose
one set D ∈ 𝒟 with D ⊃ S, and let 𝒟′ be the countable class of sets
so obtained. Then, since 𝒞′ covers E, the class 𝒟′ is a countable
subcovering. This means that, if we assume property (W), open
coverings can be replaced by a countable subcovering.

Now suppose E ⊂ ∪_{i=1}^{∞} G_i where the sets G_i are open. Then, if there
is no finite subcovering, for each integer n we can find a point

$$x_n \in E - \bigcup_{i=1}^{n} G_i$$

and the sequence {x_n} must form an infinite set, so that there is a
limit point x_0 ∈ E. But x_0 ∈ G_k for some k, and G_k is open and therefore
is a neighbourhood of x_0; this means we can find an n > k such that

$$x_n \in G_k \subset \bigcup_{i=1}^{n} G_i,$$

which is a contradiction.
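The step that produces a finite class of spheres of radius 1/n covering E can be imitated on a finite sample of points. The sketch below is only an illustration under assumed data (random points in the unit square, radius 0.25); `greedy_net` is a hypothetical helper, not notation from the text. It keeps a point as a new centre only when it is at least `radius` away from every centre chosen so far, which is the selection used in the "for otherwise" argument.

```python
import math
import random

def greedy_net(points, radius):
    """Pick centres from `points` so that every point lies within `radius`
    of some centre; the chosen centres are pairwise at least `radius` apart."""
    centres = []
    for p in points:
        if all(math.dist(p, c) >= radius for c in centres):
            centres.append(p)
    return centres

random.seed(0)
sample = [(random.random(), random.random()) for _ in range(2000)]
net = greedy_net(sample, 0.25)
```

Because the centres are pairwise at least 0.25 apart inside a bounded square, only a small number of them can be chosen however many sample points are supplied, and the spheres of radius 0.25 about them cover the whole sample.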

Many operations can be carried out more easily in compact spaces
than in non-compact spaces. Given a non-compact space X a useful
trick is to enlarge it to a topological space X* ⊃ X which is compact
and such that the system 𝒢 of open sets in X is obtained by taking the
intersection X ∩ G with X of sets G which are open in X*. This device
is known as the compactification of X. For example, R is not compact,
but if we adjoin two points +∞, −∞ to form the space R* of extended
real numbers we can show that R* is compact if we call a set E ⊂ R*
open if E ∩ R is open and
if +∞ ∈ E, there is a neighbourhood {x: a < x ≤ +∞} ⊂ E;
if −∞ ∈ E, there is a neighbourhood {x: −∞ ≤ x < b} ⊂ E,
where a, b ∈ R. Note that the extended real number system R* is
simply ordered if we put −∞ < x < +∞ for all x ∈ R.
In general, a non-compact topological space X may be compactified
in many different ways. The simplest method is to adjoin a single
point ∞ (which can be thought of as a point at infinity) to give the
space X* = X ∪ {∞} and say that a subset G of X* is open if either
(i) G ⊂ X and G is open in X; or
(ii) ∞ ∈ G and (X* − G) is a closed compact subset of X.
It is not difficult to verify that this collection of `open' sets
defines a topology in X* in which X* is compact. This process is called
the one-point compactification of X. It is familiar in the theory of the
complex plane, where it is usual to add a single point at infinity (with
neighbourhoods of the form |z| > R) to make the resulting `closed plane'
compact. Note that, if 𝒢* is the class of open sets in X*, 𝒢 is the class
of sets of the form X ∩ E, E ∈ 𝒢*.
There are other, more sophisticated, methods of compactifying a
topological space X, but we will not require these in the present
book.
Exercises 2.2
1. If (X, ρ) is a compact metric space, show that
(i) X is complete;
(ii) for each ε > 0, X can be covered by a finite class of open spheres of
radius ε.
2. If (X, ρ) is a complete metric space which, for each ε > 0, can be
covered by a finite number of spheres of radius ε, show that X is compact.
3. The open interval X = (0, 1) ⊂ R with the usual metric
ρ(x_1, x_2) = |x_1 − x_2|
is a metric space. In (X, ρ) the set X is closed and bounded. Show that X
is not compact (and therefore not complete by example 2).
4. Construct a covering of the closed interval [0, 1] by a family of closed
intervals such that there is no finite subcovering.
5. If A, B are compact subsets of a metric space X, show that there are
points x_0 ∈ A, y_0 ∈ B for which
ρ(x_0, y_0) = d(A, B).
Hint. Take sequences {x_i} in A, {y_i} in B with ρ(x_i, y_i) < d(A, B) + 1/i,
and apply property (W) to find convergent subsequences.
6. Give the details of the proof that the process of adjoining a point ∞
used to give the one-point compactification does yield a compact set X*.
If this process is applied to a space X which is already compact, show that
the one point set {∞} is then both open and closed in X*.

2.3 Functions
In Chapter 1 we defined the notion of a function f: X → Y. When
X and Y are topological spaces it is natural to enquire how the
function f is related to these topologies. In particular do points which
are `close' in X map into points which are close in Y? We make this
precise first for metric spaces.

Continuous function
If (X, ρ_X), (Y, ρ_Y) are metric spaces, a function f: X → Y is said to
be continuous at x = a if, given ε > 0 there is a δ > 0 such that
ρ_X(x, a) < δ ⇒ ρ_Y(f(x), f(a)) < ε.
If E ⊂ X, we say f is continuous on E if it is continuous at each point
of E. In particular f: X → Y is said to be continuous (or continuous
on X) if it is continuous at each point of X.
Lemma. If (X, ρ_X), (Y, ρ_Y) are metric spaces, a function f: X → Y
is continuous if and only if f⁻¹(G) is an open set in X for each open
set G in Y.
Proof. Suppose first that f is continuous and G is an open set in Y.
If f⁻¹(G) is void, then it is open. Otherwise, let a ∈ f⁻¹(G); then
f(a) ∈ G, so that there is an ε > 0 for which the sphere S(f(a), ε) ⊂ G.
But then we can find a δ > 0 such that
ρ_X(x, a) < δ ⇒ f(x) ∈ S(f(a), ε) ⊂ G
so that the sphere S(a, δ) ⊂ f⁻¹(G).
Conversely consider f at a point a of X. For each ε > 0, S(f(a), ε) = H
is an open set in Y, so that if f⁻¹(H) is open, we can find a δ > 0 for
which S(a, δ) ⊂ f⁻¹(H), that is such that
ρ_X(x, a) < δ ⇒ ρ_Y(f(x), f(a)) < ε. ∎
If (X, 𝒢), (Y, ℋ) are topological spaces, the function f: X → Y
is said to be continuous if f⁻¹(H) ∈ 𝒢 for every H in ℋ. The lemma just
proved shows that when the topologies in X, Y are determined by
metrics this definition agrees with the one first given for mappings
from one metric space to another.
Now if f: X → Y is continuous and E is a closed subset of Y, it
follows that f⁻¹(E) is closed in X. One has to be careful about the
implications in the reverse direction. In general, it is not true for a
continuous f: X → Y, that A open in X ⇒ f(A) open in Y. There is
one important result of this kind which is valid:
Theorem 2.3. If f: X → Y is continuous, and A is a compact subset
of X, then f(A) is compact in Y.
Proof. Suppose G_α, α ∈ I forms an open covering of f(A). Then
f⁻¹(G_α) is open for each α and the class f⁻¹(G_α), α ∈ I must cover A.
Since A is compact, there is a finite subcovering f⁻¹(G_1), ..., f⁻¹(G_n)
which covers A, and this implies that G_1, ..., G_n cover f(A). ∎
Corollary. If f: X → R is continuous, and A is compact, the set f(A)
is bounded and the function f attains its bounds on A at points in A.
Proof. f(A) is compact, and so it must be closed and bounded.
Hence sup {x: x ∈ f(A)} and inf {x: x ∈ f(A)} exist and belong to the set
f(A). Hence there are points x_1, x_2 ∈ A for which f(A) ⊂ [f(x_1), f(x_2)]. ∎
Remark. The reader will recognise this corollary as a generalisation
of the elementary theorem that a continuous function f: [a, b] → R
is bounded and attains its bounds.
It is important to notice that, in a metric space X, the distance
function ρ(x, y) is continuous for each fixed y considered as a function
from X to R. Further, for a fixed set A, d(x, A) defines a continuous
function from X to R since
ρ(x_1, x_2) ≥ |d(x_1, A) − d(x_2, A)|.
This means that if E is compact, F is any set, the function d(x, F)
for x ∈ E attains its lower bound so that there is an x_0 in E with
d(x_0, F) = d(E, F).
Now, if F is also compact
d(x_0, F) = inf {ρ(x_0, y): y ∈ F}
is the lower bound on a compact set of another continuous function,
so that there is a y_0 in F such that
d(x_0, F) = ρ(x_0, y_0) = d(E, F).
Thus we have proved a further corollary to theorem 2.3, which could
have been proved by a different argument (see exercise 2.2 (5)).
Corollary. If E, F are two compact subsets of a metric space (X, ρ),
there are points x_0 ∈ E, y_0 ∈ F such that
ρ(x_0, y_0) = d(E, F).
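The inequality between ρ(x_1, x_2) and the two distances to A, used above to show that d(x, A) is continuous, follows from the triangle inequality. A short derivation, written out as a sketch:

```latex
\text{For every } a \in A:\qquad
d(x_1, A) \;\le\; \rho(x_1, a) \;\le\; \rho(x_1, x_2) + \rho(x_2, a).
\\[4pt]
\text{Taking the infimum over } a \in A:\qquad
d(x_1, A) \;\le\; \rho(x_1, x_2) + d(x_2, A).
\\[4pt]
\text{Interchanging } x_1 \text{ and } x_2:\qquad
\left| d(x_1, A) - d(x_2, A) \right| \;\le\; \rho(x_1, x_2).
```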
Uniformly continuous function
A mapping f: X → Y from the metric space (X, ρ_X) to the metric
space (Y, ρ_Y) is said to be uniformly continuous on the subset A ⊂ X
if given ε > 0, there is a δ > 0 for which
x, y ∈ A, ρ_X(x, y) < δ ⇒ ρ_Y(f(x), f(y)) < ε. (2.3.1)
Clearly a function which is uniformly continuous on A is certainly
continuous at each point of A, but the point of the condition (2.3.1)
is that one can make f(x) close to f(y) in Y simply by making x close
to y simultaneously for all x, y ∈ A. The choice of δ in (2.3.1) does not
depend on x or y. In general, uniform continuity does not follow from
continuity, but there is an important case in which it does:
Theorem 2.4. If X, Y are metric spaces, and f: X → Y is continuous
on A where A is a compact subset of X, then f is uniformly continuous
on A.
Proof. Given ε > 0, for each x ∈ A, there is a δ_x > 0 such that
ξ ∈ S(x, δ_x) ∩ A ⇒ f(ξ) ∈ S(f(x), ½ε).
For x ∈ A, the class of spheres S_x = S(x, ½δ_x) form an open covering
of A. Choose a finite subcovering S_{x_1}, ..., S_{x_n} and put
δ = ½ min (δ_{x_1}, ..., δ_{x_n}). Then if ρ_X(ξ, η) < δ, ξ, η ∈ A, there must
be a sphere S_{x_i} which contains ξ, and S(x_i, δ_{x_i}) will then contain η.
This implies
ρ_Y(f(ξ), f(η)) ≤ ρ_Y(f(ξ), f(x_i)) + ρ_Y(f(η), f(x_i)) < ε. ∎
Remark. The reader will recognize the above theorem as a generalisation
of the result that a continuous function f: [a, b] → R is uniformly
continuous.
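A small numerical sketch shows why compactness matters in theorem 2.4. The function f(t) = 1/t on (0, 1) is an illustrative choice made here, and `delta_at` is a hypothetical helper. For this f the largest δ that works at a point x can be computed in closed form, δ(x) = εx²/(1 + εx), and it shrinks to 0 as x approaches 0, so no single δ serves all points of (0, 1).

```python
def delta_at(x, eps):
    """Largest delta that works at the point x > 0 for f(t) = 1/t:
    |1/t - 1/x| < eps whenever t > 0 and |t - x| < delta."""
    return eps * x * x / (1 + eps * x)

eps = 0.1
for x in (0.5, 0.1, 0.01, 0.001):
    print(f"x = {x:<6} delta(x) = {delta_at(x, eps):.3e}")
```

On a compact interval [a, 1] with a > 0 the value δ(a) is positive and works at every point, in line with the theorem; on (0, 1) the infimum of δ(x) is 0.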

Exercises 2.3

1. Consider the function f: (0, ∞) → R given by f(x) = min (1, 1/x), for
x > 0. Show that it is continuous. Find the image f(E) of
(i) the set E = (1,1);
(ii) the set Z of positive integers;
(i) shows that E can be open, f (E) not open; (ii) shows that E can be closed,
f (E) not closed.
2. The function
f: (0, 1) → R given by f(x) = 1/x
is continuous on (0, 1), but not bounded and not uniformly continuous,
so theorems 2.3, 2.4 fail if the set is not closed. To see they also fail if the set
is closed but not compact, examine
g: R → R given by g(x) = exp (x).
3. In the argument of the proof to theorem 2.4 why could we not have
put δ = inf {δ_x: x ∈ A}
before first restricting to a finite subset?
4. Suppose A is compact and {f_n} is a monotone sequence of continuous
functions f_n: A → R converging to a continuous f: A → R. Show that the
convergence must be uniform, and give an example to show that the
condition that A be compact is essential.
5. Prove Lebesgue's covering lemma, which states that if 𝒞 is an open
cover of a compact set A in a metric space (X, ρ), then there is a δ > 0,
such that the sphere S(x, δ) is contained in a set of 𝒞 for each x ∈ A.

2.4 Cartesian products

We have already defined the direct product of two arbitrary sets
X, Y as the set of ordered pairs (x, y) with x ∈ X, y ∈ Y. If (X, 𝒢),
(Y, ℋ) are topological spaces, then there is a natural method of
defining a topology in X × Y. Let ℛ be the class of rectangle sets G × H
with G ∈ 𝒢, H ∈ ℋ and let 𝒞 be the class of sets in X × Y which are unions
of sets in ℛ (finite or infinite unions): it is immediate that 𝒞 satisfies
the conditions (i), (ii), (iii) of theorem 2.1. This class 𝒞 of `open'
sets in X × Y is said to define the product topology. This definition
extends in an obvious manner to finite products X_1 × X_2 × ... × X_n,
and it is also possible to extend it to an arbitrary product of topological
spaces, though we will not have occasion to consider a topology for
infinite product spaces.
Theorem 2.5. If (X_i, ρ_i) (i = 1, ..., n) are metric spaces then

$$\rho\big((x_1, \dots, x_n), (y_1, \dots, y_n)\big) = \max_{1 \le i \le n} \rho_i(x_i, y_i)$$

defines a metric in the Cartesian product X_1 × ... × X_n which generates
the product topology.
Proof. It is clear that ρ(x, y) = 0 if and only if ρ_i(x_i, y_i) = 0 for each i
which implies x_i = y_i, 1 ≤ i ≤ n or x = y. Thus in order to show that
ρ is a metric it is sufficient to prove the triangle inequality. But
ρ_i(x_i, y_i) ≤ ρ_i(x_i, z_i) + ρ_i(y_i, z_i) (i = 1, ..., n)
since the ρ_i are all metrics, so that

$$\max_{1 \le i \le n} \rho_i(x_i, y_i)
  \le \max_{1 \le i \le n} \{\rho_i(x_i, z_i) + \rho_i(y_i, z_i)\}
  \le \max_{1 \le i \le n} \rho_i(x_i, z_i) + \max_{1 \le i \le n} \rho_i(y_i, z_i).$$

Now in the product space, the open sphere centre x, radius r has
the form
{(y_1, ..., y_n): ρ_i(x_i, y_i) < r, 1 ≤ i ≤ n}
= S(x_1, r) × S(x_2, r) × ... × S(x_n, r)
that is, it is the product of spheres in each of the component spaces.
Thus the open spheres are open sets in the product topology and since
every open set in the topology of a metric ρ is a union of the open
spheres contained in it, each such set must be open in the product
topology.
Conversely if G is an open set in the product topology and x ∈ G,
there must be open sets G_i ⊂ X_i such that x ∈ G_1 × ... × G_n ⊂ G.
Choose r_i > 0 such that S(x_i, r_i) ⊂ G_i and put r = min r_i. Then
r > 0 and S(x, r) ⊂ S(x_1, r_1) × ... × S(x_n, r_n) ⊂ G. Thus any set G open
in the product topology is also open in the topology of the metric ρ. ∎
Remark. The metric ρ defined in this theorem is by no means the
only one which generates the product topology; see exercise 2.4 (1, 2).
Theorem 2.6. If X, Y are compact topological spaces, then X × Y is
compact in the product topology.
Remark. The proof which follows extends immediately to finite
Cartesian products of compact sets. Actually the theorem is true for
arbitrary products, and in this more general form is due to Tychonoff.
Proof. Suppose first that R_α = G_α × H_α, α ∈ I is a covering of X × Y
by open rectangles. Then if x_0 is a fixed point of X and I_{x_0} is the set of
indices α for which (x_0, y) ∈ G_α × H_α for some y ∈ Y, the class H_α,
α ∈ I_{x_0} forms an open covering of the compact set Y. Hence, there is
a finite set J_{x_0} ⊂ I_{x_0} such that R_α, α ∈ J_{x_0} covers the set {x_0} × Y.
But if we put A_{x_0} = ∩ {G_α: α ∈ J_{x_0}}, A_{x_0} is open, contains x_0, and
the finite class R_α, α ∈ J_{x_0} must cover all of A_{x_0} × Y. For each
x_0 ∈ X, form such an open set A_{x_0}: the class of all sets of this form is
an open covering of the compact set X, and so has a finite subcovering
A_{x_1}, A_{x_2}, ..., A_{x_n}. It follows that R_α, α ∈ ∪_{i=1}^{n} J_{x_i} is a
finite subcovering of X × Y.
It remains to show that, in testing for compactness, it is sufficient
to consider coverings by open rectangles G × H with G open in X,
H open in Y. Suppose then that every covering of X × Y by open
rectangles has a finite subcovering, and consider an arbitrary open
covering G_α, α ∈ I. Each point (x, y) ∈ X × Y is an element of an open set
G_α and therefore there is an open rectangle R_{x,y} with (x, y) ∈ R_{x,y} ⊂ G_α.
The class of open rectangles R_{x,y}, (x, y) ∈ X × Y clearly covers X × Y
and so we can find a finite subcovering R_1, R_2, ..., R_n. The
corresponding sets G_1, ..., G_n then form a finite subclass of the original
covering class which covers X × Y. ∎
Corollary. Any bounded closed set in R^n is compact.
Proof. The usual topology in R^n is given by the distance function

$$\rho(x, y) = \Big\{ \sum_{i=1}^{n} (x_i - y_i)^2 \Big\}^{1/2}$$

while theorem 2.5 shows that the product topology is given by the
distance function

$$\tau(x, y) = \max_{1 \le i \le n} |x_i - y_i| .$$

Now τ(x, y) ≤ ρ(x, y) ≤ √n τ(x, y) for all x, y ∈ R^n so that the topology
of the usual metric ρ is the product topology.
If E is bounded, there is a real number K such that E is a subset
of the Cartesian product of n intervals [−K, K]. Since each of these
intervals is compact, the product is compact and therefore E is
compact if it is closed. ∎
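The comparison τ ≤ ρ ≤ √n τ between the maximum metric and the usual Euclidean metric is easy to test numerically. The following sketch checks it on random points; the sampling range, the number of trials and the tolerance are arbitrary assumptions made for the illustration.

```python
import math
import random

def rho(x, y):
    """Usual Euclidean distance in R^n."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def tau(x, y):
    """Maximum metric of theorem 2.5 with each rho_i(a, b) = |a - b|."""
    return max(abs(a - b) for a, b in zip(x, y))

def check(n, trials=1000, seed=0):
    """Test tau <= rho <= sqrt(n) * tau on random pairs of points in R^n."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-5, 5) for _ in range(n)]
        y = [rng.uniform(-5, 5) for _ in range(n)]
        t, r = tau(x, y), rho(x, y)
        # small tolerance for floating-point rounding
        if not (t <= r + 1e-12 and r <= math.sqrt(n) * t + 1e-12):
            return False
    return True
```

A call such as `check(3)` returning True is of course no proof, but it is a quick sanity check that the two metrics bound each other, which is what makes their topologies coincide (compare exercise 2.4 (1)).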

Exercises 2.4
1. If ρ_1, ρ_2 are two metrics in X such that cρ_1(x, y) ≤ ρ_2(x, y) ≤ kρ_1(x, y)
for all x, y ∈ X where c > 0, k > 0; show that ρ_1 and ρ_2 generate the same
topology in X.
2. Show that, if X × Y has the product topology, then the projection
function p: X × Y → X defined by p(x, y) = x is continuous.
In a space X, if 𝒢_1, 𝒢_2 are two collections of `open' sets defining topologies
and 𝒢_1 ⊃ 𝒢_2 we say that the topology given by 𝒢_2 is coarser than that given
by 𝒢_1. Show that the product topology in X × Y is the coarsest topology
for which projections are continuous. (For an arbitrary Cartesian product
∏_{α ∈ I} X_α of topological spaces (X_α, 𝒢_α) the projection p_β can be defined as
p_β(x_α, α ∈ I) = x_β ∈ X_β for any β ∈ I. One method of defining the product
topology in the Cartesian product space is to say that it is the coarsest
topology for which each of the projections is continuous.)
3. Suppose X × Y has the product topology and A ⊂ X, B ⊂ Y. Show
and prove that the product of closed sets is closed.

2.5 Further types of subset

In a topological space X, a subset E ⊂ X is said to be nowhere
dense if the closure Ē of E contains no non-void open set. If E is
nowhere dense, and G is any non-void open set, the intersection
G ∩ (X − Ē) is a non-void open subset disjoint from Ē, and therefore
from E. Conversely if Ē contains a non-void open set H then every
non-void open subset of H is a neighbourhood of each of its points, and
therefore contains points of E. Thus E is nowhere dense if and only if
every non-void open set in X contains a non-void open set disjoint
from E.
A subset E ⊂ X is said to be of the first category (in X) if there is a
sequence {E_n} of nowhere dense subsets of X such that E = ∪_{i=1}^{∞} E_i. A
set E ⊂ X which cannot be expressed as a countable union of nowhere
dense sets is said to be of the second category.
Before proving that complete metric spaces are necessarily of the
second category, it is convenient to prove a lemma which again
generalises a well-known result in R (about a decreasing sequence of
closed intervals).
Lemma (Cantor). In a complete metric space, given {A_n} a decreasing
sequence of non-empty closed sets such that diam (A_n) → 0, ∩ A_n is a
one point set.
Proof. For each integer n, choose a point x_n ∈ A_n. Then given
ε > 0 we can choose N so that
n ≥ N ⇒ diam (A_n) < ε,
and, since A_N ⊃ A_n for n ≥ N,
n, m ≥ N ⇒ x_n, x_m ∈ A_N ⇒ ρ(x_n, x_m) < ε;
so that {x_n} is a Cauchy sequence. Since the space is complete, there
is a point x_0 such that x_n → x_0 as n → ∞. For each n, since A_n is
closed and x_i ∈ A_n for i ≥ n we must have x_0 ∈ A_n so that
x_0 ∈ ∩_{n=1}^{∞} A_n.
Now diam (∩ A_n) ≤ diam (A_n) for each n so that diam (∩ A_n) = 0
and the set ∩ A_n cannot contain more than one point. ∎

Theorem 2.7 (Baire). Every complete metric space X is of the second
category.
Proof. It is sufficient to show that if {A_n} is any sequence of nowhere
dense sets there are points x ∈ X − ∪ A_n. Starting with such a sequence
{A_n}, since we can find a non-void open set disjoint from A_1 there is a
closed sphere S̄(x_1, r_1) with 0 < r_1 < 1 such that S̄(x_1, r_1) ∩ A_1 = ∅.
Suppose we have found spheres S̄(x_1, r_1) ⊃ S̄(x_2, r_2) ⊃ ... ⊃ S̄(x_{n-1}, r_{n-1})
with 0 < r_{n-1} < 1/(n − 1) such that S̄(x_i, r_i) ∩ A_i = ∅ for 1 ≤ i ≤ n − 1.
Then the open sphere S(x_{n-1}, r_{n-1}) is a non-void open set in X and A_n is
nowhere dense. There must therefore be an open subset disjoint from A_n so
that we can find a sphere S̄(x_n, r_n) with
0 < r_n < 1/n, S̄(x_n, r_n) ⊂ S(x_{n-1}, r_{n-1}) and S̄(x_n, r_n) ∩ A_n = ∅.
By the last lemma, there is a unique point x_0 ∈ ∩ S̄(x_n, r_n) and this
point x_0 cannot be in A_n for any integer n. This means that
x_0 ∈ X − ∪ A_n. ∎

Corollary. R^n is of the second category. For a < b, the interval
[a, b] ⊂ R is of the second category.
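The nested-sphere construction in the proof of theorem 2.7 can be followed step by step in [0, 1] when each nowhere dense set A_n is a single point. The sketch below is an illustration under that assumption; `avoid` is a hypothetical helper, and the list of rationals is made-up test data. At each step it keeps a closed third of the current interval that misses the next point, so the final interval misses every point processed.

```python
from fractions import Fraction

def avoid(sequence, a=Fraction(0), b=Fraction(1)):
    """Shrink [a, b] to a closed subinterval missing every term of `sequence`.
    Each step keeps an outer closed third of the current interval that does
    not contain the next term, mirroring the nested spheres in the proof."""
    for q in sequence:
        third = (b - a) / 3
        left = (a, a + third)
        right = (b - third, b)
        # the two outer thirds are disjoint, so q cannot lie in both
        a, b = right if left[0] <= q <= left[1] else left
    return a, b

# hypothetical test data: the fractions p/q in [0, 1] with q <= 12
rationals = [Fraction(p, q) for q in range(1, 13) for p in range(q + 1)]
lo, hi = avoid(rationals)
```

Continuing the process through an enumeration of a countable set and applying Cantor's lemma to the nested intervals gives a point of [0, 1] outside that set, which is the category argument asked for in exercise 2.5 (1).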

Dense subset
If A, E are subsets of a topological space X, we say that A is dense
in E if Ā ⊃ E. This means that any open set G which contains a
point of E also contains a point of A. In particular A is dense in
X if Ā = X, that is, if every non-void open set contains a point of A.
Separable space
A topological space X is said to be separable if there is a countable
set E ⊂ X such that E is dense in X. This implies the existence of a
sequence {x_n} in X such that every non-void open subset contains a
point of the sequence. It is immediate that R and R^n are separable
as the set of points with rational coordinates is dense and countable.
Further, every compact metric space X is separable, since X can be
covered by a finite class 𝒞_n of open spheres of radius 1/n for n = 1, 2, ...
and the (countable) set consisting of the centres of all these spheres is
clearly dense in X.
Borel sets and Borelian sets
In any topological space X, we will call the σ-field ℬ generated by
the open sets the class of Borel sets, and the σ-field 𝒦 generated by
the compact sets the class of Borelian sets. (One must remark that
some authors use the term Borel sets for 𝒦.)
In a metric space the compact sets are closed, and therefore in
ℬ so that 𝒦 ⊂ ℬ. If X = ∪_{i=1}^{∞} K_i is a countable union of compact sets
(in this case we can say that X is σ-compact), then 𝒦 = ℬ. In order
to prove this it is sufficient to show that each open set G ∈ 𝒦: but
if G is open, E = X − G is closed and so E ∩ K_i is compact for each i
and this implies E = ∪_{i=1}^{∞} (E ∩ K_i) ∈ 𝒦 and G = X − E ∈ 𝒦. Now
Euclidean n-space R^n is the union of the closed spheres S̄(0, k) (k = 1, 2, ...)
each of which is compact, so R^n is σ-compact. This means that, in R^n
the Borel sets and the Borelian sets are the same.
Note that, by our definition, the class ℬ^n of Borel sets in R^n is
the σ-field generated by the open sets in R^n. It is convenient to see
that ℬ^n can also be obtained as the σ-field generated by a simpler
class of sets.
Lemma. The class 𝒫^n of half-open intervals in R^n generates the σ-field
ℬ^n of Borel sets in R^n.
Proof. Let ℱ^n be the σ-field generated by 𝒫^n. Each set in 𝒫^n,
{(x_1, x_2, ..., x_n): a_i < x_i ≤ b_i, i = 1, 2, ..., n}
can clearly be obtained as a countable intersection

$$\bigcap_{k=1}^{\infty} \Big\{ (x_1, \dots, x_n) : a_i < x_i < b_i + \frac{1}{k},\ i = 1, \dots, n \Big\}$$

of open rectangles, and is therefore in ℬ^n. Hence ℬ^n ⊃ 𝒫^n so that
ℬ^n ⊃ ℱ^n.
On the other hand, each open set G in R^n is a union of those
rectangles of 𝒫^n whose boundary points a_i, b_i are all rational. Since there
are only countably many such sets, each G is a countable union of
sets in 𝒫^n and so G ∈ ℱ^n. This implies that ℱ^n ⊃ ℬ^n. ∎
It is sometimes useful to be able to describe sets which can be
obtained from a given class 𝒞 by a countable operation. We say that
E is a 𝒞_σ set if it is possible to find sets E_1, E_2, ... in 𝒞 such that
E = ∪_{i=1}^{∞} E_i; and E is a 𝒞_δ set if E = ∩_{i=1}^{∞} E_i for a sequence
{E_i} in 𝒞.
In particular, if 𝒢 is the class of open sets in a space X, we see that
𝒢_δ ⊃ 𝒢, 𝒢_σ = 𝒢. Similarly, if ℱ is the class of closed sets,
ℱ_σ ⊃ ℱ, and ℱ_δ = ℱ.
Perfect set

A subset E of a topological space X is said to be perfect if E is closed,
and each point of E is a limit point of E. For example, in R^n, any
closed sphere S̄(x, r), r > 0 is perfect and, in particular, the closed
interval [a, b] is perfect in R for any a < b. It is obvious that finite
sets in a metric space cannot be perfect. In fact more is true; see
exercise 2.5 (7).

Exercises 2.5
1. Show that, in R^n, any countable set is of the first category. Give a
category argument for the existence of irrational numbers.
2. Show that the class 𝒩 of nowhere dense subsets of X is a ring, and the
class 𝒞 containing all sets of the first category is the generated σ-ring.
3. Show that in a complete metric space, a set of the first category
contains no non-empty open set. Deduce that every non-empty open set
is of the second category.
4. If G is an open set in a topological space, prove that (Ḡ − G) is nowhere
dense.
5. In R show that the class of half-open intervals with rational
end-points generates the σ-field ℬ of all Borel sets. Similarly in R^n, show that
𝒫^n generates the Borel sets ℬ^n.
6. Show that a set E is perfect if and only if E = E', where E' is the set
of limit points of E.
7. Show that any non-empty perfect subset of a complete metric space
is non-countable. Hint. Use theorem 2.7 and the fact that a closed subset
of a complete metric space is complete.

2.6 Normed linear space

There are many abstract sets which have an algebraic structure as
well as a topology. Thus if, in the set X there is a binary operation
+ (called addition) and an operation in which elements of X can be
multiplied by elements of the real number field R to give elements in
X we say that X is a real linear space if for all x, y, z ∈ X, a, b ∈ R:
(i) x + y = y + x;
(ii) x + (y + z) = (x + y) + z;
(iii) x + y = x + z ⇒ y = z;
(iv) a(x + y) = ax + ay;
(v) (a + b)x = ax + bx;
(vi) a(bx) = (ab)x;
(vii) 1.x = x.
It follows from these axioms that X has a unique zero element
0 = 0.y for all y ∈ X, and that subtraction can be defined in X by
x − y = x + (−1)y.
In the present book we will only consider linear spaces over R.
Most of our results can be extended, though sometimes with a little
difficulty, to linear spaces over the number field C. We will not carry
out this extension, nor do we consider any more general number
fields.
It is immediate that R^n is a real linear space with vector addition
and scalar multiplication for the two operations. The properties
of linear spaces are studied at length in elementary courses on linear
algebra.† We will not require many of these, but will develop the
properties of linear independence when they are needed in Chapter 8.
If in a real linear space X there is a function n: X → R satisfying
(i) n(0) = 0, n(x) > 0 if x ≠ 0;
(ii) n(x + y) ≤ n(x) + n(y) for all x, y ∈ X;
(iii) n(ax) = |a| n(x) for a ∈ R, x ∈ X,
we say that X is a normed linear space. We will in this case use the
usual notation ‖x‖ for the value n(x) of the norm function n at x.
In any normed linear space X,
ρ(x, y) = ‖x − y‖ = ρ(x − y, 0)
defines a metric, and in the topology determined by this metric, the
algebraic operations are continuous in the sense that
(i) x + y is continuous in the product topology of X × X;
(ii) ax is continuous in the product topology of X × R.

† See, for example, G. Birkhoff and S. MacLane, A Survey of Modern Algebra
(Macmillan, 1941).

It follows in particular that
(iii) a ∈ R, lim x_n = 0 ⇒ lim (a x_n) = 0;
(iv) x ∈ X, a_n ∈ R, lim a_n = 0 ⇒ lim (a_n x) = 0.
(The reader is advised to check (i)-(iv) using the axioms.)
Special normed linear spaces will be studied in Chapters 7 and 8.
At this stage we consider a few important examples of such spaces
and examine the topological structure imposed by the norm.
M. Consider the set of those functions x: [0, 1] → R which are
bounded. Define, for t ∈ [0, 1]
(x + y)(t) = x(t) + y(t),
(ax)(t) = ax(t)
and check that this makes M a linear space. If we put
‖x‖ = sup {|x(t)|: 0 ≤ t ≤ 1}
it is not hard to check that the conditions for a norm are also satisfied,
so we have a normed linear space.
C. The set of those functions x: [0, 1] → R which are continuous
is a subset C of M. Since this subset C is closed under the operations
of addition and scalar multiplication, it must be a normed linear
space with the same norm
‖x‖ = sup {|x(t)|: 0 ≤ t ≤ 1}.
s. The set of all sequences of real numbers {x_i} is a linear space if
we put
{x_i} + {y_i} = {x_i + y_i},
a{x_i} = {a x_i}.
Since for x, y real we have

$$\frac{|x + y|}{1 + |x + y|} \le \frac{|x|}{1 + |x|} + \frac{|y|}{1 + |y|}$$

it follows that

$$\rho(\{x_i\}, \{y_i\}) = \sum_{i=1}^{\infty} \frac{1}{2^i}\, \frac{|x_i - y_i|}{1 + |x_i - y_i|}$$

defines a metric in s.
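The inequality for real x, y quoted above rests on the fact that t/(1 + t) = 1 − 1/(1 + t) is increasing for t ≥ 0. A sketch of the derivation:

```latex
\frac{|x+y|}{1+|x+y|}
  \;\le\; \frac{|x|+|y|}{1+|x|+|y|}
  \;=\; \frac{|x|}{1+|x|+|y|} + \frac{|y|}{1+|x|+|y|}
  \;\le\; \frac{|x|}{1+|x|} + \frac{|y|}{1+|y|} .
```

Applying this to each coordinate difference and summing with the weights 2^{-i} gives the triangle inequality for the metric in s; the series converges because each term is less than 2^{-i}.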
m. This is the set of all bounded sequences of real numbers with
the same linear structure as s. However this time it is more convenient
to use the norm
‖{x_i}‖ = sup {|x_i|: i = 1, 2, ...},
to make m into a normed linear space.
c. This is the set of convergent sequences of real numbers with
the same norm and linear structure as m.
Each of the above spaces has a topology defined by the norm.
We now obtain a few of the topological properties of these spaces,
leaving the reader to determine the remainder.
Lemma. The space C is complete.
Proof. If $\{x_n\}$ is a Cauchy sequence in C, then for each $t \in [0, 1]$,
$\{x_n(t)\}$ is a Cauchy sequence in $\mathbf{R}$ which must converge to a real number
$x_0(t)$. For each $\epsilon > 0$, there is an integer $N$ such that, if
$y_m(t) = |x_N(t) - x_m(t)|$ $(m \ge N)$,
then $\|y_m\| < \tfrac12\epsilon$; that is,
$0 \le y_m(t) < \tfrac12\epsilon$ for each $t$ in $[0, 1]$.
If we now let $m \to \infty$, it follows that
$|x_N(t) - x_0(t)| \le \tfrac12\epsilon$ for all $t$ in $[0, 1]$
so that, if $n \ge N$, $t \in [0, 1]$,
$|x_n(t) - x_0(t)| \le |x_N(t) - x_n(t)| + |x_N(t) - x_0(t)| < \epsilon$
and $\|x_n - x_0\| \to 0$ as $n \to \infty$. This means that $x_0$ is the uniform limit
of a sequence of continuous functions and must therefore be continuous;
that is, $x_0 \in C$. ∎
Lemma. The space M is not separable.
Proof. For each $s \in [0, 1)$, let $x_s$ be the function given by
$x_s(t) = 0$ for $0 \le t \le s$,
$x_s(t) = 1$ for $s < t \le 1$.
Then if $r, s \in [0, 1)$ and $r \ne s$ we must have $\|x_r - x_s\| = 1$. Now any
dense set in M has to contain a point $y_s$ such that $\|y_s - x_s\| < \tfrac12$ for
each $s \in [0, 1)$; and we cannot have $y_r = y_s$ with $r \ne s$ for then
$1 = \|x_r - x_s\| \le \|x_r - y_r\| + \|y_r - y_s\| + \|y_s - x_s\| < 1$.
This means that any set dense in M must contain at least $c$ points,
and therefore M is not separable.
Lemma. The space c is not locally compact.
Proof. A metric space $X$ is locally compact if, for each $z \in X$, there
is an $\epsilon > 0$ such that the closed sphere $\bar{S}(z, \epsilon)$ is compact. Now put
$x_i^k = 1$ for $i = k$,
$x_i^k = 0$ for $i \ne k$,
and for each integer $k$,
$x^k = \{x_i^k\}$ $(i = 1, 2, \ldots) \in c$ and $k \ne j \Rightarrow \|x^k - x^j\| = 1$.
Given $z = \{z_i\} \in c$ and $\epsilon > 0$, put
$z^k = z + \epsilon x^k$
and all the points $z^k$ are in $\bar{S}(z, \epsilon)$. But
$\|z^k - z^j\| = \epsilon$ if $k \ne j$
so that $\{z^k\}$ $(k = 1, 2, \ldots)$ forms an infinite set in $\bar{S}(z, \epsilon)$ with no limit
point, and $\bar{S}(z, \epsilon)$ cannot be compact. ∎

Exercises 2.6
1. Show that s is bounded but not compact. If $x = \{x_i\} \in s$, and
$E = \{y : |y_i| \le |x_i|\}$,
show that E is compact in s, but show that s is not locally compact.
2. Show that each of the spaces M, c, m, s is complete.
3. Show that each of the spaces C, c, s is separable, but that m is not.
Hint for C. Consider the set of functions which take rational values at
each of a finite set of rationals in [0, 1] and are defined by linear interpolation
between these points.
4. $C^*(X)$ denotes the space of functions $f\colon X \to \mathbf{R}$ which are continuous
and bounded. Show that $C^*(\mathbf{R})$ is not separable by considering continuous
functions which take the values $+1$, $-1$ on disjoint subsets $Z_1$, $Z_2$ of the
set $Z$ of positive integers and are defined elsewhere by linear interpolation.
(The distance between any two such functions is 2, and there are $c$ of them.)
However, let $\mathscr{D}$ be the subset of $C^*(\mathbf{R})$ consisting of those functions $f$
for which $\lim_{x \to +\infty} f(x)$, $\lim_{x \to -\infty} f(x)$ both exist. Prove that $\mathscr{D}$ is separable.
5. Let $l_2$ be the subset of s such that $\sum x_i^2$ converges. In the linear structure
of s show that $l_2$ is a linear space and that
$\|x\| = \left(\sum_{i=1}^{\infty} x_i^2\right)^{1/2}$
defines a norm. In the topology of this norm show that $l_2$ is separable.
The subset $\{x : |x_i| \le 1/i\}$ of $l_2$ is known as the Hilbert cube: prove it is
compact.
Hint. Starting with an infinite sequence in the Hilbert cube pick a sub-
sequence in which the first coordinate converges, then successive sub-
sequences in which the 2nd, 3rd, ..., nth coordinate converges. Show that
the sequence to which these coordinates converge is in the Hilbert cube
and is a limit point of the original set.

2.7 Cantor set

We now digress briefly from the study of general situations and
consider the definition and properties of a special subset of R first
considered by Cantor. This set and associated functions will be useful
in the sequel to provide counter-examples to several conjectures
which are plausible but false.
If we denote the open interval $((3r-2)/3^n, (3r-1)/3^n)$ by $E_{n,r}$, put
$G_n = \bigcup_{r=1}^{3^{n-1}} E_{n,r}$, $G = \bigcup_{n=1}^{\infty} G_n$;
it is clear that $G$ is an open subset of [0, 1]. Its complement
$C = [0, 1] - G$
is called the Cantor set. From its definition C is closed.
Lemma. The Cantor set C is nowhere dense and perfect.
Proof. If we express points $x \in [0, 1]$ in the form
$x = \sum_{i=1}^{\infty} \frac{a_i}{3^i}$ $(a_i = 0, 1, 2)$, (2.7.1)
then the set $G_n$ above is the set of $x$ for which $a_n = 1$. Hence, the set
C consists of precisely those real numbers which have a representation
in the form (2.7.1) with each $a_i = 0$ or 2. Given a point $x_1 \in C$, altering
the nth term $a_n$ (replacing 0, 2 by 2, 0 respectively) gives a new
point $x_2$ in C such that
$|x_1 - x_2| = 2 \cdot 3^{-n}$.
This shows that every point of C is a limit point of C, so that C is
perfect.
If $H$ is any open set with $H \cap [0, 1]$ not void then $H \cap [0, 1]$ contains
an open interval $I$ of length $\delta > 0$. If $\delta > 3^{1-n}$, then $I$ must contain
an interval $E_{n,r}$ so that $H$ contains an open set disjoint from C.
This proves that C is nowhere dense. ∎
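The digit argument in the proof can be checked with exact arithmetic. The following is a minimal sketch in Python, assuming a point of C is represented by a finite list of ternary digits 0 or 2 (all later digits being 0); the helper names `from_digits` and `flip` are illustrative only.

```python
from fractions import Fraction

def from_digits(digits):
    """Value of sum a_i / 3^i for the finite digit list a_1, a_2, ..."""
    return sum(Fraction(a, 3 ** i) for i, a in enumerate(digits, start=1))

def flip(digits, n):
    """Replace the nth digit 0 by 2 or 2 by 0, leaving the others unchanged."""
    out = list(digits)
    out[n - 1] = 2 - out[n - 1]
    return out

x1_digits = [0, 2, 2, 0, 2]
x1 = from_digits(x1_digits)
x2 = from_digits(flip(x1_digits, 4))   # alter the 4th digit
gap = abs(x1 - x2)                     # should equal 2 * 3^-4
```

Choosing n large makes the gap as small as desired, which is why every point of C is a limit point of C.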
From the above lemma and example 2.5 (7) we can deduce that C
is not countable. However, one can prove that C must have the same
cardinal as the continuum [0, 1] by considering the following mapping:
if $x \in [0, 1]$, put
$x = \sum_{i=1}^{\infty} \frac{b_i}{2^i}$ $(b_i = 0$ or $1)$,
where the sequence $\{b_i\}$ does not satisfy $b_i = 1$ for $i > N$. Put
$f(x) = \sum_{i=1}^{\infty} \frac{a_i}{3^i}$, where $a_i = 0$ if $b_i = 0$ and $a_i = 2$ if $b_i = 1$.
Then $f\colon [0, 1] \to C$ is (1, 1) and maps [0, 1] onto a (proper) subset of C.
Since $C \subset [0, 1]$ the cardinal of C is $c$.
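The mapping f can be computed digit by digit. The following is a minimal sketch in Python, assuming $x$ is a `Fraction` in [0, 1) and that only the first `n` binary digits are used, so the result is exact for dyadic rationals and a partial sum otherwise; `binary_digits` and `f_partial` are illustrative names.

```python
from fractions import Fraction

def binary_digits(x, n):
    """First n binary digits b_1..b_n of x in [0, 1), using the expansion
    that does not end in an unbroken run of 1s."""
    digits = []
    for _ in range(n):
        x *= 2
        b = int(x >= 1)
        digits.append(b)
        x -= b
    return digits

def f_partial(x, n=40):
    """Partial sum of sum a_i / 3^i with a_i = 2 * b_i."""
    return sum(Fraction(2 * b, 3 ** i)
               for i, b in enumerate(binary_digits(x, n), start=1))

samples = [Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(3, 4)]
images = [f_partial(x) for x in samples]
```

For example, 1/2 has binary digits 1, 0, 0, ... and is sent to 2/3, while 3/4 has digits 1, 1, 0, ... and is sent to 2/3 + 2/9 = 8/9.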
We can think of f as a function on [0, 1] to [0, 1]. It is easy to see
that f is monotonic, that is,
$x_1 < x_2 \Rightarrow f(x_1) < f(x_2)$
so that, for each $y \in [0, 1]$,
$f^{-1}[0, y] = [0, z]$. (2.7.2)
If $z$ is defined by (2.7.2), then we say that
$z = g(y)$.
This defines $g\colon [0, 1] \to [0, 1]$ as a monotonic function which is clearly
constant on each of the sets $E_{n,r}$. In fact
$\frac{3r-2}{3^n} \le y \le \frac{3r-1}{3^n} \Rightarrow g(y) = \frac{2r-1}{2^n}$.
The function g is continuous and monotonic increasing, for
$0 \le y_1 - y_2 < 3^{-n-1} \Rightarrow 0 \le g(y_1) - g(y_2) < 2^{-n}$.
Since the function g is constant in each $E_{n,r}$ it follows that it is differentiable
with zero derivative at each point of $G$. One can easily
see that g increases at each point of C, and in fact the 'upper derivative'
at points of C is $+\infty$.
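The function g can be evaluated by reading ternary digits. The following is a minimal sketch in Python, under the assumption that it suffices to stop at the first ternary digit equal to 1 (where the argument falls in a removed interval, on which g is constant) and otherwise to halve each digit 0 or 2 into a binary digit; `cantor_g` and `depth` are illustrative names, and the value is a partial sum when no digit 1 occurs within `depth` digits.

```python
from fractions import Fraction

def cantor_g(y, depth=40):
    """Approximate g(y) for y in [0, 1] from the ternary digits of y."""
    y = Fraction(y)
    if y >= 1:
        return Fraction(1)
    value = Fraction(0)
    for i in range(1, depth + 1):
        y *= 3
        digit = int(y)            # ternary digit a_i in {0, 1, 2}
        y -= digit
        if digit == 1:
            # y lies in the closure of a removed interval: g is constant there
            return value + Fraction(1, 2 ** i)
        value += Fraction(digit // 2, 2 ** i)
    return value

grid = [Fraction(k, 27) for k in range(28)]
values = [cantor_g(y) for y in grid]
```

On the middle third the sketch returns 1/2, and on (1/9, 2/9) and (7/9, 8/9) it returns 1/4 and 3/4 respectively.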
Note that there is nothing magical about the integer 3 used in the
construction of C. Similar constructions using expansions to a
different base will give sets with similar properties.

Exercises 2.7
1. If $x = \sum_{i=1}^{\infty} c_i/10^i$ $(c_i = 0, 1, \ldots, 9)$ is a decimal expansion of real numbers
in [0, 1] and T is the set of such $x$ for which $c_i \ne 7$, show that T is perfect
and nowhere dense.
2. Construct a set which is dense in [0,1] and yet the union of a countable
class of nowhere dense perfect sets.
3. Show that the function $g\colon [0, 1] \to [0, 1]$ defined above satisfies a
Lipschitz condition of order $\alpha = \log 2/\log 3$, but not of any order $\beta > \alpha$.
(A function $h\colon I \to \mathbf{R}$ is said to satisfy a Lipschitz condition of order $\alpha$ at
$x_0 \in I$ if $|h(x) - h(x_0)| \le K |x - x_0|^{\alpha}$ for $x \in I$ and some suitable $K \in \mathbf{R}$.)

3.1 Types of set function
We consider only functions $\mu\colon \mathscr{C} \to \mathbf{R}^*$, where $\mathscr{C}$ is a non-empty
class of sets. Thus $\mu$ is a rule which determines, for each $E \in \mathscr{C}$, a
unique element $\mu(E)$ which is either a real number or $\pm\infty$. We always
assume that $\mathscr{C}$ contains the empty set $\emptyset$. $\mathbf{R}^*$ denotes the compactification
of the real number field $\mathbf{R}$ by the addition of two points
$+\infty$, $-\infty$, while $\mathbf{R}^+$ will denote the set of non-negative real numbers
together with $+\infty$. It is not possible to arrange for $\mathbf{R}^*$ to be an algebraic
field extending $\mathbf{R}$, though we will preserve as many of the algebraic
properties of $\mathbf{R}$ as possible by adopting the convention that, for
any $a \in \mathbf{R}$,
$-\infty < a < +\infty$, $-\infty \pm a = -\infty$, $+\infty \pm a = +\infty$;
if $a > 0$, $a(+\infty) = +\infty$, $a(-\infty) = -\infty$;
if $a = 0$, $a(\pm\infty) = 0$;
if $a < 0$, $a(+\infty) = -\infty$, $a(-\infty) = +\infty$;
$(+\infty)(+\infty) = +\infty$, $(+\infty)(-\infty) = -\infty$, $(-\infty)(-\infty) = +\infty$.
Thus the operation of dividing by $(\pm\infty)$ is not allowed and $(+\infty) + (-\infty)$
is not defined in $\mathbf{R}^*$. All these definitions are natural except for the
convention $0(\pm\infty) = 0$.
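These conventions differ from ordinary floating-point arithmetic in one place: IEEE floats give "not a number" for zero times infinity. The following is a minimal sketch in Python of arithmetic in $\mathbf{R}^*$, assuming floats with `math.inf` stand in for $\pm\infty$; the helper names `ext_mul` and `ext_add` are illustrative, and the sum of $+\infty$ and $-\infty$ is rejected because the text later treats it as impossible.

```python
import math

INF = math.inf

def ext_mul(a, b):
    """Product in R* with the convention 0 * (+-inf) = 0."""
    if a == 0 or b == 0:
        return 0.0
    return a * b

def ext_add(a, b):
    """Sum in R*; (+inf) + (-inf) has no meaning and raises ValueError."""
    if math.isinf(a) and math.isinf(b) and a != b:
        raise ValueError("(+inf) + (-inf) is not defined in R*")
    return a + b

examples = {
    "0 * inf": ext_mul(0, INF),
    "-2 * inf": ext_mul(-2, INF),
    "(-inf) * (-inf)": ext_mul(-INF, -INF),
    "inf + 5": ext_add(INF, 5),
}
```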
Arbitrary set functions $\mu\colon \mathscr{C} \to \mathbf{R}^*$ are not of much interest. We
adopt conditions on $\mu$ which correspond to our intuitive idea of mass
for a physical object (we generalise the notion to allow negative
masses). The first property we define corresponds to the notion that
the mass of a pair of disjoint objects is the sum of the masses of the
individual objects.

Additive set function

A set function $\mu\colon \mathscr{C} \to \mathbf{R}^*$ is said to be (finitely) additive if
(i) $\mu(\emptyset) = 0$,
(ii) for every finite collection $E_1, E_2, \ldots, E_n$ of disjoint sets of $\mathscr{C}$ such
that $\bigcup_{i=1}^{n} E_i \in \mathscr{C}$ we have
$\mu\left(\bigcup_{i=1}^{n} E_i\right) = \sum_{i=1}^{n} \mu(E_i)$. (3.1.1)
In this definition condition (i) is almost redundant for it will be
implied by (ii) provided there is at least one $E \in \mathscr{C}$ with $\mu(E)$ finite.
Note also that we do not in the definition assume that $\mathscr{C}$ is closed
under finite disjoint unions so that, in testing a given set function
$\mu\colon \mathscr{C} \to \mathbf{R}^*$ to see whether or not it is additive, we can only use sets
$E_i \in \mathscr{C}$ which are disjoint and have their union in $\mathscr{C}$. However, the
definition is taken to imply that the right-hand side of (3.1.1) has a
unique meaning in $\mathbf{R}^*$ so that in particular there are no sets $E$,
$F \in \mathscr{C}$ such that $E \cap F = \emptyset$, $E \cup F \in \mathscr{C}$, $\mu(E) = +\infty$, $\mu(F) = -\infty$.
The natural domain of definition for an additive set function $\mu$
is a ring since, if $\mathscr{C}$ is a ring,
$E_i \in \mathscr{C}$ $(i = 1, 2, \ldots, n) \Rightarrow \bigcup_{i=1}^{n} E_i \in \mathscr{C}$.
For a ring $\mathscr{C}$ it is worth noticing that $\mu\colon \mathscr{C} \to \mathbf{R}^*$ is additive if and only
if
$E, F \in \mathscr{C}$, $E \cap F = \emptyset \Rightarrow \mu(E \cup F) = \mu(E) + \mu(F)$,
since in this case the general result (3.1.1) can be obtained from the
result for two sets by a simple induction argument.
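On a finite ring the two-set test can be carried out exhaustively. The following is a minimal sketch in Python, assuming the ring is the class of all subsets of a three-point space and that sets are `frozenset` objects; the condition on the empty set is checked separately as a precaution, and `powerset`, `is_additive_on_ring` and `squared` are illustrative names.

```python
from itertools import combinations

def powerset(points):
    """All subsets of a finite set, as frozensets (a ring of sets)."""
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)]

def is_additive_on_ring(mu, ring):
    """Check mu(empty) = 0 and mu(E | F) = mu(E) + mu(F) for disjoint E, F."""
    if mu(frozenset()) != 0:
        return False
    return all(mu(E | F) == mu(E) + mu(F)
               for E in ring for F in ring if not (E & F))

ring = powerset({1, 2, 3})
counting = len                       # number of points in E
squared = lambda E: len(E) ** 2      # deliberately not additive

counting_ok = is_additive_on_ring(counting, ring)
squared_ok = is_additive_on_ring(squared, ring)
```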
When $\mathscr{C}$ is a finite class of sets it is easy to give examples of set
functions defined on $\mathscr{C}$ which are additive. We now give a number of
less trivial examples which will be useful for illustrating our later
definitions. In each case the reader should check that $\mu\colon \mathscr{C} \to \mathbf{R}^*$ is
additive.
Example 1. $\Omega$ any space with infinitely many points, $\mathscr{C}$ the class
of all subsets of $\Omega$. Define $\mu$ by
$\mu(E)$ = number of points in $E$, if $E$ is finite;
$\mu(E) = +\infty$, if $E$ is infinite.
Example 2. $\Omega$ any topological space, $\mathscr{C}$ the class of all subsets of $\Omega$.
Put $\mu(E) = 0$, if $E$ is of the first category in $\Omega$;
$\mu(E) = +\infty$, if $E$ is of the second category in $\Omega$.
Example 3. $\Omega = \mathbf{R}$, $\mathscr{C}$ the class of all finite intervals of $\mathbf{R}$. For
$E = [a, b]$ or $[a, b)$ or $(a, b]$ or $(a, b)$, put
$\mu(E) = b - a$.
Example 4. $\Omega$ is any space with at least two distinct points $n$, $s$;
$\mathscr{C}$ is the class of all subsets of $\Omega$.
$\mu(E) = 0$, if $E$ contains neither or both of $n$, $s$;
$\mu(E) = 1$, if $E$ contains $n$ but not $s$;
$\mu(E) = -1$, if $E$ contains $s$ but not $n$.
Example 5. $\Omega = (0, 1]$, the set of real numbers $x$ with $0 < x \le 1$,
$\mathscr{C}$ the class of half-open intervals $(a, b]$ where $0 \le a < b \le 1$.
$\mu(a, b] = b - a$ if $a \ne 0$;
$\mu(0, b] = +\infty$.
Example 6. $\Omega$ is any infinite space, $\mathscr{C}$ the class of all its subsets. Let
$x_1, x_2, \ldots, x_n, \ldots$ be an enumerable sequence of distinct points of $\Omega$,
and suppose $p_1, p_2, \ldots$ is a sequence of real numbers such that $\sum p_i$
either converges absolutely or is properly divergent to $+\infty$ or $-\infty$
(the case $\sum p_i$ convergent, $\sum |p_i|$ divergent is not allowed: why?). Put
$\mu(E) = \sum p_i$,
where the sum extends over all integers $i = 1, 2, \ldots$ for which $x_i \in E$.
Any set function which can be defined as in example 6 is called discrete.
Note that example (4) can be thought of as a special case of example (6).
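The remark that example (4) is a special case of example (6) can be made concrete. The following is a minimal sketch in Python, assuming only finitely many points carry a weight (so no question of convergence arises) and that the weights are stored in a dictionary; `discrete` and `mu4` are illustrative names.

```python
def discrete(weights):
    """Return mu with mu(E) = sum of weights[x] over the weighted points x in E."""
    def mu(E):
        return sum(w for x, w in weights.items() if x in E)
    return mu

# Example (4): weight +1 at the point n, weight -1 at the point s.
mu4 = discrete({"n": 1, "s": -1})

table = {
    "neither": mu4({"other"}),
    "both": mu4({"n", "s"}),
    "n only": mu4({"n", "other"}),
    "s only": mu4({"s"}),
}
```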
Although it is not sufficient to restrict our attention to set functions
$\mu\colon \mathscr{C} \to \mathbf{R}$ which are finite valued, the condition of additivity which is
usually assumed prevents $\mu$ from taking both the values $+\infty$, $-\infty$
at least when $\mathscr{C}$ is a ring. This is one of the results in the next theorem.
Theorem 3.1. Suppose $\tau\colon \mathscr{C} \to \mathbf{R}^*$ is an additive set function defined
on a ring $\mathscr{C}$ and $E, F \in \mathscr{C}$. Then
(i) if $E \supset F$ and $\tau(F)$ is finite
$\tau(E - F) = \tau(E) - \tau(F)$;
(ii) if $E \supset F$ and $\tau(F)$ is infinite
$\tau(E) = \tau(F)$;
(iii) if $\tau(E) = +\infty$, then $\tau(F) \ne -\infty$.
Proof. (i) Since $\mathscr{C}$ is a ring, $E - F \in \mathscr{C}$ and additivity implies, since
$F \cap (E - F) = \emptyset$,
$\tau(E) = \tau(E - F) + \tau(F)$. (3.1.2)
Subtracting the finite real number $\tau(F)$ gives the result.
(ii) If $\tau(F) = +\infty$, then (3.1.2) can only have a meaning if
$\tau(E - F) \ne -\infty$, and this implies $\tau(E) = +\infty$. The case $\tau(F) = -\infty$
is similar.
(iii) Since $E \cap F$, $E - F$, $F - E$ are disjoint sets of $\mathscr{C}$,
$\tau(E) = \tau(E \cap F) + \tau(E - F) = +\infty$,
$\tau(F) = \tau(E \cap F) + \tau(F - E) = -\infty$
could only have meaning if $\tau(E \cap F)$ is finite. But this would imply
$\tau(E - F) = +\infty$, $\tau(F - E) = -\infty$, and then, since $E \,\triangle\, F \in \mathscr{C}$,
$\tau(E \,\triangle\, F) = \tau(E - F) + \tau(F - E) = +\infty + (-\infty)$
which is impossible. ∎
Our definition of additivity means that for $\mu\colon \mathscr{C} \to \mathbf{R}^*$ to be additive
any set $E_0 \in \mathscr{C}$ which can be split into a finite number of disjoint subsets
in $\mathscr{C}$ must be such that $\mu(E_0)$ is the same as the sum of the values
of $\mu$ on the 'pieces'. We often want this to be true for a dissection of
$E_0$ into a countably infinite collection of subsets in $\mathscr{C}$.

σ-additive set function

A set function $\mu\colon \mathscr{C} \to \mathbf{R}^*$ is said to be σ-additive (sometimes
called completely additive, or countably additive) if
(i) $\mu(\emptyset) = 0$,
(ii) for any disjoint sequence $E_1, E_2, \ldots$ of sets of $\mathscr{C}$ such that
$E = \bigcup_{i=1}^{\infty} E_i \in \mathscr{C}$,
$\mu(E) = \sum_{i=1}^{\infty} \mu(E_i)$. (3.1.3)

As before the condition (i) is redundant if $\mu$ takes any finite values.
Since we may assume that all but a finite number of the sequence
$\{E_i\}$ are void it is clear that any set function which is σ-additive is
also additive. To see that the converse is not true it is sufficient to
consider example (5) on p. 53. Put
$E = (0, 1]$, $E_n = \left(\frac{1}{n+1}, \frac{1}{n}\right]$ $(n = 1, 2, \ldots)$;
then $\{E_n\}$ is a disjoint sequence in $\mathscr{C}$ whose union $E$ is in $\mathscr{C}$ but
$+\infty = \mu(E) \ne 1 = \sum_{n=1}^{\infty} \left(\frac{1}{n} - \frac{1}{n+1}\right) = \sum_{n=1}^{\infty} \mu(E_n)$.
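The failure of σ-additivity here is a telescoping sum, which can be checked exactly. The following is a minimal sketch in Python, assuming the pieces are $E_n = (1/(n+1), 1/n]$, the intervals whose lengths $1/n - 1/(n+1)$ appear in the displayed series; `mu5` and `partial_sum` are illustrative names.

```python
from fractions import Fraction
import math

def mu5(a, b):
    """Example (5): mu(a, b] = b - a if a != 0, and +infinity if a == 0."""
    return math.inf if a == 0 else b - a

def partial_sum(N):
    """Sum of mu(E_n) over E_n = (1/(n+1), 1/n] for n = 1..N."""
    return sum(mu5(Fraction(1, n + 1), Fraction(1, n))
               for n in range(1, N + 1))

whole = mu5(0, 1)          # mu(0, 1] = +infinity
s10 = partial_sum(10)      # telescopes to 1 - 1/11
s1000 = partial_sum(1000)  # telescopes to 1 - 1/1001
```

The partial sums increase to 1 and never approach the value $+\infty$ assigned to the union.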
Notice further that even when $\mathscr{C}$ is a ring it does not follow that
$E_i \in \mathscr{C}$ $(i = 1, 2, \ldots) \Rightarrow E = \bigcup_{i=1}^{\infty} E_i \in \mathscr{C}$;
so that in testing (3.1.3) we can only use those sets $E \in \mathscr{C}$ which can be
split into a countable sequence of disjoint subsets in $\mathscr{C}$. In particular
if $\mathscr{C}$ is a finite class of sets then additivity for $\mu\colon \mathscr{C} \to \mathbf{R}^*$ implies
σ-additivity. We also interpret (3.1.3) to mean that the right-hand side
is uniquely defined and independent of the order of the sets $E_i$;
thus if $\mu$ is σ-additive and $E = \bigcup E_i$ is such a decomposition,
we cannot have $\mu(E_i) = +\infty$, $\mu(E_j) = -\infty$, nor can the series in (3.1.3)
converge conditionally.
It is easy to check that each of the set functions in examples (1), (2),
(4), (6) on p. 52 is σ-additive, and the set function of example (3) is
also σ-additive though the proof of this fact is non-trivial. This proof
will be given in detail in § 3.4, as it is an essential step in the definition
of Lebesgue measure in $\mathbf{R}$.
Any non-negative set function $\mu\colon \mathscr{C} \to \mathbf{R}^+$ which is σ-additive is
called a measure on $\mathscr{C}$ ($\mathbf{R}^+ = \{x \in \mathbf{R}^* : x \ge 0\}$).
We should remark that there is not general agreement in the
literature as to which set functions ought to be called measures.
According to our definition the set functions in examples (1), (2)
and (3) are measures, those in (4) and (6) are not because $\mu$ can
take negative values while the set function in (5) is not because it is
not σ-additive.
The natural domain of definition of a measure, or indeed of any
σ-additive set function, is a σ-ring since then
$E_i \in \mathscr{C}$ $(i = 1, 2, \ldots) \Rightarrow \bigcup_{i=1}^{\infty} E_i \in \mathscr{C}$.
However, we will not restrict our consideration to set
functions already defined on a σ-ring.
Given a set function $\mu\colon \mathscr{C} \to \mathbf{R}^*$ where $\mathscr{C}$ is a ring it is usually quite
easy to check whether or not $\mu$ is additive for one only has to check
(3.1.1) for $n = 2$. In order to check that it is also σ-additive it is useful
to have a characterisation of σ-additive $\mu$ in terms of a continuity
condition for monotone sequences of sets. Since we have seen already
(theorem 3.1) that such set functions cannot attain both values
$+\infty$, $-\infty$ there will be no loss of generality in assuming that
$-\infty < \mu(E) \le +\infty$ for all $E \in \mathscr{C}$.

Suppose $\mathscr{R}$ is a ring and $\mu\colon \mathscr{R} \to \mathbf{R}^*$ is additive with $\mu(E) > -\infty$
for all $E \in \mathscr{R}$. Then for any $E \in \mathscr{R}$ we say that:
(i) $\mu$ is continuous from below at $E$ if
$\lim_{n \to \infty} \mu(E_n) = \mu(E)$ (3.1.4)
for every monotone increasing sequence $\{E_n\}$ of sets in $\mathscr{R}$ which converges
to $E$;
(ii) $\mu$ is continuous from above at $E$ if (3.1.4) is satisfied for any
monotone decreasing sequence $\{E_n\}$ in $\mathscr{R}$ with limit $E$ which is such
that $\mu(E_n) < \infty$ for some $n$;
(iii) $\mu$ is continuous at $E$ if it is continuous at $E$ from below and from
above (when $E = \emptyset$ the first requirement is trivially satisfied).
Theorem 3.2. Suppose $\mathscr{R}$ is a ring and $\mu\colon \mathscr{R} \to \mathbf{R}^*$ is additive with
$\mu(E) > -\infty$ for all $E \in \mathscr{R}$.
(i) If $\mu$ is σ-additive, then $\mu$ is continuous at $E$ for all $E \in \mathscr{R}$;
(ii) if $\mu$ is continuous from below at every set $E \in \mathscr{R}$, then $\mu$ is σ-additive;
(iii) if $\mu$ is finite and continuous from above at $\emptyset$, then $\mu$ is σ-additive.
Proof. (i) If $\mu(E_n) = +\infty$ for $n = N$ and $\{E_n\}$ is monotone increasing
then $\mu(E) = +\infty$ and $\mu(E_n) = +\infty$ for $n \ge N$ by theorem 3.1 (ii),
where $E = \lim E_n$. Thus in this case $\mu(E_n) \to \mu(E)$ as $n \to \infty$. On the
other hand, if $\mu(E_n) < \infty$ for all $n$ and $\{E_n\}$ increases to $E$, then
$E = E_1 \cup \bigcup_{n=1}^{\infty} (E_{n+1} - E_n)$
is a disjoint decomposition of $E$ and
$\mu(E) = \mu(E_1) + \sum_{n=1}^{\infty} \mu(E_{n+1} - E_n)$
$= \mu(E_1) + \lim_{N \to \infty} \sum_{n=1}^{N-1} \mu(E_{n+1} - E_n) = \lim_{N \to \infty} \mu(E_N)$,
since $\mu$ is additive on the ring $\mathscr{R}$. Thus $\mu$ is continuous from below at $E$.

Now suppose $\{E_n\}$ decreases to $E$ and $\mu(E_N) < +\infty$. Put
$F_n = E_N - E_n$ for $n \ge N$.
Then, by theorem 3.1 (ii), $\mu(F_n) < \infty$ and the sequence $\{F_n\}$ is monotone
increasing to $E_N - E$. Hence, as $n \to \infty$,
$\mu(F_n) \to \mu(E_N - E) = \mu(E_N) - \mu(E)$.
But $\mu(F_n) = \mu(E_N) - \mu(E_n)$ so that $\mu(E_n) \to \mu(E)$ as $n \to \infty$, since
$\mu(E_N)$ is finite, and $\mu$ is also continuous from above at $E$.
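(ii) Suppose $E \in \mathscr{R}$, $E_i \in \mathscr{R}$ $(i = 1, 2, \ldots)$ are such that $E = \bigcup_{i=1}^{\infty} E_i$
and the sets $E_i$ are disjoint. Put
$F_n = \bigcup_{i=1}^{n} E_i \in \mathscr{R}$ $(n = 1, 2, \ldots)$,
and $\{F_n\}$ is a monotone increasing sequence of sets in $\mathscr{R}$ which converges
to $E \in \mathscr{R}$. If $\mu$ is continuous from below at $E$,
$\sum_{i=1}^{n} \mu(E_i) = \mu(F_n) \to \mu(E)$ as $n \to \infty$
so that $\mu(E) = \sum_{i=1}^{\infty} \mu(E_i)$,
and $\mu$ is σ-additive.
(iii) In the notation of (ii) put
$G_n = E - F_n \in \mathscr{R}$ $(n = 1, 2, \ldots)$.
Then $\{G_n\}$ is a monotone decreasing sequence converging to $\emptyset$ and,
for $n = 1, 2, \ldots$,
$\mu(E) = \sum_{i=1}^{n} \mu(E_i) + \mu(G_n)$.
If $\mu$ is finite and continuous from above at $\emptyset$ we must have
$\mu(G_n) \to 0$ as $n \to \infty$
so that again $\mu(E) = \sum_{i=1}^{\infty} \mu(E_i)$. ∎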

Remark 1. In our definition of continuity from above we only
require to have $\mu(E_n) \to \mu(E)$ for those sequences $\{E_n\}$ which decrease
to $E$ for which $\mu(E_n)$ is finite for some $n$. To see that we could not relax
this finiteness condition, consider example (2) on p. 52, which we have
already seen to be σ-additive, with $\Omega = (0, 1)$. Then if
$E_n = (0, 1/n)$ $(n = 1, 2, \ldots)$
we have a sequence decreasing to $\emptyset$ such that $\mu(E_n) = +\infty$ for all
$n$ since $E_n$ is of the second category.
Remark 2. The condition that $\mu$ be finite cannot be omitted in
theorem 3.2 (iii). Consider example (5) on p. 53 which we saw was
additive but not σ-additive. Actually the class $\mathscr{C}$ of sets on which $\mu$
is defined is a semi-ring rather than a ring, but its definition can easily
be extended to the ring of finite disjoint unions of sets in $\mathscr{C}$ by using
theorem 3.4. It is easy to check that it will remain continuous from
above at $\emptyset$, but not σ-additive.
Part (iii) of theorem 3.2 will prove very useful in practice, especially
for finite valued set functions $\mu\colon \mathscr{R} \to \mathbf{R}^+$ which are non-negative and
additive on a ring $\mathscr{R}$. In order to prove that such a $\mu$ is a measure it is
sufficient to show that, if $\{E_n\}$ is any sequence of sets in $\mathscr{R}$ decreasing
to $\emptyset$,
$\mu(E_n) \to 0$ as $n \to \infty$. (3.1.5)
If (3.1.5) is false for some such sequence $\{E_n\}$ then, since $\mu(E_n)$ is
monotone decreasing we must have
$\mu(E_n) \to \delta > 0$ as $n \to \infty$. (3.1.6)
If we can establish a contradiction by assuming (3.1.6), then (3.1.5)
will be proved and we will have deduced that $\mu$ is a measure.
When we come to consider particular set functions one of our objectives
will be to define $\mu$ on as large a class $\mathscr{C}$ as possible. We will also
want $\mu$ to be σ-additive. It would be desirable to define $\mu$ on the class
of all subsets of $\Omega$, but unfortunately this is not possible if $\Omega$ is not
countable and $\mu$ is to have an interesting structure. In particular it
has been shown, using the continuum hypothesis, that it is impossible
to define a measure $\mu$ on all subsets of the real line such that (a) sets
consisting of a single point have zero measure (this eliminates discrete
set functions like examples (1), (4), (6) on pp. 52-3); (b) every set of infinite
measure has a subset of finite positive measure (this eliminates
example (2)); (c) the measure of the whole space is not zero. In
practice the method used is to define $\mu$ with desired properties on
a restricted class of sets $\mathscr{C}$ (as in examples (3) or (5)) and then extend
the definition to a larger class $\mathscr{D} \supset \mathscr{C}$.
Given two classes $\mathscr{C} \subset \mathscr{D}$ of subsets of $\Omega$ and set functions $\mu\colon \mathscr{C} \to \mathbf{R}^*$,
$\nu\colon \mathscr{D} \to \mathbf{R}^*$ we say that $\nu$ is an extension of $\mu$ if, for all $E \in \mathscr{C}$,
$\nu(E) = \mu(E)$;
under the same conditions we say that $\mu$ is the restriction of $\nu$ to $\mathscr{C}$.
It is sometimes appropriate (as in probability theory) to work
with set functions $\mu$ which are finite. However, most of the theorems
which can be proved for finite σ-additive set functions can also be
obtained with a condition slightly weaker than finiteness.

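σ-finite set function

A set function $\mu\colon \mathscr{C} \to \mathbf{R}^*$ is said to be σ-finite if, for each $E \in \mathscr{C}$,
there is a sequence of sets $C_i$ $(i = 1, 2, \ldots) \in \mathscr{C}$ such that $E \subset \bigcup_{i=1}^{\infty} C_i$
and $\mu(C_i)$ is finite for all $i$.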
In our examples, on p. 52, the set functions in (3) and (4) are
finite, (5) is σ-finite but not finite since $\mu(0, b] = +\infty$, (1) gives a
σ-finite measure if and only if $\Omega$ is countable,
(2) is not σ-finite if $\Omega$ is of the second category, (6) is finite if $\sum |p_i|$
converges and otherwise it is σ-finite.
Sometimes it is useful to relax the condition of additivity in order
to be able to define $\mu$ on the class of all subsets. The most common
example of this is in the concept of outer measure.

Outer measure
If $\mathscr{C}$ is the class of all subsets of $\Omega$, then $\mu\colon \mathscr{C} \to \mathbf{R}^+$ is called an outer
measure on $\Omega$ if
(i) $\mu(\emptyset) = 0$;
(ii) $\mu$ is monotone in the sense that $E \subset F \Rightarrow \mu(E) \le \mu(F)$;
(iii) $\mu$ is countably subadditive in the sense that for any sequence
$\{E_i\}$ of sets,
$E \subset \bigcup_{i=1}^{\infty} E_i \Rightarrow \mu(E) \le \sum_{i=1}^{\infty} \mu(E_i)$. (3.1.7)
Note that every measure on the class of all subsets of a space $\Omega$
is an outer measure on $\Omega$. However, it is not difficult to give examples
of outer measures which are not measures.
Example 7. $\Omega$ any space with more than one point. Put
$\mu(\emptyset) = 0$, $\mu(E) = 1$ for all $E \ne \emptyset$.
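Example 7 can be verified by brute force on a small space. The following is a minimal sketch in Python, assuming $\Omega$ has three points, so that countable subadditivity reduces to the two-set case together with monotonicity; `subsets`, `mu7` and the three flags are illustrative names.

```python
from itertools import combinations

def subsets(points):
    """All subsets of a finite set, as frozensets."""
    pts = list(points)
    return [frozenset(c) for r in range(len(pts) + 1)
            for c in combinations(pts, r)]

def mu7(E):
    """Example 7: 0 on the empty set, 1 on every non-empty set."""
    return 0 if not E else 1

S = subsets({"a", "b", "c"})

monotone = all(mu7(E) <= mu7(F) for E in S for F in S if E <= F)
subadditive = all(mu7(E | F) <= mu7(E) + mu7(F) for E in S for F in S)
additive = all(mu7(E | F) == mu7(E) + mu7(F)
               for E in S for F in S if not (E & F))
```

Additivity fails because two disjoint non-empty sets have values summing to 2 while their union has value 1.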
In this book we do not study the properties of outer measures for
their own sake, but we will use them as a tool to extend the definition
of measures.

Exercises 3.1
1. If $\Omega = [0, 1)$ and $\mathscr{C}$ consists of the 6 sets
0, Q, [0, '), [0, 1), [0, 1),
li(o)=0, u[0,1)=2, p[0,1)=2,
,a[0,1) = 4, fz[4,1) = 2,
lt(Q) = 4,
show that $\mu$ is additive on $\mathscr{C}$. Can $\mu$ be extended to an additive set function
on the ring generated by $\mathscr{C}$?
2. Show that if $\mathscr{R}$ is any finite ring of subsets of $\Omega$ and $\mu$ is additive on
$\mathscr{R}$ then $\mu$ is σ-additive on $\mathscr{R}$.
3. A set function $\mu\colon \mathscr{C} \to \mathbf{R}^*$ is said to be monotone if $\mu(\emptyset) = 0$ and
$E \subset F$, $E, F \in \mathscr{C} \Rightarrow \mu(E) \le \mu(F)$. Show that monotone set functions are
non-negative, and if $\mathscr{C}$ is a ring, show that an additive non-negative set function
is monotone. Of the set functions in examples (1)-(7), which are monotone?

4. $Z$ is the space of positive integers and $\sum_{n=1}^{\infty} a_n$ is a convergent series of
positive terms.
If $E$ is a finite subset of $Z$, put $\tau(E) = \sum_{n \in E} a_n$; if $E$ is an infinite subset of
$Z$, put $\tau(E) = +\infty$.
Show that $\tau$ is additive, but not σ-additive on the class of all subsets
of $Z$.
5. $Z$ is the space of positive integers; for $E \subset Z$ let $r_n(E)$ be the number
of integers in $E$ which are not greater than $n$. Let $\mathscr{C}$ be the class of subsets $E$
for which
$\lim_{n \to \infty} \frac{r_n(E)}{n} = \tau(E)$
exists. Show that $\tau$ is finitely additive, but not σ-additive on $\mathscr{C}$, but that
$\mathscr{C}$ is not even a semi-ring.
6. If $\mu$ is finitely additive on a ring $\mathscr{R}$; $E, F, G \in \mathscr{R}$ show
$\mu(E) + \mu(F) = \mu(E \cup F) + \mu(E \cap F)$,
$\mu(E) + \mu(F) + \mu(G) + \mu(E \cap F \cap G)$
$= \mu(E \cup F \cup G) + \mu(E \cap F) + \mu(F \cap G) + \mu(G \cap E)$.
State and prove a relationship of this kind for $n$ subsets of $\mathscr{R}$.
7. Suppose $\mathscr{S}$ is a σ-ring of subsets of $\Omega$, $\mu$ is a measure on $\mathscr{S}$. Show that
the class of sets $E \in \mathscr{S}$ with $\mu(E)$ finite forms a ring, and the class with $\mu(E)$
σ-finite forms a σ-ring.
8. If $E$ is a set in $\mathscr{S}$ of σ-finite $\mu$-measure (where $\mu$ is a measure on $\mathscr{S}$)
and $\mathscr{D} \subset \mathscr{S}$ where $\mathscr{D}$ is a class of disjoint subsets of $E$ show that the subclass
of those $D \in \mathscr{D}$ for which $\mu(D) > 0$ is countable.
9. State and prove a version of theorem 3.2 (i) for set functions defined
on a semi-ring $\mathscr{C}$.
10. To show that the finiteness condition in the definition of 'continuous
from above' in theorem 3.2 (i) cannot be relaxed, consider any infinite
space $\Omega$ and put
$\tau(E)$ = number of points in $E$, if $E$ is finite;
$\tau(E) = +\infty$, if $E$ is infinite.
Then $\tau$ is a measure on the class of all subsets of $\Omega$, but for any sequence $\{E_n\}$
of infinite sets which decreases to $\emptyset$, we do not have $\lim \tau(E_n) = 0$.
11. Suppose $\mathscr{P}$ is the semi-ring of half-open intervals $(a, b]$, $Q$ is the set
of rationals in (0, 1] and $\mathscr{P}_Q$ is the semi-ring of sets of the form $(a, b] \cap Q$. Put
$\mu\{(a, b] \cap Q\} = b - a$ if $0 \le a < b \le 1$.
Show that $\mu$ is additive on $\mathscr{P}_Q$ and is continuous above and below at every
set in $\mathscr{P}_Q$, but is not σ-additive. This shows that theorem 3.2 (ii), (iii) is not
true for semi-rings.

12. Show that if $\mu$ is an outer measure on $\Omega$, $E_0$ any fixed subset then
$\mu_0(E) = \mu(E \cap E_0)$ defines another outer measure on $\Omega$.

13. Show that if $\mu$, $\nu$ are outer measures on $\Omega$, so is $\tau$ defined by
$\tau(E) = \max [\mu(E), \nu(E)]$.
14. Suppose $\{\phi_n\}$ is a sequence of σ-additive set functions defined on a
σ-ring $\mathscr{S}$ and that $\lim_{n \to \infty} \phi_n(E) = \phi(E)$ exists for all $E$ in $\mathscr{S}$. Show that $\phi$
is finitely additive on $\mathscr{S}$. If either
(i) $\phi_n(E) \to \phi(E)$ uniformly on $\mathscr{S}$ with $\phi(E) > -\infty$ for all $E \in \mathscr{S}$; or
(ii) $\phi_1(E) > -\infty$, $\phi_n(E)$ monotone increasing for all $E \in \mathscr{S}$;
show that $\phi$ is σ-additive on $\mathscr{S}$.

3.2 Hahn-Jordan decompositions

When discussing σ-additive set functions we will usually restrict
our attention to the non-negative ones (which we call measures).
The present section justifies this procedure by showing that, under
reasonable conditions, a 'signed' set function $\mu\colon \mathscr{C} \to \mathbf{R}^*$ which is
completely additive can be expressed as the difference of two measures.
This means that properties of completely additive set functions
can be deduced from the corresponding properties of measures. There
are also versions of the decomposition theorem for finitely additive
set functions, but we will not consider these.
We have already seen (theorem 3.1 (iii)) that an additive set function
defined on a ring cannot take both the values $+\infty$, $-\infty$. If $\mathscr{S}$ is a
σ-ring and $\mu\colon \mathscr{S} \to \mathbf{R}^*$ is completely additive then for any sequence
$\{E_i\}$ of disjoint sets in $\mathscr{S}$,
$\mu\left(\bigcup_{i=1}^{\infty} E_i\right) = \sum_{i=1}^{\infty} \mu(E_i)$.
Since $\bigcup E_i$ is independent of the order of the sets in the sequence,
it follows that the series on the right-hand side must be either
absolutely convergent or properly divergent. In the case of example (6)
the set function
$\mu(E) = \sum_{x_i \in E} p_i$
can be decomposed
$\mu(E) = \mu_+(E) - \mu_-(E)$,
where
$\mu_+(E) = \sum_{x_i \in E} \max(0, p_i)$, $\mu_-(E) = -\sum_{x_i \in E} \min(0, p_i)$
so that $\mu_+$, $\mu_-$ are measures of which at least one is finite. Further
if we put $P = \{x : \mu\{x\} > 0\}$, $N = \Omega - P$ we have $\mu_+(E) = \mu(P \cap E)$,
$\mu_-(E) = -\mu(N \cap E)$ for all $E \subset \Omega$, so that the decomposition into the
difference of two measures can also be obtained by splitting $\Omega$ into
two subsets $P$, $N$ such that $\mu$ is non-negative on every subset of $P$
and non-positive on every subset of $N$. These two aspects of the
decomposition are true in general, provided $\mathscr{C}$ is a σ-field.
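For a discrete set function both aspects of the decomposition can be computed directly. The following is a minimal sketch in Python, assuming finitely many weighted points with finite weights stored in a dictionary (the sample weights are made up for illustration); `jordan_hahn`, `mu_plus`, `mu_minus` and `P` are illustrative names.

```python
def jordan_hahn(weights):
    """For mu(E) = sum of weights over E, return (mu_plus, mu_minus, P),
    where P is the set of points carrying strictly positive weight."""
    P = {x for x, p in weights.items() if p > 0}

    def mu_plus(E):
        return sum(max(0, p) for x, p in weights.items() if x in E)

    def mu_minus(E):
        return -sum(min(0, p) for x, p in weights.items() if x in E)

    return mu_plus, mu_minus, P

weights = {1: 2.0, 2: -0.5, 3: 1.5, 4: -3.0}   # hypothetical sample weights
mu = lambda E: sum(p for x, p in weights.items() if x in E)
mu_plus, mu_minus, P = jordan_hahn(weights)

E = {1, 2, 4}
difference = mu_plus(E) - mu_minus(E)   # should equal mu(E)
on_P = mu(E & P)                        # should equal mu_plus(E)
on_N = -mu(E - P)                       # should equal mu_minus(E)
```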

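Theorem 3.3. Given a completely additive $\tau\colon \mathscr{F} \to \mathbf{R}^*$ defined on a σ-field
$\mathscr{F}$, there are measures $\tau_+$ and $\tau_-$ defined on $\mathscr{F}$ and subsets $P$, $N$ in $\mathscr{F}$
such that $P \cup N = \Omega$, $P \cap N = \emptyset$ and for each $E \in \mathscr{F}$,
$\tau_+(E) = \tau(E \cap P) \ge 0$, $\tau_-(E) = -\tau(E \cap N) \ge 0$,
$\tau(E) = \tau_+(E) - \tau_-(E)$;
so that $\tau$ is the difference of two measures $\tau_+$, $\tau_-$ on $\mathscr{F}$. At least one of
$\tau_+$, $\tau_-$ is finite and, if $\tau$ is finite or σ-finite so are both $\tau_+$, $\tau_-$.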
Proof. Since $\tau$ can take at most one of the values $+\infty$, $-\infty$ we
may assume without loss of generality that, for all $E \in \mathscr{F}$,
$-\infty < \tau(E) \le +\infty$.
We first prove that, if $E \in \mathscr{F}$ and
$\lambda(E) = \inf_{B \subset E,\, B \in \mathscr{F}} \tau(B)$, (3.2.1)
then $\lambda(\Omega) \ne -\infty$. If this is false then there is a set $B_1 \in \mathscr{F}$ for which
$\tau(B_1) < -1$. At least one of $\lambda(B_1)$, $\lambda(\Omega - B_1)$ must be $-\infty$; since
$\lambda(A \cup B) \ge \lambda(A) + \lambda(B)$ if $A$, $B$ are disjoint sets of $\mathscr{F}$. Put $A_1$ equal
to $B_1$ if $\lambda(B_1) = -\infty$ and $(\Omega - B_1)$ otherwise. Proceed by induction.
For each positive integer $n$, choose $B_{n+1} \subset A_n$ such that
$\tau(B_{n+1}) < -(n+1)$.
If $\lambda(B_{n+1}) = -\infty$, put $A_{n+1} = B_{n+1}$; otherwise put $A_{n+1} = A_n - B_{n+1}$.
Then $\lambda(A_{n+1}) = -\infty$.
There are two possible cases:
(i) for infinitely many integers $n$, $A_n = A_{n-1} - B_n$;
(ii) for $n \ge n_0$, $A_n = B_n$.
In case (i) there is a subsequence $\{B_{n_i}\}$ of disjoint sets and
$\tau\left(\bigcup_{i=1}^{\infty} B_{n_i}\right) = \sum_{i=1}^{\infty} \tau(B_{n_i}) \le \sum_{i=1}^{\infty} (-n_i) = -\infty$,
so that $\tau$ takes the value
$-\infty$ on $E = \bigcup_{i=1}^{\infty} B_{n_i} \in \mathscr{F}$,
contrary to assumption. In case (ii), the set
$E = \bigcap_{n=n_0}^{\infty} B_n \in \mathscr{F}$
and since $\{B_n\}$ is then a decreasing sequence of sets we have
$\tau(E) = \lim_{n \to \infty} \tau(B_n) = -\infty$
again giving a contradiction.
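Since $\tau(\emptyset) = 0$, $\lambda(\Omega) \le 0$ so that $\lambda = \lambda(\Omega)$ is finite and we can find
a sequence $\{C_n\}$ of sets in $\mathscr{F}$ for which
$\tau(C_n) < \lambda + 2^{-n}$.

Now consider the set $C_n \cap C_{n+1}$. By noting that
$C_n \cup C_{n+1} = (C_n - C_n \cap C_{n+1}) \cup (C_{n+1} - C_n \cap C_{n+1}) \cup (C_n \cap C_{n+1})$
is a decomposition into disjoint sets, it follows that
$\tau(C_n \cap C_{n+1}) = \tau(C_n) + \tau(C_{n+1}) - \tau(C_n \cup C_{n+1})$
$< \lambda + 2^{-n} + \lambda + 2^{-n-1} - \lambda = \lambda + 2^{-n} + 2^{-n-1}$.
This argument can be repeated to the pair of sets $(C_n \cap C_{n+1})$ and $C_{n+2}$:
by induction it can be proved that
$\tau\left(\bigcap_{r=n}^{m} C_r\right) < \lambda + \sum_{r=n}^{m} 2^{-r} < \lambda + 2^{1-n}$.
If we put $D_n = \bigcap_{r=n}^{\infty} C_r$ we have $D_n \in \mathscr{F}$ and, by theorem 3.2 (i),
$\tau(D_n) \le \lambda + 2^{1-n}$.
But now $\{D_n\}$ is a monotone increasing sequence of sets in $\mathscr{F}$ so that
$N = \lim_{n \to \infty} D_n = \liminf_{n \to \infty} C_n \in \mathscr{F}$,
and $\tau(N) = \lambda$,
by theorem 3.2 (i).
Finally, put $P = \Omega - N$. If $E \in \mathscr{F}$, $E \subset P$ we must have $\tau(E) \ge 0$
for, if $\tau(E) < 0$, then $\tau(E \cup N) = \tau(E) + \tau(N) < \lambda$. Also if $E \in \mathscr{F}$,
$E \subset N$ we must have $\tau(E) \le 0$ for, if $\tau(E) > 0$, then
$\tau(N - E) = \tau(N) - \tau(E) < \lambda$.
If we now put
$\tau_+(E) = \tau(E \cap P)$, $\tau_-(E) = -\tau(E \cap N)$,
it is clear that all the conditions of the theorem are satisfied. ∎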
Remark. It is usual to call the decomposition T = T+-T_, of T
into the difference of two measures, the Jordan decomposition while
the decomposition of $\Omega$ into positive and negative sets $P$ and $N$ is
called the Hahn decomposition. It is not difficult to show that the
Jordan decomposition is unique while the sets P, N are not uniquely
determined by $\tau$ unless $\tau(E) \ne 0$ for all $E \in \mathscr{F}$ such that $E \ne \emptyset$ and
$\tau(F) = 0$ or $\tau(E)$ for every $F \subset E$ with $F \in \mathscr{F}$. It is further clear that
$\tau_-(E) = -\lambda(E)$,
$\tau_+(E) = \sup_{B \subset E,\, B \in \mathscr{F}} \tau(B)$ (3.2.2)
under the conditions of theorem 3.3, where $\lambda(E)$ is given by (3.2.1).
If one is given a σ-additive set function on a σ-ring $\mathscr{S}$
which is not a σ-field then it is not, in general, possible to obtain the
Hahn decomposition, but the Jordan decomposition is still possible,
using (3.2.1), (3.2.2) as the definition of $\tau_-$, $\tau_+$.

Exercises 3.2
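1. If $\phi\colon \mathscr{S} \to \mathbf{R}^*$ is σ-additive on a σ-ring $\mathscr{S}$, show that, for any $E \in \mathscr{S}$,
there are sets $A \subset E$, $B \subset E$ with $A, B \in \mathscr{S}$ such that
$\phi(A) = \inf_{C \subset E} \phi(C)$, $\phi(B) = \sup_{C \subset E} \phi(C)$.
2. Show that, given a (finitely) additive $\mu\colon \mathscr{R} \to \mathbf{R}^*$ defined on a σ-ring $\mathscr{R}$
and taking finite values, there is a decomposition
$\mu(E) = \mu_+(E) - \mu_-(E)$
of $\mu$ into the difference of two non-negative additive set functions on $\mathscr{R}$.
3. The set $E_0 \in \mathscr{C}$ is said to be an atom of a set function $\phi\colon \mathscr{C} \to \mathbf{R}^*$ if
$\phi(E_0) \ne 0$ and for every $E \subset E_0$, $E \in \mathscr{C}$; $\phi(E) = 0$ or $\phi(E_0)$. Write down the
atoms of the set functions of examples (4) and (6) on page 53.
4. A set function $\phi\colon \mathscr{C} \to \mathbf{R}^*$ is said to be non-atomic if it has no atoms.
Suppose $\phi\colon \mathscr{F} \to \mathbf{R}^*$ is σ-additive on the σ-field $\mathscr{F}$, non-atomic, and finite
valued. Show that for any $A \in \mathscr{F}$, $\phi$ takes every real value between $-\phi_-(A)$
and $\phi_+(A)$ for some subset $E \subset A$.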

3.3 Additive set functions on a ring

In order to simplify the arguments we now consider only non-negative
set functions $\mu\colon \mathscr{C} \to \mathbf{R}^+$. It is often possible, for a given ring
$\mathscr{R}$, to find a semi-ring $\mathscr{C} \subset \mathscr{R}$ such that $\mathscr{R}$ is the ring generated by $\mathscr{C}$.
We saw (see § 1.5) that the sets of $\mathscr{R}$ can then be expressed in terms of
the sets of $\mathscr{C}$, so it is natural to ask whether in these circumstances
a set function $\mu\colon \mathscr{C} \to \mathbf{R}^+$ can be extended to $\mu\colon \mathscr{R} \to \mathbf{R}^+$. We now prove
that, if $\mu$ is additive on $\mathscr{C}$, this is always possible and that the result is
unique.
Theorem 3.4. If $\mu\colon \mathscr{C} \to \mathbf{R}^+$ is a non-negative additive set function defined
on a semi-ring $\mathscr{C}$, then there is a unique additive set function $\nu$ defined
on the generated ring $\mathscr{R} = \mathscr{R}(\mathscr{C})$ such that $\nu$ is an extension of $\mu$. $\nu$ is
non-negative on $\mathscr{R}$, and is called the extension of $\mu$ from $\mathscr{C}$ to $\mathscr{R}(\mathscr{C})$.
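Proof. Suppose $A$ is any set of $\mathscr{R} = \mathscr{R}(\mathscr{C})$, then by theorem 1.4,
$A = \bigcup_{k=1}^{n} E_k$ where the sets $E_k$ are disjoint and $E_k \in \mathscr{C}$. Define
$\nu(A) = \sum_{k=1}^{n} \mu(E_k)$. (3.3.1)
Since for any $a, b \in \mathbf{R}^+$, $a + b$ is always defined, the right-hand side of
(3.3.1) defines a number in $\mathbf{R}^+$. $\nu$ is thus defined on $\mathscr{R}$ provided we can
show that (3.3.1) gives the same result for any two decompositions
of $A$ into disjoint subsets in $\mathscr{C}$.
Suppose
$A = \bigcup_{k=1}^{n} E_k = \bigcup_{j=1}^{m} F_j$
where $F_j \in \mathscr{C}$ and are disjoint. Put $H_{kj} = E_k \cap F_j$. Then since $\mathscr{C}$ is a
semi-ring the sets $H_{kj} \in \mathscr{C}$, are disjoint and
$E_k = \bigcup_{j=1}^{m} H_{kj}$ $(k = 1, 2, \ldots, n)$;
$F_j = \bigcup_{k=1}^{n} H_{kj}$ $(j = 1, 2, \ldots, m)$;
so that, since $\mu$ is additive on $\mathscr{C}$,
$\sum_{k=1}^{n} \mu(E_k) = \sum_{k=1}^{n} \sum_{j=1}^{m} \mu(H_{kj}) = \sum_{j=1}^{m} \sum_{k=1}^{n} \mu(H_{kj}) = \sum_{j=1}^{m} \mu(F_j)$
and it makes no difference which decomposition is used with (3.3.1)
to define $\nu(A)$.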

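If $A_1$, $A_2$ are disjoint sets of $\mathscr{R}$, and
$$A_1 = \bigcup_{k=1}^{n} E_k, \qquad A_2 = \bigcup_{j=1}^{m} F_j,$$
then $A_1\cup A_2$ is a set of $\mathscr{R}$ with a possible decomposition into disjoint
subsets of $\mathscr{C}$ given by
$$A_1\cup A_2 = \bigcup_{k=1}^{n} E_k \cup \bigcup_{j=1}^{m} F_j.$$
Hence
$$\nu(A_1\cup A_2) = \sum_{k=1}^{n}\mu(E_k) + \sum_{j=1}^{m}\mu(F_j) = \nu(A_1) + \nu(A_2),$$
since $\nu$ is uniquely determined by (3.3.1). Thus $\nu$ is finitely additive
on $\mathscr{R}$ (since $\mathscr{R}$ is a ring). It is obvious that $\nu$ is non-negative.
Now let $\tau$ be any extension of $\mu$ from $\mathscr{C}$ to $\mathscr{R}$ which is additive. If
$A\in\mathscr{R}$ and $A = \bigcup_{k=1}^{n} E_k$ is a decomposition into disjoint sets of $\mathscr{C}$,
$$\tau(A) = \sum \tau(E_k), \quad\text{since } \tau \text{ is additive;}$$
$$\phantom{\tau(A)} = \sum \mu(E_k), \quad\text{since } \tau \text{ is an extension;}$$
$$\phantom{\tau(A)} = \nu(A) \quad\text{by (3.3.1).}$$
Thus $\nu$ is the unique additive extension of $\mu$ from $\mathscr{C}$ to $\mathscr{R}$.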
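The construction in theorem 3.4 can be tried out on a finite space. The following is a minimal sketch, assuming the semi-ring and values of exercise 1 of Exercises 3.3 (with the six listed values read in the order of the six listed sets); the helper names `disjoint_decompositions`, `generated_ring` and `extend` are illustrative only and do not come from the text. It defines $\nu$ by (3.3.1) and checks that every disjoint decomposition of a set gives the same value.

```python
from itertools import combinations

# Semi-ring and values taken from exercise 1 of Exercises 3.3 (Omega = {1,...,5}).
OMEGA = frozenset({1, 2, 3, 4, 5})
mu = {
    frozenset(): 0,
    OMEGA: 3,
    frozenset({1}): 1,
    frozenset({2, 3}): 1,
    frozenset({1, 2, 3}): 2,
    frozenset({4, 5}): 1,
}

def disjoint_decompositions(target, sets):
    """Yield every way of writing `target` as a disjoint union of non-empty sets."""
    candidates = [s for s in sets if s and s <= target]
    for r in range(len(candidates) + 1):
        for combo in combinations(candidates, r):
            covers = frozenset().union(*combo) == target
            # Sizes add up exactly when the covering sets are pairwise disjoint.
            if covers and sum(len(s) for s in combo) == len(target):
                yield combo

def generated_ring(sets):
    """Close a finite class of sets under union and difference."""
    ring = set(sets)
    while True:
        new = {a | b for a in ring for b in ring} | {a - b for a in ring for b in ring}
        if new <= ring:
            return ring
        ring |= new

def extend(mu):
    """Return nu on the generated ring, defined as in (3.3.1)."""
    nu = {}
    for a in generated_ring(mu):
        values = {sum(mu[e] for e in combo) for combo in disjoint_decompositions(a, mu)}
        assert len(values) == 1, "decompositions disagree, so mu is not additive"
        nu[a] = values.pop()
    return nu

nu = extend(mu)
```

In this small case the generated ring consists of the eight unions of the sets {1}, {2, 3} and {4, 5}, and the assertion inside `extend` is exactly the independence of decomposition established in the proof.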
If we start with a measure $\mu:\mathscr{C}\to\mathbf{R}^+$ on a semi-ring $\mathscr{C}$, then $\mu$
is clearly a non-negative finitely additive set function, and so possesses
a unique additive extension to the generated ring $\mathscr{R}$. What can we
say about this extension?
Theorem 3.5. If $\mu:\mathscr{C}\to\mathbf{R}^+$ is a measure defined on a semi-ring $\mathscr{C}$,
then the (unique) additive extension of $\mu$ to the generated ring $\mathscr{R}(\mathscr{C})$ is also
a measure.
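Proof. In the last theorem we discovered the form of the unique
additive extension $\nu$ of $\mu$ from $\mathscr{C}$ to $\mathscr{R}$. It is sufficient to show that
$\nu$ is $\sigma$-additive on $\mathscr{R}$. Suppose $E\in\mathscr{R}$, $E_k$ $(k = 1, 2, \ldots)\in\mathscr{R}$ and are
disjoint, and $E = \bigcup_{k=1}^{\infty} E_k$.
Put
$$E = \bigcup_{r=1}^{n} A_r, \quad A_r \text{ disjoint sets of } \mathscr{C};$$
$$E_k = \bigcup_{i=1}^{n_k} B_{ki}, \quad B_{ki} \text{ disjoint sets of } \mathscr{C};$$
and $C_{rki} = A_r\cap B_{ki}$. Then $\{C_{rki}\}$ forms a disjoint collection of sets in $\mathscr{C}$, and
$$A_r = \bigcup_{k=1}^{\infty}\bigcup_{i=1}^{n_k} C_{rki}, \qquad B_{ki} = \bigcup_{r=1}^{n} C_{rki}$$
are disjoint decompositions into sets of $\mathscr{C}$. Since $\mu$ is additive on $\mathscr{C}$,
$$\mu(B_{ki}) = \sum_{r=1}^{n}\mu(C_{rki}),$$
and since it is $\sigma$-additive on $\mathscr{C}$,
$$\mu(A_r) = \sum_{k=1}^{\infty}\sum_{i=1}^{n_k}\mu(C_{rki}).$$
Since the order of summation of double series of non-negative terms
makes no difference, we have
$$\nu(E) = \sum_{r=1}^{n}\mu(A_r) = \sum_{r=1}^{n}\sum_{k=1}^{\infty}\sum_{i=1}^{n_k}\mu(C_{rki})$$
$$\phantom{\nu(E)} = \sum_{k=1}^{\infty}\sum_{i=1}^{n_k}\sum_{r=1}^{n}\mu(C_{rki})$$
$$\phantom{\nu(E)} = \sum_{k=1}^{\infty}\sum_{i=1}^{n_k}\mu(B_{ki}) = \sum_{k=1}^{\infty}\nu(E_k).$$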
The above theorem gives one method of obtaining a measure on a
ring: it is sufficient to define a measure on any semi-ring which
generates the given ring. The extension to the generated ring is easily
carried out, is unique, and gives a measure. There are circumstances
in which one can define a set function $\mu$ directly on a ring so that it is
easy to see that $\mu$ is non-negative and additive. Under these circumstances
one can often use theorem 3.2 as a criterion for determining
whether or not $\mu$ is a measure. Another useful criterion is given by
the following theorem.
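Theorem 3.6. Suppose $\mu:\mathscr{R}\to\mathbf{R}^+$ is non-negative and additive on a
ring $\mathscr{R}$. Then
(i) if $E\in\mathscr{R}$, and $\{E_i\}$ is a sequence of disjoint sets of $\mathscr{R}$ such that
$\bigcup_{i=1}^{\infty} E_i\subset E$, then
$$\mu(E) \geq \sum_{i=1}^{\infty}\mu(E_i);$$
(ii) $\mu$ is a measure if and only if for any sequence $\{E_i\}$ of sets in $\mathscr{R}$
such that $\bigcup_{i=1}^{\infty} E_i\supset E\in\mathscr{R}$,
$$\mu(E) \leq \sum_{i=1}^{\infty}\mu(E_i).$$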



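Proof (i). For each positive integer $n$,
$$\bigcup_{i=1}^{n} E_i\subset E \quad\text{so that}\quad \mu(E) \geq \mu\Big(\bigcup_{i=1}^{n} E_i\Big) = \sum_{i=1}^{n}\mu(E_i),$$
since $\mu$ is additive. Hence
$$\mu(E) \geq \sum_{i=1}^{\infty}\mu(E_i).$$
(ii) First, suppose that $\mu$ is a measure. Put
$$F_i = E\cap E_i \ (i = 1, 2, \ldots); \qquad G_1 = F_1,$$
and
$$G_n = F_n - \bigcup_{i=1}^{n-1} F_i \quad (n = 2, 3, \ldots).$$
Then $\{G_n\}$ is a sequence of disjoint sets of $\mathscr{R}$ such that
$$\bigcup_{n=1}^{\infty} G_n = \bigcup_{n=1}^{\infty} F_n = E.$$
Thus
$$\mu(E) = \mu\Big(\bigcup_{i=1}^{\infty} G_i\Big) = \sum_{i=1}^{\infty}\mu(G_i) \leq \sum_{i=1}^{\infty}\mu(F_i) \leq \sum_{i=1}^{\infty}\mu(E_i),$$
since $\mu$ is $\sigma$-additive and non-negative.
Conversely if it is known that $\mu$ is additive and $E = \bigcup_{i=1}^{\infty} E_i$ is a
disjoint decomposition of $E\in\mathscr{R}$ into sets in $\mathscr{R}$, by (i)
$$\mu(E) \geq \sum_{i=1}^{\infty}\mu(E_i);$$
and if the condition in (ii) is satisfied,
$$\mu(E) \leq \sum_{i=1}^{\infty}\mu(E_i),$$
so that we must have
$$\mu(E) = \sum_{i=1}^{\infty}\mu(E_i)$$
and $\mu$ is a measure on $\mathscr{R}$.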

Exercises 3.3
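1. If $\Omega = \{1, 2, 3, 4, 5\}$, show that $\mathscr{C}$ consisting of $\varnothing$, $\Omega$, $\{1\}$, $\{2, 3\}$,
$\{1, 2, 3\}$, $\{4, 5\}$ is a semi-ring and that 0, 3, 1, 1, 2, 1 defines a set of values
for an additive set function $\mu$ on $\mathscr{C}$. What is the ring $\mathscr{R}$ generated by $\mathscr{C}$?
Find the additive extension of $\mu$ to $\mathscr{R}$, and show that it is a measure.
2. Suppose $\mathscr{R}$ is any ring of subsets, $\phi:\mathscr{R}\to\mathbf{R}^+$ is non-negative, finitely
additive on $\mathscr{R}$, and $\mu:\mathscr{R}\to\mathbf{R}^+$ is a measure on $\mathscr{R}$ such that, for any sequence
$\{E_n\}$ of sets in $\mathscr{R}$,
$$\mu(E_n)\to 0 \Rightarrow \phi(E_n)\to 0 \quad\text{as } n\to\infty;$$
show that $\phi$ is completely additive.
3. If $\mu:\mathscr{R}\to\mathbf{R}^+$ is finitely additive on a ring $\mathscr{R}$ and $E, F\in\mathscr{R}$ are such
that $\mu(E\,\triangle\,F) = 0$, we say that $E\sim F$. Show that $\sim$ is an equivalence
relation in $\mathscr{R}$ and that
$$E\sim F \Rightarrow \mu(E) = \mu(F) = \mu(E\cup F) = \mu(E\cap F).$$
Is the class of all sets $E\in\mathscr{R}$ for which $E\sim\varnothing$ a ring?
4. In the notation of question 3, put $\rho(E, F) = \mu(E\,\triangle\,F)$ and show that
$\rho(E, F)\geq 0$, $\rho(E, F) = \rho(F, E)$, $\rho(E, F)\leq\rho(E, G)+\rho(G, F)$. If $E_1\sim E_2$,
$F_1\sim F_2$ are all sets in $\mathscr{R}$, show that $\rho(E_1, F_1) = \rho(E_2, F_2)$. Does $\rho$ define a
metric in $\mathscr{R}$?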

3.4 Length, area and volume of elementary figures

In § 1.5 we saw that:
(i) In $\mathbf{R} = \mathbf{R}^1$ (Euclidean 1-space) the class $\mathscr{P} = \mathscr{P}^1$ of half-open
intervals $(a, b]$ forms a semi-ring which generates the ring
$\mathscr{E}$ of elementary figures (sets $E$ of the form $E = \bigcup_{i=1}^{n} (a_i, b_i]$ with
the intervals $(a_i, b_i]$ disjoint).
(ii) In $\mathbf{R}^k$ the half-open intervals have the form $\{(x_1, x_2, \ldots, x_k):
a_i < x_i \leq b_i,\ i = 1, 2, \ldots, k\}$ and they again form a semi-ring $\mathscr{P}^k$ which
generates the ring $\mathscr{E}^k$ of elementary figures (sets which can be expressed
as a finite union of disjoint sets of $\mathscr{P}^k$).
Instead of using the terms length (for $k = 1$), area (for $k = 2$) and
volume (for $k\geq 3$) of an interval we will use the same word `length'
in each case. Thus the `length' of an interval of $\mathscr{P}^k$ will be the product
of the lengths of $k$ perpendicular edges.
$$\mu(a, b] = b - a,$$
$$\mu\{(x_1, \ldots, x_k) : a_i < x_i \leq b_i,\ i = 1, 2, \ldots, k\} = \prod_{i=1}^{k} (b_i - a_i).$$
Thus for each $k$ we have defined a set function
$$\mu:\mathscr{P}^k\to\mathbf{R}^+$$
which has the usual physical meaning of length, area or volume. Historically
this set function and its extension to a larger class of subsets
of $\mathbf{R}^k$ was the first to be studied; it leads quickly to the definition of
Lebesgue measure in $\mathbf{R}^k$. Our object in the present section is to show
that the set function obtained by extending $\mu$ from $\mathscr{P}^k$ to $\mathscr{E}^k$ is a
measure on $\mathscr{E}^k$. There are essentially two distinct methods of doing
this, and both will work for each $k$. In both it is necessary to show that
$\mu$ is additive on $\mathscr{P}^k$ so that it has a unique extension to an additive
set function on $\mathscr{E}^k$. Then one can either make use of the continuity
theorem 3.2 to show that $\mu:\mathscr{E}^k\to\mathbf{R}^+$ is a measure on $\mathscr{E}^k$, or one can
prove directly that $\mu$ is a measure on $\mathscr{P}^k$ and appeal to theorem 3.5
to deduce that its extension is also a measure. We illustrate by applying
the first method to the case $k = 1$, and the second method to the
case $k = 2$.
For each $(a, b]\in\mathscr{P}$ we put $\mu(a, b] = b - a$. It follows that $\mu$ is additive
on $\mathscr{P}$, for if $(a, b] = \bigcup_{i=1}^{n} (a_i, b_i]$ and the $(a_i, b_i]$ are disjoint we may assume
that these intervals are ordered so that $b_i\leq a_{i+1}$ $(i = 1, 2, \ldots, n-1)$. It
follows that we must have $a_1 = a$, $b_n = b$ and $b_i = a_{i+1}$ $(i = 1, 2, \ldots, n-1)$
so that, if $a_{n+1} = b_n$,
$$\sum_{i=1}^{n}\mu(a_i, b_i] = \sum_{i=1}^{n}(b_i - a_i) = \sum_{i=1}^{n}(a_{i+1} - a_i) = (b - a) = \mu(a, b].$$
By theorem 3.4 there is a unique additive extension $\mu:\mathscr{E}\to\mathbf{R}^+$ since
$\mathscr{E}$ is the smallest ring containing the semi-ring $\mathscr{P}$. Since $\mu$ is finite
on $\mathscr{E}$ it will follow from theorem 3.2 (iii) that $\mu$ is a measure, if we
can prove that $\mu$ is continuous from above at $\varnothing$.
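The telescoping argument for additivity on $\mathscr{P}$ can be checked numerically. The following is a small sketch under stated assumptions: a half-open interval $(a, b]$ is represented as a pair of exact fractions, and the function names `length` and `is_decomposition` are illustrative only.

```python
from fractions import Fraction

def length(interval):
    """mu(a, b] = b - a for a half-open interval given as the pair (a, b)."""
    a, b = interval
    return b - a

def is_decomposition(whole, pieces):
    """Check that the half-open pieces (a_i, b_i] fit end to end and fill `whole`."""
    pieces = sorted(pieces)
    a, b = whole
    if pieces[0][0] != a or pieces[-1][1] != b:
        return False
    # Disjoint pieces covering (a, b] must satisfy b_i = a_{i+1}.
    return all(p[1] == q[0] for p, q in zip(pieces, pieces[1:]))

whole = (Fraction(0), Fraction(1))
cuts = [Fraction(0), Fraction(1, 7), Fraction(1, 3), Fraction(5, 8), Fraction(1)]
pieces = list(zip(cuts, cuts[1:]))

# The sum telescopes to b - a, as in the displayed calculation.
total = sum(length(p) for p in pieces)
```

Exact rational arithmetic is used so that the equality of the two sides is tested without rounding error.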
Suppose this is false; then there is a monotone decreasing sequence $\{E_n\}$
of sets in $\mathscr{E}$ for which $\lim E_n = \varnothing$ but $\mu(E_n)\to\delta > 0$ as $n\to\infty$.
Now $E_1$ consists of a finite number of intervals of $\mathscr{P}$. Let $F_1$ be a set
of $\mathscr{E}$ obtained by taking away short half-open intervals of $\mathscr{P}$ from the
left-hand end of each of the intervals of $E_1$ in such a way that
$$F_1\subset\bar{F}_1\subset E_1; \qquad \mu(F_1) > \mu(E_1) - \delta/2^2.$$
We now proceed by induction. Suppose we have obtained $F_n\in\mathscr{E}$
such that
$$F_n\subset\bar{F}_n\subset E_n\cap F_{n-1}$$
and
$$\mu(F_n) > \mu(E_n) - \sum_{r=1}^{n}\frac{\delta}{2^{r+1}}. \tag{3.4.1}$$
Then $F_n\cap E_{n+1}\in\mathscr{E}$ and
$$\mu(F_n\cap E_{n+1}) \geq \mu(E_{n+1}) - \mu(E_n - F_n) \geq \mu(E_{n+1}) - \sum_{r=1}^{n}\frac{\delta}{2^{r+1}}. \tag{3.4.2}$$
We can again remove small half-open intervals from the left-hand end
of each interval of $F_n\cap E_{n+1}$ to give a set $F_{n+1}\in\mathscr{E}$ such that
$$\mu(F_{n+1}) > \mu(E_{n+1}\cap F_n) - \delta/2^{n+2} \tag{3.4.3}$$
and $F_{n+1}\subset\bar{F}_{n+1}\subset E_{n+1}\cap F_n$.
By (3.4.2) and (3.4.3) we deduce that
$$\mu(F_{n+1}) > \mu(E_{n+1}) - \sum_{r=1}^{n+1}\frac{\delta}{2^{r+1}}.$$
Thus by induction we can establish (3.4.1) for all $n$. Since $\mu(E_n)\geq\delta$
for all $n$, we have $\mu(F_n) > \tfrac{1}{2}\delta$, for all $n$,
so that all the sets $F_n$ are non-void. Hence $\{\bar{F}_n\}$ is a decreasing sequence
of non-empty bounded closed sets. Hence $\bigcap_{n=1}^{\infty}\bar{F}_n$ is not void. But
$$\bigcap_{n=1}^{\infty}\bar{F}_n\subset\bigcap_{n=1}^{\infty} E_n = \varnothing,$$
so we obtain a contradiction.
Suppose $C = \{(x, y) : a < x\leq b,\ c < y\leq d\}$ is a set of $\mathscr{P}^2$, and
$\mu(C) = (b - a)(d - c)$. In order to prove that $\mu$ is additive on $\mathscr{P}^2$,
suppose first that $C = \bigcup_{i=1}^{n} C_i$ is a decomposition of $C$ into disjoint rectangles
in each of which one of the sides (say $(c, d]$) remains the same. Then
the other sides $(a_i, b_i]$ must be disjoint and satisfy
$$(a, b] = \bigcup_{i=1}^{n} (a_i, b_i]$$
so that by the corresponding result in $\mathscr{P}^1$, $\mu$ is additive in this case.
More generally if
$$C = \bigcup_{i=1}^{n} C_i, \qquad C_i = \{(x, y): a_i < x\leq b_i,\ c_i < y\leq d_i\}$$
is a decomposition of $C$ into a finite number of disjoint rectangles,
use the infinite lines $x = a_i$, $x = b_i$ $(i = 1, 2, \ldots, n)$ to decompose
each $C_i$ into a finite number of pieces $C_{ik}$ each with the same bounds
for the $y$-coordinate. Hence
$$\sum_{i=1}^{n}\mu(C_i) = \sum_{i}\sum_{k}\mu(C_{ik}),$$
and we can sum the right-hand side by first summing over the rectangles
whose $x$-coordinate is bounded by a pair of contiguous $a_i$,
$b_j$ and then summing over these intervals in $x$. Thus by repeated
application of additivity in $\mathscr{P}^1$ we get
$$\mu(C) = \sum_{i=1}^{n}\mu(C_i),$$
as required. (The reader should draw a picture.)
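The refinement by vertical lines can also be carried out mechanically on a small example. The sketch below assumes that a rectangle $\{a < x\leq b,\ c < y\leq d\}$ is stored as a tuple `(a, b, c, d)`; the particular rectangles and the helper `split_by_x` are hypothetical illustrations, not taken from the text.

```python
from fractions import Fraction as Fr

def area(r):
    """mu(C) = (b - a)(d - c) for the rectangle stored as (a, b, c, d)."""
    a, b, c, d = r
    return (b - a) * (d - c)

def split_by_x(r, xs):
    """Cut the rectangle along the vertical lines x = t for t in xs."""
    a, b, c, d = r
    cuts = [a] + sorted(t for t in xs if a < t < b) + [b]
    return [(lo, hi, c, d) for lo, hi in zip(cuts, cuts[1:])]

# A decomposition of C = (0, 3] x (0, 2] into four disjoint rectangles.
C = (Fr(0), Fr(3), Fr(0), Fr(2))
parts = [
    (Fr(0), Fr(1), Fr(0), Fr(2)),
    (Fr(1), Fr(3), Fr(0), Fr(1)),
    (Fr(1), Fr(2), Fr(1), Fr(2)),
    (Fr(2), Fr(3), Fr(1), Fr(2)),
]

# Use every line x = a_i and x = b_i to refine the decomposition.
xs = {t for (a, b, _, _) in parts for t in (a, b)}
pieces = [p for r in parts for p in split_by_x(r, xs)]

# Group pieces into vertical strips; within a strip the y-sides fill (c, d].
strips = {}
for (lo, hi, c, d) in pieces:
    strips.setdefault((lo, hi), []).append((c, d))
strip_total = sum((hi - lo) * sum(d - c for c, d in ys)
                  for (lo, hi), ys in strips.items())
```

Summing first within each strip and then over the strips is the order of summation used in the argument above, and it reproduces both the area of $C$ and the sum of the areas of the original pieces.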


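Now suppose $C = \bigcup_{i=1}^{\infty} C_i$ is an infinite decomposition of $C$ into disjoint
sets of $\mathscr{P}^2$. We must show that $\mu$ is completely additive on $\mathscr{P}^2$.
Since $\mathscr{P}^2$ is a semi-ring it follows by induction that, for each $n$,
$$C - \bigcup_{i=1}^{n} C_i$$
can be expressed as a finite union of disjoint sets of $\mathscr{P}^2$. Since $\mu$ is non-negative,
this implies that
$$\mu(C) \geq \sum_{i=1}^{n}\mu(C_i), \quad\text{for all } n,$$
so that
$$\mu(C) \geq \sum_{i=1}^{\infty}\mu(C_i).$$
Suppose if possible that $\mu$ is not $\sigma$-additive, then there will be such
a set $C$ for which
$$\mu(C) = \sum_{i=1}^{\infty}\mu(C_i) + 2\delta \quad (\delta > 0). \tag{3.4.4}$$
We now use another form of compactness argument to obtain a
contradiction. Suppose $\epsilon > 0$ is small enough to ensure that, if
$$F_0 = \{(x, y) : a + \epsilon < x\leq b,\ c + \epsilon < y\leq d\},$$
then $\mu(F_0) > \mu(C) - \delta$;
and $\epsilon_i > 0$ are small enough to ensure that, if
$$F_i = \{(x, y) : a_i < x\leq b_i + \epsilon_i,\ c_i < y\leq d_i + \epsilon_i\},$$
then
$$\mu(F_i) < \mu(C_i) + \delta 2^{-i} \quad (i = 1, 2, \ldots). \tag{3.4.5}$$
Then $\bar{F}_0\subset C$ and $C_i\subset F_i^{\circ}$, the interior of $F_i$ $(i = 1, 2, \ldots)$; so that
$$\bar{F}_0\subset\bigcup_{i=1}^{\infty} F_i^{\circ}.$$
Since $\bar{F}_0$ is compact and the sets $F_i^{\circ}$ are open it follows that, for some
integer $n$, we have
$$\bar{F}_0\subset\bigcup_{i=1}^{n} F_i^{\circ} \quad\text{so that}\quad F_0\subset\bigcup_{i=1}^{n} F_i.$$
By the finite additivity of $\mu$ on $\mathscr{E}^2$ this implies
$$\mu(F_0) \leq \sum_{i=1}^{n}\mu(F_i)$$
so that, by (3.4.5),
$$\mu(C) - \delta < \sum_{i=1}^{n}\mu(C_i) + \sum_{i=1}^{n}\delta 2^{-i},$$
which contradicts (3.4.4).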
Thus $\mu$ is a measure on $\mathscr{P}^2$ and by theorem 3.5 the unique additive
extension $\mu:\mathscr{E}^2\to\mathbf{R}^+$ is also a measure. Either form of argument
clearly extends to the class of elementary figures in $\mathbf{R}^k$, so we have
Theorem 3.7. Suppose $\mathscr{E}^k$ is the class of elementary figures in $\mathbf{R}^k$, that
is, the class of those sets
$$E = \bigcup_{i=1}^{n} C_i,$$
where the $C_i$ are disjoint half-open intervals in $\mathbf{R}^k$. If we put
$\mu(C_i)$ = length of $C_i$ = product of lengths of the sides of the interval $C_i$
and
$$\mu(E) = \sum_{i=1}^{n}\mu(C_i),$$
then $\mu$ is uniquely defined on $\mathscr{E}^k$ and is a measure.


4.1 Extension theorem; Lebesgue measure
Measure was defined as a non-negative $\sigma$-additive set function
defined on a class of sets $\mathscr{C}$. In testing $\tau$ for $\sigma$-additivity we only needed
$$\tau(E) = \sum_{i=1}^{\infty}\tau(E_i)$$
for sequences $\{E_i\}$ of disjoint sets of $\mathscr{C}$ for which $E = \bigcup_{i=1}^{\infty} E_i\in\mathscr{C}$. This
is an artificial restriction as the condition of additivity does not
apply to a sequence $\{E_i\}$ unless the union set $\bigcup E_i$ happens to belong
to $\mathscr{C}$. For this reason the natural domain of definition for a measure
$\tau:\mathscr{C}\to\mathbf{R}^+$ is a $\sigma$-ring, and in practice most useful measures are defined
on a $\sigma$-ring.
In the last chapter we considered properties of measures defined
on a ring $\mathscr{R}$, so our first objective in the present chapter will be to
prove that these can always be extended to a measure on the $\sigma$-ring
$\mathscr{S}$ generated by $\mathscr{R}$. This extension is unique provided the measure on
$\mathscr{R}$ is $\sigma$-finite. We introduce an (unnecessary) simplifying assumption:
that the generated $\sigma$-ring is also a $\sigma$-field, i.e. that it contains the
whole space $\Omega$. Even with this simplification the main extension
theorem is somewhat involved. The main idea is that of introducing
an outer approximating set function, defined in terms of the measure
on $\mathscr{R}$, and then restricting this to a class of sets on which it is $\sigma$-additive.
The relevant set function turns out to be an outer measure, so it is
convenient first to obtain a theorem about all outer measures.
Measurable set
Suppose $\mu^*$ is an outer measure defined for all subsets of $\Omega$: that
is, $\mu^*$ is non-negative, countably subadditive and monotone (see
p. 59). A subset $E$ is said to be measurable with respect to $\mu^*$ if, for
every set $A\subset\Omega$,
$$\mu^*(A) = \mu^*(A\cap E) + \mu^*(A - E). \tag{4.1.1}$$
It is important to stress that the concept of measurability for a
set depends on the outer measure $\mu^*$. The same set $E$ may well be
measurable with respect to $\mu_1^*$ and non-measurable with respect to $\mu_2^*$.
It helps one's intuition to realise that (4.1.1) states that if one
divides the set $A$ using $E$ and its complement, then the outer measure
of the `pieces' adds up correctly. Thus a set $E$ is $\mu^*$-measurable
if and only if it breaks up no set $A$ into two subsets on which $\mu^*$
is not additive. The measurability of $E$ depends on what the set $E$ does
to the outer measure of all the other subsets.
The reader may find the above explanation of condition (4.1.1)
still inadequate to provide the definition of measurability with much
intuitive content. This is a case where the definition is justified by the
result: it turns out that, for suitable outer measures, a wide class of
sets is measurable and the class of measurable sets has got the right
kind of structural properties. The definition is therefore justified
ultimately by the elegance and usefulness of the theory which results
from it.
Note that, because of the subadditivity condition on outer measures,
we always have
$$\mu^*(A) \leq \mu^*(A\cap E) + \mu^*(A - E)$$
for all sets $A$, $E$. Hence $E$ is $\mu^*$-measurable if and only if
$$\mu^*(A) \geq \mu^*(A\cap E) + \mu^*(A - E) \tag{4.1.2}$$
for every set $A\subset\Omega$. Since (4.1.2) is automatically satisfied for sets $A$
with $\mu^*(A) = +\infty$, $E$ is $\mu^*$-measurable if and only if (4.1.2) is satisfied
for every $A\subset\Omega$ with $\mu^*(A) < \infty$.
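On a finite space the test (4.1.1) can be applied to every pair of sets by brute force. The following is a minimal sketch, assuming a three-point space and two hypothetical outer measures: the one of exercise 3 (i) of Exercises 4.1 (zero on the empty set, one elsewhere) and counting measure. The function names are illustrative only.

```python
from itertools import chain, combinations

def subsets(omega):
    """List all subsets of a finite set as frozensets."""
    items = sorted(omega)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def measurable_sets(omega, outer):
    """Return the sets E with outer(A) == outer(A & E) + outer(A - E) for every A."""
    all_sets = subsets(omega)
    return [e for e in all_sets
            if all(outer(a) == outer(a & e) + outer(a - e) for a in all_sets)]

OMEGA = frozenset({1, 2, 3})

def outer_one(e):
    # Exercise 4.1.3 (i): 0 on the empty set, 1 on every other set.
    return 0 if not e else 1

def outer_count(e):
    # Counting measure is additive, so every set passes the test.
    return len(e)

M_one = measurable_sets(OMEGA, outer_one)
M_count = measurable_sets(OMEGA, outer_count)
```

For the first outer measure only the empty set and the whole space survive, since any other $E$ splits the test set $A = \Omega$ into two non-empty pieces of outer measure 1 each; for counting measure all eight subsets are measurable.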
It is worth remarking that many of the early discussions of measurability
use the concept of inner measure. If $\mu^*(\Omega) < \infty$, this can be
defined for all subsets $E$ by
$$\mu_*(E) = \mu^*(\Omega) - \mu^*(\Omega - E).$$
In this method of procedure a set $E$ is said to be measurable if
$\mu_*(E) = \mu^*(E)$. This apparently weaker definition of measurability
can be shown to be equivalent to the one we have adopted provided
the outer measure $\mu^*$ is regular. (An outer measure is said to be regular
if, for every $A\subset\Omega$, there is a measurable cover $E\supset A$ such that
$\mu^*(E) = \mu^*(A)$.) This means in particular that, under these circumstances,
it is sufficient to use the single test set $\Omega$ for $A$ in (4.1.1).
We do not use the concept of inner measure in our development.
Theorem 4.1. Let $\mu^*$ be an outer measure on $\Omega$, and let $\mathscr{M}$ be the class
of sets of $\Omega$ which are measurable with respect to $\mu^*$. Then $\mathscr{M}$ is a $\sigma$-field
and the restriction of $\mu^*$ to $\mathscr{M}$ defines a measure on $\mathscr{M}$.
Proof. We first show that any finite union of sets of $\mathscr{M}$ is in $\mathscr{M}$.
It is clearly sufficient to prove that $E_1\cup E_2\in\mathscr{M}$ for any $E_1, E_2\in\mathscr{M}$.
For any set $A$, since $E_1$ is measurable,
$$\mu^*(A) = \mu^*(A\cap E_1) + \mu^*(A - E_1). \tag{4.1.3}$$
Now use $(A - E_1)$ as a test set for the measurable $E_2$:
$$\mu^*(A - E_1) = \mu^*((A - E_1)\cap E_2) + \mu^*(A - E_1 - E_2),$$
$$\mu^*(A - E_1) = \mu^*((A - E_1)\cap E_2) + \mu^*(A - (E_1\cup E_2)). \tag{4.1.4}$$
But $[(A - E_1)\cap E_2]\cup(A\cap E_1) = A\cap(E_1\cup E_2)$,
so that if we substitute (4.1.4) into (4.1.3) and use the subadditivity
of $\mu^*$, we obtain
$$\mu^*(A) = \mu^*(A\cap E_1) + \mu^*((A - E_1)\cap E_2) + \mu^*(A - (E_1\cup E_2))$$
$$\phantom{\mu^*(A)} \geq \mu^*(A\cap(E_1\cup E_2)) + \mu^*(A - (E_1\cup E_2))$$
so that, by (4.1.2), $E_1\cup E_2\in\mathscr{M}$.
Now, since $A\cap E = A - (\Omega - E)$, the equation for the measurability
of $(\Omega - E)$ is the same as that for the measurability of $E$. Hence,
$(\Omega - E)$ is measurable if and only if $E$ is measurable.
Since
$$\bigcap_{i=1}^{n} E_i = \Omega - \bigcup_{i=1}^{n}(\Omega - E_i) \quad\text{(see § 1.4)},$$
it follows that the class $\mathscr{M}$ is also closed under finite intersections so
that $\mathscr{M}$ is a field. In order to show that $\mathscr{M}$ is a $\sigma$-field it is sufficient to
show that $E = \bigcup_{k=1}^{\infty} E_k\in\mathscr{M}$ for any sequence $\{E_k\}$ of sets of $\mathscr{M}$. There is
no loss in generality in assuming that the sets $E_k$ are disjoint for, since
$\mathscr{M}$ is a ring, any countable union can be replaced by a countable
disjoint union of subsets in $\mathscr{M}$. Put
$$F_n = \bigcup_{k=1}^{n} E_k \quad (n = 1, 2, \ldots),$$
and let $\mathscr{H}_n$ be the hypothesis that, for any $A$,
$$\mu^*(A\cap F_n) = \sum_{k=1}^{n}\mu^*(A\cap E_k).$$
Clearly $\mathscr{H}_1$ is true. Use $A\cap F_{n+1}$ as a test set for the measurability of
$F_n$: then
$$\mu^*(A\cap F_{n+1}) = \mu^*(A\cap F_n) + \mu^*(A\cap E_{n+1}).$$
Hence $\mathscr{H}_n\Rightarrow\mathscr{H}_{n+1}$ so that, by induction, $\mathscr{H}_n$ is true for all positive
integers $n$.
Since $\mu^*$ is monotonic, for each $n$
$$\mu^*(A\cap E) \geq \mu^*(A\cap F_n) = \sum_{k=1}^{n}\mu^*(A\cap E_k),$$
so that
$$\mu^*(A\cap E) \geq \sum_{k=1}^{\infty}\mu^*(A\cap E_k),$$
and the subadditivity of $\mu^*$ now implies that
$$\mu^*(A\cap E) = \sum_{k=1}^{\infty}\mu^*(A\cap E_k). \tag{4.1.5}$$
Thus, for any $A$, and any $n$,
$$\mu^*(A) = \mu^*(A\cap F_n) + \mu^*(A - F_n) \geq \sum_{k=1}^{n}\mu^*(A\cap E_k) + \mu^*(A - E)$$
using $\mathscr{H}_n$ and the monotonicity of $\mu^*$. Thus, by (4.1.5),
$$\mu^*(A) \geq \mu^*(A\cap E) + \mu^*(A - E),$$
and this implies $E\in\mathscr{M}$ by (4.1.2).
Now the restriction of $\mu^*$ to $\mathscr{M}$ is a non-negative set function.
Further (4.1.5) with $A$ replaced by $\Omega$ shows that $\mu^*$ is $\sigma$-additive on
$\mathscr{M}$ and is therefore a measure on $\mathscr{M}$.
We can now prove the basic extension theorem. In order to simplify
the formulation we will assume that the ring $\mathscr{R}$ of subsets of $\Omega$ is
such that there is a sequence of sets $\{E_n\}$ in $\mathscr{R}$ such that $\Omega = \bigcup_{n=1}^{\infty} E_n$.
We then say that $\Omega$ is $\sigma$-$\mathscr{R}$. This condition implies that the $\sigma$-ring
generated by $\mathscr{R}$ is a $\sigma$-field. Theorem 4.2 is true without this restriction,
but the proof would then require the consideration of outer measures
defined on a suitable class of subsets of $\Omega$, rather than on all subsets.
Since this generalisation also causes complications in the definition
of the integral, and the extra generality is rarely needed, we will keep
the condition that $\Omega$ be $\sigma$-$\mathscr{R}$.
Theorem 4.2. Suppose $\mathscr{R}$ is a ring of subsets of $\Omega$ such that $\Omega$ is $\sigma$-$\mathscr{R}$
and $\mu:\mathscr{R}\to\mathbf{R}^+$ is a measure defined on $\mathscr{R}$. Then there is an extension
of $\mu$ to a measure $\nu$ defined on $\mathscr{S}(\mathscr{R})$, the $\sigma$-ring generated by $\mathscr{R}$. If
$\mu$ is $\sigma$-finite on $\mathscr{R}$, then the extension is unique, and is $\sigma$-finite on $\mathscr{S}$.
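Proof. Let $\mathscr{C}$ be the class of all subsets of $\Omega$. Since $\Omega$ is $\sigma$-$\mathscr{R}$, any
$E\in\mathscr{C}$ can be covered by a countable sequence of sets of $\mathscr{R}$. Put
$$\mu^*(E) = \inf\sum_{i=1}^{\infty}\mu(F_i),$$
the infimum being taken over all sequences of sets $\{F_i\}$ in $\mathscr{R}$ such that
$E\subset\bigcup_{i=1}^{\infty} F_i$. It is clear that $\mu^*:\mathscr{C}\to\mathbf{R}^+$ is non-negative, monotone and
that $\mu^*(\varnothing) = 0$. Suppose now that $E\subset\bigcup_{i=1}^{\infty} E_i$. Then, if $\mu^*(E_i)$ is
infinite for some $i$,
$$\mu^*(E) \leq \sum_{i=1}^{\infty}\mu^*(E_i) \tag{4.1.6}$$
is immediate. If $\mu^*(E_i) < \infty$ for all $i$; for any $\epsilon > 0$, choose sets
$F_{ik}$ $(k = 1, 2, \ldots)$ in $\mathscr{R}$ such that
$$E_i\subset\bigcup_{k=1}^{\infty} F_{ik} \quad\text{and}\quad \sum_{k=1}^{\infty}\mu(F_{ik}) < \mu^*(E_i) + \frac{\epsilon}{2^i} \quad (i = 1, 2, \ldots).$$
The countable collection $\{F_{ik}\}$ will now cover $E$, and
$$\mu^*(E) \leq \sum_{i=1}^{\infty}\sum_{k=1}^{\infty}\mu(F_{ik}) < \sum_{i=1}^{\infty}\mu^*(E_i) + \epsilon.$$
Since $\epsilon$ is arbitrary, (4.1.6) now follows, and we have proved that
$\mu^*:\mathscr{C}\to\mathbf{R}^+$ is an outer measure. Let $\mathscr{M}$ be the class of subsets of $\Omega$
which are measurable with respect to $\mu^*$.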
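We first want to show that $\mathscr{M}\supset\mathscr{R}$. If $E\in\mathscr{R}$ and $\mu^*(A) < \infty$ (the
case $\mu^*(A) = +\infty$ is unimportant as (4.1.2) is then trivially satisfied),
choose a sequence $\{E_i\}$ of sets of $\mathscr{R}$ such that $A\subset\bigcup_{i=1}^{\infty} E_i$ and
$$\mu^*(A) + \epsilon > \sum_{i=1}^{\infty}\mu(E_i) = \sum_{i=1}^{\infty}[\mu(E_i\cap E) + \mu(E_i - E)]$$
$$\phantom{\mu^*(A) + \epsilon} \geq \mu^*(A\cap E) + \mu^*(A - E),$$
by the subadditivity of $\mu^*$. Since $\epsilon$ is arbitrary, we have again proved
(4.1.2), so that $E\in\mathscr{M}$. By theorem 4.1, $\mathscr{M}$ is a $\sigma$-ring, so that $\mathscr{M}\supset\mathscr{S}$,
the $\sigma$-ring generated by $\mathscr{R}$. But the restriction of $\mu^*$ to $\mathscr{M}$ is a measure,
so that its further restriction $\nu$ to $\mathscr{S}$ is also a measure.
If $E\in\mathscr{R}$ it is clear that $\mu^*(E)\geq\mu(E)$ because of theorem 3.6 (ii),
and since $E$ is a covering of itself, $\mu^*(E)\leq\mu(E)$. Hence, for all sets
$E\in\mathscr{R}$, we have $\nu(E) = \mu^*(E) = \mu(E)$, so that $\nu$ is an extension of $\mu$
from $\mathscr{R}$ to $\mathscr{S}$.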
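The covering construction of $\mu^*$ and the resulting class of measurable sets can be computed exhaustively when $\Omega$ is finite. The sketch below rests on several assumptions made only for illustration: $\Omega = \{1, \ldots, 5\}$, the ring consists of all unions of three hypothetical atoms each of measure 1, and, because the ring is finite, countable coverings are replaced by finite families of ring sets.

```python
from functools import lru_cache
from itertools import chain, combinations

ATOMS = [frozenset({1}), frozenset({2, 3}), frozenset({4, 5})]
OMEGA = frozenset().union(*ATOMS)

def powerset(items):
    """Iterate over all sub-collections of `items` as tuples."""
    items = list(items)
    return chain.from_iterable(combinations(items, r) for r in range(len(items) + 1))

# The ring R consists of all unions of atoms; mu gives each atom measure 1.
RING = {frozenset().union(*combo) for combo in powerset(ATOMS)}
mu = {r: sum(1 for atom in ATOMS if atom <= r) for r in RING}

@lru_cache(maxsize=None)
def outer(e):
    """mu*(E): smallest total measure of a family of ring sets covering E."""
    return min(sum(mu[f] for f in family)
               for family in powerset(RING)
               if e <= frozenset().union(*family))

SUBSETS = [frozenset(c) for c in powerset(sorted(OMEGA))]

# Sets passing the test (4.1.1) against every subset A.
MEASURABLE = {e for e in SUBSETS
              if all(outer(a) == outer(a & e) + outer(a - e) for a in SUBSETS)}
```

In this example $\mu^*$ agrees with $\mu$ on the ring, as the last step of the argument shows it must, and a set such as $\{2\}$ that splits an atom fails the test with $A = \{2, 3\}$, so the measurable class is exactly the ring.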

If we now assume that $\mu$ is $\sigma$-finite on $\mathscr{R}$, it follows that $\Omega = \bigcup_{i=1}^{\infty} E_i$
with $\{E_i\}$ an increasing sequence of sets in $\mathscr{R}$ and $\mu(E_i)$ finite, $i = 1, 2, \ldots$.
For a fixed integer $n$, consider the ring $\mathscr{R}_n$ consisting of sets of the
form $E_n\cap E$ with $E\in\mathscr{R}$. Suppose $\mu_1$ and $\mu_2$ are any two extensions
of $\mu$ from $\mathscr{R}_n$ to $\mathscr{S}_n = \mathscr{S}(\mathscr{R}_n)$. Then all the subsets in $\mathscr{S}_n$ are contained
in the set $E_n$ so that $\mu_1$ and $\mu_2$ are finite on $\mathscr{S}_n$. Now let $\mathscr{T}_n$
be the subclass of those sets $E$ of $\mathscr{S}_n$ for which $\mu_1(E) = \mu_2(E)$. Since
finite measures are continuous from above and below, it follows that
$\mathscr{T}_n$ is a monotone class. By theorem 1.5, since $\mathscr{T}_n\supset\mathscr{R}_n$, it follows
that $\mathscr{T}_n\supset\mathscr{S}_n$ and we must have $\mathscr{T}_n = \mathscr{S}_n$. Thus the extension of
$\mu$ to $\mathscr{S}_n$ is unique for every $n$. But, for any $E\in\mathscr{S}$ we have
$$E = \lim_{n\to\infty} E\cap E_n,$$
so that a further application of the continuity theorem shows that the
extension of $\mu$ to $\mathscr{S}$ must be unique.
Theorem 4.2 can be applied to any measure defined on a ring $\mathscr{R}$.
In § 3.4 we saw that the concept of length in $\mathbf{R}^1$, area in $\mathbf{R}^2$ and volume
in $\mathbf{R}^k$ $(k\geq 3)$ could be precisely formulated on the ring $\mathscr{E}^k$ of elementary
figures to define a measure on $\mathscr{E}^k$. It is clear that $\mathbf{R}^k$ is $\sigma$-$\mathscr{E}^k$, and the
measure is actually finite on $\mathscr{E}^k$. The $\sigma$-ring generated by $\mathscr{E}^k$ is the class
$\mathscr{B}^k$ of Borel sets in $\mathbf{R}^k$ (proved in § 2.5). Thus if we apply the statement
of theorem 4.2 to this measure $\mu:\mathscr{E}^k\to\mathbf{R}^+$ we obtain a unique extension
to a measure $\nu:\mathscr{B}^k\to\mathbf{R}^+$ which is $\sigma$-finite on $\mathscr{B}^k$. It is worth
noticing that in the proof of theorem 4.2 the extension was actually
carried out to a class of measurable sets containing $\mathscr{B}^k$. This class is
denoted by $\mathscr{L}^k$ and can be shown to be larger than $\mathscr{B}^k$. A set $E\subset\mathbf{R}^k$
is said to be Lebesgue measurable if and only if it is in the class $\mathscr{L}^k$.
In particular all Borel sets in $\mathbf{R}^k$ are Lebesgue measurable. The set
function $\nu:\mathscr{L}^k\to\mathbf{R}^+$ is called Lebesgue measure in $k$-space and should
be thought of as a generalisation of the notion of $k$-dimensional volume
to a very wide class of sets. We will examine the properties of this set
function in some detail in § 4.4, and it will then become clear that
many of our intuitive ideas of length, area, and volume can be precisely
formulated and remain valid for Lebesgue measure.
It is worth noticing that the outer measure obtained by covering
as in theorem 4.2 is always a regular outer measure. For, if $\mu^*(E) < \infty$,
choose sets $T_{n,r}\in\mathscr{R}$ $(r = 1, 2, \ldots)$ such that
$$E\subset\bigcup_{r=1}^{\infty} T_{n,r}, \qquad \mu^*(E) + \frac{1}{n} > \sum_{r=1}^{\infty}\mu(T_{n,r}).$$
Then
$$A = \bigcap_{n=1}^{\infty}\bigcup_{r=1}^{\infty} T_{n,r}\supset E, \qquad A\in\mathscr{S},$$
and $\mu^*(A) = \mu^*(E)$. This means that the approach through inner
measure will lead to the same class of measurable sets and the same
extension to this class. In particular the Lebesgue measure can be
obtained by this method provided one considers subsets of a fixed
bounded interval (of finite measure) in the first instance and then
allows the interval to expand to the whole Euclidean space.

Exercises 4.1
1. Suppose $\mu^*$ is an outer measure on $\Omega = \lim E_k$ where $\{E_k\}$ is a monotone
increasing sequence of sets. Show that if a set $E$ is such that $E\cap E_k$
is measurable ($\mu^*$) for all sufficiently large $k$, then $E$ is measurable ($\mu^*$).
2. Show that if $\mu^*$ is a regular outer measure on $\Omega$ and $\mu^*(\Omega) < \infty$,
then a necessary and sufficient condition for $E$ to be measurable ($\mu^*$)
is that
$$\mu^*(\Omega) = \mu^*(E) + \mu^*(\Omega - E).$$
3. In each of the following cases, show that $\mu^*$ is an outer measure, and
determine the class of measurable sets
(i) $\mu^*(\varnothing) = 0$, $\mu^*(E) = 1$ for all $E\neq\varnothing$.
(ii) $\mu^*(\varnothing) = 0$, $\mu^*(E) = 1$ for $E\neq\varnothing$ or $\Omega$, $\mu^*(\Omega) = 2$.
(iii) $\Omega$ is not countable; $\mu^*(E) = 0$ if $E$ is countable, $\mu^*(E) = 1$ if $E$ is not
countable.
4. Show that any outer measure which is (finitely) additive is $\sigma$-additive.
5. Suppose $\mu^*$ is an outer measure on $\Omega$ and $E$, $F$ are two subsets, at
least one of which is measurable ($\mu^*$). Show that
$$\mu^*(E) + \mu^*(F) = \mu^*(E\cup F) + \mu^*(E\cap F).$$
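6. Suppose $\{E_n\}$ is a sequence of sets in a $\sigma$-ring $\mathscr{S}$, and $\mu$ is a measure
on $\mathscr{S}$. Show that
(i) $$\mu(\liminf E_n) \leq \liminf \mu(E_n);$$
(ii) provided $\bigcup_{k=n}^{\infty} E_k$ has finite measure for some $n$,
$$\mu(\limsup E_n) \geq \limsup \mu(E_n).$$
If $\sum\mu(E_n) < \infty$, show that $\mu(\limsup E_n) = 0$.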

7. Show that, if $\mu$ is a discrete measure on $\Omega$ (as in example (6) of § 3.1
with $p_i > 0$), then the operation of extending it to an outer measure and restricting
this extension to the class of measurable sets as in theorem 4.2
yields nothing new.
8. Suppose $\mathscr{M}$ is the $\sigma$-ring of $\mu^*$-measurable sets in $\Omega$. Then if $\{E_n\}$ is
a monotone increasing sequence of sets in $\mathscr{M}$ and $A$ is any set
$$\mu^*(\lim_{n\to\infty} A\cap E_n) = \lim_{n\to\infty}\mu^*(A\cap E_n).$$
Prove a corresponding result for a decreasing sequence (which needs an
additional condition).
9. If $\mu^*$ is a regular outer measure, show that $\mu^*(\lim A_n) = \lim\mu^*(A_n)$
for any increasing sequence $\{A_n\}$ of sets.
10. Suppose in theorem 4.2 that $\mu$ is known only to be finitely additive
on $\mathscr{R}$; then the same procedure yields an outer measure $\mu^*$ and a restriction
$\bar\mu$ of $\mu^*$ to the $\mu^*$-measurable sets. Show that $\bar\mu$ is a measure but is not
necessarily an extension of $\mu$.
11. Suppose $\mathscr{R}$ is a ring of subsets of a countable set $\Omega$ such that every
set in $\mathscr{R}$ is either empty or infinite, but the generated $\sigma$-ring $\mathscr{S}(\mathscr{R})$ contains
all subsets of $\Omega$ (see exercise 1.5 (8)). Put $\mu_1(E)$ = number of points
in $E$, $\mu_2(E) = 2\mu_1(E)$ for all subsets $E\subset\Omega$. Then $\mu_1$, $\mu_2$ agree on $\mathscr{R}$ but not
on $\mathscr{S}(\mathscr{R})$ so that the uniqueness assertion of theorem 4.2 requires $\mu$ to be
$\sigma$-finite.
12. Suppose $h(t)$ is any continuous monotonic increasing function
defined on $(0, \gamma)$, $\gamma > 0$ with $\lim_{t\to 0+} h(t) = 0$. If $\Omega$ is any metric space, let
$$h\text{-}m^*(E) = \lim_{\delta\to 0}\Big[\inf\sum_{i=1}^{\infty} h\{\mathrm{diam}\,(C_i)\}\Big],$$
where the infimum is taken over all sequences $\{C_i\}$ of sets of diameter
$<\delta$ which cover $E$ (if there are no such coverings then the inf is $+\infty$).
Show that $h$-$m^*(E)$ defines an outer measure in $\Omega$. (It is called the Hausdorff
measure with respect to $h(t)$.)

4.2 Complete measures

If we again think of measure as a mass distribution in the space
$\Omega$, it is clear that any subset of a set of zero mass should have the mass
zero assigned to it. The present section seeks to make this notion
precise.
Given a measure $\tau:\mathscr{C}\to\mathbf{R}^+$ we say that the class $\mathscr{C}$ is complete
with respect to $\tau$ if
$$E\subset F,\ F\in\mathscr{C},\ \tau(F) = 0 \Rightarrow E\in\mathscr{C}.$$
(It follows that $\tau(E) = 0$.) If $\tau:\mathscr{C}\to\mathbf{R}^+$
is such that $\mathscr{C}$ is complete with respect to $\tau$ we also say that $\tau$ is a complete
measure.
All measures $\mu$ which are obtained (as in theorem 4.1) by restricting
an outer measure $\mu^*$ to the class $\mathscr{M}$ of sets which are measurable
($\mu^*$) are complete measures. For, since outer measures are monotone,
non-negative,
$$E\subset F,\ \mu^*(F) = 0 \Rightarrow \mu^*(E) = 0,$$
and all sets $E$ of zero $\mu^*$-measure are measurable ($\mu^*$) by (4.1.2) since
$$\mu^*(A) \geq \mu^*(A - E) = \mu^*(A - E) + \mu^*(A\cap E).$$
In particular Lebesgue measure defined on the class $\mathscr{L}^k$ is a complete
measure.
Given any measure $\mu$ on a $\sigma$-ring $\mathscr{S}$, there is a simple method of
extending it to a complete measure on a larger $\sigma$-ring, called the
completion of $\mathscr{S}$ with respect to $\mu$.
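Theorem 4.3. Given a measure $\mu$ on a $\sigma$-ring $\mathscr{S}$, let $\bar{\mathscr{S}}$ be the class of all
sets of the form $E\,\triangle\,N$ where $E\in\mathscr{S}$ and $N\subset F\in\mathscr{S}$ with $\mu(F) = 0$.
Then $\bar{\mathscr{S}}$ is a $\sigma$-ring and if we put
$$\bar\mu(E\,\triangle\,N) = \mu(E),$$
then $\bar\mu:\bar{\mathscr{S}}\to\mathbf{R}^+$ is a (uniquely) defined extension of $\mu$ from $\mathscr{S}$ to $\bar{\mathscr{S}}$,
and $\bar\mu$ is a complete measure on $\bar{\mathscr{S}}$.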
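Proof. Let $E_0 = E\,\triangle\,N$, where $E\in\mathscr{S}$, $N\subset F\in\mathscr{S}$, $\mu(F) = 0$. Put
$E_1 = E - F$, then $E_1\subset E_0$, $E_1\in\mathscr{S}$ and $\mu(E_1) = \mu(E)$. If
$$N_1 = E_0 - E_1,$$
then $E_1$, $N_1$ are disjoint and $E_0 = E_1\cup N_1$. Further, since $E_0\subset E\cup F$,
we have $N_1\subset F$ and $\mu(F) = 0$. Thus the class $\bar{\mathscr{S}}$ is the same as the class
of sets $E\cup N$ with $E\in\mathscr{S}$, $N\subset F\in\mathscr{S}$, $\mu(F) = 0$ and $E\cap N = \varnothing$.
A similar argument shows that $\bar{\mathscr{S}}$ is also the same as the class of sets
of the form $E - N$ with $E\in\mathscr{S}$, $N\subset F\in\mathscr{S}$, $\mu(F) = 0$ and $N\subset E$.
It is now easy to check that $\bar{\mathscr{S}}$ is a ring. Suppose $E_1, E_2\in\bar{\mathscr{S}}$; first
express them as $E_1 = X_1 - N_1$, $E_2 = X_2 - N_2$, $N_1\subset X_1$, $N_2\subset X_2$ where
$N_1\subset F_1$, $N_2\subset F_2$ and $\mu(F_1) = \mu(F_2) = 0$. Then
$$E_1\cap E_2 = X_1\cap X_2 - (N_1\cup N_2),$$
and $X_1\cap X_2\in\mathscr{S}$, $N_1\cup N_2\subset F_1\cup F_2\in\mathscr{S}$, $\mu(F_1\cup F_2) = 0$; so that
$E_1\cap E_2\in\bar{\mathscr{S}}$. Now put
$E_1 = X_3\cup N_3$, $E_2 = X_2 - N_2$, $N_3\cap X_3 = \varnothing$, $N_3\subset F_3$ with
$\mu(F_3) = 0$. Then
$$E_1 - E_2 = (X_3 - X_2)\cup(N_3 - X_2)\cup(N_2\cap E_1) = (X_3 - X_2)\cup N_5,$$
where $N_5\subset F_3\cup F_2$ and $\mu(F_3\cup F_2) = 0$. Finally put
$$E_1 = X_3\cup N_3, \quad E_2 = X_4\cup N_4, \quad\text{where } X_4\cap N_4 = \varnothing,$$
and $N_4\subset F_4$ with $\mu(F_4) = 0$. Then
$$E_1\cup E_2 = (X_3\cup X_4)\cup(N_3\cup N_4 - X_3\cup X_4) = (X_3\cup X_4)\cup N_6,$$
where $N_6\subset F_3\cup F_4$ and $\mu(F_3\cup F_4) = 0$. Thus $\bar{\mathscr{S}}$ is closed under the
finite operations of intersection, difference, union so it is a ring. To
prove it is a $\sigma$-ring, put
$$E_i = X_i\cup N_i, \quad N_i\subset F_i, \quad \mu(F_i) = 0 \quad (i = 1, 2, \ldots);$$
then
$$\bigcup_{i=1}^{\infty} E_i = \bigcup_{i=1}^{\infty} X_i\cup\bigcup_{i=1}^{\infty} N_i = X\cup N,$$
where $N\subset\bigcup_{i=1}^{\infty} F_i = F$ and $\mu(F) = 0$.
Hence $\bigcup_{i=1}^{\infty} E_i\in\bar{\mathscr{S}}$.
To see that $\bar\mu$ is uniquely defined on $\bar{\mathscr{S}}$, let
$$E_1\,\triangle\,N_1 = E_2\,\triangle\,N_2$$
be two representations of the same set. Then (see exercise 1.4 (5))
$$E_1\,\triangle\,E_2 = N_1\,\triangle\,N_2$$
and $N_1\,\triangle\,N_2\subset F\in\mathscr{S}$ with $\mu(F) = 0$. Hence
$$\mu(E_1 - E_2) = \mu(E_2 - E_1) = 0,$$
and $\mu(E_1) = \mu(E_1\cap E_2) = \mu(E_2)$.
Thus if we define $\bar\mu$ on $\bar{\mathscr{S}}$ by
$$\bar\mu(E_0) = \mu(E_1) \quad\text{if } E_0 = E_1\,\triangle\,N_1,$$
$\bar\mu$ is uniquely defined.
It only remains to show that $\bar{\mathscr{S}}$ is complete with respect to $\bar\mu$.
Suppose $E$ is any set of $\bar{\mathscr{S}}$ with $\bar\mu(E) = 0$. Then $E = X\cup N$ where
$X\in\mathscr{S}$, $\mu(X) = 0$, $N\subset F\in\mathscr{S}$, $\mu(F) = 0$. Thus, if $G\subset E$, we have
$G\subset X\cup F$ with $\mu(X\cup F) = 0$ and $X\cup F\in\mathscr{S}$; so that
$$G = \varnothing\cup G\in\bar{\mathscr{S}},$$
and $\bar\mu(G) = 0$.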
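The completion of theorem 4.3 can be written out explicitly for a small finite example. The following sketch assumes a hypothetical four-point space whose $\sigma$-ring is generated by the atoms {1}, {2} and {3, 4}, the last of which has measure zero; the masses and function names are chosen only for illustration.

```python
from itertools import chain, combinations

def powerset(items):
    """List all subsets of `items` as frozensets."""
    items = list(items)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

# Hypothetical atoms of the sigma-ring S with their masses; {3, 4} is a null set.
ATOM_MASS = {frozenset({1}): 1, frozenset({2}): 2, frozenset({3, 4}): 0}

# S: all unions of atoms, with mu additive over the atoms.
S = {}
for combo in chain.from_iterable(combinations(ATOM_MASS, r) for r in range(4)):
    S[frozenset().union(*combo)] = sum(ATOM_MASS[a] for a in combo)

def completion(S):
    """Return mu-bar on the class of sets E ^ N, with N inside a null set F of S."""
    null_subsets = {n for f, m in S.items() if m == 0 for n in powerset(f)}
    bar = {}
    for e, m in S.items():
        for n in null_subsets:
            # E ^ N is the symmetric difference; mu-bar(E ^ N) = mu(E).
            value = bar.setdefault(e ^ n, m)
            assert value == m, "two representations gave different values"
    return bar

bar_mu = completion(S)
```

Here the completed class contains all sixteen subsets of the space, the sets {3} and {4} receive measure zero, and the assertion inside `completion` confirms on this example that different representations $E\,\triangle\,N$ of the same set give the same value.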
We already saw that if $\mu$ was a $\sigma$-finite measure defined on a ring $\mathscr{R}$,
then it had only one extension to a measure on the generated $\sigma$-ring $\mathscr{S}$.
If we now complete $\mathscr{S}$ to obtain the measure $\bar\mu$ defined on $\bar{\mathscr{S}}$ so
that $\bar{\mathscr{S}}$ is now complete with respect to the extension $\bar\mu$ of $\mu$, then
we have extended $\mu$ from $\mathscr{R}$ to $\bar{\mathscr{S}}$. Since the extension from $\mathscr{S}$ to $\bar{\mathscr{S}}$
is also unique, it follows that there is only one extension of $\mu$ from $\mathscr{R}$
to $\bar{\mathscr{S}}$. There is a sense in which, in general, this is as far as one can get
with extensions while still preserving uniqueness, though it may be
possible to extend $\mu$ further to a larger $\sigma$-field; see theorem 6.11.
It should also be noticed that in the extension theorem 4.2, the
class $\mathscr{M}$ of $\mu^*$-measurable sets is none other than $\bar{\mathscr{S}}$, the completion
of the $\sigma$-ring $\mathscr{S}$ with respect to $\mu$. For, in the first place, $\mathscr{M}\supset\mathscr{S}$
and $\mathscr{M}$ is complete, hence $\mathscr{M}\supset\bar{\mathscr{S}}$. Secondly, if $E$ is any set of $\mathscr{M}$
such that $\mu(E) < \infty$, we can cover it by $F\in\mathscr{S}$ such that $\mu^*(F) = \mu^*(E)$.
Then $F - E\in\mathscr{M}$ and has zero measure, so that it can be covered by a
$G\in\mathscr{S}$ with $\mu(G) = 0$, and
$$E = (F - G)\cup(E\cap G)\in\bar{\mathscr{S}}.$$
Since $\mu$ is $\sigma$-finite on $\mathscr{M}$, and $\bar{\mathscr{S}}$ is a $\sigma$-ring, it now follows that
$\mathscr{M}\subset\bar{\mathscr{S}}$.
In particular, Lebesgue measure on $\mathscr{L}^k$ is the unique extension of
the concept of length from the semi-ring $\mathscr{P}^k$ to the $\sigma$-ring $\mathscr{L}^k$ which
is the completion of $\mathscr{B}^k$.

Exercises 4.2
1. Suppose $\mu$ is a measure on a $\sigma$-ring $\mathscr{S}$ and $\bar\mu$ on $\bar{\mathscr{S}}$ is its completion.
Show that if $A, B\in\mathscr{S}$ with $A\subset E\subset B$, $\mu(B - A) = 0$ then $E\in\bar{\mathscr{S}}$, and
$$\bar\mu(E) = \mu(A) = \mu(B).$$
2. Given a $\sigma$-finite measure $\mu$ on a ring $\mathscr{R}$ the extension given by theorem
4.2 yields a complete measure on the class $\mathscr{M}$ of $\mu^*$-measurable sets which
is the completion of $\mathscr{S}$, the generated $\sigma$-ring. The following example shows
that this is not true if the hypothesis of $\sigma$-finiteness is omitted: Let $\Omega$ be
non-countable, $\mathscr{S}$ the ring (also a $\sigma$-ring) of all sets which are countable
or have countable complements, $\mu(E)$ = number of points in $E$ for $E\in\mathscr{S}$.
Then $\mathscr{S}$ is complete with respect to $\mu$, but applying theorem 4.2 yields a
complete measure on the class of all subsets (as every subset is measurable).

4.3 Approximation theorems

We have seen how the definition of a measure can be extended from
a ring $\mathscr{R}$ to the generated $\sigma$-ring $\mathscr{S}$, and its completion $\bar{\mathscr{S}}$. It is convenient
to think of the sets of $\mathscr{R}$ as having a simple structure, so that
it becomes interesting to see that the sets of $\bar{\mathscr{S}}$ can always be approximated
in measure with arbitrary accuracy by sets in the original
ring $\mathscr{R}$.
Theorem 4.4. Suppose $\mathscr{R}$ is a ring for which $\Omega$ is $\sigma$-$\mathscr{R}$, and the $\sigma$-finite
measure $\mu:\mathscr{R}\to\mathbf{R}^+$ has been extended (uniquely) to the completion $\bar{\mathscr{S}}$
of the $\sigma$-ring $\mathscr{S}$ generated by $\mathscr{R}$. Then for any $\epsilon > 0$, any set $E\in\bar{\mathscr{S}}$ with
$\mu(E) < \infty$, there is a set $F\in\mathscr{R}$ such that
$$\mu(E\,\triangle\,F) < \epsilon.$$
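Proof. First, find a set $E_1\in\mathscr{S}$ such that
$$\mu(E\,\triangle\,E_1) = 0.$$
Then $\mu(E_1) = \mu(E) < \infty$, so that by the construction of theorem 4.2,
we have
$$\mu(E_1) = \mu^*(E_1),$$
so that we can choose a sequence of disjoint sets $\{T_i\}$ of $\mathscr{R}$ such that
$$E_1\subset\bigcup_{i=1}^{\infty} T_i \quad\text{and}\quad \mu^*(E_1) + \tfrac{1}{2}\epsilon > \sum_{i=1}^{\infty}\mu(T_i).$$
Now choose a finite integer $n$ such that
$$\sum_{i=n+1}^{\infty}\mu(T_i) < \tfrac{1}{2}\epsilon,$$
and put $F = \bigcup_{i=1}^{n} T_i\in\mathscr{R}$.
Then $E_1 - F\subset\bigcup_{i=n+1}^{\infty} T_i$, so that $\mu(E_1 - F) < \tfrac{1}{2}\epsilon$;
and $F - E_1\subset\bigcup_{i=1}^{\infty} T_i - E_1$ so that $\mu(F - E_1) < \tfrac{1}{2}\epsilon$.
Hence $\mu(E\,\triangle\,F) = \mu(E_1\,\triangle\,F) < \epsilon$.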
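The choice of $n$ in the proof can be illustrated on the line. The sketch below assumes a hypothetical set $E = \bigcup_{i\geq 1} T_i$ with $T_i = (4^{-i}, 2\cdot 4^{-i}]$, so that $E$ is itself a countable disjoint union of intervals of $\mathscr{P}$ and only the tail of the series has to be controlled; the function names are illustrative only.

```python
from fractions import Fraction

def interval(i):
    """T_i = (4**-i, 2 * 4**-i], i = 1, 2, ...: disjoint half-open intervals."""
    return (Fraction(1, 4**i), Fraction(2, 4**i))

def tail_measure(n):
    """Sum of mu(T_i) over i > n; the geometric series gives 4**-n / 3."""
    return Fraction(1, 3 * 4**n)

def approximating_index(eps):
    """Smallest n with mu(E - F_n) = tail_measure(n) < eps, F_n = T_1 u ... u T_n."""
    n = 0
    while tail_measure(n) >= eps:
        n += 1
    return n

eps = Fraction(1, 1000)
n = approximating_index(eps)

# F is an elementary figure: a finite union of disjoint half-open intervals.
F = [interval(i) for i in range(1, n + 1)]
measure_F = sum(b - a for a, b in F)
```

Since $F\subset E$ in this example, $E\,\triangle\,F = E - F$ and its measure is the tail of the series; in the general proof the second term $\mu(F - E_1)$ also has to be estimated, which is why each half is kept below $\tfrac{1}{2}\epsilon$.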
Remark. The condition $\mu(E) < \infty$ cannot be omitted from the above
theorem, since it is possible for a finite measure $\mu$ on $\mathcal{R}$ to have an
extension to $\mathcal{S}$ which is σ-finite but not finite (for example, Lebesgue
measure).
It is also worth noticing that the sets E of $\bar{\mathcal{S}}$ can be approximated
exactly in measure by sets in $\mathcal{S}$, by theorem 4.3. We noticed earlier
that the outer measure $\mu^*$ generated by the process of theorem 4.2 is
always regular. This means that an arbitrary set $E \subset \Omega$ is always
contained in a set $F \in \mathcal{S}$ for which $\mu^*(E) = \mu(F)$, so that every set can
be approximated from the outside by a set of $\mathcal{S}$ of the same measure.
If E is not $\mu^*$-measurable (i.e. not in $\bar{\mathcal{S}}$) then two-sided approximation
is not possible.
Up to the present we have only considered general approximation
theorems valid in any abstract space. If the measure is defined in a
topological space, then it is of interest to obtain approximation
theorems which connect the measure properties to the topology of the
space. We do not, however, discuss this problem in general: instead
we consider Euclidean space with the usual topology, and Lebesgue
measure.

Regular measure
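Suppose $\mathcal{S}$ is a σ-field of subsets of a topological space $\Omega$ which
includes the open and the closed subsets of $\Omega$, and $\mu: \mathcal{S} \to R^+$ is a
measure. Then the measure $\mu$ is said to be regular if, for each $\varepsilon > 0$,
(i) given $E \in \mathcal{S}$, there is an open $G \supset E$ with $\mu(G - E) < \varepsilon$;
(ii) given $E \in \mathcal{S}$, there is a closed $F \subset E$ with $\mu(E - F) < \varepsilon$.
Since the class $\mathcal{B}$ of Borel sets in $\Omega$ is the σ-field generated by the
open sets, the condition that $\mathcal{S}$ includes the open sets implies $\mathcal{S} \supset \mathcal{B}$.
If $\mu$ is regular on $\mathcal{S}$, then $\mathcal{S} \subset \bar{\mathcal{B}}$, where $\bar{\mathcal{B}}$ denotes the completion of
$\mathcal{B}$ with respect to $\mu$; for if $\delta_n$ is a sequence of positive numbers
decreasing to zero one can find for any E in $\mathcal{S}$ an open set $G_n$ and a
closed set $F_n$ such that
\[ \mu(G_n - F_n) < \delta_n \quad\text{and}\quad G_n \supset E \supset F_n, \]
and
\[ G = \bigcap_{n=1}^{\infty} G_n, \qquad F = \bigcup_{n=1}^{\infty} F_n \]
will then be Borel sets with $G \supset E \supset F$ and $\mu(G - F) = 0$.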
Metric outer measure
An outer measure $\mu^*$ defined on a metric space $\Omega$ and such that
$\mu^*$ is additive on separated sets, i.e.
\[ d(E, F) > 0 \;\Rightarrow\; \mu^*(E \cup F) = \mu^*(E) + \mu^*(F), \]
is said to be a metric outer measure. It can be proved that, for any
metric outer measure, the class $\mathcal{M}$ of measurable sets contains the
open sets (and therefore contains $\mathcal{B}$), and that, if $\mu^*$ is also σ-finite,
the restriction of $\mu^*$ to $\mathcal{M}$ is regular. Since Lebesgue measure is
generated by a metric outer measure, this general theory would allow
us to deduce that Lebesgue measure is regular. However, we prefer
instead to prove the result only for the special case of Lebesgue
measure.
Theorem 4.5. Lebesgue k-dimensional measure, defined on the class
$\mathcal{L}^k$ of Lebesgue measurable sets in $R^k$, is a regular measure.
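Proof. We give the details of the proof for k = 1; only obvious
alterations are needed for general k. Suppose $E \in \mathcal{L} = \mathcal{L}^1$; then
$E \cap [n, n+1) = E_n \in \mathcal{L}$ for every integer n, and $\mu(E_n) \le 1 < \infty$. By
the construction of theorem 4.2, there is a countable covering $\{C_{ni}\}$
of $E_n$ by ½-open intervals of $\mathcal{P}$ such that
\[ \mu(E_n) + \frac14\,\frac{\varepsilon}{2^{|n|}} > \sum_{i=1}^{\infty} \mu(C_{ni}). \]
Enlarge each of these intervals $C_{ni}$ to an open interval $G_{ni}$ such that
\[ \mu(G_{ni} - C_{ni}) < \frac14\,\frac{\varepsilon}{2^{|n|+i}}. \]
Then $Q_n = \bigcup_i G_{ni}$ is an open set which contains $E_n$ and satisfies
\[ \mu(Q_n - E_n) < \frac12\,\frac{\varepsilon}{2^{|n|}}. \]
If we now put $Q = \bigcup_n Q_n$, then Q is open, $Q \supset E$, and, since
$\sum_{n=-\infty}^{\infty} 2^{-|n|} = 3$,
\[ \mu(Q - E) \le \sum_{n=-\infty}^{\infty} \mu(Q_n - E_n) < \tfrac32\varepsilon. \]
As $\varepsilon > 0$ is arbitrary, this proves condition (i) for regularity.
For any $E \in \mathcal{L}$, $\Omega - E \in \mathcal{L}$ (here $\Omega = R$), and we can apply the above argument
to obtain an open $R \supset \Omega - E$ such that $\mu(R - (\Omega - E)) < \varepsilon$. Then
$F = \Omega - R$ is closed, $F \subset E$ and $\mu(E - F) = \mu(R \cap E) < \varepsilon$, so that the
second condition for regularity is also satisfied. ∎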
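Corollary. Given any set $E \in \mathcal{L}^k$, there is a $\mathcal{G}_\delta$-set Q and an $\mathcal{F}_\sigma$-set R
such that
\[ Q \supset E \supset R \quad\text{and}\quad \mu(Q - R) = 0. \]
Proof. Note that $\mathcal{G}_\delta$ and $\mathcal{F}_\sigma$ sets were defined in § 2.5. For each
integer n, take an open set $G_n \supset E$ and a closed set $F_n \subset E$ such that
\[ \mu(G_n - E) < \frac1n, \qquad \mu(E - F_n) < \frac1n. \]
The sets
\[ Q = \bigcap_{n=1}^{\infty} G_n \quad\text{and}\quad R = \bigcup_{n=1}^{\infty} F_n \]
then satisfy the conditions of the corollary. ∎

This corollary strengthens the result that any set in $\mathcal{L}^k$ can be
approximated exactly in measure by a set in $\mathcal{B}^k$, which follows from
the fact that $\mathcal{L}^k$ is the completion of $\mathcal{B}^k$ with respect to Lebesgue
measure.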
Exercises 4.3
1. Suppose $\mathcal{S}$ is the σ-ring generated by a ring $\mathcal{R}$ and $\mu$, $\nu$ are two
σ-finite measures on $\mathcal{R}$. Show that if $E \in \mathcal{S}$ is such that both $\mu(E)$, $\nu(E)$ are
finite then, for any $\varepsilon > 0$, there is a set $E_0 \in \mathcal{R}$ for which
\[ \mu(E \,\triangle\, E_0) < \varepsilon, \qquad \nu(E \,\triangle\, E_0) < \varepsilon. \]
2. Suppose $\Omega$ is a metric space and $\mu^*$ is an outer measure on $\Omega$ such
that every Borel set is $\mu^*$-measurable. Show that $\mu^*$ is a metric outer
measure, i.e. that for $E_1, E_2 \subset \Omega$,
\[ d(E_1, E_2) > 0 \;\Rightarrow\; \mu^*(E_1 \cup E_2) = \mu^*(E_1) + \mu^*(E_2). \]
Hint. Take an open set $G \supset E_1$ with $G \cap E_2 = \emptyset$ and use $E_1 \cup E_2$ as a test
set for the measurability of G.
3. Suppose $\mu^*$ is a metric outer measure on a metric space $\Omega$. Show
that if E is a subset of an open subset G and $E_n = E \cap \{x : d(x, \Omega - G) > 1/n\}$
then
\[ \lim_{n \to \infty} \mu^*(E_n) = \mu^*(E). \]
Hint. $\{E_n\}$ is a monotonic increasing sequence of sets whose limit is E.
Put $E_0 = \emptyset$, $D_n = E_{n+1} - E_n$ and notice that if neither $D_{n+1}$ nor $E_n$ is
empty then $d(D_{n+1}, E_n) > 0$, so that
\[ \mu^*(E_{2n+1}) \ge \sum_{i=1}^{n} \mu^*(D_{2i}), \qquad \mu^*(E_{2n}) \ge \sum_{i=1}^{n} \mu^*(D_{2i-1}). \]
If either of these series diverges, then $\mu^*(E_n) \to \infty = \mu^*(E)$. If both
converge, use
\[ \mu^*(E) \le \mu^*(E_{2n}) + \sum_{i=n}^{\infty} \mu^*(D_{2i}) + \sum_{i=n}^{\infty} \mu^*(D_{2i+1}). \]
4. If $\mu^*$ is a metric outer measure, show that all open sets (and therefore
all Borel sets) are $\mu^*$-measurable.
Hint. If G is open, A any subset, use the notation of (3) applied to $E = A \cap G$.
Then $d(E_n, A - G) > 0$, so
\[ \mu^*(A) \ge \mu^*\{E_n \cup (A - G)\} = \mu^*(E_n) + \mu^*(A - G). \]

4.4* Geometrical properties of Lebesgue measure

We have now defined Lebesgue measure in Euclidean space and
considered some of its measure-theoretic properties. However, the
justification for studying Lebesgue measure is that it makes precise
our intuitive notion of length, area, volume in Euclidean space and
generalises these notions to sets where our intuition breaks down.
In the present section we want to show that Lebesgue measure has
got the properties which geometrical intuition would lead us to
expect.
It is convenient to adopt the notation $|E|$ for the Lebesgue measure
of any set $E \in \mathcal{L}^k$, so that for sets $E \in \mathcal{L}^1$, $|E|$ is a generalisation of
length; for sets $E \in \mathcal{L}^2$, $|E|$ is a generalisation of area; for sets $E \in \mathcal{L}^k$
($k \ge 3$), $|E|$ is a generalisation of volume.
Since the set consisting of a single point x can be enclosed in an
interval of $\mathcal{P}^k$ of arbitrarily small length, it follows that
\[ |\{x\}| = 0 \quad\text{for } x \in R^k. \]
In particular, in $R^1$,
\[ |[a,b]| = |(a,b)| = |(a,b]| = |[a,b)| = b - a, \]
so that the Lebesgue measure of any interval on the line is its length.
Any countable set in Rk is the union of its single points, and is therefore
of zero measure. In particular the set of points in Rk with rational
coordinates forms a set of zero measure (even though this set is dense
in the whole space).
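The covering argument behind this statement can be sketched numerically. The block below is an illustration only, assuming the one-dimensional case and a finite initial segment of a hypothetical enumeration of the rationals in [0, 1]: the n-th point is enclosed in an open interval of length $\varepsilon/2^{n+1}$, so the total length of the covering stays below $\varepsilon$ however many points are taken.

```python
from fractions import Fraction

def rationals_in_unit_interval(max_den):
    """Distinct rationals p/q in [0, 1] with q <= max_den, in a fixed order."""
    seen, out = set(), []
    for q in range(1, max_den + 1):
        for p in range(0, q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
    return out

def cover(points, eps):
    """Open interval of length eps / 2**(n+1) centred at the n-th point (n = 1, 2, ...)."""
    intervals = []
    for n, x in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (n + 2)
        intervals.append((x - half, x + half))
    return intervals

eps = Fraction(1, 1000)
points = rationals_in_unit_interval(12)
intervals = cover(points, eps)
total_length = sum(b - a for a, b in intervals)
```

Since $\sum_{n \ge 1} 2^{-(n+1)} = \tfrac12$, the bound on `total_length` does not depend on how many points are covered, which is why the full countable set has outer measure zero.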
In $R^k$ ($k \ge 2$), any segment of length l of a straight line can be
covered by $[nl] + 1$ cubes of $\mathcal{P}^k$ of side $1/n$, so that the Lebesgue
measure of such a segment cannot exceed ($[x]$ denotes the largest integer
not greater than x)
\[ \{[nl] + 1\}\left(\frac1n\right)^k = O\!\left(\frac{1}{n^{k-1}}\right) \quad\text{as } n \to \infty, \]
and so $|L| = 0$ for any segment L of finite length. Any infinite straight
line in $R^k$, $k \ge 2$, is the countable union of segments of finite length
so that $|L| = 0$ for any straight line L in $R^k$ ($k \ge 2$). It follows that,
if we are calculating the measure of any geometrical figure in the plane
which is bounded by a countable collection of straight lines, then the
area will be the same whether all, some or none of the boundary lines
are included in the set.
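The bound used for a segment can be evaluated exactly. This is a small sketch with a hypothetical helper name, `covering_bound`, which simply computes $\{[nl] + 1\}(1/n)^k$ with rational arithmetic so that the decay for $k \ge 2$ (and its failure for k = 1) can be seen.

```python
from fractions import Fraction
import math

def covering_bound(length, n, k):
    """Total measure ([n*l] + 1) * (1/n)**k of the covering cubes of side 1/n."""
    return (math.floor(n * length) + 1) * Fraction(1, n) ** k

bounds_plane = [covering_bound(1, n, 2) for n in (10, 100, 1000)]
bound_line = covering_bound(1, 10, 1)
```

For k = 2 the values shrink like 1/n, while for k = 1 the same expression stays close to the length of the segment, as one would expect.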
The above argument shows that there are sets E in $R^k$ ($k \ge 2$)
which are not countable, but such that $|E| = 0$. The question arises
whether or not such sets exist in $R^1$. This is easily answered by the
Cantor set
\[ C = \bigcap_{n=0}^{\infty} F_n \]
defined in § 2.7, where $F_0 = [0, 1]$ and $F_n$ is obtained from $F_{n-1}$ by
replacing each closed interval of $F_{n-1}$ by two closed intervals obtained by
removing an open interval of one third its length from the centre.
We proved that C was perfect and therefore non-countable. But
\[ |F_n| = \tfrac23\,|F_{n-1}| = (\tfrac23)^n |F_0| = (\tfrac23)^n, \]
so that
\[ |C| = \lim_{n \to \infty} |F_n| = 0. \]
It is worth remarking that it is also possible for perfect nowhere
dense sets in R to have positive measure; see exercises 4.4 (2, 3).
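The computation of $|F_n|$ can be checked by building the stages explicitly. The following sketch (function names are ours, chosen for illustration) constructs the closed intervals of $F_n$ with exact rational endpoints and sums their lengths.

```python
from fractions import Fraction

def cantor_step(intervals):
    """Replace each closed interval [a, b] by its two outer closed thirds."""
    out = []
    for a, b in intervals:
        third = (b - a) / 3
        out.append((a, a + third))
        out.append((b - third, b))
    return out

def cantor_stage(n):
    """Closed intervals making up F_n, starting from F_0 = [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = cantor_step(intervals)
    return intervals

def total_length(intervals):
    """Sum of the lengths of a list of disjoint intervals."""
    return sum(b - a for a, b in intervals)

stage5 = cantor_stage(5)
```

Each stage doubles the number of intervals and multiplies the total length by 2/3, in agreement with the formula above.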
We now consider what happens to the Lebesgue measure of sets
under elementary transformations of the space.
(i) Translation
Suppose $x \in R^k$ and $E \subset R^k$. Put
\[ E(x) = \{z : z = x + y,\ y \in E\}. \]
For the intervals $I \in \mathcal{P}^k$, it is immediate that
\[ |I(x)| = |I|, \]
so that the outer measure $\mu^*$ is invariant under translations, and
Lebesgue measure must therefore also be invariant provided
measurability is preserved. Suppose $E \in \mathcal{L}^k$, and A is a test set for $E(x)$.
Then since E is measurable, using $A(-x)$ as a test set,
\[ \mu^*(A(-x)) = \mu^*(A(-x) \cap E) + \mu^*(A(-x) - E), \]
so that
\[ \mu^*(A) = \mu^*(A \cap E(x)) + \mu^*(A - E(x)), \]
and $E(x)$ must also be measurable.
(ii) Reflexion in a plane perpendicular to an axis
(For k = 1 this means reflexion in a point, for k = 2 this means
reflexion in a line parallel to an axis.) It is clear that $\mu^*$ is invariant under
such a reflexion because the reflexion of the covering sets of $\mathcal{P}^k$
again gives ½-open intervals of the same measure. A similar argument
to that used in (i) shows that measurability is preserved, so that
Lebesgue measure is invariant under such reflexions.
(iii) Uniform magnification
For $p > 0$, the transformation of $R^k$ obtained by putting $y = px$
for all $x \in R^k$ will be called a magnification by the factor p, and $pE$
denotes the result of applying this magnification to the set E. If
$I \in \mathcal{P}^k$, then it is clear that
\[ pI \in \mathcal{P}^k \quad\text{and}\quad |pI| = p^k |I|. \]
Hence, if $\mu^*$ denotes the outer measure generated by Lebesgue
measure on $\mathcal{P}^k$,
\[ \mu^*(pE) = p^k \mu^*(E) \]
for all sets E. A similar argument to that used in (i) shows that
measurability is preserved by magnification, so that if E is Lebesgue
measurable, so is $pE$ and
\[ |pE| = p^k |E|. \]
(iv) Rotation about the origin
Lebesgue measure is invariant in this case also, but rather more
work is needed to prove it. The key idea needed for the proof is that
an open sphere centre 0 is invariant under rotation about 0. Suppose
I is a fixed interval of $\mathcal{P}^k$,
\[ I = \{x : a_i < x_i \le b_i,\ i = 1, 2, \ldots, k\}. \]
Then for any $x \in R^k$ and $p > 0$, $(pI)(x)$ is an interval of $\mathcal{P}^k$ similar,
and similarly situated, to I. If $\chi$ denotes the transformation of $R^k$
consisting of a fixed rotation about 0, then
\[ \chi(pI)(x) = (p\chi I)(\chi x). \]
By (i) and (iii),
\[ |\chi(pI)(x)| = p^k |\chi I|, \qquad |(pI)(x)| = p^k |I|, \]
so that
\[ |\chi(pI)(x)| = \frac{|\chi I|}{|I|}\,|(pI)(x)| \]
for all $p > 0$, $x \in R^k$. This means that, for a given $\chi$ and I, the effect
on the measure is the same for all intervals of the form $(pI)(x)$.
Now any open set G can be expressed as a countable union of
disjoint sets of the form $(pI)(x)$. In particular the unit open sphere
S, centre the origin, can be expressed this way:
\[ S = \bigcup_{i=1}^{\infty} (p_i I)(x_i), \quad\text{and}\quad |S| = \sum_{i=1}^{\infty} |(p_i I)(x_i)|. \]
But $\chi S = S$, so that
\[ \sum_{i=1}^{\infty} |(p_i I)(x_i)| = |S| = |\chi S| = \sum_{i=1}^{\infty} |\chi(p_i I)(x_i)| = \frac{|\chi I|}{|I|} \sum_{i=1}^{\infty} |(p_i I)(x_i)|, \]
and $|\chi I| = |I|$. This argument is valid for any interval $I \in \mathcal{P}^k$.
We can now use arguments similar to those in (i) to show that, for
any set $E \subset R^k$,
\[ \mu^*(\chi E) = \mu^*(E), \]
and measurability is preserved under $\chi$. Thus if $E \in \mathcal{L}^k$, $\chi E$ is also in
$\mathcal{L}^k$ and $|\chi E| = |E|$.
Note finally that reflexion in an arbitrary plane can be obtained by
successively applying the operations (iv), (ii), (i), (iv). We have thus
proved
Theorem 4.6. The class $\mathcal{L}^k$ of Lebesgue measurable subsets of $R^k$,
and Lebesgue measure on $\mathcal{L}^k$, are invariant under translations, reflexions
and rotations. If E and F are two subsets of $R^k$ which are congruent in
the sense of Euclid and E is measurable, then so is F and
\[ |E| = |F|. \]
For $p > 0$, if $pE$ denotes the set of vectors x of the form $py$, $y \in E$, then
$E \in \mathcal{L}^k \Rightarrow pE \in \mathcal{L}^k$, and $|pE| = p^k |E|$.
If k, l, r are positive integers and $k + l = r$, then the Euclidean
space $R^r$ can be thought of as a Cartesian product $R^k \times R^l$. We have
defined Lebesgue measure independently in each dimension, but the
measure of the primary sets $\mathcal{P}^r$ could have been obtained as a product
of the measures of corresponding sets in $\mathcal{P}^k$, $\mathcal{P}^l$. It is therefore not
surprising that this is true of a wider class of sets.
Theorem 4.7. If $E \in \mathcal{L}^k$, $F \in \mathcal{L}^l$ then the Cartesian product $E \times F \in \mathcal{L}^{k+l}$
and $|E \times F| = |E| \cdot |F|$.
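Proof. We use $\mu^*$ to denote the outer measure generated by Lebesgue
measure in the space where the set lies. Suppose first that E, F
are bounded so that there are finite open intervals J, K such that
$E \subset J$, $F \subset K$. We can then cover E and F by countable collections
of open intervals such that
\[ E \subset \bigcup_{i=1}^{\infty} Q_i \subset J, \qquad F \subset \bigcup_{j=1}^{\infty} R_j \subset K, \]
\[ \sum_{i=1}^{\infty} |Q_i| < |E| + \varepsilon, \qquad \sum_{j=1}^{\infty} |R_j| < |F| + \varepsilon. \]
Then $E \times F \subset \bigcup_{i,j} Q_i \times R_j$, so that
\[ \mu^*(E \times F) \le \sum_{i,j} |Q_i \times R_j| = \sum_{i,j} |Q_i|\,|R_j|
   = \sum_{i=1}^{\infty} |Q_i| \sum_{j=1}^{\infty} |R_j| < (|E| + \varepsilon)(|F| + \varepsilon). \]
Since $\varepsilon$ is arbitrary, it follows that
\[ \mu^*(E \times F) \le |E| \cdot |F|. \qquad (4.4.1) \]
Now $J \times K$ is the union of the four sets $E \times F$, $(J - E) \times F$,
$E \times (K - F)$ and $(J - E) \times (K - F)$,
and the subadditivity of $\mu^*$ gives, with (4.4.1) applied to each of them,
\[ \mu^*(J \times K) \le |E| \cdot |F| + |J - E| \cdot |F| + |E| \cdot |K - F| + |J - E| \cdot |K - F|. \]
But $J \times K$ is an open rectangle and therefore in $\mathcal{L}^{k+l}$, and
\[ \mu^*(J \times K) = |J| \cdot |K| = (|E| + |J - E|)(|F| + |K - F|). \]
It follows that all the inequalities of type (4.4.1) must be equalities.
In particular
\[ \mu^*(E \times F) = |E| \cdot |F|. \qquad (4.4.2) \]
By the corollary to theorem 4.5, we can find sequences $\{A_n\}$, $\{B_n\}$
of disjoint closed sets such that
\[ A = \bigcup_{n} A_n \subset E, \qquad B = \bigcup_{n} B_n \subset F, \]
\[ |E - A| = 0, \qquad |F - B| = 0. \]
Since $A \times B$ is an $\mathcal{F}_\sigma$ set in $R^{k+l}$ it is measurable and
\[ \mu^*(A \times B) = |A \times B| = |A| \cdot |B| = |E| \cdot |F|. \]
But $A \times B \subset E \times F$, and Lebesgue measure is complete, so that we
must have $E \times F$ measurable and
\[ |E \times F| = \mu^*(E \times F) = |E| \cdot |F|. \]
In order to remove the restriction of boundedness, apply the above
to $E \cap S_n$, $F \cap S'_n$, where $S_n$, $S'_n$ are spheres of radius n centre the origin
in k-space, l-space respectively. This shows that, for each n,
\[ (E \cap S_n) \times (F \cap S'_n) \in \mathcal{L}^{k+l}, \]
\[ |(E \cap S_n) \times (F \cap S'_n)| = |E \cap S_n| \cdot |F \cap S'_n|, \]
and the result follows from the continuity of measures on letting
$n \to \infty$. ∎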
Non-measurable sets
We have now seen that Lebesgue measure can be defined on $\mathcal{L}^k$,
a large class of subsets of $R^k$, in such a way as to preserve the intuitive
geometrical ideas of volume. We also remarked earlier that it is
impossible to define such a measure on all subsets of $R^k$, so we now
demonstrate the existence of at least one subset which is not in $\mathcal{L}^k$.
Again we carry out the construction for k = 1. Consider subsets
$E \subset (0, 1]$ and for $x \in (0, 1]$ let $E(x)$ be the set of real numbers z such
that
   $z = x + y$, $y \in E$ and $x + y \le 1$,
or $z = x + y - 1$, $y \in E$ and $x + y > 1$;
that is, $E(x)$ is the result of translating E a distance x and then taking
the non-integer part. From property (i), it follows immediately that
\[ E \in \mathcal{L} \Rightarrow E(x) \in \mathcal{L}, \qquad |E| = |E(x)|. \]
Now let Z be the set of rationals in (0, 1]. Two sets $Z(x_1)$, $Z(x_2)$ will
be disjoint if $(x_1 - x_2)$ is irrational and identical if $(x_1 - x_2)$ is rational.
Let $\mathcal{C}$ be the class of disjoint sets of the form $Z(x)$. By the axiom of
choice (see § 1.6) there is a set T containing precisely one point from
each of the sets in $\mathcal{C}$. If Z is the set $\{r_1, r_2, \ldots\}$, we put
\[ Q_n = T(r_n) \quad (n = 1, 2, \ldots). \]
Then
\[ (0, 1] = \bigcup_{n=1}^{\infty} Q_n, \]
since every point $x \in (0, 1]$ is in $Z(x_1)$ for some $x_1$, and if $q \in Z(x_1) \cap T$,
then $x - q$ is rational, so that $x = q + r_n$ (modulo 1) for some n and $x \in Q_n$.
Also the sets $Q_n$ are disjoint as T
contains only one point from each set in $\mathcal{C}$ and therefore cannot
contain two points differing by a rational. If $T \in \mathcal{L}$, then $Q_n \in \mathcal{L}$
(n = 1, 2, ...) and $|T| = |Q_n|$ (n = 1, 2, ...).
But then
\[ 1 = |(0, 1]| = \Bigl|\bigcup_{n=1}^{\infty} Q_n\Bigr| = \sum_{n=1}^{\infty} |Q_n|, \]
and this equation cannot possibly be satisfied either by $|Q_n| = 0$
or $|Q_n| = c > 0$ for all n. The only possibility is that the set T is not
measurable.
It is worth remarking that there are many more Lebesgue sets than
there are Borel sets. The number of sets in $\mathcal{L}^k$ is not more than the
number of subsets of $R^k$, i.e. not more than $2^c$. However it is at least
$2^c$ for it contains all subsets of the Cantor set (perfect with c points
in it), so that the cardinality of $\mathcal{L}^k$ must be $2^c$. However the
cardinality of the class $\mathcal{B}^k$ of Borel subsets of $R^k$ is c and $c < 2^c$
(see § 1.3) so that there must be some sets which are in $\mathcal{L}^k$ but
not in $\mathcal{B}^k$; this means that the class $\mathcal{B}^k$ is not complete with respect
to Lebesgue measure. In order actually to exhibit a set in $\mathcal{L}^k$ but
not in $\mathcal{B}^k$ one has to work a bit harder so we do not include such an
example.
Exercises 4.4
1. Show that the set of points in [0, 1] whose binary expansion has zero
in all the even places is a Lebesgue measurable set of zero measure. Is it
a Borel set?
2. By changing the lengths of the extracted intervals in the
construction of the Cantor set, show how to obtain a nowhere dense perfect set of
measure ½.
3. Generalise (2) to show that for any e > 0 there is a nowhere dense,
perfect subset of [0, 1] with measure greater than 1-e.
4. Consider a union of sets of (3) to obtain a subset of [0, 1] of full measure
which is of the first category, and another subset of [0,1] of zero measure
which is of the second category.
5. Show that any bounded set in Euclidean space $R^k$ has finite Lebesgue
outer measure. Is the converse of this statement true?
6. Suppose X is the circumference of a unit circle in $R^2$. Show that there
is a unique measure $\mu$ defined on Borel subsets of X such that $\mu(X) = 1$
and $\mu$ is invariant under all rotations of X into itself.
7. By considering suitable approximating polygons (finite unions of
rectangles will do), show that the area of the plane region bounded by x = 1,
y = 0, $y = x^3$ is ¼. Generalise to the case $y = x^k$, where $k > 0$ but need not
be an integer.
8. Show that a subset E of a bounded interval $I \subset R^k$ is measurable if,
for any $\varepsilon > 0$ there are elementary figures $Q_1, Q_2 \in \mathcal{E}^k$ such that $Q_1 \supset E$,
$Q_2 \supset I - E$ and $|Q_1| + |Q_2| < |I| + \varepsilon$.
9. Suppose X is the unit square $\{(x, y) : 0 \le x \le 1,\ 0 \le y \le 1\}$. If
$E \subset [0, 1]$ put $\hat{E} = \{(x, y) : x \in E,\ 0 \le y \le 1\}$ and let $\mathcal{S}$ be the class of sets
$\hat{E}$ such that E is $\mathcal{L}^1$-measurable. Put $\mu(\hat{E}) = |E|$, and show that the subset
$M = \{(x, y) : 0 \le x \le 1,\ y = \tfrac12\}$ is not measurable with respect to the outer
measure $\mu^*$ generated by $\mu$ on the class of all subsets of X. Show that
$\mu^*(M) = 1$, $\mu^*(X - M) = 1$.

4.5 Lebesgue-Stieltjes measure

There are other measures in $R^k$ which are of importance in
probability theory. Suppose $F: R \to R$ is a monotone increasing real valued
function of a real variable which is everywhere continuous on the right.
Such a function is called a Stieltjes measure function. Put
\[ \mu_F(a, b] = F(b) - F(a) \]
for each $(a, b] \in \mathcal{P}$. Then $\mu_F$ is non-negative and (finitely) additive
on $\mathcal{P}$; the proof used for the length function in § 3.4 can be easily
adapted to show this (the length function corresponds to F(x) = x).
By applying theorem 3.4 we can extend $\mu_F$ uniquely to an additive
set function on $\mathcal{E}$, the ring of elementary figures. As in § 3.4 we again
have at least two methods of showing that $\mu_F$ is a measure on $\mathcal{E}$. By
theorem 3.2 (iii) if $\mu_F$ is not a measure, then there is a monotone
decreasing sequence $\{E_n\}$ of sets of $\mathcal{E}$ such that $\lim E_n = \emptyset$, but
$\lim \mu_F(E_n) = \delta > 0$. The argument used in the Lebesgue case for
k = 1 can be modified by using the fact that, for any $\varepsilon > 0$, if
\[ \mu_F(a, b] > 0, \]
we can always find a $y > 0$ such that
\[ (a + y, b] \subset [a + y, b] \subset (a, b] \]
and
\[ \mu_F(a + y, b] > \mu_F(a, b] - \varepsilon, \]
since F is continuous on the right at a. This leads us to a contradiction
which establishes that $\mu_F$ is a measure on $\mathcal{E}$.
For $k \ge 2$, we must start with a function $F: R^k \to R$ which is
continuous on the right in each variable separately and such that, for
$I \in \mathcal{P}^k$,
\[ \mu_F(I) = \sum_{V} \gamma_V F(V) \ge 0, \qquad (4.5.1) \]
where V are the $2^k$ vertices of the set $I \in \mathcal{P}^k$ and $\gamma_V = +1$ for the vertex
in which each co-ordinate is largest and $\gamma_V = (-1)^r$ if the vertex V
is such that r of its coordinates are at the lower bound (and $(k - r)$
at the upper bound). Any such function F is called a k-dimensional
Stieltjes measure function. With a little care, it is not difficult to show
that, under these conditions, $\mu_F$ is a non-negative additive set function
on $\mathcal{P}^k$ and that it therefore has a unique extension to $\mathcal{E}^k$. Either of
the arguments given in § 3.4 can now be modified to show that $\mu_F$
is a measure on $\mathcal{E}^k$.
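The vertex sum in (4.5.1) is easy to evaluate mechanically. The following is a sketch only: the function `stieltjes_measure` and the sample functions are names chosen here for illustration. `product_F` is $F(x_1, \ldots, x_k) = x_1 \cdots x_k$, and `diagonal_F` is the two-variable function of exercise 4.5 (1), which is monotone in each variable but violates the non-negativity required in (4.5.1).

```python
from itertools import product

def stieltjes_measure(F, lower, upper):
    """Vertex sum (4.5.1) for the half-open interval with corners lower, upper."""
    k = len(lower)
    total = 0
    for choice in product((0, 1), repeat=k):   # 0 = lower bound, 1 = upper bound
        vertex = tuple(upper[i] if c else lower[i] for i, c in enumerate(choice))
        r = choice.count(0)                    # coordinates at the lower bound
        total += (-1) ** r * F(*vertex)
    return total

def product_F(*x):
    """F(x_1, ..., x_k) = x_1 * ... * x_k."""
    result = 1
    for xi in x:
        result *= xi
    return result

def diagonal_F(x1, x2):
    """The function of exercise 4.5 (1)."""
    return max(0, x1 + x2 + 1) if x1 + x2 < 0 else 1

area = stieltjes_measure(product_F, (0, 0), (2, 3))
length = stieltjes_measure(lambda x: x, (1,), (4,))
negative = stieltjes_measure(diagonal_F, (0, -1), (1, 0))
```

For k = 1 the sum reduces to $F(b) - F(a)$; for the product function in two variables it returns the ordinary area of the rectangle, while the last call produces a negative value, so `diagonal_F` is not a Stieltjes measure function.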
We can now apply theorem 4.2 to this measure $\mu_F$, to extend it
to the σ-ring $\mathcal{B}^k$ of Borel sets in $R^k$. As in the case of Lebesgue measure,
this extension automatically defines $\mu_F$ on the completion $\mathcal{L}^k_F$
of $\mathcal{B}^k$ with respect to $\mu_F$. The class $\mathcal{L}^k_F$ is called the class of sets which
are Lebesgue-Stieltjes measurable for the function F. The class clearly
depends on the function F; for in the particular case $F \equiv c$, $\mathcal{L}^k_F$
contains all subsets of $R^k$, as $\mu_F(R^k) = 0$ and $\mu_F$ is complete; while
if $F(x_1, x_2, \ldots, x_k) = x_1 x_2 \cdots x_k$, then $\mu_F$ is the length function and
$\mathcal{L}^k_F$ is the Lebesgue class $\mathcal{L}^k$.
Each of these measures $\mu_F: \mathcal{L}^k_F \to R^+$ is regular. The proof given
in theorem 4.5 can easily be modified to show this (we again do the
case k = 1) by using the fact that, for any $\varepsilon > 0$, if $(a, b] \in \mathcal{P}$, there is
a $y > 0$ such that
\[ (a, b + y] \supset (a, b + y) \supset (a, b] \]
and
\[ \mu_F(a, b + y) \le \mu_F(a, b + y] < \mu_F(a, b] + \varepsilon, \]
to obtain economical coverings by open intervals.

Probability measure
Given a σ-field $\mathcal{F}$ of subsets of $\Omega$, any measure $P: \mathcal{F} \to R^+$ such that
$P(\Omega) = 1$ is called a probability measure on $\mathcal{F}$. If in addition $\mathcal{F}$ is
complete with respect to P we will say that the triple $(\Omega, \mathcal{F}, P)$ forms a
probability space.

Distribution function
A function $F: R \to R$ is called a distribution function if
(i) F is monotonic increasing, continuous on the right;
(ii) $F(x) \to 0$ as $x \to -\infty$, $F(x) \to 1$ as $x \to +\infty$.
A function $F: R^k \to R$ is called a (k-dimensional) distribution function if
(i) F is continuous on the right in each variable;
(ii) $\mu_F(I) \ge 0$ for all $I \in \mathcal{P}^k$, where $\mu_F$ is defined by (4.5.1);
(iii) $F(x_1, x_2, \ldots, x_k) \to 0$ as any one of $x_1, x_2, \ldots, x_k \to -\infty$,
$F(x_1, x_2, \ldots, x_k) \to 1$ as $x_1, x_2, \ldots, x_k$ all $\to +\infty$.
It is immediate from our definitions that any distribution function
F can be used to define a Lebesgue-Stieltjes measure $\mu_F$ on the
σ-field $\mathcal{L}^k_F$. Further $\mu_F(R^k) = 1$ and $\mu_F$ is complete, so that every
distribution function determines a probability measure and
$(R^k, \mathcal{L}^k_F, \mu_F)$ is a probability space. There is a sense in which these are the
only interesting probability measures on $R^k$.
Theorem 4.8. Suppose $\mathcal{S}$ is a σ-field of sets in R, $\mathcal{S}$ contains the open
sets and $\mu: \mathcal{S} \to R^+$ is a complete measure which is finite on bounded sets
in $\mathcal{S}$. Then there is a Stieltjes measure function $F: R \to R$ such that
$\mathcal{S} \supset \mathcal{L}_F$ and $\mu$ coincides with $\mu_F$ on $\mathcal{L}_F$. If $(R, \mathcal{S}, \mu)$ is a probability
space, then F can be chosen to be a distribution function.
Proof. Since $\mathcal{S}$ contains the open sets and is a σ-field, it must contain
$\mathcal{B}$, the Borel sets, and in particular $\mathcal{S} \supset \mathcal{P}$, the class of half-open
intervals. Define F by
\[ F(x) = \begin{cases} \mu(0, x] & \text{for } x \ge 0, \\ -\mu(x, 0] & \text{for } x < 0. \end{cases} \]
Then $F: R \to R$ is clearly defined and is monotonic increasing for all
real x (note that F(0) = 0). By theorem 3.2 (i), if $\{x_n\}$ is any monotonic
sequence decreasing to x, $\lim_{n \to \infty} F(x_n) = F(x)$; since
   if $x \ge 0$, $\lim (0, x_n] = (0, x]$,
   if $x < 0$, $\lim (x_n, 0] = (x, 0]$.
Thus F is continuous on the right, and must therefore be a Stieltjes
measure function.
Now
   if $a \ge 0$, $\mu(a, b] = \mu(0, b] - \mu(0, a] = F(b) - F(a)$;
   if $a < 0 \le b$, $\mu(a, b] = \mu(a, 0] + \mu(0, b] = F(b) - F(a)$;
   if $b < 0$, $\mu(a, b] = \mu(a, 0] - \mu(b, 0] = F(b) - F(a)$;
so that $\mu$ coincides with $\mu_F$ on $\mathcal{P}$. By uniqueness of the extension of a
measure to the generated σ-field and its completion, we have $\mu = \mu_F$
on $\mathcal{L}_F$, and $\mathcal{S} \supset \mathcal{L}_F$.
If $\mu$ is a probability measure on $\mathcal{S}$, we must have
\[ \lim_{x \to +\infty} F(x) - \lim_{x \to -\infty} F(x) = \lim_{n \to \infty} \mu(-n, n] = 1, \]
so that
\[ F_1(x) = F(x) - \lim_{y \to -\infty} F(y) \]
will be a distribution function generating the same Stieltjes measure
as F. ∎
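The three cases in the proof can be checked on a concrete measure. The sketch below assumes a hypothetical measure on the line, namely length plus two point masses (the dictionary `ATOMS` is invented for the illustration), defines F exactly as in the proof, and compares $\mu(a, b]$ with $F(b) - F(a)$.

```python
from fractions import Fraction

# Hypothetical point masses: mass 2 at x = -1 and mass 1 at x = 1/2.
ATOMS = {Fraction(-1): Fraction(2), Fraction(1, 2): Fraction(1)}

def mu(a, b):
    """Measure of (a, b]: its length plus the point masses lying in (a, b]."""
    if b <= a:
        return Fraction(0)
    return (b - a) + sum(m for x, m in ATOMS.items() if a < x <= b)

def F(x):
    """F(x) = mu(0, x] for x >= 0 and -mu(x, 0] for x < 0, as in the proof."""
    return mu(Fraction(0), x) if x >= 0 else -mu(x, Fraction(0))

cases = [
    (Fraction(1, 4), Fraction(3)),   # a >= 0
    (Fraction(-2), Fraction(1)),     # a < 0 <= b
    (Fraction(-3), Fraction(-1)),    # b < 0
]
agrees = [mu(a, b) == F(b) - F(a) for a, b in cases]
```

The jump of F at each atom equals the mass placed there, and F is continuous on the right at those points because the intervals used are closed on the right.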
Remark. The case where $\mu$ is a probability measure could have been
done directly by defining
\[ F_1(x) = \mu(-\infty, x]. \]
It is clear that this case extends immediately to $R^k$ since if we put
\[ F(x_1, x_2, \ldots, x_k) = \mu\{(\xi_1, \ldots, \xi_k) : \xi_i \le x_i,\ i = 1, \ldots, k\} \]
it is easy to check that F is a k-dimensional distribution function.
Discrete probability
There is a special case of a probability measure in which all the
probability is concentrated on a countable set $E_0 \subset \Omega$. This can be
defined by specialising example (6) of § 3.1. If $\{x_n\}$ is any sequence in
$\Omega$, and $\{p_n\}$ is a sequence of positive real numbers with $\sum p_n = 1$,
then it is clear that
\[ P(E) = \sum_{x_n \in E} p_n \]
defines a probability measure on the class of all subsets of $\Omega$. When
$\Omega = R$, this measure can be obtained from the distribution function
\[ F(x) = \sum_{x_n \le x} p_n, \]
so that, in R (or in $R^k$ for that matter) a discrete probability measure
can be expressed as the Lebesgue-Stieltjes measure of a suitable
distribution function.
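A finite instance makes the relation between P and F concrete. In the sketch below the points and weights are hypothetical values chosen so that the weights sum to 1; sets E are represented by predicates on points.

```python
from fractions import Fraction

# Hypothetical atoms x_n with weights p_n summing to 1.
POINTS = [Fraction(0), Fraction(1), Fraction(5, 2)]
WEIGHTS = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

def P(E):
    """P(E) = sum of p_n over those x_n in E; E is given as a predicate."""
    return sum(p for x, p in zip(POINTS, WEIGHTS) if E(x))

def F(x):
    """Distribution function F(x) = sum of p_n over x_n <= x."""
    return sum(p for xn, p in zip(POINTS, WEIGHTS) if xn <= x)

half_open = P(lambda x: 0 < x <= 3)
```

The probability of the half-open interval (0, 3] agrees with $F(3) - F(0)$, as the general theory requires.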

Exercises 4.5
1. To see that condition (4.5.1) for k-dimensional Stieltjes measure
functions is not implied by the condition that F be monotonic increasing
in each variable separately consider
\[ F(x_1, x_2) = \begin{cases} \max(0,\ x_1 + x_2 + 1) & \text{for } x_1 + x_2 < 0, \\ 1 & \text{for } x_1 + x_2 \ge 0. \end{cases} \]
Does this condition (4.5.1) imply that F is monotonic in each variable?
2. If $F: R \to R$ is a Stieltjes measure function, show that
\[ \mu_F(a, b) = F(b - 0) - F(a), \qquad \mu_F[a, b] = F(b) - F(a - 0), \]
and determine $\mu_F$ for intervals of the form
\[ [a, b), \quad (-\infty, a), \quad (a, \infty). \]
3. If F is a Stieltjes measure function in R which generates the Stieltjes
measure $\mu_F$, show that F(x) is continuous if and only if $\mu_F\{x\} = 0$ for all
single point sets $\{x\}$. What is the corresponding continuity condition in $R^k$?
4. Consider Lebesgue measure on $\mathcal{L}^1$-subsets of [0, 1] and let $E_0$ be a
subset of [0, 1] which is non-measurable, such that the Lebesgue outer
measure of $E_0$ and $([0, 1] - E_0)$ are both 1. Let $\mathcal{S}$ be the smallest σ-field
of subsets of [0, 1] containing $E_0$ and $\mathcal{L}^1$. Show that $\mathcal{S}$ consists of sets of
the form
\[ E = (A \cap E_0) \cup (B \cap ([0, 1] - E_0)) \]
for $A, B \in \mathcal{L}^1$ and that $\mu(E) = |A \cap [0, 1]|$ defines a probability measure on
the σ-field $\mathcal{S}$. By applying theorem 4.8 to this probability measure show
that, in general, it is not possible to deduce in theorem 4.8 that $\mathcal{S} = \mathcal{L}_F$.
5. Suppose
\[ F(x) = \begin{cases} 0 & \text{for } x < 0, \\ 1 & \text{for } x \ge 0. \end{cases} \]
Show that $\mu_F(-1, 0) < F(0) - F(-1)$.
6. Give an example of a right-continuous monotone F such that
\[ \mu_F(a, b) < F(b) - F(a) < \mu_F[a, b]. \]
7. Show that, if F, G are distribution functions in $R^k$, then $aF + bG$ is
a distribution function for any $a \ge 0$, $b \ge 0$, $a + b = 1$.
8. In $R^2$, let
\[ F(x_1, x_2) = \begin{cases} 1 & \text{for } x_1 \ge 0,\ x_2 \ge 0, \\ 0 & \text{for all other points.} \end{cases} \]
Show that this F is a distribution function describing a unit mass at 0.
9. State and prove an n-dimensional form of theorem 4.8.
10. We can obtain completely additive set functions in $R^1$ which are not
necessarily non-negative by the following method. Suppose $F: R \to R$
is continuous on the right everywhere and of bounded variation in each
finite interval and $F(b) - F(a)$ is bounded below for all $a < b$, and define
\[ \tau_F(a, b] = F(b) - F(a). \]
Show that $\tau_F$ is additive on $\mathcal{P}$ and can be extended to $\mathcal{E}$. By an extension
of theorem 4.2, $\tau_F$ can then be extended to a σ-additive set function on $\mathcal{B}$.
Now apply theorem 3.3 to express $\tau_F$ as the difference of two measures.
Finally, the argument of theorem 4.8 shows that $\tau_F$ is the difference of two
Stieltjes measures.


5.1 What is an integral?
Historically the concept of integration was first considered for real
functions of a real variable where either the notion of `the process
inverse to differentiation' or the notion of `area under a curve' was
the starting point. In the first case a real number was obtained as the
difference of two values of the `indefinite' integral, while the second
case corresponds immediately to the `definite' integral. The so-called
`fundamental theorem of the integral calculus' provided the link
between the two ideas. Our discussion of the operation of integration
will start from the notion of a definite integral, though in the first
instance the `interval' over which the function is integrated will be
the whole space. Thus, for `suitable' functions $f: \Omega \to R^*$ we want to
define the integral $\mathcal{I}(f)$ as a real number. The `suitable' functions will
be called integrable and $\mathcal{I}(f)$ will be called the integral of f.
Before defining such an operator $\mathcal{I}$, we examine the sort of properties
$\mathcal{I}$ should have before we would be justified in calling it an `integral'.
Suppose then that $\mathcal{A}$ is a class of functions $f: \Omega \to R^*$, and $\mathcal{I}: \mathcal{A} \to R$
defines a real number for every $f \in \mathcal{A}$. Then we want $\mathcal{I}$ to satisfy:
(i) $f \in \mathcal{A}$, $f(x) \ge 0$ all $x \in \Omega \Rightarrow \mathcal{I}(f) \ge 0$, that is $\mathcal{I}$ preserves positivity;
(ii) $f, g \in \mathcal{A}$, $\alpha, \beta \in R \Rightarrow \alpha f + \beta g \in \mathcal{A}$ and
\[ \mathcal{I}(\alpha f + \beta g) = \alpha\,\mathcal{I}(f) + \beta\,\mathcal{I}(g), \]
that is $\mathcal{I}$ is linear on $\mathcal{A}$;
(iii) $\mathcal{I}$ is continuous on $\mathcal{A}$ in some sense; at least we would want to
have $\mathcal{I}(f_n) \to 0$ as $n \to \infty$ for any sequence $\{f_n\}$ of functions in $\mathcal{A}$ which
is monotone decreasing with $f_n(x) \to 0$ for all x in $\Omega$.
These conditions are satisfied by the elementary integration pro-
cess, but the Riemann integral does not satisfy the following strength-
ened form of (iii):
(iii)* If $\{f_n\}$ is an increasing sequence of functions in $\mathcal{A}$, and
\[ f_n(x) \to f(x) \quad\text{for all } x \in \Omega, \]
then $f \in \mathcal{A}$ and $\mathcal{I}(f_n) \to \mathcal{I}(f)$ as $n \to \infty$.
This is the most serious limitation of the Riemann integral for,
with this definition of integration, it is necessary in (iii)* to postulate
$f_n(x) \to f(x)$ uniformly in x before one can conclude that $f \in \mathcal{A}$ and
$\mathcal{I}(f_n) \to \mathcal{I}(f)$. Now conditions about the continuity of $\mathcal{I}$ are really
essential if the operation is to be a useful tool in analysis; there would
not be much of analysis left if one could not carry out at least
sequential limiting operations. One of our main objectives, therefore, is to
define an operator $\mathcal{I}$ which satisfies (iii)*.
One method of studying integration theory (essentially due to
P. J. Daniell) is to start with a restricted class $\mathcal{A}_0$ of functions with a
simple structure, define $\mathcal{I}: \mathcal{A}_0 \to R$ to satisfy (i), (ii) and (iii) and then
extend $\mathcal{A}_0$ to $\mathcal{A}$ and the functional $\mathcal{I}$ step by step until $\mathcal{I}: \mathcal{A} \to R$ is
defined on a sufficiently large class while (i), (ii) and (iii)* are satisfied.
Using this approach one can deduce a measure on a suitable σ-ring
of subsets of $\Omega$ by putting
\[ \mu(E) = \mathcal{I}(\chi_E) \]
for those sets E for which $\chi_E \in \mathcal{A}$. Condition (i) then implies that
$\mu$ is non-negative, condition (ii) that it is additive and condition
(iii) that it is σ-additive provided the domain of definition is a ring.
We will give details of this approach in § 9.4, but for the present we will
regard the measure as the primary concept and define the integral in
terms of a given measure. We will, however, obtain an operator
$\mathcal{I}: \mathcal{A} \to R$ which has the above properties and moreover in defining
$\mathcal{I}$ we will continually have these desired properties in mind. Thus out
of many possible ways of obtaining the integral starting from a
measure, we choose the method of definition by limits of monotone
sequences of `simple' functions.
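The passage from a functional to a set function by $\mu(E) = \mathcal{I}(\chi_E)$ can be seen in the simplest possible setting. The sketch below is an illustration under strong assumptions: the space is a hypothetical three-point set, every real function on it is admitted, and the functional is a weighted sum with invented weights. It is not the construction used later in the text.

```python
from fractions import Fraction

OMEGA = ["a", "b", "c"]
# Hypothetical non-negative weights defining the functional.
WEIGHT = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}

def integral(f):
    """A positive linear functional on real functions on OMEGA: a weighted sum."""
    return sum(WEIGHT[x] * f(x) for x in OMEGA)

def indicator(E):
    """Indicator function of the subset E of OMEGA."""
    return lambda x: 1 if x in E else 0

def mu(E):
    """Set function deduced from the functional: the integral of the indicator of E."""
    return integral(indicator(E))

f, g = indicator({"a"}), indicator({"b"})
linear_lhs = integral(lambda x: 2 * f(x) + 3 * g(x))
linear_rhs = 2 * integral(f) + 3 * integral(g)
```

Positivity of the weights gives property (i), the sum is visibly linear as in (ii), and additivity of `mu` on disjoint sets follows because the indicator of a disjoint union is the sum of the indicators.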
5.2 Simple functions; measurable functions
We now assume given $(\Omega, \mathcal{F}, \mu)$ where $\Omega$ is a space, $\mathcal{F}$ a σ-field
of subsets of $\Omega$ and $\mu$ a measure on $\mathcal{F}$. All the concepts we now define
are relative to $(\Omega, \mathcal{F}, \mu)$. It is worth remarking that our definitions
can be modified to apply to the case where $\mathcal{F}$ is a σ-ring rather than
a σ-field, but this results in additional complications in proofs. The
additional labour involved does not seem justified for the small gain
in generality.
Our object is to define an operation, called integration, having
the properties discussed in § 5.1 on a suitable class of functions
$f: \Omega \to R^*$. Ultimately we want this domain of definition for the
integral to be as large as possible. In the present section we obtain the
properties of certain classes of functions which will be important later.

If $\Omega = \bigcup_{i=1}^{n} E_i$ and the sets $E_i$ are disjoint, then $E_1, E_2, \ldots, E_n$ are said
to form a (finite) dissection of $\Omega$. They are said to form an $\mathcal{F}$-dissection
if, in addition, $E_i \in \mathcal{F}$ (i = 1, 2, ..., n).
Simple function
A function $f: \Omega \to R$ is called $\mathcal{F}$-simple if it can be expressed as
\[ f(x) = \sum_{i=1}^{n} c_i \chi_{E_i}(x), \]
where $E_1, E_2, \ldots, E_n$ form an $\mathcal{F}$-dissection of $\Omega$ and
$c_i \in R$ (i = 1, 2, ..., n).
Thus an $\mathcal{F}$-simple function is one which takes a constant value
$c_i$ on the set $E_i$, where the sets $E_i$ are disjoint sets of $\mathcal{F}$. The additional
condition implied by our definition that $\Omega = \bigcup E_i$ is not important
(and is omitted by many authors), since if
\[ E_{n+1} = \Omega - \bigcup_{i=1}^{n} E_i \ne \emptyset \]
we can always put $c_{n+1} = 0$ and write
\[ f = \sum_{i=1}^{n+1} c_i \chi_{E_i} \]
to see that the function is $\mathcal{F}$-simple. If there is only one σ-field $\mathcal{F}$
under consideration we will talk of simple functions rather than
$\mathcal{F}$-simple functions.
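Lemma. The sum, difference and product of two simple functions are simple functions.
Proof. Suppose we have the representations
$$f = \sum_{i=1}^{n} c_i \chi_{E_i}, \qquad g = \sum_{j=1}^{m} d_j \chi_{A_j};$$
then the sets $H_{ij} = E_i \cap A_j$ $(i = 1, 2, \dots, n;\ j = 1, 2, \dots, m)$ are in $\mathcal{F}$ and form a dissection of $\Omega$. Further
$$f(x) = c_i \text{ and } g(x) = d_j \text{ for } x \in H_{ij}, \qquad \chi_{H_{ij}} = \chi_{E_i} \chi_{A_j},$$
so that $(f \pm g)(x) = c_i \pm d_j$, $(fg)(x) = c_i d_j$ for $x \in H_{ij}$, and
$$f \pm g = \sum_{i=1}^{n} \sum_{j=1}^{m} (c_i \pm d_j) \chi_{H_{ij}}, \qquad fg = \sum_{i=1}^{n} \sum_{j=1}^{m} c_i d_j \chi_{H_{ij}}. \ \blacksquare$$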
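The common-refinement argument of this proof can be tried out on a small finite space. The following is only a sketch under our own assumptions: the space `omega`, the helper names `add_simple` and `evaluate`, and the representation of a simple function as a list of (value, set) pairs are all illustrative and not part of the text. Empty intersections are dropped, which does not change the function represented.

```python
from itertools import product

def add_simple(f, g):
    """Represent f + g on the common refinement H_ij = E_i & A_j.

    f and g are lists of (value, frozenset) pairs whose sets form
    dissections of the same finite space. Empty H_ij are omitted."""
    return [(c + d, E & A) for (c, E), (d, A) in product(f, g) if E & A]

def evaluate(rep, x):
    """Value at x of the simple function given by the (value, set) pairs."""
    for c, E in rep:
        if x in E:
            return c
    raise ValueError("x is not covered by the dissection")

# Hypothetical example data on the six-point space {0, ..., 5}.
omega = frozenset(range(6))
f = [(1, frozenset({0, 1, 2})), (4, frozenset({3, 4, 5}))]
g = [(10, frozenset({0, 3})), (20, frozenset({1, 2, 4, 5}))]

h = add_simple(f, g)
print(h)
```

Each piece of `h` is one of the sets $H_{ij}$, and on it the sum takes the single value $c_i + d_j$, exactly as in the proof.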
Note that the constant functions
$$f(x) = c \quad \text{for all } x \in \Omega$$
are simple, so that by this lemma it also follows that $cf$ is simple if $f$ is, and the class of simple functions forms a linear space over the reals.
One should regard simple functions as a generalisation of `step' functions, but it is clear that they form a very restricted class since the image of $\Omega$ under a simple function is a finite subset of $\mathbf{R}$.
In defining measurability we will want to consider functions $f: \Omega \to \mathbf{R}^*$ with extended real values. It is possible to define a topology in $\mathbf{R}^*$ and to define the class of Borel sets in $\mathbf{R}^*$ in terms of this topology. However, we adopt the simpler procedure of defining the class $\mathcal{B}^*$ of Borel sets in $\mathbf{R}^*$ directly. We say that a set $B \subset \mathbf{R}^*$ is a Borel set in $\mathbf{R}^*$ if it is the union of a set in $\mathcal{B}$ (the class of Borel sets in $\mathbf{R}$) with any subset of $\mathbf{R}^* - \mathbf{R} = \{-\infty, +\infty\}$.
Measurable function
A function $f: \Omega \to \mathbf{R}^*$ is said to be $\mathcal{F}$-measurable if and only if
$$f^{-1}(B) \in \mathcal{F}$$
for every $B \in \mathcal{B}^*$. If there is only one $\sigma$-field $\mathcal{F}$ under discussion we may say that $f$ is a measurable function.
From the definition it appears at first sight that one has to work hard to check that a given function is $\mathcal{F}$-measurable. However, in practice it is sufficient to check that $f^{-1}(E) \in \mathcal{F}$ for a suitable class of subsets which generates the $\sigma$-field $\mathcal{B}^*$. The most important such class is given by the next theorem.
Theorem 5.1. In order that $f: \Omega \to \mathbf{R}^*$ be $\mathcal{F}$-measurable each of the following conditions is necessary and sufficient:
(i) $\{x: f(x) \le c\} \in \mathcal{F}$ for all $c \in \mathbf{R}$;
(ii) $\{x: f(x) > c\} \in \mathcal{F}$ for all $c \in \mathbf{R}$;
(iii) $\{x: f(x) \ge c\} \in \mathcal{F}$ for all $c \in \mathbf{R}$;
(iv) $\{x: f(x) < c\} \in \mathcal{F}$ for all $c \in \mathbf{R}$.
Proof (i). Since $[-\infty, c] \in \mathcal{B}^*$, it is clear that the given condition is necessary. If we suppose that the condition is satisfied, and put
$$E_c = \{x: f(x) \le c\} = f^{-1}[-\infty, c],$$
then $E_c \in \mathcal{F}$ for all $c \in \mathbf{R}$. But the sets $I_c = [-\infty, c]$, $c \in \mathbf{R}$, generate the $\sigma$-field $\mathcal{B}^*$, so that, for each $B \in \mathcal{B}^*$ (see exercise 1.5 (10)), the set $f^{-1}(B)$ is in the $\sigma$-field of subsets of $\Omega$ generated by the sets $E_c$, $c \in \mathbf{R}$. Since $\mathcal{F}$ is a $\sigma$-field we have
$$f^{-1}(B) \in \mathcal{F}$$
for all $B \in \mathcal{B}^*$.
(ii), (iii) and (iv). A similar proof can be constructed for each case. Alternatively, it is easy to prove directly that each of (i), (ii), (iii), (iv) is equivalent to each of the others. $\blacksquare$
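Corollary. Any $\mathcal{F}$-simple function is $\mathcal{F}$-measurable.
Proof. If
$$f = \sum_{i=1}^{n} c_i \chi_{E_i}, \quad \text{then} \quad E_c = \{x: f(x) \le c\}$$
is the finite union of those sets $E_i$ $(\in \mathcal{F})$ for which $c_i \le c$, and is therefore in $\mathcal{F}$. By condition (i) of the theorem, this implies that $f$ is measurable. $\blacksquare$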
The next theorem examines further the relationship between simple
functions and measurable functions. It is both important and some-
what surprising.
Theorem 5.2. Any non-negative measurable function $f: \Omega \to \mathbf{R}^+$ is the limit of a monotone increasing sequence of non-negative simple functions.
Proof. For each positive integer $s$, let
$$Q_{p,s} = \left\{x: \frac{p-1}{2^s} \le f(x) < \frac{p}{2^s}\right\} \quad (p = 1, 2, \dots, 2^{2s});$$
$$Q_{0,s} = \Omega - \bigcup_{p=1}^{2^{2s}} Q_{p,s} = \{x: f(x) \ge 2^s\}.$$
Then, since $f$ is $\mathcal{F}$-measurable, $Q_{p,s} \in \mathcal{F}$ and the sets
$$Q_{p,s} \quad (p = 0, 1, \dots, 2^{2s})$$
form an $\mathcal{F}$-dissection of $\Omega$. The function
$$f_s(x) = \frac{p-1}{2^s} \ \text{ for } x \in Q_{p,s} \ (p = 1, 2, \dots, 2^{2s}); \qquad f_s(x) = 2^s \ \text{ for } x \in Q_{0,s}$$
is a simple function and it is immediate that
$$0 \le f_s \le f.$$
If $x \in Q_{p,s}$ with $p \ge 1$, then either $x \in Q_{2p-1,s+1}$ or $x \in Q_{2p,s+1}$, so that either
$$f_s(x) = f_{s+1}(x) \quad \text{or} \quad f_s(x) + \frac{1}{2^{s+1}} = f_{s+1}(x).$$
Further, if $x \in Q_{0,s}$, then $f_s(x) = 2^s \le f(x)$ so that either $x \in Q_{0,s+1}$, or $x \in Q_{p,s+1}$ for some $p \ge 2^{2s+1} + 1$; and in either case $f_{s+1}(x) \ge f_s(x)$. Thus for each integer $s$
$$f_{s+1}(x) \ge f_s(x) \quad \text{for all } x \in \Omega;$$
and the sequence $\{f_s\}$ of simple functions is monotone increasing.
If $x$ is such that $f(x)$ is finite, then, if $2^s > f(x)$ we have
$$0 \le f(x) - f_s(x) < 2^{-s}$$
so that in this case $f_s(x) \to f(x)$ as $s \to \infty$. On the other hand, if
$$f(x) = +\infty, \quad \text{then} \quad f_s(x) = 2^s,$$
so that again $f_s(x) \to f(x)$ as $s \to \infty$. $\blacksquare$
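The dyadic construction in this proof depends on $x$ only through the value $f(x)$, so it can be sketched as a function of that value. The code below is an illustration under our own assumptions (the name `dyadic_approx` and the sample values are hypothetical); it rounds $f(x)$ down to the nearest multiple of $2^{-s}$ and caps the result at $2^s$.

```python
import math

def dyadic_approx(value, s):
    """Return f_s(x) for value = f(x) >= 0, as in the proof of theorem 5.2.

    f_s = (p - 1) / 2**s on the set where (p - 1) / 2**s <= f < p / 2**s,
    and f_s = 2**s on the set where f >= 2**s (this includes f = +infinity)."""
    if value >= 2 ** s:
        return float(2 ** s)
    return math.floor(value * 2 ** s) / 2 ** s

# Sample values of f(x), chosen only for illustration.
for v in [0.0, 0.3, 1.0, math.pi, 10.0, math.inf]:
    print(v, [dyadic_approx(v, s) for s in range(1, 6)])
```

For each finite value the printed sequence increases towards it, and for the value $+\infty$ the sequence is $2, 4, 8, \dots$, which is the behaviour established in the two cases at the end of the proof.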
For any function $f: \Omega \to \mathbf{R}^*$ we define the positive and negative parts $f_+$, $f_-$ of $f$ by
$$f_+(x) = \max[0, f(x)], \qquad f_-(x) = -\min[0, f(x)].$$
Then clearly for all $x$,
$$f(x) = f_+(x) - f_-(x), \qquad |f(x)| = f_+(x) + f_-(x),$$
and each of the functions $f_+$, $f_-$ is non-negative. It follows immediately from theorem 5.1 that, for any measurable $f$, $f_+$ and $f_-$ are both measurable. An application of theorem 5.2 now shows that any measurable function can be expressed as a limit of simple functions. Our next step is to show that the class of measurable functions $f: \Omega \to \mathbf{R}^*$ is closed both for finite algebraic operations and for countable limiting operations. A minor difficulty arises in that $\mathbf{R}^*$ is not an algebraic field so that, for example, $(f + g)(x)$ is not defined at points where $f(x) = +\infty$, $g(x) = -\infty$. In the following theorem, therefore, we assume that the functions are such that the algebraic operations are possible.
Theorem 5.3. If $f$ and $g$ are measurable functions $\Omega \to \mathbf{R}^*$ and $k \in \mathbf{R}$, then each of the functions
$$f + k,\ kf,\ f + g,\ f^2,\ fg,\ 1/f \ (\text{where } (1/f)(x) = +\infty \text{ if } f(x) = 0),\ \max(f, g),\ \min(f, g),\ f_+,\ f_-,\ |f|$$
which is defined, is measurable.

Proof. The measurability of the first two functions $f + k$, $kf$ follows immediately from any part of theorem 5.1. Consider now the function $(f + g)$. Let $\{r_i\}$ be a sequence containing all the rationals in $\mathbf{R}$. Then, since $\{r_i\}$ is dense in $\mathbf{R}$, for any $c \in \mathbf{R}$,
$$\{x: f(x) + g(x) > c\} = \bigcup_{i=1}^{\infty} \{x: f(x) > r_i\} \cap \{x: g(x) > c - r_i\}.$$
By theorem 5.1 each of the sets on the right-hand side is in $\mathcal{F}$ so that, because $\mathcal{F}$ is a $\sigma$-field, the set on the left is also in $\mathcal{F}$, and $(f + g)$ must be measurable.
Now
$$\{x: (f(x))^2 \le c\} = \begin{cases} \emptyset & \text{if } c < 0, \\ \{x: f(x) = 0\} & \text{if } c = 0, \\ \{x: -\sqrt{c} \le f(x) \le \sqrt{c}\} & \text{if } c > 0, \end{cases}$$
and each of these sets is in $\mathcal{F}$, so that $f^2$ is measurable. Further
$$\{x: (1/f)(x) < c\} = \begin{cases} \{x: 1/c < f(x) < 0\} & \text{if } c < 0, \\ \{x: -\infty < f(x) < 0\} & \text{if } c = 0, \\ \{x: -\infty \le f(x) < 0\} \cup \{x: f(x) > 1/c\} & \text{if } c > 0, \end{cases}$$
and each of these sets is in $\mathcal{F}$, so that $1/f$ is measurable.
We have already seen that $f_+$ and $f_-$ are measurable, so that $|f| = f_+ + f_-$ is also measurable. The measurability of the remaining functions now follows from the identities:
$$fg = \tfrac{1}{2}[(f + g)^2 - f^2 - g^2],$$
$$\max(f, g) = \tfrac{1}{2}[f + g + |f - g|],$$
$$\min(f, g) = f + g - \max(f, g). \ \blacksquare$$
It is clear that the above theorem extends immediately to functions
obtained by carrying out a finite number of algebraic operations on
any finite collection of measurable functions.
Theorem 5.4. Suppose $\{f_n\}$, $n = 1, 2, \dots$ is a sequence of measurable functions $\Omega \to \mathbf{R}^*$; then
(i) the functions $\sup_n f_n$ and $\inf_n f_n$ are measurable;
(ii) the functions $\limsup_{n\to\infty} f_n$, $\liminf_{n\to\infty} f_n$ are measurable;
(iii) if the sequence $\{f_n\}$ converges, and in particular if it is monotone, $\lim f_n$ is measurable.
Proof. (i)
$$\{x: \sup_n f_n(x) > c\} = \bigcup_{n=1}^{\infty} \{x: f_n(x) > c\}.$$
Since $\mathcal{F}$ is a $\sigma$-field, an application of theorem 5.1 now shows that $\sup_n f_n$ is measurable. The case of $\inf_n f_n$ can be proved similarly or it may be deduced from
$$\inf_n f_n = -\sup_n (-f_n).$$
(ii) Suppose now that $\{f_n\}$ is monotone increasing; then
$$\lim_{n\to\infty} f_n = \sup_n f_n$$
and is therefore measurable. Similarly, if $\{f_n\}$ is decreasing,
$$\lim_{n\to\infty} f_n = \inf_n f_n.$$
If
$$g_n = \sup_{r \ge n} f_r, \qquad h_n = \inf_{r \ge n} f_r,$$
then $\{g_n\}$, $\{h_n\}$ are monotone sequences, and
$$\limsup_{n\to\infty} f_n = \lim_{n\to\infty} g_n, \qquad \liminf_{n\to\infty} f_n = \lim_{n\to\infty} h_n,$$
so that both are measurable.
(iii) If $\{f_n\}$ converges its limit will be measurable because it is the common value of the measurable functions $\limsup f_n$, $\liminf f_n$. $\blacksquare$
It should be remarked that the class of measurable functions is not closed for non-countable operations of the above type. Thus, if $A$ is non-countable and $f_a: \Omega \to \mathbf{R}^*$ is measurable for each $a \in A$, there is no reason for
$$f(x) = \sup_{a \in A} f_a(x)$$
to be measurable. For example, let $A$ be a subset of $[0, 1]$ which is not $\mathcal{L}$-measurable (see §4.4), and put
$$f_a(x) = 1 \ \text{ if } x = a; \qquad f_a(x) = 0 \ \text{ if } x \ne a.$$
Then for each $a \in A$, $f_a$ is $\mathcal{L}$-measurable (it is actually $\mathcal{L}$-simple) but
$$\chi_A(x) = \sup_{a \in A} f_a(x)$$
is certainly not $\mathcal{L}$-measurable. In practice when one needs to consider non-countable suprema (as in the theory of stochastic processes with continuous time parameter) one tries to replace the index set $A$ by a countable subset giving the same supremum for the family (at least except for a special subset of $\Omega$ of zero measure). If this procedure is impossible for any reason, then there are very serious difficulties in using non-countable suprema.
In the special case where $\Omega$ is a topological space and $\mathcal{B}$ is the $\sigma$-field of Borel sets in $\Omega$, there is a special name for $\mathcal{B}$-measurable functions.
Borel measurability
If $\mathcal{B}$ is the class of Borel sets in $\Omega$ and $f: \Omega \to \mathbf{R}^*$ is $\mathcal{B}$-measurable, then we say that $f$ is a Borel measurable function on $\Omega$.
Lemma. Any continuous function $f: \Omega \to \mathbf{R}$ on a topological space $\Omega$ is Borel measurable.
Proof. Since, for continuous $f$, the inverse image of an open set in $\mathbf{R}$ is open in $\Omega$, it follows that $\{x: f(x) < c\}$ is open for all $c \in \mathbf{R}$ and is therefore in $\mathcal{B}$. $\blacksquare$
If $\mathcal{F}$, $\mathcal{G}$ are any two $\sigma$-fields of subsets of $\Omega$ such that $\mathcal{F} \supset \mathcal{G}$, it is immediate that any function $f: \Omega \to \mathbf{R}^*$ which is $\mathcal{G}$-measurable is also $\mathcal{F}$-measurable. In particular if $\mathcal{F} \supset \mathcal{B}$, then any continuous function on a topological space $\Omega$ is $\mathcal{F}$-measurable. If $\Omega = \mathbf{R}^k$ (Euclidean $k$-space) then we know that the class $\mathcal{L}^k$ of Lebesgue measurable sets, and the class $\mathcal{L}_F$ of sets which are measurable with respect to the Lebesgue-Stieltjes measure defined by $F$, each contain $\mathcal{B}^k$, the Borel sets in $\mathbf{R}^k$. Hence, all continuous functions on $\mathbf{R}^k$ are Borel measurable and therefore $\mathcal{L}_F$-measurable for any $F$ (in particular they are $\mathcal{L}^k$-measurable, which we call Lebesgue measurable). Functions which normally occur in real analysis are usually obtainable from continuous functions and simple functions by operations of the following types:
(i) finite algebraic operations;
(ii) countable limiting operations;
(iii) composition.
We have already seen that operations of types (i) and (ii) preserve measurability, so that we should consider whether composition operations can be carried out within the class of measurable functions.
Lemma. Suppose $f: \mathbf{R}^* \to \mathbf{R}^*$ is Borel measurable and $g: \Omega \to \mathbf{R}^*$ is $\mathcal{F}$-measurable; then the composite function $f \circ g: \Omega \to \mathbf{R}^*$ is $\mathcal{F}$-measurable.
Proof. If $A$ is any Borel set in $\mathbf{R}^*$, then since $f$ is Borel measurable, the set $f^{-1}(A)$ is also a Borel set in $\mathbf{R}^*$. Now
$$\{x: f(g(x)) \in A\} = \{x: g(x) \in B\} \in \mathcal{F}$$
since $B = f^{-1}(A) \in \mathcal{B}^*$. $\blacksquare$
Remark. In the above lemma, it is not sufficient to assume that $f: \mathbf{R}^* \to \mathbf{R}^*$ is Lebesgue measurable (see exercise 5.2 (9)).
This means that, for most of the functions which normally occur in analysis, it is immediately obvious that they are $\mathcal{L}_F$-measurable for every $F$, and in particular that they are Lebesgue measurable.
Almost everywhere (a.e.)
It is convenient to have a way of describing the behaviour of a function $f: \Omega \to \mathbf{R}^*$ outside an (unspecified) set of zero measure. If $P$ is some property describing the behaviour of $f(x)$ at a particular point $x$, then we say that $f(x)$ has property $P$ almost everywhere with respect to $\mu$ if there is some set $E \in \mathcal{F}$ with $\mu(E) = 0$ such that $f(x)$ has property $P$ for all $x \in \Omega - E$. We then write
$$f(x) \text{ has property } P \text{ a.e. } (\mu);$$
and, if there is no ambiguity about the measure being considered, $(\mu)$ can be omitted.
Lemma. If $\mathcal{F}$ is complete with respect to $\mu$, and $f = g$ a.e., then $f$ is measurable if and only if $g$ is measurable.
Proof. For any $c \in \mathbf{R}$ the set
$$\{x: f(x) \le c\} \,\triangle\, \{x: g(x) \le c\} \subset \{x: f(x) \ne g(x)\}$$
so that $\{x: f(x) \le c\}$ differs from $\{x: g(x) \le c\}$ by a subset of a set of zero measure. If $\mathcal{F}$ is complete with respect to $\mu$, all such sets are in $\mathcal{F}$ so that
$$\{x: f(x) \le c\} \in \mathcal{F} \text{ if and only if } \{x: g(x) \le c\} \in \mathcal{F}. \ \blacksquare$$

Exercises 5.2
1. In theorem 5.1, show that the condition $\{x: f(x) \le c\} \in \mathcal{F}$ for all rational $c$ is sufficient to imply that $f: \Omega \to \mathbf{R}^*$ is $\mathcal{F}$-measurable.
2. Suppose $\{f_n\}$ is a sequence of functions $\Omega \to \mathbf{R}^*$ each of which is finite a.e. Show that, for almost all $x$ in $\Omega$, $f_n(x)$ is finite for all $n$.
3. Suppose $G$ is an open set in $\mathbf{R}$ and $\{f_n\}$ is a convergent sequence of functions $\Omega \to \mathbf{R}$. Show that
$$\{x: \lim_{n\to\infty} f_n(x) \in G\} = \bigcup_{m=1}^{\infty} \bigcup_{k=1}^{\infty} \bigcap_{n=k}^{\infty} \{x: d(f_n(x), \mathbf{R} - G) > 1/m\},$$
where $d(y, E)$ denotes the distance from $y$ to $E$ (defined in §2.1).
4. Show that, in theorem 5.2, the condition $f \ge 0$ can be deleted provided we do not require monotonicity for the sequence $\{f_n\}$ of simple functions converging to $f$. Show that if $f$ is unbounded above and below it is impossible to arrange for the sequence $\{f_n\}$ to be monotone.
5. An elementary function is one which assumes a countable set of values each on a measurable subset of $\Omega$. Show that, if $f: \Omega \to \mathbf{R}$ is measurable then it is the uniform limit of a monotone sequence of elementary functions, but that if $f$ is not bounded it is not the uniform limit of simple functions.
6. If $\mathcal{F}$ is a finite field of subsets of $\Omega$, show that $f: \Omega \to \mathbf{R}$ is $\mathcal{F}$-measurable if and only if it is $\mathcal{F}$-simple.
7. If $\Omega$ is a topological space, give examples to show that, for $f: \Omega \to \mathbf{R}$, the condition
`$f$ is continuous a.e. in $\Omega$'
neither implies nor is implied by the condition
`there is a continuous $g: \Omega \to \mathbf{R}$ for which $f = g$ a.e.'
8. Suppose $\Omega$ is a topological space, $\mathcal{F} \supset \mathcal{B}$ and $(\Omega, \mathcal{F}, \mu)$ is such that $\mathcal{F}$ is complete with respect to $\mu$. Show that any function $f$ which is continuous a.e. is $\mathcal{F}$-measurable. Give an example of a measurable function which cannot be made continuous by altering its values on any set of zero measure.
9. If $\mathcal{L}$ is the class of Lebesgue measurable sets in $\mathbf{R}$, give an example of an $\mathcal{L}$-measurable function $f: \mathbf{R} \to \mathbf{R}$ and an $\mathcal{L}$-measurable set $E \subset \mathbf{R}$ for which $f^{-1}(E)$ is not $\mathcal{L}$-measurable.
Hint. Use a suitable subset of the Cantor set (see §2.7).

5.3 Definition of the integral

Our method is to define the operation of integration first for non-negative simple functions, and then extend the definition step-by-step, showing at each stage that the desirable properties discussed in §5.1 are obtained. If we think of measure as a mass distribution in $\Omega$, and integration as a means of averaging a given function $f$ with respect to this mass distribution, it is clear that there is only one reasonable definition for the integral of
(1) A non-negative simple function
If
$$f(x) = \sum_{i=1}^{n} c_i \chi_{E_i}(x), \qquad (5.3.1)$$
where $c_i \ge 0$ $(i = 1, 2, \dots, n)$ we define
$$\int f \, d\mu = \sum_{i=1}^{n} c_i \mu(E_i).$$
This sum is always defined since each of the terms is non-negative. It is called the integral of $f$ with respect to $\mu$. (Note that if $c_i = 0$, $\mu(E_i) = +\infty$ our convention is that $c_i \mu(E_i) = 0$.) Since the representation of a simple function in the form (5.3.1) is not unique we must first see that our definition of the integral does not depend on the particular representation used. Suppose
$$f = \sum_{i=1}^{n} c_i \chi_{E_i} = \sum_{j=1}^{m} d_j \chi_{F_j};$$
then since both systems of sets are dissections of $\Omega$,
$$\mu(E_i) = \sum_{j=1}^{m} \mu(E_i \cap F_j) \quad \text{and} \quad \mu(F_j) = \sum_{i=1}^{n} \mu(E_i \cap F_j). \qquad (5.3.2)$$
Also if $E_i \cap F_j$ is not empty, it will contain a point $x$ and $f(x) = c_i = d_j$. Thus
$$\sum_{i=1}^{n} c_i \mu(E_i) = \sum_{i=1}^{n} \sum_{j=1}^{m} c_i \mu(E_i \cap F_j) = \sum_{i=1}^{n} \sum_{j=1}^{m} d_j \mu(E_i \cap F_j) = \sum_{j=1}^{m} d_j \mu(F_j).$$
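The independence of the integral from the representation can be checked by hand on a finite measure space. The sketch below assumes a hypothetical six-point space with point masses `WEIGHTS` of our own choosing; the names `mu` and `integrate_simple` are illustrative and do not come from the text. Exact rational arithmetic is used so that the two sums can be compared without rounding.

```python
from fractions import Fraction

# Hypothetical finite measure space: Omega = {0, ..., 5} with point masses.
WEIGHTS = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6),
           3: Fraction(1), 4: Fraction(2), 5: Fraction(3)}

def mu(E):
    """Measure of a subset E of Omega: the sum of its point masses."""
    return sum((WEIGHTS[x] for x in E), Fraction(0))

def integrate_simple(representation):
    """Integral of sum_i c_i * chi_{E_i}, given as (c_i, E_i) pairs with c_i >= 0.

    Terms with c_i == 0 are skipped, matching the convention c_i * mu(E_i) = 0."""
    return sum((c * mu(E) for c, E in representation if c != 0), Fraction(0))

# Two representations of the same simple function (value 2 on {0,1,2,3}, 5 on {4,5}).
rep_fine = [(2, {0, 1}), (2, {2, 3}), (5, {4, 5})]
rep_coarse = [(2, {0, 1, 2, 3}), (5, {4, 5})]

print(integrate_simple(rep_fine), integrate_simple(rep_coarse))
```

Both representations give the same value, in agreement with the argument based on (5.3.2).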
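Now consider two non-negative simple functions
$$f = \sum_{i=1}^{n} c_i \chi_{E_i}, \qquad g = \sum_{j=1}^{m} d_j \chi_{F_j}$$
and use the representations
$$f = \sum_{i=1}^{n} \sum_{j=1}^{m} c_i \chi_{E_i \cap F_j}, \qquad g = \sum_{i=1}^{n} \sum_{j=1}^{m} d_j \chi_{E_i \cap F_j},$$
in terms of the dissection $E_i \cap F_j$. Then the simple function $(f + g)$ has the representation
$$f + g = \sum_{i=1}^{n} \sum_{j=1}^{m} (c_i + d_j) \chi_{E_i \cap F_j},$$
and
$$\int (f + g) \, d\mu = \sum_{i=1}^{n} \sum_{j=1}^{m} (c_i + d_j) \mu(E_i \cap F_j)$$
$$= \sum_{i=1}^{n} \sum_{j=1}^{m} c_i \mu(E_i \cap F_j) + \sum_{i=1}^{n} \sum_{j=1}^{m} d_j \mu(E_i \cap F_j)$$
$$= \sum_{i=1}^{n} c_i \mu(E_i) + \sum_{j=1}^{m} d_j \mu(F_j), \quad \text{using (5.3.2)},$$
$$= \int f \, d\mu + \int g \, d\mu.$$
It is now immediate from the definition that if $\alpha \ge 0$, $\beta \ge 0$ and $f$, $g$ are non-negative simple functions then
$$\int (\alpha f + \beta g) \, d\mu = \alpha \int f \, d\mu + \beta \int g \, d\mu,$$
so that our operator is linear on the class of non-negative simple functions. It is also clear that it is order preserving; that is, if $f$, $g$ are non-negative simple functions and $f \ge g$ then $\int f \, d\mu \ge \int g \, d\mu$.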
These properties allow us to extend our definition to:
(2) Non-negative measurable functions
Given a measurable $f: \Omega \to \mathbf{R}^+$, by theorem 5.2 there is a monotone increasing sequence $\{f_n\}$ of non-negative simple functions such that $f_n \to f$. Since $\int f_n \, d\mu$ is defined for all $n$, and is monotone increasing, it has a limit in $\mathbf{R}^+$ (which may be $+\infty$). We define
$$\int f \, d\mu = \lim_{n\to\infty} \int f_n \, d\mu. \qquad (5.3.3)$$
Since there are many possible monotone sequences of simple functions which converge to a given non-negative measurable $f$, we must show that the integral $\int f \, d\mu$ defined in this way is independent of the particular sequence used.
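Suppose $\{f_n\}$ is an increasing sequence of non-negative simple functions and $f = \lim f_n \ge g$, where $g$ is non-negative simple. The first (and main) step in showing that our definition (5.3.3) is proper is to show that, in these circumstances,
$$\lim_{n\to\infty} \int f_n \, d\mu \ge \int g \, d\mu. \qquad (5.3.4)$$
Put
$$g = \sum_{i=1}^{k} c_i \chi_{E_i};$$
then if $\int g \, d\mu = +\infty$, there must be an integer $i$, $1 \le i \le k$, such that $c_i > 0$, $\mu(E_i) = +\infty$. Then for any fixed $\epsilon$ such that $0 < \epsilon < c_i$, the sequence of sets $\{A_n \cap E_i\}$ $(n = 1, 2, \dots)$ is monotone increasing to $E_i$, where
$$A_n = \{x: f_n(x) + \epsilon > g(x)\}. \qquad (5.3.5)$$
Hence $\mu(A_n \cap E_i) \to +\infty$ as $n \to \infty$, by theorem 3.2. But
$$\int f_n \, d\mu \ge \int f_n \chi_{A_n \cap E_i} \, d\mu \ge (c_i - \epsilon) \mu(A_n \cap E_i) \to +\infty \quad \text{as } n \to \infty.$$
Thus (5.3.4) is established if $\int g \, d\mu = +\infty$. Now assume that $\int g \, d\mu$ is finite and put
$$A = \{x: g(x) > 0\} = \bigcup_{c_i > 0} E_i.$$
Since $g$ is simple, $c = \min_{c_i > 0} c_i$ is positive, and $\mu(A) < \infty$. We now suppose $\epsilon > 0$ and again define $A_n$ by (5.3.5). Then
$$\int f_n \, d\mu \ge \int f_n \chi_{A_n \cap A} \, d\mu \ge \int (g - \epsilon) \chi_{A_n \cap A} \, d\mu$$
$$= \int g \chi_{A_n \cap A} \, d\mu - \epsilon \mu(A_n \cap A) \ge \int g \chi_{A_n \cap A} \, d\mu - \epsilon \mu(A).$$
Since $\mu(A_n \cap E_i) \to \mu(E_i)$ for each $i$, we can evaluate the integrals as finite sums and find an integer $n_0 = n_0(\epsilon)$ such that
$$\int f_n \, d\mu \ge \int g \chi_A \, d\mu - \epsilon - \epsilon \mu(A) \quad \text{for } n \ge n_0,$$
so that we have established (5.3.4) also in the case $\int g \, d\mu < \infty$.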
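We can now suppose given two sequences of simple functions
$$0 \le f_1 \le f_2 \le \dots \le f_n \le \dots, \qquad 0 \le g_1 \le g_2 \le \dots \le g_n \le \dots,$$
each monotone increasing and convergent to $f$. Then for each fixed $m$ we have, by (5.3.4), since
$$f = \lim f_n \ge g_m,$$
$$\lim_{n\to\infty} \int f_n \, d\mu \ge \int g_m \, d\mu.$$
If we now let $m \to \infty$,
$$\lim_{n\to\infty} \int f_n \, d\mu \ge \lim_{m\to\infty} \int g_m \, d\mu.$$
Since the situation is symmetrical, the opposite inequality is similarly proved and we must have
$$\lim_{n\to\infty} \int f_n \, d\mu = \lim_{m\to\infty} \int g_m \, d\mu.$$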
Thus the operation of integration is properly defined for non-negative measurable functions. Because of the corresponding result for non-negative simple functions, it now follows that, if $f$, $g$ are two non-negative measurable functions and $\alpha \ge 0$, $\beta \ge 0$ then
$$\int (\alpha f + \beta g) \, d\mu = \alpha \int f \, d\mu + \beta \int g \, d\mu.$$
By our definition, for $f \ge 0$ and measurable, $\int f \, d\mu$ may be finite or $+\infty$. A non-negative measurable function $f$ is said to be integrable with respect to the measure $\mu$ if $\int f \, d\mu$ is finite.
There are clearly two possible reasons for such an $f$ to fail to be integrable. Either there is a simple function $g \le f$ for which $\int g \, d\mu = \infty$, which would imply the existence of a $c > 0$ for which $\mu\{x: f(x) > c\} = +\infty$, or alternatively it is possible that $\int g \, d\mu$ is finite for all simple functions $g \le f$ (which implies $\mu\{x: f(x) > c\} < \infty$, all $c > 0$) but, for any sequence $g_n$ of simple functions converging to $f$, $\int g_n \, d\mu \to +\infty$ as $n \to \infty$.
We can now define the integral for:
(3) Integrable measurable functions
We know that if $f: \Omega \to \mathbf{R}^*$ is measurable, then so are
$$f_+, \ f_- \quad \text{and} \quad f = f_+ - f_-.$$
If both $f_+$ and $f_-$ are integrable, then we say that $f$ is integrable and
$$\int f \, d\mu = \int f_+ \, d\mu - \int f_- \, d\mu.$$
Thus our operation of integration is now well-defined on the class of integrable functions. We will show in the next section that all the desirable properties discussed in §5.1 are satisfied by this operation.
Finally, we define:

(4) Integral of a function f over a set A

This can be considered only for sets $A$ in $\mathcal{F}$. Put
$$\int_A f \, d\mu = \int f \chi_A \, d\mu$$
provided $\int f \chi_A \, d\mu$ is defined. Thus $\int_A f \, d\mu$ will be defined if either (i) $f \chi_A$ is non-negative and measurable, or (ii) $f \chi_A$ is measurable and integrable. We say that $f$ is integrable over $A$ (with respect to $\mu$) if the function $f \chi_A$ is integrable. It is clear that
$$\int_\Omega f \, d\mu = \int f \, d\mu$$
and we will usually continue to omit the set $\Omega$ when we are integrating over the whole space.
Note that, if $E \in \mathcal{F}$ and $\mu(E) = 0$, then any function $f: \Omega \to \mathbf{R}^*$ is integrable over $E$ with
$$\int_E f \, d\mu = 0.$$

Exercises 5.3
1. Show that a simple function
$$f(x) = \sum_{i=1}^{n} c_i \chi_{E_i}(x)$$
is integrable if and only if $c_i = 0$ for each integer $i$ such that $\mu(E_i) = +\infty$.
2. Let $\Omega$ be a finite set, $\mu(E)$ the number of points in $E$. Show that all functions on $\Omega$ are simple functions and that the theory of integration reduces to the theory of finite sums.
3. If $f: \Omega \to \mathbf{R}^*$ is integrable (a) show that, for any $\epsilon > 0$,
$$\mu\{x: |f(x)| \ge \epsilon\} < \infty.$$
4. Suppose $\mu_1$ and $\mu_2$ are two measures defined on $\mathcal{F}$ and $\nu = \mu_1 + \mu_2$. Show that if $f$ is integrable with respect to $\mu_1$ and $\mu_2$ over a set $E$, then it is integrable with respect to $\nu$ and
$$\int_E f \, d\nu = \int_E f \, d\mu_1 + \int_E f \, d\mu_2.$$
5. Suppose $f: \Omega \to \mathbf{R}^+$ is a non-negative measurable function. Show that
$$\int_E f \, d\mu = \sup \left[ \sum_{k=1}^{n} \mu(E_k) \inf\{f(x): x \in E_k\} \right],$$
where the supremum is taken over the collection of all finite classes $E_1, \dots, E_n$ of disjoint measurable sets with
$$E = \bigcup_{k=1}^{n} E_k.$$
(This is a possible way of defining $\int f \, d\mu$ which leads to the same class of integrable functions.)
6. Suppose $\mu(E) < \infty$ and $f: E \to \mathbf{R}$ is a measurable finite-valued function defined on $E$. Put
$$S_n(E) = \sum_{k=-\infty}^{\infty} \frac{k}{2^n} \, \mu\left\{x: x \in E, \ \frac{k}{2^n} \le f(x) < \frac{k+1}{2^n}\right\}.$$
Show that this series is absolutely convergent for all $n$ if it is absolutely convergent for any one $n \in Z$. Show that $f$ is integrable on $E$ if and only if the series converges absolutely for all $n$ and in this case
$$\int_E f \, d\mu = \lim_{n\to\infty} S_n(E).$$
Show that this is not valid if $\mu(E) = +\infty$.
(This is another possible way of defining $\int f \, d\mu$.)

5.4 Properties of the integral

We have now defined the operation of integration with respect to a measure $\mu$ on a class of integrable functions. The first objective must be to show that our operation has the properties outlined in §5.1. These are of two types: those involving only a finite number of functions, and those involving a countable class of functions. We will obtain various closure properties of the class of integrable functions while we are examining the integration operation.
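Theorem 5.5. Suppose $(\Omega, \mathcal{F}, \mu)$ is a measure space, $A$, $B$ are disjoint sets in $\mathcal{F}$ and $f: \Omega \to \mathbf{R}^*$, $g: \Omega \to \mathbf{R}^*$ are two functions integrable (over $\Omega$) with respect to $\mu$. Then $f$ is integrable over $A$, $f + g$ and $|f|$ are integrable (over $\Omega$) and
(i) $\int_{A \cup B} f \, d\mu = \int_A f \, d\mu + \int_B f \, d\mu$;
(ii) $f$ is finite a.e.;
(iii) $\int (f + g) \, d\mu = \int f \, d\mu + \int g \, d\mu$;
(iv) $\left| \int f \, d\mu \right| \le \int |f| \, d\mu$;
(v) for any $c \in \mathbf{R}$, $cf$ is integrable and $\int cf \, d\mu = c \int f \, d\mu$;
(vi) $f \ge 0 \Rightarrow \int f \, d\mu \ge 0$; $f \ge g \Rightarrow \int f \, d\mu \ge \int g \, d\mu$;
(vii) if $f \ge 0$ and $\int f \, d\mu = 0$, then $f = 0$ a.e.;
(viii) $f = g$ a.e. $\Rightarrow \int f \, d\mu = \int g \, d\mu$;
(ix) if $h: \Omega \to \mathbf{R}^*$ is $\mathcal{F}$-measurable and $|h| \le f$, then $h$ is integrable.
Corollary. Any function $f: \Omega \to \mathbf{R}^*$ which is bounded, $\mathcal{F}$-measurable, and zero outside a set $E$ in $\mathcal{F}$ of finite $\mu$-measure is integrable (over $\Omega$) with respect to $\mu$.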
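Proof. If $f: \Omega \to \mathbf{R}^+$ is non-negative measurable and integrable (over $\Omega$) and $0 \le g \le f$ with $g: \Omega \to \mathbf{R}^+$ measurable, it follows immediately from the definition of the integral of non-negative measurable functions that $g$ is integrable. Since, for any $A \in \mathcal{F}$, $\chi_A$ is measurable, we have
$$0 \le f_+ \chi_A \le f_+ \quad \text{and} \quad 0 \le f_- \chi_A \le f_-,$$
so that a function $f$ which is integrable over $\Omega$ is also integrable over any measurable set $A$.
(i) If $A$, $B$ are disjoint,
$$\chi_{A \cup B} = \chi_A + \chi_B,$$
so that
$$f_+ \chi_{A \cup B} = f_+ \chi_A + f_+ \chi_B, \qquad f_- \chi_{A \cup B} = f_- \chi_A + f_- \chi_B;$$
and since property (i) is already known for non-negative measurable functions we must have
$$\int_{A \cup B} f \, d\mu = \int f_+ \chi_{A \cup B} \, d\mu - \int f_- \chi_{A \cup B} \, d\mu$$
$$= \int f_+ \chi_A \, d\mu - \int f_- \chi_A \, d\mu + \int f_+ \chi_B \, d\mu - \int f_- \chi_B \, d\mu;$$
and
$$\int_{A \cup B} f \, d\mu = \int_A f \, d\mu + \int_B f \, d\mu, \qquad (5.4.1)$$
since all the terms are finite.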

(ii) If $f$ is not finite a.e., then at least one of the sets
$$A_1 = \{x: f(x) = +\infty\}, \qquad A_2 = \{x: f(x) = -\infty\}$$
has positive measure. Suppose $\mu(A_1) > 0$. Then it follows immediately from the definition that $\int f_+ \, d\mu = +\infty$, which means that $f$ is not integrable (over $\Omega$).
(iii) This has already been proved for non-negative functions $f$, $g$. If $f_1$, $f_2$ are non-negative and $f = f_1 - f_2$, then $f_1 + f_- = f_2 + f_+$ and applying (iii) for non-negative functions gives
$$\int f_1 \, d\mu + \int f_- \, d\mu = \int f_2 \, d\mu + \int f_+ \, d\mu$$
so that
$$\int f \, d\mu = \int f_1 \, d\mu - \int f_2 \, d\mu.$$
Now the general result follows since, for finite $f$, $g$,
$$f + g = (f_+ + g_+) - (f_- + g_-),$$
so that
$$\int (f + g) \, d\mu = \int (f_+ + g_+) \, d\mu - \int (f_- + g_-) \, d\mu$$
$$= \int f_+ \, d\mu - \int f_- \, d\mu + \int g_+ \, d\mu - \int g_- \, d\mu$$
$$= \int f \, d\mu + \int g \, d\mu.$$
Finally apply (iii) to the function $|f| = f_+ + f_-$ to deduce that $|f|$ is integrable and
$$\int |f| \, d\mu = \int f_+ \, d\mu + \int f_- \, d\mu.$$

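(iv) This now follows immediately as
$$\left| \int f \, d\mu \right| = \left| \int f_+ \, d\mu - \int f_- \, d\mu \right| \le \int f_+ \, d\mu + \int f_- \, d\mu.$$
(v) If $c = 0$, $cf = 0$ and $\int cf \, d\mu = 0 = c \int f \, d\mu$. If $c > 0$ then
$$(cf)_+ = c f_+, \qquad (cf)_- = c f_-$$
and the result follows since it has already been proved for non-negative functions (p. 113). Similarly, if $c < 0$,
$$(cf)_+ = -c f_-, \qquad (cf)_- = -c f_+,$$
$$\int (cf)_+ \, d\mu - \int (cf)_- \, d\mu = (-c) \int f_- \, d\mu - (-c) \int f_+ \, d\mu = c \int f \, d\mu.$$
(vi) The first statement follows from the definition. If $f \ge g$, then $f = g + (f - g)$, and $(f - g) \ge 0$. By (iii), we now have
$$\int f \, d\mu = \int g \, d\mu + \int (f - g) \, d\mu \ge \int g \, d\mu.$$
(vii) If $\{x: f(x) > 0\}$ has positive measure, then by theorem 3.2 there is an integer $n$ such that, if
$$A = \{x: f(x) > 1/n\},$$
$\mu(A) > 0$. But $n^{-1} \chi_A \le f \chi_A \le f$, so that
$$\int f \, d\mu \ge \frac{1}{n} \int \chi_A \, d\mu = \frac{1}{n} \mu(A) > 0.$$
Hence, if $f \ge 0$ and $\int f \, d\mu = 0$, we must have $\mu\{x: f(x) > 0\} = 0$.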

(viii) If $f = g$ a.e., then $f_+ = g_+$, $f_- = g_-$ a.e. In the construction of theorem 5.2, the sets $Q_{p,s}$ for the two functions $f_+$ and $g_+$ will all have the same measure. Hence, there are simple functions $f_n \to f_+$, $g_n \to g_+$ such that
$$\int f_n \, d\mu = \int g_n \, d\mu \quad (n = 1, 2, \dots),$$
and it follows that $\int f_+ \, d\mu = \int g_+ \, d\mu$. Similarly $\int f_- \, d\mu = \int g_- \, d\mu$.
(ix) If $|h| \le f$ then $0 \le h_+ \le f$, $0 \le h_- \le f$. From (vi) it now follows that each of $\int h_+ \, d\mu$, $\int h_- \, d\mu$ is finite, and $h$ is therefore integrable. $\blacksquare$
Proof of corollary. If $|f| \le K$, then the simple function $K \chi_E$ is integrable and the integrability of $f$ now follows from (ix). $\blacksquare$
Remark. If $\mathcal{F}$ is complete with respect to $\mu$, then (viii) can clearly be strengthened as follows:
If $f: \Omega \to \mathbf{R}^*$ is integrable, and $g: \Omega \to \mathbf{R}^*$ is such that $f = g$ a.e., then $g$ is integrable and $\int f \, d\mu = \int g \, d\mu$. There is also a converse to this remark: if $f$ and $g$ are integrable functions such that
$$\int_E f \, d\mu = \int_E g \, d\mu \quad \text{for all } E \in \mathcal{F},$$
then $f = g$ a.e. For, suppose not, so that $\mu\{x: f(x) \ne g(x)\} > 0$. Then at least one of $\{x: f(x) > g(x)\}$, $\{x: f(x) < g(x)\}$ has positive measure, say the first. By theorem 3.2 there must be an integer $n$ such that, for
$$E_n = \{x: f(x) > g(x) + 1/n\}, \quad \mu(E_n) > 0.$$
But then
$$\int_{E_n} f \, d\mu - \int_{E_n} g \, d\mu \ge \frac{1}{n} \mu(E_n) > 0,$$
which is a contradiction establishing the required result.
We can now consider theorems about the continuity of the integra-
tion operator.
Theorem 5.6. Suppose $\{f_n\}$ is a monotone increasing sequence of non-negative measurable functions $\Omega \to \mathbf{R}^+$ and $f_n(x) \to f(x)$ for all $x \in \Omega$. Then
$$\lim_{n\to\infty} \int f_n \, d\mu = \int f \, d\mu,$$
in the sense that, if $f$ is integrable, the integrals $\int f_n \, d\mu$ converge to $\int f \, d\mu$; while if $f$ is not integrable either $f_n$ is integrable for all $n$ and $\int f_n \, d\mu \to +\infty$ as $n \to \infty$, or there is an integer $N$ such that $f_N$ is not integrable so that $\int f_n \, d\mu = +\infty$ for $n \ge N$.
Proof. For each $n = 1, 2, \dots$ choose an increasing sequence
$$\{f_{n,k}\} \quad (k = 1, 2, \dots)$$
of non-negative simple functions converging to $f_n$, and put
$$g_k = \max_{n \le k} f_{n,k}.$$
Then $\{g_k\}$ is a non-decreasing sequence of non-negative simple functions and
$$g = \lim_{k\to\infty} g_k$$
is non-negative measurable. But
$$f_{n,k} \le g_k \le f_k \le f \quad \text{for } n \le k, \qquad (5.4.2)$$
so that
$$f_n \le g \le f;$$
and, if we now let $n \to \infty$, we see that $f = g$. Using the order property (vi) of theorem 5.5 and (5.4.2) gives
$$\int f_{n,k} \, d\mu \le \int g_k \, d\mu \le \int f_k \, d\mu \quad \text{for } n \le k.$$
For fixed $n$, let $k \to \infty$; from the definition of the integral,
$$\int f_n \, d\mu \le \int g \, d\mu \le \lim_{k\to\infty} \int f_k \, d\mu.$$
If we now let $n \to \infty$, we obtain
$$\lim_{n\to\infty} \int f_n \, d\mu \le \int g \, d\mu \le \lim_{n\to\infty} \int f_n \, d\mu.$$
Since the two extremes of the inequality are the same, we must have
$$\lim_{n\to\infty} \int f_n \, d\mu = \int g \, d\mu = \int f \, d\mu. \ \blacksquare$$
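Both cases of theorem 5.6 can be seen in a simple illustration, which is our own and not taken from the text. We assume $\Omega$ is the set of positive integers with $\mu$ the counting measure on all subsets (the space of exercise 5.4 (6)), and we truncate a fixed function to $\{1, \dots, n\}$, so that $f_n = f \chi_{\{1, \dots, n\}}$ increases to $f$.

```latex
% Integrable limit: f(k) = 2^{-k}.
\[
\int f_n \, d\mu = \sum_{k=1}^{n} 2^{-k} = 1 - 2^{-n}
\;\longrightarrow\; 1 = \int f \, d\mu \qquad (n \to \infty).
\]
% Non-integrable limit: f(k) = 1/k. Each f_n is integrable, but
\[
\int f_n \, d\mu = \sum_{k=1}^{n} \frac{1}{k} \;\longrightarrow\; +\infty
= \int f \, d\mu \qquad (n \to \infty).
\]
```

In the second case every $f_n$ is integrable while the limit $f$ is not, which is the first of the two alternatives described in the theorem.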

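Corollary (absolute continuity). If $f$ is integrable (over $\Omega$) then, for $A \in \mathcal{F}$,
$$\int_A f \, d\mu \to 0 \quad \text{as } \mu(A) \to 0.$$
Proof. Put
$$f_n = f \ \text{ if } |f| \le n; \qquad f_n = n \ \text{ if } |f| > n.$$
Then $|f_n|$ is monotonic increasing to $|f|$ as $n \to \infty$. Since, by theorem 5.5, $|f|$ is integrable, we have
$$\int |f_n| \, d\mu \to \int |f| \, d\mu \quad \text{as } n \to \infty.$$
Given $\epsilon > 0$, choose $N$ such that
$$\int |f| \, d\mu < \int |f_n| \, d\mu + \tfrac{1}{2}\epsilon \quad \text{for } n \ge N.$$
Then if $A \in \mathcal{F}$ is such that $\mu(A) < \epsilon/2N$, we have, by theorem 5.5,
$$\left| \int_A f \, d\mu \right| \le \int_A |f| \, d\mu = \int_A |f_N| \, d\mu + \int_A (|f| - |f_N|) \, d\mu \le N \mu(A) + \tfrac{1}{2}\epsilon < \epsilon. \ \blacksquare$$
Remark. The notion of absolute continuity for a set function $\nu: \mathcal{F} \to \mathbf{R}$ will be considered more fully in §6.4.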
Theorem 5.7 (Fatou). If $\{f_n\}$ is a sequence of measurable functions which is bounded below by an integrable function, then
$$\int \liminf_{n\to\infty} f_n \, d\mu \le \liminf_{n\to\infty} \int f_n \, d\mu.$$
Remark. The operation lim inf picks out the small values of a sequence. This theorem says that if this is done point-wise and the result is then integrated the answer will be not greater than the result of first integrating and then applying the operation.
Proof. Since $\{f_n\}$ is bounded below by an integrable function $g$ we may assume without loss of generality that $f_n \ge 0$ for all $n$. For $h_n = f_n - g \ge 0$ a.e.† and
$$\int h_n \, d\mu = \int f_n \, d\mu - \int g \, d\mu, \qquad \liminf h_n = \liminf f_n - g \ \text{ a.e.}$$
Put $g_n = \inf_{k \ge n} f_k$; then $g_n$ is an increasing sequence of measurable functions and
$$\lim_{n\to\infty} g_n = \liminf_{n\to\infty} f_n.$$
Since $f_n \ge g_n$ for all $n$,
$$\liminf_{n\to\infty} \int f_n \, d\mu \ge \lim_{n\to\infty} \int g_n \, d\mu = \int \lim_{n\to\infty} g_n \, d\mu = \int \liminf_{n\to\infty} f_n \, d\mu,$$
by theorem 5.6. $\blacksquare$
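The inequality in theorem 5.7 can be strict. The following standard illustration is added here and is not taken from the text: we assume $\mu$ is Lebesgue measure in $\mathbf{R}$ and take $f_n = n \chi_{(0, 1/n)}$, which is non-negative and so bounded below by the integrable function $0$.

```latex
% For every x, f_n(x) = 0 as soon as 1/n <= x (and always when x <= 0), so
\[
\liminf_{n\to\infty} f_n(x) = \lim_{n\to\infty} f_n(x) = 0 \quad \text{for all } x,
\qquad
\int f_n \, d\mu = n \cdot \mu\bigl((0, 1/n)\bigr) = 1 \quad (n = 1, 2, \dots).
\]
% Hence the two sides of Fatou's inequality differ:
\[
\int \liminf_{n\to\infty} f_n \, d\mu = 0 \;<\; 1 = \liminf_{n\to\infty} \int f_n \, d\mu .
\]
```

The same sequence shows that pointwise convergence alone does not allow the limit to be taken under the integral sign.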
Corollary. If $\{f_n\}$ is a sequence of measurable functions which is bounded above by an integrable function, then
$$\int \limsup_{n\to\infty} f_n \, d\mu \ge \limsup_{n\to\infty} \int f_n \, d\mu.$$
Proof. This can be proved directly by a method similar to that of theorem 5.7, or it can be deduced from that theorem by putting $g_n = -f_n$ $(n = 1, 2, \dots)$. $\blacksquare$
Theorem 5.8 (Lebesgue). (i) If $g: \Omega \to \mathbf{R}^+$ is integrable, $\{f_n\}$ is a sequence of measurable functions $\Omega \to \mathbf{R}^*$ such that $|f_n| \le g$ $(n = 1, 2, \dots)$ and $f_n \to f$ as $n \to \infty$, then $f$ is integrable and
$$\int f_n \, d\mu \to \int f \, d\mu \quad \text{as } n \to \infty.$$
(ii) Suppose $g: \Omega \to \mathbf{R}^+$ is integrable, $-\infty \le a < b \le +\infty$, and for each $t \in (a, b)$, $f_t$ is a measurable function $\Omega \to \mathbf{R}^*$. Then if $|f_t| \le g$ for all $t \in (a, b)$ and $f_t \to f$ as $t \to a+$ or $t \to b-$, then $f$ is integrable and
$$\int f_t \, d\mu \to \int f \, d\mu.$$
† (Footnote to the proof of theorem 5.7.) Since $g$ is integrable the set $\{x: |g(x)| = +\infty\}$ has zero measure, so that the operation $f_n(x) - g(x)$ can be carried out at least outside the set $\{x: |g(x)| = +\infty\}$. We put in a.e. to cover the possible exceptional set of zero measure where $(f_n - g)$ is not defined. By theorem 5.5 (viii) such exceptional sets do not affect the value of the integrals, and we could arbitrarily define $f_n - g$ to be zero at the points where $f_n = g = \pm\infty$.
Proof. (i) We first prove the special case of the theorem where $f_n \ge 0$ and $f_n \to 0$ as $n \to \infty$. In this case we can apply theorem 5.7 and corollary to give
$$\limsup_{n\to\infty} \int f_n \, d\mu \le \int \limsup_{n\to\infty} f_n \, d\mu = \int 0 \, d\mu = 0 = \int \liminf_{n\to\infty} f_n \, d\mu \le \liminf_{n\to\infty} \int f_n \, d\mu.$$
Hence all the inequalities must be equalities, $\lim \int f_n \, d\mu$ exists, and has the value zero.
In the general case, put $h_n = |f_n - f|$; then $0 \le h_n \le 2g$, $2g$ is integrable and $h_n$ is measurable with $h_n \to 0$ as $n \to \infty$. But then
$$\left| \int f_n \, d\mu - \int f \, d\mu \right| \le \int |f_n - f| \, d\mu \to 0 \quad \text{as } n \to \infty,$$
and $f$ is integrable by theorem 5.5 (ix).
(ii) Suppose, for example, that $f_t \to f$ as $t \to a+$; then we can apply the sequence form of the theorem to $f_n = f_{t_n}$, where $\{t_n\}$ is any sequence in $(a, b)$ converging to $a$. Since $f = \lim f_n$ we must have
$$\int f_n \, d\mu \to \int f \, d\mu.$$
But the right-hand side is now independent of the particular sequence $\{t_n\}$ chosen so that $\int f_t \, d\mu$ must approach the limit $\int f \, d\mu$ as $t \to a$ through values in $(a, b)$. $\blacksquare$
Exercises 5.4
1. Suppose f : S2 -+ R is measurable, A EF,,u(A) < oo and
f (x) =0 for x E S2 -A,
m< f (x) < M for x e A, where m, M E R.
Show that f is integrable and
mp(A) < ffd# < Mp(A).
2. Prove that, if f and g are integrable functions,
min [ffd/2. fdia] > f min (f, g) dµ.
If the two sides of this inequality are equal, what deduction can be made
about the relation between f and g?
3. Prove that, for any e > 0, if f is integrable over E there is a subset
Ei c E such that uc(E0) < oo, and

fB fdµ- fa.fdul < e.

4. Show that f : S2 -> R* is integrable if and only if for any e < 0, there
exist integrable functions g and h with g 3 f 3 h and f (g - h) d1i < e.
5. If E _ U Er is a countable union of disjoint sets off, and f is inte-
grable over E, then
f fda =E00 f
E r=1 E,

and the series converges absolutely.

6. Suppose Z is the set of positive integers, Jz'.is the class of all subsets
of Z and lu(E) denotes the number of points in E. Show that any f : Z -> R*
is g -measurable and that f is integrable if and only if E f (n) converges
absolutely. Deduce that the sum of an absolutely convergent series is
unaffected by any rearrangement of the terms.
7. Suppose {fn} is a sequence of integrable functions and

00 fflfn!d4u< o.
Show that the series E fn(x) converges absolutely a.e. to an integrable
function f and that
ffd1i = E ffndlu
8. Suppose $\{E_n\}$ is a sequence of sets in $\mathscr{F}$, $m$ is a fixed positive integer,
and $G$ is the set of points which are in $E_n$ for at least $m$ integers $n$. Then
$G$ is measurable and
$$m\,\mu(G) \le \sum_{n=1}^{\infty} \mu(E_n).$$


9. Show that a measurable function $f$ is integrable over a measurable
set $E$ if and only if
$$\sum_{n=1}^{\infty} \mu[E \cap \{x: |f(x)| \ge n\}] < \infty.$$
10. Suppose $f$ is measurable, $g$ is integrable and $\alpha, \beta \in \mathbf{R}$ with $\alpha \le f(x) \le \beta$
a.e. Then there is a real $\gamma$ such that $\alpha \le \gamma \le \beta$ and
$$\int f\,|g|\,d\mu = \gamma \int |g|\,d\mu.$$
Show by an example that we cannot replace $|g|$ by $g$ in this equation.

11. Suppose $\mu$ is Lebesgue measure in $\mathbf{R}$ and put
$$f_n(x) = \begin{cases} -n^2 & \text{for } x \in (0, 1/n), \\ 0 & \text{otherwise}. \end{cases}$$
Then $\liminf f_n = \lim f_n = 0$ for all $x$, but
$$\int f_n\,d\mu = -n.$$
This shows that theorem 5.7 is not valid without the restriction that $\{f_n\}$
be bounded below by an integrable $g$.
12. State and prove a version of Fatou's lemma (theorem 5.7) for a
family $f_t$, $t \in (a, b)$, of non-negative measurable functions.
13. Is it true that, for measurable $f, g: \Omega \to \mathbf{R}^*$,
$f^2$ and $g^2$ integrable $\Rightarrow$ $fg$ integrable?
Show that, if
$$\left[\int fg\,d\mu\right]^2 = \int f^2\,d\mu \int g^2\,d\mu,$$
then $f$ and $g$ are essentially proportional: that is, there is a real $\alpha$ such that
$f = \alpha g$ a.e., or $g = 0$ a.e.

5.5 Lebesgue integral; Lebesgue-Stieltjes integral

We have defined the operation of integration on an abstract measure
space $(\Omega, \mathscr{F}, \mu)$. Historically this method of integration was first
defined on $(\mathbf{R}, \mathscr{L}, \mu)$ where $\mu$ denotes Lebesgue measure on the $\sigma$-field
$\mathscr{L}$ of Lebesgue measurable sets. We have made the definition in the
general case since no more work is involved, but we must now specialise
it to obtain the Lebesgue integral.
If $E$ is a Lebesgue measurable set in $\mathbf{R}$, $\mu$ denotes Lebesgue measure
in $\mathbf{R}$, and $f$ is $\mathscr{L}$-measurable, then it is usual to use the notation
$$\int_E f(x)\,dx \quad \text{for} \quad \int_E f\,d\mu.$$
In particular, if $E$ is an interval with end-points $a$, $b$ we use the notation
$$\int_a^b f(x)\,dx \quad \text{for} \quad \int_E f\,d\mu,$$
where E = [a, b] or (a, b) or [a, b) or (a, b]. Note that, since the
Lebesgue measure of a single point is zero, it makes no difference
whether the interval is open or closed.
In the above notation $a$ may be $-\infty$ and $b$ may be $+\infty$, so that
$$\int_{-\infty}^{\infty} f(x)\,dx \quad \text{means} \quad \int_{\mathbf{R}} f\,d\mu.$$
It is worth remarking that the integral over an infinite interval is
defined directly (an infinite interval is a measurable set) and not as
the limit of integrals over finite intervals.
In $\mathbf{R}^k$ similar notations are used: $\int_E \cdots \int f(x)\,dx$ means a `multiple
integral' of $f$ over the set $E \in \mathscr{L}^k$ in Euclidean $k$-space with respect to
Lebesgue measure.
If instead of using Lebesgue measure we use a Lebesgue-Stieltjes
measure (defined in § 4.5) given by a point function $F$ in $\mathbf{R}^k$, this is
equivalent to working in the measure space $(\mathbf{R}^k, \mathscr{L}^k_F, \mu_F)$. We use
the notation
$$\int_E f(x)\,dF(x) \quad \text{for} \quad \int_E f\,d\mu_F.$$
In this case we do not, in general, obtain the same result when we
integrate over $E_1 = [a, b]$ and $E_2 = (a, b)$, so we will not use the notation
$$\int_a^b f(x)\,dF(x)$$
unless we know that $F$ is continuous for all $x$. (This condition is
sufficient to imply that the $\mu_F$ measure of single point sets is zero, so
that the integrals over $E_1$ and $E_2$ are the same.)
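
The dependence on the end-points can be seen in a small numerical sketch. It
assumes a purely atomic Lebesgue-Stieltjes measure, that is, a step function $F$
whose only increase is a jump $p$ at each of finitely many points. The dictionary
`atoms`, the helper `ls_integral` and the integrand are illustrative choices, not
taken from the text.

```python
# Hypothetical purely atomic measure mu_F: F jumps by p at each atom x.
atoms = {0.0: 0.5, 0.5: 0.25, 1.0: 0.25}   # atom position -> size of the jump of F

def ls_integral(f, member):
    """Integral of f over {x : member(x)} with respect to the atomic mu_F."""
    return sum(p * f(x) for x, p in atoms.items() if member(x))

def f(x):
    return 1.0 + x

closed = ls_integral(f, lambda x: 0.0 <= x <= 1.0)   # E1 = [0, 1]
opened = ls_integral(f, lambda x: 0.0 < x < 1.0)     # E2 = (0, 1)
print(closed, opened)
```

The two values differ by the contribution of the atoms at the end-points 0 and 1,
which is exactly what continuity of $F$ rules out.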
Because $\mathscr{L}^k$ is complete with respect to Lebesgue measure ($\mathscr{L}^k_F$
is complete with respect to $\mu_F$) we see that if $f: \mathbf{R}^k \to \mathbf{R}^*$ is integrable
and $f = g$ a.e., then $g: \mathbf{R}^k \to \mathbf{R}^*$ is also integrable.
The theorems of § 5.4 were proved for any measure space $(\Omega, \mathscr{F}, \mu)$
so they are true in particular for Lebesgue measure in $\mathbf{R}^k$. Thus the
Lebesgue integral is an order preserving linear operation on the class
of Lebesgue integrable functions. It is also a continuous operator
in the following senses.
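Theorem 5.6 A. If $\{f_n\}$ $(n = 1, 2, \ldots)$ is a monotone increasing sequence
of non-negative Lebesgue measurable functions $\mathbf{R}^k \to \mathbf{R}^+$ and
$f = \lim f_n$, then
$$\int_{-\infty}^{\infty} f(x)\,dx = \lim_{n \to \infty} \int_{-\infty}^{\infty} f_n(x)\,dx.$$

Corollary. If $\{f_n\}$ is any sequence of non-negative Lebesgue measurable
functions $\mathbf{R}^k \to \mathbf{R}^+$ and $f = \sum_{n=1}^{\infty} f_n$, then
$$\int_{-\infty}^{\infty} f(x)\,dx = \sum_{n=1}^{\infty} \int_{-\infty}^{\infty} f_n(x)\,dx.$$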

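Theorem 5.8 A. If $g$ is Lebesgue integrable and $\{f_n\}$ is a sequence of
Lebesgue measurable functions $\mathbf{R}^k \to \mathbf{R}$ such that $f_n \to f$ a.e. as $n \to \infty$
and $|f_n| \le g$ a.e. for each $n$; then the functions $f_n$, $f$ are Lebesgue integrable and
$$\lim_{n \to \infty} \int_{-\infty}^{\infty} f_n(x)\,dx = \int_{-\infty}^{\infty} f(x)\,dx.$$

Corollary. If $E \in \mathscr{L}^k$ and $|E|$ is finite, then for any sequence $\{f_n\}$ of
$\mathscr{L}^k$-measurable functions $\mathbf{R}^k \to \mathbf{R}$ such that $|f_n(x)| \le a < \infty$ for all $n$,
all $x \in E$, and $f_n \to f$ a.e. in $E$, we have
$$\int_E f(x)\,dx = \lim_{n \to \infty} \int_E f_n(x)\,dx.$$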


It is clear that theorem 5.8 A can also be translated to give a corresponding
result for series. It is also worth remarking that the theorems
corresponding to theorems 5.6 A, 5.8 A for the Riemann integral can
only be proved by using some additional assumption that ensures that
$f$ is integrable: for example, it is sufficient to assume that $f_n \to f$

Exercises 5.5
1. From first principles calculate the Lebesgue integrals
(i) $\int_0^1 x^p\,dx$ $(p > -1)$; (ii) $\int_1^{\infty} x^q\,dx$ $(q < -1)$;
(iii) $\int_S f\,d\mu$, where $\mu$ is Lebesgue measure in $\mathbf{R}^2$, $f(x, y) = xy$ and $S$ is the
unit square $0 \le x \le 1$, $0 \le y \le 1$.
2. Suppose $f: \mathbf{R} \to \mathbf{R}^*$ is Lebesgue integrable and
$$F(x) = \int_{-\infty}^{x} f(t)\,dt.$$
Show that $F$ is a uniformly continuous function.

3. Show that if $\{f_n\}$ is a sequence of integrable functions $E \to \mathbf{R}^*$ such
that
$$\sum_{n=1}^{\infty} \int_E |f_n(x)|\,dx < \infty,$$
then $f_n(x) \to 0$ for almost all $x \in E$.

4. Show that if $|f_n(x)| \le 1/n^2$ for all integers $n$, $x \in E$, and each $f_n$ is
measurable and $g$ is integrable over $E$, then
$$\int_E \sum_{n=1}^{\infty} f_n(x)\,g(x)\,dx = \sum_{n=1}^{\infty} \int_E f_n(x)\,g(x)\,dx.$$
5. Caratheodory defines the Lebesgue integral of a non-negative measurable
function in $\mathbf{R}$ as the Lebesgue measure of the ordinate set in $\mathbf{R}^2$
$$\{(x, y): 0 \le y \le f(x)\}.$$
Show that this definition is equivalent to the one we have given.

6. Suppose $\{x_i\}$ is a sequence of points in $\mathbf{R}$ and $p_i > 0$, $\sum p_i < \infty$. $F(x)$
is defined by
$$F(x) = \sum_{x_i \le x} p_i$$
and $\mu_F$ denotes the Lebesgue-Stieltjes measure with respect to $F$. Show
that all functions $f: \mathbf{R} \to \mathbf{R}^*$ are measurable, and that $f$ is integrable if and
only if $\sum p_i f(x_i)$ converges absolutely.
7. Show that the function $f(x) = 1/x^2$ is integrable with respect to
Lebesgue measure over $[1, \infty)$, but not with respect to the Lebesgue-Stieltjes
measure generated by $F(x) = x^3$.
8. Show that the function $f(x) = x^2$ is integrable with respect to the
Lebesgue-Stieltjes measure generated by
$$F(x) = \begin{cases} 0, & x < 0, \\ 1 - \dfrac{1}{(x + 1)^4}, & x \ge 0. \end{cases}$$
9. Show that, for non-negative measurable functions $f: \mathbf{R} \to \mathbf{R}^+$, the
Cauchy definition of the integral over an infinite interval
$$\int_0^{\infty} f(x)\,dx = \lim_{t \to \infty} \int_0^t f(x)\,dx$$
is equivalent to the Lebesgue definition.
By considering the function $f(x) = \sin x / x$, show that this equivalence
does not extend to all measurable $f$.
10. Show that if $f: [a, b] \to \mathbf{R}$ is continuous and $t \in (a, b)$,
$$\lim_{y \to t} \frac{1}{y - t} \left[\int_a^y f(x)\,dx - \int_a^t f(x)\,dx\right] = f(t);$$
thus the Lebesgue indefinite integral can be differentiated at points where
the integrand is continuous.

5.6* Conditions for integrability

The strength of the integration operator we have defined is that it
works on a very wide class of functions. Provided the $\sigma$-field $\mathscr{F}$ is
large, the restriction that $f$ has to be $\mathscr{F}$-measurable is not a serious one,
for we have seen that in a topological space $\Omega$, if $\mathscr{F}$ contains the open
sets, then any function which can be obtained from continuous
functions or simple functions by countable operations will be
$\mathscr{F}$-measurable. The only additional restriction for integrability of $f$
is on the size (that is, the measure) of the sets where $|f|$ is large. It
should be emphasised that our operation could be called `absolute
integration', for $f$ is integrable if and only if $|f|$ is, and we do not allow
the large negative values of $f$ to `cancel out' the large positive values
to give a finite integral unless each of $f_+$ and $f_-$ is separately integrable
(see exercise 5.5 (9)).
If we restrict our consideration now to the Lebesgue integral on
$\mathbf{R}$, these general comments still apply, but here it is worth comparing
the Lebesgue integral with the Riemann integral over finite intervals.
Since we want to compare integration operators, for the present
section (only) we will use
$\mathscr{L}\!\int_a^b f(x)\,dx$ to denote the Lebesgue integral,
$\mathscr{R}\!\int_a^b f(x)\,dx$ to denote the Riemann integral.

It is easy to give examples of functions which are $\mathscr{L}$-integrable
but not $\mathscr{R}$-integrable. There are two kinds of bad behaviour which
can prevent a function from being $\mathscr{R}$-integrable. These are illustrated
by the following.
(1) Bounded functions which are badly discontinuous but still
$\mathscr{L}$-measurable. For example
$$f(x) = \begin{cases} 1 & \text{when } x \text{ is rational}, \\ 0 & \text{when } x \text{ is irrational}, \end{cases}$$
is discontinuous everywhere. For any $a < b$, it is clear that
$\mathscr{R}\!\int_a^b f(x)\,dx$
cannot exist. However, the set of rational points is countable, and
therefore $\mathscr{L}$-measurable with zero measure, so that $f(x)$ is an
$\mathscr{L}$-simple function and
$$\mathscr{L}\!\int_a^b f(x)\,dx = 0.$$
(2) Functions which are unbounded in $(a, b)$ cannot be $\mathscr{R}$-integrable
even if they are continuous everywhere. For example, $f(x) = x^{-\alpha}$
$(0 < \alpha < 1)$ is not $\mathscr{R}$-integrable over $(0, 1)$, although an elementary
calculation shows that it is $\mathscr{L}$-integrable. If the points of unboundedness
of $f$ (as in the above case) are finite in number, it is sometimes
possible to use the `Cauchy-Riemann' process to define the integral.
Thus
$$\lim_{\varepsilon \to 0+} \mathscr{R}\!\int_{\varepsilon}^{1} f(x)\,dx$$
is defined in the above case and could be used as a definition of
$\int_0^1 f(x)\,dx$. Provided the Cauchy-Riemann integral of $|f(x)|$ exists, it
is not difficult to show that, if the Cauchy-Riemann process for
$f(x)$ works, then $f$ is $\mathscr{L}$-integrable to the same value. This is not true
without the condition that the process works for $|f(x)|$, since the
$\mathscr{L}$-integral is an absolute integral.
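
For the example $f(x) = x^{-\alpha}$ the Cauchy-Riemann limit can be followed
numerically. The sketch below uses the elementary antiderivative, so
`truncated_integral` is a closed formula rather than a quadrature; the function
name and the choice $\alpha = 1/2$ are assumptions made for illustration.

```python
def truncated_integral(alpha, eps):
    """Riemann integral of x**(-alpha) over [eps, 1], for 0 < alpha < 1."""
    return (1.0 - eps ** (1.0 - alpha)) / (1.0 - alpha)

alpha = 0.5
# Integrals over [eps, 1] for eps = 10^-1, ..., 10^-8.
values = [truncated_integral(alpha, 10.0 ** (-k)) for k in range(1, 9)]
limit = 1.0 / (1.0 - alpha)      # value of the limit as eps -> 0+
print(values[-1], limit)
```

The truncated integrals increase towards $1/(1 - \alpha)$ as $\varepsilon \to 0+$; since the
integrand is non-negative here, this limit is also the Lebesgue integral.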
We know (corollary to theorem 5.5) that any function $f: [a, b] \to \mathbf{R}$
which is $\mathscr{L}$-measurable and bounded is $\mathscr{L}$-integrable. For the existence
of the $\mathscr{R}$-integral it is necessary for $f$ to be bounded, but the
condition of measurability does not give sufficient smoothness. In
fact the natural way of characterising functions which are $\mathscr{R}$-integrable
over a finite interval is in terms of the measure of the set of points
where the function is discontinuous.
Theorem 5.9. A bounded function $f: [a, b] \to \mathbf{R}$ is Riemann integrable
if and only if the set $E$ of points in $[a, b]$ at which $f$ is discontinuous
satisfies $|E| = 0$. Any $f: [a, b] \to \mathbf{R}$ which is Riemann integrable is
Lebesgue integrable to the same value.
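Proof. We use the following definition for the Riemann integral
of $f$ over $[a, b]$ (this is not the usual one but can easily be seen to be
equivalent by using the basic theory of the $\mathscr{R}$-integral). For any positive
integer $n$, divide $I_0 = (a, b]$ into $2^n$ equal half-open intervals
$$I_{n,i} = (a_{n,i-1}, a_{n,i}] \quad (i = 1, 2, \ldots, 2^n);$$
put
$$m_{n,i} = \inf\{f(x): a_{n,i-1} < x \le a_{n,i}\}, \qquad M_{n,i} = \sup\{f(x): a_{n,i-1} < x \le a_{n,i}\},$$
$$g_n(x) = \begin{cases} m_{n,i} & \text{for } x \in I_{n,i}, \\ 0 & \text{for } x \notin I_0; \end{cases} \qquad
h_n(x) = \begin{cases} M_{n,i} & \text{for } x \in I_{n,i}, \\ 0 & \text{for } x \notin I_0. \end{cases}$$
Then for each integer $n$, $x \in I_0$,
$$g_n(x) \le f(x) \le h_n(x);$$
$\{g_n\}$ is a monotone increasing sequence of simple functions, and $\{h_n\}$
is a monotone decreasing sequence of simple functions. If we put
$$g = \lim_{n \to \infty} g_n, \qquad h = \lim_{n \to \infty} h_n,$$
then $g \le f \le h$. Further, by definition,
$$\mathscr{L}\!\int_a^b g(x)\,dx = \lim_{n \to \infty} \mathscr{L}\!\int_a^b g_n(x)\,dx
= \lim_{n \to \infty} \frac{b - a}{2^n} \sum_{i=1}^{2^n} m_{n,i} = \lim_{n \to \infty} s_n, \text{ say;}$$
$$\mathscr{L}\!\int_a^b h(x)\,dx = \lim_{n \to \infty} \mathscr{L}\!\int_a^b h_n(x)\,dx
= \lim_{n \to \infty} \frac{b - a}{2^n} \sum_{i=1}^{2^n} M_{n,i} = \lim_{n \to \infty} S_n, \text{ say.}$$
We say that $f$ is $\mathscr{R}$-integrable over $[a, b]$ if, and only if,
$\lim s_n = \lim S_n$, and $\mathscr{R}\!\int_a^b f(x)\,dx$
is then the common value of the limit.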

Now notice that if $f$ is continuous at $x \in (a, b)$ then $g(x) = h(x)$.
Conversely if $g(x) = h(x)$ and $x$ is not a dyadic point (that is, $x \notin D$,
where $D$ is the countable set of end-points of intervals $I_{n,i}$), then
$f$ is continuous at $x$.
If $\mathscr{R}\!\int_a^b f(x)\,dx$ exists, since $g \le f \le h$,
$$\mathscr{L}\!\int_a^b g(x)\,dx = \mathscr{R}\!\int_a^b f(x)\,dx = \mathscr{L}\!\int_a^b h(x)\,dx$$
so that, by theorem 5.5 (vii), $g = h$ a.e. Since the set $E$ of points where
$f$ is discontinuous is contained in $D \cup \{x: g(x) \ne h(x)\}$ it follows that
$|E| = 0$. Further, since Lebesgue measure is complete, $f$ is $\mathscr{L}$-measurable
and, by theorem 5.5 (viii),
$$\mathscr{L}\!\int_a^b f(x)\,dx = \mathscr{L}\!\int_a^b g(x)\,dx = \mathscr{R}\!\int_a^b f(x)\,dx.$$
Conversely if the set $E$ satisfies $|E| = 0$, this implies $g(x) = h(x)$
a.e., which gives, by theorem 5.5 (viii),
$$\mathscr{L}\!\int_a^b g(x)\,dx = \mathscr{L}\!\int_a^b h(x)\,dx$$
so that $f$ is $\mathscr{R}$-integrable. ∎
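
The sums $s_n$ and $S_n$ of the proof are easy to tabulate in a special case. The
following sketch assumes that $f$ is increasing and continuous on $[a, b]$, so that
the infimum and supremum over each cell $(a_{n,i-1}, a_{n,i}]$ are the values at its
end-points; `dyadic_sums` is a hypothetical helper, and $f(x) = x^2$ on $[0, 1]$ is an
illustrative choice.

```python
def dyadic_sums(f, a, b, n):
    """Return (s_n, S_n) for an increasing continuous f on [a, b].

    Assumption: f is increasing and continuous, so on each dyadic cell
    (a_{i-1}, a_i] the infimum is f(a_{i-1}) and the supremum is f(a_i).
    """
    cells = 2 ** n
    width = (b - a) / cells
    points = [a + i * width for i in range(cells + 1)]
    lower = width * sum(f(x) for x in points[:-1])   # s_n
    upper = width * sum(f(x) for x in points[1:])    # S_n
    return lower, upper

def square(x):
    return x * x

# (s_n, S_n) for n = 1, ..., 10
table = [dyadic_sums(square, 0.0, 1.0, n) for n in range(1, 11)]
for n, (s, S) in enumerate(table, start=1):
    print(n, s, S)
```

In this case $S_n - s_n = 2^{-n}$, so the two monotone sequences squeeze the common
value $1/3$ of the Riemann and Lebesgue integrals.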
Theorem 5.9 shows that $\mathscr{R}$-integrable functions have to be continuous
at most points. We have many examples of $\mathscr{L}$-integrable
functions which are continuous nowhere. However, there is a sense in
which even $\mathscr{L}$-integrable functions have to be approximable by continuous
functions, in fact by functions which are arbitrarily smooth,
that is, functions that can be differentiated arbitrarily often.
Theorem 5.10. Given any $\mathscr{L}$-integrable function $f: \mathbf{R} \to \mathbf{R}^*$ and any
$\varepsilon > 0$ there is a finite interval $(a, b)$, and a bounded function $g: \mathbf{R} \to \mathbf{R}$
such that $g(x)$ vanishes outside $(a, b)$, is infinitely differentiable for all real
$x$, and
$$\mathscr{L}\!\int_{-\infty}^{\infty} |f(x) - g(x)|\,dx < \varepsilon.$$
Proof. We carry out the approximation in 4 stages.
(i) First, find a finite interval $[a, b]$ and a bounded measurable
function $f_1$ which vanishes outside $[a, b]$ and is such that
$$\int |f(x) - f_1(x)|\,dx < \tfrac{1}{4}\varepsilon.$$
This can be done by considering the sequence of functions
$$g_n(x) = \begin{cases} f(x) & \text{if } x \in [-n, n] \text{ and } |f(x)| \le n, \\
n & \text{if } x \in [-n, n] \text{ and } f(x) > n, \\
-n & \text{if } x \in [-n, n] \text{ and } f(x) < -n, \\
0 & \text{if } x \notin [-n, n]. \end{cases}$$
Then $g_n(x) \to f(x)$ for all $x$ and $|g_n| \le |f|$. By theorem 5.8 it follows
that
$$\int |f(x) - g_n(x)|\,dx \to 0 \quad \text{as } n \to \infty$$
so that we can fix a sufficiently large $N$ and put $f_1(x) = g_N(x)$.
(ii) The next step is to approximate $f_1$ by an $\mathscr{L}$-simple function $f_2$
which vanishes outside $[a, b]$ and satisfies
$$\int |f_1(x) - f_2(x)|\,dx < \tfrac{1}{4}\varepsilon.$$
This is clearly possible since we defined the integral as a limit of the
integrals of simple functions.
(iii) Now a simple function is a finite sum of multiples of indicator
functions. If each indicator function can be approximated by the
indicator function of a finite number of disjoint intervals, then it will
follow that $f_2$ can be approximated by $f_3$, a step function of the form
$$f_3(x) = \sum_{i=1}^{n} c_i\,\chi_{J_i}(x),$$
where each $J_i$ is a finite interval and
$$\int |f_2(x) - f_3(x)|\,dx < \tfrac{1}{4}\varepsilon.$$
To see that this is possible start with a bounded $\mathscr{L}$-measurable set $E$
and $\eta > 0$. Find an open set $G \supset E$ such that $|G - E| < \tfrac{1}{2}\eta$ and from
the countable union of disjoint open intervals making up $G$ pick a
finite number to form $G_0$ such that $|G - G_0| < \tfrac{1}{2}\eta$. It will then follow
that $|E \,\triangle\, G_0| < \eta$ so that
$$\int |\chi_E(x) - \chi_{G_0}(x)|\,dx < \eta.$$

(iv) In order to obtain the required infinitely differentiable function
$g$ for which
$$\int |f_3(x) - g(x)|\,dx < \tfrac{1}{4}\varepsilon$$
it is now sufficient to find a function for one of the components
$\chi_{J_i}(x)$ of $f_3$.
Suppose $J = (a, b)$ and $0 < 2\eta < b - a$. Put
$$\phi_{a,\eta}(x) = \begin{cases} \exp\{[(x - a)^2 - \eta^2]^{-1}\} & \text{for } |x - a| < \eta, \\
0 & \text{for } |x - a| \ge \eta. \end{cases}$$
If $c^{-1} = \int_{-\infty}^{\infty} \phi_{a,\eta}(x)\,dx$, let
$$h(x) = c \int_{-\infty}^{x} \{\phi_{a,\eta}(t) - \phi_{b,\eta}(t)\}\,dt.$$
It is easy to check that $h$ is infinitely differentiable and
$$\int |\chi_J(x) - h(x)|\,dx < 4\eta,$$
since $0 \le h(x) \le 1$ for all $x$ and $\{x: \chi_J(x) \ne h(x)\}$ is contained in the two
intervals $(a - \eta, a + \eta)$ and $(b - \eta, b + \eta)$. ∎
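
The construction in stage (iv) can be checked numerically. The sketch below is
only an illustration under stated assumptions: the integrals are replaced by a
crude midpoint rule, and the helper names (`phi`, `midpoint`, `clamp`,
`indicator_J`) and the values $a = 0$, $b = 1$, $\eta = 0.1$ are hypothetical choices, not
part of the text.

```python
import math

def phi(x, center, eta):
    """exp{[(x - center)^2 - eta^2]^(-1)} for |x - center| < eta, and 0 otherwise."""
    d = (x - center) ** 2 - eta ** 2
    return math.exp(1.0 / d) if d < 0.0 else 0.0

def midpoint(func, lo, hi, steps=400):
    """Composite midpoint rule: a crude numerical stand-in for the integral."""
    width = (hi - lo) / steps
    return width * sum(func(lo + (k + 0.5) * width) for k in range(steps))

a, b, eta = 0.0, 1.0, 0.1      # J = (a, b), with 0 < 2 * eta < b - a
c = 1.0 / midpoint(lambda t: phi(t, a, eta), a - eta, a + eta)

def clamp(x, lo, hi):
    return min(max(x, lo), hi)

def h(x):
    """c * integral from -infinity to x of (phi_a - phi_b), split at the two bumps."""
    rise = midpoint(lambda t: phi(t, a, eta), a - eta, clamp(x, a - eta, a + eta))
    fall = midpoint(lambda t: phi(t, b, eta), b - eta, clamp(x, b - eta, b + eta))
    return c * (rise - fall)

def indicator_J(x):
    return 1.0 if a < x < b else 0.0

# Approximate L1 distance between the indicator of J and its smooth version h.
l1_distance = midpoint(lambda x: abs(indicator_J(x) - h(x)),
                       a - 2 * eta, b + 2 * eta, steps=280)
print(l1_distance, 4 * eta)
```

Up to quadrature error, $h$ equals 1 on $[a + \eta, b - \eta]$ and 0 outside
$(a - \eta, b + \eta)$, and the computed distance stays below the bound $4\eta$ of the proof.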
Remark 1. We stated our approximation theorem in $\mathbf{R}^1$. It is also
true in $\mathbf{R}^k$ for every $k$, and in this case we can require the approximating
function to have partial derivatives of all orders everywhere. Our
proof requires only minor modifications to give the corresponding
theorem in $\mathbf{R}^k$.
Remark 2. If $\Omega$ is a topological space, and $\mathscr{F}$ includes the Baire
sets in $\Omega$ then theorem 5.10 can be generalized to $(\Omega, \mathscr{F}, \mu)$ to show that
any integrable function can be approximated by a continuous function.
(The Baire sets are the sets in the $\sigma$-ring generated by sets
$\{x: f(x) > 0\}$ where $f: \Omega \to \mathbf{R}$
is continuous and vanishes outside a compact set.)

Exercises 5.6
1. In theorem 5.10 it was shown that any integrable function $f$ could be
approximated by a step function $g$ in the sense that
$$\int |f(x) - g(x)|\,dx < \varepsilon.$$
Show that in general it is not possible to arrange at the same time that
$g \le f$.
2. Show if f : R -> R* is integrable, then

f a function which is uniformly continuous

and zero outside a bounded interval.
3. If $f_n(x) = e^{-nx} - 2e^{-2nx}$ show that $f_n$ is integrable over $[0, +\infty)$ but
$$\int_0^{\infty} \left(\sum_{n=1}^{\infty} f_n(x)\right) dx \ne \sum_{n=1}^{\infty} \int_0^{\infty} f_n(x)\,dx.$$
4. Put
$$g(x) = \begin{cases} x^2 \sin(1/x^3) & (x \ne 0), \\ 0 & (x = 0), \end{cases}$$
and $g'(x) = f(x)$ for all $x \in \mathbf{R}$.
Show that $f(x)$ is finite for all $x$, but unbounded near $x = 0$. Show that
$f(x)$ is not $\mathscr{R}$-integrable over $(0, 1)$, but that it is Cauchy-Riemann integrable
(evaluate its integral). Is $f(x)$ $\mathscr{L}$-integrable over $(0, 1)$?


6.1 Classes of subsets in a product space
In the last few chapters we have defined all our concepts in a single
abstract space $\Omega$ and usually we have at any time considered only one
measure defined on a fixed class of subsets of $\Omega$. In applications one
often requires to consider more than one measure, and the relationship
between the spaces and measures involved becomes important.
We first consider measures defined on the Cartesian product of two
measure spaces. Before considering the definition of such measures
we must examine, in the present section, the structure of the relevant
classes of subsets.
In § 1.1 we defined the Cartesian product $X \times Y$ of two spaces
$X$, $Y$ to be the set of all ordered pairs $(x, y)$ with $x \in X$, $y \in Y$.

Any set in $X \times Y$ of the form $E \times F$ with $E \subset X$, $F \subset Y$ is called a
rectangle (set).

Product of classes
If $\mathscr{C}$, $\mathscr{D}$ denote classes of subsets in $X$, $Y$ respectively, then $\mathscr{C} \times \mathscr{D}$
denotes the class of all rectangles $E \times F$ with $E \in \mathscr{C}$, $F \in \mathscr{D}$.

Product ring, field, $\sigma$-field

If z-class again denotes any one of ring, field, $\sigma$-ring, $\sigma$-field and
$\mathscr{C}$, $\mathscr{D}$ are z-classes in $X$, $Y$, respectively, then the product z-class is the
z-class in $X \times Y$ generated by $\mathscr{C} \times \mathscr{D}$.
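Lemma. If $\mathscr{C}$, $\mathscr{D}$ are semi-rings in $X$, $Y$ respectively, then $\mathscr{C} \times \mathscr{D}$ is a
semi-ring in $X \times Y$.
Proof. It is immediate that $\mathscr{C} \times \mathscr{D}$ is closed for finite intersections,
so that we have only to prove that
$$E_1 \times F_1 - E_2 \times F_2$$
can be expressed as a union of disjoint sets of $\mathscr{C} \times \mathscr{D}$ for any
$E_1, E_2 \in \mathscr{C}$; $F_1, F_2 \in \mathscr{D}$.
Since $\mathscr{C}$, $\mathscr{D}$ are semi-rings we have
$$E_1 - E_2 = \bigcup_{i=3}^{n} E_i, \qquad F_1 - F_2 = \bigcup_{j=3}^{m} F_j,$$
where the sets $E_i$ $(i = 3, 4, \ldots, n)$ are disjoint sets of $\mathscr{C}$, and $F_j$ $(j = 3, 4, \ldots, m)$
are disjoint sets of $\mathscr{D}$. Now
$$E_1 \times F_1 - E_2 \times F_2 = (E_1 \cap E_2) \times (F_1 - F_2) \,\cup\, (E_1 - E_2) \times (F_1 \cap F_2) \,\cup\, (E_1 - E_2) \times (F_1 - F_2)$$
$$= \bigcup_{j=3}^{m} (E_1 \cap E_2) \times F_j \;\cup\; \bigcup_{i=3}^{n} E_i \times (F_1 \cap F_2) \;\cup\; \bigcup_{i=3}^{n} \bigcup_{j=3}^{m} E_i \times F_j,$$
and these are all disjoint sets in $\mathscr{C} \times \mathscr{D}$. ∎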
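
The three-piece decomposition used in the proof can be verified on small finite
sets. The sketch below takes the semi-rings to be the classes of all subsets of
two finite sets (an assumption made so that each difference is a single set);
the particular sets and the helper `rect` are illustrative.

```python
from itertools import product

def rect(E, F):
    """The rectangle E x F as a set of ordered pairs."""
    return set(product(E, F))

E1, E2 = {1, 2, 3}, {2, 3, 4}
F1, F2 = {"a", "b"}, {"b", "c"}

# The three rectangles appearing in the proof of the lemma.
pieces = [
    rect(E1 & E2, F1 - F2),
    rect(E1 - E2, F1 & F2),
    rect(E1 - E2, F1 - F2),
]
difference = rect(E1, F1) - rect(E2, F2)
print(sorted(difference))
```

The pieces are pairwise disjoint and their union is the difference of the two
rectangles, although the difference itself is not a rectangle.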
It is important to notice that this lemma does not extend to any of
the z-classes as $\mathscr{C} \times \mathscr{D}$ is not closed under the operation of union. In
particular, if $\mathscr{C}$, $\mathscr{D}$ are $\sigma$-fields then $\mathscr{C} \times \mathscr{D}$ will not be a $\sigma$-field. We
will use the notation $\mathscr{C} * \mathscr{D}$ for the $\sigma$-field in $X \times Y$ generated by
$\mathscr{C} \times \mathscr{D}$. We also need some operations which are effective in the opposite
direction, from the product space to the components.
Given any set $E \subset X \times Y$ and any point $x \in X$, the subset
$$E_x = \{y: (x, y) \in E\}$$
of $Y$ is called the section of $E$ at $x$. Similarly, for $y \in Y$, the subset
$E^y = \{x: (x, y) \in E\}$ of $X$ is the section of $E$ at $y$.
Given any set $E \subset X \times Y$ the sets $\{x: \text{there exists } y \text{ with } (x, y) \in E\}$,
$\{y: \text{there exists } x \text{ with } (x, y) \in E\}$ are called the projections of $E$ into
the spaces $X$, $Y$, respectively.
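
Sections and projections are simple to compute when $E$ is a finite set of ordered
pairs. The following sketch is purely illustrative: the set `E` and the helper
names `section_at_x` and `section_at_y` are assumptions, not notation from the text.

```python
E = {(1, "a"), (1, "b"), (2, "b"), (3, "c")}   # a finite subset of X x Y

def section_at_x(E, x):
    """The section of E at x: all y with (x, y) in E."""
    return {y for (u, y) in E if u == x}

def section_at_y(E, y):
    """The section of E at y: all x with (x, y) in E."""
    return {x for (x, v) in E if v == y}

proj_X = {x for (x, _) in E}   # projection of E into X
proj_Y = {y for (_, y) in E}   # projection of E into Y

print(section_at_x(E, 1), section_at_y(E, "b"), proj_X, proj_Y)
```

A section is empty exactly when the point lies outside the corresponding
projection, as the call `section_at_x(E, 4)` shows.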
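Although the product $\sigma$-field $\mathscr{C} * \mathscr{D}$ of two $\sigma$-fields $\mathscr{C}$, $\mathscr{D}$ contains
more than the rectangle sets $E \times F$, $E \in \mathscr{C}$, $F \in \mathscr{D}$, one can deduce an
important restriction on the sections of its sets.
Theorem 6.1. If $\mathscr{F}$, $\mathscr{G}$ are $\sigma$-fields in $X$, $Y$, respectively, and $\mathscr{H}$ is
the product $\sigma$-field in $X \times Y$, then all sets $E$ in $\mathscr{H}$ have the property that
$E_x \in \mathscr{G}$ for all $x \in X$, and $E^y \in \mathscr{F}$ for all $y \in Y$.
Proof. Let $\mathscr{C}$ be the class of subsets $E$ of $X \times Y$ with the property
that every section of $E$ is in the appropriate $\sigma$-field. Since rectangle
sets certainly have this property, it is immediate that $\mathscr{C} \supset \mathscr{F} \times \mathscr{G}$.
Moreover, it is not difficult to verify that $\mathscr{C}$ satisfies all the axioms of a
$\sigma$-field. Hence $\mathscr{C} \supset \mathscr{H}$, the $\sigma$-field generated by $\mathscr{F} \times \mathscr{G}$. ∎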

Sections of functions
Given any function $f: X \times Y \to \mathbf{R}^*$; for each fixed $x \in X$,
$$f_x(y) = f(x, y) \quad \text{for } y \in Y$$
defines a function $Y \to \mathbf{R}^*$ and for each fixed $y \in Y$, $f^y(x) = f(x, y)$
for $x \in X$ defines a function on $X$ to $\mathbf{R}^*$. These functions $f_x$, $f^y$ are
called the sections of $f$ at $x$, $y$, respectively.
Corollary. Under the conditions of theorem 6.1, given any $\mathscr{H}$-measurable
function $f: X \times Y \to \mathbf{R}^*$, each of the sections $f_x(y)$ is $\mathscr{G}$-measurable
and each of the sections $f^y(x)$ is $\mathscr{F}$-measurable.
Proof. Suppose $x_0$ is a fixed point in $X$ and $M$ is a Borel set in $\mathbf{R}^*$;
$$\{y: f_{x_0}(y) \in M\} = \{y: f(x_0, y) \in M\} = \{(x, y): f(x, y) \in M\}_{x_0}$$
so that the test set is the section at $x_0$ of a set in $\mathscr{H}$. ∎
The results of theorem 6.1 and corollary can be extended in an
obvious way to finite Cartesian products $X_1 \times X_2 \times \ldots \times X_n$; there is
no difficulty in making the required modifications to the definitions
and proofs. It is not quite so immediate that they can also be extended
to arbitrary Cartesian products $\prod_{i \in I} X_i$.
Let us recall that a point in $\prod_{i \in I} X_i$ can be thought of as a function
$f: I \to \bigcup_{i \in I} X_i$ such that $f(i) \in X_i$ for each $i \in I$. Suppose then that we
have a collection $\{X_i, i \in I\}$ of spaces and $\sigma$-fields $\mathscr{F}_i$ of subsets of $X_i$.
Cylinder set
If $i_1, i_2, \ldots, i_n$ is any finite subset of $I$ and $E_{i_k} \in \mathscr{F}_{i_k}$, $k = 1, \ldots, n$; the
set of points $f \in \prod X_i$ such that
$$f(i_k) \in E_{i_k} \quad (k = 1, 2, \ldots, n),$$
is said to be a (finite dimensional) cylinder set in $\prod X_i$. When we say
that $f$ is in a cylinder set $C$, the values of $f$ are restricted only on a
finite set of indices. The class of all such cylinder sets in $\prod X_i$ will be
denoted $\mathscr{C}(\mathscr{F}_i, i \in I)$.
Lemma. The class $\mathscr{C}(\mathscr{F}_i, i \in I)$ of cylinder sets is a semi-ring of subsets
of $\prod X_i$.
Proof. We can think of $\mathscr{F}_i$ as a semi-ring in $X_i$ which contains the
whole space $X_i$. Then if two sets
$$A = \{f: f(i) \in E_i, i \in J\},$$
$$B = \{f: f(i) \in F_i, i \in K\},$$
are in $\mathscr{C}(\mathscr{F}_i, i \in I)$, $J$ and $K$ must be finite subsets of $I$, and each of the
sets $E_i$, $F_i$ must be in the relevant $\mathscr{F}_i$. The set $J \cup K = L$
is also a finite subset of $I$ and, if we put
$E_i = X_i$ for $i \in K - J$, $F_i = X_i$ for $i \in J - K$,
then $A = \{f: f(i) \in E_i, i \in L\}$, $B = \{f: f(i) \in F_i, i \in L\}$
are now cylinder sets in which the same finite subset $L$ of indices is
restricted. Since we know that any finite Cartesian product of semi-rings
is a semi-ring, we can deduce that $A - B$ is a finite disjoint union
of sets of this type and $A \cap B$ is a set of this type. Hence $\mathscr{C}(\mathscr{F}_i, i \in I)$ is
a semi-ring. ∎
Note. The case $I = Z$ is important. The cylinder sets in $\prod_{i=1}^{\infty} X_i$
then reduce to sets of the form $E_1 \times E_2 \times \ldots \times E_n \times \prod_{i=n+1}^{\infty} X_i$ with
$E_i \in \mathscr{F}_i$ $(i = 1, 2, \ldots, n)$.

The results corresponding to theorem 6.1 for $\prod X_i$ are formulated
as examples for the reader to prove.

Exercises 6.1

1. If $\mathscr{R}$ is a ring of subsets of $X$, $\mathscr{S}$ is a ring of subsets of $Y$, show that
the product ring consists of those sets in $X \times Y$ which are finite unions of
disjoint rectangles in $\mathscr{R} \times \mathscr{S}$.
2. If $A_1, A_2 \subset X$, $B_1, B_2 \subset Y$ and $A_1 \times B_1 = A_2 \times B_2$ is not null, prove
that $A_1 = A_2$, $B_1 = B_2$.
3. Suppose $E = A \times B$, $E_1 = A_1 \times B_1$ and $E_2 = A_2 \times B_2$ are all non-empty
rectangles in $X \times Y$. Show that $E$ is a disjoint union of $E_1$ and $E_2$
if and only if either $A$ is a disjoint union of $A_1$, $A_2$ and $B = B_1 = B_2$, or
$B$ is a disjoint union of $B_1$, $B_2$ and $A = A_1 = A_2$.
4. If $\mathscr{S}$, $\mathscr{T}$ are $\sigma$-rings in $X$, $Y$, respectively, then the product $\sigma$-ring
in $X \times Y$ is a $\sigma$-field if and only if both $\mathscr{S}$ and $\mathscr{T}$ are $\sigma$-fields.
5. Show that the intersection of a class of rectangles is a rectangle.
6. Suppose $X = Y$ is any uncountable set and $\mathscr{S} = \mathscr{T}$ is the class of
subsets which are either countable or have countable complement. Determine
the product $\sigma$-field of $\mathscr{S}$ and $\mathscr{T}$. If $D = \{(x, y): x = y\}$ is the diagonal
in $X \times Y$ show that every section of $D$ is in $\mathscr{S}$ or $\mathscr{T}$ but $D$ is not in the
product $\sigma$-field. This shows that theorem 6.1 has no converse.
7. Suppose $\mathscr{F}$, $\mathscr{G}$ are $\sigma$-fields in $X$, $Y$; then a rectangle set $E \times F$ is in the
product $\sigma$-field if and only if $E \in \mathscr{F}$, $F \in \mathscr{G}$.
8. Suppose $\mathscr{H}$ is the product $\sigma$-field of two $\sigma$-fields $\mathscr{F}$, $\mathscr{G}$. Show that
any function on $X \times Y \to \mathbf{R}$ which is $\mathscr{H}$-simple has all its sections $\mathscr{F}$-simple
or $\mathscr{G}$-simple.
9. Suppose $\mathscr{H}$ is the product $\sigma$-field of two $\sigma$-fields $\mathscr{F}_1$, $\mathscr{F}_2$. Show that the
projection of a set in $\mathscr{H}$ on an axis need not be in $\mathscr{F}_1$, $\mathscr{F}_2$, respectively.
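10. Suppose $\mathscr{F}_i$ is a $\sigma$-field in $X_i$ $(i = 1, 2, \ldots)$ and the $\sigma$-field generated
by cylinder sets $\mathscr{C}(\mathscr{F}_n, \mathscr{F}_{n+1}, \ldots)$ in $\prod_{i=n}^{\infty} X_i$ is denoted by $\mathscr{S}_n$. Then given any
set $E$ in $\prod_{i=1}^{\infty} X_i$ the (finite dimensional) section of $E$ at $x_1, x_2, \ldots, x_k$ is the set
(in $\prod_{i=k+1}^{\infty} X_i$) of points $(x_{k+1}, x_{k+2}, \ldots)$ such that $(x_1, x_2, \ldots) \in E$. Then if $E \in \mathscr{S}_1$,
the product $\sigma$-field in $\prod X_i$, all its $k$-dimensional sections belong to $\mathscr{S}_{k+1}$.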

6.2 Product measures

We now assume that $(X_1, \mathscr{F}_1, \mu_1)$ and $(X_2, \mathscr{F}_2, \mu_2)$ are measure
spaces and $\mu_1$, $\mu_2$ are $\sigma$-finite measures. The product $\sigma$-field $\mathscr{F}_1 * \mathscr{F}_2$ in
$X_1 \times X_2$ was defined as the smallest $\sigma$-field containing the class
$\mathscr{F}_1 \times \mathscr{F}_2$ which is known to be a semi-ring since each of $\mathscr{F}_1$, $\mathscr{F}_2$ is a
semi-ring. In Chapters 3 and 4 we developed a general method of
extending a measure from a semi-ring to the generated $\sigma$-ring. Since
the semi-ring $\mathscr{F}_1 \times \mathscr{F}_2$ contains the whole space $X_1 \times X_2$ this generated
$\sigma$-ring must be a $\sigma$-field and is therefore $\mathscr{F}_1 * \mathscr{F}_2$, the product $\sigma$-field.
Thus if we use theorems 3.5 and 4.2 we can extend any $\sigma$-finite measure
on $\mathscr{F}_1 \times \mathscr{F}_2$ to a $\sigma$-finite measure on $\mathscr{F}_1 * \mathscr{F}_2$ in a unique way.
Suppose $E_1 \times E_2$ is any rectangle set in $\mathscr{F}_1 \times \mathscr{F}_2$ and put
$$\mu(E_1 \times E_2) = \mu_1(E_1)\,\mu_2(E_2),$$
with the usual convention that $0 \cdot \infty = \infty \cdot 0 = 0$. Then $\mu$ is a non-negative
set function on $\mathscr{F}_1 \times \mathscr{F}_2$ which is easily seen to be $\sigma$-finite.
Our first objective is to show that $\mu$ is a measure on the semi-ring
$\mathscr{F}_1 \times \mathscr{F}_2$. First, suppose that
$$E \times F = \bigcup_{i=1}^{n} (E_i \times F_i)$$
with the sets $E_i \times F_i$ disjoint. Define the functions $f_i: X_1 \to \mathbf{R}^+$
by $f_i(x) = \mu_2(F_i)\,\chi_{E_i}(x)$ $(i = 1, 2, \ldots, n)$. Then $f_i$ is a non-negative simple
function or possibly a function which takes the value $+\infty$ on a measurable
set $E_i$ and zero outside it: in any case
$$\int f_i\,d\mu_1 = \mu_1(E_i)\,\mu_2(F_i) \quad (i = 1, 2, \ldots, n).$$
Similarly, if $f(x) = \mu_2(F)\,\chi_E(x)$ we have
$$\int f\,d\mu_1 = \mu_1(E)\,\mu_2(F).$$
Now for each fixed $x$ in $X_1$ we have
$$(E \times F)_x = \bigcup_{i=1}^{n} (E_i \times F_i)_x$$
with the sets $(E_i \times F_i)_x$ disjoint. Since $\mu_2$ is (finitely) additive it follows
that
$$f(x) = \sum_{i=1}^{n} f_i(x).$$
If we now use (finite) additivity for integrals of non-negative simple
functions we have
$$\mu_1(E)\,\mu_2(F) = \int f\,d\mu_1 = \int \sum_{i=1}^{n} f_i\,d\mu_1 = \sum_{i=1}^{n} \int f_i\,d\mu_1 = \sum_{i=1}^{n} \mu_1(E_i)\,\mu_2(F_i).$$
This shows that the set function $\mu$ we have defined is finitely additive
on $\mathscr{F}_1 \times \mathscr{F}_2$. The same argument extends without difficulty to countable
unions of disjoint rectangles
because all the functions $f_i(x)$ are non-negative measurable, so that
the monotone convergence theorem 5.6 justifies the inversion of
integration and summation. Thus $\mu$ is a measure on the semi-ring
$\mathscr{F}_1 \times \mathscr{F}_2$. It can be extended uniquely by theorem 3.5 to the generated
ring, and then, by theorem 4.2, to the generated $\sigma$-ring which is the
product $\sigma$-field $\mathscr{F}_1 * \mathscr{F}_2$. The result is called the product measure
on $\mathscr{F}_1 * \mathscr{F}_2$. We have thus proved
Theorem 6.2. Given two measure spaces $(X_1, \mathscr{F}_1, \mu_1)$, $(X_2, \mathscr{F}_2, \mu_2)$ such
that $\mu_1$, $\mu_2$ are $\sigma$-finite, there is a unique measure $\mu$ defined on the product
$\sigma$-field $\mathscr{F}_1 * \mathscr{F}_2$ in $X_1 \times X_2$ such that
$$\mu(E_1 \times E_2) = \mu_1(E_1)\,\mu_2(E_2) \quad \text{for } E_1 \in \mathscr{F}_1,\ E_2 \in \mathscr{F}_2.$$

The above theorem clearly extends immediately to any finite
Cartesian product of measure spaces. Difficulties arise with
the Cartesian product of an enumerable collection of measure spaces
unless we arrange that the infinite products of real numbers occurring
converge. The easiest way to ensure this is to restrict the discussion
to countable products of measure spaces $(X_i, \mathscr{F}_i, \mu_i)$ with $\mu_i(X_i) = 1$.
It is possible to define product measures on arbitrary product
spaces $\prod X_i$ such that $\mu_i(X_i) = 1$ by exactly the method used below.
We carry out the construction only for enumerable products as, in
applications, it is not usually appropriate to consider the product
measure for non-countable products. In § 6.6 we will give a general
construction for a measure in $\prod X_i$, an arbitrary product space;
this construction could clearly be specialized to give the results of the
remainder of this section, but it is simpler to deal with the case of
product measures in countable product spaces first. We will set up our
measure on the product $\sigma$-field by a slightly different procedure.
Let $\mathscr{C}$ be the semi-ring of cylinder sets in $\prod_{i=1}^{\infty} X_i$.
We define $\mu$ on $\mathscr{C}$ by $\mu(E) = \mu_1(E_1)\,\mu_2(E_2) \cdots \mu_n(E_n)$, if
$$E = E_1 \times E_2 \times \ldots \times E_n \times \prod_{i=n+1}^{\infty} X_i; \quad E_i \in \mathscr{F}_i \ (i = 1, 2, \ldots, n).$$


It is clear that $0 \le \mu(E) \le 1$ for all $E$ in $\mathscr{C}$. To see that $\mu$ is finitely
additive on $\mathscr{C}$ it is sufficient to see that, in any finite collection of
cylinder sets, only a finite number of coordinates are involved so that,
if $C = \bigcup_{j=1}^{m} C_j$ is a dissection of $C \in \mathscr{C}$ into disjoint sets of $\mathscr{C}$, there is an
integer $N$ such that $C$ and $C_j$ $(j = 1, \ldots, m)$ can all be expressed in
the form
$$E_1 \times E_2 \times \ldots \times E_N \times \prod_{i=N+1}^{\infty} X_i.$$
We can then apply theorem 6.2 to the finite products to see that
$$\mu(C) = \sum_{j=1}^{m} \mu(C_j).$$
By theorem 3.4, $\mu$ has a unique additive extension to the ring $\mathscr{R}$
of finite unions of cylinder sets. In order to apply theorem 4.2 we
must show that $\mu$ is a measure on $\mathscr{R}$. This can be done by using the
continuity theorem 3.2. It is sufficient to show that any monotone
decreasing sequence $\{A_n\}$ of sets in $\mathscr{R}$ such that
$$\mu(A_n) \ge \varepsilon > 0 \quad (n = 1, 2, \ldots)$$
has a non-void intersection.

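Let $Y_n = \prod_{i=n+1}^{\infty} X_i$. Then by the above procedure we can define product
set functions $\nu^{(n)}$ on the class $\mathscr{R}^{(n)}$ of finite unions of cylinder
sets $\mathscr{C}(\mathscr{F}_{n+1}, \mathscr{F}_{n+2}, \ldots)$. It is clear that, for each integer $n$, we can
obtain $\mu$ on $\mathscr{R}$ by taking the product of the measures $\mu_i$ $(i = 1, 2, \ldots, n)$
and $\nu^{(n)}$. Let
$$A_n(x_1) = \{y: y \in Y_1,\ (x_1, y) \in A_n\}$$
be the section of $A_n$ at $x_1 \in X_1$. It is clear that, for each $x_1 \in X_1$,
$A_n(x_1) \in \mathscr{R}^{(1)}$ and if
$$B_{n,1} = \{x_1: \nu^{(1)}(A_n(x_1)) > \tfrac{1}{2}\varepsilon\}$$
then $B_{n,1}$ is a finite union of sets in $\mathscr{F}_1$ and is therefore in $\mathscr{F}_1$:
further we must have
$$\mu_1(B_{n,1}) + \tfrac{1}{2}\varepsilon\,(1 - \mu_1(B_{n,1})) \ge \mu(A_n) \ge \varepsilon,$$
by considering $A_n \cap (B_{n,1} \times Y_1)$ and $A_n \cap ((X_1 - B_{n,1}) \times Y_1)$. It follows
that $\mu_1(B_{n,1}) \ge \tfrac{1}{2}\varepsilon$ $(n = 1, 2, \ldots)$.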

But $\{A_n\}$ is monotone decreasing so $\{B_{n,1}\}$ must also decrease with $n$.
Since $\mu_1$ is a measure on $\mathscr{F}_1$, it follows that
$$\mu_1\Big(\bigcap_{n=1}^{\infty} B_{n,1}\Big) \ge \tfrac{1}{2}\varepsilon > 0,$$
so there must be at least one point $x_1 \in X_1$ for which
$$\nu^{(1)}(A_n(x_1)) > \tfrac{1}{2}\varepsilon \quad \text{for all } n.$$
We now suppose $x_1$ is fixed as such a point in $\bigcap B_{n,1}$ and repeat the
argument to the sequence of sets $\{A_n(x_1)\}$ in the space $Y_1$. This gives
a point $x_2 \in X_2$ such that
$$\nu^{(2)}(A_n(x_1, x_2)) > \varepsilon/2^2 \quad \text{for all } n.$$
By an induction argument we obtain a point $(x_1, x_2, \ldots)$ in $\prod_{i=1}^{\infty} X_i$ such
that, for any $k$, $n$,
$$A_n(x_1, x_2, \ldots, x_k) \ne \emptyset.$$
But each set $A_n$ has only a finite number of coordinates restricted so
the point $(x_1, x_2, \ldots)$ must be in $A_n$ for all $n$. This completes the proof
that $\mu$ is continuous from above at $\emptyset$.
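Since $\mu$ is now seen to be a finite measure on the ring $\mathscr{R}$ it has a
unique extension to the generated $\sigma$-ring which is also the product
$\sigma$-field in $\prod X_i$. This extension is called the product measure. Thus we
have proved

Theorem 6.3. If $(X_i, \mathscr{F}_i, \mu_i)$ are measure spaces with
$$\mu_i(X_i) = 1 \quad (i = 1, 2, \ldots);$$
then there is a unique measure $\mu$ defined on the product $\sigma$-field $\mathscr{F}$ of
subsets of $\prod_{i=1}^{\infty} X_i$ which is generated by the cylinder sets of the form
$$E_1 \times E_2 \times \ldots \times E_n \times \prod_{i=n+1}^{\infty} X_i \quad (E_i \in \mathscr{F}_i,\ i = 1, 2, \ldots),$$
such that
$$\mu\Big(E_1 \times E_2 \times \ldots \times E_n \times \prod_{i=n+1}^{\infty} X_i\Big) = \mu_1(E_1)\,\mu_2(E_2) \cdots \mu_n(E_n).$$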

Exercises 6.2

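1. Given 3 $\sigma$-finite measure spaces $(X_1, \mathscr{F}_1, \mu_1)$, $(X_2, \mathscr{F}_2, \mu_2)$, $(X_3, \mathscr{F}_3, \mu_3)$,
let $\tau$ be the product measure of $\mu_1$, $\mu_2$ in $X_1 \times X_2$ and $\nu$ the product measure
of $\mu_2$, $\mu_3$ in $X_2 \times X_3$. Show that, in the space $X_1 \times X_2 \times X_3$ the product
measure of $\tau$ and $\mu_3$ is the same as the product measure of $\mu_1$ and $\nu$.
2. Suppose $(X_i, \mathscr{F}_i, \mu_i)$ $(i = 1, 2, \ldots)$ is a sequence of measure spaces
with $\mu_i(X_i) = 1$. Let $\mu$ be the product measure of theorem 6.3 on $\prod_{i=1}^{\infty} X_i$
and suppose $\tau_n$ is the corresponding product measure on $\prod_{i=n+1}^{\infty} X_i$. Show that
$\mu$ is the same as the product measure of $\mu_1, \mu_2, \ldots, \mu_n, \tau_n$ on the finite
Cartesian product
$$X_1 \times X_2 \times \ldots \times X_n \times \prod_{i=n+1}^{\infty} X_i.$$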

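3. The product measure of two complete measures need not be
complete. As an example take $X_1 = X_2 =$ unit interval with Lebesgue
measure. Suppose $M$ is a non-measurable set in $X_1$, and consider the set
$M \times \{y\}$; use exercise 6.1 (7).
4. Suppose $\prod_{i=1}^{\infty} X_i$ is a product space with $\mu_i(X_i) = 1$. Let $E_i \in \mathscr{F}_i$
$(i = 1, 2, \ldots)$. Then the set $E = \prod_{i=1}^{\infty} E_i$ is in the product $\sigma$-field and
$\mu(E) = \prod_{i=1}^{\infty} \mu_i(E_i)$.
5. If a cylinder set $E_1 \times E_2 \times \ldots \times E_n \times \prod_{i=n+1}^{\infty} X_i$ is in the product $\sigma$-field
$\mathscr{F}$ generated by $\mathscr{C}(\mathscr{F}_1, \mathscr{F}_2, \ldots)$, then it is in $\mathscr{C}(\mathscr{F}_1, \mathscr{F}_2, \ldots)$; in fact $E_i \in \mathscr{F}_i$
$(i = 1, 2, \ldots, n)$.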

6.3 Fubini's theorem

Given two measure spaces $(X, \mathscr{F}, \mu)$, $(Y, \mathscr{G}, \nu)$ we have now seen
how to define a product measure on the product $\sigma$-field in $X \times Y$.
Given a function $f: X \times Y \to \mathbf{R}^*$ there are sections $f_x: Y \to \mathbf{R}^*$ defined
for every $x \in X$. Our objective in the present section is to compare the
integral of $f(x, y)$ with respect to the product measure with the iterated
integral obtained by first integrating $f_x(y)$ with respect to $\nu$ for each
fixed $x$, and then integrating the resulting function of $x$ with respect to
the measure $\mu$. Because of our method of defining the integral the
general result will follow easily from the special case of simple
functions. The essential step towards this case is given by the next theorem.
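Theorem 6.4. Given $(X, \mathcal{F}, \mu)$, $(Y, \mathcal{G}, \nu)$ two $\sigma$-finite measure spaces,
let $\lambda$ be the product measure defined on the product $\sigma$-field $\mathcal{F} * \mathcal{G}$. Then
for all $A \in \mathcal{F} * \mathcal{G}$, $\nu(A_x)$ is $\mathcal{F}$-measurable and $\mu(A^y)$ is $\mathcal{G}$-measurable;
and
$$\lambda(A) = \int \mu(A^y)\,d\nu = \int \nu(A_x)\,d\mu.$$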
Proof. Suppose first that $\mu(X)$, $\nu(Y)$ are both finite. Let $\mathcal{M}$ be
the class of subsets of $X \times Y$ for which the conclusions of the theorem
are valid. Then $\mathcal{M} \supset \mathcal{F} \times \mathcal{G}$ since if $A = E_1 \times E_2$, $E_1 \in \mathcal{F}$, $E_2 \in \mathcal{G}$,
$\nu(A_x)$ is $\mathcal{F}$-simple as a function of $x$,
$\mu(A^y)$ is $\mathcal{G}$-simple as a function of $y$,
and both these functions integrate to $\lambda(A)$ by the definition of $\lambda$ on
$\mathcal{F} \times \mathcal{G}$. It follows that $\mathcal{M}$ contains the ring $\mathcal{R}$ of finite unions of
rectangle sets of $\mathcal{F} \times \mathcal{G}$. Since the limit of a monotone sequence of
measurable functions is measurable, and theorem 5.6 applies to the
integrals, it follows immediately that $\mathcal{M}$ is a monotone class. Hence,
by theorem 1.5, $\mathcal{M}$ contains the $\sigma$-ring generated by $\mathcal{R}$. But clearly
$\mathcal{M}$ contains $X \times Y$, so that $\mathcal{M}$ contains the $\sigma$-field generated by $\mathcal{R}$,
that is, $\mathcal{M} \supset \mathcal{F} * \mathcal{G}$. The restriction $\mu(X) < \infty$, $\nu(Y) < \infty$
can now be removed by the usual device of taking measurable sequences
$\{A_n\}$ increasing to $X$ and $\{B_n\}$ increasing to $Y$ for which $\mu(A_n) < \infty$,
$\nu(B_n) < \infty$ for all $n$, and considering the set $A \cap (A_n \times B_n)$ which
increases to $A$ as $n \to \infty$.
Corollary. Under the conditions of theorem 6.4, if $A \in \mathcal{F} * \mathcal{G}$, $\lambda(A) = 0$
if and only if $\nu(A_x) = 0$ for almost all $x$, and if and only if $\mu(A^y) = 0$
for almost all $y$.
This follows from the theorem using the fact that a non-negative
measurable function can integrate to zero only if it is zero almost
everywhere.
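As a rough numerical illustration of theorem 6.4 (a sketch added here, not part of the original argument), take both factors to be the real line with Lebesgue measure and let $A$ be the unit disc. The section $A_x$ is an interval of length $2\sqrt{1 - x^2}$ for $|x| < 1$, so integrating the section lengths should recover the planar measure $\pi$. The helper names `section_length` and `integrate_midpoint`, and the use of a midpoint rule in place of the Lebesgue integral, are assumptions made for this example only.

```python
import math


def section_length(x):
    """Lebesgue measure of the x-section A_x of the unit disc A."""
    if abs(x) >= 1.0:
        return 0.0
    return 2.0 * math.sqrt(1.0 - x * x)


def integrate_midpoint(func, a, b, n=20000):
    """Midpoint-rule approximation to the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))


# lambda(A) computed as the integral of nu(A_x) with respect to mu.
disc_area = integrate_midpoint(section_length, -1.0, 1.0)
print(disc_area, math.pi)
```

By symmetry the $y$-sections give the same value, in agreement with the two iterated expressions for $\lambda(A)$.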
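Theorem 6.5. Given all the conditions of theorem 6.4, we write $\mathcal{H}$ for
the product $\sigma$-field $\mathcal{F} * \mathcal{G}$.
(i) If $h: X \times Y \to R^+$ is any non-negative $\mathcal{H}$-measurable function
$$\int h\,d\lambda = \int \Big(\int h_x\,d\nu\Big)\,d\mu = \int \Big(\int h^y\,d\mu\Big)\,d\nu.$$
(ii) If $h: X \times Y \to R^*$ is $\mathcal{H}$-measurable and $\lambda$-integrable, then
$h_x: Y \to R^*$ is $\nu$-integrable for almost all $x$ and $h^y: X \to R^*$ is $\mu$-integrable
for almost all $y$. Further
$$\int h\,d\lambda = \int f\,d\mu = \int g\,d\nu,$$
where $f(x) = \int h_x\,d\nu$ when $h_x$ is $\nu$-integrable,
$g(y) = \int h^y\,d\mu$ when $h^y$ is $\mu$-integrable,
and $f$, $g$ are defined to be zero on the remaining null sets.

(iii) If $f: X \times Y \to R^*$ is $\mathcal{H}$-measurable and $\int \big(\int |f_x|\,d\nu\big)\,d\mu$ is finite, then
$$\int f\,d\lambda = \int \Big(\int f^y\,d\mu\Big)\,d\nu = \int \Big(\int f_x\,d\nu\Big)\,d\mu.$$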
Proof (i). If $h$ is the indicator function of a set in $\mathcal{H}$ the result
follows by theorem 6.4. Because of the linearity of the integration
process it now follows for non-negative $\mathcal{H}$-simple functions (note that
sections of an $\mathcal{H}$-simple function will be simple by theorem 6.1). If
we now take a sequence $\{h^{(n)}\}$ of non-negative simple functions increasing
to $h$, we will have the sections $\{h^{(n)}_x\}$, $\{(h^{(n)})^y\}$ increasing to $h_x$, $h^y$
respectively. Hence, as $n \to \infty$,

$$\int h^{(n)}\,d\lambda \to \int h\,d\lambda,$$

$$\int h^{(n)}_x\,d\nu \to \int h_x\,d\nu \ \text{ for all } x, \qquad \int (h^{(n)})^y\,d\mu \to \int h^y\,d\mu \ \text{ for all } y,$$

and application of the monotone convergence theorem (5.6) now
suffices to complete the proof.
(ii) Since $h$ is integrable, the positive and negative parts $h_+$, $h_-$
are integrable. Apply (i) to each of these functions. Then
$$f_+(x) = \int (h_+)_x\,d\nu$$
will always be defined, though it may take the value $+\infty$. Since
$\int f_+\,d\mu = \int h_+\,d\lambda$ is finite, we must have $f_+$ finite except for a set of zero
$\mu$-measure. Similarly, $f_-$ is finite almost everywhere. If we put
$$f(x) = f_+(x) - f_-(x)$$
when both $f_+$, $f_-$ are finite and $f(x) = 0$ otherwise, we see that
$$\int h\,d\lambda = \int h_+\,d\lambda - \int h_-\,d\lambda = \int f_+\,d\mu - \int f_-\,d\mu = \int f\,d\mu.$$
(iii) Again split $f$ into positive and negative parts. Since
$$0 \le f_\pm \le |f|,$$
we can apply (i) to each of the positive and negative parts to deduce
that $\int f_+\,d\lambda$ and $\int f_-\,d\lambda$ are both finite. The result now follows by (ii).
We should remark that theorem 6.5 is one of the most useful tools
in the theory of integration as we have developed it. This result again
exhibits the power and neatness of the absolute integral.
We have been careful to define the product measure $\lambda$ on the
smallest $\sigma$-field $\mathcal{H}$ which contains $\mathcal{F} \times \mathcal{G}$. Some authors define product
measure to be the completion of this $\lambda$ obtained by the process
of theorem 4.3. If one uses this definition then some of our statements
have to be modified to exclude possible subsets of zero measure,
though the essential content of the results remains valid. In particular,
given a function $f(x, y)$ which is measurable with respect to the completed
$\sigma$-field $\bar{\mathcal{H}}$, one can only say that the section $f_x$ is $\mathcal{G}$-measurable
for almost all $x$. However, provided $\mathcal{F}$ and $\mathcal{G}$ are complete with respect
to their respective measures, theorem 6.5 remains valid as stated.
We can use our definition of product measure to give an alternative
definition of the integral of a non-negative measurable function.
Theorem 6.6. Suppose $(\Omega, \mathcal{F}, \mu)$ is a $\sigma$-finite measure space, $(R, \mathcal{L}, \nu)$
denotes the real line with Lebesgue measure on it and $\tau$ is the product
measure $\mu \times \nu$ defined on the product $\sigma$-field $\mathcal{H}$ in $\Omega \times R$. Then if
$E \in \mathcal{F}$ and $f: E \to R^+$ is non-negative, $f$ is $\mathcal{F}$-measurable over $E$ if and
only if $Q(E, f) \in \mathcal{H}$, and in this case,

$$\int_E f\,d\mu = \tau(Q(E, f)),$$
where $Q(E, f)$ is the ordinate set defined by
$$Q(E, f) = \{(x, y): x \in E,\ 0 \le y < f(x)\}.$$
Proof. Suppose first that $Q(E, f) \in \mathcal{H}$. Then by theorem 6.1 all
its sections are in $\mathcal{F}$. But the section of $Q(E, f)$ at $y = a$ $(a \ge 0)$ is the set
$$\{x \in E: f(x) > a\},$$
so that by definition, $f$ is $\mathcal{F}$-measurable. Conversely, if $f$ is $\mathcal{F}$-measurable
then there is a sequence $\{f_n\}$ of $\mathcal{F}$-simple non-negative
functions which increases to $f$. Now for any $\mathcal{F}$-simple function $f_n$,
$Q(E, f_n)$ is a finite union of measurable rectangles and is therefore
in $\mathcal{H}$. Also $Q(E, f_n)$ increases monotonely to $Q(E, f)$ so we must have
$Q(E, f) \in \mathcal{H}$. Further if
$$f_n = \sum_{i=1}^{r} c_{n,i}\,\chi_{E_{n,i}}$$
with $E_{n,i}$ $(i = 1, \ldots, r)$ a disjoint partition of $E$,

$$\int_E f_n\,d\mu = \sum_{i=1}^{r} c_{n,i}\,\mu(E_{n,i}) = \sum_{i=1}^{r} \tau\big(E_{n,i} \times [0, c_{n,i})\big) = \tau(Q(E, f_n)).$$
If we now let $n \to \infty$ we obtain the desired result.
Corollary. If $f: R \to R^+$ is $\mathcal{L}$-measurable, then the ordinate set
$\{(x, y): a < x \le b,\ 0 \le y < f(x)\}$ is $\mathcal{L}^2$-measurable and has planar
Lebesgue measure $\int_a^b f(x)\,dx$.

In many elementary accounts of integration the notion of `area
under the curve' is intuitively important. This last corollary makes this
notion rigorous for the Lebesgue integral of non-negative functions.
It is possible to consider Euclidean $k$-dimensional space $R^k$ as
the Cartesian product of $k$ distinct spaces $R$. Since we have a natural
measure space $(R, \mathcal{L}, \nu)$ on each of these spaces we could form the product
measure defined on $\mathcal{F}^k$, the product $\sigma$-field in $R^k$, by the process of
theorem 6.2. How does this measure compare with Lebesgue measure
in $R^k$? Since all the extension processes used are unique, and the two
measures clearly coincide on $\mathcal{P}^k = \mathcal{P} \times \mathcal{P} \times \cdots \times \mathcal{P}$, the half-open
rectangles in $R^k$, it is clear that the two measures coincide whenever
both are defined. However, $\mathcal{L}^k$ is complete with respect to Lebesgue
measure while $\mathcal{F}^k$ is not known to be so. To see that $\mathcal{F}^k$ is not complete
it is sufficient to consider the product of a linear set which is not measurable
in $R$ with $(k - 1)$ single point sets. This set cannot be in the product
$\sigma$-field by exercise 6.1 (7), but it is a subset of a line in $R^k$ and
therefore it must be in $\mathcal{L}^k$. It follows that $\mathcal{L}^k$ is a larger $\sigma$-field than
$\mathcal{F}^k$. Since $\mathcal{B}^k$, the class of Borel sets in $R^k$, is the $\sigma$-field generated by
$\mathcal{P}^k$, we also have $\mathcal{F}^k \supset \mathcal{B}^k$. If $E$ is any set in $\mathcal{L}^1$ but not in $\mathcal{B}^1$ the
Cartesian product of $E$ with $(k - 1)$ whole lines $R$ will be in $\mathcal{F}^k$ but
not in $\mathcal{B}^k$, so that $\mathcal{F}^k$ is a larger $\sigma$-field than $\mathcal{B}^k$.
If we consider the case $k = 2$, a function $f(x, y)$ which is $\mathcal{L}^2$-measurable
need not be $\mathcal{F}^2$-measurable. Thus we can only say that
the function $f_x(y) = f(x, y)$ considered as a function of $y$ for fixed $x$ is
measurable for almost all $x$. Thus in theorem 6.5 (ii), if $f(x, y)$
is Lebesgue integrable we can deduce that $\phi(x) = \int f(x, y)\,dy$ exists
and is finite except for an exceptional set of $x$ of zero measure. As
$\phi(x)$ is thus defined a.e. it can be integrated and
$$\iint f(x, y)\,dx\,dy = \int \phi(x)\,dx.$$

Exercises 6.3
1. Suppose $\Omega$ is any set of cardinal greater than $\aleph_0$, and $\mathcal{F}$ is the $\sigma$-field
of sets in $\Omega$ which are either countable or have a countable complement.
For $E \in \mathcal{F}$, put $\mu(E) = 0$ if $E$ is countable, $\mu(E) = 1$ if $(\Omega - E)$ is countable.
Consider the Cartesian product of two copies of $\Omega$ and let $E$ be a set in
$\Omega \times \Omega$ which has countable $x$-sections for every $x$ and $y$-sections whose
complement is countable for every $y$. If $h$ is the indicator function of $E$, then
$$\int h^y(x)\,\mu(dx) = 1, \qquad \int h_x(y)\,\mu(dy) = 0.$$
Why does this not contradict theorem 6.4?
2. Suppose $(X, \mathcal{F}, \mu)$, $(Y, \mathcal{G}, \nu)$ are $\sigma$-finite measure spaces and $\lambda$ is the
product measure on the product $\sigma$-field $\mathcal{H}$. Show that
(i) If $E, G \in \mathcal{H}$ are such that $\nu(E_x) = \nu(G_x)$ for almost all $x \in X$, then
$\lambda(E) = \lambda(G)$.
(ii) If $f$, $g$ are integrable functions on $X$, $Y$ then $f(x)\,g(y)$ is integrable on
$X \times Y$ and
$$\int f(x)\,g(y)\,d\lambda = \int f\,d\mu \int g\,d\nu.$$
3. $X = Y = [0, 1]$ and $\mathcal{F}$, $\mathcal{G}$ are the Borel subsets. Let $\mu(E)$ be the Lebesgue
measure of $E$, $\nu(E)$ the number of points in $E$. Form the product measure
$\mu \times \nu$ on Borel subsets of the unit square. Then if $D$ is the diagonal
$\{(x, y): x = y\}$, $D$ is measurable and

$$\int \nu(D_x)\,\mu(dx) = 1, \qquad \int \mu(D^y)\,\nu(dy) = 0.$$

Why does this not contradict theorem 6.4?
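4. If $f(x, y) = (x^2 - y^2)/(x^2 + y^2)^2$ show that

$$\int_0^1 \Big\{\int_0^1 f(x, y)\,dy\Big\}\,dx = \frac{\pi}{4}, \qquad \int_0^1 \Big\{\int_0^1 f(x, y)\,dx\Big\}\,dy = -\frac{\pi}{4},$$

where all the integrals are taken in the Lebesgue sense. Thus theorem
6.5 (iii) is not valid without the modulus sign. Similarly, show that

$$\int_0^1 \Big\{\int_1^\infty (e^{-xy} - 2e^{-2xy})\,dx\Big\}\,dy \ne \int_1^\infty \Big\{\int_0^1 (e^{-xy} - 2e^{-2xy})\,dy\Big\}\,dx.$$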

5. If $f(x, y) = xy/(x^2 + y^2)^2$, then

$$\int_{-1}^{+1} \Big(\int_{-1}^{+1} f(x, y)\,dx\Big)\,dy = 0 = \int_{-1}^{+1} \Big(\int_{-1}^{+1} f(x, y)\,dy\Big)\,dx$$
but the integral over the unit square in R2 does not exist.
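6. Given a countable collection of probability spaces $(X_i, \mathcal{F}_i, \mu_i)$ and
the product measure $\mu$ on the product $\sigma$-field, we can form the finite product
measures $\tau_n = \mu_1 \times \mu_2 \times \cdots \times \mu_n$ and the product measure $\lambda_n$ on the product
space $\prod_{i=n+1}^{\infty} X_i$. Then, if $f(x_1, x_2, \ldots)$ is any $\mu$-integrable function on $\prod_{i=1}^{\infty} X_i$ we have
$$\int f\,d\mu = \int \Big\{\int f(x_1, x_2, \ldots)\,d\lambda_n\Big\}\,d\tau_n.$$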
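The two iterated integrals in exercise 4 above can be checked numerically. The sketch below is an illustration only: it uses the antiderivatives $y/(x^2 + y^2)$ and $-x/(x^2 + y^2)$ to write the inner integrals in closed form as $1/(1 + x^2)$ and $-1/(1 + y^2)$, and then applies a midpoint rule (an assumption standing in for the Lebesgue integral). The function names are hypothetical.

```python
import math


def f(x, y):
    """Integrand of exercise 4: (x^2 - y^2) / (x^2 + y^2)^2."""
    return (x * x - y * y) / (x * x + y * y) ** 2


def integrate_midpoint(func, a, b, n=20000):
    """Midpoint-rule approximation to the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (k + 0.5) * h) for k in range(n))


def inner_dy(x):
    """Closed form of the integral of f(x, y) dy over 0 <= y <= 1, for x > 0."""
    return 1.0 / (1.0 + x * x)


def inner_dx(y):
    """Closed form of the integral of f(x, y) dx over 0 <= x <= 1, for y > 0."""
    return -1.0 / (1.0 + y * y)


# Spot check of the closed form against direct quadrature at x = 0.5.
numeric_inner = integrate_midpoint(lambda y: f(0.5, y), 0.0, 1.0)

dy_then_dx = integrate_midpoint(inner_dy, 0.0, 1.0)
dx_then_dy = integrate_midpoint(inner_dx, 0.0, 1.0)
print(dy_then_dx, dx_then_dy, math.pi / 4)
```

The two orders of integration give values of opposite sign, which is only possible because $\iint |f|$ is infinite on the unit square.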

6.4 Radon-Nikodym theorem

We start with a definition.
Absolute continuity
Suppose $\mathcal{F}$ is a $\sigma$-field of subsets of $\Omega$ and $\mu$ is a measure on $\mathcal{F}$.
Then the set function $\nu: \mathcal{F} \to R^*$ is said to be absolutely continuous
with respect to $\mu$ if $\nu(E) = 0$ for every $E$ in $\mathcal{F}$ with $\mu(E) = 0$. In this
case we write $\nu \ll \mu$. If $(\Omega, \mathcal{F}, \mu)$ is a measure space and $f: \Omega \to R^*$
is $\mu$-integrable, then it is clear that

$$\nu(E) = \int_E f\,d\mu$$
defines a finite valued absolutely continuous set function $\nu$. In fact,
in § 5.4 we proved that $\nu$ was $\sigma$-additive and that (corollary to theorem
5.6) given $\epsilon > 0$, there is a $\delta > 0$ such that for $E \in \mathcal{F}$,
$$\mu(E) < \delta \Rightarrow |\nu(E)| < \epsilon. \qquad (6.4.1)$$
It is immediate that any set function $\nu$ which satisfies (6.4.1) is
absolutely continuous with respect to $\mu$. The conditions are equivalent
for finite measures, but not in general (see exercise 6.4 (4)).
There is a partial converse given by:
Lemma. If $(\Omega, \mathcal{F}, \mu)$ is a measure space and $\nu: \mathcal{F} \to R$ is finite valued,
$\sigma$-additive and absolutely continuous with respect to $\mu$, then $\nu$ satisfies
condition (6.4.1).
Proof. By the decomposition of § 3.2, any such $\nu$ is the difference of
two finite measures, so it is sufficient to prove the result for a measure $\nu$.
Then if (6.4.1) is false, there is an $\epsilon > 0$ and a sequence $\{E_n\}$ of
sets of $\mathcal{F}$ such that $\nu(E_n) \ge \epsilon$ and $\mu(E_n) < 2^{-n}$. Put $E = \limsup E_n$. Then, for every $n$,
$$\mu(E) \le \mu\Big(\bigcup_{r=n+1}^{\infty} E_r\Big) \le \sum_{r=n+1}^{\infty} \mu(E_r) < 2^{-n},$$
so that $\mu(E) = 0$ while
$$\nu(E) = \lim_{n \to \infty} \nu\Big(\bigcup_{r=n+1}^{\infty} E_r\Big) \ge \limsup_{r \to \infty} \nu(E_r),$$
so that $\nu(E) \ge \epsilon$. This contradicts $\nu \ll \mu$.
Thus we see that the indefinite integral of an integrable function
defines an absolutely continuous set function. Our object in the present
section is to obtain the converse of this statement under suitable
conditions. It is convenient at the same time to consider a more
general $\sigma$-additive set function and to decompose it into a maximal
absolutely continuous component and a remainder which has to be
concentrated on a $\mu$-null set. It is convenient to give a further definition.
Singular set function
Given a measure space $(\Omega, \mathcal{F}, \mu)$, a set function $\nu: \mathcal{F} \to R^*$ is said
to be singular with respect to $\mu$ if there is a set $E_0 \in \mathcal{F}$ for which
$$\mu(E_0) = 0 \ \text{ and } \ \nu(E) = \nu(E \cap E_0), \ \text{ all } E \in \mathcal{F}. \qquad (6.4.2)$$
This condition clearly means that the parts of $\Omega$ outside the null set
$E_0$ make no contribution to $\nu$. In fact if $\nu$ is also a measure we see that
$\Omega$ can be dissected into two sets $E_0, E_1 \in \mathcal{F}$ such that
$$E_0 \cap E_1 = \emptyset, \quad E_0 \cup E_1 = \Omega, \quad \mu(E_0) = 0, \quad \nu(E_1) = 0.$$
The symmetry of the relationship in this case is sometimes stressed
by saying that $\mu$ and $\nu$ are mutually singular.
Theorem 6.7. Given a $\sigma$-finite measure space $(\Omega, \mathcal{F}, \mu)$ and a $\sigma$-additive,
$\sigma$-finite set function $\nu$, then there is a unique decomposition
$$\nu = \nu_1 + \nu_2$$
into set functions $\nu_i$ which are $\sigma$-additive and $\sigma$-finite and such that $\nu_1$ is
singular with respect to $\mu$ and $\nu_2 \ll \mu$. Further there is a finite valued
measurable $f: \Omega \to R$ such that

$$\nu_2(E) = \int_E f\,d\mu, \ \text{ all } E \in \mathcal{F}.$$

The function $f$ is unique in the sense that if we also have

$$\nu_2(E) = \int_E g\,d\mu$$
for all $E$ in $\mathcal{F}$, then $f(x) = g(x)$ except in a set of zero $\mu$-measure.
Corollary. Under the conditions of the theorem if $\nu \ll \mu$ then there is a
finite valued $f: \Omega \to R$ such that

$$\nu(E) = \int_E f\,d\mu \ \text{ for } E \in \mathcal{F}.$$

Note. The decomposition of $\nu$ into absolutely continuous and singular
components is often called the Lebesgue decomposition, while the
integral representation is called the Radon-Nikodym theorem.
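On a finite set with every subset measurable, the decomposition of theorem 6.7 can be written down explicitly: the singular part lives on the points of zero $\mu$-mass, and the density is the ratio of point masses elsewhere. The following sketch is a toy illustration under that assumption; the function `lebesgue_decomposition` and the sample measures are hypothetical, and exact rational arithmetic is used so the identities can be checked exactly.

```python
from fractions import Fraction
from itertools import combinations


def lebesgue_decomposition(mu, nu):
    """Split nu into a mu-singular part and a density f with nu2(E) = sum of f*mu over E.

    mu and nu map each point of a finite set to its (non-negative) point mass.
    """
    null_set = {x for x in mu if mu[x] == 0}
    density = {x: Fraction(0) if x in null_set else nu[x] / mu[x] for x in mu}
    singular = {x: nu[x] if x in null_set else Fraction(0) for x in mu}
    return density, singular, null_set


mu = {"a": Fraction(1, 2), "b": Fraction(1, 2), "c": Fraction(0)}
nu = {"a": Fraction(1, 4), "b": Fraction(0), "c": Fraction(3, 4)}
density, singular, null_set = lebesgue_decomposition(mu, nu)


def check_decomposition():
    """Verify nu(E) = nu1(E) + integral over E of f d(mu) for every subset E."""
    points = list(mu)
    for r in range(len(points) + 1):
        for subset in combinations(points, r):
            nu_e = sum((nu[x] for x in subset), Fraction(0))
            nu1_e = sum((singular[x] for x in subset), Fraction(0))
            nu2_e = sum((density[x] * mu[x] for x in subset), Fraction(0))
            if nu_e != nu1_e + nu2_e:
                return False
    return True


print(density, singular, check_decomposition())
```

Here the density may be redefined arbitrarily on the $\mu$-null point without changing $\nu_2$, which is the finite analogue of uniqueness up to a set of zero $\mu$-measure.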
Proof. Since we can express $\Omega$ as a union of a countable set of disjoint
sets on each of which both $\mu$ and $\nu$ are finite, there is no loss in
generality in assuming that they are both finite on $\Omega$. This applies
to both the existence and uniqueness proofs. We first see that the
decomposition is unique. Suppose
$$\nu = \nu_1 + \nu_2 = \nu_3 + \nu_4,$$
where $\nu_1$, $\nu_3$ are singular and $\nu_2$, $\nu_4$ are absolutely continuous. Then

$$\nu_1 - \nu_3 = \nu_4 - \nu_2.$$
Taking the union of support sets of $\nu_1$, $\nu_3$ gives a set $E_0$ such that
$$(\nu_1 - \nu_3)(E) = (\nu_1 - \nu_3)(E \cap E_0), \quad \mu(E_0) = 0.$$
But $(\nu_4 - \nu_2)$ is absolutely continuous and therefore zero on any null
set so that, for any $E \in \mathcal{F}$,
$$(\nu_4 - \nu_2)(E) = (\nu_1 - \nu_3)(E) = (\nu_1 - \nu_3)(E \cap E_0) = (\nu_4 - \nu_2)(E \cap E_0) = 0.$$
Thus $\nu_1(E) = \nu_3(E)$, $\nu_2(E) = \nu_4(E)$ for all $E$. The uniqueness of the
integral representation of $\nu_2$ was proved in § 5.4. Thus it is sufficient
to find any decomposition and integral representation.
By theorem 3.3 we can decompose $\nu$ into the difference of two measures.
It is therefore sufficient to prove the theorem when $\nu$ is a
measure. Now let $\mathcal{H}$ be the class of non-negative measurable
$f: \Omega \to R^+$
such that $\nu(E) \ge \int_E f\,d\mu$ for all $E$ in $\mathcal{F}$,
and put
$$\alpha = \sup\Big\{\int f\,d\mu : f \in \mathcal{H}\Big\}.$$

Let $\{f_n\}$ be a sequence of functions in $\mathcal{H}$ such that

$$\int f_n\,d\mu > \alpha - \frac{1}{n}.$$
Put $g_n(x) = \max\{f_1(x), f_2(x), \ldots, f_n(x)\}$. Then if $E \in \mathcal{F}$ and $n$ is fixed
we can decompose $E$ into a disjoint union $E_1 \cup E_2 \cup \cdots \cup E_n$ of sets of
$\mathcal{F}$ such that $g_n = f_j$ on $E_j$. Hence

$$\int_E g_n\,d\mu = \sum_{j=1}^{n} \int_{E_j} g_n\,d\mu = \sum_{j=1}^{n} \int_{E_j} f_j\,d\mu \le \sum_{j=1}^{n} \nu(E_j) = \nu(E),$$
so that $g_n \in \mathcal{H}$ for all $n$. But $\{g_n\}$ is monotone increasing, and by the
monotone convergence theorem, $f_0(x) = \lim g_n(x) \in \mathcal{H}$. Since

$f_0(x) \ge f_n(x)$ for all $n$,

we must have $\alpha = \int f_0(x)\,d\mu$.

For each $E$ in $\mathcal{F}$, put

$$\nu_2(E) = \int_E f_0\,d\mu, \qquad \nu_1(E) = \nu(E) - \nu_2(E).$$

Then $\nu_2$ is absolutely continuous with respect to $\mu$, so it only remains to
show that $\nu_1$ is singular.
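Consider the $\sigma$-additive set function
$$\lambda_n = \nu_1 - (1/n)\,\mu$$
and decompose $\Omega$, using theorem 3.3, into positive and negative sets
$P_n$, $N_n$ such that $P_n \cup N_n = \Omega$, $P_n \cap N_n = \emptyset$, $E \subset P_n \Rightarrow \lambda_n(E) \ge 0$,
$E \subset N_n \Rightarrow \lambda_n(E) \le 0$. Then, for $E \subset P_n$,

$$\nu(E) = \nu_1(E) + \nu_2(E) \ge \nu_2(E) + \frac{1}{n}\,\mu(E) = \int_E \Big(f_0 + \frac{1}{n}\Big)\,d\mu.$$

This shows that the function equal to $f_0$ on $N_n$ and $[f_0 + (1/n)]$ on $P_n$
is in $\mathcal{H}$. This will give a larger integral than $\alpha$ unless $\mu(P_n) = 0$. If
$P = \bigcup P_n$, then $\mu(P) = 0$. Further $\Omega - P \subset N_n$ for all $n$, so that
$0 \le \nu_1(\Omega - P) \le (1/n)\,\mu(\Omega - P)$ for all $n$; hence
$\nu_1(\Omega - P) = 0$ and
$\nu_1(E) = \nu_1(E \cap P)$ for all $E$ in $\mathcal{F}$,
that is, $\nu_1$ is $\mu$-singular.
In the case where $\nu \ll \mu$, by the uniqueness of the decomposition
we must have $\nu = \nu_2$, and the integral representation of $\nu$ now follows.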
Remark. In the statement of theorem 6.7 we do not assert that the
function $f$ is integrable. A necessary and sufficient condition that $f$
be integrable is that $\nu$ be finite. However, the use of the symbol
$\int_E f\,d\mu$ asserts that either $f_+$ or $f_-$ has a finite integral. This corresponds
to the result of theorem 3.2 that $\nu$ cannot take both the values $\pm\infty$.
Derivative of a set function
If $(\Omega, \mathcal{F}, \mu)$ is a measure space and

$$\nu(E) = \int_E f\,d\mu \ \text{ for } E \text{ in } \mathcal{F},$$

then we write $f = d\nu/d\mu$ and call $f$ the Radon-Nikodym derivative of
$\nu$ with respect to $\mu$.
One should emphasise that the derivative $d\nu/d\mu$ is not defined
uniquely at any given point; it has to be considered as a function and
then it becomes uniquely defined in the sense that any two functions
representing the same derivative can differ only on a $\mu$-null set.

Exercises 6.4
1. Show that if $\mu$, $\nu$ are any two measures on a $\sigma$-ring $\mathcal{S}$, then $\nu \ll \mu + \nu$.
2. Suppose $F(x)$ is the Cantor function defined in § 2.7 and $\nu$ is the
Lebesgue-Stieltjes measure with respect to $F$. Show that $\nu$ is singular with
respect to Lebesgue measure.
3. Suppose $(\Omega, \mathcal{F}, \mu)$ is a measure space with $\mu(\Omega) < \infty$ and $\nu$ is a measure,
$\nu \ll \mu$. Show there is a set $E$ such that $(\Omega - E)$ has $\sigma$-finite $\nu$-measure and for
every measurable $F \subset E$, $\nu(F)$ is either $0$ or $\infty$.
4. Let $\Omega$ be the set of positive integers,
$$\mu(E) = \sum_{n \in E} 2^{-n}, \qquad \nu(E) = \sum_{n \in E} 2^{n};$$
then $\nu \ll \mu$, but (6.4.1) is not satisfied. This shows that (6.4.1) is a stronger
condition than absolute continuity when $\nu$ is not finite.
5. Suppose $\Omega$ is an uncountable set, $\mathcal{F}$ is the class of sets which are either
countable or have countable complements. For $E \in \mathcal{F}$, put $\mu(E)$ = the
number of points in $E$, $\nu(E) = 0$ or $1$ according as $E$ is countable or not. Then
clearly $\nu \ll \mu$, but no integral representation is possible. This shows that
in the Radon-Nikodym theorem we cannot do without the condition that
$\mu$ be $\sigma$-finite.
6. If $\lambda$, $\mu$, $\nu$ are $\sigma$-finite measures on $\mathcal{F}$ and $\lambda \ll \mu$, $\mu \ll \nu$; show that
$$\frac{d\lambda}{d\nu} = \frac{d\lambda}{d\mu}\,\frac{d\mu}{d\nu}$$
except on a set of zero $\lambda$-measure.
7. $\lambda$, $\mu$ are $\sigma$-finite measures on $\mathcal{F}$ with $\mu \ll \lambda$. Then if $f$ is $\mu$-integrable

$$\int f\,d\mu = \int f\,\frac{d\mu}{d\lambda}\,d\lambda.$$
8. If $\lambda$, $\mu$ are $\sigma$-finite measures on $\mathcal{F}$ such that $\mu \ll \lambda$ and $\lambda \ll \mu$ then
$$\frac{d\mu}{d\lambda} = \Big(\frac{d\lambda}{d\mu}\Big)^{-1}$$
except for a set of zero $\lambda$-measure.
9. If $\mu$, $\nu$ are $\sigma$-finite measures on $\mathcal{F}$ such that $\nu \ll \mu$, show that the set
of points $x$ at which $d\nu/d\mu$ is zero has zero $\nu$-measure.
10. Suppose $\{\mu_i\}$ is a countable family of finite measures on a $\sigma$-field $\mathcal{F}$.
Show that there exists a finite measure $\mu$ on $\mathcal{F}$ such that each of the $\mu_i$ is absolutely
continuous with respect to $\mu$.
11. Suppose
lun = r Ilk - lu,
n n
vn = E vk - v,
k=1 k=1
where all the $\mu$, $\nu$ with suffices are finite measures on a $\sigma$-field $\mathcal{F}$ and $\nu_n$
is $\mu_n$-continuous for all $n$. Show that
(i) du1/dun - du1ldc almost everywhere (f1).
(ii) If each,un is v-continuous then d7n/dv --* a.e. (v).
(iii) v is 71-continuous and dvn/dun -* dv/du a.e. (F1).
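The measures of exercise 4 above make it easy to see numerically why absolute continuity does not force condition (6.4.1) when $\nu$ is not finite. The sketch below (an illustration only; the function names are hypothetical and only finite sets $E$ of positive integers are handled) evaluates both measures on the singletons $\{n\}$: the $\mu$-measure tends to zero while the $\nu$-measure grows without bound.

```python
from fractions import Fraction


def mu(E):
    """mu(E) = sum of 2^(-n) over n in E, for a finite set E of positive integers."""
    return sum((Fraction(1, 2 ** n) for n in E), Fraction(0))


def nu(E):
    """nu(E) = sum of 2^n over n in E, for a finite set E of positive integers."""
    return sum((Fraction(2 ** n) for n in E), Fraction(0))


singletons = [{n} for n in range(1, 11)]
mu_values = [mu(E) for E in singletons]
nu_values = [nu(E) for E in singletons]

for E, m, v in zip(singletons, mu_values, nu_values):
    print(sorted(E), m, v)
```

Since $\mu(E) = 0$ only for the empty set, $\nu \ll \mu$ holds trivially, yet no $\delta$ can serve in (6.4.1) for, say, $\epsilon = 1$.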

6.5 Mappings of measure spaces

In mathematical arguments one often needs to consider two spaces,
$X$, $Y$ with a mapping $f: X \to Y$. Such a mapping induces mappings
on the classes of subsets of $X$ and $Y$: if $E \subset X$, $f(E)$ denotes the set of
$y$ in $Y$ with $y = f(x)$ for some $x$ in $E$, and if $F \subset Y$, $f^{-1}(F)$ denotes the set of $x$
in $X$ with $f(x) \in F$; further if $\mathcal{C}$ is a class of subsets of $X$, $f(\mathcal{C})$ denotes
the class of sets $f(E)$ with $E \in \mathcal{C}$, and similarly for $f^{-1}(\mathcal{D})$ where $\mathcal{D}$ is a
class of subsets of $Y$. We saw (§ 1.5) that $f^{-1}$ preserves the structure
of a class of subsets, so that if $\mathcal{G}$ is a $\sigma$-field in $Y$, $f^{-1}(\mathcal{G})$ is a $\sigma$-field in $X$.
Sometimes the two spaces $X$, $Y$ already have classes of subsets defined,
and one can then examine the relationship of the mapping $f$ to these.

Measurable transformation
Suppose $f$ is a mapping from $X$ into $Y$, $\mathcal{F}$ is a $\sigma$-field in $X$ and $\mathcal{G}$
is a $\sigma$-field in $Y$, then we say that $f$ is a measurable transformation
from $(X, \mathcal{F})$ into $(Y, \mathcal{G})$ if $f^{-1}(E) \in \mathcal{F}$ for every $E$ in $\mathcal{G}$. This condition
can also be written $f^{-1}(\mathcal{G}) \subset \mathcal{F}$.
In Chapter 5 we discussed `measurable functions'. In our new
terminology these are measurable transformations from $(X, \mathcal{F})$
to $(R^*, \mathcal{B})$ in which $\mathcal{B}$ is the $\sigma$-field of Borel sets in $R^*$. Given mappings
$f: X \to Y$, $g: Y \to Z$ we can consider the composition $g(f): X \to Z$
defined by $g(f)(x) = g(f(x))$. In particular if $g: Y \to R^*$ is an extended
real-valued function on $Y$, then $g(f)$ defines an extended real function
on $X$.
Lemma. If $f: X \to Y$ is a measurable transformation from $(X, \mathcal{F})$
into $(Y, \mathcal{G})$ and $g: Y \to R^*$ is $\mathcal{G}$-measurable as a function with extended
real values, then the composition $g(f)$ is $\mathcal{F}$-measurable.
Proof. For any Borel set $B$ in $R^*$ we have
$$\{x: g(f)(x) \in B\} = f^{-1}\{y: g(y) \in B\} = f^{-1}(E) \ \text{ for some } E \in \mathcal{G},$$
and is therefore in $\mathcal{F}$.
Remark. We obtained a special case of this lemma when we proved
that a Borel measurable function of a measurable function is measurable
(see § 5.2).
If we start with a measure space $(X, \mathcal{F}, \mu)$ and $f$ is a measurable
transformation from $(X, \mathcal{F})$ into $(Y, \mathcal{G})$ it is natural to use $f$ to define
a measure $\nu$ on $\mathcal{G}$ by putting
$$\nu(E) = \mu(f^{-1}(E)) \ \text{ for } E \in \mathcal{G}. \qquad (6.5.1)$$
With this definition of $\nu$ it is immediate that $(Y, \mathcal{G}, \nu)$ is a measure
space. If (6.5.1) holds we will write $\nu = \mu f^{-1}$. This allows us to carry
out a `change of variable' in an integral.
Theorem 6.8. Suppose $f$ is a measurable transformation from a measure
space $(X, \mathcal{F}, \mu)$ to $(Y, \mathcal{G})$ and $g: Y \to R^*$ is $\mathcal{G}$-measurable: then

$$\int g\,d(\mu f^{-1}) = \int g(f)\,d\mu$$
in the sense that if either integral exists so does the other and the two are
equal.
Proof. It is clearly sufficient to consider non-negative functions
$g: Y \to R^+$. Suppose first that $g = \chi_E$, the indicator function of a set
$E$ in $\mathcal{G}$. Then
$$g(f)(x) = 1 \ \text{ if } x \in f^{-1}(E), \qquad g(f)(x) = 0 \ \text{ if } x \notin f^{-1}(E);$$
so that $g(f)$ is the indicator function of $f^{-1}(E)$, a set in $\mathcal{F}$. Thus, in this
case, by (6.5.1)

$$\int g\,d(\mu f^{-1}) = \mu f^{-1}(E) = \mu(f^{-1}(E)) = \int g(f)\,d\mu.$$

By linearity, the result now follows for non-negative $\mathcal{G}$-simple functions
$g$. If $\{g_n\}$ is an increasing sequence of non-negative simple functions
converging to the measurable function $g$, then $g_n(f)$ will be an
increasing sequence of simple functions converging to $g(f)$. The
definition of the integral of a non-negative function now completes
the proof.
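For a finite space with point masses, definition (6.5.1) and theorem 6.8 reduce to regrouping a finite sum, which the following sketch checks with exact rational arithmetic. The particular space $X = \{0, \ldots, 5\}$, the map $f(x) = x \bmod 3$, the function $g$ and the helper `pushforward` are all hypothetical choices made for illustration.

```python
from fractions import Fraction

# Point masses of a probability measure mu on X = {0, 1, ..., 5}.
mu = {x: Fraction(x + 1, 21) for x in range(6)}


def f(x):
    """A transformation from X into Y = {0, 1, 2}."""
    return x % 3


def g(y):
    """A real-valued function on Y."""
    return y * y - 1


def pushforward(measure, transform):
    """Image measure nu = mu f^{-1}: nu({y}) is the mu-mass of the preimage of y."""
    image = {}
    for x, mass in measure.items():
        y = transform(x)
        image[y] = image.get(y, Fraction(0)) + mass
    return image


nu = pushforward(mu, f)

# Integral of g with respect to mu f^{-1}, and of g(f) with respect to mu.
lhs = sum(g(y) * mass for y, mass in nu.items())
rhs = sum(g(f(x)) * mass for x, mass in mu.items())
print(nu, lhs, rhs)
```

Both sums come out as the same rational number, which is the content of the theorem for simple functions.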
Sometimes in integration, when the variable is changed, one wants
to integrate with respect to a new measure $\nu \ne \mu f^{-1}$. We can do this
easily when $\mu f^{-1}$ is absolutely continuous with respect to $\nu$.

Theorem 6.9. Given $\sigma$-finite measure spaces $(X, \mathcal{F}, \mu)$ and $(Y, \mathcal{G}, \nu)$
and a measurable transformation $f$ from $(X, \mathcal{F})$ into $(Y, \mathcal{G})$ such that
$\mu f^{-1}$ is absolutely continuous with respect to $\nu$, then

$$\int g(f)\,d\mu = \int g \cdot \phi\,d\nu,$$
where $\phi$ is the Radon-Nikodym derivative $d(\mu f^{-1})/d\nu$, for every measurable
$g: Y \to R^*$ in the sense that, if either integral exists, so does the
other and the two are equal.

Corollary. If $q: R \to R^+$ is Lebesgue integrable,

$$F(x) = \int_{-\infty}^{x} q(t)\,dt,$$

and $\mu_F$ is the Lebesgue-Stieltjes measure generated by $F$, then

$$\int_A^B g(x)\,dx = \int_a^b g(F(t))\,d\mu_F = \int_a^b g(F(t))\,q(t)\,dt,$$

where $A = F(a)$, $B = F(b)$.
Proof. By theorem 6.8 we have

$$\int g(f)\,d\mu = \int g\,d(\mu f^{-1}).$$
Since $\mu f^{-1}$ is absolutely continuous with respect to $\nu$, there exists a
measurable $\phi$ such that, for every $E \in \mathcal{G}$,

$$\int_E \phi\,d\nu = (\mu f^{-1})(E).$$

If $g$ is the indicator function of a measurable set $E$ it now follows that

$$\int g\,d(\mu f^{-1}) = (\mu f^{-1})(E) = \int g \cdot \phi\,d\nu$$

and the required result now follows by successive extension to functions
$g$ which are: (i) non-negative, simple, (ii) non-negative, measurable,
(iii) measurable.
Under the conditions of the corollary we consider the mapping
$F: R \to R$ given by the measure function $F$ from the measure
space $(R, \mathcal{B}, \mu_F)$ to $(R, \mathcal{B}, \mu)$. Theorem 6.8 then gives the first equality.
If we define
$$\lambda(E) = \int_E q(t)\,dt,$$
then $\lambda: \mathcal{B} \to R$ is a measure which coincides with

$$\mu_F(E) = \int_E dF$$
for intervals of $\mathcal{P}$ and therefore for all sets $E \in \mathcal{B}$. Hence the measure
$\mu_F$ is absolutely continuous with respect to Lebesgue measure $\mu$
and $q$ is a possible definition of the Radon-Nikodym derivative
$d\mu_F/d\mu$. The second equality now follows from theorem 6.9.
Remark. It is clear from the above that the function $\phi$ (or $q$ in the
corollary) plays the part of the Jacobian (or rather the absolute
value of the Jacobian) in the theory of transformations of multiple
integrals. In general it is not easy to obtain an explicit value for the
Radon-Nikodym derivative $d(\mu f^{-1})/d\nu$, but in important special
cases this can be done. In particular, if both spaces are $(R^k, \mathcal{L}^k, \mu)$
with $\mu$ Lebesgue measure, and $f: R^k \to R^k$ is a linear transformation
given by a non-singular matrix $A$ so that $y = Ax$ one can prove that
$$\mu(f(E)) = \|A\|\,\mu(E) \ \text{ for } E \in \mathcal{L}^k,$$
where $\|A\|$ denotes the absolute value of the determinant of $A$.

(This can best be shown by expressing $A$ as a product of elementary
transformations, and proving the result for each elementary transformation.)
This means that, in this case a possible Radon-Nikodym
derivative is the constant function $\|A\|$.
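The scaling of Lebesgue measure by the absolute value of the determinant can be checked directly in the plane, where the image of the unit square under a linear map is a parallelogram whose area is given by the shoelace formula. The sketch below is an illustration for one hypothetical matrix; the helper names are not from the text.

```python
def apply_matrix(A, p):
    """Image of the point p = (x, y) under the 2 x 2 matrix A."""
    return (A[0][0] * p[0] + A[0][1] * p[1], A[1][0] * p[0] + A[1][1] * p[1])


def determinant(A):
    """Determinant of a 2 x 2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def polygon_area(vertices):
    """Area of a simple polygon from its vertices in order (shoelace formula)."""
    total = 0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2


A = ((2, 1), (1, 3))
unit_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
image = [apply_matrix(A, p) for p in unit_square]

image_area = polygon_area(image)
print(image, image_area, abs(determinant(A)))
```

The unit square has measure 1, so the area of its image is exactly the constant by which the linear map scales Lebesgue measure.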

Exercises 6.5
1. Show that the composition of two measurable transformations is
measurable.
2. If $f$ is a measurable transformation from $(X, \mathcal{F})$ into $(Y, \mathcal{G})$ and $\mu$, $\nu$
are two measures on $\mathcal{F}$ such that $\mu \ll \nu$, show that $\mu f^{-1} \ll \nu f^{-1}$.
3. (Integration by parts.) If $F(x)$, $G(x)$ are non-negative continuous
functions satisfying the conditions of § 4.5 for a Stieltjes measure function
and $E$ is any Borel set, then

$$\int_E F(x)\,dG(x) + \int_E G(x)\,dF(x) = \mu_{FG}(E),$$

where $\mu_{FG}$ denotes the Lebesgue-Stieltjes measure generated by $F(x)\,G(x)$.
In particular if
$$F(x) = \int_a^x f(t)\,dt, \qquad G(x) = \int_a^x g(t)\,dt,$$
then
$$\int_a^b F(x)\,g(x)\,dx + \int_a^b f(x)\,G(x)\,dx = F(b)\,G(b) - F(a)\,G(a).$$
4. Suppose $A$ is a non-singular $k \times k$ matrix defining a mapping from
$R^k$ to $R^k$, then this is a measurable transformation from $(R^k, \mathcal{L}^k)$ to itself.
If $\mu_k$ denotes Lebesgue measure in $R^k$ show that

$$\int f\,d\mu_k = \|A\| \int f(A)\,d\mu_k,$$

for any Lebesgue measurable $f$, where $f(A)$ denotes the composite map
$f(A)(x) = f(Ax)$.

6.6* Measure in function space

We saw that points in the product space $\prod_{i \in I} X_i$ can be thought of as
functions $f: I \to \bigcup_{i \in I} X_i$ in which $f(i) \in X_i$. In the particular case where
$X_i$ is the same space $X$ for all $i$, the space $\prod_{i \in I} X_i$ reduces to the set of
functions $f: I \to X$. For this reason such a product space is often denoted
by $X^I$. Since theorem 6.3 clearly extends to arbitrary product spaces
we can produce a product measure in $X^I$ starting from any measure
$\mu$ on $X$ with $\mu(X) = 1$. However, for non-countable $I$, such product
measures are rarely of interest. In applications, the space $X^I$ usually
describes a stochastic process (see Chapter 15), and the product
measure in $X^I$ would correspond to complete independence (see Chapter 11)
between the values in each of the coordinate spaces. Usually
one wants to be able to define and use measures in $X^I$ which are not
product measures.
In our account we restrict X to be the real line R (it is easy to extend
the theory to the case X = C, but some restriction is needed for its
validity), leaving the index set I completely arbitrary.
Borel sets in RI
If we assume the usual topology in $R$, and denote the class of Borel
sets in $R$ by $\mathcal{B}$, then the class $\mathcal{C}$ of cylinder sets
$$\{f \in R^I: f(i_k) \in B_k,\ k = 1, 2, \ldots, n\}, \quad B_k \in \mathcal{B},$$

is a semi-ring of subsets in $R^I$. The $\sigma$-field generated by $\mathcal{C}$ will be
denoted by $\mathcal{B}^I$. If $\mathcal{B}^n$ denotes the class of Borel sets in $R^n$, it is immediate
that $\mathcal{B}^I$ can also be generated by the class of sets of the form
$$\{f \in R^I: a_k < f(i_k) \le b_k,\ k = 1, 2, \ldots, n\}, \qquad (6.6.1)$$
or of the form
$$\{f \in R^I: (f(i_1), f(i_2), \ldots, f(i_n)) \in B_n\}, \quad B_n \in \mathcal{B}^n. \qquad (6.6.2)$$
It is important to notice that no set in $\mathcal{B}^I$ can have restrictions on an
uncountable set of coordinates. For, if $E$ is a countable subset of $I$
and $F = I - E$, a set of the form
$$\{f \in R^I: f_E \in B_E\}, \quad B_E \in \mathcal{B}^E, \qquad (6.6.3)$$
where $f_E$ denotes the restriction of $f$ to $E$, contains functions $f$ which
are not restricted on $F$. The class of subsets of $R^I$ of the form (6.6.3)
(for all possible countable sets $E \subset I$) is clearly a $\sigma$-field which contains
the finite dimensional cylinder sets $\mathcal{C}$. Further, every set of the
form (6.6.3) must be in $\mathcal{B}^I$, so that the Borel sets in $R^I$ are precisely
the sets of the form (6.6.3).
Our object will be to extend a measure which is already defined on
sets of the form (6.6.1) to the $\sigma$-field $\mathcal{B}^I$. For a fixed finite set
$i_1, i_2, \ldots, i_n \in I$, the sets of the form (6.6.1) clearly generate a $\sigma$-field
containing those sets of $R^I$ obtained by taking a Borel set in
$R_{i_1} \times R_{i_2} \times \cdots \times R_{i_n}$ and forming the cylinder with this set as base.
If we are to have $\mu(R^I) = 1$, then, for each fixed $i_1, i_2, \ldots, i_n$, our set
function on sets of the form (6.6.2) must define a measure on the Borel
sets of the Euclidean $n$-space $R_{i_1} \times \cdots \times R_{i_n}$ in which the whole space
has measure 1.
It is clear that the measures given in the various Euclidean spaces of
this type have to satisfy various consistency relations, if there is to be
any hope of extending to a single measure on the whole of $\mathcal{B}^I$. For
such a measure on $\mathcal{B}^I$ must yield the original system on restriction to
sets of the form (6.6.2). These consistency conditions can be stated
in terms of multidimensional distribution functions which generate
the measures on sets (6.6.2), but we prefer to state them (equivalently)
in terms of the measures.
We assume then that for each finite set of distinct indices $i_1, i_2, \ldots, i_n$
we have a measure $\mu_{i_1 i_2 \ldots i_n}$ defined for the Borel sets in $R^n$ such that
(I) $\mu_{i_1 \ldots i_n i_{n+1}}(A \times R) = \mu_{i_1 \ldots i_n}(A)$, $A \in \mathcal{B}^n$.
(II) If $\pi$ is a permutation of $(1, 2, \ldots, n)$ and $\phi: R^n \to R^n$ is the mapping
$$\phi(x_1, \ldots, x_n) = (x_{\pi 1}, x_{\pi 2}, \ldots, x_{\pi n}),$$
then
$$\mu_{i_{\pi 1} i_{\pi 2} \ldots i_{\pi n}} = \mu_{i_1 i_2 \ldots i_n}\,\phi^{-1}.$$

The condition (I) says that putting on the additional condition
$f(i_{n+1}) \in R$ at a new index cannot affect the measure of the set since it
imposes no restriction, and condition (II) makes precise the notion that
the order in which the index set $i_1, i_2, \ldots, i_n$ is written should not have
any effect on the measure of the (same) set. Both these consistency
conditions are clearly necessary if there is to be any hope of extending
the measures $\mu_{i_1 \ldots i_n}$ to a single measure $\mu$ on $R^I$. The fact that they
are also sufficient was proved by Daniell in 1918 and rediscovered
by Kolmogorov in 1933. We state it as
Theorem 6.10. If $I$ is any infinite index set, and for each finite set
$i_1, i_2, \ldots, i_n$ of different indices in $I$ there is a measure $\mu_{i_1 i_2 \ldots i_n}$ defined
on the Borel subsets of $R^n$ such that the family of all such measures
satisfies the consistency conditions (I) and (II), there is a unique measure
$\mu$ defined on $\mathcal{B}^I$ in $R^I$ such that, for each $n \in Z$, $B_n \in \mathcal{B}^n$,
$$\mu\{f \in R^I: (f(i_1), \ldots, f(i_n)) \in B_n\} = \mu_{i_1 i_2 \ldots i_n}(B_n).$$
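Consistency condition (I) can be seen at work in a small discrete example. The sketch below is an illustration only: it takes the index set to be the positive integers, uses measures concentrated on the points 0 and 1 of the real line, and defines the finite dimensional distributions at the indices $1, 2, \ldots, n$ through a hypothetical two-state Markov chain (the names `INIT`, `TRANS`, `joint` and `drop_last` are assumptions). It then checks that summing out the last coordinate of the $(n+1)$-dimensional distribution returns the $n$-dimensional one, which is condition (I) for these index sets.

```python
from fractions import Fraction
from itertools import product

# Hypothetical initial distribution and transition probabilities on {0, 1}.
INIT = [Fraction(1, 3), Fraction(2, 3)]
TRANS = [[Fraction(1, 2), Fraction(1, 2)],
         [Fraction(1, 4), Fraction(3, 4)]]


def joint(n):
    """Point masses of the distribution of (f(1), ..., f(n)) on {0, 1}^n."""
    masses = {}
    for xs in product((0, 1), repeat=n):
        p = INIT[xs[0]]
        for a, b in zip(xs, xs[1:]):
            p *= TRANS[a][b]
        masses[xs] = p
    return masses


def drop_last(masses):
    """Measure of A x R: sum the point masses over the last coordinate."""
    out = {}
    for xs, p in masses.items():
        out[xs[:-1]] = out.get(xs[:-1], Fraction(0)) + p
    return out


consistent = all(drop_last(joint(n + 1)) == joint(n) for n in range(1, 6))
print(consistent, sum(joint(3).values()))
```

The check succeeds because each row of the transition matrix sums to 1; condition (II) would then be used to define the measures for the same indices written in a different order.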
Proof. Let $\mathcal{P}$ denote the semi-ring of sets in $R^I$ of the form (6.6.1)
for some finite value of $n$. Let $\mathcal{R}$ denote the ring generated by $\mathcal{P}$
consisting of finite unions of disjoint sets in $\mathcal{P}$. Now $\mu_{i_1 i_2 \ldots i_n}$
defines the measure of the set
$$\{f \in R^I: a_k < f(i_k) \le b_k,\ k = 1, \ldots, n\} \qquad (6.6.4)$$
and the consistency conditions (I) and (II) clearly ensure that the
measure is uniquely defined and additive on $\mathcal{P}$ (for the sets of any
finite class of sets in $\mathcal{P}$ can all be described by restrictions on the same
finite set of coordinates, and therefore the measure can be given by a
single measure of the family). It follows, by theorem 3.1, that there
is a set function $\tau$ defined on the ring $\mathcal{R}$ which is additive and coincides
with the measure $\mu_{i_1 \ldots i_n}$ on a set of the form (6.6.1).
Further $\mathcal{B}^I$ is the $\sigma$-field generated by $\mathcal{R}$ and we can obtain the
required measure $\mu$ on $\mathcal{B}^I$ by applying theorem 4.2 to the measure $\tau$,
provided the conditions of that theorem are satisfied. It is immediate
that $\tau$ is $\sigma$-finite, for $R^I \in \mathcal{R}$ and $\tau(R^I) = 1$; so that the only
condition which requires proof is that $\tau$ is a measure on $\mathcal{R}$. The proof
of this fact is an extension of the method used in §§ 3.4, 4.5.
If $\tau$ is not a measure on the ring $\mathcal{R}$, we can find a decreasing sequence $\{E_n\}$
of sets in $\mathcal{R}$ such that $\bigcap E_n = \emptyset$, but $\tau(E_n) \ge \delta > 0$ for all $n$. Now
given any set $C$ of the form (6.6.4), and $\epsilon > 0$, we can choose $\eta > 0$
such that
$$\tau(D) > \tau(C) - \epsilon$$
where $D = \{f \in R^I: (f(i_1), f(i_2), \ldots, f(i_n)) \in P_\eta\}$
and $P_\eta = \{a_k + \eta < x_k \le b_k,\ k = 1, 2, \ldots, n\}$,
since $\mu_{i_1 i_2 \ldots i_n}$ is a measure. But now $\bar{P}_\eta \subset P_0$. This argument clearly
extends to any non-empty subset in $\mathcal{R}$, and we can apply it by induction
to the sequence $\{E_n\}$. Since in each of the sets $E_n$ the value of $f$
at only a finite set of indices is restricted, there is no loss of generality
in assuming that in the sets $E_1, E_2, \ldots, E_n$ there is a restriction on $f$
only at the first $n$ of the indices in the sequence
$$i_1, i_2, \ldots, i_n, \ldots$$
(If this condition is not satisfied one need only add additional sets in $\mathcal{R}$
to the sequence $\{E_n\}$ to obtain a new sequence of which the original is a
subsequence.) Thus we may assume that
$$E_n = \{f \in R^I: (f(i_1), \ldots, f(i_n)) \in Q_n\},$$
where $Q_n \in \mathcal{E}^n$, the class of elementary figures in $R^n$. The condition
that $E_n$ be a decreasing sequence now means that $Q_{n+1} \subset Q_n \times R$.
We apply the above procedure to each of the sets $E_n$ to give a sequence
$\{D_n\}$ of sets
$$D_n = \{f \in R^I: (f(i_1), \ldots, f(i_n)) \in P_n\}$$
such that $\bar{P}_n \subset Q_n$, $P_n \in \mathcal{E}^n$ and
$$\tau(D_n) > \tau(E_n) - \frac{\delta}{2^{n+1}}.$$
If we put Vn = Dl n D2 n ... n D.
then T(Vn) = T(En)-T(En-Vn) i T(En)- ET(Ei-Di) > J8

so that the sets $\{V_n\}$ form a monotone decreasing sequence of non-empty
sets. In each $V_n$ choose a point
$$f_n = \{f_n(i),\ i \in I\}.$$
Now $(f_{n+p}(i_1), f_{n+p}(i_2), \dots, f_{n+p}(i_n))$ $(p = 1, 2, \dots)$
defines a sequence of points in $R^n$ which is a subset of the bounded
closed set
$$(\bar{P}_1 \times R^{n-1}) \cap (\bar{P}_2 \times R^{n-2}) \cap \dots \cap \bar{P}_n = F_n.$$

We can therefore find a subsequence of $\{f_{n+p}\}$ which, evaluated at the
first $n$ indices, converges to a point of $F_n$. Since $\tau(V_n) > \tfrac{1}{2}\delta$, $V_n$ is not
empty and $F_n$ is not empty since
$$V_n \subset \{f \in R^I : (f(i_1), \dots, f(i_n)) \in F_n\}.$$
Further $F_{n+1} \subset F_n \times R$ $(n = 1, 2, \dots)$, and we can now employ a standard
diagonalisation argument to obtain a point in $\bigcap_{n=1}^{\infty} E_n$.
Obtain successively, by induction, infinite increasing sequences of
positive integers
$$\nu_1 \supset \nu_2 \supset \dots \supset \nu_k \supset \dots$$

such that $\{f_n\}$ restricted to the sequence $\nu_k$ gives a sequence whose
values at $i_1, i_2, \dots, i_k$ converge to a point in $F_k$. Form the sequence $\nu$
obtained by taking the $k$th integer in the sequence $\nu_k$. Then, for each
$k$, $\nu$ is a subsequence of $\nu_k$ except for a finite number of terms at the
beginning so that $\{(f_n(i_1), f_n(i_2), \dots, f_n(i_k))\}$, $n \in \nu$, must converge to a
point in $F_k \subset Q_k$. If we put $q_k = \lim f_n(i_k)$ $(k = 1, 2, \dots)$, the limit being
taken along $n \in \nu$, the set
$$H = \{f \in R^I : f(i_k) = q_k,\ k = 1, 2, \dots\}$$
is non-empty, and $H \subset V_n \subset E_n$ for all $n$. This contradicts $\bigcap E_n = \emptyset$.
Remark. For a finite index set $I$, theorem 6.10 is still true, but lacks
any content as the measure $\mu_{i_1 \dots i_n}$ already is the required $\mu$ if
$I = \{i_1, i_2, \dots, i_n\}$.
Brownian motion
We can set up a mathematical model for Brownian motion by applying
theorem 6.10 to a particular family of finite dimensional distributions.
Use the index set $T = \{t \in R,\ t > 0\}$ which can be thought
of as time and, for
$$0 < t_1 < \dots < t_n,$$
define
$$\mu_{t_1 \dots t_n}\{f \in R^T : a_i < f(t_i) \le b_i,\ i = 1, \dots, n\} = \frac{1}{(2\pi)^{n/2}\{t_1(t_2 - t_1)\cdots(t_n - t_{n-1})\}^{1/2}} \int_{a_n}^{b_n} \cdots \int_{a_1}^{b_1} \exp\left\{-\frac{\xi_1^2}{2t_1} - \sum_{k=2}^{n} \frac{(\xi_k - \xi_{k-1})^2}{2(t_k - t_{k-1})}\right\} d\xi_1 \cdots d\xi_n.$$
The fact that this defines a consistent family of measures on .So which
can be extended to all sets of the form (6.6.2) can be proved directly
(it will follow from the discussion of the multinormal distribution in
Chapter 14). Hence, we can apply theorem 6.10 to give a measure $\mu$
on $\mathcal{B}^T$, the class of Borel sets in $R^T = \Omega$. This is called Wiener measure
in the space of functions $f: T \to R$, and is an example of a stochastic
process which will be discussed more fully in Chapter 15.
However, let us use the example of Wiener measure to illustrate
the inadequacy of theorem 6.10. This follows from the fact that the
$\sigma$-field $\mathcal{B}^T$ is too small to contain interesting sets, for we have seen
that it contains no set in which a non-countable set of time coordinates
is restricted. Even if $\mathcal{B}^T$ is completed with respect to $\mu$ to give a
probability measure, the completed $\sigma$-field is still too small. For
if $A \subset R^T$ is a set in which $f$ is restricted at a non-countable set, the only
set of $\mathcal{B}^T$ which is contained in $A$ is the empty set. This means that
$A$ can only be measurable if it has measure zero. But the same
argument applies to the complement of $A$ so that if both $A$ and its
complement involve restrictions on $f: T \to R$ at a non-countable set
of indices, then the outer measure of $A$ must be 1, and the inner measure
must be 0. In particular the set
$$\{f \in R^T : a < f(t) < b \text{ for all } t \in [t_1, t_2]\} \qquad (6.6.4)$$
is not measurable, and if $C$ is the set of functions $f: T \to R$ which are
continuous for all $t \in T$, $C$ also has outer measure 1 (and inner measure
0). Various methods can be used to extend $\mu$ from $\mathcal{B}^T$ to a larger
$\sigma$-field which includes $C$ and (6.6.4) and other sets of interest. These
have been studied in detail and the interested reader is advised to
look in J. L. Doob, Stochastic Processes (Wiley, 1953).

6.7 Applications
In the second part of this book random variables will be defined as
$\mathcal{F}$-measurable functions $f: \Omega \to R^*$ where $(\Omega, \mathcal{F}, \mu)$ is a probability
space. Although it is usual to work with general carrier spaces $\Omega$,
there is a sense in which the real line $R$ has a structure sufficiently
complicated to reproduce all the probability properties of the function $f$.
In fact, in many treatises on probability theory, the carrier space $\Omega$
is barely mentioned. This attitude is partially justified by the following
considerations.
Suppose $(\Omega, \mathcal{F}, \mu)$ is any finite measure space and $f: \Omega \to R$ is
$\mathcal{F}$-measurable. For all real $x$, define
$$F(x) = \mu\{y : f(y) \le x\}. \qquad (6.7.1)$$

Then $F(x) \to 0$ as $x \to -\infty$, $F(x) \to \mu(\Omega)$ as $x \to +\infty$, and $F: R \to R$
is continuous on the right. Thus we can define a Stieltjes measure
$\mu_F$ using this particular $F$.
Theorem 6.11. Suppose $(\Omega, \mathcal{F}, \mu)$ is a finite measure space and $f: \Omega \to R$
is $\mathcal{F}$-measurable, $\mu_F$ is the Lebesgue-Stieltjes measure in $R$ given by
(6.7.1) and $g: R \to R$ is Borel measurable, then $g(f)$ is $\mathcal{F}$-measurable and
$\mu\{x : g(f)(x) \in B\}$ is determined by $\mu_F$ for every Borel set $B$. Further
$$\int g(f)\,d\mu = \int g(x)\,dF(x)$$
in the sense that, if either side exists so does the other, and the two are equal.
Proof. $\{x : g(f)(x) \in B\} = \{x : f(x) \in g^{-1}(B)\}$ and $g^{-1}(B) = C$ is a Borel
set so that $\{x : f(x) \in C\}$ is in $\mathcal{F}$, and $\mu\{x : f(x) \in C\}$ is uniquely determined
by $\mu\{x : a < f(x) \le b\} = F(b) - F(a)$ for all real $a$, $b$ since $\mathcal{P}$, the
class of half-open intervals, generates the $\sigma$-field $\mathcal{B}$ of Borel sets in
$R$, and $F(b) - F(a) = \mu_F(a, b]$. Thus, for all $B$ in $\mathcal{B}$,
$$\mu\{x : g(f)(x) \in B\} = \mu_F(g^{-1}(B)).$$
Now suppose $g$ is an indicator function of a Borel set $B$. Then
$$\int g(f)\,d\mu = \mu\{x : f(x) \in B\} = \mu_F(B) = \int g\,d\mu_F.$$
By linearity our result follows for non-negative simple functions and
the monotone convergence theorem then gives it for non-negative
Borel measurable $g$ and then for all integrable $g$.
Corollary. In the notation of the theorem
$$\int f\,d\mu = \int x\,dF(x).$$
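Theorem 6.11 can be checked by hand on a finite measure space, where both integrals reduce to finite sums. The following is a minimal sketch under that assumption; the space `mu`, the function `f` and the Borel function `g` are all hypothetical choices made for illustration, and exact rational arithmetic is used so that the two sides can be compared for equality.

```python
from fractions import Fraction

# Hypothetical finite measure space: each point w of Omega with its mass mu({w}).
mu = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}
# A function f: Omega -> R (every function is measurable on this space).
f = {"a": 1, "b": 1, "c": 3}


def g(x):
    """An illustrative Borel function g: R -> R."""
    return x * x + 1


def F(x):
    """Distribution function F(x) = mu{w: f(w) <= x}, as in (6.7.1)."""
    return sum(mass for w, mass in mu.items() if f[w] <= x)


# Left side: integrate g(f) over Omega with respect to mu.
lhs = sum(g(f[w]) * mass for w, mass in mu.items())

# mu_F is the image of mu under f: mu_F{v} = mu{w: f(w) = v}.
mu_F = {}
for w, mass in mu.items():
    mu_F[f[w]] = mu_F.get(f[w], Fraction(0)) + mass

# Right side: integrate g over R with respect to mu_F (a finite sum here).
rhs = sum(g(v) * mass for v, mass in mu_F.items())
```

Here `mu_F` puts mass at the jumps of `F`, so the right-hand sum is the Stieltjes integral of `g` with respect to `F` in this discrete setting.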
Remark. There is an n-dimensional form of theorem 6.11 and
corollary which links the behaviour of n measurable functions with a
Lebesgue-Stieltjes distribution in Rn-see Chapter 14.
Marginal distributions
Not all measures in product spaces are product measures. Suppose
$X$, $Y$ are spaces, then the projection $p: X \times Y \to X$ given by $p(x, y) = x$
defines a mapping. This will be a measurable transformation on
$(X \times Y, \mathcal{F})$ into $(X, \mathcal{S})$ provided $E \times Y \in \mathcal{F}$ for every $E \in \mathcal{S}$. In this
case, if $\mu$ is a $\sigma$-finite measure on $\mathcal{F}$, $\mu p^{-1}$ defines a measure on $\mathcal{S}$. In
general it may not be a very interesting measure as there may be no
sets of finite positive measure. However, if $(X \times Y, \mathcal{F}, \mu)$ is a finite
measure space, then the measure $\mu p^{-1}$ on $\mathcal{S}$ is called the marginal
measure on $X$. The marginal measure on $Y$ is similarly defined using
a projection on $Y$. If $\mu$ is a probability measure these marginal measures
are called marginal (probability) distributions.
If $F(x, y)$ is a distribution function in $R^2$ (see § 4.5) then
$$\lim_{y \to +\infty} F(x, y) \quad\text{and}\quad \lim_{x \to +\infty} F(x, y)$$
will again define 1-dimensional distribution functions, and it is
immediate from theorem 4.8 that the corresponding Lebesgue-Stieltjes
measures will be the marginal distributions of $\mu_F$. If
$$F(x, y) = F_1(x) F_2(y)$$
is the product of two 1-dimensional distributions, then $\mu_F$ will be
the completion of the product measure $\mu_{F_1} \times \mu_{F_2}$ and $F_1$, $F_2$ will be
the marginal distributions for $F$. Conversely, if $\mu_F$ is a product measure,
then it must be the product of its marginal distributions so that
$F(x, y) = F_1(x) F_2(y)$ is a necessary as well as a sufficient condition for
$\mu_F$ to be a product of two probability measures.
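For a probability measure carried by finitely many points of the plane, the marginal distributions and the product criterion can be computed directly. The sketch below is an illustration only: the two joint distributions `diagonal` and `independent` are hypothetical examples, and `is_product` simply tests whether the joint mass at each pair equals the product of the marginal masses.

```python
from fractions import Fraction
from itertools import product


def marginals(joint):
    """Return the two marginal distributions of a distribution on pairs (x, y)."""
    m1, m2 = {}, {}
    for (x, y), p in joint.items():
        m1[x] = m1.get(x, Fraction(0)) + p
        m2[y] = m2.get(y, Fraction(0)) + p
    return m1, m2


def is_product(joint):
    """True if the joint distribution is the product of its marginals."""
    m1, m2 = marginals(joint)
    return all(
        joint.get((x, y), Fraction(0)) == m1[x] * m2[y]
        for x, y in product(m1, m2)
    )


half, quarter = Fraction(1, 2), Fraction(1, 4)
# Mass concentrated on the diagonal: the same marginals as below, but no product.
diagonal = {(0, 0): half, (1, 1): half}
# Mass spread evenly over the four corners: a genuine product measure.
independent = {(x, y): quarter for x, y in product((0, 1), repeat=2)}
```

Both examples have the same marginal distributions, which shows that the marginals alone do not determine the measure.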
Thick subsets
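For any finite measure space $(\Omega, \mathcal{F}, \mu)$ we can generate the outer
measure
$$\mu^*(E) = \inf \sum_{i=1}^{\infty} \mu(E_i) \qquad (6.7.2)$$
the infimum being taken over all sequences of sets $\{E_i\}$ in $\mathcal{F}$ with
$E \subset \bigcup E_i$. (Since $\mu$ is a measure on the $\sigma$-field $\mathcal{F}$, (6.7.2) is the same
as $\mu^*(E) = \inf \mu(F)$ for $F \supset E$, $F \in \mathcal{F}$.) A subset $E_0$ of $\Omega$ is said to be
thick in $\Omega$ if $\mu^*(E_0) = \mu(\Omega)$. Thus a subset $E_0$ is thick if and only if
$(\Omega - E_0)$ contains no set in $\mathcal{F}$ of positive $\mu$-measure. There is a sense
in which the measure space can be projected onto any thick subset.
Theorem 6.12. If $E_0$ is a thick subset of the finite measure space
$(\Omega, \mathcal{F}, \mu)$, $\mathcal{F}_0 = \mathcal{F} \cap E_0$, and $\mu_0(E \cap E_0) = \mu(E)$ for any $E \in \mathcal{F}$, then
$(E_0, \mathcal{F}_0, \mu_0)$ is a measure space.
Proof. We first see that $\mu_0$ is defined uniquely on $\mathcal{F}_0$. If $A_1, A_2 \in \mathcal{F}$
are such that $A_1 \cap E_0 = A_2 \cap E_0$, then we must have
$$(A_1 \,\triangle\, A_2) \cap E_0 = \emptyset,$$
so that $\mu(A_1 \,\triangle\, A_2) = 0$ and $\mu(A_1) = \mu(A_2)$.
Now suppose $\{B_n\}$ is a disjoint sequence of sets in $\mathcal{F}_0$ so that there
is a sequence of sets $\{C_n\}$ in $\mathcal{F}$ with
$$B_n = C_n \cap E_0 \quad (n = 1, 2, \dots).$$
Put
$$D_n = C_n - \bigcup_{i=1}^{n-1} C_i \quad (n = 1, 2, \dots).$$
Then $D_n \cap E_0 = C_n \cap E_0$,
so that $\mu(D_n \,\triangle\, C_n) = 0$. It follows that
$$\sum_{n=1}^{\infty} \mu_0(B_n) = \sum_{n=1}^{\infty} \mu(C_n) = \sum_{n=1}^{\infty} \mu(D_n) = \mu\Big(\bigcup_{n=1}^{\infty} D_n\Big) = \mu_0\Big(\bigcup_{n=1}^{\infty} B_n\Big)$$
so that $\mu_0$ is a measure.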
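The role of thickness in theorem 6.12 is easy to see on a small example. The sketch below assumes a hypothetical four-point space whose $\sigma$-field is generated by two atoms of mass 1/2 each; the helper names (`sigma_field`, `outer_measure`, `is_thick`, `trace_measure`) are illustrative and not taken from the text. It computes the outer measure (6.7.2) as the smallest measure of a covering set of the field, and shows that $\mu_0$ is well defined on a thick set but not on a set that fails to be thick.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical atoms of the sigma-field F on Omega = {0, 1, 2, 3}.
atoms = {frozenset({0, 1}): Fraction(1, 2), frozenset({2, 3}): Fraction(1, 2)}
omega = frozenset({0, 1, 2, 3})


def sigma_field(atoms):
    """All unions of atoms, each paired with its measure."""
    field = {}
    items = list(atoms.items())
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            union = frozenset().union(*(a for a, _ in combo))
            field[union] = sum((m for _, m in combo), Fraction(0))
    return field


def outer_measure(E, field):
    """mu*(E) = inf mu(F) over sets F of the field containing E."""
    return min(m for A, m in field.items() if set(E) <= A)


def is_thick(E, field):
    return outer_measure(E, field) == field[omega]


def trace_measure(E0, field):
    """mu_0(A & E0) = mu(A); raises if two sets with equal trace disagree."""
    mu0 = {}
    for A, m in field.items():
        key = A & frozenset(E0)
        if mu0.setdefault(key, m) != m:
            raise ValueError("mu_0 is not well defined on this subset")
    return mu0


field = sigma_field(atoms)
```

With these definitions `{0, 2}` is thick, since it meets both atoms, and `trace_measure({0, 2}, field)` gives each of its two points mass 1/2. The set `{0}` has outer measure 1/2, and the attempt to define $\mu_0$ on it fails because the atom `{2, 3}` and the empty set have the same trace but different measures.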
Remark. This theorem shows that in a probability space $(\Omega, \mathcal{F}, P)$,
the $\sigma$-field $\mathcal{F}$ can be extended to include any set $E_0$ not in it whose
outer measure is 1. The effect of this extension is to discard all the
points of $\Omega$ which are not in $E_0$. The device turns out to be useful in the
theory of stochastic processes where, by a careful choice of $E_0$, one
can obtain a probability on a useful class of subsets. In particular,
for Wiener measure in $R^T$ described in § 6.6, it can be shown that the
set $C$ of continuous functions is thick and that the extension given by
putting $E_0 = C$ is a useful one (see Chapter 15).

Exercises 6.7
1. Formulate and prove a theorem of the form of theorem 6.11 for $n$
$\mathcal{F}$-measurable functions $f_i: \Omega \to R$ $(i = 1, 2, \dots, n)$.
2. Find the 2-dimensional distribution function $F(x, y)$ which generates
the measure $\mu_F$ such that $\mu_F(R)$ is $1/\sqrt{2}$ (length of diagonal $D$ in $R$) for any
rectangle $R$, where $D$ is the segment joining (0, 0) to (1, 1). Calculate the
marginal distributions of $\mu_F$, and show that $\mu_F$ is not a product measure.
3. If $(\Omega, \mathcal{F}, \mu)$ is a complete $\sigma$-finite measure space, and the outer measure
$\mu^*$ is defined by (6.7.2) show that a set $E$ is $\mu^*$-measurable if and only
if it is in $\mathcal{F}$.
4. Suppose $(\Omega, \mathcal{F}, \mu)$ is a finite measure space and $E_0$ is a subset of $\Omega$
such that, for $A_1$,
Prove that $E_0$ is thick in $\Omega$.


Throughout this chapter we will assume (unless stated otherwise)
that $(\Omega, \mathcal{F}, \mu)$ is a $\sigma$-finite measure space, and that the $\sigma$-field $\mathcal{F}$ is
complete with respect to $\mu$. This implies that if $f: \Omega \to R^*$, $g: \Omega \to R^*$
are functions such that $f$ is $\mathcal{F}$-measurable and $f = g$ a.e., then $g$ is
also $\mathcal{F}$-measurable. Thus, if $M$ is the class of functions $f: \Omega \to R^*$
which are $\mathcal{F}$-measurable, we say that $f_1$, $f_2$ in $M$ are equivalent if
$f_1 = f_2$ a.e. This clearly defines an equivalence relation in $M$ and we
can form the space $\mathcal{M}$ of equivalence classes with respect to this relation.
When we think of a function $f$ of $M$ as an element of $\mathcal{M}$ we are
really thinking of $f$ as a representative of the class of $\mathcal{F}$-measurable
functions which are equal to $f$ a.e. As is usual we will use the same
notation $f$ for an element of $M$ and $\mathcal{M}$. We can think of $M$ or $\mathcal{M}$ as an
abstract space, and the definition of convergence, if given in terms of a
metric, will then impose a topological structure on the space. We will
consider several such notions of convergence of which some, but not
all, can be expressed in terms of a metric in $\mathcal{M}$. We will obtain the
relationships between different notions of convergence, and in each
case prove that the space is complete in the sense that for any Cauchy
sequence there is a limit function to which the sequence converges.
The main strategy used to prove completeness will be to find a suitable
subsequence of the given sequence which clearly converges to a limit $f$
and then show that $f$ is a limit of the whole sequence. This extends the
method used in § 2.2 to show that $R$ is complete.
7.1 Point-wise convergence
Given a sequence $\{f_n\}$ of functions where $f_n: E \to R^*$ and a function
$f: E \to R^*$ $(E \subset \Omega)$, we say that $f_n$ converges to $f$ point-wise on $E$ if,
for each $x$ in $E$, $f_n(x) \to f(x)$ as $n \to \infty$. This notion has a meaning if we
restrict consideration to $\mathcal{M}$. If $E$ is such that $\mu(\Omega - E) = 0$, and $f_n \to f$
point-wise on $E$, then we say that $f_n \to f$ a.e. For if $f_n \to f$ a.e., $f_n = g_n$
a.e. for each $n$, and $g_n \to g$ a.e., then
$$\{x : f(x) \ne g(x)\} \subset \{x : f_n(x) \nrightarrow f(x)\} \cup \{x : g_n(x) \nrightarrow g(x)\} \cup \bigcup_{n=1}^{\infty} \{x : f_n(x) \ne g_n(x)\},$$
and each of these sets has zero measure so $f(x) = g(x)$ a.e. which means
that $f = g$ in $\mathcal{M}$. $\{f_n\}$ is a Cauchy sequence (point-wise) on $E$ if, given
$x \in E$, $\epsilon > 0$, there is an integer $N$ such that
$$|f_n(x) - f_m(x)| < \epsilon \quad\text{for } n, m \ge N. \qquad (7.1.1)$$
(This has meaning only if $f_n: E \to R$ is finite valued.) Because $R$
is complete it is clear that if $\{f_n\}$ is a Cauchy sequence on $E$, there must
be an $f: E \to R$ such that $f_n \to f$ point-wise on $E$.
Uniform convergence
If the sequence $\{f_n\}$ and the function $f$ are finite valued functions
on $E$ to $R$, we say that $f_n$ converges uniformly to $f$ on $E$ if for each
$\epsilon > 0$, there is an integer $N$ such that
$$x \in E,\ n \ge N \implies |f_n(x) - f(x)| < \epsilon.$$
Similarly, we say that the sequence is a Cauchy sequence uniformly on
$E$ if given $\epsilon > 0$, a single integer $N$ can be chosen so that (7.1.1)
is satisfied for all $x \in E$. Since a Cauchy sequence uniformly on $E$ is
certainly a Cauchy sequence on $E$ and the existence of $\lim f_m(x) = f(x)$
follows for each $x$, we can let $m \to \infty$ in (7.1.1) to deduce that a Cauchy
sequence uniformly on $E$ must have a limit function $f: E \to R$ such
that $f_n \to f$ uniformly on $E$.
If $\mu(\Omega - E) = 0$ and $f_n \to f$ uniformly on $E$, then we say that $f_n \to f$
uniformly a.e. All these notions have a meaning for functions which
need not be measurable. However, the notion of convergence uniformly
a.e. can be expressed in terms of a metric on the restricted class
of measurable functions.
Essentially bounded functions
An $\mathcal{F}$-measurable function $f: \Omega \to R^*$ is said to be essentially
bounded if $\mu\{x : |f(x)| > a\} = 0$ for some real number $a$. In this case
we define the essential supremum of $f$ by
$$\operatorname{ess\,sup} |f| = \inf\{a : \mu\{x : |f(x)| > a\} = 0\}.$$
Notice that, if $\operatorname{ess\,sup} |f| = C$, then
$$E = \{x : |f(x)| > C\} = \bigcup_{k=1}^{\infty} \{x : |f(x)| > C + 1/k\}$$
so that $\mu(E) = 0$ and $|f(x)| \le C$ outside $E$. Thus $|f(x)| \le C$ a.e., and
if we define
$$f^*(x) = \begin{cases} f(x) & \text{if } |f(x)| \le C,\\ 0 & \text{if } |f(x)| > C,\end{cases}$$
then $|f^*(x)| \le C$ for all $x$ and $f^* = f$ a.e. Further $\{x : |f^*(x)| > C - \epsilon\}$
has positive measure for all $\epsilon > 0$, so that it is non-empty and we must
have $\sup |f^*| = C$. It is clear that, if $f = g$ a.e., then $\operatorname{ess\,sup} |f| = \operatorname{ess\,sup} |g|$,
so that we can think of ess sup as a functional on the subset $\mathcal{L}_\infty$ of
the essentially bounded functions of $\mathcal{M}$. If we define $(\alpha f + \beta g)$ by
$$(\alpha f + \beta g)(x) = \begin{cases} \alpha f(x) + \beta g(x) & \text{when } f(x), g(x) \in R,\\ 0 & \text{otherwise;}\end{cases}$$
it is clear that $(\alpha f + \beta g) \in \mathcal{L}_\infty$ if $f, g \in \mathcal{L}_\infty$ for any $\alpha, \beta \in R$ so that $\mathcal{L}_\infty$
is a linear subspace of $\mathcal{M}$ (over the reals). Further
$$\rho_\infty(f, g) = \operatorname{ess\,sup} |f - g|$$
defines a metric in $\mathcal{L}_\infty$, for
(i) $\rho_\infty(f, g) = \rho_\infty(g, f)$;
(ii) $\rho_\infty(f, g) = 0$ if and only if $f = g$ a.e.;
(iii) $\operatorname{ess\,sup} |f + g| \le \operatorname{ess\,sup} |f| + \operatorname{ess\,sup} |g|$ so that
$$\rho_\infty(f, g) \le \rho_\infty(f, h) + \rho_\infty(h, g).$$
Now it is clear that, if $\{f_n\}$ and $f$ are functions in $\mathcal{L}_\infty$ such that
$f_n \to f$ uniformly a.e., then $\rho_\infty(f_n, f) \to 0$ as $n \to \infty$. Conversely suppose
$\rho_\infty(f_n, f) \to 0$, and let $E_n$ be a set of $\mathcal{F}$ with $\mu(E_n) = 0$ and
$$\operatorname{ess\,sup} |f_n - f| = \sup_{x \in \Omega - E_n} |f_n(x) - f(x)|.$$
Put $E = \bigcup E_n$, then for $x \in \Omega - E$
$$|f_n(x) - f(x)| \le \sup_{x \in \Omega - E_n} |f_n(x) - f(x)| = \operatorname{ess\,sup} |f_n - f|$$
so that $f_n \to f$ uniformly on $\Omega - E$ and $\mu(E) = 0$. A similar, but slightly
more complicated argument shows that, in $\mathcal{L}_\infty$, a Cauchy sequence
uniformly a.e. is the same as a Cauchy sequence in $\rho_\infty$ norm.
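On a finite weighted space the essential supremum is just the largest value of $|f|$ over the points of positive mass, which makes the difference from the ordinary supremum concrete. The sketch below assumes such a hypothetical space (`mu`, with one point of mass zero); the names `ess_sup` and `rho_inf` are illustrative.

```python
def ess_sup(f, mu):
    """ess sup |f| on a finite weighted space: points of zero mass are ignored."""
    return max((abs(f[w]) for w in mu if mu[w] > 0), default=0)


def rho_inf(f, g, mu):
    """The metric rho_infinity(f, g) = ess sup |f - g|."""
    return ess_sup({w: f[w] - g[w] for w in mu}, mu)


# Hypothetical measure with a null point "c".
mu = {"a": 1.0, "b": 2.0, "c": 0.0}
f = {"a": 1, "b": -3, "c": 100}
g = {"a": 1, "b": -3, "c": -7}   # equals f except on the null point
h = {"a": 0, "b": 0, "c": 0}

plain_sup = max(abs(v) for v in f.values())
```

Here `f` and `g` differ only on a set of measure zero, so they are the same element of the space of equivalence classes and their distance is zero, while the ordinary supremum of $|f|$ is much larger than its essential supremum.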
Almost uniform convergence
Given functions $f_n: E \to R^*$ $(n = 1, 2, \dots)$ and $f: E \to R^*$ each of
which is finite a.e. on $E$ we say that $f_n$ converges almost uniformly to
$f$ on $E$ if, for each $\epsilon > 0$, there is a set $F_\epsilon \subset E$, $F_\epsilon \in \mathcal{F}$, $\mu(F_\epsilon) < \epsilon$ such that
$f_n \to f$ uniformly on $(E - F_\epsilon)$. The example $E = [0, 1] \subset R$, $f_n(x) = x^n$,
$\mu$ Lebesgue measure, shows that it is possible for a sequence to converge
almost uniformly on $E$ while it does not converge uniformly a.e.
on $E$. However, it is immediate from the definitions that convergence
uniformly a.e. implies almost uniform convergence. What is more
surprising is that, under suitable conditions, convergence a.e. implies
almost uniform convergence.
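The example $f_n(x) = x^n$ on $[0, 1]$ can be examined numerically. The following sketch assumes the exceptional set $F_\epsilon = (1 - \epsilon/2, 1]$, which is one convenient choice and not the only one; off this set the supremum of $x^n$ is $(1 - \epsilon/2)^n$, while for every $c < 1$ the set where $x^n > c$ has positive Lebesgue measure $1 - c^{1/n}$, so removing a null set can never bring the supremum below 1.

```python
def sup_on_interval(n, right_end):
    """sup of x**n over [0, right_end]; x**n is increasing on [0, 1]."""
    return right_end ** n


def measure_exceeding(n, c):
    """Lebesgue measure of {x in [0, 1]: x**n > c}, for 0 <= c < 1."""
    return 1 - c ** (1 / n)


eps = 0.01
# Suprema of f_n off the exceptional set (1 - eps/2, 1], of measure eps/2 < eps.
sups_off_exceptional_set = [
    sup_on_interval(n, 1 - eps / 2) for n in (10, 100, 1000, 10000)
]
# Measures of the sets where f_n is still above 0.999: small but never zero.
measures_near_one = [measure_exceeding(n, 0.999) for n in (10, 100, 1000, 10000)]
```

The first list decreases to zero, which is uniform convergence off $F_\epsilon$; every entry of the second list is strictly positive, which is why the convergence is not uniform a.e.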
Theorem 7.1. (Egoroff). Suppose $E \in \mathcal{F}$, $\mu(E) < \infty$, and $\{f_n\}$ is a sequence
of measurable functions on $E \to R^*$ which are finite a.e. and converge
a.e. to a function $f: E \to R^*$ which is also finite a.e. Then $f_n \to f$ almost
uniformly in $E$.
Proof. By omitting a subset of $E$ of zero measure, we may assume
that all the functions $f_n$ and $f$ are finite and that
$$f_n(x) \to f(x) \quad\text{for all } x \in E.$$
For positive integers $m$, $n$ put
$$A_n^m = \bigcap_{i=n}^{\infty} \{x : |f_i(x) - f(x)| < 1/m\}.$$

Then, for fixed $m$, $A_1^m, A_2^m, \dots, A_n^m, \dots$ is an increasing sequence of
measurable sets converging to $E$. Since $\mu(E)$ is finite, by theorem 3.2
there is a positive integer $N_m = N_m(\epsilon)$ such that
$$\mu(E - A_i^m) < \epsilon/2^m \quad\text{for } i \ge N_m.$$
If we put
$$F_\epsilon = \bigcup_{m=1}^{\infty} (E - A_{N_m}^m)$$
then $\mu(F_\epsilon) < \epsilon$. Further given $\delta > 0$ we can choose $m$ so that $1/m < \delta$
and then
$$|f_i(x) - f(x)| < \delta \quad\text{for all } i \ge N_m,\ x \in (E - F_\epsilon),$$
so that $f_n \to f$ uniformly on $(E - F_\epsilon)$.
Remark. The converse to theorem 7.1 is true and almost trivial.
For if $\{f_n\}$, $f$ are finite a.e. on $E$, measurable, and $f_n \to f$ almost uniformly,
this means we can find sets $F_k$ with $\mu(F_k) < 1/k$ such that
$f_n \to f$ uniformly on $(E - F_k)$ and so $f_n \to f$ point-wise on $(E - F_k)$.
Put
$$F = \bigcap_{k=1}^{\infty} F_k, \quad\text{then } \mu(F) = 0$$
and $f_n \to f$ point-wise on $(E - F)$ so that $f_n \to f$ a.e. on $E$.

Exercises 7.1
1. Let $X$ be the space of positive integers, $\mathcal{F}$ the class of all subsets of $X$,
and $\mu(E)$ the number of integers in $E \subset X$. If $f_n(x)$ is the indicator function
of $\{1, 2, \dots, n\}$, then $f_n(x) \to 1$ for all $x$. However, $f_n$ does not converge almost
uniformly to 1, showing that theorem 7.1 is false without $\mu(E) < \infty$.
2. Suppose the conditions of theorem 7.1 are satisfied except that
$\mu(E) = \infty$, show that given $P > 0$, there is a subset $F_P \subset E$ with $\mu(F_P) > P$
such that $f_n \to f$ uniformly on $F_P$ but that there need not be a subset $F$
with $\mu(F) = +\infty$ with $f_n \to f$ uniformly on $F$.
3. Suppose $E \in \mathcal{F}$, $\mu(E) < \infty$, $f_n: E \to R^*$ $(n = 1, 2, \dots)$ is a Cauchy
sequence a.e. of measurable functions each finite a.e. Prove there is a finite
$c$ and a measurable $F \subset E$ with $\mu(F) > 0$ such that, for every integer $n$,
all $x \in F$, $|f_n(x)| < c$.
4. Suppose $E \in \mathcal{F}$, $E$ has $\sigma$-finite measure, $f_n$ $(n = 1, 2, \dots)$ and $f$ are finite
a.e. on $E$ and $f_n \to f$ a.e. on $E$. Show that there exists a sequence $\{E_i\}$ of
sets in $\mathcal{F}$ such that
$$\mu\Big(E - \bigcup_{i=1}^{\infty} E_i\Big) = 0 \quad\text{and}\quad f_n \to f$$
uniformly on each $E_i$. By considering the measure of example 2, § 3.1,
and a suitable sequence of functions show that the condition that $E$ has
$\sigma$-finite measure is essential.
5. In § 4.4 we produced a sequence of sets $\{Q_i\}$ each of which was not
Lebesgue measurable. If we put $f_n(x)$ = indicator function of
$[0, 1) - \bigcup_{i=1}^{n} Q_i$, then $f_n(x) \to 0$ for all $x$ in $[0, 1]$.
Show that $f_n$ does not converge almost uniformly so that theorem 7.1 fails
if the functions are not measurable.
6. Suppose $f_h: E \to R$, $h > 0$ is a continuous family of measurable functions,
each finite valued, $\mu(E) < \infty$ and for each $x \in E$, $f_h(x) \to f(x)$ as $h \to 0$
where $f$ is finite valued. Then if a continuous parameter version of Egoroff's
theorem were valid we would have: given $\epsilon > 0$, there exists $F_\epsilon \subset E$,
$\mu(F_\epsilon) < \epsilon$ such that $f_h(x) \to f(x)$ as $h \to 0$ uniformly on $(E - F_\epsilon)$. The
following example shows that this extension is false. In Chapter 4, we
saw that there is a non-measurable set $E \subset [0, 1)$ such that every point
$x \in [0, 1)$ has a unique representation $x = y + q$ (mod 1), $y \in E$, $q$ rational.
Prove that, if $M$ is a measurable subset of $[0, 1]$ such that $M \cap E(r)$ is
non-void for finitely many rationals $r$, then $|M| = 0$.
Arrange the rationals $Q$ as a sequence $\{q_n\}$. For $x \in [0, 1)$ let $n(x)$ be the
integer such that $x = y + q_{n(x)}$, $y \in E$. If $x/n(x) = \cdot\alpha_1 \alpha_2 \dots$ (decimal
representation not ending in 9 recurring), put $\phi(x) = \cdot\beta_1 \beta_2 \dots$ where $\beta_{2k} = \alpha_k$
$(k = 1, 2, \dots)$; and $\beta_{2k-1} = 1$ for $k = n(x)$, 0 otherwise. Put $f_h(x) = 1$
for $x = \phi(h)$, $f_h(x) = 0$ otherwise. Prove $f_h(x) \to 0$ as $h \to 0$ for each $x$.
Show that if $M$ is any measurable set, $|M| > 0$, then $f_h(x) \nrightarrow 0$ uniformly
on $M$.
7. Suppose $\{f_n\}$ is a sequence of functions in $\mathcal{M}$, $f_n \to f$ a.e. and there is
an integrable function $g$ such that $|f_n| \le g$ a.e. for all $n$. Show that $f_n \to f$
almost uniformly.
8. Define what is meant by saying that a sequence $\{f_n\}$ of a.e. finite
valued functions is a Cauchy sequence almost uniformly, and show that this
implies the existence of a limit function $f$ such that $f_n \to f$ almost uniformly.

7.2 Convergence in measure

We now consider a different kind of `nearness' in $\mathcal{M}$ in which the
measure of the set where two functions differ by more than a fixed
positive number is relevant. This time we make the definitions
relative to the whole space $\Omega$. Obvious changes give the corresponding
concepts relative to a set $E$ in $\mathcal{F}$. Given $\mathcal{F}$-measurable functions
$f: \Omega \to R^*$, $f_n: \Omega \to R^*$ $(n = 1, 2, \dots)$ we say that $f_n$ converges in
measure ($\mu$) to $f$ if, for each $\epsilon > 0$,
$$\lim_{n \to \infty} \mu\{x : |f_n(x) - f(x)| \ge \epsilon\} = 0.$$

Note that the definition only makes sense for functions in $\mathcal{M}$ which
are finite a.e. We first see that the limit in measure is unique in $\mathcal{M}$.
For suppose $f_n \to f$ in measure, $f_n \to g$ in measure; then if $\delta > 0$,
$$\{x : |f(x) - g(x)| \ge \delta\} \subset \{x : |f_n(x) - f(x)| \ge \tfrac{1}{2}\delta\} \cup \{x : |f_n(x) - g(x)| \ge \tfrac{1}{2}\delta\}$$

and both sets on the right can be made of arbitrarily small measure
by choosing $n$ large. This means that
$$\mu\{x : |f(x) - g(x)| \ge \delta\} = 0 \quad\text{for each } \delta > 0,$$
and it follows that $f = g$ a.e. (by taking a sequence $\delta_n$ decreasing to 0).
We say that the sequence $\{f_n\}$ of functions in $\mathcal{M}$ is a Cauchy sequence
in measure if, given $\epsilon > 0$, $\delta > 0$ there is an integer $N$ such that
$$n \ge N,\ m \ge N \implies \mu\{x : |f_n(x) - f_m(x)| \ge \epsilon\} < \delta.$$
The argument used to prove uniqueness of the limit also shows that
$f_n \to f$ in measure $\implies$ $\{f_n\}$ is a Cauchy sequence in measure.
The converse is included in the following theorem.

Theorem 7.2. Suppose $f$ and $f_n$ $(n = 1, 2, \dots)$ are functions in $\mathcal{M}$ which
are finite a.e. Then
(i) $f_n \to f$ almost uniformly $\implies$ $f_n \to f$ in measure;
(ii) $\{f_n\}$ is a Cauchy sequence almost uniformly $\implies$ $\{f_n\}$ is a Cauchy
sequence in measure;
(iii) $\{f_n\}$ is a Cauchy sequence in measure $\implies$ there is a subsequence
$\{n_k\}$ such that $\{f_{n_k}\}$ is a Cauchy sequence almost uniformly;
(iv) $\{f_n\}$ is a Cauchy sequence in measure $\implies$ there is a function $g \in \mathcal{M}$
such that $f_n \to g$ in measure.
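Proof. (i) If $f_n \to f$ almost uniformly, for each $\epsilon > 0$, $\delta > 0$, we can
find a set $E_\delta \in \mathcal{F}$ such that $\mu(E_\delta) < \delta$ and $f_n \to f$ uniformly on $(\Omega - E_\delta)$.
Hence there is an $N$ such that
$$|f_n(x) - f(x)| < \epsilon \quad\text{for } n \ge N,\ x \in (\Omega - E_\delta)$$
and then
$$\mu\{x : |f_n(x) - f(x)| \ge \epsilon\} \le \mu(E_\delta) < \delta \quad\text{for } n \ge N.$$
(ii) An argument similar to that in (i) will work.
(iii) Now suppose that $\{f_n\}$ is a Cauchy sequence in measure. For
each positive integer $k$, choose an integer $m_k$ such that $m_k > m_{k-1}$
and
$$n \ge m_k,\ m \ge m_k \implies \mu\{x : |f_n(x) - f_m(x)| \ge 2^{-k}\} < 2^{-k}.$$
Put
$$E_k = \{x : |f_{m_k}(x) - f_{m_{k+1}}(x)| \ge 2^{-k}\}, \qquad F_k = \bigcup_{i=k}^{\infty} E_i.$$
Then
$$\mu(F_k) \le \sum_{i=k}^{\infty} \mu(E_i) < 2^{1-k}.$$

Given $\epsilon > 0$ we can choose $k$ so that $\epsilon > 2^{1-k}$; then $\mu(F_k) < \epsilon$ and
for all $x \in (\Omega - F_k)$ we have
$$|f_{m_i}(x) - f_{m_{i+1}}(x)| < 2^{-i} \quad\text{for all } i \ge k.$$
Thus
$$j > i \ge k \implies |f_{m_i}(x) - f_{m_j}(x)| \le \sum_{s=i}^{j-1} |f_{m_s}(x) - f_{m_{s+1}}(x)| < 2^{1-i}$$

so that the sequence $\{f_{m_i}\}$ converges uniformly on $(\Omega - F_k)$; that is,
since $\mu(F_k) \to 0$ as $k \to \infty$, it is a Cauchy sequence almost uniformly.
(iv) By (iii) we can obtain a subsequence $\{f_{m_k}\}$ of the given sequence
which is a Cauchy sequence almost uniformly. This means we can
find a function $g \in \mathcal{M}$ such that $f_{m_k} \to g$ almost uniformly as $k \to \infty$.
Now, for $\epsilon > 0$,
$$\{x : |f_n(x) - g(x)| \ge \epsilon\} \subset \{x : |f_n(x) - f_{m_k}(x)| \ge \tfrac{1}{2}\epsilon\} \cup \{x : |f_{m_k}(x) - g(x)| \ge \tfrac{1}{2}\epsilon\}.$$

Given $\delta > 0$ we can find a set $E_\delta \in \mathcal{F}$ and integers $k_0$, $N$ such that
$$\mu(E_\delta) < \tfrac{1}{2}\delta,$$
$$|f_{m_k}(x) - g(x)| < \tfrac{1}{2}\epsilon \quad\text{for } k \ge k_0,\ x \in \Omega - E_\delta,$$
and
$$\mu\{x : |f_n(x) - f_{m_k}(x)| \ge \tfrac{1}{2}\epsilon\} < \tfrac{1}{2}\delta \quad\text{for } n \ge N,\ m_k \ge N.$$
It follows that
$$n \ge \max\{N, m_{k_0}\} \implies \mu\{x : |f_n(x) - g(x)| \ge \epsilon\} < \delta.$$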
It is not difficult to see that convergence in measure does not necessarily
imply convergence point-wise at any point, and so it certainly
cannot imply almost uniform convergence of the whole sequence. For
example, let
$$E_{r,k} = \left[\frac{r-1}{2^k}, \frac{r}{2^k}\right] \quad (r = 1, 2, \dots, 2^k;\ k = 1, 2, \dots),$$

and arrange these intervals as a single sequence of sets $\{F_n\}$ by
taking first those for which $k = 1$, then those with $k = 2$, etc. If $\mu$
denotes Lebesgue measure on $[0, 1]$, and $f_n(x)$ is the indicator function
of $F_n$, then, for $0 < \epsilon \le 1$,
$$\{x : |f_n(x)| \ge \epsilon\} = F_n$$

so that, for any $\epsilon > 0$, $\mu\{x : |f_n(x)| \ge \epsilon\} \le \mu(F_n) \to 0$. This means that
$f_n \to 0$ in measure in $[0, 1]$. However, at no point $x$ in $[0, 1]$ does
$f_n(x) \to 0$; in fact, since every $x$ is in infinitely many of the sets $F_n$ and
infinitely many of the sets $(\Omega - F_n)$ we have
$$\liminf f_n(x) = 0, \quad \limsup f_n(x) = 1 \quad\text{for all } x \in [0, 1].$$
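The sequence of intervals above can be generated explicitly, which makes both halves of the example visible: the lengths $\mu(F_n)$ shrink to zero, while a fixed point keeps falling inside one interval of every generation $k$. The sketch below lists the intervals for $k \le 8$ and evaluates $f_n$ at the arbitrarily chosen point $x = 1/3$; the cut-off 8 and the point are assumptions made for illustration.

```python
from fractions import Fraction


def intervals(max_k):
    """Yield E_{r,k} = [(r-1)/2^k, r/2^k] for k = 1..max_k, then r = 1..2^k."""
    for k in range(1, max_k + 1):
        for r in range(1, 2 ** k + 1):
            yield Fraction(r - 1, 2 ** k), Fraction(r, 2 ** k)


F = list(intervals(8))
lengths = [b - a for a, b in F]      # mu(F_n), which tends to 0

x = Fraction(1, 3)
values = [1 if a <= x <= b else 0 for a, b in F]   # f_n(x), n = 1, 2, ...

# Split the values of f_n(x) into the generations k = 1, ..., 8.
blocks, start = [], 0
for k in range(1, 9):
    blocks.append(values[start:start + 2 ** k])
    start += 2 ** k
```

Every block contains both a 0 and a 1, so along the whole sequence $f_n(x)$ takes each of the values 0 and 1 again and again, in agreement with $\liminf f_n(x) = 0$ and $\limsup f_n(x) = 1$.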

Exercises 7.2
1. Suppose $\{f_n\}$ is a Cauchy sequence in measure, and $\{f_{n_i}\}$, $\{f_{m_i}\}$ are two
subsequences which converge to $f$, $g$, respectively. Prove that $f = g$ a.e.
2. Show that if $\{f_n\}$ is a Cauchy sequence in measure then every subsequence
of $\{f_n\}$ is also a Cauchy sequence in measure.
3. If $\Omega$ is the set of positive integers and $\mu$ is the counting measure on the
class $\mathcal{F}$ of all subsets, show that convergence in measure is equivalent to
uniform convergence.
4. If $\mu(\Omega) = \infty$ can we say that convergence a.e. implies convergence
in measure?
5. Suppose $\{A_n\}$ is a sequence of sets in $\mathcal{F}$, $\chi_n$ is the indicator function
of $A_n$, and $d(A, B) = \mu(A \,\triangle\, B)$ for $A, B \in \mathcal{F}$. Show that $\{\chi_n\}$ is a Cauchy
sequence in measure if and only if $d(A_n, A_m) \to 0$ as $n, m \to \infty$.
6. Suppose $\{f_n\}$ is a sequence of functions of $M$ which are finite a.e.
and $f_n \to f$ a.e. with $f$ finite a.e. Show that, if (i) $\mu(\Omega) < \infty$, or (ii) $|f_n| \le g_0$
for all $n$ where $g_0$ is integrable; then $f_n \to f$ in measure.
7. Suppose $(\Omega, \mathcal{F}, \mu)$ is a finite measure space and $\{f_n\}$, $\{g_n\}$ are finite
valued $\mathcal{F}$-measurable functions which converge in measure to $f$, $g$ respectively.
Show
(i) $\{|f_n|\}$ converges in measure to $|f|$;
(ii) for all real $\alpha$, $\beta$ the sequence $\{\alpha f_n + \beta g_n\}$ converges in measure to $\alpha f + \beta g$;
(iii) if $f = 0$ a.e., then $f_n^2$ converges in measure to $f^2$;
(iv) the sequence $\{f_n g\}$ converges in measure to $fg$;
(v) the sequence $\{f_n^2\}$ converges in measure to $f^2$;
(vi) the sequence $\{f_n g_n\}$ converges in measure to $fg$;
(vii) if $f_n \ne 0$ a.e. all $n$, $f \ne 0$ a.e., the sequence $\{1/f_n\}$ converges in measure
to $1/f$.
Is the condition $\mu(\Omega) < \infty$ essential for all these results?

7.3 Convergence in pth mean

All the definitions of the present section can be made relative to
an arbitrary $E$ in $\mathcal{F}$. Since we could restrict $\mu$ to the $\sigma$-field $\mathcal{F} \cap E$
of subsets of $E$, there is no loss in generality in making our definitions
in terms of $\Omega$, the whole space. In Chapter 5 we saw that $f \in \mathcal{M}$ is
$\mu$-integrable (over $\Omega$) if and only if $|f|$ is $\mu$-integrable. Further we
saw that the subset $L_1$ of $M$ consisting of $\mu$-integrable functions is
a linear space (here we define $(\alpha f + \beta g)(x)$ arbitrarily on the set of
zero measure where it is not defined because it involves $+\infty + (-\infty)$).
Further for $f, g \in L_1$,
$$\rho_1(f, g) = \int |f - g|\,d\mu$$
is finite. By theorem 5.6, $\rho_1(f, g) = 0$ if and only if $f = g$ a.e., so that
if we take equivalence classes of functions equal a.e. to form the linear
space $\mathcal{L}_1 \subset \mathcal{M}$ we see that
$$\rho_1(f, g) = \rho_1(g, f) \quad\text{for all } f, g \in \mathcal{L}_1,$$
$$\rho_1(f, g) = 0 \quad\text{if and only if } f = g \text{ in } \mathcal{L}_1.$$
The triangle inequality
$$\rho_1(f, h) \le \rho_1(f, g) + \rho_1(g, h) \quad\text{for } f, g, h \in \mathcal{L}_1$$
also follows by integrating
$$|f(x) - h(x)| \le |f(x) - g(x)| + |g(x) - h(x)|,$$
so that $\rho_1$ defines a metric in the space $\mathcal{L}_1$.
Convergence in mean
A sequence $\{f_n\}$ of functions in $L_1$ (or in $\mathcal{L}_1$) is said to converge
in mean to a function $f$ in $L_1$ if $\rho_1(f_n, f) \to 0$ as $n \to \infty$. A sequence $\{f_n\}$
of functions in $L_1$ is a Cauchy sequence in mean if $\rho_1(f_n, f_m) \to 0$ as
$n, m \to \infty$.
Convergence in mean is the special case $p = 1$ of convergence in
pth mean. Since most of the proofs are the same for $p = 1$ and $p > 1$,
it is convenient to consider this at the same time.

The class $L_p$
For $p \ge 1$, a function $f$ in $M$ is said to be of class $L_p$ if $|f|^p$ is
$\mu$-integrable. Since
$$|f(x) + g(x)| \le \begin{cases} 2|f(x)| & \text{if } |f(x)| \ge |g(x)|,\\ 2|g(x)| & \text{if } |g(x)| > |f(x)|;\end{cases}$$
we have, for all $x$,
$$|f(x) + g(x)|^p \le 2^p\{|f(x)|^p + |g(x)|^p\}. \qquad (7.3.1)$$
Thus, if $f, g \in L_p$ we must have $(f \pm g) \in L_p$. With the usual convention
about the set of zero measure where $(\alpha f + \beta g)$ may not be defined, it
follows that $L_p$ is a linear space. For $f, g \in L_p$ we define
$$\rho_p(f, g) = \left\{\int |f - g|^p\,d\mu\right\}^{1/p}$$
and notice again that $\rho_p(f, g) = 0$ if and only if $f = g$ a.e. so that in the
space $\mathcal{L}_p \subset \mathcal{M}$ of equivalence classes we have
$$\rho_p(f, g) = \rho_p(g, f),$$
$$\rho_p(f, g) = 0 \quad\text{if and only if } f = g \text{ in } \mathcal{L}_p.$$
We will prove in the next section that $\rho_p$ satisfies the triangle inequality,
which shows that it is a metric in $\mathcal{L}_p$. However, we can now
define:

Convergence in pth mean

A sequence $\{f_n\}$ of functions in $L_p$ (or in $\mathcal{L}_p$) is said to converge in
pth mean to a function $f$ in $L_p$ if $\rho_p(f_n, f) \to 0$ as $n \to \infty$. A sequence
$\{f_n\}$ of functions in $L_p$ is a Cauchy sequence in pth mean if $\rho_p(f_n, f_m) \to 0$
as $n, m \to \infty$.
It is immediate, by (7.3.1), that convergence in pth mean to a function
implies that we have a Cauchy sequence in pth mean. Completeness
for this type of convergence can now be proved.

Theorem 7.3. For $p \ge 1$, if $\{f_n\}$ is a sequence of functions in $L_p$ which
is a Cauchy sequence in pth mean, then there is an $f$ in $L_p$ such that
$f_n \to f$ in pth mean.
Proof. We again use the device of obtaining a subsequence which
will converge a.e. to $f$. For any $\epsilon > 0$, let $N(\epsilon)$ denote an integer such
that
$$\int |f_r - f_s|^p\,d\mu < \epsilon^{p+1} \quad\text{for } r, s \ge N(\epsilon).$$
Put $N_k = N(\epsilon 2^{-k})$, and assume that $N_{k+1} > N_k$ for each integer $k$. Then
$$\mu(E(\epsilon, r, s)) < \epsilon \quad\text{for } r, s \ge N(\epsilon),$$
where $E(\epsilon, r, s) = \{x : |f_r - f_s| \ge \epsilon\}$.
If we put
$$E_k = E(\epsilon 2^{-k}, N_{k+1}, N_k), \qquad F_k = \bigcup_{i=k}^{\infty} E_i,$$
we have $\mu(E_k) < 2^{-k}\epsilon$, $\mu(F_k) < 2^{1-k}\epsilon$, and if $x$ is not in $F_k$,
$$|f_{N_{i+1}}(x) - f_{N_i}(x)| < \epsilon 2^{-i} \quad\text{for all } i \ge k.$$

Hence the series $\sum_{i=1}^{\infty} (f_{N_{i+1}} - f_{N_i})$ converges outside $F = \bigcap_{k=1}^{\infty} F_k$ and
$\mu(F) = 0$. Suppose then that $f_{N_i} \to f$ a.e. For a fixed integer $r$, if we
put $g_i = |f_{N_i} - f_r|^p$, $g = |f - f_r|^p$ we obtain a sequence $g_i$ of non-negative
measurable functions with $\liminf g_i = \lim g_i = g$ a.e. By theorem 5.7
(Fatou) we have
$$\int g\,d\mu \le \liminf \int |f_{N_i} - f_r|^p\,d\mu \le \epsilon^{p+1} \quad\text{if } r \ge N(\epsilon).$$

Hence, $g$ is integrable, so that $(f - f_r) \in L_p$ which implies that $f \in L_p$.

We have also proved that
$$\int |f - f_r|^p\,d\mu \le \epsilon^{p+1} \quad\text{if } r \ge N(\epsilon)$$
so that $f_r \to f$ in pth mean.
It is worth remarking at this stage that the theorem corresponding
to theorem 7.3 for Riemann integrals over a finite interval is false.
It is not difficult to construct an example of a sequence of functions
whose pth powers are Riemann integrable and which Cauchy con-
verges in pth mean, but for which the limit is necessarily discontinuous
on a set of positive measure and so cannot be Riemann integrable
by theorem 5.9 (see exercise 7.3 (10)). Thus theorem 7.3 exhibits
another way in which the Lebesgue integral is a big improvement on
the Riemann integral.
We now relate convergence in pth mean to convergence in measure.
Theorem 7.4. If $\{f_n\}$ is a sequence of functions of $L_p$ $(p \ge 1)$ which is a
Cauchy sequence in pth mean then $\{f_n\}$ is a Cauchy sequence in measure.
If $f_n \to f$ in pth mean, then $f_n \to f$ in measure.
Proof. For any $h$ in $L_p$, $\eta > 0$,
$$\mu\{x : |h(x)| \ge \eta^{1/2p}\} \ge \eta^{1/2} \implies \int |h|^p\,d\mu \ge \eta.$$
If $\{f_n\}$ is not a Cauchy sequence in measure, then there is an $\epsilon > 0$,
$\delta > 0$ for which
$$\mu\{x : |f_n(x) - f_m(x)| \ge \epsilon\} \ge \delta$$

for infinitely many $n$, $m$. If now $\eta > 0$ is small enough to ensure that
$\epsilon \ge \eta^{1/2p}$, $\delta \ge \eta^{1/2}$ we have
$$\int |f_n(x) - f_m(x)|^p\,d\mu \ge \eta > 0$$
for infinitely many $n$, $m$ so that $\{f_n\}$ is not a Cauchy sequence in pth
mean. This proves the first statement: the second part of the theorem
is proved similarly.
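The inequality at the heart of this proof is the elementary bound $c^p\,\mu\{x : |h(x)| \ge c\} \le \int |h|^p\,d\mu$, applied with $c = \eta^{1/2p}$. A minimal sketch on a finite weighted space follows; the measure `mu`, the function `h` and the helper names are hypothetical, and exact rationals are used so that the comparison is not affected by rounding.

```python
from fractions import Fraction


def lp_integral(h, mu, p):
    """Integral of |h|**p with respect to a measure on a finite set."""
    return sum(abs(h[w]) ** p * mu[w] for w in mu)


def measure_large(h, mu, level):
    """mu{w: |h(w)| >= level}."""
    return sum((mu[w] for w in mu if abs(h[w]) >= level), Fraction(0))


def bound_holds(h, mu, p, level):
    """Check level**p * mu{|h| >= level} <= integral of |h|**p."""
    return level ** p * measure_large(h, mu, level) <= lp_integral(h, mu, p)


# Hypothetical three-point measure space and function.
mu = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4)}
h = {0: 0, 1: 1, 2: 4}

checks = [bound_holds(h, mu, 2, Fraction(level)) for level in (1, 2, 4, 5)]
```

In words: a function that is large on a set of appreciable measure must have an appreciable pth mean, which is exactly why failure of the Cauchy condition in measure forces failure in pth mean.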
Remark. The example after theorem 7.2 shows that $\{f_n\}$ may converge
in pth mean but not converge a.e., though theorems 7.2, 7.3
together show that there must be a subsequence $\{f_{n_i}\}$ which converges
a.e. If we consider Lebesgue measure in $R$ and put
$$f_n(x) = \begin{cases} n^{-1/p} & \text{for } x \text{ in } [0, n],\\ 0 & \text{otherwise,}\end{cases} \qquad g_n(x) = \begin{cases} n^{1/p} & \text{for } x \text{ in } [0, 1/n],\\ 0 & \text{otherwise,}\end{cases}$$
we see that $f_n \to 0$ uniformly (and therefore almost uniformly, a.e.,
and in measure) but not in pth mean. If $\Omega = [0, 1]$, then $g_n \to 0$
almost uniformly, a.e. and in measure, but not in pth mean so that
even in a finite measure space we cannot deduce convergence in mean
from other types of convergence without some additional condition,
even if the functions concerned are all in $\mathcal{L}_p$. The next definition
turns out to be appropriate:

Set functions equicontinuous at $\emptyset$

Suppose $\nu_i$ ($i \in I$) is a family of set functions defined on a σ-field $\mathcal{F}$.
The family is said to be equicontinuous at $\emptyset$ if, given $\epsilon > 0$ and any
sequence $\{B_n\}$ of sets of $\mathcal{F}$ which decreases to $\emptyset$, there is an integer
N such that $|\nu_i(B_n)| < \epsilon$ for all $i \in I$, $n \ge N$.
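The definition can be tested on a concrete family. The sketch below is illustrative only: it takes Lebesgue measure on [0, 1], the functions $g_n = n^{1/p}$ on $[0, 1/n]$ (zero elsewhere), and the sets $B_k = (0, 1/k)$, which decrease to the empty set. The helper names `nu` and `sup_over_family` are hypothetical. Since $|g_n|^p = n$ on $[0, 1/n]$, the value $\nu_n(B_k) = \int_{B_k} |g_n|^p\,d\mu$ is computed exactly with rational arithmetic.

```python
from fractions import Fraction


def nu(n: int, k: int) -> Fraction:
    """nu_n(B_k) for B_k = (0, 1/k): n times the length of (0, 1/k) ∩ [0, 1/n]."""
    overlap = min(Fraction(1, n), Fraction(1, k))
    return n * overlap


def sup_over_family(k: int, n_max: int) -> Fraction:
    """Largest value of nu_n(B_k) over n = 1, ..., n_max."""
    return max(nu(n, k) for n in range(1, n_max + 1))


# For each fixed n, nu_n(B_k) tends to 0 as k grows ...
print([nu(5, k) for k in (1, 10, 100, 1000)])
# ... but the supremum over the family stays at 1, so no single N works for all n.
print([sup_over_family(k, n_max=2000) for k in (1, 10, 100, 1000)])
```

Each $\nu_n$ is individually continuous at the empty set, yet the family is not equicontinuous there; this matches the failure of $g_n$ to converge in pth mean.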
In § 6.4 we saw that a set function $\nu$ was absolutely continuous with
respect to $\mu$ if, given $\epsilon > 0$ there is a $\delta > 0$ such that
$$\mu(E) < \delta \Rightarrow |\nu(E)| < \epsilon;$$
and that this condition was also necessary if $\nu$ was a finite valued
measure. This makes the following definition reasonable:

Uniform absolute continuity

Any family $\nu_i$ ($i \in I$) of set functions defined on $\mathcal{F}$ is said to be uniformly
absolutely continuous with respect to $\mu$ if, given $\epsilon > 0$ there is
a $\delta > 0$ such that, for $E \in \mathcal{F}$, $\mu(E) < \delta \Rightarrow |\nu_i(E)| < \epsilon$ for all i. To see
what this condition means, suppose $\nu_i$ ($i \in I$) is a family of measures
each of which is absolutely continuous with respect to $\mu$, but such that
the family is not uniformly absolutely continuous. Then there is an
$\epsilon > 0$, and a sequence $\{B_n\}$ of sets of $\mathcal{F}$ with indices $\{i_n\}$ such that
$$\mu(B_n) < 2^{-n}, \qquad \nu_{i_n}(B_n) \ge \epsilon.$$
Put
$$A_k = \bigcup_{n=k}^{\infty} B_n, \qquad C = \lim_{k\to\infty} A_k.$$
Then $\mu(C) = 0$ and $\lim (A_k - C) = \emptyset$. It follows that
$$\nu_{i_k}(A_k - C) = \nu_{i_k}(A_k) \ge \nu_{i_k}(B_k) \ge \epsilon > 0$$
so that, by considering the sequence $\{A_k - C\}$ which decreases to $\emptyset$,
we see that the family $\nu_i$ ($i \in I$) is not equicontinuous at $\emptyset$. Thus
we have proved
Lemma. Suppose $\nu_i$ ($i \in I$) is a family of measures on $\mathcal{F}$ each of which
is absolutely continuous w.r.t. $\mu$. Then if the family is equicontinuous
at $\emptyset$, it is uniformly absolutely continuous w.r.t. $\mu$.
Theorem 7.5. Suppose $\{f_n\}$ is a sequence of functions of $L_p$, and
$$\nu_n(E) = \int_E |f_n|^p\,d\mu \qquad (E \in \mathcal{F},\ n = 1, 2, \ldots).$$
(i) $\{f_n\}$ is a Cauchy sequence in pth mean if and only if $\{f_n\}$ is a Cauchy
sequence in measure and the family $\{\nu_n\}$ of measures is equicontinuous
at $\emptyset$.
(ii) The sequence $\{f_n\}$ converges to f in pth mean if $f_n$ converges to
f in measure and $\{\nu_n\}$ is equicontinuous at $\emptyset$.
Proof. (i) Suppose first that $\{f_n\}$ is a Cauchy sequence in pth mean.
Then by theorem 7.4 $\{f_n\}$ is a Cauchy sequence in measure. For each
$\epsilon > 0$, there is an N such that
$$\int |f_n - f_N|^p\,d\mu < \frac{\epsilon}{2^{p+1}} \quad \text{for } n \ge N.$$
Now suppose $\{B_k\}$ is a sequence of sets of $\mathcal{F}$ decreasing to $\emptyset$. Since
$\nu_n$ is absolutely continuous for n = 1, 2, ..., N we can find, by theorem
5.6 an integer $k_0$ such that
$$\int_{B_k} |f_n|^p\,d\mu < \frac{\epsilon}{2^{p+1}} \quad \text{for } k \ge k_0 \ (n = 1, 2, \ldots, N).$$
By (7.3.1) we obtain, for n > N, $k \ge k_0$,
$$\int_{B_k} |f_n|^p\,d\mu \le 2^p \int_{B_k} |f_N|^p\,d\mu + 2^p \int_{B_k} |f_n - f_N|^p\,d\mu
< \frac{\epsilon}{2} + 2^p \int |f_n - f_N|^p\,d\mu < \epsilon,$$
so that the sequence $\{\nu_n\}$ is equicontinuous at $\emptyset$.

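In the other direction, since we assume that $\mu$ is σ-finite on $\Omega$,
there must be a sequence $\{E_n\}$ in $\mathcal{F}$ which decreases to $\emptyset$ and is such
that $\mu(\Omega - E_n)$ is finite for all n. Given $\epsilon > 0$, the equicontinuity
of $\nu_n$ now ensures that there is a set $E = E_N$ with $\mu(\Omega - E) < \infty$ and
$$\int_E |f_n|^p\,d\mu < \frac{\epsilon}{2^{p+2}} \quad \text{for all } n.$$
Thus, for all m, n, by (7.3.1)
$$\int_E |f_n - f_m|^p\,d\mu < \tfrac12\epsilon. \qquad (7.3.2)$$
Now put $\Omega - E = F$, $\mu(F) = \lambda$. By the lemma, the sequence $\{\nu_n\}$ of
measures must be uniformly absolutely continuous. We can therefore
find an $\eta > 0$ such that, for $B \in \mathcal{F}$, $\mu(B) < \eta$,
$$\nu_n(B) = \int_B |f_n|^p\,d\mu < \frac{\epsilon}{2^{p+3}}. \qquad (7.3.3)$$
For each m, n put
$$C_{m,n} = \Bigl\{x: |f_m(x) - f_n(x)| \ge \Bigl(\frac{\epsilon}{4\lambda}\Bigr)^{1/p}\Bigr\}.$$
Then
$$\int_{F - C_{m,n}} |f_m - f_n|^p\,d\mu \le \frac{\epsilon}{4\lambda}\,\mu(F - C_{m,n}) \le \frac{\epsilon}{4\lambda}\,\mu(F) = \tfrac14\epsilon.$$
Since $\{f_n\}$ is a Cauchy sequence in measure we can find an $n_0$ such
that $\mu(C_{m,n}) < \eta$ for $m \ge n_0$, $n \ge n_0$. This gives, by (7.3.1) and (7.3.3),
$$\int_{C_{m,n}} |f_m - f_n|^p\,d\mu \le 2^p \int_{C_{m,n}} |f_m|^p\,d\mu + 2^p \int_{C_{m,n}} |f_n|^p\,d\mu < \tfrac14\epsilon$$
so that
$$\int_F |f_m - f_n|^p\,d\mu < \tfrac12\epsilon \quad \text{for } m, n \ge n_0.$$
This, together with (7.3.2) gives
$$\int |f_m - f_n|^p\,d\mu < \epsilon \quad \text{for } m, n \ge n_0.$$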

(ii) If $f_n \to f$ in measure, then $\{f_n\}$ is a Cauchy sequence in measure
so that by (i) the condition that $\{\nu_n\}$ is equicontinuous implies that
$\{f_n\}$ is a Cauchy sequence in pth mean. By theorem 7.3, there exists
a $g \in L_p$ such that $f_n \to g$ in pth mean. By theorem 7.4 (i), $f_n \to g$
in measure so that we have f = g a.e. and it follows that $f_n \to f$ in pth
mean. ]
We can now slightly strengthen the dominated convergence theorem.

Theorem 7.6. Suppose $p \ge 1$, and $\{f_n\}$ is a sequence of measurable
functions with $|f_n|^p \le h \in L_1$ for each n. If either $f_n \to f_0$ in measure or
$f_n \to f_0$ a.e., then $f_n \to f_0$ in pth mean.
Proof. We must have $\nu_n(E) = \int_E |f_n|^p\,d\mu \le \int_E h\,d\mu$, so that the
family $\{\nu_n\}$ is equicontinuous at $\emptyset$ by theorem 5.6. If $f_n \to f_0$ in measure
we can apply theorem 7.5 (ii) to obtain the result. On the other hand, in
the proof of theorem 7.5 (i) we only use convergence in measure on the
subset F of $\Omega$ with $\mu(F)$ finite. On F, $f_n \to f_0$ a.e. implies $f_n \to f_0$ in
measure by theorems 7.1, 7.2, so that the condition $f_n \to f_0$ a.e., together
with equicontinuity at $\emptyset$ of $\{\nu_n\}$, implies convergence in pth
mean of $\{f_n\}$.
We have now defined convergence to a limit for sequences of functions
in several different ways, and have proved completeness in each
case. It may help to summarise the relationships by a number of
diagrams (Figures 2 to 4). In each of these an arrow from A to B
indicates that convergence in sense A implies convergence in sense B.
The lack of an arrow from A to B indicates that there is an example of
a sequence satisfying the stated conditions which converges in sense
A, but not in sense B. We assume throughout that we are considering
functions in $\mathcal{M}$ which are a.e. finite.

[Fig. 2. No additional conditions.]
[Fig. 3. $\mu(\Omega) < \infty$.]
[Fig. 4. $|f_n|^p \le h$ a.e.]

Exercises 7.3
1. Check Figures 2, 3, 4, stating in each case the theorem or theorems
which justifies A → B, or the example which justifies the exclusion of
A → B.
2. Show that, if $\mu(\Omega) < \infty$, then the condition in theorem 7.5 that $\{\nu_n\}$
be equicontinuous at $\emptyset$ can be replaced by a condition that $\{\nu_n\}$ is
uniformly absolutely continuous.
3. Show that if $\{f_n\}$ Cauchy converges uniformly a.e. and each $f_n$ is
integrable over E with $\mu(E)$ finite, then $f(x) = \lim f_n(x)$ a.e. is integrable
over E and
$$\int_E |f_n - f|\,d\mu \to 0 \quad \text{as } n \to \infty.$$
4. Suppose $\Omega$ is the set of positive integers and $\mu$ is the counting measure.
(i) If
$$f_n(k) = \begin{cases} 1/n & \text{for } 1 \le k \le n,\\ 0 & \text{for } k > n,\end{cases}$$
show that $f_n(x) \to 0$ uniformly on $\Omega$, each $f_n$ is integrable but
$$\int f_n\,d\mu \nrightarrow \int f\,d\mu, \quad \text{where } f = \lim f_n.$$
This shows $\mu(E) < \infty$ is essential in question (3).
(ii) For the same sequence $\{f_n\}$ show that
$$\nu_n(E) = \int_E f_n\,d\mu$$
is uniformly absolutely continuous, but not equicontinuous at $\emptyset$. This
shows that the condition $\mu(\Omega) < \infty$ is essential in question (2).
(iii) If
$$f_n(k) = \begin{cases} 1/k & \text{for } 1 \le k \le n,\\ 0 & \text{for } k > n,\end{cases}$$
show that $\{f_n\}$ is uniformly convergent on $\Omega$, each $f_n$ is integrable, but the
limit is not.
5. Show that, if $\{f_n\}$ converges in pth mean to f, and g is essentially
bounded, then $\{f_n g\}$ converges in pth mean to fg.
6. Show that if
$$\nu_n(E) = \int_E f_n\,d\mu \qquad (n = 1, 2, \ldots),$$
defines a sequence of set functions which is uniformly absolutely continuous
then so does
$$\lambda_n(E) = \int_E |f_n|\,d\mu.$$
7. Suppose $\{f_n\}$ is a sequence of functions in $L_1$. Show that $\{f_n\}$ is a
Cauchy sequence in mean if and only if
$$\int_E f_n\,d\mu$$
is a Cauchy sequence of real numbers for every $E \in \mathcal{F}$, and $\{f_n\}$ is a Cauchy
sequence in measure. Give an example of a sequence which does not
converge in measure, for which
$$\lim_{n\to\infty} \int_E f_n\,d\mu$$
exists for all E.
8. Suppose $\mu(\Omega) < \infty$, and for $f, g \in \mathcal{M}$, and a.e. finite;
$$\rho(f, g) = \int \frac{|f - g|}{1 + |f - g|}\,d\mu.$$
Show that $\rho$ defines a metric in the space of a.e. finite functions of $\mathcal{M}$, and
that convergence in this metric is equivalent to convergence in measure.
9. Suppose $\mu(\Omega) < \infty$ (1 < q < p). Show that $\mathcal{L}_1 \supset \mathcal{L}_q \supset \mathcal{L}_p \supset \mathcal{L}_\infty$, and
that $\rho_\infty(f, 0) = \lim_{p\to\infty} \rho_p(f, 0)$ for $f \in \mathcal{L}_\infty$. Show that the finite measure
condition is essential.
By considering a suitable function on [0, 1] show that $\mathcal{L}_\infty \ne \bigcap_p \mathcal{L}_p$, but
that if $f \in \bigcap_p \mathcal{L}_p - \mathcal{L}_\infty$ then $\rho_p(f, 0) \to \infty$ as $p \to \infty$.
10. Suppose $\Omega = [0, 1]$, $\mu$ is Lebesgue measure. Let K be a nowhere
dense perfect set with positive measure and let $\{E_k\}$ be the set of disjoint
open intervals such that $(0, 1) - K = \bigcup_{i=1}^{\infty} E_i$. Let $f_n$ be the indicator function
of $F_n = \bigcup_{i=1}^{n} E_i$. Prove that $f_n$ (n = 1, 2, ...) is Riemann integrable and
converges in mean to the indicator function f of (0, 1) - K. By considering the
construction of K, show that f is discontinuous a.e. on the set K of positive
measure, and so cannot be Riemann integrable. This shows that the class
of Riemann integrable functions is not complete with respect to convergence
in mean.

7.4 Inequalities
We now obtain some inequalities which turn out to be important in
several branches of analysis. We need to use the algebraic inequality
$$x^\alpha y^\beta \le \alpha x + \beta y \qquad (7.4.1)$$
for x > 0, y > 0, $\alpha > 0$, $\beta > 0$, $\alpha + \beta = 1$, which is strict unless x = y.
This is most easily proved by taking logarithms to give
$$\alpha \log x + \beta \log y \le \log(\alpha x + \beta y)$$
and using the fact that log: $\mathbf{R}^+ \to \mathbf{R}$ is strictly concave since it has a
negative second derivative.
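As a quick numerical sanity check of (7.4.1), and not a substitute for the concavity argument, the following sketch evaluates the gap $\alpha x + \beta y - x^\alpha y^\beta$ at random points. The function name `gap`, the sampling ranges and the rounding tolerance are assumptions made for illustration.

```python
import random


def gap(x: float, y: float, alpha: float) -> float:
    """alpha*x + beta*y - x**alpha * y**beta with beta = 1 - alpha."""
    beta = 1.0 - alpha
    return alpha * x + beta * y - (x ** alpha) * (y ** beta)


random.seed(0)
samples = [
    gap(random.uniform(0.01, 10.0), random.uniform(0.01, 10.0), random.uniform(0.01, 0.99))
    for _ in range(10_000)
]
# The smallest gap should be non-negative up to floating point rounding.
print(min(samples))
# When x = y the two sides agree, so the gap vanishes (again up to rounding).
print(gap(3.0, 3.0, 0.3))
```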

Conjugate indices
If p > 1, q > 1 and 1/p + 1/q = 1, we say that p and q are conjugate
indices.
Theorem 7.7 (Hölder). If p, q are conjugate indices, and $f \in L_p$, $g \in L_q$
then fg is integrable and, for each E in $\mathcal{F}$,
$$\int_E |fg|\,d\mu \le \Bigl(\int_E |f|^p\,d\mu\Bigr)^{1/p} \Bigl(\int_E |g|^q\,d\mu\Bigr)^{1/q}.$$
The inequality is strict unless there exist real numbers a, b such that
$a|f|^p = b|g|^q$ a.e. on E.
Proof. If
$$\int_E |fg|\,d\mu = 0,$$
then the loose inequality is certainly satisfied and if the right-hand
side is also zero then either f = 0 a.e. on E or g = 0 a.e. on E. In either
case the condition $a|f|^p = b|g|^q$ a.e. is satisfied with b = 0 or a = 0,
respectively. Hence we may assume that
$$\int_E |fg|\,d\mu > 0.$$

Put
$$E_1 = \{x: |g(x)| \le |f(x)|^{p-1}\}, \qquad E_2 = \Omega - E_1.$$
Then for $x \in E_1$, $|f(x) g(x)| \le |f(x)|^p$,
for $x \in E_2$, $|f(x) g(x)| \le |g(x)|^q$,
so that $|f(x) g(x)| \le |f(x)|^p + |g(x)|^q$ for all x;
this implies that fg is integrable.
Given $E \in \mathcal{F}$, put $E_0 = E \cap \{x: f(x) g(x) = 0\}$. Then by our assumption
$\mu(E - E_0) > 0$. For $x \in E - E_0$, we can apply (7.4.1) replacing
$\alpha$ by 1/p, $\beta$ by 1/q,
$$x \text{ by } \frac{|f(x)|^p}{\int_{E-E_0} |f|^p\,d\mu} \quad \text{and} \quad y \text{ by } \frac{|g(x)|^q}{\int_{E-E_0} |g|^q\,d\mu}.$$
This gives
$$\frac{|f(x) g(x)|}{\bigl(\int_{E-E_0} |f|^p\,d\mu\bigr)^{1/p} \bigl(\int_{E-E_0} |g|^q\,d\mu\bigr)^{1/q}}
\le \frac{|f(x)|^p}{p \int_{E-E_0} |f|^p\,d\mu} + \frac{|g(x)|^q}{q \int_{E-E_0} |g|^q\,d\mu}. \qquad (7.4.2)$$
If we now integrate over $(E - E_0)$ and note that the right-hand side
gives 1/p + 1/q = 1, we obtain the desired inequality for the integral
over $(E - E_0)$. We can only obtain equality for the integrals if we have
equality in (7.4.2) for almost all $x \in (E - E_0)$. The condition for
equality in (7.4.1) now shows that we must have $a|f|^p = b|g|^q$ a.e. on
$(E - E_0)$ where
$$a^{-1} = \int_{E-E_0} |f|^p\,d\mu \quad \text{and} \quad b^{-1} = \int_{E-E_0} |g|^q\,d\mu.$$
The inequality for the integrals over E now follows, and we can
again only have equality if f = g = 0 a.e. on $E_0$, since otherwise the
right-hand side is increased while the left-hand side remains the same
on replacing $(E - E_0)$ by E. Thus equality can only occur if $a|f|^p = b|g|^q$
a.e. on E. ]
Remark 1. The special case p = q = 2 of theorem 7.7 is called
Schwarz's inequality. A simpler proof of this case is possible. See
exercise 5.4 (13).
Remark 2. In the sense of theorem 7.7 the index conjugate to p = 1
is q = ∞. It is easy to prove directly that, if $f \in L_1$, $g \in L_\infty$ then
$$\int_E |fg|\,d\mu \le \Bigl(\int_E |f|\,d\mu\Bigr) \operatorname{ess\,sup}_{x \in E} |g(x)|.$$
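Theorem 7.8 (Minkowski). For $p \ge 1$, if $f, g \in L_p$ then $(f + g) \in L_p$ and
for any $E \in \mathcal{F}$,
$$\Bigl(\int_E |f + g|^p\,d\mu\Bigr)^{1/p} \le \Bigl(\int_E |f|^p\,d\mu\Bigr)^{1/p} + \Bigl(\int_E |g|^p\,d\mu\Bigr)^{1/p}.$$
For p > 1, the inequality is strict unless there are real numbers $a \ge 0$, $b \ge 0$
such that af = bg a.e. on E.
Proof. We already proved in §7.3 that $L_p$ is a linear space. For
p = 1, the result is immediate. For p > 1,
$$\int_E |f + g|^p\,d\mu \le \int_E |f|\,|f + g|^{p-1}\,d\mu + \int_E |g|\,|f + g|^{p-1}\,d\mu,$$
with equality if and only if f and g have the same sign a.e. in E.
If we now apply theorem 7.7 to each of these integrals, noting that (p - 1)q = p, we obtain
$$\int_E |f + g|^p\,d\mu \le \Bigl(\int_E |f|^p\,d\mu\Bigr)^{1/p} \Bigl(\int_E |f + g|^p\,d\mu\Bigr)^{1/q}
+ \Bigl(\int_E |g|^p\,d\mu\Bigr)^{1/p} \Bigl(\int_E |f + g|^p\,d\mu\Bigr)^{1/q}$$
with strict inequality unless there are numbers $\alpha, \beta, \gamma$ such that
$\alpha|f|^p = \beta|f + g|^p = \gamma|g|^p$ a.e. We can only have equality in both
inequalities if there is $a \ge 0$, $b \ge 0$ with af = bg a.e. Provided it is not
zero we now divide both sides by
$$\Bigl(\int_E |f + g|^p\,d\mu\Bigr)^{1/q}$$
to obtain the desired result. If
$$\int_E |f + g|^p\,d\mu = 0,$$
then the inequality is trivially satisfied, and equality is only possible
if f = g = 0 a.e. on E. ]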
The above theorem shows that
$$\rho_p(f, g) = \Bigl(\int |f - g|^p\,d\mu\Bigr)^{1/p}$$
defines a metric in $\mathcal{L}_p$.

We have proved the Holder and Minkowski inequalities for general
measure spaces. They are of course valid for Lebesgue measure in R
either over a finite interval or over the whole line.
However, we can also apply these general theorems to the case
where $\Omega$ is the set of positive integers Z, and $\mu$ is the counting measure
$\mu(E)$ = number of integers in E,
which makes all subsets $E \subset Z$ measurable. Then functions $f: Z \to \mathbf{R}$
and $g: Z \to \mathbf{R}$ reduce to
$$f(i) = a_i, \quad g(i) = b_i \quad (i = 1, 2, \ldots),$$
where $\{a_i\}$ and $\{b_i\}$ are sequences of real numbers, and we can apply
theorems 7.7, 7.8 to give:
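Hölder.
$$\sum_{i=1}^{\infty} |a_i b_i| \le \Bigl(\sum_{i=1}^{\infty} |a_i|^p\Bigr)^{1/p} \Bigl(\sum_{i=1}^{\infty} |b_i|^q\Bigr)^{1/q}$$
in the sense that the convergence of both series on the right implies the
convergence of the series on the left and the inequality. Further, equality
is only possible if there is a constant k such that $|a_i|^p = k|b_i|^q$ for all i.

Minkowski.
$$\Bigl(\sum_{i=1}^{\infty} |a_i + b_i|^p\Bigr)^{1/p} \le \Bigl(\sum_{i=1}^{\infty} |a_i|^p\Bigr)^{1/p} + \Bigl(\sum_{i=1}^{\infty} |b_i|^p\Bigr)^{1/p}$$
with equality if and only if there is a $k \ge 0$ such that $a_i = k b_i$ for all i.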
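For finite sequences both inequalities can be checked directly. The following is a minimal sketch, assuming finitely many non-zero terms; the helper names `p_norm`, `holder_sides` and `minkowski_sides` are hypothetical. The equality example uses $|a_i|^p = |b_i|^q$ with p = 3, q = 3/2.

```python
import random


def p_norm(a, p):
    """(sum |a_i|^p)^(1/p) for a finite sequence a."""
    return sum(abs(x) ** p for x in a) ** (1.0 / p)


def holder_sides(a, b, p):
    """Left and right sides of Hölder's inequality; q is the index conjugate to p > 1."""
    q = p / (p - 1.0)
    lhs = sum(abs(x * y) for x, y in zip(a, b))
    return lhs, p_norm(a, p) * p_norm(b, q)


def minkowski_sides(a, b, p):
    """Left and right sides of Minkowski's inequality for p >= 1."""
    lhs = p_norm([x + y for x, y in zip(a, b)], p)
    return lhs, p_norm(a, p) + p_norm(b, p)


random.seed(1)
a = [random.uniform(-5, 5) for _ in range(50)]
b = [random.uniform(-5, 5) for _ in range(50)]
print(holder_sides(a, b, p=3.0))
print(minkowski_sides(a, b, p=3.0))

# Equality in Hölder: |a_i|^3 = |b_i|^(3/2), i.e. a_i = sqrt(b_i).
print(holder_sides([1.0, 2.0, 3.0], [1.0, 4.0, 9.0], p=3.0))
```

Floating point rounding means the comparisons are only meaningful up to a small tolerance.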

It is interesting to see how these elementary inequalities (which
can of course be proved directly) fall out of the general theorems by
using a simple special measure.

Exercises 7.4

1. Suppose $\alpha > 0$, $\beta > 0$, $\gamma > 0$, $\alpha + \beta + \gamma = 1$ and $f \in L_{1/\alpha}$, $g \in L_{1/\beta}$, $h \in L_{1/\gamma}$.
Show that
$$\int_E |fgh|\,d\mu \le \Bigl(\int_E |f|^{1/\alpha}\,d\mu\Bigr)^{\alpha} \Bigl(\int_E |g|^{1/\beta}\,d\mu\Bigr)^{\beta} \Bigl(\int_E |h|^{1/\gamma}\,d\mu\Bigr)^{\gamma}.$$
Generalise to n > 2 functions.

2. If $\mu$ is Lebesgue measure on I = (a, b) and $f \in L_p(I, \mu)$ show that
there is a continuous g such that
$$\int_I |f(x) - g(x)|^p\,dx < \epsilon.$$
Deduce that
$$\int |f(x + h) - f(x)|^p\,dx \to 0 \quad \text{as } h \to 0$$
(here f is given the value zero outside I). Hint. See theorem 5.10.
3. If $\mu(\Omega) < \infty$, $1 \le p < q$ and $f \in L_q$ show that
$$\rho_p(f, 0) \le k\,\rho_q(f, 0)$$
for a suitable constant k. Show that $\mu(\Omega) < \infty$ is essential.
4. Show by an example that theorem 7.8 is false for p < 1.
5. If p, q are conjugate indices, $f_n \to f_0$ in pth mean, $g_n \to g_0$ in qth mean,
show that $f_n g_n \to f_0 g_0$ in mean.
6. By considering intervals of $\Omega$ whose coordinates are rational, and
linear combinations of indicator functions of such intervals obtain a
countable dense set in $\mathcal{L}_p$ for $\Omega = \mathbf{R}^k$, $\mu$ any Lebesgue-Stieltjes measure.
Such a space $\mathcal{L}_p$ is therefore separable.

7.5* Measure preserving transformations from a space to itself
In § 6.5 we discussed measurable transformations T from $(X, \mathcal{F}, \mu)$
to $(Y, \mathcal{G})$ and defined the measure $\mu T^{-1}$ on $\mathcal{G}$ in terms of the
transformation. A special case of this is obtained when $T: X \to X$ maps X
into itself. We then say that T is measure preserving if, for every
$E \in \mathcal{F}$, $\mu(T^{-1}(E)) = \mu(E)$. Given a mapping $T: X \to X$, we can define
the iterates $T^n$ obtained by composing T with itself n times. For
convenience $T^0$ will denote the identity mapping, and $T^{-n}$ will be
defined as a set mapping
$$T^{-n}(E) = \{x: T^n(x) \in E\}$$
even if $T^{-n}$ is not a point function.
If $\mathcal{F}$ is the σ-ring generated by a semi-ring $\mathcal{P}$, then it is immediate
on applying the extension theorems of Chapters 3 and 4 that T is
measure preserving if, and only if,
$$\mu(T^{-1}(E)) = \mu(E) \quad \text{for every } E \text{ in } \mathcal{P}.$$
If T is a (1, 1) transformation from X to itself, then it is said to be
invertible and the condition for T to be measure preserving in this case
can be written as $\mu(T(E)) = \mu(E)$ for all E in $\mathcal{F}$.
In § 4.4 we considered the geometrical properties of Lebesgue
measure and showed that the transformations of Euclidean space
defined in terms of translations, rotations or reflexions are measure
preserving. One can also prove that a matrix transformation of
determinant 1 defines a measure preserving transformation in
Euclidean space. All these are easily seen to be invertible.
If $\Omega = [0, 1)$ and Tx = 2x reduced mod 1 then, for Lebesgue
measure, T is seen to be measure preserving by considering the
effect of $T^{-1}$ on the dyadic intervals $[p/2^q, (p + 1)/2^q)$ which form
a semi-ring generating $\mathcal{B}$. If $x = .a_1 a_2 \ldots$ is the expansion of x as a
binary `decimal', then $Tx = .a_2 a_3 \ldots$. This T is not invertible.
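The effect of $T^{-1}$ on dyadic intervals can be made explicit. For $[a, b) \subset [0, 1)$ the inverse image under Tx = 2x (mod 1) is $[a/2, b/2) \cup [(a+1)/2, (b+1)/2)$, two intervals of half the length. The sketch below checks this with exact rational arithmetic; the function names `T`, `preimage` and `total_length` are chosen here for illustration.

```python
from fractions import Fraction


def T(x: Fraction) -> Fraction:
    """The doubling map x -> 2x reduced mod 1 on [0, 1)."""
    return (2 * x) % 1


def preimage(a: Fraction, b: Fraction):
    """T^{-1}[a, b) as a list of two half-open intervals (lo, hi)."""
    return [(a / 2, b / 2), ((a + 1) / 2, (b + 1) / 2)]


def total_length(intervals) -> Fraction:
    return sum((hi - lo for lo, hi in intervals), Fraction(0))


q = 3
for p in range(2 ** q):
    a, b = Fraction(p, 2 ** q), Fraction(p + 1, 2 ** q)
    pieces = preimage(a, b)
    # Lebesgue measure of the inverse image equals that of the interval.
    assert total_length(pieces) == b - a
    # Both left endpoints are mapped by T onto the left endpoint a.
    assert all(T(lo) == a for lo, _ in pieces)

# T is not invertible: two distinct points share an image.
print(T(Fraction(1, 4)), T(Fraction(3, 4)))
```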
It is worth remarking that the study of measure preserving
transformations started with certain considerations in statistical mechanics.
Suppose we have a system of k particles whose present state is
described by a point in `phase space' $\mathbf{R}^{6k}$ in which each particle determines
3 coordinates for position and 3 coordinates for momentum. Then
the entire history of the system can be represented by a trajectory
in phase space which is completely determined (assuming the laws of
classical mechanics) by a single point on it. Thus for any (time) t
we can define an invertible transformation $T_t$ by saying that, for
x in phase space, $T_t x$ denotes the state of a system which starts
at x after a time t. One of the basic results in statistical mechanics
(due to Liouville) states that, if the coordinates in phase space are
correctly chosen, then the `flow' in phase space leaves all volumes
(i.e. Lebesgue measure in $\mathbf{R}^{6k}$) unchanged. This means that $T_t$ becomes
a measure preserving transformation in $(\mathbf{R}^{6k}, \mathcal{B}^{6k}, \mu)$. In practice
k is enormous, and it is not possible to observe at any one moment
all the particles of the system. Instead one asks questions like `what
is the probability that at time t the state of the system belongs to a
given subset of phase space?' One then imposes conditions which
ensure that this can be calculated by considering the `average'
behaviour of $T_t x$ as $t \to \infty$. To be more precise $T_{s+t} = T_s T_t$, so that
$T_{nt} = (T_t)^n$ and one can consider a discrete model, count the proportion
of times up to n that $T^i x \in E$ where $T = T_t$ and then let $n \to \infty$.
In practice a set E in phase space is replaced by a function f(x)
(representing some physical measurement) and one considers the
average behaviour in terms of the sequence
$$\frac{1}{n} \sum_{i=0}^{n-1} f(T^i x) \qquad (n = 1, 2, \ldots).$$
This discussion of phase space provides a physical interpretation for
the mathematical results which we now formulate precisely.
For the remainder of this section, T will denote a measure preserving
transformation of $\Omega$ to itself, and $f: \Omega \to \mathbf{R}^*$ will denote an integrable
function. We define
$$f_k(x) = f(T^k x) \qquad (k = 0, 1, 2, \ldots).$$
Then $f_k$ will be integrable and theorem 6.8 shows immediately that
$$\int f_k\,d\mu = \int f\,d\mu.$$
Before giving the proof (due to F. Riesz) of the point-wise ergodic
theorem, we obtain a lemma which is an important step towards it.
Lemma (sometimes called the maximal ergodic theorem). Suppose
E is the set of points $x \in \Omega$ such that
$$\sum_{i=0}^{n} f_i(x) \ge 0$$
for at least one n: then
$$\int_E f\,d\mu \ge 0.$$

Proof. We first need a result about finite sequences of real numbers.
Suppose $a_1, a_2, \ldots, a_n \in \mathbf{R}$ and $m \le n$. A term $a_i$ of this sequence is called
an m-leader if there is an integer p, $1 \le p \le m$, such that
$$a_i + a_{i+1} + \ldots + a_{i+p-1} \ge 0.$$
For a fixed m, let $a_k$ be the first m-leader, and let $(a_k + \ldots + a_{k+p-1})$
be the shortest non-negative sum that it leads. Then for every integer
h with $k \le h \le k + p - 1$, we must have $a_h + a_{h+1} + \ldots + a_{k+p-1} \ge 0$,
so that $a_h$ is an m-leader. Now continue with the first m-leader in
$a_{k+p}, \ldots, a_n$ and repeat the argument until all the m-leaders have been
found. It follows that the sum of all the m-leaders of the original
sequence must be non-negative, as it is the same as the sum of the
non-negative shortest sums obtained by the above procedure.
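The combinatorial statement about m-leaders is easy to test by machine. The sketch below is illustrative: the function name `m_leaders` is hypothetical, indices are 0-based, and a sum is only counted if it stays inside the finite sequence. It lists the m-leaders of an integer sequence and checks that they sum to a non-negative number.

```python
import random


def m_leaders(a, m):
    """Indices i for which a[i] + ... + a[i+p-1] >= 0 for some 1 <= p <= m."""
    n = len(a)
    leaders = []
    for i in range(n):
        running = 0
        for p in range(1, m + 1):
            if i + p > n:          # the sum would run past the end of the sequence
                break
            running += a[i + p - 1]
            if running >= 0:       # shortest non-negative sum led by a[i]
                leaders.append(i)
                break
    return leaders


a = [-1, 2, -3, 1, 1, -5]
idx = m_leaders(a, m=2)
print(idx, sum(a[i] for i in idx))   # leaders at indices 0, 1, 3, 4 with sum 3

# Random check of the claim that the m-leaders always have non-negative sum.
random.seed(2)
for _ in range(1000):
    seq = [random.randint(-9, 9) for _ in range(random.randint(1, 12))]
    m = random.randint(1, len(seq))
    assert sum(seq[i] for i in m_leaders(seq, m)) >= 0
```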
We can now turn to the proof of the lemma and notice that, since f
is integrable, we may assume that it is everywhere finite valued. If
$E_m$ denotes the set of x such that
$$\sum_{i=0}^{n} f_i(x) \ge 0$$
for at least one n < m, then $E_m$ increases to E, so it is sufficient to prove
$$\int_{E_m} f\,d\mu \ge 0 \quad \text{for all } m.$$
For a positive integer n, let s(x) be the sum of the m-leaders of the
finite sequence $f_0(x), f_1(x), \ldots, f_{n+m-1}(x)$. Let $A_k$ be the set of x for which
$f_k(x)$ is an m-leader and let $\chi_k$ be the indicator function of $A_k$. Since
$A_k$ is measurable, and $s(x) = \sum \chi_k(x) f_k(x)$, s is measurable and
integrable and $s(x) \ge 0$ so that
$$\sum_{k=0}^{n+m-1} \int_{A_k} f_k\,d\mu \ge 0. \qquad (7.5.1)$$
Now notice that, for k = 1, 2, ..., n-1, $T(x) \in A_{k-1}$ if and only if
$f_{k-1}(Tx) + \ldots + f_{k-1+p-1}(Tx) \ge 0$ for some $p \le m$, which is equivalent
to $f_k(x) + \ldots + f_{k+p-1}(x) \ge 0$ for some $p \le m$ which in turn is the
condition for $x \in A_k$. Thus $A_k = T^{-1} A_{k-1} = T^{-k} A_0$ for k = 1, ..., n-1.
Hence by theorem 6.8,
$$\int_{A_k} f_k(x)\,d\mu = \int_{T^{-k} A_0} f(T^k x)\,d\mu = \int_{A_0} f(x)\,d\mu$$
so that the first n terms of (7.5.1) are all equal. Now $A_0 = E_m$, so that
(7.5.1) implies
$$n \int_{E_m} f\,d\mu + m \int |f|\,d\mu \ge 0.$$
Divide by n, keep m fixed and let $n \to \infty$ to give
$$\int_{E_m} f\,d\mu \ge 0. ]$$

Theorem 7.9 (Birkhoff). Suppose T is a measure preserving transformation
on a σ-finite measure space $(\Omega, \mathcal{F}, \mu)$ to itself and $f: \Omega \to \mathbf{R}^*$ is
integrable. Then
(i) $\frac{1}{n} \sum_{i=0}^{n-1} f(T^i x)$ converges point-wise a.e.;
(ii) the limit function f*(x) is integrable and invariant under T
(i.e. f*(Tx) = f*(x) a.e.);
(iii) if $\mu(\Omega) < \infty$, then $\int f^*\,d\mu = \int f\,d\mu$.
Proof (i). Suppose r, s are rational numbers r < s and B = B(r, s)
is the set of points x for which
$$\liminf_{n\to\infty} \frac{1}{n} \sum_{i=0}^{n-1} f_i(x) < r < s < \limsup_{n\to\infty} \frac{1}{n} \sum_{i=0}^{n-1} f_i(x).$$
It is immediate that B is measurable and invariant under T. Our
result will now follow if we can prove that $\mu(B(r, s)) = 0$ for all rational
r, s. The first step in this direction is to show that $\mu(B) < \infty$.
We may assume without loss of generality that s > 0, for otherwise
the argument can be carried out with f replaced by -f. Suppose
$C \in \mathcal{F}$, $\mu(C) < \infty$, $C \subset B$, and $\chi$ is the indicator function of C. Apply
the lemma to the function $(f - s\chi)$ to give
$$\int_E (f - s\chi)\,d\mu \ge 0,$$
where E is defined in the lemma. If $x \in B$, then at least one of the
averages
$$\frac{1}{n} \sum_{i=0}^{n-1} f_i(x) > s > 0$$
so that at least one of the sums
$$\sum_{i=0}^{n-1} \{f(T^i x) - s\chi(T^i x)\} > 0,$$
and it follows that $x \in E$. Thus
$$\int |f|\,d\mu \ge \int_E f\,d\mu \ge \int_E s\chi\,d\mu \quad \text{so that} \quad \mu(C) \le \frac{1}{s} \int |f|\,d\mu.$$
Since B has σ-finite measure and its subsets of finite measure have
bounded measure, it follows that $\mu(B)$ is finite. Since B is invariant
under T we can restrict our σ-field and measure to B and think of
T as a measure preserving transformation on B. Apply the lemma
again to the integrable function (f - s), and note that in this case
the set E of the lemma is the whole space B. This gives
$$\int_B (f - s)\,d\mu \ge 0.$$
Similarly, we can obtain
$$\int_B (r - f)\,d\mu \ge 0.$$
Together these give
$$\int_B (r - s)\,d\mu \ge 0.$$
Since r < s, we must have $\mu(B) = 0$.
(ii) Put
$$f^*(x) = \lim_{n\to\infty} \Bigl\{\frac{1}{n} \sum_{i=0}^{n-1} f_i(x)\Bigr\}$$
when the sequence converges. Then it is immediate that f* is measurable
and invariant. Further
$$\int \Bigl|\frac{1}{n} \sum_{i=0}^{n-1} f_i(x)\Bigr|\,d\mu \le \frac{1}{n} \sum_{i=0}^{n-1} \int |f_i(x)|\,d\mu = \int |f(x)|\,d\mu$$
so that, by theorem 5.7 (Fatou) f* is integrable, and
$$\int |f^*|\,d\mu \le \int |f|\,d\mu.$$
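(iii) For fixed n, put
$$D(k, n) = \Bigl\{x: \frac{k}{2^n} \le f^*(x) < \frac{k+1}{2^n}\Bigr\}$$
and apply the lemma to the transformation T on the set D(k, n) which
can be assumed to be invariant. Then $f^*(x) \ge k/2^n$ in D(k, n), so that
at least one of the sums
$$\sum_{i=0}^{N-1} \Bigl(f_i(x) - \frac{k}{2^n} + \epsilon\Bigr) > 0$$
for each $\epsilon > 0$. Hence
$$\int_{D(k,n)} f\,d\mu \ge \Bigl(\frac{k}{2^n} - \epsilon\Bigr) \mu(D(k, n))$$
and so we must have
$$\int_{D(k,n)} f\,d\mu \ge \frac{k}{2^n}\,\mu(D(k, n)).$$
Similarly
$$\int_{D(k,n)} f\,d\mu \le \frac{k+1}{2^n}\,\mu(D(k, n)).$$
Also, from the definition of D(k, n),
$$\frac{k}{2^n}\,\mu(D(k, n)) \le \int_{D(k,n)} f^*\,d\mu \le \frac{k+1}{2^n}\,\mu(D(k, n)).$$
For each integer k, it follows that
$$\Bigl|\int_{D(k,n)} f^*\,d\mu - \int_{D(k,n)} f\,d\mu\Bigr| \le \frac{1}{2^n}\,\mu(D(k, n));$$
and if we sum over k
$$\Bigl|\int_\Omega f^*\,d\mu - \int_\Omega f\,d\mu\Bigr| \le \frac{1}{2^n}\,\mu(\Omega).$$
Since n is arbitrary we must have $\int f^*\,d\mu = \int f\,d\mu$. ]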
For applications to statistical mechanics one would expect the
equilibrium value f*(x) to be independent of the point x, so that the
limit function f* of theorem 7.9 is a constant. Unfortunately this is
not true without imposing an additional condition.
Ergodic transformation
A measure preserving transformation T is said to be ergodic (or
metrically transitive or metrically invariant) if for all invariant sets
E (sets for which $T^{-1}(E) = E$), $\mu(E) = 0$ or $\mu(\Omega - E) = 0$.
Lemma. T is ergodic if and only if every measurable invariant function
is constant a.e.
Proof. Suppose g is measurable and invariant. Then {x: g(x) > a}
is invariant for all real a, and must either have zero measure or be
the complement of a set of zero measure. Hence g = constant a.e., if
T is ergodic. Conversely, if every measurable invariant function is
constant a.e., since the indicator function of an invariant set is an
invariant function, there cannot be any invariant sets other than
null sets and complements of null sets. ]
We can now apply this lemma to theorem 7.9 when T is ergodic.
There are two cases:
(i) $\mu(\Omega) = +\infty$. Since the only constant which is integrable over a
space of infinite measure is zero we deduce that
$$\frac{1}{n} \sum_{i=0}^{n-1} f_i(x) \to 0 \quad \text{a.e.}$$
(ii) $\mu(\Omega) < \infty$. We can integrate f* = c a.e. by (iii) to obtain
$$\frac{1}{n} \sum_{i=0}^{n-1} f_i(x) \to \frac{1}{\mu(\Omega)} \int f\,d\mu \quad \text{a.e.}$$
This last result ties up with our remarks about statistical mechanics,
since it shows that the average value of f on the discrete trajectory
approaches the average value of f in phase space for all starting points
x except for a possible null set.
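A toy illustration of case (ii), under assumptions chosen here and not taken from the text: on a five point space with counting measure, a single 5-cycle is ergodic, while a permutation made of a 3-cycle and a 2-cycle is measure preserving but not ergodic. The names `time_average`, `cycle`, `two_cycles` and the function values in `f` are hypothetical.

```python
from fractions import Fraction


def time_average(f, T, x, n):
    """(1/n) * sum_{i=0}^{n-1} f(T^i x) for a map T given as a dict on a finite set."""
    total = Fraction(0)
    for _ in range(n):
        total += f[x]
        x = T[x]
    return total / n


points = range(5)
f = {0: 3, 1: 0, 2: 1, 3: 0, 4: 1}
cycle = {i: (i + 1) % 5 for i in points}        # one orbit: ergodic
two_cycles = {0: 1, 1: 2, 2: 0, 3: 4, 4: 3}     # orbits {0,1,2} and {3,4}: not ergodic

space_average = Fraction(sum(f.values()), 5)    # (1/mu(Omega)) * integral of f
ergodic_avgs = [time_average(f, cycle, x, 30) for x in points]
split_avgs = [time_average(f, two_cycles, x, 30) for x in points]

print(space_average, ergodic_avgs)   # every time average equals the space average
print(split_avgs)                    # the limit depends on the orbit of x
print(sum(split_avgs), sum(f.values()))  # integrals of f* and f still agree
```

The length n = 30 is a multiple of every cycle length, so the time averages are already equal to their limits.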
The reader who wishes to learn more about ergodic theory is
advised to read P. R. Halmos, Lectures on Ergodic Theory (Chelsea,
Exercises 7.5
1. Suppose $\Omega$ is the real line, T is the translation Tx = x + 1, f is the
indicator function of [0, 1]. What is
$$f^*(x) = \lim_{n\to\infty} \frac{1}{n} \sum_{i=0}^{n-1} f_i(x)$$
in this case? Show that theorem 7.9 (iii) is not satisfied without the
condition $\mu(\Omega) < \infty$.
2. Suppose T is measure preserving and ergodic on $(\Omega, \mathcal{F}, \mu)$ and
$\mu(\Omega) < \infty$.
If f is non-negative measurable and
$$\frac{1}{n} \sum_{i=0}^{n-1} f(T^i x) \to c \in \mathbf{R} \quad \text{a.e.},$$
deduce that f is integrable.
3. Suppose $\Omega$ is five point space {a, b, c, d, e}, $\mathcal{F}$ is the set of all subsets,
$\mu\{a\} = \mu\{b\} = \mu\{c\} = 1$ and $\mu\{d\} = \mu\{e\} = 2$, T is the permutation (a, b, c) (d, e).
Show that T is measure preserving but not ergodic. Find the f* of theorem
7.9 if f is the indicator function of {a, b, e}.
4. Suppose $(\Omega, \mathcal{F}, P)$ is a probability measure. Form the doubly infinite
Cartesian product of copies of $(\Omega, \mathcal{F}, P)$ labelled (..., -2, -1, 0, 1, ..., n, ...)
and the product measure by the process of theorem 6.3. If a point of this
product space is $\omega = (\ldots, \omega_{-1}, \omega_0, \omega_1, \ldots)$ and T is the shift $(T\omega)_n = \omega_{n+1}$;
show that T is measure preserving and ergodic.
5. If $\Omega = [0, 1)$, Tx = 2x (mod 1), and $\mu$ is Lebesgue measure, show that
T is ergodic. By applying the ergodic theorem to the indicator function
of $[0, \tfrac12)$ deduce the Borel normal number theorem which states that in the
binary expansion $0.a_1 a_2 \ldots a_n \ldots$ of real numbers in [0, 1), the density
$$\frac{1}{n} \sum_{i=1}^{n} a_i \to \tfrac12 \quad \text{for almost all } x.$$
6. Suppose T is ergodic and measure preserving on a finite measure
space $(X, \mathcal{F}, \mu)$ and f, g are the indicator functions of measurable sets
F, G. Show that
$$\lim_{n\to\infty} \frac{1}{n} \sum_{i=0}^{n-1} \mu((T^{-i}F) \cap G) = \frac{\mu(F)\,\mu(G)}{\mu(X)}.$$

In this chapter all measure spaces $(\Omega, \mathcal{F}, \mu)$ will be σ-finite, and $\mathcal{F}$
will be complete with respect to $\mu$, unless stated otherwise. In Chapter 7
we saw that $\mathcal{L}_p$ ($1 \le p < \infty$) with the metric
$$\rho_p(f, g) = \Bigl(\int |f - g|^p\,d\mu\Bigr)^{1/p}$$
and $\mathcal{L}_\infty$ with the metric
$$\rho_\infty(f, g) = \operatorname{ess\,sup} |f - g|,$$
were complete metric spaces. We also proved they were linear spaces
(over the reals); and it is immediate that the metric defines a norm
$$\|f\|_p = \rho_p(f, 0) \qquad (1 \le p \le \infty),$$
for which the spaces are normed linear spaces. Thus
$$\|f\|_p > 0 \text{ if } f \ne 0, \quad \|0\| = 0;$$
$$\|f + g\|_p \le \|f\|_p + \|g\|_p;$$
$$\|af\|_p = |a|\,\|f\|_p \quad \text{for } a \in \mathbf{R}.$$
We will omit the suffix p in $\|\cdot\|_p$ when it is clear which $\mathcal{L}_p$ space is
being considered.
It turns out that the space $\mathcal{L}_2$ has some special properties not shared
by other $\mathcal{L}_p$ spaces. These can be postulated in terms of the difference
between Hilbert space and Banach space, but we prefer to examine,
in the first three sections of this chapter, the structure of $\mathcal{L}_2$ and
then discuss later the analogous properties of more general normed
linear spaces.

8.1 Dependence of $\mathcal{L}_2$ on the underlying $(\Omega, \mathcal{F}, \mu)$

In general, the structure of the space $\mathcal{L}_2$ depends on the underlying
space $(\Omega, \mathcal{F}, \mu)$; when we want to emphasise this we use the notation
$\mathcal{L}_2(\Omega, \mu)$. We first examine conditions on $(\Omega, \mathcal{F}, \mu)$ which will ensure
that $\mathcal{L}_2(\Omega, \mu)$ is separable (in the topology of the norm). We later
define (real) Hilbert space in terms of its abstract properties, and show
that $\mathcal{L}_2(\Omega, \mu)$ is always a realisation of Hilbert space.

Countable basis for measure

In the measure space $(\Omega, \mathcal{F}, \mu)$ we can use the equivalence relation
$A \sim B \Leftrightarrow \mu(A \,\triangle\, B) = 0$ to identify the subsets in $\mathcal{F}$ which differ
only by a set of measure zero. If we denote the resulting quotient
space by $\mathcal{F}_\mu$, it is clear that, when $\mu(\Omega) < \infty$, $\mathcal{F}_\mu$ is a metric space
with the metric $\rho(A, B) = \mu(A \,\triangle\, B)$, and one can further show that
$\mathcal{F}_\mu$ is complete. In this case we can define a dense subset by means of
the topology of this metric. However, the notion of a dense subset in
$\mathcal{F}_\mu$ can be extended to include the case $\mu(\Omega) = \infty$ by a device which
makes sense provided $\mu$ is σ-finite on $\Omega$. Thus we say that $\mu$ has a
countable basis if there is a sequence $\{E_n\}$ of sets in $\mathcal{F}$ such that, given
$\epsilon > 0$ and any $A \in \mathcal{F}$ with $\mu(A) < \infty$, there is a set $E_k$ of the sequence
for which
$$\mu(A \,\triangle\, E_k) < \epsilon.$$
In Chapters 3 and 4 we saw how measures could be obtained by
extending a measure already defined on a semi-ring. If $\mu$ can be
defined by extending a finite measure on a semi-ring $\mathcal{P}$ which contains
only a countable number of sets, then $\mu$ has a countable basis. For
the ring $\mathcal{R}$ generated by $\mathcal{P}$ is countable, and forms a basis for $\mathcal{F}$ by
theorem 4.4. In particular, in the definition of Lebesgue measure,
we could have used the countable semi-ring of half-open intervals, whose
bounding coordinates are rationals, to generate the σ-field $\mathcal{B}^k$ of
Borel sets in $\mathbf{R}^k$; so it follows that Lebesgue measure in $\mathbf{R}^k$ has a
countable basis.
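In one dimension the idea is simple to make concrete: a half-open interval [a, b) with arbitrary endpoints is approximated by one with rational endpoints so that the symmetric difference has measure less than $\epsilon$. The sketch below is only an illustration for a single interval; the names `rational_approx` and `sym_diff_measure` and the sample endpoints are assumptions made here.

```python
from fractions import Fraction
import math


def rational_approx(a: float, b: float, eps: float):
    """Rational endpoints r, s with |a - r| + |b - s| < eps (up to float rounding)."""
    d = math.ceil(2 / eps)               # grid of mesh 1/d, rounding error <= 1/(2d)
    return Fraction(round(a * d), d), Fraction(round(b * d), d)


def sym_diff_measure(a, b, r, s):
    """Lebesgue measure of the symmetric difference of [a, b) and [r, s)."""
    overlap = max(Fraction(0), min(b, s) - max(a, r))
    return (b - a) + (s - r) - 2 * overlap


a, b = math.sqrt(2) - 1, math.pi / 4     # an interval with irrational endpoints
for eps in (1e-1, 1e-3, 1e-6):
    r, s = rational_approx(a, b, eps)
    err = sym_diff_measure(Fraction(a), Fraction(b), r, s)
    print(eps, r, s, float(err))
```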
We first obtain a condition equivalent to the existence of a countable
basis for $\mu$.
Lemma. A measure $\mu$ has a countable basis if and only if, for each
$\epsilon > 0$, any collection $\mathcal{C} \subset \mathcal{F}$ of subsets of finite measure such that
$$A, B \in \mathcal{C},\ A \ne B \Rightarrow \mu(A \,\triangle\, B) \ge \epsilon \qquad (8.1.1)$$
is countable.
Proof. Suppose first that $\epsilon > 0$ is such that there is a non-countable
$\mathcal{C}$ satisfying (8.1.1); and suppose if possible that $\mu$ has a countable
basis $\mathcal{D}$. Then, for each $A \in \mathcal{C}$ we can find a set $E_A \in \mathcal{D}$ with
$$\mu(A \,\triangle\, E_A) < \tfrac13\epsilon.$$
Then, if $A \ne B$,
$$\mu(E_A \,\triangle\, E_B) \ge \mu(A \,\triangle\, B) - \mu(A \,\triangle\, E_A) - \mu(B \,\triangle\, E_B) > \tfrac13\epsilon > 0,$$
so that $E_A \ne E_B$. Thus if $\mathcal{D}$ is dense, it contains a non-countable
subclass, which is a contradiction.
Conversely, suppose the condition is satisfied. Then for each positive
integer n, the set $\Gamma_n$ of those classes $\mathcal{C}_n$ which satisfy (8.1.1) with
$\epsilon = 1/n$ can be partially ordered by inclusion. Clearly if $\Delta_n \subset \Gamma_n$
is a totally ordered set of classes, the union of all the classes in $\Delta_n$
is a class $\mathcal{C}_n$ which is a maximal element of $\Delta_n$. By Zorn's lemma
(see § 1.6) it follows that we can obtain a maximal element in $\Gamma_n$
with respect to this ordering. Thus we can find a class $\mathcal{C}_n^0 \subset \mathcal{F}$ satisfying
(8.1.1), with $\epsilon = 1/n$, and such that, given $E \in \mathcal{F}$, there is at least one
$A \in \mathcal{C}_n^0$ with $\mu(A \,\triangle\, E) < 1/n$, as otherwise E could be added to $\mathcal{C}_n^0$
to form a larger collection. But $\mathcal{C}_n^0$ is countable so $\mathcal{C} = \bigcup_n \mathcal{C}_n^0$ is countable
and forms a basis for $\mu$. ]
Theorem 8.1. The space $\mathcal{L}_2(\Omega, \mu)$ of functions $f: \Omega \to \mathbf{R}^*$ which are
square integrable is separable (in the norm topology) if and only if
the measure $\mu$ has a countable basis.
Proof. Suppose first that $\mathcal{L}_2$ is separable, so that there is a sequence
$\{f_n\}$ in $\mathcal{L}_2$ such that for any $\epsilon > 0$, $f \in \mathcal{L}_2$ we can find an integer k
with $\|f - f_k\| < \epsilon$. Let $\mathcal{C}$ be any collection of measurable sets of finite
measure. Then for each $A \in \mathcal{C}$, the indicator function $\chi_A \in \mathcal{L}_2$, so there
is an integer $k_A$ such that
$$\|f_{k_A} - \chi_A\| < \tfrac13\epsilon.$$
Then, if $\mathcal{C}$ satisfies (8.1.1), we must have
$$\|f_{k_A} - f_{k_B}\| > \tfrac13\epsilon \quad \text{for } A \ne B$$
so that $k_A \ne k_B$, and $\mathcal{C}$ must be countable. By the lemma this implies
that $\mu$ has a countable basis.
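Conversely suppose that $\mu$ has a countable basis $\mathcal{C}$. The set $\mathcal{S}$ of
all simple functions
$$h = \sum_{i=1}^{n} r_i \chi_{C_i},$$
which are finite sums of rational multiples of indicator functions of
sets of $\mathcal{C}$ is then countable. In order to prove $\mathcal{L}_2$ separable, it is
sufficient to show that this set $\mathcal{S}$ is dense in $\mathcal{L}_2$.
From the definition of the integral, for any $f \in \mathcal{L}_2$, $\epsilon > 0$ we can
find a set $E \in \mathcal{F}$ with $\mu(E) < \infty$ such that f is bounded on E and
$$\int_{\Omega - E} |f|^2\,d\mu < \tfrac18\epsilon^2.$$
On the set E, we can use the process of theorem 5.2 to approximate f
uniformly by a simple function g taking only rational values
$$g = \sum_{k=1}^{n} r_k \chi_{E_k}, \qquad E_i \cap E_j = \emptyset \ (i \ne j), \qquad E = \bigcup_{k=1}^{n} E_k.$$
Using $\mu(E) < \infty$, this means that such a function g can be found with
$$\int_E |f - g|^2\,d\mu < \tfrac18\epsilon^2.$$
Then
$$\|f - g\|^2 = \int_{\Omega - E} |f - g|^2\,d\mu + \int_E |f - g|^2\,d\mu
= \int_{\Omega - E} |f|^2\,d\mu + \int_E |f - g|^2\,d\mu < \tfrac14\epsilon^2,$$
so that $\|f - g\| < \tfrac12\epsilon$. If all the $r_k$ in the representation of g are zero we
are finished, so there is no loss in generality in assuming they are all
non-zero. Since $\mathcal{C}$ is a basis for $\mu$ and $\mu(E_k) < \infty$, we can find sets
$C_k$ of $\mathcal{C}$ such that
$$\mu(E_k \,\triangle\, C_k) < \Bigl(\frac{\epsilon}{2 n r_k}\Bigr)^2 \qquad (k = 1, 2, \ldots, n).$$
Then
$$\|r_k \chi_{E_k} - r_k \chi_{C_k}\|^2 = r_k^2\,\mu(E_k \,\triangle\, C_k) < \Bigl(\frac{\epsilon}{2n}\Bigr)^2$$
so that, if
$$h = \sum_{k=1}^{n} r_k \chi_{C_k}, \quad \text{we have} \quad \|g - h\| \le \sum_{k=1}^{n} \|r_k \chi_{E_k} - r_k \chi_{C_k}\| < \tfrac12\epsilon$$
and $\|f - h\| \le \|f - g\| + \|g - h\| < \epsilon$. ]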
Corollary. If $\mu$ denotes Lebesgue measure in $R^k$, then the corresponding
space $\mathcal{L}_2$ is separable.
To prove this we use the observation that Lebesgue measure in $R^k$
has a countable basis. It is worth remarking that the classical method
of proving this corollary is to approximate $f \in \mathcal{L}_2$ by a continuous
function vanishing outside a finite interval, and then approximate this
continuous function by a polynomial with rational coefficients.
We end this section by making an important definition which is
essential to a geometric understanding of linear spaces. We will see
later that it is not possible to define an inner product in $\mathcal{L}_p$ for
$p \neq 2$.

Inner product
For any normed linear space $K$ over the reals a function $(f, g)$ on
$K \times K \to R$ is called an inner product (or scalar product) if
(i) $(f, g) = (g, f)$;
(ii) $(f_1 + f_2, g) = (f_1, g) + (f_2, g)$;
(iii) $(\lambda f, g) = \lambda (f, g)$, for $\lambda \in R$;
(iv) $(f, f) = \|f\|^2$.
For the normed linear space $\mathcal{L}_2$ we can define
$$(f, g) = \int fg \, d\mu, \qquad f, g \in \mathcal{L}_2,$$
since, by theorem 7.7, the product $fg$ is integrable. It is a simple matter
to check that, with this definition, $(f, g)$ satisfies all the conditions
(i)-(iv) for an inner product.

Exercises 8.1
1. For any normed linear space with an inner product, prove that
$$|(f, g)| \le \|f\| \, \|g\|.$$
Hint. Consider $(f + \theta g, f + \theta g) \ge 0$ for all real $\theta$.
2. If $(f, g) = 0$ in a normed linear space, show that
$$\|f + g\|^2 = \|f\|^2 + \|g\|^2.$$
3. Suppose $(\Omega, \mathcal{F}, \mu)$ is a discrete measure space, i.e. there is a sequence
$\{p_i\}$ of reals with $\sum |p_i| < \infty$ and a sequence $\{x_i\}$ in $\Omega$ such that
$\mu(E) = \sum_{x_i \in E} p_i$. Prove that $\mathcal{L}_2(\Omega, \mu)$ is separable.
4. If $(X, \mathcal{F}, \mu)$, $(Y, \mathcal{G}, \nu)$ are $\sigma$-finite measure spaces each with countable
bases, show that the product measure $\lambda = \mu \times \nu$ on $X \times Y$ also has a
countable basis.
Hint. Consider finite unions of rectangles which are products of basic sets.
5. Generalise the above to countable products of spaces $(X_i, \mathcal{F}_i, \mu_i)$
with $\mu_i(X_i) = 1$. The example (8) below shows that it does not extend to
arbitrary products.
6. Let $\Omega$ be any set and define the counting measure $\mu(E)$ = number of
points in $E$ when $E$ is finite; $\mu(E) = +\infty$ otherwise. Show (i) if $\Omega$ is countable,
the finite subsets of $\Omega$ form a countable basis; (ii) if $\Omega$ is uncountable, there
is no countable basis for $\mu$.
7. Show that any Lebesgue-Stieltjes measure on $R^k$ has a countable basis.
8. Suppose $I$ is a non-countable index set and for $\alpha \in I$, $X_\alpha$ denotes
a 2-point space $\{0, 1\}$ with $\mu_\alpha\{0\} = \mu_\alpha\{1\} = \tfrac{1}{2}$. Form the product measure
$\mu$ on the Cartesian product $\prod_{\alpha \in I} \{0, 1\} = \Omega$. Show that there is no countable
basis for $\mu$.
9. Show that, if $\mu$ has a countable basis, then $\mathcal{L}_p(\Omega, \mu)$, $1 \le p < \infty$,
is a separable space.

8.2 Orthogonal systems of functions

We now examine the part of the structure of $\mathcal{L}_2(\Omega, \mu)$ which is
more intimately related to the inner product.
Linear dependence

In a linear space $K$, the finite set $\phi_1, \phi_2, \ldots, \phi_n$ is said to be linearly
dependent if there are real numbers $c_i$, not all zero, such that
$$\sum_{i=1}^{n} c_i \phi_i = 0. \qquad (8.2.1)$$
On the other hand, if (8.2.1) implies that all the $c_i$ are zero, then we say
that $\phi_1, \ldots, \phi_n$ are linearly independent. A set $E \subset K$ is said to be
linearly independent if each of its finite subsets is linearly independent.
Note that, when $K = \mathcal{L}_2$ the relation (8.2.1) becomes
$$\sum_{i=1}^{n} c_i \phi_i(x) = 0 \quad \text{a.e.}$$

Closed linear span

In a normed linear space $K$, given a family $\{\phi_\alpha\}$, $\alpha \in I$, of points of $K$
the set of all finite linear combinations
$$\sum_{k=1}^{n} c_{\alpha_k} \phi_{\alpha_k}, \qquad c_{\alpha_k} \in R \qquad (8.2.2)$$
is called the span of $\{\phi_\alpha\}$, and its closure (in the norm topology) is
called the closed linear span of $\{\phi_\alpha\}$ and denoted by $M\{\phi_\alpha\}$. Thus
this set $M$ consists of precisely those elements of $K$ which can be
approximated in norm by elements of the form (8.2.2).

Complete set
A family $\{\phi_\alpha\}$ ($\alpha \in I$) in a normed linear space $K$ is said to form a
complete set if its closed linear span is the whole space.
Suppose now that a normed linear space $K$ is separable, so that
there is a sequence $x_1, x_2, \ldots, x_n, \ldots$ of points dense in $K$. By omitting,
in turn, any point in the sequence which can be expressed as a linear
combination of the previous ones we obtain a sequence $g_1, g_2, \ldots$
which is linearly independent, and has the same linear span as $\{x_n\}$.
Since $\{x_n\}$ is dense, the closed linear span of $\{g_n\}$ must therefore be the
whole space. Thus in any separable normed linear space there is a
complete set of independent points which is either finite or enumerable.
If there is a finite complete set $(g_1, g_2, \ldots, g_k)$ of independent points and
$K$ has an inner product, then we will see that $K$ is isomorphic to
Euclidean $k$-space. For $K = \mathcal{L}_2(\Omega, \mu)$, it is easy to see that $K$ is
finite-dimensional if $\mu$ is a discrete measure concentrated on a finite
set of points, for then the indicator functions of these individual
points will form a finite complete set. However, the interesting
$\mathcal{L}_2$-spaces are infinite-dimensional. Any $(\Omega, \mathcal{F}, \mu)$ for which $\mathcal{F}$ contains
infinitely many disjoint sets, each of finite positive measure, clearly
generates an infinite-dimensional $\mathcal{L}_2$, since the indicator functions
of this sequence form an independent set.

Orthogonal system
Two points $x$, $y$ in a normed linear space $K$ with an inner product
are said to be orthogonal if $(x, y) = 0$. Any class $\{\phi_i\}$ ($i \in I$) of points
of $K$ which are different from zero and pairwise orthogonal is called an
orthogonal system. A non-zero element $x \in K$ is said to be normalized
if $\|x\| = 1$, i.e. $(x, x) = 1$. An orthogonal system of normalized points
is said to be an orthonormal system in $K$. Thus $\{\phi_i\}$ ($i \in I$) is an
orthonormal system if
$$(\phi_i, \phi_j) = \begin{cases} 1 & \text{for } i = j \in I, \\ 0 & \text{for } i \neq j. \end{cases}$$
Now any orthogonal system of points is certainly linearly independent
for, if we take the inner product of (8.2.1) with $\phi_j$ we obtain
$c_j (\phi_j, \phi_j) = 0$, so that $c_j = 0$. Further, if $K$ is separable, any
orthogonal system in $K$ is countable. For any such system can be
normalised to give an orthonormal system $\{\phi_i\}$ ($i \in I$), and then
$$\|\phi_i - \phi_j\| = \sqrt{2} \quad \text{for } i \neq j;$$
and, if $\{x_n\}$ is a countable dense set, we can find for each $i \in I$ an
integer $n_i$ such that $\|x_{n_i} - \phi_i\| < \tfrac{1}{2}$ and this gives
$$\|x_{n_i} - x_{n_j}\| > \sqrt{2} - 1 > 0 \quad \text{for } i \neq j, \text{ so that } n_i \neq n_j.$$
In the study of finite-dimensional normed linear spaces it is helpful
to use a (finite) orthogonal normalized basis. In the general case, at
least for K separable, it is also possible to find a complete orthonormal
sequence for K. This can be done by first obtaining a complete in-
dependent sequence and then orthogonalising it by the process of the
next theorem.
Theorem 8.2 (Gram-Schmidt orthogonalisation process). If $K$ is a
normed linear space with an inner product and $x_1, x_2, \ldots, x_n, \ldots$ is a
linearly independent sequence in $K$, then there is an orthonormal sequence
$y_1, y_2, \ldots, y_n, \ldots$ such that
(i) $y_n = a_{n1} x_1 + a_{n2} x_2 + \ldots + a_{nn} x_n$, $a_{nn} \neq 0$;
(ii) $x_n = b_{n1} y_1 + b_{n2} y_2 + \ldots + b_{nn} y_n$, $b_{nn} \neq 0$;
where $a_{ij}$, $b_{ij}$ are real numbers. Further each $y_i$ is uniquely determined
(up to the sign) by these conditions.
Proof. If $y_1 = a x_1$, then
$$(y_1, y_1) = a^2 (x_1, x_1) = 1$$
if $a$ is suitably chosen. The conditions are therefore satisfied with
$n = 1$ if $b_{11} = 1/a_{11} = \sqrt{(x_1, x_1)}$. (Note that the linear independence
condition ensures that $\|x_1\| \neq 0$.) For $n > 1$, suppose $y_1, y_2, \ldots, y_{n-1}$
have been found to satisfy all the conditions. Then
$$x_n = b_{n1} y_1 + \ldots + b_{n,n-1} y_{n-1} + z_n,$$
where $b_{ni} = (x_n, y_i)$ ($i = 1, 2, \ldots, n - 1$) so that $(z_n, y_i) = 0$ for $i < n$.
We must have $(z_n, z_n) > 0$, since otherwise $z_n = 0$ and $x_1, x_2, \ldots, x_n$
would be linearly dependent. If we put
$$y_n = \frac{z_n}{\sqrt{(z_n, z_n)}}, \qquad b_{nn} = \sqrt{(z_n, z_n)},$$
then $(y_n, y_n) = 1$, $(y_n, y_i) = 0$ for $i < n$ and (ii) is satisfied. We can
then deduce (i) since $b_{nn} \neq 0$. By induction the method of
orthogonalisation is established.
The uniqueness of the process (apart from sign) follows since the
values of the constants are all determined except for the $\pm$ sign in
the square root which occurs at each stage. ∎
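
The inductive step of the proof can be sketched numerically. The following is a minimal illustration only, assuming that $K$ is Euclidean space $R^d$ with the usual scalar product standing in for $(f, g)$; the names `inner` and `gram_schmidt` and the tolerance `tol` are hypothetical choices made for this sketch and do not come from the text.

```python
import math


def inner(u, v):
    """Euclidean scalar product, standing in for the inner product (f, g)."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(xs, tol=1e-12):
    """Orthonormalise a linearly independent list of vectors.

    At step n we form z_n = x_n - sum_{i<n} (x_n, y_i) y_i and then put
    y_n = z_n / sqrt((z_n, z_n)), taking the positive square root.
    """
    ys = []
    for x in xs:
        z = list(x)
        for y in ys:
            b = inner(x, y)  # the coefficient b_{ni} = (x_n, y_i)
            z = [zi - b * yi for zi, yi in zip(z, y)]
        b_nn = math.sqrt(inner(z, z))
        if b_nn < tol:
            # z_n = 0 would mean x_1, ..., x_n are linearly dependent
            raise ValueError("input vectors are linearly dependent")
        ys.append([zi / b_nn for zi in z])
    return ys


if __name__ == "__main__":
    for y in gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]):
        print(y)
```

Choosing the positive square root at each stage fixes the sign ambiguity mentioned in the theorem; the opposite choice at any stage gives an equally valid orthonormal sequence.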
Corollary. If $\mathcal{L}_2(\Omega, \mathcal{F}, \mu)$ is separable, then there is a complete
orthonormal sequence in $\mathcal{L}_2$.
Proof. Start with a sequence $\{f_n\}$ which is dense in $\mathcal{L}_2$, and replace
it first by a sequence $\{g_n\}$ of linearly independent functions with the
same closed linear span. If this is an infinite sequence, use the process
of theorem 8.2 to obtain the orthonormal sequence $\{h_n\}$. It is clear that
this sequence has the same closed linear span as $\{g_n\}$ so that it is a
complete set. On the other hand, if $\mathcal{L}_2$ is finite dimensional, we will obtain
a finite set $\{g_1, g_2, \ldots, g_n\}$ whose linear span is $\mathcal{L}_2$. This finite set
can be replaced by a finite orthonormal set using the process of
theorem 8.2. ∎
In practice it is not always easy to prove that a given orthonormal
sequence $\{\phi_1, \phi_2, \ldots\}$ is complete. Various methods for proving
completeness will be given in the next section.

Exercises 8.2
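1. Suppose $\Omega = [0, 1)$, $\mu$ is Lebesgue measure, $f_0(x) \equiv 1$,
$$f_n(x) = \begin{cases} +1 & \text{if } 2^{n-1} x = y \in [0, \tfrac{1}{2}) \pmod 1, \\ -1 & \text{if } 2^{n-1} x = y \in [\tfrac{1}{2}, 1) \pmod 1. \end{cases}$$
The functions $f_n: [0, 1) \to R$ are called the Rademacher functions. Show
that they form an orthonormal sequence in $\mathcal{L}_2(\Omega, \mu)$.
2. For $\Omega = [-\pi, \pi]$, $\mu$ Lebesgue measure, show that the trigonometric
functions
$$\frac{1}{\sqrt{2\pi}}, \ \frac{1}{\sqrt{\pi}} \cos x, \ \frac{1}{\sqrt{\pi}} \sin x, \ \ldots, \ \frac{1}{\sqrt{\pi}} \cos nx, \ \frac{1}{\sqrt{\pi}} \sin nx, \ \ldots$$
form an orthonormal sequence in $\mathcal{L}_2(\Omega, \mu)$.
3. For $\Omega = [-1, 1]$, $\mu$ Lebesgue measure, show that the Legendre
polynomials
$$P_n(x) = \frac{1}{2^n n!} \frac{d^n}{dx^n} \{(x^2 - 1)^n\} \quad (n = 0, 1, 2, \ldots)$$
are orthogonal in $\mathcal{L}_2(\Omega, \mu)$ and that the sequence $\sqrt{\tfrac{1}{2}(2n + 1)} \, P_n(x)$ is
orthonormal.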
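
As a numerical sanity check on the Rademacher functions of exercise 1 (not a substitute for the proof asked for there), the sketch below evaluates the inner products exactly, using the fact that $f_n$ is constant on each dyadic interval of length $2^{-n}$. The function names and the parameter `levels` are hypothetical, introduced only for this illustration; the computation is exact only for indices not exceeding `levels`.

```python
def rademacher(n, x):
    """Value of the n-th Rademacher function at x in [0, 1)."""
    if n == 0:
        return 1
    y = (2 ** (n - 1) * x) % 1.0  # fractional part of 2^(n-1) x
    return 1 if y < 0.5 else -1


def rademacher_inner(m, n, levels=8):
    """Integral over [0, 1) of f_m * f_n, for m, n <= levels.

    Both functions are constant on each of the 2**levels dyadic cells,
    so sampling the midpoint of every cell gives the exact integral.
    """
    cells = 2 ** levels
    total = 0
    for k in range(cells):
        x = (k + 0.5) / cells
        total += rademacher(m, x) * rademacher(n, x)
    return total / cells


if __name__ == "__main__":
    for m in range(4):
        print([rademacher_inner(m, n) for n in range(4)])
```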

8.3 Riesz-Fischer theorem

This theorem is formulated and proved in Hilbert space. Since
$\mathcal{L}_2$ spaces will be shown to be realisations of Hilbert space, we will
deduce the classical theorem about the Fourier expansion as a
trigonometric series of a function in $\mathcal{L}_2$ as a special case.
Hilbert space
Suppose $H$ is a normed linear space with an inner product which is
complete in the topology of the norm; then $H$ is said to be a Hilbert
space. (Note that some older text-books require separability in
addition.) If $H$ is finite-dimensional, then theorem 8.2 allows us to choose
a finite orthonormal basis $e_1, e_2, \ldots, e_n$ for $H$. It is then clear that
$$x = \sum_{k=1}^{n} c_k e_k, \qquad c_k = (x, e_k) \qquad (8.3.1)$$
is the unique expansion of $x \in H$ in terms of this basis. For separable
infinite-dimensional $H$, we have seen that there is always an
orthonormal sequence $\{e_i\}$ which forms a complete set. The main objective
of the present section is to obtain the extension of (8.3.1) to this
infinite-dimensional case. However, in formulating the results we will
not assume that $H$ is separable. It will turn out that an expansion in
the form (8.3.1) is still possible, and that at most countably many
terms in any such expansion will be non-zero. Before leaving the
finite-dimensional $H$, we should observe that any Hilbert space of
dimension $n$ is isomorphic to Euclidean $n$-space $R^n$ with the usual
scalar product. For (8.3.1) determines the point $(c_1, c_2, \ldots, c_n) \in R^n$
and defines a (1, 1) mapping which then preserves the inner product.

Fourier coefficients
Given an orthonormal family $\{e_j\}$, $j \in J$ in a Hilbert space $H$, and
any point $x \in H$, the real numbers
$$c_j = (x, e_j) \quad (j \in J),$$
are called the Fourier coefficients of $x$ on the orthonormal family,
and the series
$$\sum_{j \in J} c_j e_j$$
is called the Fourier series of $x$. (Note we have not yet said in what
sense (if any) this series converges.)
The choice of the Fourier coefficients $c_j$ can be justified as follows.
If $I$ is a finite subset of the index set $J$, re-label the indices $1, 2, \ldots, n$
and consider the partial sum
$$s_n = \sum_{i=1}^{n} a_i e_i \quad (n = 1, 2, \ldots).$$
Then
$$\|s_n - x\|^2 = \Bigl(x - \sum_{i=1}^{n} a_i e_i, \ x - \sum_{i=1}^{n} a_i e_i\Bigr)
= \|x\|^2 - 2 \sum_{i=1}^{n} (x, a_i e_i) + \sum_{i=1}^{n} \sum_{j=1}^{n} (a_i e_i, a_j e_j)
= \|x\|^2 - 2 \sum_{i=1}^{n} a_i c_i + \sum_{i=1}^{n} a_i^2,$$
so that
$$\|s_n - x\|^2 = \|x\|^2 - \sum_{i=1}^{n} c_i^2 + \sum_{i=1}^{n} (a_i - c_i)^2. \qquad (8.3.2)$$
Thus $\|s_n - x\|$ will be a minimum when all the terms of the last series
in (8.3.2) are zero, and $\sum a_i e_i$ is the best approximation (in norm)
to $x$ when all the $a_i$ are Fourier coefficients. This generalises the
well-known geometrical theorem (for $R^n$) which states that the length of
the perpendicular from a point to a plane is smaller than the length of
any other line joining the point to the plane: for $\bigl(x - \sum_{i=1}^{n} c_i e_i\bigr)$ is
orthogonal to all linear combinations of the form $\sum_{i=1}^{n} \beta_i e_i$.
Bessel's inequality
We can make another deduction from (8.3.2) by noting that
$$\|s_n - x\|^2 \ge 0.$$
If we put $a_i = c_i$ we obtain
$$\sum_{i=1}^{n} c_i^2 \le \|x\|^2.$$
If we now define $\sum_{j \in J} c_j^2$ to be the supremum of $\sum_{i \in I} c_i^2$ for all finite subsets
$I \subset J$ we find that
$$\sum_{j \in J} c_j^2 \le \|x\|^2, \qquad (8.3.3)$$
and this is known as Bessel's inequality. It follows as an immediate
corollary that at most countably many Fourier coefficients of a point
in $H$ can be non-zero.
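
The identity (8.3.2) and Bessel's inequality can be checked on a small example. The sketch below is an illustration under the assumption that $H = R^3$ with the usual scalar product and that the orthonormal family consists of the first two coordinate vectors; all function names are hypothetical and chosen for this example only.

```python
def inner(u, v):
    """Usual scalar product in R^d, standing in for the inner product of H."""
    return sum(a * b for a, b in zip(u, v))


def fourier_coefficients(x, es):
    """Fourier coefficients c_i = (x, e_i) on the orthonormal family es."""
    return [inner(x, e) for e in es]


def residual_sq(x, es, a):
    """Left side of (8.3.2): the squared norm of s_n - x, s_n = sum a_i e_i."""
    s = [sum(ai * e[k] for ai, e in zip(a, es)) for k in range(len(x))]
    d = [sk - xk for sk, xk in zip(s, x)]
    return inner(d, d)


def identity_rhs(x, es, a):
    """Right side of (8.3.2): ||x||^2 - sum c_i^2 + sum (a_i - c_i)^2."""
    c = fourier_coefficients(x, es)
    return (inner(x, x) - sum(ci * ci for ci in c)
            + sum((ai - ci) ** 2 for ai, ci in zip(a, c)))


if __name__ == "__main__":
    x = [3.0, 4.0, 12.0]
    es = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # orthonormal but not complete
    a = [1.0, -2.0]
    print(residual_sq(x, es, a), identity_rhs(x, es, a))
    c = fourier_coefficients(x, es)
    print(sum(ci * ci for ci in c) <= inner(x, x))  # Bessel's inequality
```

Replacing `a` by the Fourier coefficients themselves makes the last sum in (8.3.2) vanish, which is the minimising choice described above.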
Theorem 8.3. If $\{e_j\}$ ($j \in J$) is an orthonormal family in a Hilbert space,
each of the following conditions is equivalent to $\{e_j\}$ being a complete
set:
(i) $\sum_{j \in J} c_j^2 = \|x\|^2$ for every $x \in H$ (Parseval),
where $\{c_j\}$ are the Fourier coefficients of $x$;
(ii) The finite partial sums $s_I = \sum_{k \in I} c_k e_k$ of the Fourier series of $x$
converge to $x$ in norm for all $x \in H$.
Note. For any arbitrary index set $J$ we say that $\sum_{j \in J} x_j$ converges in
norm to $x$ if, given $\epsilon > 0$, there is a finite set $K$ such that if $I$ is finite
and $K \subset I \subset J$ then
$$\Bigl\| \sum_{j \in I} x_j - x \Bigr\| < \epsilon.$$
It is easy to check that, when $J$ is countable and the $x_j$ are real so
that we have a real series this notion reduces to the usual definition
of absolute convergence.
Proof. The conditions (i) and (ii) are clearly equivalent by (8.3.2).
Now suppose that (ii) is satisfied. Then any $x$ can be approximated in
norm by a finite sum $s_n$ which is a linear combination of $e_1, e_2, \ldots, e_n$.
Hence, each $x$ is in the closed linear span of $\{e_j\}$ and the family must
form a complete system.
Conversely suppose $\{e_j\}$ is a complete system. Then given $\epsilon > 0$,
$x \in H$, there is a finite sum $y = \sum_{i=1}^{N} a_i e_i$ for which $\|x - y\| < \epsilon$. But,
if $s_N$ is the corresponding partial Fourier sum, we know
$$\|x - y\|^2 \ge \|x - s_N\|^2,$$
so that, by (8.3.2),
$$\sum_{j \in J} c_j^2 \ge \|x\|^2 - \epsilon^2.$$
Since $\epsilon$ is arbitrary, we can combine this with (8.3.3) to give
$$\sum_{j \in J} c_j^2 = \|x\|^2. ∎$$
From (8.3.3) we know that a given set $\{\beta_j\}$ ($j \in J$) of real numbers
can only be the Fourier coefficients of a point in $H$ if $\sum \beta_j^2$ converges.
It turns out that this condition is sufficient as well as necessary.
Theorem 8.4 (Riesz-Fischer). Let $\{e_j\}$ ($j \in J$) be any orthonormal system
(not necessarily complete) in a Hilbert space $H$, and let $\{\beta_j\}$, $j \in J$, be any
set of real numbers such that $\sum \beta_j^2$ converges. Then there is a point
$x \in H$ with Fourier coefficients $\beta_j = (x, e_j)$ such that the finite partial
sums $s_I = \sum_{i \in I} \beta_i e_i$ converge to $x$ in norm.
Proof. Since $\sum \beta_j^2$ converges the set of $j$ for which $\beta_j \neq 0$ is countable
and we may suppose then that these indices are renamed
$$\beta_1, \beta_2, \ldots, \beta_n, \ldots$$
(it may be only a finite set). Put
$$s_n = \sum_{i=1}^{n} \beta_i e_i.$$
Then
$$\|s_{n+p} - s_n\|^2 = \sum_{i=n+1}^{n+p} \beta_i^2.$$
Since $\sum \beta_i^2$ converges, it follows that $\{s_n\}$ is a Cauchy sequence in norm.
Since $H$ is complete, there must be an $x \in H$ such that $s_n \to x$ in norm.
Now
$$(x, e_i) = (s_n, e_i) + (x - s_n, e_i) = \beta_i + (x - s_n, e_i) \quad \text{for } n \ge i$$
and, by exercise 8.1 (1),
$$|(x - s_n, e_i)| \le \|e_i\| \, \|x - s_n\| = \|x - s_n\| \to 0 \quad \text{as } n \to \infty.$$
Since $(x, e_i)$ is independent of $n$, we have $\beta_i = (x, e_i)$ for all $i$. ∎
Corollary. An orthonormal system $\{e_j\}$ ($j \in J$) in a Hilbert space $H$
forms a complete system if and only if the only point $x \in H$ which is
orthogonal to all the $\{e_j\}$ is the point $x = 0$.
Proof. Suppose $\{e_j\}$ is a complete orthonormal set and $(x, e_j) = 0$
for all $j$. Then all the Fourier coefficients of $x$ are zero and so, by
theorem 8.3 (i),
$$\|x\|^2 = \sum_{j \in J} c_j^2 = 0.$$
Conversely suppose $\{e_j\}$ is not a complete system. Then by theorem
8.3 (i), there is a point $x \in H$ with
$$\|x\|^2 > \sum_{j \in J} c_j^2, \quad \text{where } c_j = (x, e_j).$$
By theorem 8.4, there is a $y \in H$ such that
$$c_j = (y, e_j), \qquad \|y\|^2 = \sum_{j \in J} c_j^2.$$
Then the point $(x - y) \in H$ is orthogonal to all the $e_j$. But $\|x - y\| \neq 0$,
since $\|x\| > \|y\|$. ∎
Remark. If the Hilbert space is separable we have already observed
that any orthonormal system is countable. For a separable Hilbert
space, therefore, it is natural to state and prove theorem 8.4 and
corollary in terms of an arbitrary orthonormal sequence. No essential
modifications to the proof are needed.
The space 12
The class of all infinite sequences $c_1, c_2, \ldots, c_n, \ldots$ of real numbers
for which $\sum c_k^2$ converges is called the space $l_2$. By using the discrete
version of theorem 7.7 one can check that if $\{c_i\}, \{d_i\} \in l_2$,
$$(c, d) = \sum_{i=1}^{\infty} c_i d_i \qquad (8.3.4)$$
converges and defines an inner product. Alternatively, if $\Omega$ is the set
of positive integers, $\mu$ is the counting measure, $c_i = f(i)$, then
$f \in \mathcal{L}_2(\Omega, \mu)$ if and only if $\sum_{k=1}^{\infty} c_k^2 < \infty$, and (8.3.4) is the usual inner product
$(f, g) = \int fg \, d\mu$ in $\mathcal{L}_2$. Completeness and separability for $l_2$ can be
proved directly, or they can be deduced from the corresponding
properties of $\mathcal{L}_2(\Omega, \mu)$. Thus $l_2$ is a separable Hilbert space according
to our definition, and historically $l_2$ was the space first considered in
The justification for our abstract definition of Hilbert space is
contained in the following theorem.
Theorem 8.5. An $n$-dimensional Hilbert space is isomorphic to $R^n$.
Any separable infinite-dimensional Hilbert space is isomorphic to $l_2$.
Proof. The finite-dimensional case was considered earlier. If $H$
is separable and infinite-dimensional we can select a complete
orthonormal sequence $\{e_n\}$ and obtain the Fourier coefficients $\{c_n\}$ of
a point $x \in H$. Then since $x \in H \Rightarrow \sum c_n^2 < \infty$, this defines a mapping
from $H$ to $l_2$. Conversely, every sequence in $l_2$ determines a point in
$H$ with these Fourier coefficients, by theorem 8.4. There is only one
such point by the corollary to theorem 8.4. Thus to prove that we
have an isomorphism it is sufficient to prove that the linear structure
and inner product are preserved by the mapping. Suppose
$$x^{(1)}, x^{(2)} \in H$$
correspond to $\{c_i^{(1)}\}, \{c_i^{(2)}\} \in l_2$. Then it is immediate that
$$\alpha x^{(1)} \leftrightarrow \{\alpha c_i^{(1)}\} \quad (\alpha \in R);$$
$$x^{(1)} + x^{(2)} \leftrightarrow \{c_i^{(1)} + c_i^{(2)}\};$$
$$\|x^{(1)}\|^2 = \sum (c_i^{(1)})^2, \qquad \|x^{(2)}\|^2 = \sum (c_i^{(2)})^2,$$
$$\|x^{(1)} + x^{(2)}\|^2 = \|x^{(1)}\|^2 + 2 (x^{(1)}, x^{(2)}) + \|x^{(2)}\|^2
= \sum (c_i^{(1)})^2 + 2 \sum c_i^{(1)} c_i^{(2)} + \sum (c_i^{(2)})^2,$$
so that $(x^{(1)}, x^{(2)}) = \sum c_i^{(1)} c_i^{(2)}$. ∎
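
The finite-dimensional case of this isomorphism can be illustrated directly. The sketch below assumes $H = R^2$ with a rotated orthonormal basis (a hypothetical choice made only for this example) and checks that the map from a point to its sequence of Fourier coefficients preserves the inner product.

```python
import math


def inner(u, v):
    """Usual scalar product in R^d."""
    return sum(a * b for a, b in zip(u, v))


def rotated_basis(theta):
    """An orthonormal basis of R^2 obtained by rotating the standard one."""
    return [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]


def coefficients(x, basis):
    """The map x -> (c_1, ..., c_n) with c_i = (x, e_i)."""
    return [inner(x, e) for e in basis]


if __name__ == "__main__":
    basis = rotated_basis(0.7)
    x, y = [1.0, 2.0], [-3.0, 0.5]
    cx, cy = coefficients(x, basis), coefficients(y, basis)
    # The inner product of the coefficient sequences agrees with (x, y).
    print(inner(x, y), inner(cx, cy))
```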

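Corollary. If $(\Omega, \mathcal{F}, \mu)$ is any measure space with a countable basis,
then $\mathcal{L}_2(\Omega, \mathcal{F}, \mu)$ is either finite-dimensional, when it is isomorphic to a
Euclidean space $R^n$, or it is infinite-dimensional when it is isomorphic
to $l_2$. If $(\Omega_1, \mathcal{F}_1, \mu_1)$, $(\Omega_2, \mathcal{F}_2, \mu_2)$ are such that $\mathcal{L}_2(\Omega_1, \mu_1)$ and $\mathcal{L}_2(\Omega_2, \mu_2)$
are both infinite-dimensional and separable, then $\mathcal{L}_2(\Omega_1, \mu_1)$ and
$\mathcal{L}_2(\Omega_2, \mu_2)$ are isomorphic.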
The theorems of this section were first obtained for trigonometric
series of functions $f$ in $\mathcal{L}_2([-\pi, \pi], \mu)$ where $\mu$ is Lebesgue measure.
In order to obtain these special theorems one only has to prove that
the functions
$$\frac{1}{\sqrt{2\pi}}, \ \frac{1}{\sqrt{\pi}} \cos x, \ \frac{1}{\sqrt{\pi}} \sin x, \ \ldots, \ \frac{1}{\sqrt{\pi}} \cos nx, \ \frac{1}{\sqrt{\pi}} \sin nx, \ \ldots$$
form a complete orthonormal sequence in this $\mathcal{L}_2$ space. The steps
necessary for this proof are contained in exercise 8.3 (2).


Exercises 8.3
1. Prove that a series $\sum a_n$ of real terms converges absolutely to $s$ if and
only if, for each $\epsilon > 0$ there is a finite set $I \subset Z$ such that for every finite $K$
with $I \subset K \subset Z$ we have
$$\Bigl| s - \sum_{n \in K} a_n \Bigr| < \epsilon.$$
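2. If $\Omega = [-\pi, \pi]$, $\mu$ is Lebesgue measure, $f \in \mathcal{L}_2(\Omega, \mu)$,
$$a_m = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos mx \, dx \quad (m = 0, 1, 2, \ldots);$$
$$b_m = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin mx \, dx \quad (m = 1, 2, \ldots),$$
then the $a_m$, $b_m$ are the classical Fourier coefficients of $f$. Prove:
(i) $\tfrac{1}{2} a_0^2 + \sum_{m=1}^{\infty} (a_m^2 + b_m^2) \le \frac{1}{\pi} \int_{-\pi}^{\pi} \{f(x)\}^2 \, dx$;
(ii) if $\{a_n\}$, $\{b_n\}$ are such that
$$\tfrac{1}{2} a_0^2 + \sum_{n=1}^{\infty} (a_n^2 + b_n^2) < \infty,$$
then there is a function $f \in \mathcal{L}_2$ for which these are the Fourier coefficients
and such that
$$s_n(x) = \tfrac{1}{2} a_0 + \sum_{m=1}^{n} (a_m \cos mx + b_m \sin mx) \to f$$
in second mean;
(iii) if $\{a_n\}$, $\{b_n\}$ are the Fourier coefficients of $f$ in the above sense,
$$s_n(x) = \tfrac{1}{2} a_0 + \sum_{m=1}^{n} (a_m \cos mx + b_m \sin mx),$$
$$\sigma_n(x) = \frac{1}{n+1} [s_0(x) + s_1(x) + \ldots + s_n(x)],$$
then $\sigma_n(x) \to f(x)$ in $\mathcal{L}_2$ norm (and in fact $\sigma_n \to f$ a.e.);
(iv) $$\frac{1}{\pi} \int_{-\pi}^{\pi} \sigma_n^2(x) \, d\mu = \tfrac{1}{2} a_0^2 + \sum_{k=1}^{n} \Bigl(\frac{n + 1 - k}{n + 1}\Bigr)^2 (a_k^2 + b_k^2)
\le \tfrac{1}{2} a_0^2 + \sum_{k=1}^{n} (a_k^2 + b_k^2) \le \tfrac{1}{2} a_0^2 + \sum_{k=1}^{\infty} (a_k^2 + b_k^2);$$
(v) since $\int_{-\pi}^{\pi} \sigma_n^2(x) \, d\mu \to \int_{-\pi}^{\pi} f^2(x) \, d\mu$ we have for $f \in \mathcal{L}_2$
$$\frac{1}{\pi} \int_{-\pi}^{\pi} \{f(x)\}^2 \, d\mu = \tfrac{1}{2} a_0^2 + \sum_{k=1}^{\infty} (a_k^2 + b_k^2);$$
(vi) the trigonometric system of exercise 8.2 (2) is complete.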

3. If $\{\phi_n\}$ is any orthonormal sequence in a Hilbert space $H$, then for
$x \in H$, $(x, \phi_n) \to 0$ as $n \to \infty$.
4. If $f \in \mathcal{L}_2([-\pi, \pi], \mu)$ then, as $n \to \infty$,
$$\int_{-\pi}^{\pi} f(x) \sin nx \, dx \to 0, \qquad \int_{-\pi}^{\pi} f(x) \cos nx \, dx \to 0.$$
5. If $l_p$ is the space of real sequences $\{x_i\}$ for which
$$\sum_{i=1}^{\infty} |x_i|^p < \infty \quad \text{for } 1 \le p < \infty,$$
show that $l_p$ is separable. $l_\infty$ is the space of bounded real sequences with
$\|x\| = \sup |x_i|$. By considering sequences of 0's and 1's show that $l_\infty$ is not
separable. Deduce that $\mathcal{L}_\infty$ is not separable if $\Omega$ has infinite but $\sigma$-finite
measure.

8.4* Space of linear functionals

We start by defining a more general type of normed linear space.
Banach space
Any normed linear space over the reals which is complete in the
topology determined by the norm is called a (real) Banach space.
We saw already that the $\mathcal{L}_p$ ($1 \le p \le +\infty$) spaces are normed linear
spaces and that each of them is complete in the norm topology so that
each $\mathcal{L}_p$ is a Banach space. Euclidean $n$-space $R^n$ with the usual metric
provides a simpler example of a Banach space. $C[a, b]$, the space of
continuous functions on a finite closed interval, with $\|f\| = \sup |f(x)|$,
can also be easily seen to be a Banach space.
From our definition of Hilbert space, it follows that any Banach
space will be a Hilbert space provided there is an inner product
defined satisfying $\|f\|^2 = (f, f)$. The question immediately arises as
to whether or not all Banach spaces are Hilbert spaces; or is it always
possible to define an inner product in a Banach space? We can settle
this as follows. If there is to be an inner product, then
$$\|f + g\|^2 = (f + g, f + g) = (f, f) + 2 (f, g) + (g, g),$$
$$\|f - g\|^2 = (f - g, f - g) = (f, f) - 2 (f, g) + (g, g),$$
so that on adding
$$\|f + g\|^2 + \|f - g\|^2 = 2 \|f\|^2 + 2 \|g\|^2. \qquad (8.4.1)$$
Thus, the relation (8.4.1) for all $f$, $g$ in the space is a necessary condition
for the Banach space to have an inner product. One can also
check that, if (8.4.1) is always satisfied, then
$$(f, g) = \tfrac{1}{2} \{\|f + g\|^2 - \|f\|^2 - \|g\|^2\} \qquad (8.4.2)$$
satisfies all the conditions for an inner product, so that any Banach
space which satisfies (8.4.1) is a Hilbert space if we define the inner
product by (8.4.2). We can think of the condition (8.4.1) as a
generalisation of the Euclidean theorem that in any parallelogram the sum
of the squares on the diagonals is twice the sum of the squares on two
adjacent sides. If this theorem is not valid in the Banach space $K$,
then it is not possible to define an inner product on $K$. This allows us
to show that $\mathcal{L}_p$ is not a Hilbert space for $p \neq 2$; see exercise 8.4 (1).
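
The failure of (8.4.1) for $p \neq 2$ can be seen on the smallest possible example. The sketch below assumes a measure space consisting of two disjoint atoms $E_1$, $E_2$ of unit measure, so that a function is just a pair of values and the indicator functions are $(1, 0)$ and $(0, 1)$; the names `p_norm` and `parallelogram_defect` are hypothetical and introduced only for this illustration.

```python
def p_norm(f, p, weights):
    """The L_p norm (sum_k w_k |f_k|^p)^(1/p) on a finite weighted space."""
    return sum(w * abs(v) ** p for v, w in zip(f, weights)) ** (1.0 / p)


def parallelogram_defect(f, g, p, weights):
    """||f+g||^2 + ||f-g||^2 - 2||f||^2 - 2||g||^2, which (8.4.1) needs to be 0."""
    plus = [a + b for a, b in zip(f, g)]
    minus = [a - b for a, b in zip(f, g)]
    lhs = p_norm(plus, p, weights) ** 2 + p_norm(minus, p, weights) ** 2
    rhs = 2 * p_norm(f, p, weights) ** 2 + 2 * p_norm(g, p, weights) ** 2
    return lhs - rhs


if __name__ == "__main__":
    weights = [1.0, 1.0]  # assumed measures of the two atoms
    chi_1, chi_2 = [1.0, 0.0], [0.0, 1.0]  # indicator functions of the atoms
    for p in (1.0, 2.0, 4.0):
        print(p, parallelogram_defect(chi_1, chi_2, p, weights))
```

With these unit weights the defect equals $2 \cdot 2^{2/p} - 4$, which vanishes only for $p = 2$.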
Linear functional
Given a linear space $K$ over the reals a function $T : K \to R$ is
called a linear functional if, for all $x_1, x_2 \in K$, $\alpha, \beta \in R$,
$$T(\alpha x_1 + \beta x_2) = \alpha T(x_1) + \beta T(x_2).$$
If $K$ is a normed linear space, then $T$ is continuous at $x_0 \in K$ if, given
$\epsilon > 0$, there is a $\delta > 0$ with
$$\|x - x_0\| < \delta \Rightarrow |T(x) - T(x_0)| < \epsilon.$$
For a normed $K$ the functional is said to be bounded if there is a real
constant $C$ such that
$$|T(x)| \le C \|x\| \quad \text{for all } x \in K.$$
It is immediate that a linear functional on a normed linear space
is continuous everywhere if it is continuous at any one point. The
connexion between continuity and boundedness is not quite so obvious.
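Lemma. A linear functional $T$ on a normed linear space $K$ is continuous
if and only if it is bounded.
Proof. Suppose first that $T$ is bounded, then given $x_1 \in K$, $\epsilon > 0$, put
$\delta = \epsilon C^{-1}$ and we have
$$|T(x) - T(x_1)| = |T(x - x_1)| \le C \|x - x_1\| < \epsilon$$
if $\|x - x_1\| < \delta$. Conversely, if $T$ is continuous at $0 \in K$ we can choose
$B > 0$ such that
$$|T(x)| \le 1 \quad \text{for } \|x\| \le \frac{1}{B}.$$
Then for $x \in K$, $x \neq 0$, we have $\Bigl\| \dfrac{x}{B \|x\|} \Bigr\| = \dfrac{1}{B}$, so that
$$|T(x)| = B \|x\| \, \Bigl| T\Bigl(\frac{x}{B \|x\|}\Bigr) \Bigr| \le B \|x\|,$$
and $T$ is bounded. ∎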

Norm of a bounded functional

If $T$ is a bounded linear functional on a normed linear space $K$, the
smallest number $C$ satisfying $|T(x)| \le C \|x\|$ for all $x \in K$ is called the
norm of $T$ and we denote it by $\|T\|$. Because of linearity,
$$\|T\| = \sup_{x \neq 0} \frac{|T(x)|}{\|x\|} = \sup_{\|x\| = 1} |T(x)|.$$
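
For a concrete feel for this definition, consider the functional $T(x) = a_1 x_1 + a_2 x_2$ on $R^2$ with the Euclidean norm, whose norm is $\sqrt{a_1^2 + a_2^2}$ by exercise 8.1 (1). The sketch below approximates the supremum over the unit circle by sampling; the function names and the sample count are assumptions made for this illustration, and the sampled value is only a lower estimate of the true supremum.

```python
import math


def make_functional(a):
    """The linear functional T(x) = a_1 x_1 + a_2 x_2 on R^2."""
    return lambda x: a[0] * x[0] + a[1] * x[1]


def estimate_norm(T, samples=3600):
    """Largest |T(x)| over equally spaced points x with ||x|| = 1."""
    best = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        best = max(best, abs(T((math.cos(theta), math.sin(theta)))))
    return best


if __name__ == "__main__":
    a = (3.0, 4.0)
    T = make_functional(a)
    print(estimate_norm(T))         # close to, and not above, 5
    print(T((a[0] / 5, a[1] / 5)))  # value at a / ||a||, where the sup is attained
```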
If $T_1$, $T_2$ are two linear functionals on a linear space $K$, $\alpha, \beta \in R$, then
$$(\alpha T_1 + \beta T_2)(x) = \alpha T_1(x) + \beta T_2(x)$$
is also a linear functional on $K$; and the set of all linear functionals
on $K$ is a linear space. We can say more if $K$ is normed.
Lemma. If $K^*$ denotes the set of bounded linear functionals on a
normed linear space $K$, then $K^*$ is a Banach space.
Proof. $T \in K^*$, $\|T\| = 0$ implies that $T(x) = 0$ for all $x \in K$ which
means that $T$ is the null transformation,
$$\|T_1 + T_2\| = \sup_{\|x\| = 1} |T_1(x) + T_2(x)| \le \sup_{\|x\| = 1} |T_1(x)| + \sup_{\|x\| = 1} |T_2(x)| = \|T_1\| + \|T_2\|$$
and
$$\|\alpha T\| = \sup_{\|x\| = 1} |\alpha T(x)| = |\alpha| \sup_{\|x\| = 1} |T(x)| = |\alpha| \, \|T\|.$$
This shows that $K^*$ is a normed linear space with the norm
$$\|T\| = \sup_{\|x\| = 1} |T(x)|.$$
It remains to show that $K^*$ is complete. Suppose $\{T_n\}$ is a sequence
in $K^*$ such that
$$\|T_m - T_n\| \to 0 \quad \text{as } m, n \to \infty.$$
Then, for each $x \in K$, $|T_m(x) - T_n(x)| \to 0$ as $m, n \to \infty$. The
completeness of $R$ now implies that there is a real number
$$y = T(x) = \lim_{n \to \infty} T_n(x).$$
Now $T$ is clearly a linear transformation and since
$$\bigl| \|T_m\| - \|T_n\| \bigr| \le \|T_m - T_n\|,$$
the real sequence $\{\|T_n\|\}$ must be bounded, say by $C$. Then
$$|T_n(x)| \le C \|x\| \quad \text{for all } x \in K$$
and all integers $n$, so that $|T(x)| \le C \|x\|$ and $T$ is a bounded linear
functional, that is $T \in K^*$. Given $\epsilon > 0$, there is an integer $N = N(\epsilon)$
such that
$$|T_m(x) - T_n(x)| < \epsilon \quad \text{for } \|x\| = 1, \ x \in K, \ m, n \ge N.$$
If we let $m \to \infty$, then
$$|T(x) - T_n(x)| \le \epsilon \quad \text{for } \|x\| = 1 \ (n \ge N),$$
so that $\|T - T_n\| \to 0$ as $n \to \infty$, and $K^*$ is complete. ∎
Conjugate space
For any normed linear space K (in particular if K is a Banach
space), the Banach space K* of bounded linear functionals on K is
called the conjugate space (or dual space) of K.
Linear subspace
A set H contained in a linear space K such that H is itself a linear
space is called a linear subspace of K.
If $K$ contains a point $x \neq 0$, it is clear that the set of all points
$\alpha x$, $\alpha \in R$ is a linear subspace $H$ of $K$. Then, if we put
$$T(\alpha x) = \alpha \quad \text{for all } \alpha x \in H,$$
it is immediate that $T \neq 0$ is a linear functional on $H$. However, it is
not immediately obvious that the set of bounded linear functionals
defined on the whole of $K$ contains any $T \neq 0$. The existence of such a
non-trivial $T$ will follow if we can prove that linear functionals defined
on a subspace can always be extended to the whole of $K$.
Theorem 8.6 (Hahn-Banach extension). Suppose $K$ is a linear subspace
of a normed linear space $H$. Then any bounded linear functional on $K$ can be
extended to a bounded linear functional on $H$ with the same norm.
Proof. Suppose $f: K \to R$ is the given functional and
$$\alpha = \sup_{x \in K, \, x \neq 0} \frac{|f(x)|}{\|x\|}.$$
Then $|f(x)| \le \alpha \|x\|$ for all $x$ in $K$. Consider the class $\mathcal{C}$ of all linear
functionals $T$ defined on spaces $J$ such that (i) $K \subset J \subset H$; (ii)
$T(x) = f(x)$ for $x \in K$; (iii) $T(x) \le \alpha \|x\|$ for $x \in J$. We can partially order $\mathcal{C}$
by putting $g_1 < g_2$ if $g_1$ is defined on $J_1$, $g_2$ on $J_2$, $K \subset J_1 \subset J_2 \subset H$ and
$$g_1(x) = g_2(x) = f(x) \ \text{for } x \in K, \qquad g_1(x) = g_2(x) \ \text{for } x \in J_1.$$
By Zorn's lemma (§ 1.6) we can find a maximal element in this partial
ordering. This must be an extension $T$ of $f$ defined on a subspace
$J \subset H$ such that no further extension to a larger subspace is possible.
It is clearly sufficient to show that, for this maximal $T \in \mathcal{C}$, we must
have $J = H$.
Suppose not, then there is a point $z \in H - J$. We will obtain a
contradiction by showing that $T$ can be extended to the linear space
$J_z$ consisting of all points of the form $j + az$, $j \in J$, $a \in R$. Note first that,
since $z \notin J$, the representation $(j + az)$ is unique. The extension to
$J_z$ will therefore be determined by its value at $z$. Now if $x, y \in J$,
$$T(x) - T(y) = T(x - y) \le \alpha \|x - y\| \le \alpha \|x + z\| + \alpha \|-y - z\|$$
so that
$$-\alpha \|-y - z\| - T(y) \le \alpha \|x + z\| - T(x).$$
Hence
$$\sup_{y \in J} [-\alpha \|-y - z\| - T(y)] \le \inf_{x \in J} [\alpha \|x + z\| - T(x)].$$
Let $t$ be any real number satisfying
$$\sup_{y \in J} [-\alpha \|-y - z\| - T(y)] \le t \le \inf_{x \in J} [\alpha \|x + z\| - T(x)], \qquad (8.4.3)$$
and put $T(z) = t$: this implies
$$T(k + az) = T(k) + at \quad \text{for } k \in J.$$
Now replace both $x$ and $y$ by $x/a$ ($a \neq 0$) in (8.4.3) and let $w = x + az$:
$$-\alpha \Bigl\| -\frac{w}{a} \Bigr\| - T\Bigl(\frac{x}{a}\Bigr) \le t \le \alpha \Bigl\| \frac{w}{a} \Bigr\| - T\Bigl(\frac{x}{a}\Bigr).$$
If $a > 0$ multiply the right-hand inequality by $a$, while if $a < 0$
multiply the left-hand inequality by $a$. Both cases give
$$\alpha \|w\| - T(x) \ge at$$
so that $T(w) \le \alpha \|w\|$
and $T \in \mathcal{C}$. Since $J$ is a proper subset of $J_z$ this establishes the existence
of the extension. To see that the extension $F$ has the same norm as
$f$ we need only note that $|F(x)| \le \alpha \|x\|$ for all $x$ in $H$ so that
$$\|F\| \le \alpha = \|f\|;$$
and
$$\|F\| = \sup_{x \in H, \, \|x\| = 1} |F(x)| \ge \sup_{x \in K, \, \|x\| = 1} |f(x)| = \|f\|. ∎$$
Remark. In the above theorem, the only property of the norm which
we used was that
$$\|x + y\| \le \|x\| + \|y\| \quad \text{for all } x, y \in H.$$
It is possible to state and prove the extension theorem in terms of any
subadditive bounding functional. This gives
Theorem 8.6A (Hahn-Banach extension). Suppose $K$ is a linear
subspace of a linear space $H$, $p$ is a subadditive functional on $H$ such
that $p(\alpha x) = \alpha p(x)$ for $\alpha \ge 0$, $x \in H$; and $f$ is a linear functional on $K$ such
that $f(x) \le p(x)$ for all $x \in K$. Then there is a linear functional $\tilde{f} : H \to R$
such that
$$\tilde{f}(x) = f(x) \ \text{for } x \in K, \qquad \tilde{f}(x) \le p(x) \ \text{for } x \in H.$$

Exercises 8.4
1. If $(\Omega, \mathcal{F}, \mu)$ is such that there are two sets $E_1, E_2 \in \mathcal{F}$ with $\mu(E_1)$,
$\mu(E_2)$ positive and finite, show by considering the indicator functions of
$E_1$, $E_2$ that $\mathcal{L}_p(\Omega, \mathcal{F}, \mu)$ does not satisfy (8.4.1) if $p \neq 2$; and therefore is not a
Hilbert space.

2. Prove that $m$, the set of bounded real sequences $\{x_i\}$, is a Banach space
with $\|x\| = \sup |x_i|$.
3. Suppose $K$ is a Banach space, $K^*$ is its dual, and $K^{**}$ is the dual of
$K^*$. Prove:
(i) if $x$ is a fixed element in $K$, $X(f) = f(x)$ for $f \in K^*$ defines a linear
functional on $K^*$;
(ii) for the above functional $\|X\| = \|x\|$, so that $T(x) = X$ is a norm
preserving map from $K$ to $K^{**}$;
(iii) this map $T$ preserves the linear structure;
(iv) the set of elements $X$ of $K^{**}$ such that $X(f) = f(x)$ for some $x \in K$
forms a closed linear subspace of $K^{**}$.

4. If $K$ is a linear subspace of a Banach space $H$ show that a point $y$
of $H$ is in the closure of $K$ if and only if $f(y) = 0$ for every linear functional
$f \in H^*$ which vanishes on $K$.

5. In §4.4 we showed that it is not possible to define a measure on $[0, 1)$
which is defined for all subsets and invariant for translations (mod 1).
The following steps will show that we can define such a finitely additive
set function on all subsets of $[0, 1)$ with $\nu[0, 1) = 1$ and $\nu(E) = |E|$ when
$E$ is Lebesgue measurable.
(i) Let $H$ be the set of all bounded functions $f: [0, 1) \to R$ which are
extended to be periodic in the whole of $R$ by $f(x + 1) = f(x)$. Prove $H$ is
a linear space.
(ii) Put
$$M(f; a_1, a_2, \ldots, a_n) = \sup_x \frac{1}{n} \sum_{i=1}^{n} f(x + a_i),$$
$p(f) = \inf M(f; a_1, \ldots, a_n)$ for all such finite sequences of real $a_i$. Prove that
$p$ is subadditive and $p(\alpha f) = \alpha p(f)$ for $\alpha \ge 0$.
(iii) If $f: [0, 1) \to R$ is bounded and Lebesgue measurable, show that the
Lebesgue integral $\mathcal{I}(f) \le M(f; a_1, \ldots, a_n)$.
(iv) Show that the set of bounded measurable $f : [0, 1) \to R$ is a linear
subspace of $H$, and $\mathcal{I}(f)$ is a bounded linear functional on this subspace.
(v) Use theorem 8.6 A to extend $\mathcal{I}$ to a linear functional $F$ defined on
all of $H$.
(vi) Show that $F\{f(x + x_0)\} = F\{f(x)\}$ for all $x_0 \in R$.
(vii) By considering indicator functions of subsets of $[0, 1)$, put
$\nu(E) = F(\chi_E)$ for $E \subset [0, 1)$.
Prove $\nu(E_1 \cup E_2) = \nu(E_1) + \nu(E_2)$ if $E_1$, $E_2$ are disjoint,
$\nu(E) = |E|$ if $E$ is Lebesgue measurable,
$\nu(E)$ is invariant under translations.
6. Similar arguments to those used in (5) above can be applied to the
linear space $V$ of bounded real functions $f : [0, +\infty) \to R$. Show that there
exists a bounded linear functional $\operatorname{Lim} f(x)$ on $V$ such that, for $a, b \in R$,
(i) $\operatorname{Lim} \{a f(x) + b g(x)\} = a \operatorname{Lim} f(x) + b \operatorname{Lim} g(x)$;
(ii) $f(x) \ge 0$ in $[0, \infty) \Rightarrow \operatorname{Lim} f(x) \ge 0$;
(iii) $\operatorname{Lim} f(x + x_0) = \operatorname{Lim} f(x)$ for any $x_0 \ge 0$;
(iv) $\operatorname{Lim} f(x) = \lim_{x \to \infty} f(x)$ if this exists.
Deduce a corresponding result for the space $m$ of bounded real sequences.

8.5* The space conjugate to 2p

We have seen that $\mathcal{L}_p$ ($1 \le p < +\infty$) is a Banach space and, a
fortiori, a normed linear space. It follows from the lemma of §8.4 on the
conjugate space $K^*$ that
the space of bounded linear functionals on $\mathcal{L}_p$ is also a Banach space.
The object of the present section is to identify these conjugate spaces
at least up to an isomorphism.
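Theorem 8.7. Suppose $(\Omega, \mathcal{F}, \mu)$ is a $\sigma$-finite measure space and $\mathcal{L}_p$,
$1 \le p < \infty$ is the linear space of $\mathcal{F}$-measurable functions $f : \Omega \to R^*$
whose $p$th power is integrable, with the usual norm
$$\|f\|_p = \Bigl( \int |f|^p \, d\mu \Bigr)^{1/p}.$$
Let $1/p + 1/q = 1$ (if $p = 1$, $q = \infty$). Then
(i) for each $g \in \mathcal{L}_q$,
$$F(f) = \int fg \, d\mu$$
defines a bounded linear functional on $\mathcal{L}_p$;
(ii) given any bounded linear functional $F$ on $\mathcal{L}_p$, there is a $g \in \mathcal{L}_q$
such that
$$F(f) = \int fg \, d\mu,$$
and in this case
$$\|F\| = \Bigl( \int |g|^q \, d\mu \Bigr)^{1/q} \ \text{if } p > 1, \qquad \|F\| = \operatorname{ess\,sup} |g| \ \text{if } p = 1.$$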
Proof. (i) This follows immediately from theorem 7.7 and the linear-
ity properties of the integral; for

IF(f)I <
(ii) Suppose first that $\mu(\Omega) < \infty$ and F is a linear functional on
$\mathcal{L}_p$. For any measurable $E \subset \Omega$ put
$$\sigma(E) = F(\chi_E),$$
where $\chi_E$ is the indicator function of E. The linearity of F implies
immediately that $\sigma$ is finitely additive. Now suppose $E = \bigcup_{i=1}^{\infty} E_i$,
$E_i$ disjoint. Then $\mu\left(\bigcup_{i=1}^{N} E_i\right) \to \mu(E)$ as $N \to \infty$, so that in $\mathcal{L}_p$,
$$\|\chi_{Q_N} - \chi_E\| \to 0, \quad \text{where } Q_N = \bigcup_{i=1}^{N} E_i.$$
Since F is continuous, we must have
$$\sum_{i=1}^{\infty} \sigma(E_i) = \lim_{N \to \infty} \sum_{i=1}^{N} \sigma(E_i) = \sigma(E)$$
so that $\sigma$ is completely additive on $\mathcal{F}$. Further $\mu(E) = 0 \Rightarrow \sigma(E) = 0$,
so that $\sigma$ is absolutely continuous with respect to $\mu$. By theorem 6.7
it now follows that there exists a function g which is integrable, such that
$$F(\chi_E) = \sigma(E) = \int_E g \, d\mu = \int \chi_E \, g \, d\mu \quad \text{for all } E \in \mathcal{F}.$$
This gives the required representation for F on the class of indicator
functions of measurable sets. We must prove that $g \in \mathcal{L}_q$, and that the
representation is valid on the whole of $\mathcal{L}_p$.
It is clear from linearity that the representation is valid for $\mathcal{F}$-simple
functions. If $f_0 \in \mathcal{L}_p$, $f_0 \ge 0$, we can find a sequence $f_n$ of
simple functions which increases monotonely to $f_0$. Then by theorem
7.6, $f_n \to f_0$ in pth mean, and by the continuity of F
$$F(f_0) = \lim F(f_n) = \lim \int f_n g \, d\mu = \int f_0 g \, d\mu$$
on applying the monotone convergence theorem to $\{f_n g_+\}$ and $\{f_n g_-\}$
separately. The restriction $f_0 \ge 0$ can be removed by considering $f_+$
and $f_-$ separately, so that
$$F(f) = \int f g \, d\mu \quad \text{for all } f \in \mathcal{L}_p.$$
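Now suppose p > 1, and g(t) has been obtained by the above process. Put
$$g_n(t) = \begin{cases} |g(t)|^{q-1} \operatorname{sign} g(t) & \text{if } |g(t)|^{q-1} \le n, \\ n \operatorname{sign} g(t) & \text{if } |g(t)|^{q-1} > n. \end{cases}$$
Then each $g_n$ is bounded and measurable, and is therefore in $\mathcal{L}_p$. Thus
$$|F(g_n)| = \left|\int g_n g \, d\mu\right| \le \|F\| \, \|g_n\| = \|F\| \left(\int |g_n|^p \, d\mu\right)^{1/p}.$$
But $g_n g = |g_n| \, |g| \ge |g_n| \, |g_n|^{1/(q-1)} = |g_n|^p$,
so that
$$\int |g_n|^p \, d\mu \le \|F\| \left(\int |g_n|^p \, d\mu\right)^{1/p}$$
and
$$\left(\int |g_n|^p \, d\mu\right)^{1/q} \le \|F\|.$$
But $|g_n|^p \to |g|^q$ a.e. so that, by theorem 5.5,
$$\left(\int |g|^q \, d\mu\right)^{1/q} \le \|F\|. \tag{8.5.1}$$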

Before going on to prove equality in (8.5.1), let us now remove
the restriction $\mu(\Omega) < \infty$. Suppose $\mu(\Omega) = \infty$, so that there is a
sequence $\{\Omega_n\}$ of disjoint measurable sets with
$$\Omega = \bigcup_{n=1}^{\infty} \Omega_n, \quad \mu(\Omega_n) < \infty \text{ for all } n.$$
We can apply the above argument to each of the spaces $(\Omega_n, \mathcal{F}_n, \mu)$
where $\mathcal{F}_n = \mathcal{F} \cap \Omega_n$. By the uniqueness of the derivative g in the
Radon-Nikodym theorem, if $f \in \mathcal{L}_p$ and vanishes outside
$R_n = \bigcup_{i=1}^{n} \Omega_i$, then
$$F(f) = \int f g \, d\mu.$$
But if $f \ge 0$, we can apply the monotone convergence theorem to each
of $\{f_n g_+\}$, $\{f_n g_-\}$ where $f_n = f \chi_{R_n}$ to obtain this representation by
using the continuity of F. The final step is to use $f = f_+ - f_-$ so that
the representation is valid on all of $\mathcal{L}_p$. Further (8.5.1) follows since
it is true for the integral over each $R_n$.
Now by Hölder's inequality (theorem 7.7) we have
$$|F(f)| \le \|f\|_p \left(\int |g|^q \, d\mu\right)^{1/q}$$
so that $\|F\| = \left(\int |g|^q \, d\mu\right)^{1/q}$ using (8.5.1).

We need to modify the argument in the case p = 1, assuming that
g has been defined as before as the Radon-Nikodym derivative of $\sigma$.
For any t > 0, let E be a set such that $0 < \mu(E) < \infty$ and $|g(x)| \ge t$
for $x \in E$. Put $f(x) = \chi_E(x) \operatorname{sign} g(x)$ and we have
$$F(f) \ge t\mu(E), \quad \|f\| = \mu(E)$$
so $\|F\| \ge t$. Since such a set E can be found for any $t < \operatorname{ess\,sup} |g|$
we must have
$$\|F\| \ge \operatorname{ess\,sup} |g|.$$
But
$$|F(f)| = \left|\int f g \, d\mu\right| \le (\operatorname{ess\,sup} |g|) \, \|f\|_1,$$
so that $\|F\| \le \operatorname{ess\,sup} |g|$.
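
The case p > 1 of theorem 8.7 can be checked numerically when the underlying set is finite and carries counting measure, so that every integral becomes a finite sum (the setting of exercise 8.5 (2), truncated to finitely many terms). The following Python sketch is an illustration under that assumption only: the helper names `lp_norm`, `functional` and `extremal`, and the sample vector `g`, are hypothetical choices and not part of the text. It evaluates F on the function $|g|^{q-1} \operatorname{sign} g$ used in the proof and compares $|F(f)| / \|f\|_p$ with $\|g\|_q$.

```python
import math

def lp_norm(x, p):
    """p-norm of a finite sequence (counting measure on a finite set)."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def functional(g, f):
    """F(f) = sum of f_i * g_i, the integral of f*g for counting measure."""
    return sum(a * b for a, b in zip(f, g))

def extremal(g, q):
    """f = |g|^(q-1) sign g, the function used in the proof to attain the norm."""
    return [math.copysign(abs(t) ** (q - 1), t) for t in g]

p = 3.0
q = p / (p - 1.0)            # conjugate index: 1/p + 1/q = 1
g = [2.0, -1.0, 0.5, 3.0]    # illustrative data, an assumption of this sketch
f = extremal(g, q)
ratio = functional(g, f) / lp_norm(f, p)   # |F(f)| / ||f||_p
norm_g = lp_norm(g, q)                     # ||g||_q
```

Up to rounding, `ratio` and `norm_g` agree, which reflects the equality $\|F\| = \|g\|_q$ in this finite setting; for any other f, Hölder's inequality keeps the quotient below `norm_g`.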

Corollary. If H is a separable Hilbert space:
(i) for any fixed $h \in H$, the inner product $F(f) = (f, h)$ defines a bounded
linear functional;
(ii) for any bounded linear functional F on H, there is an $h \in H$ such
that $F(f) = (f, h)$ for all $f \in H$: further $\|F\| = \|h\|$.
Proof. Choose a measure space $(\Omega, \mathcal{F}, \mu)$ such that $\mathcal{L}_2(\Omega, \mu)$ is a
separable Hilbert space, and so is isomorphic to H. Now apply the
theorem in the case p = q = 2.
Note. One can also construct a direct proof of the Corollary without
the restriction that H be separable; the case p = q = 2 of theorem
8.7 could then be deduced from this.
Reflexive Banach space
In exercise 8.4 (3) we proved that, for any Banach space H, $H^{**} \supset H$
in the sense that H is isomorphic to a Banach subspace of $H^{**}$.
Those Banach spaces H for which $H = H^{**}$ are called reflexive.
By our representation theorem 8.7, $\mathcal{L}_p$ is reflexive for $1 < p < \infty$.
In general, $\mathcal{L}_1$ is not reflexive because $\mathcal{L}_\infty^*$ is bigger than $\mathcal{L}_1$: this will
follow from exercises 8.5 (3, 4). In fact very little is known about the
structure of $\mathcal{L}_\infty^*$: the difficulty is that the axiom of choice, or something
equivalent, is needed to construct $\mathcal{L}_\infty^*$ and this makes it impossible
to get a hold on it.

Exercises 8.5
1. Suppose $1 < p \le +\infty$, $1/p + 1/q = 1$ and $f_n \to f$ in $\mathcal{L}_p$ norm, $g_n \to g$
in $\mathcal{L}_q$ norm. Deduce that
$$\int f_n g_n \, d\mu \to \int f g \, d\mu.$$
2. If $\Omega$ is the set of positive integers and $\mu$ is counting measure, then
$\mathcal{L}_p$ ($1 \le p < \infty$) reduces to the set of sequences $\{x_i\}$ of real numbers such
that $\sum |x_i|^p < \infty$; $\mathcal{L}_\infty$ reduces to the set m of bounded sequences.
3. Let X = [-1, 1], $\mu$ Lebesgue measure. Show that the collection $\mathcal{C}$
of continuous functions $f : X \to \mathbf{R}$ is a closed linear subspace of $\mathcal{L}_\infty$ (provided
any function f which is equal a.e. to a continuous function is identified
with it). Hence, by theorem 8.6, extend the bounded linear functional
F(f) = f(0) from $\mathcal{C}$ to $\mathcal{L}_\infty$ without changing its norm. If possible, suppose
there is an $f_0$ which is integrable and such that
$$F(f) = \int f f_0 \, d\mu \quad \text{for } f \in \mathcal{L}_\infty.$$
Then, for the special sequence
$$f_n(x) = (1 - |x|)^n,$$
we have $F(f_n) = 1$ for all n. Show that, for any $f_0 \in \mathcal{L}_1$,
$$\int f_n f_0 \, d\mu \to 0.$$
4. Extend example (3) to show that if $\Omega$ contains a disjoint sequence
of measurable sets of finite positive measure, then $\mathcal{L}_1(\Omega, \mu)$ is a proper
subspace of $\mathcal{L}_\infty^*$. Deduce that $\mathcal{L}_1$ is not reflexive.

8.6* Mean ergodic theorem

In §7.6 we obtained the point-wise ergodic theorem for functions
in $\mathcal{L}_1$. If the function is in $\mathcal{L}_2$ there is an alternative form of this
theorem in which point-wise convergence is replaced by convergence
in second mean. We saw that any $\mathcal{L}_2$ is a Hilbert space. A measure
preserving transformation T on the underlying measure space then
leads naturally to a mapping on the Hilbert space to itself which
preserves the inner product (and norm). It is therefore possible to
state the mean ergodic theorem in terms of the properties of such a
mapping in Hilbert space, and deduce the $\mathcal{L}_2$ theorem by considering
this as a realization of Hilbert space. However, we choose instead to
state and prove it directly as a theorem about the structure of $\mathcal{L}_2$.
It helps if we first show that bounded linear functionals on
a Banach space can be used to separate a closed linear subspace K
from a point not in K (see exercise 8.4 (4)).
Theorem 8.8. Suppose K is a linear subspace of a Banach space H,
and $y \in H$ with $d(y, K) = \eta > 0$. Then there is a bounded linear functional
F on H such that $\|F\| = 1$, $F(y) = \eta$, $F(x) = 0$ for all $x \in K$.
Proof. Let J be the set of points of H of the form
$$x = z + ay, \quad z \in K, \ a \in \mathbf{R}.$$
Then J is a linear subspace of H and the representation of points of
J in this form is unique. Define a linear functional f on J by
$$f(z + ay) = a\eta.$$
Then f vanishes on K and, for $a \ne 0$,
$$\|z + ay\| = |a| \left\|\frac{z}{a} + y\right\| \ge |a| \eta = |f(z + ay)|,$$
so that $\|f\| \le 1$. But if $\{z_n\}$ is a sequence in K for which $\|z_n - y\| \to \eta$
we have
$$\|f\| \, \|z_n - y\| \ge |f(z_n - y)| = |f(z_n) - f(y)| = |f(y)| = \eta$$
so that $\|f\| \ge 1$, on letting $n \to \infty$. Hence $\|f\| = 1$, and f has all the
desired properties except that it is only defined on J, a linear subspace
of H. Use theorem 8.6 to extend it to a linear functional F on the
whole of H with $\|F\| = \|f\| = 1$.

Corollary. If $(\Omega, \mathcal{F}, \mu)$ is a $\sigma$-finite measure space, and K is a closed
linear subspace of $\mathcal{L}_2(\Omega, \mu)$, and $y \in \mathcal{L}_2 - K$, then $y = z + x$ where
$z \in K$ and $(x, w) = 0$ for all $w \in K$.
Proof. $\mathcal{L}_2(\Omega, \mu)$ is a Banach space, and K is closed (in the metric
$\rho_2$) so that $d(y, K) = \eta > 0$. Find the functional F satisfying the
conditions of theorem 8.8 and represent it, by theorem 8.7, as
$$F(u) = (u, g) \quad \text{where } g \in \mathcal{L}_2.$$
Now put $x = \eta g$, $z = y - x$ so that
$$(x, w) = \eta F(w) = 0 \quad \text{for all } w \in K.$$
It only remains to show that $z \in K$. For $\epsilon > 0$ choose $k \in K$ such that
$$\|k - y\|^2 = (k - y, k - y) < \eta^2 + \epsilon.$$
Then
$$\|k - z\|^2 = (k - y, k - y) + 2(x, k - y) + (x, x)$$
$$= \|k - y\|^2 + 2\eta (g, k - y) + \eta^2 \|g\|^2$$
$$= \|k - y\|^2 - 2\eta F(y) + \eta^2 \|F\|^2$$
$$= \|k - y\|^2 - \eta^2 < \epsilon,$$
so that there are points of K arbitrarily close to z, and we must have
$z \in K$, since K is closed.
Let us remind ourselves of the conditions under which we established
theorem 7.9. $(\Omega, \mathcal{F}, \mu)$ is a $\sigma$-finite measure space, and T is a
measure preserving transformation from $\Omega$ to itself. $T^k$ is the result
of repeating the transformation k times ($T^0$ is the identity map). For
an $\mathcal{F}$-measurable function f which is finite a.e. we consider the
sequence of means
$$g_n(x) = \frac{1}{n} \sum_{i=0}^{n-1} f(T^i x). \tag{8.6.1}$$
Theorem 8.9. If $(\Omega, \mathcal{F}, \mu)$ and T satisfy the conditions in theorem
7.9, $f \in \mathcal{L}_2(\Omega, \mu)$, and $g_n$ is defined by (8.6.1) then $\{g_n\}$ is a Cauchy
sequence in second mean. Its limit (in second mean) $f^*$ satisfies
(i) $f^*$ is invariant under T, that is
$f^*(Tx) = f^*(x)$ a.e.;
(ii) $\|f^*\| \le \|f\|$;
(iii) for any function g in $\mathcal{L}_2$ which is invariant under T, $(g, f^*) = (g, f)$.
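Proof. (a) Suppose first that f is such that there is an $h \in \mathcal{L}_2$ such that
$$f(x) = h(Tx) - h(x) \quad \text{a.e.}$$
Then, since the sum telescopes,
$$g_n(x) = \frac{1}{n} \sum_{i=0}^{n-1} f(T^i x) = \frac{1}{n} [h(T^n x) - h(x)]$$
so that $\|g_n\| \le 2\|h\|/n \to 0$ as $n \to \infty$.
(b) Now suppose f is the limit (in second mean) of a sequence $\{f_k\}$
such that, for each k, $f_k(x) = h_k(Tx) - h_k(x)$ with $h_k \in \mathcal{L}_2$. Then
$$\|g_n\| \le \left\|\frac{1}{n} \sum_{i=0}^{n-1} \{f(T^i x) - f_k(T^i x)\}\right\| + \left\|\frac{1}{n} \sum_{i=0}^{n-1} f_k(T^i x)\right\|$$
$$\le \frac{1}{n} \sum_{i=0}^{n-1} \|f(T^i x) - f_k(T^i x)\| + \left\|\frac{1}{n} \sum_{i=0}^{n-1} f_k(T^i x)\right\|$$
$$\le \|f - f_k\| + \frac{2}{n} \|h_k\|;$$
so that we can make $\|g_n\| < \epsilon$ by first choosing $f_k$ with $\|f - f_k\| < \tfrac{1}{2}\epsilon$ and
then making n large.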
The class of $f \in \mathcal{L}_2$ which satisfy either (a) or (b) is clearly a closed
linear subspace K of $\mathcal{L}_2$. By the corollary to theorem 8.8, any $f \in \mathcal{L}_2$
can be written uniquely as
$$f = f_1 + f_2 \quad \text{where } f_1 \in K,$$
and $(f_2, Th - h) = 0$ for all $h \in \mathcal{L}_2$, where $Th$ denotes the function $h(Tx)$.
Now
$$0 = (f_2, Th - h) = (f_2, Th) - (f_2, h) = (T^{-1} f_2, h) - (f_2, h) = (T^{-1} f_2 - f_2, h) \quad \text{for all } h \in \mathcal{L}_2,$$
and in particular when h is the indicator function of a measurable set E
of finite measure. Hence $T^{-1} f_2 = f_2$ a.e. so that $f_2$ is invariant under T.
Hence
$$\frac{1}{n} \sum_{i=0}^{n-1} f_2(T^i x) = f_2(x) \quad \text{a.e. for all } n,$$
so that $f^* = f_2$ is the limit in second mean of $\{g_n\}$. Thus (i) and (ii) are
proved. To prove (iii), suppose g is invariant under T; then
$$(T^i f, g) = (f, T^{-i} g) = (f, g)$$
so that $(g_n, g) = (f, g)$ for each n and the result follows on letting
$n \to \infty$ since the inner product is continuous in the norm topology.
Corollary. Under the conditions of theorem 8.9, if T is ergodic, then
the limit (in second mean) $f^* = c$ a.e. Also
(i) if $\mu(\Omega) = \infty$, then c = 0,
(ii) if $\mu(\Omega) < \infty$, then $\int f^* \, d\mu = \int f \, d\mu$.
Proof. The only invariant functions are constants so (i) of the
theorem implies that $f^* = c$ a.e. Now if $\mu(\Omega) = \infty$, we have $\|f^*\|$ finite,
so c = 0. If $\mu(\Omega) < \infty$, then the function g(x) = 1 is in $\mathcal{L}_2$ and is
invariant so that
$$(1, f^*) = \int f^* \, d\mu = (1, f) = \int f \, d\mu.$$
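
A very small instance of theorem 8.9 and its corollary can be computed directly. The sketch below assumes the space is the finite set {0, ..., m-1} with counting measure and T is the cyclic shift $Tx = x + 1 \pmod m$, which preserves the measure and is ergodic; these choices, the function names `ergodic_mean` and `l2_norm`, and the sample values of `f` are illustrative assumptions, not taken from the text. It forms the means (8.6.1) and measures their distance in second mean from the constant c predicted by part (ii) of the corollary.

```python
import math

def ergodic_mean(f, n):
    """g_n(x) = (1/n) * sum_{i<n} f(T^i x) for the shift T x = x + 1 (mod m)."""
    m = len(f)
    return [sum(f[(x + i) % m] for i in range(n)) / n for x in range(m)]

def l2_norm(u):
    """Norm in L_2 for counting measure on {0, ..., m-1}."""
    return math.sqrt(sum(t * t for t in u))

f = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0]   # illustrative values
c = sum(f) / len(f)                          # constant limit: integral of f over measure of the space
errors = [l2_norm([g - c for g in ergodic_mean(f, n)]) for n in (1, 10, 100, 1000)]
```

The entries of `errors` are the distances $\|g_n - c\|$ for n = 1, 10, 100, 1000; in this example they shrink like a constant over n, in line with the estimate in part (a) of the proof.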
Exercises 8.6

1. If $\mu(\Omega) < \infty$, $f \in \mathcal{L}_p(\Omega, \mu)$, $1 \le p < \infty$ and T is a measure preserving
transformation, show that $g_n$, defined by (8.6.1), converges a.e. to a limit
function $f^* \in \mathcal{L}_p$ such that $\rho_p(g_n, f^*) \to 0$. (This gives a simpler proof of
theorem 8.9 for the case $\mu(\Omega) < \infty$.)
Prove that (ii) of the corollary to theorem 8.9 is valid without the condition
that T be ergodic.
2. Suppose X is an open subset of $\mathbf{R}^k$ of finite Lebesgue measure and
$T : X \to X$ preserves Lebesgue measure and is ergodic. Show that, for
almost all $x \in X$, the sequence $\{T^n x\}$ (n = 1, 2, ...) is dense in X.

In the present book most of the theory of measure and integration
has been developed in abstract spaces, and we have used the properties
of special spaces only to illustrate the general theory. The present
chapter, apart from § 9.4, is devoted to a discussion of properties which
depend essentially on the structure of the space.
The first question considered is that of point-wise differentiation.
In the Radon-Nikodym theorem 6.7 we defined the derivative $d\mu/d\nu$
of one measure with respect to another for suitable measures $\mu$, $\nu$:
but the point function $d\mu/d\nu$ obtained is only determined in the sense
that the equivalence class of functions equal almost everywhere is
uniquely defined. This means that at no single point (except for those
points which form sets of positive measure) is the derivative defined by
the Radon-Nikodym theorem. In order to define $d\mu/d\nu$ at a point x,
the local topological structure of the space near x has to be considered.
It is possible to develop this local differentiation theory in fairly
general spaces, but only at the cost of complicated and rather un-
natural additional conditions: we have decided instead to give the
detailed theory only in the space R of real numbers where the term
derivative has a clear elementary meaning.
There are several ways of defining an integral with properties similar
to those obtained in Chapter 5. So far in this book we have con-
sidered definitions which start from a given measure defined on
a suitable class of sets. In § 9.4 we describe the Daniell integral and
show that, under suitable conditions this can be obtained in terms of
a measure. Then, for locally compact spaces, we discuss positive
linear functionals on the space Cg of real-valued continuous functions
which vanish outside a compact set, and show that these also corre-
spond to integrals with respect to a suitable measure.
The final section of the chapter is devoted to the definition of Haar
measure in topological spaces which have the algebraic structure
of a group and in which the group operation is continuous. The
details are given only for locally compact metric groups.

9.1 Differentiating a monotone function

We say that $f : I \to \mathbf{R}$, where I is an open interval in $\mathbf{R}$ (that is, a
set of the form (a, b) with $a, b \in \mathbf{R}^*$), is monotone increasing, if
$$x_1, x_2 \in I, \ x_1 < x_2 \Rightarrow f(x_1) \le f(x_2).$$
At a given point x in I the function $f : I \to \mathbf{R}$ may not be differentiable,
but the four numbers
$$D^+ f(x) = \limsup_{h \to 0+} \frac{f(x + h) - f(x)}{h}, \qquad D^- f(x) = \limsup_{h \to 0+} \frac{f(x) - f(x - h)}{h},$$
$$D_+ f(x) = \liminf_{h \to 0+} \frac{f(x + h) - f(x)}{h}, \qquad D_- f(x) = \liminf_{h \to 0+} \frac{f(x) - f(x - h)}{h}$$
are always uniquely determined in the extended real number system
$\mathbf{R}^*$. These numbers are called the derivates of f at x. We say that f
is differentiable at x if
$$D^+ f(x) = D_+ f(x) = D^- f(x) = D_- f(x) = Df(x) \ne \pm\infty.$$
It is clear that f is differentiable at x if and only if there is a real
number Df(x) such that, given $\epsilon > 0$ there is a $\delta > 0$ for which
$$\left|\frac{f(x + h) - f(x)}{h} - Df(x)\right| < \epsilon \quad \text{if } 0 < |h| < \delta;$$
so that our definition is equivalent to that usually adopted in elementary
texts on real analysis. When f is differentiable at x, we
call Df(x) the derivative of f at x.
If f : I -> R is continuous, but not monotone, it is possible that
it is differentiable at no point x. However, a monotone f : I --> R
must be continuous except at the points in a countable set, and the
monotonicity further implies that there are some points x where the
derivative exists. In fact we prove much more: the set of points x
in I where f is not differentiable turns out to have zero measure.
In order to prove this it is convenient first to obtain a new type of
covering theorem. When in § 2.2 we showed that a bounded closed
interval K in R is compact we started with a covering of K by a
family of open sets and we demanded that all of K be covered by a
finite subfamily. However, in proving compactness we were not in-
terested in economical covering, and the covering sets finally chosen
could overlap. Clearly if we require that the covering sets must not
overlap we can no longer require that all of K be covered. However,
even if we are satisfied with a countable subcovering by disjoint sets of
almost all of K (see exercise 9.1(8)) additional conditions on the
nature of the original covering are essential. A suitable form of these
conditions now follows.

Vitali covering
For a subset $E \subset \mathbf{R}$, a class $\mathcal{J}$ of intervals is said to cover E in the
Vitali sense if, given $x \in E$, $\epsilon > 0$ there is an interval $J \in \mathcal{J}$ with
$x \in J$ and $0 < |J| < \epsilon$.
Theorem 9.1. Suppose $E \subset \mathbf{R}$ has finite Lebesgue outer measure and
is covered in the Vitali sense by a class $\mathcal{J}$ of intervals. Then there is a
countable disjoint subclass $\mathcal{J}_1 \subset \mathcal{J}$ such that
$$\left|E - \bigcup \{J : J \in \mathcal{J}_1\}\right| = 0.$$
Proof. We use |A| to denote the Lebesgue outer measure of A
whether or not A is measurable. There is no harm in assuming that
all the intervals J in $\mathcal{J}$ are closed since $|\bar{I}| = |I|$ for any interval I.
We may further assume without loss of generality that there is an
open set $G \supset E$ with $|G| < \infty$, and that all the intervals of $\mathcal{J}$ are
contained in G.
We choose $\mathcal{J}_1$ by induction as follows. Let $J_1$ be any interval of $\mathcal{J}$.
Suppose we have already chosen disjoint intervals $J_1, J_2, \ldots, J_m$ and let
$s_m$ be the supremum of the lengths of the intervals in $\mathcal{J}$ which do not
intersect any of $J_1, J_2, \ldots, J_m$. Now $s_m \le |G| < \infty$, and if E is not
contained in $\bigcup_{i=1}^{m} J_i$, we must have $s_m > 0$. Thus if E is not already covered,
we can choose $J_{m+1}$ disjoint from $\bigcup_{i=1}^{m} J_i$ with $|J_{m+1}| > \tfrac{1}{2} s_m$. Now the
theorem is proved if $E \subset \bigcup_{i=1}^{m} J_i$ for any finite m. Otherwise we obtain
a sequence $\{J_i\}$ of disjoint sets so that
$$\sum_{i=1}^{\infty} |J_i| \le |G| < \infty.$$
Now suppose, if possible, that
$$\left|E - \bigcup_{i=1}^{\infty} J_i\right| = \delta > 0.$$
We can choose N so that
$$\sum_{i=N+1}^{\infty} |J_i| < \tfrac{1}{5}\delta,$$
and put $F = E - \bigcup_{i=1}^{\infty} J_i$.
F must be non-void and $\bigcup_{i=1}^{N} J_i$ is closed so, for any point x in F, we can find
an interval J of $\mathcal{J}$ containing x and short enough to be disjoint
from $\bigcup_{i=1}^{N} J_i$. This implies $|J| \le s_N < 2|J_{N+1}|$. Since
$$\lim_{n \to \infty} |J_n| = 0,$$
this J must meet at least one of the $J_i$ for i > N. Let k be the smallest
integer for which $J \cap J_k \ne \emptyset$. Then $|J| \le s_{k-1} < 2|J_k|$, so the distance
from x to the mid-point of $J_k$ is at most $|J| + \tfrac{1}{2}|J_k| \le \tfrac{5}{2}|J_k|$, and x
must belong to the interval $H_k$ which has the same centre as $J_k$ and
5 times the length. Thus
$$F \subset \bigcup_{i=N+1}^{\infty} H_i$$
and
$$\delta = |F| \le \sum_{i=N+1}^{\infty} |H_i| = 5 \sum_{i=N+1}^{\infty} |J_i| < \delta,$$
which establishes a contradiction.
Corollary. Under the conditions of theorem 9.1, for each $\epsilon > 0$ there is
a finite set $J_1, J_2, \ldots, J_n$ of disjoint intervals of $\mathcal{J}$ such that
$$\left|E - \bigcup_{i=1}^{n} J_i\right| < \epsilon.$$
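
The selection step in the proof of theorem 9.1 can be tried out on a finite family of closed intervals. The Python sketch below is a simplified illustration, not the construction of the text: it always takes a longest interval disjoint from those already chosen (which certainly satisfies $|J_{m+1}| > \tfrac{1}{2} s_m$), and the names `greedy_disjoint`, `enlarge` and the sample `family` are hypothetical. It then forms the intervals $H_k$ with the same centre and 5 times the length.

```python
def greedy_disjoint(intervals):
    """Pick disjoint closed intervals, always taking a longest one that misses those already chosen."""
    chosen = []
    by_length = sorted(intervals, key=lambda ab: ab[1] - ab[0], reverse=True)
    for a, b in by_length:
        # two closed intervals are disjoint when one lies strictly to the left of the other
        if all(b < c or d < a for c, d in chosen):
            chosen.append((a, b))
    return chosen

def enlarge(interval, factor=5.0):
    """Interval with the same centre and `factor` times the length (H_k in the proof)."""
    a, b = interval
    mid, half = (a + b) / 2.0, factor * (b - a) / 2.0
    return (mid - half, mid + half)

# illustrative family of closed intervals (an assumption of this sketch)
family = [(0.0, 1.0), (0.5, 0.9), (0.95, 2.5), (2.4, 2.6), (3.0, 3.2), (3.1, 3.15)]
chosen = greedy_disjoint(family)
covers = [enlarge(j) for j in chosen]
```

Every interval of `family` that was rejected meets a chosen interval at least as long as itself, so it lies inside the corresponding enlarged interval; this is the finite counterpart of the inclusion of F in the union of the $H_i$ used in the proof.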

Theorem 9.2. Suppose $f : I \to \mathbf{R}$ is monotone increasing. Then the set
E of points x in I for which f is differentiable at x satisfies $|I - E| = 0$.
The derivative $f'$ is Lebesgue measurable, and if $[a, b] \subset I$,
$$\int_a^b f'(x) \, dx \le f(b) - f(a).$$

Proof. It is clearly sufficient to prove the theorem for a finite
closed interval I = [c, d]. The first step is to show that each of the
subsets of I:
$$\{x : D^+ f(x) > D_- f(x)\}, \quad \{x : D^- f(x) > D_+ f(x)\},$$
$$\{x : D^+ f(x) > D_+ f(x)\}, \quad \{x : D^- f(x) > D_- f(x)\},$$
has zero Lebesgue measure. We give the details for the set
$$E = \{x : D^+ f(x) > D_- f(x)\};$$
the proof for the others is similar. Now E is the (countable) union
of sets
$$E_{u,v} = \{x : D^+ f(x) > u > v > D_- f(x)\}$$
over rational pairs u, v. It is therefore sufficient to show that $|E_{u,v}| = 0$
for all pairs u, v with u > v.
Let $t = |E_{u,v}|$ and $\epsilon > 0$. Find an open set $G \supset E_{u,v}$ with
$$|G| < t + \epsilon.$$
For each $x \in E_{u,v}$, there is an arbitrarily small closed interval
$$[x - h, x] \subset G$$
with $f(x) - f(x - h) < vh$.
By theorem 9.1, corollary we can find a finite disjoint collection
$J_1, J_2, \ldots, J_N$, where $J_n = [x_n - h_n, x_n]$,
of such intervals whose interiors cover a subset F of
$E_{u,v}$ with $|E_{u,v} - F| < \epsilon$. If we sum over these intervals
$$\sum_{n=1}^{N} \{f(x_n) - f(x_n - h_n)\} < v \sum_{n=1}^{N} h_n \le v|G| < v(t + \epsilon).$$
But each $y \in F$ is the left-hand end-point of an arbitrarily small interval
[y, y + k] which is contained in one of the $J_i$ (i = 1, 2, ..., N) and
such that
$$f(y + k) - f(y) > uk.$$
Use theorem 9.1 again to find a disjoint collection $K_1, K_2, \ldots, K_P$, where $K_i = [y_i, y_i + k_i]$, of
such intervals which covers a subset H of F with $|H| > t - 2\epsilon$.
Summing over these intervals, since each $K_i$ is contained in a $J_n$ and f is monotone,
$$\sum_{n=1}^{N} \{f(x_n) - f(x_n - h_n)\} \ge \sum_{i=1}^{P} \{f(y_i + k_i) - f(y_i)\} > u \sum_{i=1}^{P} k_i > u(t - 2\epsilon)$$
so that $v(t + \epsilon) > u(t - 2\epsilon)$.
Since u > v and $\epsilon$ is arbitrary, we must have t = 0. Thus for almost
all x in I,
$$g(x) = Df(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$
exists as an element in $\mathbf{R}^*$ (we are thus allowing the value $\pm\infty$ for a
limit). If we put
$$g_n(x) = n\left[f\left(x + \tfrac{1}{n}\right) - f(x)\right] \quad \text{for } x \in [a, b],$$
where we re-define f(x) = f(b) for $x \ge b$, then $g_n(x)$ is defined and
measurable and $g_n(x) \to g(x)$ for almost all x in [a, b] as $n \to \infty$ so that
$g : I \to \mathbf{R}^*$ is Lebesgue measurable if we define it arbitrarily to be zero
on the exceptional set where Df(x) is not defined. By Fatou's lemma
$$\int_a^b g(x) \, dx \le \liminf_{n \to \infty} \int_a^b g_n(x) \, dx = \liminf_{n \to \infty} n \int_a^b \left\{f\left(x + \tfrac{1}{n}\right) - f(x)\right\} dx$$
$$= \liminf_{n \to \infty} \left[n \int_b^{b + 1/n} f(x) \, dx - n \int_a^{a + 1/n} f(x) \, dx\right] \le f(b) - f(a).$$
This shows that the function g is integrable and so finite almost
everywhere. Thus f is differentiable a.e. in [a, b]. Since [a, b] is an
arbitrary subinterval of I, f is differentiable almost everywhere in I.
Functions of bounded variation
A function $f : I \to \mathbf{R}$ is said to be of bounded variation on I if
$$\sum_{i=1}^{n} |f(x_i) - f(x_{i-1})|$$
is bounded above for all ordered finite sequences $x_0 < x_1 < \ldots < x_n$
in I. Clearly if $f : I \to \mathbf{R}$ is of bounded variation on I, it is also of
bounded variation on each interval $J \subset I$. For an ordered sequence
$\sigma = \{x_i\}$, i = 0, 1, ..., n put
$$p(\sigma) = \sum_{i=1}^{n} \max[0, f(x_i) - f(x_{i-1})],$$
$$n(\sigma) = -\sum_{i=1}^{n} \min[0, f(x_i) - f(x_{i-1})],$$
$$t(\sigma) = p(\sigma) + n(\sigma) = \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})|.$$
If $f : [a, b] \to \mathbf{R}$ is of bounded variation on [a, b], put
$$T_a^b = \sup_\sigma t(\sigma), \quad P_a^b = \sup_\sigma p(\sigma), \quad N_a^b = \sup_\sigma n(\sigma),$$
where each of the suprema is taken over all ordered finite sequences
$\sigma$ in [a, b]. It is easy to check that, in this case
$$T_a^b = P_a^b + N_a^b, \quad f(b) - f(a) = P_a^b - N_a^b.$$
Now if $f : [a, b] \to \mathbf{R}$ is of bounded variation on [a, b] we can put
$$g(x) = N_a^x, \quad h(x) = P_a^x \quad \text{for all } x \in [a, b]$$
so that f(x) can be expressed as the difference of two non-decreasing
functions of bounded variation.
Corollary (Lebesgue). A function $f : I \to \mathbf{R}$ which is of bounded
variation on each finite interval $[a, b] \subset I$ must be differentiable at x
for almost all x in I.
Proof. In each finite [a, b] we can express f as the difference of
two monotone increasing functions g and h. Each of these is differentiable
almost everywhere in [a, b] by theorem 9.2. Hence the difference
f is differentiable almost everywhere in [a, b].

Exercises 9.1
1. Show that, if $g : I \to \mathbf{R}$, $h : I \to \mathbf{R}$ are each monotone increasing, then
f = g - h is of bounded variation on each $[a, b] \subset I$.
2. If $f : I \to \mathbf{R}$ is of bounded variation on each $[a, b] \subset I$, show that the
limits f(x + 0), f(x - 0) exist at each interior point of I.
3. If c is an interior point of I and $f : I \to \mathbf{R}$ has a (local) maximum at c,
show that $D^+ f(c) \le 0$, $D_- f(c) \ge 0$.
4. If $f : [a, b] \to \mathbf{R}$ is continuous and $D^+ f(x) \ge 0$ for all x in [a, b), show
that $f(b) \ge f(a)$.
5. Define f(0) = 0,
$f(x) = x^2 \sin x^{-2}$ for $x \ne 0$;
g(0) = 0,
$g(x) = x^2 \sin x^{-1}$ for $x \ne 0$.
Which of the functions f, g is of bounded variation on [-1, 1]?
6. Give an example of a function for which all the four derivates are
different at x = 0.
7. For any Lebesgue measurable $f : I \to \mathbf{R}$, prove that $D^+ f(x)$ is Lebesgue
measurable.
8. Show that theorem 9.1 as stated in $\mathbf{R}$ is false in $\mathbf{R}^n$ for $n \ge 2$.
Hint. Take a Vitali covering of [0, 1] and for each J of the covering consider
$J \times [0, \tfrac{2}{3}]$ and $J \times [\tfrac{1}{3}, 1]$. This will give a covering in the sense of our definition
of the unit square $[0, 1] \times [0, 1]$. Show theorem 9.1 is not satisfied.
(In fact a more complicated construction shows that theorem 9.1 fails
even if we require each point of the set to be covered by an interval J of
arbitrarily small diameter.)
9. Show that theorem 9.1 is true in $\mathbf{R}^n$ for all n if we restrict the covering
to cubes. (In fact it can be shown that it is true if there is a constant K
such that the ratio of the lengths of longest and shortest sides is bounded
by K for the intervals in $\mathcal{J}$.)
10. For the Cantor ternary function $g : [0, 1] \to [0, 1]$ show that $g'(x) = 0$
for all $x \in [0, 1] - C$.
(This shows that we cannot hope, in general, for equality in
$$\int_a^b f'(x) \, dx \le f(b) - f(a).)$$
11. Prove that a convergent series of non-decreasing real functions can
be differentiated term by term a.e.

9.2 Differentiating the indefinite integral

The `fundamental theorem of the integral calculus' states that, if
$f : [a, b] \to \mathbf{R}$ is a continuous function and
$$F(x) = \int_a^x f(t) \, dt,$$
then $F : [a, b] \to \mathbf{R}$ is differentiable in (a, b) with $F'(x) = f(x)$. The
object of this section is to obtain the analogous theorem for the
Lebesgue integral, where it is not appropriate to assume that f is
continuous. (Of course, if $f : [a, b] \to \mathbf{R}$ is continuous on [a, b], we
know that $F'(x) = f(x)$ for all x in (a, b) since the Lebesgue integral
coincides with the Riemann integral in this case.) The first thing to
note is that, even for a monotonic function F, we cannot claim that,
in general,
$$\int_a^b F'(x) \, dx = F(b) - F(a), \tag{9.2.1}$$
see exercise 9.1 (10). We will, however, obtain necessary and sufficient
conditions for the truth of (9.2.1).
Lemma. If $f : [a, b] \to \mathbf{R}^*$ is Lebesgue integrable on [a, b] and
$$\int_a^x f(t) \, dt = 0 \quad \text{for all } x \text{ in } [a, b],$$
then f(t) = 0 for almost all t in [a, b].
Note. This strengthens the result of theorem 5.5 (vii).
Proof. If the lemma is false then at least one of the sets
$$\{t : f(t) < 0\}, \quad \{t : f(t) > 0\}$$
has positive measure. If $|\{t : f(t) > 0\}| > 0$ then we can find a $\delta > 0$ for
which $|E| > 0$, where $E = \{t : f(t) > \delta\}$. Now choose a closed set
$F \subset E$ with $|F| > 0$, and consider the open set G = (a, b) - F. Then
$$0 = \int_a^b f \, dm = \int_F f \, dm + \int_G f \, dm.$$
But G is the disjoint union of a countable collection of open intervals
$(a_n, b_n)$ and
$$\int_{a_n}^{b_n} f \, dm = 0$$
for each n. Since the integral defines a $\sigma$-additive set function we must
have
$$\int_G f \, dm = 0 \quad \text{so that} \quad \int_F f \, dm = 0$$
and this contradicts $\int_F f \, dm \ge \delta |F| > 0$.

Let us now consider the properties of any function F which is an
indefinite integral, that is
$$F(x) = \int_a^x f(t) \, dt$$
for a function $f : [a, b] \to \mathbf{R}^*$ which is Lebesgue integrable. It is
immediate from theorem 5.6 that F is continuous on [a, b], but more
can be said: since it is the difference of the indefinite integrals of $f_+$
and $f_-$ it must be the difference of two monotone functions and therefore
it is of bounded variation. In fact, we saw in theorem 5.6 that the
set function
$$\nu(E) = \int_E f \, dm, \quad E \text{ measurable}, \ E \subset [a, b]$$
is absolutely continuous; that is that $\nu(E) \to 0$ as $m(E) \to 0$. This
means in particular that given $\epsilon > 0$, there is a $\delta > 0$ such that if
$E = \bigcup_{k=1}^{n} I_k$ is a finite disjoint union of intervals in [a, b] for which
$$\sum_{k=1}^{n} m(I_k) < \delta, \quad \text{then} \quad |\nu(E)| = \left|\sum_{k=1}^{n} \nu(I_k)\right| < \epsilon.$$
In fact, by considering separately the intervals $I_k$ for which $\nu$ is positive
and negative we can find $\delta > 0$ such that
$$\sum_{k=1}^{n} m(I_k) < \delta \Rightarrow \sum_{k=1}^{n} |\nu(I_k)| < \epsilon.$$
In terms of the indefinite integral F this means that the function
$F : [a, b] \to \mathbf{R}$ is such that, for each $\epsilon > 0$ there is a $\delta > 0$ for which
$$\sum_{i=1}^{n} (b_i - a_i) < \delta \Rightarrow \sum_{i=1}^{n} |F(b_i) - F(a_i)| < \epsilon \tag{9.2.2}$$
for any finite class of disjoint intervals $(a_i, b_i) \subset (a, b)$. Any function
$F : I \to \mathbf{R}$ which satisfies this condition on every finite interval
$(a, b) \subset I$ is said to be absolutely continuous on I.
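
Condition (9.2.2) can fail for a function that is continuous and monotone; the Cantor ternary function of §2.7 is the standard example (compare exercise 9.2 (5)). The Python sketch below is an illustration under stated assumptions: the names `cantor` and `level_intervals` are hypothetical, the evaluation is exact only at rational points of the form $k/3^n$ (it stops after `max_depth` ternary digits otherwise), and exact rational arithmetic is used to avoid rounding. At stage n of the Cantor construction the $2^n$ remaining closed intervals have total length $(2/3)^n$, yet the function increases by a total of 1 across them.

```python
from fractions import Fraction

def cantor(x, max_depth=64):
    """Cantor ternary function at a rational x in [0, 1]; exact when x = k / 3**n."""
    result, scale = Fraction(0), Fraction(1, 2)
    for _ in range(max_depth):
        if x == 0:
            return result
        if x == 1:
            return result + 2 * scale      # remaining ternary digits are all 2
        if x < Fraction(1, 3):
            x = 3 * x
        elif x > Fraction(2, 3):
            result += scale
            x = 3 * x - 2
        else:                              # x lies in the closed middle third: value is constant there
            return result + scale
        scale /= 2
    return result                          # truncated approximation for other rationals

def level_intervals(n):
    """The 2**n closed intervals of length 3**-n left at stage n of the Cantor construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        intervals = [piece for a, b in intervals
                     for piece in ((a, a + (b - a) / 3), (b - (b - a) / 3, b))]
    return intervals

n = 8
stage = level_intervals(n)
total_length = sum(b - a for a, b in stage)                        # equals (2/3)**n
total_change = sum(abs(cantor(b) - cantor(a)) for a, b in stage)   # stays equal to 1
```

Since `total_length` can be made smaller than any $\delta$ by increasing n while `total_change` remains 1, no $\delta$ works in (9.2.2) for $\epsilon < 1$.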
It is immediate that any function $F : I \to \mathbf{R}$ which is absolutely
continuous is of bounded variation on each finite interval $[a, b] \subset I$.
For if we put $\epsilon = 1$ in (9.2.2) and choose $\delta > 0$, then any finite dissection
of [a, b] can be split into K sets of intervals (by inserting extra
division points if necessary) each of total length less than $\delta$, where
$K = [(b - a)/\delta] + 1$; and it follows that, for any dissection of [a, b],
$$\sum_r |F(x_r) - F(x_{r-1})| \le K.$$
By the corollary to theorem 9.2 we now see that any function F
which is absolutely continuous is differentiable except on a set of
zero measure.
Theorem 9.3. Suppose $f : [a, b] \to \mathbf{R}^*$ is Lebesgue integrable on [a, b]
and $F : [a, b] \to \mathbf{R}$ satisfies
$$F(x) = F(a) + \int_a^x f(t) \, dt.$$
Then F is differentiable with $F'(x) = f(x)$ for almost all x in [a, b].
Proof. Assume first that f is bounded on [a, b] so that for a suitable
M in $\mathbf{R}$, $|f(x)| \le M$ for all x in [a, b]. Now we know that F is absolutely
continuous and therefore differentiable almost everywhere. Put
$$f_n(x) = n\left[F\left(x + \tfrac{1}{n}\right) - F(x)\right].$$
Then $|f_n| \le M$ and $f_n(x) \to F'(x)$ almost everywhere; so, by theorem
5.8, for a < c < b,
$$\int_a^c F'(x) \, dx = \lim_{n \to \infty} \int_a^c f_n(x) \, dx = \lim_{n \to \infty} n \int_a^c \left[F\left(x + \tfrac{1}{n}\right) - F(x)\right] dx$$
$$= \lim_{n \to \infty} \left[n \int_c^{c + 1/n} F(x) \, dx - n \int_a^{a + 1/n} F(x) \, dx\right]$$
$$= F(c) - F(a) = \int_a^c f(x) \, dx$$
since F is continuous. Hence
$$\int_a^c \{F'(x) - f(x)\} \, dx = 0$$
for all c in [a, b] so that, by the lemma, $F'(x) = f(x)$ almost everywhere.

Now suppose that $f : [a, b] \to \mathbf{R}^*$ is integrable but not bounded.
From the definition of the integral it is sufficient to prove the theorem
when $f \ge 0$. Put $f_n(x) = \min[n, f(x)]$
and
$$G_n(x) = \int_a^x [f(t) - f_n(t)] \, dt.$$
Since $f - f_n \ge 0$, $G_n$ is monotone increasing and so has a non-negative
derivative almost everywhere. Since $f_n$ is bounded (by n) we know that
$$\frac{d}{dx}\left(\int_a^x f_n(t) \, dt\right) = f_n(x) \quad \text{a.e.},$$
so that the derivative
$$F'(x) = G_n'(x) + \frac{d}{dx}\left(\int_a^x f_n(t) \, dt\right) \ge f_n(x),$$
and exists almost everywhere. Since this is true for each integer n,
$$F'(x) \ge f(x) \quad \text{a.e.} \tag{9.2.3}$$
Hence
$$\int_a^b F'(x) \, dx \ge \int_a^b f(x) \, dx = F(b) - F(a),$$
and by theorem 9.2 we must have
$$\int_a^b F'(x) \, dx = F(b) - F(a) = \int_a^b f(x) \, dx,$$
and
$$\int_a^b \{F'(x) - f(x)\} \, dx = 0.$$
This with (9.2.3) implies that $F'(x) = f(x)$ a.e.
Lemma. If $F : [a, b] \to \mathbf{R}$ is absolutely continuous on [a, b] and
$F'(x) = 0$ a.e.,
then F is constant.
Proof. Suppose $a < c \le b$, and $E = \{x \in [a, c] : F'(x) = 0\}$. For a
fixed $\epsilon > 0$, there are arbitrarily small intervals [x, x + h] for each
$x \in E$ such that $|F(x + h) - F(x)| < \epsilon h$.
Choose $\delta > 0$ to satisfy (9.2.2) in the definition of absolute continuity
and use theorem 9.1 to obtain a finite collection $[x_k, y_k]$ of disjoint intervals with
$$|F(y_k) - F(x_k)| < \epsilon (y_k - x_k)$$
which cover all of E except for a subset of measure less than $\delta$. Order
these intervals so that
$$y_0 = a \le x_1 < y_1 \le x_2 < \ldots < y_n \le c = x_{n+1},$$
and
$$\sum_{i=0}^{n} |x_{i+1} - y_i| < \delta.$$
By (9.2.2) this implies
$$\sum_{i=0}^{n} |F(x_{i+1}) - F(y_i)| < \epsilon$$
and, from the choice of the covering family,
$$\sum_{i=1}^{n} |F(y_i) - F(x_i)| < \epsilon (c - a)$$
so that
$$|F(c) - F(a)| = \left|\sum_{i=0}^{n} \{F(x_{i+1}) - F(y_i)\} + \sum_{i=1}^{n} \{F(y_i) - F(x_i)\}\right| < \epsilon (c - a + 1).$$
Since $\epsilon$ is arbitrary, we have F(c) = F(a).
Theorem 9.4. A function $F : I \to \mathbf{R}$ is an indefinite integral, that is
there is a measurable $f : I \to \mathbf{R}^*$ such that
$$F(b) - F(a) = \int_a^b f(x) \, dx$$
for all $[a, b] \subset I$, if and only if F is absolutely continuous on I.
Proof. We have already seen that any indefinite integral is absolutely
continuous. Conversely suppose $F : I \to \mathbf{R}$ is absolutely continuous.
Then F is differentiable almost everywhere in [a, b] and
$$|F'(x)| \le F_1'(x) + F_2'(x) \quad \text{a.e.},$$
where $F = F_1 - F_2$ expresses F as the difference of two monotone
functions. By theorem 9.2, $F'$ is integrable on [a, b]. Put
$$G(x) = \int_a^x F'(t) \, dt.$$
Then G is absolutely continuous and so is H = F - G. But, by theorem
9.3, $G'(x) = F'(x)$ a.e., that is $H'(x) = 0$ a.e.,
so that H is constant by the lemma. Hence
$$F(x) = F(a) + \int_a^x F'(t) \, dt.$$
Corollary. Every absolutely continuous function $F : I \to \mathbf{R}$ is the
indefinite integral of its derivative.

Given a set $A \subset \mathbf{R}$, $x \in \mathbf{R}$ consider the ratio
$$\frac{|A \cap I|}{|I|}$$
for all intervals I containing x where |E| denotes the Lebesgue outer
measure of E. If this ratio converges to a limit as $|I| \to 0$, then
this limit is called the density of A at x and denoted $\tau(x, A)$. The
point x is called a point of density for A if $\tau(x, A) = 1$, and a point
of dispersion for A if $\tau(x, A) = 0$. We can obtain the following as a
corollary of theorem 9.4.
Lemma (Lebesgue). If $A \subset \mathbf{R}$, A is Lebesgue measurable, then
$\tau(x, A) = 1$ for almost all $x \in A$,
$\tau(x, A) = 0$ for almost all $x \in \mathbf{R} - A$.
Proof. Suppose a < x < b. Then the indicator function $\chi_A$ is
Lebesgue integrable over [a, b]. Hence
$$F(x) = \int_a^x \chi_A(t) \, dt$$
is differentiable almost everywhere and
$F'(x) = 1$ for almost all x in $[a, b] \cap A$,
$F'(x) = 0$ for almost all x in $[a, b] \cap (\mathbf{R} - A)$.
But if x is such that $F'(x) = 1$, there is for each $\epsilon > 0$ a $\delta > 0$ such that
$$\text{(i)} \quad 1 \ge \frac{|[x, x + h] \cap A|}{h} > 1 - \epsilon \quad \text{for } 0 < h < \delta,$$
$$\text{(ii)} \quad 1 \ge \frac{|[x - k, x] \cap A|}{k} > 1 - \epsilon \quad \text{for } 0 < k < \delta;$$
and so
$$1 \ge \frac{|[x - k, x + h] \cap A|}{h + k} > 1 - \epsilon \quad \text{for } 0 < h, k < \delta,$$
which is precisely the condition for $\tau(x, A) = 1$. A similar proof shows
that, at points x where $F'(x) = 0$ we have $\tau(x, A) = 0$.

Exercises 9.2
1. If $F : I \to \mathbf{R}$ is absolutely continuous, show that $F^p$ is absolutely
continuous for each p > 1, but not, in general, for p < 1.
2. If $F : [a, b] \to \mathbf{R}$ is such that $F'$ exists everywhere in (a, b) and is
bounded show that
$$\int_a^b F'(x) \, dx = F(b) - F(a).$$
For $F(x) = x^2 \sin(1/x^2)$ ($x \ne 0$), F(0) = 0 show that $F'(x)$ exists for all
x but is not Lebesgue integrable over [-1, 1]. (This shows that even the
Lebesgue integral is not strong enough to integrate all derivatives.)
3. Construct a subset $A \subset \mathbf{R}$ for which $\tau(0, A) = \tfrac{1}{2}$.
4. Extend the density result to non-measurable sets A by showing that
for any $A \subset \mathbf{R}$, $\tau(x, A) = 1$ for all x in A except a subset of zero measure.
Hint. Assume A is contained in a finite interval, and take a measurable
set $B \supset A$ with |B| = |A|.
Deduce that a set $A \subset \mathbf{R}$ is measurable if and only if $\tau(x, A) = 0$ for
almost all x in $(\mathbf{R} - A)$.
5. Prove that the Cantor function $g : [0, 1] \to [0, 1]$ defined in §2.7 is
monotone increasing and continuous but not absolutely continuous.
6. The function $f : [0, 1] \to \mathbf{R}$ is absolutely continuous on $[\epsilon, 1]$ for each
$\epsilon > 0$. Can one deduce that f is absolutely continuous on [0, 1]? Does the
additional condition that f is of bounded variation on [0, 1] help?

9.3 Point-wise differentiation of measures

In theorem 4.8 we proved that all measures $\mu$ in $\mathbf{R}$ defined for Borel
sets and finite on bounded sets are Lebesgue-Stieltjes measures:
that is, there is a monotone increasing function $F : \mathbf{R} \to \mathbf{R}$ which is
continuous on the right such that $\mu = \mu_F$ on $\mathcal{B}$. Because of this
correspondence we can obtain properties of such Borel measures in
terms of the corresponding properties of F.
Lemma 1. Suppose $\mu_F$ is the Lebesgue-Stieltjes measure with respect to
the function $F : \mathbf{R} \to \mathbf{R}$ which is continuous on the right. Then $\mu_F$ is
absolutely continuous with respect to Lebesgue measure m if and only if
F is absolutely continuous.
Proof. Suppose first that F is absolutely continuous. Then, by
theorem 9.4
$$\mu_F(a, b] = F(b) - F(a) = \int_a^b F'(t) \, dt$$
so that, for $E \in \mathcal{P}$, $\mu_F$ coincides with the set function
$$\nu(E) = \int_E F' \, dm.$$
But the extension of a measure from $\mathcal{P}$ to $\mathcal{B}$ is unique, so that $\mu_F = \nu$
on $\mathcal{B}$, and $\mu_F$ must therefore be absolutely continuous with respect
to m.
Conversely, if $\mu_F$ is absolutely continuous with respect to m, by
the Radon-Nikodym theorem there is an $f \ge 0$ such that
$$\mu_F(E) = \int_E f \, dm.$$
Hence
$$\mu_F(0, x] = F(x) - F(0) = \int_0^x f(t) \, dt \quad \text{for } x > 0,$$
$$\mu_F(x, 0] = F(0) - F(x) = \int_x^0 f(t) \, dt \quad \text{for } x < 0,$$
so that $F : \mathbf{R} \to \mathbf{R}$ is an indefinite integral and must therefore be
absolutely continuous.
Given any measure space $(X, \mathscr{F}, \mu)$ in which $\mathscr{F}$ contains all single
point sets the point $x \in X$ is said to be an atom for the measure $\mu$
if $\mu\{x\} > 0$. A measure $\mu$ with no atoms is said to be non-atomic.
Now if $\mu$ is $\sigma$-finite, the set of atoms of $\mu$ is countable. In this case if
we put
$$\nu(E) = \sum_{x \in E,\ \mu\{x\} \ne 0} \mu\{x\}$$
we obtain a new measure $\nu$ defined on all subsets of $X$, and $\nu$ is a
discrete measure as defined in § 3.1. Further, the set function
$$\tau = \mu - \nu$$
defined on $\mathscr{F}$ is clearly non-atomic and so
$$\mu = \nu + \tau$$
is a decomposition of a $\sigma$-finite measure $\mu$ into the sum of a discrete
measure and a non-atomic measure. This decomposition is clearly
unique. Thus we have proved
Lemma 2. Given a $\sigma$-finite measure space $(X, \mathscr{F}, \mu)$ in which $\mathscr{F}$ contains
all single point sets there is a unique decomposition of $\mu$,
$$\mu = \nu + \tau$$
for which $\nu$ is a discrete measure on $X$ and $\tau$ is a non-atomic measure
on $\mathscr{F}$.
Lemma 3. A measure $\mu$ on $\mathscr{B}$ (the Borel sets of $\mathbf{R}$) which is finite on
bounded intervals is a discrete measure if and only if $\mu = \mu_F$ where $F$ is
a jump function, that is,
$$F(x) = \sum_{0 < x_i \le x} p_i \ \text{ for } x \ge 0, \qquad -F(x) = \sum_{x < x_i \le 0} p_i \ \text{ for } x < 0, \tag{9.3.1}$$
where the measure $\mu$ has atoms $x_i$ of weight (or measure) $p_i$.

Proof. It is clear that if $F: \mathbf{R} \to \mathbf{R}$ satisfies (9.3.1) then
$$\mu_F(a, b] = \sum_{a < x_i \le b} p_i = \sum_{x_i \in (a, b]} p_i$$
so that $\mu_F$ coincides with the discrete measure
$$\nu(E) = \sum_{x_i \in E} p_i$$
for $E \in \mathscr{P}$. By uniqueness of extension $\mu_F$ must be a discrete measure.
Conversely, if $\mu$ is a discrete measure with atoms $x_i$ of weight $p_i$,
an application of theorem 4.8 shows that $\mu = \mu_F$, with $F$ a jump
function. ∎
Lemma 4. A measure $\mu$ defined on $\mathscr{B}$ which is finite on bounded intervals
is non-atomic if and only if $\mu = \mu_F$ for a continuous $F: \mathbf{R} \to \mathbf{R}$.
Proof. If $F$ is continuous, then
$$0 \le \mu_F\{x\} \le \mu_F(x - h, x] = F(x) - F(x - h)$$
for all $h > 0$, so that $\mu_F\{x\} = 0$.
Conversely if $F$ is not continuous at $x_0$, then
$$\mu_F\{x_0\} = \lim_{n \to \infty} \mu_F\left(x_0 - \tfrac{1}{n}, x_0\right] = F(x_0) - F(x_0 - 0) > 0$$
so $x_0$ is an atom. ∎
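As an illustration of lemmas 3 and 4, the following minimal sketch builds the jump function of (9.3.1) for a discrete measure with finitely many atoms and compares $F(b) - F(a)$ with the total weight of the atoms in $(a, b]$. The atoms, the weights and all helper names are hypothetical choices made for this example; exact rational arithmetic is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction

# Hypothetical atoms x_i and weights p_i of a discrete measure (illustration only).
ATOMS = {Fraction(-3, 2): Fraction(1, 4),
         Fraction(0): Fraction(1, 2),
         Fraction(1): Fraction(2),
         Fraction(5, 2): Fraction(1, 3)}


def jump_function(x):
    """F(x) as in (9.3.1): the sum of p_i over 0 < x_i <= x when x >= 0,
    and minus the sum of p_i over x < x_i <= 0 when x < 0."""
    if x >= 0:
        return sum((p for xi, p in ATOMS.items() if 0 < xi <= x), Fraction(0))
    return -sum((p for xi, p in ATOMS.items() if x < xi <= 0), Fraction(0))


def discrete_measure(a, b):
    """nu(a, b]: total weight of the atoms lying in the half-open interval (a, b]."""
    return sum((p for xi, p in ATOMS.items() if a < xi <= b), Fraction(0))


def mass_at(x, h=Fraction(1, 1000)):
    """F(x) - F(x - h) for a small h; with finitely many atoms and h small
    enough this equals the jump F(x) - F(x - 0), i.e. the weight of {x}."""
    return jump_function(x) - jump_function(x - h)
```

With these choices `mass_at` returns the weight at each atom and zero at every other point, which is the content of lemma 4 read in both directions.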

Singular monotone function

Any function $F: I \to \mathbf{R}$ which is continuous and monotone increasing,
such that $F'(x) = 0$ for all $x$ in $I$ except for a set of zero
Lebesgue measure, is said to be singular. The function $g$ defined in
§ 2.7 clearly satisfies these conditions without being constant.
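To make this example concrete, here is a minimal sketch that evaluates the function $g$ of § 2.7, assuming it is the usual Cantor function, from the ternary expansion of $x$. It illustrates that $g$ is constant on each removed middle-third interval (so $g' = 0$ there) while still rising from 0 to 1. The function name and the truncation depth are assumptions of this sketch, not part of the text.

```python
from fractions import Fraction


def cantor_function(x, depth=40):
    """Approximate the Cantor function g(x) for x in [0, 1].

    Reads the ternary digits of x: a digit 2 contributes a binary digit 1,
    a digit 0 contributes a binary digit 0, and the first digit 1 ends the
    expansion with a final binary digit 1. The expansion is truncated after
    `depth` digits, so the error is at most 2**-depth.
    """
    x = Fraction(x)
    if x >= 1:
        return Fraction(1)
    if x <= 0:
        return Fraction(0)
    value = Fraction(0)
    scale = Fraction(1, 2)
    for _ in range(depth):
        x *= 3
        digit = int(x)          # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:
            return value + scale
        if digit == 2:
            value += scale
        scale /= 2
    return value
```

For instance, every point of the removed interval $(1/3, 2/3)$ is sent to $1/2$, and every point of $(7/9, 8/9)$ to $3/4$.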
Lemma 5. A function $F: \mathbf{R} \to \mathbf{R}$ is singular if and only if the Lebesgue-
Stieltjes measure $\mu_F$ is non-atomic and singular with respect to Lebesgue
measure.
Proof. The continuity of $F$ is equivalent to the condition that $\mu_F$
be non-atomic by lemma 4. Now a measure $\nu$ is singular with respect
to Lebesgue measure if and only if any absolutely continuous $\tau$
satisfying $\tau(E) \le \nu(E)$ for all $E$ in $\mathscr{B}$ must be zero. Now if $F'(x) > 0$
on a set of positive measure, the set function
$$\tau(E) = \int_E F'(x)\,dx$$
is not always zero and $\tau \le \mu_F$ by theorem 9.2, so $\mu_F$ is not singular
with respect to Lebesgue measure. Conversely, if $\mu_F$ is not singular a
non-null absolutely continuous measure $\tau \le \mu_F$ can be found, and this
corresponds to a function $G$, that is
$$G(b) - G(a) = \int_a^b G'(x)\,dx.$$
But $F'(x) \ge G'(x)$ when both are defined, so $F'(x) > 0$ on a set of
positive measure. ∎
Theorem 9.5 (Lebesgue). Given any function $F: \mathbf{R} \to \mathbf{R}$ which is
monotone increasing and continuous on the right, there is a decomposition
of $F$
$$F = F_1 + F_2 + F_3$$
where $F_1$ is a jump function,
$F_2$ is singular,
$F_3$ is absolutely continuous.
This decomposition is unique if we insist that $F_1(0) = F_2(0) = 0$.
Proof. Use the function $F$ to define a Lebesgue-Stieltjes measure
$\mu_F$ on $\mathscr{B}$. Decompose $\mu_F$ with respect to Lebesgue measure $m$ by
theorem 6.7 so that
$$\mu_F = \nu_1 + \nu_3$$
with $\nu_3 \ll m$ and $\nu_1$ singular with respect to $m$. Decompose $\nu_1$ by
lemma 2,
$$\nu_1 = \lambda_1 + \lambda_2,$$
where $\lambda_1$ is discrete and $\lambda_2$ is non-atomic.
Let $F_1$, $F_2$ be the monotone functions (with $F_1(0) = F_2(0) = 0$)
obtained by theorem 4.8 for which $\lambda_1 = \mu_{F_1}$, $\lambda_2 = \mu_{F_2}$ on $\mathscr{B}$. Then by
lemmas 3 and 5, $F_1$ is a jump function, and $F_2$ is a singular function.
If one applies theorem 4.8 to $\nu_3$ one obtains an absolutely continuous
$G_3$ for which $\nu_3 = \mu_{G_3}$. Finally, put $F_3(x) = G_3(x) + F(0)$ and we still
have $F_3$ absolutely continuous, and $\nu_3 = \mu_{F_3}$. Now
$$F(x) - F(0) = F_1(x) - F_1(0) + F_2(x) - F_2(0) + F_3(x) - F_3(0)$$
for all $x$ so that $F(x) = F_1(x) + F_2(x) + F_3(x)$.
The uniqueness follows from the uniqueness of the decomposition
$\mu_F = \lambda_1 + \lambda_2 + \nu_3$, and theorem 4.8. ∎
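As a hedged illustration of theorem 9.5 (an example not taken from the text), suppose $g$ is the usual Cantor function of § 2.7, extended to be 0 for $x < 0$ and 1 for $x > 1$, and let $H$ be the unit step at 1. The symbols $H$ and $\delta_1$ (unit mass at the point 1) are introduced only for this example. For $F = x + g + H$ the three components can be read off directly:

```latex
\[
F(x) = x + g(x) + H(x), \qquad
H(x) = \begin{cases} 0, & x < 1, \\ 1, & x \ge 1, \end{cases}
\]
\[
F_1 = H \ \text{(jump function: one atom at } x = 1 \text{ of weight } 1\text{)}, \quad
F_2 = g \ \text{(singular)}, \quad
F_3(x) = x \ \text{(absolutely continuous)},
\]
\[
F_1(0) = F_2(0) = 0, \qquad
\mu_F = \delta_1 + \mu_g + m .
\]
```

Here $F$ is monotone increasing and continuous on the right, and the normalisation $F_1(0) = F_2(0) = 0$ of the theorem holds, so this is the unique decomposition.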
In $\mathbf{R}$ we can also use the connexion between $\mu_F$ and $F$ to define
differentiation. Thus if $F: I \to \mathbf{R}$ is differentiable at $x_0$, this means
that
$$\frac{F(x_0 + h) - F(x_0 - k)}{h + k} \to F'(x_0) \quad\text{as } h, k \to 0$$
with $h > 0$, $k > 0$; equivalently,
$$\frac{\mu_F(x_0 - k, x_0 + h]}{|(x_0 - k, x_0 + h]|} \to F'(x_0).$$
This can be written
$$\frac{\mu_F(J)}{|J|} \to F'(x_0) \quad\text{as } |J| \to 0$$
for intervals $J$ containing $x_0$, and we can write $d\mu_F/dm\,(x_0)$ for the value
of this limit. More generally, if $\mu$, $\nu$ are two measures in $\mathbf{R}$ which are
finite for bounded sets then
$$\lim_{|J| \to 0} \left[\frac{\mu(J)}{\nu(J)}\right],$$
taken over intervals $J$ containing $x$, when it exists, is called the derivative
of $\mu$ with respect to $\nu$ at the point $x$.
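A small numerical sketch of this definition may help. For the hypothetical choice $F(x) = x^3 + x$ (monotone increasing and differentiable, chosen only for illustration), the ratio $\mu_F(J)/|J|$ over shrinking, deliberately asymmetric intervals $J = (x_0 - k, x_0 + h]$ approaches $F'(x_0)$. The function names are assumptions of this sketch.

```python
def F(x):
    """Hypothetical monotone increasing, differentiable distribution function."""
    return x ** 3 + x


def stieltjes_ratio(x0, h, k):
    """mu_F(J) / |J| for J = (x0 - k, x0 + h], using mu_F(a, b] = F(b) - F(a)."""
    return (F(x0 + h) - F(x0 - k)) / (h + k)


def ratios_along(x0, steps=8):
    """Ratios for asymmetric intervals shrinking to x0 (h = 2**-n, k = 3**-n)."""
    return [stieltjes_ratio(x0, 2.0 ** -n, 3.0 ** -n) for n in range(1, steps + 1)]
```

At $x_0 = 1$ the ratios settle towards $F'(1) = 4$, whichever way the two end points approach $x_0$.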
In $\mathbf{R}^n$ we can consider the values of the ratio
$$\frac{\mu(J)}{\nu(J)} \tag{9.3.2}$$
for rectangles $J$ (in $\mathscr{P}^n$) containing a fixed point $x$ and ask whether or
not this ratio approaches a limit as $\operatorname{diam}(J) \to 0$. The existence of
this limit for all $x$ except for a set of zero $\nu$ measure can be proved
when $\nu$ is Lebesgue measure: the limit in this case is called the strong
derivate of $\mu$ at $x$. This result is harder to prove than the result in
§ 9.1 because theorem 9.1 is not valid without some restriction on the
ratio of the sides of the sets in the covering class. Essentially similar methods
to those of § 9.1 will work if only cubes $J$ are considered. On the other
hand if in (9.3.2) one considers rectangles with arbitrary orientation
an example can be given for which the limit exists nowhere.
Differentiation point-wise in abstract spaces can also be defined
in terms of suitable `nets', and the theorems of this chapter can be
obtained if sufficient conditions are imposed. Since the results are
not often used in practice, we will not state them in detail.

Exercises 9.3
1. Enumerate the rationals as a sequence $\{r_i\}$. By considering the discrete
measure with mass $1/i^2$ at $r_i$ $(i = 1, 2, \ldots)$ define a jump function which is
constant in no interval.
2. Give an example of a singular function which is constant in no interval.
3. If $F$, $G$ are two monotone real functions differentiable at $x_0$ with
$G'(x_0) \ne 0$, show that
$$\frac{d\mu_F}{d\mu_G}(x_0) = \lim_{x_0 \in J,\ |J| \to 0} \frac{\mu_F(J)}{\mu_G(J)} \quad\text{exists and equals}\quad \frac{F'(x_0)}{G'(x_0)}.$$

9.4* The Daniell integral

Our approach in this book has been to regard measure as the primitive
concept, and to define the integration process in terms of a given
measure. One important alternative is to start with an `integral'
defined on a suitable class of functions, extend its definition to a larger
domain with desirable properties and then obtain measure as a
by-product at a later stage. In the present section we describe this
alternative approach: it is convenient to use it in the following section
to obtain the integral representation of an important class of linear
functionals.
For an arbitrary space $X$, we consider a family $L$ of functions
$f: X \to \mathbf{R}$ satisfying
(i) $L$ is a linear space over the reals;
(ii) for each $f \in L$, the function $f^+ \in L$, where
$f^+(x) = \max(0, f(x))$.
Now if we define, for each $f, g \in L$, $x \in X$
$$(f \vee g)(x) = \max(f(x), g(x)),$$
$$(f \wedge g)(x) = \min(f(x), g(x)),$$
the relations
$$f^+ = f \vee 0, \quad f \vee g = (f - g) \vee 0 + g, \quad f \wedge g = f + g - (f \vee g)$$
show that
(iii) if $f, g \in L$, then $f \vee g$, $f \wedge g \in L$.
Any family $L$ satisfying conditions (i) and (ii) (and therefore (iii))
is called a vector lattice of functions. Suppose $\mathscr{I}$ is a linear functional
on $L$ (considered as a real linear space), then we say $\mathscr{I}$ is positive if
$$f \in L,\ f \ge 0 \Rightarrow \mathscr{I}(f) \ge 0.$$
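The three lattice identities used to derive (iii) can be spot-checked point-wise. The sketch below represents real functions on a small finite set $X$ as Python dictionaries; this representation, the set `X` and all helper names are assumptions of the example rather than anything defined in the text.

```python
from itertools import product

X = range(4)  # a small finite space, for illustration only
ZERO = {x: 0 for x in X}


def add(f, g):
    return {x: f[x] + g[x] for x in X}


def sub(f, g):
    return {x: f[x] - g[x] for x in X}


def join(f, g):
    """(f v g)(x) = max(f(x), g(x))."""
    return {x: max(f[x], g[x]) for x in X}


def meet(f, g):
    """(f ^ g)(x) = min(f(x), g(x))."""
    return {x: min(f[x], g[x]) for x in X}


def positive_part(f):
    """f+(x) = max(0, f(x))."""
    return {x: max(0, f[x]) for x in X}


def identities_hold(f, g):
    """Check f+ = f v 0, f v g = (f - g) v 0 + g and f ^ g = f + g - (f v g)."""
    return (positive_part(f) == join(f, ZERO)
            and join(f, g) == add(join(sub(f, g), ZERO), g)
            and meet(f, g) == sub(add(f, g), join(f, g)))
```

Since the identities are point-wise, checking them for all integer-valued functions with a few sample values already exercises every sign pattern that can occur at a single point.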
A positive linear functional $\mathscr{I}$ on $L$ is said to be a Daniell functional
if, for every increasing sequence $\{f_n\}$ of functions of $L$
$$\mathscr{I}(g) \le \lim \mathscr{I}(f_n) \tag{9.4.1}$$
for each $g \in L$ satisfying $g(x) \le \lim f_n(x)$ for all $x \in X$. (Note that
$\lim f_n(x)$ will be $+\infty$ if the sequence $\{f_n(x)\}$ is unbounded, and even
if $\lim f_n$ exists as a function with finite values we do not assume
that it is in $L$.)
In particular, this implies that, if $\mathscr{I}$ is a Daniell functional, $\{f_n\}$
a monotone sequence in $L$ such that $f(x) = \lim_{n \to \infty} f_n(x)$, $x \in X$ defines
a function in $L$ then $\mathscr{I}(f) = \lim \mathscr{I}(f_n)$. For if $\{f_n\}$ is increasing then
$f \ge f_n$ for all $n$, so $\mathscr{I}(f) \ge \mathscr{I}(f_n)$ since $\mathscr{I}$ is positive, which with (9.4.1)
gives the required equality. Thus a Daniell functional is continuous in
the sense that for any sequence $\{f_n\}$ in $L$ which decreases monotonically
to the zero function we must have $\mathscr{I}(f_n) \to 0$. Any Daniell functional
is therefore an `integral' in the sense discussed in § 5.1. However,
for the integral to be useful we want the domain $L$ to be as large as
possible: if $\{f_n\}$ is an increasing sequence in $L$ which is bounded above
by an element of $L$ we would certainly want $\lim f_n$ to be in $L$. The
Daniell integral is essentially the result of extending a Daniell functional
$\mathscr{I}$ from $L$ to a class $L_1 \supset L$: it turns out that this extension can
be carried out in two stages.
Suppose $\mathscr{I}$ is a Daniell functional on a vector lattice $L$. Denote
by $L^+$ the set of functions $f: X \to \mathbf{R}^*$ which are limits of monotone
increasing sequences of functions of $L$. $L^+$ is not a linear space but
$$\alpha \ge 0,\ \beta \ge 0;\ f, g \in L^+ \Rightarrow \alpha f + \beta g \in L^+.$$
Then if $\{f_n\}$ is an increasing sequence in $L$, $\{\mathscr{I}(f_n)\}$ is an increasing
sequence in $\mathbf{R}$ which has a unique limit in $\mathbf{R} \cup \{+\infty\}$. We can define
$\mathscr{I}$ in $L^+$ by
$$\mathscr{I}(\lim f_n) = \lim \mathscr{I}(f_n).$$
This definition is proper because if $\{f_n\}$, $\{g_n\}$ are two monotone
sequences each converging to $h$ in $L^+$, condition (9.4.1) gives, for
fixed $n$,
$$f_n \le h = \lim g_n \Rightarrow \mathscr{I}(f_n) \le \lim \mathscr{I}(g_n)$$
so that $\lim \mathscr{I}(f_n) \le \lim \mathscr{I}(g_n)$ and the opposite inequality can be
similarly obtained. It is clear that $\mathscr{I}$ is linear on $L^+$ in the sense that
$$\alpha \ge 0,\ \beta \ge 0;\ f, g \in L^+ \Rightarrow \mathscr{I}(\alpha f + \beta g) = \alpha \mathscr{I}(f) + \beta \mathscr{I}(g).$$
For an arbitrary function $f: X \to \mathbf{R}^*$ we define the upper integral
$\mathscr{I}^*(f)$ by
$$\mathscr{I}^*(f) = \inf_{g \ge f,\ g \in L^+} \mathscr{I}(g),$$
where we adopt the (usual) convention that the infimum of the empty
set is $+\infty$. Similarly, the lower integral $\mathscr{I}_*(f)$ is defined by
$$\mathscr{I}_*(f) = -\mathscr{I}^*(-f),$$
and we say that a function $f: X \to \mathbf{R}^*$ is integrable (with respect to $\mathscr{I}$)
if $\mathscr{I}^*(f) = \mathscr{I}_*(f)$ and is finite. The class of integrable functions will
be denoted by $L_1 = L_1(\mathscr{I}, L)$. For $f \in L_1$ we call the common value of
$\mathscr{I}^*(f)$, $\mathscr{I}_*(f)$ the integral of $f$ and denote it by $\mathscr{J}(f)$. We now show that
this functional $\mathscr{J}$ on $L_1$ is a Daniell functional which extends $\mathscr{I}$, and
that $L_1$ has the closure properties desired. It is convenient to obtain
a number of preliminary results before stating the theorem.
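Lemma 1. If $\{g_n\}$ is a sequence of non-negative functions in $L^+$, then
$$g = \sum_{n=1}^{\infty} g_n \ \text{ is in } L^+ \quad\text{and}\quad \mathscr{I}(g) = \sum_{n=1}^{\infty} \mathscr{I}(g_n).$$
Proof. It is clear that a non-negative function $f: X \to \mathbf{R}^+$ belongs
to $L^+$ if and only if there is a sequence $\{f_n\}$ of non-negative functions
in $L$ with $f = \sum_{n=1}^{\infty} f_n$. By definition, in this case
$$\mathscr{I}(f) = \sum_{n=1}^{\infty} \mathscr{I}(f_n).$$
Hence, each function $g_n$ can be expressed as a sum
$$g_n = \sum_{\nu=1}^{\infty} f_{n,\nu} \quad\text{with } f_{n,\nu}: X \to \mathbf{R}^+,\ f_{n,\nu} \in L.$$
It follows that
$$g = \sum_{n} \sum_{\nu} f_{n,\nu}$$
is a countable sum of non-negative functions of $L$ and so must be
in $L^+$. Further since all the terms are non-negative, the order of summation
is immaterial and
$$\mathscr{I}(g) = \sum_{n=1}^{\infty} \left( \sum_{\nu=1}^{\infty} \mathscr{I}(f_{n,\nu}) \right) = \sum_{n=1}^{\infty} \mathscr{I}(g_n). \ ∎$$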
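Lemma 2. For arbitrary functions $f: X \to \mathbf{R}^*$, $g: X \to \mathbf{R}^*$:
(i) $\mathscr{I}^*(f + g) \le \mathscr{I}^*(f) + \mathscr{I}^*(g)$;
(ii) if $c \ge 0$, $\mathscr{I}^*(cf) = c\,\mathscr{I}^*(f)$;
(iii) if $f \le g$, $\mathscr{I}^*(f) \le \mathscr{I}^*(g)$, $\mathscr{I}_*(f) \le \mathscr{I}_*(g)$;
(iv) $\mathscr{I}_*(f) \le \mathscr{I}^*(f)$;
(v) if $f \in L^+$, $\mathscr{I}^*(f) = \mathscr{I}_*(f) = \mathscr{I}(f)$.
Proof. (i), (ii) and (iii) follow immediately from the definitions.
It is worth noting in (i), that we can put $(f + g)(x) = +\infty$ at those
points $x$ for which one of $f(x)$, $g(x)$ is $+\infty$ and the other is $-\infty$ so that (i)
is true whatever the value in $\mathbf{R}^*$ chosen for $(f + g)(x)$ at such points $x$.
(iv) Since $0 = \mathscr{I}(0) = \mathscr{I}^*(f - f) \le \mathscr{I}^*(f) + \mathscr{I}^*(-f)$ by (i), it follows
that $\mathscr{I}_*(f) = -\mathscr{I}^*(-f) \le \mathscr{I}^*(f)$.
(v) If $f \in L^+$, then by definition $\mathscr{I}^*(f) = \mathscr{I}(f)$. Now if $g \in L$, then
$-g \in L \subset L^+$ so that $\mathscr{I}_*(g) = -\mathscr{I}^*(-g) = \mathscr{I}(g)$. But each $f$ in $L^+$ is the limit of an
increasing sequence $\{g_n\}$ in $L$. Thus $f \ge g_n$ so $\mathscr{I}_*(f) \ge \mathscr{I}_*(g_n) = \mathscr{I}(g_n)$
and $\mathscr{I}_*(f) \ge \lim \mathscr{I}(g_n) = \mathscr{I}(f)$. ∎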
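Lemma 3. If $\{g_n\}$ is a sequence of functions on $X$ to $\mathbf{R}^+$, and
$$g = \sum_{n=1}^{\infty} g_n, \quad\text{then}\quad \mathscr{I}^*(g) \le \sum_{n=1}^{\infty} \mathscr{I}^*(g_n).$$
Proof. If $\mathscr{I}^*(g_n) = +\infty$ for some $n$, or if the series $\sum \mathscr{I}^*(g_n)$ diverges
there is nothing to prove. Otherwise, given $\epsilon > 0$, for each integer
$n$ choose $h_n \ge g_n$, $h_n \in L^+$ such that $\mathscr{I}^*(g_n) > \mathscr{I}(h_n) - \epsilon\, 2^{-n}$. Then
$h = \sum_{n=1}^{\infty} h_n \in L^+$ by lemma 1, $h \ge g$ and
$$\mathscr{I}^*(g) \le \mathscr{I}(h) = \sum_{n=1}^{\infty} \mathscr{I}(h_n) < \epsilon + \sum_{n=1}^{\infty} \mathscr{I}^*(g_n).$$
Since $\epsilon$ is arbitrary the result is proved. ∎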
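Theorem 9.6. Given a Daniell functional $\mathscr{I}$ on a vector lattice $L$ of
functions on $X$ to $\mathbf{R}$, the process defining a functional $\mathscr{J}$ on the set
$L_1$ determines a Daniell functional on a lattice $L_1$ which extends $\mathscr{I}$.
Further, if $\{f_n\}$ is an increasing sequence of functions in $L_1$ and $f = \lim f_n$,
then $f \in L_1$ if and only if $\lim \mathscr{J}(f_n)$ is finite, in which case $\mathscr{J}(f) = \lim \mathscr{J}(f_n)$.
Proof. Lemma 2(v) shows that $L_1 \supset L$ and that $\mathscr{J}$ is an extension
of $\mathscr{I}$. Now if $f \in L_1$ so does $cf$ for $c$ in $\mathbf{R}$ since
$$c \ge 0 \Rightarrow \mathscr{I}^*(cf) = c\,\mathscr{I}^*(f) = c\,\mathscr{I}_*(f) = \mathscr{I}_*(cf),$$
$$c < 0 \Rightarrow \mathscr{I}^*(cf) = c\,\mathscr{I}_*(f) = c\,\mathscr{I}^*(f) = \mathscr{I}_*(cf).$$
Further, if $f$ and $g$ are both in $L_1$, using lemma 2 (i),†
$$-\mathscr{I}_*(f + g) = \mathscr{I}^*(-f - g) \le \mathscr{I}^*(-f) + \mathscr{I}^*(-g) = -\mathscr{J}(f) - \mathscr{J}(g)$$
so
$$\mathscr{I}^*(f + g) \le \mathscr{J}(f) + \mathscr{J}(g) \le \mathscr{I}_*(f + g);$$
and, by lemma 2 (iv), $f + g \in L_1$ and
$$\mathscr{J}(f + g) = \mathscr{J}(f) + \mathscr{J}(g).$$
Thus $L_1$ is a real linear space, and $\mathscr{J}$ is a linear functional on $L_1$.
† As pointed out in the proof of lemma 2(i) the inequality is valid, whatever value
in $\mathbf{R}^*$ is chosen for $(f + g)(x)$ at points $x$ where $f(x) = +\infty$, $g(x) = -\infty$. The proof
given then shows that, for $f, g \in L_1$, $(f + g) \in L_1$ whatever values are assumed at such
points.
To prove that $L_1$ is a lattice it is sufficient to prove that
$$f \in L_1 \Rightarrow f^+ \in L_1.$$
For a fixed $f$ in $L_1$ and each $\epsilon > 0$, choose functions $g$, $h$ in $L^+$ such that
$-h \le f \le g$ and
$$\mathscr{I}(g) < \mathscr{J}(f) + \epsilon < \infty, \quad \mathscr{I}(h) < -\mathscr{J}(f) + \epsilon < \infty.$$
Now $g = (g \vee 0) + (g \wedge 0)$ and $g \wedge 0 \in L^+$; so $\mathscr{I}(g \vee 0) = \mathscr{I}(g) - \mathscr{I}(g \wedge 0) < \infty$.
Thus $g^+ = g \vee 0 \in L^+$ and $\mathscr{I}(g^+) < \infty$. Similarly, $-h_- = h \wedge 0 \in L^+$ and
$h_- \le f^+ \le g^+$. But $(g + h) \ge 0$; and separate consideration of each
possible pair of signs for $g$, $h$ shows that $g^+ - h_- \le g + h$. Hence
$$\mathscr{I}(g^+) + \mathscr{I}(-h_-) \le \mathscr{I}(g) + \mathscr{I}(h) < 2\epsilon.$$
But
$$-\mathscr{I}(-h_-) \le \mathscr{I}_*(f^+) \le \mathscr{I}^*(f^+) \le \mathscr{I}(g^+)$$
so that $\mathscr{I}^*(f^+) - \mathscr{I}_*(f^+) < 2\epsilon$. Since $\epsilon$ is arbitrary and $\mathscr{I}(g^+)$ is finite
we have $f^+ \in L_1$ as required.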
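Now suppose $\{f_n\}$ is an increasing sequence of functions in $L_1$
and $f = \lim f_n$. Then if $\lim \mathscr{J}(f_n) = +\infty$, and $g \le f$, $g \in L_1$ it is clear
that $\mathscr{J}(g) < \lim \mathscr{J}(f_n)$ since $\mathscr{J}(g)$ is finite. On the other hand if
$\lim \mathscr{J}(f_n)$ is finite, put $h = f - f_1$. Then $h \ge 0$ and
$$h = \sum_{n=1}^{\infty} (f_{n+1} - f_n).$$
By lemma 3,
$$\mathscr{I}^*(h) \le \sum_{n=1}^{\infty} \{\mathscr{J}(f_{n+1}) - \mathscr{J}(f_n)\} = \lim \mathscr{J}(f_n) - \mathscr{J}(f_1)$$
so that
$$\mathscr{I}^*(f) = \mathscr{I}^*(f_1 + h) \le \mathscr{I}^*(f_1) + \mathscr{I}^*(h) \le \lim \mathscr{J}(f_n).$$
But $f_n \le f$ so that $\mathscr{I}_*(f) \ge \lim \mathscr{J}(f_n)$, and we must have
$$\mathscr{I}^*(f) = \mathscr{I}_*(f) = \lim \mathscr{J}(f_n).$$
This means that, if $\lim \mathscr{J}(f_n)$ is finite, then $f$ is in $L_1$, and
$$\mathscr{J}(f) = \lim \mathscr{J}(f_n).$$
The positive functional $\mathscr{J}$ therefore satisfies (9.4.1) and must be a
Daniell functional on $L_1$. ∎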
Remark. There may be some functions $f: X \to \mathbf{R}^*$ which take the
values $\pm\infty$ at some points but are still in $L_1$. In the course of the proof
we saw that it made no difference to the linear functional $\mathscr{J}$ what
value was assigned to $(bf + cg)(x)$ at points $x$ where the usual calculation
leads to $+\infty + (-\infty)$. It is in this sense that $\mathscr{J}$ is a linear functional
on the real linear space $L_1$. However, we will shortly see that all
functions $f: X \to \mathbf{R}^*$ in $L_1$ must take finite values at `almost all'
points, so that the set where $(bf + cg)$ is not determined by the laws of
algebra is always small (relative to $\mathscr{J}$).
Now if one starts with a Daniell functional $\mathscr{I}$ on a vector lattice $L$
which is already closed for monotone limits, i.e. if $\{f_n\}$ is a monotone
sequence in $L$ and $\lim \mathscr{I}(f_n)$ is finite, then $f = \lim f_n$ is in $L$; the extension
process defined will lead to nothing new as the part of $L^+$ on
which $\mathscr{I}$ is finite is in $L$ and this will give $L = L_1$.
Daniell integral
Any Daniell functional $\mathscr{J}$ on a vector lattice $L_1$ of functions on $X$
to $\mathbf{R}^*$ such that the limit $f$ of a monotone sequence $\{f_n\}$ of functions in
$L_1$ is in $L_1$ provided $\lim \mathscr{J}(f_n)$ is finite is called a Daniell integral.
We now see how one can obtain a theory of measure if one starts
with a linear operator $\mathscr{J}$ satisfying these conditions. The definitions
are made so that the integral (in the sense of Chapter 5) with respect to
the measure recovers the operator $\mathscr{J}$. Starting with a Daniell integral
$\mathscr{J}$ we say that a non-negative function $f: X \to \mathbf{R}^+$ is measurable
(with respect to $\mathscr{J}$) if $g \in L_1 \Rightarrow f \wedge g \in L_1$. We say that a set $A \subset X$
is measurable (with respect to $\mathscr{J}$) if the indicator function $\chi_A$ is measurable;
while the set $A$ is integrable if $\chi_A \in L_1$. In order to ensure that the
class of measurable functions and sets has useful properties we will
further assume that the space $X$ is measurable, that is, that the
constant function $f(x) \equiv 1$ is measurable.
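Lemma 4. If $X$ is measurable, then the class $\mathscr{A}$ of sets measurable with
respect to $\mathscr{J}$ is a $\sigma$-field. If $f: X \to \mathbf{R}^+$ is any non-negative integrable
function, the set $E_a = \{x: f(x) > a\} \in \mathscr{A}$ for all $a \in \mathbf{R}$.
Proof. Given $f$, $g$ non-negative measurable functions, the lattice
properties of $L_1$ immediately give that $f \vee g$ and $f \wedge g$ are measurable. But
$$\chi_{A \cap B} = \chi_A \wedge \chi_B, \quad \chi_{A \cup B} = \chi_A \vee \chi_B,$$
so that $A, B \in \mathscr{A} \Rightarrow A \cap B$ and $A \cup B \in \mathscr{A}$. Further for any set $E$,
$$g \wedge \chi_E = (g \vee 0 + g \wedge 0) \wedge \chi_E = (g \vee 0) \wedge \chi_E + g \wedge 0$$
so that if $g \in L_1$, $g \vee 0$ and $g \wedge 0 \in L_1$ and
$$(g \vee 0) \wedge \chi_{A-B} = (g \vee 0) \wedge \chi_A - (g \vee 0) \wedge \chi_{A \cap B} \in L_1,$$
$$(g \wedge 0) \wedge \chi_{A-B} = g \wedge 0 \in L_1,$$
so that $g \wedge \chi_{A-B} \in L_1$. Thus $\mathscr{A}$ is a ring, and since $X \in \mathscr{A}$, we have
proved that $\mathscr{A}$ is a field. To show that $\mathscr{A}$ is a $\sigma$-field one need only use
the fact that $L_1$ is closed for monotone limits which are bounded,
since $E_n = \bigcup_{i=1}^{n} A_i$ is monotone and so is $\chi_{E_n}$.
Now if $f: X \to \mathbf{R}^+$ is non-negative and is in $L_1$, $E_a = X$ for $a < 0$.
If $a = 0$ put $h = f$; while if $a > 0$ put $h = [a^{-1} f - (a^{-1} f) \wedge 1]$. Then
$h \in L_1$, and in either case $h(x) > 0$ for $x \in E_a$ and $h(x) = 0$ for $x \in X - E_a$.
For each integer $n$, put $f_n = 1 \wedge (nh)$. Then $f_n \in L_1$ and the sequence
$\{f_n\}$ increases monotonely to $\chi_{E_a}$. Hence $\chi_{E_a}$ is measurable, so $E_a$ is
measurable. ∎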

Theorem 9.7 (Stone). Suppose $\mathscr{J}$ is a Daniell integral on the class $L_1$
of functions $f: X \to \mathbf{R}^*$, and $X$ is a measurable set with respect to $\mathscr{J}$. Then
$$\mu(E) = \mathscr{J}(\chi_E) \ \text{ when } E \text{ is integrable},$$
$$\mu(E) = +\infty \ \text{ otherwise}$$
defines a measure $\mu$ on the $\sigma$-field $\mathscr{A}$ of measurable sets. A function
$f: X \to \mathbf{R}^*$ is in $L_1$ if and only if it is integrable with respect to this
measure $\mu$, and
$$\mathscr{J}(f) = \int f\,d\mu \quad\text{for all } f \in L_1.$$
Proof. It is immediate that $\mu(\emptyset) = 0$. If $B$ is integrable and $A$ is
measurable with $A \subset B$, the definitions ensure that $A$ is integrable and
$0 \le \mu(A) \le \mu(B)$. This inequality is trivially satisfied when $B$ is
measurable but not integrable, so $\mu$ is monotone on $\mathscr{A}$.
Now let $\{E_n\}$ be a disjoint sequence in $\mathscr{A}$ and $E = \bigcup_{n=1}^{\infty} E_n$. If at least
one of the $E_n$ fails to be integrable, then $E$ is not integrable and
$$\mu(E) = +\infty = \sum \mu(E_n). \tag{9.4.2}$$
If each of the sets $E_n$ is integrable, then $E$ will be integrable if and only
if $\sum \mu(E_n) < \infty$ by theorem 9.6, since $\chi_E = \sum \chi_{E_n}$, and in this case
$$\mu(E) = \sum \mu(E_n) < \infty.$$
It is clear from the statement of theorem 9.6 that (9.4.2) will be satisfied
if $\sum \mu(E_n) = +\infty$. Thus in all cases, $\mu$ is $\sigma$-additive on $\mathscr{A}$.
Now lemma 4 ensures that $\mathscr{A}$ is a $\sigma$-field and that any non-negative
$\mathscr{J}$-integrable function is $\mathscr{A}$-measurable. Since each $\mathscr{J}$-integrable
function is the difference of two non-negative $\mathscr{J}$-integrable functions
it follows that any $f$ in $L_1$ is $\mathscr{A}$-measurable.

Consider a non-negative $f: X \to \mathbf{R}^+$ in $L_1$. For each pair $(r, s)$ of
positive integers put
$$E_{r,s} = \{x: f(x) > r/s\}.$$
Now $E_{r,s} \in \mathscr{A}$ and $\chi_{E_{r,s}} \in L_1$ (that is, $\mu(E_{r,s}) < \infty$) since
$$\chi_{E_{r,s}} = \chi_{E_{r,s}} \wedge \left(\frac{s}{r} f\right).$$
Put
$$f_n = \frac{1}{s} \sum_{r=1}^{s^2} \chi_{E_{r,s}}, \quad s = 2^n,$$
and note that $\{f_n\}$ is a monotone sequence in $L_1$ which converges to $f$.
Hence $\mathscr{J}(f) = \lim \mathscr{J}(f_n)$. But
$$\mathscr{J}(f_n) = \frac{1}{s} \sum_{r=1}^{s^2} \mathscr{J}(\chi_{E_{r,s}}) = \frac{1}{s} \sum_{r=1}^{s^2} \mu(E_{r,s}) = \int f_n\,d\mu,$$
and from the definition of the integral of a non-negative $\mathscr{A}$-measurable
function we have
$$\mathscr{J}(f) = \lim \int f_n\,d\mu = \int f\,d\mu.$$
Conversely, if $f: X \to \mathbf{R}^+$ is non-negative and integrable with
respect to $\mu$, then each of the sets $E_{r,s}$ is in $\mathscr{A}$ and has finite $\mu$-measure.
Hence $\chi_{E_{r,s}}$ and therefore $f_n$ are in $L_1$. Since
$$\int f\,d\mu = \lim \int f_n\,d\mu = \lim \mathscr{J}(f_n) < \infty,$$
by theorem 9.6, $f = \lim f_n$ is in $L_1$. This completes the representation
theorem for non-negative functions. But for both the functional $\mathscr{J}$,
and the integral with respect to $\mu$ we have a decomposition $f = f^+ - f^-$
of any integrable $f: X \to \mathbf{R}^*$ as the difference of two non-negative
integrable functions, so we can deduce the representation for arbitrary
integrable functions. ∎
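The approximating functions $f_n$ used in this proof depend on $x$ only through the value $f(x)$, so they can be evaluated directly. The following sketch computes $f_n(x) = \frac{1}{s}\sum_{r=1}^{s^2} \chi_{E_{r,s}}(x)$ with $s = 2^n$ in two ways, once by a closed formula and once by summing the indicators, so that monotone convergence from below can be checked on sample values. The function names are hypothetical, and exact rationals are used to avoid rounding at the strict inequality $f(x) > r/s$.

```python
from fractions import Fraction
from math import ceil


def f_n(value, n):
    """f_n(x) = (1/s) * #{r in 1..s^2 : f(x) > r/s} with s = 2**n,
    computed from value = f(x) >= 0 (an int or Fraction)."""
    s = 2 ** n
    # r/s < value holds exactly when r <= ceil(s * value) - 1.
    count = min(s * s, max(0, ceil(s * Fraction(value)) - 1))
    return Fraction(count, s)


def f_n_bruteforce(value, n):
    """The same quantity, adding the indicators of E_{r,s} one by one."""
    s = 2 ** n
    hits = sum(1 for r in range(1, s * s + 1) if value > Fraction(r, s))
    return Fraction(hits, s)
```

For a fixed value $f(x) \le 2^n$ the approximation lies within $2^{-n}$ below $f(x)$, and larger values are truncated at $s = 2^n$ until $n$ is big enough, which is why the sum runs up to $s^2$.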
An obvious question arising is that of uniqueness for the measure
$\mu$ in theorem 9.7. This cannot always be obtained, but we give an
outline of the uniqueness proof under suitable conditions in exercises
9.4 (8, 9).

Exercises 9.4
1. Show that the condition (9.4.1) for a positive linear functional is
equivalent to saying that, if $\{u_n\}$ is a sequence of non-negative functions in
$L$ and $\phi \in L$ satisfies $\phi \le \sum u_n$, then $\mathscr{I}(\phi) \le \sum \mathscr{I}(u_n)$.
2. If $(\Omega, \mathscr{F}, \mu)$ is a $\sigma$-finite measure space, $L$ is the class of $\mu$-integrable
functions and $\mathscr{I}(f) = \int f\,d\mu$, show that $\mathscr{I}$ is a Daniell functional on $L$.
3. Let $J$ be the class of continuous functions on $\mathbf{R}$ to $\mathbf{R}$ which are zero
outside $[-K, K]$ for some $K$ and put
$$\mathscr{I}(f) = \int_{-\infty}^{\infty} f(x)\,dx \quad\text{in the Riemann sense.}$$
Show that $\mathscr{I}$ is a Daniell functional on $J$.
4. If $\mathscr{J}$ is a Daniell integral defined on the class $L_1$ prove that
$$f \in L_1 \Rightarrow |f| \in L_1.$$
5. (Fatou for Daniell integral.) Suppose $\{f_n\}$ is a sequence of non-negative
functions in $L_1$. Prove that $\liminf f_n$ is in $L_1$ if $\liminf \mathscr{J}(f_n) < \infty$
and in this case $\mathscr{J}(\liminf f_n) \le \liminf \mathscr{J}(f_n)$.
6. (Dominated convergence.) Suppose $\{f_n\}$ is a convergent sequence in
$L_1$ such that $|f_n| \le g$ for all $n$ where $g \in L_1$. Then if $f = \lim f_n$, $f \in L_1$ and
$$\mathscr{J}(f) = \lim \mathscr{J}(f_n).$$
7. Suppose $\mu$ is a measure on a field $\mathscr{A}$ of subsets of $X$, and $L$ is the family
of finite linear combinations of indicator functions of sets of $\mathscr{A}$ with finite
measure. Show that $L$ is a vector lattice and if $\mathscr{I}$ is defined on $L$ to be integration
with respect to $\mu$, then $\mathscr{I}$ is a Daniell functional. Discuss its extension
$\mathscr{J}$ to a Daniell integral.
8. Suppose $\mathscr{I}$ is a Daniell functional on a vector lattice $L$, and $\mathscr{I}'$ is an
extension of $\mathscr{I}$ to a Daniell functional on a vector lattice $L' \supset L$. If $\mathscr{I}$
and $\mathscr{I}'$ are extended to give Daniell integrals $\mathscr{J}$, $\mathscr{J}'$ over $L_1$ and $L_1'$ show that
$L_1' \supset L_1$ and $\mathscr{J}'$ is an extension of $\mathscr{J}$.
9. Suppose $L$ is a fixed vector lattice containing the constant function
1 and $\mathscr{B}$ is the smallest $\sigma$-field of subsets of $X$ such that each function in $L$
is measurable $\mathscr{B}$. Prove that for each Daniell integral $\mathscr{J}$ on $L_1$ there is a
unique measure $\mu$ on $\mathscr{B}$ such that
$$\mathscr{J}(f) = \int f\,d\mu \quad\text{for all } f \in L.$$
Hint. If $\mathscr{A}$ is the $\sigma$-field of sets measurable w.r.t. $\mathscr{J}$, $\mathscr{A} \supset \mathscr{B}$. Existence of
$\mu$ follows from theorem 9.7. To prove uniqueness it is sufficient to show that
for any such $\mu$, $\mu(B) = \mathscr{J}(\chi_B)$ for all $B \in \mathscr{B}$.
Use questions 8 and 7 above to extend the two Daniell functionals: one
given and the other defined in terms of the integrals with respect to $\mu$.

9.5* Representation of linear functionals

In this section we restrict our attention to topological spaces X
which are locally compact and Hausdorff. A topological space is
Hausdorff if given two distinct points $x, y \in X$, there are open sets
$G$, $H$ with $x \in G$, $y \in H$, $G \cap H = \emptyset$. The family of functions $f: X \to \mathbf{R}$
which are continuous on $X$ and vanish outside a compact subset
of $X$ is called $C_0(X)$. If we define the support of a function $f: X \to \mathbf{R}$
to be the closure of the set $\{x: f(x) \ne 0\}$, then $C_0(X)$ is the family
of those continuous functions $f: X \to \mathbf{R}$ which have compact support.

Baire sets and measure

The class of Baire sets is the smallest $\sigma$-field $\mathscr{C}$ of subsets of $X$ such
that each function $f$ in $C_0(X)$ is $\mathscr{C}$-measurable. Thus $\mathscr{C}$ is the $\sigma$-field
generated by the sets of the form $\{x: f(x) > a\}$, $f \in C_0(X)$, $a \in \mathbf{R}$.
A measure $\mu$ is called a Baire measure on $X$ if $\mu$ is defined on the
$\sigma$-field $\mathscr{C}$ of Baire subsets, and $\mu(K)$ is finite for each compact set $K$ in $\mathscr{C}$.
Clearly $C_0(X)$ is a normed linear space if we put
$$\|f\| = \sup_{x \in X} |f(x)|,$$
and we will also use the fact that $C_0(X)$ is a vector lattice. This allows
us to identify the positive linear functionals on $C_0(X)$.
Theorem 9.8 (Riesz). Suppose $X$ is locally compact Hausdorff, and
$\mathscr{I}$ is a positive linear functional on the space $C_0(X)$ of continuous functions
$f: X \to \mathbf{R}$ with compact support. Then there is a Baire measure
$\mu$ on $X$ such that
$$\mathscr{I}(f) = \int f\,d\mu \quad\text{for all } f \in C_0(X).$$
Proof. The first step is to show that $\mathscr{I}$ must be a Daniell functional
on $C_0(X)$. Suppose $f \in C_0(X)$, $\{f_n\}$ is an increasing sequence in $C_0(X)$
and $f \le \lim f_n$. In order to prove that $\mathscr{I}(f) \le \lim \mathscr{I}(f_n)$ it is sufficient
to show that $\mathscr{I}(f) = \lim \mathscr{I}(g_n)$ where $g_n = f \wedge f_n$ so that
$$f = \lim g_n \le \lim f_n.$$
But then, if we put $h_n = f - g_n$ we obtain a decreasing sequence of
functions of $C_0(X)$ whose limit is zero. Let $K$ be the support of $h_1$,
then there is a function $\phi$ in $C_0(X)$ which is non-negative and satisfies
$\phi(x) = 1$ for $x \in K$.† For each $x \in K$, $\epsilon > 0$ there is an $n_x$ such that
$h_{n_x}(x) < \tfrac{1}{2}\epsilon$ and, since $h_{n_x}$ is continuous, there is an open set $G_x$ for which
$x \in G_x$ and $h_{n_x}(t) < \epsilon$ for $t \in G_x$.
† This uses a separation property of $X$; see, for example page 146 of J. L. Kelley
General Topology, Van Nostrand (1955).
Since $K$ is compact there is a finite subcovering $G_{x_1}, \ldots, G_{x_s}$ of $K$.
If $N = \max[n_{x_1}, \ldots, n_{x_s}]$ we have $h_n(x) < \epsilon$ for all $x$ in $K$, $n \ge N$. Thus
$$0 \le h_n \le \epsilon\phi \quad\text{for } n \ge N,$$
so that $0 \le \mathscr{I}(h_n) \le \epsilon\,\mathscr{I}(\phi)$.
Since $\epsilon$ is arbitrary we must have $\lim \mathscr{I}(h_n) = 0$ which implies condition
(9.4.1).
We can now apply theorem 9.7 to the extension $\mathscr{J}$ of $\mathscr{I}$ to
$$L_1 \supset C_0(X)$$
to obtain a measure $\mu$ on the $\sigma$-field $\mathscr{A}$ which contains the Baire sets
and such that, for $f \in C_0(X)$,
$$\mathscr{I}(f) = \mathscr{J}(f) = \int f\,d\mu.$$
By considering the above function $\phi$ which is in $C_0(X)$ and takes the
value 1 on the compact $K$, we see that
$$\mu(K) = \mathscr{J}(\chi_K) \le \mathscr{J}(\phi) = \int \phi\,d\mu < \infty,$$
so that the measure $\mu$ we have obtained is finite on compact sets. ∎
When $X$ is compact, $C_0(X)$ is the same as $C(X)$, the space of continuous
$f: X \to \mathbf{R}$, so that in this case the positive linear functionals
on $C(X)$ correspond to finite Baire measures. Further, because of
exercise 9.4 (9) there is uniqueness. This gives
Corollary. If $X$ is a compact topological space and $C(X)$ is the set
of continuous functions $f: X \to \mathbf{R}$, then there is a (1, 1) correspondence
between positive linear functionals $\mathscr{I}$ on $C(X)$ and finite Baire measures
$\mu$ on $X$ given by
$$\mathscr{I}(f) = \int f\,d\mu.$$
If we want to consider more general linear functionals on $C(X)$,
it is convenient to express these as the difference of two positive linear
functionals so that theorem 9.8 can be applied. This can be done for
bounded linear functionals.
If $L$ is a vector lattice of bounded functions $f: X \to \mathbf{R}$, then $L$
is a normed linear space with $\|f\| = \sup_{x \in X} |f(x)|$. A bounded linear
functional $F$ has a norm
$$\|F\| = \sup_{\|f\| \le 1} |F(f)|.$$
Theorem 9.9. Suppose $L$ is a vector lattice of bounded functions $f: X \to \mathbf{R}$
which contains the constant function 1. Then for each bounded linear
functional $F$ on $L$, there are two positive linear functionals $F^+$ and $F^-$
such that $F = F^+ - F^-$ and $\|F\| = F^+(1) + F^-(1)$.
Proof. For each $f \ge 0$ in $L$ put
$$F^+(f) = \sup_{0 \le \phi \le f,\ \phi \in L} F(\phi).$$
Since $F(0) = 0$, $F^+(f) \ge 0$ and $F^+(f) \ge F(f)$. Further
$$F^+(cf) = cF^+(f) \quad\text{for } c > 0.$$
If $f$, $g$ are two non-negative functions in $L$, and $0 \le \phi \le f$,
$0 \le \chi \le g$, then $0 \le \phi + \chi \le f + g$, so that $F^+(f + g) \ge F(\phi) + F(\chi)$.
Taking suprema over all such $\phi$, $\chi$ in $L$ gives
$$F^+(f + g) \ge F^+(f) + F^+(g).$$
To obtain the reverse inequality consider $\chi \in L$ such that
$0 \le \chi \le f + g$: then $0 \le \chi \wedge f \le f$ and $0 \le \chi - (\chi \wedge f) \le g$
so that
$$F(\chi) = F(\chi \wedge f) + F[\chi - (\chi \wedge f)] \le F^+(f) + F^+(g)$$
and taking the supremum over such $\chi$ gives
$$F^+(f + g) \le F^+(f) + F^+(g).$$
For an arbitrary $f \in L$, let $p$, $q$ be two constants such that $(f + p)$
and $(f + q)$ are both non-negative. Then
$$F^+(f + p + q) = F^+(f + p) + F^+(q) = F^+(f + q) + F^+(p)$$
so that $F^+(f + p) - F^+(p) = F^+(f + q) - F^+(q)$.
This means that the value of $[F^+(f + p) - F^+(p)]$ is independent of $p$
and we can define $F^+(f)$ to be this value. Thus $F^+$ is now defined on $L$,
$$F^+(f + g) = F^+(f) + F^+(g) \quad\text{for all } f, g \in L,$$
and $F^+(cf) = cF^+(f)$ for $c \ge 0$, $f \in L$.
But $F^+(-f) + F^+(f) = F^+(0) = 0$ so we have $F^+(-f) = -F^+(f)$ and
$F^+$ is a positive linear functional on $L$.
But $F^+(f) \ge F(f)$ for $f \ge 0$, so that $F^- = F^+ - F$ is also a positive linear
functional on $L$.
Now $\|F\| \le \|F^+\| + \|F^-\| = F^+(1) + F^-(1)$.
To establish the opposite inequality consider functions $f \in L$ for
which $0 \le f \le 1$. Since $|2f - 1| \le 1$,
$$\|F\| \ge F(2f - 1) = 2F(f) - F(1).$$
Taking the supremum over such $f$ gives
$$\|F\| \ge 2F^+(1) - F(1) = F^+(1) + F^-(1). \ ∎$$
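In the simplest case the decomposition of theorem 9.9 can be computed by hand. Suppose, as an assumption made only for this sketch, that $X$ is a finite set with $m$ points and $L$ is the space of all real functions on $X$, so that every linear functional has the form $F(f) = \sum_i w_i f(i)$. The code below evaluates $F^+$ from its definition as a supremum over $0 \le \phi \le f$ (a linear function on a box attains its supremum at a vertex, so only vertices are tried) and compares $\|F\|$ with $F^+(1) + F^-(1)$. The weights and function names are hypothetical.

```python
from itertools import product

WEIGHTS = [2.0, -1.5, 0.0, 0.5]  # hypothetical coefficients w_i of F


def F(f):
    """F(f) = sum_i w_i f(i) for a function f given as a sequence of values."""
    return sum(w * v for w, v in zip(WEIGHTS, f))


def F_plus(f):
    """F+(f) = sup{F(phi): 0 <= phi <= f} for f >= 0.

    F is linear in phi, so the supremum over the box [0, f] is attained
    at a vertex; every vertex is tried.
    """
    vertices = product(*[(0.0, v) for v in f])
    return max(F(phi) for phi in vertices)


def F_minus(f):
    """F- = F+ - F, evaluated on f >= 0."""
    return F_plus(f) - F(f)


def norm_F():
    """||F|| = sup{|F(f)|: sup|f| <= 1}, attained at a vertex of [-1, 1]^m."""
    return max(abs(F(f)) for f in product((-1.0, 1.0), repeat=len(WEIGHTS)))
```

In this setting $F^+$ and $F^-$ simply keep the positive and negative weights respectively, and $\|F\| = \sum_i |w_i|$, which matches the identity $\|F\| = F^+(1) + F^-(1)$.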
Corollary. Let $X$ be a compact Hausdorff space and $C(X)$ the set of
continuous functions $f: X \to \mathbf{R}$. Then there is a (1, 1) correspondence
between finite signed Baire measures $\nu$ on $X$ and the dual space to $C(X)$
given by
$$F(f) = \int f\,d\nu.$$
Moreover, $\|F\| = |\nu|(X)$.
Proof. If one starts with a finite signed Baire measure $\nu$, then by
theorem 3.3, there is a decomposition $\nu = \nu_+ - \nu_-$ into the difference
of two finite Baire measures. Clearly
$$F(f) = \int f\,d\nu_+ - \int f\,d\nu_-$$
then defines a bounded linear functional on $C(X)$ since each function
$f$ in $C(X)$ is bounded and measurable with respect to the class of
Baire sets.
Conversely given a bounded linear functional $F$ on $C(X)$, this can
be decomposed by theorem 9.9 into the difference $F = F^+ - F^-$ of two
positive linear functionals. Apply theorem 9.8 and corollary to find
finite Baire measures $\mu_1$, $\mu_2$ with
$$F^+(f) = \int f\,d\mu_1, \quad F^-(f) = \int f\,d\mu_2.$$
If we put $\nu = \mu_1 - \mu_2$, then $\nu$ is a finite signed Baire measure and
$$F(f) = \int f\,d\nu.$$
Now
$$|F(f)| \le \int |f|\,d|\nu| \le \|f\|\,|\nu|(X)$$
so that $\|F\| \le |\nu|(X)$. Further
$$|\nu|(X) \le \mu_1(X) + \mu_2(X) = F^+(1) + F^-(1) = \|F\|$$
so we have $\|F\| = |\nu|(X)$.
To prove that $\nu$ is uniquely determined by $F$, suppose there are
two signed measures $\nu_1$, $\nu_2$ with
$$\int f\,d\nu_1 = \int f\,d\nu_2 \quad\text{for each } f \in C(X).$$
Decompose $\lambda = \nu_1 - \nu_2$ by theorem 3.2 to give $\lambda = \lambda_+ - \lambda_-$. Then
$$\int f\,d\lambda_+ = \int f\,d\lambda_- \quad\text{for all } f \in C(X),$$
so that by the uniqueness proved in exercise 9.4 (9), $\lambda_+ = \lambda_-$ on the
Baire sets. Hence $\nu_1 = \nu_2$. ∎

Exercises 9.5

1. Show that in a locally compact separable metric space the class of
Baire sets is the same as the class of Borel sets.
2. Suppose $\mu$ is a Baire measure on a locally compact space $X$. Let
$H$ be the union of all open Baire sets $G$ for which $\mu(G) = 0$. The complement
$F = X - H$ is closed and called the support of $\mu$. Prove
(i) if $G$ is an open Baire set and $G \cap F \ne \emptyset$ then $\mu(G) > 0$;
(ii) if $K$ is a compact Baire set with $K \cap F = \emptyset$, then $\mu(K) = 0$;
(iii) if $f \in C_0(X)$ and $f \ge 0$, $\int f\,d\mu = 0$ if and only if $f \equiv 0$.
3. The corollary to theorem 9.9 is not valid on $C_0(X)$ for $X$ locally compact
Hausdorff. A Radon measure $\phi$ on a locally compact space is defined
to be a linear functional on $C_0(X)$ which is continuous in the sense that, for
each compact $K$, $\epsilon > 0$ there is a $\delta > 0$ such that $|f(x)| < \delta$ for all $x$, with
the support of $f$ contained in $K$, implies that $|\phi(f)| < \epsilon$. Prove every
positive linear functional is a Radon measure.
For $\mathbf{R}$ and the usual topology define
$$\phi(f) = \sum_{r=-\infty}^{\infty} (-1)^r f(r) \quad\text{for } f \in C_0(\mathbf{R}).$$
Show that $\phi$ is a Radon measure, but that $\phi$ does not correspond to any
signed Baire measure.

9.6* Haar measure

There is a general method of defining a measure on an important
class of topological spaces which have the algebraic structure of a
group. For notational purposes we will represent the group operation
in the set X by multiplication. We do not assume that the group
operation is commutative. For subsets $A$, $B$ of $X$ and an element $x \in X$ we define
$$xA = \{xy : y \in A\},$$
$$AB = \{xy : x \in A,\ y \in B\},$$
$$A^{-1} = \{x : x^{-1} \in A\},$$
and call $xA$ and $Ax = \{yx : y \in A\}$ respectively the left translation and right translation of $A$ by $x$. We also require the algebraic operations to be continuous in the topology of $X$. The theory of Haar measure can be developed for any such topological group which is locally compact and Hausdorff, but in this section we will make the additional (unnecessary) assumption that the topology is determined by a metric $\rho$.
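The set operations $xA$, $Ax$, $AB$ and $A^{-1}$ can be tried out on a small non-commutative group. The following is only an illustrative sketch: it assumes the symmetric group on three letters (with the discrete metric this is a compact metric group), represents permutations as tuples, and all function names are hypothetical. It shows that left and right translations of the same set can differ, while both preserve the number of elements.

```python
from itertools import permutations


def compose(p, q):
    """Group product pq of two permutations given as tuples: (pq)(i) = p[q[i]]."""
    return tuple(p[i] for i in q)


def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)


def left_translate(x, A):
    """xA = {xy : y in A}."""
    return {compose(x, y) for y in A}


def right_translate(A, x):
    """Ax = {yx : y in A}."""
    return {compose(y, x) for y in A}


def product_set(A, B):
    """AB = {xy : x in A, y in B}."""
    return {compose(x, y) for x in A for y in B}


def inverse_set(A):
    """A^{-1} = {x : x^{-1} in A}, computed as the set of inverses of elements of A."""
    return {inverse(x) for x in A}


S3 = set(permutations(range(3)))   # the whole group (assumed example)
identity = (0, 1, 2)
A = {identity, (1, 0, 2)}          # a two-element subset
x = (0, 2, 1)                      # the element used for translation

xA = left_translate(x, A)
Ax = right_translate(A, x)
```

Here `xA` and `Ax` are different subsets of the group, but each has as many elements as `A`, so counting the elements of a set is unchanged by either kind of translation.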
A set $X$ is a metric group if $X$ is a group and there is a metric $\rho$ such that in $(X, \rho)$, the group operation is continuous. In particular
$$\lim_{n\to\infty} x_n = x_0,\ \lim_{n\to\infty} y_n = y_0 \implies \lim_{n\to\infty} x_n y_n = x_0 y_0,\ \lim_{n\to\infty} x_n^{-1} = x_0^{-1}.$$

We will, for the remainder of the section, assume that $X$ is a metric group which is locally compact in the topology of the metric.
We are interested in measures for which the translation of A by
any element x leaves the measure invariant. For example, the space R
of real numbers is clearly a metric group with ordinary addition for
the group operation. Given a set $E \subset R$, and a point $x \in R$, $xE$ denotes the set of real numbers of the form $x + y$ with $y \in E$. We showed in § 4.5 that Lebesgue measure in $R$ is invariant under translations in the sense that, for measurable $E$, $|E| = |xE|$. The notion of an invariant measure in a topological group should be thought of as a generalisation of this property of Lebesgue measure in $R$. To be precise, a measure $\mu$ defined on the class $\mathcal{B}$ of Borel subsets of $X$ is called a left Haar measure if
(i) $\mu$ is invariant under left translations; that is, for every $E \in \mathcal{B}$, $x \in X$, $\mu(xE) = \mu(E)$;
(ii) for every compact set $C$, $\mu(C) < \infty$;
(iii) for every non-void open set $G$, $\mu(G) > 0$.
Conditions (ii) and (iii) eliminate such trivial measures as the zero measure, and the measure which is $+\infty$ except on the null set. A
right Haar measure is one for which left translation invariance is
replaced by invariance for right translations. We give the details of
construction for a left Haar measure: obvious modifications would
give the right Haar measure.
Let $\mathcal{C}_0$ be the class of non-empty open subsets of $X$ whose closures are compact. The important consequence of local compactness is that every compact $K$ in $X$ can be covered by a finite number of sets of $\mathcal{C}_0$. The sets $\emptyset$, $X$ added to $\mathcal{C}_0$ form the class $\mathcal{C}$. The first step is to define a suitable set function $\lambda$ on $\mathcal{C}$.
Suppose $H \in \mathcal{C}_0$, and $G$ is any non-empty open set. Then
$$\mathcal{G} = \{xG : x \in \bar{H}G^{-1}\}$$
is a class of open sets covering $\bar{H}$ since, if $y \in \bar{H}$, $g \in G$, $x = yg^{-1}$, then $y = xg \in xG$. But $\bar{H}$ is compact so there is a finite subclass of $\mathcal{G}$ which covers $\bar{H}$. Let the smallest number of sets of $\mathcal{G}$ which cover $\bar{H}$ be denoted
$$(H : G).$$
This is a measure of the relative sizes of $H$ and $G$. It is immediate that, for $A, B, C \in \mathcal{C}_0$,
$$1 \le (A : C) \le (A : B)(B : C).$$
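For intervals of the real line under addition the covering ratio can be written down explicitly, which makes the inequality above easy to test. The sketch below rests on two assumptions that are not taken from the text: the set to be covered is the closed interval of length $h$ (the closure of the open interval), and the covering sets are translates of an open interval of length $g$. Under these assumptions $n$ translates suffice exactly when $ng > h$, so the least number is $\lfloor h/g \rfloor + 1$. The function names are hypothetical.

```python
from fractions import Fraction
from math import floor


def covering_number(h, g):
    """Least number of translates of an open interval of length g needed to
    cover a closed interval of length h (special case on the real line).

    n overlapping open intervals of length g have a union of length < n*g,
    so n translates suffice exactly when n*g > h."""
    h, g = Fraction(h), Fraction(g)
    if h <= 0 or g <= 0:
        raise ValueError("interval lengths must be positive")
    return floor(h / g) + 1


def is_submultiplicative(a, b, c):
    """Check 1 <= (A:C) <= (A:B)(B:C) for intervals of lengths a, b, c."""
    ac = covering_number(a, c)
    return 1 <= ac <= covering_number(a, b) * covering_number(b, c)


# Lengths 5, 2, 1: (A:C) = 6 while (A:B)(B:C) = 3 * 3 = 9.
example = (covering_number(5, 1), covering_number(5, 2), covering_number(2, 1))
```

Exact rational arithmetic is used so that the floor is not disturbed by rounding when $h/g$ is an integer.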
Now compare all sets with some fixed $H_0 \in \mathcal{C}_0$, and put, for each non-empty open set $G$, $H \in \mathcal{C}_0$,
$$\lambda_G(H) = \frac{(H : G)}{(H_0 : G)}.$$

Now, for fixed $H$, $\lambda_G(H)$ is a bounded function of $G$ since

$$0 < \frac{1}{(H_0 : H)} \le \lambda_G(H) \le (H : H_0). \tag{9.6.1}$$

If $e$ is the identity element of the group $X$ and

$$S_n = S(e, 1/n) \quad (n = 1, 2, \ldots)$$

is the open sphere centre $e$ radius $1/n$, then for each fixed $H \in \mathcal{C}_0$
$$\lambda_{S_n}(H) \quad (n = 1, 2, \ldots)$$
is a bounded sequence of real numbers. Put
$$\lambda(H) = \operatorname{Lim}_{n\to\infty} \lambda_{S_n}(H)$$
where Lim is the generalized limit defined for all sequences in $m$ using the Hahn-Banach theorem to extend the definition from $c$ to $m$ while preserving the norm (see exercise 8.4 (7)). Finally, put $\lambda(\emptyset) = 0$, $\lambda(X) = +\infty$ if $X$ is not compact (and so not in $\mathcal{C}_0$).
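In the simplest case the sequence $\lambda_{S_n}(H)$ actually converges, so the generalized limit is just the ordinary limit. The sketch below illustrates this for the real line under addition, with the reference set $H_0 = (0, 1)$ and $S_n = (-1/n, 1/n)$. It assumes the closed-form covering number $\lfloor h/g \rfloor + 1$ for covering a closed interval of length $h$ by translates of an open interval of length $g$; that formula and the function names are assumptions made for the illustration, not part of the text.

```python
from fractions import Fraction
from math import floor


def covering_number(h, g):
    """Least number of translates of an open interval of length g needed to
    cover a closed interval of length h (assumed closed form)."""
    return floor(Fraction(h) / Fraction(g)) + 1


def lambda_ratio(h, n, h0=1):
    """(H : S_n) / (H_0 : S_n) for intervals H, H_0 of lengths h, h0,
    where S_n = (-1/n, 1/n) has length 2/n."""
    g = Fraction(2, n)
    return Fraction(covering_number(h, g), covering_number(h0, g))


# For an interval H of length 3 the ratios approach 3 as n grows.
ratios = [lambda_ratio(3, n) for n in (1, 10, 1000, 10**6)]

# The bounds of (9.6.1): 1/(H_0 : H) <= ratio <= (H : H_0).
lower = Fraction(1, covering_number(1, 3))
upper = covering_number(3, 1)
```

The ratios stay between the two bounds for every $n$ and tend to the length of $H$, which is consistent with the resulting measure being Lebesgue measure in this case.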
Lemma. The set function $\lambda$ defined on $\mathcal{C}$ has the following properties:
(i) $0 < \lambda(H) < \infty$ for every $H \in \mathcal{C}_0$;
(ii) if $H_1, H_2 \in \mathcal{C}_0$, $d(H_1, H_2) > 0$ then
$$\lambda(H_1 \cup H_2) = \lambda(H_1) + \lambda(H_2);$$
(iii) for any $H_1, H_2 \in \mathcal{C}_0$,
$$\lambda(H_1 \cup H_2) \le \lambda(H_1) + \lambda(H_2);$$
(iv) if $H_1, H_2 \in \mathcal{C}_0$, $H_1 \subset H_2$ then
$$\lambda(H_1) \le \lambda(H_2);$$
(v) for any $x \in X$, $H \in \mathcal{C}_0$, $\lambda(xH) = \lambda(H)$.
Proof. By (9.6.1), $\lambda_{S_n}(H)$ is bounded below and above so that

$$0 < \frac{1}{(H_0 : H)} \le \lambda(H) \le (H : H_0) < \infty.$$

This establishes (i). Further, for each $H \in \mathcal{C}_0$, $G$ open, the covering ratios $(xH : G) = (H : G)$ for all $x \in X$; so that the sequence $\lambda_{S_n}(H)$ is invariant under left translations: therefore $\lambda$ is also and (v) is proved.
If $H_1, H_2 \in \mathcal{C}_0$ and $d(H_1, H_2) = \eta > 0$, then for $1/n < \tfrac{1}{2}\eta$ we must have
$$(H_1 \cup H_2 : S_n) = (H_1 : S_n) + (H_2 : S_n),$$
$$\lambda_{S_n}(H_1 \cup H_2) = \lambda_{S_n}(H_1) + \lambda_{S_n}(H_2),$$
and (ii) is now established by taking generalised limits.
Now for any open $G$, and $H_1, H_2 \in \mathcal{C}_0$,
$$(H_1 \cup H_2 : G) \le (H_1 : G) + (H_2 : G),$$
so
$$\lambda_{S_n}(H_1 \cup H_2) \le \lambda_{S_n}(H_1) + \lambda_{S_n}(H_2);$$

this implies (iii) and a similar argument gives (iv).

We now define a set function $\mu^*$ for all subsets of $X$ by

$$\mu^*(E) = \inf \sum_{i=1}^{\infty} \lambda(H_i), \tag{9.6.2}$$

where the infimum is taken over all coverings $\{H_i\}$ of $E$ by sets in $\mathcal{C}$.

Theorem 9.10. In a locally compact metric group, the set function $\mu^*$ given by (9.6.2) is a metric outer measure. The restriction $\mu$ of $\mu^*$ to the class $\mathcal{B}$ of Borel sets is a left Haar measure.
Proof. In the definition of outer measure given in § 3.1, condition (i) is obvious, (ii) follows from (iv) of the lemma, and subadditivity (iii) follows from (9.6.2) as in the proof of theorem 4.2. Thus $\mu^*$ is an outer measure. Now suppose $E_1, E_2 \subset X$ with $d(E_1, E_2) > 0$. If $E_1 \cup E_2$ cannot be covered by a sequence from $\mathcal{C}_0$, then at least one of the sets $E_1$, $E_2$ cannot be covered by such a sequence and
$$\mu^*(E_1 \cup E_2) = \mu^*(E_1) + \mu^*(E_2) \tag{9.6.3}$$
since both sides are $+\infty$. If $E_1$, $E_2$ can be covered by sequences from $\mathcal{C}_0$, first choose open sets $G_1 \supset E_1$, $G_2 \supset E_2$ for which $d(G_1, G_2) > 0$ and let $\{H_i\}$ be a sequence of sets from $\mathcal{C}_0$ covering $E_1 \cup E_2$ with
$$\sum \lambda(H_i) \le \mu^*(E_1 \cup E_2) + \epsilon.$$
For each $i$, put $H_i^1 = G_1 \cap H_i$, $H_i^2 = G_2 \cap H_i$.
Then by (ii) and (iv) of the lemma, for each integer $i$,
$$\lambda(H_i) \ge \lambda(H_i^1 \cup H_i^2) = \lambda(H_i^1) + \lambda(H_i^2)$$
and so
$$\mu^*(E_1) + \mu^*(E_2) \le \sum \lambda(H_i) \le \mu^*(E_1 \cup E_2) + \epsilon.$$
Since this is true for each $\epsilon > 0$, and $\mu^*$ is subadditive, we have established (9.6.3) so that $\mu^*$ is a metric outer measure.
Now apply theorem 4.1 to $\mu^*$ to obtain a measure $\mu$ on a class $\mathcal{M}$ of $\mu^*$-measurable sets. Since $\mu^*$ is a metric outer measure, this class $\mathcal{M}$ includes the open sets and therefore the Borel sets $\mathcal{B}$ (see exercise 4.3 (4)); so that the restriction of $\mu^*$ to $\mathcal{B}$ defines a measure on $\mathcal{B}$.
If we now examine the conditions for $\mu$ to be left Haar measure we see that (v) of the lemma implies that $\mu^*$ is left translation invariant. If $K$ is any compact set in $X$, there is a finite subclass of $\mathcal{C}_0$ which covers $K$ so that
$$\mu^*(K) \le \sum_{i=1}^{n} \lambda(H_i) < \infty$$

so that condition (ii) for a Haar measure is satisfied. Now suppose $G$ is any non-void open set in $X$. If $x \in G$, pick $\epsilon > 0$ such that $S(x, \epsilon) \subset G$ and put $E = S(x, \tfrac{1}{2}\epsilon)$ so that $\bar{E} \subset G$. Since $X$ is locally compact we may assume $\epsilon$ is small enough to make $\bar{E}$ compact so that $E \in \mathcal{C}_0$. If $\mu(G) = \infty$ then $\mu(G) > 0$; so we may suppose $\mu(G) < \infty$. For each $\eta > 0$ there is a sequence $\{H_i\}$ from $\mathcal{C}_0$ such that

$$\bigcup_{i=1}^{\infty} H_i \supset G \supset \bar{E}, \quad \sum_{i=1}^{\infty} \lambda(H_i) < \mu^*(G) + \eta.$$

But $\bar{E}$ is compact so a finite number of the $H_i$ must cover $\bar{E}$. Then if

$$\bigcup_{i=1}^{n} H_i \supset \bar{E},$$

$$\lambda(E) \le \lambda\Big(\bigcup_{i=1}^{n} H_i\Big) \le \sum_{i=1}^{n} \lambda(H_i) < \mu^*(G) + \eta,$$

and since $\eta$ is arbitrary we have
$$\mu(G) = \mu^*(G) \ge \lambda(E) > 0$$
so that $\mu$ satisfies condition (iii) for a Haar measure.
Corollary. For any compact metric group $X$ there is a left Haar measure $P$ defined on a $\sigma$-field $\mathcal{F}$ which includes the Borel sets such that $(X, \mathcal{F}, P)$ is a probability space.
Proof. If $X$ is compact, the above construction gives a left Haar measure in which
$$0 < \mu(X) < \infty$$
with $\mu$ defined on a $\sigma$-field $\mathcal{F}$ which is complete with respect to $\mu$. If we put
$$P(E) = \frac{\mu(E)}{\mu(X)} \quad \text{for } E \in \mathcal{F}$$
it is clear that $(X, \mathcal{F}, P)$ is a probability space.

Exercises 9.6
1. Suppose $\Omega$ is the set of positive real numbers with the usual metric and multiplication for the group operation. If $(1, e)$ is the reference set $H_0$ used in the definition of Haar measure $\mu$ in $\Omega$, show that
$$\mu(a, b) = \log (b/a) \quad \text{for each interval } (a, b) \subset \Omega.$$
(Here $e$ is the base for Napierian logarithms.)
2. With $X = R$ and addition for the group operation define Haar measure $\mu$ with $(0, 1)$ taken as the reference set $H_0$. Show that $\mu$ is Lebesgue measure in $R$.
3. Let $X$ be the set of $2 \times 2$ matrices of the form
$$\begin{pmatrix} x & y \\ 0 & x \end{pmatrix}$$
with $x > 0$ and multiplication for the group operation. Define a metric in $X$ by using the Euclidean metric in $R^2$. Show that in the topology of this metric, $X$ is a locally compact metric group. Put
$$F\begin{pmatrix} x & y \\ 0 & x \end{pmatrix} = -\frac{y}{x}.$$
Map the Lebesgue-Stieltjes measure $\mu_F$ in the right half-plane $x > 0$ of $R^2$ onto the set $X$, and show that the result is both a left and a right Haar measure.
4. $X$ is the set of $2 \times 2$ matrices of the form
$$\begin{pmatrix} x & y \\ 0 & 1 \end{pmatrix} \quad (x > 0)$$
metrised by the Euclidean metric in $R^2$. As in question 3, obtain a measure in $X$ by mapping the Lebesgue-Stieltjes measure $\mu_F$ of question 3 into $X$. Show that this is a left Haar measure but not a right Haar measure.
5. If $\mu$ is a left Haar measure on $X$ and $\nu$ is defined by $\nu(E) = \mu(E^{-1})$, show $\nu$ is a right Haar measure.
6. The left Haar measure of theorem 9.10 is regular in the sense that
$$\mu^*(E) = \inf\{\mu(G) : G \supset E,\ G \text{ open}\}.$$
7. Haar measure is obviously not unique since for any Haar measure $\mu$, $c > 0$ the measure $c\mu$ is also a Haar measure. However, on a compact metric group, with the condition $\mu(X) = 1$ it can be proved that the Haar measure is essentially unique.
8. If $A$, $B$ are two compact sets with $\mu(A) = \mu(B) = 0$, does it follow that $\mu(AB) = 0$?
9. If $\mu$ is a Haar measure in $X$, then $X$ has a discrete topology if and only if $\mu\{x\} \neq 0$ for at least one $x \in X$.
10. If a Haar measure $\mu$ on $X$ is finite prove that $X$ is compact.
11. In a locally compact metric group $X$ show that a Haar measure $\mu$ on $X$ is $\sigma$-finite if and only if $X$ is $\sigma$-compact.


