
Daniel W. Stroock

A Concise Introduction
to the Theory of Integration

Second Edition

Birkhäuser
Boston • Basel • Berlin
1994
Daniel W. Stroock
Department of Mathematics
Massachusetts Institute of Technology
Cambridge, MA 02139

Library of Congress Cataloging-in-Publication Data

Stroock, Daniel W.
A concise introduction to the theory of integration / Daniel W.
Stroock. -- 2nd ed.
p. cm.
Includes index.
ISBN 0-8176-3759-1 (acid-free paper)
1. Integrals, Generalized. 2. Measure Theory. I. Title.
QA312.S78 1994 94-6246
512'.4--dc20 CIP

Printed on acid-free paper
© 1994 Daniel W. Stroock, second edition
Birkhäuser

First edition: World Scientific, Singapore

Copyright is not claimed for works of U.S. Government employees.


All rights reserved. No part of this publication may be reproduced, stored in a retrieval system,
or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording,
or otherwise, without prior permission of the copyright owner.

Permission to photocopy for internal or personal use of specific clients is granted by
Birkhäuser Boston for libraries and other users registered with the Copyright Clearance
Center (CCC), provided that the base fee of $6.00 per copy, plus $0.20 per page is paid directly
to CCC, 222 Rosewood Drive, Danvers, MA 01923, U.S.A. Special requests should be
addressed directly to Birkhäuser Boston, 675 Massachusetts Avenue, Cambridge, MA 02139,
U.S.A.

ISBN 0-8176-3759-1
ISBN 3-7643-3759-1

Typeset by the Author.


Printed and bound by Quinn-Woodbine, Woodbine, NJ.
Printed in the U.S.A.

9 8 7 6 5 4 3 2 1
Contents

Preface

Chapter I: The Classical Theory
1.1: Riemann Integration
1.2: Riemann-Stieltjes Integration

Chapter II: Lebesgue's Measure
2.0: The Idea
2.1: Existence
2.2: Euclidean Invariance

Chapter III: Lebesgue Integration
3.1: Measure Spaces
3.2: Construction of Integrals
3.3: Convergence of Integrals
3.4: Lebesgue's Differentiation Theorem

Chapter IV: Products of Measures
4.1: Fubini's Theorem
4.2: Steiner Symmetrization and the Isodiametric Inequality

Chapter V: Changes of Variable
5.0: Introduction
5.1: Lebesgue Integrals vs. Riemann Integrals
5.2: Polar Coordinates
5.3: Jacobi's Transformation and Surface Measure
5.4: The Divergence Theorem

Chapter VI: Some Basic Inequalities
6.1: Jensen, Minkowski, and Hölder
6.2: The Lebesgue Spaces
6.3: Convolution and Approximate Identities

Chapter VII: A Little Abstract Theory
7.1: An Existence Theorem
7.2: Hilbert Space and the Radon-Nikodym Theorem

Notation
Index
Solution to Selected Problems
Preface to the First Edition

This little book is the outgrowth of a one semester course which I have
taught for each of the past four years at M.I.T. Although this class used to
be one of the standard courses taken by essentially every first year graduate
student of mathematics, in recent years (at least in those when I was
the instructor), the clientele has shifted from first year graduate students
of mathematics to more advanced graduate students in other disciplines.
In fact, the majority of my students have been from departments of engineering
(especially electrical engineering) and most of the rest have been
economists. Whether this state of affairs is a reflection on my teaching,
the increased importance of mathematical analysis in other disciplines,
the superior undergraduate preparation of students coming to M.I.T. in
mathematics, or simply the lack of enthusiasm that these students have
for analysis, I have preferred not to examine too closely. On the other
hand, the situation did force me to do a certain amount of thinking about
what constitutes an appropriate course for a group of non-mathematicians
who are courageous (foolish?) enough to sign up for an introduction to
integration theory offered by the department of mathematics. In particular,
I had to figure out what to do about that vast body of material which, in
standard mathematics offerings, is "assumed to have been covered in your
advanced calculus course". Aspiring young mathematicians seldom challenge
even the most ridiculous declarations of this sort: the good ones look
it up, and the others trust that "it will not appear on the exam". On the
other hand, students who are not heading into mathematics are less easily
shamed into accepting such claims; in fact, as I soon discovered, many of
them were attending my course for the express purpose of learning what
mathematicians call "advanced calculus".
In view of the preceding comments about the origins of this text, it
should come as no surprise that the contents of this book are somewhat
different from those of many modern introductions to measure theory.
Indeed, I believe that nothing has been done here "in complete generality"!
On the other hand, greater space than usual has been given to the
properties of Lebesgue's measure on $\mathbb{R}^N$. In particular, the whole of Chapter
IV [now, Chapter V] is devoted to applications of Lebesgue's measure to
topics which are customarily "assumed to have been covered in your
advanced calculus course". As a consequence, what has emerged is a kind
of hybrid in which both modern integration theory and advanced calculus
are represented. Because none of the many existing books on integration
theory contained precisely the mix for which I was looking, I decided to
add my own version of the subject to the long list of books for the next
guy to reject.
Cambridge, MA, January 1, 1990

Preface to the Second Edition

It is four years since the first edition of this book appeared, and, in that
time, there has been little, if any, change in either the basic material
covered or my attitude toward that material. On the other hand, experience
has taught me that my presentation of several points could be considerably
refined and that the inclusion of some additional topics would be desirable.
Thus, although they may not be immediately apparent, changes have been
made throughout. Among those which are obvious are the addition of two
new sections: Section 4.2, in which I prove the isodiametric inequality and
discuss Lebesgue's measure from the Hausdorff measure point of view, and
Section 3.4, in which I have given a proof (based on the Hardy-Littlewood
maximal inequality) of Lebesgue's Differentiation Theorem for $\mathbb{R}$. These
additions made it desirable to reorganize the table of contents, with the
result that now product measures appear in Chapter IV and succeeding
chapters have been renumbered accordingly. Besides these new sections,
the exposition, particularly in what are now Chapters V and VII, has
been, I hope, improved. In addition, even where substantive alterations
are slight, I have made a great effort to remove some of the more egregious
errors with which the first edition was riddled. (In particular, I believe
that I have, at last, mastered the spelling of Lebesgue's name.) Finally, at
the behest of my students, I have attempted to solve some of the exercises,
and the fruits of my labor appear at the end of the book.
If I have successfully eliminated many of the errors in the first edition,
most of the credit should go to R.B. Burckel, who was kind enough to send
me a (five page) list of those which he found. In addition, I am indebted
to Ann Kostant at Birkhäuser for her efforts, without which this second
edition would probably not have appeared.
Daniel W. Stroock, Cambridge, MA, December 1993
Chapter I
The Classical Theory

1.1 Riemann Integration.
We begin by recalling a few basic facts about the integration theory which
is usually introduced in advanced calculus. We do so not only for purposes
of later comparison with the modern theory but also because it is the theory
with which most computations are actually performed.
Let $N \in \mathbb{Z}^+$ (throughout $\mathbb{Z}^+$ will denote the positive integers). A rectangle
in $\mathbb{R}^N$ is a subset $I$ of $\mathbb{R}^N$ which can be written as the Cartesian product
$\prod_{k=1}^N [a_k, b_k]$ of closed intervals $[a_k, b_k]$, where it is assumed that $a_k \le b_k$ for
each $1 \le k \le N$. If $I$ is such a rectangle, we call
$$\mathrm{diam}(I) \equiv \Bigl(\sum_{k=1}^N (b_k - a_k)^2\Bigr)^{1/2} \quad\text{and}\quad \mathrm{vol}(I) \equiv \prod_{k=1}^N (b_k - a_k)$$
the diameter and the volume of $I$, respectively. For purposes of this exposition,
it will be convenient to also take the empty set to be a rectangle with
diameter and volume 0.
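As a small illustration of these two quantities, here is a minimal Python sketch. It assumes (this representation is not from the text) that a rectangle $\prod_{k=1}^N [a_k, b_k]$ is stored as a list of pairs `(a_k, b_k)`; the function names `diam` and `vol` are chosen only to mirror the notation.

```python
import math

def diam(rect):
    """Euclidean diameter of prod [a_k, b_k]: sqrt of the sum of squared side lengths."""
    return math.sqrt(sum((b - a) ** 2 for a, b in rect))

def vol(rect):
    """Volume of prod [a_k, b_k]: product of the side lengths."""
    return math.prod(b - a for a, b in rect)

# Hypothetical example: the rectangle [0,3] x [1,5] in R^2.
rect = [(0.0, 3.0), (1.0, 5.0)]
print(diam(rect), vol(rect))
```

A rectangle with one degenerate side, such as `[(2.0, 2.0), (0.0, 1.0)]`, has empty interior and volume 0 under this convention.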
Given a collection C, we will say that C is non-overlapping if distinct
elements of C have disjoint interiors. The following obvious fact is surprisingly
difficult to prove.
1.1.1 Lemma. If $\mathcal{C}$ is a non-overlapping, finite collection of rectangles each
of which is contained in the rectangle $J$, then $\mathrm{vol}(J) \ge \sum_{I \in \mathcal{C}} \mathrm{vol}(I)$. On the
other hand, if $\mathcal{C}$ is any finite collection of rectangles and $J$ is a rectangle which
is covered by $\mathcal{C}$ (i.e., $J \subseteq \bigcup \mathcal{C}$), then $\mathrm{vol}(J) \le \sum_{I \in \mathcal{C}} \mathrm{vol}(I)$.
PROOF: Without loss in generality, we will assume throughout that each of
the rectangles $I \in \mathcal{C}$ has non-empty interior and is contained in the rectangle
$J$. Indeed, it is obvious that: $I$'s with empty interior do not contribute to the
sum, $I \subseteq J$ has already been assumed in the first part of the lemma, and, in
the second part, if $I \not\subseteq J$, then one can replace $I$ by $I \cap J$ if $I \cap J$ has interior
and eliminate $I$ from $\mathcal{C}$ when $I \cap J$ has no interior.
We begin by numbering the elements of $\mathcal{C}$ so that $\mathcal{C} = \{I_1, \dots, I_n\}$, and
start with the case when $N = 1$. Thus, $J = [a, b]$ and each $I_\mu = [a_\mu, b_\mu]$ for
some $a < b$ and $a_\mu < b_\mu$. Next, choose $c_1 \le \cdots \le c_{2n}$ so that $\{c_1, \dots, c_{2n}\} =
\{a_1, \dots, a_n\} \cup \{b_1, \dots, b_n\}$. For each $1 \le \nu \le 2n - 1$ and $1 \le \mu \le n$, set
$$M(\nu) = \{\mu : a_\mu \le c_\nu < b_\mu\} \quad\text{and}\quad N(\mu) = \{\nu : a_\mu \le c_\nu < b_\mu\}.$$
When $\mathcal{C}$ is non-overlapping and $\bigcup \mathcal{C} \subseteq J$, one has that $a \le c_1 \le c_{2n} \le b$ and
that $\mathrm{card}(M(\nu)) \le 1$ for each $\nu$. Hence, in this case:
$$\begin{aligned}
\sum_{\mu=1}^n \mathrm{vol}(I_\mu) &= \sum_{\mu=1}^n (b_\mu - a_\mu) = \sum_{\mu=1}^n \sum_{\nu \in N(\mu)} (c_{\nu+1} - c_\nu) \\
&= \sum_{\nu=1}^{2n-1} (c_{\nu+1} - c_\nu)\,\mathrm{card}(M(\nu)) \\
&\le \sum_{\nu=1}^{2n-1} (c_{\nu+1} - c_\nu) \le b - a = \mathrm{vol}(J).
\end{aligned}$$
On the other hand, if $J = \bigcup_{\mu=1}^n I_\mu$, then $c_1 = a$, $c_{2n} = b$, and $\mathrm{card}(M(\nu)) \ge 1$
for each $1 \le \nu \le 2n - 1$ with $c_\nu < c_{2n}$; and so
$$\begin{aligned}
\sum_{\mu=1}^n \mathrm{vol}(I_\mu) &= \sum_{\mu=1}^n (b_\mu - a_\mu) = \sum_{\mu=1}^n \sum_{\nu \in N(\mu)} (c_{\nu+1} - c_\nu) \\
&= \sum_{\nu=1}^{2n-1} (c_{\nu+1} - c_\nu)\,\mathrm{card}(M(\nu)) \\
&\ge \sum_{\nu=1}^{2n-1} (c_{\nu+1} - c_\nu) = b - a = \mathrm{vol}(J).
\end{aligned}$$
Thus, the case when $N = 1$ is complete.
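To handle $N \ge 2$, we work by induction. Write $J = [a, b] \times \hat{J}$ and $I_\mu =
[a_\mu, b_\mu] \times \hat{I}_\mu$, where $\hat{J}$ and the $\hat{I}_\mu$'s are rectangles in $\mathbb{R}^{N-1}$ having non-empty
interiors. Next, choose $\{c_1, \dots, c_{2n}\}$ and define $M(\nu)$ and $N(\mu)$ accordingly,
as in the case when $N = 1$. When we assume only that $J = \bigcup_{\mu=1}^n I_\mu$, we have
$c_1 = a$, $c_{2n} = b$, and $\hat{J} = \bigcup_{\mu \in M(\nu)} \hat{I}_\mu$ for each $\nu$ with $c_\nu < c_{2n}$.
Hence, by the induction hypothesis, in this case:
$$\begin{aligned}
\sum_{\mu=1}^n \mathrm{vol}(I_\mu) &= \sum_{\mu=1}^n \mathrm{vol}(\hat{I}_\mu)(b_\mu - a_\mu) \\
&= \sum_{\mu=1}^n \mathrm{vol}(\hat{I}_\mu) \sum_{\nu \in N(\mu)} (c_{\nu+1} - c_\nu) \\
&= \sum_{\nu=1}^{2n-1} (c_{\nu+1} - c_\nu) \sum_{\mu \in M(\nu)} \mathrm{vol}(\hat{I}_\mu) \\
&\ge \mathrm{vol}(\hat{J}) \sum_{\nu=1}^{2n-1} (c_{\nu+1} - c_\nu) = (b - a)\,\mathrm{vol}(\hat{J}) = \mathrm{vol}(J).
\end{aligned}$$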

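To handle the non-overlapping case when $\bigcup_{\mu=1}^n I_\mu \subseteq J$, first assume that the
$I_\mu$'s themselves are mutually disjoint. Then, $a \le c_1 \le \cdots \le c_{2n} \le b$, the $\hat{I}_\mu$'s
with $\mu \in M(\nu)$ are mutually disjoint, and $\bigcup_{\mu \in M(\nu)} \hat{I}_\mu \subseteq \hat{J}$ if $c_\nu < c_{2n}$. Hence, by the induction
hypothesis,
$$\sum_{\mu \in M(\nu)} \mathrm{vol}(\hat{I}_\mu) \le \mathrm{vol}(\hat{J})$$
for each $c_\nu < c_{2n}$; and so
$$\begin{aligned}
\sum_{\mu=1}^n \mathrm{vol}(I_\mu) &= \sum_{\mu=1}^n \mathrm{vol}(\hat{I}_\mu)(b_\mu - a_\mu) \\
&= \sum_{\mu=1}^n \mathrm{vol}(\hat{I}_\mu) \sum_{\nu \in N(\mu)} (c_{\nu+1} - c_\nu) \\
&= \sum_{\nu=1}^{2n-1} (c_{\nu+1} - c_\nu) \sum_{\mu \in M(\nu)} \mathrm{vol}(\hat{I}_\mu) \\
&\le \mathrm{vol}(\hat{J}) \sum_{\nu=1}^{2n-1} (c_{\nu+1} - c_\nu) \le (b - a)\,\mathrm{vol}(\hat{J}) = \mathrm{vol}(J).
\end{aligned}$$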
Finally, note that when the $I_\mu$'s are non-overlapping but not necessarily disjoint,
they can be made disjoint by an arbitrarily small diminution of their
sides. Hence, we can first make the necessary diminution and then pass to a
limit, thereby handling the case of general non-overlapping $I_\mu$'s contained in
$J$. □
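The counting argument used for $N = 1$ can be checked numerically. The following Python sketch is only an illustration under stated assumptions: intervals are stored as pairs `(a, b)` with `a < b`, and the helper name `breakpoint_sum` is hypothetical. It sorts all end-points into $c_1 \le \cdots \le c_{2n}$, counts $\mathrm{card}(M(\nu))$ for each $\nu$, and returns $\sum_\nu (c_{\nu+1} - c_\nu)\,\mathrm{card}(M(\nu))$, which should agree with the sum of the lengths.

```python
def breakpoint_sum(intervals):
    """Sum over nu of (c_{nu+1} - c_nu) * card(M(nu)), with M(nu) = {mu : a_mu <= c_nu < b_mu}."""
    # c_1 <= ... <= c_{2n}: all left and right end-points, sorted (repeats allowed).
    c = sorted([a for a, _ in intervals] + [b for _, b in intervals])
    total = 0
    for nu in range(len(c) - 1):
        card_m = sum(1 for a, b in intervals if a <= c[nu] < b)
        total += (c[nu + 1] - c[nu]) * card_m
    return total

# Hypothetical data: a cover of J = [0, 4] that overlaps on [2, 3].
cover = [(0, 1), (1, 3), (2, 4)]
lengths = sum(b - a for a, b in cover)
print(breakpoint_sum(cover), lengths)
```

For a cover, every gap is counted at least once, so the total is at least the length of $J$; for a non-overlapping family inside $J$, every gap is counted at most once.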
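Given a collection $\mathcal{C}$ of rectangles $I$, we use $\Xi(\mathcal{C})$ to denote the set of all
maps $\xi : \mathcal{C} \to \bigcup \mathcal{C}$ such that $\xi(I) \in I$ for each $I \in \mathcal{C}$. Given a finite collection
$\mathcal{C}$, an element $\xi \in \Xi(\mathcal{C})$, and a bounded function $f : \bigcup \mathcal{C} \to \mathbb{R}$, we define the
Riemann sum of $f$ over $\mathcal{C}$ relative to $\xi$ to be
$$R(f; \mathcal{C}, \xi) \equiv \sum_{I \in \mathcal{C}} f(\xi(I))\,\mathrm{vol}(I). \tag{1.1.2}$$
Finally, if $J$ is a rectangle and $f : J \to \mathbb{R}$ is a function, we say that $f$ is
Riemann integrable on $J$ if there is a number $A \in \mathbb{R}$ with the property
that, for all $\epsilon > 0$, there is a $\delta > 0$ such that
$$|R(f; \mathcal{C}, \xi) - A| < \epsilon \tag{1.1.3}$$
whenever $\xi \in \Xi(\mathcal{C})$ and $\mathcal{C}$ is a non-overlapping, finite, exact cover of $J$ (i.e.,
$J = \bigcup \mathcal{C}$) whose mesh size
$$\|\mathcal{C}\| \equiv \max\{\mathrm{diam}(I) : I \in \mathcal{C}\} < \delta.$$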
When $f$ is Riemann integrable on $J$, we call the associated number $A$ in (1.1.3)
the Riemann integral of $f$ on $J$ and we will use
$$(R)\int_J f(x)\,dx$$
to denote $A$.
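To make the definition concrete, here is a minimal one-dimensional sketch in Python. The representation is an assumption made for illustration: a non-overlapping, finite, exact cover of $[a, b]$ by intervals is encoded by its sorted list of end-points `points`, and the choice map $\xi$ by a list `tags` with one tag per interval. The names `riemann_sum` and `uniform_cover` are hypothetical.

```python
def riemann_sum(f, points, tags):
    """R(f; C, xi) for the cover {[points[i], points[i+1]]} with xi(I_i) = tags[i]."""
    return sum(f(t) * (points[i + 1] - points[i]) for i, t in enumerate(tags))

def uniform_cover(a, b, n):
    """End-points of the cover of [a, b] by n intervals of equal length."""
    return [a + (b - a) * k / n for k in range(n + 1)]

n = 1000
points = uniform_cover(0.0, 1.0, n)
midpoints = [(points[i] + points[i + 1]) / 2 for i in range(n)]
left_ends = points[:-1]

f = lambda x: x * x
approx_mid = riemann_sum(f, points, midpoints)
approx_left = riemann_sum(f, points, left_ends)
print(approx_mid, approx_left)
```

Both choices of tags give sums close to $1/3$ once the mesh size is small, as the definition of Riemann integrability requires for every admissible choice of $\xi$.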
It is a relatively simple matter to see that any $f \in C(J)$ (the space of
continuous real-valued functions on $J$) is Riemann integrable on $J$. However,
in order to determine when more general functions are Riemann integrable, it
is useful to introduce the Riemann upper sum
$$\mathcal{U}(f; \mathcal{C}) \equiv \sum_{I \in \mathcal{C}} \sup_{x \in I} f(x)\,\mathrm{vol}(I)$$
and the Riemann lower sum
$$\mathcal{L}(f; \mathcal{C}) \equiv \sum_{I \in \mathcal{C}} \inf_{x \in I} f(x)\,\mathrm{vol}(I).$$
Clearly, one always has
$$\mathcal{L}(f; \mathcal{C}) \le R(f; \mathcal{C}, \xi) \le \mathcal{U}(f; \mathcal{C})$$
for any $\mathcal{C}$ and $\xi \in \Xi(\mathcal{C})$. Also, it is clear that a bounded $f$ is Riemann integrable
if and only if
$$\lim_{\|\mathcal{C}\| \to 0} \mathcal{L}(f; \mathcal{C}) \ge \lim_{\|\mathcal{C}\| \to 0} \mathcal{U}(f; \mathcal{C}),$$
where the limits are taken over non-overlapping, finite, exact covers of $J$. What
we want to show now is that the preceding can be replaced by the condition
$$\sup_{\mathcal{C}} \mathcal{L}(f; \mathcal{C}) \ge \inf_{\mathcal{C}} \mathcal{U}(f; \mathcal{C}), \tag{1.1.4}$$
where the $\mathcal{C}$'s run over all non-overlapping, finite, exact covers of $J$.
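The upper and lower sums can be computed exactly in one simple situation. The sketch below assumes, purely for illustration, that $f$ is non-decreasing on $[a, b]$, so that the infimum and supremum over each interval are attained at its left and right end-points; for a general bounded $f$ these extrema would have to be found some other way. The function name `lower_upper_sums` is hypothetical.

```python
def lower_upper_sums(f, points):
    """(L(f; C), U(f; C)) for a NON-DECREASING f on the cover with end-points `points`."""
    lower = sum(f(points[i]) * (points[i + 1] - points[i]) for i in range(len(points) - 1))
    upper = sum(f(points[i + 1]) * (points[i + 1] - points[i]) for i in range(len(points) - 1))
    return lower, upper

f = lambda x: x * x                      # non-decreasing on [0, 1]
coarse = [k / 100 for k in range(101)]   # mesh size 1/100
fine = [k / 200 for k in range(201)]     # a refinement of `coarse`

l_coarse, u_coarse = lower_upper_sums(f, coarse)
l_fine, u_fine = lower_upper_sums(f, fine)
print(l_coarse, u_coarse, l_fine, u_fine)
```

In this example the lower sums increase and the upper sums decrease under refinement, and both squeeze the value $1/3$.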


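To this end, we partially order the covers $\mathcal{C}$ by refinement. That is, we say
that $\mathcal{C}_2$ is more refined than $\mathcal{C}_1$ and write $\mathcal{C}_1 \le \mathcal{C}_2$, if for every $I_2 \in \mathcal{C}_2$ there
is an $I_1 \in \mathcal{C}_1$ such that $I_2 \subseteq I_1$. Note that, for every pair $\mathcal{C}_1$ and $\mathcal{C}_2$, the least
common refinement $\mathcal{C}_1 \vee \mathcal{C}_2$ is given by
$$\mathcal{C}_1 \vee \mathcal{C}_2 = \{I_1 \cap I_2 : I_1 \in \mathcal{C}_1 \text{ and } I_2 \in \mathcal{C}_2\}.$$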

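1.1.5 Lemma. For any pair of non-overlapping, finite, exact covers $\mathcal{C}_1$ and $\mathcal{C}_2$
of $J$ and any function $f : J \to \mathbb{R}$, $\mathcal{L}(f; \mathcal{C}_1) \le \mathcal{U}(f; \mathcal{C}_2)$. Moreover, if $\mathcal{C}_1 \le \mathcal{C}_2$,
then $\mathcal{L}(f; \mathcal{C}_1) \le \mathcal{L}(f; \mathcal{C}_2)$ and $\mathcal{U}(f; \mathcal{C}_1) \ge \mathcal{U}(f; \mathcal{C}_2)$.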
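PROOF: We begin by proving the second statement. Noting that
$$\mathcal{L}(f; \mathcal{C}) = -\,\mathcal{U}(-f; \mathcal{C}), \tag{1.1.6}$$
we see that it suffices to check that $\mathcal{U}(f; \mathcal{C}_1) \ge \mathcal{U}(f; \mathcal{C}_2)$ if $\mathcal{C}_1 \le \mathcal{C}_2$. But, for
each $I_1 \in \mathcal{C}_1$,
$$\sup_{x \in I_1} f(x)\,\mathrm{vol}(I_1) \ge \sum_{\{I_2 \in \mathcal{C}_2 :\, I_2 \subseteq I_1\}} \sup_{x \in I_2} f(x)\,\mathrm{vol}(I_2),$$
where we have used Lemma 1.1.1 to see that
$$\mathrm{vol}(I_1) = \sum_{\{I_2 \in \mathcal{C}_2 :\, I_2 \subseteq I_1\}} \mathrm{vol}(I_2).$$
After summing the above over $I_1 \in \mathcal{C}_1$, one arrives at the required result.
Given the preceding, the rest is immediate. Namely, for any $\mathcal{C}_1$ and $\mathcal{C}_2$,
$$\mathcal{L}(f; \mathcal{C}_1) \le \mathcal{L}(f; \mathcal{C}_1 \vee \mathcal{C}_2) \le \mathcal{U}(f; \mathcal{C}_1 \vee \mathcal{C}_2) \le \mathcal{U}(f; \mathcal{C}_2). \qquad\square$$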
Lemma 1.1.5 really depends only on properties of our order relation and
not on the properties of $\mathrm{vol}(I)$. In contrast, the next lemma depends on the
continuity of volume with respect to side-lengths of rectangles.
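1.1.7 Lemma. Let $\mathcal{C}$ be a non-overlapping, finite, exact cover of the rectangle
$J$ and $f : J \to \mathbb{R}$ a bounded function. Then, for each $\epsilon > 0$, there is a $\delta > 0$
such that
$$\mathcal{U}(f; \mathcal{C}') \le \mathcal{U}(f; \mathcal{C}) + \epsilon \quad\text{and}\quad \mathcal{L}(f; \mathcal{C}') \ge \mathcal{L}(f; \mathcal{C}) - \epsilon$$
whenever $\mathcal{C}'$ is a non-overlapping, finite, exact cover of $J$ with the property that
$\|\mathcal{C}'\| < \delta$.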
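PROOF: In view of (1.1.6), we need only consider the Riemann upper sums.
Let $J = \prod_{k=1}^N [c_k, d_k]$. Given a $\delta > 0$ and a rectangle $I = \prod_{k=1}^N [a_k, b_k]$, define
$I_k^-(\delta)$ and $I_k^+(\delta)$ to be the rectangles
$$[c_1, d_1] \times \cdots \times [a_k - \delta, a_k + \delta] \times \cdots \times [c_N, d_N]$$
and
$$[c_1, d_1] \times \cdots \times [b_k - \delta, b_k + \delta] \times \cdots \times [c_N, d_N],$$
respectively. Then, for any rectangle $I' \subseteq J$ with $\mathrm{diam}(I') < \delta$, either $I' \subseteq I$
for some $I \in \mathcal{C}$ or
$$I' \subseteq I_k^+(\delta) \cup I_k^-(\delta)$$
for some $I \in \mathcal{C}$ and $1 \le k \le N$. Now let $\mathcal{C}'$ with $\|\mathcal{C}'\| < \delta$ be given; define
$$\mathcal{A} = \{I' \in \mathcal{C}' : I' \subseteq I \text{ for some } I \in \mathcal{C}\}$$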
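and set $\mathcal{B} = \mathcal{C}' \setminus \mathcal{A}$. Then, by Lemma 1.1.1,
$$\sum_{I' \in \mathcal{B}} \mathrm{vol}(I') \le \sum_{I \in \mathcal{C}} \sum_{k=1}^N \bigl(\mathrm{vol}(I_k^+(\delta)) + \mathrm{vol}(I_k^-(\delta))\bigr);$$
from which it is clear that there is a $K < \infty$, depending only on $\mathcal{C}$, such that
$$\sum_{I' \in \mathcal{B}} \mathrm{vol}(I') \le K\delta.$$
Hence,
$$\begin{aligned}
\mathcal{U}(f; \mathcal{C}') - \mathcal{U}(f; \mathcal{C}) &\le \mathcal{U}(f; \mathcal{C}') - \mathcal{U}(f; \mathcal{C} \vee \mathcal{C}') \\
&\le \sum_{I' \in \mathcal{B}} \Bigl[\Bigl(\sup_{x \in I'} f(x)\Bigr)\mathrm{vol}(I') - \sum_{I \in \mathcal{C}} \Bigl(\sup_{x \in I \cap I'} f(x)\Bigr)\mathrm{vol}(I \cap I')\Bigr] \\
&\le 2\|f\|_u \sum_{I' \in \mathcal{B}} \mathrm{vol}(I') \le 2K\|f\|_u\,\delta.
\end{aligned}$$
(In the preceding, we have introduced the notation, to be used throughout,
that $\|f\|_u$ denotes the uniform norm of $f$: the supremum of $|f|$ over the set
on which $f$ is defined. The factor 2 arises because $\sup_{I'} f - \sup_{I \cap I'} f \le 2\|f\|_u$
for each $I$.) From the preceding it is clear how to choose $\delta$ for any
given $\epsilon > 0$. □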
As an essentially immediate consequence of Lemma 1.1.7, we have the following
theorem.
1.1.8 Theorem. Let $f : J \to \mathbb{R}$ be a bounded function on the rectangle $J$.
Then
$$\lim_{\|\mathcal{C}\| \to 0} \mathcal{L}(f; \mathcal{C}) = \sup_{\mathcal{C}} \mathcal{L}(f; \mathcal{C}) \quad\text{and}\quad \lim_{\|\mathcal{C}\| \to 0} \mathcal{U}(f; \mathcal{C}) = \inf_{\mathcal{C}} \mathcal{U}(f; \mathcal{C}),$$
where $\mathcal{C}$ runs over non-overlapping, finite, exact covers of $J$. In particular,
(1.1.4) is a necessary and sufficient condition that a bounded $f$ on $J$ be Riemann
integrable. Moreover, if (1.1.4) holds, then
$$(R)\int_J f(x)\,dx = \sup_{\mathcal{C}} \mathcal{L}(f; \mathcal{C}) = \inf_{\mathcal{C}} \mathcal{U}(f; \mathcal{C}).$$

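Exercises

1.1.9 Exercise: Prove Theorem 1.1.8. Next, suppose that $f$ and $g$ are Riemann
integrable functions on $J$. Show that $f \vee g \equiv \max\{f, g\}$, $f \wedge g \equiv
\min\{f, g\}$, and, for any $\alpha, \beta \in \mathbb{R}$, $\alpha f + \beta g$ are all Riemann integrable on $J$. In
addition, check that
$$(R)\int_J (f \vee g)(x)\,dx \ge \Bigl((R)\int_J f(x)\,dx\Bigr) \vee \Bigl((R)\int_J g(x)\,dx\Bigr),$$
$$(R)\int_J (f \wedge g)(x)\,dx \le \Bigl((R)\int_J f(x)\,dx\Bigr) \wedge \Bigl((R)\int_J g(x)\,dx\Bigr),$$
and
$$(R)\int_J (\alpha f + \beta g)(x)\,dx = \alpha\Bigl((R)\int_J f(x)\,dx\Bigr) + \beta\Bigl((R)\int_J g(x)\,dx\Bigr).$$
Conclude, in particular, that if $f$ and $g$ are Riemann integrable on $J$ and $f \le g$,
then $(R)\int_J f(x)\,dx \le (R)\int_J g(x)\,dx$.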
1.1.10 Exercise: Show that if $f$ is a bounded real-valued function on the
rectangle $J$, then $f$ is Riemann integrable if and only if, for each $\epsilon > 0$, there
is a $\delta > 0$ such that
$$\sum_{\{I \in \mathcal{C} :\, \sup_I f - \inf_I f \ge \epsilon\}} \mathrm{vol}(I) < \epsilon \tag{1.1.11}$$
whenever $\|\mathcal{C}\| < \delta$. (We use $\sup_I f$ and $\inf_I f$ to denote $\sup_{x \in I} f(x)$ and
$\inf_{x \in I} f(x)$, respectively.) As a consequence, show that a bounded $f$ on $J$ is
Riemann integrable if it is continuous on $J$ at all but a finite number of points.
(See Section 4.1 for more information on this subject.)
1.1.12 Exercise: Show that the condition in Exercise 1.1.10 can be replaced
by the condition that for each $\epsilon > 0$ there exists some $\mathcal{C}$ for which (1.1.11)
holds.

1.2 Riemann-Stieltjes Integration.
In Section 1.1, we developed the classical integration theory with respect to the
standard notion of Euclidean volume. In the present section, we will extend the
classical theory, at least for integrals in one dimension, to cover more general
notions of volume.
Let $J = [a, b]$ be an interval in $\mathbb{R}$ and $\varphi$ and $\psi$ a pair of real-valued functions
on $J$. Given a non-overlapping, finite, exact cover $\mathcal{C}$ of $J$ by closed intervals $I$
and a $\xi \in \Xi(\mathcal{C})$, define the Riemann sum of $\varphi$ over $\mathcal{C}$ with respect to $\psi$
relative to $\xi$ to be
$$R(\varphi|\psi; \mathcal{C}, \xi) = \sum_{I \in \mathcal{C}} \varphi(\xi(I))\,\Delta_I\psi,$$
where $\Delta_I\psi \equiv \psi(I^+) - \psi(I^-)$ and $I^+$ and $I^-$ denote, respectively, the right
and left hand end-points of the interval $I$. Obviously, when $\psi(x) = x$, $x \in
J$, $R(\varphi|\psi; \mathcal{C}, \xi) = R(\varphi; \mathcal{C}, \xi)$. Thus, it is consistent for us to say that $\varphi$ is
Riemann integrable on $J$ with respect to $\psi$, or, more simply, $\psi$-Riemann
integrable on $J$, if there is a number $A$ with the property that, for each $\epsilon > 0$,
there is a $\delta > 0$ such that
$$\sup_{\xi \in \Xi(\mathcal{C})} |R(\varphi|\psi; \mathcal{C}, \xi) - A| < \epsilon \tag{1.2.1}$$
whenever $\mathcal{C}$ is a non-overlapping, finite, exact cover of $J$ satisfying $\|\mathcal{C}\| < \delta$.
Assuming that $\varphi$ is $\psi$-Riemann integrable on $J$, we will call the number $A$ in
(1.2.1) the Riemann-Stieltjes integral of $\varphi$ on $J$ with respect to $\psi$ and
will use
$$(R)\int_J \varphi(x)\,d\psi(x)$$
to denote $A$.
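A Riemann-Stieltjes sum differs from an ordinary Riemann sum only in that the length of each interval $I$ is replaced by the increment $\psi(I^+) - \psi(I^-)$. The following Python sketch illustrates this under the same hypothetical encoding as before (a cover given by sorted end-points `points`, the choice map by `tags`); the name `rs_sum` is not from the text. As a sanity check it takes $\varphi(x) = x$ and $\psi(x) = x^2$ on $[0, 1]$, for which one expects the value $\int_0^1 x \cdot 2x\,dx = 2/3$.

```python
def rs_sum(phi, psi, points, tags):
    """R(phi | psi; C, xi): sum of phi(tag_i) * (psi(right_i) - psi(left_i))."""
    return sum(
        phi(t) * (psi(points[i + 1]) - psi(points[i]))
        for i, t in enumerate(tags)
    )

n = 2000
points = [k / n for k in range(n + 1)]
tags = [(points[i] + points[i + 1]) / 2 for i in range(n)]

approx = rs_sum(lambda x: x, lambda x: x * x, points, tags)
identity_case = rs_sum(lambda x: x, lambda x: x, points, tags)  # psi(x) = x: ordinary Riemann sum
print(approx, identity_case)
```

With $\psi(x) = x$ the same routine reduces to the ordinary Riemann sum of $\varphi$, in line with the remark that the two notions are consistent.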
1.2.2 Examples: The following examples may help to explain what is going
on here.
(i) If $\varphi \in C(J)$ and $\psi \in C^1(J)$ (i.e., $\psi$ is continuously differentiable on $J$), then
one can use the Mean Value Theorem to check that $\varphi$ is $\psi$-Riemann integrable
on $J$ and that
$$(R)\int_J \varphi(x)\,d\psi(x) = (R)\int_J \varphi(x)\,\psi'(x)\,dx. \tag{1.2.3}$$
(ii) If there exist $a = a_0 < a_1 < \cdots < a_n = b$ such that $\psi$ is constant on each
of the intervals $(a_{m-1}, a_m)$, then every $\varphi \in C([a, b])$ is $\psi$-Riemann integrable
on $[a, b]$, and
$$(R)\int_{[a,b]} \varphi(x)\,d\psi(x) = \sum_{m=0}^n \varphi(a_m)\,d_m, \tag{1.2.4}$$
where $d_0 = \psi(a+) - \psi(a)$, $d_m = \psi(a_m+) - \psi(a_m-)$ for $1 \le m \le n - 1$, and
$d_n = \psi(b) - \psi(b-)$. (We have used $f(x+)$ and $f(x-)$ to denote the right and
left limits of $f$ at $x$.)
(iii) If both $(R)\int_J \varphi_1(x)\,d\psi(x)$ and $(R)\int_J \varphi_2(x)\,d\psi(x)$ exist (i.e., $\varphi_1$ and $\varphi_2$
are both $\psi$-Riemann integrable on $J$), then for all real numbers $\alpha$ and $\beta$,
$(\alpha\varphi_1 + \beta\varphi_2)$ is $\psi$-Riemann integrable on $J$ and
$$(R)\int_J (\alpha\varphi_1 + \beta\varphi_2)(x)\,d\psi(x) = \alpha\Bigl((R)\int_J \varphi_1(x)\,d\psi(x)\Bigr) + \beta\Bigl((R)\int_J \varphi_2(x)\,d\psi(x)\Bigr). \tag{1.2.5}$$
(iv) If $J = J_1 \cup J_2$ where $\mathring{J}_1 \cap \mathring{J}_2 = \emptyset$ and if $\varphi$ is $\psi$-Riemann integrable on $J$,
then $\varphi$ is $\psi$-Riemann integrable on both $J_1$ and $J_2$, and
$$(R)\int_J \varphi(x)\,d\psi(x) = (R)\int_{J_1} \varphi(x)\,d\psi(x) + (R)\int_{J_2} \varphi(x)\,d\psi(x). \tag{1.2.6}$$
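Example (ii) can be illustrated numerically. The sketch below is a hypothetical instance, not taken from the text: $\psi$ is the step function on $[0, 1]$ which equals 0 to the left of $1/2$ and 1 from $1/2$ on, so that with $a_0 = 0$, $a_1 = 1/2$, $a_2 = 1$ the jumps are $d_0 = 0$, $d_1 = 1$, $d_2 = 0$ and the right hand side of (1.2.4) is $\varphi(1/2)$. The helper `rs_sum` is the same illustrative Riemann-Stieltjes sum as before.

```python
import math

def rs_sum(phi, psi, points, tags):
    """R(phi | psi; C, xi): sum of phi(tag_i) * (psi(right_i) - psi(left_i))."""
    return sum(
        phi(t) * (psi(points[i + 1]) - psi(points[i]))
        for i, t in enumerate(tags)
    )

def psi(x):
    """Step function with a single jump of size 1 at x = 1/2 (right continuous there)."""
    return 1.0 if x >= 0.5 else 0.0

n = 1000
points = [k / n for k in range(n + 1)]
tags = [(points[i] + points[i + 1]) / 2 for i in range(n)]

approx = rs_sum(math.cos, psi, points, tags)
print(approx, math.cos(0.5))
```

Only the single interval across which $\psi$ jumps contributes to the sum, so the sum is the value of $\varphi$ at a tag within one mesh width of $1/2$.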
All the assertions made in Examples 1.2.2 are reasonably straightforward
consequences of the definition of Riemann integrability. Not so obvious, but
terribly important, is the following theorem which shows that the notion of
Riemann integrability is symmetric in $\varphi$ and $\psi$.
1.2.7 Theorem (Integration by Parts). If $\varphi$ is $\psi$-Riemann integrable on
$J = [a, b]$, then $\psi$ is $\varphi$-Riemann integrable on $J$ and
$$(R)\int_J \psi(x)\,d\varphi(x) = \psi(b)\varphi(b) - \psi(a)\varphi(a) - (R)\int_J \varphi(x)\,d\psi(x). \tag{1.2.8}$$
PROOF: Let $\mathcal{C} = \{[a_{m-1}, a_m] : 1 \le m \le n\}$, where $a = a_0 \le \cdots \le a_n = b$;
and let $\xi \in \Xi(\mathcal{C})$ with $\xi([a_{m-1}, a_m]) = \beta_m \in [a_{m-1}, a_m]$. Set $\beta_0 = a$ and
$\beta_{n+1} = b$. Then
$$\begin{aligned}
R(\psi|\varphi; \mathcal{C}, \xi) &= \sum_{m=1}^n \psi(\beta_m)\bigl(\varphi(a_m) - \varphi(a_{m-1})\bigr) \\
&= \sum_{m=1}^n \psi(\beta_m)\varphi(a_m) - \sum_{m=0}^{n-1} \psi(\beta_{m+1})\varphi(a_m) \\
&= \psi(\beta_n)\varphi(a_n) - \sum_{m=1}^{n-1} \varphi(a_m)\bigl(\psi(\beta_{m+1}) - \psi(\beta_m)\bigr) - \psi(\beta_1)\varphi(a_0) \\
&= \psi(b)\varphi(b) - \psi(a)\varphi(a) - \sum_{m=0}^n \varphi(a_m)\bigl(\psi(\beta_{m+1}) - \psi(\beta_m)\bigr) \\
&= \psi(b)\varphi(b) - \psi(a)\varphi(a) - R(\varphi|\psi; \mathcal{C}', \xi'),
\end{aligned}$$
where $\mathcal{C}' = \{[\beta_{m-1}, \beta_m] : 1 \le m \le n + 1\}$ and $\xi' \in \Xi(\mathcal{C}')$ is defined by
$\xi'([\beta_m, \beta_{m+1}]) = a_m$ for $0 \le m \le n$. Noting that $\|\mathcal{C}'\| \le 2\|\mathcal{C}\|$, one now sees
that if $\varphi$ is $\psi$-Riemann integrable, then $\psi$ is $\varphi$-Riemann integrable and that (1.2.8)
holds. □
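The summation-by-parts identity in this proof is purely algebraic, so it can be checked on any tagged cover. In the Python sketch below (names and data are illustrative assumptions), `points` holds $a_0 \le \cdots \le a_n$ and `tags` holds $\beta_1, \dots, \beta_n$; the new cover $\mathcal{C}'$ has end-points $\beta_0 = a, \beta_1, \dots, \beta_n, \beta_{n+1} = b$ and tags $a_0, \dots, a_n$.

```python
import math

def rs_sum(phi, psi, points, tags):
    """R(phi | psi; C, xi): sum of phi(tag_i) * (psi(right_i) - psi(left_i))."""
    return sum(
        phi(t) * (psi(points[i + 1]) - psi(points[i]))
        for i, t in enumerate(tags)
    )

def parts_identity_gap(phi, psi, points, tags):
    """R(psi|phi; C, xi) - [psi(b)phi(b) - psi(a)phi(a) - R(phi|psi; C', xi')]."""
    a, b = points[0], points[-1]
    lhs = rs_sum(psi, phi, points, tags)
    new_points = [a] + list(tags) + [b]   # beta_0, ..., beta_{n+1}
    new_tags = list(points)               # xi'([beta_m, beta_{m+1}]) = a_m
    rhs = psi(b) * phi(b) - psi(a) * phi(a) - rs_sum(phi, psi, new_points, new_tags)
    return lhs - rhs

gap = parts_identity_gap(math.sin, math.exp, [0.0, 0.3, 0.7, 1.0], [0.1, 0.5, 0.9])
print(gap)
```

The gap is zero up to floating-point rounding, for any choice of $\varphi$, $\psi$, end-points, and tags.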
It is hardly necessary to point out, but notice that when $\psi \equiv 1$ and $\varphi$
is continuously differentiable, (1.2.8) becomes the Fundamental Theorem of
Calculus.
Although the preceding theorem indicates that it is natural to consider $\varphi$ and
$\psi$ as playing symmetric roles in the theory of Riemann-Stieltjes integration, it
turns out that, in practice, one wants to impose a condition on $\psi$ which will
guarantee that every $\varphi \in C(J)$ is Riemann integrable with respect to $\psi$ and
that, in addition,
$$\Bigl|(R)\int_J \varphi(x)\,d\psi(x)\Bigr| \le K_\psi \|\varphi\|_u \tag{1.2.9}$$
for some $K_\psi < \infty$ and all $\varphi$ which are $\psi$-Riemann integrable on $J$. Example
(i) in Examples 1.2.2 tells us that one condition on $\psi$ which guarantees the
$\psi$-Riemann integrability of every continuous $\varphi$ is that $\psi \in C^1(J)$. Moreover,
from (1.2.3), it is an easy matter to check that in this case (1.2.9) holds with
$K_\psi = \|\psi'\|_u (b - a)$. On the other hand, example (ii) makes it clear that $\psi$
need not be even continuous, much less differentiable, in order that Riemann
integration with respect to $\psi$ have the above properties. The following result
emphasizes this same point.
1.2.10 Theorem. Let $\psi$ be non-decreasing on $J$. Then every $\varphi \in C(J)$ is
$\psi$-Riemann integrable on $J$. In addition, if $\varphi$ is non-negative and $\psi$-Riemann
integrable on $J$, then $(R)\int_J \varphi(x)\,d\psi(x) \ge 0$. In particular, (1.2.9) holds with
$K_\psi = \Delta_J\psi$.
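PROOF: The fact that $(R)\int_J \varphi(x)\,d\psi(x) \ge 0$ if $\varphi$ is a non-negative function
which is $\psi$-Riemann integrable on $J$ follows immediately from the fact that
$R(\varphi|\psi; \mathcal{C}, \xi) \ge 0$ for any $\mathcal{C}$ and $\xi \in \Xi(\mathcal{C})$. Applying this to the functions
$\|\varphi\|_u \pm \varphi$ and using the linearity property in (iii) of Examples 1.2.2, we conclude
that (1.2.9) holds with $K_\psi = \Delta_J\psi$. Thus, all that we have to do is check that
every $\varphi \in C(J)$ is $\psi$-Riemann integrable on $J$.
Let $\varphi \in C(J)$ be given and define
$$\mathcal{U}(\varphi|\psi; \mathcal{C}) = \sum_{I \in \mathcal{C}} \Bigl(\sup_I \varphi\Bigr)\Delta_I\psi \quad\text{and}\quad \mathcal{L}(\varphi|\psi; \mathcal{C}) = \sum_{I \in \mathcal{C}} \Bigl(\inf_I \varphi\Bigr)\Delta_I\psi$$
for each $\mathcal{C}$. Then, just as in Section 1.1,
$$\mathcal{L}(\varphi|\psi; \mathcal{C}) \le R(\varphi|\psi; \mathcal{C}, \xi) \le \mathcal{U}(\varphi|\psi; \mathcal{C})$$
for any $\xi \in \Xi(\mathcal{C})$. In addition (cf. Lemma 1.1.5), for any pair $\mathcal{C}_1$ and $\mathcal{C}_2$ one
has that $\mathcal{L}(\varphi|\psi; \mathcal{C}_1) \le \mathcal{U}(\varphi|\psi; \mathcal{C}_2)$. Finally, for any $\mathcal{C}$,
$$\mathcal{U}(\varphi|\psi; \mathcal{C}) - \mathcal{L}(\varphi|\psi; \mathcal{C}) \le \omega(\|\mathcal{C}\|)\,\Delta_J\psi,$$
where
$$\omega(\delta) \equiv \sup\{|\varphi(y) - \varphi(x)| : x, y \in J \text{ and } |y - x| \le \delta\}$$
is the modulus of continuity of $\varphi$. Hence
$$\lim_{\|\mathcal{C}\| \to 0} \bigl(\mathcal{U}(\varphi|\psi; \mathcal{C}) - \mathcal{L}(\varphi|\psi; \mathcal{C})\bigr) = 0.$$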

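But this means that for every $\epsilon > 0$ there is a $\delta > 0$ such that
$$\mathcal{U}(\varphi|\psi; \mathcal{C}) - \mathcal{U}(\varphi|\psi; \mathcal{C}') \le \mathcal{U}(\varphi|\psi; \mathcal{C}) - \mathcal{L}(\varphi|\psi; \mathcal{C}) < \epsilon$$
no matter what $\mathcal{C}'$ is chosen as long as $\|\mathcal{C}\| < \delta$. From these it is clear that
both
$$\lim_{\|\mathcal{C}\| \to 0} \mathcal{U}(\varphi|\psi; \mathcal{C}) \quad\text{and}\quad \lim_{\|\mathcal{C}\| \to 0} \mathcal{L}(\varphi|\psi; \mathcal{C})$$
exist and are equal. □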
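The key estimate in this proof, that the gap between upper and lower sums is at most the modulus of continuity of $\varphi$ at the mesh size times the total increment of $\psi$, can be observed in a simple case. The sketch below makes two simplifying assumptions that are not in the text: both $\varphi$ and $\psi$ are non-decreasing (so the supremum and infimum of $\varphi$ over an interval sit at its end-points), and $\varphi(x) = x$, whose modulus of continuity is $\omega(\delta) = \delta$. The function name is hypothetical.

```python
def upper_minus_lower(phi, psi, points):
    """U(phi|psi; C) - L(phi|psi; C), ASSUMING phi and psi are both non-decreasing."""
    return sum(
        (phi(points[i + 1]) - phi(points[i])) * (psi(points[i + 1]) - psi(points[i]))
        for i in range(len(points) - 1)
    )

phi = lambda x: x          # modulus of continuity: omega(delta) = delta
psi = lambda x: x ** 3     # non-decreasing on [0, 2], total increment 8

def gap_and_bound(n):
    points = [2.0 * k / n for k in range(n + 1)]
    mesh = max(points[i + 1] - points[i] for i in range(n))
    total_increment = psi(points[-1]) - psi(points[0])
    return upper_minus_lower(phi, psi, points), mesh * total_increment

gap_50, bound_50 = gap_and_bound(50)
gap_500, bound_500 = gap_and_bound(500)
print(gap_50, bound_50, gap_500, bound_500)
```

Shrinking the mesh by a factor of ten shrinks the gap by the same factor here, which is the mechanism behind the existence of the common limit.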
One obvious way to extend the preceding result is to note that if $\varphi$ is Riemann
integrable on $J$ with respect to both $\psi_1$ and $\psi_2$, then it is Riemann
integrable on $J$ with respect to $\psi \equiv \psi_2 - \psi_1$ and
$$(R)\int_J \varphi(x)\,d\psi(x) = (R)\int_J \varphi(x)\,d\psi_2(x) - (R)\int_J \varphi(x)\,d\psi_1(x).$$
(This can be seen directly or as a consequence of Theorem 1.2.7 combined
with (iii) in Examples 1.2.2.) In particular, we have the following corollary to
Theorem 1.2.10.
1.2.11 Corollary. If $\psi = \psi_2 - \psi_1$ where $\psi_1$ and $\psi_2$ are non-decreasing functions
on $J$, then every $\varphi \in C(J)$ is Riemann integrable with respect to $\psi$ and
(1.2.9) holds with $K_\psi = \Delta_J\psi_1 + \Delta_J\psi_2$.
We are now going to embark on a program which will show that, at least
among $\psi$'s that are right continuous on $\mathring{J}$ and have left limits at each point
in $J \setminus \{J^-\}$, the $\psi$'s in Corollary 1.2.11 are the only ones with the properties
that every $\varphi \in C(J)$ is $\psi$-Riemann integrable on $J$ and (1.2.9) holds for some
$K_\psi < \infty$. The first step is to provide an alternative description of those $\psi$'s
which can be expressed as the difference of two non-decreasing functions. To
this end, let $\psi$ be a real-valued function on $J$ and define
$$S(\psi; \mathcal{C}) = \sum_{I \in \mathcal{C}} |\Delta_I\psi|$$
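for any non-overlapping, finite, exact cover $\mathcal{C}$ of $J$. Clearly
$$S(\alpha\psi; \mathcal{C}) = |\alpha|\,S(\psi; \mathcal{C}) \quad\text{for all } \alpha \in \mathbb{R},$$
$$S(\psi_1 + \psi_2; \mathcal{C}) \le S(\psi_1; \mathcal{C}) + S(\psi_2; \mathcal{C}) \quad\text{for all } \psi_1 \text{ and } \psi_2,$$
and
$$S(\psi; \mathcal{C}) = |\Delta_J\psi|$$
if $\psi$ is monotone on $J$. Moreover, if $\mathcal{C}$ is given and $\mathcal{C}'$ is obtained from $\mathcal{C}$ by
replacing one of the $I$'s in $\mathcal{C}$ by a pair $\{I_1, I_2\}$, where $I = I_1 \cup I_2$ and $\mathring{I}_1 \cap \mathring{I}_2 = \emptyset$,
then, by the triangle inequality,
$$S(\psi; \mathcal{C}') - S(\psi; \mathcal{C}) = |\psi(I_1^+) - \psi(I_1^-)| + |\psi(I_2^+) - \psi(I_2^-)| - |\psi(I^+) - \psi(I^-)| \ge 0.$$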
Hence we see that
$$S(\psi; \mathcal{C}) \le S(\psi; \mathcal{C}') \quad\text{for } \mathcal{C} \le \mathcal{C}'.$$
We now define the variation of $\psi$ on $J$ to be the number (possibly infinite)
$$\mathrm{Var}(\psi; J) \equiv \sup_{\mathcal{C}} S(\psi; \mathcal{C}), \tag{1.2.12}$$
where the $\mathcal{C}$'s run over all non-overlapping, finite, exact covers of $J$. Also, we
say that $\psi$ has bounded variation on $J$ if $\mathrm{Var}(\psi; J) < \infty$. It should be clear
that if $\psi = \psi_2 - \psi_1$ for non-decreasing $\psi_1$ and $\psi_2$ on $J$, then $\psi$ has bounded
variation on $J$ and $\mathrm{Var}(\psi; J) \le \Delta_J\psi_1 + \Delta_J\psi_2$. What is less obvious is that
every $\psi$ having bounded variation on $J$ can be expressed as the difference of
two non-decreasing functions. In order to prove this, we introduce
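$$S_+(\psi; \mathcal{C}) = \sum_{I \in \mathcal{C}} (\Delta_I\psi)^+$$
and
$$S_-(\psi; \mathcal{C}) = \sum_{I \in \mathcal{C}} (\Delta_I\psi)^-,$$
where $\alpha^+ \equiv \alpha \vee 0$ and $\alpha^- \equiv -(\alpha \wedge 0)$ for $\alpha \in \mathbb{R}$. Also, we call
$$\mathrm{Var}_+(\psi; J) \equiv \sup_{\mathcal{C}} S_+(\psi; \mathcal{C}) \quad\text{and}\quad \mathrm{Var}_-(\psi; J) \equiv \sup_{\mathcal{C}} S_-(\psi; \mathcal{C})$$
the positive variation and the negative variation of $\psi$ on $J$. Noting that
$$\begin{aligned}
2S_\pm(\psi; \mathcal{C}) &= S(\psi; \mathcal{C}) \pm \Delta_J\psi \\
S_+(\psi; \mathcal{C}) - S_-(\psi; \mathcal{C}) &= \Delta_J\psi \\
S_+(\psi; \mathcal{C}) + S_-(\psi; \mathcal{C}) &= S(\psi; \mathcal{C})
\end{aligned} \tag{1.2.13}$$
for any $\mathcal{C}$, we see that
$$S_\pm(\psi; \mathcal{C}) \le S_\pm(\psi; \mathcal{C}'), \quad \mathcal{C} \le \mathcal{C}',$$
and that
$$\mathrm{Var}_+(\psi; J) < \infty \iff \mathrm{Var}(\psi; J) < \infty \iff \mathrm{Var}_-(\psi; J) < \infty.$$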
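The three sums attached to a cover, and the relations among them recorded in (1.2.13), are easy to compute. In the following Python sketch the cover is again encoded by its sorted end-points; the function name `variation_sums` and the test function $\psi(x) = x^2 - 2x$ are illustrative choices, not from the text.

```python
def variation_sums(psi, points):
    """Return (S, S_plus, S_minus) for the cover of [points[0], points[-1]] with these end-points."""
    increments = [psi(points[i + 1]) - psi(points[i]) for i in range(len(points) - 1)]
    s = sum(abs(d) for d in increments)
    s_plus = sum(max(d, 0) for d in increments)     # positive parts of the increments
    s_minus = sum(-min(d, 0) for d in increments)   # negative parts of the increments
    return s, s_plus, s_minus

psi = lambda x: x * x - 2 * x     # decreases on [0, 1], increases on [1, 3]
points = [0, 1, 2, 3]
s, s_plus, s_minus = variation_sums(psi, points)
delta_j = psi(points[-1]) - psi(points[0])
print(s, s_plus, s_minus, delta_j)
```

Here $S = 5$, $S_+ = 4$, $S_- = 1$, and $\Delta_J\psi = 3$, so each relation in (1.2.13) can be read off directly; refining the cover can only increase the three sums.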

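1.2.14 Lemma. If $\mathrm{Var}(\psi; J) < \infty$, then
$$\mathrm{Var}_+(\psi; J) + \mathrm{Var}_-(\psi; J) = \mathrm{Var}(\psi; J) \tag{1.2.15}$$
and
$$\mathrm{Var}_+(\psi; J) - \mathrm{Var}_-(\psi; J) = \Delta_J\psi. \tag{1.2.16}$$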

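PROOF: By the middle relation in (1.2.13), we see that
$$S_\pm(\psi; \mathcal{C}) = S_\mp(\psi; \mathcal{C}) \pm \Delta_J\psi.$$
Hence,
$$\mathrm{Var}_\pm(\psi; J) \le \mathrm{Var}_\mp(\psi; J) \pm \Delta_J\psi;$$
and so (1.2.16) has been proved. Moreover, (1.2.16) combined with the middle
relation in (1.2.13) leads to
$$\mathrm{Var}_+(\psi; J) - S_+(\psi; \mathcal{C}) = \mathrm{Var}_-(\psi; J) - S_-(\psi; \mathcal{C})$$
for any $\mathcal{C}$. In particular, there is a sequence $\{\mathcal{C}_n\}_1^\infty$ such that $S_+(\psi; \mathcal{C}_n) \to
\mathrm{Var}_+(\psi; J)$ as $n \to \infty$ and, at the same time, $S_-(\psi; \mathcal{C}_n) \to \mathrm{Var}_-(\psi; J)$.
Hence, by the last relation in (1.2.13), we see that
$$\mathrm{Var}_+(\psi; J) + \mathrm{Var}_-(\psi; J) = \lim_{n \to \infty} S(\psi; \mathcal{C}_n) \le \mathrm{Var}(\psi; J).$$
At the same time, by that same relation in (1.2.13),
$$S(\psi; \mathcal{C}) = S_+(\psi; \mathcal{C}) + S_-(\psi; \mathcal{C}) \le \mathrm{Var}_+(\psi; J) + \mathrm{Var}_-(\psi; J)$$
for every $\mathcal{C}$. When combined with the preceding, this completes the proof of
(1.2.15). □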

1.2.17 Lemma. If $\psi$ has bounded variation on $[a, b]$ and $a < c < b$, then
$$\mathrm{Var}_\pm(\psi; [a, b]) = \mathrm{Var}_\pm(\psi; [a, c]) + \mathrm{Var}_\pm(\psi; [c, b]),$$
and therefore also $\mathrm{Var}(\psi; [a, b]) = \mathrm{Var}(\psi; [a, c]) + \mathrm{Var}(\psi; [c, b])$.

PROOF: Because of (1.2.15) and (1.2.16), we see that it suffices to check the
equality only for "Var" itself. But if $\mathcal{C}_1$ and $\mathcal{C}_2$ are non-overlapping, finite,
exact covers of $[a, c]$ and $[c, b]$, then $\mathcal{C} = \mathcal{C}_1 \cup \mathcal{C}_2$ is a non-overlapping, finite,
exact cover of $[a, b]$; and so
$$S(\psi; \mathcal{C}_1) + S(\psi; \mathcal{C}_2) = S(\psi; \mathcal{C}) \le \mathrm{Var}(\psi; [a, b]).$$
Hence $\mathrm{Var}(\psi; [a, c]) + \mathrm{Var}(\psi; [c, b]) \le \mathrm{Var}(\psi; [a, b])$. On the other hand, if $\mathcal{C}$
is a non-overlapping, finite, exact cover of $[a, b]$, then it is easy to construct
non-overlapping, finite, exact covers $\mathcal{C}_1$ and $\mathcal{C}_2$ of $[a, c]$ and $[c, b]$ such that
$\mathcal{C} \le \mathcal{C}_1 \cup \mathcal{C}_2$. Hence,
$$S(\psi; \mathcal{C}) \le S(\psi; \mathcal{C}_1 \cup \mathcal{C}_2) = S(\psi; \mathcal{C}_1) + S(\psi; \mathcal{C}_2) \le \mathrm{Var}(\psi; [a, c]) + \mathrm{Var}(\psi; [c, b]).$$
Since this is true for every $\mathcal{C}$, the asserted equality is now proved. □

We have now proved the following decomposition theorem for functions having
bounded variation.
1.2.18 Theorem. Let $\psi : J \to \mathbb{R}$ be given. Then $\psi$ has bounded variation
on $J$ if and only if there exist non-decreasing functions $\psi_1$ and $\psi_2$ on $J$ such
that $\psi = \psi_2 - \psi_1$. In fact, if $\psi$ has bounded variation on $J = [a, b]$ and we
define $\psi_\pm(x) = \mathrm{Var}_\pm(\psi; [a, x])$ for $x \in J$, then $\psi_+$ and $\psi_-$ are non-decreasing
and $\psi(x) = \psi(a) + \psi_+(x) - \psi_-(x)$, $x \in J$. Finally, if $\psi$ has bounded variation
on $J$, then every $\varphi \in C(J)$ is Riemann integrable on $J$ with respect to $\psi$ and
$$\Bigl|(R)\int_J \varphi(x)\,d\psi(x)\Bigr| \le \mathrm{Var}(\psi; J)\,\|\varphi\|_u. \tag{1.2.19}$$
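The decomposition $\psi(x) = \psi(a) + \psi_+(x) - \psi_-(x)$ can be approximated on a grid by accumulating the positive and negative parts of successive increments. The Python sketch below is an illustration under stated assumptions: the running sums are the sums $S_\pm$ over the grid restricted to $[a, x]$, which in general are only lower bounds for $\mathrm{Var}_\pm(\psi; [a, x])$, although they are close when the grid is fine and $\psi$ is piecewise monotone. The name `jordan_on_grid` is hypothetical.

```python
import math

def jordan_on_grid(psi, points):
    """Running sums of positive and negative parts of the increments of psi along `points`."""
    plus, minus = [0.0], [0.0]
    for i in range(len(points) - 1):
        d = psi(points[i + 1]) - psi(points[i])
        plus.append(plus[-1] + max(d, 0.0))
        minus.append(minus[-1] + max(-d, 0.0))
    return plus, minus

n = 400
points = [2.0 * math.pi * k / n for k in range(n + 1)]
plus, minus = jordan_on_grid(math.sin, points)

# Largest deviation from psi(x) = psi(a) + psi_plus(x) - psi_minus(x) over the grid points.
worst = max(
    abs(math.sin(x) - (math.sin(points[0]) + plus[k] - minus[k]))
    for k, x in enumerate(points)
)
print(plus[-1], minus[-1], worst)
```

For $\psi = \sin$ on $[0, 2\pi]$ both running sums end near 2, and each is non-decreasing along the grid, as the theorem asserts for $\psi_+$ and $\psi_-$.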


In order to complete our program, we need one more elementary fact.
1.2.20 Lemma. If $\psi : J \to \mathbb{R}$ has a right limit in $\mathbb{R}$ at every $x \in J \setminus \{J^+\}$
and a left limit in $\mathbb{R}$ at every $x \in J \setminus \{J^-\}$, then $\psi$ is bounded and
$$\mathrm{card}\bigl(\{x \in J : |\psi(x) - \psi(x+)| \vee |\psi(x) - \psi(x-)| \ge \epsilon\}\bigr) < \infty \quad\text{for each } \epsilon > 0.$$
In particular, $\psi$ has at most countably many discontinuities. Also, if $\tilde\psi(x) \equiv
\psi(x+)$ for $x \in \mathring{J}$ and $\tilde\psi(x) = \psi(x)$ for $x \in \{J^-, J^+\}$, then $\tilde\psi$ is right-continuous
on $\mathring{J}$, has a left limit in $\mathbb{R}$ at every $x \in J \setminus \{J^-\}$, and coincides with $\psi$ at all
points where $\psi$ is continuous. Thus, if $\varphi$ is Riemann integrable on $J$ with
respect to both $\psi$ and $\tilde\psi$, then $(R)\int_J \varphi(x)\,d\tilde\psi(x) = (R)\int_J \varphi(x)\,d\psi(x)$. Finally,
if $\varphi \in C(J)$ is Riemann integrable on $J$ with respect to $\psi$, then it is also
Riemann integrable on $J$ with respect to $\tilde\psi$.
PROOF: Suppose that $\psi$ were unbounded. Then we could find $\{x_n\}_1^\infty\subseteq J$ so that $|\psi(x_n)|\to\infty$ as $n\to\infty$; and clearly there is no loss in generality if we assume that $x_{n+1}<x_n$ for all $n\ge1$. But this would mean that $|\psi(x+)|=\infty$, where $x=\lim_{n\to\infty}x_n$, and so no such sequence can exist. Thus $\psi$ must be bounded. The proof that $\mathrm{card}\bigl(\{x\in J:|\psi(x)-\psi(x+)|\vee|\psi(x)-\psi(x-)|\ge\epsilon\}\bigr)<\infty$ is similar. Namely, if not, then we could assume that there exists a strictly decreasing sequence $\{x_n\}\subseteq J^\circ$ with limit $x\in J$ such that $|\psi(x_n)-\psi(x_n+)|\vee|\psi(x_n)-\psi(x_n-)|\ge\epsilon$ for each $n\ge1$. But then, for each $n\ge1$, we could find $x_n'\in(x,x_n)$ and $x_n''\in(x_n,x_n+\frac1n)\cap J$ so that $|\psi(x_n)-\psi(x_n')|\vee|\psi(x_n)-\psi(x_n'')|\ge\frac{\epsilon}{2}$; and clearly this contradicts the existence in $\mathbb{R}$ of $\psi(x+)$.
The preceding makes it obvious that $\psi$ can be discontinuous at only countably many points. In addition it is clear that $\tilde\psi(x\pm)=\psi(x\pm)$ for all $x\in J^\circ$. To prove the equality of Riemann integrals with respect to $\psi$ and $\tilde\psi$ of $\varphi$'s which are Riemann integrable with respect to both, note that, because $\tilde\psi$ coincides with $\psi$ on $\{J^-,J^+\}$ as well as on a dense subset of $J^\circ$, we can always evaluate these integrals using Riemann sums which are the same whether they are computed with respect to $\psi$ or to $\tilde\psi$.

Finally, we must show that if $\varphi\in C(J)$ is Riemann integrable with respect to $\psi$, then it also is with respect to $\tilde\psi$. To do this, it clearly suffices to show that for any $\mathcal{C}$, $\xi\in\Xi(\mathcal{C})$, and $\epsilon>0$, there is a $\mathcal{C}'$ and a $\xi'\in\Xi(\mathcal{C}')$ such that $\|\mathcal{C}'\|\le2\|\mathcal{C}\|$ and $|R(\varphi|\tilde\psi;\mathcal{C},\xi)-R(\varphi|\psi;\mathcal{C}',\xi')|<\epsilon$. To this end, let $J^-=c_0<\cdots<c_{n+1}=J^+$ be chosen so that $\mathcal{C}=\{[c_k,c_{k+1}]:0\le k\le n\}$. For $0<\alpha<c_{n+1}-c_n$, set
$$c_{k,\alpha}=\begin{cases}c_k&\text{if }k\in\{0,n+1\}\\ c_k+\alpha&\text{if }1\le k\le n,\end{cases}$$
and let $\mathcal{C}_\alpha=\{[c_{k,\alpha},c_{k+1,\alpha}]:0\le k\le n\}$. Clearly $\|\mathcal{C}_\alpha\|\le2\|\mathcal{C}\|$. Next, define $\xi_\alpha([c_{k,\alpha},c_{k+1,\alpha}])=c_{k,\alpha}$ for $0\le k\le n$. Then, because $\varphi$ is continuous and $\tilde\psi$ is right-continuous on $J^\circ$,

At the same time,

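for all but a countable number of $\alpha$'s. Thus, for any $\epsilon>0$, there is an $\alpha>0$ for which $|R(\varphi|\tilde\psi;\mathcal{C},\xi)-R(\varphi|\tilde\psi;\mathcal{C}_\alpha,\xi_\alpha)|<\epsilon$ and $R(\varphi|\tilde\psi;\mathcal{C}_\alpha,\xi_\alpha)=R(\varphi|\psi;\mathcal{C}_\alpha,\xi_\alpha)$. $\square$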
1.2.21 Remark: As a consequence of Theorem 1.2.18, any $\psi$ having bounded variation on $J$ certainly satisfies the hypotheses of Lemma 1.2.20. Moreover, if $\tilde\psi$ is defined accordingly, then one can check that $\mathrm{Var}(\tilde\psi;J)\le\mathrm{Var}(\psi;J)$.
1.2.22 Theorem. Let $\psi$ be a function on $J$ which satisfies the hypotheses of Lemma 1.2.20, and define $\tilde\psi$ accordingly. If every $\varphi\in C(J)$ is Riemann integrable on $J$ with respect to $\psi$ and if there is a $K<\infty$ such that

(1.2.23) $$\left|(R)\int_J\varphi(x)\,d\psi(x)\right|\le K\|\varphi\|_u,\quad\varphi\in C(J),$$

then $\tilde\psi$ has bounded variation on $J$ and

(1.2.24) $$\mathrm{Var}(\tilde\psi;J)=\sup\left\{(R)\int_J\varphi(x)\,d\psi(x):\varphi\in C(J)\text{ and }\|\varphi\|_u=1\right\}=\sup\left\{(R)\int_J\varphi(x)\,d\tilde\psi(x):\varphi\in C(J)\text{ and }\|\varphi\|_u=1\right\}.$$

In particular, if $\psi$ itself is right-continuous on $J^\circ$, then $\psi$ has bounded variation on $J$ if and only if every $\varphi\in C(J)$ is Riemann integrable on $J$ with respect to $\psi$ and (1.2.23) holds for some $K<\infty$, in which case $\mathrm{Var}(\psi;J)$ is the optimal choice of $K$.

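PROOF: In view of what we already know, all that we have to do is check that for each $\mathcal{C}$ and $\epsilon>0$ there is a $\varphi\in C(J)$ such that $\|\varphi\|_u=1$ and $S(\tilde\psi;\mathcal{C})\le(R)\int_J\varphi(x)\,d\tilde\psi(x)+\epsilon$. Moreover, because $\tilde\psi$ is right continuous, we may and will assume that $\mathcal{C}=\{[c_k,c_{k+1}]:0\le k\le n\}$ where $J^-=c_0<\cdots<c_{n+1}=J^+$ and $c_k$ is a point of continuity of $\tilde\psi$ for each $1\le k\le n$.
Given $0<\alpha<\min_{0\le k\le n}\frac{c_{k+1}-c_k}{2}$, define $\varphi_\alpha\in C(J)$ so that
$$\varphi_\alpha(x)=\begin{cases}\mathrm{sgn}\bigl(\Delta_{[c_0,c_1]}\tilde\psi\bigr)&\text{for }x\in[c_0,c_1-\alpha],\\ \mathrm{sgn}\bigl(\Delta_{[c_k,c_{k+1}]}\tilde\psi\bigr)&\text{for }x\in[c_k+\alpha,c_{k+1}-\alpha]\text{ and }1\le k<n,\\ \mathrm{sgn}\bigl(\Delta_{[c_n,c_{n+1}]}\tilde\psi\bigr)&\text{for }x\in[c_n+\alpha,c_{n+1}],\end{cases}$$
and $\varphi_\alpha$ is linear on each of the intervals $[c_k-\alpha,c_k+\alpha]$, $1\le k\le n$. (The signum function $t\in\mathbb{R}\mapsto\mathrm{sgn}(t)$ is defined so that $\mathrm{sgn}(t)$ is $-1$ or $1$ according to whether $t<0$ or $t\ge0$.) Then, by (iv) in Examples 1.2.2,
$$(R)\int_J\varphi_\alpha(x)\,d\tilde\psi(x)-S(\tilde\psi;\mathcal{C})=\sum_{k=0}^n(R)\int_{[c_k,c_{k+1}]}\Bigl(\varphi_\alpha(x)-\mathrm{sgn}\bigl(\Delta_{[c_k,c_{k+1}]}\tilde\psi\bigr)\Bigr)\,d\tilde\psi(x)$$
$$=\sum_{k=1}^n\left[(R)\int_{[c_k-\alpha,c_k]}\bigl(\varphi_\alpha(x)-\varphi_\alpha(c_k-\alpha)\bigr)\,d\tilde\psi(x)+(R)\int_{[c_k,c_k+\alpha]}\bigl(\varphi_\alpha(x)-\varphi_\alpha(c_k+\alpha)\bigr)\,d\tilde\psi(x)\right].$$
For each $1\le k\le n$, either $\varphi_\alpha\equiv\varphi_\alpha(c_k-\alpha)$ on $[c_k-\alpha,c_k+\alpha]$, in which case the corresponding term does not contribute to the preceding sum, or $\varphi_\alpha(c_k)=0$ and $\varphi_\alpha'\equiv\bigl(\varphi_\alpha(c_k+\alpha)-\varphi_\alpha(c_k-\alpha)\bigr)/2\alpha$ on $[c_k-\alpha,c_k+\alpha]$. In the latter case, we apply Theorem 1.2.7 and equation (1.2.3) to show that
$$(R)\int_{[c_k-\alpha,c_k]}\bigl(\varphi_\alpha(x)-\varphi_\alpha(c_k-\alpha)\bigr)\,d\tilde\psi(x)+(R)\int_{[c_k,c_k+\alpha]}\bigl(\varphi_\alpha(x)-\varphi_\alpha(c_k+\alpha)\bigr)\,d\tilde\psi(x)$$
$$=\bigl[\varphi_\alpha(c_k+\alpha)-\varphi_\alpha(c_k-\alpha)\bigr]\tilde\psi(c_k)-\frac{\varphi_\alpha(c_k+\alpha)-\varphi_\alpha(c_k-\alpha)}{2\alpha}\,(R)\int_{[c_k-\alpha,c_k+\alpha]}\tilde\psi(x)\,dx,$$
which, since $\tilde\psi$ is continuous at $c_k$, clearly tends to $0$ as $\alpha\searrow0$. In other words, we now see that
$$S(\tilde\psi;\mathcal{C})=\lim_{\alpha\searrow0}(R)\int_J\varphi_\alpha(x)\,d\tilde\psi(x),$$
which is all that we had to prove. $\square$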
Exercises

1.2.25 Exercise: Check all of the assertions in Examples 1.2.2. The only one which presents a challenge is the assertion in (iv) that $\varphi$ is Riemann integrable on both $J_1$ and $J_2$ with respect to $\psi$.
1.2.26 Exercise: If $\psi$ is non-decreasing on $J$, show that a bounded function $\varphi$ is Riemann integrable on $J$ with respect to $\psi$ if and only if for every $\epsilon>0$ there is a $\delta>0$ such that

(1.2.27) $$\sum_{\{I\in\mathcal{C}:\ \sup_I\varphi-\inf_I\varphi\ge\epsilon\}}\Delta_I\psi<\epsilon$$

whenever $\mathcal{C}$ is a non-overlapping, finite, exact cover of $J$ satisfying $\|\mathcal{C}\|<\delta$. Also, show that when, in addition, $\psi\in C(J)$, the preceding can be replaced by the condition that, for each $\epsilon>0$, (1.2.27) holds for some $\mathcal{C}$. (Hint: for the last part, compare the situation here to the one handled in Lemma 1.1.7.)
1.2.28 Exercise: If $\psi\in C(J)$, show that
$$\mathrm{Var}_\pm(\psi;J)=\lim_{\|\mathcal{C}\|\to0}S_\pm(\psi;\mathcal{C})\in[0,\infty]$$
and conclude that $\mathrm{Var}(\psi;J)=\lim_{\|\mathcal{C}\|\to0}S(\psi;\mathcal{C})$. Also, show that if $\psi\in C^1(J)$, then
$$\mathrm{Var}_\pm(\psi;J)=(R)\int_J\psi'(x)^\pm\,dx$$
and therefore $\mathrm{Var}(\psi;J)=(R)\int_J|\psi'(x)|\,dx$.
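
The limit in Exercise 1.2.28 can be watched numerically before it is proved. The sketch below is only an illustration under stated assumptions: the test function $\psi(x)=x^2$ on $[-1,1]$ and the helper name `uniform_sum` are choices made here, not part of the text. For this $\psi$ one has $(R)\int_{[-1,1]}|\psi'(x)|\,dx=2$, and the sums over uniform covers approach that value as the mesh shrinks.

```python
from fractions import Fraction


def uniform_sum(psi, a, b, n):
    """S(psi; C) for the cover of [a, b] by n intervals of equal length."""
    points = [a + (b - a) * Fraction(k, n) for k in range(n + 1)]
    return sum(abs(psi(right) - psi(left)) for left, right in zip(points, points[1:]))


def square(x):
    # Illustrative choice only: psi(x) = x^2, whose variation on [-1, 1] is 2.
    return x * x


sums = [uniform_sum(square, -1, 1, n) for n in (3, 5, 9, 101)]
```

When the number of intervals is odd, the middle interval straddles the turning point at $0$ and contributes nothing, so the sum falls short of $2$ by exactly $2/n^2$; with an even number of intervals the turning point is an endpoint and the sum is already equal to $2$.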
1.2.29 Exercise: Let $\psi$ be a function of bounded variation on the interval $J=[c,d]$, and define the non-decreasing functions $\psi_+$ and $\psi_-$ accordingly, as in Theorem 1.2.18. Given any other pair of non-decreasing functions $\psi_1$ and $\psi_2$ on $J$ satisfying $\psi=\psi_2-\psi_1$, show that $\psi_2-\psi_+$ and $\psi_1-\psi_-$ are both non-decreasing functions. In particular, this means that $\psi_+\le\psi_2-\psi_2(c)$ and $\psi_-\le\psi_1-\psi_1(c)$ whenever $\psi_2$ and $\psi_1$ are non-decreasing functions for which $\psi=\psi_2-\psi_1$. Using Lemma 1.2.17 and the preceding, show that
$$\psi_\pm(x+)-\psi_\pm(x)=\bigl(\psi(x+)-\psi(x)\bigr)^\pm,\quad x\in[c,d),$$
and
$$\psi_\pm(x)-\psi_\pm(x-)=\bigl(\psi(x)-\psi(x-)\bigr)^\pm,\quad x\in(c,d].$$
Conclude, in particular, that the jumps in $x\mapsto\mathrm{Var}(\psi;[c,x])$, from both the right and left, coincide with the absolute value of the corresponding jumps in $\psi$. Hence, $\psi$ is continuous if $x\in J\mapsto\mathrm{Var}(\psi;[c,x])$ is; and if $\psi$ is continuous, then so are $\psi_+$ and $\psi_-$ and therefore also $\mathrm{Var}(\psi;[c,x])$.

Hint: In order to handle the last part, show that it is enough to check that $\psi_+(c+)=\bigl(\psi(c+)-\psi(c)\bigr)^+$. Next, show that this comes down to checking that $\beta\equiv\psi_+(c+)\wedge\psi_-(c+)=0$. Finally, define $\psi_1$ and $\psi_2$ on $[c,d]$ so that $\psi_1(c)=0$, $\psi_2(c)=\psi(c)$, and, for $x\in(c,d]$, $\psi_1(x)=\psi_-(x)-\beta$ while $\psi_2(x)=\psi(c)+\psi_+(x)-\beta$; and apply the first part of this exercise to see that $\psi_-\le\psi_1$.

1.2.30 Exercise: Construct an example of a $\psi\in C([0,1])$ which is not of bounded variation. Also, give an example of a $\psi$ having bounded variation on $[0,1]$ for which

$$\sup\left\{(R)\int_{[0,1]}\varphi(x)\,d\psi(x):\varphi\in C(J)\text{ and }\|\varphi\|_u=1\right\}<\mathrm{Var}(\psi;J).$$
Chapter II
Lebesgue's Measure

2.0 The Idea.

In this chapter we construct Lebesgue's measure on $\mathbb{R}^N$, and in the following chapter we will develop his method of integration. To avoid getting lost in the details, it will be important to keep in mind what it is we are attempting to do. For this reason, we begin with a brief summary of our goals.
The essence of any theory of integration is a divide and conquer strategy. That is, given a space $E$ and a family $\mathcal{B}$ of subsets $\Gamma\subseteq E$ for which one has a reasonable notion of measure assignment $\Gamma\in\mathcal{B}\mapsto\mu(\Gamma)\in[0,\infty]$, the integral of a function $f:E\to\mathbb{R}$ should be computed by a prescription containing the following ingredients. In the first place, one has to choose a partition $\mathcal{P}$ of the space $E$ into subsets $\Gamma\in\mathcal{B}$. Secondly, given $\mathcal{P}$, one has to select for each $\Gamma\in\mathcal{P}$ a typical value $a_\Gamma$ of $f$ on $\Gamma$. Thirdly, given both the partition $\mathcal{P}$ and the selection $\Gamma\in\mathcal{P}\mapsto a_\Gamma\in\mathrm{Range}(f\restriction\Gamma)$, one forms the sum

(2.0.1) $$\sum_{\Gamma\in\mathcal{P}}a_\Gamma\,\mu(\Gamma).$$

Finally, using a limit procedure if necessary, one removes the ambiguity (inherent in the notion of typical) by choosing the partitions $\mathcal{P}$ in such a way that the restriction of $f$ to each $\Gamma$ is increasingly close to a constant.
Obviously, even if we ignore all questions of convergence, the only way in which we can make sense out of (2.0.1) is if we restrict ourselves to either finite or, at worst, countable partitions $\mathcal{P}$. Hence, in general, the final limit procedure will be essential. Be that as it may, when $E$ is itself countable and $\{x\}\in\mathcal{B}$ for every $x\in E$, there is an obvious way to avoid the limit step, namely one chooses $\mathcal{P}=\{\{x\}:x\in E\}$ and takes

(2.0.2) $$\sum_{x\in E}f(x)\,\mu(\{x\})$$

to be the integral. (We will ignore, for the present, all problems arising from questions of convergence.) Clearly, this is the idea on which Riemann based his

theory of integration. On the other hand, Riemann's is not the only obvious way to proceed, even in the case of countable spaces $E$. For example, again assuming that $E$ is countable, let $f:E\to\mathbb{R}$ be given. Then $\mathrm{Range}(f)$ is countable and, assuming that $\Gamma(a)\equiv\{x\in E:f(x)=a\}\in\mathcal{B}$ for every $a\in\mathbb{R}$, Lebesgue would say that

(2.0.3) $$\sum_{a\in\mathrm{Range}(f)}a\,\mu\bigl(\Gamma(a)\bigr)$$

is an equally obvious candidate for the integral of $f$.
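
The two candidates (2.0.2) and (2.0.3) are easy to compare on a finite space. The following is a minimal sketch under assumptions made only for illustration: the four-point space, the point masses in `mu`, the values in `f`, and the function names are all hypothetical. The first function sums over points, the second groups points into the level sets $\Gamma(a)$ and sums over values.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical finite space E = {a, b, c, d} with point masses mu({x}).
mu = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 8), "d": Fraction(1, 8)}
f = {"a": 3, "b": 1, "c": 3, "d": 1}


def sum_over_points(f, mu):
    # (2.0.2): add f(x) * mu({x}) over the points x of E.
    return sum(f[x] * mu[x] for x in mu)


def sum_over_values(f, mu):
    # (2.0.3): first compute mu(Gamma(a)) for each value a, then add a * mu(Gamma(a)).
    level_measure = defaultdict(Fraction)
    for x, mass in mu.items():
        level_measure[f[x]] += mass  # uses additivity of mu over the points of Gamma(a)
    return sum(a * m for a, m in level_measure.items())
```

The agreement of the two answers depends on computing the measure of each level set by adding point masses, which is exactly the additivity property that the text goes on to isolate.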


In order to reconcile these two obvious definitions, one has to examine the assignment $\Gamma\in\mathcal{B}\mapsto\mu(\Gamma)\in[0,\infty]$ of measure. Indeed, even if $E$ is countable and $\mathcal{B}$ contains every subset of $E$, (2.0.2) and (2.0.3) give the same answer only if one knows that, for any countable collection $\{\Gamma_n\}\subseteq\mathcal{B}$,

(2.0.4) $$\mu\Bigl(\bigcup_n\Gamma_n\Bigr)=\sum_n\mu(\Gamma_n)\quad\text{when }\Gamma_m\cap\Gamma_n=\emptyset\text{ for }m\ne n.$$

When $E$ is countable, (2.0.4) is equivalent to taking

$$\mu(\Gamma)=\sum_{x\in\Gamma}\mu(\{x\}),\quad\Gamma\subseteq E.$$

However, when $E$ is uncountable, the property in (2.0.4) becomes highly non-trivial. In fact, it is unquestionably Lebesgue's most significant achievement to have shown that there are non-trivial assignments of measure which enjoy this property.
Having compared Lebesgue's ideas to Riemann's in the countable setting, we close this introduction to Lebesgue's theory with a few words about the same comparison for uncountable spaces. For this purpose, we will suppose that $E=[0,1]$ and, without worrying about exactly which subsets of $E$ are included in $\mathcal{B}$, we will suppose that $\mathcal{B}$ contains not only all open and closed subsets of $E$ but also all the sets which can be obtained, starting from open and closed sets, by countable, set-theoretic operations. Further, we assume that $\Gamma\in\mathcal{B}\mapsto\mu(\Gamma)\in[0,1]$ is a mapping which satisfies (2.0.4).
Now let $f:[0,1]\to\mathbb{R}$ be given. In order to integrate $f$, Riemann says that we should divide up $[0,1]$ into small intervals, choose a representative value of $f$ from each interval, form the associated Riemann sum, and then take the limit as the mesh size of the division tends to $0$. As we know, his procedure works beautifully as long as the function $f$ respects the topology of the real line; that is, as long as $f$ is sufficiently continuous. However, Riemann's procedure is doomed to failure when $f$ does not respect the topology of $\mathbb{R}$. The problem is, of course, that Riemann's partitioning procedure is tied to the topology of the reals and is therefore too rigid to accommodate functions which pay little or no attention to that topology. To get around this problem, Lebesgue tailors his
partitioning procedure to the particular function $f$ under consideration. Thus, for a given function $f$, Lebesgue might consider the sequence of partitions $\mathcal{P}_n$, $n\in\mathbb{N}$, consisting of the sets
$$\Gamma_{n,k}=\left\{x\in E:f(x)\in\left[\frac{k}{2^n},\frac{k+1}{2^n}\right)\right\},\quad k\in\mathbb{Z}.$$
Obviously, no two values that $f$ takes on any one of the $\Gamma_{n,k}$'s can differ by more than $\frac{1}{2^n}$. Hence, assuming that $\Gamma_{n,k}\in\mathcal{B}$ for every $n\in\mathbb{N}$ and $k\in\mathbb{Z}$ and ignoring convergence problems,
$$\lim_{n\to\infty}\sum_{k\in\mathbb{Z}}\frac{k}{2^n}\,\mu(\Gamma_{n,k})$$
simply must be the integral of $f$!
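
The dyadic prescription just described can be tried out on a finite space, where the exact answer is available for comparison. This is a minimal sketch under assumptions made for illustration only: the space, the masses `mu`, the values `f`, and the function name `dyadic_sum` are hypothetical. Each point is assigned to the level set $\Gamma_{n,k}$ with $k=\lfloor2^nf(x)\rfloor$, and the sum $\sum_k\frac{k}{2^n}\mu(\Gamma_{n,k})$ is formed.

```python
import math
from collections import defaultdict
from fractions import Fraction

# Hypothetical finite space with point masses and a function taking non-dyadic values.
mu = {"p": Fraction(1, 3), "q": Fraction(1, 2), "r": Fraction(1, 6)}
f = {"p": Fraction(1, 3), "q": Fraction(5, 7), "r": Fraction(-2, 5)}


def dyadic_sum(f, mu, n):
    """Sum over k of (k / 2^n) * mu(Gamma_{n,k}), where Gamma_{n,k} = {k/2^n <= f < (k+1)/2^n}."""
    level_measure = defaultdict(Fraction)
    for x, mass in mu.items():
        k = math.floor(f[x] * 2**n)  # the unique k with k/2^n <= f(x) < (k+1)/2^n
        level_measure[k] += mass
    return sum(Fraction(k, 2**n) * m for k, m in level_measure.items())


exact = sum(f[x] * mu[x] for x in mu)
approximations = [dyadic_sum(f, mu, n) for n in range(12)]
```

Because $f$ exceeds $k/2^n$ by less than $2^{-n}$ on $\Gamma_{n,k}$, each approximation lies below the exact value by at most $2^{-n}\mu(E)$, and the approximations increase with $n$.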
When one hears Lebesgue's ideas for the first time, one may well wonder what there is left to be done. On the other hand, after a little reflection, some doubts begin to emerge. For example, what is so sacrosanct about Lebesgue's partitioning suggested in the preceding paragraph and, for instance, why should one have not done the same thing relative to $3^n$ instead of $2^n$? The answer is, of course, that there is nothing to recommend $2^n$ over $3^n$ and that it should make no difference which of them is used. Thus, one has to check that it really does not matter, and, once again, the verification entails repeated application of the property in (2.0.4). In fact, it will become increasingly evident that Lebesgue's entire program rests on (2.0.4).
With the preceding comments in mind, it should be clear why we initiate Lebesgue's program with his proof that an interesting $\mu$ satisfying (2.0.4) actually exists. To be more precise, the rest of this chapter is devoted to showing that we can define such a $\mu$ on a rich class of subsets of $\mathbb{R}^N$ so that $\mu(I)=\mathrm{vol}(I)$ whenever $I$ is a rectangle.

2.1 Existence.
Given a countable (possibly overlapping) cover $\mathcal{C}$ of a subset $\Gamma\subseteq\mathbb{R}^N$ by rectangles $I$, define $\Sigma(\mathcal{C})=\sum_{I\in\mathcal{C}}\mathrm{vol}(I)\in[0,\infty]$. We call
$$|\Gamma|_e=\inf\Bigl\{\Sigma(\mathcal{C}):\Gamma\subseteq\bigcup\mathcal{C}\Bigr\}$$
the outer or exterior Lebesgue measure of $\Gamma$. What we are going to do is describe a family $\overline{\mathcal{B}}_{\mathbb{R}^N}$ for which the map
$$\Gamma\in\overline{\mathcal{B}}_{\mathbb{R}^N}\longmapsto|\Gamma|_e$$
satisfies (2.0.4). (The notation here, in particular the bar, will be explained in (ii) of Examples 3.1.5 and Exercise 3.1.9.) However, before starting on this project, we first check that $|I|_e=\mathrm{vol}(I)$ for rectangles $I$.
2.1.1 Lemma. If $\Gamma=\bigcup_1^nJ_m$ where the $J_m$'s are non-overlapping rectangles, then $|\Gamma|_e=\sum_1^n\mathrm{vol}(J_m)$.
PROOF: Obviously $|\Gamma|_e\le\sum_1^n\mathrm{vol}(J_m)$. To prove the opposite inequality, let $\mathcal{C}=\{I_\ell\}_1^\infty$ be a cover of $\Gamma$. Given an $\epsilon>0$, choose $I_\ell'$ for each $\ell\in\mathbb{Z}^+$ so that $I_\ell\subseteq(I_\ell')^\circ$ and $\mathrm{vol}(I_\ell')\le\mathrm{vol}(I_\ell)+2^{-\ell}\epsilon$. Because $\Gamma$ is compact, there exists an $L\in\mathbb{Z}^+$ such that $\{I_1',\dots,I_L'\}$ covers $\Gamma$. In particular, by Lemma 1.1.1,
$$\sum_{m=1}^n\mathrm{vol}(J_m)\le\sum_{m=1}^n\sum_{\ell=1}^L\mathrm{vol}(J_m\cap I_\ell')\le\sum_{\ell=1}^L\mathrm{vol}(I_\ell')\le\sum_{\ell=1}^\infty\mathrm{vol}(I_\ell')\le\Sigma(\mathcal{C})+\epsilon.$$
(In the preceding, we have used the fact that, for any pair of rectangles $I$ and $J$, $I\cap J$ is again a rectangle.) After first letting $\epsilon\searrow0$ and then taking the infimum over $\mathcal{C}$'s, we get the required result. $\square$
In view of Lemma 2.1.1, we are justified in replacing $\mathrm{vol}(I)$ by $|I|_e$ for rectangles $I$.
Our next result shows that half the equality in (2.0.4) is automatic, even before we restrict to $\Gamma$'s from $\overline{\mathcal{B}}_{\mathbb{R}^N}$.
2.1.2 Lemma. If $\Gamma_1\subseteq\Gamma_2$, then $|\Gamma_1|_e\le|\Gamma_2|_e$. In addition, if $\Gamma\subseteq\bigcup_1^\infty\Gamma_n$, then $|\Gamma|_e\le\sum_1^\infty|\Gamma_n|_e$. In particular, if $\Gamma\subseteq\bigcup_1^\infty\Gamma_n$ and $|\Gamma_n|_e=0$ for each $n\ge1$, then $|\Gamma|_e=0$; and so $|\partial I|_e=0$ for any rectangle $I$. (Here, and throughout, $\partial S$ will denote the boundary $\overline{S}\setminus S^\circ$ of the set $S$.) Finally, if $\Gamma_1$ and $\Gamma_2$ are subsets of $\mathbb{R}^N$ for which

$$\mathrm{dist}(\Gamma_1,\Gamma_2)>0,$$

then $|\Gamma_1\cup\Gamma_2|_e=|\Gamma_1|_e+|\Gamma_2|_e$.

PROOF: The first assertion follows immediately from the fact that every cover of $\Gamma_2$ is also a cover of $\Gamma_1$.
In order to prove the second assertion, let $\epsilon>0$ be given, and choose for each $n\ge1$ a cover $\mathcal{C}_n$ of $\Gamma_n$ so that $\Sigma(\mathcal{C}_n)\le|\Gamma_n|_e+2^{-n}\epsilon$. It is obvious that $\mathcal{C}\equiv\bigcup_1^\infty\mathcal{C}_n$ is a countable cover of $\Gamma$. Hence
$$|\Gamma|_e\le\Sigma(\mathcal{C})\le\sum_{n=1}^\infty\sum_{I\in\mathcal{C}_n}\mathrm{vol}(I)\le\sum_{n=1}^\infty\bigl(|\Gamma_n|_e+2^{-n}\epsilon\bigr)=\sum_{n=1}^\infty|\Gamma_n|_e+\epsilon.$$

Given the preceding, one proves that $|\partial I|_e=0$ by observing that $\partial I$ is the union of $2N$ degenerate rectangles of the form $[a_1,b_1]\times\cdots\times\{c_k\}\times\cdots\times[a_N,b_N]$, each of which has exterior measure $0$.

Turning to the final assertion, set $\delta=\mathrm{dist}(\Gamma_1,\Gamma_2)$, and let $\mathcal{C}$ be a countable cover of $\Gamma_1\cup\Gamma_2$ by rectangles $I$. Without loss in generality, we will assume that $\mathrm{diam}(I)<\delta$ for all $I\in\mathcal{C}$. (If this is not already the case, it can be brought about by replacing $\mathcal{C}$ by a new covering $\mathcal{C}'$ in which each $I\in\mathcal{C}$ with $\mathrm{diam}(I)\ge\delta$
has been replaced with rectangles obtained by repeatedly subdividing $I$. As an application of Lemma 1.1.1, $\Sigma(\mathcal{C})=\Sigma(\mathcal{C}')$.) Next, set
$$\mathcal{C}_i=\{I\in\mathcal{C}:I\cap\Gamma_i\ne\emptyset\},$$
and observe that, on the one hand, $\mathcal{C}_i$ covers $\Gamma_i$, while, on the other hand, $\mathcal{C}_1\cap\mathcal{C}_2=\emptyset$. Hence,
$$|\Gamma_1|_e+|\Gamma_2|_e\le\Sigma(\mathcal{C}_1)+\Sigma(\mathcal{C}_2)\le\Sigma(\mathcal{C}),$$
and so, after taking the infimum over $\mathcal{C}$'s, we see that $|\Gamma_1|_e+|\Gamma_2|_e\le|\Gamma_1\cup\Gamma_2|_e$. Since the opposite inequality is trivial, this completes the proof. $\square$
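
The device used in the proof of the second assertion, spending $2^{-n}\epsilon$ on the $n$th set, already shows why a countable set has exterior measure $0$. The sketch below illustrates this in $\mathbb{R}$ for finitely many rationals; the enumeration order and the function names are assumptions made here, not part of the text. The $n$th point is covered by an interval of length $2^{-n}\epsilon$, so the total length stays below $\epsilon$ however many points are covered.

```python
from fractions import Fraction


def rationals_in_unit_interval(count):
    """First `count` distinct rationals p/q in [0, 1], listed by increasing denominator q."""
    seen, out, q = set(), [], 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out


def epsilon_cover(points, eps):
    """Cover the n-th point (n = 1, 2, ...) by an interval of length eps * 2**-n centred on it."""
    cover = []
    for n, x in enumerate(points, start=1):
        half = eps / 2 ** (n + 1)
        cover.append((x - half, x + half))
    return cover


points = rationals_in_unit_interval(50)
cover = epsilon_cover(points, Fraction(1, 1000))
total_length = sum(right - left for left, right in cover)
```

For $N$ points the total length is $\epsilon(1-2^{-N})$, which is less than $\epsilon$ uniformly in $N$; this is the finite shadow of the estimate $\sum_n2^{-n}\epsilon=\epsilon$ used in the proof.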
2.1.3 Remark: We will use $\mathfrak{G}$ to denote the class of all open subsets in a topological space. Thus, in the present context, $\mathfrak{G}$ stands for the class of open subsets of $\mathbb{R}^N$. In this connection, we also introduce the class $\mathfrak{G}_\delta$ which consists of all subsets which can be written as the intersection of a countable number of open subsets. Note that $\mathfrak{G}\cup\mathfrak{F}\subseteq\mathfrak{G}_\delta$, where we use $\mathfrak{F}$ to stand for the class of all closed subsets. (To see this, let $F\in\mathfrak{F}\setminus\{\emptyset\}$ be given and write $F=\bigcap_1^\infty G_n$ where $G_n$ is the set of $x\in\mathbb{R}^N$ for which there exists a $y\in F$ with $|x-y|<\frac1n$.) Finally, note that $\Gamma\in\mathfrak{G}_\delta$ if and only if its complement $\Gamma^\complement$ is an element of $\mathfrak{F}_\sigma$, the class of subsets which can be written as the union of a countable number of closed sets.
2.1.4 Lemma. For any $\Gamma\subseteq\mathbb{R}^N$,
(2.1.5) $$|\Gamma|_e=\inf\{|G|_e:\Gamma\subseteq G\in\mathfrak{G}\}.$$
In particular, for each $\Gamma\subseteq\mathbb{R}^N$ there is a $B\in\mathfrak{G}_\delta$ such that $\Gamma\subseteq B$ and $|\Gamma|_e=|B|_e$.
PROOF: Obviously the left hand side of (2.1.5) is dominated by the right hand side. To prove the opposite inequality, assume that $|\Gamma|_e<\infty$, let $\epsilon>0$ be given, and choose $\mathcal{C}=\{I_n\}_1^\infty$ to be a cover of $\Gamma$ for which $|\Gamma|_e\ge\Sigma(\mathcal{C})-\frac{\epsilon}{2}$. Next, for each $n\ge1$, let $I_n'$ be a rectangle satisfying $I_n\subseteq(I_n')^\circ$ and $|I_n'|_e\le|I_n|_e+2^{-n-1}\epsilon$. Then $G\equiv\bigcup_1^\infty(I_n')^\circ$ is certainly open, it contains $\Gamma$, and
$$|G|_e\le\sum_1^\infty|I_n'|_e\le|\Gamma|_e+\epsilon.$$
Having proved the first assertion, the second one follows by choosing a sequence $\{G_n\}_1^\infty\subseteq\mathfrak{G}$ so that $\Gamma\subseteq G_n$ and $|G_n|_e\le|\Gamma|_e+\frac1n$ for each $n\ge1$. Clearly the set $B\equiv\bigcap_1^\infty G_n$ will then serve. $\square$
We are now ready to describe the class $\overline{\mathcal{B}}_{\mathbb{R}^N}$ (alluded to at the beginning of this section), although it will not be immediately clear why it has the properties which we want. Be that as it may, we will say that $\Gamma\subseteq\mathbb{R}^N$ is Lebesgue measurable (or, when it is clear that we are discussing Lebesgue's measure, simply measurable), and we will write $\Gamma\in\overline{\mathcal{B}}_{\mathbb{R}^N}$, if for each $\epsilon>0$ there is an open $G\supseteq\Gamma$ such that $|G\setminus\Gamma|_e<\epsilon$. In order to distinguish $|\cdot|_e$ from its restriction to $\overline{\mathcal{B}}_{\mathbb{R}^N}$, we will use $|\Gamma|$ instead of $|\Gamma|_e$ when $\Gamma$ is measurable, and we will call $|\Gamma|$ Lebesgue's measure (or simply, the measure) of $\Gamma$.
2.1.6 Remark: At first sight one might be tempted to say that, in view of Lemma 2.1.4, every subset $\Gamma$ is measurable. This is because one is inclined to think that $|G|_e=|G\setminus\Gamma|_e+|\Gamma|_e$ when, in fact, $|G|_e\le|G\setminus\Gamma|_e+|\Gamma|_e$ is all that we know. Therein lies the subtlety of the definition! Nonetheless, it is clear that every open $G$ is measurable. Furthermore, if $|\Gamma|_e=0$, then $\Gamma$ is measurable since we can choose, for any $\epsilon>0$, an open $G\supseteq\Gamma$ such that $|G\setminus\Gamma|_e\le|G|<\epsilon$. Finally, if $\Gamma$ is measurable, then there is a $B\in\mathfrak{G}_\delta$ such that $\Gamma\subseteq B$ and $|B\setminus\Gamma|_e=0$. Indeed, simply choose $\{G_n\}_1^\infty\subseteq\mathfrak{G}$ so that $\Gamma\subseteq G_n$ and $|G_n\setminus\Gamma|_e<\frac1n$, and take $B=\bigcap_1^\infty G_n$.
Our next result shows that many more sets are measurable.
2.1.7 Lemma. If $\{\Gamma_n\}_1^\infty$ is a sequence of measurable sets, then $\Gamma=\bigcup_1^\infty\Gamma_n$ is also measurable and, of course (cf. Lemma 2.1.2),

(2.1.8) $$\Bigl|\bigcup_1^\infty\Gamma_n\Bigr|\le\sum_1^\infty|\Gamma_n|.$$

In particular, every rectangle $I$ is measurable.

PROOF: For each $n\ge1$, choose $\Gamma_n\subseteq G_n\in\mathfrak{G}$ so that $|G_n\setminus\Gamma_n|_e<2^{-n}\epsilon$. Then $G\equiv\bigcup_1^\infty G_n$ is open, contains $\Gamma$, and (by Lemma 2.1.2) satisfies
$$|G\setminus\Gamma|_e\le\Bigl|\bigcup_1^\infty(G_n\setminus\Gamma_n)\Bigr|_e\le\sum_1^\infty|G_n\setminus\Gamma_n|_e<\epsilon.$$
Finally, by writing a rectangle $I=I^\circ\cup\partial I$, we see that every rectangle is measurable. $\square$
Knowing that $\overline{\mathcal{B}}_{\mathbb{R}^N}$ is closed under countable unions, our next goal is to prove that it is also closed under complementation. In doing so, we will be simultaneously coming closer to showing that (2.0.4) holds for $|\cdot|$ on $\overline{\mathcal{B}}_{\mathbb{R}^N}$. Our proof will turn on an elementary fact about the topology of $\mathbb{R}^N$. Recall that a cube $Q$ in $\mathbb{R}^N$ is a rectangle all of whose sides have the same length.
2.1.9 Lemma. If $G$ is an open set in $\mathbb{R}$, then $G$ is the union of a countable number of mutually disjoint open intervals. More generally, if $G$ is an open set in $\mathbb{R}^N$, then, for each $\delta>0$, $G$ admits a countable, non-overlapping, exact cover $\mathcal{C}$ by cubes $Q$ with $\mathrm{diam}(Q)<\delta$.
PROOF: If $G\subseteq\mathbb{R}$ is open and $x\in G$, let $I_x^\circ$ be the open connected component of $G$ containing $x$. Then $I_x^\circ$ is an open interval and, for any $x,y\in G$, either $I_x^\circ\cap I_y^\circ=\emptyset$ or $I_x^\circ=I_y^\circ$. Hence, $\mathcal{C}\equiv\{I_x^\circ:x\in G\cap\mathbb{Q}\}$ ($\mathbb{Q}$ denotes the set of rational numbers) is the required cover.

To handle the second assertion, set $Q_n=[0,2^{-n}]^N$ and $\mathcal{K}_n=\{\frac{k}{2^n}+Q_n:k\in\mathbb{Z}^N\}$. Note that if $m\le n$, $Q\in\mathcal{K}_m$, and $Q'\in\mathcal{K}_n$, then either $Q'\subseteq Q$ or $Q^\circ\cap(Q')^\circ=\emptyset$. Now let $G\subseteq\mathbb{R}^N$ and $\delta>0$ be given. Let $n_0$ be the smallest
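$n\in\mathbb{Z}$ such that $2^{-n}\sqrt{N}<\delta$, and set $\mathcal{C}_{n_0}=\{Q\in\mathcal{K}_{n_0}:Q\subseteq G\}$. Next, define $\mathcal{C}_n$ inductively for $n>n_0$ so that
$$\mathcal{C}_{n+1}=\Bigl\{Q'\in\mathcal{K}_{n+1}:Q'\subseteq G\text{ and }(Q')^\circ\cap Q^\circ=\emptyset\text{ for each }Q\in\bigcup_{m=n_0}^n\mathcal{C}_m\Bigr\}.$$
Note that if $m\le n$, $Q\in\mathcal{C}_m$, and $Q'\in\mathcal{C}_n$, then either $Q=Q'$ or $Q^\circ\cap(Q')^\circ=\emptyset$. Hence $\mathcal{C}\equiv\bigcup_{n=n_0}^\infty\mathcal{C}_n$ is non-overlapping, and certainly $\bigcup\mathcal{C}\subseteq G$. Finally, if $x\in G$, choose $n\ge n_0$ and $Q'\in\mathcal{K}_n$ so that $x\in Q'\subseteq G$. If $Q'\notin\mathcal{C}_n$, then there is an $n_0\le m<n$ and a $Q\in\mathcal{C}_m$ such that $Q^\circ\cap(Q')^\circ\ne\emptyset$. But this means that $Q'\subseteq Q$ and therefore that $x\in Q\subseteq\bigcup\mathcal{C}$. Thus $\mathcal{C}$ covers $G$. $\square$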
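
The inductive selection in the proof of Lemma 2.1.9 is effectively an algorithm, and it can be run in dimension $N=1$ for an open interval $G=(a,b)$. The following is a sketch under assumptions made here: the function name `dyadic_cover`, the choice to pass the starting level `n0` directly rather than derive it from $\delta$, and the truncation at a finite level `n_max` are all illustrative. At each level it keeps the closed dyadic intervals that lie inside $G$ and whose interiors miss everything selected earlier.

```python
import math
from fractions import Fraction


def dyadic_cover(a, b, n0, n_max):
    """Closed dyadic intervals inside the open interval (a, b), chosen level by level from n0 to n_max."""
    chosen = []
    for n in range(n0, n_max + 1):
        step = Fraction(1, 2**n)
        k_min = math.floor(a / step) + 1  # smallest k with k * step > a
        k_max = math.ceil(b / step) - 2   # largest k with (k + 1) * step < b
        for k in range(k_min, k_max + 1):
            left, right = k * step, (k + 1) * step
            # Keep the interval only if its interior misses every interval chosen so far.
            if all(right <= l or r <= left for l, r in chosen):
                chosen.append((left, right))
    return chosen


cover = dyadic_cover(Fraction(0), Fraction(1), 0, 5)
covered_length = sum(right - left for left, right in cover)
```

For $G=(0,1)$ nothing is selected until level $2$, and after level $n$ the selected intervals fill exactly $[2^{-n},1-2^{-n}]$; letting `n_max` grow exhausts $G$, which is the content of the final step of the proof.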
2.1.10 Lemma. If $\Gamma$ is measurable, then so is its complement $\Gamma^\complement$.
PROOF: We first check that every compact set $K$ is measurable. To this end, let $\epsilon>0$ be given and choose an open set $G\supseteq K$ so that $|G|-|K|_e<\epsilon$. Set $H=G\setminus K$ and choose a non-overlapping sequence $\{Q_n\}_1^\infty$ of cubes for which $H=\bigcup_1^\infty Q_n$. By Lemma 2.1.1, $\sum_1^n|Q_m|=\bigl|\bigcup_1^nQ_m\bigr|$. Moreover, since $K$ and $\bigcup_1^nQ_m$ are disjoint compact sets, the last part of Lemma 2.1.2 says that $\bigl|\bigl(\bigcup_1^nQ_m\bigr)\cup K\bigr|_e=\bigl|\bigcup_1^nQ_m\bigr|+|K|_e$. Hence

$$|G|\ge\Bigl|\Bigl(\bigcup_1^nQ_m\Bigr)\cup K\Bigr|_e=\Bigl|\bigcup_1^nQ_m\Bigr|+|K|_e=\sum_1^n|Q_m|+|K|_e,$$

and so $\sum_1^n|Q_m|\le|G|-|K|_e<\epsilon$ for all $n\ge1$. As a consequence, we now see that $|H|_e\le\sum_1^\infty|Q_m|\le\epsilon$; and so $K$ is measurable.
We next show that every closed set $F$ is measurable. To this end, simply write $F=\bigcup_1^\infty\bigl(F\cap\overline{B(0,n)}\bigr)$, where $B(x,r)$ denotes the open Euclidean ball* $\{y\in\mathbb{R}^N:|y-x|<r\}$ of radius $r$ around the point $x$. Since each $F\cap\overline{B(0,n)}$ is compact, it follows from the preceding and Lemma 2.1.7 that $F$ is measurable.
To complete the proof, first observe that, after another application of Lemma 2.1.7, we know that (cf. Remark 2.1.3) $\mathfrak{F}_\sigma\subseteq\overline{\mathcal{B}}_{\mathbb{R}^N}$. Next (cf. Remark 2.1.6) choose $B\in\mathfrak{G}_\delta$ so that $\Gamma\subseteq B$ and $|B\setminus\Gamma|_e=0$. Then, since $B^\complement\in\mathfrak{F}_\sigma$ and $|B\setminus\Gamma|_e=0$, $\Gamma^\complement=B^\complement\cup(B\setminus\Gamma)$ is measurable. $\square$
We are now very close to our goal. However, we still need the following simple fact about double sums of non-negative numbers.
2.1.11 Lemma. If $\{a_{m,n}:m,n\in\mathbb{Z}^+\}\subseteq[0,\infty)$, then

$$\sum_{m=1}^\infty\sum_{n=1}^\infty a_{m,n}=\sum_{n=1}^\infty\sum_{m=1}^\infty a_{m,n}.$$

* In general, if $(E,\rho)$ is a metric space and $a\in E$ and $r>0$, we will use $B(a,r)$ to denote the open ball $\{x\in E:\rho(a,x)<r\}$.
PROOF: For each $M,N\in\mathbb{Z}^+$,
$$\sum_{m=1}^\infty\sum_{n=1}^\infty a_{m,n}\ge\sum_{m=1}^M\sum_{n=1}^N a_{m,n}=\sum_{n=1}^N\sum_{m=1}^M a_{m,n}.$$
Hence, by letting first $M\nearrow\infty$ and then $N\nearrow\infty$, we conclude that
$$\sum_{m=1}^\infty\sum_{n=1}^\infty a_{m,n}\ge\sum_{n=1}^\infty\sum_{m=1}^\infty a_{m,n}.$$
The opposite inequality is checked by reversing the roles of $m$ and $n$ in the preceding. $\square$
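2.1.12 Theorem. The class $\overline{\mathcal{B}}_{\mathbb{R}^N}$ contains $\mathfrak{G}$, is closed under countable unions, complementation, and therefore also under differences and countable intersections. Hence, $\mathfrak{G}_\delta\cup\mathfrak{F}_\sigma\subseteq\overline{\mathcal{B}}_{\mathbb{R}^N}$; in fact, $\Gamma\in\overline{\mathcal{B}}_{\mathbb{R}^N}$ if and only if there exist $A\in\mathfrak{F}_\sigma$ and $B\in\mathfrak{G}_\delta$ such that $A\subseteq\Gamma\subseteq B$ and $|B\setminus A|=0$, in which case $|\Gamma|=|A|=|B|$. Finally, for any $\{\Gamma_n\}_1^\infty\subseteq\overline{\mathcal{B}}_{\mathbb{R}^N}$,

(2.1.13) $$\Bigl|\bigcup_1^\infty\Gamma_n\Bigr|=\sum_1^\infty|\Gamma_n|\quad\text{if }\Gamma_m\cap\Gamma_n=\emptyset\text{ for }m\ne n.$$

PROOF: The first assertion follows immediately from what we already know together with the trivial manipulation of set theoretic operations; and clearly the fact that $\mathfrak{G}_\delta\cup\mathfrak{F}_\sigma\subseteq\overline{\mathcal{B}}_{\mathbb{R}^N}$ is a consequence of the first assertion. Next suppose that $\Gamma$ is measurable. By the final part of Remark 2.1.6 applied to $\Gamma$ and $\Gamma^\complement$, we can find $A\in\mathfrak{F}_\sigma$ and $B\in\mathfrak{G}_\delta$ such that $\Gamma^\complement\subseteq A^\complement$, $\Gamma\subseteq B$, $|B\setminus\Gamma|=0$, and $|\Gamma\setminus A|=|A^\complement\setminus\Gamma^\complement|=0$, from which $|B\setminus A|=0$ is immediate. On the other hand, if there exist $A\in\mathfrak{F}_\sigma$ and $B\in\mathfrak{G}_\delta$ such that $A\subseteq\Gamma\subseteq B$ and $|B\setminus A|=0$, then $\Gamma=A\cup(\Gamma\setminus A)$ is measurable because $|\Gamma\setminus A|_e\le|B\setminus A|=0$. Hence, it remains only to check (2.1.13).
We first prove (2.1.13) under the additional assumption that each of the $\Gamma_n$'s is bounded. Given $\epsilon>0$, choose open sets $G_n$ so that $\Gamma_n^\complement\subseteq G_n$ and $|G_n\setminus\Gamma_n^\complement|<2^{-n}\epsilon$. Then $K_n\equiv G_n^\complement\subseteq\Gamma_n$ is compact and $|\Gamma_n\setminus K_n|<2^{-n}\epsilon$. Since $K_m\cap K_n=\emptyset$ and therefore $\mathrm{dist}(K_m,K_n)>0$ for $m\ne n$, we have, by the last part of Lemma 2.1.2, that $\bigl|\bigcup_1^nK_m\bigr|=\sum_1^n|K_m|$ for every $n\ge1$. Hence
$$\sum_{m=1}^\infty|\Gamma_m|\le\sum_{m=1}^\infty|K_m|+\epsilon=\lim_{n\to\infty}\Bigl|\bigcup_{m=1}^nK_m\Bigr|+\epsilon\le\Bigl|\bigcup_{m=1}^\infty\Gamma_m\Bigr|+\epsilon.$$
That is, $\sum_1^\infty|\Gamma_m|\le\bigl|\bigcup_1^\infty\Gamma_m\bigr|$. Since the opposite inequality always holds, (2.1.13) is now proved for bounded $\Gamma_n$'s.
Finally, to handle the general case, set
$$A_1=B(0,1)\quad\text{and}\quad A_{n+1}=B(0,n+1)\setminus B(0,n).$$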
Then

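Hence, by the preceding and Lemma 2.1.11,

$$\sum_{m=1}^\infty|\Gamma_m|=\sum_{m=1}^\infty\sum_{n=1}^\infty|\Gamma_m\cap A_n|=\sum_{n=1}^\infty\sum_{m=1}^\infty|\Gamma_m\cap A_n|=\sum_{n=1}^\infty\Bigl|\Bigl(\bigcup_{m=1}^\infty\Gamma_m\Bigr)\cap A_n\Bigr|=\Bigl|\bigcup_{m=1}^\infty\Gamma_m\Bigr|.\qquad\square$$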

2.1.14 Remark: Although it seems hardly necessary to point out, exterior Lebesgue measure has an obvious but extremely important property: it is invariant under translation. That is, $|x+\Gamma|_e=|\Gamma|_e$ for all $x\in\mathbb{R}^N$ and all $\Gamma\subseteq\mathbb{R}^N$. As a consequence, we also see that $x+\Gamma$ is measurable whenever $x\in\mathbb{R}^N$ and $\Gamma$ itself is measurable.
Before concluding this preliminary discussion of Lebesgue's measure, it may be appropriate to examine whether there are any non-measurable sets. It turns out that the existence of non-measurable sets brings up some extremely delicate points about the foundations of mathematics. Indeed, if one is willing to abandon the full axiom of choice, then R. Solovay has shown that there is a model of mathematics in which every subset of $\mathbb{R}^N$ is Lebesgue measurable. However, if one accepts the full axiom of choice, then the following argument, due to Vitali, shows that there are sets which are not Lebesgue measurable. The use of the axiom of choice comes in Lemma 2.1.16 below; it is not used in the proof of the next lemma, a result which is interesting in its own right.

2.1.15 Lemma. If $\Gamma$ is a measurable subset of $\mathbb{R}$ and $|\Gamma|>0$, then the set $\Gamma-\Gamma\equiv\{y-x:x,y\in\Gamma\}$ contains the open interval $(-\delta,\delta)$ for some $\delta>0$.
PROOF: Without loss in generality, we assume that $|\Gamma|<\infty$. Choose an open set $G\supseteq\Gamma$ so that $|G\setminus\Gamma|<\frac13|\Gamma|$, and let $\mathcal{C}$ be a countable collection of mutually disjoint, non-empty, open intervals $I^\circ$ whose union is $G$ (cf. the first part of Lemma 2.1.9). Then
$$\sum_{I^\circ\in\mathcal{C}}|I^\circ\cap\Gamma|=|\Gamma|\ge\frac34|G|=\frac34\sum_{I^\circ\in\mathcal{C}}|I^\circ|.$$
Hence, there must be an $I^\circ\in\mathcal{C}$ such that $|I^\circ\cap\Gamma|\ge\frac34|I^\circ|$. Set $A=I^\circ\cap\Gamma$. If $d\in\mathbb{R}$ and $(d+A)\cap A=\emptyset$, then
$$2|A|=|d+A|+|A|=|(d+A)\cup A|\le|(d+I^\circ)\cup I^\circ|.$$
At the same time, $(d+I^\circ)\cup I^\circ\subseteq(I^-,d+I^+)$ if $d\ge0$ and $(d+I^\circ)\cup I^\circ\subseteq(d+I^-,I^+)$ if $d<0$; and so, in either case, $|(d+I^\circ)\cup I^\circ|\le|d|+|I^\circ|$. Hence, if $(d+A)\cap A=\emptyset$, then $\frac32|I^\circ|\le2|A|\le|d|+|I^\circ|$, from which we deduce that $|d|\ge\frac12|I^\circ|$. In other words, if $|d|<\frac12|I^\circ|$, then $(d+A)\cap A\ne\emptyset$. But this means that for every $d\in\bigl(-\frac12|I^\circ|,\frac12|I^\circ|\bigr)$ there exist $x,y\in A\subseteq\Gamma$ such that $d=y-x$. $\square$
2.1.16 Lemma. Assuming the axiom of choice, there is a subset $A$ of $\mathbb{R}$ such that $(A-A)\cap\mathbb{Q}=\{0\}$ and yet $\mathbb{R}=\bigcup_{q\in\mathbb{Q}}(q+A)$.
PROOF: Write $x\sim y$ if $y-x\in\mathbb{Q}$. Then "$\sim$" is an equivalence relation on $\mathbb{R}$, and, for each $x\in\mathbb{R}$, the equivalence class $[x]_\sim$ of $x$ is $x+\mathbb{Q}$. Now, using the axiom of choice, let $A$ be a set which contains precisely one element from each of the equivalence classes $[x]_\sim$, $x\in\mathbb{R}$. It is then clear that $A$ has the required properties. $\square$

2.1.17 Theorem. Assuming the axiom of choice, every $\Gamma\subseteq\mathbb{R}$ with $|\Gamma|_e>0$ contains a non-measurable subset.

PROOF: Let $A$ be the set constructed in Lemma 2.1.16. Then $0<|\Gamma|_e\le\sum_{q\in\mathbb{Q}}|\Gamma\cap(q+A)|_e$, and so there must exist a $q\in\mathbb{Q}$ such that $|\Gamma\cap(q+A)|_e>0$. Hence, if $\Gamma\cap(q+A)$ were measurable, then, by Lemma 2.1.15, we would have that $(-\delta,\delta)\subseteq\{y-x:x,y\in(q+A)\}\subseteq\{0\}\cup\mathbb{Q}^\complement$ for some $\delta>0$. $\square$

Exercises

2.1.18 Exercise: Let $\Gamma_1$ and $\Gamma_2$ be measurable subsets in $\mathbb{R}^N$. If $\Gamma_1\subseteq\Gamma_2$ and $|\Gamma_1|<\infty$, show that $|\Gamma_2\setminus\Gamma_1|=|\Gamma_2|-|\Gamma_1|$. More generally, show that if $|\Gamma_1\cap\Gamma_2|<\infty$, then $|\Gamma_1\cup\Gamma_2|=|\Gamma_1|+|\Gamma_2|-|\Gamma_1\cap\Gamma_2|$.
2.1.19 Exercise: Let $\{\Gamma_n\}_1^\infty$ be a sequence of measurable sets in $\mathbb{R}^N$. Assuming that $|\Gamma_m\cap\Gamma_n|=0$ for $m\ne n$, show that $\bigl|\bigcup_1^\infty\Gamma_n\bigr|=\sum_1^\infty|\Gamma_n|$.
2.1.20 Exercise: It is clear that any countable set has Lebesgue measure zero. However, it is not so immediately clear that there are uncountable subsets of $\mathbb{R}$ whose Lebesgue measure is zero. We will show here how to construct such a set. Namely, start with the set $C_0=[0,1]$ and let $C_1$ be the set obtained by removing the open middle third of $C_0$ (i.e., $C_1=C_0\setminus(\frac13,\frac23)=[0,\frac13]\cup[\frac23,1]$). Next, let $C_2$ be the set obtained from $C_1$ after removing the open middle third of each of the (two) intervals of which $C_1$ is the disjoint union. More generally, given $C_k$ (which is the union of $2^k$ disjoint, closed intervals), let $C_{k+1}$ be the set which one gets from $C_k$ by removing the open middle third of each of the intervals of which $C_k$ is the disjoint union. Finally, set $C\equiv\bigcap_0^\infty C_k$. The set $C$ is called the Cantor set, and it turns out to be an extremely useful source of examples. Here we will show that it is an example of an uncountable set of Lebesgue measure zero.

(i) Note that $C$ is closed and that $|C|\le|C_k|=\left(\frac23\right)^k$, $k\ge0$. Conclude that $C\in\overline{\mathcal{B}}_{\mathbb{R}}$ and that $|C|=0$.
(ii) Let $A$ denote the set of $\alpha\in\{0,1,2\}^{\mathbb{N}}$ with the properties that:
(a) $\alpha_0\in\{0,1\}$ and $\alpha_0=1$ only if $\alpha_k=0$ for all $k\in\mathbb{Z}^+$;
(b) $\alpha_k\in\{0,1\}$ for infinitely many $k\in\mathbb{Z}^+$.
Check that the map
$$\alpha\in A\longmapsto\sum_{k\in\mathbb{N}}\frac{\alpha_k}{3^k}\in[0,1]$$
is one-to-one and onto; and let $x\in[0,1]\mapsto\alpha(x)\in A$ denote the inverse mapping. Next, define $A_0$ to be the set of $\alpha\in A$ such that $\alpha_k=0$ for all but a finite number of $k\in\mathbb{N}$; and, for $\alpha\in A_0$, define $\ell(\alpha)=\max\{k\in\mathbb{N}:\alpha_k\ne0\}$. Show that
$$\partial C_\ell=\bigl\{x:\alpha(x)\in A_0,\ \ell(\alpha(x))\le\ell,\text{ and }\alpha_k(x)\in\{0,2\}\text{ for }0\le k<\ell(\alpha(x))\bigr\};$$
and conclude that
$$C_\ell^\circ=\bigl\{x:\alpha_k(x)\in\{0,2\}\text{ for every }0\le k\le\ell\text{ and }\alpha_k(x)\ne0\text{ for some }k>\ell\bigr\}.$$
Finally, define
$$\tilde{A}=\bigl\{\alpha\in A\setminus A_0:\alpha_k\in\{0,2\}\text{ for all }k\in\mathbb{N}\bigr\},$$
and show that
$$\bigcap_{\ell=0}^\infty C_\ell^\circ=\{x:\alpha(x)\in\tilde{A}\}$$
while
$$C\setminus\bigcap_{\ell=0}^\infty C_\ell^\circ=\bigl\{x:\alpha(x)\in A_0\text{ and }\alpha_k(x)\in\{0,2\}\text{ for every }0\le k<\ell(\alpha(x))\bigr\}.$$
(iii) To see that $C$ is not countable, suppose that it were. Using (ii) and the countability of $A_0$, show that one would then have a way of counting $\{0,2\}^{\mathbb{Z}^+}$. Finally, recall Cantor's famous anti-diagonalization procedure for showing that $\{0,2\}^{\mathbb{Z}^+}$ cannot be counted.
2.2 Euclidean Invariance.
Although the property of translation invariance was built into our construction
of Lebesgue's measure, it is not immediately obvious how Lebesgue's measure
reacts to rotations of $\mathbb{R}^N$. One suspects that, as the natural measure on $\mathbb{R}^N$,
Lebesgue's measure should be invariant under the full group of Euclidean
transformations (i.e., rotations as well as translations). However, because our
definition of the measure was based on rectangles and the rectangles were
inextricably tied to a fixed set of coordinate axes, rotation invariance is not as clear
as translation invariance. In the present section we will see how Lebesgue's
measure transforms under an arbitrary linear transformation of $\mathbb{R}^N$, and
rotation invariance will follow as an immediate corollary.
We begin with a result about the behavior of measurable sets under general
transformations.
2.2.1 Lemma. Let $F \subseteq \mathbb{R}^M$ be closed and $\Phi : F \to \mathbb{R}^N$ continuous. Then
$\Phi(\Gamma \cap F) \in \mathfrak{F}_\sigma$ whenever $\Gamma \in \mathfrak{F}_\sigma$. Furthermore, if in addition, $|\Phi(\Gamma \cap F)|_e = 0$
whenever $|\Gamma|_e = 0$, then $\Phi(\Gamma \cap F)$ is measurable whenever $\Gamma$ is. In particular,
if $M \le N$ and $\Phi$ is Lipschitz continuous with Lipschitz constant $L$ (i.e.,
$|\Phi(y) - \Phi(x)| \le L|y - x|$ for all $x, y \in F$), then $|\Phi(\Gamma \cap F)|_e \le (2\sqrt{N}L)^N |\Gamma|_e$
and therefore $\Phi$ takes measurable subsets of $F$ into measurable sets in $\mathbb{R}^N$.
PROOF: Remember that functions preserve unions. Hence, the class of sets $\Gamma$
for which $\Phi(\Gamma \cap F) \in \mathfrak{F}_\sigma$ is closed under countable unions. Next note that if $K$
is compact, then, by continuity, so is $\Phi(K \cap F)$. But every closed set in $\mathbb{R}^N$ is
the countable union of compact sets, and therefore we see that $\Phi(\Gamma \cap F) \in \mathfrak{F}_\sigma$
for every closed $\Gamma$. Finally, since every $\Gamma \in \mathfrak{F}_\sigma$ is a countable union of closed
sets, the first assertion is proved.
Next assume, in addition, that $|\Phi(\Gamma \cap F)|_e = 0$ whenever $|\Gamma|_e = 0$. Given
a measurable $\Gamma$, choose $A \in \mathfrak{F}_\sigma$ so that $A \subseteq \Gamma$ and $|\Gamma \setminus A| = 0$. Then
$\Phi(\Gamma \cap F) = \Phi(A \cap F) \cup \Phi((\Gamma \setminus A) \cap F)$ is measurable because $\Phi(A \cap F) \in \mathfrak{F}_\sigma$
and $|\Phi((\Gamma \setminus A) \cap F)|_e = 0$.
We now show that if $\Phi$ is Lipschitz continuous with Lipschitz constant $L$,
then $|\Phi(\Gamma \cap F)|_e \le (2\sqrt{N}L)^N |\Gamma|_e$. But clearly it suffices to do this when $\Gamma$ is a
cube $Q$ with diameter less than 1. Indeed, if we knew it in this case and were
given an arbitrary subset $\Gamma$ of $\mathbb{R}^M$ with $|\Gamma|_e < \infty$, then, by (2.1.5) and Lemma
2.1.9, we could find, for any $\epsilon > 0$, a countable collection $\mathcal{C}$ of non-overlapping
cubes $Q$ with diameter less than 1 such that $\Gamma \subseteq \bigcup \mathcal{C}$ and $\sum_{Q \in \mathcal{C}} |Q| < |\Gamma|_e + \epsilon$.
Hence, we could conclude that
$$|\Phi(\Gamma \cap F)|_e \le \sum_{Q \in \mathcal{C}} |\Phi(Q \cap F)|_e \le (2\sqrt{N}L)^N \sum_{Q \in \mathcal{C}} |Q| < (2\sqrt{N}L)^N \bigl(|\Gamma|_e + \epsilon\bigr).$$
Thus, let $Q$ be a cube in $\mathbb{R}^M$ with diameter $D < 1$. If $Q \cap F = \emptyset$, there is
nothing to do. If $x \in Q \cap F$, note that $\Phi(Q \cap F)$ must be a subset of the ball
in $\mathbb{R}^N$ of radius $LD$ around $\Phi(x)$. Hence, $\Phi(Q \cap F)$ is contained in the cube
$$\prod_{k=1}^{N} \bigl[\Phi(x)_k - LD,\ \Phi(x)_k + LD\bigr],$$
and therefore
$$|\Phi(Q \cap F)|_e \le (2LD)^N \le (2\sqrt{N}L)^N |Q|,$$
where the last inequality uses $D < 1$, $M \le N$, and $|Q| = (D/\sqrt{M})^M$. □
Given an $N \times N$ matrix $A$ of real numbers $a_{ij}$, we will use $T_A$ to denote the
linear transformation of $\mathbb{R}^N$ which $A$ determines relative to the standard basis
$\{e_1, \dots, e_N\}$. That is,
$$T_A x = \sum_{i=1}^{N} \Bigl(\sum_{j=1}^{N} a_{ij} x_j\Bigr) e_i \quad \text{for } x = \sum_{j=1}^{N} x_j e_j \in \mathbb{R}^N.$$
Since $T_A$ is obviously Lipschitz continuous, $T_A$ takes measurable sets into
measurable sets and sets of measure 0 into sets of measure 0. The main result of
this section is the following important fact about Lebesgue's measure.
2.2.2 Theorem. Given a real $N \times N$ matrix $A$, $T_A$ takes measurable sets into
measurable sets and $|T_A(\Gamma)|_e = |\det(A)|\,|\Gamma|_e$ for all $\Gamma \subseteq \mathbb{R}^N$. (We use $\det(A)$
to denote the determinant of $A$.)
PROOF: There are several steps.
Step 1: For any $c \in \mathbb{R}^N$, $\lambda \in \mathbb{R}$, and $\Gamma \subseteq \mathbb{R}^N$, $|c + \lambda\Gamma|_e = |\lambda|^N |\Gamma|_e$, where
$\lambda\Gamma \equiv \{\lambda x : x \in \Gamma\}$.
By translation invariance, we may and will assume that $c = 0$. Moreover, there
is nothing to prove when $\lambda = 0$. Finally, it is clear that $|\lambda I| = |\lambda|^N |I|$ for any
$\lambda \ne 0$ and rectangle $I$. Hence, since $\mathcal{C}$ is a countable cover of $\Gamma$ by rectangles
$I$ if and only if $\{\lambda I : I \in \mathcal{C}\}$ is a countable cover of $\lambda\Gamma$ by rectangles, we are
done.
Step 2: For any linear transformation $T$ and all cubes $Q$, $|T(Q)| = \alpha(T)|Q|$
where $\alpha(T) \equiv |T(Q_0)|$ and $Q_0 = [0,1]^N$.
Since every cube $Q = c + \lambda Q_0$ for some $c \in \mathbb{R}^N$ and $\lambda$ satisfying $|\lambda|^N = |Q|$,
Step 1 plus the linearity of $T$ yields
$|T(Q)| = |T(c) + \lambda T(Q_0)| = |\lambda|^N |T(Q_0)| = \alpha(T)|Q|$.
Step 3: For any linear transformation $T$ and open $G$,
$$|T(G)| \le \alpha(T)|G|.$$
Moreover, equality holds if $T$ is non-singular.
Let (cf. Lemma 2.1.9) $\mathcal{C}$ be a countable, exact cover of $G$ by non-overlapping
cubes $Q$. Then
$$|T(G)| \le \sum_{Q \in \mathcal{C}} |T(Q)| = \alpha(T) \sum_{Q \in \mathcal{C}} |Q| = \alpha(T)|G|.$$
Now suppose that $T$ is non-singular. Then
$$T(Q) \setminus T(\mathring{Q}) = T(Q \setminus \mathring{Q})$$
has measure 0, and
$$T(\mathring{Q}) \cap T(\mathring{Q}') = \emptyset \quad \text{for distinct } Q, Q' \in \mathcal{C}.$$
Hence $|T(Q)| = |T(\mathring{Q})|$ and (cf. Exercise 2.1.19)
$$|T(G)| \ge \sum_{Q \in \mathcal{C}} |T(\mathring{Q})| = \sum_{Q \in \mathcal{C}} |T(Q)| = \alpha(T) \sum_{Q \in \mathcal{C}} |Q| = \alpha(T)|G|.$$

Step 4: For any non-singular linear transformation $T$ and all $\Gamma \subseteq \mathbb{R}^N$,
$|T(\Gamma)|_e = \alpha(T)|\Gamma|_e$.
Since $\Gamma \subseteq G \in \mathfrak{G}$ if and only if $T(\Gamma) \subseteq T(G) \in \mathfrak{G}$, this step is an immediate
consequence of (2.1.5) and Step 3.
Step 5: If $S$ and $T$ are non-singular linear transformations, then
$\alpha(S \circ T) = \alpha(S)\alpha(T)$.
Simply note that, by Step 4,
$$\alpha(S \circ T) = |S \circ T(Q_0)| = |S(T(Q_0))| = \alpha(S)|T(Q_0)| = \alpha(S)\alpha(T).$$
Step 6: If $A$ is an orthogonal matrix, then $\alpha(T_A) = 1$.
Because $A$ is orthogonal, $B(0,1) = T_A(B(0,1))$ and therefore, by Step 4,
$|B(0,1)| = \alpha(T_A)|B(0,1)|$. Since $0 < |B(0,1)| < \infty$, it follows that $\alpha(T_A) = 1$.
Step 7: If $A$ is non-singular and symmetric, then $\alpha(T_A) = |\det(A)|$.
If $A$ is already diagonal, then it is clear that $\alpha(T_A) = |T_A(Q_0)| = |\lambda_1 \cdots \lambda_N|$,
where $\lambda_k$ is the $k$th diagonal entry. Hence, the assertion is obvious in this case.
On the other hand, in the general case, we can find an orthogonal matrix $O$
such that $A = O \Lambda O^{\mathsf T}$, where $\Lambda$ is a diagonal matrix whose diagonal entries
are the eigenvalues of $A$ and $O^{\mathsf T}$ is the transpose of $O$. Hence, by Steps 5 and
6, $\alpha(T_A) = \alpha(T_O)\,\alpha(T_\Lambda)\,\alpha(T_{O^{\mathsf T}}) = \alpha(T_\Lambda) = |\det(A)|$.
Step 8: For every non-singular matrix $A$, $\alpha(T_A) = |\det(A)|$.
Set $B = (AA^{\mathsf T})^{1/2}$. Then $B$ is symmetric and $\det(B) = |\det(A)|$. Next set
$O = B^{-1}A$ and note that $O^{\mathsf T} = A^{\mathsf T} B^{-1}$ and so
$OO^{\mathsf T} = B^{-1} A A^{\mathsf T} B^{-1} = B^{-1} B^2 B^{-1} = I_{\mathbb{R}^N}$,
where $I_{\mathbb{R}^N}$ denotes the identity matrix. In other words, $O$
is orthogonal. Since $A = BO$, we now have that $\alpha(T_A) = \alpha(T_B) = |\det(A)|$.
Step 9: If $A$ is singular, then $\alpha(T_A) = 0$.
Choose $y \in \mathbb{R}^N$ with unit length so that $y \perp \mathrm{Range}(T_A)$. Next, choose an
orthogonal $O$ so that $e_1 = T_O(y)$. Then $e_1 \perp \mathrm{Range}(T_O \circ T_A)$ and so there is a
rectangle $\hat{I}$ in $\mathbb{R}^{N-1}$ such that $T_O \circ T_A(Q_0) \subseteq \{0\} \times \hat{I}$. But $\{0\} \times \hat{I}$ has measure
0, and therefore $\alpha(T_A) = \alpha(T_O \circ T_A) = 0$. □
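As an informal sanity check of Theorem 2.2.2 in the plane, one can compare the area of the parallelogram $T_A(Q_0)$ with $|\det(A)|$. The following is a minimal sketch, assuming $N = 2$ and using the elementary shoelace formula for polygon area in place of Lebesgue measure; the helper names `apply`, `det2`, `polygon_area`, and `image_area` are hypothetical.

```python
def apply(A, x):
    """Apply the 2x2 matrix A (a list of rows) to the point x."""
    return (A[0][0] * x[0] + A[0][1] * x[1],
            A[1][0] * x[0] + A[1][1] * x[1])

def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def polygon_area(vertices):
    """Shoelace formula for a polygon whose vertices are listed in order."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2

# Corners of Q_0 = [0, 1]^2, listed counterclockwise.
UNIT_SQUARE = [(0, 0), (1, 0), (1, 1), (0, 1)]

def image_area(A):
    """Area of T_A(Q_0), the parallelogram spanned by the columns of A."""
    return polygon_area([apply(A, v) for v in UNIT_SQUARE])

for A in ([[2, 1], [0, 3]], [[0, 1], [1, 0]], [[1, 2], [2, 4]]):
    print(A, image_area(A), abs(det2(A)))
```

The third matrix is singular, and its image degenerates to a segment of area 0, in line with Step 9.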


Exercises

2.2.3 Exercise: Here are two rather easy applications of Theorem 2.2.2.
(i) If $H$ is a hyperplane in $\mathbb{R}^N$ (i.e., $H = \{y \in \mathbb{R}^N : y - c \perp \ell\}$ for some
$c \in \mathbb{R}^N$ and $\ell \in \mathbb{R}^N \setminus \{0\}$), show that $|H| = 0$.
(ii) If $B(c,r)$ is the open ball in $\mathbb{R}^N$ of radius $r$ and center $c$, show that
$|B(c,r)| = |\overline{B(c,r)}| = \Omega_N r^N$ where $\Omega_N \equiv |B(0,1)|$.
2.2.4 Exercise: If $v_1, \dots, v_N$ are vectors in $\mathbb{R}^N$, the parallelepiped
spanned by $\{v_1, \dots, v_N\}$ is the set
$$P(v_1, \dots, v_N) \equiv \Bigl\{\sum_{i=1}^{N} x_i v_i : x_i \in [0,1] \text{ for all } 1 \le i \le N\Bigr\}.$$
When $N \ge 2$, the classical prescription for computing the volume of a
parallelepiped is to take the product of the area of any one side times the length of
the corresponding altitude. In analytic terms, this means that the volume is 0
if the vectors $v_1, \dots, v_N$ are linearly dependent and that otherwise the volume
of $P(v_1, \dots, v_N)$ can be computed by taking the product of the volume of
$P(v_1, \dots, v_{N-1})$, thought of as a subset of the hyperplane $H(v_1, \dots, v_{N-1})$
spanned by $v_1, \dots, v_{N-1}$, times the distance between the vector $v_N$ and the
hyperplane $H(v_1, \dots, v_{N-1})$. Using Theorem 2.2.2, show that this
prescription is correct when the volume of a set is interpreted as the Lebesgue measure
of that set.
Chapter III
Lebesgue Integration

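3.1 Measure Spaces.
In Chapter II we constructed Lebesgue's measure on $\mathbb{R}^N$. The result of
our efforts was a proof that there is a class $\overline{\mathcal{B}}_{\mathbb{R}^N}$ of subsets of $\mathbb{R}^N$ and a map
$\Gamma \in \overline{\mathcal{B}}_{\mathbb{R}^N} \longmapsto |\Gamma| \in [0,\infty]$ such that: $\overline{\mathcal{B}}_{\mathbb{R}^N}$ contains all open sets; $\overline{\mathcal{B}}_{\mathbb{R}^N}$ is
closed under both complementation and countable unions; $|I| = \mathrm{vol}(I)$ for
all rectangles $I$; and $\bigl|\bigcup_1^\infty \Gamma_n\bigr| = \sum_1^\infty |\Gamma_n|$ whenever $\{\Gamma_n\}_1^\infty$ is a sequence of
mutually disjoint elements of $\overline{\mathcal{B}}_{\mathbb{R}^N}$. What we are going to do in this section is
discuss a few of the general properties which are possessed by such structures.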
Given a set $E$, we will use $\mathcal{P}(E)$ to denote the power set of $E$; that is,
$\mathcal{P}(E) \equiv \{\Gamma : \Gamma \subseteq E\}$. An algebra over $E$ is an $\mathcal{A} \subseteq \mathcal{P}(E)$ with the properties
that
(a) $\emptyset \in \mathcal{A}$,
(b) $\Gamma \in \mathcal{A} \implies \Gamma^c \in \mathcal{A}$,
(c) $\Gamma_1, \Gamma_2 \in \mathcal{A} \implies \Gamma_1 \cup \Gamma_2 \in \mathcal{A}$.

By elementary set-theoretic manipulations, one sees that an algebra is also
closed under differences as well as finite unions and intersections. A σ-algebra
over $E$ is an algebra $\mathcal{B}$ which is closed under countable unions. Of course,
σ-algebras are also closed under differences as well as countable intersections.
3.1.1 Examples: Here are two, somewhat trivial, examples.
(i) For any $E$, $\{\emptyset, E\}$ is the smallest algebra over $E$ in the sense that every
algebra contains this one.
(ii) For any $E$, $\mathcal{P}(E)$ is the largest algebra over $E$ in the sense that it contains
every other one.
In fact, both $\{\emptyset, E\}$ and $\mathcal{P}(E)$ are σ-algebras over $E$. Of course, most of the
interesting algebras and σ-algebras lie somewhere in between these two extreme
examples. To wit, the σ-algebra $\overline{\mathcal{B}}_{\mathbb{R}^N}$ over $\mathbb{R}^N$.
3.1.2 Lemma. The intersection of any collection of algebras or σ-algebras is
again an algebra or a σ-algebra. In particular, given any non-empty $\mathcal{C} \subseteq \mathcal{P}(E)$,
there is a unique minimal algebra $\mathcal{A}(E; \mathcal{C})$ and a unique minimal σ-algebra
$\sigma(E; \mathcal{C})$ over $E$ containing $\mathcal{C}$.
PROOF: The first assertion is easily checked. Given the first assertion, the
second one is handled by considering the collection of all algebras or all
σ-algebras over $E$ containing the given $\mathcal{C}$. Noting that neither of these collections
can be empty ($\mathcal{P}(E)$ being an element of both), one sees that $\mathcal{A}(E; \mathcal{C})$ and
$\sigma(E; \mathcal{C})$ can be constructed by taking intersections. □
The σ-algebra $\sigma(E; \mathcal{C})$ is called the σ-algebra generated by $\mathcal{C}$. Perhaps
the most important examples of σ-algebras which are described in terms of
a generating set are those that arise in connection with topological spaces.
Namely, if $E$ is a topological space and $\mathfrak{G}$ denotes the class of all open sets
in $E$, then $\mathcal{B}_E \equiv \sigma(E; \mathfrak{G})$ is called the Borel σ-algebra or Borel field over
$E$, and the elements of $\mathcal{B}_E$ are called the Borel measurable subsets of $E$.
(For those who are struck by the similarity between $\mathcal{B}_{\mathbb{R}^N}$ and $\overline{\mathcal{B}}_{\mathbb{R}^N}$, a complete
explanation will be forthcoming shortly, in (ii) of Examples 3.1.5 below. In the
meantime, suffice it to say that, by Theorem 2.1.12, $\Gamma \in \overline{\mathcal{B}}_{\mathbb{R}^N}$ if and only if
there exist $A, B \in \mathcal{B}_{\mathbb{R}^N}$ such that $A \subseteq \Gamma \subseteq B$ and $|B \setminus A| = 0$.)
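When $E$ is finite, the σ-algebra generated by a collection can be computed directly by closing under complements and unions, which gives the same result as the intersection argument above. The following sketch assumes a finite $E$ (so that countable unions reduce to finite ones and the generated algebra and σ-algebra coincide); the function name `generated_sigma_algebra` is hypothetical.

```python
from itertools import combinations

def generated_sigma_algebra(E, C):
    """Smallest collection containing C that is closed under complements and unions.

    E is assumed finite, so closure under finite unions is all that is needed.
    """
    E = frozenset(E)
    sets = {frozenset(), E} | {frozenset(c) for c in C}
    changed = True
    while changed:
        changed = False
        current = list(sets)
        for a in current:
            if E - a not in sets:          # close under complementation
                sets.add(E - a)
                changed = True
        for a, b in combinations(current, 2):
            if a | b not in sets:          # close under pairwise unions
                sets.add(a | b)
                changed = True
    return sets

sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {2}])
print(sorted(sorted(s) for s in sigma))
```

In this example the generated σ-algebra consists of all unions of the three "atoms" {1}, {2}, and {3, 4}, so it has 8 elements.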
Usually the class which generates a σ-algebra is not itself even an algebra.
Nonetheless, it often has the property that it is closed under finite intersections.
For example, this is the case when the generators are the open sets of some
topological space. It was also true of the collection of all rectangles in $\mathbb{R}^N$. In
the future, we will call a collection $\mathcal{C} \subseteq \mathcal{P}(E)$ a π-system if it is closed under
finite intersections. As we will see below, it is useful to know what additional
properties a π-system must possess in order to be a σ-algebra. For this reason
we introduce a notion which complements that of a π-system. Namely, we will
say that $\mathcal{H} \subseteq \mathcal{P}(E)$ is a λ-system over $E$ if
(a) $E \in \mathcal{H}$,
(b) $\Gamma_1, \Gamma_2 \in \mathcal{H}$ and $\Gamma_1 \cap \Gamma_2 = \emptyset \implies \Gamma_1 \cup \Gamma_2 \in \mathcal{H}$,
(c) $\Gamma_1, \Gamma_2 \in \mathcal{H}$ and $\Gamma_1 \subseteq \Gamma_2 \implies \Gamma_2 \setminus \Gamma_1 \in \mathcal{H}$,
(d) $\{\Gamma_n\}_1^\infty \subseteq \mathcal{H}$ and $\Gamma_n \nearrow \Gamma \implies \Gamma \in \mathcal{H}$.

The sense in which λ-systems and π-systems constitute complementary
notions is explained in the following useful lemma.*
3.1.3 Lemma. The intersection of an arbitrary collection of π-systems or of
λ-systems is again a π-system or a λ-system. Moreover, $\mathcal{B} \subseteq \mathcal{P}(E)$ is a σ-algebra
over $E$ if and only if it is both a π-system as well as being a λ-system over $E$.
Finally, if $\mathcal{C} \subseteq \mathcal{P}(E)$ is a π-system, then $\sigma(E; \mathcal{C})$ is the smallest λ-system over
$E$ containing $\mathcal{C}$.
PROOF: The first assertion requires no comment. To prove the second one,
it suffices to prove that if $\mathcal{B}$ is both a π-system and a λ-system over $E$, then
it is a σ-algebra over $E$. To this end, first note that $A^c = E \setminus A \in \mathcal{B}$ for
every $A \in \mathcal{B}$ and therefore both $\emptyset \in \mathcal{B}$ and $\mathcal{B}$ is closed under complementation.
Second, if $\Gamma_1, \Gamma_2 \in \mathcal{B}$, then $\Gamma_1 \cup \Gamma_2 = \Gamma_1 \cup (\Gamma_2 \setminus \Gamma_3)$ where $\Gamma_3 = \Gamma_1 \cap \Gamma_2$. Hence
$\mathcal{B}$ is an algebra over $E$. Finally, if $\{\Gamma_n\}_1^\infty \subseteq \mathcal{B}$, set $A_n = \bigcup_1^n \Gamma_m$ for $n \ge 1$. Then
$\{A_n\}_1^\infty \subseteq \mathcal{B}$ and $A_n \nearrow \bigcup_1^\infty \Gamma_m$. Hence $\bigcup_1^\infty \Gamma_m \in \mathcal{B}$, and so $\mathcal{B}$ is a σ-algebra.
To prove the final assertion, let $\mathcal{C}$ be a π-system and $\mathcal{H}$ the smallest λ-system
over $E$ containing $\mathcal{C}$. Clearly $\sigma(E; \mathcal{C}) \supseteq \mathcal{H}$; and so all that we have to do is
show that $\mathcal{H}$ is a π-system over $E$. To this end, first set
$$\mathcal{H}_1 = \{\Gamma \subseteq E : \Gamma \cap \Delta \in \mathcal{H} \text{ for all } \Delta \in \mathcal{C}\}.$$
It is then easy to check that $\mathcal{H}_1$ is a λ-system over $E$. Moreover, since $\mathcal{C}$ is a
π-system, $\mathcal{C} \subseteq \mathcal{H}_1$, and therefore $\mathcal{H} \subseteq \mathcal{H}_1$. In other words, $\Gamma \cap \Delta \in \mathcal{H}$ for all
$\Gamma \in \mathcal{H}$ and $\Delta \in \mathcal{C}$. Next set
$$\mathcal{H}_2 = \{\Gamma \subseteq E : \Gamma \cap \Delta \in \mathcal{H} \text{ for all } \Delta \in \mathcal{H}\}.$$
Again it is clear that $\mathcal{H}_2$ is a λ-system. Also, by the preceding, $\mathcal{C} \subseteq \mathcal{H}_2$. Hence
$\mathcal{H} \subseteq \mathcal{H}_2$, which is to say that $\mathcal{H}$ is a π-system. □

* The author learned these ideas from E. B. Dynkin's treatise on Markov processes. In fact,
the λ- and π-system scheme is often attributed to Dynkin, who certainly deserves the credit
for its exploitation by a whole generation of probabilists. On the other hand, the author has
been told that their origins go back to Minkowski, although no corroborating reference was
ever provided.
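The way π-systems and λ-systems complement each other in Lemma 3.1.3 can be illustrated on a finite set. The sketch below assumes a finite $E$ (so that condition (d) of a λ-system holds automatically, since every non-decreasing sequence is eventually constant); the function names are hypothetical. The collection of subsets of {1, 2, 3, 4} with an even number of elements is a standard example of a λ-system which is not a π-system, and hence not a σ-algebra.

```python
from itertools import combinations

def powerset(E):
    """All subsets of the finite set E, as frozensets."""
    items = list(E)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_pi_system(H):
    """Closed under pairwise (hence finite) intersections."""
    return all(a & b in H for a in H for b in H)

def is_lambda_system(E, H):
    """Check conditions (a)-(c); on a finite E, (d) holds automatically."""
    E = frozenset(E)
    if E not in H:                                  # (a)
        return False
    for a in H:
        for b in H:
            if not (a & b) and (a | b) not in H:    # (b) disjoint unions
                return False
            if a <= b and (b - a) not in H:         # (c) proper differences
                return False
    return True

E = {1, 2, 3, 4}
even_sets = {s for s in powerset(E) if len(s) % 2 == 0}
print(is_lambda_system(E, even_sets), is_pi_system(even_sets))
```

Here {1, 2} and {2, 3} both have even cardinality, but their intersection {2} does not, which is exactly the failure of the π-system property.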
Given a set $E$ and a σ-algebra $\mathcal{B}$ over $E$, we call the pair $(E, \mathcal{B})$ a measurable
space. The reason for introducing measurable spaces is that they are the
natural place on which to define measures. Namely, if $(E, \mathcal{B})$ is a measurable
space, we say that the map $\mu : \mathcal{B} \to [0, \infty]$ is a measure on $(E, \mathcal{B})$ if $\mu(\emptyset) = 0$
and $\mu$ is countably additive in the sense that, for $\{\Gamma_n\}_1^\infty \subseteq \mathcal{B}$,

(3.1.4)
$$\mu\Bigl(\bigcup_{n=1}^{\infty} \Gamma_n\Bigr) = \sum_{n=1}^{\infty} \mu(\Gamma_n) \quad \text{if } \Gamma_m \cap \Gamma_n = \emptyset \text{ for } m \ne n.$$

When $\mu(E) < \infty$, $\mu$ is said to be a finite measure, and when $\mu(E) = 1$
it is called a probability measure. Given a measurable space $(E, \mathcal{B})$ and
a measure $\mu$ on $(E, \mathcal{B})$, the triple $(E, \mathcal{B}, \mu)$ is called a measure space. The
measure space $(E, \mathcal{B}, \mu)$ is said to be a finite measure space or a probability
space according to whether $\mu$ is a finite measure or a probability measure on
$(E, \mathcal{B})$.
3.1.5 Examples: As we will show in Chapter VII, there is a general method
for producing lots of measures. However, at the moment we will have to settle
for the following examples.
(i) Our basic examples of measures are those constructed by Lebesgue. Namely,
take $E = \mathbb{R}^N$, $\mathcal{B} = \overline{\mathcal{B}}_{\mathbb{R}^N}$, and $\mu = \lambda_{\mathbb{R}^N}$, where $\lambda_{\mathbb{R}^N}$ is the measure defined
by $\lambda_{\mathbb{R}^N}(\Gamma) = |\Gamma|$ for $\Gamma \in \overline{\mathcal{B}}_{\mathbb{R}^N}$.
(ii) Given a measure space $(E, \mathcal{B}, \mu)$, one can always extend $\mu$ as a measure $\bar{\mu}$
on the σ-algebra $\overline{\mathcal{B}}^{\mu}$ of sets $\Gamma \subseteq E$ with the property that there exist $A, B \in \mathcal{B}$
such that $A \subseteq \Gamma \subseteq B$ and $\mu(B \setminus A) = 0$; indeed, one simply defines $\bar{\mu}(\Gamma) = \mu(A)$.
The σ-algebra $\overline{\mathcal{B}}^{\mu}$ is called the completion of $\mathcal{B}$ with respect to $\mu$,
and the resulting measure space $(E, \overline{\mathcal{B}}^{\mu}, \bar{\mu})$ is said to be complete. In this
connection, note that what we have been denoting by $\overline{\mathcal{B}}_{\mathbb{R}^N}$ is the completion
of the Borel algebra $\mathcal{B}_{\mathbb{R}^N}$ over $\mathbb{R}^N$ with respect to the restriction of Lebesgue's
measure $\lambda_{\mathbb{R}^N}$ to $\mathcal{B}_{\mathbb{R}^N}$. Thus we really should have been using the hideous
notation $\overline{\mathcal{B}_{\mathbb{R}^N}}^{\lambda_{\mathbb{R}^N}}$, but, for obvious reasons of aesthetics, we will continue to reserve
$\overline{\mathcal{B}}_{\mathbb{R}^N}$ for the completion of $\mathcal{B}_{\mathbb{R}^N}$ with respect to Lebesgue's measure.
(iii) An easy and useful source of examples of measure spaces are those in
which $E$ is a countable set, $\mathcal{B} = \mathcal{P}(E)$, and
$$\mu(\Gamma) = \sum_{x \in \Gamma} \mu_x, \quad \text{where } \{\mu_x : x \in E\} \subseteq [0, \infty].$$
(iv) As a final example, we point out that measure spaces give rise to other
measure spaces by means of restriction. Namely, if $(E, \mathcal{B}, \mu)$ is a measure space
and $E' \in \mathcal{B}$ is given, define $\mathcal{B}[E'] = \{\Gamma \cap E' : \Gamma \in \mathcal{B}\}$. Then $\mathcal{B}[E']$ is a σ-algebra
over $E'$ and $(E', \mathcal{B}[E'], \mu \restriction \mathcal{B}[E'])$ is a measure space. (See Section 5.3 for a
refinement of this procedure.)
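Examples (iii) and (iv) are concrete enough to be sketched in code. The following is a minimal illustration, assuming a finite set $E$ and finite rational weights (the text also allows countable $E$ and the value $\infty$); the names `discrete_measure` and `restrict` are hypothetical.

```python
from fractions import Fraction

def discrete_measure(weights):
    """Return mu with mu(G) = sum of weights[x] over x in G, as in (iii)."""
    def mu(G):
        return sum((weights[x] for x in G), Fraction(0))
    return mu

def restrict(mu, E_prime):
    """Restriction of mu to subsets of E_prime, as in (iv)."""
    E_prime = set(E_prime)
    def mu_restricted(G):
        if not set(G) <= E_prime:
            raise ValueError("set is not contained in E'")
        return mu(G)
    return mu_restricted

weights = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
mu = discrete_measure(weights)

A, B = {"a"}, {"b", "c"}
print(mu(A | B) == mu(A) + mu(B))      # additivity on disjoint sets
print(restrict(mu, {"a", "b"})({"b"}))  # measure of {b} inside E' = {a, b}
```

Because the weights here sum to 1, this particular `mu` is a probability measure in the sense defined above.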
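The following theorem gives some of the basic consequences of (3.1.4).
3.1.6 Theorem. Let $(E, \mathcal{B}, \mu)$ be a measure space. If $\Gamma_1, \Gamma_2 \in \mathcal{B}$ and $\Gamma_1 \subseteq \Gamma_2$,
then $\mu(\Gamma_1) \le \mu(\Gamma_2)$ and, when $\mu(\Gamma_1) < \infty$, $\mu(\Gamma_2 \setminus \Gamma_1) = \mu(\Gamma_2) - \mu(\Gamma_1)$. Moreover,
for $\{\Gamma_n\}_1^\infty \subseteq \mathcal{B}$:

(i) $\mu(\Gamma_n) \nearrow \mu(\Gamma)$ if $\Gamma_n \nearrow \Gamma$,

(ii) $\mu(\Gamma_n) \searrow \mu(\Gamma)$ if $\Gamma_n \searrow \Gamma$ and $\mu(\Gamma_1) < \infty$,

(iii) $\mu\bigl(\bigcup_1^\infty \Gamma_n\bigr) \le \sum_1^\infty \mu(\Gamma_n)$,

and

(iv) $\mu\bigl(\bigcup_1^\infty \Gamma_n\bigr) = \sum_1^\infty \mu(\Gamma_n)$ if $\mu(\Gamma_m \cap \Gamma_n) = 0$ for $m \ne n$.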
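PROOF: If $\Gamma_1 \subseteq \Gamma_2$, then $\mu(\Gamma_2) = \mu(\Gamma_1) + \mu(\Gamma_2 \setminus \Gamma_1)$, since $\Gamma_2 = \Gamma_1 \cup (\Gamma_2 \setminus \Gamma_1)$.
The initial assertions follow immediately from this.
To prove (i), set $\Gamma_0 = \emptyset$ and define $A_{n+1} = \Gamma_{n+1} \setminus \Gamma_n$ for $n \ge 0$. Then
$A_m \cap A_n = \emptyset$ for $m \ne n$, $\Gamma_n = \bigcup_1^n A_m$, and $\Gamma = \bigcup_1^\infty A_n$. Hence
$$\mu(\Gamma_n) = \sum_{m=1}^{n} \mu(A_m) \nearrow \sum_{m=1}^{\infty} \mu(A_m) = \mu(\Gamma).$$
The proof of (ii) is accomplished by taking $\Delta_n \equiv \Gamma_1 \setminus \Gamma_n$ and applying the
preceding to $\{\Delta_n\}$. (One needs $\mu(\Gamma_1) < \infty$ in order to subtract it from both
sides.)
To prove (iii) and (iv), again set $\Gamma_0 = \emptyset$ and take $A_{n+1} = \Gamma_{n+1} \setminus \bigcup_{m=0}^{n} \Gamma_m$ for
$n \ge 0$. Then $\Gamma_n = A_n \cup D_n$, where $D_n = \bigcup_{m=1}^{n-1} (\Gamma_n \cap \Gamma_m)$, and $\bigcup_1^\infty \Gamma_n = \bigcup_1^\infty A_n$.
Hence, since $A_m \cap A_n = \emptyset$ for $m \ne n$ and $A_n \cap D_n = \emptyset$,
$$\mu\Bigl(\bigcup_{n=1}^{\infty} \Gamma_n\Bigr) = \sum_{n=1}^{\infty} \mu(A_n) \quad \text{and} \quad \mu(\Gamma_n) = \mu(A_n) + \mu(D_n) \text{ for each } n \ge 1.$$
This proves that the inequality in (iii) always holds and that the equality in
(iv) holds when $\mu(D_n) \le \sum_{m=1}^{n-1} \mu(\Gamma_n \cap \Gamma_m) = 0$ for all $n \ge 2$. □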

Exercises

3.1.7 Exercise: The decomposition of the properties of a σ-algebra in terms
of π-systems and λ-systems is not the traditional one. Instead, most of the
early books on measure theory used algebras instead of π-systems as the usual
source of generating sets. In this case the complementary notion is that of a
monotone class: $\mathcal{M}$ is said to be a monotone class if $\Gamma \in \mathcal{M}$ whenever there
exists $\{\Gamma_n\}_1^\infty \subseteq \mathcal{M}$ such that either $\Gamma_n \nearrow \Gamma$ or $\Gamma_n \searrow \Gamma$. Show that $\mathcal{B}$ is a
σ-algebra over $E$ if and only if it is both an algebra over $E$ and a monotone
class. In addition, show that if $\mathcal{A}$ is an algebra over $E$, then $\sigma(E; \mathcal{A})$ is the
smallest monotone class containing $\mathcal{A}$.
3.1.8 Exercise: Let $(E, \mathcal{B})$ be a measurable space with $\mathcal{B} = \sigma(E; \mathcal{C})$. Suppose
that $\mu$ and $\nu$ are a pair of measures on $(E, \mathcal{B})$ such that $\mu(E) = \nu(E) < \infty$ and
$\mu(\Gamma) = \nu(\Gamma)$ for all $\Gamma \in \mathcal{C}$. Assuming that $\mathcal{C}$ is a π-system over $E$, show that
$\mu = \nu$ on $\mathcal{B}$. (Hint: Consider the collection $\mathcal{H}$ of $\Gamma \in \mathcal{B}$ for which $\mu(\Gamma) = \nu(\Gamma)$,
and apply Lemma 3.1.3.)
3.1.9 Exercise: Let $(E, \rho)$ be a metric space and suppose that $\mu$ is a finite
measure on $(E, \mathcal{B}_E)$. Show that the completion $\overline{\mathcal{B}_E}^{\mu}$ of the Borel field $\mathcal{B}_E$ with
respect to $\mu$ coincides with the class of all $\Gamma \subseteq E$ such that for each $\epsilon > 0$ there
exists a closed set $F$ and an open set $G$ with the properties that $F \subseteq \Gamma \subseteq G$
and $\mu(G \setminus F) < \epsilon$.
3.1.10 Exercise: Suppose that $(E_1, \mathcal{B}_1)$ and $(E_2, \mathcal{B}_2)$ are two measurable
spaces and that $\Phi : E_1 \to E_2$ has the property that $\Phi^{-1}(\Gamma) \in \mathcal{B}_1$ for every
element $\Gamma$ in a collection $\mathcal{C}$ which generates $\mathcal{B}_2$. Show that $\Phi^{-1}(\Gamma) \in \mathcal{B}_1$ for
every $\Gamma \in \mathcal{B}_2$. In particular, if $E_1$ and $E_2$ are topological spaces and $\mathcal{B}_1$ and $\mathcal{B}_2$
are the corresponding Borel algebras, show that $\Phi^{-1}(\Gamma) \in \mathcal{B}_1$ for every $\Gamma \in \mathcal{B}_2$
if $\Phi$ is continuous. Conclude from this that $x + \Gamma \in \mathcal{B}_{\mathbb{R}^N}$ for all $x \in \mathbb{R}^N$ and
$\Gamma \in \mathcal{B}_{\mathbb{R}^N}$.
3.1.11 Exercise: Let $\mu$ be a measure on $(\mathbb{R}^N, \mathcal{B}_{\mathbb{R}^N})$ which is translation
invariant (i.e., $\mu(x + \Gamma) = \mu(\Gamma)$). In addition, assume that $\mu([0,1]^N) = 1$. Show
that $\mu = \lambda_{\mathbb{R}^N}$ on $\mathcal{B}_{\mathbb{R}^N}$.
Hint: First check that $\mu(\partial Q) = 0$ for any cube $Q$. Second, show that
$\mu([0, m\lambda]^N) = m^N \mu([0, \lambda]^N)$ for any $m \in \mathbb{Z}^+$ and $\lambda \in \mathbb{R}$. From these,
conclude that $\mu(Q) = |Q|$ for all cubes $Q$. Finally, deduce the required result.
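3.1.12 Exercise: Given sets $\Gamma_n$ for $n \ge 1$, define
$$\varlimsup_{n \to \infty} \Gamma_n = \bigcap_{m=1}^{\infty} \bigcup_{n=m}^{\infty} \Gamma_n \quad \text{and} \quad \varliminf_{n \to \infty} \Gamma_n = \bigcup_{m=1}^{\infty} \bigcap_{n=m}^{\infty} \Gamma_n.$$
Observe that
$$\varlimsup_{n \to \infty} \Gamma_n = \{x : x \in \Gamma_n \text{ for infinitely many } n \in \mathbb{Z}^+\}$$
and that
(3.1.13)
$$\varliminf_{n \to \infty} \Gamma_n \subseteq \varlimsup_{n \to \infty} \Gamma_n,$$
with equality holding when $\{\Gamma_n\}_1^\infty$ is monotone. One says that the limit
$\lim_{n \to \infty} \Gamma_n$ exists if equality holds in (3.1.13), in which case
$\lim_{n \to \infty} \Gamma_n \equiv \varlimsup_{n \to \infty} \Gamma_n$.
Let $(E, \mathcal{B}, \mu)$ be a measure space and $\{\Gamma_n\}_1^\infty \subseteq \mathcal{B}$. Prove each of the
following.
(i)
$$\mu\Bigl(\varliminf_{n \to \infty} \Gamma_n\Bigr) \le \varliminf_{n \to \infty} \mu(\Gamma_n)$$
and
(ii)
$$\mu\Bigl(\varlimsup_{n \to \infty} \Gamma_n\Bigr) \ge \varlimsup_{n \to \infty} \mu(\Gamma_n) \quad \text{if } \mu\Bigl(\bigcup_{n=1}^{\infty} \Gamma_n\Bigr) < \infty.$$
In particular, under the condition in (ii), conclude that
(iii)
$$\lim_{n \to \infty} \mu(\Gamma_n) = \mu\Bigl(\lim_{n \to \infty} \Gamma_n\Bigr) \quad \text{if } \lim_{n \to \infty} \Gamma_n \text{ exists.}$$
Finally, show that
(iv)
$$\mu\Bigl(\varlimsup_{n \to \infty} \Gamma_n\Bigr) = 0 \quad \text{if } \sum_{n=1}^{\infty} \mu(\Gamma_n) < \infty.$$
The result in (iv) is often called the Borel-Cantelli Lemma, and it has
many applications in probability theory.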
3.1.14 Exercise: Let $(E, \mathcal{B}, \mu)$ be a measure space.
(i) Assuming that $\Gamma_1, \Gamma_2 \in \mathcal{B}$ and that $\mu(\Gamma_1 \cap \Gamma_2) < \infty$, show that
$$\mu(\Gamma_1 \cup \Gamma_2) = \mu(\Gamma_1) + \mu(\Gamma_2) - \mu(\Gamma_1 \cap \Gamma_2).$$
(ii) Let $\{\Gamma_m\}_1^n \subseteq \mathcal{B}$ and assume that $\max_{1 \le m \le n} \mu(\Gamma_m) < \infty$. Show that
$$\mu(\Gamma_1 \cup \cdots \cup \Gamma_n) = -\sum_{F} (-1)^{\mathrm{card}(F)} \mu(\Gamma_F),$$
where the summation is over non-empty subsets $F$ of $\{1, \dots, n\}$ and
$\Gamma_F \equiv \bigcap_{i \in F} \Gamma_i$.
(iii) Although the formula in (ii) above is seldom used except in the case when
$n = 2$, the following is an interesting application of the general result. Let
$E$ be the group of permutations on $\{1, \dots, n\}$, $\mathcal{B} = \mathcal{P}(E)$, and $\mu(\{\pi\}) = \frac{1}{n!}$
for each $\pi \in E$. Denote by $A$ the set of $\pi \in E$ such that $\pi(i) \ne i$ for any
$1 \le i \le n$. Then one can interpret $\mu(A)$ as the probability that, when the
numbers $1, \dots, n$ are randomly ordered, none of them is placed in the correct
position. On the basis of this interpretation, one might suspect that $\mu(A)$
should tend to 0 as $n \to \infty$. However, by direct computation, one can see that
this is not the case. Indeed, let $\Gamma_i$ be the set of $\pi \in E$ such that $\pi(i) = i$. Then
$A = (\Gamma_1 \cup \cdots \cup \Gamma_n)^c$. Hence,
$$\mu(A) = 1 - \mu(\Gamma_1 \cup \cdots \cup \Gamma_n) = 1 + \sum_{F} (-1)^{\mathrm{card}(F)} \mu(\Gamma_F).$$
Show that $\mu(\Gamma_F) = \frac{(n-m)!}{n!}$ if $\mathrm{card}(F) = m$, and conclude from this that
$$\mu(A) = \sum_{m=0}^{n} \frac{(-1)^m}{m!} \longrightarrow \frac{1}{e} \quad \text{as } n \to \infty.$$
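The computation in part (iii) can be checked numerically for small $n$. The sketch below compares a brute-force count over all permutations with the inclusion-exclusion sum, grouping the subsets $F$ by their cardinality $m$ and taking the value $\mu(\Gamma_F) = (n-m)!/n!$ from the exercise as given; the function names are hypothetical, and permutations act on $\{0, \dots, n-1\}$ rather than $\{1, \dots, n\}$ purely for indexing convenience.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, exp, factorial

def derangement_probability_brute(n):
    """mu(A): fraction of permutations of {0,...,n-1} with no fixed point."""
    count = sum(
        1 for p in permutations(range(n))
        if all(p[i] != i for i in range(n))
    )
    return Fraction(count, factorial(n))

def derangement_probability_formula(n):
    """1 + sum over non-empty F of (-1)^card(F) * mu(Gamma_F), grouped by m = card(F)."""
    total = Fraction(1)
    for m in range(1, n + 1):
        # There are comb(n, m) subsets F of size m, each with mu(Gamma_F) = (n-m)!/n!.
        total += (-1) ** m * comb(n, m) * Fraction(factorial(n - m), factorial(n))
    return total

for n in range(1, 8):
    print(n, derangement_probability_brute(n), float(derangement_probability_formula(n)))
print(exp(-1))
```

Already for moderate $n$ the values are very close to $1/e \approx 0.3679$, so the probability does not tend to 0.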

3.2 Construction of Integrals.


We are now very close to the point at which we can return to the problem
of integrating functions on a measure space $(E, \mathcal{B}, \mu)$. Recall, from the
introduction to Chapter II, that Lebesgue's procedure entails the use of sums
like

to approximate the integral of $f$ on $E$ with respect to $\mu$. In particular, we have
to be dealing with functions $f$ for which sets of the form
$$\{f \in \Delta\} \equiv \{x \in E : f(x) \in \Delta\}$$
are in $\mathcal{B}$ when $\Delta$ is an interval; and it is only reasonable that we should
call such a function measurable. More generally, given measurable spaces
$(E_1, \mathcal{B}_1)$ and $(E_2, \mathcal{B}_2)$, we will say that $\Phi : E_1 \to E_2$ is a measurable map
on $(E_1, \mathcal{B}_1)$ into $(E_2, \mathcal{B}_2)$ if
$$\{\Phi \in \Gamma\} \equiv \{x \in E_1 : \Phi(x) \in \Gamma\} = \Phi^{-1}(\Gamma) \in \mathcal{B}_1 \quad \text{for each } \Gamma \in \mathcal{B}_2.$$


==

Notice the analogy between the definitions of measurability and continuity.
In particular, it is clear that if $\Phi$ is a measurable map on $(E_1, \mathcal{B}_1)$ into $(E_2, \mathcal{B}_2)$
and $\Psi$ is a measurable map on $(E_2, \mathcal{B}_2)$ into $(E_3, \mathcal{B}_3)$, then $\Psi \circ \Phi$ is a measurable
map on $(E_1, \mathcal{B}_1)$ into $(E_3, \mathcal{B}_3)$. The following lemma is simply a restatement
of Exercise 3.1.10 in the language of measurable functions.
3.2.1 Lemma. Let $(E_1, \mathcal{B}_1)$ and $(E_2, \mathcal{B}_2)$ be measurable spaces and suppose
that $\mathcal{B}_2 = \sigma(E_2; \mathcal{C})$ for some $\mathcal{C} \subseteq \mathcal{P}(E_2)$. If $\Phi : E_1 \to E_2$ has the property
that $\Phi^{-1}(\Gamma) \in \mathcal{B}_1$ for every $\Gamma \in \mathcal{C}$, then $\Phi$ is a measurable map on $(E_1, \mathcal{B}_1)$
into $(E_2, \mathcal{B}_2)$. In particular, if $E_1$ and $E_2$ are topological spaces and $\mathcal{B}_i = \mathcal{B}_{E_i}$
for $i \in \{1, 2\}$, then every continuous map on $E_1$ into $E_2$ is a measurable map
on $(E_1, \mathcal{B}_1)$ into $(E_2, \mathcal{B}_2)$.
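In order to handle certain measurability questions, we introduce at this
point a construction to which we will return in Section 4.1. Namely, given
measurable spaces $(E_1, \mathcal{B}_1)$ and $(E_2, \mathcal{B}_2)$, we define the product of $(E_1, \mathcal{B}_1)$
times $(E_2, \mathcal{B}_2)$ to be the measurable space $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$ where
$$\mathcal{B}_1 \times \mathcal{B}_2 \equiv \sigma\bigl(E_1 \times E_2;\ \{\Gamma_1 \times \Gamma_2 : \Gamma_1 \in \mathcal{B}_1 \text{ and } \Gamma_2 \in \mathcal{B}_2\}\bigr).$$
Also, if $\Phi_i$ is a measurable map on $(E_0, \mathcal{B}_0)$ into $(E_i, \mathcal{B}_i)$ for $i \in \{1, 2\}$, then we
define the tensor product of $\Phi_1$ times $\Phi_2$ to be the map $\Phi_1 \otimes \Phi_2 : E_0 \to E_1 \times E_2$
given by $\Phi_1 \otimes \Phi_2(x) = (\Phi_1(x), \Phi_2(x))$, $x \in E_0$.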

3.2.2 Lemma. Referring to the preceding, suppose that $\Phi_i$ is a measurable
map on $(E_0, \mathcal{B}_0)$ into $(E_i, \mathcal{B}_i)$ for $i \in \{1, 2\}$. Then $\Phi_1 \otimes \Phi_2$ is a measurable
map on $(E_0, \mathcal{B}_0)$ into $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$. Moreover, if $E_1$ and $E_2$ are second
countable topological spaces, then $\mathcal{B}_{E_1} \times \mathcal{B}_{E_2} = \mathcal{B}_{E_1 \times E_2}$.
PROOF: To prove the first assertion, we need only note that if $\Gamma_i \in \mathcal{B}_i$, $i \in \{1, 2\}$,
then $(\Phi_1 \otimes \Phi_2)^{-1}(\Gamma_1 \times \Gamma_2) = \Phi_1^{-1}(\Gamma_1) \cap \Phi_2^{-1}(\Gamma_2) \in \mathcal{B}_0$. As for the second
assertion, first note that $G_1 \times G_2$ is open in $E_1 \times E_2$ for every pair of open sets
$G_1$ in $E_1$ and $G_2$ in $E_2$. Hence, even without second countability,
$\mathcal{B}_{E_1} \times \mathcal{B}_{E_2} \subseteq \mathcal{B}_{E_1 \times E_2}$. At the same time, with second countability, one can write every open
$G$ in $E_1 \times E_2$ as the countable union of sets of the form $G_1 \times G_2$ where $G_i$ is
open in $E_i$. Hence, in this case, the opposite inclusion also holds. □


In measure theory one is most interested in real-valued functions. However,
for reasons of convenience, it is often handy to allow functions to take values
in the extended real line $\overline{\mathbb{R}} \equiv [-\infty, \infty]$. Unfortunately, the introduction
of $\overline{\mathbb{R}}$ involves some annoying problems. We have already encountered such a
problem when we wrote the equation $\mu(\Gamma_2 \setminus \Gamma_1) = \mu(\Gamma_2) - \mu(\Gamma_1)$ in Theorem
3.1.6. Our problem there, and the basic one which we want to discuss here,
stems from the difficulty of extending the arithmetic operations to include $\infty$
and $-\infty$. Thus, in an attempt to lay all such technical difficulties to rest once
and for all, we will spend a little time discussing them here.
To begin with, we point out that $\overline{\mathbb{R}}$ admits a natural metric with which it
becomes compact. Namely, define
$$\rho(\alpha, \beta) = \frac{2}{\pi}\,\bigl|\arctan(\beta) - \arctan(\alpha)\bigr|,$$
where $\arctan(\pm\infty) \equiv \pm\frac{\pi}{2}$. Clearly $(\overline{\mathbb{R}}, \rho)$ is a compact metric space: it is
homeomorphic to $[-1, 1]$ under the map $t \in [-1, 1] \longmapsto \tan\bigl(\frac{\pi t}{2}\bigr)$. Moreover,
$\mathbb{R}$, with its usual topology, is imbedded in $\overline{\mathbb{R}}$ as a dense open set. In particular,
$\mathcal{B}_{\mathbb{R}} = \mathcal{B}_{\overline{\mathbb{R}}}[\mathbb{R}]$ (cf. (iv) in Examples 3.1.5). Having put a topology and
measurable structure on $\overline{\mathbb{R}}$, we will next adopt the following extension to $\overline{\mathbb{R}}$ of
multiplication: $(\pm\infty) \cdot 0 = 0 \cdot (\pm\infty) = 0$ and $(\pm\infty) \cdot \alpha = \alpha \cdot (\pm\infty) = \pm\mathrm{sgn}(\alpha)\infty$ if
$\alpha \in \overline{\mathbb{R}} \setminus \{0\}$. Although $(\alpha, \beta) \in \overline{\mathbb{R}}^2 \longmapsto \alpha \cdot \beta \in \overline{\mathbb{R}}$ is not continuous, one can
easily check that it is a measurable map on $(\overline{\mathbb{R}}^2, \mathcal{B}_{\overline{\mathbb{R}}^2})$ into $(\overline{\mathbb{R}}, \mathcal{B}_{\overline{\mathbb{R}}})$. Unfortunately,
the extension of addition presents a knottier problem. Indeed, because we do
not know how to interpret $\pm\infty \mp \infty$ in general, we will simply avoid doing so
at all by restricting the domain of addition to the set $\widehat{\mathbb{R}^2}$ consisting of $\overline{\mathbb{R}}^2$ with
the two points $(\infty, -\infty)$ and $(-\infty, \infty)$ removed. Clearly $\widehat{\mathbb{R}^2}$ is an open subset
of $\overline{\mathbb{R}}^2$, and so $\mathcal{B}_{\widehat{\mathbb{R}^2}} = \mathcal{B}_{\overline{\mathbb{R}}^2}[\widehat{\mathbb{R}^2}]$. We define addition $(\alpha, \beta) \in \widehat{\mathbb{R}^2} \longmapsto \alpha + \beta \in \overline{\mathbb{R}}$
so that $(\pm\infty) + \alpha = \alpha + (\pm\infty) = \pm\infty$ if $\alpha \ne \mp\infty$. It is then easy to see that
$(\alpha, \beta) \longmapsto \alpha + \beta$ is continuous on $\widehat{\mathbb{R}^2}$ into $\overline{\mathbb{R}}$; and therefore it is certainly a
measurable map on $(\widehat{\mathbb{R}^2}, \mathcal{B}_{\widehat{\mathbb{R}^2}})$ into $(\overline{\mathbb{R}}, \mathcal{B}_{\overline{\mathbb{R}}})$. Finally, we complete our
discussion of $\overline{\mathbb{R}}$ by pointing out that the lattice operations "$\vee$" and "$\wedge$" both admit
unique continuous extensions as maps from $\overline{\mathbb{R}}^2$ into $\overline{\mathbb{R}}$ and are therefore not a
source of concern.
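The conventions just adopted differ from ordinary floating-point arithmetic, where `inf * 0` is `nan`, so it may help to see them written out. The following is a minimal sketch, assuming Python floats (with `math.inf`) as a stand-in for the extended real line; the names `rho`, `ext_mul`, and `ext_add` are hypothetical.

```python
import math

def rho(a, b):
    """Metric on [-inf, inf]: (2/pi) * |arctan(b) - arctan(a)|."""
    # math.atan(math.inf) returns pi/2, matching arctan(+-inf) = +-pi/2.
    return (2 / math.pi) * abs(math.atan(b) - math.atan(a))

def ext_mul(a, b):
    """Multiplication with the convention (+-inf) * 0 = 0 * (+-inf) = 0."""
    if a == 0 or b == 0:
        return 0.0
    return a * b

def ext_add(a, b):
    """Addition, left undefined on the pairs (inf, -inf) and (-inf, inf)."""
    if math.isinf(a) and math.isinf(b) and a != b:
        raise ValueError("inf + (-inf) is not defined")
    return a + b

print(rho(-math.inf, math.inf))   # the diameter of the extended line, about 2
print(ext_mul(math.inf, 0))       # 0.0 under the adopted convention
print(ext_add(math.inf, 5.0))     # inf
```

The explicit exception in `ext_add` mirrors the decision to restrict the domain of addition rather than assign a value to $\infty - \infty$.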
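Having adopted these conventions, we see that, for any pair of measurable
functions $f_1$ and $f_2$ on a measurable space $(E, \mathcal{B})$ into $(\overline{\mathbb{R}}, \mathcal{B}_{\overline{\mathbb{R}}})$, Lemma 3.2.2
guarantees the measurability of the $\overline{\mathbb{R}}$-valued maps:
$$x \in E \longmapsto f_1 \cdot f_2(x) \equiv f_1(x) f_2(x),$$
$$x \in E \longmapsto (f_1 + f_2)(x) \equiv f_1(x) + f_2(x) \in \overline{\mathbb{R}} \quad \text{if } \mathrm{Range}(f_1 \otimes f_2) \subseteq \widehat{\mathbb{R}^2},$$
$$x \in E \longmapsto f_1 \vee f_2(x) \equiv f_1(x) \vee f_2(x),$$
and
$$x \in E \longmapsto f_1 \wedge f_2(x) \equiv f_1(x) \wedge f_2(x).$$
Thus, of course, if $f$ is measurable on $(E, \mathcal{B})$ into $(\overline{\mathbb{R}}, \mathcal{B}_{\overline{\mathbb{R}}})$, then so are
$f^+ \equiv f \vee 0$, $f^- \equiv -(f \wedge 0)$, and $|f| = f^+ + f^-$. Finally, from now on we will call a
measurable map on $(E, \mathcal{B})$ into $(\overline{\mathbb{R}}, \mathcal{B}_{\overline{\mathbb{R}}})$ a measurable function on $(E, \mathcal{B})$.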
From the measure-theoretic standpoint, the most elementary functions are
those which take on only a finite number of distinct values; thus, we will say
that such a function is simple. Note that the class of simple functions is closed
under the lattice operations "$\vee$" and "$\wedge$", multiplication, and, when the sum
is defined, under addition. Aside from constant functions, the simplest of the
simple functions are those which take their values in $\{0, 1\}$. Clearly there is a
one-to-one correspondence between $\{0, 1\}$-valued functions and subsets of $E$.
Namely, with $\Gamma \subseteq E$ we associate the function $\mathbf{1}_\Gamma$ defined by
$$\mathbf{1}_\Gamma(x) = \begin{cases} 1 & \text{if } x \in \Gamma \\ 0 & \text{if } x \notin \Gamma. \end{cases}$$
The function $\mathbf{1}_\Gamma$ is called the indicator (or characteristic) function of the
set $\Gamma$.
The reason why simple functions play such a central role in measure theory
is that their integrals are the easiest to describe. To be precise, let $(E, \mathcal{B}, \mu)$
be a measure space and $f$ a non-negative (i.e., a $[0, \infty]$-valued) measurable
function on $(E, \mathcal{B})$ which is simple. We then define the Lebesgue integral
of $f$ on $E$ to be
$$\sum_{\alpha \in \mathrm{Range}(f)} \alpha\, \mu(f = \alpha),$$
where $\mu(f = \alpha)$ is short-hand for $\mu(\{f = \alpha\})$ which, in turn, is short-hand for
$\mu(\{x \in E : f(x) = \alpha\})$. There are various ways in which we will denote the
Lebesgue integral of $f$, depending on how many details the particular situation
demands. The various expressions which we will use are, in decreasing order
of information conveyed:
$$\int_E f(x)\, \mu(dx), \quad \int_E f\, d\mu, \quad \text{and} \quad \int f\, d\mu.$$
Further, for $\Gamma \in \mathcal{B}$ we will use
$$\int_\Gamma f(x)\, \mu(dx) \quad \text{or} \quad \int_\Gamma f\, d\mu$$
to denote the Lebesgue integral of $\mathbf{1}_\Gamma f$ on $E$. Observe that this notation
is completely consistent, since we would get precisely the same number by
restricting $f$ to $\Gamma$ and computing the Lebesgue integral of $f \restriction \Gamma$ relative to
$(\Gamma, \mathcal{B}[\Gamma], \mu \restriction \mathcal{B}[\Gamma])$ (cf. (iv) of Examples 3.1.5).
It will turn out that all the basic properties of the Lebesgue integral rest on
consistency results about the definition of the integral. The following lemma
is the first such consistency result.
3.2.3 Lemma. Let $(E, \mathcal{B}, \mu)$ be a measure space and $f$ a non-negative simple
measurable function on $(E, \mathcal{B})$. If $f = \sum_{\ell=1}^{n} \beta_\ell \mathbf{1}_{\Delta_\ell}$ where $\{\beta_1, \dots, \beta_n\} \subseteq [0, \infty]$
and $\{\Delta_1, \dots, \Delta_n\} \subseteq \mathcal{B}$, then $\int f\, d\mu = \sum_{\ell=1}^{n} \beta_\ell\, \mu(\Delta_\ell)$.

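PROOF: Let $\{a_1, \dots, a_m\}$ denote the distinct values of $f$ and, for $1 \le k \le m$, set $\Gamma_k = \{f = a_k\}$. Since $\Gamma_k \cap \Gamma_{k'} = \emptyset$ for $k \ne k'$,

$$\sum_{\ell=1}^n \beta_\ell\,\mu(\Delta_\ell) = \sum_{\ell=1}^n \beta_\ell \sum_{k=1}^m \mu(\Delta_\ell \cap \Gamma_k) = \sum_{k=1}^m \sum_{\ell=1}^n \beta_\ell\,\mu(\Delta_\ell \cap \Gamma_k);$$

and so all we need to show is that $\sum_{\ell=1}^n \beta_\ell\,\mu(\Delta_\ell \cap \Gamma_k) = a_k\,\mu(\Gamma_k)$ for each $1 \le k \le m$. Since $\sum_{\ell=1}^n \beta_\ell \mathbf{1}_{\Delta_\ell \cap \Gamma_k} = a_k \mathbf{1}_{\Gamma_k}$ for each $1 \le k \le m$, we now see that it suffices to treat the case when $f = a\mathbf{1}_\Gamma$ for some $a \in [0,\infty]$, $\Gamma \in \mathcal{B}$, and $\{\Delta_1, \dots, \Delta_n\} \subset \mathcal{B}[\Gamma]$. Further, it is clear that we may assume that $a \ne 0$, since the only way in which one could have $\sum_{\ell=1}^n \beta_\ell \mathbf{1}_{\Delta_\ell} = 0$ is if $\beta_\ell = 0$ whenever $\Delta_\ell \ne \emptyset$. In other words, what we still have to show is that, for any $a \in (0,\infty]$, $\Gamma \in \mathcal{B}$, and $\{\Delta_1, \dots, \Delta_n\} \subset \mathcal{B}[\Gamma]$:

$$\sum_{\ell=1}^n \beta_\ell\,\mu(\Delta_\ell) = a\,\mu(\Gamma) \quad\text{if}\quad \sum_{\ell=1}^n \beta_\ell \mathbf{1}_{\Delta_\ell} = a\mathbf{1}_\Gamma.$$

Set $I = (\{0,1\})^n$, and, for $\eta = (\eta_1, \dots, \eta_n) \in I$, define $\beta_\eta = \sum_{\ell=1}^n \eta_\ell \beta_\ell$ and $\Delta_\eta = \bigcap_{\ell=1}^n \Delta_\ell^{(\eta_\ell)}$, where $\Delta^{(1)} \equiv \Delta$ and $\Delta^{(0)} \equiv \Delta^\complement$ (the complement being taken in $\Gamma$). Then $\Delta_\eta \cap \Delta_{\eta'} = \emptyset$ if $\eta \ne \eta'$, and, because

$$\Delta_\ell = \bigcup_{\{\eta \in I :\, \eta_\ell = 1\}} \Delta_\eta \quad\text{for each } 1 \le \ell \le n,$$

$$\sum_{\eta \in I} \beta_\eta \mathbf{1}_{\Delta_\eta} = \sum_{\eta \in I} \sum_{\ell=1}^n \eta_\ell \beta_\ell \mathbf{1}_{\Delta_\eta} = \sum_{\ell=1}^n \beta_\ell \mathbf{1}_{\Delta_\ell} = a\mathbf{1}_\Gamma.$$

In particular, $\Gamma = \bigcup_{\eta \in I} \Delta_\eta$ and $\beta_\eta = a$ if $\Delta_\eta \ne \emptyset$, from which it is clear that

$$\sum_{\eta \in I} \beta_\eta\,\mu(\Delta_\eta) = a\,\mu(\Gamma),$$

and therefore

$$\sum_{\ell=1}^n \beta_\ell\,\mu(\Delta_\ell) = \sum_{\ell=1}^n \beta_\ell \sum_{\{\eta \in I :\, \eta_\ell = 1\}} \mu(\Delta_\eta) = \sum_{\eta \in I} \beta_\eta\,\mu(\Delta_\eta) = a\,\mu(\Gamma). \qquad\square$$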

The importance of Lemma 3.2.3 is already apparent in the next lemma.


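3.2.4 Lemma. Let $f$ and $g$ be non-negative simple measurable functions on $(E, \mathcal{B}, \mu)$. Then, for any $\alpha, \beta \in [0,\infty]$,
$$\int (\alpha f + \beta g)\,d\mu = \alpha \int f\,d\mu + \beta \int g\,d\mu.$$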
3.2 Construction of Integrals 45
In particular, if $f \le g$, then $\int f\,d\mu \le \int g\,d\mu$. In fact, if $f \le g$ and $\int f\,d\mu < \infty$, then
$$\int (g - f)\,d\mu = \int g\,d\mu - \int f\,d\mu.$$

PROOF: Clearly it suffices to prove the first assertion. But if $\{a_1, \dots, a_m\}$ and $\{b_1, \dots, b_n\}$ are the distinct values of $f$ and $g$, respectively, then
$$\alpha f + \beta g = \sum_{k=1}^{m+n} \gamma_k \mathbf{1}_{\Delta_k},$$
where $\gamma_k = \alpha a_k$ and $\Delta_k = \{f = a_k\}$ for $1 \le k \le m$, and $\gamma_k = \beta b_{k-m}$ and $\Delta_k = \{g = b_{k-m}\}$ for $m+1 \le k \le m+n$. Hence the required result follows immediately from Lemma 3.2.3. □
In order to extend our definition of the Lebesgue integral to arbitrary non-negative measurable functions, we want to use a limit procedure. The idea is to approximate such a function by ones which are simple. For example, if $f$ is a non-negative measurable function on $(E, \mathcal{B}, \mu)$, then we might take

$$\varphi_n = \sum_{k=0}^{4^n - 1} 2^{-n} k\,\mathbf{1}_{\{2^n f \in [k, k+1)\}} + 2^n\,\mathbf{1}_{\{f \ge 2^n\}}$$

for $n \ge 1$. Then each $\varphi_n$ is a non-negative, measurable, simple function, and $\varphi_n \nearrow f$ as $n \to \infty$. In fact, $\varphi_n \to f$ uniformly in $(\overline{\mathbb{R}}, \rho)$. Thus it would seem reasonable to define the integral of $f$ as

(3.2.5) $$\int f\,d\mu = \lim_{n\to\infty} \int \varphi_n\,d\mu.$$

Indeed, since $\varphi_n \le \varphi_{n+1}$, Lemma 3.2.4 guarantees that this limit exists. However, before adopting this definition, we must first check that the definition is not too dependent on the choice of the approximating sequence. In fact, at the moment, it is not even clear that this definition would coincide with the one we have already given for simple $f$'s. This brings us to our second consistency result, where, as distinguished from Lemma 3.2.3, we must use countable, as opposed to finite, additivity.
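The dyadic approximation just described is easy to experiment with. The sketch below is only an illustration: the function name `phi` and the sample values are our own choices. It computes the value of the $n$-th approximating simple function at a point where $f$ takes a given value in $[0,\infty]$, namely $2^{-n}\lfloor 2^n f\rfloor$ when $f < 2^n$ and $2^n$ otherwise.

```python
import math


def phi(n, value):
    """Value of the n-th dyadic simple approximation where f equals `value`.

    `value` may be any number in [0, inf]; the result is 2**-n * floor(2**n * value)
    when value < 2**n, and 2**n otherwise.
    """
    if value >= 2 ** n:
        return float(2 ** n)
    return math.floor(value * 2 ** n) / 2 ** n


# Sample values of f (illustrative only), including the value +infinity.
for v in [0.0, 0.3, 1.0, math.pi, 10.0, float("inf")]:
    print(v, [phi(n, v) for n in range(1, 7)])
```

For each finite sample value the printed list is non-decreasing and lies within $2^{-n}$ of the value once $2^n$ exceeds it, while at $+\infty$ the approximations are $2^n$.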
3.2.6 Lemma. Let $(E, \mathcal{B}, \mu)$ be a measure space, and suppose that $\{\varphi_n\}_1^\infty$ and $\psi$ are non-negative measurable simple functions on $(E, \mathcal{B})$. If $\varphi_n \le \varphi_{n+1}$ for all $n \ge 1$ and if $\psi \le \lim_{n\to\infty} \varphi_n$, then $\int \psi\,d\mu \le \lim_{n\to\infty} \int \varphi_n\,d\mu$. In particular, for any non-negative, measurable function $f$ and any non-decreasing sequence $\{\psi_n\}_1^\infty$ of non-negative, measurable, simple functions $\psi_n$ which tend to $f$ as $n \to \infty$, $\lim_{n\to\infty} \int \psi_n\,d\mu$ is the same as the limit in (3.2.5).
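PROOF: We treat three cases.

Case 1: $\mu(\psi = \infty) > 0$. Then, for each $M < \infty$,
$$\mu(\varphi_n > M) \nearrow \mu\Bigl(\bigcup_{n=1}^\infty \{\varphi_n > M\}\Bigr) \ge \mu(\psi = \infty) \ge \epsilon$$
for some $\epsilon > 0$, and so
$$\lim_{n\to\infty} \int \varphi_n\,d\mu \ge \lim_{n\to\infty} M\mu(\varphi_n > M) \ge M\epsilon$$
for all $M < \infty$. Hence, in this case, $\lim_{n\to\infty} \int \varphi_n\,d\mu = \infty = \int \psi\,d\mu$.

Case 2: $\mu(\psi > 0) = \infty$. Here, because $\psi$ is simple, there is an $\epsilon > 0$ such that $\psi > \epsilon$ whenever $\psi > 0$. Hence,
$$\mu(\varphi_n > \epsilon) \nearrow \mu\Bigl(\bigcup_{n=1}^\infty \{\varphi_n > \epsilon\}\Bigr) \ge \mu(\psi > 0) = \infty,$$
which means that
$$\lim_{n\to\infty} \int \varphi_n\,d\mu \ge \lim_{n\to\infty} \epsilon\mu(\varphi_n > \epsilon) = \infty = \int \psi\,d\mu.$$

Case 3: $\mu(\psi = \infty) = 0$ and $\mu(\psi > 0) < \infty$. Set $\hat{E} = \{0 < \psi < \infty\}$. Under the present conditions, $\mu(\hat{E}) < \infty$, $\int \psi\,d\mu = \int_{\hat{E}} \psi\,d\mu$, and $\int \varphi_n\,d\mu \ge \int_{\hat{E}} \varphi_n\,d\mu$ for all $n \ge 1$. Hence, without loss in generality, we will assume that $E = \hat{E}$. But then $\mu(E) < \infty$, and, because $\psi$ is simple, there exist $\epsilon > 0$ and $M < \infty$ such that $\epsilon \le \psi \le M$. Now let $0 < \delta < \epsilon$ be given, and define $E_n = \{\varphi_n \ge \psi - \delta\}$. Then $E_n \nearrow E$ and so
$$\begin{aligned}
\lim_{n\to\infty} \int \varphi_n\,d\mu &\ge \lim_{n\to\infty} \int_{E_n} \varphi_n\,d\mu \ge \lim_{n\to\infty} \Bigl[\int_{E_n} \psi\,d\mu - \delta\mu(E_n)\Bigr] \\
&= \lim_{n\to\infty} \Bigl[\int \psi\,d\mu - \int_{E_n^\complement} \psi\,d\mu - \delta\mu(E_n)\Bigr] \\
&\ge \int \psi\,d\mu - M \lim_{n\to\infty} \mu\bigl(E_n^\complement\bigr) - \delta\mu(E) = \int \psi\,d\mu - \delta\mu(E),
\end{aligned}$$
since $\mu(E) < \infty$ and therefore (cf. (ii) in Theorem 3.1.6) $\mu(E_n^\complement) \searrow 0$. Because this holds for arbitrarily small $\delta > 0$, we get our result upon letting $\delta \searrow 0$. □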
Lemma 3.2.6 allows us to complete the definition of the Lebesgue integral for non-negative, measurable functions. Namely, if $f$ on $(E, \mathcal{B}, \mu)$ is a non-negative, measurable function, then we define the Lebesgue integral of $f$ on $E$ with respect to $\mu$ to be the number in (3.2.5); and we will continue to use the same notation to denote integrals. Not only does Lemma 3.2.6 guarantee that this definition is consistent with our earlier one for simple $f$'s, but it also makes clear that the value of $\int f\,d\mu$ does not depend on the particular way in which one chooses to approximate $f$ by a non-decreasing sequence of non-negative, measurable, simple functions. Thus, for example, the following extension of Lemma 3.2.4 is trivial.
3.2.7 Lemma. If $f$ and $g$ are non-negative, measurable functions on the measure space $(E, \mathcal{B}, \mu)$, then for every $\alpha, \beta \in [0,\infty]$,

$$\int (\alpha f + \beta g)\,d\mu = \alpha \int f\,d\mu + \beta \int g\,d\mu.$$

In particular, if $f \le g$, then $\int f\,d\mu \le \int g\,d\mu$ and $\int (g - f)\,d\mu = \int g\,d\mu - \int f\,d\mu$ as long as $\int f\,d\mu < \infty$.
Obviously $\int f\,d\mu$ reflects the size of a non-negative measurable $f$. The result which follows makes this statement somewhat more quantitative.
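3.2.8 Theorem (Markov's Inequality). If $f$ is a non-negative measurable function on $(E, \mathcal{B}, \mu)$, then

(3.2.9) $$\mu(f \ge \lambda) \le \frac{1}{\lambda} \int_{\{f \ge \lambda\}} f\,d\mu \le \frac{1}{\lambda} \int f\,d\mu, \qquad \lambda > 0.$$

In particular, $\int f\,d\mu = 0$ if and only if $\mu(f > 0) = 0$; and $\mu(f = \infty) = 0$ if $\int f\,d\mu < \infty$.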

PROOF: To prove (3.2.9), simply note that $\lambda \mathbf{1}_{\{f \ge \lambda\}} \le \mathbf{1}_{\{f \ge \lambda\}} f \le f$. Clearly (3.2.9) implies that $\mu(f > 0) = \lim_{\epsilon \searrow 0} \mu(f \ge \epsilon) = 0$ if $\int f\,d\mu = 0$. Similarly, if $M = \int f\,d\mu < \infty$, then $\mu(f \ge \lambda) \le \frac{M}{\lambda}$ for all $\lambda > 0$; and therefore,

$$\mu(f = \infty) \le \lim_{\lambda\to\infty} \mu(f \ge \lambda) = 0.$$

Finally, if $\mu(f > 0) > 0$, then there exists an $\epsilon > 0$ such that $\mu(f \ge \epsilon) \ge \epsilon$; and so (3.2.9) implies that $\int f\,d\mu \ge \epsilon^2 > 0$. □
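As a sanity check of Markov's inequality, here is a small Python sketch on a hypothetical three-point measure space; the masses, the function values, and the helper name `markov_chain` are assumptions made purely for illustration. For a given $\lambda$ it returns the three quantities $\lambda\,\mu(f \ge \lambda)$, $\int_{\{f \ge \lambda\}} f\,d\mu$, and $\int f\,d\mu$, which the pointwise bound $\lambda \mathbf{1}_{\{f \ge \lambda\}} \le \mathbf{1}_{\{f \ge \lambda\}} f \le f$ forces to be non-decreasing from left to right.

```python
from fractions import Fraction

# Hypothetical three-point measure space: point -> mass.
mass = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
# Hypothetical non-negative function: point -> value.
f = {"a": Fraction(0), "b": Fraction(2), "c": Fraction(5)}


def markov_chain(lam):
    """Return (lam * mu(f >= lam), integral of f over {f >= lam}, integral of f)."""
    tail = [x for x in mass if f[x] >= lam]
    left = lam * sum(mass[x] for x in tail)
    middle = sum(f[x] * mass[x] for x in tail)
    right = sum(f[x] * mass[x] for x in mass)
    return left, middle, right


for lam in (1, 2, 3, 5, 6):
    print(lam, markov_chain(lam))
```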
The final step in the definition of the Lebesgue integral is to extend the definition so that it covers measurable functions which can take both signs. To this end, let $f$ be a measurable function on the measure space $(E, \mathcal{B}, \mu)$. Then both $\int f^+\,d\mu$ and $\int f^-\,d\mu$ are defined; and, if we want our integral to be linear, we can do nothing but take $\int f\,d\mu$ to be the difference between these two. However, before doing so, we must make sure that this difference is well-defined. With this consideration in mind, we now say that $\int f\,d\mu$ exists if $\int f^+\,d\mu \wedge \int f^-\,d\mu < \infty$, in which case we define

$$\int f\,d\mu = \int f^+\,d\mu - \int f^-\,d\mu$$

to be the Lebesgue integral of $f$ on $E$. Observe that if $\int f\,d\mu$ exists, then so does $\int_\Gamma f\,d\mu$ for every $\Gamma \in \mathcal{B}$, and in fact

$$\int_{\Gamma_1 \cup \Gamma_2} f\,d\mu = \int_{\Gamma_1} f\,d\mu + \int_{\Gamma_2} f\,d\mu$$

if $\Gamma_1$ and $\Gamma_2$ are disjoint elements of $\mathcal{B}$. Also, it is clear that, when $\int f\,d\mu$ exists,

(3.2.10) $$\Bigl|\int f\,d\mu\Bigr| \le \int |f|\,d\mu.$$

In particular, $\int_\Gamma f\,d\mu = 0$ if $\mu(\Gamma) = 0$. Finally, when $\int f^+\,d\mu \wedge \int f^-\,d\mu = \infty$, we do not even attempt to define $\int f\,d\mu$.

Once again, we need a consistency result before we know for sure that our
definition accomplishes what we wanted it to do; in this case, the preservation
of linearity.
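3.2.11 Lemma. Let $f$ and $g$ be measurable functions on $(E, \mathcal{B}, \mu)$ for which $\int f\,d\mu$ and $\int g\,d\mu$ exist and $\bigl(\int f\,d\mu, \int g\,d\mu\bigr) \in \widehat{\mathbb{R}^2}$. Then $\mu\bigl(f \otimes g \notin \widehat{\mathbb{R}^2}\bigr) = 0$, $\int_{\{f \otimes g \in \widehat{\mathbb{R}^2}\}} (f + g)\,d\mu$ exists, and

$$\int_{\{f \otimes g \in \widehat{\mathbb{R}^2}\}} (f + g)\,d\mu = \int f\,d\mu + \int g\,d\mu.$$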
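PROOF: Set $\hat{E} = \{x \in E : (f(x), g(x)) \in \widehat{\mathbb{R}^2}\}$. Note that, under the stated conditions, either
$$\int f^+\,d\mu \vee \int g^+\,d\mu < \infty \quad\text{or}\quad \int f^-\,d\mu \vee \int g^-\,d\mu < \infty.$$
For definiteness, we will assume that $\int f^-\,d\mu \vee \int g^-\,d\mu < \infty$. As a consequence:
$$\mu\bigl(\hat{E}^\complement\bigr) \le \mu(f^- \vee g^- = \infty) \le \mu(f^- = \infty) + \mu(g^- = \infty) = 0$$
and, because $(a + b)^- \le a^- + b^-$ for $(a, b) \in \widehat{\mathbb{R}^2}$, $\int_{\hat{E}} (f + g)^-\,d\mu < \infty$. Hence, all that remains is to prove the asserted equality.

Note that
$$\begin{aligned}
\int_{\hat{E}} (f + g)^+\,d\mu &= \int_{\hat{E} \cap \{f \wedge g \ge 0\}} (f^+ + g^+)\,d\mu + \int_{\hat{E} \cap \{g < 0,\, f + g \ge 0\}} (f^+ - g^-)\,d\mu \\
&\qquad + \int_{\hat{E} \cap \{f < 0,\, f + g \ge 0\}} (g^+ - f^-)\,d\mu \\
&= \int_{\hat{E} \cap \{f + g \ge 0\}} f^+\,d\mu - \int_{\hat{E} \cap \{f + g \ge 0\}} f^-\,d\mu + \int_{\hat{E} \cap \{f + g \ge 0\}} g^+\,d\mu - \int_{\hat{E} \cap \{f + g \ge 0\}} g^-\,d\mu.
\end{aligned}$$

Similarly,
$$-\int_{\hat{E}} (f + g)^-\,d\mu = \int_{\hat{E} \cap \{f + g < 0\}} f^+\,d\mu - \int_{\hat{E} \cap \{f + g < 0\}} f^-\,d\mu + \int_{\hat{E} \cap \{f + g < 0\}} g^+\,d\mu - \int_{\hat{E} \cap \{f + g < 0\}} g^-\,d\mu.$$
After adding these two, we get the required result. □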
Given a measurable function $f$ on $(E, \mathcal{B}, \mu)$, define
$$\|f\|_{L^1(\mu)} = \int |f|\,d\mu.$$

We say that $f : E \to \overline{\mathbb{R}}$ is $\mu$-integrable if $f$ is a measurable function on $(E, \mathcal{B})$ and $\|f\|_{L^1(\mu)} < \infty$; and we use $L^1(\mu) = L^1(E, \mathcal{B}, \mu)$ to denote the set of all $\mathbb{R}$-valued $\mu$-integrable functions. Note that, from the integration theoretic standpoint, there is no loss in generality to assume that $f \in L^1(\mu)$ is $\mathbb{R}$-valued. Indeed, if $f$ is a $\mu$-integrable function, then $\mathbf{1}_{\{|f| < \infty\}} f \in L^1(\mu)$, $\bigl\|f - \mathbf{1}_{\{|f| < \infty\}} f\bigr\|_{L^1(\mu)} = 0$, and so integrals involving $f$ and $\mathbf{1}_{\{|f| < \infty\}} f$ are indistinguishable. The main reason for insisting that $f$'s in $L^1(\mu)$ be $\mathbb{R}$-valued is so that we have no problems taking linear combinations of them over $\mathbb{R}$. This simplifies the statement of results like the following.
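3.2.12 Lemma. For any measure space $(E, \mathcal{B}, \mu)$, $L^1(\mu)$ is a linear space and

(3.2.13) $$\|\alpha f + \beta g\|_{L^1(\mu)} \le |\alpha|\,\|f\|_{L^1(\mu)} + |\beta|\,\|g\|_{L^1(\mu)}$$

whenever $\alpha, \beta \in \mathbb{R}$ and $f, g \in L^1(\mu)$.
PROOF: Simply note that $|\alpha f + \beta g| \le |\alpha||f| + |\beta||g|$. □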
3.2.14 Remark: As an application of the preceding inequality, we have that if $f, g, h \in L^1(\mu)$, then

(3.2.15) $$\|f - h\|_{L^1(\mu)} \le \|f - g\|_{L^1(\mu)} + \|g - h\|_{L^1(\mu)}.$$

To see this, take $\alpha = \beta = 1$ and replace $f$ and $g$ by $f - g$ and $g - h$ in (3.2.13). Thus $\|f - g\|_{L^1(\mu)}$ looks like a good candidate to be chosen as a metric on $L^1(\mu)$. On the other hand, although, from the standpoint of integration theory, a measurable $f$ for which $\|f\|_{L^1(\mu)} = 0$ might as well be identically $0$, there is, in general, no reason why $f$ need be identically $0$ as a function. This fact prevents $\|\cdot\|_{L^1(\mu)}$ from being a completely satisfactory measure of size. To overcome this problem, we quotient out by the offending subspace. Namely, denote by $\mathcal{N}(\mu)$ the set of $f \in L^1(\mu)$ such that $\mu(f \ne 0) = 0$ (cf. Exercise 3.2.16 below), and, for $f, g \in L^1(\mu)$, write $f \sim g$ if $g - f \in \mathcal{N}(\mu)$. Since it is clear that $\mathcal{N}(\mu)$ is a linear subspace of $L^1(\mu)$, one sees that $\sim$ is an equivalence relation and that the quotient space $L^1(\mu)/\!\sim$ is again a vector space over $\mathbb{R}$. To be precise, for $f \in L^1(\mu)$, we use $[f]^{\sim}$ to denote the $\sim$-equivalence class of $f$; and, for any $f, g \in L^1(\mu)$ and $\alpha, \beta \in \mathbb{R}$, we take

$$\alpha [f]^{\sim} + \beta [g]^{\sim} = [\alpha f + \beta g]^{\sim}.$$

Finally, since $\|f\|_{L^1(\mu)} = \|g\|_{L^1(\mu)}$ whenever $f \sim g$, we can define $\bigl\|[f]^{\sim}\bigr\|_{L^1(\mu)} = \|f\|_{L^1(\mu)}$ and thereby turn $L^1(\mu)/\!\sim$ into a bona fide metric space in which the distance between $[f]^{\sim}$ and $[g]^{\sim}$ is given by $\|f - g\|_{L^1(\mu)}$.


Having made this obeisance to rigor, we will now lapse into the usual, more casual, practice of ignoring the niceties just raised. Thus, unless there is particular danger in doing so, we will not stress the distinction between $f$ as a function and the equivalence class $[f]^{\sim}$ which $f$ determines. For this reason, we will continue to write $L^1(\mu)$, even when we mean $L^1(\mu)/\!\sim$, and we will use $f$ instead of $[f]^{\sim}$. In particular, $L^1(\mu)$ becomes in this way a vector space over $\mathbb{R}$ on which $\|f - g\|_{L^1(\mu)}$ is a metric. As we will see in the next section (cf. Corollary 3.3.12), this metric space is complete.

Exercises

3.2.16 Exercise: Let $f$ be an $\overline{\mathbb{R}}$-valued function on the measurable space $(E, \mathcal{B})$. Show that $f$ is measurable if and only if $\{f > a\} \in \mathcal{B}$ for every $a \in \mathbb{R}$ if and only if $\{f \ge a\} \in \mathcal{B}$ for every $a \in \mathbb{R}$. At the same time, check that "$>$" and "$\ge$" can be replaced by "$<$" and "$\le$", respectively. In fact, show that one can restrict one's attention to $a$'s from a dense subset of $\mathbb{R}$. Finally, if $g$ is a second $\overline{\mathbb{R}}$-valued measurable function on $(E, \mathcal{B})$, show that each of the sets $\{f < g\}$, $\{f \le g\}$, $\{f = g\}$, and $\{f \ne g\}$ is an element of $\mathcal{B}$.

3.2.17 Exercise: Show that an integrable function is determined, up to a set of measure $0$, by its integrals. To be more precise, let $(E, \mathcal{B}, \mu)$ be a measure space and $f$ and $g$ a pair of functions from $L^1(\mu)$. Show that $f \sim g$ if and only if $\int_\Gamma f\,d\mu = \int_\Gamma g\,d\mu$ for each $\Gamma \in \mathcal{B}$.

Hint: Reduce the problem to showing that if $\varphi \in L^1(\mu)$, then $\mu(\{\varphi < 0\}) = 0$ if and only if $\int_\Gamma \varphi\,d\mu \ge 0$ for every $\Gamma \in \mathcal{B}$.

3.3 Convergence of Integrals.


One of the distinct advantages that Lebesgue's theory of integration has over Riemann's approach is that Lebesgue's integral is wonderfully continuous with respect to convergence of integrands. In the present section we will explore some of these continuity properties. We begin by showing that the class of measurable functions is closed under pointwise convergence.
3.3.1 Lemma. Let $(E, \mathcal{B})$ be a measurable space and $\{f_n\}_1^\infty$ a sequence of measurable functions on $(E, \mathcal{B})$. Then $\sup_{n \ge 1} f_n$, $\inf_{n \ge 1} f_n$, $\limsup_{n\to\infty} f_n$, and $\liminf_{n\to\infty} f_n$ are all measurable functions. In particular,
$$\Delta \equiv \Bigl\{x \in E : \lim_{n\to\infty} f_n(x) \text{ exists}\Bigr\} \in \mathcal{B},$$
and the function $f$ given by
$$f(x) = \begin{cases} 0 & \text{if } x \notin \Delta \\ \lim_{n\to\infty} f_n(x) & \text{if } x \in \Delta \end{cases}$$
is measurable on $(E, \mathcal{B})$.
PROOF: We first suppose that $\{f_n\}_1^\infty$ is non-decreasing. It is then clear that
$$\Bigl\{\lim_{n\to\infty} f_n > a\Bigr\} = \bigcup_{n=1}^\infty \{f_n > a\} \in \mathcal{B}, \qquad a \in \mathbb{R},$$
and therefore (cf. Exercise 3.2.16) that $\lim_{n\to\infty} f_n$ is measurable. By replacing $f_n$ with $-f_n$, we see that the same conclusion holds in the case when $\{f_n\}_1^\infty$ is non-increasing.

Next, for an arbitrary sequence $\{f_n\}_1^\infty$ of measurable functions,
$$\{f_1 \vee \cdots \vee f_n : n \in \mathbb{Z}^+\}$$
is a non-decreasing sequence of measurable functions. Hence, by the preceding,
$$\sup_{n \ge 1} f_n = \lim_{n\to\infty} (f_1 \vee \cdots \vee f_n)$$
is measurable; and a similar argument shows that $\inf_{n \ge 1} f_n$ is measurable. Noting that $\inf_{n \ge m} f_n$ does not decrease as $m$ increases, we also see that
$$\liminf_{n\to\infty} f_n = \lim_{m\to\infty} \inf_{n \ge m} f_n$$
is measurable; and, of course, the same sort of reasoning leads to the measurability of $\limsup_{n\to\infty} f_n$.

Finally, since (cf. Exercise 3.1.16)
$$\Delta \equiv \Bigl\{x \in E : \lim_{n\to\infty} f_n(x) \text{ exists}\Bigr\} = \Bigl\{\limsup_{n\to\infty} f_n = \liminf_{n\to\infty} f_n\Bigr\},$$
it is an element of $\mathcal{B}$; and from this it is clear that the function $f$ described in the last part of the statement is measurable. □
We are now ready to prove the first of three basic continuity theorems about the Lebesgue integral. In some ways the first one is the least surprising in that it really only echoes the result obtained in Lemma 3.2.6 and is nothing more than the function version of (i) in Theorem 3.1.6.
3.3.2 Theorem (The Monotone Convergence Theorem). If $\{f_n\}_1^\infty$ is a sequence of non-negative, measurable functions on the measure space $(E, \mathcal{B}, \mu)$ and if $f_n \nearrow f$ point-wise as $n \to \infty$, then $\int f\,d\mu = \lim_{n\to\infty} \int f_n\,d\mu$.
PROOF: Obviously $\int f\,d\mu \ge \lim_{n\to\infty} \int f_n\,d\mu$. To prove the opposite inequality, for each $m \ge 1$ choose a non-decreasing sequence $\{\varphi_{m,n}\}_{n=1}^\infty$ of non-negative, measurable, simple functions so that $\varphi_{m,n} \nearrow f_m$ as $n \to \infty$. Next, define the non-negative, simple, measurable functions $\psi_n = \varphi_{1,n} \vee \cdots \vee \varphi_{n,n}$ for $n \ge 1$. One then has that
$$\psi_n \le \psi_{n+1} \quad\text{and}\quad \varphi_{m,n} \le \psi_n \le f_n \quad\text{for all } 1 \le m \le n;$$
and therefore
$$f_m \le \lim_{n\to\infty} \psi_n \le f \quad\text{for each } m \in \mathbb{Z}^+.$$
In particular, $\psi_n \nearrow f$, and therefore
$$\int f\,d\mu = \lim_{n\to\infty} \int \psi_n\,d\mu.$$
At the same time, because $\psi_n \le f_n$ for all $n \in \mathbb{Z}^+$, we know that
$$\int f\,d\mu \le \lim_{n\to\infty} \int f_n\,d\mu. \qquad\square$$
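A quick numerical experiment may help make the Monotone Convergence Theorem concrete. The sketch below is an illustration under stated assumptions: the choice $f(x) = x^{-1/2}$ on $(0,1]$, the truncations $f_n = f \wedge n$, and the use of a midpoint Riemann sum as a stand-in for the Lebesgue integral are all ours. Since $f_n \nearrow f$, the theorem predicts $\int f_n\,d\lambda \nearrow \int f\,d\lambda = 2$, and a direct calculation gives $\int f_n\,d\lambda = 2 - \frac{1}{n}$.

```python
def f(x):
    """The unbounded but integrable function x -> x**(-1/2) on (0, 1]."""
    return x ** -0.5


def integral_of_truncation(n, cells=100_000):
    """Midpoint-rule approximation of the integral of min(f, n) over (0, 1]."""
    h = 1.0 / cells
    return sum(min(f((k + 0.5) * h), n) * h for k in range(cells))


def exact(n):
    """Closed form of the same integral: n * n**-2 + (2 - 2/n) = 2 - 1/n."""
    return 2.0 - 1.0 / n


for n in (1, 2, 5, 10):
    print(n, integral_of_truncation(n), exact(n))
```

The computed values increase with $n$ and stay below $2$, in line with the theorem.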

Being an inequality instead of an equality, the second continuity result is


often more useful than the other two. It is the function version of (i) and (ii)
of Exercise 3. 1. 12.
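3.3.3 Theorem (Fatou's Lemma). Let $\{f_n\}_1^\infty$ be a sequence of functions on the measure space $(E, \mathcal{B}, \mu)$. If $f_n \ge 0$ for all $n \ge 1$, then
$$\int \liminf_{n\to\infty} f_n\,d\mu \le \liminf_{n\to\infty} \int f_n\,d\mu.$$
In particular, if there exists a $\mu$-integrable function $g$ such that $f_n \le g$ for all $n \ge 1$, then
$$\int \limsup_{n\to\infty} f_n\,d\mu \ge \limsup_{n\to\infty} \int f_n\,d\mu.$$

PROOF: Assume that the $f_n$'s are non-negative. To check the first assertion, set $h_m = \inf_{n \ge m} f_n$. Then $f_m \ge h_m \nearrow \liminf_{n\to\infty} f_n$ and so, by the Monotone Convergence Theorem,
$$\int \liminf_{n\to\infty} f_n\,d\mu = \lim_{m\to\infty} \int h_m\,d\mu \le \liminf_{m\to\infty} \int f_m\,d\mu.$$

Next, drop the non-negativity assumption, but impose $f_n \le g$ for some $\mu$-integrable $g$. Clearly, $f_n' \equiv g - f_n$ is non-negative,
$$\liminf_{n\to\infty} f_n' = g - \limsup_{n\to\infty} f_n \quad\text{and}\quad \liminf_{n\to\infty} \int f_n'\,d\mu = \int g\,d\mu - \limsup_{n\to\infty} \int f_n\,d\mu.$$
Hence, the required result follows from the first part applied to $\{f_n'\}_1^\infty$. □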
Before stating the third continuity result, we need to introduce a notion which is better suited to measure theory than ordinary pointwise equality. Namely, we will say that an $x$-dependent statement about quantities on the measure space $(E, \mathcal{B}, \mu)$ holds $\mu$-almost everywhere if the set $\Delta$ of $x$ for which the statement fails is an element of $\mathcal{B}$ which has $\mu$-measure $0$ (i.e., $\mu(\Delta) = 0$). Thus, if $\{f_n\}_1^\infty$ is a sequence of measurable functions on the measure space $(E, \mathcal{B}, \mu)$, we will say that $\{f_n\}_1^\infty$ converges $\mu$-almost everywhere and will write $\lim_{n\to\infty} f_n$ exists (a.e., $\mu$), if
$$\mu\Bigl(\Bigl\{x \in E : \lim_{n\to\infty} f_n(x) \text{ does not exist}\Bigr\}\Bigr) = 0;$$
and we will say that $\{f_n\}_1^\infty$ converges $\mu$-almost everywhere to $f$, and will write $f_n \to f$ (a.e., $\mu$), if $\mu\bigl(\{x \in E : f(x) \ne \lim_{n\to\infty} f_n(x)\}\bigr) = 0$. By Lemma 3.3.1, we see that if $\{f_n\}_1^\infty$ converges $\mu$-almost everywhere, then there is a measurable $f$ to which $\{f_n\}_1^\infty$ converges $\mu$-almost everywhere. Similarly, if $f$ and $g$ are measurable functions, we write $f = g$ (a.e., $\mu$), $f \le g$ (a.e., $\mu$), or $f \ge g$ (a.e., $\mu$) if $\mu(f \ne g) = 0$, $\mu(f > g) = 0$, or $\mu(f < g) = 0$, respectively. Note that $f = g$ (a.e., $\mu$) is the same statement as $f \sim g$, discussed in Remark 3.2.14.
The following can be thought of as the function version of (iii) of Exercise 3.1.12.
3.3.4 Theorem (Lebesgue's Dominated Convergence Theorem). Suppose that $\{f_n\}_1^\infty$ is a sequence of measurable functions on $(E, \mathcal{B}, \mu)$, and let $f$ be a measurable function to which $\{f_n\}_1^\infty$ converges $\mu$-almost everywhere. If there is a $\mu$-integrable function $g$ for which $|f_n| \le g$ (a.e., $\mu$), $n \ge 1$, then $f$ is integrable and $\lim_{n\to\infty} \int |f_n - f|\,d\mu = 0$. In particular, $\int f\,d\mu = \lim_{n\to\infty} \int f_n\,d\mu$.
PROOF: Let $\hat{E}$ be the set of $x \in E$ for which $f(x) = \lim_{n\to\infty} f_n(x)$ and $\sup_{n \ge 1} |f_n(x)| \le g(x)$. Then $\hat{E}$ is measurable and $\mu(\hat{E}^\complement) = 0$, and so integrals over $\hat{E}$ are the same as those over $E$. Thus, without loss of generality, we will assume that all our assumptions hold for every $x \in E$. But then, $f = \lim_{n\to\infty} f_n$, $|f| \le g$ and $|f - f_n| \le 2g$. Hence, by the second part of Fatou's Lemma,
$$\limsup_{n\to\infty} \int |f - f_n|\,d\mu \le \int \limsup_{n\to\infty} |f - f_n|\,d\mu = 0. \qquad\square$$
It is important to understand the role played by the Lebesgue dominant. Namely, it acts as an umbrella to keep everything under control. To see that such control is important, consider Lebesgue's measure $\lambda_{[0,1]}$ on $([0,1], \mathcal{B}_{[0,1]})$ and the functions $f_n = n\mathbf{1}_{(0, n^{-1})}$. Obviously, $f_n \to 0$ everywhere on $[0,1]$, but $\int_{[0,1]} f_n\,d\lambda_{[0,1]} = 1$ for all $n \in \mathbb{Z}^+$. Unfortunately, in many circumstances, it is difficult to find an appropriate Lebesgue dominant, and, for this reason, results like the following variation on Fatou's Lemma are interesting and often useful. (See Exercise 3.3.19 for other variations.)
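The example $f_n = n\mathbf{1}_{(0, n^{-1})}$ can be checked with exact arithmetic. In the sketch below, the helper names `f` and `integral` are our own, and the integral is computed from the obvious formula (height times the length of the interval) rather than by any general integration routine.

```python
from fractions import Fraction


def f(n, x):
    """Value at x of n times the indicator of the open interval (0, 1/n)."""
    return n if 0 < x < Fraction(1, n) else 0


def integral(n):
    """Lebesgue integral of f_n over [0, 1]: height n times length 1/n."""
    return n * Fraction(1, n)


x = Fraction(1, 7)
print([f(n, x) for n in range(1, 12)])    # eventually 0 at this fixed point
print([integral(n) for n in range(1, 6)])  # but every integral equals 1
```

At each fixed point the values are eventually $0$, while the integrals never move, so no integrable function can dominate the whole sequence.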
3.3.5 Theorem (Lieb's Version of Fatou's Lemma). Let $(E, \mathcal{B}, \mu)$ be a measure space, $\{f_n\}_1^\infty \cup \{f\} \subset L^1(\mu)$, and assume that $f_n \to f$ (a.e., $\mu$). Then

(3.3.6) $$\lim_{n\to\infty} \Bigl|\,\|f_n\|_{L^1(\mu)} - \|f\|_{L^1(\mu)} - \|f_n - f\|_{L^1(\mu)}\Bigr| = \lim_{n\to\infty} \int \Bigl|\,|f_n| - |f| - |f_n - f|\,\Bigr|\,d\mu = 0.$$

In particular, if $\|f_n\|_{L^1(\mu)} \to \|f\|_{L^1(\mu)} < \infty$, then $\|f - f_n\|_{L^1(\mu)} \to 0$.
PROOF: Since
$$\Bigl|\,\|f_n\|_{L^1(\mu)} - \|f\|_{L^1(\mu)} - \|f_n - f\|_{L^1(\mu)}\Bigr| \le \int \Bigl|\,|f_n| - |f| - |f_n - f|\,\Bigr|\,d\mu, \qquad n \ge 1,$$
we need only check the second equality in (3.3.6). But, because
$$\Bigl|\,|f_n| - |f| - |f_n - f|\,\Bigr| \to 0 \quad\text{(a.e., } \mu)$$
and
$$\Bigl|\,|f_n| - |f| - |f_n - f|\,\Bigr| \le \Bigl|\,|f_n| - |f_n - f|\,\Bigr| + |f| \le 2|f|,$$
(3.3.6) follows from Lebesgue's Dominated Convergence Theorem. □
We now have a great deal of evidence that almost everywhere convergence of integrands often leads to convergence of the corresponding integrals. We next want to see what can be said about the converse implication. To begin with, we point out that $\|f_n\|_{L^1(\mu)} \to 0$ does not imply that $f_n \to 0$ (a.e., $\mu$). Indeed, define the functions $\{f_n\}_1^\infty$ on $[0,1]$ so that, for $m \ge 0$ and $0 \le \ell < 2^m$,
$$f_{2^m + \ell} = \mathbf{1}_{[2^{-m}\ell,\, 2^{-m}(\ell + 1)]}.$$
It is then clear that these $f_n$'s are non-negative and measurable on $([0,1], \mathcal{B}_{[0,1]})$ and that $\limsup_{n\to\infty} f_n(x) = 1$ for every $x \in [0,1]$. On the other hand,
$$\int_{[0,1]} f_{2^m + \ell}(x)\,dx = 2^{-m},$$
and therefore $\int_{[0,1]} f_n(x)\,dx \to 0$ as $n \to \infty$.
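The counterexample above is often called the "typewriter" sequence, and it is easy to simulate. The sketch below assumes the indexing $n = 2^m + \ell$ with $m \ge 0$ and $0 \le \ell < 2^m$, and takes $f_n$ to be the indicator of the closed dyadic interval $[2^{-m}\ell, 2^{-m}(\ell+1)]$; the helper names are ours. It confirms that the integrals shrink like $2^{-m}$ while every point of $[0,1]$ is covered once in each dyadic generation.

```python
from fractions import Fraction


def block(n):
    """Return (m, l) with n == 2**m + l and 0 <= l < 2**m, for n >= 1."""
    m = n.bit_length() - 1
    return m, n - 2 ** m


def f(n, x):
    """Indicator of the closed dyadic interval [l / 2**m, (l + 1) / 2**m] at x."""
    m, l = block(n)
    return 1 if Fraction(l, 2 ** m) <= x <= Fraction(l + 1, 2 ** m) else 0


def integral(n):
    """Lebesgue integral of f_n over [0, 1], i.e. the interval length 2**-m."""
    m, _ = block(n)
    return Fraction(1, 2 ** m)


x = Fraction(1, 3)
print([integral(n) for n in (1, 2, 4, 8, 16)])
print([n for n in range(1, 64) if f(n, x) == 1])  # x is hit in every generation
```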
The preceding discussion makes it clear that it may be useful to consider other notions of convergence. Keeping in mind that we are looking for a type of convergence which can be tested using integrals, we take a hint from Markov's inequality (cf. Theorem 3.2.8) and say that the sequence $\{f_n\}_1^\infty$ of measurable functions on the measure space $(E, \mathcal{B}, \mu)$ converges in $\mu$-measure to the measurable function $f$ if $\mu(|f_n - f| \ge \epsilon) \to 0$ as $n \to \infty$ for every $\epsilon > 0$, in which case we write $f_n \to f$ in $\mu$-measure. Note that, by Markov's inequality (3.2.9), if $\|f_n - f\|_{L^1(\mu)} \to 0$ then $f_n \to f$ in $\mu$-measure. Hence, this sort of convergence can be easily tested with integrals (cf. Exercise 3.3.21 below); and, as a consequence, we see that convergence in $\mu$-measure certainly does not imply convergence $\mu$-almost everywhere. In fact, it takes a moment to see in what sense the limit is even uniquely determined by convergence in $\mu$-measure. For this reason, suppose that $\{f_n\}_1^\infty$ converges to both $f$ and to $g$ in $\mu$-measure. Then, for $\epsilon > 0$,
$$\mu(|f - g| \ge \epsilon) \le \mu\bigl(|f - f_n| \ge \tfrac{\epsilon}{2}\bigr) + \mu\bigl(|f_n - g| \ge \tfrac{\epsilon}{2}\bigr) \to 0 \quad\text{as } n \to \infty.$$
Hence,
$$\mu(f \ne g) = \lim_{\epsilon \searrow 0} \mu(|f - g| \ge \epsilon) = 0,$$
and so $f = g$ (a.e., $\mu$). That is, convergence in $\mu$-measure determines the limit function to precisely the same extent as does either $\mu$-almost everywhere or $\|\cdot\|_{L^1(\mu)}$-convergence. In particular, from the standpoint of $\mu$-integration theory, convergence in $\mu$-measure has unique limits.
The following theorem should help to explain the relationship between con­
vergence in JL-measure and JL-almost everywhere convergence.
3.3.7 Theorem. Let $\{f_n\}_1^\infty$ be a sequence of $\mathbb{R}$-valued measurable functions on the measure space $(E, \mathcal{B}, \mu)$.
(i) There is an $\mathbb{R}$-valued, measurable function $f$ for which

(3.3.8) $$\lim_{m\to\infty} \mu\Bigl(\sup_{n \ge m} |f - f_n| \ge \epsilon\Bigr) = 0, \qquad \epsilon > 0,$$

if and only if

(3.3.9) $$\lim_{m\to\infty} \mu\Bigl(\sup_{n \ge m} |f_n - f_m| \ge \epsilon\Bigr) = 0, \qquad \epsilon > 0.$$

Moreover, (3.3.8) implies that $f_n \to f$ both (a.e., $\mu$) and in $\mu$-measure.
(ii) There is an $\mathbb{R}$-valued, measurable function $f$ to which $\{f_n\}_1^\infty$ converges in $\mu$-measure if and only if

(3.3.10) $$\lim_{m\to\infty} \sup_{n \ge m} \mu\bigl(|f_n - f_m| \ge \epsilon\bigr) = 0, \qquad \epsilon > 0.$$

Furthermore, if $f_n \to f$ in $\mu$-measure, then there is a subsequence $\{f_{n_j}\}_{j=1}^\infty$ with the property that
$$\lim_{i\to\infty} \mu\Bigl(\sup_{j \ge i} |f - f_{n_j}| \ge \epsilon\Bigr) = 0, \qquad \epsilon > 0;$$
and therefore $f_{n_j} \to f$ (a.e., $\mu$) as well as in $\mu$-measure.
(iii) When $\mu(E) < \infty$, $f_n \to f$ (a.e., $\mu$) implies (3.3.8) and therefore that $f_n \to f$ in $\mu$-measure.
PROOF: Set
$$\Delta = \Bigl\{x \in E : \lim_{n\to\infty} f_n(x) \text{ does not exist in } \mathbb{R}\Bigr\}.$$
For $m \ge 1$ and $\epsilon > 0$, define $\Delta_m(\epsilon) = \{\sup_{n \ge m} |f_n - f_m| \ge \epsilon\}$. It is then easy to check (from Cauchy's convergence criterion for $\mathbb{R}$) that
$$\Delta = \bigcup_{\ell=1}^\infty \bigcap_{m=1}^\infty \Delta_m\bigl(\tfrac{1}{\ell}\bigr).$$
Since (3.3.9) implies that $\mu\bigl(\bigcap_{m=1}^\infty \Delta_m(\epsilon)\bigr) = 0$ for every $\epsilon > 0$, and, by the preceding,
$$\mu(\Delta) \le \sum_{\ell=1}^\infty \mu\Bigl(\bigcap_{m=1}^\infty \Delta_m\bigl(\tfrac{1}{\ell}\bigr)\Bigr),$$
we see that (3.3.9) does indeed imply that $\{f_n\}_1^\infty$ converges $\mu$-almost everywhere. In addition, if (cf. the last part of Lemma 3.3.1) $f$ is an $\mathbb{R}$-valued, measurable function to which $\{f_n\}_1^\infty$ converges $\mu$-almost everywhere, then
$$\sup_{n \ge m} |f_n - f| \le \sup_{n \ge m} |f_n - f_m| + |f_m - f| \le 2 \sup_{n \ge m} |f_n - f_m| \quad\text{(a.e., } \mu);$$
and so (3.3.9) leads to the existence of an $f$ for which (3.3.8) holds.

Next, suppose that (3.3.8) holds for some $f$. Then it is obvious that $f_n \to f$ both (a.e., $\mu$) and in $\mu$-measure. In addition, (3.3.9) follows immediately from
$$\mu\Bigl(\sup_{n \ge m} |f_n - f_m| \ge \epsilon\Bigr) \le \mu\Bigl(\sup_{n \ge m} |f_n - f| \ge \tfrac{\epsilon}{2}\Bigr) + \mu\Bigl(|f - f_m| \ge \tfrac{\epsilon}{2}\Bigr).$$

We now turn to part (ii). To see that $f_n \to f$ in $\mu$-measure implies (3.3.10), simply note that
$$\mu\bigl(|f_n - f_m| \ge \epsilon\bigr) \le \mu\bigl(|f_n - f| \ge \tfrac{\epsilon}{2}\bigr) + \mu\bigl(|f - f_m| \ge \tfrac{\epsilon}{2}\bigr).$$
Conversely, assume that (3.3.10) holds, and choose $1 \le n_1 < \cdots < n_i < \cdots$ so that
$$\sup_{n \ge n_i} \mu\bigl(|f_n - f_{n_i}| \ge 2^{-i-1}\bigr) \le 2^{-i-1}, \qquad i \ge 1.$$
Then
$$\mu\Bigl(\sup_{j \ge i} |f_{n_j} - f_{n_i}| \ge 2^{-i}\Bigr) \le \mu\Bigl(\bigcup_{j \ge i} \bigl\{|f_{n_{j+1}} - f_{n_j}| \ge 2^{-j-1}\bigr\}\Bigr) \le \sum_{j=i}^\infty \mu\bigl(|f_{n_{j+1}} - f_{n_j}| \ge 2^{-j-1}\bigr) \le 2^{-i}.$$
From this it is clear that $\{f_{n_i}\}_{i=1}^\infty$ satisfies (3.3.9) and therefore that there is an $f$ for which (3.3.8) holds with $\{f_n\}$ replaced by $\{f_{n_i}\}$. Hence, $f_{n_i} \to f$ both $\mu$-almost everywhere and in $\mu$-measure. In particular, when combined with (3.3.10), this means that
$$\mu\bigl(|f_m - f| \ge \epsilon\bigr) \le \limsup_{i\to\infty} \mu\bigl(|f_m - f_{n_i}| \ge \tfrac{\epsilon}{2}\bigr) + \limsup_{i\to\infty} \mu\bigl(|f_{n_i} - f| \ge \tfrac{\epsilon}{2}\bigr) \le \sup_{n \ge m} \mu\bigl(|f_n - f_m| \ge \tfrac{\epsilon}{2}\bigr) \to 0$$
as $m \to \infty$; and so $f_n \to f$ in $\mu$-measure.

Finally, to prove (iii), assume that $\mu(E) < \infty$ and that $f_n \to f$ (a.e., $\mu$). Then, by (ii) of Theorem 3.1.6,
$$\lim_{m\to\infty} \mu\Bigl(\sup_{n \ge m} |f_n - f| \ge \epsilon\Bigr) = \mu\Bigl(\bigcap_{m=1}^\infty \Bigl\{\sup_{n \ge m} |f_n - f| \ge \epsilon\Bigr\}\Bigr) = 0$$
for every $\epsilon > 0$, and therefore (3.3.8) holds. In particular, this means that $f_n \to f$ in $\mu$-measure. □
Because it is quite important to remember the relationships between the
various sorts of convergence discussed in Theorem 3.3.7, we will summarize
them as follows:
$$\|f_n - f\|_{L^1(\mu)} \to 0 \implies f_n \to f \text{ in } \mu\text{-measure} \implies \lim_{i\to\infty} \mu\Bigl(\sup_{j \ge i} |f_{n_j} - f| \ge \epsilon\Bigr) = 0,\ \epsilon > 0, \text{ for some subsequence } \{f_{n_j}\}$$
and
$$\mu(E) < \infty \text{ and } f_n \to f \text{ (a.e., } \mu) \implies f_n \to f \text{ in } \mu\text{-measure}.$$
Notice that, when $\mu(E) = \infty$, $\mu$-almost everywhere convergence does not imply convergence in $\mu$-measure. For example, consider the functions $\mathbf{1}_{[n, \infty)}$ on $\mathbb{R}$ with Lebesgue's measure.
We next show that, at least as far as Theorems 3.3.3 through 3.3.5 are concerned, convergence in $\mu$-measure is just as good as $\mu$-almost everywhere convergence.
3.3.11 Theorem. Let $f$ and $\{f_n\}_1^\infty$ all be measurable, $\mathbb{R}$-valued functions on the measure space $(E, \mathcal{B}, \mu)$, and assume that $f_n \to f$ in $\mu$-measure.
Fatou's Lemma: If $f_n \ge 0$ (a.e., $\mu$) for each $n \ge 1$, then $f \ge 0$ (a.e., $\mu$) and
$$\int f\,d\mu \le \liminf_{n\to\infty} \int f_n\,d\mu.$$
Lebesgue's Dominated Convergence Theorem: If there is an integrable $g$ on $(E, \mathcal{B}, \mu)$ such that $|f_n| \le g$ (a.e., $\mu$) for each $n \ge 1$, then $f$ is integrable, $\lim_{n\to\infty} \|f_n - f\|_{L^1(\mu)} = 0$, and so $\int f_n\,d\mu \to \int f\,d\mu$ as $n \to \infty$.
Lieb's Version of Fatou's Lemma: If $\sup_{n \ge 1} \|f_n\|_{L^1(\mu)} < \infty$, then $f$ is integrable and
$$\lim_{n\to\infty} \Bigl|\,\|f_n\|_{L^1(\mu)} - \|f\|_{L^1(\mu)} - \|f_n - f\|_{L^1(\mu)}\Bigr| = \lim_{n\to\infty} \Bigl\|\,|f_n| - |f| - |f_n - f|\,\Bigr\|_{L^1(\mu)} = 0.$$
In particular, $\|f_n - f\|_{L^1(\mu)} \to 0$ if $\|f_n\|_{L^1(\mu)} \to \|f\|_{L^1(\mu)} \in \mathbb{R}$.
PROOF: Each of these results is obtained via the same trick from the corresponding result for $\mu$-almost everywhere convergent sequences. Thus we will prove the preceding statement of Fatou's Lemma and will leave the proofs of the other assertions to the reader.
Assume that the $f_n$'s are non-negative $\mu$-almost everywhere and that $f_n \to f$ in $\mu$-measure. Choose a subsequence $\{f_{n_m}\}$ so that $\lim_{m\to\infty} \int f_{n_m}\,d\mu = \liminf_{n\to\infty} \int f_n\,d\mu$. Next, choose a subsequence $\{f_{n_{m_i}}\}$ of $\{f_{n_m}\}$ so that $f_{n_{m_i}} \to f$ (a.e., $\mu$). Because each of the $f_{n_{m_i}}$ is non-negative (a.e., $\mu$), it is now clear that $f \ge 0$ (a.e., $\mu$). In addition, by restricting all integrals to the set $\hat{E}$ on which the $f_{n_{m_i}}$'s are non-negative and $f_{n_{m_i}} \to f$, we can apply Theorem 3.3.3 to obtain
$$\int f\,d\mu = \int_{\hat{E}} f\,d\mu \le \liminf_{i\to\infty} \int_{\hat{E}} f_{n_{m_i}}\,d\mu = \lim_{i\to\infty} \int f_{n_{m_i}}\,d\mu = \liminf_{n\to\infty} \int f_n\,d\mu. \qquad\square$$

An important dividend of these considerations is the fact that $L^1(\mu)$ is a complete metric space. More precisely, we have the following corollary.
3.3.12 Corollary. Let $\{f_n\}_1^\infty \subset L^1(\mu)$. If
$$\lim_{m\to\infty} \sup_{n \ge m} \|f_n - f_m\|_{L^1(\mu)} = 0,$$
then there exists an $f \in L^1(\mu)$ such that $\|f_n - f\|_{L^1(\mu)} \to 0$. In other words, $\bigl(L^1(\mu), \|\cdot\|_{L^1(\mu)}\bigr)$ is a complete metric space.

PROOF: By Markov's inequality, we see that (3.3.10) holds. Hence, we can find a measurable $f$ such that $f_n \to f$ in $\mu$-measure; and so, by Fatou's Lemma,
$$\|f - f_m\|_{L^1(\mu)} \le \liminf_{n\to\infty} \|f_n - f_m\|_{L^1(\mu)} \le \sup_{n \ge m} \|f_n - f_m\|_{L^1(\mu)} \to 0$$
as $m \to \infty$. Finally, since $\sup_{n \ge 1} \|f_n\|_{L^1(\mu)} < \infty$, we also see that $f$ is $\mu$-integrable and therefore may be assumed to be $\mathbb{R}$-valued. □

Before closing this discussion, we want to prove a result which is not only useful but also helps to elucidate the structure of $L^1(\mu)$.
3.3.13 Theorem. Let $(E, \mathcal{B}, \mu)$ be a measure space, and assume that $\mu(E) < \infty$. Given a $\pi$-system $\mathcal{C} \subset \mathcal{P}(E)$ which generates $\mathcal{B}$, denote by $S$ the set of functions $\sum_{m=1}^n a_m \mathbf{1}_{\Gamma_m}$, where $n \in \mathbb{Z}^+$, $\{a_m\}_1^n \subset \mathbb{Q}$, and $\{\Gamma_m\}_1^n \subset \mathcal{C} \cup \{E\}$. Then $S$ is dense in $L^1(\mu)$. In particular, if $\mathcal{C}$ is countable, then $L^1(\mu)$ is a separable metric space.
PROOF: Denote by $\overline{S}$ the closure in $L^1(\mu)$ of $S$. It is then easy to see that $\overline{S}$ is a vector space over $\mathbb{R}$. In particular, if $f \in L^1(\mu)$ and both $f^+$ and $f^-$ are elements of $\overline{S}$, then $f \in \overline{S}$. Hence, we need only check that every non-negative $f \in L^1(\mu)$ is in $\overline{S}$. Since every such $f$ is the limit in $L^1(\mu)$ of simple elements of $L^1(\mu)$ and since $\overline{S}$ is a vector space, we now see that it suffices to check that $\mathbf{1}_\Gamma \in \overline{S}$ for every $\Gamma \in \mathcal{B}$. But it is easy to see that the class of $\Gamma \subset E$ for which $\mathbf{1}_\Gamma \in \overline{S}$ is a $\lambda$-system over $E$, and, by hypothesis, it contains the $\pi$-system $\mathcal{C}$. Now apply Lemma 3.1.3. □
3.3.14 Corollary. Let $(E, \rho)$ be a metric space, and suppose that $\mu$ is a measure on $(E, \mathcal{B}_E)$ with the property that there exists a non-decreasing sequence of open sets $E_n$ such that $\mu(E_n) < \infty$ for each $n \ge 1$ and $E = \bigcup_1^\infty E_n$. For each $n \in \mathbb{Z}^+$, denote by $\mathcal{K}_n$ the set of bounded, $\rho$-uniformly continuous functions $\varphi$ which vanish identically off $E_n$, and set $\mathcal{K} = \bigcup_{n \in \mathbb{Z}^+} \mathcal{K}_n$. Then $\mathcal{K}$ is dense in $L^1(\mu)$.
PROOF: We will show first that, for each $n \in \mathbb{Z}^+$,
$$\widetilde{\mathcal{K}}_n \equiv \{\varphi \upharpoonright E_n : \varphi \in \mathcal{K}_n\}$$
is dense in $L^1(E_n, \mathcal{B}_{E_n}, \mu_n)$, where $\mu_n$ denotes the restriction of $\mu$ to $\mathcal{B}_{E_n}$. In view of Theorem 3.3.13, this will follow as soon as we show that $\mathbf{1}_G$ is in the $\|\cdot\|_{L^1(\mu_n)}$-closure of $\widetilde{\mathcal{K}}_n$ for each open $G \subset E_n$. If $G = E_n = E$, then there is no problem, since in that case $\mathbf{1}_G \equiv 1$ is both $\rho$-uniformly continuous and $\mu_n$-integrable. On the other hand, if $G \subset E_n$ is open and $G \ne E$, set $F = G^\complement$ and define
$$\varphi_m(x) = 1 - \bigl(1 + \rho(x, F)\bigr)^{-m}, \qquad m \ge 1,$$
where $\rho(x, F) \equiv \inf\{\rho(x, y) : y \in F\}$. Since $\rho(\cdot, F)$ is $\rho$-uniformly continuous, we see that $\varphi_m$ is $\rho$-uniformly continuous. In addition, it is easy to check that $0 \le \varphi_m \nearrow \mathbf{1}_G$ as $m \to \infty$. Hence, by the Monotone Convergence Theorem, it follows that $\varphi_m \upharpoonright E_n \to \mathbf{1}_G \upharpoonright E_n$ in $L^1(E_n, \mathcal{B}_{E_n}, \mu_n)$ as $m \to \infty$.

By the preceding, we now know that if $f \in L^1(\mu)$ vanishes identically off $E_n$ for some $n \ge 1$, then there is a sequence $\{\varphi_m\}_1^\infty \subset \mathcal{K}_n$ such that $\|\varphi_m - f\|_{L^1(\mu)} \to 0$ as $m \to \infty$. At the same time, it is clear, by Lebesgue's Dominated Convergence Theorem, that for any $f \in L^1(\mu)$, $\|f_n - f\|_{L^1(\mu)} \to 0$ as $n \to \infty$, where $f_n \equiv \mathbf{1}_{E_n} f$. □
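The approximating functions $1 - (1 + \rho(x, F))^{-m}$ used in the proof are simple enough to evaluate directly. The sketch below is an illustration only: it assumes $E = \mathbb{R}$ with the usual distance and the open set $G = (0,1)$, so that $\rho(x, F) = \max\{0, \min\{x, 1 - x\}\}$ for $F = G^\complement$, and the helper names are ours.

```python
def dist_to_complement(x):
    """Distance from x to F, the complement of the open interval G = (0, 1)."""
    return max(0.0, min(x, 1.0 - x))


def phi(m, x):
    """Uniformly continuous approximation 1 - (1 + dist(x, F))**(-m) to the indicator of G."""
    return 1.0 - (1.0 + dist_to_complement(x)) ** (-m)


for x in (-0.5, 0.0, 0.1, 0.5, 1.0):
    print(x, [round(phi(m, x), 4) for m in (1, 5, 25, 125)])
```

Off $G$ every value is $0$, and at each point of $G$ the values increase toward $1$ as $m$ grows, which is the monotone convergence used in the proof.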
Notice that, when applied to Lebesgue's measure $\lambda_{\mathbb{R}^N}$ on $\mathbb{R}^N$, Corollary 3.3.14 says that for every $f \in L^1(\lambda_{\mathbb{R}^N})$ and $\epsilon > 0$ there is a continuous function $\varphi$ such that $\varphi$ vanishes off of a compact set and $\|f - \varphi\|_{L^1(\lambda_{\mathbb{R}^N})} < \epsilon$. This fact can be interpreted in either one of two ways: either measurable functions are not all that different from continuous ones or $\|\cdot\|_{L^1(\lambda_{\mathbb{R}^N})}$ provides a rather crude gauge of size. Experience indicates that the latter interpretation is the more accurate one.
Exercises

3.3.15 Exercise: Let $f$ be a non-negative, integrable function on the measure space $(E, \mathcal{B}, \nu)$, and define $\mu(\Gamma) = \int_\Gamma f\,d\nu$ for $\Gamma \in \mathcal{B}$. Show that $\mu$ is a finite measure on $(E, \mathcal{B})$. In addition, show that $\mu$ is absolutely continuous with respect to $\nu$ in the sense that for each $\epsilon > 0$ there is a $\delta > 0$ with the property that $\mu(\Gamma) < \epsilon$ whenever $\nu(\Gamma) < \delta$.
3.3.16 Exercise: Let $f$ be a non-negative, measurable function on the measure space $(E, \mathcal{B}, \mu)$. If $f$ is integrable, show that
(3.3. 17)
Next, produce a non-negative measurable $f$ on $([0,1], \mathcal{B}_{[0,1]}, \lambda_{[0,1]})$ ($\lambda_{[0,1]}$ is used here to denote the restriction of $\lambda_{\mathbb{R}}$ to $\mathcal{B}_{[0,1]}$) such that (3.3.17) holds but $f$ fails to be integrable. Finally, show that if $f$ is a non-negative measurable function on the finite measure space $(E, \mathcal{B}, \mu)$, then $f$ is integrable if and only if
$$\sum_{n=1}^\infty \mu(f > n) < \infty.$$

3.3.18 Exercise: Let $J$ be a closed rectangle in $\mathbb{R}^N$ and $f : J \to \mathbb{R}$ a
continuous function. Show that the Riemann integral $(R)\int_J f(x)\,dx$ of $f$ over
$J$ is equal to the Lebesgue integral $\int_J f(x)\,\lambda_{\mathbb{R}^N}(dx)$. Next, suppose that $f \in
L^1(\lambda_{\mathbb{R}^N})$ is continuous, and use the preceding to show that
$$\int f(x)\,\lambda_{\mathbb{R}^N}(dx) = \lim_{J \nearrow \mathbb{R}^N} (R)\int_J f(x)\,dx,$$
where the limit means that, for any $\epsilon > 0$, there exists a rectangle $J_\epsilon$ such that
$$\left| \int f(x)\,\lambda_{\mathbb{R}^N}(dx) - (R)\int_J f(x)\,dx \right| < \epsilon$$
whenever $J$ is a rectangle containing $J_\epsilon$. For this reason, even when $f$ is not
continuous, it is conventional to use $\int f(x)\,dx$ instead of $\int f\,d\lambda_{\mathbb{R}^N}$ to denote
the Lebesgue integral of $f$.
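To make the limit over growing rectangles concrete, here is a small sketch for $N = 1$ with the assumed integrand $f(x) = e^{-x^2}$, whose Lebesgue integral over $\mathbb{R}$ is $\sqrt{\pi}$. The midpoint rule stands in for the Riemann integral over $J = [-R, R]$; the integrand, the rule and the number of subintervals are choices made only for this illustration.

```python
import math


def midpoint_riemann_sum(f, a, b, n):
    """Midpoint Riemann sum of f over [a, b] with n equal subintervals."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))


def gaussian(x):
    return math.exp(-x * x)


for R in (1.0, 2.0, 4.0, 8.0):
    approx = midpoint_riemann_sum(gaussian, -R, R, 20000)
    # The gap to sqrt(pi) shrinks as the interval [-R, R] grows.
    print(R, approx, abs(approx - math.sqrt(math.pi)))
```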
3.3.19 Exercise: Let $(E, \mathcal{B}, \mu)$ be a measure space and let $\{f_n\}_1^\infty$ be a
sequence of measurable functions on $(E, \mathcal{B})$. Next, suppose that $\{g_n\}_1^\infty \subseteq L^1(\mu)$
and that $g_n \to g \in L^1(\mu)$ in $L^1(\mu)$. The following variants of Fatou's Lemma
and Lebesgue's Dominated Convergence Theorem are often useful.
(i) If $f_n \le g_n$ (a.e., $\mu$) for each $n \ge 1$, show that
$$\limsup_{n \to \infty} \int f_n\,d\mu \le \int \limsup_{n \to \infty} f_n\,d\mu.$$
(ii) If $f_n \to f$ either in $\mu$-measure or $\mu$-almost everywhere and if $|f_n| \le g_n$
(a.e., $\mu$) for each $n \ge 1$, show that $\|f_n - f\|_{L^1(\mu)} \to 0$ and therefore that
$\lim_{n \to \infty} \int f_n\,d\mu = \int f\,d\mu$.
3.3.20 Exercise: Let $(E, \mathcal{B}, \mu)$ be a measure space. A family $\mathcal{K}$ of measurable
functions $f$ on $(E, \mathcal{B}, \mu)$ is said to be uniformly $\mu$-absolutely continuous if
for each $\epsilon > 0$ there is a $\delta > 0$ such that $\int_\Gamma |f|\,d\mu \le \epsilon$ for all $f \in \mathcal{K}$ whenever
$\Gamma \in \mathcal{B}$ and $\mu(\Gamma) < \delta$; and it is said to be uniformly $\mu$-integrable if for each
$\epsilon > 0$ there is an $R < \infty$ such that $\int_{|f| \ge R} |f|\,d\mu \le \epsilon$ for all $f \in \mathcal{K}$.
(i) Show that $\mathcal{K}$ is uniformly $\mu$-integrable if it is uniformly $\mu$-absolutely
continuous and
$$\sup_{f \in \mathcal{K}} \|f\|_{L^1(\mu)} < \infty.$$
Conversely, suppose that $\mathcal{K}$ is uniformly $\mu$-integrable and show that it is then
necessarily uniformly $\mu$-absolutely continuous and, when $\mu(E) < \infty$, that
$\sup_{f \in \mathcal{K}} \|f\|_{L^1(\mu)} < \infty$.
(ii) If $\sup_{f \in \mathcal{K}} \int |f|^{1+\delta}\,d\mu < \infty$ for some $\delta > 0$, show that $\mathcal{K}$ is uniformly
$\mu$-integrable.
(iii) Let $\{f_n\}_1^\infty \subseteq L^1(\mu)$ be given. If $f_n \to f$ in $L^1(\mu)$, show that $\{f_n\}_1^\infty \cup \{f\}$
is uniformly $\mu$-absolutely continuous and uniformly $\mu$-integrable. Conversely,
assuming that $\mu(E) < \infty$, show that $f_n \to f$ in $L^1(\mu)$ if $f_n \to f$ in
$\mu$-measure and $\{f_n\}_1^\infty$ is uniformly $\mu$-integrable.
(iv) Assume that $\mu(E) = \infty$. We say that a family $\mathcal{K}$ of measurable
functions $f$ on $(E, \mathcal{B}, \mu)$ is tight if for each $\epsilon > 0$ there is a $\Gamma \in \mathcal{B}$ such that
$\mu(\Gamma) < \infty$ and $\sup_{f \in \mathcal{K}} \int_{\Gamma^\complement} |f|\,d\mu \le \epsilon$. Assuming that $\mathcal{K}$ is tight, show that
$\mathcal{K}$ is uniformly $\mu$-integrable if and only if it is uniformly $\mu$-absolutely
continuous and $\sup_{f \in \mathcal{K}} \|f\|_{L^1(\mu)} < \infty$. Next, show that $\mathcal{K}$ is uniformly $\mu$-integrable
if $\mathcal{K}$ is tight and $\sup_{f \in \mathcal{K}} \int |f|^{1+\delta}\,d\mu < \infty$ for some $\delta > 0$. Finally, suppose
that $\{f_n\}_1^\infty \subseteq L^1(\mu)$ is tight and that $f_n \to f$ in $\mu$-measure. Show that
$\|f_n - f\|_{L^1(\mu)} \to 0$ if and only if $\{f_n\}$ is uniformly $\mu$-integrable.
3.3.21 Exercise: Let $(E, \mathcal{B}, \mu)$ be a finite measure space. Show that $f_n \to f$
in $\mu$-measure if and only if $\int |f_n - f| \wedge 1\,d\mu \to 0$.
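A standard example may help to see how the notions in Exercises 3.3.20 and 3.3.21 differ. The sketch below uses the hypothetical sequence $f_n = n\,\mathbf{1}_{(0,1/n)}$ on $[0,1]$ with Lebesgue's measure (this sequence is not taken from the text) and evaluates the relevant quantities in closed form with exact rational arithmetic.

```python
from fractions import Fraction


def measure_exceeds(n, eps):
    """Lebesgue measure of {f_n > eps} for f_n = n * 1_(0, 1/n)."""
    return Fraction(1, n) if eps < n else Fraction(0)


def integral_min_one(n):
    """Integral of min(f_n, 1): the integrand equals 1 on (0, 1/n), 0 elsewhere."""
    return Fraction(1, n) * min(n, 1)


def integral(n):
    """Integral of f_n over [0, 1]; it equals 1 for every n."""
    return Fraction(1, n) * n


def tail_integral(n, R):
    """Integral of f_n over {f_n >= R}; all the mass sits there once n >= R."""
    return integral(n) if n >= R else Fraction(0)


for n in (1, 10, 100, 1000):
    print(n, measure_exceeds(n, Fraction(1, 2)), integral_min_one(n), integral(n))
```

Here $f_n \to 0$ in measure and $\int |f_n| \wedge 1\,d\lambda \to 0$, while $\|f_n\|_{L^1} = 1$ for every $n$ and the tail integrals do not become small uniformly in $n$, so the family is not uniformly integrable.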
3.3.22 Exercise: Let $(E, \rho)$ be a metric space and $\{E_n\}_1^\infty$ a non-decreasing
sequence of open subsets of $E$ such that $E_n \nearrow E$. Let $\mu$ and $\nu$ be two measures
on $(E, \mathcal{B}_E)$ with the properties that $\mu(E_n) \vee \nu(E_n) < \infty$ for every $n \ge 1$ and
$\int \varphi\,d\mu = \int \varphi\,d\nu$ whenever $\varphi$ is a bounded $\rho$-uniformly continuous function for which
there is an $n \ge 1$ such that $\varphi \equiv 0$ off $E_n$. Show that $\mu = \nu$ on $\mathcal{B}_E$.
3.3.23 Exercise: Although almost everywhere convergence does not follow
from convergence in measure, it nearly does. Indeed, suppose that $\{f_n\}_1^\infty$ is a
sequence of measurable, $\mathbb{R}$-valued functions on $(E, \mathcal{B}, \mu)$. Given an $\mathbb{R}$-valued,
measurable function $f$, show that (3.3.8) holds, and therefore that $f_n \to f$
both (a.e., $\mu$) and in $\mu$-measure, if
$$\sum_{n=1}^{\infty} \mu\bigl(|f_n - f| \ge \epsilon\bigr) < \infty \quad \text{for every } \epsilon > 0.$$
In particular, conclude that $f_n \to f$ (a.e., $\mu$) and in $\mu$-measure if
$$\sum_{n=1}^{\infty} \|f_n - f\|_{L^1(\mu)} < \infty.$$
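The summability condition can be contrasted with the familiar "typewriter" sequence, which is an assumed example and not part of the text: for $n = 2^j + k$ with $0 \le k < 2^j$, let $f_n$ be the indicator of $[k 2^{-j}, (k+1) 2^{-j})$ in $[0,1)$. The sketch below checks that the measures of $\{f_n \ne 0\}$ are not summable and that $f_n(x) = 1$ once in every dyadic level, whereas along the subsequence $n = 2^j$ the measures are summable and the values at a fixed $x > 0$ are eventually 0.

```python
from fractions import Fraction


def typewriter_interval(n):
    """Endpoints of the dyadic interval attached to n = 2**j + k, 0 <= k < 2**j."""
    j = n.bit_length() - 1
    k = n - (1 << j)
    return Fraction(k, 1 << j), Fraction(k + 1, 1 << j)


def f(n, x):
    """Indicator of the n-th dyadic interval, evaluated at x."""
    left, right = typewriter_interval(n)
    return 1 if left <= x < right else 0


x = Fraction(1, 3)
# Indices in the levels j = 0, ..., 9 at which f_n(x) = 1 (one index per level).
hits = [n for n in range(1, 1024) if f(n, x) == 1]
# Sum of the measures of {f_n != 0} over the same ten levels (1 per level).
total_measure = sum(r - l for l, r in map(typewriter_interval, range(1, 1024)))
print(len(hits), total_measure)
```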

3.4 Lebesgue's Differentiation Theorem.


Although it represents something of a departure from the spirit of this chapter,
we return in this concluding section to Lebesgue's measure on $\mathbb{R}$ and prove
the following remarkable generalization of the Fundamental Theorem of Calculus.
Namely, we will show that if $f$ is any Lebesgue integrable function on $\mathbb{R}$,
then (cf. the notation introduced in Exercise 3.3.18)
$$\lim_{\mathring{I} \searrow \{x\}} \frac{1}{|\mathring{I}|} \int_{\mathring{I}} |f(t) - f(x)|\,dt = 0 \quad \text{for (Lebesgue) almost every } x \in \mathbb{R}, \tag{3.4.1}$$
where (3.4.1) is to be interpreted as the statement that, for almost every $x \in \mathbb{R}$,
there exists, for each $\epsilon > 0$, a $\delta = \delta(x, \epsilon) > 0$ such that
$$\frac{1}{|\mathring{I}|} \int_{\mathring{I}} |f(t) - f(x)|\,dt < \epsilon \quad \text{whenever } \mathring{I} \ni x \text{ is an open interval with } |\mathring{I}| < \delta.$$
In other words, except on a set of Lebesgue measure 0, an integrable function
can be recovered by differentiating its indefinite integral.
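For a concrete feeling of what (3.4.1) asks, the following sketch evaluates the averages in closed form for the assumed example $f = \mathbf{1}_{[0,1]}$. At the interior point $x = 1/2$ the averages vanish for small intervals, while at the jump $x = 0$ they stay at $1/2$ along symmetric intervals, so $x = 0$ belongs to the exceptional null set for this $f$.

```python
def overlap(lo, hi, a=0.0, b=1.0):
    """Length of the intersection of (lo, hi) with [a, b]."""
    return max(0.0, min(hi, b) - max(lo, a))


def mean_oscillation(x, lo, hi):
    """(1/|I|) * integral over I = (lo, hi) of |f(t) - f(x)| dt for f = 1_[0,1]."""
    inside = overlap(lo, hi)
    length = hi - lo
    fx = 1.0 if 0.0 <= x <= 1.0 else 0.0
    # |f(t) - f(x)| is 1 exactly where f(t) differs from f(x).
    return (length - inside) / length if fx == 1.0 else inside / length


for k in (1, 4, 8, 16):
    d = 2.0 ** -k
    print(d, mean_oscillation(0.5, 0.5 - d / 4, 0.5 + d / 4), mean_oscillation(0.0, -d, d))
```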
In order to understand the strategy behind our proof, first note that (3.4.1)
is completely obvious when $f$ is continuous. Hence, since (cf. Corollary 3.3.14)
the continuous elements of $L^1(\mathbb{R}) \equiv L^1(\lambda_{\mathbb{R}})$ are dense in $L^1(\mathbb{R})$, it suffices for us
to show that the set $\mathcal{G}$ of $f \in L^1(\mathbb{R})$ for which (3.4.1) holds is closed in $L^1(\mathbb{R})$.
To this end, we introduce the Hardy-Littlewood maximal function
$$Mf(x) \equiv \sup\left\{ \frac{1}{|\mathring{I}|} \int_{\mathring{I}} |f(t)|\,dt : \mathring{I} \ni x \right\}, \quad x \in \mathbb{R}, \tag{3.4.2}$$
for $f \in L^1(\mathbb{R})$. Next, for each $f \in L^1(\mathbb{R})$ and $\epsilon > 0$ set
$$\Delta(f, \epsilon) = \left\{ x : \limsup_{\mathring{I} \searrow \{x\}} \frac{1}{|\mathring{I}|} \int_{\mathring{I}} |f(t) - f(x)|\,dt \ge \epsilon \right\}.$$
Clearly, (3.4.1) holds if and only if $|\Delta(f, \epsilon)|_e = 0$ for every $\epsilon > 0$. Moreover,
for any $\epsilon > 0$ and any $f, g \in L^1(\mathbb{R})$:
$$\Delta(f, 3\epsilon) \subseteq \{x : M(f - g)(x) \ge \epsilon\} \cup \Delta(g, \epsilon) \cup \{x : |g(x) - f(x)| \ge \epsilon\}.$$
In particular, this means that if $g \in \mathcal{G}$, then

where, in the passage to the last line, we have used $|\Delta(g, \epsilon)|_e = 0$ and Markov's
inequality. Finally, suppose that $\{g_n\}_1^\infty \subseteq \mathcal{G}$ and that $g_n \to f$ in $L^1(\mathbb{R})$.
Then, the preceding line of reasoning leads us to the conclusion that
$$|\Delta(f, 3\epsilon)|_e \le \limsup_{n \to \infty} \bigl|\{M(f - g_n) \ge \epsilon\}\bigr|_e,$$
and so we would be done if we knew that
( 3.4.3)
With the preceding in mind, we now turn our attention to the analysis of the
Hardy-Littlewood maximal function. To begin with, we first note that $Mf$ is
measurable, and therefore that we can drop the subscript "e" in (3.4.3). To
see this, observe that an alternative expression for $Mf$ is
$$Mf(x) = \sup_{a, b \in (0, \infty)} \frac{1}{a + b} \int_{(-a, b)} |f(x + t)|\,dt$$
and that, for each $a, b \in (0, \infty)$ and $x < y$,
$$\left| \int_{(-a, b)} |f(x + t)|\,dt - \int_{(-a, b)} |f(y + t)|\,dt \right| \le \int_{(x - a,\, y - a] \cup [x + b,\, y + b)} |f(t)|\,dt.$$
Hence, by Exercise 3.3.15, we know that, for each $a, b \in (0, \infty)$,
$$x \in \mathbb{R} \longmapsto \frac{1}{a + b} \int_{(-a, b)} |f(x + t)|\,dt \in \mathbb{R}$$
is uniformly continuous and, therefore, for each $\alpha \in \mathbb{R}$, that
$$\{x : Mf(x) > \alpha\} = \bigcup_{a, b \in (0, \infty)} \left\{ x : \frac{1}{a + b} \int_{(-a, b)} |f(x + t)|\,dt > \alpha \right\}$$
is open. We next observe that control on the size of $Mf$ will follow from control
on the one-sided maximal functions
$$M_+ f(x) \equiv \sup_{h > 0} \frac{1}{h} \int_{[x, x + h)} |f(t)|\,dt \quad \text{and} \quad M_- f(x) \equiv \sup_{h > 0} \frac{1}{h} \int_{(x - h, x]} |f(t)|\,dt.$$
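As a sanity check on the definition of $M_+ f$, the sketch below treats the assumed example $f = \mathbf{1}_{[0,1)}$, for which a direct computation gives $M_+ f(x) = 1/(1 - x)$ for $x < 0$, $M_+ f(x) = 1$ for $0 \le x < 1$ and $M_+ f(x) = 0$ for $x \ge 1$. The brute-force supremum over a finite grid of $h$ and the grid-based length estimate are illustrative devices, not part of the text.

```python
def m_plus_indicator(x):
    """Closed form of M_+ f for f = 1_[0,1)."""
    if x >= 1.0:
        return 0.0
    if x >= 0.0:
        return 1.0
    return 1.0 / (1.0 - x)


def m_plus_brute(x, hs):
    """Maximum over h in hs of (1/h) * |[x, x+h) intersected with [0, 1)|."""
    return max(max(0.0, min(x + h, 1.0) - max(x, 0.0)) / h for h in hs)


def grid_length_of_level_set(eps, lo=-10.0, hi=2.0, dx=0.001):
    """Grid estimate of the length of {x in [lo, hi) : M_+ f(x) > eps}."""
    n = int(round((hi - lo) / dx))
    return dx * sum(1 for k in range(n) if m_plus_indicator(lo + k * dx) > eps)


hs = [k / 1000 for k in range(1, 5001)]
for x in (-3.0, -1.0, 0.25, 0.999, 1.5):
    print(x, m_plus_indicator(x), m_plus_brute(x, hs))
print(grid_length_of_level_set(0.25))
```

For $0 < \epsilon < 1$ the level set $\{M_+ f > \epsilon\}$ is the interval $(1 - 1/\epsilon, 1)$, whose length $1/\epsilon$ equals $\|f\|_{L^1(\mathbb{R})}/\epsilon$ in this example.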

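Indeed, for any $\mathring{I} \ni x$, let $h_+$ and $h_-$ be the lengths of $[x, \infty) \cap \mathring{I}$ and $(-\infty, x] \cap \mathring{I}$,
respectively, and note that
$$\frac{1}{|\mathring{I}|} \int_{\mathring{I}} |f(t)|\,dt = \frac{1}{|\mathring{I}|} \int_{(x - h_-, x]} |f(t)|\,dt + \frac{1}{|\mathring{I}|} \int_{[x, x + h_+)} |f(t)|\,dt$$
$$\le \frac{h_+}{|\mathring{I}|} M_+ f(x) + \frac{h_-}{|\mathring{I}|} M_- f(x) \le M_+ f(x) \vee M_- f(x),$$
and therefore, since it is essentially obvious that $M_+ f \vee M_- f \le Mf$, we have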


(3.4.4)
as
Moreover, by precisely the same argument we used to prove that $\{Mf > \alpha\}$
is open, we know that the same is true of both $\{M_+ f > \alpha\}$ and $\{M_- f > \alpha\}$.
Finally, note that $M_- f(x) = M_+ \tilde{f}(-x)$, where $\tilde{f}(t) \equiv f(-t)$, which means
that we really need to learn how to control only $M_+ f$.
For this purpose, let $f \in L^1(\mathbb{R})$ and $\epsilon > 0$ be given, and consider the function
$$F_\epsilon(x) = \int_{(-\infty, x]} |f(t)|\,dt - \epsilon x, \quad x \in \mathbb{R}.$$
By Exercise 3.3.15, $F_\epsilon$ is uniformly continuous, and, obviously, $\lim_{x \to \pm\infty} F_\epsilon(x)
= \mp\infty$. Moreover, it is an elementary matter to see that

Hence, as we will see shortly, all that we need is the following wonderfully
simple observation.
3.4.5 Lemma (Sunrise Lemma).* Let $F : \mathbb{R} \to \mathbb{R}$ be a continuous function
with the property that $\lim_{x \to \pm\infty} F(x) = \mp\infty$. Set
$$G = \{x : \exists\, y > x \;\; F(y) > F(x)\}.$$
Then $G$ is open, and each non-empty, open, connected component of $G$ is
a bounded interval $(\alpha, \beta)$ with $F(\alpha) \le F(\beta)$.
PROOF: Clearly $G$ is open. Next, suppose that $(\alpha, \beta)$ is a non-empty, connected,
open component of $G$, and take $\gamma \in (\alpha, \beta)$. If either $\beta = \infty$ or $\beta < \infty$
and $F(\beta) < F(\gamma)$, then there exists a unique $x \in [\gamma, \beta)$ such that $F(x) = F(\gamma)$
and $F(y) < F(\gamma)$ for all $y > x$. But this would mean that, on the one hand,
$x \in G$ and, on the other hand, $F(y) < F(x)$ for all $y > x$, which is impossible.
Hence, we now know that $\beta < \infty$ and that $F(\gamma) \le F(\beta)$ for all $\gamma \in (\alpha, \beta)$.
Finally, from this it is clear that $\alpha > -\infty$ and that $F(\alpha) \le F(\beta)$. □
* The name derives from the following picture. The sun is rising infinitely far to the right
in mountainous (one-dimensional) terrain, F(x ) is the elevation at x, and G is the region in
shadow at the instant when the sun comes over the horizon.
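The picture behind the Sunrise Lemma can be mimicked on a finite list of sampled elevations. The sketch below is a discrete analogue made up for illustration (the function name and the sample data are not from the text): an index $i$ is "in shadow" when some later sample is strictly higher, and each maximal run of shaded indices plays the role of a component $(\alpha, \beta)$.

```python
def shadow_components(F):
    """Maximal runs (start, stop) of indices i with F[i] < max(F[i+1:]).

    stop is the first index to the right of the run that is not in shadow.
    """
    n = len(F)
    in_shadow = [False] * n
    best = float("-inf")              # maximum of F over indices larger than i
    for i in range(n - 1, -1, -1):
        in_shadow[i] = F[i] < best
        best = max(best, F[i])
    runs, i = [], 0
    while i < n:
        if in_shadow[i]:
            j = i
            while in_shadow[j]:       # the last index is never shaded, so j stays < n
                j += 1
            runs.append((i, j))
            i = j
        else:
            i += 1
    return runs


elevations = [5, 1, 2, 4, 3, 0, 3.5, 1, -2]
print(shadow_components(elevations))
```

In this discrete version every shaded sample lies strictly below the first lit sample to its right, which is the counterpart of the inequality $F(\gamma) \le F(\beta)$ obtained in the proof.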
Applying Lemma 3.4.5 to the function $F_\epsilon$, we see that $G_\epsilon \equiv \{M_+ f > \epsilon\}$ is
either empty or (cf. the first part of Lemma 2.1.9) is the union of countably
many disjoint, bounded, open intervals $(\alpha_n, \beta_n)$ satisfying
$$0 \le F_\epsilon(\beta_n) - F_\epsilon(\alpha_n) = \int_{(\alpha_n, \beta_n)} |f(t)|\,dt - \epsilon(\beta_n - \alpha_n)$$
for each $n$. Hence, either $|G_\epsilon| = 0$ or, after summing the preceding over $n$, we
arrive at

In other words, we have now proved first that

and then, after taking left limits with respect to $\epsilon > 0$,

In fact, because $M_- f(x) = M_+ \tilde{f}(-x)$, we also know that (3.4.6) continues to
hold when $M_+ f$ is replaced by $M_- f$ throughout. Finally, in conjunction with
(3.4.4), these lead to

which means that we have now proved the renowned Hardy-Littlewood
maximal inequality:

( 3.4.7)
Since (3.4.7) certainly implies (3.4.3) , and (3.4.3) was the only missing in­
gredient in the program with which this section began, the derivation of the
following statement is complete.
3.4.8 Theorem (Lebesgue's Differentiation Theorem). For any Lebesgue
integrable function $f$ on $\mathbb{R}$, (3.4.1) holds. In particular,
$$\lim_{\mathring{I} \searrow \{x\}} \frac{1}{|\mathring{I}|} \int_{\mathring{I}} f(t)\,dt = f(x) \quad \text{for almost every } x \in \mathbb{R}. \tag{3.4.9}$$
Before closing this section, there are several comments which should be
made. First, one should notice that the conclusions drawn in Theorem 3.4.8
remain true for any Lebesgue measurable $f$ which is integrable on each compact
subset of $\mathbb{R}$. Indeed, all the assertions there are completely local and therefore
follow by replacing $f$ with $f\,\mathbf{1}_{(-R, R)}$, restricting one's attention to $x \in (-R, R)$,
and then letting $R \nearrow \infty$.
Second, one should notice that (3.4.7) would be a trivial consequence of
Markov's inequality if we had the estimate $\|Mf\|_{L^1(\mathbb{R})} \le 2\|f\|_{L^1(\mathbb{R})}$. Thus, it
is reasonable to ask whether such an estimate is true. That the answer is a
resounding no can be most easily seen from the observation that, if $\|f\|_{L^1(\mathbb{R})} \neq
0$, then $\alpha \equiv \int_{(-r, r)} |f(t)|\,dt > 0$ for some $r > 0$ and therefore $Mf(x) \ge \frac{\alpha}{2(|x| + r)}$
for all $x \in \mathbb{R}$. That is, if $f \in L^1(\mathbb{R})$ does not vanish almost everywhere, then
$Mf$ is not integrable. (To see that the situation is even worse and that, in
general, $Mf$ need not be integrable over bounded sets, see Exercise 3.4.12
below.) Thus, in a very real sense, (3.4.7) is about as well as one can do.
(See Exercise 6.2.27 for an interesting continuation of these considerations.)
Because this sort of situation arises quite often, inequalities of the form in
(3.4.7) have been given a special name: they are called weak-type inequalities
to distinguish them from inequalities of the form $\|Mf\|_{L^1(\mathbb{R})} \le C\|f\|_{L^1(\mathbb{R})}$,
which would be called strong-type inequalities.
Finally, it should be clear that, except for the derivation of (3.4.7), the
arguments given here would work equally well in $\mathbb{R}^N$. Thus, we would know
that, for each Lebesgue integrable $f$ on $\mathbb{R}^N$,
$$\lim_{B \searrow \{x\}} \frac{1}{|B|} \int_B |f(y) - f(x)|\,dy = 0 \quad \text{for almost every } x \in \mathbb{R}^N \tag{3.4.10}$$
if we knew that
$$\bigl|\{Mf \ge \epsilon\}\bigr| \le \frac{C}{\epsilon} \|f\|_{L^1(\lambda_{\mathbb{R}^N})}, \tag{3.4.11}$$
where $Mf$ is the Hardy-Littlewood maximal function
$$Mf(x) \equiv \sup_{B \ni x} \frac{1}{|B|} \int_B |f(y)|\,dy$$
and $B$ denotes a generic open ball in $\mathbb{R}^N$. It turns out that (3.4.11), and
therefore (3.4.10), are both true. However, the proof of (3.4.11) for $N \ge 2$ is
somewhat more involved than the one which we have given of (3.4.7)*.

Exercises

3.4.12 Exercise: Define $f : \mathbb{R} \to [0, \infty)$ so that $f(x) = \bigl(x (\log x)^2\bigr)^{-1}$ if
$x \in (0, e^{-1})$ and $f(x) = 0$ if $x \notin (0, e^{-1})$. Using Exercise 3.3.18 and the
Fundamental Theorem of Calculus, check that $f \in L^1(\mathbb{R})$ and that
$$\int_{(0, x)} f(t)\,dt = -\frac{1}{\log x}, \quad x \in (0, e^{-1}).$$
In particular, conclude that $\int_{(0, r)} Mf(x)\,dx = \infty$ for every $r > 0$.
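The formula for the indefinite integral can be checked numerically before attempting the exercise. The sketch below compares a central difference of $F(x) = -1/\log x$ with $f$, and then evaluates $\int_{(s, r)} \frac{-1}{x \log x}\,dx = \log(-\log s) - \log(-\log r)$. The function $-1/(x \log x)$ is the average of $f$ over $(0, x)$, which is one natural lower bound for $Mf(x)$; using it here is an assumption about how one might approach the last part, not a step given in the text.

```python
import math


def f(x):
    """f(x) = 1 / (x * (log x)**2) on (0, 1/e), and 0 elsewhere."""
    return 1.0 / (x * math.log(x) ** 2) if 0.0 < x < math.exp(-1) else 0.0


def F(x):
    """Claimed value of the integral of f over (0, x), for 0 < x < 1/e."""
    return -1.0 / math.log(x)


def lower_bound_integral(s, r):
    """Integral over (s, r) of -1 / (x * log x), for 0 < s < r < 1/e."""
    return math.log(-math.log(s)) - math.log(-math.log(r))


x, h = 0.1, 1e-6
print((F(x + h) - F(x - h)) / (2 * h), f(x))
for s in (1e-3, 1e-30, 1e-300):
    # The values keep growing as s decreases to 0 (like log log(1/s)).
    print(s, lower_bound_integral(s, 0.1))
```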


* See, for example, E. M. Stein's Singular Integrals and Differentiability Properties of
Functions, published by Princeton Univ. Press (1970).
3.4.13 Exercise: Given $f \in L^1(\mathbb{R})$, define the Lebesgue set of $f$ to be
the set $\mathrm{Leb}(f)$ of those $x \in \mathbb{R}$ for which the limit in (3.4.1) is 0. Clearly,
(3.4.1) is the statement that $\mathrm{Leb}(f)^\complement$ has Lebesgue measure 0, and clearly
$\mathrm{Leb}(f)$ is the set on which $f$ is well behaved in the sense that the averages
$\frac{1}{|\mathring{I}|} \int_{\mathring{I}} f(t)\,dt$ converge to $f(x)$ as $\mathring{I} \searrow \{x\}$ for $x \in \mathrm{Leb}(f)$. The purpose of this
exercise is to show that, in the same sense, other averaging procedures converge
to $f$ on $\mathrm{Leb}(f)$. To be precise, let $\rho$ be a bounded continuous function on $\mathbb{R}$
having one bounded, continuous derivative $\rho'$. Further, assume that $\rho \in L^1(\mathbb{R})$,
$\int \rho(t)\,dt = 1$, $\int |t \rho'(t)|\,dt < \infty$, and $\lim_{|t| \to \infty} \rho(t) = \lim_{|t| \to \infty} t \rho'(t) = 0$. Next,
for each $\epsilon > 0$, set $\rho_\epsilon(t) = \epsilon^{-1} \rho(\epsilon^{-1} t)$ and define
$$f_\epsilon(x) = \int \rho_\epsilon(x - t) f(t)\,dt, \quad x \in \mathbb{R} \text{ and } f \in L^1(\mathbb{R}).$$
The purpose of this exercise is to show that
$$f_\epsilon(x) \to f(x) \text{ as } \epsilon \searrow 0 \text{ for each } x \in \mathrm{Leb}(f). \tag{3.4.14}$$
(i) Show that, for any $f \in L^1(\mathbb{R})$ and $x \in \mathrm{Leb}(f)$,
$$\lim_{\delta \searrow 0} \frac{1}{\delta} \int_{[x, x + \delta)} f(t)\,dt = f(x) = \lim_{\delta \searrow 0} \frac{1}{\delta} \int_{(x - \delta, x]} f(t)\,dt.$$
(ii) Assuming that $f$ is continuous and vanishes off a compact set, first show
that
$$f_\epsilon(x) = \int_{[0, \infty)} \rho(-t) f(x + \epsilon t)\,dt + \int_{[0, \infty)} \rho(t) f(x - \epsilon t)\,dt,$$
and then (using Exercise 3.3.18 and Theorem 1.2.7) verify the following equalities:
$$\int_{[0, \infty)} \rho(-t) f(x + \epsilon t)\,dt = \int_{(0, \infty)} t \rho'(-t) \left( \frac{1}{\epsilon t} \int_{[x, x + \epsilon t)} f(s)\,ds \right) dt \tag{3.4.16}$$
and
$$\int_{[0, \infty)} \rho(t) f(x - \epsilon t)\,dt = -\int_{(0, \infty)} t \rho'(t) \left( \frac{1}{\epsilon t} \int_{(x - \epsilon t, x]} f(s)\,ds \right) dt. \tag{3.4.16'}$$
Next (using Corollary 3.3.14) argue that (3.4.16) and (3.4.16') continue to hold
for every $f \in L^1(\mathbb{R})$ and $x \in \mathrm{Leb}(f)$.
(iii) Combining part (i) with (3.4.16) and (3.4.16'), conclude that
$$\lim_{\epsilon \searrow 0} f_\epsilon(x) = -f(x) \int t \rho'(t)\,dt, \quad \text{for } x \in \mathrm{Leb}(f),$$
and, after another application of Exercise 3.3.18 and Theorem 1.2.7, note that
$$-\int t \rho'(t)\,dt = \int \rho(t)\,dt = 1.$$
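One admissible choice of $\rho$ in Exercise 3.4.13 is the standard Gaussian density, which is an assumption made here only for illustration. For the assumed example $f = \mathbf{1}_{[0,1]}$, the convolution $f_\epsilon$ can then be written with the normal distribution function $\Phi$ as $f_\epsilon(x) = \Phi\bigl((1 - x)/\epsilon\bigr) - \Phi\bigl(-x/\epsilon\bigr)$, and the sketch below evaluates it as $\epsilon$ decreases.

```python
import math


def normal_cdf(z):
    """Distribution function of the standard Gaussian density."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def f_eps(x, eps):
    """f_eps(x) for f = 1_[0,1] and rho the standard Gaussian density."""
    return normal_cdf((1.0 - x) / eps) - normal_cdf((0.0 - x) / eps)


for eps in (1.0, 0.1, 0.01, 0.001):
    # Columns: a point inside (0, 1), the jump at 0, a point outside [0, 1].
    print(eps, f_eps(0.5, eps), f_eps(0.0, eps), f_eps(2.0, eps))
```

At $x = 1/2$ and $x = 2$, which are points of continuity and hence in $\mathrm{Leb}(f)$, the values tend to $f(x)$. At the jump $x = 0$, which is not in $\mathrm{Leb}(f)$, they tend to $1/2$ instead.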
Chapter IV
Products of Measures

4. 1 Fubini's Theorem.
Just before Lemma 3.2.2, we introduced the product $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$
of two measurable spaces $(E_1, \mathcal{B}_1)$ and $(E_2, \mathcal{B}_2)$. We now want to show that
if $\mu_i$, $i \in \{1, 2\}$, is a measure on $(E_i, \mathcal{B}_i)$, then, under reasonable conditions,
there is a unique measure $\nu$ on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$ with the property that
$$\nu(\Gamma_1 \times \Gamma_2) = \mu_1(\Gamma_1)\,\mu_2(\Gamma_2) \quad \text{for all } \Gamma_i \in \mathcal{B}_i.$$


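The key to the construction of $\nu$ is found in the following function analog
of $\pi$- and $\lambda$-systems (cf. Lemma 3.1.3). Namely, given a space $E$, we will say
that a collection $\mathcal{L}$ of functions $f : E \to (-\infty, \infty]$ is a semi-lattice if both
$f^+$ and $f^-$ are in $\mathcal{L}$ whenever $f \in \mathcal{L}$. A sub-collection $\mathcal{K} \subseteq \mathcal{L}$ will be called an
$\mathcal{L}$-system if:
(a) $\mathbf{1} \in \mathcal{K}$;
(b) if $f, g \in \mathcal{K}$ and $\{f = \infty\} \cap \{g = \infty\} = \emptyset$, then $g - f \in \mathcal{K}$ whenever
either $f \le g$ or $g - f \in \mathcal{L}$;
(c) if $\alpha, \beta \in [0, \infty)$ and $f, g \in \mathcal{K}$, then $\alpha f + \beta g \in \mathcal{K}$;
(d) if $\{f_n\}_1^\infty \subseteq \mathcal{K}$ and $f_n \nearrow f$, then $f \in \mathcal{K}$ whenever $f$ is bounded or $f \in \mathcal{L}$.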

The analog of Lemma 3.1.3 in this context is the following.

4.1.1 Lemma. Let $\mathcal{C}$ be a $\pi$-system which generates the $\sigma$-algebra $\mathcal{B}$ over
$E$, and let $\mathcal{L}$ be a semi-lattice of functions $f : E \to (-\infty, \infty]$. If $\mathcal{K}$ is an
$\mathcal{L}$-system and $\mathbf{1}_\Gamma \in \mathcal{K}$ for every $\Gamma \in \mathcal{C}$, then $\mathcal{K}$ contains every $f \in \mathcal{L}$ which is
measurable on $(E, \mathcal{B})$.
PROOF: First note that $\{\Gamma \subseteq E : \mathbf{1}_\Gamma \in \mathcal{K}\}$ is a $\lambda$-system which contains $\mathcal{C}$.
Hence, by Lemma 3.1.3, $\mathbf{1}_\Gamma \in \mathcal{K}$ for every $\Gamma \in \mathcal{B}$. Combined with (c) above,
this means that $\mathcal{K}$ contains every non-negative, measurable, simple function
on $(E, \mathcal{B})$.
Next, suppose that $f \in \mathcal{L}$ is measurable on $(E, \mathcal{B})$. Then both $f^+$ and
$f^-$ have the same properties, and, by (b) above, it is enough to show that
$f^+, f^- \in \mathcal{K}$ in order to know that $f \in \mathcal{K}$. Thus, without loss in generality,
we assume that $f \in \mathcal{L}$ is a non-negative measurable function on $(E, \mathcal{B})$. But
in that case $f$ is the non-decreasing limit of non-negative measurable simple
functions; and so $f \in \mathcal{K}$ by (d). □
The power of Lemma 4.1.1 to handle questions involving products is already
apparent in the following.
4.1.2 Lemma. Let $(E_1, \mathcal{B}_1)$ and $(E_2, \mathcal{B}_2)$ be measurable spaces, and suppose
that $f$ is an $\mathbb{R}$-valued measurable function on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$. Then for
each $x_1 \in E_1$ and $x_2 \in E_2$, $f(x_1, \,\cdot\,)$ and $f(\,\cdot\,, x_2)$ are measurable functions on
$(E_2, \mathcal{B}_2)$ and $(E_1, \mathcal{B}_1)$, respectively. Next, suppose that $\mu_i$, $i \in \{1, 2\}$, is a finite
measure on $(E_i, \mathcal{B}_i)$. Then for every measurable function $f$ on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$
which is either bounded or non-negative, the functions
$$\int_{E_2} f(\,\cdot\,, x_2)\,\mu_2(dx_2) \quad \text{and} \quad \int_{E_1} f(x_1, \,\cdot\,)\,\mu_1(dx_1)$$
are measurable on $(E_1, \mathcal{B}_1)$ and $(E_2, \mathcal{B}_2)$, respectively.
PROOF: Clearly it is enough to check all these assertions when $f$ is non-negative.
Let $\mathcal{L}$ be the collection of all non-negative functions on $E_1 \times E_2$, and define
$\mathcal{K}$ to be those elements of $\mathcal{L}$ which have all the asserted properties. It is clear
that $\mathbf{1}_{\Gamma_1 \times \Gamma_2} \in \mathcal{K}$ for all $\Gamma_i \in \mathcal{B}_i$. Moreover, it is easy to check that $\mathcal{K}$ is an
$\mathcal{L}$-system. Hence, by Lemma 4.1.1, we are done. □
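4.1.3 Lemma. Given a pair $(E_1, \mathcal{B}_1, \mu_1)$ and $(E_2, \mathcal{B}_2, \mu_2)$ of finite measure
spaces, there exists a unique measure $\nu$ on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$ such that
$$\nu(\Gamma_1 \times \Gamma_2) = \mu_1(\Gamma_1)\,\mu_2(\Gamma_2) \quad \text{for all } \Gamma_i \in \mathcal{B}_i.$$
Moreover, for every non-negative measurable function $f$ on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$,
$$\int_{E_1 \times E_2} f(x_1, x_2)\,\nu(dx_1 \times dx_2) = \int_{E_2} \left( \int_{E_1} f(x_1, x_2)\,\mu_1(dx_1) \right) \mu_2(dx_2) = \int_{E_1} \left( \int_{E_2} f(x_1, x_2)\,\mu_2(dx_2) \right) \mu_1(dx_1). \tag{4.1.4}$$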

PROOF: The uniqueness of $\nu$ is guaranteed by Exercise 3.1.8. To prove the
existence of $\nu$, define

and
for $\Gamma \in \mathcal{B}_1 \times \mathcal{B}_2$. Using the Monotone Convergence Theorem, one sees that both
$\nu_{1,2}$ and $\nu_{2,1}$ are finite measures on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$. Moreover, by the same
sort of argument as was used to prove Lemma 4.1.2, for every non-negative
measurable function $f$ on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$:

and
$$\int f\,d\nu_{2,1} = \int_{E_1} \left( \int_{E_2} f(x_1, x_2)\,\mu_2(dx_2) \right) \mu_1(dx_1).$$
Finally, since $\nu_{1,2}(\Gamma_1 \times \Gamma_2) = \mu_1(\Gamma_1)\,\mu_2(\Gamma_2) = \nu_{2,1}(\Gamma_1 \times \Gamma_2)$ for all $\Gamma_i \in \mathcal{B}_i$, we see
that both $\nu_{1,2}$ and $\nu_{2,1}$ fulfill the requirements placed on $\nu$. Hence, not only
does $\nu$ exist, but it is also equal to both $\nu_{1,2}$ and $\nu_{2,1}$; and so the preceding
equalities lead to (4.1.4). □
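On finite sets everything in Lemma 4.1.3 reduces to finite sums, which makes the statement easy to experiment with. The sketch below uses two made-up weighted finite sets (the points, weights and the function `f` are all hypothetical) and checks the rectangle property together with the equality of the two iterated integrals in (4.1.4), using exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

# Two finite measure spaces given by point masses (hypothetical data).
mu1 = {"a": Fraction(1, 2), "b": Fraction(3, 2)}
mu2 = {0: Fraction(1, 3), 1: Fraction(2, 3), 2: Fraction(1)}


def nu(gamma):
    """Product measure of a set gamma of pairs (x1, x2)."""
    return sum(mu1[x1] * mu2[x2] for (x1, x2) in gamma)


def f(x1, x2):
    """A non-negative function on the product that is not a product of two factors."""
    return (x2 + 1) ** 2 + (3 * x2 if x1 == "b" else 0)


# Integrate in x1 first, then in x2, and the other way round.
mu1_inside = sum(sum(f(x1, x2) * mu1[x1] for x1 in mu1) * mu2[x2] for x2 in mu2)
mu2_inside = sum(sum(f(x1, x2) * mu2[x2] for x2 in mu2) * mu1[x1] for x1 in mu1)
direct = sum(f(x1, x2) * nu({(x1, x2)}) for x1, x2 in product(mu1, mu2))
print(mu1_inside, mu2_inside, direct)
```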
In order to extend the preceding construction to measures which need not be
finite, we must (cf. Exercise 4.1.12) introduce a qualified notion of finiteness.
Namely, we will say that the measure $\mu$ on $(E, \mathcal{B})$ is $\sigma$-finite and will call
$(E, \mathcal{B}, \mu)$ a $\sigma$-finite measure space if $E$ can be written as the union of a
countable number of sets $\Gamma \in \mathcal{B}$ for each of which $\mu(\Gamma) < \infty$. Thus, for
example, $(\mathbb{R}^N, \mathcal{B}_{\mathbb{R}^N}, \lambda_{\mathbb{R}^N})$ is a $\sigma$-finite measure space.
4.1.5 Theorem (Tonelli's Theorem). Let $(E_1, \mathcal{B}_1, \mu_1)$ and $(E_2, \mathcal{B}_2, \mu_2)$ be
$\sigma$-finite measure spaces. Then there is a unique measure $\nu$ on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$
such that $\nu(\Gamma_1 \times \Gamma_2) = \mu_1(\Gamma_1)\,\mu_2(\Gamma_2)$ for all $\Gamma_i \in \mathcal{B}_i$. In addition, for every
non-negative measurable function $f$ on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$, $\int f(\,\cdot\,, x_2)\,\mu_2(dx_2)$ and
$\int f(x_1, \,\cdot\,)\,\mu_1(dx_1)$ are measurable on $(E_1, \mathcal{B}_1)$ and $(E_2, \mathcal{B}_2)$, respectively, and
(4.1.4) continues to hold.
PROOF: Choose sequences $\{E_{i,n}\}_{n=1}^\infty \subseteq \mathcal{B}_i$ for $i \in \{1, 2\}$ so that $\mu_i(E_{i,n}) < \infty$
for each $n \ge 1$ and $E_i = \bigcup_{n=1}^\infty E_{i,n}$. Without loss in generality, we assume that
$E_{i,m} \cap E_{i,n} = \emptyset$ for $m \neq n$. For each $n \in \mathbb{Z}^+$, define $\mu_{i,n}(\Gamma_i) = \mu_i(\Gamma_i \cap E_{i,n})$,
$\Gamma_i \in \mathcal{B}_i$; and, for $(m, n) \in (\mathbb{Z}^+)^2$, let $\nu_{(m,n)}$ on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$ be the measure
constructed from $\mu_{1,m}$ and $\mu_{2,n}$ as in Lemma 4.1.3.
Clearly, by Lemma 4.1.2, for any non-negative measurable function $f$ on
$(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$,

is measurable on $(E_1, \mathcal{B}_1)$; and, similarly, $\int_{E_1} f(x_1, \,\cdot\,)\,\mu_1(dx_1)$ is measurable on
$(E_2, \mathcal{B}_2)$. Finally, the map $\Gamma \in \mathcal{B}_1 \times \mathcal{B}_2 \longmapsto \sum_{m,n=1}^\infty \nu_{(m,n)}(\Gamma)$ defines a measure
$\nu_0$ on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$, and it is easy to check that $\nu_0$ has all the required
properties. At the same time, if $\nu$ is any other measure on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$ for
which $\nu(\Gamma_1 \times \Gamma_2) = \mu_1(\Gamma_1)\,\mu_2(\Gamma_2)$, $\Gamma_i \in \mathcal{B}_i$, then, by the uniqueness assertion
in Lemma 4.1.3, $\nu$ coincides with $\nu_{(m,n)}$ on $\mathcal{B}_1 \times \mathcal{B}_2[E_{1,m} \times E_{2,n}]$ for each
$(m, n) \in (\mathbb{Z}^+)^2$ and is therefore equal to $\nu_0$ on $\mathcal{B}_1 \times \mathcal{B}_2$. □
The measure $\nu$ constructed in Theorem 4.1.5 is called the product of $\mu_1$
times $\mu_2$ and is denoted by $\mu_1 \times \mu_2$.

4.1.6 Theorem (Fubini's Theorem). Let $(E_1, \mathcal{B}_1, \mu_1)$ and $(E_2, \mathcal{B}_2, \mu_2)$ be
$\sigma$-finite measure spaces and $f$ a measurable function on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$.
Then $f$ is $\mu_1 \times \mu_2$-integrable if and only if

if and only if

Next, set

and

and define fi on Ei , i E {1, 2}, by


if $x_1 \in \Lambda_1$
otherwise

and

Then $f_i$ is an $\mathbb{R}$-valued, measurable function on $(E_i, \mathcal{B}_i)$. Finally, if $f$ is
$\mu_1 \times \mu_2$-integrable, then $\mu_i(\Lambda_i^\complement) = 0$, $f_i \in L^1(\mu_i)$, and

( 4. 1.7)

for i E {1, 2}.



PROOF: The first assertion is an immediate consequence of Theorem 4.1.5.
Moreover, since $\Lambda_i \in \mathcal{B}_i$, it is easy to check (from Lemma 4.1.2) that $f_i$ is an
$\mathbb{R}$-valued, measurable function on $(E_i, \mathcal{B}_i)$. Finally, if $f$ is $\mu_1 \times \mu_2$-integrable,
then, by the first assertion, $\mu_i(\Lambda_i^\complement) = 0$ and $f_i \in L^1(\mu_i)$. Hence, by Theorem
4.1.5 applied to $f^+$ and $f^-$, we see that
$$\int_{E_1 \times E_2} f(x_1, x_2)\,(\mu_1 \times \mu_2)(dx_1 \times dx_2)$$
$$= \int_{\Lambda_1 \times E_2} f^+(x_1, x_2)\,(\mu_1 \times \mu_2)(dx_1 \times dx_2) - \int_{\Lambda_1 \times E_2} f^-(x_1, x_2)\,(\mu_1 \times \mu_2)(dx_1 \times dx_2)$$
$$= \int_{\Lambda_1} \left( \int_{E_2} f^+(x_1, x_2)\,\mu_2(dx_2) \right) \mu_1(dx_1) - \int_{\Lambda_1} \left( \int_{E_2} f^-(x_1, x_2)\,\mu_2(dx_2) \right) \mu_1(dx_1)$$
$$= \int_{\Lambda_1} f_1(x_1)\,\mu_1(dx_1);$$
and the same line of reasoning applies to $f_2$. □
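The integrability hypothesis in Fubini's Theorem cannot simply be dropped. A classical illustration, which is an assumed example and not taken from the text, uses counting measure on $\mathbb{Z}^+$ in both factors and the array $a(m, n)$ equal to $1$ when $n = m$, to $-1$ when $n = m + 1$, and to $0$ otherwise. Each row and each column has at most two non-zero entries, so the inner sums below are exact.

```python
def a(m, n):
    """Entry of the doubly indexed array, for m, n >= 1."""
    if n == m:
        return 1
    if n == m + 1:
        return -1
    return 0


def row_sum(m):
    """Sum over all n >= 1 of a(m, n); only n in {m, m + 1} contribute."""
    return sum(a(m, n) for n in range(1, m + 2))


def col_sum(n):
    """Sum over all m >= 1 of a(m, n); only m in {n - 1, n} contribute."""
    return sum(a(m, n) for m in range(1, n + 1))


N = 1000
sum_n_inside = sum(row_sum(m) for m in range(1, N + 1))   # every row sums to 0
sum_m_inside = sum(col_sum(n) for n in range(1, N + 1))   # only column 1 is non-zero
abs_mass_50_rows = sum(abs(a(m, n)) for m in range(1, 51) for n in range(1, 52))
print(sum_n_inside, sum_m_inside, abs_mass_50_rows)
```

The two iterated sums are 0 and 1, while the sum of $|a(m, n)|$ over the first $M$ rows equals $2M$ and so is not finite in the limit.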

Exercises

4.1.8 Exercise: Let $(E, \mathcal{B}, \mu)$ be a $\sigma$-finite measure space. Given a
non-negative measurable function $f$ on $(E, \mathcal{B})$, define
$$\Gamma(f) = \{(x, t) \in E \times [0, \infty) : t \le f(x)\}$$
and
$$\widehat{\Gamma}(f) = \{(x, t) \in E \times [0, \infty) : t < f(x)\}.$$
Show that both $\Gamma(f)$ and $\widehat{\Gamma}(f)$ are elements of $\mathcal{B} \times \mathcal{B}_{[0, \infty)}$ and, in addition,
that
that
( 4.1.9)
Hint: In proving measurability, consider the function $(x, t) \in E \times [0, \infty) \longmapsto
f(x) - t \in (-\infty, \infty]$; and get (4.1.9) as an application of Tonelli's Theorem.
Clearly (4.1.9) can be interpreted as the statement that the integral of a
non-negative function is the area under its graph.
4.1.10 Exercise: Let $(E_1, \mathcal{B}_1, \mu_1)$ and $(E_2, \mathcal{B}_2, \mu_2)$ be $\sigma$-finite measure spaces
and assume that, for $i \in \{1, 2\}$, $\mathcal{B}_i = \sigma(E_i; \mathcal{C}_i)$, where $\mathcal{C}_i$ is a $\pi$-system
containing a sequence $\{E_{i,n}\}_{n=1}^\infty$ such that $E_i = \bigcup_{n=1}^\infty E_{i,n}$ and $\mu_i(E_{i,n}) < \infty$,
$n \ge 1$. Show that if $\nu$ is a measure on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$ with the property
that $\nu(\Gamma_1 \times \Gamma_2) = \mu_1(\Gamma_1)\,\mu_2(\Gamma_2)$ for all $\Gamma_i \in \mathcal{C}_i$, then $\nu = \mu_1 \times \mu_2$. Use this fact
to show that, for any $M, N \in \mathbb{Z}^+$,

4.1.11 Exercise: Let $(E_1, \mathcal{B}_1, \mu_1)$ and $(E_2, \mathcal{B}_2, \mu_2)$ be $\sigma$-finite measure spaces.
Given $\Gamma \in \mathcal{B}_1 \times \mathcal{B}_2$, define
$$\Gamma_{(1)}(x_2) \equiv \{x_1 \in E_1 : (x_1, x_2) \in \Gamma\} \quad \text{for } x_2 \in E_2$$
and
$$\Gamma_{(2)}(x_1) \equiv \{x_2 \in E_2 : (x_1, x_2) \in \Gamma\} \quad \text{for } x_1 \in E_1.$$
Check both that $\Gamma_{(i)}(x_j) \in \mathcal{B}_i$ for each $x_j \in E_j$ and that $x_j \in E_j \longmapsto
\mu_i\bigl(\Gamma_{(i)}(x_j)\bigr) \in [0, \infty]$ is measurable on $(E_j, \mathcal{B}_j)$ ($\{i, j\} = \{1, 2\}$). Finally, show
that $\mu_1 \times \mu_2(\Gamma) = 0$ if and only if $\mu_i\bigl(\Gamma_{(i)}(x_j)\bigr) = 0$ for $\mu_j$-almost every $x_j \in E_j$;
and, conclude that $\mu_1\bigl(\Gamma_{(1)}(x_2)\bigr) = 0$ for $\mu_2$-almost every $x_2 \in E_2$ if and only
if $\mu_2\bigl(\Gamma_{(2)}(x_1)\bigr) = 0$ for $\mu_1$-almost every $x_1 \in E_1$. In other words, $\Gamma \in \mathcal{B}_1 \times \mathcal{B}_2$
has $\mu_1 \times \mu_2$-measure 0 if and only if $\mu_1$-almost every vertical slice ($\mu_2$-almost
every horizontal slice) has $\mu_2$-measure ($\mu_1$-measure) 0.


4.1.12 Exercise: The condition that the measure spaces of which one is taking
a product be $\sigma$-finite is essential if one wants to carry out the program in this
section. To see this, let $E_1 = E_2 = (0, 1)$ and $\mathcal{B}_1 = \mathcal{B}_2 = \mathcal{B}_{(0,1)}$. Define $\mu_1$ on
$(E_1, \mathcal{B}_1)$ so that $\mu_1(\Gamma)$ is the number of elements in $\Gamma$ ($\infty$ if $\Gamma$ is not a finite
set) and show that $\mu_1$ is a measure on $(E_1, \mathcal{B}_1)$. Next, take $\mu_2$ to be Lebesgue's
measure $\lambda_{(0,1)}$ on $(E_2, \mathcal{B}_2)$. Show that there is a set $\Gamma \in \mathcal{B}_1 \times \mathcal{B}_2$ such that
$$\int_{E_2} \mathbf{1}_\Gamma(x_1, x_2)\,\mu_2(dx_2) = 0 \quad \text{for every } x_1 \in E_1$$
but

(Hint: Try $\Gamma = \{(x, x) : 0 < x < 1\}$.) In particular, there is no way that the
second equality in (4.1.4) can be made to hold. Notice that what fails here is
really the uniqueness and not the existence in Lemma 4.1.3.
4.1.13 Exercise: Let $(E, \mathcal{B})$ be a measurable space. Given $-\infty < a < b < \infty$
and a function $f : (a, b) \times E \to \mathbb{R}$ with the properties that $f(\,\cdot\,, x) \in C((a, b))$
for every $x \in E$ and $f(t, \,\cdot\,)$ is measurable on $(E, \mathcal{B})$ for every $t \in (a, b)$, show
that $f$ is measurable on $\bigl((a, b) \times E, \mathcal{B}_{(a,b)} \times \mathcal{B}\bigr)$. Next, suppose that $f(\,\cdot\,, x) \in
C^1((a, b))$ for each $x \in E$, set $f'(t, x) = \frac{d}{dt} f(t, x)$, $x \in E$, and show that $f'$ is
measurable on $\bigl((a, b) \times E, \mathcal{B}_{(a,b)} \times \mathcal{B}\bigr)$. Finally, suppose that $\mu$ is a measure on
$(E, \mathcal{B})$ and that there is a $g \in L^1(\mu)$ such that $|f(t, x)| \vee |f'(t, x)| \le g(x)$ for all
$(t, x) \in (a, b) \times E$. Show not only that $\int_E f(\,\cdot\,, x)\,\mu(dx) \in C^1((a, b))$ but also
that
$$\frac{d}{dt} \int_E f(t, x)\,\mu(dx) = \int_E f'(t, x)\,\mu(dx).$$
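A quick numerical check of the final identity is possible in a simple assumed case: $E = [0, 1]$ with Lebesgue's measure and $f(t, x) = \cos(tx)$, for which $g \equiv 1$ dominates both $|f|$ and $|f'|$. Then $\int_0^1 \cos(tx)\,dx = \sin(t)/t$ for $t \neq 0$, and the derivative of this closed form can be compared with a midpoint approximation of $\int_0^1 -x \sin(tx)\,dx$. The example and the quadrature rule are illustrative choices only.

```python
import math


def derivative_of_closed_form(t):
    """d/dt of sin(t)/t, valid for t != 0."""
    return (t * math.cos(t) - math.sin(t)) / (t * t)


def integral_of_t_derivative(t, n=2000):
    """Midpoint approximation of the integral over [0, 1] of -x*sin(t*x) dx."""
    h = 1.0 / n
    return h * sum(-(k + 0.5) * h * math.sin(t * (k + 0.5) * h) for k in range(n))


for t in (0.5, 1.0, 2.0, 5.0):
    print(t, derivative_of_closed_form(t), integral_of_t_derivative(t))
```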

4.2 Steiner Symmetrization and the Isodiametric Inequality.


In order to provide an example which displays the power of Fubini's Theorem,
we will prove in this section an elementary but important inequality
about Lebesgue's measure. Namely, we will show that, for any bounded subset
$\Gamma \subseteq \mathbb{R}^N$,
$$|\Gamma|_e \le \Omega_N\,\mathrm{rad}(\Gamma)^N, \tag{4.2.1}$$
where $\Omega_N$ denotes the volume of the unit ball $B(0, 1)$ in $\mathbb{R}^N$ and
$$\mathrm{rad}(\Gamma) \equiv \sup\left\{ \frac{|y - x|}{2} : x, y \in \Gamma \right\}$$
is the radius (i.e., half the diameter) of $\Gamma$. Notice (cf. (ii) in Exercise 2.2.3) that
what (4.2.1) says is that, among all the subsets of $\mathbb{R}^N$ with a given diameter,
the ball of that diameter has the largest volume; it is for this reason that (4.2.1)
is called the isodiametric inequality.
At first glance one might be inclined to think that there is nothing to (4.2.1).
Indeed, one might carelessly suppose that every $\Gamma$ is a subset of a closed ball
of radius $\mathrm{rad}(\Gamma)$ and therefore that (4.2.1) is trivial. This is true when $N = 1$.
However, after a moment's thought, one realizes that, for $N > 1$, although
$\Gamma$ is always contained in a closed ball whose radius is equal to the diameter
of $\Gamma$, it is not necessarily contained in one with the same radius as $\Gamma$. (For
example, consider an equilateral triangle in $\mathbb{R}^2$.) Thus, the inequality $|\Gamma|_e \le
\Omega_N \bigl(2\,\mathrm{rad}(\Gamma)\bigr)^N$ is trivial, but the inequality in (4.2.1) is not! On the other hand,
there are many $\Gamma$'s for which (4.2.1) is easy. In particular, if $\Gamma$ is symmetric in
the sense that $\Gamma = -\Gamma \equiv \{-x : x \in \Gamma\}$, then it is clear that
$$x \in \Gamma \implies 2|x| = |x + x| \le 2\,\mathrm{rad}(\Gamma) \quad \text{and therefore} \quad \Gamma \subseteq \overline{B\bigl(0, \mathrm{rad}(\Gamma)\bigr)}.$$
Hence (4.2.1) is trivial when $\Gamma$ is symmetric, and so all that we have to do is
devise a procedure to reduce the general case to the symmetric one.
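The equilateral triangle mentioned above can be checked by hand or with a few lines of arithmetic. The sketch below takes the triangle with side 1 (the coordinates are chosen here for convenience), computes $\mathrm{rad}(\Gamma)$ from the vertices, and compares it with the circumradius $1/\sqrt{3}$, which is the radius of the smallest closed disc containing the triangle. It also compares the area with $\Omega_2\,\mathrm{rad}(\Gamma)^2 = \pi/4$.

```python
import math
from itertools import combinations

# Vertices of an equilateral triangle with side length 1.
vertices = [(0.0, 0.0), (1.0, 0.0), (0.5, math.sqrt(3.0) / 2.0)]

# The diameter of a triangle is attained at a pair of vertices.
rad = max(math.dist(p, q) for p, q in combinations(vertices, 2)) / 2.0
circumradius = 1.0 / math.sqrt(3.0)
area = math.sqrt(3.0) / 4.0
isodiametric_bound = math.pi * rad ** 2

print(rad, circumradius, area, isodiametric_bound)
```

Since the circumradius exceeds $\mathrm{rad}(\Gamma) = 1/2$, no closed disc of radius $\mathrm{rad}(\Gamma)$ contains the triangle, yet its area is still below the bound in (4.2.1).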
The method with which we will perform this reduction is based on a famous
construction known as the Steiner symmetrization procedure. To describe
Steiner's procedure, we must first introduce a little notation. Given $\nu$ from
the unit $(N - 1)$-sphere $S^{N-1} \equiv \{x \in \mathbb{R}^N : |x| = 1\}$, let $L(\nu)$ denote the line
$\{t\nu : t \in \mathbb{R}\}$, $P(\nu)$ the $(N - 1)$-dimensional subspace $\{x \in \mathbb{R}^N : x \perp \nu\}$, and
define
$$S(\Gamma; \nu) \equiv \bigl\{ x + t\nu : x \in P(\nu) \text{ and } |t| < \tfrac{1}{2}\,\ell(\Gamma; \nu, x) \bigr\},$$
where
$$\ell(\Gamma; \nu, x) \equiv \bigl| \{t \in \mathbb{R} : x + t\nu \in \Gamma\} \bigr|_e$$
is the length of the intersection of the line $x + L(\nu)$ with $\Gamma$. Notice that,
in the creation of $S(\Gamma; \nu)$ from $\Gamma$, we have taken the intersection of $\Gamma$ with
$x + L(\nu)$, squashed it to remove all gaps, and then slid the resulting interval
along $x + L(\nu)$ until its center point falls on $x$. In particular, $S(\Gamma; \nu)$ is the
symmetrization of $\Gamma$ with respect to the subspace $P(\nu)$ in the sense that, for
each $x \in P(\nu)$,
$$x + t\nu \in S(\Gamma; \nu) \iff x - t\nu \in S(\Gamma; \nu); \tag{4.2.2}$$
what is only slightly less obvious is that $S(\Gamma; \nu)$ possesses the properties proved
in the next lemma.
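The squash-and-slide description can be tried out on a planar set built from rectangles. In the sketch below, a set is stored as a list of columns `(x0, x1, [(lo, hi), ...])`, meaning that above each $x$ in $(x_0, x_1)$ the set consists of the listed disjoint vertical intervals. This representation, the sample set and the helper names are all assumptions made for the illustration, with $\nu = e_2$.

```python
import math
from itertools import combinations


def area(columns):
    """Area of a union of rectangles given column by column."""
    return sum((x1 - x0) * sum(hi - lo for lo, hi in ys) for x0, x1, ys in columns)


def steiner_e2(columns):
    """Replace the slices over each column by one interval centered at height 0."""
    out = []
    for x0, x1, ys in columns:
        length = sum(hi - lo for lo, hi in ys)
        if length > 0:
            out.append((x0, x1, [(-length / 2, length / 2)]))
    return out


def radius(columns):
    """Half the diameter; for unions of rectangles it is attained at corners."""
    corners = [(x, y) for x0, x1, ys in columns for lo, hi in ys
               for x in (x0, x1) for y in (lo, hi)]
    return max(math.dist(p, q) for p, q in combinations(corners, 2)) / 2


gamma = [(0.0, 1.0, [(0.0, 1.0), (3.0, 4.0)]),
         (1.0, 2.0, [(-2.0, -1.0)]),
         (2.0, 3.0, [(0.5, 2.5)])]
sym = steiner_e2(gamma)
print(area(gamma), area(sym), radius(gamma), radius(sym))
```

In this example the area is unchanged and the radius does not increase, in line with the properties stated in the lemma that follows.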
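4.2.3 Lemma. Let $\Gamma$ be a bounded element of $\mathcal{B}_{\mathbb{R}^N}$. Then, for each $\nu \in
S^{N-1}$, $S(\Gamma; \nu)$ is also a bounded element of $\mathcal{B}_{\mathbb{R}^N}$, $\mathrm{rad}\bigl(S(\Gamma; \nu)\bigr) \le \mathrm{rad}(\Gamma)$, and
$|S(\Gamma; \nu)| = |\Gamma|$. Finally, if $R : \mathbb{R}^N \to \mathbb{R}^N$ is a rotation for which $L(\nu)$ and $\Gamma$
are invariant (i.e., $R\bigl(L(\nu)\bigr) = L(\nu)$ and $R(\Gamma) = \Gamma$), then $RS(\Gamma; \nu) = S(\Gamma; \nu)$.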
PROOF: We begin with the observations that there is nothing to do when
$N = 1$ and that, because none of the quantities under consideration depends
on the particular choice of coordinate axes, we may and will assume not only
that $N > 1$ but also that $\nu = e_N \equiv (0, \dots, 0, 1)$. In particular, this means, by
Lemma 4.1.2, that
$$\xi \in \mathbb{R}^{N-1} \longmapsto f(\xi) \equiv \frac{1}{2} \int_{\mathbb{R}} \mathbf{1}_\Gamma\bigl((\xi, t)\bigr)\,dt \in [0, \infty)$$
is $\mathcal{B}_{\mathbb{R}^{N-1}}$-measurable; and therefore, by Exercise 4.1.8, because $S(\Gamma; e_N)$ is
equal to
$$\{(\xi, t) \in \mathbb{R}^{N-1} \times [0, \infty) : t < f(\xi)\} \cup \{(\xi, t) \in \mathbb{R}^{N-1} \times (-\infty, 0] : -t < f(\xi)\},$$
we know both that $S(\Gamma; e_N)$ is an element of $\mathcal{B}_{\mathbb{R}^N}$ and that

where, in the final step, we have applied Tonelli ' s Theorem.



We next turn to the proof that $\mathrm{rad}\bigl(S(\Gamma; e_N)\bigr)$ cannot be larger than $\mathrm{rad}(\Gamma)$;
and, in doing so, we will, without loss in generality, add the assumption that
$\Gamma$ is compact. Now suppose that $x, y \in S(\Gamma; e_N)$ are given, and choose $\xi, \eta \in
\mathbb{R}^{N-1}$ and $s, t \in \mathbb{R}$ so that $x = (\xi, s)$ and $y = (\eta, t)$. Next, set
$$M^\pm(x) = \pm\sup\{\tau : (\xi, \pm\tau) \in \Gamma\} \quad \text{and} \quad M^\pm(y) = \pm\sup\{\tau : (\eta, \pm\tau) \in \Gamma\},$$
and note that, because $\Gamma$ is compact, all four of the points $x^\pm \equiv \bigl(\xi, M^\pm(x)\bigr)$
and $y^\pm = \bigl(\eta, M^\pm(y)\bigr)$ are elements of $\Gamma$. Moreover, $2|s| \le M^+(x) - M^-(x)$
and $2|t| \le M^+(y) - M^-(y)$; and therefore

In particular, this means that
$$|y - x|^2 = |\eta - \xi|^2 + |t - s|^2 \le |\eta - \xi|^2 + \bigl(|s| + |t|\bigr)^2$$
$$\le |\eta - \xi|^2 + \Bigl( \bigl(M^+(y) - M^-(x)\bigr) \vee \bigl(M^+(x) - M^-(y)\bigr) \Bigr)^2$$
$$\le \bigl( |y^+ - x^-| \vee |x^+ - y^-| \bigr)^2 \le 4\,\mathrm{rad}(\Gamma)^2.$$
Finally, let $R$ be a rotation. It is then an easy matter to check that $P(R\nu) =
R\bigl(P(\nu)\bigr)$ and that $\ell(R\Gamma; R\nu, Rx) = \ell(\Gamma; \nu, x)$ for all $x \in P(\nu)$. Hence,
$S(R\Gamma, R\nu) = RS(\Gamma, \nu)$. In particular, if $\Gamma = R\Gamma$ and $L(\nu) = R\bigl(L(\nu)\bigr)$, then
$R\nu = \pm\nu$, and so the preceding (together with (4.2.2)) leads to $RS(\Gamma, \nu) =
S(\Gamma, \nu)$. □
4.2.4 Theorem. The inequality in (4.2.1) holds for every bounded $\Gamma \subseteq \mathbb{R}^N$.
PROOF: Clearly it suffices to treat the case when $\Gamma$ is compact and therefore
Borel measurable. Thus, let a compact $\Gamma$ be given, choose an orthonormal basis
$\{e_1, \dots, e_N\}$ for $\mathbb{R}^N$, set $\Gamma_0 = \Gamma$, and define $\Gamma_n = S(\Gamma_{n-1}; e_n)$ for $1 \le n \le N$.
By repeated application of (4.2.2) and Lemma 4.2.3, we know that $|\Gamma_N| = |\Gamma|$,
$\mathrm{rad}(\Gamma_N) \le \mathrm{rad}(\Gamma)$, and that $R_m \Gamma_n = \Gamma_n$, $1 \le m \le n \le N$, where $R_m$ is the
rotation given by $R_m x = x - 2(x, e_m)_{\mathbb{R}^N} e_m$ for each $x \in \mathbb{R}^N$. In particular,
this means that $R_m \Gamma_N = \Gamma_N$ for all $1 \le m \le N$, hence $-\Gamma_N = \Gamma_N$, and
therefore (cf. the discussion preceding the introduction of Steiner's procedure)

We will now use (4.2.1) to give a description, due to F. Hausdorff, of Lebesgue's measure which, as distinguished from the one given at the beginning of Section 2.1, is completely coordinate free. Namely, we are going to show that, for all $\Gamma \subseteq \mathbb{R}^N$,
\[ |\Gamma|_e = H^N(\Gamma) \equiv \inf\Bigl\{ \sum_{C \in \mathcal{C}} \Omega_N \operatorname{rad}(C)^N : \mathcal{C} \text{ a countable cover of } \Gamma \Bigr\}. \tag{4.2.5} \]
We emphasize that we have placed no restriction on the sets $C$ making up the cover $\mathcal{C}$. On the other hand, it should be clear that $H^N(\Gamma)$ would be unchanged if we were to restrict ourselves to coverings by closed sets or, for that matter, to coverings by open sets.
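To get a feeling for the quantity $\Omega_N \operatorname{rad}(C)^N$ appearing in (4.2.5), one can compare it with the volume of $C$ when $C$ is a cube. The sketch below assumes the closed form $\Omega_N = \pi^{N/2} / \Gamma(\tfrac{N}{2} + 1)$ (the formula of Exercise 5.2.6) and that $\operatorname{rad}$ is half the diameter; the function names are ours.

```python
from math import gamma, pi, sqrt

def unit_ball_volume(n):
    # Omega_N = pi^{N/2} / Gamma(N/2 + 1)
    return pi ** (n / 2) / gamma(n / 2 + 1)

def cube_volume_and_bound(n, side=1.0):
    """Return (|Q|, Omega_N * rad(Q)^N) for a cube Q of the given side length in R^N."""
    rad = side * sqrt(n) / 2          # half of the diagonal of the cube
    return side ** n, unit_ball_volume(n) * rad ** n

table = {n: cube_volume_and_bound(n) for n in range(1, 7)}
for n, (volume, bound) in table.items():
    print(n, volume, bound)
```

In dimension one the two numbers agree, since an interval is a ball; in higher dimensions the cube is strictly smaller than the ball of the same radius, which is why covers by cubes alone do not give the sharp value of $H^N$.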
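Directly from its definition, one sees that $H^N$ is monotone and subadditive in the sense that
\[ H^N(\Gamma_1) \le H^N(\Gamma_2) \quad\text{if } \Gamma_1 \subseteq \Gamma_2 \]
and
\[ H^N\Bigl(\bigcup_{n=1}^{\infty} \Gamma_n\Bigr) \le \sum_{n=1}^{\infty} H^N(\Gamma_n) \quad\text{for all } \{\Gamma_n\}_1^{\infty} \subseteq \mathcal{P}(\mathbb{R}^N). \]
Indeed, the first of these is completely trivial, and the second follows by choosing, for a given $\epsilon > 0$, $\{\mathcal{C}_m\}_1^{\infty}$ so that
\[ \Gamma_m \subseteq \bigcup \mathcal{C}_m \quad\text{and}\quad \sum_{C \in \mathcal{C}_m} \Omega_N \bigl(\operatorname{rad}(C)\bigr)^N \le H^N(\Gamma_m) + 2^{-m} \epsilon \]
and noting that
\[ H^N\Bigl(\bigcup_{m=1}^{\infty} \Gamma_m\Bigr) \le \sum_{m=1}^{\infty} \sum_{C \in \mathcal{C}_m} \Omega_N \bigl(\operatorname{rad}(C)\bigr)^N \le \sum_{m=1}^{\infty} H^N(\Gamma_m) + \epsilon . \]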
Moreover, because
\[ |\Gamma|_e \le \sum_{C \in \mathcal{C}} |C|_e \quad\text{for any countable cover } \mathcal{C}, \]
the inequality $|\Gamma|_e \le H^N(\Gamma)$ is an essentially trivial consequence of (4.2.1). In order to prove the opposite inequality, we will use the following lemma.
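4.2.6 Lemma. For any open set $G$ in $\mathbb{R}^N$ with $|G| < \infty$, there exists a sequence $\{B_n\}_1^{\infty}$ of mutually disjoint closed balls contained in $G$ with the property that
\[ \Bigl| G \setminus \bigcup_{n=1}^{\infty} B_n \Bigr| = 0 . \tag{4.2.7} \]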
PROOF: If $G = \emptyset$, there is nothing to do. Thus, assume that $G \ne \emptyset$, and set $G_0 = G$. Using Lemma 2.1.9, choose a countable, exact cover $\mathcal{C}_0$ of $G_0$ by non-overlapping cubes $Q$. Next, given $Q \in \mathcal{C}_0$, choose $x \in \mathbb{R}^N$ and $\delta \in [0, \infty)$ so that
\[ Q = \prod_{i=1}^{N} [x_i - \delta, x_i + \delta] \quad\text{and set}\quad B_Q = \overline{B\bigl(x, \tfrac{\delta}{2}\bigr)} . \]
Clearly, the $B_Q$'s are mutually disjoint closed balls. At the same time, there is a dimensional constant $\alpha_N \in (0, 1)$ for which $|B_Q| \ge \alpha_N |Q|$; and therefore we can choose a finite subset $\{B_{0,1}, \dots, B_{0,n_0}\} \subseteq \{B_Q : Q \in \mathcal{C}_0\}$ in such a way that
\[ \Bigl| G_0 \setminus \bigcup_{m=1}^{n_0} B_{0,m} \Bigr| \le \beta_N |G_0| \quad\text{where } \beta_N \equiv 1 - \frac{\alpha_N}{2} \in (0, 1). \]
Now set $G_1 = G_0 \setminus \bigcup_{m=1}^{n_0} B_{0,m}$. Noting that $G_1$ is again non-empty and open, we can repeat the preceding argument to find a finite collection of mutually disjoint closed balls $B_{1,m} \subseteq G_1$, $1 \le m \le n_1$, in such a way that
\[ \Bigl| G_1 \setminus \bigcup_{m=1}^{n_1} B_{1,m} \Bigr| \le \beta_N |G_1| . \]
More generally, we can use induction on $\ell \in \mathbb{Z}^+$ to construct open sets $G_\ell \subseteq G_{\ell-1}$ and finite collections $B_{\ell,1}, \dots, B_{\ell,n_\ell}$ of mutually disjoint closed balls $B \subseteq G_\ell$ so that
\[ |G_{\ell+1}| \le \beta_N |G_\ell| \quad\text{where } G_{\ell+1} = G_\ell \setminus \bigcup_{m=1}^{n_\ell} B_{\ell,m} ; \]
clearly the collection
\[ \{ B_{\ell,m} : \ell \in \mathbb{N} \text{ and } 1 \le m \le n_\ell \} \]
has the required properties. □
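For the reader who wants the last step spelled out, here is a short worked version of why the collection has measure-theoretically exhausted $G$; it uses only the recursion $G_{\ell+1} = G_\ell \setminus \bigcup_m B_{\ell,m}$ and the bound $|G_{\ell+1}| \le \beta_N |G_\ell|$ from the proof, with the index $L$ introduced here for illustration.

```latex
\[
  G \setminus \bigcup_{\ell=0}^{\infty} \bigcup_{m=1}^{n_\ell} B_{\ell,m}
  \;\subseteq\;
  G \setminus \bigcup_{\ell=0}^{L} \bigcup_{m=1}^{n_\ell} B_{\ell,m}
  \;=\; G_{L+1}
  \qquad\text{for every } L \ge 0,
\]
\[
  \text{and}\qquad |G_{L+1}| \;\le\; \beta_N^{\,L+1}\, |G| \;\longrightarrow\; 0
  \quad\text{as } L \to \infty,
  \qquad\text{because } \beta_N \in (0,1) \text{ and } |G| < \infty .
\]
```

Disjointness across different values of $\ell$ is automatic, since each $B_{\ell,m}$ lies in $G_\ell$, which was obtained by removing all the balls chosen at earlier stages.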
4.2.8 Theorem. The equality in (4.2.5) holds for any set $\Gamma \subseteq \mathbb{R}^N$.
PROOF: As we have already pointed out, the inequality $|\Gamma|_e \le H^N(\Gamma)$ is an immediate consequence of (4.2.1). To get the opposite inequality, first observe that $H^N(\Gamma) = 0$ if $|\Gamma|_e = 0$. Indeed, if $|\Gamma|_e = 0$, then, for each $\epsilon > 0$ we can first find an open $G \supseteq \Gamma$ with $|G| < \epsilon$ and then, by Lemma 2.1.9, a countable, exact cover $\mathcal{C}$ of $G$ by non-overlapping cubes $Q$, which means that
\[ H^N(\Gamma) \le \sum_{Q \in \mathcal{C}} \Omega_N \operatorname{rad}(Q)^N = 2^{-N} N^{N/2}\, \Omega_N \sum_{Q \in \mathcal{C}} |Q| = 2^{-N} N^{N/2}\, \Omega_N |G| < 2^{-N} N^{N/2}\, \Omega_N\, \epsilon . \]
Next, because $H^N$ is countably subadditive, it suffices to prove that $H^N(\Gamma) \le |\Gamma|_e$ for bounded sets $\Gamma$. Finally, suppose that $\Gamma$ is a bounded set, and let $G$ be any open superset of $\Gamma$ with $|G| < \infty$. By Lemma 4.2.6, we can find a sequence $\{B_n\}_1^{\infty}$ of mutually disjoint closed balls $B \subseteq G$ for which
\[ |G \setminus A| = 0 \quad\text{where } A = \bigcup_{n=1}^{\infty} B_n . \]
Hence, because $H^N(\Gamma) \le H^N(G) \le H^N(A) + H^N(G \setminus A) = H^N(A)$, we see that
\[ H^N(\Gamma) \le \sum_{n=1}^{\infty} \Omega_N \operatorname{rad}(B_n)^N = \sum_{n=1}^{\infty} |B_n| = |A| = |G| \]
for every open $G \supseteq \Gamma$; and, after taking the infimum over such $G$'s, we arrive at the desired conclusion. □

Exercises

4.2.9 Exercise: Using the definition of $H^N$ given by the second relation in (4.2.5), give a direct (i.e., one which does not make use of the first relation in (4.2.5)) proof that $H^N(\Gamma) = 0$ for every bounded subset $\Gamma$ of a hyperplane (cf. (i) in Exercise 2.2.3) of $\mathbb{R}^N$.
Chapter V
Changes of Variable

5.0 Introduction.
We have now developed the basic theory of Lebesgue integration. However,
thus far we have nearly no tools with which to compute the integrals which
we have shown to exist. The purpose of the present chapter is to introduce
a technique which often makes evaluation, or at least estimation, possible.
The technique is that of changing variables. In this introduction, we describe
the technique in complete generality. In ensuing sections we will give some
examples of its applications.
Let $(E_1, \mathcal{B}_1)$ and $(E_2, \mathcal{B}_2)$ be a pair of measurable spaces. Given a measure $\mu$ on $(E_1, \mathcal{B}_1)$ and a measurable map $\Phi$ on $(E_1, \mathcal{B}_1)$ into $(E_2, \mathcal{B}_2)$, we define the pushforward or image $\Phi_*\mu = \mu \circ \Phi^{-1}$ of $\mu$ under $\Phi$ by $\Phi_*\mu(\Gamma) = \mu\bigl(\Phi^{-1}(\Gamma)\bigr)$ for $\Gamma \in \mathcal{B}_2$. Because set theoretic operations are preserved by inverse maps, it is an easy matter to check that $\Phi_*\mu$ is a measure on $(E_2, \mathcal{B}_2)$.
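5.0.1 Lemma. For every non-negative measurable function $\varphi$ on $(E_2, \mathcal{B}_2)$,
\[ \int_{E_2} \varphi \, d(\Phi_*\mu) = \int_{E_1} \varphi \circ \Phi \, d\mu . \tag{5.0.2} \]
Moreover, $\varphi \in L^1(E_2, \mathcal{B}_2, \Phi_*\mu)$ if and only if $\varphi \circ \Phi \in L^1(E_1, \mathcal{B}_1, \mu)$, and (5.0.2) holds for all $\varphi \in L^1(E_2, \mathcal{B}_2, \Phi_*\mu)$.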
PROOF: Clearly it suffices to prove the first assertion. To this end, note that (5.0.2) holds, by definition, when $\varphi$ is the indicator of a set $\Gamma \in \mathcal{B}_2$. Hence, it also holds when $\varphi$ is a non-negative measurable simple function on $(E_2, \mathcal{B}_2)$. Thus, by the Monotone Convergence Theorem, it must hold for all non-negative measurable functions on $(E_2, \mathcal{B}_2)$. □
The reader should note that Lemma 5.0.1 is really a definition more than it is an honest theorem. It is only when a judicious choice of $\Phi$ has been made that one gets anything useful from (5.0.2).
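On a finite set every measure is a table of point masses, and (5.0.2) becomes a regrouping of a finite sum. The following sketch illustrates this with exact rational arithmetic; the particular measure `mu`, map `Phi`, and integrand `phi` are arbitrary choices of ours, not objects from the text.

```python
from collections import defaultdict
from fractions import Fraction

def pushforward(mu, Phi):
    """Return Phi_* mu for a measure mu given as {point: mass} on a finite set."""
    nu = defaultdict(Fraction)
    for x, mass in mu.items():
        nu[Phi(x)] += mass            # nu({y}) = mu(Phi^{-1}({y}))
    return dict(nu)

def integrate(g, measure):
    """Integral of g with respect to a finite measure given as {point: mass}."""
    return sum(g(x) * mass for x, mass in measure.items())

mu = {-2: Fraction(1, 4), -1: Fraction(1, 4), 1: Fraction(1, 3), 2: Fraction(1, 6)}
Phi = lambda x: x * x                 # a measurable map, here not one-to-one
phi = lambda y: 3 * y + 1             # a non-negative function on the image

nu = pushforward(mu, Phi)
lhs = integrate(phi, nu)                      # integral of phi against Phi_* mu
rhs = integrate(lambda x: phi(Phi(x)), mu)    # integral of phi o Phi against mu
print(nu, lhs, rhs)
```

Note that `Phi` is deliberately not injective: the pushforward simply adds up the masses of all points with the same image.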
Exercises

5.0.3 Exercise: Referring to Theorem 2.2.2, let $A$ be a non-singular $N \times N$ matrix and $T_A$ the associated linear transformation on $\mathbb{R}^N$. Show that
\[ (T_A)_* \lambda_{\mathbb{R}^N} = |\det(A)|^{-1} \lambda_{\mathbb{R}^N} . \]

5.1 Lebesgue Integrals vs. Riemann Integrals.

Our first important example of a change of variables will relate integrals over an arbitrary measure space to integrals on the real line. Namely, given a measurable $\mathbb{R}$-valued function $f$ on a measure space $(E, \mathcal{B}, \mu)$, define the distribution of $f$ under $\mu$ to be the measure $\mu_f \equiv f_*\mu$ on $(\mathbb{R}, \mathcal{B}_{\mathbb{R}})$. We then have that, for any non-negative measurable $\varphi$ on $(\mathbb{R}, \mathcal{B}_{\mathbb{R}})$,
\[ \int_E \varphi \circ f(x)\, \mu(dx) = \int_{\mathbb{R}} \varphi(t)\, \mu_f(dt) . \tag{5.1.1} \]
The reason why it is often useful to make this change of variables is that the integral on the right hand side of (5.1.1) can often be evaluated as the limit of Riemann integrals to which all the fundamental facts of the calculus are applicable.
In order to see how the right hand side of (5.1.1) leads us to Riemann integrals, we will prove a general fact about the relationship between Lebesgue and Riemann integrals on the line. Perhaps the most interesting feature of this result is that it shows that a complete description of the class of Riemann integrable functions in terms of continuity properties defies a totally Riemannian solution and requires the Lebesgue notion of almost everywhere.
5.1.2 Theorem. Let $\nu$ be a finite measure on $\bigl((a, b], \mathcal{B}_{(a,b]}\bigr)$, where $-\infty < a < b < \infty$, and set $\psi(t) = \nu((a, t])$ for $t \in [a, b]$ ($\psi(a) = \nu(\emptyset) = 0$). Then, $\psi$ is right-continuous on $[a, b)$, non-decreasing on $[a, b]$, $\psi(a) = 0$, and, for each $t \in (a, b]$, $\psi(t) - \psi(t-) = \nu(\{t\})$, where $\psi(t-) \equiv \lim_{s \nearrow t} \psi(s)$ is the left-limit of $\psi$ at $t$. Furthermore, if $\varphi$ is a bounded function on $[a, b]$, then $\varphi$ is Riemann integrable on $[a, b]$ with respect to $\psi$ if and only if $\varphi$ is continuous (a.e., $\nu$) on $(a, b]$; in which case, $\varphi$ is measurable on $\bigl((a, b], \overline{\mathcal{B}_{(a,b]}}^{\,\nu}\bigr)$ and
\[ \int_{(a,b]} \varphi \, d\nu = (R)\int_{[a,b]} \varphi(t)\, d\psi(t) . \tag{5.1.3} \]
(See Exercise 7.1.31 to learn how to go from a right-continuous, non-decreasing $\psi$ to a measure $\mu$.)
PROOF: It will be convenient to think of $\nu$ as being defined on $\bigl([a, b], \mathcal{B}_{[a,b]}\bigr)$ by $\nu(\Gamma) \equiv \nu\bigl(\Gamma \cap (a, b]\bigr)$ for $\Gamma \in \mathcal{B}_{[a,b]}$. Thus, we will do so.

Obviously $\psi$ is non-decreasing on $[a, b]$; and therefore (cf. Lemma 1.2.20) $\psi$ has at most countably many points of discontinuity. Moreover, $\psi(a) = \nu(\emptyset) = 0$, for each $s \in [a, b)$ $\psi(s) = \nu((a, s]) = \lim_{t \searrow s} \nu((a, t]) = \lim_{t \searrow s} \psi(t)$, and for each $t \in (a, b]$ $\psi(t) - \psi(t-) = \lim_{s \nearrow t} \nu((s, t]) = \nu(\{t\})$.
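Assume that $\varphi$ is Riemann integrable on $[a, b]$ with respect to $\psi$. To see that $\varphi$ is continuous (a.e., $\nu$) on $(a, b]$, choose, for each $n \ge 1$, a finite, non-overlapping, exact cover $\mathcal{C}_n$ of $[a, b]$ by intervals $I$ such that $\|\mathcal{C}_n\| < \frac{1}{n}$ and $\psi$ is continuous at $I^-$ for every $I \in \mathcal{C}_n$. If $\Delta = \bigcup_{n=1}^{\infty} \{I^- : I \in \mathcal{C}_n\}$, then $\nu(\Delta) = 0$. Given $m \ge 1$, let $\mathcal{C}_{m,n}$ be the set of those $I \in \mathcal{C}_n$ such that $\sup_I \varphi - \inf_I \varphi \ge \frac{1}{m}$. It is then easy to check that
\[ \{ t \in (a, b] \setminus \Delta : \varphi \text{ is not continuous at } t \} \subseteq \bigcup_{m=1}^{\infty} \bigcap_{n=1}^{\infty} \bigcup \mathcal{C}_{m,n} . \]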
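But, by Exercise 1.2.26,
\[ \lim_{n \to \infty} \sum_{I \in \mathcal{C}_{m,n}} \bigl( \psi(I^+) - \psi(I^-) \bigr) = 0 \quad\text{for each } m \ge 1, \]
and therefore $\nu\bigl( \bigcup_{m=1}^{\infty} \bigcap_{n=1}^{\infty} \bigcup \mathcal{C}_{m,n} \bigr) = 0$. Hence, we have now shown that $\varphi$ is continuous (a.e., $\nu$) on $(a, b]$.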


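Conversely, suppose that $\varphi$ is continuous (a.e., $\nu$) on $(a, b]$. Let $\{\mathcal{C}_n\}_1^{\infty}$ be a sequence of finite, non-overlapping, exact covers of $[a, b]$ by intervals $I$ such that $\|\mathcal{C}_n\| \to 0$. For each $n \ge 1$, define $\overline{\varphi}_n(t) = \sup_I \varphi$ and $\underline{\varphi}_n(t) = \inf_I \varphi$ for $t \in I \setminus \{I^-\}$ and $I \in \mathcal{C}_n$. Clearly, both $\overline{\varphi}_n$ and $\underline{\varphi}_n$ are measurable on $\bigl((a, b], \mathcal{B}_{(a,b]}\bigr)$. Moreover,
\[ \inf_{(a,b]} \varphi \le \underline{\varphi}_n \le \varphi \le \overline{\varphi}_n \le \sup_{(a,b]} \varphi \]
for all $n \ge 1$. Finally, since $\varphi$ is continuous (a.e., $\nu$),
\[ \varphi = \lim_{n \to \infty} \underline{\varphi}_n = \lim_{n \to \infty} \overline{\varphi}_n \quad (\text{a.e., } \nu); \]
and so, not only is $\varphi = \lim_{n \to \infty} \underline{\varphi}_n$ (a.e., $\nu$) and therefore measurable on $\bigl((a, b], \overline{\mathcal{B}_{(a,b]}}^{\,\nu}\bigr)$, but also
\[ \lim_{n \to \infty} \int_{(a,b]} \underline{\varphi}_n \, d\nu = \int_{(a,b]} \varphi \, d\nu = \lim_{n \to \infty} \int_{(a,b]} \overline{\varphi}_n \, d\nu . \]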
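In particular, we conclude that
\[ \sum_{I \in \mathcal{C}_n} \Bigl( \sup_I \varphi - \inf_I \varphi \Bigr)\, \nu\bigl( I \setminus \{I^-\} \bigr) = \int_{(a,b]} \bigl( \overline{\varphi}_n - \underline{\varphi}_n \bigr)\, d\nu \longrightarrow 0 \]
as $n \to \infty$. From this it is clear both that $\varphi$ is Riemann integrable on $[a, b]$ with respect to $\psi$ and that (5.1.3) holds. □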
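Formula (5.1.3) can be watched at work numerically when $\nu$ has both a continuous part and an atom. The sketch below is an illustration under our own assumptions: $\nu$ is Lebesgue measure on $(0, 1]$ plus a point mass of size 2 at $\tfrac12$, $\varphi(t) = t^2$, and the Riemann-Stieltjes sums use a uniform partition with either left or right tags.

```python
def psi(t):
    # psi(t) = nu((0, t]) for nu = Lebesgue measure on (0, 1] plus an atom of mass 2 at 1/2
    return t + (2.0 if t >= 0.5 else 0.0)

def phi(t):
    return t * t

def riemann_stieltjes(phi, psi, a, b, n, tag="right"):
    """Riemann-Stieltjes sum of phi with respect to psi over n equal subintervals of [a, b]."""
    total = 0.0
    for k in range(1, n + 1):
        left = a + (b - a) * (k - 1) / n
        right = a + (b - a) * k / n
        point = right if tag == "right" else left
        total += phi(point) * (psi(right) - psi(left))
    return total

# Integral of phi with respect to nu: Lebesgue part plus atom part.
exact = 1.0 / 3.0 + 2.0 * phi(0.5)
approx_right = riemann_stieltjes(phi, psi, 0.0, 1.0, 1000, "right")
approx_left = riemann_stieltjes(phi, psi, 0.0, 1.0, 1000, "left")
print(exact, approx_right, approx_left)
```

Both choices of tags approach the same value because $\varphi$ is continuous at the atom; replacing $\varphi$ by a function that jumps at $\tfrac12$ would make the two sums disagree in the limit, in line with the theorem.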
We are now ready to prove the main result to which this section is devoted.
5.1.4 Theorem. Let $(E, \mathcal{B}, \mu)$ be a measure space and $f$ a non-negative, measurable function on $(E, \mathcal{B})$. Then $t \in (0, \infty) \longmapsto \mu(f > t) \in [0, \infty]$ is a right-continuous, non-increasing function. In particular, it is measurable on $\bigl((0, \infty), \mathcal{B}_{(0,\infty)}\bigr)$ and has at most a countable number of discontinuities. Next, assume that $\varphi \in C([0, \infty)) \cap C^1((0, \infty))$ is a non-decreasing function satisfying $\varphi(0) = 0 < \varphi(t)$, $t > 0$, and set $\varphi(\infty) = \lim_{t \to \infty} \varphi(t)$. Then
\[ \int_E \varphi \circ f(x)\, \mu(dx) = \int_{(0,\infty)} \varphi'(t)\, \mu(f > t)\, \lambda_{\mathbb{R}}(dt) . \tag{5.1.5} \]
Hence, either $\mu(f > \delta) = \infty$ for some $\delta > 0$, in which case both sides of (5.1.5) are infinite, or for each $0 < \delta < r < \infty$ the map $t \in [\delta, r] \longmapsto \varphi'(t)\mu(f > t)$ is Riemann integrable and
\[ \int_E \varphi \circ f(x)\, \mu(dx) = \lim_{\substack{\delta \searrow 0 \\ r \nearrow \infty}} (R)\int_{[\delta, r]} \varphi'(t)\, \mu(f > t)\, dt . \]
PROOF: It is clear that $t \in (0, \infty) \longmapsto \mu(f > t)$ is right-continuous and non-increasing. Hence, if $\delta = \sup\{ t \in (0, \infty) : \mu(f > t) = \infty \}$, then $\mu(f > t) = \infty$ for $t \in (0, \delta)$ and $t \in (\delta, \infty) \longmapsto \mu(f > t)$ has at most a countable number of discontinuities. Furthermore, if
\[ h_n(t) = \mu\bigl( f > \tfrac{k+1}{n} \bigr) \quad\text{for } t \in \bigl( \tfrac{k}{n}, \tfrac{k+1}{n} \bigr],\ k \ge 0, \text{ and } n \ge 1, \]
then each $h_n$ is clearly measurable on $\bigl((0, \infty), \mathcal{B}_{(0,\infty)}\bigr)$ and $h_n(t) \to \mu(f > t)$ for each $t \in (0, \infty)$. Hence, $t \in (0, \infty) \longmapsto \mu(f > t)$ is measurable on $\bigl((0, \infty), \mathcal{B}_{(0,\infty)}\bigr)$.
We turn next to the proof of (5.1.5). Since (cf. Exercise 3.3.18)
\[ \lim_{\alpha \searrow 0} \int_{(\alpha, \delta]} \varphi'(t)\, dt = \lim_{\alpha \searrow 0} (R)\int_{\alpha}^{\delta} \varphi'(t)\, dt = \varphi(\delta) \]
and therefore
\[ \Bigl( \int_E \varphi \circ f \, d\mu \Bigr) \wedge \Bigl( \int_{(0,\infty)} \varphi'(t)\, \mu(f > t)\, \lambda_{\mathbb{R}}(dt) \Bigr) \ge \varphi(\delta)\, \mu(f > \delta), \]
it is clear that both sides of (5.1.5) are infinite when $\mu(f > \delta) = \infty$ for some $\delta > 0$. Thus we will assume that $\mu(f > \delta) < \infty$ for every $\delta > 0$. Then the restriction of $\mu_f$ to $\mathcal{B}_{[\delta, \infty)}$ is a finite measure for every $\delta > 0$. Given $\delta > 0$, set
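$\psi_\delta(t) = \mu_f((\delta, t])$ for $t \in [\delta, \infty)$ and apply Theorem 5.1.2 and Theorem 1.2.7 to see that
\begin{align*}
\int_{\{\delta < f \le r\}} \varphi \circ f \, d\mu &= \int_{(\delta, r]} \varphi \, d\mu_f = (R)\int_{[\delta, r]} \varphi(t)\, d\psi_\delta(t) \\
&= \varphi(r)\psi_\delta(r) - (R)\int_{[\delta, r]} \psi_\delta(t)\, \varphi'(t)\, dt \\
&= \varphi(\delta)\psi_\delta(r) + (R)\int_{[\delta, r]} \bigl( \psi_\delta(r) - \psi_\delta(t) \bigr) \varphi'(t)\, dt \\
&= \varphi(\delta)\, \mu(\delta < f \le r) + (R)\int_{[\delta, r]} \mu(t < f \le r)\, \varphi'(t)\, dt \\
&= \varphi(\delta)\, \mu(\delta < f \le r) + \int_{(\delta, r)} \mu(t < f \le r)\, \varphi'(t)\, \lambda_{\mathbb{R}}(dt)
\end{align*}
for each $r \in (\delta, \infty)$. Hence, after simple arithmetic manipulation and an application of the Monotone Convergence Theorem, we get
\[ \int_{\{\delta < f < \infty\}} \bigl( \varphi \circ f - \varphi(\delta) \bigr)\, d\mu = \int_{(\delta, \infty)} \varphi'(t)\, \mu(t < f < \infty)\, \lambda_{\mathbb{R}}(dt) \]
after $r \nearrow \infty$. At the same time, it is clear that
\[ \int_{\{f = \infty\}} \bigl( \varphi \circ f - \varphi(\delta) \bigr)\, d\mu = \bigl[ \varphi(\infty) - \varphi(\delta) \bigr] \mu(f = \infty) = \int_{(\delta, \infty)} \mu(f = \infty)\, \varphi'(t)\, \lambda_{\mathbb{R}}(dt); \]
and, after combining these, we now arrive at
\[ \int_{\{f > \delta\}} \bigl( \varphi \circ f(x) - \varphi(\delta) \bigr)\, \mu(dx) = \int_{(\delta, \infty)} \mu(f > t)\, \varphi'(t)\, \lambda_{\mathbb{R}}(dt) . \]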
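Thus, (5.1.5) will be proved once we show that
\[ \lim_{\delta \searrow 0} \int_{\{f > \delta\}} \bigl( \varphi \circ f(x) - \varphi(\delta) \bigr)\, \mu(dx) = \int_E \varphi \circ f(x)\, \mu(dx) . \]
But if $0 < \delta_1 < \delta_2 < \infty$, then $0 \le (\varphi \circ f - \varphi(\delta_2)) \mathbf{1}_{\{f > \delta_2\}} \le (\varphi \circ f - \varphi(\delta_1)) \mathbf{1}_{\{f > \delta_1\}}$, and so the required convergence follows by the Monotone Convergence Theorem.
Finally, to prove the last part of the theorem when $\mu(f > \delta) < \infty$ for every $\delta > 0$, simply note that
\[ \int_{(0,\infty)} \varphi'(t)\, \mu(f > t)\, \lambda_{\mathbb{R}}(dt) = \lim_{\substack{\delta \searrow 0 \\ r \nearrow \infty}} \int_{(\delta, r]} \varphi'(t)\, \mu(f > t)\, \lambda_{\mathbb{R}}(dt), \]
and apply Theorem 5.1.2. □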
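When $\mu$ is a finite sum of point masses, $t \mapsto \mu(f > t)$ is a step function and both sides of (5.1.5) can be computed exactly. The sketch below does this with rational arithmetic; the data and the choice $\varphi(t) = t^3$ are ours, and the helper names are hypothetical.

```python
from fractions import Fraction

def lhs(phi, f_values, masses):
    """Integral of phi o f with respect to mu = sum of masses[i] * delta_{x_i}, f(x_i) = f_values[i]."""
    return sum(phi(v) * m for v, m in zip(f_values, masses))

def rhs(phi, f_values, masses):
    """Integral over (0, inf) of phi'(t) * mu(f > t) dt, computed exactly.

    Between consecutive levels lo < hi of f the tail mu(f > t) is constant,
    so the integral of phi' over (lo, hi) contributes (phi(hi) - phi(lo)) * mu(f > lo).
    """
    levels = sorted(set(f_values) | {0})
    total = Fraction(0)
    for lo, hi in zip(levels, levels[1:]):
        tail = sum(m for v, m in zip(f_values, masses) if v > lo)
        total += (phi(hi) - phi(lo)) * tail
    return total

f_values = [0, 1, 1, 2, 5]
masses = [Fraction(1, 2), Fraction(1, 3), 1, Fraction(1, 4), 2]
phi = lambda t: t ** 3                # continuous, non-decreasing, phi(0) = 0

left = lhs(phi, f_values, masses)
right = rhs(phi, f_values, masses)
print(left, right)
```

The point where $f = 0$ carries mass but contributes to neither side, which matches the role of the normalization $\varphi(0) = 0$ in the theorem.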



Exercises

5.1.6 Exercise: Here are two familiar applications of the ideas discussed in this section.
(i) Let $\psi$ be a continuous, non-decreasing function on the compact interval $[a, b]$. Show that
\[ (R)\int_{[a,b]} f \circ \psi(s)\, d\psi(s) = (R)\int_{[\psi(a), \psi(b)]} f(t)\, dt, \quad f \in C\bigl([\psi(a), \psi(b)]\bigr) . \tag{5.1.7} \]
(ii) Suppose that $\mu$ is a measure on $(\mathbb{R}, \mathcal{B}_{\mathbb{R}})$ with the properties that $\mu(I) < \infty$ for each compact interval $I$ and $\mu(\{t\}) = 0$ for each $t \in \mathbb{R}$. Let $\Phi : \mathbb{R} \to \mathbb{R}$ be a function satisfying $\mu([a, b]) = \Phi(b) - \Phi(a)$ for all $-\infty < a < b < \infty$. Note that $\Phi$ is necessarily continuous and non-decreasing, and show that $\Phi_*\mu$ coincides with the restriction of $\lambda_{\mathbb{R}}$ to $\mathcal{B}_{\mathbb{R}}[\Phi(\mathbb{R})]$.
5.1.8 Exercise: A particularly important case of Theorem 5.1.4 is when $\varphi(t) = t^p$ for some $p \in (0, \infty)$, in which case (5.1.5) yields
\[ \int_E |f|^p \, d\mu = p \int_{(0,\infty)} t^{p-1}\, \mu(|f| > t)\, dt . \tag{5.1.9} \]
Use (5.1.9) to show that $|f|^p$ is $\mu$-integrable if and only if
\[ \sum_{n=1}^{\infty} \frac{1}{n^{p+1}}\, \mu\bigl( |f| > \tfrac{1}{n} \bigr) + \sum_{n=1}^{\infty} n^{p-1}\, \mu(|f| > n) < \infty . \]
Compare this result to the one obtained in the last part of Exercise 3.3.16.
5.2 Polar Coordinates.

From now on, at least whenever the meaning is clear from the context, we will use the notation "$dx$" instead of the more cumbersome "$\lambda_{\mathbb{R}^N}(dx)$" when doing Lebesgue integration with respect to Lebesgue's measure on $\mathbb{R}^N$.
In this section, we examine a change of variables which plays an extremely important role in the evaluation of many Lebesgue integrals over $\mathbb{R}^N$. Let $S^{N-1}$ denote the unit $(N-1)$-sphere $\{x \in \mathbb{R}^N : |x| = 1\}$ in $\mathbb{R}^N$, and define $\Phi : \mathbb{R}^N \setminus \{0\} \to S^{N-1}$ by $\Phi(x) = \frac{x}{|x|}$. Clearly $\Phi$ is continuous. Next, define the surface measure $\lambda_{S^{N-1}}$ on $S^{N-1}$ to be the image under $\Phi$ of $N \lambda_{\mathbb{R}^N}$ restricted to $\mathcal{B}_{B(0,1) \setminus \{0\}}$. Noting that $\Phi(rx) = \Phi(x)$ for all $r > 0$ and $x \in \mathbb{R}^N \setminus \{0\}$, we conclude from Exercise 5.0.3 that
\[ \int_{B(0,r) \setminus \{0\}} f \circ \Phi(x)\, dx = r^N \int_{B(0,1) \setminus \{0\}} f \circ \Phi(x)\, dx \]
and therefore that
\[ \int_{B(0,r) \setminus \{0\}} f \circ \Phi(x)\, dx = \frac{r^N}{N} \int_{S^{N-1}} f(\omega)\, \lambda_{S^{N-1}}(d\omega) \tag{5.2.1} \]
for every non-negative measurable $f$ on $(S^{N-1}, \mathcal{B}_{S^{N-1}})$. In particular, using $\omega_{N-1} \equiv \lambda_{S^{N-1}}(S^{N-1})$ to denote the surface area of $S^{N-1}$, we have that $\frac{\omega_{N-1}}{N}$ is the volume $\Omega_N$ of the unit ball $B(0, 1)$ in $\mathbb{R}^N$.
Next, define $\Psi : (0, \infty) \times S^{N-1} \to \mathbb{R}^N \setminus \{0\}$ by $\Psi(r, \omega) = r\omega$. Note that $\Psi$ is one-to-one and onto; the pair $(r, \omega) \equiv \Psi^{-1}(x) = (|x|, \Phi(x))$ are called the polar coordinates of the point $x \in \mathbb{R}^N \setminus \{0\}$. Finally, define the measure $R_N$ on $\bigl((0, \infty), \mathcal{B}_{(0,\infty)}\bigr)$ by $R_N(\Gamma) = \int_{\Gamma} r^{N-1}\, dr$.
The importance of these considerations is contained in the following result.
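5.2.2 Theorem. Referring to the preceding, one has that
\[ \lambda_{\mathbb{R}^N} = \Psi_*\bigl( R_N \times \lambda_{S^{N-1}} \bigr) \quad\text{on } \mathcal{B}_{\mathbb{R}^N \setminus \{0\}} . \]
In particular, if $f$ is a non-negative, measurable function on $(\mathbb{R}^N, \mathcal{B}_{\mathbb{R}^N})$, then
\begin{align*}
\int_{\mathbb{R}^N} f(x)\, dx &= \int_{(0,\infty)} r^{N-1} \Bigl( \int_{S^{N-1}} f(r\omega)\, \lambda_{S^{N-1}}(d\omega) \Bigr) dr \\
&= \int_{S^{N-1}} \Bigl( \int_{(0,\infty)} f(r\omega)\, r^{N-1}\, dr \Bigr) \lambda_{S^{N-1}}(d\omega) . \tag{5.2.3}
\end{align*}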
PROOF: By Exercise 3.3.22 and Theorem 4.1.5, all that we have to do is check that the first equation in (5.2.3) holds for every
\[ f \in C_c(\mathbb{R}^N) \equiv \{ f \in C(\mathbb{R}^N) : f \equiv 0 \text{ off of some compact set} \} . \]
To this end, let $f \in C_c(\mathbb{R}^N)$ be given and set $F(r) = \int_{B(0,r)} f(x)\, dx$ for $r > 0$. Then, by (5.2.1), for all $r, h > 0$:
\begin{align*}
F(r + h) - F(r) &= \int_{B(0,r+h) \setminus B(0,r)} f(x)\, dx \\
&= \int_{B(0,r+h) \setminus B(0,r)} f \circ \Psi\bigl(r, \Phi(x)\bigr)\, dx + \int_{B(0,r+h) \setminus B(0,r)} \Bigl( f(x) - f \circ \Psi\bigl(r, \Phi(x)\bigr) \Bigr)\, dx \\
&= \frac{(r + h)^N - r^N}{N} \int_{S^{N-1}} f \circ \Psi(r, \omega)\, \lambda_{S^{N-1}}(d\omega) + o(h),
\end{align*}
where "$o(h)$" denotes a function which tends to 0 faster than $h$. Hence, $F$ is continuously differentiable on $(0, \infty)$ and its derivative at $r \in (0, \infty)$ is given by $r^{N-1} \int_{S^{N-1}} f \circ \Psi(r, \omega)\, \lambda_{S^{N-1}}(d\omega)$. Since $F(r) \to 0$ as $r \searrow 0$, the desired result now follows from Theorem 5.1.2 and the Fundamental Theorem of Calculus. □

Exercises

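5.2.4 Exercise: In this exercise we discuss a few elementary properties of $\lambda_{S^{N-1}}$.
(i) Show that if $\Gamma \ne \emptyset$ is an open subset of $S^{N-1}$, then $\lambda_{S^{N-1}}(\Gamma) > 0$. Next, show that $\lambda_{S^{N-1}}$ is rotation invariant. That is, show that if $O$ is an $N \times N$-orthogonal matrix and $T_O$ is the associated transformation on $\mathbb{R}^N$ (cf. the paragraph preceding Theorem 2.2.2), then $(T_O)_* \lambda_{S^{N-1}} = \lambda_{S^{N-1}}$. Finally, use this fact to show that
\[ \int_{S^{N-1}} (\xi, \omega)_{\mathbb{R}^N}\, \lambda_{S^{N-1}}(d\omega) = 0 \tag{5.2.5(i)} \]
and
\[ \int_{S^{N-1}} (\xi, \omega)_{\mathbb{R}^N} (\eta, \omega)_{\mathbb{R}^N}\, \lambda_{S^{N-1}}(d\omega) = \Omega_N (\xi, \eta)_{\mathbb{R}^N} \tag{5.2.5(ii)} \]
for any $\xi, \eta \in \mathbb{R}^N$.
Hint: In proving these, let $\xi \in \mathbb{R}^N \setminus \{0\}$ be given and consider the rotation $O_\xi$ which sends $\xi$ to $-\xi$ but acts as the identity on the orthogonal complement of $\xi$.
(ii) Define $\Phi : [0, 2\pi] \to S^1$ by $\Phi(\theta) = \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix}$, and set $\mu = \Phi_* \lambda_{[0, 2\pi]}$. Given any rotation invariant finite measure $\nu$ on $(S^1, \mathcal{B}_{S^1})$, show that $\nu = \frac{\nu(S^1)}{2\pi} \mu$. In particular, conclude that $\lambda_{S^1} = \mu$. (Cf. Exercise 5.3.15 below.)
Hint: Define
\[ O_\theta = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \quad\text{for } \theta \in [0, 2\pi], \]
and note that
\[ \int_{S^1} f \, d\nu = \frac{1}{2\pi} \int_{[0, 2\pi]} \Bigl( \int_{S^1} f \circ T_{O_\theta}(\omega)\, \nu(d\omega) \Bigr) d\theta . \]
(iii) For $N \in \mathbb{Z}^+$, define $\Xi : [-1, 1] \times S^{N-1} \to S^N$ by
\[ \Xi(\rho, \omega) = \begin{bmatrix} (1 - \rho^2)^{\frac12}\, \omega \\ \rho \end{bmatrix}, \]
and let
\[ \mu_N(\Gamma) = \int_{\Gamma} (1 - \rho^2)^{\frac{N}{2} - 1}\, \bigl( \lambda_{\mathbb{R}} \times \lambda_{S^{N-1}} \bigr)(d\rho \times d\omega) \]
for $\Gamma \in \mathcal{B}_{[-1,1]} \times \mathcal{B}_{S^{N-1}}$. Show that $\lambda_{S^N} = \Xi_* \mu_N$.
(Hint: Consider
\[ \int_{(0,\infty)} r^N \Bigl( \int_{[-1,1] \times S^{N-1}} f\bigl( r\, \Xi(\rho, \omega) \bigr)\, \mu_N(d\rho \times d\omega) \Bigr) dr \]
for continuous $f : \mathbb{R}^{N+1} \to \mathbb{R}$ with compact support.)
Finally, use this result to show that
\[ \int_{S^N} f\bigl( (\theta, \omega)_{\mathbb{R}^{N+1}} \bigr)\, \lambda_{S^N}(d\omega) = \omega_{N-1} \int_{[-1,1]} f(\rho)\, (1 - \rho^2)^{\frac{N}{2} - 1}\, d\rho \]
for all $\theta \in S^N$ and all measurable $f$ on $\bigl([-1, 1], \mathcal{B}_{[-1,1]}\bigr)$ which are either bounded or non-negative.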
5.2.6 Exercise: Perform the calculations outlined in the following.
(i) Justify Gauss's trick:
\[ \Bigl( \int_{\mathbb{R}} e^{-\frac{x^2}{2}}\, dx \Bigr)^2 = \int_{\mathbb{R}^2} e^{-\frac{|x|^2}{2}}\, dx = 2\pi \int_{(0,\infty)} r e^{-\frac{r^2}{2}}\, dr = 2\pi \]
and conclude that for any $N \in \mathbb{Z}^+$ and symmetric $N \times N$-matrix $A$ which is strictly positive definite (i.e., all the eigenvalues of $A$ are strictly positive),
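\[ \int_{\mathbb{R}^N} \exp\Bigl[ -\tfrac12 (x, A^{-1} x)_{\mathbb{R}^N} \Bigr] dx = (2\pi)^{\frac{N}{2}} \bigl( \det(A) \bigr)^{\frac12} . \tag{5.2.7} \]
Hint: try the change of variable $\Phi(x) = T_{A^{-\frac12}}\, x$.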
(ii) Define $\Gamma(\gamma) = \int_{(0,\infty)} t^{\gamma - 1} e^{-t}\, dt$ for $\gamma \in (0, \infty)$. Show that, for any $\gamma \in (0, \infty)$, $\Gamma(\gamma + 1) = \gamma \Gamma(\gamma)$. Also, show that $\Gamma\bigl(\tfrac12\bigr) = \pi^{\frac12}$. The function $\Gamma(\cdot)$ is called Euler's Gamma function. Notice that it provides an extension of the factorial function in the sense that $\Gamma(n + 1) = n!$ for integers $n \ge 0$.
(iii) Show that
\[ \omega_{N-1} = \frac{2 \pi^{\frac{N}{2}}}{\Gamma\bigl(\frac{N}{2}\bigr)}, \]
and conclude that the volume $\Omega_N$ of the $N$-dimensional unit ball is given by
\[ \Omega_N = \frac{\pi^{\frac{N}{2}}}{\Gamma\bigl(\frac{N}{2} + 1\bigr)} . \]
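(iv) Given $\alpha, \beta \in (0, \infty)$, show that
\[ \int_{(0,\infty)} t^{-\frac12} \exp\Bigl[ -\alpha^2 t - \frac{\beta^2}{t} \Bigr] dt = \frac{\pi^{\frac12} e^{-2\alpha\beta}}{\alpha} . \]
Finally, use the preceding to show that
\[ \int_{(0,\infty)} t^{-\frac32} \exp\Bigl[ -\alpha^2 t - \frac{\beta^2}{t} \Bigr] dt = \frac{\pi^{\frac12} e^{-2\alpha\beta}}{\beta} . \]
Hint: Define $\psi(s)$ for $s \in \mathbb{R}$ to be the unique $t \in (0, \infty)$ satisfying $s = \alpha t^{\frac12} - \beta t^{-\frac12}$, and use part (i) of Exercise 5.1.6 to show that
\[ \int_{(0,\infty)} t^{-\frac12} \exp\Bigl[ -\alpha^2 t - \frac{\beta^2}{t} \Bigr] dt = \frac{e^{-2\alpha\beta}}{\alpha} \int_{\mathbb{R}} e^{-s^2}\, ds . \]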
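Several of the identities in Exercise 5.2.6 are easy to sanity-check numerically before proving them. The sketch below uses Python's built-in `math.gamma`; the midpoint quadrature for part (iv), its step count, and its truncation point are our own choices and are only meant as a rough check, not as part of a solution.

```python
from math import exp, factorial, gamma, isclose, pi, sqrt

def sphere_area(n):
    """omega_{N-1} = 2 pi^{N/2} / Gamma(N/2), the surface area of S^{N-1}."""
    return 2 * pi ** (n / 2) / gamma(n / 2)

def ball_volume(n):
    """Omega_N = pi^{N/2} / Gamma(N/2 + 1), the volume of the unit ball in R^N."""
    return pi ** (n / 2) / gamma(n / 2 + 1)

def part_iv_integral(alpha, beta, steps=20000, upper=20.0):
    """Midpoint rule for the first integral in part (iv).

    The substitution t = u^2 turns it into 2 * int_0^inf exp(-alpha^2 u^2 - beta^2 / u^2) du;
    the tail beyond `upper` is dropped, which assumes alpha * upper is large.
    """
    h = upper / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        total += exp(-(alpha * u) ** 2 - (beta / u) ** 2)
    return 2 * h * total

checks = {
    "gamma_half": isclose(gamma(0.5), sqrt(pi)),
    "factorial": all(isclose(gamma(n + 1), factorial(n)) for n in range(10)),
    "area_vs_volume": all(isclose(sphere_area(n), n * ball_volume(n)) for n in range(1, 10)),
    "disc": isclose(ball_volume(2), pi),
    "ball": isclose(ball_volume(3), 4 * pi / 3),
    "part_iv": isclose(part_iv_integral(1.0, 0.5), sqrt(pi) * exp(-1.0), rel_tol=1e-6),
}
print(checks)
```

The check `area_vs_volume` is the relation $\omega_{N-1} = N \Omega_N$ from Section 5.2.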

5.3 Jacobi's Transformation and Surface Measure.

We begin this section by deriving Jacobi's famous generalization to non-linear maps of the result in Theorem 2.2.2. We will then apply Jacobi's result to show that Lebesgue's measure can be differentiated across a smooth surface.
Given an open set $G \subseteq \mathbb{R}^N$ and a continuously differentiable map*
\[ x \in G \longmapsto \Phi(x) = \begin{bmatrix} \Phi_1(x) \\ \vdots \\ \Phi_N(x) \end{bmatrix} \in \mathbb{R}^N, \]
we define the Jacobian matrix $J\Phi(x) = \frac{\partial \Phi}{\partial x}(x)$ of $\Phi$ at $x$ to be the $N \times N$-matrix whose $j$th column is the vector
\[ \begin{bmatrix} \frac{\partial \Phi_1}{\partial x_j} \\ \vdots \\ \frac{\partial \Phi_N}{\partial x_j} \end{bmatrix} . \]
In addition, we call $\delta\Phi(x) \equiv |\det(J\Phi(x))|$ the Jacobian of $\Phi$ at $x$.
5.3.1 Lemma. Let $G$ be an open set in $\mathbb{R}^N$ and $\Phi$ an element of $C^1(G; \mathbb{R}^N)$ whose Jacobian never vanishes on $G$. Then $\Phi$ maps open (or $\overline{\mathcal{B}}_{\mathbb{R}^N}$-measurable) subsets of $G$ into open (or $\overline{\mathcal{B}}_{\mathbb{R}^N}$-measurable) sets in $\mathbb{R}^N$. In addition, if $\Gamma \subseteq G$ with $|\Gamma|_e = 0$, then $|\Phi(\Gamma)|_e = 0$; and if $\Gamma \subseteq \Phi(G)$ with $|\Gamma|_e = 0$, then $|\Phi^{-1}(\Gamma)|_e = 0$. In particular,

* Because we will be dealing with balls in different dimensional Euclidean spaces here, we will use the notation $B_{\mathbb{R}^N}(a, r)$ to emphasize that the ball is in $\mathbb{R}^N$.
PROOF: By the Inverse Function Theorem,† for each $x \in G$ there is an open neighborhood $U \subseteq G$ of $x$ such that $\Phi \restriction U$ is invertible and its inverse has first derivatives which are bounded and continuous. Hence, $G$ can be written as the union of a countable number of open sets on each of which $\Phi$ admits an inverse having bounded continuous first order derivatives; and so, without loss in generality, we may and will assume that $\Phi$ admits such an inverse on $G$ itself. But, in that case, it is obvious that both $\Phi$ and $\Phi^{-1}$ are continuous. In addition, by Lemma 2.2.1, both take sets of Lebesgue measure 0 into sets of Lebesgue measure 0. Hence, by that same lemma, we now know that both $\Phi$ and $\Phi^{-1}$ take Lebesgue measurable sets into Lebesgue measurable sets. □
A continuously differentiable map $\Phi$ on an open set $U \subseteq \mathbb{R}^N$ into $\mathbb{R}^N$ is called a diffeomorphism if it is injective (i.e., one-to-one) and $\delta\Phi$ never vanishes. If $\Phi$ is a diffeomorphism on the open set $U$ and if $W = \Phi(U)$, then we say that $\Phi$ is diffeomorphic from $U$ onto $W$. In what follows, for any given set $\Gamma \subseteq \mathbb{R}^N$ and $\delta > 0$, we use
\[ \Gamma^{(\delta)} \equiv \{ x \in \mathbb{R}^N : |y - x| < \delta \text{ for some } y \in \Gamma \} \]
to denote the open $\delta$-hull of $\Gamma$.
5.3.2 Theorem (Jacobi's Transformation Formula). Let $G$ be an open set in $\mathbb{R}^N$ and $\Phi$ an element of $C^2(G; \mathbb{R}^N)$. If the Jacobian of $\Phi$ never vanishes, then, for every measurable function $f$ on $\bigl(\Phi(G), \mathcal{B}_{\mathbb{R}^N}[\Phi(G)]\bigr)$, $f \circ \Phi$ is measurable on $\bigl(G, \mathcal{B}_{\mathbb{R}^N}[G]\bigr)$ and
\[ \int_{\Phi(G)} f(y)\, dy \le \int_G f \circ \Phi(x)\, \delta\Phi(x)\, dx \tag{5.3.3} \]
whenever $f$ is non-negative. Moreover, if $\Phi$ is a diffeomorphism on $G$, then (5.3.3) can be replaced by
\[ \int_{\Phi(G)} f(y)\, dy = \int_G f \circ \Phi(x)\, \delta\Phi(x)\, dx . \tag{5.3.4} \]
PROOF: We first note that (5.3.4) is a consequence of (5.3.3) when $\Phi$ is one-to-one. Indeed, if $\Phi$ is one-to-one, then the Inverse Function Theorem guarantees that $\Phi^{-1} \in C^2(\Phi(G); \mathbb{R}^N)$. In addition,
\[ (\delta\Phi) \circ \Phi^{-1}(y)\; \delta\Phi^{-1}(y) = 1, \quad y \in \Phi(G) . \]
Hence we can apply (5.3.3) to $\Phi^{-1}$ and thereby obtain
\[ \int_G f \circ \Phi(x)\, \delta\Phi(x)\, dx \le \int_{\Phi(G)} f(y)\, (\delta\Phi) \circ \Phi^{-1}(y)\, \delta\Phi^{-1}(y)\, dy = \int_{\Phi(G)} f(y)\, dy ; \]
which, in conjunction with (5.3.3), yields (5.3.4).

† See, for example, W. Rudin's Principles of Mathematical Analysis, McGraw Hill (1976).


We next note that it suffices to prove (5.3.3) under the assumptions that $G$ is bounded, $\Phi$ on $G$ has bounded first and second order derivatives, and $\delta\Phi$ is uniformly positive on $G$. In fact, if this is not already the case, then we can choose a non-decreasing sequence of bounded open sets $G_n$ so that $\Phi \restriction G_n$ has these properties for each $n \ge 1$ and $G_n \nearrow G$. Clearly, the result for $\Phi$ on $G$ follows from the result for $\Phi \restriction G_n$ on $G_n$ for every $n \ge 1$. Thus, from now on, we assume that $G$ is bounded, the first and second derivatives of $\Phi$ are bounded, and $\delta\Phi$ is uniformly positive on $G$.
Let $Q = Q(c; r) = \prod_1^N [c_i - r, c_i + r] \subseteq G$. Then, by Taylor's Theorem, there is an $L \in [0, \infty)$ (depending only on the bound on the second derivatives of $\Phi$) such that

(Cf. Section 2.2 for the notation here.) At the same time, there is an $M < \infty$ (depending only on $L$, the lower bound on $\delta\Phi$, and the upper bounds on the first derivatives of $\Phi$) such that

Hence, by Theorem 2.2.2,
\[ |\Phi(Q)| \le \delta\Phi(c)\, |Q(0; r + Mr^2)| = (1 + Mr)^N\, \delta\Phi(c)\, |Q| . \]
Now define $\mu(\Gamma) = \int_{\Gamma} \delta\Phi(x)\, dx$ for $\Gamma \in \mathcal{B}_{\mathbb{R}^N}[G]$, and set $\nu = \Phi_*\mu$. Given an open set $H \subseteq \Phi(G)$, use Lemma 2.1.9 to choose, for each $m \in \mathbb{Z}^+$, a countable, non-overlapping, exact cover $\mathcal{C}_m$ of $\Phi^{-1}(H)$ by cubes $Q$ with $\operatorname{diam}(Q) < \frac{1}{m}$. Then, by the preceding paragraph,
\[ |H| \le \sum_{Q \in \mathcal{C}_m} |\Phi(Q)| \le \Bigl( 1 + \frac{M}{m} \Bigr)^N \sum_{Q \in \mathcal{C}_m} \delta\Phi(c_Q)\, |Q|, \]
where $c_Q$ denotes the center of the cube $Q$. After letting $m \to \infty$ in the preceding, we conclude that $|H| \le \nu(H)$ for open $H \subseteq \Phi(G)$; and so, by Exercise 3.1.9, it follows that
\[ |\Gamma| \le \nu(\Gamma) \quad\text{for all } \Gamma \in \mathcal{B}_{\mathbb{R}^N}[\Phi(G)] . \tag{5.3.5} \]
Starting from (5.3.5), working first with simple functions, and then passing to monotone limits, we now conclude that (5.3.3) holds for all non-negative, measurable functions $f$ on $\bigl(\Phi(G), \mathcal{B}_{\mathbb{R}^N}[\Phi(G)]\bigr)$. □
As an essentially immediate consequence of Theorem 5.3.2, we have the
following.
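5.3.6 Corollary. Let $G$ be an open set in $\mathbb{R}^N$ and $\Phi \in C^2(G; \mathbb{R}^N)$ a diffeomorphism. Set
\[ \mu_\Phi(\Gamma) = \int_{\Gamma} \delta\Phi(x)\, dx \quad\text{for } \Gamma \in \mathcal{B}_{\mathbb{R}^N}[G] . \]
Then $\Phi_*\mu_\Phi$ coincides with the restriction $\lambda_{\Phi(G)}$ of $\lambda_{\mathbb{R}^N}$ to $\mathcal{B}_{\mathbb{R}^N}[\Phi(G)]$. In particular,
\[ f \in L^1\bigl( \Phi(G), \mathcal{B}_{\mathbb{R}^N}[\Phi(G)], \lambda_{\Phi(G)} \bigr) \iff f \circ \Phi \in L^1\bigl( G, \mathcal{B}_{\mathbb{R}^N}[G], \mu_\Phi \bigr), \]
in which case (5.3.4) holds.
As a mnemonic device, it is useful to represent the conclusion of Corollary 5.3.6 as the statement
\[ f(y)\, dy = f \circ \Phi(x)\, \delta\Phi(x)\, dx \quad\text{when } y = \Phi(x) . \]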
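The mnemonic can be tested numerically on a familiar diffeomorphism. The sketch below is an illustration under our own assumptions: $\Phi(r, \theta) = (r\cos\theta, r\sin\theta)$ on $G = (0, 1) \times (0, 2\pi)$, whose image is the unit disc minus a slit of measure zero, and $f(y) = y_1^2$, for which the left side of (5.3.4) has the known value $\pi/4$. The right side is approximated by a midpoint rule.

```python
from math import cos, pi, sin

def Phi(r, theta):
    # Polar parametrization of the punctured, slit unit disc.
    return (r * cos(theta), r * sin(theta))

def jacobian_abs_det(r, theta):
    # J Phi = [[cos t, -r sin t], [sin t, r cos t]]; delta Phi = |det J Phi| (equal to r here).
    a, b = cos(theta), -r * sin(theta)
    c, d = sin(theta), r * cos(theta)
    return abs(a * d - b * c)

def f(y):
    return y[0] ** 2

def right_hand_side(n=200):
    """Midpoint-rule approximation of the integral over G of f(Phi(x)) * delta Phi(x)."""
    hr, ht = 1.0 / n, 2 * pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            theta = (j + 0.5) * ht
            total += f(Phi(r, theta)) * jacobian_abs_det(r, theta)
    return total * hr * ht

approx = right_hand_side()
print(approx, pi / 4)
```

The grid size `n` only controls the quadrature error; the identity itself is exact.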
We now want to apply Jacobi's formula to show how to differentiate Lebesgue's measure across a smooth surface. To be precise, assume that $N \ge 2$ and say that $M \subseteq \mathbb{R}^N$ is a hypersurface if, for each $p \in M$, there exists an $r > 0$ and a three times continuously differentiable $F : B_{\mathbb{R}^N}(p, r) \to \mathbb{R}$ with the properties that
\[ B_{\mathbb{R}^N}(p, r) \cap M = \{ y \in B_{\mathbb{R}^N}(p, r) : F(y) = 0 \} \quad\text{and}\quad |\nabla F(y)| \ne 0 \text{ for any } y \in B_{\mathbb{R}^N}(p, r), \tag{5.3.7} \]
where
\[ \nabla F(y) \equiv \bigl[ F_{,1}(y), \dots, F_{,N}(y) \bigr] \]
is the gradient of $F$ at $y$ and, once again, we have used the notation $F_{,j}$ to denote the partial derivative in the direction $e_j$. Given $p \in M$, the tangent space $T_p(M)$ to $M$ at $p$ is the set of $v \in \mathbb{R}^N$ for which there exists an $\epsilon > 0$ and a twice continuously differentiable curve $\gamma$ on $(-\epsilon, \epsilon)$ into $M$ such that $\gamma(0) = p$ and $\dot\gamma(0) = v$. (If $\gamma$ is a differentiable curve, we use $\dot\gamma(t)$ to denote its derivative $\frac{d\gamma}{dt}$ at $t$.)
Canonical Example: The unit sphere $S^{N-1}$ is a hypersurface in $\mathbb{R}^N$. In fact, at every point $p \in S^{N-1}$ we can use the function $F(y) = |y|^2 - 1$ and can identify $T_p(S^{N-1})$ with the subspace of $v \in \mathbb{R}^N$ for which $v - p$ is orthogonal to $p$.
5.3.8 Lemma. Every hypersurface $M$ can be written as the countable union of compact sets and is therefore Borel measurable. In addition, for each $p \in M$, $T_p(M)$ is an $(N-1)$-dimensional subspace of $\mathbb{R}^N$. In fact, if $r > 0$ and $F \in C^3\bigl(B_{\mathbb{R}^N}(p, r); \mathbb{R}\bigr)$ satisfy (5.3.7), then, for every $y \in B_{\mathbb{R}^N}(p, r) \cap M$, $T_y(M)$ coincides with the space of the vectors $v \in \mathbb{R}^N$ which are orthogonal to $\nabla F(y)$. Finally, if, for $\Gamma \subseteq M$ and $\rho > 0$,
\[ \Gamma(\rho) \equiv \bigl\{ y \in \mathbb{R}^N : \exists\, p \in \Gamma \ \ (y - p) \perp T_p(M) \text{ and } |y - p| < \rho \bigr\}, \tag{5.3.9} \]

PROOF: To see that $M$ is the countable union of compacts, choose, for each $p \in M$, an $r(p) > 0$ and a function $F_p$ so that (5.3.7) holds. Next, select a countable subset $\{p_n\}_1^{\infty}$ from $M$ so that
\[ M \subseteq \bigcup_{n=1}^{\infty} B_{\mathbb{R}^N}\bigl( p_n, \tfrac{r_n}{2} \bigr), \quad\text{with } r_n \equiv r(p_n) . \]
Clearly, for each $n \in \mathbb{Z}^+$, $K_n \equiv \bigl\{ y \in \overline{B_{\mathbb{R}^N}\bigl( p_n, \tfrac{r_n}{2} \bigr)} : F_{p_n}(y) = 0 \bigr\}$ is compact; and $M = \bigcup_1^{\infty} K_n$.
Next, let $p \in M$ be given, choose associated $r$ and $F$, and let $y \in B_{\mathbb{R}^N}(p, r) \cap M$. To see that $\nabla F(y) \perp T_y(M)$, let $v \in T_y(M)$ be given and choose $\epsilon > 0$ and $\gamma$ accordingly. Then, because $\gamma : (-\epsilon, \epsilon) \to M$,
\[ \bigl( v, \nabla F(y) \bigr)_{\mathbb{R}^N} = \frac{d}{dt} F \circ \gamma(t) \Big|_{t=0} = 0 . \]
Conversely, if $v \in \mathbb{R}^N$ satisfying $(v, \nabla F(y))_{\mathbb{R}^N} = 0$ is given, set
\[ V(z) = v - \frac{(v, \nabla F(z))_{\mathbb{R}^N}}{|\nabla F(z)|^2}\, \nabla F(z) \]
for $z \in B_{\mathbb{R}^N}(p, r)$. By the basic existence theory for ordinary differential equations,* we can then find an $\epsilon > 0$ and a twice continuously differentiable curve $\gamma : (-\epsilon, \epsilon) \to B_{\mathbb{R}^N}(p, r)$ such that $\gamma(0) = y$ and $\dot\gamma(t) = V(\gamma(t))$ for all $t \in (-\epsilon, \epsilon)$. Clearly $\dot\gamma(0) = v$, and it is an easy matter to check that
\[ \frac{d}{dt} F \circ \gamma(t) = \bigl( \nabla F(\gamma(t)), V(\gamma(t)) \bigr)_{\mathbb{R}^N} = 0 \quad\text{for } t \in (-\epsilon, \epsilon) . \]
Hence, $v \in T_y(M)$.
To prove the final assertion, note that a covering argument, just like the one given at the beginning of this proof, allows us to reduce to the case when there is an $r > 0$ and a three times continuously differentiable $F : M^{(2r)} \to \mathbb{R}$ such that $|\nabla F|$ is uniformly positive and $M = \{ x \in M^{(2r)} : F(x) = 0 \}$. But in that case, we see that
\[ \Gamma(\rho) = \Bigl\{ x + t\, \frac{\nabla F(x)}{|\nabla F(x)|} : x \in \Gamma \text{ and } |t| < \rho \Bigr\}, \]
and so the desired measurability follows as an application of Lemma 2.2.1 to the Lipschitz function

□

* See Chapter 1 of E. Coddington and N. Levinson's Theory of Ordinary Differential Equations, McGraw Hill (1955).
We are, at last, in a position to say where we are going. Namely, we want to show that there is a unique measure $\lambda_M$ on $\bigl(M, \mathcal{B}_{\mathbb{R}^N}[M]\bigr)$ with the property that (cf. (5.3.9))
\[ \lambda_M(\Gamma) = \lim_{\rho \searrow 0} \frac{1}{2\rho}\, \lambda_{\mathbb{R}^N}\bigl( \Gamma(\rho) \bigr) \quad\text{for bounded } \Gamma \in \mathcal{B}_{\mathbb{R}^N} \text{ with } \Gamma \subseteq M . \tag{5.3.10} \]
Notice that, aside from the obvious question about whether the limit exists at all, there is a serious question about the additivity of the resulting map $\Gamma \longmapsto \lambda_M(\Gamma)$. Indeed, just because $\Gamma_1$ and $\Gamma_2$ are disjoint subsets of $M$, it will not be true, in general, that $\Gamma_1(\rho)$ and $\Gamma_2(\rho)$ will be disjoint. For example, when $M = S^{N-1}$ and $\rho > 1$, $\Gamma_1(\rho)$ and $\Gamma_2(\rho)$ will intersect as soon as both are non-empty. On the other hand, at least in this example, everything will be all right when $\rho \le 1$; and, in fact, we already know that (5.3.10) defines a measure when $M = S^{N-1}$. To see this, observe that, when $\rho \in (0, 1)$ and $M = S^{N-1}$,
\[ \Gamma(\rho) = \Bigl\{ y : 1 - \rho < |y| < 1 + \rho \text{ and } \frac{y}{|y|} \in \Gamma \Bigr\}, \]
and apply (5.2.3) to see that
\[ \lambda_{\mathbb{R}^N}\bigl( \Gamma(\rho) \bigr) = \lambda_{S^{N-1}}(\Gamma) \int_{(1-\rho, 1+\rho)} r^{N-1}\, dr = \frac{(1 + \rho)^N - (1 - \rho)^N}{N}\, \lambda_{S^{N-1}}(\Gamma), \]
where the measure $\lambda_{S^{N-1}}$ is the one described in Section 5.2. Hence, after letting $\rho \searrow 0$, we see not only that the required limit exists but also gives the measure $\lambda_{S^{N-1}}$. In other words, the program works in the case $M = S^{N-1}$, and, perhaps less important, the notation used here is consistent with the notation used in Section 5.2.
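For the whole sphere $\Gamma = S^{N-1}$ the set $\Gamma(\rho)$ is the spherical shell $\{1 - \rho < |y| < 1 + \rho\}$, whose volume is $\Omega_N\bigl((1+\rho)^N - (1-\rho)^N\bigr)$, so the limit in (5.3.10) can be watched directly. The sketch below assumes the closed form $\Omega_N = \pi^{N/2}/\Gamma(\tfrac{N}{2}+1)$ from Exercise 5.2.6; the function names are ours.

```python
from math import gamma, pi

def ball_volume(n):
    # Omega_N = pi^{N/2} / Gamma(N/2 + 1)
    return pi ** (n / 2) / gamma(n / 2 + 1)

def shell_ratio(n, rho):
    """(1 / (2 rho)) * volume of the shell {1 - rho < |y| < 1 + rho} in R^N, for 0 < rho < 1."""
    return ball_volume(n) * ((1 + rho) ** n - (1 - rho) ** n) / (2 * rho)

def sphere_area(n):
    # omega_{N-1} = N * Omega_N, the total mass of the surface measure on S^{N-1}
    return n * ball_volume(n)

for rho in (0.5, 0.1, 0.01, 0.0001):
    print(rho, shell_ratio(3, rho), sphere_area(3))
```

For $N = 3$ the ratio equals $\Omega_3 (3 + \rho^2)$, so the convergence to $4\pi$ is quadratic in $\rho$.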
In order to handle the problems raised in the preceding paragraph for general hypersurfaces, we are going to have to reduce, at least locally, to the essentially trivial case when $M = \mathbb{R}^{N-1} \times \{0\}$. In that case, it is clear that we can identify $M$ with $\mathbb{R}^{N-1}$ and $\Gamma(\rho)$ with $\Gamma \times (-\rho, \rho)$. Hence, even before passing to a limit, we see that
\[ \frac{1}{2\rho}\, \lambda_{\mathbb{R}^N}\bigl( \Gamma(\rho) \bigr) = \lambda_{\mathbb{R}^{N-1}}(\Gamma) . \]
In the lemmata which follow, we will develop the requisite machinery with which to make the reduction.
5.3.11 Lemma. For each $p \in M$ there is an open neighborhood $U$ of $0$ in
$\mathbb{R}^{N-1}$ and a three times continuously differentiable injection (i.e., one-to-one)
$\Psi : U \to M$ with the properties that $p = \Psi(0)$ and, for each $u \in U$, the set
$\{\Psi_{,1}(u), \dots, \Psi_{,N-1}(u)\}$ forms a basis in $T_{\Psi(u)}(M)$.
PROOF: Choose $r$ and $F$ so that (5.3.7) holds. After renumbering the coordinates
if necessary, we may and will assume that $F_{,N}(p) \neq 0$. Now consider
the map

$$y \in B_{\mathbb{R}^N}(p, r) \longmapsto \Phi(y) \equiv \begin{pmatrix} y_1 - p_1 \\ \vdots \\ y_{N-1} - p_{N-1} \\ F(y) \end{pmatrix} \in \mathbb{R}^N.$$

Clearly, $\Phi$ is three times continuously differentiable. In addition,

$$J\Phi(p) = \begin{pmatrix} I_{\mathbb{R}^{N-1}} & 0 \\ v & F_{,N}(p) \end{pmatrix},$$

where $v = [F_{,1}(p), \dots, F_{,N-1}(p)]$. In particular, $\delta\Phi(p) \neq 0$, and so the Inverse
Function Theorem guarantees the existence of a $\rho \in (0, r]$ such that
$\Phi \upharpoonright B_{\mathbb{R}^N}(p, \rho)$ is diffeomorphic and $\Phi^{-1}$ has three continuous derivatives on
the open set $W \equiv \Phi\bigl(B_{\mathbb{R}^N}(p, \rho)\bigr)$. Thus, if

$$U \equiv \{u \in \mathbb{R}^{N-1} : (u, 0) \in W\},$$

then $U$ is an open neighborhood of $0$ in $\mathbb{R}^{N-1}$, and

$$u \in U \longmapsto \Psi(u) \equiv \Phi^{-1}(u, 0) \in M$$

is one-to-one and has three continuous derivatives. Finally, because $\Psi$ takes
its values in $M$, it is obvious that

$$\Psi_{,j}(u) \in T_{\Psi(u)}(M) \quad\text{for each } 1 \le j \le N-1 \text{ and } u \in U.$$

At the same time, as the first $(N-1)$ columns in the non-degenerate matrix
$J\Phi^{-1}(u, 0)$, the $\Psi_{,j}(u)$'s must be linearly independent. Hence, they form a basis
in $T_{\Psi(u)}(M)$. □
Given a non-empty, open set $U$ in $\mathbb{R}^{N-1}$ and a twice continuously differentiable
injection $\Psi : U \to M$ with the property that

$$\{\Psi_{,1}(u), \dots, \Psi_{,N-1}(u)\} \text{ forms a basis in } T_{\Psi(u)}(M)$$

for every $u \in U$, we say that the pair $(\Psi, U)$ is a coordinate chart for $M$.
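5.3.12 Lemma. Suppose that $(\Psi, U)$ is a coordinate chart for $M$, and define

$$\delta\Psi = \Bigl[\det\Bigl(\bigl(\bigl(\Psi_{,i}, \Psi_{,j}\bigr)_{\mathbb{R}^N}\bigr)\Bigr)_{1 \le i,j \le N-1}\Bigr]^{\frac12}. \tag{5.3.13}$$

Then $\delta\Psi$ never vanishes, and there exists a unique twice continuously differentiable
$n : U \to S^{N-1}$ with the properties that $n(u) \perp T_{\Psi(u)}(M)$ and *

$$\det\bigl[\Psi_{,1}(u) \ \dots \ \Psi_{,N-1}(u) \ \ n(u)^T\bigr] = \delta\Psi(u)$$

for every $u \in U$. Finally, define

$$\tilde\Psi(u, t) = \Psi(u) + t\, n(u)^T \quad\text{for } (u, t) \in U \times \mathbb{R}.$$

Then

$$\delta\tilde\Psi(u, 0) = \delta\Psi(u), \quad u \in U, \tag{5.3.14}$$

and there exists an open set $\tilde U$ in $\mathbb{R}^N$ such that $\tilde\Psi \upharpoonright \tilde U$ is a diffeomorphism,
$U = \{u \in \mathbb{R}^{N-1} : (u, 0) \in \tilde U\}$, and $\Psi(U) = \tilde\Psi(\tilde U) \cap M$. In particular, if
$x$ and $y$ are distinct elements of $\Psi(U)$, then $\{x\}(\rho) \cap \tilde\Psi(\tilde U)$ is disjoint from
$\{y\}(\rho) \cap \tilde\Psi(\tilde U)$ for all $\rho > 0$.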
PROOF: Given a $u \in U$ and an $n \in S^{N-1}$ which is orthogonal to $T_{\Psi(u)}(M)$,
$\{\Psi_{,1}(u), \dots, \Psi_{,N-1}(u), n^T\}$ is a basis for $\mathbb{R}^N$ and therefore

$$\det\bigl[\Psi_{,1}(u) \ \dots \ \Psi_{,N-1}(u) \ \ n^T\bigr] \neq 0.$$

Hence, for each $u \in U$ there is precisely one $n(u) \in S^{N-1} \cap T_{\Psi(u)}(M)^{\perp}$ with
the property that

$$\det\bigl[\Psi_{,1}(u) \ \dots \ \Psi_{,N-1}(u) \ \ n(u)^T\bigr] > 0.$$

To see that $u \in U \mapsto n(u) \in S^{N-1}$ is twice continuously differentiable, set $p = \Psi(u)$
and choose $r$ and $F$ for $p$ so that (5.3.7) holds. Then $n(u) = \pm\frac{\nabla F(p)}{|\nabla F(p)|}$,
and so, by continuity, we know that, with the same sign throughout,

$$n(w) = \pm \frac{\nabla F(\Psi(w))}{|\nabla F(\Psi(w))|}$$

for every $w$ in a neighborhood of $u$.
Turning to the function $\tilde\Psi$, note that

$$\begin{aligned}
\delta\tilde\Psi(u, 0)^2 &= \Bigl(\det\bigl[\Psi_{,1}(u) \ \dots \ \Psi_{,N-1}(u) \ \ n(u)^T\bigr]\Bigr)^2 \\
&= \det\left( \begin{pmatrix} \Psi_{,1}(u)^T \\ \vdots \\ \Psi_{,N-1}(u)^T \\ n(u) \end{pmatrix} \bigl[\Psi_{,1}(u) \ \dots \ \Psi_{,N-1}(u) \ \ n(u)^T\bigr] \right) \\
&= \det\begin{pmatrix} \bigl(\bigl(\Psi_{,i}(u), \Psi_{,j}(u)\bigr)_{\mathbb{R}^N}\bigr) & 0 \\ 0 & 1 \end{pmatrix} = \delta\Psi(u)^2.
\end{aligned}$$
* Below, and throughout, we use $A^T$ to denote the transpose of a matrix $A$. In particular,
if $A$ is a row vector, then $A^T$ is the corresponding column vector, and vice versa.

Hence, (5.3.14) is proved. In particular, this means, by the Inverse Function
Theorem, that, for each $u \in U$, there is a neighborhood of $(u, 0)$ in $\mathbb{R}^N$
on which $\tilde\Psi$ is a diffeomorphism. In fact, given $u \in U$, choose $r$ and $F$ for
$p = \Psi(u)$, and take $\rho > 0$ so that

Then, because $n(w) = \pm\frac{\nabla F(\Psi(w))}{|\nabla F(\Psi(w))|}$ for all $w \in B_{\mathbb{R}^{N-1}}(u, \rho)$,

$$F\bigl(\tilde\Psi(w, t)\bigr) = \pm t\, \bigl|\nabla F(\Psi(w))\bigr| + E(w, t)$$
for (w, t) E BR N- 1 (u, p) ( - � , � ) ,
x

where $|E(w, t)| \le C t^2$ for some $C \in (0, \infty)$. Hence, by readjusting the choice of
$\rho > 0$, we can guarantee that $\tilde\Psi \upharpoonright B_{\mathbb{R}^{N-1}}(u, \rho) \times (-\rho, \rho)$ is both diffeomorphic
and satisfies

In order to prove the final assertion, we must find an open $\tilde U$ in $\mathbb{R}^N$ such
that $\Psi(U) = \tilde\Psi(\tilde U) \cap M$ and $\tilde\Psi \upharpoonright \tilde U$ is a diffeomorphism. To this end, for each
$u \in U$, we use the preceding to choose $\rho(u) > 0$ so that: $B_{\mathbb{R}^{N-1}}(u, \rho(u)) \subseteq U$
and $\tilde\Psi \upharpoonright B_{\mathbb{R}^{N-1}}(u, \rho(u)) \times (-\rho(u), \rho(u))$ is both a diffeomorphism and satisfies

Next, choose a countable set { un}1 C U so that


CX)
U == U BJRN- 1 (un , if- ) where Pn == p (un) ;
1
and set n
Un == U BJRN -1 (urn , e; ) and Rn == P 1 1\ . . . 1\ Pn ·
1
To construct the open set U, we proceed inductively as follows. Namely, set
E 1 == '¥- and k1 == u 1 X [ - E 1 , E 1 ] · Next given Kn , define En + 1 by
( {I
2En+ 1 = Rn + l 1\ En 1\ inf W ( ) - � ( t)
U W,
I : U W,
E Un + l \ Un , ( t) E kn ,
)
and l u - w l > R';+ 1 } ,
and take
Clearly,
- for each n E z +-' Kn is compact, Kn c Kn + 1 ' Un X ( - En, En) c K-n , and -
8 '}1 never vanishes on Kn. Thus, if we can show that En > 0 and that \}1 f Kn
is one to one for each n E z + ' then we can take [; to be the interior of u � Kn.
To this end, first- observe
- that there is nothing to do when n == 1. Furthermore,
if En > 0 and \}1 f Kn is one to one, then En + 1 == 0 is possible only if there
exists a u E BR N-1 (Un + 1 , Pn + 1 ) and -a (w, t) E BRN- 1 (Um , Pm ) X ( - pm , Pm ) for
some 1 < m < n such that w(u) == w(w, t) and l u - w l > R�+ 1 • But, because

this would mean that \}1 ( w) == \}1 ( u) and therefore, since \}1 is one to one, would
lead to the contradiction that 0 == ! w - u l >
R�+ 1 • Hence, En + 1 > 0. Finally,
- -
to see that \}1 f Kn + 1 is one to one, we need only check that
(u, s) E (Un + 1 \ Un) X [ - En + 1 , En + 1 ] and (w, t) E Kn ===} �(u, s) =/= �(w, t) .
But, if l u - w l < R�+ 1 , then both (u, s) and ( t) are in BR N- 1 (un+ 1 , Pn + 1 )
w, x

( -Pn + 1 , Pn + 1 ) and � is one to one there. On the other hand, if l u - w l > R�+ 1 ,
then
w w
l �(u, s) - � ( , t) l > l w(u) - �( , t) l - l sl > 2En + 1 - En + 1 == En + 1 > 0. D

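5.3.15 Lemma. If $(\Psi, U)$ is a coordinate chart for $M$ and $\Gamma$ a bounded element
of $\mathcal{B}_{\mathbb{R}^N}$ with $\overline{\Gamma} \subseteq \Psi(U)$, then (cf. (5.3.13))

$$\lim_{\rho \searrow 0} \frac{1}{2\rho}\, \lambda_{\mathbb{R}^N}\bigl(\Gamma(\rho)\bigr) = \int_{\Psi^{-1}(\Gamma)} \delta\Psi(u)\, \lambda_{\mathbb{R}^{N-1}}(du). \tag{5.3.16}$$

PROOF: Choose $\tilde U$ and $\tilde\Psi$ as in Lemma 5.3.12. Since $\overline{\Gamma}$ is a compact subset of
the open set $\tilde G = \tilde\Psi(\tilde U)$, $\epsilon = \operatorname{dist}\bigl(\overline{\Gamma}, \tilde G^{\complement}\bigr) > 0$. Hence, for $\rho \in (0, \epsilon)$,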

which, by (5.3.4) and Tonelli's Theorem, means that

Since w - 1 (r ) == � - 1 (r ) is compact, 8� f w - 1 ( r ) is uniformly bounded and


continuous. In particular, by Lebesgue's Dominated Convergence Theorem
and (5.3.14),

$$\frac{1}{2\rho} \int_{(-\rho, \rho)} \delta\tilde\Psi(u, t)\, dt \longrightarrow \delta\tilde\Psi(u, 0) = \delta\Psi(u)$$

boundedly and point-wise for $u \in U$. Hence, after a second application of
Lebesgue's Dominated Convergence Theorem, we arrive at (5.3.16). □
5.3.17 Theorem. Let $M$ be a hypersurface in $\mathbb{R}^N$. Then there exists a unique
measure $\lambda_M$ on $(M, \mathcal{B}_{\mathbb{R}^N}[M])$ for which (5.3.10) holds. In fact, $\lambda_M(K) < \infty$
for every compact subset $K$ of $M$. Finally, if $(\Psi, U)$ is a coordinate chart for
$M$ and $f$ is a non-negative, $\mathcal{B}_{\mathbb{R}^N}[M]$-measurable function, then (cf. (5.3.13))

$$\int_{\Psi(U)} f(x)\, \lambda_M(dx) = \int_U f \circ \Psi(u)\, \delta\Psi(u)\, \lambda_{\mathbb{R}^{N-1}}(du). \tag{5.3.18}$$
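PROOF: For each $p \in M$, use Lemma 5.3.11 to produce an $r(p) > 0$ and a
coordinate chart $(\Psi_p, U_p)$ for $M$ such that $B_{\mathbb{R}^N}(p, 3r(p))$ is contained in (cf.
Lemma 5.3.12) $\tilde\Psi_p(\tilde U_p)$. Next, select a countable set $\{p_n\}_1^\infty \subseteq M$ so that

$$M \subseteq \bigcup_1^\infty B_{\mathbb{R}^N}(p_n, r_n), \quad\text{where } r_n \equiv r(p_n),$$

set $M_1 = B_{\mathbb{R}^N}(p_1, r_1) \cap M$, and

$$M_n = \bigl(B_{\mathbb{R}^N}(p_n, r_n) \cap M\bigr) \setminus \bigcup_1^{n-1} M_m \quad\text{for } n \ge 2.$$

Finally, for each $n \in \mathbb{Z}^+$, define the finite measure $\mu_n$ on $(M, \mathcal{B}_{\mathbb{R}^N}[M])$ by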

and set
(5.3.19)

Given a compact $K \subseteq M$, choose $n \in \mathbb{Z}^+$ so that $K \subseteq \bigcup_1^n B_{\mathbb{R}^N}(p_m, r_m)$,
and set $r = r_1 \wedge \cdots \wedge r_n$ and $\epsilon = \frac{r}{3}$. It is then an easy matter to check that,
for any pair of distinct elements $x$ and $y$ from $K$, either $|x - y| \ge r$, in which
case it is obvious that $\{x\}(\epsilon) \cap \{y\}(\epsilon) = \emptyset$, or $|x - y| < r$, in which case both
$\{x\}(\epsilon)$ and $\{y\}(\epsilon)$ lie in $\tilde\Psi_{p_m}(\tilde U_{p_m})$ for some $1 \le m \le n$ and, therefore, the last
part of Lemma 5.3.12 applies and says that $\{x\}(\epsilon) \cap \{y\}(\epsilon) = \emptyset$. In particular,
if $\Gamma \in \mathcal{B}_{\mathbb{R}^N}$ is a subset of $K$ and $\Gamma_m = \Gamma \cap M_m$, then, for each $\rho \in [0, \epsilon)$,
$\{\Gamma_1(\rho), \dots, \Gamma_n(\rho)\}$ is a cover of $\Gamma(\rho)$ by mutually disjoint measurable sets.
Hence, for each $0 < \rho < \epsilon$:

$$\lambda_{\mathbb{R}^N}\bigl(\Gamma(\rho)\bigr) = \sum_1^n \lambda_{\mathbb{R}^N}\bigl(\Gamma_m(\rho)\bigr).$$
At the same time, by Lemma 5.3.15,

In particular, we have now proved that the measure defined in (5.3.19) satisfies
(5.3.10) and that $\lambda_M$ is finite on compacts. Moreover, since (cf. Lemma 5.3.8)
$M$ is a countable union of compacts, it is clear that there can be only one
measure satisfying (5.3.10).
Finally, if $(\Psi, U)$ is a coordinate chart for $M$ and $\Gamma \in \mathcal{B}_{\mathbb{R}^N}$ with $\overline{\Gamma} \subseteq \Psi(U)$ is
bounded, then (5.3.18) with $f = 1_\Gamma$ is an immediate consequence of (5.3.16).
Hence, (5.3.18) follows in general by taking linear combinations and monotone
limits. □
The measure $\lambda_M$ produced in Theorem 5.3.17 is called the surface measure
on $M$.

Exercises

5.3.20 Exercise: In the final assertion of part (ii) in Exercise 5.2.4 and again
in (i) of Exercise 5.2.6, we tacitly accepted the equality of $\pi$, the volume $\Omega_2$ of
the unit ball $B_{\mathbb{R}^2}(0, 1)$ in $\mathbb{R}^2$, with the half-period $\pi$ of the sin and cos functions.
We are now in a position to justify this identification. To this end, define
$\Phi : G \equiv (0, 1) \times (0, 2\pi) \to \mathbb{R}^2$ (the $\pi$ here is the half-period of sin and cos) by
$\Phi(r, \theta) = (r\cos\theta, r\sin\theta)^T$. Note that $\Phi(G) = B(0, 1) \setminus \{x \in B(0, 1) : x_1 \ge 0 \text{ and } x_2 = 0\}$
and therefore that $|\Phi(G)| = \Omega_2$. Now use Jacobi's Transformation Formula to
compute $|\Phi(G)|$.
5.3.21 Exercise: Let $M$ be a hypersurface in $\mathbb{R}^N$. Show that, for each $p \in M$,
the tangent space $T_p(M)$ coincides with the set of $v \in \mathbb{R}^N$ such that

$$\varlimsup_{\xi \to 0} \frac{\operatorname{dist}(p + \xi v, M)}{\xi^2} < \infty.$$

Hint: Given $v \in T_p(M)$, choose a twice continuously differentiable associated
curve $\gamma$, and consider $\xi \mapsto |p + \xi v - \gamma(\xi)|$.
5.3.22 Exercise: Given $r > 0$, set $S^{N-1}(r) = \{x \in \mathbb{R}^N : |x| = r\}$, and
observe that $S^{N-1}(r)$ is a smooth region. Next, define $\Phi_r : S^{N-1} \to S^{N-1}(r)$
by $\Phi_r(\omega) = r\omega$; and show that $\lambda_{S^{N-1}(r)} = r^{N-1} (\Phi_r)^{*}\lambda_{S^{N-1}}$.
5.3.23 Exercise: Show that if $\Gamma \neq \emptyset$ is an open subset of a hypersurface $M$,
then $\lambda_M(\Gamma) > 0$.
5.3.24 Exercise: In this exercise we introduce a function which is intimately
related to Euler's Gamma function.
(i) For $(\alpha, \beta) \in (0, \infty)^2$, define

$$B(\alpha, \beta) = \int_{(0,1)} u^{\alpha-1} (1-u)^{\beta-1}\, du.$$

Show that

$$B(\alpha, \beta) = \frac{\Gamma(\alpha)\Gamma(\beta)}{\Gamma(\alpha + \beta)},$$

where $\Gamma(\cdot)$ is the Gamma function described in (ii) of Exercise 5.2.6. (See part
(iv) of Exercise 6.3.18 for another derivation.) The function $B$ is called the
Beta function. Clearly it provides an extension of the binomial coefficients
in the sense that

$$\frac{1}{(m+n+1)\, B(m+1, n+1)} = \binom{m+n}{m}$$

for all non-negative integers $m$ and $n$.
Hint: Think of $\Gamma(\alpha)\Gamma(\beta)$ as an integral in $(s, t)$ over $(0, \infty)^2$, and consider
the map

$$(u, v) \in (0, \infty) \times (0, 1) \longmapsto \begin{pmatrix} uv \\ u(1-v) \end{pmatrix} \in (0, \infty)^2.$$
( ii ) For ,\ > � show that

where (cf. part (ii) of Exercise 5.2.6) $\omega_{N-1}$ is the surface area of $S^{N-1}$. In


particular, conclude that

Hint : Use polar coordinates and then try the change of variable 4l{r) == 1 � r2 •
2

(iii) For $\lambda \in (0, \infty)$, show that


and conclude that, for any $N \in \mathbb{Z}^+$,

$$\int_{(-1,1)} (1 - \xi^2)^{\frac{N}{2} - 1}\, d\xi = \frac{\omega_N}{\omega_{N-1}}.$$

Finally, check that this last result is consistent with part (iii) of Exercise 5.2.4.
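The identities in part (i) can be checked numerically. The Python sketch below compares a midpoint-rule approximation of the Beta integral with the Gamma-function expression, and recovers a binomial coefficient from the Beta function. The function names and the sample arguments are illustrative assumptions; the quadrature is only a rough check, not a proof.

```python
import math

def beta_numeric(a, b, n=100_000):
    """Midpoint-rule approximation of the integral of u^(a-1) (1-u)^(b-1) over (0, 1)."""
    h = 1.0 / n
    return h * sum(((k + 0.5) * h) ** (a - 1) * (1 - (k + 0.5) * h) ** (b - 1)
                   for k in range(n))

def beta_gamma(a, b):
    """B(a, b) computed from the Gamma-function identity in part (i)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)

print(beta_numeric(2.5, 1.5), beta_gamma(2.5, 1.5))

# Binomial coefficient recovered from the Beta function for integers m and n.
m, n = 3, 4
print(round(1 / ((m + n + 1) * beta_gamma(m + 1, n + 1))), math.comb(m + n, m))
```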
5.3.25 Exercise: Let $U$ be a non-empty, open subset of $\mathbb{R}^{N-1}$ and $f \in C^3(U; \mathbb{R})$
be given, take

$$M = \{(u, f(u)) : u \in U\} \quad\text{and}\quad \Psi(u) = \begin{pmatrix} u \\ f(u) \end{pmatrix}, \quad u \in U.$$

That is, $M$ is the graph of $f$.
(i) Check that $M$ is a hypersurface and that $(\Psi, U)$ is a coordinate chart for $M$
which is global in the sense that $M = \Psi(U)$.
(ii) Show that

$$\Bigl(\bigl(\bigl(\Psi_{,i}, \Psi_{,j}\bigr)_{\mathbb{R}^N}\bigr)\Bigr)_{1 \le i,j \le N-1} = I_{\mathbb{R}^{N-1}} + (\nabla f)^T \nabla f,$$

and conclude that $\delta\Psi = \sqrt{1 + |\nabla f|^2}$.
Hint: Given a non-zero, row vector $v$, set $A = I + v^T v$ and $e = \frac{v^T}{|v|}$. Note that
$Ae = (1 + |v|^2)e$ and that $Aw = w$ for $w \perp e$. Thus, $A$ has one eigenvalue
equal to $1 + |v|^2$ and all its other eigenvalues equal 1.
(iii) From the preceding, arrive at

$$\int_M \varphi\, d\lambda_M = \int_U \varphi\bigl(u, f(u)\bigr) \sqrt{1 + |\nabla f(u)|^2}\, du, \quad \varphi \in C_c(M),$$

a formula which should be familiar from elementary calculus.
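To see the formula in part (iii) at work, here is a small Python sketch that computes the area of a graph numerically. The example $f(u) = |u|^2$ over the unit disc in $\mathbb{R}^2$, the use of polar coordinates, and the function name are all assumptions made for illustration; the closed form comes from evaluating the same radial integral by hand.

```python
import math

def graph_area_paraboloid(n_r=2000):
    """Area of the graph of f(u) = |u|^2 over the unit disc in R^2.

    Uses the integral over U of sqrt(1 + |grad f|^2) du in polar coordinates;
    here |grad f(u)|^2 = 4 |u|^2, so the integrand is radial.
    """
    dr = 1.0 / n_r
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr                      # midpoint of the i-th radial cell
        total += math.sqrt(1 + 4 * r * r) * r * dr
    return 2 * math.pi * total                  # the angular integral contributes 2*pi

exact = math.pi / 6 * (5 ** 1.5 - 1)            # closed form of the same integral
print(graph_area_paraboloid(), exact)
```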

5.3.26 Exercise: Let $G$ be a non-empty, open set in $\mathbb{R}^N$ and $F \in C^3(G; \mathbb{R})$, and assume that
$M \equiv \{x \in G : F(x) = 0\} \neq \emptyset$ and that $F_{,N}$ never vanishes on $M$.

(i) Set $U = \{u \in \mathbb{R}^{N-1} : (u, t) \in M \text{ for some } t \in \mathbb{R}\}$, and show that there
exists an $f \in C^3(U; \mathbb{R})$ with the property that $F(u, f(u)) = 0$ for all $u \in U$.
In particular, $M$ is the graph of $f$.
(ii) Define $\Psi$ from $f$ as in Exercise 5.3.25, and conclude that

$$\int_M \varphi\, d\lambda_M = \int_U \frac{\varphi\, |\nabla F|}{|F_{,N}|} \circ \Psi(u)\, du.$$

5.3.27 Exercise: Again let $G$ be a non-empty, open set in $\mathbb{R}^N$ and $F \in C^3(G; \mathbb{R})$,
but this time assume that $F_{,N}$ vanishes nowhere on $G$. Set $T = \operatorname{Range}(F)$
and, for $t \in T$, $M_t = \{x \in G : F(x) = t\}$ and $U_t = \{u \in \mathbb{R}^{N-1} : (u, t) \in M_t\}$.
(i) Define $\Phi$ on $G$ by

$$\Phi(x) = \begin{pmatrix} x_1 \\ \vdots \\ x_{N-1} \\ F(x) \end{pmatrix}$$

and show that $\Phi$ is a diffeomorphism. As an application of Exercise 5.3.26,
show that, for each $t \in T$, $M_t$ is a hypersurface and that, for each $\varphi \in C_c(G)$:

$$\int_{M_t} \varphi\, d\lambda_{M_t} = \int_{\mathbb{R}^{N-1}} 1_{\Phi(G)}(u, t)\, \frac{\varphi\, |\nabla F|}{|F_{,N}|} \circ \Phi^{-1}(u, t)\, du, \quad t \in T.$$
In particular, conclude that

is bounded and $\mathcal{B}_T$-measurable.


(ii) Using Theorems 4.1.6 and 5.3.2, note that, for $\varphi \in C_c(G)$,

$$\begin{aligned}
\int_G \varphi(x)\, dx &= \int_{\Phi(G)} \frac{\varphi}{|F_{,N}|} \circ \Phi^{-1}(y)\, dy \\
&= \int_T \left( \int_{\mathbb{R}^{N-1}} 1_{\Phi(G)}(u, t)\, \frac{\varphi}{|F_{,N}|} \circ \Phi^{-1}(u, t)\, du \right) dt.
\end{aligned}$$

After combining this with part (i), arrive at the following (somewhat primitive)
version of the co-area formula
(5.3.28)

(iii) Take $G = \{x \in \mathbb{R}^N : x_N \neq 0\}$, $F(x) = |x|$ for $x \in G$, and show that (5.2.3)
can be easily derived from (5.3.28).

5.4 The Divergence Theorem.


Again let $N \ge 2$. Perhaps the single most striking application of the construction
made in the second part of Section 4.3 is to multi-dimensional integration
by parts formulae. Namely, given a non-empty open $G$ in $\mathbb{R}^N$, we will
say that the region $G$ is smooth if its boundary $\partial G$ is a hypersurface in $\mathbb{R}^N$
and, for each $p \in \partial G$, the number $r > 0$ and $F : B_{\mathbb{R}^N}(p, r) \to \mathbb{R}$ in (5.3.7)
can be chosen so that, in addition to (5.3.7),

$$G \cap B_{\mathbb{R}^N}(p, r) = \{y \in B_{\mathbb{R}^N}(p, r) : F(y) < 0\}. \tag{5.4.1}$$

Notice that if $G$ is a smooth region, then, for each $p \in \partial G$, there is a unique
$n(p) \in S^{N-1} \cap T_p(\partial G)^{\perp}$ with the property that, for some $\epsilon > 0$:

$$p + \xi\, n(p) \in \begin{cases} G^{\complement} & \text{if } \xi \in (0, \epsilon) \\ G & \text{if } \xi \in (-\epsilon, 0). \end{cases} \tag{5.4.2}$$
For obvious reasons, $n(p)$ is called the outer normal to $\partial G$ at $p$. Notice
that $x \in \partial G \mapsto n(x) \in S^{N-1}$ is locally (i.e., on each compact) Lipschitz
continuous, since if $r$ and $F$ are chosen for $p$ so that (5.3.7) and (5.4.1) hold,
then (cf. Lemma 5.3.8)

$$n(x) = \frac{\nabla F(x)}{|\nabla F(x)|}, \quad x \in B_{\mathbb{R}^N}(p, r) \cap \partial G.$$

Our main goal in this section will be to prove that if $f \in C_c(\mathbb{R}^N; \mathbb{R})$ (i.e.,
vanishes off some compact subset) and $G$ is a smooth region, then for every
$\omega \in S^{N-1}$:

$$\frac{d}{d\xi} \int_G f(x + \xi\omega)\, dx\, \Big|_{\xi=0} = \int_{\partial G} f(x)\, \bigl(\omega, n(x)\bigr)_{\mathbb{R}^N}\, \lambda_{\partial G}(dx). \tag{5.4.3}$$

Observe that (5.4.3) is another way of seeing that surface measure is in truth
the derivative of $\lambda_{\mathbb{R}^N}$ across the surface. Indeed, since there is no requirement
that $f$ be differentiable, it must be Lebesgue's measure which is absorbing the
derivative on the left hand side of (5.4.3).
The key to (5.4.3) is contained in the following lemma.
5.4.4 Lemma. Let $G$ be a smooth region. Then, for each $p \in \partial G$, there is a
coordinate chart $(\Psi, U)$ for $\partial G$ and an $\epsilon > 0$ with the properties that: $U$ is
bounded, $p \in \Psi(U)$, and the associated map

$$(u, t) \in \tilde U \equiv U \times (-\epsilon, \epsilon) \longmapsto \tilde\Psi(u, t) \equiv \Psi(u) + t\, n\bigl(\Psi(u)\bigr)^T \in \mathbb{R}^N \tag{5.4.5}$$

is a diffeomorphism such that both $\tilde\Psi$ and $\tilde\Psi^{-1}$ have bounded, continuous first
and second order derivatives, and, for each $u \in U$,

$$\tilde\Psi(u, t) \in G \text{ if and only if } t \in (-\epsilon, 0). \tag{5.4.6}$$

Furthermore, given such a coordinate chart, (5.4.3) holds for every continuous
$f : \mathbb{R}^N \to \mathbb{R}$ with compact support in (i.e., vanishing off a compact subset
of) $\tilde\Psi(\tilde U)$.
PROOF: To prove the first part, use Lemma 5.3.11 to choose some coordinate
chart $(\Phi, W)$ for $\partial G$ with $p \in \Phi(W)$, assume that $W$ is connected, and define
associated $n(w)$, $w \in W$, $\tilde W$, and $\tilde\Phi$ as in Lemma 5.3.12. Clearly, $n(w) = \pm n(\Phi(w))$,
with the same choice of sign for every $w \in W$. Hence, if necessary
after replacing $W$ by $W' = \{w : (w_1, \dots, -w_{N-1}) \in W\}$ and $\Phi$ by

we may and will assume that $n(w) = n(\Phi(w))$ for all $w \in W$. In particular,
because $\tilde\Phi(\tilde W) \cap \partial G = \Phi(W)$ and $\tilde\Phi(w, t) \in G$ for sufficiently small strictly
negative $t$'s, we know that, for each $(w, t) \in \tilde W$, $\tilde\Phi(w, t) \in G$ if and only if $t < 0$.
Finally, choose $\epsilon > 0$ so that $B_{\mathbb{R}^N}(p, 2\epsilon) \subseteq \tilde\Phi(\tilde W)$, set $U = \Phi^{-1}\bigl(B_{\mathbb{R}^N}(p, \epsilon) \cap \partial G\bigr)$,
and take $\Psi$ and $\tilde\Psi$ to be, respectively, the restrictions of $\Phi$ and $\tilde\Phi$ to $U$ and
$\tilde U = U \times (-\epsilon, \epsilon)$.
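Turning to the second part of the lemma, set $\Omega = \tilde\Psi(\tilde U)$, and define

$$p(y) = \Psi\bigl(\tilde\Psi^{-1}(y)_1, \dots, \tilde\Psi^{-1}(y)_{N-1}\bigr) \quad\text{and}\quad n(y) = n\bigl(p(y)\bigr)$$

for $y \in \Omega$. Given $\omega \in S^{N-1}$, set

$$\omega_n(y) = \bigl(\omega, n(y)\bigr)_{\mathbb{R}^N} \quad\text{and}\quad \boldsymbol{\omega}_n(y) = \omega_n(y)\, n(y), \quad y \in \Omega;$$

and, given $f \in C(\mathbb{R}^N; \mathbb{R})$ with compact support $K \subseteq \Omega$, set $r = \operatorname{dist}(K, \Omega^{\complement})$.
Then, by the translation invariance of Lebesgue's measure, for every $\xi \in \mathbb{R}$:

$$\Delta(\xi) \equiv \int_G f(x + \xi\omega)\, dx - \int_G f(x)\, dx = \Delta_1(\xi) + \Delta_2(\xi),$$

where

$$\Delta_1(\xi) \equiv \int_\Omega \Bigl(1_G(x - \xi\omega) - 1_G\bigl(x - \xi\boldsymbol{\omega}_n(x)\bigr)\Bigr) f(x)\, dx$$

and

$$\Delta_2(\xi) \equiv \int_\Omega \Bigl(1_G\bigl(x - \xi\boldsymbol{\omega}_n(x)\bigr) - 1_G(x)\Bigr) f(x)\, dx.$$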
In order to prove that (5.4.3) holds, we will show that

$$\lim_{\xi \to 0} \frac{\Delta_1(\xi)}{\xi} = 0 \quad\text{and}\quad \lim_{\xi \to 0} \frac{\Delta_2(\xi)}{\xi} = \int_{\partial G} f(x)\, \omega_n(x)\, \lambda_{\partial G}(dx); \tag{5.4.7}$$

and we will begin with the second, and easier part of (5.4.7). To this end, we
use Theorem 5.3.2 and Fubini's Theorem to write

$$\Delta_2(\xi) = \int_U g(\xi, u)\, du, \quad |\xi| < r,$$

where

$$g(\xi, u) \equiv \int_{(-\epsilon, \epsilon)} \Bigl[1_{(-\infty, 0)}\bigl(t - \xi\omega_n(\Psi(u))\bigr) - 1_{(-\infty, 0)}(t)\Bigr] f\bigl(\tilde\Psi(u, t)\bigr)\, \delta\tilde\Psi(u, t)\, dt.$$

By elementary reasoning, one sees that, as $\xi \to 0$,


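uniformly and boundedly for $u \in U$; and so (cf. (5.3.18))

$$\lim_{\xi \to 0} \frac{\Delta_2(\xi)}{\xi} = \int_U f \circ \Psi(u)\, \omega_n\bigl(\Psi(u)\bigr)\, \delta\Psi(u)\, \lambda_{\mathbb{R}^{N-1}}(du) = \int_{\partial G} f(x)\, \omega_n(x)\, \lambda_{\partial G}(dx).$$

The first part of (5.4.7) is more involved. To handle it, we introduce the
notation

$$D(y) = \tilde\Psi^{-1}(y)_N = \bigl(y - p(y), n(y)\bigr)_{\mathbb{R}^N}, \quad y \in \Omega.$$

Observe that if $p \in \Omega \cap \partial G$, then (cf. Exercise 5.3.21)

$$\lim_{t \to 0} \frac{D(p + tv) - D(p)}{t} = \begin{cases} 0 & \text{if } v \in T_p(\partial G) \\ 1 & \text{if } v = n(p). \end{cases}$$

Hence,

$$\nabla D(p) = n(p) \quad\text{for } p \in \Omega \cap \partial G. \tag{5.4.8}$$

Next, set

$$E(y, \xi) = D\bigl(y - \xi\boldsymbol{\omega}_n(y)\bigr) - D(y - \xi\omega) \quad\text{for } (y, \xi) \in K \times (-r, r).$$

Clearly,

$$\begin{aligned}
x - \xi\omega \in G \text{ but } x - \xi\boldsymbol{\omega}_n(x) \notin G &\implies 0 \le D\bigl(x - \xi\boldsymbol{\omega}_n(x)\bigr) < E(x, \xi) \\
x - \xi\omega \notin G \text{ but } x - \xi\boldsymbol{\omega}_n(x) \in G &\implies E(x, \xi) \le D\bigl(x - \xi\boldsymbol{\omega}_n(x)\bigr) < 0.
\end{aligned}$$

Thus,

$$|\Delta_1(\xi)| \le \|f\|_u\, \lambda_{\mathbb{R}^N}\bigl(\Gamma(\xi)\bigr) \quad\text{where } \Gamma(\xi) \equiv \bigl\{x \in K : \bigl|D\bigl(x - \xi\boldsymbol{\omega}_n(x)\bigr)\bigr| \le |E(x, \xi)|\bigr\}.$$
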
In order to estimate $\lambda_{\mathbb{R}^N}(\Gamma(\xi))$, first observe that, for some $C_1 < \infty$,

Hence, since $p\bigl(x - \xi\boldsymbol{\omega}_n(x)\bigr) = p(x)$,

which, together with (5.4.8), leads to the existence of a $C_2 < \infty$ for which

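But, since $\omega - \boldsymbol{\omega}_n(x) \perp n(x)$, this, in conjunction with Taylor's Theorem, says
that

for some $C < \infty$. In other words, we have now shown that

$$\begin{aligned}
\Gamma(\xi) &\subseteq \bigl\{x \in K : \bigl|D\bigl(x - \xi\boldsymbol{\omega}_n(x)\bigr)\bigr| \le C\xi^2\bigr\} \\
&\subseteq \tilde\Psi\Bigl(\bigl\{(u, t) \in \tilde U : \bigl|t - \xi\omega_n(\Psi(u))\bigr| \le C\xi^2\bigr\}\Bigr).
\end{aligned}$$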

Finally, set

and use Theorem 5.3.2 to conclude that

which is more than enough to prove the first part of (5.4.7). □


5.4.9 Theorem. Let $G$ be a smooth region in $\mathbb{R}^N$ and $U$ an open neighborhood
of $\overline{G}$. Then (5.4.3) holds for every $f \in C_c(U; \mathbb{R})$. In particular, if $G$ is
bounded, then (5.4.3) holds for every $f \in C(U; \mathbb{R})$.
PROOF: We first observe that, without loss in generality, we may assume that
$U = \mathbb{R}^N$ and that $f \in C_c(\mathbb{R}^N; \mathbb{R})$. Indeed, when $G$ is unbounded but $f \in C_c(U; \mathbb{R})$,
choose a compact set $K \subseteq U$ so that $f$ vanishes off $K$; and when $G$
is bounded, choose $K$ so that $\overline{G} \subseteq \mathring{K}$ and $K \subseteq U$. Next, in either case, let $\epsilon$
denote the distance between $K$ and $U^{\complement}$ and define $\tilde f : \mathbb{R}^N \to \mathbb{R}$ so that

f(x) = [ ( � dist (x, UC ) ) 1\ 1 J f(x) for x E U


and $\tilde f \equiv 0$ off $U$. It is then clear that $\tilde f \in C_c(\mathbb{R}^N; \mathbb{R})$. Moreover, since

r i(x + �w) dx = r f(x + �w) dx for 1�1 < �2 .


la la
it is obvious that (5.4.3) for $f$ is equivalent to (5.4.3) for $\tilde f$.
In view of the preceding, we now assume that $f \in C_c(\mathbb{R}^N; \mathbb{R})$; and we start
the proof of (5.4.3) by observing that when $f$ vanishes off the compact set $K$
and $K \cap \partial G = \emptyset$, then $\epsilon = \operatorname{dist}(K, \partial G) > 0$ and

$$\int_G f(x + \xi\omega)\, dx = \int_K 1_G(x - \xi\omega)\, f(x)\, dx = \int_G f(x)\, dx \quad\text{for all } |\xi| < \epsilon.$$

Hence, in this case, (5.4.3) is essentially trivial.
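To complete the proof, we must show how to reduce to the situations for
which the result is already proved. Thus, let $f \in C_c(\mathbb{R}^N; \mathbb{R})$ be given and
choose a compact $K$ off which $f$ vanishes. As we noted above, there is nothing
more to do if $K \cap \partial G = \emptyset$. Thus, we assume that $K \cap \partial G \neq \emptyset$. Using the
compactness of $K \cap \partial G$ and Lemma 5.4.4, we now choose coordinate charts
$\{(\Psi_m, U_m)\}_1^n$ of the sort described in that lemma, points $\{p_m\}_1^n \subseteq K \cap \partial G$,
and numbers $\{r_m\}_1^n \subseteq (0, \infty)$ so that $B_{\mathbb{R}^N}(p_m, 4r_m) \subseteq \tilde\Psi_m(\tilde U_m)$ for each
$1 \le m \le n$ and

$$K \cap \partial G \subseteq \bigcup_1^n B_{\mathbb{R}^N}(p_m, r_m).$$

Set

$$H = \mathbb{R}^N \setminus \bigcup_1^n B_{\mathbb{R}^N}(p_m, r_m),$$

and define

$$\psi_0(x) = \operatorname{dist}\bigl(x, H^{\complement}\bigr) \quad\text{and}\quad \psi_m(x) = \operatorname{dist}\bigl(x, B_{\mathbb{R}^N}(p_m, 3r_m)^{\complement}\bigr) \text{ for } 1 \le m \le n.$$

It is then clear that each $\psi_m$ is a non-negative, continuous function and that

$$s(x) \equiv \sum_0^n \psi_m(x) \ge r_1 \wedge \cdots \wedge r_n > 0 \quad\text{for all } x \in \mathbb{R}^N.$$

Hence, if

$$f_m(x) = \frac{\psi_m(x) f(x)}{s(x)}, \quad x \in \mathbb{R}^N \text{ and } 0 \le m \le n,$$

then each $f_m$ is continuous, $f = \sum_0^n f_m$, $f_0$ vanishes off a compact subset of
$\partial G^{\complement}$, and, for $1 \le m \le n$, $f_m$ vanishes off of a compact subset of $\tilde\Psi_m(\tilde U_m)$.
In particular, this means that

$$\begin{aligned}
\frac{d}{d\xi} \int_G f(x + \xi\omega)\, dx\, \Big|_{\xi=0} &= \sum_0^n \frac{d}{d\xi} \int_G f_m(x + \xi\omega)\, dx\, \Big|_{\xi=0} \\
&= \sum_0^n \int_{\partial G} f_m(x)\, \bigl(\omega, n(x)\bigr)_{\mathbb{R}^N}\, \lambda_{\partial G}(dx) = \int_{\partial G} f(x)\, \bigl(\omega, n(x)\bigr)_{\mathbb{R}^N}\, \lambda_{\partial G}(dx). \quad □
\end{aligned}$$
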

Although (5.4.3) is the basic fact which we wanted to prove in this section,
it does not present the result in the form which is most frequently encountered
in applications. To see how to pass from (5.4.3) to the expression which we
are after, assume that $f$ is continuously differentiable in a neighborhood of $\overline{G}$,
and (cf. Exercise 4.1.13) note that (5.4.3) becomes

$$\int_G \bigl(\nabla f(x), \omega\bigr)_{\mathbb{R}^N}\, dx = \int_{\partial G} f(x)\, \bigl(\omega, n(x)\bigr)_{\mathbb{R}^N}\, \lambda_{\partial G}(dx).$$

Next, suppose that $F : \mathbb{R}^N \to \mathbb{R}^N$ is once continuously differentiable and that
$F \equiv 0$ off a compact set. By applying the preceding with $f = F_k$ and $\omega = e_k$
and summing over $1 \le k \le N$, we obtain

$$\int_G \operatorname{div} F(x)\, dx = \int_{\partial G} \bigl(F(x), n(x)\bigr)_{\mathbb{R}^N}\, \lambda_{\partial G}(dx), \tag{5.4.10}$$

where $\operatorname{div} F \equiv \sum_1^N F_{k,k}$ is the divergence of $F$. The formula in (5.4.10) is
sufficiently important to warrant our stating it as a theorem.
5.4.11 Theorem (Divergence Theorem). Again let $G$ be a smooth region
in $\mathbb{R}^N$ and $U$ an open neighborhood of $\overline{G}$. If $F : U \to \mathbb{R}^N$ is continuously
differentiable and either $G$ is bounded or $F \equiv 0$ off of a compact subset of $U$,
then

$$\int_G \operatorname{div} F(x)\, dx = \int_{\partial G} \bigl(F(x), n(x)\bigr)_{\mathbb{R}^N}\, \lambda_{\partial G}(dx).$$
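A concrete instance may help fix ideas. The Python sketch below checks the Divergence Theorem numerically when $G$ is the open unit disc in $\mathbb{R}^2$, where the outer normal is $n(x) = x$. The vector field $F(x, y) = (x^3, y^3)$, the midpoint quadrature, and all function names are assumptions chosen for illustration; both sides should be close to $3\pi/2$.

```python
import math

def F(x, y):
    """A smooth test field on R^2 (an assumed example)."""
    return (x ** 3, y ** 3)

def div_F(x, y):
    """Divergence of F, computed by hand: d/dx x^3 + d/dy y^3."""
    return 3 * x * x + 3 * y * y

def volume_side(n_r=400, n_t=400):
    """Midpoint rule in polar coordinates for the integral of div F over the unit disc."""
    dr, dt = 1.0 / n_r, 2 * math.pi / n_t
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * dr
        for j in range(n_t):
            t = (j + 0.5) * dt
            total += div_F(r * math.cos(t), r * math.sin(t)) * r * dr * dt
    return total

def boundary_side(n_t=400):
    """Integral of (F, n) over the unit circle, where n(x) = x is the outer normal."""
    dt = 2 * math.pi / n_t
    total = 0.0
    for j in range(n_t):
        t = (j + 0.5) * dt
        x, y = math.cos(t), math.sin(t)
        fx, fy = F(x, y)
        total += (fx * x + fy * y) * dt
    return total

print(volume_side(), boundary_side(), 1.5 * math.pi)
```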
Before dropping this topic, we will give some examples of the way in which
The Divergence Theorem is used in the analysis of partial differential equations.
Let $\Delta = \sum_{i=1}^N \frac{\partial^2}{\partial x_i^2}$ denote the standard Laplacian. The following variant
on The Divergence Theorem provides one of the keys to the analysis of
equations in which $\Delta$ appears.
5.4.12 Theorem (Green's Identity). Let $G$ be a smooth region in $\mathbb{R}^N$ and
$U$ an open neighborhood of $\overline{G}$. If $u$ and $v$ are twice continuously differentiable
$\mathbb{R}$-valued functions on $U$ and either $G$ is bounded or $u$ has compact support in
$U$, then

$$\int_G u\, \Delta v\, dx - \int_G v\, \Delta u\, dx = \int_{\partial G} u\, \frac{\partial v}{\partial n}\, \lambda_{\partial G}(dx) - \int_{\partial G} v\, \frac{\partial u}{\partial n}\, \lambda_{\partial G}(dx), \tag{5.4.13}$$

where

$$\frac{\partial f}{\partial n}(x) \equiv \frac{d}{d\xi} f\bigl(x + \xi\, n(x)\bigr)\Big|_{\xi=0} = \bigl(\nabla f(x), n(x)\bigr)_{\mathbb{R}^N}$$

denotes differentiation in the direction (cf. (5.4.2)) $n$.

PROOF: Simply note that $u\, \Delta v - v\, \Delta u = \operatorname{div}(u \nabla v - v \nabla u)$, and apply The
Divergence Theorem. □
In order to extract information from Green's Identity, one must make judicious
choices of $v$ for a given $u$. For example, one often wants to take $v$ to be
the fundamental solution $g$ given by $g(0) = \infty$ and

$$g(x) = \begin{cases} -\log|x| & \text{if } N = 2 \\ |x|^{2-N} & \text{if } N \ge 3 \end{cases} \quad\text{for } x \in \mathbb{R}^N \setminus \{0\}. \tag{5.4.14}$$

Note that $g$ and $|\nabla g|$ are integrable on every compact subset of $\mathbb{R}^N$. In addition,
the following facts about $g$ are easy to verify:

$$\Delta g(x) = 0 \quad\text{and}\quad |x|^N\, \nabla g(x) = \begin{cases} -x & \text{if } N = 2 \\ (2-N)\,x & \text{if } N \ge 3 \end{cases} \quad\text{on } \mathbb{R}^N \setminus \{0\}. \tag{5.4.15}$$
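The harmonicity of $g$ away from the origin can be checked numerically. The following Python sketch approximates $\Delta g$ by central differences at one sample point for $N = 2$ and for $N = 3$. The helper name `laplacian`, the step size, and the sample points are illustrative assumptions; the printed values should be small but not exactly zero because of discretization error.

```python
import math

def g(x):
    """Fundamental solution from (5.4.14): -log|x| if N = 2, |x|^(2-N) if N >= 3."""
    N = len(x)
    norm = math.sqrt(sum(c * c for c in x))
    return -math.log(norm) if N == 2 else norm ** (2 - N)

def laplacian(func, x, h=1e-3):
    """Second-order central-difference approximation of the Laplacian of func at x."""
    total = 0.0
    for i in range(len(x)):
        up = list(x)
        up[i] += h
        down = list(x)
        down[i] -= h
        total += (func(up) - 2 * func(x) + func(down)) / (h * h)
    return total

print(laplacian(g, [1.0, 0.5]))          # N = 2
print(laplacian(g, [1.0, 0.5, -0.3]))    # N = 3
```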
Our first application allows us to solve the Poisson equation $\Delta u = -f$.
5.4.16 Theorem. Set $c_N = 2\pi$ or $(N-2)\,\omega_{N-1}$ (cf. Exercise 5.2.6) depending
on whether $N = 2$ or $N \ge 3$. Given $f \in C_c^2(\mathbb{R}^N; \mathbb{R})$, define $u_f$ on $\mathbb{R}^N$ by (cf.
(5.1.14))

$$u_f(x) = \frac{1}{c_N} \int_{\mathbb{R}^N} g(x - y)\, f(y)\, dy.$$

Then $u_f \in C^2(\mathbb{R}^N; \mathbb{R})$ and $\Delta u_f = -f$.
PROOF: First observe that another expression for $u_f(\cdot)$ is $\frac{1}{c_N}\int_{\mathbb{R}^N} g(y)\, f(\cdot - y)\, dy$,
and it is clear (cf. Exercise 4.1.13) from this latter expression not only that
$u_f \in C^2(\mathbb{R}^N; \mathbb{R})$ but also that $\Delta u_f(x) = \frac{1}{c_N}\int_{\mathbb{R}^N} g(y)\, \Delta f(x - y)\, dy$. Thus, all that
we need to do is check that $\int_{\mathbb{R}^N} g(y)\, \Delta f(x - y)\, dy = -c_N f(x)$.
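Fix $x$ and choose $R > 1$ so that $f \equiv 0$ off of $B(x, R-1)$. For $0 < r < R$, set
$G_r = B(0, R) \setminus \overline{B(0, r)}$. Then

$$\int_{\mathbb{R}^N} g(y)\, \Delta f(x - y)\, dy = \lim_{r \searrow 0} \int_{G_r} g(y)\, \Delta f(x - y)\, dy;$$

and, by Green's Identity and (5.4.15), for each $0 < r < R$ (cf. Exercise 5.3.22):

$$\begin{aligned}
\int_{G_r} g(y)\, \Delta f(x - y)\, dy
&= \int_{\partial G_r} g(y)\, \frac{\partial f}{\partial n}(x - y)\, \lambda_{\partial G_r}(dy) - \int_{\partial G_r} f(x - y)\, \frac{\partial g}{\partial n}(y)\, \lambda_{\partial G_r}(dy) \\
&= -\int_{\partial B(0,r)} g(y)\, \frac{\partial f}{\partial \rho}(x - y)\, \lambda_{\partial B(0,r)}(dy) + \int_{S^{N-1}(r)} f(x - y)\, \frac{\partial g}{\partial \rho}(y)\, \lambda_{\partial B(0,r)}(dy) \\
&= -r^{N-1} \int_{S^{N-1}} g(r\omega)\, \frac{\partial f}{\partial \rho}(x + r\omega)\, \lambda_{S^{N-1}}(d\omega) + r^{N-1} \int_{S^{N-1}} f(x + r\omega)\, \frac{\partial g}{\partial \rho}(r\omega)\, \lambda_{S^{N-1}}(d\omega),
\end{aligned}$$

where $\frac{\partial}{\partial \rho}$ denotes differentiation in the outward radial direction and we have
used Exercise 5.3.22 together with the fact that, for $G_r$, $\frac{\partial}{\partial n} = -\frac{\partial}{\partial \rho}$ on $S^{N-1}(r)$.
But $r^{N-1} g(r\omega) \to 0$ uniformly as $r \searrow 0$ and (cf. (5.4.15))

$$r^{N-1}\, \frac{\partial g}{\partial \rho}(r\omega) = \begin{cases} -1 & \text{if } N = 2 \\ -(N-2) & \text{if } N \ge 3. \end{cases}$$

After combining this with the preceding, we now see that

$$\lim_{r \searrow 0} \int_{G_r} g(y)\, \Delta f(x - y)\, dy = -\frac{c_N}{\omega_{N-1}} \lim_{r \searrow 0} \int_{S^{N-1}} f(x + r\omega)\, \lambda_{S^{N-1}}(d\omega) = -c_N f(x). \quad □$$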
Our second application of Green's Identity will be to harmonic functions. A
function $u \in C^2(G; \mathbb{R})$ is said to be harmonic in $G$ if $\Delta u = 0$. Notice that
if $N = 1$ and $u$ is harmonic on $(a, b)$ and continuous on $[a, b]$, then
$u(x) = \frac{b-x}{b-a}\, u(a) + \frac{x-a}{b-a}\, u(b)$ for $x \in [a, b]$. In particular, $u\bigl(\frac{a+b}{2}\bigr)$ is precisely the
mean of the values that $u$ takes on $\partial(a, b)$. We will now use Green's Identity
to derive the analogous fact about harmonic functions in higher dimensions.
5.4.17 Theorem (The Mean Value Property). Suppose that $u$ is an harmonic
element of $C^2(G; \mathbb{R})$. Then, for each $x \in G$ and $R > 0$ satisfying
$\overline{B(x, R)} \subseteq G$,

$$u(x) = \frac{1}{\omega_{N-1}} \int_{S^{N-1}} u(x + R\omega)\, \lambda_{S^{N-1}}(d\omega). \tag{5.4.18}$$
PROOF: Without loss in generality, we will assume that $x = 0$.
Set (cf. (5.4.14)) $g_R(x) = g(x) - g(Re_1)$ where $e_1 = (1, 0, \dots, 0) \in S^{N-1}$.
Then, by Green's Identity applied to the functions $u$ and $g_R$ in the region (cf.
the proof of the preceding) $G_r$

$$\begin{aligned}
0 &= \int_{G_r} \bigl(g_R(y)\, \Delta u(y) - u(y)\, \Delta g_R(y)\bigr)\, dy \\
&= -R^{N-1} \int_{S^{N-1}} u(R\omega)\, \frac{\partial g_R}{\partial \rho}(R\omega)\, \lambda_{S^{N-1}}(d\omega) \\
&\quad - r^{N-1} \int_{S^{N-1}} g_R(r\omega)\, \frac{\partial u}{\partial \rho}(r\omega)\, \lambda_{S^{N-1}}(d\omega) \\
&\quad + r^{N-1} \int_{S^{N-1}} u(r\omega)\, \frac{\partial g_R}{\partial \rho}(r\omega)\, \lambda_{S^{N-1}}(d\omega),
\end{aligned}$$

where we have used the same notation as in the preceding proof. Note that
the first term on the right equals (cf. (5.4.15))

$$\frac{c_N}{\omega_{N-1}} \int_{S^{N-1}} u(R\omega)\, \lambda_{S^{N-1}}(d\omega),$$

the second term tends to 0 as $r \searrow 0$, while the third term tends to $-c_N u(0)$. □
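The Mean Value Property is easy to observe numerically in the plane. In the Python sketch below, the normalized integral over $S^1$ in (5.4.18) is approximated by an average over equally spaced points on the circle. The harmonic function $u(x, y) = e^x \cos y$, the center, the radius, and the function names are assumptions made for illustration.

```python
import math

def u(x, y):
    """A harmonic function on R^2: the real part of exp(x + iy)."""
    return math.exp(x) * math.cos(y)

def circle_mean(func, cx, cy, R, n=64):
    """Average of func over the circle of radius R about (cx, cy), using n equally spaced points."""
    total = 0.0
    for k in range(n):
        t = 2 * math.pi * k / n
        total += func(cx + R * math.cos(t), cy + R * math.sin(t))
    return total / n

print(u(0.3, -0.2), circle_mean(u, 0.3, -0.2, 0.7))
```

The two printed numbers should agree to many digits, since equally spaced sampling is very accurate for smooth periodic integrands.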
Exercises

5.4.19 Exercise: Let $u$ be a twice continuously differentiable function in a
neighborhood of the closed ball $\overline{B(x, r)}$, and assume that $\Delta u \le 0$ in $B(x, r)$.
Generalize the Mean Value Property by showing that

$$u(x) \ge \frac{1}{\omega_{N-1}} \int_{S^{N-1}} u(x + r\omega)\, \lambda_{S^{N-1}}(d\omega). \tag{5.4.20}$$

Next, show that (5.4.18) and (5.4.20) yield, respectively,

$$u(x) = \frac{1}{\Omega_N r^N} \int_{B(0, r)} u(x + y)\, dy \tag{5.4.21 (i)}$$

and

$$u(x) \ge \frac{1}{\Omega_N r^N} \int_{B(0, r)} u(x + y)\, dy. \tag{5.4.21 (ii)}$$

Using (5.4.21 (ii)), argue that if $G$ is a connected open set in $\mathbb{R}^N$ and $u \in C^2(G; \mathbb{R})$
satisfies $\Delta u \le 0$, then $u$ achieves its minimum value at an $x \in G$ if
and only if $u$ is constant on $G$. This fact is known as the strong minimum
principle.
5.4.22 Exercise: Let $G$ be a bounded, smooth region in the plane $\mathbb{R}^2$. In addition,
assume $\partial G$ is a closed curve in the sense that there is a $\gamma \in C^2([0, 1); \mathbb{R}^2)$
with the properties that
$t \in [0, 1) \mapsto \gamma(t) \in \partial G$ is an injective surjection,

$$\gamma(0) = \lim_{t \nearrow 1} \gamma(t), \quad \dot\gamma(0) = \lim_{t \nearrow 1} \dot\gamma(t), \quad \ddot\gamma(0) = \lim_{t \nearrow 1} \ddot\gamma(t), \quad\text{and } |\dot\gamma(t)| > 0 \text{ for } t \in [0, 1).$$

(i) Show that

$$\int_{\partial G} \varphi(\zeta)\, \lambda_{\partial G}(d\zeta) = \int_{[0,1]} \varphi \circ \gamma(t)\, |\dot\gamma(t)|\, dt$$

for all bounded measurable $\varphi$ on $\partial G$.
(ii) Let $n(t)$ denote the outer normal to $G$ at $\gamma(t)$, check that

$$n(t) = \pm |\dot\gamma(t)|^{-1} \bigl(\dot\gamma_2(t), -\dot\gamma_1(t)\bigr),$$

with the same sign for all $t \in [0, 1)$, and assume that $\gamma$ has been parameterized
so that the plus sign is the correct one. Next, suppose that $h \in C^2(G^{(\rho)}; \mathbb{R})$
for some $\rho > 0$, and define $u = \frac{\partial h}{\partial x_1}$ and $v = -\frac{\partial h}{\partial x_2}$. Next, set $f = u + \sqrt{-1}\, v$ and
$z(t) = \gamma_1(t) + \sqrt{-1}\, \gamma_2(t)$. Show that

$$\int_{[0,1]} f\bigl(\gamma(t)\bigr)\, \dot z(t)\, dt = \sqrt{-1} \int_G [\Delta h](\zeta)\, d\zeta. \tag{5.4.23}$$

A particularly important case of (5.4.23) is the one when $\Delta h = 0$; in which
case $f$ is a complex analytic function and (5.4.23) leads to the famous Cauchy
integral theorem.
5.4.24 Exercise: Let $G$, a non-empty, open set in $\mathbb{R}^N$, and $F \in C^3(G; \mathbb{R})$ be
given, and assume that $|\nabla F|$ never vanishes in $G$. Next, set $T = \operatorname{Range}(F)$
and, for $t \in T$, define $M_t = \{x \in G : F(x) = t\}$. Check that, for each
$t \in T$, $M_t$ is a hypersurface. Further, by combining the result obtained in
Exercise 5.3.27 with the localization technique used in the proof of Theorem
5.4.9, show that, for each $\varphi \in C_c(G)$, $t \in T \mapsto \int_{M_t} \varphi\, d\lambda_{M_t} \in \mathbb{R}$ is bounded
and measurable and that the co-area formula (5.3.28) continues to hold.
Chapter VI
Some Basic Inequalities

6.1 Jensen, Minkowski, and Hölder.


There are a few general inequalities which play a central role in measure
theory and its applications. The ones dealt with in this section are all consequences
of convexity considerations.
A subset $C \subseteq \mathbb{R}^N$ is said to be convex if $(1-t)x + ty \in C$ whenever
$x, y \in C$ and $t \in [0, 1]$. Given a convex set $C \subseteq \mathbb{R}^N$, we say that $g : C \to \mathbb{R}$
is a concave function on $C$ if

$$g\bigl((1-t)x + ty\bigr) \ge (1-t)\,g(x) + t\,g(y) \quad\text{for all } x, y \in C \text{ and } t \in [0, 1].$$

Note that $g$ is concave on $C$ if and only if $\{(x, t) \in C \times \mathbb{R} : t \le g(x)\}$ is a
convex subset of $\mathbb{R}^{N+1}$. In addition, use induction on $n \ge 2$ to see that

$$\sum_1^n \alpha_k y_k \in C \quad\text{and}\quad g\Bigl(\sum_1^n \alpha_k y_k\Bigr) \ge \sum_1^n \alpha_k\, g(y_k)$$

for all $n \ge 2$, $\{y_1, \dots, y_n\} \subseteq C$ and $\{\alpha_1, \dots, \alpha_n\} \subseteq [0, 1]$ with $\sum_1^n \alpha_k = 1$.
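The finite inequality for convex combinations can be tried out directly. The Python sketch below draws random points of $C = (0, \infty)$ and random weights summing to 1, and evaluates the gap between the two sides for two concave functions. The helper name `concave_gap`, the random sample, and the choice of `math.log` and `math.sqrt` are illustrative assumptions.

```python
import math
import random

def concave_gap(g, points, weights):
    """Return g(sum a_k y_k) - sum a_k g(y_k); non-negative when g is concave."""
    mean_point = sum(a * y for a, y in zip(weights, points))
    return g(mean_point) - sum(a * g(y) for a, y in zip(weights, points))

random.seed(0)
points = [random.uniform(0.1, 10.0) for _ in range(5)]   # points of C = (0, infinity)
raw = [random.random() for _ in range(5)]
weights = [r / sum(raw) for r in raw]                    # a_k in [0, 1] summing to 1

print(concave_gap(math.log, points, weights))   # log is concave on (0, infinity)
print(concave_gap(math.sqrt, points, weights))  # so is the square root
```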


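The essence of the relationship between these notions and measure theory
is contained in the following.
6.1.1 Theorem (Jensen's Inequality). Let $C$ be a closed, convex subset of
$\mathbb{R}^N$, and suppose that $g$ is a continuous, concave, non-negative function on $C$.
Let $(E, \mathcal{B}, \mu)$ be a probability space and $F : E \to C$ a measurable function
on $(E, \mathcal{B})$ with the property that $|F| \in L^1(\mu)$. Then

$$\int_E F\, d\mu \equiv \begin{pmatrix} \int_E F_1\, d\mu \\ \vdots \\ \int_E F_N\, d\mu \end{pmatrix} \in C$$

and

$$\int_E g \circ F\, d\mu \le g\Bigl(\int_E F\, d\mu\Bigr).$$

(See Exercise 6.1.9 for another derivation.)

PROOF: First assume that $F$ is simple. Then $F = \sum_{k=0}^n y_k 1_{\Gamma_k}$ for some $n \in \mathbb{Z}^+$,
$y_0, \dots, y_n \in C$ and cover $\{\Gamma_0, \dots, \Gamma_n\}$ of $E$ by mutually disjoint elements
of $\mathcal{B}$. Hence, since $\sum_0^n \mu(\Gamma_k) = 1$ and $C$ is convex, $\int_E F\, d\mu = \sum_0^n y_k\, \mu(\Gamma_k) \in C$
and, because $g$ is concave,

$$g\Bigl(\int_E F\, d\mu\Bigr) = g\Bigl(\sum_0^n y_k\, \mu(\Gamma_k)\Bigr) \ge \sum_0^n g(y_k)\, \mu(\Gamma_k) = \int_E g \circ F\, d\mu.$$
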
Next, let $F$ be general. Choose and fix some element $y_0$ of $C$, and let $\{y_k\}_1^\infty$
be a dense sequence in $C$. Given $m \in \mathbb{Z}^+$, choose $R_m > 0$ so that

and $n_m \in \mathbb{Z}^+$ so that $C \cap B(0, R_m) \subseteq \bigcup_1^{n_m} B\bigl(y_k, \frac1m\bigr)$. Define
$\Gamma_{m,0} = \{\xi \in E : |F(\xi)| \ge R_m\}$, and use induction to define

$$\Gamma_{m,\ell} = \Bigl\{\xi \in E \setminus \bigcup_{k=0}^{\ell-1} \Gamma_{m,k} : F(\xi) \in B\bigl(y_\ell, \tfrac1m\bigr)\Bigr\}$$

for $1 \le \ell \le n_m$. Finally, set $F_m = \sum_{k=0}^{n_m} y_k 1_{\Gamma_{m,k}}$. Then, by the preceding,
$\int_E F_m\, d\mu \in C$ and

$$g\Bigl(\int_E F_m\, d\mu\Bigr) \ge \int_E g \circ F_m\, d\mu$$

for each $m \in \mathbb{Z}^+$.
Moreover, it is easy to see that $\bigl\|\, |F_m - F|\, \bigr\|_{L^1(\mu)} \to 0$ as
$m \to \infty$. Thus, because $C$ is closed, we now see that $\int_E F\, d\mu \in C$. At the
same time, because $g$ is continuous, $g \circ F_m \to g \circ F$ in $\mu$-measure as $m \to \infty$.
Hence, by Fatou's Lemma,

We now need to develop a criterion for recognizing when a function is concave. Such a criterion is contained in the next lemma.
6.1.2 Lemma. Suppose that $\mathring{C} \subseteq \mathbb{R}^N$ is an open convex set, and let $C$ be its closure. Then $C$ is convex. Moreover, if $g$ is continuous on $C$ and $g \in C^2(\mathring{C})$, then $g$ is concave on $C$ if and only if its Hessian matrix

$$H_g(x) \equiv \Bigl(\frac{\partial^2 g}{\partial x_i\,\partial x_j}(x)\Bigr)_{1 \le i,j \le N}$$

is non-positive definite (i.e., all of the eigenvalues of $H_g(x)$ are non-positive) for each $x \in \mathring{C}$.
PROOF: The convexity of $C$ is obvious.

In order to prove that $g$ is concave on $C$ if $H_g(x)$ is non-positive definite at every $x \in \mathring{C}$, we will use the following simple result about functions on the interval $[0,1]$. Namely, suppose that $u$ is continuous on $[0,1]$ and that $u$ has two continuous derivatives on $(0,1)$. Then $u(t) \ge 0$ for every $t \in [0,1]$ if $u(0) = u(1) = 0$ and $u''(t) \le 0$ for every $t \in (0,1)$. To see this, let $\epsilon > 0$ be given and consider the function $u_\epsilon \equiv u - \epsilon t(t-1)$. Clearly it is enough for us to show that $u_\epsilon \ge 0$ on $[0,1]$ for every $\epsilon > 0$. Note that $u_\epsilon(0) = u_\epsilon(1) = 0$ and $u_\epsilon''(t) < 0$ for every $t \in (0,1)$. In particular, if $u_\epsilon(t) < 0$ for some $t \in [0,1]$, then there is an $s \in (0,1)$ at which $u_\epsilon$ achieves its absolute minimum. But this is impossible, since then we would have that $u_\epsilon''(s) \ge 0$. (The astute reader will undoubtedly see that this result could have been derived as a consequence of the strong minimum principle in Exercise 5.4.19 for $N = 1$.)

Now assume that $H_g(x)$ is non-positive definite for every $x \in \mathring{C}$. Given $x, y \in \mathring{C}$, define $u(t) = g\bigl((1-t)x + ty\bigr) - (1-t)g(x) - t\,g(y)$ for $t \in [0,1]$. Then $u(0) = u(1) = 0$ and

$$u''(t) = \bigl(y - x,\, H_g\bigl((1-t)x + ty\bigr)(y-x)\bigr)_{\mathbb{R}^N} \le 0$$

for every $t \in (0,1)$. Hence, by the preceding paragraph, $u \ge 0$ on $[0,1]$; and so $g\bigl((1-t)x + ty\bigr) \ge (1-t)g(x) + t\,g(y)$ for all $t \in [0,1]$. In other words, $g$ is concave on $\mathring{C}$, and, by continuity, it is therefore concave on $C$.

To complete the proof, suppose that $H_g(x)$ has a positive eigenvalue for some $x \in \mathring{C}$. We can then find an $\omega \in S^{N-1}$ and an $\epsilon > 0$ such that $\bigl(\omega, H_g(x)\omega\bigr)_{\mathbb{R}^N} > 0$ and $x + t\omega \in \mathring{C}$ for all $t \in (-\epsilon,\epsilon)$. Set $u(t) = g(x + t\omega)$ for $t \in (-\epsilon,\epsilon)$. Then $u''(0) = \bigl(\omega, H_g(x)\omega\bigr)_{\mathbb{R}^N} > 0$. On the other hand,

$$u''(0) = \lim_{t\to 0} \frac{u(t) + u(-t) - 2u(0)}{t^2},$$

and, if $g$ were concave,

$$2u(0) = 2u\Bigl(\frac{t - t}{2}\Bigr) = 2g\Bigl(\tfrac12(x + t\omega) + \tfrac12(x - t\omega)\Bigr) \ge g(x + t\omega) + g(x - t\omega) = u(t) + u(-t),$$

from which we would get the contradictory conclusion that $u''(0) \le 0$. $\square$

6.1.3 Lemma. Let $A = \begin{bmatrix} a & b \\ b & c \end{bmatrix}$ be a real symmetric matrix. Then $A$ is non-positive if and only if both $a + c \le 0$ and $ac \ge b^2$. In particular, for each $\alpha \in (0,1)$, the functions $(x,y) \in [0,\infty)^2 \longmapsto x^\alpha y^{1-\alpha}$ and $(x,y) \in [0,\infty)^2 \longmapsto \bigl(x^\alpha + y^\alpha\bigr)^{\frac1\alpha}$ are continuous and concave.
PROOF: In view of Lemma 6.1.2, it suffices for us to check the first assertion. To this end, let $T = a + c$ be the trace and $D = ac - b^2$ the determinant of $A$. Also, let $\lambda$ and $\mu$ denote the eigenvalues of $A$. Then, $T = \lambda + \mu$ and $D = \lambda\mu$. If $A$ is non-positive and therefore $\lambda \vee \mu \le 0$, then it is obvious that $T \le 0$ and that $D \ge 0$. If $D > 0$, then either both $\lambda$ and $\mu$ are positive or both are negative. Hence if, in addition, $T \le 0$, $\lambda$ and $\mu$ are negative. Finally, if $D = 0$ and $T \le 0$, then either $\lambda = 0$ and $\mu = T \le 0$ or $\mu = 0$ and $\lambda = T \le 0$. $\square$
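The trace and determinant test is easy to compare against a direct eigenvalue computation. The sketch below does so on a small grid of integer matrices; the function names, the grid, and the rounding tolerance are assumptions chosen for illustration (integer entries keep every non-zero eigenvalue well away from the tolerance).

```python
import math

def eigenvalues(a, b, c):
    """Eigenvalues (smaller, larger) of the symmetric matrix [[a, b], [b, c]]."""
    half_trace = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return half_trace - radius, half_trace + radius

def non_positive_by_criterion(a, b, c):
    """The test of Lemma 6.1.3: a + c <= 0 and a*c >= b**2."""
    return a + c <= 0 and a * c >= b * b

def non_positive_by_eigenvalues(a, b, c, tol=1e-9):
    """Direct test: the larger eigenvalue is <= 0, up to rounding error."""
    return eigenvalues(a, b, c)[1] <= tol

def agree_on_grid(n):
    """Compare both tests on all symmetric matrices with integer entries in [-n, n]."""
    entries = range(-n, n + 1)
    return all(
        non_positive_by_criterion(a, b, c) == non_positive_by_eigenvalues(a, b, c)
        for a in entries for b in entries for c in entries
    )
```

Calling `agree_on_grid(3)` compares the two tests on all 343 matrices of the grid, including the borderline cases with $D = 0$.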

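6.1.4 Theorem (Minkowski's Inequality). Let $f_1$ and $f_2$ be non-negative, measurable functions on the measure space $(E,\mathcal{B},\mu)$. Then, for every $p \in [1,\infty)$,

$$\Bigl(\int_E (f_1 + f_2)^p\,d\mu\Bigr)^{\frac1p} \le \Bigl(\int_E f_1^p\,d\mu\Bigr)^{\frac1p} + \Bigl(\int_E f_2^p\,d\mu\Bigr)^{\frac1p}.$$

PROOF: The case when $p = 1$ follows from (3.2.13). Also, without loss in generality, we assume that $f_1^p$ and $f_2^p$ are $\mu$-integrable and that $f_1$ and $f_2$ are $[0,\infty)$-valued.

Let $p \in (1,\infty)$ be given. If we assume that $\mu(E) = 1$ and we take $\alpha = \frac1p$, then, by Lemma 6.1.3 and Jensen's inequality,

$$\int_E (f_1 + f_2)^p\,d\mu = \int_E \bigl[(f_1^p)^\alpha + (f_2^p)^\alpha\bigr]^{\frac1\alpha}\,d\mu \le \Bigl[\Bigl(\int_E f_1^p\,d\mu\Bigr)^\alpha + \Bigl(\int_E f_2^p\,d\mu\Bigr)^\alpha\Bigr]^{\frac1\alpha} = \Bigl[\Bigl(\int_E f_1^p\,d\mu\Bigr)^{\frac1p} + \Bigl(\int_E f_2^p\,d\mu\Bigr)^{\frac1p}\Bigr]^p.$$

More generally, if $\mu(E) = 0$ there is nothing to do, and if $0 < \mu(E) < \infty$ we can replace $\mu$ by $\frac{\mu}{\mu(E)}$ and apply the preceding. Hence, all that remains is the case when $\mu(E) = \infty$. But if $\mu(E) = \infty$, take $E_n = \{f_1 \vee f_2 \ge \frac1n\}$, note that $\mu(E_n) \le n^p \int f_1^p\,d\mu + n^p \int f_2^p\,d\mu < \infty$, apply the preceding to $f_1$, $f_2$, and $\mu$ all restricted to $E_n$, and let $n \to \infty$. $\square$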

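6.1.5 Theorem (Hölder's Inequality). Given $p \in (1,\infty)$ define the Hölder conjugate $p'$ of $p$ by the equation $\frac1p + \frac1{p'} = 1$. Then, for every pair of non-negative, measurable functions $f_1$ and $f_2$ on the measure space $(E,\mathcal{B},\mu)$,

$$\int_E f_1 f_2\,d\mu \le \Bigl(\int_E f_1^p\,d\mu\Bigr)^{\frac1p}\Bigl(\int_E f_2^{p'}\,d\mu\Bigr)^{\frac1{p'}}$$

for every $p \in (1,\infty)$.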
PROOF: First note that if either factor on the right hand side of the above inequality is $0$, then $f_1 f_2 = 0$ (a.e., $\mu$), and so the left hand side is also $0$. Thus we will assume that both factors on the right are strictly positive, in which case, we may and will assume in addition that both $f_1^p$ and $f_2^{p'}$ are $\mu$-integrable and that $f_1$ and $f_2$ are both $[0,\infty)$-valued. Also, just as in the proof of Minkowski's inequality, we can reduce everything to the case when $\mu(E) = 1$. But then we can apply Jensen's Inequality and Lemma 6.1.3 with $\alpha = \frac1p$ to see that

$$\int_E f_1 f_2\,d\mu = \int_E (f_1^p)^\alpha (f_2^{p'})^{1-\alpha}\,d\mu \le \Bigl(\int_E f_1^p\,d\mu\Bigr)^\alpha \Bigl(\int_E f_2^{p'}\,d\mu\Bigr)^{1-\alpha} = \Bigl(\int_E f_1^p\,d\mu\Bigr)^{\frac1p}\Bigl(\int_E f_2^{p'}\,d\mu\Bigr)^{\frac1{p'}}. \qquad\square$$
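Both inequalities can be checked numerically on a finite measure space, where each integral is a weighted sum. The sketch below is only an illustration under that assumption; the helper names, the random data, and the particular exponents are not taken from the text.

```python
import random

def lp_norm(f, mu, p):
    """(sum_k |f_k|^p mu_k)^(1/p): the L^p norm for a measure on a finite set."""
    return sum(abs(x) ** p * m for x, m in zip(f, mu)) ** (1.0 / p)

def holder_gap(f1, f2, mu, p):
    """Right side minus left side of Holder's inequality (p > 1); should be >= 0."""
    p_conj = p / (p - 1.0)                      # Holder conjugate: 1/p + 1/p' = 1
    lhs = sum(a * b * m for a, b, m in zip(f1, f2, mu))
    return lp_norm(f1, mu, p) * lp_norm(f2, mu, p_conj) - lhs

def minkowski_gap(f1, f2, mu, p):
    """Right side minus left side of Minkowski's inequality (p >= 1); should be >= 0."""
    total = [a + b for a, b in zip(f1, f2)]
    return lp_norm(f1, mu, p) + lp_norm(f2, mu, p) - lp_norm(total, mu, p)

random.seed(1)
mu = [random.uniform(0.1, 2.0) for _ in range(8)]   # weights of a finite measure
f1 = [random.uniform(0.0, 3.0) for _ in range(8)]   # non-negative functions
f2 = [random.uniform(0.0, 3.0) for _ in range(8)]
holder_gaps = [holder_gap(f1, f2, mu, p) for p in (1.5, 2.0, 3.0, 7.0)]
minkowski_gaps = [minkowski_gap(f1, f2, mu, p) for p in (1.0, 1.5, 2.0, 7.0)]
```

The Hölder gap is zero, up to rounding, when $f_2$ is a multiple of $f_1^{p-1}$; for $p = 2$ this means that $f_1$ and $f_2$ are proportional.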

Exercises

6.1.6 Exercise: Here are a few easy applications of the preceding.

(i) Show that $\log$ is continuous and concave on every interval $[\epsilon,\infty)$ with $\epsilon > 0$. Use this together with Jensen's inequality to show that for any $n \in \mathbb{Z}^+$, $\mu_1,\dots,\mu_n \in (0,1)$ satisfying $\sum_{m=1}^n \mu_m = 1$, and $a_1,\dots,a_n \in (0,\infty)$,

$$\prod_{m=1}^n a_m^{\mu_m} \le \sum_{m=1}^n \mu_m a_m.$$

In particular, when $\mu_m = \frac1n$ for every $1 \le m \le n$ this yields $(a_1 \cdots a_n)^{\frac1n} \le \frac1n \sum_{m=1}^n a_m$, which is the statement that the arithmetic mean dominates the geometric mean.

(ii) Let $n \in \mathbb{Z}^+$, and suppose that $f_1,\dots,f_n$ are non-negative, measurable functions on the measure space $(E,\mathcal{B},\mu)$. Given $p_1,\dots,p_n \in (1,\infty)$ satisfying $\sum_{m=1}^n \frac1{p_m} = 1$, show that

$$\int_E f_1 \cdots f_n\,d\mu \le \prod_{m=1}^n \Bigl(\int_E f_m^{p_m}\,d\mu\Bigr)^{\frac1{p_m}}.$$

6.1.7 Exercise: When $p = 2$, Minkowski's and Hölder's inequalities are intimately related and are both very simple to prove. Indeed, let $f_1$ and $f_2$ be bounded, non-negative, measurable functions on the finite measure space $(E,\mathcal{B},\mu)$. Given any $\alpha \ne 0$, observe that

$$0 \le \int_E \Bigl(\alpha f_1 - \frac1\alpha f_2\Bigr)^2 d\mu = \alpha^2 \int_E f_1^2\,d\mu - 2\int_E f_1 f_2\,d\mu + \frac1{\alpha^2}\int_E f_2^2\,d\mu,$$

from which it follows that

$$2\int_E f_1 f_2\,d\mu \le t\int_E f_1^2\,d\mu + \frac1t \int_E f_2^2\,d\mu$$

for every $t > 0$. If either integral on the right vanishes, show from the preceding that $\int_E f_1 f_2\,d\mu \le 0$. On the other hand, if neither integral vanishes, choose $t > 0$ so that the preceding yields

$$\int_E f_1 f_2\,d\mu \le \Bigl(\int_E f_1^2\,d\mu\Bigr)^{\frac12}\Bigl(\int_E f_2^2\,d\mu\Bigr)^{\frac12}. \tag{6.1.8}$$

Hence, in any case, (6.1.8) holds. Finally, argue that one can remove the restriction that $f_1$ and $f_2$ be bounded, and then remove the condition that $\mu(E) < \infty$. In particular, even if they are not non-negative, so long as $f_1^2$ and $f_2^2$ are $\mu$-integrable, conclude that $f_1 f_2$ must be $\mu$-integrable and that (6.1.8) continues to hold.

Clearly (6.1.8) is the special case of Hölder's inequality when $p = 2$. Because it is a particularly significant case, it is often referred to by a different name and is called Schwarz's inequality. Assuming that both $f_1^2$ and $f_2^2$ are $\mu$-integrable, show that the inequality in Schwarz's inequality is an equality if and only if there exist $(\alpha,\beta) \in \mathbb{R}^2 \setminus \{0\}$ such that $\alpha f_1 + \beta f_2 = 0$ (a.e., $\mu$).

Finally, use Schwarz's inequality to obtain Minkowski's inequality for the case when $p = 2$. Notice the similarity between the development here and that of the classical triangle inequality for the Euclidean metric on $\mathbb{R}^N$.
6.1.9 Exercise: There is a "better" proof of Jensen's inequality which is based on the following geometric fact. Namely, if $g$ is a continuous, concave function on the closed, convex subset $C$ in $\mathbb{R}^N$, then, for each $p \in C$, there is a $v \in \mathbb{R}^N$ such that the line $L \equiv \bigl\{\bigl(x, g(p) + (v, x - p)_{\mathbb{R}^N}\bigr) : x \in \mathbb{R}^N\bigr\}$ lies above $\{(x,t) \in C \times \mathbb{R} : t \le g(x)\}$ in $\mathbb{R}^{N+1}$. That is, $g(x) \le g(p) + (v, x - p)_{\mathbb{R}^N}$ for every $x \in C$. Assuming this fact*, give another derivation of Jensen's Inequality. (Hint: Take $p = \int F\,d\mu$.)

* When $C$ is the closure of its interior and $g$ is smooth, this fact is an easy consequence of Taylor's Theorem. In general, it can be seen as an application of the Hahn-Banach Theorem for $\mathbb{R}^N$.

6.2 The Lebesgue Spaces.

In Section 3.2 we introduced $\|\cdot\|_{L^1(\mu)}$ and the space $L^1(\mu)$. We are now ready to embed $L^1(\mu)$ into a one-parameter family of spaces.

Given a measure space $(E,\mathcal{B},\mu)$ and a $p \in [1,\infty)$, define

$$\|f\|_{L^p(\mu)} = \Bigl(\int_E |f|^p\,d\mu\Bigr)^{\frac1p}$$

for measurable functions $f$ on $(E,\mathcal{B})$. Also, if $f$ is a measurable function on $(E,\mathcal{B})$ define

$$\|f\|_{L^\infty(\mu)} = \inf\bigl\{M \in [0,\infty] : |f| \le M \text{ (a.e., } \mu)\bigr\}.$$


Obviously, as $p$ varies $\|f\|_{L^p(\mu)}$ provides different estimates on the size of $f$ as it is "seen" by the measure $\mu$.

Although information about $f$ can be gleaned from a study of $\|f\|_{L^p(\mu)}$ as $p$ changes (for example, spikes in $f$ will be emphasized by taking $p$ to be large), all these quantities share the same flaw as $\|f\|_{L^1(\mu)}$: they cannot detect properties of $f$ which occur on sets having $\mu$-measure $0$. Thus, before we can hope to use any of them to get a metric on measurable functions, we must invoke the same subterfuge which we introduced at the end of Section 3.2 in connection with the space $L^1(\mu)$. Namely, for $p \in [1,\infty]$, we denote by $L^p(\mu) = L^p(E,\mathcal{B},\mu)$ the collection of equivalence classes $[f]^{\stackrel{\mu}{\sim}}$ (cf. Remark 3.2.14) of $\mathbb{R}$-valued, measurable functions $f$ satisfying $\|f\|_{L^p(\mu)} < \infty$, and, once again, we will abuse notation by using $f$ to denote its own equivalence class $[f]^{\stackrel{\mu}{\sim}}$.
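To make the last two points concrete, here is a small sketch for a measure on a five-point set; the data, the weights, and the function name are illustrative assumptions. The value sitting on a point of measure $0$ has no influence on any of the norms, while the spike of height $3$ dominates more and more as $p$ grows.

```python
def lp_norm(f, mu, p):
    """L^p(mu) norm of f on a finite set; p may be float('inf')."""
    if p == float("inf"):
        # essential supremum: points of mu-measure 0 are ignored
        return max((abs(x) for x, m in zip(f, mu) if m > 0), default=0.0)
    return sum(abs(x) ** p * m for x, m in zip(f, mu)) ** (1.0 / p)

mu = [0.25, 0.25, 0.25, 0.25, 0.0]    # a probability measure; the last point is a null set
f = [1.0, 1.0, 1.0, 3.0, 100.0]       # spike of height 3; the value 100 sits on the null set
norms = {p: lp_norm(f, mu, p) for p in (1, 2, 8, 64, float("inf"))}
```

Here `norms[1]` is the plain average of the four visible values, and `norms[float("inf")]` is the height of the spike.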

Note that, by (3.2.13) and Minkowski's inequality,

$$\|\alpha f_1 + \beta f_2\|_{L^p(\mu)} \le |\alpha|\,\|f_1\|_{L^p(\mu)} + |\beta|\,\|f_2\|_{L^p(\mu)} \tag{6.2.1}$$

for all $p \in [1,\infty)$, $f_1, f_2 \in L^p(\mu)$, and $\alpha, \beta \in \mathbb{R}$. Moreover, it is a simple matter to check that (6.2.1) continues to hold when $p = \infty$. Thus, each of the spaces $L^p(\mu)$ is a vector space. In addition, because of our convention and Markov's inequality (Theorem 3.2.8), $\|f\|_{L^p(\mu)} = 0$ if and only if $f = 0$ as an element of $L^p(\mu)$. Finally, (6.2.1) allows us to check that $\|f_2 - f_1\|_{L^p(\mu)}$ satisfies the triangle inequality and, together with the preceding, this shows that it determines a metric on $L^p(\mu)$. Thus, when $\{f_n\}_1^\infty \cup \{f\} \subseteq L^p(\mu)$, we often write $f_n \to f$ in $L^p(\mu)$ when we mean $\|f_n - f\|_{L^p(\mu)} \to 0$.
The following theorem simply summarizes obvious applications of the results
in Sections 3.2 and 3.3 to the present context. The reader should check that
he sees how each of the assertions here follows from the relevant result there.
6.2.2 Theorem. Let $(E,\mathcal{B},\mu)$ be a measure space. Then, for any $p \in [1,\infty]$ and $f, g \in L^p(\mu)$,

$$\bigl|\,\|g\|_{L^p(\mu)} - \|f\|_{L^p(\mu)}\bigr| \le \|g - f\|_{L^p(\mu)}.$$

Next suppose that $\{f_n\}_1^\infty \subseteq L^p(\mu)$ for some $p \in [1,\infty]$ and that $f$ is an $\mathbb{R}$-valued measurable function on $(E,\mathcal{B})$.

(i) If $p \in [1,\infty)$ and $f_n \to f$ in $L^p(\mu)$, then $f_n \to f$ in $\mu$-measure. If $f_n \to f$ in $L^\infty(\mu)$, then $f_n \to f$ uniformly off of a set of $\mu$-measure $0$.

(ii) If $p \in [1,\infty]$ and $f_n \to f$ in $\mu$-measure or (a.e., $\mu$), then $\|f\|_{L^p(\mu)} \le \liminf_{n\to\infty} \|f_n\|_{L^p(\mu)}$. Moreover, if $p \in [1,\infty)$ and, in addition, there is a $g \in L^p(\mu)$ such that $|f_n| \le g$ (a.e., $\mu$) for each $n \in \mathbb{Z}^+$, then $f_n \to f$ in $L^p(\mu)$.

(iii) If $p \in [1,\infty]$ and $\lim_{m\to\infty} \sup_{n \ge m} \|f_n - f_m\|_{L^p(\mu)} = 0$, then there is an $f \in L^p(\mu)$ such that $f_n \to f$ in $L^p(\mu)$. In other words, the space $L^p(\mu)$ is complete with respect to the metric determined by $\|\cdot\|_{L^p(\mu)}$.

Finally, we have the following variants of Theorem 3.3.13 and Corollary 3.3.14.

(iv) Assume that $\mu(E) < \infty$ and that $p, q \in [1,\infty)$. Referring to Theorem 3.3.13, define $\mathcal{S}$ as in that theorem. Then for each $f \in L^p(\mu) \cap L^q(\mu)$ there is a sequence $\{\varphi_n\}_1^\infty \subseteq \mathcal{S}$ such that $\varphi_n \to f$ both in $L^p(\mu)$ and in $L^q(\mu)$. In particular, if $\mathcal{B}$ is generated by a countable collection $\mathcal{C}$, then each of the spaces $L^p(\mu)$, $p \in [1,\infty)$, is separable.

(v) Let $(E,\rho)$ be a metric space, and suppose that $\mu$ is a measure on $(E,\mathcal{B}_E)$ for which there exists a non-decreasing sequence of open sets $E_n \nearrow E$ satisfying $\mu(E_n) < \infty$ for each $n \ge 1$. Then, for each pair $p, q \in [1,\infty)$ and $f \in L^p(\mu) \cap L^q(\mu)$, there is a sequence $\{\varphi_n\}_1^\infty$ of bounded $\rho$-uniformly continuous functions such that $\varphi_n \equiv 0$ off of $E_n$ and $\varphi_n \to f$ both in $L^p(\mu)$ and in $L^q(\mu)$.
The version of Lieb's variation on Fatou's Lemma for $L^p$-spaces with $p \ne 1$ is not so easy as the assertions in Theorem 6.2.2. To prove it we will need the following lemma.
6.2.3 Lemma. Let $p \in (1,\infty)$, and suppose that $\{f_n\}_1^\infty \subseteq L^p(\mu)$ satisfies $\sup_{n \ge 1} \|f_n\|_{L^p(\mu)} < \infty$ and that $f_n \to 0$ either in $\mu$-measure or (a.e., $\mu$). Then, for every $g \in L^p(\mu)$,

$$\lim_{n\to\infty} \int |f_n|^{p-1}|g|\,d\mu = 0 = \lim_{n\to\infty} \int |f_n|\,|g|^{p-1}\,d\mu.$$
PROOF: Without loss in generality, we assume that all of the $f_n$'s as well as $g$ are non-negative. Given $\delta > 0$, we have that

$$\int f_n^{p-1} g\,d\mu = \int_{\{f_n \le \delta g\}} f_n^{p-1} g\,d\mu + \int_{\{f_n > \delta g\}} f_n^{p-1} g\,d\mu \le \delta^{p-1}\|g\|_{L^p(\mu)}^p + \int_{\{f_n \ge \delta^2\}} f_n^{p-1} g\,d\mu + \int_{\{g \le \delta\}} f_n^{p-1} g\,d\mu.$$

Applying Hölder's inequality to each of the last two terms, we now see that

$$\int f_n^{p-1} g\,d\mu \le \delta^{p-1}\|g\|_{L^p(\mu)}^p + \|f_n\|_{L^p(\mu)}^{p-1}\Bigl[\bigl\|\mathbf{1}_{\{f_n \ge \delta^2\}}\, g\bigr\|_{L^p(\mu)} + \bigl\|\mathbf{1}_{\{g \le \delta\}}\, g\bigr\|_{L^p(\mu)}\Bigr].$$

Since, by Lebesgue's Dominated Convergence Theorem, the first term in the final brackets tends to $0$ as $n \to \infty$, we conclude that

$$\limsup_{n\to\infty} \int f_n^{p-1} g\,d\mu \le \delta^{p-1}\|g\|_{L^p(\mu)}^p + \sup_{n \ge 1} \|f_n\|_{L^p(\mu)}^{p-1}\, \bigl\|\mathbf{1}_{\{g \le \delta\}}\, g\bigr\|_{L^p(\mu)}$$

for every $\delta > 0$. Thus, after another application of Lebesgue's Dominated Convergence Theorem, we get the first result upon letting $\delta \searrow 0$.

To treat the other case, apply the preceding with $f_n^{p-1}$ and $g^{p-1}$ replacing $f_n$ and $g$, respectively, and with $p'$ in place of $p$. $\square$

6.2.4 Theorem (Lieb). Let $(E,\mathcal{B},\mu)$ be a measure space, $p \in [1,\infty)$, and $\{f_n\}_1^\infty \cup \{f\} \subseteq L^p(\mu)$. If $\sup_{n \ge 1} \|f_n\|_{L^p(\mu)} < \infty$ and $f_n \to f$ in $\mu$-measure or (a.e., $\mu$), then

$$\lim_{n\to\infty} \int \Bigl|\,|f_n|^p - |f|^p - |f_n - f|^p\,\Bigr|\,d\mu = 0; \tag{6.2.5}$$

and therefore $\|f_n - f\|_{L^p(\mu)} \to 0$ if $\|f_n\|_{L^p(\mu)} \to \|f\|_{L^p(\mu)}$.


PROOF: The case when $p = 1$ is covered by Theorems 3.3.5 and 3.3.12, and so we will assume that $p \in (1,\infty)$. Given such a $p$, we first check that there is a $K_p < \infty$ such that

$$\Bigl|\,|b|^p - |a|^p - |b - a|^p\,\Bigr| \le K_p\bigl(|b - a|^{p-1}|a| + |a|^{p-1}|b - a|\bigr), \qquad a, b \in \mathbb{R}. \tag{6.2.6}$$

Since (6.2.6) clearly holds for all $a, b \in \mathbb{R}$ if it does for all $a \in \mathbb{R} \setminus \{0\}$ and $b \in \mathbb{R}$, we can divide both sides of (6.2.6) by $|a|^p$ and thereby show that (6.2.6) is equivalent to

$$\Bigl|\,|c|^p - 1 - |c - 1|^p\,\Bigr| \le K_p\bigl(|c - 1|^{p-1} + |c - 1|\bigr), \qquad c \in \mathbb{R}.$$

Finally, the existence of a $K_p < \infty$ for which this inequality holds can be easily verified with elementary consideration of what happens when $c$ is near $1$ and when $|c|$ is near infinity.

Applying (6.2.6) with $a = f(x)$ and $b = f_n(x)$, we see that

$$\Bigl|\,|f_n|^p - |f|^p - |f_n - f|^p\,\Bigr| \le K_p\bigl(|f_n - f|^{p-1}|f| + |f|^{p-1}|f_n - f|\bigr)$$

pointwise. Thus, by Lemma 6.2.3 with $f_n$ and $g$ there replaced by $f_n - f$ and $f$, respectively, our result follows. $\square$
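A standard example shows what the theorem does and does not say. Take Lebesgue measure on $[0,1]$, $f \equiv 1$, and $f_n = 1 + n^{1/p}\mathbf{1}_{[0,1/n]}$. Then $f_n \to f$ almost everywhere and $\|f_n\|_{L^p} \le 2$, but $\|f_n - f\|_{L^p} = 1$ for every $n$. The sketch below evaluates the integral in (6.2.5) in closed form for this sequence; the example and the function name are assumptions introduced for illustration.

```python
def lieb_defect(n, p):
    """Integral over [0, 1] of | |f_n|^p - |f|^p - |f_n - f|^p | for
    f = 1 and f_n = 1 + n**(1/p) on [0, 1/n], f_n = 1 elsewhere."""
    bump = n ** (1.0 / p)                      # height of f_n - f on [0, 1/n]
    on_bump = abs((1.0 + bump) ** p - 1.0 - bump ** p)
    return on_bump / n                         # the integrand vanishes off [0, 1/n]

defects = [lieb_defect(n, 2.0) for n in (10, 10**4, 10**8)]
```

For $p = 2$ the defect equals $2/\sqrt{n}$, so it tends to $0$ even though $f_n$ does not converge to $f$ in $L^2$; this is consistent with the theorem because here $\|f_n\|_{L^2}^2 \to 2 \ne 1 = \|f\|_{L^2}^2$.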
We now turn to the application of Hölder's inequality to the $L^p$-spaces. In order to do so, we first complete the definition of the Hölder conjugate $p'$ which, thus far, has only been defined (cf. Theorem 6.1.5) for $p \in (1,\infty)$. Thus, we define $p' = \infty$ or $1$ according to whether $p = 1$ or $\infty$. Notice that this is completely consistent with the equation $\frac1p + \frac1{p'} = 1$ used before.
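6.2.7 Theorem. Let $(E,\mathcal{B},\mu)$ be a measure space.

(i) If $f$ and $g$ are measurable functions on $(E,\mathcal{B})$, then for every $p \in [1,\infty]$

$$\|fg\|_{L^1(\mu)} \le \|f\|_{L^p(\mu)}\,\|g\|_{L^{p'}(\mu)}. \tag{6.2.8}$$

In particular, if $f \in L^p(\mu)$ and $g \in L^{p'}(\mu)$, then $fg \in L^1(\mu)$.

(ii) If $p \in [1,\infty)$ and $f \in L^p(\mu)$, then

$$\|f\|_{L^p(\mu)} = \sup\bigl\{\|fg\|_{L^1(\mu)} : g \in L^{p'}(\mu) \text{ and } \|g\|_{L^{p'}(\mu)} \le 1\bigr\}. \tag{6.2.9}$$

In fact, if $\|f\|_{L^p(\mu)} > 0$, then the supremum in (6.2.9) is achieved by the function

$$g = \frac{|f|^{p-1}\,\mathrm{sgn}\circ f}{\|f\|_{L^p(\mu)}^{p-1}}.$$

(iii) More generally, for any $f$ which is measurable on $(E,\mathcal{B})$,

$$\|f\|_{L^p(\mu)} = \sup\bigl\{\|fg\|_{L^1(\mu)} : g \in L^{p'}(\mu) \text{ and } \|g\|_{L^{p'}(\mu)} \le 1\bigr\} \tag{6.2.10}$$

if $p = 1$ or if $p \in (1,\infty)$ and either $\mu(|f| \ge \delta) < \infty$ for every $\delta > 0$ or $\mu$ is $\sigma$-finite.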
PROOF: Part (i) is an immediate consequence of Hölder's inequality when $p \in (1,\infty)$. At the same time, when $p \in \{1,\infty\}$, the conclusion is clear without any further comment. Given (i), (ii) is easy.

When $p = 1$, (iii) is obvious; and, in view of (ii), the proof of (iii) for $p \in (1,\infty)$ reduces to showing that, under either one of the stated conditions, $\|f\|_{L^p(\mu)} = \infty$ implies that the right hand side of (6.2.10) is infinite. To this end, first suppose that $\mu(|f| \ge \delta) < \infty$ for every $\delta > 0$. Then, for each $n \ge 1$, the function

$$\psi_n \equiv |f|^{p-1}\,\mathbf{1}_{[\frac1n,\,n]}\circ|f|$$

is an element of $L^{p'}(\mu)$. Moreover, if $\|f\|_{L^p(\mu)} = \infty$ then, by the Monotone Convergence Theorem, $\|\psi_n\|_{L^{p'}(\mu)} \to \infty$. Thus, since $\|f\psi_n\|_{L^1(\mu)} = \|\psi_n\|_{L^{p'}(\mu)}^{p'}$, we see that

$$\Bigl\| f\,\frac{\psi_n}{\|\psi_n\|_{L^{p'}(\mu)}} \Bigr\|_{L^1(\mu)} = \|\psi_n\|_{L^{p'}(\mu)}^{p'-1} \longrightarrow \infty.$$

Finally, suppose that $\mu$ is $\sigma$-finite and that $\mu(|f| \ge \delta) = \infty$ for some $\delta > 0$. Choose $\{E_n\}_1^\infty \subseteq \mathcal{B}$ so that $E_n \nearrow E$ and $\mu(E_n) < \infty$ for every $n \ge 1$. Then it is easy to see that $\lim_{n\to\infty} \|fg_n\|_{L^1(\mu)} = \infty$ when

$$g_n \equiv \mu\bigl(E_n \cap \{|f| \ge \delta\}\bigr)^{-\frac1{p'}}\,\mathbf{1}_{E_n \cap \{|f| \ge \delta\}}.$$

Since $\|g_n\|_{L^{p'}(\mu)} \le 1$, this completes the proof. $\square$
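The extremal function in part (ii) can be checked directly on a finite measure space: it should have $L^{p'}$-norm $1$, and its pairing with $f$ should reproduce $\|f\|_{L^p(\mu)}$. The following sketch does this for one arbitrary choice of $p$, weights, and $f$; all names and data are illustrative assumptions.

```python
def lp_norm(f, mu, p):
    """(sum_k |f_k|^p mu_k)^(1/p) for a measure with weights mu on a finite set."""
    return sum(abs(x) ** p * m for x, m in zip(f, mu)) ** (1.0 / p)

def extremal_g(f, mu, p):
    """g = |f|^(p-1) sgn(f) / ||f||_p^(p-1); requires ||f||_p > 0."""
    norm = lp_norm(f, mu, p)
    sgn = lambda x: (x > 0) - (x < 0)
    return [abs(x) ** (p - 1) * sgn(x) / norm ** (p - 1) for x in f]

p = 3.0
p_conj = p / (p - 1.0)                       # Holder conjugate of p
mu = [0.5, 1.0, 2.0]
f = [1.0, -2.0, 0.5]
g = extremal_g(f, mu, p)
pairing = sum(abs(a * b) * m for a, b, m in zip(f, g, mu))   # ||f g||_{L^1(mu)}
```

Any other $g$ with $\|g\|_{L^{p'}(\mu)} \le 1$ gives a pairing no larger than this one, which is the content of (6.2.8).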


For reasons which will become clearer in the next section, it is sometimes useful to consider the following slight variation on the basic $L^p$-spaces. Namely, let $(E_1,\mathcal{B}_1,\mu_1)$ and $(E_2,\mathcal{B}_2,\mu_2)$ be a pair of $\sigma$-finite measure spaces and let $p_1, p_2 \in [1,\infty)$. Given a measurable function $f$ on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$, define

$$\|f\|_{L^{(p_1,p_2)}(\mu_1,\mu_2)} \equiv \Bigl[\int_{E_2} \Bigl(\int_{E_1} |f(x_1,x_2)|^{p_1}\,\mu_1(dx_1)\Bigr)^{\frac{p_2}{p_1}} \mu_2(dx_2)\Bigr]^{\frac1{p_2}},$$

and let $L^{(p_1,p_2)}(\mu_1,\mu_2)$ denote the mixed Lebesgue space of $\mathbb{R}$-valued, $\mathcal{B}_1 \times \mathcal{B}_2$-measurable $f$'s for which $\|f\|_{L^{(p_1,p_2)}(\mu_1,\mu_2)} < \infty$. Obviously, when $p_1 = p = p_2$, $\|f\|_{L^{(p_1,p_2)}(\mu_1,\mu_2)} = \|f\|_{L^p(\mu_1 \times \mu_2)}$ and $L^{(p_1,p_2)}(\mu_1,\mu_2) = L^p(\mu_1 \times \mu_2)$.
6.2.11 Lemma. For all $f$ and $g$ which are measurable on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$ and all $\alpha, \beta \in \mathbb{R}$,

$$\begin{aligned} \|\alpha f + \beta g\|_{L^{(p_1,p_2)}(\mu_1,\mu_2)} &\le |\alpha|\,\|f\|_{L^{(p_1,p_2)}(\mu_1,\mu_2)} + |\beta|\,\|g\|_{L^{(p_1,p_2)}(\mu_1,\mu_2)} \\ \text{and}\quad \|fg\|_{L^1(\mu_1 \times \mu_2)} &\le \|f\|_{L^{(p_1,p_2)}(\mu_1,\mu_2)}\,\|g\|_{L^{(p_1',p_2')}(\mu_1,\mu_2)}. \end{aligned} \tag{6.2.12}$$

Moreover, if $\{f_n\}_1^\infty \cup \{f\} \subseteq L^{(p_1,p_2)}(\mu_1,\mu_2)$, $f_n \to f$ (a.e., $\mu_1 \times \mu_2$), and $|f_n| \le g$ (a.e., $\mu_1 \times \mu_2$) for each $n \ge 1$ and some $g \in L^{(p_1,p_2)}(\mu_1,\mu_2)$, then $\|f_n - f\|_{L^{(p_1,p_2)}(\mu_1,\mu_2)} \to 0$. Finally, if $\mu_1$ and $\mu_2$ are finite and $\mathcal{G}$ denotes the class of all $\psi$'s on $E_1 \times E_2$ having the form $\sum_{m=1}^n \mathbf{1}_{\Gamma_{1,m}}(\,\cdot_1)\,\varphi_m(\,\cdot_2)$ for some $n \ge 1$, $\{\varphi_m\}_1^n \subseteq L^\infty(\mu_2)$, and mutually disjoint $\Gamma_{1,1},\dots,\Gamma_{1,n} \in \mathcal{B}_1$, then, for every measurable $f \in L^{(p_1,p_2)}(\mu_1,\mu_2)$ and $\epsilon > 0$, there is a $\psi \in \mathcal{G}$ such that $\|f - \psi\|_{L^{(p_1,p_2)}(\mu_1,\mu_2)} < \epsilon$.
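PROOF: Note that

$$\|f\|_{L^{(p_1,p_2)}(\mu_1,\mu_2)} = \Bigl\|\, \|f(\,\cdot_1,\,\cdot_2)\|_{L^{p_1}(\mu_1)} \Bigr\|_{L^{p_2}(\mu_2)}. \tag{6.2.13}$$

Hence the assertions in (6.2.12) are consequences of repeated application of Minkowski's and Hölder's inequalities, respectively. Moreover, to prove the second statement, observe (cf. Exercise 4.1.11) that for $\mu_2$-almost every $x_2 \in E_2$, $f_n(\,\cdot\,,x_2) \to f(\,\cdot\,,x_2)$ (a.e., $\mu_1$), $|f_n(\,\cdot\,,x_2)| \le g(\,\cdot\,,x_2)$ (a.e., $\mu_1$), and $g(\,\cdot\,,x_2) \in L^{p_1}(\mu_1)$. Thus, by part (ii) of Theorem 6.2.2,

$$\|f_n(\,\cdot\,,x_2) - f(\,\cdot\,,x_2)\|_{L^{p_1}(\mu_1)} \longrightarrow 0$$

for $\mu_2$-almost every $x_2 \in E_2$. In addition,

$$\|f_n(\,\cdot\,,x_2) - f(\,\cdot\,,x_2)\|_{L^{p_1}(\mu_1)} \le 2\,\|g(\,\cdot\,,x_2)\|_{L^{p_1}(\mu_1)}$$

for $\mu_2$-almost every $x_2 \in E_2$ and, by (6.2.13) with $g$ replacing $f$,

$$\|g(\,\cdot_1,\,\cdot_2)\|_{L^{p_1}(\mu_1)} \in L^{p_2}(\mu_2).$$

Hence the required result follows after a second application of (ii) in Theorem 6.2.2.

We turn now to the final part of the lemma, in which the measures $\mu_1$ and $\mu_2$ are assumed to be finite. In fact, without loss in generality, we will assume that they are probability measures. In addition, by the preceding, it is clear that, for each $f \in L^{(p_1,p_2)}(\mu_1,\mu_2)$,

$$\lim_{n\to\infty} \bigl\|f - f\,\mathbf{1}_{[-n,n]}\circ f\bigr\|_{L^{(p_1,p_2)}(\mu_1,\mu_2)} = 0.$$

Thus, we need only consider $f$'s which are bounded. Finally, because $\mu_1 \times \mu_2$ is also a probability measure, Jensen's inequality and (6.2.13) imply that

$$\|f\|_{L^{(p_1,p_2)}(\mu_1,\mu_2)} \le \|f\|_{L^q(\mu_1 \times \mu_2)} \quad\text{where } q = p_1 \vee p_2.$$

Hence, it suffices to show that, for every bounded measurable $f$ on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$ and $\epsilon > 0$, there is a $\psi \in \mathcal{G}$ for which $\|f - \psi\|_{L^q(\mu_1 \times \mu_2)} < \epsilon$. But, by part (iv) of Theorem 6.2.2, the class of simple functions having the form

$$\psi = \sum_{m=1}^n a_m \mathbf{1}_{\Gamma_{1,m} \times \Gamma_{2,m}}$$

with $\Gamma_{i,m} \in \mathcal{B}_i$ is dense in $L^q(\mu_1 \times \mu_2)$. Thus, we will be done once we check that such a $\psi$ is an element of $\mathcal{G}$. To this end, we use the same technique as we did in the final part of the proof of Lemma 3.2.3. That is, set $I = (\{0,1\})^n$ and, for $\eta \in I$, define $\Gamma_{1,\eta} = \bigcap_{m=1}^n \Gamma_{1,m}^{(\eta_m)}$ where $\Gamma^{(0)} \equiv \Gamma^\complement$ and $\Gamma^{(1)} \equiv \Gamma$. Then

$$\psi(x_1,x_2) = \sum_{m=1}^n a_m \sum_{\eta \in I} \eta_m \mathbf{1}_{\Gamma_{1,\eta}}(x_1)\,\mathbf{1}_{\Gamma_{2,m}}(x_2) = \sum_{\eta \in I} \mathbf{1}_{\Gamma_{1,\eta}}(x_1)\,\varphi_\eta(x_2),$$

where

$$\varphi_\eta = \sum_{m=1}^n \eta_m a_m \mathbf{1}_{\Gamma_{2,m}}.$$

Since the $\Gamma_{1,\eta}$'s are mutually disjoint, this completes the proof. $\square$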
For our purposes, the most important fact that comes out of these considerations is the following continuous version of Minkowski's inequality.
6.2.14 Theorem. Let $(E_i,\mathcal{B}_i,\mu_i)$, $i \in \{1,2\}$, be $\sigma$-finite measure spaces. Then, for any $1 \le p_1 \le p_2 < \infty$ and any measurable function $f$ on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$,

$$\|f\|_{L^{(p_1,p_2)}(\mu_1,\mu_2)} \le \|f\|_{L^{(p_2,p_1)}(\mu_2,\mu_1)}.$$


PROOF: Since it is easy to reduce the general case to the one in which both $\mu_1$ and $\mu_2$ are finite, we may take them to be finite. In fact, without loss in generality, we will assume, from the outset, that they are probability measures.

Let $\mathcal{G}$ be the class described in the last part of Lemma 6.2.11. Given $\psi = \sum_{m=1}^n \mathbf{1}_{\Gamma_{1,m}}(\,\cdot_1)\,\varphi_m(\,\cdot_2)$ which is an element of $\mathcal{G}$, note that, since the $\Gamma_{1,m}$'s are mutually disjoint, $\bigl|\sum_1^n a_m \mathbf{1}_{\Gamma_{1,m}}\bigr|^r = \sum_1^n |a_m|^r \mathbf{1}_{\Gamma_{1,m}}$ for any $r \in [0,\infty)$ and $a_1,\dots,a_n \in \mathbb{R}$. Hence, by Minkowski's inequality for $p = \frac{p_2}{p_1}$,

$$\begin{aligned} \|\psi\|_{L^{(p_1,p_2)}(\mu_1,\mu_2)} &= \Bigl[\int_{E_2} \Bigl(\sum_{m=1}^n \mu_1(\Gamma_{1,m})\,|\varphi_m(x_2)|^{p_1}\Bigr)^{\frac{p_2}{p_1}} \mu_2(dx_2)\Bigr]^{\frac1{p_2}} \\ &\le \Bigl[\int_{E_1} \sum_{m=1}^n \mathbf{1}_{\Gamma_{1,m}}(x_1)\,\|\varphi_m\|_{L^{p_2}(\mu_2)}^{p_1}\,\mu_1(dx_1)\Bigr]^{\frac1{p_1}} \\ &= \Bigl[\int_{E_1} \Bigl(\int_{E_2} \Bigl|\sum_{m=1}^n \mathbf{1}_{\Gamma_{1,m}}(x_1)\,\varphi_m(x_2)\Bigr|^{p_2} \mu_2(dx_2)\Bigr)^{\frac{p_1}{p_2}} \mu_1(dx_1)\Bigr]^{\frac1{p_1}} = \|\psi\|_{L^{(p_2,p_1)}(\mu_2,\mu_1)}. \end{aligned}$$

Hence, we are done when the function $f$ is an element of $\mathcal{G}$.

To complete the proof, let $f$ be a measurable function on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$. Clearly we may assume that $\|f\|_{L^{(p_2,p_1)}(\mu_2,\mu_1)} < \infty$. Using the last part of Lemma 6.2.11, choose $\{\psi_n\}_1^\infty \subseteq \mathcal{G}$ so that $\|\psi_n - f\|_{L^{(p_2,p_1)}(\mu_2,\mu_1)} \to 0$. Then, by Jensen's inequality, it is easy to check that $\|\psi_n - f\|_{L^1(\mu_1 \times \mu_2)} \to 0$, and therefore that $\psi_n \to f$ in $\mu_1 \times \mu_2$-measure. Hence, without loss in generality, we will assume that $\psi_n \to f$ (a.e., $\mu_1 \times \mu_2$). In particular, by Fatou's Lemma and Exercise 4.1.11, this means that

$$\|f(\,\cdot\,,x_2)\|_{L^{p_1}(\mu_1)} \le \liminf_{n\to\infty} \|\psi_n(\,\cdot\,,x_2)\|_{L^{p_1}(\mu_1)}$$

for $\mu_2$-almost every $x_2 \in E_2$; and so, by the result for $\mathcal{G}$ and another application of Fatou's Lemma, the required result follows for $f$. $\square$
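On finite sets with weighted counting measures the two mixed norms are iterated weighted sums, so the inequality of Theorem 6.2.14 can be tested directly. In the sketch below the table layout `f[j][i]` (outer index first), the weights, and the helper names are assumptions made for illustration.

```python
import random

def mixed_norm(f, mu_inner, mu_outer, p_inner, p_outer):
    """L^{p_outer}(mu_outer) norm of j -> || f[j][.] ||_{L^{p_inner}(mu_inner)}.

    f[j][i] is the value at inner index i and outer index j.
    """
    total = 0.0
    for row, w_out in zip(f, mu_outer):
        inner = sum(abs(x) ** p_inner * w for x, w in zip(row, mu_inner))
        total += inner ** (p_outer / p_inner) * w_out
    return total ** (1.0 / p_outer)

def transpose(f):
    """Swap the roles of the inner and outer variables."""
    return [list(col) for col in zip(*f)]

def minkowski_sides(f, mu1, mu2, p1, p2):
    """(left, right) sides of Theorem 6.2.14 for f[x2][x1] and p1 <= p2."""
    left = mixed_norm(f, mu1, mu2, p1, p2)              # L^{(p1,p2)}(mu1, mu2)
    right = mixed_norm(transpose(f), mu2, mu1, p2, p1)  # L^{(p2,p1)}(mu2, mu1)
    return left, right

random.seed(2)
mu1 = [random.uniform(0.1, 1.0) for _ in range(4)]
mu2 = [random.uniform(0.1, 1.0) for _ in range(5)]
f = [[random.uniform(-2.0, 2.0) for _ in mu1] for _ in mu2]
sides = [minkowski_sides(f, mu1, mu2, p1, p2)
         for p1, p2 in ((1.0, 2.0), (1.5, 3.0), (2.0, 2.0))]
```

When $p_1 = p_2$ both sides reduce to the $L^p(\mu_1 \times \mu_2)$ norm, so the two numbers agree up to rounding.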
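The following result is typical of the way in which one applies Theorem 6.2.14.

6.2.15 Theorem. Let $(E_1,\mathcal{B}_1,\mu_1)$ and $(E_2,\mathcal{B}_2,\mu_2)$ be a pair of $\sigma$-finite measure spaces, and suppose that $K$ is a measurable function on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$ which satisfies

$$M_1 \equiv \sup_{x_2 \in E_2} \|K(\,\cdot\,,x_2)\|_{L^q(\mu_1)} < \infty \quad\text{and}\quad M_2 \equiv \sup_{x_1 \in E_1} \|K(x_1,\,\cdot\,)\|_{L^q(\mu_2)} < \infty$$

for some $q \in [1,\infty)$. Define

$$\mathcal{K}f(x_1) = \int_{E_2} K(x_1,x_2)\,f(x_2)\,\mu_2(dx_2) \tag{6.2.16}$$

for $f \in L^{q'}(\mu_2)$. Then for each $p \in [1,\infty]$ satisfying $\frac1r \equiv \frac1p + \frac1q - 1 \ge 0$,

$$\|\mathcal{K}f\|_{L^r(\mu_1)} \le M_1^{\frac qr} M_2^{1 - \frac qr}\,\|f\|_{L^p(\mu_2)}, \qquad f \in L^p(\mu_2) \cap L^{q'}(\mu_2). \tag{6.2.17}$$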
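PROOF: First suppose that $r = \infty$ and therefore that $p = q'$. Then, by part (i) of Theorem 6.2.7,

$$|\mathcal{K}f(x_1)| \le \|K(x_1,\,\cdot\,)\|_{L^q(\mu_2)}\,\|f\|_{L^{q'}(\mu_2)} \le M_2\,\|f\|_{L^p(\mu_2)}, \qquad x_1 \in E_1,$$

and so (6.2.17) is trivial in this case.

Next, suppose that $p = 1$ and therefore that $q = r$. Here and below, inside a mixed norm $Kf$ stands for the function $(x_1,x_2) \mapsto K(x_1,x_2)f(x_2)$, whereas $\mathcal{K}f$ is the function defined in (6.2.16). Noting that $\|\mathcal{K}f\|_{L^r(\mu_1)} \le \|Kf\|_{L^{(1,r)}(\mu_2,\mu_1)}$, we can apply Theorem 6.2.14 to obtain

$$\begin{aligned} \|\mathcal{K}f\|_{L^r(\mu_1)} &\le \|Kf\|_{L^{(1,r)}(\mu_2,\mu_1)} \le \|Kf\|_{L^{(r,1)}(\mu_1,\mu_2)} \\ &= \int_{E_2} \Bigl(\int_{E_1} |K(x_1,x_2)f(x_2)|^r\,\mu_1(dx_1)\Bigr)^{\frac1r} \mu_2(dx_2) \\ &= \int_{E_2} \|K(\,\cdot\,,x_2)\|_{L^r(\mu_1)}\,|f(x_2)|\,\mu_2(dx_2) \le M_1\,\|f\|_{L^1(\mu_2)}. \end{aligned}$$

Finally, the only case remaining is when $r \in [1,\infty)$ and $p \in (1,\infty)$. Noting that $r \in (q,\infty)$, set $\alpha = \frac qr$. Then, $\alpha \in (0,1)$ and $(1-\alpha)p' = q$. Given $g \in L^{r'}(\mu_1)$, we have, by the second inequality in (6.2.12), that

$$\|g\,\mathcal{K}f\|_{L^1(\mu_1)} \le \|gKf\|_{L^1(\mu_1 \times \mu_2)} \le \bigl\|\,|K|^\alpha f\bigr\|_{L^{(r,p)}(\mu_1,\mu_2)}\,\bigl\|g\,|K|^{1-\alpha}\bigr\|_{L^{(r',p')}(\mu_1,\mu_2)}.$$

Next, observe that

$$\bigl\|\,|K|^\alpha f\bigr\|_{L^{(r,p)}(\mu_1,\mu_2)} = \Bigl[\int_{E_2} \Bigl(\int_{E_1} |K(x_1,x_2)|^{\alpha r}\,|f(x_2)|^r\,\mu_1(dx_1)\Bigr)^{\frac pr} \mu_2(dx_2)\Bigr]^{\frac1p} \le M_1^\alpha\,\|f\|_{L^p(\mu_2)}.$$

At the same time, since $p \le r$ and therefore $r' \le p'$, we can apply Theorem 6.2.14 to see that

$$\bigl\|g\,|K|^{1-\alpha}\bigr\|_{L^{(r',p')}(\mu_1,\mu_2)} \le \bigl\|g\,|K|^{1-\alpha}\bigr\|_{L^{(p',r')}(\mu_2,\mu_1)}.$$

Hence, by the same reasoning as we just applied to $\bigl\|\,|K|^\alpha f\bigr\|_{L^{(r,p)}(\mu_1,\mu_2)}$, we find that

$$\bigl\|g\,|K|^{1-\alpha}\bigr\|_{L^{(r',p')}(\mu_1,\mu_2)} \le M_2^{1-\alpha}\,\|g\|_{L^{r'}(\mu_1)}.$$

Combining these two, we arrive at

$$\|g\,\mathcal{K}f\|_{L^1(\mu_1)} \le M_1^\alpha M_2^{1-\alpha}\,\|f\|_{L^p(\mu_2)}\,\|g\|_{L^{r'}(\mu_1)}$$

for all $g \in L^{r'}(\mu_1)$ with $\|g\|_{L^{r'}(\mu_1)} \le 1$; and so (6.2.17) now follows from part (iii) of Theorem 6.2.7. $\square$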
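6.2.18 Corollary. Let everything be as in Theorem 6.2.15, and, for measurable $f : E_2 \to \mathbb{R}$, define

$$\Lambda_K(f) = \Bigl\{x_1 \in E_1 : \int_{E_2} |K(x_1,x_2)|\,|f(x_2)|\,\mu_2(dx_2) < \infty\Bigr\}$$

and

$$\overline{\mathcal{K}}f(x_1) = \begin{cases} \int_{E_2} K(x_1,x_2)\,f(x_2)\,\mu_2(dx_2) & \text{if } x_1 \in \Lambda_K(f) \\ 0 & \text{otherwise.} \end{cases}$$

Next, let $p \in [1,\infty]$ satisfying $\frac1p + \frac1q \ge 1$ be given and define $r \in [1,\infty]$ by $\frac1r = \frac1p + \frac1q - 1$. Then

$$\mu_1\bigl(\Lambda_K(f)^\complement\bigr) = 0 \quad\text{and}\quad \|\overline{\mathcal{K}}f\|_{L^r(\mu_1)} \le M_1^{\frac qr} M_2^{1-\frac qr}\,\|f\|_{L^p(\mu_2)} \tag{6.2.19}$$

for $f \in L^p(\mu_2)$. In particular, $\overline{\mathcal{K}}$ maps $L^p(\mu_2)$ linearly into $L^r(\mu_1)$. In fact, $f \in L^p(\mu_2) \longmapsto \overline{\mathcal{K}}f \in L^r(\mu_1)$ is the unique continuous mapping from $L^p(\mu_2)$ into $L^r(\mu_1)$ whose restriction to $L^p(\mu_2) \cap L^{q'}(\mu_2)$ is given by the map $\mathcal{K}$ in (6.2.16).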
PROOF: If $r = \infty$, and therefore $p = q'$, there is nothing to do. Thus, we will assume that $r$ and therefore $p$ are finite.

Let $f \in L^p(\mu_2)$ be given, and set $f_n = f\,\mathbf{1}_{[-n,n]}\circ f$ for $n \in \mathbb{Z}^+$. Because $p \le q'$ and $f_n \in L^p(\mu_2) \cap L^\infty(\mu_2)$, $f_n \in L^p(\mu_2) \cap L^{q'}(\mu_2)$ (cf. Exercise 6.2.20). Hence, by (6.2.17) applied to $|K|$ and $|f_n|$,

$$\Bigl\| \int_{\{x_2 : |f(x_2)| \le n\}} |K(\,\cdot\,,x_2)|\,|f(x_2)|\,\mu_2(dx_2) \Bigr\|_{L^r(\mu_1)} \le M_1^{\frac qr} M_2^{1-\frac qr}\,\|f\|_{L^p(\mu_2)}.$$

In particular, by the Monotone Convergence Theorem, this proves both parts of (6.2.19). Furthermore, if $f, g \in L^p(\mu_2)$ and $\alpha, \beta \in \mathbb{R}$, then

$$\overline{\mathcal{K}}(\alpha f + \beta g) = \alpha\,\overline{\mathcal{K}}f + \beta\,\overline{\mathcal{K}}g \quad\text{on } \Lambda_K(f) \cap \Lambda_K(g).$$

Thus, since both $\Lambda_K(f)^\complement$ and $\Lambda_K(g)^\complement$ have $\mu_1$-measure $0$, we now see that, as a mapping into $L^r(\mu_1)$, $\overline{\mathcal{K}}$ is linear. Finally, it is obvious that $\overline{\mathcal{K}}f = \mathcal{K}f$ for $f \in L^p(\mu_2) \cap L^{q'}(\mu_2)$. Hence, if $\mathcal{K}'$ is any extension of $\mathcal{K} \restriction L^p(\mu_2) \cap L^{q'}(\mu_2)$ as a continuous, linear mapping from $L^p(\mu_2)$ to $L^r(\mu_1)$, then with the same choice of $\{f_n\}_1^\infty$ as above

$$\|\overline{\mathcal{K}}f - \mathcal{K}'f\|_{L^r(\mu_1)} \le \limsup_{n\to\infty} \|\overline{\mathcal{K}}f - \overline{\mathcal{K}}f_n\|_{L^r(\mu_1)} = \lim_{n\to\infty} \|\overline{\mathcal{K}}(f - f_n)\|_{L^r(\mu_1)} \le M_1^{\frac qr} M_2^{1-\frac qr} \lim_{n\to\infty} \|f - f_n\|_{L^p(\mu_2)} = 0. \qquad\square$$
Exercises

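6.2.20 Exercise: Let $(E,\mathcal{B},\mu)$ be a measure space and $1 \le q_1 \le q_2 \le \infty$ be given. If $f \in L^{q_1}(\mu) \cap L^{q_2}(\mu)$, show that for any $t \in (0,1)$

$$\|f\|_{L^{p_t}(\mu)} \le \|f\|_{L^{q_1}(\mu)}^{t}\,\|f\|_{L^{q_2}(\mu)}^{1-t} \quad\text{where}\quad \frac1{p_t} = \frac t{q_1} + \frac{1-t}{q_2}. \tag{6.2.21}$$

Note that (6.2.21) says that $p \longmapsto -\log \|f\|_{L^p(\mu)}$ is a concave function of $\frac1p$.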
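Before attempting the exercise, one can see (6.2.21) at work numerically. The sketch below evaluates both sides for a weighted measure on a four-point set and finite exponents; the data and the helper names are illustrative assumptions, and a numerical check is of course no proof.

```python
def lp_norm(f, mu, p):
    """(sum_k |f_k|^p mu_k)^(1/p) for a measure with weights mu on a finite set."""
    return sum(abs(x) ** p * m for x, m in zip(f, mu)) ** (1.0 / p)

def interpolated_exponent(q1, q2, t):
    """p_t defined by 1/p_t = t/q1 + (1 - t)/q2 (finite q1, q2 only)."""
    return 1.0 / (t / q1 + (1.0 - t) / q2)

def interpolation_gap(f, mu, q1, q2, t):
    """Right side minus left side of (6.2.21); should be >= 0."""
    p_t = interpolated_exponent(q1, q2, t)
    bound = lp_norm(f, mu, q1) ** t * lp_norm(f, mu, q2) ** (1.0 - t)
    return bound - lp_norm(f, mu, p_t)

mu = [0.3, 1.0, 2.0, 0.7]
f = [1.0, -2.0, 3.0, 0.5]
gaps = [interpolation_gap(f, mu, 1.5, 4.0, t) for t in (0.25, 0.5, 0.75)]
```

The gap closes when $|f|$ is constant on the set where it does not vanish, for instance for an indicator function.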
6.2.22 Exercise: The following exercises give some insight into the $L^p$-spaces in various situations.

(i) If $(E,\mathcal{B},\mu)$ is a probability space, show that $p \in [1,\infty] \longmapsto \|f\|_{L^p(\mu)}$ is a non-decreasing function for any measurable $f$ on $(E,\mathcal{B})$.

(ii) Let $E = \mathbb{Z}^+$ and define $\mu$ on $\mathcal{B} = \mathcal{P}(E)$ by $\mu(\{n\}) = 1$ for all $n \in \mathbb{Z}^+$. In this case show that $p \in [1,\infty] \longmapsto \|f\|_{L^p(\mu)}$ is non-increasing for every $f$ on $E$.

(iii) Let $(E,\mathcal{B},\mu)$ be a measure space and $f : E \to \mathbb{R}$ a $\mathcal{B}$-measurable function. Show that $\|f\|_{L^\infty(\mu)} \le \liminf_{p \nearrow \infty} \|f\|_{L^p(\mu)}$. Further, assuming either that $\mu(E) < \infty$ or that $\|f\|_{L^1(\mu)} < \infty$, show that $\|f\|_{L^\infty(\mu)} = \lim_{p \nearrow \infty} \|f\|_{L^p(\mu)}$.

(iv) Let $(E_i,\mathcal{B}_i)$, $i \in \{1,2\}$, be measurable spaces, and suppose that $\mu_2$ is a $\sigma$-finite measure on $(E_2,\mathcal{B}_2)$. Using part (iii), show that, for every measurable function $f$ on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$, the function $x_1 \longmapsto \|f(x_1,\,\cdot_2)\|_{L^\infty(\mu_2)}$ is measurable on $(E_1,\mathcal{B}_1)$. In particular, we could have defined $\|f\|_{L^{(p_1,p_2)}(\mu_1,\mu_2)}$ for all $p_1, p_2 \in [1,\infty]$.
6.2.23 Exercise: Let $(E,\mathcal{B},\mu)$ be a measure space, $g$ a non-negative element of $L^p(\mu)$ for some $p \in (1,\infty)$, and $f$ a non-negative, $\mathcal{B}$-measurable function for which there exists a $C \in (0,\infty)$ such that

$$\mu(f > t) \le \frac Ct \int_{\{f > t\}} g\,d\mu, \qquad t \in (0,\infty). \tag{6.2.24}$$

(i) Set $\nu(\Gamma) = \int_\Gamma g\,d\mu$ for $\Gamma \in \mathcal{B}$, note that (6.2.24) is equivalent to

$$\mu(f > t) \le \frac Ct\,\nu(f > t), \qquad t \in (0,\infty),$$

and use (5.1.4) (cf. Exercise 5.1.8) to justify

$$\|f\|_{L^p(\mu)}^p = p\int_{(0,\infty)} t^{p-1}\mu(f > t)\,dt \le Cp\int_{(0,\infty)} t^{p-2}\nu(f > t)\,dt = \frac{Cp}{p-1}\,\|f\|_{L^{p-1}(\nu)}^{p-1}.$$

Finally, note that $\|f\|_{L^{p-1}(\nu)}^{p-1} = \int f^{p-1} g\,d\mu$, and apply Hölder's inequality to conclude that

$$\|f\|_{L^p(\mu)}^p \le \frac{Cp}{p-1}\,\|f\|_{L^p(\mu)}^{p-1}\,\|g\|_{L^p(\mu)}. \tag{6.2.25}$$

(ii) Under the condition that $\|f\|_{L^p(\mu)} < \infty$, it is clear that (6.2.25) implies

$$\|f\|_{L^p(\mu)} \le \frac{Cp}{p-1}\,\|g\|_{L^p(\mu)}. \tag{6.2.26}$$

Now suppose that $\mu(E) < \infty$. After checking that (6.2.24) for $f$ implies (6.2.24) for $f_R \equiv f \wedge R$, conclude that (6.2.26) holds first with $f_R$ replacing $f$ and then, after $R \nearrow \infty$, for $f$ itself. In other words, when $\mu$ is finite, (6.2.24) always implies (6.2.26).

(iii) Even if $\mu$ is not finite, show that (6.2.24) implies (6.2.26) as long as $\mu(f > \epsilon) < \infty$ for every $\epsilon > 0$.

Hint: Given $\epsilon > 0$, consider $\mu_\epsilon = \mu \restriction \mathcal{B}[\{f > \epsilon\}]$, note that (6.2.24) with $\mu$ implies itself with $\mu_\epsilon$, and use (ii) to conclude that (6.2.26) holds with $\mu_\epsilon$ in place of $\mu$. Finally, let $\epsilon \searrow 0$.

6.2.27 Exercise: Recall the Hardy-Littlewood maximal function $Mf$ defined in (3.4.2), and observe that the definition extends, without change, to any measurable $f$ on $\mathbb{R}$ which is integrable on compacts. Show that

$$\|Mf\|_{L^p(\mathbb{R})} \le \frac{2p}{p-1}\,\|f\|_{L^p(\mathbb{R})}, \qquad p \in (1,\infty) \text{ and } f \in L^p(\mathbb{R}). \tag{6.2.28}$$

Hint: When $f \in C_c(\mathbb{R})$, use (3.4.7) together with (iii) of Exercise 6.2.23. To handle general $f \in L^p(\mathbb{R})$, use (v) in Theorem 6.2.2 to find $\{f_n\}_1^\infty \subseteq C_c(\mathbb{R})$ so that $f_n \to f$ in $L^p(\mathbb{R})$, and apply Fatou's Lemma to see that

$$\|Mf\|_{L^p(\mathbb{R})} \le \liminf_{n\to\infty} \|Mf_n\|_{L^p(\mathbb{R})}.$$

6.3 Convolution and Approximate Identities.


We will use the ideas of the last section to develop in this section an important notion of multiplication for functions on $\mathbb{R}^N$; and, because the only measure involved will be Lebesgue's, we will use the notation $L^p(\mathbb{R}^N)$ instead of the more cumbersome $L^p(\lambda_{\mathbb{R}^N})$.

6.3.1 Theorem (Young's Inequality). Let $p$ and $q$ from $[1,\infty]$ satisfying $\frac1p + \frac1q \ge 1$ be given, and define $r \in [1,\infty]$ by $\frac1r = \frac1p + \frac1q - 1$. Then, for each $f \in L^p(\mathbb{R}^N)$ and $g \in L^q(\mathbb{R}^N)$, the complement of the set

$$\Lambda(f,g) \equiv \Bigl\{x \in \mathbb{R}^N : \int_{\mathbb{R}^N} |f(x-y)|\,|g(y)|\,dy < \infty\Bigr\} \tag{6.3.2}$$

has Lebesgue measure $0$. Furthermore, if

$$f * g(x) \equiv \begin{cases} \int_{\mathbb{R}^N} f(x-y)\,g(y)\,dy & \text{when } x \in \Lambda(f,g) \\ 0 & \text{otherwise,} \end{cases} \tag{6.3.3}$$

then $f * g = g * f$ and

$$\|f * g\|_{L^r(\mathbb{R}^N)} \le \|f\|_{L^p(\mathbb{R}^N)}\,\|g\|_{L^q(\mathbb{R}^N)}. \tag{6.3.4}$$

Finally, the mapping $(f,g) \in L^p(\mathbb{R}^N) \times L^q(\mathbb{R}^N) \longmapsto f * g \in L^r(\mathbb{R}^N)$ is bilinear.
PROOF: We begin with the observation that there is nothing to do when $r = \infty$. Thus, we will assume throughout that $r$ and therefore also $p$ and $q$ are all finite. Next, using the translation invariance of Lebesgue's measure, first note that $\Lambda(f,g) = \Lambda(g,f)$ and then conclude that $f * g = g * f$. Finally, given $q \in [1,\infty)$ and $g \in L^q(\mathbb{R}^N)$, set $K(x,y) = g(x-y)$ for $x, y \in \mathbb{R}^N$. Obviously,

$$\sup_{y \in \mathbb{R}^N} \|K(\,\cdot\,,y)\|_{L^q(\mathbb{R}^N)} = \sup_{x \in \mathbb{R}^N} \|K(x,\,\cdot\,)\|_{L^q(\mathbb{R}^N)} = \|g\|_{L^q(\mathbb{R}^N)} < \infty;$$

and, in the notation of Corollary 6.2.18, $\Lambda(f,g) = \Lambda_K(f)$ and $f * g = \overline{\mathcal{K}}f$. In particular, for each $f \in L^p(\mathbb{R}^N)$, $\Lambda(f,g)^\complement$ has Lebesgue measure $0$ and (6.3.4) holds. In addition, $f \in L^p(\mathbb{R}^N) \longmapsto f * g \in L^r(\mathbb{R}^N)$ is linear for each $g \in L^q(\mathbb{R}^N)$, and therefore the bilinearity assertion follows after one reverses the roles of $f$ and $g$. $\square$
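Theorem 6.2.15 and Corollary 6.2.18 hold for any pair of $\sigma$-finite measure spaces, so the same argument gives Young's inequality for sequences on $\mathbb{Z}$ with counting measure. The sketch below checks this discrete analogue for finitely supported sequences; it is an illustration of the inequality, not of the Lebesgue-measure statement itself, and all names and data are assumptions.

```python
import random

def convolve(f, g):
    """(f * g)(n) = sum_k f(n - k) g(k) for finitely supported sequences."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out

def seq_norm(f, p):
    """l^p norm with respect to counting measure; p may be float('inf')."""
    if p == float("inf"):
        return max(abs(x) for x in f)
    return sum(abs(x) ** p for x in f) ** (1.0 / p)

def young_exponent(p, q):
    """r defined by 1/r = 1/p + 1/q - 1 (r = inf when the right side is 0)."""
    inv_r = 1.0 / p + 1.0 / q - 1.0
    return float("inf") if abs(inv_r) < 1e-15 else 1.0 / inv_r

random.seed(3)
f = [random.uniform(-1.0, 1.0) for _ in range(12)]
g = [random.uniform(-1.0, 1.0) for _ in range(7)]
checks = []
for p, q in ((1.0, 1.0), (1.0, 3.0), (1.5, 2.0), (2.0, 2.0)):
    r = young_exponent(p, q)
    checks.append((seq_norm(convolve(f, g), r), seq_norm(f, p) * seq_norm(g, q)))
```

Each pair in `checks` holds the left and right sides of the analogue of (6.3.4); the case $p = q = 2$ has $r = \infty$ and reduces to Schwarz's inequality.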


The quantity $f * g$ described in (6.3.3) is called the convolution of $f$ with $g$. In applications, the most useful cases are those when $f \in L^p(\mathbb{R}^N)$ and $g \in L^q(\mathbb{R}^N)$ where either $p = q'$ (and therefore $r = \infty$) or $p = 1$ (and therefore $r = q$). To get more information about the case when $p = q'$, we will need the following.
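6.3.5 Lemma. Given $h \in \mathbb{R}^N$, define $\tau_h f$ for functions $f$ on $\mathbb{R}^N$ by $\tau_h f(x) = f(x+h)$. Then $\tau_h$ is an isometry on $L^p(\mathbb{R}^N)$ for every $h \in \mathbb{R}^N$ and $p \in [1,\infty]$. Moreover, if $p \in [1,\infty)$ and $f \in L^p(\mathbb{R}^N)$, then

(6.3.6) $\lim_{h \to 0} \|\tau_h f - f\|_{L^p(\mathbb{R}^N)} = 0.$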

PROOF: The first assertion is an immediate consequence of the translation invariance of Lebesgue's measure.

Next, suppose that $p \in [1,\infty)$ is given. If $\mathcal{G}$ denotes the class of $f \in L^p(\mathbb{R}^N)$ for which (6.3.6) holds, it is clear that $C_c(\mathbb{R}^N) \subseteq \mathcal{G}$. Hence, by (v) in Theorem 6.2.2, we will know that $\mathcal{G} = L^p(\mathbb{R}^N)$ as soon as we show that $\mathcal{G}$ is closed in $L^p(\mathbb{R}^N)$. To this end, let $\{f_n\}_1^\infty \subseteq \mathcal{G}$ and suppose that $f_n \longrightarrow f$ in $L^p(\mathbb{R}^N)$. Then

$\varlimsup_{h \to 0} \|\tau_h f - f\|_{L^p(\mathbb{R}^N)} \le \varlimsup_{h \to 0} \|\tau_h (f - f_n)\|_{L^p(\mathbb{R}^N)} + \varlimsup_{h \to 0} \|\tau_h f_n - f_n\|_{L^p(\mathbb{R}^N)} + \|f_n - f\|_{L^p(\mathbb{R}^N)} = 2\,\|f_n - f\|_{L^p(\mathbb{R}^N)} \longrightarrow 0$

as $n \to \infty$. □
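To see why the continuity of translation is restricted to $p < \infty$, it may help to look at $f = \mathbf{1}_{[0,1]}$ on $\mathbb{R}$, for which $\|\tau_h f - f\|_{L^1} = 2|h|$ when $|h| \le 1$ while the uniform distance stays equal to 1. The sketch below checks this on a grid; the grid, the choice of $f$, and the helper name `shift_error` are assumptions made for illustration only.

```python
import numpy as np

def shift_error(h_shift, p, dx=1e-3):
    """Grid approximation of ||tau_h f - f||_{L^p(R)} for f = indicator of [0, 1]."""
    x = np.arange(-2.0, 3.0, dx)
    f = ((x >= 0.0) & (x <= 1.0)).astype(float)
    # tau_h f(x) = f(x + h)
    f_shifted = ((x + h_shift >= 0.0) & (x + h_shift <= 1.0)).astype(float)
    diff = np.abs(f_shifted - f)
    if np.isinf(p):
        return float(diff.max())
    return float((np.sum(diff ** p) * dx) ** (1.0 / p))

for h_shift in [0.4, 0.1, 0.025]:
    print(h_shift, shift_error(h_shift, 1.0), shift_error(h_shift, np.inf))
```

The $L^1$ column shrinks with the shift, whereas the last column does not, so no analogue of (6.3.6) can hold for $p = \infty$ without further assumptions on $f$.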
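6.3.7 Theorem. Let $p \in [1,\infty]$, $f \in L^p(\mathbb{R}^N)$, and $g \in L^{p'}(\mathbb{R}^N)$. Then

$\tau_h(f * g) = (\tau_h f) * g = f * (\tau_h g)$ for all $h \in \mathbb{R}^N$.

Moreover, $f * g$ is uniformly continuous on $\mathbb{R}^N$ and

(6.3.8) $\|f * g\|_u \le \|f\|_{L^p(\mathbb{R}^N)}\,\|g\|_{L^{p'}(\mathbb{R}^N)}.$

Finally, if $p \in (1,\infty)$, then

(6.3.9) $\lim_{|x| \to \infty} f * g(x) = 0.$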
PROOF: The first assertion is again just an expression of translation invariance for $\lambda_{\mathbb{R}^N}$. Further, (6.3.8) is a simple application of Hölder's inequality. To see that $f * g$ is uniformly continuous, first suppose that $p \in [1,\infty)$. Then, by (6.3.8) and (6.3.6),

$\|\tau_h(f * g) - f * g\|_u = \|(\tau_h f - f) * g\|_u \le \|\tau_h f - f\|_{L^p(\mathbb{R}^N)}\,\|g\|_{L^{p'}(\mathbb{R}^N)} \longrightarrow 0$

as $|h| \to 0$; and when $p = \infty$, simply reverse the roles of $f$ and $g$ in this argument.

To prove the final assertion, first let $f \in C_c(\mathbb{R}^N)$ be given, and define $\mathcal{G}_f$ to be the class of $g \in L^{p'}(\mathbb{R}^N)$ for which (6.3.9) holds. Then it is easy to check that $C_c(\mathbb{R}^N) \subseteq \mathcal{G}_f$. Moreover, by (6.3.8), one sees that $\mathcal{G}_f$ is closed in $L^{p'}(\mathbb{R}^N)$. Hence, just as in the final step of the proof of Lemma 6.3.5, we conclude that $\mathcal{G}_f = L^{p'}(\mathbb{R}^N)$. Next, let $g \in L^{p'}(\mathbb{R}^N)$ be given and define $\mathcal{H}_g$ to be the class of $f \in L^p(\mathbb{R}^N)$ for which (6.3.9) is true. By the preceding, we know that $C_c(\mathbb{R}^N) \subseteq \mathcal{H}_g$. Moreover, just as before, $\mathcal{H}_g$ is closed in $L^p(\mathbb{R}^N)$; and therefore $\mathcal{H}_g = L^p(\mathbb{R}^N)$. □
6.3.10 Remark: Both Theorem 6.3.1 and Theorem 6.3.7 tell us that the convolution of two functions is often more regular than either or both of its factors. An application of this fact is given in Exercise 6.3.26 below, where one sees how it leads to an elegant derivation of Lemma 2.1.15.

The next result can be considered as another example of the observation made in the preceding Remark.
6.3.11 Lemma. Let $g \in C^1(\mathbb{R}^N)$, and assume that $g$ as well as $g_{,1}, \dots, g_{,N}$ are elements of $L^{p'}(\mathbb{R}^N)$ for some $p \in [1,\infty]$. (Recall that $g_{,i} \equiv \frac{\partial g}{\partial x_i}$.) Then $f * g \in C^1(\mathbb{R}^N)$ for every $f \in L^p(\mathbb{R}^N)$, and

(6.3.12) $\dfrac{\partial (f * g)}{\partial x_i} = f * g_{,i}, \quad 1 \le i \le N.$

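PROOF: Let $\omega \in S^{N-1}$ be given. If $p' \in [1,\infty)$, then, by Theorem 6.3.7, $\tau_{t\omega}(f * g) - f * g = f * (\tau_{t\omega} g - g)$ for every $t \in \mathbb{R}$. Since

$\dfrac{\tau_{t\omega} g(y) - g(y)}{t} = \int_{[0,1]} \bigl(\omega, \nabla g(y + st\omega)\bigr)_{\mathbb{R}^N}\,ds$

and, by Theorem 6.2.14,

$\Bigl\| \int_{[0,1]} \bigl(\omega, \nabla g(\,\cdot + st\omega) - \nabla g(\,\cdot\,)\bigr)_{\mathbb{R}^N}\,ds \Bigr\|_{L^{p'}(\mathbb{R}^N)} \le \int_{[0,1]} \bigl\| \tau_{st\omega}(\omega, \nabla g)_{\mathbb{R}^N} - (\omega, \nabla g)_{\mathbb{R}^N} \bigr\|_{L^{p'}(\mathbb{R}^N)}\,ds \longrightarrow 0$

as $t \to 0$, the required result follows from (6.3.4). On the other hand, if $p' = \infty$, then

$\dfrac{\tau_{t\omega} g(y) - g(y)}{t} = \int_{[0,1]} \bigl(\omega, \nabla g(y + st\omega)\bigr)_{\mathbb{R}^N}\,ds \longrightarrow \bigl(\omega, \nabla g(y)\bigr)_{\mathbb{R}^N}$

boundedly and point-wise, and therefore the result follows, in this case, from Lebesgue's Dominated Convergence Theorem. □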
The preceding result leads immediately to the conclusion that the smoother $g$ is, the smoother $f * g$ is. More precisely, given a multi-index $\alpha = (\alpha_1, \dots, \alpha_N)$, where the $\alpha_i$'s are non-negative integers, define $|\alpha| = \sum_1^N \alpha_i$ and

$\partial^\alpha f = \dfrac{\partial^{|\alpha|} f}{\partial x_1^{\alpha_1} \cdots \partial x_N^{\alpha_N}}.$

Then, as an immediate corollary to Lemma 6.3.11, we see that if $g \in C^\infty(\mathbb{R}^N)$ and $\partial^\alpha g \in L^{p'}(\mathbb{R}^N)$ for some $p \in [1,\infty]$ and all $\alpha$'s, then

(6.3.13) $\partial^\alpha (f * g) = f * (\partial^\alpha g)$

for every $f \in L^p(\mathbb{R}^N)$.

We next turn our attention to the case when $g \in L^1(\mathbb{R}^N)$. The main result here is the one which follows.

6.3.14 Theorem. Given $g \in L^1(\mathbb{R}^N)$ and $t > 0$, define $g_t(\,\cdot\,) = t^{-N} g(t^{-1}\,\cdot\,)$. Then $g_t \in L^1(\mathbb{R}^N)$ and $\int g_t\,dx = \int g\,dx$. In addition, if $\int g\,dx = 1$, then for every $p \in [1,\infty)$ and $f \in L^p(\mathbb{R}^N)$:

(6.3.15) $\lim_{t \searrow 0} \|f * g_t - f\|_{L^p(\mathbb{R}^N)} = 0.$

PROOF: We need only deal with the last statement.

Assume that $\int g\,dx = 1$. Given $f \in L^p(\mathbb{R}^N)$, note that, for almost every $x \in \mathbb{R}^N$,

$f * g_t(x) - f(x) = \int_{\mathbb{R}^N} \bigl(f(x-y) - f(x)\bigr)\,g_t(y)\,dy = \int_{\mathbb{R}^N} \bigl(f(x-ty) - f(x)\bigr)\,g(y)\,dy.$

Hence, if $\Psi_t(x,y) \equiv \bigl(f(x-ty) - f(x)\bigr)\,g(y)$, then, by Theorem 6.2.14,

$\|f * g_t - f\|_{L^p(\mathbb{R}^N)} \le \|\Psi_t\|_{L^{(1,p)}(\lambda_{\mathbb{R}^N},\lambda_{\mathbb{R}^N})} \le \|\Psi_t\|_{L^{(p,1)}(\lambda_{\mathbb{R}^N},\lambda_{\mathbb{R}^N})} = \int_{\mathbb{R}^N} \|\tau_{-ty} f - f\|_{L^p(\mathbb{R}^N)}\,|g(y)|\,dy.$

Since $\|\tau_{-ty} f - f\|_{L^p(\mathbb{R}^N)} \le 2\,\|f\|_{L^p(\mathbb{R}^N)}$, we now see that the result follows from the above by Lebesgue's Dominated Convergence Theorem and (6.3.6). □
For reasons which ought to be made clear by Theorem 6.3.14, if $g \in L^1(\mathbb{R}^N)$ and $\int g\,dx = 1$, the corresponding family $\{g_t : t > 0\}$ is called an approximate identity. To understand how an approximate identity actually carries out an approximation of the identity, consider the case when $g$ is non-negative and vanishes off of $B(0,1)$. Then the volume under the graph of $g_t$ continues to be 1 as $t \searrow 0$ while the base of the graph is restricted to $B(0,t)$. Hence, all the mass is getting concentrated over the origin.
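The convergence $f * g_t \to f$ in $L^p$ as $t \searrow 0$ can be watched numerically. The following is a minimal sketch in dimension $N = 1$, assuming a Gaussian for $g$ (which has integral 1) and $f = \mathbf{1}_{[-1,1]}$; the grid parameters and the helper names `gauss` and `lp_error` are hypothetical choices, and the Riemann sums only approximate the integrals.

```python
import numpy as np

def gauss(x):
    """Standard Gaussian density on R; it has integral 1."""
    return np.exp(-x ** 2 / 2.0) / np.sqrt(2.0 * np.pi)

def lp_error(t, p=1.0, h=0.005, half_width=6.0):
    """Riemann-sum estimate of ||f * g_t - f||_{L^p} for f = indicator of [-1, 1]."""
    n = int(round(half_width / h))
    x = np.arange(-n, n + 1) * h               # symmetric grid with an odd number of points
    f = (np.abs(x) <= 1.0).astype(float)
    g_t = gauss(x / t) / t                     # g_t(x) = t^{-1} g(x / t)
    conv = np.convolve(f, g_t, mode="same") * h   # samples of f * g_t on the same grid
    return float((np.sum(np.abs(conv - f) ** p) * h) ** (1.0 / p))

for t in (0.5, 0.25, 0.125):
    print(t, lp_error(t))
```

Because the grid is symmetric and has odd length, `mode="same"` returns the convolution sampled at the same points as `f`, so the difference `conv - f` is taken point by point.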
Combining Theorem 6.3.14 and (6.3.13), we get the following important approximation procedure.

6.3.16 Corollary. Let $g \in C^\infty(\mathbb{R}^N) \cap L^1(\mathbb{R}^N)$ with $\int_{\mathbb{R}^N} g(x)\,dx = 1$ be given. In addition, let $p \in [1,\infty)$ and assume that $\partial^\alpha g \in L^{p'}(\mathbb{R}^N)$ for all $\alpha \in \mathbb{N}^N$. Then, for each $f \in L^p(\mathbb{R}^N)$, $f * g_t \longrightarrow f$ in $L^p(\mathbb{R}^N)$ as $t \searrow 0$; and, for all $t > 0$, $f * g_t$ has bounded, continuous derivatives of all orders and $\partial^\alpha (f * g_t) = f * (\partial^\alpha g_t)$, $\alpha \in \mathbb{N}^N$.

Exercises

6.3.17 Exercise: Given $f, g \in L^1(\mathbb{R}^N)$, show that

$\int f * g(x)\,dx = \int f(x)\,dx \int g(x)\,dx.$

6.3.18 Exercise: Given a family $\{f_t : t \in (0,\infty)\} \subseteq L^1(\mathbb{R}^N)$, we say that the family is a convolution semigroup if $f_{s+t} = f_s * f_t$ for all $s, t \in (0,\infty)$. Here are four famous examples of convolution semigroups, three of which are also approximate identities.


(i) Define the Gauss kernel $\gamma(x) = (2\pi)^{-\frac{N}{2}} \exp\bigl(-\frac{|x|^2}{2}\bigr)$ for $x \in \mathbb{R}^N$. Using the result in part (i) of Exercise 5.2.6, show that $\int_{\mathbb{R}^N} \gamma(x)\,dx = 1$ and that

$\gamma_{\sqrt{s}} * \gamma_{\sqrt{t}} = \gamma_{\sqrt{s+t}}$ for $s, t \in (0,\infty)$.

Clearly this says that the approximate identity $\{\gamma_{\sqrt{t}} : t \in (0,\infty)\}$ is a convolution semigroup of functions. It is known as either the heat flow semigroup or the Weierstrass semigroup.
(ii) Define $\nu$ on $\mathbb{R}$ by

Show that $\int_{\mathbb{R}} \nu(\xi)\,d\xi = 1$ and that

(6.3.19) $\nu_{s^2} * \nu_{t^2} = \nu_{(s+t)^2}, \quad s, t \in (0,\infty).$

Hence, here again, we have an approximate identity which is a convolution semigroup. The family $\{\nu_{t^2} : t > 0\}$, or, more precisely, the probability measures $A \longmapsto \int_A \nu_{t^2}(\xi)\,d\xi$, play a role in probability theory, where they are called the one-sided stable laws of order $\frac12$.
Hint: Note that for $\eta \in (0,\infty)$

try the change of variable � (�) -f--e , and use part ( iv ) of Exercise 5.2.6.
==

(iii) Using part (ii) of Exercise 5.3.24, check that the function $P$ on $\mathbb{R}^N$ given by

$P(x) = \dfrac{2}{\omega_N} \bigl(1 + |x|^2\bigr)^{-\frac{N+1}{2}}, \quad x \in \mathbb{R}^N,$

has Lebesgue integral 1. Next prove the representation

(6.3.20)

Finally, using (6.3.20) together with the preceding parts of this exercise, show that

(6.3.21) $P_s * P_t = P_{s+t}, \quad s, t \in (0,\infty);$

and therefore that $\{P_t : t > 0\}$ is a convolution semigroup. This semigroup is known as the Poisson semigroup among harmonic analysts and as the Cauchy semigroup in probability theory; the representation (6.3.20) is an example of how to obtain one semigroup from another by the method of subordination.
(iv) For each $\alpha > 0$, define $g_\alpha : \mathbb{R} \longrightarrow [0,\infty)$ so that $g_\alpha(x) = 0$ if $x \le 0$ and $g_\alpha(x) = \bigl(\Gamma(\alpha)\bigr)^{-1} x^{\alpha-1} e^{-x}$ for $x > 0$. Clearly, (cf. (ii) in Exercise 5.2.6) $\int_{\mathbb{R}} g_\alpha(x)\,dx = 1$. Next, check that

$g_\alpha * g_\beta(x) = \Bigl( \dfrac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)} \int_{(0,1)} t^{\alpha-1} (1-t)^{\beta-1}\,dt \Bigr)\, g_{\alpha+\beta}(x),$

and use this, together with Exercise 6.3.17, to give another derivation of the formula, in (i) of Exercise 5.3.24, for the Beta function. Clearly, it also shows that $\{g_\alpha : \alpha > 0\}$ is yet another convolution semigroup, although this one is not an approximate identity.
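Two of the semigroup identities in Exercise 6.3.18 are easy to test numerically in dimension $N = 1$: the Gaussian relation $\gamma_{\sqrt{s}} * \gamma_{\sqrt{t}} = \gamma_{\sqrt{s+t}}$ from (i) and the relation $g_\alpha * g_\beta = g_{\alpha+\beta}$ from (iv). The sketch below is only a plausibility check, not a proof; the grids, the parameter values, and the helper names `gauss_scaled` and `gamma_density` are assumptions chosen for illustration.

```python
import numpy as np
from math import gamma

def gauss_scaled(x, t):
    """gamma_t(x) = t^{-1} gamma(x / t) for the Gauss kernel with N = 1."""
    return np.exp(-(x / t) ** 2 / 2.0) / (t * np.sqrt(2.0 * np.pi))

def gamma_density(alpha, x):
    """g_alpha(x) = x^(alpha - 1) e^(-x) / Gamma(alpha) for x > 0, and 0 otherwise."""
    x = np.asarray(x, dtype=float)
    out = np.zeros_like(x)
    pos = x > 0
    out[pos] = x[pos] ** (alpha - 1.0) * np.exp(-x[pos]) / gamma(alpha)
    return out

h = 0.01

# Part (i): Gaussian semigroup on a symmetric grid over [-10, 10].
n = 1000
x = np.arange(-n, n + 1) * h
s, t = 0.5, 1.5
lhs = np.convolve(gauss_scaled(x, np.sqrt(s)), gauss_scaled(x, np.sqrt(t)), mode="same") * h
gauss_err = float(np.max(np.abs(lhs - gauss_scaled(x, np.sqrt(s + t)))))

# Part (iv): both factors vanish on (-inf, 0], so a one-sided grid suffices.
y = np.arange(0.0, 40.0, h)
alpha, beta = 2.0, 3.0
conv = np.convolve(gamma_density(alpha, y), gamma_density(beta, y), mode="full")[: y.size] * h
gamma_err = float(np.max(np.abs(conv - gamma_density(alpha + beta, y))))

print(gauss_err, gamma_err)
```

Both printed discrepancies should be small compared with the size of the functions involved; they reflect only the quadrature and truncation of the grids.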

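6.3.22 Exercise: Show that if $\mu$ is a finite measure on $\mathbb{R}^N$ and $p \in [1,\infty]$, then for all $f \in C_c(\mathbb{R}^N)$ the function $f * \mu$ given by

$f * \mu(x) = \int_{\mathbb{R}^N} f(x-y)\,\mu(dy), \quad x \in \mathbb{R}^N,$

is continuous and satisfies

(6.3.23) $\|f * \mu\|_{L^p(\mathbb{R}^N)} \le \mu(\mathbb{R}^N)\,\|f\|_{L^p(\mathbb{R}^N)}.$

Next, use (6.3.23) to show that for each $p \in [1,\infty)$ there is a unique continuous map $\mathcal{K}_\mu : L^p(\mathbb{R}^N) \longrightarrow L^p(\mathbb{R}^N)$ such that $\mathcal{K}_\mu f = f * \mu$ for $f \in C_c(\mathbb{R}^N)$. Finally, note that (6.3.23) continues to hold when $f * \mu$ is replaced by $\mathcal{K}_\mu f$, but that $\mathcal{K}_\mu f$ need not be continuous for every $f \in L^p(\mathbb{R}^N)$.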
6.3.24 Exercise: In many applications it is extremely important to have compactly supported approximate identities.

(i) Set

$c_N = \Bigl( \int_{B(0,1)} \exp\Bigl[ -\dfrac{1}{1 - |x|^2} \Bigr]\,dx \Bigr)^{-1}$

and define

$\rho(x) = \begin{cases} c_N \exp\Bigl[ -\dfrac{1}{1 - |x|^2} \Bigr] & \text{if } x \in B(0,1) \\ 0 & \text{if } x \notin B(0,1). \end{cases}$

Show that $\rho \in C^\infty(\mathbb{R}^N)$.

(ii) Use the preceding to show that if $F$ is a closed subset of $\mathbb{R}^N$ and $G$ is an open subset satisfying $\mathrm{dist}(F, G^\complement) > 0$, then there is an $\eta \in C^\infty(\mathbb{R}^N)$ such that $\mathbf{1}_F \le \eta \le \mathbf{1}_G$. Such a function $\eta$ is sometimes called a bump function.

(iii) Show that for each pair $p, q \in [1,\infty)$ and every $f \in L^p(\mathbb{R}^N) \cap L^q(\mathbb{R}^N)$ there exists a sequence $\psi_n \in C_c^\infty(\mathbb{R}^N)$ such that $\psi_n \longrightarrow f$ both in $L^p(\mathbb{R}^N)$ and in $L^q(\mathbb{R}^N)$. In particular, $C_c^\infty(\mathbb{R}^N)$ is dense in $L^p(\mathbb{R}^N)$ for every $p \in [1,\infty)$.
(iv) Let $G$ be an open subset of $\mathbb{R}^N$. When $N \ge 2$, we showed in Theorem 5.4.17 that every $u \in C^2(G)$ which is harmonic on $G$ satisfies the Mean Value Property (5.4.18) for balls $B(x,R)$ whose closures are in $G$. Moreover, as pointed out in the paragraph preceding that theorem, the Mean Value Property is a triviality when $N = 1$. In this exercise, prove the converse of the Mean Value Property. Namely, show that if $u \in C(G)$ satisfies (5.4.18) whenever $\overline{B(x,R)} \subset G$, then $u \in C^\infty(G)$ and $u$ is harmonic on $G$. The proof can be accomplished in two steps. First show that if $B(x,2t) \subseteq G$, then the Mean Value Property implies that (cf. part (i))

and conclude from this that $u \in C^\infty(G)$. Second, show that for any $f \in C^\infty(G)$ and $x \in G$,

(6.3.25) $\Delta f(x) = \lim_{t \searrow 0} \dfrac{2N}{\omega_{N-1}\,t^2} \int_{S^{N-1}} \bigl(f(x + t\omega) - f(x)\bigr)\,\lambda_{S^{N-1}}(d\omega).$

Hint: To prove (6.3.25), expand $f$ in a two place Taylor expansion around $x$ and use the relations in (5.2.5).
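The kernel $\rho$ of part (i) in Exercise 6.3.24 and its rescalings $\rho_t = t^{-N}\rho(t^{-1}\,\cdot\,)$ can be tabulated directly. Here is a minimal sketch for $N = 1$, in which the normalizing constant is estimated by a Riemann sum rather than computed exactly; the grid spacing and the names `rho_unnormalized`, `c_1`, `rho`, and `rho_t` are illustrative assumptions.

```python
import numpy as np

def rho_unnormalized(x):
    """exp(-1 / (1 - x^2)) for |x| < 1, and 0 otherwise (the case N = 1)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    out = np.zeros_like(x)
    inside = np.abs(x) < 1.0
    out[inside] = np.exp(-1.0 / (1.0 - x[inside] ** 2))
    return out

h = 1e-4
grid = np.arange(-1.0, 1.0 + h / 2, h)
# Riemann-sum estimate of the normalizing constant c_N for N = 1.
c_1 = 1.0 / (np.sum(rho_unnormalized(grid)) * h)

def rho(x):
    return c_1 * rho_unnormalized(x)

def rho_t(x, t):
    """Rescaled kernel t^{-1} rho(x / t); it vanishes outside (-t, t)."""
    return rho(np.asarray(x, dtype=float) / t) / t

print(c_1, np.sum(rho_t(np.arange(-2.0, 2.0, h), 0.5)) * h)
```

Since $\rho_t$ is smooth, has integral 1, and is supported in $B(0,t)$, the family $\{\rho_t : t > 0\}$ is a compactly supported approximate identity of the kind the exercise asks for.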

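6.3.26 Exercise: Let $\Gamma \in \mathcal{B}_{\mathbb{R}^N}$ have finite Lebesgue measure and set $u(x) = \mathbf{1}_{-\Gamma} * \mathbf{1}_{\Gamma}(x)$. Show that

$u(x) \le |\Gamma|\,\mathbf{1}_{\Delta}(x), \quad x \in \mathbb{R}^N,$

where $\Delta \equiv \Gamma - \Gamma \equiv \{y - x : x, y \in \Gamma\}$, and that $u(0) = |\Gamma|$. Use these observations, together with Theorem 6.3.7, to give another proof of Lemma 2.1.15.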

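6.3.27 Exercise: Define the $\sigma$-finite measure $\mu$ on $\bigl((0,\infty), \mathcal{B}_{(0,\infty)}\bigr)$ by

$\mu(\Gamma) = \int_\Gamma \dfrac{1}{x}\,dx$ for $\Gamma \in \mathcal{B}_{(0,\infty)}$,

and show that $\mu$ is invariant under the multiplicative group in the sense that

$\int_{(0,\infty)} f(\alpha x)\,\mu(dx) = \int_{(0,\infty)} f(x)\,\mu(dx), \quad \alpha \in (0,\infty),$

and

$\int_{(0,\infty)} f\Bigl(\dfrac{1}{x}\Bigr)\,\mu(dx) = \int_{(0,\infty)} f(x)\,\mu(dx)$

for every $\mathcal{B}_{(0,\infty)}$-measurable $f : (0,\infty) \longrightarrow [0,\infty]$. Next, for $\mathcal{B}_{(0,\infty)}$-measurable, $\mathbb{R}$-valued functions $f$ and $g$, set

$\Lambda_\mu(f,g) = \Bigl\{ x \in (0,\infty) : \int_{(0,\infty)} \Bigl| f\Bigl(\dfrac{x}{y}\Bigr) \Bigr|\,|g(y)|\,\mu(dy) < \infty \Bigr\},$

$f \bullet g(x) = \begin{cases} \int_{(0,\infty)} f\bigl(\frac{x}{y}\bigr)\,g(y)\,\mu(dy) & \text{when } x \in \Lambda_\mu(f,g) \\ 0 & \text{otherwise,} \end{cases}$

and show that $f \bullet g = g \bullet f$. In addition, show that if $p, q \in [1,\infty]$ satisfy $\frac1r \equiv \frac1p + \frac1q - 1 \ge 0$, then

$\|f \bullet g\|_{L^r(\mu)} \le \|f\|_{L^p(\mu)}\,\|g\|_{L^q(\mu)}$

for all $f \in L^p(\mu)$ and $g \in L^q(\mu)$. Finally, use these considerations to prove the following one of G.H. Hardy's many inequalities:

$\Bigl[ \int_{(0,\infty)} \dfrac{1}{x^{1+\alpha}} \Bigl( \int_{(0,x)} \varphi(y)\,dy \Bigr)^p dx \Bigr]^{\frac1p} \le \dfrac{p}{\alpha} \Bigl( \int_{(0,\infty)} \dfrac{\bigl(y\varphi(y)\bigr)^p}{y^{1+\alpha}}\,dy \Bigr)^{\frac1p}$

for all $\alpha \in (0,\infty)$, $p \in [1,\infty)$, and non-negative, $\mathcal{B}_{(0,\infty)}$-measurable $\varphi$.

Hint: To prove everything except Hardy's inequality, simply repeat the argument used in the proof of Young's Inequality. To prove Hardy's result, take

$f(x) = \Bigl(\dfrac{1}{x}\Bigr)^{\frac{\alpha}{p}} \mathbf{1}_{[1,\infty)}(x)$ and $g(x) = x^{1-\frac{\alpha}{p}}\,\varphi(x),$

and use $\|f \bullet g\|_{L^p(\mu)} \le \|f\|_{L^1(\mu)}\,\|g\|_{L^p(\mu)}$.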

Chapter VII
A Little Abstract Theory

7.1 An Existence Theorem.
In Chapter II we constructed Lebesgue's measure on $\mathbb{R}^N$, and in ensuing chapters we saw how to construct various other measures from a given measure. However, as yet, we have not discussed any general procedure for the construction of measures ab initio; it is the purpose of the present section to provide such a procedure.
The basic idea behind what we will be doing appears to be due to F. Riesz
and entails the reversal, in some sense, of the process by which we went in
Chapter III from the existence of a measure to the existence of integrals. That
is, we will suppose that we have at hand an integral and will attempt to show
that it must have come from a measure. Thus, we must first describe what we
mean by an integral.
Let $E$ be a non-empty set. We will say that a subset $L$ of the functions $f : E \longrightarrow \overline{\mathbb{R}}$ is a lattice if $f \wedge g$ and $f \vee g$ are both elements of $L$ whenever $f$ and $g$ are. Given a lattice $L$ of $\mathbb{R}$-valued functions, we will say that $L$ is a vector lattice if the constant function 0 is an element of $L$ and $L$ is a vector space over $\mathbb{R}$. Note that if $L$ is a vector space of $\mathbb{R}$-valued functions on $E$, then it is a vector lattice if and only if $f^+ \equiv f \vee 0 \in L$ whenever $f \in L$. Next, given a vector lattice $L$, we will say that the mapping $I : L \longrightarrow \mathbb{R}$ is an integral on $L$ if
(a) $I$ is linear,
(b) $I$ is non-negative in the sense that $I(f) \ge 0$ for every non-negative $f \in L$,
(c) $I(f_n) \searrow 0$ whenever $\{f_n\}_1^\infty \subseteq L$ is a non-increasing sequence which tends (point-wise) to 0.
Finally, we will say that the triple $\mathcal{I} = (E, L, I)$ is an integration theory if $L$ is a vector lattice of functions $f : E \longrightarrow \mathbb{R}$ and $I$ is an integral on $L$.
7.1.1 Examples: Here are three situations to which the preceding notions apply.
(i) The basic model on which the preceding definitions are based is the one which comes from the integration theory for a measure space $(E, \mathcal{B}, \mu)$. Indeed, in that case, $L = L^1(\mu)$ and $I(f) = \int f\,d\mu$.
(ii) A second basic source of integration theories is the one which comes from finitely additive functions on an algebra. That is, let $\mathcal{A}$ be an algebra of subsets of $E$ and denote by $L(\mathcal{A})$ the space of simple functions $f : E \longrightarrow \mathbb{R}$ with the property that $\{f = a\} \in \mathcal{A}$ for every $a \in \mathbb{R}$. It is then an easy matter to check that $L(\mathcal{A})$ is a vector lattice. Now let $\mu : \mathcal{A} \longrightarrow [0,\infty)$ be finitely additive in the sense that

$\mu(\Gamma_1 \cup \Gamma_2) = \mu(\Gamma_1) + \mu(\Gamma_2)$ for disjoint $\Gamma_1, \Gamma_2 \in \mathcal{A}$.

Note that, since $\mu(\emptyset) = \mu(\emptyset \cup \emptyset) = 2\mu(\emptyset)$, $\mu(\emptyset)$ must be 0. Also, by proceeding in precisely the same way as we did (via Lemma 3.2.3) in the proof of Lemma 3.2.4 and then in the proof of Lemma 3.2.11, one can show that

$f \in L(\mathcal{A}) \longmapsto I(f) \equiv \sum_{a \in \mathrm{Range}(f)} a\,\mu\bigl(\{f = a\}\bigr)$

is linear and non-negative. Finally, observe that $I$ cannot be an integral unless $\mu$ has the property that

(7.1.2) $\mu(\Gamma_n) \searrow 0$ whenever $\{\Gamma_n\}_1^\infty \subseteq \mathcal{A}$ decreases to $\emptyset$.

On the other hand, if (7.1.2) holds and $\{f_n\}_1^\infty \subseteq L(\mathcal{A})$ is a non-increasing sequence which tends point-wise to 0, then for each $\epsilon > 0$,

$\varlimsup_{n \to \infty} I(f_n) \le \epsilon\,\mu(E) + \|f_1\|_u \lim_{n \to \infty} \mu\bigl(\{f_n > \epsilon\}\bigr) = \epsilon\,\mu(E).$

Thus, in this setting, (7.1.2) is equivalent to $I$ being an integral.

(iii) A third important example of an integration theory is provided by the following abstraction of Riemann's theory. Namely, let $E$ be a compact topological space, and note that $C(E;\mathbb{R})$ is a vector lattice. Next, suppose that $I : C(E;\mathbb{R}) \longrightarrow \mathbb{R}$ is a linear map which is non-negative. It is then clear that $|I(f)| \le C\,\|f\|_{u,E}$, $f \in C(E;\mathbb{R})$, where $C = I(\mathbf{1})$ and $\|f\|_{u,E} \equiv \sup_{x \in E} |f(x)|$ is the uniform norm of $f$ on $E$. In particular, this means that $|I(f_n) - I(f)| \le C\,\|f_n - f\|_{u,E} \longrightarrow 0$ if $f_n \longrightarrow f$ uniformly. Thus, to see that $I$ is an integral, all that we have to do is use Dini's Lemma (cf. Lemma 7.1.23 below) which says that $f_n \searrow 0$ uniformly on $E$ if $\{f_n\}_1^\infty \subseteq C(E;\mathbb{R})$ decreases point-wise to 0.
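The map $f \longmapsto \sum_{a \in \mathrm{Range}(f)} a\,\mu(\{f = a\})$ from example (ii) can be made concrete when $E$ is a finite set and $\mathcal{A}$ is its power set. The sketch below is a toy model under exactly those assumptions: the point masses in `mu_atoms`, the sample functions, and the helper name `integral_simple` are all hypothetical.

```python
from collections import defaultdict

def integral_simple(f, mu_atoms):
    """I(f) = sum over a in Range(f) of a * mu({f = a}).

    `mu_atoms` maps each point of a finite set E to its mass, so that
    mu(Gamma) is the sum of the masses of the points in Gamma.
    """
    level_mass = defaultdict(float)
    for point, mass in mu_atoms.items():
        level_mass[f(point)] += mass        # accumulates mu({f = a}) for a = f(point)
    return sum(a * m for a, m in level_mass.items())

mu_atoms = {0: 0.5, 1: 0.25, 2: 0.25}       # hypothetical finitely additive mu on E = {0, 1, 2}
f = lambda e: 1.0 if e < 2 else 3.0
g = lambda e: float(e)

print(integral_simple(f, mu_atoms))                              # I(f)
print(integral_simple(lambda e: 2.0 * f(e) + g(e), mu_atoms))    # I(2f + g)
print(2.0 * integral_simple(f, mu_atoms) + integral_simple(g, mu_atoms))
```

The last two printed values agree, which is the linearity asserted in the text; on a finite set condition (7.1.2) holds automatically, because a sequence of subsets decreasing to $\emptyset$ is eventually empty.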
Our main goal will be to show that, at least when $\mathbf{1} \in L$, every integration theory is the sub-theory of the sort of theory described in (i) above. Thus, we must learn how to extract the measure $\mu$ from the integral. At least in case (ii) above, it is clear how one might begin such a procedure. Namely, $\mathcal{A} = \{\Gamma \subseteq E : \mathbf{1}_\Gamma \in L(\mathcal{A})\}$ and $\mu(\Gamma) = I(\mathbf{1}_\Gamma)$ for $\Gamma \in \mathcal{A}$. Hence, what we are attempting to do in this case is tantamount to showing that $\mu$ can be extended as a measure to the $\sigma$-algebra $\sigma(\mathcal{A})$ generated by $\mathcal{A}$. On the other hand, it is not so immediately clear where to start looking for the measure $\mu$ in case (iii); the procedure which got us started in case (ii) does not work here since there will seldom be many $\Gamma \subseteq E$ for which $\mathbf{1}_\Gamma \in C(E;\mathbb{R})$. Thus, in any case, we must learn first how to extend $I$ to a larger class of functions $f : E \longrightarrow \mathbb{R}$ and only then look for $\mu$.
Our extension procedure has two steps, the first of which is nothing but a rerun of what we did in Section 3.2, and the second one is a minor variant on what we did in Section 2.1.
7.1.3 Lemma. Let $(E, L, I)$ be an integration theory, and define $L_u$ to be the class of $f : E \longrightarrow (-\infty,\infty]$ which can be written as the point-wise limit of a non-decreasing sequence $\{\varphi_n\}_1^\infty \subseteq L$. Then $L_u$ is a lattice which is closed under non-negative linear operations and non-decreasing sequential limits (i.e., $\{f_n\}_1^\infty \subseteq L_u$ and $f_n \nearrow f$ implies that $f \in L_u$). Moreover, $I$ admits a unique extension to $L_u$ in such a way that $I(f_n) \nearrow I(f)$ whenever $f$ is the limit of a non-decreasing $\{f_n\}_1^\infty \subseteq L_u$. In particular, for all $f, g \in L_u$, $-\infty < I(f) \le I(g)$ if $f \le g$ and $I(\alpha f + \beta g) = \alpha I(f) + \beta I(g)$ for all $\alpha, \beta \in [0,\infty)$.
PROOF: The closedness properties of $L_u$ are obvious. Moreover, given that an extension of $I$ with the stated properties exists, it is clear that that extension is unique, monotone, and linear under non-negative linear operations.
Just as in the development in Section 3.2 which eventually led to The Monotone Convergence Theorem, the proof (cf. Lemma 3.2.6) that $I$ extends to $L_u$ is simply a matter of checking that the desired extension of $I$ is consistent. Thus, what we must show is that when $\psi \in L$ and $\{\varphi_n\}_1^\infty \subseteq L$ is a non-decreasing sequence with the property that $\psi \le \lim_{n \to \infty} \varphi_n$ point-wise, then $I(\psi) \le \lim_{n \to \infty} I(\varphi_n)$. To this end, note that $\varphi_n = \psi - (\psi - \varphi_n) \ge \psi - (\psi - \varphi_n)^+$, $(\psi - \varphi_n)^+ \searrow 0$, and therefore that

$\lim_{n \to \infty} I(\varphi_n) \ge I(\psi) - \lim_{n \to \infty} I\bigl((\psi - \varphi_n)^+\bigr) = I(\psi).$

As we said before, once one knows that $I$ is consistently defined on $L_u$, the rest of the proof differs in no way from the proof of The Monotone Convergence Theorem (cf. Theorem 3.3.2). □
Lemma 7.1.3 gives the first step in the extension of $I$. What it provides is a rich class of functions which will be used to play the role that open sets played in our construction of Lebesgue's measure. Thus, given any $f : E \longrightarrow \overline{\mathbb{R}}$, we define

(7.1.4) $\overline{I}(f) = \inf\bigl\{ I(\varphi) : \varphi \in L_u \text{ and } f \le \varphi \bigr\}.$

(We use the convention that the infimum over the empty set is $+\infty$.) Clearly $\overline{I}(f)$ is the analog here of the outer measure $\Gamma \longmapsto |\Gamma|_e$ in Section 2.1. At the same time as we consider $\overline{I}$, it will be convenient to have

(7.1.5) $\underline{I}(f) = \sup\bigl\{ -I(\varphi) : \varphi \in L_u \text{ and } -\varphi \le f \bigr\};$

the analog of which in Section 2.1 would have been the interior measure

$\Gamma \longmapsto |\Gamma|_i \equiv \sup\bigl\{ |F| : F \text{ is closed and } F \subseteq \Gamma \bigr\}.$

(In keeping with our convention about the infimum, we take the supremum over the empty set to be $-\infty$.)
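7.1.6 Lemma. For any $\overline{\mathbb{R}}$-valued function $f$ on $E$

(7.1.7) $\underline{I}(f) \le \overline{I}(f),$

and

(7.1.8) $\underline{I}(\alpha f) = \alpha\,\underline{I}(f)$ and $\overline{I}(\alpha f) = \alpha\,\overline{I}(f)$ if $\alpha \in [0,\infty)$;
$\underline{I}(\alpha f) = \alpha\,\overline{I}(f)$ and $\overline{I}(\alpha f) = \alpha\,\underline{I}(f)$ if $\alpha \in (-\infty,0]$.

Moreover, if $(f,g) : E \longrightarrow \overline{\mathbb{R}}^2$, then

(7.1.9) $f \le g \implies \underline{I}(f) \le \underline{I}(g)$ and $\overline{I}(f) \le \overline{I}(g)$,

and, when $(f,g)$ takes values in $\widehat{\mathbb{R}^2}$ (cf. Section 3.2),

(7.1.10) $\bigl(\overline{I}(f), \overline{I}(g)\bigr) \in \widehat{\mathbb{R}^2} \implies \overline{I}(f+g) \le \overline{I}(f) + \overline{I}(g)$;
$\bigl(\underline{I}(f), \underline{I}(g)\bigr) \in \widehat{\mathbb{R}^2} \implies \underline{I}(f+g) \ge \underline{I}(f) + \underline{I}(g)$.

Finally,

(7.1.11) $f \in L_u \implies \underline{I}(f) = I(f) = \overline{I}(f).$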

PROOF: To prove (7.1.7), note first that it suffices to treat the case in which $\underline{I}(f) > -\infty$ and $\overline{I}(f) < \infty$ and then that, for any $(\varphi, \psi) \in L_u^2$ with $-\varphi \le f \le \psi$, $\varphi + \psi \ge 0$ and therefore $I(\varphi) + I(\psi) = I(\varphi + \psi) \ge 0$.
Both (7.1.8) and (7.1.9) are obvious, and, because of (7.1.8), it suffices to prove the first line in (7.1.10). Moreover, when $\overline{I}(f) \wedge \overline{I}(g) > -\infty$, the required result is easy. On the other hand, if $\overline{I}(f) = -\infty$ and $\overline{I}(g) < \infty$, then we can choose $\{\varphi_n\}_1^\infty \cup \{\psi\} \subseteq L_u$ so that $f \le \varphi_n$ and $I(\varphi_n) \le -n$ for each $n \in \mathbb{Z}^+$, $g \le \psi$, and $I(\psi) < \infty$. In particular, $f + g \le \varphi_n + \psi$ for all $n \in \mathbb{Z}^+$, and so $\overline{I}(f+g) \le \lim_{n \to \infty} I(\varphi_n) + I(\psi) = -\infty$.
Finally, suppose that $f \in L_u$. Obviously, $\overline{I}(f) \le I(f)$. At the same time, if $\{\varphi_n\}_1^\infty \subseteq L$ is chosen so that $\varphi_n \nearrow f$, then (because $-\varphi_n \in L \subseteq L_u$ and $-(-\varphi_n) = \varphi_n \le f$ for each $n \in \mathbb{Z}^+$)

$\underline{I}(f) \ge \lim_{n \to \infty} -I(-\varphi_n) = \lim_{n \to \infty} I(\varphi_n) = I(f).$ □
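From now on, we will use $\mathfrak{M}(I)$ to denote the class of those $f : E \longrightarrow \overline{\mathbb{R}}$ for which $\overline{I}(f) = \underline{I}(f)$, and we define $\tilde{I} : \mathfrak{M}(I) \longrightarrow \overline{\mathbb{R}}$ so that $\tilde{I}(f) = \overline{I}(f) = \underline{I}(f)$. Obviously (cf. (7.1.8))

(7.1.12) $f \in \mathfrak{M}(I) \implies \alpha f \in \mathfrak{M}(I)$ and $\tilde{I}(\alpha f) = \alpha\,\tilde{I}(f)$, $\alpha \in \mathbb{R}$.

Finally, let $L^1(I)$ denote the class of $\mathbb{R}$-valued $f \in \mathfrak{M}(I)$ with $\tilde{I}(f) \in \mathbb{R}$.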
7.1.13 Lemma. If $f : E \longrightarrow \mathbb{R}$, then $f \in L^1(I)$ if and only if, for each $\epsilon > 0$, there exist $\varphi, \psi \in L_u$ such that $-\varphi \le f \le \psi$ and $I(\varphi) + I(\psi) < \epsilon$. In particular, $f \in L^1(I) \implies f^+ \in L^1(I)$. Moreover, $L^1(I)$ is a vector space and $\tilde{I}$ is linear on $L^1(I)$. Finally, if $\{f_n\}_1^\infty \subseteq L^1(I)$ and $0 \le f_n \nearrow f$, then $f \in \mathfrak{M}(I)$ and $0 \le \tilde{I}(f_n) \nearrow \tilde{I}(f)$.
PROOF: First suppose that $f \in L^1(I)$. Given $\epsilon > 0$, there exists $(\varphi, \psi) \in L_u^2$ such that $-\varphi \le f \le \psi$, $\underline{I}(f) < -I(\varphi) + \frac{\epsilon}{2}$, and $I(\psi) < \overline{I}(f) + \frac{\epsilon}{2}$; from which it is clear that $I(\varphi) + I(\psi) < \epsilon$. Conversely, suppose that $f : E \longrightarrow \mathbb{R}$ and that, for some $\epsilon \in (0,\infty)$, there exists $(\varphi, \psi) \in L_u^2$ such that $-\varphi \le f \le \psi$ and $I(\varphi) + I(\psi) < \epsilon$. Because $I(\varphi) \wedge I(\psi) > -\infty$, $-\infty < I(\psi) < \epsilon - I(\varphi) < \infty$ and $-\infty < I(\varphi) < \epsilon - I(\psi) < \infty$. In addition, $-I(\varphi) \le \underline{I}(f) \le \overline{I}(f) \le I(\psi)$. Hence, not only are both $\underline{I}(f)$ and $\overline{I}(f)$ in $\mathbb{R}$, but also $\overline{I}(f) - \underline{I}(f) < \epsilon$, which completes the proof of the first assertion. Next, suppose that $f \in L^1(I)$. To prove that $f^+ \in L^1(I)$, let $\epsilon \in (0,\infty)$ be given, and choose $(\varphi, \psi) \in L_u^2$ so that $-\varphi \le f \le \psi$ and $I(\varphi) + I(\psi) < \epsilon$. Next, note that $-\varphi^- \equiv \varphi \wedge 0 \in L_u$, $\psi^+ \equiv \psi \vee 0 \in L_u$, and that $\varphi^- \le f^+ \le \psi^+$. Moreover, because $\varphi + \psi \ge 0$, it is easy to see that $-\varphi^- + \psi^+ \le \varphi + \psi$, and therefore we know that

$I(-\varphi^-) + I(\psi^+) \le I(\varphi) + I(\psi) < \epsilon.$

To see that $L^1(I)$ is a vector space and that $\tilde{I}$ is linear there, simply apply (7.1.8) and (7.1.10). Finally, let $\{f_n\}_1^\infty$ be a non-decreasing sequence of non-negative elements of $L^1(I)$, and set $f = \lim_{n \to \infty} f_n$. Obviously, $\lim_{n \to \infty} \tilde{I}(f_n) \le \underline{I}(f) \le \overline{I}(f)$. Thus, all that we have to do is prove that $\overline{I}(f) \le \lim_{n \to \infty} \tilde{I}(f_n)$. To this end, set $h_1 = f_1$, $h_n = f_n - f_{n-1}$ for $n \ge 2$, and note that each $h_n$ is a non-negative element of $L^1(I)$. Next, given $\epsilon > 0$, choose, for each $m \in \mathbb{Z}^+$, $\psi_m \in L_u$ so that $h_m \le \psi_m$ and $I(\psi_m) \le \tilde{I}(h_m) + 2^{-m}\epsilon$. Clearly, $\psi \equiv \sum_1^\infty \psi_m \in L_u$ and $f \le \psi$. Thus, $\overline{I}(f) \le I(\psi)$. Moreover, by Lemma 7.1.3 and the linearity of $\tilde{I}$ on $L^1(I)$,

$I(\psi) = \lim_{n \to \infty} I\Bigl( \sum_1^n \psi_m \Bigr) = \lim_{n \to \infty} \sum_1^n I(\psi_m) \le \lim_{n \to \infty} \sum_1^n \tilde{I}(h_m) + \epsilon = \lim_{n \to \infty} \tilde{I}(f_n) + \epsilon.$ □

7.1.14 Theorem (Daniell). Let $\mathcal{I} = (E, L, I)$ be an integration theory. Then $\tilde{\mathcal{I}} = (E, L^1(I), \tilde{I})$ is again an integration theory, $L \subseteq L^1(I)$, and $\tilde{I}$ agrees with $I$ on $L$. Moreover, if $\{f_n\}_1^\infty \subseteq L^1(I)$ is non-decreasing and $f_n \nearrow f$, then $f \in L^1(I)$ if and only if $\sup_n f_n(x) < \infty$ for each $x \in E$ and $\sup_n \tilde{I}(f_n) < \infty$, in which case $\tilde{I}(f_n) \nearrow \tilde{I}(f)$.
PROOF: In order to prove the first assertion, note that, by Lemma 7.1.13, $L^1(I)$ is a vector lattice and $\tilde{I}$ is linear there. Moreover, by (7.1.11), $L \subseteq L^1(I)$ and $\tilde{I} \restriction L = I \restriction L$; and, by (7.1.9), $\tilde{I}(f) \ge \tilde{I}(0) = 0$ if $f \ge 0$. Finally, if $\{f_n\}_1^\infty \subseteq L^1(I)$ and $f_n \nearrow f$, then, by the last part of Lemma 7.1.13 applied to $\{f_n - f_1\}_1^\infty$, $f - f_1 \in \mathfrak{M}(I)$,

$\tilde{I}(f_n) = \tilde{I}(f_1) + \tilde{I}(f_n - f_1) \nearrow \tilde{I}(f_1) + \tilde{I}(f - f_1).$

But, by (7.1.10), $f = f_1 + (f - f_1) \in \mathfrak{M}(I)$ and $\tilde{I}(f) = \tilde{I}(f_1) + \tilde{I}(f - f_1)$. Hence, the last assertion is now proved. In addition, if $\{f_n\}_1^\infty \subseteq L^1(I)$ and $f_n \searrow 0$, then $\tilde{I}(f_n) \searrow 0$ follows from the preceding applied to $\{-f_n\}_1^\infty$, which completes the proof that $\tilde{\mathcal{I}}$ is an integration theory. □
We are now ready to return to the problem, raised in the discussion following Examples 7.1.1, of identifying the measure underlying a given integration theory $(E, L, I)$. For this purpose, we introduce the notation $\sigma(L)$ to denote the smallest $\sigma$-algebra over $E$ with respect to which all of the functions in the vector lattice $L$ are measurable. Obviously, $\sigma(L)$ is generated by the sets $\{f > a\}$ as $f$ runs over $L$ and $a$ runs over $\mathbb{R}$.
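7.1.15 Theorem (Stone). Let $(E, L, I)$ be an integration theory, and assume that $\mathbf{1} \in L$. Then

(7.1.16) $\sigma\bigl(L^1(I)\bigr) = \bigl\{ \Gamma \subseteq E : \mathbf{1}_\Gamma \in L^1(I) \bigr\},$

the mapping

(7.1.17) $\Gamma \in \sigma\bigl(L^1(I)\bigr) \longmapsto \mu_I(\Gamma) \equiv \tilde{I}(\mathbf{1}_\Gamma)$

is a finite measure on $\bigl(E, \sigma(L^1(I))\bigr)$, $\sigma\bigl(L^1(I)\bigr)$ is the completion of $\sigma(L)$ with respect to $\mu_I$, $L^1(\mu_I) = L^1(I)$, and

(7.1.18) $\tilde{I}(f) = \int_E f\,d\mu_I, \quad f \in L^1(I).$

Finally, if $(E, \mathcal{B}, \nu)$ is any finite measure space with the properties that $L \subseteq L^1(\nu)$ and $I(f) = \int_E f\,d\nu$ for every $f \in L$, then $\sigma(L) \subseteq \mathcal{B}$ and $\nu$ coincides with $\mu_I$ on $\sigma(L)$.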
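PROOF: Let $\mathcal{H}$ denote the collection described on the right hand side of equation (7.1.16). Using Theorem 7.1.14, one can easily show that $\mathcal{H}$ is a $\sigma$-algebra over $E$ and that

$\Gamma \in \mathcal{H} \longmapsto \mu_I(\Gamma) \equiv \tilde{I}(\mathbf{1}_\Gamma) \in [0,\infty)$

defines a finite measure on $(E, \mathcal{H})$. Our first goal is to prove that

(7.1.19) $L^1(I) = L^1(E, \mathcal{H}, \mu_I)$ and $\tilde{I}(f) = \int_E f\,d\mu_I$ for $f \in L^1(I)$.

To this end, for given $f : E \longrightarrow \mathbb{R}$ and $a \in \mathbb{R}$, consider the functions

$g_n \equiv \bigl[ n(f - f \wedge a) \bigr] \wedge 1, \quad n \in \mathbb{Z}^+.$

If $f \in L^1(I)$, then each $g_n$ is also an element of $L^1(I)$, $g_n \nearrow \mathbf{1}_{\{f > a\}}$ as $n \to \infty$, and therefore $\mathbf{1}_{\{f > a\}} \in L^1(I)$. Thus, we see that every $f \in L^1(I)$ is $\mathcal{H}$-measurable. Next, for given $f : E \longrightarrow [0,\infty)$ define

If $f \in L^1(I) \cup L^1(E, \mathcal{H}, \mu_I)$, then (cf. the preceding and use linearity) $f_n \in L^1(I) \cap L^1(\mu_I)$, $f_n \nearrow f$, and so $f \in L^1(I) \cap L^1(E, \mathcal{H}, \mu_I)$ and $\tilde{I}(f) = \int_E f\,d\mu_I$. Hence, we have now proved (7.1.19).
Our next goal is to show that

(7.1.20) $\overline{\sigma(L)}^{\mu_I} = \mathcal{H} = \sigma\bigl(L^1(I)\bigr).$

Since $L \subseteq L^1(I)$ and every element of $L^1(I)$ is $\mathcal{H}$-measurable, what we know so far is that

$\sigma(L) \subseteq \sigma\bigl(L^1(I)\bigr) \subseteq \mathcal{H}.$

Thus, to prove (7.1.20), all we need to do is show that

(7.1.21) $\overline{\mathcal{H}}^{\mu_I} \subseteq \sigma\bigl(L^1(I)\bigr) \cap \overline{\sigma(L)}^{\mu_I}.$

But if $\Gamma \in \overline{\mathcal{H}}^{\mu_I}$, then there exist $A, B \in \mathcal{H}$ such that $A \subseteq \Gamma \subseteq B$ and $\tilde{I}(\mathbf{1}_A) = \mu_I(A) = \mu_I(B) = \tilde{I}(\mathbf{1}_B)$. Hence, we can choose sequences $\{\varphi_n\}_1^\infty$ and $\{\psi_n\}_1^\infty$ from $L_u \cap L^1(I)$ so that $-\varphi_n \le \mathbf{1}_A \le \mathbf{1}_\Gamma \le \mathbf{1}_B \le \psi_n$, $-I(\varphi_n) \nearrow \tilde{I}(\mathbf{1}_A)$, and $I(\psi_n) \searrow \tilde{I}(\mathbf{1}_B)$. Further, after replacing $\varphi_n$ and $\psi_n$ by $\varphi_1 \wedge \cdots \wedge \varphi_n$ and $\psi_1 \wedge \cdots \wedge \psi_n$, we may and will assume that each of these sequences is non-increasing. Next, take $\varphi = \lim_{n \to \infty} \varphi_n$ and $\psi = \lim_{n \to \infty} \psi_n$. Note that $\varphi$ and $\psi$ are in $L^1(I)$, $-\varphi \le \mathbf{1}_\Gamma \le \psi$, and $-\tilde{I}(\varphi) = \tilde{I}(\psi)$. In particular, since $-\tilde{I}(\varphi) \le \underline{I}(\mathbf{1}_\Gamma) \le \overline{I}(\mathbf{1}_\Gamma) \le \tilde{I}(\psi)$, this means that $\mathbf{1}_\Gamma \in L^1(I)$ and therefore that $\Gamma = \{\mathbf{1}_\Gamma > 0\} \in \sigma\bigl(L^1(I)\bigr)$. To prove that $\Gamma \in \overline{\sigma(L)}^{\mu_I}$, first note that every element of $L_u$ is $\sigma(L)$-measurable. Hence, both $\varphi$ and $\psi$ are $\sigma(L)$-measurable, and so both the sets $C = \{\varphi < 0\}$ and $D = \{\psi \ge 1\}$ are elements of $\sigma(L)$. Finally, from $-\varphi \le \mathbf{1}_\Gamma \le \psi$ and $-\tilde{I}(\varphi) = \tilde{I}(\psi)$, it is easy to check that $-\varphi \le \mathbf{1}_C \le \mathbf{1}_\Gamma \le \mathbf{1}_D \le \psi$ and therefore that $\mu_I(D \setminus C) = \mu_I(D) - \mu_I(C) \le \tilde{I}(\psi) + \tilde{I}(\varphi) = 0$, which completes the proof that $\Gamma \in \overline{\sigma(L)}^{\mu_I}$. Hence, we have now proved (7.1.21) and therefore (7.1.20).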
We have now completed the proof of everything except the concluding assertion of uniqueness. But if $L \subseteq L^1(\nu)$, then obviously $\sigma(L) \subseteq \mathcal{B}$. Moreover, if $\int f\,d\nu = I(f)$ for all $f \in L$, then we can prove that $\nu \restriction \sigma(L) = \mu_I \restriction \sigma(L)$ as follows. Namely, it is clear that $\sigma(L)$ is generated by the $\pi$-system of sets $\Gamma$ of the form

$\Gamma = \{ f_1 > a_1, \dots, f_\ell > a_\ell \},$

where $\ell \in \mathbb{Z}^+$, $\{a_1, \dots, a_\ell\} \subseteq \mathbb{R}$, and $\{f_1, \dots, f_\ell\} \subseteq L$. Thus, by Exercise 3.1.8, we need only check that $\nu$ agrees with $\mu_I$ on such sets $\Gamma$. To this end, define

$g_n = \Bigl[ n \min_{1 \le k \le \ell} (f_k - f_k \wedge a_k) \Bigr] \wedge 1,$

note that $\{g_n\}_1^\infty \subseteq L$ and $0 \le g_n \nearrow \mathbf{1}_\Gamma$, and conclude that

$\nu(\Gamma) = \lim_{n \to \infty} \int g_n\,d\nu = \lim_{n \to \infty} \int g_n\,d\mu_I = \mu_I(\Gamma).$ □
With the preceding result, we are now ready to handle the situations described in (ii) and (iii) of Examples 7.1.1.
7.1.22 Corollary (Carathéodory Extension). Let $\mathcal{A}$ be an algebra of subsets of $E$ and suppose that $\mu : \mathcal{A} \longrightarrow [0,\infty)$ is a finitely additive function (cf. (ii) in Examples 7.1.1) with the property that (7.1.2) holds. Then there is a unique finite measure $\tilde{\mu}$ on $(E, \sigma(\mathcal{A}))$ with the property that $\tilde{\mu}$ coincides with $\mu$ on $\mathcal{A}$.
PROOF: Define $L(\mathcal{A})$ and $I$ on $L(\mathcal{A})$ as in (ii) of Examples 7.1.1. It is then an easy matter to see that $\sigma(\mathcal{A}) = \sigma\bigl(L(\mathcal{A})\bigr)$. In addition, as was shown in (ii) of Examples 7.1.1, $I$ is an integral on $L(\mathcal{A})$. Hence the desired existence and uniqueness statements follow immediately from Theorem 7.1.15. □
Before we can complete (iii) in Examples 7.1.1, we must first prove the lemma alluded to there.
7.1.23 Lemma (Dini's Lemma). Let $\{f_n\}_1^\infty$ be a non-increasing sequence of non-negative, continuous functions on the topological space $E$. If $f_n \searrow 0$, then $f_n \longrightarrow 0$ uniformly on each compact subset $K \subseteq E$ (i.e., $\|f_n\|_{u,K} \longrightarrow 0$).
PROOF: Without loss in generality, we assume that $E$ itself is compact.
Let $\epsilon > 0$ be given. By assumption, we can find for each $x \in E$ an $n(x) \in \mathbb{Z}^+$ and an open neighborhood $U(x)$ of $x$ such that $f_{n(x)}(y) < \epsilon$ for all $y \in U(x)$. Moreover, by the Heine-Borel Theorem, we can choose a finite set $\{x_1, \dots, x_L\} \subseteq E$ so that $E = \bigcup_1^L U(x_\ell)$. Thus, if $N(\epsilon) = n(x_1) \vee \cdots \vee n(x_L)$, then $f_n < \epsilon$ as long as $n \ge N(\epsilon)$. □
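A concrete instance of Dini's Lemma is the sequence $f_n(x) = x^n(1-x)$ on the compact interval $[0,1]$: each $f_n$ is continuous and non-negative, the sequence is non-increasing in $n$, and it tends point-wise to 0. The sketch below estimates the uniform norms on a grid; the example sequence, the grid, and the helper name `sup_norm` are assumptions made for illustration.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 10001)

def sup_norm(n):
    """Grid estimate of the uniform norm of f_n(x) = x^n (1 - x) on [0, 1]."""
    return float(np.max(x ** n * (1.0 - x)))

sups = [sup_norm(n) for n in (1, 2, 4, 8, 16, 32, 64)]
print(sups)
```

The exact maximum of $f_n$ is attained at $x = n/(n+1)$ and equals $n^n/(n+1)^{n+1}$, so the printed values decrease to 0, as the lemma predicts. By contrast, $x^n$ on $[0,1]$ also decreases point-wise, but its limit is not continuous, and its uniform norm stays equal to 1.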
Given a topological space $E$, let $C_b(E;\mathbb{R})$ denote the space of bounded continuous functions on $E$ and turn $C_b(E;\mathbb{R})$ into a metric space by defining $\|g - f\|_{u,E}$ to be the distance between $f$ and $g$. We will say that $\Lambda : C_b(E;\mathbb{R}) \longrightarrow \mathbb{R}$ is a non-negative linear functional if $\Lambda$ is linear and $\Lambda(f) \ge 0$ for all $f \in C_b(E;[0,\infty))$. Furthermore, if $\Lambda$ is a non-negative linear functional on $C_b(E;\mathbb{R})$, we will say that $\Lambda$ is tight if it has the property that for every $\delta > 0$ there is a compact $K_\delta \subseteq E$ and an $A_\delta \in (0,\infty)$ for which

$|\Lambda(f)| \le A_\delta\,\|f\|_{u,K_\delta} + \delta\,\|f\|_{u,E}$ for all $f \in C_b(E;\mathbb{R})$.

Notice that when $E$ is itself compact, then every non-negative linear functional on $C_b(E;\mathbb{R})$ is tight.
7.1.24 Theorem (Riesz Representation). Let $E$ be a topological space, set $\mathcal B = \sigma\bigl(C_b(E;\mathbb R)\bigr)$, and suppose that $\Lambda : C_b(E;\mathbb R) \to \mathbb R$ is a non-negative linear functional which is tight. Then there is a unique finite measure $\mu$ on $(E,\mathcal B)$ with the property that $\Lambda(f) = \int_E f\,d\mu$, $f \in C_b(E;\mathbb R)$.
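PROOF: Clearly, all that we need to do is show that $\Lambda(f_n) \searrow 0$ whenever $\{f_n\}_1^\infty \subseteq C_b(E;\mathbb R)$ is a non-increasing sequence which tends (pointwise) to $0$. To this end, let $\epsilon > 0$ be given, set $\delta = \frac{\epsilon}{1 + 2\|f_1\|_{u,E}}$, and use Dini's Lemma to choose an $N(\delta) \in \mathbb Z^+$ so that $\|f_n\|_{u,K_\delta} < \frac{\epsilon}{2A_\delta}$ for all $n \ge N(\delta)$, where $K_\delta$ and $A_\delta$ are the quantities appearing in the tightness condition for $\Lambda$. Then, for $n \ge N(\delta)$,

$$|\Lambda(f_n)| \le A_\delta \|f_n\|_{u,K_\delta} + \delta \|f_1\|_{u,E} < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon. \qquad\square$$
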
The importance of Theorem 7.1.24 is hard to miss. Indeed, it seems to say that it is essentially impossible to avoid Lebesgue's theory of integration. On the other hand, it may not be evident how one might use Corollary 7.1.22. For this reason we close this section with a typical and important application of Caratheodory's Theorem. Namely, for each $i$ from a non-empty index set $J$, let $(E_i,\mathcal B_i,\mu_i)$ be a probability space. Given $\emptyset \ne S \subseteq J$, set $E_S = \prod_{i\in S} E_i$, and use $\Pi_S$ to denote the natural projection map from $E \equiv E_J$ onto $E_S$. Finally, let $\mathfrak F$ stand for the set of all non-empty, finite subsets $F$ of $J$, and denote by $\mathcal B_J$ the $\sigma$-algebra over $E$ generated by sets of the form

(7.1.25) $$\Gamma_F \equiv \Pi_F^{-1}\Bigl(\prod_{i\in F}\Gamma_i\Bigr), \quad F \in \mathfrak F \text{ and } \Gamma_i \in \mathcal B_i,\ i \in F.$$

Our goal is to show there is a unique probability measure $\mu \equiv \prod_{i\in J}\mu_i$ on $(E,\mathcal B_J)$ with the property that

(7.1.26) $$\mu(\Gamma_F) = \prod_{i\in F}\mu_i(\Gamma_i)$$

for all choices of $F \in \mathfrak F$ and $\Gamma_i \in \mathcal B_i$, $i \in F$.
We begin by pointing out that uniqueness is clear. Indeed, the generating class described in (7.1.25) is obviously a $\pi$-system, and therefore (cf. Exercise 3.1.8) the condition in (7.1.26) can be satisfied by at most one measure. Secondly, observe that there is no problem when $J$ is finite. In fact, when $J$ has only one element there is nothing to do at all. Moreover, if we know how to handle $J$'s containing $n \in \mathbb Z^+$ elements and $J = \{i_1,\dots,i_{n+1}\}$, then we can take $\prod_{m=1}^{n+1}\mu_{i_m}$ to be the image (cf. Lemma 5.0.1) of (cf. Lemma 4.1.3)

$$\Bigl(\prod_{m=1}^{n}\mu_{i_m}\Bigr) \times \mu_{i_{n+1}}$$

under the mapping

$$\bigl((x_{i_1},\dots,x_{i_n}),\,x_{i_{n+1}}\bigr) \in \Bigl(\prod_{m=1}^{n} E_{i_m}\Bigr) \times E_{i_{n+1}} \longmapsto (x_{i_1},\dots,x_{i_{n+1}}) \in \prod_{m=1}^{n+1} E_{i_m}.$$

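Thus, assume that $J$ is infinite. Given $F \in \mathfrak F$, use $\mu_F$ to denote $\prod_{i\in F}\mu_i$. In order to construct $\mu$, we first introduce the algebra

$$\mathcal A \equiv \bigcup_{F\in\mathfrak F}\Pi_F^{-1}(\mathcal B_F), \quad\text{where } \mathcal B_F \equiv \prod_{i\in F}\mathcal B_i,$$

and note that $\mathcal B_J$ is generated by $\mathcal A$. Next, observe that the map $\mu : \mathcal A \to [0,1]$ given by

$$\mu(A) = \mu_F(\Gamma) \quad\text{if } A = \Pi_F^{-1}(\Gamma) \text{ for some } F \in \mathfrak F \text{ and } \Gamma \in \mathcal B_F$$

is well-defined and finitely additive. To see the first of these, suppose that, for some $\Gamma \in \mathcal B_F$ and $\Gamma' \in \mathcal B_{F'}$, $\Pi_F^{-1}(\Gamma) = \Pi_{F'}^{-1}(\Gamma')$. If $F = F'$, then it is clear that $\Gamma = \Gamma'$ and that there is no problem. On the other hand, if $F \subsetneq F'$, then one has that $\Gamma' = \Gamma \times E_{F'\setminus F}$ and therefore (since the $\mu_i$'s are probability measures) that

$$\mu_{F'}(\Gamma') = \mu_F(\Gamma)\prod_{i\in F'\setminus F}\mu_i(E_i) = \mu_F(\Gamma).$$

The general case reduces to these two by comparing each of $F$ and $F'$ with $F \cup F'$. That is, $\mu$ is well-defined on $\mathcal A$. Moreover, given disjoint $A$ and $A'$ from $\mathcal A$, choose $F \in \mathfrak F$ so that $A, A' \in \Pi_F^{-1}(\mathcal B_F)$, note that $\Gamma = \Pi_F(A)$ is disjoint from $\Gamma' = \Pi_F(A')$, and conclude that

$$\mu(A \cup A') = \mu_F(\Gamma \cup \Gamma') = \mu_F(\Gamma) + \mu_F(\Gamma') = \mu(A) + \mu(A').$$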
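The consistency and finite additivity just verified can be checked by hand on a toy example. The sketch below is only an illustration under assumed data: three two-point probability spaces with hypothetical weights, indexed by the made-up labels `i1`, `i2`, `i3`. It compares $\mu_F(\Gamma)$ with $\mu_{F'}(\Gamma \times E_{F'\setminus F})$ for $F \subsetneq F'$, using exact rational arithmetic.

```python
from fractions import Fraction

# Hypothetical data: three probability spaces E_i = {0, 1} with point weights mu_i.
mu = {
    "i1": {0: Fraction(1, 2), 1: Fraction(1, 2)},
    "i2": {0: Fraction(1, 3), 1: Fraction(2, 3)},
    "i3": {0: Fraction(1, 4), 1: Fraction(3, 4)},
}

def mu_F(F, gamma):
    """mu_F(Gamma), where Gamma is a set of tuples indexed by the ordered index list F."""
    total = Fraction(0)
    for point in gamma:
        weight = Fraction(1)
        for i, x in zip(F, point):
            weight *= mu[i][x]
        total += weight
    return total

F = ["i1", "i2"]
F_prime = ["i1", "i2", "i3"]
gamma = {(0, 1), (1, 1)}                                    # a subset of E_F
gamma_prime = {p + (x,) for p in gamma for x in mu["i3"]}   # Gamma x E_{F' \ F}

small = mu_F(F, gamma)
large = mu_F(F_prime, gamma_prime)
print(small, large)
```

Both numbers agree because the extra factor contributes $\mu_{i_3}(E_{i_3}) = 1$, which is exactly the role played by the probability normalization in the argument above.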
In view of the preceding paragraph and Corollary 7.1.22, all that remains is to show that if $\{A_n\}_1^\infty$ is a non-increasing sequence from $\mathcal A$ and $\bigcap_1^\infty A_n = \emptyset$, then $\mu(A_n) \searrow 0$. Equivalently, what we need to know is that if $\{A_n\}_1^\infty \subseteq \mathcal A$ is non-increasing and, for all $n \in \mathbb Z^+$ and some $\epsilon > 0$, $\mu(A_n) \ge \epsilon$, then $\bigcap_1^\infty A_n \ne \emptyset$. Thus, suppose that such a sequence is given, choose $\{F_n\}_1^\infty \subseteq \mathfrak F$ so that $A_n \in \Pi_{F_n}^{-1}(\mathcal B_{F_n})$ for each $n \in \mathbb Z^+$, and set $S = J \setminus \bigcup_1^\infty F_n$. Without loss in generality, we assume that $F_n \subseteq F_{n+1}$ for all $n$. Under the condition that $\mu(A_n) \ge \epsilon > 0$ for all $n \in \mathbb Z^+$, we must produce a sequence $\{a_m\}_1^\infty$ such that $a_1 \in E_{F_1}$, $a_m \in E_{F_m\setminus F_{m-1}}$ for $m \ge 2$, and $(a_1,\dots,a_n) \in \Pi_{F_n}(A_n)$ for all $n \in \mathbb Z^+$. In fact, given such a sequence $\{a_m\}_1^\infty$, observe that by taking $a \in E$ so that $\Pi_{F_1}(a) = a_1$, $\Pi_{F_m\setminus F_{m-1}}(a) = a_m$ for $m \ge 2$, and (when $S \ne \emptyset$) $\Pi_S(a)$ is an arbitrary element of $E_S$, one gets an element of $\bigcap_1^\infty A_n$.
To produce the $a_m$'s, first choose and fix, for each $i \in J$, a reference point $e_i \in E_i$. Next, for each $n \in \mathbb Z^+$, define $\Phi_n : E_{F_n} \to E$ so that

$$\Phi_n(x_{F_n})_i = \begin{cases} x_i & \text{if } i \in F_n \\ e_i & \text{if } i \in J \setminus F_n, \end{cases}$$

and set $f_n = \mathbf 1_{A_n} \circ \Phi_n$. Obviously,

$$\epsilon \le \mu(A_n) = \int_{E_{F_n}} f_n(x_{F_n})\,\mu_{F_n}(dx_{F_n}).$$

Furthermore, if, for each $m \in \mathbb Z^+$, we define the sequence $\{g_{m,n} : n \ge m\}$ of functions on $E_{F_m}$ so that $g_{m,m}(x_{F_m}) = f_m(x_{F_m})$ and

$$g_{m,n}(x_{F_m}) = \int_{E_{F_n\setminus F_m}} f_n(x_{F_m}, y_{F_n\setminus F_m})\,\mu_{F_n\setminus F_m}(dy_{F_n\setminus F_m})$$

when $n > m$, then $g_{m,n+1} \le g_{m,n}$ and

$$g_{m,n}(x_{F_m}) = \int_{E_{F_{m+1}\setminus F_m}} g_{m+1,n}(x_{F_m}, y_{F_{m+1}\setminus F_m})\,\mu_{F_{m+1}\setminus F_m}(dy_{F_{m+1}\setminus F_m})$$

for all $1 \le m < n$. Hence, $g_m \equiv \lim_{n\to\infty} g_{m,n}$ exists and, by the Monotone Convergence Theorem,

(7.1.27) $$g_m(x_{F_m}) = \int_{E_{F_{m+1}\setminus F_m}} g_{m+1}(x_{F_m}, y_{F_{m+1}\setminus F_m})\,\mu_{F_{m+1}\setminus F_m}(dy_{F_{m+1}\setminus F_m}).$$

Finally, since

$$\int_{E_{F_1}} g_1(x_{F_1})\,\mu_{F_1}(dx_{F_1}) = \lim_{n\to\infty}\int_{E_{F_n}} f_n(x_{F_n})\,\mu_{F_n}(dx_{F_n}) = \lim_{n\to\infty}\mu(A_n) \ge \epsilon,$$

there exists an $a_1 \in E_{F_1}$ such that $g_1(a_1) \ge \epsilon$. In particular (since $g_1 \le f_1$), this means that $a_1 \in \Pi_{F_1}(A_1)$. In addition, from (7.1.27) with $m = 1$ and $x_{F_1} = a_1$, it means that there exists an $a_2 \in E_{F_2\setminus F_1}$ for which $g_2(a_1,a_2) \ge \epsilon$ and therefore (since $g_2 \le f_2$) $(a_1,a_2) \in \Pi_{F_2}(A_2)$. More generally, if $(a_1,\dots,a_m) \in E_{F_m}$ and $g_m(a_1,\dots,a_m) \ge \epsilon$, then (since $g_m \le f_m$) $(a_1,\dots,a_m) \in \Pi_{F_m}(A_m)$ and, by (7.1.27), there exists an $a_{m+1} \in E_{F_{m+1}\setminus F_m}$ such that $g_{m+1}(a_1,\dots,a_{m+1}) \ge \epsilon$. Hence, by induction, we are done and we have proved the following theorem.
7.1.28 Theorem. Let $J$ be an arbitrary index set and, for each $i \in J$, let $(E_i,\mathcal B_i,\mu_i)$ be a probability space. If $E = \prod_{i\in J} E_i$ and $\mathcal B_J$ is the $\sigma$-algebra over $E$ generated by the sets $\Gamma_F$ in (7.1.25), then there is a unique probability measure $\mu$ on $(E,\mathcal B_J)$ satisfying (7.1.26).

The existence result proved in Theorem 7.1.28 plays an important role in probability theory, where it becomes a ubiquitous tool with which to construct infinite families of mutually independent random variables.
7.1.29 Remark: As the preceding discussion demonstrates, the version of Caratheodory's theorem which we have proved here is sufficient to handle the construction of non-trivial measures which are finite. Moreover, by an obvious localization procedure, one can reduce problems involving $\sigma$-finite situations to finite ones. On the other hand, when confronting problems leading to measures which are not $\sigma$-finite, the results proved in this section are insufficient, and one must work harder. For an excellent account of both the methodology and the reasons for handling such problems, see L.C. Evans and R. Gariepy's Measure Theory and Fine Properties of Functions, published in the Studies in Advanced Mathematics series of CRC Press (1991).

Exercises

7.1.30 Exercise: Assume that $E$ is a metric space. Show that the Borel field $\mathcal B_E$ coincides with $\sigma\bigl(C_b(E;\mathbb R)\bigr)$. Next, assume, in addition, that $E$ is locally compact (i.e., every point $x \in E$ has a neighborhood whose closure is compact) and show that $\mathcal B_E = \sigma\bigl(C_c(E;\mathbb R)\bigr)$, where $C_c(E;\mathbb R)$ is the space of $f \in C(E;\mathbb R)$ which vanish off some compact subset of $E$.

7.1.31 Exercise: Let $\psi$ be a bounded, right-continuous, non-decreasing function on $\mathbb R$. Set $\psi(\pm\infty) = \lim_{x\to\pm\infty}\psi(x)$.
(i) For each $\varphi \in C_b(\mathbb R;\mathbb R)$, show that

$$I(\varphi) \equiv \lim_{R\to\infty}\,(R)\!\int_{[-R,R]} \varphi\,d\psi \in \mathbb R$$

exists.
(ii) Show that $\varphi \in C_b(\mathbb R;\mathbb R) \longmapsto I(\varphi) \in \mathbb R$ is a non-negative, linear functional. Further, check that, for each $R > 0$,

$$|I(\varphi)| \le A\,\|\varphi\|_{u,[-R,R]} + \epsilon(R)\,\|\varphi\|_{u,\mathbb R},$$

where

$$A = \psi(\infty) - \psi(-\infty) \quad\text{and}\quad \epsilon(R) = A - \bigl(\psi(R) - \psi(-R)\bigr).$$

(iii) As a consequence of (ii) and Theorem 7.1.24, show that there is a unique measure $\mu_\psi$ on $(\mathbb R,\mathcal B_{\mathbb R})$ with the property that $\mu_\psi\bigl((-\infty,x]\bigr) = \psi(x) - \psi(-\infty)$ for all $x \in \mathbb R$. Notice that the mapping $\psi \longmapsto \mu_\psi$ inverts the map discussed in Theorem 5.4.2.
7.2 Hilbert Space and the Radon-Nikodym Theorem.


In Exercise 6.1.7 we saw evidence that, among the $L^p$-spaces, the space $L^2$ is the most closely related to familiar Euclidean geometry. In the present section, we will expand on this observation and give an application of it.
Throughout, $(E,\mathcal B,\mu)$ will be a measure space. By part (iii) of Theorem 6.2.2, the space $L^2(\mu)$ is a vector space which becomes a complete metric space when we use $\|f - g\|_{L^2(\mu)}$ to measure the distance between $f$ and $g$. In addition, if we define

(7.2.1) $$(f,g)_{L^2(\mu)} = \int_E f g\,d\mu, \quad f, g \in L^2(\mu),$$

then $(f,g)_{L^2(\mu)}$ is bilinear (i.e., it is linear as a function of each of its entries) and (cf. (6.2.9)), for $f \in L^2(\mu)$,

(7.2.2) $$\|f\|_{L^2(\mu)} = (f,f)_{L^2(\mu)}^{1/2} = \sup\bigl\{(f,g)_{L^2(\mu)} : g \in L^2(\mu) \text{ with } \|g\|_{L^2(\mu)} \le 1\bigr\}.$$

Note that $(f,g)_{L^2(\mu)}$ plays the same role for $L^2(\mu)$ that the Euclidean inner product plays in $\mathbb R^N$. That is, by Schwarz's inequality (cf. Exercise 6.1.7),

$$\arccos\left(\frac{(f,g)_{L^2(\mu)}}{\|f\|_{L^2(\mu)}\,\|g\|_{L^2(\mu)}}\right)$$

can be thought of as the angle between $f$ and $g$ in the plane spanned by $f$ and $g$. Thus, we say that $f \in L^2(\mu)$ is orthogonal or perpendicular to $S \subseteq L^2(\mu)$ and write $f \perp S$ if $(f,g)_{L^2(\mu)} = 0$ for every $g \in S$.
A topological vector space with this kind of structure is known as a Hilbert space. A distinguishing feature of Hilbert spaces is the property proved in the following lemma. Indeed, the reader might want to observe that, although we restrict our attention to $L^2$-spaces, the result proved depends only on completeness and the existence of a bilinear map for which (7.2.2) holds; hence, it is a property possessed by all Hilbert spaces.

7.2.3 Lemma. Let $F$ be a closed linear subspace of $L^2(\mu)$. Then, for each $g \in L^2(\mu)$, there is a unique $f \in F$ for which

(7.2.4) $$\|g - f\|_{L^2(\mu)} = \min\bigl\{\|g - \varphi\|_{L^2(\mu)} : \varphi \in F\bigr\}.$$

In fact, the unique $f \in F$ which satisfies (7.2.4) is the unique $f \in F$ such that $(g - f) \perp F$. That is, $f$ is the perpendicular projection of $g$ onto $F$.

PROOF: We first check that $f \in F$ satisfies (7.2.4) if and only if $(g - f) \perp F$. To this end, suppose that $f \in F$ satisfies (7.2.4) and observe that, for any $\psi \in F$, the function

$$t \in \mathbb R \longmapsto \|g - f - t\psi\|_{L^2(\mu)}^2 = \|g - f\|_{L^2(\mu)}^2 - 2t\,(g - f,\psi)_{L^2(\mu)} + t^2\|\psi\|_{L^2(\mu)}^2$$

has a minimum at $t = 0$. Hence, by the first derivative test, we see that $(g - f,\psi)_{L^2(\mu)} = 0$ for every $\psi \in F$. Conversely, if $f \in F$ and $(g - f) \perp F$, then, for any $\psi \in F$,

$$\|g - \psi\|_{L^2(\mu)}^2 = \|g - f\|_{L^2(\mu)}^2 + 2\,(g - f, f - \psi)_{L^2(\mu)} + \|f - \psi\|_{L^2(\mu)}^2 = \|g - f\|_{L^2(\mu)}^2 + \|f - \psi\|_{L^2(\mu)}^2 \ge \|g - f\|_{L^2(\mu)}^2.$$

Thus, we have now proved the equivalence of the two characterizations of $f$; and, as a consequence of the second characterization, uniqueness is easy. Indeed, if $f_1, f_2 \in F$ and $(g - f_i) \perp F$, $i \in \{1,2\}$, then $(f_2 - f_1) \perp F$ and therefore $\|f_2 - f_1\|_{L^2(\mu)}^2 = 0$.

In view of the preceding, it remains only to prove that there is an $f \in F$ for which (7.2.4) holds. To this end, choose $\{f_n\}_1^\infty \subseteq F$ so that

$$\|g - f_n\|_{L^2(\mu)} \longrightarrow \alpha \equiv \inf\bigl\{\|g - \varphi\|_{L^2(\mu)} : \varphi \in F\bigr\}.$$

Since $L^2(\mu)$ and therefore $F$ are complete, all that we have to do is show that $\{f_n\}_1^\infty$ is Cauchy convergent. In order to do this, note that for any $\varphi, \psi \in L^2(\mu)$,

(7.2.5) $$\|\varphi + \psi\|_{L^2(\mu)}^2 + \|\varphi - \psi\|_{L^2(\mu)}^2 = 2\|\varphi\|_{L^2(\mu)}^2 + 2\|\psi\|_{L^2(\mu)}^2.$$

Taking $\varphi = g - f_n$ and $\psi = g - f_m$ in (7.2.5), we obtain

$$\|f_n - f_m\|_{L^2(\mu)}^2 = 2\|g - f_n\|_{L^2(\mu)}^2 + 2\|g - f_m\|_{L^2(\mu)}^2 - 4\Bigl\|g - \tfrac{f_n + f_m}{2}\Bigr\|_{L^2(\mu)}^2 \le 2\bigl(\|g - f_n\|_{L^2(\mu)}^2 - \alpha^2\bigr) + 2\bigl(\|g - f_m\|_{L^2(\mu)}^2 - \alpha^2\bigr),$$

where we have used the fact that $\frac{f_n + f_m}{2} \in F$ in order to get the last inequality. Now let $\epsilon > 0$ be given, choose $N \in \mathbb Z^+$ so that $\|g - f_n\|_{L^2(\mu)}^2 < \alpha^2 + \frac{\epsilon^2}{4}$ for $n \ge N$, and conclude that $\|f_n - f_m\|_{L^2(\mu)} < \epsilon$ for $m, n \ge N$. □
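For a concrete feel for the perpendicular projection, here is a small sketch on a finite measure space, where $L^2(\mu)$ is just a weighted Euclidean space. Everything in it is an assumption made for illustration: the point masses `w`, the partition `cells`, and the function `g` are hypothetical, and $F$ is taken to be the (closed, linear) space of functions constant on each cell, for which the projection is the $\mu$-weighted cell average.

```python
from fractions import Fraction

# Hypothetical finite measure space: E = {0,...,5} with point masses mu({x}) = w[x].
w = [Fraction(1), Fraction(2), Fraction(1), Fraction(3), Fraction(1), Fraction(2)]
cells = [[0, 1], [2, 3, 4], [5]]      # a partition of E; F = functions constant on cells
g = [Fraction(v) for v in (4, 1, 0, 2, 6, -3)]

def inner(u, v):
    """(u, v) in L^2(mu): the sum over x of u(x) v(x) mu({x})."""
    return sum(a * b * m for a, b, m in zip(u, v, w))

def project(func):
    """Perpendicular projection onto F: the mu-weighted average on each cell."""
    out = [Fraction(0)] * len(func)
    for cell in cells:
        mass = sum(w[x] for x in cell)
        avg = sum(func[x] * w[x] for x in cell) / mass
        for x in cell:
            out[x] = avg
    return out

f = project(g)
residual = [a - b for a, b in zip(g, f)]
indicators = [[Fraction(1 if x in cell else 0) for x in range(len(w))] for cell in cells]
print([inner(residual, one) for one in indicators])
```

The residual $g - f$ is orthogonal to the indicator of every cell, and these indicators span $F$, so $(g - f) \perp F$ as in the second characterization of the lemma.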

The result obtained in Lemma 7.2.3 is a basic existence assertion from which a great many other existence results follow. Perhaps the single most important such result is the following sharpening of (6.2.8).
7.2.6 Theorem (F. Riesz). Let $\Lambda : L^2(\mu) \to \mathbb R$ be a linear mapping. Then

(7.2.7) $$\|\Lambda\| \equiv \sup\bigl\{\Lambda(\varphi) : \varphi \in L^2(\mu) \text{ and } \|\varphi\|_{L^2(\mu)} \le 1\bigr\} < \infty$$

if and only if there is an $h \in L^2(\mu)$ for which

(7.2.8) $$\Lambda(\varphi) = (h,\varphi)_{L^2(\mu)}, \quad \varphi \in L^2(\mu);$$

in which case $h$ is uniquely determined by (7.2.8) and $\|h\|_{L^2(\mu)}$ is the supremum in (7.2.7).
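
PROOF: Everything except the existence of $h$ when (7.2.7) holds is clear. To prove this existence, first note that (7.2.7) implies that

$$|\Lambda(\varphi) - \Lambda(\psi)| \le \|\Lambda\|\,\|\varphi - \psi\|_{L^2(\mu)}, \quad \varphi, \psi \in L^2(\mu),$$

and therefore that $\Lambda$ is continuous on $L^2(\mu)$. Hence, $F \equiv \{\varphi \in L^2(\mu) : \Lambda(\varphi) = 0\}$ is both linear and closed. Moreover, either $F = L^2(\mu)$, in which case we may take $h = 0$, or there is a $g \notin F$. In the latter case, choose $f \in F$ so that $k \equiv (g - f) \perp F$ and note that $\Lambda(k) \ne 0$. In addition, observe that, for any $\varphi \in L^2(\mu)$,

$$\Lambda\Bigl(\varphi - \frac{\Lambda(\varphi)}{\Lambda(k)}\,k\Bigr) = 0$$

and therefore that $\varphi - \frac{\Lambda(\varphi)}{\Lambda(k)}\,k \in F$. Hence,

$$(k,\varphi)_{L^2(\mu)} - \frac{\Lambda(\varphi)}{\Lambda(k)}\,\|k\|_{L^2(\mu)}^2 = \Bigl(k,\ \varphi - \frac{\Lambda(\varphi)}{\Lambda(k)}\,k\Bigr)_{L^2(\mu)} = 0,$$

and so we can take

$$h = \frac{\Lambda(k)}{\|k\|_{L^2(\mu)}^2}\,k. \qquad\square$$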
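In the simplest possible setting the representing function can be written down explicitly. The sketch below assumes a finite set with strictly positive point masses `w` and a linear functional given by hypothetical coefficients `c`; under those assumptions $h(x) = c_x / \mu(\{x\})$ represents $\Lambda$. This is a special case worked by hand, not the construction used in the proof.

```python
from fractions import Fraction

# Hypothetical finite measure space with strictly positive point masses w[x],
# and a hypothetical linear functional Lambda(phi) = sum over x of c[x] * phi[x].
w = [Fraction(1), Fraction(2), Fraction(4)]
c = [Fraction(3), Fraction(-2), Fraction(8)]

def inner(u, v):
    """(u, v) in L^2(mu) for the point masses w."""
    return sum(a * b * m for a, b, m in zip(u, v, w))

def Lambda(phi):
    return sum(a * b for a, b in zip(c, phi))

# Representer: h(x) = c[x] / w[x], so that (h, phi) = sum over x of c[x] * phi[x].
h = [a / m for a, m in zip(c, w)]

phi = [Fraction(5), Fraction(-1), Fraction(2)]
print(Lambda(phi), inner(h, phi))
print(Lambda(h), inner(h, h))
```

The first line of output shows $\Lambda(\varphi) = (h,\varphi)$ for one test function; the second shows $\Lambda(h) = \|h\|^2$, which is the reason the supremum in (7.2.7) equals $\|h\|$.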

Following J. von Neumann, we will now use Theorem 7.2.6 to derive an important property about the relationship between measures. Namely, given two measures $\mu$ and $\nu$ on the same measurable space $(E,\mathcal B)$:
(a) we say $\nu$ dominates $\mu$ and write $\mu \le \nu$ if $\mu(\Gamma) \le \nu(\Gamma)$, $\Gamma \in \mathcal B$;
(b) we say $\mu$ is absolutely continuous with respect to $\nu$ and write $\mu \ll \nu$ if $\mu(\Gamma) = 0$ for all $\Gamma \in \mathcal B$ with $\nu(\Gamma) = 0$;
(c) we say $\mu$ and $\nu$ are singular and write $\mu \perp \nu$ if there is a $\Sigma \in \mathcal B$ with the property that $\mu(\Sigma) = \nu(\Sigma^\complement) = 0$.

Obviously, both domination and absolute continuity express a relationship between $\mu$ and $\nu$, the former being a much stronger statement than the latter. In contrast, singularity is a statement that the measures have nothing to do with one another and, in fact, live on different portions of $E$.
The result alluded to above comes in two parts and applies to $\mu$'s which are finite and $\nu$'s which are $\sigma$-finite. The first part says that $\mu$ can be written (in a unique way) as the sum of a measure $\mu_a$ which is absolutely continuous with respect to $\nu$ and a measure $\mu_\sigma$ which is singular to $\nu$. The second part tells us that there is a unique non-negative $f \in L^1(\nu)$ with the property that

(7.2.9) $$\mu_a(\Gamma) = \int_\Gamma f\,d\nu, \quad \Gamma \in \mathcal B.$$

In particular, if $\mu$ itself is absolutely continuous with respect to $\nu$, then $\mu = \mu_a$ and so (7.2.9) holds with $\mu$ in place of $\mu_a$.
The key to von Neumann's proof of these results is the observation that everything can be reduced to consideration of $\mu$'s which are dominated by $\nu$, in which case the existence of $f$ becomes a simple application of Theorem 7.2.6.
7.2.10 Lemma. Suppose that $(E,\mathcal B,\nu)$ is a $\sigma$-finite measure space and that $\mu$ is a finite measure on $(E,\mathcal B)$ which is dominated by $\nu$. Then there is a unique $[0,1]$-valued $h \in L^1(\nu)$ with the property that

(7.2.11) $$\int_E \varphi\,d\mu = \int_E \varphi h\,d\nu$$

for every $\mathcal B$-measurable $\varphi : E \to [0,\infty]$.

PROOF: Since we can write $E$ as the union of countably many, mutually disjoint, $\mathcal B$-measurable sets of finite $\nu$-measure, we assume, without loss in generality, that $\nu$ is finite on $(E,\mathcal B)$. But, in that case, $L^2(\nu) \subseteq L^1(\mu)$ and the linear mapping

$$\varphi \in L^2(\nu) \longmapsto \Lambda(\varphi) \equiv \int_E \varphi\,d\mu \in \mathbb R$$

satisfies $|\Lambda(\varphi)| \le \nu(E)^{1/2}\,\|\varphi\|_{L^2(\nu)}$. Hence, by Theorem 7.2.6, there is an $h \in L^2(\nu)$ such that (7.2.11) holds for every $\varphi \in L^2(\nu)$ and, therefore, for every bounded $\mathcal B$-measurable function. We now want to show $h$ (which is determined only up to a set of $\nu$-measure $0$) can be chosen to take its values in $[0,1]$. To this end, set $A_n = \{h \le -\frac1n\}$ and $B_n = \{h \ge 1 + \frac1n\}$ for $n \in \mathbb Z^+$. Then, by (7.2.11),

$$0 \le \mu(A_n) = \int_{A_n} h\,d\nu \le -\frac1n\,\nu(A_n) \quad\text{and}\quad \nu(B_n) \ge \mu(B_n) = \int_{B_n} h\,d\nu \ge \Bigl(1 + \frac1n\Bigr)\nu(B_n),$$

from which we conclude that $\nu(A_n) = \nu(B_n) = 0$, $n \in \mathbb Z^+$, and therefore that $\nu(h < 0) = \nu(h > 1) = 0$. In other words, we may assume that $h$ takes its values in $[0,1]$, and, clearly, once we know this, (7.2.11) for all non-negative, $\mathcal B$-measurable $\varphi$'s is an easy consequence of the Monotone Convergence Theorem. Finally, the uniqueness assertion is trivial, since (7.2.11) determines the $\nu$-integral of $h$ over every $\Gamma \in \mathcal B$. □
The first part of the next theorem is called Lebesgue's Decomposition Theorem, and the second part is the Radon-Nikodym Theorem.

7.2.12 Theorem. Suppose that $(E,\mathcal B,\nu)$ is a $\sigma$-finite measure space, and let $\mu$ be a finite measure on $(E,\mathcal B)$. Then there is a unique measure $\mu_a \le \mu$ on $(E,\mathcal B)$ with the properties that $\mu_a \ll \nu$ and $\mu_\sigma \equiv (\mu - \mu_a) \perp \nu$. In addition, there is a unique non-negative $f \in L^1(\nu)$ for which (7.2.9) holds. In particular, $\mu \ll \nu$ if and only if $\mu(\Gamma) = \int_\Gamma f\,d\nu$, $\Gamma \in \mathcal B$, for some non-negative $f \in L^1(\nu)$.

PROOF: We first note that if $\mu(\Gamma) = \int_\Gamma f\,d\nu$, $\Gamma \in \mathcal B$, for some $f \in L^1(\nu)$, then (cf. Exercise 3.3.15) $\mu \ll \nu$ and (cf. Exercise 3.2.17) $f$ is necessarily unique and (cf. the preceding) non-negative as an element of $L^1(\nu)$. Next, we prove that there is at most one choice of $\mu_a$. To this end, suppose that $\mu = \mu_a + \mu_\sigma = \mu_a' + \mu_\sigma'$, where $\mu_a$ and $\mu_a'$ are both absolutely continuous with respect to $\nu$ and both $\mu_\sigma$ and $\mu_\sigma'$ are singular to $\nu$. Choose $\Sigma, \Sigma' \in \mathcal B$ so that

$$\nu(\Sigma) = \nu(\Sigma') = 0 \quad\text{and}\quad \mu_\sigma(\Sigma^\complement) = \mu_\sigma'(\Sigma'^\complement) = 0,$$

and set $A = \Sigma \cup \Sigma'$. Then $\nu(A) = \mu_\sigma(A^\complement) = \mu_\sigma'(A^\complement) = 0$; and therefore, for any $\Gamma \in \mathcal B$,

$$\mu_a(\Gamma) = \mu_a(A^\complement \cap \Gamma) = \mu(A^\complement \cap \Gamma) = \mu_a'(A^\complement \cap \Gamma) = \mu_a'(\Gamma).$$

Hence, $\mu_a = \mu_a'$.
To prove the existence statements, we first use Lemma 7.2.10, applied to $\mu$ and $\mu + \nu$, to find a $\mathcal B$-measurable $h : E \to [0,1]$ with the property that

$$\int_E \varphi\,d\mu = \int_E \varphi h\,d\mu + \int_E \varphi h\,d\nu$$

for all non-negative, $\mathcal B$-measurable $\varphi$'s. It is then clear that

(7.2.13) $$\int_E \varphi(1 - h)\,d\mu = \int_E \varphi h\,d\nu,$$

first for all $\varphi \in L^1(\mu) \cap L^1(\nu)$ and then for all non-negative, $\mathcal B$-measurable $\varphi$'s. Now set $\Sigma = \{h < 1\}$ and $\mu_a(\Gamma) = \mu(\Sigma \cap \Gamma)$, $\Gamma \in \mathcal B$. Since

$$\nu(\Sigma^\complement) = \nu(h = 1) = \int_{\{h=1\}} h\,d\nu = \int_{\{h=1\}} (1 - h)\,d\mu = 0,$$

it is clear that $\mu - \mu_a$ is singular to $\nu$. At the same time, if $f \equiv (1 - h)^{-1} h\,\mathbf 1_\Sigma$, then, by (7.2.13) with $\varphi = (1 - h)^{-1}\mathbf 1_{\Gamma\cap\Sigma}$,

$$\mu_a(\Gamma) = \mu_a(\Gamma \cap \Sigma) = \int_E \varphi(1 - h)\,d\mu = \int_\Gamma f\,d\nu$$

for each $\Gamma \in \mathcal B$. □
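The recipe in the existence argument ($h$ from Lemma 7.2.10 applied to $\mu$ and $\mu + \nu$, then $\Sigma = \{h < 1\}$ and $f = (1-h)^{-1} h\,\mathbf 1_\Sigma$) can be traced on a four-point set. The sketch below is illustrative only: the point masses for `mu` and `nu` are hypothetical, and on a finite set the density $h$ is computed directly as a ratio of point masses.

```python
from fractions import Fraction

# Hypothetical point masses on E = {"a", "b", "c", "d"}.
mu = {"a": Fraction(1), "b": Fraction(2), "c": Fraction(3), "d": Fraction(0)}
nu = {"a": Fraction(2), "b": Fraction(0), "c": Fraction(1), "d": Fraction(5)}

# h is the density of mu with respect to mu + nu.
# Where (mu + nu)({x}) = 0 the value of h is immaterial; 0 is chosen here.
h = {x: (mu[x] / (mu[x] + nu[x]) if mu[x] + nu[x] > 0 else Fraction(0)) for x in mu}

sigma = {x for x in mu if h[x] < 1}                              # the set {h < 1}
f = {x: (h[x] / (1 - h[x]) if x in sigma else Fraction(0)) for x in mu}

mu_a = {x: (mu[x] if x in sigma else Fraction(0)) for x in mu}   # mu restricted to {h < 1}
mu_s = {x: mu[x] - mu_a[x] for x in mu}                          # the singular part

print(f)
print(mu_a)
print(mu_s)
```

Here the singular part sits entirely on the point `"b"`, which carries no $\nu$-mass, while on the remaining points $\mu_a(\{x\}) = f(x)\,\nu(\{x\})$.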
Given a finite measure $\mu$ and a $\sigma$-finite measure $\nu$, the corresponding measures $\mu_a$ and $\mu_\sigma$ are called the absolutely continuous and singular parts of $\mu$ with respect to $\nu$. Also, if $\mu$ is absolutely continuous with respect to $\nu$, then the corresponding non-negative $f \in L^1(\nu)$ is called the Radon-Nikodym derivative of $\mu$ with respect to $\nu$ and is often denoted by $\frac{d\mu}{d\nu}$. The choice of this notation is explained by part (ii) of Exercise 7.2.14, which follows.

Exercises

7.2.14 Exercise: Suppose that $\mathcal C$ is a countable partition of the non-empty set $E$, and use $\mathcal B$ to denote $\sigma(\mathcal C)$.
(i) Show that $f : E \to \mathbb R$ is $\mathcal B$-measurable if and only if $f$ is constant on each $\Gamma \in \mathcal C$. Also, show that the measure $\nu$ is $\sigma$-finite on $(E,\mathcal B)$ if and only if $\nu(\Gamma) < \infty$ for every $\Gamma \in \mathcal C$. Finally, if $\mu$ is a second measure on $(E,\mathcal B)$, show that $\mu \ll \nu$ if and only if $\mu(\Gamma) = 0$ for all $\Gamma \in \mathcal C$ satisfying $\nu(\Gamma) = 0$.
(ii) Given any measures $\mu$ and $\nu$ on $(E,\mathcal B)$ and a $\mathcal B$-measurable, $\nu$-integrable $f : E \to [0,\infty)$, show that $\mu(\Gamma) = \int_\Gamma f\,d\nu$ for all $\Gamma \in \mathcal B$ implies that, for every $\Gamma \in \mathcal C$, $f = \frac{\mu(\Gamma)}{\nu(\Gamma)}$ on $\Gamma$ if $\nu(\Gamma) \in (0,\infty)$ and $\mu(\Gamma) = 0$ if $\nu(\Gamma) = \infty$.
(iii) Using the preceding, show that, in general, one cannot dispense with the assumption in Theorem 7.2.12 that $\nu$ is $\sigma$-finite.
7.2.15 Exercise: Readers with good memories may be disturbed by the apparent difference between the notions of absolute continuity used here and that used earlier in Exercise 3.3.15. To allay such concerns, check that, as long as $\mu$ is finite, $\mu \ll \nu$ implies that for every $\epsilon > 0$ there is a $\delta > 0$ with the property that $\mu(\Gamma) < \epsilon$ whenever $\Gamma \in \mathcal B$ and $\nu(\Gamma) < \delta$. (Because $\nu$ is not assumed to be $\sigma$-finite, you should not use the Radon-Nikodym Theorem here.)
7.2.16 Exercise: The purpose of this exercise is to take a closer look at the relationship $\psi \longmapsto \mu_\psi$ (established in Exercise 7.1.31) taking a bounded, right-continuous, non-decreasing $\psi$ on $\mathbb R$ into a finite measure $\mu_\psi$ on $(\mathbb R,\mathcal B_{\mathbb R})$.
(i) Given a measure space $(E,\mathcal B,\mu)$ and an $x \in E$, one says that $x$ is an atom of $(E,\mathcal B,\mu)$ if $\mu(\Gamma) > 0$ for every $\Gamma \in \mathcal B$ which contains $x$. Further, one says that $(E,\mathcal B,\mu)$ is non-atomic if it has no atoms and that $(E,\mathcal B,\mu)$ is purely atomic if $\mu(\Gamma) = 0$ for every $\Gamma \in \mathcal B$ which does not contain any atoms of $(E,\mathcal B,\mu)$. Assuming that $\{x\} \in \mathcal B$ for every $x \in E$, show that $x$ is an atom of $(E,\mathcal B,\mu)$ if and only if $\mu(\{x\}) > 0$, and, when $\mu$ is $\sigma$-finite, conclude that the set $A = \{x \in E : \mu(\{x\}) > 0\}$ of atoms is countable. In particular, if $\{x\} \in \mathcal B$ for every $x \in E$ and $\mu$ is $\sigma$-finite, show that $(E,\mathcal B,\mu)$ is purely atomic if and only if there is a countable set $S$ such that

$$\mu(\Gamma) = \sum_{x\in\Gamma\cap S}\mu(\{x\}), \quad \Gamma \in \mathcal B.$$

(ii) Given a bounded, right-continuous, non-decreasing $\psi$ on $\mathbb R$, show that $\psi$ is continuous if and only if $\mu_\psi$ is non-atomic. Next, using either Lemma 1.2.20 or the preceding, show that the set $D(\psi)$ of $x \in \mathbb R$ for which $\psi(x) - \psi(x-) > 0$ is countable, and define the discontinuous part of $\psi$ to be the function $\psi_d : \mathbb R \to \mathbb R$ given by

$$\psi_d(x) = \sum_{t\in(-\infty,x]\cap D(\psi)}\bigl(\psi(t) - \psi(t-)\bigr), \quad x \in \mathbb R.$$

Show that $\psi_d$ is a non-negative, bounded, right-continuous, non-decreasing function and that the continuous part $\psi_c \equiv \psi - \psi(-\infty) - \psi_d$ of $\psi$ is a bounded, continuous, non-decreasing function. Finally, say that $\psi$ is purely discontinuous if $\psi = \psi(-\infty) + \psi_d$, and check that $\psi$ is purely discontinuous if and only if $\mu_\psi$ is purely atomic.
(iii) We now want to see how to characterize $\mu_\psi \ll \lambda_{\mathbb R}$ in terms of $\psi$. For this purpose, we say that the bounded, right-continuous, non-decreasing function $\psi$ is absolutely continuous if, for every $\epsilon > 0$, there is a $\delta > 0$ such that

$$\sum_{m=1}^\infty\bigl(\psi(b_m) - \psi(a_m)\bigr) < \epsilon$$

whenever $\{(a_m,b_m)\}_1^\infty$ is a sequence of mutually disjoint, open intervals which satisfy $\sum_{m=1}^\infty (b_m - a_m) < \delta$. Obviously, every absolutely continuous $\psi$ is uniformly continuous. Show that, in fact, $\psi$ is absolutely continuous if and only if $\mu_\psi \ll \lambda_{\mathbb R}$. In this connection, we say that $\psi$ is singular if $\mu_\psi \perp \lambda_{\mathbb R}$.
(iv) The preceding considerations lead to the following decomposition of a bounded, right-continuous, non-decreasing function $\psi$. Namely, let $\mu_{\psi,a}$ and $\mu_{\psi,\sigma}$ be the absolutely continuous and singular parts of $\mu_\psi$ with respect to $\lambda_{\mathbb R}$, and define the absolutely continuous part $\psi_a$ and singular part $\psi_\sigma$ of $\psi$ by

$$\psi_a(x) = \mu_{\psi,a}\bigl((-\infty,x]\bigr) \quad\text{and}\quad \psi_\sigma(x) = \mu_{\psi,\sigma}\bigl((-\infty,x]\bigr)$$

for $x \in \mathbb R$. Further, let $\psi_{\sigma,c}$ and $\psi_{\sigma,d}$ be the continuous and discontinuous parts of $\psi_\sigma$. Obviously,

(7.2.17) $$\psi = \psi(-\infty) + \psi_a + \psi_{\sigma,c} + \psi_{\sigma,d}.$$

Show that the decomposition in (7.2.17) is canonical in the sense that if $\psi = \psi(-\infty) + \psi_1 + \psi_2 + \psi_3$, where, for each $1 \le i \le 3$, $\psi_i$ is right-continuous, non-decreasing, and tends to $0$ at $-\infty$, $\psi_1$ is absolutely continuous, $\psi_2$ is continuous but singular, and $\psi_3$ is purely discontinuous, then $\psi_1 = \psi_a$, $\psi_2 = \psi_{\sigma,c}$, and $\psi_3 = \psi_{\sigma,d}$.
7.2.17 Exercise: It is clear that absolutely continuous non-decreasing and purely discontinuous non-decreasing functions exist. But are there any continuous, singular, non-decreasing functions? In order to show that there are, we will now describe the Cantor-Lebesgue function. Referring to Exercise 2.1.20, recall that the closed set $C_k$ is the union of $2^k$ disjoint closed intervals and that $[0,1] \setminus C_k$ is the union of $2^k - 1$ disjoint open intervals $(a_{k,j}, b_{k,j})$, $1 \le j < 2^k$, where we have ordered these so that $b_{k,j} < a_{k,j+1}$ for $1 \le j < 2^k - 1$. Next, set $a_{k,0} = -\infty$, $b_{k,0} = 0$, $a_{k,2^k} = 1$, $b_{k,2^k} = \infty$, and define $\psi_k : \mathbb R \to [0,1]$ so that:
(a) $\psi_k$ is constant on each of the intervals $[a_{k,j}, b_{k,j}]$, $0 \le j \le 2^k$;
(b) $\psi_k(b_{k,j}) = 2^{-k} j$ for $0 \le j \le 2^k$ and $\psi_k$ is linear on each of the intervals $[b_{k,j}, a_{k,j+1}]$, $0 \le j < 2^k$.

Notice that each $\psi_k$ is continuous and non-decreasing from $\mathbb R$ onto $[0,1]$. In addition, check that $\|\psi_{k+1} - \psi_k\|_{u,\mathbb R} \le 2^{-k}$; and conclude from this that $\psi_k$ converges uniformly on $\mathbb R$ to a continuous, non-decreasing $\psi : \mathbb R \to [0,1]$ with the property that $\mu_\psi(C) = 1$. At the same time, by Exercise 2.1.20, $\lambda_{\mathbb R}(C) = 0$, and therefore $\psi$ is singular.
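The approximations $\psi_k$ can be evaluated exactly with rational arithmetic, which gives a quick check on the uniform convergence. The sketch below relies on an assumption not stated in the exercise, namely the standard self-similar recursion $\psi_{k+1}(x) = \frac12\psi_k(3x)$ on $[0,\frac13]$, $\psi_{k+1} = \frac12$ on $[\frac13,\frac23]$, and $\psi_{k+1}(x) = \frac12 + \frac12\psi_k(3x-2)$ on $[\frac23,1]$, started from $\psi_0(x) = x$ on $[0,1]$. The function names are hypothetical.

```python
from fractions import Fraction

def psi(k, x):
    """k-th approximation to the Cantor-Lebesgue function, evaluated exactly."""
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    if k == 0:
        return Fraction(x)
    if x <= Fraction(1, 3):
        return psi(k - 1, 3 * x) / 2
    if x >= Fraction(2, 3):
        return Fraction(1, 2) + psi(k - 1, 3 * x - 2) / 2
    return Fraction(1, 2)

def sup_gap(k):
    """Max of |psi_{k+1} - psi_k| over the grid j / 3**(k+1), which contains every breakpoint."""
    n = 3 ** (k + 1)
    return max(abs(psi(k + 1, Fraction(j, n)) - psi(k, Fraction(j, n))) for j in range(n + 1))

gaps = [sup_gap(k) for k in range(5)]
print(gaps)
```

Because both functions are piecewise linear with breakpoints on the grid, the grid maximum is the true uniform distance. In this computation the gaps halve at each step, so the series of gaps is summable and the $\psi_k$ form a uniformly Cauchy sequence.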
Notation

Notation    Description    See

(a.e., $\mu$)    To be read almost everywhere with respect to $\mu$    §3.3
$a^+$ & $a^-$    The positive and negative parts of $a \in \mathbb R$    §1.2
$(v,w)_{\mathbb R^N}$    The inner product of $v$ and $w$ in $\mathbb R^N$
$\varlimsup_{n\to\infty} A_n$ & $\varliminf_{n\to\infty} A_n$    The limit superior $\bigcap_{m=1}^\infty\bigcup_{n\ge m} A_n$ and limit inferior $\bigcup_{m=1}^\infty\bigcap_{n\ge m} A_n$ of the sets $\{A_n\}_1^\infty$    E 3.1.12
$B_E(a,r)$    The ball of radius $r$ around $a$ in the metric space $E$
$\mathcal B_E$    The Borel $\sigma$-algebra over the topological space $E$    §3.1
$\mathcal B[E']$    The restriction of the $\sigma$-algebra $\mathcal B$ to the subset $E'$    §3.1
$\overline{\mathcal B}^{\mu}$    The completion of the $\sigma$-algebra $\mathcal B$ with respect to the measure $\mu$    §3.1
$\overline{\mathcal B}_{\mathbb R^N}$    The $\sigma$-algebra of Lebesgue measurable subsets in $\mathbb R^N$    §2.1
$\mathcal B_1 \times \mathcal B_2$    The product $\sigma$-algebra generated by sets of the form $\Gamma_1 \times \Gamma_2$ for $\Gamma_i \in \mathcal B_i$    §3.2
$C(E)$ & $C(E;F)$    The spaces of continuous $\mathbb R$-valued and $F$-valued functions on $E$
$C_b(E;\mathbb R)$    The space of bounded, continuous, $\mathbb R$-valued functions on the topological space $E$
$C_c(G)$    The space of $f \in C(G)$ with compact support in $G$
The space of $f : G \to \mathbb R^N$ with $n$ continuous derivatives
$\|\mathcal C\|$    The mesh size of the collection $\mathcal C$    §1.1
To be read $\mathcal C_2$ is a refinement of $\mathcal C_1$    §1.1
The least common refinement of $\mathcal C_1$ and $\mathcal C_2$    §1.1
The length of the interval $I$ and the change $\psi(I^+) - \psi(I^-)$ of $\psi$ over $I$    §§1.2 & 1.3
The Jacobian of $\Phi$    §5.3
$\partial^\alpha$    Abbreviation for $\frac{\partial^{|\alpha|}}{\partial x_1^{\alpha_1}\cdots\partial x_N^{\alpha_N}}$    §5.2
$\partial\Gamma$    The boundary $\overline\Gamma \setminus \mathring\Gamma$ of the set $\Gamma$

$f \restriction \Gamma$    The restriction of the function $f$ to the set $\Gamma$
$f(x+)$ & $f(x-)$    The right and left limits of $f$ at $x \in \mathbb R$    §1.2
$\|f\|_u$ & $\|f\|_{u,E}$    The uniform norms of the function $f$ & $f \restriction E$
$f \wedge g$ & $f \vee g$    The minimum & maximum of $f$ and $g$
$\|f\|_{L^p(\mu)}$    The $p$-Lebesgue norm of $f$ with respect to the measure $\mu$    §3.2 & §5.2
$(f,g)_{L^2(\mu)}$    The inner product of $f$ and $g$ in $L^2(\mu)$    §7.2
$\|f\|_{L^{(p_1,p_2)}(\mu_1,\mu_2)}$    The mixed $(p_1,p_2)$-Lebesgue norm with respect to the pair of measures $(\mu_1,\mu_2)$    §5.2
$\mathfrak F$ & $\mathfrak F_\sigma$    The collections of all closed subsets and all countable unions of closed subsets    §2.1
$f * g$    The convolution of the functions $f$ and $g$    (5.3.3)
$\Phi_*\mu = \mu \circ \Phi^{-1}$    The image of the measure $\mu$ under the map $\Phi$    §5.1
$|\Gamma|_e$    The exterior or outer Lebesgue measure of the set $\Gamma$    §2.1
$\Gamma^\complement$    The complement of the set $\Gamma$
$\mathring\Gamma$    The interior of the set $\Gamma$
$\overline\Gamma$    The closure of the set $\Gamma$
$\Gamma^{(\delta)}$    The open $\delta$-hull of the set $\Gamma$    §5.3
$\mathfrak G$ & $\mathfrak G_\delta$    The collections of all open sets and all countable intersections of open sets    §2.1
$\mathbf 1_\Gamma$    The indicator or characteristic function of the set $\Gamma$    §3.2
$I^-$ & $I^+$    The left and right end points of the interval $I$
$I_{\mathbb R^N}$    The identity matrix on $\mathbb R^N$    §2.2
$\int_\Gamma f\,dx$    The Lebesgue integral of $f$ over $\Gamma \in \overline{\mathcal B}_{\mathbb R^N}$
$J\Phi$    The Jacobian matrix of $\Phi$    §5.3
$L^p(\mu)$    The Lebesgue space of $f$ with $\|f\|_{L^p(\mu)} < \infty$    §3.2 & §5.2
$L^p(\mathbb R^N)$    The Lebesgue space $L^p(\lambda_{\mathbb R^N})$
$L^{(p_1,p_2)}(\mu_1,\mu_2)$    Mixed norm Lebesgue space    §6.2

$\lambda_E$    Lebesgue's measure restricted to $E \in \overline{\mathcal B}_{\mathbb R^N}$
$\mu_1 \times \mu_2$    The product of the measures $\mu_1$ and $\mu_2$    §3.4
$\mu \ll \nu$    To be read $\mu$ is absolutely continuous with respect to $\nu$    §7.2
$\frac{d\mu}{d\nu}$    The Radon-Nikodym derivative of $\mu$ with respect to $\nu$    §7.2
$\mu \perp \nu$    To be read $\mu$ is singular to $\nu$    §7.2
$\overset{\mu}{\sim}$    Equivalence relation determined by the measure $\mu$    §3.2
$\overline\mu$    The completion of the measure $\mu$    §3.1
$\mathbb N$    The set of non-negative integers
$\nabla F$    The gradient of the function $F$    §5.3
$\mathbf n(x)$    The outer normal at the point $x$    §5.3
$\omega_{N-1}$ & $\Omega_N$    The surface area of the sphere and the volume of the unit ball in $\mathbb R^N$    E 5.2.6
$\mathcal P(E)$    The power set (set of all subsets) of $E$    §3.1
$\mathbb Q$    The set of all rational numbers in $\mathbb R$
$\overline{\mathbb R}$    The extended real line    §3.2
$\widehat{\mathbb R^2}$    The extended plane $\overline{\mathbb R}^2$ with the two points $(\infty,-\infty)$ and $(-\infty,\infty)$ removed    §3.2
$(R)\int \varphi(x)\,d\psi(x)$    The Riemann-Stieltjes integral of $\varphi$ with respect to $\psi$    §1.2
$\operatorname{sgn}(x)$    The signum of $x \in \mathbb R$    §1.2
$\sigma(E;\mathcal C)$    The $\sigma$-algebra over $E$ generated by the collection $\mathcal C$    §3.1
$S^{N-1}$    The unit sphere in $\mathbb R^N$
$T_A$    The linear transformation determined by the matrix $A$    §2.2
$T_x$    The tangent space at $x$    §5.3
$\Xi(\mathcal C)$    The set of selections for the collection $\mathcal C$    §1.1
$\mathbb Z$ & $\mathbb Z^+$    The set of all integers and its subset of positive integers
Index

A convex set, 1 1 4
convolution , 131
absolutely continuous, 60 for the multiplication group , 137
for measures , 1 53 Young's Inequality for, 1 3 1
for non-decreasing function, 1 57 convolution semigroup , 134
part of a measure, 1 55 Cauchy or Poisson, 135
uniformly, 6 1 Weierstrass, 135
absolutely continuous part coordinate chart , 95
for non-decreasing functions, 1 57 global, 102
algebra, 34 countably additive, 36
almost everywhere, 53 cover, 1
convergence, 53 exact, 3 , 24
approximate identity, 1 34 non-overlapping, 1 , 24
compactly supported , 136
atom , 1 56
D
atomic
non, 1 56 Daniell 's Theorem, 143
purely, 156 diameter, 74
axiom of choice, 27 of rectangle, 1
diffeomorphism, 90
B Dini 's Lemma, 146
discontinuous part
ball, 25 of non-decreasing function, 1 56
volume of, 88 distribution function of f
volume of in JR N , 86 in computation of integrals, 1 29
Beta function, 101 , 1 36 distribution of f , 8 1
Borel a-algebra or field over E , 35 i n computation of integrals , 83
Borel measurable , 35 divergence, 109
Borel-Cantelli Lemma, 39 Divergence Theorem , 109

c E

Cantor set , 28 Euclidean invariance


Cantor-Lebesgue function, 157 of Lebesgue ' s measure, 3 1
Caratheodory's Extension Theorem, 146 extended real line, 4 1
co-area formula, 103 , 1 13
concave function, 1 14
F
Hessian criterion, 1 1 5
continuous part Fatou ' s Lemma, 52, 57
of non-decreasing function, 1 57 Lieb's version, 54, 58, 1 22
convergence finitely additive, 140
J.L-almost everywhere, 53 Fubini 's Theorem, 71
in J.L-measure, 54 function
metric for, 6 1 bump , 136
Index 1 63

function (continued)
    concave, 114
    harmonic, 111
    measurable, 43
    modulus of continuity of, 10
fundamental solution, 109
Fundamental Theorem of Calculus, 9

G

Gamma function, 88
Gauss kernel, 135
gradient, 92
Green's Identity, 109

H

Hardy's inequality, 138
Hardy-Littlewood
    maximal function, 62, 66
    maximal inequality, 65, 130
harmonic function, 111
    mean value property of, 111
        generalized, 112
heat flow or Weierstrass semigroup, 135
Hessian matrix, 115
Hilbert space, 151
Hölder's Inequality, 117
    Hölder conjugate, 117, 122
hyperplane, 33
hypersurface, 92

I

image of a measure, 80
injective, 90
integrable, 49
    function
    the space $L^1(\mu)$, 49
    uniformly, 61
integration by parts, 9
isodiametric inequality, 74

J

Jacobi's Transformation Formula, 90
Jacobian, 89
    matrix, 89
Jensen's Inequality, 114, 119

L

$\lambda$-system, 35
Laplacian, 109
lattice, 139
    vector, 139
        integral on, 139
        integration theory for, 139
lattice operations, 7
$\mathcal{L}$-system, 68
Lebesgue integral
    exists, 47
    notation for, 43
Lebesgue measurable, 23
Lebesgue set, 67
Lebesgue spaces
    $L^p(\mu)$, 119
    $L^1(\mu)$, 49
    mixed $L^{(p_1,p_2)}(\mu_1,\mu_2)$, 124
Lebesgue's Decomposition Theorem, 154
Lebesgue's Dominated Convergence Theorem, 53, 57
Lebesgue's measure, 23
    Euclidean invariance of, 31
    existence of, 26
    Hausdorff's description, 76
    interior, 142
    notation for, 23, 36, 85
    outer or exterior, 21
        notation for, 21
    of a parallelepiped, 33
limit of sets, 39
    limit inferior, 39
    limit superior, 39
linear functional
    non-negative, 147
    tight, 147
Lipschitz continuous, 30
    Lipschitz constant, 30
$L^{(p_1,p_2)}(\mu_1,\mu_2)$, 124
$L^1(\mu)$, 49

M

Markov's Inequality, 47
Mean Value Property, 111, 137
measurable
    function, 43
        criteria for, 50
        indicator or characteristic, 43
        Lebesgue integral of, 46
        simple, 43

measurable (continued)
    (function continued)
        (simple continued)
            Lebesgue integral of, 43
    map, 41
measurable space, 36
    product of, 41
measure, 36
    finite, 36
    probability, 36
    product, 71
        arbitrary number of factors, 149
    pushforward or image of, 80
    $\sigma$-finite, 70
    surface, 100
    zero, 53
measure space, 36
    complete, 37
    completion of, 36
    finite, 36
    probability, 36
    $\sigma$-finite, 70
mesh size, 3
Minkowski's Inequality, 117
    continuous version, 125
modulus of continuity, 10
monotone class, 38
Monotone Convergence Theorem, 51
more refined, 4

N

non-atomic, 156
non-measurable set, 28
non-overlapping, 1

O

one-sided, stable law of order $\tfrac12$, 135
open $\delta$-hull, 90
outer normal, 104

P

parallelepiped, 33
    volume of, 33
$\pi$-system, 35
Poisson equation, 110
Poisson or Cauchy semigroup, 135
polar coordinates, 86
power set, 34
purely atomic, 156
purely discontinuous
    non-decreasing function, 157
pushforward of a measure, 80

R

radius, 74
Radon-Nikodym
    derivative, 155
    Theorem, 154
rectangle, 1
    cube, 24
    diameter (diam) of, 1
    volume (vol) of, 1
Riemann
    integrable, 3
        $\psi$-Riemann integrable, 8
        with respect to $\psi$, 8
        with respect to $\psi$, 81
    sum, 3
        lower, 4
        upper, 4
Riemann integral, 4
Riemann-Stieltjes integral, 8
Riesz Representation Theorem
    for $L^2$, 152
    for measures, 147

S

Schwarz's inequality, 119
semi-lattice, 68
$\sigma$-algebra, 34
    generated by, 35
signum function (sgn), 16
singular
    for measures, 153
    part of a measure, 155
    part of non-decreasing function, 157
singular part
    for non-decreasing functions, 157
smooth, 103
    regions, 103
sphere, 75, 85
    surface area of, 86, 88
    surface measure, 85
Steiner symmetrization procedure, 75
Stone's Theorem, 144
strong minimum principle, 112
subordination, 135
surface measure, 100
    of graphs, 102
symmetric, 74

T

tangent space, 92
tensor product, 41
tight
    family of functions, 61
    linear functional, 147
Tonelli's Theorem, 70

U

uniform norm, 6, 140

V

variation (Var), 12
    bounded, 12
    negative variation ($\mathrm{Var}_-$), 12
    positive variation ($\mathrm{Var}_+$), 12

Y

Young's Inequality, 131
Young's inequality
    for the multiplicative group, 138

Solution to Selected Problems

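1.1.9: To prove Theorem 1.1.8, simply note that Lemma 1.1.7 says that
\[ \lim_{\|\mathcal{C}\|\to 0} \mathcal{U}(f;\mathcal{C}) = \inf_{\mathcal{C}} \mathcal{U}(f;\mathcal{C}) \]
and that
\[ \lim_{\|\mathcal{C}\|\to 0} \mathcal{L}(f;\mathcal{C}) = \sup_{\mathcal{C}} \mathcal{L}(f;\mathcal{C}). \]

To prove that $f \vee g$ is Riemann-integrable if $f$ and $g$ are, observe that
\[ a' \vee b' - a \vee b \le (a'-a) \vee (b'-b) \le |a'-a| + |b'-b| \quad\text{for any } a, a', b, b' \in \mathbb{R}. \]
Thus, for any $\mathcal{C}$,
\[ \mathcal{U}(f\vee g;\mathcal{C}) - \mathcal{L}(f\vee g;\mathcal{C}) \le \bigl[\mathcal{U}(f;\mathcal{C}) - \mathcal{L}(f;\mathcal{C})\bigr] + \bigl[\mathcal{U}(g;\mathcal{C}) - \mathcal{L}(g;\mathcal{C})\bigr]; \]
from which it is clear that $f \vee g$ is Riemann-integrable if $f$ and $g$ are. In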
addition, since

for every $\mathcal{C}$ and $\xi \in \Xi(\mathcal{C})$, the corresponding inequality is obvious.


One can handle $f \wedge g$ by either applying an analogous line of reasoning or
simply noting that $f \wedge g = -\bigl((-f) \vee (-g)\bigr)$. Finally, the linearity assertion follows
immediately from the linearity of the approximating Riemann sums.

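1.1.10 & 1.1.12: Given $\mathcal{C}$ and $\epsilon > 0$, set
\[ \mathcal{C}(\epsilon) = \Bigl\{ I \in \mathcal{C} : \sup_I f - \inf_I f > \epsilon \Bigr\}. \]

Assume that $f$ is Riemann-integrable, and let $\epsilon > 0$ be given. Then there exists a $\delta > 0$ such that
\[ \mathcal{U}(f;\mathcal{C}) - \mathcal{L}(f;\mathcal{C}) < \epsilon^2 \]
whenever $\|\mathcal{C}\| < \delta$. Thus,
\[ \epsilon \sum_{I \in \mathcal{C}(\epsilon)} \mathrm{vol}(I) \le \sum_{I \in \mathcal{C}(\epsilon)} \Bigl(\sup_I f - \inf_I f\Bigr) \mathrm{vol}(I) \le \sum_{I \in \mathcal{C}} \Bigl(\sup_I f - \inf_I f\Bigr) \mathrm{vol}(I) = \mathcal{U}(f;\mathcal{C}) - \mathcal{L}(f;\mathcal{C}) < \epsilon^2, \]
and therefore $\sum_{I \in \mathcal{C}(\epsilon)} \mathrm{vol}(I) < \epsilon$, as long as $\|\mathcal{C}\| < \delta$.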

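Conversely, suppose that for each $\epsilon > 0$ there is a $\mathcal{C}_\epsilon$ for which
\[ \sum_{I \in \mathcal{C}_\epsilon(\epsilon)} \mathrm{vol}(I) < \epsilon. \]
Then
\[ \begin{aligned}
\mathcal{U}(f;\mathcal{C}_\epsilon) - \mathcal{L}(f;\mathcal{C}_\epsilon)
&= \sum_{I \in \mathcal{C}_\epsilon(\epsilon)} \Bigl(\sup_I f - \inf_I f\Bigr) \mathrm{vol}(I) + \sum_{I \in \mathcal{C}_\epsilon \setminus \mathcal{C}_\epsilon(\epsilon)} \Bigl(\sup_I f - \inf_I f\Bigr) \mathrm{vol}(I) \\
&\le 2\|f\|_u \sum_{I \in \mathcal{C}_\epsilon(\epsilon)} \mathrm{vol}(I) + \epsilon \sum_{I \in \mathcal{C}_\epsilon \setminus \mathcal{C}_\epsilon(\epsilon)} \mathrm{vol}(I) \\
&\le 2\|f\|_u\,\epsilon + \epsilon |J| = \epsilon\bigl(2\|f\|_u + |J|\bigr) \longrightarrow 0
\end{aligned} \]
as $\epsilon \to 0$. Hence, $\inf_{\mathcal{C}} \mathcal{U}(f;\mathcal{C}) = \sup_{\mathcal{C}} \mathcal{L}(f;\mathcal{C})$ and so $f$ is Riemann-integrable.

Finally, suppose that $f$ is a bounded function on $J$ which is continuous at all but a finite number of points, $a_1, \dots, a_m$. Given $\epsilon > 0$, choose non-overlapping cubes $Q_1, \dots, Q_m$ so that $a_\ell$ is the center of $Q_\ell$ and $\mathrm{vol}(Q_\ell) < \frac{\epsilon}{m}$ for each $1 \le \ell \le m$. Next, observe that $f$ is continuous on $\hat J \equiv J \setminus \bigl(\bigcup_{\ell=1}^m \mathring{Q}_\ell\bigr)$ and therefore that
\[ \sup\bigl\{ f(y) - f(x) : x, y \in \hat J \text{ and } |y - x| < \delta \bigr\} < \epsilon \]
for some $\delta > 0$. Moreover, $\hat J$ admits a finite exact cover $\hat{\mathcal{C}}$ by non-overlapping cubes of diameter less than $\delta$, and clearly
\[ \mathcal{C} \equiv \hat{\mathcal{C}} \cup \{ Q_1 \cap J, \dots, Q_m \cap J \} \]
is a finite, non-overlapping cover of $J$ with the property that
\[ \sum_{\{Q \in \mathcal{C} :\, \sup_Q f - \inf_Q f > \epsilon\}} \mathrm{vol}(Q) \le \sum_{\ell=1}^m \mathrm{vol}(Q_\ell) < \epsilon. \]
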
1.2.28: Let $\psi \in C(J)$. What we must show is that, for each $\mathcal{C}$ and $\epsilon > 0$, there is a $\delta > 0$ such that
\[ S(\psi;\mathcal{C}) - S(\psi;\mathcal{C}') < \epsilon \]
whenever $\|\mathcal{C}'\| < \delta$. To this end, suppose that
\[ \mathcal{C} = \bigl\{ [a_{m-1}, a_m] : 1 \le m \le n \bigr\}, \]
where $J^- = a_0 < \cdots < a_n = J^+$. Next, given $\epsilon > 0$, choose $0 < \delta < \min_{1 \le m \le n} (a_m - a_{m-1})$ so that
\[ \omega_\psi(\delta) \equiv \sup\bigl\{ |\psi(t) - \psi(s)| : s, t \in J \text{ and } |t - s| < \delta \bigr\} < \frac{\epsilon}{2n}. \]

If $\|\mathcal{C}'\| < \delta$ and $\mathcal{A}$ is the set of those $I' \in \mathcal{C}'$ for which there is an $I \in \mathcal{C}$ with $I' \subseteq I$, then for each $I' \in \mathcal{B} \equiv \mathcal{C}' \setminus \mathcal{A}$ there is precisely one $m \in \{1, \dots, n-1\}$ for which $a_m \in \mathring{I}'$. In particular, because $\mathcal{C}'$ is non-overlapping, $\mathcal{B}$ has at most $n$ elements. Moreover, if $I' \in \mathcal{B}$, $a_m \in \mathring{I}'$, and we use $L(I')$ and $R(I')$ to denote $I' \cap [a_{m-1}, a_m]$ and $I' \cap [a_m, a_{m+1}]$, respectively, then
\[ \mathcal{C} \vee \mathcal{C}' = \mathcal{A} \cup \bigcup_{I' \in \mathcal{B}} \{ L(I'), R(I') \}. \]
Hence,
\[ S(\psi;\mathcal{C}) - S(\psi;\mathcal{C}') \le S(\psi;\mathcal{C}' \vee \mathcal{C}) - S(\psi;\mathcal{C}') \le \sum_{I' \in \mathcal{B}} \bigl( |\Delta_{L(I')}\psi| + |\Delta_{R(I')}\psi| \bigr) \le 2n\,\omega_\psi(\delta) < \epsilon. \]
Finally, by (1.2.14), (1.2.16), and (1.2.17), the analogous results for $\mathrm{Var}_-$ and $\mathrm{Var}_+$ follow immediately.

Now, suppose that $\psi \in C^1(J)$. For $n \in \mathbb{Z}^+$ and $0 \le m \le n$, set $a_{m,n} = J^- + \frac{m}{n}\Delta J$, and, for $0 \le m < n$, use the Mean Value Theorem to choose $\xi_{m,n} \in (a_{m,n}, a_{m+1,n})$ so that $\psi(a_{m+1,n}) - \psi(a_{m,n}) = \psi'(\xi_{m,n}) \frac{\Delta J}{n}$. Then
\[ \mathrm{Var}_+(\psi) = \lim_{n \to \infty} \frac{\Delta J}{n} \sum_{m=0}^{n-1} \psi'(\xi_{m,n})^+ = (R)\int_J \psi'(t)^+ \, dt, \]
and similarly for $\mathrm{Var}_-(\psi)$ and $\mathrm{Var}(\psi)$.

1.2.29: Let $\psi_1$ and $\psi_2$ be given. Then, for any $c \le s < t \le d$,

since both $\psi_1(t) - \psi_1(s)$ and $\psi_2(t) - \psi_2(s)$ are non-negative. Hence, for any $c \le s < t \le d$ and any non-overlapping exact cover $\mathcal{C}$ of $[s,t]$,
\[ \psi_2(t) - \psi_2(s) = \sum_{I \in \mathcal{C}} \Delta_I \psi_2 \ge \sum_{I \in \mathcal{C}} (\Delta_I \psi)^+ ; \]
and so $\psi_2(t) - \psi_2(s) \ge \psi_+(t) - \psi_+(s)$. Since $\Delta_I \psi_1 - \Delta_I \psi_- = \Delta_I \psi_2 - \Delta_I \psi_+$ for any interval $I \subseteq J$, we now see that $\psi_2 - \psi_+$ and $\psi_1 - \psi_-$ are non-decreasing.

1.2.30: Define $\psi(0) = 0$ and $\psi(t) = t \sin\bigl(\frac{\pi}{2t}\bigr)$, $t \in (0,1]$. Clearly $\psi$ is continuous on $[0,1]$. On the other hand, if $t_n = \frac{1}{2n+1}$ for $n \in \mathbb{N}$ and $\mathcal{C}_n$ consists of the intervals $[0, t_n]$, $[t_n, t_{n-1}]$, ..., and $[t_1, t_0]$, it is easy to check that $S(\psi;\mathcal{C}_n)$ diverges as $n \to \infty$ at least as fast as the harmonic series does. Thus, $\psi$ has unbounded variation on $[0,1]$.
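
The divergence claimed above can be checked numerically. The following is only a sketch under stated assumptions: it takes $\psi(t) = t\sin(\pi/(2t))$ and $t_n = 1/(2n+1)$ (one reading of the formulas in this solution, chosen so that $\sin(\pi/(2t_n)) = \pm 1$), and the helper names `psi`, `t_point`, `variation_sum`, and `harmonic` are illustrative, not notation from the text.

```python
import math

def psi(t):
    """Assumed form: psi(0) = 0 and psi(t) = t*sin(pi/(2t)) for t in (0, 1]."""
    return 0.0 if t == 0.0 else t * math.sin(math.pi / (2.0 * t))

def t_point(n):
    """Assumed partition points t_n = 1/(2n+1), where sin(pi/(2 t_n)) = (-1)**n."""
    return 1.0 / (2 * n + 1)

def variation_sum(n):
    """Sum of |psi(b) - psi(a)| over [0, t_n], [t_n, t_{n-1}], ..., [t_1, t_0]."""
    points = [0.0] + [t_point(k) for k in range(n, -1, -1)]
    return sum(abs(psi(b) - psi(a)) for a, b in zip(points, points[1:]))

def harmonic(n):
    """Partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return sum(1.0 / k for k in range(1, n + 1))

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, variation_sum(n), harmonic(n))
```

Because consecutive values $\psi(t_k)$ alternate in sign, each interval $[t_k, t_{k-1}]$ contributes $t_k + t_{k-1} \ge \frac{2}{3k}$, which is why the sums dominate a multiple of the harmonic series.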

To handle the second part of the exercise, consider the function


if t E (0, 1 ] \ { � }
if t == � .

Clearly $\mathrm{Var}(\psi;[0,1]) = 2$. On the other hand, for any $\varphi \in C([0,1])$, one has that $\mathcal{R}(\varphi|\psi;\mathcal{C},\xi) = 0$ as long as the point at which $\psi$ is discontinuous is not an endpoint of any $I \in \mathcal{C}$. Thus, $(R)\int_{[0,1]} \varphi(x)\,d\psi(x) = 0$ for every $\varphi \in C([0,1])$.

2.1.20: Part (i) is obvious, since $C_k$ is the union of $2^k$ mutually disjoint intervals of length $3^{-k}$. Furthermore, the first assertion in part (ii) is clear, and the second one follows by induction on $\ell$. To prove the characterization of $\mathring{C}_\ell$, note that $a$ is the left hand endpoint of one of the intervals $I$ making up $C_\ell$ if and only if $a_k(a) \in \{0,2\}$ for $0 < k \le \ell$ and $a_k(a) = 0$ for $k > \ell$, in which case the associated right hand endpoint is $a + 3^{-\ell}$. Thus, $x \in \mathring{C}_\ell$ if and only if $a_k(x) = a_k(a)$, $0 < k \le \ell$, for such an $a$ and there is a $k > \ell$ such that $a_k(x) \ne 0$. Once one has these facts, the remainder of (ii) is easy. Moreover, as a consequence of (ii), we know that the subset $\bigcap_{\ell=1}^\infty \mathring{C}_\ell$ of $C$ is isomorphic to $A$ which, in turn, differs from $\{0,2\}^{\mathbb{N}}$ by a countable set. Hence, the uncountability of $C$ follows from that of $\{0,2\}^{\mathbb{N}}$.
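
The counting in part (i) and the digit description of the left endpoints can be checked on small cases. The sketch below is illustrative only: it assumes the usual middle-thirds construction of $C_k$ and indexes ternary digits from 1; the function names `cantor_intervals` and `ternary_digits` are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction

def cantor_intervals(k):
    """Closed intervals (left, right) whose union is C_k, listed from left to right."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(k):
        refined = []
        for left, right in intervals:
            third = (right - left) / 3
            refined.append((left, left + third))     # keep the left third
            refined.append((right - third, right))   # keep the right third
        intervals = refined
    return intervals

def ternary_digits(x, n):
    """First n ternary digits of x in [0, 1), using the terminating expansion."""
    digits = []
    for _ in range(n):
        x *= 3
        d = int(x)          # integer part, since x is non-negative
        digits.append(d)
        x -= d
    return digits

if __name__ == "__main__":
    for left, right in cantor_intervals(2):
        print(left, right, ternary_digits(left, 4))
```

Exact rational arithmetic is used so that the interval lengths and digits are not blurred by floating-point rounding.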

2.2.4: Thinking of the $v_k$'s as column vectors, let $[v_1 \cdots v_N]$ denote the $N \times N$-matrix whose $i$th column is $v_i$. It is then clear that $P(v_1, \dots, v_N)$ is the image of $Q_0$ under $[v_1 \cdots v_N]$; and, therefore, Lebesgue would assign $P(v_1, \dots, v_N)$ volume equal to the absolute value of the determinant of $[v_1 \cdots v_N]$.
In order to connect this with the classical theory, it is helpful to observe first
that

where $v_i^{\top}$ denotes the row vector representation of $v_i$ and $(v_i, v_j)_{\mathbb{R}^N}$ is the inner (i.e., "dot") product of $v_i$ and $v_j$. Next, we work by induction on $N$. Clearly there is no problem when $N = 1$. Next, assume that the two coincide for some $N \in \mathbb{Z}^+$ and let $v_1, \dots, v_{N+1} \in \mathbb{R}^{N+1}$ be given. Since there is no problem in the linearly dependent case, we will assume that $v_1, \dots, v_{N+1}$ are linearly independent. Choose an orthogonal matrix $\mathcal{O}$ so that $T_{\mathcal{O}} v_1, \dots, T_{\mathcal{O}} v_N \in H(e_1, \dots, e_N)$. Then, the Lebesgue volume of $P(v_1, \dots, v_N)$ thought of as a subset of $H(v_1, \dots, v_N)$ is the same as that of $P(T_{\mathcal{O}} v_1, \dots, T_{\mathcal{O}} v_N)$ thought of as a subset of $H(e_1, \dots, e_N)$ and is therefore equal to the square root of

Hence, all that remains to check is that

where $u = v_{N+1} - w$ and $w$ is the perpendicular projection of $v_{N+1}$ onto $H(v_1, \dots, v_N)$. But, since the determinant is a linear function of any one of its columns and because $w \in H(v_1, \dots, v_N)$, we see that

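\[ \begin{aligned}
\det\Bigl[ \bigl( (v_i, v_j)_{\mathbb{R}^{N+1}} \bigr)_{1 \le i,j \le N+1} \Bigr]
&= \bigl( \det[v_1 \cdots v_{N+1}] \bigr)^2 = \bigl( \det[v_1 \cdots v_N \; u] \bigr)^2 \\
&= \det \begin{bmatrix}
(v_1, v_1)_{\mathbb{R}^{N+1}} & \cdots & (v_1, v_N)_{\mathbb{R}^{N+1}} & 0 \\
(v_2, v_1)_{\mathbb{R}^{N+1}} & \cdots & (v_2, v_N)_{\mathbb{R}^{N+1}} & 0 \\
\vdots & & \vdots & \vdots \\
(v_N, v_1)_{\mathbb{R}^{N+1}} & \cdots & (v_N, v_N)_{\mathbb{R}^{N+1}} & 0 \\
0 & \cdots & 0 & |u|^2
\end{bmatrix}.
\end{aligned} \]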
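
The identity underlying this solution, that the determinant of the matrix of inner products equals the square of the determinant of the matrix of columns, can be verified on concrete vectors. This is a minimal sketch with exact rational arithmetic; the helper names `det`, `gram`, and `columns_to_matrix` are illustrative and not from the text.

```python
from fractions import Fraction

def det(matrix):
    """Determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)            # a zero column below the diagonal: singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result              # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def gram(vectors):
    """Matrix of inner products (v_i, v_j)."""
    return [[sum(x * y for x, y in zip(v, w)) for w in vectors] for v in vectors]

def columns_to_matrix(vectors):
    """Square matrix whose i-th column is vectors[i]."""
    n = len(vectors)
    return [[vectors[j][i] for j in range(n)] for i in range(n)]

if __name__ == "__main__":
    vs = [(1, 2, 0), (0, 1, 3), (2, 0, 1)]
    print(det(columns_to_matrix(vs)), det(gram(vs)))
```

The square root of the Gram determinant is therefore the Lebesgue volume of the parallelepiped, which is the quantity the induction above tracks.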

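3.1.9: Denote by $\mathcal{A}$ the set of all $\Gamma \subseteq E$ with the property that, for every $\epsilon > 0$ there exist a closed $F \subseteq \Gamma$ and an open $G \supseteq \Gamma$ such that $\mu(G \setminus F) < \epsilon$. It is clear that $\mathcal{A} \subseteq \overline{\mathcal{B}_E}^{\mu}$. To prove the opposite inclusion, we first check that $\mathcal{A}$ is a $\sigma$-algebra. Obviously, $\emptyset \in \mathcal{A}$, and $\Gamma \in \mathcal{A} \implies \Gamma^{\complement} \in \mathcal{A}$. Moreover, if $\{\Gamma_n\}_1^\infty \subseteq \mathcal{A}$, $\Gamma = \bigcup_1^\infty \Gamma_n$, and $\epsilon > 0$ is given, choose $\{F_n\}_1^\infty \subseteq \mathfrak{F}$ and $\{G_n\}_1^\infty \subseteq \mathfrak{G}$ so that $F_n \subseteq \Gamma_n \subseteq G_n$ and
\[ \mu(G_n \setminus F_n) < \frac{\epsilon}{2^{n+1}}, \qquad n \in \mathbb{Z}^+. \]
Next, set $G = \bigcup_1^\infty G_n$, $A = \bigcup_1^\infty F_n$, and $A_n = \bigcup_{m=1}^n F_m$, $n \in \mathbb{Z}^+$. Then $G \in \mathfrak{G}$, $\{A_n\}_1^\infty \subseteq \mathfrak{F}$, $A \subseteq \Gamma \subseteq G$, $A_n \nearrow A$, and $\mu(G \setminus A) < \frac{\epsilon}{2}$. Hence, after choosing $n \in \mathbb{Z}^+$ so that $\mu(A \setminus A_n) < \frac{\epsilon}{2}$, we see that $\Gamma \in \mathcal{A}$ and therefore that $\mathcal{A}$ is a $\sigma$-algebra.

We next check that $\mathcal{B}_E \subseteq \mathcal{A}$, and, in view of the preceding paragraph, this comes down to checking that $\mathfrak{G} \subseteq \mathcal{A}$. But if $\Gamma \in \mathfrak{G}$ and we set $F_n = \{ x \in E : \rho(x, \Gamma^{\complement}) \ge \frac{1}{n} \}$, then $F_n \nearrow \Gamma$ and so $\mu(F_n) \nearrow \mu(\Gamma)$. Hence, for given $\epsilon > 0$, we can find an $n \in \mathbb{Z}^+$ such that $\mu(\Gamma \setminus F_n) = \mu(\Gamma) - \mu(F_n) < \epsilon$; and therefore we can take $G = \Gamma$ and $F = F_n$.

Finally, suppose that $\Gamma_1 \subseteq \Gamma \subseteq \Gamma_2$, where $\Gamma_1, \Gamma_2 \in \mathcal{B}_E$ and $\mu(\Gamma_2 \setminus \Gamma_1) = 0$. To see that $\Gamma \in \mathcal{A}$, let $\epsilon > 0$ be given, and use the preceding to find $F \in \mathfrak{F}$ and $G \in \mathfrak{G}$ so that $F \subseteq \Gamma_1$, $\Gamma_2 \subseteq G$,
\[ \mu(\Gamma_1 \setminus F) < \frac{\epsilon}{2}, \quad\text{and}\quad \mu(G \setminus \Gamma_2) < \frac{\epsilon}{2}. \]
Clearly, $F \subseteq \Gamma \subseteq G$ and $\mu(G \setminus F) < \epsilon$.
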
3.1.11: To see that $\mu(\partial Q) = 0$, it suffices to check that $\mu(\{x \in \mathbb{R}^N : x_i = \alpha\}) = 0$ for every $1 \le i \le N$ and $\alpha \in \mathbb{R}$; and, by translation invariance, this comes down to checking that $\mu(\{x \in Q_0 : x_i = \alpha\}) = 0$ for $1 \le i \le N$ and $\alpha \in [0,1]$. But, if $p = \mu(\{x \in Q_0 : x_i = \alpha\})$, then, by translation invariance,
\[ \mu(\{x \in Q_0 : x_i = \beta\}) = \mu\bigl(\{(\beta - \alpha)e_i + x : x \in Q_0 \text{ and } x_i = \alpha\}\bigr) = p \]
for every $\beta \in [0,1]$. Thus,
\[ 1 = \mu(Q_0) \ge \sum_{m=0}^n \mu\bigl(\{x \in Q_0 : x_i = \tfrac{m}{n}\}\bigr) \ge (n+1)p \]
for every $n \in \mathbb{Z}^+$; and therefore $p = 0$.


Next, to see that $\mu([0, m\lambda]^N) = m^N \mu([0,\lambda]^N)$ for $m \in \mathbb{Z}^+$ and $\lambda \in (0,\infty)$,
note that

and so, by the preceding and translation invariance,

From the preceding, we now know that $\mu([0,q]^N) = q^N$ for any $q \in \mathbb{Q} \cap (0,\infty)$ ($\mathbb{Q}$ is used to denote the rational numbers). Thus, if $r$ is any element of $(0,\infty)$, then by choosing $\{q_n\}_1^\infty \subseteq \mathbb{Q}$ so that $q_n \searrow r$, we see that

Combining this with translation invariance, we conclude that $\mu(Q) = \lambda_{\mathbb{R}^N}(Q)$ for every cube $Q$ in $\mathbb{R}^N$; and, since every open set $G$ in $\mathbb{R}^N$ is the countable union of non-overlapping cubes, it follows immediately, from what we already know, that $\mu(G) = \lambda_{\mathbb{R}^N}(G)$ for all open $G$'s in $\mathbb{R}^N$. Thus, since, for every $R \in (0,\infty)$, the set of all open subsets of $(-R,R)^N$ is a $\pi$-system which generates $\mathcal{B}_{\mathbb{R}^N}[(-R,R)^N]$, we now know that $\mu$ coincides with $\lambda_{\mathbb{R}^N}$ on $\mathcal{B}_{\mathbb{R}^N}[(-R,R)^N]$ for every $R \in (0,\infty)$ and therefore on the whole of $\mathcal{B}_{\mathbb{R}^N}$.

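3.1.12: (i) For $m \in \mathbb{Z}^+$, set $A_m = \bigcap_{n \ge m} \Gamma_n$. Then $A_m \nearrow \liminf_{n\to\infty} \Gamma_n$ and $A_n \subseteq \Gamma_n$ for every $n \in \mathbb{Z}^+$. Hence
\[ \mu\Bigl(\liminf_{n\to\infty} \Gamma_n\Bigr) = \lim_{n\to\infty} \mu(A_n) \le \liminf_{n\to\infty} \mu(\Gamma_n). \]

(ii) Set $B_m = \bigcup_{n \ge m} \Gamma_n$. Then $B_m \searrow \limsup_{n\to\infty} \Gamma_n$ and $\Gamma_n \subseteq B_n$ for every $n \in \mathbb{Z}^+$. Hence, if $\mu(B_1) < \infty$, then
\[ \mu\Bigl(\limsup_{n\to\infty} \Gamma_n\Bigr) = \lim_{n\to\infty} \mu(B_n) \ge \limsup_{n\to\infty} \mu(\Gamma_n). \]

(iii) If $\sum_1^\infty \mu(\Gamma_n) < \infty$, then $\mu\bigl(\bigcup_1^\infty \Gamma_n\bigr) < \infty$ and so (cf. part (ii)):
\[ \mu\Bigl(\limsup_{n\to\infty} \Gamma_n\Bigr) = \lim_{n\to\infty} \mu(B_n) \le \lim_{m\to\infty} \sum_{n \ge m} \mu(\Gamma_n) = 0. \]
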

3.1.14: Part (i) is clear. To prove (ii), we work by induction on $n \ge 2$. Thus, set $S_n = \{1, \dots, n\}$, assume the result for some $n \ge 2$, and, after replacing $\Gamma_n$ by $\Gamma_n \cup \Gamma_{n+1}$, conclude that $-\mu(\Gamma_1 \cup \cdots \cup \Gamma_{n+1})$ is equal

L - (l ) ca r d ( F ) JL ( r F ) - L - l ) card (F) JL (r F n ( r n r n +d )
( u
0-#FCSn - 1 0-#FCSn - 1

n{ FCS,n+ln}CF+ I
L( - 1 ) ca rd (F) JL (rF ) ,
0#FCSn+ 1
where we have used

and part (i) in order to get the second line.


(iii) In this case,
\[ \mu(\Gamma) = \frac{\mathrm{card}(\Gamma)}{n!}; \]
and, for each non-empty $F \subseteq \{1, \dots, n\}$, $\mathrm{card}(\Gamma_F) = (n-m)!$, where $m = \mathrm{card}(F)$. Since there are $\binom{n}{m} = \frac{n!}{m!\,(n-m)!}$ sets $F \subseteq \{1, \dots, n\}$ with exactly $m$ elements, we now see that

3.3.15: To see that $\mu$ is a finite measure on $(E, \mathcal{B})$, it suffices to check that it is countably additive. Thus, let $\{\Gamma_n\}_1^\infty \subseteq \mathcal{B}$ be a sequence of mutually disjoint sets, and define
\[ F_n = f \sum_{m=1}^n \mathbf{1}_{\Gamma_m} \quad\text{for } n \in \mathbb{Z}^+. \]
Then $F_n \nearrow f \mathbf{1}_\Gamma$ where $\Gamma = \bigcup_1^\infty \Gamma_n$; and so, by the Monotone Convergence Theorem,
\[ \mu(\Gamma) = \int_\Gamma f \, d\nu = \lim_{n\to\infty} \int F_n \, d\nu = \lim_{n\to\infty} \sum_{m=1}^n \int_{\Gamma_m} f \, d\nu = \sum_{1}^\infty \mu(\Gamma_n). \]

To check the absolute continuity assertion, suppose that there were an $\epsilon > 0$ such that for every $n \in \mathbb{Z}^+$ there exists a $\Gamma_n \in \mathcal{B}$ for which $\nu(\Gamma_n) < 2^{-n}$ and $\mu(\Gamma_n) \ge \epsilon$. Set $\Gamma = \limsup_{n\to\infty} \Gamma_n$. Since $\mu(E) < \infty$, one would know (cf. (ii) of Exercise 3.1.12) that $\mu(\Gamma) \ge \epsilon$. On the other hand, by the Borel-Cantelli Lemma (cf. (iii) of Exercise 3.1.12), one would also have that $\nu(\Gamma)$ and therefore $\int_\Gamma f \, d\nu$ both vanish. Since $\mu(\Gamma) = \int_\Gamma f \, d\nu$, this is clearly a contradiction.

3.3.16: Clearly, by Markov's Inequality, $\mu(|f| > \lambda) \to 0$ as $\lambda \to \infty$; and therefore, by the Lebesgue Dominated Convergence Theorem,
\[ \int_{\{|f| > \lambda\}} |f| \, d\mu \longrightarrow 0 \quad\text{as } \lambda \to \infty. \]
But, by a second application of Markov's inequality, this proves (3.3.17).

To produce the required example, simply consider the function
\[ f(x) = \bigl( x (1 - \log x) \bigr)^{-1}, \quad x \in [0,1] \quad (\equiv \infty \text{ when } x = 0), \]
and note that $|\{x \in [0,1] : f(x) > \lambda\}| = x(\lambda)$, where $\lambda x(\lambda) = (1 - \log x(\lambda))^{-1}$.
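
The behaviour of this example can be explored numerically. The sketch below assumes the reading $f(x) = 1/(x(1-\log x))$ and uses the antiderivative $-\log(1-\log x)$, which one checks by differentiation; the names `x_of` and `tail_integral` are illustrative helpers, not notation from the text.

```python
import math

def f(x):
    """f(x) = 1 / (x * (1 - log x)) on (0, 1], with f(0) = +infinity."""
    return math.inf if x == 0.0 else 1.0 / (x * (1.0 - math.log(x)))

def x_of(lam):
    """Solve lam * x * (1 - log x) = 1 for x in (0, 1] by bisection (needs lam >= 1).

    The map x -> x * (1 - log x) increases from 0 to 1 on (0, 1],
    so the solution is unique.
    """
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if lam * mid * (1.0 - math.log(mid)) < 1.0:
            lo = mid
        else:
            hi = mid
    return hi

def tail_integral(a):
    """Integral of f over [a, 1] for 0 < a <= 1, namely log(1 - log a)."""
    return math.log(1.0 - math.log(a))

if __name__ == "__main__":
    for lam in (10.0, 1e3, 1e6):
        x = x_of(lam)
        print(lam, lam * x, 1.0 / (1.0 - math.log(x)))
    print(tail_integral(1e-3), tail_integral(1e-30), tail_integral(1e-300))
```

The product $\lambda\,x(\lambda)$ shrinks only logarithmically, while the integral of $f$ over $[a,1]$ grows without bound as $a \searrow 0$, so $f$ is not integrable even though $\lambda\,|\{f > \lambda\}| \to 0$.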
Finally, if $f$ is non-negative and measurable and one defines $\Gamma_n = \{f \in (n-1, n]\}$ for $n \in \mathbb{Z}^+$, then

CX)

when $\sum_1^\infty \mu(f > n) < \infty$; and, when $\mu(E) < \infty$,
\[ \begin{aligned}
\int_{\{f \le N\}} f \, d\mu = \sum_{1}^N \int_{\Gamma_n} f \, d\mu
&\ge \sum_1^N (n-1)\mu(\Gamma_n) \ge \sum_1^N n\mu(\Gamma_n) - \mu(E) \\
&\ge \sum_1^N \mu(f > n) - 2\mu(E)
\end{aligned} \]
for every $N \in \mathbb{Z}^+$, from which the required result follows after one lets $N \nearrow \infty$.

3.3.19: Without loss in generality, we assume that all the $g_n$'s are $\mathbb{R}$-valued and that, in part (i), $f_n \le g_n$, and, in part (ii), $|f_n| \le g_n$, everywhere. To prove part (i), first choose a subsequence $\{f_{n'}\}$ of $\{f_n\}$ so that
\[ \lim_{n' \to \infty} \int f_{n'} \, d\mu = \limsup_{n \to \infty} \int f_n \, d\mu, \]
and second choose a subsequence $\{g_{n''}\}$ of $\{g_{n'}\}$ so that $g_{n''} \to g$ (a.e., $\mu$). Then, by Fatou's Lemma applied to $h_n \equiv g_n - f_n \ge 0$,
\[ \begin{aligned}
\int g \, d\mu - \limsup_{n\to\infty} \int f_n \, d\mu
&= \lim_{n''\to\infty} \int h_{n''} \, d\mu \ge \int \liminf_{n''\to\infty} h_{n''} \, d\mu \\
&= \int \Bigl[ g - \limsup_{n''\to\infty} f_{n''} \Bigr] d\mu \ge \int g \, d\mu - \int \limsup_{n\to\infty} f_n \, d\mu.
\end{aligned} \]

Given the preceding, the proof of part (ii) in the case of almost everywhere convergence is exactly the same as the derivation of Lebesgue's Dominated Convergence Theorem from Fatou's Lemma. Namely, one sets $h_n = |f_n - f|$ and observes that $h_n \le g_n + g$. The case of convergence in measure is then handled in precisely the same way as it was in the proof of Theorem 3.3.11.

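3.3.20: (i) If $\mathcal{K}$ is uniformly $\mu$-absolutely continuous and $M \equiv \sup_{f \in \mathcal{K}} \|f\|_{L^1(\mu)} < \infty$, choose, for a given $\epsilon > 0$, $\delta > 0$ so that $\sup_{f \in \mathcal{K}} \int_\Gamma |f| \, d\mu < \epsilon$ whenever $\Gamma \in \mathcal{B}$ satisfies $\mu(\Gamma) < \delta$ and choose $R \in (0,\infty)$ so that $\frac{M}{R} < \delta$. Then, by Markov's Inequality, $\sup_{f \in \mathcal{K}} \mu(|f| > R) < \delta$ and so $\sup_{f \in \mathcal{K}} \int_{\{|f| > R\}} |f| \, d\mu < \epsilon$.

Next, suppose that $\mathcal{K}$ is uniformly $\mu$-integrable and define
\[ A(R) \equiv \sup_{f \in \mathcal{K}} \int_{\{|f| > R\}} |f| \, d\mu \quad\text{for } R \in (0,\infty). \]
Clearly
\[ \sup_{f \in \mathcal{K}} \int_\Gamma |f| \, d\mu \le R\mu(\Gamma) + A(R) \]
for any $\Gamma \in \mathcal{B}$ and $R \in (0,\infty)$. Hence, if, for given $\epsilon > 0$, we choose $R \in (0,\infty)$ so that $A(R) < \frac{\epsilon}{2}$, then $\int_\Gamma |f| \, d\mu < \epsilon$ for all $f \in \mathcal{K}$ and $\Gamma \in \mathcal{B}$ with $\mu(\Gamma) < \frac{\epsilon}{2R}$; and so $\mathcal{K}$ is uniformly $\mu$-absolutely continuous. In addition, when $\mu(E) < \infty$, by choosing $R \in (0,\infty)$ so that $A(R) \le 1$, we see that $\|f\|_{L^1(\mu)} \le R\mu(E) + 1 < \infty$ for all $f \in \mathcal{K}$.

(ii) Note that, for any $R \in (0,\infty)$,
\[ \int_{\{|f| > R\}} |f| \, d\mu \le R^{-\delta} \int_{\{|f| > R\}} |f|^{1+\delta} \, d\mu \le R^{-\delta} \int |f|^{1+\delta} \, d\mu. \]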
(iii) Suppose that fn � f in L 1 (JL). Clearly f E L 1 (JL) and supn EZ+ ll !n i i L l (J.L )
< In addition, for given E > 0, choose m E z + so that ll !n - ! I I L l (J.L ) < �
00 .

n
for > m ; and, using Exercise 3.3. 15, choose 8 > 0 so that
n < m 1fr I fn i dtt < �
1 <max

for all r E B with J.t ( r ) < b , and note that fr I l l d�-t < � whenever J.t ( r ) < 8.
Hence, for this choice of 8 > 0,

for all r E B with J.t ( r ) < b.


Conversely, if �-t(E) < oo , fn f in J.t-measure, and {fn }� is uniformly
------+

I-t-integrable, note that f is J.t-integrable, and ( cf. (i) above and Exercise 3.3. 15)
choose, for a given t > 0, a b > 0 so that

for all n E z + . Hence , if rn = { l fn - f l > �-'tE) } and m is chosen so that


J.t ( rn) < b for all n > m , then

(iv) Tightness enables one to reduce each of these assertions to the finite
measure situation, in which case they have already been established.

3.3.22: Clearly the general case follows immediately from the case in which $\mu(E) \vee \nu(E) < \infty$; and therefore we will assume that $\mu$ and $\nu$ are both finite. Thus, by Exercise 3.1.8, all that we have to do is check that $\mu(G) = \nu(G)$ for every open subset $G$ of $E$.

Let $G$ be an open set in $E$, and define
\[ \varphi_n(x) = \left( \frac{\mathrm{dist}(x, G^{\complement})}{1 + \mathrm{dist}(x, G^{\complement})} \right)^{\frac{1}{n}}, \quad x \in E \text{ and } n \in \mathbb{Z}^+, \]
where $\mathrm{dist}(x, \Gamma) \equiv \inf\{ \rho(x,y) : y \in \Gamma \}$ is the $\rho$-distance from $x$ to $\Gamma$. One then has that $\varphi_n$ is uniformly $\rho$-continuous for each $n \in \mathbb{Z}^+$ and that $0 \le \varphi_n(x) \nearrow \mathbf{1}_G(x)$ as $n \to \infty$ for each $x \in E$. Thus, by the Monotone Convergence Theorem,
\[ \mu(G) = \lim_{n\to\infty} \int \varphi_n \, d\mu = \lim_{n\to\infty} \int \varphi_n \, d\nu = \nu(G). \]


4.1.13: Without loss in generality, we will assume that $a = 0$ and that $b = 1$.
To prove the first assertion, define, for $n \ge 2$,
( � X) ( t, x ) E ( kn 1 , � ) X E
cpn ( t, X ) = {� '
if
if ( t, x ) E ( 1 - � , 1 ) X E.
Clearly each $\varphi_n$ is measurable on $\bigl( (0,1) \times E, \mathcal{B}_{(0,1)} \times \mathcal{B} \bigr)$, and $f$ is the point-wise limit of $\{\varphi_n\}_1^\infty$. Thus, $f$ is itself measurable.

Turning to the second assertion, note that, for each $t \in (0,1)$, $f'(t, \cdot)$ is the point-wise limit of functions which are measurable on $(E, \mathcal{B})$. Thus, since $f'(\cdot, x)$ is continuous for each $x \in E$, the measurability of $f'$ follows from the first part.

To prove the last assertion, let $t \in (0,1)$ be given, and suppose that $\{t_n\}_1^\infty \subseteq (0,1) \setminus \{t\}$ is a sequence which tends to $t$. Set $I_n = [t, t_n]$ or $[t_n, t]$ according to whether $t_n > t$ or $t_n < t$, and define
\[ \varphi_n(x) = \frac{f(t_n, x) - f(t, x)}{t_n - t}, \quad n \in \mathbb{Z}^+ \text{ and } x \in E. \]
Then $\varphi_n(x) \to f'(t,x)$ and
\[ |\varphi_n(x)| = |I_n|^{-1} \left| (R)\int_{I_n} f'(s,x) \, ds \right| \le g(x) \]
for every $x \in E$. Hence, by Lebesgue's Dominated Convergence Theorem,
\[ \frac{\int_E f(t_n, x) \, \mu(dx) - \int_E f(t,x) \, \mu(dx)}{t_n - t} = \int_E \varphi_n(x) \, \mu(dx) \longrightarrow \int_E f'(t,x) \, \mu(dx); \]
from which we see that $t \in (0,1) \longmapsto \int_E f(t,x) \, \mu(dx)$ is not only differentiable at $t$ but also that its derivative is equal to the $\mu$-integral of $f'(t, \cdot)$. Finally, given the preceding, the continuity of the derivative is simply another application of Lebesgue's Dominated Convergence Theorem.

5.2.4: (i) First, let $\Gamma$ be a non-empty open subset of $S^{N-1}$, and set $G = \{ x \in B(0,1) \setminus \{0\} : \Phi(x) \in \Gamma \}$. Then, because $\Phi$ maps $B(0,1) \setminus \{0\}$ continuously onto $S^{N-1}$, $G$ is a non-empty open subset of $\mathbb{R}^N$; and therefore $\lambda_{S^{N-1}}(\Gamma) = N \lambda_{\mathbb{R}^N}(G) > 0$.

Next, let $\mathcal{O}$ be an orthogonal matrix and denote by $R$ the corresponding rotation $T_{\mathcal{O}}$ on $\mathbb{R}^N$. Then $\Phi \circ R = R \circ \Phi$ on $\mathbb{R}^N \setminus \{0\}$, $\mathbf{1}_{B(0,1)} \circ R = \mathbf{1}_{B(0,1)}$, and $R_* \lambda_{\mathbb{R}^N} = \lambda_{\mathbb{R}^N}$; and therefore the asserted invariance of $\lambda_{S^{N-1}}$ follows directly from its definition. To prove the asserted orthogonality relations, define for each $\xi \in \mathbb{R}^N \setminus \{0\}$ the rotation $R_\xi : \mathbb{R}^N \to \mathbb{R}^N$ by
\[ R_\xi x = x - 2 \frac{(\xi, x)_{\mathbb{R}^N}}{|\xi|^2} \xi, \quad x \in \mathbb{R}^N. \]
Then, by rotation invariance,
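Similarly, for $\eta \perp \xi$,
\[ \begin{aligned}
\int_{S^{N-1}} (\xi, \omega)_{\mathbb{R}^N} (\eta, \omega)_{\mathbb{R}^N} \, \lambda_{S^{N-1}}(d\omega)
&= \int_{S^{N-1}} (\xi, R_\xi \omega)_{\mathbb{R}^N} (\eta, R_\xi \omega)_{\mathbb{R}^N} \, \lambda_{S^{N-1}}(d\omega) \\
&= - \int_{S^{N-1}} (\xi, \omega)_{\mathbb{R}^N} (\eta, \omega)_{\mathbb{R}^N} \, \lambda_{S^{N-1}}(d\omega);
\end{aligned} \]
and therefore, for any $\eta \in \mathbb{R}^N$,

Finally, if $\{e_1, \dots, e_N\}$ is the standard orthonormal basis for $\mathbb{R}^N$, then, again by rotation invariance,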

and so

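(ii) Using $R_\theta$ to denote the rotation determined by the matrix $\mathcal{O}_\theta$ described in the hint, we see, by rotation invariance and Tonelli's Theorem, that
\[ \int_{S^1} f \, d\nu = \frac{1}{2\pi} \int_{S^1} \left( \int_{[0,2\pi]} f \circ R_\theta(\omega) \, d\theta \right) \nu(d\omega) \]
for any non-negative, Borel measurable $f$ on $S^1$. Next, for fixed $\omega \in S^1$, choose $\eta_\omega \in [0, 2\pi)$ so that
\[ \omega = (\cos \eta_\omega, \sin \eta_\omega), \]
and observe that
\[ \begin{aligned}
\int_{[0,2\pi]} f \circ R_\theta(\omega) \, d\theta
&= \int_{[0,2\pi]} f\bigl( \cos(\eta_\omega + \theta), \sin(\eta_\omega + \theta) \bigr) \, d\theta \\
&= \int_{[\eta_\omega, 2\pi]} f(\cos\theta, \sin\theta) \, d\theta + \int_{[0, \eta_\omega]} f\bigl( \cos(2\pi + \theta), \sin(2\pi + \theta) \bigr) \, d\theta \\
&= \int_{[0,2\pi]} f(\cos\theta, \sin\theta) \, d\theta = \int_{S^1} f \, d\mu.
\end{aligned} \]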
1[0, 2 1r] 1S

Combined with the preceding, this now shows that

(iii) Let $f$ be a non-negative, continuous function on $\mathbb{R}^N$ with compact support. Then, by Tonelli's Theorem and Theorem 5.2.2,

J 2 N
p ) f (S (p,w) ) dp Ag N -1 (dw) dr
(1 - 2 -1
( O,CX) ) [- 1 , 1) x SN - 1

1 - p2 ) � F( p) dp
J (
[- 1 , 1 )
-1

where

F(p) J

dy
Hence,

p
(1 -2
Jx S N - 1 ) � f(S(p, w) ) dp A8N - 1 (8w) dr
-1

[- 1 , 1)

(R)
[-8( y ) , 8 ( y)]

where, for each y E �N \ { 0 } , 1/;y (p) == ( PIYI


1 - p2 ) 2
, p E [ -8( y ), 8( y )] , and 8( y ) E
(0 , 1) is chosen so that

== 0 for 8( y ) < < 1 . p



Since, by part (i) of Exercise 5.1.6,
(R) J f(y , 'l/Jy (p)) d'l/Jy (P) = (R) J f(y ,a)da
[- 8 ( y ) , 8(y)] ['l/ly { -8( y )) , 'l/ly (8(y))]
= r J ( y ,
JJR.l e7) de7
for each y E � N \ {0} , we now see that

{
J(o,oo) TN dr
x s N -1
[- 1 , 1 )

= r 1 j ( z)dz r
}JRN + ( r j ( rw)>. sN (dw) ) dr ;
J(o,oo) JsN f = TN
from which the desired result follows easily by taking to be of the form
1J( I z l ) g ( 1 � 1 )
·
5.2.6: (iv) Define $\psi$ as in the hint and observe that

( )2
1/J s
1
==
s+ ( s 22 a a ) 2
------
+ 4 ,B
1

sE�
and therefore that
dd
-'lj;(s) 2
s
1
== ­
1
2a
Thus, by Exercise 5. 1 .6, for any R E (0, oo ) :
e2;13 j C� exp
2
[ -oh - � ] dt
[ '1/J( - R ) � , 'ljJ ( R ) � ]
1 1

2 /3 2
[- a '¢( s) - 1 ] d1j;(s)
== (R) 1[ - R,R]
exp �
'lj;( s) 2

=
2a 1 e-
_!_ (R) 8 2 2(s a ) 2 ds 1
[ - R, R]
1+
s
+ 4 ,B
1
==
[- R,R]
e - 8 2 ds ;
and so
t - ! [-oh - /3 2 ] dt e- 2 a /3 1 e - 8 2 ds ,
J t
[ 'l/J ( - R ) � , '1/J ( R ) � )
1 1
a exp =
[ - R ' R]

and, after differentiating with respect to $\beta$ and applying Exercise 4.1.13, one
also has that
- 2 2 ] e- 2
J t 2 [- a t - -t dt ,8 1 e - 8 2 ds .
,3 o: f3 exp
[- R, R]
==
1 1
['l/J( - R ) � , '1/J ( R ) � )
3

Clearly the desired results follow when one lets $R \nearrow \infty$.
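
As a sanity check on the limit, one can compare quadrature with closed forms. The sketch below assumes that the desired results are the classical identities $\int_{(0,\infty)} t^{-1/2} \exp[-\alpha^2 t - \beta^2/t]\,dt = \frac{\sqrt{\pi}}{\alpha} e^{-2\alpha\beta}$ and $\int_{(0,\infty)} t^{-3/2} \exp[-\alpha^2 t - \beta^2/t]\,dt = \frac{\sqrt{\pi}}{\beta} e^{-2\alpha\beta}$ for $\alpha, \beta > 0$; that reading, the truncation range, and the helper names are assumptions made for illustration.

```python
import math

def integrate_log_scale(integrand, u_min=-60.0, u_max=12.0, steps=20000):
    """Midpoint rule for an integral over (0, infinity) after substituting t = exp(u).

    The truncation to [u_min, u_max] is adequate for the moderate alpha, beta
    used here, because the integrands decay doubly exponentially in u.
    """
    h = (u_max - u_min) / steps
    total = 0.0
    for i in range(steps):
        t = math.exp(u_min + (i + 0.5) * h)
        total += integrand(t) * t          # dt = t du
    return total * h

def integral_power(alpha, beta, power):
    """Integral of t**power * exp(-alpha**2 * t - beta**2 / t) over (0, infinity)."""
    return integrate_log_scale(
        lambda t: t ** power * math.exp(-alpha ** 2 * t - beta ** 2 / t)
    )

def closed_form_half(alpha, beta):
    """sqrt(pi)/alpha * exp(-2*alpha*beta), the value for the power -1/2."""
    return math.sqrt(math.pi) / alpha * math.exp(-2.0 * alpha * beta)

def closed_form_three_halves(alpha, beta):
    """sqrt(pi)/beta * exp(-2*alpha*beta), the value for the power -3/2."""
    return math.sqrt(math.pi) / beta * math.exp(-2.0 * alpha * beta)

if __name__ == "__main__":
    print(integral_power(1.0, 1.0, -0.5), closed_form_half(1.0, 1.0))
    print(integral_power(0.7, 1.3, -1.5), closed_form_three_halves(0.7, 1.3))
```

Integrating in the variable $u = \log t$ keeps the rule accurate near $t = 0$, where the factor $e^{-\beta^2/t}$ suppresses the power singularity.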



5.3.21: Choose $r > 0$ and $F : B_{\mathbb{R}^N}(p, r) \to \mathbb{R}$ so that (5.3.7) holds. Without loss in generality, we will assume that $r$ is taken so that $F$ has bounded second order derivatives on $B_{\mathbb{R}^N}(p, r)$. In particular, since $F$ vanishes on $B_{\mathbb{R}^N}(p, r) \cap M$, there is a $C < \infty$ with the property that $|F(y)| \le C \, \mathrm{dist}(y, M)$ for all $y \in B_{\mathbb{R}^N}(p, \frac{r}{2})$. Hence,
\[ \limsup_{\xi \to 0} \frac{\mathrm{dist}(p + \xi v, M)}{\xi^2} < \infty \implies \limsup_{\xi \to 0} \frac{|F(p + \xi v) - F(p)|}{\xi^2} < \infty \implies (v, \nabla F(p))_{\mathbb{R}^N} = 0 \implies v \in T_p(M). \]

Conversely, if $v \in T_p(M)$ and $\gamma : (-\epsilon, \epsilon) \to M$ is chosen accordingly, then $\mathrm{dist}(p + \xi v, M) \le |p + \xi v - \gamma(\xi)| = O(\xi^2)$ as $\xi \to 0$.

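5.3.24: (i) By Tonelli's Theorem,
\[ \Gamma(\alpha)\Gamma(\beta) = \iint_{(0,\infty)^2} s^{\alpha-1} t^{\beta-1} e^{-s-t} \, ds \times dt. \]
Now let $\Phi$ be the mapping suggested in the hint, and observe that $\Phi$ maps $(0,\infty) \times (0,1)$ diffeomorphically onto $(0,\infty)^2$ and that $\delta\Phi(u,v) = u$. Hence,
\[ \begin{aligned}
\Gamma(\alpha)\Gamma(\beta) &= \iint_{(0,\infty)^2} s^{\alpha-1} t^{\beta-1} e^{-(s+t)} \, ds \, dt = \iint_{(0,\infty) \times (0,1)} (uv)^{\alpha-1} \bigl( u(1-v) \bigr)^{\beta-1} e^{-u} \, u \, du \, dv \\
&= \int_{(0,\infty)} u^{\alpha+\beta-1} e^{-u} \, du \int_{(0,1)} v^{\alpha-1} (1-v)^{\beta-1} \, dv = \Gamma(\alpha+\beta) B(\alpha,\beta).
\end{aligned} \]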
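
The identity $\Gamma(\alpha)\Gamma(\beta) = \Gamma(\alpha+\beta)B(\alpha,\beta)$ is easy to test numerically. This is a rough sketch: the midpoint rule used for $B(\alpha,\beta)$ is only reliable for $\alpha, \beta \ge 1$, where the integrand is bounded, and the names `beta_numeric` and `gamma_identity_gap` are illustrative.

```python
import math

def beta_numeric(a, b, steps=20000):
    """Midpoint-rule approximation of the integral of v**(a-1) * (1-v)**(b-1) over (0, 1)."""
    h = 1.0 / steps
    return h * sum(
        ((i + 0.5) * h) ** (a - 1) * (1.0 - (i + 0.5) * h) ** (b - 1)
        for i in range(steps)
    )

def gamma_identity_gap(a, b):
    """Gamma(a)*Gamma(b) - Gamma(a+b)*B(a, b); should be close to zero."""
    return math.gamma(a) * math.gamma(b) - math.gamma(a + b) * beta_numeric(a, b)

if __name__ == "__main__":
    print(beta_numeric(2, 3), 1.0 / 12.0)
    print(gamma_identity_gap(2.5, 3.5))
```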


(ii) By Theorem 5.2.2,

Moreover, if $\Phi$ is the map in the hint, then, by Exercise 4.1.6, the integral on the right is equal to

Hence, since (cf. (iii) in Exercise 4.2.16) $\omega_{N-1} = \frac{2\pi^{N/2}}{\Gamma(N/2)}$, the desired result follows.

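A second approach to the same computation is to first observe that
\[ \frac{\Gamma(\lambda)}{(1 + |x|^2)^{\lambda}} = \int_{(0,\infty)} t^{\lambda - 1} \exp\bigl[ -t(1 + |x|^2) \bigr] \, dt \]
and therefore that
\[ \begin{aligned}
\Gamma(\lambda) \int_{\mathbb{R}^N} (1 + |x|^2)^{-\lambda} \, dx
&= \int_{(0,\infty)} t^{\lambda-1} e^{-t} \left( \int_{\mathbb{R}^N} e^{-t|x|^2} \, dx \right) dt \\
&= \int_{(0,\infty)} t^{\lambda-1} \left( \frac{\pi}{t} \right)^{\frac{N}{2}} e^{-t} \, dt = \pi^{\frac{N}{2}} \, \Gamma\bigl( \lambda - \tfrac{N}{2} \bigr).
\end{aligned} \]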
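
The resulting formula, $\int_{\mathbb{R}^N} (1+|x|^2)^{-\lambda}\,dx = \pi^{N/2}\,\Gamma(\lambda - \frac{N}{2})/\Gamma(\lambda)$ for $\lambda > \frac{N}{2}$, can be spot-checked in low dimensions. The sketch below handles $N = 1$ by the substitution $x = \tan\theta$ (an illustrative choice, not the method of the text) and compares $N = 2$ with the elementary polar-coordinate value $\pi/(\lambda - 1)$; the function names are hypothetical.

```python
import math

def integral_dimension_one(lam, steps=20000):
    """Integral over R of (1 + x**2)**(-lam), computed as the integral of
    cos(theta)**(2*lam - 2) over (-pi/2, pi/2) by the midpoint rule (lam >= 1)."""
    h = math.pi / steps
    return h * sum(
        math.cos(-math.pi / 2 + (i + 0.5) * h) ** (2 * lam - 2)
        for i in range(steps)
    )

def closed_form(lam, dim):
    """pi**(dim/2) * Gamma(lam - dim/2) / Gamma(lam), valid for lam > dim/2."""
    return math.pi ** (dim / 2) * math.gamma(lam - dim / 2) / math.gamma(lam)

if __name__ == "__main__":
    for lam in (1.0, 1.7, 2.0):
        print(lam, integral_dimension_one(lam), closed_form(lam, 1))
    print(closed_form(3.0, 2), math.pi / 2.0)
```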

6.2.22: (i) Let $1 \le p < q < \infty$ and set $\alpha = \frac{q}{p}$. Then, since $\mu(E) = 1$, Jensen's
inequality applies and yields

(ii) First note that, by Hölder's inequality or direct computation,

for any measure $\mu$ and any $1 \le p < q < \infty$. Second, note that, for the particular
measure under consideration, $\|f\|_{L^\infty(\mu)} \le \|f\|_{L^p(\mu)}$ for any $p \in [1,\infty]$.
(iii) Since there is nothing to do when $\mu(E) = 0$, we will assume that $\mu(E) \in (0,\infty]$. To prove the first assertion, suppose that $0 \le M < \|f\|_{L^\infty(\mu)}$ and set $\Gamma = \{|f| > M\}$. Then $\mu(\Gamma) > 0$, and so
\[ \liminf_{p \to \infty} \|f\|_{L^p(\mu)} \ge \liminf_{p \to \infty} M \mu(\Gamma)^{\frac{1}{p}} \ge M. \]
In other words, $\liminf_{p\to\infty} \|f\|_{L^p(\mu)} \ge M$ for every $M \in [0, \|f\|_{L^\infty(\mu)})$, and so $\liminf_{p\to\infty} \|f\|_{L^p(\mu)} \ge \|f\|_{L^\infty(\mu)}$. To prove the second assertion, assume that $\mu(E) < \infty$ or $\|f\|_{L^1(\mu)} < \infty$, and note that
\[ \|f\|_{L^p(\mu)} \le \Bigl( \|f\|_{L^\infty(\mu)} \, \mu(E)^{\frac{1}{p}} \Bigr) \wedge \Bigl( \|f\|_{L^\infty(\mu)}^{1 - \frac{1}{p}} \, \|f\|_{L^1(\mu)}^{\frac{1}{p}} \Bigr). \]
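
The monotonicity in (i), the convergence in (iii), and the last bound can all be observed on a finite probability space. The sketch below is purely illustrative: the three-point space, the sample values, and the helper names `lp_norm` and `sup_norm` are assumptions chosen for the example.

```python
def lp_norm(values, weights, p):
    """L^p norm of a function on a finite measure space with the given point masses."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)

def sup_norm(values, weights):
    """Essential supremum: the largest |value| carried by a point of positive mass."""
    return max(abs(v) for v, w in zip(values, weights) if w > 0)

if __name__ == "__main__":
    values = [1.0, -3.0, 2.0]
    weights = [0.5, 0.3, 0.2]       # a probability measure on three points
    for p in (1, 2, 4, 8, 50, 200):
        print(p, lp_norm(values, weights, p))
    print("sup norm:", sup_norm(values, weights))
```

On this example the norms increase with $p$ and approach the sup norm $3$ from below, in line with parts (i) and (iii).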


(iv) Suppose that $(E_1, \mathcal{B}_1)$ is a measurable space and that $(E_2, \mathcal{B}_2, \mu_2)$ is a $\sigma$-finite measure space. Choose $\{E_{2,n}\}_1^\infty \subseteq \mathcal{B}_2$ so that $E_{2,n} \nearrow E_2$ and $\mu_2(E_{2,n}) < \infty$, $n \in \mathbb{Z}^+$. Given a measurable $f$ on $(E_1 \times E_2, \mathcal{B}_1 \times \mathcal{B}_2)$, observe that
\[ \begin{aligned}
\|f(x_1, \cdot_2)\|_{L^\infty(\mu_2)} &= \lim_{n\to\infty} \|f(x_1, \cdot_2) \mathbf{1}_{E_{2,n}}(\cdot_2)\|_{L^\infty(\mu_2)} \\
&= \lim_{n\to\infty} \lim_{p\to\infty} \|f(x_1, \cdot_2) \mathbf{1}_{E_{2,n}}(\cdot_2)\|_{L^p(\mu_2)}.
\end{aligned} \]
Thus, since, by Lemma 4.1.2,

is $\mathcal{B}_1$-measurable for each $n \in \mathbb{Z}^+$ and $p \in [1,\infty)$, so is
7.2.15: Suppose not. Then there exists an $\epsilon > 0$ and a sequence $\{A_n\}_1^\infty \subseteq \mathcal{B}$ such that $\mu(A_n) \ge \epsilon$ but $\nu(A_n) < 2^{-n}$. Hence, if $B = \limsup_{n\to\infty} A_n$, then (cf. part (ii) of Exercise 3.1.12) $\mu(B) \ge \epsilon$ while, by the Borel-Cantelli Lemma (cf. part (iii) of Exercise 3.1.12), $\nu(B) = 0$.

7.2.16: (i) Assume that $\{x\} \in \mathcal{B}$ for every $x \in E$ and that $(E, \mathcal{B}, \mu)$ is $\sigma$-finite. In proving that the set $A$ of atoms is countable, it is easy to reduce to the case when $\mu$ is finite. Thus, we will assume that $\mu(E) < \infty$. Next, set $A_n = \{x \in E : \mu(\{x\}) \ge \frac{1}{n}\}$, and observe that $\mathrm{card}(A_n) \le n\mu(E)$. Hence, since $A$ is the union over $n \in \mathbb{Z}^+$ of the $A_n$'s, it follows that $A$ must be countable. Moreover, when $\mu$ is purely atomic, $\mu(A^{\complement}) = 0$ and therefore
\[ \mu(\Gamma) = \mu(\Gamma \cap A) = \sum_{x \in A} \mu(\Gamma \cap \{x\}) = \sum_{x \in \Gamma \cap A} \mu(\{x\}). \]

Conversely, if there is a countable $S$ such that
\[ \mu(\Gamma) = \sum_{x \in \Gamma \cap S} \mu(\{x\}), \quad \Gamma \in \mathcal{B}, \]
then $A = \{x \in S : \mu(\{x\}) > 0\}$ is the set of atoms, and it is clear that $\Gamma \cap A = \emptyset \implies \mu(\Gamma) = 0$.

(ii) Clearly,
\[ \mu_\psi(\{x\}) = \lim_{\epsilon \searrow 0} \mu_\psi\bigl( (x - \epsilon, x] \bigr) = \lim_{\epsilon \searrow 0} \bigl( \psi(x) - \psi(x - \epsilon) \bigr) = \psi(x) - \psi(x-). \]
Hence, $x$ is an atom if and only if $\psi(x) - \psi(x-) > 0$, in which case $\mu_\psi(\{x\}) = \psi(x) - \psi(x-)$. In particular, $\mu_\psi$ is non-atomic if and only if $\psi$ is continuous. On the other hand, if $\mu_\psi$ is purely atomic and $A$ is its set of atoms, then $A = D(\psi)$ and
\[ \psi(x) - \psi(-\infty) = \sum_{t \in D(\psi) \cap (-\infty, x]} \mu_\psi(\{t\}) = \sum_{t \in D(\psi) \cap (-\infty, x]} \bigl( \psi(t) - \psi(t-) \bigr) = \psi_d(x). \]

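Conversely, if $\psi$ is purely discontinuous, then
\[ \mu_\psi\bigl( (-\infty, x] \bigr) = \sum_{t \in D(\psi) \cap (-\infty, x]} \bigl( \psi(t) - \psi(t-) \bigr) = \nu\bigl( (-\infty, x] \bigr), \quad x \in \mathbb{R}, \]
where $\nu$ is the purely atomic measure given by
\[ \nu(\Gamma) = \sum_{x \in D(\psi) \cap \Gamma} \bigl( \psi(x) - \psi(x-) \bigr), \quad \Gamma \in \mathcal{B}_{\mathbb{R}}. \]
Hence, since $\mathcal{C} = \{ (-\infty, x] : x \in \mathbb{R} \}$ is a $\pi$-system which generates $\mathcal{B}_{\mathbb{R}}$ and $\mu_\psi \restriction \mathcal{C} \cup \{\mathbb{R}\} = \nu \restriction \mathcal{C} \cup \{\mathbb{R}\}$, it follows that $\mu_\psi = \nu$ on $\mathcal{B}_{\mathbb{R}}$.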
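(iii) First suppose that $\mu_\psi \ll \lambda_{\mathbb{R}}$. Clearly $\mu_\psi$ is then non-atomic and therefore $\mu_\psi\bigl( (a,b) \bigr) = \psi(b) - \psi(a)$ for all $a < b$. Next, set $f = \frac{d\mu_\psi}{d\lambda_{\mathbb{R}}}$. Given $\epsilon > 0$, choose $R > 0$ so that $\int_{\{f > R\}} f \, dx < \frac{\epsilon}{2}$ and take $\delta = \frac{\epsilon}{2R}$. If $\{(a_m, b_m)\}$ is a sequence of mutually disjoint intervals with $\sum (b_m - a_m) < \delta$, then
\[ \begin{aligned}
\sum \bigl( \psi(b_m) - \psi(a_m) \bigr) &\le \sum \left( \int_{(a_m, b_m) \cap \{f \le R\}} f \, dx + \int_{(a_m, b_m) \cap \{f > R\}} f \, dx \right) \\
&\le R \sum (b_m - a_m) + \int_{\{f > R\}} f \, dx < \epsilon,
\end{aligned} \]
where, in the passage to the last line, we have used the fact that the intervals are mutually disjoint. Conversely, suppose that $\psi$ is absolutely continuous. Given $\Gamma \in \mathcal{B}_{\mathbb{R}}$ with $\lambda_{\mathbb{R}}(\Gamma) = 0$, choose, for a given $\epsilon > 0$, $\delta > 0$ so that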

Next, choose an open set $G \supseteq \Gamma$ so that $\lambda_{\mathbb{R}}(G) < \delta$, and find (cf. Lemma 2.1.10) a sequence $\{(a_m, b_m)\}$ of mutually disjoint open intervals so that $G = \bigcup (a_m, b_m)$. Then $\sum (b_m - a_m) = \lambda_{\mathbb{R}}(G) < \delta$ and so

Hence $\mu_\psi \ll \lambda_{\mathbb{R}}$.


(iv) Set $\nu_1 = \mu_{\psi_1}$ and $\nu_2 = \mu_{\psi_2 + \psi_3}$. Clearly, $\mu_\psi = \nu_1 + \nu_2$, $\nu_1 \ll \lambda_{\mathbb{R}}$, $\nu_2 \perp \lambda_{\mathbb{R}}$, and therefore $\nu_1 = \mu_{\psi,a}$ while $\nu_2 = \mu_{\psi,\sigma}$. In particular, this means that $\psi_1 = \psi_a$ and that $\psi_2 + \psi_3 = \psi_\sigma$. Finally, because $\psi_2$ is continuous and $\psi_3$ is purely discontinuous, this second equality means that $\psi_{\sigma,d} = \psi_{3,d} = \psi_3$, from which $\psi_2 = \psi_{\sigma,c}$ is trivial.
2 - k note first that 'lfJk+ 1 JR. \ ck
7. 2 . 1 7: To prove that
'l/Jk ck
r�\ and that
l 'l/Jk + 1 - 'l/Jk l u,R
= 6 , --
r =

Second, check that ('l/Jk x) (�) k x x 3- k ]


== for E [0, and that
k + 1
(�)- k - 1 x 3-k k1- 1 ] k 1
if X E [0,
'l/Jk+ 1 (x) 2
= (3- - , 23- - )
if X E

Finally, combine these two to complete the computation.


Given the preceding, it is obvious that $\psi_k \to \psi$ uniformly on $\mathbb{R}$ and, therefore, that $\psi$ is a continuous, non-decreasing function with $\psi(-\infty) = \psi(0) = 0$ and $\psi(1) = \psi(\infty) = 1$. Moreover, $\psi$ is constant on each of the intervals $[a_{k,j}, b_{k,j}]$, $k \in \mathbb{N}$ and $0 \le j < 2^k$. Hence, $\mu_\psi(\mathbb{R} \setminus C_k) = 0$ for every $k$, and so $\mu_\psi(\mathbb{R} \setminus C) = 0$ also.
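
The uniform estimate behind this convergence can be checked exactly for small $k$. The sketch below assumes the standard piecewise-linear approximations to the Cantor-Lebesgue function, with $\psi_0(x) = x$ on $[0,1]$ and $\psi_k(x) = (\frac{3}{2})^k x$ on $[0, 3^{-k}]$, and it reads the estimate as $\|\psi_{k+1} - \psi_k\|_u = \frac{2^{-k}}{6}$; both the recursion and that reading are assumptions, and the function names are illustrative.

```python
from fractions import Fraction

def psi(k, x):
    """k-th piecewise-linear approximation to the Cantor-Lebesgue function on [0, 1]."""
    x = Fraction(x)
    if k == 0:
        return x
    if x <= Fraction(1, 3):
        return psi(k - 1, 3 * x) / 2                      # left third: rescaled copy
    if x < Fraction(2, 3):
        return Fraction(1, 2)                             # middle third: constant
    return Fraction(1, 2) + psi(k - 1, 3 * x - 2) / 2     # right third: shifted copy

def sup_difference(k):
    """Maximum of |psi_{k+1} - psi_k| over the points j / 3**(k+1).

    Both functions are linear between consecutive such points, so this
    maximum is the uniform distance on [0, 1].
    """
    n = 3 ** (k + 1)
    return max(
        abs(psi(k + 1, Fraction(j, n)) - psi(k, Fraction(j, n)))
        for j in range(n + 1)
    )

if __name__ == "__main__":
    for k in range(5):
        print(k, sup_difference(k), Fraction(1, 6 * 2 ** k))
```

Since the distances are summable, the sequence is uniformly Cauchy, which is what makes the limit continuous.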
