
NONLINEAR FUNCTIONAL ANALYSIS

Notes on mathematics and its applications

Jacob T Schwartz
GORDON AND BREACH
SCIENCE PUBLISHERS
Nonlinear Functional
Analysis

J. T. SCHWARTZ
Courant Institute of
Mathematical Sciences
New York University
Notes by
H. Fattorini
R. Nirenberg and H. Porta
with an additional chapter by
Hermann Karcher

GORDON AND BREACH SCIENCE PUBLISHERS


NEW YORK LONDON PARIS
Copyright © 1969 by GORDON AND BREACH SCIENCE PUBLISHERS INC.
150 Fifth Avenue, New York, N. Y. 10011
Library of Congress Catalog Card Number: 68-25643

Editorial office for the United Kingdom:


Gordon and Breach Science Publishers Ltd.
12 Bloomsbury Way
London W.C.1.

Editorial office for France:


Gordon & Breach
7-9 rue Emile Dubois
Paris 14e

Distributed in Canada by:


The Ryerson Press
299 Queen Street West
Toronto 2b, Ontario

All rights reserved. No part of this book may be reproduced or utilized in any
form or by any means, electronic or mechanical, including photocopying, recording,
or by any information storage and retrieval system, without permission in writing
from the publishers. Printed in East Germany.
Editors' Preface

A large number of mathematical books begin as lecture notes; but, since
mathematicians are busy, and since the labor required to bring lecture notes
up to the level of perfection which authors and the public demand of formally
published books is very considerable, it follows that an even larger number
of lecture notes make the transition to book form only after great delay
or not at all. The present lecture note series aims to fill the resulting gap.
It will consist of reprinted lecture notes, edited at least to a satisfactory level
of completeness and intelligibility, though not necessarily to the perfection
which is expected of a book. In addition to lecture notes, the series will
include volumes of collected reprints of journal articles as current develop-
ments indicate, and mixed volumes including both notes and reprints.

JACOB T. SCHWARTZ
MAURICE LÉVY
Contents

Introduction
Chapter I: Basic Calculus
Chapter II: Hard Implicit Functional Theorems
Chapter III: Degree Theory and Applications
Chapter IV: Morse Theory on Hilbert Manifolds
Chapter V: Category
Chapter VI: Applications of Morse Theory to Calculus of Variations in the Large
Chapter VII: Applications
Chapter VIII: Closed Geodesics on Topological Spheres
Index
Introduction

Nonlinear functional analysis is of course not so much a subject, as the
complement of another subject, namely, linear functional analysis. In studying
our negatively defined field, we will, however, find a certain unity; partly
because we shall exclude from functional analysis those analytic theories
which do not make use of the characteristic procedure of functional analysis.
This characteristic procedure is, of course, the treatment of a given problem,
or the construction or study of a desired function, by imbedding the problem
or function into a space (generally infinite dimensional) of related problems
or functions. In accordance with this distinction we shall, for example,
regard much of the asymptotic study (by topological methods) of solutions
of nonlinear differential equations as belonging to nonlinear analysis but
not to nonlinear functional analysis, while, for instance, the Morse theory
of geodesics, or the construction of solutions of partial differential equa-
tions by application of the Schauder fixed point theorem, will definitely be
considered to belong to nonlinear functional analysis. The distinction sug-
gested is not always clear-cut, however.
We may orient ourselves toward our subject of study as follows. Non-
linear functional analysis is nonlinear analysis in the context of infinite
dimensional topological spaces, manifolds, etc. Naturally, our knowledge of
nonlinear analysis in this case cannot be more complete than our knowledge
of nonlinear analysis in the finite dimensional case. Therefore the finite-
dimensional case can serve as a model for the infinite dimensional case. We
can formulate our aim as follows: to extend known theorems of nonlinear
analysis from the finite to the infinite dimensional case; to analyze any
particular difficulties, not present in the finite dimensional case, which arise
in the infinite dimensional case.
Now, what are the main branches of nonlinear analysis in finitely many
dimensions? They may be listed under five general headings:
1. Elementary calculus.
2. The implicit function theorem and related results.
3. Topological principles for establishing the existence of solutions to
systems of equations: the Brouwer fixed point theorem, the theory of degree,
the Jordan separation theorem, and, more generally, the Lefschetz fixed
point theorem and the general topological intersection theory.
4. Topological theories for establishing the existence of critical points:
the Morse critical point theory, and the Lusternik-Schnirelman "category"
theory.
5. Theorems following by the powerful special methods of complex func-
tion theory.
We shall find infinite dimensional generalizations of theorems belonging
to each of these five categories.
1. Elementary calculus goes over to B-spaces (and even slightly more
general spaces) in a routine way. Integration theory is developed for
vector-valued functions defined on a measure space in Linear Operators, Chapter 3,
and contains no surprises. The proper notion of derivative (as already in
two-dimensional spaces) is that of directional derivative or Gateaux
derivative, which may be defined as follows. Let $\phi$ be a function mapping one
B-space $X$ into another $Y$. Then if, for each $x, y \in X$, the function
$\phi(x + ty)$ of the real variable $t$ is differentiable at $t = 0$, we say
that $\phi$ is (Gateaux) differentiable, and write

$$d\phi(x; y) = \frac{d}{dt}\,\phi(x + ty)\Big|_{t=0}.$$

A certain amount of basically elementary and unsurprising real variable
theory is connected with this notion. Thus, for instance, under suitable
hypotheses $d\phi(x; y)$ is linear in $y$; so that we may, if we like, speak of the
derivative $d\phi(x)$ as a linear operator mapping $X$ into $Y$. Rather than study
this elementary calculus for its own sake, we will develop results which belong
to it as needed for other purposes.
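
The definition of the Gateaux derivative can be tried out numerically in finite dimensions. The following is only a sketch under stated assumptions: the map `phi` on the plane is a hypothetical example that does not come from the notes, and the derivative is estimated by a central difference in the real variable t. Under those assumptions the estimate should agree with the Jacobian applied to the direction y, and should be (approximately) additive in y.

```python
import math

def phi(x):
    """Hypothetical smooth map from R^2 to R^2, chosen only for illustration."""
    x1, x2 = x
    return (x1 * x1 + x2, math.sin(x1) * x2)

def gateaux(f, x, y, h=1e-6):
    """Central-difference estimate of d/dt f(x + t*y) at t = 0."""
    plus = f(tuple(xi + h * yi for xi, yi in zip(x, y)))
    minus = f(tuple(xi - h * yi for xi, yi in zip(x, y)))
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))

def jacobian_times(x, y):
    """Exact derivative of phi at x applied to the direction y."""
    x1, x2 = x
    y1, y2 = y
    return (2.0 * x1 * y1 + y2,
            math.cos(x1) * x2 * y1 + math.sin(x1) * y2)

x = (0.5, -1.2)
y = (1.0, 2.0)
estimate = gateaux(phi, x, y)   # numerical d(phi)(x; y)
exact = jacobian_times(x, y)    # the linear operator d(phi)(x) applied to y
```

Comparing `estimate` with `exact` for a few directions is a quick sanity check that the directional derivative depends linearly on the direction for this smooth map.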
2. The implicit function theorem in B-spaces exists in two versions. On the
one hand, we have the classical "soft" implicit function theorem, which
states that if $\phi$ is a mapping of a B-space $X$ into a space $Y$, if
$\phi(0) = 0$, and if $\phi$ is continuously differentiable and $\phi'(0)$ is a
bounded operator with a bounded inverse, then $\phi$ maps a neighborhood of zero
(in $X$) homeomorphically onto a neighborhood of zero (in $Y$). This basic version
of the theorem has several interesting variants, one of which is the so-called
theory of "monotone" mappings. Another class of theorems closely related to this
implicit function theorem form the so-called "bifurcation theory". The main
idea of this latter theory may be explained as follows. Suppose that the
solutions of a functional equation $\phi(x) = 0$ are to be studied in the vicinity
of a given solution $x = 0$. (Here, $\phi$ is taken to be a differentiable mapping
of a B-space $X$ into itself.) We may write $\phi(x) = x + Kx + \psi(x)$, where
$|\psi(x)| = O(|x|^2)$ for $x$ near 0, and where $K$ is a linear transformation.
If $(I + K)^{-1}$ exists as a bounded operator, then, by the implicit function
theorem, $x = 0$ is an isolated zero. In the bifurcation theory, we suppose only
that $K$ is compact, and wish to consider the case in which $(I + K)^{-1}$ does
not exist. In this case, it follows by the Riesz theory of compact operators that
$X$ decomposes as a direct sum $X = Y \oplus Z$ of two subspaces, both invariant
under $K$, the second being finite dimensional, such that $(I + K)$ is a bounded
mapping of $Y$ onto itself having a bounded inverse. Correspondingly, we may
write $x = [y, z]$, and write the equation $\phi(x) = 0$ as a pair of equations:

$$\phi_1(y, z) = (I + K)y + \psi_1(y, z) = 0,$$
$$\phi_2(y, z) = (I + K)z + \psi_2(y, z) = 0.$$

By the implicit function theorem, the first equation may be solved for $y$ in
terms of $z$: $y = Y(z)$. Substituting this solution into the second equation,
we find that the solutions of the original equation $\phi(x) = 0$ are in one-to-one
correspondence with the solutions of the equation
$(I + K)z + \psi_2(Y(z), z) = 0$.
This last equation, however, may be regarded as a finite system of equations
in a finite number of variables, upon which all the resources of
finite-dimensional analysis may be brought to bear.
An introduction to the theory of bifurcation as outlined above may be
found in Graves' article Remarks on singular points of functional equations,
Trans. Amer. Math. Soc., V. 79, 150-157 (1955).
In addition to the "soft" version of the implicit function theorem described
above, there exists in the functional-analytic case a separate "hard" version
of the theorem. The precise statement of this second version of the implicit
function theorem will be given in a later lecture. At present we shall only
remark that this theorem applies even in cases where the Gateaux derivative
of $\phi$ is unbounded as a linear operator, and has an unbounded linear inverse.
The theorem is due to J. Nash: The imbedding problem for Riemannian
manifolds, Ann. Math. 63, pp. 20-63 (1956). J. Moser (A new technique for the
construction of solutions of non-linear differential equations, Proc. Nat. Acad.
Sci. U.S.A., V. 47, 1961, pp. 1824-1831) made the useful observation that Nash's
"hard" version of the implicit function theorem could be proved by an
appropriate modification of the "Newton's Method" of finite dimensional
analysis, the superrapid convergence of Newton's method compensating,
in an appropriate sense, for the unboundedness of the Frechet derivative
and its inverse. Moser has subsequently made interesting applications and
extensions of this basic idea, cf. Moser: On invariant curves of area preserving
mappings of an annulus, Göttinger Nachrichten, 1962, pp. 1-20, and subsequent
publications. Cf. also a lecture by Serge Lang in the 1962 Séminaire Bourbaki.
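
The "superrapid convergence" of Newton's method referred to above can be seen already for a single real equation. The sketch below is only a finite dimensional illustration and is not the Nash-Moser scheme itself; the equation x^2 - 2 = 0, the starting point 1.5, and the helper name `newton` are assumptions made for the example. Each step roughly squares the error, which is the feature that the "hard" theorem exploits.

```python
import math

def newton(f, df, x0, steps):
    """Return the list of Newton iterates x_{n+1} = x_n - f(x_n) / df(x_n)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x - f(x) / df(x))
    return xs

def f(x):
    # Hypothetical scalar equation f(x) = 0 with root sqrt(2).
    return x * x - 2.0

def df(x):
    return 2.0 * x

iterates = newton(f, df, 1.5, 4)
errors = [abs(x - math.sqrt(2.0)) for x in iterates]
# For this equation the error obeys e_{n+1} = e_n**2 / (2 * x_n),
# so the number of correct digits roughly doubles at every step.
```

Printing `errors` shows the quadratic decay directly; a merely linear scheme would gain a fixed number of digits per step instead.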
3. Finite codimensional topology. The attentive reader will have observed
that in our listing above of theorems of finite dimensional topology we have
separated these theorems into two groups. This separation, somewhat un-
natural in the finite dimensional case, is essential in the infinite dimensional
case. Consider, for example, the Brouwer fixed point theorem. As is well-
known, this theorem is equivalent to the statement that the boundary of the
unit sphere is not continuously deformable to a point on itself. In infinite
dimensions, however, this statement is false. E.g., if we examine the boundary
$\partial S$ of the unit sphere in the Hilbert space $L_2(0, 1)$, and follow the
homotopy $f(x) \to f_t(x)$, $1 \ge t \ge 1/2$, where

$$f_t(x) = \begin{cases} t^{-1/2} f(x/t), & 0 \le x \le t, \\ 0, & t \le x \le 1, \end{cases}$$

by the homotopy $f_{1/2}(x) \to t f_{1/2}(x) + \sqrt{1 - t^2}\,\sigma(x)$,
$1 \ge t \ge 0$, where $\sigma \in \partial S$ and $\sigma(x) = 0$ for
$0 \le x \le 1/2$, we obtain a continuous deformation of $\partial S$
along itself, to the single point $\sigma$. This implies a set of topological
consequences rather different from the corresponding results in finite dimensions.
We owe to Schauder and Leray the important observation that the most
familiar results of finite dimensional topology can be carried over to
infinitely many dimensions if attention is restricted to the special category of
maps $\phi$ having the form $\phi = I + \psi$, where $I$ is the identity, and
$\psi$ is a mapping whose range is compact. Thus, for instance, if we confine our
attention to this special category of maps, the boundary of the unit sphere is not
continuously deformable to a single point along itself. Moreover, again for maps
of this category, a straightforward generalization of the finite dimensional
theory of degree can be established, and infinite dimensional generalizations
of many of the basic theorems of finite dimensional topology obtained. As a
basic reference, see Schauder and Leray: Topologie et équations fonctionnelles,
Ann. Sci. École Norm. Sup. (3) 51 (1934) pp. 45-78. The infinite dimensional
theory of degree is based upon the finite dimensional method, which we shall
develop by a simplified method patterned after the procedure of Heinz: An
elementary analytic theory of degree in n-dimensional space, J. Math. Mech. 8
(1959) pp. 231-247.
An especially useful theorem belonging to this circle of ideas is the fixed
point theorem of Schauder: any continuous mapping into itself of a compact
convex set in a locally convex linear topological space possesses a fixed point.
Cf. Schauder: Der Fixpunktsatz in Funktionalräumen, Studia Math. 2 (1930)
pp. 171-180. Krein and Rutman (Uspekhi Math. Nauk 3, No. 1, pp. 3-95)
give an interesting application of fixed point theory to the "projective space"
of a B-space.
A connected account of many of the principal results in the type of
functional topology discussed above is given by A. Granas: The theory of
compact vector fields and some of its applications to topology of functional
spaces (I). Rozprawy Mat. XXX, Warsaw 1962. Granas lays stress on the
homotopy theory of compact maps and on the Borsuk antipodal point theorem,
but avoids the theory of degree.

4. Finite dimensional topology. The second category of topological results
available in functional spaces is distinguished by the fact that it makes
reference to the ordinary singular homology and cohomology groups,
defined similarly in the functional case and in the finite dimensional case. The
Morse theory and the Lusternik-Schnirelman theory both begin with the same
construction. A manifold is defined to be a topological space locally
homeomorphic to a given B-space in such a way that the "transition mappings"
between the various "local coordinate patches" which cover the manifold
are infinitely often differentiable. On such a manifold, all the ordinary local
notions of analysis such as directional derivative, differentiable function,
etc., are available. Let $M$ be such a manifold, and let $f$ be a smooth
real-valued function defined on $M$. A critical point of $f$ is by definition a
point in $M$ at which the directional derivative of $f$ in every direction
vanishes. If $M$ has a Riemannian metric, we may in the usual way define the
gradient $\nabla f$ of $f$, which is a field of vectors tangent to $M$; in this
case, the critical points of $f$ are the points $p$ where $\nabla f(p) = 0$. If
there exists no critical point $p$ of $f$ such that $a \le f(p) \le b$ (and
assuming certain additional, technical hypotheses), then the subsets
$M_a = \{q \in M \mid f(q) \le a\}$ and $M_b = \{q \in M \mid f(q) \le b\}$ are
diffeomorphic. To see this, we have only to note that if each point $q$ such that
$a \le f(q) \le b$ is pushed down in the direction of the gradient field
$\nabla f$ until $M_a$ is reached, we obtain the desired diffeomorphism. This
statement is the first main lemma of the Morse theory. The second observation on
which the Morse theory is built gives a corresponding result for the case in which
$\{q \in M \mid a \le f(q) \le b\}$ contains an isolated set of critical points.
In this case, and under the further assumption that the critical points are all
nondegenerate
in an appropriate sense, a closer analysis shows that the space $M_b$ is
diffeomorphic to a space $M_a \cup H$ obtained from $M_a$ by affixing a certain
collection of "handles". Thus the sequence of critical points of $f$ describes the
construction of $M$ by the successive addition of "handles" to a "ball". This
connection may be exploited in either of two directions: to conclude from the
known topology of $M$ that any function $f$ defined on $M$ must admit critical
points of certain numbers and types, or, conversely, to deduce information
about the topology of $M$ from a knowledge of the critical points of some
particular function on $M$.
If we let M be the space of all smooth curves on a finite dimensional mani-
fold N, and regard M in an appropriate way as being an infinite dimensional
manifold, then the general Morse theory outlined above reduces to the
special Morse theory of geodesics.
A lucid account of the Morse theory, especially in the finite-dimensional
case, is to be found in Milnor: Morse theory, Ann. of Math. Studies 51,
Princeton 1963. The generalization to infinite dimensional manifolds is
developed in Palais: Lectures on Morse theory, Notes, Harvard, 1963, to be
republished in 1964 in the journal Topology. Palais gives the application of
the general theory to the Morse geodesic theory, developing in detail an
account of the necessary compactness properties of the infinite dimensional
manifold M and the function f on it. Further applications of the general
theory to establish the existence of higher type critical structures in theories
of minimal surfaces, etc., are to be hoped for. We may also refer to a set
of notes, entitled Lectures of Smale on Differential Topology (Columbia,
1963). These notes give extensions of various qualitative theorems of finite-
dimensional differential topology to the infinite dimensional case.
The Lusternik-Schnirelman theory of critical points agrees with the Morse
theory in making use of the deformations along gradient curves on a
manifold $M$. However, the methods of Lusternik-Schnirelman are more point
set theoretic than those of Morse, and lead to more general but less precise
results. If $A$ and $M$ are topological spaces, and $\phi$ maps $A$ into $M$ and
is continuous, call $\phi$ a map of category 1 if it is homotopic to a constant
map, and call $\phi$ a map of category $k$ if $A$ can be divided into $k$ sets
$A_1, \ldots, A_k$, but no fewer, such that each $\phi \mid A_i$ is of category 1.
If $A \subset M$, the category $\mathrm{cat}(A)$ is the category of the identity
map of $A$ into $M$. It is not hard to establish that $\mathrm{cat}(A) - 1$ is a
lower bound for the topological dimension of $A$. If $f$ is a real valued function
defined on $A$, and $m \le \mathrm{cat}(A)$, put

$$c_m(f) = \inf_{\mathrm{cat}(B) \ge m} \sup \{f(x) \mid x \in B\}.$$
Then $c_1(f) \le c_2(f) \le \cdots$. It may be shown, under suitable compactness
hypotheses, that for each $m \le \mathrm{cat}(A)$ there exists a set
$B_m \subset A$ such that $\mathrm{cat}(B_m) \ge m$, and
$\sup \{f(x) \mid x \in B_m\} = c_m(f)$. Were it the case that $A$ contained no
critical point $q$ with $f(q) = c_m(f)$, we could push all the points
$p \in B_m$ down in the direction of the gradient field $\nabla f$, obtaining a
set $\tilde{B}_m$ of category $m$ such that
$\sup \{f(x) \mid x \in \tilde{B}_m\} < c_m(f)$, a contradiction. Thus we
see that each value $c_m(f)$ must be a critical value of $f$. A refinement of this
argument shows that if $c_n(f) = c_m(f) = c$ with $n \le m$, then
$\{x \in A \mid f(x) = c \text{ and } \nabla f(x) = 0\}$
must be of category at least $m - n + 1$. Thus any smooth function on $A$
must admit at least $\mathrm{cat}(A)$ critical points. This last result makes it
important to be able to establish lower bounds for the category of a space. We
will see in a subsequent lecture that such results follow from an analysis of the
singular cohomology ring of a topological space. For an introductory account
of the theory of category and some of its applications, cf. Lusternik and
Schnirelman, Méthodes topologiques dans les problèmes variationnels,
Gauthier-Villars, Paris, 1934.
Chapter VIII of the present notes, generously contributed by Dr. Hermann
Karcher*, gives an account of some of the Morse Theory of closed geodesics
on manifolds which are topological spheres, according to methods
stemming from Klingenberg.

* The work of H. Karcher was supported at the Courant Institute of Mathematical
Sciences, New York University, by the National Science Foundation under Grant
NSF-GR8114.

5. The complex analytic case. A few results applying specifically to complex
analytic functional mappings between complex linear spaces are known. In
the first place, one has the usual elementary results guaranteeing the power
series expansion of complex analytic mappings, etc. The bifurcation theory,
where applicable, shows that the set of zeroes of an analytic functional
equation $\phi(x) = 0$ is in one-to-one bianalytic correspondence with the set of
zeroes of a similar set of analytic equations in a finite number of complex
variables. A good deal is known about the structure of such analytic varieties,
and the bifurcation theory enables one to carry all this information over to
the functional case.
It follows readily from the definition of degree, in the cases where this
definition is applicable, that the degree of a complex analytic map
$x \to \phi(x)$ near any isolated zero is non-negative. According to an
interesting theorem of Jane Cronin (cf. Cronin: Analytic Functional Mappings,
Ann. Math. 58 (1953) pp. 175-181) the degree of such a zero is actually positive.
This result, combined with the results available from the general theory of
degree, leads to a principle of permanence of zeroes that generalizes the
well-known theorem of Rouché to the functional case.
6. Miscellany: In addition to the five principal categories of results out-
lined above, a variety of miscellaneous special results must be included in
our subject. These will be noted as they arise in our subsequent lectures.
In the present introduction, we shall note the work of Hammerstein (cf.
Nichtlineare Integralgleichungen nebst Anwendungen, Acta Math., V. 54
(1930) pp. 117-176) on integral equations, in which the order properties of
the integral operators studied are exploited. This work is related to the
theory of monotone operators alluded to above.
We may also mention the existence of various investigations, notably those
of E. Rothe, devoted to the variational method in functional analysis, i.e.,
to the possibility of solving functional equations $\phi(x) = 0$ by casting them
into the form $\psi(x) = \min$, where $\psi$ is an appropriately selected functional.
While the literature on the subject of the present course of lectures is
somewhat scattered, a number of useful books have appeared. We mention
in the first place the book of Krasnoselskii: Topological methods in the theory
of non-linear integral equations, Moscow, 1956, 392 pp. An English trans-
lation of a related survey article by Krasnoselskii appears in the AMS Trans-
lations, Ser. 2, No. 10, pp. 345-409. Krasnoselskii gives a good account of
the available information on continuity and compactness of nonlinear inte-
gral operators of various forms, a good summary account of a number of
other important topics in nonlinear theory, as well as an extensive
bibliography. A less closely related, but still relevant work is the article
Functional analysis and applied mathematics by Kantorovic in Uspekhi Math. Nauk 3
(No. 6) (1948), pp. 89-185, as well as this author's treatise Approximate
methods of higher analysis. An account of the differential calculus in B-spaces
is to be found in the book of Michal: Le calcul différentiel dans les espaces
de Banach (V. I, Fonctions analytiques, Équations intégrales), Gauthier-Villars,
1958 (150 pp.), in the well-known treatise by Hille and Phillips on
semigroups, and in the texts of advanced calculus by Dieudonné and by
Serge Lang.

The reader wishing to extend his knowledge of nonlinear functional analysis
beyond the necessarily limited material contained in the present notes
will find it useful to consult the comprehensive survey article by James Eells:
A Setting for Global Analysis, Bull. Amer. Math. Soc. v. 72, 1966, pp. 751-809.
This excellent review may also serve as a guide to the literature of the subject.
CHAPTER I

Basic Calculus

A. Some Definitions and a Lemma on Topological Linear Spaces
B. Elementary Calculus
C. The "Soft" Implicit Function Theorem
D. The Hilbert Space Case
E. Compact Mappings
F. Higher Differentials and Taylor's Theorem
G. Complex Analyticity
H. Derivatives of Quadratic Forms

A. Some Definitions and a Lemma on Topological Linear Spaces

1.1. Definition: We say that $E$ is a topological linear space if $E$ is a linear
space which is given a topology such that addition and multiplication by
scalars are continuous functions, i.e.: $+ : E \times E \to E$ and
$\cdot : E \times R \to E$ are continuous functions, where $E \times E$ and
$E \times R$ have the product topology.
1.2. Definition: Let $E$ be a T.L.S. We say that $E$ is locally convex if there
exists a family of convex sets $\{U\}$ which is a basis for the family of
neighborhoods of 0.
(A set $K$ is called convex iff $x, y \in K$ implies $tx + (1 - t)y \in K$ for
every $t \in [0, 1]$.)

1.3. Definition: A T.L.S. will be called an F-space, or Frechet space, if, as
a topological space, it is metric and complete, with a topology given by a
"norm" function $|x|$ which satisfies: (i) $|x|$ real $\ge 0$; (ii) $|x| = 0$ iff
$x = 0$; (iii) $|x + y| \le |x| + |y|$. (See Linear Operators*, Chapter 2.) We
shall write L.C.F.-space for locally convex F-spaces.

* Linear Operators, Nelson Dunford and Jacob T. Schwartz, Wiley-Interscience,
Vol. I, 1958, Vol. II, 1963.

1.4. Definition: A T.L.S. will be called a Banach space, or a B-space, iff
it is complete and its topology is given by a norm, which in addition to
conditions (i), (ii) and (iii) of the above definition, satisfies
(iv) $|\lambda x| = |\lambda|\,|x|$.
The spaces with which we shall ordinarily deal are L.C.F.-spaces.

1.5. Definition: Let $E$ be an F-space. We say that $K \subset E$ is bounded if for
any neighborhood $U$ of 0 there exists $\varepsilon > 0$ such that
$\varepsilon K \subset U$.
This condition is easily seen to be equivalent to the following one:
$\varepsilon_n \to 0$ and $k_n \in K$ implies $\varepsilon_n k_n \to 0$.
We shall now prove a lemma relating L.C.F.-spaces to B-spaces.

1.6. Lemma: A L.C.F.-space is a B-space if it contains a bounded open set.
First, we note that boundedness is unaffected by translations, so we can
assume that $0 \in U$, where $U$ is bounded and open. By definition of an
L.C.F.-space, $U$ will contain a convex neighborhood $U'$ of 0, which a fortiori
is bounded. Now, we can replace $U'$ by $V = U' \cap (-U')$ which is also
convex, bounded, and a symmetric neighborhood of 0, i.e., $V = -V$. By
the definition of a bounded set, the family $\{\varepsilon V\}$, $\varepsilon$
real $> 0$, is a neighborhood basis at 0. We consider now the support function of
$V$, $p(x)$, defined by

$$p(x) = \Bigl[\sup_{tx \in V} |t|\Bigr]^{-1}.$$

(Obviously if in a B-space $V$ is the unit sphere, $p(x) = |x|$.) The function
$p(x)$ has the four properties of a norm function:
(i) $p(x)$ real and $\ge 0$. Obvious. It is a finite number because $V$ is
absorbing.
(ii) $p(x) = 0 \Leftrightarrow x = 0$. If $p(x) = 0$, then
$\sup_{tx \in V} |t| = \infty$, and this means that $tx \in V$ for all $t$,
because $V$ is convex. Hence $x \in \frac{1}{t}V$ for all $t > 0$, whence $x$
is in every neighborhood of 0, since $\{\frac{1}{t}V\}$ is a neighborhood basis.
Therefore $x = 0$, because the space is Hausdorff.
(iii) $p(x + y) \le p(x) + p(y)$. It is apparent that $p(x)$ can be defined by
$p(x) = \inf_{t > 0,\ x \in tV} t$. Let $\alpha, \beta > 0$ be such that
$x \in \alpha V$ and $y \in \beta V$. Then $x + y \in \alpha V + \beta V$; since
$V$ is convex, $\alpha V + \beta V = (\alpha + \beta)V$, whence
$x + y \in (\alpha + \beta)V$. Therefore

$$\inf_{t > 0,\ x + y \in tV} t \le \inf_{t > 0,\ x \in tV} t + \inf_{t > 0,\ y \in tV} t,$$

which is (iii).
(iv) $p(\alpha x) = |\alpha|\,p(x)$. If $\alpha > 0$, it is easy to see that
$p(\alpha x) = \alpha p(x)$. But since $V$ is symmetric, (iv) holds for any
$\alpha$, since $\alpha V = -\alpha V$.
Next we note that $V = \{x \mid p(x) < 1\}$ if we assume $V$ to be open. For then
clearly $x \in V$ implies $p(x) < 1$. Also $p(x) < 1$ implies $x \in tV$ for some
$t < 1$, i.e., $x = tv$, $v \in V$; since $V$ is convex, $x \in V$. We see at once
then that $\varepsilon V = \{x \mid p(x) < \varepsilon\}$. Therefore $p(x)$ is
continuous at 0, and consequently at every $x$.
Conversely, for any $\varepsilon > 0$, there exists $\delta > 0$ such that
$p(x) < \delta$ implies $|x| < \varepsilon$. Simply choose $\delta$ so that
$\delta V \subset S_\varepsilon$, where $S_\varepsilon = \{x \mid |x| < \varepsilon\}$.
This is possible since $\{\varepsilon V\}$ is a neighborhood basis at 0.
Thus we have shown that $|\cdot|$ and $p(x)$ determine the same topology; hence
$p(x)$ is the required norm.
Q.E.D.
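
The function p(x) built in this proof can be computed numerically for a concrete set. The sketch below assumes a hypothetical open, convex, symmetric set V in the plane (an ellipse intersected with a diamond, not taken from the notes) and evaluates p(x) as the infimum of those t > 0 with x in tV, using bisection. It is meant only to make properties (iii) and (iv) tangible.

```python
def in_V(x):
    """Membership test for a hypothetical open, convex, symmetric set V in R^2."""
    x1, x2 = x
    return x1 * x1 / 4.0 + x2 * x2 < 1.0 and abs(x1) + abs(x2) < 1.5

def gauge(x, iters=60):
    """Approximate p(x) = inf { t > 0 : x in t*V } by bisection on t."""
    if x == (0.0, 0.0):
        return 0.0
    hi = 1.0
    # V is absorbing, so x lies in hi*V once hi is large enough.
    while not in_V((x[0] / hi, x[1] / hi)):
        hi *= 2.0
    lo = 0.0
    # V is convex and contains 0, so membership of x in t*V is monotone in t.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if in_V((x[0] / mid, x[1] / mid)):
            hi = mid
        else:
            lo = mid
    return hi

x = (1.0, 0.5)
y = (-0.3, 2.0)
p_x, p_y = gauge(x), gauge(y)
p_sum = gauge((x[0] + y[0], x[1] + y[1]))   # compare with p_x + p_y
p_scaled = gauge((-3.0 * x[0], -3.0 * x[1]))  # compare with 3 * p_x
```

For this particular V the gauge has the closed form max(sqrt(x1^2/4 + x2^2), (|x1| + |x2|)/1.5), which gives an independent check on the bisection.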

B. Elementary Calculus

1.7. Definition: Let $X$ and $Y$ be T.L.S. Let $U$ be an open subset of $X$ and
$f : U \to Y$. We say that $f$ has a Gateaux derivative $df(x, y)$ at $x \in U$ iff

$$df(x, y) = \frac{d}{dt} f(x + ty)\Big|_{t=0}$$

exists for every $y \in X$.
We call this derivative the derivative of $f$ at $x$ in the direction $y$, and shall
write it often as $(df(x))(y)$ or $(f'(x))(y)$.
1.8. Definition: Let $X$ and $Y$ be T.L.S. and let $\phi : U \to Y$, where $U$ is a
neighborhood of 0 in $X$. We say that $\phi$ is horizontal at 0 if for each
neighborhood $V$ of 0 in $Y$ there exists a neighborhood $U'$ of 0 in $X$, and a
function $o(t)$ (a real function with $o(t)/t \to 0$ as $t \to 0$) such that

$$\phi(tU') \subset o(t)V.$$

1.9. Definition: Let $X$, $Y$ be T.L.S. and $U$ open in $X$. Let $f : U \to Y$ and
$x_0 \in U$. We say that $f$ is Frechet differentiable, or F-differentiable at
$x_0$, if there exists a continuous linear map $A : X \to Y$ such that if we write

$$f(x_0 + y) = f(x_0) + Ay + \phi(y)$$

then $\phi$ is horizontal at 0.
We call $A$ the derivative of $f$ at $x_0$, and we write it $df(x_0, y)$ as in
Definition 1.7.
1.10. Remark: If the spaces are B-spaces, then the definition of a function
horizontal at 0 is equivalent to

$$|\phi(x)| \le |x|\,\psi(x)$$
where $\psi$ is real valued and $\lim_{x \to 0} \psi(x) = 0$. Thus, in a B-space,
the condition of F-differentiability can be expressed as follows:

$$f(x_0 + y) = f(x_0) + Ay + o(|y|).$$

1.11. Remark: If a linear function $A$ is horizontal at 0, then $A = 0$, as
follows at once from the definition. Thus we see that the F-derivative of a
function is unique, because if

$$f(x_0 + y) = f(x_0) + Ay + \phi(y)$$

and

$$f(x_0 + y) = f(x_0) + By + \phi'(y)$$

where $A$ and $B$ are continuous and linear, and $\phi$ and $\phi'$ are horizontal
at 0, then $A - B = \phi' - \phi$ is horizontal at 0 (the difference of two
functions horizontal at 0 is still horizontal at 0), whence $A = B$.
1.12. Remark: The domain of $f$ in the definition of G-differentiability can
be assumed to be a "finitely open set in $X$", where $X$ is simply a linear space.
Also, for complex spaces, it is easy to see that the Gateaux derivative is
always linear in $y$, and that the hypothesis of linearity is also unnecessary
for F-derivatives. (Cf. Hille and Phillips [1], Sections 3.13 and 26.3.)
In the case $X$ is a B-space, it is easy to show that F-differentiability implies
Gateaux differentiability.
1.13. Lemma: Let $f : U \to Y$, where $U$ is open in a B-space $X$ and $Y$ is a
T.L.S. Then if $f$ has an F-derivative at $x_0$, it also has a Gateaux derivative
at $x_0$, and they are equal.
Proof: We write

$$f(x_0 + ty) - f(x_0) = tAy + o(|ty|),$$

where $A$ is linear and continuous. But $o(|ty|) = o(|t|\,|y|)$ and hence

$$\lim_{t \to 0} \frac{1}{t}\bigl(f(x_0 + ty) - f(x_0)\bigr) = Ay.$$

Q.E.D.
The next lemma gives the chain rule for F-derivatives.
1.14. Lemma: If $f : U \to V$ is F-differentiable at $x_0$, and $g : V \to W$ is
F-differentiable at $f(x_0)$, then $g(f(x))$ is F-differentiable at $x_0$, and its
derivative is given by:

$$(d(gf))(x_0, y) = dg(f(x_0), df(x_0, y)).$$

Here, $U$, $V$ and $W$ are open sets contained in $X$, $Y$, $Z$ which are T.L.S.
Proof: We have only to write:

$$g[f(x_0 + y)] = g[f(x_0) + df(x_0, y) + \phi(y)]$$
$$= g[f(x_0)] + dg[f(x_0), df(x_0, y)] + dg[f(x_0), \phi(y)] + \psi[df(x_0, y) + \phi(y)],$$

where $\phi$ and $\psi$ are horizontal at 0. It is easy to see that the last term
is horizontal at 0 as a function from $X$ to $Z$. We note that if $\phi$ is
horizontal at 0, and if $A$ is linear and continuous, then $A \circ \phi$ is also
horizontal at 0. This follows immediately from the definition of horizontality.
Thus $dg[f(x_0), \phi(y)]$ is horizontal at 0, so that $g(f(x))$ is
F-differentiable at $x_0$, and its derivative is $dg[f(x_0), df(x_0, y)]$.
Q.E.D.
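
The chain rule of Lemma 1.14 can be checked numerically in finite dimensions. In the sketch below the maps `f` and `g` and their hand-computed derivatives `df` and `dg` are hypothetical examples, not taken from the notes; the directional derivative of the composite is estimated by a central difference and compared with dg(f(x0), df(x0, y)).

```python
def f(x):
    """Hypothetical map from R^2 to R^2."""
    x1, x2 = x
    return (x1 * x2, x1 + x2 * x2)

def df(x, y):
    """Derivative of f at x applied to the direction y (computed by hand)."""
    x1, x2 = x
    y1, y2 = y
    return (x2 * y1 + x1 * y2, y1 + 2.0 * x2 * y2)

def g(u):
    """Hypothetical real valued function on R^2."""
    u1, u2 = u
    return u1 * u1 * u2

def dg(u, v):
    """Derivative of g at u applied to the direction v (computed by hand)."""
    u1, u2 = u
    v1, v2 = v
    return 2.0 * u1 * u2 * v1 + u1 * u1 * v2

def directional(func, x, y, h=1e-6):
    """Central-difference estimate of d/dt func(x + t*y) at t = 0."""
    plus = func((x[0] + h * y[0], x[1] + h * y[1]))
    minus = func((x[0] - h * y[0], x[1] - h * y[1]))
    return (plus - minus) / (2.0 * h)

x0 = (0.7, -0.4)
y = (1.3, 0.5)
numeric = directional(lambda x: g(f(x)), x0, y)  # derivative of the composite
chain = dg(f(x0), df(x0, y))                     # right-hand side of Lemma 1.14
```

The two numbers `numeric` and `chain` should agree up to the discretization error of the difference quotient.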
We next prove another lemma relating Gateaux and F-differentiability.
1.15. Lemma: Let $X$ and $Y$ be B-spaces, $U$ open in $X$ and $f : U \to Y$. If $f$
has a Gateaux derivative $f'(x, y)$ in $U$, which is linear in the variable $y$, and
if, when regarded as a linear operator, $f'(x)$ is bounded for $x \in U$ and depends
continuously on $x$ in the uniform topology, then $f$ is F-differentiable in $U$.
Proof: Our point of departure is the formula

$$\frac{d}{dt} f(x + ty) = f'(x + ty)(y),$$

which one can prove easily. It follows that

$$f(x + y) = f(x) + \int_0^1 f'(x + ty)(y)\,dt
          = f(x) + f'(x)(y) + \int_0^1 [f'(x + ty) - f'(x)](y)\,dt.$$

But now:

$$\left|\int_0^1 [f'(x + ty) - f'(x)](y)\,dt\right|
  \le |y| \int_0^1 |f'(x + ty) - f'(x)|\,dt = |y|\,o(1) = o(|y|).$$

Q.E.D.

1.16. Remark: In the last lemma we integrated functions of a real (or
complex) variable with values in a B-space (cf., for example, Hille and
Phillips [1], Chapter III). The basic fact we used was that
$$\left| \int_S f(s)\, d\mu(s) \right| \le \int_S |f(s)|\, d\mu(s).$$
This is not true in general for F-spaces, because its proof depends upon the
inequality $|\sum a_i f_i| \le \sum |a_i|\,|f_i|$. In L.C.F. spaces, however, one can define
weak integration by appropriate use of linear functionals. (Cf. loc. cit.) Local
convexity implies separation theorems which assure the uniqueness of the
integral.

1.17. Lemma: (Contracting Mapping Principle.) Let $X$ be a complete
metric space and $\phi : U \to X$, $U$ open in $X$, and assume $\rho(\phi(x), \phi(y)) \le \alpha\,\rho(x, y)$
with $0 \le \alpha < 1$, where $\rho(x, y)$ is the distance between $x$ and $y$. Moreover,
suppose there exists $z_0 \in U$ such that $\rho(z_0, X - U) > M$, and $\rho(z_0, \phi(z_0)) < M(1 - \alpha)$.
Then there exists a fixed point $z_\infty = \phi(z_\infty)$ such that $\rho(z_0, z_\infty) < M$.
Proof: $\rho(z_0, \phi(z_0)) < M(1 - \alpha) \le M < \rho(z_0, X - U)$, so $\phi(z_0)$ is also
in $U$, and inductively $\phi^2(z_0), \ldots, \phi^n(z_0), \ldots$ are all in $U$, where
$\phi^n(z_0) = \phi(\phi^{n-1}(z_0))$. The sequence $z_0, \phi(z_0), \ldots, \phi^n(z_0), \ldots$ is Cauchy, as follows
from the contracting hypothesis. Hence we can set $z_\infty = \lim \phi^n(z_0)$. By the
continuity of $\phi$, $\phi(z_\infty) = z_\infty$. The formula
$$\rho(z_0, \phi^n(z_0)) \le \frac{1 - \alpha^n}{1 - \alpha}\, \rho(z_0, \phi(z_0)) < M(1 - \alpha^n)$$
is easily proved by induction on $n$. Then
$$\rho(z_0, z_\infty) = \rho(z_0, \lim \phi^n(z_0)) = \lim \rho(z_0, \phi^n(z_0)) \le \frac{\rho(z_0, \phi(z_0))}{1 - \alpha} < M.$$


Q.E.D.
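The iteration used in this proof can be tried numerically. The following is a minimal sketch, not part of the original notes: the map `phi`, the starting point `z0` and the constants `alpha` and `M` are illustrative choices on the real line, picked so that the hypotheses of Lemma 1.17 hold.

```python
import math

def fixed_point(phi, z0, alpha, tol=1e-12, max_iter=1000):
    """Iterate z_{n+1} = phi(z_n) for a contraction with constant alpha."""
    z = z0
    for _ in range(max_iter):
        z_next = phi(z)
        # For a contraction, |z_next - z_inf| <= alpha / (1 - alpha) * |z_next - z|.
        if abs(z_next - z) * alpha / (1.0 - alpha) < tol:
            return z_next
        z = z_next
    raise RuntimeError("iteration did not converge")

# Illustrative data (assumptions): |phi'(x)| = |0.5 sin x| <= 0.5 = alpha.
alpha = 0.5
phi = lambda x: 0.5 * math.cos(x)
z0 = 0.0
M = 1.01  # |z0 - phi(z0)| = 0.5 < M * (1 - alpha) = 0.505

z_inf = fixed_point(phi, z0, alpha)
```

Here `z_inf` approximates the fixed point, and its distance from `z0` stays below `M`, as the lemma predicts.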

C. The "soft" Implicit Function Theorem

1.18. Lemma: Let $X$ be an F-space, $U$ the sphere $\{x : |x| < r\}$, and
$\phi : U \to X$ such that $\phi(x) = x + \psi(x)$, where $\psi(x)$ satisfies:
$|\psi(x) - \psi(y)| \le \alpha\,|x - y|$ with $0 \le \alpha < 1$, and $\psi(0) = 0$.

Then: (i) $\phi(U)$ covers a sphere of radius $r(1 - \alpha)$ about 0. (ii) $\phi$ is one-to-one
and the inverse $\phi^{-1}$ satisfies a Lipschitz condition with constant $1/(1 - \alpha)$.

Proof: (i) We apply the last lemma to the function $f(x) = -\psi(x) + p$,
where $p \in X$ and $|p| < r(1 - \alpha)$. If we put $z_0 = 0$, this inequality implies
that $|z_0 - f(z_0)| = |p| < r(1 - \alpha)$. Hence there exists a point $z_\infty$ in $U$ such
that $z_\infty = -\psi(z_\infty) + p$, i.e., $\phi(z_\infty) = p$. (ii) Suppose $\phi(x) = x + \psi(x) = p$
and $\phi(y) = y + \psi(y) = q$. Then $x - y + \psi(x) - \psi(y) = p - q$, and
$|x - y| - |\psi(x) - \psi(y)| \le |p - q|$, so $(1 - \alpha)\,|x - y| \le |p - q|$, and we are done.
Q.E.D.
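Part (i) of the proof is constructive: the preimage of p is the limit of the iteration z -> p - psi(z). The sketch below is an illustration only; the perturbation `psi`, the constant `ALPHA` and the radius r = 1 are assumptions chosen so that the hypotheses of Lemma 1.18 hold on the real line.

```python
import math

ALPHA = 0.5  # Lipschitz constant of psi (illustrative choice)

def psi(x):
    # |psi(x) - psi(y)| <= ALPHA * |x - y| and psi(0) = 0
    return ALPHA * math.sin(x)

def phi(x):
    return x + psi(x)

def phi_inverse(p, tol=1e-14, max_iter=200):
    """Solve phi(z) = p by iterating z <- p - psi(z), starting from z0 = 0."""
    z = 0.0
    for _ in range(max_iter):
        z_next = p - psi(z)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    return z

# With r = 1, the lemma covers every p with |p| < r * (1 - ALPHA) = 0.5.
p, q = 0.3, -0.2
x, y = phi_inverse(p), phi_inverse(q)
lipschitz_ratio = abs(x - y) / abs(p - q)  # should not exceed 1 / (1 - ALPHA)
```

The ratio `lipschitz_ratio` illustrates part (ii): the inverse is Lipschitz with constant at most 1/(1 - ALPHA).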
1.19. Corollary: If $X$ is a B-space and if, in the notation of the above
lemma, $\psi'(x)$ exists and $|\psi'(x)| \le \alpha < 1$ in $U$, and $\psi(0) = 0$, then (i) and (ii)
are true.
Proof: We only have to note that
$$|\psi(x) - \psi(y)| \le \int_0^1 |\psi'(y + t(x - y))|\,|x - y|\, dt \le \alpha\,|x - y|.$$
Q.E.D.
Now we can prove the following important theorem :
1.20. Theorem: (Implicit function theorem.) Let $X$, $Y$ be B-spaces and
$\phi : U \to Y$, where $U$ is an open neighborhood of 0 in $X$ and $\phi(0) = 0$.
Assume: (a) $\phi$ is F-differentiable in $U$. (b) $\phi'(x)$ depends continuously on $x$
in the uniform operator topology. (c) $\phi'(0)$ is a bounded linear map with a
bounded linear inverse. Then $\phi$ maps a sufficiently small neighborhood of
zero homeomorphically onto a neighborhood of zero.
Proof: Let $A = \phi'(0)$. We put $\eta = A^{-1} \circ \phi$. Then $\eta : U \to X$, $\eta$ has an
F-derivative $\eta'(x)$ which is continuous in $x$ in the uniform operator topology,
and $\eta'(0) = I$, the identity operator. Let $\psi = \eta - I$. Then $\psi'(0) = 0$, and
$$\psi(x) - \psi(y) = \eta(x) - \eta(y) - (x - y) = \int_0^1 \eta'(y + t(x - y))(x - y)\, dt - (x - y)$$
$$= \int_0^1 \bigl(\eta'(y + t(x - y)) - I\bigr)(x - y)\, dt.$$
Thus
$$|\psi(x) - \psi(y)| \le |x - y| \int_0^1 |\eta'(y + t(x - y)) - I|\, dt.$$
But we can make the integral on the right of the last formula less than a fixed
constant $\alpha < 1$, by taking $x$ and $y$ in a sufficiently small neighborhood $V \subset U$ of 0. Then the
preceding lemma applies to $\eta$, and, a fortiori, $\phi = A\eta$ maps $V$ homeomorphically
onto a neighborhood of 0 in $Y$.
Q.E.D.
1.21. Corollary: Given the conditions of 1.20, the inverse map $\phi^{-1} : \phi(V) \to V$
is F-differentiable. Setting $\psi = \phi^{-1}$, we have, for $y \in \phi(V)$, the formula
$$\psi'(y) = \bigl(\phi'(\phi^{-1}(y))\bigr)^{-1}.$$
Proof: If $\phi(x_1) = y_1$, $\phi(x_2) = y_2$, then
$$|\phi^{-1}(y_2) - \phi^{-1}(y_1) - (\phi'(x_1))^{-1}(y_2 - y_1)|$$
$$= |(\phi'(x_1))^{-1}\bigl(\phi'(x_1)(x_2 - x_1) - (y_2 - y_1)\bigr)|$$
$$\le |(\phi'(x_1))^{-1}|\;|\phi'(x_1)(x_2 - x_1) - \phi(x_2) + \phi(x_1)|.$$
This last expression is $o(|x_2 - x_1|)$, whence the first expression is $o(|y_2 - y_1|)$,
and the result follows.
Q.E.D.
An induction argument easily yields the fact that if $\phi$ has derivatives of
higher order (definition given later), then so does $\phi^{-1}$. Similarly, if $\phi$ depends
continuously on some parameter, so does $\phi^{-1}$.
The following theorem is a global version of the "local" implicit function theorem:
1.22. Theorem: Let $X$ and $Y$ be B-spaces, and $\phi : X \to Y$ a continuously
F-differentiable function, and suppose $\phi'$ is invertible (as a linear operator)
at every $x \in X$, and moreover, that $|[\phi'(x)]^{-1}| \le K < \infty$ uniformly in $x$.
Then $\phi$ is a homeomorphism of $X$ onto $Y$.
The proof will depend on the following lemma:
1.23. Lemma: Under the same hypothesis as the theorem, if $\Delta$ is the
square $0 \le s \le 1$, $0 \le t \le 1$, and if $F(s, t)$ satisfies the conditions:
(i) $F : \Delta \to Y$.
(ii) $F(s, t)$ is continuous in $(s, t)$ and for every fixed $s$, $0 \le s \le 1$, $F(s, t)$
is F-differentiable in $t$.
(iii) $F(s, t)$ has fixed endpoints, i.e. there exist $y_0, y_1 \in Y$ such that $F(s, 0) = y_0$,
$F(s, 1) = y_1$ for all $0 \le s \le 1$.
Then there exists a function $G(s, t)$ from $\Delta$ to $X$ which also satisfies (ii) and
in addition
$$\phi[G(s, t)] = F(s, t) \quad \text{for all } (s, t) \in \Delta.$$
Proof of the lemma: By the local implicit function theorem, there exist
neighborhoods $V$ of $y_0$ and $U$ of $x_0$ (where $\phi(x_0) = y_0$) such that $\phi$ is a
homeomorphism of $U$ onto $V$. Then, for sufficiently small $\varepsilon$, we can define
$G(s, t)$ as $\phi^{-1}(F(s, t))$ if $0 \le t \le \varepsilon$ and for $0 \le s \le 1$. We call $a$ the largest
of the values such that $G(s, t)$ can be defined in the rectangle $0 \le t < a$,
$0 \le s \le 1$. Assume $a < 1$. If $G(s, t)$ is defined for $t = a$, consider the
curve $G(s, a)$ and its image $\phi(G(s, a)) = F(s, a)$. For each $s$, $0 \le s \le 1$,
we can select a neighborhood $U_s$ of $G(s, a)$ and a neighborhood $V_s$ of $F(s, a)$
such that $\phi$ is a homeomorphism of $U_s$ onto $V_s$. But $G(s, a)$, $0 \le s \le 1$, is
compact, and therefore there exists a finite subcovering of the curve $G(s, a)$
with neighborhoods $U_{s_i}$, $i = 1, \ldots, n$. In each of these neighborhoods, we
can define the function $G(s, t)$ for all $s$ and $0 \le t < a + \varepsilon_{s_i}$ by the local
implicit function theorem. So $G(s, t)$ can be defined for the rectangle
$0 \le s \le 1$, $0 \le t < a + \min \varepsilon_{s_i}$, contradicting the fact that $a$ was the largest
of such numbers. Now $G(s, t)$ must be F-differentiable in $t$, for $F(s, t)$
satisfies (ii) and $\phi^{-1}$ is locally F-differentiable. By the chain rule, we have for
all $0 \le s \le 1$:
$$\phi'[G(s, t)]\, G'(s, t) = F'(s, t),$$
where the prime denotes differentiation with respect to $t$. So:
$$G'(s, t) = [\phi'(G(s, t))]^{-1} F'(s, t)$$
and
$$|G'(s, t)| \le |[\phi'(G(s, t))]^{-1}|\;|F'(s, t)| \quad \text{for all } 0 \le s \le 1.$$
Then
$$|G'(s, t)| \le K\,|F'(s, t)| \le A_s.$$
Now, integrating with respect to $t$ between $t_0$ and $t_1$, we get:
$$|G(s, t_1) - G(s, t_0)| \le A_s\,|t_1 - t_0|,$$
a Lipschitz condition for $G(s, t)$. Therefore $\lim_{t \to a^-} G(s, t)$ exists for each $s$,
and $G(s, t)$ can be defined at $t = a$. We have proved that $a = 1$.
Q.E.D.
Proof of the theorem: Let $y_0 = \phi(0)$, and $y$ any point in $Y$. Consider the
straight line segment joining $y_0$ and $y$; we write it $y(t)$, $0 \le t \le 1$. As a
particular case of the lemma, there exists a curve $x(t)$, $0 \le t \le 1$, in $X$ such
that $\phi[x(t)] = y(t)$. Then $\phi[x(1)] = y$, and $\phi$ is onto.
Let, as before, $y_0 = \phi(0)$, and suppose there are two points $x_0, x_1 \in X$,
$x_0 \ne x_1$, such that $\phi(x_0) = \phi(x_1) = y$. We take two curves $x_0(t)$ and $x_1(t)$ joining
respectively 0 to $x_0$ and 0 to $x_1$, with $0 \le t \le 1$. Then both image curves
$y_0(t) = \phi(x_0(t))$ and $y_1(t) = \phi(x_1(t))$ will join $y_0$ and $y$. As $Y$ is simply
connected, there exists a function $F(s, t)$ from $\Delta$ to $Y$ continuous in $(s, t)$
and such that $F(0, t) = y_0(t)$, $F(1, t) = y_1(t)$, and $F(s, 0) = y_0$, $F(s, 1) = y$.
By the argument of the lemma, we find a function $G(s, t)$ from $\Delta$ to $X$,
continuous and such that $\phi(G(s, t)) = F(s, t)$, with $G(0, t) = x_0(t)$ and
$G(1, t) = x_1(t)$. But then the continuous curve $G(s, 1)$ with endpoints $x_0$
and $x_1$ is mapped by $\phi$ onto $y$. This contradicts the local implicit function
theorem, and therefore $\phi$ is 1-1. $\phi^{-1}$ is obviously continuous (it is
F-differentiable), so $\phi$ is a homeomorphism of $X$ onto $Y$.
Q.E.D.

D. The Hilbert Space Case

Now we shall prove some implicit function results for Hilbert space:
1.24. Lemma: Let $H$ be a real Hilbert space, $L$ a bounded linear operator
mapping $H$ into $H$. Suppose that for every $x \in H$, $(Lx, x) \ge a\,(x, x)$ where
$a > 0$. Then $L^{-1}$ is defined everywhere, and $|L^{-1}| \le a^{-1}$.
Proof: $L$ is 1-1, for if $Lx = 0$, then $0 = (Lx, x) \ge a\,(x, x)$, and so $x = 0$.
The range of $L$ is closed, for if $x_n \in$ range of $L$ and $x_n \to x$, then $L^{-1}x_n$ is a
Cauchy sequence:
$$(L^{-1}x_n - L^{-1}x_m,\; L^{-1}x_n - L^{-1}x_m) \le \frac{1}{a^2}\,(x_n - x_m,\; x_n - x_m) = \frac{1}{a^2}\,|x_n - x_m|^2.$$
Therefore $L^{-1}x_n \to y \in H$; since $L$ is continuous, $Ly = x$. Now let $z \in H$
be in the orthogonal complement of the range of $L$. Then $(z, Lx) = 0$ for every $x \in H$
implies $(z, Lz) = 0$ implies $(z, z) = 0$, and $z = 0$. We have proved
that the range of $L = H$, and $L$ is onto. From the inequality $(Lx, x) \ge a\,(x, x)$
and Schwarz's inequality we see that $|Lx| \ge a\,|x|$. Hence $|L^{-1}x| \le a^{-1}|x|$.
Q.E.D.
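In finite dimensions the bound of Lemma 1.24 is easy to check numerically. The sketch below is illustrative only: the 2 x 2 matrix `L` and the constant `a` are assumptions, chosen so that (Lx, x) = 2 |x|^2 for every x (the skew part of `L` does not contribute to the quadratic form).

```python
import math

L = [[2.0, 1.0], [-1.0, 2.0]]  # (Lx, x) = 2 * |x|^2, so a = 2 (illustrative choice)
a = 2.0

def apply(m, x):
    return [m[0][0] * x[0] + m[0][1] * x[1], m[1][0] * x[0] + m[1][1] * x[1]]

def inner(u, v):
    return u[0] * v[0] + u[1] * v[1]

def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]

L_inv = inverse_2x2(L)

# Sample unit vectors on the circle to estimate the two quantities in the lemma.
units = [[math.cos(2 * math.pi * k / 360), math.sin(2 * math.pi * k / 360)]
         for k in range(360)]
coercivity = min(inner(apply(L, x), x) for x in units)   # should be >= a
inv_norm = max(math.sqrt(inner(apply(L_inv, x), apply(L_inv, x)))
               for x in units)                            # should be <= 1 / a
```

For this particular matrix the sampled norm of the inverse is about 0.447, below the bound 1/a = 0.5.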

1.25. Corollary: Let $\phi : H \to H$ where $H$ is a Hilbert space, and suppose $\phi$
is continuously F-differentiable, and $(\phi'(x)\,y, y) \ge a\,(y, y)$ for every $x$ and $y$,
where $a > 0$. Then $\phi$ is a homeomorphism of $H$ onto $H$.

Proof: Apply the last lemma and Theorem 1.22.
Q.E.D.
We shall state the hypothesis of this corollary in a slightly different way.

1.26. Definition: We say that $\phi : H \to H$ ($H$ is a Hilbert space) is strongly
monotone if for every $x, y \in H$ we have
$$(\phi(x) - \phi(y),\; x - y) \ge a\,(x - y,\; x - y)$$
for some $a > 0$.

It is easy to see that a differentiable $\phi$ is strongly monotone if and only if
$(\phi'(x)\,y, y) \ge a\,(y, y)$ for every $x, y \in H$. In fact, suppose $\phi$ is strongly monotone. Then
for any real $t > 0$:
$$(\phi(y + tz) - \phi(y),\; z) \ge t\,a\,|z|^2, \quad \text{where } y, z \in H,$$
and dividing by $t$ and taking the limit, we get $(\phi'(y)\,z, z) \ge a\,(z, z)$. Conversely,
we get the condition of strong monotonicity by integrating the condition
involving the derivative.
The following definition will be useful in the sequel:
1.27. Definition: $\phi : H \to H$ will be called monotone if for every $x, y \in H$,
$(\phi(x) - \phi(y), x - y) \ge 0$. If the sign $>$ holds for $x - y \ne 0$, $\phi$ will be called
strictly monotone.
1.28. Remark: Obviously strongly monotone implies strictly monotone
implies monotone. Furthermore, $\phi$ is strongly monotone with a constant $a$
if $\frac{1}{a}\phi - I$ is monotone, and if $\phi$ is strictly monotone (a fortiori, if $\phi$ is
strongly monotone), $\phi$ is 1-1.
We prove now a useful lemma on Euclidean space :
1.29. Lemma: (Kirszbraun) Suppose $\{x_1, \ldots, x_n\}$ and $\{x'_1, \ldots, x'_n\}$ are two sets
of points in $E^n$, and let $p$ be also in $E^n$. Assume that for every $i, j$, $1 \le i, j \le n$,
we have $|x'_i - x'_j| \le |x_i - x_j|$ ($|\cdot|$ is the standard Euclidean norm $(\sum |x_i|^2)^{1/2}$). Then there
exists $p' \in E^n$ such that
$$|p' - x'_j| \le |p - x_j| \quad \text{for every } j,\; 1 \le j \le n.$$
Proof: Let
$$\lambda = \inf_{p' \in E^n}\; \max_{1 \le i \le n} \frac{|p' - x'_i|}{|p - x_i|}.$$
This infimum is assumed at some point $p^+ \in E^n$, for $\max_{1 \le i \le n} \frac{|p' - x'_i|}{|p - x_i|}$ becomes
large when $p'$ is large. Hence we can put
$$\max_{1 \le i \le n} \frac{|p^+ - x'_i|}{|p - x_i|} = \lambda.$$
Now, suppose that for $1 \le i \le k$ we have $|p^+ - x'_i| = \lambda\,|p - x_i|$, and that
for $k + 1 \le i \le n$ we have $|p^+ - x'_i| < \lambda\,|p - x_i|$. We shall show that
$p^+ \in \mathrm{co}(x'_1, \ldots, x'_k)$ (the convex hull of $\{x'_1, \ldots, x'_k\}$). Suppose that
$p^+ \notin \mathrm{co}(x'_1, \ldots, x'_k)$. Then we can separate $p^+$ from $\mathrm{co}(x'_1, \ldots, x'_k)$ with a
hyperplane $\Pi$. If we move the point $p^+$ toward $\Pi$ perpendicularly, it is obvious
that the distance from $p^+$ to every point in the halfspace not containing $p^+$
decreases. We can move $p^+$ by so little as to preserve the inequalities
$|p^+ - x'_i| < \lambda\,|p - x_i|$, $k + 1 \le i \le n$, and now we get $|p^+ - x'_i| < \lambda\,|p - x_i|$ for every
$1 \le i \le n$, which is impossible, because $p^+$ realizes the infimum. Thus we
can express $p^+$ as $\sum_{i=1}^{k} c_i x'_i$, where $c_i \ge 0$ and $\sum_{i=1}^{k} c_i = 1$. Let $R_i = p - x_i$ and
$R'_i = p^+ - x'_i$. Now suppose $\lambda$ is greater than 1. Then
$$(1)\qquad R_i'^{\,2} > R_i^2 \quad \text{for } 1 \le i \le k.$$
On the other hand, we have by the hypothesis:
$$(R'_i - R'_j)^2 \le (R_i - R_j)^2,$$
and after expanding and using (1):
$$(2)\qquad R'_i R'_j > R_i R_j, \quad 1 \le i, j \le k.$$
Now, as $\sum_{i=1}^{k} c_i = 1$, we have $\bigl(\sum_{i=1}^{k} c_i\bigr)\, p^+ = \sum_{i=1}^{k} c_i x'_i$, and therefore $\sum_{i=1}^{k} c_i R'_i = 0$,
and, by (1) and (2), $0 = \bigl(\sum c_i R'_i\bigr)^2 > \bigl(\sum c_i R_i\bigr)^2$, a contradiction. We have proved that
$\lambda \le 1$.
Q.E.D.
Now, it is easy to generalize this result for Hilbert space:
1.30. Corollary: Let $\{x_\alpha\}$ and $\{x'_\alpha\}$ be two sets of points in the Hilbert
space $H$, and $p \in H$. Suppose $|x'_\alpha - x'_\beta| \le |x_\alpha - x_\beta|$. Then there exists
$p' \in H$ such that $|x'_\alpha - p'| \le |x_\alpha - p|$ for all $\alpha$.
Proof: We want to prove that the intersection of the infinite family of
spheres with center $x'_\alpha$ and radius $|x_\alpha - p|$ is non-void. But spheres are compact
in the weak topology for $H$, so it is sufficient to prove that every finite
subfamily of spheres has a non-void intersection. If, then, there are only
finitely many $x_\alpha$'s, the set $\{x_\alpha\} \cup \{x'_\alpha\}$, together with $p$, generates a finite dimensional Euclidean
space, and we have only to apply the lemma.
Q.E.D.
1.30A. As the following counterexample (due to Charles McCarthy)
shows, the obvious generalization of Kirszbraun's lemma to Banach spaces
that are not Hilbert spaces is not true in general. We give the following
Theorem: Let $l_p^n$, $1 < p < \infty$, $n \ge 1$, be n-dimensional Euclidean space
with the norm
$$|x|_p = \Bigl(\sum_{i=1}^{n} |x_i|^p\Bigr)^{1/p}.$$
Then if $n > 1$, $p \ne 2$, the generalization of Kirszbraun's lemma does not
hold.
Proof: Take in $l_p^n$, $p > 2$, $n > 1$, the points $x_1 = (0, 0, \ldots, 0)$, $x_2 = (0, 1, 0, \ldots, 0)$,
$x_3 = (1, 0, \ldots, 0)$. Evidently
$$|x_1 - x_2|_p = |x_1 - x_3|_p = 1, \qquad |x_2 - x_3|_p = 2^{1/p}.$$
Choose now spheres $S_1, S_2, S_3$ around $x_1, x_2, x_3$ of radii $2^{(1-p)/p}$. We have
$$S_1 \cap S_2 \cap S_3 = \{(\tfrac12, \tfrac12, 0, \ldots, 0)\}.$$
Now let
$$x'_1 = (0, 0, \ldots, 0), \quad x'_2 = ((1 - 2^{1-p})^{1/p},\; 2^{(1-p)/p},\; 0, \ldots, 0),$$
$$x'_3 = ((1 - 2^{1-p})^{1/p},\; -2^{(1-p)/p},\; 0, \ldots, 0).$$
Again
$$|x'_1 - x'_2|_p = |x'_1 - x'_3|_p = 1, \qquad |x'_2 - x'_3|_p = 2^{1/p}.$$
But if we take spheres $S'_1, S'_2, S'_3$ of radii $2^{(1-p)/p}$ around $x'_1, x'_2, x'_3$, their
intersection will be void. In fact, by uniform convexity
$$S'_2 \cap S'_3 = \{((1 - 2^{1-p})^{1/p}, 0, \ldots, 0)\} = \{y\}.$$
Since $(1 - 2^{1-p})^{1/p} > 2^{(1-p)/p}$ if $p > 2$, $S'_1 \cap S'_2 \cap S'_3 = \emptyset$. The case $l_p^n$,
$1 < p < 2$, $n > 1$ may be handled in a similar way; the points $x_1, x_2, x_3$
are replaced by $x'_1, x'_2, x'_3$ and vice versa, and the radii of $S_1, S_2, S_3$ become
$(1 - 2^{1-p})^{1/p}$, $2^{(1-p)/p}$, $2^{(1-p)/p}$ respectively.
Q.E.D.
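The distances in this counterexample can be verified numerically. The sketch below is an illustration, assuming the sample values p = 3 and n = 2; it checks the stated distances and that the single point y of S'_2 ∩ S'_3 lies outside S'_1. That the intersection of S'_2 and S'_3 reduces to y is taken from the convexity argument in the text, not computed here.

```python
def norm_p(v, p):
    return sum(abs(c) ** p for c in v) ** (1.0 / p)

def dist(u, v, p):
    return norm_p([a - b for a, b in zip(u, v)], p)

p = 3.0                                  # sample exponent with p > 2 (assumption)
r = 2.0 ** ((1.0 - p) / p)               # common radius 2^((1-p)/p)
c = (1.0 - 2.0 ** (1.0 - p)) ** (1.0 / p)

x = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0)]         # x_1, x_2, x_3
xp = [(0.0, 0.0), (c, r), (c, -r)]               # x'_1, x'_2, x'_3

# Pairwise distances agree, so the hypothesis of Kirszbraun's lemma holds.
pairs = [(0, 1), (0, 2), (1, 2)]
gaps = [dist(xp[i], xp[j], p) - dist(x[i], x[j], p) for i, j in pairs]

m = (0.5, 0.5)                                   # common point of S_1, S_2, S_3
m_dists = [dist(m, xi, p) for xi in x]

y = (c, 0.0)                                     # the only point of S'_2 and S'_3
y_to_x1p = dist(y, xp[0], p)                     # exceeds r, so y is not in S'_1
```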
1.31. Theorem: Let $H$ be a Hilbert space, $S$ any subset of $H$, and $\phi : S \to H$.
Suppose $|\phi(x) - \phi(y)| \le K\,|x - y|$ for all $x, y \in S$. Then $\phi$ can be extended
to all of $H$ in such a way that the extension satisfies the same Lipschitz
condition.
Proof: Without loss of generality we can suppose that $K = 1$. By Zorn's
lemma, there exists a maximal extension $\tilde\phi$ subject to the same Lipschitz
condition. Suppose $p \notin$ domain of $\tilde\phi$. We have $|\tilde\phi(x) - \tilde\phi(y)| \le |x - y|$ for
$x, y \in$ domain $\tilde\phi$. Therefore, by the last corollary, we can find $p' \in H$ such
that
$$|\tilde\phi(x) - p'| \le |x - p| \quad \text{for all } x \in \text{domain } \tilde\phi.$$
If we define $\tilde\phi(p) = p'$, we have extended $\tilde\phi$ to one more point preserving the
Lipschitz condition, thus contradicting the maximality of $\tilde\phi$. Hence
domain of $\tilde\phi = H$.
Q.E.D.
We now make some definitions preparatory for the next theorem.
1.32. Definition: We say that $\phi : X \to Y$ ($X$, $Y$ are B-spaces) is feebly
continuous if the mapping $t \to \phi(x + ty)$ is continuous from $R$ to $Y$ with the
weak topology for every pair $x, y \in X$.
1.33. Definition: $\phi : X \to Y$ ($X$, $Y$ are B-spaces) is slightly continuous if
$x_n \to x$ strongly in $X$ implies $\phi(x_n) \to \phi(x)$ weakly in $Y$.
1.34. Remark: As is easily seen, continuity implies slight continuity
implies feeble continuity.
1.35. Theorem (Minty): (a) Let $\phi : S \to H$ ($H$ is a Hilbert space) be defined
in an open set $S \subset H$, and suppose $\phi$ is feebly continuous and strongly
monotone. Then $\phi$ is an open mapping. (b) Let $\phi : H \to H$ be defined everywhere,
and suppose $\phi$ is slightly continuous and strongly monotone. Then $\phi$
maps $H$ onto $H$.
Proof: (a) As we remarked earlier, we can assume without loss of generality
that $\phi = \mathrm{id} + T$, where $T$ is monotone. Now consider the Hilbert direct
sum $H \oplus H$. We introduce the relations:
$$(1)\qquad [x, y]\; M\; [x', y'] \text{ iff } (x - x', y - y') \ge 0$$
and
$$(2)\qquad [x, y]\; L\; [x', y'] \text{ iff } |y - y'| \le |x - x'|.$$
(Note that neither is transitive.)
Let $\Phi : H \oplus H \to H \oplus H$ (Cayley transformation) be defined by
$$(3)\qquad \Phi([x, y]) = \frac{1}{\sqrt 2}\,[x + y,\; x - y].$$
It is easy to see that $\Phi$ is an isometry (of course $|[x, y]|^2 = |x|^2 + |y|^2$), and
that $\Phi^2 = \mathrm{id}$.
Now let $p = [x, y]$ and $q = [x', y']$. We have:
$$(4)\qquad pMq \text{ iff } \Phi(p)\, L\, \Phi(q).$$
For
$$\Phi(p)\, L\, \Phi(q)$$
iff $|x - y - x' + y'|^2 \le |x + y - x' - y'|^2$
iff $-2\,(x - x', y - y') \le 2\,(x - x', y - y')$
iff $(x - x', y - y') \ge 0$ iff $pMq$.
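Call $\Gamma \subset H \oplus H$ the graph of $T$. Since $T$ is monotone,
$$(5)\qquad pMq \text{ for all } p, q \in \Gamma.$$
By (4), if we put $\Gamma_1 = \Phi(\Gamma)$, we get:
$$(6)\qquad pLq \text{ for all } p, q \in \Gamma_1.$$
This means that $\Gamma_1$ is the graph of some function $S_1$ satisfying a Lipschitz
condition with $K = 1$. Obviously the domain of $S_1$ is the set of points
$\frac{1}{\sqrt 2}(x + y) \in H$ such that $[x, y] \in \Gamma$, i.e.,
$$(7)\qquad \text{domain}(S_1) = \frac{1}{\sqrt 2}\,\text{range}(\mathrm{id} + T).$$
By the previous theorem, we can extend $S_1$ to a function $S_2$ defined on all
of $H$ and satisfying the same Lipschitz condition. Let $\Gamma_2$ = graph of $S_2$ and
$\Gamma_3 = \Phi(\Gamma_2)$. Then $\Gamma_3 \supset \Gamma$, because $\Gamma_2 \supset \Gamma_1$. $S_2$ satisfies the Lipschitz
condition, whence
$$(8)\qquad p, q \in \Gamma_2 \text{ implies } pLq$$
and
$$(9)\qquad p, q \in \Gamma_3 \text{ implies } pMq$$
(apply (4) and recall that $\Phi^2 = \mathrm{id}$).
Now, by (3) and (7) we have:
$$(10)\qquad \Gamma = \Bigl\{ \tfrac{1}{\sqrt 2}\,[(\mathrm{id} + S_2)\,x,\; (\mathrm{id} - S_2)\,x];\; x \in \tfrac{1}{\sqrt 2}\,\text{range}(\mathrm{id} + T) \Bigr\}$$
and
$$(11)\qquad \Gamma_3 = \Bigl\{ \tfrac{1}{\sqrt 2}\,[(\mathrm{id} + S_2)\,x,\; (\mathrm{id} - S_2)\,x];\; x \in H \Bigr\}.$$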
Suppose now that the range of $\mathrm{id} + T$ = range of $\phi$ is not open; then there
exists a point in this range which is a limit of a sequence of points not in the
range, and by (10) and (11), this means that there exists a point $[y, z] \in \Gamma$
such that $[y, z] = \lim\, [y_n, z_n]$ and $\{[y_n, z_n]\} \subset \Gamma_3 - \Gamma$. But, by hypothesis,
the domain of $T$ is open; hence for some $n_0$,
$y_{n_0} \in$ domain of $T$. Putting $y^* = y_{n_0}$, $z^* = z_{n_0}$, we arrive at
the following conclusion: There exists a pair $[y^*, z^*]$ such that:
(i) $y^* \in$ domain of $T$,
(ii) $[y, z]\; M\; [y^*, z^*]$ for every pair $[y, z] \in \Gamma$,
(iii) $z^* \ne Ty^*$.
Now we show that this leads to a contradiction.
By (i), for small $\varepsilon > 0$, $y = y^* + \varepsilon\,(z^* - Ty^*)$ belongs to the domain of $T$,
so, by (ii), we have:
$$(y^* - y,\; z^* - Ty) \ge 0,$$
and using the definition of $y$:
$$-\varepsilon\,\bigl(z^* - Ty^*,\; z^* - T(y^* + \varepsilon\,(z^* - Ty^*))\bigr) \ge 0$$
or
$$\bigl(z^* - Ty^*,\; z^* - T(y^* + \varepsilon\,(z^* - Ty^*))\bigr) \le 0.$$
As $\varepsilon \to 0$, we have, using the feeble continuity of $T$:
$$|z^* - Ty^*|^2 \le 0.$$
Thus $z^* = Ty^*$, which contradicts (iii). This proves that the range of $\phi$ is open.
(b) As we remarked before, $\phi$ is 1-1, because it is strongly monotone, and
moreover $\phi^{-1}$ satisfies a Lipschitz condition with constant $1/a$, as follows
from Definition 1.26 and Schwarz's inequality. Now to prove that $\phi$ is
onto we use the same argument as was used in Theorem 1.22, in proving
that if $x(t)$ is defined for $t < a$, it is defined for $t = a$; as before, we use
the Lipschitz condition on $\phi^{-1}$ to prove that $x(t)$ has a limit as $t \to a$. Then
we use the slight continuity of $\phi$ to prove that $\phi(x(a)) = y(a)$.
Q.E.D.
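The key algebraic step in part (a), relation (4), can be checked numerically. The following sketch is illustrative only and takes H to be R^3 (an assumption); it draws random pairs p, q and confirms that pMq holds exactly when Φ(p) L Φ(q), skipping pairs whose inner product is too close to zero to decide in floating point.

```python
import math
import random

def inner(u, v):
    return sum(a * b for a, b in zip(u, v))

def sub(u, v):
    return [a - b for a, b in zip(u, v)]

def cayley(p):
    """Phi([x, y]) = (1 / sqrt 2) [x + y, x - y], as in formula (3)."""
    x, y = p
    s = 1.0 / math.sqrt(2.0)
    return ([s * (a + b) for a, b in zip(x, y)],
            [s * (a - b) for a, b in zip(x, y)])

def rel_M(p, q):
    return inner(sub(p[0], q[0]), sub(p[1], q[1])) >= 0.0

def rel_L(p, q):
    dx, dy = sub(p[0], q[0]), sub(p[1], q[1])
    return inner(dy, dy) <= inner(dx, dx)

random.seed(0)
def rand_pair():
    return ([random.gauss(0, 1) for _ in range(3)],
            [random.gauss(0, 1) for _ in range(3)])

mismatches = 0
for _ in range(1000):
    p, q = rand_pair(), rand_pair()
    if abs(inner(sub(p[0], q[0]), sub(p[1], q[1]))) < 1e-9:
        continue  # too close to the boundary case for a floating point test
    if rel_M(p, q) != rel_L(cayley(p), cayley(q)):
        mismatches += 1

p0 = rand_pair()
twice = cayley(cayley(p0))  # Phi applied twice should return p0
```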

We now establish an additional theorem for monotone functions:


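1.36. Theorem: Suppose $\phi : H \to H$ ($H$ a Hilbert space) is monotone and
continuous. If $p \in H$ is such that $(x, \phi(x) - p) \ge 0$ for $|x| \ge R_p$ (where $R_p$
is a number depending on $p$), then $p$ belongs to the range of $\phi$.

Proof: We can assume that $p = 0$, and write $R = R_p$. Let $\phi_\varepsilon(x) = \varepsilon x + \phi(x)$ with $\varepsilon > 0$.
Then $\phi_\varepsilon$ is strongly monotone, and by Minty's theorem there exists $x_\varepsilon \in H$
such that
$$(1)\qquad \varepsilon x_\varepsilon + \phi(x_\varepsilon) = 0,$$
so that, taking the scalar product with $x_\varepsilon$, we have
$$\varepsilon\,|x_\varepsilon|^2 + (\phi(x_\varepsilon), x_\varepsilon) = 0.$$
Therefore, for every $\varepsilon > 0$, $|x_\varepsilon|$ must be smaller than $R$, so that the $x_\varepsilon$ form
a bounded set. Hence there exists a sequence $\varepsilon_n \to 0$ such that the sequence
$\{x_{\varepsilon_n}\}$ tends weakly to a point $x_\infty$, and we can suppose that $|x_{\varepsilon_n}|$ also
converges. Now, by the monotonicity of $\phi$ and (1), we get:
$$(x_\delta - x_\varepsilon,\; \delta x_\delta - \varepsilon x_\varepsilon) \le 0 \quad \text{for every } \delta > 0 \text{ and } \varepsilon > 0.$$
Then, if we put $\delta = \varepsilon_n$ and let $n \to \infty$,
$$(x_\infty - x_\varepsilon,\; -\varepsilon x_\varepsilon) \le 0 \quad \text{for every } \varepsilon > 0$$
or
$$(x_\infty - x_\varepsilon,\; x_\varepsilon) \ge 0.$$
Therefore
$$(2)\qquad |x_\infty|^2 \ge \lim |x_{\varepsilon_n}|^2.$$
On the other hand, spheres in B-spaces are weakly closed, and this means
that:
$$(3)\qquad |x_\infty| \le \lim |x_{\varepsilon_n}|.$$
Hence, by (2) and (3),
$$|x_\infty| = \lim |x_{\varepsilon_n}|$$
and so
$$x_{\varepsilon_n} \to x_\infty \text{ strongly.}$$
By the continuity of $\phi$ we get $\phi(x_\infty) = 0$.
Q.E.D.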

1.37. Example: Suppose $(S, \Sigma, \mu)$ is a finite measure space, and $V$ a
finite dimensional linear space. Let $f : S \times V \to V$ be continuous in $V$ for
every $s \in S$, and such that $|f(s, u)| \le K\,(|u| + 1)$ for some constant $K \ge 1$ and
every $s \in S$, $u \in V$. Then $\phi(s) \to f(s, \phi(s))$ maps $L_2(S, V)$ into $L_2(S, V)$ because
$$\int_S |f(s, \phi(s))|^2\, d\mu \le K^2 \int_S (|\phi(s)| + 1)^2\, d\mu.$$
We call this map $\phi \to F(\phi)$. If a sequence $\{\phi_n(s)\}$ converges in $L_2$ to $\phi(s)$,
then there exists a subsequence $\{\phi_{n_i}(s)\}$ converging to $\phi(s)$ a.e. Hence
$F(\phi_{n_i})(s) \to F(\phi)(s)$ a.e. But $|F(\phi_{n_i})(s)| \le K\,(|\phi_{n_i}(s)| + 1)$, whence there exists
a subsequence $\{\phi_{m_j}\}$ such that $|\phi_{m_j}(s)| \le \psi(s)$ for all $j$, where $\psi(s)$ is a square-summable
function. (Simply take $\{\phi_{m_j}\}$ such that $\sum_{j=1}^{\infty} |\phi_{m_{j+1}} - \phi_{m_j}|_{L_2} < \infty$.) By the
Lebesgue theorem, $F(\phi_{m_j}) \to F(\phi)$ in $L_2$. This proves that $F$ is continuous.
Now let $D$ be a linear operator in $L_2$ with bounded inverse, and suppose
we want to solve the equation
$$D^* F D x = y.$$
By Minty's theorem, if $D^*FD$ is strongly monotone then there exists a solution
for any $y \in L_2$. The condition for strong monotonicity is
$$(D^*FDx_1 - D^*FDx_2,\; x_1 - x_2) \ge \varepsilon\,|x_1 - x_2|^2 \quad \text{where } \varepsilon > 0,\; x_1, x_2 \in L_2,$$
or
$$(Fx'_1 - Fx'_2,\; x'_1 - x'_2) \ge \varepsilon'\,|x'_1 - x'_2|^2 \quad \text{where } \varepsilon' > 0,\; x'_1, x'_2 \in L_2.$$
(Calling $x'_i = Dx_i$, $i = 1, 2$, and remembering that $D$ is bounded and has a
bounded inverse.)
We therefore see that for solvability it is sufficient to have
$$(f(s, v) - f(s, v'),\; v - v') \ge \varepsilon'\,|v - v'|^2 \quad \text{for all } s \in S,\; v, v' \in V.$$
(Note that here the scalar product is that in $R^n$, whereas before it was the
one in $L_2$.)
This condition is implied by:
$$(df(s, v)\,v',\; v') \ge \varepsilon'\,|v'|^2 \quad \text{for all } s \in S,\; v, v' \in V,$$
which is equivalent to the following condition: There exists an $\varepsilon > 0$ such
that the symmetric matrix ${}^tJ + J - \varepsilon I$ is positive definite, where $J$ is the
Jacobian matrix of $f$. Thus for the existence of solutions of the
above equation, it is sufficient to require that the matrix
$$A = \Bigl( \frac{\partial f_i}{\partial x_j} + \frac{\partial f_j}{\partial x_i} \Bigr)_{i,j}$$
has smallest eigenvalue $\ge \varepsilon > 0$ at every point.

E. Compact Mappings

1.38. Definition: Let $E$, $F$ be two T.L.S. $\phi : E \to F$ is compact iff it is
continuous and maps bounded sets into relatively compact sets, i.e., if $B \subset E$ is
bounded, then $\phi(B)$ is relatively compact.
1.39. Definition: $\phi : E \to F$ is called locally compact at a point $p \in E$ iff $\phi$
is continuous in a neighborhood $V$ of $p$, and maps $V$ into a relatively compact
set.
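1.40. Theorem: Let $E$, $F$ be T.L.S., $F$ complete. Let $\phi : E \to F$ be
F-differentiable at $p \in E$ and locally compact at $p$. Then $d\phi(p)$ is a compact linear
operator.

Proof: We may suppose that $p = 0$ and that $\phi(p) = 0$. Let $A = d\phi(p)$ and
suppose $A$ is not compact. Let $S$ be a bounded subset of $E$, with non-compact
$A(S)$. By the completeness of $F$, we can find a family $\{x_\alpha\} \subset A(S)$ and
a neighborhood $U$ of 0 in $F$ such that $x_\alpha - x_\beta \notin U$ whenever $\alpha \ne \beta$. Now
let $\{y_\alpha\} \subset S$ be such that $Ay_\alpha = x_\alpha$, and for any $\delta > 0$, let us define:
$$\eta_\delta(x_\alpha) = \phi(\delta y_\alpha).$$
Then we have:
$$(1)\qquad \eta_\delta(x_\alpha) - \delta x_\alpha = \phi(\delta y_\alpha) - \delta x_\alpha = A(\delta y_\alpha) - \delta x_\alpha + \psi(\delta y_\alpha) = \psi(\delta y_\alpha),$$
where $\psi$ is a function horizontal at 0, i.e. for every neighborhood $V$ of 0 in $F$,
there exists a neighborhood $W$ of 0 in $E$ such that
$$\psi(\delta W) \subset o(\delta)\,V.$$
Now
$$(2)\qquad \eta_\delta(x_\alpha) - \eta_\delta(x_\beta) = (\delta x_\alpha - \delta x_\beta) + (\eta_\delta(x_\alpha) - \delta x_\alpha) + (\delta x_\beta - \eta_\delta(x_\beta)).$$
Choose a symmetric and circled neighborhood $V$ of 0 in $F$ such that
$V + V + V \subset U$. For such a neighborhood there exists another one $W$ in $E$
such that
$$\psi(\delta W) \subset o(\delta)\,V.$$
Since $S$ is bounded, there exists $\lambda > 0$ such that $\lambda S \subset W$. Then for every $\alpha$,
$$\eta_\delta(x_\alpha) - \delta x_\alpha = \psi(\delta y_\alpha) \in \psi\Bigl(\frac{\delta}{\lambda}\,W\Bigr) \subset o\Bigl(\frac{\delta}{\lambda}\Bigr)\,V.$$
But as $o(\delta/\lambda)/\delta \to 0$ as $\delta \to 0$, it follows that for sufficiently small $\delta$
$$o\Bigl(\frac{\delta}{\lambda}\Bigr)\,V \subset \delta V.$$
Therefore, by (2), $\eta_\delta(x_\alpha) - \eta_\delta(x_\beta) \notin \delta V$ whenever $\alpha \ne \beta$, because if
$\eta_\delta(x_\alpha) - \eta_\delta(x_\beta) \in \delta V$, then $\delta x_\alpha - \delta x_\beta \in \delta V + \delta V + \delta V \subset \delta U$, contrary to our assumption.
Hence $\phi(\delta y_\alpha) - \phi(\delta y_\beta) \notin \delta V$ for $\alpha \ne \beta$, and for sufficiently small $\delta$, this
contradicts the local compactness of $\phi$.
Q.E.D.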

F. Higher Differentials and Taylor's Theorem

We recall that if $X_1, X_2, \ldots, X_n, Z$ are linear spaces over the same scalar
field, a function $M : (X_1 \times X_2 \times \ldots \times X_n) \to Z$ is multilinear or n-linear
if it is linear in each of the variables separately. If $X_1, \ldots, X_n, Z$ are B-spaces,
$M$ is continuous iff there exists a constant $K$ such that $|M(x_1, \ldots, x_n)|
\le K\,|x_1|\,|x_2| \ldots |x_n|$ for all $x_i$ in $X_i$, i.e. iff it is bounded. The minimum of the
numbers $K$ satisfying this inequality will be called the norm of $M$, $|M|$. The
set of all n-linear bounded maps $M$ from $X_1 \times X_2 \times \ldots \times X_n$ to $Z$ will be
denoted by $B(X_1, \ldots, X_n; Z)$, and it is easy to verify that if the $X_i$ and $Z$ are
B-spaces then $B(X_1, \ldots, X_n; Z)$ is a B-space with the usual addition
and scalar multiplication, and with the norm defined above. In the case
$X_1 = X_2 = \ldots = X_n = X$, $B(X_1, \ldots, X_n; Z)$ will be written $B^n(X, Z)$.
1.41. Lemma: Let $X_1, \ldots, X_n, Z$ be B-spaces over the same scalar field.
Then there is an isometric isomorphism between $B(X_1, \ldots, X_n; Z)$ and
$B(X_1, B(X_2, \ldots, B(X_n, Z)) \ldots)$.
The proof is left as an exercise for the reader.
Suppose that $f$ is F-differentiable on a set $D_0$ in $X$ with range of $f$ in $Z$. Then
the function $f_1$, defined for $x \in D_0$ by $f_1(x) = df(x)$, has its values in the
B-space $B(X, Z)$. It makes sense to ask if $f_1$ is differentiable. If it is, then the
differential of $f_1 = df$ will have its values in the space $B^2(X, Z)$ of bounded
bilinear functions of $X$ to $Z$, where, by the lemma above, we have identified
$B^2(X, Z)$ and $B(X, B(X, Z))$. We define the differential of $f_1 = df$ at a
point $c$ to be the second differential of $f$ at $c$ and we denote this second
differential by $d^2f(c)$. Hence $d^2f(c)$ is a bounded bilinear function on $X$ to $Z$.
Higher order differentials are defined by induction.

1.42. Definition: A function $f$ on $D \subset X$ to $Z$ is said to be in class $C^n$ on
$D$, written $f \in C^n$, iff the n-th differential $d^nf$ exists at every point of $D$ and the
mapping $x \to d^nf(x)$ of $D$ into $B^n(X; Z)$ is continuous. If $f \in C^n$ for all $n$, we
say that $f \in C^\infty$. Observe that if $f : X \to Y \in C^n$ on a neighborhood of a point
$c \in X$ and if $g : Y \to Z \in C^n$ on a neighborhood of the point $b = f(c)$, then
$h = g \circ f : X \to Z \in C^n$ on a neighborhood of $c$.
We now prove Taylor's theorem for B-spaces. We shall write $x^{(k)}$ for the
k-tuple $(x, x, \ldots, x)$.
1.43. Theorem: Suppose that $f \in C^n$ on an open set $D$ which contains the
line segment joining $c$ to $c + x$. Then
$$f(c + x) = f(c) + \frac{1}{1!}\,df(c; x) + \frac{1}{2!}\,d^2f(c; x^{(2)}) + \ldots + \frac{1}{(n-1)!}\,d^{n-1}f(c; x^{(n-1)})$$
$$+ \frac{1}{(n-1)!} \int_0^1 (1 - t)^{n-1}\, d^nf(c + tx; x^{(n)})\, dt.$$
Proof: Since the map $t \to d^nf(c + tx; x^{(n)})$ is continuous on $[0, 1]$ to $Z$,
it is clear that both sides of the equation have a meaning. To establish
the equality let $z^*$ be a continuous linear functional on $Z$ and let $F$ be defined
on $[0, 1]$ to the scalar field by $F(t) = z^*f(c + tx)$. Then $F^{(k)}(t)
= z^*[d^kf(c + tx; x^{(k)})]$ for $0 \le k \le n$ and we can apply the scalar-valued
form of Taylor's theorem to $F$. If we observe that $z^*$ commutes with
integration, we can apply the Hahn-Banach theorem and obtain the result.
Q.E.D.
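Taylor's formula with integral remainder can be checked numerically in a simple case. The sketch below is an illustration under stated assumptions: X is taken to be R^2, Z to be R, and f(u) = exp(u_1 + 2 u_2), for which the k-th differential is d^k f(c; x^(k)) = (x_1 + 2 x_2)^k f(c). The integral is approximated by Simpson's rule, and the remainder uses the coefficient 1/(n-1)!.

```python
import math

def simpson(g, lo, hi, m=200):
    """Composite Simpson rule with an even number m of subintervals."""
    h = (hi - lo) / m
    total = g(lo) + g(hi)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * g(lo + i * h)
    return total * h / 3.0

# Illustrative data (assumptions, not from the text).
c = (0.1, 0.2)
x = (0.3, -0.4)
n = 3

def f(u):
    return math.exp(u[0] + 2.0 * u[1])

def d(k, point):
    """k-th differential d^k f(point; x^(k)) for this particular f."""
    return (x[0] + 2.0 * x[1]) ** k * f(point)

# Polynomial part: terms k = 0, ..., n-1.
poly = sum(d(k, c) / math.factorial(k) for k in range(n))

def integrand(t):
    point = (c[0] + t * x[0], c[1] + t * x[1])
    return (1.0 - t) ** (n - 1) * d(n, point)

remainder = simpson(integrand, 0.0, 1.0) / math.factorial(n - 1)
lhs = f((c[0] + x[0], c[1] + x[1]))
taylor_error = abs(lhs - (poly + remainder))
```

Up to quadrature error, `taylor_error` should vanish, in agreement with the theorem.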

1.44. Corollary: Under the hypotheses of Taylor's theorem, there exists a
bounded n-linear function $R_n$ from $X$ to $Z$ such that
$$f(c + x) = f(c) + \frac{1}{1!}\,df(c; x) + \ldots + \frac{1}{(n-1)!}\,d^{n-1}f(c; x^{(n-1)}) + R_n(x^{(n)}).$$
Proof: Let
$$R_n = \frac{1}{(n-1)!} \int_0^1 (1 - t)^{n-1}\, d^nf(c + tx)\, dt.$$
Q.E.D.

1.45. Corollary: Under the hypotheses of Taylor's theorem, there exists a
function $Q$ on a neighborhood of the origin in $X$ to $Z$ such that
$$f(c + x) = f(c) + df(c; x) + \ldots + \frac{1}{(n-1)!}\,d^{n-1}f(c; x^{(n-1)}) + \frac{1}{n!}\,d^nf(c; x^{(n)}) + Q(x),$$
where $Q(x) = o(|x|^n)$.

Proof: Observe that
$$\int_0^1 (1 - t)^{n-1}\, dt = \frac{1}{n},$$
and define $Q(x)$ to be
$$Q(x) = \frac{1}{(n-1)!} \int_0^1 (1 - t)^{n-1}\, [d^nf(c + tx; x^{(n)}) - d^nf(c; x^{(n)})]\, dt.$$
Q.E.D.

Note: The reader may easily generalize the definitions and results of this
section to the case of locally convex T.L.S.

G. Complex Analyticity

Let X and Y be complex B-spaces.


1.46. Definition: We say that $\phi : X \to Y$ is complex analytic on an open
subset $D$ of $X$ iff $\phi(z_1x_1 + z_2x_2 + \ldots + z_nx_n)$ is analytic in $z_1, \ldots, z_n$ for
every $x_1, \ldots, x_n$ in $D$, $z_i$ complex.
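If $\phi$ is complex analytic, we have immediately
$$\phi(x) = \frac{1}{2\pi i} \oint \frac{\phi(x + \zeta y)}{\zeta}\, d\zeta \quad \text{(Cauchy's formula)}$$
and
$$d^n\phi(x; y^{(n)}) = \frac{n!}{2\pi i} \oint \frac{\phi(x + \zeta y)}{\zeta^{n+1}}\, d\zeta.$$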

1.47. Lemma: If $v(x + iy)$ is a $C^1$ vector-valued function of $x$ and $y$, and
if $\dfrac{1}{i}\,\dfrac{\partial v}{\partial y} = \dfrac{\partial v}{\partial x}$, then $v$ is complex analytic.
Proof: Let $v^*$ be a continuous linear functional. $v^*(v(x + iy))$ is a complex
valued function which satisfies the Cauchy-Riemann equations and is
therefore analytic. Thus
$$v^*(v(z)) = \frac{1}{2\pi i} \oint \frac{v^*(v(\zeta))}{\zeta - z}\, d\zeta,$$
and by the Hahn-Banach theorem, the Cauchy formula holds, and $v$ is
analytic.
Q.E.D.
Suppose $\phi$ is analytic in $D$. Assume that $0 \in D$ and $\phi'(0)$ is invertible. By
the implicit function theorem, $\phi$ has a local inverse, $\psi$. If $v$ and $v + zu$ are
sufficiently near 0,
$$\phi(\psi(v)) = v \quad \text{and} \quad \phi(\psi(v + zu)) = v + zu.$$
Then
$$\phi'(\psi(v))\, \frac{d}{dt}\,\psi(v + tu)\Big|_{t=0} = u$$
and
$$\phi'(\psi(v))\, \frac{d}{dt}\,\psi(v + itu)\Big|_{t=0} = iu.$$
Since $\phi'(\psi(v))$ is invertible,
$$\frac{d}{dt}\,\psi(v + itu) = i\,\frac{d}{dt}\,\psi(v + tu).$$
By the preceding lemma, $\psi(v + zu)$ is analytic in $z$, and consequently $\psi$ is
analytic. We have therefore proved the implicit function theorem in the
analytic case:
1.48. Theorem: Under the hypotheses of the implicit function theorem 1.20,
and if $\phi$ is analytic, $\phi^{-1}$ is also complex analytic.

H. Derivatives of Quadratic Forms

1.49. Definition: If $B$, $V$ are linear spaces and $f : B \to V$, we say that $f$ is a
quadratic form if the expression
$$(1)\qquad \beta(x, y) = f(x + y) - f(x) - f(y)$$
is bilinear in $x$ and $y$, and if
$$(2)\qquad f(\lambda x) = \lambda^2 f(x)$$
for every $x \in B$ and every scalar $\lambda$.
It is clear that in such a case $f(-x) = f(x)$ and $f(0) = 0$. $\beta$ is called the
bilinear form associated with $f$. From the definition it follows that
$$(3)\qquad f(x) = \tfrac{1}{2}\,\beta(x, x),$$
and therefore $f$ and $\beta$ determine each other. Suppose now that $B$ and $V$ are
Banach spaces. It is clear that $f$ is continuous if and only if $\beta$ is continuous.
If $f$ is continuous, then
$$(4)\qquad |f(x)| \le \tfrac{1}{2}\,\|\beta\|\,|x|^2,$$
where $\|\beta\|$ stands for the norm of $\beta$ as a bilinear function. If we define
$\|f\| = \tfrac{1}{2}\|\beta\|$, (4) may be written
$$|f(x)| \le \|f\|\,|x|^2.$$
The main theorem on derivatives of quadratic forms is the following:

1.50. Theorem: Every continuous quadratic form has F-derivatives of all
orders. Denote by $\beta$ the bilinear form associated with $f$ and by $\|f\|$ the number
$\tfrac{1}{2} \sup |\beta(x, y)|$, $|x| \le 1$, $|y| \le 1$. The first and second derivatives of $f$ are:
$$f'(z)\,h = \beta(z, h),$$
$$f''(z) = \beta,$$
and the higher derivatives vanish identically. From the equations above it
follows that:
$$|f'(z)| \le 2\,\|f\|\,|z|,$$
$$\|f''(z)\| = 2\,\|f\|.$$
Proof: From the definition of $\beta$ it follows that
$$f(z + h) - f(z) = f(h) + \beta(z, h),$$
and (4) implies that $f(h) = o(|h|)$. Since $\beta(z, h)$ is linear in $h$, it follows that $f$
has an F-derivative at every $z$ and the equality $f'(z)\,h = \beta(z, h)$ holds. Now
$f' : B \to \mathrm{Hom}(B, V)$ (being equal to $\beta$) is obviously linear. Then from the
general fact that the derivative of a linear mapping at any point is that linear
mapping, we conclude that $f''(z) = f'$, or $f''(z) = \beta$.
Q.E.D.
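For a concrete quadratic form the statements of Theorem 1.50 reduce to matrix identities. The sketch below is illustrative only: it assumes B = R^2, V = R and f(x) = (Ax, x) for a sample matrix `A`, so that the associated bilinear form is beta(x, y) = ((A + A^T) x, y).

```python
A = [[2.0, 1.0], [-3.0, 0.5]]  # sample matrix (assumption)

def f(x):
    """Quadratic form f(x) = (Ax, x)."""
    return sum(x[i] * A[i][j] * x[j] for i in range(2) for j in range(2))

def beta(x, y):
    """Associated bilinear form beta(x, y) = ((A + A^T) x, y)."""
    return sum(x[i] * (A[i][j] + A[j][i]) * y[j]
               for i in range(2) for j in range(2))

def add(u, v, t=1.0):
    return [a + t * b for a, b in zip(u, v)]

z = [0.7, -1.2]
h = [0.3, 0.4]

# Identity used in the proof: f(z + h) - f(z) = f(h) + beta(z, h).
identity_gap = f(add(z, h)) - f(z) - f(h) - beta(z, h)

# Difference quotient approximating f'(z) h; it differs from beta(z, h) by t * f(h).
t = 1e-6
quotient = (f(add(z, h, t)) - f(z)) / t
derivative_gap = quotient - beta(z, h)
```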
CHAPTER II

Hard Implicit Functional Theorems

A. Newton's Method and the Nash Implicit Functional Theorem
B. A Partial Differential Equation
C. Embedding of Riemannian Manifolds

A. Newton's Method and the Nash Implicit Functional Theorem

The following theorem will be proved by the so-called Newton's method.


2.1. Theorem: Let $B$ be a Banach space, and let $f$ be a mapping whose
domain $D(f)$ is the unit sphere of $B$. Suppose that:
(i) $f$ has two continuous Fréchet derivatives in $D(f)$, both bounded above
by a constant $M$, which we assume to exceed 2.
(ii) There exists a map $L(u)$ with domain $D(L) = D(f)$ and range in the
space $B(B)$ of bounded linear maps of $B$ into itself, such that
(iia) $|L(u)\,h| \le M\,|h|$, $h \in B$, $u \in D(L)$,
(iib) $df(u)\,L(u)\,h = h$, $h \in B$, $u \in D(L)$.
Then, if $|f(0)| < M^{-5}$, it follows that $f(D(f))$ contains the origin.
Proof: Let $\kappa = 3/2$, and let $\beta > 0$ be a real number to be specified later. Put
$u_0 = 0$ and, proceeding inductively, put
(2.1)    $u_{n+1} = u_n - L(u_n)f(u_n)$.
We will prove inductively that
(2.2; n)    $u_n \in D(f)$,
(2.3; n)    $|u_n - u_{n-1}| \le e^{-\beta\kappa^n}$,  $n \ge 1$.

We proceed as follows. Suppose that statements (2.2; j) and (2.3; j) are true
for $j \le n$. Then
(2.4)    $|u_n| \le \sum_{j=1}^{n} e^{-\beta\kappa^j} \le \sum_{j=1}^{\infty} e^{-\beta j(\kappa - 1)}$,

so that if $\beta$ is sufficiently large, (2.2; n) follows. Therefore Definition (2.1)
makes sense. Observe now that if $g$ is any function twice continuously
F-differentiable, the mean-value theorem with Lagrange remainder applied to
$g(u + th)$ yields
    $g(u + h) = g(u) + dg(u)h + \int_0^1 (1 - t)\, d^2 g(u + th; h, h)\, dt$.

Combining this with (ii) and our induction hypothesis (and noting that
$f(u_{n-1}) + df(u_{n-1})(u_n - u_{n-1}) = 0$ by (iib) and (2.1)) yields

(2.5)    $|u_{n+1} - u_n| = |L(u_n)f(u_n)| \le M|f(u_n)| \le M^2|u_n - u_{n-1}|^2 \le M^2 e^{-2\beta\kappa^n}$.
Thus we have only to choose $\beta$ so that
    $M^2 e^{-2\beta\kappa^n} \le e^{-\beta\kappa^{n+1}}$
or
(2.6)    $M^2 \le e^{(2 - \kappa)\beta\kappa^n}$.
Since $\kappa < 2$, it is clear that (2.6) will hold for $\beta$ sufficiently large, and then
(2.3; n + 1) follows, completing our induction. Thus we have only to prove the
correctness of (2.3; 1) to finish our proof. But this statement is simply
(2.7)    $|L(0)f(0)| \le e^{-\beta\kappa}$
and it is therefore implied by $M|f(0)| \le e^{-\beta\kappa}$. Since $|f(0)| \le M^{-5}$, if we
choose $\beta$ so that $M^2 = e^{(1/2)\beta\kappa}$, (2.7) follows. Then $u_n$ converges to some
element $u$ in $D(f)$. By (iib) and (2.1)
    $f(u_n) = df(u_n)(u_n - u_{n+1})$,
    $|f(u_n)| \le M|u_{n+1} - u_n| \le M e^{-\beta\kappa^{n+1}}$,
so $f(u) = 0$.
Q.E.D.
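The iteration (2.1) can be tried out in the simplest possible setting. The sketch below assumes $B = R$ and an illustrative function `f` chosen here (not taken from the text); on the unit interval its derivatives and the inverse `L` are bounded by a constant slightly larger than 2, and $|f(0)| = 0.01$ is below $M^{-5}$ for such $M$, so the hypotheses appear to be met.

```python
# Sketch: Newton's iteration u_{n+1} = u_n - L(u_n) f(u_n) on B = R.
# Assumption: f is an illustrative example, not a function from the text.

def f(u):
    return u + 0.1 * u * u - 0.01


def df(u):
    return 1.0 + 0.2 * u


def L(u):
    """Right inverse of df(u); in one dimension this is simply 1 / df(u)."""
    return 1.0 / df(u)


def newton(u0=0.0, steps=6):
    """Run the iteration and record the sizes |u_{n+1} - u_n| of the steps."""
    u, increments = u0, []
    for _ in range(steps):
        step = L(u) * f(u)
        u -= step
        increments.append(abs(step))
    return u, increments


root, increments = newton()
print(root)
print(increments)  # each step is roughly the square of the previous one
```

The rapid decay of the increments mirrors the estimate (2.3; n): the bound is doubly exponential in $n$, so all iterates stay in the unit sphere.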
The following theorem, which gives an important generalization of Theo-
rem 2.1, is proved by a modified Newton's method. We weaken the hypo-
theses of Theorem 2.1 requiring not that the "inverting" operator L(u) be
bounded, but only that it be an unbounded operator acting somewhat like a
differential operator of order $\alpha$.
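Given a compact $n$-dimensional manifold $K$ we introduce the space $C^r(K) = C^r$
of (possibly vector-valued), $r$ times continuously differentiable functions
with the norm
    $|u|_r = \max_{|\alpha| \le r}\,\max_{x \in K} |D^\alpha u(x)|$,

    $D^\alpha = \left(\frac{\partial}{\partial x_1}\right)^{\alpha_1} \cdots \left(\frac{\partial}{\partial x_n}\right)^{\alpha_n}$,  $\alpha = (\alpha_1, \ldots, \alpha_n)$,  $\alpha_i$ non-negative integers,  $|\alpha| = \alpha_1 + \cdots + \alpha_n$.

Note that $C^r \supseteq C^{r+1}$ and that, if $u \in C^{r+1}$, $|u|_r \le |u|_{r+1}$; we write $|u|_r = \infty$
if $u \notin C^r$. In the sequel we shall refer to a certain range $m - \alpha \le r \le m + 10\alpha$
of spaces $C^r$ and to a certain constant $M \ge 1$ (in such ranges $\alpha$ denotes the order
mentioned above, not a multi-index). We suppose that $M$ is sufficiently
large so that there exist smoothing operators $S(t)$, $t \ge 1$, such that

(S1)    $|S(t)u|_\rho \le M t^{\rho - r}|u|_r$,  $u \in C^r$

(S2)    $|(I - S(t))u|_r \le M t^{r - \rho}|u|_\rho$,  $u \in C^\rho$

(S3)    $\left|\frac{d}{dt}S(t)u\right|_r \le M t^{r - \rho - 1}|u|_\rho$,  $u \in C^\rho$

(S4)    $\lim_{t \to +\infty} |(I - S(t))u|_r = 0$,  $u \in C^r$

for $m - \alpha \le r \le \rho \le m + 10\alpha$.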
(We will show later how to construct these operators for any compact
manifold.)
We proceed now to the statement of the main result of this chapter:

2.2. Nash implicit functional theorem: Let $f$ be a mapping whose domain
$D(f)$ is the unit sphere of $C^m$ with range in $C^{m-\alpha}$. Suppose that
(i) $f$ has two continuous F-derivatives, both bounded by $M$.
(ii) There exists a map $L(u)$ with domain $D(L) = D(f)$ and range in the
space $\mathcal{B}(C^m, C^{m-\alpha})$ of bounded linear operators on $C^m$ to $C^{m-\alpha}$,
such that:
(iia)  $|L(u)h|_{m-\alpha} \le M|h|_m$,  $u \in D(L)$, $h \in C^m$
(iib)  $df(u)L(u)h = h$,  $u \in D(L)$, $h \in C^m$
(iic)  $|L(u)f(u)|_{m+9\alpha} \le M(1 + |u|_{m+10\alpha})$,  $u \in C^{m+10\alpha}$.

Then, if
    $|f(0)|_{m+9\alpha} \le 2^{-40}M^{-202}$,

$f(D(f))$ contains the origin.


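Proof: Let $\kappa = 3/2$ and $\beta, \mu, \nu > 0$ be real numbers to be specified later.
Put $u_0 = 0$, and proceeding inductively, put
(2.8; n)    $u_{n+1} = u_n - S_n L(u_n)f(u_n)$
where $S_n = S(e^{\beta\kappa^n})$. We will prove inductively that
(2.9; n)    $u_n \in D(f)$

(2.10; n)    $|u_n - u_{n-1}|_m \le e^{-\mu\alpha\beta\kappa^n}$

(2.11; n)

(2.12; n)    $1 + |u_n|_{m+10\alpha} \le e^{\nu\alpha\beta\kappa^n}$.
Suppose (2.10; j) is true for $j \le n$. Then
    $|u_n|_m \le \sum_{j=1}^{n} e^{-\mu\alpha\beta\kappa^j} \le \sum_{j=1}^{\infty} e^{-\mu\alpha\beta j(\kappa - 1)} = \frac{e^{-\mu\alpha\beta(\kappa - 1)}}{1 - e^{-\mu\alpha\beta(\kappa - 1)}}$
which implies (2.9; n) if $\beta$, $\mu$ are sufficiently large. Suppose now that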
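(2.9; j), (2.10; j), (2.11; j), (2.12; j) are true for $j \le n$. Then
    $|u_{n+1} - u_n|_m = |S_n L(u_n)f(u_n)|_m$
    $\le M e^{\alpha\beta\kappa^n}|L(u_n)f(u_n)|_{m-\alpha} \le M^2 e^{\alpha\beta\kappa^n}|f(u_n)|_m$

    $\le M^2 e^{\alpha\beta\kappa^n}|f(u_{n-1}) - df(u_{n-1})S_{n-1}L(u_{n-1})f(u_{n-1})|_m + M^3 e^{\alpha\beta\kappa^n}|u_n - u_{n-1}|_m^2$
    $\le M^2 e^{\alpha\beta\kappa^n}|df(u_{n-1})(I - S_{n-1})L(u_{n-1})f(u_{n-1})|_m + M^3 e^{\alpha\beta\kappa^n}e^{-2\mu\alpha\beta\kappa^n}$

    $\le M^3 e^{\alpha\beta\kappa^n}\,[M e^{-9\alpha\beta\kappa^{n-1}}|L(u_{n-1})f(u_{n-1})|_{m+9\alpha} + e^{-2\mu\alpha\beta\kappa^n}]$
    $\le M^4 e^{\alpha\beta\kappa^n}\,[e^{-9\alpha\beta\kappa^{n-1}}M(1 + |u_{n-1}|_{m+10\alpha}) + e^{-2\mu\alpha\beta\kappa^n}]$

    $\le M^5 e^{\alpha\beta\kappa^n}\,[e^{-9\alpha\beta\kappa^{n-1}}e^{\nu\alpha\beta\kappa^{n-1}} + e^{-2\mu\alpha\beta\kappa^n}]$

    $\le M^5\{\exp[\alpha\beta\kappa^{n-1}(\nu - 9 + \kappa)] + \exp[\alpha\beta\kappa^n(1 - 2\mu)]\}$.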


The desired inequality will be then implied by

(2.13; n)    $M^5\{\exp[\alpha\beta\kappa^{n-1}(\nu - 9 + \kappa)] + \exp[\alpha\beta\kappa^n(1 - 2\mu)]\} \le e^{-\mu\alpha\beta\kappa^{n+1}}$

which (noting that $\kappa = 3/2$) will follow for $\beta$ sufficiently large if we choose

(2.14)    $\mu > 2$,  $\tfrac{9}{4}\mu + \nu < \tfrac{15}{2}$.
Thus (2.10; n + 1) follows.
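Next we note that
    $1 + |u_{n+1}|_{m+10\alpha} \le 1 + \sum_{j=0}^{n}|S_j L(u_j)f(u_j)|_{m+10\alpha}$

    $\le 1 + M\sum_{j=0}^{n} e^{\alpha\beta\kappa^j}|L(u_j)f(u_j)|_{m+9\alpha}$

    $\le 1 + M^2\sum_{j=0}^{n} e^{\alpha\beta\kappa^j}(1 + |u_j|_{m+10\alpha})$

    $\le 1 + M^2\sum_{j=0}^{n} e^{\alpha\beta(1 + \nu)\kappa^j}$.
Thus
(2.16)    $e^{-\nu\alpha\beta\kappa^{n+1}}(1 + |u_{n+1}|_{m+10\alpha}) \le e^{-\nu\alpha\beta\kappa^{n+1}} + M^2\sum_{j=0}^{n} e^{\alpha\beta[(1 + \nu)\kappa^j - \nu\kappa^{n+1}]}$.

If $\nu > 2$ the right side of (2.16) will be less than 1 for sufficiently large $\beta$,
and so statement (2.12; n + 1) will follow from (2.16), completing our induction.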
If we taker = }, µ a ,t, condition (2.14) is satisfied and so we have only to
verify the correctness of statements (2.10; 1) and (2.12; 1) and our proof
will be complete. These statements, however, are simply the inequalities
(2.17)    $|S_0 L(0)f(0)|_m \le e^{-\mu\alpha\beta\kappa}$

and
(2.18)    $1 + |S_0 L(0)f(0)|_{m+10\alpha} \le e^{\nu\alpha\beta\kappa}$

and they in turn follow from the bound for $|f(0)|_{m+9\alpha}$ and (iic). The conclusion of
the proof is now just as in Theorem 2.1.
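The bookkeeping of exponents in this induction can be checked mechanically. The sketch below rests on assumptions: it takes the growth rate to be 3/2 and reads the three requirements of the argument as strict inequalities between the coefficients of the leading exponentials. The helper name `exponents_ok` and the sample values 2.2 are illustrative choices made here, not values from the text.

```python
# Sketch: the three exponent comparisons behind (2.13), (2.14) and (2.16).
# Assumptions: kappa = 3/2; mu = nu = 2.2 is only an illustrative sample.

def exponents_ok(kappa, mu, nu):
    """Return True when each leading exponent is strictly favourable."""
    # First term of (2.13): kappa^(n-1) (nu - 9 + kappa) against -mu kappa^(n+1).
    first = (nu - 9 + kappa) < -mu * kappa ** 2
    # Second term of (2.13): kappa^n (1 - 2 mu) against -mu kappa^(n+1).
    second = (1 - 2 * mu) < -mu * kappa
    # Largest term of the sum in (2.16): (1 + nu) kappa^n - nu kappa^(n+1) < 0.
    third = (1 + nu - nu * kappa) < 0
    return first and second and third


print(exponents_ok(1.5, 2.2, 2.2))  # all three comparisons hold
print(exponents_ok(1.5, 2.0, 2.2))  # fails: the second comparison needs mu > 2
```

With growth rate 3/2 the three comparisons reduce to a linear constraint coupling the two decay parameters, the requirement that the first parameter exceed 2, and the requirement that the second exceed 2, respectively.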
We proceed now to the construction of the family of smoothing operators
whose existence was assumed for the special case K = n-dimensional torus;
then $C^k(K)$ is simply the space of all $k$-times differentiable functions $u(x)$, defined
in $E^n$ and periodic with period $2\pi$ in each variable. Take a sufficiently
large constant $M$ and a function $\hat a \in C^\infty(E^n)$, vanishing outside a compact
set and identically equal to 1 in a neighborhood of 0, and let $a$ be its Fourier
transform. It is well known that for any $\alpha$, $N$

    $|D^\alpha a(x)| \le A_{\alpha,N}(1 + |x|)^{-N}$.

Moreover
    $\int_{E^n} a(x)\,dx = 1$,

    $\int_{E^n} x^\alpha a(x)\,dx = 0$,  $x^\alpha = x_1^{\alpha_1}x_2^{\alpha_2}\cdots x_n^{\alpha_n}$,  $|\alpha| > 0$.

Now we set
    $(S(t)u)(x) = t^n\int_{E^n} a(t(x - y))\,u(y)\,dy$.
It is clear that $S(t)u \in C^\infty$ and, since $S(t)$ commutes with partial differentiation
operators, we have to prove statements (S1), (S2) above only for $r = 0$.
In fact, suppose (S1) is true for $r = 0$. Then
    $|S(t)u|_{\rho - r} \le M t^{\rho - r}|u|_0$.
Taking any $\alpha$, $|\alpha| \le r$,
    $|D^\alpha S(t)u|_{\rho - r} = |S(t)D^\alpha u|_{\rho - r} \le M t^{\rho - r}|D^\alpha u|_0 \le M t^{\rho - r}|u|_r$.
But then
    $|S(t)u|_\rho \le M t^{\rho - r}|u|_r$.
One deals similarly with (S2).
Suppose then that $r = 0$. (S1) reduces to
    $|S(t)u|_\rho \le M t^\rho|u|_0$.
Let $|\alpha| \le \rho$. We have

    $|D^\alpha S(t)u|_0 = \Big|t^{n+|\alpha|}\int_{E^n}(D^\alpha a)(t(x - y))\,u(y)\,dy\Big|_0 \le M t^{|\alpha|}|u|_0 \le M t^\rho|u|_0$

if $|\alpha| \le \rho$, so (S1) is established. (S2) reduces to
    $|(I - S(t))u|_0 \le M t^{-\rho}|u|_\rho$.
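To prove this, apply Taylor's theorem with integral remainder:

    $f(1) = \sum_{k=0}^{m-1}\frac{f^{(k)}(0)}{k!} + \frac{1}{(m-1)!}\int_0^1 (1 - \mu)^{m-1}f^{(m)}(\mu)\,d\mu$

to the function $f(t) = u(x + ty)$. We obtain

    $u(x + y) = \sum_{k=0}^{\rho-1}\frac{1}{k!}\sum_{|\alpha|=k}\frac{k!}{\alpha!}\,y^\alpha D^\alpha u(x) + \frac{1}{(\rho-1)!}\sum_{|\alpha|=\rho}\frac{\rho!}{\alpha!}\,y^\alpha\int_0^1 (1 - \mu)^{\rho-1}D^\alpha u(x + \mu y)\,d\mu$,

where $\alpha! = \alpha_1!\cdots\alpha_n!$. Thus, replacing $y$ by $x + y$ in the integral defining $S(t)$
and using the moment conditions on $a$,
    $u(x) - (S(t)u)(x) = t^n\int_{E^n} a(t(x - y))\,(u(x) - u(y))\,dy$

    $= -\frac{1}{(\rho-1)!}\sum_{|\alpha|=\rho}\frac{\rho!}{\alpha!}\;t^n\int_{E^n}\int_0^1 a(-ty)\,(1 - \mu)^{\rho-1}\,y^\alpha D^\alpha u(x + \mu y)\,d\mu\,dy$.

Making the change of variable $ty = z$, we obtain

    $t^n\int_{E^n}\int_0^1 a(-ty)\,(1 - \mu)^{\rho-1}\,y^\alpha D^\alpha u(x + \mu y)\,d\mu\,dy$

    $= t^{-|\alpha|}\int_{E^n}\int_0^1 a(-z)\,(1 - \mu)^{\rho-1}\,z^\alpha D^\alpha u(x + \mu t^{-1}z)\,d\mu\,dz$.

But then it is easy to conclude
    $|(I - S(t))u|_0 \le M t^{-\rho}|u|_\rho$
which proves (S2).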
Let us pass now to (S3). As before, we can suppose without loss of generality
that $r = 0$, in which case (S3) reduces to

    $\left|\frac{d}{dt}S(t)u\right|_0 \le M t^{-\rho-1}|u|_\rho$.

But
    $\frac{d}{dt}S(t)u = \frac{d}{dt}\,t^n\int_{E^n} a(t(x - y))\,u(y)\,dy$
    $= t^{n-1}\int_{E^n}\Big(n\,a(t(x - y)) + \sum_{i=1}^{n} t(x_i - y_i)\,(D_i a)(t(x - y))\Big)u(y)\,dy$.

Reasoning entirely analogous to that used in the proof of (S2) yields the
desired result. As for (S4), it is a well-known result in the theory of singular
integrals, and therefore we omit the proof.
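On the one-dimensional torus the smoothing operators can be made concrete on Fourier series. The sketch below assumes a suitable normalization of the Fourier transform, under which convolution with the rescaled kernel multiplies the $k$-th Fourier coefficient by the value of the compactly supported cutoff at $k/t$. The particular cutoff and the names `cutoff`, `smooth` and `sup_norm_bound` are illustrative choices made here.

```python
import math

# Sketch: S(t) on the circle as a Fourier multiplier (assumed normalization).
# A function is represented by a dict {k: c_k} of Fourier coefficients.


def cutoff(xi):
    """Smooth even function equal to 1 for |xi| <= 1/2 and 0 for |xi| >= 1."""
    x = abs(xi)
    if x <= 0.5:
        return 1.0
    if x >= 1.0:
        return 0.0
    s = (x - 0.5) / 0.5  # s runs over (0, 1) on the transition zone
    a = math.exp(-1.0 / (1.0 - s))
    b = math.exp(-1.0 / s)
    return a / (a + b)


def smooth(coeffs, t):
    """Apply S(t): multiply the k-th coefficient by cutoff(k / t)."""
    return {k: c * cutoff(k / t) for k, c in coeffs.items()}


def sup_norm_bound(coeffs, order=0):
    """Crude bound for the sup norm of the derivative of the given order."""
    return sum(abs(k) ** order * abs(c) for k, c in coeffs.items())


u = {1: 1.0, -1: 1.0, 50: 0.01, -50: 0.01}
su = smooth(u, 4.0)
rest = {k: u[k] - su[k] for k in u}  # (I - S(t)) u keeps only high modes
print(su)
print(sup_norm_bound(rest), 4.0 ** -2 * sup_norm_bound(u, order=2))
```

In this toy case the low modes pass through unchanged and the high modes are removed, so the size of the remainder is controlled by a negative power of $t$ times a higher norm of $u$, in the spirit of (S2).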
We note that the construction of the smoothing operators could be carried
out for any compact manifold, and not only for the torus. For the proof, we
refer the reader to J. Schwartz, On Nash's Implicit Functional Theorem,
Comm. Pure Appl. Math., vol. 13 (1960), pp. 509-530. We note also that the
use of spaces $C^r$ and the norms $|\cdot|_r$ is by no means essential in the proof of
Nash's implicit functional theorem; indeed, these spaces can be replaced, for
example, by spaces like $L^p_r(K) = L^p_r$ = space of all (possibly vector-valued)
functions $f$ for which
    $\int_K |D^\alpha f|^p\,dx < \infty$,
$|\alpha| \le r$, with the norm
    $|f| = \Big(\sum_{|\alpha| \le r}\int_K |D^\alpha f|^p\,dx\Big)^{1/p}$.
We present now a useful corollary of Nash's Implicit Functional Theorem.


2.3. 2nd implicit functional theorem: Let $T$ = $n$-dimensional torus, let
$f : C^k \to C^{k-\beta}$ be defined on the unit sphere of $C^k$, and suppose that
(i) $f$ has infinitely many continuous F-derivatives.
(ii) $f$ is translation invariant, i.e. if $u \in C^k$, $|u|_k < 1$,
    $f(u(\cdot + h))(x) = (f(u))(x + h)$.
(iii) There exists a mapping $L(u)$ defined in the unit sphere of $C^k$ with
values in $\mathcal{B}(C^k, C^{k-\beta})$ such that $L(u)$ is translation invariant in the
same sense as $f$, such that $L(u)$ has infinitely many continuous
F-derivatives, and such that
(iiia)  $|L(u)h|_{k-\beta} \le M|h|_k$,  $u \in C^k$, $h \in C^k$
(iiib)  $df(u)L(u)h = h$,  $u, h \in C^k$.
Then, if $f(0) = 0$, $f(D(f))$ contains a $C^\infty$-neighborhood of zero.
Proof: Note that, since $f$ is translation invariant, it commutes with derivatives,
so if we apply $f$ to a function in $C^{k'}$, $k' > k$, we obtain a function in
$C^{k'-\beta}$; similarly $L(u)$ can be considered as a function whose domain is the
unit sphere of $C^{k'}$ and range $C^{k'-\beta}$. The inequalities and identities
(iiia)'  $|L(u)h|_{k'-\beta} \le M|h|_{k'}$,  $u \in C^{k'}$, $h \in C^{k'}$
(iiib)'  $df(u)L(u)h = h$,  $u, h \in C^{k'}$

also hold. We now have only to apply Nash's implicit functional theorem,
for which we need (iic) of its statement. But this is a consequence of the
translation-invariance of f and L together with inequality (iiia) and the
boundedness of the derivatives of f. Applying Nash's implicit functional
theorem, it follows that if a point k is sufficiently near to the origin in C"-O,
there is a point in Ck whose image is k. Therefore, f(D(f)) contains a Ck-1-
neighborhood of the origin, and thus a C°°-neighborhood also.
We show now how the implicit functional theorem can be applied, first
to an artificial example and then to a natural one.

B. A Partial Differential Equation

Consider functions of $n$ variables, of period $2\pi$ in each variable, i.e. functions
on the $n$-dimensional torus $K$. The partial differential operator

    $\Box$ = [display illegible in this copy: a constant-coefficient combination of powers of $\partial/\partial x_1, \ldots, \partial/\partial x_n$]

has, by deliberate choice, an extremely unfortunate "mixed" character from
the point of view of the theory of partial differential operators. But it is
easy to see that $\Box$ admits the complete orthogonal set of functions
    $\exp(i(m_1x_1 + \cdots + m_nx_n))$
as eigenfunctions, and that the eigenvalues of $\Box$ are Gaussian integers. Since
every Gaussian integer $\lambda$ satisfies $|\lambda + \tfrac{1}{2}| \ge \tfrac{1}{2}$, the equation
(B1)    $(\Box + \tfrac{1}{2})u = v$
is invertible in the following sense: if $v$ is a function in $L^2(K)$, then there is a
function $u \in L^2(K)$ such that (B1) is valid in the $L^2$-sense.
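The invertibility claim can be illustrated mode by mode. In the sketch below the operator is represented only through its eigenvalues on the exponentials. The function `hypothetical_symbol` is an invented stand-in that returns Gaussian integers, not the operator of the text, and `solve_b1` is a name chosen here.

```python
# Sketch: solving (box + 1/2) u = v on Fourier coefficients, n = 2.
# Assumption: hypothetical_symbol is an invented Gaussian-integer symbol.


def hypothetical_symbol(m):
    """Eigenvalue on exp(i(m1 x1 + m2 x2)); always a Gaussian integer."""
    m1, m2 = m
    return complex(m1 ** 4 - m2 ** 2, m1 ** 2 * m2 ** 2)


def solve_b1(v_coeffs, symbol):
    """Divide each Fourier coefficient of v by (eigenvalue + 1/2)."""
    return {m: c / (symbol(m) + 0.5) for m, c in v_coeffs.items()}


v = {(0, 0): 1.0, (1, 2): 0.5 + 0.0j, (-3, 1): 0.25j}
u = solve_b1(v, hypothetical_symbol)

# The divisor never gets closer to zero than 1/2, so |u_m| <= 2 |v_m|.
worst = min(
    abs(hypothetical_symbol((a, b)) + 0.5)
    for a in range(-5, 6)
    for b in range(-5, 6)
)
print(u)
print(worst)
```

The uniform lower bound on the divisors is what makes the inverse bounded on $L^2$, since the $L^2$ norm is computed from the Fourier coefficients.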
Our aim is to show that the equation
    $f(u) = \Box u + \tfrac{1}{2}u + u^3\exp(\Box u) = v$

has a solution $u \in C^\infty$ for sufficiently small $v$ in $C^\infty$. Observe first that $f$
maps $C^k$ into $C^{k-4}$ for any $k$ and that it has infinitely many continuous
F-derivatives, the first of which is
    $df(u)h = (1 + u^3\exp(\Box u))\,\Box h + (\tfrac{1}{2} + 3u^2\exp(\Box u))\,h$.

To find $L(u)$ we have to invert

(B2)    $\Box h + \frac{\tfrac{1}{2} + 3u^2\exp(\Box u)}{1 + u^3\exp(\Box u)}\,h$.

For $u = 0$, (B2) reduces to (B1), so by a standard perturbation theory argument,
(B2) will be invertible for any $u$ sufficiently close to zero in $C^k$. The
operator $L(u)$ will then be defined and certainly continuous as an operator
from $C^k$ to $C^{k-4}$; thus inequality (iiia) follows for $L$, and (iiib) is an immediate
consequence of the definition of $L(u)$, as is the fact that $L(u)$ has infinitely
many derivatives and is translation invariant. But we have now verified
all the hypotheses of Theorem 2.3, so our result follows at once.
We note next that our "translation invariance" requirements on $f$ do not
prevent us from treating some apparently unmanageable cases, such as
    $f(u) = \Box u(x) + c_1(x)u(x) + c_2(x)u^3(x)\exp(\Box u(x)) = v(x)$,

where $c_1(x)$ and $c_2(x)$ are $C^\infty$ functions on the $n$-dimensional torus. In fact,
we have only to look at this problem as if it were that of solving the system
of equations
    $\Box u + d_1u + d_2u^3\exp(\Box u) = v$

    $d_1 = c_1$

    $d_2 = c_2$.
If we suppose that the operator $u \to (\Box + c_1)u$ has a bounded inverse in
$L^2$ (as was the case for $c_1 = \tfrac{1}{2}$), then the first Fréchet derivative of the
infinitely differentiable mapping
    $F : [d_1, d_2, u] \to [\Box u + d_1u + d_2u^3\exp(\Box u),\ d_1,\ d_2]$
will have an inverse; in fact
    $dF[d_1, d_2, u]\,[s_1, s_2, h] = [\,s_1u + \Box h + d_1h + s_2u^3\exp(\Box u) + 3d_2u^2h\exp(\Box u) + d_2u^3\exp(\Box u)\,\Box h,\ s_1,\ s_2\,]$
which, for $d_1$ and $d_2$ near $c_1$ and $c_2$ and $u$ sufficiently close to zero in suitable
senses, may be solved as in the previous case. Observing that $F$ is translation-invariant,
and reasoning as before, our result follows.

C. Embedding of Riemannian Manifolds

Bibliography
1. N. Bourbaki, Espaces vectoriels topologiques.
2. S. Helgason, Differential Geometry and Symmetric Spaces.
3. S. Lang, "Fonctions implicites et plongements riemanniens", Sém. Bourbaki, E.N.S., exposé 237 (1961-62).
4. J. Nash, "The imbedding problem for Riemannian manifolds", Ann. of Math., vol. 63, pp. 20-63 (1956).

Now we shall consider the problem of isometric embeddings of Riemannian
manifolds in Euclidean spaces. This problem was successfully treated for the
first time by John Nash (see [4]), and it provides a natural application for
theorems such as Theorem 2.2 above. The problem can be stated as follows.
Is every Riemannian manifold (say of class $C^k$) isometrically embeddable in
$R^m$? (Throughout this section, embedding means diffeomorphic mapping with
injective differential at each point (= regular at each point).) Nash's answer
is in the affirmative (technically, when $k \ge 3$), and he also asserts that $m$
may be chosen less than or equal to an explicit function of the dimension
$n$ of the manifold (namely $m \le \tfrac{1}{2}(3n^3 + 14n^2 + 11n)$ for the general case
and $m \le \tfrac{1}{2}n(3n + 11)$ if $M$ is compact). Actually we shall here prove only a
weak result: our final statement will deal only with compact $C^\infty$ Riemannian
manifolds and no bounds for $m$ will be determined. $M$ will henceforth denote
a compact Riemannian manifold of dimension $n$. The manifold itself and
its metric will be supposed of class $C^\infty$.
I. Remark: Without loss of generality the manifold $M$ may be supposed
to be a torus ([3], No. 1). In fact, by Whitney's theorem (cf. G. de Rham,
Variétés différentiables, or Milnor, Notes on Differential Topology, Princeton,
1959) $M$ can be represented as a closed smooth bounded submanifold of
some Euclidean space $E^N$. But then by properly choosing everything we
can assume that the projection of $E^N$ on some torus is 1 - 1 on the manifold
$M$. This represents $M$ as a closed smooth submanifold of a torus.
Now we have some Riemannian metric defined on the submanifold $M$ of
the torus. By a standard procedure using partitions of unity it is possible to
extend this metric to a metric on all of the torus. If we now isometrically
embed the torus equipped with this metric we obtain by restriction an isometric
embedding of $M$. Thus we may always suppose that our manifold is a
torus; this will simplify some constructions. Nevertheless we begin by discussing
an arbitrary compact manifold, because as far as VI below the assumption
that $M$ is a torus makes no difference in the proofs.
II. Consider the Banach space $C^r(M, R^m)$ of $r$-times differentiable functions
on $M$ with values in $R^m$, and, more generally, the Banach space $S^r$ of
symmetric, doubly covariant, $r$ times continuously differentiable tensor fields
on $M$, defined by dealing locally with matrices instead of with real numbers
(see above). Metrics are such tensor fields, and $R^m$ has a canonical metric,
namely the Euclidean metric. Each $z \in C^r(M, R^m)$ induces "by devolution"
of this metric an element of $S^{r-1}$, and therefore we have a mapping
$f : C^r(M, R^m) \to S^{r-1}$ (for a more explicit definition see below). We shall show
that for $m$ large enough, the image of $f$ covers an open set of $S^\infty$. To do so
we shall verify the hypotheses of Theorem 2.2, and then establish our claim
easily.
III. First of all we want to know the Fréchet derivatives of $f$ (and $f$ itself).
Suppose that $z_1, \ldots, z_m$ are the canonical coordinates in $R^m$, and that
$x_1, \ldots, x_n$ is a coordinate system defined on some open set $U$ of $M$. Then if
$z \in C^r(M, R^m)$ and $f(z)$ denotes, as above, the tensor on $M$ induced by $z$,
we have the following expression in coordinates:

(1)    $(f(z))_{ij} = \sum_{\alpha=1}^{m}\frac{\partial z_\alpha}{\partial x_i}\frac{\partial z_\alpha}{\partial x_j}$.

This formula is standard and may be taken as the starting point, but (at the
risk of being more boring than necessary) we add the following exposition.
If $X$ is a manifold and $p \in X$, denote by $TX_p$ the tangent space to $X$ at $p$.
Now if $z : M \to R^m$ is smooth, $p \in M$, $q = z(p)$, $z$ has a differential at $p$, i.e.,
$z$ induces a map
    $z_* : TM_p \to T(R^m)_q$
which is linear (see [2], § 3, No. 1). But the metric on $R^m$ induces an isomorphism
    $\mu : T(R^m)_q \to (T(R^m)_q)^*$  (* = dual space).
Consider now the linear mapping $A$ obtained by composition of the mappings:
    $TM_p \xrightarrow{\;z_*\;} T(R^m)_q \xrightarrow{\;\mu\;} (T(R^m)_q)^* \xrightarrow{\;{}^t z_*\;} (TM_p)^*$,
where ${}^t z_*$ stands for the transpose of $z_*$. Clearly $A : TM_p \to (TM_p)^*$. As there
exists a canonical identification:

    $\mathrm{Hom}_R(TM_p, (TM_p)^*) = (TM_p)^* \otimes_R (TM_p)^*$,
$A$ may be considered as a doubly covariant tensor at $p$. The correspondence
$z \to A$ is what we called $f$, i.e. we define $f$ by $(f(z))_p = A$. Observe that the fact
that the values of $z$ are in $R^m$ has not played any special role, and any Riemannian
manifold could replace $R^m$. But in our case, we know that $T(R^m)_q$
and $(T(R^m)_q)^*$ may be identified with $R^m$ itself, and that the isomorphism $\mu$ is
the identity. Finally we get
(1a)    $(f(z))_p = z_* \cdot {}^t z_*$.

In terms of coordinates, $z_*$ is the Jacobian matrix $z_* = J_z = \left(\frac{\partial z_\alpha}{\partial x_i}\right)$, and from
(1a) it follows that
(1b)    $f(z) = J_z \cdot {}^t J_z$.
But then
    $(f(z))_{ij} = \sum_\alpha\frac{\partial z_\alpha}{\partial x_i}\frac{\partial z_\alpha}{\partial x_j}$
which is (1) above. From formula (1a) or (1b) it follows at once that
$f : C^{r+1} \to S^r$ is a quadratic form (see 2, Chap. I); the bilinear form $\tilde f$ associated
with $f$ is
(2)    $\tilde f(x, y) = x_* \cdot {}^t y_* + y_* \cdot {}^t x_*$.
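Another consequence of (1b) is the continuity of $f$ (as a function from $C^{r+1}$
into $S^r$). This is clear. We may therefore apply Theorem 1.50 and conclude
that $f$ has derivatives of all orders:
(3)    $f'(z)h = z_* \cdot {}^t h_* + h_* \cdot {}^t z_*$,
       $f''(z)(h, k) = h_* \cdot {}^t k_* + k_* \cdot {}^t h_*$,
       $f^{(n)}(z) = 0$ if $n \ge 3$,
and that the norms satisfy
(3')   $|f'(z)| \le 2\|f\|\,|z|$,

       $\|f''(z)\| = 2\|f\|$.

In terms of coordinates, (3) may be written as:

(3a)   $f'(z)h = J_z \cdot {}^t J_h + J_h \cdot {}^t J_z$,
(3b)   $(f'(z)h)_{ij} = \sum_\alpha\left(\frac{\partial z_\alpha}{\partial x_i}\frac{\partial h_\alpha}{\partial x_j} + \frac{\partial z_\alpha}{\partial x_j}\frac{\partial h_\alpha}{\partial x_i}\right)$.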

Naturally we plan to show that $f'(z)h$ is invertible (as a function of $h$) in a
very smooth way, i.e.,
(4) given $g \in S^r$ and $z \in C^r$ we want to find $h$ as differentiable as possible
such that $g = f'(z)h$.
IV. To achieve this we use a trick invented by Nash ([4], p. 31) which
adds some new conditions to the problem of solving (4). In other words, we
require that the solution $h$ have an additional property given a priori, namely

(5)    ${}^t z_*(h(p)) = 0$ for every $p \in M$.

Since $(T(R^m)_q)^*$ was identified with $R^m$ and

    ${}^t z_* : (T(R^m)_q)^* \to (TM_p)^*$,  $h(p) \in R^m$,

it is clear that (5) makes sense.
Of course (4) and (5) may be written in coordinates as:
(4a)   $g_{ij} = \sum_\alpha\left(\frac{\partial z_\alpha}{\partial x_i}\frac{\partial h_\alpha}{\partial x_j} + \frac{\partial z_\alpha}{\partial x_j}\frac{\partial h_\alpha}{\partial x_i}\right)$

(5a)   $\sum_\alpha\frac{\partial z_\alpha}{\partial x_i}\,h_\alpha = 0$,  $i = 1, \ldots, n$.

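We now prove that the conditions (4) and (5) may be satisfied simultaneously
by a suitable $h$. From (5a) we conclude, differentiating with respect to $x_j$, that

    $\sum_\alpha\frac{\partial z_\alpha}{\partial x_i}\frac{\partial h_\alpha}{\partial x_j} + \sum_\alpha\frac{\partial^2 z_\alpha}{\partial x_i\,\partial x_j}\,h_\alpha = 0$
or
    $\sum_\alpha\frac{\partial z_\alpha}{\partial x_i}\frac{\partial h_\alpha}{\partial x_j} = -\sum_\alpha\frac{\partial^2 z_\alpha}{\partial x_i\,\partial x_j}\,h_\alpha$
and then (4a) becomes
(4b)   $g_{ij} = -2\sum_\alpha\frac{\partial^2 z_\alpha}{\partial x_i\,\partial x_j}\,h_\alpha$.

This shows the point of adding the condition (5): now (4b) and (5a) give a
system of algebraic linear equations, equivalent to (4) and (5), which are a
system of partial differential equations of first order.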
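The effect of the extra condition can be seen on the simplest example. The sketch below assumes $M$ is the circle ($n = 1$), embedded in the plane as the unit circle, and takes $h$ to be a normal field with an arbitrary amplitude `phi`; none of these choices comes from the text. It compares the first-order expression (4a) with the algebraic expression (4b).

```python
import math

# Sketch: (4a) versus (4b) for the unit circle z(x) = (cos x, sin x).
# Assumption: h = phi * (unit normal), with an illustrative amplitude phi.


def phi(x):
    return 1.0 + 0.5 * math.sin(3 * x)


def dphi(x):
    return 1.5 * math.cos(3 * x)


def dz(x):
    return (-math.sin(x), math.cos(x))


def d2z(x):
    return (-math.cos(x), -math.sin(x))


def h(x):
    return (phi(x) * math.cos(x), phi(x) * math.sin(x))


def dh(x):
    return (
        dphi(x) * math.cos(x) - phi(x) * math.sin(x),
        dphi(x) * math.sin(x) + phi(x) * math.cos(x),
    )


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def constraint_5a(x):
    """Left side of (5a): the tangential component of h."""
    return dot(dz(x), h(x))


def g_4a(x):
    """g from (4a): needs the first derivative of h."""
    return 2.0 * dot(dz(x), dh(x))


def g_4b(x):
    """g from (4b): algebraic in h, uses second derivatives of z."""
    return -2.0 * dot(d2z(x), h(x))


for x in (0.0, 0.7, 2.0, 4.5):
    print(x, constraint_5a(x), g_4a(x), g_4b(x))
```

In this example both expressions reduce to twice the amplitude, so prescribing $g$ determines $h$ without solving any differential equation.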
Equations (4) and (5) (or (4b) and (5a)) can be written for every
$z \in C^{r+1}(M, R^m)$. Nevertheless, we can assure the existence of a solution only
for a non-void open set of z's in C3(M, Rl"), where m is large enough (m may
be chosen to be m = 2n2 + 3n - see [4], p. 53-but remember that we don't
care about bounds for m).
V. Choose a mapping of $M$ into $R^s$ by functions $v_1, \ldots, v_s$. Now define a
mapping $\hat z$ of $M$ into $R^{s + (1/2)s(s+1)}$ by means of the functions
    $v_1, \ldots, v_s$
    $v_1v_1, \ldots, v_1v_s$
    $v_2v_2, \ldots, v_2v_s$
    $\vdots$
    $v_sv_s$.
Write
    $\hat z_1 = v_1$, $\hat z_2 = v_2$, ..., $\hat z_s = v_s$, $\hat z_{i,j} = v_iv_j$, $i \le j$.

If $v = (v_1, \ldots, v_s)$ is a regular $C^\infty$ embedding of $M$ into $R^s$ (the existence of
such is guaranteed by Whitney's Theorem), we claim that $f'(z)h$ is invertible
as a function of $h$ for every $z$ in a neighborhood of $\hat z$. In fact, if $v = (v_1, \ldots, v_s)$
is a regular embedding, one can take as local coordinates for $M$ some appropriate
subset of the $v_i$'s. Suppose that $M$ has been covered by the open sets
$U_1, \ldots, U_N$ in such a way that on each $U_i$ one such subset works as a coordinate
system. We consider the linear systems (4b) and (5a) in this particular
case. In order to simplify the notation, order the $\hat z$'s once and for all by
    $(\hat z_1, \ldots, \hat z_s, \hat z_{1,1}, \hat z_{1,2}, \ldots, \hat z_{s,s})$

and write $\alpha$ as a general index for them. Let us fix one particular $U_i$ (call
it simply $U$) and suppose that $x_1 = v_1$, $x_2 = v_2$, ..., $x_n = v_n$ is a system
of coordinates on $U$. Consider $\hat z$. The coefficients of (4b) and (5a) are first
or second derivatives of the $\hat z_\alpha$'s with respect to the $x_i$'s, and the matrix $B$
of the system of linear equations has the following form at every point of $U$:

[Figure: block form of the matrix $B$, with column blocks of widths $n$, $s - n$ and $k$.]
In the above, shading indicates arbitrary coefficients and $k = \tfrac{1}{2}n(n + 1)$.
$I_n$, $I_k$ are the identity matrices of dimension $n$ and $k$ respectively.
This shows that the matrix $B$ has maximal rank $n + \tfrac{1}{2}n(n + 1)$ at every
point of $U$ (its rows being linearly independent). The same is true (for obvious
reasons) for every $z$ which is sufficiently near $\hat z$ in the $C^2$ sense.
This remark has two strong consequences. First, it clearly shows that the
systems (4b) and (5a) have solutions at every point of $U$ for all $z$, $C^2$ near $\hat z$.
This is basic in finding $h$. Second, it will follow from this that there exists
a solution of (4b) and (5a) defined by a mapping that is as smooth as $g$
and the second derivatives of $z$ are. This needs some explanation. At each
point of $U$, we know that (4b) and (5a) can be solved. Among the solutions
of this system of linear equations we pick out one by the condition
(6)    $\sum_\alpha (h_\alpha)^2$ = minimum.

It is easy to conclude from the fact that the solutions form a convex set in
Euclidean space, that one well defined solution is thereby selected.
This defines a mapping $h : U \to R^l$, $l = s + \tfrac{1}{2}s(s + 1)$, and two problems
arise. (a) Is $h$ differentiable? (b) Is $h$ defined independently of $U$?
(a) Differentiability of $h$. Fix a point $p$ in $U$. Since $B$ has maximal rank,
$B\,{}^tB$ is non-singular (this follows from the fact that $\det(B\,{}^tB)$ is the Gram
determinant of the rows of $B$, and consequently different from zero if these
rows are linearly independent). But then the equation
    $B\,{}^tB\,D = G$
has a unique solution $D = (B\,{}^tB)^{-1}G$, for all $G$.
Define
(7)    $H = {}^tB\,D = {}^tB(B\,{}^tB)^{-1}G$.
Clearly $H$ is a solution of (4b) and (5a) at the given point of $U$. We claim
that this is the solution satisfying (6). In fact, if $R$ is another solution,
$BR = G$, we have (writing $(\ ,\ )$ for the scalar product in Euclidean space $R^l$):
    $(R, R) = (H, H) + (R - H, R - H) + 2(H, R - H)$
and
    $(H, R - H) = ({}^tB\,D, R - H) = (D, BR - BH) = (D, G - G) = 0$.
Then
    $(R, R) - (H, H) = (R - H, R - H)$,
and this proves that among all solutions $R$ of $BR = G$, $(R, R)$ is minimum
when $R = H$.
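Formula (7) is easy to try on a small system. The sketch below uses exact rational arithmetic and a hand-written elimination routine. The matrix `B`, the right side `G` and the competing solution `R` are arbitrary illustrative choices, and the helper names are chosen here.

```python
from fractions import Fraction

# Sketch: least-norm solution H = tB (B tB)^(-1) G of an underdetermined system.
# Assumptions: B, G and the competing solution R are illustrative examples.


def transpose(B):
    return [list(col) for col in zip(*B)]


def matmul(P, Q):
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination (A square and non-singular)."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(b[i])] for i, row in enumerate(A)]
    for c in range(n):
        pivot = next(r for r in range(c, n) if M[r][c] != 0)
        M[c], M[pivot] = M[pivot], M[c]
        M[c] = [v / M[c][c] for v in M[c]]
        for r in range(n):
            if r != c:
                M[r] = [vr - M[r][c] * vc for vr, vc in zip(M[r], M[c])]
    return [M[i][n] for i in range(n)]


def least_norm_solution(B, G):
    """Compute D = (B tB)^(-1) G and return H = tB D, as in formula (7)."""
    Bt = transpose(B)
    D = solve(matmul(B, Bt), G)
    return [sum(Bt[i][k] * D[k] for k in range(len(D))) for i in range(len(Bt))]


def apply(B, x):
    return [sum(b * v for b, v in zip(row, x)) for row in B]


B = [[1, 0, 1], [0, 1, 1]]
G = [1, 2]
H = least_norm_solution(B, G)
R = [1, 2, 0]  # another solution of B R = G
print(H, apply(B, H))
print(sum(v * v for v in H), sum(v * v for v in R))
```

Because the solution is a rational expression in the entries of $B$ and $G$, it inherits their smoothness when the point varies, which is the observation made next in the text.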
If the point $p$ in $U$ is now permitted to vary, formula (7) shows that $H$
varies as smoothly as $B$ and $G$ do. It follows that $h$ is $r$ times differentiable
provided that $z$ is $r + 2$ times differentiable and $g$ is $r$ times differentiable.
(b) It remains to show that the solution $H$ is independent of the coordinates
chosen in $U$. But that is easy. If $h_1$, $h_2$ are solutions constructed on
$U_1$ and $U_2$ respectively, since property (6) is coordinate independent, $h_1$ and
$h_2$ both must possess it at every point of $U_1 \cap U_2$; by uniqueness $h_1$ and $h_2$
agree there. This proves that $h$ may be defined everywhere, and thus we are
through.
The expression
(7)    $H = {}^tB(B\,{}^tB)^{-1}G$
for $H$ at $p$ in terms of $G$ at $p$ (and $z$ locally), assures that the correspondence
$g \to h$ ($z$ supposed fixed) is linear.
But it tells us still more. In fact, the matrix $B$ involves first and second
derivatives of $z$ ($B$ is the matrix of (4b) and (5a)). Then from (7) it is apparent
that $H$ is a smooth function of $g$ and of the first and second derivatives of $z$.
Let $L(z)g$ denote the function $H$ of (7). The smoothness of $H$ as a function
of $z$ and $g$ may be stated as follows: the mapping $L(z)g$ is continuous in
both variables $z$ and $g$ simultaneously for the topologies: $z \in C^{r+2}(M, R^m)$,
$g \in S^r$, $L(z)g \in C^r(M, R^m)$.
Naturally the equivalence between (4) and (5) and (4b) and (5a) can be
proved so long as $h$ has at least first derivatives (otherwise (4) does not make
sense). For that we need $L(z)g$ to belong at least to $C^1(M, R^m)$. This requires
$r$ to be 3 or more.
We sum up as follows:
(8) For every $r \ge 3$ there exists an open set $O$ in $C^{r+2}(M, R^m)$ such that a
function $L(z)g$ is defined on $O \times S^r$, is continuous (in both variables simultaneously),
and has values in $C^r(M, R^m)$, satisfying:
(8i) $f'(z) \circ L(z)g = g$,  $(z, g) \in O \times S^r$;
(8ii) $L(z)g$ is linear in $g$ for every fixed $z$ in $O$;
(8iii) the elements in $O$ are 1 - 1 and regular at every point.
(8iii) follows from the fact that we may choose $O$ to be a small neighborhood
of the embedding $\hat z$ above, and that the set of embeddings is open in $C^r$,
$r \ge 1$.
VI. We now assume that $M$ is a torus (cf. I). Hence there exists a global set
of (local) coordinates (the angular parameters $x_1, \ldots, x_n$) and consequently
there is a standard way of expressing the doubly covariant tensor fields as
$n \times n$ matrices of functions, just by taking the components of such tensors in
coordinates $x_1, \ldots, x_n$ at each point. This means that it is possible to identify
$S^r$ with $(C^r(M, R))^{n^2}$. But then the mapping $f : C^{r+1}(M, R^m) \to S^r$ may be
assumed to have range in $(C^r(M, R))^{n^2}$ and hence to split in $n^2$ components
each with range in $C^r(M, R)$.
Thus each component inherits from $f$ all properties vis-à-vis derivatives.
We leave to the reader now the verification that these components satisfy the
hypotheses of Theorem 2.2.
We can then apply Theorem 2.2 and conclude that:
(9) The image under $f$ of the set of infinitely often continuously differentiable
embeddings of $M$ in some Euclidean space covers an open set of $S^\infty$.

Remark: We may restrict our embeddings to be embeddings of $M$ in some
fixed $R^m$, and the conclusion should remain the same. But our next step will
consist of adding directly two such embeddings and the bound $m$ will
vanish. For that reason (9) is stated without any reference to the ranges of
the embeddings considered.

VII. Let $K^r$ = the set of all tensor fields in $S^r$ that are metrics, that is to
say positive definite at each point of $M$. Clearly $K^r$ is a convex cone, open for
the $S^r$ topology.
For every $C^{r+1}$ embedding $z$ of $M$ in some Euclidean space, $f(z)$ belongs
to $K^r$. Let $E^r \subset K^r$ be the set of all such $f(z)$.

Lemma: $E^\infty$ is a convex cone dense in $K^\infty$ for the $S^\infty$-topology.

Proof: (i) $E^\infty$ is a convex cone. For every $\lambda > 0$ we have $\lambda f(z) = f(\sqrt{\lambda}\,z)$
($f$ is a quadratic form). If $z : M \to R^m$, $u : M \to R^l$, then the embedding
$t = z \oplus u : M \to R^m \oplus R^l$ (defined in the obvious way) satisfies
$f(t) = f(z) + f(u)$. Both properties together define a convex cone.
(ii) $E^\infty$ is dense in $K^\infty$.

Proof: Suppose the contrary, and let $\overline{E^\infty}$ be the closure of $E^\infty$. Then
there exists a point $g \in K^\infty$ such that $g \notin \overline{E^\infty}$. By the separating hyperplane
property of the locally convex F-space $S^\infty$ of all $C^\infty$ tensors on the manifold
$M$, there exists a continuous linear functional $\phi$ on $S^\infty$ such that $\phi(E^\infty) \le 0$
and $\phi(g) > 0$. Let $z$ be any arbitrary embedding of $M$ into a Euclidean space,
and let $u$ be any arbitrary smooth mapping of $M$ into a Euclidean space.
Then, for any positive $\varepsilon$, $\varepsilon z$ is an embedding, and hence so is $\varepsilon z \oplus u$.
Since $\phi(f(\varepsilon z \oplus u)) \le 0$
for all $\varepsilon > 0$, it follows on letting $\varepsilon \to 0$ that $\phi(f(u)) \le 0$ for every smooth
mapping $u$ of $M$ into a Euclidean space (cf. formula (1) above). By formula (1)
above, this is equivalent to the statement that $\phi(f(u)) \le 0$ for each smooth
mapping of $M$ into 1-dimensional Euclidean space, that is, that $\phi(f(u)) \le 0$
for each smooth real-valued function $u$ on $M$.
We now let $V \subseteq M$ be a coordinate patch, introduce coordinates
$[x_1, x_2, \ldots, x_n] = [x_1, y] = x$ in $V$ mapping $V$ onto the unit sphere in Euclidean
space and restrict the functional $\phi$ to the set $S_V$ of tensors in $S^\infty$ vanishing
outside $V$. For $h \in S_V$, $\phi(h)$ may be written as $\phi(h) = \sum_{i,j=1}^{n} D^{ij}(h_{ij}(\cdot))$, where
$D^{ij} = D^{ji}$ is a distribution defined in the unit sphere of $n$-dimensional
Euclidean space, and where $h_{ij}(x)$ is the coordinate expression of the
tensor $h \in S_V$. The above condition $\phi(f(u)) \le 0$ evidently implies that
    $\sum_{i,j=1}^{n} D^{ij}\left(\frac{\partial u}{\partial x_i}\frac{\partial u}{\partial x_j}\right) \le 0$
for all smooth functions $u$ vanishing outside a compact
subset of the unit sphere.
Let $\eta$ be a smooth non-negative function in $R^n$, of total integral 1, vanishing
outside the unit circle. We know from the general theory of distributions that
the "convolutions" $D^{ij}_\varepsilon(y) = D^{ij}(\eta_\varepsilon(\cdot - y))$, where $\eta_\varepsilon(x) = \varepsilon^{-n}\eta(x/\varepsilon)$, are a family of $C^\infty$ functions,
defined in the sphere of radius $1 - \varepsilon$, and converging as $\varepsilon \to 0$ to $D^{ij}$ in the
sense of the theory of distributions. From the statement
$\sum_{i,j=1}^{n} D^{ij}\left(\frac{\partial u}{\partial x_i}\frac{\partial u}{\partial x_j}\right) \le 0$
it is easily verified that
[*]    $\sum_{i,j=1}^{n}\int D^{ij}_\varepsilon(x)\,\frac{\partial u}{\partial x_i}(x)\,\frac{\partial u}{\partial x_j}(x)\,dx \le 0$

for each smooth function $u$ vanishing outside the sphere of radius $1 - \varepsilon$. We
shall show that [*] implies that

[**]   $\sum_{i,j=1}^{n} D^{ij}_\varepsilon(x)\,\xi_i\xi_j \le 0$ for each $|x| < 1 - \varepsilon$

and each vector $\xi \in R^n$.

We proceed as follows. First note that, by the rotational symmetry and the
homogeneity of the condition [*], it is sufficient to prove [**] in the special,
notationally simpler case $\xi = [1, 0, \ldots, 0]$ and $x = [c, y]$, i.e., to prove that
$D^{11}_\varepsilon(c, y) \le 0$ for each small $c$ and $|y| < 1 - \varepsilon$. To establish this last inequality,
let $\psi$ be a smooth function of a real variable $t$, equal to a constant in a
small neighborhood of $t = 0$ and vanishing for $|t| > \tfrac{1}{2}(1 - \varepsilon)$, let $\varphi$ be a smooth
function of a real variable with compact support, and put
    $u_\delta(x) = \delta^{1/2}\,\varphi\!\left(\frac{x_1 - c}{\delta}\right)\psi(|y|)$.
Then
    $\left(\frac{\partial u_\delta}{\partial x_1}(x)\right)^2 = \delta^{-1}\left(\varphi'\!\left(\frac{x_1 - c}{\delta}\right)\right)^2(\psi(|y|))^2$,
so that
    $\int\left(\frac{\partial u_\delta}{\partial x_1}(x)\right)^2 dx = \int(\varphi'(s))^2\,ds\,\int(\psi(|y|))^2\,dy$
is independent of $\delta$, while all the other products of partial derivatives of
$u_\delta(x)$ have integrals which go to zero as $\delta \to 0$. Thus, choosing $\varphi$ so that
    $\int(\varphi'(s))^2\,ds > 0$,
putting $u_\delta$ for $u$ in [*], and letting $\delta \to 0$, we find that
    $\int D^{11}_\varepsilon(c, y)\,(\psi(|y|))^2\,dy \le 0$.
Therefore, letting $\psi$ vary through a sequence of functions approaching a
$\delta$-function, we find that $D^{11}_\varepsilon(c, y) \le 0$ for $|y| < 1 - \varepsilon$. Therefore, as already
observed, [**] follows.
Now note that if $-A$ is a positive symmetric matrix and $B$ is a positive
symmetric matrix, then $\mathrm{tr}(AB) \le 0$. Indeed, we have
    $\mathrm{tr}(AB) = \mathrm{tr}(AB^{1/2}B^{1/2}) = -\mathrm{tr}(B^{1/2}(-A)^{1/2}(-A)^{1/2}B^{1/2}) = -\mathrm{tr}(CC^*)$,
where $C = B^{1/2}(-A)^{1/2}$. Since $CC^* \ge 0$ we have $\mathrm{tr}(CC^*) \ge 0$ and our
conclusion follows. Therefore it is a consequence of [**] that

[***]  $\sum_{i,j=1}^{n} D^{ij}_\varepsilon(x)\,h_{ij}(x) \le 0$

for every smooth positive symmetric tensor $h_{ij}(x)$ vanishing outside $|x| < 1 - \varepsilon$.
Integrating the inequality [***] and letting $\varepsilon \to 0$ we conclude that

    $\sum_{i,j=1}^{n} D^{ij}(h_{ij}) \le 0$

for every positive symmetric tensor vanishing outside a compact subset of the
unit sphere, i.e. that $\phi(h) \le 0$ for each positive $h \in S^\infty$ vanishing outside the
coordinate neighborhood $V$. Since, by use of an appropriate partition of unity,
any positive-definite $g \in S^\infty$ can be written as a sum of positive elements
$h_i \in S^\infty$, each vanishing outside a certain coordinate patch of $M$, it follows
that $\phi(g) \le 0$ for each $g \in K^\infty$. But this contradicts our original statement
$\phi(g) > 0$, and thus completes the proof of assertion (ii).
Q.E.D.
2.4. Theorem: (Nash, [4]). For every compact Riemannian manifold $M$ with a $C^\infty$-metric, there exists a $C^\infty$ isometric embedding of $M$ in some Euclidean space.
Proof: If $E^\infty$, $K^\infty$ are the cones defined above, our theorem states simply that $E^\infty = K^\infty$. By the lemma above we know that $E^\infty$ is dense in $K^\infty$; and from (9) we also know that $E^\infty$ has interior points.

Let $g \in K^\infty$, and let $g_0$ be an interior point of $E^\infty$. As $K^\infty$ is open, there exists an element $\tilde g$ in $K^\infty$ such that $g$ is a convex combination of $g_0$ and $\tilde g$. But then $\tilde g$ is also a cluster point of $E^\infty$ because it belongs to $K^\infty$. Moreover, $g_0$ is interior to $E^\infty$ and $E^\infty$ is convex. This implies that all points in the open segment joining $g_0$ and $\tilde g$ (in particular $g$) are interior points of $E^\infty$ (see [1], E.V.T., Chap. II, § 1, prop. 15), and we are done: $E^\infty = K^\infty$.
CHAPTER III

Degree Theory and Applications

A. A Form of Sard's Lemma
B. Definition of the Degree of a $C^1$ Mapping in $R^n$
C. Some Functions are Divergences
D. Back to the Definition
E. The Continuous Case
F. The Multiplicative Property and Consequences
G. Borsuk's Theorem
H. Preliminaries: Degree Theory in an Arbitrary Finite Dimensional Space
I. Preliminaries: Restriction to a Subspace
J. Degree of Finite Dimensional Perturbations of the Identity
K. Properties
L. Limits
M. Compact Perturbations
N. Multiplication Property and Generalized Jordan's Theorem for Banach Spaces
O. Fixed Point Theorems in Banach Spaces

A. A Form of Sard's Lemma

Our aim is to prove the following Theorem 3.1, which is related to Sard's lemma (cf. de Rham, "Variétés différentiables", Sec. 3, Th. 4).

3.1. Theorem: Let $D$ be an open set in $R^n$, let $f$ be a continuously differentiable mapping of $D$ into $R^n$, and let $J(x)$ be the Jacobian determinant of $f$ at $x$. Then for any measurable subset $E$ of $D$ the set $f(E)$ is measurable and

$$m(f(E)) \le \int_E |J(x)|\,dx.$$
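The inequality of Theorem 3.1 can be strict when $f$ is not one-to-one. As a rough numerical illustration (a sketch under assumptions, not part of the original notes), take $n = 1$, $D = E = (-1, 1)$ and $f(x) = x^2$, so that $f(E) = [0, 1)$ has measure 1 while $\int_E |f'(x)|\,dx = 2$. The helper names `image_measure` and `integral_abs`, and the bin-counting estimate of the measure, are choices made for this example only.

```python
def image_measure(f, a, b, lo, hi, samples=200_000, bins=1_000):
    """Estimate m(f((a, b))) by counting which bins of [lo, hi] are hit."""
    width = (hi - lo) / bins
    hit = [False] * bins
    for k in range(samples):
        x = a + (k + 0.5) * (b - a) / samples
        idx = min(int((f(x) - lo) / width), bins - 1)
        hit[idx] = True
    return sum(hit) * width

def integral_abs(g, a, b, samples=200_000):
    """Midpoint-rule approximation of the integral of |g| over (a, b)."""
    h = (b - a) / samples
    return sum(abs(g(a + (k + 0.5) * h)) for k in range(samples)) * h

f = lambda x: x * x          # the mapping
df = lambda x: 2.0 * x       # its Jacobian determinant (n = 1)

lhs = image_measure(f, -1.0, 1.0, 0.0, 1.0)   # approximates m(f(E))
rhs = integral_abs(df, -1.0, 1.0)             # approximates the integral of |J|
print(lhs, rhs)
```

Here the right-hand side counts the image twice, once for each branch of $f$, which is why only an inequality can be expected in general.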


We begin by recalling a few definitions. For any point $v_0$ of $R^n$ and any set of $n$ linearly independent vectors $a_1, \ldots, a_n$ in $R^n$, the parallelotope $P$ with initial vertex $v_0$ and edge-vectors $a_1, \ldots, a_n$ is the set of all points of $R^n$ of the form $x = v_0 + \sum_{i=1}^{n} \lambda_i a_i$, where the $\lambda_i$ are real numbers such that $0 \le \lambda_i \le 1$, $i = 1, \ldots, n$. The point $v_0 + \tfrac{1}{2} \sum_{i=1}^{n} a_i$ is called the center of the parallelotope.
For fixed $k$ the set of those points of $P$ for which $\lambda_k$ has a fixed value equal to either 0 or 1 is called an $(n - 1)$-dimensional face (or, briefly, face) of $P$, so that the number of faces of $P$ is $2n$. The point $v_0 + \lambda_k a_k + \tfrac{1}{2} \sum_{i \ne k} a_i$ is called the center of the face.
It is immediate that the parallelotope $P$ with initial vertex at the origin and edge-vectors $a_1, \ldots, a_n$ is the image of the unit cube
$$\{x = (x^1, \ldots, x^n) : 0 \le x^i \le 1,\ i = 1, \ldots, n\}$$
under the non-singular linear transformation $h : R^n \to R^n$ given by

$$h(x) = h(x^1, \ldots, x^n) = \sum_{i=1}^{n} x^i a_i;$$

moreover the image of the unit cube by any non-singular linear transformation of $R^n$ onto itself is a parallelotope of this form. It follows that $P$ is compact, and that the frontier of $P$ is the union of the $2n$ faces of $P$. Moreover (see, for example, Zaanen [4], p. 160), the $n$-dimensional measure of $P$ is equal to $|\det(h)| = |\det(a_i^j)|$, where $a_i^j$ is the $j$-th coordinate of $a_i$; and obviously these last results extend to a parallelotope with any initial vertex.
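For $n = 2$ the statement that the measure of $P$ equals $|\det(a_i^j)|$ is easy to confirm directly. The sketch below is an illustration added here, not taken from the notes: it compares the determinant with the area given by the shoelace formula, and the particular vertex and edge-vectors are arbitrary sample values.

```python
def det2(a1, a2):
    """Determinant of the 2x2 matrix whose rows are the edge-vectors."""
    return a1[0] * a2[1] - a1[1] * a2[0]

def shoelace_area(vertices):
    """Area of a simple polygon from its vertices listed in order."""
    n = len(vertices)
    s = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0

def parallelogram(v0, a1, a2):
    """Vertices of the parallelotope with initial vertex v0 in the plane."""
    return [
        v0,
        (v0[0] + a1[0], v0[1] + a1[1]),
        (v0[0] + a1[0] + a2[0], v0[1] + a1[1] + a2[1]),
        (v0[0] + a2[0], v0[1] + a2[1]),
    ]

v0, a1, a2 = (1.0, -2.0), (3.0, 1.0), (1.0, 2.0)   # sample data
area = shoelace_area(parallelogram(v0, a1, a2))
print(area, abs(det2(a1, a2)))
```

The initial vertex `v0` does not affect the result, in line with the remark that the formula extends to any initial vertex.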
Throughout the following discussion we use the ordinary Euclidean norm for points of $R^n$ and the corresponding norm for linear transformations of $R^n$ into itself, and we denote the inner product of $x$ and $y$ by $x \cdot y$. We use $A(n)$ to denote a positive constant depending only on $n$, not necessarily the same on any two occurrences.
For our proof of Theorem 3.1 we require two simple geometrical inequalities, which we state below.

1. Let $F$ be a set in $R^n$ contained in a hyperplane $H$, let $x_0$ be a fixed point of $F$, and let $\|x - x_0\| \le d$ whenever $x \in F$. Let also $G$ be the set of points of $R^n$ whose distance from $F$ is less than $\delta$. Then $G$ is measurable (since it is open) and
$$(\alpha.1) \qquad m(G) \le 2^n (d + \delta)^{n-1} \delta.$$

It is evident that $G$ lies between the two hyperplanes parallel to $H$ and distance $\delta$ from it, and to prove $(\alpha.1)$ we construct a parallelotope containing $G$ with two of its faces in these hyperplanes.
By a suitable translation we may suppose that $H$ contains the origin, so that $H$ is an $(n - 1)$-dimensional vector subspace of $R^n$. We can therefore find a unit vector $a_1$ such that $x \cdot a_1 = 0$ for all $x \in H$ (i.e. such that $a_1$ is orthogonal to $H$), and then we can find vectors $a_2, \ldots, a_n$ such that $\{a_1, a_2, \ldots, a_n\}$ is a complete orthonormal set in $R^n$. Let now $y \in G$. Since every vector in $R^n$ can be expressed as a linear combination of the $a_i$, there exist real numbers $\lambda_1, \ldots, \lambda_n$ such that
$$y - x_0 = \sum_{i=1}^{n} \lambda_i a_i.$$

Further, since the distance of $y$ from $F$ is less than $\delta$, there exists $x \in F$ (possibly identical with $y$) such that $\|y - x\| < \delta$, and then writing

$$y - x = (y - x_0) - (x - x_0),$$
and noting that $(x - x_0) \cdot a_1 = 0$ because $x - x_0 \in H$, we obtain
$$\lambda_1 = (y - x_0) \cdot a_1 = (y - x) \cdot a_1 + (x - x_0) \cdot a_1 = (y - x) \cdot a_1,$$
whence
$$|\lambda_1| \le \|y - x\|\, \|a_1\| = \|y - x\| < \delta.$$
Also
$$\|y - x_0\| \le \|y - x\| + \|x - x_0\| < \delta + d,$$
so that for $i = 2, \ldots, n$,
$$|\lambda_i| = |(y - x_0) \cdot a_i| \le \|y - x_0\|\, \|a_i\| = \|y - x_0\| \le d + \delta.$$
It follows that $G$ is contained in the (fixed) parallelotope with center $x_0$ and edge-vectors $2\delta a_1$, $2(d + \delta) a_i$, $i = 2, \ldots, n$, and since the measure of this parallelotope is $2^n (d + \delta)^{n-1} \delta\, |\det(a_i^j)| = 2^n (d + \delta)^{n-1} \delta$, the result follows.
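A concrete planar instance may help to see how much room the bound $(\alpha.1)$ leaves. In this sketch (an added illustration, with the specific set $F$ chosen here as an assumption) $F$ is the segment $[-d, d] \times \{0\}$ in $R^2$ and $x_0$ is the origin; the set $G$ is then a "stadium" whose exact area is $4d\delta + \pi\delta^2$, to be compared with $2^2 (d + \delta)\delta$.

```python
import math

def stadium_area(d, delta):
    """Exact area of the open delta-neighborhood of a segment of length 2d."""
    return 4.0 * d * delta + math.pi * delta ** 2

def lemma_bound(n, d, delta):
    """Right-hand side of (alpha.1): 2^n (d + delta)^(n-1) delta."""
    return 2 ** n * (d + delta) ** (n - 1) * delta

for d, delta in [(1.0, 0.1), (0.5, 2.0), (3.0, 1.0)]:
    print(d, delta, stadium_area(d, delta), lemma_bound(2, d, delta))
```

The two quantities differ by exactly $(4 - \pi)\delta^2$, the area lost by rounding off the corners of the enclosing rectangle.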

2. Let $h$ be a linear transformation of $R^n$ into itself, let $P$ be the image by $h$ of the unit cube $C = \{x = (x^1, \ldots, x^n) : 0 \le x^i \le 1,\ i = 1, \ldots, n\}$, and let $Q$ be the set of points of $R^n$ whose distance from $P$ is less than $\delta$. Then $Q$ is measurable (since it is open) and

$$m(Q) \le |\det(h)| + A(n)\, (\|h\| + \delta)^{n-1} \delta.$$

Suppose first that $\det(h) = 0$, so that $h$ is singular. In this case $P$ is contained in a hyperplane, and we apply Lemma 1 to $F = P$, taking $x_0$ to be the image of the center $w_0$ of $C$. Since

$$\|h(w) - h(w_0)\| = \|h(w - w_0)\| \le \|h\|\, \|w - w_0\| \le \tfrac{1}{2}\sqrt{n}\, \|h\|$$
whenever $w \in C$, we have $\|x - x_0\| \le \tfrac{1}{2}\sqrt{n}\, \|h\|$ whenever $x \in P$, whence statement 1 above gives
$$m(Q) \le 2^n \bigl(\tfrac{1}{2}\sqrt{n}\, \|h\| + \delta\bigr)^{n-1} \delta \le A(n)\, (\|h\| + \delta)^{n-1} \delta,$$

as required.
Suppose next that $\det(h) \ne 0$. In this case $P$ is a parallelotope with measure $m(P) = |\det(h)|$, and it is therefore enough to prove that the open set $Q - P$ has measure not exceeding $A(n)\, (\|h\| + \delta)^{n-1} \delta$. Since $P$ is compact, for each $y \in Q - P$ there exists $x \in P$ such that $\|y - x\|$ is equal to the distance of $y$ from $P$, and evidently $x$ is a frontier point of $P$, so that $x$ lies on one or more $(n - 1)$-dimensional faces of $P$. Since $P$ has $2n$ such faces and each face is the image by $h$ of a face of $C$, it is now enough to prove that if $B$ is a face of $C$ and $E$ is the set of points of $R^n$ whose distance from $h(B)$ is less than $\delta$, then
$$(\alpha.2) \qquad m(E) \le A(n)\, (\|h\| + \delta)^{n-1} \delta.$$
To prove this last result we observe that $h(B)$ is contained in a hyperplane, so that we can apply 1 to $F = h(B)$. We choose $x_0$ to be the center of the face $h(B)$ of $P$, so that $\|x - x_0\| \le \tfrac{1}{2}\sqrt{n - 1}\, \|h\|$ whenever $x \in h(B)$, and then 1 gives
$$m(E) \le 2^n \bigl(\tfrac{1}{2}\sqrt{n - 1}\, \|h\| + \delta\bigr)^{n-1} \delta \le A(n)\, (\|h\| + \delta)^{n-1} \delta.$$
This proves $(\alpha.2)$, and completes the proof of 2.
From 2 we deduce immediately:
3. Let $C$ be a closed cube in $R^n$ with sides parallel to the axes and of length $a$, let $h$ be a linear transformation of $R^n$ into itself, and let $Q$ be the set of points of $R^n$ whose distance from the set $h(C)$ is less than $a\delta$. Then $Q$ is measurable (since it is open) and
$$(\alpha.3) \qquad m(Q) \le m(C)\, \{|\det(h)| + A(n)\, (\|h\| + \delta)^{n-1} \delta\}.$$

* In the case in which $\det(h) \ne 0$ it is tempting to estimate $m(Q)$ by using the inequality $m(Q) \le m(P')$, where $P'$ is the smallest parallelotope containing $Q$ with sides parallel to those of $P$, but unfortunately the measure $m(P')$ tends to infinity as we approach the singular case, i.e. as $\det(h)$ tends to 0 (this is easily seen from a diagram illustrating the plane case). Most proofs of the change of variable formula in which the estimate of the measure of a parallelotope appears do in fact use an estimate of the form $m(Q) \le m(P')$, and it is for this reason that the hypothesis $\inf |J(x)| > 0$ is essential to such proofs.

By applying 3 to the derivative of a differentiable mapping, we obtain the following result; in this we use the definition of derivative given in 1.9.
4. Let $C$ be a closed cube in $R^n$ with center $x_0$ and with sides parallel to the axes, let $f$ be a differentiable mapping of $C$ into $R^n$, and let $J(x)$ be the Jacobian determinant of $f$ at $x$. Then
$$(\alpha.4) \qquad m^*(f(C)) \le m(C)\, \{|J(x_0)| + A(n)\, (\|f'(x_0)\| + \eta)^{n-1} \eta\},$$

where $\eta = \sup_{x \in C} \|f'(x) - f'(x_0)\|$ and $m^*$ denotes outer Lebesgue measure.

To prove $(\alpha.4)$ let $a$ be the length of the sides of $C$, and let $P$ be the image of $C$ by the linear transformation $f'(x_0) : R^n \to R^n$. By the mean value theorem applied to the function $f - f'(x_0)$ (cf. the proof of Corollary 1.45), we have for each $x$ of $C$
$$\|f(x) - f(x_0) - f'(x_0)(x - x_0)\| \le \eta\, \|x - x_0\| < \eta a \sqrt{n},$$
and this inequality expresses the fact that the point $f(x) - f(x_0) + f'(x_0)(x_0)$ of the translate $f(C) - f(x_0) + f'(x_0)(x_0)$ of $f(C)$ is at a distance less than $\eta a \sqrt{n}$ from the point $f'(x_0)(x)$ of $P$. It follows that this translate of $f(C)$ is contained in the set of points of $R^n$ whose distance from $P$ is less than $\eta a \sqrt{n}$, and applying 3 with $\delta = \eta\sqrt{n}$ (and noting that $\det(f'(x_0)) = J(x_0)$) we immediately obtain the inequality $(\alpha.4)$, the factors of $\sqrt{n}$ being absorbed into $A(n)$.
5. Let $D$ be an open set in $R^n$, let $f$ be a continuously differentiable mapping of $D$ into $R^n$, and let $J(x)$ be the Jacobian determinant of $f$ at $x$. Then for any measurable subset $E$ of $D$
$$(\beta.1) \qquad m^*(f(E)) \le \int_E |J(x)|\,dx,$$
where $m^*$ denotes outer Lebesgue measure.
Suppose first that $E$ is a closed cube $C$ with sides parallel to the axes. Since $f'$ is continuous on $C$, we can divide $C$ into a finite number of non-overlapping closed cubes $C_1, \ldots, C_N$ with centers $x_1, \ldots, x_N$ and with sides parallel to the axes such that $\|f'(x) - f'(x_k)\| \le \varepsilon$ whenever $x \in C_k$ $(k = 1, \ldots, N)$. By 4, for each cube $C_k$ we have
$$m^*(f(C_k)) \le m(C_k)\, \{|J(x_k)| + A\varepsilon\},$$
where $A$ is independent of $k$, so that also

$$m^*(f(C)) \le \sum m^*(f(C_k)) \le \sum |J(x_k)|\, m(C_k) + A\varepsilon\, m(C),$$

the summations being extended over all cubes $C_k$. When the maximum diameter of the cubes $C_k$ tends to 0 the sum $\sum |J(x_k)|\, m(C_k)$ tends to the Riemann integral of $|J(x)|$ over $C$, and since $\varepsilon$ is arbitrary we therefore obtain

$$(\beta.2) \qquad m^*(f(C)) \le \int_C |J(x)|\,dx,$$

which is $(\beta.1)$ for $E = C$.
Suppose next that $E$ is a measurable subset of $D$. Then we can find a set $E_1$ containing $E$ and with measure equal to that of $E$ such that $E_1$ is the intersection of a contracting sequence of open sets $O_n \subset D$. If now $C$ is a closed cube contained in $D$ with sides parallel to the axes, then for each fixed $n$ the set $C \cap O_n$ is a countable union of non-overlapping closed cubes with sides parallel to the axes, and applying $(\beta.2)$ to each such cube and summing we obtain
$$m^*(f(C \cap O_n)) \le \int_{C \cap O_n} |J(x)|\,dx,$$
whence also
$$(\beta.3) \qquad m^*(f(C \cap E)) \le \int_{C \cap O_n} |J(x)|\,dx$$

(since $E \subset O_n$). Since $J$ is bounded on $C$, the integral on the right of $(\beta.3)$ is finite, and so tends to $\int_{C \cap E_1} |J(x)|\,dx$ as $n$ tends to $+\infty$, whence

$$m^*(f(C \cap E)) \le \int_{C \cap E_1} |J(x)|\,dx = \int_{C \cap E} |J(x)|\,dx.$$

Since $D$ is a countable union of non-overlapping cubes such as $C$, the general result $(\beta.1)$ follows.

6. Let $D$ be an open set contained in $R^n$, and let $f$ be a continuously differentiable mapping of $D$ into $R^n$. Then $f(E)$ is measurable for every measurable set $E \subset D$.
(For a proof of this under more general hypotheses see Rado and Reichelderfer [1], pp. 337, 214).
Let $J(x)$ be the Jacobian determinant of $f$ at $x$. It follows immediately from 5 that if $E_0$ is the subset of $D$ where $J(x) = 0$, then $f(E_0)$ has measure 0, so that $m(f(E \cap E_0)) = 0$ for every measurable $E \subset D$. Since $D - E_0$ is open, it is therefore enough to prove the above result when $J(x) \ne 0$ on $D$.

Suppose then that $J(x) \ne 0$ on $D$, so that $f$ is locally a homeomorphism. The open set $D$ is a countable union of closed cubes, and, by the Heine-Borel theorem, we can cover each of these cubes with a finite number of closed cubes on each of which $f$ is a homeomorphism. Hence $D$ is a countable union of closed cubes $C_j$ on each of which $f$ is a homeomorphism, and since $f(E) = \bigcup_j f(E \cap C_j)$, it is enough to prove that $f(E \cap C)$ is measurable whenever $E$ is measurable and $C$ is a closed cube in $D$ on which $f$ is a homeomorphism.
If $E$ is closed, so are $E \cap C$ and $f(E \cap C)$, and hence if $E$ is a countable union of closed sets, then $f(E \cap C)$ is measurable. Since any measurable set is the union of a set of measure zero and a set which is a countable union of closed sets, it is now enough to prove that $f(E \cap C)$ is measurable when $E$ is of measure zero, and this follows immediately from 5. This completes the proof of 6, and hence also of Theorem 3.1.

References

1. T. Rado and P. V. Reichelderfer, Continuous transformations in analysis (Berlin, 1955).
2. G. de Rham, Variétés différentiables (Paris, 1955).
3. J. Schwartz, "The formula for change in variables in a multiple integral", Amer. Math. Monthly 61 (1954), 81-5.
4. A. C. Zaanen, An introduction to the theory of integration (Amsterdam, 1958).

B. Definition of the Degree of a $C^1$ Mapping in $R^n$

Notation: Until further comment, $D$ will denote an open bounded set of $R^n$, the Euclidean space whose coordinates are $x = (x^1, \ldots, x^n)$. Let $\partial D$ denote the boundary of $D$.
Most of the mappings appearing below are continuous on $\bar D$. We shall write $C$ for the space $C(\bar D, R^n)$ of continuous mappings defined on $\bar D$ and having values in $R^n$. By a $C^r$ function on $\bar D$ we mean a function having derivatives on a neighborhood of $\bar D$ up to order $r$ which coincide with restrictions of continuous functions on $D$. Cf. Chapter I for the topology of $C^r$.
Suppose that $\phi \in C$ is $C^1$ on $\bar D$ and that $p \in R^n$ is a point not belonging to $\phi(\partial D)$. We shall define the degree of $\phi$ with respect to $p$ and $D$; it will be an integer denoted by $\deg(p, \phi, D)$.
If $Z \subset \bar D$ is the set of critical points of $\phi$, i.e. points at which the Jacobian of $\phi$ vanishes, and $\phi^{-1}(p) \cap Z = \emptyset$, then the set $\phi^{-1}(p)$ is discrete, by the implicit function theorem; since $\bar D$ is compact, this set is finite.

At each $x \in \phi^{-1}(p)$, $J_\phi$ does not vanish. Then its sign is unambiguously defined and we define
$$(\text{II}.1) \qquad \deg(p, \phi, D) = \sum_{x \in \phi^{-1}(p)} \operatorname{sign} J_\phi(x).$$
Suppose now that $\phi^{-1}(p) \cap Z \ne \emptyset$.
By Sard's Lemma 3.1, $\phi(Z)$ has measure zero in $R^n$, and in particular, has empty interior. This implies that the point $p$ may be approximated as closely as desired by points $q$ for which $\phi^{-1}(q) \cap Z = \emptyset$. For each such $q$, the degree is defined as above. Then, by definition, the degree at $p$ is

$$(\text{II}.2) \qquad \deg(p, \phi, D) = \lim_{\substack{q \to p \\ \phi^{-1}(q) \cap Z = \emptyset}} \deg(q, \phi, D).$$

In order to justify this definition we must prove that the limit exists (and that it is independent of the choice of the $q$'s). This will follow from more general results to be obtained later (see Corollary 3.10). As a first remark we note that if $p \notin \phi(\bar D)$, then $\deg(p, \phi, D) = 0$.
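When the preimages of $p$ are known and none of them is critical, definition (II.1) is just a signed count. The sketch below is an added illustration, not the author's code: the function name `degree_from_preimages` and the two sample maps are assumptions made here. The planar map is $(x, y) \mapsto (x^2 - y^2, 2xy)$ on the disc of radius 2 with $p = (1, 0)$, and the one-dimensional map is $x \mapsto x^3 - x$ on $(-2, 2)$ with $p = 0$.

```python
def sign(t):
    return (t > 0) - (t < 0)

def degree_from_preimages(preimages, jacobian_det):
    """Sum of sign J(x) over the given regular preimages of p, as in (II.1)."""
    total = 0
    for x in preimages:
        j = jacobian_det(x)
        if j == 0:
            raise ValueError("critical point: (II.1) does not apply")
        total += sign(j)
    return total

# phi(x, y) = (x^2 - y^2, 2xy); Jacobian determinant 4 (x^2 + y^2).
square_jac = lambda pt: 4.0 * (pt[0] ** 2 + pt[1] ** 2)
deg_square = degree_from_preimages([(1.0, 0.0), (-1.0, 0.0)], square_jac)

# phi(x) = x^3 - x; derivative 3 x^2 - 1; preimages of 0 are -1, 0, 1.
cubic_jac = lambda x: 3.0 * x * x - 1.0
deg_cubic = degree_from_preimages([-1.0, 0.0, 1.0], cubic_jac)

print(deg_square, deg_cubic)
```

In the second example the three preimages contribute $+1$, $-1$, $+1$, so cancellations occur and the degree is smaller than the number of solutions.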
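Let us consider the special case when $\phi^{-1}(p) \cap Z = \emptyset$. Suppose that $\{f_\varepsilon\}$ is a family of continuous functions $f_\varepsilon : R^n \to R$ with the properties

(II.3)
(i) $\int_{R^n} f_\varepsilon(x)\,dx = 1$,
(ii) $K_\varepsilon$ = support $f_\varepsilon$ = sphere of radius $\varepsilon$ and center at $p$.
Let us consider the integrals
$$I_\varepsilon = \int_D f_\varepsilon(\phi(x))\, J_\phi(x)\,dx.$$

We shall prove that $I_\varepsilon = \deg(p, \phi, D)$ for every $\varepsilon$ small enough.

Let $p^1, \ldots, p^s$ be the elements of $\phi^{-1}(p)$ (as was observed, this set is finite). When $\varepsilon$ is small enough, there exist neighborhoods $A_\varepsilon^1, \ldots, A_\varepsilon^s$ of $p^1, \ldots, p^s$ such that $\phi$ maps each $A_\varepsilon^i$ homeomorphically onto $K_\varepsilon$, the sphere of radius $\varepsilon$ around $p$. Observe that $f_\varepsilon(\phi(x))$ vanishes outside $\bigcup_{i=1}^{s} A_\varepsilon^i$. It follows that
$$(1) \qquad \int_D f_\varepsilon(\phi(x))\, J_\phi(x)\,dx = \sum_{i=1}^{s} \int_{A_\varepsilon^i} f_\varepsilon(\phi(x))\, J_\phi(x)\,dx.$$

Since $J_\phi(p^i) \ne 0$ for $i = 1, \ldots, s$ (recall that $\phi^{-1}(p) \cap Z = \emptyset$), by choosing $\varepsilon$ small enough, we may assume that $J_\phi \ne 0$ at every point in every $A_\varepsilon^i$. In that case sign $J_\phi$ is (defined and) constant on every $A_\varepsilon^i$. Hence we may consider $\phi : A_\varepsilon^i \to K_\varepsilon$ as a change of variables and apply the classical theorems on change of variables in an integral. This leads to the equality:

$$(2) \qquad \int_{A_\varepsilon^i} f_\varepsilon(\phi(x))\, J_\phi(x)\,dx = \operatorname{sign} J_\phi(p^i) \int_{K_\varepsilon} f_\varepsilon(x)\,dx = \operatorname{sign} J_\phi(p^i).$$

Combining (1) and (2) yields

$$\int_D f_\varepsilon(\phi(x))\, J_\phi(x)\,dx = \sum_{i=1}^{s} \operatorname{sign} J_\phi(p^i), \qquad \{p^i\} = \phi^{-1}(p),$$

i.e.,
$$(\text{II}.4) \qquad \deg(p, \phi, D) = \int_D f_\varepsilon(\phi(x))\, J_\phi(x)\,dx$$
for every family of functions with the properties (II.3) and provided $\varepsilon$ is small enough.
This expression may be adopted as an alternate definition of $\deg(p, \phi, D)$ whenever $\phi^{-1}(p) \cap Z = \emptyset$ (see E. Heinz, [1]). We shall prove some properties of the degree as so expressed. First we need some lemmas.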
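The integral formula (II.4) can be tried out numerically in one dimension. The following sketch is an added illustration under stated assumptions: $D = (-2, 2)$, $\phi(x) = x^3 - x$, $p = 0$, and $f_\varepsilon$ is taken to be a continuous "tent" function of total integral 1 supported on $|t - p| \le \varepsilon$. The names `bump` and `degree_integral` and the midpoint quadrature are choices made for this example.

```python
def bump(t, p, eps):
    """Continuous tent of total integral 1 supported on |t - p| <= eps."""
    return max(0.0, 1.0 - abs(t - p) / eps) / eps

def degree_integral(phi, dphi, a, b, p, eps, samples=200_000):
    """Midpoint-rule value of the integral of f_eps(phi(x)) * J_phi(x) over (a, b)."""
    h = (b - a) / samples
    total = 0.0
    for k in range(samples):
        x = a + (k + 0.5) * h
        total += bump(phi(x), p, eps) * dphi(x)
    return total * h

phi = lambda x: x ** 3 - x
dphi = lambda x: 3.0 * x * x - 1.0

at_zero = degree_integral(phi, dphi, -2.0, 2.0, 0.0, 0.1)    # close to 1
outside = degree_integral(phi, dphi, -2.0, 2.0, 10.0, 0.1)   # p not in the image
print(at_zero, outside)
```

The value near 1 agrees with the signed count $+1 - 1 + 1$ over the three preimages of 0, and the integral vanishes when $p$ lies outside $\phi(\bar D)$.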

C. Some Functions are Divergences

3.2. Lemma: Suppose $v$ is a $C^1$ vector function: $(v^1, \ldots, v^n) = v : E^n \to E^n$, and that $f = \operatorname{div} v \left(= \sum_i \dfrac{\partial v^i}{\partial x^i}\right)$. If $v$ vanishes outside a bounded set $K$, then

$$\int_{R^n} f(x)\,dx = 0.$$

Proof: Compute.
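The computation amounts to integrating each term $\partial v^i / \partial x^i$ first in the variable $x^i$, where it gives the difference of two boundary values of $v^i$, both zero. As a numerical sanity check (an added sketch, not part of the notes), the code below integrates the divergence of a particular $C^1$ field on $R^2$ supported in the square $[-1, 1]^2$; the field $v = (e^x w(x) w(y),\ \sin(x)\, w(x) w(y))$ with $w(t) = (1 - t^2)^2$ is an arbitrary choice made here.

```python
import math

def w(t):
    """C^1 cutoff supported on [-1, 1]."""
    return (1.0 - t * t) ** 2 if abs(t) < 1.0 else 0.0

def dw(t):
    """Derivative of w."""
    return -4.0 * t * (1.0 - t * t) if abs(t) < 1.0 else 0.0

def divergence(x, y):
    # v1 = exp(x) w(x) w(y),  v2 = sin(x) w(x) w(y)
    dv1_dx = math.exp(x) * (w(x) + dw(x)) * w(y)
    dv2_dy = math.sin(x) * w(x) * dw(y)
    return dv1_dx + dv2_dy

def integrate_divergence(n=400):
    """Midpoint rule over [-1, 1]^2, which contains the support of v."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        for j in range(n):
            y = -1.0 + (j + 0.5) * h
            total += divergence(x, y)
    return total * h * h

result = integrate_divergence()
print(result)  # small, limited only by the quadrature error
```

The divergence itself is far from zero pointwise; only its integral vanishes.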
Suppose now that $\phi \in C$ and $\phi$ is $C^2$ on $\bar D$, and that $v$ is as in Lemma 3.2. $K$ is the (compact) support of $f$.

3.3. Lemma: If $K \cap \phi(\partial D) = \emptyset$, the function

$$g(x) = f(\phi(x))\, J_\phi(x)$$
is the divergence of some vector valued $C^1$ function $u$ with support included in $D$.
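Proof: Let $a^{k,l}$ be the cofactor (signed minor determinant) corresponding to the $(k, l)$-th entry $\dfrac{\partial \phi^k}{\partial x^l}$ in the Jacobian matrix and define $u$ by
$$u^i(x) = \sum_j v^j(\phi(x))\, a^{j,i}(x), \qquad (i = 1, 2, \ldots, n).$$

Then $u$ is $C^1$ since $\phi$ is $C^2$ and $v$ is $C^1$. From the hypothesis $\phi(\partial D) \cap K = \emptyset$ it also follows that $u$ may be considered as defined on all of $R^n$ (with the value zero outside $D$) and that its support is then included in $D$, as required. It only remains to prove that $\operatorname{div} u = g$. Writing a subscript $,i$ for differentiation with respect to $x^i$ and using the expansion $\sum_i \phi^k_{,i}\, a^{j,i} = \delta^{jk} J_\phi$, we simply compute as follows:
$$\operatorname{div} u = \sum_i u^i_{,i} = \sum_{i,j,k} v^j_{,k}(\phi(x))\, \phi^k_{,i}(x)\, a^{j,i}(x) + \sum_{i,j} v^j(\phi(x))\, a^{j,i}_{,i}(x)$$
$$= \operatorname{div} v\,(\phi(x))\, J_\phi(x) + \sum_{i,j} v^j(\phi(x))\, a^{j,i}_{,i}(x) = g(x) + \sum_{i,j} v^j(\phi(x))\, a^{j,i}_{,i}(x).$$

By definition we have
$$(n - 1)!\; a^{j,i} = \sum \varepsilon(j, j_2, \ldots, j_n)\, \varepsilon(i, i_2, \ldots, i_n)\, \phi^{j_2}_{,i_2} \cdots \phi^{j_n}_{,i_n}$$
(where $\varepsilon$ means the sign of the permutation), and then, differentiating, the antisymmetry leads to $\sum_i a^{j,i}_{,i} = 0$, which implies the desired result.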

3.4. Lemma: Let $f$ be a continuous function defined on $R^n$ having support $K$ contained in $D$. Let $x^0 \in R^n$ and suppose that the convex hull of $K \cup (K - x^0)$ (where $K - x^0$ denotes the set obtained from $K$ by the translation induced by $-x^0$) is contained in $D$. Then the function
$$f(x) - f(x + x^0)$$

is the divergence of some mapping $v : R^n \to R^n$ whose support is contained in $D$.

Proof: Let $\varphi(x) = f(x) - f(x + x^0)$. Clearly $\varphi$ has support contained in $K \cup (K - x^0)$.
Now define
$$\psi(x) = \int_{-\infty}^{0} \varphi(x + t x^0)\,dt,$$

$$v^i(x) = x^0_i\, \psi(x).$$
It is easily verified that $v^i$ has support contained in the convex hull of $K \cup (K - x^0)$. Moreover, if $v = (v^1, \ldots, v^n)$, we claim that $\operatorname{div} v = \varphi$. In fact, we have
$$\operatorname{div} v = \sum_i v^i_{,i} = \sum_i x^0_i\, \frac{\partial \psi}{\partial x_i}.$$
But the last term is the directional derivative of $\psi$ in the direction $x^0$, and consequently equals
$$\frac{d}{dt}\, \psi(x + t x^0) \Big|_{t=0}.$$
By definition of $\psi$, it follows that
$$\frac{d}{dt}\, \psi(x + t x^0) \Big|_{t=0} = \frac{d}{dt} \left( \int_{-\infty}^{0} \varphi(x + (t + u) x^0)\,du \right) \Big|_{t=0} = \int_{-\infty}^{0} \frac{d}{dt}\, \varphi(x + (t + u) x^0) \Big|_{t=0}\,du$$
$$= \int_{-\infty}^{0} \frac{d}{du}\, \varphi(x + u x^0)\,du = \varphi(x),$$

and then $\operatorname{div} v(x) = \varphi(x) = f(x) - f(x + x^0)$ as desired.

3.5. Corollary: Let $x(s)$ be a continuous curve in $R^n$, $0 \le s \le 1$, and let $f$ be continuous $f : R^n \to R$ with support $K$ contained in $D$. Suppose
(i) $K$ is contained in a convex compact set $M$ contained in $D$.
(ii) $M - x(s)$ never touches the boundary of $D$.
Then $f(x + x(0)) - f(x + x(1))$ is the divergence of some $C^1$ mapping $v : R^n \to R^n$ whose support is contained in $D$.

Proof: Let us define an equivalence relation on the set of values of $s$ by $s_1 \sim s_2$ iff $f(x + x(s_1)) - f(x + x(s_2))$ is the divergence of a mapping with support in $D$. Lemma 3.4 allows us to conclude that every class modulo $\sim$ is open. In fact, for every $s$, $M - x(s)$ being a compact convex subset of $D$ there exists an open convex neighborhood of $M - x(s)$ also contained in $D$. By the continuity of $x(s)$, the convex hull of $(M - x(s)) \cup (M - x(s'))$ is contained in such a neighborhood, when $s'$ is close to $s$. Then Lemma 3.4 applies and $s \sim s'$. Thus we conclude that every class is open. But the connectedness of $[0, 1]$ therefore implies that there is only one class. This means that $0 \sim 1$, or that $f(x + x(0)) - f(x + x(1))$ is a divergence, and we are done.

D. Back to the Definition

We begin by proving a lemma.


3.6. Lemma: Let $\phi \in C$ and be $C^2$ on $\bar D$. Consider two points $p_1, p_2 \in R^n - \phi(\partial D)$ and such that $\phi^{-1}(p_1) \cap Z = \phi^{-1}(p_2) \cap Z = \emptyset$. Then if $p_1$ and $p_2$ belong to the same component of the open set $R^n - \phi(\partial D)$, we have
$$\deg(p_1, \phi, D) = \deg(p_2, \phi, D).$$
Proof: Since $\phi^{-1}(p_1) \cap Z = \emptyset$, we know that $\deg(p_1, \phi, D)$ may be computed by means of a family of functions with the simple properties described in II.3 as follows:

$$\deg(p_1, \phi, D) = \int_D f_\varepsilon(\phi(x))\, J_\phi(x)\,dx$$

for $\varepsilon$ small.
If we suppose that $p_2$ lies in the same component of $R^n - \phi(\partial D)$ as does $p_1$, then there exists a continuous curve $x(s)$, $0 \le s \le 1$, such that $x(0) = 0$, $x(1) = p_2 - p_1$ and $x(s) + p_1$ lies in that component. Since the curve $x(s)$ is compact there exists $\varepsilon > 0$ such that if $K_\varepsilon$ is the $\varepsilon$-sphere around $p_1$, $K_\varepsilon + x(s)$ never touches the boundary of $R^n - \phi(\partial D)$. Then Lemma 3.4 may be applied and yields the conclusion that
$$f_\varepsilon(x) - f_\varepsilon(x + p_2 - p_1)$$
is a divergence. Therefore by Lemma 3.3,

$$f_\varepsilon(\phi(x))\, J_\phi(x) - f_\varepsilon(\phi(x) + p_2 - p_1)\, J_\phi(x)$$

is also the divergence of a mapping with support in $D$. In such a case, Lemma 3.2 implies that

$$(a) \qquad \int_D f_\varepsilon(\phi(x))\, J_\phi(x)\,dx = \int_D f_\varepsilon(\phi(x) + p_2 - p_1)\, J_\phi(x)\,dx$$

or, according to formula (II.4):

$$(b) \qquad \deg(p_1, \phi, D) = \int_D f_\varepsilon(\phi(x) + p_2 - p_1)\, J_\phi(x)\,dx.$$

But now the functions $g_\varepsilon(x) = f_\varepsilon(x + p_2 - p_1)$, $\varepsilon > 0$, have the properties (II.3) around $p_2$, and consequently

$$\deg(p_2, \phi, D) = \int_D g_\varepsilon(\phi(x))\, J_\phi(x)\,dx = \int_D f_\varepsilon(\phi(x) + p_2 - p_1)\, J_\phi(x)\,dx.$$

Formulas (a), (b) and the last one together imply

$$\deg(p_1, \phi, D) = \deg(p_2, \phi, D)$$
and we are done.

Consequences: Suppose that $p \in R^n - \phi(\partial D)$ but $\phi^{-1}(p) \cap Z \ne \emptyset$. In that case, for every $q$ sufficiently close to $p$, $q$ belongs to some well-defined component of $R^n - \phi(\partial D)$, namely that containing $p$. But then by Lemma 3.6, $\deg(q, \phi, D)$ is constant when $q$ is near $p$ and $\phi^{-1}(q) \cap Z = \emptyset$. This justifies the definition (II.2) when $\phi$ is $C^2$. Taken together Lemma 3.6 and the definition (II.2) imply:

3.7. Corollary: Let $\phi \in C$ and be $C^2$ on $\bar D$. Then

$$\deg(p, \phi, D)$$
is constant on every component of $R^n - \phi(\partial D)$.
This corollary will also be true for continuous mappings after we define the degree for such mappings. The unnatural hypothesis that $\phi$ is $C^2$ needs to be eliminated first; we will do so in the first corollary of the next lemma.

3.8. Lemma: Let $\phi \in C$ and be $C^1$ on $\bar D$. Then for each $p \notin \phi(\partial D) \cup \phi(Z)$, there exists a $C^1$ neighborhood $U$ of $\phi$ such that for every $\psi \in U$ we have $p \notin \psi(\partial D)$ and
$$\deg(p, \phi, D) = \deg(p, \psi, D).$$

Proof: Let $y_j$, $j = 1, \ldots, k$ be the elements of the finite set $\phi^{-1}(p)$. Let $B_j$ be an open ball around $y_j$, whose radius $r_j$ will be determined below.

First of all, if each $r_j$ is small enough, the family $\{B_j\}$ is disjoint. Our aim now is to prove that there exists a $C^1$ neighborhood $U$ such that for every $\psi$ in $U$, the equation $\psi(x) = p$ has one and only one solution in each $B_j$ and no others.
The hypothesis $p \notin \phi(Z)$ implies that the derivative $\phi'$ of $\phi$ (the Jacobian matrix) has an inverse at each $y_j$. By decreasing the radii $r_j$ it is possible to guarantee that $\phi'$ has an inverse at every point of each $B_j$ and moreover that:

$$(1) \qquad |(\phi'(y_j))^{-1} (\phi'(y) - \phi'(y_j))| < \tfrac{1}{2}, \qquad y \in B_j,$$

where the norm $|\ |$ stands for the norm for operators on $R^n$ into itself.
Let us suppose that this is the case for radii $r_1, \ldots, r_k$ and let $r$ be the smallest of these. If we let $F$ be the set $\bar D - \bigcup_j B_j$, then $F$ is compact and $p \notin \phi(F)$. Finally, let $a$ be a positive number such that $|(\phi'(y_j))^{-1}|^{-1} > a$, $j = 1, \ldots, k$.
Now we are able to define the $C^1$ neighborhood of $\phi$ that will give the desired solution.
Choose $U$ to be a ball (in the $C^1$ sense) such that for every $\psi \in U$ the following holds:

(a) $p \notin \psi(F)$.
(b) $\psi'(y)$ is invertible when $y \in \bigcup_j B_j$, and $|(\psi'(y_j))^{-1}|^{-1} > a$.
(c) $|(\psi'(y_j))^{-1} (\psi'(y) - \psi'(y_j))| < \tfrac{1}{2}$, when $y \in B_j$ and for $j = 1, \ldots, k$.
(d) $|\phi(x) - \psi(x)| < \tfrac{1}{2} r a$ for all $x \in \bar D$.

We remark that each one of the properties (a), (b), (c), (d) defines an open set in the $C^1$ sense (call these sets $\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D}$), that the sets $\mathcal{A}, \mathcal{B}, \mathcal{D}$ obviously contain $\phi$ and that $\phi \in \mathcal{C}$ by (1).
Therefore $U$ may be chosen as any $C^1$ ball around $\phi$ contained in $\mathcal{A} \cap \mathcal{B} \cap \mathcal{C} \cap \mathcal{D}$.
Observe that since $\phi'$ is invertible over each $B_j$, the sign of the Jacobian determinant is constant there. $U$ being a ball, it is connected. Then the same result follows for every $\psi$ in $U$ and clearly

$$(2) \qquad \operatorname{sign} J_\psi(y) = \operatorname{sign} J_\phi(y_j)$$

when $y \in B_j$ and $\psi \in U$.
We now show that properties (a), (b), (c) and (d) imply that the function $\psi$ has one and only one root of the equation $\psi(x) = p$ in each $B_j$. Of course property (a) tells us that there are no roots outside $\bigcup_j B_j$ (remember that $F = \bar D - \bigcup_j B_j$). The only problem is to see what happens in one $B_j$, say, in $B_1$.
Define
$$\eta(y) = (\psi'(y_1))^{-1} (\psi(y)).$$
From (c) it follows that
$$|\eta'(y) - I| < \tfrac{1}{2}, \qquad y \in B_1.$$

Then by Corollary 1.19 it follows that $\eta$ is one-to-one on $B_1$. This implies that $\psi$ is one-to-one on $B_1$ and therefore it has at most one root there.
But from the same corollary we also conclude that $\eta(B_1)$ covers a ball $B$ of radius $(1 - \tfrac{1}{2})\, r_1 = \tfrac{1}{2} r_1 \ge \tfrac{1}{2} r$ around $\eta(y_1)$.
As $|(\psi'(y_1))^{-1}|^{-1} > a$, it follows that $\psi'(y_1)(B)$ covers a ball $V$ of radius $\tfrac{1}{2} a r$ around $\psi'(y_1)(\eta(y_1))$. But $\psi(y) = \psi'(y_1)(\eta(y))$ and then $\psi(B_1)$ covers $V$. This means that every point $x \in R^n$ for which $|x - \psi(y_1)| < \tfrac{1}{2} r a$ is of the form $x = \psi(b)$, $b \in B_1$. But by (d), $|p - \psi(y_1)| = |\phi(y_1) - \psi(y_1)| < \tfrac{1}{2} r a$, and then the equation $\psi(x) = p$ has at least one solution in $B_1$.
The same holds for every $B_j$ and we have $\psi^{-1}(p) \cap B_j = \{\bar y_j\}$, $j = 1, \ldots, k$.
But then, by (2)
$$\deg(p, \psi, D) = \sum_j \operatorname{sign} J_\psi(\bar y_j) = \sum_j \operatorname{sign} J_\phi(y_j) = \deg(p, \phi, D).$$
Q.E.D.

3.9. Corollary: Let $\phi \in C(\bar D)$ and be $C^1$ on $\bar D$. If $p$, $q$ do not belong to $\phi(\partial D) \cup \phi(Z)$ and belong to the same component of $R^n - \phi(\partial D)$, then
$$\deg(p, \phi, D) = \deg(q, \phi, D).$$

Proof: Choose a continuous curve $x(s)$ joining $p = x(0)$ and $q = x(1)$.

Since the curve $x(s)$ is compact and disjoint from $\phi(\partial D)$, when $\psi$ is $C^0$ near $\phi$, $x(s)$ is also disjoint from $\psi(\partial D)$. This means that $p$ and $q$ belong to the same component of $R^n - \psi(\partial D)$. Therefore if we choose $\psi$ to be $C^2$,
$$\deg(p, \psi, D) = \deg(q, \psi, D).$$
This holds for every $\psi$ of class $C^2$ and $C^0$ near $\phi$. But since $\deg(p, \psi, D) \to \deg(p, \phi, D)$ and $\deg(q, \psi, D) \to \deg(q, \phi, D)$ as $\psi \to \phi$ in the $C^1$ sense, we conclude that
$$\deg(p, \phi, D) = \deg(q, \phi, D).$$
Q.E.D.

3.10. Corollary: Let $p$ not belong to $\phi(\partial D)$. The expression

$$\deg(q_j, \phi, D)$$
has a limit when $q_j \to p$ and the $q_j$'s belong to $R^n - (\phi(\partial D) \cup \phi(Z))$.
Proof: Obvious by the previous lemma.
The conclusion is that the degree of $\phi$ has a meaning for every point $p \notin \phi(\partial D)$ in the sense of the definition (II.1, II.2), which may therefore be adopted for every $C^1$ mapping.
The next step is to remove the condition $\phi \in C^2$ in Corollary 3.7.
3.11. Proposition: Let $\phi \in C$ and be $C^1$ on $\bar D$. Then the degree
$$\deg(p, \phi, D)$$
as defined in (II.1), (II.2) is constant on every component of $R^n - \phi(\partial D)$.
Proof: This follows from 3.9 and the fact that $\phi(Z)$ has empty interior.
Next we remove the condition $p \notin \phi(Z)$ in 3.8.

3.12. Proposition: Let $\phi \in C$ and be $C^1$, and $p$ be a point not in $\phi(\partial D)$. In that case there exists a $C^1$ neighborhood $U$ of $\phi$ such that for every $\psi$ in $U$, $p \notin \psi(\partial D)$ and:
$$\deg(p, \psi, D) = \deg(p, \phi, D).$$
Proof: This follows from 3.9 and the fact that $\phi(Z)$ has empty interior.
3.13. Corollary: Let $\{\phi_t\}$ be a family of $C^1$ mappings in $C$ depending continuously in the $C^1$ sense on the real parameter $t$, $0 \le t \le 1$. If $p \notin \phi_t(\partial D)$ for every $t$, then
$$\deg(p, \phi_0, D) = \deg(p, \phi_1, D).$$
Proof: Essentially the same argument as in the proof of Corollary 3.5: the equality $\deg(p, \phi_t, D) = \deg(p, \phi_u, D)$ defines an equivalence relation $t \sim u$. Proposition 3.12 implies that each class is open and the connectedness of $[0, 1]$ allows us to conclude that $0 \sim 1$. This is the claim.

E. The Continuous Case

We are approaching the most important point in this chapter: the definition
of the degree for every continuous mapping. This will demonstrate the topo-
logical character of this concept of degree. The definition is as follows:

3.14. Definition: Let $\phi \in C$ and let $\{\phi_n\}$ be a sequence of functions of class $C^1$ converging uniformly to $\phi$. Then for every $p \notin \phi(\partial D)$ a sequence $\deg(p, \phi_n, D)$, $n > N$, is defined, the limit
$$\lim_{n \to \infty} \deg(p, \phi_n, D)$$

exists and does not depend on the choice of the sequence $\{\phi_n\}$, and we then define

$$\deg(p, \phi, D) = \lim_{n \to \infty} \deg(p, \phi_n, D).$$

Justification of the definition. Let $d$ equal the distance between $p$ and the compact set $\phi(\partial D)$.
Choose $N$ so large that if $n \ge N$, then $|\phi_n - \phi| < \tfrac{1}{2} d$ ($|\ |$ stands here for the uniform norm = convergence in the $C^0$ sense).
Since $p$ does not belong to any ball of radius $\tfrac{1}{2} d$ and center at $\phi(x)$, $x \in \partial D$, it follows that $p$ is not a convex combination of the form $t\phi_n(x) + (1 - t)\phi_m(x)$, $0 \le t \le 1$, $x \in \partial D$, $n, m > N$, because $\phi_n(x)$ and $\phi_m(x)$ belong to one such ball.
But then we may fix $n, m > N$ and apply Corollary 3.13 to the family:
$$t\phi_n + (1 - t)\phi_m, \qquad 0 \le t \le 1.$$
Hence $\deg(p, \phi_n, D) = \deg(p, \phi_m, D)$.
In other words, the limit $\lim \deg(p, \phi_n, D)$ does exist and 3.14 is legitimate.
We may reformulate the definition as follows.
3.15. Given $p \notin \phi(\partial D)$ there exists a $C^0$ neighborhood $U$ of $\phi$ such that for every $\psi$ of class $C^1$ belonging to $U$, the degree $\deg(p, \psi, D)$ is the same.
Henceforth we define
$$\deg(p, \phi, D) = \deg(p, \psi, D), \qquad \psi \in U.$$

Remark: The correct way to think about the degree is that it is defined for every continuous function $\phi$ and that Definitions II.1 and II.4 in B are only methods of computing it in the special case $\phi \in C^1$.
Our next aim is to extend to the continuous case the statements that we have obtained for the $C^1$ or $C^2$ cases. This is done in the following summary theorem.

3.16. Theorem: To every continuous map $\phi : \bar D \to R^n$ and every $p \notin \phi(\partial D)$ there is associated an integer $\deg(p, \phi, D)$ with the properties:

1. Invariance under homotopy. The integer $\deg(p, \phi, D)$ depends only on the homotopy class of $\phi$ in the following sense: if $\phi_t$ is a family of mappings $\phi_t \in C$ depending continuously in the uniform topology on the parameter $t$, $0 \le t \le 1$, and such that $p \notin \phi_t(\partial D)$ for every $t$, then
$$\deg(p, \phi_0, D) = \deg(p, \phi_1, D).$$
2. Dependence only on the boundary values. As a consequence of (1), we have: if $\phi|_{\partial D} = \psi|_{\partial D}$ and $p \notin \phi(\partial D) = \psi(\partial D)$, then
$$\deg(p, \phi, D) = \deg(p, \psi, D).$$
3. Continuity. The function $\deg(p, \phi, D)$ is continuous in $\phi$ in the uniform topology in the following sense: given $\phi$ and $p \notin \phi(\partial D)$, there exists a uniform neighborhood $U$ of $\phi$ such that if $\psi \in U$, then $p \notin \psi(\partial D)$ and
$$\deg(p, \phi, D) = \deg(p, \psi, D).$$
4. If $p \notin \phi(\bar D)$ then $\deg(p, \phi, D) = 0$. If $p$ and $q$ belong to the same component of $R^n - \phi(\partial D)$, then $\deg(p, \phi, D) = \deg(q, \phi, D)$.
5. Decomposition of the domain. If $D = \bigcup_i D_i$ where each $D_i$ is open, the family $\{D_i\}$ is disjoint, and $\partial D_i \subset \partial D$, then for every $p \notin \phi(\partial D)$:
$$\deg(p, \phi, D) = \sum_i \deg(p, \phi, D_i).$$

6. The excision property. If $p \notin \phi(\partial D)$, $K \subset \bar D$, $K$ is closed and $p \notin \phi(K)$, then
$$\deg(p, \phi, D) = \deg(p, \phi, D - K).$$
7. Cartesian products. If $D \subset R^n$, $D' \subset R^m$ and $\phi : \bar D \to R^n$, $\psi : \bar D' \to R^m$, then
$$\deg((p, q), (\phi, \psi), D \times D') = \deg(p, \phi, D)\, \deg(q, \psi, D')$$
whenever each term makes sense.
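Property 2 says that the degree can be read off from the boundary values alone. In the plane this can be made concrete with a different, standard technique that is not developed in these notes: for a continuous map of the closed unit disc, the degree with respect to $p$ agrees with the winding number about $p$ of the image of the boundary circle. The sketch below assumes that fact; the function name `winding_number`, the sampling density and the sample maps are choices made here for illustration.

```python
import math

def winding_number(phi, p, samples=4000):
    """Number of turns of phi(z) - p as z runs once around the unit circle."""
    total = 0.0
    prev = None
    for k in range(samples + 1):
        t = 2.0 * math.pi * k / samples
        u, v = phi(math.cos(t), math.sin(t))
        angle = math.atan2(v - p[1], u - p[0])
        if prev is not None:
            step = angle - prev
            # bring the increment into (-pi, pi]
            while step > math.pi:
                step -= 2.0 * math.pi
            while step <= -math.pi:
                step += 2.0 * math.pi
            total += step
        prev = angle
    return round(total / (2.0 * math.pi))

square = lambda x, y: (x * x - y * y, 2.0 * x * y)           # z -> z^2
reflect = lambda x, y: (x, -y)                               # z -> conjugate of z
shifted = lambda x, y: (x * x - y * y + 3.0, 2.0 * x * y)    # z -> z^2 + 3

print(winding_number(square, (0.0, 0.0)),
      winding_number(square, (0.5, 0.0)),
      winding_number(reflect, (0.0, 0.0)),
      winding_number(shifted, (0.0, 0.0)))
```

The first two values coincide because $(0, 0)$ and $(0.5, 0)$ lie in the same component of the complement of the image of the circle (property 4), and the last value is zero because $z^2 + 3$ omits the origin on the whole disc.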

Proof of 3. The proof follows obviously from Definition 3.15.

1 follows from 3. In fact, the function $\deg(p, \phi_t, D)$ is continuous in $t$.
Since the range (the integers) is discrete and $[0, 1]$ is connected, it must be
constant.
2 follows from 1. Consider the family $t\phi + (1 - t)\psi$.
Proof of 4. We remarked after Definition II.2 that 4 holds for $C^1$ mappings.
The general case follows immediately by 3 and approximation.
Proof of 5. We assumed that $\bar D$ is compact. This can happen only if $\{D_i\}$
is a finite family. But then we can select a single $C^1$ mapping $\psi$ such that $\psi$
and each restriction $\psi_i = \psi|_{\bar D_i}$ belong to the $C^0$ neighborhood corresponding
to $\phi$ and each $\phi|_{\bar D_i}$, respectively, according to Definition 3.15. The statement
is thus reduced to the $C^1$ case, and now it is enough to consider the case
$p \notin \psi(Z)$. But then our result follows trivially from the associative law of
addition:
$$\deg(p, \psi, D) = \sum_{\psi(x) = p} \operatorname{Sign} J_\psi(x)
= \sum_i \Big[ \sum_{\substack{\psi(x) = p \\ x \in D_i}} \operatorname{Sign} J_\psi(x) \Big]
= \sum_i \deg(p, \psi_i, D_i).$$
Proof of 6. Here again it is easy to see how to reduce the continuous case
to the $C^1$ case. If $\phi$ is supposed to be $C^1$, Definition II.1 gives
$$\deg(p, \phi, D) = \sum_{\phi(x) = p} \operatorname{Sign} J_\phi(x)$$
and under the assumption $p \notin \phi(K)$, it follows that
$$\deg(p, \phi, D) = \sum_{\substack{\phi(x) = p \\ x \in D - K}} \operatorname{Sign} J_\phi(x) = \deg(p, \phi, D - K).$$
Proof of 7. Once again the reduction to the $C^1$ case is immediate and the
result follows from the remark that $J_{(\phi, \psi)} = J_\phi J_\psi$.
Q.E.D.
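The counting formula and property 2 can be compared numerically in the plane. The sketch below is an illustration only: it assumes $D$ is the open unit disk in $R^2$ (identified with the complex numbers) and uses the standard fact, not stated in the text, that for such a $D$ the degree equals the winding number of the boundary curve $\phi(\partial D)$ about $p$. The helper names `winding_degree` and `signed_preimage_count` are hypothetical.

```python
import cmath
import math


def winding_degree(phi, p, samples=2000):
    """Winding number about p of the closed curve t -> phi(exp(i t)), 0 <= t <= 2*pi."""
    total = 0.0
    prev = phi(complex(1.0, 0.0)) - p
    for k in range(1, samples + 1):
        cur = phi(cmath.exp(2j * math.pi * k / samples)) - p
        total += cmath.phase(cur / prev)  # signed angle swept in this step
        prev = cur
    return round(total / (2.0 * math.pi))


def signed_preimage_count(preimages, jacobian):
    """Sum of Sign J over a given finite list of preimages (the C^1 formula)."""
    return sum(1 if jacobian(z) > 0 else -1 for z in preimages)


def square(z):
    return z * z


def reflect(z):
    return z.conjugate()


p = 0.25
# z -> z^2, viewed as a map of R^2, has Jacobian determinant 4|z|^2;
# its preimages of 0.25 inside the disk are +0.5 and -0.5.
by_counting = signed_preimage_count([0.5, -0.5], lambda z: 4.0 * abs(z) ** 2)
by_boundary = winding_degree(square, p)
print(by_counting, by_boundary)      # the two computations agree
print(winding_degree(reflect, 0.0))  # an orientation-reversing map
print(winding_degree(square, 3.0))   # p outside the image of the closed disk
```

The boundary computation uses only the values of the map on $\partial D$, which is the content of property 2; the last call illustrates property 4.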

We now generalize 1 and 2.


3.17. Corollary: If $\phi$ and $\psi$ have homotopic restrictions to $\partial D$, i.e., if there
exists a family $\phi_t$, $0 \le t \le 1$, of mappings $\phi_t : \partial D \to R^n$ such that $p \notin \phi_t(\partial D)$
for every $t$ and $\phi|_{\partial D} = \phi_0$, $\psi|_{\partial D} = \phi_1$, then

$\deg(p, \phi, D) = \deg(p, \psi, D)$.

Proof: Consider the cylinder $L = \bar D \times [0, 1]$. The homotopy $\phi_t$ may be
considered as a continuous function $\Phi$ from $\partial D \times [0, 1]$ into $R^n$ by defining
$\Phi(t, x) = \phi_t(x)$. Now extend $\Phi$ to a mapping $\Psi$ defined on all of $L$ and call $\tilde\phi$,
$\tilde\psi$ the mappings $\tilde\phi(x) = \Psi(0, x)$, $\tilde\psi(x) = \Psi(1, x)$, $x \in \bar D$. Clearly by (3.16; 1)
we have
$\deg(p, \tilde\phi, D) = \deg(p, \tilde\psi, D)$.
But since $\tilde\phi|_{\partial D} = \phi|_{\partial D}$ and $\tilde\psi|_{\partial D} = \psi|_{\partial D}$, by (3.16; 2) it also follows that
$\deg(p, \phi, D) = \deg(p, \psi, D)$ as desired.

An easy consequence of 3.16 (and one which could have been obtained
earlier, if desired) is the Brouwer fixed point theorem and its equivalent
"no-retraction" theorem.
3.18. Corollary: ("No-retraction" Theorem.) Let $B \subset R^n$ be the open unit
ball. There is no continuous mapping $\phi : \bar B \to \partial B$ such that the restriction
$\phi|_{\partial B}$ is the identity.
Proof: Under the conditions stated, $\phi$ should satisfy
$\deg(p, \phi, B) = \deg(p, \mathrm{id}, B)$
for every properly situated $p$ (by 3.16; 2). The second member is 1 at $p = 0$
(actually at any point in $B$). This implies, according to (3.16; 4), that
$0 \in \phi(\bar B)$. But this contradicts $\phi(\bar B) \subset \partial B$, and we are done.
Q.E.D.
3.19. Theorem: (Brouwer fixed point theorem.) Let $B$ be any open ball
in $R^n$. If $\phi : \bar B \to \bar B$ is continuous, there exists $x \in \bar B$ such that $\phi(x) = x$.
Proof: We may suppose that $B$ is the unit ball with center at the origin.
If $\phi$ has no fixed points, then for each $x \in \bar B$, $x$ and $\phi(x)$ define a straight line.
Define $\psi(x)$ as the only point of the form $\lambda x + (1 - \lambda)\phi(x)$, $\lambda \ge 1$, having
norm 1. $\psi$ is continuous, $\psi : \bar B \to \partial B$ and $\psi|_{\partial B} = \mathrm{id}$. But we have just seen
that such a mapping cannot exist.
Q.E.D.
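The point $\psi(x) = \lambda x + (1 - \lambda)\phi(x)$ with $\lambda \ge 1$ and norm 1 can be written down explicitly by solving a quadratic in $\lambda$. The following is a minimal sketch under stated assumptions: the Euclidean norm is used, the function name `retract_to_sphere` and the sample map `phi` are hypothetical, and since no map of the ball is actually fixed-point free, the formula is only evaluated at points where $\phi(x) \ne x$.

```python
import math


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def retract_to_sphere(x, y):
    """Return (lam, q) with q = lam*x + (1-lam)*y, lam >= 1, |q| = 1.

    Assumes x and y lie in the closed unit ball and x != y.
    """
    d = [xi - yi for xi, yi in zip(x, y)]   # q = y + lam * d
    dd = sum(c * c for c in d)
    yd = sum(yi * di for yi, di in zip(y, d))
    yy = sum(c * c for c in y)
    # |y + lam*d|^2 = 1  <=>  dd*lam^2 + 2*yd*lam + (yy - 1) = 0.
    # The quadratic is <= 0 at lam = 1 (where q = x), so its larger root is >= 1.
    disc = max(yd * yd - dd * (yy - 1.0), 0.0)
    lam = (-yd + math.sqrt(disc)) / dd
    return lam, [yi + lam * di for yi, di in zip(y, d)]


def phi(x):
    """Sample map of the disk into itself (rotation by 90 degrees, scaled by 1/2)."""
    return [-0.5 * x[1], 0.5 * x[0]]


interior = [0.3, 0.4]
boundary = [0.6, 0.8]
lam_in, q_in = retract_to_sphere(interior, phi(interior))
lam_bd, q_bd = retract_to_sphere(boundary, phi(boundary))
print(lam_in, norm(q_in))   # lam >= 1 and the image has norm 1
print(lam_bd, q_bd)         # on the boundary lam = 1, so the point is left fixed
```

The second call reflects the property $\psi|_{\partial B} = \mathrm{id}$ used in the proof.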

F. The Multiplicative Property and Consequences

As we have seen, if $\phi$ is a continuous mapping $\phi : \bar D \to R^n$, then $\deg(p, \phi, D)$
is constant on every component of $R^n - \phi(\partial D)$ (see (3.16; 4)). This allows
us to introduce the notation
$\deg(\Delta, \phi, D)$
where $\Delta$ is any non-void set contained in a single component of $R^n - \phi(\partial D)$.
The definition is:
$\deg(\Delta, \phi, D) = \deg(p, \phi, D), \quad p \in \Delta.$
However, we can point out one distinguished component, namely the un-
bounded one. Certainly, $\phi(\partial D)$ being compact, there is at least one un-
bounded component of $R^n - \phi(\partial D)$. But this component contains the ex-
terior of any ball containing $\phi(\partial D)$. Therefore it is unique. Call it $\Delta_\infty$. Since
$\phi(\bar D)$ is compact, there exist points in $\Delta_\infty$ not in $\phi(\bar D)$. For any such point $p$,
$\deg(p, \phi, D) = 0$, by (3.16; 4). Thus $\deg(\Delta_\infty, \phi, D) = 0$.

3.20. Theorem: (Multiplication Property.) Let $\phi : \bar D \to R^n$, $\psi : R^n \to R^n$ be
two continuous mappings and $\Delta_i$ the bounded components of $R^n - \phi(\partial D)$.
Suppose that $p \notin \psi \circ \phi(\partial D)$. Then
$$\deg(p, \psi \circ \phi, D) = \sum_{\Delta_i} \deg(p, \psi, \Delta_i)\,\deg(\Delta_i, \phi, D).$$
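Proof: We place ourselves in the pleasant situation of "counting zeros"
by assuming that $\phi$ and $\psi$ are $C^1$ mappings and that $p$ is not the image of a
critical point. In that case
$$\begin{aligned}
\deg(p, \psi \circ \phi, D)
&= \sum_{y :\, \psi(\phi(y)) = p} \operatorname{Sign} J_{\psi \circ \phi}(y) \\
&= \sum_{y :\, \psi(\phi(y)) = p} \operatorname{Sign} J_\psi(\phi(y))\, \operatorname{Sign} J_\phi(y) \\
&= \sum_{z :\, \psi(z) = p} \operatorname{Sign} J_\psi(z) \Big[ \sum_{y :\, \phi(y) = z} \operatorname{Sign} J_\phi(y) \Big] \\
&= \sum_{\substack{\psi(z) = p \\ z \in R^n - \phi(\partial D)}} \deg(z, \phi, D)\, \operatorname{Sign} J_\psi(z).
\end{aligned}$$
But $R^n - \phi(\partial D)$ is the union of the disjoint sets $\Delta_i$, so that
$$\deg(p, \psi \circ \phi, D) = \sum_{\Delta_i} \Big[ \deg(\Delta_i, \phi, D) \sum_{\substack{z \in \Delta_i \\ \psi(z) = p}} \operatorname{Sign} J_\psi(z) \Big]
= \sum_{\Delta_i} \deg(\Delta_i, \phi, D)\, \deg(p, \psi, \Delta_i)$$
and the proof is done.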
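A small planar check of the multiplication property may help fix the bookkeeping. This is a sketch under stated assumptions: $D$ is the open unit disk, degrees over a disk are computed as boundary winding numbers (a standard fact not proved in the text), and the maps $\phi(z) = z^2$, $\psi(w) = w^3$ and the helper `winding_degree` are chosen for illustration only. Here $\phi(\partial D)$ is the unit circle, so there is a single bounded component $\Delta_1$, the open unit disk.

```python
import cmath
import math


def winding_degree(f, p, samples=2000):
    """Winding number about p of the closed curve t -> f(exp(i t))."""
    total = 0.0
    prev = f(complex(1.0, 0.0)) - p
    for k in range(1, samples + 1):
        cur = f(cmath.exp(2j * math.pi * k / samples)) - p
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2.0 * math.pi))


def phi(z):
    return z * z


def psi(w):
    return w ** 3


def composite(z):
    return psi(phi(z))


p = 0.1   # a point not on psi(phi(boundary)), which is the unit circle
q = 0.3   # any point of the bounded component Delta_1 of R^2 - phi(boundary)

lhs = winding_degree(composite, p)          # deg(p, psi o phi, D)
deg_psi_on_delta = winding_degree(psi, p)   # deg(p, psi, Delta_1)
deg_delta_phi = winding_degree(phi, q)      # deg(Delta_1, phi, D)
print(lhs, deg_psi_on_delta * deg_delta_phi)
```

The unbounded component contributes nothing to the sum, in agreement with $\deg(\Delta_\infty, \phi, D) = 0$.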
As an illustration of the power of this theorem we give, following Leray,
an immediate proof of the Jordan separation theorem
(cf. Jean Leray, Proc. Int. Congress, 1950, vol. II, pp. 202-208).
3.21. Theorem: (Jordan) Let $K$ and $L$ be homeomorphic compact sets
in $R^n$. Then $R^n - K$ and $R^n - L$ have the same number of components.
Proof: (Leray) Suppose that $h : K \to L$ is a homeomorphism and that $\phi$
and $\psi$ are extensions respectively of $h$ and $h^{-1}$ to all of $R^n$. Let $\Delta_i$ be the com-
ponents of $R^n - K$ and $D_i$ those of $R^n - L$. Now the mapping $\psi \circ \phi : R^n \to R^n$
is the identity on $K$. Consider $\Delta_j$. The restriction $\psi \circ \phi|_{\partial \Delta_j}$ is also the identity
since $\partial \Delta_j \subset K$. Therefore if $p$ is properly located
$\deg(p, \psi \circ \phi, \Delta_j) = \deg(p, \mathrm{id}, \Delta_j)$
by (3.16; 2).

But for every $i$, the degree $\deg(\Delta_i, \mathrm{id}, \Delta_j)$ is equal to $\delta_{ij}$ (Kronecker's $\delta$),
and then:
$\deg(\Delta_i, \psi \circ \phi, \Delta_j) = \delta_{ij}.$

Using the multiplicative property we shall obtain
$$\delta_{ij} = \sum_h \deg(\Delta_i, \psi, D_h)\, \deg(D_h, \phi, \Delta_j).$$

This is not immediate, because the sets $D_h$ are not the components of
$R^n - \phi(\partial \Delta_j)$, but subsets of them.
Fix $j$ and compare the sets:
$R^n - L = \bigcup_h D_h$,
$R^n - \phi(\partial \Delta_j) = \bigcup_i G_i$,
where $G_i$ are the components of $R^n - \phi(\partial \Delta_j)$. The sets of the family $\{G_i\}$
are disjoint.
Since $L$ and $\phi(\partial \Delta_j)$ are compact there is only one unbounded component
in each case. Call these $D_\infty$ and $G_\infty$. Since $\phi(\partial \Delta_j) \subset L$, we have
$R^n - L \subset R^n - \phi(\partial \Delta_j)$.
It follows by connectedness arguments that the family $\{D_h\}$ splits into several
subfamilies $\{D^i_h\}$ such that
$U_1 = D^1_1 \cup \dots \cup D^1_h \cup \dots \subset G_1$,
$U_2 = D^2_1 \cup \dots \cup D^2_h \cup \dots \subset G_2$,
and so on. $D_\infty$ is necessarily contained in $G_\infty$. Consequently
(1) $\deg(D^i_h, \phi, \Delta_j) = \deg(G_i, \phi, \Delta_j)$.
Let $M_i = G_i - U_i$. It is easy to see that $M_i \subset L$ for every $i$. But now if
$p \in \Delta_i$ and $\psi(z) = p$, $z$ cannot belong to $L$, because $\psi(L) = h^{-1}(L) = K$, and,
a fortiori, $z$ cannot belong to any $M_i$ either. This implies that
$\deg(p, \psi, G_i) = \deg(p, \psi, G_i - M_i)$
as was seen in (3.16; 6).
But then, since $G_i - M_i = \bigcup_h D^i_h$, we conclude (3.16; 5) that
(2) $\deg(p, \psi, G_i) = \deg(p, \psi, \bigcup_h D^i_h) = \sum_h \deg(p, \psi, D^i_h)$.

Now we recall that the multiplicative property means exactly
$$\deg(p, \psi \circ \phi, \Delta_j) = \sum_{i \ne \infty} \deg(p, \psi, G_i)\, \deg(G_i, \phi, \Delta_j)$$
and, by (1) and (2), it follows that
$$\deg(p, \psi \circ \phi, \Delta_j) = \sum_{i \ne \infty} \sum_h \deg(p, \psi, D^i_h)\, \deg(D^i_h, \phi, \Delta_j)
= \sum_{D_h \ne D_\infty} \deg(p, \psi, D_h)\, \deg(D_h, \phi, \Delta_j).$$
Since $p$ was any point in $\Delta_i$, we conclude
(3) $\delta_{ij} = \deg(\Delta_i, \psi \circ \phi, \Delta_j) = \sum_{D_h \ne D_\infty} \deg(\Delta_i, \psi, D_h)\, \deg(D_h, \phi, \Delta_j)$.
By the symmetry of the argument it is also true that
(4) $\delta_{ij} = \sum_{\Delta_h \ne \Delta_\infty} \deg(D_i, \phi, \Delta_h)\, \deg(\Delta_h, \psi, D_j)$.
Suppose that both families $\{\Delta_i\}$ and $\{D_i\}$ are finite. Then they define two
matrices
$A = (\deg(D_i, \phi, \Delta_j))$,
$B = (\deg(\Delta_i, \psi, D_j))$,
and (3) and (4) imply
(5) $AB = I_n$, $BA = I_m$,
if we assume $A$ to be $n \times m$ and $B$ to be $m \times n$, where $n$ = number of $D_i$'s,
$m$ = number of $\Delta_i$'s. But now it is an easy exercise to show that the equal-
ities (5) imply $n = m$, and this is what we were looking for.
The case $\{D_i\}$ finite, $\{\Delta_i\}$ infinite is easily seen to be impossible by the
equalities (3) and (4). If both families are infinite, then, being disjoint families
of open sets, they are both denumerably infinite.
Q.E.D.
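One way to do the exercise mentioned in the proof, offered here as a suggestion rather than as the argument the text has in mind, is the trace identity $\operatorname{tr}(AB) = \operatorname{tr}(BA)$, which holds for rectangular matrices: if $AB = I_n$ and $BA = I_m$, then $n = \operatorname{tr}(AB) = \operatorname{tr}(BA) = m$. The sketch below only checks the trace identity on random integer matrices; the helper names are hypothetical.

```python
import random


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


random.seed(0)
n, m = 3, 5
A = [[random.randint(-3, 3) for _ in range(m)] for _ in range(n)]  # n x m
B = [[random.randint(-3, 3) for _ in range(n)] for _ in range(m)]  # m x n

AB = matmul(A, B)   # n x n
BA = matmul(B, A)   # m x m
print(len(AB), len(BA), trace(AB), trace(BA))
```

Since the entries are integers the comparison of the two traces is exact.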
We can now draw some topological consequences.
3.22. Corollary: (Domain Invariance) Let $U$ be an open set in $R^n$,
$\phi : U \to R^n$ a continuous and one-to-one mapping. Then $\phi$ is an open map-
ping.
Proof: Pick a point $p \in U$ and let $\bar D$ be the closure of an open ball $D$ con-
taining $p$ and contained in $U$.
Since $\bar D$ is compact, $\phi : \bar D \to \phi(\bar D)$ is a homeomorphism and we can apply
Jordan's theorem which implies that $R^n - \phi(\bar D)$ has one component and
that $R^n - \phi(\partial D)$ has two components.
Take a point $q$ in the bounded component $\Delta$ of $R^n - \phi(\partial D)$. If
$q \in R^n - \phi(\bar D)$, it follows from the connectedness of $R^n - \phi(\bar D)$ that $q$ can be
joined by a continuous curve with every other point in $R^n - \phi(\bar D)$. But this
set certainly intersects the unbounded component of $R^n - \phi(\partial D)$ which is
a contradiction. We conclude that $\Delta \cap (R^n - \phi(\bar D)) = \emptyset$ or $\Delta \subset \phi(\bar D)$. As
we know that $\phi(p) \notin \phi(\partial D)$, because $\phi$ is 1-1, $\phi(p)$ must belong to $\Delta$. Hence $\phi$
is open.
Q.E.D.

G. Borsuk's Theorem

We shall prove the following


3.23. Theorem: Let $D$ be a symmetric bounded open set in $R^n$ containing
the origin and $\phi : \bar D \to R^n$ an odd mapping ($\phi(x) = -\phi(-x)$, for all $x \in \bar D$)
such that $0 \notin \phi(\partial D)$. Then $\deg(0, \phi, D)$ is an odd number (in particular
different from zero).
The proof follows from a sequence of lemmas.
3.24. Lemma: Let $K \subset R^n$ be a compact set, $\phi : K \to R^{n+1}$ a continuous
mapping such that $0 \notin \phi(K)$. Then $\phi$ can be extended to a continuous never
vanishing mapping defined on a cube $C \supseteq K$.
Proof: For $\epsilon > 0$ choose $\psi$ to be $C^1$ and such that $|\phi(x) - \psi(x)| < \epsilon$ for
$x \in K$. $\psi(D)$ has measure zero for every $D \subset R^n$ by Sard's lemma, and so it is
possible to pick a point $y_0$ such that $x \to \psi(x) + y_0$ never assumes the value 0.
Suppose then that $\psi$ itself is never vanishing.
Let $c = \inf |\phi(x)|$, $x \in K$, and choose a continuous function $\eta$ defined for
$t > 0$ with values in $R$ such that
$$\eta(t) = 1 \ \text{ if } t \ge \frac{c}{2}, \qquad \eta(t) = \frac{2t}{c} \ \text{ if } t \le \frac{c}{2}.$$
If we define $\theta$ as the mapping
$$\theta(x) = \frac{\psi(x)}{\eta(|\psi(x)|)}$$
then $|\theta(x)| \ge c/2$ for all $x$ and $|\theta(x) - \phi(x)| < \epsilon$ on $K$.
Suppose that $\epsilon$ has been chosen so that $\epsilon < c/2$.
By the Tietze extension theorem there exists a function $\delta : C \to R^{n+1}$ such
that $\delta(x) = \theta(x) - \phi(x)$ if $x \in K$, and $|\delta(x)| \le \epsilon$ for $x \in C$. Define $\phi_1(x)
= \theta(x) - \delta(x)$, $x \in C$.
Then $\phi_1(x) = \phi(x)$ if $x \in K$,
$$|\phi_1(x)| = |\theta(x) - \delta(x)| \ge |\theta(x)| - |\delta(x)| \ge \frac{c}{2} - \epsilon > 0$$
and $\phi_1$ is a solution of our problem.
3.25. Lemma: Let $D \subset R^n$ be a symmetric open bounded set such that
$0 \notin \bar D$. Let $\phi$ be a mapping of $\partial D$ in $R^m$, $m > n$, which is odd and non-vanish-
ing. Then $\phi$ can be extended to $\bar D$ to be odd and non-vanishing.
Proof: We shall use induction on the dimension $n$. For $n = 1$,
$D$ looks like:
[Figure: $D$ drawn as a symmetric set of intervals on the line, with $\epsilon$ marked.]
By the previous lemma we can extend the function $\psi = \phi$ restricted to
$[\epsilon, \infty) \cap \partial D$ to a non-vanishing function $\tilde\psi$ defined on some interval $[\epsilon, N]$.
By symmetry, we may define a function extending $\phi$ and never vanishing.
Suppose now that the lemma is established for $n_1 < n$. Let $x \in R^n$, $\tilde x \in R^{n-1}$
(suppose furthermore that $R^{n-1}$ has been identified with the hyperplane
$x_1 = 0$ in $R^n$). Considering $R^{n-1} \cap D$, $\phi$ can be extended to $\partial D \cup (R^{n-1} \cap D)$
to be odd and non-vanishing (this is our inductive step): call the extension
again $\phi$.
Now split $R^n$ into $R^{n-1}$, $R^n_+$, $R^n_-$, where $x_1 = 0$, $x_1 > 0$, $x_1 < 0$ respec-
tively, and let $D_+ = D \cap R^n_+$, $D_- = D \cap R^n_-$. By the previous lemma $\phi$ has
a further extension to $\partial D \cup (R^{n-1} \cap D) \cup D_+$, continuous and non-vanish-
ing. Now, by symmetry, the final extension can be defined.
Q.E.D.
3.26. Lemma: Let $D \subset R^n$ be a bounded open symmetric set such that
$0 \notin \bar D$, $\phi : \partial D \to R^n$ a continuous odd and never vanishing mapping. Then $\phi$
can be extended to $\bar D$ to be continuous and odd, and furthermore, non-
vanishing on $\bar D \cap R^{n-1}$ (again the identification $R^{n-1} \subset R^n$).
It follows from the previous lemma applied to $\phi$ restricted to $\partial(D \cap R^{n-1})
= \partial D \cap R^{n-1}$, that we can obtain a never-vanishing extension to $D \cap R^{n-1}$.
Such an extension can be extended at once to the desired map on $D$ by
symmetry.
Q.E.D.

3.27. Lemma: If $D \subset R^n$ is a bounded open symmetric set and $0 \notin \bar D$, for
every $\phi : \partial D \to R^n$ continuous and odd such that $0 \notin \phi(\partial D)$, $\deg(0, \phi, D)$ is
an even integer.

Proof: Extend $\phi$ to $\bar D$ so as to be a continuous odd mapping never vanish-
ing on $\bar D \cap R^{n-1}$. The lemma above assures the existence of such an exten-
sion. Call the extended mapping also $\phi$. Approximate $\phi$ by a mapping $\psi$ of
class $C^1$ and odd (replace, if necessary, an approximating $\psi$ by its odd part
$\frac{1}{2}[\psi(x) - \psi(-x)]$). If $\psi$ is close enough to $\phi$, it follows that
$0 \notin \psi(\partial D)$,
$0 \notin \psi(\bar D \cap R^{n-1})$,
$\deg(0, \psi, D) = \deg(0, \phi, D)$.
We want to compute $\deg(0, \psi, D)$. Consider the sets $D_+ = R^n_+ \cap D$,
$D_- = R^n_- \cap D$ (where $R^n_+ = \{x \mid x_1 > 0\}$, $R^n_- = \{x \mid x_1 < 0\}$).
By construction $\psi$ never vanishes on $\bar D \cap R^{n-1}$, so we can avoid this set
and obtain:
(1) $\deg(0, \psi, D) = \deg(0, \psi, D_+ \cup D_-) = \deg(0, \psi, D_+) + \deg(0, \psi, D_-)$.
Choose $p$ close to 0 and such that $p$ is not the image under $\psi$ of a critical
point of $\psi$. Observe now that since $\psi$ is odd, each partial derivative $\partial \psi_i / \partial x_j$ is
even. But then $J_\psi$ is also an even mapping. This implies in particular that
$-p$ is not the image of a critical point either. Compute
$$\deg(0, \psi, D_+) = \sum_{\substack{\psi(y) = p \\ y \in D_+}} \operatorname{Sign} J_\psi(y), \qquad
\deg(0, \psi, D_-) = \sum_{\substack{\psi(z) = -p \\ z \in D_-}} \operatorname{Sign} J_\psi(z).$$
Since $\psi$ is odd, the set $\{z \mid \psi(z) = -p,\ z \in D_-\}$ can be obtained by taking the
opposite of the elements in $\{y \mid \psi(y) = p,\ y \in D_+\}$. But $J_\psi(z) = J_\psi(-z)$ and we con-
clude that
$\deg(0, \psi, D_+) = \deg(0, \psi, D_-)$.
Then (1) implies that $\deg(0, \phi, D) = \deg(0, \psi, D)$ is an even number.
Q.E.D.
Now we are ready to prove Borsuk's theorem.

Consider a small open ball $U$ with center at 0, and a mapping $f : \bar D \to R^n$
such that
(a) $f$ is odd,
(b) $f|_{\partial D} = \phi|_{\partial D}$,
(c) $f|_U$ = identity.
The existence of such a function follows from the observation that if $g$ is an
extension satisfying (b) and (c) (such an extension certainly exists) then
$f = \frac{1}{2}[g(x) - g(-x)]$ satisfies (a), (b) and (c).
We know that $f|_{\partial D} = \phi|_{\partial D}$ implies
$\deg(0, f, D) = \deg(0, \phi, D)$
(as follows from 3.16; 2).
But if $f = \mathrm{id}$ on $U$, it is clear that $f \ne 0$ on $\partial U$ and then
$\deg(0, f, D) = \deg(0, f, U \cup (D - \bar U))$
$= \deg(0, f, U) + \deg(0, f, D - \bar U)$
$= 1 + \deg(0, f, D - \bar U)$.
But the second term is known to be even by the last lemma. This proves that
$\deg(0, \phi, D) = \deg(0, f, D)$
is odd.
Q.E.D.
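Borsuk's theorem can be observed on simple odd maps of the plane. The sketch below assumes $D$ is the open unit disk in $R^2$ and, as an assumption external to the text, computes $\deg(0, \phi, D)$ as the winding number of $\phi(\partial D)$ about 0. The maps `f1` and `f2` are illustrative choices that are odd and do not vanish on the unit circle; the helper `winding_degree` is hypothetical.

```python
import cmath
import math


def winding_degree(f, p, samples=2000):
    """Winding number about p of the closed curve t -> f(exp(i t))."""
    total = 0.0
    prev = f(complex(1.0, 0.0)) - p
    for k in range(1, samples + 1):
        cur = f(cmath.exp(2j * math.pi * k / samples)) - p
        total += cmath.phase(cur / prev)
        prev = cur
    return round(total / (2.0 * math.pi))


def f1(z):
    # odd; on |z| = 1 the cubic term dominates the term 0.3 * conj(z)
    return z ** 3 + 0.3 * z.conjugate()


def f2(z):
    # odd; on |z| = 1 the term 2 * conj(z)^3 dominates z
    return z + 2.0 * z.conjugate() ** 3


sample = complex(0.3, -0.7)
oddness_defect = max(abs(f(-sample) + f(sample)) for f in (f1, f2))
d1 = winding_degree(f1, 0.0)
d2 = winding_degree(f2, 0.0)
print(oddness_defect, d1, d2)   # both degrees come out odd
```

The second map reverses orientation, so its degree is negative, but it is still odd, as the theorem requires.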
We now draw some consequences from Borsuk's theorem.
3.28. Corollary: Let $D$ be as in Borsuk's theorem and $\psi : \partial D \to R^n$ a con-
tinuous odd mapping. Then there exists no homotopy $\psi_t$ of $\psi$ into a constant
mapping such that $\psi_t(x) \ne 0$ for all $t$, $x \in \partial D$.
Proof: First extend $\psi$ to some $\phi$ defined on $\bar D$. Replacing $\phi$ by
$\frac{1}{2}(\phi(x) - \phi(-x))$ we may suppose that $\phi$ itself is odd. But then Borsuk's
theorem implies that $\deg(0, \phi, D) \ne 0$ and the impossibility of the existence
of the homotopy described is then apparent.
Q.E.D.
3.29. Corollary: Let $D$ be as above. If $\psi : \partial D \to R^n$ is an odd continuous
mapping whose image is contained in a subspace $E \ne R^n$, then $\psi$ assumes
the value 0 at some point of $\partial D$.
Proof: Extend $\psi$ to a continuous odd mapping $\phi : \bar D \to E$. If $0 \notin \psi(\partial D)$,
then by Borsuk's theorem, $\deg(0, \phi, D)$, being odd, is different from zero.
But this implies that $\deg(p, \phi, D)$ differs from zero on the component $\Delta$
of $R^n - \psi(\partial D)$ containing 0. Now (3.16; 4) implies that $\phi(\bar D)$ contains such
a component, and, a fortiori, $E$ does also. But this is impossible, since $\Delta$
is open, non-void, and $E \ne R^n$.
Q.E.D.

3.30. Corollary: Let $D$ be as above, and let $\psi$ be any continuous mapping
$\psi : \partial D \to R^n$ whose image is contained in a subspace $E \ne R^n$. Then there
exists $p \in \partial D$ such that $\psi(p) = \psi(-p)$.

Proof: Apply the corollary above to the mapping $\frac{1}{2}(\psi(x) - \psi(-x))$.
Q.E.D.
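In the lowest-dimensional case ($n = 2$, $D$ the unit disk, $E$ a line) Corollary 3.30 says that a continuous real function on the circle takes equal values at some pair of antipodal points. The sketch below locates such a pair by bisection, which is a substitute technique based on the intermediate value theorem and not the degree argument of the text. The function `f` (written in the angle $t$) and the name `antipodal_coincidence` are illustrative assumptions.

```python
import math


def antipodal_coincidence(f, tol=1e-12):
    """Find t in [0, pi] with f(t) = f(t + pi), for a continuous 2*pi-periodic f."""
    def g(t):
        return f(t) - f(t + math.pi)   # g(t + pi) = -g(t), so g changes sign on [0, pi]

    lo, hi = 0.0, math.pi
    g_lo = g(lo)
    if g_lo == 0.0:
        return lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_mid == 0.0:
            return mid
        if (g_mid > 0.0) == (g_lo > 0.0):
            lo, g_lo = mid, g_mid      # sign change remains in [mid, hi]
        else:
            hi = mid                   # sign change remains in [lo, mid]
    return 0.5 * (lo + hi)


def f(t):
    """A sample continuous function on the circle, parametrized by the angle t."""
    return math.exp(math.cos(t)) + 0.5 * math.sin(3.0 * t)


t_star = antipodal_coincidence(f)
gap = abs(f(t_star) - f(t_star + math.pi))
print(t_star, gap)
```

The function $g$ here is (twice) the odd part used in the proof, restricted to the circle.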

3.31. Corollary: Let $D$ be as above, and $\phi : \bar D \to R^n$ a continuous mapping
never vanishing on $\partial D$, such that for every $x \in \partial D$,
(1) $\alpha \phi(x) \ne (1 - \alpha)\phi(-x)$
for all $\alpha$, $\frac{1}{2} \le \alpha \le 1$.
Then $\phi(D)$ contains a neighborhood of the origin.

Proof: Observe that the conclusion follows from the statement
$\deg(0, \phi, D) \ne 0$.
This property is an immediate consequence of the fact that if $\psi$ is the mapping
$\psi(x) = \frac{1}{2}(\phi(x) - \phi(-x))$, then by Borsuk's theorem, $\deg(0, \psi, D) \ne 0$.
Under condition (1), the family
$t\phi(x) - (1 - t)\phi(-x)$, $\frac{1}{2} \le t \le 1$,
is a homotopy between $\phi$ and $\psi$, which implies
$\deg(0, \phi, D) = \deg(0, \psi, D)$.
Q.E.D.

Degree Theory-General Case

H. Preliminaries: Degree Theory in an Arbitrary Finite Dimensional Vector Space

Suppose $E$ is a real vector space of dimension $n$. By choosing a basis in $E$
we can identify $E$ with $R^n$. This should allow us to define $\deg(p, \phi, D)$ as it
was done for $R^n$ and of course the only important thing is to see what happens
after a change of basis. The answer is that the degree is basis-independent.
More precisely, given a basis $B = \{b_1, \dots, b_n\}$, we shall for the moment
denote by $\deg_B(p, \phi, D)$ the degree computed with respect to $B$; then we have

3.32. Proposition: For every pair of bases $B$, $B'$
$\deg_B(p, \phi, D) = \deg_{B'}(p, \phi, D)$,
whenever the expressions make sense.
Proof: It suffices to prove this for $C^1$ mappings. But then we only need to
know what happens to the sign of the Jacobian of a mapping when the basis
is changed. This is easily seen to be invariant, whence the result.
Q.E.D.

I. Preliminaries: Restriction to a Subspace

Suppose that $D \subset R^n$ is an open and bounded set, and that $R^m \subset R^n$,
where the inclusion is made by identifying $R^m$ with the subspace of $R^n$ whose
points are the $x$ such that $x_{m+1} = x_{m+2} = \dots = x_n = 0$.
3.33. Proposition: If $\phi : \bar D \to R^m$ is continuous and $\psi : \bar D \to R^n$ is the map-
ping $\psi = \mathrm{id} + \phi$, for every $p \in R^m$ not belonging to $\psi(\partial D)$:

$\deg(p, \psi, D) = \deg(p, \psi|_{R^m \cap \bar D}, D \cap R^m)$.

Proof: Let us begin by noting that $\psi(R^m \cap \bar D) \subset R^m$ as can be verified
easily; thus the expression $\deg(p, \psi|_{R^m \cap \bar D}, R^m \cap D)$ makes sense.
Suppose that $\phi$ is $C^1$. By definition it suffices to prove the statement for
this case, and under the assumption that $p$ is not the image of a critical point
of $\psi$. As the degree is then computed by counting zeros, it is necessary to look
for the points $y$ in $\psi^{-1}(p)$. If $\psi(y) = y + \phi(y) = p$, then $y = p - \phi(y) \in R^m$.
Hence $\psi^{-1}(p) \subset R^m \cap D$.
This implies that the points to be counted for $\psi : \bar D \to R^n$ and for
$F = \psi|_{R^m \cap \bar D}$, $F : R^m \cap \bar D \to R^m$ are the same, and the only possible difference
in degree lies in the signs assigned to them.
Our proposition will follow from the fact that at each such point $y$, we have
(1) $\operatorname{Sign} J_\psi(y) = \operatorname{Sign} J_F(y)$.
To prove (1), first observe that the Jacobian matrix of $\psi$ has the form
$$\begin{pmatrix} I_m + \left( \dfrac{\partial \phi_i}{\partial x_j} \right) & * \\ 0 & I_{n-m} \end{pmatrix} \qquad (i, j \text{ run from } 1 \text{ to } m).$$
This implies immediately that $J_\psi(x) = J_F(x)$ for every $x \in R^m \cap D$, which
clearly implies (1) above. Hence our proposition has been proved.
Q.E.D.

J. Degree of Finite Dimensional Perturbations of the Identity

Let $X$ be a real Hausdorff locally convex T.L.S., and $D \subset X$ an open sub-
set of $X$ such that $E \cap D$ is bounded for every finite dimensional subspace $E$
of $X$ (in that case we say that $D$ is "finitely bounded"). This is the most general
case we shall consider.
We now give some definitions.

3.34. Definition: If $T$ is a topological space and $\phi : T \to X$ is a continuous
mapping, we shall say that $\phi$ is finite dimensional if $\phi(T)$ is contained in
some finite dimensional subspace of $X$. If $T$ is also a subset of $X$, we define
a finite dimensional perturbation of the identity to be a mapping $\psi : T \to X$
of the form $\psi = I + \phi$ where $I$ is the identity $I : T \to X$ and $\phi$ is finite di-
mensional. $\phi = \psi - I$ is called the perturbation of $\psi$.
Our aim is to define the degree $\deg(p, \psi, D)$ for every finite dimensional
perturbation of the identity $\psi = I + \phi$ (defined on $T = \bar D$, $D$ as above). Let
$p \in X$ be a point not in $\psi(\partial D)$ and choose a finite dimensional subspace
$E \subset X$ such that
(a) $p \in E$,
$\phi(\bar D) \subset E$.
Letting $f$ denote the restriction of $\psi$, $f : \bar D \cap E \to E$, we have

3.35. Definition:
$\deg(p, \psi, D) = \deg(p, f, D \cap E)$,
where the second member is computed in $E$ according to the theory for finite
dimensional spaces.
We must justify 3.35. Suppose $F$ is a subspace of $X$ satisfying properties (a)
above and that $F \subset E$. Then Proposition 3.33 applies and we conclude
$\deg(p, f|_{F \cap \bar D}, D \cap F) = \deg(p, f, D \cap E)$.
If $F$ satisfies (a) but $F$ and $E$ are not nested, we reduce to that case by con-
sidering $E \subset E + F$ and $F \subset E + F$ separately. Thus 3.35 is justified.

3.36. Remark: We know that in the finite dimensional case the degree
$\deg(p, \phi, D)$ depends only on the restriction of $\phi$ to $\partial D$ (see 3.16; 2).
Let us suppose that we have a finite dimensional mapping defined only
on $\partial D$, $\tilde\phi : \partial D \to X$. The degree of all finite dimensional perturbations of
the identity $I + \phi$ by means of a finite dimensional extension $\phi$ of $\tilde\phi$ defined
on all of $\bar D$ will be the same, as follows from Definition 3.35 and the finite
dimensional theory. But we cannot assure the existence of such extensions
unless we assume $X$ to have additional properties (for instance, to be nor-
mal). Nevertheless a notion of degree of $I + \tilde\phi$ may be defined. Suppose
$p \notin (I + \tilde\phi)(\partial D)$. Choose a finite dimensional subspace $E$ containing both $p$ and
$\tilde\phi(\partial D)$. $E$ is finite dimensional and so there are extensions $\phi$ of $\tilde\phi|_{\partial D \cap E}$ to all
of $\bar D \cap E$. Thus it is possible to define $\deg(p, \tilde\phi + I, D)$ by
$\deg(p, \tilde\phi + I, D) = \deg(p, \phi + I, D \cap E)$,
where the second member is computed in $E$.
Hence whenever we have a finite dimensional mapping $\tilde\phi : \partial D \to X$ we can
define the degree $\deg(p, \tilde\phi + I, D)$, and this coincides with the degree of every
finite dimensional perturbation of $I$ by means of an extension of $\tilde\phi$ to all of $\bar D$.

K. Properties

We have shown that the definition of degree can be generalized to obtain
a notion of degree for finite dimensional perturbations of the identity with
respect to domains "finitely bounded" in an arbitrary locally convex T.L.S.
Now we shall list the properties of degree that remain valid in this situation.
From Definition 3.35 and 3.16; 4, 5, 6 and 7 we obtain:
3.37. Proposition: For every finite dimensional perturbation of the identity
$\psi = I + \phi : \bar D \to X$, and every $p \notin \psi(\partial D)$, the following results hold:
1. If $p \notin \psi(D)$, then $\deg(p, \psi, D) = 0$.
2. If $p$ and $q$ belong to the same component of $X - \psi(\partial D)$, then
$\deg(p, \psi, D) = \deg(q, \psi, D)$.
3. If $D = \bigcup_i D_i$, where the family $\{D_i\}$ is disjoint and $\partial D_i \subset \partial D$, then
$\deg(p, \psi, D) = \sum_i \deg(p, \psi, D_i)$.
4. If $K \subset \bar D$, $K$ is closed, $p \notin \psi(K)$, then
$\deg(p, \psi, D) = \deg(p, \psi, D - K)$.
5. If $f : \bar D_1 \to X_1$ is a mapping satisfying the same conditions as $\phi$, then
$\deg((p \times q), I + (\phi \times f), D \times D_1) = \deg(p, I + \phi, D)\, \deg(q, I + f, D_1)$
whenever these expressions make sense.

L. Limits

The family of finite dimensional mappings is closed under addition and
product by scalars. Hence to proceed we consider limits of such mappings.
From this point of view the important thing is that the compact mappings
(definition below) are such limits and so we will be able to define degrees
of compact perturbations of the identity.
Let us recall and introduce some notations: $D$ is an open set in $X$ such
that $D \cap E$ is bounded for every finite dimensional subspace $E \subset X$. $C(\bar D)$
will denote the (linear) space of all continuous mappings $\phi : \bar D \to X$; like-
wise, $C(\partial D)$ will be the space of continuous mappings $\phi : \partial D \to X$. There
exists a natural mapping $Q : C(\bar D) \to C(\partial D)$ defined by restriction, $Q(\phi) = \phi|_{\partial D}$.
Similarly, we denote by $\mathcal{F}(\bar D)$ and $\mathcal{F}(\partial D)$ the subspaces of $C(\bar D)$ and $C(\partial D)$
whose elements are the finite dimensional mappings. Of course, $Q : \mathcal{F}(\bar D)
\to \mathcal{F}(\partial D)$.
Let us give $C(\bar D)$ and $C(\partial D)$ the topologies of the uniform convergence.
The open sets of, say, $C(\bar D)$, are those defined by
$\{\phi \mid \phi(\bar D) \subset G\}$
where $G$ runs over all the open sets of $X$.
Warning: These topologies are not linear space topologies, but merely
group topologies (i.e., the mapping $(\phi, \psi) \to \phi + \psi$ is continuous, while the
mapping $(\lambda, \phi) \to \lambda\phi$, $\lambda \in R$, $\phi \in C(\bar D)$, is not necessarily continuous).
By 3.36, for every $\phi \in \mathcal{F}(\partial D)$ and every $p \notin (I + \phi)(\partial D)$, the degree
$\deg(p, I + \phi, D)$ is defined, and for every $g \in \mathcal{F}(\bar D)$ such that $Qg = \phi$,
$\deg(p, I + g, D) = \deg(p, I + \phi, D)$.
3.38. Lemma: Let $\phi \in C(\partial D)$, $p$ be a point of $X$ and $V$ be a convex sym-
metric neighborhood of $0 \in X$ such that:
(a) $(p + V) \cap (I + \phi)(\partial D) = \emptyset$.
Then, if $f \in \mathcal{F}(\partial D)$ satisfies $f(x) - \phi(x) \in V$ for every $x \in \partial D$, the degrees
$\deg(p, I + f, D)$
are defined; moreover, these degrees are equal for all such $f$.
Proof: Suppose $p = x + f(x)$, $x \in \partial D$. Then $x + \phi(x) = p + (\phi(x) - f(x))
\in p + V$, which contradicts (a). Hence $p \notin (I + f)(\partial D)$ and the degree is
defined.
Suppose that $g \in \mathcal{F}(\partial D)$ also satisfies $g(x) - \phi(x) \in V$ for every $x \in \partial D$,
and call $F = I + f$, $G = I + g$, $\psi = I + \phi$.
For every $x \in \partial D$, $F(x)$ and $G(x)$ both belong to $\psi(x) + V$ and this set is
convex. Hence any convex combination $(1 - t)G(x) + tF(x)$, $0 \le t \le 1$,
also belongs to $\psi(x) + V$. Using (a) this implies that for every $t$, $0 \le t \le 1$,
and every $x \in \partial D$,
$p \ne (1 - t)G(x) + tF(x)$.
Let $E$ be a finite dimensional subspace of $X$ such that
$p \in E$,
$g(\partial D) \subset E$,
$f(\partial D) \subset E$.
Considering the domain $D \cap E$ and the homotopy $(1 - t)G + tF$ between $G$
and $F$, we conclude by 3.17 that $\deg(p, G, D \cap E) = \deg(p, F, D \cap E)$, or,
according to Definition 3.35, $\deg(p, G, D) = \deg(p, F, D)$ as desired.
Q.E.D.

3.39. Proposition: Let $\phi \in C(\partial D)$ and $p$ a point of $X$ not in the closure of
$(I + \phi)(\partial D)$. Then there exists a neighborhood $U$ of $\phi$ in $C(\partial D)$ such that
the degree
$\deg(p, I + f, D)$
takes on but a single value for all $f$ belonging to $U \cap \mathcal{F}(\partial D)$.
Proof: Follows from the lemma above.
Q.E.D.
This proposition will permit us to define the degree for perturbations of the
identity by limits of finite dimensional mappings.
Call $\mathcal{L}(\partial D)$, $\mathcal{L}(\bar D)$ the closure of $\mathcal{F}(\partial D)$ (respectively $\mathcal{F}(\bar D)$) in $C(\partial D)$
(respectively $C(\bar D)$). Plainly $Q : \mathcal{L}(\bar D) \to \mathcal{L}(\partial D)$.
3.40. Definition: Let $\phi = \psi - I$ be a mapping in $\mathcal{L}(\partial D)$. Let $p$ be a
point of $X$ not in the closure of $\psi(\partial D)$. The common value of the degrees
$\deg(p, I + f, D)$ when $f \in \mathcal{F}(\partial D)$ is near $\phi$ is defined to be $\deg(p, \psi, D)$.
If $\phi = \psi - I$ is a mapping defined on all of $\bar D$ and such that $Q(\phi) \in \mathcal{L}(\partial D)$,
we shall write simply $\deg(p, \psi, D)$ instead of $\deg(p, Q\psi, D)$. In particular,
for every $\phi \in \mathcal{L}(\bar D)$, the degree is defined.
From 3.37 the next proposition follows easily.
From 3.37 the next proposition follows easily.
3.41. Proposition: If $\phi = \psi - I \in \mathcal{L}(\bar D)$ and $p$ and $q$ do not belong to
the closure of $\psi(\partial D)$, then:
1. If $p \notin \psi(D)$, then $\deg(p, \psi, D) = 0$.
2. If $p$ and $q$ belong to the same component of $X - \psi(\partial D)$, then
$\deg(p, \psi, D) = \deg(q, \psi, D)$.
3. If $D = \bigcup_i D_i$, where the family $\{D_i\}$ is disjoint and $\partial D_i \subset \partial D$, then
$\deg(p, \psi, D) = \sum_i \deg(p, \psi, D_i)$.
4. If $K \subset \bar D$, $K$ is closed and $p \notin \psi(K)$, then
$\deg(p, \psi, D) = \deg(p, \psi, D - K)$.
5. If $f : \bar D_1 \to X_1$ is a mapping of $\mathcal{L}(\bar D_1)$, then
$\deg((p, p_1), I + (\phi, f), D \times D_1) = \deg(p, I + \phi, D)\, \deg(p_1, I + f, D_1)$.

Proof: Left to the reader.
(Hint: Approximate and use Proposition 3.16.)
In the same way it is possible to prove the following generalization of 3.33:
3.42. Proposition: Let $Y$ be a closed subspace of $X$ and $p$ be a point of $Y$
not in $\psi(\partial D)$, where $\psi - I \in \mathcal{L}(\partial D)$. Then
$\deg(p, \psi, D) = \deg(p, \psi|_{\partial D \cap Y}, D \cap Y)$.
Finally, we have a generalization of Borsuk's theorem:
3.43. Proposition: If $D \subset X$ is an open set which is finitely bounded, symme-
tric, and contains the origin, then for every odd $\psi$ such that $\psi - I \in \mathcal{L}(\partial D)$,
and if $0 \notin \psi(\partial D)$, then $\deg(0, \psi, D)$ is an odd number.
The proof follows at once from Definition 3.40 and the Borsuk theo-
rem 3.23.

3.44. Corollary: If $D$ is a domain as in the proposition above, if $\psi - I$ is
odd and belongs to $\mathcal{L}(\bar D)$, and if $0 \notin \psi(\partial D)$, then $\psi(D)$ covers a neighbor-
hood of 0.

M. Compact Perturbations

Here we shall show that the compact mappings are in $\mathcal{L}(\partial D)$ and $\mathcal{L}(\bar D)$
and obtain some additional properties of the degree of compact perturba-
tions of the identity.
We begin with some purely topological results.
Let $X$ be a locally convex T.L.S., $T$ a topological space. Let $C(T)$ denote
the set of continuous mappings $\phi : T \to X$. $C(T)$ has a natural structure as a
topological space (see the beginning of section L).
Consider the subspace $\mathcal{F}(T)$ of $C(T)$ whose elements are the mappings $\phi$
such that $\phi(T)$ is contained in a finite dimensional subspace of $X$ and the
subspace $K(T)$ of the mappings $\phi$ for which the set $\phi(T)$ is pre-compact.

3.45. Proposition:
$K(T) \subset \overline{\mathcal{F}(T) \cap K(T)}$.
Proof: Choose an open, convex, symmetric neighborhood $V$ of 0 in $X$,
and let $\phi \in K(T)$. Suppose the points $y_1, \dots, y_n \in X$ have the property
$$\phi(T) \subset \bigcup_{i=1}^{n} \{y_i + V\}.$$

Letting $\rho$ be the gauge induced by $V$,
$$\rho(x) = \inf_{x \in \lambda V} |\lambda|, \quad x \in X,$$
we define mappings $\mu_i : T \to R$ by
$\mu_i(x) = \max(0, 1 - \rho(\phi(x) - y_i))$.
Each $\mu_i$ is continuous (since $\rho$ is continuous).
Since $\phi(T) \subset \bigcup \{y_i + V\}$, for each $x$ there exists at least one $\mu_i(x)$ different
from zero. Thus the function $\mu(x) = \sum_i \mu_i(x)$ never vanishes on $T$, and
we can define
$$\lambda_i(x) = \frac{\mu_i(x)}{\mu(x)}.$$
These mappings satisfy $1 \ge \lambda_i \ge 0$, $\sum_i \lambda_i(x) = 1$. Now define
$$\phi_V(x) = \sum_i \lambda_i(x)\, y_i.$$
This mapping belongs to $\mathcal{F}(T) \cap K(T)$ and
$$\phi(x) - \phi_V(x) = \sum_i \lambda_i(x)\, (\phi(x) - y_i).$$
We see that if $\phi(x) - y_i \notin V$, then $\rho(\phi(x) - y_i) \ge 1$, and consequently
$\mu_i(x) = 0$, which implies $\lambda_i(x) = 0$. This means that $\phi(x) - \phi_V(x)$ is a con-
vex combination of elements of $V$, which belongs to $V$ since $V$ is convex.
Thus $\phi$ is a limit point of $\mathcal{F}(T) \cap K(T)$, as desired.
Q.E.D.
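The construction in this proof is concrete enough to run. The following sketch makes several simplifying assumptions that are not in the text: $X = R^2$ with the Euclidean norm, $V$ the open ball of radius `eps` (so the gauge is the norm divided by `eps`), $T = [0, 1]$, and $\phi$ a closed curve. The names `schauder_approx`, `curve` and `centers` are hypothetical. It builds the weights $\mu_i$, the normalized $\lambda_i$, and the finite dimensional approximation $\sum_i \lambda_i(x) y_i$.

```python
import math


def schauder_approx(phi, centers, eps):
    """Approximation sum_i lambda_i(x) y_i built from the points y_i in `centers`.

    Assumes the balls of radius eps about the centers cover the image of phi.
    """
    def approx(x):
        fx = phi(x)
        # mu_i(x) = max(0, 1 - gauge(phi(x) - y_i)), gauge(v) = |v| / eps
        weights = [max(0.0, 1.0 - math.dist(fx, y) / eps) for y in centers]
        total = sum(weights)  # positive because of the covering assumption
        return tuple(
            sum(w * y[k] for w, y in zip(weights, centers)) / total
            for k in range(len(fx))
        )
    return approx


def curve(t):
    """Sample map phi: [0, 1] -> R^2 with compact image (the unit circle)."""
    return (math.cos(2.0 * math.pi * t), math.sin(2.0 * math.pi * t))


eps = 0.25
centers = [curve(i / 20.0) for i in range(20)]   # an eps-net of the image
approx = schauder_approx(curve, centers, eps)

worst = max(math.dist(curve(k / 1000.0), approx(k / 1000.0)) for k in range(1000))
print(worst, eps)   # the uniform error stays below eps
```

Only centers within distance `eps` of $\phi(x)$ receive a nonzero weight, which is why the error is a convex combination of vectors in $V$.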
Using this proposition we may return to our initial situation $D \subset X$, $D$ an
open finitely bounded set.
We shall say that a mapping $\phi \in C(\bar D)$ (or $\phi \in C(\partial D)$) is compact iff $\overline{\phi(\bar D)}$ is
a compact set (respectively: $\overline{\phi(\partial D)}$).
(Not to be confused with the mappings of Definition 1.38.)
3.46. Proposition: Any compact mapping $\phi \in C(\bar D)$ (respectively $\phi \in C(\partial D)$)
belongs to $\mathcal{L}(\bar D)$ (respectively to $\mathcal{L}(\partial D)$).
Proof: Immediate from 3.45.
Q.E.D.
The perturbations of the identity by elements of $\mathcal{L}(\bar D)$ do not behave
"nicely" topologically and in the preceding it was necessary to consider such
artificial sets as the closure of $(I + \phi)(\partial D)$. The compact ones however look
like finite dimensional mappings, and we have

3.47. Proposition: If $D$ is any domain $D \subset X$, and $\phi \in C(\partial D)$ (respectively
$\phi \in C(\bar D)$) is compact, then $\psi = I + \phi$ is proper and closed.
Proof: To say that $\psi$ is proper means that the inverse image of a compact
set is also compact. Suppose $K$ is compact, and let $A = \psi^{-1}(K)$. Sup-
pose $\{x_\alpha\}$ is an indexed family in $A$. Then $\{x_\alpha + \phi(x_\alpha)\}$, being contained in $K$,
has a convergent subfamily $x_\beta + \phi(x_\beta) \to y$. But $\phi$ being a compact mapping,
there exists a third subfamily such that $\phi(x_\gamma) \to z$. This implies that $x_\gamma \to y - z$,
and so $A$ is compact. Suppose now that $F \subset \bar D$ is closed and that $x_\alpha + \phi(x_\alpha)
\to z$, $x_\alpha \in F$. $\phi$ being compact, there exists $\{x_\beta\}$ such that $\phi(x_\beta) \to y$. Then
$x_\beta \to z - y$; by continuity, $z - y + \phi(z - y) = z$. $F$ is closed, so $z - y
= \lim x_\beta$ belongs to $F$, which implies that $z \in (I + \phi)(F)$. This means that
$(I + \phi)(F)$ is closed as desired.
Q.E.D.
3.48. Corollary:
$\psi(\partial D)$ is closed.
This property makes the statement "$p$ is not in the closure of $\psi(\partial D)$" in most
of the statements above equivalent to "$p$ is not in $\psi(\partial D)$", the same statement
which appears in the finite dimensional case.
We leave to the reader the work (and the delight thereby engendered) of
rewriting Propositions 3.41 and 3.43 for the case $\psi - I$ compact, with
the assumption $p \notin \psi(\partial D)$.

3.49. Corollary: Suppose $D$ is a domain as in 3.43 and $\psi \in C(\bar D)$ a map
such that $\psi - I$ is compact. If $\psi$ maps $\bar D$ into a proper linear subspace of $X$,
then $\psi(x) = \psi(-x)$ for some $x \in \partial D$.
Proof: Consider the map $\tilde\psi(x) = \psi(x) - \psi(-x)$. If $\tilde\psi(x) \ne 0$ when $x \in \partial D$,
then by 3.44 $\tilde\psi(D)$ would cover a neighborhood of 0. But $\tilde\psi(D)$ is contained in
every subspace containing $\psi(\bar D)$; by hypothesis there is a proper one, without
interior points.
Q.E.D.

3.50. Corollary: Let $D$ be as above and $\psi \in C(\bar D)$ such that $\psi - I$ is com-
pact. If $\psi(x)$ is never in the positive direction of $\psi(-x)$, $x \in \partial D$, then $\psi(x) = 0$
for some $x \in D$.
We shall define a notion of homotopy for compact mappings:

3.51. Definition: Two compact mappings $\phi_0, \phi_1 \in C(T)$ ($T$ is any topological space) are said to be compact homotopic if there exists a compact mapping $F : I \times T \to X$, where $I = [0, 1]$, such that $F(0, x) = \phi_0(x)$, $F(1, x) = \phi_1(x)$.
3.52. Proposition: Let $D$ be as above, $\phi_0 = \psi_0 - I$, $\phi_1 = \psi_1 - I$ two compact mappings $\phi_i \in C(\partial D)$. If $\phi_0$ and $\phi_1$ are compact homotopic under $F(t, x) = \phi_t(x) = \psi_t(x) - x$ and $p$ is a point in $X$ such that $p \neq \psi_t(x)$ for every $t$ and every $x \in \partial D$, then

$$\deg(p, \psi_0, D) = \deg(p, \psi_1, D).$$

Proof: If $F$ is compact, it may be approximated by finite dimensional mappings. The restrictions of such mappings also provide close approximations of $\phi_0$ and $\phi_1$, and then the proposition follows from the finite dimensional case.
Q.E.D.

N. Multiplicative Property and Generalized Jordan's Theorem for Banach Spaces

$X$ is now a Banach space. Let $D \subset X$ be a bounded domain and $\psi : \bar D \to X$ a mapping such that $\psi - I$ is compact.

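3.53. Lemma: $\psi(\bar D)$ is bounded.

Of course $\psi(\bar D) \subset \bar D + \phi(\bar D)$, where $\phi = \psi - I$. Since $\phi$ is compact, $\phi(\bar D)$ is bounded, and $\bar D$ is bounded by hypothesis. Then $\psi(\bar D)$ is bounded.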
Q.E.D.
Since $\psi(\partial D)$ is closed (see 3.48), the set $\Delta = X - \psi(\partial D)$ is open and therefore has the form
$$\Delta = \bigcup_i \Delta_i,$$
where the $\Delta_i$ are the components of $\Delta$. Among these there is one and only one unbounded component, $\Delta_\infty$, because $\psi(\partial D)$ is bounded. Let $G = \bigcup \Delta_i$, and suppose furthermore that $g : G \to X$ is a mapping such that $g - I$ is compact.

3.54. Theorem: (multiplicative property) Under the hypotheses above, and if $p \notin g\psi(\partial D)$, then:
(I) $\deg(p, g\psi, D) = \sum_{i \neq \infty} \deg(p, g, \Delta_i)\,\deg(\Delta_i, \psi, D).$

Remark: We have used the notation $\deg(\Delta_i, \psi, D)$ as in section F (cf. 3.20): the justification for this comes from 3.41; 2.
Proof: First of all, as $g$ is proper (3.47), it follows that $K = g^{-1}(p) \subset G$ is compact. Therefore from the covering $K \subset \bigcup_{i \neq \infty} \Delta_i$, we can select a finite family satisfying $K \subset \bigcup \Delta_i$. Thus in the expression (I) all but a finite number of terms vanish so that the sum is meaningful.
Moreover, if $g - I$ is approximated very closely (uniformly over $G$) by a finite dimensional mapping $g' - I$,
$$\deg(p, g, \Delta_i) = \deg(p, g', \Delta_i), \quad i \neq \infty,$$
as follows from the Definition 3.40.
Hence we can prove (I) assuming $g - I$ itself finite dimensional, and the general case will follow immediately.
Observe that if $\psi - I$ is also finite dimensional we are done, because we then have just the finite dimensional result proved in 3.20. Thus the only thing to be proved is that $\phi$ may be approximated by finite dimensional mappings, for which (I) is already known, without changing either side of (I).
When $\psi - I$ is approximated closely, the composition $g\psi$ is also approximated, and the left member of (I) remains unchanged:
$$\deg(p, g\psi, D) = \deg(p, g\psi', D),$$
where $\psi'$ is the mapping corresponding to a suitable approximation to $\phi = \psi - I$.
Of course the terms $\deg(q, \psi, D)$ don't change either after the substitution of $\psi'$ for $\psi$. The only difficulty arises when we consider the sets $\Delta_i$, which obviously do change. But $K = g^{-1}(p)$ is compact, so the new sets $\Delta_i'$ will differ from the old ones by some (closed) sets, disjoint from $K$. By 3.41; 4 applied to these closed sets, the desired equality follows.
Q.E.D.
Suppose again that $D$ is open and bounded and that $\psi : \bar D \to X$ with $\phi = \psi - I$ compact.
3.55. Lemma: If $\psi : \bar D \to X$ is one-to-one, then $\psi^{-1} - I : \psi(\bar D) \to X$ is compact.

Proof: It is easy to see that $\psi^{-1} - I = -\phi \circ \psi^{-1}$, which yields the lemma.
Q.E.D.
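The identity behind Lemma 3.55 can be checked in one line. The sketch below only uses the notation of the text, writing $y = \psi(x)$ and $\phi = \psi - I$; it is an illustration, not part of the original notes.

```latex
% y = \psi(x) = x + \phi(x), so that x = \psi^{-1}(y)
\begin{align*}
(\psi^{-1} - I)(y) &= \psi^{-1}(y) - y = x - \bigl(x + \phi(x)\bigr) \\
                   &= -\phi(x) = -\phi\bigl(\psi^{-1}(y)\bigr).
\end{align*}
```

Thus $\psi^{-1} - I$ takes its values in $-\phi(\bar D)$, a set with compact closure. The continuity of $\psi^{-1}$ on $\psi(\bar D)$ follows because $\psi$ is a closed mapping by 3.47.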
3.56. Lemma: $\psi$ can be extended to $\Psi : X \to X$ in such a way that $\Psi - I$ is still compact.
The proof will follow immediately from Proposition 3.58 below.
3.57. Theorem: (generalized Jordan's theorem) If $D$ and $D^*$ are bounded open sets in a Banach space $X$ and there exists a homeomorphism $\psi : \bar D \to \bar D^*$ such that $(\psi - I)(\bar D)$ is compact, then the number of components of $X - D$ and $X - D^*$ is the same.
Proof: By Lemma 3.55, the inverse mapping $\psi^{-1}$ is also of the form (identity) + (compact); thus the hypotheses are symmetric.
But by Lemma 3.56 it is also possible to assume that $\psi$ and $\psi^{-1}$ are restrictions of globally defined compact perturbations of the identity. The proof now is the same as that of 3.21, except for the fact that the appeals to 3.20 are replaced by references to 3.54. Q.E.D.
We now give a proof of Lemma 3.56. This lemma is an immediate consequence of the following generalization of the Tietze theorem due to J. Dugundji (An extension of Tietze's theorem, Pacif. Journal, Vol. 1, pp. 353-367 (1951)).

3.58. Proposition: Let $A$ be a closed subset of a metric space $X$, and $C$ a convex set in a locally convex L.T.S. $E$ over the real or the complex field. Then any continuous $f : A \to C$ has a continuous extension $F : X \to C$.

Proof: For each $x \in X - A$ choose an open $V_x$ containing $x$ such that $\mathrm{diam}\, V_x \le \rho(V_x, A)$. Then $\{V_x\}$ is an open covering of $X - A$ and since $X - A$ is paracompact there exists a locally finite refinement $\{U\}$, i.e., the $U$'s are open and cover $X - A$, each $U \subset$ some $V_x$, and for each $x \in X - A$ there exists an open $O_x$ containing $x$ and disjoint from all but a finite number of the $U$'s.
Let $U_0 \in \{U\}$ and define for $x \in X - A$
$$\lambda_{U_0}(x) = \rho(x, X - U_0) \Big/ \sum_U \rho(x, X - U).$$

Since $\rho(x, X - U) > 0$ if $x \in U$, and since each $x \in X - A$ is contained in some $U$, we have $0 \le \lambda_{U_0}(x) \le 1$. For any $x \in X - A$, $\lambda_{U_0}|O_x$ has the form
$$\rho(x, X - U_0) \Big/ \sum_{\text{finite no. of } U\text{'s}} \rho(x, X - U),$$
and since each $\rho(x, X - U)$ is continuous (because $|\rho(x, X - U) - \rho(y, X - U)| \le \rho(x, y)$), $\lambda_{U_0}|O_x$ is continuous. Therefore $\lambda_{U_0}$ is continuous on $X - A$ and $\lambda_{U_0}(x) = 0$ iff $x \notin U_0$.
Now for each $U$ choose $a_U \in A$ such that $\rho(a_U, U) < 2\rho(A, U)$, and let the extension $F$ be given by
$$F(x) = \sum_U \lambda_U(x) f(a_U) \quad \text{for } x \in X - A,$$
and
$$F(x) = f(x) \quad \text{for } x \in A.$$
For each $x \in X - A$, $\lambda_U(x) = 0$ except for finitely many $U$'s, and since $\sum_U \lambda_U(x) = 1$ and $f(a_U) \in C$ it follows that $F(x) \in C$. If $x \in X - A$, $F|O_x$ is a finite sum of continuous functions and hence $F$ is continuous on $X - A$. Since $F$ is continuous on the interior of $A$ by assumption, it only remains to show the continuity of $F$ on the boundary of $A$.
Let $x_0 \in$ boundary $A$, and let $W \subset E$ be any convex open set containing the origin. Since $f$ is continuous on $A$ there exists an $\alpha > 0$ such that if $a \in A$ and $\rho(x_0, a) < \alpha$ then $f(a) - f(x_0) \in W$. Let $O = \{x \in X : \rho(x, x_0) < \alpha/6\}$. We will show that if $x \in O$, then $F(x) - F(x_0) \in W$.
Assume $x \in X - A$, $\rho(x, x_0) < \alpha/6$ and $\rho(x, a_U) < \alpha/2$. Then

$$\rho(x_0, a_U) \le \rho(x_0, x) + \rho(x, a_U) < \frac{\alpha}{6} + \frac{\alpha}{2} < \alpha$$

implies $f(a_U) - f(x_0) \in W$. On the other hand assume $x \in X - A$, $\rho(x, x_0) < \alpha/6$ and $\rho(x, a_U) \ge \alpha/2$. Then $\rho(x, a_U) \ge 3\rho(x, x_0) \ge 3\rho(x, A)$. If $x \in U$ then
$$\rho(x, a_U) \le \rho(a_U, U) + \mathrm{diam}\, U < 2\rho(A, U) + \mathrm{diam}\, U.$$
Since $U \subset V_x$ and $\mathrm{diam}\, V_x \le \rho(V_x, A)$ we have
$$\mathrm{diam}\, U \le \mathrm{diam}\, V_x \le \rho(V_x, A) \le \rho(U, A).$$
Therefore $\rho(x, a_U) < 3\rho(U, A) \le 3\rho(A, x)$, contradicting the inequality above. Hence $\rho(x, x_0) < \alpha/6$ and $\rho(x, a_U) \ge \alpha/2$ implies $x \notin U$ and therefore $\lambda_U(x) = 0$.
Finally for $x \in X - A$ and $\rho(x, x_0) < \alpha/6$ we have
$$F(x) - F(x_0) = \sum_U \lambda_U(x) f(a_U) - f(x_0) = \sum_U \lambda_U(x)\,\bigl(f(a_U) - f(x_0)\bigr),$$
and by the above, for each $U$ either $\lambda_U(x) = 0$ or $f(a_U) - f(x_0) \in W$. Since the sum is actually a finite sum and $\sum_U \lambda_U(x) = 1$ with $0 \le \lambda_U(x) \le 1$, it follows from the convexity of $W$ that $F(x) - F(x_0) \in W$.
Q.E.D.

O. Fixed Point Theorems in Banach Spaces

Let $X$ be a Banach space, $K \subset X$ a convex and compact set.

3.59. Lemma: Every continuous mapping $f : K \to K$ has a fixed point, i.e., a point $x$ such that $f(x) = x$.
Proof: Since $K$ is compact it is contained in the interior of some ball $B$ around the origin. Each ball in a Banach space is a metric space, so by 3.58 we can assume that $f$ is the restriction of a continuous mapping (also called $f$) from $B$ into $K$. Clearly any fixed point of the extension is a fixed point of the original mapping, since the extension takes its values in $K$.
Consider now the family of mappings
$$\psi_t = I - tf, \quad 0 \le t \le 1.$$
Since $K$ is compact, $\psi_t$ depends continuously on $t$ in the uniform topology (it suffices to observe that $t \to 0$ implies $tf \to 0$ uniformly).
The homotopy $F(t, x) = tf(x)$ is compact because the set $\{ty;\ 0 \le t \le 1,\ y \in K\}$ is the continuous image of the compact set $[0, 1] \times K$ under the mapping $(t, y) \to ty$, and is therefore compact. Moreover $\psi_t(x) \neq 0$ for $x \in \partial B$, because $tf(x)$ lies in the interior of $B$. That implies that
$$1 = \deg(0, \psi_0, B) = \deg(0, \psi_1, B),$$
which establishes at once the existence of a point $x \in B$ satisfying $\psi_1(x) = 0$, or $x - f(x) = 0$, in other words, a fixed point of $f$.
Q.E.D.

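3.60. Proposition: (Schauder) Let $A$ be a closed convex set contained in the Banach space $X$. Every compact mapping $f : A \to A$ has a fixed point.
Proof: Let $\hat K = \overline{f(A)}$, which is compact, and let $K$ be the closed convex hull of $\hat K$. $K$ is compact and convex (the closed convex hull of a compact set in a Banach space is compact), is contained in $A$, and is mapped into itself by $f$. Hence the restriction $f|K$ has a fixed point by the lemma above.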
Q.E.D.
3.61. Proposition (Rothe): Let $A$ be a convex bounded open set in some Banach space $X$. Suppose $\phi : \bar A \to X$ is compact and $\phi(\partial A) \subset \bar A$. Then $\phi$ has a fixed point.

Proof: Let us suppose that $0 \in A$. Define $p$ by
$$p(x) = \inf\{\lambda;\ 0 < \lambda,\ \lambda^{-1}x \in A\}.$$
Clearly $p$ is continuous. Let $q(x) = \max\{p(x), 1\}$. Then $q$ is also continuous. Hence the map $g(x) = x/q(x)$ is continuous, sends the whole space $X$ into $\bar A$ and is the identity on $\bar A$. Moreover, the properties $p(x) = 1$ and $x \in \partial A$ are equivalent.
Let $B$ be a ball containing $\bar A$ and $\phi(\bar A)$.
The mapping $\tilde\phi = \phi \circ g : B \to B$ is compact (because $\phi$ is compact) and by Schauder's theorem has a fixed point $x = \tilde\phi(x)$. This point is easily seen to be in $\bar A$ (otherwise $g(x) \in \partial A$, whence $x = \phi(g(x)) \in \bar A$), and thus it is a fixed point for $\phi$.
Q.E.D.
3.62. Proposition: (Altman) Let $A$ be any convex open bounded set in a Banach space $X$ containing 0, and $\phi : \bar A \to X$ a compact mapping satisfying
$$|x - \phi(x)|^2 \ge |\phi(x)|^2 - |x|^2, \quad x \in \partial A.$$
Then $\phi$ has a fixed point.
Proof: For $0 \le t \le 1$ and $x \in \bar A$ define the mapping
$$F(t, x) = t\phi(x).$$
$F$ is easily seen to be compact.
For $t$ fixed, write
$$d(t) = \deg(0, I - F(t, \cdot), A).$$
Clearly $d(0) = 1$.
Furthermore, suppose that $x_0 = F(t_0, x_0)$ for some $x_0 \in \partial A$, $0 \le t_0 \le 1$. Then $t_0 \neq 0$ (because $0 \in A$ is not a boundary point), and
$$t_0\,\phi(x_0) = x_0,$$
or
$$|\phi(x_0)| = \frac{1}{t_0}\,|x_0|,$$
$$|\phi(x_0) - x_0|^2 = (1 - t_0)^2\,|\phi(x_0)|^2 = \frac{(1 - t_0)^2}{t_0^2}\,|x_0|^2,$$
$$|\phi(x_0)|^2 - |x_0|^2 = |x_0|^2\left(\frac{1}{t_0^2} - 1\right) = \frac{1 - t_0^2}{t_0^2}\,|x_0|^2.$$
Using the hypothesis, we get
$$\frac{(1 - t_0)^2}{t_0^2}\,|x_0|^2 \ge \frac{1 - t_0^2}{t_0^2}\,|x_0|^2,$$
or
$$(1 - t_0)^2 \ge 1 - t_0^2,$$
which cannot occur for any $0 < t_0 < 1$. (If $t_0 = 1$, then $x_0$ is already a fixed point of $\phi$.) Then the homotopy $F(t, x)$ is compact and avoids the point 0. Therefore
$$1 = d(0) = d(1),$$
which means that $\phi(x) = x$ has a solution for some $x \in A$.


Q.E.D.
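The final inequality in the proof of 3.62 can be checked by direct expansion. The following lines are a minimal verification in the notation of the proof ($t_0$ is the homotopy parameter); they are an added illustration rather than part of the original argument.

```latex
\begin{align*}
(1 - t_0)^2 - (1 - t_0^2) &= 1 - 2t_0 + t_0^2 - 1 + t_0^2 \\
                          &= 2t_0\,(t_0 - 1) \;<\; 0 \qquad (0 < t_0 < 1).
\end{align*}
```

So $(1 - t_0)^2 \ge 1 - t_0^2$ fails throughout the open interval. The difference vanishes only at $t_0 = 0$ and $t_0 = 1$, and at $t_0 = 1$ the boundary point in question is itself a fixed point.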
As an application of homotopy invariance we shall obtain now a separation
property.

3.63. Proposition: Let $K \subset X$ be bounded and closed; $x_1$, $x_2$ elements of $X$ belonging to different components of $X - K$. If $F(t, x)$, $x \in K$, is a compact homotopy such that $F(0, x) = 0$, and if $\phi_t(x) = x - F(t, x)$ is such that $\phi_t(K)$ never contains $x_1$ or $x_2$, then $x_1$ and $x_2$ belong to different components of $X - \phi_1(K)$.
Proof: At least one of $x_1$ and $x_2$ belongs to a bounded component. Suppose it is $x_1$. Let $\Delta$ denote the component of $X - K$ containing $x_1$. Now compare
$$\deg(x_1, \phi_0, \Delta) \quad \text{and} \quad \deg(x_2, \phi_0, \Delta).$$
They are different (since $\phi_0 = I$, the first is 1 and the second is 0). After the homotopy, they remain different, which implies that $x_1$, $x_2$ don't belong to the same component.
Q.E.D.
CHAPTER IV

Morse Theory on Hilbert Manifolds

Part 1

A. Manifolds
B. Functions
C. Tangent Vectors
D. Alternative Definitions of Tangent Vectors
E. More on Linear Topology
F. More on Elementary Calculus
G. A Short Outline of Smooth Linear Bundles
H. The Tangent Bundle
I. Ordinary Differential Equations
J. Submanifolds
K. Riemannian Manifolds

Part 2

A. The Non-critical Neck Principle
B. The Palais-Smale Condition
C. Local Study of Critical Points
D. Global Study of Critical Points
E. The Morse Inequalities

This chapter consists of two parts: a discussion of the elementary properties of smooth manifolds modelled on general linear topological spaces and the theory of critical points of mappings defined on such manifolds (actually, when the L.T.S. is a Hilbert space). The two main references for the second part are (1) and (2) of the bibliography. Moreover, for the classical bibliography, references may be found in J. Milnor, Morse Theory, Ann. of Math. Studies, No. 51, Princeton (1963). We assume that the reader is familiar with finite dimensional differentiable manifolds (de Rham's book, Chern's lecture notes (Chicago), Helgason's book). All the definitions and properties in Part 1 below are easy generalizations of the corresponding ones in the classical case.

Part 1

A. Manifolds

We recall that a mapping from an open set of a linear topological space with values in another L.T.S. is called smooth if it is differentiable infinitely many times in the Fréchet sense.

4.1. Definition: If $V$ is a L.T.S. and $M$ a topological space, we shall define a smooth $V$-structure on $M$ as a pair $(\mathcal{U}, \mathcal{J})$, where
1. $\mathcal{U} = \{U_\alpha\}$ is an open covering of $M$.
2. $\mathcal{J} = \{\phi_\alpha\}$ is a family of mappings $\phi_\alpha : U_\alpha \to V$ such that if $\tilde U_\alpha = \phi_\alpha(U_\alpha)$ then $\tilde U_\alpha$ is open and
(a) $\phi_\alpha : U_\alpha \to \tilde U_\alpha$ is a homeomorphism,
(b) $\phi_\beta \phi_\alpha^{-1} : \phi_\alpha(U_\alpha \cap U_\beta) \to \phi_\beta(U_\alpha \cap U_\beta)$ is smooth.

If $M$ has been provided with a smooth $V$-structure, we shall say that $M$ is a smooth manifold modelled on $V$. Every $\phi_\alpha \in \mathcal{J}$ will be called a chart, a coordinate system or a map on $M$, $U_\alpha$ being its domain.

Remark: If (b) is replaced by
(b') $\phi_\beta \phi_\alpha^{-1}$ is $k$-times differentiable,
or by
(b'') $\phi_\beta \phi_\alpha^{-1}$ is analytic,
we obtain the concepts of $C^k$-$V$-structure on $M$ and of $C^\omega$-$V$-structure, respectively. In most cases, the $V$-structures are smooth ones (or $C^\infty$-$V$-structures, as one says), and we shall restrict ourselves to this case. Parallel results for other cases (when true!) can be proved by the reader.

Examples:
1. Every open set $M \subset V$ can be modelled on $V$ by means of the smooth (indeed analytic) $V$-structure provided by the family $\mathcal{J} = \{i\}$, where $i : M \to V$ is the inclusion.
2. Let $Z$ be a discrete subgroup of $V$, and define $M = V/Z$, with the quotient topology. Consider all pairs $V_\alpha$, $V_\alpha'$, $V_\alpha$ = open set of $V$, $V_\alpha'$ = open set of $M$, such that the canonical projection $\pi : V \to M$ sends $V_\alpha$ homeomorphically onto $V_\alpha'$. If $\mathcal{J}$ is defined as the family of inverses $\pi^{-1} : V_\alpha' \to V_\alpha$, then $M$ together with $\mathcal{J}$ satisfies the requirements above; hence $M$ has a $C^\infty$ structure modelled on $V$, or, briefly, $M$ is a $C^\infty$ $V$-manifold. (If $Z$ is not assumed to be discrete, $\mathcal{J}$ does not define a smooth structure nor even a $C^1$ structure. Why?)
3. We shall see later (4.51, (a)) that in every Hilbert space $\mathcal{H}$, the unit sphere $\{x : |x| = 1\}$ can be modelled on $V$, where $V$ is any quotient $\mathcal{H}/\mathcal{H}_0$, $\mathcal{H}_0$ being a one-dimensional subspace of $\mathcal{H}$.
4. New examples can be obtained from known ones by means of the two following procedures.
(a) If $G \subset M$ is open and $M$ is a $V$-manifold (say $(\mathcal{U}, \mathcal{J})$), then $G$ has an induced $V$-structure defined by the restrictions of the mappings in $\mathcal{J}$ to the sets $U_\alpha \cap G$, $U_\alpha \in \mathcal{U}$.
(b) If $M$ has a $V$-structure and $L$ has a $W$-structure then $M \times L$ has a $V \times W$-structure in the obvious way.

B. Functions

Let $M$ be a $V$-manifold and $M'$ a $V'$-manifold. Let $\mathcal{J}$, $\mathcal{J}'$, $\phi$, $\phi'$, $U$, $U'$ define the manifold structures.
4.2. Definition: A mapping $f : M \to M'$ is called a smooth mapping if it is continuous and for every pair $\phi' \in \mathcal{J}'$, $\phi \in \mathcal{J}$, the mapping $\phi' f \phi^{-1}$, which is defined on some open set of $V$ (precisely: $\phi(U \cap f^{-1}(U'))$) and takes its values in $V'$, is smooth. The set of all smooth mappings $f : M \to M'$ will be denoted by $\mathrm{Hom}(M, M')$.
4.3. Definition: $M$ and $M'$ are diffeomorphic if there exist $f \in \mathrm{Hom}(M, M')$ and $g \in \mathrm{Hom}(M', M)$ such that $fg = \mathrm{id} : M' \to M'$ and $gf = \mathrm{id} : M \to M$.
If $M''$ is a $V''$-manifold, there exists a natural mapping
$$\mathrm{Hom}(M, M') \times \mathrm{Hom}(M', M'') \to \mathrm{Hom}(M, M'')$$
defined by composition of mappings.

4.4. Definition: Let $p$ be a point in $M$, $U$ an open set of $M$ containing $p$. A mapping $f \in \mathrm{Hom}(U, M')$ is horizontal at $p$ if $(f\phi^{-1})'(\phi(p)) = 0$ for one (and hence for every) chart $\phi \in \mathcal{J}$ whose domain contains $p$. If $f$ is a real function horizontal at $p$, we shall also say that $p$ is a critical point of $f$ and that the real number $f(p)$ is a critical level of $f$.
Suppose now that $\psi$ is another chart $\psi \in \mathcal{J}$ defined near $p$. By the chain rule applied to $(f\phi^{-1}) \circ (\phi\psi^{-1})$ we see that:

$$(f\psi^{-1})'(\psi(p)) = \bigl((f\phi^{-1})(\phi\psi^{-1})\bigr)'(\psi(p))$$
$$= [(f\phi^{-1})'((\phi\psi^{-1})(\psi(p)))]\,[(\phi\psi^{-1})'(\psi(p))]$$
$$= [(f\phi^{-1})'(\phi(p))]\,[(\phi\psi^{-1})'(\psi(p))].$$

Thus $(f\phi^{-1})'(\phi(p)) = 0$ if $(f\psi^{-1})'(\psi(p)) = 0$, because the mapping $(\phi\psi^{-1})'(\psi(p))$ is invertible (by the implicit function theorem), and this shows the definition of being horizontal at $p$ to be independent of the chart chosen.
We shall henceforth write $C^\infty(M)$ for $\mathrm{Hom}(M, R)$, $R$ with its standard structure. Note that $C^\infty(M)$ is a real algebra.

C. Tangent Vectors

Our aim is to define the tangent vector to a curve in a manifold at any point through which the curve passes. Let $x$ be a point in the $E$-manifold $M$. Consider the set of all the smooth real functions defined on some open set containing $x$, and the equivalence relation $\sim$ on that set defined by

$f \sim g$ if $f = g$ on some neighborhood of $x$.
4.5. Definition: We shall define a germ of smooth functions at $x$ to be any class of functions modulo the relation $\sim$. The set of germs at $x$ will be denoted $G(x)$ and if $f$ is a function, $\gamma(f)$ will denote its germ (i.e., the germ containing $f$).
It is clear that $G(x)$ is a real algebra. We want to consider tangent vectors at $x$. Roughly speaking, they will be "classes of curves going through $x$ with the same velocity". We shall check both properties by means of the elements in $G(x)$.

4.6. Definition: A curve through $x \in M$ is a smooth mapping $p : J \to M$, where $J$ is an open set of $R$, such that $p(u) = x$ for some $u \in J$.

4.7. Definition: If $p$ is a curve, $p(u) = x$, we shall define the tangent vector to $p$ at $x = p(u)$ as the mapping $\frac{dp}{dt}\big|_{t=u} : G(x) \to R$ defined by

$$\frac{dp}{dt}\Big|_{t=u}\,\gamma(f) = \frac{d(f \circ p)}{dt}\Big|_{t=u}.$$

The second member is meaningful because the mapping $f \circ p$ is a smooth real function of a real variable and for every such mapping $g$, we identify the linear mapping $g'(x) : R \to R$ with the number $g'(x, 1)$.
4.8. Definition: The set $TM_x$ of all the tangent vectors $\frac{dp}{dt}\big|_{t=u}$, where $p$ is a curve and $p(u) = x$, is called the tangent space to $M$ at $x$.
A tangent vector $\frac{dp}{dt}\big|_{t=u}$ is easily seen to be linear on $G(x)$, whence there is an injection $TM_x \subset (G(x))^*$, where $*$ denotes the algebraic dual space.
Of course we may assume, by changing the parameter, that every vector is of the form $\frac{dp}{dt}\big|_{t=0}$, and we will do so hereafter. Now choose a chart $\phi$ at $x$. Every curve $p$ with $p(0) = x$ induces (after cutting down its domain $J$, if necessary) a mapping $\phi \circ p : J \to E$. But then $(\phi \circ p)'(0) : R \to E$ is linear. This means that $(\phi \circ p)'(0)$ may be identified with $e(p) = (\phi \circ p)'(0, 1)$. For every $\gamma(f) \in G(x)$ we have (cf. 1.7 for notation and the chain rule 1.14):

$$\frac{dp}{dt}\Big|_{t=0}\,\gamma(f) = (f \circ p)'(0, 1) = (f \circ \phi^{-1} \circ \phi \circ p)'(0, 1)$$
$$= (f \circ \phi^{-1})'\bigl(\phi(p(0)),\ (\phi \circ p)'(0, 1)\bigr)$$
$$= (f \circ \phi^{-1})'(\phi(x), e(p)).$$
Thus
(1) $\frac{dp}{dt}\big|_{t=0}\,\gamma(f) = (f \circ \phi^{-1})'(\phi(x), e(p)).$
Hence if $p$ and $q$ are two curves such that $e(p) = e(q)$, then the tangent vectors $\frac{dp}{dt}\big|_{t=0}$, $\frac{dq}{dt}\big|_{t=0}$ coincide. Therefore we can define a mapping $\phi_*(x) : TM_x \to E$ as follows.
4.9. Definition:
$$\phi_*(x)\left(\frac{dp}{dt}\Big|_{t=0}\right) = e(p).$$

The notation is chosen to emphasize the fact that the mapping $\phi_*(x)$ depends on the chart $\phi$ chosen (and of course on $x \in M$).

4.10. Definition: If $v \in TM_x$, the element $\phi_*(x)v$ of $E$ is called the coordinate of $v$ in the chart $\phi$.
In the new notation, formula (1) becomes:
(2) $v(\gamma(f)) = (f \circ \phi^{-1})'(\phi(x), \phi_*(x)v).$

4.11. Lemma: For every chart $\phi$, the mapping $\phi_*(x) : TM_x \to E$ is one-to-one and onto. If $\psi$ is another chart covering $x$, then

(3) $\psi_*(x) = S \circ \phi_*(x),$

where $S$ is the following automorphism of $E$: $S = (\psi \circ \phi^{-1})'(\phi(x))$.

Proof: From formula (2) we conclude that $\phi_*(x)v = \phi_*(x)w$ implies $v\gamma = w\gamma$ for every $\gamma$, or $v = w$. Hence $\phi_*(x)$ is one-to-one. Let $e$ be an element of $E$, and define the curve $p(t) = \phi^{-1}(\phi(x) + te)$ ($t$ small). It is very easy to prove that if $v = \frac{dp}{dt}\big|_{t=0}$, then $\phi_*(x)v = e$. The formula (3) follows by standard computations.
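The two steps left to the reader in the proof of Lemma 4.11 can be written out as follows. This is a sketch in the notation of the text, with $\phi$, $\psi$ two charts at $x$ and $e(p) = (\phi \circ p)'(0, 1)$; the orientation $S = (\psi \circ \phi^{-1})'(\phi(x))$ is the one stated in the lemma, and the second display takes $p$ to be any curve with $p(0) = x$ and $v$ its tangent vector.

```latex
% Surjectivity: for p(t) = \phi^{-1}(\phi(x) + t e) the map \phi \circ p is affine in t
\begin{align*}
(\phi \circ p)(t) = \phi(x) + t e
  \quad\Longrightarrow\quad
  \phi_*(x)\, v = (\phi \circ p)'(0, 1) = e .
\end{align*}
% Change of chart: chain rule applied to \psi \circ p = (\psi \circ \phi^{-1}) \circ (\phi \circ p)
\begin{align*}
\psi_*(x)\, v = (\psi \circ p)'(0, 1)
  = (\psi \circ \phi^{-1})'\bigl(\phi(x),\ (\phi \circ p)'(0, 1)\bigr)
  = S\, \phi_*(x)\, v .
\end{align*}
```

Since $S$ is a topological automorphism of $E$, any two charts transport the same linear and topological structure to the tangent space.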
A useful consequence of Lemma 4.11 is the fact that any $\phi_*(x)$ induces, by transport of structure from $E$, a structure of L.T.S. on $TM_x$, and that all $\phi_*(x)$ induce the same structure (this follows from formula (3) above). We sum up in the following proposition.

4.12. Proposition: Every $TM_x$ has a canonical structure of L.T.S. given in such a way that for every chart $\phi$ around $x$, the mapping $\phi_*(x) : TM_x \to E$ characterized by

(2) $v\gamma(f) = (f \circ \phi^{-1})'(\phi(x), \phi_*(x)v)$

is a L.T.S. isomorphism. Moreover, the linear structure so induced on $TM_x$ has the following properties:
$$(v + w)\gamma = v\gamma + w\gamma, \qquad (\lambda v)\gamma = \lambda \cdot v\gamma \qquad \text{for } \gamma \in G(x),$$
or, in other words, coincides with that induced by the injection $TM_x \subset (G(x))^*$.
The last part of the proposition is very easy, and is left to the reader.
The definitions make the following proposition obvious.

4.13. Proposition: Every $v \in TM_x$ is a derivation of the algebra $G(x)$ in $R$, i.e.

$$v(\gamma(f)\gamma(g)) = (v\gamma(f)) \cdot g(x) + f(x) \cdot (v\gamma(g)).$$
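Proposition 4.13 can be seen directly from formula (2) and the ordinary product rule for real functions on $E$. The lines below are a short verification under the assumption that $e = \phi_*(x)v$ denotes the coordinate of $v$ in a chart $\phi$ at $x$, as in Definition 4.10.

```latex
\begin{align*}
v\bigl(\gamma(f)\gamma(g)\bigr)
  &= \bigl((fg) \circ \phi^{-1}\bigr)'(\phi(x), e) \\
  &= (f \circ \phi^{-1})'(\phi(x), e)\; g(x) \;+\; f(x)\; (g \circ \phi^{-1})'(\phi(x), e) \\
  &= \bigl(v\gamma(f)\bigr)\, g(x) \;+\; f(x)\, \bigl(v\gamma(g)\bigr).
\end{align*}
```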


Warning: The converse assertion (that every derivation is a tangent vector) is false (see below) unless $E$ is finite dimensional.

D. Alternative Definitions of Tangent Vectors

We continue with the notation of the last section.

(a) First alternative.

Let $G(x)$ be the algebra of germs of smooth functions defined near $x \in M$. Choose a chart $\phi$ around $x$ (sending $x$ into $0 \in E$). Then every $\gamma \in G(x)$ is the germ of a function, $\gamma = \gamma(f)$, and by the Taylor formula

$$f \circ \phi^{-1} = c + x' + s,$$
where $c$ is a real constant, $x'$ is an element of $E'$ and $s$ is a second order map (from a neighborhood of $0 \in E$ into $R$). If $\psi$ is another chart (also sending $x$ into 0), and
$$f \circ \psi^{-1} = \bar c + \bar x' + \bar s$$
is the corresponding decomposition, then $c = \bar c$ and $\bar x' = x' \circ T$, where $T = (\phi \circ \psi^{-1})'(0)$. This is clear after easy computation. $T$ is an automorphism of $E$, both algebraic and topological.
Choosing $\phi$, define $\hat\phi(f) = x'$. From the remark above, it follows that $\hat\psi(f) = \hat\phi(f) \circ [(\phi \circ \psi^{-1})'(0)]$. This means that the mappings $\hat\phi : G(x) \to E'$ induced by all charts $\phi$ are essentially the same. In particular it is true that a topology $\mathcal{F}$ on $G(x)$ may be defined as the weakest among those linear topologies for which $\hat\phi$ is continuous, when $E'$ is supposed to be endowed with the weak topology of the duality $(E, E')$: this topology $\mathcal{F}$ is independent of $\phi$.
Now define $D(x)$ as the set $D(x) = \{d\}$, where $d$ are the following mappings:

(a) $d : G(x) \to R$, $d$ is linear and continuous for $\mathcal{F}$;

(b) $d$ is a derivation: $d(\gamma\mu) = d(\gamma) \cdot \mu(x) + \gamma(x) \cdot d(\mu)$.

Of course, if $f \circ \phi^{-1} = c + x' + s$, then $d(\gamma(f)) = d(\gamma(x' \circ \phi))$. For every $x' \in E'$ and $d \in D(x)$, write $d(x') = d(\gamma(x' \circ \phi))$. If $d(x') = d^*(x')$ for all $x' \in E'$, then $d(\gamma) = d^*(\gamma)$ for all $\gamma \in G(x)$, or $d = d^*$. This means that if $\phi$ is chosen, there is an injection $D(x) \subset (E')^*$.
But the continuity required of the elements $d$ implies that $D(x) \subset E \subset (E')^*$. One may now show immediately that $D(x) = E$.
$D(x)$ might be called the tangent space at $x$, and all the theory based upon this choice.
(b) Second alternative. Again we use the same notation.

Using equality (1) of section C, we may define the tangent vectors by means of their coordinates in all charts. In fact, if $e$ is the coordinate of $v$ in the chart $\phi$, then its coordinate in $\psi$ is
[*] $g = (\psi \circ \phi^{-1})'(\phi(x), e).$
That means that $v$ can be identified with the class of all pairs $(\phi, e)$, $(\psi, g)$, ... where $\phi$ (resp. $\psi$, ...) is a chart and $e \in E$ ($g \in E$, ...), satisfying [*]. Call $V_x$ the set of all such classes. The correspondence: class of $(\phi, e) \to e$ defines, for a given $\phi$, a map that identifies $V_x$ with $E$, and we have a structure of L.T.S. for $V_x$.
Of course if we define a mapping

class of $(\phi, e) \to \frac{dp}{dt}\big|_{t=0}$,

where $p$ is the curve defined as $p(t) = \phi^{-1}(\phi(x) + te)$ ($t$ small), then we obtain a one-to-one-onto correspondence between the elements in $V_x$ and the tangent vectors to smooth curves. On the other hand, if we define the mapping
class of $(\phi, e) \to d$,
where $d \in D(x)$ is defined by $d(\gamma(f)) = (f \circ \phi^{-1})'(\phi(x), e)$, then we obtain a correspondence between elements in $V_x$ and elements in $D(x)$.
This proves that both approaches lead essentially to the same result.
As a final remark, let us observe that:
- In the first definition of tangent space as the set of tangent vectors to curves, addition is easily defined (by means of $TM_x \subset (G(x))^*$), but not the topology of $TM_x$. On the other hand, the geometrical meaning is very illuminating.
- In the second definition (alternative a), everything is natural (algebraic topological structure), except the meaning.
- The third definition (b) is neither natural nor meaningful.

E. More on Linear Topology

This section is used only in section K, and not in its full generality, but only in a very particular case.
Let $E$ be any linear topological space. We shall denote by $\mathrm{End}(E)$ the space of continuous linear mappings $u : E \to E$.
4.14. Lemma: A linear (and Hausdorff if $E$ is Hausdorff) topology for $\mathrm{End}(E)$ is defined by uniform convergence on bounded sets: a fundamental system of neighborhoods of the origin for this topology is the family of sets $L(K, V) = \{u \in \mathrm{End}(E);\ u(K) \subset V\}$, where $K$ ranges over the family of bounded sets of $E$ and $V$ over the neighborhoods of $0 \in E$.
Suppose now that $E$ and $\tilde E$ are two linear topological spaces, and consider the space $B(E, \tilde E)$ of continuous bilinear forms on $E \times \tilde E$: $B(E, \tilde E) = \{P;\ P : E \times \tilde E \to R,\ P$ is bilinear and continuous$\}$. (The continuity of $P$ is in the product topology.)
4.15. Lemma: $B(E, \tilde E)$ can be made into a Hausdorff L.T.S. by the topology of uniform convergence on bounded sets. A fundamental system of neighborhoods of $0 \in B(E, \tilde E)$ is provided by the sets $B_a(K, \tilde K) = \{P \in B(E, \tilde E);\ P(K \times \tilde K) \subset [-a, +a]\}$, where $K$ and $\tilde K$ range over the family of bounded sets of $E$ and $\tilde E$, respectively, and $a$ over the positive numbers.
4.16. Lemma: The bounded sets of $B(E, \tilde E)$ are those sets $D$ for which $D(K \times \tilde K)$ (defined as $\{P(x, y);\ P \in D,\ x \in K,\ y \in \tilde K\}$) is bounded for every pair $K$, $\tilde K$ of bounded sets of $E$ and $\tilde E$, respectively.
The proof is left to the reader.
A subset $S \subset B(E, \tilde E)$ is called equicontinuous (or, better, equicontinuous at the origin) if for every neighborhood $N$ of $0 \in R$, there exists a neighborhood $N'$ of $(0, 0) \in E \times \tilde E$ such that all $P$ in $S$ satisfy $P(N') \subset N$.
4.17. Lemma: All equicontinuous sets in $B(E, \tilde E)$ are bounded.
The proof follows immediately from (4.16). Suppose that $N = [-1, +1]$ and that $K \times \tilde K \subset \lambda N'$ (definition of bounded sets); then, $P$ being bilinear, $|P(x, y)| \le \lambda^2$ for all $P \in S$, $x \in K$, $y \in \tilde K$.
There is a natural way to make $\mathrm{End}(E) \oplus \mathrm{End}(\tilde E)$ operate on $B(E, \tilde E)$; the following lemma establishes this.
4.18. Lemma: The mapping $j : \mathrm{End}(E) \times \mathrm{End}(\tilde E) \to \mathrm{End}(B(E, \tilde E))$ defined by $([j(u \oplus v)]P)(x, y) = P(ux, vy)$ is bilinear.

We note the following relation involving the mapping $j$:

4.19. Proposition: Suppose that both $E$ and $\tilde E$ are locally convex. The mapping $j$ is continuous iff the converse of 4.17 holds.

Continuity of $j$ from the Converse of 4.17

Let $S$ be the neighborhood of $0 \in \mathrm{End}(B(E, \tilde E))$ defined by $S = \{u;\ uK \subset V\}$, where $K$ is a bounded set of $B(E, \tilde E)$ and $V$ a neighborhood of $0 \in B(E, \tilde E)$. Assume that $V = \{P;\ P(L \times \tilde L) \subset [-a, +a]\}$, where $L$ and $\tilde L$ are bounded and $a$ is a positive real number. Since we assume the converse of 4.17, $K$ is also equicontinuous. Let $N$, $\tilde N$ be neighborhoods of 0 in $E$ and $\tilde E$ respectively, such that if $P \in K$, then $P(N \times \tilde N) \subset [-a, +a]$.
Define the neighborhoods
$$W = \{u \in \mathrm{End}(E);\ u(L) \subset N\},$$
$$\tilde W = \{u \in \mathrm{End}(\tilde E);\ u(\tilde L) \subset \tilde N\}.$$
Then, if $u \in W$, $v \in \tilde W$, it follows for every $P \in K$ that:
$$(j(u, v)P)(L \times \tilde L) = P(uL \times v\tilde L) \subset P(N \times \tilde N) \subset [-a, +a],$$
which proves that $j(W \times \tilde W)$ is contained in $S$. Hence $j$ is continuous.

Converse of 4.17 from the Continuity of $j$

Let $K$ be a bounded set in $B(E, \tilde E)$; $a \in R$, $a > 0$. Choose two bounded sets $L$, $\tilde L$ in $E$, $\tilde E$ respectively, both different from $\{0\}$, and define
$$V = \{P \in B(E, \tilde E);\ P(L \times \tilde L) \subset [-a, +a]\}.$$
Clearly $V$ is a neighborhood of $0 \in B(E, \tilde E)$. Now consider the set
$$S = \{u \in \mathrm{End}(B(E, \tilde E));\ uK \subset V\}.$$
$S$ is a neighborhood of $0 \in \mathrm{End}(B(E, \tilde E))$. Since we assume that $j$ is continuous, there exist neighborhoods $N_1$, $N_2$ of 0 in $\mathrm{End}(E)$ and $\mathrm{End}(\tilde E)$ such that $j(N_1 \times N_2) \subset S$, and we may suppose that $N_1$ and $N_2$ are of the form
$$N_1 = \{u \in \mathrm{End}(E);\ uL_1 \subset T_1\},$$
$$N_2 = \{u \in \mathrm{End}(\tilde E);\ uL_2 \subset T_2\},$$
where $L_i$ ($i = 1, 2$) are bounded and $T_i$ neighborhoods of 0. We may also suppose that $L_1 \supset L$ and $L_2 \supset \tilde L$. But the inclusion $j(N_1 \times N_2) \subset S$ means that
$$j(n_1, n_2)P \in V \quad \text{for all } n_i \in N_i,\ P \in K\ (i = 1, 2),$$
or
$$(j(n_1, n_2)P)(L \times \tilde L) \subset [-a, +a],$$
and, finally
$$P(n_1 L \times n_2 \tilde L) \subset [-a, +a] \quad \text{for all } n_i \in N_i,\ P \in K\ (i = 1, 2).$$

Since $L \neq \{0\}$ and $\tilde L \neq \{0\}$ this implies, using the Hahn-Banach Theorem, that
$$P(T_1 \times T_2) \subset [-a, +a].$$
Hence $K$ is equicontinuous.
We shall now discuss some special cases for which the converse of 4.17 holds.

4.20. Definition: A linear topological space is called a Baire space if whenever the union of a countable family of closed subsets covers the whole space, then at least one of the subsets has non-empty interior.
The classical Baire Category Theorem (see Kelley or Bourbaki) implies the following proposition.
4.21. Proposition: Every Fréchet space is a Baire space.
Moreover, the statement below follows easily from the definition.
4.22. Proposition: Every locally convex linear topological space which is a Baire space is a barrelled space.
(Cf. Bourbaki, E.V.T., Chapter III, § 1, Prop. 1.)
4.23. Proposition: If $E$ and $\tilde E$ are locally convex linear topological spaces and $E \times \tilde E$ is a Baire space, then a subset of $B(E, \tilde E)$ is equicontinuous iff it is bounded (and consequently, the map $j$ in 4.18 is continuous).
Preliminary remark: If $E \times \tilde E$ is a Baire space, then both $E$ and $\tilde E$ are Baire spaces (hence barrelled spaces).
Proof:
Suppose that $D \subset B(E, \tilde E)$ is bounded and that $N = [-a, +a]$, $a \in R$, $a > 0$. The set $G = \{(x, y) \in E \times \tilde E;\ P(x, y) \in N$ for all $P \in D\}$ is clearly closed. Moreover, $D$ being bounded, for every pair $(x, y)$ there exists a positive integer $n$ such that $P(x, y) \in nN$ for all $P \in D$. Therefore $E \times \tilde E = \bigcup nG$, $n$ = positive integer. Since $E \times \tilde E$ is a Baire space, some $nG$ must have non-empty interior; hence $G = \frac{1}{n}(nG)$ has non-empty interior: call this interior $U$ and choose $(\alpha, \beta) \in U$.
Now consider the set
$$G_1 = \{x \in E;\ P(x, \beta) \in N \text{ for all } P \in D\}.$$
$G_1 \subset E$ is clearly a barrel, and therefore (by our preliminary remark) a neighborhood of $0 \in E$. The same argument applies to $\tilde E$ and so we conclude that there exist neighborhoods $G_1$, $\tilde G_1$ of $0 \in E$ and $0 \in \tilde E$ respectively such that
(1) $P(\alpha, y) \in N$ for all $P \in D$ and all $y \in \tilde G_1$; $\quad P(x, \beta) \in N$ for all $P \in D$ and all $x \in G_1$.
Furthermore, we may find new neighborhoods $V \subset G_1$, $\tilde V \subset \tilde G_1$ such that $(\alpha + V) \times (\beta + \tilde V) \subset G$; in particular
(2) $P((\alpha + V) \times (\beta + \tilde V)) \subset N$ for all $P \in D$.
Suppose finally that $(x, y) \in V \times \tilde V$. Then, for every $P \in D$

$$P(x, y) = P(x + \alpha, y + \beta) - P(\alpha, y) - P(x, \beta) - P(\alpha, \beta)$$
$$\in P((\alpha + V) \times (\beta + \tilde V)) + P(\alpha, \tilde V) + P(V, \beta) - P(\alpha, \beta).$$
From (1) and (2) we now get:
(3) $P(x, y) \in 3N - P(\alpha, \beta)$ for all $P \in D$.
Let us observe now that since $D$ is bounded, the set $\{P(\alpha, \beta);\ P \in D\}$ is also bounded and hence contained in some $qN$, $q$ a positive integer.
Therefore, by (3):
$$\{P(x, y);\ P \in D,\ x \in V,\ y \in \tilde V\} \subset (q + 3)N,$$
or

(4) $\{P(x, y);\ P \in D,\ x \in W,\ y \in \tilde W\} \subset N,$

where $W$ and $\tilde W$ are neighborhoods satisfying $(q + 3)W \subset V$, $\tilde W = \tilde V$. The formula (4) shows that $D$ is equicontinuous, as desired. As a corollary, we obtain the following statement.

4.24. Corollary: If $E$ and $\tilde E$ are locally convex spaces and $E \times \tilde E$ is a Baire space, then the mapping $j : \mathrm{End}(E) \oplus \mathrm{End}(\tilde E) \to \mathrm{End}(B(E, \tilde E))$ is continuous.

F. More on Elementary Calculus

Let $M$ be a topological space and $F$ an L.T.S. Denote by $C(M, F)$ the set of
continuous mappings from $M$ into $F$. Clearly $C(M, F)$ has a natural linear
structure. Suppose now that $Z$ is another L.T.S. Then there is a natural
identification
$$C(M, F) \oplus C(M, Z) = C(M, F \oplus Z)$$
which is a linear isomorphism (defined by $(\phi \oplus \psi)(x) = \phi(x) \oplus \psi(x)$). Suppose
now that $u : F \to Z$ is a continuous mapping. Then $u$ induces a map
$$C(M, F) \to C(M, Z), \qquad \phi \mapsto u \circ \phi.$$
Our aim is to consider the case in which $M$ is a smooth manifold and to
deal with smooth mappings.
Assume that $M$ is a smooth manifold.
4.25. Lemma: If $\phi \in C(M, F)$ and $\psi \in C(M, Z)$ are smooth, then so is
$\phi \oplus \psi$ and furthermore
$$\delta(\phi \oplus \psi)(x, h) = \delta\phi(x, h) \oplus \delta\psi(x, h).$$
The proof follows from the remark that if $u \in C(M, F)$ and $v \in C(M, Z)$ are
both $o(x)$, then so is $u \oplus v$.
Let now $Z$, $F$ and $G$ be three L.T.S.
4.26. Lemma: Every continuous bilinear mapping $u : Z \times F \to G$ is
differentiable, and moreover its derivatives are
$$\begin{aligned} \delta u[(x, y); (h, k)] &= u(x, k) + u(h, y), \\ \delta^2 u[(x, y), (x', y'); (h, k)] &= u(x', k) + u(h, y'), \\ \delta^n u &= 0 \quad \text{if } n \ge 3. \end{aligned}$$
The proof follows from elementary remarks and the formula
$$u(x + h, y + k) - u(x, y) = \{u(x, k) + u(h, y)\} + u(h, k)$$
(see also "Quadratic Forms," in Chapter I).
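The formula in the proof of Lemma 4.26 can be checked numerically in a finite-dimensional case. The sketch below is only an illustration: the spaces $Z = F = \mathbb{R}^2$, $G = \mathbb{R}$ and the matrix `A` are assumptions chosen for the example, not objects from the text. It verifies that the increment of $u$ minus the claimed derivative $u(x, k) + u(h, y)$ is exactly the quadratic term $u(h, k)$.

```python
# Hypothetical example: u(x, y) = sum_ij A[i][j] x[i] y[j] on R^2 x R^2.
A = [[2.0, -1.0], [0.5, 3.0]]  # arbitrary matrix, assumed for illustration


def u(x, y):
    """Bilinear form u(x, y) defined by the matrix A."""
    return sum(A[i][j] * x[i] * y[j] for i in range(2) for j in range(2))


def du(x, y, h, k):
    """First derivative claimed in Lemma 4.26: u(x, k) + u(h, y)."""
    return u(x, k) + u(h, y)


def add(p, q):
    """Componentwise sum of two vectors."""
    return [a + b for a, b in zip(p, q)]


x, y = [1.0, 2.0], [-3.0, 0.5]
h, k = [0.1, -0.2], [0.3, 0.4]

increment = u(add(x, h), add(y, k)) - u(x, y)
remainder = increment - du(x, y, h, k)

# The remainder should coincide with u(h, k), which is of second order in (h, k).
print(abs(remainder - u(h, k)) < 1e-12)
```

Since the remainder $u(h, k)$ is itself bilinear, the second derivative is constant and all higher derivatives vanish, which is the content of the lemma.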
Assume again that $M$ is a smooth manifold, and that $F$, $Z$ are locally
convex L.T.S. such that in $B(F, Z)$ every bounded set is equicontinuous.
4.27. Proposition: If $\phi \in C(M, \mathrm{End}(F))$ and $\psi \in C(M, \mathrm{End}(Z))$ are
differentiable and we call $j$ the natural mapping $\mathrm{End}(F) \oplus \mathrm{End}(Z) \to \mathrm{End}(B(F, Z))$,
then the mapping $\beta \in C(M, \mathrm{End}(B(F, Z)))$ defined by $\beta = j \circ (\phi \oplus \psi)$ is also
differentiable.
Proof: By Lemma 4.25, $\phi \oplus \psi \in C(M, \mathrm{End}(F) \oplus \mathrm{End}(Z))$ is differentiable.
But by Proposition 4.23, $j$ is continuous, hence (Lemma 4.26, above)
differentiable. Therefore $\beta$, as a composition of two differentiable mappings,
is itself differentiable.
4.28. Corollary: Proposition 4.27 holds under the hypothesis that $F$ and $Z$
are locally convex and $F \times Z$ is a Baire space.
4.29. Notation: If $\phi \in C(M, \mathrm{End}(F))$ and $\psi \in C(M, \mathrm{End}(Z))$, the mapping
$\beta = j \circ (\phi \oplus \psi) \in C(M, \mathrm{End}(B(F, Z)))$ defined in 4.27 will be called
$s(\phi, \psi)$. With this notation, Proposition 4.27 reads: if $\phi$ and $\psi$ are smooth, so is
$s(\phi, \psi)$.
G. A Short Outline of Smooth Linear Bundles

The reference is: Lang, Introduction to differentiable manifolds.


(a) Definition
Let $M$ be a smooth $E$-manifold.
4.30. Definition: A smooth linear bundle on $M$ consists of the following
objects:
1. a space $X$;
2. a surjective mapping $\pi : X \to M$ (called the projection);
3. an L.T.S. structure on every set $\pi^{-1}(x)$, $x \in M$ (the set $\pi^{-1}(x)$ is called
the fiber of the bundle over $x$, and denoted by $X_x$);
4. a linear topological space $F$;
5. an open covering $\{U, V, \dots\}$ of $M$; and
6. for every $U$ in this covering, a map
$$\tau_U : \pi^{-1}(U) \to U \times F.$$
These maps must satisfy:
7. the maps $\tau_U, \tau_V, \dots$ commute with the projections, i.e., the diagram
formed by $\tau_U : \pi^{-1}(U) \to U \times F$, $\pi : \pi^{-1}(U) \to U$ and $\mathrm{pr}_1 : U \times F \to U$
is commutative (where $\mathrm{pr}_1(x, e) = x$), that is, $\mathrm{pr}_1 \circ \tau_U = \pi$ on $\pi^{-1}(U)$;
8. the restrictions $(\tau_U)_x : X_x \to \{x\} \times F$ are linear isomorphisms (here we
assume that $\{x\} \times F$ and $F$ are identified);
9. if $U$ and $V$ are two members of the covering, the map
$$\tau_{UV} : U \cap V \to \mathrm{End}(F)$$
defined by
$$\tau_{UV}(x) = (\tau_U)_x \, [(\tau_V)_x]^{-1}, \qquad x \in U \cap V,$$
is a smooth mapping ($\mathrm{End}(F)$ has the topology defined in 4.14).
Examples
1. Let $F$ be any L.T.S. Take $X = M \times F$, $\pi(m, f) = m$, the covering
consisting only of the set $M$, and $\tau_M$ to be the identity. This is a smooth
linear bundle, called the trivial bundle.
2. The tangent bundle of a manifold $M$, whose description appears in
section H below.
Remark: Sometimes, for short, the expression "let $X$ be a bundle" is used.
The space $F$ is called the typical fiber of the bundle (in the case of the tangent
bundle, it coincides with the space on which $M$ is modelled).
The notions of sub-bundle of a bundle, direct sum of two bundles and homo-
morphism from one bundle into another may be defined in the standard way.
Assume that the covering $\{U\}$ consists of domains of charts $\{\phi\}$. Then the
mappings
$$\pi^{-1}(U) \xrightarrow{\ \tau_U\ } U \times F \xrightarrow{\ \phi \times \mathrm{Id}\ } \phi(U) \times F \subset E \times F$$
are charts for an $E \times F$ manifold structure on $X$.
Remark: Bundles of class $C^k$ and analytic bundles are similarly defined.
4.31. Definition: A section of a bundle $X$ is any mapping $s : M \to X$ such
that $\pi \circ s = \mathrm{Id}$.
(b) A construction
Remark: Here again, the full generality will not be necessary. The reader
may assume that all spaces are Hilbert spaces and hence obtain the
propositions more easily.
We shall give a procedure for constructing new bundles from known ones.
Our procedure is suggested by the general description in Lang (III, § 4).
Assume that $M$ is a smooth $E$-manifold ($E$ is not necessarily locally convex).
Let $X$ and $\tilde X$ be bundles on $M$; $F$ and $\tilde F$ their typical fibers, $\pi$ and $\tilde\pi$ their
projections and $\tau_U$, $\tilde\tau_V$ their structural mappings. Define the set $B(X, \tilde X)$ to
be
$$(a) \qquad B(X, \tilde X) = \bigcup_{x \in M} B(X_x, \tilde X_x),$$
where $B(X_x, \tilde X_x)$ denotes, as above, the space of bilinear forms on $X_x \times \tilde X_x$.
We may assume that the coverings of $M$ defining $X$ and $\tilde X$ are the same,
replacing the original ones by the intersections $U \cap \tilde U$, if necessary. Let
$$(b) \qquad \mathcal{A} = \{W, Q, \dots\}$$
denote this covering. Now consider:
(c) $p : B(X, \tilde X) \to M$,
the mapping defined as: if $u \in B(X_x, \tilde X_x)$, then $p(u) = x$;
(d) for every $W \in \mathcal{A}$, define
$$A_W : p^{-1}(W) \to W \times B(F, \tilde F)$$
as follows: for every $x \in W$, the maps $(\tau_W)_x$ and $(\tilde\tau_W)_x$ give an identification
of $X_x \times \tilde X_x$ with $F \times \tilde F$. Then every $u \in B(X_x, \tilde X_x)$ induces an element
$\bar u \in B(F, \tilde F)$, which defines $A_W$. In other words, if $u \in B(X_x, \tilde X_x)$, then
$$A_W(u) = (x, \bar u) = \big(x,\ u[(\tau_W)_x^{-1}, (\tilde\tau_W)_x^{-1}]\big).$$
The reader will verify that these objects satisfy properties (7) and (8) of
the definition of bundle.
Let $A_{QW} : Q \cap W \to \mathrm{End}(B(F, \tilde F))$ denote the mappings $A_{QW}(x) = (A_Q)_x \, [(A_W)_x]^{-1}$,
where $Q, W \in \mathcal{A}$.
We see that this construction leads to a smooth fiber bundle as long as these
maps $A_{QW}$ are smooth (condition (9)). But it is easily verified that
$$A_{QW} = s(\tau_{QW}, \tilde\tau_{QW}),$$
with the notation of 4.29.

4.32. Proposition: Let $X$, $\tilde X$ be two smooth fiber bundles on $M$ (any smooth
manifold), $F$ and $\tilde F$ their typical fibers. If both $F$ and $\tilde F$ are locally convex and in
$B(F, \tilde F)$ every bounded set is equicontinuous, then the objects $B(X, \tilde X)$, $p$, $\mathcal{A}$
and $A_W$ described in (a), (b), (c) and (d) define a smooth linear bundle on $M$
(its typical fiber being $B(F, \tilde F)$).

Proof: Obvious from the fact that the hypothesis on $B(F, \tilde F)$ implies (by
4.28, 4.29) that the mappings $A_{QW} = s(\tau_{QW}, \tilde\tau_{QW})$ are smooth.

Remarks (1): Note that no assumptions are made on the local topology of
$M$, but only on the fibers of the bundles considered.
(2): Observe that the differentiability (or continuity, by 4.26) of
$j : \mathrm{End}(F) \oplus \mathrm{End}(\tilde F) \to \mathrm{End}(B(F, \tilde F))$ is the weakest condition that can be imposed on
the fibers in order to obtain the statement: if $\phi$ and $\psi$, with values in $\mathrm{End}(F)$
and $\mathrm{End}(\tilde F)$, are differentiable, then $j \circ (\phi \oplus \psi)$ is also differentiable. But the
continuity of $j$ is equivalent (see 4.19) to the property: every bounded set in
$B(F, \tilde F)$ is equicontinuous. Hence we conclude that this latter property of
$B(F, \tilde F)$ is a natural condition to impose in order to obtain the validity of
the last proposition.
Nevertheless, an exception occurs in the case $X = \tilde X$ = trivial bundle,
where the proposition holds for any fiber.
(3): Suppose that $\tilde X$ is the trivial bundle with fiber $\mathbb{R}$, the real numbers.
Then $B(F, \tilde F) = B(F, \mathbb{R}) = F'$, the topological dual of $F$ provided with the
strong topology.
Here the condition on $F$ about the bounded sets is equivalent (see Bourbaki,
EVT Chap. III, § 3, Ex. 6) to: the completion of $F$ is barrelled. Therefore:

4.33. Corollary: Every linear bundle such that the completion of its fiber
is a barrelled space has a smooth dual.
Once again the requirement on the fiber is "necessary" (see Bourbaki, loc.
cit.).
(4): Perhaps the reader cannot resist the temptation to define a tensor
product of smooth linear bundles. The following might be a way: $X \otimes \tilde X$ is
defined by fibers as $(X \otimes \tilde X)_x$ = subspace generated in the dual $(B(X_x, \tilde X_x))'$
of $B(X_x, \tilde X_x)$ by the image of $X_x \times \tilde X_x$ under the canonical mapping
$X_x \times \tilde X_x \to (B(X_x, \tilde X_x))'$. This object is a smooth bundle provided that in $B(F, \tilde F)$ and
in $(B(F, \tilde F))'$ (the latter with the strong topology) every bounded set is
equicontinuous.
This is always the case if $F$ and $\tilde F$ are spaces of type (s.F), for example.
Of course when the product $X \otimes \tilde X$ exists, it has the standard universal
property.

H. The Tangent Bundle

Throughout this section, $M$ will denote a smooth $E$-manifold. We shall
define a smooth linear bundle on $M$, the fiber being $E$, called the tangent
bundle of $M$ (notation: $T(M)$).

4.34. Definition: Let $T(M) = \bigcup_{x \in M} TM_x$, where $TM_x$ is the tangent space
to $M$ at $x$; let $\pi : T(M) \to M$ be the projection $\pi(y) = x$ if $y \in TM_x$. Let
$\mathcal{A} = \{U, V, \dots\}$ be the open covering of $M$ by the domains of charts $\phi, \psi, \dots$
Put $T(U) = \bigcup_{x \in U} TM_x = \pi^{-1}(U)$ for every $U \in \mathcal{A}$ and define $u_U : T(U) \to U \times E$
as follows: if $y \in TM_x$, $u_U(y) = (x, \phi_*(x)\, y)$ (notation of 4.9). The system
$T(M)$, $\pi$, $\{U, V, \dots\}$, $\{u_U, u_V, \dots\}$ defines a smooth linear bundle with
fiber $E$. We call it the tangent bundle of $M$.
Of course the diagram formed by $u_U : T(U) \to U \times E$, $\pi : T(U) \to U$ and
$\mathrm{pr}_1 : U \times E \to U$ (where $\mathrm{pr}_1(x, e) = x$) is commutative.

Consider now two charts $\phi$, $\psi$, with domains $U$ and $V$, respectively. The
composition of the mappings
$$(U \cap V) \times E \xrightarrow{\ u_V^{-1}\ } T(U \cap V) \xrightarrow{\ u_U\ } (U \cap V) \times E$$
is:
$$u_U (u_V)^{-1}(x, e) = \big(x,\ \phi_*(x)\, [\psi_*(x)]^{-1} e\big),$$
as follows immediately from the definition of $u_U$ and $u_V$. But then, since the
mapping
$$[*] \qquad \tau_{U,V} : x \to \phi_*(x)\, [\psi_*(x)]^{-1}$$
(from $U \cap V$ into $\mathrm{End}(E)$) is a smooth mapping, the mappings $u_U$ make
$T(M)$ into a smooth linear bundle and consequently into a smooth $E \times E$
manifold. (If $M$ is a manifold of class $C^k$, $T(M)$ is a manifold of class $C^{k-1}$;
if $M$ is analytic, so is $T(M)$.)
Recall that in 4.31 the notion of section of a bundle was defined.

4.35. Definition: A vector field on $M$ is any section of $T(M)$.

A vector field may be of class $C^k$, $k \ge 0$, or $C^\infty$ (analytic if $M$ is analytic).
4.36. Definition: The set of vector fields of class $C^k$, $k < \infty$, will be
denoted by $Q^k$ (or $Q^k(M)$, if necessary), and that of those of class $C^\infty$ by
$Q$ (or $Q(M)$).
Vector fields defined only on an open set of $M$ will often be considered.
Lifting of mappings. Suppose now that $M$ and $N$ are smooth manifolds
(modelled on $E$ and $F$, respectively) and that $f : M \to N$ is also smooth.
Then a mapping $f_*$ completing the diagram
$$\begin{array}{ccc} T(M) & \xrightarrow{\ f_*\ } & T(N) \\ \downarrow & & \downarrow \\ M & \xrightarrow{\ f\ } & N \end{array}$$
may be defined by means of
$$f_*\left( \frac{dp}{dt}\Big|_{t=0} \right) = \frac{d(f \circ p)}{dt}\Big|_{t=0}.$$
Sometimes we shall write $f_*(x)$ for the restriction of $f_*$ to $TM_x$:
$$[***] \qquad f_*(x) : TM_x \to TN_{f(x)}.$$
4.37. Definition: The mapping $f_*(x)$ defined in [***] is called the
differential of $f$ at $x$.
4.38. Proposition: If $f : M \to N$ is $C^1$, then $f_*(x) : TM_x \to TN_{f(x)}$ is a
continuous linear mapping for every $x \in M$. $f$ is one-to-one on some
neighborhood of $x$ iff $f_*(x)$ is one-to-one. $f(U)$ covers a neighborhood of $f(x)$ for
every neighborhood $U$ of $x$ iff $f_*(x)$ is onto.
The proof, which uses the implicit function theorem, is left to the reader.
Observe that the differential at $x$ of a chart $\phi : U \to E$ (where $U \subset M$) is
the mapping denoted by $\phi_*(x)$ in definition 4.9 (assume the identification
$TE_e = E$ induced by the identity).
The mapping $f_*$ can be computed as the derivative of a mapping when
charts are chosen around $x$ and $f(x)$. In fact, suppose that $\phi$ and $\psi$ are such
charts. Then, if $v = \frac{dp}{dt}\big|_{t=0} \in TM_x$, and $y = f(x)$, we have
$$\begin{aligned} \psi_*(y)\,(f_*(x)\, v) &= \psi_*(y) \left( \frac{d(f p)}{dt}\Big|_{t=0} \right) \\ &= [(\psi f p)'(0)]\,(1) \\ &= [(\psi f \phi^{-1} \phi p)'(0)]\,(1) \\ &= [(\psi f \phi^{-1})'(\phi p(0))]\,\big((\phi p)'(0)\,(1)\big) \\ &= [(\psi f \phi^{-1})'(\phi(x))]\,(\phi_*(x)\, v). \end{aligned}$$
Hence the following diagram, in which $u = (\psi f \phi^{-1})'(\phi(x))$, commutes:
$$\begin{array}{ccc} E & \xrightarrow{\ u\ } & F \\ \uparrow{\scriptstyle \phi_*(x)} & & \uparrow{\scriptstyle \psi_*(y)} \\ TM_x & \xrightarrow{\ f_*(x)\ } & TN_{f(x)} \end{array}$$

I. Ordinary Differential Equations

In this section, $M$ will denote a smooth manifold modelled on a Banach
space $E$. Similar results may be obtained for manifolds modelled on Montel
spaces (see E. Dubinsky, "Differential equations and differential calculus
in Montel spaces", Trans. Am. Math. Soc., Vol. 110, No. 1 (1964)). The
propositions below seem to have appeared for the first time in Michal and
Elconin, "Completely integrable differential equations in abstract spaces",
Acta Math., Vol. 68 (1937) pp. 71-107. Most proofs are omitted here:
reference to the book of Serge Lang will provide them.
4.39. Notation: If $p$ is a curve in $M$, we shall write $p'(u)$ instead of
$\frac{dp}{dt}\big|_{t=u}$.
4.40. Proposition: Let $p_0 \in M$, and let $v$ be a vector field defined on some
neighborhood of $p_0$. Consider the equation
$$(I) \qquad \begin{aligned} p'(t) &= v(p(t)) \\ p(0) &= p_0 \end{aligned}$$
where the unknown $p$ is a curve in $M$, its parameter running over some
interval containing $0 \in \mathbb{R}$. If $v$ is of class $C^1$, the equation (I) has solutions
and such solutions agree in the intersection of their intervals of
definition.
The statement being local, $M$ may be replaced by $E$ itself. The proof is
then a slight modification of Picard's proof of the corresponding
statement in the finite dimensional case. (Observe that the special form of
the equation (I) implies immediately that if a solution is $C^p$, then it is
necessarily $C^{p+1}$.) As in the finite dimensional case, the above proposition implies
the existence of maximal integral curves, in the following sense:
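Picard's method mentioned above can be illustrated on the simplest possible case. The sketch below is an assumption-laden toy, not the proof from the text: it takes $E = \mathbb{R}$ and the linear field $v(p) = p$ (both chosen for illustration), represents curves as polynomials in $t$, and iterates $p_{k+1}(t) = p_0 + \int_0^t v(p_k(s))\,ds$. The iterates are the Taylor partial sums of $p_0 e^t$.

```python
from fractions import Fraction


def integrate(poly):
    """Integrate a polynomial from 0 to t (coefficients listed lowest degree first)."""
    return [Fraction(0)] + [c / (i + 1) for i, c in enumerate(poly)]


def picard_step(poly, p0):
    """One Picard step for p' = p: new curve = p0 + integral_0^t poly(s) ds."""
    integrated = integrate(poly)
    integrated[0] += p0
    return integrated


def picard(p0, steps):
    """Run the Picard iteration starting from the constant curve p(t) = p0."""
    poly = [Fraction(p0)]
    for _ in range(steps):
        poly = picard_step(poly, Fraction(p0))
    return poly


coeffs = picard(1, 5)
print(coeffs)  # coefficients of 1 + t + t^2/2 + t^3/6 + t^4/24 + t^5/120
```

Each step raises the degree by one, which mirrors the observation that a solution of class $C^p$ is automatically of class $C^{p+1}$.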

4.41. Proposition: Suppose $v$ is a $C^n$ vector field defined on the manifold $M$;
then, for every $p \in M$, there exists a smooth curve $\sigma(t, p)$ such that
(a) $\sigma(t, p)$ is defined for $t$ belonging to some interval $(t_-(p), t_+(p))$
containing $0 \in \mathbb{R}$, and is of class $C^{n+1}$ there.
(b) $\sigma(0, p) = p$ for every $p$.
(c) $\sigma(t, p)$ satisfies the equation
$$\frac{d\sigma(t, p)}{dt}\Big|_{t=u} = v(\sigma(u, p)).$$
(d) Given $p \in M$, there is no $C^1$ curve defined on an interval properly
containing $(t_-(p), t_+(p))$ and satisfying (b) and (c) above.
4.42. Proposition: If $u$, $t$, $u + t \in (t_-(p), t_+(p))$, then
$\sigma(u + t, p) = \sigma(u, \sigma(t, p))$.
((d) of Proposition 4.41 applies here.)
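A one-dimensional example makes Propositions 4.41 and 4.42 concrete. The following sketch assumes $M = \mathbb{R}$ and the field $v(x) = x^2$, neither of which comes from the text; for this field the flow has the closed form $\sigma(t, p) = p / (1 - tp)$, and for $p > 0$ the maximal interval ends at $t_+(p) = 1/p$.

```python
import math


def flow(t, p):
    """Flow of the assumed field v(x) = x^2 on the real line."""
    return p / (1.0 - t * p)


def t_plus(p):
    """Upper end of the maximal interval of the integral curve through p."""
    return 1.0 / p if p > 0 else math.inf


p, t, u = 0.5, 0.7, 0.9  # u, t and u + t all lie below t_plus(p) = 2
lhs = flow(u + t, p)       # sigma(u + t, p)
rhs = flow(u, flow(t, p))  # sigma(u, sigma(t, p))
print(lhs, rhs)

# Close to t_plus(p) the integral curve leaves every bounded set.
print(flow(1.999, p))
```

The two printed values in the first line agree up to rounding, which is the group property of 4.42; the blow-up near $t_+(p)$ shows why the curve cannot be extended past the endpoint.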

4.43. Proposition: The mappings $p \to t_+(p)$ and $p \to t_-(p)$ are lower and
upper semicontinuous respectively.
(Follows from the proof of 4.40.)

4.44. Definition: Given a vector field $v$ on $M$ of class $C^1$, the mapping $\sigma$
will be called the flow of $v$.
The flow $\sigma$ is a function of two variables: $p \in M$, $t \in (t_-(p), t_+(p))$. By 4.43,
this set of pairs $(p, t)$ is an open subset of $M \times \mathbb{R}$, hence a smooth manifold
modelled on $E \oplus \mathbb{R}$. Let $D_v$ denote this manifold. From the proof of
Proposition 4.40, we obtain:
4.45. Proposition: For every $v$ of class $C^1$, the flow $\sigma : D_v \to M$ is a smooth
mapping.
Finally we have:

4.46. Proposition: If $p \in M$, the mapping $\sigma(\cdot, p) : (t_-(p), t_+(p)) \to M$
cannot be continuously prolonged to either endpoint, nor can $\sigma(t, p)$ have a limit
point as $t \to t_+(p)$ or $t \to t_-(p)$.
For suppose $\sigma$ has a limit point as $t \to t_+(p)$. Let $p_\infty$ be this limit point.
By the definition of $\sigma$, there exists a neighborhood $U$ of $p_\infty$
and an $\varepsilon > 0$ such that $\sigma$ is defined on $U \times (-\varepsilon, +\varepsilon)$. Let $t_0$ be a real
number such that $t_+(p) - \varepsilon < t_0 < t_+(p)$ and such that $\sigma(t_0, p) \in U$. Such
a point exists. Then define a curve $q$ by $q(t) = \sigma(t, p)$ if $t_- < t < t_+$,
$q(t) = \sigma(t - t_0, \sigma(t_0, p))$ if $t_+ \le t < t_0 + \varepsilon$. Clearly (from 4.42) $q(t)$ is
well defined for $t_- < t < t_0 + \varepsilon$, is smooth and satisfies the differential
equation. This contradicts the maximality of $\sigma(t, p)$, as it is described in
4.41(d).

J. Submanifolds

4.47. Definition: Let $M$ be a smooth $E$-manifold. A subspace $N \subset M$ is
called a regularly imbedded submanifold if there exist a covering of $M$ by
domains of charts ($U_i$ = domain of $\phi_i$) and a closed linear subspace $F$ of $E$
such that, for every $i$,
$$\phi_i(N \cap U_i) = \phi_i(U_i) \cap F.$$
The covering $\{U_i \cap N\}$ and the restrictions $\phi_i | U_i \cap N$ provide a smooth
$F$-structure for $N$. It is easy to see that this structure does not depend on the
particular choice of the original covering $\{U_i\}$ of $M$.
The following statement can be easily proved.
4.48. Proposition: Every regularly imbedded submanifold of $M$ is a closed
subset of $M$.
Examples
1. $M$ and every point in $M$ are regularly imbedded submanifolds.
2. If $M$ is an open set of the Banach space $E$, then for every closed linear
subspace $F$ of $E$ the set $F \cap M$ is a regularly imbedded submanifold
of $M$.
The following proposition provides less trivial examples.
4.49. Proposition: Let $M$ be a smooth $E$-manifold and $f : M \to \mathbb{R}$ a smooth
mapping. If $c \in \mathbb{R}$ is not a critical level for $f$, the set $N = f^{-1}(\{c\})$ is a regularly
imbedded submanifold of $M$ (modelled on a hyperplane of $E$).
The reader will see that the lemma below implies the above proposition.
4.50. Lemma: If $x \in M$, $f$ is defined near $x$, smooth, and non-horizontal at $x$,
then there exists a chart $y \to \psi(y)$ around $x$ such that $f(y) = x'(\psi(y)) + c$,
where $x' \in E'$.
Proof: Considering $f - f(x)$ we may suppose that $f(x) = 0$. Choose a
chart $\phi$ around $x$ such that $\phi(x) = 0$. We know that $x' = (f\phi^{-1})'(0)$ does
not vanish. Let $e \in E$ be a vector such that $x'(e) = 1$, and let $F$ be the kernel
of $x'$. Clearly $E = F \oplus \mathbb{R}e$. Define a mapping $\theta$ by:
$$\theta(y \oplus te) = y \oplus [f\phi^{-1}(y \oplus te)]\, e$$
($y \in F$, $y$ and $t$ small).
The derivative at the origin of $\theta$ is
$$\theta'(0)(y \oplus te) = y \oplus x'(y \oplus te)\, e = y \oplus te,$$
or $\theta'(0)$ = identity. Therefore near the origin $\theta$ is a smooth mapping with
smooth inverse.
Consider the chart $\psi = \theta\phi$. We have
$$(1) \qquad f\psi^{-1}(y \oplus te) = f\phi^{-1}\theta^{-1}(y \oplus te) = f\phi^{-1}(y \oplus ue)$$
where $\theta(y \oplus ue) = y \oplus te$. But by the definition of $\theta$, this last equality
implies that $t = f\phi^{-1}(y \oplus ue)$. From (1) we conclude that
$$f\psi^{-1}(y \oplus te) = t = x'(y \oplus te).$$
Q.E.D.
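The construction in the proof can be followed on a small explicit case. The sketch below is an illustration under assumptions that do not come from the text: $E = \mathbb{R}^2$, $\phi$ is the identity chart, and $f(y, t) = t + y^2 + ty$, so that $x'(y, t) = t$, $e = (0, 1)$ and $F$ is the $y$-axis. The map `theta` is the mapping $\theta$ of the proof, and its inverse is available in closed form for $|y| < 1$.

```python
def f(y, t):
    """Assumed function on R^2 with f(0, 0) = 0 and derivative (0, 1) at the origin."""
    return t + y * y + t * y


def theta(y, t):
    """The mapping of the proof: keep the F-component, replace t by f."""
    return (y, f(y, t))


def theta_inverse(y, s):
    """Solve s = u + y^2 + u*y for u (valid for y != -1)."""
    return (y, (s - y * y) / (1.0 + y))


y, s = 0.2, -0.1
point = theta_inverse(y, s)

# In the new chart psi = theta (phi being the identity), f is the linear form x'.
straightened = f(*point)
print(straightened, s)
```

In the new coordinates the level set $f = 0$ is simply the subspace $F$, which is how the lemma yields Proposition 4.49.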

4.51. More Examples


(a) Let $\mathcal{H}$ be a Hilbert space, $f : \mathcal{H} \to \mathbb{R}$ the mapping $f(x) = (x, x)$. $f$ is
a quadratic form. Its derivative is $f'(x)\, y = 2(x, y)$, and is horizontal only
at $x = 0$. Therefore the unit sphere $S = \{x \mid f(x) = 1\}$ is a regularly imbedded
submanifold of $\mathcal{H}$ (modelled on any hyperplane).
(b) Let $(\Omega, \Gamma, \mu)$ be a measure space, and consider the space $X = L_p(\Omega, \Gamma, \mu)$,
$p > 1$. The mapping $f : X \to \mathbb{R}$ defined by
$$f(x) = \int |x(s)|^p \, \mu(ds)$$
has as many continuous derivatives as the integer part $n$ of $p$ ("greatest
integer less than or equal to $p$").
Considering $X$ as a manifold of class $C^n$, a form of Proposition 4.49 applies
and we conclude that the unit sphere $S = \{x \mid |x| = 1\} = \{x \mid f(x) = 1\}$ is
a regularly imbedded submanifold of $X$, of class $C^n$.
(c) Consider again the case of a Hilbert space $\mathcal{H}$, and let $S$ be the unit
sphere of the Hilbert space $\mathcal{H} \oplus \mathbb{R}$. According to example (a), $S$ is a
smooth manifold modelled on any hyperplane of $\mathcal{H} \oplus \mathbb{R}$, in particular
on $\mathcal{H}$. Define now on $S$ the equivalence relation $x \sim y$ if $x = y$ or $x = -y$. The
quotient $S/\!\sim$ has a natural structure as a smooth manifold modelled on $\mathcal{H}$,
called the projective space on $\mathcal{H}$ and denoted $P(\mathcal{H})$.

K. Riemannian Manifolds

Some Preliminary Remarks on Bilinear Forms


If $E$ is a real Hilbert space we have denoted by $B(E, E)$ the linear
topological space of bilinear continuous forms on $E$ (see 4.15). The following
is clear.

4.52. Lemma: The topology of $B(E, E)$ may be defined by means of the
norm
$$(1) \qquad |p| = \sup \{|p(x, y)|;\ |x| \le 1,\ |y| \le 1\}.$$
Consider now the L.T.S. $\mathrm{End}(E)$ as defined in 4.14. Clearly we have
4.53. Lemma: The topology of $\mathrm{End}(E)$ may be defined by means of the
norm
$$(2) \qquad |u| = \sup \{|u(x)|;\ |x| \le 1\}.$$
Our next step is to prove
4.54. Lemma: $B(E, E)$ and $\mathrm{End}(E)$ are isometrically isomorphic in a
canonical way. Symmetrical bilinear forms go onto self-adjoint operators.
Proof: Let $p \in B(E, E)$ and for every $x \in E$, consider the map $x^* : y \to p(x, y)$.
Clearly $x^*$ is a continuous linear functional on $E$. Hence, there
exists an element $x^{**} \in E$ such that $x^*(y) = (x^{**}, y)$. Call $u_p$ the map
$u_p : x \to x^{**}$. It satisfies
$$(u_p x, y) = p(x, y).$$
Of course $u_p \in \mathrm{End}(E)$ and we have a mapping $p \to u_p$ from $B(E, E)$ into
$\mathrm{End}(E)$.
It is obvious that $p \to u_p$ is linear and onto. Let us note that it is an
isometry:
$$|u_p|^2 = \sup_{|x| \le 1} |u_p x|^2 = \sup_{|x| \le 1} |(u_p x, u_p x)| \le |u_p| \sup_{|x| \le 1,\, |y| \le 1} |(u_p x, y)| = |u_p| \sup_{|x| \le 1,\, |y| \le 1} |p(x, y)| \le |u_p|\, |p|,$$
whence $|u_p| \le |p|$. Conversely
$$|p| = \sup_{|x| \le 1,\, |y| \le 1} p(x, y) = \sup_{|x| \le 1,\, |y| \le 1} (u_p x, y) \le |u_p|.$$
Hence $|p| = |u_p|$.
Q.E.D.

4.55. Corollary: $B(E, E)$ is a Fréchet space.

Let us consider now two Hilbert spaces $F$ and $E$ (let $(\cdot, \cdot)_F$ and $(\cdot, \cdot)_E$ be
their inner products respectively) and assume that $T : F \to E$ is a continuous
linear operator. Denote by $p$ the bilinear form on $F$ defined by
$p(x, y) = (Tx, Ty)_E$.

4.56. Lemma: The norm of $p$ in $B(F, F)$ is the square of that of $T$ as a
bounded operator from $F$ into $E$.
Proof: By formula (1) we have
$$|p| = \sup \{p(x, y);\ |x|_F \le 1,\ |y|_F \le 1\}.$$
But then
$$|p| = \sup \{(Tx, Ty)_E;\ |x|_F \le 1,\ |y|_F \le 1\} \le |T|^2.$$
On the other hand,
$$|T|^2 = \sup_{|x| \le 1} (Tx, Tx)_E \le \sup_{|x| \le 1,\, |y| \le 1} (Tx, Ty)_E = |p|,$$
and we are through.
Remark: The same result follows from more general statements upon
noticing that $T^*T = u_p$.
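Lemma 4.56 is easy to test in two dimensions. The sketch below assumes $F = E = \mathbb{R}^2$ with the usual inner product and an arbitrary matrix `T` (an illustrative choice, not from the text). It approximates both suprema by sampling unit vectors on a grid of angles; for this particular `T` the maximizing direction $(1, 1)/\sqrt{2}$ lies on the grid, so the two values agree up to rounding.

```python
import math

T = [[3.0, 0.0], [4.0, 5.0]]  # assumed operator on R^2, for illustration only


def apply(T, x):
    """Matrix-vector product T x."""
    return [T[0][0] * x[0] + T[0][1] * x[1], T[1][0] * x[0] + T[1][1] * x[1]]


def dot(a, b):
    """Euclidean inner product on R^2."""
    return a[0] * b[0] + a[1] * b[1]


def p(x, y):
    """Bilinear form p(x, y) = (Tx, Ty)."""
    return dot(apply(T, x), apply(T, y))


def unit(angle):
    """Unit vector at the given angle."""
    return [math.cos(angle), math.sin(angle)]


steps = 360
angles = [2 * math.pi * i / steps for i in range(steps)]

# Norm of p: supremum of p(x, y) over pairs of unit vectors (sampled).
norm_p = max(p(unit(a), unit(b)) for a in angles for b in angles)
# Square of the operator norm of T: supremum of |Tx|^2 over unit vectors (sampled).
norm_T_squared = max(dot(apply(T, unit(a)), apply(T, unit(a))) for a in angles)

print(norm_p, norm_T_squared)
```

Here $T^*T$ has eigenvalues 45 and 5, so both quantities equal 45, in agreement with the remark that $T^*T = u_p$.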

The Definition of Riemannian Manifolds and Some Properties

Let $M$ be a smooth manifold modelled on a Hilbert space $E$. Since $B(E, E)$
is a Fréchet space (Corollary 4.55), Proposition 4.21 allows us to apply
Proposition 4.32 and its Corollary 4.33 to conclude that the duals of the
bundles $T(M)$ and $B(T(M), T(M))$ (notation of 4.32) both exist.
4.57. Definition: The bundles $B(T(M), T(M))$ and $(T(M))'$ will be denoted
by $T_2(M)$ and $T_1(M)$ respectively.

4.58. Definition: A pseudometric on $M$ is any smooth symmetric section
of $T_2(M)$.
In this definition, the word symmetric means that if $s$ is the section, then
for every $x \in M$, $s(x) \in B(TM_x, TM_x)$ is symmetric.

4.59. Definition: A pseudometric $s$ will be called a metric if for every
$x \in M$, the bilinear form $s(x)$ is an inner product defining a Hilbert space
structure on $TM_x$ compatible with its topology.
Of course the set of all pseudometrics on $M$ is a linear space; that of the
metrics is a cone in it.

4.60. Definition: A Riemannian manifold is a pair $(M, g)$ where $M$ is a
Hilbert manifold and $g$ is a metric on $M$.
Assume that $g$ is a pseudometric for $M$. For every element $u \in TM_x$ ($x \in M$)
we shall denote by $|u|$ (or $|u|_g$) the norm of $u$: $|u| = (g(x)(u, u))^{1/2}$. This
norm is then a function from $T(M)$ into $\mathbb{R}$.
Assume now that $\phi$ is a chart of domain $U \subset M$.
(Diagram: the chart induces the trivializations $\phi_* : T(U) \to E$ and $T_2(U) \to B(E, E)$ over $U$.)
If $b$ and $g$ are smooth sections of $T(U)$ and $T_2(U)$ respectively, then
$$\begin{aligned} |b(x)|_g^2 &= g(x)\,(b(x), b(x)) \\ &= g(x)\,\big([\phi_*(x)]^{-1} \phi_*(x)\, b(x),\ [\phi_*(x)]^{-1} \phi_*(x)\, b(x)\big) \\ &= \big[g(x) \circ ([\phi_*(x)]^{-1} \times [\phi_*(x)]^{-1})\big]\,(\phi_*(x)\, b(x),\ \phi_*(x)\, b(x)) \end{aligned}$$
being the composition of $\phi_*(x)\, b(x)$ and $g(x) \circ ([\phi_*(x)]^{-1} \times [\phi_*(x)]^{-1})$ is
also smooth.
This proves that given a pseudometric $g$, for every smooth vector field $b$,
the mapping $x \to |b(x)|^2$ is also smooth.
Assume now that $g$ is a metric. Since $g$ induces the inner product $g(x)$ on
every $TM_x$ and $\phi_*(x) : TM_x \to E$ is continuous it is natural (and useful)
to consider the norms $|\phi_*(x)|$ and $|(\phi_*(x))^{-1}|$ of $\phi_*(x)$ and its inverse as
operators from one Hilbert space ($TM_x$ with $g(x)$) into another, namely $E$.
First of all, let us define $h : U \to B(E, E)$ by
$$h(x) = g(x) \circ ([\phi_*(x)]^{-1} \times [\phi_*(x)]^{-1}).$$
By definition, $h$ is smooth. Moreover, $h$ never assumes on $U$ the value
$0 \in B(E, E)$, because $g$ is a metric. Then the real function $x \to |h(x)|$ is
continuous. Lemma 4.56 implies now that $|h(x)| = |(\phi_*(x))^{-1}|^2$ for every
$x \in U$. Hence:
4.61. Proposition: $x \to |(\phi_*(x))^{-1}|$ is continuous on $U$ and never vanishes.
Length of a curve.
Assume that $p$ is a curve in $M$, its domain being $[a, b] \subset \mathbb{R}$.
4.62. Definition: We define the length of $p$, $L(p)$, by
$$L(p) = \int_a^b \|p'(t)\|_g \, dt,$$
where $p'(t) = (p_*(t))(1)$.
Observe that $L(p)$ might be $+\infty$.
This mapping $L$ from the set of curves in $M$ into $\mathbb{R} \cup \{\infty\}$ will be
thoroughly studied later; indeed much of our further study is devoted to
statements about $L$ (mainly about its critical points, namely, the geodesics).
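The length integral can be approximated directly from the definition. The sketch below is a toy under stated assumptions: the manifold is the upper half-plane $\{(x, y);\ y > 0\}$ with the metric $g = (dx^2 + dy^2)/y^2$, and the curve is the vertical segment from $(0, 1)$ to $(0, e)$; none of these choices appear in the text. For this segment the exact length is $\log e = 1$.

```python
import math


def metric_norm(point, velocity):
    """Norm of a tangent vector for the assumed metric g = (dx^2 + dy^2) / y^2."""
    _, y = point
    return math.hypot(velocity[0], velocity[1]) / y


def curve(t):
    """Vertical segment from (0, 1) to (0, e), parametrized linearly on [0, 1]."""
    return (0.0, 1.0 + (math.e - 1.0) * t)


def velocity(t):
    """Derivative of the curve above (constant)."""
    return (0.0, math.e - 1.0)


def length(curve, velocity, a, b, steps=10000):
    """Midpoint-rule approximation of L(p) = integral_a^b |p'(t)|_g dt."""
    dt = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * dt
        total += metric_norm(curve(t), velocity(t)) * dt
    return total


L = length(curve, velocity, 0.0, 1.0)
print(L)
```

The same segment has Euclidean length $e - 1$, so the example also shows how strongly $L$ depends on the metric $g$.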
Assume now that $M$ is connected. As in the classical case, topological and
arcwise connectedness are equivalent. This follows from the fact that every
topologically connected, locally arcwise connected space is also arcwise
connected. Then for every pair $x$, $y$ of points in $M$, there exist smooth curves $p$
joining them: $p(a) = x$, $p(b) = y$. Consider all the numbers $L(p)$.
4.63. Definition: $d(x, y) = \inf L(p)$, the infimum taken on the (non-void)
set of smooth curves joining $x$ and $y$.
4.64. Proposition: The mapping $d$ is a distance on $M$ defining the original
topology.
Proof: It suffices to prove that $d$ induces the original topology on some
open neighborhood of every point. Assume then that $\phi$ is a chart and $U$ an
open set in the domain of $\phi$ on which $|(\phi_*(x))^{-1}|$ is bounded (since $|(\phi_*(x))^{-1}|$
is continuous by 4.61, such a $U$ does exist) and such that $\phi(U)$ is a ball in $E$.
Such $U$'s cover $M$ and are $d$-open. We prove now that $d$ induces the
relative topology on $U$. If $y, z \in U$, let
$$p(t) = \phi^{-1}(t\phi(y) + (1 - t)\phi(z)), \qquad 0 \le t \le 1.$$
Then
$$\phi_*(p(t))\, p'(t) = \frac{d}{du} \phi(p(u))\Big|_{u=t} = \phi(y) - \phi(z)$$
and hence
$$p'(t) = [\phi_*(p(t))]^{-1} (\phi(y) - \phi(z)).$$
Therefore
$$|p'(t)| \le |(\phi_*(p(t)))^{-1}| \, |\phi(y) - \phi(z)|.$$
Since we have assumed that $|(\phi_*(x))^{-1}|$ is bounded, we have
$$|p'(t)| \le K |\phi(y) - \phi(z)|,$$
from which it follows that
$$L(p) \le K |\phi(y) - \phi(z)|.$$
Finally:
$$d(y, z) \le L(p) \le K |\phi(y) - \phi(z)|.$$
This implies that if $z \to y$ in the topology of $M$, then $d(y, z) \to 0$. On the
other hand, if $q$ is any curve joining $y$ and $z$, we have (here $K$ is chosen so
that $|\phi_*(x)| \le K$ on $U$ as well)
$$L(q) = \int_0^1 |q'(t)| \, dt \ge \frac{1}{K} \int_0^1 |\phi_*(q(t))\, q'(t)| \, dt \ge \frac{1}{K} \left| \int_0^1 \phi_*(q(t))\, q'(t) \, dt \right| = \frac{1}{K} \left| \int_0^1 \frac{d}{dt} (\phi q(t)) \, dt \right| = \frac{1}{K} |\phi(y) - \phi(z)|$$
and then, choosing $L(q)$ near $d(y, z)$, we see that when $d(z, y) \to 0$ it
necessarily follows that $|\phi(y) - \phi(z)| \to 0$, which implies that $z \to y$ in $M$.
Q.E.D.

Remark: Of course even if M is not connected it is possible to define a


distance on M induced by the distance defined above on each of the
components.

4.65. Definition: A Riemannian manifold is said to be complete if each


of its components is complete under the metric defined in 4.63.
Generally, we shall deal with complete Riemannian manifolds.

Gradient
Let $(M, g)$ be a Riemannian manifold. Every $TM_x$ is endowed with an
inner product $g(x)$ that makes it into a Hilbert space. That means that there
exist canonical isometries $r_x : (TM_x)' \to TM_x$.

4.66. Definition: If $f$ is a smooth function on $M$, the gradient of $f$ is the
vector field $\nabla f$ defined by $(\nabla f)_x = r_x(f_*(x))$. That is, $\nabla f$ satisfies
$$g(x)\,((\nabla f)_x, v) = vf$$
for all $v \in TM_x$ and all $x \in M$.
It is clear that $f \to \nabla f$ is a linear mapping from $C^\infty(M)$ into $Q(M)$.
Moreover, if $\nabla f \equiv 0$, then $f$ is constant on every component of $M$. Hence, if $M$ is
connected, $\nabla$ is an injection of $C^\infty(M)/\mathbb{R}$ into $Q(M)$.
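In a finite-dimensional chart the defining identity of the gradient reduces to a linear system: if $G(x)$ is the matrix of $g(x)$ and $df(x)$ the row of partial derivatives, then $\nabla f = G^{-1} df$. The sketch below checks this on $\mathbb{R}^2$ with a made-up metric and function; both `metric` and `differential` are assumptions for illustration, not objects from the text.

```python
def solve_2x2(G, b):
    """Solve G z = b for a 2x2 matrix G by Cramer's rule."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    return [(G[1][1] * b[0] - G[0][1] * b[1]) / det,
            (G[0][0] * b[1] - G[1][0] * b[0]) / det]


def metric(x):
    """Assumed metric on R^2: a symmetric positive definite matrix at each x."""
    return [[2.0 + x[0] ** 2, x[0] * x[1]], [x[0] * x[1], 1.0 + x[1] ** 2]]


def differential(x):
    """Differential of the assumed function f(x) = x0^2 + 3*x1."""
    return [2.0 * x[0], 3.0]


def gradient(x):
    """Gradient with respect to the metric: solve G(x) grad = df(x)."""
    return solve_2x2(metric(x), differential(x))


def g(x, u, v):
    """Inner product g(x)(u, v)."""
    G = metric(x)
    return sum(G[i][j] * u[i] * v[j] for i in range(2) for j in range(2))


x = [1.0, -2.0]
v = [0.3, 0.7]
lhs = g(x, gradient(x), v)                                     # g(x)(grad f, v)
rhs = differential(x)[0] * v[0] + differential(x)[1] * v[1]    # vf
print(lhs, rhs)
```

With the Euclidean metric the gradient would be $(2, 3)$ at this point; with the metric above it is $(16/11, 13/11)$, although the pairing with $v$ is the same number $vf$ in both cases.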

Part 2

Morse Theory

A. The Non-critical Neck Principle

Assume that $M$ is a smooth manifold modelled on a Banach space $E$. We
recall that, given a smooth function $f : M \to \mathbb{R}$, a real number $c$ is called a
critical level for $f$ iff there exists a point $x \in M$ such that $f(x) = c$ and $f_*(x) = 0$
(see 4.4 and the notation in 4.9).
4.67. Theorem (Non-critical Neck Principle): Let $f$ be a smooth function
$f : M \to \mathbb{R}$ and consider the sets $N \subset W \subset M_1 \subset M$ defined by:
$$\begin{aligned} W &= \{x \in M;\ a < f(x) < b\} \\ M_1 &= \{x \in M;\ a - \varepsilon < f(x) < b + \varepsilon\} \\ N &= f^{-1}(c), \end{aligned}$$
where $a$, $b$, $c$ and $\varepsilon$ are real numbers, $\varepsilon > 0$ and $a < c < b$. Suppose that $v$ is
a smooth vector field defined on $M$, and let $\sigma(t, p)$ denote its flow (see 4.44).
Assume also that
(a) $vf \ge \delta > 0$
and that
(b) for every fixed $p \in N$, the function $f(\sigma(t, p))$ assumes values greater
than $b$ and less than $a$ in its interval of definition $t_-(p) < t < t_+(p)$.
Then the manifolds $W$ and $(a, b) \times N$ are diffeomorphic.
Remark: The fact that $N$ is a manifold follows from 4.49.
Proof: Without loss of generality we may assume that $a = -1$, $b = +1$,
$c = 0$ and $vf = 1$ (this last condition is achieved by replacing the original
$v$ by $(vf)^{-1} v$). That means that
$$\frac{d}{dt} f(\sigma(t, p)) = vf = 1$$
and hence (after integrating), that:
$$[*] \qquad f(\sigma(t, p)) = f(p) + t.$$
Assume now that $n \in N$. Then $f(\sigma(t, n)) = t$ and, from (b), we conclude
that $t_-(n) < -1$, $t_+(n) > +1$. This means in particular that the domain of
the mapping $\sigma$ contains $(-1, +1) \times N$. Consider now the restriction
$$A : (-1, +1) \times N \to M$$
of $\sigma$ (i.e., $A(t, n) = \sigma(t, n)$).
We shall show that $A$ is a diffeomorphism between $(-1, +1) \times N$ and $W$.
1. $A$ assumes values in $W$. In fact, from [*] we obtain
$|f(A(t, n))| = |f(\sigma(t, n))| = |t| < 1$.
2. $A$ maps $(-1, +1) \times N$ onto $W$. Assume that $p \in W$. Set $t = f(p)$,
$$n = \sigma(-t, p) = \sigma(-f(p), p).$$
Then $f(n) = f(\sigma(-f(p), p))$, and by [*], we have $f(n) = f(p) - f(p) = 0$,
whence $n \in N$. Clearly (see the definition of $W$) we also have $-1 < t < +1$.
Now using Proposition 4.42 we obtain:
$$A(t, n) = \sigma(t, n) = \sigma(f(p), \sigma(-f(p), p)) = \sigma(0, p) = p$$
and $A$ is therefore onto.
3. $A$ is smooth and has a smooth inverse. If $p = A(t, n)$, using 4.42 again
and also using [*] we conclude that
$$[**] \qquad \begin{aligned} t &= f(p) \\ n &= \sigma(-t, p) = \sigma(-f(p), p). \end{aligned}$$
But then 4.45 implies that both $A$ and $A^{-1}$ are smooth.
Remark 1: If the hypothesis (b) is not satisfied then the proposition is
false, as the following example shows: let $V$ be the surface of a vertical
circular cylinder in Euclidean space $\mathbb{R}^3$. Denote by $f(x)$ the vertical
coordinate of a point $x \in V$ and by $v(x)$ a vertical unit vector with origin at
$x \in V$. Now remove a point $z \in V$ such that $f(z) = 0$ and let $M$ denote the
remaining manifold.
For the values $a = -1$, $b = +1$ the proposition does not hold for $M$.
Nevertheless, all the hypotheses except (b) are satisfied.
Remark 2: Observe that the diffeomorphism $A : (a, b) \times N \to W$ sends
$\{z\} \times N$ diffeomorphically onto $f^{-1}(z)$ for every $a < z < b$ (this follows from
the formula [**] in the above proof).
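The diffeomorphism of Theorem 4.67 and its inverse can be written out explicitly in a simple case. The sketch below assumes $M = \mathbb{R}^2$, $f(x, y) = y + x^2$ and the constant field $v = (0, 1)$, so that $vf = 1$ and the flow is a vertical translation; these choices are illustrative and not taken from the text. With $a = -1$, $b = 1$, $c = 0$, the neck $W$ is the strip between two parabolas and $N$ is the parabola $y = -x^2$.

```python
def f(p):
    """Assumed function f(x, y) = y + x^2 on the plane."""
    x, y = p
    return y + x * x


def flow(t, p):
    """Flow of the constant field v = (0, 1), for which vf = 1."""
    x, y = p
    return (x, y + t)


def A(t, n):
    """Restriction of the flow to (-1, 1) x N, as in the proof."""
    return flow(t, n)


def A_inverse(p):
    """Inverse map: t = f(p), n = flow(-f(p), p)."""
    t = f(p)
    return t, flow(-t, p)


p = (0.8, -0.3)        # f(p) = 0.34, so p lies in W = {-1 < f < 1}
t, n = A_inverse(p)
print(t, n, f(n))      # n lies on N, i.e. f(n) is zero up to rounding
print(A(t, n))         # recovers p
```

The level set $f^{-1}(z)$ is the image of $\{z\} \times N$, which is the content of Remark 2 in this example.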

4.68. Proposition: Assume the same hypothesis as in 4.67 and furthermore
assume that (b) is satisfied in the following stronger version:
(b') for every fixed $p \in N$, the function $f(\sigma(t, p))$ assumes values greater
than $b + \eta$ and less than $a - \eta$ for some $\eta > 0$ (the same for all $p \in N$).
Then if $W_1 = \{x \in M;\ a \le f(x) \le b\}$, there exists a diffeomorphism
between $W_1$ and $N \times [a, b]$ (as long as they are manifolds, i.e. when $N$ is a
manifold without boundary).

Proof: From the proposition above we obtain the existence of a mapping
$$(1) \qquad A : N \times (a - \eta/2,\ b + \eta/2) \to W_2$$
where $W_2 = \{x \in M;\ a - \eta/2 < f(x) < b + \eta/2\}$.
This mapping sends $N \times \{z\}$ onto $f^{-1}(z)$ for every $a - \eta/2 < z < b + \eta/2$.
Then the restriction of $A$ to $N \times [a, b]$ is the desired diffeomorphism.

4.69. Corollary: Under the hypothesis of 4.68, there exists a homotopy
$H : M \times I \to M$, where $I = [0, 1]$, such that if $H_s(m) = H(m, s)$ then
1. for every $s \in I$, $H_s : M \to M$ is a diffeomorphism;
2. if $m \in M$ does not satisfy $a - \eta/4 \le f(m) \le b + \eta/4$, then $H_s(m) = m$
for all $s$;
3. $H_0$ = identity;
4. $H_1(\{x;\ f(x) \le a\}) = \{x;\ f(x) \le b\}$.
Proof: Let $h$ be a smooth real function such that $h'(x) > 0$ for all $x$,
$h(x) = x$ outside $[a - \eta/4,\ b + \eta/4]$ and $h(a) = b$ (the figure showing the graph
of $h$ is not reproduced; these are the properties used below).

Let $F = \{x;\ a - \eta/4 \le f(x) \le b + \eta/4\}$ and let $G = M - F$. Clearly $G$
is open and $M = G \cup W_2$.
Now if $A$ is the mapping defined in (1) of the proof of 4.68, then for
every $s$ between 0 and 1 we define $H_s$ as follows:
if $m \in G$, then $H_s(m) = m$;
if $m = A(n, t) \in W_2$, then $H_s(m) = A(n, (1 - s)t + s h(t))$.
Observe that $H_s$ is well defined (and equal to the identity) on $W_2 \cap G$.
Indeed, the reader can now verify that the $H_s$ have the properties 1), ..., 4).

Remark: Part (4) of this corollary says in particular that $\{x;\ f(x) \le a\}$
and $\{x;\ f(x) \le b\}$ are diffeomorphic. This fact will be used very often. An
important generalization appears in 4.72.

B. The Palais-Smale Condition

Let us assume now that (M, g) is a Riemannian manifold.


4.70. Definition: If $f \in C^\infty(M)$, we shall say that $f$ satisfies the Palais-Smale condition ("P-S condition") if whenever $S$ is a set in $M$ on which $f$ is bounded and $\|\nabla f\|$ is not bounded away from zero, then there exists a critical point of $f$ adherent to $S$.
Of course this is equivalent to: if $\{x_n\}$ is a sequence in $M$ such that $\{f(x_n)\}$ is bounded and $\|(\nabla f)_{x_n}\| \to 0$, then there exists a convergent subsequence of $\{x_n\}$ (the limit being necessarily a critical point).
Remark: This condition appears in (2) and (3) of the bibliography.
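
A one-dimensional illustration of how the condition can fail may help. The example is not from the text: on $M = R$ the function $f(x) = \arctan x$ is bounded and $f'(x) = 1/(1 + x^2)$ tends to zero along $x_n = 10^n$, yet $f$ has no critical point and the sequence has no convergent subsequence, so $f$ does not satisfy the P-S condition. The sketch below merely tabulates this behaviour.

```python
import math

def f(x):
    return math.atan(x)

def grad_f(x):
    # derivative of arctan
    return 1.0 / (1.0 + x * x)

# a sequence escaping to infinity
xs = [10.0 ** k for k in range(1, 7)]
values = [f(x) for x in xs]        # stays bounded by pi/2
grads = [grad_f(x) for x in xs]    # tends to zero

for x, v, g in zip(xs, values, grads):
    print(f"x = {x:9.0f}   f(x) = {v:.6f}   |f'(x)| = {g:.3e}")
```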
4.71. Theorem (non-critical neck principle for Riemannian manifolds): Let $M$ be a complete Riemannian manifold, $f \in C^\infty(M)$, and consider the sets $N \subset W \subset W_1 \subset M_1 \subset M$ defined by
    $W = \{x;\ a < f(x) < b\}$
    $W_1 = \{x;\ a < f(x) \le b\}$
    $M_1 = \{x;\ a - \varepsilon < f(x) < b + \varepsilon\}$
    $N = f^{-1}(c)$,
where $a$, $b$, $c$ and $\varepsilon$ are real numbers, $\varepsilon > 0$ and $a < c < b$. Assume that the following hypotheses are satisfied:
($\alpha$) $f$ satisfies the P-S condition on $M_1$;
($\beta$) $f$ has no critical point in $M_1$.
Then $W$ is diffeomorphic to $N \times (a, b)$ and (if $N$ has empty boundary) $W_1$ is diffeomorphic to $N \times (a, b]$. Moreover, the diffeomorphism may be chosen so as to send $N \times \{z\}$ onto $f^{-1}(z)$ diffeomorphically for every $z$ between $a$ and $b$.

Proof: Define a vector field $v$ by $v = \nabla f$ (and let $\sigma$ be its flow). We shall show that $v$ and $f$ satisfy hypothesis (a) of 4.67 and (b') of 4.68.
First of all, we see that $vf = (\nabla f)f = (\nabla f, \nabla f) = \|\nabla f\|^2$ and that hypotheses ($\alpha$) and ($\beta$) imply that $\|\nabla f\|$ is bounded away from 0 on $M_2 = \{x;\ a - \varepsilon/2 < f(x) < b + \varepsilon/2\}$, i.e., $\|\nabla f(z)\| \ge \delta > 0$ if $z \in M_2$. Thus, hypothesis (a) of 4.67 is satisfied. So is hypothesis (b'). In fact, assume that

    [*]  $f(\sigma(t, p)) \le b + \varepsilon/2$  for $0 \le t < t_+ = t_+(p)$.

Then

    $\frac{d}{dt} f(\sigma(t, p)) = (vf)(\sigma(t, p)) = \|\nabla f(\sigma(t, p))\|^2$,

and hence

    [**]  $\int_0^{t_+} \|\nabla f(\sigma(t, p))\|^2\,dt = \lim_{d \to t_+} \int_0^{d} \frac{d}{dt} f(\sigma(t, p))\,dt \le \sup_{0 \le u < t_+} f(\sigma(u, p)) - f(p) \le b + \varepsilon/2 - f(p)$.

Since $\|\nabla f\| \ge \delta$ on $M_2$, it follows from [*] and [**] that

    $\delta^2 t_+ \le b + \varepsilon/2 - f(p)$,

whence $t_+$ is finite.
But then we also have (from [**], using the Schwarz inequality on the finite interval $[0, t_+)$):

    [***]  $\int_0^{t_+} \|\nabla f(\sigma(t, p))\|\,dt < +\infty$.

Since $\sigma$ is the flow of $\nabla f$, we conclude that $\nabla f(\sigma(t, p)) = \frac{d\sigma}{dt}(t, p)$, and then [***] implies:

    [****]  $\int_0^{t_+} \left\| \frac{d\sigma}{dt}(t, p) \right\| dt < +\infty$.

Assume now that $\eta > 0$ is given. Then $T < t_+$ may be chosen so that

    $\int_T^{t_+} \left\| \frac{d\sigma}{dt}(t, p) \right\| dt < \eta$.

Consider now two points $\sigma(x, p)$, $\sigma(y, p)$ with $T \le x \le y < t_+$. They may be joined by the curve $\gamma(t) = \sigma(t, p)$, $x \le t \le y$, whose length by the last formula is less than $\eta$. Therefore the distance $\varrho(\sigma(x, p), \sigma(y, p)) < \eta$ and we have proved thereby that the net $\sigma(t, p)$, $0 \le t < t_+$, is a Cauchy net (cf. Kelley, General Topology). Since $M$ is complete, $\lim_{t \to t_+} \sigma(t, p)$ exists, which together with $t_+ < +\infty$ contradicts Prop. 4.45. Thus hypothesis (b') of 4.68 is satisfied (for $\eta = \varepsilon/2$) and then 4.68 applies.
Q.E.D.
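
The key estimate of the proof, that a lower bound $\|\nabla f\| \ge \delta$ forces the gradient flow to cross from one level to another within time $(b - f(p))/\delta^2$, can be observed on a toy example. The sketch below is illustrative only: the function $f(x, y) = x + \tfrac14 \sin y$ on $R^2$ (for which $\|\nabla f\|^2 \ge 1$, so $\delta = 1$), the starting point, and the helper names are all assumptions, and the flow is integrated with a standard fourth-order Runge-Kutta step.

```python
import math

def f(p):
    x, y = p
    return x + 0.25 * math.sin(y)

def grad_f(p):
    _, y = p
    return (1.0, 0.25 * math.cos(y))      # |grad f|^2 = 1 + cos(y)^2 / 16 >= 1

def rk4_step(p, dt):
    """One Runge-Kutta step for the flow d(sigma)/dt = grad f(sigma)."""
    def shifted(k, c):
        return (p[0] + c * k[0], p[1] + c * k[1])
    k1 = grad_f(p)
    k2 = grad_f(shifted(k1, dt / 2))
    k3 = grad_f(shifted(k2, dt / 2))
    k4 = grad_f(shifted(k3, dt))
    return (
        p[0] + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
        p[1] + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6,
    )

def time_to_level(p, b, dt=1e-3, t_max=100.0):
    """Follow the flow from p until f reaches the level b."""
    t = 0.0
    while f(p) < b and t < t_max:
        p = rk4_step(p, dt)
        t += dt
    return t, p

p0 = (0.0, 0.3)
b_level, delta = 2.0, 1.0
t_hit, p_hit = time_to_level(p0, b_level)
bound = (b_level - f(p0)) / delta ** 2     # the bound on t_+ from the proof
print(t_hit, bound)
```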

Remark: Observe that hypotheses ($\alpha$) and ($\beta$) are independent of $c$. Therefore, the conclusion is true for any $c$ between $a$ and $b$.

4.72. Corollary: Under the hypotheses of Theorem 4.71, there exists a homotopy $H : M \times I \to M$ ($I = [0, 1]$) having the properties:
1. for every $s \in I$, $H_s : M \to M$ is a diffeomorphism;
2. if $m \in M$ does not satisfy $a - \varepsilon/8 \le f(m) \le b + \varepsilon/8$, then $H_s(m) = m$ for all $s$;
3. $H_0$ = identity;
4. $H_1(\{x;\ f(x) \le a\}) = \{x;\ f(x) \le b\}$.

Proof: We have shown that the hypotheses of 4.71 imply those of 4.68, and hence of its corollary.

C. Local Study of Critical Points

Let $E$ be a Hilbert space and $f$ a smooth real function defined on a neighborhood of $0 \in E$. Using Taylor's expansion we write

    $f(x) = f(0) + f'(0)(x) + \tfrac12 f''(0)(x, x) + R(x)$,

where $R(x)$ is a function of order 3.
Assume that 0 is a critical point of $f$. Then:
    [*]  $f(x) = f(0) + \tfrac12 f''(0)(x, x) + R(x)$.

Since the bilinear form $f''(0)$ is continuous and symmetric, there exists (see 4.54) a (unique) symmetric operator $A \in \mathrm{End}(E)$ such that
    $f''(0)(x, y) = (Ax, y) = (x, Ay)$.
Formula [*] then becomes
    [**]  $f(x) = f(0) + \tfrac12 (Ax, x) + R(x)$.
Let us consider a smooth change of coordinates $y \to x(y)$, such that $x(0) = 0$. Then $f(y) = f_1(x(y))$ and using the chain rule we obtain:
    $f''(y)(z_1, z_2) = f_1''(x(y))(x'(y)z_1,\ x'(y)z_2) + f_1'(x(y))(x''(y)(z_1, z_2))$.
Since 0 is a critical point, we get:
    $f''(0)(z_1, z_2) = f_1''(0)(x'(0)z_1,\ x'(0)z_2)$.
This formula shows that the operator $A$ transforms according to:
    [***]  $A = u^* A_1 u$, where $u = x'(0)$.
4.73. Definition: Let $f$ be a smooth function defined on some Riemannian manifold, $x$ a critical point of $f$. We shall say that $x$ is a non-degenerate critical point of $f$ if in any chart, the operator $A$ defined in [**] is invertible.

Remark: The formula [***] shows this notion to be coordinate independent.


We are led to the same concept as follows. If $x$ is a critical point and $\phi$ is a chart around $x$, the bilinear form $H(f)_x$ defined on $TM_x$ by
    $H(f)_x(\mu, \lambda) = [(f \circ \phi^{-1})''(\phi(x))](\phi_*(x)\mu,\ \phi_*(x)\lambda)$
does not depend on $\phi$. Hence we may make the following definition.

4.74. Definition: The bilinear form $H(f)_x$ is called the Hessian of $f$ at $x$.
According to our definitions, the Hessian of a smooth function is a smooth section of the bundle $B(T(M), T(M)) = T_2(M)$ defined on the set of critical points of $f$.

4.75. Definition: A critical point $x$ is called non-degenerate if $H(f)_x$ is a scalar product defining the given topology of $TM_x$.
It is obvious that 4.73 and 4.75 are equivalent.

Remark: A very elegant definition of $H(f)_x$ is given in Milnor, "Morse Theory" (Ann. of Math. Studies, No. 51, Princeton 1963).

Let $F$ be a complex Hilbert space. Denote by End($F$) the space of all the continuous linear operators $T : F \to F$ and by Aut($F$) (respectively $H(F)$) the subset of invertible operators (respectively, the subspace of the Hermitian operators).
4.76. Lemma: If $A \in \mathrm{Aut}(F) \cap H(F)$, then the mapping $\psi : \mathrm{End}(F) \to H(F) \times H(F)$ defined by
    $\psi(B) = (B^*A + AB,\ i(B^*A - AB))$
is one-to-one onto and both $\psi$ and $\psi^{-1}$ are continuous.
Proof: In fact, given $S$ and $T$ symmetric, define
    $B = \tfrac12 A^{-1}(S + iT)$.
Then $B^*A + AB = S$ and $i(B^*A - AB) = T$, and hence
    $(S, T) \to \tfrac12 A^{-1}(S + iT)$
is the inverse of $\psi$. Clearly both are continuous.
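
For the reader who wishes to see the computation behind the proof of 4.76 written out, the following lines are a sketch (using only $A^* = A$, $S^* = S$, $T^* = T$) of why $B = \tfrac12 A^{-1}(S + iT)$ is the unique preimage of $(S, T)$.

```latex
\begin{aligned}
AB &= \tfrac12 (S + iT), \\
B^*A &= (AB)^* = \tfrac12 (S - iT), \\
B^*A + AB &= S, \\
i\,(B^*A - AB) &= i\,(-iT) = T. \\[4pt]
\text{Conversely, } \psi(B) = (S, T)
  &\;\Longrightarrow\; 2AB = (B^*A + AB) - (B^*A - AB) = S + iT
  \;\Longrightarrow\; B = \tfrac12 A^{-1}(S + iT).
\end{aligned}
```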
4.77. Lemma: Assume that $A \in \mathrm{Aut}(F) \cap H(F)$. Then the mapping $\phi : \mathrm{Aut}(F) \to H(F) \times H(F)$ defined by
    $\phi(B) = (B^*AB,\ i(B^*AB^{-1} - A))$
is differentiable and its derivative at $B = I$ is

    $\delta\phi(I, B) = \psi(B)$.
Proof: Observe that $\delta(B^{-1})(I, B) = -B$ and compute.
4.78. Corollary: $\phi$ maps a neighborhood of $I \in \mathrm{Aut}(F)$ diffeomorphically onto a neighborhood of $(A, 0) = \phi(I)$.
Proof: Use the implicit function theorem.
Assume now that $x \to (A(x), D(x))$ is a smooth mapping from an open set of $F$ into $H(F) \times H(F)$ such that $A(0) = A$ and $D(0) = 0$. Then
4.79. Lemma: $x \to B(x) = \phi^{-1}(A(x), D(x))$ is a smooth mapping such that $B(0) = I$ and $A(x) = B^*(x) A(0) B(x)$ and
    $D(x) = i(B^*(x) A(0) B^{-1}(x) - A(0))$.
Proof: Follows from the corollary above.
Assume now that $E$ is a real Hilbert space.

4.80. Proposition: Let $x \to A(x)$ be a smooth mapping from a neighborhood of $0 \in E$ into End($E$) such that $A(x)$ is symmetric and $A(0)$ is also invertible. Then there exists a smooth mapping $x \to B(x)$ of some neighborhood of $0 \in E$ into Aut($E$) such that:
    $A(x) = B^*(x) A(0) B(x)$.
Proof: Let $F$ be the complexification of $E$: $F = E \otimes C = E \oplus iE$. Define the mappings $x + iy = z \to A(z)$ by $A(z)(u + iv) = A(x)u + iA(x)v$ and $z \to D(z)$ by $D(z) = 0$. Apply Lemma 4.79 to prove that there exists a map $z \to B(z)$ such that:
    (1)  $B^*(z) A(0) B(z) = A(z)$
    (2)  $B^*(z) A(0) (B(z))^{-1} = A(0)$.
From (1) and (2) we conclude that:
    (3)  $(B(z))^2 = (A(0))^{-1} A(z)$
    (4)  $(B^*(z))^2 = A^*(z) (A^*(0))^{-1}$.
Now observe that for every $x \in E$, $A(x)$ and $A(0)$ leave $E$ invariant: $A(x)E \subset E$, $A(0)E = E$. Then (3) and (4) imply that $E$ is also invariant under $(B(x))^2$ and $(B^*(x))^2$. Now we shall use the following statement (see below for a justification):
(5) If an operator $T$ satisfies $\|I - T^2\| < 1$, then the invariant (closed) subspaces of $T$ and $T^2$ are the same.
From (5), it follows that $E$ is also invariant under $B(x)$ and $B^*(x)$, provided that $x \in E$ is near 0. Hence, by restriction to $E$ (and calling $B(x) = B(x)|_E$) we obtain from (1)
    $B^*(x) A(0) B(x) = A(x)$,  $x \in E$, $x$ near 0,
as desired.
Justification of (5): observe that
    $T = (I - (I - T^2))^{1/2} = I - \tfrac12 (I - T^2) - \tfrac18 (I - T^2)^2 - \dots - \frac{(2(n-1))!}{2^{2n-1}\, n\, ((n-1)!)^2} (I - T^2)^n - \dots$
where the series converges in the uniform topology of operators if $\|I - T^2\| < 1$.
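
The coefficients in the justification of (5) are those of the binomial series for $(1 - x)^{1/2}$. As a sanity check in the scalar case (a sketch only, with $x$ playing the role of $I - T^2$ and the hypothetical helper names `coeff` and `sqrt_series`), one can compare a partial sum with the square root itself:

```python
from math import factorial, sqrt

def coeff(n):
    """n-th coefficient (2(n-1))! / (2^(2n-1) n ((n-1)!)^2), n >= 1."""
    return factorial(2 * (n - 1)) / (2 ** (2 * n - 1) * n * factorial(n - 1) ** 2)

def sqrt_series(x, terms=200):
    """Partial sum of 1 - sum_n coeff(n) x^n, intended for |x| < 1."""
    return 1.0 - sum(coeff(n) * x ** n for n in range(1, terms + 1))

x = 0.5                           # scalar stand-in for I - T^2
print(sqrt_series(x), sqrt(1.0 - x))
print(coeff(1), coeff(2), coeff(3))
```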

4.81. Proposition: Let $A$ be a symmetric invertible operator in End($E$) ($E$ = real Hilbert space). Then there exist $T \in \mathrm{Aut}(E)$ and a projector $P \in \mathrm{End}(E)$ such that
    $(Ax, x) = \|PTx\|^2 - \|(I - P)Tx\|^2$,  $x \in E$.
Proof: Let $h$ be the characteristic function of $[0, \infty)$ and $g$ the function $g(\lambda) = |\lambda|^{-1/2}$, $\lambda$ real, $\lambda \ne 0$. Since $A$ is invertible, $g$ is continuous on the spectrum of $A$. Then $S = g(A)$ is defined.
Clearly $S$ (being a function of $A$) is symmetric and commutes with $A$. Moreover $S$ is invertible (because $g \ne 0$ on Spectrum($A$)). Call $T = S^{-1}$; $T$ is symmetric and invertible.
Now define $P = h(A)$. $P$ is clearly a projector (because $h^2 = h$) commuting with $A$, hence also with $T$. Since we have
    $\lambda (g(\lambda))^2 = h(\lambda) - (1 - h(\lambda))$,
we conclude that $AT^{-2} = P - (I - P)$, and hence $A = PT^2 - (I - P)T^2$.
But then
    $(Ax, x) = (PT^2x, x) - ((I - P)T^2x, x) = \|PTx\|^2 - \|(I - P)Tx\|^2$,
as desired.
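
In finite dimensions, and for a diagonal operator, the construction of 4.81 is completely explicit. The sketch below assumes $A$ is given by a list of non-zero eigenvalues (a hypothetical representation chosen only for illustration): then $T = |A|^{1/2}$ and $P$ projects onto the positive eigenvalues.

```python
def decompose(eigs):
    """For A = diag(eigs): T = S^{-1} with S = g(A), and P = h(A)."""
    T = [abs(lam) ** 0.5 for lam in eigs]            # T = |A|^{1/2}
    P = [1.0 if lam >= 0 else 0.0 for lam in eigs]   # h = indicator of [0, oo)
    return T, P

def quad_form(eigs, x):
    """(Ax, x) for the diagonal operator A."""
    return sum(lam * xi * xi for lam, xi in zip(eigs, x))

def split_form(T, P, x):
    """|PTx|^2 - |(I - P)Tx|^2."""
    pos = sum((p * t * xi) ** 2 for p, t, xi in zip(P, T, x))
    neg = sum(((1.0 - p) * t * xi) ** 2 for p, t, xi in zip(P, T, x))
    return pos - neg

eigs = [4.0, -9.0, 0.25]
T, P = decompose(eigs)
x = [1.0, 2.0, -3.0]
print(quad_form(eigs, x), split_form(T, P, x))
```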
4.82. Proposition (Morse Lemma): Let $f$ be a smooth function defined on a Riemannian manifold $M$ (modelled on $E$). If $x$ is a non-degenerate critical point of $f$, then there exists a chart $\phi$ around $x$ (sending $x$ into 0) and a projector $P$ in $E$ such that
    (1)  $f(y) = f(x) + |P\phi(y)|^2 - |(I - P)\phi(y)|^2$
when $y$ belongs to the domain of $\phi$.
Proof: Let $\psi$ be any chart around $x$ (sending $x$ into 0) and put $g(e) = (f\psi^{-1})(e) - f(x)$, where $e \in E$ and $e$ is near 0. Then, for $y \in E$ near 0,

    (2)  $g(y) = g(0) + g'(0)y + \left[\int_0^1 (1 - s)\, g''(sy)\, ds\right](y, y) = \left[\int_0^1 (1 - s)\, g''(sy)\, ds\right](y, y)$.

Now, since $\alpha_y = \int_0^1 (1 - s)\, g''(sy)\, ds$ is a symmetric bilinear form on $E$, there exists a mapping $y \to A(y) \in \mathrm{End}(E)$ defined by
    (3)  $(A(y)u, v) = \alpha_y(u, v)$.
Clearly every $A(y)$ is symmetric and
    $(A(0)u, v) = \alpha_0(u, v) = \tfrac12 g''(0)(u, v)$.
Together with the fact that $x$ is a nondegenerate critical point of $f$ this implies that $A(0)$ is invertible. Now we apply 4.80 and obtain $B(y)$ satisfying:
    $A(y) = B^*(y) A(0) B(y)$.
This implies that
    (4)  $(A(y)y, y) = (A(0)B(y)y,\ B(y)y)$,
and from (2), (3) and (4) we obtain:
    $g(y) = (A(0)B(y)y,\ B(y)y)$.
Using 4.81, we conclude that there exist $T$ and $P$ such that

    (5)  $g(y) = |PTB(y)y|^2 - |(I - P)TB(y)y|^2$.

Define $\phi$ by
    $\phi(y) = T(B(\psi(y))\psi(y))$.
By (5), we have:
    $f(y) = f(x) + g(\psi(y)) = f(x) + |PTB(\psi(y))\psi(y)|^2 - |(I - P)TB(\psi(y))\psi(y)|^2 = f(x) + |P\phi(y)|^2 - |(I - P)\phi(y)|^2$,
as desired.
We must observe that $\phi$ is actually a chart; indeed it is the composition of
    $y \to \psi(y)$
    $e \to B(e)e$
    $z \to T(z)$
and clearly the first and third mappings are diffeomorphisms while $y \to B(y)y$, having the identity as derivative at the origin (compute!), is also a local diffeomorphism. Hence $\phi$ is a chart on some domain around $x$.
Q.E.D.
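
The steps of the proof can be traced on a one-dimensional example. The sketch below is an illustration under stated assumptions, not part of the text: it takes $E = R$, $g(y) = y^2 + y^3$ (so 0 is a non-degenerate critical point), computes $A(y) = \int_0^1 (1 - s)\,g''(sy)\,ds$ by a midpoint rule, and uses that in one dimension $B(y) = (A(y)/A(0))^{1/2}$, $T = |A(0)|^{1/2}$ and $P = I$ (because $A(0) > 0$). The resulting chart should satisfy $g(y) = \phi(y)^2$ up to quadrature error.

```python
import math

def g(y):
    return y ** 2 + y ** 3

def g2(y):
    """Second derivative of g."""
    return 2.0 + 6.0 * y

def A(y, n=2000):
    """Midpoint-rule approximation of the integral of (1 - s) g''(s y) over [0, 1]."""
    h = 1.0 / n
    return sum((1.0 - (i + 0.5) * h) * g2((i + 0.5) * h * y) for i in range(n)) * h

def chart(y):
    """phi(y) = T B(y) y in the one-dimensional case (P = I since A(0) > 0)."""
    B = math.sqrt(A(y) / A(0.0))
    T = math.sqrt(abs(A(0.0)))
    return T * B * y

for y in (-0.4, -0.1, 0.2, 0.4):
    print(y, g(y), chart(y) ** 2)
```

Here the exact values are $A(y) = 1 + y$ and $\phi(y) = y\,(1 + y)^{1/2}$, so the chart is only defined for $y > -1$, consistent with the local nature of the lemma.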

D. Global Study of Critical Points

We begin by defining handles and the attaching of handles.

4.83. Definition: For every cardinal number $k$ we shall denote by $D^k$ the closed unit ball around 0 in a Hilbert space having an orthonormal basis of cardinality $k$. $\partial D^k$ will denote its boundary.

4.84. Definition: Let $M$ and $\tilde M$ be two smooth manifolds (possibly with boundary).
We shall say that $\tilde M$ has been obtained from $M$ by attaching a handle of type $(k, l)$ if the following conditions are satisfied:
1. $M$ is a regularly embedded submanifold of $\tilde M$;
2. There exists a closed subset $H \subset \tilde M$ and a mapping $h : D^k \times D^l \to H$, such that:
2a. $M \cup H = \tilde M$,
2b. $h$ is a homeomorphism,
2c. $h(\partial D^k \times D^l) = H \cap M \subset \partial M$,
2d. $H - M$ is a submanifold of $\tilde M$ with boundary,
2e. the restriction $h\,|\,(\mathring{D}^k \times D^l)$ is a diffeomorphism of $\mathring{D}^k \times D^l$ onto $H - M$, and
2f. the restriction $h\,|\,(\partial D^k \times D^l)$ is a regular embedding of $\partial D^k \times D^l$ into $M$.
In this situation, we use the following notation for $\tilde M$: $\tilde M = M \cup_h H$ $(k, l)$.

Remark: Obviously dim $\tilde M = k + l$.

We shall say that $H$ is a handle.
As a generalization, we state the following definition:
4.85. Definition: We shall say that $\tilde M$ has been obtained from $M$ by attaching $n$ handles of types $(k_1, l_1), \dots, (k_n, l_n)$ if separately attaching $n$ such handles to $M$ in such a way that $H_i \cap H_j = \emptyset$ for $i \ne j$ the manifold obtained is $\tilde M$.
In this situation, we use the following notation for $\tilde M$: $\tilde M = M \cup_{h_1} H_1 \dots \cup_{h_n} H_n$. (See next figure.)

[Figure: a manifold with several handles attached; not reproduced.]

Assume now that $f$ is a smooth real-valued function defined on some Riemannian manifold $M$ (modelled on $E$) and that $x \in M$ is a non-degenerate critical point of $f$. By 4.82, $f$ may be represented, in some chart $\phi$ around $x$, as
    $f(y) = f(x) + |P\phi(y)|^2 - |(I - P)\phi(y)|^2$,

where $P$ is a projector in $E$.

4.86. Definition: The index of $f$ at $x$ (or the index of $x$, if there is no confusion) is the pair $(k, l)$, where $k = \dim(PE)$, $l = \dim((I - P)E)$.
Of course both $k$ and $l$ may be infinite cardinal numbers.
We arrive now at the most important theorems of this section.
The symbol $\approx$ will mean "is diffeomorphic to".
Let $(M, g)$ be a complete Riemannian manifold.
Given $f \in C^\infty(M)$, then for every $s, t \in R$, define
    $[f \le s] = \{x \in M;\ f(x) \le s\}$
    $[s \le f \le t] = \{x \in M;\ s \le f(x) \le t\}$.
4.87. Theorem (Critical neck principle): Assume that $f \in C^\infty(M)$ and that the only critical level of $f$ between $a$ and $b$ is $c$, $a < c < b$; $a$ and $b$ are supposed to be non-critical. Assume also that $f$ satisfies the P-S condition on $[a \le f \le b]$ and that its critical points $p_1, \dots, p_n$ (at level $c$) are non-degenerate, their indices being $(k_i, l_i)$, $i = 1, \dots, n$. Then
    [*]  $[f \le b] \approx [f \le a] \cup_{h_1} H_1 \dots \cup_{h_n} H_n$,

where (for every $i = 1, \dots, n$) $H_i$ is a $(k_i, l_i)$ handle.

Remark: The number of critical points is finite.

Proof: 1. Considering $f - c$ instead of $f$, we may assume that $c = 0$, $a < 0 < b$.

2. Moreover, by 4.72, it follows that
    $[f \le s] \approx [f \le t]$  if $a < s, t < 0$
and
    $[f \le s] \approx [f \le t]$  if $0 < s, t < b$.
Hence [*] is equivalent to
    $[f \le t] \approx [f \le s] \cup_{h_1} H_1 \dots \cup_{h_n} H_n$
for some $s$, $t$ with $a < s < 0 < t \le b$.
3. Consider now charts $\phi_1, \dots, \phi_n$ around $p_1, \dots, p_n$, sending $p_i$ into $0 \in E$ and such that, for $x$ near $p_i$,
    $f(x) = \|P_i\phi_i x\|^2 - \|Q_i\phi_i x\|^2$,
where $P_i = I - Q_i$ is a projector of $E$.
The existence of such charts follows from 4.82.

The ball of radius $10r$ is mapped by $\phi_i^{-1}$ into a neighborhood $U_i(r)$ of $p_i$ and the chart $\phi_i' : x \to \frac{1}{r}\phi_i(x)$ sends $U_i(r)$ onto the ball of radius 10 in $E$: call this ball $B$. We have
    $f(x) = r^2(\|P_i\phi_i' x\|^2 - \|Q_i\phi_i' x\|^2)$.
If $r$ is chosen small enough, we have $ar^{-2} < -1$ and $2 < br^{-2}$. Replacing $f$ by $r^{-2}f$ and $\phi_i$ by $\phi_i'$, we reduce the problem to the case in which there are charts $\phi_1, \dots, \phi_n$ around $p_1, \dots, p_n$ whose domains $U_1, \dots, U_n$ have disjoint closures and which are all mapped onto the ball of radius 10 around $0 \in E$.

3. Now $f$ may be expressed near $p_i$ as

    $f(x) = \|P_i\phi_i x\|^2 - \|Q_i\phi_i x\|^2$,
where $P_i = I - Q_i$ is a projector of $E$.
All the levels $c$ between $\alpha$ and $\beta$ are non-critical ($\alpha$ and $\beta$ included) except $c = 0$, where $\alpha$ and $\beta$ satisfy $\alpha < -1$, $2 < \beta$.
4. The old sets $[f \le a]$ and $[f \le b]$ correspond to the new levels $\alpha$ and $\beta$ and then coincide with the new $[f \le \alpha]$ and $[f \le \beta]$; but, from 2. we have $[f \le \alpha] \approx [f \le -1]$ and $[f \le 2] \approx [f \le \beta]$ and hence [*] now becomes
    (4a)  $[f \le 2] \approx [f \le -1] \cup_{h_1} H_1 \dots \cup_{h_n} H_n$.
5. To simplify notations, define $u_i$ and $v_i$ on $U_i$ by
    $u_i(x) = \|P_i\phi_i x\|$
    $v_i(x) = \|Q_i\phi_i x\|$.
Then we have
    $f(x) = (u_i(x))^2 - (v_i(x))^2$,  $x \in U_i$.

6. The remaining proof will depend on the existence of a function $\Lambda$ having the following properties:
    (6a) $\Lambda \in C^\infty(M)$;
    (6b) $\Lambda \ge 0$; $\Lambda = 0$ outside $U_1 \cup \dots \cup U_n$;
    (6c) $\Lambda(p_i) = 3/2$, $i = 1, \dots, n$;
    (6d) if $f(x) \ge 2$, then $\Lambda(x) = 0$.
Put $g = f - \Lambda$; we also require that
    (6h) $g$ has the same critical points as $f$;
    (6j) $g$ satisfies the P-S condition;
    (6k) $[f \le -1] \cup_{h_1} H_1 \dots \cup_{h_n} H_n = [g \le -1]$.
7. Let us first see why the existence of such a $\Lambda$ implies (4a) and hence the theorem. First of all, from (6b) and (6d) it follows that
    (7a)  $[f \le 2] = [g \le 2]$.
On the other hand, by (6c), $g(p_i) = -3/2 < -1$. Hence, by (6h), all the levels between $-1$ and 2 are non-critical for $g$.
That means (see 4.72) that
    (7b)  $[g \le -1] \approx [g \le 2]$.
Finally, from (6k), (7a) and (7b) we obtain (4a), as desired.
8. Existence of $\Lambda$
Let $\lambda$ and $\eta$ be two smooth real functions of a real variable as in the diagrams below:

[Figures: graphs of $\lambda(x)$ and $\eta(x)$, not reproduced; the surviving axis ticks for $\eta$ are 2 and 8. As used in the argument below, $0 \le \lambda \le 1$, $\lambda = 1$ on $[0, 1/2]$, $\lambda(x) > 0$ only for $x < 1$, and $\eta = 1$ on $[0, 2]$.]

We also assume that
    (8a)  $\eta' > -2/3$.
Let us denote by $p, \phi, U, P, Q, u, v$ a particular (but not specified) set $p_i, \phi_i, U_i, P_i, Q_i, u_i, v_i$, $i = 1, \dots, n$. Let us agree, moreover, that whenever $x$ and $\phi$, $U$, $P$, $Q$, $u$ or $v$ appear in the same formula, then the choice is determined by $x \in U$, and $u$, $v$ mean $u(x)$, $v(x)$, respectively.
Define $\Lambda$ by
    (8b)  $\Lambda(x) = \tfrac32 \lambda(u^2)\,\eta(v^2)$  if $x \in U_1 \cup \dots \cup U_n$,
          $\Lambda(x) = 0$  otherwise.
Clearly, $\Lambda$ is smooth (because $\Lambda = 0$ outside $\phi^{-1}$(ball of radius 3)) and non-negative.
Put $g = f - \Lambda$.
9. Proof of the desired properties (6a) - (6k)
Properties 6a, b and c are clearly satisfied. Moreover, since $f = u^2 - v^2$, if $f \ge 2$, necessarily $u^2 \ge 2$ and then $\lambda(u^2) = 0$. Thus we have (6d).

Since $\nabla g = \nabla f - \nabla\Lambda$, $x \in M$ is a critical point of $g$ if and only if $(\nabla f)_x = (\nabla\Lambda)_x$.

Now, for $x \in [-1 \le f \le 2]$ but outside $U_1 \cup \dots \cup U_n$, we have $(\nabla\Lambda)_x = 0$ and $(\nabla f)_x \ne 0$, because the only critical points of $f$ are $p_1, \dots, p_n$.
That means that $g$ (like $f$) has no critical points in $[-1 \le f \le 2]$ outside $U_1 \cup \dots \cup U_n$.

Assume now that $x \in U$, $x \ne p$. Then the differential of $g$ is

    (9.1)  $g_*(x) = 2u\,(1 - \tfrac32 \lambda'(u^2)\,\eta(v^2))\,u_*(x) + 2v\,(-1 - \tfrac32 \lambda(u^2)\,\eta'(v^2))\,v_*(x)$.
We shall verify that $g_*(x) \ne 0$.
In order to prove this we need the following statement.
(9.2) If $u(x)$ and $v(x)$ are both non-zero, then $u_*(x)$ and $v_*(x)$ are linearly independent.
In fact, from $u(x) \ne 0$, $v(x) \ne 0$ it follows that $e = P\phi x$ and $e_1 = Q\phi x$ are both different from $0 \in E$. Now consider the curves
    $\alpha(t) = \phi^{-1}(\phi(x) + te)$
    $\beta(t) = \phi^{-1}(\phi(x) + te_1)$.
Then
    $u_*(x)\,\alpha'(0) = \frac{d}{dt}\,(u \circ \alpha(t))\Big|_{t=0} = \frac{d}{dt}\,\|P\phi(\phi^{-1}(\phi(x) + te))\|\Big|_{t=0} = \frac{d}{dt}\,\|(1 + t)e\|\Big|_{t=0} = \|e\| \ne 0$.

Similarly we may obtain

    $u_*(x)\,\beta'(0) = 0$
    $v_*(x)\,\alpha'(0) = 0$
    $v_*(x)\,\beta'(0) = \|e_1\| \ne 0$.

That proves that $u_*(x)$ and $v_*(x)$ are linearly independent, establishing (9.2).
Assume now that $g_*(x) = 0$. If $u = 0$, then by (9.1),

    $g_*(x) = 2v\,(-1 - \tfrac32 \lambda(u^2)\,\eta'(v^2))\,v_*(x) = 0$.

Since $x \ne p$ and $u = 0$, it must be that $v_*(x) \ne 0$, and from $\eta'(v^2) > -2/3$, $0 \le \lambda \le 1$, it follows that $2v\,(-1 - \tfrac32 \lambda(u^2)\,\eta'(v^2)) = 0$ is only satisfied when $v = 0$ also, contradicting $x \ne p$.
The case $v = 0$ is treated similarly.
Finally, if $u \ne 0$, $v \ne 0$, then $u_*(x)$ and $v_*(x)$ are linearly independent by (9.2) and $g_*(x) = 0$ implies $2v\,(-1 - \tfrac32 \lambda(u^2)\,\eta'(v^2)) = 0$, which we have seen to be true only if $v = 0$, while we are assuming $v \ne 0$. Thus $g$ has no critical points other than $p_1, \dots, p_n$, which are, indeed, critical because $\Lambda$ is constant on $u < \sqrt2/2$, $v < \sqrt2$, whence $(\nabla g)_{p_i} = (\nabla f)_{p_i} = 0$.
Thus, property (6h) is satisfied by $\Lambda$.
In order to establish (6k) we observe that outside $U_1 \cup \dots \cup U_n$, $f$ and $g$ coincide by (6b). We shall consider each $U_i$ separately and prove that

    (9.3)  $U \cap [f \le -1] \cup_h H = U \cap [g \le -1]$.

Clearly this implies (6k).

Of course we shall deal with (9.3) after transposing it into $E$ by means of $\phi$.
Put $X = P(E)$, $Y = Q(E)$, $Q = I - P$, and $B$ = open ball of radius 10 around $0 \in E$. Clearly $E = X \oplus Y$, $X \perp Y$. We shall let $x$, $y$ denote elements in $X$ and $Y$ respectively and $x^2$, $y^2$ denote $x^2 = |x|^2$, $y^2 = |y|^2$. If $e$, $x$ and $y$ appear in the same formula, it must be understood that $e \in E$, $e = x + y$ and $x \in X$, $y \in Y$. Applying $\phi$, statement (9.3) may be written as follows.

(9.4) In $B$, the submanifolds:

    (9.4.1)  $V = [x^2 - y^2 \le -1]$

    (9.4.2)  $W = [x^2 - y^2 - \tfrac32 \lambda(x^2)\,\eta(y^2) \le -1]$

satisfy
    (9.4.3)  $V \cup_h H = W$,
where $H$ is a $(k, l)$ handle.
The proof depends on elementary (though sly) computations. First of all we observe that $W$ may be defined more conveniently as
    (9.5)  $W = [x^2 - y^2 - \tfrac32 \lambda(x^2) \le -1]$.

In fact, if $y^2 \le 2$, $\eta(y^2) = 1$ and then
    $x^2 - y^2 - \tfrac32 \lambda(x^2)\,\eta(y^2) = x^2 - y^2 - \tfrac32 \lambda(x^2)$;
if $y^2 > 2$ and $x^2 \ge 1$, then
    $x^2 - y^2 - \tfrac32 \lambda(x^2)\,\eta(y^2) = x^2 - y^2 - \tfrac32 \lambda(x^2) = x^2 - y^2$;
finally, if $y^2 > 2$ and $x^2 < 1$, then necessarily $x^2 - y^2 \le -1$ and consequently
    $x^2 - y^2 - \tfrac32 \lambda(x^2)\,\eta(y^2) \le -1$,
    $x^2 - y^2 - \tfrac32 \lambda(x^2) \le -1$.
This proves (9.5).
Define $K \subset B$ by
    (9.6)  $K = \{e \in B;\ x^2 + 1 - \tfrac32 \lambda(x^2) \le y^2 \le x^2 + 1\}$,
so that $K$ is the set of elements $e \in B$ satisfying
    (9.6.1)  $x^2 + 1 - \tfrac32 \lambda(x^2) \le y^2$,
and
    (9.6.2)  $y^2 \le x^2 + 1$.

Let $D^k$ and $D^l$ be the unit balls of $Y$ and $X$ respectively and define $h : D^k \times D^l \to K$ by
    (9.7)  $h(y, x) = (\sigma(y^2))^{1/2}\,x + (1 + \sigma(y^2)\,x^2)^{1/2}\,y$,

where $\sigma$ is the smooth function defined as follows: if $0 \le t < 1$, $\sigma(t)$ is the unique solution in $[0, 1]$ of
    (9.8)  $\frac{3}{2}\,\frac{\lambda(\sigma(t))}{1 + \sigma(t)} = 1 - t$.

This mapping is smooth and has the form shown in the following graph.

[Figure: graph of $\sigma$, not reproduced; the surviving axis ticks are 1/2 and 1.]
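
Equation (9.8) defines $\sigma$ only implicitly, so a numerical sketch may make its shape concrete. Everything specific below is an assumption made for illustration: the text does not fix $\lambda$ beyond its graph, so a standard smooth cut-off (equal to 1 on $[0, 1/2]$, equal to 0 on $[1, \infty)$, decreasing in between) is used, and $\sigma(t)$ is found by bisection on $[1/2, 1]$, where the left-hand side of (9.8) decreases from 1 to 0.

```python
import math

def rho(t):
    """exp(-1/t) for t > 0 and 0 otherwise (standard smooth building block)."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def smooth_step(u):
    """Smooth, non-decreasing, 0 for u <= 0 and 1 for u >= 1."""
    return rho(u) / (rho(u) + rho(1.0 - u))

def lam(x):
    """Assumed cut-off: 1 on [0, 1/2], 0 on [1, oo), decreasing in between."""
    return smooth_step(2.0 - 2.0 * x)

def sigma(t, iterations=60):
    """Solve (3/2) lam(s) / (1 + s) = 1 - t for s in [1/2, 1], 0 <= t < 1."""
    lo, hi = 0.5, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if 1.5 * lam(mid) / (1.0 + mid) >= 1.0 - t:
            lo = mid          # left-hand side still too large: move right
        else:
            hi = mid
    return lo

for t in (0.0, 0.25, 0.5, 0.75, 0.99):
    print(t, sigma(t))
```

With this choice $\sigma(0) = 1/2$ and $\sigma(t)$ increases towards 1 as $t \to 1$, in agreement with the ticks that survive from the graph.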

Since $\sigma$ is smooth, $h$ is also smooth. The reader will check that the image
    $H = h(D^k \times D^l)$
of $h$ is contained in $K$; this follows from the inequality
    $\sigma(y^2)\,x^2 + 1 - \tfrac32 \lambda(\sigma(y^2)\,x^2) \le (1 + \sigma(y^2)\,x^2)\,y^2$,
which is a consequence of
    $1 - y^2 = \tfrac32 \lambda(\sigma(y^2))\,(1 + \sigma(y^2))^{-1} \le \tfrac32 \lambda(\sigma(y^2)\,x^2)\,(1 + \sigma(y^2)\,x^2)^{-1}$;
it follows from this inequality that $H \subset W$; the other condition ($y^2 \le 1 + x^2$) is even easier and is left to the reader.
Now consider the function $S : H \to Y \times X$ defined by:

    (9.9)  $S(e) = \left( (1 + x^2)^{-1/2}\,y,\ \left[\sigma\!\left(\frac{y^2}{1 + x^2}\right)\right]^{-1/2} x \right)$.

$S$ is smooth and clearly $Sh$ = identity, $hS$ = identity.
In order to finish the proof it suffices to show that $H$ is a $(k, l)$ handle and that
    $V \cup_h H = W$.
To do this, we first show that
    $H = K \cap [x^2 \le 1]$.
In fact, it is obvious that $H \subset K \cap [x^2 \le 1]$.
Assume now that $e = x + y \in K$ and $x^2 < 1$. If $x^2 \le 1/2$, then $x^2 \le \sigma(y^2/(1 + x^2))$ is trivial. If $x^2 \ge 1/2$, then there exists $0 \le t \le 1$ such that $\sigma(t) = x^2$. From the formula
    $x^2 + 1 - \tfrac32 \lambda(x^2) = x^2 + 1 - \tfrac32 \lambda(\sigma(t)) \le y^2$

it follows (use (9.8)) that:

    $x^2 + 1 - (1 + \sigma(t))(1 - t) \le y^2$,

    $(1 + x^2) - (1 + x^2)(1 - t) \le y^2$,

so that
    $(1 + x^2)\,t \le y^2$,
whence $t \le y^2/(1 + x^2)$. But since $\sigma$ is increasing, we have

    $x^2 = \sigma(t) \le \sigma\!\left(\frac{y^2}{1 + x^2}\right)$

and this shows that
    (i)  $\left| \left[\sigma\!\left(\frac{y^2}{1 + x^2}\right)\right]^{-1/2} x \right| \le 1$.
On the other hand, it is obvious from $y^2 \le x^2 + 1$ that
    (ii)  $|(1 + x^2)^{-1/2}\,y| \le 1$.

Now (i) and (ii) together imply that $S(e) \in D^k \times D^l$ and hence
    $e = hS(e) \in h(D^k \times D^l) = H$.
This proves that $H \supset K \cap [x^2 < 1]$ and we conclude that $H = K \cap [x^2 \le 1]$.
As a corollary, it follows that
    (9.10)  $H$ is closed.
(This is the required Property 4.84, 2 from the definition of "handle".)
We now show that
    (9.11)  $V \cup H = W$.
(This is Property 4.84, 2a.)
Plainly $V \cup H \subset W$. Let $e = x + y$ be a point such that
    (a)  $x^2 - y^2 - \tfrac32 \lambda(x^2) \le -1$.
If $e$ is not in $V$ we have
    (b)  $x^2 - y^2 > -1$.
(a) and (b) together imply that $\lambda(x^2) > 0$, whence
    (c)  $x^2 < 1$.
Now (a), (b) and (c) imply that $e$ belongs to $K \cap [x^2 \le 1] = H$, and we obtain the desired assertion $W \subset V \cup H$.

If $e \in H \cap V \subset K \cap V$, plainly $x^2 - y^2 = -1$, so $H \cap V \subset \partial V$ and $y^2 (1 + x^2)^{-1} = 1$, verifying (4.84.2c).
Similarly, if $e \in H - V$ then $x^2 - y^2 > -1$, so that $y^2 (x^2 + 1)^{-1} < 1$, which with (9.9) and (9.7) verifies (4.84.2d-2e). This completes the proof of (9.3) and of Theorem 4.87. Q.E.D.

E. The Morse Inequalities

We begin with a rapid review of homology theory. Our description will be based on the axiomatic characterization of the homology groups, as given for example in Eilenberg-Steenrod ([4]).
We denote by $(X, Y)$ a pair of topological spaces, $Y$ a subspace of $X$; $(X, \emptyset)$ is written as $X$. Under suitable restrictions on the class of spaces considered, we can associate with each pair $(X, Y)$ Abelian groups $H_k(X, Y)$, $k$ an integer ($H_k = \{0\}$ if $k < 0$). These groups will depend on a fixed group $G$ (the "coefficient group" of the theory), so we should denote them by $H_k(X, Y; G)$. We are mainly interested in two specific cases, namely $G$ = integers = $Z$ or $G$ = real numbers = $R$. In the case $G = R$ the $H_k$ are vector spaces (over $R$) and the Betti numbers $\beta_k(X, Y)$ of the pair $X$, $Y$ are defined by
    $\beta_k(X, Y) = \dim H_k(X, Y; R)$    (1.1)
We write
    $\phi : (X, Y) \to (\tilde X, \tilde Y)$
if $\phi$ is continuous, $\phi : X \to \tilde X$, $\phi(Y) \subset \tilde Y$.
Any such function induces a homomorphism for each $k$
    $\phi_* : H_k(X, Y) \to H_k(\tilde X, \tilde Y)$
having the following properties:
(a) $(\phi\psi)_* = \phi_*\psi_*$;
(b) if $i$ = identity, $i_*$ = identity.
(We can already deduce that homeomorphic pairs have isomorphic groups.)
(c) Let $\phi, \psi : (X, Y) \to (\tilde X, \tilde Y)$ be homotopic (here $\phi_t : (X, Y) \to (\tilde X, \tilde Y)$ for each stage $\phi_t$ of the homotopy). Then $\phi_* = \psi_*$.
We say that $(X, Y)$ and $(\tilde X, \tilde Y)$ are homotopically equivalent if there exist $\phi : (X, Y) \to (\tilde X, \tilde Y)$ and $\psi : (\tilde X, \tilde Y) \to (X, Y)$ such that $\psi\phi$ and $\phi\psi$ are homotopic to the respective identity mappings. Property (c) implies that homotopically equivalent pairs have isomorphic homology groups.

(d) Excision property: Take a pair $(X, Y)$, let $O \subset Y$ be an open set such that also $\bar O \subset Y$. Let $i$ = identity map from $(X - O, Y - O)$ to $(X, Y)$. Then
    $i_* : H_k(X - O, Y - O) \to H_k(X, Y)$
is an isomorphism onto.
(e) The homology groups are related to the coefficient group $G$ by
    $H_k(P) = G$ if $k = 0$,  $H_k(P) = \{0\}$ if $k > 0$,
where $P = (P, \emptyset)$ is a space consisting of a single point.
(f) There exist maps $\partial_k : H_k(X, Y) \to H_{k-1}(Y)$ (which we write simply as $\partial$, omitting the subindex) such that, if $\phi : (X, Y) \to (\tilde X, \tilde Y)$, then
    $\partial\phi_* = (\phi|Y)_*\,\partial$
(here we have designated by $\phi|Y$ the restriction of $\phi$ to $Y$, and by $(\phi|Y)_*$ the induced map on $H_k(Y) = H_k(Y, \emptyset)$).
(g) Exactness principle of Euler
Let $X \supseteq Y \supseteq \emptyset$; let
    $k : Y \to X$,  $j : X \to (X, Y)$
be inclusion maps, $j_*$, $k_*$ the induced maps on the homology groups. We can construct the sequence
    $\dots \to H_k(X) \xrightarrow{j_*} H_k(X, Y) \xrightarrow{\partial} H_{k-1}(Y) \xrightarrow{k_*} H_{k-1}(X) \xrightarrow{j_*} H_{k-1}(X, Y) \to \dots \to H_0(Y) \xrightarrow{k_*} H_0(X) \xrightarrow{j_*} H_0(X, Y) \to 0$

(the homology sequence of the pair $X$, $Y$). The exactness principle tells us that the homology sequence of any pair $(X, Y)$ is exact (i.e. the image of any group in the sequence under the corresponding homomorphism is equal to the kernel of the next homomorphism).
We note the following results for future use.
1. Let $S^n$ be the $n$-sphere, $G = Z$, $n \ne 0$. Then
    $H_k(S^n) = 0$ if $k > 0$, $k \ne n$,
    $H_n(S^n) = Z$,
    $H_0(S^n) = Z$.

2. Let $G$ be as before, $D^n$ be the $n$-disk. Then

    $H_k(D^n, S^{n-1}) = H_k(D^n, \partial D^n) = \{0\}$ if $k \ne n$,

    $H_n(D^n, S^{n-1}) = Z$.

3. Suppose $(X, Y) = \bigcup_{i=1}^{l} (X_i, Y_i)$, all $X_i$ disjoint. Then

    $H_k(X, Y) = \sum_{i=1}^{l} \oplus\, H_k(X_i, Y_i)$.

Example (i). To illustrate the use of the exactness principle we will deduce 2) from 1) and (e). Consider
    $H_k(S^{n-1}) \xrightarrow{k_*} H_k(D^n) \xrightarrow{j_*} H_k(D^n, S^{n-1}) \xrightarrow{\partial} H_{k-1}(S^{n-1}) \xrightarrow{k_*} H_{k-1}(D^n)$.
Since $D^n$ is homotopic to a point for any $n$, we have $H_k(D^n) = \{0\}$ for any $k \ne 0$. Therefore, if $k > 0$, $k \ne n$, we get the sequence

    $\{0\} \to H_k(D^n, S^{n-1}) \xrightarrow{\partial} \{0\} \to \{0\}$,

whose exactness implies readily that $H_k(D^n, S^{n-1}) = \{0\}$.

On the other hand, if $k = n$, the sequence is
    $\{0\} \to H_n(D^n, S^{n-1}) \xrightarrow{\partial} Z \to \{0\}$.

This time, exactness implies that $H_n(D^n, S^{n-1}) = Z$.

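Example (ii). We now note a result more general than the exactness principle. Let $X \supseteq Y \supseteq Z$; consider the inclusion maps
    $j : (X, Z) \to (X, Y)$
    $k : (Y, Z) \to (X, Z)$
    $l : Y \to (Y, Z)$

Using the induced mappings $j_*$, $k_*$, $l_*$ we can form the sequence:

    $\dots \to H_k(X, Z) \xrightarrow{j_*} H_k(X, Y) \xrightarrow{\partial'} H_{k-1}(Y, Z) \xrightarrow{k_*} H_{k-1}(X, Z) \xrightarrow{j_*} H_{k-1}(X, Y) \xrightarrow{\partial'} \dots \xrightarrow{\partial'} H_0(Y, Z) \xrightarrow{k_*} H_0(X, Z) \xrightarrow{j_*} H_0(X, Y) \to \{0\}$    (1.2)

where $\partial'$ is the composition of $l_*$ and $\partial$, i.e.

    $\partial' = l_*\partial : H_k(X, Y) \xrightarrow{\partial} H_{k-1}(Y) \xrightarrow{l_*} H_{k-1}(Y, Z)$    (1.3)

As an exercise the reader should prove, using the exactness principle, that (1.2) is an exact sequence.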
Consider now the case $G = R$; the homology groups are then vector spaces. Let $K_k(X, Y)$ be the subspace of $H_k(X, Y)$ which is either the image of the preceding homomorphism or the kernel of the next (similarly define $K_k(X, Z)$, ... etc.) and set $\varepsilon_k(X, Y) = \dim K_k(X, Y)$ ... etc. From the exactness of (1.2) we easily see that
    $\beta_k(X, Y) = \varepsilon_k(X, Y) + \varepsilon_{k-1}(Y, Z)$    (1.4a)
    $\beta_k(Y, Z) = \varepsilon_k(Y, Z) + \varepsilon_k(X, Z)$    (1.4b)
    $\beta_k(X, Z) = \varepsilon_k(X, Z) + \varepsilon_k(X, Y)$    (1.4c)
Now,
    $\sum_{j \le m} (-1)^j \beta_j(X, Z) - \sum_{j \le m} (-1)^j \beta_j(X, Y) - \sum_{j \le m} (-1)^j \beta_j(Y, Z)$
    $= \sum_{j \le m} (-1)^j \{\varepsilon_j(X, Z) + \varepsilon_j(X, Y) - \varepsilon_j(X, Y) - \varepsilon_{j-1}(Y, Z) - \varepsilon_j(Y, Z) - \varepsilon_j(X, Z)\}$    (1.5)
    $= -\sum_{j \le m} (-1)^j (\varepsilon_j(Y, Z) + \varepsilon_{j-1}(Y, Z)) = (-1)^{m+1} \varepsilon_m(Y, Z)$.
Now define
    $\eta_m(X, Z) = (-1)^m \sum_{j \le m} (-1)^j \beta_j(X, Z)$    (1.6)
Clearly it follows that
    $\eta_m(X, Z) = \eta_m(X, Y) + \eta_m(Y, Z)$ - non-negative integer, i.e.
    $\eta_m(X, Z) \le \eta_m(X, Y) + \eta_m(Y, Z)$.    (1.8)
We now apply these results to Morse theory. Let $M$ be a Hilbert manifold, $f$ a smooth function on $M$ satisfying the P-S condition on $a \le f \le b$; let $c$, $a < c < b$, be its only critical level, and suppose that the critical points of $f$, $p_1, \dots, p_n$, are non-degenerate, their indices being $(n_i, m_i)$, $i = 1, \dots, n$.

We know, by Theorem 4.87, that

    $[f \le b] \approx [f \le a] \cup_{h_1} H_1 \cup \dots \cup_{h_n} H_n$    (1.9)
Our aim is to compute $H_k([f \le b], [f \le a])$.
Theorem 4.88.
    $H_k(\{f \le b\}, \{f \le a\}) = \sum_{i=1}^{n} \oplus\, \delta_{k n_i} Z$    (1.10)

Before beginning the proof, let us note the surprising fact (R. S. Palais ((3), p. 336)) that according to Theorem 4.88 the homotopy type of the pair $(\{f \le b\}, \{f \le a\})$ will depend only on the critical points with finite index at intermediate levels, those with infinite index being "homotopically invisible". This unexpected fact saves us from making the rather inelegant assumption of finiteness of the indices.
On to the proof! By Theorem 4.87, we have
    $\{f \le b\} = \{f \le a\} \cup_{h_1} H_1 \cup_{h_2} H_2 \cup \dots \cup_{h_n} H_n$,

$h_i : D^{n_i} \times D^{m_i} \to H_i$. Let $R^{n_i}$ denote $D^{n_i}$ with a ball removed from its interior, and let $\mathcal{R}_i = h_i(R^{n_i} \times D^{m_i})$. It is easy to see that the pair $(\{f \le b\}, \{f \le a\})$ is homotopic to $(\{f \le b\}, \{f \le a\} \cup \mathcal{R}_1 \cup \dots \cup \mathcal{R}_n)$. By excision, this pair will have the same homology groups as $(\{f \le b\} - \{f \le a\},\ \mathcal{R}_1 \cup \dots \cup \mathcal{R}_n)$ and also the same groups as $(H_1 \cup \dots \cup H_n,\ \mathcal{R}_1 \cup \dots \cup \mathcal{R}_n)$. But by a previous observation, the homology groups of the last pair are
    $\sum_i \oplus\, H_k(H_i, \mathcal{R}_i) = \sum_i \oplus\, H_k(D^{n_i}, \partial D^{n_i})$.
By another of our observations,
    $H_k(D^{n_i}, \partial D^{n_i}) = \delta_{k n_i} Z$,  $n_i < \infty$,
and the same formula holds for $n_i = \infty$, since $D^{n_i}$ is homotopically trivial modulo $\partial D^{n_i}$ in this case. This completes the proof.
Corollary: $\beta_k(\{f \le b\}, \{f \le a\})$ = number of critical points of type $(k, \infty)$ between $a$ and $b$.
We can now obtain the best known results in Morse theory.
Theorem 4.89 (Morse Inequalities): Let $M$ be a complete Hilbert manifold, $f$ a smooth function satisfying the P-S condition on $a \le f \le b$, and suppose that $a$, $b$ are regular values of $f$ and that all critical points of $f$ are non-degenerate. For each non-negative integer $m$ let $\beta_m$ be the $m$-th Betti number of the pair $(\{f \le b\}, \{f \le a\})$, and let $c_m$ denote the number of critical points of $f$ of index $(m, \infty)$ in $f^{-1}([a, b])$. Then
    $\beta_0 \le c_0$

    $\beta_1 - \beta_0 \le c_1 - c_0$

    ...

    $\sum_{m=0}^{k} (-1)^{k-m} \beta_m \le \sum_{m=0}^{k} (-1)^{k-m} c_m$    (1.11)

and

    $\sum_{m=0}^{\infty} (-1)^m \beta_m = \sum_{m=0}^{\infty} (-1)^m c_m$    (1.12)

Corollary 1. $\beta_m \le c_m$ for all $m$.

Corollary 2. If $f$ is bounded below, then the conclusion of the theorem and of Corollary 1 remain valid if we interpret $\beta_m$ = $m$-th Betti number of $[f \le b]$ and $c_m$ = number of critical points of $f$ having index $m$ in $\{f \le b\}$ respectively.
Proof of Theorem 4.89. Let $c_1 < c_2 < \dots < c_n$ be the critical values of $f$ in $[a, b]$. Choose $a_i$, $i = 0, 1, \dots, n$, so that $a = a_0 < c_1 < a_1 < c_2 < \dots < c_n < a_n = b$ and call $X_i = \{f \le a_i\}$. Then by the corollary to Theorem 4.88 it follows that $\beta_k(X_{i+1}, X_i)$ = number of critical points of index $k$ in the level $c_{i+1}$. We thus have (see (1.6))
    $\eta_m(X_{i+1}, X_i) = \sum_{k \le m} (-1)^{m-k} \beta_k(X_{i+1}, X_i) = \sum_{k \le m} (-1)^{m-k}$ (number of critical points of index $k$ on level $c_{i+1}$),
and, summing over $i$,
    $\sum_{i=0}^{n-1} \eta_m(X_{i+1}, X_i) = \sum_{k \le m} (-1)^{m-k}$ (number of critical points of index $k$ in $f^{-1}([a, b])$).
But an iteration of (1.8) yields

    $\eta_m(\{f \le b\}, \{f \le a\}) \le \sum_{i=0}^{n-1} \eta_m(X_{i+1}, X_i)$

and by definition of $\eta_m$, (1.11) follows. Equation (1.12) can be obtained by taking $m$ large enough.
The proof of Corollary 1 is trivial, and the proof of 2 follows easily from Theorem 4.89.
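
The inequalities (1.11) and the equality (1.12) are easy to test mechanically once Betti numbers and critical-point counts are given as finite lists. The sketch below is illustrative only: the function name is hypothetical and the sample data are the familiar finite-dimensional examples (the height function on a torus, and a function on the 2-sphere with one minimum, one saddle and two maxima), not examples taken from the text.

```python
def morse_inequalities_hold(betti, crit):
    """Check (1.11) for every k and the alternating-sum equality (1.12).

    betti[m] plays the role of beta_m and crit[m] the role of c_m;
    both lists are padded with zeros to a common length.
    """
    n = max(len(betti), len(crit))
    b = list(betti) + [0] * (n - len(betti))
    c = list(crit) + [0] * (n - len(crit))
    for k in range(n):
        lhs = sum((-1) ** (k - m) * b[m] for m in range(k + 1))
        rhs = sum((-1) ** (k - m) * c[m] for m in range(k + 1))
        if lhs > rhs:
            return False                  # inequality (1.11) fails at this k
    # equality (1.12): alternating sums agree
    return sum((-1) ** m * b[m] for m in range(n)) == \
           sum((-1) ** m * c[m] for m in range(n))

print(morse_inequalities_hold([1, 2, 1], [1, 2, 1]))   # torus, height function
print(morse_inequalities_hold([1, 0, 1], [1, 1, 2]))   # sphere, two maxima
print(morse_inequalities_hold([1, 0, 1], [1, 0, 0]))   # impossible count
```

The last call shows how the inequalities rule out a hypothetical function on the sphere with a single critical point.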

Bibliography

1. Lang, Serge, Introduction to Differential Manifolds (Interscience, John Wiley, New York,
1962).
2. Palais, Richard, "Morse theory on Hilbert manifolds", Topology Vol. 2, pp. 299-340
(Dec. 1963).
3. Palais, R. and Smale, S., "A generalized Morse theory", Bull. A.M.S. Vol. 70, pp.
165-172 (January 1964).
4. Eilenberg, Steenrod, Foundations of Algebraic Topology (Princeton University Press,
1953).
5. Milnor, Morse Theory (Princeton University Press, 1963).
CHAPTER V

Category

A. Definition and Elementary Properties
B. Category and Homology
C. Category and Calculus of Variations in the Large

A. Definition and Elementary Properties

Remark: We introduce our concepts for arbitrary topological spaces X, but in the applications X will be a manifold.
5.1. Definition: The closed set $A \subseteq X$ is said to be of first category with respect to X (in symbols $\operatorname{cat}_X(A) = 1$ or simply $\operatorname{cat}(A) = 1$) if the injection $i : A \to X$ is homotopic to a constant.
5.2. Definition: The closed set $A \subseteq X$ is said to be of k-th category with respect to X ($\operatorname{cat}_X(A) = \operatorname{cat}(A) = k$) if
(a) $A = A_1 \cup \dots \cup A_k$, with each $A_i$ closed and $\operatorname{cat}(A_i) = 1$, $1 \le i \le k$.
(b) If $A = A_1 \cup \dots \cup A_l$ with each $A_i$ closed and $l < k$, then some $A_i$ is not of the first category.
The closed set $A \subseteq X$ is said to have infinite category with respect to X ($\operatorname{cat}_X(A) = \operatorname{cat}(A) = \infty$) if no decomposition of the form (a) in Def. 5.2 is possible.
Examples: Let $E^n$, $E^\infty$ denote the Euclidean n-space and separable Hilbert space respectively. Then, if $A = \{x : |x| = 1\}$,
$$\operatorname{cat}_{E^n}(A) = 1, \qquad 1 \le n \le \infty,$$
$$\operatorname{cat}_A(A) = \begin{cases} 2 & \text{for } 1 \le n < \infty, \\ 1 & \text{for } n = \infty. \end{cases}$$

We now prove some elementary properties of the notion of category.


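5.3. Lemma: Let A, B be closed subsets of X. Then

(5.3.1) $\operatorname{cat}(A \cup B) \le \operatorname{cat}(A) + \operatorname{cat}(B)$.
(5.3.2) $A \subseteq B$ implies $\operatorname{cat}(A) \le \operatorname{cat}(B)$.
(5.3.3) Let I be the closed unit interval, $\eta : X \times I \to X$ a continuous function such that $\eta(x, 0) = x$ for $x \in X$. Then if $\eta(x, 1) = \eta_1(x)$, $\operatorname{cat}(A) \le \operatorname{cat}(\eta_1(A))$.
Proof: (5.3.1) and (5.3.2) are trivial. To prove (5.3.3) we can evidently suppose that $\operatorname{cat}(\eta_1(A))$ is finite, say, k. Then $\eta_1(A) = B_1 \cup B_2 \cup \dots \cup B_k$, with each $B_i$ closed and $\operatorname{cat}(B_i) = 1$. Let $A_i = A \cap \eta_1^{-1}(B_i)$. Then $A = A_1 \cup \dots \cup A_k$. The inclusion of each $A_i$ in X is homotopic, via $\eta$, to $\eta_1 | A_i$, which maps $A_i$ into $B_i$ and is therefore homotopic to a constant; so $\operatorname{cat}(A) \le k$.
Q.E.D.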
5.4. Definition: We recall that the dimension of a compact metric space X (in symbols, dim X) is equal to n iff
(a) Every open covering $\{U_i\}$ of X has a refinement $\{V_j\}$ such that the order of $\{V_j\}$ is not greater than n, i.e., such that no n + 2 of the $V_j$ have non-void intersection.
(b) There exists a covering $\{U_i\}$ such that for every refinement $\{V_j\}$ of $\{U_i\}$ there exist $V_{j_1}, \dots, V_{j_{n+1}}$ in $\{V_j\}$ with non-void intersection (thus $\{V_j\}$ has order n).
A fundamental relation between category and dimension is given by
5.5. Theorem: Let M be a Hilbert manifold, $A \subseteq M$ a compact set. Then
(5.5) $\operatorname{cat}(A) \le \dim(A) + 1$.
First we prove
5.6. Lemma: Let M be a Hilbert manifold, $A \subseteq M$ a compact set; let $\eta : I \times A \to M$ be a homotopy between the identity mapping and a constant mapping ($\eta(0, x) = x$, $\eta(1, x)$ = constant). Then there is an extension $\tilde\eta$ of $\eta$ to $I \times U$, U a neighborhood of A, such that $\tilde\eta$ is again a homotopy between the identity mapping (in U) and a constant.
From Lemma 5.6 we deduce immediately:
Corollary: For A, M as in the lemma, suppose $\operatorname{cat}_M(A) \le k$. Then there exists a neighborhood U of A such that $\operatorname{cat}_M(\bar U) \le k$.
Proof of Lemma 5.6: Embed M as a regular submanifold of some Hilbert space H. Let $\Gamma \subset H \times H$ denote the set of all pairs (p, x), $p \in M$, $x \perp M_p$ ($M_p$ the tangent space to M at p). Let $\alpha : \Gamma \to H$, $\psi : \Gamma \to H$ be defined by
$$\alpha(p, x) = p, \qquad \psi(p, x) = p + x.$$
Let $[A] = \bigcup_{0 \le t \le 1} \eta(t, A) \supseteq A$. It is easy to see, using the implicit function theorem, that there exists a neighborhood V of [A] and an $\varepsilon > 0$ such that, for $(p, x) \in \Gamma$, $p \in V$, $|x| < \varepsilon$, $\psi(p, x)$ covers a neighborhood O of [A] in H and is smoothly invertible there. Let $\psi^{-1}$ be the inverse, and let $\phi = \alpha \psi^{-1}$. Then $\phi$ is a smooth map from O into M, and $\phi | [A]$ = identity.
It is easy to see that $\eta$ can be extended to $\hat\eta$ defined in $I \times M$ with values in H in such a way that $\hat\eta(0, x) = x$, $x \in M$, $\hat\eta(1, x)$ = constant. Choose a neighborhood U of A such that $\hat\eta(t, U) \subset O$ and set
$$\tilde\eta(t, p) = \phi(\hat\eta(t, p)), \qquad p \in U, \ 0 \le t \le 1.$$
Then $\tilde\eta(0, p) = p$, $\tilde\eta(1, p)$ = constant, $\tilde\eta$ is M-valued, continuous and defined in $I \times U$.
Q.E.D.

Proof of Theorem 5.5: Let $\{U_j\}$ be any covering of A. Then there exists a refinement $\{V_k\}$ of $\{U_j\}$ such that $\operatorname{cat}_M(\bar V_k) = 1$ for all k (for instance, a refinement consisting of coordinate patches).
Since dim A = n and by our observation, we can find an open covering $\{U_j\}$ such that
(a) $\operatorname{cat} U_j = 1$;
(b) $\bigcap_{i=1}^{n+2} U_{j_i} = \emptyset$ for any (n + 2)-ple of sets in $\{U_j\}$;
(c) there exist $U_{j_1}, \dots, U_{j_{n+1}}$ such that $\bigcap_{i=1}^{n+1} U_{j_i} \ne \emptyset$.
Let
$$U = \bigcup \Big( \bigcap_{i=1}^{n+1} U_{j_i} : U_{j_i} \in \{U_j\} \Big),$$
where the intersection is taken over sets $\{j_i\}$ of distinct indices.
By (c), $U \ne \emptyset$. By (a) each of the intersections has category 1, for it is contained in some $U_j$. Hence U itself has category 1, since it is the disjoint union of sets of category 1. Observe that
$$A = (A - U) \cup U.$$
$A - U \subset \bigcup \{U_j - U\}$ and it is easy to see that $\{U_j - U\}$ satisfies (b) with n + 1 replacing n + 2. Indeed, since every possible intersection of n + 1 different sets $U_j$ has been subtracted from $U_j$, the intersection of n + 1 different sets $U_j - U$ with different indices must be empty. Proceeding by induction (the case n = 1 being trivial) $\operatorname{cat}_M(A) \le n + \operatorname{cat} U \le n + 1$.
Q.E.D.
B. Category and Homology

We now define the singular homology groups of a topological space. These amount to a concrete realisation of the homology groups considered axiomatically at the end of Chapter 4. We will define cubical homology groups rather than the usual simplicial ones.
Let I = [-1, +1].
5.7. Definition: A singular n-cube in the topological space X is a continuous mapping $\phi : I^n \to X$.
5.8. Definition: A singular n-cube is called degenerate if $\phi$ does not depend on all of its coordinates.
If $\phi(x_1, \dots, x_k, \dots, x_n) = \phi(x_1, \dots, x_k', \dots, x_n)$ for all $x_k, x_k'$, we say that $\phi$ is degenerate along its k-th coordinate.
Let $Q_n(X)$ be the free Abelian group generated by all the singular n-cubes in X. Let $D_n(X)$ be the free Abelian group generated by all degenerate n-cubes in X. Then
5.9. Definition: $C_n(X)$ = n-th singular cubic chain group is defined as
$$C_n(X) = Q_n(X)/D_n(X).$$
Given a singular n-cube $\phi$, we obtain from it two (n - 1)-cubes (the k-th faces of $\phi$), $\phi_k^+$, $\phi_k^-$, as follows. Given $(x_1, \dots, x_{k-1}, x_{k+1}, \dots, x_n) \in I^{n-1}$, define
$$\phi_k^+(x_1, \dots, x_{k-1}, x_{k+1}, \dots, x_n) = \phi(x_1, \dots, x_{k-1}, 1, x_{k+1}, \dots, x_n)$$
and
$$\phi_k^-(x_1, \dots, x_{k-1}, x_{k+1}, \dots, x_n) = \phi(x_1, \dots, x_{k-1}, -1, x_{k+1}, \dots, x_n).$$
We can then define a boundary operator as follows. If $\phi$ is an n-cube,
$$\partial\phi = \sum_{k=1}^{n} (-1)^k (\phi_k^+ - \phi_k^-),$$
and $\partial$ is extended linearly to arbitrary n-chains. Since (as the reader may easily verify) $\partial : D_n(X) \to D_{n-1}(X)$, we can define a boundary $\partial$ in $C_n$ (Definition 5.9 should be compared):
$$\partial : C_n(X) \to C_{n-1}(X).$$
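5.10. Lemma: $\partial\partial = 0$.
Proof: It is evidently sufficient to consider only singular n-cubes $\phi : I^n \to X$. We have
$$\partial\partial\phi = \sum_{k=1}^{n} (-1)^k\, \partial(\phi_k^+ - \phi_k^-) = \sum_{k=1}^{n} \sum_{l=1}^{n-1} (-1)^{k+l} \big( \phi_{k,l}^{++} - \phi_{k,l}^{+-} - \phi_{k,l}^{-+} + \phi_{k,l}^{--} \big),$$
where $\phi_{k,l}^{*\circ}$ denotes the l-th face, with sign $\circ$, of the k-th face, with sign $*$, of $\phi$. Observe now that if $l < k$,
$$\phi_{k,l}^{*\circ} = \phi_{l,k-1}^{\circ *},$$
where $(*, \circ)$ stands for any combination of the signs +, -; and if $l \ge k$,
$$\phi_{k,l}^{*\circ} = \phi_{l+1,k}^{\circ *}.$$
The two terms so paired enter the double sum with opposite signs, so the terms of $\partial\partial\phi$ cancel in pairs, and hence $\partial\partial\phi = 0$ in $C_{n-2}(X)$.
Q.E.D.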
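The cancellation in Lemma 5.10 can be verified mechanically on the faces of a standard cube. The following is a minimal sketch under an assumed encoding that is not part of the text: a face of $I^N$ is represented by a tuple whose entries are +1, -1 (a coordinate fixed at that value) or None (a free coordinate), and a chain is a dictionary from such tuples to integer coefficients. The helper names `face` and `boundary` are hypothetical.

```python
from collections import Counter


def face(cube, k, sign):
    """Fix the k-th free coordinate (1-indexed) of `cube` at `sign` (+1 or -1)."""
    free = [i for i, c in enumerate(cube) if c is None]
    out = list(cube)
    out[free[k - 1]] = sign
    return tuple(out)


def boundary(chain):
    """Apply d(phi) = sum_k (-1)^k (phi_k^+ - phi_k^-), extended linearly to a chain."""
    result = Counter()
    for cube, coeff in chain.items():
        n = sum(c is None for c in cube)  # dimension = number of free coordinates
        for k in range(1, n + 1):
            s = (-1) ** k
            result[face(cube, k, +1)] += s * coeff
            result[face(cube, k, -1)] -= s * coeff
    # Drop faces whose coefficients cancelled.
    return {c: v for c, v in result.items() if v != 0}


cube3 = {(None, None, None): 1}   # the identity 3-cube
d1 = boundary(cube3)              # its six 2-dimensional faces, with signs
d2 = boundary(d1)                 # every edge appears twice with opposite signs

cube4 = {(None, None, None, None): 1}
dd4 = boundary(boundary(cube4))
```

In this encoding the double boundary is the empty chain, which is the pairwise cancellation used in the proof.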
Having defined chain groups and boundary operators we can define
homology groups in the usual way, i.e.
5.11. Definition: Let $Z_n(X) = \ker \partial \subset C_n(X)$,
$B_n(X) = \partial C_{n+1}(X) \subset C_n(X)$.
Then $H_n(X)$ = n-th (cubical) singular homology group of X = $Z_n(X)/B_n(X)$.
To define relative cubical groups of a pair (X, Y) offers no new difficulties; we simply define the n-th singular chain group of the pair (X, Y) as
$$C_n(X, Y) = Q_n(X)/(D_n(X) + Q_n(Y)),$$
and the boundary operator $\partial$ as the relativization of the former $\partial$, and the n-th cubical singular homology group of (X, Y) as
$$H_n(X, Y) = Z_n(X, Y)/B_n(X, Y), \qquad Z_n(X, Y) = \ker \partial, \quad B_n(X, Y) = \partial C_{n+1}(X, Y).$$
The cohomology groups are constructed as usual starting from the cochain
groups and the coboundary operators. The reader may benefit by consulting
Hocking-Young or better still Hilton-Wylie, and by proving some of the


Eilenberg-Steenrod axioms for our cubical singular groups.
It can be seen (H. and Y., p. 306, H. and W., p. 362, B., p. 110) that a
multiplicative structure can be introduced into the cohomology groups, by
means of the so-called cup product. Since we are mainly interested in co-
homology groups of finite-dimensional manifolds, we will define this product
in terms of differential forms-via De Rham's theorem.
5.12. Definition: Let M be an n-manifold, $Y \subseteq M$. Let $C^k(M, Y)$ denote the vector space of all smooth exterior forms on M of degree k which vanish in a neighborhood of Y. The coboundary operator is here simply the differential d as ordinarily defined for exterior forms,
$$d : C^k(M, Y) \to C^{k+1}(M, Y).$$
As usual $Z^k(M, Y) = \{\omega \in C^k(M, Y) : d\omega = 0\}$, $B^k(M, Y) = dC^{k-1}(M, Y)$, and
5.13. Definition: $H^k(M, Y) = Z^k(M, Y)/B^k(M, Y)$.
De Rham's theorem essentially states that these cohomology groups defined by means of differential forms coincide with the cohomology groups of M defined by the singular cubical groups introduced above.
Since differential forms can be multiplied, the definition of cup product is already in view. Let $Y_1, Y_2 \subseteq M$, $\omega_1 \in C^{k_1}(M, Y_1)$, $\omega_2 \in C^{k_2}(M, Y_2)$. Then $\omega_1 \wedge \omega_2 \in C^{k_1 + k_2}(M, Y_1 \cup Y_2)$. Suppose that $\omega_1$, $\omega_2$ are cocycles, i.e. that $d\omega_1 = d\omega_2 = 0$. Then
$$d(\omega_1 \wedge \omega_2) = d\omega_1 \wedge \omega_2 + (-1)^{\deg \omega_1}\, \omega_1 \wedge d\omega_2 = 0.$$
If $\omega_1$ is a cocycle and $\omega_2$ is a coboundary, so that $\omega_2 = d\omega$, then
$$d(\omega_1 \wedge \omega) = d\omega_1 \wedge \omega + (-1)^{\deg \omega_1}\, \omega_1 \wedge d\omega = \pm\, \omega_1 \wedge \omega_2,$$
which shows that $\omega_1 \wedge \omega_2$ is a coboundary. This allows us to define a product (we will use the sign $\cup$) in the cohomology groups, operating as follows:
$$H^{k_1}(M, Y_1) \times H^{k_2}(M, Y_2) \to H^{k_1 + k_2}(M, Y_1 \cup Y_2).$$
We are going to establish now some rather interesting relations between
the cohomology structure of a pair (M, Y) and the category of Y.

5.14. Theorem: Let cat (X) = n. Then any cup product of n elements of degree > 0 vanishes.

This theorem can be reformulated as follows. Define cuplength (X) = greatest number of elements of non-zero degree with nonvanishing cup product. Then we have the following corollary.

Corollary:
(5.6) $\operatorname{cat}(X) \ge \operatorname{cuplength}(X) + 1$.
Proof: Since the case n = 1 is evident (it reduces simply to the fact that, if the identity mapping $i : X \to X$ is homotopic to a constant, then X has trivial homology) we can suppose n > 1. Let $Y \subseteq X$ be a set of the first category, i.e. such that $i : Y \to X$ is homotopic to a constant map $m : Y \to X$, $m(y) = p \in X$. Consider the exact cohomology sequence
$$H^{k-1}(Y) \to H^k(X, Y) \xrightarrow{\ j^*\ } H^k(X) \xrightarrow{\ i^*\ } H^k(Y).$$
Our assumption on Y implies that $i^* = m^* = 0$ for k > 0, whence, by exactness, $j^*$ is onto. Suppose now that X is of category n, $X = Y_1 \cup \dots \cup Y_n$, each $Y_i$ of the first category. Let $y_1, \dots, y_n$ be n elements in the cohomology ring of degrees $k_1, \dots, k_n > 0$. Then $y_i = j^* \bar y_i$, $\bar y_i \in H^{k_i}(X, Y_i)$, i = 1, ..., n. The mapping $j^*$ commutes with the cup product, so $y_1 \cup y_2 \cup \dots \cup y_n = j^*(\bar y_1 \cup \bar y_2 \cup \dots \cup \bar y_n)$. But
$$\bar y_1 \cup \bar y_2 \cup \dots \cup \bar y_n \in H^{k_1 + \dots + k_n}(X, Y_1 \cup \dots \cup Y_n) = H^{k_1 + \dots + k_n}(X, X) = 0.$$
Q.E.D.
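Let $X$, $\tilde X$ be two manifolds, $\omega$ a form on X. Then we can use $\omega$ in an evident way to define a form $\tilde\omega$ in $X \times \tilde X$, and the same is true for forms in $\tilde X$. Let $\pi$, $\tilde\pi$ be the mappings of forms and of cohomology groups defined in this way:
$$\pi : H^k(X) \to H^k(X \times \tilde X), \qquad \tilde\pi : H^k(\tilde X) \to H^k(X \times \tilde X).$$
It may be seen that $\pi H^*(X)$, $\tilde\pi H^*(\tilde X)$ together generate the cohomology ring of $X \times \tilde X$. But this proves the following statement.
5.15. Lemma:
(5.7) $\operatorname{cuplength}(X \times \tilde X) \ge \operatorname{cuplength}(X) + \operatorname{cuplength}(\tilde X)$.
Applying this lemma to the torus $T^n$, $\operatorname{cat}(T^n) - 1 \ge \operatorname{cuplength}(T^n) \ge n$.
Indeed, equality holds here. The proof is left as an exercise.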
Even though we have defined the cup product only for finite-dimensional
manifolds, we have mentioned the fact that it can be defined in more general
situations. We mention the following result, discussed in more detail later.
5.16. Theorem: Let X be a finite dimensional simply connected manifold, $\Omega(X)$ its loop space. Then cuplength $\Omega(X) = \infty$.
Corollary: cat $\Omega(X) = \infty$.
Example: If $P(\mathcal{H})$ is the Hilbert projective space over $\mathcal{H}$, then cat $P(\mathcal{H}) = \infty$.

C. Category and Calculus of Variations in the Large

Let f be a smooth function on a Hilbert manifold M satisfying the P-S condition. Define $\Gamma_k(M)$ as the set of all subsets of M of category $\ge k$.
5.17. Definition:
$$c_m(f) = \inf_{A \in \Gamma_m(M)} \Big\{ \sup_{p \in A} f(p) \Big\},$$
where we put $c_m(f) = \infty$ if $\Gamma_m(M) = \emptyset$.
Let m < m'. Then $c_m(f)$ is an infimum over more elements than $c_{m'}(f)$, therefore
(5.8) $-\infty \le c_1(f) \le c_2(f) \le \dots \le \infty$.
5.18 Lemma: Suppose (as always) that the pair (M, f) satisfies Condition P-S. Let $\alpha(t)$ be a $C^\infty$ real-valued function defined for $t \ge 0$, such that $\alpha(t) = 1$ for $0 \le t \le 1$, such that $t^2 \alpha(t)$ is monotone increasing for $t \ge 0$, and such that $t^2 \alpha(t) = 2$ for $t \ge 2$. Let $V(p) = -\alpha(|\nabla f(p)|)\, \nabla f(p)$, so that V is a $C^1$ tangent vector field on M, and let $\eta_t(p)$ be the flow defined by V. Then $\eta_t(p)$ is defined for all $p \in M$ and all $-\infty < t < +\infty$.
Proof: It is plain from the description of the function $\alpha$ that the vector field V(p) is uniformly bounded; let K be its upper bound. Since $d\eta_t(p)/dt = V(\eta_t(p))$, it follows from the definition of distance on the manifold M that $\rho(\eta_t(p), \eta_s(p)) \le K |s - t|$ for $t_-(p) < s, t < t_+(p)$. Thus, if $t_+(p) < \infty$, and $t_i$ approaches $t_+(p)$ from below, $\{\eta_{t_i}(p)\}$ is a Cauchy sequence, contradicting Proposition 4.46 in virtue of the completeness of M. It follows that $t_+(p) = \infty$. We may prove in the same way that $t_-(p) = -\infty$, and our lemma follows.

5.19 Lemma: Let $\eta_t(p)$ be as in the preceding lemma. Let c be a real number, and set
$$K_c = \{p \in M \mid f(p) = c,\ (\nabla f)(p) = 0\}.$$
Then $K_c$ is compact. Moreover, if for each $\varepsilon > 0$ we set
(5.9) $N_\varepsilon = \{p \in M \mid |f(p) - c| < \varepsilon$ and $|(\nabla f)(\eta_t(p))| < \varepsilon$ for some t such that $0 \le t \le 1\}$,
then any neighborhood U of $K_c$ contains one of the neighborhoods $N_\varepsilon$ of $K_c$.
Proof: The assertion that $K_c$ is compact follows immediately from Condition P-S; thus only our second assertion requires proof. Suppose that this second assertion is false. Then there exists a neighborhood U of $K_c$ not containing any of the sets $N_\varepsilon$, and hence there exists a sequence $p_n \notin U$ of points and a sequence $t_n$ of numbers such that $0 \le t_n \le 1$, such that $f(p_n) \to c$ and $(\nabla f)(\eta_{t_n}(p_n)) \to 0$ as $n \to \infty$. Passing to a subsequence, we may suppose without loss of generality that $t_n \to t_*$ as $n \to \infty$. Now
(5.10) $\dfrac{d}{dt} f(\eta_t(p)) = V(\eta_t(p)) f = -\alpha(|\nabla f(\eta_t(p))|)\, \big( (\nabla f)(\eta_t(p)) f \big)$
and thus, from the definition of the gradient, we have
(5.11) $\dfrac{d}{dt} f(\eta_t(p)) = -\alpha(|\nabla f(\eta_t(p))|)\, |\nabla f(\eta_t(p))|^2$.
It is clear from (5.11) and from the definition of $\alpha$ that $|df(\eta_t(p))/dt|$ is uniformly bounded for all $p \in M$ and real t; thus there exists a finite constant K such that $|f(\eta_t(p)) - f(p)| \le K |t|$. Since $f(p_n) \to c$, it follows from this last that $f(\eta_{t_n}(p_n))$ is uniformly bounded. Thus, by Condition P-S, $\{\eta_{t_n}(p_n)\}$ has a convergent subsequence, and we may suppose without loss of generality that $\eta_{t_n}(p_n)$ converges to a point $q \in M$. Since $(\nabla f)(\eta_{t_n}(p_n)) \to 0$, q is evidently a critical point of f. We have
(5.12) $p_n = \eta_{-t_n}(\eta_{t_n}(p_n)) \to \eta_{-t_*}(q) = q$
by Lemma 4.45 and by the fact that q is a critical point. Thus $q \in K_c$ is the limit of $p_n$, and since $p_n \notin U$ we have a contradiction which completes our proof.
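5.20 Corollary: Let $0 < \varepsilon < 1$. Let $\eta_t$ and $N_\varepsilon$ be as in the preceding Lemma. Then if $f(p) \le c + \varepsilon^2/2$ and $p \notin N_\varepsilon$, we have $f(\eta_1(p)) \le c - \varepsilon^2/2$.
Proof: It is plain from (5.11) of the preceding proof that $f(\eta_t(p))$ decreases as t increases, and, in fact, that
(5.13) $f(\eta_1(p)) - f(p) = -\displaystyle\int_0^1 \alpha(|\nabla f(\eta_t(p))|)\, |\nabla f(\eta_t(p))|^2 \, dt$.
If $f(p) \le c - \varepsilon$ we have nothing to prove; thus we may suppose without loss of generality that $|f(p) - c| < \varepsilon$. Then $p \notin N_\varepsilon$ implies that $|\nabla f(\eta_t(p))| \ge \varepsilon$ for $0 \le t \le 1$, so that, since $t^2 \alpha(t)$ is monotone increasing (cf. Lemma 5.18) and $\varepsilon^2 \alpha(\varepsilon) = \varepsilon^2$ for $\varepsilon < 1$, we may conclude, using (5.13), that $f(\eta_1(p)) - f(p) \le -\varepsilon^2$. But then
$$f(\eta_1(p)) \le c + \frac{\varepsilon^2}{2} - \varepsilon^2 = c - \frac{\varepsilon^2}{2},$$
and the present corollary is proved.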


We are now in a position to prove the principal theorem of Lusternik and
Schnirelman in a generalized form.
5.21 Theorem: Let (M, f) satisfy Condition P-S, and let $\{c_m(f)\}$ be as in Definition 5.17. Suppose that $m \le n$, and that $-\infty < c = c_m(f) = c_n(f) < \infty$. Then the set $K_c$ of critical points (cf. Lemma 5.19) is of category n - m + 1 at least; moreover, even if m = n, the set $K_c$ is non-empty.
5.22 Corollary: Under the hypotheses of the preceding theorem the set $K_c$ is of dimension n - m at least.
Proof: The corollary follows immediately from the theorem and Theorem 5.5. To prove the theorem, first suppose that n > m and that $\operatorname{cat}(K_c) \le n - m$, and then use the Corollary of Lemma 5.6 to find a neighborhood U of $K_c$ such that $\operatorname{cat}(\bar U) \le n - m$. Using Lemma 5.19 and Lemma 5.3.2, we may suppose without loss of generality that U is one of the neighborhoods $N_\varepsilon$ described by (5.9). By Definition 5.17, there exists a closed subset A of M such that $\operatorname{cat}(A) \ge n$ and such that $\sup \{f(p) \mid p \in A\} \le c + \varepsilon^2/2$. Put $A_0 = A - N_\varepsilon$. Then, by Lemma 5.3.1, $\operatorname{cat}(A_0) \ge m$. Thus, if $\eta_t$ is as in Lemma 5.19 and Corollary 5.20, it follows from Lemma 5.3.3 that $\operatorname{cat}(\eta_1(A_0)) \ge m$. On the other hand, by Corollary 5.20, $f(\eta_1(p)) \le c - \varepsilon^2/2$ for $p \in A_0$. This contradicts the Definition 5.17 of $c_m$ and thus completes the proof of Theorem 5.21 in case n > m. In case n = m and $K_c$ is void, we may let U be the null set, and arrive by the same argument at the same contradiction. Thus Theorem 5.21 follows in every case. Q.E.D.

Reference

L. Liusternik and L. Schnirelman, Méthodes topologiques dans les problèmes variationnels (Hermann & Cie, Éditeurs, Paris, 1934).
CHAPTER VI

Applications of Morse Theory to Calculus of Variations in the Large

Bibliography

1. R. S. Palais, "Morse theory on Hilbert manifolds", Topology, Vol. 2, pp. 299-340.
2. J. Milnor, Morse theory (Ann. of Math. Studies, Princeton, 1963).
3. I. M. Singer, Notes on Differential Geometry (Mimeographed, M.I.T., 1962).
4. S. S. Chern, Differentiable manifolds (Mimeographed, Chicago Univ., 1959).

We consider now the set of all suitably smooth paths in a finite-dimensional compact Riemannian manifold M. We shall see that a natural (infinite-dimensional) Riemannian structure can be introduced into this set, allowing us to apply our infinite-dimensional Morse theory. Extremals of a conveniently chosen f on this set will correspond to geodesics in M, so that our results will relate to the geodesics of M.
6.1. Definition: Let $R^n$ be n-dimensional Euclidean space and let I = [0, 1]. Define $H_0(I, R^n) = L_2(I, R^n)$, i.e. the space of all functions $\sigma, \rho, \dots$ such that $\int_0^1 |\sigma(t)|^2 \, dt < \infty$, with the scalar product
$$(\sigma, \rho)_0 = \int_0^1 (\sigma(t), \rho(t)) \, dt.$$
6.2. Definition: Let $H_1(I, R^n)$ be the set of all absolutely continuous maps $\sigma : I \to R^n$ such that $\sigma' \in H_0(I, R^n)$. $H_1(I, R^n)$ is a Hilbert space under the inner product $(\sigma, \rho)_1 = (\sigma(0), \rho(0)) + (\sigma', \rho')_0$. In fact, if $(p, g) \in R^n \oplus H_0(I, R^n)$, the map $(p, g) \to p + \int_0^t g(s) \, ds \in H_1(I, R^n)$ is an isometry onto.
6.3. Definition: We define $L : H_1(I, R^n) \to H_0(I, R^n)$ by $L\sigma = \sigma'$ and we define $H_1^*(I, R^n) = \{\sigma \in H_1(I, R^n) \mid \sigma(0) = \sigma(1) = 0\}$.
Then the following is immediate:
6.4. Theorem: L is a bounded linear transformation of norm 1. $H_1^*(I, R^n)$ is a closed linear subspace of codimension 2n in $H_1(I, R^n)$ and L maps $H_1^*(I, R^n)$ isometrically onto the set of $g \in H_0(I, R^n)$ such that
$$\int_0^1 g(t) \, dt = 0,$$
i.e. onto the orthogonal complement in $H_0(I, R^n)$ of the set of constant maps of I into $R^n$.
6.5. Theorem: If $\rho \in H_1^*(I, R^n)$ and $\lambda$ is absolutely continuous from I into $R^n$, then
$$\int_0^1 (\lambda'(t), \rho(t)) \, dt = (\lambda, -L\rho)_0.$$
6.6. Definition: $C^0(I, R^n)$ = set of all continuous maps of I into $R^n$. $C^0(I, R^n)$ is a Banach space with the usual norm $|\ |_\infty$. The inclusion of $C^0(I, R^n)$ into $H_0(I, R^n)$ is evidently bounded.

6.7. Theorem: Let $\sigma \in H_1(I, R^n)$. Then
$$|\sigma(t) - \sigma(s)| \le |t - s|^{1/2}\, |L\sigma|_0.$$
Proof: Apply Schwarz's inequality.

Corollary 1: If $\sigma \in H_1(I, R^n)$ then $|\sigma|_\infty \le 2 |\sigma|_1$.

Corollary 2: The inclusion maps $i : H_1(I, R^n) \to C^0(I, R^n)$ and $H_1(I, R^n) \to H_0(I, R^n)$ are completely continuous.
Proof of Corollary 1 is trivial. For 2 apply the Arzela-Ascoli Theorem.
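The estimate of Theorem 6.7 and the bound of Corollary 1 are easy to test numerically on a concrete path. The sketch below is illustrative only: the sample path $\sigma(t) = (\cos 3t, t^2)$ in $R^2$, the midpoint quadrature, and the grid of sample times are all assumptions chosen for the example, with I = [0, 1].

```python
import math


def sigma(t):
    """Sample path in R^2 (an arbitrary smooth choice for illustration)."""
    return (math.cos(3.0 * t), t * t)


def sigma_prime(t):
    """Exact derivative of the sample path."""
    return (-3.0 * math.sin(3.0 * t), 2.0 * t)


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def h0_norm_of_derivative(n=20000):
    """Midpoint-rule approximation of |L sigma|_0 = (int_0^1 |sigma'(t)|^2 dt)^(1/2)."""
    h = 1.0 / n
    total = sum(norm(sigma_prime((i + 0.5) * h)) ** 2 for i in range(n))
    return math.sqrt(total * h)


L_sigma_0 = h0_norm_of_derivative()
# |sigma|_1^2 = |sigma(0)|^2 + |L sigma|_0^2 (Definition 6.2).
h1_norm = math.sqrt(norm(sigma(0.0)) ** 2 + L_sigma_0 ** 2)

grid = [i / 50 for i in range(51)]

# Theorem 6.7: |sigma(t) - sigma(s)| <= |t - s|^(1/2) |L sigma|_0 on all grid pairs.
holder_ok = all(
    norm([a - b for a, b in zip(sigma(t), sigma(s))])
    <= math.sqrt(abs(t - s)) * L_sigma_0 + 1e-9
    for s in grid
    for t in grid
)

# Corollary 1: |sigma|_infinity <= 2 |sigma|_1 (sup taken over the grid).
sup_norm = max(norm(sigma(t)) for t in grid)
sup_ok = sup_norm <= 2.0 * h1_norm
```

The small tolerance in `holder_ok` only absorbs quadrature error; the inequality itself is exact, being an instance of Schwarz's inequality.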
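6.8. Lemma: Let $\phi : R^n \to R^p$ be a smooth map, and let $\bar\phi : H_1(I, R^n) \to H_1(I, R^p)$ be defined by $\bar\phi(\sigma) = \phi \circ \sigma$. Then $\bar\phi$ is smooth. Moreover, for each $m \ge 1$,
$$d^m \bar\phi_\sigma(\lambda_1, \dots, \lambda_m)(t) = d^m \phi_{\sigma(t)}(\lambda_1(t), \dots, \lambda_m(t)).$$
This follows from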
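6.9. Lemma: Let F be a $C^1$-map of $R^n$ into $L^s(R^n, R^p)$, the space of all s-linear maps from $R^n$ to $R^p$. Then the map $\bar F$ of $H_1(I, R^n)$ into $L^s(H_1(I, R^n), H_1(I, R^p))$ defined by
$$\bar F(\sigma)(\lambda_1, \dots, \lambda_s)(t) = F(\sigma(t))(\lambda_1(t), \dots, \lambda_s(t))$$
is continuous. Moreover, if F is $C^3$ then $\bar F$ is $C^1$ and
$$d\bar F = \overline{dF}.$$
Proof: Observe that
$$\big(\bar F(\sigma)(\lambda_1, \dots, \lambda_s)\big)'(t) = \frac{d}{dt} F(\sigma(t))(\lambda_1(t), \dots, \lambda_s(t)) = dF_{\sigma(t)}(\sigma'(t))(\lambda_1(t), \dots, \lambda_s(t)) + \sum_{i=1}^{s} F(\sigma(t))(\lambda_1(t), \dots, \lambda_i'(t), \dots, \lambda_s(t)),$$
which implies
$$\big|\big(\bar F(\sigma)(\lambda_1, \dots, \lambda_s)\big)'(t)\big| \le |dF_{\sigma(t)}|\, |\sigma'(t)|\, |\lambda_1(t)| \cdots |\lambda_s(t)| + \sum_{i=1}^{s} |F(\sigma(t))|\, |\lambda_1(t)| \cdots |\lambda_i'(t)| \cdots |\lambda_s(t)|.$$
Since $|\lambda_i|_\infty \le 2 |\lambda_i|_1$, and putting $k = \sup |dF_{\sigma(t)}|$, we have
$$|dF_{\sigma(t)}(\sigma'(t))(\lambda_1, \dots, \lambda_s)|_0 \le k\, 2^s\, |L\sigma|_0\, |\lambda_1|_1 \cdots |\lambda_s|_1.$$
Since also
$$|F(\sigma(t))(\lambda_1(t), \dots, \lambda_i'(t), \dots, \lambda_s(t))|_0 \le 2^s \sup |F(\sigma(t))|\, |\lambda_1|_1 \cdots |\lambda_s|_1,$$
if we recall that $|\rho|_1^2 = |\rho(0)|^2 + |\rho'|_0^2$ we see that
$$|\bar F(\sigma)(\lambda_1, \dots, \lambda_s)|_1 \le k(\sigma)\, |\lambda_1|_1 \cdots |\lambda_s|_1,$$
where $k(\sigma)$ is a constant depending on $\sigma$. It follows that (since $\bar F(\sigma)$ is plainly multilinear) $\bar F(\sigma) \in L^s(H_1(I, R^n), H_1(I, R^p))$. If $\rho \in H_1(I, R^n)$ then
$$|(\bar F(\sigma) - \bar F(\rho))(\lambda_1, \dots, \lambda_s)|_\infty \le 2^s \sup |F(\sigma(t)) - F(\rho(t))|\, |\lambda_1|_1 \cdots |\lambda_s|_1$$
and it is plain that
$$\big|\big((\bar F(\sigma) - \bar F(\rho))(\lambda_1, \dots, \lambda_s)\big)'\big|_0 \le 2^s M(\sigma, \rho)\, |\lambda_1|_1 \cdots |\lambda_s|_1,$$
where
$$M(\sigma, \rho) = \sup |dF_{\sigma(t)}|\, |\sigma' - \rho'|_0 + \sup |dF_{\sigma(t)} - dF_{\rho(t)}|\, |\rho'|_0 + s \sup |F(\sigma(t)) - F(\rho(t))|.$$
Hence
$$|\bar F(\sigma) - \bar F(\rho)| \le k(\sigma, \rho),$$
where $|\ |$ is the norm in $L^s(H_1(I, R^n), H_1(I, R^p))$ and where the constant $k(\sigma, \rho) \to 0$ if $\sup |F(\sigma(t)) - F(\rho(t))|$, $\sup |dF_{\sigma(t)} - dF_{\rho(t)}|$ and $|\sigma' - \rho'|_0$ all approach zero. But if $\rho \to \sigma$ in $H_1(I, R^n)$ then $|\sigma' - \rho'|_0 \le |\sigma - \rho|_1$ goes to zero and then $\rho \to \sigma$ uniformly. Hence, since F and dF are continuous, $F(\rho(t)) \to F(\sigma(t))$ uniformly and $dF_{\rho(t)} \to dF_{\sigma(t)}$ uniformly, so $k(\sigma, \rho) \to 0$. This shows that $\bar F$ is continuous, and it can be proved similarly that $\bar F$ is $C^1$ whenever F is $C^3$.
Q.E.D.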
Having disposed of the preliminaries we proceed to the applications.
6.10. Definition: Let V be a finite dimensional smooth manifold. Denote by $H_1(I, V)$ the set of all continuous mappings $\sigma : I \to V$ such that $\phi \circ \sigma$ is absolutely continuous and $|(\phi \circ \sigma)'|$ locally square-integrable for each chart $\phi$ in V. Let $H_1(I, V)_\sigma = \{\lambda \in H_1(I, T(V)) \mid \lambda(t) \in V_{\sigma(t)}$ for all $t \in I\}$, where T(V) is the tangent bundle to V, and $V_{\sigma(t)}$ is the tangent space at $\sigma(t)$. If $p, q \in V$ we define $\Omega(V; p, q)$ as $\{\sigma \in H_1(I, V) \mid \sigma(0) = p, \sigma(1) = q\}$, and if $\sigma \in \Omega(V; p, q)$ we define $\Omega(V; p, q)_\sigma = \{\lambda \in H_1(I, V)_\sigma \mid \lambda(0) = 0_p, \lambda(1) = 0_q\}$, where $0_p$ (resp. $0_q$) is the zero of $V_p$ (resp. $V_q$).

Remark: $H_1(I, V)_\sigma$ is a vector space under pointwise operations and $\Omega(V; p, q)_\sigma$ is a subspace of $H_1(I, V)_\sigma$.

6.11. Theorem: Let V be a smooth submanifold of $R^n$. Then (a) $H_1(I, V)$ consists of all $\sigma \in H_1(I, R^n)$ such that $\sigma(I) \subset V$. (b) $H_1(I, V)$ is a closed submanifold of the Hilbert space $H_1(I, R^n)$. (c) If $p, q \in V$, then $\Omega(V; p, q)$ is a closed submanifold of $H_1(I, V)$. (d) If $\sigma \in H_1(I, V)$ then the tangent space to $H_1(I, V)$ at $\sigma$ is $H_1(I, V)_\sigma = \{\lambda \in H_1(I, R^n) \mid \lambda(t) \in V_{\sigma(t)}, t \in I\}$, and (e) if $\sigma \in \Omega(V; p, q)$ then the tangent space to $\Omega(V; p, q)$ at $\sigma$ is just $\Omega(V; p, q)_\sigma = \{\lambda \in H_1(I, V)_\sigma \mid \lambda(0) = \lambda(1) = 0\}$.

Proof: (a) is clear. It is equally clear that $H_1(I, V)$ is a closed set in $H_1(I, R^n)$ and that $\Omega(V; p, q)$ is a closed set in $H_1(I, V)$. Since V is a smooth submanifold of $R^n$ we can find a smooth Riemann metric for $R^n$ such that V is a totally geodesic submanifold. Then if $E : R^n \times R^n \to R^n$ is the corresponding exponential map (i.e. the map $t \to E(p, tv)$ is the geodesic starting from p with tangent vector v), then E is a smooth map. Let $\sigma \in H_1(I, V)$ and define $\bar\phi : H_1(I, R^n) \to H_1(I, R^n)$ by $\bar\phi(\lambda)(t) = E(\sigma(t), \lambda(t))$. Then $\bar\phi$ is smooth and $\bar\phi(0) = \sigma$. Moreover $d\bar\phi_0(\lambda)(t) = dE_0^{\sigma(t)}(\lambda(t))$, where $E^{\sigma(t)}(v) = E(\sigma(t), v)$. Since $dE_0^{\sigma(t)}$ is the identity map of $R^n$, $d\bar\phi_0$ is the identity in $H_1(I, R^n)$. Thus by the inverse function theorem $\bar\phi$ maps a neighborhood of zero in $H_1(I, R^n)$ diffeomorphically onto a neighborhood of $\sigma$ in $H_1(I, R^n)$. Since V is totally geodesic, given $\lambda$ near zero in $H_1(I, R^n)$, $\bar\phi(\lambda) \in H_1(I, V)$ if and only if $\lambda \in H_1(I, V)_\sigma$. Similarly, if $\sigma \in \Omega(V; p, q)$ then $\bar\phi(\lambda) \in \Omega(V; p, q)$ if and only if $\lambda \in \Omega(V; p, q)_\sigma$. So $\bar\phi^{-1}$ restricted to a neighborhood of $\sigma$ in $H_1(I, V)$ (resp. $\Omega(V; p, q)$) is a chart in $H_1(I, V)$ (resp. $\Omega(V; p, q)$) which is a restriction of a chart for $H_1(I, R^n)$. This completes the proof of (b) and (c), and the verification of (d) and (e) is routine.
Q.E.D.

Since any smooth map of a submanifold $V \subset R^n$ into a submanifold $W \subset R^m$ can be smoothly extended to a map from $R^n$ to $R^m$, we can apply our results to obtain the following statement.
6.12. Theorem: Let $V \subset R^n$, $W \subset R^m$ be smooth submanifolds, $\phi : V \to W$ a smooth map. Then $\bar\phi : H_1(I, V) \to H_1(I, W)$ defined by $\bar\phi(\sigma) = \phi \circ \sigma$ is a smooth map of $H_1(I, V)$ into $H_1(I, W)$. Moreover the Fréchet derivative
$$d\bar\phi_\sigma : H_1(I, V)_\sigma \to H_1(I, W)_{\bar\phi(\sigma)}$$
is given by
$$d\bar\phi_\sigma(\lambda)(t) = d\phi_{\sigma(t)}(\lambda(t)).$$
Observe now that every manifold can, by Whitney's Theorem, be imbedded in a Euclidean space. Hence
6.13. Theorem: $H_1(I, V)$ and $\Omega(V; p, q)$ are Hilbert manifolds, and, by Theorem 6.12, their manifold structure does not depend on the particular imbedding of V used.
The function to which general Morse Theory will be applied in what follows will be the action integral $J^V(\sigma)$, defined for a Riemann manifold V as follows:

6.14. Definition: For $\sigma \in H_1(I, V)$,
$$J^V(\sigma) = \frac{1}{2} \int_0^1 |\sigma'(t)|^2 \, dt.$$
We leave to the reader proofs of the following properties of $J^V(\sigma)$.

6.15. Lemma: Let V, W be smooth Riemannian manifolds, $\phi : V \to W$ an isometry. Then $J^V = J^W \circ \bar\phi$.
6.16. Lemma: Let V be a smooth submanifold of W. Then $J^V = J^W \mid H_1(I, V)$.
6.17. Lemma: $J^V$ is a smooth functional.
Advice: Prove Lemma 6.17 first for smooth submanifolds of $R^n$ and then for general manifolds using Nash's imbedding theorem for Riemannian manifolds, which was proved as Theorem 2.4.
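As a concrete illustration of the action integral of Definition 6.14 in the simplest case $V = R^2$, the sketch below evaluates a discretized action on polygonal paths. The discretization (forward differences on a uniform grid of [0, 1]), the endpoints, and the perturbation are assumptions made for the example; the function name `action` is hypothetical.

```python
import math


def action(path):
    """Discrete action J = (1/2) * sum_i |sigma_{i+1} - sigma_i|^2 / h for a polygonal path on [0, 1]."""
    n = len(path) - 1
    h = 1.0 / n
    return 0.5 * sum(
        sum((b - a) ** 2 for a, b in zip(path[i], path[i + 1])) / h
        for i in range(n)
    )


n = 200
p, q = (0.0, 0.0), (3.0, 4.0)
times = [i / n for i in range(n + 1)]

# Straight segment from p to q, traversed at constant speed (the geodesic of R^2).
line = [(p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])) for t in times]

# Same endpoints, with a bump in the normal direction (-4, 3)/5 that vanishes at t = 0, 1.
bumped = [
    (x - 0.8 * 0.3 * math.sin(math.pi * t), y + 0.6 * 0.3 * math.sin(math.pi * t))
    for (x, y), t in zip(line, times)
]

J_line = action(line)      # equals |q - p|^2 / 2 = 12.5 up to rounding
J_bumped = action(bumped)  # strictly larger: the perturbation only adds energy
```

For the straight segment the discrete value agrees with $\frac{1}{2}|q - p|^2$, and any perturbation vanishing at the endpoints increases the action, in line with the expected correspondence between critical points of J and geodesics.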
Next observe that $H_1(I, R^n)$, as a Hilbert space, has a natural Riemannian structure. Hence for all manifolds V, $H_1(I, V)$ will also have a Riemannian structure. But the situation here is not so pleasant as in connection with the differentiable structure of $H_1(I, V)$, since now, in general, this Riemannian structure will depend on the imbedding $V \to R^n$. However, this will not bother us at all.
A second observation is the following. Let W be a complete Riemannian manifold, $W_1$ a closed submanifold of W inheriting from W its Riemannian structure; let $\rho_{W_1}$, $\rho_W$ be the respective Riemannian metrics. It is clear that if $p, q \in W_1$, then $\rho_{W_1}(p, q) \ge \rho_W(p, q)$, since, by definition, the right side is an infimum over a larger set than the infimum on the left. Hence the Riemannian structure of $W_1$ is also complete. Putting our observations together, we get
6.18. Theorem: Let V be a smooth submanifold of $R^n$. Then $H_1(I, V)$ is a complete smooth Riemannian manifold with the Riemannian structure inherited from $H_1(I, R^n)$, where $R^n$ is any Euclidean space in which V is isometrically imbedded.
The same reasoning shows that $\Omega(V; p, q)$ is also a complete Riemannian manifold. Regarded as a submanifold of $H_1(I, R^n)$, the scalar product in it is simply $(\rho, \lambda)_\sigma = (L\rho, L\lambda)_0$.
One more preliminary needed for the application of Morse Theory is the verification of the Palais-Smale condition for the action integral. To remind the reader of the nature of this condition we write it down again:
P-S condition: Suppose that, for a sequence $\sigma_n$, $\nabla J^V(\sigma_n) \to 0$, and $J^V(\sigma_n)$ is bounded. Then there exists a subsequence $\sigma_{n_k}$ convergent to an element $\sigma \in H_1(I, V)$.
We proceed to establish this condition for $J^V$ in a series of substeps. We suppose throughout that V has been isometrically imbedded in a Euclidean space $R^n$. In what follows, L is the operator of Definition 6.3.
6.19. Lemma: Let $\{\sigma_n\}$ be a sequence in $\Omega(V; p, q)$ such that $|L(\sigma_n - \sigma_m)|_0 \to 0$ as $n, m \to \infty$. Then $\sigma_n$ converges in $\Omega(V; p, q)$.
Proof: Evidently $\sigma_n - \sigma_m \in H_1^*(I, R^n)$. Since L is isometric on $H_1^*(I, R^n)$ (Theorem 6.4), $\{\sigma_n\}$ is Cauchy in $H_1(I, R^n)$ and hence convergent. But $\Omega(V; p, q)$ is closed in $H_1(I, R^n)$.
Q.E.D.
6.20. Definition: Let $p, q \in V$. If $\sigma \in \Omega(V; p, q)$ we define $h(\sigma)$ to be the orthogonal projection of $L\sigma$ onto the orthogonal complement of $L(\Omega(V; p, q)_\sigma)$ in $H_0(I, R^n)$.
6.21. Theorem: Let $J = J^V \mid \Omega(V; p, q)$. If we consider $\Omega(V; p, q)$ as a Riemannian manifold with the structure induced on it as a closed submanifold of $H_1(I, R^n)$, then for each $\sigma \in \Omega(V; p, q)$, $(\nabla J)(\sigma)$ can be characterized as the unique element of $\Omega(V; p, q)_\sigma$ mapped by L onto $L\sigma - h(\sigma)$. Moreover
$$|\nabla J(\sigma)|_\sigma = |L\sigma - h(\sigma)|_0.$$
Proof: Note that $\Omega(V; p, q)_\sigma$ is a closed subspace of $H_1(I, R^n)$ and is contained in $H_1^*(I, R^n)$. It follows from Theorem 6.4 that L maps $\Omega(V; p, q)_\sigma$ isometrically onto a closed subspace of $H_0(I, R^n)$. Since $L\sigma - h(\sigma)$ is orthogonal to $L(\Omega(V; p, q)_\sigma)^\perp$, $L\sigma - h(\sigma) = L\lambda$, $\lambda \in \Omega(V; p, q)_\sigma$, with $\lambda$ unique and $|\lambda|_\sigma = |L\lambda|_0 = |L\sigma - h(\sigma)|_0$. It will suffice to prove that $dJ_\sigma(\rho) = (\lambda, \rho)_\sigma$ for $\rho \in \Omega(V; p, q)_\sigma$, i.e., that $dJ_\sigma(\rho) = (L\lambda, L\rho)_0 = (L\sigma - h(\sigma), L\rho)_0$ for $\rho \in \Omega(V; p, q)_\sigma$. Since $(h(\sigma), L\rho)_0 = 0$ for $\rho \in \Omega(V; p, q)_\sigma$, we must prove that $dJ_\sigma(\rho) = (L\sigma, L\rho)_0$ for $\rho \in \Omega(V; p, q)_\sigma$. But $J^{R^n}(\sigma) = \frac{1}{2} |L\sigma|_0^2$, so $dJ_\sigma^{R^n}(\rho) = (L\sigma, L\rho)_0$ for $\rho \in H_1(I, R^n)$. Since $J = J^{R^n} \mid \Omega(V; p, q)$, it follows that
$$dJ_\sigma = dJ_\sigma^{R^n} \mid \Omega(V; p, q)_\sigma.$$
Q.E.D.
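The key identity in the proof, $dJ^{R^n}_\sigma(\rho) = (L\sigma, L\rho)_0$, can be checked by finite differences on a discretized path. The sketch below is illustrative only: the forward-difference discretization of L, the sample path (a quarter circle in $R^2$), the variation $\rho$ vanishing at the endpoints, and all function names are assumptions made for the example.

```python
import math


def differences(path):
    """Forward differences sigma_{i+1} - sigma_i of a polygonal path."""
    return [[b - a for a, b in zip(path[i], path[i + 1])] for i in range(len(path) - 1)]


def action(path):
    """Discrete action (1/2) |L sigma|_0^2 on a uniform grid of [0, 1]."""
    h = 1.0 / (len(path) - 1)
    return 0.5 * sum(sum(d * d for d in step) for step in differences(path)) / h


def derivative_pairing(sig, rho):
    """Discrete (L sigma, L rho)_0 = sum_i <dsigma_i, drho_i> / h."""
    h = 1.0 / (len(sig) - 1)
    return sum(
        sum(a * b for a, b in zip(ds, dr))
        for ds, dr in zip(differences(sig), differences(rho))
    ) / h


n = 400
times = [i / n for i in range(n + 1)]
sig = [(math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)) for t in times]
rho = [(math.sin(math.pi * t), 2.0 * math.sin(math.pi * t)) for t in times]  # rho(0) = rho(1) = 0

eps = 1e-3
plus = [(s[0] + eps * r[0], s[1] + eps * r[1]) for s, r in zip(sig, rho)]
minus = [(s[0] - eps * r[0], s[1] - eps * r[1]) for s, r in zip(sig, rho)]

# Central difference quotient of J in the direction rho.
dJ_numeric = (action(plus) - action(minus)) / (2.0 * eps)
dJ_formula = derivative_pairing(sig, rho)
```

Because the discrete action is quadratic in the path, the central difference quotient agrees with the pairing up to rounding error.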

6.22. Definition: Let $\bar\Omega(V; p, q)_\sigma$ be the closure of $\Omega(V; p, q)_\sigma$ in $H_0(I, R^n)$, and let $P_\sigma$ be the orthogonal projection of $H_0(I, R^n)$ on $\bar\Omega(V; p, q)_\sigma$. For each point $r \in V$, let $\Omega(r)$ denote the orthogonal projection of $R^n$ onto the tangent space $V_r$ to V at r.

6.23. Theorem: The functional J of Theorem 6.21 satisfies the Palais-Smale condition.

Proof: Let $\{\sigma_n\}$ be a sequence in $\Omega(V; p, q)$ such that $|J(\sigma_n)| \le M$, $\nabla J(\sigma_n) \to 0$. Since, by Theorem 6.21,
$$|\nabla J(\sigma_n)| = |L\sigma_n - h(\sigma_n)|_0,$$
we have $|L\sigma_n - h(\sigma_n)|_0 \to 0$. Since each $P_\sigma$ is a projection, hence norm-decreasing, and since $P_{\sigma_n} L\sigma_n = L\sigma_n$ (Corollary 2 of Lemma 6.24 below), it follows that $|L\sigma_n - P_{\sigma_n} h(\sigma_n)|_0 \to 0$, and by Corollary 2 of 6.7 we can assume on passing to a subsequence that $|\sigma_n - \sigma_m|_\infty \to 0$ as $m, n \to \infty$. We need only to prove that $|L(\sigma_n - \sigma_m)|_0 \to 0$ for $m, n \to \infty$, for then it follows from Lemma 6.19 that $\sigma_n$ will converge in $\Omega(V; p, q)$ to a $\sigma$ in $\Omega(V; p, q)$. But
$$|L(\sigma_n - \sigma_m)|_0^2 = (L\sigma_n, L(\sigma_n - \sigma_m))_0 - (L\sigma_m, L(\sigma_n - \sigma_m))_0.$$
Thus it suffices to prove that $(L\sigma_n, L(\sigma_n - \sigma_m))_0 \to 0$ as $m, n \to \infty$. Since $|L\sigma_n|_0^2 = 2J(\sigma_n)$ is bounded, $|L(\sigma_n - \sigma_m)|_0$ is bounded also and, since $L\sigma_n - P_{\sigma_n} h(\sigma_n) \to 0$ in $H_0(I, R^n)$, it suffices to prove that
$$(P_{\sigma_n} h(\sigma_n), L(\sigma_n - \sigma_m))_0 \to 0 \quad \text{as } m, n \to \infty.$$
We now refer to Lemma 6.24 below and note that it follows from this Lemma that if $\sigma \in \Omega(V; p, q)$ then $P_\sigma f$ belongs to $\Omega(V; p, q)_\sigma$ if f is smooth and vanishes for t = 0 and t = 1. Since $h(\sigma)$ is orthogonal to $LP_\sigma f$ in this case, we have $(h(\sigma), LP_\sigma f)_0 = 0$ for all such $\sigma$ and f. Thus
(*) $(P_\sigma h(\sigma), Lf)_0 = (h(\sigma), (P_\sigma L - LP_\sigma) f)_0$
for $\sigma \in \Omega(V; p, q)$ and smooth f vanishing at t = 0, 1. If we put $Q_\sigma(t) = (d/dt)\, \Omega(\sigma(t))$, it follows by differentiation from (*) that
$$(P_\sigma h(\sigma), Lf)_0 = -(h(\sigma), Q_\sigma f)_0 = -(Q_\sigma h(\sigma), f)_0$$
for smooth f vanishing at t = 0, 1, and hence, by a limit argument, for all $f \in H_1^*(I, R^n)$.
Since $\sigma_n - \sigma_m \in H_1^*(I, R^n)$ it follows that
$$|(P_{\sigma_n} h(\sigma_n), L(\sigma_n - \sigma_m))_0| = \Big| \int_0^1 (Q_{\sigma_n}(t)\, h(\sigma_n)(t), (\sigma_n - \sigma_m)(t)) \, dt \Big| \le |\sigma_n - \sigma_m|_\infty \int_0^1 |Q_{\sigma_n}(t)\, h(\sigma_n)(t)| \, dt.$$
Thus it suffices to show that the last integral is bounded. Let A be a compact set such that $\sigma_n(I) \subset A$. Then there exists K such that
$$\int_0^1 |Q_{\sigma_n}(t)\, h(\sigma_n)(t)| \, dt \le K\, |L\sigma_n|_0\, |h(\sigma_n)|_0.$$
Now, since $|L\sigma_n|_0$ is bounded and since $|L\sigma_n - h(\sigma_n)|_0 \to 0$, $|h(\sigma_n)|_0$ is bounded and the theorem follows.
Q.E.D.
Finally, we relate critical points of the action integral J with geodesics, and
find conditions under which these critical points are non-degenerate. We
will not discuss the geometry of geodesics of a finite-dimensional manifold
in detail, but refer the reader instead to (3) or (4) of the Bibliography.
6.24. Lemma: Let $\sigma \in \Omega(V; p, q)$. Then $\overline{\Omega(V; p, q)_\sigma} = \{\lambda \in H_0(I, R^n) \mid \lambda(t) \in V_{\sigma(t)}$ for almost all $t \in I\}$.
If $\lambda \in H_0(I, R^n)$ then $(P_\sigma\lambda)(t) = \Omega(\sigma(t))\,\lambda(t)$.
Proof: Let $\pi_\sigma \in L(H_0(I, R^n), H_0(I, R^n))$ be defined by $(\pi_\sigma\lambda)(t) = \Omega(\sigma(t))\,\lambda(t)$.
Since $\Omega(\sigma(t))$ is an orthogonal projection in $R^n$ for each $t \in I$ it follows
from the definition of the inner product in $H_0(I, R^n)$ that $\pi_\sigma$ is an orthogonal
projection. From the characterization of $\Omega(V; p, q)_\sigma$ it is clear that $\pi_\sigma$
maps $H_1^*(I, R^n)$ onto $\Omega(V; p, q)_\sigma$. Since $H_1^*(I, R^n)$ is dense in $H_0(I, R^n)$
it follows that the range of $\pi_\sigma$ is $\overline{\Omega(V; p, q)_\sigma}$, so $\pi_\sigma = P_\sigma$. On the other hand,
$\lambda \in H_0(I, R^n)$ is fixed under $\pi_\sigma$ if and only if $\lambda(t) \in V_{\sigma(t)}$ for almost all $t \in I$.
Since the range of a projection is its set of fixed points, this proves our
lemma.
Q.E.D.
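The pointwise formula of Lemma 6.24 can be checked numerically in a simple case. The following is only a sketch under stated assumptions: V is taken to be the unit sphere in R^3, so that the projection onto the tangent plane at x is I - x x^T; the curve `sigma` and the field `lam` are arbitrary illustrative choices, and all function names are hypothetical.

```python
import numpy as np

def omega(x):
    """Orthogonal projection of R^3 onto the tangent plane of the unit sphere at x."""
    x = np.asarray(x, dtype=float)
    return np.eye(3) - np.outer(x, x)

def sigma(t):
    """Illustrative curve on the sphere (an arc of a great circle)."""
    return np.array([np.cos(t), np.sin(t), 0.0])

def lam(t):
    """An arbitrary vector field along sigma, not tangent in general."""
    return np.array([1.0, t, t * t])

ts = np.linspace(0.0, 1.0, 11)

# (P_sigma lam)(t) = Omega(sigma(t)) lam(t), applied at each sample time.
once = np.array([omega(sigma(t)) @ lam(t) for t in ts])
# Applying the projection a second time should change nothing.
twice = np.array([omega(sigma(t)) @ v for t, v in zip(ts, once)])
# The projected field should have no component along the normal sigma(t).
normal_part = np.array([sigma(t) @ v for t, v in zip(ts, once)])
```

Idempotence and the vanishing normal component are exactly the two properties used in the proof: the pointwise map is a projection, and its fixed points are the fields tangent to V almost everywhere.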
The following are obvious consequences of the lemma.
Corollary 1: If $\sigma \in \Omega(V; p, q)$, then
$P_\sigma(H_1(I, R^n)) = H_1(I, V)_\sigma$ and $P_\sigma(H_1^*(I, R^n)) = \Omega(V; p, q)_\sigma$.
Corollary 2: If $\sigma \in \Omega(V; p, q)$ then $P_\sigma L\sigma = L\sigma$.
Another simple result, whose proof is left as an exercise, is
6.25. Lemma: Let $T \in H_0(I, L(R^n, R^p))$ and define for each $\lambda \in H_0(I, R^n)$ a
measurable function $T(\lambda) : I \to R^p$ by $T(\lambda)(t) = T(t)\,\lambda(t)$. Then
(1) $T$ is bounded from $H_0(I, R^n)$ to $L^1(I, R^p)$;
(2) If $T$ and $\lambda$ are absolutely continuous then so is $T(\lambda)$ and
$(T\lambda)'(t) = T'(t)\,\lambda(t) + T(t)\,\lambda'(t)$;
(3) If $T \in H_1(I, L(R^n, R^p))$, $\lambda \in H_1(I, R^n)$, then $T(\lambda) \in H_1(I, R^p)$.
6.26. Definition: Let $\sigma \in \Omega(V; p, q)$. Define $G_\sigma \in H_1(I, L(R^n, R^n))$ by
$G_\sigma = \Omega \circ \sigma$ and $Q_\sigma \in H_0(I, L(R^n, R^n))$ by $Q_\sigma = G_\sigma'$.
6.27. Theorem: Let $\sigma \in \Omega(V; p, q)$. Let $P_\sigma$ be as in Definition 6.22. If
$\rho \in H_1(I, R^n)$, then $(LP_\sigma - P_\sigma L)\rho(t) = Q_\sigma(t)\,\rho(t)$. Given $f \in H_0(I, R^n)$, define
an absolutely continuous map $g : I \to R^n$ by
$$g(t) = \int_0^t Q_\sigma(s)\, f(s)\, ds.$$
Then, if $\rho \in H_1^*(I, R^n)$,
$$(f, (LP_\sigma - P_\sigma L)\rho)_0 = (g, -L\rho)_0.$$
Proof: Since $P_\sigma\rho(t) = G_\sigma(t)\,\rho(t)$ and $P_\sigma(L\rho)(t) = G_\sigma(t)\,\rho'(t)$ by 6.24,
$(LP_\sigma - P_\sigma L)\rho(t) = Q_\sigma(t)\,\rho(t)$ follows immediately by differentiation. By (1) of
Lemma 6.25, $s \mapsto Q_\sigma(s)\, f(s)$ is summable, so $g$ is absolutely continuous. Next
note that, since $G_\sigma(t) = \Omega(\sigma(t))$ is self-adjoint for all $t$, $Q_\sigma(t) = G_\sigma'(t)$ is
self-adjoint wherever defined, and hence
$$(f, (LP_\sigma - P_\sigma L)\rho)_0 = \int_0^1 (f(t), Q_\sigma(t)\,\rho(t))\, dt = \int_0^1 (Q_\sigma(t) f(t), \rho(t))\, dt = \int_0^1 (g'(t), \rho(t))\, dt.$$
Then if $\rho \in H_1^*(I, R^n)$ Theorem 6.5 gives
$$(f, (LP_\sigma - P_\sigma L)\rho)_0 = (g, -L\rho)_0.$$
Q.E.D.
6.28. Theorem: Let $h(\sigma)$ be as in Definition 6.20. If $\sigma \in \Omega(V; p, q)$ then
$P_\sigma h(\sigma)$ is absolutely continuous and $(P_\sigma h(\sigma))'(t) = Q_\sigma(t)\, h(\sigma)(t)$.
Proof: If $\rho \in H_1^*(I, R^n)$ then
$$(P_\sigma h(\sigma), L\rho)_0 = (h(\sigma), P_\sigma L\rho)_0 = (h(\sigma), (P_\sigma L - LP_\sigma)\rho)_0$$
since $(h(\sigma), LP_\sigma\rho) = 0$. Hence $(P_\sigma h(\sigma), L\rho)_0 = (g, L\rho)_0$ if we define $g$ to be
$$g(t) = \int_0^t Q_\sigma(s)\, h(\sigma)(s)\, ds.$$
Then $P_\sigma h(\sigma) - g \perp L(H_1^*(I, R^n))$, whence $P_\sigma h(\sigma) - g$ = constant. Since $g$
is absolutely continuous so is $P_\sigma h(\sigma)$ and they have the same derivative.
But $g'(t) = Q_\sigma(t)\, h(\sigma)(t)$.
Q.E.D.
6.29. Theorem: Let $\sigma$ be a critical point of $J$. Then $\sigma$ is smooth and, moreover,
$\sigma'' \perp V$ everywhere. Conversely, if $\sigma \in \Omega(V; p, q)$, $\sigma'$ is absolutely continuous,
and $\sigma'' \perp V$ a.e., then $\sigma$ is a critical point of $J$.
Proof: By Theorem 6.21, if $\sigma$ is a critical point of $J$, then $L\sigma = h(\sigma)$.
Since $P_\sigma L\sigma = L\sigma$, it follows that $P_\sigma h(\sigma) = h(\sigma)$, so by Theorem 6.28 $\sigma'$ is
absolutely continuous (so that $\sigma$ is $C^1$) and
$$(*)\qquad \sigma''(t) = Q_\sigma(t)\, \sigma'(t).$$
Now since $\Omega : V \to L(R^n, R^n)$ is smooth, using 6.26 we have
$$Q_\sigma(t) = \frac{d}{dt}\,\Omega(\sigma(t)).$$
It follows that if $\sigma$ is $C^m$, then $Q_\sigma$ is $C^{m-1}$, so by (*) $\sigma''$
is $C^{m-1}$, which implies that $\sigma$ is $C^{m+1}$. Since we already know $\sigma$ is $C^1$, it follows that
$\sigma$ is smooth. If $\rho \in \Omega(V; p, q)_\sigma$ then $L\sigma = h(\sigma)$ is orthogonal to $L\rho$, so that
$\sigma''$ is orthogonal to $\rho$. Since $\sigma''$ and $\rho$ are continuous, it follows that $(\sigma''(t), \rho(t)) = 0$,
$t \in I$. If $t \in I$ is not an endpoint and $v_0 \in V_{\sigma(t)}$, then there exists
$\rho \in \Omega(V; p, q)_\sigma$ such that $\rho(t) = v_0$, hence $\sigma''(t)$ is orthogonal to $V_{\sigma(t)}$ and,
by continuity, this holds also at the endpoints. Conversely, if $\sigma \in \Omega(V; p, q)$
is such that $\sigma'$ is absolutely continuous and $\sigma'' \perp V_{\sigma(t)}$ for almost all $t \in I$,
then $L\sigma \perp L(\Omega(V; p, q)_\sigma)$, so $L\sigma = h(\sigma)$ and $\sigma$ is a critical point of $J$.
Q.E.D.
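The criterion of Theorem 6.29 (the second derivative is normal to V) can be illustrated numerically. The sketch below assumes V is the unit sphere in R^3 and compares a great circle with a circle of latitude; the curves, the finite-difference step, and all names are illustrative assumptions rather than part of the text.

```python
import numpy as np

def omega(x):
    """Orthogonal projection onto the tangent plane of the unit sphere at x."""
    return np.eye(3) - np.outer(x, x)

def great_circle(t, speed=2.0):
    """A geodesic of the sphere, parametrized proportionally to arc length."""
    return np.array([np.cos(speed * t), np.sin(speed * t), 0.0])

def small_circle(t, speed=2.0, height=0.5):
    """A circle of latitude: constant speed, but not a geodesic."""
    r = np.sqrt(1.0 - height ** 2)
    return np.array([r * np.cos(speed * t), r * np.sin(speed * t), height])

def second_derivative(curve, t, h=1e-4):
    """Central finite-difference approximation of curve''(t)."""
    return (curve(t + h) - 2.0 * curve(t) + curve(t - h)) / h ** 2

def tangential_acceleration(curve, t):
    """Component of curve''(t) lying in the tangent plane at curve(t)."""
    return omega(curve(t)) @ second_derivative(curve, t)

ts = np.linspace(0.1, 0.9, 9)
geodesic_defect = max(np.linalg.norm(tangential_acceleration(great_circle, t)) for t in ts)
latitude_defect = min(np.linalg.norm(tangential_acceleration(small_circle, t)) for t in ts)
```

For the great circle the tangential part of the acceleration vanishes up to rounding, while for the circle of latitude it stays bounded away from zero, so only the former satisfies the condition of the theorem.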
The last step in the characterization of critical points of $J$ is supplied by
the well-known result of classical differential geometry (see (3) and (4)) that,
if $\sigma \in C^2(I, V)$, $\sigma$ is a geodesic of $V$ parametrized proportionately to arc
length if and only if $\sigma'' \perp V$ everywhere. We obtain the following conclusion.
6.30. Theorem: If $\sigma \in \Omega(V; p, q)$, then $\sigma$ is a critical point of $J$ if and only
if $\sigma$ is a geodesic of $V$ parametrized proportionately to arc length.
We must now determine when an extremal point of $J$ will be degenerate.
We limit ourselves to a brief exposition and to suggesting that the reader
consult (3).
Let $E$ denote the exponential map of $V_p$ into $V$; i.e. if $v \in V_p$, then
$E(v) = \sigma(|v|)$ where $\sigma$ is the geodesic starting from $p$ with tangent vector $v/|v|$.
Then $E$ is smooth. Given $v \in V_p$ we define $\lambda(v)$ = dimension of the null-space
of $dE_v$. If $\lambda(v) > 0$, we call $v$ a conjugate vector at $p$. A point of $V$ is called
a conjugate point of $p$ if it is in the image under $E$ of the set of conjugate
vectors at $p$. By Sard's theorem the set of conjugate points of $p$ has measure
zero in $V$.
Given $v \in E^{-1}(q)$ define $\tilde v \in \Omega(V; p, q)$ by $\tilde v(t) = E(tv)$. Then $\tilde v$ is a geodesic
parametrized proportionately to arc length (factor: $|v|$) and hence a
critical point of $J$. Conversely, any critical point of $J$ is of the form $\tilde v$ for a
unique $v \in E^{-1}(q)$. We may now state the following two theorems:
6.30. Non-degeneracy theorem: If $v \in E^{-1}(q)$ then $\tilde v$ is a degenerate critical
point of $J$ if and only if $v$ is a conjugate vector at $p$. Hence $J$ has only
non-degenerate critical points if and only if $q$ is not a conjugate point of $p$. This
condition is satisfied if $q$ lies outside of a set of measure zero in $V$.
6.31. Morse index theorem: Let $v \in E^{-1}(q)$. Then there are only a finite
number of $t$ satisfying $0 < t < 1$ such that $tv$ is a conjugate vector at $p$. The
index of $\tilde v$ is $\sum_{0 < t < 1} \lambda(tv)$. In particular each critical point of $J$ has finite index.
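Proof of 6.31 and 6.32
Let $\sigma$ be a geodesic on the manifold $V$. By Definition 6.14 of $J^V(\sigma)$ we have
$$J^V(\sigma) = \frac12 \int_0^1 \left(\frac{d\sigma}{dt}(t), \frac{d\sigma}{dt}(t)\right) dt,$$
so that, if we introduce coordinates in the neighborhood of the curve $\sigma$ we
have
$$J^V(\sigma) = \frac12 \int_0^1 \sum_{i,j} g_{ij}(\sigma(t))\, \frac{d\sigma^i}{dt}(t)\, \frac{d\sigma^j}{dt}(t)\, dt.$$
If we then put $\sigma_\varepsilon = \sigma + \varepsilon\rho$, and calculate the terms of second order in $\varepsilon$ in
order to evaluate the Hessian quadratic form $\delta^2 J^V(\sigma; \rho, \rho)$ we find that
$$\delta^2 J^V(\sigma; \rho, \rho) = \frac12 \int_0^1 \sum_{i,j} \left\{ g_{ij}(\sigma(t))\, \frac{d\rho^i}{dt}(t)\, \frac{d\rho^j}{dt}(t) + A_{ij}(t)\, \rho^i(t)\, \frac{d\rho^j}{dt}(t) + B_{ij}(t)\, \rho^i(t)\, \rho^j(t) \right\} dt,$$
where $A_{ij}(t)$ and $B_{ij}(t)$ may readily be expressed in terms of $\sigma$ and of the
first and second partial derivatives of $g_{ij}$.
If we are careful to choose coordinates in the neighborhood of the geodesic
curve $\sigma$ in such a way that $\sigma$ itself is the first axis and curves perpendicular
to $\sigma$ give the remaining axes, we have $g_{ij}(\sigma(t)) = \delta_{ij}$, and the above
expression for the Hessian reduces to
$$[\dagger]\qquad \delta^2 J^V(\sigma; \rho, \rho) = \frac12 \int_0^1 \left\{ \sum_i \left(\frac{d\rho^i}{dt}(t)\right)^2 + \sum_{i,j} A_{ij}(t)\, \rho^i(t)\, \frac{d\rho^j}{dt}(t) + \sum_{i,j} B_{ij}(t)\, \rho^i(t)\, \rho^j(t) \right\} dt.$$
The Hessian matrix $\delta^2 J(\sigma; \cdot, \cdot)$ will be singular if and only if there exists
a non-zero function $\rho \in \Omega(V; p, q)_\sigma$ such that $\delta^2 J(\sigma; \rho, \hat\rho) = 0$ for all $\hat\rho \in \Omega(V; p, q)_\sigma$.
That is, the Hessian matrix will be singular if and only if there exists a
non-zero function $(\rho^i) \in H_1(I)$ such that $\rho^i(0) = \rho^i(1) = 0$ and such that
$$Q_\sigma(\rho, \hat\rho) = \int_0^1 \sum_{i,j} \left\{ \delta_{ij}\, \frac{d\rho^i}{dt}(t)\, \frac{d\hat\rho^j}{dt}(t) + \frac12 A_{ij}(t)\, \rho^i(t)\, \frac{d\hat\rho^j}{dt}(t) + \frac12 A_{ij}(t)\, \hat\rho^i(t)\, \frac{d\rho^j}{dt}(t) + B_{ij}(t)\, \rho^i(t)\, \hat\rho^j(t) \right\} dt = 0$$
for all $(\hat\rho^i) \in H_1([0, 1])$ such that $\hat\rho^i(0) = \hat\rho^i(1) = 0$.
Integrating the above expression by parts, we see that $\delta^2 J(\sigma; \cdot, \cdot)$ will
be singular if and only if the second order differential system
$$(*)\qquad -\frac{d^2\rho^i}{dt^2}(t) + \frac12 \sum_j A_{ij}(t)\, \frac{d\rho^j}{dt}(t) - \frac12 \frac{d}{dt}\Big\{ \sum_j A_{ji}(t)\, \rho^j(t) \Big\} + \sum_j B_{ij}(t)\, \rho^j(t) = \lambda\, \rho^i(t)$$
has a non-zero solution $(\rho^i)$ satisfying $\lambda = 0$, $\rho^i(0) = 0$, $\rho^i(1) = 0$.
Call the real numbers $\lambda$ for which there exists a non-zero function $(\rho^i)$
with $\rho^i(0) = 0 = \rho^i(1)$ satisfying (*) the eigenvalues of the Hessian form
$Q_\sigma$; call the corresponding functions $\rho$ the eigenfunctions belonging to
the eigenvalue $\lambda$; and call the number of linearly independent eigenfunctions
belonging to the eigenvalue $\lambda$ the multiplicity of the eigenvalue $\lambda$.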
Then the classical theory of Sturm-Liouville equations supplies the following
results.
a. Every eigenvalue of the boundary value problem defined by the differential
equation (*) and the boundary conditions $\rho^i(0) = \rho^i(1) = 0$ is of finite
multiplicity. The eigenvalues form an infinite sequence of isolated points
bounded below. Thus, if the eigenvalues are enumerated in increasing order,
each being repeated a number of times equal to its multiplicity, they form an
increasing sequence $\lambda_1, \lambda_2, \dots$ of real numbers approaching infinity.
b. (Minimax principle). Let $H_1^{(0)}([0, 1])$ denote the set of functions
$\rho = (\rho^i) \in H_1([0, 1])$ such that $\rho^i(0) = 0 = \rho^i(1)$. Put
$$(\rho, \hat\rho) = \int_0^1 \sum_i \rho^i(t)\, \hat\rho^i(t)\, dt,$$
and $|\rho| = (\rho, \rho)^{1/2}$. Then the $k$-th eigenvalue $\lambda_k$ in the above sequence is
given by the expression
$$(**)\qquad \lambda_k = \max_{\rho_1, \dots, \rho_{k-1}}\ \min_{\substack{(\rho, \rho_j) = 0,\ j = 1, \dots, k-1 \\ |\rho| = 1,\ \rho \in H_1^{(0)}([0, 1])}} Q_\sigma(\rho, \rho).$$
Let $0 < a \le 1$, and let $H_1([0, a])$ denote the set of all functions $\rho = (\rho^i)$
on the interval $[0, a]$ which have square-integrable first derivatives; let
$H_1^{(0)}([0, a])$ denote all those which vanish at both end-points of the interval
$[0, a]$. Let $\lambda_n(a)$ be the $n$-th eigenvalue, in increasing order and with repetitions
according to multiplicity, of the equation (*), with boundary conditions
$\rho^i(0) = 0 = \rho^i(a)$. Then, applying the minimax principle (**), we find that
$$(***)\qquad \lambda_k(a) = \max_{\rho_1, \dots, \rho_{k-1}}\ \min_{\substack{(\rho, \rho_j) = 0,\ j = 1, \dots, k-1 \\ |\rho| = 1,\ \rho \in H_1^{(0)}([0, a])}} Q_\sigma(\rho, \rho).$$
Since $H_1^{(0)}([0, a])$ may also be regarded as the subset of $H_1^{(0)}([0, 1])$ consisting
of all functions $\rho = (\rho^i)$ such that $\rho^i(t) = 0$, $a \le t \le 1$, it follows immediately,
on comparing (**) and (***), that $\lambda_k(a) \ge \lambda_k(1) = \lambda_k$ for all $k$. By the same
argument but more generally we have $\lambda_k(a) \ge \lambda_k(b)$ for $a \le b$. Thus the
eigenvalues $\lambda_k(a)$, regarded as functions of the parameter $a$, are monotone
decreasing. For all $a$ we have
$$|\rho^i(t)| = \left| \int_0^t (\rho^i)'(s)\, ds \right| \le a^{1/2} \left( \int_0^a |(\rho^i)'(s)|^2\, ds \right)^{1/2};$$
thus for all sufficiently small values of $a$ the first term of the integral [†]
dominates the others, and the expression [†] is necessarily positive. By the
minimax principle (**), this implies that for sufficiently small $a$, all the
eigenvalues $\lambda_k(a)$ are positive. Thus, for each $a > 0$, the number of negative
eigenvalues is precisely equal to the number of eigenvalues which have
crossed from positive to negative as $a$ has increased from zero to its given
value.
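The crossing argument can be followed explicitly in a one-dimensional model. The sketch below assumes the scalar constant-coefficient problem $-\rho'' - c\rho = \lambda\rho$ on $[0, a]$ with $\rho(0) = \rho(a) = 0$, whose eigenvalues are $(k\pi/a)^2 - c$; this model, the constant `c`, and the function names are illustrative assumptions, not the general system (*).

```python
import math

def eigenvalue(k, a, c):
    """k-th Dirichlet eigenvalue of -rho'' - c*rho on [0, a] (model problem)."""
    return (k * math.pi / a) ** 2 - c

def negative_count(a, c, kmax=1000):
    """Number of negative eigenvalues on [0, a]."""
    return sum(1 for k in range(1, kmax + 1) if eigenvalue(k, a, c) < 0)

def crossing_values(c, upto=1.0):
    """Values a in (0, upto) at which some eigenvalue lambda_k(a) equals zero."""
    values = []
    k = 1
    while k * math.pi / math.sqrt(c) < upto:
        values.append(k * math.pi / math.sqrt(c))
        k += 1
    return values

c = (2.5 * math.pi) ** 2
crossings = crossing_values(c)      # a = 0.4 and a = 0.8 for this choice of c
index_at_one = negative_count(1.0, c)
```

In this model each eigenvalue decreases as `a` grows, all eigenvalues are positive for small `a`, and the number of negative eigenvalues on the full interval equals the number of crossing values below 1, which is the counting principle stated above.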
The above arguments establish the following lemma:

6.32. Lemma:
(i) The Hessian matrix $\delta^2 J^V(\sigma; \rho, \hat\rho)$ is singular, i.e., the critical point $\sigma$
of the functional $J^V$ is degenerate, if and only if the equation (*) with $\lambda = 0$ has a
non-zero solution $\rho = (\rho^i)$ satisfying $\rho^i(0) = \rho^i(1) = 0$.
(ii) Let $0 < a_1 < a_2 < \dots < a_r < 1$ be the values of $a$ for which the
differential equation (*) with $\lambda = 0$ has a non-zero solution $\rho = (\rho^i)$ satisfying
$\rho^i(0) = \rho^i(a) = 0$; and let $n(a)$ be the number of such linearly independent
solutions attaching to the value $a$. Then the Morse index of the critical point $\sigma$, i.e.,
the number of negative eigenvalues of the Hessian matrix $\delta^2 J^V$, is equal to the
sum $n(a_1) + \dots + n(a_r)$.
Next we shall need the following Lemma.

6.33. Lemma: Let $n(a)$ be defined for each $a$ as in (ii) of the preceding
Lemma. Let $v \to E(v)$ be the exponential transformation, which sends each
tangent vector $v$ at the point $p$ of the manifold $V$ into the point $\sigma_v(|v|)$, where
$\sigma_v$ is the geodesic starting from $p$ with tangent vector $v/|v|$. Let $v_0$ be such that
$\sigma = \sigma_{v_0}$. Let $dE_{v_0}$ be the gradient of the map $E$ at the point $v_0$. Then $n(1)$
is equal to the dimension of the null-space of the linear transformation $dE_{v_0}$.
Before giving the proof of this Lemma, let us note that the Non-degeneracy
Theorem and the Morse Index Theorem follow readily from the two preceding
lemmas. Indeed, since $n(1) = 0$ is the criterion for non-degeneracy
according to the first of our two lemmas, the Non-degeneracy Theorem follows
immediately from the second lemma. As to the Morse Index Theorem,
we note that, applying the second of our Lemmas to each of the geodesic
segments $\sigma(t)$, $0 \le t \le a$, with $a \le 1$, we find that $n(a) = \lambda(a v_0)$ for $0 \le a \le 1$.
Thus the Morse Index Theorem follows at once from part (ii) of the first
of our Lemmas.
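Let us now give the proof of the second lemma:

Proof: Put $\sigma_v(t) = E(vt)$, so that $\sigma_v$ is a geodesic curve parametrized
proportionally to arc length, whose tangent vector at $t = 0$ is $v$. Since for any $v$, $\sigma_v$ is a
critical point of the functional $J^V(\sigma)$, we have
$$\frac{d}{d\varepsilon} J^V(\sigma_v + \varepsilon\rho)\Big|_{\varepsilon = 0} = 0$$
for every function $\rho = (\rho^i)$ vanishing for $t = 0$ and $t = 1$. Thus, if $\tilde v$ is any vector
tangent to $V$ at the same point $p$ as $v$, we have
$$\frac{\partial^2}{\partial\varepsilon\, \partial\tau} J^V(\sigma_{v + \tau\tilde v} + \varepsilon\rho)\Big|_{\varepsilon = 0,\ \tau = 0} = 0$$
for all $\rho$. That is,
$$\delta^2 J^V\Big(\sigma_v;\ \frac{d}{d\tau}\sigma_{v + \tau\tilde v}\Big|_{\tau = 0},\ \rho\Big) = 0$$
for all $\rho$ vanishing for $t = 0$ and $t = 1$. If we note that the differential
equation (*) is derived from the variational condition [†] on integrating by parts,
we see at once from this last equation that the function
$$\Delta_{\tilde v}(t) = \frac{d}{d\tau}\sigma_{v + \tau\tilde v}(t)\Big|_{\tau = 0}$$
satisfies the linear differential equation (*) with $\lambda = 0$.
We have
$$\frac{\partial^2}{\partial\tau\, \partial t}\sigma_{v + \tau\tilde v}(t)\Big|_{\tau = t = 0} = \frac{d}{d\tau}(v + \tau\tilde v)\Big|_{\tau = 0} = \tilde v;$$
thus $\Delta_{\tilde v}$ satisfies the initial conditions $\Delta_{\tilde v}(0) = 0$, $\Delta_{\tilde v}'(0) = \tilde v$. On the other
hand, taking $t = 1$, we find that
$$\Delta_{\tilde v}(1) = \frac{d}{d\tau}\sigma_{v + \tau\tilde v}(1)\Big|_{\tau = 0} = \frac{d}{d\tau}E(v + \tau\tilde v)\Big|_{\tau = 0} = dE_v(\tilde v).$$
Thus the dimension of the null-space of $dE_v$ is at the same time
the dimension of the null-space of the map $\tilde v \mapsto \Delta_{\tilde v}(1)$; and hence equals the dimension of
the space of vectors $\tilde v$ such that $\Delta_{\tilde v}$ satisfies both the boundary conditions
$\Delta_{\tilde v}(0) = 0$ and $\Delta_{\tilde v}(1) = 0$. That is, the dimension of the null-space of $dE_v$
is the integer $n(1)$ of Lemma 6.33, and thus the proof of Lemma 6.33 is
complete.
Q.E.D.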
CHAPTER VII

Applications

A. Applications to Homotopy Theory
B. A Proof of Theorem 5.16
C. The Homotopy of Some Lie Groups

A. Applications to Homotopy Theory

We first recall the definition of the homotopy groups of a space. Let $X$
be a topological space, $A$ a subspace of $X$ and $p \in A$. Let $I^n$ denote the
$n$-dimensional cube, $I^{n-1} \subset I^n$ the bottom face, and $J^{n-1}$ the union of all the
other faces, so that $J^{n-1} = \partial I^n - I^{n-1}$. We shall write
$$f : (I^n, I^{n-1}, J^{n-1}) \to (X, A, p)$$
for any continuous function $f : I^n \to X$ which maps $I^{n-1}$ into $A$ and $J^{n-1}$
on $p$. We denote by $\Omega^n(X, A, p)$ the space of all such functions, and by
$\pi_n(X, A, p)$ the set of all components of $\Omega^n(X, A, p)$. $\pi_n(X, A, p)$ has a well
known group structure (by taking representatives of two elements of $\pi_n$,
reparametrizing and then "joining" them). We call it the $n$-dimensional
homotopy group of $X$ relative to $A$ with base point $p$.
The following are easily proved properties of the groups $\pi_n$:
(i) By reparametrizing we get for $n \ge 2$
$$\Omega^n(X, A, p) \cong \Omega^1(\Omega^{n-1}(X, A, p), \theta_0, \theta_0),$$
where $\theta_0$ is the constant map sending $I^{n-1}$ to $p$. Hence $\pi_n(X, A, p) = \pi_1(\Omega^{n-1}(X, A, p), \theta_0, \theta_0)$.
(ii) In the same way we prove that for $n \ge 2$
$$\pi_n(X, A, p) = \pi_{n-1}(\Omega^1(X, A, p), \theta_0, \theta_0),$$
where $\theta_0$ sends $I^1$ onto $p$.
(iii) The identity of $\pi_n(X, A, p)$ is the class of maps homotopic to the
constant map $\theta_0 : I^n \to p$.
(iv) If $\phi(I^n) \subset A$, then $\phi$ is homotopic to the identity.
(Shrink $I^n$ by means of a function $m_t$, so that $\phi \circ m_t$ is a homotopy
between $\phi$ and $\theta_0$.)
In other words, $\pi_n(X, X, p)$ is trivial for any $p \in X$.
In the sequel we shall write $\pi_n(X, p)$ for $\pi_n(X, p, p)$. It is easy to see that
$\pi_n(X, p)$ is, in fact, the usual "absolute" $n$-dimensional homotopy group
of $X$ with base point $p$. Also, $\pi_n(X, A, p)$ will often be written $\pi_n(X, A)$,
when no confusion can arise. (Of course if $A$ is arc-wise connected $\pi_n(X, A, p)$
does not depend on $p$.)
Suppose we have a map $\psi : (X, A, p) \to (Y, B, q)$. Then $\psi$ induces a map
$\psi_* : \pi_n(X, A, p) \to \pi_n(Y, B, q)$. (Just send $\phi \in \Omega^n(X, A, p)$ into $\psi \circ \phi \in \Omega^n(Y, B, q)$.)
7.1. Definition: The boundary homomorphism $\partial : \pi_n(X, A, p) \to \pi_{n-1}(A, p, p)$,
or briefly $\partial : \pi_n(X, A) \to \pi_{n-1}(A)$, is defined as follows.
Given $\eta \in \pi_n(X, A)$, take $\phi \in \eta$; then $\phi | I^{n-1}$ belongs to $\Omega^{n-1}(A, p, p)$ and
so determines a class $\partial\eta \in \pi_{n-1}(A)$.
We state the following without proof.

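7.2. Theorem: Let $i$ be the injection $(A, p) \to (X, p)$ and $j$ the injection
$(X, p) \to (X, A)$. Then the following sequence is exact:

$$\cdots \to \pi_n(X, A) \xrightarrow{\ \partial\ } \pi_{n-1}(A, p) \xrightarrow{\ i_*\ } \pi_{n-1}(X, p) \xrightarrow{\ j_*\ } \pi_{n-1}(X, A) \to \cdots$$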
This is the analog for homotopy groups of the Exactness Principle given
in Chapter N, Part 2, § E, of these Notes.
Now, suppose we have a manifold $M$, $f \in C^\infty(M)$, satisfying the P-S condition,
and let as usual $M^a = \{x \in M;\ f(x) \le a\}$, $M^b = \{x \in M;\ f(x) \le b\}$.
If there are only non-degenerate critical levels between $a$ and $b$, $M^b$ is
deformable to $M^a$ with handles attached:
$$(1)\qquad M^b \sim M^a \cup_{h_1} (D^{k_1} \times D^{l_1}) \cup_{h_2} (D^{k_2} \times D^{l_2}) \cup_{h_3} \cdots$$
Let $A$ be another manifold, and $\phi$ a mapping $\phi : A \to M^b$. Assume that dim $(A)$
is less than the index $k_i$ of any critical point in (1) and that $A$ is compact.
Next, note that $\phi$ can be deformed to a smooth map, and set
$A_1 = \phi^{-1}(h_1(D^{k_1} \times D^{l_1}))$. Consider $h_1^{-1}\phi : A_1 \to D^{k_1} \times D^{l_1}$. Let $p_1$ be the
projection map of $D^{k_1} \times D^{l_1}$ onto $D^{k_1}$. Then $p_1 h_1^{-1}\phi$ is smooth and maps $A_1$
into $D^{k_1}$. But dim $(A_1) < k_1$, whence some point $q_1$ in $D^{k_1}$ does not belong to
the range of $p_1 h_1^{-1}\phi$, and the same holds for the other indices $k_2$, etc.
Now a manifold of the type
$$M^a \cup_{h_1} ((D^{k_1} - q_1) \times D^{l_1}) \cup_{h_2} ((D^{k_2} - q_2) \times D^{l_2}) \cup_{h_3} \cdots$$
can be deformed into $M^a$ (see drawing).

Hence, $\phi$ can be deformed to a map $A \to M^a$.

As a special case we get:

7.3. Theorem: $\pi_n(M^b, M^a) = 0$ if $n$ < index of any critical point between
$a$ and $b$.

7.4. Corollary: If Morse theory applies to $(M, f)$ and if above some
non-critical level $c$ all critical points have indices greater than $n$, then $\pi_n(M, M^c) = 0$.
We will now apply our results to the topology of spheres, in order to
obtain the so-called Freudenthal suspension relation between homotopy
groups.
First we recall that in relation to $H_1(S^j; p, q)$ and the function $J$, the
geodesics joining $p$ and $q$ are critical points whose indices depend on
the length of the geodesic: if length $(\gamma) = \pi - \varepsilon$ for any $0 < \varepsilon < \pi$, then
index $(\gamma) = 0$; if length $(\gamma) = \pi + \varepsilon$, then index $(\gamma) = j - 1$; if length
$(\gamma) = 3\pi - \varepsilon$, index $(\gamma) = 2(j - 1)$ and so on.
This follows from the Morse Index Theorem 6.31. Suppose that we
have, as before, a map $\phi : A \to H_1(S^j; p, q)$ where $p \ne q$ and $p \ne q'$,
the conjugate of $q$, and that dim $(A) < 2(j - 1)$. Then by 7.3 $\phi$ is homotopic
to a map whose range contains curves of length at most $\pi + 2\varepsilon$. Now
assume that length $(\sigma) < \pi + 2\varepsilon$. Let $m$ be the midpoint of $\sigma$: $m = \sigma(\tfrac12)$.
It is easy to see that $m : H_1(S^j; p, q) \to S^j$ is a smooth map.
We have $d(p, m) < \tfrac12\pi + \varepsilon$ and $d(q, m) < \tfrac12\pi + \varepsilon$. This implies that there
are unique geodesics $\sigma_1$ joining $p$ and $m$ and $\sigma_2$ joining $q$ and $m$ (see
drawing below), if $\varepsilon$ is small enough.

[Figure omitted.]

Then $d(\sigma(t), \sigma_1(t)) \le \tfrac12(2c + 2\varepsilon) + \tfrac12(\pi + 2\varepsilon) < \pi$ if $\varepsilon$ is small enough ($c$
being the distance between $p'$ and $q$). Hence in this case $\sigma(t)$ and $\sigma_1(t)$ are
connected by a unique shortest geodesic varying continuously with $t$, whence $\sigma$ can
be deformed through these geodesics into $\sigma_1$. The same holds for $\sigma_2$. Thus
any map $\phi : A \to H_1(S^j; p, q)$ is homotopic to a map $\tilde\phi : A \to H_1(S^j; p, q)$
such that each value $\tilde\phi(a)$ is a broken geodesic of two segments and total
length less than $\pi + 2\varepsilon$. It follows that the space of maps $A \to H_1$ is of the
same homotopy type as the space of maps with values in a "belt", and hence
of the same homotopy type as the space of maps $A \to S^{j-1}$ (see figure below).

Now, it can readily be proved that $H_1(S^j; p, q)$ is of the same homotopy
type as $H_1(S^j; p, \tilde q)$; that is, the homotopy type does not depend on the
points $p$ and $q$. Thus our result is independent of the relative position of
$p$ and $q$.
In particular, we obtain:

7.5. Theorem: If dim $(A) < 2(j - 1)$, the space of maps $A \to \Omega^1(S^j; p, q)$
is of the same homotopy type as the space of maps $A \to S^{j-1}$.
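The index values recalled above for geodesics on $S^j$ follow a simple pattern: each time the length passes a multiple of $\pi$ the index grows by $j - 1$. The helper below is a small sketch of that bookkeeping for the unit sphere; the function name and the tolerance used to reject lengths that are multiples of $\pi$ are assumptions made for illustration.

```python
import math

def geodesic_index(length, j):
    """Index of a geodesic of the given length on the unit sphere S^j.

    Assumes the endpoints are not conjugate, i.e. the length is not a
    multiple of pi; each multiple of pi passed contributes j - 1.
    """
    ratio = length / math.pi
    if abs(ratio - round(ratio)) < 1e-12:
        raise ValueError("length is a multiple of pi: endpoints are conjugate")
    return math.floor(ratio) * (j - 1)

eps = 0.1
indices_on_s3 = [geodesic_index(math.pi - eps, 3),
                 geodesic_index(math.pi + eps, 3),
                 geodesic_index(3 * math.pi - eps, 3)]
```

For $j = 3$ this reproduces the values $0$, $j - 1 = 2$ and $2(j - 1) = 4$ quoted in the text, which is why only the shortest geodesics matter for maps of complexes of dimension below $2(j - 1)$.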
7.6. Corollary: For $n < 2(j - 1)$,
$$\pi_n(\Omega^1(S^j; p, q)) \cong \pi_n(S^{j-1}).$$
By property (ii) of the homotopy groups, we obtain
7.7. Corollary:
$$\pi_{n+1}(S^{j+1}) \cong \pi_n(S^j), \quad \text{for } n < 2j.$$

Corollary 7.7 is known as the Freudenthal suspension relation.

7.8. Corollary:
$$\pi_n(S^n) \cong \pi_{n+1}(S^{n+1}), \quad \text{if } n > 0,$$
whence $\pi_n(S^n) = Z$ if $n > 0$.

B. A Proof of Theorem 5.16

Let $X \xrightarrow{p} B$ be a fiber space. Also let $\phi : A \to X$ and $\psi = p\phi : A \to B$. We
say that the homotopy $\psi_t$ of $\psi$ has the "lifting property" if there exists a
homotopy $\phi_t$ of $\phi$ such that $\psi_t = p\phi_t$.
Example: If $X = B \times C$ and $p_1 : X \to B$ is the natural projection on $B$
and $p_2 : X \to C$ that on $C$, and if $\phi$ and $\psi$ are two functions as above,
then given a homotopy $\psi_t$ the map
$$\phi_t = (\psi_t,\ p_2\phi)$$
has the required properties.
We state without proof the following
7.9. Theorem (Künneth) [Cf., for example, Hilton and Wylie, Homology
Theory.]
$$H_n(B \times C; G) = \bigoplus_{k + l = n} H_k(B; H_l(C; G)).$$
7.10. Corollary: Suppose $G$ = real numbers. Then
$$b_n(B \times C) = \sum_{k + l = n} b_k(B)\, b_l(C)$$
where $b_n$ are the Betti numbers.
If we form the Betti polynomials
$$b(B, z) = \sum_{n \ge 0} z^n b_n(B),$$
Corollary 7.10 implies that
$$b(B \times C, z) = b(B, z)\, b(C, z).$$
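The product rule for Betti polynomials is ordinary polynomial multiplication of the coefficient lists. The following sketch illustrates it; the function name is hypothetical, and the Betti numbers used for the circle and the 2-sphere are the standard ones, supplied here as an assumption since the text does not list them.

```python
def betti_product(b, c):
    """Betti numbers of B x C from those of B and C (real coefficients).

    b[k] and c[l] are the k-th and l-th Betti numbers; the result is the
    coefficient list of the product of the two Betti polynomials.
    """
    out = [0] * (len(b) + len(c) - 1)
    for k, bk in enumerate(b):
        for l, cl in enumerate(c):
            out[k + l] += bk * cl
    return out

circle = [1, 1]        # b(S^1, z) = 1 + z
sphere2 = [1, 0, 1]    # b(S^2, z) = 1 + z^2

torus = betti_product(circle, circle)
circle_times_sphere = betti_product(circle, sphere2)
```

Here `torus` is the coefficient list of $(1 + z)^2$ and `circle_times_sphere` that of $(1 + z)(1 + z^2)$.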
Consider now the following fiber space: take a topological space $B$, a
point $b \in B$ and let $X$ be the space of all curves in $B$ starting at $b$, with the
usual topology. Of course $p : X \to B$ assigns its end point to each curve.
Let $\phi : A \to X$, and $\psi = p\phi$. For a given homotopy $\psi_t$ of $\psi$, put $\phi_t(a)$
= the curve $\phi(a)$ followed by the curve $s \mapsto \psi_s(a)$, $0 \le s \le t$. This provides a lifting. So $X \xrightarrow{p} B$ has the
lifting homotopy property. Furthermore, $X$ has the homotopy type of a point
(just shrink each curve to the point $b$). Therefore $H_n(X) = 0$ for $n > 0$.
Returning to Theorem 7.9, set $Z^{k,l} = H_k(B, H_l(C, G))$ for $k, l \ge 0$, and
denote by $Z$ the whole double sequence $\{Z^{k,l};\ k, l \ge 0\}$. More generally,
assume we have two arbitrary double sequences of Abelian groups, $Z$ and $\tilde Z$.
Then we make the following
7.11. Definition: We say that $\tilde Z$ is derived from $Z$ by an $r$-boundary
operation if there exists a "boundary" operator $d : \bigoplus Z^{k,l} \to \bigoplus Z^{k,l}$
such that $d^2 = 0$,
$$(\#)\qquad d : Z^{k,l} \to Z^{k-r,\ l+r-1}$$
and
$$\tilde Z^{k,l} = \frac{\{dz = 0\} \cap Z^{k,l}}{dZ \cap Z^{k,l}}.$$
(It should be understood that $Z^{k,l}$ is the trivial group for $k$ or $l < 0$.) $d$ is
called an operator of type $r$. In this case we shall write $\tilde Z = \mathcal{H}_r(Z)$.
Observe that for $r$ large and $k + l$ small, $\tilde Z^{k,l} = Z^{k,l}$, because in this case
$\{dz = 0\} \cap Z^{k,l} = Z^{k,l}$ and $dZ \cap Z^{k,l} = \{0\}$.
7.12. Lemma: If we have operators $d_i$ of type $i = 2, 3, \dots$ and, starting
with $Z$, sequences $\mathcal{H}_2(Z)$, $\mathcal{H}_3(\mathcal{H}_2(Z))$ of groups, etc., the limit
$$\mathcal{H}_\infty(Z) = \lim_{r \to \infty} \mathcal{H}_r(\cdots \mathcal{H}_3(\mathcal{H}_2(Z)) \cdots)$$
exists.
This follows from the above observation.
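The observation behind Lemma 7.12 is pure index bookkeeping, which can be sketched as follows. The code only tracks positions $(k, l)$ in the double sequence, using the shift $(k, l) \to (k - r, l + r - 1)$ of an operator of type $r$; the function names are hypothetical, and the test is a sufficient condition based on degrees alone.

```python
def target(k, l, r):
    """Position reached from (k, l) by an operator of type r."""
    return (k - r, l + r - 1)

def source(k, l, r):
    """Position that an operator of type r maps into (k, l)."""
    return (k + r, l - r + 1)

def in_range(k, l):
    """Positions with a negative index carry the trivial group."""
    return k >= 0 and l >= 0

def unchanged_by(k, l, r):
    """True if, for degree reasons alone, an operator of type r cannot alter Z^{k,l}."""
    return not in_range(*target(k, l, r)) and not in_range(*source(k, l, r))

def stable_from(k, l):
    """Smallest r0 >= 2 such that unchanged_by(k, l, r) holds for every r >= r0."""
    return max(2, k + 1, l + 2)
```

For example, `target(3, 0, 3)` is `(0, 2)`, so an operator of type 3 may connect the positions $(3, 0)$ and $(0, 2)$, while from type 4 onward both positions are left untouched. Each position is affected by only finitely many operators, which is why the limit exists.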
Next we quote the following fundamental theorem on the homology of
fiber spaces, but without giving its proof.
7.13. Theorem (Leray-Serre): [Cf. Serre, "Homologie Singulière des Espaces
Fibrés", Ann. Math. 54 (1951).] Let $X \xrightarrow{p} B$ be a fiber space, with $B$
connected and simply connected, and connected fiber $F = p^{-1}(b)$. Put $Z^{k,l} = H_k(B, H_l(F))$.
Then $H_n(X)$ has a composition series with factors $\tilde Z^{k,l}$,
$k + l = n$, such that $\tilde Z = \mathcal{H}_\infty(Z)$. (A composition series for $G$ is a sequence
of subgroups $G_i$ of $G$ such that $G = G_0 \supseteq G_1 \supseteq \cdots \supseteq 0$, and the factors
are the groups $G_i/G_{i+1}$.)
Let us consider once more the fiber space $X \xrightarrow{p} B$ of curves beginning at
$b \in B$, with fiber $F = p^{-1}(b) = \Omega(B)$. As we said before, all the homology
groups $H_n(X)$ of $X$ are zero for $n > 0$. As in Theorem 7.13, put
$Z^{k,l} = H_k(B, H_l(\Omega(B)))$, where $Z^{k,0} = H_k(B)$. Suppose that $H_n(B)$ is the first
non-vanishing homology group of $B$ of positive dimension (see diagram below).
If $n > 2$, by Theorem 7.13, $Z^{0,1}$ must be zero, because this does not change
when homology with respect to an $r$-boundary operation, $r \ge 2$, is taken;
since the final result must be 0, all $\tilde Z^{k,l}$ being zero, $H_1(\Omega)$ itself must vanish.
This implies that all the groups in the column of $H_1(\Omega)$ are 0. Similarly, if $n > 3$,
all the groups in the column of $H_2(\Omega)$ are zero. Using these remarks, we
may prove the following theorem.

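[Diagram: the array of groups $Z^{k,l}$, with $H_0(\Omega), H_1(\Omega), H_2(\Omega), \dots, H_{n-1}(\Omega), H_n(\Omega)$ along one axis and the groups $H_k(B)$ along the other; the operator $d_2$ is indicated.]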

7.14. Theorem: If $B$ is connected and simply connected, the first
non-vanishing homology group of positive dimension, $H_n(B)$, is isomorphic to the first
non-vanishing homology group of positive dimension of $\Omega(B)$, which is
$H_{n-1}(\Omega(B))$.
Thus
$$H_n(B) \cong H_{n-1}(\Omega(B)).$$
Proof: Suppose, for example, that $n = 3$. After homology with respect to
the 2-boundary operation is taken, $Z^{3,0}$ remains the same, for $Z^{1,1}$ is zero
by the above remark. The same is true of $Z^{0,2}$. Taking homology with respect
to the 3-boundary operation may change both groups, but all the other
homologies leave invariant the groups in the places (3, 0) and (0, 2). But the
limit groups $\mathcal{H}_\infty(Z)^{3,0}$ and $\mathcal{H}_\infty(Z)^{0,2}$ must be zero, so the 3-boundary
homology gives us zero in both places. In other words, the sequence
$$0 \to H_3(B) \xrightarrow{\ d_3\ } H_2(\Omega(B)) \to 0$$
is exact, which proves the theorem.
Q.E.D.
7.15. Corollary (Hurewicz): If $B$ is connected and simply connected, the
first non-vanishing homology group of positive dimension, $H_n(B)$, is isomorphic
to the first non-vanishing homotopy group of positive dimension $\pi_n(B)$.
Proof:
$$H_n(B) = H_{n-1}(\Omega(B)) = \cdots = H_1(\Omega^{n-1}(B)) = \pi_1(\Omega^{n-1}(B)) = \pi_n(B).$$
Q.E.D.
Now assume that $B$ is a finite dimensional space. Consider homology
groups with real coefficients and let $D^{k,l} = \dim Z^{k,l} = b_k(B)\, b_l(\Omega(B))$, where
the $b_k$ are Betti numbers.
Suppose that $\Omega(B)$ has only finitely many non-vanishing Betti numbers;
let $b_r(\Omega(B))$ be distinct from zero, and $b_l(\Omega(B)) = 0$ for $l > r$. Similarly, let
$b_n(B) \ne 0$ and $b_k(B) = 0$ for $k > n$. Then $D^{n,r}$ is different from zero, and
remains fixed throughout the sequence of homologies of Lemma 7.12 and
Theorem 7.13 (same argument as before). But this is a contradiction, for
the final result gives the trivial homology of the path-space $X$ and hence
must be 0. Thus $\Omega(B)$ always has infinitely many non-vanishing homology
groups. Suppose next that one of the numbers, say, $b_s(\Omega(B))$, is infinite, and
that for $l < s$, $b_l(\Omega(B))$ is finite. Then the number at the node $(s, 0)$ of the
above diagram remains infinite throughout the sequence of homologies,
which is again a contradiction. We have thus proved
7.16. Theorem: If $B$ is connected, simply connected and finite dimensional,
$\Omega(B)$ has infinitely many non-vanishing real homology groups and all of them
have finite dimensions.
Q.E.D.
The space $\Omega(B)$ is an example of the more general concept of a "group-like
space".
7.17. Definition: Let $X$ be a topological space. Then $X$ is called a group-like
space if there is a binary operation defined on it, a distinguished element
called the identity, and a mapping $x \to x^{-1}$ such that all the properties defining
a group are satisfied up to homotopy (e.g. $x \cdot x^{-1} \simeq e$ = identity).
For group-like spaces we have the following theorem of Hopf, which we
quote from the cited paper of Serre but shall not prove.
7.18. Theorem: The cohomology ring with real coefficients of a group-like
space with finite dimensional homology groups is the direct product of a
polynomial algebra and an exterior algebra.
7.19. Corollary: Under the above hypotheses, if the group-like space $X$ has
infinitely many non-vanishing Betti numbers, the cup-length of $X$ equals $\infty$.
(See § 2 of Chapter 5 for the definition of cup-length.)
7.20. Corollary: If $B$ is connected, simply connected and finite dimensional,
then
cup-length $(\Omega(B)) = \infty$.
[Compare with Theorem 5.16.]
7.21. Corollary: Under the above hypotheses, any two points of a Riemannian
manifold $B$ are connected by indefinitely long geodesics.
Remark: It can be proved that if $B$ is compact, simple connectedness is not
necessary.
C. The Homotopy of Some Lie Groups


We first recall the definition and some properties of the unitary group.
For more details, see Milnor's book on Morse theory. The unitary group
$U(n)$ is the group of all $n \times n$ complex matrices preserving the inner product
in $C^n$, or equivalently, the group of all $n \times n$ complex matrices such that
$UU^* = I$, where $U^*$ is the conjugate transpose of $U$.
This is a Lie group, and the tangent space at the identity $I$ is the space of
matrices $\{iH\}$, where $H$ is hermitian, i.e., $H = H^*$. Analogously, the tangent
space at $U_0$ is the space of matrices $\{iU_0H\} = \{iHU_0\}$. The matrix
exponential function defined by
$$\exp A = I + A + \frac{A^2}{2!} + \frac{A^3}{3!} + \cdots$$
coincides with the exponential function defined on the Lie algebra $\{iH\}$ with
values in the Lie group $U(n)$. The scalar product
$$\langle A, B \rangle = \mathrm{Re}\ \mathrm{trace}\,(AB^*)$$
defines a Riemannian structure on $U(n)$.
The geodesics beginning at $I$ are the curves of the form $\sigma(t) = \exp(itH)$,
with $H$ hermitian. We say that $\sigma(t)$ has $H$ as initial velocity.
Our aim now is to determine at which points of the tangent space at the
origin, that is, at which hermitian matrices $H$, the exponential function has a
vanishing Jacobian. The image of these points under the exponential is the
set of conjugate points to the identity $I$.
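That $\exp(itH)$ stays in the unitary group and forms a one-parameter group can be checked numerically. The sketch below is an illustration under assumptions: it computes the exponential through the spectral decomposition of a Hermitian matrix with NumPy, and the random matrix `H` and the helper name are hypothetical.

```python
import numpy as np

def exp_i_hermitian(H, t=1.0):
    """exp(i t H) for Hermitian H, via the decomposition H = V diag(w) V*."""
    w, V = np.linalg.eigh(H)
    # Scale column j of V by exp(i t w_j), then rotate back.
    return (V * np.exp(1j * t * w)) @ V.conj().T

rng = np.random.default_rng(0)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = (A + A.conj().T) / 2            # a Hermitian matrix, i.e. H = H*

U = exp_i_hermitian(H)
unitarity_error = np.linalg.norm(U @ U.conj().T - np.eye(3))
group_error = np.linalg.norm(
    exp_i_hermitian(H, 0.3) @ exp_i_hermitian(H, 0.7) - exp_i_hermitian(H, 1.0)
)
```

Both errors are at the level of rounding, consistent with the description of the geodesics through $I$ as the curves $t \to \exp(itH)$.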
In general, let $f$ be any analytic function of a matrix. Then $f$ has a Cauchy
integral representation,
$$f(M) = \frac{1}{2\pi i} \oint \frac{f(z)}{z - M}\, dz.$$
If $\delta f(M, N)$ denotes the first variation of $f$ at the point $M$ applied to $N$, we
have:
$$(1)\qquad \delta f(M, N) = \frac{1}{2\pi i} \oint f(z)\, \delta[(z - M)^{-1}, N]\, dz.$$
On the other hand,
$$(2)\qquad \delta (z - M)^{-1} = (z - M)^{-1}\, \delta M\, (z - M)^{-1}.$$
Now consider the operators $\rho(A)$ and $\lambda(A)$ on matrices, defined as right and
left multiplication by $A$ respectively. Since the mapping $A \to \rho(A)$ is a
homomorphism from the group of non-singular matrices to the group of
non-singular linear operators on the linear space of all matrices, and similarly
for $\lambda$, we obtain:
$$\rho((z - A)^{-1}) = (z - \rho(A))^{-1}$$
and
$$\lambda((z - A)^{-1}) = (z - \lambda(A))^{-1}.$$
Hence from formula (2) we obtain
$$\delta((z - M)^{-1}) = (z - \rho(M))^{-1}\, (z - \lambda(M))^{-1}\, \delta M,$$
and therefore using (1) it follows that
$$(3)\qquad \delta f(M, N) = \left\{ \frac{1}{2\pi i} \oint f(z)\, (z - \rho(M))^{-1}\, (z - \lambda(M))^{-1}\, dz \right\}(N).$$
Set
$$\phi(\xi_1, \xi_2) = \frac{1}{2\pi i} \oint \frac{f(z)}{(z - \xi_1)(z - \xi_2)}\, dz.$$
Then
$$\delta f(M, N) = \phi(\rho(M), \lambda(M))\, (N).$$
Moreover,
$$\frac{1}{(z - \xi_1)(z - \xi_2)} = \frac{1}{\xi_1 - \xi_2} \left( \frac{1}{z - \xi_1} - \frac{1}{z - \xi_2} \right),$$
so
$$\phi(\xi_1, \xi_2) = \frac{f(\xi_1) - f(\xi_2)}{\xi_1 - \xi_2}.$$
We now return to the function $f(H) = \exp(iH)$. In this case, we have
$$\delta \exp(iH) = \phi(\rho(H), \lambda(H))\, \delta H,$$
where
$$\phi(\xi_1, \xi_2) = \frac{\exp(i\xi_1) - \exp(i\xi_2)}{\xi_1 - \xi_2} = \exp(i\xi_2)\, \frac{\exp(i(\xi_1 - \xi_2)) - 1}{\xi_1 - \xi_2}.$$
But
$$\rho(A) - \lambda(A) = \mathrm{Ad}\,(A).$$
So finally we get the formula
$$\delta \exp(iH) = \exp(iH)\, \psi(\mathrm{Ad}\,(H))\, \delta H,$$
where
$$\psi(z) = \frac{\exp(iz) - 1}{z}.$$
Furthermore, the eigenvalues of $\psi(\mathrm{Ad}\,(H))$ are equal to $\psi$(eigenvalues of
$\mathrm{Ad}\,(H)$). The zeros of $\psi$ are $z = 2\pi n$, $n = \pm 1, \pm 2, \dots$ Hence the
matrices $H$ which give rise to conjugate points to the identity are those whose
$\mathrm{Ad}\,(H)$ has an eigenvalue of the form $2\pi n$, $n = \pm 1, \pm 2, \dots$ Now, the
eigenvalues of $\mathrm{Ad}\,(H)$ are differences of the eigenvalues of $H$, hence the
matrices we are looking for are those having eigenvalues differing by $2\pi n$,
$n = \pm 1, \pm 2, \dots$
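The criterion just derived can be probed numerically: when two eigenvalues of $H$ differ by a nonzero multiple of $2\pi$, the derivative of $H \to \exp(iH)$ should vanish in some direction. The sketch below uses a central finite difference on $2 \times 2$ matrices; the matrices, the direction `N`, the step size, and the helper names are illustrative assumptions.

```python
import numpy as np

def exp_i_hermitian(H):
    """exp(iH) for Hermitian H via its spectral decomposition."""
    w, V = np.linalg.eigh(H)
    return (V * np.exp(1j * w)) @ V.conj().T

def directional_derivative(H, N, s=1e-4):
    """Central finite difference of s -> exp(i(H + sN)) at s = 0."""
    return (exp_i_hermitian(H + s * N) - exp_i_hermitian(H - s * N)) / (2 * s)

# An off-diagonal Hermitian direction, which mixes the two eigenspaces.
N = np.array([[0, 1], [1, 0]], dtype=complex)

H_conjugate = np.diag([np.pi, -np.pi]).astype(complex)   # eigenvalues differ by 2*pi
H_regular = np.diag([1.0, -1.0]).astype(complex)         # eigenvalues differ by 2

d_conjugate = np.linalg.norm(directional_derivative(H_conjugate, N))
d_regular = np.linalg.norm(directional_derivative(H_regular, N))
```

The first norm is zero up to discretization error, so `N` lies in the null-space of the differential at `H_conjugate`, while the second is of order one.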
All these calculations apply to the Special Unitary group $SU(n)$ also, but
in this case the tangent space at the identity, that is, the Lie algebra, is the
space of matrices $\{iH\}$ where $H$ is hermitian and has trace 0.
The following considerations apply to the group $U(2m)$ or the group
$SU(2m)$.
We choose an element $E$ near $-I$ having the form
$$E = \mathrm{diag}\,\big( \exp(i(\pi + \varepsilon_1)),\ \exp(-i(\pi + \varepsilon_1)),\ \exp(i(\pi + \varepsilon_2)),\ \exp(-i(\pi + \varepsilon_2)),\ \dots \big).$$
We wish to study the geodesics joining $I$ and $E$. Therefore we have to find the
solutions of $\exp(iH) = E$. But such an element $H$ commutes with $E$, and since $E$
is diagonal with distinct entries, $H$ too must be diagonal. Set
$$H = \mathrm{diag}\,(h_1, h_2, \dots, h_{2m}).$$
Then $\exp(ih_1) = \exp(i(\pi + \varepsilon_1))$; $\exp(ih_2) = \exp(i(\pi - \varepsilon_1))$; etc. Hence,
$$h_1 = \pi + \varepsilon_1 + 2\pi n_1 = \varepsilon_1 + (2n_1 + 1)\pi,$$
$$h_2 = \pi - \varepsilon_1 + 2\pi n_2 = -\varepsilon_1 + (2n_2 + 1)\pi, \quad \text{etc.}$$
This means that the $h_j$ are of the form
$$(2k + 1)\pi \pm \varepsilon.$$
The length of the geodesic with initial velocity $H$ is
$$L = \left| \frac{d}{dt} \exp(itH) \right| = (\mathrm{tr}\,(HH^*))^{1/2} = (\mathrm{tr}\,(H^2))^{1/2} = \Big\{ \sum_j ((2n_j + 1)\pi \pm \varepsilon_j)^2 \Big\}^{1/2}.$$
Choosing $\pm 1$ for the coefficients of $\pi$, we obtain $2^{2m}$ geodesics of minimal
length $\approx \pi\sqrt{2m}$.
The next shortest geodesics are obtained when all coefficients but one are
$\pm 1$, and one of them is $\pm 3$; then the length is $\approx \pi\sqrt{2m + 8}$.
Conjugate points to the identity appear along a geodesic when $t(h_i - h_j) = 2\pi n$
for some $t$, $0 < t < 1$, and for some $h_i \ne h_j$. Given $h_i$ and $h_j$ there are
$\left[ \dfrac{|h_i - h_j|}{2\pi} \right]$ ([ ] = integer part) conjugate points. The total number of
conjugate points along the geodesic is therefore
$$\sum_{h_i \ne h_j} \left[ \frac{|h_i - h_j|}{2\pi} \right] = 2 \sum_{h_i > h_j} \left[ \frac{h_i - h_j}{2\pi} \right].$$
But $h_j = \pi(2n_j + 1) \pm \varepsilon$. Hence the total number of conjugate points is
$\ge 2 \sum_{n_i > n_j} (n_i - n_j - 1)$ for $\varepsilon$ small enough.
Consider now the special case of the Special Unitary group SU (2m).
Then we have
APPLICATIONS 193

7.22. Lemma: Unless m of the nj's in the formula for the hj's are 0 and
the m others are -1, the geodesic with initial velocity H passes through at
least 2 (m + 1) conjugate points to the identity.
Proof: Since the trace of H is zero, $\sum h_j = \pi\,(\sum 2n_j + 2m) = 0$, so
$\sum n_j = -m$. Thus there are two possibilities: either $\sum_{n_j<0} n_j < -m$, or
$\sum_{n_j<0} n_j = -m$. In the first case, there is at least one positive $n_j$, call it $n_1$,
so that if N denotes the number of conjugate points
\[ N \ge 2 \sum_{n_j > n_i} (n_j - n_i - 1) \ge 2 \sum_{n_j<0} (n_1 - n_j - 1) \ge 2 \sum_{n_j<0} (-n_j) \ge 2(m + 1). \]
In the second case $\sum_{n_j<0} n_j = -m$ there are no positive $n_j$'s. If our hypothesis
is violated, some negative $n_j$ must be less than -1, so the number of $n_j$'s
equal to zero is $\ge m + 1$. Now,
\[ N \ge 2 \sum_{n_k = 0,\ n_j < -1} (n_k - n_j - 1) \ge 2\,(\text{number of } n\text{'s equal to } 0) \ge 2(m + 1). \]
Q.E.D.
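Lemma 7.22 can be tested by brute force for small m. The sketch below is an illustration, not part of the proof: it assumes the counting rule stated above (integer part of $|h_i - h_j|/2\pi$ summed over ordered pairs), uses assumed values for the small numbers eps, and searches only the finite window $|n_j| \le 3$; the function names are hypothetical.

```python
import math
from itertools import product

def conjugate_point_count(n, eps):
    """Sum of floor(|h_i - h_j| / (2*pi)) over ordered pairs i != j,
    where h_j = (2*n_j + 1)*pi + s_j and s = (+eps_1, -eps_1, +eps_2, ...)."""
    shifts = [s for e in eps for s in (e, -e)]
    h = [(2 * nj + 1) * math.pi + s for nj, s in zip(n, shifts)]
    total = 0
    for i in range(len(h)):
        for j in range(len(h)):
            if i != j:
                total += math.floor(abs(h[i] - h[j]) / (2 * math.pi))
    return total

def lemma_violations(m, eps, bound=3):
    """Return trace-zero choices of (n_1, ..., n_2m) with |n_j| <= bound that are
    not the exceptional case (m zeros, m entries -1) yet have fewer than
    2*(m + 1) conjugate points. The lemma predicts an empty list."""
    bad = []
    for n in product(range(-bound, bound + 1), repeat=2 * m):
        if sum(n) != -m:          # trace(H) = 0 forces sum n_j = -m
            continue
        exceptional = sorted(n) == [-1] * m + [0] * m
        if not exceptional and conjugate_point_count(n, eps) < 2 * (m + 1):
            bad.append(n)
    return bad
```

For example, lemma_violations(2, [0.01, 0.02]) scans all admissible 4-tuples in the window and reports any counterexample to the bound.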

7.23. Corollary: All the geodesics joining I and E of non-minimal length
contain at least 2(m + 1) conjugate points and therefore have index at least
2(m + 1).
7.24. Corollary: For the loop space $H_1(SU(2m), I, E)$ and the length
function J, the relative homotopy group
\[ \pi_j\bigl(H_1, \{J \le \pi\sqrt{2m} + \delta\}\bigr) = 0 \]
for all $j \le 2m + 1$ and $\delta$ small enough.
The proof is an application of Corollary 7.4.
We consider now the geodesics joining I and -I in SU(2m). If H is the
initial velocity of such a geodesic, H has eigenvalues $(2n_j + 1)\pi$, and the
length of the corresponding geodesic is
\[ L = \Bigl[ \sum_{j=1}^{2m} ((2n_j + 1)\pi)^2 \Bigr]^{1/2}. \]
We obtain the geodesics of minimal length when all the coefficients $2n_j + 1$
of $\pi$ are $\pm 1$. This length is $\pi\sqrt{2m}$, and the other geodesics have length
$\ge \pi\sqrt{2m + 8}$. For the geodesics of minimal length, the fact that trace(H)
= 0 implies that there are m eigenvalues equal to $+\pi$ and m eigenvalues
equal to $-\pi$. In this case, the matrix H is completely determined by giving
the subspace of eigenvectors of the eigenvalue $\pi$, the other subspace
corresponding to $-\pi$ being orthogonal to the first one. Therefore we have a
homeomorphism between the manifold of minimal geodesics joining I to -I in
SU(2m) and the Grassmann manifold $G_m(2m)$ of m-dimensional linear
subspaces of $C^{2m}$.
We shall now prove the following theorem due to Bott (compare to
Lemma 22.5 of Milnor's Morse Theory):
7.25. Theorem (Bott): Let M be a complete Hilbert manifold, f a smooth
function satisfying the P-S condition. Suppose that the set $\{f \le a\}$ has only
one critical level c, with critical set $K = f^{-1}(c)$, and assume that K is a finite
dimensional submanifold of M. Then
\[ \pi_k(\{f \le a\}, K) = 0 \quad \text{for all } k, \]
provided that f is bounded below.
For the proof we need the following lemma.
7.26. Lemma: Under the hypotheses above, if $\{x_n\}$ is a sequence such
that $f(x_n) \to c$, then there exists a subsequence $\{x_{n_j}\}$ such that $x_{n_j} \to x \in K$.
Proof: Take c to be 0. Consider the vector field $v = -\nabla f$, and let $\sigma(x, t)$
be the flow of v ($\sigma(x, 0) = x$; see Definition 4.44). For each n, let $t_n$ be the
first value of t for which
\[ \|\nabla f(\sigma(x_n, t))\| \le \frac{1}{n}. \]
We prove the existence of such a $t_n$ as follows.
If $g(t) = f(\sigma(t))$, $\sigma(t)$ being any solution of $\sigma'(t) = v(\sigma(t))$, then
\[ g'(t) = df_{\sigma(t)}(\sigma'(t)) = df_{\sigma(t)}(-\nabla f_{\sigma(t)}) = -\|\nabla f_{\sigma(t)}\|^2, \]
so g(t) is monotone decreasing. Since g is bounded below, its derivative cannot
be bounded away from zero for positive t. Thus $t_n$ exists. Set $y_n = \sigma(x_n, t_n)$.
Then we have
\[ f(y_n) \le f(x_n). \]
By the P-S condition, there is a subsequence $\{y_{n_j}\}$ which tends to a critical
point $y_\infty$. Since 0 is the only critical level, $f(y_{n_j}) \to 0$. Assume
without loss of generality that $n f(x_n) \to 0$. Then $n f(y_n) \to 0$ also. But
\[ d(x_{n_j}, y_{n_j}) = d(\sigma(x_{n_j}, 0), \sigma(x_{n_j}, t_{n_j})) \le \int_0^{t_{n_j}} \|\sigma'(x_{n_j}, t)\|\,dt = \int_0^{t_{n_j}} \|\nabla f(\sigma(x_{n_j}, t))\|\,dt. \]
In the interval $(0, t_{n_j})$, $\|\nabla f(\sigma(x_{n_j}, t))\| > 1/n_j$, whence the last expression is
less than
\[ (*)\qquad n_j \int_0^{t_{n_j}} \|\nabla f(\sigma(x_{n_j}, t))\|^2\,dt. \]
Now, $\frac{d}{dt} f(\sigma(t)) = -\|\nabla f(\sigma(t))\|^2$. So (*) equals
\[ -n_j \int_0^{t_{n_j}} \frac{d}{dt} f(\sigma(x_{n_j}, t))\,dt = n_j\,(f(x_{n_j}) - f(y_{n_j})), \]
which tends to zero.
Thus $\{x_{n_j}\}$ also tends to $y_\infty \in K$.
Q.E.D.
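The key estimate $d(x_n, y_n) \le n\,(f(x_n) - f(y_n))$ can be seen in a one-dimensional toy case. The sketch below assumes $f(x) = x^2/2$ on the real line (so that the flow of $-\nabla f$ is known in closed form) and the sample sequence $x_n = 2n^{-3/4}$, for which $n f(x_n) \to 0$; both choices, and all function names, are illustrative assumptions.

```python
import math

def flow(x, t):
    """Exact flow of v = -grad f for f(x) = x**2 / 2 on the real line."""
    return x * math.exp(-t)

def stopping_time(x, n):
    """First t >= 0 with |grad f(flow(x, t))| = |x| * exp(-t) <= 1/n."""
    return max(0.0, math.log(n * abs(x)))

def distance_and_bound(n):
    """Return d(x_n, y_n) and n * (f(x_n) - f(y_n)) for x_n = 2 * n**(-3/4)."""
    x = 2.0 * n ** -0.75
    y = flow(x, stopping_time(x, n))      # y_n = sigma(x_n, t_n)
    f = lambda z: 0.5 * z * z
    return abs(x - y), n * (f(x) - f(y))
```

Calling distance_and_bound(n) for increasing n shows the distance staying below the bound while the bound itself shrinks, which is the mechanism used in the proof.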

Proof of Theorem 7.25: By our hypotheses, K has a neighborhood N in M
homeomorphic to K x disc. [Cf. O. Hanner, "Some theorems on absolute
neighborhood retracts", Arkiv för Matematik, Vol. 1 (1950), pp. 389-408.]
Assume that S is a topological space and $\varphi : S \to \{f \le a\}$ is continuous.
Using Morse Theory it follows that for any $\varepsilon > 0$, $\varphi$ is homotopic to a
map $\varphi_\varepsilon$ such that $c - \varepsilon \le f(\varphi_\varepsilon(S)) \le c + \varepsilon$ [because $\{f \le a\}$ and $\{f \le c + \varepsilon\}$
have the same homotopy type]. Therefore there is a map $\varphi_1$ homotopic to $\varphi$ such that
$f(\varphi_1(S)) \le c + \varepsilon$ and $\varphi_1(S) \subseteq N$. If this were not the case, we would
construct a sequence $\{x_n\}$ such that $f(x_n) \to c$ and $x_n \notin N$ for all n,
contradicting Lemma 7.26. Since N is homeomorphic to K x disc it follows by
squeezing the disc that $\varphi_1$ is homotopic to a function $\varphi_2$ with values in K.
Q.E.D.
We return to our study of the groups U(2m) and SU(2m). By Corollary 7.24,
any map from a space X of dimension $\le 2m + 1$ into $H_1(SU(2m), I, E)$ can be
pushed down homotopically into a map of X with values in curves whose length
is $\le \pi\sqrt{2m} + \varepsilon$, for any $\varepsilon > 0$. The same result is true if we consider instead
the space of loops $H_1(SU(2m), I, -I)$; the points -I and E being joined by
a unique minimal geodesic one can prove that $H_1(SU(2m), I, E)$ and
$H_1(SU(2m), I, -I)$ have the same homotopy type.

7.27. Lemma: For $k < 2m + 1$, $\pi_k(\Omega(SU(2m), I, -I), K) = 0$, where K
is the manifold of minimal geodesics joining I and -I.

Proof: By the remark above, a map from the k-cube into $\Omega(SU(2m), I, -I)$
can be pushed down to a map into the space $\Omega_0$ of curves of length at most
$\pi\sqrt{2m} + \varepsilon$. But by Bott's theorem, the space $\Omega_0$ has vanishing homotopy
groups relative to K.
Q.E.D.
7.28. Corollary: For $k < 2m + 1$, $\pi_k(\Omega(SU(2m), I, -I)) \cong \pi_k(G_m(2m))$.

Proof: By Theorem 7.2, the sequence
\[ \cdots \to \pi_k(X, A) \to \pi_{k-1}(A) \to \pi_{k-1}(X) \to \pi_{k-1}(X, A) \to \cdots \]
is exact. Applied with $X = \Omega(SU(2m), I, -I)$ and $A = K$, the first and the last
written terms are zero by Lemma 7.27, whence the middle terms are isomorphic, i.e.
\[ \pi_k(\Omega(SU(2m), I, -I)) = \pi_k(K). \]
But, as noted preceding Theorem 7.25, K is homeomorphic to the Grassmann
manifold $G_m(2m)$.
Q.E.D.

7.29. Corollary (Bott's isomorphisms):
\[ \pi_{k+1}(SU(2m)) \cong \pi_k(G_m(2m)) \quad \text{for } k \le 2m. \]
We now proceed to obtain corresponding results for the group U(n). Let
$p : X \to B$ be a fiber space, X and B connected.
Let $F = p^{-1}(b)$ be the fiber. Then p induces an isomorphism $p_* : \pi_k(X, F)
\to \pi_k(B)$ [cf. Steenrod, The Topology of Fiber Bundles, or Hilton and Wylie,
Homology Theory, pp. 288-289]. Using the exact sequence for homotopy
we see that
\[ \cdots \to \pi_k(B) \to \pi_{k-1}(F) \to \pi_{k-1}(X) \to \pi_{k-1}(B) \to \cdots \]
is exact [this is the so-called exact bundle sequence].
If G is a connected Lie group, and H a subgroup of G, then G is a fiber bundle
with base space G/H (the factor space of G by H). The projection p is the
natural one. [This uses the existence of local cross sections of G over G/H.
See Steenrod's book.] Consider now the inclusion $SU(n) \subset U(n)$. The factor
space U(n)/SU(n) is the unit circle C. Then:
\[ \pi_k(U(n)) = \pi_k(SU(n)), \quad k \ge 2. \]

Next, consider the inclusion $U(n) \subset U(n + 1)$. The factor space $U(n + 1)/U(n)$
is the sphere $S^{2n+1}$ [cf. Steenrod's book]. Hence,
\[ \pi_k(S^{2n+1}) \to \pi_{k-1}(U(n)) \to \pi_{k-1}(U(n + 1)) \to \pi_{k-1}(S^{2n+1}) \]
is exact. Since $\pi_j(S^{2n+1}) = 0$ for $j \le 2n$, we get for $k < 2n$:
\[ \pi_k(U(n)) = \pi_k(U(n + 1)) \]
(stable homotopy groups). It is natural to write $\pi_k(U(\infty)) = \lim_{n \to \infty} \pi_k(U(n))$.
The space U(m) x U(m) is included in U(2m), since the matrix
\[ \begin{pmatrix} A(m) & 0 \\ 0 & A'(m) \end{pmatrix} \]
is in U(2m) for any $A(m), A'(m) \in U(m)$.


It is easy to verify that the factor space U(2m)/U(m) x U(m) is homeomorphic
to the Grassmann manifold $G_m(2m)$. Therefore
\[ \pi_k(G_m(2m)) \to \pi_{k-1}(U(m) \times U(m)) \to \pi_{k-1}(U(2m)) \to \pi_{k-1}(G_m(2m)) \]
is exact. This geometric fact justifies the following
Remark: Given a Lie group G and subgroups $H_1 \subset H_2$, $G/H_1$ has a
bundle structure over $G/H_2$, with natural projection and fiber $H_2/H_1$.
[Same proof as in the somewhat less general case considered before.] If we
then consider the fiber space
\[ U(2m)/U(m) \to U(2m)/U(m) \times U(m), \]
and use the exact bundle sequence we find that
\[ \pi_k(U(2m)/U(m)) \to \pi_k(G_m(2m)) \to \pi_{k-1}(U(m)) \to \pi_{k-1}(U(2m)/U(m)) \]
is exact.
7.30. Theorem (Bott's Periodicity Theorem):
\[ \pi_{k-1}(U(\infty)) \cong \pi_{k+1}(U(\infty)) \quad \text{for } k \ge 1. \]
Proof: First we prove that in the exact sequence noted above the
first and the last groups are zero provided m is large enough. Indeed, we
have seen already that the space U(2m)/U(2m - 1) is the sphere $S^{4m-1}$. Thus
\[ \pi_k(U(2m)/U(2m - 1)) = 0 \quad \text{for m large.} \]
Similarly
\[ \pi_k(U(2m - 1)/U(2m - 2)) = 0 \quad \text{for m large, etc.,} \]
and we get
\[ \pi_k(U(2m)/U(m)) = 0 \quad \text{for m large,} \]
by the above remark.

Hence, by the exactness of the above sequence of homotopy groups, we get
\[ \pi_k(G_m(2m)) \cong \pi_{k-1}(U(m)) \quad \text{for m large.} \]
Now,
\[ \pi_{k-1}(U(m)) \cong \pi_{k-1}(U(2m)) \quad \text{by stability, for m large.} \]
Therefore
\[ \pi_k(G_m(2m)) \cong \pi_{k-1}(U(2m)) \quad \text{for m large.} \]
Using Corollary 7.29 (Bott's isomorphisms), we may assert that
\[ \pi_{k+1}(SU(2m)) \cong \pi_k(G_m(2m)) \quad \text{for m large.} \]
But we have seen that
\[ \pi_k(U(n)) \cong \pi_k(SU(n)) \quad \text{for } k \ge 2. \]
Thus
\[ \pi_{k+1}(U(2m)) \cong \pi_k(G_m(2m)) \quad \text{for m large,} \]
and for $k \ge 1$. Hence, finally,
\[ \pi_{k+1}(U(2m)) \cong \pi_{k-1}(U(2m)) \quad \text{if m is large.} \]
Q.E.D.

7.31. Corollary: The homotopy groups $\pi_k(U(\infty))$ are zero if k is even, and
isomorphic to Z if k is odd.

Proof: Observe that
\[ \pi_2(U(\infty)) = \pi_2(SU(2)) = 0, \]
\[ \pi_3(U(\infty)) = \pi_3(SU(2)) = Z, \]
as SU(2) is nothing but $S^3$. The remaining groups follow from Theorem 7.30.
Q.E.D.
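The way Corollary 7.31 follows from the two base cases and the period-2 isomorphism can be written out mechanically. The following is a small sketch; the function name and the string encoding of the groups ("0" and "Z") are hypothetical conveniences.

```python
def stable_unitary_homotopy(k):
    """Return pi_k(U(infinity)) as "0" or "Z", using only the base cases
    pi_2 = 0, pi_3 = Z and the periodicity pi_{k-1} = pi_{k+1} for k >= 1."""
    if k < 0:
        raise ValueError("k must be non-negative")
    base = {2: "0", 3: "Z"}
    while k > 3:      # step down by the period 2
        k -= 2
    while k < 2:      # step up by the period 2 (covers k = 0 and k = 1)
        k += 2
    return base[k]
```

Tabulating stable_unitary_homotopy(k) for k = 0, 1, 2, ... reproduces the alternating pattern stated in the corollary.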
CHAPTER VIII

Closed Geodesics on Compact Riemannian Manifolds

(Chapter by Hermann Karcher)

In this chapter closed geodesics on a Riemannian manifold M will be
studied using infinite dimensional Morse Theory in much the same way as
in Chapter IV where geodesics joining two fixed points were treated. We
shall study a Hilbert manifold $H_1(S^1, M)$ of closed, sufficiently regular
curves ($H_1$-curves) on M. The coordinate spaces for $H_1(S^1, M)$ are (as in
Chapter IV) Hilbert spaces whose elements are $H_1$-vector fields along curves
on M. In Chapter IV one defined scalar products for the coordinate spaces
(following Palais) with the aid of Nash's embedding theorem. In this chapter
(cf. Theorem 8.6) we use instead Klingenberg's intrinsic scalar product (first
introduced in a lecture given in Bonn) which depends only on the Riemannian
structure of M. In Theorem 8.9 we prove that this scalar product and various
other possible scalar products on the Hilbert spaces of $H_1$-vector fields lead
to equivalent norms. The differentiable structure of $H_1(S^1, M)$ and useful
coordinate systems are discussed in 8.11 to 8.18. Theorem 8.19 states the
differentiability of the energy function and in 8.20 we introduce a Riemannian
metric for $H_1(S^1, M)$ based on Klingenberg's intrinsic scalar products.
These developments are somewhat more complicated than the corresponding
ones in Chapter IV since it does not seem possible to obtain the intrinsic
Riemannian metric of $H_1(S^1, M)$ via an embedding $M \subset R^N$. In 8.22 to
8.26 we carry out an auxiliary discussion of differentiable curves on $H_1(S^1, M)$
and their representation on M. In the second half of the present chapter our
use of the intrinsic scalar product allows simpler proofs than in Chapter IV;
it also seems that notions such as the gradient vector field of J on $H_1(S^1, M)$
can be more readily interpreted in terms of M. We prove a few geometric
results concerning the Riemannian structure of $H_1(S^1, M)$ in 8.27 to 8.33.
Lemma 8.34 contains basic estimates which we need to derive the explicit
formula for the first derivative of the energy J in 8.35 and to prove the validity
of the Palais-Smale condition for J in 8.41. In 8.36 to 8.39 we introduce the
gradient of J as a vector field on $H_1(S^1, M)$ and identify the critical points
of J as the closed geodesics on M. In 8.41 to 8.50 standard arguments from
infinite dimensional Morse Theory are used to show the existence of at least
one nontrivial closed geodesic on every compact $C^6$-Riemannian manifold.
In 8.48 we prove that those flow lines of the gradient deformation (cf. 8.43)
which start at points f with sufficiently small energy $J(f) < \varepsilon$ have uniformly
bounded length. As an immediate consequence we obtain the important
result 8.47 that $J^{-1}(0)$ is a deformation retract of $J^{-1}([0, \varepsilon])$.
We conclude with a summary of recent developments.
Preliminaries. M will be a compact Riemannian manifold of class $C^k$
($k \ge 6$). (Metric completeness rather than compactness is sufficient for most
of our general developments but not for the desired application to closed
geodesics.) $M_p$ denotes the tangent space to M at p, TM the tangent bundle
of M. The scalar product in $M_p$ is denoted by $g(p)(v, w)$ or more briefly by
$(v, w)$; in local coordinates on M the metric is written $g_{ik}(p)\,v^i w^k$. The distance
on M induced by this infinitesimal metric (cf. Chapter VI) is called $d_M(p, q)$.
Absolutely continuous curves (resp. vector fields) with locally square
integrable derivatives will be called $H_1$-curves (resp. $H_1$-vector fields). For
$H_1$-curves we may define an energy integral J and a length $L_M$ as follows.
\[ J(f) = \tfrac{1}{2} \int (\dot f(t), \dot f(t))\,dt, \qquad L_M(f) = \int (\dot f(t), \dot f(t))^{1/2}\,dt. \]
We shall be interested in closed $H_1$-curves parametrized by the interval [0, 1]
(not necessarily proportionally to arc length). Hence we find it useful to
define the following space:
\[ H_1(S^1, M) = \{ f \mid f : [0, 1]/\{0, 1\} \to M \text{ and } J(f) < \infty \}. \]
(We always identify the circle $S^1$ with the factor space [0, 1]/{0, 1}.)
The covariant derivative of a vector field v(t) along f(t) will be written
Dv/dt; this derivative is given in local coordinates on M by the formula
\[ \frac{Dv^i}{dt} = \frac{dv^i}{dt} + \Gamma^i_{jk}(f(t))\,\dot f^j(t)\,v^k(t) \]
(where $\Gamma^i_{jk} \in C^{k-1}$ are the Christoffel symbols). Differential equations with
square-integrable coefficients can be treated by the Picard-Lindelöf iteration
scheme. Using this fact and the last formula we see that Levi-Civita parallel
transport is well defined along any curve $f \in H_1(S^1, M)$. In particular a
parallel vector field along such a curve is absolutely continuous, and for any
continuous vector field along f the following statement holds: dv/dt is locally
square integrable if and only if Dv/dt is locally square integrable.
The exponential map available on the manifold M may be described as
follows. For $0 \ne v \in M_p$, let $c : [0, \infty) \to M$ be the geodesic ray starting at p
with tangent vector v, such that its parameter t is proportional to arc length s
and $ds/dt = |v|_{M_p}$. Define exp(v) = c(1). Then exp: $TM \to M$. We denote
the restriction $\exp|M_p = \exp_p$. If convenient, we write $\exp_p = \exp$. It follows
immediately from the differential equation for geodesics (i.e., from
$\frac{D}{\partial t}\dot c = 0$) that $\exp_p(t \cdot v) = c(t)$, in other words that $\exp_p$ is radially
isometric. Since $\Gamma^i_{jk} \in C^{k-2}$, it follows that exp: $TM \to M$ is a $C^{k-2}$ map, and
the differential equation for geodesics also shows that the linear map
induced by $\exp_p$ at the origin of $M_p$ is the identity map.
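As a concrete illustration of radial isometry, one may take M to be the unit sphere $S^2 \subset R^3$, where the exponential map has the well-known closed form $\exp_p(v) = \cos|v|\,p + \sin|v|\,v/|v|$. The sketch below is an example under that assumption (the sphere is not singled out by the text at this point), with hypothetical helper names.

```python
import math

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def sphere_exp(p, v):
    """exp_p(v) on the unit sphere: follow the great circle through p
    with initial velocity v (v must be orthogonal to p) for time 1."""
    r = norm(v)
    if r == 0.0:
        return tuple(p)
    return tuple(math.cos(r) * a + math.sin(r) * b / r for a, b in zip(p, v))

def sphere_dist(p, q):
    """Great-circle distance d_M(p, q) on the unit sphere."""
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))

p = (0.0, 0.0, 1.0)
v = (0.3, 0.4, 0.0)   # tangent vector at p with |v| = 0.5
q = sphere_exp(p, v)
```

Here sphere_dist(p, q) agrees with |v| up to rounding, and the same holds for $t \cdot v$ with $0 \le t \le 1$, which is the radial isometry $d_M(p, \exp_p(tv)) = t|v|$ for short vectors.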
Geodesic parallel coordinates on M are easily defined in terms of the
exponential map exp. Given a geodesic arc $c : [0, T] \to M$ (arc length t as
parameter), choose an orthonormal n-frame $F_0 = \{\dot c(0), v_2(0), \ldots, v_n(0)\}$ in $M_{c(0)}$
and define n-frames $F_t$ in $M_{c(t)}$ by parallel transport of $F_0$ along c. The
map $[0, T] \times R^{n-1} \to M$ given by
\[ (t, u^2, \ldots, u^n) \mapsto \exp_{c(t)}\Bigl( \sum_{i=2}^{n} u^i \cdot v_i(t) \Bigr) \]
is $C^{k-2}$ and the induced linear map at (t, 0, ..., 0) is the identity. Therefore a
neighborhood of $[0, T] \times \{0\}$ is mapped $C^{k-2}$ diffeomorphically onto a
tubular neighborhood of c in M. The inverse map gives the desired
coordinates. We have $g_{ik}(c(t)) = \delta_{ik}$ and $\Gamma^i_{jk}(c(t)) = 0$. Since M is compact these
remarks prove the following lemma.

8.1. Lemma: There exists $\varepsilon_p > 0$ such that the geodesic parallel
coordinates just defined are valid in the $\varepsilon_p$-tubular neighborhood of any geodesic
arc which is sufficiently short so that its $\varepsilon_p$-tube does not cover any point
twice. Moreover there exist constants C > 0 and $0 < m_1 \le 1 \le m_2 < \infty$
such that for parallel coordinates in any $\varepsilon_p$-tube we have
\[ (1)\qquad |\Gamma^i_{jk}| \le C \quad\text{and}\quad m_1 \sum_{i=1}^{n} (v^i)^2 \le g_{ik} v^i v^k \le m_2 \sum_{i=1}^{n} (v^i)^2. \]
A consequence of the second inequality in (1) is given in
8.2. Lemma: If $(\ ,\ )$ and $((\ ,\ ))$ are two scalar products such that
$m_1((v, v)) \le (v, v) \le m_2((v, v))$ for all v and if $m_1 \le 1 \le m_2$ then
\[ |(v, w) - ((v, w))| \le 16\,(m_2 - m_1)\,((v, v))^{1/2}\,((w, w))^{1/2}. \]
Proof: Using $(v, w) = \tfrac{1}{4}\bigl((v + w, v + w) - (v - w, v - w)\bigr)$ one first gets
\[ |(v, w) - ((v, w))| \le 8\,(m_2 - m_1)\,\bigl(((v, v)) + ((w, w))\bigr). \]
Now $(v, w) - ((v, w)) = b(v, w)$ is bilinear and
\[ |b(v, w)| \le c\,(|v|^2 + |w|^2) \]
implies
\[ |b(v, w)| \le 2c\,|v|\,|w|, \]
since for $\lambda \ne 0$ we have
\[ |b(v, w)| = |b(\lambda v, \lambda^{-1} w)| \le c\,(\lambda^2 |v|^2 + \lambda^{-2} |w|^2), \]
and for $v, w \ne 0$ the right side equals $2c\,|v|\,|w|$ when $\lambda^2 = |w|/|v|$.
Q.E.D.
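Lemma 8.2 is easy to probe numerically in a diagonal model. The sketch below assumes $((v, w)) = \sum v_i w_i$ and $(v, w) = \sum d_i v_i w_i$ on $R^n$ with weights $d_i$; the weights, the random sampling and the function name are illustrative assumptions. It returns the largest observed ratio of the left side to $(m_2 - m_1)\,((v,v))^{1/2}((w,w))^{1/2}$, which the lemma bounds by 16.

```python
import math
import random

def worst_ratio(weights, trials=2000, seed=0):
    """Largest observed |(v,w) - ((v,w))| / ((m2 - m1) |v| |w|) over random v, w,
    for (v,w) = sum d_i v_i w_i and ((v,w)) = sum v_i w_i."""
    rng = random.Random(seed)
    m1 = min(min(weights), 1.0)   # enforce m1 <= 1 <= m2 as in the lemma
    m2 = max(max(weights), 1.0)
    worst = 0.0
    for _ in range(trials):
        v = [rng.uniform(-1, 1) for _ in weights]
        w = [rng.uniform(-1, 1) for _ in weights]
        diff = abs(sum((d - 1.0) * a * b for d, a, b in zip(weights, v, w)))
        nv = math.sqrt(sum(a * a for a in v))
        nw = math.sqrt(sum(b * b for b in w))
        worst = max(worst, diff / ((m2 - m1) * nv * nw))
    return worst
```

For instance worst_ratio([0.5, 1.0, 2.0, 3.0]) stays well below 16; in this diagonal model the ratio in fact never exceeds 1, so the constant in the lemma is far from sharp here.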
The following result is well known for $H_1$-curves in $R^N$.

8.3. Lemma: Every subset of $H_1(S^1, M)$ on which the energy integral is
bounded consists of equicontinuous curves on M.
Proof:
\[ d_M(f(t), f(t_0)) \le L_M\bigl(f|_{[t_0, t]}\bigr) = \int_{t_0}^{t} (\dot f(\tau), \dot f(\tau))^{1/2}\,d\tau \]
\[ \le \Bigl( |t - t_0| \int_{t_0}^{t} (\dot f(\tau), \dot f(\tau))\,d\tau \Bigr)^{1/2} \quad \text{(by the Cauchy-Schwarz inequality)} \]
\[ \le \sqrt{|t - t_0| \cdot 2J(f)}. \]

8.4. Corollary: $L_M(f)^2 \le 2J(f)$, and $L_M(f)^2 = 2J(f)$ if and only if f is
parametrized proportional to arc length, i.e., $(\dot f(\tau), \dot f(\tau)) = \text{const}$.
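The relation between length and energy in Corollary 8.4 can be observed on a discretized curve. The sketch below assumes the flat plane $R^2$ in place of M and approximates both integrals by chord sums over [0, 1]; the two sample parametrizations of the unit circle and the function name are illustrative choices.

```python
import math

def length_and_energy(curve, steps=2000):
    """Chord-sum approximations of L = int |f'| dt and J = (1/2) int |f'|^2 dt
    for a curve f: [0, 1] -> R^2."""
    dt = 1.0 / steps
    length = 0.0
    energy = 0.0
    prev = curve(0.0)
    for i in range(1, steps + 1):
        cur = curve(i * dt)
        d = math.dist(prev, cur)          # chord length on this subinterval
        length += d
        energy += 0.5 * d * d / dt        # (1/2) * (d/dt)^2 * dt
        prev = cur
    return length, energy

def uniform(t):       # constant speed around the unit circle
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t))

def nonuniform(t):    # same circle, speed proportional to t
    return (math.cos(2 * math.pi * t * t), math.sin(2 * math.pi * t * t))
```

For the constant-speed parametrization the two sides of $L^2 \le 2J$ coincide up to rounding, while for the non-uniform one $2J$ is strictly larger, in line with the equality case of the corollary.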
We next wish to define a differentiable structure on $H_1(S^1, M)$ with the
aid of coordinate spaces which have a geometric interpretation on M.

8.5. Definition: For any $f \in H_1(S^1, M)$ consider the set of $H_1$-vector
fields along f:
\[ (2)\qquad H_1(S^1, TM_f) = \{ v \mid v : [0, 1]/\{0, 1\} \to TM \]
such that $v(t) \in M_{f(t)}$ and v is absolutely continuous and has locally square
integrable derivatives}.
8.6. Theorem: $H_1(S^1, TM_f)$ is a separable Hilbert space with the scalar
product $\langle v, w \rangle$ defined below.
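Proof: $H_1(S^1, TM_f)$ is obviously a vector space. For $v, w \in H_1(S^1, TM_f)$
define
\[ (3)\qquad \langle v, w \rangle = \int_0^1 \Bigl\{ (v(t), w(t)) + \Bigl( \frac{Dv}{dt}, \frac{Dw}{dt} \Bigr) \Bigr\}\,dt. \]
(Of course, D/dt denotes the covariant derivative along f.) It is clear from (2)
of Definition 8.5 that we have $\langle v, v \rangle < \infty$, $\langle w, w \rangle < \infty$. Moreover using
$|(v(t), w(t))| \le |v(t)|\,|w(t)|$ (in $M_{f(t)}$) and the Cauchy-Schwarz inequality, it
follows that
\[ \langle v, w \rangle^2 \le \Bigl( \int_0^1 \Bigl\{ |v(t)| \cdot |w(t)| + \Bigl| \frac{Dv}{dt} \Bigr| \cdot \Bigl| \frac{Dw}{dt} \Bigr| \Bigr\}\,dt \Bigr)^2 \]
\[ \le \int_0^1 \Bigl\{ |v(t)|^2 + \Bigl| \frac{Dv}{dt} \Bigr|^2 \Bigr\}\,dt \cdot \int_0^1 \Bigl\{ |w(t)|^2 + \Bigl| \frac{Dw}{dt} \Bigr|^2 \Bigr\}\,dt = \langle v, v \rangle \cdot \langle w, w \rangle. \]
Hence $\langle v, w \rangle$ is defined for all $v, w \in H_1(S^1, TM_f)$. Clearly $\langle v, w \rangle$ is
bilinear and positive definite, and therefore a scalar product.
For the proof of completeness of the space $H_1(S^1, TM_f)$ we need the
following definition and lemma.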

8.7. Definition: For $v \in H_1(S^1, TM_f)$ put
\[ \|v\|_\infty = \max_{t \in [0, 1]} (v(t), v(t))^{1/2}. \]

8.8. Lemma: $\|v\|_\infty^2 \le 2 \langle v, v \rangle$.

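Proof: The formula
\[ (4)\qquad \frac{d}{dt} \bigl( g(f(t))(v(t), w(t)) \bigr) = \Bigl( \frac{Dv}{dt}, w \Bigr) + \Bigl( v, \frac{Dw}{dt} \Bigr) \]
is well known for $f, v, w \in C^1$ and generalizes by an easy limit argument
to all $f \in H_1(S^1, M)$ and $v, w \in H_1(S^1, TM_f)$. Now choose $t_m$ such that
$\|v\|_\infty^2 = (v(t_m), v(t_m))$ and note that
\[ \|v\|_\infty^2 = (v(t), v(t)) + \int_t^{t_m} \frac{d}{d\tau} (v(\tau), v(\tau))\,d\tau \le (v(t), v(t)) + 2 \int_0^1 |v(\tau)| \cdot \Bigl| \frac{Dv}{d\tau} \Bigr|\,d\tau. \]
Since the left side of this formula is independent of t and since $2ab \le a^2 + b^2$
we get, integrating over t,
\[ \|v\|_\infty^2 \le \int_0^1 (v(t), v(t))\,dt + \int_0^1 \Bigl\{ (v(\tau), v(\tau)) + \Bigl( \frac{Dv}{d\tau}, \frac{Dv}{d\tau} \Bigr) \Bigr\}\,d\tau \le 2 \langle v, v \rangle. \]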
Q.E.D.
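Lemma 8.8 can be illustrated in the simplest flat situation, where a vector field along the curve is just a periodic real function on [0, 1] and D/dt is the ordinary derivative. This flat model, the midpoint-rule quadrature and the sample functions below are assumptions made for illustration only.

```python
import math

def sup_sq_and_h1(v, dv, steps=4000):
    """Return (max v(t)^2, int_0^1 (v^2 + v'^2) dt) sampled at midpoints,
    for a periodic function v on [0, 1] with derivative dv."""
    dt = 1.0 / steps
    sup_sq = 0.0
    h1 = 0.0
    for i in range(steps):
        t = (i + 0.5) * dt
        sup_sq = max(sup_sq, v(t) ** 2)
        h1 += (v(t) ** 2 + dv(t) ** 2) * dt
    return sup_sq, h1

def wave(t):
    return 1.0 + 0.5 * math.cos(2 * math.pi * t)

def wave_dot(t):
    return -math.pi * math.sin(2 * math.pi * t)
```

For wave the sup-norm squared is about 2.25 against $2\langle v, v\rangle$ of roughly 12, while the constant field v = 1 gives 1 against 2, so in this model the factor 2 cannot be replaced by anything below 1.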
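We may now complete the proof of Theorem 8.6 in regard to completeness
and separability.
Since f is absolutely continuous we can subdivide f into finitely many
subarcs $f_\nu$ such that each of them is representable in some geodesic parallel
coordinates by functions $f_\nu^i(t)$ ($i = 1, \ldots, n = \dim M$ and $t \in I_\nu$) with
$|f_\nu^i(t)| < \varepsilon_p$ (cf. Lemma 8.1) and $\bigcup I_\nu = [0, 1]$, $\sum |I_\nu| = 1$. If and only if
$f \in H_1(S^1, M)$ will all the $f_\nu^i$ be $H_1$ functions. Since $J(f) < \infty$ we can also
assume that these subarcs are so short that
\[ \int_{I_\nu} (\dot f(t), \dot f(t))\,dt < \frac{m_1}{8 C^2 m_2 n^3}, \]
where C, $m_1$, $m_2$ are the constants of 8.1 (1), and n is the dimension of M.
We now consider the restriction of any $v \in H_1(S^1, TM_f)$ to $I_\nu$, i.e. to a
vector field along $f_\nu$. The coordinates of v will be called $v^i$. Then by 8.1
\[ \langle v, v \rangle_{I_\nu} = \int_{I_\nu} \Bigl\{ (v(t), v(t)) + \Bigl( \frac{Dv}{dt}, \frac{Dv}{dt} \Bigr) \Bigr\}\,dt \ge m_1 \int_{I_\nu} \sum_i \Bigl( (v^i(t))^2 + \bigl( \dot v^i(t) + \Gamma^i_{jk}(f_\nu(t))\,\dot f^j(t)\,v^k(t) \bigr)^2 \Bigr)\,dt. \]
We use $(A + B)^2 \ge \tfrac{1}{2} A^2 - B^2$ to obtain
\[ \langle v, v \rangle_{I_\nu} \ge m_1 \int_{I_\nu} \sum_i \Bigl( (v^i)^2 + \tfrac{1}{2} (\dot v^i)^2 - (\Gamma^i_{jk} \dot f^j v^k)^2 \Bigr)\,dt \]
\[ \ge m_1 \int_{I_\nu} \Bigl\{ \sum_i \Bigl( (v^i)^2 + \tfrac{1}{2} (\dot v^i)^2 \Bigr) - C^2 n \Bigl( \sum_{j,k} |\dot f^j v^k| \Bigr)^2 \Bigr\}\,dt. \]
Note that by 8.1 (1) and 8.8
\[ \Bigl( \sum_k |v^k| \Bigr)^2 \le n \sum_k (v^k)^2 \le \frac{n}{m_1} (v(t), v(t)) \le \frac{2n}{m_1} \langle v, v \rangle \]
and note also that $\bigl( \sum_j |\dot f^j| \bigr)^2 \le \frac{n}{m_1} (\dot f(t), \dot f(t))$, so that
\[ \langle v, v \rangle_{I_\nu} \ge \frac{m_1}{2} \int_{I_\nu} \sum_i \bigl( (v^i)^2 + (\dot v^i)^2 \bigr)\,dt - \frac{2 C^2 n^3}{m_1} \langle v, v \rangle_{I_\nu} \int_{I_\nu} (\dot f(t), \dot f(t))\,dt. \]
By the choice of the $I_\nu$, we have
\[ 1 + \frac{2 C^2 n^3}{m_1} \int_{I_\nu} (\dot f(t), \dot f(t))\,dt < 1 + \frac{1}{4}, \]
which implies
\[ \langle v, v \rangle_{I_\nu} \ge \frac{m_1}{3} \int_{I_\nu} \sum_i \bigl( (v^i(t))^2 + (\dot v^i(t))^2 \bigr)\,dt. \]
Similarly, we may deduce
\[ \langle v, v \rangle_{I_\nu} \le 2 m_2 \int_{I_\nu} \sum_i \bigl( (v^i)^2 + (\dot v^i)^2 \bigr)\,dt + \frac{4 C^2 n^3 m_2}{m_1} \langle v, v \rangle_{I_\nu} \int_{I_\nu} (\dot f(t), \dot f(t))\,dt, \]
so that we obtain
\[ \langle v, v \rangle_{I_\nu} \le 4 m_2 \int_{I_\nu} \sum_i \bigl( (v^i(t))^2 + (\dot v^i(t))^2 \bigr)\,dt. \]
Therefore $\langle v, v \rangle$ and the norms $\sum_\nu \int_{I_\nu} \sum_i \bigl( (v^i)^2 + (\dot v^i)^2 \bigr)\,dt$ are equivalent. But
the completeness and separability of $H_1$ in this latter norm is well known.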
Q.E.D.
Definition 8.5 of the vector space $H_1(S^1, TM_f)$ involves only the
differentiable structure of M. In Theorem 8.6 we prove completeness with
respect to a scalar product which is defined intrinsically in terms of a
Riemannian metric on M (justifying the name intrinsic scalar product). It will
be helpful in what follows to know that various possible scalar products on
$H_1(S^1, TM_f)$ are in fact equivalent.

8.9. Theorem: (i) Let $g^{(1)}$ and $g^{(2)}$ be Riemannian metrics on M. The
corresponding scalar products on the various spaces $H_1(S^1, TM_f)$ as defined
in 8.6 (3) are then equivalent, i.e. $c_1 \|v\|^{(1)} \le \|v\|^{(2)} \le c_2 \|v\|^{(1)}$ for each
$v \in H_1(S^1, TM_f)$ with constants depending only on the energy J(f) of the
curve f.
(ii) Let $M \subset R^N$ be a differentiable submanifold, so that $H_1(S^1, M)$ may
be regarded as a closed submanifold of the Hilbert space $H_1(S^1, R^N)$
(Theorem 6.11), and so that the usual $H_1$-norm for $H_1(S^1, R^N)$ induces a scalar
product on each tangent space $H_1(S^1, TM_f)$ of $H_1(S^1, M)$. Then any two
embeddings induce scalar products in $H_1(S^1, TM_f)$ which define norms for
these spaces which are uniformly equivalent on any subset of $H_1(S^1, M)$ of
bounded energy J(f).
(iii) Any two scalar products in $H_1(S^1, TM_f)$ of the sort described in (i)
and (ii) above are uniformly equivalent on any subset of $H_1(S^1, M)$ of
bounded energy.
Proof: We first prove (i). By compactness of M there exist constants
$0 < m_1 \le m_2 < \infty$ such that
\[ m_1\,g^{(1)}_{ik} v^i v^k \le g^{(2)}_{ik} v^i v^k \le m_2\,g^{(1)}_{ik} v^i v^k. \]
This implies $m_1 J^{(1)}(f) \le J^{(2)}(f) \le m_2 J^{(1)}(f)$. This observation allows us
to repeat the equivalence-of-norms proof given above as the second part of
the proof of Theorem 8.6 with only minor changes. The number of
subintervals $I_\nu$ which are needed in that proof can trivially be estimated in terms of a
bound for J(f); the other constants appearing in the proof of Theorem 8.6
are independent of f. All additional details are left to the reader.
We now prove (iii). Given an embedding $M \subset R^N$, consider the norm $\|\ \|^*$
on $H_1(S^1, TM_f)$ which is induced by the corresponding embedding $H_1(S^1, M)
\subset H_1(S^1, R^N)$ as described in (ii). Note also that the Riemannian metric
on M induced by the embedding $M \subset R^N$ defines an intrinsic norm $\|\ \|'$ on
$H_1(S^1, TM_f)$. We claim that $\|\ \|'$ and $\|\ \|^*$ are equivalent, uniformly for
any subset of $H_1(S^1, M)$ of bounded energy. Indeed, for $v \in H_1(S^1, TM_f)$
we have
\[ (\|v\|')^2 = \int_0^1 (v(t), v(t))_{M_{f(t)}}\,dt + \int_0^1 \Bigl( \frac{Dv}{dt}, \frac{Dv}{dt} \Bigr)_{M_{f(t)}}\,dt \]
while
\[ (\|v\|^*)^2 = \int_0^1 (v(t), v(t))_{R^N}\,dt + \int_0^1 \Bigl( \frac{dv}{dt}, \frac{dv}{dt} \Bigr)_{R^N}\,dt. \]
Of course $(v(t), v(t))_{M_{f(t)}} = (v(t), v(t))_{R^N}$ by the choice of the metric for M.
Let the map $\varphi : M \to R^N$ define the embedding with which we are
concerned. The second fundamental form of $M \subset R^N$ in terms of local coordinates
for M is a symmetric matrix of vector valued functions of these coordinates,
equal specifically to the projection of $\partial^2 \varphi / \partial u^i \partial u^j$ onto the hyperplane of
directions normal to M. We denote this form by $l_{ij}$. (Cf. Milnor, Morse
Theory.) The definition of the covariant derivative of a vector field v along
a curve f (in local coordinates $v^i$, $f^i$) can be seen to imply the formula
\[ \Bigl( \frac{dv}{dt}, \frac{dv}{dt} \Bigr)_{R^N} = \Bigl( \frac{Dv}{dt}, \frac{Dv}{dt} \Bigr)_{M_{f(t)}} + \bigl( l_{ij} v^i \dot f^j,\ l_{rs} v^r \dot f^s \bigr)_{R^N}, \]
from which it is plain that $\|\ \|' \le \|\ \|^*$. On the other hand, letting $K_m$ be
the least upper bound for the principal curvatures of M in all directions, it
follows by the compactness of M (the embedding being fixed) that $K_m < \infty$.
Moreover, by the definition of the principal curvatures,
\[ \bigl( l_{ij} v^i \dot f^j,\ l_{rs} v^r \dot f^s \bigr)_{R^N} \le K_m^2\,|v(t)|^2\,|\dot f|^2. \]
Hence, using 8.8
\[ (\|v\|^*)^2 \le (\|v\|')^2 + K_m^2 \cdot 2J(f) \cdot 2 (\|v\|')^2. \]
This proves the equivalence of the norms $\|\ \|^*$ and $\|\ \|'$. Combining (i) and
(iii), (ii) follows trivially; details are left to the reader. Q.E.D.
Remark: It is no coincidence that the constants $c_1$, $c_2$ such that
$c_1 \|v\|^{(1)} \le \|v\|^{(2)} \le c_2 \|v\|^{(1)}$ in the preceding theorem depend on an upper bound
for the energy J(f). One can easily check by explicit calculations (carried out
most simply on the sphere or on manifolds with vanishing curvature) that
the best possible constants $c_1$ and $c_2$ may indeed approach 0 and $\infty$
respectively as $J(f) \to \infty$.
Our next developments will depend on Palais' Theorem 6.12, which we
restate as follows.
8.10. Theorem: Let M and N be closed $C^k$-submanifolds of $R^m$ and $R^n$
respectively. Then $H_1(S^1, M)$ and $H_1(S^1, N)$ are closed $C^{k-4}$-submanifolds
of $H_1(S^1, R^m)$ and $H_1(S^1, R^n)$ respectively (cf. Theorem 6.11). Let
$\varphi : M \to N$ be a $C^{k-2}$ map.
(i) It is an elementary result of differential topology that $\varphi$ can be extended
to a $C^{k-2}$ map from $R^m$ to $R^n$. Then, by Theorem 6.8, the extended $\varphi$ gives
rise to a $C^{k-4}$ map of $H_1(S^1, R^m)$ into $H_1(S^1, R^n)$.
(ii) Consequently the map $\tilde\varphi : H_1(S^1, M) \to H_1(S^1, N)$ defined by
$\tilde\varphi(f) = \varphi \circ f$ is a $C^{k-4}$ map. Moreover $d\tilde\varphi_f(v)(t) = d\varphi_{f(t)}(v(t))$ for
$v \in H_1(S^1, TM_f)$.
8.11. Definition: As in Chapter VI we take as the differentiable structure
of $H_1(S^1, M)$ that structure which it inherits as a submanifold of $H_1(S^1, R^N)$
in virtue of an embedding $M \subset R^N$. It is clear from Theorem 8.10 (cf. also
Theorem 6.11) that the differentiable structure thus defined does not depend
on the embedding. Although the differentiable structure of $H_1(S^1, M)$ is
independent of the Riemannian metric of M it will be of considerable
advantage to have coordinate neighborhoods and maps on $H_1(S^1, M)$
available which are closely related to the Riemannian structure of M. We
introduce such coordinates in the following definition and lemmas.
8.12. Lemma: The set $\mathcal{O}(f) = \{ v \mid v \in H_1(S^1, TM_f),\ \|v\|_\infty < \varepsilon \}$ is an
open subset of the Hilbert space $H_1(S^1, TM_f)$.
Proof: Let $v \in \mathcal{O}(f)$, i.e. let $\|v\|_\infty < \varepsilon$ so that $\|v\|_\infty \le \varepsilon - 2\delta$ for some
$\delta > 0$. Then if $\langle w, w \rangle^{1/2} = \|w\| < \delta$ we have $\|w\|_\infty < 2\delta$ (cf. Lemma 8.8
above). Hence $\|v + w\|_\infty < \varepsilon$, so that the $\delta$-ball around v is in $\mathcal{O}(f)$. Q.E.D.

8.13. Definition: For $f, h \in H_1(S^1, M)$ put
\[ (1)\qquad d_\infty(f, h) = \max_{t \in S^1} d_M(f(t), h(t)); \]
given $f \in H_1(S^1, M)$ and $\varepsilon < \tfrac{1}{2}\varepsilon_p$ (cf. Lemma 8.1), put
\[ (2)\qquad U(f) = \{ h \mid h \in H_1(S^1, M) \text{ and } d_\infty(f, h) < \varepsilon \}. \]
This set is introduced as a standard coordinate neighborhood of f, a definition
justified by the two following lemmas. Note that U(f) is an open subset of
$H_1(S^1, M)$ since the original Riemannian metric of M and the metric
induced by an embedding $M \subset R^N$ (as used in Definition 8.11) are equivalent
and $V(f) = \{ h \in H_1(S^1, R^N);\ \max_t d_{R^N}(h(t), f(t)) < \delta \}$ is open in $H_1(S^1, R^N)$
by the proof of Lemma 8.12.

8.14. Lemma: The following formula defines a 1-1 correspondence $\eta$
between U(f) and $\mathcal{O}(f)$:
\[ h(t) = \exp_{f(t)}(v(t)). \]
Proof: We have $|v(t)| = d_M(f(t), h(t))$ by the radial isometry of the map
exp, and hence $\|v\|_\infty = d_\infty(f, h)$. Since $\varepsilon < \varepsilon_p$, and assuming $h \in U(f)$, the
inverse $\exp_{f(t)}^{-1}$ is well defined at every point h(t) and hence the equation
displayed in the statement of the lemma is inverted by $v(t) = \exp_{f(t)}^{-1}(h(t))$.
Finally, the maps $\exp_p^{\pm 1}(\cdot)$ depend differentiably on p, which
implies h is an $H_1$-curve if and only if v is an $H_1$-vector field. Q.E.D.

8.15. Lemma: The 1-1 mapping $\eta$ of Lemma 8.14, given by $\eta(h) = v$, is
a $C^{k-5}$ diffeomorphism between the open subset U(f) of the manifold
$H_1(S^1, M)$ and the open set $\mathcal{O}(f)$ of the Hilbert space $H_1(S^1, TM_f)$.

8.16. Corollary: The mappings $\eta : U(f) \to \mathcal{O}(f)$ of Lemma 8.14 and 8.15
define valid $C^{k-5}$ coordinates on the manifold $H_1(S^1, M)$. We refer to them
as standard coordinates near f.

Remark: It is possible to define the differentiable structure of $H_1(S^1, M)$
directly with the aid of Lemma 8.14 independently of Definition 8.11; in this
case one has to show that the change-of-coordinates maps $\eta_2 \eta_1^{-1}$ are of
class $C^{k-5}$.

The proof of Lemma 8.15 is based on Theorem 8.10 and Whitney's em-
bedding theorem. We start with some considerations which we shall need
again.

8.17. Using Whitney's embedding theorem, we may take M as a $C^k$-
submanifold of some $R^N$. The Whitney sum $TM \oplus \nu M$ of the tangent bundle
TM and the normal bundle $\nu M$ is then the trivial bundle $M \times R^N$. Using
$M \subset R^N$ we embed the trivial bundle $M \times R^N$ in $R^N \times R^N$. Since the
tangent bundle TM is a $C^{k-1}$-subbundle of the trivial bundle we thus get
TM as a $C^{k-1}$-submanifold of $R^N \times R^N$. This embedding has the following
properties. If we identify M with the zero section of TM, then this
submanifold of TM is embedded in $R^N \times \{0\}$. Moreover the tangent space $M_p$
of M at p is embedded as a linear subspace of $\{p\} \times R^N$ in such a way that
the linear structure of $M_p$ is preserved. In view of Theorem 8.10 we have
then $H_1(S^1, TM)$ as a $C^{k-5}$-submanifold of $H_1(S^1, R^N \times R^N)$, and for fixed
$f \in H_1(S^1, M)$, $H_1(S^1, TM_f)$ is a linear subspace (and therefore a $C^\infty$
submanifold) of $H_1(S^1, R^N \times R^N)$ and consequently $H_1(S^1, TM_f)$ is a $C^{k-5}$-
submanifold of $H_1(S^1, TM)$.
In the same way the Whitney sum $TM \oplus TM$ is a $C^{k-1}$-submanifold of
$R^N \times R^N \times R^N$ so that the linear structure of the fibers is preserved.
Consequently $H_1(S^1, TM \oplus TM)$ is a $C^{k-5}$-submanifold of $H_1(S^1, R^N \times R^N \times R^N)$
(Theorem 8.10) and $H_1(S^1, TM_f) \times H_1(S^1, TM_f)$ is a linear subspace of
$H_1(S^1, R^N \times R^N \times R^N)$ and a $C^{k-5}$-submanifold of $H_1(S^1, TM \oplus TM)$.

Proof of Lemma 8.15: We assume 8.17. Consider the map $\varphi : TM \to M$
defined by $\varphi(v) = \exp_{p(v)}(v)$ (where p(v) is the base point of v). Then $\varphi \in C^{k-2}$
and by Theorem 8.10 the induced map $\tilde\varphi : H_1(S^1, TM) \to H_1(S^1, M)$
belongs to $C^{k-5}$ (not $C^{k-4}$ since TM is only $C^{k-1}$). But the restriction of $\tilde\varphi$
to the $C^{k-5}$-submanifold $H_1(S^1, TM_f)$ (cf. 8.17) is the map $\eta^{-1}$ of our
Lemma, proving that $\eta^{-1} \in C^{k-5}$.
To show that we also have $\eta \in C^{k-5}$, we argue as follows. On the open
subset $U = \{ (p, q) \in M \times M;\ d(p, q) < \varepsilon_p \}$ the map $\gamma : U \to TM$ given by
$\gamma(p, q) = \exp_p^{-1}(q)$ is well defined and $C^{k-2}$. By Theorem 8.10 $\gamma$ induces a
$C^{k-5}$ map $\tilde\gamma$ of an open subset of $H_1(S^1, M) \times H_1(S^1, M)$ into $H_1(S^1, TM)$.
The domain of $\tilde\gamma$ contains the $C^{k-5}$ submanifold $\{f\} \times U(f)$ (cf. 8.13) and
the restriction of $\tilde\gamma$ to $\{f\} \times U(f)$ coincides with $\eta$ by the proof of
Lemma 8.14. Q.E.D.
The next Lemma shows that the Hilbert manifold $H_1(S^1, TM)$ may be
identified with the tangent manifold $TH_1(S^1, M)$.

8.18. Lemma: Let 0: TM -+ M be the map given by 45(v) = exp (v), and
let 0 be the induced map (Theorem 8.10) of H, (S,, TM) - H, (S,, M).
Let 0 be the map defined by 0(v) = ds 0(sv) Is_ o, so that 0: H, (S,, TM)
- TH, (S,, M). Then 0 is a Ck ` 6 diffeomorphism of H, (S,, TM) onto
TH, (S,, M).
Proof: From the proof of Lemma 8.15 we have that 0 e Ck- so that
S

0 e C". Write p(v) for the base point of the vector v e TM, or for the base,
point curve of the H,-vector field v E H, (S,, TM), as the case may be. Let
v e H, (S, , TM). The point 0(v) E H, (S,, M) has by Corollary 8.16 the
coordinate v in the standard coordinate neighborhood U(p(v)) near p(v).
Thus ds ; FP (sv) = (v)is that tangent vector of H, (S,, M) atp(v), which,
s= o
in the coordinate system of TH, (S, , M) corresponding to U (p(v)), has the
coordinate (0, v). This shows that 0 is a 1-1 C1-6 mapping of H, (S,, TM)
onto TH, (S,, M).
To prove that $\hat\Phi^{-1}$ is also of class $C^{k-6}$ consider the coordinate system of $TH_1(S_1, M)$ corresponding to $U(p(v))$ and denote the $C^{k-6}$-coordinate map by $\hat\eta$, i.e. $\hat\eta: \hat U(v) \to \eta(U(p(v))) \times H_1(S_1, TM_{p(v)})$. We shall find a $C^{k-5}$ map $\tilde\alpha$ such that $\hat\Phi^{-1} = \tilde\alpha \circ \hat\eta$, which proves $\hat\Phi^{-1} \in C^{k-6}$.
Now use 8.17. Let $p = p(v_1)$ and $q = \exp_p(v_1)$. By Lemma 8.1 there exists $\varepsilon_1 > 0$ such that if $v_1, v_2 \in M_p$ and $|v_2| < \varepsilon_1$ then $\rho(v_1, v_2) = \exp_q^{-1}(\exp_p(v_1 + v_2))$ defines a $C^{k-2}$ function whose value is a vector tangent to $M$ at $q$. Thus $\alpha(v_1, v_2) = \frac{d}{ds}\rho(v_1, s v_2)\big|_{s=0}$ defines a $C^{k-3}$ map $\alpha: TM \oplus TM \to TM$. By Theorem 8.10 $\alpha$ induces a $C^{k-5}$ map $H_1(S_1, TM \oplus TM) \to H_1(S_1, TM)$ and therefore by restriction a $C^{k-5}$ map (cf. 8.17)
$$\tilde\alpha: H_1(S_1, TM_f) \times H_1(S_1, TM_f) \to H_1(S_1, TM).$$
For $(v_1, v_2) \in H_1(S_1, TM_f) \times H_1(S_1, TM_f)$ and $h = \tilde\Phi(v_1)$ we have
$$\frac{d}{ds}\Big(\exp_h^{-1}\big(\exp_f(v_1 + s v_2)\big)\Big)\Big|_{s=0} = \tilde\alpha(v_1, v_2)$$
(cf. 8.10 (ii); more explicitly we have $R(s, v_1, v_2) = \rho(v_1, s v_2)$ and $\alpha(v_1, v_2) = dR_{(0, v_1, v_2)}(1, 0, 0)$. Apply 8.10 (ii) to $R$ and obtain
$$\tilde\alpha(v_1, v_2)(t) = dR_{(0, v_1(t), v_2(t))}(1, 0, 0) = \frac{d}{ds}\Big(\exp_{h(t)}^{-1}\big(\exp_{f(t)}(v_1(t) + s v_2(t))\big)\Big)\Big|_{s=0}\ ).$$
Thus
$$\exp_h^{-1}\big(\exp_f(v_1 + s v_2)\big) = s \cdot \tilde\alpha(v_1, v_2) + o(s) \in H_1(S_1, TM_h).$$
Since $\eta_h^{-1}\big(\exp_h^{-1}(\exp_f(v_1 + s v_2))\big) = \tilde\Phi(v_1 + s v_2)$ and since $\eta_h^{-1} = \tilde\Phi\big|_{H_1(S_1, TM_h)}$ (by the proof of 8.15) it follows that
$$\frac{d}{ds}\tilde\Phi\big(s\,\tilde\alpha(v_1, v_2)\big)\Big|_{s=0} = \frac{d}{ds}\tilde\Phi(v_1 + s v_2)\Big|_{s=0},$$
i.e. $\hat\Phi(\tilde\alpha(v_1, v_2))$ is that tangent vector of $H_1(S_1, M)$ at the point $h = \tilde\Phi(v_1)$ whose coordinates in the coordinate neighborhood $\hat U(v_1)$ of $TH_1(S_1, M)$ are $(v_1, v_2)$. In other words $\hat\eta(\hat\Phi(\tilde\alpha(v_1, v_2))) = (v_1, v_2)$ or $\hat\Phi^{-1} = \tilde\alpha \circ \hat\eta$, which completes the proof. Q.E.D.
Our next aim is to prove the differentiability of the energy integral and to introduce the intrinsic Riemannian metric for $H_1(S_1, M)$, more precisely:

8.19. Theorem: The energy $J$ is a $C^{k-5}$ function on $H_1(S_1, M)$.

8.20. Theorem: Suppose that we represent the tangent space to $H_1(S_1, M)$ at $f$ by $H_1(S_1, TM_f)$ (cf. Corollary 8.16) and take as scalar product in the tangent space the intrinsic scalar product for $H_1(S_1, TM_f)$ defined in 8.6 (3). Then this scalar product defines a $C^{k-6}$ Riemannian metric for $H_1(S_1, M)$.

Before we prove these two theorems, we make the following observations.

8.21. Let $M$ be embedded in $R^N$ and $TM$ in $R^N \times R^N$ as described in 8.17. The $C^{k-1}$ Riemannian metric $g$ of $M$ can be extended to a $C^{k-2}$ Riemannian metric of $R^N$ in such a way as to make $M$ a totally geodesic submanifold (using the normal bundle of $M$ in $R^N$ and partitions of unity). This implies that the covariant derivative along a curve $f \in H_1(S_1, M)$ is the same if $f$ is considered as a curve in $M$ or as a curve in $R^N$. We write the scalar product as $g(p)(v, w)$ for $(p, v)$ and $(p, w) \in R^N \times R^N$ (this is of course bilinear in $v, w$).
The following statements are very similar to Lemma 6.9 and are proved in the same way.
(1) Let $b(p)(v, w)$ be bilinear in $v, w \in R^N$ and of class $C^k$ in $p \in R^N$, and define a $C^{k-2}$ function
$$\tilde b: H_1(S_1, R^N) \times H_1(S_1, R^N) \times H_1(S_1, R^N) \to R$$
by
$$\tilde b(f)(v, w) = \int_0^1 b(f(t))\big(\dot v(t), \dot w(t)\big)\, dt.$$



(2) Let b ( ) ( , ) : RN - L2(RM) be a Ck map from R1 into the bilinear


forms on RM. Then we define a Ck-2 map
L2
b () (,) : H1(S1, RN) -' (H1 (S1, RM))
by

b (f) (v, w) = f b (f) (t)) (v(t), +'(t)) dt.


0

(3) Let c( , ) : RN x RN --+ L2(RM) be a Ck map from R" x RN,


)( ,
which is linear in the second argument, into L2(RM). Then we define a Ck-1
map
H1(S1, RN) x H1(S,, RN) `' L2 (H1(S1, RM))
by

c (f, h) (v, w) = fo c (f(t), h(t)) (v(t), l1w(t)) dt.


1

We show $c(f, h) \in L^2(H_1(S_1, R^M))$ to indicate the kind of changes which have to be made to adapt the proof of Lemma 6.9. We use Lemma 8.8 and Schwarz' inequality and we denote by $e_i$ the $i$-th unit vector in $R^N$, so that $h(t) = \sum_i e_i h^i(t)$.

Ic (f(t), h(t)) (v(t), w(t))I S lIc (f(t), h(t))11 L2(RM) .1V(t)IRM ' I i'(06-
s E I h1(t)I Ilc (f(t), e1)IIL2(RM) max I v(t)I I +'(t)I
and therefore 9410.1]

Ic (f h) (v, 01 S max (E Ilc (f(t), e,)11 2 2(RM))1'2 IlhH0, (s,. Rx) '2 1IvIIR, (s,. it.)
1x(0.1 L

' Ilwlls, (s,. RM)


Proof of Theorem 8.19: With the notations of 8.21,
$$2J(f) = \int_0^1 g(f(t))\big(\dot f(t), \dot f(t)\big)\, dt$$
is the restriction of the $C^{k-4}$ function $\tilde g$ on $H_1(S_1, R^N) \times H_1(S_1, R^N) \times H_1(S_1, R^N)$ to the diagonal of the $C^{k-5}$ submanifold $H_1(S_1, M) \times H_1(S_1, M) \times H_1(S_1, M)$.
Q.E.D.

Proof of 8.20: We identify $TH_1(S_1, M)$ and $H_1(S_1, TM)$ using Lemma 8.18, i.e. we represent tangent vectors of $H_1(S_1, M)$ by elements of $H_1(S_1, TM)$. For $v \in H_1(S_1, TM)$ we denote the covariant derivative along the base point curve $p(v)$ by $Dv/dt$ (cf. 8.21). Then the scalar product according to 8.6 (3) is
$$\langle v, w\rangle = \int_0^1 \Big\{ g\big(p(v(t))\big)\big(v(t), w(t)\big) + g\big(p(v(t))\big)\Big(\frac{Dv}{dt}, \frac{Dw}{dt}\Big) \Big\}\, dt,$$
where $v$ and $w$ have the same base point curve $p(v)$. We have to show that this formula defines a $C^{k-6}$ section from $H_1(S_1, M)$ into the positive-definite, symmetric, continuous bilinear forms on $TH_1(S_1, M) = H_1(S_1, TM)$, cf. Lemma 8.18. Taking, for example, Cartesian coordinates for $R^N$ we may write the covariant derivative along $f$ as
$$\Big(\frac{Dv}{dt}\Big)^i = \dot v^i(t) + \sum_{j,k=1}^N \Gamma^i_{jk}(f(t))\, v^j(t)\, \dot f^k(t), \qquad i = 1, \dots, N,$$
where the $\Gamma^i_{jk}$ are $C^{k-3}$ functions defined on $R^N$ (cf. 8.21). We write $\Gamma(f(t))[v(t), \dot f(t)]$ for the vector $\{\sum_{j,k}\Gamma^i_{jk}(f(t))\, v^j(t)\, \dot f^k(t)\} \in R^N$.
By 8.21 (2) and (3) we may define a $C^{k-5}$ map
$$F: H_1(S_1, R^N) \times H_1(S_1, R^N) \to L^2(H_1(S_1, R^N))$$
by
$$F(f, h)(v, w) = \int_0^1 \Big\{ g(f(t))\big(v(t), w(t)\big) + g(f(t))\Big(\dot v(t) + \Gamma(f(t))[v(t), \dot h(t)],\ \dot w(t) + \Gamma(f(t))[w(t), \dot h(t)]\Big) \Big\}\, dt.$$
The restriction of $F$ first to the diagonal of $H_1(S_1, R^N) \times H_1(S_1, R^N)$, which we identify with $H_1(S_1, R^N)$, and then to the submanifold $H_1(S_1, M)$, is again $C^{k-5}$. In other words, we have a $C^{k-5}$ Riemannian metric on the trivial vector bundle $H_1(S_1, M) \times H_1(S_1, R^N)$ and consequently also a $C^{k-5}$ Riemannian metric on the $C^{k-5}$ subbundle $H_1(S_1, TM)$, cf. 8.17. We lose one more order of differentiability since the identification of $H_1(S_1, TM)$ with the tangent bundle $TH_1(S_1, M)$ is only $C^{k-6}$ (cf. Lemma 8.18). This Riemannian metric induces the right topology by Theorem 8.9. Q.E.D.
We continue with some observations concerning a useful family of differentiable curves on $H_1(S_1, M)$.

8.22. Notation: An element $f \in H_1(S_1, M)$ will be called a curve on $M$ or a point of $H_1(S_1, M)$ depending on the situation. A curve $\kappa$ on $H_1(S_1, M)$ will always be a map $\kappa: [a, b] \to H_1(S_1, M)$.

8.23. Definition: Let $f_0, f_1 \in H_1(S_1, M)$ be such that $d_\infty(f_0, f_1) < \varepsilon_1$, which implies that the shortest geodesics on $M$ joining $f_0(t)$ and $f_1(t)$ are unique. Then put:
$$\gamma(s, t) = \exp_{f_0(t)}\big(s \cdot \exp_{f_0(t)}^{-1}(f_1(t))\big).$$
From this function of two variables we obtain a curve $\gamma: s \to \gamma(s) \in H_1(S_1, M)$ by writing $\gamma(s)(t) = \gamma(s, t)$.
8.24. Lemma: The $\gamma$-curves of Definition 8.23 are $C^{k-5}$ differentiable. The tangent vector $\frac{\partial\gamma}{\partial s}(0) \in H_1(S_1, TM_{f_0})$ is the coordinate of $f_1$ in the standard coordinate system near $f_0$ (cf. Lemma 8.14 and Corollary 8.16). We have $\big|\frac{\partial\gamma}{\partial s}(s, t)\big| = d_M(f_0(t), f_1(t))$ (for all $s$, $0 \le s \le 1$); hence $\big\|\frac{\partial\gamma}{\partial s}\big\|_\infty = d_\infty(f_0, f_1)$.

Proof: In the coordinate system centered at $f_0$ the $\gamma$-curves have the following representation:
$$v(s)(t) = s \cdot \exp_{f_0(t)}^{-1}(f_1(t)), \quad \text{where } v(s) \in H_1(S_1, TM_{f_0}).$$
This implies that the $\gamma$-curves are as often differentiable as the change of coordinates map, i.e. are $C^{k-5}$. Moreover, $\frac{d}{ds}v(s)\big|_{s=0}$ is the coordinate of $f_1$. Finally, $\big|\frac{\partial\gamma}{\partial s}(s, t)\big|$ is the length of the tangent vector to the geodesic $\gamma(s, t)$, $t$ = const., and therefore equals $d_M(f_0(t), f_1(t))$.
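As a concrete illustration of Definition 8.23 and Lemma 8.24, the following minimal sketch evaluates a $\gamma$-curve on the unit sphere $S^2 \subset R^3$, where the exponential map and its inverse have closed forms. The choice of manifold, the helper names (`sphere_exp`, `sphere_log`, `gamma`, `speed`) and the two sample curves are assumptions made only for this example; they do not come from the text.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def sphere_exp(p, v):
    """Exponential map of the unit sphere: follow the great circle from p with velocity v."""
    r = norm(v)
    if r < 1e-15:
        return tuple(p)
    return tuple(math.cos(r) * pc + math.sin(r) * vc / r for pc, vc in zip(p, v))

def sphere_log(p, q):
    """Inverse of sphere_exp at p; valid when q is not antipodal to p."""
    c = max(-1.0, min(1.0, dot(p, q)))
    theta = math.acos(c)                              # geodesic distance d_M(p, q)
    w = tuple(qc - c * pc for pc, qc in zip(p, q))    # part of q tangent to the sphere at p
    nw = norm(w)
    if nw < 1e-15:
        return (0.0, 0.0, 0.0)
    return tuple(theta * wc / nw for wc in w)

def gamma(f0, f1, s, t):
    """gamma(s, t) = exp_{f0(t)}(s * exp_{f0(t)}^{-1}(f1(t))), as in Definition 8.23."""
    p = f0(t)
    return sphere_exp(p, tuple(s * c for c in sphere_log(p, f1(t))))

def speed(f0, f1, s, t, h=1e-4):
    """Central-difference estimate of |d gamma / d s| at (s, t)."""
    a = gamma(f0, f1, s - h, t)
    b = gamma(f0, f1, s + h, t)
    return norm(tuple(y - x for x, y in zip(a, b))) / (2.0 * h)

# Two sample closed curves (hypothetical data): the equator and a nearby latitude circle.
def equator(t):
    return (math.cos(2 * math.pi * t), math.sin(2 * math.pi * t), 0.0)

def latitude_circle(t, lat=0.3):
    return (math.cos(lat) * math.cos(2 * math.pi * t),
            math.cos(lat) * math.sin(2 * math.pi * t),
            math.sin(lat))
```

For these two curves the pointwise distance $d_M(f_0(t), f_1(t))$ is the latitude 0.3 for every $t$, so the estimated speed in $s$ should stay close to 0.3 for all $s$, in line with the statement of Lemma 8.24.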
8.25. Lemma: A $C^1$-curve $\kappa: [0, 1] \to H_1(S_1, M)$ considered as map $[0, 1] \times [0, 1] \to M$ by putting $\kappa(s, t) = \kappa(s)(t)$ is a homotopy between the end points $\kappa(0), \kappa(1) \in H_1(S_1, M)$. Moreover $\frac{\partial\kappa}{\partial s}(s, t)$ is a continuous vector field on $M$ along $\kappa$. This implies that the deformation paths $\kappa(s, t_0)$, $t_0$ = constant, are rectifiable curves on $M$ and that their length depends continuously on $t_0$.
Proof: Since $J$ is continuous and $\kappa[0, 1]$ is compact (in $H_1(S_1, M)$) we have $\max_{s \in [0,1]} J(\kappa(s)) = A < \infty$. Therefore, by Lemma 8.3, the $\kappa(s)$ are an equicontinuous family of curves on $M$. Equicontinuity in one variable and continuity in the other implies continuity in both variables. In this case $\kappa(s)(t_0)$ is also equicontinuous in $s$ for $t_0 \in [0, 1]$. To see this let $v(s)$ be the coordinate of $\kappa(s)$ in some standard coordinate system on $H_1(S_1, M)$ (cf. 8.13 (2)), i.e. for some $f \in H_1(S_1, M)$ and $\delta > 0$ let $\kappa(s) \in U(f)$ and $v(s) \in H_1(S_1, TM_f)$ for $|s - s_0| < \delta$. By Lemma 8.8 we have
$$\|v(s) - v(s_0)\|_\infty^2 = \max_t |v(s)(t) - v(s_0)(t)|^2 \le 2\,\langle v(s) - v(s_0),\ v(s) - v(s_0)\rangle,$$
and therefore the continuity of $v(s)$ implies the stated equicontinuity. To prove the second part of the Lemma observe that the derivative of a $C^1$-curve in $H_1(S_1, M)$ represents a tangent vector, so that $\frac{\partial\kappa}{\partial s} \in H_1(S_1, TM_{\kappa(s)})$. Hence $\frac{\partial\kappa}{\partial s}(s, t)$ is continuous in $t$ for fixed $s$. Using coordinates we see as before that $\frac{\partial\kappa}{\partial s}(s, t_0)$ is equicontinuous in $s$ for fixed $t_0$, $t_0 \in [0, 1]$. This implies continuity of $\frac{\partial\kappa}{\partial s}(s, t)$ in both variables.
Q.E.D.

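8.26. Lemma: Let $s \to \kappa(s)$ be a $C^1$-curve on $H_1(S_1, M)$ and $s \to w(s)$ be a $C^1$-vector field along $\kappa(s)$. Define $\phi(s, t) = \kappa(s)(t)$ and $v(s, t) = w(s)(t)$. Then
$$\frac{D}{\partial t}\frac{\partial\phi}{\partial s} = \frac{D}{\partial s}\frac{\partial\phi}{\partial t} \quad \text{(almost everywhere)}$$
and
$$\Big(\frac{D}{\partial t}\frac{D}{\partial s} - \frac{D}{\partial s}\frac{D}{\partial t}\Big)\, v(s, t) = R\Big(\frac{\partial\phi}{\partial t}, \frac{\partial\phi}{\partial s}\Big)\, v(s, t) \quad \text{almost everywhere,}$$
where $R$ is the $C^{k-3}$ curvature tensor of $M$ and $D/\partial t$ (resp. $D/\partial s$) is the covariant derivative along the curves $\phi(s_0, t)$ (resp. $\phi(s, t_0)$) on $M$.

Proof: The formulae are well known for $\phi, v \in C^2$ and extend by the usual limit arguments to the above situation.
Q.E.D.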

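8.27. Definition: Let $\kappa: [0, 1] \to H_1(S_1, M)$ be a $C^1$-curve; then
$$L(\kappa) = \int_0^1 \Big\|\frac{\partial\kappa}{\partial s}\Big\|\, ds = \int_0^1 ds \left( \int_0^1 dt \left\{ \Big(\frac{\partial\kappa}{\partial s}, \frac{\partial\kappa}{\partial s}\Big) + \Big(\frac{D}{\partial t}\frac{\partial\kappa}{\partial s}, \frac{D}{\partial t}\frac{\partial\kappa}{\partial s}\Big) \right\} \right)^{1/2}$$
is the length of $\kappa$ and
$$E(\kappa) = \frac12 \int_0^1 \Big\|\frac{\partial\kappa}{\partial s}\Big\|^2 ds$$
is the energy of $\kappa$.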

8.28. Remark: $L^2(\kappa) \le 2E(\kappa)$ may be proved along precisely the lines of the proof of Lemma 8.3.
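For orientation, one way to see the inequality of Remark 8.28 is a single application of the Cauchy-Schwarz inequality on $[0, 1]$. The sketch below assumes the definitions $L(\kappa) = \int_0^1 \|\partial\kappa/\partial s\|\, ds$ and $E(\kappa) = \frac12\int_0^1 \|\partial\kappa/\partial s\|^2\, ds$ of 8.27; it is not quoted from the proof of Lemma 8.3.

```latex
L(\kappa)^2
  = \Bigl( \int_0^1 1 \cdot \Bigl\| \frac{\partial\kappa}{\partial s} \Bigr\| \, ds \Bigr)^2
  \le \Bigl( \int_0^1 1^2 \, ds \Bigr) \Bigl( \int_0^1 \Bigl\| \frac{\partial\kappa}{\partial s} \Bigr\|^2 ds \Bigr)
  = 2 E(\kappa).
```

Equality holds exactly when $\|\partial\kappa/\partial s\|$ is constant in $s$, i.e. when $\kappa$ is parametrized proportionally to arc length.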

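8.29. Definition: In view of Lemma 8.25, define the Riemannian distance between $f_0, f_1 \in H_1(S_1, M)$ by
$$d(f_0, f_1) = \begin{cases} \inf\limits_{\kappa(0) = f_0,\ \kappa(1) = f_1} L(\kappa) & \text{if } f_0 \text{ and } f_1 \text{ are homotopic as curves on } M, \\ \infty & \text{if } f_0, f_1 \text{ are not homotopic.} \end{cases}$$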
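8.30. Theorem: $\big|\sqrt{2J(f_0)} - \sqrt{2J(f_1)}\big| \le d(f_0, f_1)$.

Proof: Either $d(f_0, f_1) = \infty$ and nothing has to be proved or there is a $C^1$-curve $\kappa$ joining $f_0$ and $f_1$. In this latter case we have
$$J(\kappa(s)) = \frac12 \int_0^1 \Big(\frac{\partial\kappa}{\partial t}(s, t), \frac{\partial\kappa}{\partial t}(s, t)\Big)\, dt,$$
$$\frac{d}{ds} J(\kappa(s)) = \int_0^1 \Big(\frac{D}{\partial s}\frac{\partial\kappa}{\partial t}, \frac{\partial\kappa}{\partial t}\Big)\, dt.$$
Hence, using Lemma 8.26 to change the order of differentiation and noting the identity of Definition 8.27,
$$\frac{d}{ds}\sqrt{2J(\kappa(s))} \le \frac{1}{\sqrt{2J(\kappa(s))}} \Big(\int_0^1 \Big|\frac{\partial\kappa}{\partial t}(s, t)\Big|^2 dt\Big)^{1/2} \Big(\int_0^1 \Big|\frac{D}{\partial t}\frac{\partial\kappa}{\partial s}\Big|^2 dt\Big)^{1/2} \le \Big\|\frac{\partial\kappa}{\partial s}\Big\|.$$
Therefore $\sqrt{2J(\kappa(1))} - \sqrt{2J(\kappa(0))} \le L(\kappa)$. It follows by symmetry that $\big|\sqrt{2J(\kappa(1))} - \sqrt{2J(\kappa(0))}\big| \le L(\kappa)$, and taking infima over $\kappa$ we obtain the theorem.
Q.E.D.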

The following theorem is a generalization of Lemma 8.8.



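8.31. Theorem: $d_\infty^2(f_0, f_1) \le 2\, d^2(f_0, f_1)$.

Proof: We must show $d_\infty^2(f_0, f_1) \le 2L^2(\kappa)$ for any $C^1$-curve $\kappa$ joining $f_0$ and $f_1$, cf. Definition 8.13 (1) and 8.29. Let $t_m \in [0, 1]$ be such that $d_\infty(f_0, f_1) = d_M(f_0(t_m), f_1(t_m))$. Then by Lemma 8.25 $\kappa(s, t_m)$ is a $C^1$-curve on $M$ and we have
$$d_\infty^2(f_0, f_1) = d_M^2(f_0(t_m), f_1(t_m)) \le \Big(\int_0^1 \Big|\frac{\partial\kappa}{\partial s}(s, t_m)\Big|\, ds\Big)^2 \le \Big(\int_0^1 \max_t \Big|\frac{\partial\kappa}{\partial s}(s, t)\Big|\, ds\Big)^2 \le 2\Big(\int_0^1 \Big\|\frac{\partial\kappa}{\partial s}\Big\|\, ds\Big)^2 = 2L^2(\kappa)$$
by Lemma 8.8.
Q.E.D.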

8.32. Theorem: $H_1(S_1, M)$ is a complete metric space.

Proof: Let $\{f_n\}$ be a Cauchy sequence in $H_1(S_1, M)$. By Theorem 8.31 $\{f_n\}$ is a $d_\infty$-Cauchy sequence, i.e. $f_n$ converges uniformly to a continuous curve $f$ on $M$. Since the coordinate neighborhoods on $H_1(S_1, M)$ are defined using the $d_\infty$-distance (cf. Definition 8.13 (2)) it follows that for large $n$ the $f_n$ and $f$ all belong to some single coordinate neighborhood. The coordinates $v_n$ of the curves $f_n$ form a Cauchy sequence in the corresponding coordinate Hilbert space; this Cauchy sequence converges to some limit $v$. The point of $H_1(S_1, M)$ with the coordinate $v$ coincides as continuous curve on $M$ with $f$; this proves that $f_n \to f \in H_1(S_1, M)$ in the topology of $H_1(S_1, M)$. Q.E.D.
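8.33. Theorem: If we consider each point of $M$ to define a closed, constant curve, then $M$ is embedded isometrically as a totally geodesic closed submanifold $\tilde M$ of $H_1(S_1, M)$.
Proof: Every "constant" curve $\tilde p: S_1 \to p \in M$ is determined by its image point $p$ on $M$. The set
$$\tilde M = J^{-1}(0) = \{f \mid f: S_1 \to M \text{ is a constant curve}\}$$
is a closed subset of $H_1(S_1, M)$ since $J(f) = 0$ if and only if $f$ is constant. Let $U(\tilde p)$ be the standard neighborhood (cf. Corollary 8.16) of the constant curve $\tilde p$ in $H_1(S_1, M)$. Then $f \in U(\tilde p) \cap \tilde M$ if and only if the coordinate of $f$ is a constant vector field along $\tilde p$. The constant vector fields form an $n$-dimensional ($n = \dim M$), hence closed, subspace of the coordinate space $H_1(S_1, TM_{\tilde p})$ and therefore $\tilde M$ is a closed submanifold of $H_1(S_1, M)$. Next we show that if two constant curves $\tilde p_0, \tilde p_1$ are joined in $H_1(S_1, M)$ by a $C^1$-curve $\kappa$, then they can be joined by a curve $\tilde\kappa: I \to \tilde M$ such that $L(\tilde\kappa) \le L(\kappa)$. Indeed, by Lemma 8.25, the curve $\kappa_\tau: I \to M$, defined by $\kappa_\tau(s) = \kappa(s, \tau)$, is a $C^1$-curve on $M$. Using any such curve, we may define a curve $\tilde\kappa_\tau: I \to \tilde M$ on $\tilde M$ by putting $\tilde\kappa_\tau(s)(t) = \kappa(s, \tau)$, $0 \le t \le 1$. We have
$$L(\tilde\kappa_\tau) = \int_0^1 \Big\|\frac{\partial\tilde\kappa_\tau}{\partial s}\Big\|\, ds = \int_0^1 ds \left(\int_0^1 dt \left\{\Big|\frac{\partial\tilde\kappa_\tau}{\partial s}(s)(t)\Big|^2 + \Big|\frac{D}{\partial t}\frac{\partial\tilde\kappa_\tau}{\partial s}\Big|^2\right\}\right)^{1/2} = \int_0^1 ds\, \Big|\frac{\partial\kappa}{\partial s}(s, \tau)\Big| = L_M(\kappa_\tau),$$
since $\partial\tilde\kappa_\tau/\partial s$ is independent of $t$ and $\frac{D}{\partial t}\frac{\partial\tilde\kappa_\tau}{\partial s} = 0$. Therefore the curve $\kappa_\tau$ on $M$ and the curve $\tilde\kappa_\tau$ on $\tilde M$ have the same length. Moreover
$$\inf_\tau L_M(\kappa_\tau) = \inf_\tau \int_0^1 ds\, \Big|\frac{\partial\kappa}{\partial s}(s, \tau)\Big| \le \int_0^1 d\tau \int_0^1 ds\, \Big|\frac{\partial\kappa}{\partial s}(s, \tau)\Big| \le \int_0^1 ds \Big(\int_0^1 \Big|\frac{\partial\kappa}{\partial s}(s, \tau)\Big|^2 d\tau\Big)^{1/2},$$
$$\inf_\tau L_M(\kappa_\tau) \le \int_0^1 ds\, \Big\|\frac{\partial\kappa}{\partial s}\Big\| = L(\kappa).$$
By Lemma 8.25, $L_M(\kappa_\tau)$ depends continuously on $\tau$, hence there is a $\tau^*$ for which the above infimum is assumed. Putting $\tilde\kappa = \tilde\kappa_{\tau^*}$ we have $L_M(\kappa_{\tau^*}) = L(\tilde\kappa) \le L(\kappa)$. From this it follows that the shortest geodesic joining $p_0$ and $p_1$ on $M$, if considered as a curve on $\tilde M$ joining $\tilde p_0$ and $\tilde p_1$, is also the shortest curve connecting $\tilde p_0$ and $\tilde p_1$ in $H_1(S_1, M)$, and is therefore a geodesic in $H_1(S_1, M)$. Thus plainly $d_M(p_0, p_1) = d(\tilde p_0, \tilde p_1)$ so that $\tilde M \subset H_1(S_1, M)$ is totally geodesic.
Q.E.D.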

We proceed to a more detailed discussion of the $\gamma$-curves. These $\gamma$-curves are "short enough" to eliminate the need to refer explicitly to geodesics on $H_1(S_1, M)$ in some situations with which we shall deal below.

For use in the following proofs we recall some facts which have been established above.
$$\Big\|\frac{\partial\gamma}{\partial s}\Big\|_\infty = d_\infty(f_0, f_1) \le \sqrt2\, d(f_0, f_1) \le \sqrt2\, L(\gamma) \le 2\sqrt{E(\gamma)}$$
(cf. Lemmas 8.24, 8.31, 8.29, and 8.28).

8.34. Lemma: Let $\|R\|$ be a bound for the norm of the curvature tensor on $M$ and let $\big\|\frac{\partial\gamma}{\partial s}\big\|_\infty = d_\infty$. Then we have

(1) max J2J(y(s)) (1 - IIRII dam)-1 (IJiif5 +


< 2 II R 11 dQ s max -*' 2J y(6))
0<0<!

(3) 1 L(1') < 2 !IRII clr max \, 2J(y(s)):

(4) 2 IIRII do max J2J(y(s)).


as (0)

Remarks: (1) gives a bound for the energy integral along a $\gamma$-curve in terms of the energy of the one end point, $J(f_0)$, and in terms of the coordinate of the other end point, $\frac{\partial\gamma}{\partial s}(0)$.
(2) shows that a $\gamma$-curve is parametrized almost proportionally to arc length.

Proof: (3) and (4) are immediate consequences of (2). To prove (2), we note that, by 8.6 (3) and 8.8 (4)

d D ay L + D D oy D icy
ds ay
1
fo tas as ' as
1 dt
as at as ' at as)
as (s) II

By integrating this equation and using a ay = 0, 8.26, I s (s, t) I - d,,


s

and the Cauchy-Schwar7 inequality we obtain

'a
ay _y D ay
<_ ds
\Ras
I

,Jo oy (s) at as at as

as
I
D ay 2

dt
J'C at as
II RI1;d f ds 2J (V(s))
0

< II R I; d, (i ds v'2J (y(s)).

This proves (2). From 8.30 we have

I\ 2J(,,((T)) - J2J(fo) $
0 as
(s) I ds.
Therefore (2) gives

\12J (y(a)) - J2J(fo)I


as
(0) + 11RII d; m ax J2J(y(s)),
li 11

which implies (1).


Q.E.D.
8.35. Theorem: The first derivative of the energy integral at $f \in H_1(S_1, M)$ is the continuous linear map $dJ_f: H_1(S_1, TM_f) \to R$ given by the formula
$$dJ_f(w) = \int_0^1 \Big(\frac{D}{\partial t} w(t), \frac{\partial f}{\partial t}\Big)\, dt, \qquad w \in H_1(S_1, TM_f).$$
Moreover,
$$|dJ_f(w)| \le \sqrt{2J(f)}\, \|w\|.$$

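Proof: Let $f_1 \in U(f_0)$ and let $\gamma(s)$ be the $\gamma$-curve joining $f_0$ and $f_1$. Then the coordinate of $f_1$ is $\frac{\partial\gamma}{\partial s}(0)$ (cf. 8.24) and we must prove that
$$\Big| J(f_1) - J(f_0) - dJ_{f_0}\Big(\frac{\partial\gamma}{\partial s}(0)\Big) \Big| = o\Big(\Big\|\frac{\partial\gamma}{\partial s}(0)\Big\|\Big).$$
Now, using 8.26, we see from 8.8 (4), from
$$J(f_1) - J(f_0) = \int_0^1 ds\, \frac{d}{ds}\Big(\frac12 \int_0^1 \Big(\frac{\partial\gamma}{\partial t}, \frac{\partial\gamma}{\partial t}\Big)\, dt\Big)$$
and from
$$\int_0^1 \Big(\frac{D}{\partial t}\frac{\partial\gamma}{\partial s}(\sigma, t), \frac{\partial\gamma}{\partial t}(\sigma, t)\Big)\, dt\, \Big|_{\sigma=0}^{\sigma=s} = \int_0^s d\sigma \int_0^1 dt \Big\{ \Big(\frac{D}{\partial t}\frac{\partial\gamma}{\partial s}, \frac{D}{\partial t}\frac{\partial\gamma}{\partial s}\Big) + \Big(\frac{D}{\partial s}\frac{D}{\partial t}\frac{\partial\gamma}{\partial s}, \frac{\partial\gamma}{\partial t}\Big) \Big\}$$
that
$$\Big| J(f_1) - J(f_0) - \int_0^1 \Big(\frac{D}{\partial t}\frac{\partial\gamma}{\partial s}(0), \frac{\partial f_0}{\partial t}\Big)\, dt \Big| = \Big| \int_0^1 ds \int_0^s d\sigma \int_0^1 dt \Big\{ \Big(\frac{D}{\partial t}\frac{\partial\gamma}{\partial s}, \frac{D}{\partial t}\frac{\partial\gamma}{\partial s}\Big) - \Big(R\Big(\frac{\partial\gamma}{\partial t}, \frac{\partial\gamma}{\partial s}\Big)\frac{\partial\gamma}{\partial s}, \frac{\partial\gamma}{\partial t}\Big) \Big\} \Big|$$
$$\le \int_0^1 ds \Big\{ \Big\|\frac{\partial\gamma}{\partial s}(s)\Big\|^2 + \|R\|\, \Big\|\frac{\partial\gamma}{\partial s}\Big\|_\infty^2\, 2J(\gamma(s)) \Big\} \le C_1 \Big\|\frac{\partial\gamma}{\partial s}(0)\Big\|^2,$$
where $C_1 = C_1\big(\|R\|, J(f_0), \|\frac{\partial\gamma}{\partial s}(0)\|\big)$ by Lemma 8.34 (1) and 8.34 (2). Therefore
$$dJ_{f_0}\Big(\frac{\partial\gamma}{\partial s}(0)\Big) = \int_0^1 \Big(\frac{D}{\partial t}\frac{\partial\gamma}{\partial s}(0), \frac{\partial f_0}{\partial t}\Big)\, dt,$$
which proves our first conclusion. The inequality $|dJ_f(w)| \le \sqrt{2J(f)}\, \|w\|$ now follows from the Cauchy-Schwarz inequality.
Q.E.D.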
Remark: If one carefully notes those parts of earlier results which have
been used in the above proof, one finds that it not only establishes a formula
for the derivative assuming differentiability, but actually proves the differ-
entiability of J.
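The formula of Theorem 8.35 can be checked numerically in the simplest situation. The sketch below assumes a flat target ($M$ replaced by the plane $R^2$, so that $D/\partial t$ is the ordinary derivative) and closed polygons with $N$ vertices in place of $H_1$-curves; the function names `energy`, `first_variation`, `h1_norm` and `perturb` are hypothetical helpers for this example only.

```python
import math

def energy(curve):
    """Discrete J(f) = 1/2 * integral |f'|^2 dt for a closed polygon with n vertices (dt = 1/n)."""
    n = len(curve)
    total = 0.0
    for i in range(n):
        p, q = curve[i], curve[(i + 1) % n]
        total += sum((b - a) ** 2 for a, b in zip(p, q))
    return 0.5 * n * total          # (difference * n)^2 * (1/n), summed, times 1/2

def first_variation(curve, w):
    """Discrete version of dJ_f(w) = integral (w', f') dt."""
    n = len(curve)
    total = 0.0
    for i in range(n):
        j = (i + 1) % n
        df = [b - a for a, b in zip(curve[i], curve[j])]
        dw = [b - a for a, b in zip(w[i], w[j])]
        total += sum(x * y for x, y in zip(df, dw))
    return n * total

def h1_norm(w):
    """Discrete H_1 norm: sqrt( integral |w|^2 dt + integral |w'|^2 dt )."""
    n = len(w)
    l2 = sum(sum(c * c for c in p) for p in w) / n
    return math.sqrt(l2 + 2.0 * energy(w))

def perturb(curve, w, h):
    """Return the polygon f + h * w."""
    return [tuple(a + h * b for a, b in zip(p, q)) for p, q in zip(curve, w)]

def sample_data(n=200):
    """A circle f and a smooth periodic variation field w (arbitrary illustrative choices)."""
    f = [(math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)) for i in range(n)]
    w = [(0.3 * math.cos(2 * math.pi * i / n), 0.1 * math.sin(2 * math.pi * i / n)) for i in range(n)]
    return f, w
```

Because the discrete energy is quadratic, a central difference quotient of `energy` along `w` agrees with `first_variation` up to rounding, and the discrete Cauchy-Schwarz inequality gives the bound $|dJ_f(w)| \le \sqrt{2J(f)}\,\|w\|$ in this model as well.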
8.36. Definition: The continuous linear functional $dJ_f(w)$ on the Hilbert space $H_1(S_1, TM_f)$ can be represented as the inner product of $w$ and of a vector which we call grad $J(f)$. More specifically, we have (cf. 8.35)
$$dJ_f(w) = \langle \mathrm{grad}\, J(f), w \rangle = \int_0^1 \Big(\frac{D}{\partial t} w(t), \frac{\partial f}{\partial t}\Big)\, dt = \int_0^1 \Big\{ \big(\mathrm{grad}\, J(f)(t), w(t)\big) + \Big(\frac{D}{\partial t}\mathrm{grad}\, J(f)(t), \frac{D}{\partial t} w(t)\Big) \Big\}\, dt.$$

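8.37. Remark: From 8.35 it follows that $\|\mathrm{grad}\, J(f)\| \le \sqrt{2J(f)}$, and from 8.36 that
$$\|\mathrm{grad}\, J(f)\|^2 = \int_0^1 \Big(\frac{D}{\partial t}\mathrm{grad}\, J(f), \frac{\partial f}{\partial t}\Big)\, dt.$$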
8.38. Theorem: The vector grad $J(f)$, tangent to $H_1(S_1, M)$ at $f$, may be interpreted as a vector field along the curve $f$ on $M$, and is determined by the integral equation given in Definition 8.36. If $f$ is smooth enough so that $\partial f/\partial t \in H_1(S_1, TM_f)$, then grad $J(f)(t)$ is the unique periodic solution of the differential equation
$$\frac{D^2}{\partial t^2}\, \mathrm{grad}\, J(f)(t) - \mathrm{grad}\, J(f)(t) = \frac{D}{\partial t}\frac{\partial f}{\partial t}.$$
Proof: Our first assertion is obvious. To prove the second, note that since $\partial f/\partial t$ and $w(t)$ are continuous, we have $\big(w(t), \frac{\partial f}{\partial t}\big)\big|_0^1 = 0$. Hence integrating the left side of the integral equation of 8.36 by parts gives (use 8.8 (4))
$$-\int_0^1 \Big(w(t), \frac{D}{\partial t}\frac{\partial f}{\partial t}(t)\Big)\, dt = \int_0^1 \Big\{ \big(\mathrm{grad}\, J(f)(t), w(t)\big) + \Big(\frac{D}{\partial t}\mathrm{grad}\, J(f)(t), \frac{D}{\partial t} w(t)\Big) \Big\}\, dt.$$
This can hold for every $w \in H_1(S_1, TM_f)$ only if $\frac{D}{\partial t}\mathrm{grad}\, J(f)(t)$ is a continuous vector field along $f$, in which case integration of the last term by parts gives the desired result.
Q.E.D.

Recall that $f$ is called a critical point of $J$ if $\|\mathrm{grad}\, J(f)\| = 0$.

8.39. Theorem: The critical points of $J$ correspond precisely to the closed geodesics on $M$ (including the constant curves, cf. 8.33).

Proof: Since the constant curves are of minimal energy, i.e., of energy zero, there is nothing to be proved for these curves. Next, let $f$ be a non-constant closed geodesic and hence a $C^2$-curve. Then $\partial f/\partial t \in H_1(S_1, TM_f)$ and by Theorem 8.35 we have
$$dJ_f(w) = \int_0^1 \Big(\frac{D}{\partial t} w(t), \frac{\partial f}{\partial t}\Big)\, dt = -\int_0^1 \Big(w(t), \frac{D}{\partial t}\frac{\partial f}{\partial t}\Big)\, dt = 0,$$
since $\frac{D}{\partial t}\frac{\partial f}{\partial t} = 0$ for geodesics and $\big(w(t), \frac{\partial f}{\partial t}\big)\big|_0^1 = 0$ by continuity. Therefore by 8.36 $\|\mathrm{grad}\, J(f)\| = 0$ so that $f$ is a critical point of $J$.

Next assume that $f$ is any critical point of $J$, so that $\|\mathrm{grad}\, J(f)\| = 0$. We shall prove $\frac{D}{\partial t}\frac{\partial f}{\partial t} = 0$ by constructing a vector field $z$ which is parallel along $f$ and which will turn out to be equal to $\partial f/\partial t$. (In what follows $D/\partial t$ will denote the covariant derivative along $f$.) To construct the vector field $z$, we first solve the equation $\frac{D}{\partial t} y = \frac{\partial f}{\partial t}$, putting $y(0) = 0$. Then $y$ is an $H_1$-vector field along $f$, continuous everywhere except possibly for the fact that $y(0) \ne y(1)$. Next solve the differential equation $\frac{D}{\partial t} z = 0$ with the boundary condition $z(1) = y(1)$. Again $z$ is $H_1$ and continuous everywhere except possibly for the fact that $z(1) \ne z(0)$.
Note also that the solution of $\frac{D}{\partial t} x = z(t)$ with the end condition $x(1) = y(1)$ is $x(t) = t\, z(t)$.
Put $v(t) = x(t) - y(t)$. Then $v(t)$ is an $H_1$-vector field along $f$ and $v(0) = v(1) = 0$. Hence $v \in H_1(S_1, TM_f)$. Since $\|\mathrm{grad}\, J(f)\| = 0$ it follows by Theorem 8.35 that
$$dJ_f(v) = 0 = \int_0^1 \Big(\frac{D}{\partial t} v, \frac{\partial f}{\partial t}\Big)\, dt.$$
Moreover, since $\frac{D}{\partial t} z = 0$ and $v(0) = v(1) = 0$, integration by parts gives
$$\int_0^1 \Big(\frac{D}{\partial t} v(t), z(t)\Big)\, dt = -\int_0^1 \Big(v(t), \frac{D}{\partial t} z\Big)\, dt = 0.$$
Forming the difference of the last two equations and noting that $Dv/\partial t = Dx/\partial t - Dy/\partial t = z(t) - \partial f/\partial t$, we get
$$\int_0^1 \Big(z(t) - \frac{\partial f}{\partial t},\ z(t) - \frac{\partial f}{\partial t}\Big)\, dt = 0.$$
This implies that $\partial f/\partial t = z(t)$ almost everywhere. Since $\partial f/\partial t$ is the derivative of an absolutely continuous function it follows that the indefinite integrals of $\partial f/\partial t$ and of $z(t)$ (in local coordinates) agree. But $z(t)$ is continuous and therefore everywhere equal to the derivative of $f$. Thus $\partial f/\partial t$ is continuous except possibly for a jump at $t = 0+, 1-$. Arguing in a similar way however we can show that $\partial f/\partial t$ is continuous except possibly for a jump at $t = \tfrac12-, \tfrac12+$. Thus $\partial f/\partial t$ is continuous everywhere and parallel along $f$, so that $f$ is a closed geodesic.
Q.E.D.
8.40. Remark: We restate the Palais-Smale condition for the energy $J$ for the convenience of the reader:
If $\{f_n\}$ is a sequence on $H_1(S_1, M)$ such that $J(f_n) \le A$ and $\|\mathrm{grad}\, J(f_n)\|$ converges to 0, then $\{f_n\}$ has a subsequence which converges to a critical point.
8.41. Theorem: The Palais-Smale condition holds for J.

Proof: Since $H_1(S_1, M)$ is complete (by 8.32) and since $\|\mathrm{grad}\, J(\cdot)\|$ is continuous it suffices to find a subsequence $\{h_n\}$ of $\{f_n\}$ which is a Cauchy sequence. Since by Lemma 8.3 the $\{f_n\}$ are an equicontinuous family on a compact manifold we can use Arzelà's theorem to find a subsequence of $\{f_n\}$ which converges uniformly; suppose, without loss of generality, that $\{f_n\}$ has this property. We then have $d_\infty(f_n, f_m) < \varepsilon$ for $n, m \ge N(\varepsilon)$. We now show that $\{f_n\}$ is a Cauchy sequence. The proof will rest on Lemma 8.34 and the following formula for $\gamma$-curves (cf. 8.23, 8.8 (4) and 8.26)
aJ(y(s))I Dy ay (s, t)dt
as _ fo at tas ' at /
fu (a (o, t), ay (o, t)) dt + f f
o
z
(s, t) dt ds
as o at as

+ f f o(R at'
o dt ds, (5)

Let $\gamma_{nm}$ be the $\gamma$-curve (cf. 8.23) such that $\gamma_{nm}(0) = f_n$, $\gamma_{nm}(1) = f_m$. Then by 8.29 and 8.34, we have

d (fA, fm) 5 L(yem)

6 + 211811 do (fi,fm)
1 - IIRII dd (fn,fm)
2A + aynm (0)
as
}
We now suppress the indices n and m and write ay/as for ay,./as, d. for
(f,fm) and z for 21JR dx
(1 2
- IIRIIdW )2

Then the above formula yields


(6) d (f,,,fm) I (Y) (1 + e) jI s 2A.
as (0)II +
Since
dJ (y(s))
E=zi AD at)dt < 11 grad J (Y(s))1I
ds at as'
we obtain from (5)
D ay (s, t) 2
dt ds 5 (1 grad J (Y(1))II
at as as (1)

+ 11 grad J (Y(0)) II 11as (0)


11

+ J0
'f' ((as
0 , at as at
} dt ds,

ay 211 R II d.2
and using = d., 8.34 (2), s =
as (1 - IIRII d.)
and 8.34 (1)
f1 1 2
D
(7) (s, t) dt ds S Ilgrad J(y(1))II ( (0)
J 0J0 at as as

+E (0)) + 118rad J (Y(0)) II


as as (0)

11 (0)112).
+ (A +
as

On the other hand by 8.27 we have


J1
2
-
ff1

D ay (s, t) dt ds > 2E(y) d.2


OJO at as
and, after multiplying 8.34 (4) by 2E(y) + and using 8.34 (1)
1 1
D 2 2

f fo at as
o
(3, t) dtds>-
sy (0)
as (0) as (0)

+ 2A + II ay (0)
ns

Combining this inequality with (7) we obtain


2
(1 - 8 - 282) < d + e 2A (1 + E) a1' (0) + (e + 2e2) A
11
as 11

+ ((1 + e) 11 grad J(y(l))11 + 11 grad J(y(0))I1)

+ e N/2A 11 grad J(y(l))ll


as (0)
or, since e = O(de)
2
(8) = O (dam) + (0) O (dam) + (11 grad J (y(O))II
as

+ pgradJ(y(l))II)

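From (8) it follows first that $\big\|\frac{\partial\gamma_{nm}}{\partial s}(0)\big\|$ stays bounded as $n, m \to \infty$ and then, since $d_\infty \to 0$ and $\|\mathrm{grad}\, J(\gamma_{nm}(0))\| \to 0$, it also follows from (8) that $\big\|\frac{\partial\gamma_{nm}}{\partial s}(0)\big\| \to 0$ as $n, m \to \infty$. This and (6) completes the proof.
Q.E.D.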

We may now draw various easy consequences of the Palais-Smale condition (for more details see Chapter IV).

8.42. Lemma: If $J$ has no critical points in $J^{-1}([a, b])$ then there exist $\delta > 0$ and $\varepsilon > 0$ such that $\|\mathrm{grad}\, J\| \ge \varepsilon$ on $J^{-1}([a - \delta, b + \delta])$.

Proof: Were this false, we could find a sequence $\{f_n\}$ such that $\lim J(f_n) \in [a, b]$ and $\lim \|\mathrm{grad}\, J(f_n)\| = 0$. But then the Palais-Smale condition implies that $J$ has a critical point in $J^{-1}([a, b])$.
Q.E.D.

8.43. Definition: As in Chapter IV, we define a vector field on $H_1(S_1, M)$ by $v(f) = -\mathrm{grad}\, J(f)$. Integration of $\frac{\partial\phi}{\partial s} = v(\phi)$ with the initial condition $\phi(0, f) = f$ defines the "gradient deformation" $\phi(s, f)$.

8.44. Lemma: We have $\frac{d}{ds} J(\phi(s, f)) = -\|\mathrm{grad}\, J(\phi(s, f))\|^2$.

Proof:
$$\frac{d}{ds} J(\phi(s, f)) = \Big\langle \mathrm{grad}\, J(\phi), \frac{\partial\phi}{\partial s} \Big\rangle = -\|\mathrm{grad}\, J(\phi)\|^2$$
(cf. 4.71).
Q.E.D.
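The content of Lemma 8.44, that the energy decreases along the gradient deformation at the rate $\|\mathrm{grad}\, J\|^2$, can be observed in a toy model. The sketch below makes two simplifying assumptions that are not in the text: the target is the flat plane $R^2$ rather than a compact manifold, and the gradient is the Euclidean gradient of a discretized energy with respect to the vertex coordinates of a closed polygon, not the $H_1$-gradient of 8.36. The names `energy`, `grad`, `flow` and `ellipse` are hypothetical.

```python
import math

def energy(x):
    """Discrete energy 1/2 * n * sum |x_{i+1} - x_i|^2 of a closed polygon x with n vertices."""
    n = len(x)
    return 0.5 * n * sum((x[(i + 1) % n][k] - x[i][k]) ** 2
                         for i in range(n) for k in range(2))

def grad(x):
    """Euclidean gradient of the discrete energy with respect to the vertex coordinates."""
    n = len(x)
    return [tuple(n * (2.0 * x[i][k] - x[(i + 1) % n][k] - x[i - 1][k]) for k in range(2))
            for i in range(n)]

def flow(x, step, iters):
    """Explicit Euler steps of ds x = -grad J(x); returns the final polygon and all energies."""
    energies = [energy(x)]
    for _ in range(iters):
        g = grad(x)
        x = [tuple(xi[k] - step * gi[k] for k in range(2)) for xi, gi in zip(x, g)]
        energies.append(energy(x))
    return x, energies

def ellipse(n=50):
    """Sample closed curve (an arbitrary illustrative choice)."""
    return [(2.0 * math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
            for i in range(n)]
```

With a step size of at most $1/(4n)$ the explicit scheme is stable for this quadratic energy, so the recorded energies are non-increasing, and for a very small step the decrease per unit time is close to $-|\mathrm{grad}\, J|^2$. In this flat model every closed curve shrinks toward a constant curve, the analogue of the set $J^{-1}(0)$ of 8.33.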
Let the singular cycle (or curve) $z$ be a representative of any nontrivial homology class (or homotopy class) $H$ of the pair $(H_1(S_1, M), \tilde M)$ (cf. 8.33). The range (or carrier) of $z$ is a compact subset of $H_1(S_1, M)$ on which $J$ assumes a maximum. We make the following definition.

8.45. Definition:
$$c_H = \inf_{z \in H}\ \Big(\max_{f \in \mathrm{range}\, z} J(f)\Big).$$
$c_H$ is called the critical value of $H$.

8.46. Theorem: $c_H$ is a critical level of $J$.

Proof: We assume $c_H > 0$ since 0 is a critical level of $J$. If $c_H$ is not a critical level, then for some $\delta > 0$ and $\varepsilon > 0$ it follows by 8.42 that $\|\mathrm{grad}\, J\| \ge \varepsilon$ on $J^{-1}([c_H - \delta, c_H + \delta])$. By definition of $c_H$ we can find $z \in H$ such that $\max_{f \in \mathrm{range}\, z} J(f) \le c_H + \delta$. Now deform $z$ using the gradient deformation for $0 \le s \le 2\delta/\varepsilon^2$. Then $\phi(2\delta/\varepsilon^2, z) \in H$ and
$$\max_{f \in \mathrm{range}\, \phi(2\delta/\varepsilon^2,\, z)} J(f) \le c_H - \delta$$
by 8.44, contradicting the definition of $c_H$.
Q.E.D.
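The arithmetic behind the deformation time $2\delta/\varepsilon^2$ in the proof of 8.46 can be spelled out as follows. This is a sketch under the stated hypotheses ($\|\mathrm{grad}\, J\| \ge \varepsilon$ on the band $J^{-1}([c_H - \delta, c_H + \delta])$ and $J(f) \le c_H + \delta$ at the start), writing $\phi(s, f)$ for the gradient deformation of 8.43.

```latex
\frac{d}{ds} J(\phi(s,f)) = -\|\operatorname{grad} J(\phi(s,f))\|^2 \le -\varepsilon^2
  \quad\text{as long as } J(\phi(s,f)) \in [c_H - \delta,\, c_H + \delta],
\]
\[
J(\phi(s,f)) \le (c_H + \delta) - \varepsilon^2 s ,
\qquad
J\bigl(\phi(2\delta/\varepsilon^2, f)\bigr) \le (c_H + \delta) - 2\delta = c_H - \delta .
```

If the flow line leaves the band before time $2\delta/\varepsilon^2$, it can only do so through the level $c_H - \delta$, and since $J$ is non-increasing along flow lines the same bound holds afterwards.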

The preceding theorem does not by itself imply the existence of a single nontrivial closed geodesic since the possibility $c_H = 0$ is not excluded and since we do not know the existence of nontrivial homology or homotopy classes of $(H_1(S_1, M), \tilde M)$. The remainder of our reasoning is aimed at overcoming this difficulty.

8.47. Theorem: There exists $\varepsilon > 0$ such that $\tilde M$ (cf. 8.33) is a deformation retract of $J^{-1}([0, \varepsilon])$.

Remark: From 8.47 it follows $c_H > 0$ for any nontrivial homology (or homotopy) class $H$ of $(H_1(S_1, M), \tilde M)$.
The proof of 8.47 will follow from Corollary 8.49.

8.48. Theorem: There exists $\varepsilon > 0$ such that the flow lines of the gradient deformation (cf. 8.43) which start in $J^{-1}([0, \varepsilon])$ have uniformly bounded length as $s \to \infty$, and such that each of these flow lines has a well defined limit point in $\tilde M$ as $s \to \infty$.
8.49. Corollary: The uniform boundedness of the length of the flow lines which start in $J^{-1}([0, \varepsilon])$ implies that for these flow lines the end point in $\tilde M$ and the length depend continuously on the starting point. Consequently we can parametrize these flow lines proportional to arc length and thus get a retraction of $J^{-1}([0, \varepsilon])$ to $\tilde M$, proving Theorem 8.47.
Proof of Theorem 8.48: We shall find $\varepsilon > 0$ such that for $f \in J^{-1}([0, \varepsilon])$ we have
(1)  $\|\mathrm{grad}\, J(f)\|^2 \ge \tfrac12 J(f)$.
This estimate implies 8.48 in the following manner. By 8.44 the flow line starting at $\phi(0)$ satisfies
$$\int_0^s \|\mathrm{grad}\, J(\phi(\sigma))\|^2\, d\sigma = J(\phi(0)) - J(\phi(s)) \le \varepsilon.$$
Therefore $\|\mathrm{grad}\, J(\phi(s))\|$ is not bounded away from 0 on a flow line. By the Palais-Smale condition, there exists a sequence $\phi(s_n)$ converging to a critical point. By (1) it follows that $\lim_{n \to \infty} \phi(s_n) \in \tilde M$. Since $J(\phi(s))$ is monotone decreasing this proves $\lim_{s \to \infty} J(\phi(s)) = 0$. Therefore, using 8.44 and (1), we obtain the uniform bound for the length:
(2)  $$L(\phi) = \int_0^\infty \Big\|\frac{\partial\phi}{\partial s}\Big\|\, ds = \int_0^\infty \|\mathrm{grad}\, J(\phi(s))\|\, ds = \int_0^\infty \frac{-\,dJ/ds}{\|\mathrm{grad}\, J(\phi(s))\|}\, ds \le \int_0^{J(\phi(0))} \frac{dJ}{\sqrt{J/2}} = 2\sqrt{2J(\phi(0))} \le 2\sqrt{2\varepsilon}.$$
If a flow line had two different limit points in $\tilde M$, its length would have to be infinite. Thus (2) proves
$$\lim_{s \to \infty} \phi(s) \in \tilde M,$$
as desired.
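A lower bound of $\|\mathrm{grad}\, J(f)\|^2$ by a fixed multiple of $J(f)$, which is what estimate (1) asserts for curves of small energy, can be verified directly in the flat case by Fourier series. The sketch below assumes a flat target in which $D/\partial t = d/dt$, so that the equation of 8.38 becomes $y'' - y = f''$ and the Fourier coefficients satisfy $\hat y_k = \frac{\omega_k^2}{1 + \omega_k^2}\hat f_k$ with $\omega_k = 2\pi k$. The function name `flat_gradient_ratio` and its input format are hypothetical, and the computation is an illustration only, not the coordinate argument used in the text for curved $M$.

```python
import math

def flat_gradient_ratio(coeffs):
    """Return ||grad J(f)||^2 / J(f) for a flat target.

    coeffs maps a nonzero integer frequency k to the amplitude of that Fourier mode of f.
    A common normalizing factor of the modes cancels in the ratio.
    """
    two_j = 0.0      # proportional to  integral |f'|^2 dt = sum  w^2 |a_k|^2
    grad_sq = 0.0    # proportional to  ||y||_{H_1}^2     = sum (1 + w^2) |y_k|^2
    for k, a in coeffs.items():
        w2 = (2.0 * math.pi * k) ** 2
        two_j += w2 * a * a
        grad_sq += w2 * w2 / (1.0 + w2) * a * a
    return grad_sq / (0.5 * two_j)
```

Each mode contributes the factor $2\omega_k^2/(1 + \omega_k^2)$, which is smallest for $|k| = 1$; the ratio therefore never drops below $8\pi^2/(1 + 4\pi^2)$, comfortably above the constant needed in (1). In the curved case the text recovers a weaker constant by controlling the deviation of the metric from the flat one in a coordinate system.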
To complete the proof of 8.48, it only remains to prove (1). We do this by introducing local coordinates on $M$. Let $\varepsilon_1$, $C$, $m_1$, $m_2$ be as in Lemma 8.1 and in addition let $\varepsilon_1$ be so small that $m_2 - m_1 \le m_1/64$. We define the quantity $\varepsilon$ above as
(n)3'Z)_2)
= mi(2s(8C (n = dim M).

By Corollary 8.4, every $f \in J^{-1}([0, \varepsilon])$ satisfies $L_M(f) < 2\varepsilon_1$, and therefore is completely contained in the domain of some geodesic parallel coordinate system. Since both sides of (1) are continuous it suffices to prove (1) for $f \in C^2$. Then by 8.38 grad $J(f)$ is the periodic solution of
(3)  $$\frac{D^2}{dt^2}\, y - y = \frac{D}{dt}\dot f$$
(where here and in the following proof we write $y$ for grad $J(f)$ and $\dot f$ for $\partial f/\partial t$).
Moreover, by the proof of 8.38, $Dy/dt$ is absolutely continuous. Next define
(4)  $$z = \frac{Dy}{dt} - \dot f.$$
We find using (3) that
(5)  $$\frac{D}{dt}\, z = \frac{D^2 y}{dt^2} - \frac{D}{dt}\dot f = y \quad \text{and therefore} \quad \frac{D^2 z}{dt^2} - z = \dot f.$$
From 8.37 we have $\|y\|^2 = \int_0^1 \big(\frac{Dy}{dt}, \dot f\big)\, dt$ and hence, using (4), we get
(6)  $$\|y\|^2 = \int_0^1 (z + \dot f, \dot f)\, dt = 2J(f) + \int_0^1 (z, \dot f)\, dt.$$

We shall prove (1) by estimating $\int_0^1 (z, \dot f)\, dt$ in (6) as follows. Let $\dot f^i(t)$ and $z^i(t)$ be the coordinates of $\dot f(t)$ and $z(t)$ in a parallel coordinate system whose domain contains $f$ (see choice of $\varepsilon_1$ above). Then
(7)  $$\int_0^1 (z(t), \dot f(t))\, dt = \int_0^1 \delta_{ik}\, z^i(0)\, \dot f^k(t)\, dt + \int_0^1 \delta_{ik}\, \big(z^i(t) - z^i(0)\big)\, \dot f^k(t)\, dt + \int_0^1 \big\{ (z(t), \dot f(t)) - \delta_{ik}\, z^i(t)\, \dot f^k(t) \big\}\, dt.$$
Since $f$ is a closed curve contained in a single coordinate system we have
(8)  $$\int_0^1 \delta_{ik}\, z^i(0)\, \dot f^k(t)\, dt = 0.$$
Before we estimate the next term observe that we have from (4) and (5), using 8.37 again,
(9)  $$\|z\|^2 = \int_0^1 \Big\{ (z, z) + \Big(\frac{Dz}{dt}, \frac{Dz}{dt}\Big) \Big\}\, dt = \int_0^1 \Big\{ \Big(\frac{Dy}{dt} - \dot f,\ \frac{Dy}{dt} - \dot f\Big) + (y, y) \Big\}\, dt = 2J(f) - \|y\|^2.$$
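Lemma 8.2, 8.1 (1), the Cauchy-Schwarz inequality, and (9) give
(10)  $$\Big| \int_0^1 \big\{ (z(t), \dot f(t)) - \delta_{ik}\, z^i(t)\, \dot f^k(t) \big\}\, dt \Big| \le \int_0^1 16\, \frac{m_2 - m_1}{m_1}\, |z(t)|\, |\dot f(t)|\, dt \le 16\, \frac{m_2 - m_1}{m_1}\, \|z\|\, \sqrt{2J(f)} \le 16\, \frac{m_2 - m_1}{m_1}\, 2J(f).$$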

To estimate the second term in (7) note that = y (cf. (5)) is in coordi-
nates equivalent to dz

(t) = y'(:) - r df(t)) z'(t) At),


z

so that integration and 8.1 (1) give

(11) Iz'(t) - z'(0)1 Al) dt + c t E Iz'(t)I If`(t)fI dt.


fr0 0 J.1
By the Cauchy-Schwarz inequality, 8.1 (1), and (9), we have
1/2 1 1/2
(12) Izf12 dt)
E Iz'(t)I If'(t)1 dt
0l.1
n (1'0 1 Y I.f`IZ dt
(fo t
)
< n Ilzll ;5 2J(f).
mi m1
Since
f'/;;j f
o r latkllk(t)I dt < n (f(t), f(t)) dt

and
1 8`k dT I fk(t) dtl

f Y_
if
-
01.k o
J0

1 +1 1/2 1 1/2
< ` (AT), Y(T)) dr (f (f(t), f(t)) dt
m1 (J o o JJJ

we obtain from (11), and (12) using also IIYII 2j(f) IIYII2 +J(f),

ifo ark(z`(t) - z(0))fk(t)dt


(1 3)
2J(f))3/2

5
+JJ(f))+C(n

1 (11yI12
m,
The estimates (8), (10) and (13) of the right side of (7) inserted in (6) give

(14) IIY112 J(f) (2 - 32


n12 - m,
MI
_
2m,
1 - 2C (n
m,
IIYII2
m,
so that $\|y\|^2 \ge \tfrac12 J(f)$ by the choices made for $\varepsilon_1$ and $\varepsilon$.
Q.E.D.
8.50. Theorem: On every compact Riemannian manifold $M$ of class $C^k$, $k \ge 6$, there is at least one nontrivial closed geodesic.

Proof: If $M$ is not simply connected, then by Theorem 8.46 and 8.47 there exists a closed geodesic in every nontrivial homotopy class of closed curves. If $M$ is simply connected there is a first nonvanishing homotopy group $\pi_l(M)$, $l \ge 2$. We claim that $\pi_{l-1}(H_1(S_1, M), \tilde M)$ is nontrivial. This together with Theorem 8.46 and the Remark following Theorem 8.47 implies the present theorem.
We will prove our claim for the case $M = S^2$, and indicate the modifications needed to treat the general situation thereafter. Consider the sphere $S^2$ in $R^3$ and a line $l$ tangent at $p$ to $S^2$. Take the tangent plane to $S^2$ at $p$ and rotate it around $l$, through 180°, until it is again tangent. The intersections of the intermediate planes with $S^2$ form a family of circles. Parametrize the intermediate planes with a parameter $s$ running from 0 to 1 and call the corresponding circles of intersection $c(s)$. Then $c(0)$ and $c(1)$ are constant curves (cf. 8.33). Parametrize each circle with a parameter $t$ running from 0 to 1, so that $c(s)(0) = c(s)(1) = p$ (e.g. take $t$ proportional to arc length on $c(s)$). Then $c: [0, 1] \to H_1(S_1, M)$ is a curve in $H_1(S_1, M)$ which represents an element of $\pi_1(H_1(S_1, M), \tilde M)$. We claim that this homotopy element is nontrivial.
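The family of circles $c(s)$ just described can be written down explicitly. The following sketch fixes coordinates that are not specified in the text: $S^2$ is the unit sphere in $R^3$, $p$ is the north pole, and the tangent line $l$ through $p$ points in the $x$-direction. The name `circle_family` is hypothetical.

```python
import math

P = (0.0, 0.0, 1.0)   # base point p (north pole of the unit sphere, an arbitrary choice)

def circle_family(s, t):
    """Point c(s)(t) on the circle cut out of the unit sphere by the plane through the
    tangent line l (direction e_x at p), rotated by the angle pi * s from the tangent plane."""
    a = math.pi * s
    n = (0.0, math.sin(a), math.cos(a))             # unit normal of the rotated plane
    r = math.sin(a)                                  # radius of the circle of intersection
    center = tuple(math.cos(a) * c for c in n)       # point of the plane closest to the origin
    u = (0.0, -math.cos(a), math.sin(a))             # unit vector from the center toward p
    e = (1.0, 0.0, 0.0)                              # direction of the tangent line l
    b = 2.0 * math.pi * t                            # t proportional to arc length
    return tuple(center[k] + r * (math.cos(b) * u[k] + math.sin(b) * e[k]) for k in range(3))
```

For $s = 0$ and $s = 1$ the radius vanishes, so $c(0)$ and $c(1)$ are the constant curve at $p$; for $s = \tfrac12$ the plane is $y = 0$ and $c(\tfrac12)$ is a great circle. Every circle starts and ends at $p$, which is the boundary behaviour used in the identification argument that follows.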
To prove this, first consider the map c̃ : I × I → S^2 = M given by c̃(s, t)
= c(s)(t). Note that c̃(s, 0) = c̃(s, 1) for each s and c̃(0, t) = p = c̃(1, t)
for all t. We shall make boundary identifications in I × I so that the square
becomes a sphere and c̃ induces a map c* : S^2 → S^2 = M such that c* is
homotopic to the identity and therefore homotopically nontrivial. This is
done as follows. For each s ≠ 0, 1 identify the two points (s, 0) and (s, 1);
for s = 0 identify all the points (0, t) to a single point; and for s = 1 identify
all the points (1, t) to a single point.
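
The family of circles can be written down explicitly, which allows a numerical sanity check that the map (s, t) → c(s)(t) covers the sphere with degree 1. The sketch below is only an illustration under assumptions not made in the text: S^2 is the unit sphere, p = (0, 0, 1), l is the tangent line at p parallel to the x-axis, and t is proportional to arc length. The names circle_point and degree_estimate are hypothetical. The degree is estimated as (1/4π) times the integral of c · (∂c/∂s × ∂c/∂t) over the unit square.

```python
import math

def circle_point(s, t):
    """Point c(s)(t) on the unit sphere (assumed setup: p = (0, 0, 1), l parallel to the x-axis).

    The plane at parameter s contains l and has unit normal (0, sin(pi s), cos(pi s));
    its circle of intersection has radius sin(pi s) and passes through p at t = 0.
    """
    a, b = math.pi * s, 2.0 * math.pi * t
    x = math.sin(a) * math.sin(b)
    y = math.sin(a) * math.cos(a) * (1.0 - math.cos(b))
    z = math.cos(a) ** 2 + math.sin(a) ** 2 * math.cos(b)
    return (x, y, z)

def degree_estimate(n=100, h=1e-5):
    """Midpoint-rule estimate of (1/4pi) * integral of c . (c_s x c_t) over [0,1] x [0,1].

    Partial derivatives are approximated by central differences with step h.
    """
    total = 0.0
    for i in range(n):
        s = (i + 0.5) / n
        for j in range(n):
            t = (j + 0.5) / n
            c = circle_point(s, t)
            cs = [(u - v) / (2 * h)
                  for u, v in zip(circle_point(s + h, t), circle_point(s - h, t))]
            ct = [(u - v) / (2 * h)
                  for u, v in zip(circle_point(s, t + h), circle_point(s, t - h))]
            cross = (cs[1] * ct[2] - cs[2] * ct[1],
                     cs[2] * ct[0] - cs[0] * ct[2],
                     cs[0] * ct[1] - cs[1] * ct[0])
            total += sum(ci * xi for ci, xi in zip(c, cross))
    return total / (n * n) / (4.0 * math.pi)

if __name__ == "__main__":
    print(degree_estimate())
```

With these choices the integrand works out to 2π² sin(πs)(1 − cos 2πt), which is nonnegative and integrates to 4π, so the estimate should be close to 1. A map homotopic to a constant would give 0.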
Assume now that c represents a trivial element of π_1(H_1(S^1, M), M).
Then there is a deformation Φ_τ of c in H_1(S^1, M) which deforms c to a
curve d : [0, 1] → M and which leaves the end points of c fixed, i.e. Φ_τ(0)
= c(0) = p = c(1) = Φ_τ(1) for all τ. We can assume that Φ_τ is a
differentiable deformation. Using Lemma 8.25, we can interpret Φ_τ as a homotopy
of the map (s, t) → c(s)(t) on M. Because Φ_τ(s)(0) = Φ_τ(s)(1) for each τ
and s (we deal with spaces of closed curves), and because Φ_τ(0)(t) = p
= Φ_τ(1)(t) for all t, Φ_τ gives a homotopy of c*. Now note that the curves
Φ_1(s) = d(s) are constant curves on M, i.e. Φ_1(s)(t) is independent of t,
and note also that d(0) = d(1) = p. We have thus constructed a homotopy
from the nontrivial map c* : S^2 → S^2 = M to a map d* : S^2 → S^2 = M such
that the image d*(S^2) lies on the closed curve in S^2 = M which is given by
s → Φ_1(s)(0). Clearly d* is homotopically trivial, a contradiction.
In regard to the general case, we remark only that starting with a
nontrivial element of the first nonvanishing homotopy group π_l(M), i.e. with a
(differentiable) map F : S^l → M, one may define an associated (l − 1)-parameter
family of circles on S^l such that the F-image of the family of circles
represents a differentiable element of π_{l-1}(H_1(S^1, M), M). If this element is
trivial, one constructs, as above, a homotopy which deforms F to a map G
which can be considered as an element of π_{l-1}(M) and is therefore trivial by
the choice of π_l(M), a contradiction which proves our theorem in the general
case. Q.E.D.

Comments on Further Developments of the Theory of Closed Geodesics

When one tries to prove the existence of more than one nontrivial closed
geodesic, one runs into the disturbing fact that associated with each closed
geodesic one automatically has a one-parameter family of geodesics arising
from the various different starting points for the parametrization of a closed
curve. (Note however that critical points of J are always parametrized
proportional to arc length.) It is easy to prove that the action of O(2) on
H_1(S^1, M), which for α ∈ O(2) is given by f(t) → f(t + α) (t and α of course
taken mod 1) is continuous. By identifying orbits we then get a new space
Π(M) (in Klingenberg's notation) in which a single nontrivial closed geodesic
is represented by exactly one point. (Multiple coverings of a geodesic are
not identified to a single point by this process.) The space Π(M) is no longer
a manifold, but since the Riemann scalar product of H_1(S^1, M) and the
energy function are compatible with the action of O(2) (i.e., are equivariant
under the action of O(2)) one can prove many statements concerning the
space Π(M) by "lifting" them to the manifold H_1(S^1, M). For example, the
gradient deformation in H_1(S^1, M) induces an energy decreasing deformation
in Π(M), so that Theorem 8.47 also holds for Π(M). This is remarkable
since the result seems to be inaccessible via the classic techniques using
broken geodesics. In a paper to appear in the journal Topology, W. Klingenberg
gives a complete description of the Z_2-homology of both H_1(S^1, M)
and Π(M) for M = S^n. His method of calculation also applies to the projective
spaces and the other symmetric manifolds of rank 1. Using this information
he obtains a number g(n) of "algebraically different" nontrivial
closed geodesics for the case of M = S^n; specifically g(n) = 2n − s − 1
where 0 ≤ s = n − 2^k < 2^k. "Algebraically different" means that the energies
of these g(n) geodesics are the critical values (8.45) corresponding to
g(n) pairwise subordinated homology classes; here and below, we call a
homology class α subordinated to a homology class β if there exists a cohomology
class ξ such that α can be written as a cap product α = ξ ∩ β. Unfortunately
the possibility cannot be excluded that all the geodesics obtained
are multiple coverings of a single geodesic.
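
The compatibility of the energy with the O(2) action can be checked on a discretized closed curve. The following is a minimal sketch, not the construction used in the text: it assumes J(f) = (1/2)∫|f'|² dt (the normalization of J is an assumption), replaces f by n equally spaced samples, restricts the rotations to multiples of 1/n, and includes the reflection t → −t as the remaining generator of O(2). The function names and the sample curve are hypothetical.

```python
import math

def energy(points):
    """Discrete stand-in for J(f) = 1/2 * integral of |f'(t)|^2 dt on a closed polygon.

    With step 1/n the difference quotient is n * (f[i+1] - f[i]), so the
    Riemann sum equals (n / 2) * sum of squared edge lengths.
    """
    n = len(points)
    total = 0.0
    for i in range(n):
        p, q = points[i], points[(i + 1) % n]
        total += sum((b - a) ** 2 for a, b in zip(p, q))
    return 0.5 * n * total

def rotate(points, k):
    """Rotation by alpha = k/n: samples of f(t + alpha)."""
    k %= len(points)
    return points[k:] + points[:k]

def reflect(points):
    """Reflection: samples of f(-t), with t taken mod 1."""
    return [points[0]] + points[:0:-1]

def sample_curve(n):
    """A hypothetical closed curve on the unit sphere, sampled at t = i/n."""
    pts = []
    for i in range(n):
        t = i / n
        theta = math.pi / 2 + 0.3 * math.sin(4 * math.pi * t)
        phi = 2 * math.pi * t
        pts.append((math.sin(theta) * math.cos(phi),
                    math.sin(theta) * math.sin(phi),
                    math.cos(theta)))
    return pts

if __name__ == "__main__":
    pts = sample_curve(360)
    print(energy(pts), energy(rotate(pts, 45)), energy(reflect(pts)))
```

The three printed values agree up to rounding, since a rotation or reflection only permutes the edges of the polygon. Every orbit therefore consists of curves of equal energy, which is what allows the energy to descend to the orbit space.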
The same result was proved using different methods by S. L. Alber. (On
periodicity problems in the calculus of variations in the large, Amer. Math.
Soc. Transl. (2) 14 (1960).) A. I. Fet proved that on every compact manifold
there are at least 3 algebraically different nontrivial closed geodesics. (On
the algebraic number of closed extremals on a manifold, Dokl. Akad. Nauk
SSSR (N.S.) 88 (1953), 619-621). Klingenberg obtains this result also.
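
For orientation, the count g(n) quoted above for the spheres can be tabulated. The sketch assumes that the condition on s is read as n = 2^k + s with 0 ≤ s < 2^k, i.e. that 2^k is the largest power of 2 not exceeding n; the function name g is taken from the text, but this implementation is only an illustration of that reading.

```python
def g(n):
    """g(n) = 2n - s - 1, where n = 2**k + s and 0 <= s < 2**k (assumed reading)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    k = n.bit_length() - 1      # largest k with 2**k <= n
    s = n - 2 ** k
    return 2 * n - s - 1

if __name__ == "__main__":
    for n in range(1, 9):
        print(n, g(n))
```

Under this reading g(n) = n + 2^k − 1, so that g(2) = 3, g(3) = 4 and g(4) = 7, and g(n) ≥ 3 whenever n ≥ 2.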
No criterion is known which allows one to decide whether two algebraically
different geodesics are also geometrically different (i.e. whether the
underlying simply covered closed geodesics are different). However, Lusternik and
Schnirelmann (Sur le problème de trois géodésiques fermées sur les
surfaces de genre 0, C. R. Acad. Sci. Paris 189 (1929), 269-271) showed that on
manifolds of the type of the 2-sphere there exist 3 closed geodesics without
self intersections.
Fet proves the following result: If all closed geodesics on a compact
manifold are nondegenerate as critical points of J then there are at
least 2 prime geodesics (A periodic problem in the calculus of variations,
Dokl. Akad. Nauk SSSR (N. S.) 160 (1965), 287-289). Alber and Klingenberg
announced that under restrictions on the curvature of a manifold
M (1/4 < min_M K / max_M K < 1) one can prove without difficulty that the
geodesics constructed from subordinated homology classes are geometrically
different. Finally, Klingenberg announced that certain special closed
geodesics constructed from subordinated homology classes turn out to be
simple and without self intersection.
Index

Absolutely continuous maps 165
Associated bilinear form 31
Attaching of handles 137

Banach space 10
Bilinear forms 121
Borsuk's theorem 78
Bott periodicity theorem 197
Bounded set 10
  of mappings 107
B-space 10
Bundles,
  analytic 113
  direct sum of 113
  homomorphism of 113
  of class C' 113
  smooth linear 112
  sub bundles of 113
  tangent 115

C^1 mappings in R^n 61
Calculus of variations in the large 162
Category theory 155
  and homology 158
  principal theorem of 164
Cohomology ring of a group-like space 189
Compact mappings 26
Complete Riemannian manifold 126
Complex analytic mapping 30
Contracting mapping principle 14
Convex set 9
Coordinate of a vector 104
Critical neck principle 139
Critical points, global study of 137
Critical points of functions 132
Cup-length 189
  of a space 161
Cup product 160
Curve on a manifold 102

Degree,
  and generalized Jordan's theorem for Banach spaces 92
  multiplicative property of 74
  of a continuous mapping 70
  of finite dimensional perturbations of the identity 84
  theory 55
Derivative, Gateaux 11
Diffeomorphic manifolds 101
Dimension of a compact metric space 156
Domain invariance 77

Embedding of Riemannian manifolds 43
Equicontinuous set of mappings 107
Exactness principle 149
Excision property of homology 149

F-differentiable function 11
Feebly continuous mapping 22
Fixed point theorems 96
Frechet differentiable function 11
Frechet space 9
Freudenthal suspension relation 185
F-space 9

Gateaux derivative 11
Geodesics on a finite-dimensional manifold 172
Germ of smooth functions 102
Gradient of a function 126

Handles, attaching of 137
Hard implicit functional theorems 33
Hessian of a function 133
Higher differentials 28
Hilbert manifolds 169
Homology of n-sphere 149
Homology sequence 149
Homotopically equivalent spaces 148
Homotopy,
  of Lie groups 189
  theory 181
Horizontal function 11
Hurewicz isomorphism theorem 188

Implicit function theorem 15
  hard 33
  soft 14

Jordan separation theorem 75

Kirszbraun lemma 19

Length of a curve on a Riemannian manifold 124
Locally compact mapping 26
Locally convex space 9

Manifold,
  of curves 168
  Riemannian 121
  smooth 100
  tangent space to 103
Mapping horizontal at P 102
Minty's theorem 22
Monotone mapping 19
Morse index theorem 175
  inequalities 148
  lemma on critical points 136
  theory, applications of 181

Nash implicit functional theorem 33
Newton's method 33
Non-critical neck principle 127
Non-degeneracy theorem 175
Non-degenerate critical point 133

Palais-Smale condition 130, 171

Quadratic form 31

Regularly imbedded submanifold 120
Relative cubical groups of a pair 159
Riemannian manifold 121
  geodesic manifold 175

Sard's lemma 55
Section of a bundle 113
Set,
  of first category 155
  of K-th category 155
Singular cubical chain group 158
  N cube 158
Slightly continuous mapping 22
Smooth linear bundles 112
Smooth manifold 100
Strictly monotone mapping 19
Strongly monotone mapping 18

Tangent bundle 115
Tangent space,
  to a manifold 103
  to manifold of curves 168
  vectors to a manifold 102
Taylor's theorem for B-spaces 28
Topological linear space 9

...mar field on a manifold 116

NONLINEAR FUNCTIONAL ANALYSIS
by J.T. Schwartz, Courant Institute of Mathematical Sciences, New York
University, USA

This book delves into the subject of nonlinear analysis within the context of
infinite dimensional topological spaces and manifolds. It aims to extend
known theorems of nonlinear analysis from the finite to the infinite
dimensional case and to analyze difficulties which arise in the infinite dimensional
case. The authors address calculus on a basic level and work their way up to
closed geodesics on topological spheres. Mathematicians will find this a clear
explication of the theorems and applications in nonlinear functional analysis.

GORDON AND BREACH SCIENCE PUBLISHERS ISBN 0-677-01500-3


NEW YORK LONDON PARIS MONTREUX TOKYO ISSN 0888-6113
