Mark Rusi Bol

Calculus of Variations (with notes on
innite-dimesional calculus)
J. B. Cooper
Johannes Kepler Universitat Linz
Contents
1 Introduction 2
1.1 Methods employed in the calculus of variations. . . . . . . . . 3
2 The Riemann mapping theorem. 5
3 Analysis in Banach spaces 7
3.1 Dierentiation and integration of E-valued functions . . . . . 7
3.2 The Bochner integral . . . . . . . . . . . . . . . . . . . . . . . 12
3.3 The Orlicz-Pettis Theorem . . . . . . . . . . . . . . . . . . . . 15
3.4 Holomorphic E-valued functions . . . . . . . . . . . . . . . . . 16
3.5 Dierentiability of functions on Banach spaces . . . . . . . . . 20
4 The calculus of variations 29
4.1 The Euler equations application . . . . . . . . . . . . . . . . 29
4.2 The Weierstra representation: . . . . . . . . . . . . . . . . . . 37
4.3 Applications to Physics: . . . . . . . . . . . . . . . . . . . . . 43
1
1 Introduction
The calculus of variations deals with the problem of maximizing or minimiz-
ing functionals i.e. functions which are dened not on subsets of R
n
but on
spaces of functions. As a simple example consider all parameterized curves
(dened on [0, 1]) between two points x
0
and x
1
on the plane. Then the
shortest such curve is characterized as that curve c with c(0) = x
0
, c(1) = x
1
which minimizes the functional
L(c) =
_
1
0
c(t)
dt
Of course this example is not particularly interesting since we know that
the solution is the straight line segment from x
0
to x
1
. However, it does
become interesting and non trivial if we consider points in higher dimensional
space and consider only those curves which lie on a given curved surface (the
problem of geodetics).
Further examples:
1. The Brachistone Problem. Given are two points O (the origin) and P
with y
P
< 0. Find the curve from O to P for which
_
c
1
y
ds =
_
1
0
1
_
c
2
(t)
_
c
2
1
(t) + c
2
2
(t)dt
is a minimum.
2. Minimal surfaces of revolution. Given are points P = (x
0
, y
0
), Q =
(x
1
, y
1
) with x
0
< x
1
, y
0
> 0, y
1
> 0.
We are looking for that function f(x) so that f(x
0
) = y
0
, f(x
1
) = y
1
which minimizes the functional
_
x
1
x
0
f(x)
_
1 +
_
f
(x)
_
2
dx
3. Hamiltonian mechanics. We have a mechanical system, where posi-
tions can be dened by n generalized coordinates. The Lagrangian is
a function of the form
L(t, x, x) = T(x, x) U(t, x)
where T is the kinetic energy and U is potential energy (the particular
form of L is dictated by the physics of the system). Then the depen-
dency of the position of an object of the system is described by the fact
2
that the corresponding n-dimensional curve c minimizes the functional
I(c) =
_
t
1
t
0
L
_
t, c(t), c(t)
_
dt.
4. The Dirchlet problem. U is a region in R
n
with smooth boundary U
and a function g is given on the boundary. The Dirchlet problem is:
nd a harmonic function f on U with g as boundary values. Dirchlets
principle states that the solution is that function f which satises the
boundary condition and minimizes the functional
D(f) =
_
U
n
k=1
_
f
x
k
_
2
dx
1
. . . dx
n
(more precisely, is a stationary point).
Further problems which can be formulated in this way are: the isoperimetric
problem, Plateans problem (soap bubbles!), Newtons problem, Fermats
principle.
Examples of classical optimization problems.
1. The base b and perimeter a + b + c of a triangle ABC are given.
Which triangle has the largest area (solution: the isosceles triangle
with |AB| = |AC|).
2. Steiners problem. Given an acute triangle ABC, determine that point
P for which |PA| + |PB| + |PC| is a minimum. (Solution: P is the
point with A
PB = B

PC = C

PA = 120
).
3. A triangle ABC is given. Find P on BC, Q on CA, R on AB so
that the perimeter of PQC is minimal. (Solution: PQR is the pedal
triangle i.e. P is the foot of the altitude from A to BC etc.).
4. (The isoperimetric problem for polygons). Find that n-gon with xed
perimeter S so that the area is maximal. (Solution: the regular n-gon.)
1.1 Methods employed in the calculus of variations.
I. The direct method (cf. proofs of the Riemann mapping theorem, Radon-
Nikodym theorem, existence of best approximations in closed convex sets.)
II. The method of Ritz. One approximates an innite dimensional problem
by a sequence of nite dimensional ones with solution x
1
, x
2
, x
3
. . . Un-
der suitable conditions, this sequence will have a cluster point (limit of a
convergent subsequence) which is a solution to the original one.
3
Examples: We illustrate the latter method on some of the classical situa-
tions which lead naturally to variational problems:
I. The isoperimetric problem. In this case we approximate the original prob-
lem by the one for n-gons which, as we have seen, has the regular n-gon as
solution. With care, a suitable limiting process leads to the circle as solution
to the original problem.
II. The Dirichlet problem. Here we are looking for a smooth function on
{x
2
+y
2
1} which minimizes the functional
D =
_ _
(
2
x
+
2
y
)dxdy
and satises the boundary condition
(e
i
) = f()
for a suitable 2-periodic function f.
We indicate briey how to solve this problem. It is natural to work with
polar coordinates where the functional D has the form: D() =
_
2
0
_
1
0
_
2
r
+
1
r
2
_
rdrd.
We discretise this problem by using Fourier series, i.e., suppose that
f() =
a
0
2
+
n=1
(a
n
cos n +b
n
sin n)
Write =
1
2
f
0
(r) +
n=1
_
f
n
(r) cos n +g
n
(r) sinn
_
where the functions f
n
and g
n
satisfy the conditions f
n
(1) = a
n
, g
n
(1) = b
n
. Then
D() =
_
1
0
f
(r)
2
rdr +
1
_
1
0
_
f
n
(r)
2
+
n
2
r
2
f
n
(r)
_
rdr
+
n=1
_
1
0
_
g
n
(r)
2
+
n
2
r
2
g
n
(r)
2
_
rdr
Since each summand is independent we obtain a minimum by minimizing
each term. Hence we consider the problem
_
1
0
_
f
2
n
+
n
2
r
2
f
2
n
_
rdr = min, f(1) = a
n
This can be solved using the Ritz methodwe consider the restriction of the
functional to those functions of the form
f
n
(r) = c
0
+c
1
r + +c
m
r
m
, c
0
+ +c
m
= a
m
4
for a xed m > n and so reduce to a classical optimization problem with
solution f
n
(r) = r
n
.
Suummarising, this method leads to the well-known solution
(r, ) =
1
2
a
0
+
n=1
(a
n
r
n
cos n +b
n
r
n
sin n)
for the Dirichlet problem.
2 The Riemann mapping theorem.
As an application of the principle of optimizing a functional on a space of
functions we shall bring a sketch of a proof of the following famous result:
Proposition 1 (The Riemann mapping theorem) Let be a simply con-
nected proper subset of C. Then is conformably equivalent to the unit disc
U = {z C : |z| < 1}.
For the proof, we require the following facts.
1. Theorem of Ascoli-Arzela: A subset A of the Banach space C(K) (K a
compact metric space) is relatively compact if and only if it is uniformly
bounded and equicontinuous. Hence every sequence in such a set has
a uniformly convergent subsequence.
2. If is an open subset of C, then H(), the space of holomorphic
functions on , is a linear space and the natural topology on this space,
that of compact convergence, is a complete metric topology. A suitable
metric can be dened as follows: Write =
k
where each
k
is
relatively compact in
k
and dene
d(f, g) =
k
1
2
k
||f g||
k
1 +||f g||
k
where ||f g||
k
= sup{|f(z) g(z)| : z
k
}.
3. A uniformly bounded subset of H() is relatively compact for the
above topology and hence every uniformly-bounded sequence has a
subsequence which is uniformly convergent on compacta. This fol-
lows from 1) (with the help of the diagonal process) since a uniformly-
bounded sequence of holomorphic functions is equicontinuous on com-
pacta (Cauchy integral formula).
5
We now proceed to the proof of the Riemann mapping theorem. We x a
point z
0
and consider the family F of all functions f in H() which are
a) injective
b) such that f(z
0
) = 0 and f
(z
0
) > 0
c) map into U.
We shall rst show that F is non empty. Choose a C with a and
consider a branch of ln(z a) on (this exists since is simply connected).
This is, of course, injective. We even have that if it takes a value w
1
then it
does not take any of the values w
1
+2in, (for if ln(z
1
a) = ln(z
2
a)+2in,
then z
1
a = e
ln(z
1
a)
= e
ln(z
2
a)
e
2
i
= z
2
a).
The function ln(za) is open and so there is a disc of the form (ww
0
) <
around (z
0
a) which lies in the image of ln(z a). By the above reasoning,
the corresponding disc |ww
0
+2i| is not in the image of ln(za). Hence
w
0
= ln(z a) takes its values in the complement of the latter set. But this is
conformably equivalent to U. This provides a univalent function g : U.
The condition f
(z
0
) = 0, f
(z
0
) > 0 can be obtained by composing g with a
suitable Blaschke factor e
i
z z
1
1 az
1
where z
1
= g(z
0
) (Exercise).
We now set d = sup{f
(z
0
) : f F}. By the compactness, there is an
f F with f
(z
0
) = d. We claim that this f is surjective and so conformal
from to U. We do this by showing that if f were not surjective we can
construct a g F with g
(z
0
) > f
(z
0
) contradiction.
Suppose that w
1
U is not in the range of f. Once again, by composing
with the Blaschke function z
z w
0
1 w
0
z
we can assume that w
0
= 0. We let
F denote a branch of ln f on . This takes its values in the left half plane
{Re z < 0}. We compose with a Mobiustransformation with maps the latter
onto U and takes F(z
0
) onto 0 to get function G from into U i.e. the
mapping
w
w F(z
0
)
w +F(z
0
)
.
One computes that
G
(z
0
) = d
1 |w
0
|
2
2w
0
ln |w
0
|
Hence the function
g(z) =
G(z)|G
(z
0
)|
G
(z
0
)
6
is in F and
g
(z
0
) = d
1 |w
0
|
2
2|w
0
| ln
_
1
|w
0
|
_
which is > d (contradiction). (Consider the function
h(t) = 2 ln
1
t
+t
1
t
h(t) = 0 and h
(t) > 0. Hence 2 ln

1
t
<
1
t
t (t < 1) and so g
(z
0
) > d).
We remark that a similar method can be used to solve the Dirichlet
problem in two dimensions (see Ahlfors, Complex analysis, p. 196).
3 Analysis in Banach spaces
Since the calculus of variations is concerned with maximising functionals
on innite dimensional space, we consider briey the a bstract theory of
dierential calculus on such spaces.
3.1 Dierentiation and integration of E-valued func-
tions
Denition: Let be an open or closed interval in R. A function x : E
(E a Banach space) is dierentiable at t
0
if
lim
tt
0
x(t) x(t
0
)
t t
0
exists in E. If this is the case, its limit is the derivative of x at t
0
, denoted
by x
(t
0
). x is dierentiable on if it is dierentiable at each t . The
function x
: t x
(t) is then the derivative of x. If this derivative is a

continuous function, then x is said to be C
1
.
The following properties of the derivative are evident:
1. x is dierentiable at t
0
with derivative a if and only if the function
dened on by the equation
x(t) x(t
0
) = (t t
0
)a + (t t
0
)(t)
has limit 0 as t tends to t
0
;
2. if x
1
and x
2
are dierentiable at t
0
, then so is x
1
+x
2
and
(x
1
+x
2
)
(t
0
) = x
1
(t
0
) +x
2
(t
0
);
7
3. if T : E
1
E
2
is continuous and linear and x : E
1
is dierentiable
at t
0
, then so is T x and
(T x)
(t
0
) = T
_
x
(t
0
)
_
.
This concept of dierentiability unies some apparently unrelated notions of
analysis as the following examples show:
I. Let x be a function from [0, 1] into
. Then we can identify x with a

bounded sequence (x
n
) of functions on [0, 1] (where x(t) = (x
n
(t)). x is
continuous if and only if for each t
0
,
lim
h0
_
_
x(t
0
+h) x(t
0
)
_
_
= 0 i.e. lim
h0
sup
n
x
n
(t
0
+h) x
n
(t
0
)
= 0
This means that the x
n
are equicontinuous at t
0
.
Similarly, one shows that x is dierentiable at t
0
if and only if the following
conditions hold:
a) each x
n
is dierentiable at t
0
;
b) the sequence (x
n
(t
0
)) is bounded;
c) the functions x
n
are uniformly dierentiable at t
0
i.e. for each > 0
there is a > 0 so that if |h| <
x
n
(t
0
+h) x
n
(t
0
)
h
x
n
(t
0
)
<
for each n.
II. Now let E be the space C(J) (J a compact interval). A function x :
C(J) can be regarded as a function x from J into R (put x(s, t) =
(x(s))(t) for x , t J). Then one can check that
a) x is continuous if and only if x continuous on J;
b) x is a C
1
-function if and only if D
1
x(s, t) exists for each s, t, is con-
tinuous as a function on J and the following condition holds: for
each s
0
, the dierence quotients
x(s
0
+h, t) x(s
0
, t)
h
tend to
D
1
x(s
0
, t) uniformly in t (over J).
We denote by C
1
(; E) the space of continuously dierentiable functions
from into E. Similarly, we can dene recursively the concept of an r-
times continuously dierentiable function or C
r
-function, by saying
that x is C
r
if x
exists and is C
r1
. The r-th derivative x
(r)
of x is then
dened to be (x
)
(r1)
.
8
Proposition 2 (Mean value theorem) If x C
1
(; E), then
_
_
x(t) x(s)
_
_
|t s| sup
_
_
_
x
(u)
_
_
: u ]s, t[
_
for s, t .
Proof. By the Hahn-Banach theorem, there is, for each s, t, an f E
so that ||f|| = 1 and f(x(t) x(s)) = ||x(t) x(s)||. Then, by the classical
mean value theorem applied to the scalar function f x,
_
_
x(t) x(s)
_
_
= f
_
x(t) x(s)
_
= (t s)f
_
x
(u
0
)
_ _
u
0
]s, t[
_
|t s| ||f|| sup
_
_
_
x
(u)
_
_
: u ]s, t[
_
Later in this chapter, we shall discuss the Bochner integral which is the
analogue of the Lebesgue integral for vector-valued functions. In the mean-
time we introduce a very elementary integral which suces for many pur-
poses. We consider functions from a compact interval I = [a, b] in R with
values in E. A function : I E is a step function if it has a represen-
tation
a
i
I
i
(a
1
, . . . , a
n
E) where the I
i
are suitable subintervals (open,
half-open or closed) of I. The step functions form a vector subspace St (I, E)
of
(I, E), the Banach space of bounded functions from I into E. We denote
its closure therein by R(I; E)the space of regulated functions.
We dene a linear mapping : St(I; E) E by specifying that (a
i
i
) =
(I
i
)a
i
((I
i
) is the length of I
i
) and extending linearly (the usual diculties
concerning the well-denedness of this extension can be resolved as in the
scalar case. Alternatively one can reduce the problem to the scalar one by
using the Hahn-Banach theorem).
Lemma 1 is a continuous linear mapping from St(I; E) into E and |||| =
(I) = b a.
Proof. If x St(I; E) then it has a representation of the form
i
a
i
I
i
where the I
i
are disjoint. Of course, ||x|| = max{||a
k
|| : k = 1, . . . , n} and
_
_
(x)
_
_
=
_
_
_
(I
k
)a
k
_
_
_
(I
k
)||a
k
||
(I)||x||
and so |||| (I). That the inequality is actually an equality is easy to see
(consider the constant functions).
9
Hence has a unique extension to a continuous linear operator from
the the denite integral of x (also written
_
b
a
x(t)dt or simply
_
I
x). It is
easy to check that if x R(I; E), T L(E, F) then T x R(I; F) and
_
I
T x = T(
_
I
x) (the formula holds trivially for step functions and the
general result follows by approximation). We consider some examples:
I. Let x be a function from I into
, say
x(t) =
_
x
1
(t), x
2
(t), . . .
_
.
Then x is regulated if and only if the (x
n
) are uniformly regulated i.e. for
each > 0, we can nd a partition {I
1
, . . . , I
k
} of I so that each x
n
can be
approximated up to by step functions which are constant on the I
i
. Then
we have
_
x =
__
x
n
_
Note that if x takes its values in c
0
, then the above rather articial condition
is equivalent to the more natural one that each x
n
be regulated.
II. Let x : I C(J) be continuous. Then x is (of course) integrable and the
integral is the classical parameterized integral
t
_
x(s, t)ds
( x(s, t) := x(s)(t)).
Proposition 3 (fundamental theorem of calculus) If x C(I; E) then X :
s
_
s
a
x(t)dt is dierentiable and X
= x (X is a primitive of X).
Proof.
_
_
X(s) X(s
0
) (s s
0
)x(s
0
)
_
_
=
_
_
_
_
_
s
s
0
x(t)dt (s s
0
)x(s
0
)
_
_
_
_
=
_
_
_
_
_
s
s
0
_
x(t) x(s
0
)
_
dt
_
_
_
_
|s s
0
| max
_
_
_
x(u) x(s
0
)
_
_
: u ]s
0
, s[
_
and the expression in brackets tends to zero as s tends to s
0
(by the continuity
of x).
10
As a simple corollary we have the formula
_
b
a
x(t)dt = X(b) X(a)
for the denite integral where X is any primitive of x. For any two primitives
dier by a constant.
Using the concepts of dierentiation and integration of Banach space
valued functions we can formulate and prove an abstract existence theorem
for dierential equations which unites many classical results. We consider
equations of the form
dx
dt
= f(t, x)
where the solution is a function on R with values in a Banach space. More
precisely, U is open in RE, f is a function from U into E and a C
1
-function
x : I E (I an open interval in R) is sought whereby the condition
() for each t I,
_
t, x(t)
_
U and x
(t) = f
_
t, x(t)
_
is to be satised.
If we specialize say to E = R
n
we get the system of equations
dx
1
dt
= f
1
(t, x
1
, . . . , x
n
), . . . ,
dx
n
dt
= f
n
(t, x
1
, . . . , x
n
)
and so, by a standard trick, n-th degree equations of the form:
d
n
x
dt
n
= f(t, x, . . . , x
(n1)
).
For the special case E = C(J) R
n
we get systems of equations with a
parameter. Hence the following existence theorem contains several classical
results as special cases:
Proposition 4 Let f : U E satisfy the LIPSCHITZ condition
_
_
f(t
1
, x
1
) f(t
2
, x
2
)
_
_
K
_
||x
1
x
2
|| +|t
1
t
2
|
_
for (t
1
, x
1
), (t
2
, x
2
) U. Then for each (t
0
, x
0
) U there is an > 0 and a
C
1
-function x :]t
0
, t
0
+[E so that
x(t
0
) = x
0
_
t, x(t)
_
U for t ]t
0
, t
0
+[
x
(t) = f
_
t, x(t)
_ _
t ]t
0
, t
0
+[
_
.
11
Proof. Since ||f(t, x)|| ||f(t
0
, x
0
)|| +K(|t t
0
| +||xx
0
||), f is bounded
on bounded sets of U. Consider the operator
: x
_
t
_
t
t
0
f
_
, x()
_
d +x
0
_
on the space
_
x C
_
[t
0
, t
0
+], E
_
: x(t
0
) = x
0
_
.
Then if is small enough, is a contraction (since
_
_
(x) (y)
_
_
sup
t
_
t
t
0
_
f
_
, x()
_
f
_
, y()
_
_
d
K max
_
_
_
x() y()
_
_
: [t
0
, t
0
+])
and so has a xed point x
0
by the Banach xed point theorem. This x
0
is a
local solution.
3.2 The Bochner integral
In this section we extend the Lebesgue integral to measurable Banach space
valued functions on a measure space. This is the vector analogue of the
Lebesgue integral and is called the Bochner integral. By a measure space
we mean a triple (, , ) where is a set, is a -algebra of subsets of
and is a nite, -additive non-negative measure on .
Denition: Let (, , ) be as above, E a normed space. A measurable
E-valued step function is a function of the form
n
i=1
A
i
where
i
C and A
i
. A function x : E is measurable if it is the
pointwise limit (almost everywhere) of a sequence (x
n
) of measurable step
functions.
The following facts are then easy to prove:
1. The pointwise limit (almost everywhere) of a sequence of measurable
functions is measurable;
12
2. (Egoros theorem) if (x
n
) is a sequence of measurable functions and
x
n
x pointwise almost everywhere then x
n
x almost uniformly
(i.e. for every > 0, there is an A with (A) < and x
n
x
uniformly in X \ A);
3. if x : E is measurable and T L(E, F) then T x is measurable;
4. if E is separable, x is measurable if and only if x
1
(U) for each
open U E.
It follows from 3. that if x is measurable, it is scalarly measurable i.e. for
each f E
, f x is measurable. The converse is not true in general.

However, surprisingly enough, the dierence between scalar measurability
and measurability is purely a question of the size of the range of x.
Proposition 5 x : E is measurable if and only if it is scalarly measur-
able and almost separably valued (i.e. x(\A) is separable for some negligible
set A).
Proof. Suppose x is measurable. Then by the denition and Egoro
there is a sequence (x
n
) of simple function and (A
n
) of measurable sets with
(A
n
)
1
n
so that x
n
x uniformly on \ A
n
. Now x(\
A
n
) is clearly
separable and hence so is x( \ A
n
). Of course
A
n
is negligible. For the
converse, we can assume that E is separable and so that it has a Schauder
basis (x
n
). This is because every separable space is isometrically isomorph
to a subspace of a subspace with Schauder basis (e.g. C([0, 1])). Then if
(P
n
) is the corresponding sequence of projection, P
n
x is measurable and
P
n
x xhence x is measurable.
As we have seen, a function x : E is Bochner integrable if and only
if there is a sequence (x
n
) of simple functions which converges a.e. to x and is
such that for each > 0, there exists an N N with
_
||x
m
x
n
||d < for
m, n N. Then (
_
x
n
d) is a Cauchy sequence in E (the above integrals,
being integrals of step functions, are dened in the obvious way) and we
dene _
xd = lim
_
x
n
d.
Proposition 6 Let x : E be measurable. Then x is Bochner integrable
if and only if
_
||x||d < .
13
Proof. The necessity follows from the inequality
_
||x||d
_
||x x
n
||d +
_
||x
n
||d
applied to the elements of an approximating sequence of simple functions.
For the suciency we can assume that the E is a space with a basis. We
denote the corresponding projections once again by (P
n
).
Now if
_
||x||d < we have that ||x P
n
x|| converges point wise to
zero and can deduce that
_
||x P
n
x||d 0 by the Lebesgue theorem on
dominated convergence. Now the P
n
x take their values in nite dimensional
subspaces and so can be approximated by simple functions in the L
1
-norm.
Proposition 7 Let x : E be Bochner integrable, T L(E, F). Then
T x is Bochner integrable and
_
T xd = T
_
xd.
Proof. The result is trivial if x is a simple function. The general result
follows by continuity.
Proposition 8 Suppose that x L
1
(, E) and that C is a closed subset of
E so that
_
A
xd (A)C
_
A A).
Then x(t) C for almost all t .
Proof. We can easily reduce to the case where E is separable. Now we
show that if U is an open ball (with centre y and radius r) in E \ C, then
(A) = 0 where A = {t : x(t) U}. Indeed if (A) > 0, then
_
_
_
_
1
(A)
_
A
xd y
_
_
_
_
=
1
(A)
_
_
_
_
_
A
xd
_
A
yd
_
_
_
_
1
(A)
_
A
||x y|| r
which is a contradiction (here we have used y to denote the constant function
t y).
Now E \ C is a countable union of such U and so
_
t : x(t) E \ C
_
= 0.
14
The following Corollaries are easy consequences of this result.
Corollar 1 Suppose that x, y L
1
(, E). Then
a) if
_
A
xd =
_
A
yd for each A A then x = y a.e.;
b) if
_
A
f xd =
_
A
f yd for each f in a total subset of E
and each
A A then x = y almost everywhere;
c) if ||
_
A
xd|| k(A) for each A A then ||x(t)|| k for almost all t.
Proposition 9 Let x : E be Bochner integrable, A a measurable subset
of . Then
_
A
xd (A)
_
x(A)
_
(if B E,

(B) is the closed convex hull of B).
Proof. If the above does not hold, then by the Hahn-Banach theorem there
is an f E
with
f
__
A
xd
_
> K
where K = sup{f(y) : y x(A)}. (To simplify the notation, we are assuming
that the measure of A is one). Then
_
A
f xd > K sup
_
f(y) : y x(A)
_
which is impossible because of standard estimates for integrals of real-valued
functions.
3.3 The Orlicz-Pettis Theorem
Using the machinery of the Bochner integral, we can give a short proof of a
famous result on convergence.
Proposition 10 (Orlicz-Pettis Theorem) Let (x
n
) be a sequence in a Ba-
nach space E so that for each subsequence (x
n
k
) there is an element x with
k=1
f(x
n
k
) converging to f(x) for each f E
. Then
x
n
converges (un-
conditionally) to x. (i.e. weak unconditional convergence of a series implies
norm unconditional convergence).
15
Proof. We use the fact that if f : E is Bochner integrable then the
family
__
A
fd : A A
_
is relatively compact in E. (Exercise! Hint if f is a measurable step function
then this set is bounded and nite dimensional hence relatively compact.
For the general case use an approximation argument).
We let = {1, 1}
N
be the Cantor set with Haar-measure. We dene a
function : E by dening (
k
x
k
) to be that element of E to which
(
k
x
k
) converges weakly. is measurable, since it is clearly weakly mea-
surable (even weakly continous). Since it is bounded (uniform boundedness
theorem) it is Bochner integrable. Hence its range is norm compact. But
on norm compact sets, weak and norm convergence concide hence
x
n
k
converges in the norm for each subsequence.
Remark: We use here the result from Elementare Topologie, that if
(K, ) is a compact space and
1
is a weaker T
2
-topology on K, then =
1
.
In the above case is the norm topology and
1
is the weak topology i.e. the
initial topology on E induced by the funcitonals of E
1
.
3.4 Holomorphic E-valued functions
We now introduce the concept of (complex) dierentiablitity for functions x
dened on an open subset U of C with values in a Banach space E. This
topic is not directly relevant to the calculus of variations but since it is so
similar in nature to the above and is important in other branches (e.g. the
spectral theory of operators on innite dimensional spaces) we include a brief
treatment here. For obvious reasons, we shall consider only complex Banach
spaces in this context. It will also be convenient to use the notation U
r
for
the set { C : || < r}.
A function x : U E is complex-dierentiable at U if
lim
0
x() x(
0
)

0
exists. Then the value of this limit is denoted by x
(
0
) the derivative of
x at
0
. If x is dierentiable at each
0
U it is analytic or holomorphic.
H(U; E) denotes the vector space of E-valued analytic functions on U. As
is easily seen, if T : E F is continuous and linear then T x is analytic
whenever x is and we have the relation: (T x)
= T x
. Just as in the
16
scalar case, holomorphicity can be characterized using a Cauchy integral, as
we now show.
Denition: Suppose that x : U E is continuous and is a smooth curve
with parameterization
c : [a, b] U.
We then dene the curvilinear integral
_
x()d =
_
b
a
x
_
c(t)
_
c(t)dt
(where the integral used is the one dened at the beginning of this section).
(It is no problem to extend this denition to integrals over piecewise
smooth or even rectiable curves but the above denition will suce for our
purposes.) Then it follows that if T : E F is continuous and linear
_
T x()d = T
_
x()d.
Proposition 11 Let x : U E be holomorphic and let be a smooth,
closed nullhomotopic curve in U. Then
_
x()d = 0.
Proof. If f E
, then the scalar function f x is analytic and so

f
__
x()d
_
=
_
f x()d = 0.
Since this holds for each f E
, the integral must be zero. Exactly as in the

scalar case, this implies:
Corollar 2 (Cauchy formula) Let x be as above,
0
a point of U and a
smooth closed curve with winding number 1 with respect to . Then
x(
0
) =
1
2i
_
x()

0
d.
Using the techniques we have developed, together with the classical results, it
is now easy to obtain alternative characterizations of complex dierentiability
in particular, the fact that it is equivalent to representability locally by Taylor
series. As we shall now see it follows from the uniform boundedness theorem
that x : U E is analytic if it satises the apparently much weaker con-
dition that the scalar function f x be holomorphic for each f E
(weak
analyticity).
17
Proposition 12 For a function x : U
r
E, the following are equivalent:
1. x is holomorphic on U
r
;
2. for each f E
, f x is holomorphic on U
r
;
3. there is a sequence (c
n
) in E so that
limsup ||c
n
||
1/n
1/r
and x :
n=0
c
n
n
.
If these condition are satised, then x
is also dierentiable and hence x is

innitely dierentiable. Also we have c
n
= x
(n)
(0)/n!
Proof. (3) implies (1): Just as in the scalar case, we can show that if (c
n
)
satises the above condition, then the given series is absolutely convergent
on U
r
and uniformly convergent on each compact subset. The series obtained
by formal dierentiation also have the same convergence properties and from
this it follows that x is innitely dierentiable and the formulae for the c
n
hold.
(1) implies (2) is trivial
(2) implies (3): First we note that if < r then f x is bounded on the
compact set

U
for each f E
and so by the uniform boundedness theorem,

x is also norm-bounded on

U
. Let
M
:= sup
_
_
_
x()
_
_
: ||
_
.
If f E
we dene
c
n
(f) = (f x)
(n)
(0)/n! =
1
2i
_
(f x)()
n+1
d
where
is the boundary of

U
. As is easy to see f c
n
(f) is a linear form
on E
.
We can estimate as follows:
c
n
(f)
||f||M
/
n
.
Hence c
n
is a bounded form (i.e. c
n
E
) and ||c
n
|| M
/
n
. Since
this holds for each < r we have limsup ||c
n
||
1/n
1/r. Now using the
implication (3) implies (1) we know that
x() :=
n=0
c
n
n
18
denes a holomorphic function from U
r
into E
. But it follows immediately

form the denition of the c
n
that
f x = f x
for each f E
. Hence x = x and so x is holomorphic and each c

n
actually
lies in E.
This result has the following global form:
Proposition 13 For a function x : U E, the following are equivalent:
1. x is holomorphic;
2. for each f E
, f x is holomorphic;
3. for each
0
U, there is a r > 0 and a sequence (c
n
) in E so that
x() =
n=0
c
n
(
0
)
n
in
0
+U
r
U where convergence is absolute and uniform on compact subsets.
x is then innitely often dierentiable.
Proposition 14 (Moreras theorem) Let x : U E be continuous. Then x
is analytic if
_
x()d = 0
for each smooth, closed, nullhomotopic curve in U.
Proof. If x satises the given condition, then so does f x(f E
) and
so f x is holomorphic by the classical form of Moreas theorem. Hence x is
holomorphic by the above Proposition.
Proposition 15 (Liouvilles theorem) If x : C E is holomorphic and
bounded, then it is constant.
Proof. If x is not constant, choose
1
,
2
C with x(
1
) = x(
2
). There
is an f E
with
f
_
x(
1
)
_
= f
_
x(
1
)
_
.
Then f x is a non-constant, bounded, entire function which contradicts the
classical form of Liouvilles theorem.
19
3.5 Dierentiability of functions on Banach spaces
We now discuss the concept of dierentiability for function dened on Ba-
nach spaces. The denitions for function of several variablesi.e. via local
approximation by linear operatorscan be carried over without any prob-
lems. The resulting concept is called Freechet dierentiability. We shall
also discuss a weaker onethat of Gateaux dierentiability).
In the following E, F, G etc. will be real Banach spaces. Letters such as
U, V, W will denote open subsets of Banach spaces.
Frechet dierentiability A function f : U F (U E) is Frechet
dierentiable at x
0
U if there is a T L(E, F) so that
lim
h0
f(x
0
+h) f(x
0
) Th
||h||
= 0
(equivalently f((x
0
+ h) f(x) = Th + (h) where (h)/||h|| goes to zero
with h).
T is uniquely determined by this condition and is called the (Frechet)
derivative of f at x
0
, denoted by (Df)
x
0
. f is dierentiable on U if (Df)
x
exists for each x U. f is a C
1
-function on U if the function
Df : x (Df)
x
from U into L(E, F) is continuous (in symbols f C
1
(U; F)).
Gateaux dierentiability There is a weaker concept of dierentiability
which is sometimes usefulthat of Gateaux dierentiability. This means
that the restriction of f to lines through x
0
are dierentiable in the sense of
1.1 and the corresponding derivatives are continuous and linear as functions
of the direction of dierentiation. More precisely, f : U E is Gateaux
dierentiable (or G-dierentiable) at x
0
U if
1. for each h E, the derivative
Df(x
0
, h) = lim
t0
f(x
0
+th) f(x
0
)
t
exists and
2. the mapping h Df(x
0
, h) from E into F is continuous and linear.
20
Note that the limit in 1) can exists for each h E without the mapping
in 2) being linear (the standard example is
(s, t)
_
0 (s, t) = (0, 0)
st
2
s
2
+t
2
otherwise
from R
2
into R). Also a function can be G-dierentiable at x
0
without being
continuous at x
0
(and so certainly not dierentiable). The standard example
is
(s, t)
_
0 t
2
s
4
1 otherwise
Of course, if f is dierentiable it is G-dierentiable. A useful relation between
the two concepts is established in the next Proposition.
Proposition 16 f : U F is dierentiable at x
0
U if it is G-dierentiable
there and
lim
t0
_
_
_
_
f(x
0
+th) f(x
0
)
t
Df(x
0
, h)
_
_
_
_
= 0
uniformly for h is the unit sphere of E.
This is just a reformulation of the denition of dierentiability.
Corollar 3 If f : U E is G-dierentiable and the function
x
_
h Df(x, h)
_
from U into L(E, F) is continuous, then f is dierentiable and
(Df)
x
: h Df(x, h) (x U).
Proof. The continuity of the above mapping means that for every > 0
there is a > 0 so that
_
_
Df(x, h) Df(x
0
, h)
||h||
for ||x x
0
|| < , h E.
Now there is a t [0, 1] so that
_
_
f(x +h)) f(x
0
) Df(x
0
, h)
_
_
=
_
_
Df(x
0
+th, h) Df(x
0
, h)
_
_
||h||
(applying the mean value theorem to the function
t f(x
0
+th) ).
Thus the dierence quotient converges to Df(x
0
, h) uniformly in h and so f
is Frechet dierentiable by the above.
21
We bring some simple examples of dierentiable functions:
If T : E F is continuous and linear, then T is dierentiable and (DT)
x
=
T for each x.
If E is the one-dimensional space R and U R is open, then the above
concepts of dierentiability both coincide with that of IIa).
If B : E F G is continuous and bilinear, then B is dierentiable and
(DB)
(x,y)
is the linear mapping (h, k) B(h, y) +B(x, k) on E F.
Polynomials If E, F are Banach spaces, a homogeneous polynomial of
degree n is a mapping Q : E F of the form
Q(x) = B(x, . . . , x)
where B L
n
(E, F). For convenience we write Bx
n
for the right hand side.
Note that we can and shall assume that B is symmetric i.e.
B(x
1
, . . . , x
n
) = B(x
(1)
, . . . , x
(n)
)
for each permutation in S
n
. For B induces the same polynomial as its sym-
metrisation.
A polynomial is a function from E into F which is a nite sum of such
homogeneous polynomials. For example if k
i
: [0, 1]
i
R are continuous
symmetric functions, the mapping
x
_
1
0
k
1
(s)x(s)ds +
_
1
0
_
1
0
k
2
(s
1
, s
2
)x(s
1
)x(s
2
)ds
1
ds
2
+ +
_
1
0
. . .
_
1
0
k
n
(s
1
, . . . , s
n
)x(s
1
) . . . x(s
n
)ds
1
, . . . , ds
n
is a polynomial on C(I).
If f : E F is a homogeneous polynomial of the form x Bx
n
(B
symmetric in L
n
(E, F)) then f is dierentiable and
(Df)
x
: h nB(x
n1
, h)
(the last symbol denotes B(x, x, . . . , x, h)). The proof is based on the bino-
mial formula
B(x +h) B(x) =
n
r=0
_
n
r
_
B(x
r
, h
nr
)
(where B(x
r
, h
nr
) has the obvious meaning) which can be proved just as in
the classical case.
22
Thus we have
f(x +h) f(x) = nB(x
n1
, h) +
n(n 1)
2
B(x
n1
, h
2
) +. . .
and the result follows.
For example, the derivative of the above polynomial on C(I) is
(Df)
x
: h
_
1
0
k
1
(s)ds + 2
_
1
0
_
1
0
k
2
(s
1
, s
2
)x(s
1
)h(s
2
)ds
1
ds
2
+
+
_
1
0
. . .
_
1
0
k
n
(s
1
, . . . , s
n
)x(s
1
) . . . x(s
n1
)h(s
n
)ds
1
. . . ds
n
We now show that the Frechet derivative possesses the basic properties
required for ecient calculation. We begin with the chain rule and then show
that suitable forms of the inverse function theorem hold.
Proposition 17 (the chain rule) If f : U V, g : V G where f is dier-
entiable at x U and g is dierentiable at f(x), then g f is dierentiable
at x and
D(g f)
x
= (Dg)
f(x)
(Dg)
x
.
Proof. Let k be the function h f(x +h) f(x) so that
k(h) = (Df)
x
(h) +||h||
1
(h)
where
1
(h) 0. Also
g f(x +h) g f(x) = g
_
f(x +h)
_
g
_
f(x)
_
= g
_
f(x) +k(h)
_
g
_
f(x)
_
= (Dg)
f(x)
_
k(h) +||k(h)||
2
_
k(h)
_
_
where
2
(k) 0 as k 0.
Then
g f(x +h) g f(x) = (Dg)
f(x)
(Df)
x
(h)
plus the remainder term ||h||(Df)
f(x)
1
(h) + ||k(h)||
2
(k(h)) which, as it is
easy to check, has the correct growth properties.
23
Using the chain rule, it is easy to prove the following results:
1. if f, g C
1
(U; E) then so does f +g and D(f +g)
x
= (Df)
x
+ (Dg)
x
(consider the chain
U E E E
x
_
f(x), g(x)
_
f(x) +g(x) );
2. if f C
1
(U; E), g C
1
(U; F) and B : E F G is bilinear and
continuous, then the mapping x B(f(x), g(x)) is in C
1
(U; G) and
its derivative at x is the mapping
h B
_
Df
x
(h), g(x)
_
+B
_
f(x), Dg
x
(h)
_
;
3. (partial derivatives) let U (resp. V ) be open in E (resp. F)
f : U V G. Then if f is dierentiable at (x
0
, y
0
), f
1
(resp. f
2
) is
dierentiable at x
0
(resp. y
0
) where f
1
: x f(x, x
0
), f
2
: y f(x
0
, y)
and then
(Df)
(x
0
,y
0
)
(h, k) = (Df
1
)
x
0
(h) + (Df
2
)
y
0
(k).
We now prove an analogue of the classical inverse function theorem for
function between Banach spaces. The proof is based on an inverse function
theorem for Lipschitz functions. Recall the denition:
If (X, d) and (Y, d
1
) are metric spaces, a mapping f : X Y is Lipschitz
if there exists a K > 0 so that
d
1
_
f(x), f(y)
_
Kd(x, y) (x, y X).
We write Lip(f) for the inmum of the K which satisfy the condition. For
example if E and F are Banach spaces and T L(E, F) then T is Lipschitz
and Lip(T) = ||T||.
Proposition 18 If E is a Banach space and T : E E is a linear isomor-
phism and f : E E is Lipschitz with Lip(f) ||T
1
||
1
then T + f is a
bijection and (T + f)
1
is Lipschitz with constant
Lip(T +f)
_
||T
1
||
1
Lip(f)
_
1
.
24
Proof. If x E, then
||z|| = ||(T
1
T)(z)|| ||T
1
|| ||Tz||
and so ||Tz|| ||T
1
||
1
||z||.
Hence we can estimate
_
||T
1
||
1
Lip(f)
_
= ||T
1
||
1
||x y|| Lip(f)||x y||
_
_
T(x y)
_
_
_
_
f(x) f(y)
_
_
_
_
(T +f)x (T +f)y
_
_
Thus T + f is injective and its inverse from (T + f)E into E is Lipschitz
with constant at most (||T
1
||
1
Lip(f))
1
. We now show that T + f is
surjective: take y E and let x
0
:= T
1
(y). We are looking for an h E so
that
(T +f)(x
0
+h) = y i.e. Th +f(x
0
+h) = 0
i.e. h is a xed point of the mapping
: h T
1
(x
0
+h).
Now is easily seen to be a contraction and so it has a xed point which
gives the result. The inverse function theorem says that if the derivative of
a function at a given point x
0
is an isomorphism, then f is invertible in a
neighborhood of this point. Now by the very denition of the derivative, f
(up to a constant) is a small perturbation of its derivative in a neighborhood
of x
0
and so satises the conditions of Proposition 19. However, in order to
apply Proposition 19 we must construct a function which is dened on all
of E and Lipschitz there, but agrees with f in a neighborhood of x. To do
this we use a bell function i.e. a function : R [0, 1] which is innitely
dierentiable, is equal to 1 on [1, 1] and equal to 0 outside of [2, 2].
Lemma 2 Let U be an open neighborhood of zero in the Banach space E,
f C
1
(U; E), with f(0) = 0 = (Df)
0
. Then for every > 0, there is an
open neighborhood V U of zero and

f : E E so that
1.

f
V
= f
V
;
2.

f is bounded;
3.

f is Lipschitz with Lip (f) < .
25
Proof. Since Df is continuous, there is an > 0 so that 2B
E
U and
_
_
Df(x)
_
_
<

1 + 2||
||
(||x|| 2) where ||
|| is the supremum norm of
. Let

f be the mapping
x ((
||x||
)f(x) for ||x|| 2 and

f(x) = 0 outside of 2B
E
. Clearly

f = f
on B
E
. Also

f is bounded since f is bounded on 2B
E
(by the mean value
theorem). To check the Lipschitz condition, it is sucient to consider x, y in
2B
E
. Then

f(x)

f(y)
_
||x||
_
f(x)
_
||y||
_
f(y)
_
||x||
_
||y||
_
_
f(x)
_
_
+
_
||y||
_
_
_
f(x) f(y)
_
_
||
|| ||x y||
_
_
f(x)
_
_
+
_
||y||
_
sup
z2B(E)
_
_
Df(z)
_
_
||x y||
||
||||x y|| sup

_
_
Df(z)
_
_
||x|| + sup
_
_
Df(z)
_
_
||x y||
||x y||
Using this we can prove the following version of the inverse function
theorem:
Proposition 19 Let U be an open neighborhood of zero in E, f C
1
(U; E)
with f(0) = 0. If (Df)
0
is an isomorphism, then f is local dieomorphism
i.e. there exists a neighborhood V U of zero so that f(V ) is open and
f
V
: V f(V ) is a C
1
-isomorphism.
Proof. Put g = f (Df)
0
. We can apply the above Lemma to get a
neighborhood V of zero and a g : E E so that
1. g = g on V ;
2. g is bounded and Lipschitz continuous with
Lip(g) <
_
_
(Df)
1
0
_
_
1
Then by the Lipschitz inverse function theorem (Df)
0
+ g is a homomorphism
of E with Lipschitz continuous inverse. Hence f
V
= ( g + (Df)
0
)
V
is a
26
bijection from V onto an open set. We show that its inverse is dierentiable.
(From this it follows easily that the inverse mapping is C
1
.)
Choose y, y +k in f(V ). Then if x = f
1
(y),
f
1
(y +k) f
1
(y) (Df)
1
x
k = (Df)
1
x
_
(Df)
x
_
f
1
(y +k) f
1
(y)
_
k
_
.
Now we put h = f
1
(y + k) f
1
(y) i.e. k = f(x + h) f(x). Then the
right hand side is
(Df)
x
h
_
f(x +h) f(x)
_
and this is ||h||(h) where (h) 0 as ||h|| 0. Now since f and f
1
are
Lipschitz, ||h|| 0 if and only if ||k|| 0 and so f
1
is dierentiable at x
and
(Df
1
)
x
=
_
(Df)
x
_
1
.
As in the nite dimensional case we can deduce the following corollary:
Proposition 20 (implicit function theorem) Let U be an open neighborhood
of (x
0
, y
0
) in E F, f : U G a C
1
-function with f(x
0
y
0
) = 0. Sup-
pose that (D
2
f)
(x
0
,y
0
)
; F G is an isomorphism, where (D
2
f)
(x
0
,y
0
)
is the
derivative of the function
y f(x
0
, y)
at y
0
.
Then there are open sets W E, W
U with x
0
W, (x
0
, y
0
) W
and a C
1
-mapping g : W F so that for x, y W
,
f(x, y) = 0 if and only if x W and y = g(x).
Proof. Write for the C
1
-mapping
(x, y)
_
x, f(x, y)
_
.
Then the derivative (D)
(x
0
,y
0
)
is the operator
_
Id
E
0
(D
1
f)
(x
0
,y
0
)
(D
2
f)
(x
0
,y
0
)
_
Here we are using an obvious matrix notation to describe mappings from
E F into E F (i.e. the matrix
_
T
11
T
12
T
21
T
22
_
27
denote the maps
(x, y) (T
11
x +T
12
y, T
21
x +T
22
y)
where T
11
L(E, E), T
12
L(F, E), T
21
L(E, F), T
22
L(F, F)).
(D
1
f)
(x
0
,y
0
)
is the derivative (at x
0
) of the mapping
x f(x, y
0
).
Now (D)
(x
0
,y
0
)
is an isomorphism and so there is an open neighborhood
W
U of (x
0
, y
0
) so that |
W
is a dieomorphism. Denote its inverse by
. Then if (x, y) W
f(x, y) = 0 if and only if (x, f(x, y)) = (x, 0) or,

equivalently, (x, y) = (x, 0) i.e. (x, y) = (x, 0).
So the required g is the mapping
x
_
(x, 0)
_
where is the natural projection from E F and g is dened on
W :=
_
x : (x, 0) (W
)
_
.
We now consider higher derivatives. In contrast to the situation of function
on R a new complication arises due to the fact that the derivative of a
function f from U into F takes its values not in F but in the Banach space
L(E, F). Hence, the derivative of Df, if it exists, will have its values in
L(E, L(E, f)). By the time we get to say the fourth derivative the range
space is a rather complicated nested operator space. Fortunately, the above
space is naturally isometric to L
2
(E, F). More generally, we have
L
k
(E, F)
= L
_
E, L
k1
(E, F)
_
.
This puts us in the position to dene recursively the notion of a C
r
-function
and its higher derivatives: f : U F is of class C
r
(or f C
r
(U; E)) if
f is dierentiable and Df C
r1
(U; L(E, F)). We then dene D
r
f by the
equation (D
r
f)
x
(h
1
, . . . , h
r
) = (D(D
r1
f)
x
(h
1
)(h
2
, . . . , h
r
).
Proposition 21 (Taylors theorem) Let f : U E be a C
r
-function and
let x U, h E be such that {x +th : t [0, 1]} U. Then
f(x +h) = f(x) + (Df)
x
(h) + +
1
(r 1)!
(D
r1
f)
x
h
r1
+R
r
(h)
where the remainder term R
r
is
_
1
0
(1 t)
r1
(r 1)!
(D
r
f)
x+th
(h
r
)dt
and so satises the growth property lim
||h||0
R
r
(h)
||h||
r
= 0.
28
Proof. Let g be the function t f(x +th). Then
g
(k)
(t) = (D
k
f)
x+th
h
k
.
(Exercise.) Now
d
dt
_
g(t) + (1 t)g
(t) + +
1
(r 1)!
(1 t)
n1
g
(r1)
(t)
_
=
1
(r 1)!
(1t)
r1
g
(r)
(t).
Integrating both sides from 0 to 1 and substituting for the terms of the form
g
(k)
(t) gives the result.
As in the classical case, the rst and second derivatives give information
on the extrema of functionals on Banach spaces:
Proposition 22 Let f : U R (U E) be a C
2
-function, x
0
a point in
U. Then
1. if f(x
0
) is a local minimum or maximum for f, (Df)
x
0
= 0;
2. if (Df)
x
0
= 0 and the second derivative (D
2
f)
x
0
satises the condition:
there is a k > 0 so that (D
2
f)
x
0
(h
2
) k||h||
2
(resp. (D
2
f)
x
0
(h
2
)
k||h||
2
), then f(x
0
) is a local minimum (maximum).
The proof is exactly as in the nite dimensional case.
We now return to the main subject:
4 The calculus of variations
4.1 The Euler equations application
In this section we shall use the above theory to derive the Euler equation for
a variety of problems of the calculus of variations. We begin with a simple
case.
Example A: Let L be a smooth function of three variables (t, x, z) we
calculate the derivative of the functional:
I(c) =
_
t
1
t
0
L
_
t, c(t), c(t)
_
dt.
We assume that L is analytic in the three variables t, x, z. Then I is contin-
uous (us a functional say on E = C
1
([t
0
, t
1
])) and so, in order to show that it
29
is analytic, it suces to demonstrate that its restrictions to one dimensional
ane subspaces of E are analytic. Also, in order to calculate the derivative
(DI)
c
(h) at h it suces to calculate lim
s0
1
s
[I(c +sh) I(c)]. But
1
s
_
I(c + sh) I(c)
=
1
s
__
t
1
t
0
_
L
_
t, c(t) +sh(t), c(t) +s
h(t)
_
L
_
t, c(t), c(t)
_
_
dt
_
=
1
s
__
t
1
t
0
_
sh(t)L
x
(t, c, c) +sh
(t)L
z
(t, c, c)
dt +o(s)
_
_
t
1
t
0
_
h(t)L
x
(t, c, c) +h
L
z
(t, c, c)
dt
This means that (DI)
c
is the linear form
h
_
t
1
t
0
h(t)L
x
_
t, c(t), c(t)
_
dt +
_
t
1
t
0
h
(t)L
z
_
t, c(t), c(t)
_
dt.
Hence the vanishing of the derivative at x means that this integral must
vanish for each h C
1
([t
0
, t
1
]).
Problems with xed endpoints: In many concrete problems, we wish to
specify a maximum or minimum of the functional not on the whole Banach
space but on an ane subspace (i.e. a translate of a subspace). Thus in the
above case we often have a problem with xed endpoints that is values
x
0
and x
1
are given and we consider the extremal values of the functional on
the subspace
E
1
=
_
c C
1
_
[t
0
, t
1
]
_
: c(t
0
) = x
0
and c(t
1
) = x
1
_
= c +E
0
where c E
1
and E
0
is the corresponding space with the homogeneous
conditions c(t
0
) = 0 = c(t
1
). In the order to calculate the derivatives we
compute exactly as above, but using test functions h E
0
. In this case the
nal result can be simplied, using integration by parts, as follows:
(DI)
c
(h) =
_
t
t
0
h(t)L
x
_
t, c(t), c(t)
_
dt
_
t
1
t
0
h(t)
d
dt
L
z
_
t, c(t), c(t)
_
dt
=
_
t
t
0
h(t)
_
L
x
_
t, c(t), c(t)
_
d
dt
L
z
_
t, c(t), c(t)
_
_
dt
Since this holds for each h E
0
we have the necessary condition:
L
x
_
t, c(t), c(t)
_
d
dt
L
z
_
t, c(t), c(t)
_
= 0
30
for c to be an extremum of I. This is known as Eulers equation (it is an
ordinary dierential equation for c of the most general form i.e. implicit
and non-linear). In a similar manner we have:
B. The Euler equations for the functional:
I(c) =
_
t
1
t
0
L
_
t, c
1
, . . . , c
n
(t), c
1
(t), . . . , c
n
(t)
_
dt
with kernel L a smooth function of the variables t, x
1
, . . . , x
n
, z
1
, . . . , z
n
are:
d
dt
L
z
k
_
t, c(t), c(t)
_
L
x
k
_
t, c(t), c(t)
_
= 0
(k = 1, . . . , n).
C. The Euler equations for the functional:
I(c) =
_
t
1
t
0
L
_
t, c(t), c(t), . . . , c
(n)
(t)
_
dt
with smooth kernel L = L(t, x, z
1
, . . . , z
n
) (and boundary conditions
c
(k)
(t
0
) =
k
(k = 0, . . . , n 1)
c
k
(t
1
) =
k
(k = 0, . . . , n 1)
is
L
x
d
dt
L
z
+
d
2
dt
2
L
z
2
+ (1)
n
d
n
dt
n
L
zn
= 0
D. The Euler equation for functions of several variables. Consider the func-
tional
I() =
_ _
U
L
_
u, v, (u, v),

u
,

v
_
du dv
with boundary conditions
(u, v) = f(u, v) on U.
1
s
_
I( +sh) I()
_
=
_ _
U
_
L(u, v, +sh,
1
+sh
1
,
2
+sh
2
) L(u, v, ,
1
,
2
)
du dv
=
_ _
hL
x
du dv +
_ _

1
h
1
L
z
du dv +
_

2
h
2
L
w
du dv
But, by Greens theorem:
_ _
U
h
1
L
z
du dv+
_ _
U
h
2
L
w
du dv =
_
U
h(L
z
dvL
w
d
u
)
_ _
U
h
_

x
L
z
+

y
L
w
_
du dv
31
(Consider the dierential form
= h(u, v)L
z
_
u, v, (u, v),
1
(u, v),
2
(u, v)
_
dvh(u, v)L
w
_
u, v, (u, v),
1
(u, v)
2
(u, v)
_
du
d = (h
1
L
z
+

u
L
z
) +(h
2
L
w
+

v
L
w
)). Hence (DI)
(h) =
__
h{L
x

u
L
z
v
L
w
}du dv and so the Euler equation for
I() =
_ _
U
L(u, v, ,
1
,
2
)du dv
with smooth kernel L = L(u, v, x, z, w) is
u
L
z
+

u
L
w
L
x
= 0
In the case where higher derivatives occur, for example second derivatives,
we have the following form for the Euler equations: In this case we have
L = L(u, v, x, z
1
, z
2
, z
2
, z
1
w
1
, w
2
).
(z
s
is a place-holder for the second partial derivative with respect to u, z
1
w
1
for the mixed partial derivative (not the product of z
1
and w
1
). Then the
Euler equation is:
L
x

u
L
z
1

v
L
w
1
+

2
u
2
L
z
2
+ 2

2
uv
L
z
1
w
1
+

2
v
2
L
O
w
2
= 0.
For example, if
I() =
_ _
1
2
_
_
u
2
_
2
+

2
u
2
v
2
+
_
v
2
_
2
_
ldudv,
then in this case Eulers equation is
= 0.
Examples: I. The shortest line between (t
0
,
0
) and (t
0
,
1
), (
1
>
0
) in
R
2
: Here
I(c) =
_
t
1
t
0
_
1 + c(t)
2
dt
L(t, x, z) =
1 +z
2
L
x
= 0, L
z
=
z
1 +z
2
. Eulers equation:
d
dt
c(t)
_
1 + c(t)
2
= 0
32
Hence
c(t)
_
1 + c(t)
2
and so also
c(t)
2
1 + c(t)
2
is constant. From this it follows easily
that c(t) is constant i.e. c(t) has the form at + b for suitable a, b which can
be determined from the boundary conditions.
II. As above in 3 dimensions:
I(c) =
_
t
1
t
0
_
1 + c
1
(t)
2
+ c
2
(t)
2
dt
L(t, x, z) =
_
1 +z
2
1
+z
2
2
Euler equations:
d
dt
c
1
(t)
_
1 + c
1
(t)
2
+ c
2
(t)
2
= 0
d
dt
c
2
(t)
_
1 + c
1
(t) + c
2
(t)
2
= 0
Once again, one can deduce that c is ane.
III. Surfaces of revolution with minimal area:
I(c)
_
t
1
t
0
c(t)
_
1 + c(t)
2
dt c(t
0
) = x
0
, c(t
1
) = x
1
.
L(t, x, z) = x
1 +z
2
L
x
=
1 +z
2
, L
z
=
xt
1 +z
2
Eulers equation:
d
dt
c(t) c(t)
_
1 + c(t)
2
=
_
1 + c(t)
2
This simplies to the equation c(t) c(t) = 1 + c(t)
2
which tacitly assumes
that c is C
2
. However this follows from the form of the equation. For if
y = L
z
(t, x, z) =
xz
1 +z
2
, simple algebra shows that z =
y
_
x
2
y
2
.
Hence if d(t) = L
z
(t, c(t), c(t)) we have
d(t) =
c(t) c(t)
_
1 + c(t)
2
and so c(t) =
d(t)
_
c(t)
2
d(t)
2
.
Hence c is C
2
(since d is C
1
).
33
The original Eulers equation is equivalent to the system:
c(t) =
d(t)
_
c(t)
2
d(t)
2
c(t) =
c(t)
_
c(t)
2
d(t)
2
By standard methods one obtains the explicit solution
c(t) = b cosh
t t
0
b
(b > 0).
The constants b and t
0
are determined by the boundary conditions. (Note
that such a solution cannot be found for arbitrary boundary conditions.)
IV. Geodetics in R
2
for the metric tensor
_
g
11
g
12
g
12
g
22
_
(where the gs are smooth functions of two variables with g
11
> 0, g
11
g
12

g
2
12
> 0). In this case:
I(c) =
_
t
t
0
_
g
11
(t, c(t)) + 2g
12
(t, c(t)) c(t) +g
22
(t, c(t)) c
2
(t)dt
L(t, x, z) =
_
g
11
(t, x) + 2g
12
(t, x)z +g
22
(t, x)z
2
The Euler equation is then
d
dt
g
12
(t, c(t)) +g
22
(t.c(t)) c(t)
_
g
11
(t, c(t)) + 2g
12
(t, c(t)) c(t) +g
22
(t, c(t)) c(t)
2
g
11
x
(t, c(t)) + 2
g
12
x
(t, c(t)) c(t) +

2
g
22
x
(t, c(t)) c(t)
2
2
_
g
11
(t, c(t)) + 2g
12
(t, c(t)) c(t) +g
22
(t, c(t)) c(t)
2
Sometimes written as
d
dt
F +G c
E + 2F c +G c
2
E
2
+ 2F
2
c +G
2
c
2
2
E + 2F c +H c
2
= 0
(with E = g
11
, F = g
12
= g
21
, G = g
22
). We can consider the same problem
in 3 dimensions.
Here
_
_
g
11
g
12
g
13
g
21
g
22
g
23
g
31
g
32
g
33
_
_
is a positive denite matrix whose entries are
smooth functions of those variables (t, x
1
, x
2
).
34
In this case:
I(c) =
_
g
11
+ 2g
12
c
2
+ 2g
13
c
3
+g
22
c
2
2
+ 2g
23
c
2
c
3
+g
33
c
2
3
L(t, x, z) =
_
g
11
(t, x
2
, x
3
) + 2g
12
z
2
+ 2g
13
z
3
+g
22
c
2
2
+ 2g
23
z
2
z
3
+g
33
z
2
3
Eulers equation:
d
dt
_
g
12
+g
22
c
2
+g
23
c
3
L
_
1
2L
_

x
2
g
11
+ 2

x
2
g
12
c
2
+
_
= 0
V. Fermats principle:
I(c) =
_
t
1
t
0
_
t, c
1
(t), c
2
(t)
_
_
1 + c
1
(t)
2
+ c
2
(t)
2
dt = (t, x
1
, x
2
)
the coecient of refraction)
L(t, x
1
, x
2
, z
1
, z
2
) = (t, x)
_
z
2
1
+z
2
2
L
x
1
=

x
1
_
1 +z
2
1
+z
2
2
L
z
1
=
z
1
_
1 +z
2
1
+z
2
2
L
x
2
=

x
2
_
1 +z
2
1
+z
2
2
L
z
2
=
z
2
_
1 +z
2
1
+z
2
2
Equations
d
dt
_
t, c(t)
_
c
k
(t)
_
1 + c
1
(t)
2
+ c
2
(t)
2
=
x
k
_
t, c(t)
_
_
1 + c
1
(t)
2
+ c
2
(t)
2
(k = 1, 2).
VI. Dirichlets problem. Recall that we minimized the functional
I() =
_ _
_
_
u
_
2
+
_
v
_
2
_
du dv
Thus L =
1
2
(z
2
+w
2
) and Eulers equation is =

2
x
2
+

2
y
2
= 0
Similarly if
I() =
_ _
1
2
_
_
u
2
_
2
+ 2
_
u
2
__
y
2
_
+
_
v
2
_
2
_
du dv
35
then the Euler equation is
=

4
u
4
+ 2

4
u
2
v
2
+

4
v
4
= 0
VII. Minimal surfaces. Here
I() =
_
_
1 +
2
u
+
2
v
L(u, v, x, z, w) =
1 +z
2
+w
2
Eulers equation is
uu
(1 +
2
v
) 2
uv
v
+
vv
(1 +
2
u
) = 0
This can be interpreted geometrically as stating that the mean curvative
vanishes (cf. Vorlesung Dierentialgeometrie).
Remark: Consider the implicit curves
{f = c}
Then their curvatures are given by the formula:
=
f
11
f
2
2
+ 2f
1
f
2
f
12
f
22
f
2
1
|gradf|
3
(cf. Dierentialgeometrie) and so the equation hat the form:
f
11
+f
22
= |gradf|
3
Hence if the level curves are straight lines, then
f
11
+f
22
= 0
i.e. f is harmonic.
Example: f is linear. Then it is a plane.
f is a(x, y) +b. Then the graph is the helicoid
f(u, v) = (u cos v, u sinv, av)
Scherks example: Scherk solved this equation for f of the form f(x, y) =
g(x) +h(y)
36
This leads to the equation
g
(x)
1 +g
1
(x)
2
=
h
(y)
1 +h
1
(y)
2
= const
Solutions: g(x) =
1
a
ln cos ax, h(x) =
1
a
ln cos ay
f(u, v) =
1
a
ln
_
cos au
cos av
_
(Catalan showed that the only ruled surfaces which are minimal are the
helicoid or the plane).
Surfaces of revolution: We rotate the curve (t, c(t), 0) about the x-axis.
(u, v) =
_
u, c(u) cos v, c(u) sinv
_
.
The equation H = 0 then becomes
c c + 1 + c
2
= 0
with solution
c(t) = a cosh
_
t
a
+b
_
.
4.2 The Weierstra representation:
We now show how to construct a large class of minimal surfaces by using
results from function theory. We consider a parameterized surface M in R
3
i.e. M is the image of a smooth function = (
1
,
2
,
3
) from U( R
2
)
into R
3
. We denote by N the normal vector to the curve i.e. N(u, v) =
1
(u, v)
2
(u, v)
|
1
(u, v)
2
(u, v)|
. We suppose that the parameterization is conformal
(or isothermal). This means that there is a scalar function (u, v) so that
(
1
|
1
) = (u, v) = (
2
|
2
), (
1
|
2
) = 0. Then we have the equation =
11
+
22
= 2
2
HN where H is the mean curvature of the surface. From
this it follows that the surface is minimal i = 0 i.e. the components
1
,
2
,
3
are all harmonic. Then if we introduce the functions f
1
, f
2
, f
3
where
f
i
=

i
z
, these are analytic functions.
37
Remark:

z
is the operator
1
2
(

u
i

v
).
A simple calculation shows that if is harmonic, then

z
satises the
Cauchy-Riemann equations and so is holomorphic). Furthermore we have
the relations by
f
2
1
+f
2
2
+f
2
2
= 0
between the fs. For
f
2
1
+f
2
2
+f
2
3
=
1
4
i
_
i
u
i
i
v
_
2
=
1
4
i
_
i
u
_
2
i
_
i
v
_
2
2i
i
u
i
v
=
1
4
_
|
u
|
2
|
v
|
2
2i(
u
|
v
)
= 0
A similar calculation shows that
|f
1
|
2
+|f
2
|
2
+|f
3
|
2
= 2
2
> 0
Conversely, we have the following. Suppose that we have three holomorphic
functions (f
1
, f
2
, f
3
) on U so that
a) f
2
1
+f
2
2
+f
2
3
= 0
b) |f
1
|
2
+|f
2
|
2
+|f
2
|
2
= 0
c) each f
i
has a primitive F
i
.
Then the function = (
1
,
2
,
3
) where
i
= F
i
is a parameterization of
a minimal surface.
Examples. I. The helicoid. Here
f
1
= cosh z, f
2
= i sinh z, f
3
= i.
1
= sinh z = cos v sinh u
2
= (i cosh z +i) = sin v sinh u
3
= (iz) = v
This is the helicoid as one sees by making the change of variable t = sinh u.
38
II. The Catenoid. Here we take
f
1
= sinh z, f
2
= i cosh z, f
3
= 1
1
= cos v cos u 1
2
= sin v cosh u
3
= u
This is a parameterization of the surface of rotation of the catenary.
III. Scherks surface. Take
f
1
(z) =
i
z +i

i
z i
=
2
1 +z
2
f
2
(z) =
i
z + 1

i
z 1
=
2i
1 z
2
f
3
(z) =
4z
1 z
4
=
2z
z
2
+ 1

2z
z
2
1
(dened on U = {|z| < 1}). Then
1
= arg
z +i
z i
2
= arg
z + 1
z 1
3
= ln
z
2
+ 1
z
2
1
IV. Ennepers surface:

f
1
=
1
2
(1 z
2
)
f
2
=
i
2
(1 +z
2
)
f
3
= z
(u, v) =
1
2
_
u
u
3
3
+uv
2
, v +
v
3
3
u
2
v, u
2
v
2
_
.
V. Hennebergs surface:
f
1
(z) =
_
1
z
4
+
1
z
2
+ 1 z
2
_
f
2
(z) = i
_
1
z
4

1
z
2
+ 1 z
2
_
f
3
(z) = 2
_
z
1
z
3
_
39
(u, v) =
_
u
3
(1 u
2
v
2
)
3
3uv
2
(1 u
2
v
2
)(1 +u
2
+v
2
)
2
3(u
2
+v
2
)
3
3u
2
v(1 +u
2
+v
2
)
2
(1 u
2
v
2
) v
3
(1 u
2
v
2
)
3(u
2
+v
2
)
3
(1 u
2
v
2
)
2
u
2
(1 +u
2
+v
2
)v
2
(u
2
+v
2
)
2
_
In the case of problems
I(c) =
_
t
1
t
0
L(c
1
, . . . , c
n
, c
1
, . . . , c
n
)dt
which are dened on parameterized curves it is natural to ask: under which
conditions is the integral independent of the parameterization? This will be
the case if L is homogeneous in z i.e.
L(x, z) = L(x, z) ( > 0)
As an example we have
I(c) =
_
_
g
ij
c
i
(t) c
j
(t)dt
Remark: In the case of a parameterization problem we have L is homoge-
neous of degree 1 hence L
x
i
is homogeneous of degree 1 and L
z
i
is homoge-
neous of degree 0. A result of Euler states that if is homogeneous of degree
n, then
i
= n
From this it follows that we have the following relationships:
L
z
k
z
k
= L
L
x
i
z
k
z
k
= L
x
i
L
z
i
z
j
z
j
= 0
Hence the equations:
d
dt
L
z
k
_
c(t), c(t)
_
L
x
k
_
c(t), c(t)
_
= 0 (k = 1, . . . , n)
are not independent. This means that if we add the further equation
c
i
(t)
2
= 1,
the system will not be overdetermined. This is useful in simplifying calcula-
tions.
40
Example: Fermats principle.
L(t, z) = (x
1
, x
2
, x
3
)
_
z
2
1
+z
2
2
+z
2
3
.
If we choose the curve with arc-length parameterization, we have
d
ds
(
k
) =
x
k
(explicitly:
d
ds
_
_
(s)
_
k
(s) =
x
k
_
(s)
_
_
or
d
ds
(((s))
k
(s)) = grad((s)).
The Brachistone:
I(c) =
_
t
1
t
0
_
c
1
(t)
2
+ c
2
(t)
2
_
c
2
(t)
with boundary conditions c(t
0
) = (a, 0), c(t
1
) = (b
1
, b
2
) with b
1
> a, b
2
> 0.
L(x, z) =
_
z
2
1
+z
2
2
x
2
L
x
1
= 0, L
x
2
=
_
z
2
1
+z
2
2
2x
2
x
2
L
z
i
=
1
x
2
z
i
_
z
2
1
+z
2
2
We assume that the curve is parametrized by arc length i.e. that c
2
1
+ c
2
2
= 1.
In accordance with the notation of Dierentialgeometrie we write (s) for
c(t). The rst equation is:
d
ds
1
(s)
_
2
(s)
= 0.
Hence

1
(s)
_
2
(s)
is constant.
Case 1. The constant = 0. Then
1
(s) = 0 and so
1
is constant. But this
is impossible, since b
1
> a.
Case 2. The constant k = 0.
2
(s) = k
1
(s)
2
(k > 0)
41
It follows from results of Dierentialgeometrie that we can write
(s) =
(cos (s), sin (s)) for a smooth function . Then
2
(s) = k cos
2
(s).
Dierentiating: sin (s) = 2k cos (s) sin (s)
(s) i.e. 2k cos (s)
(s) =
1. Hence
ds
d
= 2k cos and so
d
1
d
=
(s)
ds
d
= 2k cos
2
.
Hence
= k
1
k(+sin cos ). If we use t = 2 as a new parameter;
we have
cos
2
=
1
2
(1 + cos 2) =
1
2
(1 cos t)
+ sin cos = +
1
2
sin 2 =
t
2
+
1
2
sin t.
This leads to the parameterization
1
= d +
k
2
(t sin t),
2
=
k
2
(1 cos t)
_
d = k
1
k
2
_
i.e. the solution is a cycloid.
Geodetics: We consider a Riemann manifold with metric
ds
2
=
g
ij
(x)dx
i
dx
j
(cf. Dierentialgeometrie). Then
I(c) =
_
t
1
t
0
_
g
ij
_
c
1
(t), . . . , c
n
(t)
_
c
i
(t) c
j
(t)
_1
2
dt
with side conditions c(t
0
) = x
0
, c(t
1
) = x
1
.
L(x, z) =
_
i,j
g
ij
(x)z
i
z
j
_1
2
L
x
k
=
1
2
g
ij
x
k
z
i
z
j
_
g
ij
z
i
z
j
L
z
k
=
l
g
kl
z
l
_
g
ij
z
i
z
j
.
If we assume that we have parametrized by arc length, i.e.
g
ij
(c(t)) c
i
(t) c
j
(t) =
1 we have:
d
ds
_
L
z
k
_
(s),
(s)
_
=
l
g
kl
_
(s)
_
l
(s) +
i,j
g
kj
x
i
i
(s)
j
(s)
42
=
l
g
kl
l
+
1
2
i,j
_
g
kj
x
i
+
g
ki
x
j
_
j
(s)
i
(s)
This leads to the equation:
l
g
kl
l
+
i,j
ij,k
_
(s)
_
j
(s)
i
(s) = 0 k = 1, . . . , n
where
ij,k
=
1
2
_
g
jk
x
i
+
g
ik
x
j
g
ij
x
k
_
(cf. Dierentialgeometrie).
4.3 Applications to Physics:
The Euler equations of suitable variational problems arise in Physics in two
situations:
I. Equilibrium states Principle of minimization of potential energy.
II. Dynamics Variational principle of Hamilton. The corresponding Euler
equations are the dierential equations of the corresponding physical systems.
We consider a system with n degrees of freedom. Then the position is a
function of n parameters
q
1
, . . . , q
n
T( q
1
, . . . , q
n
, q
1
, . . . , q
n
) is the kinetic energy. It has the form
T =
P
ik
(q
1
, . . . , q
n
, t) q
i
q
k
The potential energy U is a function of position and time i.e. U(q
1
, . . . , q
n
, t).
Hamiltons principle: This states that the system minimizes the func-
tional:
J =
_
t
1
t
0
(T U)dt
Then the Euler equations are:
d
dt
T
q
i

q
i
(T U) = 0
Now at an equilibrium point, we have T =
a
ik
q
i
q
k
(where (a
ik
) is a positive
denite matrix and U =
b
ik
q
i
q
k
(where (b
ik
) is a positive denite matrix).
This leads to the system of equations:
a
ik
q
k
+
b
ik
q
k
= 0
43
Examples: I. The vibrating string: Kinetic energy
T =
1
2
_
1
0
u
2
t
dx ( the density function)
U =
1
2
_
1
0
u
2
x
dx.
( is the coecient of elasticity). Then
_
t
1
t
0
(T U)dt =
1
2
_
t
1
t
0
_
1
0
(u
2
t
u
2
x
)dxdt.
This gives the equation
u
tt
u
xx
= 0
If a force f(x, t) is acting, then we give an additional term
_
1
0
f(x, t)u dx and
so the equation
u
tt
u
xx
+f(x, t) = 0
For stable equilibrium we minimize the functional
_
1
0
_
2
u
2
x
+fu
_
dx.
This leads to equation u
xx
f(x) = 0. The corresponding terms for a beam
are:
u
tt
+u
xxxx
+f(x, t) = 0
resp.
u
xxxx
+f(x) = 0
Further examples: I. A free particle. In this case
L =
1
2
mv
2
2 =
m
2
(v
2
1
+v
2
2
+ v
2
3
).
i.e. L = z
2
1
+z
2
2
+z
2
3
.
II. A system of non-reacting particles:
L = T =
v
2
2
where v
2
= (z
1
)
2
+ (z
2
)
2
+ (z
3
)
2
. If the particles react with each other
(e.g. by gravity or electricity), then this is modied by a term U which is a
44
function of the position vectors of the particles e.g. L = T U where T is
as above and
U =
1
2
m
a
m
b
|tr
tr
|
.
The special cases of two resp. three particles are the famous Kepler problem
of the motion of the Sun-Earth system, respectively the three body problem.
III. Double pendulum swinging in a plane (cf. Landau and Lifchitz):
L =
m
1
+m
2
2
[l
2
1
2
1
+l
2
2
2
2
+m
2
l
1
l
2

2
cos(
1
2
)] +
+(m
1
+m
2
)gl
1
cos
1
+m
2
l
2
cos
2
.
In this context it is illuminating to consider the invariance of the Eu-
ler equation under changes of coordinates. Suppose that we introduce new
coordinates ( q
, v
t) where
q
= q(q, t)

t = t.
Then
v
j
q
i
q
j
v
j
+
q
i
t
.
Then if F = F(q, v, t),
d
dt
_
F
v
i
F
q
i
_
=
j
q
j
q
i
_
d
dt
_
F
q
j
_
F
q
j
_
.
Example: Consider a single particle moving under gravity. We calculate
its equations of motion in spherical coordinates i.e.
q
1
= x, q
2
= y, q
3
= z, U = mgy, T =
m
2
( x
2
+ y
2
+ z
2
)
resp.
q
1
= r, q
2
= , q
3
= .
Then
L =
1
2
m( r
2
+r
2

2
+r
2
sin
2
2
) mgr cos .
Hence the equations of motion are
0 =
d
dt
_
L
r

L
r
_
= m r mr(
2
+ sin
2
2
) +mg cos
45
(corresponding to the dependency on r). From the dependency on , we have
0 = mr
2
+ 2mr r
r
2
sin cos

2
mgr sin ,
while from the dependency of , we get
d
dt
(mr
2
sin
2

) = 0.
(The last equation means that angular momentum about the z-axis is pre-
served. This is a g general fact. If we have a coordinate system in which
L is independent of the coordinate q
i
, then the corresponding momentum
p
l
=
L
v
l
will be conserved.
Example: We consider the transformation from stationary Cartesian co-
ordinates to rotating ones i.e.
q
1
= q
1
cos t q
2
sin t, q
2
= q
1
sin t + q
2
cos t, q
3
= q.
Solving the Euler equations: The general Euler equation can be rather
intractable. However, in certain cases where L has a particularly simple form,
one can apply standard methods of ordinary dierential equations. Examples
a) L is independent of x. Then the Euler equation is
d
dt
L
z
_
t, c(t)
_
= 0
and so L
z
(t, c(t)) is constant. In principle we can solve this to get c(t)
explicitly as a function of t and obtain the solution by interpolation.
b) L is independent of t Then we have
d
dt
_
c(t)L
z
c(t), c(t)
_
L(c(t), c(t)) = c(t)L
z
+ c(t)
d
dt
L
z
L
z
c(t) L
x
c(t)
= c(t)
_
d
dt
L
z
L
x
_
= 0
Hence we have the equation:
L
_
c(t), c(t)
_
c(t)L
z
_
c(t), c(t)
_
= const
which we again solve for c(t) as function of c(t) and the constant. We can
solve this by standard methods (rewriting as a dierential equation in t!).
46
Examples. I. The Isoperimetric problem. First we simplify the problem
by assuming that the curve is convex. We choose the x-axis so that it halves
the area. We thus reduce to the following variational problem: Maximize:
_
t
0
0
c(t)dt
under the conditions c(0) = c(t
0
) = 0,
_
t
0
0
_
1 + c(t)
2
dt = 1 (where t
0
is also
unknown). If we choose s =
_
t
0
_
1 + c(u)
2
du as new independent variable,
then we reduce to the following:
_
1
0
y(s)
_
1 y
(s)
2
ds
where y(s) = c(t). One solves this equation. The solution to the original
problem is then the parameterized curve
_
x(s), y(s)
_
where
x(s) =
_
s
0
_
1
_
dy
ds
_
2
ds
In our concrete case we have
L(t, x, z) = x
1 z
2
Hence
y(s)
_
1 y(s)
2
=
1
d
. This has solution y =
1
d
sin(ds +c
1
)
x(s) =
_
_
1 y(s)
2
ds =
_
sin(ds +c
1
)ds =
1
d
cos(ds +c
1
) +d
1
.
From this it follows that the solution of the isoperimetric problem is a circle.
II. The Brachistochrone:
_
t
1
t
0
_
1 + c(t)
2
(t, c(t))
dt
L(t, x, z) =
1 +z
2
(t, x)
= (t, x)
_
1|z
2
c(t) =
x
_
t, c(t)
_
t
_
t, c(t)
_
c(t)
_
1 + c(t)
2
_
.
Special case:
(t, x) =
1
x
i.e. L(t, x, z) =
_
1 +z
2
x
47
Then the equation is
1
_
c(t)(1 + c(t)
2
)
=
1
d
. (The solution is a cycloid).
II. Minimal surface of variation.
_
t
1
t
0
c(t)
_
1 + c(t)
2
dt
This is the special case of the above example with
L(t, x, z) = x
1 +z
2
i.e. (t, x) = x
Then we have
c(t)
_
1 + c(t)
2
=
1
d
This has solution c(t) =
1
d
cosh(dt + c
1
) i.e. the solution is the surface of
rotation of a catenary i.e. a catenoid.
Problems without xed endpoints: Here we consider the problem of
nding an optimal value of the functional
I =
_
t
1
t
0
L(t, c, c)dt
without conditions on the endpoints. Since a solution is automatically a solu-
tion of the problem with xed endpoints and so satises the Euler equation,
the expression for its derivative simplies to
(DI)
c
(h) = L
z
_
t, c(t), c(t)
_
t
1
t
0
.
This leads to the additional condition: L
z
(t, c(t), c(t)) = 0 at t
0
and t
1
for
the solution. In a similar way for the case of optimizing the functional
I =
_
t
1
t
0
L(t, x
1
, . . . , x
n
, z
1
, . . . , z
n
)dt
without conditions on the endpoints, we get the additional equations
L
z
i
_
t, c
1
(t), . . . , c
n
(t), c
1
(t), c
2
(t), . . . , c
n
(t)
_
= 0 (i = 1, . . . , n)
at t
0
and t
1
. For
I =
_ _
L
_
u, v, (u, v),
1
(u, v),
2
(u, v)
_
du dv,
we get the additional equation
L
z
1
d
ds
L
w
d
ds
= 0
on U.
48
Problems with side conditions: Recall the proof of the spectral theo-
rem: In the nite dimensional case, we consider the problem of maximizing
the function of (Ax|x) =
a
ij
j
under the side condition (x|x)
2
=
2
i
=
1. The method of Lagrange multipliers leads to the fact that the maximum
occurs at an eigenvector and hence such an eigenvector exists. In the case
of a compact, self-adjoint operator, we consider the problem of optimizing
I(x) = (Ax|x) under the condition (x|x) = 1. Then
(DI)
x
(h) = 2(Ax|h)
(DI
)
x
(h) = 2(x|h)
Hence if a x
0
is a maximum, there is a so that
(Ax
0
|h) = (x
0
|h)
for each h H. This again means that x
0
is an eigenvector. The general
situation is as follows. I and I
are functionals on an ane subspace of a

Banach space. Then if c is a solution of the optimisation problem: I(c) is
maximum (or minimum) with side condition I
(c) = 0 then there is a R

so that
(DI)
c
= DI
c
.
We can reduce to the nite dimensional case by the following trick. Firstly,
we assume that E is separable (the general case is proved similarly, with
a slightly more complicated terminology). Then we can express E as the
closed hull of the union
E
n
of nite dimensional subspaces. Suppose that c
is a solution of our optimisation problem. Without loss of generality we can
assume that c lies in E
1
and hence in each E
n
. c is then a fortiori a solution
of the optimisation problem for I restricted to each E
n
. Since (DI
)
c
= 0, we
can also, without loss of generality, assume that the restriction of the latter
to each E
n
is non-zero. Hence by the nite dimensional version of the result,
there is, for each n a scalar
n
so that
(DI
n
)
c
+
n
(DI
n
)
c
= 0,
where the subscript denotes the restriction to E
n
. But it then follows easily
that the
n
are all equal and so we have the equation
(DI)
c
+(DI
)
c
= 0
where we have denoted by the common value.
49
Example: The isoperimetric problem. In this case
L(x, z) =
1
2
(x
1
z
2
x
2
z
1
), L
(x, t) =
_
z
2
1
+z
2
2
L
x
1
=
1
2
z
2
, L
x
1
= 0
L
x
2
= =
1
2
z
1
, L
x
2
= 0
L
z
1
=
1
2
x
2
, L
z
1
=
z
1
_
z
2
1
+z
2
2
L
z
1
=
1
2
x
1
, L
z
2
=
z
2
_
z
2
1
+z
2
2
Assume: z
2
1
+z
2
2
= 1. Then we have the equation
d
ds
_
1
2
2
(s) +
1
(s)
_
1
2
2
(s) = 0
d
ds
_
1
2
1
(s) +
2
(s)
_
+
1
2
1
(s) = 0
Hence:
2
= 0,
2
+
1
= 0
Hence:
(
1
) =
2
1
+
2
2
= 1
Hence is constant thus the solution is a circle.
Example: The hanging chain: This corresponds to minimising the func-
tional
I(c) =
_
t
1
to
c(t)
_
1 + c(t)
2
dt
under the side condition
I
(c) =
_
t
1
t
0
_
1 + c(t)
2
ldt 1 = 0.
Then
L +L
= (x +)
1 +z
2
and so
(L +L
)
x
=
1 +z
2
, (L +L
)
z
= (x +)
z
1 +z
2
.
50
The Euler equation is thus
d
dt
(c(t) +)
c(t)
_
1 + c(t)
2
) =
_
1 + c(t)
2
.
This is the same as the equation for the minimal surface of rotation (with
c(t) replaced by c(t) +). Hence the solution is
c(t) = +d cosh
t
t
d
with boundary conditions
+d cosh
t
t
d
=
1
, +d cosh
t
1
t
d
=
2
and further condition
d(sinh
t
1
t
d
sinh
t
0
t
d
) = 1.
(The constants d and

t are then determined by the equations
d =
t
1
t
0
2
,
t =
1
2
[(1

)t
0
+ (1 +

)t
1
]
where tanh = b a and
sinh
=
_
1 (b a)
2
t
1
t
0
with > 0).
In our treatment, we have tacitly assumed that the Euler equation, which
in the case of the simplest functional
I(c) =
_
t
1
t
0
L(t, c(t), c(t)) dt
is an implicit ordinary dierential equation of order 2 can be solved to give
an explicit equation. If we regard this equation in the following form
L
x
(t, c(t) c(t)) L
zt
(t, c(t) c(t)) c(t)L
zx
(t, c(t) c(t)) c(t)L
zz
(t, c(t), c(t)) = 0
we see that the condition for this to be the case is that L
zz
(t, c(t) c(t)) = 0
along the solution c. We consider here the case which is the extreme opposite
of this namely where the Euler equation is such that every suitable curve is
51
a solution. If we consider the above form of the equation, we must clearly
have that L
zz
vanishes identically. This means that L has the form
A(t, x) +zB(t, x).
The Euler equation is then
d
dt
L
z
L
x
= 0
which reduces to the equation B
t
A
x
= 0. i.e. the well-known integrability
condition for the dierential form Adt +Bdx. In this case we have
_
t
1
t
0
L(t, c(t) c(t)) dt =
_
t1
t
0
[A(t, c(t)) + c(t)B(t, c(t))] dt =
_
c
Adx +Bdy
where c is the curve with parametrisation t (t, c(t)) and this is independent
of the curve when the above integrability condition holds.
The second variation: As in the case of nite dimensional optimisation,
the second variation of the functional can provide more detailed information
on the nature of an extremum. We shall use this method here to show that
if c is a solution of the Euler equations for the functional
I(c) =
_
t
1
t
0
L(t, c(t), c(t)) dt
with xed endpoints, then
L
zz
(t, c(t), c(t)) 0
is necessary for c to be a minimum of I. (We remark here that the non-
vanishing of this expression is the Lagrangian condition for the fact that the
Euler equation can be written as an explicit dierential equation of second
order. In the case of a system of equations i.e. where
L = L(x
1
, . . . , x
n
, z
1
, . . . , z
n
),
then the corresponding condition is that the matrix L
z
i
z
j
be positive semi-
denite).
In order to prove the above fact, consider the Taylor expansion
I(c +tH) = I(c) + (DI)
c
(th) + (D
2
I)
c
(th, th) +o(||th||
2
).
52
Since (DI)
v
vanishes, we see that
lim
s0
1
s
2
(I(c +sh) I(c)) = (D
2
I)
c
(h, h).
Hence, by the one-dimensional result, if c is a minimum, we must have
(D
2
I)
c
(h, h) 0 for each h. But a simple calculation shows that in this
case,
lim
s0
1
s
2
(I(c +sh) I(c)) =
_
t
1
t
0
_
L
xx
H
2
(t) + 2L
xz
h(t)h
(t) +L
zz
h
(t)
2
dt.
By choosing for h concrete peak functions, it follows that L
zz
= 0 along c.
Invariance of the Euler equation under coordinate transformations:
It follows from the coordinate-free description of the optimisation problem,
that the Euler equations are invariant under suitable coordinate transforma-
tions. The explicit calculation of this fact is particularly important in two
concrete cases: changes in the independent variable t (reparametrisations)
and changes in x (new generalised coordinates for physical systems). For the
detail, see the lectures.
Remarks on existence: In this course we have generally avoided the dif-
culties involved in the demonstration of the existence of a solution to the
optimisation problem. Three exceptions were the proofs of the Riemann
mapping theorem and the spectral theorem for compact operators (where we
used a compactness argument) and the existence of a best approximation
from a compact subset of a Hilbert space (where we used a geometrical ar-
gument). In general, the existence question is one of considerable depth. For
example it embraces such signicant problems as the solution of the Dirichlet
problem and Plateaus problem on the existence of minimal surfaces bounded
by given closed, space curves, resp. the existence of geodetics on curved man-
ifolds (theorem of Hopf-Rinow). In fact, it is not dicult to supply examples
where simple optimisation problems fail to have a solution or, even when
they do, the solution is not obtainable as a limit of a minimising sequence.
Examples: 1. Firstly, we can obtain a large collection of examples where
a smooth functional need not have a minimal by considering a well-behaved
manifold such as the plane or the sphere where any pair of points can always
be joined by a minimal geodetic and removing a point. Then it is easy to
construct pairs in the adjusted manifold which cannot be so joined.
53
2. Consider the points (1, 0) and (1, 0) in the plane and the problem of
nding a smooth curve c of minimal length joining them and satisfying the
condition that c(0) = (0, 1). It is clear that the minimal length is 2 but that
this is not attained by any appropriate curve.
3. The problem of minimising the functional
_
1
1
x
4
y
2
dx
for the set of all curves y = y(x) with y(1) = 1, y(1) = 1. The minimal
value is 0 and one again, it is not attained by any smooth curve.
4. The following is an example where the optimisation problem trivially has
a solution. However, we display a minimal sequence which is such that no
subsequence converges to the solution. We consider the Dirichlet problem
i.e. that of minimising the functional
D() =
_ _
G
(
2
x
+
2
y
)dxdy
in the trivial case where G = {x
2
+ y
2
1} and the boundary value is the
zero function. Of course, the solution is the constant function zero. However
we can construct an explicit minimal sequence which does not converge to
the zero function (cf. Lectures). We conclude the course with a brief mention
of abstract methods which are used to attack the problem of existence.
Weak topologies: It is often advantageous to replace the norm topology
on the Banach space upon which the functional is dened by the so-called
weak topology (i.e. the initial topology induced by the elements of the
dual of E). This has the advantage that larger sets can be compact for
the weak topology. For example in the L
p
-spaces (1 < p < ) and the
corresponding Sobolew spaces (see below), the unit ball is weakly compact
(but not, of course, norm compact). The disadvantage is that, this topology
being weaker than the norm one, it is harder for the functional to b e
continuous.
Semicontinuity. The previous objection can often be met by noting that
for a function on a compact set to be bounded below and attain its minimum,
it suces for it to be lower semi-continuous. Thus the norm on an innite
dimensional Banach spaces is not weakly continuous but it is weakly lower
semi-continuous.
54
Sobolew spaces: In applying abstract methods to functionals given by
integrals of suitable kernels, it is crucial to choose the correct Banach spaces
on which the functional acts. These are determined by the growth and dif-
ferentiability conditions required to give meaning to the integral expression
involved. In order to get spaces with suitable properties, it is usually nec-
essary use a generalised form of derivative, the distributional derivative (see
Vorlesung, Funktionalanalylsis II, or a course on Distribution theory). The
resulting spaces are called Sobolew spaces and are dened as spaces of func-
tions for which suitable distributional derivatives are L
p
-functions.
As an example of an abstract result of the type which can be used to
obtain existence statements in a general situation, we mention without proof:
Proposition 23 Let E be a reexive Banach space, : E R a norm con-
tinuous functional. Then if is convex, it is weakly lower semi=continuous.
If is weakly semi-continuous and coercive i.e. such that lim
x
(x)
||x||
.0,
then attains its minimum on E.
55

Mark Rusi Bol

Uploaded by

Document Information

Original Description:

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

Mark Rusi Bol

Uploaded by

Copyright:

Available Formats

Calculus of Variations (with notes on

(t) > 0. Hence 2 ln

(t) is then the derivative of x. If this derivative is a

. Then we can identify x with a

, f x is measurable. The converse is not true in general.

, then the scalar function f x is analytic and so

, the integral must be zero. Exactly as in the

is also dierentiable and hence x is

and so by the uniform boundedness theorem,

. But it follows immediately

. Hence x = x and so x is holomorphic and each c

|| is the supremum norm of

)f(x) for ||x|| 2 and

||||x y|| sup

f(x, y) = 0 if and only if (x, f(x, y)) = (x, 0) or,

IV. Ennepers surface:

(s) i.e. 2k cos (s)

are functionals on an ane subspace of a

(c) = 0 then there is a R

You might also like