Variational Calculus

MECHANICS AND CALCULUS OF VARIATIONS



Department of Mathematical Sciences



UNIVERSITY OF SOUTH AFRICA

Only Study Guide for

APM3712
Prof H Jafari
© 2022 University of South Africa
All rights reserved
Printed and Published by the
University of South Africa
Muckleneuk, Pretoria

APM3712/0/2022

Contents

PREFACE

1 HAMILTON’S PRINCIPLE
1.1 Hamilton’s variational principle
1.2 Generalized coordinates and holonomic dynamical systems
1.3 Exercises

2 THE EULER-LAGRANGE EQUATIONS
2.1 The calculus of variations
2.2 Some elementary classical problems
2.2.1 The brachistochrone problem
2.2.2 Minimal surfaces of revolution
2.2.3 The simplest isoperimetric problem
2.2.4 Remarks
2.3 The general formulation of the simplest single-integration problem
2.4 The Euler-Lagrange equations
2.5 Exercises

3 THE LAGRANGE AND MAYER EQUATIONS
3.1 Non–conservative holonomic dynamical systems
3.2 The problem of Lagrange from another perspective
3.3 The isoperimetric problem and the Mayer equations
3.4 Exercises

4 THE HAMILTON–JACOBI THEORY
4.1 The canonical momentum and the Hamiltonian function
4.2 The Hamilton–Jacobi equation
4.3 Hamilton’s equations
4.4 The Weierstrass conditions
4.5 Properties of extremal arcs
4.6 Exercises

5 ADDITIONAL TOPICS IN VARIATIONAL CALCULUS
5.1 The multiple integral problem
5.2 Equivalent Lagrangian functions
5.3 Canonical transformations
5.4 Noether’s theorem for single integral problems
5.5 Exercises

6 APPROXIMATE SOLUTION OF VARIATIONAL PROBLEMS
6.1 Reduction of BVP into Variational Problems
6.2 Direct Methods to Solve Variational Problems
6.2.1 Rayleigh-Ritz Method to find approximate solution
6.2.2 Euler’s Finite Difference Method
6.3 Exercises

REFERENCES

Index

PREFACE

This module deals with certain aspects of the calculus of variations and their connection
with classical mechanics via Hamilton’s variational principle.

This variational principle plays a vital role in applied mathematics and physics. While its
greatest application is currently in quantum mechanics, it also forms the basis of the method of
finite elements. It is also playing an increasing role outside physics, as you will see here and there
in the guide. On the whole, however, most of the problems and applications that we will deal with
are in mechanics.

In this module we deal exclusively with aspects of what is known as the simplest problem
in the calculus of variations, namely that involving only one independent variable (usually, in
mechanics, this will be time). This simplest problem is discussed in Chapter 1 in the context of
classical mechanics where it is linked to the more general and elegant theory known as Hamilton’s
principle. The important notions of generalised coordinates and holonomic dynamical systems are
dealt with and illustrated by examples. At the end of the chapter you will find a large variety of
exercises.

In Chapter 2 we discuss two classical examples of this type of problem, namely the
brachistochrone problem and the problem of minimal surfaces of revolution. These examples are first
formulated precisely as variational problems in R^{n+1} and we then derive the Euler-Lagrange
differential equations. These Euler-Lagrange equations are necessary conditions that any solution of a
variational problem must satisfy in order to minimise (or maximise) the fundamental integral, and
these form the core of the chapter.

Non-conservative problems and their applications are discussed in Chapter 3. The problem
of Lagrange and the isoperimetric problem are discussed, together with their dependence on the form of
the auxiliary conditions. An interesting aspect of the formulation of the problem of Lagrange is
illustrated with an example. Several examples are dealt with and the final section of the chapter
is devoted to exercises.

In Chapter 4 we continue with the variational theory. We consider aspects of what is known
as the Hamilton-Jacobi theory. This deals with properties of families of solutions of the
Euler-Lagrange equations (these families are also known as fields, which play such an important
role in the “field theories” of physics). We then derive more sophisticated conditions subject to
which the fundamental integral will assume an extreme value. These conditions are named after
Karl Weierstrass. The theory is then applied to a couple of problems and the chapter ends with a
variety of exercises.

Chapter 5 is devoted to a few topics in variational calculus. Unfortunately the treatment
of these is not very detailed as otherwise the module would become too long. In fact each of these
topics is an entire study on its own. We generalize the single integral to multiple integrals and
derive the equivalent Euler-Lagrange equations. This derivation is much more complicated than
in the case of the single integral and gives you an indication of the difficulties of generalising a
simple concept. The next concept to receive attention is that of equivalent Lagrange-functions.
We look for the most general Lagrange-function which yields a specific set of equations of motion.
This is a very active field of research and you will realize very quickly that it is almost possible
to solve this problem in terms of multiple integrals! In the following section we discuss canonical
transformations which are used primarily to transform canonical equations to a set of differential
equations which are easier to solve. The final topic is the very important theorem of Noether, a
consequence of which are the conservation laws of mechanics.

The final chapter is devoted to converting a boundary value problem (BVP) into a variational
problem. After that we present a few methods, such as the Rayleigh-Ritz method and Euler’s finite
difference method, to obtain approximate (numerical) solutions of variational problems.

I use the following notation in this study guide. Any reference to a book or an article
appears as a number in square brackets, e.g. [10], and this number appears in a list of references
at the end of the guide. Equations are numbered in the form (n.m), where n refers to the chapter
and m to the number of the equation in that chapter.
This study guide is meant to be complete. This implies that you need to work right
through it, and you need to fill in any steps left out in the working. Always remember that
mathematics is an activity and it requires work on your part in order to understand
and assimilate it. This implies working out problems, posing and solving new problems, and in
so doing you extend your knowledge of mathematics.

This brings us to the large variety of exercises which appear at the end of each chapter in
the guide. These exercises include a number of theoretical questions which relate to the work of
that chapter and the idea is to test how well you understand it. You are expected to answer these
questions in your own words and not to just repeat the contents of the guide. These questions will
also help with your preparation for the exam since similar questions may appear in the question
paper. The remaining problems deal with applications of the theory. They are mostly drawn from
mechanics although some problems are formulated purely in terms of the variational calculus. Try
to do as many of the exercises as possible and especially those that don’t appear to be trivial, and
at least read through all of them so that you are aware of what is “available on the market”. You
need not tackle all the routine problems, just do enough of them to master the techniques that they
demonstrate. Exam questions will be of the same standard as those that you encounter in the guide.

You don’t need to consult the books listed to do this course. You may feel relieved about
this but I should point out that it is a good idea to refer to these references (especially when
tackling the exercises) as this will broaden your horizons. Naturally if you are really interested in
this topic you will find pursuing the references very rewarding. I should emphasize that I have not
listed all the books dealing with variational calculus and mechanics. There are many more that
are available in the library, most containing a variety of examples and exercises.

I would like to conclude with a request. Although I have gone through this manuscript with a
fine-tooth comb for possible errors, some may have slipped through. Please do me a favor and point
out any errors that you might encounter (informally, as a note attached to an assignment) so that
they can be corrected in a subsequent edition.

Prof. H. Jafari 2022

1st Edition by Dr. W.M. Lesame 1998

2nd Edition by Dr. W.M. Lesame 2010

3rd Edition by Prof. H. Jafari 2018

4th Edition by Prof. H. Jafari 2022



Chapter 1

HAMILTON’S PRINCIPLE

In the first level modules of applied mathematics you came across mechanics and specifically
Newton’s laws of motion. These are not the only approaches to mechanics. Other theories exist
which deal with the same practical situations but from different points of view. One of these is
Hamilton’s variational principle.

Objectives
At the end of this chapter you will be able to:

• formulate Hamilton’s variational principle,

• identify the number of degrees of freedom and the number of generalized coordinates
for various dynamical systems, and

• construct the Lagrangian function of a given dynamical system.

1.1 Hamilton’s variational principle

Consider the following problem. A ball is thrown vertically upwards into the air. Expressing
this mathematically we could say, consider a particle moving vertically upwards, with an initial
velocity u, in the earth’s gravitational field. We ignore the effect of air resistance. Denote the
height of the particle by x. According to Newton’s second law of motion we have

mẍ = −mg, (1.1)

from which we can get the speed of the particle and its height at any instant, namely

ẋ = u − gt, (1.2)

and
$$ x = ut - \tfrac{1}{2} g t^{2}. \tag{1.3} $$

The particle reaches its maximum height

$$ h = \frac{u^{2}}{2g}, \tag{1.4} $$

above the ground after a time u/g. Figure 1.1(a) represents the actual situation as the particle
(ball) moves vertically upwards, turns around and returns downwards. If you examine equation
(1.3) you will see that the path is a parabola, and its graph in the (t, x)–plane is shown in Figure
1.1(b). We can now identify the movement of the ball in physical three-dimensional space with the
motion of a mathematical point along the graph C in the (t, x)–plane.

Definition 1 We say that the graph C is the path or trajectory of the particle in the two-
dimensional (t, x)–space. This space is called the configuration space.

Figure 1.1

It is important to realize that the path in the configuration space is just a representation
of the motion of the particle and not the actual motion of the particle. This should be evident if
you look at Figure 1.1 and compare the two graphs. One shows the motion of the ball in actual
physical space and the other how this motion appears in configuration space. We show below the
graphs, as functions of time, of the potential energy of the particle, namely

$$ V = mgx = mg\left(ut - \tfrac{1}{2} g t^{2}\right), \tag{1.5} $$

and its kinetic energy, namely

$$ T = \tfrac{1}{2} m\dot{x}^{2} = \tfrac{1}{2} m (u - gt)^{2} = \tfrac{1}{2} m u^{2} - mg\left(ut - \tfrac{1}{2} g t^{2}\right), \tag{1.6} $$

together with the graphs of the difference T − V and the sum T + V (see Figure 1.2).
Firstly we see that the graph of T + V is a straight line with zero gradient, namely

$$ T + V = \tfrac{1}{2} m u^{2} = E. \tag{1.7} $$
This, as will be elaborated in Chapter 5, is a consequence of the law of conservation of energy and
E is the total energy of the particle. Secondly, the graph of T − V against time is also a parabola
and it is the area under this graph which holds the key to the dynamical principle dealt with in
this module. Without any input from us (except for the initial velocity) the particle moves in
such a way that the area under the T − V graph is a minimum. Nature is certainly remarkably
interesting! The area under the T − V graph is given by

$$ I = \int_{0}^{2u/g} [T - V]\, dt. \tag{1.8} $$

Figure 1.2

We can formulate this natural phenomenon in mathematical terms as: when a particle moves
along its path C (given by (1.3)) in the configuration space, then the time integral (1.8) will have
a minimum value. Substituting the expressions (1.5) and (1.6) for the potential and kinetic energies
in (1.8), we find that this minimum value is given by

$$ I = -\frac{m u^{3}}{3g} = -\frac{2}{3}\, m h u. \tag{1.9} $$
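
The value (1.9) can be checked numerically. The following is only a minimal sketch: the values of m, g and u are arbitrary illustrative choices that do not come from the guide, and the helper name `action` is hypothetical. It integrates T − V along the natural path (1.3) with the midpoint rule and compares the result with both closed forms in (1.9).

```python
# Midpoint-rule check of (1.9) for the vertical throw.
# m, g and u are arbitrary illustrative values, not taken from the guide.

def action(x, x_dot, m, g, t_end, steps=20000):
    """Approximate the time integral of T - V over [0, t_end]."""
    dt = t_end / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt                      # midpoint of the k-th subinterval
        kinetic = 0.5 * m * x_dot(t) ** 2       # T = (1/2) m xdot^2
        potential = m * g * x(t)                # V = m g x
        total += (kinetic - potential) * dt
    return total

m, g, u = 2.0, 9.81, 5.0
t_end = 2.0 * u / g                             # total time of flight
h = u ** 2 / (2.0 * g)                          # maximum height, equation (1.4)

numerical = action(lambda t: u * t - 0.5 * g * t ** 2,   # natural path (1.3)
                   lambda t: u - g * t,                  # speed (1.2)
                   m, g, t_end)
closed_form = -m * u ** 3 / (3.0 * g)           # first form of (1.9)
in_terms_of_h = -2.0 * m * h * u / 3.0          # second form of (1.9)

print(numerical, closed_form, in_terms_of_h)
```

The three printed numbers should agree to within the accuracy of the midpoint rule.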

What exactly do we mean by the word minimum? Is it a minimum with respect to time, distance,
or something entirely different? Since T and V are functions of time only, and I is the integral
of T − V with respect to time, it follows that I is simply just a number which is dependent on
the particular path followed by the particle. In fact the value of I will only be a minimum if the
particle follows its natural path C (see (1.3)) in the configuration space. On the other hand if the
particle follows another path C′ in the configuration space with the same time period,
the corresponding integral I_{C′}, calculated for the path C′, naturally with other values of T and V,
will be greater than the minimum value (1.9)!

To be more precise, suppose that the particle moves up and down in such a way that its
path C′ in configuration space is given by (see Figure 1.3)

$$ x = h \sin\left(\frac{\pi g}{2u}\, t\right). \tag{1.10} $$

Figure 1.3

The new path C′ in configuration space will allow the particle to move up and down in the same
time 2u/g and up to the same height h (although this is not a requirement). The new potential
and kinetic energies at time t along this path are given by

$$ V = mgx = mgh \sin\left(\frac{\pi g}{2u}\, t\right), \tag{1.11} $$

and

$$ T = \tfrac{1}{2} m\dot{x}^{2} = \frac{\pi^{2}}{16}\, mgh \cos^{2}\left(\frac{\pi g}{2u}\, t\right). \tag{1.12} $$

So, the time integral of T − V along the new path C′ is given by

$$ \begin{aligned}
I_{C'} &= \int_{0}^{2u/g} [T - V]\, dt \\
&= mgh \int_{0}^{2u/g} \left[ \frac{\pi^{2}}{16} \cos^{2}\left(\frac{\pi g}{2u}\, t\right) - \sin\left(\frac{\pi g}{2u}\, t\right) \right] dt \\
&= -4mhu \left( \frac{1}{\pi} - \frac{\pi^{2}}{64} \right).
\end{aligned} \tag{1.13} $$

We have used the identity cos 2θ = 2 cos² θ − 1 to evaluate the integral. Since

$$ I_{C'} - I_{C} = mhu \left( \frac{\pi^{2}}{16} - \frac{4}{\pi} + \frac{2}{3} \right) > 0, \tag{1.14} $$

it follows directly that, of these two paths, the path C yields the smaller value for the integral I. How can we
be sure that the path C really provides us with the true minimum value of the integral I? One way
would be to calculate millions of other paths and then to compare the results with the “natural”
path C. Fortunately there is a better way to do this.
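
The comparison between the parabolic path C and the sinusoidal path C′ can also be made numerically. The sketch below is an illustration only: the values of m, g and u are arbitrary assumptions, and the helper `action` is a hypothetical name for a simple midpoint-rule integrator. It evaluates the time integral of T − V along both paths and compares the results with the closed forms (1.9) and (1.13).

```python
import math

# Compare the time integral of T - V along the parabola C and the sine path C'.
# m, g and u are arbitrary illustrative values, not taken from the guide.

def action(x, x_dot, m, g, t_end, steps=20000):
    """Approximate the time integral of T - V over [0, t_end] (midpoint rule)."""
    dt = t_end / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        total += (0.5 * m * x_dot(t) ** 2 - m * g * x(t)) * dt
    return total

m, g, u = 1.0, 9.81, 3.0
t_end = 2.0 * u / g                      # common duration of both motions
h = u ** 2 / (2.0 * g)                   # common maximum height (1.4)
w = math.pi * g / (2.0 * u)              # angular frequency appearing in (1.10)

# Natural path C, equation (1.3)
I_C = action(lambda t: u * t - 0.5 * g * t ** 2,
             lambda t: u - g * t, m, g, t_end)

# Comparison path C', equation (1.10)
I_C_prime = action(lambda t: h * math.sin(w * t),
                   lambda t: h * w * math.cos(w * t), m, g, t_end)

closed_C = -2.0 * m * h * u / 3.0                                        # (1.9)
closed_C_prime = -4.0 * m * h * u * (1.0 / math.pi - math.pi ** 2 / 64)  # (1.13)

print(I_C, closed_C)
print(I_C_prime, closed_C_prime)
print(I_C_prime - I_C)                   # positive, in line with (1.14)
```

The difference is small compared with the integrals themselves, which shows how close a plausible-looking competitor can come to the natural path without beating it.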

Suppose for the moment that we do not know what the path C, which will provide a minimum
value for the integral I in (1.8), looks like. We denote the path by x = x (t) and we shall prove
that x (t) is actually given by x (t) = ut − ½gt², in agreement with (1.3). For this purpose let C′ be any other path of the
particle in the configuration space which at any time t differs from the natural path C by a very
small quantity which we denote by εη (t). Here ε is a real number which is sufficiently small to
keep C′ and C close together and η (t) is an arbitrary function that vanishes at the endpoints
of the time interval, so that C′ and C have the same endpoints, namely

$$ \eta(0) = 0 = \eta\left(\frac{2u}{g}\right). \tag{1.15} $$

The function η (t) = sin (πgt/2u), for example, satisfies these conditions. The new path C′ will then
be defined by

$$ x = x(\varepsilon, t) = x(t) + \varepsilon\,\eta(t), \tag{1.16} $$

where η (t) satisfies condition (1.15) (see Figure 1.4). By allowing ε to vary we get various paths
C′ in the configuration space. Each path C′ corresponds to a different value of ε and is as close to
the natural path C as ε allows. In the case where ε is zero, C′ will coincide with the natural path
C : x = x (t) which minimises the integral I, that is

$$ x = x(0, t) = x(t). \tag{1.17} $$

Figure 1.4

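The potential energy of the particle along the varied path C′ is given by

$$ V = mgx = mg\,x(\varepsilon, t) = mg\,[x(t) + \varepsilon\,\eta(t)] \tag{1.18} $$

and the kinetic energy by

$$ T = \tfrac{1}{2} m\dot{x}^{2} = \tfrac{1}{2} m\dot{x}^{2}(\varepsilon, t) = \tfrac{1}{2} m \left[ \dot{x}^{2}(t) + 2\varepsilon\,\dot{x}(t)\,\dot{\eta}(t) + \varepsilon^{2}\,\dot{\eta}^{2}(t) \right]. \tag{1.19} $$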

The integral of T − V with respect to time along C′ becomes

$$ \begin{aligned}
I_{C'} &= \int_{0}^{2u/g} (T - V)\, dt \\
&= \tfrac{1}{2} m \int_{0}^{2u/g} \left[ \dot{x}^{2}(t) - 2g\,x(t) \right] dt
 + m\varepsilon \int_{0}^{2u/g} \left[ \dot{x}(t)\,\dot{\eta}(t) - g\,\eta(t) \right] dt
 + \tfrac{1}{2} m\varepsilon^{2} \int_{0}^{2u/g} \dot{\eta}^{2}(t)\, dt.
\end{aligned} \tag{1.20} $$

A close examination of this integral shows that it is actually only a function of the parameter ε,
since the integration with respect to time of the other variables x (t), η (t) and their derivatives
just yields numbers. To indicate this dependence we shall write I (ε) instead of I_{C′}. The value of
I along the natural path C will be denoted by I (0) .

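If the time-integral of T − V is a minimum along the natural path C, then the first derivative of
I (ε) with respect to ε is zero when ε = 0, that is,

$$ \left. \frac{dI(\varepsilon)}{d\varepsilon} \right|_{\varepsilon = 0} = 0. \tag{1.21} $$

Differentiating (1.20) with respect to ε we get

$$ \frac{dI(\varepsilon)}{d\varepsilon} = m \int_{0}^{2u/g} \left[ \dot{x}(t)\,\dot{\eta}(t) - g\,\eta(t) \right] dt + m\varepsilon \int_{0}^{2u/g} \dot{\eta}^{2}(t)\, dt, \tag{1.22} $$

and if we equate ε to zero for a minimum value of I, then we obtain

$$ \int_{0}^{2u/g} \left[ \dot{x}(t)\,\dot{\eta}(t) - g\,\eta(t) \right] dt = 0. \tag{1.23} $$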

The first term in the integrand of (1.23) can be simplified by using integration by parts, namely,

$$ \begin{aligned}
\int_{0}^{2u/g} \dot{x}(t)\,\dot{\eta}(t)\, dt &= \Big. \dot{x}(t)\,\eta(t) \Big|_{0}^{2u/g} - \int_{0}^{2u/g} \ddot{x}(t)\,\eta(t)\, dt \\
&= - \int_{0}^{2u/g} \ddot{x}(t)\,\eta(t)\, dt,
\end{aligned} \tag{1.24} $$

where we used (1.15) in the last step. Applying (1.24) to (1.23) yields

$$ - \int_{0}^{2u/g} \left[ \ddot{x}(t) + g \right] \eta(t)\, dt = 0. \tag{1.25} $$

Since η (t) is an arbitrary function of t, and as the integral is zero for any η (t), it directly follows
that

$$ \ddot{x}(t) + g = 0, \qquad 0 \le t \le \frac{2u}{g}. \tag{1.26} $$

The time integral of T − V is thus a minimum along the path x = x (t) which satisfies the
differential equation (1.26).
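
Equation (1.20) says that I (ε) is a quadratic function of ε, and the argument above says that its linear term vanishes along the natural path. The following sketch checks both statements numerically for one particular variation. It is an illustration under stated assumptions: the values of m, g and u are arbitrary, η (t) = sin (πgt/2u) is just the sample variation mentioned after (1.15), and the helper names `action` and `varied_action` are hypothetical.

```python
import math

# Numerical illustration of (1.20) to (1.23) for eta(t) = sin(pi*g*t/(2u)).
# m, g and u are arbitrary illustrative values, not taken from the guide.

def action(x, x_dot, m, g, t_end, steps=20000):
    """Approximate the time integral of T - V over [0, t_end] (midpoint rule)."""
    dt = t_end / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        total += (0.5 * m * x_dot(t) ** 2 - m * g * x(t)) * dt
    return total

m, g, u = 1.0, 9.81, 3.0
t_end = 2.0 * u / g
w = math.pi * g / (2.0 * u)              # eta(0) = eta(t_end) = 0, as in (1.15)

def varied_action(eps):
    """I(eps): the integral of T - V along x(t) + eps*eta(t), see (1.16)."""
    return action(lambda t: u * t - 0.5 * g * t ** 2 + eps * math.sin(w * t),
                  lambda t: u - g * t + eps * w * math.cos(w * t),
                  m, g, t_end)

I0 = varied_action(0.0)

# Coefficient of eps^2 in (1.20): (1/2) m * integral of eta_dot^2 = (1/2) m w^2 u/g
quadratic_coefficient = 0.5 * m * w ** 2 * u / g

for eps in (-0.2, -0.1, 0.1, 0.2):
    print(eps, varied_action(eps) - I0, quadratic_coefficient * eps ** 2)

# Central-difference estimate of dI/d(eps) at eps = 0, compare with (1.21)
slope_at_zero = (varied_action(1e-3) - varied_action(-1e-3)) / 2e-3
print(slope_at_zero)
```

For every ε the increase I (ε) − I (0) should match the purely quadratic term, and the estimated slope at ε = 0 should be zero up to numerical error. This checks one choice of η only; the argument in the text is what covers all admissible variations.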

This derivation will be given a firmer mathematical foundation in Chapter 2. We have, of course,
already come across equation (1.26). It is nothing other than Newton’s second law (1.1)! We have
already integrated this and obtained the natural path C : x = ut − ½gt². So equation (1.26) leads
to the remarkable result that the path which yields a minimum for the time-integral of T − V is
also the same path that satisfies Newton’s second law of motion! What we have done is to show
that Newton’s formulation of dynamics can be expressed in a more general and sophisticated form
involving the time-integral of T − V . This formulation is known as Hamilton’s variational
principle, named after its discoverer Sir William Rowan Hamilton (1805-1865). It can be expressed
in its simplest form as follows:

A particle in a conservative field will follow a path in configuration space which minimises
the time-integral of the difference between its kinetic and potential energies.

Here is a bit of history:

Hamilton published his variational principle, which today forms the foundation
of classical mechanics, in 1834 and 1835. This famous Irish mathematician and
astronomer was not actually the first to apply minimum principles to dynamics. Nearly
a century earlier, in 1747, Maupertuis suggested that dynamics depended on minimum
action, but his explanation of the concept of “action” was not clear enough. In 1657
Fermat used a minimum principle in optics to prove the law of reflection and Snell’s
law of refraction. Incidentally this principle of minimum values was already discovered
by Hero of Alexandria in 2 BC. Newton, Leibniz and the Bernoulli brothers solved
several important problems in physics in the late seventeenth century using this type
of mathematics and in so doing laid the foundation of what is known today as the
variational calculus.

Some remarks

Let us conclude this discussion with a couple of remarks.

• Firstly, a conservative force is a force which can be derived from a potential.

• Secondly, we have not proved directly that the time-integral of T − V is a minimum
along the natural path C. The condition that the first derivative of I (ε) with respect to ε
is equal to zero is just a necessary condition for a minimum, maximum or a so-called point
of inflection and is not a sufficient condition. In order to determine whether I (0) is a
maximum, minimum or a point of inflection we first need to evaluate the second derivative
of I (ε). In this case, as with all important cases in dynamics, we obtain the minimum value
since

$$ \left. \frac{d^{2} I(\varepsilon)}{d\varepsilon^{2}} \right|_{\varepsilon = 0} = m \int_{0}^{2u/g} \dot{\eta}^{2}(t)\, dt > 0. \tag{1.27} $$

A case where the second derivative of I (ε) with respect to ε is zero occurs in the motion of
a charged particle in an electromagnetic field. Strictly speaking, we ought not to formulate
Hamilton’s variational principle in terms of minimum values of I (0), but rather in terms of
stationary values of I (0), that is where the second derivative with respect to ε is positive,
negative or zero. We can thus say that a particle always follows a path in the configuration
space which is such that the time-integral of the difference between the kinetic and potential
energies has a stationary value.

• Thirdly, a comment that should be made at this point has to do with the physical nature
of the varied path C′. We already know that a particle that moves up and then down in a
constant gravitational field follows the natural parabolic path C in the configuration space in
accordance with (1.3). We are now entitled to ask whether it is physically possible to vary
this motion so that, for example, it follows the sinusoidal path C′ given by (1.10), without
changing the physical environment? This is obviously impossible. Here we are dealing with
something different. The difference is that we are using mathematics to describe a natural
phenomenon. Hamilton’s principle is based on a mathematical process (the variational calculus)
of finding curves which provide stationary values for certain integrals. The varied paths
C′ in configuration space are only possible in theory and do not have physical equivalents.
Furthermore, there are problems where the mathematics does not allow us to vary the path.
We will have more to say about this later (in Chapter 3).

• Finally, the difference T − V is usually denoted by L, which is known as the Lagrangian
function. The units attached to the function L vary from problem to problem. In dynamics
L has the units of energy, for example joules.

1.2 Generalized coordinates and holonomic dynamical systems

It is of the utmost importance to formulate any problem in dynamics correctly. One of the aspects
we need to consider is the number of degrees of freedom. The number of degrees of freedom of
a particle is equal to the number of independent scalar coordinates or parameters which
are necessary to specify its position at any instant.

Example 1.2.1

A particle P which moves freely in three-dimensional space requires three independent coordinates
in order to specify its position (see Figure 1.5). It thus has three degrees of freedom.

Figure 1.5

These independent coordinates can be Cartesian (x, y, z), cylindrical (r, θ, z) or spherical (r, φ, θ)
coordinates. There are other coordinate systems that we could use but we shall mainly focus on
these three. Should the particle be restricted to remain at a distance a from the xy–plane, then it
has only two degrees of freedom. Note that it is customary to say that the particle is constrained
to remain at a distance a from the xy–plane. Only two coordinates are now required to define
its position, for example x and y, r and θ, etc. We say that the particle has lost a degree of
freedom and that its motion is constrained by a single relation (in Cartesian coordinates) of the
form f (x, y, z, t) = 0, or to be more specific, in this case z − a = 0. In spherical coordinates it
would have the form g (r, φ, θ, t) = 0, or more precisely r cos φ − a = 0. Note that, because the
relation may depend on t, the constraining surface is allowed to change with time.

Should we restrict the particle even more by keeping it at a distance b from the xz–plane, then we
would lose another degree of freedom. We would then have just one variable and the particle would
only move along the straight line AB (see Figure 1.5). In Cartesian coordinates the equations
of constraint are now given by f (x, y, z, t) = 0 and h (x, y, z, t) = 0, or to be more precise, in this
case z − a = 0 and y − b = 0, so that x is the only degree of freedom. In spherical coordinates the equations of constraint are given
by
r cos θ − a = 0 , r sin θ cos φ − b = 0.
Any of the three coordinates r, φ, or θ can be chosen to be the independent coordinate, while the
other two are determined by these two equations.

Example 1.2.2

Consider the motion of a particle in a plane. The particle slides along a wire OA which is rotating
about a point O at a constant angular velocity ω (see Figure 1.6). In polar coordinates the single
equation of constraint F (r, θ, t) = 0 takes the form

θ̇ − ω = 0, or θ − ωt = 0

where we assume that θ = 0 when t = 0. This constraint reduces the number of degrees of freedom
of movement of the particle in the plane from two to one. The r coordinate can be regarded as
representing the single degree of freedom of the particle.

Figure 1.6

Work out for yourself what the equation of constraint will be in Cartesian coordinates.

Example 1.2.3

Consider a thin rod OA of length 2a which rotates about the fixed end point O. The motion of
this rod can best be described in terms of two independent parameters θ and φ, which represent
the two degrees of freedom of the rod. The orientation of the rod can also be represented by the
three Cartesian coordinates x, y and z, of the center of mass G of the rod and the single equation
of constraint

$$ x^{2} + y^{2} + z^{2} - a^{2} = 0, $$

which expresses the fact that G remains at a distance a from the fixed point O (taken as the origin).

Figure 1.7

Any two of the parameters x, y or z represent the two degrees of freedom of the rod and the third
parameter is determined by the equation of constraint.

In all three examples the number of coordinates originally required to describe the
motion was reduced by one for each constraint imposed on the motion of the system. Consider a
rigid body as a system of N particles where the distance between any two particles is kept constant.
The equations of constraint are all of the form

$$ |\mathbf{r}_{i} - \mathbf{r}_{j}| = c_{ij}, $$

where r_i and r_j are the position vectors of the i–th and j–th particles respectively. There are
N (N − 1)/2 such equations of constraint (three when N = 3, for example), but for N ≥ 3 only
3N − 6 of them are independent. If the 3N Cartesian coordinates are used to define the positions of
the particles, then each of the 3N − 6 independent equations of constraint reduces the original
number of coordinates by one. Hence only 3N − (3N − 6) = 6 independent parameters are required to
specify the motion of a rigid body. An example of this is that the centre of mass is described by
three coordinates and the orientation by the three so-called Euler angles.
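
The counting argument for the rigid body can be illustrated numerically. The sketch below is not part of the guide's method: it assumes that the number of independent constraints can be read off as the rank of the matrix of partial derivatives of the squared-distance constraints, evaluated at one sample configuration. The sample points (k, k², k³) and all function names are hypothetical choices made for the illustration. Exact rational arithmetic is used so that the rank is not affected by rounding.

```python
from fractions import Fraction
from itertools import combinations

def matrix_rank(rows):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    mat = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    n_cols = len(mat[0]) if mat else 0
    for col in range(n_cols):
        # Look for a non-zero pivot in this column, below the rows already used.
        pivot = next((r for r in range(rank, len(mat)) if mat[r][col] != 0), None)
        if pivot is None:
            continue
        mat[rank], mat[pivot] = mat[pivot], mat[rank]
        for r in range(rank + 1, len(mat)):
            factor = mat[r][col] / mat[rank][col]
            if factor != 0:
                mat[r] = [a - factor * b for a, b in zip(mat[r], mat[rank])]
        rank += 1
        if rank == len(mat):
            break
    return rank

def count_constraints(n_particles):
    """Return (number of distance constraints, number of independent ones)."""
    # Sample configuration (an assumption): particle k sits at (k, k^2, k^3).
    points = [(k, k * k, k ** 3) for k in range(n_particles)]
    rows = []
    for i, j in combinations(range(n_particles), 2):
        # Gradient of |r_i - r_j|^2 with respect to all 3N coordinates,
        # up to a common factor of 2 that does not change the rank.
        row = [0] * (3 * n_particles)
        for axis in range(3):
            d = points[i][axis] - points[j][axis]
            row[3 * i + axis] = d
            row[3 * j + axis] = -d
        rows.append(row)
    return len(rows), matrix_rank(rows)

for n in range(3, 8):
    total, independent = count_constraints(n)
    # columns: N, N(N-1)/2, independent constraints, remaining degrees of freedom
    print(n, total, independent, 3 * n - independent)
```

The last column is the number of coordinates left over once the independent constraints are removed, which is the quantity the paragraph above identifies as the number of degrees of freedom of the rigid body.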

It is important to note that the number of degrees of freedom is determined by the motion
of the system and not by the particular coordinate system.
In other words, if the motion is described by m coordinates subject to s equations of constraint,
then the number of degrees of freedom for the given system is n = m − s. Furthermore these n
degrees of freedom can be represented by any set of independent parameters or coordinates.
We refer to these as the generalized coordinates. If these coordinates are not specified then we
shall denote them by q1 , q2 , . . . , qn or by qi (i = 1, . . . , n).

Dynamical systems which have the property that the number of generalized coordinates
needed to describe the motion is equal to the number of degrees of freedom are called holonomic
dynamical systems. The constraints imposed on the system are called holonomic
constraints.
All the constraints that we have encountered so far are holonomic constraints, since we can reduce
the number of degrees of freedom by elimination. The equations of constraint are also of such a
nature that this can be done. If the constraints are such that this is not possible, then the system
is called a non-holonomic system. This would be the case, for example if the constraints take
the form of non-integrable relations involving the velocities. Naturally there is no unique method
of choosing the n generalised coordinates in the case of a holonomic system. This is a technique
that one only acquires through experience. There are no general rules that apply in making such
a choice.

As before it is possible to represent the motion of a holonomic dynamical system with n
degrees of freedom by the motion of a point along a curve in the (n + 1)–dimensional configuration
space. Each dimension of this space corresponds to either one of the generalized coordinates q or to
the time t. Each point on the system’s path in the configuration space specifies the configuration
or position of the system at a given time.

Let us now consider the generalized coordinates from a mathematical viewpoint.
We can say that holonomic dynamical systems are defined by n coordinates q_i which describe the
motion, if

$$ \frac{\partial q_{i}}{\partial q_{j}} = \delta_{ij}, $$

where the Kronecker delta δ_ij is defined by

$$ \delta_{ij} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases} $$
It follows immediately that the generalized coordinates of a holonomic dynamical system can at
most be dependent on the time, that is, qi = qi (t). The generalized velocities

q̇1 , q̇2 , . . . , q̇n

are nothing other than the time derivatives of the generalized coordinates and are consequently
also dependent on the time.
The generalized coordinates and the generalized velocities are also independent quantities. Prove
this for yourself by differentiating the generalized coordinates to get the generalized velocities.

To summarize, in working with generalized coordinates, the constraints on the system are
used to remove all redundant coordinates from the description of the motion. This in
itself is already a major simplification. All generalized coordinates are dealt with in the
same way and no special care need be taken when dealing with different dimensions. Since
generalized coordinates are independent of one another, this makes it noticeably easier to
apply Hamilton’s principle to dynamical systems.

The examples that we deal with below are not worked out in full since, firstly this would make the
guide too long, and secondly the application of mathematics cannot be spoon-fed. Consequently
please fill in the missing sections as this forms an important part of the study of
applied mathematics. This applies to all the examples in this guide.

Example 1.2.4

A particle of mass m slides freely on the sloping surface of an inclined block of mass M , while the
block is free to slide on a smooth horizontal table. Determine the generalized coordinates and the
corresponding Lagrangian function of the dynamical system consisting of the block and particle.

Figure 1.8

Solution
The position of the mass m is uniquely described by the variables s and X, and therefore these
represent the two degrees of freedom of the dynamical system (what is the equation of constraint
in this case?). If x and y are the Cartesian coordinates of the mass m then we have

x = X + s cos α, y = s sin α.

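The kinetic energy of the system is
\[
\begin{aligned}
T = T(\dot X, \dot s) &= \tfrac12 M \dot X^2 + \tfrac12 m\left(\dot x^2 + \dot y^2\right) \\
&= \tfrac12 M \dot X^2 + \tfrac12 m\left(\dot X^2 + \dot s^2 + 2\dot X \dot s \cos\alpha\right)
\end{aligned}
\]
and the potential energy is given by
\[ V_m = mgs\sin\alpha, \qquad V_M = 0, \]
where the reference level (i.e. the zero potential level) is the horizontal table (the x–axis). The
Lagrangian function is then
\[
\begin{aligned}
L = L(X, s, \dot X, \dot s) &= T(\dot X, \dot s) - V(s) \\
&= \tfrac12 (M + m)\dot X^2 + \tfrac12 m\left(\dot s^2 + 2\dot X \dot s \cos\alpha\right) - mgs\sin\alpha.
\end{aligned}
\]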
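The step from the Cartesian velocities to the generalized form of the kinetic energy can be checked numerically. The following is a minimal sketch, not part of the guide: the function names and the numerical values are illustrative assumptions, and only the relations x = X + s cos α and y = s sin α are taken from the example.

```python
import math

def kinetic_energy_cartesian(M, m, alpha, X_dot, s_dot):
    """T computed from the Cartesian velocities of the particle, using
    x = X + s cos(alpha) and y = s sin(alpha)."""
    x_dot = X_dot + s_dot * math.cos(alpha)
    y_dot = s_dot * math.sin(alpha)
    return 0.5 * M * X_dot**2 + 0.5 * m * (x_dot**2 + y_dot**2)

def lagrangian(M, m, g, alpha, s, X_dot, s_dot):
    """L = T - V written directly in the generalized coordinates X and s."""
    T = 0.5 * (M + m) * X_dot**2 + 0.5 * m * (s_dot**2 + 2.0 * X_dot * s_dot * math.cos(alpha))
    V = m * g * s * math.sin(alpha)
    return T - V

# Illustrative values only (assumptions, not data from the guide).
M, m, g, alpha = 3.0, 1.0, 9.81, math.radians(30.0)
s, X_dot, s_dot = 0.4, 0.7, -1.2

T_cart = kinetic_energy_cartesian(M, m, alpha, X_dot, s_dot)
V_val = m * g * s * math.sin(alpha)
L_val = lagrangian(M, m, g, alpha, s, X_dot, s_dot)

# The two routes to L should agree up to rounding error.
print(abs((T_cart - V_val) - L_val))
```

Agreement of the two values for arbitrary inputs is a quick way to catch a sign or factor error when you fill in the missing steps yourself.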


Example 1.2.5

A particle of mass m is fixed at one end of a string and is swung around a horizontal cylindrical
pole of radius a so that it winds around it. The particle moves only in a vertical plane. Determine
the degrees of freedom of the particle, the equation of constraint, a set of generalized coordinates
and the corresponding Lagrangian function of the particle.

Figure 1.9

Solution
If ℓ is the length of the unwound string at a time t, and θ is the angle between the string and the
positive x–axis, then ℓ and θ specify the position of the particle uniquely. The motion takes
place in a vertical plane (given). The string winds around the pole at the rate a (dθ/dt), so
that dℓ/dt = −a (dθ/dt) gives the rate at which the unwound length of the string shortens. This
expression can be integrated to yield the only holonomic equation of constraint

\[ \ell = c - a\theta. \]

Here c is the unwound length of the string when θ = 0. Thus the system has a single degree of
freedom which can be represented by θ or ℓ.

The variables x and y are dependent on ℓ and θ since

\[ x = \ell\cos\theta + a\sin\theta, \qquad y = \ell\sin\theta - a\cos\theta. \]

Differentiating with respect to time and using ℓ̇ = −aθ̇ gives us

\[
\begin{aligned}
\dot x &= \dot\ell\cos\theta - \ell\dot\theta\sin\theta + a\dot\theta\cos\theta = -\ell\dot\theta\sin\theta, \\
\dot y &= \dot\ell\sin\theta + \ell\dot\theta\cos\theta + a\dot\theta\sin\theta = \ell\dot\theta\cos\theta.
\end{aligned}
\]

If we take θ to be the generalized coordinate, then the kinetic energy is given by
\[
T = \tfrac12 m\left(\dot x^2 + \dot y^2\right) = T(\theta, \dot\theta) = \tfrac12 m\ell^2\dot\theta^2 = \tfrac12 m(c - a\theta)^2\dot\theta^2 .
\]
If we choose the midpoint of the pole to be the zero energy level, then

\[ V = mgy = mg\left[(c - a\theta)\sin\theta - a\cos\theta\right]. \]


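The Lagrangian function is then given by
\[
L = L(\theta, \dot\theta) = \tfrac12 m(c - a\theta)^2\dot\theta^2 - mg\left[(c - a\theta)\sin\theta - a\cos\theta\right].
\]
If the variable ℓ is used as the generalized coordinate, then θ = (c − ℓ)/a and θ̇ = −ℓ̇/a, so that the
Lagrangian function is given by
\[
L = L(\ell, \dot\ell) = \tfrac12 m\left(\frac{\ell\dot\ell}{a}\right)^2 - mg\left[\ell\sin\left(\frac{c - \ell}{a}\right) - a\cos\left(\frac{c - \ell}{a}\right)\right].
\]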

In this example neither x nor y is a suitable candidate for the generalized coordinate since neither
of these two variables can specify the particle’s position uniquely. The angle which the string
makes with the x–axis is important since, if it were not specified, it would be possible to reach a
given height with different lengths ℓ of the unwound portion of the string.
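The simplification of the velocities in this example can be checked without doing the algebra by hand. The sketch below is an illustration only (the function names and numerical values are assumptions): it differentiates x(θ) and y(θ) numerically along the constraint ℓ = c − aθ and compares the resulting kinetic energy with the closed form ½ m (c − aθ)² θ̇².

```python
import math

def position(theta, a, c):
    """Cartesian position of the particle for winding angle theta,
    with unwound length l = c - a*theta (the constraint of the example)."""
    l = c - a * theta
    x = l * math.cos(theta) + a * math.sin(theta)
    y = l * math.sin(theta) - a * math.cos(theta)
    return x, y

def kinetic_energy_closed_form(m, a, c, theta, theta_dot):
    """T = (1/2) m (c - a*theta)^2 theta_dot^2."""
    return 0.5 * m * (c - a * theta) ** 2 * theta_dot ** 2

def kinetic_energy_numeric(m, a, c, theta, theta_dot, h=1e-6):
    """T from central differences of x(theta) and y(theta), using the chain rule
    x_dot = (dx/dtheta) * theta_dot and y_dot = (dy/dtheta) * theta_dot."""
    x_plus, y_plus = position(theta + h, a, c)
    x_minus, y_minus = position(theta - h, a, c)
    x_dot = (x_plus - x_minus) / (2 * h) * theta_dot
    y_dot = (y_plus - y_minus) / (2 * h) * theta_dot
    return 0.5 * m * (x_dot ** 2 + y_dot ** 2)

# Illustrative values only (assumptions, not data from the guide).
m, a, c = 0.5, 0.1, 2.0
theta, theta_dot = 1.3, 2.5

T_exact = kinetic_energy_closed_form(m, a, c, theta, theta_dot)
T_fd = kinetic_energy_numeric(m, a, c, theta, theta_dot)
print(T_exact, T_fd)
```

The two numbers agree to within the accuracy of the finite difference, which supports the cancellation of the terms in ℓ̇ and aθ̇.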

Example 1.2.6

Consider a simple plane pendulum of length ℓ and mass m. The pendulum is fixed to another
mass M which is free to slide on a smooth horizontal surface. Find the Lagrangian function.

Figure 1.10

Solution
The two degrees of freedom of this system can be represented by the generalized coordinates X
and θ, since these two independent parameters describe the motion of the pendulum uniquely. If
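we choose x and y as Cartesian coordinates of the mass m, then

\[ x = X + \ell\sin\theta, \qquad y = \ell\cos\theta, \]

and the kinetic energy of the system becomes
\[
\begin{aligned}
T = T(X, \theta, \dot X, \dot\theta) &= \tfrac12 M\dot X^2 + \tfrac12 m\left(\dot x^2 + \dot y^2\right) \\
&= \tfrac12 M\dot X^2 + \tfrac12 m\left(\dot X^2 + \ell^2\dot\theta^2 + 2\dot X\dot\theta\,\ell\cos\theta\right).
\end{aligned}
\]

The Lagrangian function is then given by
\[
\begin{aligned}
L = L(X, \theta, \dot X, \dot\theta) &= T(\theta, \dot X, \dot\theta) - V(\theta) \\
&= \tfrac12 (M + m)\dot X^2 + \tfrac12 m\left(\ell^2\dot\theta^2 + 2\dot X\dot\theta\,\ell\cos\theta\right) + mg\ell\cos\theta.
\end{aligned}
\]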
On the other hand, if the point of support of mass M is forced to oscillate periodically
according to the equation
\[ X = A\sin\omega t, \]
then the system will lose one degree of freedom and θ is the only generalized coordinate. In this
case the kinetic energy of the mass m of the pendulum will be explicitly dependent on time, namely
\[
T(t, \theta, \dot\theta) = \tfrac12 m\left[A^2\omega^2\cos^2\omega t + \ell^2\dot\theta^2 + 2A\omega\ell\dot\theta\cos\theta\cos\omega t\right].
\]
The motion of the pendulum mass m is now non-conservative since the tension in the string is now
a non-conservative force.
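The expansion of the squared velocity of the mass m used in this example can be written out step by step. The following is a worked sketch that uses only the coordinates x = X + ℓ sin θ and y = ℓ cos θ defined above:

```latex
\begin{aligned}
\dot x &= \dot X + \ell\dot\theta\cos\theta, \qquad \dot y = -\ell\dot\theta\sin\theta, \\
\dot x^2 + \dot y^2 &= \dot X^2 + 2\dot X\dot\theta\,\ell\cos\theta
      + \ell^2\dot\theta^2\left(\cos^2\theta + \sin^2\theta\right) \\
&= \dot X^2 + \ell^2\dot\theta^2 + 2\dot X\dot\theta\,\ell\cos\theta .
\end{aligned}
```

Substituting Ẋ = Aω cos ωt for the driven support reproduces the time-dependent kinetic energy of the mass m.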

Example 1.2.7

A disc of radius a and mass m rolls without slipping down an inclined plane which makes an angle
α with the horizontal. Find the Lagrangian function for the disc.

Figure 1.11

Solution
Let G be the centre of mass of the disc and let xyz be a set of axes that passes through G and is
fixed to the disc. The direction of the x–axis is into the page. The angular velocity of the disc is
defined by
\[ \boldsymbol{\omega} = \dot\theta\,\hat{z} = \left(0, 0, \dot\theta\right). \]

In the first year modules in applied mathematics the kinetic energy of a rigid body is defined by
\[
T = \tfrac12 m\,\dot{\mathbf{R}}\cdot\dot{\mathbf{R}} + \tfrac12\left(I^G_{xx}\omega_x^2 + I^G_{yy}\omega_y^2 + I^G_{zz}\omega_z^2\right),
\]
where Ṙ is the velocity of the centre of mass G and I^G_xx is the moment of inertia of the disc with
respect to the x–axis, etc.

Since
\[ I^G_{zz} = \tfrac12 ma^2, \qquad |\dot{\mathbf{R}}| = \dot s, \]
we have
\[ T = \tfrac12 m\dot s^2 + \tfrac12\left(\tfrac12 ma^2\dot\theta^2\right). \]
If the disc rolls without slipping (so that the friction force at the point of contact does no work) then

\[ \dot s = a\dot\theta \]

and if we integrate this equation once with respect to time, with s (0) = 0 and θ (0) = 0, we obtain
the equation of constraint
\[ s = a\theta. \]

Thus the rolling disc has only one degree of freedom which can be represented by s or θ. If s is
chosen as the generalized coordinate then the Lagrangian function becomes
\[ L = L(s, \dot s) = \tfrac34 m\dot s^2 + mgs\sin\alpha. \]
As an exercise write down the Lagrangian function in terms of θ.
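The intermediate steps that lead to the Lagrangian function in terms of s can be filled in as follows. This is a sketch under the assumption, consistent with the sign of the potential term, that s is measured down the incline from the zero potential level:

```latex
\begin{aligned}
T &= \tfrac12 m\dot s^2 + \tfrac14 ma^2\dot\theta^2
   = \tfrac12 m\dot s^2 + \tfrac14 m\dot s^2
   = \tfrac34 m\dot s^2 \qquad (\text{using } a\dot\theta = \dot s), \\
V &= -mgs\sin\alpha, \\
L &= T - V = \tfrac34 m\dot s^2 + mgs\sin\alpha .
\end{aligned}
```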

Example 1.2.8

A small sailing boat has lost its rudder during a storm, but its keel is still intact. Show that the
motion of the boat is non-holonomic.
Solution
Three coordinates are necessary to specify the position and orientation of the boat, for example,
the two Cartesian coordinates x and y of its center of mass G and the direction θ in which the keel
is pointed. These three coordinates are not mutually independent since the keel constrains the
motion of the boat in such a way that the velocity of the center of mass of the boat is always in
the direction of the keel, i.e.
\[ \dot{\mathbf{r}} = k\hat{n} = \dot x\,\hat{x} + \dot y\,\hat{y} \]
is the equation of constraint, where n̂ = (cos θ, sin θ) is the unit vector along the keel and k is an
arbitrary scalar. This constraint does not necessarily always cause the boat to move in a straight line.

Figure 1.12

If m̂ is any vector perpendicular to n̂ then

\[ \dot{\mathbf{r}}\cdot\hat{m} = k\,\hat{n}\cdot\hat{m} = 0, \]

or
\[ \dot x\sin\theta - \dot y\cos\theta = 0. \]

This represents a non-integrable equation of constraint which cannot be used to eliminate any of
the three coordinates. The motion of the boat is thus non-holonomic.
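A small numerical experiment can make this plausible. The sketch below is an illustration under stated assumptions, not part of the guide: the function `drive`, the piecewise-constant speed k and turning rate, and the simple Euler integration are all hypothetical choices. Both paths obey ẋ sin θ − ẏ cos θ = 0 at every step and end with the same heading θ, yet they end at different positions, so the constraint does not tie x, y and θ together by a single relation between the coordinates.

```python
import math

def drive(segments, steps_per_segment=1000):
    """Integrate the boat's motion through a list of (speed k, turning rate, duration)
    segments with explicit Euler steps. The velocity is always k * n with
    n = (cos(theta), sin(theta)). Returns the final (x, y, theta) and the largest
    value of |x_dot*sin(theta) - y_dot*cos(theta)| met along the way."""
    x = y = theta = 0.0
    worst = 0.0
    for k, theta_dot, duration in segments:
        dt = duration / steps_per_segment
        for _ in range(steps_per_segment):
            x_dot = k * math.cos(theta)
            y_dot = k * math.sin(theta)
            worst = max(worst, abs(x_dot * math.sin(theta) - y_dot * math.cos(theta)))
            x += x_dot * dt
            y += y_dot * dt
            theta += theta_dot * dt
    return (x, y, theta), worst

quarter = math.pi / 2  # turning rate that gives a 90 degree turn in unit time

# Path A: sail straight ahead for one unit of distance.
end_a, residual_a = drive([(1.0, 0.0, 1.0)])

# Path B: turn on the spot by 90 degrees, sail one unit of distance, turn back.
end_b, residual_b = drive([(0.0, quarter, 1.0), (1.0, 0.0, 1.0), (0.0, -quarter, 1.0)])

print(end_a, end_b)
```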

1.3 Exercises

The only way to master applied mathematics is to get as much practice as you can. As a result I
have provided an abundance of exercises. It is not necessary to attempt every exercise. If you are
convinced that you are able to choose suitable generalised coordinates and that you can express
the kinetic energy of a system in terms of these coordinates then you should proceed with the next
chapter.

1.3.1 Two masses m and M are joined by an inextensible cord which passes over a frictionless
pulley. Assume that the system, known as the Atwood machine, is in a uniform gravitational field.
Determine the Lagrangian function of the two masses.
[Figure: Atwood machine showing the pulley, the masses m and M and the coordinate x]

1.3.2 A particle moves under the influence of gravity along a spiral which is defined in cylindrical
coordinates by z = kθ, r = a, where k and a are constants. Determine the Lagrangian function of
the particle.

1.3.3 A spherical pendulum is physically equivalent to a simple pendulum. It consists of a mass
m which is attached to one end of a light inextensible cord of length a while the other end of the
cord is fixed. The motion is not just confined to a plane, but occurs freely in space. So it is not
just the inclination of the cord that moves, but also the orientation of the vertical plane through
the cord. Determine the Lagrangian function of the pendulum.

[Figure: spherical pendulum with cord of length a, mass m and angles θ and ϕ]

1.3.4 Write down expressions in generalized coordinates for the Lagrangian functions of the
following systems.

1.3.4.1 A particle which is constrained to remain on the surface of a sphere.

1.3.4.2 A particle that is constrained to the surface of a cylinder.

1.3.4.3 A particle that is constrained to remain on the surface of a cone with half-angle α and
vertex downwards.

1.3.5 Two masses m and M are joined by an inextensible cord which passes through a hole in a
smooth table, so that m slides on the table while M hangs below it. Assume that M can only move
along a vertical line and that m performs some kind of circular motion on the table. Determine
the Lagrangian function for the two masses.

1.3.6 A mass m is attached to one end of a cord which passes through a hole in a horizontal
table. The other end of the cord is attached to a vertical spring (with spring constant k) which is
fixed to the floor directly below the table. The spring is not stretched when the particle is at the
hole. Write down the Lagrangian function of the system.

1.3.7 Determine the kinetic energy of a particle which slides along a frictionless wire bent in the
form of a parabola with equation y = x². The wire rotates at a constant angular velocity ω about
the y–axis.

1.3.8 A simple plane pendulum consists of a mass m fixed to a cord of length ℓ. Once the
pendulum is set in motion the length of the cord is shortened at a constant rate u, while the
position of the other end of the cord remains fixed. Find the kinetic energy of the system.

1.3.9 Consider a plane pendulum which consists of a mass m fixed to one end of an elastic cord
with unstretched length ℓ which obeys Hooke’s law (Hooke’s constant is k). Write down the
Lagrangian function of the system.

1.3.10 A light smooth tube containing a particle of mass m rotates in a horizontal plane at a
constant angular velocity ω about a fixed point on the tube. Find the kinetic energy of the particle.

1.3.11 A horizontal ring of mass M and radius a rotates freely about a vertical axis passing
through a point on its circumference. If a particle of mass m slides along the ring without friction,
formulate the Lagrangian function for the ring and the particle. Also determine the kinetic energy
of the particle if the ring rotates at a constant angular velocity ω about the vertical axis.

1.3.12 A vertical ring with mass M and radius a rotates freely about a horizontal axis through
its centre of mass. A beetle of mass m crawls along the ring at a constant tangential velocity f
relative to the ring. Find the kinetic energy of the system.

1.3.13 A uniform ladder of mass m and length 2a rests with one end on a smooth horizontal floor
and the other end against a smooth vertical wall. The ladder is initially at rest in the vertical plane
perpendicular to the wall and makes an angle α with the horizontal. Determine the Lagrangian
function for the ladder.
[Figure: ladder at angle α to the horizontal, with the labels (x, y), a and a cos α]

1.3.14 A light ring is attached to one end of a uniform rod of mass m and length 2a. The ring
slides without friction along a horizontal rail. The motion takes place in a vertical plane passing
through the rail. Choose suitable generalized coordinates and write down the Lagrangian function
of the rod.

1.3.15 A uniform cylinder of mass m and radius a rolls down the inclined plane of a wedge of
mass M , without slipping. The wedge moves freely along a smooth horizontal plane. Determine
the Lagrangian function of this system.
[Figure: cylinder of radius a on a wedge of angle α, with axes x and y and the labels X, R, (x, y), s cos α, s sin α, a cos α, a sin α and s cos α − a sin α]

1.3.16 A uniform sphere of mass M and radius a is initially at rest on a flat horizontal plane.
A smooth particle of mass m is placed at the top of the sphere and allowed to slide down, while
the sphere as a result of the reaction of the particle begins to roll without slipping. Write down
the Lagrangian function of this system.

1.3.17 A sphere of radius r rolls without slipping on the inside of the lower half of a horizontal
cylinder of radius R. The cylinder is not fixed and can roll without slipping. Write down the
Lagrangian function of this system.

Chapter 2

THE EULER-LAGRANGE EQUATIONS

In the previous chapter we introduced Hamilton’s variational principle. The calculus of
variations is a logical consequence of that principle and in this chapter we are going to examine it
in detail.

Objectives

At the end of this chapter you will be able to:

• formulate the general and simplest single integral problem in the calculus of variations,
and

• derive the Euler-Lagrange equations and apply them to various dynamical systems to
determine the equations of motion of the systems.

2.1 The calculus of variations

There are many problems in mathematics or physics where we need to evaluate an integral
of some variable and then to establish the conditions under which a maximum or minimum value
can be obtained. In the previous chapter you encountered this type of problem in the field of
mechanics. Other branches of physics where these problems occur are, for example, heat transfer,
electromagnetic theory and quantum mechanics.

In mathematics the problem of calculating the extreme values of integrals (or more generally,
functions) can be approached from many different points of view. One such approach would be to
find those points in the domain of the function for which the extreme value occurs, to classify them
according to the type of extreme value and then to calculate the value of the function at those points.

Prerequisite: Maxima and minima of functions.


Let us refresh your memory. The simplest case is that of a function with a real domain.
Suppose that A ⊂ R^n and let f : A → R be a function. If x0 ∈ A is such that f (x0) ≥ f (x) for
all x in a neighbourhood N (x0) ⊂ A of x0, then f has a relative (or local) maximum at x0. Other
possibilities are relative extreme values subject to certain auxiliary conditions.
We might, for example, try to find points x0 ∈ A such that f (x0) ≥ f (x) for all
x ∈ N (x0) ∩ G, where G is a subset of A defined by r (< n) equations gα (x) = 0, α = 1, . . . , r,
with g1, . . . , gr given functions.

A more general optimization problem deals with functions defined on sets that are more
general than A ⊂ R^n. If S is an arbitrary set, then a function f : S → R is called a functional,
and particular kinds of functionals play a central role in our considerations.
We first consider the case where S consists of certain real-valued functions (specified in terms of
continuity properties, end points, etc.) defined over a region A ⊂ R^n (throughout the guide assume
that n = 1 unless otherwise specified). Firstly, for example, let S be the set C [0, 1] of all continuous
functions on the interval [0, 1]. Then
\[ I[x] = \int_0^1 x(t)\,dt, \qquad x \in C[0, 1], \]
is a functional with domain C [0, 1]. Secondly, consider the following important case, namely
the class C^2 [a, b], a < b, of real-valued functions on [a, b] which have continuous second order
derivatives on [a, b]. Denote the second derivative of x with respect to t by
\[ \ddot x = \frac{d^2 x}{dt^2}, \]
and consider a function L : [a, b] × R^2 → R, which is also twice continuously differentiable. Then
\[ I[x] = \int_a^b L\left(t, x(t), \dot x(t)\right) dt, \qquad x \in C^2[a, b], \tag{2.1} \]
is a functional on C^2 [a, b]. There are some points to note about this functional.

• Firstly, it is obvious that the definition of I [x] depends on whether or not L is integrable.

• Secondly, we limit the domain of the functional (2.1) to a subclass S of C^2 [a, b], for example,
by only considering those members of C^2 [a, b] which pass through two fixed points.
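A functional assigns a number to a whole function, and this can be made concrete with a few lines of code. The following is a minimal sketch, not taken from the guide: the name `functional`, the midpoint rule and the sample integrands are illustrative assumptions. It evaluates integrals of the form (2.1) for a given L and a given curve x(t).

```python
import math

def functional(L, x, x_dot, a, b, n=10_000):
    """Approximate I[x] = integral from a to b of L(t, x(t), x_dot(t)) dt
    with the composite midpoint rule on n subintervals."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += L(t, x(t), x_dot(t))
    return total * h

# The functional I[x] = integral of x(t) over [0, 1] corresponds to L(t, x, x_dot) = x.
identity_L = lambda t, x, x_dot: x

# Two different members of C[0, 1] give two different numbers.
I_line = functional(identity_L, lambda t: t, lambda t: 1.0, 0.0, 1.0)              # x(t) = t
I_parabola = functional(identity_L, lambda t: t * t, lambda t: 2.0 * t, 0.0, 1.0)  # x(t) = t^2

# A hypothetical integrand that also involves the derivative: L = x_dot^2 / 2,
# evaluated along x(t) = sin t on [0, pi].
half_speed_squared = lambda t, x, x_dot: 0.5 * x_dot ** 2
I_kinetic = functional(half_speed_squared, math.sin, math.cos, 0.0, math.pi)

print(I_line, I_parabola, I_kinetic)
```

For these inputs the exact values are 1/2, 1/3 and π/4 respectively, which the numerical results approximate.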

In essence variational calculus is concerned with various types of extreme value problems of
functionals such as (2.1) and their generalizations. The purpose of this module is then also to
study various aspects of the solutions of such problems.

The functional (2.1) takes on different forms depending on the type of problem being dealt
with. In this module we shall mostly consider problems in mechanics, but there are wider choices
for the Lagrangian function. We have seen that L = T − V in mechanics, while in the next
chapter we shall find that L = T − W, where W is an arbitrary work function, can also be used to
solve problems in a nonconservative force field. In the theory of solid mechanics the Lagrangian
function is defined to be L = T − V + A where A is the deformation energy of the body. The finite
element packages which are currently in use to calculate tensions and deformations of a structure
under load are based on this variational principle with this choice of L (see [11] for more details).
In electro-mechanics the Lagrangian function is defined by L = T_EI − V_EI + T_M − V_M, where EI
refers to the electrical energy and M to the mechanical energy.

The name variational calculus is derived from a certain technique, namely the so-called
“method of small variations”. We have already applied this method in Chapter 1
to find the equations of motion of a specific problem. This name was first used by
Leonhard Euler (pronounced “Oiler”) in 1756 when he referred to the method which
had been previously introduced by Joseph Louis Lagrange (1736-1813). If you are
interested in history in general and particularly in the history of physics and mathematics
you should read [14] where, in Chapters 2 and 3, the contributions of these two giants
are discussed.

2.2 Some elementary classical problems

2.2.1 The brachistochrone problem

One of the oldest problems in the calculus of variations is that of the shortest “distance” between
two points. This problem is not as simple as it might seem, since it depends very much on the
type of space and the definition of “distance” in this space.
This problem is dealt with in many books on variational calculus. Examples are [2] Chapter 5,
which deals with the shortest distance in Euclidean space and then there is [20] which is aimed at
Russian high school children and avoids infinitesimal calculus altogether.

Figure 2.1

Consider, in a vertical plane (say, the (t, x)–plane in Figure 2.1), two points P1 (t1 , x1 ) and
P2 (t2 , x2 ), where t2 > t1 and x2 > x1 > 0, joined by a continuously differentiable curve C with
parametric representation
\[ x = f(t), \qquad t_1 \le t \le t_2. \tag{2.2} \]

The problem is: find, among all the curves of this type (i.e. continuously differentiable curves
joining P1 (t1 , x1 ) and P2 (t2 , x2 )) the one (if it exists) along which an idealized point mass,
with mass m, will slide smoothly under gravity from P1 to P2 in the shortest possible
time. This is the well known brachistochrone problem (brachisto : shortest, chronos : time),
and is the first problem in variational calculus to be solved using infinitesimal calculus.

We assume that the initial velocity of the mass at P1 is zero (without loss of generality) and
we exclude the vertical lines t = constant. All “comparison curves”, i.e. curves along which the
point mass may slide, must pass through P1 (t1 , x1 ) and P2 (t2 , x2 ), which means that the function f
in (2.2) must satisfy the conditions:

\[ f(t_1) = x_1, \qquad f(t_2) = x_2. \tag{2.3} \]

If s is the arclength, measured along C, then the velocity of the point mass is given by v = ds/dτ ,
where τ denotes the time. There is no friction and the only external force is gravity, and according
to the principle of conservation of energy ½mv² = mgx (remember x > 0) at any point (t, x) on C.

We have that
\[ \frac{ds}{d\tau} = \sqrt{2gx}. \]

The time required to slide along C from P1 (t1 , x1 ) to P2 (t2 , x2 ) is given by
\[ T[x] = \int_{C:\,P_1}^{P_2} d\tau \equiv \int_{C:\,P_1}^{P_2} \frac{ds}{\sqrt{2gx}}, \]
or, since ds = [1 + (dx/dt)²]^{1/2} dt along the curve,
\[ T[x] = \int_{C:\,P_1}^{P_2} \frac{\sqrt{1 + \dot x^2}}{\sqrt{2gx}}\,dt. \tag{2.4} \]

It is obvious that the functional (2.4) is a special case of (2.1). Note that the Lagrangian function
is dimensionless.

The analytical formulation of the brachistochrone problem is as follows: Find that
continuously differentiable function f : [t1 , t2 ] → R (if it exists), which satisfies
the conditions (2.3) and also yields a minimum value for the functional (2.4).

The solution of this problem is discussed in section 2.4.
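Before the problem is solved it is instructive to see that the functional (2.4) really does depend on the choice of comparison curve. The sketch below is an illustration only: the end points, the two trial curves, the value of g and the midpoint rule are all assumptions made for the example. It evaluates (2.4) along a straight line and along a second smooth curve through the same end points.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2 (illustrative value)

def descent_time(f, f_dot, t1, t2, n=20_000):
    """Approximate T[x] = integral of sqrt(1 + x_dot^2) / sqrt(2 g x) dt over [t1, t2],
    i.e. the functional (2.4), with the composite midpoint rule. Requires f(t) > 0."""
    h = (t2 - t1) / n
    total = 0.0
    for k in range(n):
        t = t1 + (k + 0.5) * h
        total += math.sqrt(1.0 + f_dot(t) ** 2) / math.sqrt(2.0 * G * f(t))
    return total * h

# End points P1(t1, x1) and P2(t2, x2) with x2 > x1 > 0 (illustrative values).
t1, x1, t2, x2 = 0.0, 0.1, 1.0, 1.1
slope = (x2 - x1) / (t2 - t1)

# Comparison curve 1: the straight line through P1 and P2.
line = lambda t: x1 + slope * (t - t1)
line_dot = lambda t: slope

# Comparison curve 2: x = x1 + (x2 - x1) * u * (2 - u) with u = (t - t1) / (t2 - t1),
# a smooth curve through the same end points that changes more rapidly near P1.
def steep(t):
    u = (t - t1) / (t2 - t1)
    return x1 + (x2 - x1) * u * (2.0 - u)

def steep_dot(t):
    u = (t - t1) / (t2 - t1)
    return (x2 - x1) * (2.0 - 2.0 * u) / (t2 - t1)

T_line = descent_time(line, line_dot, t1, t2)
T_steep = descent_time(steep, steep_dot, t1, t2)

# Closed form for the straight line, obtained by integrating 1/sqrt(x) directly.
T_line_exact = (math.sqrt(1.0 + slope ** 2) / math.sqrt(2.0 * G)
                * (2.0 / slope) * (math.sqrt(x2) - math.sqrt(x1)))

print(T_line, T_line_exact, T_steep)
```

The two curves give different values of T, so the question of which admissible curve gives the smallest value is a genuine one.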



2.2.2 Minimal surfaces of revolution

Here is another problem in the variational calculus with its origin in physics and which is
sometimes referred to as the problem of the soap bubble. (See for example [2].) Most standard
treatises on the variational calculus deal with this problem in one form or another.

Consider a circular wire which is pressed into a soap solution. If we remove it a circular disc
of foam is formed which is bounded by the wire circle, provided that the circle is not too large!
Another wire circle, with a radius smaller than the first circle, is held so it touches the soap bubble
disc and then the two are separated co-axially. The two circles will be connected by a soap foam
surface (provided that the two are not pulled too far apart). Since the two circles are co-axial and
are thus parallel to each other, the resulting connecting surface is a “surface of revolution” of some
curve about the common axis. If you feel like it try this experiment and see what such a surface
actually looks like.
Now it can be proved using pure physical-mechanical reasoning that this surface is one that has a
minimal surface area in comparison with all smooth surfaces of revolution joining the two circles.
The word “smooth” here means continuously differentiable.

The question arises as to the form of this “minimal surface”. Consider two points P1 (t1 , x1 )
and P2 (t2 , x2 ) in the Cartesian (t, x)–plane, with t1 < t2 and x2 ≥ 0, x1 ≥ 0. These points are
joined by curves which are continuously differentiable and which do not cut the t–axis, such as in
Figure 2.2.

Figure 2.2

The analytic formulation of the problem of minimal surface of revolution is as follows: find
that curve (if it exists) which is continuously differentiable and is such that, if it is rotated
through 2π radians about the t–axis, describes a surface with the smallest possible surface
area in comparison with the surfaces described by all other continuously differentiable curves
through P1 (t1 , x1 ) and P2 (t2 , x2 ).

It is obvious that we must restrict the comparison curves to the set of non-negative functions.
The area of the surface produced by rotating the curve

\[ x = f(t), \qquad f(t) \ge 0, \quad t_1 \le t \le t_2, \]

about the t–axis through one complete revolution is

\[ I[f] = 2\pi\int_{t_1}^{t_2} f(t)\sqrt{1 + \dot f(t)^2}\,dt, \qquad \dot f = \frac{df}{dt}. \tag{2.5} \]

This problem has not been dealt with in as much detail as the previous one. Fill in the necessary
details and formulate the problem fully. You will find that this problem has been formulated a
little too generally. It is not always possible to admit all arbitrary (smooth) curves. At most it is
required that such curves must lie in a certain “neighbourhood” of each other. More about this
later.
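The functional (2.5) can be evaluated for specific comparison curves to get a feeling for the problem. The sketch below is an illustration under stated assumptions: the end points, the midpoint rule and the two trial curves are choices made for this example. The curve x = cosh t is used because it is known from the classical theory to be an extremal of this problem, a fact that is not derived at this point in the guide.

```python
import math

def surface_area(f, f_dot, t1, t2, n=20_000):
    """Approximate I[f] = 2*pi * integral of f(t) * sqrt(1 + f_dot(t)^2) dt over [t1, t2],
    i.e. the functional (2.5), with the composite midpoint rule."""
    h = (t2 - t1) / n
    total = 0.0
    for k in range(n):
        t = t1 + (k + 0.5) * h
        total += f(t) * math.sqrt(1.0 + f_dot(t) ** 2)
    return 2.0 * math.pi * total * h

# Two equal co-axial circles: P1(-0.5, cosh 0.5) and P2(0.5, cosh 0.5) (illustrative values).
t1, t2 = -0.5, 0.5
radius = math.cosh(0.5)

# Comparison curve 1: the horizontal line x = radius, which generates a cylinder.
area_cylinder = surface_area(lambda t: radius, lambda t: 0.0, t1, t2)

# Comparison curve 2: the curve x = cosh t through the same end points.
area_cosh = surface_area(math.cosh, math.sinh, t1, t2)

print(area_cylinder, area_cosh)
```

For these end points the surface generated by x = cosh t has the smaller area of the two, in agreement with the picture of the soap film that narrows between the two wire circles.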

2.2.3 The simplest isoperimetric problem

The third problem that we are going to look at is somewhat different from the preceding two.
The original formulation and even the solution (by intuition) of this special problem is attributed,
in mathematical mythology, to Queen Dido of Carthage in about 800 BC.

According to the legend when queen Dido landed in North Africa she asked the local
chieftain for as much land as could be bounded by a cowhide. This request was granted,
following which she had the hide cut up into very thin thongs, which were then joined
up to form a thong of about 4km in length. She then chose a spot for the harbour and
used this thong to enclose the largest amount of land between two points on the coast
on either side of the harbour. That was how Carthage was founded. The question
that arises immediately is: along what sort of curve should the thong be laid to enclose
the maximum area?

We shall try to answer this problem in the following chapter. This problem is discussed in [14].
Reference [26] calls it the classical isoperimetric problem and attributes it to the Greek mathemati-
cian Zenodorus, whereas the problem of Dido is formulated a little differently as follows:

The simplest isoperimetric problem: Consider two points P1 (t1 , 0) and P2 (t2 , 0) on the t–
axis (the “coast” in Dido’s problem) as in Figure 2.3. We look for that curve (if it exists)
amongst all the curves which are continuously differentiable, of given length λ and joining
P1 (t1 , 0) and P2 (t2 , 0), which encloses the largest area in (say) the positive half plane.

Let a typical curve of this kind be represented parametrically by

C : x = f (t) , f (t) ≥ 0, t1 ≤ t ≤ t2 ,

Figure 2.3

where, since the curve must pass through P1 (t1 , 0) and P2 (t2 , 0), the end conditions

\[ f(t_1) = 0, \qquad f(t_2) = 0, \tag{2.6} \]

must be satisfied, while the prescribed length λ of the curve between P1 (t1 , 0) and P2 (t2 , 0) naturally
requires that

\[ \int_{t_1}^{t_2} \sqrt{1 + \dot f(t)^2}\,dt = \lambda. \tag{2.7} \]

The problem is now to find that function f : [t1 , t2 ] → R (if it exists), among all the continuously
differentiable functions which satisfy (2.6) and (2.7), which yields a maximum value for the
functional

\[ I[f] = \int_{t_1}^{t_2} f(t)\,dt. \tag{2.8} \]

We see immediately that this curve must not cut the t–axis (can you see why?).
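The interplay between the constraint (2.7) and the functional (2.8) can be explored numerically. The sketch below is an illustration only: the helper names, the end points t1 = −1 and t2 = 1, and the two trial curves are assumptions. It compares a circular arc with a symmetric "tent" of exactly the same length. Note that the tent has a corner at t = 0, so it is only piecewise smooth and serves purely as a comparison.

```python
import math

def midpoint_integral(g, t1, t2, n=20_000):
    """Composite midpoint rule for the integral of g over [t1, t2]."""
    h = (t2 - t1) / n
    return h * sum(g(t1 + (k + 0.5) * h) for k in range(n))

def length(f_dot, t1, t2):
    """Left-hand side of the length constraint (2.7)."""
    return midpoint_integral(lambda t: math.sqrt(1.0 + f_dot(t) ** 2), t1, t2)

def area(f, t1, t2):
    """The functional (2.8)."""
    return midpoint_integral(f, t1, t2)

t1, t2 = -1.0, 1.0

# Curve 1: a circular arc through P1 and P2 whose centre lies at (0, -depth).
depth = 0.5
R = math.sqrt(1.0 + depth ** 2)
arc = lambda t: math.sqrt(R * R - t * t) - depth
arc_dot = lambda t: -t / math.sqrt(R * R - t * t)

lam = length(arc_dot, t1, t2)  # the common prescribed length

# Curve 2: a symmetric tent with the same length lam.
height = math.sqrt((lam / 2.0) ** 2 - 1.0)
tent = lambda t: height * (1.0 - abs(t))
tent_dot = lambda t: -height if t > 0 else height

lam_tent = length(tent_dot, t1, t2)
area_arc = area(arc, t1, t2)
area_tent = area(tent, t1, t2)

print(lam, lam_tent, area_arc, area_tent)
```

Both curves satisfy the end conditions (2.6) and have the same length, but the circular arc encloses the larger area.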

We must emphasize here that each curve of comparison C must satisfy the end
conditions (2.6) as well as the length condition (2.7).
This problem is an example of the more general isoperimetric problem (“isos” means the same).
The general formulation of such a problem is as follows.

Consider the set of all curves C of class C^1 in R^{n+1} which can be represented parametrically
by
\[ C : x_i = f_i(t), \qquad t_1 \le t \le t_2, \quad i = 1, 2, \dots, n, \]
and which satisfy the end conditions
\[ f_i(t_1) = a_i, \qquad f_i(t_2) = b_i, \]
where ai and bi are given constants, while the functions fi satisfy the integral conditions
\[ \int_{t_1}^{t_2} G_\mu\left(t, f_i(t), \dot f_i(t)\right) dt = \lambda_\mu, \qquad \mu = 1, 2, \dots, p, \]
where Gµ and λµ are p given (integrable) functions and constants respectively. Now we look for
that member (if it exists) of this set of curves which yields an extreme value of the integral
\[ I[f] = \int_{C:\,t_1}^{t_2} L\left(t, x_i, \dot x_i\right) dt, \]
where L is a given (integrable) function.

2.2.4 Remarks

• The problems that we have formulated so far are prototypes of problems that we are going
to deal with in this module. There are obviously many other functionals that can be defined
but we shall limit ourselves to problems in mechanics. I must just emphasize again that we
are not going to pay much attention in this module to finding extremals, but we shall rather
concentrate on the formulation of solutions of problems and their properties.

• We shall not pay too much attention to the type of problems where the end conditions are
not fixed. Conditions such as (2.6), for example, will always be assumed. There are a large
number of problems where such conditions are not assumed, and this set of problems forms
a subject on its own. The approach used by Hamilton-Jacobi theory dealt with in Chapter
4 does make provision for variable boundary conditions.

2.3 The general formulation of the simplest single-integration problem

We shall now provide a formal definition of the single integral variational problem. Let the space
R^{n+1} be described by n + 1 coordinates t, x1 , x2 , . . . , xn and consider two different points P1 (t1 , x_{i,1})
and P2 (t2 , x_{i,2}) in R^{n+1} such that t1 < t2 . Suppose that we are given a set of continuous functions,
namely
\[ x_i = x_i(t), \qquad i = 1, 2, \dots, n, \tag{2.9} \]
which satisfy the boundary conditions

\[ x_i(t_1) = x_{i,1}, \qquad x_i(t_2) = x_{i,2}. \tag{2.10} \]

This set is the parametric representation of a curve C in R^{n+1} which joins the two points P1 and P2 .
If these functions are continuously differentiable we write
\[ \dot x_i(t) = \frac{dx_i}{dt}, \tag{2.11} \]
along C. Note that we use the same notation in turn in (2.9) and (2.11) for two entities. We
regard them either as functions

xi : [t1 , t2 ] → R, ẋi : [t1 , t2 ] → R,

or as the values of functions


xi ∈ R, ẋi ∈ R,
with i = 1, 2, . . . , n.
You are already familiar with this notation and I don’t anticipate that there should be any
confusion.

Hereafter we use (t, xi , ẋi ) instead of


(t, x1 , x2 , . . . , xn , ẋ1 , ẋ2 , . . . , ẋn )

Suppose that we are given a function

L : (t, xi , ẋi ) → R

which is defined on a simply connected region G in R^{2n+1}. If G is in a plane then G can be
specified as a region which does not contain any holes. This means that every closed curve in
G can be contracted to a point inside G without cutting the boundary. This definition cannot
just be extended to higher dimensions. We could, for example require that in n dimensions the
region G and its boundary ∂G be homeomorphic to the (n − 1) sphere. We shall also assume that
the function L is twice continuously differentiable in all its arguments. This requirement can be
relaxed but we will not deal with this.

The integral of L along C from P1 to P2 , namely

\[ I = \int_{C:\,P_1}^{P_2} L\left(t, x_i, \dot x_i\right) dt \tag{2.12} \]

is well defined, since if we substitute the values (2.9) and (2.11) into (2.12), the integrand is just
a function of t. We have seen in Chapter 1 that the value of this integral depends on the choice of C.

The simplest problem in variational calculus can now be roughly formulated as: Amongst all
the curves joining P1 and P2 choose that one which will give the integral (2.12) an extreme
value in comparison with all the neighboring curves and further determine the conditions
that the functions which determine these curves must satisfy in order to provide an extreme
value for (2.12).
The function L is called the Lagrangian function (introduced in Chapter 1) and the
integral is known as the fundamental (or action) integral.

We also assume that the Lagrangian function is left invariant by any transformation of the xi
for which the Jacobian is not zero. This implies that integral (2.12) is also invariant. There
are some obvious links with Hamilton’s principle in mechanics as we formulated it in Chapter 1.
It is important to be able to relate the integral (2.12) uniquely to a mechanical system. These
assumptions are the mathematical equivalents of the requirement in mechanics that the problem
must be formulated in generalized coordinates. We will discuss an important theorem to do with
invariance and its application in the last chapter.

Our formulation of the variational problem can be put on a better mathematical foundation.
First a few definitions:

• A function f (xi ) of n real variables xi belongs to class C^r on a region G if the function f has
continuous partial derivatives up to and including the order r with respect to every variable
xi .

• A function h (t) of the single variable t is of class D^r on an interval a ≤ t ≤ b if h (t) is
continuous on the interval and further if the interval can be subdivided into a finite number
of subintervals on each of which h (t) is of class C^r.

Let a neighboring curve C′ be defined by

\[ C' : \chi_i = \chi_i(t), \qquad \dot\chi_i(t) = \frac{d\chi_i}{dt}, \qquad i = 1, 2, \dots, n. \tag{2.13} \]

If we now compare the neighboring curves C and C′ with each other, then we must distinguish
between two interpretations. The curve C′ lies in an ε–neighborhood of C in the interval a ≤ t ≤ b
if, for a given ε > 0, it holds that
\[ |x_i(t) - \chi_i(t)| < \varepsilon \tag{2.14} \]

for all values of t in the interval. The curve C′ lies in an (ε, ε′)–neighborhood of C if for two given
numbers ε > 0 and ε′ > 0, both (2.14) and

\[ |\dot x_i(t) - \dot\chi_i(t)| < \varepsilon' \tag{2.15} \]

are satisfied.
Sketches of curves which fulfill these conditions are given in Figure 2.4.

Figure 2.4

In this formulation of the problem, we now consider admissible curves to be of class D^1. If
the curve C yields an extreme value for the integral (2.12) with respect to all the admissible curves
in an ε–neighborhood of C, we talk of a strong extreme value. If, however, the comparison is
restricted to the admissible curves in an (ε, ε′)–neighborhood of C, then the extreme value is weak.
Finally, the comparison curve (1.1.4) which was used in the discussion of Hamilton’s principle in
Chapter 1, lies in an (ε, ε′)–neighborhood of the curve Γ.

2.4 The Euler-Lagrange equations

We are now going to consider the basic necessary condition which a (smooth) curve must satisfy
to provide a weak extreme value for the fundamental integral. This condition arises out of a set
of ordinary second order differential equations.
We have already encountered these equations in Chapter 1 where they were derived for a given
conservative holonomic dynamical system. The derivation presented here makes use of the
variational method which we have also come across in a limited form.
I must emphasize once again that the Euler-Lagrange equations derived here are necessary
conditions which the curve must satisfy.
A sufficient condition will be presented in a subsequent chapter.

Consider a 1–parameter family of curves in $\mathbb{R}^{n+1}$ which pass through the points $P_1$ and $P_2$.
This family is given by

$$x_i = x_i(t, u), \quad i = 1, 2, \ldots, n, \tag{2.16}$$

where u denotes the family parameter. We assume that the curve C defined in (2.9) becomes a
member of the family when u = 0, so that

xi = xi (t, 0) = xi (t) . (2.17)

A neighboring curve C (u) which corresponds to the parameter u is given by

xi = xi (t, u) = xi (t) + uηi (t) , (2.18)


with

$$\eta_i(t) = \left(\frac{\partial x_i}{\partial u}\right)_{u=0}. \tag{2.19}$$

Since the curve C (u) also passes through P1 and P2 we have that

ηi (t1 ) = ηi (t2 ) = 0. (2.20)

Further, we assume that the curve $C(u)$ is of class $D^1$, so that

$$\dot x_i(t, u) = \dot x_i(t) + u\,\dot\eta_i(t), \quad \dot\eta_i = \left(\frac{\partial \dot x_i}{\partial u}\right)_{u=0}. \tag{2.21}$$

The parameter u must therefore be small so that (2.18) and (2.21) hold (with the assumption that
ηi and its derivative with respect to t are bounded).

If we calculate the fundamental integral along $C(u)$ between $P_1$ and $P_2$, then we find that
the result is dependent on $u$, that is

$$I(u) = \int_{t_1}^{t_2} L\left(t, x_i(t, u), \dot x_i(t, u)\right) dt. \tag{2.22}$$

Substituting (2.18) and (2.21) gives

$$I(u) = \int_{t_1}^{t_2} L\left(t, x_i(t) + u\eta_i(t), \dot x_i(t) + u\dot\eta_i(t)\right) dt. \tag{2.23}$$

In light of this we denote integral (2.12) by $I(0)$, and if we consider the Taylor expansion of $I(u)$,
namely

$$I(u) = I(0) + u\left(\frac{dI}{du}\right)_{u=0} + O\left(u^2\right), \tag{2.24}$$

where $O(u^2)$ denotes terms of order $u^2$ and higher, then we define the first variation of the fundamental
integral to be

$$\delta I = u\left(\frac{dI}{du}\right)_{u=0}. \tag{2.25}$$
If we take the second order terms in (2.24) into consideration, that is

$$I(u) = I(0) + u\left(\frac{dI}{du}\right)_{u=0} + \frac{u^2}{2}\left(\frac{d^2I}{du^2}\right)_{u=0} + O\left(u^3\right), \tag{2.26}$$

then we define the second variation of the fundamental integral to be

$$\delta^2 I = u^2\left(\frac{d^2I}{du^2}\right)_{u=0}. \tag{2.27}$$

Suppose that curve $C$ yields a minimum value for the fundamental integral. Given the fact that
the family of comparison curves (2.16) lies in an (ε, ε′)–neighborhood of $C$, the condition

$$\delta I = 0 \tag{2.28}$$

must necessarily hold as a result of (2.24). We shall now examine this conclusion.
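The vanishing of the first variation can be illustrated numerically. The following is only a sketch under stated assumptions: we pick the hypothetical Lagrangian L = ẋ² on 0 ≤ t ≤ 1, the straight line x(t) = t as the curve C, and η(t) = sin(πt) as the variation (none of these choices come from the text). The function names are ours. A central difference then estimates (dI/du) at u = 0.

```python
import math

def fundamental_integral(u, n_steps=2000):
    """Midpoint-rule value of I(u) for the assumed Lagrangian L = xdot**2 on [0, 1].

    Comparison curve: x(t, u) = t + u*eta(t) with eta(t) = sin(pi*t),
    so that eta(0) = eta(1) = 0, as the end point conditions require.
    """
    h = 1.0 / n_steps
    total = 0.0
    for k in range(n_steps):
        t = (k + 0.5) * h
        # derivative of the comparison curve: xdot + u * etadot
        xdot = 1.0 + u * math.pi * math.cos(math.pi * t)
        total += xdot ** 2 * h
    return total

def first_derivative_at_zero(step=1e-4):
    """Central-difference estimate of dI/du at u = 0."""
    return (fundamental_integral(step) - fundamental_integral(-step)) / (2 * step)

print("I(0)     =", fundamental_integral(0.0))
print("I(0.1)   =", fundamental_integral(0.1))
print("dI/du(0) =", first_derivative_at_zero())
```

For this particular choice I(u) = 1 + u²π²/2, so the derivative at u = 0 should be zero up to rounding, and every nonzero u should increase the value of the integral.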

Differentiate the right hand side of (2.23) with respect to $u$ and then put $u = 0$. We now
get two summations under the integral sign and this is represented in an abbreviated form by using
the summation convention, i.e. summation is implied by repeated indices, for example

$$\sum_{i=1}^{N} A_i b_i = A_i b_i.$$

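Using this convention we can write the result as

$$\left(\frac{dI}{du}\right)_{u=0} = \int_{t_1}^{t_2} \left(\frac{\partial L}{\partial x_i}\,\eta_i + \frac{\partial L}{\partial \dot x_i}\,\dot\eta_i\right) dt. \tag{2.29}$$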
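Integrating the first term on the right hand side of (2.29) by parts gives us

$$\int_{t_1}^{t_2} \frac{\partial L}{\partial x_i}\,\eta_i\,dt = \left[\eta_i \int_{t_1}^{t} \frac{\partial L}{\partial x_i}\,dt\right]_{t_1}^{t_2} - \int_{t_1}^{t_2} \dot\eta_i \left(\int_{t_1}^{t} \frac{\partial L}{\partial x_i}\,dt\right) dt \tag{2.30}$$

where the first term in square brackets is zero as a result of (2.20). Substituting (2.30) in (2.29)
gives us

$$\left(\frac{dI}{du}\right)_{u=0} = \int_{t_1}^{t_2} \left[\frac{\partial L}{\partial \dot x_i} - \int_{t_1}^{t} \frac{\partial L}{\partial x_i}\,dt\right] \dot\eta_i\,dt. \tag{2.31}$$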

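From (2.25) it follows that the condition (2.28) is the same as

$$\int_{t_1}^{t_2} \left[\frac{\partial L}{\partial \dot x_i} - \int_{t_1}^{t} \frac{\partial L}{\partial x_i}\,dt\right] \dot\eta_i\,dt = 0, \tag{2.32}$$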

where the functions are calculated along the curve C. Obtaining (2.32) requires more steps than
we have shown here. Make sure that you fill in all the missing steps, especially as regards the
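differentiation. Note that the total derivative of the function $L$ with respect to $u$ is given by

$$\frac{dL}{du} = \frac{\partial L}{\partial x_i}\frac{\partial x_i}{\partial u} + \frac{\partial L}{\partial \dot x_i}\frac{\partial \dot x_i}{\partial u}. \tag{2.33}$$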
The arguments in (2.33) are left out in order to avoid confusion but the function’s dependence on
u must not be forgotten. You can now see why we put the requirements for continuity on L and
its variables.

In order to show that the terms inside the brackets in (2.32) are constants we proceed as follows:
Let

$$\Phi_i = \frac{\partial L}{\partial \dot x_i} - \int_{t_1}^{t} \frac{\partial L}{\partial x_i}\,dt, \tag{2.34}$$

where we evaluate the functions along $C$. Note that the upper limit of the integral is $t$ and not $t_2$,
otherwise we would not have an equation but an identity. Define constants $c_i$ by

$$c_i\,(t_2 - t_1) = \int_{t_1}^{t_2} \Phi_i\,dt, \tag{2.35}$$

so that

$$\int_{t_1}^{t_2} \left[\Phi_i - c_i\right] dt = 0. \tag{2.36}$$

Now choose a particular variation $\eta_i(t)$ in (2.18) by putting

$$\eta_i(t) = \int_{t_1}^{t} \left[\Phi_i - c_i\right] dt \tag{2.37}$$

which satisfies the end point conditions (2.20) on account of (2.36). The functions $\eta_i(t)$ are of class
$D^1$ with

$$\dot\eta_i(t) = \Phi_i - c_i. \tag{2.38}$$

Substitute (2.34) and (2.38) in (2.32), then we obtain

$$\int_{t_1}^{t_2} \Phi_i \left[\Phi_i - c_i\right] dt = 0, \tag{2.39}$$

or, if we use (2.36) again,

$$\int_{t_1}^{t_2} \left[\Phi_i - c_i\right]\left[\Phi_i - c_i\right] dt = 0. \tag{2.40}$$

The integrand of (2.40) consists of the sum of squares, and from (2.36) and (2.39) we get

$$\Phi_i - c_i = 0. \tag{2.41}$$

You should check this step. If we use (2.34) then (2.41) is given by

$$\frac{\partial L}{\partial \dot x_i} - \int_{t_1}^{t} \frac{\partial L}{\partial x_i}\,dt = c_i. \tag{2.42}$$

These equations are known as the Euler-Lagrange equations in integral form.


If we choose the comparison curves from the class which lies in an ε–neighborhood of $C$, then we
see that the necessary condition for a weak extremum also applies to a strong extremum. We
differentiate equations (2.42) with respect to $t$ along each class $C^1$ segment of $C$ to obtain the
Euler-Lagrange equations in differential form, namely

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot x_i}\right) - \frac{\partial L}{\partial x_i} = 0, \quad i = 1, 2, \ldots, n. \tag{2.43}$$

These equations form the basis of our studies in this module and we restrict ourselves to curves of
class $C^2$.

We have thus shown that for a curve $C$ to yield a weak extremum for the fundamental
integral (2.12), it is necessary that $C$ should satisfy the Euler-Lagrange
equations (2.42), with $t_1 \le t \le t_2$, while the constants $c_i$ must be chosen to suit
the problem.
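As a brief illustration of how (2.43) is applied, consider a worked case that is not taken from the text: we assume n = 1 and the hypothetical Lagrangian L = ½mẋ² − ½kx² with constants m and k. The sketch below simply evaluates the two partial derivatives and substitutes them in (2.43).

```latex
% Assumed illustration (not from the text): n = 1,
% L(t, x, \dot x) = \tfrac12 m \dot x^2 - \tfrac12 k x^2, with m and k constants.
\begin{align*}
\frac{\partial L}{\partial \dot x} &= m\dot x, &
\frac{\partial L}{\partial x} &= -kx, \\
\frac{d}{dt}\left(\frac{\partial L}{\partial \dot x}\right) - \frac{\partial L}{\partial x}
  &= m\ddot x + kx = 0 .
\end{align*}
```

The single Euler-Lagrange equation is therefore the familiar equation of a harmonic oscillator, a second order ordinary differential equation as expected.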

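Some remarks about the Euler-Lagrange equations are appropriate at this point. The explicit form
of the equations (2.43) is

$$\frac{\partial^2 L}{\partial \dot x_i\,\partial \dot x_j}\,\ddot x_j + \frac{\partial^2 L}{\partial \dot x_i\,\partial x_j}\,\dot x_j + \frac{\partial^2 L}{\partial t\,\partial \dot x_i} - \frac{\partial L}{\partial x_i} = 0 \tag{2.44}$$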

(recall the summation convention). This is a set of n ordinary differential equations of the
second order. Since none of the coefficients in the equation need be linear, we can expect that
the set will be non-linear. Fortunately we don’t need to use the set as it appears in (2.44), as
this conceals certain properties.
For example

• If the Lagrangian function is independent of $x_i$, then it follows immediately from (2.43) (or
from (2.44)) that

$$\frac{\partial L}{\partial \dot x_i} = c_i,$$

and we call this the first integral of the Euler-Lagrange equations, where the constants
$c_i$ are determined by the given problem. These equations are of the first order, and are in
principle easier to deal with than those of the second order. On the other hand (2.44) reduces
to

$$\frac{\partial^2 L}{\partial \dot x_i\,\partial \dot x_j}\,\ddot x_j + \frac{\partial^2 L}{\partial t\,\partial \dot x_i} = 0, \tag{2.45}$$

which in general cannot easily be recognised to be a set of first order differential equations.
In equations (2.44) we see that the coefficient of $d^2x_i/dt^2$ must not be zero, since otherwise
it will not be a second order differential equation. Locally we have

$$\frac{d^2 x_i}{dt^2} = f_i\left(t, x_j, \dot x_j\right)$$

(note that this does not have to apply at every point in $\mathbb{R}^{2n+1}$).

• If L is independent of t then we get the following result.

Lemma 2.4.1

Let $L$ be the Lagrangian function which is not explicitly dependent on $t$. Then we have that

$$\frac{\partial L}{\partial \dot x_i}\,\dot x_i - L = c \tag{2.46}$$

along any extremal of the problem. Conversely, if in addition $n = 1$, then any solution $x = x(t)$
of equation (2.46) is an extremal of the problem.

Proof
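Differentiate the left hand side of (2.46) with respect to $t$, then we get

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot x_i}\,\dot x_i - L\right) = \frac{\partial L}{\partial \dot x_i}\,\ddot x_i + \dot x_i\,\frac{d}{dt}\left(\frac{\partial L}{\partial \dot x_i}\right) - \frac{\partial L}{\partial x_i}\,\dot x_i - \frac{\partial L}{\partial \dot x_i}\,\ddot x_i$$

$$= \dot x_i\left[\frac{d}{dt}\left(\frac{\partial L}{\partial \dot x_i}\right) - \frac{\partial L}{\partial x_i}\right], \tag{2.47}$$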

where we made explicit use of ∂L/∂t ≡ 0. Thus if the Euler-Lagrange equations (2.43) are valid,
then we deduce that the right hand side of (2.47) is zero. Then (2.46) follows immediately.

Conversely, assume that we have a solution $x_i(t)$ of (2.46). This means that (from (2.47))

$$\dot x_i\left[\frac{d}{dt}\left(\frac{\partial L}{\partial \dot x_i}\right) - \frac{\partial L}{\partial x_i}\right] = 0, \tag{2.48}$$

but this still does not mean that we have an extremal. The fact that (2.48) is a sum must be
taken into consideration (each term must be zero to ensure that the Euler-Lagrange equations are
satisfied). If $n = 1$, then the Euler-Lagrange equations are satisfied (since there is only one term)
provided that

$$\dot x_i \neq 0, \tag{2.49}$$

which, as we said earlier, we assumed in general.
Further, if $x_i(t)$ are such that

$$\dot x_i = 0 \tag{2.50}$$

then this means that $x_i(t) = \lambda_i$ (constants) and then (2.46) applies irrespective of whether or not
$x_i(t) = \lambda_i$ is a solution of the Euler-Lagrange equations. For this reason it is always necessary to
examine the case where (2.50) applies separately. The important point here is that the extremal
satisfies (2.46), but not all functions satisfying (2.46) will satisfy (2.43). □

We shall now introduce an important term, namely that of an extremal.

Any solution of the Euler-Lagrange equations (2.43) in the form $x_i = f_i(t)$ is called an
extremal, while the same name is also used for the corresponding curves in $\mathbb{R}^{n+1}$.

This name is not a very good choice since the extremal does not have to yield extreme values for
the fundamental integral (the Euler-Lagrange equations are only necessary conditions). It must
be emphasized that “smooth” solutions of the variational problem must be extremals. If we relax
the condition of smoothness, then this is not necessarily the case.

Since we have derived an important set of equations in the preceding section, we shall now
consider a few examples of their application. Once again please fill in any essential missing steps.

Example 2.4.1

A particle is projected with velocity u, from a point O on the ground, in a direction which makes
an angle α with the horizontal. Neglect any air resistance and show that the path of the particle
is a parabola.

Solution
Let the generalized coordinates be x and y, the horizontal and vertical displacements.

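The Lagrangian function is given by

$$L(t, x, y, \dot x, \dot y) = \frac12 m\dot x^2 + \frac12 m\dot y^2 - mgy. \tag{2.51}$$

The Euler-Lagrange equations become

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot x}\right) - \frac{\partial L}{\partial x} = 0, \quad \text{and} \quad \frac{d}{dt}\left(\frac{\partial L}{\partial \dot y}\right) - \frac{\partial L}{\partial y} = 0. \tag{2.52}$$

Substitute (2.51) into (2.52), then the equations of motion are

$$\ddot x = 0, \quad \ddot y + g = 0.$$

The solution is given by

$$x = ut\cos\alpha, \quad y = ut\sin\alpha - \frac12 gt^2,$$

and if we combine these (eliminate the parameter $t$) we get

$$y = x\tan\alpha - \frac{gx^2}{2u^2\cos^2\alpha},$$

which is obviously a parabola.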
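The elimination of t can be checked numerically. The sketch below samples the parametric solution and compares it with the parabola; the launch speed, angle and value of g are illustrative assumptions, and the function names are ours.

```python
import math

def trajectory_point(u, alpha, t, g=9.81):
    """Parametric solution x(t), y(t) of the equations of motion."""
    x = u * t * math.cos(alpha)
    y = u * t * math.sin(alpha) - 0.5 * g * t ** 2
    return x, y

def parabola_height(u, alpha, x, g=9.81):
    """Height predicted by y = x tan(alpha) - g x^2 / (2 u^2 cos^2(alpha))."""
    return x * math.tan(alpha) - g * x ** 2 / (2 * u ** 2 * math.cos(alpha) ** 2)

def max_deviation(u=20.0, alpha=math.radians(35.0), n_samples=50, g=9.81):
    """Largest gap between the two descriptions over one flight (assumed data)."""
    t_flight = 2 * u * math.sin(alpha) / g  # time at which y returns to zero
    worst = 0.0
    for k in range(n_samples + 1):
        t = t_flight * k / n_samples
        x, y = trajectory_point(u, alpha, t, g)
        worst = max(worst, abs(y - parabola_height(u, alpha, x, g)))
    return worst

print("largest deviation from the parabola:", max_deviation())
```

The deviation should be at the level of rounding error, which is consistent with the two descriptions being the same curve.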

Example 2.4.2
A particle of mass m is attached to one end of a string and swings around a horizontal cylindrical
pole, of radius a, so that it winds around it. The motion of the particle takes place in a vertical
plane. Determine the equation of motion of the particle.
Solution
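This example has already been dealt with in detail in Example 1.2.5. There is only one degree of
freedom and the Lagrangian function is given by

$$L = L\left(t, \theta, \dot\theta\right) = \frac12 m\,(c - a\theta)^2\,\dot\theta^2 - mg\left[(c - a\theta)\sin\theta - a\cos\theta\right].$$

The Euler-Lagrange equation is given by

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot\theta}\right) - \frac{\partial L}{\partial \theta} = 0,$$

where

$$\frac{\partial L}{\partial \dot\theta} = m\,(c - a\theta)^2\,\dot\theta,$$

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot\theta}\right) = m\,(c - a\theta)^2\,\ddot\theta - 2ma\,(c - a\theta)\,\dot\theta^2,$$

$$\frac{\partial L}{\partial \theta} = -am\,(c - a\theta)\,\dot\theta^2 - mg\left[(c - a\theta)\cos\theta - a\sin\theta + a\sin\theta\right].$$

The equation of motion is now given by

$$(c - a\theta)^2\,\ddot\theta - a\,(c - a\theta)\,\dot\theta^2 + g\,(c - a\theta)\cos\theta = 0.$$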

Once again there are several missing steps that you need to fill in.

Example 2.4.3

A uniform sphere of mass M and radius a is initially at rest on a horizontal flat plane. A smooth
particle of mass m is placed on top of the sphere and allowed to slide down. The resulting reaction
force causes the sphere to roll without slipping. Determine the equations of motion.

Solution
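This system has two degrees of freedom represented by $X$, the displacement of the centre of mass
$G$ of the sphere, and $\theta$, the angle between $m$ and the vertical. If the sphere rolls without slipping,
then the kinetic energy is given by

$$T = \frac12 M\,\dot{\mathbf R}\cdot\dot{\mathbf R} + \frac12\left(I^G_{xx}\omega_x^2 + I^G_{yy}\omega_y^2 + I^G_{zz}\omega_z^2\right) = \frac{7}{10} M\dot X^2.$$

Let

$$x = X + a\sin\theta, \quad y = a\,(1 + \cos\theta),$$

Figure 2.5

define the position of the mass $m$. Its kinetic energy is given by

$$\frac12 m\left(\dot x^2 + \dot y^2\right) = \frac{m}{2}\left[\dot X^2 + a^2\dot\theta^2 + 2a\dot X\dot\theta\cos\theta\right].$$

Combining the kinetic and potential energies gives us the Lagrangian function

$$L = \frac{7}{10} M\dot X^2 + \frac{m}{2}\left[\dot X^2 + a^2\dot\theta^2 + 2a\dot X\dot\theta\cos\theta\right] - mga\left[1 + \cos\theta\right].$$

The two Euler-Lagrange equations are given by

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot X}\right) - \frac{\partial L}{\partial X} = 0, \quad \frac{d}{dt}\left(\frac{\partial L}{\partial \dot\theta}\right) - \frac{\partial L}{\partial \theta} = 0,$$

and by substitution we get

$$\frac{d}{dt}\left[7M\dot X + 5m\left(\dot X + a\dot\theta\cos\theta\right)\right] = 0,$$

$$a\ddot\theta + \ddot X\cos\theta - g\sin\theta = 0.$$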

Example 2.4.4

The Brachistochrone-problem: This problem was formulated in 2.2.1. We are looking for a
smooth curve $C : x = x(t)$, joining two points $P_1(t_1, x_1)$ and $P_2(t_2, x_2)$ in the $(t, x)$–plane, which
will yield a minimum value for the integral

$$I = \int_{P_1}^{P_2} \frac{\sqrt{1 + \dot x^2}}{\sqrt{x}}\,dt \quad (\text{taken along } C), \tag{2.53}$$

in comparison with its value along neighboring curves through $P_1(t_1, x_1)$ and $P_2(t_2, x_2)$.
Solution
We are now going to derive the Euler-Lagrange equations without specifying whether we are looking
for a weak or a strong minimum. Notice that integral (2.53) looks a little different from the one in
(2.4). Will dropping the constants make any significant difference to the Euler-Lagrange equations?
The Euler-Lagrange equations are given by

$$\frac{d}{dt}\left(\frac{\partial}{\partial \dot x}\left\{\sqrt{\frac{1 + \dot x^2}{x}}\right\}\right) - \frac{\partial}{\partial x}\left\{\sqrt{\frac{1 + \dot x^2}{x}}\right\} = 0. \tag{2.54}$$

By differentiating (2.54) directly we obtain the explicit form (2.44) which we can solve for $x$. This
procedure is a little complicated since the form of the Lagrangian function is not as simple as the
preceding ones. We see that $n = 1$ and we remarked above that the Lagrangian function is not
explicitly dependent on $t$. If we assume that (2.49) applies then the condition for Lemma 2.4.1 to
hold is satisfied. We use the first integral of the Euler-Lagrange equations, namely

$$\frac{\dot x^2}{\sqrt{x\left(1 + \dot x^2\right)}} - \sqrt{\frac{1 + \dot x^2}{x}} = k,$$

to determine the extremals. Simplifying gives us

$$\frac{1}{\sqrt{x\left(1 + \dot x^2\right)}} = c \quad (c = -k). \tag{2.55}$$

This equation can be rewritten by squaring and solving for the velocity, namely

$$\dot x^2 = \frac{1 - c^2x}{c^2x}.$$

This equation must hold along an extremal $C$ and since (2.49) applies (also for practical reasons)
we get

$$\frac{dx}{dt} = \sqrt{\frac{1 - c^2x}{c^2x}}.$$
Note that we have taken the positive root. We assume that the velocity is positive
(this choice is supported by our system of axes). In general you must be very careful
with your choice of signs with the square root. You should be guided by the problem
and the variables involved.

To solve this equation we integrate it along $C$ from $P_1$ to an arbitrary point $P(t, x)$, $x_1 \le x \le x_2$,
and find that the extremal is such that

$$t - t_1 = \int_{x_1}^{x} \sqrt{\frac{x}{2b - x}}\,dx, \quad c^{-1} = \sqrt{2b}, \quad t_1 \le t \le t_2. \tag{2.56}$$

Note that $c > 0$ on account of (2.55) and $b \ge 0$ by definition (and we exclude the possibility that
$2b - x = 0$, which (2.49) would confirm). As an exercise you can show that $0 < x < 2b$ for
$x_1 \le x \le x_2$, that is $0 < x_2 < 2b$.

It should not be difficult for you to evaluate integral (2.56) by this stage. Substitute

$$x = 2b\sin^2\frac{\theta}{2}, \quad 0 < \theta_1 \le \theta \le \theta_2 < \pi. \tag{2.57}$$

The integral (2.56) then becomes

$$t - t_1 = b\,(\theta - \sin\theta) + a, \quad a = b\,(\sin\theta_1 - \theta_1). \tag{2.58}$$

By using identities it follows that (2.57) can be written as

x = b (1 − cos θ) , (2.59)

and (2.58) becomes


t = b (θ − sin θ) + d, d = a + t1 . (2.60)

Equations (2.59),(2.60) are the equations of a cycloid, and represent the path of a point on the
circumference of a circle of radius b, which is rolling along the t–axis (see figure 2.6). The constants
b and d are determined by the end points P1 and P2 .
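A numerical comparison makes the role of the cycloid more concrete. The sketch below evaluates integral (2.53) in parametric form, that is as the integral of sqrt(t′² + x′²)/sqrt(x) with respect to a curve parameter, along a cycloid and along the straight line with the same end points. The choices b = 1, d = 0, the end point θ = π and the parametrisation of the line by s² are assumptions made purely for illustration, and the function names are ours.

```python
import math

def travel_functional(t_prime, x_prime, x, s_start, s_end, n_steps=20000):
    """Midpoint rule for the integral of sqrt(t'^2 + x'^2) / sqrt(x) ds.

    For a curve with t' > 0 this equals the integral (2.53) written in
    terms of an arbitrary curve parameter s.
    """
    h = (s_end - s_start) / n_steps
    total = 0.0
    for k in range(n_steps):
        s = s_start + (k + 0.5) * h
        total += math.hypot(t_prime(s), x_prime(s)) / math.sqrt(x(s)) * h
    return total

B = 1.0                      # assumed radius of the rolling circle
THETA_END = math.pi          # assumed end value of the cycloid parameter
T_END = B * (THETA_END - math.sin(THETA_END))
X_END = B * (1.0 - math.cos(THETA_END))

# Cycloid t = B(theta - sin theta), x = B(1 - cos theta), starting at the origin.
cycloid_value = travel_functional(
    lambda th: B * (1.0 - math.cos(th)),
    lambda th: B * math.sin(th),
    lambda th: B * (1.0 - math.cos(th)),
    0.0, THETA_END)

# Straight line between the same end points, parametrised as
# t = T_END * s^2, x = X_END * s^2 so that the integrand stays bounded at s = 0.
line_value = travel_functional(
    lambda s: 2.0 * T_END * s,
    lambda s: 2.0 * X_END * s,
    lambda s: X_END * s * s,
    0.0, 1.0)

print("cycloid      :", cycloid_value)
print("straight line:", line_value)
```

Along the cycloid the integrand reduces to the constant sqrt(2b), so the first value should be close to sqrt(2b) times the end value of θ, and it should be smaller than the value along the straight line.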

Figure 2.6

It can now be shown that it is always possible to construct one and only one cycloid through
two given points with $t_1 \neq t_2$ under a line $x = \alpha$ (say), where the cycloid is described by a circle which
rolls along the underside of the line. It can also be shown that the cycloid joining $P_1$ and $P_2$ as in
Figure 2.6 is a solution of the brachistochrone problem. You will find this discussed in the book
[2]. So the particle slides in the shortest time if it travels along a cycloid.

We assumed that (2.49) applies. If we now assume that it does not, then we get x = c, where c is
a constant. As an exercise, get this result from the explicit form (2.54) .

Example 2.4.5 The problem of minimal surfaces of revolution

We already formulated this problem in 2.2.2. This problem is discussed further in detail in
[2].

The Lagrangian function is given by (see (2.5))

$$L(x, \dot x) = x\sqrt{1 + \dot x^2}. \tag{2.61}$$

Once again we can use Lemma 2.4.1 to determine the extremals. Substitution in (2.46) gives us

$$\frac{x\dot x^2}{\sqrt{1 + \dot x^2}} - x\sqrt{1 + \dot x^2} \equiv \frac{-x}{\sqrt{1 + \dot x^2}} = -c_1, \quad c_1 \ge 0. \tag{2.62}$$

The restriction on $c_1$ agrees with the assumption that the functions $x$ are not negative. Note that
if $c_1 = 0$, then from (2.62) we get $x \equiv 0$ and according to Lemma 2.4.1 this solution of (2.62) is
not necessarily an extremal. Can you show that the straight lines $x = c$ are not extremals of the
problem? This can also be seen from geometric considerations. Let us now solve equation (2.62).
Isolate the velocity on the left hand side so that

$$\dot x = \pm\frac{1}{c_1}\sqrt{x^2 - c_1^2}, \tag{2.63}$$

which indicates that $c_1$ is restricted to $0 < c_1 < |x|$. Integration shows that

$$t - t_1 = \pm c_1 \int_{x_1}^{x} \frac{dx}{\sqrt{x^2 - c_1^2}} \quad (\text{taken along } \Gamma), \tag{2.64}$$

where Γ is the extremal (for which we want the equation) which joins the points $P_1$ and $P_2$ (see
Figure 2.7).

Figure 2.7

By substituting $x = c_1\cosh u$ ($u \ge 0$) we get

$$t = \pm c_1\cosh^{-1}\left(\frac{x}{c_1}\right) + c_2, \quad c_2 = \mp c_1\cosh^{-1}\left(\frac{x_1}{c_1}\right) + t_1,$$

and as a result the parametric equation of the extremal is

$$x = c_1\cosh\frac{t - c_2}{c_1}. \tag{2.65}$$

Why has the ± been dropped?

Equation (2.65) represents a two parameter family of catenaries. A smooth minimal surface
of revolution can therefore only be a catenary. Once again we must emphasize that (2.65) is
an extremal and we are as yet unable to say whether it is a solution for the minimal surface of
revolution problem or not (although we can deduce that it is some kind of catenary!).
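A quick numerical check of the first integral can be reassuring here. The sketch below evaluates the left hand side of (2.46) for the Lagrangian (2.61) at several points of a catenary of the form (2.65) and compares it with −c1, as (2.62) requires. The values of c1, c2 and the sample times are arbitrary assumptions, and the function names are ours.

```python
import math

def catenary(t, c1, c2):
    """Return (x, xdot) on the catenary x = c1 cosh((t - c2)/c1)."""
    arg = (t - c2) / c1
    return c1 * math.cosh(arg), math.sinh(arg)

def first_integral(x, xdot):
    """Left hand side of (2.46) for L = x sqrt(1 + xdot^2)."""
    root = math.sqrt(1.0 + xdot ** 2)
    dL_dxdot = x * xdot / root
    return xdot * dL_dxdot - x * root

def spread_along_curve(c1=0.8, c2=0.3, t_values=(-1.0, -0.25, 0.0, 0.5, 2.0)):
    """Largest deviation of the first integral from -c1 (assumed sample data)."""
    values = [first_integral(*catenary(t, c1, c2)) for t in t_values]
    return max(abs(v + c1) for v in values)

print("largest deviation from -c1:", spread_along_curve())
```

The deviation should be at rounding level, which is what we expect if the catenary satisfies (2.62) at every point.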

The solution referred to here is the solution of a variational problem and is not the solution (ex-
tremal) of the Euler-Lagrange equations. In other words, our main problem is to see if the family
of extremals which is a solution of the Euler-Lagrange equation(s), is a solution in the sense that
for given endpoints there is a member of the family for which the given fundamental integral yields
an extreme value (as in variational calculus). Naturally even before this can be the case we need
to know if any two points in the relevant neighbourhood can be joined by an extremal. We will
look at this latter requirement from the point of view dealt with in [29].

Let us assume that without any loss of generality (why?) P1 is the point (0, 1) (see Figure
2.8).

Figure 2.8

We prove that points exist which cannot be joined by members of the family (2.65). Substituting
$(0, 1)$ in (2.65) gives us

$$c_1 = \left[\cosh\left(\frac{c_2}{c_1}\right)\right]^{-1}. \tag{2.66}$$

Set $c_2 = \mu c_1$, then, after substituting (2.66), (2.65) becomes

$$x = x(t, \mu) = \frac{\cosh\left(t\cosh\mu - \mu\right)}{\cosh\mu}. \tag{2.67}$$

Hence

$$\dot x(t, \mu) = \sinh\left(t\cosh\mu - \mu\right), \tag{2.68}$$

which gives us

$$\dot x(0, \mu) = -\sinh\mu,$$

from which we deduce that $dx/dt : (0, \mu) \to dx(0, \mu)/dt$ has $\mathbb{R}$ as its range, since $\mu$ ranges through
the whole of $\mathbb{R}$ (do you agree?). But this is precisely the gradient at $P_1$ of that curve of the family
(2.65) which corresponds to the parameter value $\mu$ in (2.67). Hence the initial gradient of the
family of extremals (catenaries) through $P_1$ can be chosen arbitrarily.

Now let’s derive the conditions on constants $c_1$ and $c_2$ (subject to (2.66)) which will ensure that a
member of the family (2.65) will go through both $P_1$ and $P_2(t_2, x_2)$. It is obvious that we must
require that

$$x_2 = \frac{\cosh\left(t_2\cosh\mu - \mu\right)}{\cosh\mu}. \tag{2.69}$$

Thus, if we can find values for $(t_2, x_2)$ for which (2.69) does not hold then we have proved that
points exist which cannot be joined to $P_1$ by means of a catenary of the family (2.65).

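Firstly, we note that

$$\cosh t = \frac12\left(e^{t} + e^{-t}\right) > |t| \quad \forall\, t \in \mathbb{R},$$

and if we use this result then

$$\frac{\cosh\left(t_2\cosh\mu - \mu\right)}{\cosh\mu} > \frac{\left|t_2\cosh\mu - \mu\right|}{\cosh\mu} = \left|t_2 - \frac{\mu}{\cosh\mu}\right| \ge |t_2| - \frac{|\mu|}{\cosh\mu}.$$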

Secondly, it is easy to show that (try it!)

$$\max_{\mu \in \mathbb{R}} \frac{|\mu|}{\cosh\mu} = \frac{|\mu_0|}{\cosh\mu_0},$$

where $\mu_0$ ($\neq 0$) satisfies the equation

$$\mu_0 = \frac{\cosh\mu_0}{\sinh\mu_0}.$$

Hence it follows that

$$|t_2| - \frac{|\mu|}{\cosh\mu} \ge |t_2| - \frac{1}{\sinh\mu_0},$$

and as a result

$$x(t_2, \mu) \ge |t_2| - \frac{1}{\sinh\mu_0} \quad \forall\, \mu\,(\neq 0) \in \mathbb{R}.$$

Choose $t_2$ so that

$$k = |t_2| - \frac{1}{\sinh\mu_0} > 0,$$

then we see immediately that for all $x_2 < k$, it holds that $x(t_2, \mu) > x_2$ for all $\mu \in \mathbb{R}$. It is clear that
if $\mu = 0$, then $x(t_2, 0) = \cosh t_2 > |t_2|$ and hence if $x_2 < |t_2|$ then again we have that $x(t_2, 0) > x_2$.
Consequently no member of the family (2.65) passes through $P_1(0, 1)$ and $P_2(t_2, x_2)$. Hence
the problem of minimal surfaces of revolution has no solution among the curves under discussion (i.e.
smooth curves) which join $P_1$ and $P_2(t_2, x_2)$.
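The bound derived above can be explored numerically. In the sketch below we solve μ sinh μ = cosh μ for the positive root μ0 by bisection, form k for an assumed value t2 = 2, and scan the family (2.67) over a grid of μ values to see how low a catenary through P1(0, 1) can reach at t = t2. The bracket for the bisection, the grid and the value of t2 are assumptions for illustration only, and the function names are ours.

```python
import math

def solve_mu0(lo=0.5, hi=2.0, iterations=80):
    """Bisection for the positive root of mu*sinh(mu) - cosh(mu) = 0."""
    def f(m):
        return m * math.sinh(m) - math.cosh(m)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def family_height(t, mu):
    """x(t, mu) for the catenaries through (0, 1), equation (2.67)."""
    return math.cosh(t * math.cosh(mu) - mu) / math.cosh(mu)

MU0 = solve_mu0()
T2 = 2.0                                   # assumed abscissa of P2
K = abs(T2) - 1.0 / math.sinh(MU0)         # lower bound from the text
MU_GRID = [-5.0 + 0.001 * j for j in range(10001)]
LOWEST = min(family_height(T2, mu) for mu in MU_GRID)

print("mu0                 :", MU0)
print("bound k             :", K)
print("lowest height found :", LOWEST)
```

Every height found on the grid should lie above k, so a point P2(t2, x2) with x2 below k cannot be reached by any member of the family. The scan also suggests that the bound k is not sharp.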

We shall now sketch the actual situation without a detailed proof. Look at Figure 2.8.
The family of catenaries (2.65) through P1 (0, 1) (in practice no restrictions should be placed on
P1 except that it should not be the origin), given by (2.67), has an envelope which appears in the
figure. This means that no member of the family can join P1 to a point P2 separated from P1 by
the envelope. Furthermore there is one and only one member of the family that passes through
P1 and through a point A on the envelope (a contact point), while if B is a point on the same side
of the envelope as P1 , then there are two members that pass through both P1 and B.

If $A$ lies on the envelope, the unique extremal $P_1A$ does not minimize the surface. This
point $A$ is called a conjugate point of $P_1$. In a certain sense $A$ is “too far” from $P_1$ which once
again underlines the local nature of our theory. In this case the “solution” of the problem is given
by the function $f$ of class $D^1$, defined by

$$f(t) = \begin{cases} 1 & \text{if } t = 0, \\ 0 & \text{if } 0 < t < t_a, \\ x_a & \text{if } t = t_a, \end{cases} \tag{2.70}$$

where A is the point (ta , xa ) (see Figure 2.9). The “surface of revolution” consists of the surfaces
of the bounding circles! This is a striking representation of what actually happens in practice
when two co-axial circles, as described in 2.2.2, are too far away from each other. This solution
(2.70) is the so-called (discontinuous) Goldschmidt solution, named after BCW Goldschmidt who
discovered it in 1831.

Figure 2.9

If P2 is on the same side of the envelope as P1 like point B in Figure 2.8, then it can be
proved that the upper of the two catenaries joining P1 and P2 , namely that one which does not
touch the envelope (and which does not have a point conjugate to P1 between P1 and B), provides
a strong relative minimum in the class of smooth curves between P1 and P2 .

This problem is discussed in detail in [2]. A related problem is the problem of Newton,
which concerns a type of surface of revolution. It is concerned with the shape that a body should have so
that as it travels through a resisting medium in the direction of its axis of revolution it experiences
the least amount of resistance. The Lagrangian is given by

$$L(x, \dot x) = \frac{x\dot x^3}{1 + \dot x^2}.$$

A thorough description can be found in [12].

2.5 Exercises

The exercises that follow are largely concerned with the Euler-Lagrange equations. I have
tried to present as many problems as possible. The exercises on the applications of mechanics
unfortunately result in Euler-Lagrange equations which can only be solved numerically (as in examples
2.4.2 and 2.4.3). Sometimes these can actually be solved by making suitable assumptions, such
as for example that the angles involved are very small. This is not always possible. Since numerical
solutions fall outside the scope of this module, we shall have to make do just with the equations.

We first present problems in mechanics and so as an additional exercise you can derive
Euler-Lagrange equations for all the problems listed in the exercises of chapter 1.

2.5.1 A particle moves along a smooth plane which makes an angle α with the horizontal. Find
the equations of motion and solve them.

2.5.2 Consider a simple plane pendulum of length λ and mass m. The pendulum is suspended
from a mass point of mass M which is free to move along a smooth horizontal plane.
Determine the equation of motion and the period of small oscillations of the pendulum.

2.5.3 A particle is constrained to move on a vertical weightless circle of radius a in a uniform
gravitational field. The circle rotates freely around a vertical diameter.
Determine the equations of motion of the particle and derive the motion of the particle.

2.5.4 Two masses m and M are joined by an inextensible string which passes over a frictionless
and weightless pulley. Assume that the system is in a uniform gravitational field. Determine the
motion of the system.
This system is called Atwood’s machine and is used to determine the gravitational acceleration at
a point on the earth.

2.5.5 A light ring is fixed to one end of a uniform rod of mass m and length 2a. The ring slides
without friction along a horizontal track.
Determine the equations of motion of the rod and the period of small oscillations. The motion
takes place in a vertical plane parallel to the track.

2.5.6 The length $s$ of any line joining two points $P_1$ and $P_2$ on a cylinder is given by

$$s = \int_{P_1}^{P_2} \sqrt{1 + r^2\left(\frac{d\theta}{dz}\right)^2}\,dz,$$

where $r$, $\theta$ and $z$ are cylindrical coordinates. Use the Euler-Lagrange equations to show that the
shortest path between $P_1$ and $P_2$ is a helix.

2.5.7 A uniform ladder of mass m and length 2a stands with one end on a horizontal smooth
floor and the other end against a smooth vertical wall.
The ladder is initially at rest in a vertical plane perpendicular to the wall and makes an angle α
with the horizontal.
Determine the motion of the ladder.

2.5.8 A particle of mass m slides freely along a smooth inclined block of mass M , while the block
itself is free to slide on a smooth horizontal table. Find the motion of the system.

The problems that follow are going to make you delve a little deeper. They are all formulated
in terms of the variational calculus.

2.5.9 Formulate and solve the brachistochrone problem. In which sense have you “solved” the
problem?

2.5.10 Formulate and solve the problem of minimal surface of revolution. In which sense have
you solved the problem?

2.5.11 Consider the problem of minimal surfaces of revolution with Lagrangian function (2.61).
Show that the Euler-Lagrange equation is given by

$$1 + \dot x^2 - x\ddot x = 0,$$

and solve it.

Hint: let $\dfrac{dx}{dt} = p$.

2.5.12 Obtain the smooth extremal of the variational problem with Lagrangian function

$$L(t, x, \dot x) = e^{t}\sqrt{1 + \dot x^2}.$$

Determine whether any two points in the Cartesian plane can be joined by a member of this family.

2.5.13 Find the smooth extremal for the variational problem with Lagrangian function

$$L(t, x, \dot x) = e^{t}\sqrt{1 + \dot x^2}.$$

Determine whether any two points in the Cartesian plane can be joined by a member of this family.

2.5.14 Find the smooth extremal for the variational problem with Lagrangian function

$$L(t, x, \dot x) = \dot x + t^2\dot x^2.$$

Show that the smooth extremal is a hyperbola and deduce that points exist in the Cartesian plane
which cannot be joined by an extremal of class $C^1$.

2.5.15 Consider the Lagrangian function

$$L(x, \dot x) = x^2\,(1 - \dot x)^2.$$

Show that no extremal of class $C^2$ exists which joins points $P_1(0, 0)$ and $P_2(2, 1)$. Show that the
curve “with an angle” defined by

$$x = x(t) = \begin{cases} 0, & 0 \le t \le 1, \\ t - 1, & 1 \le t \le 2, \end{cases}$$

makes the absolute value of the integral zero. This curve is continuous everywhere but is not
differentiable at the point $(1, 0)$, i.e. it is of class $D^1[0, 2]$.

2.5.16 Consider the problem with the Lagrangian function

$$L(x, \dot x) = x\sqrt{1 - \dot x^2}.$$

Take the initial point to be the origin and show that the curves $x = c\sin(t/c)$ are extremals for this
problem. Are they the only ones? Investigate the situation if the right hand endpoint is:

(a) on the t–axis,

(b) not on the t–axis.

Compare, in particular, in the first case, the values of the fundamental integral along the t–axis
with its values on the extremal. This exercise is in essence Exercise (4) on p. 28 of [6].

2.5.17 Consider the variational problem with Lagrangian function

$$L(t, \dot x) = t^2\dot x^2.$$

Show that $x = 1/t$ is the extremal through $P_1(1, 1)$ and $P_2\left(2, \tfrac12\right)$. Denote this extremal by Γ and
vary Γ to the curve $C : x = (1/t) + \eta(t)$ where $\eta$ is of class $C^2$ and $\eta(1) = \eta(2) = 0$. Denote the
fundamental integral along Γ between $P_1(1, 1)$ and $P_2\left(2, \tfrac12\right)$ by $I(\Gamma)$ and the fundamental integral
along $C$ by $I(C)$. Show that

$$I(C) - I(\Gamma) = \int_1^2 t^2\dot\eta^2\,dt$$

and deduce that $I(C) \ge I(\Gamma)$ with equality if and only if Γ and $C$ coincide. Is this minimum
strong or weak? This problem is discussed in [6].

2.5.18 Consider the variational problem with fundamental integral

$$I[x] = \int_0^m \sqrt{x + h}\,\sqrt{1 + \dot x^2}\,dt,$$

where $h > 0$ and $x(0) = 0$, $x(m) = m_1 > -h$. Derive the extremals in the form

$$4c^2x = t^2 - 2at. \tag{2.71}$$

Introduce the gradient

$$\dot x(0) = \alpha$$

of the extremal at the origin as a parameter.
Show that the equation of the family (2.71) can be expressed as

$$x = \alpha t + \frac{1 + \alpha^2}{4h}\,t^2. \tag{2.72}$$

In order to find the envelope of this family of extremals (parabolas), differentiate with respect to
$\alpha$ and get $0 = t + \alpha t^2/2h$. The equation of the envelope is given by

$$x = -h + \frac{t^2}{4h}. \tag{2.73}$$
If we compare this situation with Figure 2.10 we see that all the extremals (2.72) lie “under” the
parabola (2.73) (with the x–axis pointing downwards). If the point (m, m1 ) is outside the parabola
it cannot be joined to the origin by an extremal. If it is “under” the parabola we can find two
parabolas of the family (2.72) which join (m, m1 ) to the origin. Here (and you do not need to
prove this at this stage) the “higher” parabola does not yield a minimum, but the “lower” one
yields a strong minimum. This exercise requires that you examine the example in detail. It is
dealt with in [1].

Figure 2.10

2.5.19 Find the smooth extremals for the problems with Lagrangian functions as defined below.
You may delve a little deeper than in the preceding exercises, but it is not part of the exercise.
2.5.19.1 $L(x, \dot x) = \dfrac{1 + \dot x^2}{\dot x^2}$.

2.5.19.2 $L(x, y, \dot x, \dot y) = 2xy - 2x^2 + \dot x^2 - \dot y^2$.

2.5.19.3 $L(t, x, \dot x) = \dfrac{\dot x^2}{t^3}$.

2.5.19.4 $L(x, \dot x) = \dot x^2 + 2x\dot x - 16x^2$.

2.5.19.5 $L(t, x, \dot x) = t\dot x + \dot x^2$.

2.5.19.6 $L(t, x, \dot x) = x^2 + \dot x^2 - 2x\sin t$.

2.5.19.7 $L(t, x, \dot x) = x^2 - \dot x^2 - 2x\sin t$.

2.5.19.8 $L(t, x, \dot x) = x^2 + \dot x^2 + 2xe^{t}$.

2.5.19.9 $L(t, x, \dot x) = t^2\dot x^2 + 2x^2 + 2tx$.

2.5.19.10 $L(x, y, \dot x, \dot y) = \dot x^2 + \dot y^2 + \dot x\dot y$.

2.5.19.11 $L(t, x, \dot x) = \dot x\left(1 + t^2\dot x\right)$.

2.5.19.12 $L(t, x, \dot x) = \sin(t\dot x)$.

2.5.19.13 $L(t, x, \dot x) = x^2 - \dot x^2 - 2x\cosh t$.

2.5.19.14 $L(x, \dot x) = \dfrac{1 + \dot x^2}{x}$.
2.5.20 Find the smooth extremals of the variational problems with Lagrangian function

$$L(x, \dot x) = \dot x^2x^2.$$

Compare the value of the fundamental integral along an extremal between two points with the
value along a straight line through the two points.

2.5.21 Find the smooth extremal of the variational problem with Lagrangian function

$$L(x, \dot x) = \frac{x\dot x^3}{1 + \dot x^2}.$$

This is Newton’s problem referred to in 2.4. (Hint: Obtain the first integral by using Lemma 2.4.1.
Let $dx/dt = p$, so that $dt = dx/p$. Find the solution in parametric form

$$t = c_1\left(p^{-2} + \frac34 p^{-4} + \ln|p|\right) + c_2,$$

$$x = c_1p^{-3}\left(1 + p^2\right)^2.$$

See [22].)

2.5.22 Find the extremal of class $C^2$, if it exists, which satisfies the given final conditions for
problems with Lagrangian functions as follows:

2.5.22.1 $L(t, x, \dot x) = x^2 + \dot x^2 + 2xe^{t}$, $x(0) = 0$, $x(1) = e$.

2.5.22.2 $L(t, x, \dot x) = \dot x^2 - 2x\sin t$, $x(0) = 0$, $x\left(\frac{\pi}{2}\right) = 1$.

2.5.22.3 $L(x, y, \dot x, \dot y) = \frac12\left(\dot x^2 + \dot y^2\right) + k\,(x\dot y - y\dot x) - \frac12 n^2\left(x^2 + y^2\right)$,
$x(0) = y(0) = 0$, $x(t_2) = x_2$, $y(t_2) = y_2$, $k > 0$, $n > 0$.

2.5.22.4 $L(x, y, \dot x, \dot y) = \dot x^2 + \dot y^2 + 2xy$, $x(0) = y(0) = 0$,
$x\left(\frac{\pi}{2}\right) = y\left(\frac{\pi}{2}\right) = 1$.

2.5.22.5 $L(x, \dot x) = e^{2x} + \dot x^2$, $x(0) = 1$, $x(t_2) = x_2$.

2.5.22.6 $L(t, x, \dot x) = \dfrac{1 + \dot x^2}{t}$, $x(1) = 0$, $x(2) = 1$.

2.5.22.7 $L(x, \dot x) = \sqrt{x\left(1 + \dot x^2\right)}$, $x(-1) = x(1) = b > 0$.
Distinguish between the cases $b < 1$, $b = 1$ and $b > 1$.

2.5.22.8 $L(x, y, \dot x, \dot y) = x^2 + 4y^2 + \dot x\dot y$,
$x(0) = 0$, $y(0) = 1$, $x\left(\frac{\pi}{4}\right) = 1$, $y\left(\frac{\pi}{4}\right) = 0$.

The following problems deal with Lagrangian functions with specific characteristics.

2.5.23 Show that if the Lagrangian function is only a function of $dx_i/dt$, then the extremals (if
they exist) are necessarily straight lines. What happens if L is also linear?
Now find the most general Lagrangian function of the form

\[ L(t, x, \dot{x}) = f(t, x)\,\dot{x}^2 + g(t, x)\,\dot{x}^3 + h(t, x)\,\dot{x}^4 \]

for which the extremals are straight lines.

2.5.24 Consider a problem with Lagrangian function L of class $C^2$ and such that

\[ \frac{\partial^2 L}{\partial \dot{x}^2} \equiv 0. \]

(a) What is the most general form of such a Lagrangian function?

(b) Show that the Euler-Lagrange equations are identically zero or that they define a map $t \to x(t)$
(which in general does not satisfy the boundary conditions).

(c) As an example investigate the case where

\[ L(t, x, \dot{x}) = x^2 - t^2\dot{x}. \]

You can find more problems in references [17] and [18].

2.5.25 Consider a problem in $\mathbb{R}^2$ with Lagrangian function L. Let $S = S(t, x)$ be any function
of class $C^3$. Denote the Euler-Lagrange operator by E, that is
\[ E(L) = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}}\right) - \frac{\partial L}{\partial x}. \tag{2.74} \]

(a) Prove that
\[ E\left(\frac{dS}{dt}\right) = 0. \tag{2.75} \]

(b) Show that the converse also applies, namely, if the Euler-Lagrange equations are identically
zero, then the Lagrangian function can be written as the total derivative of a function $S(t, x)$.

Hint: Consider the Lagrangian function in its explicit form (2.44).

2.5.26 Generalize the result of Exercise 2.5.25 to the problem in $\mathbb{R}^{n+1}$ with Lagrangian function
L. That is, prove that if $S(t, x_i)$ is a function of class $C^3$, then (2.75) applies to n equations. The
converse holds as well, but you do not need to prove it, since the proof is quite difficult.

2.5.27 Consider the variational problem with Lagrangian function

\[ L(t, x_i, \dot{x}_i) = \Lambda(t, x_i, \dot{x}_i) + A_j(x_i)\,\dot{x}_j, \]

where $\Lambda \in C^2$ is given and the $A_j$ are such that

\[ \frac{\partial A_i}{\partial x_j} - \frac{\partial A_j}{\partial x_i} = 0. \]

(a) Show that $E_i(L) \equiv E_i(\Lambda)$, where $E_i$ denotes the Euler-Lagrange operator defined in (2.74) for
$n = 1$.

(b) Explain why the term Aj (dxj /dt) makes no contribution to the Euler-Lagrange expression. Do
you have any comments, especially in the light of Exercises 2.5.24 to 2.5.26?

The results contained in Exercises 2.5.24 to 2.5.27 are of fundamental importance for an
important part of the variational calculus and its applications, for example in field theory.
They are vitally important in the multiple integral problem which we will briefly deal with
in Chapter 5. Here are a few more examples where you can use the concepts of the previous
exercises.

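2.5.28 Investigate the existence of an extremum for the following functionals.
Explain in your own words why none can be found.

2.5.28.1 $\displaystyle\int_{t_1}^{t_2} \left(x^2 + 2tx\dot{x}\right) dt$, $x(t_1) = x_1$, $x(t_2) = x_2$.

2.5.28.2 $\displaystyle\int_{t_1}^{t_2} \left(2x\dot{x} - \cos x - t\dot{x}\sin x\right) dt$, $x(t_1) = x_1$, $x(t_2) = x_2$.

2.5.28.3 $\displaystyle\int_{t_1}^{t_2} \left(\frac{1}{x} - \frac{t\dot{x}}{x^2}\right) dt$, $x(t_1) = x_1$, $x(t_2) = x_2$.

2.5.28.4 $\displaystyle\int_{0}^{1} \left(tx + x^2 - 2x^2\dot{x}\right) dt$, $x(0) = 1$, $x(1) = 2$.

2.5.28.5 $\displaystyle\int_{t_1}^{t_2} \left(x + t\dot{x}\right) dt$, $x(t_1) = x_1$, $x(t_2) = x_2$.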

The last few exercises are concerned with the theory dealt with in this chapter. The idea
is that you should try to answer the questions in your own words. Simply reproducing the
guide is taboo!

2.5.29 Give a precise analytical formulation of the simplest problem in the variational calculus
for single integrals in $\mathbb{R}^{n+1}$.

2.5.30 Consider a non-homogeneous variational problem in $\mathbb{R}^{n+1}$ with fundamental integral
(2.12). Derive the Euler-Lagrange equations as necessary conditions for curves to provide
extreme values for the fundamental integral.
State any assumptions that you make with reference to L and the curves.

2.5.31 State Lemma 2.4.1 and give a formal proof.



Chapter 3

THE LAGRANGE AND MAYER EQUATIONS

In the first chapter we examined the motion of conservative holonomic dynamical systems,
and more specifically we looked at dynamical systems for which the Lagrangian function is the
difference between the kinetic and potential energies. We shall now take a look at the motion of
non–conservative holonomic dynamical systems where the forces on the system cannot be derived
from an ordinary potential function.

Objectives
At the end of this chapter you will be able to:

• determine the potential energy of a non-conservative holonomic dynamical system and
construct a Lagrangian function, and

• obtain the equations of motion of non-conservative holonomic dynamical systems.

3.1 Non–conservative holonomic dynamical systems

We first seek to define a Lagrangian function L for non–conservative dynamical systems, so that the
Euler-Lagrange equations will be the equations of motion for such systems. The kinetic energy
should form part of the Lagrangian function, and in place of the potential energy (which is
related to the conservative forces) we must introduce a term W which is directly related to the
non–conservative forces.

We define the Lagrangian function for a non–conservative holonomic system by

L = L (t, qi , q̇i ) = T (t, qi , q̇i ) − W (t, qi , q̇i ) , (3.1)

where W is some kind of function of the work and qi are the generalized coordinates.

The case where the applied forces are derived from a scalar potential function V is the special
case where W = V.
In practice it is difficult to calculate the function W explicitly and we shall rather apply the Euler-
Lagrange equations to include the non-conservative applied forces in a direct manner. These
revised equations are known as the Lagrange equations of motion.

Figure 3.1

Consider a non–conservative holonomic dynamical system with n degrees of freedom represented
by the generalized coordinates $q_i$, $i = 1, 2, \ldots, n$. Suppose that s forces $\mathbf{F}_1, \mathbf{F}_2, \ldots, \mathbf{F}_s$
are applied to the dynamical system at the points with position vectors $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_s$ respectively.
Two cases arise:

(a) For a dynamical system consisting of a set of particles, the position vectors r1 , r2 , . . . , rs
simply refer to the position vectors of the particles themselves (see Figure 3.1).

(b) For a rigid body, the position vectors refer to the points of application on the surface of the
body.

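Now let (3.1) be the Lagrangian function of the system, where the work function is related in some
way to the applied forces $\mathbf{F}_1, \mathbf{F}_2, \ldots, \mathbf{F}_s$. The Euler-Lagrange equations can then be written in
the form
\[ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_i}\right) - \frac{\partial L}{\partial q_i}
 = \left[\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{q}_i}\right) - \frac{\partial T}{\partial q_i}\right]
 - \left[\frac{d}{dt}\left(\frac{\partial W}{\partial \dot{q}_i}\right) - \frac{\partial W}{\partial q_i}\right] = 0, \]
or
\[ \frac{d}{dt}\left(\frac{\partial T}{\partial \dot{q}_i}\right) - \frac{\partial T}{\partial q_i} = Q_i
 = \frac{d}{dt}\left(\frac{\partial W}{\partial \dot{q}_i}\right) - \frac{\partial W}{\partial q_i}, \tag{3.2} \]
where the n functions $Q_i$ are known as the generalized forces which correspond to the n generalized
coordinates $q_i$. Note that $Q_i$ is a function of t, $q_i$ and $dq_i/dt$ for each i. The
name generalized forces comes from the fact that the $Q_i$ have the dimension of force when the $q_i$ are
expressed in units of distance.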

Generalized forces play an important role in the dynamics of non-conservative systems, and
are related to the s applied forces $\mathbf{F}_1, \mathbf{F}_2, \ldots, \mathbf{F}_s$ and their points of application $\mathbf{r}_1, \mathbf{r}_2, \ldots, \mathbf{r}_s$
as follows:
\[ Q_i = \sum_{k=1}^{s} \mathbf{F}_k \cdot \frac{\partial \mathbf{r}_k}{\partial q_i}. \tag{3.3} \]

The proof of the connection (3.3) depends on the concept of virtual work, which is outside
the scope of this module. The theory of virtual work was formulated in 1717 by Johann
Bernoulli and is mainly applied in statics [23].

The theory in itself is not difficult to grasp, and the proof of (3.3) is based on the fact that the virtual
work $Q_i\,\delta q_i$ performed by the generalized forces $Q_i$ over imaginary or virtual displacements $\delta q_i$ is
equal to the virtual work
\[ \sum_{k=1}^{s} \mathbf{F}_k \cdot \delta\mathbf{r}_k = \sum_{k=1}^{s} \mathbf{F}_k \cdot \frac{\partial \mathbf{r}_k}{\partial q_i}\,\delta q_i \]
performed by the applied forces $\mathbf{F}_1, \mathbf{F}_2, \ldots, \mathbf{F}_s$ during imaginary displacements $\delta\mathbf{r}_1, \delta\mathbf{r}_2, \ldots, \delta\mathbf{r}_s$
from their points of application. The theory of virtual work is discussed in many textbooks; see,
for example, [23].

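We now assume the validity of (3.3) and we write the equations of motion (3.2) of the system
in the following form:
\[ \frac{d}{dt}\left(\frac{\partial T}{\partial \dot{q}_i}\right) - \frac{\partial T}{\partial q_i} = Q_i
 = \sum_{k=1}^{s} \mathbf{F}_k \cdot \frac{\partial \mathbf{r}_k}{\partial q_i}. \tag{3.4} \]

These equations are known as the Lagrange equations of motion. They are nothing
other than the Euler-Lagrange equations with a work function W which satisfies
\[ \frac{d}{dt}\left(\frac{\partial W}{\partial \dot{q}_i}\right) - \frac{\partial W}{\partial q_i} = Q_i
 = \sum_{k=1}^{s} \mathbf{F}_k \cdot \frac{\partial \mathbf{r}_k}{\partial q_i}. \tag{3.5} \]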

In practice we apply these Lagrange equations (3.4) directly since it is always easier to determine the
generalized forces Qi . On the other hand the application of the Euler-Lagrange equations requires
the construction of a Lagrangian function, in which the work function part W is defined by the n
equations (3.5). In the variational calculus we generalize this situation so that we can introduce the
so-called Lagrange multipliers in order to construct the Lagrangian function. More about this later.

Naturally we can also apply the Lagrange equations of motion to systems where some or all
of the applied forces are conservative. There is nothing in the derivation to prevent us from doing
so. Let us have a look at an important special case, namely one where all the applied forces on
the system can be derived from a generalized potential function V which is only dependent on the
generalized coordinates qi . That is

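\[ \mathbf{F}_k = -\nabla_k V, \qquad k = 1, 2, \ldots, s. \]

Here $\nabla_k$ denotes that the gradient must be calculated at the point $\mathbf{r}_k$. The generalized forces
are given by
\[ Q_i = -\sum_{k=1}^{s} \nabla_k V \cdot \frac{\partial \mathbf{r}_k}{\partial q_i} = -\frac{\partial V}{\partial q_i}. \]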

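The work function W follows from (3.5), namely
\[ \frac{d}{dt}\left(\frac{\partial W}{\partial \dot{q}_i}\right) - \frac{\partial W}{\partial q_i} = Q_i = -\frac{\partial V}{\partial q_i}. \]

Since V is a function of $q_i$ only, it follows directly that W = V, as we expected.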

Not all the forces applied to the dynamical system lead to the generalized forces Qi . For
example, constraining forces that are directly associated to each mathematical constraint do not
contribute to the generalized forces on the system. This is an interesting point since the forces of
constraint can be removed from the formulation in the same way that the equations of constraint
can be eliminated by the choice of the n generalised coordinates. We give a simple example.

Example 3.1.1

Consider the downward motion of a particle on a smooth inclined plane which makes an angle α
with the horizontal.
Solution
This particle has only one degree of freedom which is represented by x, since the other degree of
freedom is excluded by the constraining equation y = 0.

Figure 3.2

The constraining force which physically restricts the motion of the particle to the inclined plane is
the reaction force R, which is perpendicular to the plane y = 0 (the equation of constraint). The
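generalized force $Q_x$ is given by

\[ Q_x = \left(\mathbf{R} + m\mathbf{g}\right) \cdot \frac{\partial \mathbf{r}}{\partial x}
 = \left(\mathbf{R} + m\mathbf{g}\right) \cdot \hat{\mathbf{x}}
 = m\mathbf{g} \cdot \hat{\mathbf{x}} = mg\sin\alpha. \]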

Thus the constraining force R contributes nothing to Qx . Other forces which also have no effect,
include pressure of connecting rods of constant length, reactions at fixed points of rotation, ten-
sions in rigid inextensible strings (ropes), friction forces which constrain rigid bodies to roll without
slipping, etc.
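
As a brief illustration of how (3.4) is used once the generalized force is known, the equation of motion for Example 3.1.1 can be sketched as follows. The kinetic energy $T = \tfrac{1}{2}m\dot{x}^2$ is an assumption made here (x measured down the plane, particle of mass m); it is not written out in the example itself.

```latex
% Sketch for Example 3.1.1. Assumptions: x is measured down the incline,
% the particle has mass m, so that T = (1/2) m \dot{x}^2.
\begin{align*}
T &= \tfrac{1}{2} m \dot{x}^2, \qquad Q_x = mg\sin\alpha, \\
\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{x}}\right) - \frac{\partial T}{\partial x}
  &= m\ddot{x} = mg\sin\alpha
  \quad\Longrightarrow\quad \ddot{x} = g\sin\alpha, \\
W &= -mgx\sin\alpha
  \quad\text{satisfies (3.5), since}\quad
  \frac{d}{dt}\left(\frac{\partial W}{\partial \dot{x}}\right) - \frac{\partial W}{\partial x}
  = mg\sin\alpha = Q_x .
\end{align*}
```

The reaction force R never enters, which is the point of the example: only the component of gravity along the plane survives in $Q_x$.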

Here are a few examples of the use of equations (3.4) and (3.5) in mechanics. The equations
of motion are somewhat complicated and we are not going to attempt to solve them. Once again
lots of steps are left out for you to fill in.

Example 3.1.2

Consider a simple plane pendulum of length λ and mass m with a point of suspension which
oscillates according to the equation X = A sin ωt. Determine the Lagrangian function and the
equations of motion of the pendulum.

Figure 3.3

Solution
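Let the mass at the end of the pendulum be our dynamical system, with $\mathbf{F}$ the tension in the string
and $m\mathbf{g}$ the gravitational force. These are the only forces acting on the mass at the point $\mathbf{r}$. We can
express $\mathbf{r}$ in terms of the unit vectors, namely

\[ \mathbf{r} = X\hat{\mathbf{x}} + \lambda\hat{\boldsymbol{\lambda}}. \]

The angle θ can be taken to be the only generalized coordinate (why?). The generalized force $Q_\theta$
which corresponds to θ is given by
\[ Q_\theta = \left(\mathbf{F} + m\mathbf{g}\right) \cdot \frac{\partial \mathbf{r}}{\partial \theta}
 = \left(\mathbf{F} + m\mathbf{g}\right) \cdot \lambda\frac{\partial \hat{\boldsymbol{\lambda}}}{\partial \theta}
 = \left(\mathbf{F} + m\mathbf{g}\right) \cdot \lambda\hat{\boldsymbol{\theta}} = -mg\lambda\sin\theta. \]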

According to Example 1.2.6 the kinetic energy of the system is given by

\[ T(\theta, \dot{\theta}, t) = \tfrac{1}{2}m\left[A^2\omega^2\cos^2\omega t + \lambda^2\dot{\theta}^2 + 2A\omega\lambda\dot{\theta}\cos\theta\cos\omega t\right]. \tag{3.6} \]

The Lagrange equations of motion (3.4) become

\[ \lambda\ddot{\theta} - A\omega^2\sin\omega t\cos\theta = -g\sin\theta. \]

The work function W is then defined by (3.5), that is

\[ W = -mg\lambda\cos\theta. \tag{3.7} \]

The Lagrangian function of the particle is then T − W, where T is defined by (3.6) and W by
(3.7).
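
The steps omitted between (3.6) and the equation of motion of Example 3.1.2 can be filled in roughly as follows. This is only a sketch of the differentiation, using T from (3.6) and $Q_\theta = -mg\lambda\sin\theta$ as given in the example; no new quantities are introduced.

```latex
% Intermediate steps for Example 3.1.2, starting from T in (3.6).
\begin{align*}
\frac{\partial T}{\partial \dot{\theta}}
  &= m\left[\lambda^2\dot{\theta} + A\omega\lambda\cos\theta\cos\omega t\right], \\
\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{\theta}}\right)
  &= m\left[\lambda^2\ddot{\theta} - A\omega\lambda\dot{\theta}\sin\theta\cos\omega t
     - A\omega^2\lambda\cos\theta\sin\omega t\right], \\
\frac{\partial T}{\partial \theta}
  &= -mA\omega\lambda\dot{\theta}\sin\theta\cos\omega t, \\
\frac{d}{dt}\left(\frac{\partial T}{\partial \dot{\theta}}\right) - \frac{\partial T}{\partial \theta}
  &= m\lambda\left[\lambda\ddot{\theta} - A\omega^2\sin\omega t\cos\theta\right]
   = Q_\theta = -mg\lambda\sin\theta .
\end{align*}
% Check of (3.7): W = -mg\lambda\cos\theta does not depend on \dot{\theta}, so
% d/dt(\partial W/\partial\dot{\theta}) - \partial W/\partial\theta = -mg\lambda\sin\theta = Q_\theta.
```

Dividing the last line by $m\lambda$ gives the equation of motion stated in the example.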

Example 3.1.3

Two masses m and M hang from the two ends of an inextensible rope which passes over a smooth
pulley. If the mass M is replaced by a live ape (of mass M ) that climbs up the rope with a velocity
v (t) relative to the rope, find the equations of motion of the ape and the Lagrangian function of
the ape and the mass m.

Figure 3.4

Solution
Let x be the distance of the ape below the axis of the pulley. It is obvious that this is the only
generalized coordinate. The distance of m below the axis is $\ell - \pi a - x$, where $\ell$ is the free length of
the rope at a time t. Note that $d\ell/dt = -v(t)$. The kinetic energy of the two masses is given by
\[ T = \tfrac{1}{2}M\dot{x}^2 + \tfrac{1}{2}m\left(\dot{\ell} - \dot{x}\right)^2
 = \tfrac{1}{2}(M + m)\dot{x}^2 + mv\dot{x} + \tfrac{1}{2}mv^2. \]
The generalized force corresponding to x is given by

\[ Q_x = (M - m)g. \]

From (3.4) we get the equation of motion

\[ \frac{d}{dt}\left[(M + m)\dot{x} + mv\right] = (M - m)g. \]
The Lagrangian function of the system is then
\[ L = \tfrac{1}{2}(M + m)\dot{x}^2 + mv\dot{x} + \tfrac{1}{2}mv^2 + (M - m)gx. \]
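
For Example 3.1.3 it may help to see the equation of motion written out and to confirm that the stated Lagrangian function reproduces it. The sketch below uses only the quantities of the example; the special case of constant v is added here purely as an illustration.

```latex
% Example 3.1.3: expanding the equation of motion and checking L.
\begin{align*}
\frac{d}{dt}\left[(M+m)\dot{x} + mv\right] = (M-m)g
  &\quad\Longrightarrow\quad (M+m)\ddot{x} + m\dot{v} = (M-m)g, \\
\frac{\partial L}{\partial \dot{x}} = (M+m)\dot{x} + mv, \qquad
\frac{\partial L}{\partial x} = (M-m)g
  &\quad\Longrightarrow\quad
  \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}}\right) - \frac{\partial L}{\partial x} = 0
  \ \text{ is the same equation.} \\
\text{Illustration (assumed): } v = \text{const}
  &\quad\Longrightarrow\quad \ddot{x} = \frac{M-m}{M+m}\,g .
\end{align*}
```

With constant climbing speed the acceleration is that of two plain masses on a pulley; only a changing v(t) alters the motion.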
Example 3.1.4

A vertical ring of mass M and radius a rotates freely about a horizontal axis through its centre of
mass. A beetle of mass m crawls along the ring at constant tangential acceleration f relative to the
ring. Determine the equations of motion of the system and the Lagrangian function.

Figure 3.5

Solution
Suppose that the ring rotates through an angle φ and the beetle rotates through a total angle θ.
The beetle has crawled from B to A through an angle of (θ − φ), where
\[ f = a\left(\ddot{\theta} - \ddot{\phi}\right), \]
or
\[ ft = a\left(\dot{\theta} - \dot{\phi}\right) \quad\text{with}\quad \dot{\theta}(0) = \dot{\phi}(0). \]

With θ as a generalized coordinate, the kinetic energy of the system is given by

\[ T = \tfrac{1}{2}ma^2\dot{\theta}^2 + \tfrac{1}{2}Ma^2\dot{\phi}^2
 = \tfrac{1}{2}(M + m)a^2\dot{\theta}^2 - Maft\,\dot{\theta} + \tfrac{1}{2}Mf^2t^2. \]

The generalized force which corresponds with θ is given by

\[ Q_\theta = m\mathbf{g} \cdot \frac{\partial \mathbf{r}}{\partial \theta} = -mga\sin\theta, \]

and the equation of motion follows from (3.4), namely

\[ (M + m)a\ddot{\theta} - Mf = -mg\sin\theta. \]

If we assume that θ is very small, then we can solve the equation of motion analytically. This
does not really make sense, since then the beetle is sure to be larger than the arc along which
it crawled! The Lagrangian function is given by

\[ L = \tfrac{1}{2}(M + m)a^2\dot{\theta}^2 - Maft\,\dot{\theta} + \tfrac{1}{2}Mf^2t^2 + mga\cos\theta. \]
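
For completeness, the small-angle solution mentioned in Example 3.1.4 can be sketched as follows. The symbols $\Omega$, $C$ and $D$ are introduced here only for illustration and do not appear in the guide.

```latex
% Small-angle approximation of (M+m) a \ddot{\theta} - M f = -m g \sin\theta.
% Assumed notation: \Omega (angular frequency), C and D (integration constants).
\begin{align*}
\sin\theta \approx \theta
  &\quad\Longrightarrow\quad (M+m)a\,\ddot{\theta} + mg\,\theta = Mf, \\
\theta(t) &= \frac{Mf}{mg} + C\cos\Omega t + D\sin\Omega t,
  \qquad \Omega^2 = \frac{mg}{(M+m)a}.
\end{align*}
```

The constant term $Mf/(mg)$ is the equilibrium angle about which the beetle would oscillate; the approximation is only self-consistent when this angle is itself small.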

3.2 The problem of Lagrange from another perspective

In this section we shall re-examine some of the ideas that you have already encountered, now from the
viewpoint of the variational calculus. Briefly, in the previous section we derived the conditions
(the Lagrange equations) that a function (extremal) must satisfy to yield a stationary, minimum or
maximum value for the time-integral of the kinetic energy (T). The forces acting on the system
were not necessarily conservative. This problem is a special case of a more general problem,
namely to find the function which yields an extreme value of a functional while satisfying
additional auxiliary conditions, and to determine the necessary conditions. The external forces
in the previous section determine the auxiliary conditions.

We shall devote the rest of this chapter to finding those members of a class of functions,
all of which satisfy certain auxiliary conditions in the form of a set of given differential equations,
which provide an extreme value in this class for the fundamental integral. Such problems, as can
be expected, have many applications, especially of a practical nature (for example as elegantly and
clearly set out in Chapter 30 of [31]).

These problems give rise to unexpected difficulties which we shall shortly deal with. But
first we need to formulate the problem as completely as possible at this stage.

The Problem of Lagrange: In the class of curves which satisfy the p (< n) auxiliary
conditions
\[ G_\mu(t, x_i, \dot{x}_i) = 0, \qquad \mu = 1, 2, \ldots, p \quad (p < n), \tag{3.8} \]
we seek a curve which joins two fixed points $P_1$ and $P_2$ in $\mathbb{R}^{n+1}$ and yields an extreme value for the
integral
\[ I = \int_{P_1}^{P_2} L(t, x_i, \dot{x}_i)\,dt. \tag{3.9} \]

It is assumed that the functions $G_\mu$, L are of class $C^2$ and that

\[ \operatorname{rank}\left(\frac{\partial G_\mu}{\partial \dot{x}_i}\right) = p. \]

Consider the following example for the case where n = 2 and p = 1.

Example 3.2.1

Consider an auxiliary condition in the form of the following equation

\[ G(\dot{x}_1, \dot{x}_2) = \dot{x}_2 - \sqrt{1 + \dot{x}_1^2} = 0. \tag{3.10} \]

Along a curve C passing through a fixed point P in $\mathbb{R}^3$ the condition (3.10) determines $dx_2/dt$ once
$dx_1/dt$ is given. Let us examine the situation more closely. Choose any point $P_1(t_1, x_{1,1}, x_{2,1})$ and
any curve C which passes through $P_1$ and satisfies (3.10). If $P_2(t_2, x_{1,2}, x_{2,2})$ is any other point on
C, denote the projection of C, $P_1$, $P_2$ onto the $(t, x_1)$ plane by $C'$, $P_1'$, $P_2'$ respectively (see Figure
3.6).

Figure 3.6

According to the formula for the arclength of a curve, the length $\ell$ of the projection $C'$ between $P_1'$ and
$P_2'$ is given by
\[ \ell = \underset{C'}{\int_{P_1'}^{P_2'}} \sqrt{1 + (\dot{x}_1)^2}\,dt, \tag{3.11} \]

where $dx_1/dt$ refers to $C'$ (and naturally is the same as $dx_1/dt$ along C!). But the curve C (actually
the functions $x_i$ which define C) satisfies the auxiliary condition (3.10), so that (3.11)
reduces to
\[ \ell = \underset{C}{\int_{P_1}^{P_2}} \dot{x}_2\,dt = \int_{x_{2,1}}^{x_{2,2}} dx_2 = x_{2,2} - x_{2,1}. \tag{3.12} \]

This means that the length of the projection of C on the (t, x1 ) plane is always equal to the differ-
ence between the x2 coordinates of the endpoints of C.

Let $P_2$ lie on a straight line through $P_1$ which makes an angle of π/4 radians with the $x_2$-axis.
Any such line satisfies (3.10) (check this for yourself). Its projection on the $(t, x_1)$ plane is the
straight line which joins $P_1'$ to $P_2'$, and $\ell$ is the length of the line segment. We can thus deduce from
(3.12) that
\[ x_{2,2} - x_{2,1} = \ell = |P_1'P_2'|. \]

Since the difference $x_{2,2} - x_{2,1}$ is constant for fixed points $P_1$ and $P_2$, the length of the projection of
any curve joining $P_1$ and $P_2$ and satisfying the auxiliary equation (3.10) must be equal to $|P_1'P_2'|$.
The latter is in addition the minimum distance between $P_1'$ and $P_2'$, and thus $C'$ must always be
a straight line for this specific choice of $P_1$ and $P_2$. Thus $dx_1/dt$ is a constant and we can deduce
directly from (3.10) that $dx_2/dt$ is also a constant. Hence the curve C can only be a straight
line, namely the one joining $P_1$ and $P_2$ making an angle of π/4 radians with the $x_2$-axis. The
class of admissible comparison curves which satisfy the auxiliary equation in this case consists of a
single unique curve, so that the whole question (as a variational calculus problem) is completely
meaningless. Here there can be no mention of “the variation of curves”!
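
The claim in Example 3.2.1 that a straight line at π/4 to the $x_2$-axis satisfies (3.10) can be checked along the following lines. The direction vector $(1, \alpha, \beta)$ and the angle $\gamma$ are notation assumed here for the check; they are not used in the guide.

```latex
% Check for Example 3.2.1. Assumed notation: a straight line in (t, x_1, x_2)-space
% with direction vector (1, \alpha, \beta), so \dot{x}_1 = \alpha and \dot{x}_2 = \beta;
% \gamma is the angle between the line and the x_2-axis.
\begin{align*}
\cos\gamma &= \frac{\beta}{\sqrt{1 + \alpha^2 + \beta^2}}, \\
\gamma = \frac{\pi}{4},\ \beta > 0
  &\quad\Longrightarrow\quad \frac{\beta^2}{1 + \alpha^2 + \beta^2} = \frac{1}{2}
  \quad\Longrightarrow\quad \beta^2 = 1 + \alpha^2
  \quad\Longrightarrow\quad \dot{x}_2 - \sqrt{1 + \dot{x}_1^2} = 0 .
\end{align*}
```

The condition $\beta > 0$ simply says that $x_2$ increases with t along the line, which (3.10) requires in any case.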

There are various other difficulties which can appear apart from the one we have described
here (see [28]). It is sufficient to say that these difficulties do not appear if the problem of Lagrange
belongs to the so-called “null class” and we shall restrict ourselves to such problems. In this case
the curves can indeed be varied and the different functions can be defined unambiguously.

Euler–Lagrange equations

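The derivation of the Euler-Lagrange equations for the null class follows precisely the same pattern
as in Section 2.4, if we introduce the following definition of an extended Lagrangian function, namely
\[ \Lambda(t, x_i, \dot{x}_i, \lambda_\mu) = L(t, x_i, \dot{x}_i) + \sum_{\mu=1}^{p} \lambda_\mu G_\mu(t, x_i, \dot{x}_i), \quad (\text{where } p < n), \tag{3.13} \]

where the $\lambda_\mu$ are the so-called Lagrange multipliers, which do not necessarily have to be constants
but may be dependent on t. Substituting (3.13) in the Euler-Lagrange equations (2.43) we get
\[ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}_i}\right) - \frac{\partial L}{\partial x_i}
 + \sum_{\mu=1}^{p} \left\{ \lambda_\mu\left[\frac{d}{dt}\left(\frac{\partial G_\mu}{\partial \dot{x}_i}\right) - \frac{\partial G_\mu}{\partial x_i}\right]
 + \frac{\partial G_\mu}{\partial \dot{x}_i}\frac{d\lambda_\mu}{dt} \right\} = 0, \quad \text{for } i = 1, 2, \cdots, n. \tag{3.14} \]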
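
The substitution that leads from (3.13) to (3.14) is just the product rule applied to $\lambda_\mu\,\partial G_\mu/\partial \dot{x}_i$. A sketch, assuming (as the text states) that the multipliers depend on t only:

```latex
% From (3.13) to (3.14): the multipliers \lambda_\mu depend on t only.
\begin{align*}
\frac{\partial \Lambda}{\partial \dot{x}_i}
  &= \frac{\partial L}{\partial \dot{x}_i} + \sum_{\mu=1}^{p} \lambda_\mu \frac{\partial G_\mu}{\partial \dot{x}_i},
\qquad
\frac{\partial \Lambda}{\partial x_i}
   = \frac{\partial L}{\partial x_i} + \sum_{\mu=1}^{p} \lambda_\mu \frac{\partial G_\mu}{\partial x_i}, \\
\frac{d}{dt}\left(\frac{\partial \Lambda}{\partial \dot{x}_i}\right)
  &= \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}_i}\right)
   + \sum_{\mu=1}^{p} \left[ \frac{d\lambda_\mu}{dt}\frac{\partial G_\mu}{\partial \dot{x}_i}
   + \lambda_\mu \frac{d}{dt}\left(\frac{\partial G_\mu}{\partial \dot{x}_i}\right) \right].
\end{align*}
% Subtracting \partial\Lambda/\partial x_i from the last line gives (3.14).
```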

To overcome the difficulties discussed in example 3.2.1 above, we can use other ways to derive the
Euler-Lagrange equations as necessary conditions. The basic starting point is still the extended
Lagrangian function (3.13), but the method used to derive (3.14) is somewhat different. Two
methods that work well in general are firstly the method of the complete figure described in [28],
and the other is the method of equivalent integrals (see also [29]).

We shall only discuss one example of how to use equations (3.13) and (3.14) to find an
extremal for the problem of Lagrange. Once again several intermediate steps have been omitted
which you should please fill in.

Example 3.2.2

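Obtain the extremals (if they exist) of the problem of Lagrange with the functional
\[ I = \int_0^1 (\dot{x}_2)^2\,dt, \]

and auxiliary conditions

\[ \dot{x}_1 + \dot{x}_2 - \dot{x}_3 = 0, \qquad \dot{x}_1 + 2\dot{x}_3 = 0, \tag{3.15} \]
and boundary conditions
\[ x_1(0) = 0, \quad x_2(0) = 0, \quad x_3(0) = 0, \]
\[ x_1(1) = -2, \quad x_2(1) = 3, \quad x_3(1) = 1. \]
Solution
The extended Lagrangian function is given by

\[ \Lambda(t, x_1, x_2, x_3, \dot{x}_1, \dot{x}_2, \dot{x}_3, \lambda_1, \lambda_2) = (\dot{x}_2)^2 + \lambda_1\left[\dot{x}_1 + \dot{x}_2 - \dot{x}_3\right] + \lambda_2\left[\dot{x}_1 + 2\dot{x}_3\right], \]

where $G_1(t, x_1, x_2, x_3, \dot{x}_1, \dot{x}_2, \dot{x}_3) = \dot{x}_1 + \dot{x}_2 - \dot{x}_3$ and $G_2(t, x_1, x_2, x_3, \dot{x}_1, \dot{x}_2, \dot{x}_3) = \dot{x}_1 + 2\dot{x}_3$.
So i = 1, 2, 3 and µ = 1, 2. The Euler-Lagrange equations (3.14) are
\[ \frac{d}{dt}\left[\lambda_1 + \lambda_2\right] = 0, \tag{3.16} \]
\[ \frac{d}{dt}\left[\lambda_1 + 2\dot{x}_2\right] = 0, \tag{3.17} \]
\[ \frac{d}{dt}\left[-\lambda_1 + 2\lambda_2\right] = 0. \tag{3.18} \]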
From (3.16) and (3.18) it follows that

\[ \dot{\lambda}_1 = 0, \qquad \dot{\lambda}_2 = 0, \]

so that $\lambda_1$ and $\lambda_2$ are constants. From equation (3.17) it follows that
\[ \lambda_1 + 2\dot{x}_2 = d \quad\Longrightarrow\quad \dot{x}_2 = \tfrac{1}{2}\left[d - \lambda_1\right], \]
with integration constant d. Integrate again to get

\[ x_2 = \tfrac{1}{2}\left[d - \lambda_1\right]t + a, \]
and from the boundary conditions we then get that

\[ x_2 = 3t. \]

Substitute for $x_2$ in the first auxiliary condition and eliminate $\dot{x}_1$ by means of the second, to get

\[ 3\dot{x}_3 = 3. \]

After integration and by using the boundary conditions we get

\[ x_3 = t. \]

From the auxiliary and boundary conditions it now follows that

\[ x_1 = -2t. \]

The extremal is now given by

x1 = −2t, x2 = 3t, x3 = t,

and is a parametric representation (parameter t) of a curve in R3 .
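
The result of Example 3.2.2 can be sanity-checked numerically. The sketch below is not part of the guide: it uses only the Python standard library, and the helper names (`derivative`, `constraint_residuals`, `functional`, `perturbed_x2`) are hypothetical. It relies on the observation that integrating (3.15) with the zero initial values gives $x_1 + x_2 - x_3 = 0$ and $x_1 + 2x_3 = 0$, so any $x_2$ with $x_2(0) = 0$, $x_2(1) = 3$ fixes an admissible curve through $x_3 = x_2/3$, $x_1 = -2x_2/3$; the comparison curve chosen here is one arbitrary example.

```python
import math

# Candidate extremal from Example 3.2.2.
def x1(t):
    return -2.0 * t

def x2(t):
    return 3.0 * t

def x3(t):
    return t

def derivative(f, t, h=1e-6):
    """Central-difference approximation of df/dt (numerical, not exact)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

def constraint_residuals(t):
    """Left-hand sides of the two auxiliary conditions (3.15) at time t."""
    d1, d2, d3 = derivative(x1, t), derivative(x2, t), derivative(x3, t)
    return d1 + d2 - d3, d1 + 2.0 * d3

def functional(x2_func, n=2000):
    """Midpoint-rule approximation of I, the integral over [0, 1] of (dx2/dt)^2."""
    h = 1.0 / n
    return sum(derivative(x2_func, (k + 0.5) * h) ** 2 for k in range(n)) * h

def perturbed_x2(eps):
    """Hypothetical comparison curve x2 = 3t + eps*sin(pi*t), same endpoints."""
    return lambda t: 3.0 * t + eps * math.sin(math.pi * t)

if __name__ == "__main__":
    worst = max(abs(r) for t in (0.1, 0.5, 0.9) for r in constraint_residuals(t))
    print("largest constraint residual:", worst)
    print("I along the extremal:", functional(x2))
    print("I along a perturbed admissible curve:", functional(perturbed_x2(0.5)))
```

The perturbed curve gives a larger value of I than the straight line, which is consistent with (though of course no proof of) the extremal being a minimum within the admissible class.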

3.3 The isoperimetric problem and the Mayer equations

In 2.2.3 we briefly described the isoperimetric problem using a legendary example. The auxiliary
conditions differ from those used in the preceding section. In the case of the isoperimetric problem
the auxiliary conditions are integrals while in the Lagrange problem they are differential equations.
We will convert the isoperimetric problem to a problem of Lagrange so that the Euler-Lagrange
equations (3.14) apply.

Let $L(t, x_i, dx_i/dt)$ be a given class $C^2$ function and consider p further class $C^2$ functions
$L_\mu(t, x_i, dx_i/dt)$ and also constants $\ell_\mu$, µ = 1, 2, . . . , p. If $P_1(t_1, x_{i,1})$ and $P_2(t_2, x_{i,2})$ are two fixed
points in $\mathbb{R}^{n+1}$, let Ω denote the class of all smooth curves C joining $P_1$ and $P_2$ for which
\[ \underset{C}{\int_{P_1}^{P_2}} L_\mu(t, x_i, \dot{x}_i)\,dt = \ell_\mu. \tag{3.19} \]

We look for that member Γ of Ω (if it exists, and where we assume that Ω is not empty) which
yields an extreme value for the fundamental integral
\[ I[C] = \underset{C}{\int_{P_1}^{P_2}} L(t, x_i, \dot{x}_i)\,dt \tag{3.20} \]

in Ω. Note that here we do not assume that p < n. Just as in the case of the problem of Lagrange,
we can question the existence of the class Ω of curves which satisfy (3.19), and its related properties.
The entire meaning of the problem of finding extreme values for (3.20) depends on it and this
question is discussed in an example in the following section.

We will now reduce this problem to one of Lagrange. We shall use the following notation.
The coordinates $(t, x_i, x_\mu)$ in $\mathbb{R}^{n+p+1}$ are denoted by $(t, x_A) \in \mathbb{R}^{N+1}$, where N = n + p, $(t, x_i) \in \mathbb{R}^{n+1}$,
while the $x_\mu$ are new coordinates. In what follows the summation convention will also apply to the
indices A and B. We restrict our attention to a region of $\mathbb{R}^{N+1}$ where the solutions of the system
of differential equations

\[ G_\mu = \dot{x}_\mu - L_\mu(t, x_i, \dot{x}_i) = 0, \qquad \mu = 1, 2, \cdots, p, \tag{3.21} \]

cover an (N + 1)–dimensional neighborhood of each point. Unfortunately we can’t go too deeply
into the meaning of this assumption since this will make the guide too long.

Let us show that (3.21) is actually a system of equations defining auxiliary conditions for an
equivalent problem of Lagrange. Consider the given points $P_1$ and $P_2$ in $\mathbb{R}^{n+1}$ and choose the corresponding
points in $\mathbb{R}^{N+1}$, namely $P_1^*(t_1, x_{A,1})$ and $P_2^*(t_2, x_{A,2})$, where

\[ x_{\mu,2} - x_{\mu,1} = \ell_\mu. \tag{3.22} \]

This is always possible, but not unique. Suppose that $C^*$ is a curve in $\mathbb{R}^{N+1}$ joining $P_1^*$ and $P_2^*$
and which satisfies the auxiliary conditions as expressed by equation (3.21). It is obvious that
\[ \underset{C^*}{\int_{P_1^*}^{P_2^*}} L_\mu(t, x_i, \dot{x}_i)\,dt = \underset{C^*}{\int_{t_1}^{t_2}} \dot{x}_\mu\,dt = x_{\mu,2} - x_{\mu,1} = \ell_\mu, \]

where we used (3.22) in the last step. This leads to the important result that if a curve $C^*$ in $\mathbb{R}^{N+1}$
joining $P_1^*$ and $P_2^*$ is such that (3.21) is satisfied, then the integral conditions (3.19) are also satisfied
along $C^*$ between $P_1^*$ and $P_2^*$. Since the functions $L_\mu$ do not depend on $x_\mu$ and $dx_\mu/dt$, it is easy to see
that
\[ \underset{C^*}{\int_{P_1^*}^{P_2^*}} L_\mu(t, x_i, \dot{x}_i)\,dt = \underset{C}{\int_{P_1}^{P_2}} L_\mu(t, x_i, \dot{x}_i)\,dt, \]

where the curve C is the projection of $C^*$ on $\mathbb{R}^{n+1}$ (and is naturally the one that we started with
in (3.19) and (3.20); we have deliberately chosen $C^*$ so as to satisfy (3.21)!). Thus if we solve the
problem of Lagrange for (3.20) subject to (3.21) in $\mathbb{R}^{N+1}$, then we have also found the solution to
the isoperimetric problem (3.19) and (3.20) by projecting the previous solution onto $\mathbb{R}^{n+1}$.

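Let us now derive the form of the Euler-Lagrange equations in $\mathbb{R}^{N+1}$ of this problem of Lagrange
in the enlarged space, and see how their projections on $\mathbb{R}^{n+1}$ appear as necessary conditions
for the isoperimetric problem. In $\mathbb{R}^{N+1}$ the Euler-Lagrange equations (3.14) are written as
\[ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}_A}\right) - \frac{\partial L}{\partial x_A}
 + \sum_{\mu=1}^{p} \left\{ \lambda_\mu\left[\frac{d}{dt}\left(\frac{\partial G_\mu}{\partial \dot{x}_A}\right) - \frac{\partial G_\mu}{\partial x_A}\right]
 + \frac{\partial G_\mu}{\partial \dot{x}_A}\frac{d\lambda_\mu}{dt} \right\} = 0, \tag{3.23} \]

which are n + p equations. Let us consider the first n of these, that is, we project (3.23) on $\mathbb{R}^{n+1}$.
From (3.21) we get
\[ \frac{\partial G_\mu}{\partial x_i} = -\frac{\partial L_\mu}{\partial x_i}, \qquad \frac{\partial G_\mu}{\partial \dot{x}_i} = -\frac{\partial L_\mu}{\partial \dot{x}_i}, \]
and hence (3.23) becomes (for i = 1, 2, . . . , n)
\[ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}_i}\right) - \frac{\partial L}{\partial x_i}
 - \sum_{\mu=1}^{p} \left\{ \lambda_\mu\left[\frac{d}{dt}\left(\frac{\partial L_\mu}{\partial \dot{x}_i}\right) - \frac{\partial L_\mu}{\partial x_i}\right]
 + \frac{\partial L_\mu}{\partial \dot{x}_i}\frac{d\lambda_\mu}{dt} \right\} = 0. \tag{3.24} \]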

For n < A ≤ n + p in (3.23) we have, firstly, that

\[ \frac{\partial L}{\partial x_A} \equiv 0, \qquad \frac{\partial L}{\partial \dot{x}_A} \equiv 0, \tag{3.25} \]
and secondly it follows from (3.21) that (for ν = 1, 2, . . . , p)
\[ \frac{\partial G_\mu}{\partial x_\nu} \equiv 0, \qquad \frac{\partial G_\mu}{\partial \dot{x}_\nu} = \delta_{\mu\nu} =
 \begin{cases} 1 & \text{if } \mu = \nu, \\ 0 & \text{otherwise.} \end{cases} \tag{3.26} \]

Substituting (3.25) and (3.26) into (3.23) gives us the remaining equations, namely
\[ \frac{d\lambda_\mu}{dt} = 0. \tag{3.27} \]
These last equations are nonetheless meaningful.

In the isoperimetric problem, when it is dealt with as a problem of Lagrange and the multiplier
rule is applied, the multipliers $\lambda_\mu$ are constants.
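
The step from (3.25) and (3.26) to (3.27) is short enough to write out. A sketch, for the equation of (3.23) that belongs to the new coordinate $x_\nu$:

```latex
% Equation (3.23) for the new coordinate x_\nu, using (3.25) and (3.26).
\begin{align*}
0 &= \underbrace{\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}_\nu}\right) - \frac{\partial L}{\partial x_\nu}}_{=\,0 \text{ by (3.25)}}
   + \sum_{\mu=1}^{p} \left\{ \lambda_\mu\left[\frac{d}{dt}\,\delta_{\mu\nu} - 0\right] + \delta_{\mu\nu}\frac{d\lambda_\mu}{dt} \right\}
   = \frac{d\lambda_\nu}{dt}, \qquad \nu = 1, 2, \ldots, p .
\end{align*}
```

Because $\delta_{\mu\nu}$ is constant its time derivative vanishes, and the sum collapses to the single term with µ = ν.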

The Euler-Lagrange equations (3.24) are written in their final form as

\[ \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}_i}\right) - \frac{\partial L}{\partial x_i}
 - \sum_{\mu=1}^{p} \lambda_\mu\left[\frac{d}{dt}\left(\frac{\partial L_\mu}{\partial \dot{x}_i}\right) - \frac{\partial L_\mu}{\partial x_i}\right] = 0. \tag{3.28} \]

In the context of the isoperimetric problem, equations (3.28) are often referred to as the Mayer
equations. These equations must be satisfied by all functions of class $C^2$ which can possibly yield
an extreme value for the fundamental integral (3.20) in the family Ω of curves which satisfy the
auxiliary conditions (3.19). This problem as well as its generalizations are discussed in [17] and [29].
Here are some examples.

Example 3.3.1

The Problem of Dido: Determine the extremal for the isoperimetric problem with the functional
\[ I[x] = \int_a^b x\,dt, \tag{3.29} \]

subject to the auxiliary and boundary conditions

\[ \int_a^b \sqrt{1 + \dot{x}^2}\,dt = \ell, \tag{3.30} \]
\[ x(a) = x(b), \qquad a < b, \qquad b - a < \ell. \tag{3.31} \]

Solution
Note that if $b - a = \ell$, then (3.30) will be satisfied only if x(t) ≡ c (c is a constant), and if $b - a > \ell$,
then no real solution exists.

In view of (3.29) and (3.30), we have $L(t, x, \dot{x}) = x$ and $L_1(t, x, \dot{x}) = \sqrt{1 + \dot{x}^2}$ (note that i = 1
and µ = 1). The Mayer equation (3.28) is given by
\[ -1 - \lambda\frac{d}{dt}\left(\frac{\dot{x}}{\sqrt{1 + \dot{x}^2}}\right) = 0
 \quad\Longrightarrow\quad \frac{d}{dt}\left(\frac{\dot{x}}{\sqrt{1 + \dot{x}^2}}\right) = -\frac{1}{\lambda}, \]
or, after integration,
\[ \frac{\dot{x}}{\sqrt{1 + \dot{x}^2}} = -\frac{t}{\lambda} + c. \]
By now it should be routine to get the solution of this first order equation. The solution consists
of circles, and is obtained by the substitution $dx/dt = \tan\psi$ or through direct integration. The
extremals of the problem are thus circles of the form
\[ \left(\frac{x}{\lambda} - d\right)^2 + \left(\frac{t}{\lambda} - k\right)^2 = 1, \]
with constants d and k which must be determined from the boundary conditions. Refer to [6],
Chapter 6.
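
The "routine" integration in Example 3.3.1 can be sketched by direct integration as follows. The auxiliary variable u is introduced here for convenience and is not used in the guide; d and k are the constants of the example.

```latex
% Direct integration for Example 3.3.1. Assumed auxiliary variable: u = -t/\lambda + c.
\begin{align*}
\frac{\dot{x}}{\sqrt{1 + \dot{x}^2}} = u
  &\quad\Longrightarrow\quad \dot{x} = \frac{u}{\sqrt{1 - u^2}}, \qquad dt = -\lambda\,du, \\
x &= -\lambda\int \frac{u}{\sqrt{1 - u^2}}\,du = \lambda\sqrt{1 - u^2} + \lambda d, \\
\left(\frac{x}{\lambda} - d\right)^2 &= 1 - u^2 = 1 - \left(\frac{t}{\lambda} - c\right)^2
  \quad\Longrightarrow\quad
  \left(\frac{x}{\lambda} - d\right)^2 + \left(\frac{t}{\lambda} - k\right)^2 = 1, \quad k = c .
\end{align*}
```

In the (t, x) plane this is a circle of radius $|\lambda|$ centred at $(\lambda k, \lambda d)$, so the multiplier fixes the radius and hence the arc length.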

Example 3.3.2

A heavy uniform flexible chain of a given length hangs in equilibrium under gravity with its end-
points P1 and P2 fixed. Determine the chain’s equation.
Solution
According to the theory of statics the chain will hang in a vertical plane in such a way that its
potential energy will be a minimum. Take this plane to be the xy–plane and denote the fixed
points by $P_1(a, h)$ and $P_2(b, h)$. Let ρ be the mass per unit length of the chain. The potential
energy of a length ds at a height y above the x–axis is ρgy ds. With
\[ ds = \sqrt{1 + \dot{y}^2}\,dx, \qquad \dot{y} = \frac{dy}{dx}, \]
the total potential energy I of the chain is given by
\[ \frac{I}{\rho g} = \int_a^b y\sqrt{1 + \dot{y}^2}\,dx, \]
subject to the auxiliary condition
\[ \ell = \int_a^b \sqrt{1 + \dot{y}^2}\,dx. \]
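The Mayer equation (3.28) is then given by
\[ \begin{aligned}
 & \frac{d}{dx}\left(\frac{\partial L}{\partial \dot{y}}\right) - \frac{\partial L}{\partial y}
   - \lambda\left[\frac{d}{dx}\left(\frac{\partial L_1}{\partial \dot{y}}\right) - \frac{\partial L_1}{\partial y}\right] = 0 \\
 \Rightarrow\ & \frac{d}{dx}\left(\frac{y\dot{y}}{\sqrt{1 + \dot{y}^2}}\right) - \sqrt{1 + \dot{y}^2}
   - \lambda\left[\frac{d}{dx}\left(\frac{\dot{y}}{\sqrt{1 + \dot{y}^2}}\right) - 0\right] = 0 \\
 \Rightarrow\ & \sqrt{1 + \dot{y}^2} - \frac{d}{dx}\left(\frac{(y - \lambda)\dot{y}}{\sqrt{1 + \dot{y}^2}}\right) = 0.
\end{aligned} \]

Carrying out the differentiation leaves us with

\[ \left(1 + \dot{y}^2\right)^{-3/2}\left[1 + \dot{y}^2 - (y - \lambda)\ddot{y}\right] = 0, \]

for finite $\dot{y}$ in (a, b). We therefore have

\[ \ddot{y} = \frac{1 + \dot{y}^2}{y - \lambda}. \]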
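To solve the above equation, we set $\ddot{y} = \dfrac{1}{2}\dfrac{d(\dot{y})^2}{dy}$ and $z = \dot{y}^2$. So we have

\[ \frac{1}{2}\frac{d(\dot{y})^2}{dy} = \frac{1 + \dot{y}^2}{y - \lambda}
 \ \Rightarrow\ \frac{1}{2}\frac{d(\dot{y})^2}{1 + \dot{y}^2} = \frac{dy}{y - \lambda}
 \ \Rightarrow\ \frac{dy}{y - \lambda} = \frac{1}{2}\frac{dz}{1 + z}
 \ \Rightarrow\ \ln(y - \lambda) = \frac{1}{2}\ln\left(1 + \dot{y}^2\right) + \text{const}. \]

Thus $y - \lambda = c\sqrt{1 + \dot{y}^2}$ for a constant c, which is a first integral. Integrating again gives us
\[ y = \lambda + c\cosh\left(\frac{x + d}{c}\right), \]
where c and d are constants. This is the equation for a catenary (the reason is obvious), which we
have already encountered in our discussion of the minimal surfaces of revolution in Example 2.4.5.
Further information about this problem can be found in [10].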
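
The second integration in Example 3.3.2 can be sketched as follows. Here the constant of the first integral is written as c, so that it matches the constant in the final catenary formula, and u is an auxiliary variable assumed for the substitution.

```latex
% Second integration for Example 3.3.2, starting from the first integral
% y - \lambda = c\sqrt{1 + \dot{y}^2}. Assumed substitution: y - \lambda = c\cosh u.
\begin{align*}
y - \lambda = c\cosh u
  &\quad\Longrightarrow\quad \sqrt{1 + \dot{y}^2} = \cosh u
  \quad\Longrightarrow\quad \dot{y} = \sinh u \ \ (\text{one choice of sign}), \\
\dot{y} = \frac{dy}{dx} = c\sinh u\,\frac{du}{dx}
  &\quad\Longrightarrow\quad \frac{du}{dx} = \frac{1}{c}
  \quad\Longrightarrow\quad u = \frac{x + d}{c}, \\
y &= \lambda + c\cosh\left(\frac{x + d}{c}\right).
\end{align*}
% Check: \ddot{y} = \tfrac{1}{c}\cosh u and (1 + \dot{y}^2)/(y - \lambda) = \cosh^2 u /(c\cosh u) = \tfrac{1}{c}\cosh u.
```

The three constants λ, c and d are then fixed by the two endpoint conditions and the prescribed length of the chain.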

3.4 Exercises

The first few exercises deal with the problem of Lagrange.

3.4.1 A simple plane pendulum consists of a mass m which is fixed to a string of length $\ell$. When
the pendulum is set in motion, the length of the string is shortened at a constant rate u, while the
point of suspension remains fixed.
Determine the equations of motion and the Lagrangian function.

3.4.2 Consider the motion of a smooth particle on a circular wire with radius a which is rotating
at a constant angular velocity ω about a vertical diameter.
Determine the equations of motion of the particle, as well as its Lagrangian function.

3.4.3 A particle is subject to a force F . Determine its equations of motion in cylindrical coordi-
nates.

3.4.4 A particle of mass m moves in one dimension under the influence of a force

\[ F(x, t) = \frac{A}{x^2}e^{-kt}, \]
where A and k are positive constants. Determine the equations of motion and the Lagrangian
function of the particle.

3.4.5 A rod of mass m is free to rotate horizontally about a fixed endpoint. An insect of mass
4m/9 crawls at a constant velocity along the rod in the direction of the fixed endpoint. Initially
the rod rotates with the insect at the moving end of the rod. Show that the insect finds it most
difficult to crawl when it is at the middle of the rod.

3.4.6 A rigid straight track is fixed onto a plate and a small locomotive of mass m travels on the
track. The plate is horizontal and rotates freely around a vertical axis at a distance a from the
track. If the moment of inertia of the plate and track about the vertical axis is $mk^2$, show that
the plate can never turn through an angle greater than
\[ \frac{\pi a}{\sqrt{a^2 + k^2}} \]
as a result of the motion of the locomotive.

3.4.7 A smooth piece of wire is bent in the form of a spiral with cylindrical equations r = b and
z = aφ, where a and b are constants. If the origin is the centre of a force of attraction which
is directly proportional to the distance (proportionality constant k), determine the motion of a
particle of mass m which slides down the wire.

3.4.8 A particle is constrained to move along a wire which is bent in the form of a conical spiral.
Assume that r = az and φ = −bz, where a and b are constants (see Figure 3.7). Use generalized
forces to show that the equation of motion of the particle is given by
\[
\ddot z \left( a^2 + 1 + a^2 b^2 z^2 \right) + a^2 b^2 z\, \dot z^2 = -g.
\]


The following problems have to do with the problem of Lagrange from a wider perspective.

Figure 3.7

3.4.9 Find the extremal (if it exists) of the problem of Lagrange in $R^3$ with the functional
\[
I[x] = \int_0^1 \left[ (\dot x_1)^2 - (x_1)^2 \right] dt,
\]
subject to
\[
x_2 - \sqrt{1 + (x_1)^2} = 0
\]
with boundary conditions
\[
x_1(0) = 0, \quad x_2(0) = 0, \quad x_1(1) = 1, \quad x_2(1) = 2.
\]

In terms of the terminology used by Sagan in [29], p. 339, this is an example of an abnormal
extremal.

3.4.10 Obtain the extremals (if they exist) of the following problems of Lagrange with functionals,
auxiliary conditions and boundary conditions as indicated. Ensure that the problems, as they are
stated, make sense as problems in the variational calculus.

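3.4.10.1
\[
I[x] = \int_0^{\pi/2} (x_1)^2\, dt,
\]
\[
\dot x_1 + x_2 - (x_2 - x_3)^2\, x_2 = 0, \quad \dot x_2 - x_1 = 0,
\]
\[
x_1(0) = 1, \quad x_2(0) = x_3(0) = 0,
\]
\[
x_1\!\left(\tfrac{\pi}{2}\right) = 0, \quad x_2\!\left(\tfrac{\pi}{2}\right) = x_3\!\left(\tfrac{\pi}{2}\right) = 1.
\]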
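3.4.10.2
\[
I[x] = \int_0^{\pi/2} \left[ (\dot x_2)^2 - (x_2)^2 \right] dt,
\]
\[
\dot x_1 + \dot x_2 - \dot x_3 = 0, \quad \dot x_1 + 2\dot x_3 = 0, \tag{3.32}
\]
\[
x_1(0) = 0, \quad x_2(0) = 0, \quad x_3(0) = 0,
\]
\[
x_1\!\left(\tfrac{\pi}{2}\right) = -\pi, \quad x_2\!\left(\tfrac{\pi}{2}\right) = \tfrac{3\pi}{2}, \quad x_3\!\left(\tfrac{\pi}{2}\right) = \tfrac{\pi}{2}.
\]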

3.4.11 Consider the problem of the hanging chain in Example 3.3.2 from the start as a problem
of Lagrange. Obtain once again the catenary as extremal.

The following problems are concerned with the isoperimetric problem.

3.4.12 This problem is a variation of the problem of Dido. Let x = f(t) be a given curve of
class $C^2$ which joins two endpoints $P_1(t_1, x_1)$ and $P_2(t_2, x_2)$ in the first quadrant of the (t, x)–plane,
with $0 \le t_1 \le t_2$ (see Figure 3.8).

Figure 3.8

Determine the form of the curve Γ of class $C^2$ and of a suitable fixed length, which is such that
the finite area enclosed between the two curves Γ and x = f(t) is a maximum. Another
variation of the problem is to be found in [26], p. 172.

3.4.13 Find the extremal (if it exists) of the isoperimetric problem with the functional
\[
I[x] = \int_0^1 \dot x^2\, dt
\]
subject to the conditions
\[
\int_0^1 x\, dt = 3, \quad x(0) = 1, \quad x(1) = 6.
\]

Consider as well more general boundary conditions such as $x(t_1) = t_1$, $x(t_2) = t_2$ while 3 in the
isoperimetric condition is replaced by a constant c. This problem is dealt with in [6], pp.
134, 183 and [22], p. 96.

3.4.14 Determine the Mayer equation for the isoperimetric problem with the functional
\[
I[x] = \int_0^a \left( p x^2 + q \dot x^2 \right) dt,
\]
with auxiliary and boundary conditions
\[
\int_0^a r x^2\, dt = 1, \quad x(0) = 0, \quad x(a) = 0,
\]
where p, q, r are given class $C^2$ functions of t.



3.4.15 Find the extremal for the functional
\[
\int_0^1 \left( \dot x^2 + t^2 \right) dt,
\]
subject to
\[
\int_0^1 x^2\, dt = 2, \quad x(0) = 0, \quad x(1) = 0.
\]

3.4.16 Investigate the isoperimetric problem with the functional
\[
I[x] = \int_0^1 \left( \dot x^2 - x^2 \right) dt,
\]
with auxiliary and boundary conditions
\[
\int_0^1 \sqrt{1 + \dot x^2}\, dt = \sqrt{2}, \quad x(0) = 0, \quad x(1) = 1.
\]

See the remarks in Example 3.3.1 and also [29], p. 346.

3.4.17 Determine the extremal of the isoperimetric problem with the functional
\[
I = \int_0^1 \left[ (\dot x_1)^2 + (\dot x_2)^2 - 4t x_2 - 4x_2 \right] dt,
\]
and auxiliary and boundary conditions
\[
\int_0^1 \left[ (\dot x_1)^2 - t\dot x_1 - (\dot x_2)^2 \right] dt = 2,
\]
\[
x_1(0) = x_2(0) = 0, \quad x_1(1) = x_2(1) = 1.
\]

This is a standard problem which is discussed in most of the textbooks; see for example [8], p. 147.

3.4.18 Consider the class $C^2$ curves of given length which join the two points $P_1(t_1, x_1)$ and
$P_2(t_2, x_2)$ in the upper half of the (t, x)–plane. A classical problem is to determine the curve
which will maximise the volume enclosed by the surface of revolution about the t–axis. Formulate this as an
isoperimetric problem, obtain the Mayer equation and find the solution as an integral (this leads
to elliptic functions!). This problem is dealt with extensively in [6], pp. 129-130.

The following problem deals with an important class of problems which are related to the
problem of Lagrange.

3.4.19 Consider a variational calculus problem with so-called finite auxiliary conditions of the
type
\[
G_\mu(t, x_i) = 0
\]
(these conditions are not differential equations).

(a) Define the extended Lagrangian function (3.13) and derive the Euler-Lagrange equations for
this problem.

(b) Consider the following as an application (see [32], pp. 61-63). In the so-called geodesic problem
in the (x, y, z)–space we want to find the equation of a curve with minimal arclength on a
surface
\[
G(x, y, z) = 0, \tag{3.33}
\]
which joins the two points $P_1$ and $P_2$ on the surface.
Here we apply the parametric approach and conventions, so that the parameter t is not explicitly
present. It is obvious that the fundamental integral is now given by
\[
I = \int_{t_1}^{t_2} \sqrt{\dot x^2 + \dot y^2 + \dot z^2}\, dt.
\]

Derive the equations of the extremal and eliminate the multiplier to express them in the form
\[
\frac{\dfrac{d}{dt}\left( \dfrac{\dot x}{f} \right)}{\dfrac{\partial G}{\partial x}}
= \frac{\dfrac{d}{dt}\left( \dfrac{\dot y}{f} \right)}{\dfrac{\partial G}{\partial y}}
= \frac{\dfrac{d}{dt}\left( \dfrac{\dot z}{f} \right)}{\dfrac{\partial G}{\partial z}},
\]
where
\[
f(\dot x, \dot y, \dot z) = \sqrt{\dot x^2 + \dot y^2 + \dot z^2}.
\]
These equations together with (3.33) are the equations of the required geodesic.

(c) Apply this result to prove that the geodesic on a sphere is a great circle.

The last problems deal with the theory discussed in this chapter.

3.4.20 Why is it necessary to vary the admissible curves in the variational calculus? Discuss
this using an example of a problem of Lagrange where the admissible curves are “fixed”.

3.4.21 Show that the isoperimetric problem in Rn+1 can be considered as a problem of Lagrange
in a higher dimension. Derive the Mayer equations (3.28) which must be satisfied by those smooth
curves in Rn+1 which are a solution of this problem. Also show, while carrying out the derivation,
that the multipliers are constants.

Chapter 4

THE HAMILTON–JACOBI THEORY

The applications of the variational calculus which we have considered so far dealt with a dynamical
system with n degrees of freedom and generalized coordinates $q_i$, characterized by a Lagrangian
function L. This led to the solution of the Euler-Lagrange equations, which defined the motion
$q_i = q_i(t)$ in the (n + 1)-dimensional configuration space $(t, q_i)$. An important property of this
approach is that the generalized velocities $dq_i/dt$ appear explicitly in this formulation and are
considered to be independent dynamical quantities.

As a result of the dynamical independence of the $q_i$, $dq_i/dt$ and t it is convenient to represent
the motion of a dynamical system by a curve in the (2n + 1)–dimensional “phase”–space $(t, q_i, dq_i/dt)$,
where each point on the path provides information regarding the phase or state of the system,
i.e. information regarding its 2n generalized position and velocity coordinates at a particular time.
The curve describing the motion in this new space is no longer derived from the Lagrangian
function, but from a function called the Hamiltonian function, which is constructed from the Lagrangian
function and is related to the conservation of energy.

Furthermore we abandon the generalized velocity and replace it with the generalized
momentum (also called the canonical momentum), which is constructed from the generalized velocity.
The introduction of this momentum simplifies the mathematical formulation in phase space. This
is briefly what this chapter is about. We construct the Hamiltonian function and generalized
momentum, obtain Hamilton’s equations (which are related to the Euler-Lagrange equations) and
consider the complete figure which describes the problem of the variational calculus from a theoretical
geometrical viewpoint. Its application is mainly in quantum theory. The insight which it actually
provides to the understanding of the variational calculus should not be underestimated.

This approach has a lot to do with extremal fields, but the applications will be in mechanics.
To describe even one such physical field (extremal field) would make this course too long, so
I shall just refer you to [28] for further information. The theory presented here is a local theory.
Global variational calculus was only developed in the 1920s and 1930s (see [24, 25] and [31]).

Objectives

At the end of this chapter you will be able to:

• construct the Hamiltonian function and the generalized momentum,

• derive and apply the Hamilton-Jacobi equations,

• apply Hamilton’s equations, and

• construct, from a theoretical geometrical viewpoint, the complete figure of Carathéodory.

4.1 The canonical momentum and the Hamiltonian function

We begin this section with an example in order to justify the introduction of our definitions. The
motion of a single point mass in an external force field with potential V is described in terms of
Cartesian coordinates (x, y, z) as generalized coordinates by the Lagrangian function
\[
L(x, y, z, \dot x, \dot y, \dot z) = \frac{m}{2}\left( \dot x^2 + \dot y^2 + \dot z^2 \right) - V(x, y, z).
\]
The components of the linear momentum vector of the point mass are given by

(mẋ, mẏ, mż) ,

and if we compare this with the Lagrangian function, we see that


 
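\[
(m\dot x, m\dot y, m\dot z) = \left( \frac{\partial L}{\partial \dot x}, \frac{\partial L}{\partial \dot y}, \frac{\partial L}{\partial \dot z} \right).
\]
This leads to the general definition of the canonical momentum, namely
\[
p_i = \frac{\partial L}{\partial \dot x_i}, \qquad L = L(t, x_i, \dot x_i). \tag{4.1}
\]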
Furthermore we see that we can express the velocities in terms of the momenta, namely
\[
\dot x = \frac{p_x}{m}, \quad \dot y = \frac{p_y}{m}, \quad \dot z = \frac{p_z}{m}.
\]
This is only possible in general if the Jacobian of equation (4.1) is not zero, i.e.
\[
\det\left( \frac{\partial^2 L}{\partial \dot x_i\, \partial \dot x_j} \right) \ne 0. \tag{4.2}
\]
As we already stated in Chapter 1, in general we shall require that (4.2) holds (even though there
were applications in the previous chapter where this did not apply!). In this chapter condition
(4.2) is very important since we want to express the velocities in terms of the momenta as
\[
\dot x_i = \phi_i(t, x_i, p_i). \tag{4.3}
\]

Incidentally, condition (4.2) forms the cornerstone of the non-homogeneous theory in the variational
calculus. Since L is of class $C^2$, the $p_i$ are continuously differentiable in all their variables,
and the same applies to their inverses $\phi_i$.

We now introduce the Hamilton function in phase space $(t, x_i, p_i)$, corresponding to the
Lagrangian function in the configuration space $(t, q_i)$, namely
\[
H(t, x_i, p_i) = -L(t, x_i, \phi_i) + \sum_{j=1}^{n} p_j\, \phi_j(t, x_i, p_i). \tag{4.4}
\]

The Hamilton function in our example is given by
\[
H(x, y, z, p_x, p_y, p_z) = \frac{1}{2m}\left( p_x^2 + p_y^2 + p_z^2 \right) + V(x, y, z).
\]
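As a rough numerical illustration of definition (4.4), the following Python sketch builds H for the point mass by inverting $p_i = m\dot x_i$ and compares the result with the closed form above. The value of m, the quadratic potential V and all function names are assumptions made only for this illustration; they do not come from the text.

```python
# Sketch: H = -L(x, phi) + sum_j p_j * phi_j for a point mass, cf. (4.4).
# The mass m and the potential V are illustrative assumptions.

m = 2.0


def V(x, y, z):
    # Hypothetical potential, chosen only for the demonstration.
    return 0.5 * (x * x + y * y + z * z)


def L(pos, vel):
    # Lagrangian L = (m/2)(xdot^2 + ydot^2 + zdot^2) - V(x, y, z).
    return 0.5 * m * sum(v * v for v in vel) - V(*pos)


def phi(mom):
    # Inverse of p_i = dL/d(xdot_i) = m * xdot_i, i.e. (4.3) for this example.
    return tuple(p / m for p in mom)


def H(pos, mom):
    # Definition (4.4): the velocities are replaced by phi(p).
    vel = phi(mom)
    return -L(pos, vel) + sum(p * v for p, v in zip(mom, vel))


def H_closed_form(pos, mom):
    # (p_x^2 + p_y^2 + p_z^2) / (2m) + V(x, y, z)
    return sum(p * p for p in mom) / (2.0 * m) + V(*pos)


pos, mom = (0.3, -1.2, 0.7), (1.5, 0.4, -2.0)
print(H(pos, mom), H_closed_form(pos, mom))
```

The two printed numbers should agree up to rounding, which is all the sketch is meant to show.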
It is obvious from (4.2) and (4.4) that H cannot become identically zero. If we assume that H is
identically zero, then we get
\[
\frac{\partial L}{\partial \dot x_i}\, \dot x_i - L = 0,
\]
and if we differentiate with respect to the velocities we get
\[
\frac{\partial^2 L}{\partial \dot x_i\, \partial \dot x_j}\, \dot x_j = 0 \quad \text{for } \dot x_j \ne 0.
\]
This is only possible if (4.2) does not hold! The case where (4.2) does not hold won’t be dealt
with here and you can read more about it in [28], Chapter 3.

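The Hamilton function H in (4.4) is at least of class $C^2$ and satisfies the following identities
\[
\frac{\partial H}{\partial x_i} = -\frac{\partial L}{\partial x_i}, \quad
\frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t}, \quad
\frac{\partial H}{\partial p_i} = \phi_i. \tag{4.5}
\]

Verify these identities yourself. Since the last identity is the inverse of (4.1), it follows that
\[
\det\left( \frac{\partial^2 H}{\partial p_i\, \partial p_j} \right) \ne 0. \tag{4.6}
\]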

Furthermore the Hamiltonian function is uniquely defined for a given Lagrangian function if (4.6)
holds. This ensures the symmetry that exists between the Lagrangian and the Hamilton function
representations.
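As a hint for the verification of (4.5), here is a sketch of how the last identity can be obtained; the other two follow in the same way. It assumes only definitions (4.1), (4.3) and (4.4), with the sums over j written out explicitly.

```latex
\begin{align*}
\frac{\partial H}{\partial p_i}
  &= -\sum_{j=1}^{n} \frac{\partial L}{\partial \dot x_j}\,
       \frac{\partial \phi_j}{\partial p_i}
     + \phi_i
     + \sum_{j=1}^{n} p_j\, \frac{\partial \phi_j}{\partial p_i} \\
  &= \phi_i + \sum_{j=1}^{n}
       \left( p_j - \frac{\partial L}{\partial \dot x_j} \right)
       \frac{\partial \phi_j}{\partial p_i} \\
  &= \phi_i ,
\end{align*}
% the bracket vanishes because p_j = \partial L / \partial \dot x_j by (4.1),
% evaluated at \dot x_j = \phi_j(t, x_i, p_i).
```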

4.2 The Hamilton–Jacobi equation

In this section we derive the Hamilton-Jacobi equation for the variational calculus problem
formulated in Section 2.3 and obtain the Euler-Lagrange equations from that equation. This method keeps the
geometric picture in the foreground and leads us directly to the so-called complete figure. We
owe this complete figure to the renowned mathematician Constantin Carathéodory [3, 4].

In the configuration space $R^{n+1}$ of the variables $(t, x_i)$, consider a one-parameter family of
hypersurfaces
\[
S(t, x_i) = \Sigma, \tag{4.7}
\]
where the functions S are of class $C^2$ and Σ denotes the parameter of the family. An example
of such a hypersurface is the surface of a sphere in $R^3$ which is given by $t^2 + (x_1)^2 + (x_2)^2 = \Sigma$,
where Σ > 0 is a constant (the parameter). Note that Σ is the square of the radius. By
substituting different values we get different spherical surfaces all of which have the same centre.
These different surfaces are referred to as the family of hypersurfaces. An arbitrary family of
hypersurfaces is illustrated in Figure 4.1.

Figure 4.1

We assume further that the family (4.7) covers a region G in $R^{n+1}$ in such a way that only one
member of the family passes through each point of G. The example of the spheres is just such a family
(which indeed covers the whole of $R^3$ in this way!).

Let C be the class $C^2$ curve with parametric representation
\[
C : x_i = x_i(t), \tag{4.8}
\]
in $R^{n+1}$. In addition this curve must be such that it cuts the family of hypersurfaces (4.7) in such
a way that it is nowhere tangent to any member of the family (see Figure 4.2). As a result of
this choice we ensure that for each point P of C there exists a unique value for Σ, namely that of the
hypersurface of (4.7) which passes through P. We can then vary Σ along the curve C, and as a
result we can regard Σ as a function of t, with
\[
\Delta = \frac{d\Sigma}{dt} = \frac{\partial S}{\partial x_i}\, \dot x_i + \frac{\partial S}{\partial t}, \tag{4.9}
\]
where $dx_i/dt$ refers to the tangent to C. Since the curve nowhere touches a member of the family,
Δ ≠ 0, and we can assume that (4.7) is chosen so that
\[
\Delta > 0, \quad \text{or} \quad \Delta < 0. \tag{4.10}
\]
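To make (4.9) concrete, the sketch below evaluates Δ for the family of spheres mentioned above along a sample curve, and compares it with a finite-difference derivative of Σ(t) = S(t, x(t)). The curve $x_1 = t$, $x_2 = 2t$ is a hypothetical choice made only for this illustration.

```python
# Sketch: Delta = dSigma/dt along a curve, cf. (4.9), for the spheres
# S(t, x1, x2) = t^2 + x1^2 + x2^2. The curve below is an assumed example.


def S(t, x1, x2):
    return t * t + x1 * x1 + x2 * x2


def curve(t):
    # Hypothetical curve C: x1 = t, x2 = 2t.
    return (t, 2.0 * t)


def curve_velocity(t):
    # Tangent components (dx1/dt, dx2/dt) of the curve above.
    return (1.0, 2.0)


def delta(t):
    # (4.9): Delta = (dS/dx_i) * xdot_i + dS/dt, with dS/dx_i = 2 x_i, dS/dt = 2 t.
    x1, x2 = curve(t)
    v1, v2 = curve_velocity(t)
    return 2.0 * x1 * v1 + 2.0 * x2 * v2 + 2.0 * t


def delta_numeric(t, h=1e-6):
    # Central difference of Sigma(t) = S(t, x(t)) along the curve.
    return (S(t + h, *curve(t + h)) - S(t - h, *curve(t - h))) / (2.0 * h)


for t in (0.5, 1.0, 2.0):
    print(t, delta(t), delta_numeric(t))
```

For this curve Δ = 12t, so Δ > 0 for t > 0 and the first alternative in (4.10) holds there; at t = 0 the curve passes through the common centre of the spheres, where the family degenerates.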

Figure 4.2

Consider the integral I (see Equation (2.12) of Section 2.3)
\[
I = \int_{P_1}^{P_2} L(t, x_i, \dot x_i)\, dt \quad (\text{taken along } C) \tag{4.11}
\]
defined for a given Lagrangian function $L(t, x_i, dx_i/dt)$ along C. For a tangential displacement
$(dt, dx_i) = (1, dx_i/dt)\, dt$ along C, the corresponding increment of the fundamental integral (4.11)
is given by
\[
dI = L(t, x_i, \dot x_i)\, dt \tag{4.12}
\]
(fundamental theorem of the infinitesimal calculus!). The next step in the construction of the
complete figure is to require that the curve C must be such that, for a given increment dΣ
of the parameter in (4.7), the corresponding tangential displacement $(1, dx_i/dt)\, dt$, which joins two
“neighboring” hypersurfaces
\[
S(t, x_i) = \Sigma, \quad S(t, x_i) = \Sigma + d\Sigma,
\]
must yield a minimum value of dI. It is clear that if the displacement $(1, dx_i/dt)\, dt$ provides an
extreme value for dI, it will also do so for dI/dΣ. A necessary condition for this to occur is
\[
\frac{\partial}{\partial \dot x_i} \left( \frac{dI}{d\Sigma} \right) = 0. \tag{4.13}
\]

Here we see that the requirement that (4.7) passes through each point in G is important, otherwise
(4.13) would be the case everywhere in G. This condition can be rewritten, by making use of (4.9)
and (4.12), as
\[
\frac{\partial}{\partial \dot x_i} \left( \frac{L}{\Delta} \right) = 0.
\]
Differentiation (followed by multiplication by Δ) gives us
\[
\frac{\partial L}{\partial \dot x_i} - \frac{L}{\Delta} \frac{\partial \Delta}{\partial \dot x_i} = 0. \tag{4.14}
\]
From (4.9) we get
\[
\frac{\partial \Delta}{\partial \dot x_i} = \frac{\partial S}{\partial x_i}
\]
so that (4.14) can be written as
\[
\frac{\partial L}{\partial \dot x_i} - \frac{L}{\Delta} \frac{\partial S}{\partial x_i} = 0. \tag{4.15}
\]
This condition has the form $f(t, x_j, dx_j/dt) = 0$, and must hold for the $x_i$ that define C as in (4.8).
We have so far not imposed conditions on C, and we now assume that C is such that (4.15) is
satisfied by C (see Figure 4.3). This means that the components of the tangent vector to C (that
is, $(1, dx_i/dt)\, dt$) satisfy condition (4.15).

We say that any tangent vector which satisfies condition (4.15), lies in the direction of the
geodesic gradient which is determined by the family of hypersurfaces (4.7).

This family (4.7) is too general for our purposes. We restrict ourselves to those families for
which the quantity L/Δ is constant on each member of the family. The constant does not
need to be the same for each member. We see now that the value of L/Δ is uniquely determined
by Σ, that is, L/Δ is a function of Σ, namely
\[
\frac{L}{\Delta}(t, x_i, \dot x_i) = \psi(\Sigma). \tag{4.16}
\]

The variable $dx_i/dt$ in L/Δ refers to the geodesic gradient. The value of L/Δ can now be taken
to be 1 without any loss of generality, as shown in the following Lemma.

Lemma 4.2.1

If the family (4.7) of hypersurfaces is such that condition (4.15) is satisfied then there exists an
equivalent representation of (4.7) for which
\[
\psi(\Sigma) = 1 \tag{4.17}
\]
holds everywhere.

Figure 4.3

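Proof: Let Ψ be an unspecified monotone function of Σ. The equation
\[
\bar S(t, x_i) = \Psi(S(t, x_i)) = \Psi(\Sigma) = \bar\Sigma \tag{4.18}
\]
represents the same family of hypersurfaces as (4.7) (the bar denotes the new representation).
Just as in (4.9) we deduce that
\[
\bar\Delta = \frac{d\bar\Sigma}{dt} = \frac{d\Psi(\Sigma)}{d\Sigma} \frac{d\Sigma}{dt} = \Psi'(\Sigma)\, \Delta,
\]
and as a result (4.16) becomes
\[
\frac{L}{\bar\Delta} = \frac{L}{\Psi'(\Sigma)\, \Delta} = \frac{\psi(\Sigma)}{\Psi'(\Sigma)}. \tag{4.19}
\]
Define the function Ψ to be
\[
\Psi(\Sigma) = \int_{\Sigma_0}^{\Sigma} \psi(s)\, ds, \tag{4.20}
\]
where $\Sigma_0$ is some constant, so that
\[
\Psi'(\Sigma) = \psi(\Sigma).
\]
From (4.19) we get
\[
L = \bar\Delta.
\]
Thus, if the family is represented by (4.18) and Ψ is defined by (4.20) then
\[
\bar\psi(\bar\Sigma) = \frac{L}{\bar\Delta} = 1. \tag{4.21}
\]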

Figure 4.4 illustrates what this Lemma means for the hypersurfaces. Note that the hypersurfaces
are “ironed smooth” (our geometric illustrations are very schematic!). By assuming that
(4.17) holds, we have that
\[
L = \Delta. \tag{4.22}
\]

Figure 4.4

As a result the geodesic gradient (4.15) is given by
\[
\frac{\partial L}{\partial \dot x_i} = \frac{\partial S}{\partial x_i}. \tag{4.23}
\]
If we use definition (4.1), then
\[
p_i = \frac{\partial S}{\partial x_i}. \tag{4.24}
\]
On account of condition (4.2), (4.23) can be solved (locally!) for $dx_i/dt$ to get
\[
\dot x_i = h_i(t, x_j), \tag{4.25}
\]
where the functions $h_i$ are of class $C^1$. In general this system of n first order differential equations
has an n–parameter family of solutions (the n integration constants are considered to be parameters).
We restrict our attention to the region G in $R^{n+1}$ which is covered simply by the family
(i.e. one and only one member of the family passes through each point of G). Such a family of
curves is referred to as a congruence. We say that the congruence belongs to the family of hypersurfaces
(4.7) and this property “of belonging to” is characterized by (4.23) holding along each member of
the congruence.

Our construction has led to a family of hypersurfaces (4.7) cut by a congruence of curves
which are completely determined by (4.7). This family of hypersurfaces is no longer arbitrary since
(4.23) must hold. Let’s investigate the implications of this.

Let $P(t, x_i)$ be an arbitrary point on (4.7). The curve Γ of the congruence goes through
P and cuts the neighboring hypersurface $S(t, x_i) = \Sigma + d\Sigma$ in Q. This displacement PQ yields,
for the fundamental integral (4.11) taken along Γ, an increment dI as given in (4.12), and since
(4.22) holds,
\[
dI = \Delta\, dt = d\Sigma,
\]
where we made use of (4.9). But this value is independent of the position of P on the hypersurface
$S(t, x_i) = \Sigma$. Thus, whenever we go in the direction of the geodesic congruence from
one hypersurface to the neighbouring one, the increment of the fundamental integral is constant,
namely dΣ. This result means that, along each member Γ of the congruence (4.25) which joins
the hypersurfaces $S(t, x_i) = \Sigma_1$ and $S(t, x_i) = \Sigma_2$, the value of the fundamental integral between
the points $P_1$ and $P_2$ where Γ cuts these hypersurfaces is given by
\[
I = \int_{P_1}^{P_2} L(t, x_i, \dot x_i)\, dt = \Sigma_2 - \Sigma_1 \quad (\text{taken along } \Gamma). \tag{4.26}
\]
For a given pair $\Sigma_1$ and $\Sigma_2$ the value of I is constant. For these reasons it is said that the family
of hypersurfaces (4.7) is geodesically equidistant with respect to L and the corresponding
congruence which belongs to these surfaces.

Theorem 4.2.1

A necessary and sufficient condition for the family of hypersurfaces (4.7) to be geodesically
equidistant with respect to L is that the functions S must satisfy the partial differential equation
\[
H\left( t, x_i, \frac{\partial S}{\partial x_i} \right) + \frac{\partial S}{\partial t} = 0 \tag{4.27}
\]
where H is the Hamiltonian function corresponding to L according to (4.4). This is the famous
Hamilton-Jacobi equation.

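Proof: We can write condition (4.22), using (4.9), as
\[
L(t, x_i, \dot x_i) = \frac{\partial S}{\partial x_i}\, \dot x_i + \frac{\partial S}{\partial t}, \tag{4.28}
\]
where $dx_i/dt$ refers to the direction of the geodesic gradient (from (4.25)). If we write (4.28) out
in full, keeping (4.3) in mind, we get
\[
-L\left( t, x_i, \phi_i\left( t, x_k, \frac{\partial S}{\partial x_k} \right) \right)
+ \frac{\partial S}{\partial x_i}\, \phi_i\left( t, x_k, \frac{\partial S}{\partial x_k} \right)
+ \frac{\partial S}{\partial t} = 0. \tag{4.29}
\]
Close inspection of this equation shows us that the first two terms are the Hamiltonian function
(4.4), so that (4.27) follows directly.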

The sufficient condition requires that we already have a solution $S(t, x_i)$ of (4.27). This
solution defines a field
\[
p_i(t, x_j) = \frac{\partial S}{\partial x_i} \tag{4.30}
\]
in $R^{n+1}$. Equations (4.1) and (4.2) determine a field (4.25), and as a result a congruence of curves.
For a given curve Γ of this congruence, a solution of (4.27) and two parameter values $\Sigma_1$ and $\Sigma_2$
we can construct an integral similar to the left hand side of (4.26). From (4.4), (4.27) and (4.30)
we get, with all integrals taken along Γ,
\[
\begin{aligned}
\int_{P_1}^{P_2} L\left( t, x_i, \phi_i\left( t, x_k, \frac{\partial S}{\partial x_k} \right) \right) dt
&= -\int_{P_1}^{P_2} \left[ H\left( t, x_i, \frac{\partial S}{\partial x_i} \right) - p_i \dot x_i \right] dt \\
&= \int_{P_1}^{P_2} \left[ \frac{\partial S}{\partial x_i}\, dx_i + \frac{\partial S}{\partial t}\, dt \right] \\
&= \int_{P_1}^{P_2} dS = \Sigma_2 - \Sigma_1,
\end{aligned} \tag{4.31}
\]
irrespective of the choice of $P_1$ on the hypersurface with parameter $\Sigma_1$. Thus, the value of the
integral (4.31) is the same for all members of the congruence, and hence the solution $S(t, x_i)$ defines
a family of geodesically equidistant hypersurfaces. □

The solution of the fundamental problem in the variational calculus (as formulated
in Section 2.3) is provided by congruences of curves which belong to the geodesically equidistant
hypersurfaces under certain additional conditions. This geometrical entity that we have
constructed is called, following Carathéodory, the complete figure of the problem of the
variational calculus. This congruence is transversal to the hypersurfaces of the family (4.7).
This complete figure forms the heart of the Hamilton-Jacobi theory, which itself plays a
very important role in classical mechanics and most parts of theoretical physics.
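A very small worked illustration of the complete figure may help here. It is not taken from the text: it assumes the free particle with n = 1, m = 1, and restricts attention to t > 0.

```latex
% Assumed example (not from the text): free particle, n = 1, m = 1, t > 0.
\begin{align*}
L(\dot x) &= \tfrac12 \dot x^2, \qquad p = \dot x, \qquad H(p) = \tfrac12 p^2, \\
\text{Hamilton-Jacobi equation (4.27):}\quad
  & \tfrac12 \left( \frac{\partial S}{\partial x} \right)^2
    + \frac{\partial S}{\partial t} = 0, \\
\text{a solution:}\quad
  & S(t, x) = \frac{x^2}{2t}, \qquad
    \frac{\partial S}{\partial x} = \frac{x}{t}, \qquad
    \frac{\partial S}{\partial t} = -\frac{x^2}{2t^2}, \\
\text{congruence (4.25):}\quad
  & \dot x = \frac{\partial S}{\partial x} = \frac{x}{t}
    \;\Longrightarrow\; x(t) = u\,t \quad (u \text{ the parameter}), \\
\text{check of (4.26):}\quad
  & \int_{t_1}^{t_2} \tfrac12 u^2 \, dt = \tfrac12 u^2 (t_2 - t_1)
    = S(t_2, u t_2) - S(t_1, u t_1) = \Sigma_2 - \Sigma_1 .
\end{align*}
```

Under these assumptions the straight lines x = ut through the origin form the congruence, and the curves $x^2/(2t) = \Sigma$ are the geodesically equidistant "hypersurfaces".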

The Hamilton-Jacobi theory is used on the practical side to solve problems, and from the theoretical
point of view it provides a geometrical picture which shows that there is no mathematical difference
between classical mechanics and geometrical optics. In addition it provides a very suitable starting
point from which to formulate quantum mechanics. Before we consider an example, we need to
learn more about the congruence of curves.

4.3 Hamilton’s equations

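We have seen that the family of geodesically equidistant hypersurfaces (4.7) defines a field (4.30) at
each point on a member Γ of the congruence. Differentiate (4.30) with respect to t along Γ so that
\[
\frac{dp_i}{dt} = \frac{\partial^2 S}{\partial x_i\, \partial x_j}\, \dot x_j + \frac{\partial^2 S}{\partial x_i\, \partial t}. \tag{4.32}
\]

Differentiate the Hamilton-Jacobi equation with respect to $x_i$ so that
\[
\frac{\partial H}{\partial x_i} + \frac{\partial H}{\partial p_j} \frac{\partial^2 S}{\partial x_j\, \partial x_i} + \frac{\partial^2 S}{\partial x_i\, \partial t} = 0. \tag{4.33}
\]

Using the last identity in (4.5) together with (4.3) in the second term of (4.33) gives us
\[
\frac{\partial H}{\partial x_i} + \frac{\partial^2 S}{\partial x_j\, \partial x_i}\, \dot x_j + \frac{\partial^2 S}{\partial x_i\, \partial t} = 0. \tag{4.34}
\]

Now eliminate the second derivatives of S in (4.32) and (4.34), to get (using the last identity in
(4.5))
\[
\frac{dp_i}{dt} = -\frac{\partial H}{\partial x_i}, \qquad \frac{dx_i}{dt} = \frac{\partial H}{\partial p_i}. \tag{4.35}
\]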
These are the well-known canonical equations (or Hamilton’s equations). Note that the
second set of equations are actually identities which result from the definition of the Hamiltonian
function. The first set of equations follow directly from the fact that the congruence of curves
belongs to the family of equidistant hypersurfaces.

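These equations are equivalent to the Euler-Lagrange equations which we derived in Chapter 2.
This is obvious if we substitute (4.1) in the left hand side of (4.35), so that
\[
\frac{d}{dt} \left( \frac{\partial L}{\partial \dot x_i} \right) = -\frac{\partial H}{\partial x_i},
\]
and if we replace the right hand side with the first identity in (4.5), we get
\[
\frac{d}{dt} \left( \frac{\partial L}{\partial \dot x_i} \right) - \frac{\partial L}{\partial x_i} = 0. \tag{4.36}
\]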

Thus we have shown that any curve which satisfies (4.35) also satisfies (4.36). Thus, the
congruence of geodesics which belong to a family of hypersurfaces which are geodesically
equidistant with respect to a Lagrangian function L, is a congruence of extremals.

Let us just return for a moment to the holonomic dynamical system. The canonical equations
define the path of the system in the (2n + 1)–dimensional phase space (t, qi , pi ) while the Euler-
Lagrange equations define the path in the (n + 1)–dimensional configuration space (t, qi ).

It is usually claimed that the canonical equations (4.35) are easier to solve than the Euler-
Lagrange equations. In principle this ought to be the case, since (4.35) is a system of first order
differential equations, while we already know that (4.36) is a system of the second order. In prac-
tice we derive the canonical equations from the Euler-Lagrange equations and solve the latter since
that is much easier to do!

We can greatly simplify the canonical equations using the so-called canonical transformations
which we shall briefly discuss in Chapter 5. But it is usually difficult to solve the canonical equa-
tions directly.

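The canonical equations related to the problem which we considered in Section 4.1 are given by
\[
\frac{dp_x}{dt} = -\frac{\partial V}{\partial x}, \quad
\frac{dp_y}{dt} = -\frac{\partial V}{\partial y}, \quad
\frac{dp_z}{dt} = -\frac{\partial V}{\partial z},
\]
\[
\dot x = \frac{p_x}{m}, \quad \dot y = \frac{p_y}{m}, \quad \dot z = \frac{p_z}{m}.
\]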
We now introduce the concept of an extremal field. Suppose that we are given an arbitrary
congruence of curves in $R^{n+1}$ which covers a region G simply. We represent the congruence by n
equations of the form
\[
x_i = x_i(t, u_\alpha), \tag{4.37}
\]
where $u_\alpha$ are the n parameters of the congruence in the sense that each set of values $u_\alpha$ corresponds
to one member of the congruence. There is thus a one-to-one correspondence between the points
$(t, x_i)$ in $R^{n+1}$ and the n + 1 variables $(t, u_\alpha)$. This implies that
\[
\det\left( \frac{\partial x_i}{\partial u_\alpha} \right) \ne 0. \tag{4.38}
\]

We assume that the functions in (4.37) are of class $C^2$. Given a Lagrangian function L, its value
is uniquely defined if we use (4.37) and its derivative with respect to t. The same applies to the
canonical momentum $p_i$ which we defined in (4.1), and which we now write as
\[
p_i = p_i(t, u_\alpha). \tag{4.39}
\]

The 2n quantities $x_i$ and $p_i$ are called a field if they are given by (4.37) and (4.39) and $dx_i/dt$ is
related to $p_i$ by means of the Lagrangian function in terms of (4.1) or (4.3).

A field belongs to a family of hypersurfaces such as (4.7) if in the region G the conditions
\[
p_i = \frac{\partial S(t, x_j)}{\partial x_i} = \frac{\partial S(t, x_j(t, u_\alpha))}{\partial x_i}, \tag{4.40}
\]
are satisfied, irrespective of whether the functions S define a geodesically equidistant family or
not. We investigate the integrability conditions of (4.40) since they play such a decisive role in the
theory, and because an arbitrary field need not belong to a family of hypersurfaces.

Theorem 4.3.1

For a field to belong to a family of hypersurfaces it is necessary and sufficient that the Lagrange
brackets $[u_\alpha, u_\beta]$ should be identically zero, with $u_\alpha$ as the n parameters of the field.

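Proof:
Firstly we note that for a given family (4.7) and a field belonging to it we have that
$S(t, x_j) = S(t, x_j(t, u_\alpha))$, so that
\[
\frac{\partial S}{\partial u_\alpha} = \frac{\partial S}{\partial x_i} \frac{\partial x_i}{\partial u_\alpha} = p_i \frac{\partial x_i}{\partial u_\alpha}, \tag{4.41}
\]
and further differentiation with respect to $u_\beta$ gives
\[
\frac{\partial^2 S}{\partial u_\alpha\, \partial u_\beta} = p_i \frac{\partial^2 x_i}{\partial u_\beta\, \partial u_\alpha} + \frac{\partial p_i}{\partial u_\beta} \frac{\partial x_i}{\partial u_\alpha}. \tag{4.42}
\]

This provides the integrability conditions for (4.40): since the left hand side of (4.42) is symmetrical
in the indices α and β this must apply to the right hand side as well, which implies that
\[
[u_\alpha, u_\beta] = 0. \tag{4.43}
\]

The quantity [v, w] is called the Lagrange bracket of the variables v and w, namely
\[
[v, w] = \sum_{i=1}^{n} \left( \frac{\partial x_i}{\partial v} \frac{\partial p_i}{\partial w} - \frac{\partial x_i}{\partial w} \frac{\partial p_i}{\partial v} \right). \tag{4.44}
\]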

The condition (4.43) is a necessary condition for a field to belong to S.

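Secondly, suppose that we are given a field, namely
\[
x_i = x_i(t, u_\alpha), \qquad p_i = p_i(t, u_\alpha), \tag{4.45}
\]
which satisfies the condition (4.43). We write condition (4.43) in the form
\[
\frac{\partial}{\partial u_\alpha} \left( p_i \frac{\partial x_i}{\partial u_\beta} \right) = \frac{\partial}{\partial u_\beta} \left( p_i \frac{\partial x_i}{\partial u_\alpha} \right)
\]
(please fill in the missing steps). This implies that a function $\sigma(t, u_\alpha)$ exists such that
\[
p_i \frac{\partial x_i}{\partial u_\alpha} = \frac{\partial \sigma}{\partial u_\alpha}. \tag{4.46}
\]
Thus, if we write $S(t, x_j) = \sigma(t, u_\alpha(t, x_j))$ (where we used the inverse of the first equation in
(4.45)) so that
\[
\frac{\partial \sigma}{\partial u_\alpha} = \frac{\partial S}{\partial x_j} \frac{\partial x_j}{\partial u_\alpha}, \tag{4.47}
\]
then from (4.46), (4.47) and (4.38) it follows that (4.40) is satisfied by $S(t, x_j) = \sigma(t, u_\alpha(t, x_j))$.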
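The Lagrange bracket (4.44) is easy to estimate numerically. The sketch below does so with central differences for two assumed pairs of functions with n = 2: a radial pair $x_i = u_i t$, $p_i = u_i$, which belongs to $S = (x_1^2 + x_2^2)/(2t)$, and a "rotated" pair chosen purely to produce a non-zero bracket (it is not a field in the sense of (4.1)). All names and sample values are hypothetical.

```python
# Sketch: central-difference estimate of the Lagrange bracket (4.44).
# x(t, u) and p(t, u) return tuples of length n; u is a tuple of parameters.


def lagrange_bracket(x, p, t, u, a, b, h=1e-6):
    def shifted(k, step):
        v = list(u)
        v[k] += step
        return tuple(v)

    def d(f, k):
        # Partial derivatives of all components of f with respect to u_k.
        plus, minus = f(t, shifted(k, h)), f(t, shifted(k, -h))
        return [(fp - fm) / (2.0 * h) for fp, fm in zip(plus, minus)]

    dx_a, dx_b = d(x, a), d(x, b)
    dp_a, dp_b = d(p, a), d(p, b)
    # Sum over i of dx_i/du_a * dp_i/du_b - dx_i/du_b * dp_i/du_a.
    return sum(xa * pb - xb * pa
               for xa, xb, pa, pb in zip(dx_a, dx_b, dp_a, dp_b))


def x_radial(t, u):
    return (u[0] * t, u[1] * t)


def p_radial(t, u):
    # Consistent with p = xdot for L = (xdot1^2 + xdot2^2) / 2.
    return (u[0], u[1])


def p_rotated(t, u):
    # Assumed only to show a non-vanishing bracket.
    return (-u[1], u[0])


t0, u0 = 2.0, (0.7, -0.3)
bracket_radial = lagrange_bracket(x_radial, p_radial, t0, u0, 0, 1)
bracket_rotated = lagrange_bracket(x_radial, p_rotated, t0, u0, 0, 1)
print(bracket_radial, bracket_rotated)
```

For the radial pair the bracket vanishes, in line with Theorem 4.3.1; for the rotated pair it equals −2t, so no function S with $p_i = \partial S/\partial x_i$ can exist for it.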

Furthermore we say that a field is canonical or is an extremal field if the functions (4.45)
satisfy the canonical equations.
The following theorem is a logical consequence of what we have done so far.

Theorem 4.3.2

If a canonical field belongs to a family (4.7) of hypersurfaces then the family is geodesically
equidistant.
Proof:
The functions $S(t, x_i)$ satisfy (4.40). Differentiate (4.40) with respect to t along a member of the
congruence and substitute the first of Hamilton’s equations (4.35) (why can we do this?) to get
\[
-\frac{\partial H}{\partial x_i} = \frac{\partial^2 S}{\partial x_i\, \partial x_j}\, \dot x_j + \frac{\partial^2 S}{\partial x_i\, \partial t}.
\]

By using the second equation in (4.35) we get
\[
\frac{\partial H}{\partial x_i} + \frac{\partial^2 S}{\partial x_i\, \partial x_j} \frac{\partial H}{\partial p_j} + \frac{\partial^2 S}{\partial x_i\, \partial t} = 0,
\]
which we write as
\[
\frac{\partial}{\partial x_i} \left[ \frac{\partial S}{\partial t} + H\left( t, x_j, \frac{\partial S}{\partial x_j} \right) \right] = 0.
\]
Integrating the equation gives us
\[
\frac{\partial S}{\partial t} + H\left( t, x_j, \frac{\partial S}{\partial x_j} \right) = f(t), \tag{4.48}
\]
where f(t) is an arbitrary function (the integration “constant”). Without loss of generality, we can
absorb this function f(t) in the Hamiltonian function H, so that (4.48) becomes the Hamilton-Jacobi
equation. Incidentally, show that the function f(t) can be absorbed by the Hamiltonian
function H (see also Exercises 2.5.25-27). □

Theorem 4.3.2 provides a sort of converse to Hamilton’s equations (4.35) . We can now sum
up as follows. Hamilton’s equations characterize any field which belongs to a family of geodesically
equidistant hypersurfaces.

We illustrate the concepts that we have discussed by means of two simple examples.

Example 4.3.1

Consider a plane pendulum which performs small oscillations. This problem is discussed in [19],
pp. 199-201. The Lagrangian function is given by (m = 1)
\[
L(x, \dot x) = T(\dot x) - V(x) = \frac{1}{2} \dot x^2 - \frac{1}{2} \omega^2 x^2, \tag{4.49}
\]
where ω is a positive constant and $F = -\omega^2 x$ is the force of attraction along the x–axis towards
the origin. We construct the Hamiltonian function, namely (4.4), using (4.1). As a result the
Hamiltonian function is
\[
H(x, p) = \frac{1}{2} p^2 + \frac{1}{2} \omega^2 x^2, \tag{4.50}
\]
and the canonical equations are
\[
\dot p = -\omega^2 x, \qquad \dot x = p. \tag{4.51}
\]
Suppose that the initial conditions are x(0) = A, dx(0)/dt = 0; then the solution of (4.51) follows from
\[
\dot x = p \implies \ddot x = \dot p \implies \ddot x + \omega^2 x = 0 \implies x(t) = c_1 \cos \omega t + c_2 \sin \omega t,
\]
\[
x(0) = c_1 = A \ \text{ and } \ dx(0)/dt = \omega c_2 = 0, \ \text{ so } \ x(t) = A \cos \omega t. \tag{4.52}
\]
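The canonical equations (4.51) can also be integrated directly as a first order system. The sketch below uses a leapfrog scheme, which is one common choice and not something prescribed by the text, and compares the result with the closed form (4.52). The values of A, ω, the end time and the step count are illustrative assumptions.

```python
import math

# Sketch: leapfrog integration of pdot = -omega^2 x, xdot = p, cf. (4.51),
# starting from x(0) = A, p(0) = 0. Parameter values are assumptions.


def integrate_canonical(A, omega, t_end, steps):
    dt = t_end / steps
    x, p = A, 0.0
    for _ in range(steps):
        p -= 0.5 * dt * omega**2 * x   # half step in p
        x += dt * p                    # full step in x
        p -= 0.5 * dt * omega**2 * x   # second half step in p
    return x, p


A, omega, t_end = 1.5, 2.0, 3.0
x_num, p_num = integrate_canonical(A, omega, t_end, 30000)
x_exact = A * math.cos(omega * t_end)            # (4.52)
p_exact = -A * omega * math.sin(omega * t_end)   # p = xdot along (4.52)
energy_num = 0.5 * p_num**2 + 0.5 * omega**2 * x_num**2

print(x_num, x_exact)
print(p_num, p_exact)
print(energy_num, 0.5 * A**2 * omega**2)
```

The last line compares the numerically obtained value of H from (4.50) with $\frac12 A^2\omega^2$, the value it takes at t = 0.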

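As A varies from −∞ to +∞, (4.52) generates a family of extremals in $R^2$ which cover $R^2$ simply,
with the exception of the points
\[
(t, x) = \left( \frac{(2k + 1)\pi}{2\omega},\, 0 \right), \quad k = 0, \pm 1, \pm 2, \ldots.
\]

Consider any member Γ of this family. Since (4.51) is satisfied, it follows that
\[
\begin{aligned}
\frac{dH}{dt} &= \frac{\partial H}{\partial t} + \frac{\partial H}{\partial x}\, \dot x + \frac{\partial H}{\partial p}\, \dot p \\
&= 0 + \omega^2 x \dot x + p \dot p \\
&= 0,
\end{aligned} \tag{4.53}
\]
which implies that H is a constant, say H = E on Γ. The constant will be different for each
member of the family, and since E is determined by A, we can write E as E(A).

Indeed from (4.51) it follows that
\[
p(t) = -A\omega \sin \omega t, \tag{4.54}
\]
so that
\[
E = \frac{1}{2} A^2 \omega^2. \tag{4.55}
\]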
Let $P(t, x) \ne (0, 0)$ be any point on Γ. Then from (4.52)
\[
A = x \sec \omega t, \tag{4.56}
\]
and thus (4.55) becomes
\[
E = \frac{1}{2} \omega^2 x^2 \sec^2 \omega t \tag{4.57}
\]
for E at P on Γ, and from (4.54) and (4.56) we also conclude that
\[
p = -\omega x \tan \omega t. \tag{4.58}
\]

Using (4.50) we can write the Hamilton-Jacobi equation as
\[
\frac{1}{2} \left( \frac{\partial S}{\partial x} \right)^2 + \frac{1}{2} \omega^2 x^2 + \frac{\partial S}{\partial t} = 0. \tag{4.59}
\]
The solution of this equation gives us the family of geodesically equidistant hypersurfaces S.
Substitute (4.58) into (4.24) to get
\[
\frac{\partial S}{\partial x}(t, x) = -\omega x \tan \omega t. \tag{4.60}
\]

Since H = E on Γ, we deduce from (4.57) and the Hamilton-Jacobi equation (4.27) that
\[
\frac{\partial S}{\partial t}(t, x) = -\frac{1}{2} \omega^2 x^2 \sec^2 \omega t. \tag{4.61}
\]
If we integrate (4.60) with respect to x, we see that S is of the form
\[
S(t, x) = -\frac{1}{2} \omega x^2 \tan \omega t + f(t). \tag{4.62}
\]
Note that the integration “constant” is an arbitrary function of t (indeed it is constant with respect
to x!). If we differentiate (4.62) with respect to t and compare the result with (4.61), we see that
f is also constant with respect to t. Without loss of generality we can assume that the constant
is zero. The family of hypersurfaces (4.7) is thus given by
\[
S(t, x) \equiv -\frac{1}{2} \omega x^2 \tan \omega t = \Sigma, \tag{4.63}
\]
which actually consists of curves in $R^2$. The congruence which is represented by (4.52) is transversal
to the curves (4.63). The local nature of the construction is evident from the fact that the points
$((2k + 1)\pi/(2\omega), 0)$ must be excluded (these are the points for which the simple covering is not
possible). Furthermore, since we require that L ≠ 0 (see (4.10)) we must also exclude the points
where $\cos 2\omega t = 0$, that is we also require that $t \ne (2k + 1)\pi/(4\omega)$. We have now constructed the complete
figure for the problem (see Figure 4.5).

Figure 4.5

Let us choose an extremal Γ with a value A as the parameter in (4.52). Along Γ we have
from (4.52) and (4.62)
\[ S(t) = S(t, x(t)) = -\frac{1}{2} A^2 \omega \sin \omega t \cos \omega t = -\frac{1}{4} A^2 \omega \sin 2\omega t. \tag{4.64} \]
If we evaluate L along Γ, as given by (4.49), we get
\[ L(t, x(t), \dot{x}(t)) = -\frac{1}{2} A^2 \omega^2 \cos 2\omega t, \]
and hence using (4.64) we get along Γ
\[ \frac{dS}{dt} = L, \]
which agrees with (4.22).

Furthermore, between any two points (t1, x(t1)) and (t2, x(t2)) such that, for t1 ≤ t ≤ t2,
\[ t \neq \frac{(2k+1)\pi}{2\omega}, \qquad t \neq \frac{(2k+1)\pi}{4\omega}, \qquad k = 0, \pm 1, \pm 2, \ldots, \tag{4.65} \]
we have that
\[ \int_{t_1}^{t_2} L(t, x(t), \dot{x}(t))\, dt = S(t_2) - S(t_1) = \frac{1}{4} A^2 \omega \left[ \sin 2\omega t_1 - \sin 2\omega t_2 \right], \]
which is an example of (4.31). The function S provides the value of the fundamental integral if it
is evaluated along an extremal of the problem, taking (4.65) into consideration. The local nature
of the theory cannot be overemphasized.
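
The relations of this example can be checked numerically. The following is a minimal sketch in
Python (standard library only). It assumes that the Lagrangian (4.49) is L = ẋ²/2 − ω²x²/2, which
is the form suggested by the Hamilton-Jacobi equation (4.59); the function names and the sample
values of A, ω, t1 and t2 are hypothetical and chosen purely for illustration.

```python
import math

def S(t, x, omega):
    """Candidate solution (4.63) of the Hamilton-Jacobi equation (4.59)."""
    return -0.5 * omega * x**2 * math.tan(omega * t)

def hj_residual(t, x, omega, h=1e-5):
    """Left hand side of (4.59), with the partial derivatives of S
    approximated by central differences."""
    dS_dx = (S(t, x + h, omega) - S(t, x - h, omega)) / (2 * h)
    dS_dt = (S(t + h, x, omega) - S(t - h, x, omega)) / (2 * h)
    return 0.5 * dS_dx**2 + 0.5 * omega**2 * x**2 + dS_dt

def action_along_extremal(A, omega, t1, t2, n=2000):
    """Composite Simpson approximation (n even) of the integral of L
    along the extremal x = A cos(omega t)."""
    def lagrangian(t):
        x = A * math.cos(omega * t)
        v = -A * omega * math.sin(omega * t)
        return 0.5 * v**2 - 0.5 * omega**2 * x**2  # assumed form of (4.49)
    h = (t2 - t1) / n
    total = lagrangian(t1) + lagrangian(t2)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * lagrangian(t1 + i * h)
    return total * h / 3

if __name__ == "__main__":
    A, omega, t1, t2 = 1.3, 2.0, 0.1, 0.6   # illustrative values with omega*t < pi/2
    x1, x2 = A * math.cos(omega * t1), A * math.cos(omega * t2)
    print("integral of L  :", action_along_extremal(A, omega, t1, t2))
    print("S(t2) - S(t1)  :", S(t2, x2, omega) - S(t1, x1, omega))
    print("H-J residual   :", hj_residual(0.3, 0.7, omega))
```

The two printed values of the fundamental integral should agree to within the quadrature error,
and the residual of (4.59) should be zero up to the finite-difference error.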

We end our discussion of the problem by taking a look at the paths in the configuration space
(t, x) and in phase space (t, x, p). In the configuration space, the path is represented by (4.56). The
path in the phase space is an ellipse which is “stretched” along the t–axis. The projection of the
path on the (x, p)–plane is an ellipse
\[ \frac{x^2}{A^2} + \frac{p^2}{A^2 \omega^2} = 1 \]
with semi-axes A and Aω. The configuration space in this case is a 2–dimensional subspace
of the 3–dimensional phase space (see Figure 4.5).

Example 4.3.2

The second example is known as the two body problem and is discussed in [7], pp. 74-76. The
problem is a special case of the n–body problem which is still the subject of active research in
physics. Consider the motion of two bodies with masses m1 and m2 respectively and assume that
Newton’s law of gravity holds between them, namely
\[ F = \frac{G m_1 m_2}{(r_1 + r_2)^2}, \]
where r1 and r2 are the distances of m1 and m2 from the centre of mass C. Since m1 r1 = m2 r2 we
have that
\[ F = \frac{G m_1 M_2}{(r_1)^2}, \qquad M_2 = \frac{m_2^3}{(m_1 + m_2)^2}. \]

In this way we have reduced the problem to one with fixed mass M2 which attracts a mass m1 at
a distance r1. Choose C as the origin and take x and y as the coordinates in the plane which is
determined by the initial position and velocity of m1. Let m1 = 1 so that the kinetic energy is
given by
\[ T = \frac{1}{2}\left( \dot{x}^2 + \dot{y}^2 \right), \]
and the potential energy by
\[ V = -\frac{k^2}{\sqrt{x^2 + y^2}}, \]
so that the Lagrangian function is given by
\[ L = T - V = \frac{1}{2}\left( \dot{x}^2 + \dot{y}^2 \right) + \frac{k^2}{\sqrt{x^2 + y^2}}. \]

The canonical variables are introduced by using (2.1), namely
\[ p = \frac{\partial L}{\partial \dot{x}}, \qquad q = \frac{\partial L}{\partial \dot{y}}. \]
The Hamiltonian function becomes
\[ H(x, y, p, q) = \frac{1}{2}\left( p^2 + q^2 \right) - \frac{k^2}{\sqrt{x^2 + y^2}}. \]

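The Hamilton-Jacobi equation (4.27) becomes
\[ \frac{\partial S}{\partial t} + \frac{1}{2}\left[ \left(\frac{\partial S}{\partial x}\right)^2 + \left(\frac{\partial S}{\partial y}\right)^2 \right] - \frac{k^2}{\sqrt{x^2 + y^2}} = 0. \tag{4.66} \]

Transforming to polar coordinates x = r cos θ and y = r sin θ transforms (4.66) to
\[ \frac{\partial S}{\partial t} + \frac{1}{2}\left[ \left(\frac{\partial S}{\partial r}\right)^2 + \frac{1}{r^2}\left(\frac{\partial S}{\partial \theta}\right)^2 \right] = \frac{k^2}{r}. \tag{4.67} \]

A two parameter solution of (4.67) is found by assuming that the solution is of the form
\[ S = \alpha t + \beta \theta + R(r). \]

Substitute this in (4.67) and by solving for R(r) we get
\[ S = \alpha t + \beta \theta + \int_{r_0}^{r} \sqrt{\frac{2k^2}{\rho} - \frac{\beta^2}{\rho^2} - 2\alpha}\; d\rho. \tag{4.68} \]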

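The extremals are obtained by using the theorem stated in Exercise 4.6.9. Differentiate (4.68)
with respect to α and β to find the extremals. We get
\[ t - \int_{r_0}^{r} \frac{d\rho}{\sqrt{\frac{2k^2}{\rho} - \frac{\beta^2}{\rho^2} - 2\alpha}} = t_0, \tag{4.69} \]
and
\[ \theta - \beta \int_{r_0}^{r} \frac{d\rho}{\rho^2 \sqrt{\frac{2k^2}{\rho} - \frac{\beta^2}{\rho^2} - 2\alpha}} = \theta_0, \tag{4.70} \]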
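with t0 and θ0 as two arbitrary constants. Equation (4.70) gives θ as a function of r, which is
the path of the body, and (4.69) gives r as a function of the time t. Equation (4.70) can
be integrated by substituting ρ = 1/σ, so that
\[ \theta = \theta_0 - \arcsin\left( \frac{\frac{\beta^2}{k^2 r} - 1}{\sqrt{1 - \frac{2\alpha\beta^2}{k^4}}} \right). \]

We write
\[ \delta = \frac{\beta^2}{k^2}, \qquad \varepsilon = \sqrt{1 - \frac{2\alpha\beta^2}{k^4}}, \]
then
\[ r = \frac{\delta}{1 - \varepsilon \sin(\theta - \theta_0)}, \]
which is the polar equation of a conic section with eccentricity ε.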
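
As a plausibility check, the orbit can be tested numerically against the differential form of
(4.70), namely (dr/dθ)² = r⁴ (2k²/r − β²/r² − 2α)/β². The sketch below (Python, standard library
only) is an illustration under stated assumptions: it treats a bound orbit with α > 0, reads the
eccentricity off the two turning points at which the expression under the root in (4.68)
vanishes, and uses hypothetical function names and sample values of k, α and β.

```python
import math

def radicand(rho, k, alpha, beta):
    """Expression under the square root in (4.68)-(4.70)."""
    return 2 * k**2 / rho - beta**2 / rho**2 - 2 * alpha

def conic_parameters(k, alpha, beta):
    """Return (delta, eps) for a bound orbit (alpha > 0 assumed).

    The turning points r_min and r_max are the roots of radicand = 0, i.e. of
    2*alpha*rho**2 - 2*k**2*rho + beta**2 = 0; the eccentricity of the ellipse
    is then (r_max - r_min) / (r_max + r_min)."""
    disc = math.sqrt(k**4 - 2 * alpha * beta**2)
    r_min = (k**2 - disc) / (2 * alpha)
    r_max = (k**2 + disc) / (2 * alpha)
    delta = beta**2 / k**2
    eps = (r_max - r_min) / (r_max + r_min)
    return delta, eps

def orbit_radius(theta, delta, eps, theta0=0.0):
    """Conic r = delta / (1 - eps * sin(theta - theta0))."""
    return delta / (1 - eps * math.sin(theta - theta0))

def orbit_equation_residual(theta, k, alpha, beta, h=1e-6):
    """(dr/dtheta)**2 - r**4 * radicand(r) / beta**2, which should vanish
    along a solution of (4.70); dr/dtheta is a central difference."""
    delta, eps = conic_parameters(k, alpha, beta)
    r = orbit_radius(theta, delta, eps)
    dr = (orbit_radius(theta + h, delta, eps)
          - orbit_radius(theta - h, delta, eps)) / (2 * h)
    return dr**2 - r**4 * radicand(r, k, alpha, beta) / beta**2

if __name__ == "__main__":
    k, alpha, beta = 1.0, 0.18, 1.0          # illustrative values only
    print("delta, eps:", conic_parameters(k, alpha, beta))
    for theta in (0.3, 1.0, 2.5, 4.0):
        print(theta, orbit_equation_residual(theta, k, alpha, beta))
```

The residuals should be zero up to the finite-difference error, which indicates that the conic
is indeed a solution curve of (4.70) for the chosen constants.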

4.4 The Weierstrass conditions

One of the most important properties of extremal fields which belong to geodesically equidistant
hypersurfaces is that this field defines a line integral which is path independent. This line integral
is instrumental in obtaining a sufficient condition which an extremal must satisfy to provide an
extreme value for the fundamental integral (4.11). This section deals with these topics. The
diagrammatic representations which help to explain the theory consist once again of 1–dimensional
hypersurfaces S(t, x) = Σ in R², and these should enable you to grasp what is going on.

Assume that we are given a family of geodesically equidistant hypersurfaces such that the
quantities S(t, x_i), p_i = ∂S/∂x_i are uniquely defined at each point in the region G. Consider the
points P1 and P2 on S(t, x_i) = Σ1 and S(t, x_i) = Σ2 respectively, and the extremals Γ1 and Γ2 which
are members of the congruence which belongs to the hypersurfaces S and which pass through P1
and P2. Consider an arbitrary non–self–intersecting class C¹ curve C which joins P1 and P2 and
lies entirely in G. Let C intersect a member Γ of the congruence at P (between the two chosen
hypersurfaces) and denote the tangent vector to C at P, in the direction from P1 to P2, by (1, x′_i).
The geodesic gradient is denoted by (1, dx_i/dt), which refers to the tangent to Γ at this point
(see Figure 4.6).

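Consider the integral J of dS along C, namely
\[ J = \int_{C,\,P_1}^{P_2} dS = \Sigma_2 - \Sigma_1, \tag{4.71} \]
but we also have that
\[ J = \int_{C,\,P_1}^{P_2} \left( \frac{\partial S}{\partial x_i} x'_i + \frac{\partial S}{\partial t} \right) dt. \tag{4.72} \]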

Figure 4.6

We can now write (4.72) as
\[ J = \int_{C,\,P_1}^{P_2} \left( p_i x'_i - H(t, x_i, p_i) \right) dt, \tag{4.73} \]

where we used (4.24) and the Hamilton-Jacobi equation (we could do this since the family of
hypersurfaces is geodesically equidistant). As a result of (4.71), the integral (4.72) is independent
of the curve C which joins P1 and P2. It should be emphasized that the p_i in the integrand of
(4.73) refer to the given extremal field, while the x′_i refer to the curve C. By using (4.1) and (4.4)
we write J as
\[ J = \int_{C,\,P_1}^{P_2} \left( L(t, x_i, \dot{x}_i) + \frac{\partial L(t, x_i, \dot{x}_i)}{\partial \dot{x}_i} \left( x'_i - \dot{x}_i \right) \right) dt, \tag{4.74} \]

which we call the independent integral of Hilbert. We see that the x′_i refer to the arbitrary
curve C while the dx_i/dt refer to the given extremal field. A Mayer-field can be defined as any field
\[ x_i = x_i(t, u_\alpha), \qquad p_i = p_i(t, u_\alpha) \tag{4.75} \]
which is such that the integral (4.74) is path independent. We have shown above that an extremal
field which belongs to a family of geodesically equidistant hypersurfaces is also a Mayer-field.
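
The path independence of (4.74) can be illustrated numerically with the field of the harmonic
oscillator example, whose slope function is given by (4.58) and whose hypersurfaces are given by
(4.63). The following Python sketch (standard library only) is an illustration under stated
assumptions: it takes L = ẋ²/2 − ω²x²/2 with ω = 1, and the two comparison curves, the endpoints
and all function names are hypothetical choices.

```python
import math

OMEGA = 1.0  # assumed angular frequency for this illustration

def S(t, x):
    """Hypersurface function (4.63)."""
    return -0.5 * OMEGA * x**2 * math.tan(OMEGA * t)

def field_slope(t, x):
    """Slope dx/dt of the field extremal through (t, x); equals p in (4.58)."""
    return -OMEGA * x * math.tan(OMEGA * t)

def hilbert_integrand(t, x, x_prime):
    """Integrand of (4.74) for L = v**2/2 - OMEGA**2 * x**2/2, where dL/dv = v."""
    v = field_slope(t, x)
    lagrangian = 0.5 * v**2 - 0.5 * OMEGA**2 * x**2
    return lagrangian + v * (x_prime - v)

def hilbert_integral(curve, curve_dot, t1, t2, n=2000):
    """Composite Simpson rule (n even) for (4.74) along x = curve(t)."""
    def f(t):
        return hilbert_integrand(t, curve(t), curve_dot(t))
    h = (t2 - t1) / n
    total = f(t1) + f(t2)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(t1 + i * h)
    return total * h / 3

# Two different curves joining P1 = (0.2, 1.0) and P2 = (1.0, 0.5).
def line(t):
    return 1.0 - 0.625 * (t - 0.2)

def line_dot(t):
    return -0.625

def bent(t):
    return line(t) + 0.8 * (t - 0.2) * (t - 1.0)

def bent_dot(t):
    return -0.625 + 0.8 * (2.0 * t - 1.2)

if __name__ == "__main__":
    print("along the straight line:", hilbert_integral(line, line_dot, 0.2, 1.0))
    print("along the bent curve   :", hilbert_integral(bent, bent_dot, 0.2, 1.0))
    print("S(P2) - S(P1)          :", S(1.0, 0.5) - S(0.2, 1.0))
```

All three printed numbers should coincide up to the quadrature error, in agreement with (4.71).
The interval 0.2 ≤ t ≤ 1.0 was chosen so that ωt stays below π/2, where the field is defined.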

The converse is also true. Suppose that we are given a field (4.75) which, when substituted
into (4.74), makes the integral independent of the path C. This is only possible if
\[ p_i(t, u_\alpha)\, dx_i - H(t, x_j, p_j(t, u_\alpha))\, dt \tag{4.76} \]
is an exact differential. According to the first of (4.75) we have that
\[ dx_i = \dot{x}_i\, dt + \frac{\partial x_i}{\partial u_\alpha}\, du_\alpha \]
along C, where dx_i/dt has its usual meaning since u_α = constant along any member of the congruence
defined by the field. The expression (4.76) must then be written as
\[ p_i(t, u_\alpha) \frac{\partial x_i}{\partial u_\alpha}\, du_\alpha + \left[ p_i(t, u_\alpha) \dot{x}_i - H(t, x_j, p_j(t, u_\alpha)) \right] dt. \tag{4.77} \]

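This will be an exact differential dσ(t, u_α) if and only if
\[ \frac{\partial}{\partial u_\alpha} \left[ p_i(t, u_\beta)\, \dot{x}_i(t, u_\beta) - H(t, x_j(t, u_\beta), p_j(t, u_\beta)) \right] = \frac{\partial}{\partial t} \left( p_i(t, u_\beta) \frac{\partial x_i}{\partial u_\alpha} \right), \tag{4.78} \]
and
\[ \frac{\partial}{\partial u_\beta} \left( p_i \frac{\partial x_i}{\partial u_\alpha} \right) = \frac{\partial}{\partial u_\alpha} \left( p_i \frac{\partial x_i}{\partial u_\beta} \right). \]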
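The last condition leads directly to (4.43), and according to Theorem 4.3.1 this implies that the
field (4.75) belongs to a family of hypersurfaces. The left hand side of (4.78) is
\[ p_i \frac{\partial^2 x_i}{\partial t\, \partial u_\alpha} + \dot{x}_i \frac{\partial p_i}{\partial u_\alpha} - \frac{\partial H}{\partial x_i} \frac{\partial x_i}{\partial u_\alpha} - \frac{\partial H}{\partial p_i} \frac{\partial p_i}{\partial u_\alpha} = -\frac{\partial H}{\partial x_i} \frac{\partial x_i}{\partial u_\alpha} + p_i \frac{\partial^2 x_i}{\partial t\, \partial u_\alpha}, \tag{4.79} \]
since the second and fourth terms cancel out as a result of the identities (4.5) and the relation (4.3).
The right hand side of (4.78) is
\[ \dot{p}_i \frac{\partial x_i}{\partial u_\alpha} + p_i \frac{\partial^2 x_i}{\partial t\, \partial u_\alpha}. \]
From this and (4.79) it follows that the condition (4.78) is satisfied if and only if
\[ \dot{p}_i \frac{\partial x_i}{\partial u_\alpha} = -\frac{\partial H}{\partial x_i} \frac{\partial x_i}{\partial u_\alpha}. \]
From (4.38) we see that the first set of Hamilton’s equations (4.35) is satisfied. Our field is thus
canonical (or is an extremal field). Since the function σ(t, u_α) ensures that (4.77) is an exact
differential dσ(t, u_α), and this implies (4.46), a function S(t, x_j) exists for which (4.40) holds, with
S(t, x_j) = σ(t, u_α). According to Theorem 4.3.2, S satisfies the Hamilton-Jacobi equation. We
have now proved: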

Theorem 4.4.1

A Mayer-field is a canonical field which belongs to a family of geodesically equidistant hypersurfaces,
and conversely.

If we combine this result with the theorems of section 4.3 then we have

Theorem 4.4.2

For a field (4.75) to be a Mayer field, it is necessary and sufficient that the Lagrange brackets
[u_α, u_β] be identically zero and that the functions (4.75) satisfy Hamilton’s equations.

Figure 4.7

It is now reasonably simple to construct a sufficient condition in terms of a Mayer-field. We
start with the integral (4.74) where the endpoints P1 and P2 are arbitrary in G, while C is an
arbitrary class D¹ curve which joins them. The point P1 on the first hypersurface S(t, x_j) = Σ1
of the geodesically equidistant family uniquely determines a member Γ of the canonical field which
belongs to the family, namely that member which passes through P1. This curve will intersect the
second hypersurface S(t, x_j) = Σ2 in a point which we now call P2 (which is no longer arbitrary). See
Figure 4.7, which represents the situation in R².

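As a result of (4.71) and (4.74) we have
\[ \int_{C,\,P_1}^{P_2} \left( L(t, x_j, \dot{x}_j) + \frac{\partial L(t, x_j, \dot{x}_j)}{\partial \dot{x}_i} \left( x'_i - \dot{x}_i \right) \right) dt = \Sigma_2 - \Sigma_1. \tag{4.80} \]

For the special case where C, which joins P1 and P2, coincides with Γ we have that x′_i − dx_i/dt = 0,
and (4.80) becomes (4.26). In general we have that
\[ \int_{C,\,P_1}^{P_2} \left( L(t, x_j, \dot{x}_j) + \frac{\partial L(t, x_j, \dot{x}_j)}{\partial \dot{x}_i} \left( x'_i - \dot{x}_i \right) \right) dt = \int_{\Gamma,\,P_1}^{P_2} L(t, x_j, \dot{x}_j)\, dt. \tag{4.81} \]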

It must be emphasized that the x′_i denote the components of the tangential vector along C.

At each point of the curve C we have in effect defined two directions dx_i/dt and x′_i. If we
want to solve the problem of extreme values for the fundamental integral we need to use the integral
of L(t, x_j, x′_j) along C and not that of L(t, x_j, dx_j/dt) along Γ as in (4.81). It is thus necessary
to compare the values of L in the different directions at each point of C with each other. We do this
using the second mean value theorem. Since L is of class C² we have
\[ L(t, x_j, x'_j) = L(t, x_j, \dot{x}_j) + \frac{\partial L(t, x_j, \dot{x}_j)}{\partial \dot{x}_j} \left( x'_j - \dot{x}_j \right) + \frac{1}{2} \frac{\partial^2 L(t, x_j, \zeta_j)}{\partial \dot{x}_j\, \partial \dot{x}_h} \left( x'_j - \dot{x}_j \right) \left( x'_h - \dot{x}_h \right), \tag{4.82} \]
with
\[ \zeta_h = \dot{x}_h + \theta \left( x'_h - \dot{x}_h \right), \qquad 0 < \theta < 1. \tag{4.83} \]

Note that the first two terms on the right hand side of (4.82) are identical to the integrand on
the left hand side of (4.81).

We define the following function of (3n + 1) variables, namely
\[ E(t, x_j, \dot{x}_j, x'_j) = L(t, x_j, x'_j) - L(t, x_j, \dot{x}_j) - \frac{\partial L(t, x_j, \dot{x}_j)}{\partial \dot{x}_j} \left( x'_j - \dot{x}_j \right). \tag{4.84} \]

This is the well known Weierstrass excess function named after Karl Weierstrass who
first formulated it in the case when n = 1. This function leads to a criterion for the extreme
values of the fundamental integral.

We can now prove the sufficient condition of Weierstrass.

Theorem 4.4.3

Let Γ be a member of a Mayer-field which completely covers a region G of R^{n+1} containing
Γ and its endpoints P1 and P2, with the geodesic gradient determined by the Mayer-field
and denoted by dx_i/dt. If
\[ E(t, x_i, \dot{x}_i, x'_i) > 0 \tag{4.85} \]
for all sets of values x′_i ≠ dx_i/dt at all points (t, x_i) of G, then Γ provides a strong minimum to the
fundamental integral
\[ I = \int_{P_1}^{P_2} L(t, x_j, \dot{x}_j)\, dt \tag{4.86} \]
relative to all class D¹ curves joining P1 and P2, and contained in G.

Proof:
Eliminate the integrand on the left hand side of (4.81) using (4.84), so that
\[ \int_{C,\,P_1}^{P_2} L(t, x_j, x'_j)\, dt = \int_{\Gamma,\,P_1}^{P_2} L(t, x_j, \dot{x}_j)\, dt + \int_{C,\,P_1}^{P_2} E(t, x_j, \dot{x}_j, x'_j)\, dt. \tag{4.87} \]
A Mayer-field is also a canonical field, so that Theorem 4.4.2 applies. By condition (4.85) the last
integral in (4.87) is positive whenever C differs from Γ, and the result follows directly. □
The expression (4.87) is often referred to as the fundamental formula of the variational
calculus. The condition (4.85) is fundamental in the variational calculus. If we compare (4.82)
and (4.84) with each other, we see that
\[ E(t, x_j, \dot{x}_j, x'_j) = \frac{1}{2} \frac{\partial^2 L(t, x_j, \zeta_j)}{\partial \dot{x}_i\, \partial \dot{x}_h} \left( x'_i - \dot{x}_i \right) \left( x'_h - \dot{x}_h \right). \tag{4.88} \]
Note that the ζ_j on the right hand side must satisfy (4.83). If the second derivatives of L do not
depend on ζ_j, we can also use this form of the excess function. Incidentally the form of (4.88) has
very close ties with the so-called Legendre-condition which we shall deal with a little later.

As you have probably noticed, Theorem 4.4.3 is concerned with a sufficiency condition.
Can we also formulate a necessary condition? We can also ask the question whether we can
formulate a theorem similar to Theorem 4.4.3 for a single extremal, that is, a single extremal which
satisfies either Hamilton’s equations or the Euler-Lagrange equations. We shall show that this
is indeed the case, on condition that the extremal can be imbedded in a Mayer-field. We do this
indirectly in the following theorem where we prove the necessary condition of Weierstrass.

Theorem 4.4.4

If an extremal Γ of class C² which joins two points P1 and P2 yields a strong minimum to the
integral (4.86) relative to all neighboring curves of class D¹ joining P1 and P2, then
\[ E(t, x_i, \dot{x}_i, x'_i) \geq 0 \tag{4.89} \]
for all (t, x_i) on Γ and for any set x′_j ≠ dx_j/dt.

Proof:
Let Γ be an extremal joining the points P1(t1, x_{i,1}) and P2(t2, x_{i,2}) and suppose that there is a
point P(τ, x_i(τ)) on Γ (t1 < τ < t2) for which a direction w_j exists such that
\[ E(\tau, x_j(\tau), \dot{x}_j(\tau), w_j) < 0, \tag{4.90} \]
while the “other” direction w_j is not the same as the tangent to Γ at P, namely dx_i(τ)/dt. We
shall prove that under these conditions the extremal Γ cannot yield an extreme value to the
integral (4.86). We construct a comparison curve which yields a smaller value for (4.86).

The curve Γ is represented by
\[ \Gamma : x_i = x_i(t), \tag{4.91} \]

while the curve C of class C¹ which passes through P(τ, x_i(τ)) on Γ is represented by
\[ C : \zeta_i = \zeta_i(t), \tag{4.92} \]
on which the tangent at P is given by the direction w_i as in (4.90). This implies that
\[ x_i(\tau) = \zeta_i(\tau), \qquad \frac{d\zeta_i}{dt} = w_i \quad \text{for } t = \tau. \tag{4.93} \]
Define a family of curves K(u) by
\[ x_i = x_i(t, u) = \frac{t - t_1}{u - t_1} \left( \zeta_i(u) - x_i(u) \right) + x_i(t), \tag{4.94} \]
with u > t1 as the parameter. As a result of (4.93) the curve K(τ) is given by
\[ x_i = x_i(t, \tau) = x_i(t), \tag{4.95} \]
which coincides with Γ (see (4.91)). We also have for t = t1 in (4.94)
\[ x_i(t_1, u) = x_i(t_1) \tag{4.96} \]
for all values of u. We deduce that all curves of the family (4.94) go through P1, and if we
differentiate (4.96) with respect to u, then
\[ \frac{\partial x_i(t_1, u)}{\partial u} = 0. \tag{4.97} \]
Consider a point Q with coordinates (u, ζ_i(u)) on C. Let t = u in (4.94); then we see that
\[ x_i(u, u) = \zeta_i(u), \tag{4.98} \]
which means that Q lies on the curve K(u). Differentiate (4.94) with respect to u and let t = u = τ;
then, using (4.93), we get
\[ \left[ \frac{\partial x_i(t, u)}{\partial u} \right]_{t=u=\tau} + \left[ \dot{x}_i \right]_{t=\tau} = w_i. \tag{4.99} \]
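The integral I(u) of the Lagrangian function L along the curve K(u) from P1 to Q and then
along C to P is given by
\[ I(u) = \int_{K(u),\,t_1}^{u} L\left( t, x_i(t, u), \frac{\partial x_i(t, u)}{\partial t} \right) dt + \int_{C,\,u}^{\tau} L\left( t, \zeta_i, \frac{d\zeta_i}{dt} \right) dt. \tag{4.100} \]
Differentiate (4.100) with respect to u, and using (4.93) and (4.96), we get
\[ \left[ \frac{dI}{du} \right]_{u=\tau} = L(\tau, x_i, \dot{x}_i) - L(\tau, x_i, w_i) + \int_{\Gamma,\,t_1}^{\tau} \left[ \frac{\partial L}{\partial x_i} \frac{\partial x_i}{\partial u} + \frac{\partial L}{\partial \dot{x}_i} \frac{\partial^2 x_i}{\partial t\, \partial u} \right]_{u=\tau} dt. \tag{4.101} \]
Since Γ is an extremal we can evaluate the integral on the right hand side of (4.101) by substituting
for ∂L/∂x_i from the Euler-Lagrange expression (4.36). The relevant term is
\[ \int_{\Gamma,\,t_1}^{\tau} \frac{d}{dt}\left( \frac{\partial L}{\partial \dot{x}_i} \right) \frac{\partial x_i}{\partial u}\, dt = \left[ \frac{\partial L}{\partial \dot{x}_i} \frac{\partial x_i}{\partial u} \right]_{t_1}^{\tau} - \int_{t_1}^{\tau} \frac{\partial L}{\partial \dot{x}_i} \frac{\partial^2 x_i}{\partial t\, \partial u}\, dt = \left( w_i - \left[ \dot{x}_i \right]_{t=\tau} \right) \left[ \frac{\partial L}{\partial \dot{x}_i} \right]_{t=\tau} - \int_{t_1}^{\tau} \frac{\partial L}{\partial \dot{x}_i} \frac{\partial^2 x_i}{\partial t\, \partial u}\, dt, \]
where we used (4.97) and (4.99) in the last step. Substituting this into (4.101) gives us
\[ \left[ \frac{dI}{du} \right]_{u=\tau} = \left[ L(t, x_i, \dot{x}_i) - L(t, x_i, w_i) - \frac{\partial L}{\partial \dot{x}_i} \left( \dot{x}_i - w_i \right) \right]_{t=\tau} = -E(t, x_i, \dot{x}_i, w_i) \big|_{t=\tau}, \]
where we used (4.84) in the last step. From our assumption (4.90) we get
\[ \left[ \frac{dI}{du} \right]_{u=\tau} > 0. \tag{4.102} \]
Since Γ provides an extreme value to the fundamental integral, [dI/du] = 0 for u = τ, which
contradicts (4.102). □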

Let’s return to the Legendre condition which we referred to earlier. To start, the sufficiency
condition of Weierstrass (4.85) implies a strong minimum for the fundamental integral. For a
weak minimum it is enough that (4.85) holds in its weak form, that is for values of x′_i sufficiently
close to dx_i/dt. The necessary condition of Legendre is that for a weak minimum (and also
for a strong minimum since we are dealing with necessary conditions) it is necessary that the
quadratic form
\[ \frac{\partial^2 L(t, x_h, \dot{x}_h)}{\partial \dot{x}_i\, \partial \dot{x}_j}\, \eta_i \eta_j \]
should be non-negative (positive semi-definite). We are, however, interested in the sufficient
condition of Legendre for the purposes of the exercises that we are going to do.
Theorem 4.4.5

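If an extremal x_h = χ_h(t), t1 ≤ t ≤ t2, can be imbedded in a Mayer-field and if
\[ \frac{\partial^2 L(t, \chi_h, \dot{\chi}_h)}{\partial \dot{x}_i\, \partial \dot{x}_j}\, \eta_i \eta_j > 0, \qquad t_1 \leq t \leq t_2, \tag{4.103} \]
for all values of η_i for which
\[ \sum_{i=1}^{n} \left[ \eta_i \right]^2 \neq 0, \]
then the extremal yields a weak minimum.

If n = 1, then condition (4.103) just becomes
\[ \frac{\partial^2 L(t, \chi, \dot{\chi})}{\partial \dot{x}\, \partial \dot{x}} > 0. \tag{4.104} \]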
Many of our examples and exercises are in R² (n = 1), so that (4.104) is applicable. Note, however,
that the sufficient condition of Legendre only applies to weak extreme values. As we
indicated in (4.103) and (4.104), the arguments of L must be calculated along the relevant
extremal whenever the Legendre condition is applied. The Weierstrass conditions (4.85) and (4.89),
on the other hand, refer to the field and to the gradient of an arbitrary curve.

We now discuss a few examples of the application of the necessary and sufficient conditions of
Weierstrass and Legendre. The first one is very important and places Hamilton’s principle,
discussed at length in Chapter 1, in the correct perspective.

Example 4.4.1

Consider a holonomic dynamical system where the kinetic energy can be expressed as a quadratic
form and the potential energy only depends on the time and displacement. The Lagrangian function
is of the form
\[ L(t, x_i, \dot{x}_i) = \frac{1}{2} a_{ij}(t, x_h)\, \dot{x}_i \dot{x}_j - V(t, x_h). \tag{4.105} \]
If we calculate the Weierstrass excess function (4.88) then we obtain
\[ E(t, x_j, \dot{x}_j, x'_j) = \frac{1}{2} a_{ij}(t, x_h) \left( \dot{x}_i - x'_i \right) \left( \dot{x}_j - x'_j \right). \tag{4.106} \]

We can assume that the kinetic energy of any dynamical system is positive, so that (4.106) is a
positive definite quadratic form. In this case we see that Weierstrass’s sufficient condition is
satisfied as a result of the positive definite character of the kinetic energy of the system. An extremal
field of a holonomic dynamical conservative system with Lagrangian function (4.105) provides a
minimum value for the time-integral of the difference between the kinetic and potential energies.

Example 4.4.2

Consider the double pendulum in Figure 4.8 which performs small, slow oscillations about the
vertical. Consider the angles indicated to be the generalized coordinates q1 and q2. Once again
some steps have been left out, which you need to supply.

Figure 4.8

Choose the potential energy so that V(q1, q2) at the point (0, 0) is zero. Since we only consider
small oscillations we can neglect third powers of q_i, so that the potential energy can be written as
\[ V(q_1, q_2) = \frac{1}{2} m g \ell \left[ 2 (q_1)^2 + (q_2)^2 \right], \]
while the kinetic energy can be expressed as
\[ T(\dot{q}_1, \dot{q}_2) = \frac{1}{2} m \ell^2 \left[ 2 (\dot{q}_1)^2 + 2 \dot{q}_1 \dot{q}_2 + (\dot{q}_2)^2 \right]. \]
Once again ensure that you can derive these two equations. The Lagrangian function is given by
L = T − V, and if we choose the constants to be one, then the Lagrangian function is given by
\[ L(q_1, q_2, \dot{q}_1, \dot{q}_2) = 2 (\dot{q}_1)^2 + 2 \dot{q}_1 \dot{q}_2 + (\dot{q}_2)^2 - 2 (q_1)^2 - (q_2)^2. \tag{4.107} \]

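The Lagrangian function (4.107) is of the form (4.105) so that we can use the form (4.106) of the
excess function (4.88). Here we have
\[ \left[ \frac{\partial^2 L}{\partial \dot{q}_i\, \partial \dot{q}_j} \right] = \begin{pmatrix} \frac{\partial}{\partial \dot{q}_1}\left( \frac{\partial L}{\partial \dot{q}_1} \right) & \frac{\partial}{\partial \dot{q}_1}\left( \frac{\partial L}{\partial \dot{q}_2} \right) \\ \frac{\partial}{\partial \dot{q}_2}\left( \frac{\partial L}{\partial \dot{q}_1} \right) & \frac{\partial}{\partial \dot{q}_2}\left( \frac{\partial L}{\partial \dot{q}_2} \right) \end{pmatrix} = \begin{pmatrix} \frac{\partial}{\partial \dot{q}_1}(4\dot{q}_1 + 2\dot{q}_2) & \frac{\partial}{\partial \dot{q}_1}(2\dot{q}_1 + 2\dot{q}_2) \\ \frac{\partial}{\partial \dot{q}_2}(4\dot{q}_1 + 2\dot{q}_2) & \frac{\partial}{\partial \dot{q}_2}(2\dot{q}_1 + 2\dot{q}_2) \end{pmatrix} = \begin{pmatrix} 4 & 2 \\ 2 & 2 \end{pmatrix}, \]
so that
\[ \begin{aligned} E(q_1, q_2, \dot{q}_1, \dot{q}_2, \xi_1, \xi_2) &= L(q_1, q_2, \xi_1, \xi_2) - L(q_1, q_2, \dot{q}_1, \dot{q}_2) \\ &\quad - \left[ \frac{\partial L(q_1, q_2, \dot{q}_1, \dot{q}_2)}{\partial \dot{q}_1} (\xi_1 - \dot{q}_1) + \frac{\partial L(q_1, q_2, \dot{q}_1, \dot{q}_2)}{\partial \dot{q}_2} (\xi_2 - \dot{q}_2) \right] \\ &= 2 (\xi_1 - \dot{q}_1)^2 + 2 (\xi_1 - \dot{q}_1)(\xi_2 - \dot{q}_2) + (\xi_2 - \dot{q}_2)^2 \\ &\geq 0. \end{aligned} \tag{4.108} \]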

Equality holds only if


ξ1 − q̇1 = 0 = ξ2 − q̇2 .

Thus any extremal in this problem provides a strong (local) minimum for the fundamental
integral. If we used the Legendre condition here, although its form is identical to (4.108), the result
would be a weak minimum.
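
The algebra leading to (4.108) can be spot-checked numerically. The sketch below (Python,
standard library only) evaluates the excess function directly from the definition (4.84) for the
Lagrangian (4.107) and compares it with the quadratic form in (4.108), rewritten as a sum of two
squares, which makes its sign evident. The function names and the random sampling range are
hypothetical choices made for illustration.

```python
import random

def lagrangian(q1, q2, v1, v2):
    """Lagrangian (4.107) with velocities v1, v2."""
    return 2 * v1**2 + 2 * v1 * v2 + v2**2 - 2 * q1**2 - q2**2

def momenta(v1, v2):
    """Partial derivatives of (4.107) with respect to the two velocities."""
    return 4 * v1 + 2 * v2, 2 * v1 + 2 * v2

def excess(q1, q2, v1, v2, xi1, xi2):
    """Weierstrass excess function, computed from the definition (4.84)."""
    p1, p2 = momenta(v1, v2)
    return (lagrangian(q1, q2, xi1, xi2) - lagrangian(q1, q2, v1, v2)
            - p1 * (xi1 - v1) - p2 * (xi2 - v2))

def sum_of_squares(d1, d2):
    """2*d1**2 + 2*d1*d2 + d2**2 rewritten as (d1 + d2)**2 + d1**2."""
    return (d1 + d2)**2 + d1**2

def max_discrepancy(samples=1000, seed=0):
    """Largest difference between (4.84) and the closed form (4.108) over
    randomly chosen arguments."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        q1, q2, v1, v2, xi1, xi2 = (rng.uniform(-3.0, 3.0) for _ in range(6))
        gap = excess(q1, q2, v1, v2, xi1, xi2) - sum_of_squares(xi1 - v1, xi2 - v2)
        worst = max(worst, abs(gap))
    return worst

if __name__ == "__main__":
    print("largest discrepancy:", max_discrepancy())
```

Writing the form as (d1 + d2)² + d1² shows at a glance that it vanishes only when both differences
vanish, which is the equality case stated after (4.108).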

Example 4.4.3

In this example we use the excess function (4.84). Consider a problem in the variational calculus
with a Lagrangian function
\[ L(x, \dot{x}) = \dot{x}^3. \tag{4.109} \]

The form of the Weierstrass excess function (4.84) (why must we use this form and not the other
one as we did above?) is given by
\[ E(x, \dot{x}, x') = x'^3 - \dot{x}^3 - 3\dot{x}^2 \left( x' - \dot{x} \right) = \left( x' - \dot{x} \right)^2 \left( x' + 2\dot{x} \right). \]

If we are given the form of the Mayer-field in which the extremal is imbedded, we can determine
the sign of E. By using the Euler-Lagrange equations we find that the extremal is given by a
straight line, namely
x = at + b.

Taking the origin as the starting point, we get b = 0. The other point is P2(t2, x2), with t2 > 0
and x2 > 0. The extremal is then given by
\[ \Gamma : x = kt, \qquad k = \frac{x_2}{t_2}. \]
The extremal is now imbedded in the Mayer-field x = kt + c, which consists of all straight lines
parallel to Γ (first convince yourself that this is a Mayer field). As c varies we get different members
of the field and the whole (t, x)–plane is covered (see Figure 4.9).

Figure 4.9

For this Mayer field the excess function is
\[ E(x, k, x') = \left( x' - k \right)^2 \left( x' + 2k \right). \]

It is obvious that E(x, k, x′) > 0 for all x′ > −2k with x′ ≠ k (recall that k > 0). Hence Γ provides
a weak minimum for the fundamental integral with Lagrangian function (4.109) in the family of class
C¹ curves C : x = x(t) which pass through the origin and the point P2(t2, x2) and which are such that
x′(t) > −2k, 0 ≤ t ≤ t2. Since E(x, k, x′) < 0 if x′ < −2k < 0, it follows from the necessary
condition of Weierstrass that Γ does not provide a strong minimum to the fundamental integral. Do
you agree that the Legendre condition will provide a weak minimum, but in a smaller class of curves?
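
The sign behaviour of the excess function, and the failure of the strong minimum, can be made
concrete with a small numerical experiment. The following Python sketch (standard library only)
is an illustration under stated assumptions: it takes t2 = 1, so that P2 = (1, k), and compares Γ
with a hypothetical broken line that first drops steeply with slope −M for a time 1/M² and then
rises linearly to P2. The deviation of this curve from Γ is of order 1/M, so it lies in an
arbitrarily small neighbourhood of Γ for large M, although its slope is far from k.

```python
def excess_cubic(v, w):
    """Excess function (4.84) for L = v**3: field slope v, curve slope w."""
    return w**3 - v**3 - 3 * v**2 * (w - v)

def excess_factored(v, w):
    """Factored form (w - v)**2 * (w + 2*v) of the same function."""
    return (w - v)**2 * (w + 2 * v)

def extremal_value(k):
    """Integral of x'**3 over 0 <= t <= 1 along the extremal x = k*t."""
    return k**3

def broken_line_value(k, M):
    """Integral of x'**3 over 0 <= t <= 1 along a two-piece broken line
    from (0, 0) to (1, k): slope -M for a time a = 1/M**2, followed by the
    constant slope s needed to reach (1, k)."""
    a = 1.0 / M**2
    s = (k + M * a) / (1.0 - a)
    return (-M)**3 * a + s**3 * (1.0 - a)

if __name__ == "__main__":
    k = 0.5                                   # illustrative slope of the extremal
    print("E for x' = 0.1 :", excess_factored(k, 0.1))
    print("E for x' = -2  :", excess_factored(k, -2.0))
    print("extremal       :", extremal_value(k))
    print("broken line    :", broken_line_value(k, 100.0))
```

The broken line yields a smaller value of the integral than Γ, which is consistent with the
conclusion that Γ gives only a weak minimum.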

We could also imbed Γ in another Mayer field, the so-called central field, namely
\[ x = dt \qquad (d \text{ is a constant}) \]
(see Figure 4.10). This family of extremals (they are also straight lines) contains Γ for d = k.
If we exclude the origin then this is also a Mayer field. The field in section 4.5 is such a central
field whenever one of the points ((2k + 1) π/2ω, 0) is in the region. It can now be shown that the
conditions of Weierstrass also hold with respect to the central field.

Figure 4.10

We conclude these examples with some advice. If the Lagrangian function is a polynomial
in dx/dt, so that the excess function is a polynomial in dx/dt and x′, then x′ − dx/dt must be a
factor of the excess function. This helps considerably with the simplification (factorization) of the
function so that its sign can be analyzed. You may need to revise the theory of second and third
degree polynomials to assist you with the analysis of these functions.

4.5 Properties of extremal arcs

Recall Lemma 2.4.1, which states that if a Lagrangian function is not an explicit
function of t, then along any extremal of the problem the condition
\[ \frac{\partial L}{\partial \dot{x}_i} \dot{x}_i - L = \text{constant} \tag{4.110} \]
is valid. We recognize that the left hand side is the Hamiltonian function (4.4) for a given
Lagrangian function (using (4.3) it becomes the Hamiltonian function!). If we carry out the
substitution then for this problem we have that
\[ H(x_j, p_j) = \text{constant} \tag{4.111} \]
along an extremal. Note that in this case, as a result of the middle identity of (4.5),
we have that ∂H/∂t ≡ 0.

Consider the holonomic dynamical system which we defined in Example 4.4.1 with kinetic energy
given by the quadratic form
\[ T(q_i, \dot{q}_i) = \frac{1}{2} a_{ij}\, \dot{q}_i \dot{q}_j \tag{4.112} \]
and potential energy V(t, q_i). To obtain the Hamiltonian function of this system define the
generalized momentum p_i and we get
\[ \dot{q}_i p_i = \dot{q}_i \left( \frac{\partial L}{\partial \dot{q}_i} \right) = \dot{q}_i \left( \frac{\partial T}{\partial \dot{q}_i} \right) = \dot{q}_i \left( a_{ij} \dot{q}_j \right) = 2T. \tag{4.113} \]
Define the Hamiltonian function using (4.113) to get
\[ H = -L + 2T = -(T - V) + 2T = T + V, \tag{4.114} \]
that is, H is the total energy of our system. It is no wonder that the Hamiltonian function (and
Hamilton-Jacobi theory) is so important! It is related to the total energy of the system. The
constant in (4.111) can be considered to be the total energy, and this holds along an extremal in
the holonomic dynamical case. In the more general situation this is not the case and we talk of
a generalized energy. The generalized energy and the total energy of a system need not be the same.

This result holds in general where the kinetic energy T is positively homogeneous of
degree 2 in dq_i/dt, that is,
\[ 2T = \frac{\partial T}{\partial \dot{q}_i} \dot{q}_i. \]
The form of the kinetic energy as given in (4.112) fulfills this condition. In example 4.5.1 we
deal with a problem where this condition does not apply and the total energy is as a result not a
constant along an extremal.

We will now prove (4.111) by a direct application of the Hamiltonian formalism. In order
to apply the formalism, we introduce the important concept of Poisson brackets defined by
\[ \{F, G\} = \sum_{i=1}^{n} \left( \frac{\partial F}{\partial x_i} \frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i} \frac{\partial G}{\partial x_i} \right), \tag{4.115} \]
for any functions F(t, x_i, p_i) and G(t, x_i, p_i) of class C¹ in the canonical variables. These brackets
play an important role in the theory of canonical transformations (dealt with in Chapter 5) and
in the Hamilton-Jacobi formalism of classical mechanics; in particular, they serve as the analogy
motivating the transition to quantum mechanics (which we shall unfortunately not deal with).
Incidentally, we have not used the summation convention in (4.115)!

The Poisson brackets (4.115) satisfy several identities,
\[ \{F, G\} = -\{G, F\}, \quad \text{(Anticommutativity)} \tag{4.116} \]
\[ \{F_1 + F_2, G\} = \{F_1, G\} + \{F_2, G\}, \quad \text{(Distributive)} \tag{4.117} \]
\[ \{F_1 F_2, G\} = F_1 \{F_2, G\} + \{F_1, G\} F_2, \quad \text{(Product rule)} \tag{4.118} \]
\[ \{F_1, \{F_2, F_3\}\} + \{F_2, \{F_3, F_1\}\} + \{F_3, \{F_1, F_2\}\} = 0, \tag{4.119} \]
which are fairly easy to prove. The last identity is known as Jacobi’s identity.

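Consider a function F(t, x_i, p_i) of class C¹. If we differentiate the function along an extremal,
that is along a curve for which Hamilton’s equations
\[ \dot{x}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial x_i} \tag{4.120} \]
are satisfied, we get
\[ \frac{dF}{dt} = \frac{\partial F}{\partial t} + \sum_{i=1}^{n} \left( \frac{\partial F}{\partial x_i} \dot{x}_i + \frac{\partial F}{\partial p_i} \dot{p}_i \right) = \frac{\partial F}{\partial t} + \sum_{i=1}^{n} \left( \frac{\partial F}{\partial x_i} \frac{\partial H}{\partial p_i} - \frac{\partial F}{\partial p_i} \frac{\partial H}{\partial x_i} \right) = \frac{\partial F}{\partial t} + \{F, H\}, \tag{4.121} \]
where we used (4.115). The expression (4.121) denotes the rate of change of an arbitrary
function F of the canonical variables along an extremal. If the function F is not an explicit
function of the time (and hence is no longer arbitrary), then dF/dt = {F, H}. If in addition
{F, H} = 0, the function F is an integral of the canonical equations (4.120).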
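
For n = 1 the bracket (4.115) and the rule dF/dt = {F, H} can be explored numerically. The sketch
below (Python, standard library only) approximates the partial derivatives by central
differences. The harmonic oscillator Hamiltonian H = p²/2 + ω²x²/2, the value of ω and all
function names are assumptions made for the purpose of illustration.

```python
def poisson_bracket(F, G, x, p, h=1e-5):
    """Central-difference approximation of {F, G} in (4.115) for n = 1,
    where F and G are functions of (x, p)."""
    dF_dx = (F(x + h, p) - F(x - h, p)) / (2 * h)
    dF_dp = (F(x, p + h) - F(x, p - h)) / (2 * h)
    dG_dx = (G(x + h, p) - G(x - h, p)) / (2 * h)
    dG_dp = (G(x, p + h) - G(x, p - h)) / (2 * h)
    return dF_dx * dG_dp - dF_dp * dG_dx

OMEGA = 2.0  # assumed angular frequency

def hamiltonian(x, p):
    """Harmonic oscillator Hamiltonian (assumed for this illustration)."""
    return 0.5 * p**2 + 0.5 * OMEGA**2 * x**2

def position(x, p):
    return x

def momentum(x, p):
    return p

if __name__ == "__main__":
    x, p = 0.7, -0.4                                       # arbitrary phase-space point
    print("{x, H} :", poisson_bracket(position, hamiltonian, x, p))   # expected p
    print("{p, H} :", poisson_bracket(momentum, hamiltonian, x, p))   # expected -OMEGA**2 * x
    print("{H, H} :", poisson_bracket(hamiltonian, hamiltonian, x, p))
    print("{x, p} :", poisson_bracket(position, momentum, x, p))
```

The first two brackets reproduce the right hand sides of Hamilton’s equations (4.120), and
{H, H} = 0 reflects the fact that H itself is an integral when it does not depend explicitly on t.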

If F denotes the function H, then we have from (4.115) that {H, H} = 0, and we get the
middle identity of (4.5)
\[ \frac{dH}{dt} = \frac{\partial H}{\partial t} = -\frac{\partial L}{\partial t} \]
along an extremal.

Thus, the Hamiltonian function H is constant along an extremal if H (or L) does
not explicitly contain the parameter t.

We refer to (4.111) as a conservation law and it plays an important role in mechanics in general.
We will return to this important result in Chapter 5.

Example 4.5.1

Consider a rectangular system of axes (ρ, σ, τ) rotating with an angular velocity ω about the z–axis
of a system of fixed axes (x, y, z). The origins of the systems coincide. The systems are related by
(prove this)
\[ x = \rho \cos \omega t - \sigma \sin \omega t, \qquad y = \rho \sin \omega t + \sigma \cos \omega t, \qquad z = \tau. \]
The kinetic energy of a point of mass m, measured relative to (x, y, z), is given by (prove this)
\[ T(\rho, \sigma, \tau, \dot{\rho}, \dot{\sigma}, \dot{\tau}) = \frac{m}{2} \left( \dot{\rho}^2 + \dot{\sigma}^2 + \dot{\tau}^2 \right) + m\omega \left[ \rho \dot{\sigma} - \dot{\rho} \sigma \right] + \frac{m\omega^2}{2} \left( \rho^2 + \sigma^2 \right). \]
Assume that the forces are independent of the velocity, so that V(ρ, σ, τ) represents the potential
energy. The Hamiltonian function is then given by (prove this)
\[ H(\rho, \sigma, \tau, p_\rho, p_\sigma, p_\tau) = \frac{1}{2m} \left( p_\rho^2 + p_\sigma^2 + p_\tau^2 \right) + \omega \left[ p_\rho \sigma - p_\sigma \rho \right] + V(\rho, \sigma, \tau), \]
with the canonical momenta p_ρ, etc. If we now calculate T + V, we do not get H as we defined
it (prove this). This set of generalised coordinates gives us a system for which the generalised
energy and total energy are not the same. Write down Hamilton’s equations; if we assume
that V = constant, then the equations of motion, which contain the Coriolis and centripetal
accelerations, become
\[ \ddot{\rho} - 2\omega \dot{\sigma} - \omega^2 \rho = 0, \qquad \ddot{\sigma} + 2\omega \dot{\rho} - \omega^2 \sigma = 0. \]

This example and other important matters are dealt with in [13].
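
The difference between the generalised energy H and the total energy T + V in Example 4.5.1 can be
seen numerically. The following Python sketch (standard library only) computes the canonical
momenta from the kinetic energy, forms H − V by the Legendre transformation and compares it with
T. The function names and the sample state are hypothetical, and the potential V is left out
because it appears identically in both H and T + V.

```python
def kinetic_energy(m, omega, rho, sigma, rho_dot, sigma_dot, tau_dot):
    """Kinetic energy of Example 4.5.1 in the rotating coordinates."""
    return (0.5 * m * (rho_dot**2 + sigma_dot**2 + tau_dot**2)
            + m * omega * (rho * sigma_dot - rho_dot * sigma)
            + 0.5 * m * omega**2 * (rho**2 + sigma**2))

def momenta(m, omega, rho, sigma, rho_dot, sigma_dot, tau_dot):
    """Canonical momenta: partial derivatives of T with respect to the velocities."""
    return (m * (rho_dot - omega * sigma),
            m * (sigma_dot + omega * rho),
            m * tau_dot)

def hamiltonian_minus_V(m, omega, rho, sigma, p_rho, p_sigma, p_tau):
    """H - V as given in Example 4.5.1."""
    return ((p_rho**2 + p_sigma**2 + p_tau**2) / (2 * m)
            + omega * (p_rho * sigma - p_sigma * rho))

if __name__ == "__main__":
    m, omega = 2.0, 0.5                                   # illustrative values
    rho, sigma = 1.0, -0.3
    rho_dot, sigma_dot, tau_dot = 0.4, 0.8, -0.2
    T = kinetic_energy(m, omega, rho, sigma, rho_dot, sigma_dot, tau_dot)
    p_rho, p_sigma, p_tau = momenta(m, omega, rho, sigma, rho_dot, sigma_dot, tau_dot)
    H0 = hamiltonian_minus_V(m, omega, rho, sigma, p_rho, p_sigma, p_tau)
    print("T            :", T)
    print("H - V        :", H0)
    print("T - (H - V)  :", T - H0)
    print("omega * (rho * p_sigma - sigma * p_rho):", omega * (rho * p_sigma - sigma * p_rho))
```

For this state the last two printed values agree, which suggests that T + V exceeds H by ω times
the quantity ρ p_σ − σ p_ρ; this is a hint for the proof requested in the example, not a
substitute for it.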

4.6 Exercises

The first few deal with Hamilton’s equations. We could pose many more such problems,
certainly all the preceding exercises in Chapter 2 can be solved in this way. We shall be
content with only a few.

4.6.1 A particle of mass m moves in the xy–plane under the influence of a force

F = −gradV (x, y) .

Determine Hamilton’s equations of motion.

4.6.2 A particle of mass m moves in a force field where the potential in spherical coordinates is
given by
\[ V = -\frac{k \cos \theta}{r^2}. \]
Determine Hamilton’s equations of motion.

4.6.3 Consider a particle of mass m moving in two dimensions in a conservative force field with
the potential function V = V (x, y). Determine the Hamiltonian function and show that Hamilton’s
equations of motion reduce to Newton’s equations of motion in Cartesian coordinates.

4.6.4 Show that the Hamiltonian function for the single spring-mass system given in Figure 4.11
is
\[ H = \frac{1}{2m} p_x^2 + \frac{1}{2} k x^2. \]
Write down Hamilton’s equations.

Figure 4.11

Note: at x = 0 the spring is un-stretched.

4.6.5 A uniform rod of mass m and length 2ℓ hangs from one end of a spring with spring constant
k. The rod can swing in a vertical plane, but the spring can only move vertically. Obtain the
Euler-Lagrange equations as well as Hamilton’s equations for this system.

The following exercises deal with the Hamilton-Jacobi equation.

4.6.6 Consider a variational problem in R^{n+1} with a Lagrangian function which is not an explicit
function of t. Assume that ∂H/∂t ≡ 0 where H is the corresponding Hamiltonian function and the
Hamilton-Jacobi equation is
\[ H\left( x_j, \frac{\partial S}{\partial x_j} \right) + \frac{\partial S}{\partial t} = 0. \tag{4.122} \]
Assume that a solution S of (4.122) is given by

S (t, xj ) = S ∗ (xj ) − Et, (4.123)

where E is an arbitrary constant.

(a) Show that S* satisfies the so-called t–independent Hamilton–Jacobi equation
\[ H\left( x_j, \frac{\partial S^*}{\partial x_j} \right) = E. \tag{4.124} \]

(b) Conversely, show that if S* satisfies (4.124), then the function S defined by (4.123) satisfies
the Hamilton–Jacobi equation (4.122).

(c) Consider the motion of a single particle of mass m under the influence of a conservative
force. Show that, in terms of Cartesian coordinates (x, y, z), the Hamilton-Jacobi equation
(4.124) is now given by
\[ \frac{1}{2m} \left[ \left( \frac{\partial S^*}{\partial x} \right)^2 + \left( \frac{\partial S^*}{\partial y} \right)^2 + \left( \frac{\partial S^*}{\partial z} \right)^2 \right] + V(x, y, z) = E. \]

4.6.7 A particle of mass m is projected from a point O with a velocity u making an angle α with
the horizontal. The motion takes place in the xy–plane. Ignore air resistance. Determine the
Hamiltonian function and show using the Hamilton-Jacobi equation that the motion of the particle
is given by a parabola.

4.6.8 Construct the Hamiltonian function, derive the canonical equations and write down the
Hamilton-Jacobi equation for the following Lagrangian functions.

4.6.8.1 L(x, ẋ) = 1 + ẋ².

4.6.8.2 L(t, x, ẋ) = f(t, x)(1 + ẋ²).

4.6.8.3 L(x, ẋ) = x²(1 − ẋ²).

4.6.9 Prove the following theorem: Let S(t, x, α) be a solution of the Hamilton–Jacobi equation
in R², with the parameter (integration constant) α. Then we have
\[ \frac{\partial S}{\partial \alpha} = k \quad \text{(a constant)} \]
along each extremal of the problem.

This result is often useful in finding extremals.

(a) Consider the problem with Lagrangian function
\[ L(x, \dot{x}) = \dot{x}^2. \]
Show that the solution S of the corresponding Hamilton–Jacobi equation is given by
\[ S(t, x, \alpha, \beta) = -\alpha^2 t + 2\alpha x + \beta, \]
where α and β are constants.


Hint: Assume that the solution S is of the form S (t, x) = u (t) + v (x) .

(b) Consider the problem with Lagrangian function
\[ L(x, \dot{x}) = \dot{x}^2 - x^2. \]
Use the theorem (and hint in (a)) to prove that the extremals are given by
\[ x = \alpha \sin(t + \beta). \]

4.6.10 Consider the variational problem with Lagrangian function


\[ L(t, x, \dot{x}) = \sqrt{t^{2} + x^{2}}\,\sqrt{1 + \dot{x}^{2}}. \tag{4.125} \]

Show that the extremals are given by the two-parameter family of curves

\[ t^{2}\cos\beta + 2tx\sin\beta - x^{2}\cos\beta = \alpha, \]

where α and β are the parameters, by

(a) transforming (4.125) to polar coordinates,

(b) using canonical variables,

(c) using the Hamilton–Jacobi equation.

Hint for (c): Look for a solution S of the Hamilton–Jacobi equation of the form
\[ S(t, x) = \frac{1}{2}\left(At^{2} + 2Btx + Cx^{2}\right). \]
This will show that
\[ S(t, x) = \frac{1}{2}\left(t^{2}\sin\beta - 2tx\cos\beta - x^{2}\sin\beta\right) \]
is a solution of the Hamilton–Jacobi equation.
Now prove that ∂S/∂β = α is a solution of the corresponding canonical equations (see Exercise
4.6.9 above).
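
The hint can be checked numerically. The sketch below assumes that, for L = √(t² + x²) √(1 + ẋ²), the Hamiltonian is H = −√(t² + x² − p²), so that the Hamilton–Jacobi equation takes the squared form S_t² + S_x² = t² + x². This squared form is an assumption of the sketch, obtained from the usual Legendre transformation, and the function names are illustrative.

```python
import math


def S(t, x, beta):
    # Quadratic candidate from the hint, with A = sin(beta), B = -cos(beta), C = -sin(beta).
    return 0.5 * (t * t * math.sin(beta) - 2.0 * t * x * math.cos(beta) - x * x * math.sin(beta))


def squared_hj_residual(t, x, beta, h=1e-5):
    # Residual of S_t^2 + S_x^2 - (t^2 + x^2), using central differences.
    S_t = (S(t + h, x, beta) - S(t - h, x, beta)) / (2.0 * h)
    S_x = (S(t, x + h, beta) - S(t, x - h, beta)) / (2.0 * h)
    return S_t**2 + S_x**2 - (t * t + x * x)


def dS_dbeta(t, x, beta):
    # Analytic derivative of S with respect to beta; setting it constant gives the family of curves.
    return 0.5 * (t * t * math.cos(beta) + 2.0 * t * x * math.sin(beta) - x * x * math.cos(beta))


if __name__ == "__main__":
    print(squared_hj_residual(1.2, 0.4, 0.7))
    print(dS_dbeta(1.2, 0.4, 0.7))
```

Setting `dS_dbeta` equal to a constant reproduces, up to a factor of 2 absorbed in the parameter, the two-parameter family quoted in the exercise.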

The following problems deal with the independent integral of Hilbert and the conditions of
Weierstrass and Legendre.

4.6.11 Consider the simplest geodesic problem in R2 with Lagrangian function



\[ L(x, \dot{x}) = \sqrt{1 + \dot{x}^{2}}. \]

The extremal joining (0, 0) and (1, 1) can be imbedded in a Mayer-field x = t + c (c is a real
constant). Write down Hilbert’s independent integral and show directly that it is independent
of the choice of curves x = χ(t) of class $C^1$ joining (0, 0) and (1, 1).

4.6.12 Imbed the extremal of the variational problem with Lagrangian function

L (x, ẋ) = 1 + ẋ,

joining (0, 0) and (1, 1), in a Mayer-field. Show directly that the corresponding integral of Hilbert
is independent of the choice of curves x = χ (t) of the class C 1 joining (0, 0) and (1, 1).

4.6.13 Consider the variational problem with Lagrangian function

\[ L(x, \dot{x}) = 6\dot{x}^{2} - \dot{x}^{4} + x\dot{x}, \]

and endpoint conditions x(0) = 0, x(a) = b, a > 0, b > 0. Show that the extremals are straight
lines $x = c_1 t + c_2$, so that the particular extremal satisfying the end conditions is given by x = bt/a.
This extremal can be imbedded in a Mayer-field x = bt/a + c, or in the central field x = kt. Consider
either of the fields and show that the excess function (4.84) is given by
\[ E(x, \dot{x}, x') = -\left(x' - \dot{x}\right)^{2}\left(x'^{2} + 2\dot{x}x' + 3\dot{x}^{2} - 6\right), \]
where, depending on which field is chosen,
\[ \dot{x} = \frac{b}{a} \quad\text{or}\quad \dot{x} = k, \quad 0 < k < \infty. \]
The sign of E is the opposite of that of
\[ x'^{2} + 2\dot{x}x' + 3\dot{x}^{2} - 6. \tag{4.126} \]

We now want to find out how the sign of (4.126) changes. Consider the quadratic expression in
x′. Obtain the following:

(a) If
\[ \dot{x} = \frac{b}{a} \ge \sqrt{3}, \]
we get a strong maximum.

(b) If
\[ \dot{x} = \frac{b}{a} < \sqrt{3}, \]
we see that the sign of E changes as x′ varies, and so we do not get a strong extremum.

(c) For
\[ 1 < \dot{x} = \frac{b}{a} < \sqrt{3}, \]
and x′ sufficiently close to ẋ we have E ≤ 0, and hence we have a weak maximum.

(d) Also find out for which values of b/a, if any, we will get a weak minimum.

Note that the question of weak extrema can also be determined using the Legendre condition. In
connection with this problem, see [9], pp. 116-118.
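
The factorised excess function can be cross-checked numerically. The sketch below assumes the Lagrangian L = 6ẋ² − ẋ⁴ + xẋ and the usual Weierstrass definition E = L(x, x′) − L(x, ẋ) − (x′ − ẋ) ∂L/∂ẋ, which is taken here to be what (4.84) states. The function names are illustrative.

```python
def L(x, v):
    # Lagrangian of the exercise, with v standing for xdot.
    return 6.0 * v**2 - v**4 + x * v


def L_v(x, v):
    # Partial derivative of L with respect to xdot.
    return 12.0 * v - 4.0 * v**3 + x


def excess(x, v, w):
    # Weierstrass excess function (assumed form of (4.84)); w stands for x'.
    return L(x, w) - L(x, v) - (w - v) * L_v(x, v)


def excess_closed_form(v, w):
    # Factorised form: -(w - v)^2 (w^2 + 2 v w + 3 v^2 - 6).
    return -((w - v) ** 2) * (w**2 + 2.0 * v * w + 3.0 * v**2 - 6.0)


if __name__ == "__main__":
    print(excess(0.3, 1.2, -0.7), excess_closed_form(1.2, -0.7))
    # Slope 2 >= sqrt(3): E should never be positive on the sampled grid.
    print(max(excess(0.0, 2.0, w / 10.0) for w in range(-50, 51)))
    # Slope 1.5 < sqrt(3): E takes positive values for some x'.
    print(max(excess(0.0, 1.5, w / 10.0) for w in range(-50, 51)))
```

The linear term xẋ drops out of E, which is why the closed form does not depend on x.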

4.6.14 Consider the variational problem with Lagrangian function

\[ L(x, \dot{x}) = \dot{x}^{2}(1 + \dot{x})^{2}, \]

and boundary conditions x (0) = 0, x (1) = m. Show that the extremals are straight lines. Use
the condition of Weierstrass to show that

(a) If m ≤ −1 or m ≥ 0, then the extremal yields a strong minimum.

Use the Legendre condition to show the following.

(b) If
\[ -1 < m < -\frac{1}{2} - \frac{\sqrt{3}}{6} \quad\text{or}\quad -\frac{1}{2} + \frac{\sqrt{3}}{6} < m < 0, \]
then the extremal yields a weak minimum.

(c) If
\[ -\frac{1}{2} - \frac{\sqrt{3}}{6} < m < -\frac{1}{2} + \frac{\sqrt{3}}{6}, \]
then the extremal yields a weak maximum.
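
The thresholds in (b) and (c) are the roots of the second derivative of L with respect to ẋ, which is what the Legendre condition tests. A minimal sketch, with illustrative function names and the slope ẋ = m written as v:

```python
import math


def L_vv(v):
    # Second derivative of v^2 (1 + v)^2 = v^4 + 2 v^3 + v^2 with respect to v.
    return 12.0 * v**2 + 12.0 * v + 2.0


def legendre_roots():
    # Roots of 12 v^2 + 12 v + 2 = 0 by the quadratic formula.
    disc = 12.0**2 - 4.0 * 12.0 * 2.0
    return (-12.0 - math.sqrt(disc)) / 24.0, (-12.0 + math.sqrt(disc)) / 24.0


if __name__ == "__main__":
    r1, r2 = legendre_roots()
    print(r1, r2)                      # compare with -1/2 -/+ sqrt(3)/6
    print(L_vv(-0.5), L_vv(-0.9), L_vv(-0.1))
```

Between the two roots `L_vv` is negative (weak maximum), and outside them it is positive (weak minimum).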

4.6.15 Consider the variational problems with the following functionals. Determine the extremals
which join the given endpoints. Investigate what sort of extrema result from these extremals.
4.6.15.1 $\int_a^b \sqrt{1 + \dot{x}^{2}}\,dt$, x(b) > x(a), b > a > 0.
4.6.15.2 $\int_0^1 (1 - \dot{x}^{2})\,dt$, x(0) = 0, x(1) = 1.
4.6.15.3 $\int_0^1 (\dot{x}^{2} - 4x\dot{x}^{3} + 2t\dot{x}^{4})\,dt$, x(0) = x(1) = 0.
4.6.15.4 $\int_0^1 x^{2}(1 - \dot{x})^{2}\,dt$, x(0) = 0, x(1) = 1.
4.6.15.5 $\int_0^1 (\dot{x}^{2} + \dot{x}^{3})\,dt$, x(0) = x(1) = 0.
4.6.15.6 $\int_0^4 (t\dot{x} + \dot{x}^{2})\,dt$, x(0) = 1, x(4) = −3.
4.6.15.7 $\int_0^a (\dot{x}^{2} + 2x\dot{x} - 16x^{2})\,dt$, x(0) = x(a) = 0, a > 0.
4.6.15.8 $\int_0^{\pi/4} (4x^{2} - \dot{x}^{2} + 8x)\,dt$, x(0) = −1, x(π/4) = 0.
4.6.15.9 $\int_2^4 \dfrac{t^{3}}{\dot{x}^{2}}\,dt$, x(2) = 4, x(4) = 16.
4.6.15.10 $\int_0^1 x^{2}\dot{x}^{2}\,dt$, x(0) = 0, x(1) = 1.
R4 2
4.6.15.11 2
(tẋ − 2xẋ3 ) dt, x (2) = 1, x (4) = 3.

4.6.16 Consider the brachistochrone-problem with Lagrangian function L given by


\[ L(x, \dot{x}) = \sqrt{\frac{1 + \dot{x}^{2}}{x}} \]
with the cycloids t = b (θ − sin θ) and x = b (1 − cos θ) as extremals. Show that for 0 ≤ t ≤ t1 <
2πb, the extremal going through (0, 0) and (t1 , x1 ), where t1 = b (θ1 − sin θ1 ) and x1 = b (1 − cos θ1 )
provides a minimum value to the fundamental integral. Is this minimum strong or weak?

4.6.17 Show that the extremal x = 0 of the variational problem with functional
\[ I[x] = \int_0^1 \left(a\dot{x}^{2} - 4bx\dot{x}^{3} + 2bt\dot{x}^{4}\right) dt, \]

where x(0) = 0, x(1) = 0, a > 0, b > 0, satisfies the necessary condition of Weierstrass and show
that it can be imbedded in an extremal field of the problem. Does this offer a strong minimum
for the functional? Choose
\[ x = x_0(t) = \begin{cases} k\,\dfrac{t}{h}, & 0 \le t \le h, \\[6pt] k\,\dfrac{1 - t}{1 - h}, & h \le t \le 1, \end{cases} \]
and show that, given any value of k, however small, an h > 0 exists such that $I[x_0] < 0$. Please
comment on this.
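
The claim about the broken line x₀ can be explored numerically before attempting the proof. The sketch below evaluates I[x₀] with a midpoint rule and compares it with a closed form obtained by integrating each linear piece by hand. That closed form is a derivation made for this sketch, not a formula from the guide, and the parameter values are arbitrary.

```python
def integrand(a, b, t, x, v):
    # a xdot^2 - 4 b x xdot^3 + 2 b t xdot^4, with v standing for xdot.
    return a * v**2 - 4.0 * b * x * v**3 + 2.0 * b * t * v**4


def I_broken(a, b, k, h, n=20000):
    # Midpoint-rule value of I[x0] for the two-piece linear curve x0.
    total = 0.0
    dt = 1.0 / n
    for i in range(n):
        t = (i + 0.5) * dt
        if t <= h:
            x, v = k * t / h, k / h
        else:
            x, v = k * (1.0 - t) / (1.0 - h), -k / (1.0 - h)
        total += integrand(a, b, t, x, v) * dt
    return total


def I_closed(a, b, k, h):
    # Piecewise integration by hand (assumption of this sketch).
    left = a * k**2 / h - b * k**4 / h**2
    right = (a * k**2 / (1.0 - h)
             + 2.0 * b * k**4 / (1.0 - h) ** 2
             + b * k**4 * (1.0 + h) / (1.0 - h) ** 3)
    return left + right


if __name__ == "__main__":
    # With a = b = 1 and k = 0.1, a sufficiently small h makes the value negative.
    print(I_broken(1.0, 1.0, 0.1, 0.001), I_closed(1.0, 1.0, 0.1, 0.001))
    print(I_closed(1.0, 1.0, 0.1, 0.5))
```

The term −bk⁴/h² dominates as h tends to 0, which is the mechanism behind the negative value.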

4.6.18 Consider the linear Lagrangian function L in R2 given by

L (t, x, ẋ) = α (t, x) + β (t, x) ẋ,

and the corresponding variational problem with t1 ≤ t ≤ t2 . Write down the Euler-Lagrange
equation. What happens with the excess function? Comment on the situation.

The following problems deal with the Poisson brackets.

4.6.19 A particle of mass m moves in a plane subject to an inverse square law relative to the
origin. Show that the Lagrangian function L, in polar coordinates, is given by
\[ L(r, \theta, \dot{r}, \dot{\theta}) = \frac{m}{2}\left(\dot{r}^{2} + r^{2}\dot{\theta}^{2}\right) + \frac{k}{r}. \]
Obtain the Euler-Lagrange as well as Hamilton’s equations for this system.
Evaluate the Poisson brackets {r, pr }, {θ, pθ } , {pr , H} , {pθ , H} where pr and pθ are the components
of the generalized momentum which correspond to r and θ (recall the summation in (4.115)).

4.6.20 Suppose that a dynamical system moves along a path in the phase space determined by
the Hamiltonian function H (t, µ) and by the equation of motion dµ/dt = {µ, H}.
Show that, for any two variables f and g,
\[ \frac{d}{dt}\{f, g\} = \{\{f, g\}, H\} + \frac{\partial}{\partial t}\{f, g\}. \]
This is known as the Poisson bracket theorem.

The last set of problems deals with the theory. Once again, you are required to answer the
questions in your own words. Please do not just reproduce the theory in the guide.

4.6.21 Construct the so-called complete figure of Carathéodory for the non-homogeneous vari-
ational problem for simple integrals in Rn+1 . Remember to describe what this figure consists of!
Note that the Hamilton–Jacobi equation characterizes the figure, it does not form part of it.

4.6.22 Prove that the Hamilton–Jacobi equation $H(t, x_j, \partial S/\partial x_j) + \partial S/\partial t = 0$ is necessary and
sufficient for the hypersurfaces $S(t, x_j) = \Sigma$ in $R^{n+1}$ to be geodesically equidistant with respect
to the corresponding Lagrangian function, and show that as a result it characterises the associated
complete figure of Carathéodory.

4.6.23 Prove that the members of the congruence of geodesics in the complete figure of
Carathéodory belong to a family of geodesically equidistant hypersurfaces in Rn+1 , satisfy the
canonical (or Hamilton’s) equations and show that as a result they must be extremals.

4.6.24 Define a field

xi = xi (t, uα ) , pi = pi (t, uα ) , α = 1, 2, . . . , n, (4.127)

in the context of the variational calculus of single integrals in Rn+1 . When can it be said that
this field belongs to a family of hypersurfaces? Prove the following theorem: “A necessary and
sufficient condition for the field (4.127) to belong to a family of hypersurfaces is that the Lagrange
brackets [uα , uβ ] should be identically zero”.

4.6.25 When is the field (4.127) canonical? Prove the following theorem: “If a canonical field
belongs to a family of hypersurfaces, then the members of the family are geodesically equidistant”.

4.6.26 Derive the independent integral of Hilbert, define a Mayer-field and construct the Weier-
strass excess function for the non-homogeneous single integral problem in Rn+1 . State and prove
the necessary and the sufficient conditions of Weierstrass for an extremal to provide a minimum
value for the fundamental integral.

4.6.27 Prove that a Mayer field is a canonical field which belongs to a family of geodesically
equidistant hypersurfaces and conversely.

4.6.28 Prove the following theorem: “A necessary and sufficient condition for the field (4.127)
to be a Mayer field is that the Lagrange brackets [uα , uβ ] are identically zero and that the functions
(4.127) satisfy the canonical equations (4.35) ”.

4.6.29 Prove the identities (4.116), (4.117) and (4.118).



Chapter 5

ADDITIONAL TOPICS IN
VARIATIONAL CALCULUS

The variational calculus that we have dealt with so far lends itself to further development
through applications. In this chapter we introduce additional topics where variational calculus is
applied.

Objectives
At the end of this chapter you will be able to:

• formulate and solve the multiple integral problem in the calculus of variations,

• construct from a given Lagrangian function corresponding equivalent integrals,

• identify transformations that are canonical, and

• deduce conservation properties of dynamical systems using Noether’s theorem.

5.1 The multiple integral problem

We begin this section by formulating the problem. Suppose that we are given n real variables $X_i$
together with m real independent variables $t_\alpha$. Consider the space $R^{n+m}$ of variables $(t_\alpha, X_i)$. When
m = 1, our theory reduces to the theory of Chapter 2. In $R^{n+m}$ the n equations
\[ X_i = X_i(t_\alpha) = X_i(t_1, t_2, \dots, t_m), \quad i = 1, 2, \dots, n, \tag{5.1} \]
define a subspace which we denote by $C_m$. The functions (5.1) are assumed to be of class $C^2$, so
that we can define the derivatives on $C_m$ by
\[ \dot{X}_{i\alpha} = (X_i)_{t_\alpha} = \frac{\partial X_i}{\partial t_\alpha}, \quad \alpha = 1, 2, \dots, m. \tag{5.2} \]

We assume that the dimension of the subspace Cm defined by (5.1) is m. Let G be a simply
connected region in the m–dimensional region of the tα which is bounded by the hypersurface
∂G each point of which corresponds to a set of values tα . Consider a second set of equations
Xi = χi (tα ) which represent another subspace Γm which we shall require to coincide with Cm on
the boundary ∂G of G, that is

χi (tα ) = Xi (tα ) = fi (tα ) for all tα ∈ ∂G. (5.3)

The functions $f_i$ have been previously prescribed and are thus known.
Consider a Lagrangian function $L(t_\alpha, X_i, \partial X_i/\partial t_\alpha)$ of class $C^2$ defined in terms of these subspaces
as a function of $t_\alpha$. We form the multiple integral
\[ I[C_m] = \int_G L\left(t_\alpha, X_i(t_\beta), \frac{\partial X_i(t_\beta)}{\partial t_\alpha}\right) d(t), \tag{5.4} \]
where $d(t) = dt_1\, dt_2 \dots dt_m$ and the G under the integral sign denotes m–fold integration over G.

We formulate the variational problem as follows:

Simplest multiple integral problem in the calculus of variations: We seek the nec-
essary and sufficient conditions which the functions Xi (tα ) must satisfy so that they yield an
extreme value to the integral (5.4) compared with other functions which satisfy the boundary
conditions (5.3) and which lie in a neighborhood sufficiently close to Cm .

Before we derive the Euler-Lagrange equations we must consider two concepts with which
you are already familiar. The first is the total derivative, namely
\[ \frac{d\Phi(t_\beta, X_i)}{dt_\alpha} = \frac{\partial \Phi}{\partial t_\alpha} + \frac{\partial \Phi}{\partial X_i}\frac{\partial X_i}{\partial t_\alpha}, \tag{5.5} \]
where we used (5.2). The other concept is the divergence theorem, which states that a volume
integral of the divergence of a vector function depends only on the values of the function on the
boundary of its domain, namely
\[ \iiint_G \nabla\cdot\mathbf{F}\; dx\, dy\, dz = \iint_{\partial G} \mathbf{F}\cdot\mathbf{n}\; d\sigma. \tag{5.6} \]

The right hand side of (5.6) is a surface integral.

Consider a 1–parameter family of m–dimensional subspaces $C_m(u)$ of $R^{n+m}$ given by

\[ X_i = X_i(t_\alpha, u), \tag{5.7} \]

where u denotes the parameter. Assume that the functions in (5.7) are of class $C^2$. Consider two
neighboring subspaces $C_m(u)$ and $C_m(u_0)$, where we assume that $|u - u_0|$ is small; indeed, assume
that $|u - u_0|^2$ and higher powers can be neglected. Consider two points, $P(t_\alpha, X_i(t_\beta, u_0))$ on $C_m(u_0)$
and $P'(t_\alpha, X_i(t_\beta, u))$ on $C_m(u)$. The components of the displacement
$PP'$ in $R^{n+m}$ are given by $(0, 0, \dots, 0, \delta^{*}X_1, \dots, \delta^{*}X_n)$, where
\[ \delta^{*}X_i = \left(\frac{\partial X_i}{\partial u}\right)_{u=u_0}(u - u_0). \tag{5.8} \]

These quantities represent the displacement where the $t_\alpha$ are kept constant.

Suppose that we are given an m–parameter family of curves in $R^{n+m}$ which intersect the
subspaces $C_m(u)$ such that one and only one member of the family passes through each point of a
given $C_m(u)$. The parameters of the family are denoted by $v_\alpha$, which are chosen so that

\[ v_\alpha = t_\alpha \quad\text{on } C_m(u_0). \tag{5.9} \]

The curve $v_\alpha$ = constant through a point P on $C_m(u_0)$ intersects the subspace $C_m(u)$ at a point
Q, the coordinates $(t_\alpha, X_i)$ of which are uniquely determined as functions of the parameters $(v_\alpha, u)$.
We can write
\[ t_\alpha = \psi_\alpha(v_\beta, u), \tag{5.10} \]

which together with (5.7) gives us

\[ X_i = X_i(\psi_\alpha(v_\beta, u), u). \tag{5.11} \]

The condition (5.9) gives us

\[ v_\alpha = \psi_\alpha(v_\beta, u_0), \tag{5.12} \]

so that
\[ \left(\frac{\partial \psi_\alpha}{\partial v_\beta}\right)_{u=u_0} = \delta_{\beta,\alpha}. \tag{5.13} \]

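Since the family of curves is chosen in an arbitrary way, the displacement PQ is also arbitrary.
This displacement is given by
\[ \delta t_\alpha = \left(\frac{\partial \psi_\alpha}{\partial u}\right)_{u=u_0}(u - u_0), \tag{5.14} \]
and
\[ \delta X_i = \left(\dot{X}_{i\alpha}\right)_{u=u_0}\delta t_\alpha + \left(\frac{\partial X_i}{\partial u}\right)_{u=u_0}(u - u_0), \tag{5.15} \]

where the $v_\alpha$ are constant and
\[ \dot{X}_{i\alpha} = \frac{\partial X_i(t_\beta, u)}{\partial t_\alpha}. \tag{5.16} \]
If we compare (5.15) with (5.8) we see that
\[ \delta X_i = \left(\frac{\partial X_i}{\partial t_\alpha}\right)_{u=u_0}\delta t_\alpha + \delta^{*}X_i. \tag{5.17} \]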

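Using (5.8) we define
\[ \delta^{*}\dot{X}_{i\alpha} = \left(\frac{\partial \dot{X}_{i\alpha}(t_\beta, u)}{\partial u}\right)_{u=u_0}(u - u_0), \tag{5.18} \]

and from (5.16) it follows that
\[ \delta^{*}\dot{X}_{i\alpha} = \frac{\partial}{\partial t_\alpha}\left(\delta^{*}X_i\right). \tag{5.19} \]
The region G of the $t_\alpha$–space defines a region $G(u_0)$ on the subspace $C_m(u_0)$. The integral (5.4)
over G with respect to $C_m(u_0)$ is given by
\[ I(u_0) = \int_G L\left(t_\alpha, X_i(t_\beta, u_0), \dot{X}_{i\alpha}(t_\beta, u_0)\right) d(t). \tag{5.20} \]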

Consider the displacement PQ given by (5.14) and (5.15) as a mapping from $C_m(u_0)$ to $C_m(u)$.
The boundary of $G(u_0)$ will not necessarily be mapped onto the boundary of $G(u)$, but rather
onto the boundary of a region $G'(u)$ of $C_m(u)$. Indeed, $G'(u)$ will correspond to a region $G''(u)$ of
the $t_\alpha$–space which coincides with G of that space if the displacements PQ satisfy the condition
$t_\alpha$ = constant, that is, if the displacements $\delta X_i$ and $\delta^{*}X_i$ coincide. We see that the variations (5.14)
and (5.15) contain two factors which contribute to the variation of the integral. The first is the
variation of the integrand and the second is the variation of the boundary of the domain over which
the integration is performed. To calculate the first variation of the integral, namely
\[ \delta I = I(u) - I(u_0) = \left(\frac{dI}{du}\right)_{u=u_0}(u - u_0), \tag{5.21} \]

the integral
\[ I(u) = \int_{G(u)} L\left(t_\alpha, X_i(t_\beta, u), \dot{X}_{i\alpha}(t_\beta, u)\right) d(t) \tag{5.22} \]

must be differentiated with respect to u and the fact that the region G is dependent on u must be
taken into consideration. This dependence is dealt with as follows. According to our construc-
tion all points of G(u) and the corresponding points of G(u0 ) belong to the same vα –values, and
consequently we can replace the independent variables tα in (5.22) with the vα . As a result of
(5.9) we can denote a region Gv in the vα –space which corresponds to G. Thus, the region Gv is
independent of u, and is the domain of integration in (5.22) if the integration is performed in terms
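of $v_\alpha$. Using (5.10) the integral (5.22) can be rewritten as
\[ I(u) = \int_{G_v} L\left(\psi_\alpha(v_\beta, u), X_i(\psi_\alpha, u), \dot{X}_{i\alpha}(\psi_\alpha, u)\right) A\, d(v), \tag{5.23} \]

where
\[ A = \det(A_{\beta,\alpha}) \quad\text{with}\quad A_{\beta,\alpha} = \frac{\partial t_\alpha}{\partial v_\beta} = \frac{\partial \psi_\alpha}{\partial v_\beta}. \tag{5.24} \]
We point out that as a consequence of (5.13)

\[ (A_{\beta,\alpha})_{u=u_0} = \delta_{\beta,\alpha}, \qquad (A)_{u=u_0} = 1, \tag{5.25} \]

so that for sufficiently small values of $|u - u_0|$ the determinant A > 0.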

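Differentiating (5.23) gives us

\[
\begin{aligned}
\frac{dI}{du} = {} & \int_{G_v} \left( \frac{\partial L}{\partial t_\alpha}\frac{\partial \psi_\alpha}{\partial u}
 + \frac{\partial L}{\partial X_i}\left[\frac{\partial X_i}{\partial t_\alpha}\frac{\partial \psi_\alpha}{\partial u} + \frac{\partial X_i}{\partial u}\right]
 + \frac{\partial L}{\partial \dot{X}_{i\beta}}\left[\frac{\partial \dot{X}_{i\beta}}{\partial t_\alpha}\frac{\partial \psi_\alpha}{\partial u} + \frac{\partial \dot{X}_{i\beta}}{\partial u}\right] \right) A\, d(v) \\
 & + \int_{G_v} L\,\frac{\partial A}{\partial u}\, d(v),
\end{aligned}
\]

and after some manipulation

\[
\begin{aligned}
\frac{dI}{du} = {} & \int_{G_v} \left[\frac{\partial L}{\partial X_i}\frac{\partial X_i}{\partial u} + \frac{\partial L}{\partial \dot{X}_{i\beta}}\frac{\partial \dot{X}_{i\beta}}{\partial u}\right] A\, d(v)
 + \int_{G_v} L\,\frac{\partial A}{\partial u}\, d(v) \\
 & + \int_{G_v} \left[\frac{\partial L}{\partial t_\alpha} + \frac{\partial L}{\partial X_i}\frac{\partial X_i}{\partial t_\alpha} + \frac{\partial L}{\partial \dot{X}_{i\beta}}\frac{\partial \dot{X}_{i\beta}}{\partial t_\alpha}\right]\frac{\partial \psi_\alpha}{\partial u}\, A\, d(v),
\end{aligned}
\]

so that with the help of the total derivative notation in (5.5) we get

\[
\begin{aligned}
\frac{dI}{du} = {} & \int_{G_v} \left[\frac{\partial L}{\partial X_i}\frac{\partial X_i}{\partial u} + \frac{\partial L}{\partial \dot{X}_{i\beta}}\frac{\partial \dot{X}_{i\beta}}{\partial u}\right] A\, d(v) \\
 & + \int_{G_v} \left(\frac{dL}{dt_\alpha}\frac{\partial \psi_\alpha}{\partial u}\, A + L\,\frac{\partial A}{\partial u}\right) d(v).
\end{aligned}
\tag{5.26}
\]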

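In terms of (5.24) we have

\[ \frac{dL}{dv_\alpha} = \frac{\partial L}{\partial t_\beta}\frac{\partial \psi_\beta}{\partial v_\alpha}
 + \frac{\partial L}{\partial X_i}\frac{\partial X_i}{\partial t_\beta}\frac{\partial \psi_\beta}{\partial v_\alpha}
 + \frac{\partial L}{\partial \dot{X}_{i\gamma}}\frac{\partial \dot{X}_{i\gamma}}{\partial t_\beta}\frac{\partial \psi_\beta}{\partial v_\alpha}
 = A_{\alpha,\beta}\frac{dL}{dt_\beta}. \]

Denote the cofactor of $A_{\alpha,\beta}$ in the determinant A by $B_{\beta,\alpha}$, so that

\[ A_{\alpha,\beta}B_{\beta,\epsilon} = A\,\delta_{\epsilon,\alpha}, \tag{5.27} \]

then
\[ \frac{dL}{dv_\alpha}B_{\epsilon,\alpha} = A\,\frac{dL}{dt_\epsilon}. \tag{5.28} \]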
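We write the integrand of the second integral on the right hand side of (5.26) as

\[ \frac{dL}{dt_\alpha}\frac{\partial \psi_\alpha}{\partial u}\, A + L\,\frac{\partial A}{\partial u}
 = B_{\alpha,\beta}\frac{dL}{dv_\beta}\frac{\partial \psi_\alpha}{\partial u} + L\,\frac{\partial A}{\partial u}. \tag{5.29} \]

The derivative of a determinant is given by

\[ \frac{\partial A}{\partial u} = B_{\alpha,\beta}\frac{\partial A_{\beta,\alpha}}{\partial u}, \]

and the expression (5.29) is then equivalent to

\[ B_{\alpha,\beta}\left(\frac{dL}{dv_\beta}\frac{\partial \psi_\alpha}{\partial u} + L\,\frac{\partial A_{\beta,\alpha}}{\partial u}\right)
 = B_{\alpha,\beta}\left(\frac{dL}{dv_\beta}\frac{\partial \psi_\alpha}{\partial u} + L\,\frac{\partial^{2}\psi_\alpha}{\partial u\,\partial v_\beta}\right)
 = B_{\alpha,\beta}\frac{d}{dv_\beta}\left(L\,\frac{\partial \psi_\alpha}{\partial u}\right), \]

where we used the second expression in (5.24). We write (5.26) in the form
\[ \frac{dI}{du} = \int_{G_v} \left[\frac{\partial L}{\partial X_i}\frac{\partial X_i}{\partial u} + \frac{\partial L}{\partial \dot{X}_{i\beta}}\frac{\partial \dot{X}_{i\beta}}{\partial u}\right] A\, d(v)
 + \int_{G_v} B_{\alpha,\beta}\frac{d}{dv_\beta}\left(L\,\frac{\partial \psi_\alpha}{\partial u}\right) d(v). \tag{5.30} \]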

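In order to calculate (5.21), we put $u = u_0$ in (5.30). From (5.25) and (5.27) it follows that for this
value $B_{\alpha,\beta} = \delta_{\alpha,\beta}$, so that from (5.9) we get
\[ \left(\frac{dI}{du}\right)_{u=u_0} = \int_G \left(\frac{\partial L}{\partial X_i}\frac{\partial X_i}{\partial u} + \frac{\partial L}{\partial \dot{X}_{i\beta}}\frac{\partial \dot{X}_{i\beta}}{\partial u}\right)_{u=u_0} d(t)
 + \int_G \frac{d}{dt_\alpha}\left(L\,\frac{\partial \psi_\alpha}{\partial u}\right)_{u=u_0} d(t). \tag{5.31} \]

The first variation is obtained by using (5.8), (5.14) and (5.18), so that
\[ \delta I = \int_G \left(\frac{\partial L}{\partial X_i}\,\delta^{*}X_i + \frac{\partial L}{\partial \dot{X}_{i\beta}}\,\delta^{*}\dot{X}_{i\beta}\right) d(t)
 + \int_G \frac{d}{dt_\alpha}\left(L\,\delta t_\alpha\right) d(t). \tag{5.32} \]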

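From (5.19), after an integration by parts, we get the final form of the first variation as

\[ \delta I = \int_G \left(\frac{\partial L}{\partial X_i} - \frac{d}{dt_\alpha}\left[\frac{\partial L}{\partial \dot{X}_{i\alpha}}\right]\right)\delta^{*}X_i\, d(t)
 + \int_G \frac{d}{dt_\alpha}\left(L\,\delta t_\alpha + \frac{\partial L}{\partial \dot{X}_{i\alpha}}\,\delta^{*}X_i\right) d(t). \]

Replace $\delta^{*}X_i$ in the second integral according to (5.17), then we get

\[
\begin{aligned}
\delta I = {} & \int_G \left(\frac{\partial L}{\partial X_i} - \frac{d}{dt_\alpha}\left[\frac{\partial L}{\partial \dot{X}_{i\alpha}}\right]\right)\delta^{*}X_i\, d(t) \\
 & + \int_G \frac{d}{dt_\alpha}\left[\left(L\,\delta_{\beta,\alpha} - \frac{\partial L}{\partial \dot{X}_{i\alpha}}\dot{X}_{i\beta}\right)\delta t_\beta + \frac{\partial L}{\partial \dot{X}_{i\alpha}}\,\delta X_i\right] d(t).
\end{aligned}
\tag{5.33}
\]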

This expression is the variational formula for the multiple integral. The integrand of the
second integral is a divergence, and by virtue of Gauss’s divergence theorem we can write it as an
(m − 1)–fold integral over the boundary ∂G of G. This depends only on the functions $X_i$ on the
boundary ∂G of G.

Consider a special variation for which $\delta t_\alpha = 0$ for $t_\alpha \in \partial G$. Furthermore we demand of the
subspaces $C_m(u_0)$ and $C_m(u)$ that they satisfy a boundary condition analogous to (5.3), namely

\[ X_i(t_\alpha, u) = X_i(t_\alpha, u_0) \quad\text{for all } t_\alpha \in \partial G, \tag{5.34} \]

which implies that these subspaces coincide for $t_\alpha \in \partial G$. Both terms under the second integral
are now zero and the variational formula becomes
\[ \delta I = \int_G \left(\frac{\partial L}{\partial X_i} - \frac{d}{dt_\alpha}\left[\frac{\partial L}{\partial \dot{X}_{i\alpha}}\right]\right)\delta^{*}X_i\, d(t). \tag{5.35} \]

The term inside the brackets should look familiar! In order to derive the Euler-Lagrange equations
we must generalize the procedure that we used in the case of the single integral. The procedure is
based on the following lemma which is also called the generalized lemma of du Bois-Reymond.

Lemma 5.1.1

If the function $F(t_\alpha)$ is continuous on G and if

\[ \int_G F(t_\alpha)\,\eta(t_\alpha)\, d(t) = 0 \tag{5.36} \]

for all η of class $C^1$ which vanish on the boundary ∂G, then $F(t_\alpha) = 0$ on G.

Proof:
Suppose, to the contrary, that $F(t_{0,\alpha}) > 0$ for some point $t_{0,\alpha}$ in G (if F is negative there, replace F
by −F). Since F is continuous, we can find a number r > 0 such that $F(t_\alpha) > 0$ on the closed set U
defined by
\[ \sum_{\alpha=1}^{m} (t_\alpha - t_{0,\alpha})^{2} \le r^{2}, \]

and this set is entirely contained in G. Define a function η by

\[ \eta(t_\beta) = \begin{cases} \left[r^{2} - \displaystyle\sum_{\alpha=1}^{m} (t_\alpha - t_{0,\alpha})^{2}\right]^{2}, & t_\beta \in U, \\[8pt] 0, & t_\beta \notin U. \end{cases} \]

This function satisfies the requirements of the lemma, but makes the integral positive. This is a
contradiction and hence our assertion is true. □

Using our theory of maxima and minima, we see that if the subspace $C_m(u_0)$ is to afford an
extreme value to the integral (5.4) as compared with all the other subspaces $C_m(u)$ which satisfy
the boundary conditions (5.34), it is necessary that δI in (5.35) should vanish. Apart from these
boundary conditions, the functions (5.7), and hence the derivatives $\partial X_i/\partial u$ in (5.8) which define the
$\delta^{*}X_i$, are arbitrary, while $\delta^{*}X_i = 0$ on ∂G since $\delta t_\alpha$ and $\delta X_i$ vanish on ∂G (see (5.17)). Thus,
as a special case, we choose all $\delta^{*}X_i = 0$ in (5.35) except one, say $\delta^{*}X_k$, which by Lemma 5.1.1 gives
\[ \sum_{\alpha=1}^{m} \frac{d}{dt_\alpha}\left[\frac{\partial L}{\partial (X_k)_{t_\alpha}}\right] - \frac{\partial L}{\partial X_k} = 0. \tag{5.37} \]

This process can be carried out for k = 1, 2, . . . , n. The conditions (5.37) are the necessary
conditions which a subspace Cm (u0 ) must satisfy to afford an extreme value to the fundamental
integral (5.4).

The equations (5.37) reduce to the Euler-Lagrange equations if m = 1, so we shall also refer
to these equations as the Euler-Lagrange equations for multiple integrals. Written out in
full, their structure is similar to that of the equations in Section 2.4 of Chapter 2, and hence they are a
set of second order partial differential equations in $X_i$, except in the case m = 1, when they are
ordinary differential equations.

As in the single integral case, we call a solution of the equations (5.37) an extremal. Note
that as in the single integral case, the extremal only satisfies necessary conditions. The sufficient
condition can also be derived by generalizing the complete figure. However, a variety of theories
can be formulated all of which reduce to the complete figure if m = 1. The best known theories
are those of Weyl and Carathéodory which are dealt with in [28], Chapter 4. We shall not go any
further with the multiple integral theory, and one example will suffice.

Example 5.1.1

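Consider the problem with Lagrangian function

\[ L(t_1, t_2, X, X_{t_1}, X_{t_2}) = (X_{t_1})^{2} + (X_{t_2})^{2} + 2X f(t_1, t_2), \]

where $f(t_1, t_2)$ is given. Here m = 2 and k = 1. Now

\[ \frac{\partial L}{\partial X_{t_1}} = 2X_{t_1}, \qquad \frac{\partial L}{\partial X_{t_2}} = 2X_{t_2}, \qquad \frac{\partial L}{\partial X} = 2f. \]

In view of (5.37), the Euler-Lagrange equation is

\[ \frac{d}{dt_1}\left(\frac{\partial L}{\partial X_{t_1}}\right) + \frac{d}{dt_2}\left(\frac{\partial L}{\partial X_{t_2}}\right) - \frac{\partial L}{\partial X} = 0. \]

Since

\[ \frac{d}{dt_1}\left(\frac{\partial L}{\partial X_{t_1}}\right) = \frac{d}{dt_1}\left(2X_{t_1}\right) = 2X_{t_1 t_1}
 \quad\text{and}\quad
 \frac{d}{dt_2}\left(\frac{\partial L}{\partial X_{t_2}}\right) = \frac{d}{dt_2}\left(2X_{t_2}\right) = 2X_{t_2 t_2}, \]

the Euler-Lagrange equation becomes

\[ X_{t_1 t_1} + X_{t_2 t_2} = f(t_1, t_2) \quad\text{or}\quad \frac{\partial^{2}X}{\partial t_1^{2}} + \frac{\partial^{2}X}{\partial t_2^{2}} = f, \]

which is Poisson’s equation.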
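
The role of Poisson's equation as the stationarity condition can be illustrated numerically. The sketch below makes several assumptions of its own: G is taken to be the unit square, f = −2π² sin(πt₁) sin(πt₂), so that X = sin(πt₁) sin(πt₂) solves Xₜ₁ₜ₁ + Xₜ₂ₜ₂ = f, and the perturbation η = sin(2πt₁) sin(πt₂) vanishes on the boundary. The functional is evaluated with a midpoint rule; all names are illustrative.

```python
import math


def functional(eps, n=100):
    # Midpoint-rule value of I[X + eps * eta] over the unit square (assumed domain).
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t1 = (i + 0.5) * h
        s1, c1 = math.sin(math.pi * t1), math.cos(math.pi * t1)
        s21, c21 = math.sin(2.0 * math.pi * t1), math.cos(2.0 * math.pi * t1)
        for j in range(n):
            t2 = (j + 0.5) * h
            s2, c2 = math.sin(math.pi * t2), math.cos(math.pi * t2)
            X = s1 * s2 + eps * s21 * s2                                  # X + eps * eta
            X1 = math.pi * c1 * s2 + eps * 2.0 * math.pi * c21 * s2      # derivative in t1
            X2 = math.pi * s1 * c2 + eps * math.pi * s21 * c2            # derivative in t2
            f = -2.0 * math.pi**2 * s1 * s2
            total += (X1**2 + X2**2 + 2.0 * X * f) * h * h
    return total


if __name__ == "__main__":
    base = functional(0.0)
    print(base)                                   # compare with -pi^2 / 2
    print(functional(0.1) - base, functional(-0.1) - base)
```

Because the first variation vanishes at the solution of Poisson's equation, the change in the functional is of second order in `eps` and has the same sign for `eps` and `-eps`.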

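Example 5.1.2

Consider the problem with Lagrangian function

\[ L(x, t, Y, Y_x, Y_t) = \frac{1}{2}(Y_x)^{2} - \frac{1}{2c^{2}}(Y_t)^{2}. \]
Here m = 2, k = 1 and Y = Y(x, t). In view of (5.37), the Euler-Lagrange equation is
\[ \frac{d}{dx}\left(\frac{\partial L}{\partial Y_x}\right) + \frac{d}{dt}\left(\frac{\partial L}{\partial Y_t}\right) - \frac{\partial L}{\partial Y} = 0, \]
where
\[ \frac{\partial L}{\partial Y_x} = Y_x, \qquad \frac{\partial L}{\partial Y_t} = -\frac{1}{c^{2}}Y_t, \qquad \frac{\partial L}{\partial Y} = 0. \]
Therefore we have
\[ \frac{d}{dx}Y_x - \frac{d}{dt}\left(\frac{1}{c^{2}}Y_t\right) - 0 = 0, \]
so the Euler-Lagrange equation becomes
\[ Y_{xx} - \frac{1}{c^{2}}Y_{tt} = 0 \quad\text{or}\quad Y_{xx} = \frac{1}{c^{2}}Y_{tt}, \]
which is the classical wave equation.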
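
As a quick sanity check, any function of the form Y = F(x − ct) + G(x + ct) should satisfy this equation. The sketch below tests one such function by second-order central differences. The particular choice of F and G, the wave speed c = 2 and the function names are assumptions made for illustration.

```python
import math

C = 2.0  # assumed wave speed for the illustration


def Y(x, t):
    # One travelling-wave combination F(x - ct) + G(x + ct).
    return math.sin(x - C * t) + 0.5 * math.cos(2.0 * (x + C * t))


def wave_residual(x, t, h=1e-3):
    # Central second differences for Y_xx and Y_tt; returns Y_xx - Y_tt / c^2.
    Y_xx = (Y(x + h, t) - 2.0 * Y(x, t) + Y(x - h, t)) / h**2
    Y_tt = (Y(x, t + h) - 2.0 * Y(x, t) + Y(x, t - h)) / h**2
    return Y_xx - Y_tt / C**2


if __name__ == "__main__":
    print(wave_residual(0.4, 1.3))
    print(wave_residual(-2.0, 0.2))
```

The residual is not exactly zero because of the truncation error of the difference quotients, but it shrinks as h is reduced.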

5.2 Equivalent Lagrangian functions

In Chapter 2, in Exercises 2.5.25 to 2.5.27, we briefly discussed why the Euler-Lagrange equations do
not change if a function $df(t, x_i)/dt$ is added to the Lagrangian function. Given a Lagrangian function
$L(t, x_i, dx_i/dt)$ we can construct a Lagrangian function
\[ L^{*}(t, x_i, \dot{x}_i) = kL(t, x_i, \dot{x}_i) + \frac{df(t, x_i)}{dt}, \tag{5.38} \]
where k is a non-zero constant, and the total derivative of f is given by (5.5) with m = 1. The
Euler-Lagrange equations for L and L∗ are identical. In general we can say that we can construct a
second Lagrangian function (5.38) which yields equations of motion which have the same solutions
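as in the case of the Lagrangian function L. Define
\[ L_i = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}_i}\right) - \frac{\partial L}{\partial x_i}, \qquad
 L_i^{*} = \frac{d}{dt}\left(\frac{\partial L^{*}}{\partial \dot{x}_i}\right) - \frac{\partial L^{*}}{\partial x_i}. \tag{5.39} \]
We say that the Euler-Lagrange equations $L_i = 0$ are equivalent to $L_i^{*} = 0$ when

\[ L_i^{*} = \Lambda_{ij}L_j, \tag{5.40} \]

where the matrix $\Lambda_{ij}$ is non-singular and is a function of t, $x_k$ and $dx_k/dt$ with

\[ \det(\Lambda_{ij}) \ne 0. \tag{5.41} \]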

The Lagrangian function L is then also equivalent to L∗ . It may happen that for some values of
t, xk and dxk /dt we get det (Λij ) = 0. The two Lagrangian functions may still be equivalent if the
solutions of det (Λij ) = 0 are also solutions of the differential equations Li = 0. We will now give a
brief description of the theory of equivalent Lagrangian functions. One of the consequences of this
theory is that in quantum mechanics two different but equivalent Lagrangian functions result in
two completely different quantum theories. This unfortunately is beyond the scope of this guide.

Example 5.2.1

Consider the motion of a free particle in 3-dimensional space. Denote the variables by $x_i$, i = 1, 2, 3,
and the time by t. According to Newton’s law of motion

\[ \ddot{x}_i = 0, \quad i = 1, 2, 3. \tag{5.42} \]

The Lagrangian function of such a system is given by Example 4.4.1 in Chapter 4. In this instance
we assume that the mass is normalized, so that m = 1. The Lagrangian function has the form

\[ L(x_i, \dot{x}_i) = \delta_{ij}\dot{x}_i\dot{x}_j. \tag{5.43} \]

If we replace the δij with a constant or a non-singular matrix (not necessarily positive definite),
the equation of motion is the same as (5.42).

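Define an equivalent Lagrangian function $L^{*}$ as an arbitrary function of $x_i$ and $dx_i/dt$. The
equation of motion is given by
\[ L_i^{*} = \Lambda_{ij}\ddot{x}_j = 0, \quad\text{with}\quad \Lambda_{ij} = \frac{\partial^{2}L^{*}}{\partial \dot{x}_i\,\partial \dot{x}_j}. \tag{5.44} \]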
We require Λij to be non-singular (otherwise the problem is not well defined). In the general
case Λij is only dependent on dxi /dt, which is a constant of the motion (integrate the equations
(5.42) once). These constants of motion (or first integrals) play an important role in the theory of
equivalent Lagrangian functions.

Consider the case of one dimension. Here the constants of motion are $c_1 = dx/dt$ and
$c_2 = x - (dx/dt)\,t$. Moreover, any constant of motion is a function of $c_1$ and $c_2$. We shall
consider the motion in one dimension and state the result in the form of a theorem.

Theorem 5.2.1

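Let $L^{*}(t, x, dx/dt)$ be a Lagrangian function which results in an equation of motion equivalent to that
of a free particle, namely
\[ L_1^{*} = \frac{d}{dt}\left(\frac{\partial L^{*}}{\partial \dot{x}}\right) - \frac{\partial L^{*}}{\partial x} = \Lambda\ddot{x}. \tag{5.45} \]
Then $\Lambda(t, x, dx/dt)$ is a constant of the motion.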

Proof:
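Write out the second expression in (5.39) for n = 1; then

\[ L_1^{*} = \frac{\partial^{2}L^{*}}{\partial \dot{x}^{2}}\ddot{x} + \frac{\partial^{2}L^{*}}{\partial \dot{x}\,\partial x}\dot{x} + \frac{\partial^{2}L^{*}}{\partial \dot{x}\,\partial t} - \frac{\partial L^{*}}{\partial x}, \tag{5.46} \]
so that $L_1^{*} = \Lambda\, d^{2}x/dt^{2}$ requires that
\[ \Lambda = \frac{\partial^{2}L^{*}}{\partial \dot{x}^{2}} \tag{5.47} \]
and
\[ \frac{\partial^{2}L^{*}}{\partial \dot{x}\,\partial x}\dot{x} + \frac{\partial^{2}L^{*}}{\partial \dot{x}\,\partial t} - \frac{\partial L^{*}}{\partial x} = 0. \tag{5.48} \]
These equations are identities: they hold whether the equation of motion is satisfied or not. Differentiate
(5.48) again with respect to dx/dt so that

\[ \frac{\partial^{3}L^{*}}{\partial \dot{x}^{2}\,\partial x}\dot{x} + \frac{\partial^{3}L^{*}}{\partial \dot{x}^{2}\,\partial t} = 0. \]
Evaluating the total derivative gives us
\[ \frac{d\Lambda}{dt} = \frac{\partial^{3}L^{*}}{\partial \dot{x}^{3}}\ddot{x} + \frac{\partial^{3}L^{*}}{\partial \dot{x}^{2}\,\partial x}\dot{x} + \frac{\partial^{3}L^{*}}{\partial \dot{x}^{2}\,\partial t} = 0, \]
where we used the differentiated form of (5.48) and the fact that the equation of motion $\ddot{x} = 0$ holds.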

Thus Λ is a constant of the motion and is of the form

Λ = Λ (c1 , c2 ) .

This statement is easy to prove and is given as an exercise at the end of the chapter. □

We may ask whether, for a given Λ, an $L^{*}$ exists which yields the equation of motion
$\Lambda\, d^{2}x/dt^{2} = 0$. The answer is yes. We shall first give an example before we deal with the general
case.

As an example, let
Λ = c2 = x − ẋt.
We prove that at least one L∗ exists for which

L∗1 = Λẍ.

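From (5.47) we get

\[ \Lambda = \frac{\partial^{2}L^{*}}{\partial \dot{x}^{2}} = c_2 = x - \dot{x}t. \tag{5.49} \]
The most general solution is
\[ L^{*} = \frac{1}{2}x\dot{x}^{2} - \frac{1}{6}t\dot{x}^{3} + F(t, x)\,\dot{x} + G(t, x). \tag{5.50} \]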

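To find F and G we impose the condition (5.48),

\[ \frac{\partial^{2}L^{*}}{\partial \dot{x}\,\partial x}\dot{x} + \frac{\partial^{2}L^{*}}{\partial \dot{x}\,\partial t} - \frac{\partial L^{*}}{\partial x} = 0, \]
and replace $L^{*}$ with (5.50) so that

\[ \dot{x}^{2} + \frac{\partial F}{\partial x}\dot{x} - \frac{1}{2}\dot{x}^{2} + \frac{\partial F}{\partial t} - \frac{1}{2}\dot{x}^{2} - \frac{\partial F}{\partial x}\dot{x} - \frac{\partial G}{\partial x}
 = \frac{\partial F}{\partial t} - \frac{\partial G}{\partial x} = 0. \]
One solution is F = G = 0. The most general solution would be where G is given and F is
determined by integrating with respect to t. In this case a function S exists for which

\[ \frac{dS(t, x)}{dt} = F(t, x)\,\dot{x} + G(t, x), \]
which we have already encountered. The divergence term which we added to the Lagrangian
function has no effect on the equation of motion, so that the solution of (5.49) is

\[ L^{*}(t, x, \dot{x}) = \frac{1}{2}x\dot{x}^{2} - \frac{1}{6}t\dot{x}^{3}. \tag{5.51} \]
We examine the general case in the next section.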
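
The result (5.51) can be verified numerically. Since the relation L₁* = Λẍ with Λ = x − ẋt is an identity in (t, x, ẋ, ẍ), it can be tested along any smooth path, not only along solutions. The sketch below uses the arbitrary test path x(t) = sin t and a central difference for the time derivative of ∂L*/∂ẋ. The path and the function names are assumptions made for illustration.

```python
import math


def x(t):
    return math.sin(t)       # arbitrary smooth test path, not a solution of xddot = 0


def xd(t):
    return math.cos(t)       # first derivative of the test path


def xdd(t):
    return -math.sin(t)      # second derivative of the test path


def p(t):
    # dL*/dxdot for L* = x xdot^2 / 2 - t xdot^3 / 6, evaluated along the path.
    return x(t) * xd(t) - 0.5 * t * xd(t) ** 2


def el_expression(t, h=1e-5):
    # L*_1 = d/dt (dL*/dxdot) - dL*/dx, with the time derivative taken numerically.
    dp_dt = (p(t + h) - p(t - h)) / (2.0 * h)
    dL_dx = 0.5 * xd(t) ** 2
    return dp_dt - dL_dx


def lambda_times_xdd(t):
    # Right hand side Lambda * xddot with Lambda = x - xdot * t.
    return (x(t) - t * xd(t)) * xdd(t)


if __name__ == "__main__":
    for t in (0.3, 0.8, 2.1):
        print(el_expression(t), lambda_times_xdd(t))
```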

The 1-dimensional case

The case of a particle moving in one dimension can be discussed very thoroughly, while the higher
dimensions contain many as yet unanswered questions. Consider the situation where there is just
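one dependent variable, namely x(t), with Lagrangian function L(t, x, dx/dt). The equation of
motion is $L_1 = 0$, with
\[ L_1 = \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}}\right) - \frac{\partial L}{\partial x} \tag{5.52} \]
or, written out,
\[ L_1 = \frac{\partial^{2}L}{\partial \dot{x}^{2}}\ddot{x} + \frac{\partial^{2}L}{\partial \dot{x}\,\partial x}\dot{x} + \frac{\partial^{2}L}{\partial \dot{x}\,\partial t} - \frac{\partial L}{\partial x} = 0. \tag{5.53} \]
We assume that
\[ \det\left(\frac{\partial^{2}L}{\partial \dot{x}^{2}}\right) \ne 0, \tag{5.54} \]
so that we can write
\[ \ddot{x} = \left(\frac{\partial^{2}L}{\partial \dot{x}^{2}}\right)^{-1}\left(-\frac{\partial^{2}L}{\partial \dot{x}\,\partial x}\dot{x} - \frac{\partial^{2}L}{\partial \dot{x}\,\partial t} + \frac{\partial L}{\partial x}\right). \tag{5.55} \]
We look for another Lagrangian function $L^{*}$ for which the equation of motion is $L_1^{*} = 0$. This
second equation of motion is also written in the form (5.55), and we assume a condition similar to
(5.54) for $L^{*}$. Hence
\[ \ddot{x} = \left(\frac{\partial^{2}L^{*}}{\partial \dot{x}^{2}}\right)^{-1}\left(-\frac{\partial^{2}L^{*}}{\partial \dot{x}\,\partial x}\dot{x} - \frac{\partial^{2}L^{*}}{\partial \dot{x}\,\partial t} + \frac{\partial L^{*}}{\partial x}\right). \tag{5.56} \]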

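The two Lagrangian functions are equivalent if their equations of motion have the same solution.
This can only happen if the right hand sides of (5.55) and (5.56) are identical. We define Λ as
\[ \Lambda \equiv \frac{\partial^{2}L^{*}}{\partial \dot{x}^{2}}\left(\frac{\partial^{2}L}{\partial \dot{x}^{2}}\right)^{-1}, \tag{5.57} \]
so that
\[ \frac{\partial^{2}L^{*}}{\partial \dot{x}\,\partial x}\dot{x} + \frac{\partial^{2}L^{*}}{\partial \dot{x}\,\partial t} - \frac{\partial L^{*}}{\partial x}
 = \Lambda\left(\frac{\partial^{2}L}{\partial \dot{x}\,\partial x}\dot{x} + \frac{\partial^{2}L}{\partial \dot{x}\,\partial t} - \frac{\partial L}{\partial x}\right). \tag{5.58} \]
This equation must hold identically when x is a solution of the equation of motion. We also see
that $L_1^{*} = \Lambda L_1$.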

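We prove firstly that Λ is a constant of the motion. The total derivative of Λ is given by
\[ \frac{d\Lambda}{dt} = \frac{\partial \Lambda}{\partial \dot{x}}\ddot{x} + \frac{\partial \Lambda}{\partial x}\dot{x} + \frac{\partial \Lambda}{\partial t}. \tag{5.59} \]
Solving (5.53) for ẍ without setting $L_1 = 0$, and substituting, gives us
\[
\begin{aligned}
\frac{d\Lambda}{dt} = {} & \frac{\partial \Lambda}{\partial \dot{x}}\left(\frac{\partial^{2}L}{\partial \dot{x}^{2}}\right)^{-1}\left(L_1 - \frac{\partial^{2}L}{\partial \dot{x}\,\partial x}\dot{x} - \frac{\partial^{2}L}{\partial \dot{x}\,\partial t} + \frac{\partial L}{\partial x}\right) \\
 & - \left(\frac{\partial^{2}L}{\partial \dot{x}^{2}}\right)^{-2}\frac{\partial^{2}L^{*}}{\partial \dot{x}^{2}}\left(\frac{\partial^{3}L}{\partial \dot{x}^{2}\,\partial x}\dot{x} + \frac{\partial^{3}L}{\partial \dot{x}^{2}\,\partial t}\right) \\
 & + \left(\frac{\partial^{2}L}{\partial \dot{x}^{2}}\right)^{-1}\left(\frac{\partial^{3}L^{*}}{\partial \dot{x}^{2}\,\partial x}\dot{x} + \frac{\partial^{3}L^{*}}{\partial \dot{x}^{2}\,\partial t}\right).
\end{aligned}
\tag{5.60}
\]

We also used the definition of Λ in (5.57). Differentiate the equation (5.58) with respect to dx/dt,
so that
\[
\begin{aligned}
\frac{\partial^{3}L^{*}}{\partial \dot{x}^{2}\,\partial x}\dot{x} + \frac{\partial^{3}L^{*}}{\partial \dot{x}^{2}\,\partial t}
 = {} & \frac{\partial \Lambda}{\partial \dot{x}}\left(\frac{\partial^{2}L}{\partial \dot{x}\,\partial x}\dot{x} + \frac{\partial^{2}L}{\partial \dot{x}\,\partial t} - \frac{\partial L}{\partial x}\right) \\
 & + \Lambda\left(\frac{\partial^{3}L}{\partial \dot{x}^{2}\,\partial x}\dot{x} + \frac{\partial^{3}L}{\partial \dot{x}^{2}\,\partial t}\right).
\end{aligned}
\tag{5.61}
\]

Using (5.61) we find that (5.60) becomes (do this substitution)

\[ \frac{d\Lambda}{dt} = \frac{\partial \Lambda}{\partial \dot{x}}\left(\frac{\partial^{2}L}{\partial \dot{x}^{2}}\right)^{-1}L_1, \]

whether $L_1 = 0$ or not. If $L_1 = 0$, so that x(t) is a solution, then Λ is a constant of the motion.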

Conversely, suppose we choose any constant of the motion Λ; then a Lagrangian function
$L^{*}$ exists such that $L_1^{*} = \Lambda L_1$. This new $L^{*}$ is uniquely determined up to the addition of a total
derivative of some function.

To prove the uniqueness of $L^{*}$ we proceed as follows. Suppose $L^{*}$ and $L'$ arise out of the
same Λ, so that $L_1^{*} = L_1' = \Lambda L_1$. To prove uniqueness all we need to do is to prove that $L^{*} - L'$ is a
total time derivative of some function. We rephrase this statement as follows: dropping the prime
on the second L and assuming that Λ = 1, we get that $L^{*} - L = df/dt$ (you will be asked to prove
this in the Exercises).

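We show that there is at least one $L^{*}$ such that $L_1^{*} = \Lambda L_1$ for any given constant of motion
Λ. We first solve for $L^{*}$ in
\[ \frac{\partial^{2}L^{*}}{\partial \dot{x}^{2}} = \Lambda\,\frac{\partial^{2}L}{\partial \dot{x}^{2}}. \tag{5.62} \]
Using integration by parts we get the solution of (5.62) to be
\[ L^{*}(t, x, \dot{x}) = \dot{x}\int_c^{\dot{x}} \Lambda\,\frac{\partial^{2}L}{\partial \dot{\chi}^{2}}\, d\dot{\chi}
 - \int_c^{\dot{x}} \dot{\chi}\,\Lambda\,\frac{\partial^{2}L}{\partial \dot{\chi}^{2}}\, d\dot{\chi}
 + A(t, x)\,\dot{x} + B(t, x). \tag{5.63} \]

The lower limit c of the integrals is a constant, although in general it may be a function of t
and x. Since dΛ/dt = 0, it follows that
\[ \frac{\partial \Lambda}{\partial x}\dot{x} + \frac{\partial \Lambda}{\partial t} = -\frac{\partial \Lambda}{\partial \dot{x}}\ddot{x}
 = \frac{\partial \Lambda}{\partial \dot{x}}\left(\frac{\partial^{2}L}{\partial \dot{x}^{2}}\right)^{-1}\left(\frac{\partial^{2}L}{\partial \dot{x}\,\partial x}\dot{x} + \frac{\partial^{2}L}{\partial \dot{x}\,\partial t} - \frac{\partial L}{\partial x}\right). \tag{5.64} \]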

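To find the functions A and B, substitute (5.63) into (5.58) so that

\[
\begin{aligned}
\frac{\partial^{2}L^{*}}{\partial \dot{x}\,\partial x}\dot{x} + \frac{\partial^{2}L^{*}}{\partial \dot{x}\,\partial t} - \frac{\partial L^{*}}{\partial x}
 &= \Lambda\left(\frac{\partial^{2}L}{\partial \dot{x}\,\partial x}\dot{x} + \frac{\partial^{2}L}{\partial \dot{x}\,\partial t} - \frac{\partial L}{\partial x}\right) \\
 &= \int_c^{\dot{x}} \left(\frac{\partial \Lambda}{\partial t}\frac{\partial^{2}L}{\partial \dot{\chi}^{2}} + \Lambda\,\frac{\partial^{3}L}{\partial \dot{\chi}^{2}\,\partial t}
 + \dot{\chi}\,\frac{\partial \Lambda}{\partial x}\frac{\partial^{2}L}{\partial \dot{\chi}^{2}} + \dot{\chi}\,\Lambda\,\frac{\partial^{3}L}{\partial x\,\partial \dot{\chi}^{2}}\right) d\dot{\chi}
 + \frac{\partial A}{\partial t} - \frac{\partial B}{\partial x}.
\end{aligned}
\tag{5.65}
\]
We now use (5.64) in the integrand of (5.65) and evaluate the integral (do this yourself),
obtaining
\[ \frac{\partial A}{\partial t} - \frac{\partial B}{\partial x} = \Lambda(t, x, c)\left(\frac{\partial^{2}L}{\partial \dot{x}\,\partial x}\dot{x} + \frac{\partial^{2}L}{\partial \dot{x}\,\partial t} - \frac{\partial L}{\partial x}\right)_{\dot{x}=c}. \tag{5.66} \]
The right hand side is a function of x and t only, so this equation can easily be solved for A and B.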

The higher dimensional case will not be discussed here and you can find out more about it
should you wish to do so (especially regarding unsolved questions) in [21], pp. 82-84.

5.3 Canonical transformations

We begin this section with the definition of a canonical transformation. But first we would like
to point out that most of the canonical formalism and transformation theory in mechanics was
already developed in the nineteenth century in the context of celestial mechanics, and the classical
work [27] appeared for the first time in 1892.

Consider the canonical variables $(t, x_j, p_j)$ associated with a Lagrangian function L of a
problem in $R^{n+1}$ (this implies that (4.1) is applicable). Consider the transformation
$(x_i, p_i) \to (X_i, P_i)$ given by
\[ X_i = X_i(x_j, p_j), \qquad P_i = P_i(x_j, p_j). \tag{5.67} \]

For the sake of clarity, the $X_i$ and $P_i$ on the left hand side are the new variables and, employing the
customary abuse of notation, the dependence on the right hand side is denoted by the same symbols
$X_i$ and $P_i$, now regarded as functions. In the case of the special transformation

Xi = Xi (xj ) , Pi = Pi (xj ) ,

we talk of a point transformation.

Definition 2

The transformation (5.67) is a canonical transformation if

• it is of class C2 , and

• a function (xi , pj ) → Ψ (xi , pi ) of class C2 exists such that the equation

dΨ = Pi dXi − pi dxi , (5.68)

is an identity.

It should be noted that in (5.68) the Pi and dXi are functions of xi and pi and they must also be of
class C 2 . We mention a few conditions which a transformation must satisfy to be canonical. The
definition is not always the most convenient way to check whether a transformation is canonical!
These conditions depend on the definition and can all be proved, but these proofs do not form part
of this module.

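In Section 4.3 we defined the Lagrange brackets using (4.43). We repeat the definition briefly: if
$x_i$ and $p_i$ are functions of the parameters $u$ and $v$, then the Lagrange bracket $[u, v]$ is defined by
\[
[u, v] = \sum_{i=1}^{n}\left(\frac{\partial x_i}{\partial u}\frac{\partial p_i}{\partial v} - \frac{\partial x_i}{\partial v}\frac{\partial p_i}{\partial u}\right). \tag{5.69}
\]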

Although $x_i$ and $p_i$ are canonical here, this is not necessary for the definition of the Lagrange
brackets. Using (5.67) we can associate $u$ and $v$ with the canonical variables $x_j$ and $p_h$; the
Lagrange brackets with respect to these variables are then given by
\[
[x_j, p_h] = \sum_{i=1}^{n}\left(\frac{\partial X_i}{\partial x_j}\frac{\partial P_i}{\partial p_h} - \frac{\partial X_i}{\partial p_h}\frac{\partial P_i}{\partial x_j}\right). \tag{5.70}
\]

Strictly speaking, we ought to indicate with respect to which variables we are differentiating, for
example $[\,\cdot\,,\cdot\,]_{x,p}$. This should, however, be evident in each case, so it will not be necessary to
indicate this specifically.

We shall introduce a few important aspects of canonical transformations in the form of theo-
rems. These theorems can all be proved but you will not be expected to do this. You do, however,
need to be able to apply these theorems so that you can show whether or not a transformation
is canonical.

Theorem 5.3.1

A necessary and sufficient condition for the class $C^2$ transformation (5.67) to be canonical is that
the Lagrange bracket relations

\[
[x_j, x_h] = 0, \qquad [x_j, p_h] = \delta_{jh}, \qquad [p_j, p_h] = 0 \tag{5.71}
\]

are satisfied identically. □

An important point that students often overlook when doing exercises is that, in the case
$n = 1$, the first and last identities in (5.71) are always satisfied. In this case it is only
necessary to check that $[x, p] = 1$.
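
The check $[x, p] = 1$ for $n = 1$ can also be carried out numerically before attempting it by hand. The following is a minimal sketch, not part of the module's theory: the helper name `lagrange_bracket`, the step size `h` and the two test transformations (an exchange $X = p$, $P = -x$ and a scaling $X = 2x$, $P = p$) are illustrative assumptions. A numerical value close to 1 at a sample point is only an indication, not a proof.

```python
def lagrange_bracket(X, P, x, p, h=1e-5):
    """Central-difference estimate of [x, p] = X_x P_p - X_p P_x for n = 1."""
    X_x = (X(x + h, p) - X(x - h, p)) / (2 * h)
    X_p = (X(x, p + h) - X(x, p - h)) / (2 * h)
    P_x = (P(x + h, p) - P(x - h, p)) / (2 * h)
    P_p = (P(x, p + h) - P(x, p - h)) / (2 * h)
    return X_x * P_p - X_p * P_x

# Hypothetical test case 1: exchange of coordinate and momentum, X = p, P = -x.
exchange = lagrange_bracket(lambda x, p: p, lambda x, p: -x, 0.7, 1.3)

# Hypothetical test case 2: scaling X = 2x, P = p, which is not canonical.
scaling = lagrange_bracket(lambda x, p: 2 * x, lambda x, p: p, 0.7, 1.3)

print(exchange, scaling)  # the first value should be close to 1, the second close to 2
```

A bracket that differs from 1 at even a single point shows that the transformation is not canonical.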

Furthermore as a result of Theorem 5.3.1, we could also use the identities (5.71) as definitions
of canonical transformations. The derivatives that appear in the relations are of the first order,
and thus one can relax condition (1) of the definition to class C 1 . This is what Carathéodory did
in [4], §89.

Theorem 5.3.2

The functional determinant of a canonical transformation has the value +1. □

This theorem implies the following theorem.

Theorem 5.3.3

A canonical transformation has a (local) inverse. □

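It can now be shown that the following identities are valid, namely

\[
[X_j, X_h] = 0, \qquad [X_j, P_h] = \delta_{jh}, \qquad [P_j, P_h] = 0, \tag{5.72}
\]

where
\[
[X_j, P_h] = \sum_{i=1}^{n}\left(\frac{\partial x_i}{\partial X_j}\frac{\partial p_i}{\partial P_h} - \frac{\partial x_i}{\partial P_h}\frac{\partial p_i}{\partial X_j}\right). \tag{5.73}
\]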

These identities put us in the position to prove the following:

Theorem 5.3.4

The inverse of a canonical transformation is also canonical. □

The group properties of canonical transformations can now be proved and we express the
result as a theorem.

Theorem 5.3.5

For every $n$ the totality of all canonical transformations of a set of canonical variables $(x_i, p_i)$,
$i = 1, \ldots, n$, is a group if the “composition of functions” is taken to be the group operation. □

We are now going to consider the important aspect of the invariance of the canonical equations
under a canonical transformation.
In order to do this we first need the following theorem, which has no practical value but is
important theoretically.

Theorem 5.3.6

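A necessary and sufficient condition for the transformation (5.67) (with inverse) to be canonical is
that the following reciprocity relations are valid:

\[
\frac{\partial X_i}{\partial x_h} = \frac{\partial p_h}{\partial P_i}, \qquad
\frac{\partial X_i}{\partial p_h} = -\frac{\partial x_h}{\partial P_i}, \qquad
\frac{\partial P_i}{\partial x_h} = -\frac{\partial p_h}{\partial X_i}, \qquad
\frac{\partial P_i}{\partial p_h} = \frac{\partial x_h}{\partial X_i}. \tag{5.74}
\]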


Note that the assumption of the existence of an inverse need only be made in the “sufficient” part.
Another aspect to be considered involves the Poisson brackets, which were defined in (4.115) in Section
4.5. We recall these briefly. If $F$ and $G$ are two functions of class $C^1$ in the canonical variables,
then the brackets are given by
\[
\{F, G\} = \sum_{i=1}^{n}\left(\frac{\partial F}{\partial x_i}\frac{\partial G}{\partial p_i} - \frac{\partial F}{\partial p_i}\frac{\partial G}{\partial x_i}\right). \tag{5.75}
\]

Note that in order to satisfy the Jacobi identity (4.119) it is necessary for the functions to be of
class C 2 .

It can now be shown with the help of the reciprocity relations (5.74) that if the transformation
(5.67) is canonical, then the following relations exist among the Lagrange brackets (of the inverse
functions xj = xj (Xi , Ph ) and pj = pj (Xi , Ph ) with regard to Xi and Ph ) and the Poisson brackets
(of the functions Xj = Xj (xi , ph ) and Pj = Pj (xi , ph ) with regard to xi and ph ), namely
[Xj , Xh ] = {Pj , Ph } , [Xj , Ph ] = {Xh , Pj } ,
[Pj , Ph ] = {Xj , Xh } . (5.76)
We immediately deduce that in order for the transformation (5.67) to be canonical the following
Poisson brackets relations must be valid, namely
{Xi , Xj } = 0, {Xi , Pj } = δij , {Pi , Pj } = 0. (5.77)
These relations are also sufficient. We can combine these results in a theorem.

Theorem 5.3.7
The transformation (5.67) is canonical if and only if the associated Poisson bracket relations (5.77)
are satisfied identically. □

If we exchange the roles of the two sets of variables $(x_i, p_j)$ and $(X_i, P_j)$, then we deduce
that a canonical transformation can be characterized by the equations
\[
\{x_i, x_j\}' = 0, \qquad \{x_i, p_j\}' = \delta_{ij}, \qquad \{p_i, p_j\}' = 0. \tag{5.78}
\]
The prime here indicates that the Poisson brackets must be formed with respect to the $(X_i, P_j)$,
i.e. for two functions $f(X_i, P_j)$, $g(X_i, P_j)$ we write
\[
\{f, g\}' = \sum_{i=1}^{n}\left(\frac{\partial f}{\partial X_i}\frac{\partial g}{\partial P_i} - \frac{\partial f}{\partial P_i}\frac{\partial g}{\partial X_i}\right). \tag{5.79}
\]

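Consider two arbitrary functions $f(x_i, p_i)$ and $g(x_i, p_i)$. The inverse of (5.67) transforms the
functions to
\[
F = F(X_j, P_j) = f\bigl(x_i(X_j, P_j),\, p_i(X_j, P_j)\bigr), \qquad
G = G(X_j, P_j) = g\bigl(x_i(X_j, P_j),\, p_i(X_j, P_j)\bigr).
\]
The Poisson bracket $\{f, g\}$ can now be expressed as (summation over the repeated indices $h$ and $j$ is implied)
\[
\begin{aligned}
\{f, g\} &= \sum_{i=1}^{n}\left(\frac{\partial f}{\partial x_i}\frac{\partial g}{\partial p_i} - \frac{\partial f}{\partial p_i}\frac{\partial g}{\partial x_i}\right) \\
&= \sum_{i=1}^{n}\Biggl[\left(\frac{\partial F}{\partial X_h}\frac{\partial X_h}{\partial x_i} + \frac{\partial F}{\partial P_h}\frac{\partial P_h}{\partial x_i}\right)\left(\frac{\partial G}{\partial X_j}\frac{\partial X_j}{\partial p_i} + \frac{\partial G}{\partial P_j}\frac{\partial P_j}{\partial p_i}\right) \\
&\qquad\quad - \left(\frac{\partial F}{\partial X_h}\frac{\partial X_h}{\partial p_i} + \frac{\partial F}{\partial P_h}\frac{\partial P_h}{\partial p_i}\right)\left(\frac{\partial G}{\partial X_j}\frac{\partial X_j}{\partial x_i} + \frac{\partial G}{\partial P_j}\frac{\partial P_j}{\partial x_i}\right)\Biggr].
\end{aligned}
\]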
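Simplifying this expression using the definition (5.75) gives us
\[
\begin{aligned}
\{f, g\} = \sum_{h,j=1}^{n}\Biggl(&\frac{\partial F}{\partial X_h}\frac{\partial G}{\partial X_j}\{X_h, X_j\} - \frac{\partial F}{\partial P_h}\frac{\partial G}{\partial X_j}\{X_j, P_h\} \\
&+ \frac{\partial F}{\partial X_h}\frac{\partial G}{\partial P_j}\{X_h, P_j\} + \frac{\partial F}{\partial P_h}\frac{\partial G}{\partial P_j}\{P_h, P_j\}\Biggr).
\end{aligned} \tag{5.80}
\]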
Suppose that the transformation

xi = xi (Xj , Pj ) , pi = pi (Xj , Pj ) , (5.81)

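which is the inverse of (5.67), represents a canonical transformation. This means that we can
apply the identities (5.77), so that
\[
\{f, g\} = \sum_{j=1}^{n}\left(\frac{\partial F}{\partial X_j}\frac{\partial G}{\partial P_j} - \frac{\partial F}{\partial P_j}\frac{\partial G}{\partial X_j}\right).
\]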

In terms of (5.79) this becomes
\[
\{f, g\} = \{F, G\}'. \tag{5.82}
\]
This means that the Poisson bracket of two arbitrary functions is invariant under a canonical
transformation.
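
The invariance of the Poisson bracket can be illustrated numerically for $n = 1$. The sketch below is an illustration under stated assumptions rather than part of the proof: it uses the oscillator transformation (5.87) with inverse (5.88) from Example 5.3.1 as the canonical transformation, and the value `OMEGA`, the test functions `f` and `g`, the sample point and the finite-difference helper `bracket` are all hypothetical choices.

```python
import math

OMEGA = 2.0  # illustrative frequency (assumption)

def to_new(x, p):
    """(x, p) -> (X, P) as in (5.87); atan2 equals arctan(OMEGA*x/p) for p > 0."""
    return math.atan2(OMEGA * x, p), (OMEGA**2 * x**2 + p**2) / (2 * OMEGA)

def to_old(X, P):
    """(X, P) -> (x, p) as in (5.88), taking the positive roots."""
    return (math.sqrt(2 * P / OMEGA) * math.sin(X),
            math.sqrt(2 * OMEGA * P) * math.cos(X))

def bracket(u, v, a, b, h=1e-5):
    """Central-difference Poisson bracket of u(a, b) and v(a, b) for n = 1."""
    u_a = (u(a + h, b) - u(a - h, b)) / (2 * h)
    u_b = (u(a, b + h) - u(a, b - h)) / (2 * h)
    v_a = (v(a + h, b) - v(a - h, b)) / (2 * h)
    v_b = (v(a, b + h) - v(a, b - h)) / (2 * h)
    return u_a * v_b - u_b * v_a

# Two arbitrary (hypothetical) test functions of the old variables.
def f(x, p):
    return x * x * p

def g(x, p):
    return math.sin(x) + p * p

# The same functions expressed in the new variables.
def F(X, P):
    return f(*to_old(X, P))

def G(X, P):
    return g(*to_old(X, P))

x0, p0 = 0.4, 1.1
X0, P0 = to_new(x0, p0)
old_value = bracket(f, g, x0, p0)   # {f, g} with respect to (x, p)
new_value = bracket(F, G, X0, P0)   # {F, G}' with respect to (X, P)
print(old_value, new_value)         # the two values should agree closely
```

Agreement at one sample point does not prove invariance, but a disagreement would reveal an error in the transformation or its inverse.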

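The converse is also valid. Suppose that (5.82) is valid and take $F = X_i$ and $G = P_h$. The
following identity holds:
\[
\{F, G\}' = \sum_{j=1}^{n}\left(\frac{\partial X_i}{\partial X_j}\frac{\partial P_h}{\partial P_j} - \frac{\partial X_i}{\partial P_j}\frac{\partial P_h}{\partial X_j}\right) = \sum_{j=1}^{n}\delta_{ij}\,\delta_{hj} = \delta_{ih}. \tag{5.83}
\]
So that
\[
\{X_i, P_h\} = \{X_i, P_h\}' = \delta_{ih}.
\]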
In a similar fashion we can derive the other expressions in (5.77). We have just proved the following
theorem.

Theorem 5.3.8

A necessary and sufficient condition for the transformation (5.67) (with inverse) to be canonical
is that it leaves invariant the Poisson bracket of a pair of arbitrary functions of class $C^1$ of the
canonical variables $x_i$ and $p_i$. □

The concept of invariance is contained in expression (5.82): any set of canonical
variables can be used to calculate the Poisson brackets. Theorem 5.3.8 puts us in a position to
prove that a canonical transformation leaves the canonical equations
\[
\dot{x}_i = \frac{\partial H}{\partial p_i}, \qquad \dot{p}_i = -\frac{\partial H}{\partial x_i} \tag{5.84}
\]
invariant. We should point out that we must assume that $t$ is unaffected by the
transformation. This implies that the mapping is of the form
\[
\left(\frac{dx_i}{dt}, \frac{dp_i}{dt}\right) \to \left(\frac{dX_i}{dt}, \frac{dP_i}{dt}\right). \tag{5.85}
\]
By invariance we mean that the equations (5.84) are mapped onto
\[
\dot{X}_i = \frac{\partial H}{\partial P_i}, \qquad \dot{P}_i = -\frac{\partial H}{\partial X_i},
\]
where
\[
H(t, X_i, P_i) = H\bigl(t,\, x_i(X_j, P_j),\, p_i(X_j, P_j)\bigr).
\]
The following theorem contains this result. Its proof is direct and is left as an exercise.

Theorem 5.3.9

A canonical transformation of the form (5.85) leaves the canonical equations (5.84) invariant. □
We end this section with an example.

Example 5.3.1

Consider the harmonic oscillator with Lagrangian function $L$, namely
\[
L(x, \dot{x}) = \frac{1}{2}\left(\dot{x}^2 - \omega^2 x^2\right),
\]
and Hamiltonian function (4.4), namely
\[
H(x, p) = \frac{1}{2}\left(p^2 + \omega^2 x^2\right). \tag{5.86}
\]
Consider the canonical transformation
\[
X = \arctan\frac{\omega x}{p}, \qquad P = \frac{1}{2\omega}\left(\omega^2 x^2 + p^2\right) \tag{5.87}
\]
(prove that it is indeed canonical) and its inverse
\[
x = \sqrt{\frac{2P}{\omega}}\,\sin X, \qquad p = \sqrt{2\omega P}\,\cos X \tag{5.88}
\]
(show this!), where we take the positive roots. The new Hamiltonian function is obtained by
substituting (5.88) in (5.86). This gives us

\[
H(X, P) = \omega P. \tag{5.89}
\]

The transformed canonical equations are then
\[
\dot{X} = \omega, \qquad \dot{P} = 0,
\]
so that $X = \omega t + c$, with $c$ an integration constant, and $P = \text{constant} = E/\omega$ (from (5.89)).
Substitution in (5.88) gives us the same solution as before:
\[
x = \sqrt{\frac{2E}{\omega^2}}\,\sin(\omega t + c), \qquad p = \sqrt{2E}\,\cos(\omega t + c). \tag{5.90}
\]
Although this problem is very simple, it shows very clearly how the canonical transformation func-
tions. Hamilton’s equations may look very different but in fact they are associated with the same
problem. This also provides a very good illustration of how canonical transformations can trans-
form the canonical equations in such a way that it is easier to find the solution.
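
The two routes to the solution in Example 5.3.1 can be compared numerically. The sketch below is an illustration only: it integrates the original canonical equations $\dot{x} = p$, $\dot{p} = -\omega^2 x$ with a classical fourth-order Runge-Kutta step (a technique chosen here, not taken from the guide) and compares the result with the solution obtained from $X = \omega t + c$, $P = \text{constant}$ mapped back through (5.88). The value `OMEGA`, the initial data and the step count are assumptions.

```python
import math

OMEGA = 1.5  # illustrative frequency (assumption)

def rk4_step(x, p, dt):
    """One Runge-Kutta step for x' = p, p' = -OMEGA^2 x (from H in (5.86))."""
    def rhs(x, p):
        return p, -OMEGA**2 * x
    k1 = rhs(x, p)
    k2 = rhs(x + 0.5 * dt * k1[0], p + 0.5 * dt * k1[1])
    k3 = rhs(x + 0.5 * dt * k2[0], p + 0.5 * dt * k2[1])
    k4 = rhs(x + dt * k3[0], p + dt * k3[1])
    x_new = x + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    p_new = p + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x_new, p_new

x0, p0, T, steps = 0.3, 0.8, 2.0, 2000  # hypothetical initial data and horizon

# Route 1: integrate the original canonical equations numerically.
x_num, p_num = x0, p0
for _ in range(steps):
    x_num, p_num = rk4_step(x_num, p_num, T / steps)

# Route 2: transform with (5.87), evolve trivially, transform back with (5.88).
X0 = math.atan2(OMEGA * x0, p0)               # arctan(OMEGA*x/p) for p > 0
P0 = (OMEGA**2 * x0**2 + p0**2) / (2 * OMEGA)
X_T, P_T = X0 + OMEGA * T, P0                 # X' = OMEGA, P' = 0
x_can = math.sqrt(2 * P_T / OMEGA) * math.sin(X_T)
p_can = math.sqrt(2 * OMEGA * P_T) * math.cos(X_T)

print(x_num, x_can)
print(p_num, p_can)
```

In the new variables no numerical integration is needed at all, which is the practical point of the example.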

5.4 Noether’s theorem for single integral problems

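Before we derive Noether’s theorem let us return to the so-called variational formula (5.33)
which we derived in Section 5.1 for the multiple integral. This formula was derived using variations
in the variables as well as of the domain of integration. To reduce this formula to the single
integral case we let $m = 1$. This results in
\[
\delta I = \int_{t_1}^{t_2}\left(\frac{\partial L}{\partial x_i} - \frac{d}{dt}\frac{\partial L}{\partial\dot{x}_i}\right)\delta^* x_i\, dt
+ \int_{t_1}^{t_2}\frac{d}{dt}\left[\left(L - \frac{\partial L}{\partial\dot{x}_i}\dot{x}_i\right)\delta t + \frac{\partial L}{\partial\dot{x}_i}\,\delta x_i\right] dt. \tag{5.91}
\]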

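If we now substitute the canonical momentum (4.1) and the Hamiltonian function (4.4) in (5.91),
we get
\[
\delta I = \int_{t_1}^{t_2}\left(\frac{\partial L}{\partial x_i} - \frac{d}{dt}\frac{\partial L}{\partial\dot{x}_i}\right)\delta^* x_i\, dt
+ \left[\sum_{i=1}^{n} p_i\,\delta x_i - H(t, x_j, \dot{x}_j)\,\delta t\right]_{t_1}^{t_2}, \tag{5.92}
\]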

where the canonical variables refer to a curve Γ which we compare with other curves. Note that
the end points as well as the path of integration are varied to obtain (5.92).

Consider the continuous $r$–parameter transformation in $R^{n+1}$ of the form

\[
\bar{t} = \bar{t}(t, x_j, \alpha_s), \qquad \bar{x}_i = \bar{x}_i(t, x_j, \alpha_s), \tag{5.93}
\]

where the r parameters are denoted by αs , s = 1, 2, . . . , r. The summation convention holds for s
from 1 to r. Furthermore we assume that the functions (5.93) are of class C 1 and that the identity
transformation is given by the values α1 = . . . = αr = 0. Thus, corresponding to (5.93) there
exists an infinitesimal transformation of the form

δt = ξs αs , δxi = ζi,s αs , (5.94)


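which is such that the higher powers of $\alpha_s$ can be neglected, and
\[
\xi_s = \left(\frac{\partial\bar{t}}{\partial\alpha_s}\right)_{\alpha_s = 0}, \qquad
\zeta_{i,s} = \left(\frac{\partial\bar{x}_i}{\partial\alpha_s}\right)_{\alpha_s = 0}. \tag{5.95}
\]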
The displacement δ ∗ xi in (5.92) is of this type (compare with the definition in (5.5) and (5.17)).
We define
δ ∗ xi = ηi,s αs , with ηi,s = ζi,s − ẋi ξs . (5.96)
To see the effect of the transformation on the Lagrangian function $L$, assume that $L$ has the
property of being “invariant up to an exact differential” in the sense of
\[
L\left(\bar{t}, \bar{x}_j, \frac{d\bar{x}_j}{d\bar{t}}\right) d\bar{t} = L(t, x_j, \dot{x}_j)\, dt + d\Phi(t, x_j, \alpha_s), \tag{5.97}
\]
with
\[
\Phi(t, x_j, \alpha_s) = \Phi_s(t, x_j)\,\alpha_s. \tag{5.98}
\]
The function $\Phi_s$ can also be zero, in which case we have invariance in its usual meaning. The
reason for the introduction of (5.97) is that it is more general. In this case we get
\[
\delta I = \int_{t_1}^{t_2} d\Phi_s(t, x_j)\,\alpha_s = \bigl[\Phi_s(t, x_j)\bigr]_{t_1}^{t_2}\,\alpha_s. \tag{5.99}
\]

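Substitute (5.94), (5.96) and (5.99) in (5.92); then we get
\[
\int_{t_1}^{t_2}\left(\frac{\partial L}{\partial x_i} - \frac{d}{dt}\frac{\partial L}{\partial\dot{x}_i}\right)\eta_{i,s}\,\alpha_s\, dt
= \left[\Phi_s - \sum_{i=1}^{n} p_i\,\zeta_{i,s} + H\,\xi_s\right]_{t_1}^{t_2}\alpha_s. \tag{5.100}
\]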

Assume that Γ is an extremal, that is Γ satisfies the Euler-Lagrange equations, then we see that
the left hand side of (5.100) vanishes. The αs are independent, according to the definition so that
we can formulate the following theorem.

Noether’s theorem (for a single integral): If a Lagrangian function $L$ is invariant under
the continuous $r$–parameter transformation (5.93) up to an exact differential in the sense of
(5.97), then the $r$ expressions
\[
\Psi_s = H\,\xi_s - \sum_{i=1}^{n} p_i\,\zeta_{i,s} + \Phi_s, \qquad s = 1, \ldots, r, \tag{5.101}
\]
are constant along any extremal, where $\xi_s$ and $\zeta_{i,s}$ are determined in terms of (5.95) by the
infinitesimal transformation.

In the following examples we apply the fundamental theorem and its purpose should become
evident.
Example 5.4.1

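In this example assume that $\Phi_s = 0$. Suppose the fundamental integral is invariant under the
one-parameter transformation

\[
\bar{x}_j = x_j, \qquad \bar{t} = t + w \qquad (w \text{ is the parameter}),
\]

which represents a time translation. This will be the case if
\[
\frac{\partial L}{\partial t} = 0,
\]
and from (5.95) it follows that
\[
\xi_s = \left(\frac{\partial\bar{t}}{\partial w}\right)_{w=0} = 1, \qquad
\zeta_{i,s} = \left(\frac{\partial\bar{x}_i}{\partial w}\right)_{w=0} = 0.
\]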

From (5.101) we get that H = constant along any extremal. Suppose that the Lagrangian function
refers to a dynamical system with n degrees of freedom, then the Hamiltonian function is associated
with the sum of the kinetic and potential energies of the system. The consequence of the fact that
H = constant can now be interpreted as the law of energy conservation. This means that for
this dynamical system the total energy does not increase or decrease with the passage of time. ♥

Example 5.4.2

In the example above (Example 5.4.1) we considered a time translation. The next logical step is to
consider a spatial translation. Suppose that the Lagrangian function L is invariant (Φ = 0) under
the n–parameter transformation
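\[
\bar{t} = t, \qquad \bar{x}_i = x_i + \alpha_i.
\]
Here we have
\[
\xi_s = \left(\frac{\partial\bar{t}}{\partial\alpha_s}\right)_{\alpha_s=0} = 0, \qquad
\zeta_{i,s} = \left(\frac{\partial\bar{x}_i}{\partial\alpha_s}\right)_{\alpha_s=0} = \delta_{is}, \qquad s = 1, \ldots, n.
\]

According to Noether’s theorem we have

\[
p_i\,\delta_{is} = p_s = \text{constant} \tag{5.102}
\]

along an extremal. If $L$ refers to a dynamical system with $n$ degrees of freedom, then (5.102)
expresses the law of conservation of linear momentum. ♥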

In Newtonian mechanics the conservation of momentum and energy is important.

• If a particle, for example, experiences no resultant force in a given direction, then its
momentum is conserved in that direction, i.e. its velocity in that direction is constant.

• Furthermore, consider a dynamical system with $n$ degrees of freedom and a Lagrangian function
$L$ which does not depend explicitly on a generalized coordinate $x_k$, but does depend on the
generalized velocity $dx_k/dt$. Then, according to the Euler-Lagrange equation,
\[
\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{x}_k}\right) = 0,
\]
or, using the definition of the momentum,
\[
p_k = \frac{\partial L}{\partial\dot{x}_k} = \text{constant}. \tag{5.103}
\]

Hence whenever the coordinate $x_k$ does not appear explicitly in $L$, the momentum $p_k$
is conserved, that is, it is a constant of the motion. Such a coordinate is said to be ignorable or
cyclic.

Equation (5.103) can be used to solve for the velocity $dx_k/dt$ in terms of $p_k$ and the other
velocities and coordinates. Once we have found $dx_k/dt$ we can eliminate it from the other $n - 1$
equations and in this way reduce the number of unknowns in the problem by one.

Example 5.4.3

As a simple example of the conservation of generalized momentum, consider the motion of a planet
about the sun. In polar coordinates the Lagrangian function of the planet is given by
\[
L = T - V = \frac{1}{2} m\left[\dot{r}^2 + r^2\dot{\theta}^2\right] - V(r),
\]
where $r$ and $\theta$ are chosen as generalized coordinates. Here $\theta$ is a cyclic coordinate and the
generalized momentum
\[
p_\theta = \frac{\partial L}{\partial\dot{\theta}} = m r^2\dot{\theta} = \text{constant}
\]
conjugate to $\theta$ is conserved during the motion of the planet. This is nothing other than the
conservation of angular momentum. Note that the constancy of $p_\theta$ is the result of the insensitivity
of the Lagrangian function $L$ to changes in $\theta$, that is, the Lagrangian function is invariant with
respect to rotations about the $z$–axis. This in turn is a consequence of the fact that the force field
(with potential function $V(r)$) is symmetric about the $z$–axis. The symmetry of the problem with
respect to $\theta$ is thus directly responsible for the conservation of the generalized momentum $p_\theta$
conjugate to $\theta$. In this case we say that the Lagrangian function possesses a symmetry associated
with a constant of the motion. Hence space displacement symmetry leads to the conservation of the
generalized momentum.
We must point out that such a generalized momentum conservation law depends mainly on
the choice of the generalized coordinates. If the Cartesian coordinates $x$ and $y$ are employed as
generalized coordinates in the example above, then
\[
L = T - V = \frac{1}{2} m\left(\dot{x}^2 + \dot{y}^2\right) - V(x, y)
\]
does not lead to the conservation of a generalized momentum, since neither $x$ nor $y$ is cyclic. ♥
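
The contrast between the two coordinate choices in Example 5.4.3 can be seen in a short numerical experiment. The sketch below rests on assumptions not made in the example: an inverse-distance potential $V(r) = -k/r$, the values `M` and `K`, the initial state, and a fourth-order Runge-Kutta integrator for the Cartesian equations of motion. It monitors $p_\theta = m(x\dot{y} - y\dot{x})$, which equals $mr^2\dot{\theta}$, alongside the Cartesian momentum $p_x = m\dot{x}$.

```python
M, K = 1.0, 1.0  # illustrative mass and force constant (assumptions)

def deriv(s):
    """Time derivative of the state (x, y, vx, vy) for V(r) = -K / r."""
    x, y, vx, vy = s
    r3 = (x * x + y * y) ** 1.5
    return (vx, vy, -K * x / (M * r3), -K * y / (M * r3))

def rk4_step(s, dt):
    """One classical Runge-Kutta step for the whole state tuple."""
    k1 = deriv(s)
    k2 = deriv(tuple(a + 0.5 * dt * b for a, b in zip(s, k1)))
    k3 = deriv(tuple(a + 0.5 * dt * b for a, b in zip(s, k2)))
    k4 = deriv(tuple(a + dt * b for a, b in zip(s, k3)))
    return tuple(a + dt * (b + 2 * c + 2 * d + e) / 6
                 for a, b, c, d, e in zip(s, k1, k2, k3, k4))

def p_theta(s):
    """Generalized momentum conjugate to theta, written in Cartesian form."""
    x, y, vx, vy = s
    return M * (x * vy - y * vx)

state = (1.0, 0.0, 0.0, 0.8)          # hypothetical initial position and velocity
p_theta_start, p_x_start = p_theta(state), M * state[2]
for _ in range(1000):                 # integrate up to t = 1
    state = rk4_step(state, 0.001)
p_theta_end, p_x_end = p_theta(state), M * state[2]

print(p_theta_start, p_theta_end)     # should agree to high accuracy
print(p_x_start, p_x_end)             # should differ noticeably
```

The same motion therefore has a conserved generalized momentum in polar coordinates and none in Cartesian coordinates, in line with the remark above.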

It is therefore advantageous to choose the generalized coordinates in such a way that as
many of them as possible are ignorable. This also leads to a noticeable simplification of the
equations of motion.

In Section 4.5 we demonstrated the conservation of the total energy of a holonomic dynamical
system with $n$ degrees of freedom by making use of the Poisson brackets and integrals of motion.
In that proof we were actually making implicit use of Noether’s theorem. As we said previously,
there is a significant difference between the total energy and the generalized
energy of a dynamical system which is non-holonomic.

Example 5.4.4

As an example of generalized energy conservation, consider the motion of a particle on
a smooth circular wire with radius $a$ which rotates around a vertical diameter with a constant angular
velocity $\omega$. This is a non-conservative system with Lagrangian function
\[
L = \frac{1}{2} m a^2\left[\dot{\theta}^2 + \omega^2\sin^2\theta\right] - m g a\cos\theta,
\]
where $\theta$ is chosen as the generalized coordinate. Since $L$ does not depend explicitly
on the time, the generalized energy of the system is conserved, that is
\[
H = \dot{\theta}\,\frac{\partial L}{\partial\dot{\theta}} - L = \frac{1}{2} m a^2\left[\dot{\theta}^2 - \omega^2\sin^2\theta\right] + m g a\cos\theta = \text{constant}.
\]

The quantity
\[
\frac{1}{2} m a^2\left[\dot{\theta}^2 + \omega^2\sin^2\theta\right] + m g a\cos\theta,
\]
on the other hand, which represents the total energy $T + V$ of the system, is not conserved.
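
The difference between the generalized energy $H$ and the total energy $T + V$ in Example 5.4.4 can be checked numerically. The sketch below is an illustration under stated assumptions: the parameter values, the initial state and the Runge-Kutta integrator are hypothetical choices, and the equation of motion $\ddot{\theta} = \omega^2\sin\theta\cos\theta + (g/a)\sin\theta$ is the Euler-Lagrange equation obtained from the Lagrangian of the example, with $T = \frac{1}{2}ma^2(\dot{\theta}^2 + \omega^2\sin^2\theta)$ and $V = mga\cos\theta$.

```python
import math

M, A, G, OMEGA = 1.0, 1.0, 9.81, 3.0  # illustrative parameters (assumptions)

def accel(theta):
    """theta'' from the Euler-Lagrange equation of the rotating-wire Lagrangian."""
    return OMEGA**2 * math.sin(theta) * math.cos(theta) + (G / A) * math.sin(theta)

def rk4_step(theta, thetadot, dt):
    """One classical Runge-Kutta step for (theta, thetadot)."""
    k1 = (thetadot, accel(theta))
    k2 = (thetadot + 0.5 * dt * k1[1], accel(theta + 0.5 * dt * k1[0]))
    k3 = (thetadot + 0.5 * dt * k2[1], accel(theta + 0.5 * dt * k2[0]))
    k4 = (thetadot + dt * k3[1], accel(theta + dt * k3[0]))
    theta_new = theta + dt * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    thetadot_new = thetadot + dt * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return theta_new, thetadot_new

def generalized_energy(theta, thetadot):
    """H = thetadot * dL/dthetadot - L."""
    return (0.5 * M * A**2 * (thetadot**2 - OMEGA**2 * math.sin(theta)**2)
            + M * G * A * math.cos(theta))

def total_energy(theta, thetadot):
    """T + V, with the rotation of the wire included in T."""
    return (0.5 * M * A**2 * (thetadot**2 + OMEGA**2 * math.sin(theta)**2)
            + M * G * A * math.cos(theta))

theta, thetadot = 1.0, 0.0            # hypothetical initial state
H_start, E_start = generalized_energy(theta, thetadot), total_energy(theta, thetadot)
for _ in range(300):                  # integrate up to t = 0.3
    theta, thetadot = rk4_step(theta, thetadot, 0.001)
H_end, E_end = generalized_energy(theta, thetadot), total_energy(theta, thetadot)

print(H_start, H_end)  # should agree to high accuracy
print(E_start, E_end)  # should differ: the motor driving the wire does work
```

The two quantities differ by $m a^2\omega^2\sin^2\theta$, so $T + V$ can only be constant if $\theta$ is.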

5.5 Exercises

The first few exercises deal with the multiple integral problem in the variational calculus.

5.5.1 Consider the problem of finding a surface of class $C^2$ in $R^3$ with the smallest possible area
which is bounded by a given closed non-self-intersecting curve $C$ with suitable properties (such as
smoothness, etc.). Let the projection of $C$ on the $(x, y)$–plane enclose a region $G$. It can be shown
(can you?) that this leads to the variational problem with the fundamental integral
\[
I = \iint_G \sqrt{1 + u_x^2(x, y) + u_y^2(x, y)}\; dx\, dy.
\]
Show that the Euler-Lagrange equation can be simplified as follows:
\[
u_{xx}\left(1 + u_y^2\right) + u_{yy}\left(1 + u_x^2\right) - 2 u_x u_y u_{xy} = 0. \tag{5.104}
\]

This problem is known as the problem of Plateau and even today it is the subject of active
research. For more information consult [6], p. 70 et seq and [10], p. 76,79.

5.5.2 In $R^4$ show that the problem of Plateau (see Exercise 5.5.1) leads to the Lagrangian function
\[
L(x, y, z, u, u_x, u_y, u_z) = \sqrt{1 + u_x^2 + u_y^2 + u_z^2}.
\]
Find the corresponding Euler-Lagrange equation and simplify it to a form which corresponds to
(5.104). What do you expect the Lagrangian function in $R^{n+1}$ to look like? What do you surmise
the accompanying Euler-Lagrange equation will be? Prove this surmise.

5.5.3 Find the Euler-Lagrange equation for the variational problem with the fundamental integral
\[
\int_{t_1}^{t_2}\!\!\int_0^1\left[\rho\left(\frac{\partial u}{\partial t}\right)^2 - \tau\left(\frac{\partial u}{\partial x}\right)^2\right] dx\, dt \qquad (\rho, \tau \text{ constant}).
\]
This integral is associated with the vibrating string, with the following boundary conditions:
$u(0, t) = u(1, t) = 0$, $u(x, t_1) = \varphi(x)$, $u(x, t_2) = \psi(x)$, where $\varphi$, $\psi$ are functions of class $C^1$ and
$\varphi(0) = \psi(0) = \varphi(1) = \psi(1)$.

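5.5.4 Find the Euler-Lagrange equation for the variational problem with the fundamental integral
\[
\int_{t_1}^{t_2}\!\!\iint_G\left[\rho\left(\frac{\partial u}{\partial t}\right)^2 - \tau\left\{\left(\frac{\partial u}{\partial x}\right)^2 + \left(\frac{\partial u}{\partial y}\right)^2\right\} + 2 f(t, x, y)\right] dx\, dy\, dt,
\]
where $\rho$ and $\tau$ are constants and $f$ is a given function. This integral is associated with the vibrating
membrane under an exterior force, with boundary conditions

\[
u(x, y, t)\big|_{(x,y)\in\partial G} = 0, \qquad u(x, y, t_1) = \varphi(x, y), \qquad u(x, y, t_2) = \psi(x, y),
\]

where $\varphi$ and $\psi$ are of class $C^1$ and vanish on $\partial G$.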

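5.5.5 Find the Euler-Lagrange equation for the variational problem with the fundamental integral
\[
\int_G\left(\frac{1}{2} A_{\alpha\beta}\frac{\partial u}{\partial t_\alpha}\frac{\partial u}{\partial t_\beta} + \frac{1}{2} m^2 u^2\right) dt_1 \ldots dt_4,
\]
where $m$ is a constant and
\[
A_{\alpha\beta} = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \alpha, \beta = 1, 2, 3, 4.
\]
This Euler-Lagrange equation is the well-known Klein-Gordon equation.
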
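5.5.6 In Exercise 4.6.6 the so-called reduced time-independent Hamilton–Jacobi equation (4.27)
was discussed and applied to the motion of a single particle, namely
\[
\frac{1}{2m}\left[\left(\frac{\partial S^*}{\partial x}\right)^2 + \left(\frac{\partial S^*}{\partial y}\right)^2 + \left(\frac{\partial S^*}{\partial z}\right)^2\right] + V(x, y, z) = E. \tag{5.105}
\]
Transform the dependent variable $S^*$ to $\psi$ via $S^* = K\ln\psi$, with $K$ a constant. Show that (5.105)
becomes
\[
\frac{K^2}{2m}\left[\left(\frac{\partial\psi}{\partial x}\right)^2 + \left(\frac{\partial\psi}{\partial y}\right)^2 + \left(\frac{\partial\psi}{\partial z}\right)^2\right] + (V - E)\,\psi^2 = 0.
\]
Consider the problem of minimizing the integral
\[
I = \iiint_G\left\{\frac{K^2}{2m}\left[\left(\frac{\partial\psi}{\partial x}\right)^2 + \left(\frac{\partial\psi}{\partial y}\right)^2 + \left(\frac{\partial\psi}{\partial z}\right)^2\right] + (V - E)\,\psi^2\right\} dx\, dy\, dz
\]
in the set of class $C^2$ functions which vanish on the boundary $\partial G$ of the domain of integration $G$
in $(x, y, z)$–space. Show that the Euler-Lagrange equation leads to the Schrödinger equation
\[
\frac{K^2}{2m}\nabla^2\psi + (E - V)\,\psi = 0.
\]
This is more or less the way that Schrödinger originally arrived at the equation that bears his name.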

The theoretical question on this section concludes the problems on multiple integrals.

5.5.7 Assume that the variational formula (5.33) is correct. Hence derive the Euler-Lagrange
equations as necessary conditions that a subspace Cm (u0 ) must satisfy to afford an extreme value
to the fundamental integral (5.4). Also state and prove the lemma that is necessary for your
derivation.

The following exercises deal with equivalent Lagrangian functions.

5.5.8 Consider a particle which moves freely in one dimension with coordinate q = q(t). Show
that
c1 = q̇, c2 = q − q̇t
are constants of the motion (that is, are constant along any extremal). Also show that any constant
of motion is a function of c1 and c2 .

5.5.9 Suppose that a particle moves freely in one dimension so that
\[
\Lambda = \frac{1}{2}\dot{q}^2 = \frac{1}{2}c_1^2.
\]
Obtain the most general Lagrangian function which yields the equation of motion
\[
\Lambda\ddot{q} = 0.
\]

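5.5.10 Show that if $\Lambda = 1$, a function $f(q, t)$ exists so that
\[
\bar{L} - L = \frac{df}{dt}.
\]
Hint: From (5.57) we get
\[
\frac{\partial^2\bar{L}}{\partial\dot{q}^2} = \frac{\partial^2 L}{\partial\dot{q}^2},
\]
so that after integration we have
\[
\bar{L} = L + A(q, t)\,\dot{q} + B(q, t).
\]
Why must $\partial A/\partial t = \partial B/\partial q$?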

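5.5.11 Denote the right-hand side of (5.66) by $G(t, q)$. Show that two forms of the solution are
\[
A(t, q) = \frac{\partial\Psi}{\partial q} + \int G\, dt, \qquad B = \frac{\partial\Psi}{\partial t},
\]
and
\[
A = \frac{\partial\Phi}{\partial q}, \qquad B(t, q) = \frac{\partial\Phi}{\partial t} - \int G\, dq,
\]
where $\Psi$ and $\Phi$ are arbitrary functions of $t$ and $q$.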

The following exercises deal with the canonical transformations.

5.5.12 Investigate whether the following mappings are canonical transformations.

5.5.12.1 $X = \ln\left(\dfrac{\sin p}{x}\right)$, $P = x\cot p$.

5.5.12.2 $X = \ln\left[1 + \sqrt{x}\cos p\right]$, $P = 2\left(1 + \sqrt{x}\cos p\right)\sqrt{x}\sin p$.

5.5.12.3 $X = \sqrt{2x}\,e^{k}\cos p$, $P = \sqrt{2x}\,e^{-k}\sin p$, $k$ = constant.

5.5.12.4 $X = \sqrt{\dfrac{2x}{k}}\cos p$, $P = \sqrt{2kx}\,\sin p$, $k$ = constant.

5.5.12.5 $X = x^{\alpha}\cos(\beta p)$, $P = x^{\alpha}\sin(\beta p)$, $\alpha$, $\beta$ are parameters.

5.5.12.6 $X = \arctan\left(\dfrac{\alpha x}{p}\right)$, $P = \dfrac{\alpha x^2}{2}\left(1 + \dfrac{p^2}{\alpha^2 x^2}\right)$, $\alpha$ = constant.

5.5.12.7 $X_1 = x_1$, $X_2 = p_2$, $P_1 = p_1 - 2p_2$, $P_2 = -2x_1 - x_2$.

5.5.12.8 $X_1 = x_1 x_2$, $X_2 = x_1 + x_2$, $P_1 = \dfrac{p_1 - p_2}{x_2 - x_1}$, $P_2 = \dfrac{x_1 p_2 - x_2 p_1}{x_2 - x_1} - (x_2 + x_1)$.

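5.5.13 Show that the transformation
\[
\begin{aligned}
X_1 &= (x_1)^2 + \lambda^2 (p_1)^2, \\
X_2 &= \frac{1}{2\lambda^2}\left[(x_1)^2 + (x_2)^2 + \lambda^2 (p_1)^2 + \lambda^2 (p_2)^2\right], \\
2\lambda P_1 &= -\arctan\left(\frac{x_1}{\lambda p_1}\right) + \arctan\left(\frac{x_2}{\lambda p_2}\right), \\
P_2 &= -\lambda\arctan\left(\frac{x_2}{\lambda p_2}\right),
\end{aligned}
\]
where $\lambda$ is a constant, is a canonical transformation. Discuss in a fair amount of detail the application
of this transformation to the dynamical system with the Hamiltonian function $H$ given by
\[
H(x_1, x_2, p_1, p_2) = \frac{1}{2\lambda^2}\left[(x_1)^2 + (x_2)^2 + \lambda^2 (p_1)^2 + \lambda^2 (p_2)^2\right].
\]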
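5.5.14 Show that the transformation
\[
\begin{aligned}
x_1 &= \left(\frac{X_1}{k_1}\right)^{1/2}\cos P_1 + \left(\frac{X_2}{k_2}\right)^{1/2}\cos P_2, \\
x_2 &= -\left(\frac{X_1}{k_1}\right)^{1/2}\cos P_1 + \left(\frac{X_2}{k_2}\right)^{1/2}\cos P_2, \\
p_1 &= \left[k_1 X_1\right]^{1/2}\sin P_1 + \left[k_2 X_2\right]^{1/2}\sin P_2, \\
p_2 &= \left[k_1 X_1\right]^{1/2}\sin P_1 + \left[k_2 X_2\right]^{1/2}\sin P_2,
\end{aligned}
\]
where $k_1$ and $k_2$ are constants, is a canonical transformation. Apply this to the dynamical system
with Hamiltonian function $H$, where
\[
H(x_1, x_2, p_1, p_2) = \frac{1}{2}\left[(p_1)^2 + (p_2)^2 + (k_1)^2 (x_1 - x_2)^2\right] + \frac{1}{2}(k_2)^2 (x_1 + x_2)^2,
\]
and discuss the consequences in detail.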

5.5.15 Prove the following property of Poisson brackets. Let $F(t, x_i, p_i)$ and $G(t, x_i, p_i)$ be
functions of class $C^2$. Then
\[
\frac{\partial}{\partial t}\{F, G\} = \left\{\frac{\partial F}{\partial t}, G\right\} + \left\{F, \frac{\partial G}{\partial t}\right\}.
\]
Is this result still true in general if $F$ and $G$ are only of class $C^1$? Why?

The last few exercises deal with Noether’s Theorem.

5.5.16 Show that the functional
\[
I = \int_{t_1}^{t_2}\dot{x}^2\, dt
\]
is invariant under the transformation
\[
\bar{t} = t + \epsilon, \qquad \bar{x} = x,
\]
with $\epsilon$ an arbitrary constant.

5.5.17 Investigate whether the functional
\[
I = \int_{t_1}^{t_2} t\,\dot{x}^2\, dt
\]
is invariant under the transformation
\[
\bar{t} = t + \epsilon, \qquad \bar{x} = x,
\]
with $\epsilon$ an arbitrary constant.

5.5.18 A particle is projected from a point O on the ground with a velocity u in a direction that
makes an angle α with the horizontal. There is no air resistance. Determine the motion of the
particle by making use of the conservation properties of the system.

5.5.19 Consider a simple plane pendulum that consists of a particle fixed to a string of length $\ell$.
After the pendulum is set in motion suppose that the length of the string decreases at a constant
rate $u$. If the point of suspension is fixed, discuss the conservation of energy of the particle.

Chapter 6

APPROXIMATE SOLUTION OF
VARIATIONAL PROBLEMS

In this chapter we discuss how to convert a boundary value problem (BVP) into a
variational problem. After that we present a few methods for obtaining approximate numerical
solutions of variational problems.

Objectives

At the end of this chapter you will be able to:

• Reduce a BVP to a variational problem

• Solve variational problems by direct methods

6.1 Reduction of BVP into Variational Problems

There are many problems in mathematics and physics where we need to solve a boundary
value problem (BVP). We can convert a BVP to a variational problem (VP) and solve that instead. In
this section, we describe how to convert a BVP to the corresponding VP. In Chapter 2 we saw that when
the value of a function changes from $x(t)$ to $x(t + \Delta t)$, the rate of change defines the derivative
$\dot{x}(t)$. In variational calculus, by contrast, the function $x(t)$ is changed to a new function $x(t) + \epsilon\eta(t)$,
where $\epsilon$ is a constant and $\eta(t)$ is a continuously differentiable function. The change $\epsilon\eta(t)$ in $x(t)$
is called the variation of $x(t)$ and is denoted by $\delta x$; that is, $\delta x = \epsilon\eta(t)$. Similarly we
have $\delta\dot{x} = \epsilon\dot{\eta}(t)$. For a fixed $t$, changing $x(t)$ to $x(t) + \epsilon\eta(t)$ changes $L = L(t, x, \dot{x})$
to $L(t, x + \epsilon\eta, \dot{x} + \epsilon\dot{\eta})$. Using a Taylor series, we obtain the first variation ($\delta L$)
and the second variation ($\delta^2 L$) as follows:
\[
\delta L = \frac{\partial L}{\partial x}\,\delta x + \frac{\partial L}{\partial\dot{x}}\,\delta\dot{x}, \tag{6.1}
\]
\[
\delta^2 L = \frac{1}{2}\left[\frac{\partial^2 L}{\partial x^2}(\delta x)^2 + 2\,\frac{\partial^2 L}{\partial x\,\partial\dot{x}}\,\delta x\,\delta\dot{x} + \frac{\partial^2 L}{\partial\dot{x}^2}(\delta\dot{x})^2\right]. \tag{6.2}
\]

Example 6.1.1

Let L(t, x, ẋ) = ẋ2 + 2xt + t3 then (in view of (6.1) and (6.2)) δL = 2tδx + 2ẋδ ẋ and δ 2 L = (δ ẋ)2 .
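
The result of Example 6.1.1 can be checked with a few lines of arithmetic. In the sketch below the sample point and the sizes of the variations `dx` and `dxdot` are arbitrary illustrative values. Because this particular $L$ is quadratic in $\dot{x}$ and linear in $x$, the Taylor series stops after the second variation, so the change in $L$ should equal $\delta L + \delta^2 L$ up to rounding.

```python
def L(t, x, xdot):
    """Lagrangian of Example 6.1.1."""
    return xdot**2 + 2 * x * t + t**3

def first_variation(t, xdot, dx, dxdot):
    """delta L = 2 t dx + 2 xdot dxdot, from (6.1)."""
    return 2 * t * dx + 2 * xdot * dxdot

def second_variation(dxdot):
    """delta^2 L = (dxdot)^2, from (6.2)."""
    return dxdot**2

t, x, xdot = 0.5, 1.2, -0.7   # arbitrary sample point (assumption)
dx, dxdot = 1e-3, 2e-3        # arbitrary small variations (assumption)

change = L(t, x + dx, xdot + dxdot) - L(t, x, xdot)
predicted = first_variation(t, xdot, dx, dxdot) + second_variation(dxdot)
print(change, predicted)
```

For a general Lagrangian the two numbers would agree only up to terms of third order in the variations.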

Variation is analogous to derivative in calculus.

Some rules of variational calculus


The variational operator δ follows the rules of differential operator d of calculus. Let L1 and L2 be
any continuous and differentiable functionals. Then we have the following results:

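(a) $\delta(L_1 \pm L_2) = \delta L_1 \pm \delta L_2$,

(b) $\delta(L_1 L_2) = L_2\,\delta L_1 + L_1\,\delta L_2$,

(c) $\delta\left(\dfrac{L_1}{L_2}\right) = \dfrac{L_2\,\delta L_1 - L_1\,\delta L_2}{L_2^2}$,

(d) $\delta(L^n) = n L^{n-1}\,\delta L$.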


It is easy to show that the operators $\dfrac{d}{dt}$ and $\delta$ commute. This property may
be written mathematically as
\[
\frac{d}{dt}(\delta x) = \delta\left(\frac{dx}{dt}\right). \tag{6.3}
\]
That is, the derivative of the variation of a function is identical to the variation of the derivative
of the same function.
Another commutative property states that the variation of the integral of a functional
$L$ is the same as the integral of the variation of the same functional, or mathematically
\[
\delta\int L\, dt = \int\delta L\, dt. \tag{6.4}
\]

Note that the two integrals must be evaluated between the same two limits.

Example 6.1.2 (i) δ(x2 ) = 2x δx,

(ii) δ(ẋ2 ) = 2ẋδ ẋ,

(iii) δ(tx) = tδx,

(iv) δ(t2 ) = 0.
Theorem 6.1.1

If $x(t)$ is an extremizing function for
\[
I[x] = \int_{t_0}^{t_1} L(t, x, \dot{x})\, dt, \qquad x(t_0) = x_0, \quad x(t_1) = x_1, \tag{6.5}
\]
then the first variation vanishes: $\delta I[x] = 0$.


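Proof: The first variation of $I$ is given by
\[
\delta I[x] = \int_{t_0}^{t_1}\delta L(t, x, \dot{x})\, dt = \int_{t_0}^{t_1}\left(\frac{\partial L}{\partial x}\,\delta x + \frac{\partial L}{\partial\dot{x}}\,\delta\dot{x}\right) dt
= \int_{t_0}^{t_1}\frac{\partial L}{\partial x}\,\delta x\, dt + \int_{t_0}^{t_1}\frac{\partial L}{\partial\dot{x}}\,\delta\dot{x}\, dt.
\]

Integrating the second term by parts,
\[
\int_{t_0}^{t_1}\frac{\partial L}{\partial\dot{x}}\,\delta\dot{x}\, dt = \int_{t_0}^{t_1}\frac{\partial L}{\partial\dot{x}}\,\frac{d}{dt}(\delta x)\, dt
= \left[\frac{\partial L}{\partial\dot{x}}\,\delta x\right]_{t_0}^{t_1} - \int_{t_0}^{t_1}\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{x}}\right)\delta x\, dt,
\]
where the boundary term vanishes because the end points are fixed, so that $\delta x(t_0) = \delta x(t_1) = 0$.

So
\[
\delta I[x] = \int_{t_0}^{t_1}\frac{\partial L}{\partial x}\,\delta x\, dt - \int_{t_0}^{t_1}\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{x}}\right)\delta x\, dt
= \int_{t_0}^{t_1}\left[\frac{\partial L}{\partial x} - \frac{d}{dt}\left(\frac{\partial L}{\partial\dot{x}}\right)\right]\delta x\, dt.
\]
We know that $\dfrac{\partial L}{\partial x} - \dfrac{d}{dt}\left(\dfrac{\partial L}{\partial\dot{x}}\right) = 0$ by the Euler-Lagrange equation for an extremizer. Hence
$\delta I[x] = 0$ if $x$ is an extremizing function.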

If a variational problem is given, the corresponding Euler-Lagrange equation is a BVP. We now
ask the converse question: if a BVP is given, can we find its corresponding variational problem? The
answer is yes for a certain class of BVPs. We demonstrate this with the following example:

Example 6.1.3

Reduce the BVP

ẍ − x + t = 0,
(6.6)
x(0) = x(1) = 0,

into a variational problem.

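Solution:
Multiply both sides of (6.6) by $\delta x$ and integrate over $(0, 1)$:
\[
\int_0^1\ddot{x}\,\delta x\, dt - \int_0^1 x\,\delta x\, dt + \int_0^1 t\,\delta x\, dt = 0.
\]

Integrating the first term by parts gives
\[
\bigl[\dot{x}\,\delta x\bigr]_0^1 - \int_0^1\dot{x}\,\delta\dot{x}\, dt - \int_0^1 x\,\delta x\, dt + \int_0^1 t\,\delta x\, dt = 0,
\]
where the boundary term vanishes because $\delta x(0) = \delta x(1) = 0$.

But $\delta(\dot{x}^2) = 2\dot{x}\,\delta\dot{x}$, $\delta(x^2) = 2x\,\delta x$ and $\delta(tx) = t\,\delta x$, so that
\[
-\int_0^1\frac{1}{2}\,\delta(\dot{x}^2)\, dt - \int_0^1\frac{1}{2}\,\delta(x^2)\, dt + \int_0^1\delta(tx)\, dt = 0,
\]
\[
\delta\int_0^1\left(-\frac{1}{2}\dot{x}^2 - \frac{1}{2}x^2 + tx\right) dt = 0,
\]
\[
\delta\int_0^1\left(\dot{x}^2 + x^2 - 2tx\right) dt = 0.
\]

This is of the form $\delta I[x] = 0$.

Thus the corresponding variational problem is
\[
\text{(VP)}\qquad \text{Extremize } I[x] = \int_0^1\left(\dot{x}^2 + x^2 - 2tx\right) dt, \qquad x(0) = 0, \quad x(1) = 0.
\]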

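If we find the Euler-Lagrange equation of the above VP, we have
\[
L = \dot{x}^2 + x^2 - 2tx, \qquad \frac{\partial L}{\partial x} = 2x - 2t, \qquad \frac{\partial L}{\partial\dot{x}} = 2\dot{x}.
\]
The Euler-Lagrange equation is given by
\[
\frac{d}{dt}(2\dot{x}) - (2x - 2t) = 0 \implies \ddot{x} - x + t = 0,
\]
which is the same as the original BVP.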
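
The equivalence in Example 6.1.3 can be illustrated numerically. The sketch below rests on facts not stated in the example: the closed-form function $x(t) = t - \sinh t/\sinh 1$ satisfies $\ddot{x} - x + t = 0$ with $x(0) = x(1) = 0$ (this can be checked by direct substitution), and for this quadratic functional $I[x + \epsilon\eta] - I[x] = \epsilon^2\int_0^1(\dot{\eta}^2 + \eta^2)\,dt$ when $x$ solves the BVP and $\eta$ vanishes at both ends. The perturbation $\eta(t) = \sin(\pi t)$, the value of $\epsilon$ and the composite Simpson rule are illustrative choices.

```python
import math

def x_exact(t):
    """Closed-form solution of x'' - x + t = 0, x(0) = x(1) = 0."""
    return t - math.sinh(t) / math.sinh(1.0)

def xdot_exact(t):
    return 1.0 - math.cosh(t) / math.sinh(1.0)

def functional(x, xdot, n=2000):
    """Composite Simpson approximation of I[x] = int_0^1 (x'^2 + x^2 - 2 t x) dt."""
    h = 1.0 / n
    total = 0.0
    for i in range(n + 1):
        t = i * h
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * (xdot(t)**2 + x(t)**2 - 2 * t * x(t))
    return total * h / 3

def perturbed(eps):
    """I evaluated at x + eps * sin(pi t), which keeps the boundary values."""
    x = lambda t: x_exact(t) + eps * math.sin(math.pi * t)
    xdot = lambda t: xdot_exact(t) + eps * math.pi * math.cos(math.pi * t)
    return functional(x, xdot)

I_exact = functional(x_exact, xdot_exact)
I_plus, I_minus = perturbed(0.1), perturbed(-0.1)
print(I_exact, I_plus, I_minus)  # both perturbed values should exceed I_exact
```

That the increase is the same for $+\epsilon$ and $-\epsilon$ reflects the vanishing of the first variation at the solution of the BVP.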

Example 6.1.4 Deflection of a rotating string of length $l$.

Consider the boundary value problem
\[
\frac{d}{dt}\left(F(t)\frac{dx}{dt}\right) + \rho w^2 x + p(t) = 0, \qquad x(0) = x(l) = 0, \tag{6.7}
\]
Figure 6.1

where
• x(t) - displacement of a point from the axis of rotation.

• F (t) - tension.

• ρ(t) - linear mass density.

• w - angular velocity of rotation.

• p(t) - intensity of distributed radial load.

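We now reduce this BVP to a variational problem as follows. Multiply both sides of (6.7) by $\delta x$
and integrate over $(0, l)$:
\[
\int_0^l\frac{d}{dt}\left(F(t)\frac{dx}{dt}\right)\delta x\, dt + \int_0^l\rho w^2 x\,\delta x\, dt + \int_0^l p(t)\,\delta x\, dt = 0.
\]

Integrating the first term by parts gives
\[
\int_0^l\frac{d}{dt}\left(F(t)\frac{dx}{dt}\right)\delta x\, dt = \left[F(t)\frac{dx}{dt}\,\delta x\right]_0^l - \int_0^l F(t)\frac{dx}{dt}\,\delta\!\left(\frac{dx}{dt}\right) dt = 0 - \int_0^l F(t)\,\dot{x}\,\delta\dot{x}\, dt,
\]
where the boundary term vanishes because $\delta x(0) = \delta x(l) = 0$.

Using $\delta(\dot{x}^2) = 2\dot{x}\,\delta\dot{x}$ and $\delta(x^2) = 2x\,\delta x$, the equation reduces to
\[
\int_0^l\delta\left(-\frac{1}{2}F\dot{x}^2\right) dt + \int_0^l\rho w^2\,\delta\left(\frac{1}{2}x^2\right) dt + \int_0^l\delta(px)\, dt = 0,
\]
\[
\delta\int_0^l\left(-\frac{1}{2}F\dot{x}^2 + \frac{1}{2}\rho w^2 x^2 + px\right) dt = 0.
\]
That is, $\delta I[x] = 0$.

Thus the variational problem is:
\[
\text{Extremize } I[x] = \int_0^l\left(-\frac{1}{2}F\dot{x}^2 + \frac{1}{2}\rho w^2 x^2 + px\right) dt, \qquad x(0) = 0, \quad x(l) = 0.
\]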

6.2 Direct Methods to Solve Variational Problems

In this section, we use two direct methods to solve variational problems.

6.2.1 Rayleigh-Ritz Method to find approximate solution

The Rayleigh–Ritz method is a direct method for minimizing a given functional[30]. It is direct
in the sense that it yields a solution to the variational problem without solving the associated
Euler-Lagrange Equation. It may be noted that, for most of the physical problems, the functional
we get from the variational principle is not simple and thus the solution using the EL equation
will be difficult to obtain. The Rayleigh-Ritz method is an approximate method where the given
functional is directly minimized without recourse to the associated EL equation.
Let $C^1[t_0, t_1]$ be the set of all continuously differentiable functions defined on $[t_0, t_1]$. Consider the
variational problem

\[
I[x] = \int_{t_0}^{t_1} L[t, x, \dot{x}]\,dt, \qquad x(t_0) = x_0, \quad x(t_1) = x_1.
\tag{6.8}
\]

Our objective is to minimize this integral. In the Rayleigh-Ritz method, we select a linearly
independent set of functions $\varphi_i$, called basis functions, and construct an approximate solution to
(6.8) that satisfies the prescribed boundary conditions. Let $x(t) \in C^1[t_0, t_1]$ be the solution
of the VP, and let $B = \{\varphi_0(t), \varphi_1(t), \ldots, \varphi_n(t), \ldots\}$ be a basis for the infinite-dimensional vector space
$C^1[t_0, t_1]$. Let $\bar{x}(t)$ be an approximation of $x(t)$ given by

\[
\bar{x}(t) = \omega(t) + \sum_{i=0}^{n} c_i\,\varphi_i(t),
\tag{6.9}
\]

where $\omega(t)$ satisfies the nonhomogeneous boundary conditions, if any, and each $\varphi_i(t)$ satisfies the homogeneous
boundary conditions. The unknown coefficients $c_i$ are to be determined, and $\bar{x}(t)$ approximates
the exact solution $x(t)$. With this choice the boundary
conditions $\bar{x}(t_0) = x_0$ and $\bar{x}(t_1) = x_1$ are satisfied. The problem is now to find an approximate solution $\bar{x}$ of

\[
I[\bar{x}] = \int_{t_0}^{t_1} L[t, \bar{x}, \dot{\bar{x}}]\,dt
= \int_{t_0}^{t_1} L\!\left[t,\ \omega(t) + \sum_{i=0}^{n} c_i\,\varphi_i(t),\ \dot{\omega}(t) + \sum_{i=0}^{n} c_i\,\dot{\varphi}_i(t)\right]dt,
\qquad \bar{x}(t_0) = x_0, \quad \bar{x}(t_1) = x_1.
\tag{6.10}
\]

Since $\varphi_0(t), \varphi_1(t), \ldots, \varphi_n(t)$ are known basis functions, the only unknowns are $c_0, c_1, \ldots, c_n$, and we have

\[
I[\bar{x}] = I[c_0, c_1, \ldots, c_n].
\tag{6.11}
\]

By ordinary calculus, a necessary condition for an extremum is

\[
\frac{\partial I}{\partial c_i} = 0, \qquad i = 0, 1, \ldots, n.
\tag{6.12}
\]

These are $n + 1$ algebraic equations in the $n + 1$ unknowns $c_0, c_1, \ldots, c_n$; they are linear
when $L$ is quadratic in $x$ and $\dot{x}$.
These ci are then substituted into the approximate solution (6.9). Now, if x̄(t) −→ x(t) as n −→ ∞
in some sense, then the procedure is said to converge to the exact solution.
The Rayleigh-Ritz method has two major limitations. First, the variational principle in equation
(6.8) may not exist in some problems such as in nonself-adjoint equations (odd order derivatives).
Second, it is difficult, if not impossible, to find the functions ω(t) satisfying the global boundary
conditions for the domains with complicated geometries.
The simplest $\varphi_i(t)$ are polynomials, such as Taylor, Chebyshev and Legendre polynomials.
For (6.9), we can choose $\varphi_i(t)$ and $\omega(t)$ as

\[
\varphi_i(t) = (t - t_0)(t - t_1)\,t^i, \qquad
\omega(t) = \frac{x_1 - x_0}{t_1 - t_0}\,(t - t_0) + x_0.
\tag{6.13}
\]

If we want to use trigonometric functions as basis functions, we can choose $\varphi_i(t)$ and $\omega(t)$ as

\[
\varphi_i(t) = \sin\!\left(i\pi\,\frac{t - t_0}{t_1 - t_0}\right), \quad i = 1, 2, \ldots, \qquad
\omega(t) = \frac{x_1 - x_0}{t_1 - t_0}\,(t - t_0) + x_0.
\tag{6.14}
\]
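
The choices (6.13) and (6.14) can be checked numerically. The following is a minimal Python sketch, not part of the original text: the function names (`omega`, `phi_poly`, `phi_sine`, `trial`) and the sample data are hypothetical and chosen only for illustration. It builds the trial function (6.9) and confirms that, whatever the coefficients, it takes the prescribed boundary values.

```python
import math

def omega(t, t0, t1, x0, x1):
    """Straight line through (t0, x0) and (t1, x1), as in (6.13) and (6.14)."""
    return (x1 - x0) / (t1 - t0) * (t - t0) + x0

def phi_poly(i, t, t0, t1):
    """Polynomial basis function (t - t0)(t - t1) t**i, i = 0, 1, ..."""
    return (t - t0) * (t - t1) * t**i

def phi_sine(i, t, t0, t1):
    """Sine basis function sin(i*pi*(t - t0)/(t1 - t0)), i = 1, 2, ..."""
    return math.sin(i * math.pi * (t - t0) / (t1 - t0))

def trial(t, coeffs, phi, first_index, t0, t1, x0, x1):
    """Trial function (6.9): omega plus a linear combination of basis functions."""
    total = omega(t, t0, t1, x0, x1)
    for k, c in enumerate(coeffs):
        total += c * phi(first_index + k, t, t0, t1)
    return total

# Hypothetical data, chosen only to exercise the formulas.
t0, t1, x0, x1 = 0.0, 2.0, 1.0, 4.0
coeffs = [0.3, -1.2, 0.7]

for phi, first in ((phi_poly, 0), (phi_sine, 1)):
    left = trial(t0, coeffs, phi, first, t0, t1, x0, x1)
    right = trial(t1, coeffs, phi, first, t0, t1, x0, x1)
    print(phi.__name__, left, right)  # should match x0 and x1 up to rounding
```

Because every basis function vanishes at both end points, the boundary values are carried entirely by $\omega(t)$, which is why the coefficients $c_i$ can be varied freely in (6.12).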

Example 6.2.1

Find an approximate solution of the following VP by the Rayleigh-Ritz method:

\[
I[x] = \int_0^1 \left(2tx - x^2 - \dot{x}^2\right)dt, \qquad x(0) = x(1) = 0.
\tag{6.15}
\]

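Solution: Let $\bar{x}(t) = c_0 + c_1 t + c_2 t^2$ be an approximate solution. Applying both boundary
conditions,

\[
\bar{x}(0) = 0 \implies c_0 = 0, \qquad
\bar{x}(1) = 0 \implies c_1 + c_2 = 0 \implies c_1 = -c_2.
\]

Thus $\bar{x}(t) = c_1 t(1 - t)$, where $c_1$ has to be determined. Now we substitute $\bar{x}(t)$ and $\dot{\bar{x}}(t)$ in (6.15):

\begin{align*}
I[c_1] &= \int_0^1 \left(2t\,\bar{x}(t) - \bar{x}^2(t) - \dot{\bar{x}}^2(t)\right)dt \\
&= \int_0^1 \left[2t\,\bigl(c_1 t(1 - t)\bigr) - \bigl(c_1 t(1 - t)\bigr)^2 - \bigl(c_1(1 - 2t)\bigr)^2\right]dt \\
&= \int_0^1 \left(-c_1^2 t^4 + 2c_1^2 t^3 - 2c_1 t^3 - 5c_1^2 t^2 + 2c_1 t^2 + 4c_1^2 t - c_1^2\right)dt \\
&= \frac{c_1}{6} - \frac{11 c_1^2}{30}. \tag{6.16}
\end{align*}

In view of (6.12),

\[
\frac{\partial I}{\partial c_1} = 0 \implies \frac{1}{6} - \frac{22}{30}\,c_1 = 0 \implies c_1 = \frac{5}{22}.
\tag{6.17}
\]

So $\bar{x}(t) = \frac{5}{22}\,t(1 - t)$ is the approximate solution.
The exact solution of (6.15) is $x(t) = t - \dfrac{e^{t} - e^{-t}}{e - e^{-1}}$. We plot the exact and approximate solutions.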

Figure 6.2

Note that, in view of (6.13), taking $\varphi_0(t) = t(t - 1)$ and $\omega(t) = 0$ with $n = 0$ gives
$\bar{x}(t) = \omega(t) + c_0\varphi_0(t) = c_0\,t(t - 1)$, which yields the same result.
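
The computation in Example 6.2.1 can be reproduced, and extended to more basis functions, with exact rational arithmetic. The sketch below is an illustration under stated assumptions rather than part of the original solution: it uses the polynomial basis (6.13) with $t_0 = 0$, $t_1 = 1$, and all helper names (`poly_mul`, `basis`, `rayleigh_ritz`, and so on) are hypothetical. For the functional (6.15), the conditions (6.12) become the linear system $\sum_i c_i \int_0^1 (\varphi_i\varphi_j + \dot{\varphi}_i\dot{\varphi}_j)\,dt = \int_0^1 t\,\varphi_j\,dt$.

```python
from fractions import Fraction
import math

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists [a0, a1, ...]."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def poly_diff(p):
    """Derivative of a polynomial coefficient list."""
    return [k * a for k, a in enumerate(p)][1:]

def poly_int01(p):
    """Exact integral of the polynomial over [0, 1]."""
    return sum((a / (k + 1) for k, a in enumerate(p)), Fraction(0))

def basis(i):
    """phi_i(t) = t (t - 1) t**i, i.e. (6.13) with t0 = 0 and t1 = 1."""
    coeffs = [Fraction(0)] * (i + 3)
    coeffs[i + 1] = Fraction(-1)
    coeffs[i + 2] = Fraction(1)
    return coeffs

def solve(A, b):
    """Gaussian elimination without pivoting (A is symmetric positive definite here)."""
    n = len(b)
    A = [row[:] for row in A]
    b = b[:]
    for k in range(n):
        for r in range(k + 1, n):
            f = A[r][k] / A[k][k]
            for c in range(k, n):
                A[r][c] -= f * A[k][c]
            b[r] -= f * b[k]
    x = [Fraction(0)] * n
    for k in range(n - 1, -1, -1):
        s = b[k] - sum(A[k][c] * x[c] for c in range(k + 1, n))
        x[k] = s / A[k][k]
    return x

def rayleigh_ritz(n):
    """Coefficients c_0..c_n from (6.12) for the functional (6.15), with omega = 0."""
    phis = [basis(i) for i in range(n + 1)]
    dphis = [poly_diff(p) for p in phis]
    t_poly = [Fraction(0), Fraction(1)]
    A = [[poly_int01(poly_mul(phis[i], phis[j])) + poly_int01(poly_mul(dphis[i], dphis[j]))
          for i in range(n + 1)] for j in range(n + 1)]
    b = [poly_int01(poly_mul(t_poly, phis[j])) for j in range(n + 1)]
    return solve(A, b), phis

def evaluate(coeffs, phis, t):
    """Value of the trial function sum_i c_i phi_i(t) as a float."""
    return sum(float(c) * sum(float(a) * t**k for k, a in enumerate(p))
               for c, p in zip(coeffs, phis))

def exact(t):
    """Exact solution t - (e^t - e^-t)/(e - e^-1), written with sinh."""
    return t - math.sinh(t) / math.sinh(1.0)

def max_error(n, samples=101):
    """Largest deviation from the exact solution on a uniform grid."""
    coeffs, phis = rayleigh_ritz(n)
    grid = [k / (samples - 1) for k in range(samples)]
    return max(abs(evaluate(coeffs, phis, t) - exact(t)) for t in grid)

for n in range(3):
    print(n, rayleigh_ritz(n)[0], max_error(n))
```

With $n = 0$ the sketch returns $c_0 = -\frac{5}{22}$, that is $\bar{x}(t) = \frac{5}{22}\,t(1 - t)$, in agreement with (6.17). Adding $\varphi_1$ reduces the maximum error on the grid, which illustrates the convergence statement made after (6.12).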

Example 6.2.2

Find an approximate solution of the following VP by the Rayleigh-Ritz method:

\[
I[x] = \int_0^1 \left(\dot{x}^2 - x^2 - 2tx\right)dt, \qquad x(0) = x(1) = 0.
\tag{6.18}
\]

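Solution: To solve this example, we use (6.14) and take $\varphi_1(t) = \sin(\pi t)$. Let $\bar{x}(t) = c_1 \sin(\pi t)$
be an approximate solution. Now we substitute $\bar{x}(t)$ in (6.18):

\begin{align*}
I[c_1] &= \int_0^1 \left(\dot{\bar{x}}^2 - \bar{x}^2 - 2t\,\bar{x}\right)dt \\
&= \int_0^1 \left[\bigl((c_1 \sin(\pi t))'\bigr)^2 - \bigl(c_1 \sin(\pi t)\bigr)^2 - 2t\,\bigl(c_1 \sin(\pi t)\bigr)\right]dt \\
&= \int_0^1 \left(-c_1^2 \sin^2(\pi t) + \pi^2 c_1^2 \cos^2(\pi t) - 2c_1 t \sin(\pi t)\right)dt \\
&= \frac{\pi^2 c_1^2}{2} - \frac{c_1^2}{2} - \frac{2c_1}{\pi}. \tag{6.19}
\end{align*}

In view of (6.12),

\[
\frac{\partial I}{\partial c_1} = 0 \implies \left(\pi^2 - 1\right)c_1 - \frac{2}{\pi} = 0 \implies c_1 = \frac{2}{\pi\left(\pi^2 - 1\right)}.
\tag{6.20}
\]

So $\bar{x}(t) = \dfrac{2\sin(\pi t)}{\pi(\pi^2 - 1)}$ is the approximate solution.
The exact solution of (6.18) is $x(t) = -t + \dfrac{\sin t}{\sin 1}$. We plot the exact and approximate solutions.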

Figure 6.3
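
The value of $c_1$ in (6.20) can also be obtained numerically, without evaluating the integrals in (6.19) by hand. The following Python sketch assumes a composite Simpson rule for the quadrature; the helper names (`simpson`, `phi`, `dphi`, `approx`, `exact`) are hypothetical and introduced only for this illustration. For the trial function $c\,\varphi(t)$ the functional (6.18) is $I(c) = c^2 \int_0^1 (\dot{\varphi}^2 - \varphi^2)\,dt - 2c \int_0^1 t\,\varphi\,dt$, so (6.12) gives $c$ as a ratio of two integrals.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

def phi(t):
    """Basis function sin(pi t) from (6.14) with t0 = 0, t1 = 1, i = 1."""
    return math.sin(math.pi * t)

def dphi(t):
    """Derivative of phi."""
    return math.pi * math.cos(math.pi * t)

# I(c) = c^2 * quad - 2 * c * lin for the trial function c * phi(t)
quad = simpson(lambda t: dphi(t) ** 2 - phi(t) ** 2, 0.0, 1.0)
lin = simpson(lambda t: t * phi(t), 0.0, 1.0)
c1 = lin / quad  # from dI/dc = 2 c quad - 2 lin = 0

def approx(t):
    """One-term Rayleigh-Ritz approximation."""
    return c1 * phi(t)

def exact(t):
    """Exact solution of (6.18)."""
    return math.sin(t) / math.sin(1.0) - t

print("c1 (quadrature) :", c1)
print("c1 (closed form):", 2 / (math.pi * (math.pi ** 2 - 1)))
for t in (0.25, 0.5, 0.75):
    print(t, approx(t), exact(t))
```

The same pattern applies when the integrals have no convenient closed form; only the two integrands need to change.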

6.2.2 Euler’s Finite Difference Method

Euler solved many variational problems by the method of finite differences. Suppose we want to
extremize the integral

\[
I[x] = \int_a^b L(t, x, \dot{x})\,dt, \qquad x(a) = \alpha, \quad x(b) = \beta.
\tag{6.21}
\]

Divide the interval $[a, b]$ into $n$ equal parts, each of width $\Delta t = \dfrac{b - a}{n} = h$:

\[
t_0 = a, \quad t_1 = t_0 + \Delta t = t_0 + h, \quad t_2 = t_1 + h, \quad \ldots, \quad t_n = t_{n-1} + h = b.
\]

Next, let $x_1, x_2, \ldots, x_{n-1}$ be the values of $x$ at $t_1 = t_0 + h$, $t_2 = t_0 + 2h$, $\ldots$, $t_{n-1} = t_0 + (n-1)h$
respectively, with $x_0 = \alpha$ and $x_n = \beta$ fixed by the boundary conditions. The values $x_1, x_2, \ldots, x_{n-1}$ are unknown because the function
that solves the problem is as yet unknown. The integral (6.21) is, by definition, the limit of a
sum, so we may approximate it by a function of the $n - 1$ variables $x_1, x_2, \ldots, x_{n-1}$:

\[
I[x] \approx I(x_1, x_2, \ldots, x_{n-1}) = h \sum_{i=0}^{n-1} L\!\left(t_i,\ x_i,\ \frac{x_{i+1} - x_i}{h}\right).
\tag{6.22}
\]

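In this way the derivative is replaced by a difference quotient and the integral by a finite sum. The
quantities $x_1, x_2, \ldots, x_{n-1}$ are determined so that

\[
\frac{\partial I}{\partial x_i} = 0, \qquad i = 1, 2, \ldots, n - 1.
\tag{6.23}
\]

Every $x_i$ (except $x_0$ and $x_n$) appears in exactly two terms of (6.22). Writing $\dot{x}_i = (x_{i+1} - x_i)/h$, (6.23) leads to

\begin{align*}
\frac{1}{h}\frac{\partial I}{\partial x_i}
&= \frac{\partial}{\partial x_i}\left[L\!\left(t_{i-1}, x_{i-1}, \frac{x_i - x_{i-1}}{h}\right) + L\!\left(t_i, x_i, \frac{x_{i+1} - x_i}{h}\right)\right] \\
&= \frac{\partial}{\partial x_i}\bigl[L(t_{i-1}, x_{i-1}, \dot{x}_{i-1}) + L(t_i, x_i, \dot{x}_i)\bigr] \\
&= \frac{1}{h}\frac{\partial L}{\partial \dot{x}_{i-1}} + \frac{\partial L}{\partial x_i} - \frac{1}{h}\frac{\partial L}{\partial \dot{x}_i} \\
&= \frac{\partial L}{\partial x_i} - \frac{1}{h}\left(\frac{\partial L}{\partial \dot{x}_i} - \frac{\partial L}{\partial \dot{x}_{i-1}}\right) = 0, \qquad i = 1, 2, \ldots, n - 1. \tag{6.24}
\end{align*}

Equation (6.24) is the finite difference version of the Euler equation: the difference quotient in the last line
approximates $\dfrac{d}{dt}\dfrac{\partial L}{\partial \dot{x}}$. As $n \to \infty$, $\Delta t \to 0$ and
(6.24) becomes the Euler equation $\dfrac{\partial L}{\partial x} - \dfrac{d}{dt}\dfrac{\partial L}{\partial \dot{x}} = 0$.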
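
To see how the sum (6.22) behaves as the grid is refined, it can be evaluated for a function whose integral is known. The sketch below is illustrative only: the name `discrete_functional`, the test Lagrangian $L = \dot{x}^2 + 6t^2x$ and the test function $x(t) = t^4/4$ on $[0, 2]$ are assumptions chosen for the check, for which $\int_0^2 L\,dt = \frac{320}{7}$.

```python
def discrete_functional(L, xs, a, b):
    """Finite sum (6.22): h * sum of L(t_i, x_i, (x_{i+1} - x_i)/h) for i = 0..n-1."""
    n = len(xs) - 1
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t_i = a + i * h
        slope = (xs[i + 1] - xs[i]) / h  # difference quotient replacing the derivative
        total += L(t_i, xs[i], slope)
    return h * total

def lagrangian(t, x, xdot):
    """Test Lagrangian, chosen only for this illustration."""
    return xdot ** 2 + 6 * t ** 2 * x

def sample(n, a=0.0, b=2.0):
    """Node values of the test function x(t) = t**4 / 4 on a uniform grid."""
    h = (b - a) / n
    return [(a + i * h) ** 4 / 4 for i in range(n + 1)]

exact_value = 320 / 7  # integral of t**6 + 1.5 * t**6 over [0, 2]
for n in (20, 200, 2000):
    value = discrete_functional(lagrangian, sample(n), 0.0, 2.0)
    print(n, value, abs(value - exact_value))
```

The discrepancy shrinks as $n$ grows, which is the sense in which (6.22) approximates (6.21). In the method itself the node values are not sampled from a known function but determined from (6.23).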

Example 6.2.3

Find an approximate solution of the following VP by Euler's finite difference method:

\[
I[x] = \int_0^2 \left(\dot{x}^2 + 6t^2 x\right)dt, \qquad x(0) = 0, \quad x(2) = 4.
\tag{6.25}
\]

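Solution: We divide $[0, 2]$ into $n = 4$ parts with $h = \dfrac{2 - 0}{4} = \dfrac{1}{2}$:

\[
t_0 = 0, \quad t_1 = \frac{1}{2}, \quad t_2 = 1, \quad t_3 = \frac{3}{2}, \quad t_4 = 2,
\]

so that $x_0 = 0$ and $x_4 = 4$ are known and $x_1, x_2, x_3$ are unknown. To obtain them, we use (6.22) to write the discrete form

\begin{align*}
I[x] \approx h \sum_{i=0}^{3} L\!\left(t_i, x_i, \frac{x_{i+1} - x_i}{h}\right)
&= h \sum_{i=0}^{3} \left[\left(\frac{x_{i+1} - x_i}{h}\right)^2 + 6t_i^2 x_i\right] \\
&= 2x_1^2 + 2(x_2 - x_1)^2 + \frac{3}{4}x_1 + 2(x_3 - x_2)^2 + 3x_2 + 2(4 - x_3)^2 + \frac{27}{4}x_3.
\end{align*}

Now, in view of (6.23),

\begin{align*}
\frac{\partial I}{\partial x_1} = 0 &\implies 8x_1 - 4x_2 = -\frac{3}{4}, \\
\frac{\partial I}{\partial x_2} = 0 &\implies 4x_1 - 8x_2 + 4x_3 = 3, \\
\frac{\partial I}{\partial x_3} = 0 &\implies -4x_2 + 8x_3 = \frac{37}{4}.
\end{align*}

Solving these algebraic equations, we find $x_1 = \frac{1}{16}$, $x_2 = \frac{5}{16}$, $x_3 = \frac{21}{16}$.
The Euler-Lagrange equation for this problem is

\[
\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}}\right) - \frac{\partial L}{\partial x} = 0 \implies \ddot{x} - 3t^2 = 0.
\tag{6.26}
\]

Solving this equation with the given boundary conditions, we obtain the exact
solution $x(t) = \dfrac{t^4}{4}$. Table 6.1 compares the exact and approximate solutions.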

t      Approximate solution    Exact solution
0      0                       0
0.5    0.0625                  0.0156
1      0.3125                  0.2500
1.5    1.3125                  1.2656
2      4                       4

Table 6.1: Comparison of the exact and approximate solutions
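
The linear system of Example 6.2.3 can be set up and solved for any $n$. For $L = \dot{x}^2 + 6t^2x$, condition (6.23) reads $-x_{i-1} + 2x_i - x_{i+1} = -3h^2t_i^2$, a tridiagonal system. The Python sketch below is a minimal illustration, with the hypothetical name `euler_fd` and exact rational arithmetic chosen for convenience; it solves the system by forward elimination and back substitution.

```python
from fractions import Fraction

def euler_fd(n, a=Fraction(0), b=Fraction(2), alpha=Fraction(0), beta=Fraction(4)):
    """Node values x_0..x_n from (6.23) for L = xdot**2 + 6 t**2 x on [a, b]."""
    h = (b - a) / n
    m = n - 1  # number of unknown interior values
    # Right-hand side of -x_{i-1} + 2 x_i - x_{i+1} = -3 h^2 t_i^2, i = 1..n-1
    rhs = [-3 * h**2 * (a + i * h) ** 2 for i in range(1, n)]
    rhs[0] += alpha   # known boundary value x_0 moved to the right-hand side
    rhs[-1] += beta   # known boundary value x_n moved to the right-hand side
    diag = [Fraction(2)] * m
    # Forward elimination; sub- and super-diagonal entries are all -1.
    for k in range(1, m):
        w = Fraction(-1) / diag[k - 1]
        diag[k] = diag[k] + w
        rhs[k] = rhs[k] - w * rhs[k - 1]
    # Back substitution.
    xs = [Fraction(0)] * m
    xs[-1] = rhs[-1] / diag[-1]
    for k in range(m - 2, -1, -1):
        xs[k] = (rhs[k] + xs[k + 1]) / diag[k]
    return [alpha] + xs + [beta]

coarse = euler_fd(4)
print([str(v) for v in coarse])  # interior values 1/16, 5/16, 21/16 as in the example

# Refining the grid: compare the value at t = 1 with the exact value 1/4.
for n in (4, 16, 64):
    sol = euler_fd(n)
    print(n, float(sol[n // 2]), float(abs(sol[n // 2] - Fraction(1, 4))))
```

With $n = 4$ the sketch reproduces the values in Table 6.1, and the error at $t = 1$ decreases as $n$ increases.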



6.3 Exercises

6.3.1 Prove relation (6.3).

6.3.2 Reduce the following BVP to a variational problem:

\[
\frac{d}{dt}\left(t\,\frac{dx}{dt}\right) + x = t.
\]

6.3.3 Find an approximate solution of the following VP by the Rayleigh-Ritz method:

\[
I[x] = \int_0^1 \left(\dot{x}^2 - x^2 - 2tx\right)dt, \qquad x(0) = x(1) = 0.
\]

6.3.4 Solve the same problem using finite differences.



References

[1] Akhiezer, Naum I, The calculus of variations, Translated from Russian by Aline H Frink,
Blaisdell Publishing Co, New York, 1962.

[2] Bliss, Gilbert A, Calculus of variations, Carus Mathematical Monographs No 1, Published
for The Mathematical Association of America by The Open Court Publishing Co, La Salle,
Illinois, 1925, Fifth printing, 1962.

[3] Carathéodory, Constantin, The beginning of research in the calculus of variations, Osiris 3
(1937), 224-240, Reprinted in Gesammelte mathematische Schriften 11, 93-107, CH
Beck, München, 1955.

[4] Carathéodory, Constantin, Basel und der Beginn der Variationsrechnung, Festschrift zum
60. Geburtstag von Prof Dr Andreas Speiser (Zürich), 1-18, Reprinted in Gesammelte
mathematische Schriften 11, 108-128, CH Beck, München, 1955.

[5] Carathéodory, Constantin, Calculus of variations and partial differential equations
of the first order, Second (revised) English edition, Translated from German by Robert B
Dean, Chelsea Publishing Co, New York, 1982.

[6] Clegg, John C, Calculus of variations, University Mathematical Texts 38, Oliver & Boyd,
Edinburgh, 1968.

[7] Courant, R, Calculus of variations, Revised and amended by J Moser, Supplementary notes
by Martin Kruskal and Hanan Rubin, Courant Institute of Mathematical Sciences, New York
University, New York, 1962.

[8] Eisgolc, LE, Calculus of variations, International Series of Monographs on Pure and Applied
Mathematics, Vol 19, Pergamon Press, Oxford, 1961.

[9] Eisgolc, L, Differential equations and the calculus of variations, Translated from
Russian by George Yankovsky, MIR Publishers, Moscow, 1970.

[10] Fox, Charles, An introduction to the calculus of variations. Oxford University Press,
London, 1950.

[11] Fung, YC, Foundations of solid mechanics, Prentice Hall Inc., Englewood Cliffs, New
Jersey, 1965.

[12] Funk, Paul, Variationsrechnung und ihre Anwendung in Physik und Technik,
Grundlehren der mathematischen Wissenschaften Band 94, Springer-Verlag, Berlin, 1962.

[13] Goldstein, Herbert, Classical Mechanics, Second edition, Addison-Wesley Publishing Co,
Reading, 1980.

[14] Goldstine, Herman H, A history of the calculus of variations from the 17th century
through the 19th century, Studies in the history of mathematics and physical sciences 5,
Springer-Verlag, New York, 1980.

[15] Grässer, HSP, A monograph on the general theory of second order parameter-invariant
problems in the calculus of variations, Mathematical Communications of
the University of South Africa M2, UNISA, Pretoria, 1967.

[16] Grässer, HSP, On a general Hamilton-Jacobi theory for m-th order single integral calculus of
variations problems, Part 2, The parameter-invariant case, Ist Lombardo Accad Sci Lett
Rend A105 (1971), 721-741.

[17] Klingbeil, Eberhard, Variationsrechnung, Bibliographisches Institut, Mannheim, 1977.

[18] Leitmann, George, The calculus of variations and optimal control. An introduction,
Plenum Press, New York, 1981.

[19] Lovelock, David and Rund, Hanno, Tensors, differential forms and variational
principles, John Wiley and Sons, New York, 1975.

[20] Lyusternik, LA, Shortest paths; Variational problems, Translated and adapted from
Russian by P Coffins and Robert B Brown, Popular Lectures in Mathematics Vol 13, Pergamon
Press, Oxford, 1964.

[21] Matzner, Richard A, Shepley, Lawrence C, Classical Mechanics, Prentice-Hall, Englewood
Cliffs, NJ, 1991.

[22] Miller, M, Variationsrechnung, Mathematisch-Naturwissenschaftliche Bibliothek 24, BG
Teubner Verlagsgesellschaft, Leipzig, 1959.

[23] Moore, E Neal, Theoretical mechanics, John Wiley and Sons, New York, 1983.

[24] Morse, Marston, The calculus of variations in the large, American Mathematical
Society Colloquium Publications 18, American Mathematical Society, Providence, RI, 1934,
Reprinted 1960.

[25] Morse, Marston, Variational Analysis: critical extremals and Sturmian extensions,
John Wiley and Sons, New York, 1973.

[26] Pars, LA, An introduction to the calculus of variations, Heinemann, London, 1962.

[27] Poincaré, H, Les Méthodes nouvelles de la Mécanique Céleste, Vols I, II, III, Paris,
1892, 1893, 1899, Dover Reprinted, New York, 1957.

[28] Rund, Hanno, The Hamilton-Jacobi theory in the calculus of variations, Van
Nostrand, London, 1966, Supplemented and corrected reprint, Krieger, Huntington, New York,
1973.

[29] Sagan, Hans, Introduction to the calculus of variations, McGraw-Hill, New York, 1969.

[30] Salih, Abdusamad A, Text of Finite Element Method, Indian Institute of Space Science
and Technology, Thiruvananthapuram, India, 2020.

[31] Smith, Donald R, Variational Methods in optimization, Prentice-Hall, Englewood Cliffs,
NJ, 1974.

[32] Weinstock, Robert, Calculus of variations. With applications to physics and
engineering, McGraw-Hill, New York, 1952, Unabridged, corrected and republished: Dover,
New York, 1974.
Index

ε-neighborhood, 32
canonical, 91
canonical equations, 89
canonical transformation, 132, 133, 135
chain’s equation, 71
composition of functions, 135
configuration space, 2
conservation law, 110
Conservative force, 7
divergence theorem, 120
double pendulum, 105
Euler-Lagrange equations, 36, 58, 66, 81
Euler’s Finite Difference Method, 157
extremal, 38
extremal field, 90, 91
first variation of the integral, 122
functional, 24
Gauss’s divergence theorem, 124
generalized coordinates, 12
generalized lemma of du Bois-Reymond, 125
geodesic gradient, 84, 86
Hamilton function, 81
Hamilton’s equations, 89
Hamilton’s variational principle, 7
Hamilton-Jacobi equation, 81, 87, 89
harmonic oscillator, 138
holonomic dynamical systems, 12
invariant, 138
isoperimetric problem, 68, 71
Jacobi’s identity, 110
Kinetic energy, 2
kinetic energy, 105
Kronecker delta, 12
Lagrange equations of motion, 59
Lagrange multipliers, 67
Lagrange-brackets, 91, 133
Lagrangian function, 57, 127, 128
Legendre-condition, 102
Mayer equations, 70
Mayer-field, 99
Newton’s second law, 7
Noether’s theorem, 139, 140
null class, 66
point transformation, 133
Poisson bracket, 137
Poisson bracket relations, 136
Poisson bracket theorem, 117
Poisson brackets, 109, 136
Potential energy, 2
Rayleigh-Ritz Method, 154
The brachistochrone problem, 25
transversal, 88
two body problem, 95
variational formula, 124
variational problem, 120
Weierstrass conditions, 97
Weierstrass excess function, 101
