An Introduction to Numerical Methods for the Physical Sciences
Colm T. Whelan

There is only a very limited number of physical systems that can be exactly described in terms of simple analytic functions. There are, however, a vast range of problems which are amenable to a computational approach. This book provides a concise, self-contained introduction to the basic numerical and analytic techniques, which form the foundations of the algorithms commonly employed to give a quantitative description of systems of genuine physical interest. The methods developed are applied to representative problems from classical and quantum physics.
ABOUT SYNTHESIS
This volume is a printed version of a work that appears in the Synthesis Digital Library of Engineering and
Computer Science. Synthesis Lectures provide concise original presentations of important research and
development topics, published quickly in digital and print formats. For more information, visit our website:
http://store.morganclaypool.com
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in
any form or by any means—electronic, mechanical, photocopy, recording, or any other except for brief quotations
in printed reviews, without the prior permission of the publisher.
DOI 10.2200/S01016ED1V01Y202006EST008
Lecture #8
Series ISSN: print 2690-0300; electronic 2690-0327
An Introduction to
Numerical Methods
for the Physical Sciences
Colm T. Whelan
Old Dominion University
Morgan & Claypool Publishers
ABSTRACT
There is only a very limited number of physical systems that can be exactly described in terms of
simple analytic functions. There are, however, a vast range of problems which are amenable to a
computational approach. This book provides a concise, self-contained introduction to the basic
numerical and analytic techniques, which form the foundations of the algorithms commonly
employed to give a quantitative description of systems of genuine physical interest. The methods
developed are applied to representative problems from classical and quantum physics.
KEYWORDS
differential equations, linear equations, polynomial approximations, variational
principles
Contents
Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii
1 Preliminaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Numbers and Errors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.3 Programming Languages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
6 Polynomial Approximations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
6.1 Interpolation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
6.1.1 Error Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
6.2 Orthogonal Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
6.2.1 Legendre Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
6.3 Infinite Dimensional Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
6.3.1 Zeros of Orthogonal Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
6.4 Quadrature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.4.1 Simpson Revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6.4.2 Weights and Nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
6.4.3 Gaussian Quadrature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
7 Sturm–Liouville Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
7.1 Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
7.2 Least Squares Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
9 Variational Principles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
9.1 Rayleigh–Ritz Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
9.2 The Euler–Lagrange Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
9.3 Constrained Variations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
9.4 Sturm–Liouville Revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
Preface
There is only a limited number of physical systems that can be exactly described in terms of
simple analytic functions. There are, however, a vast range of problems that are amenable to a
computational approach. This book provides a concise introduction to the essential numerical
and analytic techniques which form the foundations of algorithms commonly employed to give
a quantitative description of systems of genuine physical interest. Rather than providing a series
of useful programming recipes the philosophy of the book is to present in a coherent way the
underlying theory. I include some case studies illustrating the application to problems in classical
and quantum physics.
Colm T. Whelan
Norfolk, June 2020
CHAPTER 1
Preliminaries
Before diving into the study of numerical methods and their applications, it is worthwhile to
briefly think about how computers work and how we interact with them.
Computers store real numbers in floating point form,
r = (−1)^s f b^e,
where s, f, b, and e are integers; s determines the sign, f is the “significand” (or “coefficient”), b is the base (usually 2), and e is the exponent. The possible finite values that can be represented in a given format are determined by the number of digits in the significand f, the base b, and the number of digits in the exponent e. When the computer adds two floating point numbers, it first shifts them to have the same exponent and then adds. Clearly, only a finite number of places of decimals can be stored, and this leads to the potential for numerical errors.
• Roundoff Errors.
A roundoff error is the difference between the result produced by a given algorithm
using exact arithmetic and the result produced by the same algorithm using finite-
precision, rounded arithmetic. Such an error can grow in significance when we have a
large number of repeated operations.
• Cancellation Errors.
These occur when we subtract two almost equal numbers one from the other.
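Both error types are easy to demonstrate. The following sketch (Python is used purely for illustration; the numbers assume IEEE-754 double precision arithmetic) shows a roundoff residual and a catastrophic cancellation:

```python
import math

# Roundoff: summing 0.1 ten times does not give exactly 1.0, because 0.1
# has no exact binary floating point representation.
s = sum([0.1] * 10)
print(abs(s - 1.0))      # a tiny residual, of order 1e-16, not exactly zero

# Cancellation: subtracting two almost equal numbers loses significant digits.
x = 1.0e-8
naive = (1.0 - math.cos(x)) / x**2          # 1 - cos(x) cancels catastrophically
stable = 2.0 * math.sin(x / 2.0)**2 / x**2  # mathematically identical, but stable
print(naive, stable)     # the stable form stays close to the true value 1/2
```

The rewritten form avoids subtracting nearly equal quantities, which is exactly the strategy developed in the examples below.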
1.2 ALGORITHMS
The application of numerical methods to the description of physical problems proceeds through
the formulation and implementation of algorithms, i.e., sequences of well-defined instructions,
which a computer can interpret in an unambiguous way leading to a numerical solution of the
problem at hand. One of the challenges you will face in translating the equations of physics into
efficient and accurate computer code is that an expression that is mathematically correct may be
highly susceptible to numerical errors. We have to be careful to design our algorithms to avoid
such errors. I will illustrate this by two simple examples.
Example 1.1 Suppose we are looking for the roots of the quadratic
x² − 2bx + c = 0.
The standard formula gives the two roots as
x± = b ± √(b² − c).    (1.1)
If b² ≫ c, then x₋ is obtained as the difference of two nearly equal numbers and is highly susceptible to cancellation error. However, the product of the roots satisfies
x₋ x₊ = b² − (b² − c) = c,
so we may instead compute
x₋ = c / x₊,    (1.2)
which gives us an expression which is much less sensitive.
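The difference between the two forms is dramatic. A minimal sketch (the function names are mine, not the book's):

```python
import math

def roots_naive(b, c):
    # x^2 - 2bx + c = 0  ->  x = b +/- sqrt(b^2 - c); x_- suffers cancellation
    d = math.sqrt(b * b - c)
    return b + d, b - d

def roots_stable(b, c):
    # compute x_+ as before, then use x_- x_+ = c, as in (1.2)
    d = math.sqrt(b * b - c)
    x_plus = b + d
    return x_plus, c / x_plus

b, c = 1.0e8, 1.0
xp_n, xm_n = roots_naive(b, c)
xp_s, xm_s = roots_stable(b, c)
print(xm_n, xm_s)   # the naive small root is destroyed; the stable one is ~5e-9
```

With b = 10⁸ and c = 1 the naive subtraction returns exactly zero, while the stable form recovers the small root c/(2b) to full precision.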
Example 1.2 Consider the integrals
Iₙ = ∫₀¹ xⁿ eˣ dx.    (1.3)
Integrating by parts gives the recurrence relation
Iₙ = e − n Iₙ₋₁,
I₀ = e − 1.    (1.4)
In Figure 1.1 we plot the result of using the recurrence relation (1.4), for increasing n, compared with the direct numerical integration of (1.3). The issue with using the recursion relation (1.4) forward is that it is an “unstable algorithm,” which magnifies the initial error at each step. If Iₙ is the exact value, Ĩₙ our numerical estimate, and εₙ = Ĩₙ − Iₙ the error at each step, then the magnitude of the error is
|εₙ| = n |εₙ₋₁| = n! |ε₀|.
This error becomes rapidly larger as n increases. We note from the mean value theorem for integrals that Iₙ will go to zero as n increases. If now we rewrite (1.4) as
Iₙ₋₁ = (1/n) [e − Iₙ],    (1.5)
Figure 1.1: Evaluation of the integral Iₙ = ∫₀¹ xⁿ eˣ dx using: (i) direct numerical integration, open blue squares; (ii) backward recurrence, solid red disks; and (iii) forward recurrence, green crosses; the dashed lines are a “best fit” through the forward recurrence.
and choose n = N large enough that we can approximate I_N ≈ 0, then we can generate the smaller-n values using the backward recurrence relation. Figure 1.1 also shows the estimate for Iₙ obtained using the backward formula (1.5). The results are almost indistinguishable from those of the “exact” numerical integration (all the calculations shown are in single precision and the maximum N for the backward recurrence was taken to be 35).
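The forward and backward recurrences can be compared directly. The following sketch is my own (in double precision Python, rather than the single-precision calculation used for Figure 1.1), but it reproduces the same qualitative behavior:

```python
import math

e = math.e

# forward recurrence (1.4), unstable: the error is multiplied by -n at each step
I_fwd = [e - 1.0]
for n in range(1, 21):
    I_fwd.append(e - n * I_fwd[-1])

# backward recurrence (1.5), stable: the error is divided by n at each step
N = 35
I_bwd = {N: 0.0}                 # approximate I_N by 0 for large N
for n in range(N, 0, -1):
    I_bwd[n - 1] = (e - I_bwd[n]) / n

# the stable scheme gives an accurate I_20; the unstable one is wildly wrong
print(I_fwd[20], I_bwd[20])
```

Even though the backward scheme starts from the crude guess I₃₅ = 0, that initial error is divided by 35, then 34, and so on, and has vanished long before n = 20; the forward scheme amplifies a 10⁻¹⁶ rounding error by 20! ≈ 2 × 10¹⁸.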
1.3 PROGRAMMING LANGUAGES
• Fortran has been in constant use in computationally intensive areas for over seven decades; during that time it has evolved, adding extensions and refinements, while striving to retain compatibility with prior versions. “Modern Fortran” (Fortran 90/95/03/08) is still the dominant language for the large-scale simulation of physical systems, for things like the astrophysical modeling of stars and galaxies, the accurate calculation of electronic structure, hydrodynamics, molecular dynamics, and climate change. In the field of high performance computing (HPC), Modern Fortran also has a feature called “coarrays” which puts parallelization features directly into the language. Coarrays started as an extension of Fortran 95 and were incorporated into Fortran 2008 as standard. There is a huge “legacy” of libraries, both general, e.g., [1–3], and free academic libraries devoted to specific areas in the physical sciences, e.g., [4–8].
• C++ is more difficult to learn. It does have a good basis of libraries. On most benchmark tests C++ and Fortran are fairly equivalent. However, the two benchmarks where Fortran wins (n-body simulation and calculation of spectra) are the most relevant to physics.
In the physical sciences C++ and “Modern Fortran” are still the most widely used. The popular “Open MPI” libraries for parallelizing code were developed for these two languages. There is also
• C [9], which was designed to be compiled using a relatively straightforward compiler, to provide low-level access to memory and language constructs that map efficiently to machine instructions, all with minimal runtime support. Despite its low-level capabilities, the language was designed to encourage cross-platform programming. A standards-compliant C program written with portability in mind can be compiled for a wide variety of computer platforms and operating systems with few changes to its source code. The language is available on various platforms, from embedded microcontrollers to supercomputers. It is much easier to learn to code in than C++, which is not a direct extension of C. The reason for the name is that when object-oriented languages became popular, C++ was originally implemented as a source-to-source compiler: the source code was translated into C, and then compiled with a C compiler.
• Python [10] is easy to learn, with built-in libraries, but it is usually about 100 times slower than Fortran or C++ on benchmark tests. A good learning language, but not currently all that useful for advanced (real) physical problems.
The challenges you will face in first writing effective code have more to do with knowing how to recast the equations of mathematical physics in a way that minimizes numerical error than with the choice of high-level language in which you are comfortable coding.
CHAPTER 2
Some Elementary Results
2.1 TAYLOR'S SERIES
Theorem 2.1 Let f be a real function which is continuous and has continuous derivatives up to the (n + 1)th order; then
f(x) = Σ_{j=0}^{n} (f^{(j)}(a)/j!) (x − a)^j + R_{n+1}(x),
where the remainder term R_{n+1} may be written in the form (2.3) below.
Proof. [11].
Clearly, if Rₙ goes to zero uniformly as n → ∞ then we can find an infinite series. Examples are
eˣ = 1 + x + x²/2! + ⋯,
sin x = x − x³/3! + ⋯.
An alternative form for the remainder term can be derived by making use of the mean value theorem for integrals, i.e.,
R_{n+1} = ∫ₐˣ (f^{(n+1)}(t)/n!) (x − t)ⁿ dt = f^{(n+1)}(α) (x − a)(x − α)ⁿ / n!,    (2.3)
where α is some number, a ≤ α ≤ x. The form (2.3) is the Cauchy form of the remainder term. A further alternative form was derived by Lagrange:
R_{n+1}(x) = (f^{(n+1)}(β)/(n + 1)!) (x − a)^{n+1},    (2.4)
where β lies between a and x.
Theorem 2.2 Suppose f is a map from Rᴺ to R and is at least k + 1 times continuously differentiable; then
f(a + h) = Σ_{j=0}^{k} ((h·∇)ʲ f)(a) / j! + R(a, k, h),
R(a, k, h) = ((h·∇)^{k+1} f)(a + θh) / (k + 1)!,    (2.5)
for some θ ∈ (0, 1).
Proof. [12].
Example 2.3 The Taylor theorem in two dimensions. Expanding about a = (a, b),
f(x, y) = f(a, b) + (x − a) ∂f/∂x + (y − b) ∂f/∂y
  + (1/2!) [ (x − a)² ∂²f/∂x² + 2(x − a)(y − b) ∂²f/∂x∂y + (y − b)² ∂²f/∂y² ] + ⋯    (2.6)
2.1.1 EXTREMA
Suppose F(x) is a continuous function with a continuous first derivative. Suppose further that F has a local maximum at some point x₀; hence for some infinitesimal increment |h|,
F(x₀ ± |h|) ≤ F(x₀),
hence
(F(x₀ + |h|) − F(x₀)) / |h| < 0,
(F(x₀ − |h|) − F(x₀)) / (−|h|) > 0.
We can make |h| arbitrarily small; hence when we take the limit from the left and right, and since we have assumed the derivative is continuous, we must have
dF(x₀)/dx ≡ dF/dx |_{x₀} = 0.    (2.8)
Expanding F about x₀,
F(x) = F(x₀) + (x − x₀) dF(x₀)/dx + (1/2)(x − x₀)² d²F(x₀)/dx² + O((x − x₀)³)
  = F(x₀) + (1/2)(x − x₀)² d²F(x₀)/dx² + O((x − x₀)³).    (2.9)
The symbol O(xⁿ) means terms of order xⁿ or higher. Now since (x − x₀)² > 0, and since we can choose x arbitrarily close to x₀, we see at once that
F has a maximum at x₀ if d²F(x₀)/dx² < 0;
F has a minimum at x₀ if d²F(x₀)/dx² > 0.
Suppose now f(x, y) is a differentiable function of two variables; f(x, y) = z defines a two-dimensional surface in three-space. We can think of this surface as being constructed from a series of curves of the form Φ(x) = f(x, y₀) = z and Ψ(y) = f(x₀, y) = z; you might think of lines of latitude and longitude on the earth. Clearly, a necessary and sufficient condition for a maximum is that both Φ and Ψ have maxima. Thus, the condition for the function f(x, y) to have an extremum, maximum or minimum, at (x₀, y₀) is
∂f(x₀, y₀)/∂x = ∂f(x₀, y₀)/∂y = 0,    (2.10)
but
df = (∂f/∂x) dx + (∂f/∂y) dy,    (2.11)
so equivalently
df(x₀, y₀) = 0.    (2.12)
Suppose we want to find the extrema of f(x, y) where x and y are related by some extra condition
g(x, y) = 0.    (2.13)
Now define a new function of the three variables x, y, λ,
l(x, y, λ) = f(x, y) + λ g(x, y),
where λ is independent of x and y; the exact value of λ will be determined later. The extrema of this function satisfy
∂l/∂x = ∂f/∂x + λ ∂g/∂x = 0,
∂l/∂y = ∂f/∂y + λ ∂g/∂y = 0,
∂l/∂λ = g(x, y) = 0.    (2.15)
The first two equations must be satisfied by the extrema, and the third is just the constraint condition. We have three equations for three unknowns; thus the extrema of f(x, y) subject to the constraint g = 0 can be found by solving for the extrema of the function l(x, y, λ).
Example 2.4 Suppose you want to find the maximum and minimum values of the function
f(x, y) = xy,
where
x² + y² = 4.
Then you could proceed by direct substitution and look for the roots of f′(y₀) = 0, i.e.,
x = ±√(4 − y²),
f(y) = x(y) y,
f′(y₀) = ± [ −y₀²/√(4 − y₀²) + √(4 − y₀²) ] = 0,
⇒ ±(4 − 2y₀²) = 0,
⇒ y₀ = ±√2,
x₀² + y₀² = 4,
⇒ x₀ = ±√2,
as before.
One of the advantages of the Lagrange multiplier approach is that it easily generalizes to
higher dimensions.
Example 2.5 Suppose you wish to find the maximum and minimum values of
f(x, y, z) = xyz,
on the sphere
x² + y² + z² = 3.
Let
g(x, y, z) = x² + y² + z² − 3.
Solving the Lagrange equations ∇l = 0 for l = f + λg gives x = ±1; in the same way y = ±1, z = ±1. We have 8 possible candidates for extrema:

x    y    z    f(x, y, z)
1    1    1     1
−1   1    1    −1
1   −1    1    −1
1    1   −1    −1
−1  −1    1     1
1   −1   −1     1
−1   1   −1     1
−1  −1   −1    −1

So the minimum value is −1, the maximum is +1.
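The table can be checked by brute force. This little script (illustrative only; the random sampling is my own check, not from the book) enumerates the candidate points and confirms that no randomly sampled point on the sphere does better:

```python
import random
from itertools import product

# candidate extrema of f(x,y,z) = xyz on x^2 + y^2 + z^2 = 3: all of x, y, z = +/-1
candidates = list(product((1, -1), repeat=3))
values = sorted(x * y * z for x, y, z in candidates)
print(values[0], values[-1])      # minimum -1, maximum +1

# crude numerical check: sample the sphere and track the largest |xyz| seen
random.seed(0)
best = 0.0
for _ in range(100000):
    x, y, z = (random.gauss(0.0, 1.0) for _ in range(3))
    r = (x * x + y * y + z * z) ** 0.5
    s = 3 ** 0.5 / r              # rescale the point onto the sphere of radius sqrt(3)
    best = max(best, abs((s * x) * (s * y) * (s * z)))
print(best)                        # approaches, but never exceeds, 1
```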
If the power series
s₁(x) = Σ_{n=0}^{∞} aₙ xⁿ and s₂(x) = Σ_{n=0}^{∞} bₙ xⁿ
are both convergent within the interval I and α, β are numbers, then
s₃(x) = Σ_{n=0}^{∞} (α aₙ + β bₙ) xⁿ
is also convergent within I.
2.2 NUMERICAL DIFFERENTIATION AND INTEGRATION
Figure 2.1: Values of f on an equally spaced lattice. Dashed lines show the linear interpolation.
We consider f on an equally spaced lattice,
fₙ = f(xₙ),  xₙ = nh  (n = 0, ±1, ±2, …),
and expand f in a Taylor series,
f(x) = f₀ + x f′ + (x²/2!) f″ + (x³/3!) f‴ + ⋯,    (2.17)
where all derivatives are evaluated at x = 0. It follows that
f_{±1} = f₀ ± h f′ + (h²/2) f″ ± (h³/3!) f‴ + O(h⁴),
f_{±2} = f₀ ± 2h f′ + 2h² f″ ± (4h³/3) f‴ + O(h⁴).    (2.18)
Subtracting f_{−1} from f₁ we find
f′₀ = (f₁ − f_{−1})/(2h) − (h²/6) f‴ + O(h⁴).    (2.19)
The term involving f‴ is the dominant error associated with the finite difference approximation that retains only the first term,
f′₀ ≈ (f₁ − f_{−1})/(2h).    (2.20)
This “3-point formula” will be exact if f is a second-degree polynomial. Note also that the symmetric difference about x = 0 is used, as it is more accurate by one order in h than the forward or backward difference formulae
f′₀ ≈ (f₁ − f₀)/h + O(h),
f′₀ ≈ (f₀ − f_{−1})/h + O(h).    (2.21)
These “2-point” formulae will be exact if f is a linear function on [0, ±h].
It is possible to improve on the 3-point formula, (2.20), by relating f′₀ to lattice points further removed. For example, the “5-point formula”
f′₀ ≈ (1/12h) [f_{−2} − 8f_{−1} + 8f₁ − f₂ + O(h⁵)]    (2.22)
cancels all derivatives in the Taylor series through to fourth order. This formula will be exact if f is a fourth-degree polynomial over the 5-point interval [−2h, 2h].
Formulae for higher derivatives can be constructed by taking appropriate combinations of (2.18). For example,
f″₀ ≈ (f₁ − 2f₀ + f_{−1})/h².    (2.24)
Numerical differentiation can be quite tricky to program, since by its very nature it involves the subtraction of two very similar numbers.
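The formulas above can be tested against a function whose derivatives are known exactly. A sketch (the test function sin x and the step size are my arbitrary choices):

```python
import math

def d1_3pt(f, x, h):
    # 3-point central difference (2.20): truncation error O(h^2)
    return (f(x + h) - f(x - h)) / (2.0 * h)

def d1_5pt(f, x, h):
    # 5-point formula (2.22): truncation error O(h^4)
    return (f(x - 2*h) - 8*f(x - h) + 8*f(x + h) - f(x + 2*h)) / (12.0 * h)

def d2_3pt(f, x, h):
    # second derivative formula (2.24)
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2

h, x = 1.0e-3, 0.5
print(d1_3pt(math.sin, x, h) - math.cos(x))   # O(h^2) error, ~1e-7
print(d1_5pt(math.sin, x, h) - math.cos(x))   # far smaller
print(d2_3pt(math.sin, x, h) + math.sin(x))   # second derivative of sin is -sin
```

Note that making h smaller does not help indefinitely: eventually the subtraction of nearly equal function values, as warned above, dominates the truncation error.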
2.2.2 QUADRATURE
In quadrature we are interested in calculating the definite integral of a function f between two limits a < b. We can divide the range into equal steps of length
h = (b − a)/N,
where N is an integer. It is then sufficient to derive a formula for the integral from −h to h, since this formula can then be applied successively:
∫ₐᵇ f(x) dx = ∫ₐ^{a+2h} f(x) dx + ∫_{a+2h}^{a+4h} f(x) dx + ⋯ + ∫_{b−2h}^{b} f(x) dx.    (2.25)
Figure 2.2: Using the trapezoidal rule to integrate ∫_{xₙ}^{xₙ₊₁} f(x) dx corresponds to approximating the integral by the area of the right trapezoid, which is the sum of the rectangle h f(xₙ) and the right triangle of area (1/2)(f(xₙ₊₁) − f(xₙ)) h, in agreement with (2.26).
The idea is to approximate f on each interval by a function that can be integrated exactly; this approach leads to a group of formulae that are said to be of “Newton–Cotes” type. Let us first consider ∫_{−h}^{h} f(x) dx. If f(x) is a linear function then
∫_{−h}^{h} f(x) dx = (h/2)(f_{−1} + 2f₀ + f₁)    (2.26)
is exact on [−h, h]. Thus, from Taylor's theorem it follows that for a general f
∫_{−h}^{h} f(x) dx = (h/2)(f_{−1} + 2f₀ + f₁) + O(h³).    (2.27)
The approximation (2.26) is known as the “trapezoidal rule.”
A better approximation is obtained by passing a quadratic through the three points f_{−1}, f₀, f₁; one finds
∫_{−h}^{h} f(x) dx ≈ (h/3)(f_{−1} + 4f₀ + f₁),
which is in fact exact for any cubic.
Proof. It suffices to check the monomials 1, x, x², x³:
∫_{−h}^{h} 1 dx = x |_{−h}^{h} = 2h,  and (h/3)[1 + 4 + 1] = 2h;
∫_{−h}^{h} x dx = (x²/2) |_{−h}^{h} = 0,  and (h/3)[(−h) + 4(0) + h] = 0;
∫_{−h}^{h} x² dx = (x³/3) |_{−h}^{h} = 2h³/3,  and (h/3)[h² + 0 + h²] = 2h³/3;
∫_{−h}^{h} x³ dx = (x⁴/4) |_{−h}^{h} = 0,  and (h/3)[(−h³) + 0 + h³] = 0.
Suppose we want to find the integral
∫ₐᵇ f(x) dx.
Applying the three-point result over successive pairs of subintervals, as in (2.25), gives the composite formula (N even)
∫ₐᵇ f(x) dx ≈ (h/3)[f₀ + 4f₁ + 2f₂ + 4f₃ + ⋯ + 2f_{N−2} + 4f_{N−1} + f_N].
This is “Simpson's rule,” which is accurate to two orders better than the trapezoidal rule.
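The two composite rules are easily compared on a known integral. A sketch (my own implementations of the composite formulas, not code from the book):

```python
import math

def trapezoid(f, a, b, N):
    # composite trapezoidal rule with N panels of width h
    h = (b - a) / N
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, N))
    return h * s

def simpson(f, a, b, N):
    # composite Simpson rule; N must be even
    h = (b - a) / N
    s = f(a) + f(b)
    s += 4.0 * sum(f(a + i * h) for i in range(1, N, 2))   # odd nodes: weight 4
    s += 2.0 * sum(f(a + i * h) for i in range(2, N, 2))   # even interior nodes: weight 2
    return h * s / 3.0

exact = 1.0 - math.cos(1.0)                 # integral of sin x on [0, 1]
print(abs(trapezoid(math.sin, 0.0, 1.0, 16) - exact))   # O(h^2) error
print(abs(simpson(math.sin, 0.0, 1.0, 16) - exact))     # O(h^4) error, much smaller
```

With only 16 panels, Simpson's rule is already several orders of magnitude more accurate, in line with the error estimates above.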
Higher-order quadrature formulas can be derived by retaining more terms in the Taylor expansion used to interpolate f and using better finite difference approximations for the derivatives. The generalizations of Simpson's rule using cubic and quartic polynomials are:
Simpson's 3/8 rule:
∫_{x₀}^{x₃} f(x) dx = (3h/8)[f₀ + 3f₁ + 3f₂ + f₃] + O(h⁵).    (2.32)
Boole's rule:¹
∫_{x₀}^{x₄} f(x) dx = (2h/45)[7f₀ + 32f₁ + 12f₂ + 32f₃ + 7f₄] + O(h⁷).    (2.33)
A change of variable can often tame an integrable singularity. For an integral of the form
I₁ = ∫₀¹ x^{−1/3} g(x) dx,
the substitution
t³ = x,  dx = 3t² dt,
⇒ I₁ = 3 ∫₀¹ t² t^{−1} g(t³) dt = 3 ∫₀¹ t g(t³) dt,    (2.35)
gives an integrand which is regular at t = 0.
¹ Due to a misprint in [14] this approximation was incorrectly written as “Bode's rule”; the error is frequently reproduced in the literature.
2.3 FINDING ROOTS
We wish to find a root x₀ of a function f, i.e., a point where
f(x₀) = 0.
Suppose f is continuous on [a₀, b₀] and f(a₀), f(b₀) differ in sign; then f must have at least one root in (a₀, b₀). The trick here is to repeatedly bisect, all the time decreasing the size of the interval. Define
c = (a₀ + b₀)/2.
If f(c) has the same sign as f(b₀) then there must be a root in [a₀, c]; otherwise there is a root in [c, b₀]. Either way we have halved the size of the interval containing the root; replacing the appropriate endpoint by c, the process is repeated until the interval is smaller than the required tolerance.
As a simple test I applied a step-search to
f(x) = x² − 5 = 0,    (2.37)
with a tolerance of 10⁻⁶, using x = 1 as my initial guess and an initial step size of 0.5, and my code converged to the answer, correct to 6 decimal places, after 34 iterations. You need to be careful using this method, since if the initial step size is too large it is possible to “step over” the desired root, especially when f has several roots.
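The bisection scheme described above might be coded as follows (a sketch; the bracket [1, 3] for √5 is my choice):

```python
def bisect(f, a, b, tol=1.0e-6):
    """Bisection: f must change sign on [a, b]."""
    fa, fb = f(a), f(b)
    assert fa * fb < 0.0, "no sign change on the bracket"
    while b - a > tol:
        c = 0.5 * (a + b)
        fc = f(c)
        if fa * fc <= 0.0:        # the root lies in [a, c]
            b, fb = c, fc
        else:                     # the root lies in [c, b]
            a, fa = c, fc
    return 0.5 * (a + b)

root = bisect(lambda x: x * x - 5.0, 1.0, 3.0)
print(root)                       # converges to sqrt(5) = 2.2360679...
```

Each pass halves the bracket, so roughly 21 iterations reduce the initial interval of width 2 below the 10⁻⁶ tolerance: slow but guaranteed, in contrast to the step-search above.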
Suppose the actual root is at x₀ and we guess x₁. If it is a good guess, |x₀ − x₁| will be small. Using Taylor's theorem,
f(x₀) = 0 = f(x₁) + f′(x₁)(x₀ − x₁) + O((x₀ − x₁)²)
⇒ x₀ ≈ x₁ − f(x₁)/f′(x₁).
Thus, a better guess for the root will be
x₂ = x₁ − f(x₁)/f′(x₁).
Repeating, we get
x_{i+1} = xᵢ − f(xᵢ)/f′(xᵢ).    (2.38)
The application of (2.38) defines the Newton–Raphson algorithm. I used it to look for the root of x² − 5 = 0 with a tolerance of 10⁻⁶. I was able to achieve convergence after only 10 iterations.
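Equation (2.38) might be implemented as follows (a sketch; stopping on the size of the last correction is one common choice):

```python
def newton(f, fprime, x, tol=1.0e-6, max_iter=50):
    # Newton-Raphson iteration (2.38): x_{i+1} = x_i - f(x_i)/f'(x_i)
    for i in range(max_iter):
        dx = f(x) / fprime(x)
        x -= dx
        if abs(dx) < tol:
            return x, i + 1
    raise RuntimeError("failed to converge")

root, iterations = newton(lambda x: x * x - 5.0, lambda x: 2.0 * x, 1.0)
print(root, iterations)   # reaches sqrt(5) in a handful of iterations
```

Starting from the same poor guess x = 1, the quadratic convergence of Newton-Raphson beats the 34 iterations of the step-search by a wide margin.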
The “secant method” is useful if finding the derivative is a problem. We approximate
f′(xᵢ) ≈ (f(xᵢ) − f(xᵢ₋₁)) / (xᵢ − xᵢ₋₁),
and rewrite (2.38) as
x_{i+1} = xᵢ − f(xᵢ) (xᵢ − xᵢ₋₁) / (f(xᵢ) − f(xᵢ₋₁)).    (2.39)
Provided that the initial guesses are reasonably close to the true root, convergence to the exact answer is almost as rapid as for the Newton–Raphson algorithm. The Newton–Raphson and secant methods can fail to converge, or worse, converge to the wrong answer, if there are multiple roots close together or if there is a point x̃ near x₀ where f′(x̃) = 0.
CHAPTER 3
The Numerical Solution of Ordinary Differential Equations
Lemma 3.1 Let c(x), s(x) be continuously differentiable functions such that
s′(x) = c(x),
c′(x) = −s(x),
s(0) = 0,
c(0) = 1.    (3.1)
Then
c²(x) + s²(x) = 1.    (3.2)
Proof. Let
F(x) = c²(x) + s²(x);
then
F′(x) = 2c(x)c′(x) + 2s(x)s′(x) = −2c(x)s(x) + 2s(x)c(x) = 0;
thus F(x) must be a constant, and substituting the values at x = 0 we have the result.
Lemma 3.2 If we have two sets of functions c(x), s(x) and f(x), g(x) s.t.
c′(x) = −s(x),  g′(x) = −f(x),
s′(x) = c(x),  f′(x) = g(x),
c(0) = 1,  g(0) = 1,
s(0) = 0,  f(0) = 0,
then f(x) = s(x) and g(x) = c(x).
Proof. We know that both of the pairs (c, s), (f, g) must satisfy the relation (3.2):
c²(x) + s²(x) = 1,
f²(x) + g²(x) = 1.
The functions F₁(x) = f(x)c(x) − s(x)g(x) and F₂(x) = f(x)s(x) + c(x)g(x) are s.t.
dF₁(x)/dx = dF₂(x)/dx = 0.
Hence, for some constants a and b,
a = f(x)c(x) − s(x)g(x),
b = f(x)s(x) + c(x)g(x);
evaluating at x = 0 gives a = 0, b = 1. Multiplying the first equation by c(x) and the second by s(x), we find
0 = f(x)c²(x) − c(x)s(x)g(x),
s(x) = f(x)s²(x) + s(x)c(x)g(x);
adding, and using c² + s² = 1,
s(x) = f(x).
Hence
s′(x) = f′(x);
therefore,
c(x) = g(x).
Clearly, the functions c(x), s(x) have all the properties of the sin(x) and cos(x) of trigonometry. The rest of the properties that we know and love can be derived from the results above. Further, as we will see slightly later, we can use relatively straightforward numerical
methods to solve equations of the form (3.1). This may seem an odd way to discuss the sin
and cos functions but there is an important lesson here, in that perfectly good functions can
be defined simply as the solution of differential equations. The Schrödinger equation for an N
electron neutral atom in atomic units
[ −Σ_{j=1}^{N} (1/2)∇ⱼ² − Σ_{j=1}^{N} Z/rⱼ + (1/2) Σ_{j≠k} 1/|rⱼ − rₖ| − E ] Ψ(r₁, r₂, …, r_N) = 0,
is just another differential equation, albeit a more complicated one, which turns out to have a
unique solution for certain values of E .
3.2 ANALYTIC SOLUTIONS
Consider the linear first-order equation
t (dy(t)/dt) + 2y = 4t²,    (3.6)
y(1) = 2.
Then we can divide (3.6) by t to put it in the form (3.4):
dy(t)/dt + (2/t) y = 4t.
The integrating factor is
r(t) = exp( ∫ₐᵗ (2/x) dx )
⇒ r(t) = exp(2 ln t − 2 ln a),    (3.7)
and without loss of generality you can take the arbitrary constant a to be unity; then r(t) = t². Multiplying through by r(t),
d(t² y)/dt = 4t³ ⇒ t² y = t⁴ + c ⇒ y = t² + c/t²,
where c is a constant. Now substitute the initial condition y(1) = 2 and it follows that c = 1.
Next consider the separable equation
dy/dx = y ln y / x,  y(2) = e.
We can rewrite it as
dy/(y ln y) = dx/x,
∫ dy/(y ln y) = ∫ dx/x = ln(x) + K.    (3.11)
Put
u = ln y,  du = dy/y;
⇒ ∫ dy/(y ln y) = ∫ du/u = ln u = ln(ln(y)) = ln(x) + K.    (3.12)
The initial conditions give
y(2) = e ⇒ ln(ln(e)) = ln(2) + K,
ln(1) = 0 = ln(2) + K,
K = −ln(2).    (3.13)
So from (3.12),
ln(ln y) = ln(x/2) ⇒ ln y = x/2 ⇒ y(x) = e^{x/2}.
3.3 NUMERICAL METHODS
Equation (3.16), y_{n+1} = yₙ + h f(xₙ, yₙ), is known as the Euler solution. We could use it to propagate our solution using a series of increments in h; at each step we introduce a potential error of order h².
Figure 3.1: Euler solution, crosses, compared with the analytic result, solid line.
Step 1.
y₁ = y(2.1) = y₀ + h f(x₀, y₀)
  = e + h f(2, e)
  = e + 0.1 (e ln e)/2
  = 2.85419583.
Step 2.
y₂ = y(2.2) = y₁ + h f(x₁, y₁)
  = 2.85419583 + h f(2.1, 2.85419583)
  = 2.99674129.
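The two worked steps can be reproduced directly. A sketch, assuming the model problem dy/dx = y ln y / x with y(2) = e (whose analytic solution, e^{x/2}, was found above):

```python
import math

def f(x, y):
    # right-hand side of the model problem dy/dx = y ln(y) / x
    return y * math.log(y) / x

h = 0.1
x, y = 2.0, math.e          # initial condition y(2) = e
steps = [y]
for n in range(10):         # ten Euler steps take us from x = 2 to x = 3
    y = y + h * f(x, y)     # Euler update, eq. (3.16)
    x = x + h
    steps.append(y)

print(steps[1], steps[2])   # reproduces the two worked steps above
print(y, math.exp(x / 2.0)) # Euler estimate vs analytic e^{x/2} at x = 3
```

The Euler estimate drifts visibly below the analytic curve by x = 3, the O(h) global error that Figure 3.1 illustrates.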
Consider now the harmonic oscillator equation
d²x/dt² = −x,
dx(0)/dt = 0,
x(0) = 1.    (3.17)
We can convert this into a set of two first-order equations:
dv/dt = −x,
dx/dt = v,
v(0) = 0,
x(0) = 1.    (3.18)
We can apply Taylor's theorem to both x(t) and v(t):
v(t + h) = v(0) + h dv(0)/dt = v(0) − h x(0),
x(t + h) = x(0) + h dx(0)/dt = x(0) + h v(0);    (3.19)
incrementing, we get
y_{N+1} = (x, v)|_{N+1} = (x, v)|_N + h (dx/dt, dv/dt)|_N = y_N + h (v, −x)|_N.    (3.20)
To code this up we can divide the interval of integration into equal steps, create two arrays x(0:100), v(0:100), and then iterate.
In Figure 3.2, I show the Euler numerical solution from (3.20) plotted against cos t, the analytic solution to (3.17).
Figure 3.2: Comparison of the analytic solution to (3.17), solid line (cos t), and the numerical solution found using the Euler approach, crosses.
A better approximation to y_{n+1} is obtained by averaging the slopes at the two ends of the step:
y_{n+1} ≈ yₙ + (h/2) [f(yₙ, tₙ) + f(ȳ_{n+1}, t_{n+1})].
We approximate ȳ_{n+1} using the Euler method:
y_{n+1} = yₙ + (h/2) [f(yₙ, tₙ) + f(yₙ + h f(yₙ, tₙ), t_{n+1})].
It is convenient to define
k₁ = h f(yₙ, tₙ),
k₂ = h f(yₙ + k₁, t_{n+1}),
⇒ y_{n+1} = yₙ + (1/2)[k₁ + k₂].    (3.22)
Equation (3.22) defines the second-order Runge–Kutta approximation. We can get a better approximation by improving our estimate for ȳ_{n+1}. The Runge–Kutta approximation can be extended to higher orders [15]. The fourth-order Runge–Kutta is given by
y_{n+1} = yₙ + (1/6)[k₁ + 2k₂ + 2k₃ + k₄] + O(h⁵),
where
k₁ = h f(yₙ, tₙ),
k₂ = h f(yₙ + (1/2)k₁, tₙ + (1/2)h),
k₃ = h f(yₙ + (1/2)k₂, tₙ + (1/2)h),
k₄ = h f(yₙ + k₃, tₙ + h).    (3.23)
The method can be extended to find the numerical solution of nth-order differential equations. In much the same way as we looked at the vector formalism for the Euler method (3.20), we can generalize the Runge–Kutta method. For example, the second-order differential equation
d²f/dt² = g(t),
f(0) = f₀,
f′(0) = v₀,
can be transformed into two coupled differential equations:
du₁(t)/dt = u₂(t),
du₂(t)/dt = g(t),
u₁(0) = f₀,
u₂(0) = v₀,    (3.24)
and solved using the vector scheme:
u̇ = f(t, u) = (u₂, g(t)),
k₁ = h f(tₙ, uₙ),
k₂ = h f(tₙ + (1/2)h, uₙ + (1/2)k₁),
k₃ = h f(tₙ + (1/2)h, uₙ + (1/2)k₂),
k₄ = h f(tₙ + h, uₙ + k₃),
u_{n+1} = uₙ + (1/6)(k₁ + 2k₂ + 2k₃ + k₄).    (3.25)
Second and higher-order ordinary differential equations (more generally, systems of nonlinear equations) rarely yield closed-form solutions. A great advantage of the numerical approach is that it can be applied to both linear and nonlinear differential equations. A numerical solution to the classical harmonic oscillator problem
d²x/dt² = −x,
x(0) = 1,
dx/dt |_{t=0} = 0,    (3.26)
using the fourth-order Runge–Kutta scheme as given in (3.25), with h = 0.3, is shown in Figure 3.3.
Figure 3.3: Comparison of the analytic solution to (3.26), solid blue line, with the fourth-order Runge–Kutta calculation, h = 0.3, red crosses.
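The scheme (3.25) might be coded as follows; a sketch applying it to the oscillator (3.26) with the same step h = 0.3 (the generic rk4_step helper is my own structuring, not the book's code):

```python
import math

def rk4_step(f, t, u, h):
    # one fourth-order Runge-Kutta step for the system u' = f(t, u), as in (3.25)
    k1 = [h * v for v in f(t, u)]
    k2 = [h * v for v in f(t + h / 2, [ui + ki / 2 for ui, ki in zip(u, k1)])]
    k3 = [h * v for v in f(t + h / 2, [ui + ki / 2 for ui, ki in zip(u, k2)])]
    k4 = [h * v for v in f(t + h, [ui + ki for ui, ki in zip(u, k3)])]
    return [ui + (a + 2 * b + 2 * c + d) / 6.0
            for ui, a, b, c, d in zip(u, k1, k2, k3, k4)]

def oscillator(t, u):
    x, v = u
    return [v, -x]              # dx/dt = v, dv/dt = -x

h = 0.3
u = [1.0, 0.0]                  # x(0) = 1, x'(0) = 0
for n in range(100):            # integrate out to t = 30
    u = rk4_step(oscillator, n * h, u, h)
print(u[0], math.cos(30.0))     # stays close to cos t even after several periods
```

Despite the fairly coarse step, the O(h⁴) accuracy keeps both the phase and the energy x² + v² essentially constant over many oscillation periods, which a simple Euler integration fails to do.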
To derive Numerov's method for solving equations of the form y″(x) = −g(x)y(x) + s(x) (cf. (3.33) below), we begin with the Taylor expansion of the function we want to solve, y(x), around the point x₀:
y(x) = y(x₀) + (x − x₀)y′(x₀) + ((x − x₀)²/2!)y″(x₀) + ((x − x₀)³/3!)y‴(x₀)
  + ((x − x₀)⁴/4!)y⁽⁴⁾(x₀) + ((x − x₀)⁵/5!)y⁽⁵⁾(x₀) + O(h⁶).    (3.28)
Denoting the distance from x to x₀ by h = x − x₀, we can write the above equation as
y(x₀ + h) = y(x₀) + h y′(x₀) + (h²/2!)y″(x₀) + (h³/3!)y‴(x₀)
  + (h⁴/4!)y⁽⁴⁾(x₀) + (h⁵/5!)y⁽⁵⁾(x₀) + O(h⁶).    (3.29)
If we evenly discretize the space, we get a grid of x points, where h = x_{n+1} − xₙ. By applying the above equations on this discrete grid, we get a relation between yₙ and y_{n+1}:
y_{n+1} = yₙ + h y′(xₙ) + (h²/2!)y″(xₙ) + (h³/3!)y‴(xₙ)
  + (h⁴/4!)y⁽⁴⁾(xₙ) + (h⁵/5!)y⁽⁵⁾(xₙ) + O(h⁶).    (3.30)
Computationally, this amounts to taking a step “forward” by an amount h. If we want to take a step “backward,” we replace every h with −h and get the expression for y_{n−1}:
y_{n−1} = yₙ − h y′(xₙ) + (h²/2!)y″(xₙ) − (h³/3!)y‴(xₙ)
  + (h⁴/4!)y⁽⁴⁾(xₙ) − (h⁵/5!)y⁽⁵⁾(xₙ) + O(h⁶).    (3.31)
Summing the two equations, we find that
y_{n+1} − 2yₙ + y_{n−1} = h² y″ₙ + (h⁴/12) y⁽⁴⁾ₙ + O(h⁶).    (3.32)
We can solve this equation for y_{n+1} by substituting the expression given at the beginning, that is y″ₙ = −gₙyₙ + sₙ. To get an expression for the y⁽⁴⁾ₙ factor, we simply have to differentiate
y″ = −gy + s    (3.33)
twice and apply the central difference formula (2.24) to y″:
y⁽⁴⁾ = d²/dx²(−gy + s)
⇒ h² y⁽⁴⁾ₙ = −g_{n+1}y_{n+1} + s_{n+1} + 2gₙyₙ − 2sₙ − g_{n−1}y_{n−1} + s_{n−1} + O(h⁴).    (3.34)
Substituting into (3.32),
y_{n+1} − 2yₙ + y_{n−1} =
h²(−gₙyₙ + sₙ) + (h²/12)(−g_{n+1}y_{n+1} + s_{n+1} + 2gₙyₙ − 2sₙ − g_{n−1}y_{n−1} + s_{n−1}) + O(h⁶).    (3.35)
Rearranging,
y_{n+1}(1 + (h²/12)g_{n+1}) − 2yₙ(1 − (5h²/12)gₙ) + y_{n−1}(1 + (h²/12)g_{n−1})
  = (h²/12)(s_{n+1} + 10sₙ + s_{n−1}) + O(h⁶)
⇒ y_{n+1} ≈ [ 2yₙ(1 − (5h²/12)gₙ) − y_{n−1}(1 + (h²/12)g_{n−1}) + (h²/12)(s_{n+1} + 10sₙ + s_{n−1}) ] / (1 + (h²/12)g_{n+1}).    (3.36)
One might expect that the errors at each step would be roughly comparable, so that the total error in the Numerov method would be O(h⁶ h⁻¹) = O(h⁵). Unfortunately this is generally not true; the error tends to grow with each step, and a better estimate is O(h⁴), the same as the fourth-order Runge–Kutta. Its main disadvantages are that we need both y₀ and y₁ to start it off, and that round-off errors can build up when applying (3.36); you should always use double precision in your Numerov code.
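Formula (3.36) might be coded as follows; a sketch applied to y″ = −y (so g ≡ 1, s ≡ 0), where the exact solution cos x supplies the second starting value (done, as advised, in double precision, which is Python's default):

```python
import math

def numerov(g, s, y0, y1, x0, h, nsteps):
    # Numerov integration of y'' = -g(x) y + s(x), following (3.36)
    ys = [y0, y1]
    c = h * h / 12.0
    for n in range(1, nsteps):
        xm, xn, xp = x0 + (n - 1) * h, x0 + n * h, x0 + (n + 1) * h
        num = (2.0 * ys[n] * (1.0 - 5.0 * c * g(xn))
               - ys[n - 1] * (1.0 + c * g(xm))
               + c * (s(xp) + 10.0 * s(xn) + s(xm)))
        ys.append(num / (1.0 + c * g(xp)))
    return ys

h = 0.1
ys = numerov(lambda x: 1.0, lambda x: 0.0, 1.0, math.cos(h), 0.0, h, 100)
print(ys[-1], math.cos(10.0))   # after 100 steps, still very close to cos(10)
```

Note the two starting values y₀ and y₁ demanded by the three-term recurrence; in a scattering or bound-state calculation y₁ would come from a series expansion rather than the known solution.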
CHAPTER 4
Case Study: Damped and Driven Oscillations
Figure 4.1: The simple pendulum consists of a heavy weight attached to a fixed point by a massless string. The string is of length L. It is displaced from equilibrium through some small angle, θ, and allowed to oscillate.
The total energy of the pendulum is
E = K + V,
E = (1/2)ML²(dθ/dt)² + MgL(1 − cos θ) ≈ (1/2)ML²(dθ/dt)² + (1/2)MgLθ²
for small θ. Energy conservation requires
dE/dt = 0
⇒ 0 = ML²(d²θ/dt²)(dθ/dt) + MgL(dθ/dt) sin θ
⇒ d²θ/dt² = −(g/L) sin θ = −ω₀² sin θ,    (4.6)
where
ω₀ = √(g/L).
In the small-angle limit, sin θ ≈ θ and
d²θ/dt² = −ω₀²θ.    (4.7)
The general solution is
θ(t) = A cos(ω₀t + φ).
The constants A and φ are determined by our initial conditions. If, for example, the mass is released from rest at t = 0 at an angle θ₀, then
dθ/dt(0) = 0 = −Aω₀ sin(φ)
⇒ φ = 0,
⇒ A = θ₀,
⇒ θ(t) = θ₀ cos(ω₀t),
dθ/dt = −ω₀θ₀ sin(ω₀t).    (4.9)
Figure 4.2: The undamped oscillator with γ = 0.0, Q = 0.0, with ω₀ = 0.25; shown in the left
panel are φ(t), red dashed, and ω = φ̇, solid blue, as functions of time; the right panel shows the phase
trajectory ω against φ.
Figure 4.2 shows the time dependence of φ and ω = φ̇. Also shown
is the phase trajectory where ω is plotted against φ. In this simple case, the phase trajectory is
an ellipse. At a time T/4 after being released the mass will be back at the origin with its maximum
speed, with all its energy kinetic. It will then decelerate and come to a stop at time t = T/2 at an angle
of −φ₀. At this point all its energy is potential. After a further time of T/4 it is back at the origin
with only kinetic energy, and after a total time T it is back at its original position with φ = φ₀
and φ̇ = 0. This process will repeat indefinitely.
Our description of the oscillator is an idealization where resistive forces such as friction
and air resistance have been neglected. A typical resistive force would be proportional to the
angular velocity of the mass. This leads us to consider the following differential equation
φ̈ + γ φ̇ + ω₀² φ = 0, (4.10)

where γ is a constant related to the strength of the resistance. In order to fully describe the
undamped oscillator we needed two linearly independent solutions. To find two such functions
for (4.10) we can try φ = e^{λt}, where λ is a complex number to be determined. Plugging into
(4.10) we will find

λ² e^{λt} + γ λ e^{λt} + ω₀² e^{λt} = 0,
⇒ λ² + γ λ + ω₀² = 0,
⇒ λ = (−γ ± √(γ² − 4ω₀²)) / 2. (4.11)

Let

k = γ² − 4ω₀²,
Δ = |k|.

Case 1. k < 0
In this case

λ = (−γ ± i√Δ) / 2. (4.12)
We can write the general solution:

φ(t) = α e^{−Γt + iΥt} + β e^{−Γt − iΥt},

where

Γ = γ/2,
Υ = √Δ/2.

Hence, the general solution may be written

φ(t) = e^{−Γt} [α e^{iΥt} + β e^{−iΥt}], (4.13)

or equivalently

φ(t) = e^{−Γt} A cos(Υt + δ). (4.14)
The system will still oscillate, but the magnitude of the oscillation will be reduced by the
exponentially decaying factor e^{−Γt}. Figure 4.3 shows a particular example of such motion.
In the left panel is the plot of φ against time; the dashed curves correspond to the bounding
curves ±e^{−Γt}. In the right panel is the phase trajectory, which spirals toward the point (0, 0); we
will call such a point an “attractor.”
This case, where we still see oscillations, is described as underdamped.
Figure 4.3: Left panel: φ plotted against t for γ = 0.05; also shown are the bounding curves
±e^{−0.025t}. Right panel: the phase trajectory.
Case 2. k > 0
In this case both

(−γ − √(γ² − 4ω₀²)) / 2  and  (−γ + √(γ² − 4ω₀²)) / 2

are negative real numbers and the solution

φ(t) = A e^{(−γ + √(γ² − 4ω₀²)) t/2} + B e^{(−γ − √(γ² − 4ω₀²)) t/2} (4.15)

just decays with time and shows no oscillation. Such a solution is said to be overdamped.
Case 3. k = 0
This case, which is known as critical damping, marks the transition from oscillatory to decaying
behavior. Mathematically, it is a little different in that we have only one independent
solution, e^{−γt/2}. However, it is easy to check that in this special case t e^{−γt/2} is a second solution,
and indeed the general solution can be written (see for example [16])

φ(t) = A e^{−γt/2} + B t e^{−γt/2}. (4.16)

In this case we see no oscillations.
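The three cases can be collected in one routine. The sketch below is my own (names and structure are assumptions, not the author's code); it uses the roots (4.11) for a mass released from rest at φ₀ and falls back to the critical form (4.16) when k = 0.

```python
import cmath
import math

def damped_phi(gamma, omega0, phi0, t):
    """phi(t) for the damped oscillator (4.10), released from rest at phi0.

    Covers the underdamped (k < 0), overdamped (k > 0), and critically
    damped (k = 0) cases discussed above.
    """
    k = gamma * gamma - 4.0 * omega0 * omega0
    if abs(k) < 1e-12:
        # critical damping, Eq. (4.16): phi = (A + B t) e^{-gamma t/2}
        # with A = phi0 and B = phi0 * gamma / 2 so that phi'(0) = 0
        return phi0 * (1.0 + 0.5 * gamma * t) * math.exp(-0.5 * gamma * t)
    lam1 = (-gamma + cmath.sqrt(k)) / 2.0   # the two roots of Eq. (4.11)
    lam2 = (-gamma - cmath.sqrt(k)) / 2.0
    alpha = phi0 * lam2 / (lam2 - lam1)     # from phi(0) = phi0, phi'(0) = 0
    beta = phi0 - alpha
    return (alpha * cmath.exp(lam1 * t) + beta * cmath.exp(lam2 * t)).real
```

With γ = 0 this reduces to the undamped solution φ₀ cos(ω₀t) of (4.9).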
Lemma 4.1 Let x_g(t) be the general solution of the homogeneous 2nd order linear differential equation

a₂ d²x/dt² + 2a₁ dx/dt + a₀ x = 0,

and x_p any particular solution of the inhomogeneous differential equation

a₂ d²x/dt² + 2a₁ dx/dt + a₀ x(t) = f(t);

then any other solution X(t) must be of the form

X(t) = x_p(t) + x_g(t).

Proof.

a₂ d²x_p/dt² + 2a₁ dx_p/dt + a₀ x_p(t) = f(t),
a₂ d²X/dt² + 2a₁ dX/dt + a₀ X(t) = f(t);

subtracting,

⇒ a₂ d²(x_p − X)/dt² + 2a₁ d(x_p − X)/dt + a₀ [x_p(t) − X(t)] = 0.

Thus x_p(t) − X(t) is a solution of the homogeneous equation, so that

X(t) = x_g(t) + x_p(t).
Suppose, now, we wish to solve the differential equation

φ̈ = −γ φ̇ − ω₀² φ + Q sin(Ωt). (4.17)

In other words, we are applying an external oscillating force to our pendulum. This could
be achieved, for example, if the mass is charged and we apply an external varying electric field.
We are assuming γ, ω₀², and Q are positive real constants. Since the homogeneous part of (4.17)
is just the damped oscillator, this means we know the general solution. So “all” that is needed is
a particular solution. To this end it is useful to look at the complex generalization of (4.17):

z̈ = −γ ż − ω₀² z + Q e^{iΩt}. (4.18)

Our desired particular solution will be the imaginary part of some z_p: a particular solution
of (4.18). The form of (4.18) is suggestive of a possible solution of the form

z = z₀ e^{iΩt}.
Plugging this into (4.18) we have

−Ω² z₀ e^{iΩt} + iγΩ z₀ e^{iΩt} + ω₀² z₀ e^{iΩt} = Q e^{iΩt},
⇒ z₀ [(ω₀² − Ω²) + iγΩ] = Q,
⇒ z₀ = Q / [(ω₀² − Ω²) + iγΩ],
⇒ z₀ = Q [(ω₀² − Ω²) − iγΩ] / [(ω₀² − Ω²)² + γ²Ω²]
      = |z₀| e^{iδ}, (4.19)

where

|z₀| = Q / √((ω₀² − Ω²)² + γ²Ω²),
cos δ = (ω₀² − Ω²) / √((ω₀² − Ω²)² + γ²Ω²),
sin δ = −γΩ / √((ω₀² − Ω²)² + γ²Ω²),
δ = arctan(γΩ / (Ω² − ω₀²)). (4.20)
For the underdamped forced oscillator with k < 0, making use of (4.13) and Lemma 4.1 we find

φ(t) = α e^{−Γt + iΥt} + β e^{−Γt − iΥt} + Q sin(Ωt + δ) / √((ω₀² − Ω²)² + γ²Ω²). (4.21)

The first two terms on the right-hand side will decay with time; the third term will become
dominant. If γ << 1 and Ω ≈ ω₀ the amplitude of this term can be very large. This
phenomenon is known as resonance. In Figure 4.4, we show an example of the damped-driven
oscillator with Q = 1, Ω = 2.0, γ = 0.1. Initially, the system exhibits damped,
irregular motion (transient behavior), but eventually it settles into a periodic motion with the
same frequency as the driving force. There is a phase-space attractor in this case as well: a closed
loop.
Figure 4.4: Damped-driven pendulum, small oscillations, with γ = 0.1, Ω = 2, ω₀ = 1.0; left
panel: time evolution φ(t); right panel: phase trajectory.
Figure 4.5: Undamped undriven oscillator: blue, linear approximation; red, nonlinear; γ = Q =
0, φ₀ = 0.1 radian (≈ 5.7°), φ̇₀ = 0.
Figure 4.6: Undamped undriven oscillator: blue, linear approximation; red, nonlinear; γ = Q =
0, φ₀ = π/3 radian (60°), φ̇₀ = 0.
Figure 4.7: Phase trajectories, i.e., φ̇ = ω against φ, for the damped driven oscillator with
Q = 2.5, Ω = 0.5, γ = 0.2, ω₀ = 1. φ is in radians, ω = φ̇ in radians/s; red dashed, nonlinear;
blue solid, linear.
I ran a fourth-order Runge–Kutta code for 0 ≤ t ≤ 100 and a step size of 0.01, with initial
conditions φ(0) = 1, φ̇(0) = 0 for both the linear and nonlinear cases, where I have chosen Q =
2.5, Ω = 1/2, ω₀ = 1, γ = 0.2. The results are shown in Figure 4.7.
The linear case settles down to a behavior similar to that we have seen already in Figure 4.4.
The phase trajectories for the nonlinear case are much more erratic and do not become less so
as we increase the time range. The nonlinear case is very sensitive to the size of the step and also to the
value of Q. These types of sensitivities occur frequently when dealing with nonlinear systems.
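The calculation just described can be sketched as follows. This is a minimal fourth-order Runge–Kutta driver of my own (the function name and interface are assumptions, not the author's code) for the driven damped pendulum φ̈ = −γφ̇ − ω₀² sin φ + Q sin(Ωt), with sin φ replaced by φ in the linear case.

```python
import math

def rk4_pendulum(gamma, omega0, Q, Omega, phi0, omega_start, h, nsteps,
                 linear=False):
    """Return a list of (t, phi, omega) samples for the driven pendulum."""
    def f(t, phi, om):
        restoring = phi if linear else math.sin(phi)
        return om, -gamma * om - omega0 ** 2 * restoring + Q * math.sin(Omega * t)

    t, phi, om = 0.0, phi0, omega_start
    out = [(t, phi, om)]
    for _ in range(nsteps):
        k1p, k1o = f(t, phi, om)
        k2p, k2o = f(t + h / 2, phi + h / 2 * k1p, om + h / 2 * k1o)
        k3p, k3o = f(t + h / 2, phi + h / 2 * k2p, om + h / 2 * k2o)
        k4p, k4o = f(t + h, phi + h * k3p, om + h * k3o)
        phi += h / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        om += h / 6 * (k1o + 2 * k2o + 2 * k3o + k4o)
        t += h
        out.append((t, phi, om))
    return out
```

Plotting φ against ω from the returned samples gives phase trajectories of the kind shown in Figure 4.7.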
4.3 CHAOS
Many mechanical systems exhibit chaotic motion in some regions of their parameter spaces.
Essentially, the term chaotic motion, or chaos, refers to aperiodic motion and sensitivity of the
time evolution to the initial conditions. A chaotic system is in practice unpredictable on long
time scales, although the motion is in principle deterministic, because minute changes in the
initial conditions can lead to large changes in the behavior after some time. Although a chaotic
system is unpredictable, its motion is not completely random. In particular, the way the system
Figure 4.8: Phase trajectory for the Duffing oscillator with γ = 0.1, α = 1, β = 1, Q = 2.4,
and initial conditions φ₀ = 1.0, φ̇₀ = 0; 0 ≤ t ≤ 500, calculated with a step size of 0.1.
approaches chaos often exhibits universality, i.e., seemingly different systems make the transition
from regular, periodic motion to chaotic motion in very similar ways, often through a series of
quantitatively universal period doublings (bifurcations). Our nonlinear pendulum can be shown
to exhibit chaotic motion. A discussion of this important topic can be found in [17]. One of the
first chaotic systems to be studied was the Duffing oscillator:
ẍ + γ ẋ + α x + β x³ = Q sin(t). (4.23)
CHAPTER 5
Numerical Linear Algebra

A x = λ x, (5.1)

where the numbers a_ij, b_j are known. We can write (5.2) as a matrix equation Ax = b:

| a₁₁  a₁₂  ⋯  a₁N | | x₁ |   | b₁ |
| a₂₁  a₂₂  ⋯  a₂N | | x₂ | = | b₂ |
|  ⋮    ⋮   ⋱   ⋮  | |  ⋮ |   |  ⋮ |
| a_N1 a_N2 ⋯ a_NN | | x_N |   | b_N |. (5.3)
A⁻¹ = (1/|A|) Cᵀ, (5.4)

where |A| is the determinant of A and C is the cofactor matrix corresponding to A. To calculate the
inverse of an N × N matrix using the cofactor method would involve something of the order of
N! multiplications. For N = 20, this would mean approximately 2 × 10¹⁸ multiplications. Even
with today's fast machines it would take a long time to solve a system of equations this way. We
need a better computational approach.
There are some particular cases where the solution of (5.3) is particularly easy.
• If A is diagonal,

    | a₁₁  0   ⋯   0  |
A = |  0  a₂₂  ⋯   0  |
    |  ⋮   ⋮   ⋱   0  |
    |  0   0   ⋯ a_NN |,

then

x_i = b_i / a_ii,  i = 1, …, N. (5.5)
• If A is upper triangular, i.e., all the elements below the main diagonal are zero, a_ij = 0 ∀ i > j, then

| a₁₁ a₁₂ ⋯ a₁N | | x₁ |   | b₁ |
|  0  a₂₂ ⋯ a₂N | | x₂ | = | b₂ | (5.6)
|  ⋮   ⋱   ⋱  ⋮ | |  ⋮ |   |  ⋮ |
|  0   0  ⋯ a_NN | | x_N |   | b_N |

can be solved by “back substitution”:

a_NN x_N = b_N,
a_{N−1 N−1} x_{N−1} + a_{N−1 N} x_N = b_{N−1},
⋮
Σ_{i=1}^N a_{1i} x_i = b₁. (5.7)
• In much the same way we can solve the linear equations when A is a lower triangular
matrix using a “forward substitution” solution.
My basic strategy here is to look for ways to relate our matrix A to one or other of these
simpler forms.
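The back-substitution solution of (5.6)–(5.7) is a few lines of code; here is a minimal sketch (my own naming, not code from the text):

```python
import numpy as np

def back_substitute(U, b):
    """Solve U x = b for upper-triangular U by back substitution, Eq. (5.7)."""
    N = len(b)
    x = np.zeros(N)
    for i in range(N - 1, -1, -1):
        # subtract the already-known components, then divide by the diagonal
        x[i] = (b[i] - U[i, i + 1:] @ x[i + 1:]) / U[i, i]
    return x
```

The forward-substitution solver for a lower-triangular system is the mirror image, with the loop running upward from i = 0.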
5.2 LU FACTORIZATION
Suppose A can be decomposed into the product of a lower-triangular matrix L and an upper-
triangular matrix U . The entire solution algorithm for Ax D b can be described in three steps
(i) Decompose A D LU .
(ii) Solve : Ly D b.
(iii) Solve Ux D y .
We have only four equations for the six unknowns; however, we only require some decomposition,
so just take l₁₁ = 1, l₂₂ = 1. Then,

| 3 1 |   | 1   0 | | u₁₁ u₁₂ |
| 4 2 | = | l₂₁ 1 | |  0  u₂₂ |,

⇒ 3 = u₁₁,
  1 = u₁₂,
  4 = l₂₁ u₁₁,
  2 = l₂₁ u₁₂ + u₂₂,

⇒ | 3 1 |   | 1   0 | | 3  1  |
  | 4 2 | = | 4/3 1 | | 0 2/3 |.
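The same elimination pattern works for any N. Below is a minimal Doolittle-style LU sketch of my own (unit diagonal in L, no pivoting; a production code would pivot):

```python
import numpy as np

def lu_doolittle(A):
    """Doolittle LU factorization (unit diagonal in L), no pivoting."""
    A = np.asarray(A, dtype=float)
    N = A.shape[0]
    L = np.eye(N)
    U = np.zeros((N, N))
    for i in range(N):
        for j in range(i, N):          # row i of U
            U[i, j] = A[i, j] - L[i, :i] @ U[:i, j]
        for j in range(i + 1, N):      # column i of L
            L[j, i] = (A[j, i] - L[j, :i] @ U[:i, i]) / U[i, i]
    return L, U
```

Applied to the 2 × 2 example above it reproduces l₂₁ = 4/3 and u₂₂ = 2/3.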
Q = (c₁ c₂ ⋯ c_N) (5.8)

and

⟨c_i|c_j⟩ = Σ_{k=1}^N c̄_{ki} c_{kj},

      | ⟨c₁|c₁⟩  ⟨c₁|c₂⟩  ⋯ ⟨c₁|c_N⟩ |
Q†Q = | ⟨c₂|c₁⟩  ⟨c₂|c₂⟩  ⋯ ⟨c₂|c_N⟩ |
      |    ⋮        ⋮     ⋱    ⋮    |
      | ⟨c_N|c₁⟩ ⟨c_N|c₂⟩ ⋯ ⟨c_N|c_N⟩ |.

Thus, Q is unitary iff ⟨c_i|c_j⟩ = δ_ij.
The proof using rows as vectors is almost identical.
Suppose A ∈ ℝ^{N×N} is a non-singular matrix; then we know its determinant is non-zero
and the vectors corresponding to the columns are linearly independent. Just as in (5.8) I can
write

A = [a₁ a₂ … a_N].

My plan is to use the Gram–Schmidt process to create an orthonormal set from the
vectors a_i. We may write, using (A.6),
e′₁ = a₁,
e₁ = e′₁ / ||e′₁||,
e′₂ = a₂ − ⟨a₂|e₁⟩ e₁,
e₂ = e′₂ / ||e′₂||,
⋮ (5.9)

The vector a_i is a member of the subspace spanned by {e_k}_{k=1}^i and thus orthogonal to the
unit vectors {e_k}_{k>i}. We can expand

a_i = Σ_{k=1}^N r_{ki} e_k,
r_{ki} = ⟨e_k|a_i⟩. (5.10)

Notice r_{ki} = 0 if k > i. Further,

r_{ii} = ⟨e_i|e′_i⟩ = √⟨e′_i|e′_i⟩ ≥ 0.
Considering components: the (l, i)th element of our original matrix A is the l-th element of a_i
and can, therefore, be written in terms of the l-th elements of the e_k:

a_{li} = Σ_{k=1}^N r_{ki} e_{lk}
       = Σ_{k=1}^N e_{lk} r_{ki},

that is,

A = QR,

where the columns of Q = [e₁ e₂ … e_N] satisfy the unitary condition in Theorem 5.2 and the matrix R has only zero elements below
the diagonal, i.e., it is upper triangular.
Example 5.3 Consider the matrix

A = | 3 2 |
    | 1 2 |,

writing the columns as vectors

A = [a₁ a₂],

where

a₁ = | 3 |,   a₂ = | 2 |.
     | 1 |         | 2 |

Applying the Gram–Schmidt process gives

e₁ = (1/√10) | 3 |,   e₂ = (1/√10) | −1 |,
             | 1 |                 |  3 |

so that

Q = (1/√10) | 3 −1 |
            | 1  3 |,

Q†Q = I = | 1 0 |
          | 0 1 |.

Then A = QR, with

R = Q†A = (1/√10) |  3 1 | | 3 2 |
                  | −1 3 | | 1 2 |

        = (1/√10) | 10 8 |
                  |  0 4 |.
As we have seen, every non-singular matrix A can be decomposed into a product QR,
where Q is unitary and R is upper triangular. Let us now apply this observation to some of the
characteristic problems of linear algebra.
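The Gram–Schmidt construction (5.9)–(5.10) can be coded directly. This is a sketch of my own (classical Gram–Schmidt; see the warning at the end of the chapter about its round-off behavior):

```python
import numpy as np

def gram_schmidt_qr(A):
    """QR factorization by classical Gram-Schmidt, Eqs. (5.9)-(5.10)."""
    A = np.asarray(A, dtype=float)
    N = A.shape[1]
    Q = np.zeros_like(A)
    R = np.zeros((N, N))
    for i in range(N):
        v = A[:, i].copy()
        for k in range(i):
            R[k, i] = Q[:, k] @ A[:, i]   # r_ki = <e_k | a_i>
            v -= R[k, i] * Q[:, k]
        R[i, i] = np.linalg.norm(v)       # r_ii = ||e'_i||
        Q[:, i] = v / R[i, i]
    return Q, R
```

On the matrix of Example 5.3 it reproduces R = (1/√10)[[10, 8], [0, 4]].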
5.3.1 SYSTEMS OF LINEAR EQUATIONS
Suppose A is a non-singular N × N matrix and we want to solve

Ax = b.

We can proceed as follows.
• First factorize A:

Ax = QRx = b.

• Then multiply on the left by Q†, so that Rx = Q†b, which can be solved for x by back substitution since R is upper triangular.
5.3.2 EIGENVALUES
Our next task is to find the eigenvalues of A, a non-singular N × N matrix. We first note that if
R is an upper triangular matrix, its eigenvalues are given by the solution of

           | a₁₁−λ  a₁₂   ⋯   a₁N  |
|R − λI| = |   0   a₂₂−λ  ⋯   a₂N  | = 0,
           |   0     ⋱    ⋱    ⋮   |
           |   0     0    ⋯  a_NN−λ |

⇒ |R − λI| = Π_{i=1}^N (a_ii − λ) = 0.
Now we know (see Lemma A.29) that if A is an N × N matrix and U is a unitary matrix,
then if B = U†AU, B and A have the same eigenvalues. If we can find a matrix B which
is upper triangular such that

A = UBU† (5.11)

and U is unitary, then the eigenvalues of A are just the diagonal elements of B. We will call a
transformation of the kind (5.11) with U unitary a “similarity transformation.”
We can find a QR decomposition of any non-singular matrix A, i.e., A = QR; then

Q†AQ = RQ (5.12)

is a similarity transformation of A, and RQ has the same eigenvalues as A = QR. If RQ is
upper triangular the problem is solved. Even if RQ is not in upper triangular form we do have a
way forward.
The QR Algorithm
Suppose A is a real symmetric matrix for which we want to find the eigenvalues. Define A^(0) = A;
then define a sequence of matrices, starting with k = 0, by computing the QR decomposition to
find Q^(k) and R^(k) and then setting A^(k+1) = R^(k) Q^(k). Now, it can be shown that eventually A^(k)
converges to an upper triangular matrix [18]. But

A^(k) = R^(k−1) Q^(k−1)
      = (Q^(k−1))† Q^(k−1) R^(k−1) Q^(k−1)
      = (Q^(k−1))† A^(k−1) Q^(k−1),

so all the matrices A^(j) are connected by similarity transformations and therefore share the same
eigenvalues. A^(0) is just A, and for large k, A^(k) is (nearly) upper triangular, so we can read off the eigenvalues.
The advantages of the QR algorithm are that it
• gives all the eigenvalues,
• is stable.
It is incorporated in the major software packages such as LAPACK [2].
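The iteration A^(k+1) = R^(k) Q^(k) is only a few lines when the factorization itself is delegated to a library. A sketch (my own, using numpy's QR rather than a hand-written one; unshifted, so convergence is slow compared with LAPACK's shifted variants):

```python
import numpy as np

def qr_eigenvalues(A, iterations=200):
    """Unshifted QR algorithm: A^(k+1) = R^(k) Q^(k).

    For a real symmetric A with distinct eigenvalues the iterates approach
    a (nearly) diagonal matrix whose diagonal holds the eigenvalues."""
    Ak = np.array(A, dtype=float)
    for _ in range(iterations):
        Q, R = np.linalg.qr(Ak)
        Ak = R @ Q            # similarity transform (Q^T) A^(k) Q
    return np.sort(np.diag(Ak))
```

For [[2, 1], [1, 2]] it returns the eigenvalues 1 and 3.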
Suppose W = Span{r₁, r₂}, where

r₁ = | 1 |,   r₂ = | 2 |.
     | 1 |         | 2 |
     | 0 |         | 3 |

Using the Gram–Schmidt process it is easy to see that the vectors e₁, e₂,

e₁ = (1/√2) | 1 |,   e₂ = | 0 |,
            | 1 |         | 0 |
            | 0 |         | 1 |

form an orthonormal basis for W. Define Q = [e₁ e₂], A = [r₁ r₂], and

R = Q†A = | 1/√2 1/√2 0 | | 1 2 |
          |  0    0   1 | | 1 2 |
                          | 0 3 |

        = | √2 2√2 |
          |  0  3  |.

Note that
• Q is a 3 × 2 matrix and Q† is a 2 × 3 matrix,
• Q†Q = I₂ but QQ† ≠ I₃,
• since Q is real, Q† = Qᵀ.
A = QR,
QᵀQ = I_N.

Table 5.1: The data to be fitted.

t     E(t)
1.0   1.0
2.0   1.5
3.0   3.0
4.0   6
The least-squares solution is the vector c which minimizes

||Ac − b||.

Further, if we decompose A = QR, then the minimizing c satisfies Rc = Qᵀb.
Proof. [15].
Returning to Example 5.6: using the QR approach I will look for the “best fit” in the least-squares
sense, assuming the data is best represented by

(i) a straight line,
(ii) a quadratic function

p(t) = c₀ + c₁ t + c₂ t².

For the straight line we can write the problem in matrix form, Ac = b:

| 1 1 |          | 1   |
| 1 2 | | c₀ | = | 1.5 |
| 1 3 | | c₁ |   | 3   |
| 1 4 |          | 6   |.
Now write A as two column vectors:

a₁ = (1, 1, 1, 1)ᵀ,   a₂ = (1, 2, 3, 4)ᵀ.

Applying Gram–Schmidt to these vectors we get:

e₁ = (1/2)(1, 1, 1, 1)ᵀ,   e₂ = (1/√5)(−1.5, −0.5, 0.5, 1.5)ᵀ.

Now construct Q:

    | 1/2  −1.5/√5 |
Q = | 1/2  −0.5/√5 |
    | 1/2   0.5/√5 |
    | 1/2   1.5/√5 |.
R = QᵀA
  = |   1/2       1/2      1/2      1/2   | | 1 1 |
    | −3/(2√5) −1/(2√5)  1/(2√5) 3/(2√5) | | 1 2 |
                                           | 1 3 |
                                           | 1 4 |
  = | 2  5 |
    | 0 √5 |.
Ac = b,
⇒ QRc = b,
⇒ Rc = Qᵀb,

| 2  5 | | c₀ |   |   1/2       1/2      1/2      1/2   | | 1   |
| 0 √5 | | c₁ | = | −3/(2√5) −1/(2√5)  1/(2√5) 3/(2√5) | | 1.5 |
                                                         | 3   |
                                                         | 6   |,

⇒ | 2c₀ + 5c₁ |   | 5.75    |
  |   √5 c₁   | = | 8.25/√5 |.

For the quadratic fit we seek c₀ + c₁ t + c₂ t² ≈ E(t).
We will proceed much as before. We can write the problem in matrix form, Ac = b:

| 1 1  1 |          | 1   |
| 1 2  4 | | c₀ |   | 1.5 |
| 1 3  9 | | c₁ | = | 3   |
| 1 4 16 | | c₂ |   | 6   |.
Now, since Q and A are known, all we have to do is transpose Q and multiply by A to find
R:

R = QᵀA
  = |   1/2       1/2      1/2      1/2   | | 1 1  1 |
    | −3/(2√5) −1/(2√5)  1/(2√5) 3/(2√5) | | 1 2  4 |
    |   1/2      −1/2     −1/2     1/2   | | 1 3  9 |
                                           | 1 4 16 |
  = | 2  5  15  |
    | 0 √5 5√5 |
    | 0  0   2 |.
Then,

Rc = Qᵀb,

| 2  5  15  | | c₀ |   |   1/2       1/2      1/2      1/2   | | 1   |
| 0 √5 5√5 | | c₁ | = | −3/(2√5) −1/(2√5)  1/(2√5) 3/(2√5) | | 1.5 |
| 0  0   2 | | c₂ |   |   1/2      −1/2     −1/2     1/2   | | 3   |
                                                              | 6   |,

⇒ | 2c₀ + 5c₁ + 15c₂ |   | 5.75    |
  |  √5 c₁ + 5√5 c₂  | = | 8.25/√5 |.
  |       2c₂        |   | 1.25    |
Figure 5.1: Linear least-squared and quadratic least-squared fits to the data in Table 5.1.
In Figure 5.1 I show a comparison between the linear, (5.18), and quadratic, (5.19), fits.
I cannot end this chapter without adding one word of warning. A straightforward computer
implementation of the Gram–Schmidt process as given in (5.9) can very easily run into
significant round-off errors. Fortunately, there are some clever ways to avoid these problems [19],
leading to stable and efficient codes.
CHAPTER 6
Polynomial Approximations
6.1 INTERPOLATION
Interpolation is the problem of fitting a smooth curve through a given set of points, generally as
the graph of a function. It is useful in data analysis (interpolation is a form of regression) and in
numerical analysis. It is one of those important recurring concepts in applied mathematics.
Given a set of distinct points x_j, j = 0, …, N, and the values

y_j = f(x_j)

of a function at these points, the polynomial interpolation problem consists in finding a
polynomial p_N(x) of degree N which reproduces these values:

y_j = p_N(x_j).
Let

L₁(x) = (x − x₂)/(x₁ − x₂),
L₂(x) = (x − x₁)/(x₂ − x₁),

p(x) = y₁ L₁(x) + y₂ L₂(x)
     = y₁ (x − x₂)/(x₁ − x₂) + y₂ (x − x₁)/(x₂ − x₁)
     = [1/(x₁ − x₂)] [−y₁x₂ + y₂x₁ + x(y₁ − y₂)]
     = [(y₁ − y₂)/(x₁ − x₂)] x + (−y₁x₂ + y₂x₁)/(x₁ − x₂);

then

p(x₁) = y₁,
p(x₂) = y₂.
A degree N polynomial can be written as

p_N(x) = Σ_{n=0}^N a_n xⁿ.
Theorem 6.4 (Lagrange interpolation theorem). Let {x_j}_{j=0}^N be a collection of distinct real
numbers and {y_j}_{j=0}^N a collection of real numbers. Then there exists a unique p_N ∈ P_N s.t.

p_N(x_i) = y_i.
Proof. Define

p_N(x) = Σ_{k=0}^N y_k L_k(x),

where the L_k(x) are the Lagrange elementary polynomials. Now from Lemma 6.3 we have

p_N(x_i) = Σ_{k=0}^N y_k L_k(x_i) = Σ_{k=0}^N y_k δ_ik = y_i.

For uniqueness, suppose q_N ∈ P_N also satisfies

p_N(x_i) = q_N(x_i) = y_i;

then

r_N(x) = p_N(x) − q_N(x)

is a polynomial of degree at most N which vanishes at the N + 1 distinct points x_i, and so must be identically zero.
If f is a function with values f(x_k) at the points x_k, its Lagrange interpolation is

P_N(x) = Σ_{k=0}^N f(x_k) L_k(x).
As an example, interpolate eˣ at the points x₀ = −1, x₁ = 0, x₂ = 1:

L₀(x) = (x − x₁)(x − x₂)/[(x₀ − x₁)(x₀ − x₂)] = x(x − 1)/[(−1)(−1 − 1)] = (x² − x)/2,
L₁(x) = (x − x₀)(x − x₂)/[(x₁ − x₀)(x₁ − x₂)] = (x + 1)(x − 1)/[(1)(−1)] = 1 − x²,
L₂(x) = (x − x₀)(x − x₁)/[(x₂ − x₀)(x₂ − x₁)] = (x + 1)x/[(1 + 1)(1)] = (x² + x)/2,

⇒ p₂(x) = e⁻¹ L₀(x) + e⁰ L₁(x) + e¹ L₂(x)
        = e⁻¹ (x² − x)/2 + 1 − x² + e¹ (x² + x)/2
        = x² [(e⁻¹ + e¹)/2 − 1] + x (e¹ − e⁻¹)/2 + 1
        = x² (cosh(1) − 1) + x sinh(1) + 1 ≈ 1 + 1.1752 x + 0.5431 x².
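The construction in Theorem 6.4 is short enough to code directly. A minimal sketch (my own naming, not code from the text):

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for k, (xk, yk) in enumerate(zip(xs, ys)):
        Lk = 1.0
        for j, xj in enumerate(xs):
            if j != k:
                Lk *= (x - xj) / (xk - xj)   # elementary polynomial L_k(x)
        total += yk * Lk
    return total
```

With the nodes −1, 0, 1 and values e⁻¹, 1, e it reproduces the quadratic p₂(x) found above.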
Theorem 6.7 Let f be an (N+1)-times continuously differentiable function on an interval [a, b] and let
{x_j | j = 0, …, N} be a set of distinct numbers in [a, b]. If p_N(x) is the Lagrange interpolation of f
using {x_j | j = 0, …, N}, then for every x ∈ [a, b] there exists ξ(x) ∈ [a, b] s.t.

f(x) − p_N(x) = [f^{(N+1)}(ξ(x)) / (N + 1)!] π_{N+1}(x),

where

π_{N+1}(x) = Π_{j=0}^N (x − x_j).
Proof. [15].
• The error is directly proportional to π_{N+1}(x), which means that it will be zero at the points x_j
and at its best in their vicinity.
Figure 6.2: Runge function, red line; 5th order interpolation (six equally spaced points), blue
line; 9th order interpolation (10 equally spaced points), green line.
where l is a constant. In this section we will start from (6.3) and look for a power series solution

y = Σ_{n=0}^∞ a_n xⁿ,

dy/dx = Σ_{n=1}^∞ a_n n x^{n−1},
−2x dy/dx = −2 Σ_{n=1}^∞ n a_n xⁿ,
d²y/dx² = Σ_{n=2}^∞ a_n n(n−1) x^{n−2},
(1 − x²) d²y/dx² = Σ_{n=2}^∞ a_n n(n−1) x^{n−2} − Σ_{n=2}^∞ a_n n(n−1) xⁿ. (6.4)
Since y₁ only contains even powers and y₂ only odd powers, they cannot be proportional
to each other and must be linearly independent. Thus, the general solution for |x| < 1 must be

y = A P_l(x) + B Q_l(x),

where P_l(x) is the polynomial and Q_l corresponds to the other linearly independent solution.
If we now demand that P_l(1) = 1, we have a set of polynomials (the Legendre polynomials):

P₀(x) = 1,
P₁(x) = x,
P₂(x) = (3x² − 1)/2,
P₃(x) = (5x³ − 3x)/2,
P₄(x) = (35x⁴ − 30x² + 3)/8,
P₅(x) = (63x⁵ − 70x³ + 15x)/8,
⋮ (6.9)
Proof.

P_m(x) d/dx[(1 − x²) dP_n(x)/dx] + n(n+1) P_n(x) P_m(x) = 0,
P_n(x) d/dx[(1 − x²) dP_m(x)/dx] + m(m+1) P_n(x) P_m(x) = 0,

⇒ d/dx{(1 − x²)[P_m(x) dP_n(x)/dx − P_n(x) dP_m(x)/dx]}
   + [n(n+1) − m(m+1)] P_n(x) P_m(x) = 0. (6.10)

Now integrate from −1 to 1; the first term will be zero, since (1 − x²) = 0 at both limits,
and since m ≠ n we have the result.
Generating Function
Definition 6.10

Φ(x, h) = 1/√(1 − 2hx + h²),  |h| < 1,

is called the generating function of the Legendre polynomials.

Theorem 6.11

Φ(x, h) = Σ_{l=0}^∞ h^l P_l(x).
Let y = 2xh − h²; then

Φ(x, h) = (1 − y)^{−1/2}
        = 1 + y/2 + (3/8) y² + ⋯
        = 1 + (2xh − h²)/2 + (3/8)(2xh − h²)² + ⋯
        = 1 + xh − h²/2 + (3/8)(4x²h² − 4xh³ + h⁴) + ⋯
        = 1 + xh + h² ((3/2)x² − 1/2) + ⋯
        = P₀(x) + h P₁(x) + h² P₂(x) + ⋯
For a more complete formal proof, see [13]. The generating function allows us to derive
some important relations:
∂Φ(x, h)/∂h = −(1/2)(1 − 2xh + h²)^{−3/2} (−2x + 2h),
⇒ (1 − 2xh + h²) ∂Φ(x, h)/∂h = (x − h) Φ,
⇒ (1 − 2xh + h²) Σ_{l=1}^∞ l h^{l−1} P_l(x) = (x − h) Σ_{l=0}^∞ h^l P_l(x).

Equating powers of h we find the recurrence relation:

l P_l(x) − 2x(l−1) P_{l−1}(x) + (l−2) P_{l−2}(x) = x P_{l−1}(x) − P_{l−2}(x),
l P_l(x) = (2l − 1) x P_{l−1}(x) − (l − 1) P_{l−2}(x). (6.11)
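The recurrence (6.11) gives a stable and fast way to evaluate P_l(x) numerically; here is a minimal sketch (my own code, not from the text):

```python
def legendre_P(l, x):
    """P_l(x) by the forward recurrence
    l P_l = (2l-1) x P_{l-1} - (l-1) P_{l-2}, Eq. (6.11)."""
    if l == 0:
        return 1.0
    p_prev, p = 1.0, x            # P_0 and P_1
    for n in range(2, l + 1):
        p_prev, p = p, ((2 * n - 1) * x * p - (n - 1) * p_prev) / n
    return p
```

Against the list (6.9) one can check, e.g., that legendre_P(4, x) agrees with (35x⁴ − 30x² + 3)/8.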
For each n the second solution Q_n(x) satisfies the same differential equation and also the
same recurrence relation:

l Q_l(x) = (2l − 1) x Q_{l−1}(x) − (l − 1) Q_{l−2}(x),
(l + 1) Q_{l+1}(x) = (2l + 1) x Q_l(x) − l Q_{l−1}(x),
(2l + 1) x Q_l(x) = l Q_{l−1}(x) + (l + 1) Q_{l+1}(x). (6.12)
However, Q_l is singular at x = ±1. For |x| ≠ 1 it can be shown that [13]

Q₀(x) = (1/2) ln((1 + x)/(1 − x)),
Q₁(x) = (x/2) ln((1 + x)/(1 − x)) − 1,
Q₂(x) = ((3x² − 1)/4) ln((1 + x)/(1 − x)) − 3x/2,
⋮ (6.13)
Figure 6.3: The first few Legendre functions of the first kind, left panel, and the second kind,
right panel.
Then set

Q̃_{l₀}(x) = 0,
Q̃_{l₀−1}(x) = 10⁻⁷, (6.17)

and calculate the Q̃_l using the backward recurrence formula for 0 ≤ l ≤ l₀. Normalize the
sequence Q̃_l(x) to deduce the computed Legendre functions Q_l(x) using the analytic Q₀(x):

Q_l(x) = [Q₀(x)/Q̃₀(x)] Q̃_l(x).
Clearly, the powers of ten are arbitrary and could be chosen differently, but due care must
be taken to work within the precision of the machine. The Legendre polynomials are members
of a class of polynomial approximations which are widely used. Their strengths, weaknesses, and
utility are most clearly seen when we cast our analysis in terms of vector space theory.
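The backward-recurrence scheme just described can be sketched as follows (my own arrangement of the steps; exercised here for x > 1, where Q_l is the exponentially decaying solution and the downward recursion is clearly stable):

```python
import math

def legendre_Q(l, x, l0=60):
    """Q_l(x) by the backward recurrence of Eq. (6.17): seed two arbitrary
    small values at l0, recur downward with the recurrence (6.12), then fix
    the normalization with the analytic Q_0."""
    q_up, q = 0.0, 1.0e-7          # Q~_{l0}, Q~_{l0-1}
    table = {l0: q_up, l0 - 1: q}
    for n in range(l0 - 1, 0, -1):
        # downward form of (2n+1) x Q_n = n Q_{n-1} + (n+1) Q_{n+1}
        q_up, q = q, ((2 * n + 1) * x * q - (n + 1) * q_up) / n
        table[n - 1] = q
    q0 = 0.5 * math.log(abs((1.0 + x) / (1.0 - x)))   # analytic Q_0(x)
    return q0 / table[0] * table[l]
```

Checking against (6.13): at x = 2 the routine reproduces Q₁(2) = ln 3 − 1.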
6.3 INFINITE DIMENSIONAL VECTOR SPACES
Not all vector spaces one uses are finite dimensional. In particular, the vector space of states
from Quantum Mechanics is infinite dimensional.¹ As an example, consider the set of piecewise
continuous functions, PC[a,b], on the real interval [a, b]. If f, g ∈ PC[a,b] and α, β are numbers,
then αf + βg ∈ PC[a,b]. It is immediately obvious that PC[a,b] is a vector space over the real
numbers. To make PC[a,b] into a normed linear space we will need an inner product. Let us
try the following.
Definition 6.12 For f, g ∈ PC[a,b]|w,

⟨f|g⟩ = ∫_a^b f(x) g(x) w(x) dx,  w(x) > 0,

where the “weight function,” w, is continuous and always positive on [a, b].
With this definition a function f₀ can have ⟨f₀|f₀⟩ = 0 while f₀ ≠ 0 at 10,000 points in [a, b]. At first sight it looks like we can't define a
norm. However, we can rescue the situation by agreeing to a distinction between the “vector” f₀
and the function f₀.

Definition 6.13 We shall take two elements f, g of the vector space PC[a,b]|w to be equal if

⟨f − g|f − g⟩ = 0,

even if f(t) ≠ g(t) at a finite number of points.
With this agreement we have a normed linear space and can define a norm

||f||² = ⟨f|f⟩ = ∫_a^b f²(t) w(t) dt.

¹ Technically it is a Hilbert space, H, i.e., an infinite dimensional vector space with an inner product and associated norm,
with the additional property that if {x_N} is a sequence in H and the difference ||x_n − x_m|| can be made arbitrarily small for
n, m big enough, then x_N must converge to a limit contained in H.
with this definition we can consider limits.
Definition 6.14

lim_{n→∞} f_n → f

if, given ε > 0 no matter how small, there exists an N s.t. for all n > N,

||f − f_n|| < ε.
then c_j = 0 for all j. Thus, the set X = {x^j}_{j=0}^n is a collection of linearly independent vectors.
Now, I will apply the Gram–Schmidt orthogonalization process (Lemma A.9) to X with
weight w ≡ 1 to create a new set of orthonormal polynomials {φ_n(x)}. I will adopt the notation
of (A.6). Then,

e′₀ = 1,
⟨e′₀|e′₀⟩ = 2,
e₀ = φ₀(x) = √(1/2),

e′₁ = x − (1/2) ∫_{−1}^1 x dx = x,
e₁ = φ₁(x) = √(3/2) x,

e₂ = φ₂(x) = √(5/2) [(3/2) x² − 1/2],
e₃ = φ₃(x) = √(7/2) [(5/2) x³ − (3/2) x],
⋮ (6.20)

In fact, the set of polynomials I have constructed are directly related to the Legendre
polynomials:

φ_n(x) = √((2n + 1)/2) P_n(x). (6.21)
Table 6.1: Different polynomial bases.
The Legendre polynomials are orthogonal and differ only in norm from the set we got by
using the Gram–Schmidt process. They satisfy

∫_{−1}^1 P_n(x) P_m(x) dx = [2/(2n + 1)] δ_nm, (6.22)

and if f is any function in PC[−1,1]|1 then f may be expanded as

f(x) = Σ_{n=0}^∞ a_n P_n(x),  where
a_n = [(2n + 1)/2] ∫_{−1}^1 f(x) P_n(x) dx. (6.23)

The integral in (6.22) vanishes unless m = n.
Lemma 6.15 The mth polynomial has exactly m zeros, which are simple and lie in (a, b).

Proof. Suppose p_m(x) changes sign n times in (a, b). From the fundamental theorem of
algebra, n ≤ m. Let a₁, …, a_n be the distinct points where p_m changes sign and define

q_n(x) = Π_{i=1}^n (x − a_i).

If n < m, then q_n is of lower order than p_m and so is orthogonal to it:

∫_a^b q_n(x) p_m(x) w(x) dx = 0. (6.25)

But in the vicinity of a₁, (x − a₁) p_m(x) does not change sign; indeed (x − a₁)(x − a₂) ⋯ (x − a_n) p_m(x) does not change sign in (a, b), hence

∫_a^b (x − a₁)(x − a₂) ⋯ (x − a_n) p_m(x) w(x) dx ≠ 0. (6.26)

This is in contradiction to (6.25) if n is not equal to m. So there are m distinct zeros. The
point here is that we have shown there are m sign changes, and m is the maximum number of roots
a polynomial of order m can have!
6.4 QUADRATURE
6.4.1 SIMPSON REVISITED
The Simpson rule formula (2.28),

∫_{−1}^1 f(x) dx = (1/3) [f(−1) + 4f(0) + f(1)],

can be written

∫_{−1}^1 f(x) dx = Σ_{n=1}^3 w̄_n f(x_n),

w̄₁ = w̄₃ = 1/3,
w̄₂ = 4/3,
x₁ = −1,
x₂ = 0,
x₃ = 1, (6.27)

and is exact for all polynomials of order less than or equal to 3. Be careful not to confuse the weight
function w used in the definition of the inner product (Definition 6.12) with the numbers w̄_n,
which are also called weights in (6.27).
6.4.2 WEIGHTS AND NODES
We want to integrate

∫_a^b f(x) w(x) dx.

We want to find a set of points {x_i}_{i=0}^{n−1} and weights w̄_i such that

∫_a^b f(x) w(x) dx ≈ Σ_{i=0}^{n−1} w̄_i f(x_i). (6.28)

For example, any fifth-order polynomial f(x) can be divided by

p₃(x) = (5x³ − 3x)/2

to give

f(x) = p₃(x) q(x) + r(x),

where q(x), r(x) are polynomials of order 2.
In the same way, if f(x) is a polynomial of order no more than 2n − 1 we can write

f(x) = p_n(x) q(x) + r(x),

where q and r are polynomials of order n − 1 or less. Suppose now that p_n, the polynomial of
order n, is a member of a set of polynomials orthogonal over (a, b) with respect to the weight
w(x) > 0. We want to find the weights and nodes such that

∫_a^b f(x) w(x) dx = Σ_{i=0}^{n−1} w̄_i f(x_i).
If we choose the nodes x_i to be the n zeros of p_n, then each term p_n(x_i) q(x_i) is exactly zero, the same as (6.29). r(x) is a polynomial of order n − 1, so we can find the n weights
w̄_i so that

∫_a^b r(x) w(x) dx = Σ_{i=0}^{n−1} w̄_i r(x_i)

is exact. Therefore,

∫_a^b f(x) w(x) dx = ∫_a^b w(x) [p_n(x) q(x) + r(x)] dx = Σ_{i=0}^{n−1} w̄_i r(x_i) = Σ_{i=0}^{n−1} w̄_i f(x_i).
We know the nodes x_i and now we will look for a closed form expression for the weights.
Since r(x) is a polynomial of order less than n, it is fixed by the values it attains at n
different points, and we can use our Lagrange interpolation to write

r(x) = Σ_{i=0}^{n−1} L_i(x) r(x_i),

⇒ ∫_a^b w(x) r(x) dx = Σ_{i=0}^{n−1} [∫_a^b L_i(x) w(x) dx] r(x_i),

⇒ w̄_i = ∫_a^b w(x) L_i(x) dx. (6.30)
Note that the weights w̄_i and nodes x_i depend only on the polynomial p_n and not on the
function f we want to integrate, and thus can be tabulated. The choice of the polynomial basis we
might want to use depends on the interval (a, b) and the integral. Suppose, for example, you
wanted to evaluate an integral of the form

∫_{−1}^1 f(x)/√(1 − x²) dx;

then the optimum choice is a Gauss–Chebyshev integration, see Table 6.1, where the weight
function is

w(x) = 1/√(1 − x²).
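Because the nodes and weights can be tabulated once and for all, a Gaussian rule is a one-liner in practice. A sketch (my own, using numpy's tabulated Gauss–Legendre nodes and weights, i.e., the w ≡ 1 case on [−1, 1]):

```python
import numpy as np

def gauss_legendre(f, n):
    """Integrate f over [-1, 1] with n-point Gauss-Legendre quadrature.

    The nodes are the zeros of P_n and the weights are the integrals of the
    Lagrange polynomials, Eq. (6.30); numpy tabulates both."""
    x, w = np.polynomial.legendre.leggauss(n)
    return w @ f(x)
```

As the derivation above promises, the n-point rule is exact for polynomials up to degree 2n − 1: four points integrate x⁶ over [−1, 1] exactly.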
CHAPTER 7
Sturm–Liouville Theory
In classical as well as quantum physics many problems arise in the form of boundary value prob-
lems involving second order ordinary differential equations, very frequently these problems are
of Sturm–Liouville type. Such problems have a particular place in the quantum theory because
of the self-adjoint nature of the differential operators involved.
Definition 7.1 An equation of the form

d/dx [p(x) dy/dx] + q(x) y(x) + λ w(x) y(x) = 0,  x ∈ [a, b],

with p, q, and w specified and p(x) > 0, w(x) > 0 on [a, b], is said to be a Sturm–Liouville
equation.
7.1 EIGENVALUES
Note that both y(x) and λ are unspecified in Definition 7.1, so the solution of the Sturm–Liouville
equation is essentially an eigenvalue problem. The Sturm–Liouville differential equation
as written down above is a purely formal entity in the absence of boundary conditions. We
can define a new Hilbert space of square-integrable functions on [a, b], L²[a,b]|w, with an inner
product

⟨f|g⟩ = ∫_a^b f̄(x) g(x) w(x) dx. (7.1)

In order to properly define L̂ on our Hilbert space we need to add boundary conditions.
Now if we can find such conditions such that L̂ is self-adjoint, then:
(i) its eigenvalues would be real (Lemma A.24),
(ii) its eigenfunctions corresponding to distinct eigenvalues would be orthogonal
(Lemma A.25).
Let us look to see what such boundary conditions would look like. Consider

⟨f|L̂g⟩ = ∫_a^b f̄(x) {d/dx [p(x) g′(x)] + q(x) g(x)} dx;

integrating the first term on the right by parts we have

⟨f|L̂g⟩ = f̄(x) p(x) g′(x)|_a^b − ∫_a^b {f̄′(x) p(x) g′(x) − f̄(x) q(x) g(x)} dx,

⇒ ⟨f|L̂g⟩ − ⟨L̂f|g⟩ = f̄(x) p(x) g′(x)|_a^b − f̄′(x) p(x) g(x)|_a^b
  = f̄(b) p(b) g′(b) − f̄′(b) p(b) g(b) − f̄(a) p(a) g′(a) + f̄′(a) p(a) g(a).

The operator will be self-adjoint provided

[p(x) (f̄(x) dg(x)/dx − g(x) df̄(x)/dx)]_a^b = 0. (7.2)
If we define our differential operator to be the formal differential operator L̂ together with
boundary conditions of the form (7.2), then we have a self-adjoint operator on L²[a,b]|w. The
boundary conditions (7.2) would be satisfied if, for example, we were to consider functions g
defined on [a, b] satisfying

α₁ g′(a) + α₂ g(a) = 0,
β₁ g′(b) + β₂ g(b) = 0, (7.3)

where the α_i and β_i are constants, not both zero. It is important to recognize that we must choose the
same constants for all our functions. We will describe (7.3) as “regular” boundary conditions. If
the function p(x) is such that p(a) = p(b), then we can impose alternative “periodic boundary
conditions”
g(a) = g(b),
g′(a) = g′(b).
In either case the Sturm–Liouville differential operator L̂ is self-adjoint, with real eigenvalues
and orthogonal eigenvectors, provided only that the eigenvalues are non-degenerate.
But there is more. It can be shown [22] that in the regular case:
• The eigenvalues are simple; in other words the eigenfunctions are non-degenerate and
thus mutually orthogonal.
• The eigenfunction corresponding to the nth eigenvalue has n zeros on the open interval
(a, b).
• The orthonormal set of eigenfunctions forms a basis for the Hilbert space.
For the periodic problem most, but not all, of the above results still hold. The
eigenvalues will still be real, and the eigenfunctions orthogonal for different eigenvalues, with
the exception that there may be degeneracy.
Consider the periodic Sturm–Liouville problem
$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} = E\psi,$$
$$\psi(0) = \psi(L),$$
$$\psi'(0) = \psi'(L).$$
Its eigenvalues are
$$E_n = \frac{4\pi^2\hbar^2 n^2}{2mL^2},$$
and for each $E_n$, $n \neq 0$, there are two linearly independent solutions
$$\cos\Bigl(\frac{2\pi nx}{L}\Bigr), \qquad \sin\Bigl(\frac{2\pi nx}{L}\Bigr).$$
We can always use our Gram–Schmidt procedure to find orthogonal vectors that span the subspace corresponding to a degenerate eigenvalue, so even in the periodic case we can find an orthonormal set of eigenfunctions that form a basis, but the sequence of eigenvalues is not monotonically increasing.
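The Gram–Schmidt procedure referred to here can be sketched in a few lines. The following is a minimal illustration, not the book's code; the function name and the example vectors are my own choices:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a list of vectors (classical Gram-Schmidt)."""
    basis = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for b in basis:
            w = w - np.dot(b, w) * b   # remove the component along b
        norm = np.linalg.norm(w)
        if norm > 1e-12:               # skip numerically dependent vectors
            basis.append(w / norm)
    return basis

# Two non-orthogonal vectors spanning a (degenerate) two-dimensional subspace:
basis = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0]])
print(np.dot(basis[0], basis[1]))      # essentially zero
```

Each output vector is normalized and orthogonal to all earlier ones, which is exactly what is needed to orthogonalize a degenerate eigensubspace.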
7.2 LEAST SQUARES APPROXIMATION
Suppose $g$ is a member of $L^2[a,b|w]$; then we can expand it in terms of the orthonormal eigenfunctions of our Sturm–Liouville operator, $\{\phi_n\}_{n=1}^{\infty}$:
$$g(x) = \sum_{n=1}^{\infty} c_n\phi_n(x),$$
$$\|g\|^2 = \langle g|g\rangle = \sum_{n=1}^{\infty}|c_n|^2. \tag{7.4}$$
We want to choose the constants $b_i$ such that
$$f(x) = \sum_{i=1}^{N} b_i\phi_i(x)$$
is the "best" approximation to $g$. Just as we did in Chapter 5 we will look for the least squares fit, that is we want to minimize $\|f-g\|^2$:
$$\|f-g\|^2 = \langle f|f\rangle + \langle g|g\rangle - \langle f|g\rangle - \langle g|f\rangle = \sum_{i=1}^{N}\bigl[|b_i|^2 - \bar{b}_i c_i - \bar{c}_i b_i\bigr] + \|g\|^2. \tag{7.6}$$
We want to choose our set of $N$ coefficients $b_i$ so that the term in the square brackets is smallest. Write
$$F(\bar{b}, b) = \sum_{i=1}^{N}\bigl[\bar{b}_i b_i - \bar{b}_i c_i - \bar{c}_i b_i\bigr].$$
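Completing the square gives $F = \sum_i|b_i - c_i|^2 - \sum_i|c_i|^2$, so the best least-squares coefficients are simply $b_i = c_i = \langle\phi_i|g\rangle$. A small numerical illustration (my own choice of basis and test function, not the book's example): expand $g(x) = x(1-x)$ in the orthonormal functions $\phi_n(x) = \sqrt{2}\sin(n\pi x)$ on $[0,1]$ and watch the $L^2$ error of the truncated expansion fall as $N$ grows.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 2001)
dx = x[1] - x[0]
g = x * (1.0 - x)

def phi(n):
    # Orthonormal eigenfunctions of -d^2/dx^2 with u(0) = u(1) = 0
    return np.sqrt(2.0) * np.sin(n * np.pi * x)

def coeff(n):
    # c_n = <phi_n | g>, simple quadrature (integrand vanishes at endpoints)
    return np.sum(phi(n) * g) * dx

errs = []
for N in (1, 3, 5):
    f = sum(coeff(n) * phi(n) for n in range(1, N + 1))
    errs.append(np.sqrt(np.sum((f - g) ** 2) * dx))  # L2 norm of f - g
print(errs)  # monotonically decreasing
```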
CHAPTER 8

Case Study: The Quantum Oscillator
We want to find both the eigenvalues and the eigenfunctions. As you know, an eigenvalue problem can only have solutions for certain values of $E$. Suppose that $E$ is such a value; then the Schrödinger equation (8.1) can be written:
$$\frac{d^2\psi(x)}{dx^2} + k^2(x)\psi(x) = 0,$$
$$k(x) = \sqrt{\frac{2m}{\hbar^2}\bigl[E - V(x)\bigr]},$$
$$\psi(x) \to 0 \text{ as } x \to \pm\infty. \tag{8.2}$$
At this early stage it is helpful to focus on (8.2) and to see how much of the character of
the solution we can deduce before we start calculating.
We immediately recognize that we have a regular Sturm–Liouville problem, so we expect
that the eigenvalues will be real, non-degenerate, bounded below and that they can be ordered
$$E_0 < E_1 < E_2 < \cdots \tag{8.3}$$
Further, the eigenfunction corresponding to the nth eigenvalue will have exactly n zeros.
If the potential is symmetric, $V(x) = V(-x)$, then we make use of the following result.

Lemma 8.1 Suppose the potential $V$ is such that
$$V(-x) = V(x)$$
and each bound state level corresponds to only one independent solution; then
$$\psi(-x) = \pm\psi(x).$$
If we have the positive sign then we say the wave function has even parity, and if negative we say that we have odd parity.
Proof. Suppose $\psi(x)$ is a solution of the Schrödinger equation corresponding to the energy $E$; then
$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + V(x)\psi(x) = E\psi(x). \tag{8.4}$$
Now we can always replace $x$ by $-x$ in (8.4) and, remembering that $V(-x) = V(x)$, we see that $\psi(-x)$ is also a solution corresponding to the same eigenvalue; since the eigenvalues are non degenerate it follows that $\psi(x)$ and $\psi(-x)$ must be linearly dependent, i.e.,
$$\psi(x) = C\psi(-x),$$
hence $C^2 = 1$, therefore
$$\psi(-x) = \pm\psi(x). \tag{8.5}$$
For a given value of the energy E , we can divide space into three regions, depending on the
value of k.x/ in (8.2). This has something of the character of the classical problem we discussed
in Chapter 4.
In region 2,
$$E \geq V, \qquad k^2(x) > 0,$$
and
$$E - V(x_{tp}) = 0 \tag{8.6}$$
marks the "turning points" between the classically allowed and quantum regions. We can make use of our knowledge of Sturm–Liouville equations to create a computer code to get an estimate of the eigenvalues and eigenfunctions. We know that the eigenfunction corresponding to the $n$th eigenvalue has $n$ zeros; smaller eigenvalues will have fewer zeros, bigger eigenvalues will have more. We expect to find all the zeros in region 2. Our potential is symmetric, therefore we know that the eigenfunctions will be either symmetric or antisymmetric; either way, if $x_0 > 0$ is a nodal point then $-x_0$ is also a nodal point. Further, if $\psi(x)$ has odd parity:
$$\psi(-h) = -\psi(h),$$
$$\psi(-h) + \psi(h) = 0,$$
$$\Rightarrow 2\psi(0) + O(h^2) = 0,$$
$$\Rightarrow \psi(0) = 0; \tag{8.7}$$
and if $\psi(x)$ has even parity:
$$\psi(-h) = \psi(h),$$
$$\frac{\psi(h) - \psi(-h)}{h} = 0, \quad h \to 0,$$
$$\Rightarrow \psi'(0) = 0. \tag{8.8}$$
For either $m$ even or odd, we will have exactly the same number of zeros for $x > 0$ and $x < 0$. If $\psi$ is an odd function it must, as we have just seen, have a zero at the origin, so it must have an odd number of zeros; and if $\psi$ is an even function it must have an even number of zeros. (If it had an odd number of nodes, there would have to be one node at the origin, since there is exactly the same number of nodes for $x > 0$ and $x < 0$; but then $\psi(0) = 0 = \psi'(0)$, so the leading term in the Taylor expansion is just $\frac{1}{2}x^2\psi''(0) = -\frac{1}{2}x^2 k^2(0)\psi(0) = 0$, and so on for the higher terms, i.e., the function is identically zero.)
In summary:
• if the function has $m$ nodes and is odd then there must be one node at the origin and $\frac{m-1}{2}$ nodes for $x > 0$,
• if the function has $m$ nodes and is even then there must be $\frac{m}{2}$ nodes for $x > 0$;
either way we need only solve for $x \geq 0$.
8.2 NUMERICAL SOLUTION FOR THE OSCILLATOR
While the quantum oscillator problem admits a relatively simple analytic solution, see Ap-
pendix B, our ambition here is to find an efficient numerical approach to calculate the eigen-
functions and eigenvalues.
The simplest approach is called the "shooting method." It searches for a function with a pre-determined number, $n$, of zeros. It is assumed that the actual eigenvalue $E_n$ lies somewhere in an energy range $[E_{\min}, E_{\max}]$. An energy $E$ is taken to be
$$E = \frac{E_{\max} + E_{\min}}{2}. \tag{8.9}$$
The energy range should contain the desired eigenvalue $E_n$. The wave function is integrated starting from $x = 0$ in the direction of positive $x$; at the same time, the number of nodes, $m$ (i.e., of changes of sign of the function), is counted. If the number of nodes is larger than $n$, $E$ is too high; if the number of nodes is smaller than $n$, $E$ is too low. A new interval is defined by
$$E_{\max} = E \text{ if } m > n,$$
$$E_{\min} = E \text{ if } m < n. \tag{8.10}$$
A replacement $E$ is found from (8.9) and the procedure repeated; once the energy interval is smaller than a pre-determined threshold, we assume that convergence has been reached.
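The bisection loop (8.9)–(8.10) can be sketched as follows. This is a minimal illustration, not the book's code: it uses a plain three-point recursion rather than the Numerov scheme employed later, units $\hbar = m = \omega = 1$ (so the exact oscillator eigenvalues are $E_n = n + \tfrac{1}{2}$), and function names of my own choosing.

```python
import numpy as np

def count_nodes(E, xmax=10.0, npts=3000):
    # Integrate psi'' = (x^2 - 2E) psi outward from x = 0
    # (harmonic oscillator, hbar = m = omega = 1), counting sign changes.
    x = np.linspace(0.0, xmax, npts)
    h = x[1] - x[0]
    psi_prev, psi_cur = 0.0, h          # odd-parity start: psi(0) = 0
    nodes = 0
    for i in range(1, npts - 1):
        psi_next = 2.0 * psi_cur - psi_prev + h * h * (x[i] ** 2 - 2.0 * E) * psi_cur
        if psi_next * psi_cur < 0.0:
            nodes += 1
        psi_prev, psi_cur = psi_cur, psi_next
    return nodes

def bisect_energy(target_nodes, Emin=0.0, Emax=10.0, tol=1e-8):
    # Eqs. (8.9)-(8.10): bisect on the node count for x > 0.
    while Emax - Emin > tol:
        E = 0.5 * (Emax + Emin)
        if count_nodes(E) > target_nodes:
            Emax = E
        else:
            Emin = E
    return 0.5 * (Emax + Emin)

# n = 3 is odd: one node at the origin plus (3 - 1)/2 = 1 node for x > 0
E3 = bisect_energy(1)
print(E3)   # exact value is 3.5 (E_n = n + 1/2)
```

The node count jumps by one each time $E$ crosses an eigenvalue of the given parity, which is what makes the bisection converge.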
I wrote a code to study the $n = 3$ eigenvalue and eigenfunction of the harmonic oscillator potential, where units were chosen such that $\hbar = m = \omega = 1$ and the range of $x$ was taken to be
$$-10 \leq x \leq 10,$$
which was divided into 300 equally spaced intervals. The potential $V(x)$ was calculated at each grid point and stored in an array $V(i)$; then the initial values of $E_{\max}$ and $E_{\min}$ were determined from the stored potential. The Numerov method was used, where we take
$$s(x) = 0,$$
$$g(x) = \frac{2m}{\hbar^2}\bigl[E - V(x)\bigr],$$
Table 8.1: Results from Numerov.

Figure 8.1: Given a test energy, space is divided into 3 regions. Regions 1 and 3, with $E < V$, are classically forbidden.
The analytic and numerical solutions are in moderate agreement, but they diverge dramatically once we pass the turning points.
The problem with this approach lies in the fact that the code tried to integrate from $x = 0$ in region 2 to large $x$ in region 3, but as we have seen the solution to the formal differential equation allows for an exponentially increasing solution, which we don't want, as well as an exponentially decreasing solution, which we do want. If even a tiny amount of the exponentially increasing solution (due to numerical noise, for instance) is present at the turning point, the integration algorithm will inexorably make it grow in the classically forbidden region. In order to deal with this problem we can go "far" into the quantum region, where we can reasonably assume the wave function is close to zero, and integrate backward to the turning point, where we "match" to the solution integrated from 0 in the classical region.
At the turning point we require that the function obtained by integrating in from large $x_{\max}$ in region 3, $\psi^{(3)}(x)$, matches the solution obtained by integrating out from 0, $\psi^{(2)}(x)$. Matching means that we require that both the functions and their first derivatives are continuous. If we have found the correct eigenvalue then we have our solution.
A second code was written for the harmonic oscillator. Two integrations were performed: a forward recursion, in region 2, starting from $x = 0$, and a backward one, in region 3, starting from $x_{\max}$. The matching point was chosen to be the grid point, $i_{tp}$, nearest to $x_{tp}$. Note that $x_{tp}$ will vary with the choice of $E$. The outward integration is performed until grid point $i_{tp}$, yielding a function $\psi^{(2)}(x)$ defined in region 2; of course, because of symmetry, we only need to integrate from 0 to $x_{tp}$. The number $n$ of changes of sign is counted in the same way as before.
Figure 8.2: Comparison between the analytic solution, red dashed, with the numerical solution, blue dotted, using the node counting approach.
We note that there is no need to look for changes of sign beyond $x_{tp}$: we expect that in the classically forbidden region there will not be any nodes (no oscillations, just exponentially decaying or increasing solutions).

If the number of nodes is the expected one, the code starts to integrate inward from the rightmost points. It goes one grid point beyond $x_{\max}$, say grid point $n+1$, and then puts
$$\psi_{n+1} = 0,$$
$$\psi_n = h,$$
and then uses the Numerov formula to integrate to $i_{tp}$. Continuity at this point is easily achieved by simply scaling the solution in region 3 by
$$\frac{\psi^{(2)}(i_{tp})}{\psi^{(3)}(i_{tp})}.$$
Forcing the two solutions to have identical first derivatives is a little more demanding. If we use our Taylor expansion on both functions then
$$\psi^{(3)}(i_{tp}+1) = \psi^{(3)}(i_{tp}) + \psi^{(3)\prime}(i_{tp})h + \frac{1}{2}\psi^{(3)\prime\prime}(i_{tp})h^2 + O(h^3),$$
$$\psi^{(2)}(i_{tp}-1) = \psi^{(2)}(i_{tp}) - \psi^{(2)\prime}(i_{tp})h + \frac{1}{2}\psi^{(2)\prime\prime}(i_{tp})h^2 + O(h^3). \tag{8.11}$$
Figure 8.3: Comparison between the analytic solution, red solid, with the numerical solution using the extended approach to $O(h^3)$, blue crosses, for the $n = 3$ eigenfunction of the quantum oscillator.
Now, by construction, $\psi^{(2)}(i_{tp}) = \psi^{(3)}(i_{tp}) \equiv \psi(i_{tp})$, and from (8.2), $\psi'' = -g\psi$. Therefore, to $O(h^2)$, adding the two expansions in (8.11) gives the derivative mismatch
$$\psi^{(3)\prime}(i_{tp}) - \psi^{(2)\prime}(i_{tp}) = \frac{\psi^{(3)}(i_{tp}+1) + \psi^{(2)}(i_{tp}-1) - \bigl(2 - h^2 g_{i_{tp}}\bigr)\psi(i_{tp})}{h}. \tag{8.12}$$
The jump condition (8.12) depends on our choice of $E$ both through the functions $\psi^{(2)}$ and $\psi^{(3)}$ and through $g_{i_{tp}}$, and thus (8.12) can be solved for zero difference by successive bisections; this was the approach incorporated in the second code.
In Figure 8.3, a comparison is presented between the analytic solution and the numerical solution using the modified code; the numerical results are visually indistinguishable from the analytic result.
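The Numerov formula used above can be written, for $\psi'' + g(x)\psi = 0$ and $f_i \equiv 1 + h^2 g_i/12$, as $\psi_{i+1} = [(12 - 10f_i)\psi_i - f_{i-1}\psi_{i-1}]/f_{i+1}$. The following is a minimal sketch (the function name is mine), checked against a problem with a known solution rather than the oscillator itself:

```python
import numpy as np

def numerov(g, psi0, psi1, x):
    """Integrate psi'' + g(x) psi = 0 on the uniform grid x,
    given the first two values, using the O(h^6) Numerov recursion."""
    h2 = (x[1] - x[0]) ** 2
    f = 1.0 + h2 * g(x) / 12.0
    psi = np.empty_like(x)
    psi[0], psi[1] = psi0, psi1
    for i in range(1, len(x) - 1):
        psi[i + 1] = ((12.0 - 10.0 * f[i]) * psi[i] - f[i - 1] * psi[i - 1]) / f[i + 1]
    return psi

# Check against psi'' + psi = 0 with psi = sin(x):
x = np.linspace(0.0, np.pi, 201)
psi = numerov(lambda t: np.ones_like(t), 0.0, np.sin(x[1]), x)
print(np.max(np.abs(psi - np.sin(x))))  # very small
```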
CHAPTER 9
Variational Principles
Minimization principles have a special place in numerical methods and they form one of the
most wide-ranging means of formulating mathematical models governing the equilibrium con-
figurations of physical systems.
Theorem 9.1 Suppose $\hat{H}$ is a self adjoint operator with a complete orthonormal set of eigenfunctions $\psi_n$ and corresponding eigenvalues ordered so that
$$E_0 < E_1 \leq E_2 \leq \cdots \leq E_n \leq E_{n+1} \leq \cdots.$$
Then, for any normalized state $\psi$,
$$\langle\psi|\hat{H}\psi\rangle \geq E_0,$$
with equality iff $\psi = \psi_0$.
Proof. Since $\hat{H}$ is self adjoint its eigenvalues are real and the basis of eigenfunctions can be chosen such that:
$$\langle\psi_n|\psi_m\rangle = \delta_{nm}.$$
Hence, expanding $\psi = \sum_n c_n\psi_n$,
$$\langle\psi|\hat{H}\psi\rangle = \sum_{n,m=0}^{\infty}\bar{c}_m c_n\langle\psi_m|\hat{H}\psi_n\rangle = \sum_{n,m=0}^{\infty}\bar{c}_m c_n E_n\langle\psi_m|\psi_n\rangle = \sum_{n=0}^{\infty}|c_n|^2 E_n$$
$$= E_0\sum_{n=0}^{\infty}|c_n|^2 + \sum_{n=0}^{\infty}|c_n|^2\bigl[E_n - E_0\bigr] \geq E_0,$$
since $E_n \geq E_0$ for all $n$ and $\sum_n|c_n|^2 = 1$.
Clearly, the conditions of the theorem apply to any regular Sturm–Liouville problem, as we discussed in Chapter 7. We further note that if $\psi_0$ is the actual normalized state corresponding to $E_0$ then
$$\langle\psi_0|\hat{H}\psi_0\rangle = E_0, \tag{9.1}$$
and since $E_n > E_0$ for all $n > 0$, if $\psi$ is any other function then
$$\langle\psi|\hat{H}\psi\rangle > E_0. \tag{9.2}$$
This theorem is known as the Rayleigh–Ritz Theorem. The result is not restricted to the finite dimensional case, so it is equally valid for any of the infinite dimensional spaces of functions we have met so far. Further, if we know, from experiment say, the value of $E_0$, then if we can find $\psi$ that minimizes $\langle\psi|\hat{H}\psi\rangle$ we will have found the ground state wavefunction, $\psi_0$. This observation underlies the various variational approaches to structure studies of many body quantum mechanical systems [23–25].
Now consider a family of states, $\{\psi(\alpha)\}$, depending on real parameters
$$\alpha_1, \ldots, \alpha_N;$$
we can relax our assumption that the states are normalized and define
$$E(\alpha) = \frac{\langle\psi(\alpha)|H|\psi(\alpha)\rangle}{\langle\psi(\alpha)|\psi(\alpha)\rangle}. \tag{9.3}$$
$E(\alpha)$ is known as the Rayleigh–Ritz quotient. From Theorem 9.1 we have that $E(\alpha) \geq E_0$, and now we try to make $E(\alpha)$ as small as possible. Now the condition that the real function $E(\alpha)$ be stationary is [11]
$$\frac{\partial E(\alpha)}{\partial\alpha_i} = 0. \tag{9.4}$$
Example 9.2 Suppose we want to find the ground state energy of the one-dimensional harmonic oscillator
$$\hat{H} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{1}{2}m\omega^2 x^2. \tag{9.5}$$
Now we already know the exact answer, but let's try out our variational approach. We pick our trial function to be the "Gaussian"
$$\psi_T(x) = Ae^{-bx^2}, \tag{9.6}$$
where normalization requires
$$|A|^2 = \sqrt{\frac{2b}{\pi}}. \tag{9.7}$$
Then
$$\langle\psi_T|\hat{H}\psi_T\rangle = |A|^2\left[-\frac{\hbar^2}{2m}\int_{-\infty}^{\infty}e^{-bx^2}\frac{d^2}{dx^2}\bigl(e^{-bx^2}\bigr)dx + \frac{m\omega^2}{2}\int_{-\infty}^{\infty}x^2 e^{-2bx^2}dx\right]$$
$$= \frac{\hbar^2 b}{2m} + \frac{m\omega^2}{8b}. \tag{9.8}$$
We have only one free parameter, $b$:
$$f(b) = \langle\psi_T|\hat{H}\psi_T\rangle = \frac{\hbar^2 b}{2m} + \frac{m\omega^2}{8b},$$
$$\frac{df(b)}{db} = \frac{\hbar^2}{2m} - \frac{m\omega^2}{8b^2} = 0,$$
$$\Rightarrow b = \frac{m\omega}{2\hbar},$$
$$\Rightarrow E_T = \frac{1}{2}\hbar\omega,$$
$$\Rightarrow \psi_T = \Bigl(\frac{m\omega}{\pi\hbar}\Bigr)^{1/4}\exp\Bigl(-\frac{m\omega x^2}{2\hbar}\Bigr). \tag{9.9}$$
We have unearthed the “exact” solution and we can’t do any better.
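As a sanity check, the one-parameter minimization can also be done numerically. A sketch in the units $\hbar = m = \omega = 1$ (the crude grid search is my own choice, purely for illustration):

```python
import numpy as np

# hbar = m = omega = 1, so f(b) = <psi_T | H psi_T> = b/2 + 1/(8b)
b = np.linspace(0.01, 2.0, 200001)
f = b / 2.0 + 1.0 / (8.0 * b)
i = np.argmin(f)
print(b[i], f[i])  # b -> m omega / (2 hbar) = 0.5, E_T -> hbar omega / 2 = 0.5
```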
$$\hat{H} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x).$$
Figure 9.1: The points $a$ and $b$ can be connected by an infinite number of different paths; in this figure we show just 3: $y = y_1(x)$, $y = y(x)$, $y = y_2(x)$.
Define
$$I(\bar{\psi}, \psi) = \langle\psi|\hat{H}\psi\rangle = \int_{-\infty}^{\infty}\bar{\psi}(x)\left[-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x)\right]\psi(x)\,dx. \tag{9.10}$$
For any given $\psi(x)$ and $\bar{\psi}(x)$ the integral $I$ in (9.10) is just a number. Our task is to find the $\psi$, out of the infinity of possible $\psi$'s, that minimizes $I(\bar{\psi}, \psi)$ while at the same time satisfying the constraint that
$$\langle\psi|\psi\rangle = \int_{-\infty}^{\infty}\bar{\psi}(x)\psi(x)\,dx = 1. \tag{9.11}$$
If $F$ does not depend explicitly on $y$, then
$$\frac{\partial F}{\partial y} = 0 \Rightarrow \frac{\partial F}{\partial y'} = \text{constant}; \tag{9.21}$$
and if $F$ does not depend explicitly on $x$, then
$$F - y'\frac{\partial F}{\partial y'} = \text{constant}. \tag{9.22}$$
The result (9.20) can be extended in a straightforward manner to more than one dependent variable. Suppose $F = F(y_1, \ldots, y_n; y_1', \ldots, y_n'; x)$. Then the required path $y(x)$ which yields extrema satisfies the set of equations
$$\frac{\partial F}{\partial y_j} - \frac{d}{dx}\left(\frac{\partial F}{\partial y_j'}\right) = 0, \qquad 1 \leq j \leq n. \tag{9.23}$$
Example 9.3 Consider a particle, mass $m$, moving in space under the effect of a scalar potential $V(x,y,z)$; then define the Lagrangian
$$L = \frac{1}{2}m(\dot{x}^2 + \dot{y}^2 + \dot{z}^2) - V(x, y, z). \tag{9.24}$$
Then, requiring the integral
$$I = \int_{t_1}^{t_2}L(x, y, z, \dot{x}, \dot{y}, \dot{z})\,dt \tag{9.25}$$
to be stationary recovers Newton's equations of motion. More generally, suppose a system is described by generalized coordinates
$$q_1(t), \ldots, q_N(t),$$
with kinetic energy $T(q_i, \dot{q}_i)$ and potential energy $V(q_i, t)$; then the motion of the system from time $t_1$ to time $t_2$ is such as to render the "action integral"
$$I = \int_{t_1}^{t_2}L(q_i, \dot{q}_i, t)\,dt$$
stationary, where $L = T - V$.
A few comments.
• The name is a bit misleading. I didn't have to require the integral to have a minimum to recover Newton's laws in the form (9.27); I only needed $I$ to be stationary.
• The choice of the $q_i$'s is not restricted to Cartesian coordinates.
For example, consider a particle whose position in an inertial frame is
$$\mathbf{r} = (x, y, z).$$
Now measure the motion of the particle w.r.t. a rotating coordinate system with angular velocity
$$\boldsymbol{\omega} = (0, 0, \omega).$$
If $\mathbf{r}' = (x', y', z')$ are the coordinates in the rotating system, then
$$z = z', \qquad x = x'\cos\omega t - y'\sin\omega t, \qquad y = y'\cos\omega t + x'\sin\omega t,$$
$$\Rightarrow \dot{z} = \dot{z}',$$
$$\dot{x} = \dot{x}'\cos\omega t - \dot{y}'\sin\omega t - x'\omega\sin\omega t - \omega y'\cos\omega t,$$
$$\dot{y} = \dot{y}'\cos\omega t + \dot{x}'\sin\omega t - y'\omega\sin\omega t + x'\omega\cos\omega t,$$
$$\Rightarrow \dot{x}^2 + \dot{y}^2 + \dot{z}^2 = \omega^2\bigl[x'^2 + y'^2\bigr] + (\dot{x}'^2 + \dot{y}'^2) + 2\omega(x'\dot{y}' - \dot{x}'y') + \dot{z}'^2,$$
$$\Rightarrow L(\mathbf{r}', \dot{\mathbf{r}}') = \frac{m}{2}\left[\omega^2(x'^2 + y'^2) + (\dot{x}'^2 + \dot{y}'^2 + \dot{z}'^2) + 2\omega(x'\dot{y}' - \dot{x}'y')\right]. \tag{9.29}$$
Applying the Euler–Lagrange equations to (9.29),
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{x}'} = m\frac{d}{dt}\bigl(\dot{x}' - \omega y'\bigr) = m\bigl(\ddot{x}' - \omega\dot{y}'\bigr),$$
$$\frac{d}{dt}\frac{\partial L}{\partial\dot{y}'} = m\frac{d}{dt}\bigl(\dot{y}' + \omega x'\bigr) = m\bigl(\ddot{y}' + \omega\dot{x}'\bigr),$$
$$\frac{\partial L}{\partial x'} = m\bigl(\omega^2 x' + \omega\dot{y}'\bigr),$$
$$\frac{\partial L}{\partial y'} = m\bigl(\omega^2 y' - \omega\dot{x}'\bigr), \tag{9.30}$$
$$\Rightarrow \ddot{x}' = \omega^2 x' + 2\omega\dot{y}',$$
$$\ddot{y}' = \omega^2 y' - 2\omega\dot{x}',$$
$$\Rightarrow m\ddot{\mathbf{r}}' = -m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\mathbf{r}') - 2m\boldsymbol{\omega}\times\dot{\mathbf{r}}'. \tag{9.31}$$
Thus, we have recovered the “fictitious forces” characteristic of a non-inertial frame, the
“centrifugal” and “Coriolis” terms [17].
where C is a constant.
We generalize our earlier argument and introduce a two-parameter family of curves
$$Y(x, \epsilon_1, \epsilon_2).$$
Expanding in a Taylor series, (2.6), in $\epsilon_1$ and $\epsilon_2$ and integrating by parts, then putting $\epsilon_i = 0$, we find that
$$\frac{\partial I}{\partial\epsilon_i} = \int_{x_a}^{x_b}\left[\frac{\partial H}{\partial y} - \frac{d}{dx}\frac{\partial H}{\partial y'}\right]\eta_i(x)\,dx = 0,$$
$$\Rightarrow \frac{\partial H}{\partial y} - \frac{d}{dx}\frac{\partial H}{\partial y'} = 0. \tag{9.38}$$
This is just like the Euler–Lagrange equation (9.20) except that $H = F - \lambda G$ replaces $F$. Note the solution of the Euler–Lagrange equation involves two constants of integration, which together with the constraint condition are enough to ensure that $y(x)$ passes through $(x_a, a)$ and $(x_b, b)$.
Generalization of these results to include multiple constraints is not difficult. If we have $M$ constraints:
$$J_i = \int_{x_a}^{x_b}G_i(y, y', x)\,dx, \qquad 1 \leq i \leq M.$$
Example 9.5 Let us look for the wave function $\psi(x)$ which minimizes
$$\langle\psi|\hat{H}\psi\rangle = \int_{-\infty}^{\infty}\bar{\psi}(x)\hat{H}\psi(x)\,dx \tag{9.41}$$
subject to the normalization constraint
$$\int_{-\infty}^{\infty}\bar{\psi}(x)\psi(x)\,dx = 1, \tag{9.42}$$
where
$$\hat{H} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x). \tag{9.43}$$
We require
$$\lim_{x\to\pm\infty}\psi(x) = 0.$$
Therefore,
$$\int_{-\infty}^{\infty}\bar{\psi}\frac{d^2\psi}{dx^2}\,dx = \bar{\psi}\frac{d\psi}{dx}\Bigg|_{-\infty}^{\infty} - \int_{-\infty}^{\infty}\frac{d\bar{\psi}}{dx}\frac{d\psi}{dx}\,dx = -\int_{-\infty}^{\infty}\frac{d\bar{\psi}}{dx}\frac{d\psi}{dx}\,dx. \tag{9.44}$$
Our task is to minimize
$$\langle\psi|\hat{H}\psi\rangle = \int_{-\infty}^{\infty}\left[\frac{\hbar^2}{2m}\frac{d\bar{\psi}}{dx}\frac{d\psi}{dx} + V\bar{\psi}\psi\right]dx \tag{9.45}$$
subject to (9.42). If we treat $\psi$ and $\bar{\psi}$ as two dependent variables and add the constraint using a Lagrange multiplier $\epsilon$, the Euler–Lagrange equations become
$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + V\psi - \epsilon\psi = 0,$$
$$-\frac{\hbar^2}{2m}\frac{d^2\bar{\psi}}{dx^2} + V\bar{\psi} - \epsilon\bar{\psi} = 0,$$
$$\int_{-\infty}^{\infty}\bar{\psi}(x)\psi(x)\,dx = 1. \tag{9.46}$$
The first and second of these equations in (9.46) are equivalent, so we deduce that $\psi$ satisfies the Schrödinger equation:
$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + V\psi = \epsilon\psi. \tag{9.47}$$
We can thus identify the Lagrange multiplier with the energy of the physical system. It is a straightforward matter to extend this result to three dimensions, i.e., the square integrable function $\psi(\mathbf{r})$ which minimizes
$$\langle\psi|\hat{H}\psi\rangle = \iiint\bar{\psi}(\mathbf{r})\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right]\psi(\mathbf{r})\,d^3r \tag{9.48}$$
subject to the constraint that
$$\iiint\bar{\psi}(\mathbf{r})\psi(\mathbf{r})\,d^3r = 1, \tag{9.49}$$
satisfies the three-dimensional Schrödinger equation.
Consider the functionals
$$I[y] = \int_a^b\bigl(p(x)(y')^2 + q(x)y^2\bigr)dx, \qquad G[y] = \int_a^b w(x)y^2\,dx,$$
with $p(x), w(x) > 0$ on $[a,b]$. Let us introduce a Lagrange multiplier $\lambda$; then (9.36) tells us that $I - \lambda G$ is stationary when
$$\frac{d(2py')}{dx} = 2qy - 2\lambda wy$$
$$\Rightarrow -\frac{d(py')}{dx} + qy = \lambda wy, \tag{9.53}$$
which is the Sturm–Liouville equation. If we multiply (9.53) by $y$ and integrate, using the constraint $G[y] = 1$, we find
$$\int_a^b\left[-y\frac{d(py')}{dx} + qy^2\right]dx = \lambda\int_a^b wy^2\,dx = \lambda G[y] = \lambda, \tag{9.54}$$
where we have assumed the usual Sturm–Liouville boundary conditions, (7.2). Thus, the stationary values of
$$F[y] = \frac{\int_a^b\bigl(p(y')^2 + qy^2\bigr)dx}{\int_a^b y^2(x)w(x)\,dx} = \frac{\int_a^b y\bigl(-(py')' + qy\bigr)dx}{\int_a^b y^2(x)w(x)\,dx} \tag{9.56}$$
are given by
$$F[y_n(x)] = \lambda_n,$$
where the $\lambda_n$ are the eigenvalues of the Sturm–Liouville operator corresponding to the eigenfunctions $y_n$.
In summary, the following three problems are equivalent:
(i) Find the eigenvalues $\lambda$ and the eigenfunctions $y(x)$ that solve the Sturm–Liouville problem
$$-\frac{d(p(x)y')}{dx} + q(x)y = \lambda w(x)y,$$
$$ypy'\Big|_a^b = 0,$$
$$w(x), p(x) > 0, \quad x \in (a, b).$$
(ii) Find the functions $y(x)$, satisfying the boundary conditions, for which the quotient $F[y]$ of (9.56) is stationary. The eigenvalues of the equivalent Sturm–Liouville problem in (i) are given by $F[y]$.

(iii) Find the functions $y(x)$, subject to the constraint $G[y] = 1$, for which $H[y] = I[y]$ is stationary. The eigenvalues of the Sturm–Liouville problem are then given by the values of $H[y]$.
We can make use of these equivalences to estimate the eigenvalues and eigenfunctions of a Sturm–Liouville problem. Consider
$$u'' + \lambda u = 0,$$
$$u(0) = u(1) = 0, \tag{9.57}$$
whose exact eigenfunctions and eigenvalues are
$$u_n = \sin(n\pi x),$$
$$\lambda_n = (n\pi)^2,$$
$$n = 1, 2, \ldots. \tag{9.58}$$
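Problem (9.57) can also be checked directly by finite differences, which gives a useful baseline before trying trial functions. A minimal sketch (the grid size is my own choice):

```python
import numpy as np

# Finite-difference check of (9.57): -u'' = lam u, u(0) = u(1) = 0.
N = 200                             # interior grid points
h = 1.0 / (N + 1)
main = 2.0 * np.ones(N) / h**2      # tridiagonal matrix for -d^2/dx^2
off = -1.0 * np.ones(N - 1) / h**2
A = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)
lam = np.linalg.eigvalsh(A)         # sorted eigenvalues
print(lam[:3])                      # approximately pi^2, (2 pi)^2, (3 pi)^2
```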
We are looking for the lowest eigenvalue. Let us try two different test functions.
(i) The "hat" function
$$u_h(x) = \begin{cases} x, & 0 \leq x \leq \frac{1}{2},\\ 1 - x, & \frac{1}{2} \leq x \leq 1,\end{cases} \tag{9.59}$$
$$\int_0^1 u_h^2\,dx = \frac{1}{12}, \qquad \bar{u}_h = \sqrt{12}\,u_h. \tag{9.60}$$
(ii) The quadratic
$$u_q = x(x - 1),$$
$$\int_0^1 u_q^2\,dx = \frac{1}{30},$$
$$\bar{u}_q = \sqrt{30}\,u_q. \tag{9.61}$$
In Figure 9.2, I show a comparison between the normalized trial functions $\bar{u}_h$ and $\bar{u}_q$ and the exact solution $u_1(x)$. Now let us consider the Rayleigh quotient for both trial functions to get an estimate for the eigenvalue $\lambda$:
$$\lambda_h = \frac{\int_0^1(u_h')^2\,dx}{\int_0^1 u_h^2\,dx} = 12,$$
$$\lambda_q = \frac{\int_0^1(u_q')^2\,dx}{\int_0^1 u_q^2\,dx} = 10. \tag{9.62}$$
The true value of the eigenvalue is $\pi^2 \approx 9.8696$; as expected, both approximate values are greater than the exact value, with $\lambda_h$ overestimating the exact value by approximately 21% and $\lambda_q$ overestimating it by only 1.3%.
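The quotients in (9.62) are easy to reproduce numerically. A sketch using simple quadrature (the grid resolution and function names are my own):

```python
import numpy as np

x = np.linspace(0.0, 1.0, 200001)
dx = x[1] - x[0]
uh = np.minimum(x, 1.0 - x)     # the "hat" trial function (9.59)
uq = x * (x - 1.0)              # the quadratic trial function

def rayleigh(u):
    # lambda[u] = integral(u'^2) / integral(u^2), with w(x) = 1
    du = np.gradient(u, x)
    return (np.sum(du * du) * dx) / (np.sum(u * u) * dx)

print(rayleigh(uh), rayleigh(uq), np.pi ** 2)  # ~12, ~10, 9.8696...
```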
Figure 9.2: Trial functions, $u_h(x)$, red long dashed, $u_q(x)$, blue short dashed, compared with exact eigenfunction $u_1(x)$, solid green.
CHAPTER 10

Case Study: The Ground State of Atoms

The hydrogenic quantum numbers take the values
$$n = 1, 2, \ldots,$$
$$l = 0, 1, \ldots, n - 1,$$
$$m = -l, -l + 1, \ldots, l - 1, l,$$
and $Y_{lm}(\theta, \phi)$ is a "spherical harmonic."
The first few radial functions may be found in [23, 26]. Since the nuclear mass is very much bigger than the electron mass we can just take $\mu = 1$, which is what I will do from now on.
$$\Psi_{n_1 l_1 m_1, n_2 l_2 m_2}(\mathbf{r}_1, \mathbf{r}_2) = \psi_{n_1 l_1 m_1}(\mathbf{r}_1)\psi_{n_2 l_2 m_2}(\mathbf{r}_2), \tag{10.7}$$
where the $\psi_{n_i l_i m_i}(\mathbf{r}_i)$ are the usual energy eigenstates of the hydrogenic ion with nuclear charge $Z$.
You should remember that the electrons are fermions so we cannot put them in the same state.
However, electrons also have a spin degree of freedom, which we have neglected in (10.7). This means that two electrons can have the same spatial wavefunction as long as one is spin up and the other spin down. The energy in this approximation is just the sum of the energies of both orbitals:
$$E = -Z^2\left[\frac{1}{n_1^2} + \frac{1}{n_2^2}\right]\text{Ryd}.$$
Setting $Z = 2$, $n_1 = n_2 = 1$ for helium we get a ground state energy of $-8\,\text{Ryd} \approx -108.8$ eV. The ground state of helium has a measured energy which is very close to $-79$ eV. Clearly, we need to take into account the interaction term to get a better estimate. I will explore two methods of finding an improved approximate solution for the two electron ion.
• I will look at a “perturbative” approach. Perturbation theory is a systematic method for
finding an approximate solution to a problem, by starting from the exact solution of
a related, simpler problem. I will only use the first-order theory as outlined in Ap-
pendix C.
• The variational approach discussed in the previous chapter.
We can take $\Psi_{n_1 l_1 m_1, n_2 l_2 m_2}(\mathbf{r}_1, \mathbf{r}_2)$ as defined in (10.7) to be our unperturbed state. If we are to apply perturbation theory we need to be able to assume that the neglected term, the $e$–$e$ interaction, is smaller than the unperturbed term. Both the electron–nucleus and electron–electron terms are Coulomb interactions differing by a factor $Z$, so crudely our perturbation is $\frac{1}{Z}$ smaller. This factor is only a half for helium, so we might expect that perturbation theory will only give a very crude estimate of the correction. Our hydrogenic ground state orbital is given by (10.5):
$$\psi_{1,0,0}(\mathbf{r}) = Z^{3/2}e^{-Zr}\frac{1}{\sqrt{\pi}}.$$
Hence,
$$\Delta E = \langle\Psi_{1,0,0,1,0,0}|H_I\Psi_{1,0,0,1,0,0}\rangle = \int\frac{|\psi_{1,0,0}(\mathbf{r}_1)|^2\,|\psi_{1,0,0}(\mathbf{r}_2)|^2}{\|\mathbf{r}_1 - \mathbf{r}_2\|}d^3r_1\,d^3r_2 = \frac{5Z}{4}\,\text{Ryd}, \tag{10.8}$$
where I have made use of the integral [23]
$$\int\frac{e^{-2Z[r_1 + r_2]}}{\|\mathbf{r}_1 - \mathbf{r}_2\|}d^3r_1\,d^3r_2 = \frac{20\pi^2}{(2Z)^5} = \frac{5\pi^2}{8Z^5}. \tag{10.9}$$
So, the first-order correction is a positive term and yields a ground state energy:
$$\left(-8 + \frac{5}{2}\right)\text{Ryd} \approx -74.8\text{ eV}. \tag{10.10}$$
This is not a bad first estimate. However, taking the perturbation theory to higher orders is very demanding.
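The first-order arithmetic for a general two-electron ion is trivial to script. A sketch (the conversion factor is the standard value of the Rydberg in eV):

```python
RYD_EV = 13.6057  # 1 Rydberg in eV

def first_order_energy(Z):
    # E = -2 Z^2 Ryd (two 1s electrons) + (5 Z / 4) Ryd (first-order e-e repulsion)
    return (-2.0 * Z**2 + 1.25 * Z) * RYD_EV

print(first_order_energy(2))  # helium: about -74.8 eV (measured: about -79 eV)
```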
Now let us try the variational approach, taking as our variational test function a normalized wave function:
$$\psi_t(\mathbf{r}_1, \mathbf{r}_2) = \frac{\tilde{Z}^3}{\pi}e^{-\tilde{Z}(r_1 + r_2)}. \tag{10.11}$$
Our trial function looks like the product of two hydrogenic functions for a nuclear charge $\tilde{Z}$, but this "charge" is not a real constant charge but a variable parameter which we can choose at will in order to make use of the Rayleigh–Ritz theorem.
$$\langle\psi_t|\hat{H}\psi_t\rangle = \int d^3r_1\,d^3r_2\,\bar{\psi}_t\left[-\frac{1}{2}\nabla_1^2 - \frac{1}{2}\nabla_2^2 - \frac{\tilde{Z}}{r_1} - \frac{\tilde{Z}}{r_2} + \frac{\tilde{Z} - Z}{r_1} + \frac{\tilde{Z} - Z}{r_2} + \frac{1}{\|\mathbf{r}_1 - \mathbf{r}_2\|}\right]\psi_t. \tag{10.12}$$
Now since we are using hydrogenic functions, it can be shown [23, 26],
$$\Bigl\langle\psi_t\Bigl|\Bigl[-\frac{1}{2}\nabla_1^2 - \frac{1}{2}\nabla_2^2 - \frac{\tilde{Z}}{r_1} - \frac{\tilde{Z}}{r_2}\Bigr]\psi_t\Bigr\rangle = -\tilde{Z}^2,$$
$$\Bigl\langle\psi_t\Bigl|\Bigl[\frac{\tilde{Z} - Z}{r_1} + \frac{\tilde{Z} - Z}{r_2}\Bigr]\psi_t\Bigr\rangle = 2(\tilde{Z} - Z)\Bigl\langle\frac{1}{r}\Bigr\rangle,$$
$$\Bigl\langle\psi_t\Bigl|\frac{1}{\|\mathbf{r}_1 - \mathbf{r}_2\|}\psi_t\Bigr\rangle = \frac{5\tilde{Z}}{8}, \tag{10.13}$$
where $\langle\frac{1}{r}\rangle$ is the expectation for the ground state of a one electron hydrogenic ion with nuclear charge $\tilde{Z}$ and is equal to $\tilde{Z}$. With everything in Rydbergs we have
$$\langle\hat{H}\rangle = \Bigl[-2\tilde{Z}^2 + 4\tilde{Z}(\tilde{Z} - Z) + \frac{5}{4}\tilde{Z}\Bigr]\text{Ryd}. \tag{10.14}$$
Let us now find the value of $\tilde{Z}$ which gives the minimum value, $\tilde{Z}_{\min}$:
$$\frac{d\langle\hat{H}\rangle}{d\tilde{Z}}\Bigg|_{\tilde{Z}_{\min}} = 0,$$
$$\Rightarrow -4\tilde{Z}_{\min} + 8\tilde{Z}_{\min} - 4Z + \frac{5}{4} = 0,$$
$$\Rightarrow \tilde{Z}_{\min} = Z - \frac{5}{16}. \tag{10.15}$$
For helium $Z = 2$ and consequently $\tilde{Z}_{\min} = 27/16$, and our upper bound on the ground state energy is
$$\langle\hat{H}\rangle = -2\left(\frac{27}{16}\right)^2\text{Ryd} \approx -77.46\text{ eV}. \tag{10.16}$$
One of the advantages of the variational approach is that not only does it give an estimate of the ground state energy but also an approximate wave function. In this case we can interpret our new wave function by saying that each electron moves on average in the field of a nucleus with charge $\tilde{Z}$, rather than charge $Z$. The difference between $\tilde{Z}$ and $Z$ is a measure of the degree of screening due to the second electron.
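The minimization (10.14)–(10.16) can be verified numerically. A sketch (the scan range is my own choice; energies in Rydbergs until the final conversion):

```python
import numpy as np

RYD_EV = 13.6057

def energy(Zt, Z=2.0):
    # <H>(Z-tilde) = [-2 Zt^2 + 4 Zt (Zt - Z) + (5/4) Zt] Ryd, eq. (10.14)
    return -2.0 * Zt**2 + 4.0 * Zt * (Zt - Z) + 1.25 * Zt

Zt = np.linspace(0.5, 2.5, 200001)
E = energy(Zt)
i = np.argmin(E)
print(Zt[i], 27.0 / 16.0)   # numerical minimum vs Z - 5/16 = 27/16 for helium
print(E[i] * RYD_EV)        # about -77.5 eV
```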
So far, we have not fixed $Z$, so we can apply the same analysis to other two electron systems. The negative ion of hydrogen, H$^-$, is an interesting testing ground for exploring variational techniques for estimating atomic wavefunctions [27, 28]. If we use the test function (10.11) our analysis follows through exactly as before, but now $Z = 1$ and $\tilde{Z}_{\min} = 11/16$; we thus have an upper bound on the energy of the three particle system
$$-2\left(\frac{11}{16}\right)^2\text{Ryd} \approx -12.86\text{ eV}. \tag{10.17}$$
Now this energy is greater than the ground state energy of neutral hydrogen ($-13.6$ eV). Thus, if this were the actual energy of the H$^-$ ground state then it would be more energetically favorable to free one electron and leave the other electron in the ground state of the neutral atom. The variational method only gives us an upper bound on the energy, so it would be premature to assume that there are no bound states of H$^-$ based on this calculation alone. Bethe [29] used a trial function which depended on three parameters:
$$\psi = (1 + \alpha u + \beta t^2)e^{-\gamma s},$$
where
$$u \equiv \|\mathbf{r}_1 - \mathbf{r}_2\|, \qquad s \equiv r_1 + r_2, \qquad t \equiv r_1 - r_2, \tag{10.18}$$
and $\alpha, \beta, \gamma$ are the variational parameters. It was shown that with this wave function the resulting Rayleigh–Ritz upper bound on the energy lies below $-1$ Ryd. More and more sophisticated and complex trial functions have been used. The best current estimate of the ground state energy is close to $-14.36$ eV.
H$^-$ is of astrophysical importance [27]. The abundant presence of both hydrogen and low energy electrons in the ionized atmospheres of the Sun and other stars is ideal for the creation of H$^-$ by electron attachment. Radiation from the surface of the Sun is absorbed by photo-detachment. The continual formation and destruction of the negative ion conserves the total radiated energy but modifies the characteristics of the light emitted from the star. Indeed, since most neutral atoms and positive ions have their first absorption at 4 or 5 eV if not larger, H$^-$ is the dominant contributor to the absorption of photons down to 0.75 eV, a critical range of infrared and visible wavelengths.
Chandrasekhar [30] used a two-parameter trial function:
$$\psi_{\text{trial}}(\mathbf{r}_1, \mathbf{r}_2) = \frac{N}{4\pi}\Bigl[e^{-\tilde{Z}_1 r_1 - \tilde{Z}_2 r_2} + e^{-\tilde{Z}_1 r_2 - \tilde{Z}_2 r_1}\Bigr]. \tag{10.19}$$
Notice we have two variational parameters, $\tilde{Z}_1$ and $\tilde{Z}_2$, and that our wave function is symmetric under the interchange of $\mathbf{r}_1$ and $\mathbf{r}_2$. Using (10.19) and Rayleigh–Ritz, Chandrasekhar found $\tilde{Z}_1 = 1.039$ and $\tilde{Z}_2 = 0.283$ and an upper bound on the ground state energy of $-13.98$ eV, slightly less than the binding energy of hydrogen and not too far off the actual ground state energy. The function exhibits a "radial correlation" only. Particularly striking is the feature that $\tilde{Z}_1$ is larger than 1; we can interpret this as implying that the effect of the second electron is to force the inner one closer to the nucleus than it would be were it alone bound to the proton. The more complex wave functions, like those of Bethe, (10.18), include "angular correlation" between the directions $\hat{\mathbf{r}}_1$ and $\hat{\mathbf{r}}_2$ as well as "radial" correlation between the magnitudes $r_1$ and $r_2$; the fact that the Chandrasekhar wavefunction gives a "good" bound state energy suggests that radial correlation is the more significant of the two types.
Now using $\Psi$ as our trial function, the expectation value of the energy becomes:
$$\langle\Psi|\hat{H}\Psi\rangle \equiv \langle\hat{H}\rangle = \sum_{i=1}^{N}\int d^3r\,\bar{\psi}_{\alpha_i}(\mathbf{r})\left[-\frac{1}{2}\nabla^2 - \frac{Z}{r}\right]\psi_{\alpha_i}(\mathbf{r})$$
$$+ \sum_{j>i}\int d^3r\,d^3r'\,\frac{\bar{\psi}_{\alpha_i}(\mathbf{r})\bar{\psi}_{\alpha_j}(\mathbf{r}')\psi_{\alpha_i}(\mathbf{r})\psi_{\alpha_j}(\mathbf{r}')}{\|\mathbf{r} - \mathbf{r}'\|}. \tag{10.22}$$
Since the integral is symmetric in $\mathbf{r}$ and $\mathbf{r}'$ we have that $J_{ij} = J_{ji}$, where $J_{ij}$ denotes the double integral in (10.22); hence
$$\sum_{j>i}J_{ij} = \frac{1}{2}\sum_{j\neq i}J_{ij}, \tag{10.24}$$
$$\langle\hat{H}\rangle = \sum_{i=1}^{N}\int d^3r\,\bar{\psi}_{\alpha_i}(\mathbf{r})\left[-\frac{1}{2}\nabla^2 - \frac{Z}{r}\right]\psi_{\alpha_i}(\mathbf{r})$$
$$+ \frac{1}{2}\sum_{j\neq i}\int d^3r\,d^3r'\,\frac{\bar{\psi}_{\alpha_i}(\mathbf{r})\bar{\psi}_{\alpha_j}(\mathbf{r}')\psi_{\alpha_i}(\mathbf{r})\psi_{\alpha_j}(\mathbf{r}')}{\|\mathbf{r} - \mathbf{r}'\|}. \tag{10.25}$$
To find the least upper bound on the energy with this ansatz (10.20) we need to minimize $\langle\hat{H}\rangle$ over all possible one particle orbitals. If we keep each orbital $\psi_{\alpha_i}$ normalized then the $N$ particle wave function $\Psi$ will be normalized. To achieve this we introduce $N$ Lagrange multipliers, $\epsilon_i$. Consider the functional
$$F[\Psi] = \langle\hat{H}\rangle - \sum_i\epsilon_i\left[\int d^3r\,|\psi_{\alpha_i}(\mathbf{r})|^2 - 1\right]. \tag{10.26}$$
We want to find the wave functions $\psi_{\alpha_i}$ which will make $F$ minimal. Just as in Example 9.5, we can vary the real and imaginary parts independently. Since we have $N$ independent wavefunctions, this gives rise to two sets of $N$ complex equations; however, one set is simply the conjugate of the other and so we only need the $N$ equations:
$$\left[-\frac{1}{2}\nabla_i^2 - \frac{Z}{r_i} + \sum_{j\neq i}\int d^3r'\,\frac{\bar{\psi}_{\alpha_j}(\mathbf{r}')\psi_{\alpha_j}(\mathbf{r}')}{\|\mathbf{r} - \mathbf{r}'\|}\right]\psi_{\alpha_i}(\mathbf{r}) = \epsilon_i\psi_{\alpha_i}(\mathbf{r}), \tag{10.27}$$
Figure 10.1: The self-consistent field (SCF) loop: guess orbitals $\psi_i$; calculate $U_i$; calculate new $\psi_i$; if not converged, recalculate $U_i$ and repeat; once converged, output the physical quantities.
Equation (10.27) has the same form as the regular Schrödinger equation where we have an effective potential:
$$U_i(\mathbf{r}) = \sum_{j\neq i}\int d^3r'\,\frac{\bar{\psi}_{\alpha_j}(\mathbf{r}')\psi_{\alpha_j}(\mathbf{r}')}{\|\mathbf{r} - \mathbf{r}'\|}, \tag{10.28}$$
and our Lagrange multipliers are now the orbital energies. We interpret Ui .r/ as coming from
the electrostatic potential due to all the electrons other than i . The thing to notice is that each
˛j that appears in Ui .r/ is itself determined by one of the Hartree equations in (10.27) so
we have a set of coupled integro-differential equations. The potentials Ui both determine the
wavefunctions and are determined by the wavefunctions. The major requirement now is "self-consistency." The usual way forward is to proceed iteratively, see Figure 10.1. We write down a physically reasonable guess for our product wavefunction (10.20) and use this to calculate $U_i$, then calculate a new set of energies $\epsilon_i$ and orbitals $\psi_{\alpha_i}$, from which we can calculate a new $U_i$, and continue like this until we have reached the desired level of convergence. Now taking the inner product of (10.27) with $\psi_{\alpha_i}(\mathbf{r})$ yields
$$\epsilon_i = \int d^3r\,\bar{\psi}_{\alpha_i}(\mathbf{r})\left[-\frac{1}{2}\nabla_i^2 - \frac{Z}{r_i}\right]\psi_{\alpha_i}(\mathbf{r}) + \sum_{j\neq i}\int d^3r\,d^3r'\,\frac{|\psi_{\alpha_j}(\mathbf{r}')|^2|\psi_{\alpha_i}(\mathbf{r})|^2}{\|\mathbf{r} - \mathbf{r}'\|}. \tag{10.29}$$
Summing over $i$ we almost get the expression (10.22). Unfortunately, the inner summation in (10.22) is over $j > i$ while in (10.27) it is over $j \neq i$, which means, as pointed out above (10.24), it double counts. Correcting for the double counting means our variational estimate for the ground state is:
$$E_{\text{variational}} = \sum_i\epsilon_i - \sum_{i<j}\int d^3r\,d^3r'\,\frac{|\psi_{\alpha_j}(\mathbf{r}')|^2|\psi_{\alpha_i}(\mathbf{r})|^2}{\|\mathbf{r} - \mathbf{r}'\|}. \tag{10.30}$$
The Hartree approach has been generalized to take account of spin in the “Hartree–Fock”
theory, where the N -body test wave function of the system is taken to be antisymmetric products
of one electron orbitals, a “Slater determinant.” For more details, see [23, 24].
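The self-consistency loop of Figure 10.1 has the structure of a generic fixed-point iteration. The following is a toy sketch of that pattern only, with a scalar equation $x = \cos x$ standing in for the orbitals $\to$ potential $\to$ orbitals cycle; the function names and the mixing parameter are my own, not part of the Hartree theory:

```python
import math

def scf(update, guess, tol=1e-10, mix=0.5, max_iter=500):
    """Generic self-consistent-field loop: mix old and new values
    until the fixed point x = update(x) is reached."""
    x = guess
    for _ in range(max_iter):
        x_new = (1.0 - mix) * x + mix * update(x)   # linear mixing damps oscillations
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("SCF did not converge")

# Toy self-consistency condition x = cos(x):
x_star = scf(math.cos, guess=1.0)
print(x_star)   # the self-consistent point, where cos(x*) = x*
```

In a real Hartree calculation, `update` would solve the $N$ one-particle equations (10.27) for fresh orbitals given the current potentials $U_i$, and the convergence test would compare successive densities or total energies.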
APPENDIX A
Vector Spaces
Vector space theory lies at the heart of many, if not most, numerical methods. We commonly utilize the theorems of finite dimensional linear algebra and profit from our knowledge of self-adjoint operators in infinite dimensional Hilbert spaces. In this appendix, I have gathered together some key results and observations that are employed throughout this book.
Definition A.1 A vector space, $V$, over the complex numbers, $C$, is a set together with operations of addition and multiplication by complex numbers which satisfy the following axioms. Given any pair of vectors $x, y$ in $V$ there exists a unique vector $x + y$ in $V$ called the sum of $x$ and $y$. It is required that
$$x + (y + z) = (x + y) + z,$$
$$x + y = y + x,$$
and that there exists a zero vector $0$ and, for each $x$, an inverse $-x$, such that
$$x + 0 = x,$$
$$x + (-x) = 0.$$
Given any vector $x$ in $V$ and any $\alpha, \beta$ in $C$ there exists a vector $\alpha x$ in $V$ called the product of $x$ and $\alpha$. It is required that
• $\alpha(y + z) = \alpha y + \alpha z$,
• $(\alpha + \beta)x = \alpha x + \beta x$,
• $(\alpha\beta)x = \alpha(\beta x)$,
• $(1)x = x$.
Now every vector in $R^3$ can be written in terms of the three unit vectors $e_x, e_y, e_z$; we can generalize this idea.

Definition A.3 Suppose $V$ is a vector space and there exists a set of vectors $\{e_i\}_{i=1}^N$. This set forms a basis for $V$ if
• the set of vectors $\{e_i\}_{i=1}^N$ is linearly independent, and
• the set spans $V$, i.e., every vector in $V$ can be written as a linear combination of the $e_i$.
The $n$-tuples of real numbers, $x = (x_1, \ldots, x_n)$, form a vector space over $R$ when we define addition and multiplication by a scalar by
$$\alpha x + \beta y \equiv (\alpha x_1 + \beta y_1, \ldots, \alpha x_n + \beta y_n). \tag{A.1}$$
Our "ordinary" vectors in $R^3$ are just a special case. In $R^3$ we have a scalar product
$$x \cdot y \equiv \sum_{i=1}^{3}x_i y_i. \tag{A.2}$$
We can generalize this for an arbitrary vector space, $V$, over the complex numbers.
Definition A.4 An inner product is a map which associates two vectors in the space, $V$, with a complex number,
$$\langle\,\cdot\,|\,\cdot\,\rangle : V \times V \to \mathbb{C}, \qquad (a, b) \mapsto \langle a|b\rangle,$$
that satisfies the following four properties for all vectors $a, b, c \in V$ and all scalars $\alpha, \beta \in \mathbb{C}$:
$$\langle a|b\rangle = \overline{\langle b|a\rangle},$$
$$\langle \alpha a|\beta b\rangle = \bar{\alpha}\beta\langle a|b\rangle,$$
$$\langle a + b|c\rangle = \langle a|c\rangle + \langle b|c\rangle,$$
$$\langle a|a\rangle \ge 0 \text{ with equality iff } a = 0, \qquad (A.3)$$
where $\bar{z}$ denotes the complex conjugate of $z$. We note that $\langle\alpha a|\alpha a\rangle = |\alpha|^2\langle a|a\rangle$, which is consistent with our usual notion of the length of a vector. Consider a vector in $\mathbb{C}^n$,
$$\mathbf{r} = (z_1, \dots, z_n).$$
If
$$\mathbf{r}_1 = (z_1, \dots, z_n), \qquad \mathbf{r}_2 = (\zeta_1, \dots, \zeta_n),$$
then, if we are going to use our definition of inner product, we will require:
$$\langle \mathbf{r}_1|\mathbf{r}_2\rangle = \sum_{i=1}^{n} \bar{z}_i \zeta_i. \qquad (A.4)$$
Consequently,
$$\|\mathbf{r}_1\|^2 = \sum_{i=1}^{n} |z_i|^2.$$
For any $u, v \in V$,
$$|\langle u|v\rangle| \le \|u\|\,\|v\|.$$
Proof. Let
$$w = u + \lambda v.$$
Then
$$\langle w|w\rangle \ge 0.$$
But
$$\langle w|w\rangle = \|u\|^2 + \bar{\lambda}\langle v|u\rangle + \lambda\langle u|v\rangle + |\lambda|^2\|v\|^2. \qquad (A.5)$$
Take
$$\lambda = -\frac{\langle v|u\rangle}{\|v\|^2},$$
and using
$$\langle u|v\rangle = \overline{\langle v|u\rangle},$$
(A.5) becomes
$$\|u\|^2 - \frac{|\langle u|v\rangle|^2}{\|v\|^2} \ge 0.$$
This result is known as the Cauchy–Schwarz inequality.
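This inequality is easy to check numerically. The following sketch (not from the original text; the vectors are arbitrary illustrative choices) verifies it for complex vectors using the inner product (A.4):

```python
import numpy as np

# Two arbitrary complex vectors in C^3 (illustrative choices).
u = np.array([1 + 2j, 0.5 - 1j, 3 + 0j])
v = np.array([-2 + 1j, 1 + 1j, 0.5 - 2j])

# Inner product <u|v> = sum_i conj(u_i) v_i, as in (A.4).
inner = np.vdot(u, v)

# Norms ||u|| = sqrt(<u|u>); <u|u> is real and non-negative.
norm_u = np.sqrt(np.vdot(u, u).real)
norm_v = np.sqrt(np.vdot(v, v).real)

# Cauchy-Schwarz: |<u|v>| <= ||u|| ||v||.
assert abs(inner) <= norm_u * norm_v
```

Note that `np.vdot` conjugates its first argument, matching the convention used in (A.4).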
Definition A.7 Two vectors $a, b \in V$ are said to be orthogonal if
$$\langle a|b\rangle = 0.$$
If further
$$\langle a|a\rangle = \langle b|b\rangle = 1,$$
the vectors are said to be orthonormal.
A set of mutually orthogonal non-zero vectors $\{a_i\}_{i=1}^N$ is linearly independent.
Proof. Suppose
$$\sum_{i=1}^{N} \alpha_i a_i = 0.$$
Taking the inner product with $a_q$,
$$\Rightarrow \sum_{i=1}^{N} \alpha_i \langle a_q|a_i\rangle = 0 \Rightarrow \alpha_q \langle a_q|a_q\rangle = 0 \Rightarrow \alpha_q = 0.$$
This is enough to establish linear independence.
Lemma A.10 Let $V$ be a vector space over $\mathbb{C}$ and let $\{e_1, \dots, e_N\}$ be a basis for $V$. Let $\{w_i\}_{i=1}^M$ be a set of non-zero vectors in $V$. If $M > N$ then the set $\{w_i\}$ is linearly dependent.
Proof. Let us begin by assuming that, on the contrary, the set $\{w_i\}$ is linearly independent. Since $\{e_i\}$ forms a basis we may write
$$w_1 = \sum_{i=1}^{N} \alpha_i e_i,$$
where at least one $\alpha_i$ is non-zero (say $\alpha_1$, after relabeling), so $e_1$ may be expressed in terms of $w_1, e_2, \dots, e_N$, and $\{w_1, e_2, \dots, e_N\}$ spans $V$. Writing $w_2 = \beta_1 w_1 + \sum_{i=2}^{N} \beta_i e_i$, and since we are assuming $\{w_i\}$ is linearly independent, then at least one $\beta_i$ with $i \ge 2$ is non-zero, so we may exchange a second basis vector for $w_2$. We can keep repeating the argument until $\{w_1, \dots, w_N\}$ spans $V$; then, since $w_{N+1}$ is an element of the space, it can be written
$$w_{N+1} = \sum_{i=1}^{N} \alpha_i w_i,$$
thus the set is linearly dependent and we have a contradiction. Our original assumption is false and the result is established.
Definition A.11 A linear operator $\hat{T}$ is a map from a vector space $V$ onto itself s.t. for all $x, y \in V$ and $\alpha, \beta \in \mathbb{C}$,
$$\hat{T}(\alpha x + \beta y) = \alpha\hat{T}(x) + \beta\hat{T}(y).$$
Now, $\mathbf{r} = \mathbf{a} + \lambda\mathbf{b}$ defines the equation of a line through $\mathbf{a}$ parallel to the vector $\mathbf{b}$ [11]; then, by linearity, $\hat{T}$ maps this line to the line through $\hat{T}(\mathbf{a})$ parallel to $\hat{T}(\mathbf{b})$. Expanding the action of $\hat{T}$ on a basis vector,
$$\hat{T}(e_i) = \sum_{j=1}^{N} T_{ji} e_j,$$
where $T_{ji}$ are complex numbers; taking the inner product with $e_q$ we have
$$\langle e_q|\hat{T}(e_i)\rangle = \sum_{j=1}^{N} T_{ji}\langle e_q|e_j\rangle = \sum_{j=1}^{N} T_{ji}\,\delta_{qj} = T_{qi}. \qquad (A.8)$$
Thus, once we choose our basis, to every linear transformation we assign an $N \times N$ array of numbers which we will call a matrix:
$$\hat{T} \leftrightarrow T = \begin{pmatrix} T_{11} & T_{12} & \cdots & T_{1N} \\ T_{21} & T_{22} & \cdots & T_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ T_{N1} & T_{N2} & \cdots & T_{NN} \end{pmatrix}. \qquad (A.10)$$
Notice that for a matrix $T$ the element $T_{ij}$ lies in the $i$th row and $j$th column. Just as in $\mathbb{R}^3$, we can write $\mathbf{r} \in \mathbb{C}^N$ as an ordered $N$-tuple
$$\mathbf{r} = (x_1, x_2, \dots, x_N). \qquad (A.11)$$
Treating $\mathbf{r}$ as an $N \times 1$ column matrix, the action of the operator corresponds to matrix multiplication:
$$\hat{T}(\mathbf{r}) = T\mathbf{r}. \qquad (A.12)$$
In summary, for an $N$-dimensional vector space with a fixed orthonormal basis:
• to each vector there is a one-to-one correspondence with an $N$-tuple;
• to each linear operator there is a one-to-one correspondence with an $N \times N$ matrix;
• the vector $\hat{T}[\mathbf{r}]$, the image of $\mathbf{r}$ under the linear transformation $\hat{T}$, corresponds to the $N$-tuple obtained by multiplying the $N$-tuple of $\mathbf{r}$ by the matrix representation of the operator;
• the inner product corresponds to the multiplication of a $1 \times N$ matrix by an $N \times 1$ matrix.
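These correspondences can be illustrated with a short NumPy sketch (not part of the original text; the matrix and vectors are arbitrary illustrative values):

```python
import numpy as np

# A 2x2 matrix representing a linear operator T-hat in a fixed basis
# (arbitrary illustrative entries).
T = np.array([[1.0, 2.0],
              [3.0, 4.0]])

# A vector as an N-tuple, stored as an N x 1 column matrix.
r = np.array([[1.0],
              [-1.0]])

# Image of r under T-hat: multiply the column by the matrix.
Tr = T @ r

# Inner product of two vectors as a (1 x N) times (N x 1) product,
# conjugating the left factor as in (A.4).
s = np.array([[2.0], [5.0]])
inner = (r.conj().T @ s).item()

print(Tr.ravel())   # [-1. -1.]
print(inner)        # -3.0
```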
TRANSFORMATIONS FROM $\mathbb{R}^N$ TO $\mathbb{R}^M$
While I will reserve the term "linear operator" only for linear transformations from a vector space onto itself, we could have a transformation $\hat{A}$ from $\mathbb{R}^N$ to $\mathbb{R}^M$ with associated matrix $A$ having $M$ rows and $N$ columns, i.e., it is an $M \times N$ matrix. This matrix acts on a vector in $\mathbb{R}^N$,
an $N \times 1$ matrix, and converts it into an $M \times 1$ matrix: a vector in $\mathbb{R}^M$. The set of all such vectors is called the image of $\hat{A}$. Now the image of $\hat{A}$ is a subspace of $\mathbb{R}^M$ whose dimension is the rank of $A$. Let $\{e_i\}_{i=1}^N$ be the standard basis for $\mathbb{R}^N$. Now if $x$ is in the image of $\hat{A}$ then there exists $c \in \mathbb{R}^N$ s.t.
$$Ac = x, \qquad c = \sum_{i=1}^{N} \alpha_i e_i,$$
$$\Rightarrow Ac = \sum_{i=1}^{N} \alpha_i A e_i \Rightarrow x = \sum_{i=1}^{N} \alpha_i A e_i,$$
i.e., the image of $\hat{A}$ is spanned by the columns $Ae_i$ of $A$.
Definition A.13 The column rank of $A$ is the maximal number of linearly independent columns of $A$. The row rank of $A$ is the maximal number of linearly independent rows of $A$.
Theorem A.14 The row rank of a matrix A is equal to its column rank [32].
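The theorem is easily checked numerically; a sketch (the matrix is an arbitrary illustrative choice, not from the original text):

```python
import numpy as np

# A 3x4 matrix whose third row is the sum of the first two,
# so its rank is 2 (illustrative example).
A = np.array([[1.0, 0.0, 2.0, 1.0],
              [0.0, 1.0, 1.0, 3.0],
              [1.0, 1.0, 3.0, 4.0]])

# Column rank of A and row rank (= column rank of A transpose) coincide.
col_rank = np.linalg.matrix_rank(A)
row_rank = np.linalg.matrix_rank(A.T)
assert col_rank == row_rank == 2
```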
Let $\hat{I}$ be the linear operator acting on the finite dimensional vector space $V$ defined by:
$$\hat{I} : V \to V, \qquad \hat{I}[\mathbf{r}] = \mathbf{r} \text{ for all } \mathbf{r} \in V. \qquad (A.13)$$
Then from (A.8) the elements of the matrix representation of $\hat{I}$ are given by
$$I_{qi} = \langle e_q|\hat{I}(e_i)\rangle = \delta_{qi},$$
i.e.,
$$I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}. \qquad (A.15)$$
Lemma A.15 For any square matrix $B$,
$$BI = IB = B.$$
Definition A.16 An $N \times N$ matrix $B$ has an inverse, $B^{-1}$, if
$$BB^{-1} = I = B^{-1}B.$$
Consider the pair of linear equations
$$a_1 x + b_1 y = c_1, \qquad (A.16)$$
$$a_2 x + b_2 y = c_2. \qquad (A.17)$$
(A.16) and (A.17) are clearly equivalent to the matrix equation $T\mathbf{r} = \mathbf{c}$,
$$\begin{pmatrix} a_1 & b_1 \\ a_2 & b_2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}, \qquad (A.18)$$
and clearly we can solve the set of linear equations iff $T^{-1}$ exists. If we multiply (A.16) by $b_2$ and (A.17) by $b_1$ and then subtract we find
$$x = \frac{b_2 c_1 - b_1 c_2}{\det T}, \qquad y = \frac{a_1 c_2 - c_1 a_2}{\det T}, \qquad (A.19)$$
where we have introduced the determinant of $T$, which is given by
$$\det T = a_1 b_2 - a_2 b_1. \qquad (A.20)$$
If $\det T = 0$ then we are in trouble, but if it is non-zero then we have solved the set of linear equations. If $\det T = 0$ and
$$b_2 c_1 - b_1 c_2 = 0$$
and
$$a_1 c_2 - c_1 a_2 = 0,$$
then there is some hope, but in this case (A.16) and (A.17) are essentially the same equation and we have only one equation in two unknowns and thus an infinity of solutions. The system of linear equations (A.18) has a unique solution iff $\det T \ne 0$. These results can be generalized; a determinant can be defined for an $N \times N$ matrix as follows.
Example A.18
$$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = (-1)^{1+1} a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} + (-1)^{1+2} a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + (-1)^{1+3} a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}$$
$$= a_{11}\left[a_{22}a_{33} - a_{23}a_{32}\right] - a_{12}\left[a_{21}a_{33} - a_{23}a_{31}\right] + a_{13}\left[a_{21}a_{32} - a_{22}a_{31}\right]. \qquad (A.21)$$
It can be shown [32] that if $A$ is an $N \times N$ matrix then it has a unique inverse iff $\det A \ne 0$ iff $\operatorname{rank}(A) = N$; this last condition is equivalent to saying that its columns (rows) treated as vectors must be linearly independent. In fact, it can be shown that
$$B^{-1} = \frac{C^T}{\det B}, \qquad (A.22)$$
where $C$ is the cofactor matrix, constructed as follows. The $ij$ element of the cofactor matrix $C$ is $c_{ij}$, which is $(-1)^{i+j}$ multiplied by the determinant of the $(N-1) \times (N-1)$ matrix obtained by striking out the $i$th row and $j$th column of the original matrix $B$. $C^T$ denotes the transpose of $C$.
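Equation (A.22) can be turned directly into code. The sketch below is an illustration, not the book's implementation; `cofactor_inverse` is a hypothetical helper name, and the test matrix is an arbitrary choice with non-zero determinant:

```python
import numpy as np

def cofactor_inverse(B):
    """Invert B via B^{-1} = C^T / det(B), with C the cofactor matrix."""
    n = B.shape[0]
    C = np.zeros_like(B, dtype=float)
    for i in range(n):
        for j in range(n):
            # Minor: strike out row i and column j, take the determinant.
            minor = np.delete(np.delete(B, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T / np.linalg.det(B)

# Illustrative 3x3 matrix with non-zero determinant.
B = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

Binv = cofactor_inverse(B)
assert np.allclose(Binv, np.linalg.inv(B))
assert np.allclose(B @ Binv, np.eye(3))
```

For large $N$ this $O(N!)$-flavored construction is far slower than LU-based solvers; it is useful mainly as a check of the formula.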
Definition A.19 Let $\hat{T}$ be an operator defined on a vector space, $V$, upon which an inner product is defined. We define the adjoint of $\hat{T}$ to be a linear operator $\hat{T}^{\dagger} : V \to V$ where for all $a, b \in V$
$$\langle \hat{T}^{\dagger}a|b\rangle = \langle a|\hat{T}b\rangle.$$
Lemma A.20 If $\hat{T}$ is a linear operator acting on an $N$-dimensional vector space, $V$, with matrix representation $T$, $(T)_{ij}$, then its adjoint $\hat{T}^{\dagger}$ has the matrix representation $\bar{T}_{ji}$, i.e., we interchange rows and columns and take the complex conjugate of each element.
Proof.
$$(T^{\dagger})_{ij} = \langle e_i|\hat{T}^{\dagger}(e_j)\rangle = \overline{\langle \hat{T}^{\dagger}(e_j)|e_i\rangle} = \overline{\langle e_j|\hat{T}(e_i)\rangle} \Rightarrow (T^{\dagger})_{ij} = \bar{T}_{ji}.$$
To be clear, if we start with an operator $\hat{T}$ with a matrix representation given by (A.10), then its adjoint $\hat{T}^{\dagger}$ has a matrix representation given by
$$\hat{T}^{\dagger} \leftrightarrow T^{\dagger} = \begin{pmatrix} \bar{T}_{11} & \bar{T}_{21} & \cdots & \bar{T}_{N1} \\ \bar{T}_{12} & \bar{T}_{22} & \cdots & \bar{T}_{N2} \\ \vdots & \vdots & \ddots & \vdots \\ \bar{T}_{1N} & \bar{T}_{2N} & \cdots & \bar{T}_{NN} \end{pmatrix}. \qquad (A.23)$$
$$(AB)^{\dagger} = B^{\dagger}A^{\dagger}.$$
Proof. Looking at components,
$$(AB)_{ij} = \sum_{q=1}^{M} a_{iq} b_{qj},$$
$$\left((AB)^{\dagger}\right)_{ij} = \overline{(AB)_{ji}} = \sum_{q=1}^{M} \bar{a}_{jq}\bar{b}_{qi} = \sum_{q=1}^{M} (B^{\dagger})_{iq}(A^{\dagger})_{qj} = (B^{\dagger}A^{\dagger})_{ij}.$$
An operator is self-adjoint if $\hat{T}^{\dagger} = \hat{T}$; equivalently, its matrix is given by
$$T_{ij} = \bar{T}_{ji}.$$
Lemma A.24 If $\hat{T}$ is a self-adjoint operator defined $V \to V$, then its eigenvalues must be real.
Proof. Let $a$ be an eigenvector of $\hat{T}$ with eigenvalue $\lambda$. Note we have excluded the null vector from being an eigenvector, but we have not excluded the number zero from being an eigenvalue. Consider
$$\langle a|\hat{T}a\rangle = \langle a|\lambda a\rangle = \lambda\langle a|a\rangle = \lambda\|a\|^2,$$
$$\langle \hat{T}a|a\rangle = \langle \lambda a|a\rangle = \bar{\lambda}\langle a|a\rangle = \bar{\lambda}\|a\|^2,$$
and since $\hat{T}$ is self-adjoint these are equal,
$$\Rightarrow \lambda = \bar{\lambda}.$$
Lemma A.25 If $\{b_i\}_{i=1}^M$ are the eigenvectors of a self-adjoint operator $\hat{B}$ corresponding to distinct eigenvalues $\{\beta_i\}_{i=1}^M$, then these eigenvectors are orthogonal.
Proof. Consider $\langle b_i|\hat{B}b_j\rangle = \beta_j\langle b_i|b_j\rangle$. Since $\hat{B}$ is self-adjoint, and its eigenvalues are real, this also equals $\langle \hat{B}b_i|b_j\rangle = \beta_i\langle b_i|b_j\rangle$. Thus $(\beta_i - \beta_j)\langle b_i|b_j\rangle = 0$, and since $\beta_i \ne \beta_j$ we must have $\langle b_i|b_j\rangle = 0$.
These two lemmas, though simple to prove, turn out to be very important. We have proved the results for a general operator rather than just the matrix representation, so they will hold in any finite or infinite dimensional vector space. Suppose we are working in an $N$-dimensional space.
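Both lemmas can be observed numerically for a Hermitian matrix; a sketch (the matrix is an arbitrary illustrative choice, not from the original text):

```python
import numpy as np

# An illustrative 3x3 Hermitian (self-adjoint) matrix: T equals its
# conjugate transpose.
T = np.array([[2.0, 1 - 1j, 0.0],
              [1 + 1j, 3.0, 2j],
              [0.0, -2j, 1.0]])
assert np.allclose(T, T.conj().T)

# eigh is designed for Hermitian matrices; it returns real eigenvalues
# and orthonormal eigenvectors.
vals, vecs = np.linalg.eigh(T)

# Lemma A.24: the eigenvalues are real (returned as a float array).
assert vals.dtype.kind == 'f'

# Lemma A.25: eigenvectors of distinct eigenvalues are orthogonal; here
# the whole set is orthonormal, so V-dagger V = I.
assert np.allclose(vecs.conj().T @ vecs, np.eye(3))
```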
Theorem A.26 If $B$ is an $N \times N$ matrix its eigenvalues are the solutions of the equation
$$\det[B - \beta I] = 0.$$
We have
$$Bb = \beta b \Rightarrow [B - \beta I]b = 0.$$
If $[B - \beta I]^{-1}$ exists then, acting with it, we find that $b = 0$, i.e., no eigenvectors exist, so for us to find eigenvalues we must have
$$\det[B - \beta I] = 0.$$
This will yield a polynomial of order $N$ in $\beta$, and by the fundamental theorem of algebra this has $N$ complex roots, which is the maximal number of eigenvalues possible.
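Numerically, the roots of the characteristic polynomial agree with the eigenvalues returned by a standard solver; a sketch (the matrix is an arbitrary illustrative choice, not from the original text):

```python
import numpy as np

# Illustrative upper-triangular matrix, so the eigenvalues can be read
# off the diagonal: 2, 3, 5.
B = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])

# np.poly(B) returns the coefficients of the characteristic polynomial
# det[beta*I - B]; its roots solve det[B - beta*I] = 0.
char_poly = np.poly(B)
roots = np.roots(char_poly)

# They agree with the eigenvalues from the standard solver.
eigs = np.linalg.eigvals(B)
assert np.allclose(np.sort(roots.real), np.sort(eigs.real))
```

In practice solvers do not form the characteristic polynomial (root finding is ill-conditioned for large $N$); the comparison is purely a check of the theorem.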
Suppose $B$ is a self-adjoint operator with a maximal set of distinct eigenvalues $\{\beta_i\}_{i=1}^N$ with associated eigenvectors $\{b_i\}_{i=1}^N$. The eigenvectors may be written
$$b_i = \begin{pmatrix} b_{1i} \\ \vdots \\ b_{Ni} \end{pmatrix},$$
and, normalized, they satisfy the orthonormality condition
$$\sum_{k=1}^{N} \bar{b}_{ki} b_{kj} = \delta_{ij}. \qquad (A.25)$$
Our derivation of (A.29) depended on the eigenvectors being mutually orthogonal, the proof of which depended on the eigenvalues being distinct. It is not unusual to find a self-adjoint operator $\hat{B}$ which has more than one eigenvector, the eigenvectors being linearly independent of each other but having the same eigenvalue. Suppose the operator $\hat{B}$ has $M$ eigenvectors $\{b_i\}_1^M$ such that each of them satisfies
$$\hat{B}b_i = \beta b_i. \qquad (A.30)$$
Consider
$$a = \sum_{i=1}^{M} \alpha_i b_i,$$
where $\alpha_i$ are complex numbers; then
$$\hat{B}[a] = \sum_{i=1}^{M} \alpha_i \hat{B}[b_i] = \sum_{i=1}^{M} \alpha_i \beta b_i = \beta a. \qquad (A.31)$$
Thus, the set $\{\text{eigenvectors of } \hat{B} \text{ with eigenvalue } \beta\}$ is itself a vector space which is a subspace of our original space. We may choose a maximal set of $M$, say, linearly independent vectors, which we can orthogonalize to each other and to the other eigenfunctions of $\hat{B}$ using our Gram–Schmidt process. We can repeat this process for any other degenerate eigenvalues until we arrive at a maximal set of mutually orthogonal eigenvectors.
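The Gram–Schmidt process mentioned above can be sketched as follows (an illustration, not the book's implementation; `gram_schmidt` is a hypothetical helper name and the input vectors are arbitrary choices):

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize a list of linearly independent complex vectors."""
    basis = []
    for v in vectors:
        w = v.astype(complex).copy()
        # Subtract the projection <b|w> b onto each unit vector already
        # in the basis.
        for b in basis:
            w -= np.vdot(b, w) * b
        # Normalize the remainder.
        basis.append(w / np.sqrt(np.vdot(w, w).real))
    return basis

# Three linearly independent (but not orthogonal) vectors.
vs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]

basis = gram_schmidt(vs)

# The Gram matrix of inner products should be the identity.
G = np.array([[np.vdot(a, b) for b in basis] for a in basis])
assert np.allclose(G, np.eye(3))
```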
CHANGE OF BASIS
The matrix we have constructed in (A.27) we described as unitary.
Definition A.27 A linear operator $\hat{U}$ is said to be unitary if
$$\hat{U}^{\dagger} = \hat{U}^{-1}.$$
For such an operator the following are equivalent: (a) $\hat{U}^{\dagger}\hat{U} = I$; (b) $\hat{U}$ preserves norms, $\|\hat{U}x\| = \|x\|$ for all $x$; (c) $\hat{U}$ preserves inner products,
$$\langle \hat{U}x|\hat{U}y\rangle = \langle x|y\rangle.$$
Expanding $\|\hat{U}(x+y)\|^2$ and $\|\hat{U}(x+iy)\|^2$ shows that norms determine inner products, so (b) $\Rightarrow$ (c). Suppose (c); then
$$\langle \hat{U}x|\hat{U}y\rangle = \langle x|y\rangle = \langle \hat{U}^{\dagger}\hat{U}x|y\rangle,$$
$$\Rightarrow \langle x - \hat{U}^{\dagger}\hat{U}x|y\rangle = 0 \;\;\forall\, y \Rightarrow x = \hat{U}^{\dagger}\hat{U}x \;\;\forall\, x \Rightarrow \hat{U}^{\dagger}\hat{U} = I,$$
so (c) $\Rightarrow$ (a).
Lemma A.29 Suppose $A$ is an $N \times N$ matrix and $U$ is a unitary matrix; then if $B = UAU^{\dagger}$, $B$ and $A$ have the same eigenvalues.
Proof.
$$A\mathbf{r} = \lambda\mathbf{r} \Rightarrow U^{\dagger}BU\mathbf{r} = \lambda\mathbf{r} \Rightarrow BU\mathbf{r} = \lambda U\mathbf{r},$$
i.e., $U\mathbf{r}$ is an eigenvector of $B$ with the same eigenvalue $\lambda$.
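A quick numerical check of this lemma (a sketch, not from the original text; the matrices are arbitrary illustrative choices, with the unitary matrix built from a QR factorization):

```python
import numpy as np

# Build a unitary matrix from the QR decomposition of an illustrative
# complex matrix (Q has orthonormal columns, so Q-dagger Q = I).
M = np.array([[1 + 1j, 2.0, 0.5j],
              [0.0, 1 - 1j, 1.0],
              [2.0, 0.5, 1 + 0j]])
U, _ = np.linalg.qr(M)
assert np.allclose(U.conj().T @ U, np.eye(3))

# A matrix A and its unitary transform B = U A U-dagger share eigenvalues.
A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
B = U @ A @ U.conj().T

eigsA = np.sort(np.linalg.eigvals(A).real)
eigsB = np.sort(np.linalg.eigvals(B).real)
assert np.allclose(eigsA, eigsB)
```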
APPENDIX B
Analytic Solution to the Quantum Oscillator
Equation (B.3) is a second-order differential equation and as such will admit two linearly independent solutions, only one of which will be consistent with the boundary condition.
Asymptotically, for $x \gg 1$, we can approximate
$$\frac{d^2\psi(x)}{dx^2} \approx x^2\psi(x), \qquad (B.4)$$
which has solutions
$$\psi_{\pm}(x) = e^{\pm x^2/2}. \qquad (B.5)$$
Notice
$$\lim_{x\to\infty} \psi_-(x) \to 0, \qquad \lim_{x\to\infty} \psi_+(x) \to \infty. \qquad (B.6)$$
Clearly, we don’t want the divergent solution C .x/. Returning to the full differential
equation, (B.1), let us look to see if we can find a solution of the form:
x 2 =2
.x/ D h.x/e ; (B.7)
X1
2
D e x =2 4 .aj C2 .j C 2/.j C 1/ 2aj /x j 5 : (B.9)
j D0
Thus,
d 2 .x/
x 2 .x/ C 2E .x/ D 0;
2 dx 2 3
1
X
)4 .aj C2 .j C 2/.j C 1// 2aj j C .2E 1/aj /x j 5 D 0: (B.10)
j D0
Thus, if we know $a_0$ we can find all even coefficients, and if we have $a_1$ we can find all odd coefficients. We may write
$$a_{j+2} = \frac{2j + 1 - 2E}{(j+2)(j+1)}\,a_j.$$
Since we are interested in asymptotics we can concentrate on the larger powers, $j \gg 1$, for which
$$a_{j+2} \approx \frac{2}{j}\,a_j;$$
thus the coefficients behave like those of the power series of $e^{x^2}$, and $h(x)$ diverges like $e^{+x^2}$, giving back the solution that we didn't want. But if there exists an integer $j$ such that
$$2j + 1 = 2E, \qquad (B.14)$$
then one of the series will terminate and we can set either $a_0$ or $a_1$ equal to zero to get rid of the diverging series. The resulting finite series solution will have the correct asymptotic form. The
normalized eigenfunctions are
$$\psi_n(x) = \frac{1}{\sqrt{2^n n!\sqrt{\pi}}}\,H_n(x)\,e^{-x^2/2}, \qquad (B.15)$$
where $H_n(x)$ is a Hermite polynomial. The first few are given by
$$H_0(x) = 1, \quad H_1(x) = 2x, \quad H_2(x) = 4x^2 - 2, \quad H_3(x) = 8x^3 - 12x. \qquad (B.16)$$
Notice that $H_0(x)$, $H_2(x)$ are even functions of $x$ while $H_1(x)$, $H_3(x)$ are odd. In Figure B.1 the first four eigenfunctions are plotted. As expected, the lowest eigenfunction $\psi_0$, corresponding to $E_0$, is even with no zeros; $\psi_1$ is odd with one zero; $\psi_2$ is even with two zeros; and $\psi_3$ is odd with three zeros.
Figure B.1: The first four harmonic oscillator eigenfunctions for unit frequency, $\omega = 1$, in units where $\hbar = m = 1$: $n = 0$, dashed red; $n = 1$, dotted blue; $n = 2$, dash-dotted green; $n = 3$, solid black.
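The Hermite polynomials in (B.16) can be generated from the standard recurrence $H_{n+1}(x) = 2xH_n(x) - 2nH_{n-1}(x)$; a sketch (not from the original text; `hermite` is a hypothetical helper name):

```python
import numpy as np

def hermite(n, x):
    """Hermite polynomial H_n(x) via H_{n+1} = 2x H_n - 2n H_{n-1}."""
    h_prev, h = np.ones_like(x), 2 * x   # H_0 and H_1
    if n == 0:
        return h_prev
    for k in range(1, n):
        h_prev, h = h, 2 * x * h - 2 * k * h_prev
    return h

x = np.linspace(-1.0, 1.0, 5)

# Agreement with the explicit forms in (B.16).
assert np.allclose(hermite(2, x), 4 * x**2 - 2)
assert np.allclose(hermite(3, x), 8 * x**3 - 12 * x)

# Parity: H_n(-x) = (-1)^n H_n(x), matching the even/odd pattern above.
assert np.allclose(hermite(3, -x), -hermite(3, x))
```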
APPENDIX C
First-Order Perturbation Theory
Suppose we have a Hamiltonian $\hat{H}_0$ which has a known set of eigenvalues, $E_j$, with an associated set of orthonormal eigenvectors $\{\psi_j^0\}$. We will assume there is no degeneracy, i.e., $E_i \ne E_j$ if $i \ne j$. If $\psi$ is any state of the system then
$$\psi = \sum_j a_j \psi_j^0. \qquad (C.1)$$
Now, suppose our system is "perturbed" by a small extra potential, so we will have another Hamiltonian $\hat{H}$ which is "not too different" from $\hat{H}_0$. We can write
$$\hat{H} = \hat{H}_0 + \hat{H}_I,$$
where $\hat{H}_I = \lambda\hat{H}_1$ and $\lambda$ is "quadratically small"; in other words, $\lambda^2$ is negligibly small. For example, if we place a one-electron atom in an electric field, $\mathbf{E} = \mathcal{E}\mathbf{e}_z$, the Hamiltonians are
$$\hat{H}_0 = -\frac{\hbar^2}{2m}\nabla^2 - \frac{Z}{r}, \qquad \hat{H}_1 = z. \qquad (C.2)$$
We could reasonably assume that if $\lambda = \mathcal{E}$ is small the effect on the energy levels will also be small, i.e., we would expect that the eigenenergies of the new Hamiltonian would be very similar to the original; in other words, if $\bar{E}_i$ is a new eigenvalue then:
$$\hat{H}\psi_i = \bar{E}_i\psi_i, \qquad \bar{E}_i = E_i + \Delta E_i, \qquad |\Delta E_i| \ll 1. \qquad (C.3)$$
We also expect that the new eigenvector, $\psi_i$, will be "not too different" from the original $\psi_i^0$. To be a little more precise, we can expand:
$$\psi_i = \sum_j c_{ij}\psi_j^0, \qquad (C.4)$$
and since we require $\langle\psi_i|\psi_i\rangle = 1$ we have
$$\sum_j |c_{ij}|^2 = 1. \qquad (C.5)$$
We require
$$c_{ii} \approx 1, \qquad c_{ij} \approx 0, \quad i \ne j, \qquad (C.6)$$
i.e., we require $c_{ii}$ to differ from 1 by a quadratically small quantity. The eigenvalue equation for $\psi_i$ is
$$\hat{H}\psi_i = \bar{E}_i\psi_i,$$
$$\left(\hat{H}_0 + \hat{H}_I\right)\psi_i = (E_i + \Delta E_i)\psi_i,$$
$$\left[\hat{H}_0 + \hat{H}_I\right]\sum_j c_{ij}\psi_j^0 = (E_i + \Delta E_i)\sum_j c_{ij}\psi_j^0,$$
$$\sum_j c_{ij}\left(E_j\psi_j^0 + \hat{H}_I\psi_j^0\right) = E_i\sum_j c_{ij}\psi_j^0 + \Delta E_i\sum_j c_{ij}\psi_j^0,$$
$$\sum_{j\ne i} c_{ij}\left[E_j - E_i\right]\psi_j^0 + \hat{H}_I\sum_j c_{ij}\psi_j^0 = \Delta E_i\sum_j c_{ij}\psi_j^0. \qquad (C.7)$$
Let us now neglect the quadratically small quantities $\hat{H}_I c_{ij}$ and $\Delta E_i c_{ij}$ when $i \ne j$, and assume $c_{ii} \approx 1$; then taking the inner product with $\psi_i^0$ we have:
$$\langle\psi_i^0|\hat{H}_I\psi_i^0\rangle \approx \Delta E_i. \qquad (C.8)$$
That is, the shift $\Delta E_i$ in the level $E_i$ resulting from the addition of the perturbation $\hat{H}_I$ to the original Hamiltonian is just the expectation value of the perturbing Hamiltonian calculated with the original eigenket $\psi_i^0$. For our purposes here we will not need more than the energy shift $\Delta E_i$; however, if we take an inner product with $\psi_q^0$ on (C.7) we find that
$$c_{iq} \approx \frac{\langle\psi_q^0|\hat{H}_I\psi_i^0\rangle}{E_i - E_q}, \quad i \ne q, \qquad (C.9)$$
which together with $c_{ii} \approx 1$ gives us an estimate for $\psi_i$. Notice our original assumption of non-degeneracy means that (C.9) is well defined. If level $i$ is degenerate then our assumption that all $c_{ij}$, $i \ne j$, are quadratically small may not hold. We do not need all the eigenvalues to be non-degenerate, only the particular $E_i$ we want to study. The approximation (C.8) is correct only to first order in small quantities.
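The first-order result (C.8) is easy to test numerically with matrices: take $\hat{H}_0$ diagonal with known non-degenerate eigenvalues, perturb it, and compare the first-order shift with exact diagonalization. A sketch (not from the original text; all matrices and the coupling are arbitrary illustrative choices):

```python
import numpy as np

# H0 diagonal (known, non-degenerate eigenvalues); H1 an arbitrary
# Hermitian perturbation.
H0 = np.diag([1.0, 2.0, 4.0])
H1 = np.array([[0.5, 1.0, 0.5],
               [1.0, -0.3, 1.0],
               [0.5, 1.0, 0.2]])
lam = 1e-4   # "quadratically small" coupling

# First-order shift of the lowest level: <psi_0|lam*H1|psi_0>, with
# psi_0 the first standard basis vector (eigenvector of diagonal H0).
psi0 = np.eye(3)[:, 0]
shift_pt = lam * (psi0 @ H1 @ psi0)

# Exact shift from diagonalizing the full Hamiltonian.
shift_exact = np.linalg.eigvalsh(H0 + lam * H1)[0] - 1.0

# The two agree to first order in lam; the residual is O(lam^2).
assert abs(shift_pt - shift_exact) < 10 * lam**2
```

The leftover discrepancy is exactly the second-order correction $\lambda^2\sum_{q\ne i}|\langle\psi_q^0|\hat{H}_1\psi_i^0\rangle|^2/(E_i - E_q)$, which the bound above accommodates.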
Bibliography
[1] William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery. Numerical Recipes in Fortran 77. Cambridge University Press, Cambridge, 1992.
[2] LAPACK (Linear Algebra PACKage) is a standard software library for numerical linear algebra. www.netlib.org/lapack
[3] The Numerical Algorithms Group (NAG) library is a commercial software library. https://www.nag.co.uk/content/nag-library-fortran
[4] CASTEP is a shared source suite for calculating the electronic properties of crystalline solids, surfaces, molecules, liquids, and amorphous materials from first principles. www.castep.org
[9] Brian W. Kernighan and Dennis M. Ritchie. The C Programming Language, 2nd ed. Prentice Hall, 1988.
[10] John V. Guttag. Introduction to Computation and Programming Using Python: With Application to Understanding Data. MIT Press, 2016.
[11] Colm T. Whelan. A First Course in Mathematical Physics. Wiley-VCH, 2016.
Author’s Biography
COLM T. WHELAN
Colm T. Whelan is a Professor of Physics and an Eminent Scholar at Old Dominion University
in Norfolk, Virginia. He received his Ph.D. in Theoretical Atomic Physics from the University
of Cambridge in 1985 and was awarded an Sc.D. also from Cambridge in 2001. He is a Fellow
of both the American Physical Society and the Institute of Physics (UK). He has over 30 years
of experience in the teaching of physics.