An Introduction To Numerical Methods For The Physical Sciences


Colm T. Whelan, Old Dominion University

Synthesis Lectures on Engineering, Science, and Technology
Morgan & Claypool Publishers
Copyright © 2020 by Morgan & Claypool

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other), except for brief quotations in printed reviews, without the prior permission of the publisher.

An Introduction to Numerical Methods for the Physical Sciences
Colm T. Whelan
www.morganclaypool.com

ISBN: 9781681738727 paperback
ISBN: 9781681738734 ebook
ISBN: 9781681738741 hardcover

DOI: 10.2200/S01016ED1V01Y202006EST008

A Publication in the Morgan & Claypool Publishers series
SYNTHESIS LECTURES ON ENGINEERING, SCIENCE, AND TECHNOLOGY, Lecture #8
Series ISSN: Print 2690-0300, Electronic 2690-0327
ABSTRACT
There is only a very limited number of physical systems that can be exactly described in terms of
simple analytic functions. There are, however, a vast range of problems which are amenable to a
computational approach. This book provides a concise, self-contained introduction to the basic
numerical and analytic techniques, which form the foundations of the algorithms commonly
employed to give a quantitative description of systems of genuine physical interest. The methods
developed are applied to representative problems from classical and quantum physics.

KEYWORDS
differential equations, linear equations, polynomial approximations, variational
principles

For my colleagues and friends:


Reiner Dreizler and James Walters

Contents

Preface

1 Preliminaries
  1.1 Numbers and Errors
  1.2 Algorithms
  1.3 Programming Languages

2 Some Elementary Results
  2.1 Taylor's Series
    2.1.1 Extrema
    2.1.2 Power Series
  2.2 Numerical Differentiation and Integration
    2.2.1 Derivatives
    2.2.2 Quadrature
    2.2.3 Singular Integrands, Infinite Integrals
  2.3 Finding Roots

3 The Numerical Solution of Ordinary Differential Equations
  3.1 Trigonometric Functions
  3.2 Analytic Solutions
  3.3 Numerical Methods
    3.3.1 Euler Approximation
    3.3.2 Runge–Kutta Method
    3.3.3 Numerov Method

4 Case Study: Damped and Driven Oscillations
  4.1 Linear and Nonlinear Ordinary Differential Equations
  4.2 The Physical Pendulum
    4.2.1 Small Oscillations
    4.2.2 Differences Between Linear and Nonlinear Pendulum
  4.3 Chaos

5 Numerical Linear Algebra
  5.1 System of Linear Equations
  5.2 LU Factorization
  5.3 QR Factorization
    5.3.1 Systems of Linear Equations
    5.3.2 Eigenvalues
    5.3.3 Linear Least Squares

6 Polynomial Approximations
  6.1 Interpolation
    6.1.1 Error Estimation
  6.2 Orthogonal Polynomials
    6.2.1 Legendre Equation
  6.3 Infinite Dimensional Vector Spaces
    6.3.1 Zeros of Orthogonal Polynomials
  6.4 Quadrature
    6.4.1 Simpson Revisited
    6.4.2 Weights and Nodes
    6.4.3 Gaussian Quadrature

7 Sturm–Liouville Theory
  7.1 Eigenvalues
  7.2 Least Squares Approximation

8 Case Study: The Quantum Oscillator
  8.1 Numerical Solution of the One Dimensional Schrödinger Equation
  8.2 Numerical Solution for the Oscillator

9 Variational Principles
  9.1 Rayleigh–Ritz Theorem
  9.2 The Euler–Lagrange Equations
  9.3 Constrained Variations
  9.4 Sturm–Liouville Revisited

10 Case Study: The Ground State of Atoms
  10.1 Hydrogenic Ions
  10.2 Two Electron Ions
  10.3 The Hartree Approach

A Vector Spaces

B Analytic Solution to the Quantum Oscillator

C First-Order Perturbation Theory

Bibliography

Author's Biography

Index

Preface
There is only a limited number of physical systems that can be exactly described in terms of
simple analytic functions. There are, however, a vast range of problems that are amenable to a
computational approach. This book provides a concise introduction to the essential numerical
and analytic techniques which form the foundations of algorithms commonly employed to give
a quantitative description of systems of genuine physical interest. Rather than providing a series
of useful programming recipes, the philosophy of the book is to present the underlying theory
in a coherent way. I include some case studies illustrating the application to problems in classical
and quantum physics.

Colm T. Whelan
Norfolk, June 2020

CHAPTER 1

Preliminaries
Before diving into the study of numerical methods and their applications, it is worthwhile to
briefly think about how computers work and how we interact with them.

1.1 NUMBERS AND ERRORS


Computers store everything in bits (binary digits). A bit has only two possible values, which
are usually denoted by 0 and 1. Computers store numbers in binary as a sequence of bits and can
represent integers exactly, as long as there are enough bits, but they cannot store most real numbers
with complete accuracy. Obviously they cannot deal exactly with irrational numbers such as $\pi$ or $e$,
but they also struggle when working with very big or very small numbers and with numbers having
a large number of significant digits. Computers represent a floating-point real number, $r$, as

$r = (-1)^s \, f \, b^e,$

where $s, f, b$, and $e$ are integers; $s$ determines the sign, $f$ is the "significand" (or "coefficient"), $b$
is the base (usually 2), and $e$ is the exponent. The possible finite values that can be represented
in a given format are determined by the number of digits in the significand $f$, the base $b$, and
the number of digits in the exponent $e$. When the computer adds two floating point numbers,
it first shifts them to a common exponent and then adds; clearly only a finite number of
places can be stored, and this leads to the potential for numerical errors.
• Roundoff Errors.
A roundoff error is the difference between the result produced by a given algorithm
using exact arithmetic and the result produced by the same algorithm using finite-
precision, rounded arithmetic. Such an error can grow in significance when we have a
large number of repeated operations.
• Cancellation Errors.
These occur when we subtract one number from another that is almost equal to it.
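
For illustration, both effects can be seen in a few lines of Python (an illustrative sketch; any language would do), assuming IEEE double precision:

```python
# Roundoff: neither 0.1 nor 0.2 has an exact binary representation.
a = 0.1 + 0.2
print(a == 0.3)        # False
print(f"{a:.17g}")     # 0.30000000000000004

# Cancellation: subtracting nearly equal numbers loses leading digits.
eps = 1.0e-15
print((1.0 + eps) - 1.0)   # ~1.1102230246251565e-15, an ~11% relative error
```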

1.2 ALGORITHMS
The application of numerical methods to the description of physical problems proceeds through
the formulation and implementation of algorithms, i.e., sequences of well-defined instructions,
which a computer can interpret in an unambiguous way leading to a numerical solution of the
2 1. PRELIMINARIES
problem at hand. One of the challenges you will face in translating the equations of physics into
efficient and accurate computer code is that an expression that is mathematically correct may be
highly susceptible to numerical errors. We have to be careful to design our algorithms to avoid
such errors. I will illustrate this by two simple examples.

Example 1.1 Suppose we are looking for the roots of the quadratic

$x^2 - 2bx + c = 0.$

We know this has two roots,

$x_\pm = b \pm \sqrt{b^2 - c}.$   (1.1)

If $c \ll b^2$, then the root $x_-$ is very susceptible to cancellation errors, since $\sqrt{b^2 - c}$ is nearly equal to $b$. However, we may write

$x_- x_+ = b^2 - (b^2 - c) = c,$

$x_- = \frac{c}{x_+},$   (1.2)

which gives us an expression that is much less sensitive.
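
A quick numerical check, again as an illustrative Python sketch (the comments describe the typical double-precision behavior):

```python
import math

b, c = 1.0e4, 1.0e-4              # c << b**2, the dangerous regime
root = math.sqrt(b*b - c)

x_minus_naive  = b - root         # subtracts two nearly equal numbers
x_minus_stable = c / (b + root)   # Eq. (1.2)

print(x_minus_naive)    # suffers cancellation: only a few correct digits
print(x_minus_stable)   # ~5.0e-9, accurate to full double precision
```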

Example 1.2 Suppose we want to compute the definite integral

$I_n = \int_0^1 x^n e^x \, dx,$   (1.3)

for $n = 1, \ldots, 15$. Using integration by parts it follows that

$I_n = e - nI_{n-1}, \qquad I_0 = e - 1.$   (1.4)

In Figure 1.1 we plot the result of using the recurrence relation (1.4) for increasing $n$, compared
with the direct numerical integration of (1.3). The issue with using the recursion relation (1.4)
is that it is an "unstable algorithm," which magnifies the initial error at each step. If $I_n$ is the exact value,
$\tilde I_n$ our numerical estimate, and $\epsilon_n$ the error at each step, then the magnitude of the error is

$|\epsilon_n| = |I_n - \tilde I_n| = |(e - nI_{n-1}) - (e - n\tilde I_{n-1})| = n|I_{n-1} - \tilde I_{n-1}| = n|\epsilon_{n-1}| = n! \, |\epsilon_0|.$

This error becomes rapidly larger as $n$ increases. We note from the mean value theorem for
integrals that $I_n$ will go to zero as $n$ increases. If we now rewrite (1.4) as

$I_{n-1} = \frac{1}{n}\left[e - I_n\right]$   (1.5)

and choose $n = N$ large enough that we can approximate $I_N \approx 0$, then we can generate the smaller
$n$ values using the backward recurrence relation. Figure 1.1 also shows the estimate for $I_n$ obtained
using the backward formula (1.5). The results are almost indistinguishable from those of the
"exact" numerical integration (all the calculations shown are in single precision and the maximum
$N$ for the backward recurrence was taken to be 35).

Figure 1.1: Evaluation of the integral $I_n = \int_0^1 x^n e^x \, dx$ using: (i) direct numerical integration,
open blue squares; (ii) backward recurrence, solid red disks; and (iii) forward recurrence, green
crosses; the dashed lines are a "best fit" through the forward recurrence.
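
The two recurrences are easily compared in code; the following illustrative sketch (not the program used for Figure 1.1, which was in single precision) shows the same behavior:

```python
import math

N, M = 15, 35

# Forward recurrence I_n = e - n*I_{n-1}: the initial rounding error in
# I_0 is multiplied by n at every step, so it grows like n!.
I_fwd = [math.e - 1.0]
for n in range(1, N + 1):
    I_fwd.append(math.e - n * I_fwd[-1])

# Backward recurrence I_{n-1} = (e - I_n)/n started from I_35 ~ 0: the
# error in the starting value is divided by n at each step and dies away.
I, I_bwd = 0.0, {}
for n in range(M, 0, -1):
    I = (math.e - I) / n
    I_bwd[n - 1] = I

for n in (5, 10, 15):
    print(n, I_fwd[n], I_bwd[n])

# In double precision the forward values are already wrong by ~1e-4 at
# n = 15; in single precision (as in Figure 1.1) they become nonsense.
```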

1.3 PROGRAMMING LANGUAGES


Once you have mastered the numerical methods in this book you will need to translate the
mathematical approach into a "programming language." A programming language is a formal
language comprising a set of instructions that produce various kinds of output. There is
an embarrassment of languages available, with new ones being produced and "old" ones modified
all the time. I have found some more useful than others.

• Fortran has been in constant use in computationally intensive areas for over seven
decades; during that time it has evolved, adding extensions and refinements, while striving
to retain compatibility with prior versions. "Modern Fortran" (Fortran 90/95/03/08)
is still the dominant language for the large-scale simulation of physical systems, for
things like the astrophysical modeling of stars and galaxies, the accurate calculation
of electronic structure, hydrodynamics, molecular dynamics, and climate change. In
the field of high-performance computing (HPC), Modern Fortran also has a feature
called "coarrays" which puts parallelization features directly into the language. Coarrays
started as an extension of Fortran 95 and were incorporated into Fortran 2008 as standard.
There is a huge "legacy" of libraries, both general, e.g., [1–3], and free academic
libraries devoted to specific areas in the physical sciences, e.g., [4–8].

• C++ is more difficult to learn. It does have a good basis of libraries. On most benchmark
tests C++ and Fortran are fairly equivalent. However, the two benchmarks where
Fortran wins (n-body simulation and calculation of spectra) are the most relevant to
physics.

In the physical sciences C++ and "Modern Fortran" are still the most widely used. The popular
"Open MPI" libraries for parallelizing code were developed for these two languages. There is also

• C [9], which was designed to be compiled using a relatively straightforward compiler, to
provide low-level access to memory, and to supply language constructs that map efficiently to
machine instructions, all with minimal runtime support. Despite its low-level capabilities,
the language was designed to encourage cross-platform programming. A standards-compliant
C program written with portability in mind can be compiled for a wide
variety of computer platforms and operating systems with few changes to its source
code. The language is available on various platforms, from embedded microcontrollers
to supercomputers. It is much easier to learn to code in than C++, which is not a direct
extension of C. The name C++ dates from the time when object-oriented languages
became popular: C++ was originally implemented as a source-to-source compiler; the
source code was translated into C, and then compiled with a C compiler.

• Python [10] is easy to learn, with built-in libraries, but it is usually about 100 times
slower than Fortran or C++ on benchmark tests. A good learning language but not
currently that useful for advanced (real) physical problems.

The challenges you will face in first writing effective code have more to do with knowing how to
recast the equations of mathematical physics in a way that minimizes numerical error than with
the choice of high-level language in which you are comfortable coding.

CHAPTER 2

Some Elementary Results


In this chapter I give a brief introduction to some of the more basic ideas from numerical analysis.

2.1 TAYLOR’S SERIES


Very often in physical problems you need to find a relatively simple approximation to a com-
plex function or you need to estimate the size of a function. One of the most commonly used
techniques is to approximate a function by a polynomial around a given point.

Theorem 2.1 Let $f$ be a real function which is continuous and has continuous derivatives up to
order $n+1$; then

$f(x) = f(a) + \frac{f'(a)}{1!}(x-a) + \frac{f^{(2)}(a)}{2!}(x-a)^2 + \cdots + \frac{f^{(n)}(a)}{n!}(x-a)^n + R_n(x),$   (2.1)

where $n! = n(n-1)(n-2)\cdots 3\cdot 2\cdot 1$ and

$R_n(x) = \int_a^x \frac{f^{(n+1)}(t)}{n!}(x-t)^n \, dt.$   (2.2)

Proof. [11].

Clearly, if $R_n$ goes to zero uniformly as $n \to \infty$ then we can find an infinite series. Examples are

$e^x = 1 + x + \frac{x^2}{2!} + \cdots,$

$\sin x = x - \frac{x^3}{3!} + \cdots.$

An alternative form for the remainder term can be derived by making use of the mean value
theorem for integrals, i.e.,

$R_{n+1} = \int_a^x \frac{f^{(n+1)}(t)}{n!}(x-t)^n \, dt = f^{(n+1)}(\alpha)\,\frac{(x-a)(x-\alpha)^n}{n!},$   (2.3)

where $\alpha$ is some number, $a \le \alpha \le x$. The form (2.3) is the Cauchy form of the remainder term.
A further alternative form was derived by Lagrange:

$R_{n+1}(x) = \frac{f^{(n+1)}(\beta)}{(n+1)!}(x-a)^{n+1},$   (2.4)

with $a \le \beta \le x$. The theorem may be extended to $N$ dimensions.

Theorem 2.2 Suppose $f$ is a map from $\mathbb{R}^N$ to $\mathbb{R}$ and it is at least $k+1$ times continuously
differentiable; then

$f(\mathbf a + \mathbf h) = \sum_{j=0}^{k} \frac{(\mathbf h \cdot \nabla)^j}{j!}\, f(\mathbf a) + R(\mathbf a, k, \mathbf h),$

$R(\mathbf a, k, \mathbf h) = \frac{1}{(k+1)!}(\mathbf h \cdot \nabla)^{k+1} f(\mathbf a + \theta\mathbf h),$   (2.5)

for some $\theta \in (0,1)$.

Proof. [12].

Example 2.3 The Taylor theorem in two dimensions. Expanding about $\mathbf a = (a, b)$:

$f(x,y) = f(a,b) + (x-a)\frac{\partial f}{\partial x} + (y-b)\frac{\partial f}{\partial y} + \frac{1}{2!}\left[(x-a)^2\frac{\partial^2 f}{\partial x^2} + 2(x-a)(y-b)\frac{\partial^2 f}{\partial x\,\partial y} + (y-b)^2\frac{\partial^2 f}{\partial y^2}\right] + \cdots$   (2.6)

with all derivatives evaluated at $(a, b)$.

2.1.1 EXTREMA
Suppose $F(x)$ is a continuous function with a continuous first derivative. Suppose further that $F$
has a local maximum at some point $x_0$; hence, for some infinitesimal increment $|h|$,

$F(x_0 \pm |h|) < F(x_0),$   (2.7)

hence

$\frac{F(x_0 + |h|) - F(x_0)}{|h|} < 0, \qquad \frac{F(x_0 - |h|) - F(x_0)}{-|h|} > 0.$

We can make $|h|$ arbitrarily small; hence, when we take the limit from the left and right, and since
we have assumed the derivative is continuous, we must have

$\frac{dF(x_0)}{dx} \equiv \left.\frac{dF(x)}{dx}\right|_{x_0} = 0.$   (2.8)

Following a similar argument it is immediately obvious that if $x_0$ corresponds to a minimum,
(2.8) also holds. Now suppose $F$ is a continuous function with continuous first and second
derivatives and that there is a point $x_0$ in its domain where (2.8) holds. Then, using the Taylor
expansion (2.1), we have

$F(x) = F(x_0) + (x - x_0)\frac{dF(x_0)}{dx} + \frac12(x - x_0)^2\frac{d^2F(x_0)}{dx^2} + O((x-x_0)^3)$

$\qquad = F(x_0) + \frac12(x - x_0)^2\frac{d^2F(x_0)}{dx^2} + O((x-x_0)^3).$   (2.9)

The symbol $O(x^n)$ means terms of order $x^n$ or higher. Now since $(x - x_0)^2 > 0$, and since we
can choose $x$ arbitrarily close to $x_0$, we see at once that

$F$ has a maximum at $x_0$ if $\dfrac{d^2F(x_0)}{dx^2} < 0$;

$F$ has a minimum at $x_0$ if $\dfrac{d^2F(x_0)}{dx^2} > 0$.

Suppose now $f(x,y)$ is a differentiable function of two variables; $f(x,y) = z$ defines a two-dimensional
surface in three-space. We can think of this surface as being constructed from a
series of curves of the form $\Phi(x) = f(x, y_0) = z$ and $\Psi(y) = f(x_0, y) = z$; you might think
of lines of latitude and longitude on the earth. Clearly, a necessary and sufficient condition for a
maximum is that both $\Phi$ and $\Psi$ have maxima. Thus, the function $f(x,y)$ will have an extremum,
maximum or minimum, at $(x_0, y_0)$ if

$\frac{\partial f(x_0, y_0)}{\partial x} = \frac{\partial f(x_0, y_0)}{\partial y} = 0,$   (2.10)

but

$df = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy.$   (2.11)

Hence, the condition for an extremum is

$df(x_0, y_0) = 0.$   (2.12)
Suppose we want to find the extrema of $f(x,y)$ where $x, y$ are related by some extra condition,

$g(x, y) = 0,$   (2.13)

which we will call a constraint. Since $g$ is constant we have $dg = 0$; hence

$df = \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy = 0,$

$dg = \frac{\partial g}{\partial x}dx + \frac{\partial g}{\partial y}dy = 0,$

$\Rightarrow d(f - \lambda g) = 0,$   (2.14)

for any $\lambda$ which is independent of $x$ and $y$. Now define a new function of the three variables
$x, y, \lambda$:

$l(x, y, \lambda) = f(x, y) - \lambda g(x, y).$

The exact value of $\lambda$ will be determined later. The extrema of this function satisfy

$\frac{\partial l}{\partial x} = \frac{\partial f}{\partial x} - \lambda\frac{\partial g}{\partial x} = 0,$

$\frac{\partial l}{\partial y} = \frac{\partial f}{\partial y} - \lambda\frac{\partial g}{\partial y} = 0,$

$\frac{\partial l}{\partial \lambda} = -g(x, y) = 0.$   (2.15)

The first two equations must be satisfied by the extrema and the third is just the constraint
condition. We have three equations for three unknowns; thus the extrema of $f(x, y)$ subject
to the constraint $g = 0$ can be found by solving for the extrema of the function $l(x, y, \lambda)$.
Example 2.4 Suppose you want to find the maximum and minimum values of the function

$f(x, y) = xy,$

where

$x^2 + y^2 = 4.$

Then you could proceed by direct substitution and look for the roots of $f'(y_0) = 0$, i.e.,

$x = \pm\sqrt{4 - y^2},$
$f(y) = x(y)\,y,$

$f'(y_0) = \pm\left[-\frac{y_0^2}{\sqrt{4 - y_0^2}} + \sqrt{4 - y_0^2}\right] = 0,$

$\Rightarrow \pm(4 - 2y_0^2) = 0 \Rightarrow y_0 = \pm\sqrt{2},$

$x_0^2 + y_0^2 = 4 \Rightarrow x_0 = \pm\sqrt{2}.$

Let us solve the problem using the Lagrange multiplier method:

$l(x, y, \lambda) = xy - \lambda(x^2 + y^2 - 4),$

$\frac{\partial l}{\partial x} = y - 2\lambda x = 0,$

$\frac{\partial l}{\partial y} = x - 2\lambda y = 0,$

$\frac{\partial l}{\partial \lambda} = -(x^2 + y^2 - 4) = 0,$

$\Rightarrow \lambda = \frac{y}{2x} \Rightarrow x^2 = y^2 \Rightarrow x = \pm\sqrt{2},\ y = \pm\sqrt{2},$

as before.

One of the advantages of the Lagrange multiplier approach is that it easily generalizes to
higher dimensions.

Example 2.5 Suppose you wish to find the maximum and minimum values of

$f(x, y, z) = xyz$

on the sphere

$x^2 + y^2 + z^2 = 3.$

Let

$g(x, y, z) = x^2 + y^2 + z^2 - 3,$

so our constraint is $g(x, y, z) = 0$. Define

$l(x, y, z, \lambda) = f(x, y, z) - \lambda g(x, y, z).$

Hence, for an extremum,

$\frac{\partial l}{\partial x} = yz - 2\lambda x = 0,$

$\frac{\partial l}{\partial y} = xz - 2\lambda y = 0,$

$\frac{\partial l}{\partial z} = xy - 2\lambda z = 0,$

$\frac{\partial l}{\partial \lambda} = -(x^2 + y^2 + z^2 - 3) = 0.$

Multiplying the first three equations by $x$, $y$, and $z$, respectively,

$xyz = 2\lambda x^2 = 2\lambda y^2 = 2\lambda z^2,$

$\Rightarrow 3xyz = 2\lambda(x^2 + y^2 + z^2) = 6\lambda,$

$\Rightarrow xyz = 2\lambda = 2\lambda x^2 \Rightarrow x = \pm 1.$

In the same way $y = \pm 1$, $z = \pm 1$. We have 8 possible results for extrema, listed in Table 2.1.
So the minimum value is $-1$ and the maximum is $+1$.

Table 2.1: Possible values

 x    y    z    f(x,y,z)
 1    1    1      1
-1    1    1     -1
 1   -1    1     -1
 1    1   -1     -1
-1   -1    1      1
 1   -1   -1      1
-1    1   -1      1
-1   -1   -1     -1

2.1.2 POWER SERIES

Definition 2.6 A power series is a function of a variable $x$ defined as an infinite sum

$\sum_{n=0}^{\infty} a_n x^n,$   (2.16)

where the $a_n$ are real numbers.

Just because we write down a series of the form (2.16) it does not mean that such a thing is
well defined. It is, in essence, the limit of a sequence of partial sums, and this limit may or may not
exist. The interval of convergence is the range of values $a < x < b$ for which (2.16) converges.
Note this is an open interval; that is to say, we need to consider the end points separately. If a
function $f$ has a Taylor expansion with remainder term $R_n$ which uniformly goes to zero on
some interval $I = \{x \mid a < x < b\}$, then $f$ can be represented by a power series on this interval.
Power series are extremely useful. Here we will only state some results and refer the reader to [12]
for proofs; see also [13].

• A power series may be differentiated or integrated term by term: the resulting series
converges to the derivative or the integral of the function represented by the original
series within the same interval of convergence.

• Two power series may be added, subtracted, or multiplied by a constant, and the result will
converge at least within the same interval of convergence, i.e., suppose

$s_1(x) = \sum_{n=0}^{\infty} a_n x^n, \qquad s_2(x) = \sum_{m=0}^{\infty} b_m x^m$

are both convergent within the interval $I$ and $\alpha, \beta$ are numbers; then

$s_3(x) = \sum_{n=0}^{\infty} (\alpha a_n + \beta b_n) x^n$

is convergent within the interval $I$ and

$s_3(x) = \alpha s_1(x) + \beta s_2(x).$

• The power series of a function is unique, i.e., if

$f(x) = \sum_{n=0}^{\infty} a_n x^n = \sum_{m=0}^{\infty} b_m x^m,$

then $a_n = b_n$ for all $n$.

Figure 2.1: Values of $f$ on an equally spaced lattice $x_n = nh$, $n = 0, \pm1, \pm2, \pm3$. Dashed lines show the linear interpolation.

2.2 NUMERICAL DIFFERENTIATION AND INTEGRATION

2.2.1 DERIVATIVES
Suppose $f$ is a known, multiply differentiable function defined on some real interval $[a, b]$. We
want to find $f'(0)$, the derivative at $x = 0$. Let us suppose we know $f$ on an equally spaced
lattice of $x$ values:

$f_n = f(x_n), \qquad x_n = nh, \quad n = 0, \pm1, \pm2, \ldots$

Using Taylor's theorem,

$f(x) = f_0 + xf' + \frac{x^2}{2!}f'' + \frac{x^3}{3!}f''' + \cdots,$   (2.17)

where all derivatives are evaluated at $x = 0$. It follows that

$f_{\pm1} = f_0 \pm hf' + \frac{h^2}{2}f'' \pm \frac{h^3}{3!}f''' + O(h^4),$

$f_{\pm2} = f_0 \pm 2hf' + 2h^2 f'' \pm \frac{4h^3}{3}f''' + O(h^4).$   (2.18)

Subtracting $f_{-1}$ from $f_1$ we find

$f' = \frac{f_1 - f_{-1}}{2h} - \frac{h^2}{6}f''' + O(h^4).$   (2.19)

The term involving $f'''$ is the dominant error associated with the finite difference approximation
that retains only the first term,

$f' \approx \frac{f_1 - f_{-1}}{2h}.$   (2.20)

This "3-point formula" will be exact if $f$ is a second degree polynomial. Note also that the
symmetric difference about $x = 0$ is used, as it is more accurate by one order in $h$ than the forward
or backward difference formulae

$f' \approx \frac{f_1 - f_0}{h} + O(h), \qquad f' \approx \frac{f_0 - f_{-1}}{h} + O(h).$   (2.21)

These "2-point" formulae will be exact if $f$ is a linear function on $[0, \pm h]$.

It is possible to improve on the 3-point formula (2.20) by relating $f'$ to lattice points further
removed. For example, the "5-point formula"

$f' \approx \frac{1}{12h}\left[f_{-2} - 8f_{-1} + 8f_1 - f_2\right] + O(h^4)$   (2.22)

cancels all derivatives in the Taylor series through fourth order. This formula will be exact if $f$
is a fourth-degree polynomial over the 5-point interval $[-2h, 2h]$.

Formulae for higher derivatives can be constructed by taking appropriate combinations
of (2.18). For example,

$f_1 - 2f_0 + f_{-1} = h^2 f'' + O(h^4),$   (2.23)

which leads to the approximation

$f'' \approx \frac{f_1 - 2f_0 + f_{-1}}{h^2}.$   (2.24)

Numerical differentiation can be quite tricky to program since, by its very nature, it involves
subtracting two very similar numbers.
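
As an illustration (a Python sketch, not prescribed by the text), here are the 2-, 3-, and 5-point formulae (2.20)–(2.22) applied to $f(x) = e^x$ at $x = 0$, where the exact derivative is 1:

```python
import math

f, h = math.exp, 0.1

d2 = (f(h) - f(0.0)) / h                                 # 2-point, O(h)
d3 = (f(h) - f(-h)) / (2*h)                              # 3-point, O(h^2)
d5 = (f(-2*h) - 8*f(-h) + 8*f(h) - f(2*h)) / (12*h)      # 5-point, O(h^4)

for name, d in (("2-point", d2), ("3-point", d3), ("5-point", d5)):
    print(f"{name}: error = {abs(d - 1.0):.2e}")

# Shrinking h reduces the truncation error, but for very small h the
# cancellation in f(h) - f(-h) dominates and the accuracy degrades again.
```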

2.2.2 QUADRATURE
In quadrature we are interested in calculating the definite integral of a function $f$ between two
limits $a < b$. We divide the range into steps of length

$h = \frac{b - a}{N},$

where $N$ is an integer. It is then sufficient to derive a formula for the integral from $-h$ to $h$,
since this formula can then be applied successively:

$\int_a^b f(x)\,dx = \int_a^{a+2h} f(x)\,dx + \int_{a+2h}^{a+4h} f(x)\,dx + \cdots + \int_{b-2h}^{b} f(x)\,dx.$   (2.25)
Figure 2.2: Using the trapezoidal rule to integrate $\int_{x_n}^{x_{n+1}} f(x)\,dx$ corresponds to approximating
the integral by the area of the right trapezoid, $h\,\frac{f(x_{n+1}) + f(x_n)}{2}$, which is the sum of the rectangle
$h \times f(x_n)$ and the right triangle of area $\frac12(f(x_{n+1}) - f(x_n))\,h$, in agreement with (2.26).

The idea is to approximate $f$ on each interval by a function that is integrable; this approach
leads to a group of formulae that are said to be of "Newton–Cotes" type. Let us first consider
$\int_{-h}^{h} f(x)\,dx$.

If $f(x)$ is a linear function then

$\int_{-h}^{h} f(x)\,dx = \frac{h}{2}(f_{-1} + 2f_0 + f_1)$   (2.26)

is exact on $[-h, h]$. Thus, from Taylor's theorem it follows that

$\int_{-h}^{h} f(x)\,dx = \frac{h}{2}(f_{-1} + 2f_0 + f_1) + O(h^3).$   (2.27)

The approximation (2.26) is known as the "trapezoidal rule."

Lemma 2.7 For polynomials of order 3 or less,

$\int_{-h}^{h} f(x)\,dx = \frac{h}{3}\left[f(+h) + 4f(0) + f(-h)\right].$   (2.28)

Proof. It suffices to check the monomials:

$\int_{-h}^{h} 1\,dx = x\Big|_{-h}^{h} = 2h = \frac h3\left[1 + 4 + 1\right] = 2h,$

$\int_{-h}^{h} x\,dx = \frac{x^2}{2}\Big|_{-h}^{h} = 0 = \frac h3\left[h + 0 - h\right] = 0,$

$\int_{-h}^{h} x^2\,dx = \frac{x^3}{3}\Big|_{-h}^{h} = \frac{2h^3}{3} = \frac h3\left[h^2 + 0 + h^2\right] = \frac{2h^3}{3},$

$\int_{-h}^{h} x^3\,dx = \frac{x^4}{4}\Big|_{-h}^{h} = 0 = \frac h3\left[h^3 + 0 - h^3\right] = 0.$
Suppose we want to find the integral

$\int_a^b f(x)\,dx.$

We choose to work with an even number of lattice spacings,

$N = \frac{b - a}{h}.$

We can write the integral as

$\int_a^b f(x)\,dx = \int_a^{a+2h} f(x)\,dx + \int_{a+2h}^{a+4h} f(x)\,dx + \cdots + \int_{b-2h}^{b} f(x)\,dx.$   (2.29)

Consider the first and second integrals on the right:

$\int_a^{a+2h} f(x)\,dx = \int_{-h}^{h} f(u)\,du, \qquad u = x - a - h,$

$\int_{a+2h}^{a+4h} f(x)\,dx = \int_{-h}^{h} f(z)\,dz, \qquad z = x - a - 3h.$   (2.30)

Now applying Lemma 2.7 to both integrals we have

$\int_a^{a+2h} f(x)\,dx \approx \frac h3\left[f(u = -h) + 4f(u = 0) + f(u = h)\right] = \frac h3\left[f(a) + 4f(a+h) + f(a+2h)\right],$

$\int_{a+2h}^{a+4h} f(x)\,dx \approx \frac h3\left[f(z = -h) + 4f(z = 0) + f(z = h)\right] = \frac h3\left[f(a+2h) + 4f(a+3h) + f(a+4h)\right].$

Continuing in this way we eventually get

$\int_a^b f(x)\,dx = \frac h3\left[f(a) + 4f(a+h) + 2f(a+2h) + 4f(a+3h) + \cdots + 4f(b-h) + f(b)\right].$   (2.31)

This is "Simpson's rule," which is accurate to two orders better than the trapezoidal rule.
Higher order quadrature formulae can be derived by retaining more terms in the Taylor
expansion used to interpolate $f$ and using better finite difference approximations for the derivatives.
The generalizations of Simpson's rule using cubic and quartic polynomials are as follows.

Simpson's 3/8 rule:

$\int_{x_0}^{x_3} f(x)\,dx = \frac{3h}{8}\left[f_0 + 3f_1 + 3f_2 + f_3\right] + O(h^5).$   (2.32)

Boole's rule:

$\int_{x_0}^{x_4} f(x)\,dx = \frac{2h}{45}\left[7f_0 + 32f_1 + 12f_2 + 32f_3 + 7f_4\right] + O(h^7).$   (2.33)

Note that for this method to be applicable $N$ must be a multiple of 4.¹

Although one might think that formulae based on interpolation using polynomials of a
very high degree would be even more accurate, this is not necessarily the case, since:

(i) such polynomials tend to oscillate violently and thus lead to inaccurate interpolation; and

(ii) the coefficients of the values of $f$ can have both positive and negative signs in higher
order, making cancellation errors a potential problem.

Later on we will derive quadrature formulae which are accurate to a higher order, but to
do this we will have to give up on having equally spaced abscissae.

¹ Due to a misprint in [14] this approximation was incorrectly written as "Bode's rule"; this error is frequently reproduced in the literature.
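
An illustrative sketch of the composite trapezoidal rule (2.26) and Simpson's rule (2.31), tested on $\int_0^1 e^x\,dx = e - 1$:

```python
import math

def trapezoid(f, a, b, N):
    h = (b - a) / N
    return h * (0.5*(f(a) + f(b)) + sum(f(a + i*h) for i in range(1, N)))

def simpson(f, a, b, N):                 # N must be even
    h = (b - a) / N
    s = f(a) + f(b)
    s += 4 * sum(f(a + i*h) for i in range(1, N, 2))
    s += 2 * sum(f(a + i*h) for i in range(2, N, 2))
    return h * s / 3

exact = math.e - 1.0
for N in (4, 8, 16):
    print(N,
          abs(trapezoid(math.exp, 0.0, 1.0, N) - exact),   # falls like h**2
          abs(simpson(math.exp, 0.0, 1.0, N) - exact))     # falls like h**4
```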

2.2.3 SINGULAR INTEGRANDS, INFINITE INTEGRALS


We need to be careful if our integrand is singular, even if the actual integral is well defined.
Sometimes the best approach is just to make a change of variable.

Example 2.8

$I_1 = \int_0^1 x^{-1/3} g(x)\,dx.$   (2.34)

In this case put

$t^3 = x;$

then $dx = 3t^2\,dt$ and

$I_1 = 3\int_0^1 t^{-1}\, t^2\, g(t^3)\,dt = 3\int_0^1 t\,g(t^3)\,dt.$   (2.35)

In some cases it is better to isolate the singularity; for example,

$\int_0^1 \frac{\sin x}{x}\,dx = \int_0^h \frac{\sin x}{x}\,dx + \int_h^1 \frac{\sin x}{x}\,dx = \int_0^h \frac{x - \frac{x^3}{6} + O(x^5)}{x}\,dx + \int_h^1 \frac{\sin x}{x}\,dx.$   (2.36)

The Simpson's rule calculation of

$\int_1^\infty x^{-2} g(x)\,dx$

is not defined. However, changing variables,

$x = t^{-1},$

gives

$\int_0^1 g(t^{-1})\,dt,$

which can be evaluated with any of the formulae we discussed.
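
For instance, an illustrative sketch of Example 2.8 with the hypothetical choice $g = \cos$, so that the transformed integrand $3t\cos(t^3)$ is smooth at $t = 0$:

```python
import math

def simpson(f, a, b, N):                 # N must be even
    h = (b - a) / N
    s = f(a) + f(b)
    s += 4 * sum(f(a + i*h) for i in range(1, N, 2))
    s += 2 * sum(f(a + i*h) for i in range(2, N, 2))
    return h * s / 3

# I = int_0^1 x**(-1/3) cos(x) dx -> 3 * int_0^1 t*cos(t**3) dt, Eq. (2.35)
print(simpson(lambda t: 3.0*t*math.cos(t**3), 0.0, 1.0, 64))   # ~1.32

# Applying Simpson directly to x**(-1/3)*cos(x) would require the
# integrand at x = 0, where it diverges.
```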

2.3 FINDING ROOTS


We will frequently need to find the roots of a function $f$; that is, we will want to find the values
$x_0$ s.t.

$f(x_0) = 0.$

Given a continuous function $f$ defined on $[a_0, b_0]$, suppose

$f(a_0)f(b_0) < 0;$

then $f$ must have at least one root in $(a_0, b_0)$. The trick here is to repeatedly bisect, all the time
decreasing the size of the interval. Define

$c = \frac12(a_0 + b_0).$

If $f(c)$ has the same sign as $f(b_0)$ then there must be a root in $[a_0, c]$; otherwise there is a root
in $[c, b_0]$. Either way we have halved the size of the interval containing the root. The rule we
adopt, starting with $N = 0$, is:

if $f(b_N)f(c) > 0$ then take $a_{N+1} = a_N$, $b_{N+1} = c$;

if $f(b_N)f(c) < 0$ then take $a_{N+1} = c$, $b_{N+1} = b_N$;

$c = \frac12(a_{N+1} + b_{N+1}).$

After $N$ iterations the position of the root is no more than

$\frac12|b_N - a_N|$

from the mid-point of the interval $[a_N, b_N]$. But

$\frac12|b_N - a_N| = \frac{1}{2^{N+1}}|b_0 - a_0|.$

This means that after $N$ iterations the root must lie in an interval that is no bigger than

$\epsilon = \frac{1}{2^{N+1}}|b_0 - a_0|.$

We can choose $\epsilon$ to be the desired tolerance; then we know that we need at most

$N = \frac{\ln|b_0 - a_0| - \ln\epsilon}{\ln 2} - 1$

iterations to converge to within this tolerance.

As an example, I looked for the solution to

$f(x) = x^2 - 5 = 0,$   (2.37)

with a tolerance of $10^{-6}$, using $x = 1$ as my initial guess and an initial step size of 0.5, and my
code converged to the answer, correct to 6 places of decimals, after 34 iterations. You need to be
careful using this method, since if the initial step size is too large it is possible to "step over" the
desired root, especially when $f$ has several roots.
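
A minimal illustrative sketch of bisection for $x^2 - 5 = 0$ (not the code used for the 34-iteration result above, which used a guess-and-step bracketing search):

```python
def bisect(f, a, b, tol=1.0e-6):
    """Bisection on [a, b], assuming f(a)*f(b) < 0."""
    fa = f(a)
    while b - a > tol:
        c = 0.5 * (a + b)
        fc = f(c)
        if fa * fc <= 0.0:     # the root lies in [a, c]
            b = c
        else:                  # the root lies in [c, b]
            a, fa = c, fc
    return 0.5 * (a + b)

print(bisect(lambda x: x*x - 5.0, 1.0, 3.0))   # ~2.236068 = sqrt(5)
```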
Suppose the actual root is at $x_0$ and we guess $x_1$. If it is a good guess, $|x_0 - x_1|$ will be small.
Using Taylor's theorem,

$f(x_0) = 0 = f(x_1) + f'(x_1)(x_0 - x_1) + O((x_0 - x_1)^2)$

$\Rightarrow x_0 \approx x_1 - \frac{f(x_1)}{f'(x_1)}.$

Thus, a better guess for the root will be

$x_2 = x_1 - \frac{f(x_1)}{f'(x_1)}.$

Repeating, we get

$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)}.$   (2.38)

The application of (2.38) defines the Newton–Raphson algorithm. I used it to look for the root
of $x^2 - 5 = 0$ with a tolerance of $10^{-6}$. I was able to achieve convergence after only 10 iterations.

The "secant method" is useful if finding the derivative is a problem. We approximate

$f'(x_i) \approx \frac{f(x_i) - f(x_{i-1})}{x_i - x_{i-1}},$

and rewrite (2.38) as

$x_{i+1} = x_i - f(x_i)\,\frac{x_i - x_{i-1}}{f(x_i) - f(x_{i-1})}.$   (2.39)

Provided that the initial guesses are reasonably close to the true root, convergence to the
exact answer is almost as rapid as for the Newton–Raphson algorithm. The Newton–Raphson and
secant methods can fail to converge, or worse, converge to the wrong answer, if there are multiple
roots close together or if there is a point $\tilde x$ near $x_0$ where $f'(\tilde x) = 0$.
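
Illustrative sketches of both iterations for the same test problem:

```python
def newton(f, fprime, x, tol=1.0e-6, maxit=50):
    """Newton-Raphson iteration, Eq. (2.38)."""
    for _ in range(maxit):
        dx = f(x) / fprime(x)
        x -= dx
        if abs(dx) < tol:
            return x
    raise RuntimeError("no convergence")

def secant(f, x0, x1, tol=1.0e-6, maxit=50):
    """Secant iteration, Eq. (2.39); no derivative needed."""
    for _ in range(maxit):
        x2 = x1 - f(x1) * (x1 - x0) / (f(x1) - f(x0))
        x0, x1 = x1, x2
        if abs(x1 - x0) < tol:
            return x1
    raise RuntimeError("no convergence")

f = lambda x: x*x - 5.0
print(newton(f, lambda x: 2.0*x, 1.0))   # converges in a handful of steps
print(secant(f, 1.0, 2.0))
```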

CHAPTER 3

The Numerical Solution of Ordinary Differential Equations
It is sometimes, misleadingly, said that "physicists can't solve the few body problem," when what
is really true is that there is "no general analytical solution to the few body problem given by simple
algebraic expressions." Sometimes it is nice to see a solution to an important problem written down
in terms of the elementary functions known to the ancients. However, it is often enough to have
a well-posed differential equation to find all that we could possibly want to know about a physical
system. I will illustrate this by considering the most well known of all analytic functions [11, 12].

3.1 TRIGONOMETRIC FUNCTIONS

Lemma 3.1 Let $c(x), s(x)$ be continuous differentiable functions such that

$s'(x) = c(x),$
$c'(x) = -s(x),$
$s(0) = 0,$
$c(0) = 1.$   (3.1)

Then

$c^2(x) + s^2(x) = 1.$   (3.2)

Proof. Let

$F(x) = c^2(x) + s^2(x);$

then

$F'(x) = 2c(x)c'(x) + 2s(x)s'(x) = -2c(x)s(x) + 2s(x)c(x) = 0;$

thus $F(x)$ must be a constant; substituting the values at $x = 0$ we have the result.
Lemma 3.2 If we have two sets of functions $c(x), s(x)$ and $f(x), g(x)$ s.t.

$c'(x) = -s(x), \qquad g'(x) = -f(x),$
$s'(x) = c(x), \qquad f'(x) = g(x),$
$c(0) = 1, \qquad g(0) = 1,$
$s(0) = 0, \qquad f(0) = 0,$

then $f(x) = s(x)$, $c(x) = g(x)$ for all $x$.

Proof. We know that both of the pairs $(c, s)$, $(f, g)$ must satisfy the relation (3.2):

$c^2(x) + s^2(x) = 1,$
$f^2(x) + g^2(x) = 1.$

The functions $F_1(x) = f(x)c(x) - s(x)g(x)$ and $F_2(x) = f(x)s(x) + c(x)g(x)$ are s.t.

$\frac{dF_1(x)}{dx} = \frac{dF_2(x)}{dx} = 0.$

Hence,

$a = f(x)c(x) - s(x)g(x),$
$b = f(x)s(x) + c(x)g(x),$

where $a, b$ are constants; putting in the values at $x = 0$ yields

$0 = f(x)c(x) - s(x)g(x),$
$1 = f(x)s(x) + c(x)g(x);$

hence,

$0 = f(x)c^2(x) - c(x)s(x)g(x),$
$s(x) = f(x)s^2(x) + s(x)c(x)g(x);$

adding the last two lines yields

$s(x) = f(x)\left[c^2(x) + s^2(x)\right] = f(x).$

Hence,

$s'(x) = f'(x);$

therefore,

$c(x) = g(x).$
3.2. ANALYTIC SOLUTIONS 23
Clearly, the functions $c(x), s(x)$ have all the properties of the $\sin(x)$ and $\cos(x)$ of
trigonometry. The rest of the properties that we know and love can be derived from the
results above. Further, as we will see slightly later, we can use relatively straightforward numerical
methods to solve equations of the form (3.1). This may seem an odd way to discuss the sin
and cos functions, but there is an important lesson here, in that perfectly good functions can
be defined simply as the solution of differential equations. The Schrödinger equation for an $N$
electron neutral atom in atomic units,

$\left[\sum_{j=1}^{N}\left(-\frac12\nabla_j^2 - \frac{Z}{r_j}\right) + \frac12\sum_{j\neq k}^{N}\frac{1}{|\mathbf r_j - \mathbf r_k|} - E\right]\Psi(\mathbf r_1, \mathbf r_2, \ldots, \mathbf r_N) = 0,$

is just another differential equation, albeit a more complicated one, which turns out to have a
unique solution for certain values of $E$.

3.2 ANALYTIC SOLUTIONS


Suppose we are presented with the O.D.E.

$\frac{dy(t)}{dt} + p(t)y(t) = g(t).$   (3.3)

Then there is a convenient trick. Define, with $a$ a constant,

$r(t) = \exp\left(\int_a^t p(x)\,dx\right)$

$\Rightarrow \frac{dr}{dt} = \frac{de^u}{du}\frac{du}{dt}, \quad \text{where } u(t) = \int_a^t p(x)\,dx,$

$\frac{dr}{dt} = e^u\,p(t) = e^{\int_a^t p(x)\,dx}\,p(t).$   (3.4)

Returning to (3.3) and multiplying by $e^{\int_a^t p(x)\,dx}$ we have

$\frac{d\left[r(t)y(t)\right]}{dt} = g(t)\,r(t).$   (3.5)

Example 3.3 Suppose we want to solve

$t\frac{dy(t)}{dt} + 2y = 4t^2$   (3.6)

subject to

$y(1) = 2.$

Then we can divide (3.6) by $t$ to put it in the form (3.3):

$\frac{dy(t)}{dt} + \frac2t y = 4t,$

$r(t) = \exp\left(\int_a^t \frac2x\,dx\right)$

$\Rightarrow r(t) = \exp(2\ln t - 2\ln a).$   (3.7)

Without loss of generality you can take the arbitrary constant $a$ to be unity, and then

$r(t) = \exp(2\ln t) = \exp(\ln t^2) = t^2.$   (3.8)

If we multiply (3.6) by $t$ we have

$t^2\frac{dy}{dt} + 2ty = 4t^3$

$\Rightarrow \frac{d(t^2 y)}{dt} = 4t^3$

$\Rightarrow t^2 y(t) = t^4 + c,$   (3.9)

where $c$ is a constant. Now substitute the initial condition $y(1) = 2$, and it follows that $c = 1$.
Sometimes we can use direct integration. Consider

$x\frac{dy}{dx} = y\ln y, \qquad y(2) = e.$   (3.10)

We can rewrite

$\frac{dy}{y\ln y} = \frac{dx}{x},$

$\int\frac{dy}{y\ln y} = \int\frac{dx}{x} = \ln x + K.$   (3.11)

Put

$u = \ln y, \qquad du = \frac{dy}{y}$

$\Rightarrow \int\frac{dy}{y\ln y} = \int\frac{du}{u} = \ln u = \ln(\ln y) = \ln x + K.$   (3.12)

Initial conditions:

$y(2) = e \Rightarrow \ln(\ln e) = \ln 2 + K,$

$\ln 1 = \ln 2 + K,$

$K = -\ln 2.$   (3.13)

So from (3.12),

$\ln(\ln y) = \ln x - \ln 2 = \ln\left(\frac x2\right)$

$\Rightarrow \ln y = \frac x2 \Rightarrow y = e^{x/2}.$   (3.14)
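
Both worked examples are easy to check symbolically. An illustrative sketch using sympy (an assumed tool, not one used in the text), provided dsolve accepts the initial conditions in this form:

```python
import sympy as sp

t = sp.symbols('t', positive=True)
y = sp.Function('y')

# Example 3.3: t*y' + 2*y = 4*t**2, y(1) = 2  ->  y = t**2 + 1/t**2
print(sp.dsolve(sp.Eq(t*y(t).diff(t) + 2*y(t), 4*t**2), y(t), ics={y(1): 2}))

# Eq. (3.10): t*y' = y*ln(y), y(2) = e  ->  y = exp(t/2)
print(sp.dsolve(sp.Eq(t*y(t).diff(t), y(t)*sp.log(y(t))), y(t), ics={y(2): sp.E}))
```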

3.3 NUMERICAL METHODS


3.3.1 EULER APPROXIMATION
Consider the ordinary first-order differential equation

$\frac{dx(t)}{dt} = f(t, x),$   (3.15)

with initial condition $x(t_0) = x_0$. Now, from Taylor's theorem,

$x(t_0 + h) = x(t_0) + h\left.\frac{dx}{dt}\right|_{t=t_0} + O(h^2) \approx x_0 + hf(t_0, x_0).$   (3.16)

Equation (3.16) is known as the Euler solution. We can use it to propagate our solution
using a series of increments in $h$; at each step we introduce a potential error of order $h^2$.

Example 3.4 Let us apply the Euler method to (3.10),

$x\frac{dy}{dx} = y\ln y, \qquad y(2) = e,$

whose solution we know to be $y = e^{x/2}$. Let us take $h = 0.1$:

$y(x + h) \approx y(x) + hf(x, y),$

$x_0 = 2, \qquad y_0 = e, \qquad f(x, y) = \frac{y\ln y}{x}.$
Figure 3.1: Euler solution, crosses, compared with the analytic result, solid line.

Step 1.

$y_1 = y(2.1) = y_0 + hf(x_0, y_0) = e + 0.1 \times \frac{e\ln e}{2} = 2.85419583.$

Step 2.

$y_2 = y(2.2) = y_1 + hf(x_1, y_1) = 2.85419583 + h \times f(2.1,\ 2.85419583) = 2.99674129.$

It is straightforward to write a computer program to propagate the solution further. In


Figure 3.1, I show a comparison between the results of my numerical code and the analytic
solution.
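
An illustrative sketch of such a program (not the author's code, which is not listed):

```python
import math

def f(x, y):                        # right-hand side of dy/dx = y*ln(y)/x
    return y * math.log(y) / x

x, y, h = 2.0, math.e, 0.1
for _ in range(12):                 # march from x = 2 toward x = 3.2
    y += h * f(x, y)                # Euler step, Eq. (3.16)
    x += h
    print(f"x = {x:.1f}  Euler = {y:.6f}  exact = {math.exp(x/2):.6f}")

# The global error of the Euler method is O(h): halving h roughly
# halves the error at a fixed x.
```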

We can generalize to higher order differential equations.

Example 3.5 Consider the classical harmonic oscillator problem

$\frac{d^2x}{dt^2} = -x, \qquad \frac{dx(0)}{dt} = 0, \qquad x(0) = 1.$   (3.17)

We can convert this into a set of two first-order equations:

$\frac{dv}{dt} = -x, \qquad \frac{dx}{dt} = v, \qquad v(0) = 0, \qquad x(0) = 1.$   (3.18)

We can apply Taylor's theorem to both $x(t)$ and $v(t)$:

$v(t + h) = v(t) + h\frac{dv(t)}{dt} = v(t) - hx(t),$

$x(t + h) = x(t) + h\frac{dx(t)}{dt} = x(t) + hv(t);$   (3.19)

incrementing, we get

$\mathbf y_{N+1} = \begin{pmatrix} x \\ v \end{pmatrix}_{N+1} = \begin{pmatrix} x \\ v \end{pmatrix}_N + h\begin{pmatrix} dx/dt \\ dv/dt \end{pmatrix}_N = \mathbf y_N + h\begin{pmatrix} v \\ -x \end{pmatrix}_N.$   (3.20)

To code this up we can divide the interval $[0, \pi]$ into equal steps of length $h$, create two arrays
$x(0{:}100)$, $v(0{:}100)$, and then iterate. In Figure 3.2, I show the Euler numerical solution from
(3.20) plotted against $\cos t$, the analytic solution to (3.17).

Figure 3.2: Comparison of the analytic solution to (3.17), solid line, and the numerical solution
found using the Euler approach, crosses.

3.3.2 RUNGE–KUTTA METHOD
The second order Runge–Kutta method is most simply derived by applying the trapezoidal rule to
integrating

$\frac{dy}{dt} = f(y, t)$   (3.21)

over the interval $[t_n, t_{n+1}]$:

$y_{n+1} = y_n + \int_{t_n}^{t_{n+1}} f(y, t)\,dt \approx y_n + \frac h2\left[f(y_n, t_n) + f(\bar y_{n+1}, t_{n+1})\right].$
2
We approximate $\bar y_{n+1}$ using the Euler method:

$y_{n+1} = y_n + \frac h2\left[f(y_n, t_n) + f(y_n + hf(y_n, t_n),\ t_{n+1})\right].$

It is convenient to define

$k_1 = hf(y_n, t_n),$
$k_2 = hf(y_n + k_1,\ t_{n+1})$

$\Rightarrow y_{n+1} = y_n + \frac12\left[k_1 + k_2\right].$   (3.22)

Equation (3.22) defines the second order Runge–Kutta approximation. We can get a better
approximation by improving our estimate for $\bar y_{n+1}$. The Runge–Kutta approximation can be
extended to higher orders [15]. The fourth order Runge–Kutta is given by

$y_{n+1} = y_n + \frac16\left[k_1 + 2k_2 + 2k_3 + k_4\right] + O(h^5),$

where

$k_1 = hf(y_n, t_n),$
$k_2 = hf(y_n + \tfrac12 k_1,\ t_n + \tfrac12 h),$
$k_3 = hf(y_n + \tfrac12 k_2,\ t_n + \tfrac12 h),$
$k_4 = hf(y_n + k_3,\ t_n + h).$   (3.23)
The method can be extended to find the numerical solution of $n$th order differential equations.
In much the same way as we looked at the vector formalism for the Euler method (3.20),
we can generalize the Runge–Kutta. For example, the second order differential equation

$\frac{d^2f}{dt^2} = g(t), \qquad f(0) = f_0, \qquad f'(0) = v_0,$

can be transformed into two coupled differential equations,

$\frac{du_1(t)}{dt} = u_2(t), \qquad \frac{du_2(t)}{dt} = g(t), \qquad u_1(0) = f_0, \qquad u_2(0) = v_0,$   (3.24)

and solved using the vector scheme:

$\dot{\mathbf u} = \mathbf f(t, \mathbf u) = \begin{pmatrix} u_2 \\ g(t) \end{pmatrix},$

$\mathbf k_1 = h\,\mathbf f(t_n, \mathbf u_n),$
$\mathbf k_2 = h\,\mathbf f(t_n + \tfrac12 h,\ \mathbf u_n + \tfrac12\mathbf k_1),$
$\mathbf k_3 = h\,\mathbf f(t_n + \tfrac12 h,\ \mathbf u_n + \tfrac12\mathbf k_2),$
$\mathbf k_4 = h\,\mathbf f(t_n + h,\ \mathbf u_n + \mathbf k_3),$

$\mathbf u_{n+1} = \mathbf u_n + \frac16\left(\mathbf k_1 + 2\mathbf k_2 + 2\mathbf k_3 + \mathbf k_4\right).$   (3.25)

Second and higher order ordinary differential equations (more generally, systems of nonlinear
equations) rarely yield closed form solutions. A great advantage of the numerical approach
is that it can be applied to both linear and nonlinear differential equations. A numerical solution
to the classical harmonic oscillator problem

$\frac{d^2x}{dt^2} = -x, \qquad x(0) = 1, \qquad \left.\frac{dx}{dt}\right|_{t=0} = 0,$   (3.26)

was found using the fourth order Runge–Kutta scheme as given in (3.25); the results are shown in
Figure 3.3, with $h = 0.3$.
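
An illustrative sketch of the vector scheme (3.25) applied to the oscillator (3.26), with $\mathbf u = (x, v)$ and $\dot{\mathbf u} = (v, -x)$:

```python
import math

def rk4_step(f, t, u, h):
    """One step of the fourth order Runge-Kutta scheme (3.25)."""
    add = lambda u, a, k: tuple(ui + a*ki for ui, ki in zip(u, k))
    k1 = f(t, u)
    k2 = f(t + h/2, add(u, h/2, k1))
    k3 = f(t + h/2, add(u, h/2, k2))
    k4 = f(t + h,   add(u, h,   k3))
    return tuple(ui + h/6*(a + 2*b + 2*c + d)
                 for ui, a, b, c, d in zip(u, k1, k2, k3, k4))

deriv = lambda t, u: (u[1], -u[0])      # u = (x, v)

t, u, h = 0.0, (1.0, 0.0), 0.3
while t < 30.0 - 1e-12:
    u = rk4_step(deriv, t, u, h)
    t += h
print(f"x({t:.1f}) = {u[0]:.6f}   cos({t:.1f}) = {math.cos(t):.6f}")
```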
Figure 3.3: Comparison of the analytic solution to (3.26), solid blue line, with the fourth order
Runge–Kutta calculation, $h = 0.3$, red crosses.

3.3.3 NUMEROV METHOD
Suppose we are given a differential equation of the form

$y''(x) = -g(x)\,y(x) + s(x).$   (3.27)

To derive Numerov's method for solving this equation, we begin with the Taylor expansion
of the function we want to solve for, $y(x)$, around the point $x_0$:

$y(x) = y(x_0) + (x - x_0)y'(x_0) + \frac{(x - x_0)^2}{2!}y''(x_0) + \frac{(x - x_0)^3}{3!}y'''(x_0) + \frac{(x - x_0)^4}{4!}y''''(x_0) + \frac{(x - x_0)^5}{5!}y'''''(x_0) + O(h^6).$   (3.28)

Denoting the distance from $x$ to $x_0$ by $h = x - x_0$, we can write the above equation as

$y(x_0 + h) = y(x_0) + hy'(x_0) + \frac{h^2}{2!}y''(x_0) + \frac{h^3}{3!}y'''(x_0) + \frac{h^4}{4!}y''''(x_0) + \frac{h^5}{5!}y'''''(x_0) + O(h^6).$   (3.29)

If we evenly discretize the space, we get a grid of $x$ points with spacing $h = x_{n+1} - x_n$. By
applying the above equation on this discrete grid, we get a relation between $y_n$ and $y_{n+1}$:

$y_{n+1} = y_n + hy'(x_n) + \frac{h^2}{2!}y''(x_n) + \frac{h^3}{3!}y'''(x_n) + \frac{h^4}{4!}y''''(x_n) + \frac{h^5}{5!}y'''''(x_n) + O(h^6).$   (3.30)

Computationally, this amounts to taking a step "forward" by an amount $h$. If we want to
take a step "backward," we replace every $h$ with $-h$ and get the expression for $y_{n-1}$:

$y_{n-1} = y_n - hy'(x_n) + \frac{h^2}{2!}y''(x_n) - \frac{h^3}{3!}y'''(x_n) + \frac{h^4}{4!}y''''(x_n) - \frac{h^5}{5!}y'''''(x_n) + O(h^6).$   (3.31)

Summing the two equations, we find that

$y_{n+1} - 2y_n + y_{n-1} = h^2 y_n'' + \frac{h^4}{12}y_n'''' + O(h^6).$   (3.32)

We can solve this equation for $y_{n+1}$ by substituting the expression given at the beginning,
that is, $y_n'' = -g_n y_n + s_n$. To get an expression for the $y_n''''$ factor, we simply have to differentiate

$y'' = -gy + s$   (3.33)

twice and approximate it again as we did above:

$y'''' = \frac{d^2}{dx^2}(-gy + s)$

$\Rightarrow h^2 y_n'''' = -g_{n+1}y_{n+1} + s_{n+1} + 2g_n y_n - 2s_n - g_{n-1}y_{n-1} + s_{n-1} + O(h^4).$   (3.34)

If we now substitute this in (3.32), we get

$y_{n+1} - 2y_n + y_{n-1} = h^2(-g_n y_n + s_n) + \frac{h^2}{12}\left(-g_{n+1}y_{n+1} + s_{n+1} + 2g_n y_n - 2s_n - g_{n-1}y_{n-1} + s_{n-1}\right) + O(h^6).$   (3.35)

Rearranging,

$y_{n+1}\left(1 + \frac{h^2}{12}g_{n+1}\right) - 2y_n\left(1 - \frac{5h^2}{12}g_n\right) + y_{n-1}\left(1 + \frac{h^2}{12}g_{n-1}\right) = \frac{h^2}{12}(s_{n+1} + 10s_n + s_{n-1}) + O(h^6)$

$\Rightarrow y_{n+1} \approx \frac{2y_n\left(1 - \frac{5h^2}{12}g_n\right) - y_{n-1}\left(1 + \frac{h^2}{12}g_{n-1}\right) + \frac{h^2}{12}(s_{n+1} + 10s_n + s_{n-1})}{1 + \frac{h^2}{12}g_{n+1}}.$   (3.36)

One might expect that the errors at each step would be roughly comparable, so that the total
error in the Numerov method would be $O(h^6 h^{-1}) = O(h^5)$. Unfortunately this is generally not
true; the error tends to grow with each step, and a better estimate is $O(h^4)$, the same as the 4th
order Runge–Kutta. Its main disadvantages are that we need both $y_0$ and $y_1$ to start it off and
that round-off errors can pop up when applying (3.36); you should always use double precision
in your Numerov code.
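
An illustrative sketch of Numerov's formula (3.36), tested on $y'' = -y$ ($g = 1$, $s = 0$) with $y(0) = 0$, $y'(0) = 1$, whose solution is $\sin x$; note that the recurrence needs the first two values to start:

```python
import math

def numerov(g, s, x0, y0, y1, h, nsteps):
    """March Eq. (3.36) forward from the two starting values y0, y1."""
    ys = [y0, y1]
    for n in range(1, nsteps):
        xm, xc, xp = x0 + (n - 1)*h, x0 + n*h, x0 + (n + 1)*h
        num = (2.0 * ys[n]   * (1.0 - 5.0*h*h/12.0 * g(xc))
               -     ys[n-1] * (1.0 +     h*h/12.0 * g(xm))
               + h*h/12.0 * (s(xp) + 10.0*s(xc) + s(xm)))
        ys.append(num / (1.0 + h*h/12.0 * g(xp)))
    return ys

h = 0.1
y = numerov(lambda x: 1.0, lambda x: 0.0, 0.0, 0.0, math.sin(h), h, 100)
print(y[-1], math.sin(10.0))    # differ by ~2e-6 for this h

# Here y1 is taken from the exact solution; in practice it would come
# from a Taylor expansion about x0.
```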

CHAPTER 4

Case Study: Damped and Driven Oscillations

4.1 LINEAR AND NONLINEAR ORDINARY DIFFERENTIAL EQUATIONS
The great advantage of using the numerical approach to solving differential equations is that it
can be applied equally well to nonlinear as well as linear equations. A differential equation is
linear if it involves the dependent variables and their derivatives only linearly. For example, the
familiar equation

$\frac{d^2x}{dt^2} = -\omega_0^2\,x(t)$

is a linear one, but

$\frac{d^2x}{dt^2} = -\omega_0^2\sin(x(t))$

is nonlinear.

Linear equations are much simpler to deal with. Consider the general homogeneous second
order ordinary differential equation

$a_2(t)\frac{d^2x}{dt^2} + a_1(t)\frac{dx}{dt} + a_0(t)x = 0.$   (4.1)

If $x_1$ and $x_2$ are solutions of (4.1) then

$\alpha x_1(t) + \beta x_2(t)$

is also a solution. A consequence of this is that if we find two linearly independent solutions
$x_1(t), x_2(t)$ to (4.1), then the general solution is given by $x_g(t) = \alpha x_1(t) + \beta x_2(t)$, where $\alpha, \beta$
are fixed by the boundary conditions [11, 16]. This useful property does not apply to nonlinear
differential equations, and consequently very few nonlinear equations admit a simple analytic
solution. For a linear equation a small change in initial conditions generally only produces a small
change in the final solution, but some nonlinear equations can exhibit extreme sensitivity to
initial conditions. The good news is that numerical methods such as the Runge–Kutta can be
applied equally effectively to both linear and nonlinear problems. In this chapter, I will illustrate
some of these properties by considering the physical pendulum. For a more complete discussion
see [17].
Figure 4.1: The simple pendulum consists of a heavy weight attached to a fixed point by a massless
string of length $L$. It is displaced from equilibrium through some small angle, $\theta$, and allowed to
oscillate.

4.2 THE PHYSICAL PENDULUM
Let us consider the pendulum. A heavy mass is hung from a fixed point by a light inextensible
string; in other words, we assume the weight of the string is negligible compared to the weight
of the mass and the string is strong enough not to stretch when the mass hangs down. Initially,
the mass is in equilibrium, with the gravitational force being exactly cancelled by the tension in
the string. If we displace the mass through an angle $\theta_0$ and release it, it will begin to fall under
gravity. At some time $t$ it will make an angle $\theta(t)$ with the vertical. If we assign zero potential
energy to the equilibrium point, i.e., hanging straight down with $\theta = 0$, then at time $t$ we have

$V = Mgh,$   (4.2)

where $h$ is the vertical displacement above equilibrium; see Figure 4.1.
Clearly,

$\sin\theta = \frac dL, \qquad \cos\theta = \frac{L - h}{L} \qquad \Rightarrow h = L(1 - \cos\theta).$   (4.3)

The potential energy is then

$V(\theta) = MgL(1 - \cos\theta).$   (4.4)

Now if we think of our polar coordinates as being fixed at the origin, then [11]

$\mathbf v = \dot r\,\mathbf e_r + r\dot\theta\,\mathbf e_\theta.$
In our case $r = L$, $\dot r = 0$; hence, the kinetic energy is

$K = \frac12 ML^2\dot\theta^2,$   (4.5)

and the total, constant, energy is

$E = K + V = \frac12 ML^2\dot\theta^2 + MgL(1 - \cos\theta),$

$\frac{dE}{dt} = 0$

$\Rightarrow 0 = ML^2\ddot\theta\dot\theta + MgL\dot\theta\sin\theta$

$\Rightarrow \ddot\theta = -\frac gL\sin\theta = -\omega_0^2\sin\theta,$   (4.6)

where

$\omega_0 = \sqrt{\frac gL}.$

4.2.1 SMALL OSCILLATIONS
First, let us consider the idealization where $\theta$ is "small" and $\sin\theta \approx \theta$; then (4.6) reduces to

$\ddot\theta = -\omega_0^2\theta.$   (4.7)

This is a simple harmonic oscillator equation with general solution

$\theta(t) = A\cos(\omega_0 t + \phi),$   (4.8)

where $\omega_0 = \sqrt{g/L}$. The constants $A$ and $\phi$ are determined by our initial conditions. If, for
example, the mass is released from rest at $t = 0$ at an angle $\theta_0$, then

$\dot\theta(0) = 0 = -A\omega_0\sin\phi \Rightarrow \phi = 0 \Rightarrow A = \theta_0$

$\Rightarrow \theta(t) = \theta_0\cos(\omega_0 t), \qquad \dot\theta = -\omega_0\theta_0\sin(\omega_0 t).$   (4.9)
Figure 4.2: The undamped oscillator with $\gamma = 0.0$, $Q = 0.0$, and $\omega_0 = 0.25$. The left panel
shows $\theta(t)$, red dashed, and $\omega = \dot\theta$, solid blue, as functions of time; the right panel shows the
phase trajectory, $\omega$ against $\theta$.

The motion repeats itself indefinitely with a period of


2
T D :
!0

Figure 4.2 shows the time dependence of φ and ω = φ̇. Also shown is the phase trajectory, where ω is plotted against φ. In this simple case, the phase trajectory is an ellipse. At a time T/4 after being released the mass will be back at the origin with its maximum speed, with all its energy kinetic. It will then decelerate and come to a stop at time t = T/2 at an angle of -φ₀. At this point all its energy is potential. After a further time of T/4 it is back at the origin with only kinetic energy, and after a total time T it is back at its original position with φ = φ₀ and φ̇ = 0. This process will repeat indefinitely.
Our description of the oscillator is an idealization where resistive forces such as friction and air resistance have been neglected. A typical resistive force would be proportional to the angular velocity of the mass. This leads us to consider the following differential equation

φ̈ + γφ̇ + ω₀²φ = 0,   (4.10)

where γ is a constant related to the strength of the resistance. In order to fully describe the undamped oscillator we needed two linearly independent solutions. To find two such functions for (4.10) we can try φ = e^(λt), where λ is a complex number to be determined. Plugging into (4.10) we find

λ² e^(λt) + γλ e^(λt) + ω₀² e^(λt) = 0,
⇒ λ² + γλ + ω₀² = 0,
⇒ λ = [-γ ± √(γ² - 4ω₀²)]/2.   (4.11)

Let

k = γ² - 4ω₀²,  Δ = |k|.

The behavior of the system depends on the sign of k.
Case 1. k < 0

λ = (-γ ± i√Δ)/2.   (4.12)

We can write the general solution

φ(t) = α e^(-Γt + iΥt) + β e^(-Γt - iΥt),

where

Γ = γ/2,  Υ = √Δ/2.

Hence, the general solution may be written

φ(t) = e^(-Γt)[α e^(iΥt) + β e^(-iΥt)],   (4.13)

or equivalently

φ(t) = e^(-Γt) A cos(Υt + δ).   (4.14)

The system will still oscillate but the magnitude of the oscillation will be reduced by the exponentially decaying factor e^(-Γt). Figure 4.3 shows a particular example of such motion. The left panel shows the plot of φ against time; the dashed curves correspond to the bounding curves ±e^(-Γt). The right panel shows the phase trajectory, which spirals toward the point (0, 0); we will call such a point an "attractor."
This case, where we still see oscillations, is described as under damped.
Figure 4.3: Left panel: φ plotted against t for γ = 0.05, together with the bounding curves ±e^(-0.025t); right panel: the phase trajectory.
Case 2. k > 0
In this case both

[-γ - √(γ² - 4ω₀²)]/2  and  [-γ + √(γ² - 4ω₀²)]/2

are negative real numbers, and the solution

φ(t) = A e^{[-γ + √(γ² - 4ω₀²)]t/2} + B e^{[-γ - √(γ² - 4ω₀²)]t/2}   (4.15)

just decays with time and shows no oscillation. Such a solution is said to be over damped.

Case 3. k = 0
This case, which is known as critical damping, marks the transition from oscillatory to decaying behavior. Mathematically, it is a little different in that we have only one independent solution, e^(-γt/2). However, it is easy to check that in this special case t e^(-γt/2) is a second solution, and indeed the general solution can be written (see for example [16])

φ(t) = A e^(-γt/2) + B t e^(-γt/2).   (4.16)

In this case we see no oscillations.
Lemma 4.1 Let x_g(t) be the general solution of the homogeneous 2nd order linear differential equation

a₂ d²x/dt² + a₁ dx/dt + a₀ x = 0,

and let x_p be any particular solution of the inhomogeneous differential equation

a₂ d²x/dt² + a₁ dx/dt + a₀ x(t) = f(t);

then any other solution X(t) must be of the form

X(t) = x_p(t) + x_g(t).

Proof.

a₂ d²x_p/dt² + a₁ dx_p/dt + a₀ x_p(t) = f(t),
a₂ d²X/dt² + a₁ dX/dt + a₀ X(t) = f(t),
⇒ a₂ d²(x_p - X)/dt² + a₁ d(x_p - X)/dt + a₀ [x_p(t) - X(t)] = 0.

Therefore, x_p - X is a solution of the homogeneous problem and as such we must have x_p - X = x_g(t). The general solution x_g(t) contains two constants, so we have enough freedom to accommodate the boundary conditions, and

X(t) = x_g(t) + x_p(t).  □
Suppose, now, we wish to solve the differential equation

φ̈ = -γφ̇ - ω₀²φ + Q sin(Ωt).   (4.17)

In other words, we are applying an external oscillating force to our pendulum. This could be achieved, for example, if the mass is charged and we apply an external varying electric field. We are assuming γ, ω₀², and Q are positive real constants. Since the homogeneous part of (4.17) is just the damped oscillator, we know the general solution. So "all" that is needed is a particular solution. To this end it is useful to look at the complex generalization of (4.17),

z̈ = -γż - ω₀²z + Q e^(iΩt).   (4.18)

Our desired particular solution will be the imaginary part of some z_p, a particular solution of (4.18). The form of (4.18) is suggestive of a possible solution of the form

z = z₀ e^(iΩt).

Plugging this into (4.18) we have

[-Ω² + iγΩ + ω₀²] z₀ e^(iΩt) = Q e^(iΩt),
⇒ z₀ [(ω₀² - Ω²) + iγΩ] = Q,
⇒ z₀ = Q/[(ω₀² - Ω²) + iγΩ]
     = Q [(ω₀² - Ω²) - iγΩ]/[(ω₀² - Ω²)² + γ²Ω²]
     = |z₀| e^(iδ),   (4.19)
where

|z₀| = Q/√[(ω₀² - Ω²)² + γ²Ω²],
cos δ = (ω₀² - Ω²)/√[(ω₀² - Ω²)² + γ²Ω²],
sin δ = -γΩ/√[(ω₀² - Ω²)² + γ²Ω²],
δ = arctan[γΩ/(Ω² - ω₀²)].   (4.20)
For the under-damped forced oscillator with k < 0, making use of (4.13) and Lemma 4.1 we find

φ(t) = α e^(-Γt + iΥt) + β e^(-Γt - iΥt) + Q sin(Ωt + δ)/√[(ω₀² - Ω²)² + γ²Ω²].   (4.21)

The first two terms on the right-hand side will decay with time, and the third term will become dominant. If γ ≪ 1 and Ω ≈ ω₀ the amplitude of this term can be very large. This phenomenon is known as resonance. In Figure 4.4, we show an example of the damped-driven oscillator with Q = 1, Ω = 2.0, γ = 0.1. The system is seen initially to exhibit damped, irregular motion (transient behavior), but eventually it settles into a periodic motion with the same frequency as the driving force. There is a phase-space attractor in this case as well: a closed loop.
Figure 4.4: Damped-driven pendulum, small oscillations, with γ = 0.1, Ω = 2, ω₀ = 1.0; left panel: time evolution φ(t); right panel: phase trajectory.
4.2.2 DIFFERENCES BETWEEN LINEAR AND NONLINEAR PENDULUM
From the numerical standpoint we can just as well use our Runge–Kutta approach for the physical pendulum, even though the underlying equation, (4.6), is nonlinear. As a first example let us consider the simple oscillator, with no damping and no forcing term, and consider the difference between the linear and nonlinear cases when we change the initial position of the mass. In Figure 4.5, the mass is released from rest at an angle φ₀ = 0.1 radian (≈ 5.7°), with γ = Q = 0 and φ̇₀ = 0, and the linear and nonlinear solutions are nearly identical. If we release the mass from an angle of 60° the nonlinear solution continues to be periodic but out of phase with the linear case; see Figure 4.6. This is not surprising: since energy must be conserved, and we have switched off the dissipative force, all the initial potential energy must be converted into kinetic, so the pendulum will rise until all the kinetic energy has been converted back to potential.

Example 4.2 Let us now explore the behavior of the linear and nonlinear oscillator when both are forced and damped. The equation for the linear oscillator is given by (4.17),

φ̈ = -γφ̇ - ω₀²φ + Q sin(Ωt),

and the nonlinear case by

φ̈ = -γφ̇ - ω₀² sin(φ) + Q sin(Ωt).   (4.22)
Figure 4.5: Undamped undriven oscillator: blue, linear approximation; red, nonlinear; γ = Q = 0, φ₀ = 0.1 radian (≈ 5.7°), φ̇₀ = 0.
Figure 4.6: Undamped undriven oscillator: blue, linear approximation; red, nonlinear; γ = Q = 0, φ₀ = π/3 radian (60°), φ̇₀ = 0.
Figure 4.7: Phase trajectories, i.e., φ̇ = ω against φ, for the damped driven oscillator with Q = 2.5, Ω = 0.5, γ = 0.2, ω₀ = 1. φ is in radians, ω = φ̇ in radians/s; red dashed, nonlinear; blue solid, linear.
I ran a fourth-order Runge–Kutta code for 0 ≤ t ≤ 100 with a step size of 0.01, with initial conditions φ(0) = 1, φ̇(0) = 0 for both the linear and nonlinear cases, where I have chosen Q = 2.5, Ω = 1/2, ω₀ = 1, γ = 0.2. The results are shown in Figure 4.7.
The linear case settles down to a behavior similar to that we have seen already in Figure 4.4. The phase trajectories for the nonlinear case are much more erratic and do not become less so as we increase the time range. The nonlinear case is very sensitive to the size of the step and also to the value of Q. These types of sensitivities occur frequently when dealing with nonlinear systems.
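The book does not reproduce the program itself; the following minimal Python sketch shows one way such an integration can be organized (the function names, and the use of NumPy, are my own choices, not the author's):

    import numpy as np

    def rk4_step(f, t, y, h):
        # One classical fourth-order Runge-Kutta step for y' = f(t, y).
        k1 = f(t, y)
        k2 = f(t + h/2, y + h/2*k1)
        k3 = f(t + h/2, y + h/2*k2)
        k4 = f(t + h, y + h*k3)
        return y + (h/6)*(k1 + 2*k2 + 2*k3 + k4)

    # Nonlinear pendulum (4.22) as a first-order system y = (phi, omega).
    gamma, w0, Q, W = 0.2, 1.0, 2.5, 0.5

    def pendulum(t, y):
        phi, omega = y
        return np.array([omega,
                         -gamma*omega - w0**2*np.sin(phi) + Q*np.sin(W*t)])

    h, t, y = 0.01, 0.0, np.array([1.0, 0.0])   # phi(0) = 1, phi'(0) = 0
    history = [y.copy()]
    while t < 100.0:
        y = rk4_step(pendulum, t, y, h)
        t += h
        history.append(y.copy())

Replacing -ω₀² sin(φ) by -ω₀²φ in the derivative function gives the linear case for comparison.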

4.3 CHAOS
Many mechanical systems exhibit chaotic motion in some regions of their parameter spaces. Essentially, the term chaotic motion, or chaos, refers to aperiodic motion and sensitivity of the time evolution to the initial conditions. A chaotic system is in practice unpredictable on long time scales, although the motion is in principle deterministic, because minute changes in the initial conditions can lead to large changes in the behavior after some time. Although a chaotic system is unpredictable, its motion is not completely random. In particular, the way the system
Figure 4.8: Phase trajectory for the Duffing oscillator with γ = 0.1, α = 1, β = 1, Q = 2.4, and initial conditions φ₀ = 1.0, φ̇₀ = 0, 0 ≤ t ≤ 500, calculated with a step size of 0.1.
approaches chaos often exhibits universality, i.e., seemingly different systems make the transition from regular, periodic motion to chaotic motion in very similar ways, often through a series of quantitatively universal period doublings (bifurcations). Our nonlinear pendulum can be shown to exhibit chaotic motion. A discussion of this important topic can be found in [17]. One of the first chaotic systems to be studied was the Duffing oscillator:

ẍ + γẋ + αx + βx³ = Q sin(Ωt).   (4.23)

It occurs in a number of situations, for example in the modeling of a damped-driven elastic pendulum whose spring does not exactly obey Hooke's Law. In Figure 4.8, the phase trajectory for a Duffing oscillator with γ = 0.1, α = 1, β = 1, and Q = 2.4 is shown. For this case the system is chaotic.
CHAPTER 5

Numerical Linear Algebra
A vast number of problems in computational physics can be reduced to the solution of systems of linear equations and the related problem of finding solutions to the matrix eigenvalue equation

Ax = λx,   (5.1)

where A is an N × N matrix, x is an N × 1 matrix (column vector), and λ is a number. In this chapter, I will develop some of the basic numerical methods for dealing with large systems of equations and the eigenvalue problem (5.1). The development of the numerical methods I will present below assumes a basic knowledge of linear algebra. A brief summary of the major results I will need is presented in Appendix A.
5.1 SYSTEM OF LINEAR EQUATIONS
Suppose we need to find the unknowns {xᵢ}ᵢ₌₁ᴺ that solve a system of linear equations of the form

a₁₁x₁ + a₁₂x₂ + ⋯ + a₁N x_N = b₁,
a₂₁x₁ + a₂₂x₂ + ⋯ + a₂N x_N = b₂,
  ⋮
a_N1 x₁ + a_N2 x₂ + ⋯ + a_NN x_N = b_N,   (5.2)

where the numbers a_ij, b_j are known. We can write (5.2) as a matrix equation Ax = b:

[a₁₁ a₁₂ ⋯ a₁N; a₂₁ a₂₂ ⋯ a₂N; ⋮; a_N1 a_N2 ⋯ a_NN] (x₁; x₂; ⋮; x_N) = (b₁; b₂; ⋮; b_N).   (5.3)
If the N × N matrix A is non-singular our desired solution is

x = A⁻¹b.

Now, see Appendix A,

A⁻¹ = (1/|A|) Cᵀ,   (5.4)

where |A| is the determinant of A and C is the cofactor matrix corresponding to A. To calculate the inverse of an N × N matrix using the cofactor method would involve something of the order of N! multiplications. For N = 20, this would mean approximately 2 × 10¹⁸ multiplications. Even with today's fast machines it would take a long time to solve a system of equations this way. We need a better computational approach.
There are some particular cases where the solution of (5.3) is particularly easy.

• If A is diagonal,

A = [a₁₁ 0 ⋯ 0; 0 a₂₂ ⋯ 0; ⋮; 0 0 ⋯ a_NN],

then the linear equations are uncoupled and

xᵢ = bᵢ/aᵢᵢ,  i = 1, ..., N.   (5.5)

• If A is an upper triangular matrix,

A = [a₁₁ a₁₂ ⋯ a₁N; 0 a₂₂ ⋯ a₂N; ⋮; 0 0 ⋯ a_NN],

where all the elements below the main diagonal are zero, a_ij = 0 for all i > j, then

[a₁₁ a₁₂ ⋯ a₁N; 0 a₂₂ ⋯ a₂N; ⋮; 0 0 ⋯ a_NN] (x₁; x₂; ⋮; x_N) = (b₁; b₂; ⋮; b_N)   (5.6)

admits a "backward substitution" solution (a short code sketch follows at the end of this section):

a_NN x_N = b_N,
a_{N-1,N-1} x_{N-1} + a_{N-1,N} x_N = b_{N-1},
  ⋮
Σᵢ₌₁ᴺ a₁ᵢ xᵢ = b₁.   (5.7)

• In much the same way we can solve the linear equations when A is a lower triangular matrix, using a "forward substitution" solution.

My basic strategy here is to look for ways to relate our matrix A to one or other of these simpler forms.
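A minimal Python sketch of the backward substitution step (5.7) (the function name is my own):

    import numpy as np

    def back_substitute(U, b):
        # Solve Ux = b for upper-triangular U, from the last row upward.
        N = len(b)
        x = np.zeros(N)
        for i in range(N - 1, -1, -1):
            # subtract the already-known components, divide by the diagonal
            x[i] = (b[i] - U[i, i+1:] @ x[i+1:]) / U[i, i]
        return x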

5.2 LU FACTORIZATION
Suppose A can be decomposed into the product of a lower-triangular matrix L and an upper-triangular matrix U. The entire solution algorithm for Ax = b can then be described in three steps:
(i) Decompose A = LU.
(ii) Solve Ly = b.
(iii) Solve Ux = y.
Example 5.1 Consider the decomposition of the matrix

A = [3 1; 4 2].

We can require

[3 1; 4 2] = [l₁₁ 0; l₂₁ l₂₂] [u₁₁ u₁₂; 0 u₂₂],
⇒ 3 = l₁₁u₁₁,
1 = l₁₁u₁₂,
4 = l₂₁u₁₁,
2 = l₂₁u₁₂ + l₂₂u₂₂.

We have only four equations for six unknowns; however, we only require a decomposition, so we may simply take l₁₁ = 1, l₂₂ = 1. Then

[3 1; 4 2] = [1 0; l₂₁ 1] [u₁₁ u₁₂; 0 u₂₂],
⇒ 3 = u₁₁,
1 = u₁₂,
4 = l₂₁u₁₁ ⇒ l₂₁ = 4/3,
2 = l₂₁u₁₂ + u₂₂ ⇒ u₂₂ = 2/3,

⇒ [3 1; 4 2] = [1 0; 4/3 1] [3 1; 0 2/3].

For a further discussion of the LU approach, see [15].
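A compact Python sketch of the same construction (Doolittle form, lᵢᵢ = 1, with no pivoting, so it assumes no zero is ever encountered on the diagonal; this is my illustration, not the author's code):

    import numpy as np

    def lu_decompose(A):
        # Doolittle LU: unit lower-triangular L, upper-triangular U.
        N = A.shape[0]
        L, U = np.eye(N), np.zeros((N, N))
        for i in range(N):
            for j in range(i, N):          # row i of U
                U[i, j] = A[i, j] - L[i, :i] @ U[:i, j]
            for j in range(i + 1, N):      # column i of L
                L[j, i] = (A[j, i] - L[j, :i] @ U[:i, i]) / U[i, i]
        return L, U

    L, U = lu_decompose(np.array([[3.0, 1.0], [4.0, 2.0]]))
    # L = [[1, 0], [4/3, 1]], U = [[3, 1], [0, 2/3]], as in Example 5.1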
5.3 QR FACTORIZATION
The QR approach is an alternative decomposition, in which a non-singular matrix A is decomposed into the product of a unitary matrix and an upper triangular matrix.

Theorem 5.2 Suppose that Q is an N × N non-singular matrix acting on an N-dimensional complex space V. Then Q is unitary iff the column (row) vectors of Q form an orthonormal basis for V.
Proof. If we write Q in the form

Q = [c₁₁ c₁₂ ⋯ c₁N; c₂₁ c₂₂ ⋯ c₂N; ⋮; c_N1 c_N2 ⋯ c_NN],

then we can regard Q as being made up of N column vectors,

cᵢ = (c₁ᵢ, c₂ᵢ, ..., c_Nᵢ)ᵀ,
Q = (c₁ c₂ ⋯ c_N),   (5.8)

and

⟨cᵢ|cⱼ⟩ = Σₖ₌₁ᴺ c̄ₖᵢ cₖⱼ,

Q†Q = [⟨c₁|c₁⟩ ⟨c₁|c₂⟩ ⋯ ⟨c₁|c_N⟩; ⟨c₂|c₁⟩ ⟨c₂|c₂⟩ ⋯ ⟨c₂|c_N⟩; ⋮; ⟨c_N|c₁⟩ ⟨c_N|c₂⟩ ⋯ ⟨c_N|c_N⟩].

Thus, Q is unitary iff ⟨cᵢ|cⱼ⟩ = δᵢⱼ.
The proof using rows as vectors is almost identical. □
Suppose A ∈ R^(N×N) is a non-singular matrix; then we know its determinant is non-zero and the vectors corresponding to the columns are linearly independent. Just as in (5.8) I can write

A = [a₁ a₂ ... a_N].
My plan is to use the Gram–Schmidt process to create an orthonormal set from the vectors aᵢ. Using (A.6) we may write

e₁′ = a₁,  e₁ = e₁′/‖e₁′‖,
e₂′ = a₂ - ⟨a₂|e₁⟩e₁,  e₂ = e₂′/‖e₂′‖,
  ⋮   (5.9)

The vector aᵢ is a member of the subspace spanned by {eₖ}ₖ₌₁ⁱ and thus orthogonal to the unit vectors {eₖ}ₖ>ᵢ. We can expand

aᵢ = Σₖ₌₁ᴺ rₖᵢ eₖ,  rₖᵢ = ⟨eₖ|aᵢ⟩.   (5.10)

Notice rₖᵢ = 0 if k > i. Further,

rᵢᵢ = ⟨eᵢ|eᵢ′⟩ = √⟨eᵢ′|eᵢ′⟩ ≥ 0.
Considering components: the (l, i)th entry of our original matrix A is the l-th element of aᵢ and can therefore be written in terms of the l-th elements of the eₖ,

a_li = Σₖ₌₁ᴺ e_lk rₖᵢ,

that is,

A = QR,

where by construction the matrix

Q = [e₁ e₂ ... e_N]

satisfies the unitary condition in Theorem 5.2, and the matrix R has only zero elements below the diagonal, i.e., it is upper triangular.
Example 5.3 Consider the matrix

A = [3 2; 1 2];

writing the columns as vectors,

A = [a₁ a₂],

where

a₁ = (3, 1)ᵀ,  a₂ = (2, 2)ᵀ.

Applying the Gram–Schmidt process to a₁, a₂ you will find the two orthonormal vectors

e₁ = (1/√10)(3, 1)ᵀ,  e₂ = (1/√10)(-1, 3)ᵀ,

which span the same space. Define

Q = (1/√10)[3 -1; 1 3],
⇒ Q† = (1/√10)[3 1; -1 3],

Q†Q = I = [1 0; 0 1].

Then

A = QR,
R = Q†A = (1/√10)[3 1; -1 3][3 2; 1 2] = (1/√10)[10 8; 0 4].
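It is worth checking such hand calculations against a library routine; a quick sketch (note that np.linalg.qr fixes signs by its own convention, so its Q and R may differ from those above by a sign in each column and corresponding row):

    import numpy as np

    A = np.array([[3.0, 2.0], [1.0, 2.0]])
    Q, R = np.linalg.qr(A)
    print(np.allclose(Q @ R, A))              # True: the product reproduces A
    print(np.allclose(Q.T @ Q, np.eye(2)))    # True: columns are orthonormal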
As we have seen, every non-singular matrix A can be decomposed into a product QR, where Q is unitary and R is upper triangular. Let us now apply this observation to some of the characteristic problems of linear algebra.
5.3.1 SYSTEMS OF LINEAR EQUATIONS
Suppose A is a non-singular N × N matrix and we want to solve

Ax = b.

We can proceed as follows.
• First factorize A: Ax = QRx = b.
• Then introduce a vector y:

y ≡ Rx,
⇒ Qy = b,
⇒ y = Q†b.

• Solve the triangular system Rx = y by back substitution.
5.3.2 EIGENVALUES
Our next task is to find the eigenvalues of a non-singular N × N matrix A. We first note that if R is an upper triangular matrix, its eigenvalues are given by the solution of

|R - λI| = det [a₁₁-λ a₁₂ ⋯ a₁N; 0 a₂₂-λ ⋯ a₂N; ⋮; 0 0 ⋯ a_NN-λ] = 0,
⇒ |R - λI| = Πᵢ₌₁ᴺ (aᵢᵢ - λ) = 0.

Now we know, see Lemma A.29, that if A is an N × N matrix, U is a unitary matrix, and B = UAU†, then B and A have the same eigenvalues. If we can find a matrix B which is upper triangular such that

A = UBU†   (5.11)

with U unitary, then the eigenvalues of A are just the diagonal elements of B. We will call a transformation of the kind (5.11) with U unitary a "similarity transformation."
We can find a QR decomposition of any non-singular matrix A, i.e., A = QR, and

Q†AQ = RQ   (5.12)

is a similarity transformation of A, so RQ has the same eigenvalues as A = QR. If RQ is upper triangular the problem is solved. Even if RQ is not in upper triangular form we do have a way forward.
The QR Algorithm
Suppose A is a real symmetric matrix for which we want to find the eigenvalues. Define A⁽⁰⁾ = A, then define a sequence of matrices starting with k = 0 by computing the QR decomposition A⁽ᵏ⁾ = Q⁽ᵏ⁾R⁽ᵏ⁾ and setting A⁽ᵏ⁺¹⁾ = R⁽ᵏ⁾Q⁽ᵏ⁾. It can be shown that eventually A⁽ᵏ⁾ converges to an upper triangular matrix [18]. But

A⁽ᵏ⁾ = R⁽ᵏ⁻¹⁾Q⁽ᵏ⁻¹⁾ = (Q⁽ᵏ⁻¹⁾)† Q⁽ᵏ⁻¹⁾R⁽ᵏ⁻¹⁾ Q⁽ᵏ⁻¹⁾ = (Q⁽ᵏ⁻¹⁾)† A⁽ᵏ⁻¹⁾ Q⁽ᵏ⁻¹⁾,

so all the matrices A⁽ʲ⁾ are connected by similarity transformations and therefore share the same eigenvalues. A⁽⁰⁾ is just A and the limiting A⁽ᵏ⁾ is upper triangular, so we can read off the eigenvalues.
The advantages of the QR algorithm are that it
• gives all the eigenvalues,
• is stable.
It is incorporated in the major software packages such as LAPACK [2].
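In outline the iteration is only a few lines (an unshifted, textbook version, for illustration only; production codes such as LAPACK use shifts and deflation):

    import numpy as np

    def qr_eigenvalues(A, iterations=200):
        # Repeated QR steps: A_{k+1} = R_k Q_k = Q_k^T A_k Q_k.
        Ak = A.copy()
        for _ in range(iterations):
            Q, R = np.linalg.qr(Ak)
            Ak = R @ Q
        return np.diag(Ak)   # diagonal holds the eigenvalues in the limit

    A = np.array([[2.0, 1.0], [1.0, 3.0]])
    print(qr_eigenvalues(A))   # compare with np.linalg.eigvalsh(A)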
5.3.3 LINEAR LEAST SQUARES
The QR factorization is not restricted to N × N matrices.

Example 5.4 Consider the matrix

A = [1 2; 1 2; 0 3];
write the columns as the vectors

r₁ = (1, 1, 0)ᵀ,  r₂ = (2, 2, 3)ᵀ.

Using the Gram–Schmidt process it is easy to see that the vectors

e₁ = (1/√2)(1, 1, 0)ᵀ,  e₂ = (0, 0, 1)ᵀ

are orthonormal and span the same space. Define

Q = [1/√2 0; 1/√2 0; 0 1],
Qᵀ = [1/√2 1/√2 0; 0 0 1],
⇒ QᵀQ = I₂ = [1 0; 0 1].

Define

R = Q†A = [1/√2 1/√2 0; 0 0 1] [1 2; 1 2; 0 3] = [√2 2√2; 0 3].

Note that
• Q is a 3 × 2 matrix and Q† is a 2 × 3 matrix,
• Q†Q = I₂ but QQ† ≠ I₃,
• since Q is real, Q† = Qᵀ.
This example is a particular case of the following theorem; see [15].

Theorem 5.5 Suppose A ∈ R^(M×N) where M ≥ N. Then A can be written in the form

A = QR,

where R is an upper triangular N × N matrix and Q is an M × N matrix which satisfies

QᵀQ = I_N,

where I_N is the N × N identity matrix. If rank(A) = N then R is non-singular.

Again notice our matrix Q is real, so Q† = Qᵀ.
Table 5.1: Measured values of the quantity E at several times t

t     E(t)
1.0   1.0
2.0   1.5
3.0   3.0
4.0   6.0
Linear Least Squares
Suppose that we are given a vector b ∈ R^M and an M × N matrix A. Then the matrix equation

Ax = b   (5.13)

has, in general, no solutions if M > N: there are more equations than unknowns and the system is said to be "over determined." (If M < N it has an infinity of solutions.)

Example 5.6 Table 5.1 shows measured function values of the quantity E at four times. We want to find the "best fit" to the data, which can contain some degree of experimental error.

More generally, suppose we have M experimental observations which give us a set {xᵢ, bᵢ}ᵢ₌₁ᴹ. Suppose E is the desired experimental quantity and we want to model E with a linear combination of N functions φⱼ(x). We can write

E(x) = Σⱼ₌₁ᴺ cⱼ φⱼ(x),
E(xᵢ) ≈ bᵢ,  i = 1, ..., M,   (5.14)

which we can put in matrix form as

Ac = [φ₁(x₁) ⋯ φ_N(x₁); ⋮; φ₁(x_M) ⋯ φ_N(x_M)] (c₁; ⋮; c_N) = (E(x₁); ⋮; E(x_M)) ≈ (b₁; ⋮; b_M) = b.   (5.15)
I still have not said precisely what I mean by a "best fit"; the way we will explore this idea here is to minimize the sum of squares of the deviations, i.e., we want to find the vector c which minimizes

Σᵢ₌₁ᴹ (E(xᵢ) - bᵢ)² = ‖Ac - b‖².   (5.16)
We can use our QR decomposition to find the solution to the least squares problem.

Theorem 5.7 Suppose that A ∈ R^(M×N) with M ≥ N and rank(A) = N. Then there exists a unique least squares solution to the system of equations

Ac = b

which minimizes

‖Ac - b‖.

Further, if we decompose A = QR, then the vector c defined by

Rc = Qᵀb   (5.17)

is that unique solution.

Proof. [15]. □
Returning to Example 5.6: using the QR approach I will look for the "best fit" in the least squares sense, assuming the data is best represented by
(i) a straight line,
(ii) a quadratic function p(t) = c₀ + c₁t + c₂t².
Example 5.8 First we will look for a linear solution c₀ + c₁t ≈ E(t). In matrix form, Ac = b:

[1 1; 1 2; 1 3; 1 4] (c₀; c₁) = (1; 1.5; 3; 6).

Now write A as two column vectors:

a₁ = (1, 1, 1, 1)ᵀ,  a₂ = (1, 2, 3, 4)ᵀ.
Applying Gram–Schmidt to these vectors we get

e₁ = (1/2)(1, 1, 1, 1)ᵀ,  e₂ = (1/√5)(-1.5, -0.5, 0.5, 1.5)ᵀ.

Now construct Q:

Q = [1/2 -1.5/√5; 1/2 -0.5/√5; 1/2 0.5/√5; 1/2 1.5/√5].

Then R will be given by

R = QᵀA = [1/2 1/2 1/2 1/2; -1.5/√5 -0.5/√5 0.5/√5 1.5/√5] [1 1; 1 2; 1 3; 1 4] = [2 5; 0 √5].

Now we can deduce values for our coefficients:

Ac = b,
⇒ QRc = b,
⇒ Rc = Qᵀb,

[2 5; 0 √5] (c₀; c₁) = (5.75; 8.25/√5),

that is,

2c₀ + 5c₁ = 5.75,
√5 c₁ = 8.25/√5 ⇒ c₁ = 1.65, c₀ = -1.25.

Hence, the best linear approximation is

y(t) = c₀ + c₁t = -1.25 + 1.65t.   (5.18)
Let us now consider a quadratic fit. We are looking for a solution

c₀ + c₁t + c₂t² ≈ E(t).

We will proceed much as before. We can write the problem in matrix form, Ac = b:

[1 1 1; 1 2 4; 1 3 9; 1 4 16] (c₀; c₁; c₂) = (1; 1.5; 3; 6).

Applying Gram–Schmidt we find our desired Q to be

Q = (1/2)[1 -3/√5 1; 1 -1/√5 -1; 1 1/√5 -1; 1 3/√5 1].

Now since Q and A are known, all we have to do is transpose Q and multiply by A to find R:

R = QᵀA = (1/2)[1 1 1 1; -3/√5 -1/√5 1/√5 3/√5; 1 -1 -1 1] [1 1 1; 1 2 4; 1 3 9; 1 4 16]
  = [2 5 15; 0 √5 5√5; 0 0 2].

Then

Rc = Qᵀb,
[2 5 15; 0 √5 5√5; 0 0 2] (c₀; c₁; c₂) = (5.75; 8.25/√5; 1.25),

that is,

2c₀ + 5c₁ + 15c₂ = 5.75,
√5 c₁ + 5√5 c₂ = 8.25/√5,
2c₂ = 1.25,

⇒ c₂ = 0.625, c₁ = -1.475, c₀ = 1.875.

So the quadratic least squares fit is

1.875 - 1.475t + 0.625t².   (5.19)
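The same numbers drop out of a few lines of Python (a sketch using NumPy's QR routine; the variable names are my own):

    import numpy as np

    t = np.array([1.0, 2.0, 3.0, 4.0])
    b = np.array([1.0, 1.5, 3.0, 6.0])
    A = np.column_stack([np.ones_like(t), t, t**2])   # columns 1, t, t^2

    Q, R = np.linalg.qr(A)
    c = np.linalg.solve(R, Q.T @ b)    # solve Rc = Q^T b, cf. (5.17)
    print(c)                           # approximately [1.875, -1.475, 0.625]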
Figure 5.1: Linear least squares and quadratic least squares fits to the data in Table 5.1.
In Figure 5.1 I show a comparison between the linear, (5.18), and quadratic, (5.19), fits.
I cannot end this chapter without adding one word of warning. A straightforward computer implementation of the Gram–Schmidt process as given in (5.9) can very easily run into significant round-off errors. Fortunately, there are some clever ways to avoid these problems [19], leading to stable and efficient codes.
CHAPTER 6

Polynomial Approximations
6.1 INTERPOLATION
Interpolation is the problem of fitting a smooth curve through a given set of points, generally as the graph of a function. It is useful in data analysis (interpolation is a form of regression) and in numerical analysis. It is one of those important recurring concepts in applied mathematics.

Definition 6.1 Given N + 1 points xⱼ ∈ R, 0 ≤ j ≤ N, and sample values

yⱼ = f(xⱼ)

of a function at these points, the polynomial interpolation problem consists in finding a polynomial p_N(x) of degree N which reproduces these values:

yⱼ = p_N(xⱼ).
Example 6.2 Linear interpolation between (x₁, y₁) and (x₂, y₂).

Let

L₁(x) = (x - x₂)/(x₁ - x₂),
L₂(x) = (x - x₁)/(x₂ - x₁),
p(x) = y₁L₁(x) + y₂L₂(x)
     = y₁ (x - x₂)/(x₁ - x₂) + y₂ (x - x₁)/(x₂ - x₁)
     = [-y₁x₂ + y₂x₁ + x(y₁ - y₂)]/(x₁ - x₂)
     = [(y₁ - y₂)/(x₁ - x₂)] x + (-y₁x₂ + y₂x₁)/(x₁ - x₂);

then

p(x₁) = y₁,
p(x₂) = y₂.
A degree N polynomial can be written as

p_N(x) = Σₙ₌₀ᴺ aₙ xⁿ.

For interpolation the number of degrees of freedom (N + 1 coefficients) in the polynomial matches the number of points where the function should fit. If the degree of the polynomial is strictly less than N we cannot in general pass it through all the points (xᵢ, yᵢ). Let us look for a direct solution:

Σₙ₌₀ᴺ aₙ xⱼⁿ = yⱼ,  j = 0, ..., N.   (6.1)

In these N + 1 equations the unknowns are the coefficients a₀, a₁, ..., a_N. In other words, this is a linear system given by the matrix equation

Va = y,
[1 x₀ ⋯ x₀ᴺ; 1 x₁ ⋯ x₁ᴺ; ⋮; 1 x_N ⋯ x_Nᴺ] (a₀; a₁; ⋮; a_N) = (y₀; y₁; ⋮; y_N).   (6.2)

V is known as the "Vandermonde matrix"; it has coefficients

Vⱼₙ = xⱼⁿ.

If we know how to effectively invert V then we can find a as

a = V⁻¹y.

I will return to this shortly, but for the time being I will develop an analytic approach. Let P_N denote the set of all real valued polynomials of degree ≤ N.
Lemma 6.3 Let {xⱼ}ⱼ₌₀ᴺ be a set of distinct numbers. Then there exist polynomials Lₖ(x) ∈ P_N such that

Lₖ(xⱼ) = δⱼₖ.

Proof. For a given k, define

Lₖ(x) = [Πⱼ₌₀,ⱼ≠ₖᴺ (x - xⱼ)] / [Πⱼ₌₀,ⱼ≠ₖᴺ (xₖ - xⱼ)],

⇒ Lₖ(xⱼ) = δⱼₖ.  □

The polynomials Lₖ(x) are known as the "Lagrange elementary polynomials."
Theorem 6.4 (Lagrange interpolation theorem). Let {xⱼ}ⱼ₌₀ᴺ be a collection of distinct real numbers and {yⱼ}ⱼ₌₀ᴺ be a collection of real numbers. Then there exists a unique p_N ∈ P_N s.t.

p_N(xᵢ) = yᵢ.

Proof. Define

p_N(x) = Σₖ₌₀ᴺ yₖ Lₖ(x),

where the Lₖ(x) are the Lagrange elementary polynomials. Now from Lemma 6.3 we have

p_N(xᵢ) = Σₖ₌₀ᴺ yₖ Lₖ(xᵢ) = Σₖ₌₀ᴺ yₖ δᵢₖ = yᵢ.

It remains to show that p_N(x) is unique. Assume that there exists a second polynomial of order N, q_N(x), such that

p_N(xᵢ) = q_N(xᵢ) = yᵢ;

then

r_N(x) = p_N(x) - q_N(x)

is a polynomial of degree N with roots at each of the N + 1 points x₀, x₁, ..., x_N. However, the fundamental theorem of algebra tells us that a non-zero polynomial of degree N can have at most N real roots; it follows that r_N must be the zero polynomial. So p_N = q_N.  □
Definition 6.5 If f is a function defined on an interval containing the points {xᵢ}ᵢ₌₀ᴺ, then

P_N(x) = Σₖ₌₀ᴺ f(xₖ) Lₖ(x)

is called the Lagrange interpolation polynomial of f.
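A direct Python transcription of Definition 6.5 (a naive sketch; in practice one would use a barycentric form for efficiency):

    import numpy as np

    def lagrange_interp(xs, ys, x):
        # Evaluate P_N(x) = sum_k y_k L_k(x), with L_k as in Lemma 6.3.
        total = 0.0
        for k in range(len(xs)):
            Lk = 1.0
            for j in range(len(xs)):
                if j != k:
                    Lk *= (x - xs[j]) / (xs[k] - xs[j])
            total += ys[k] * Lk
        return total

    xs = np.array([-1.0, 0.0, 1.0])
    print(lagrange_interp(xs, np.exp(xs), 0.5), np.exp(0.5))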
Figure 6.1: Polynomial fit to the exp(x) function evaluated at the points x = -1, 0, 1.
Example 6.6 f(x) = eˣ interpolated by a parabola with fixed values at x₀ = -1, x₁ = 0, x₂ = 1.

L₀(x) = (x - x₁)(x - x₂)/[(x₀ - x₁)(x₀ - x₂)] = x(x - 1)/[(-1)(-1 - 1)] = (x² - x)/2,
L₁(x) = (x - x₀)(x - x₂)/[(x₁ - x₀)(x₁ - x₂)] = (x + 1)(x - 1)/[(1)(-1)] = 1 - x²,
L₂(x) = (x - x₀)(x - x₁)/[(x₂ - x₀)(x₂ - x₁)] = (x + 1)x/[(1 + 1)(1)] = (x² + x)/2,

⇒ p₂(x) = e⁻¹L₀(x) + e⁰L₁(x) + e¹L₂(x)
        = e⁻¹(x² - x)/2 + (1 - x²) + e¹(x² + x)/2
        = x²[(e⁻¹ + e¹)/2 - 1] + x(e¹ - e⁻¹)/2 + 1
        = x²(cosh(1) - 1) + x sinh(1) + 1
        ≈ 1 + 1.1752x + 0.5431x².
6.1.1 ERROR ESTIMATION

Theorem 6.7 Let f be an (N + 1)-times continuously differentiable function on an interval [a, b] and let {xⱼ | j = 0, ..., N} be a set of distinct numbers in [a, b]. If p_N(x) is the Lagrange interpolation of f using {xⱼ | j = 0, ..., N}, then for every x ∈ [a, b] there exists ξ(x) ∈ [a, b] s.t.

f(x) - p_N(x) = [f^(N+1)(ξ(x))/(N + 1)!] π_{N+1}(x),

where

π_{N+1}(x) = Πⱼ₌₀ᴺ (x - xⱼ).

Proof. [15]. □
The interpolation error:
• depends on the smoothness of f via the high order derivative f^(N+1),
• has a factor 1/(N + 1)!, which decays fast as N becomes large,
• is directly proportional to π_{N+1}(x), which means that it will be zero at the points xⱼ and at its best in their vicinity.

Can we always expect convergence of the polynomial interpolant as N → ∞? The answer is: No! There are examples of very smooth (analytic) functions for which polynomial interpolation diverges, particularly so near the boundaries of the interpolation interval.
Example 6.8 Runge function.
Consider the function

f_runge(x) = 1/(1 + 25x²).

Runge found that if it is interpolated at n equidistant points xᵢ between -1 and 1, the interpolation error increases when the degree of the polynomial is increased; see Figure 6.2.
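This divergence is easy to reproduce; a short experiment (np.polyfit through n equally spaced nodes is used here as a convenient, if ill-conditioned, way to build the interpolant — my own illustration):

    import numpy as np

    runge = lambda x: 1.0 / (1.0 + 25.0 * x**2)
    xx = np.linspace(-1.0, 1.0, 2001)
    for n in (6, 10, 14):
        xs = np.linspace(-1.0, 1.0, n)
        p = np.polyfit(xs, runge(xs), n - 1)   # degree n-1 interpolant
        print(n, np.max(np.abs(np.polyval(p, xx) - runge(xx))))
    # The maximum error grows with n, worst near the ends of the interval.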
6.2 ORTHOGONAL POLYNOMIALS

6.2.1 LEGENDRE EQUATION
Legendre polynomials turn up in the solution of a lot of physical problems, in electromagnetism and quantum mechanics [11]. They originate as the solution of the Legendre differential equation

(1 - x²) d²y(x)/dx² - 2x dy(x)/dx + l(l + 1)y(x) = 0,   (6.3)
Figure 6.2: Runge function, red line; 5th order interpolation (six equally spaced points), blue line; 9th order interpolation (10 equally spaced points), green line.
where l is a constant. In this section we will start from (6.3) and look for a power series solution y = Σₙ₌₀^∞ aₙxⁿ:

dy/dx = Σₙ₌₁^∞ aₙ n xⁿ⁻¹,
-2x dy/dx = -Σₙ₌₁^∞ 2n aₙ xⁿ,
d²y/dx² = Σₙ₌₂^∞ aₙ n(n - 1) xⁿ⁻²,
(1 - x²) d²y/dx² = Σₙ₌₂^∞ aₙ n(n - 1) xⁿ⁻² - Σₙ₌₂^∞ aₙ n(n - 1) xⁿ.   (6.4)

Inserting in (6.3) and equating powers of x we must have that

(n + 2)(n + 1)a_{n+2} = -(l² + l - n² - n)aₙ = -(l - n)(l + n + 1)aₙ,
⇒ a_{n+2} = [n(n + 1) - l(l + 1)]/[(n + 2)(n + 1)] aₙ.   (6.5)
The general solution to (6.3) is the sum of two series containing two constants a₀ and a₁:

y₁(x) = a₀[1 - l(l + 1)x²/2! + (l - 2)l(l + 1)(l + 3)x⁴/4! - ⋯],
y₂(x) = a₁[x - (l - 1)(l + 2)x³/3! + (l - 3)(l - 1)(l + 2)(l + 4)x⁵/5! - ⋯].   (6.6)

Since y₁ only contains even powers and y₂ only odd powers they cannot be proportional to each other and must be linearly independent. Thus, the general solution for |x| < 1 must be

y(x) = b₁y₁(x) + b₂y₂(x).   (6.7)
In many applications l is an integer. In that case

a_{l+2} = [l(l + 1) - l(l + 1)]/[(l + 1)(l + 2)] a_l = 0

and we obtain a polynomial of order l. In particular, if l is even y₁(x) is a polynomial and if l is odd y₂(x) is a polynomial. So we may write the general solution (6.7)

y(x) = b_l P_l(x) + c_l Q_l(x),   (6.8)

where P_l(x) is the polynomial and Q_l corresponds to the other linearly independent solution. If we now demand that P_l(1) = 1 we have a set of polynomials (the Legendre polynomials):

P₀(x) = 1,
P₁(x) = x,
P₂(x) = (3x² - 1)/2,
P₃(x) = (5x³ - 3x)/2,
P₄(x) = (35x⁴ - 30x² + 3)/8,
P₅(x) = (63x⁵ - 70x³ + 15x)/8,
  ⋮   (6.9)

The Q_l(x) solutions are functions, not polynomials.

Theorem 6.9 ∫₋₁¹ Pₘ(x)Pₙ(x) dx = 0 if m ≠ n.
Proof. From (6.3),

Pₘ(x) d/dx[(1 - x²) dPₙ(x)/dx] + n(n + 1)Pₙ(x)Pₘ(x) = 0,
Pₙ(x) d/dx[(1 - x²) dPₘ(x)/dx] + m(m + 1)Pₙ(x)Pₘ(x) = 0,
⇒ d/dx{(1 - x²)[Pₘ(x) dPₙ(x)/dx - Pₙ(x) dPₘ(x)/dx]} + [n(n + 1) - m(m + 1)]Pₙ(x)Pₘ(x) = 0.   (6.10)

Now integrate from -1 to 1; the first term will be zero since (1 - x²) = 0 at both limits, and since m ≠ n we have the result. □
Generating Function

Definition 6.10

Φ(x, h) = 1/√(1 - 2hx + h²),  |h| < 1,

is called the generating function of the Legendre polynomials.

Theorem 6.11

Φ(x, h) = Σ_{l=0}^∞ hˡ P_l(x).
Let y = 2xh - h²; then

Φ(x, h) = (1 - y)^(-1/2)
        = 1 + y/2 + (3/8)y² + ⋯
        = 1 + (2xh - h²)/2 + (3/8)(2xh - h²)² + ⋯
        = 1 + xh - h²/2 + (3/8)(4x²h² - 4xh³ + h⁴) + ⋯
        = 1 + xh + h²(3x²/2 - 1/2) + ⋯
        = P₀(x) + hP₁(x) + h²P₂(x) + ⋯

For a more complete formal proof, see [13]. The generating function allows us to derive some important relations:

∂Φ(x, h)/∂h = -(1/2)(1 - 2xh + h²)^(-3/2)(-2x + 2h),
⇒ (1 - 2xh + h²) ∂Φ(x, h)/∂h = (x - h)Φ,
⇒ (1 - 2xh + h²) Σ_{l=1}^∞ l h^(l-1) P_l(x) = (x - h) Σ_{l=0}^∞ hˡ P_l(x).
Equating powers of h we find the recurrence relation:

lP_l(x) - 2x(l - 1)P_{l-1}(x) + (l - 2)P_{l-2}(x) = xP_{l-1}(x) - P_{l-2}(x),
⇒ lP_l(x) = (2l - 1)xP_{l-1}(x) - (l - 1)P_{l-2}(x).   (6.11)

For each n the second solution Qₙ(x) satisfies the same differential equation and also the same recurrence relation:

lQ_l(x) = (2l - 1)xQ_{l-1}(x) - (l - 1)Q_{l-2}(x),
(l + 1)Q_{l+1}(x) = (2l + 1)xQ_l(x) - lQ_{l-1}(x),
⇒ (2l + 1)xQ_l(x) = lQ_{l-1}(x) + (l + 1)Q_{l+1}(x).   (6.12)

However, it is singular at x = ±1. For |x| ≠ 1 it can be shown that [13]

Q₀(x) = (1/2) ln[(1 + x)/(1 - x)],
Q₁(x) = (x/2) ln[(1 + x)/(1 - x)] - 1,
Q₂(x) = [(3x² - 1)/4] ln[(1 + x)/(1 - x)] - 3x/2,
  ⋮   (6.13)
The Numerical Generation of the Legendre Functions
For the Legendre functions of the first kind (i.e., the Legendre polynomials) we can simply use the recurrence relation (6.11), starting with P₀(x) = 1, P₁(x) = x, and work up to larger l values. This procedure is generally stable.
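In code this is only a few lines (a minimal sketch):

    def legendre_p(l_max, x):
        # Upward recurrence (6.11): l P_l = (2l-1) x P_{l-1} - (l-1) P_{l-2}
        P = [1.0, x]
        for l in range(2, l_max + 1):
            P.append(((2*l - 1)*x*P[l-1] - (l - 1)*P[l-2]) / l)
        return P[:l_max + 1]

    print(legendre_p(4, 0.5))   # P_2(0.5) = -0.125, P_4(0.5) = -0.2890625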
Unfortunately, for the Legendre function of the second kind the forward recurrence relation is not numerically stable and is subject to cancellation errors. As we have seen, the Q_l satisfy the recurrence relation

(l + 1)Q_{l+1}(x) = (2l + 1)xQ_l(x) - lQ_{l-1}(x),
Q₀(x) = (1/2) ln[(x + 1)/(x - 1)],
Q₁(x) = (x/2) ln[(x + 1)/(x - 1)] - 1,  |x| ≠ 1.   (6.14)

Noting that [20, 21]

|Q_l(x)| ≤ e^(-(l-1)α) Q₀(cosh 2α),   (6.15)

where x = cosh α, α ≥ 0, you can use (6.15) to estimate the value, l₀ say, for which

|Q_{l₀}| < 10⁻⁷.   (6.16)
Figure 6.3: The first few Legendre functions of the first kind, left panel, and of the second kind, right panel.
Then set

Q̃_{l₀}(x) = 0,
Q̃_{l₀-1}(x) = 10⁻⁷.   (6.17)

Then calculate Q̃_l using the backward recurrence formula for 0 ≤ l ≤ l₀. Normalize the sequence Q̃_l(x) to deduce the computed Legendre functions Q_l(x) using the analytic Q₀(x):

Q_l(x) = [Q₀(x)/Q̃₀(x)] Q̃_l(x).

The error in this case is approximately

[Q₀(x)/Q̃₀(x)] × 10⁻⁷.

Clearly, the powers of ten are arbitrary and could be chosen differently, but due care must be taken to work within the precision of the machine. The Legendre polynomials are a member of a class of polynomial approximations which are widely used. Their strengths, weaknesses, and utility are most clearly seen when we cast our analysis in terms of vector space theory.
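Putting the pieces together, a sketch of the backward-recurrence scheme for x > 1 (the crude choice of l₀ here stands in for the estimate (6.16)):

    import numpy as np

    def legendre_q(l_max, x, l0=None):
        # Downward form of (6.12): l Q_{l-1} = (2l+1) x Q_l - (l+1) Q_{l+1}
        if l0 is None:
            l0 = l_max + 20
        Qt = np.zeros(l0 + 1)
        Qt[l0 - 1] = 1.0e-7                     # seed values, cf. (6.17)
        for l in range(l0 - 1, 0, -1):
            Qt[l-1] = ((2*l + 1)*x*Qt[l] - (l + 1)*Qt[l+1]) / l
        Q0 = 0.5*np.log((x + 1.0)/(x - 1.0))    # analytic Q_0 fixes the scale
        return Qt[:l_max + 1] * (Q0 / Qt[0])

    print(legendre_q(3, 1.5))   # Q_0(1.5) = 0.5*ln(5) ≈ 0.8047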
6.3 INFINITE DIMENSIONAL VECTOR SPACES
Not all vector spaces one uses are finite dimensional. In particular, the vector space of states in Quantum Mechanics is infinite dimensional.¹ As an example, consider the set of piecewise continuous functions, PC[a, b], on the real interval [a, b]. If f, g ∈ PC[a, b] and α, β are numbers then αf + βg ∈ PC[a, b]. It is immediately obvious that PC[a, b] is a vector space over the real numbers. To make PC[a, b] into a normed linear space we will need an inner product. Let us try the following.

Definition 6.12 For f, g ∈ PC[a, b]|w,

⟨f|g⟩ = ∫ₐᵇ f(x)g(x)w(x) dx,  w(x) > 0,

where the "weight function," w, is continuous and always positive on [a, b].
Now, clearly, with this definition, for f, g, h ∈ PC[a, b]|w and α, β ∈ R,

⟨f|g⟩ = ⟨g|f⟩,
⟨αf + βg|h⟩ = α⟨f|h⟩ + β⟨g|h⟩.   (6.18)

Suppose we define a function f₀ which has the value 17 at 10,000 equally spaced points between a and b but is zero otherwise. Since we can always change the value of an integrand at a finite number of points without changing the value of the integral we must have

∫ₐᵇ f₀²(t)w(t) dt = 0,

but the function f₀ ≠ 0 at 10,000 points in [a, b]. At first sight it looks like we can't define a norm. However, we can rescue the situation by agreeing to a distinction between the "vector" f₀ and the function f₀.

Definition 6.13 We shall take two elements f, g of the vector space PC[a, b]|w to be equal if

⟨f - g|f - g⟩ = 0,

even if f(t) ≠ g(t) at a finite number of points.

With this agreement we have a normed linear space and can define a norm

‖f‖² = ⟨f|f⟩ = ∫ₐᵇ f²(t)w(t) dt;

¹Technically it is a Hilbert space, H, i.e., an infinite dimensional vector space with an inner product and associated norm, with the additional property that if {x_N} is a sequence in H and the difference ‖xₙ - xₘ‖ can be made arbitrarily small for n, m big enough then x_N must converge to a limit contained in H.
with this definition we can consider limits.
Definition 6.14

limₙ→∞ fₙ → f

if, given ε > 0 no matter how small, there exists an N s.t. for all n > N,

‖f - fₙ‖ < ε.
Let us consider the set PC[-1, 1] and define our weight function as

w(t) ≡ 1.

Now, we know that if we have a power series s.t.

Σⱼ₌₀ⁿ cⱼ xʲ = 0,   (6.19)

then cⱼ = 0 for all j. Thus, the set X = {xʲ}ⱼ₌₀ⁿ is a collection of linearly independent vectors. Now, I will apply the Gram–Schmidt orthogonalization process (Lemma A.9) to X with weight w ≡ 1 to create a new set of orthonormal polynomials {φₙ(x)}. I will adopt the notation of (A.6). Then,

e₀′ = 1,
⟨e₀′|e₀′⟩ = 2,
e₀ = φ₀(x) = √(1/2),
e₁′ = x - (1/2)∫₋₁¹ x dx = x,
e₁ = φ₁(x) = √(3/2) x,
e₂ = φ₂(x) = √(5/2) (3x²/2 - 1/2),
e₃ = φ₃(x) = √(7/2) (5x³/2 - 3x/2),
  ⋮   (6.20)

In fact, the set of polynomials I have constructed are directly related to the Legendre polynomials:

φₙ(x) = √[(2n + 1)/2] Pₙ(x).   (6.21)
Table 6.1: Different polynomial bases

Name        Interval     Weight
Legendre    (-1, 1)      1
Laguerre    (0, ∞)       x^α e^(-x)
Hermite     (-∞, ∞)      e^(-x²)
Chebyshev   (-1, 1)      (1 - x²)^(-1/2)
Jacobi      (-1, 1)      (1 - x)^α (1 + x)^β
The Legendre polynomials are orthogonal and differ only in norm from the set we got by using the Gram–Schmidt process. They satisfy

∫₋₁¹ Pₙ(x)Pₘ(x) dx = [2/(2n + 1)] δₙₘ,   (6.22)

and if f is any function in PC[-1, 1]|1 then f may be expanded

f(x) = Σₙ₌₀^∞ aₙ Pₙ(x), where
aₙ = [(2n + 1)/2] ∫₋₁¹ f(x)Pₙ(x) dx.   (6.23)

Remember, convergence in the norm might not be equivalent to point-wise convergence. There are other sets of orthogonal polynomials as well as the Legendre, defined on different ranges and with different weight functions; I list some of the most important ones in Table 6.1; for more details of their properties see [13].
6.3.1 ZEROS OF ORTHOGONAL POLYNOMIALS
Let {pₘ(x)} be a set of polynomials orthogonal over (a, b) with respect to the weight w(x) > 0, i.e.,

∫ₐᵇ pₘ(x)pₙ(x)w(x) dx = 0,   (6.24)

unless m = n.

Lemma 6.15 The mth polynomial has exactly m zeros, which are simple and lie in (a, b).

Proof. Suppose pₘ(x) changes sign n times in (a, b). From the fundamental theorem of algebra, n ≤ m. Let a₁, ..., aₙ be the distinct points where pₘ changes sign.
Then

qₙ(x) = Πᵢ₌₁ⁿ (x - aᵢ)

is a polynomial of order n, and if n < m,

∫ₐᵇ qₙ(x)pₘ(x)w(x) dx = 0.   (6.25)

But in the vicinity of a₁, (x - a₁)pₘ(x) does not change sign; indeed (x - a₁)(x - a₂)⋯(x - aₙ)pₘ(x) does not change sign in (a, b), hence

∫ₐᵇ (x - a₁)(x - a₂)⋯(x - aₙ)pₘ(x)w(x) dx ≠ 0.   (6.26)

This contradicts (6.25) unless n = m. So there are m distinct zeros: we have shown there are m sign changes, and m is the maximum number of roots a polynomial of order m can have! □
6.4 QUADRATURE

6.4.1 SIMPSON REVISITED
The Simpson rule formula (2.28),

∫₋₁¹ f(x) dx = (1/3)[f(-1) + 4f(0) + f(1)],

can be written

∫₋₁¹ f(x) dx = Σₙ₌₁³ w̄ₙ f(xₙ),
w̄₁ = 1/3 = w̄₃,
w̄₂ = 4/3,
x₁ = -1,  x₂ = 0,  x₃ = 1,   (6.27)

and is exact for all polynomials of order less than or equal to 3. Be careful not to confuse the weight function w used in the definition of the inner product (Definition 6.12) and the numbers w̄ₙ, which are also called weights in (6.27).
6.4.2 WEIGHTS AND NODES
We want to integrate

∫ₐᵇ f(x)w(x) dx.

We want to find a set of points {xᵢ}ᵢ₌₀ⁿ and weights w̄ᵢ such that

∫ₐᵇ f(x)w(x) dx ≈ Σᵢ₌₀ⁿ w̄ᵢ f(xᵢ)   (6.28)

is as near to exact as possible.
Let us begin by assuming we have points {xᵢ}ᵢ₌₀³ on (-1, 1) with w ≡ 1, and then choose our weights in such a way that the integral over linear, quadratic, and cubic polynomials is exact:

f(x) = 1:  ∫₋₁¹ dx = 2 = w̄₀ + w̄₁ + w̄₂ + w̄₃,
f(x) = x:  ∫₋₁¹ x dx = 0 = w̄₀x₀ + w̄₁x₁ + w̄₂x₂ + w̄₃x₃,
f(x) = x²: ∫₋₁¹ x² dx = 2/3 = w̄₀x₀² + w̄₁x₁² + w̄₂x₂² + w̄₃x₃²,
f(x) = x³: ∫₋₁¹ x³ dx = 0 = w̄₀x₀³ + w̄₁x₁³ + w̄₂x₂³ + w̄₃x₃³,

[1 1 1 1; x₀ x₁ x₂ x₃; x₀² x₁² x₂² x₃²; x₀³ x₁³ x₂³ x₃³] (w̄₀; w̄₁; w̄₂; w̄₃) = (2; 0; 2/3; 0),

which we can summarize in matrix form as

Vw̄ = y.

So once we have chosen our xᵢ's we can choose our w̄ᵢ's by inverting V. V is essentially the Vandermonde matrix that we met doing interpolation; in fact it is the transpose of the matrix V in (6.2), but this should not unduly concern you since

(Vᵀ)⁻¹ = (V⁻¹)ᵀ.

In theory, we can just keep building a bigger and bigger Vandermonde matrix — we need n weights and nodes to exactly integrate a polynomial of order n — but if we make it "too big," for an arbitrary set of xᵢ the weights vary wildly and the sum in (6.28) becomes numerically unstable.
6.4.3 GAUSSIAN QUADRATURE

Example 6.16 Consider the polynomial

f(x) = 3x⁵ + 3x² + 3x + 3.

Divide f(x) by the Legendre polynomial

p₃(x) = (5x³ - 3x)/2.

Carrying out the polynomial long division gives

f(x) = p₃(x)q(x) + r(x),

with

q(x) = (6/5)x² + 18/25,  r(x) = 3x² + (102/25)x + 3,

where q(x), r(x) are polynomials of order 2.
In the same way, if f(x) is a polynomial of order no more than 2n - 1 we can write

f(x) = pₙ(x)q(x) + r(x),

where q and r are polynomials of order n - 1 or less. Suppose now that pₙ, the polynomial of order n, is a member of a set of polynomials orthogonal over (a, b) with respect to the weight w(x) > 0. We want to find the weights and nodes such that

∫ₐᵇ f(x)w(x) dx = Σᵢ₌₁ⁿ w̄ᵢ f(xᵢ).

Now, since for m < n pₙ is orthogonal to the subspace of monomials constructed from x⁰, x, ..., xᵐ, we have

∫ₐᵇ pₙ(x)q(x)w(x) dx = 0.   (6.29)

If we choose the xᵢ to be the n zeros of pₙ then

Σᵢ₌₁ⁿ w̄ᵢ pₙ(xᵢ)q(xᵢ)

is exactly zero, the same as (6.29). r(x) is a polynomial of order n - 1, so we can find n weights w̄ᵢ so that

∫ₐᵇ r(x)w(x) dx = Σᵢ₌₁ⁿ w̄ᵢ r(xᵢ)

is exact. Therefore,

∫ₐᵇ f(x)w(x) dx = ∫ₐᵇ w(x)[pₙ(x)q(x) + r(x)] dx = Σᵢ₌₁ⁿ w̄ᵢ r(xᵢ) = Σᵢ₌₁ⁿ w̄ᵢ f(xᵢ).
We know the nodes xᵢ, and now we will look for a closed form expression for the weights. Since r(x) is a polynomial of order less than n, it is fixed by the values it attains at n different points, and we can use our Lagrange interpolation to write

r(x) = Σᵢ₌₁ⁿ Lᵢ(x) r(xᵢ),
⇒ ∫ₐᵇ w(x)r(x) dx = Σᵢ₌₁ⁿ [∫ₐᵇ Lᵢ(x)w(x) dx] r(xᵢ),
⇒ w̄ᵢ = ∫ₐᵇ w(x)Lᵢ(x) dx.   (6.30)

Note that the weights w̄ᵢ and nodes xᵢ depend only on the polynomial pₙ and not on the function f we want to integrate, and thus can be tabulated. The choice of the polynomial basis we might want to use depends on the interval (a, b) and the integral. Suppose, for example, you wanted to evaluate an integral of the form

∫₋₁¹ f(x)/√(1 - x²) dx;

then the optimum choice is a Gauss–Chebyshev integration, see Table 6.1, where the weight function is

w(x) = 1/√(1 - x²).
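For the Legendre weight w(x) = 1 on (-1, 1), tabulated nodes and weights are available in standard libraries; a brief sketch tying this back to Example 6.16 (my own illustration, using NumPy's Gauss–Legendre routine):

    import numpy as np

    # n = 3 nodes (the zeros of P_3) integrate polynomials up to order 2n-1 = 5
    nodes, weights = np.polynomial.legendre.leggauss(3)

    f = lambda x: 3*x**5 + 3*x**2 + 3*x + 3   # the polynomial of Example 6.16
    print(np.sum(weights * f(nodes)))          # exact: 3*(2/3) + 3*2 = 8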
CHAPTER 7

Sturm–Liouville Theory
In classical as well as quantum physics many problems arise in the form of boundary value problems involving second order ordinary differential equations; very frequently these problems are of Sturm–Liouville type. Such problems have a particular place in the quantum theory because of the self-adjoint nature of the differential operators involved.

Definition 7.1 A second order differential equation of the form

-d/dx[p(x) dy(x)/dx] + q(x)y(x) = λ w(x)y(x),  x ∈ [a, b],

with p, q, and w specified and p(x) > 0, w(x) > 0 on [a, b], is said to be a Sturm–Liouville equation.
7.1 EIGENVALUES
Note that both y(x) and λ are unspecified in Definition 7.1, so the solution of the Sturm–Liouville equation is essentially an eigenvalue problem. The Sturm–Liouville differential equation as written down above is a purely formal entity in the absence of boundary conditions. We can define a new Hilbert space of square integrable functions on [a, b], L²[a, b]|w, with an inner product

⟨f|g⟩ = ∫ₐᵇ f̄(x)g(x)w(x) dx,   (7.1)

where the "weight function" w(x) is real and positive.

Definition 7.2 We define a Sturm–Liouville differential operator, L̂, to be an operator

L̂ = (1/w(x)) [-d/dx(p(x) d/dx) + q(x)].
In order to properly define L̂ on our Hilbert space we need to add boundary conditions. If we can find conditions such that L̂ is self adjoint, then:
(i) its eigenvalues will be real (Lemma A.24),
(ii) its eigenfunctions corresponding to distinct eigenvalues will be orthogonal (Lemma A.25).
Let us look to see what such boundary conditions would look like. Consider

⟨f|L̂g⟩ = ∫ₐᵇ f̄(x) {-d/dx[p(x)g′(x)] + q(x)g(x)} dx

(the weight w(x) in the inner product cancels the 1/w(x) in L̂). Integrating the first term on the right by parts we have

⟨f|L̂g⟩ = -f̄(x)p(x)g′(x)|ₐᵇ + ∫ₐᵇ {f̄′(x)p(x)g′(x) + f̄(x)q(x)g(x)} dx.

In the same way,

⟨L̂f|g⟩ = -f̄′(x)p(x)g(x)|ₐᵇ + ∫ₐᵇ {f̄′(x)p(x)g′(x) + f̄(x)q(x)g(x)} dx,

⇒ ⟨f|L̂g⟩ - ⟨L̂f|g⟩ = -f̄(x)p(x)g′(x)|ₐᵇ + f̄′(x)p(x)g(x)|ₐᵇ.

To get a self adjoint operator we would need

-f̄(b)p(b)g′(b) + f̄′(b)p(b)g(b) + f̄(a)p(a)g′(a) - f̄′(a)p(a)g(a)
= [p(x)(df̄(x)/dx g(x) - f̄(x) dg(x)/dx)]ₐᵇ = 0.   (7.2)
If we define our differential operator to be the formal differential operator L̂ together with boundary conditions of the form (7.2), then we have a self adjoint operator on L²[a, b]|w. The boundary conditions (7.2) would be satisfied if, for example, we were to consider functions g defined on [a, b] satisfying

α₁g′(a) + α₂g(a) = 0,
β₁g′(b) + β₂g(b) = 0,   (7.3)

where the αᵢ and βᵢ are constants, not both zero. It is important to recognize that we must choose the same constants for all our functions. We will describe (7.3) as "regular" boundary conditions. If the function p(x) is such that p(a) = p(b), then we can impose alternative "periodic boundary conditions":

g(a) = g(b),
g′(a) = g′(b).

In either case the Sturm–Liouville differential operator LO is self adjoint with real eigen-
values and orthogonal eigenvectors provided only that the eigenvalues are non degenerate.
But there is more. It can be shown [22] that in the regular case.
• The eigenvalues are simple; in other words the eigenfunctions are non-degenerate and
thus mutually orthogonal.

• The set of eigenvalues is countably infinite and can be arranged in a monotonically


increasing sequence bounded below:

0 < 1 <    < n <   


lim n D 1:
n!1

• The eigenfunction corresponding to the nth eigenvalue has n zeros on the open interval
.a; b/.

• The orthonormal set of eigenfunctions form a basis for the Hilbert space.
For the periodic problem most, but not all, of the above results still hold. The eigenvalues will still be real, and the eigenfunctions orthogonal for different eigenvalues, but there may be degeneracy.
Consider the periodic Sturm–Liouville problem

-(ℏ²/2m) d²ψ(x)/dx² = Eψ,
ψ(0) = ψ(L),
ψ′(0) = ψ′(L).

This has eigenvalues

Eₙ = 4π²ℏ²n²/(2mL²),

and for each Eₙ, n ≠ 0, there are two linearly independent solutions,

cos(2πnx/L),  sin(2πnx/L).

We can always use our Gram–Schmidt procedure to find orthogonal vectors that span the subspace corresponding to a degenerate eigenvalue, so even in the periodic case we can find an orthonormal set of eigenfunctions that form a basis, but the sequence of eigenvalues is not strictly monotonically increasing.
7.2 LEAST SQUARES APPROXIMATION
Suppose g is a member of L²[a, b]|w; then we can expand it in terms of the orthonormal eigenfunctions of our Sturm–Liouville operator, {φₙ}ₙ₌₁^∞:

g(x) = Σₙ₌₁^∞ cₙφₙ(x),
‖g‖² = ⟨g|g⟩ = Σₙ₌₁^∞ |cₙ|².   (7.4)

Obviously, computers cannot deal with infinite sums. Suppose we approximate g by a function f which contains only a finite number of orthonormal eigenfunctions:

f(x) = Σₙ₌₁ᴺ bₙφₙ(x).   (7.5)

We want to choose the constants bᵢ s.t. f is the "best" approximation to g; just as we did in Chapter 5 we will look for the least squares fit, that is, we want to minimize ‖f - g‖²:

‖f - g‖² = ⟨f|f⟩ + ⟨g|g⟩ - ⟨f|g⟩ - ⟨g|f⟩
         = Σᵢ₌₁ᴺ [|bᵢ|² - b̄ᵢcᵢ - c̄ᵢbᵢ] + ‖g‖².   (7.6)
We want to choose our set of N coefficients bᵢ so that the term in the square brackets is smallest. Write

F(b̄, b) = Σᵢ₌₁ᴺ [b̄ᵢbᵢ - b̄ᵢcᵢ - c̄ᵢbᵢ].

A necessary condition for a minimum is that F(b̄, b) be stationary w.r.t. bᵢ and b̄ᵢ:

∂F/∂bᵢ = b̄ᵢ - c̄ᵢ = 0,
∂F/∂b̄ᵢ = bᵢ - cᵢ = 0.   (7.7)

These derivatives vanish if cᵢ = bᵢ, and further

∂²F/∂bᵢ∂bⱼ = 0,
∂²F/∂b̄ᵢ∂b̄ⱼ = 0,
∂²F/∂bᵢ∂b̄ⱼ = δᵢⱼ ≥ 0.   (7.8)

The extremum is a minimum! Therefore, if we wish to approximate a function g(x) by representing it as a linear combination of just a finite number of eigenfunctions of some Sturm–Liouville operator, the best we can do is to choose exactly the same coefficients as in the true infinite expansion.
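A small numerical illustration of this result (my own construction, using the Legendre basis of Chapter 6 and simple trapezoidal quadrature for the integrals):

    import numpy as np
    from numpy.polynomial import legendre

    f = np.exp
    x = np.linspace(-1.0, 1.0, 4001)

    def a_n(n):
        # True expansion coefficient (6.23): a_n = (2n+1)/2 * integral of f P_n
        Pn = legendre.legval(x, [0.0]*n + [1.0])
        return 0.5*(2*n + 1)*np.trapz(f(x)*Pn, x)

    coeffs = [a_n(n) for n in range(4)]
    approx = legendre.legval(x, coeffs)
    # The 4-term sum with the true coefficients is already the least squares
    # optimum; perturbing any coefficient only increases the error norm.
    print(np.sqrt(np.trapz((approx - f(x))**2, x)))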
CHAPTER 8

Case Study: The Quantum Oscillator
The harmonic oscillator and the hydrogen atom are two of the very few quantum systems which admit simple analytic solutions. All numerical methods need to be benchmarked against these two cases. In this chapter I am going to show you how the understanding of Sturm–Liouville problems you acquired in the preceding chapter can be applied to the computation of the oscillator eigenvalues and eigenfunctions.
8.1 NUMERICAL SOLUTION OF THE ONE DIMENSIONAL SCHRÖDINGER EQUATION
Let us consider the Schrödinger equation in one dimension [23]

-(ℏ²/2m) d²ψ(x)/dx² + V(x)ψ(x) = Eψ(x).   (8.1)

We are looking for "bound states," i.e., states where

limₓ→±∞ ψ(x) = 0.

We want to find both the eigenvalues and the eigenfunctions. As you know, an eigenvalue problem can only have solutions for certain values of E. Suppose that E is such a value; then the Schrödinger equation (8.1) can be written

d²ψ(x)/dx² + k²(x)ψ(x) = 0,
k(x) = √{(2m/ℏ²)[E - V(x)]},
ψ(x) → 0 as x → ±∞.   (8.2)

At this early stage it is helpful to focus on (8.2) and to see how much of the character of the solution we can deduce before we start calculating.
We immediately recognize that we have a regular Sturm–Liouville problem, so we expect that the eigenvalues will be real, non-degenerate, bounded below, and that they can be ordered

E₀ < E₁ < E₂ < ⋯   (8.3)
Further, the eigenfunction corresponding to the nth eigenvalue will have exactly n zeros. If the potential is symmetric, V(x) = V(-x), then we can make use of the following result.

Lemma 8.1 Suppose the potential V is such that

V(-x) = V(x)

and each bound state level corresponds to only one independent solution; then

ψ(-x) = ±ψ(x).

If we have the positive sign we say the wave function has even parity, and if negative we say that we have odd parity.

Proof. Suppose ψ(x) is a solution of the Schrödinger equation corresponding to the energy E; then

-(ℏ²/2m) d²ψ(x)/dx² + V(x)ψ(x) = Eψ(x).   (8.4)

Now we can always replace x by -x in (8.4), and remembering that V(-x) = V(x) we see that ψ(-x) is also a solution corresponding to the same eigenvalue; since the eigenvalues are non-degenerate it follows that ψ(x) and ψ(-x) must be linearly dependent, i.e.,

ψ(x) = Cψ(-x)

for all x. Replacing x by -x once more yields

ψ(x) = C²ψ(x);

hence C² = 1, and therefore

ψ(-x) = ±ψ(x).   (8.5)  □
For a given value of the energy E, we can divide space into three regions, depending on the sign of k²(x) in (8.2). This has something of the character of the classical problem we discussed in Chapter 4.
In region 2,

E ≥ V,  k²(x) > 0,

and we would expect to find an oscillatory solution; but in regions 1 and 3,

E < V,  k²(x) < 0.

Asymptotically, we expect to find an exponentially decaying solution; see Appendix B. We expect to find the nodes of the wave function concentrated in region 2. Classically, there will be no solution in regions 1 and 3. The points x_tp for which

E - V(x_tp) = 0   (8.6)
mark the "turning points" between the classically allowed and quantum regions. We can make use of our knowledge of Sturm–Liouville equations to create a computer code to get an estimate of the eigenvalues and eigenfunctions. We know that the eigenfunction corresponding to the nth eigenvalue has n zeros; smaller eigenvalues will have fewer zeros, bigger eigenvalues will have more. We expect to find all the zeros in region 2. Our potential is symmetric, therefore we know that the eigenfunctions will be either symmetric or antisymmetric; either way, if x₀ > 0 is a nodal point then -x₀ is also a nodal point. Further, if ψ(x) has odd parity:

ψ(-h) = -ψ(h),
ψ(-h) + ψ(h) = 0,
⇒ 2ψ(0) + O(h²) = 0,
⇒ ψ(0) = 0;   (8.7)

if ψ(x) has even parity,

ψ(-h) = ψ(h),
[ψ(h) - ψ(-h)]/2h → 0,
⇒ ψ′(0) = 0.   (8.8)

For ψ either even or odd, we will have exactly the same number of zeros for x > 0 and x < 0. If ψ is an odd function it must, as we have just seen, have a zero at the origin, so it must have an odd number of zeros; and if ψ is an even function it must have an even number of zeros. (If an even function had an odd number of nodes, there would have to be one node at the origin, since there are exactly the same numbers of nodes for x > 0 and x < 0; so ψ(0) = 0 = ψ′(0), and therefore the leading term in the Taylor expansion is just x²ψ″(0)/2 = -x²k²(0)ψ(0)/2 = 0, and so on for the higher terms, i.e., the function is identically zero.)
In summary:
• if the function has m nodes and is odd then there must be one node at the origin and (m - 1)/2 nodes for x > 0;
• if the function has m nodes and is even then there must be m/2 nodes for x > 0;
either way we need only solve for x ≥ 0.
8.2 NUMERICAL SOLUTION FOR THE OSCILLATOR
While the quantum oscillator problem admits a relatively simple analytic solution (see Appendix B), our ambition here is to find an efficient numerical approach to calculate the eigenfunctions and eigenvalues.
The simplest approach is called the "shooting method." It searches for a function with a pre-determined number, n, of zeros. It is assumed that the actual eigenvalue Eₙ lies somewhere in an energy range [E_min, E_max]. An energy E is taken to be

E = (E_max + E_min)/2.   (8.9)

The energy range should contain the desired eigenvalue Eₙ. The wave function is integrated starting from x = 0 in the direction of positive x; at the same time, the number of nodes, m (i.e., of changes of sign of the function), is counted. If the number of nodes is larger than n, E is too high; if the number of nodes is smaller than n, E is too low. A new interval is defined:

E_max = E if m > n,
E_min = E if m < n.   (8.10)

A replacement E is found from (8.9) and the procedure is repeated; when the energy interval is smaller than a pre-determined threshold, we assume that convergence has been reached.
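A schematic Python version of this bisection loop might look as follows (count_nodes stands for the outward integration and node count, and is my own placeholder name):

    def bisect_energy(n, count_nodes, e_min, e_max, tol=1e-10):
        # Bisection on the energy interval [e_min, e_max], following (8.9)-(8.10).
        while e_max - e_min > tol:
            e = 0.5*(e_max + e_min)
            if count_nodes(e) > n:
                e_max = e      # too many nodes: E too high
            else:
                e_min = e      # too few nodes: E too low
        return 0.5*(e_max + e_min)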
I wrote a code to study the n = 3 eigenvalue and eigenfunction of the harmonic oscillator potential, where units were chosen such that ℏ = m = ω = 1 and the range of x was taken to be

-10 ≤ x ≤ 10,

which was divided into 300 equally spaced intervals. The potential V(x) was calculated at each grid point and stored in an array V(i); then the initial values of E_max and E_min were determined by

E_max = maxᵢ |V(i)|,
E_min = minᵢ |V(i)|.
Our differential equation is of Numerov form (3.27),

ψ″(x) = -g(x)ψ(x) + s(x),

where we take

s(x) = 0,
g(x) = (2m/ℏ²)[E - V(x)];
Table 8.1: Results from Numerov

Iteration   Energy Eigenvalue
1    25.000000000000000
2    12.500000000000000
3    6.2500000000000000
4    3.1250000000000000
5    4.6875000000000000
6    3.9062500000000000
7    3.5156250000000000
8    3.3203125000000000
9    3.4179687500000000
10   3.4667968750000000
⋮    ⋮
35   3.4999996962142177
36   3.4999996954866219
37   3.4999996958504198
38   3.4999996960323188
39   3.4999996961232682
consequently I used the formula (3.36),
$$\psi_{n+1} = \frac{2\psi_n\left(1 - \frac{5h^2}{12}g_n\right) - \psi_{n-1}\left(1 + \frac{h^2}{12}g_{n-1}\right)}{1 + \frac{h^2}{12}g_{n+1}} + O(h^6),$$
to perform the numerical integration.
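
As an illustration, one Numerov sweep can be coded in a few lines of Python; this is a sketch under the conventions above (uniform grid, $s(x) = 0$), not a transcription of the author's program.

    import numpy as np

    def numerov_sweep(psi0, psi1, g, h):
        """Integrate psi'' = -g(x) psi on a uniform grid using Eq. (3.36).

        psi0, psi1 : psi at the first two grid points.
        g          : array g(x_i) = (2m/hbar^2)(E - V(x_i)).
        h          : grid spacing.
        """
        psi = np.empty_like(g)
        psi[0], psi[1] = psi0, psi1
        f = 1.0 + (h * h / 12.0) * g      # Numerov weights
        for i in range(1, len(g) - 1):
            # Eq. (3.36); note 12 - 10 f_n = 2 - (10/12) h^2 g_n
            psi[i + 1] = ((12.0 - 10.0 * f[i]) * psi[i] - f[i - 1] * psi[i - 1]) / f[i + 1]
        return psi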


It was necessary only to integrate from 0 in the direction of positive $x$. Because I was focused
on the $n = 3$ eigenvalue I knew the eigenfunction would have odd parity with its first zero
at the origin. Consequently, I took $\psi_0$ to be 0, $\psi_1 = h$, $\psi_{-1} = -h$. The choice of $\psi_1$ is actually
somewhat arbitrary but is consistent with the lowest order Taylor expansion.¹ Any arbitrariness
I include will be removed once I normalize the wavefunction. I took the threshold to be fixed at
$|E_{max} - E_{min}| < 10^{-10}$. The output is shown in Table 8.1.
The correct eigenvalue 3.5 is recovered quite quickly. In Figure 8.2, the associated eigenfunction
is compared with the analytic solution. In the classically accessible region the analytic
¹Had I been interested in an even eigenfunction I would have taken $\psi_0$ to be arbitrary but finite and set $\psi_{+1} = \psi_{-1}$, so I could
deduce $\psi_{+1}$ from the Numerov formula, (3.36).

[Figure 8.1 sketches the potential $V(x)$ and a trial energy $E$, dividing the $x$ axis into regions 1, 2, and 3.]

Figure 8.1: Given a test energy $E$, space is divided into 3 regions. Regions 1 and 3, with $E \leq V$, are classically forbidden.

and numerical solutions are in moderate agreement but they diverge dramatically once we pass
the turning points.
The problem with this approach lies in the fact that the code tried to integrate from $x = 0$
in region 2 to large $x$ in region 3, but as we have seen the formal differential equation allows for an exponentially increasing solution, which we don't want, as well as an
exponentially decreasing solution, which we do want. If even a tiny amount of the exponentially
increasing solution (due to numerical noise, for instance) is present at the turning point, the
integration algorithm will inexorably make it grow in the classically forbidden region. In order
to deal with this problem we can go "far" into the quantum region, where we can reasonably
assume the wave function is close to zero, and integrate backward to the turning point, where we
"match" to the solution integrated from 0 in the classical region.

At the turning point we require that the function got by integrating in, from large $x_{max}$ in
region 3, $\psi^{(3)}(x)$, matches the solution got from integrating out from 0, $\psi^{(2)}(x)$.
Matching means that we require that both the functions and their first derivatives are continuous.
If we have found the correct eigenvalue then we have our solution.

A second code was written for the harmonic oscillator. Two integrations were performed:
a forward recursion, in region 2, starting from $x = 0$, and a backward one, in region 3, starting
from $x_{max}$. The matching point was chosen to be the grid point, $i_{tp}$, nearest to the turning point $x_{tp}$;
note that $x_{tp}$ will vary with the choice of $E$. The outward integration is performed until grid point
$i_{tp}$, yielding a function $\psi^{(2)}(x)$ defined in region 2 (of course, because of symmetry, we only need
to integrate from 0 to $x_{tp}$); the number of changes of sign is counted in the same way as before.
[Figure 8.2 plots $\phi_3(x)$ against $x$ for $-10 \leq x \leq 10$.]

Figure 8.2: Comparison between the analytic solution, red dashed, and the numerical solution, blue dotted, using the node counting approach.

We note that there is no need to look for changes of sign beyond $x_{tp}$: we expect that in
the classically forbidden region there will not be any nodes (no oscillations, just exponentially
decaying or increasing solutions).

If the number of nodes is the expected one, the code starts to integrate inward from the
rightmost points. It goes one grid point beyond $x_{max}$, to the point $n + 1$ say, and then puts
$$\psi_{n+1} = 0, \qquad \psi_n = h,$$
and then uses the Numerov formula to integrate back to $i_{tp}$. Continuity at this point is easily achieved
by simply scaling the solution in region 3 by
$$\frac{\psi^{(2)}(i_{tp})}{\psi^{(3)}(i_{tp})}.$$

Forcing the two solutions to have identical first derivatives is a little more demanding. If we use
our Taylor expansion on both functions, then
$$\psi^{(3)}(i_{tp}+1) = \psi^{(3)}(i_{tp}) + \psi'^{(3)}(i_{tp})h + \frac{1}{2}\psi''^{(3)}(i_{tp})h^2 + O(h^3),$$
$$\psi^{(2)}(i_{tp}-1) = \psi^{(2)}(i_{tp}) - \psi'^{(2)}(i_{tp})h + \frac{1}{2}\psi''^{(2)}(i_{tp})h^2 + O(h^3). \tag{8.11}$$
[Figure 8.3 plots $\phi_3(x)$ against $x$ for $-6 \leq x \leq 6$.]

Figure 8.3: Comparison between the analytic solution, red solid, and the numerical solution
using the extended approach to $O(h^3)$, blue crosses, for the $n = 3$ eigenfunction of the quantum
oscillator.

Now by construction,
$$\psi^{(3)}(i_{tp}) = \psi^{(2)}(i_{tp}) = \psi(i_{tp}),$$
and from the Numerov equation we have
$$\psi''^{(3)}(i_{tp}) = -\psi(i_{tp})g_{i_{tp}} = \psi''^{(2)}(i_{tp}).$$
Therefore, to $O(h^2)$,
$$\psi'^{(3)}(i_{tp}) - \psi'^{(2)}(i_{tp}) = \frac{\psi^{(2)}(i_{tp}-1) + \psi^{(3)}(i_{tp}+1) - \left[2 - g_{i_{tp}}h^2\right]\psi(i_{tp})}{h}. \tag{8.12}$$

The jump condition (8.12) depends on our choice of $E$ both through the functions $\psi^{(2)}$
and $\psi^{(3)}$ and through $g_{i_{tp}}$, and thus (8.12) can be solved for zero difference by successive bisections;
this was the approach incorporated in the second code.

In Figure 8.3, a comparison is presented between the analytic solution and the numerical
solution using the modified code, the numerical results of which are visually indistinguishable
from the analytic result.
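
A sketch of the derivative test, under the same assumptions as before (Python, my illustration), makes the structure of the second code clear:

    def derivative_jump(psi2, psi3, g_tp, h):
        """First-derivative mismatch at the matching point, Eq. (8.12).

        psi2 : outward solution; psi2[-1] is psi at i_tp, psi2[-2] at i_tp - 1.
        psi3 : inward solution, already rescaled so psi3[0] = psi2[-1];
               psi3[1] is the value at i_tp + 1.
        g_tp : g evaluated at the matching point.
        """
        psi_tp = psi2[-1]
        return (psi2[-2] + psi3[1] - (2.0 - g_tp * h * h) * psi_tp) / h

Bisecting the energy on the sign of this jump, once the node count is correct, drives the mismatch to zero.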

CHAPTER 9

Variational Principles
Minimization principles have a special place in numerical methods and they form one of the
most wide-ranging means of formulating mathematical models governing the equilibrium con-
figurations of physical systems.

9.1 RAYLEIGH–RITZ THEOREM

Theorem 9.1 Suppose $\hat{H}$ is a self-adjoint operator on a Hilbert space, whose eigenfunctions, $\phi_n$,
form a normalized basis and whose eigenvalues are bounded below and can be ordered:
$$E_0 < E_1 \leq E_2 \leq \cdots \leq E_n \leq E_{n+1} \leq \cdots,$$
where the smallest eigenvalue is non-degenerate. Suppose $\psi$ is some arbitrary normalized state; then
$$\langle\psi|\hat{H}\psi\rangle \geq E_0,$$
with equality iff $\psi = \phi_0$.

Proof. Since $\hat{H}$ is self-adjoint its eigenvalues are real and the basis of eigenfunctions can be
chosen such that
$$\langle\phi_n|\phi_m\rangle = \delta_{nm}.$$
Since the $\phi_n$ form a basis and $\psi$ has unit norm,
$$\psi = \sum_n c_n\phi_n, \qquad \langle\psi|\psi\rangle = \sum_{mn}\bar{c}_mc_n\langle\phi_m|\phi_n\rangle = \sum_n|c_n|^2 = 1.$$
Hence,
$$\langle\psi|\hat{H}\psi\rangle = \sum_{n,m=0}^{\infty}\bar{c}_mc_n\langle\phi_m|\hat{H}\phi_n\rangle = \sum_{n,m=0}^{\infty}\bar{c}_mc_nE_n\langle\phi_m|\phi_n\rangle = \sum_n|c_n|^2E_n = E_0\sum_n|c_n|^2 + \sum_n|c_n|^2\left[E_n - E_0\right] \geq E_0. \qquad\square$$
Clearly, the conditions of the theorem apply to any regular Sturm–Liouville problem as
we discussed in Chapter 7. We further note that if $\phi_0$ is the actual normalized state corresponding
to $E_0$ then
$$\langle\phi_0|\hat{H}\phi_0\rangle = E_0, \tag{9.1}$$
and since $E_n > E_0$ for all $n > 0$, if $\psi$ is any other function then
$$\langle\psi|\hat{H}\psi\rangle > E_0. \tag{9.2}$$
This theorem is known as the Rayleigh–Ritz Theorem. The result is not restricted to the
finite dimensional case, so it is equally valid for any of the infinite dimensional spaces of functions
we have met so far. Further, if we know, from experiment say, the value of $E_0$, then if we can find
$\psi$ that minimizes $\langle\psi|\hat{H}\psi\rangle$ we will have found the ground state wavefunction, $\phi_0$. This observation
underlies the various variational approaches to structure studies of many body quantum
mechanical systems [23–25].
Now consider a family of states, $\{\psi(\alpha)\}$, depending on real parameters
$$\alpha_1, \ldots, \alpha_N;$$
we can relax our assumption that the states are normalized and define
$$E(\alpha) = \frac{\langle\psi(\alpha)|H|\psi(\alpha)\rangle}{\langle\psi(\alpha)|\psi(\alpha)\rangle}. \tag{9.3}$$
$E(\alpha)$ is known as the Rayleigh–Ritz quotient. From Theorem 9.1 we have that $E(\alpha) \geq E_0$, and
now we try to make $E(\alpha)$ as small as possible. The condition that the real function $E(\alpha)$
be stationary is [11]
$$\frac{\partial E(\alpha)}{\partial\alpha_i} = 0. \tag{9.4}$$
Example 9.2 Suppose we want to find the ground state energy of the one-dimensional harmonic oscillator
$$\hat{H} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \frac{1}{2}m\omega^2x^2. \tag{9.5}$$
Now we already know the exact answer, but let's try out our variational approach. We pick
our trial function to be the "Gaussian"
$$\psi_T(x) = Ae^{-bx^2}, \tag{9.6}$$
where $b$ is our variational parameter. Our constraint is:
$$1 = |A|^2\int_{-\infty}^{\infty}e^{-2bx^2}dx = |A|^2\sqrt{\frac{\pi}{2b}} \quad\Rightarrow\quad A = \left(\frac{2b}{\pi}\right)^{1/4}. \tag{9.7}$$
Then
$$\langle\psi_T|\hat{H}\psi_T\rangle = |A|^2\left[-\frac{\hbar^2}{2m}\int_{-\infty}^{\infty}e^{-bx^2}\frac{d^2(e^{-bx^2})}{dx^2}dx + \frac{m\omega^2}{2}\int_{-\infty}^{\infty}x^2e^{-2bx^2}dx\right] = \frac{\hbar^2b}{2m} + \frac{m\omega^2}{8b}. \tag{9.8}$$
We have only one free parameter, $b$:
$$f(b) = \langle\psi_T|\hat{H}\psi_T\rangle = \frac{\hbar^2b}{2m} + \frac{m\omega^2}{8b},$$
$$\frac{df(b)}{db} = \frac{\hbar^2}{2m} - \frac{m\omega^2}{8b^2} = 0 \quad\Rightarrow\quad b = \frac{m\omega}{2\hbar},$$
$$\Rightarrow E_T = \frac{1}{2}\hbar\omega, \qquad \psi_T = \left(\frac{m\omega}{\pi\hbar}\right)^{1/4}\exp\left(-\frac{m\omega x^2}{2\hbar}\right). \tag{9.9}$$
We have unearthed the "exact" solution and we can't do any better.
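
In units with $\hbar = m = \omega = 1$, the function $f(b) = b/2 + 1/(8b)$ of (9.8) can be minimized numerically as a sanity check; a minimal sketch (mine, not the book's) using scipy:

    from scipy.optimize import minimize_scalar

    # Rayleigh-Ritz energy of the Gaussian trial function, Eq. (9.8),
    # in units hbar = m = omega = 1.
    f = lambda b: 0.5 * b + 1.0 / (8.0 * b)

    res = minimize_scalar(f, bracket=(0.1, 1.0))
    print(res.x, res.fun)   # approx. b = 0.5 and E_T = 0.5, cf. Eq. (9.9)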

More generally, consider the Schrödinger operator in one dimension
$$\hat{H} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x).$$
[Figure 9.1 sketches three paths, $y = y_1(x)$, $y = y(x)$, and $y = y_2(x)$, joining the points $a$ and $b$.]

Figure 9.1: The points $a$ and $b$ can be connected by an infinite number of different paths; in this
figure we show just 3.

Define
$$I(\bar{\psi}, \psi) = \langle\psi|\hat{H}\psi\rangle = \int_{-\infty}^{\infty}\bar{\psi}(x)\left[-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x)\right]\psi(x)dx. \tag{9.10}$$
For any given $\psi(x)$ and $\bar{\psi}(x)$ the integral $I$ in (9.10) is just a number. Our task is to find,
out of the infinity of possible $\psi$'s, the $\psi$ that minimizes $I(\bar{\psi}, \psi)$ while at the same time satisfying
the constraint that
$$\langle\psi|\psi\rangle = \int_{-\infty}^{\infty}\bar{\psi}(x)\psi(x)dx = 1. \tag{9.11}$$
This is the basic problem from the calculus of variations.

9.2 THE EULER–LAGRANGE EQUATIONS

Suppose $F$ is a function of an independent variable $x$, a dependent variable $y(x)$, and its derivative
$y'(x) \equiv \frac{dy(x)}{dx}$. Let $y = f(x)$ define a path, i.e., a curve in two-space, and suppose the points
$a = f(x_a)$ and $b = f(x_b)$ are on this path. The integral
$$I = \int_{x_a}^{x_b}F\left(y, \frac{dy}{dx}, x\right)dx \tag{9.12}$$
is just a number for a given path $y(x)$. The question is how to find a particular path $y(x)$, out
of the infinity of possible paths, for which $I$ is smallest. Assume $y(x)$ is that path; then
define a set of varied curves around it,
$$Y(x) = y(x) + \epsilon\eta(x), \tag{9.13}$$
where $\eta(x_a) = \eta(x_b) = 0$ and $\eta$ is as differentiable as we like. Then
$$I(\epsilon) = \int_{x_a}^{x_b}F\left(Y, \frac{dY}{dx}, x\right)dx \tag{9.14}$$
defines a function
$$I: \mathbb{R} \to \mathbb{R}, \qquad \epsilon \mapsto I(\epsilon).$$
We are assuming that $I(\epsilon)$ takes its extremum value when $\epsilon = 0$, i.e., we want
$$\left.\frac{dI(\epsilon)}{d\epsilon}\right|_{\epsilon=0} = 0. \tag{9.15}$$
Now:
$$I(\epsilon) = \int_{x_a}^{x_b}F(y + \epsilon\eta, y' + \epsilon\eta', x)dx; \tag{9.16}$$
expand in a Taylor series in $\epsilon$,
$$I(\epsilon) = \int_{x_a}^{x_b}F(y, y', x)dx + \int_{x_a}^{x_b}\left(\frac{\partial F}{\partial y}\epsilon\eta + \frac{\partial F}{\partial y'}\epsilon\eta'\right)dx + O(\epsilon^2). \tag{9.17}$$
Differentiating with respect to $\epsilon$ and putting $\epsilon = 0$ we have, for all $\eta$,
$$\left.\frac{dI(\epsilon)}{d\epsilon}\right|_{\epsilon=0} = \int_{x_a}^{x_b}\left(\frac{\partial F}{\partial y}\eta + \frac{\partial F}{\partial y'}\eta'\right)dx. \tag{9.18}$$
Integrating by parts, then,
$$\left.\frac{dI(\epsilon)}{d\epsilon}\right|_{\epsilon=0} = \int_{x_a}^{x_b}\left(\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'}\right)\eta\,dx + \left.\frac{\partial F}{\partial y'}\eta\right|_{x_a}^{x_b}. \tag{9.19}$$
The second term on the right is zero since $\eta(x_a) = \eta(x_b) = 0$. Since $\eta$ is arbitrary, it follows
that a necessary condition for an extremum is that
$$\frac{\partial F}{\partial y} - \frac{d}{dx}\frac{\partial F}{\partial y'} = 0. \tag{9.20}$$
Equation (9.20) is known as the Euler–Lagrange equation. There are two special cases when it is particularly simple to solve.
(i) $F$ has no explicit dependence on $y$, i.e.,
$$\frac{\partial F}{\partial y} = 0 \quad\Rightarrow\quad \frac{\partial F}{\partial y'} = \text{constant}. \tag{9.21}$$

(ii) $F$ has no explicit dependence on $x$, i.e., $F = F(y, y')$; then
$$F - y'\frac{\partial F}{\partial y'} = \text{constant}. \tag{9.22}$$

The result (9.20) can be extended in a straightforward manner to more than one dependent
variable. Suppose
$$F = F(y_1, \ldots, y_n, y_1', \ldots, y_n', x).$$
Then the required path $y(x)$ which yields extrema satisfies the set of equations
$$\frac{\partial F}{\partial y_j} - \frac{d}{dx}\left(\frac{\partial F}{\partial y_j'}\right) = 0, \qquad 1 \leq j \leq n. \tag{9.23}$$

Example 9.3 Consider a particle, mass $m$, moving in space under the effect of a scalar potential
$V(x, y, z)$, and define the Lagrangian
$$L = \frac{1}{2}m(\dot{x}^2 + \dot{y}^2 + \dot{z}^2) - V(x, y, z). \tag{9.24}$$
Then requiring the integral
$$I = \int_{t_1}^{t_2}L(x, y, z, \dot{x}, \dot{y}, \dot{z})dt \tag{9.25}$$
to be stationary leads to the Euler–Lagrange equations:
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{x}}\right) - \frac{\partial L}{\partial x} = 0 \;\Rightarrow\; m\frac{d\dot{x}}{dt} = -\frac{\partial V}{\partial x},$$
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{y}}\right) - \frac{\partial L}{\partial y} = 0 \;\Rightarrow\; m\frac{d\dot{y}}{dt} = -\frac{\partial V}{\partial y},$$
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{z}}\right) - \frac{\partial L}{\partial z} = 0 \;\Rightarrow\; m\frac{d\dot{z}}{dt} = -\frac{\partial V}{\partial z}, \tag{9.26}$$
which is equivalent to the vector equation
$$\frac{d\boldsymbol{p}}{dt} = -\nabla V, \tag{9.27}$$
which we recognize as Newton's second law.

This is a special case of the “Principle of Least Action” [11, 17].

THE PRINCIPLE OF LEAST ACTION

If a classical mechanical system is specified by "generalized coordinates"
$$q_1(t), \ldots, q_N(t),$$
with kinetic energy $T(q_i, \dot{q}_i)$ and potential energy $V(q_i, t)$, then the motion of the system from
time $t_1$ to time $t_2$ is such as to render the "action integral"
$$I = \int_{t_1}^{t_2}L(q_i, \dot{q}_i, t)dt$$
stationary, where $L$ is the "Lagrangian" defined by
$$L(q_i, \dot{q}_i, t) = T(q_i, \dot{q}_i) - V(q_i, t).$$
A few comments.
• The name is a bit misleading. I didn't have to require the integral have a minimum to
recover Newton's laws in the form (9.27); I only needed $I$ to be stationary.
• The choice of the $q_i$'s is not restricted to Cartesian coordinates.

Example 9.4 Consider a free particle with Lagrangian
$$L = \frac{1}{2}m\dot{\boldsymbol{r}}^2, \tag{9.28}$$
with
$$\boldsymbol{r} = (x, y, z).$$
Now measure the motion of the particle w.r.t. a rotating coordinate system with angular
velocity
$$\boldsymbol{\omega} = (0, 0, \omega).$$
If $\boldsymbol{r}' = (x', y', z')$ are the coordinates in the rotating system, then
$$z = z', \qquad x = x'\cos\omega t - y'\sin\omega t, \qquad y = y'\cos\omega t + x'\sin\omega t,$$
$$\Rightarrow \dot{z} = \dot{z}', \qquad \dot{x} = \dot{x}'\cos\omega t - \dot{y}'\sin\omega t - x'\omega\sin\omega t - \omega y'\cos\omega t, \qquad \dot{y} = \dot{y}'\cos\omega t + \dot{x}'\sin\omega t - y'\omega\sin\omega t + x'\omega\cos\omega t,$$
$$\Rightarrow \dot{x}^2 + \dot{y}^2 + \dot{z}^2 = \omega^2\left[x'^2 + y'^2\right] + (\dot{x}'^2 + \dot{y}'^2) + 2\omega(x'\dot{y}' - \dot{x}'y') + \dot{z}'^2,$$
$$\Rightarrow L(\boldsymbol{r}', \dot{\boldsymbol{r}}') = \frac{m}{2}\left[\omega^2(x'^2 + y'^2) + (\dot{x}'^2 + \dot{y}'^2) + 2\omega(x'\dot{y}' - \dot{x}'y') + \dot{z}'^2\right]. \tag{9.29}$$
Then
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{x}'}\right) = \frac{d}{dt}\left[m(\dot{x}' - \omega y')\right] = m(\ddot{x}' - \omega\dot{y}'),$$
$$\frac{d}{dt}\left(\frac{\partial L}{\partial\dot{y}'}\right) = \frac{d}{dt}\left[m(\dot{y}' + \omega x')\right] = m(\ddot{y}' + \omega\dot{x}'),$$
$$\frac{\partial L}{\partial x'} = m(\omega^2x' + \omega\dot{y}'), \qquad \frac{\partial L}{\partial y'} = m(\omega^2y' - \omega\dot{x}'). \tag{9.30}$$
Then Euler–Lagrange gives us
$$\ddot{x}' = \omega^2x' + 2\omega\dot{y}', \qquad \ddot{y}' = \omega^2y' - 2\omega\dot{x}',$$
$$\Rightarrow m\ddot{\boldsymbol{r}}' = -m\boldsymbol{\omega}\times(\boldsymbol{\omega}\times\boldsymbol{r}') - 2m\boldsymbol{\omega}\times\dot{\boldsymbol{r}}'. \tag{9.31}$$
Thus, we have recovered the "fictitious forces" characteristic of a non-inertial frame, the
"centrifugal" and "Coriolis" terms [17].

9.3 CONSTRAINED VARIATIONS

We can employ the Lagrange multiplier method of Chapter 2 to incorporate constraints into
the calculus of variations. Suppose we want to find the path $y(x)$ which makes
$$I = \int_{x_a}^{x_b}F(x, y, y')dx \tag{9.32}$$
stationary, subject to the constraint that
$$J = \int_{x_a}^{x_b}G(x, y, y')dx - C = 0, \tag{9.33}$$
where $C$ is a constant. We generalize our earlier argument and introduce a two-parameter family of curves
$Y(x; \epsilon_1, \epsilon_2)$,
$$Y(x; \epsilon_1, \epsilon_2) = y(x) + \epsilon_1\eta_1(x) + \epsilon_2\eta_2(x),$$
$$\eta_i(x_a) = \eta_i(x_b) = 0, \qquad i = 1, 2, \tag{9.34}$$
with the $\eta_i$ as differentiable as we need. Now
$$I(\epsilon_1, \epsilon_2) = \int_{x_a}^{x_b}F(x, Y, Y')dx, \qquad J(\epsilon_1, \epsilon_2) = \int_{x_a}^{x_b}G(x, Y, Y')dx - C \tag{9.35}$$
are two real-valued functions of $\epsilon_1$ and $\epsilon_2$. Define
$$L(\epsilon_1, \epsilon_2, \lambda) = I(\epsilon_1, \epsilon_2) - \lambda J(\epsilon_1, \epsilon_2);$$
$\lambda$ is our Lagrange multiplier. The conditions for an extremum are
$$\left(\frac{\partial L}{\partial\epsilon_i}\right)_{\epsilon_i=0} = 0, \quad i = 1, 2, \qquad \left(\frac{\partial L}{\partial\lambda}\right)_{\epsilon_i=0} = 0. \tag{9.36}$$
As before, the second equation is just the constraint. We can write, up to a constant which does not affect the variation,
$$L(\epsilon_1, \epsilon_2, \lambda) = \int_{x_a}^{x_b}H(Y, Y', x)dx, \qquad H = F - \lambda G. \tag{9.37}$$
Expanding in a Taylor series, (2.6), in $\epsilon_1$ and $\epsilon_2$, integrating by parts, and then putting
$\epsilon_i = 0$, we find that
$$\frac{\partial L}{\partial\epsilon_i} = \int_{x_a}^{x_b}\left(\frac{\partial H}{\partial y} - \frac{d}{dx}\frac{\partial H}{\partial y'}\right)\eta_i(x)dx = 0,$$
$$\Rightarrow \frac{\partial H}{\partial y} - \frac{d}{dx}\frac{\partial H}{\partial y'} = 0. \tag{9.38}$$
This is just like the Euler–Lagrange equation (9.20) except that $H = F - \lambda G$ replaces $F$.
Note that the solution of the Euler–Lagrange equation involves two constants of integration; these,
together with the constraint condition, are enough to ensure that $y(x)$ passes through $(x_a, a)$ and $(x_b, b)$.
Generalization of these results to include multiple constraints is not difficult. If we have
$M$ constraints,
$$J_i = \int_{x_a}^{x_b}G_i(y, y', x)dx, \qquad 1 \leq i \leq M,$$

define the function
$$H = F + \sum_{i=1}^{M}\lambda_iG_i \tag{9.39}$$
and look for the extrema of
$$L(y, y', \lambda_1, \ldots, \lambda_M) \equiv \int_{x_a}^{x_b}H(y, y', \lambda_1, \ldots, \lambda_M, x)dx. \tag{9.40}$$
Then the new $H$ defined in (9.39) satisfies the Euler–Lagrange equations.

Example 9.5 Let us look for the wave function $\psi(x)$ which minimizes
$$\langle\psi|\hat{H}\psi\rangle = \int_{-\infty}^{\infty}\bar{\psi}(x)\hat{H}\psi(x)dx, \tag{9.41}$$
subject to the constraint that
$$\int_{-\infty}^{\infty}\bar{\psi}(x)\psi(x)dx = 1, \tag{9.42}$$
where
$$\hat{H} = -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x). \tag{9.43}$$
We require
$$\lim_{x\to\pm\infty}\psi(x) = 0.$$
Therefore,
$$\int_{-\infty}^{\infty}\bar{\psi}\frac{d^2\psi}{dx^2}dx = \left.\bar{\psi}\frac{d\psi}{dx}\right|_{-\infty}^{\infty} - \int_{-\infty}^{\infty}\frac{d\bar{\psi}}{dx}\frac{d\psi}{dx}dx = -\int_{-\infty}^{\infty}\frac{d\bar{\psi}}{dx}\frac{d\psi}{dx}dx. \tag{9.44}$$
Our task is to minimize
$$\langle\psi|\hat{H}\psi\rangle = \int_{-\infty}^{\infty}\left[\frac{\hbar^2}{2m}\left(\frac{d\bar{\psi}}{dx}\right)\left(\frac{d\psi}{dx}\right) + V\bar{\psi}\psi\right]dx \tag{9.45}$$
subject to (9.42). If we treat $\psi$ and $\bar{\psi}$ as two dependent variables and add the constraint using
a Lagrange multiplier $\lambda$, the Euler–Lagrange equations become
$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + V\psi - \lambda\psi = 0,$$
$$-\frac{\hbar^2}{2m}\frac{d^2\bar{\psi}}{dx^2} + V\bar{\psi} - \lambda\bar{\psi} = 0,$$
$$\int_{-\infty}^{\infty}\bar{\psi}(x)\psi(x)dx = 1. \tag{9.46}$$
The first and second of these equations in (9.46) are equivalent, so we deduce that $\psi$ satisfies
the Schrödinger equation:
$$\left[-\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + V(x)\right]\psi = \lambda\psi. \tag{9.47}$$
We can thus identify the Lagrange multiplier with the energy of the physical system. It
is a straightforward matter to extend this result to three dimensions, i.e., the square integrable
function $\psi(\boldsymbol{r})$ which minimizes
$$\langle\psi|\hat{H}\psi\rangle = \iiint\bar{\psi}(\boldsymbol{r})\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\boldsymbol{r})\right]\psi(\boldsymbol{r})d^3r, \tag{9.48}$$
subject to the constraint that
$$\iiint\bar{\psi}(\boldsymbol{r})\psi(\boldsymbol{r})d^3r = 1, \tag{9.49}$$
must satisfy the time independent Schrödinger equation:
$$\left[-\frac{\hbar^2}{2m}\nabla^2 + V(\boldsymbol{r})\right]\psi = \lambda\psi. \tag{9.50}$$

9.4 STURM–LIOUVILLE REVISITED

Suppose we wanted to find the functions $y(x)$ for which
$$I[y] = \int_a^b\left(p(y')^2 + qy^2\right)dx \tag{9.51}$$
is stationary, subject to the constraint that
$$G[y] = \int_a^b w(x)y^2dx = 1, \tag{9.52}$$
with $p(x), w(x) > 0$ on $[a, b]$. Let us introduce a Lagrange multiplier $\lambda$; then (9.36) tells us that
$I - \lambda G$ is stationary when
$$\frac{d(2py')}{dx} = 2qy - 2\lambda wy \quad\Rightarrow\quad -\frac{d(py')}{dx} + qy = \lambda wy, \tag{9.53}$$
which is the Sturm–Liouville equation. If we multiply (9.53) by $y$ and integrate, we find
$$\int_a^b\left(-y\frac{d(py')}{dx} + qy^2\right)dx = \lambda\int_a^b wy^2dx = \lambda G[y] = \lambda. \tag{9.54}$$
It follows, integrating the first integral by parts, that
$$\lambda = \int_a^b\left(-y\frac{d(py')}{dx} + qy^2\right)dx = \left.-ypy'\right|_a^b + \int_a^b\left(p(y')^2 + qy^2\right)dx = I[y], \tag{9.55}$$
where we have assumed the usual Sturm–Liouville boundary conditions, (7.2). Thus, the stationary values of
$$\frac{\int_a^b\left(p(y')^2 + qy^2\right)dx}{\int_a^b y^2(x)w(x)dx} = \frac{\int_a^b y\left(-(py')' + qy\right)dx}{\int_a^b y^2(x)w(x)dx} \tag{9.56}$$
are given by
$$F[y_n(x)] = \lambda_n,$$
where the $\lambda_n$ are the eigenvalues of the Sturm–Liouville operator corresponding to the eigenfunctions $y_n$.
In summary, the following three problems are equivalent.

(i) Find the eigenvalues $\lambda$ and the eigenfunctions $y(x)$ that solve the Sturm–Liouville problem
$$-\frac{d(p(x)y')}{dx} + q(x)y = \lambda w(x)y, \qquad \left.ypy'\right|_a^b = 0, \qquad w(x), p(x) > 0, \quad x \in (a, b).$$

(ii) Find the function $y(x)$ for which
$$F[y] = \int_a^b\left(py'^2 + qy^2\right)dx$$
is stationary, subject to the constraint that
$$G[y] = \int_a^b wy^2dx = 1.$$
The eigenvalues of the equivalent Sturm–Liouville problem in (i) are given by $F[y]$.

(iii) Find the function $y(x)$ for which
$$H[y] = \frac{F[y]}{G[y]}$$
is stationary. The eigenvalues of the Sturm–Liouville problem are then given by the values
of $H[y]$.

We can make use of these equivalences to estimate the eigenvalues and eigenfunctions of a
Sturm–Liouville problem.

Example 9.6 Consider the simple Sturm–Liouville problem
$$u'' + \lambda u = 0, \qquad u(0) = u(1) = 0, \tag{9.57}$$
whose solutions we know to be
$$u_n = \sin(n\pi x), \qquad \lambda_n = (n\pi)^2, \qquad n = 1, 2, \ldots. \tag{9.58}$$
We are looking for the lowest eigenvalue. Let us try two different test functions.
(i) The "hat" function
$$u_h(x) = \begin{cases} x, & 0 \leq x \leq \tfrac{1}{2} \\ 1 - x, & \tfrac{1}{2} \leq x \leq 1. \end{cases} \tag{9.59}$$
In our case, $w(x) = 1.0$. Hence,
$$\int_0^1 u_h^2w(x)dx = \int_0^{1/2}x^2dx + \int_{1/2}^1\left[1 - 2x + x^2\right]dx = 2\left.\frac{x^3}{3}\right|_0^{1/2} = \frac{1}{12}. \tag{9.60}$$
We can normalize our trial function: $\bar{u}_h = \sqrt{12}\,u_h$.

(ii) The quadratic trial function
$$u_q = x(x - 1), \qquad \int_0^1 u_q^2dx = \frac{1}{30}, \qquad \bar{u}_q = \sqrt{30}\,u_q. \tag{9.61}$$

In Figure 9.2, I show a comparison between the normalized trial functions $\bar{u}_h$ and $\bar{u}_q$ and
the exact solution $u_1(x)$. Now let us consider the Rayleigh quotient for both trial functions to
get an estimate for the eigenvalue $\lambda$:
$$\lambda_h = \frac{\int_0^1(u_h')^2dx}{\int_0^1 u_h^2w(x)dx} = 12, \qquad \lambda_q = \frac{\int_0^1(u_q')^2dx}{\int_0^1 u_q^2w(x)dx} = 10. \tag{9.62}$$
The true value of the lowest eigenvalue is $\pi^2 \approx 9.8696$; as expected, both approximate values are greater
than the exact value, with $\lambda_h$ overestimating the exact value by approximately 21% and $\lambda_q$ overestimating it by only 1.3%.
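
These quotients are easy to verify numerically; the sketch below (my illustration, using a simple finite-difference derivative and the trapezoidal rule on a uniform grid) reproduces $\lambda_h \approx 12$ and $\lambda_q \approx 10$.

    import numpy as np

    def rayleigh_quotient(u, x, w=1.0):
        """Estimate (int u'^2 dx) / (int w u^2 dx) on a uniform grid."""
        du = np.gradient(u, x)             # finite-difference u'
        return np.trapz(du * du, x) / np.trapz(w * u * u, x)

    x = np.linspace(0.0, 1.0, 2001)
    u_hat = np.minimum(x, 1.0 - x)         # the hat function, Eq. (9.59)
    u_quad = x * (x - 1.0)                 # the quadratic trial function

    print(rayleigh_quotient(u_hat, x))     # approx. 12
    print(rayleigh_quotient(u_quad, x))    # approx. 10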
[Figure 9.2 plots the trial functions and the exact eigenfunction on $0 \leq x \leq 1$.]

Figure 9.2: Trial functions, $\bar{u}_h(x)$, red long dashed, and $\bar{u}_q(x)$, blue short dashed, compared with
the exact eigenfunction $u_1(x)$, solid green.

CHAPTER 10

Case Study: The Ground State of Atoms
In nature every undisturbed quantum system tends to its lowest allowed energy. This state is
known as the ground state. An excited state is any state with energy greater than the ground
state. It is not possible to make a direct measurement of the quantum wave function. A quality
measure of an approximate wave function is how well it reproduces the ground state energy. In
this chapter I plan to focus on the computation of the ground state energies of multi-electron
atoms/ions with special reference to the application of the variational methods developed in the
previous chapter.
The Schrödinger equation for a one electron ion admits an analytic solution [23] which
we can use as a starting point for treating the more complex systems. (Atomic units with $\hbar = 1$, $e = 1$, $m_e = 1$, $a_0 = 1$, $4\pi\epsilon_0 = 1$ are used throughout this chapter.)

10.1 HYDROGENIC IONS

The energies of the stationary states of a hydrogenic ion of charge $Z$ are the eigenvalues of the
time-independent Schrödinger equation:
$$\left[-\frac{1}{2\mu}\nabla^2 - \frac{Z}{r}\right]\psi(\boldsymbol{r}) = E\psi(\boldsymbol{r}). \tag{10.1}$$
$\mu$ is the reduced mass,
$$\mu = \frac{M}{M + 1},$$
where $M$ is the nuclear mass. The eigenfunctions $\psi_{nlm}(\boldsymbol{r})$ can be written [23]
$$\psi_{nlm}(\boldsymbol{r}) = R_{nl}(r)Y_l^m(\theta, \phi), \tag{10.2}$$
where $n, l, m$ are integers s.t.
$$n = 1, 2, \ldots, \qquad l = 0, 1, \ldots, n - 1, \qquad m = -l, -l + 1, \ldots, l - 1, l,$$
and $Y_l^m(\theta, \phi)$ is a "spherical harmonic." The first few radial functions are given by [23, 26]
$$R_{10}(r) = 2(\mu Z)^{3/2}\exp(-\mu Zr),$$
$$R_{20}(r) = 2\left(\frac{\mu Z}{2}\right)^{3/2}\left(1 - \frac{\mu Zr}{2}\right)\exp\left(-\frac{\mu Zr}{2}\right),$$
$$R_{21}(r) = \frac{1}{\sqrt{3}}\left(\frac{\mu Z}{2}\right)^{3/2}(\mu Zr)\exp\left(-\frac{\mu Zr}{2}\right),$$
$$R_{30}(r) = 2\left(\frac{\mu Z}{3}\right)^{3/2}\left(1 - \frac{2\mu Zr}{3} + \frac{2(\mu Z)^2r^2}{27}\right)\exp\left(-\frac{\mu Zr}{3}\right),$$
$$R_{31}(r) = \frac{4\sqrt{2}}{9}\left(\frac{\mu Z}{3}\right)^{3/2}(\mu Zr)\left(1 - \frac{\mu Zr}{6}\right)\exp\left(-\frac{\mu Zr}{3}\right),$$
$$R_{32}(r) = \frac{4}{27\sqrt{10}}\left(\frac{\mu Z}{3}\right)^{3/2}(\mu Zr)^2\exp\left(-\frac{\mu Zr}{3}\right), \tag{10.3}$$
and the energy eigenvalues are given by
$$E_n = -\frac{\mu Z^2}{2n^2} = -\frac{\mu Z^2}{n^2}\,\text{Ryd}, \tag{10.4}$$
where Ryd is the Rydberg energy, which is just $\tfrac{1}{2}$ in atomic units. We are interested in the ground
state here:
$$\psi_{1,0,0} = R_{1,0}(r)Y_0^0(\theta, \phi) = 2(\mu Z)^{3/2}\exp(-\mu Zr)\left(\frac{1}{4\pi}\right)^{1/2}. \tag{10.5}$$
Since the nuclear mass is very much bigger than the electron mass, we can just take $\mu = 1$,
which is what I will do from now on.

10.2 TWO ELECTRON IONS

The Hamiltonian for two electrons, each of charge $-1$, orbiting a nucleus of charge $Z$ is
$$H = -\frac{1}{2}\nabla_1^2 - \frac{Z}{r_1} - \frac{1}{2}\nabla_2^2 - \frac{Z}{r_2} + \frac{1}{\|\boldsymbol{r}_1 - \boldsymbol{r}_2\|}. \tag{10.6}$$
For helium $Z = 2$, but it will be convenient to keep $Z$ arbitrary for the time being. If I
make the orbital approximation [23] and ignore the electron-electron interaction, the problem
is separable and the eigenstates for the two electron system are of the form
$$\Psi_{n_1,l_1,m_1,n_2,l_2,m_2}(\boldsymbol{r}_1, \boldsymbol{r}_2) = \psi_{n_1,l_1,m_1}(\boldsymbol{r}_1)\psi_{n_2,l_2,m_2}(\boldsymbol{r}_2), \tag{10.7}$$
where the $\psi_{n_il_im_i}(\boldsymbol{r}_i)$ are the usual energy eigenstates of the hydrogenic ion with nuclear charge $Z$.
You should remember that the electrons are fermions, so we cannot put them in the same state.
However, electrons also have a spin degree of freedom which we have neglected in (10.7). This
means that two electrons can have the same spatial wavefunction as long as one is spin up and
the other spin down. The energy in this approximation is just the sum of the energies
of both orbitals:
$$E = -Z^2\left(\frac{1}{n_1^2} + \frac{1}{n_2^2}\right)\text{Ryd}.$$
Setting $Z = 2$, $n_1 = n_2 = 1$ for helium we get a ground state energy of $-8\,\text{Ryd} \approx -108.8$ eV. The ground state of helium has a measured energy which is very close to $-79$ eV.
Clearly, we need to take into account the interaction term to get a better estimate. I will explore
two methods of finding an improved approximate solution for the two electron ion.

• I will look at a "perturbative" approach. Perturbation theory is a systematic method for
finding an approximate solution to a problem, by starting from the exact solution of
a related, simpler problem. I will only use the first-order theory as outlined in Appendix C.

• The variational approach discussed in the previous chapter.

We can take $\Psi_{n_1,l_1,m_1,n_2,l_2,m_2}(\boldsymbol{r}_1, \boldsymbol{r}_2)$ as defined in (10.7) to be our unperturbed state. If
we are to apply perturbation theory we need to be able to assume that the neglected term, the
$e$-$e$ interaction, is smaller than the unperturbed term. Both the electron-nucleus and electron-electron terms are Coulomb interactions differing by a factor $Z$, so crudely our perturbation is $\tfrac{1}{Z}$
smaller. This is only a half for helium, so we might expect that perturbation theory will only give
a very crude estimate of the correction. Our hydrogenic ground state orbital is given by (10.5):
$$\psi_{1,0,0}(\boldsymbol{r}) = Z^{3/2}\exp(-Zr)\frac{1}{\sqrt{\pi}}.$$
Hence,
$$\Delta E = \langle\Psi_{1,0,0,1,0,0}|H_I\Psi_{1,0,0,1,0,0}\rangle = \int\frac{|\psi_{1,0,0}(\boldsymbol{r}_1)|^2|\psi_{1,0,0}(\boldsymbol{r}_2)|^2}{\|\boldsymbol{r}_1 - \boldsymbol{r}_2\|}d^3r_1d^3r_2 = \frac{5Z}{4}\,\text{Ryd}, \tag{10.8}$$
where I have made use of the integral [23]
$$\int\frac{e^{-\kappa[r_1 + r_2]}}{\|\boldsymbol{r}_1 - \boldsymbol{r}_2\|}d^3r_1d^3r_2 = \frac{20\pi^2}{\kappa^5}. \tag{10.9}$$
So, the first-order correction is a positive term and yields a ground state energy:
$$\left(-8 + \frac{5}{2}\right)\text{Ryd} \approx -74.8\ \text{eV}. \tag{10.10}$$
This is not a bad first estimate. However, to take the perturbation to higher orders is very
demanding.

Now let us try the variational approach, taking as our variational test function a normalized
wave function:
$$\psi_t(\boldsymbol{r}_1, \boldsymbol{r}_2) = \frac{\tilde{Z}^3}{\pi}e^{-\tilde{Z}(r_1 + r_2)}. \tag{10.11}$$
Our trial function looks like the product of two hydrogenic functions for a nuclear charge
$\tilde{Z}$, but this "charge" is not a real constant charge but a variable parameter which we can choose at
will in order to make use of the Rayleigh–Ritz theorem:
$$\langle\psi_t|\hat{H}\psi_t\rangle \equiv \langle\hat{H}\rangle = \int d^3r_1d^3r_2\,\bar{\psi}_t\left(-\frac{1}{2}\nabla_1^2 - \frac{1}{2}\nabla_2^2 - \frac{\tilde{Z}}{r_1} - \frac{\tilde{Z}}{r_2} + \frac{\tilde{Z} - Z}{r_1} + \frac{\tilde{Z} - Z}{r_2} + \frac{1}{\|\boldsymbol{r}_1 - \boldsymbol{r}_2\|}\right)\psi_t. \tag{10.12}$$
Now, since we are using hydrogenic functions, it can be shown [23, 26] that
$$\left\langle\psi_t\left|\left(-\frac{1}{2}\nabla_1^2 - \frac{1}{2}\nabla_2^2 - \frac{\tilde{Z}}{r_1} - \frac{\tilde{Z}}{r_2}\right)\psi_t\right.\right\rangle = -\tilde{Z}^2,$$
$$\left\langle\psi_t\left|\left(\frac{\tilde{Z} - Z}{r_1} + \frac{\tilde{Z} - Z}{r_2}\right)\psi_t\right.\right\rangle = 2(\tilde{Z} - Z)\left\langle\frac{1}{r}\right\rangle,$$
$$\left\langle\psi_t\left|\frac{1}{\|\boldsymbol{r}_1 - \boldsymbol{r}_2\|}\psi_t\right.\right\rangle = \frac{5\tilde{Z}}{8}, \tag{10.13}$$
where $\langle\frac{1}{r}\rangle$ is the expectation value for the ground state of a one electron hydrogenic ion with nuclear
charge $\tilde{Z}$, and is equal to $\tilde{Z}$. With everything in Rydbergs we have
$$\langle\hat{H}\rangle = \left[-2\tilde{Z}^2 + 4\tilde{Z}(\tilde{Z} - Z) + \frac{5}{4}\tilde{Z}\right]\text{Ryd}. \tag{10.14}$$
Let us now find the value of $\tilde{Z}$ which gives the minimum value, $\tilde{Z}_{min}$:
$$\left.\frac{d\langle\hat{H}\rangle}{d\tilde{Z}}\right|_{\tilde{Z}_{min}} = 0 \quad\Rightarrow\quad -4\tilde{Z}_{min} + 8\tilde{Z}_{min} - 4Z + \frac{5}{4} = 0 \quad\Rightarrow\quad \tilde{Z}_{min} = Z - \frac{5}{16}. \tag{10.15}$$
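
A quick numerical check of (10.14) and (10.15) is straightforward; this sketch (mine, not from the book) scans $\langle\hat{H}\rangle(\tilde{Z})$ for helium:

    import numpy as np

    RYD_EV = 13.6           # 1 Ryd in eV (rounded)

    def E_trial(Zt, Z):
        """Variational energy of Eq. (10.14), in Rydbergs."""
        return -2.0 * Zt**2 + 4.0 * Zt * (Zt - Z) + 1.25 * Zt

    Z = 2.0                                  # helium
    Zt = np.linspace(1.0, 2.0, 100001)
    E = E_trial(Zt, Z)
    i = np.argmin(E)
    print(Zt[i], Z - 5.0 / 16.0)             # both approx. 27/16 = 1.6875
    print(E[i] * RYD_EV)                     # approx. -77.5 eV, cf. (10.16)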
For helium $Z = 2$ and consequently $\tilde{Z}_{min} = 27/16$, and our upper bound on the ground
state energy is
$$\langle\hat{H}\rangle = -2\left(\frac{27}{16}\right)^2\text{Ryd} \approx -77.46\ \text{eV}. \tag{10.16}$$
One of the advantages of the variational approach is that not only does it give an estimate of
the ground state energy but also an approximate wave function. In this case we can interpret our new
wave function by saying that each electron moves on average in the field of a nucleus with charge
$\tilde{Z}$, rather than charge $Z$. The difference between $\tilde{Z}$ and $Z$ is a measure of the degree of screening
due to the second electron.

So far, we have not fixed $Z$, so we can apply the same analysis to other two electron systems. The negative ion of hydrogen, H$^-$, is an interesting testing ground for exploring variational
techniques for estimating atomic wavefunctions [27, 28]. If we use the test function (10.11) our
analysis follows through exactly as before, but now $Z = 1$ and $\tilde{Z}_{min} = 11/16$; we thus have an
upper bound on the energy of the three particle system of
$$-2\left(\frac{11}{16}\right)^2\text{Ryd} \approx -12.86\ \text{eV}. \tag{10.17}$$
Now this energy is greater than the ground state energy of neutral hydrogen ($-13.6$ eV).
Thus, if this were the actual energy of the H$^-$ ground state then it would be more energetically
favorable to free one electron and leave the other electron in the ground state of the neutral atom.
The variational method only gives us an upper bound on the energy, so it would be premature to
assume that there are no bound states of H$^-$ based on this calculation alone. Bethe [29] used a
trial function which depended on three parameters:
$$\psi = (1 + \alpha u + \beta t^2)e^{-\gamma s},$$
where
$$u \equiv \|\boldsymbol{r}_1 - \boldsymbol{r}_2\|, \qquad s \equiv r_1 + r_2, \qquad t \equiv r_1 - r_2, \tag{10.18}$$
and $\alpha, \beta, \gamma$ are the variational parameters. It was shown that with this wave function the resulting
Rayleigh–Ritz upper bound on the energy lies below $-1$ Ryd. More and more sophisticated and
complex trial functions have been used. The best current estimate of the ground state energy is close to
$-14.36$ eV.

H$^-$ is of astrophysical importance [27]. The abundant presence of both hydrogen and low
energy electrons in the ionized atmospheres of the Sun and other stars is ideal for the creation
of H$^-$ by electron attachment. Radiation from the surface of the sun is absorbed by photo-detachment. The continual formation and destruction of the negative ion conserves the total
radiated energy but modifies the characteristics of the light emitted from the star. Indeed, since
most neutral atoms and positive ions have their first absorption at 4 or 5 eV if not larger, H$^-$ is
the dominant contributor to the absorption of 0.75 eV photons, a critical range of infrared and
visible wavelengths.

Chandrasekhar [30] used a two-parameter trial function:
$$\psi_{trial}(\boldsymbol{r}_1, \boldsymbol{r}_2) = \frac{N}{4\pi}\left[e^{-\tilde{Z}_1r_1 - \tilde{Z}_2r_2} + e^{-\tilde{Z}_1r_2 - \tilde{Z}_2r_1}\right]. \tag{10.19}$$
Notice we have two variational parameters, $\tilde{Z}_1$ and $\tilde{Z}_2$, and that our wave function is symmetric under the interchange of $\boldsymbol{r}_1$ and $\boldsymbol{r}_2$. Using (10.19) and Rayleigh–Ritz, Chandrasekhar found
$\tilde{Z}_1 = 1.039$ and $\tilde{Z}_2 = 0.283$, and an upper bound on the ground state energy of $-13.98$ eV, slightly
below the ground state energy of hydrogen and not too far off the actual ground state energy. This
function exhibits a "radial correlation" only. Particularly striking is the feature that $\tilde{Z}_1$ is larger
than 1; we can interpret this as implying that the effect of the second electron is to force the inner
one closer to the nucleus than it would be were it alone bound to the proton. The more complex
wave functions like those of Bethe, (10.18), include "angular correlation" between the directions $\hat{\boldsymbol{r}}_1$ and $\hat{\boldsymbol{r}}_2$ as well as "radial correlation" between the magnitudes $r_1$ and $r_2$; the fact that the Chandrasekhar
wavefunction gives a "good" bound state energy suggests that radial correlation is the more significant of the
two types.

10.3 THE HARTREE APPROACH

Let us consider an atom with $N$ electrons. We are looking for an optimal trial wavefunction
to use in our variational theorem. As a first approximation I will ignore spin and not impose
antisymmetry on the wave function. As my test function I will take
$$\Psi(\boldsymbol{r}_1, \boldsymbol{r}_2, \ldots, \boldsymbol{r}_N) = \psi_{\alpha_1}(\boldsymbol{r}_1)\psi_{\alpha_2}(\boldsymbol{r}_2)\cdots\psi_{\alpha_N}(\boldsymbol{r}_N), \tag{10.20}$$
where I assume each one particle function is normalized and can be characterized by a set of
quantum numbers $\alpha$. $\psi_\alpha(\boldsymbol{r}_q)$ is the one particle wave function for electron $q$ with the set of
quantum numbers $\alpha$; these quantum numbers could be $\alpha \equiv n_i, l_i, m_i$. We do not have to assume
the single particle functions are hydrogenic wave functions. In practical calculations Slater-type
orbitals or Gaussian orbitals are frequently used [31].

The full multi-electron Hamiltonian is:
$$\hat{H} = \sum_{i=1}^{N}\left[-\frac{1}{2}\nabla_i^2 - \frac{Z}{r_i} + \sum_{j>i}\frac{1}{r_{ij}}\right]. \tag{10.21}$$

Now, using $\Psi$ as our trial function, the expectation value of the energy becomes
$$\langle\Psi|\hat{H}\Psi\rangle \equiv \langle\hat{H}\rangle = \sum_{i=1}^{N}\int d^3r\,\bar{\psi}_{\alpha_i}(\boldsymbol{r})\left(-\frac{1}{2}\nabla^2 - \frac{Z}{r}\right)\psi_{\alpha_i}(\boldsymbol{r}) + \sum_{j>i}\int d^3r\,d^3r'\,\frac{\bar{\psi}_{\alpha_i}(\boldsymbol{r})\bar{\psi}_{\alpha_j}(\boldsymbol{r}')\psi_{\alpha_i}(\boldsymbol{r})\psi_{\alpha_j}(\boldsymbol{r}')}{\|\boldsymbol{r} - \boldsymbol{r}'\|}. \tag{10.22}$$

Let us focus for a moment on the second term. Let
$$J_{ij} = \int d^3r\,d^3r'\,\frac{\bar{\psi}_{\alpha_i}(\boldsymbol{r})\bar{\psi}_{\alpha_j}(\boldsymbol{r}')\psi_{\alpha_i}(\boldsymbol{r})\psi_{\alpha_j}(\boldsymbol{r}')}{\|\boldsymbol{r} - \boldsymbol{r}'\|}. \tag{10.23}$$
Since the integral is over $\boldsymbol{r}$ and $\boldsymbol{r}'$ we have that $J_{ij} = J_{ji}$, hence
$$\sum_{j>i}J_{ij} = \frac{1}{2}\sum_{j\neq i}J_{ij}, \tag{10.24}$$
and we may write
$$\langle\hat{H}\rangle = \sum_{i=1}^{N}\int d^3r\,\bar{\psi}_{\alpha_i}(\boldsymbol{r})\left(-\frac{1}{2}\nabla^2 - \frac{Z}{r}\right)\psi_{\alpha_i}(\boldsymbol{r}) + \frac{1}{2}\sum_{j\neq i}\int d^3r\,d^3r'\,\frac{\bar{\psi}_{\alpha_i}(\boldsymbol{r})\bar{\psi}_{\alpha_j}(\boldsymbol{r}')\psi_{\alpha_i}(\boldsymbol{r})\psi_{\alpha_j}(\boldsymbol{r}')}{\|\boldsymbol{r} - \boldsymbol{r}'\|}. \tag{10.25}$$
To find the least upper bound on the energy with the ansatz (10.20) we need to minimize $\langle\hat{H}\rangle$ over all possible one particle orbitals. If we keep each orbital $\psi_{\alpha_i}$ normalized then
the $N$ particle wave function $\Psi$ will be normalized. To achieve this we introduce $N$ Lagrange
multipliers, $\epsilon_i$, and consider the functional
$$F[\Psi] = \langle\hat{H}\rangle - \sum_i\epsilon_i\left(\int d^3r\,|\psi_{\alpha_i}(\boldsymbol{r})|^2 - 1\right). \tag{10.26}$$
We want to find the wave functions $\psi_{\alpha_i}$ which will make $F$ minimal. Just as in Example 9.5, we can vary the real and imaginary parts independently. Since we have $N$ independent
wavefunctions, this gives rise to $2N$ real conditions, i.e., two sets of $N$ complex equations;
however, one set is simply the conjugate of the other and so we only need the $N$ equations
$$\left[-\frac{1}{2}\nabla_i^2 - \frac{Z}{r_i} + \sum_{j\neq i}\int d^3r'\,\frac{\bar{\psi}_{\alpha_j}(\boldsymbol{r}')\psi_{\alpha_j}(\boldsymbol{r}')}{\|\boldsymbol{r} - \boldsymbol{r}'\|}\right]\psi_{\alpha_i}(\boldsymbol{r}) = \epsilon_i\psi_{\alpha_i}(\boldsymbol{r}), \tag{10.27}$$
called the Hartree equations.

Notice that in taking the variational derivative with respect to $\bar{\psi}_n$ we end up with two
identical terms, corresponding to $n = i$ and $n = j$, and we lose the factor of a half in (10.25).

[Figure 10.1 is a flow chart: guess orbitals $\psi_i$ → calculate $U_i$ → solve for new $\psi_i$ → recalculate $U_i$ → if the SCF cycle has not converged, repeat; once converged, output the physical quantities.]

Figure 10.1: Flow chart for a self-consistent field computer code.

Equation (10.27) has the same form as the regular Schrödinger equation where we have an
effective potential
$$U_i(\boldsymbol{r}) = \sum_{j\neq i}\int d^3r'\,\frac{\bar{\psi}_{\alpha_j}(\boldsymbol{r}')\psi_{\alpha_j}(\boldsymbol{r}')}{\|\boldsymbol{r} - \boldsymbol{r}'\|}, \tag{10.28}$$
and our Lagrange multipliers are now the orbital energies. We interpret $U_i(\boldsymbol{r})$ as coming from
the electrostatic potential due to all the electrons other than $i$. The thing to notice is that each
$\psi_{\alpha_j}$ that appears in $U_i(\boldsymbol{r})$ is itself determined by one of the Hartree equations in (10.27), so
we have a set of coupled integro-differential equations. The potentials $U_i$ both determine the
wavefunctions and are determined by the wavefunctions. The major requirement now is "self-consistency." The usual way forward is to proceed iteratively, see Figure 10.1. We write down a
physically reasonable guess for our product wavefunction (10.20) and use this to calculate the $U_i$,
then calculate a new set of energies $\epsilon_i$ and orbitals $\psi_{\alpha_i}$, from which we can calculate
a new $U_i$, and continue like this until we have reached the desired level of convergence. Now,
taking the inner product of (10.27) with $\psi_{\alpha_i}(\boldsymbol{r})$ yields
$$\epsilon_i = \int d^3r\,\bar{\psi}_{\alpha_i}(\boldsymbol{r})\left(-\frac{1}{2}\nabla_i^2 - \frac{Z}{r_i}\right)\psi_{\alpha_i}(\boldsymbol{r}) + \sum_{j\neq i}\int d^3r\,d^3r'\,\frac{|\psi_{\alpha_j}(\boldsymbol{r}')|^2|\psi_{\alpha_i}(\boldsymbol{r})|^2}{\|\boldsymbol{r} - \boldsymbol{r}'\|}. \tag{10.29}$$
Summing over $i$ we almost get the expression (10.22). Unfortunately, the inner summation in (10.22) is over $j > i$ while in (10.27) it is over $j \neq i$, which means, as pointed out above
(10.24), it double counts. Correcting for the double counting means our variational estimate for
the ground state is:
$$E_{variational} = \sum_i\epsilon_i - \sum_{i<j}\int d^3r\,d^3r'\,\frac{|\psi_{\alpha_j}(\boldsymbol{r}')|^2|\psi_{\alpha_i}(\boldsymbol{r})|^2}{\|\boldsymbol{r} - \boldsymbol{r}'\|}. \tag{10.30}$$

The Hartree approach has been generalized to take account of spin in the "Hartree–Fock"
theory, where the $N$-body test wave function of the system is taken to be an antisymmetric product
of one electron orbitals, a "Slater determinant." For more details, see [23, 24].
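
Whatever the details of the orbital solver, the self-consistency cycle of Figure 10.1 has a simple loop structure. The sketch below is purely schematic: hartree_potentials and solve_orbitals are hypothetical helpers standing in for the integrals of (10.28) and a solver for (10.27).

    import numpy as np

    def scf_loop(psi_guess, tol=1e-8, max_iter=100):
        """Generic Hartree SCF iteration, as in Figure 10.1 (a sketch).

        Hypothetical helpers:
          hartree_potentials(psis) -> potentials U_i of Eq. (10.28)
          solve_orbitals(U)        -> new (energies, orbitals) from Eq. (10.27)
        """
        psis = psi_guess
        eps_old = None
        for it in range(max_iter):
            U = hartree_potentials(psis)
            eps, psis = solve_orbitals(U)          # eps: array of orbital energies
            if eps_old is not None and np.max(np.abs(eps - eps_old)) < tol:
                return eps, psis                   # self-consistency reached
            eps_old = eps
        raise RuntimeError("SCF did not converge")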

APPENDIX A

Vector Spaces
Vector space theory lies at the heart of much, if not most, of numerical methods. We commonly
utilize the theorems of finite dimensional linear algebra and profit from our knowledge of self-adjoint operators in infinite dimensional Hilbert spaces. In this appendix, I have gathered together some key results and observations that are employed throughout this book.

Definition A.1 A vector space, $V$, over the complex numbers, $\mathbb{C}$, is a set together with
operations of addition and multiplication by complex numbers which satisfy the following axioms. Given any pair of vectors $x, y$ in $V$ there exists a unique vector $x + y$ in $V$ called the sum
of $x$ and $y$. It is required that:

• $x + (y + z) = (x + y) + z$, i.e., we require addition to be associative;
• $x + y = y + x$, i.e., we require addition to be commutative;
• there exists a vector $0$ s.t. $x + 0 = x$;
• for each vector $x$ there exists a vector $-x$ s.t. $x + (-x) = 0$.

Given any vector $x$ in $V$ and any $\alpha, \beta$ in $\mathbb{C}$ there exists a vector $\alpha x$ in $V$ called the product
of $x$ and $\alpha$. It is required that:
• $\alpha(y + z) = \alpha y + \alpha z$;
• $(\alpha + \beta)x = \alpha x + \beta x$;
• $(\alpha\beta)x = \alpha(\beta x)$;
• $(1)x = x$.

Definition A.2 A vector $x$ is said to be linearly dependent on vectors $x_1, \ldots, x_N$ if $x$ can be
written as
$$x = \alpha_1x_1 + \alpha_2x_2 + \cdots + \alpha_Nx_N.$$
If no such relation exists, then $x$ is said to be linearly independent of $x_1, \ldots, x_N$.

Now every vector in $\mathbb{R}^3$ can be written in terms of the three unit vectors $e_x, e_y, e_z$; we can
generalize this idea.
Definition A.3 Suppose $V$ is a vector space and there exists a set of vectors $\{e_i\}_{i=1}^N$. This
set spans $V$, or equivalently forms a basis for it, if
• the set of vectors $\{e_i\}_{i=1}^N$ is linearly independent, and
• if $c \in V$ then it can be written
$$c = \sum_{i=1}^N\alpha_ie_i.$$

It is easy to verify that the set of ordered $n$-tuples of real numbers
$$x = (x_1, \ldots, x_n)$$
forms a vector space over $\mathbb{R}$ when we define addition and multiplication by a scalar by
$$\alpha x + \beta y \equiv (\alpha x_1 + \beta y_1, \ldots, \alpha x_n + \beta y_n). \tag{A.1}$$
Our "ordinary" vectors in $\mathbb{R}^3$ are just a special case. In $\mathbb{R}^3$ we have a scalar product
$$x\cdot y \equiv \sum_{i=1}^3x_iy_i. \tag{A.2}$$
We can generalize this for an arbitrary vector space, $V$, over the complex numbers.

Definition A.4 An inner product is a map which associates two vectors in the space, $V$, with
a complex number,
$$\langle\,|\,\rangle: V\times V \to \mathbb{C}, \qquad (a, b) \mapsto \langle a|b\rangle,$$
that satisfies the following four properties for all vectors $a, b, c \in V$ and all scalars $\alpha, \beta \in \mathbb{C}$:
$$\langle a|b\rangle = \overline{\langle b|a\rangle},$$
$$\langle\alpha a|\beta b\rangle = \bar{\alpha}\beta\langle a|b\rangle,$$
$$\langle a + b|c\rangle = \langle a|c\rangle + \langle b|c\rangle,$$
$$\langle a|a\rangle \geq 0 \text{ with equality iff } a = 0, \tag{A.3}$$
where $\bar{z}$ denotes the complex conjugate of $z$. We note that $\langle\alpha a|\alpha a\rangle = |\alpha|^2\langle a|a\rangle$, which is consistent with the last property in (A.3).


We can now define the following.

Definition A.5 For a vector $a \in V$ we can define its norm
$$\|a\| = \sqrt{\langle a|a\rangle},$$
it being understood that we take the positive square root.

We can generalize $\mathbb{R}^n$ to $\mathbb{C}^n$, where $r \in \mathbb{C}^n$ can be represented by the $n$-tuple of complex
numbers
$$r = (z_1, \ldots, z_n).$$
Addition and multiplication by a scalar go through just as usual, i.e., if
$$r_1 = (z_1, \ldots, z_n), \qquad r_2 = (\eta_1, \ldots, \eta_n),$$
then
$$\alpha r_1 = (\alpha z_1, \ldots, \alpha z_n), \qquad r_1 + r_2 = (z_1 + \eta_1, \ldots, z_n + \eta_n).$$
However, if we are going to use our definition of inner product we will require
$$\langle r_1|r_2\rangle = \sum_{i=1}^n\bar{z}_i\eta_i. \tag{A.4}$$
Consequently,
$$\|r_1\|^2 = \sum_{i=1}^n|z_i|^2.$$

We can deduce the following from Definition A.4 of the inner product.

Lemma A.6 If $u, v \in V$ then
$$|\langle u|v\rangle| \leq \|u\|\|v\|.$$

Proof. Let
$$w = u + \lambda v.$$
Then
$$\langle w|w\rangle \geq 0.$$
But
$$\langle w|w\rangle = \langle u|u\rangle + \lambda\langle u|v\rangle + \bar{\lambda}\langle v|u\rangle + |\lambda|^2\langle v|v\rangle = \|u\|^2 + \lambda\langle u|v\rangle + \bar{\lambda}\langle v|u\rangle + |\lambda|^2\|v\|^2 \geq 0. \tag{A.5}$$
Take
$$\lambda = -\frac{\langle v|u\rangle}{\|v\|^2},$$
and, using
$$\langle u|v\rangle = \overline{\langle v|u\rangle},$$
(A.5) becomes
$$\|u\|^2 - \frac{|\langle u|v\rangle|^2}{\|v\|^2} \geq 0. \qquad\square$$
This result is known as the Cauchy–Schwarz inequality.

Definition A.7 Two vectors $a, b \in V$ are said to be orthogonal if
$$\langle a|b\rangle = 0.$$
If further
$$\langle a|a\rangle = \langle b|b\rangle = 1,$$
the vectors are said to be orthonormal.

Lemma A.8 If $\{a_i\}_{i=1}^N$ are a set of mutually orthogonal non-zero vectors then they are linearly
independent.

Proof. Suppose
$$\sum_{i=1}^N\alpha_ia_i = 0 \;\Rightarrow\; \sum_{i=1}^N\alpha_i\langle a_q|a_i\rangle = 0 \;\Rightarrow\; \alpha_q\langle a_q|a_q\rangle = 0 \;\Rightarrow\; \alpha_q = 0.$$
This is enough to establish linear independence. $\square$

Lemma A.9 If $\{a_n\}_{n=1}^N$ is a set of linearly independent vectors which span $V$ then there exists an
orthonormal set $\{e_n\}_{n=1}^N$ which also spans $V$.

Proof. I will prove the result by explicitly constructing a set $\{e_n\}_{n=1}^N$:
$$e_1 = \frac{a_1}{\|a_1\|},$$
$$e_2' = a_2 - \langle a_2|e_1\rangle e_1, \qquad e_2 = \frac{e_2'}{\|e_2'\|},$$
$$\vdots$$
$$e_N' = a_N - \sum_{k=1}^{N-1}\langle a_N|e_k\rangle e_k, \qquad e_N = \frac{e_N'}{\|e_N'\|}. \qquad\square \tag{A.6}$$

The method of creating an orthonormal basis I have just employed is known as the Gram–Schmidt orthogonalization method.
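
As a concrete illustration (mine, not the book's), the construction (A.6) translates almost line for line into numpy:

    import numpy as np

    def gram_schmidt(a):
        """Orthonormalize the rows of a (assumed linearly independent), Eq. (A.6)."""
        e = []
        for v in a.astype(complex):
            for ek in e:                       # subtract projections onto e_1..e_k
                v = v - np.vdot(ek, v) * ek
            e.append(v / np.linalg.norm(v))
        return np.array(e)

    E = gram_schmidt(np.array([[1.0, 1.0, 0.0],
                               [1.0, 0.0, 1.0],
                               [0.0, 1.0, 1.0]]))
    print(np.round(E @ E.conj().T, 12))        # the identity: rows are orthonormal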

Lemma A.10 Let $V$ be a vector space over $\mathbb{C}$ and let $\{e_1, \ldots, e_N\}$ be a basis for $V$. Let $\{w_i\}_{i=1}^M$ be
a set of non-zero vectors in $V$. If $M > N$ then the set $\{w_i\}$ is linearly dependent.

Proof. Let us begin by assuming that, on the contrary, the set $\{w_i\}$ is linearly independent.
Since $\{e_i\}$ forms a basis we may write
$$w_1 = \sum_{i=1}^N\alpha_ie_i;$$
at least one $\alpha_i$ must be non-zero; renumbering if necessary we can choose it to be $\alpha_1$,
$$\Rightarrow e_1 = \alpha_1^{-1}\left[w_1 - \sum_{i=2}^N\alpha_ie_i\right].$$
Hence, the set $\{w_1, e_2, \ldots, e_N\}$ spans $V$.

Now we may repeat the argument for $w_2$,
$$w_2 = \beta_1w_1 + \sum_{i=2}^N\beta_ie_i,$$
and since we are assuming $\{w_i\}$ is linearly independent, at least one $\beta_i$ with $i \geq 2$ is non-zero. We can keep repeating the argument until $w_1, \ldots, w_N$ spans $V$; then, since $w_{N+1}$ is an element of the space, it can be written
$$w_{N+1} = \sum_{i=1}^N\alpha_iw_i,$$
and is thus linearly dependent on the others, and we have a contradiction. Our original assumption is false and the
result is established. $\square$

LINEAR OPERATORS AND MATRICES

Definition A.11 A linear operator $\hat{T}$ is a map from a vector space $V$ onto itself s.t. for all
$x, y \in V$, $\alpha, \beta \in \mathbb{C}$,
$$\hat{T}(\alpha x + \beta y) = \alpha\hat{T}(x) + \beta\hat{T}(y).$$

Now,
$$\boldsymbol{r} = \boldsymbol{a} + \lambda\boldsymbol{b}$$
defines the equation of a line through $\boldsymbol{a}$ parallel to the vector $\boldsymbol{b}$ [11]; then
$$\hat{T}[\boldsymbol{r}] = \hat{T}[\boldsymbol{a}] + \lambda\hat{T}[\boldsymbol{b}].$$
This is the equation of a line through $\hat{T}[\boldsymbol{a}]$ parallel to $\hat{T}[\boldsymbol{b}]$.


The linear operator $\hat{T}$ maps the vector space onto itself; consequently, if $\{e_i\}_{i=1}^N$ is an orthonormal basis for the finite dimensional space $V$ then for each $e_i$ we must be able to expand
$\hat{T}(e_i)$ in terms of the full basis:
$$\hat{T}(e_i) = \sum_{j=1}^NT_{ji}e_j, \tag{A.7}$$
where the $T_{ji}$ are complex numbers. Taking the inner product with $e_q$ we have
$$\langle e_q|\hat{T}(e_i)\rangle = \sum_{j=1}^NT_{ji}\langle e_q|e_j\rangle = \sum_{j=1}^NT_{ji}\delta_{qj} = T_{qi}. \tag{A.8}$$
Then if $\boldsymbol{r}$ is any vector in $V$ we may expand it in terms of the basis:
$$\boldsymbol{r} = \sum_{i=1}^Nx_ie_i \;\Rightarrow\; \hat{T}(\boldsymbol{r}) = \sum_{i=1}^Nx_i\hat{T}(e_i) = \sum_{i=1}^N\sum_{j=1}^Nx_iT_{ji}e_j = \sum_{j=1}^N\left[\sum_{i=1}^NT_{ji}x_i\right]e_j. \tag{A.9}$$

Thus, once we choose our basis, to every linear transformation we assign an $N\times N$
array of numbers which we will call a matrix:
$$\hat{T} \leftrightarrow T = \begin{pmatrix} T_{11} & T_{12} & \cdots & T_{1N} \\ T_{21} & T_{22} & \cdots & T_{2N} \\ \vdots & \vdots & & \vdots \\ T_{N1} & T_{N2} & \cdots & T_{NN} \end{pmatrix}. \tag{A.10}$$

Definition A.12 If $T$ is an $N\times M$ matrix with components $T_{ij}$, $1 \leq i \leq N$, $1 \leq j \leq M$, and $B$
is an $M\times R$ matrix with components $B_{qp}$ with $1 \leq q \leq M$, $1 \leq p \leq R$, then the matrix $A$ with
components $A_{st} = \sum_{k=1}^MT_{sk}B_{kt}$, where $1 \leq s \leq N$, $1 \leq t \leq R$, is an $N\times R$ matrix known as the
product matrix
$$A = TB.$$
Notice that for a matrix $T$ the element $T_{ij}$ corresponds to the $j$th column and $i$th row.
Just as in $\mathbb{R}^3$ we can write $\boldsymbol{r} \in \mathbb{C}^N$ as an ordered $N$-tuple
$$\boldsymbol{r} = (x_1, x_2, \ldots, x_N). \tag{A.11}$$
It is now, however, expedient to write it as a column vector, i.e., an $N\times 1$ matrix, rather
than a row vector, a $1\times N$ matrix. With this identification (A.9) can be written
$$\hat{T}(\boldsymbol{r}) = T\boldsymbol{r}. \tag{A.12}$$

In summary, for an $N$-dimensional vector space with a fixed orthonormal basis:
• to each vector there is a one to one correspondence with an $N$-tuple;
• to each linear operator there is a one to one correspondence with an $N\times N$ matrix;
• the vector $\hat{T}[\boldsymbol{r}]$, the image of $\boldsymbol{r}$ under the linear transformation $\hat{T}$,
corresponds to the $N$-tuple got by multiplying the $\boldsymbol{r}$ $N$-tuple by the matrix representation
of the operator;
• the inner product corresponds to the multiplication of a $1\times N$ matrix by an $N\times 1$ matrix.

TRANSFORMATIONS FROM $\mathbb{R}^N$ TO $\mathbb{R}^M$

While I will reserve the term "linear operator" only for linear transformations from a vector
space onto itself, we could have a transformation $\hat{A}$ from $\mathbb{R}^N$ to $\mathbb{R}^M$ with associated matrix $A$
having $M$ rows and $N$ columns, i.e., an $M\times N$ matrix. This matrix acts on a vector in $\mathbb{R}^N$,
an $N\times 1$ matrix, and converts it into an $M\times 1$ matrix: a vector in $\mathbb{R}^M$. The set of all such vectors
is called the "image of $\hat{A}$," which I will denote by $\mathcal{A}$.

Now $\mathcal{A}$ is a subspace of $\mathbb{R}^M$; call its dimension $R$. Let $\{e_i\}_{i=1}^N$ be the standard basis for
$\mathbb{R}^N$. Now if $x \in \mathcal{A}$ then there exists $c \in \mathbb{R}^N$ s.t.
$$Ac = x, \qquad c = \sum_{i=1}^N\alpha_ie_i \;\Rightarrow\; x = Ac = \sum_{i=1}^N\alpha_iAe_i.$$
Therefore, the set $\{Ae_i\}_{i=1}^N$ must span $\mathcal{A}$. Further, the vector $Ae_i$ is the $i$th column of
$A$. Hence,
$$R = \dim\text{Span}\{Ae_i\}_{i=1}^N,$$
and $R$ is the number of linearly independent columns of $A$.

Definition A.13 The column rank of A is the maximal number of linearly independent
columns of A . The row rank of A is the maximal number of linearly independent rows of A .

Theorem A.14 The row rank of a matrix A is equal to its column rank [32].

Let $\hat{I}$ be the linear operator acting on the finite dimensional vector space $V$ defined by
$$\hat{I}: V \to V, \qquad \hat{I}[\boldsymbol{r}] = \boldsymbol{r} \text{ for all } \boldsymbol{r} \in V. \tag{A.13}$$
Then from (A.8) the elements of the matrix representation of $\hat{I}$ are given by
$$I_{ij} = \langle e_i|\hat{I}e_j\rangle = \langle e_i|e_j\rangle = \delta_{ij}, \tag{A.14}$$
i.e.,
$$I = \begin{pmatrix} 1 & 0 & \cdots & 0 \\ 0 & 1 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 1 \end{pmatrix}. \tag{A.15}$$
Lemma A.15 For any square matrix $B$,
$$BI = IB = B.$$

Proof. Let $T = BI$. Then
$$T_{ij} = \sum_{q=1}^NB_{iq}\delta_{qj} = B_{ij} = \sum_{p=1}^N\delta_{ip}B_{pj}. \qquad\square$$

Definition A.16 An $N\times N$ matrix $B$ has an inverse, $B^{-1}$, if
$$BB^{-1} = I = B^{-1}B.$$
It is trivial to see that $B^{-1}$ is unique.

Before we can proceed and actually construct an inverse matrix we need to take a quick
tour through the theory of linear equations. Let us begin with the simplest case: suppose we want
to solve the set of two simultaneous equations in $x$ and $y$,
$$a_1x + b_1y = c_1, \tag{A.16}$$
$$a_2x + b_2y = c_2. \tag{A.17}$$
(A.16) and (A.17) are clearly equivalent to the matrix equation $T\boldsymbol{r} = \boldsymbol{c}$,
$$\begin{pmatrix} a_1 & b_1 \\ a_2 & b_2 \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}, \tag{A.18}$$
and clearly we can solve the set of linear equations iff $T^{-1}$ exists. If we multiply (A.16) by $b_2$
and (A.17) by $b_1$ and then subtract, we find
$$x = \frac{b_2c_1 - b_1c_2}{\det T}, \qquad y = \frac{a_1c_2 - c_1a_2}{\det T}, \tag{A.19}$$
where we have introduced the determinant of $T$, which is given by
$$\det T = a_1b_2 - a_2b_1. \tag{A.20}$$
If $\det T = 0$ then we are in trouble, but if it is non-zero then we have solved the set of
linear equations. If $\det T = 0$ and
$$b_2c_1 - b_1c_2 = 0 \quad\text{and}\quad a_1c_2 - c_1a_2 = 0,$$
then there is some hope, but in this case (A.16) and (A.17) are essentially the same equation and
we have only one equation for two unknowns and thus an infinity of solutions. The system of
linear equations (A.18) has a unique solution iff $\det[T] \neq 0$. These results can be generalized; a
determinant can be defined for an $N\times N$ matrix as follows.

Definition A.17 The determinant of an $N\times N$ matrix is defined inductively as follows: it is
a linear combination of products of the elements of any row (or column) and the $(N-1)\times(N-1)$ determinant formed by striking out the row and column of the original determinant in which the
element appeared. The reduced array is called a minor and the sign associated with this product
is $(-1)^{i+j}$. The product of the minor with this sign is called the cofactor. We can keep doing this
until we get down to a sum of $2\times 2$ determinants which can be evaluated using (A.20).

Example A.18
$$\begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix} = (-1)^{1+1}a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix} + (-1)^{1+2}a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix} + (-1)^{1+3}a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}$$
$$= a_{11}\left[a_{22}a_{33} - a_{23}a_{32}\right] - a_{12}\left[a_{21}a_{33} - a_{23}a_{31}\right] + a_{13}\left[a_{21}a_{32} - a_{22}a_{31}\right]. \tag{A.21}$$

It can be shown [32] that if $A$ is an $N\times N$ matrix then it has a unique inverse iff $\det[A] \neq 0$
iff $\text{rank}(A) = N$; this last condition is equivalent to saying that its columns (rows), treated as
vectors, must be linearly independent. In fact, it can be shown that
$$B^{-1} = \frac{C^T}{\det B}, \tag{A.22}$$
where $C$ is the cofactor matrix, constructed as follows. The $ij$ element of the cofactor matrix $C$ is
$c_{ij}$, which is $(-1)^{i+j}$ multiplied by the determinant of the $(N-1)\times(N-1)$ matrix got by striking
out the $i$th row and $j$th column of the original matrix $B$. $C^T$ denotes the transpose of $C$.

Definition A.19 Let $\hat{T}$ be an operator defined on a vector space, $V$, upon which an inner
product is defined. We define the adjoint of $\hat{T}$ to be a linear operator $\hat{T}^\dagger: V \to V$ where for all
$a, b \in V$,
$$\langle a|\hat{T}b\rangle = \langle\hat{T}^\dagger a|b\rangle.$$

Lemma A.20 If $\hat{T}$ is a linear operator acting on an $N$ dimensional vector space $V$ with matrix
representation $T \equiv (T)_{ij}$, then its adjoint $\hat{T}^\dagger$ has the matrix representation $\bar{T}_{ji}$, i.e., we interchange
rows and columns and take the complex conjugate of each element.

Proof.
$$(T)_{ij} = \langle e_i|\hat{T}e_j\rangle = \langle\hat{T}^\dagger e_i|e_j\rangle = \overline{\langle e_j|\hat{T}^\dagger e_i\rangle} \;\Rightarrow\; (T^\dagger)_{ij} = \overline{(T)}_{ji}. \qquad\square$$
To be clear, if we start with an operator $\hat{T}$ with a matrix representation given by (A.10), then
its adjoint $\hat{T}^\dagger$ has a matrix representation given by
$$\hat{T}^\dagger \leftrightarrow T^\dagger = \begin{pmatrix} \bar{T}_{11} & \bar{T}_{21} & \cdots & \bar{T}_{N1} \\ \bar{T}_{12} & \bar{T}_{22} & \cdots & \bar{T}_{N2} \\ \vdots & \vdots & & \vdots \\ \bar{T}_{1N} & \bar{T}_{2N} & \cdots & \bar{T}_{NN} \end{pmatrix}. \tag{A.23}$$

Lemma A.21 Let $A$ and $B$ be $M\times M$ complex matrices; then
$$(AB)^\dagger = B^\dagger A^\dagger.$$

Proof. Looking at components,
$$(AB)_{ij} = \sum_{q=1}^Ma_{iq}b_{qj},$$
$$(AB)^\dagger_{ij} = \overline{(AB)}_{ji} = \sum_{q=1}^M\bar{a}_{jq}\bar{b}_{qi} = \sum_{q=1}^M(B^\dagger)_{iq}(A^\dagger)_{qj} = (B^\dagger A^\dagger)_{ij}. \qquad\square$$
Definition A.22 Let $\hat{T}$ be an operator defined on a vector space, $V \to V$; if $\hat{T} = \hat{T}^\dagger$, then
the operator is said to be self-adjoint. We know that the matrix representation of the adjoint
is given by
$$T^\dagger_{ij} = \bar{T}_{ji}.$$
If $\hat{T}$ is self-adjoint then it follows that
$$T_{ij} = \bar{T}_{ji}; \tag{A.24}$$
such a matrix is said to be Hermitian.
Definition A.23 A non-zero vector $a$ is an eigenvector of $\hat{T}$ with eigenvalue $\lambda$ if the effect of
acting with the operator is simply to multiply the vector by $\lambda$, i.e.,
$$\hat{T}a = \lambda a.$$

Lemma A.24 If $\hat{T}$ is a self-adjoint operator defined $V \to V$, then its eigenvalues must be real.

Proof. Let $a$ be an eigenvector of $\hat{T}$ with eigenvalue $\lambda$. Note we have excluded the null vector
from being an eigenvector, but we have not excluded the number zero from being an eigenvalue.
Consider
$$\langle a|\hat{T}a\rangle = \langle a|\lambda a\rangle = \lambda\langle a|a\rangle = \lambda\|a\|^2,$$
$$\langle\hat{T}^\dagger a|a\rangle = \langle\hat{T}a|a\rangle = \langle\lambda a|a\rangle = \bar{\lambda}\|a\|^2,$$
$$\Rightarrow \lambda = \bar{\lambda}. \qquad\square$$
Lemma A.25 If $\{b_i\}_{i=1}^M$ are the eigenvectors of a self adjoint operator $\hat{B}$ corresponding to distinct
eigenvalues $\{\beta_i\}_{i=1}^M$, then these eigenvectors are orthogonal.

Proof. Suppose $b_i, b_j$ are eigenvectors corresponding to eigenvalues $\beta_i, \beta_j$, $\beta_i \neq \beta_j$. Remembering that the eigenvalues are real, we can write
$$\langle b_i|\hat{B}b_j\rangle = \beta_j\langle b_i|b_j\rangle = \langle\hat{B}b_i|b_j\rangle = \beta_i\langle b_i|b_j\rangle \;\Rightarrow\; \left[\beta_i - \beta_j\right]\langle b_i|b_j\rangle = 0 \;\Rightarrow\; \langle b_i|b_j\rangle = 0. \qquad\square$$
These two lemmas, though simple to prove, turn out to be very important. We have proved
the results for a general operator rather than just the matrix representation, so they will hold in
any finite or infinite dimensional vector space. Suppose we are working in an $N$-dimensional
space.

Theorem A.26 If $B$ is an $N\times N$ matrix, its eigenvalues are the solutions of the equation
$$\det[B - \beta I] = 0.$$

Proof. Suppose $b$ is an eigenvector, with eigenvalue $\beta$; then
$$Bb = \beta b \;\Rightarrow\; [B - \beta I]b = 0.$$
If $[B - \beta I]^{-1}$ exists, then if we act with it we find that $b = 0$, i.e., no eigenvectors exist; so
for us to find eigenvalues we must have
$$\det[B - \beta I] = 0.$$
This will yield a polynomial of order $N$ in $\beta$, and by the fundamental theorem of algebra
this has $N$ complex roots, which is the maximal number of eigenvalues possible. $\square$
Suppose $B$ is a self-adjoint operator with a maximal set of distinct eigenvalues $\{\beta_i\}_{i=1}^N$ with
associated eigenvectors $\{b_i\}_{i=1}^N$. The eigenvectors may be written
$$b_i = \begin{pmatrix} b_{1i} \\ \vdots \\ b_{Ni} \end{pmatrix}, \qquad \sum_{k=1}^N\bar{b}_{ki}b_{kj} = \delta_{ij}. \tag{A.25}$$
If we define a matrix $S$ whose columns are the eigenvectors of $B$, i.e.,
$$S_{ij} \equiv b_{ij}, \tag{A.26}$$
then (A.25) is equivalent to
$$S^{-1} = S^\dagger. \tag{A.27}$$
An operator which satisfies the property (A.27) is said to be unitary. I will have a lot more
to say about such operators below. Consider:
$$\left[S^\dagger BS\right]_{ij} = \sum_{k=1}^NS^\dagger_{ik}\left(\sum_{p=1}^NB_{kp}S_{pj}\right) = \sum_{k=1}^N\bar{b}_{ki}\left(\sum_{p=1}^NB_{kp}b_{pj}\right) = \sum_{k=1}^N\bar{b}_{ki}\beta_jb_{kj} = \beta_j\delta_{ij}, \tag{A.28}$$
or in matrix form
$$S^\dagger BS = \begin{pmatrix} \beta_1 & 0 & \cdots & 0 \\ 0 & \beta_2 & \cdots & 0 \\ \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \beta_N \end{pmatrix}. \tag{A.29}$$
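
Numerically, this diagonalization is exactly what a routine such as numpy's eigh performs for a Hermitian matrix; the following check (my illustration) confirms (A.27) and (A.29):

    import numpy as np

    B = np.array([[2.0, 1.0j],
                  [-1.0j, 3.0]])                       # a Hermitian matrix
    beta, S = np.linalg.eigh(B)                        # eigenvalues, eigenvector columns

    print(np.allclose(S.conj().T @ S, np.eye(2)))          # S unitary, Eq. (A.27)
    print(np.allclose(S.conj().T @ B @ S, np.diag(beta)))  # diagonal form, Eq. (A.29)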

Our derivation of (A.29) depended on the eigenvectors being mutually orthogonal, the
proof of which depended on the eigenvalues being distinct. It is not unusual to find a self-adjoint
operator B which has more than one eigenvectors which are linearly independent of each other
but have the same eigenvalue. Suppose the operator BO has M eigenvectors fbi gM 1 such that each
of them satisfies
O i D ˇb1 :
Bb (A.30)
Consider
m
X
aD ˛i bi ;
i D1
132 A. VECTOR SPACES
where ˛i are complex numbers then
M
X
O
BŒa D O i
˛i BŒb
i D1
M
X
D ˛i ˇŒbi 
i D1
D ˇa: (A.31)
Thus, the set  D feigenvectors of BO with eigenvalue ˇg is itself a vector space which is
a subspace of our original space. We may chose a maximal set of M , say, linearly independent
vectors which we can orthognalize to each other and to the other eigenfunctions of BO using our
Grahm–Schmidt processes. We can repeat this processes for any other degenerate eigenvalues
until we arrive at a maximal set of mutually orthogonal eigenvectors.

CHANGE OF BASIS

The matrix we constructed in (A.27) we described as unitary.

Definition A.27 A linear operator $\hat{U}$ is said to be unitary if
$$\hat{U}^\dagger = \hat{U}^{-1}.$$
Lemma A.28 If $U$ is a linear transformation then the following are equivalent:

(a) $U$ is unitary.

(b) For every $x \in V$, $\|Ux\| = \|x\|$.

(c) For every $x, y \in V$, $\langle Ux|Uy\rangle = \langle x|y\rangle$.

Proof. In this proof we will make use of the following identity:
$$\frac{1}{4}\left[\langle x + y|x + y\rangle - \langle x - y|x - y\rangle\right] = \frac{1}{4}\left[\|x\|^2 + \|y\|^2 + 2\langle x|y\rangle - \left(\|x\|^2 + \|y\|^2 - 2\langle x|y\rangle\right)\right] = \langle x|y\rangle. \tag{A.32}$$
If $U$ is unitary then (c) follows, since
$$\langle Ux|Uy\rangle = \langle U^\dagger Ux|y\rangle = \langle x|y\rangle,$$
and (b) is a special case of (c), so we have $(a) \Rightarrow (c) \Rightarrow (b)$.

Suppose (b) holds. Consider:
$$\langle Ux|Uy\rangle = \frac{1}{4}\left[\langle Ux + Uy|Ux + Uy\rangle - \langle Ux - Uy|Ux - Uy\rangle\right] = \frac{1}{4}\left[\|U(x + y)\|^2 - \|U(x - y)\|^2\right] = \frac{1}{4}\left[\|x + y\|^2 - \|x - y\|^2\right] = \langle x|y\rangle.$$
So $(b) \Rightarrow (c)$.

Suppose (c); then
$$\langle Ux|Uy\rangle = \langle x|y\rangle = \langle U^\dagger Ux|y\rangle \;\Rightarrow\; \langle x - U^\dagger Ux|y\rangle = 0 \;\forall\, y \;\Rightarrow\; x = U^\dagger Ux \;\forall\, x \;\Rightarrow\; U^\dagger U = I,$$
so $(c) \Rightarrow (a)$. $\square$

Lemma A.29 Suppose $A$ is an $N\times N$ matrix and $U$ is a unitary matrix; then if $B = UAU^\dagger$,
$B$ and $A$ have the same eigenvalues.

Proof. Suppose $\boldsymbol{r}$ is an eigenvector of $A$ with eigenvalue $\lambda$:
$$A\boldsymbol{r} = \lambda\boldsymbol{r} \;\Rightarrow\; U^\dagger BU\boldsymbol{r} = \lambda\boldsymbol{r} \;\Rightarrow\; B(U\boldsymbol{r}) = \lambda(U\boldsymbol{r}).$$
Thus, $\lambda$ is an eigenvalue of $B$ corresponding to the eigenvector $U\boldsymbol{r}$. $\square$

APPENDIX B

Analytic Solution to the Quantum Oscillator
The harmonic oscillator is one of the few quantum systems that admits a relatively simple analytic solution [26]. We use this solution to benchmark our numerical code. We start with the
Schrödinger equation
$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + \frac{1}{2}m\omega^2x^2\psi(x) = E\psi(x). \tag{B.1}$$
We will work with units where $\hbar = 1$, $m = 1$, $\omega = 1$. In these units,
$$\frac{d^2\psi(x)}{dx^2} = (x^2 - 2E)\psi. \tag{B.2}$$
This formal differential equation, (B.2), needs to be augmented by boundary conditions. In
order to retain the probability interpretation [23], we must require that the function $\psi(x)$ be
square integrable, i.e., we must have
$$\int_{-\infty}^{\infty}|\psi(x)|^2dx$$
finite. A necessary condition is that
$$\lim_{x\to\pm\infty}\psi(x) \to 0. \tag{B.3}$$
Equation (B.2) is a second-order differential equation and as such will admit two linearly
independent solutions, only one of which will be consistent with the boundary condition (B.3).
Asymptotically, for $x \gg 1$, we can approximate
$$\frac{d^2\psi(x)}{dx^2} \approx x^2\psi(x), \tag{B.4}$$
which has solutions
$$\psi_{\pm}(x) = e^{\pm x^2/2}. \tag{B.5}$$
Notice
$$\lim_{x\to\infty}\psi_-(x) \to 0, \qquad \lim_{x\to\infty}\psi_+(x) \to \infty. \tag{B.6}$$

Clearly, we don't want the divergent solution $\psi_+(x)$. Returning to the full differential
equation, (B.2), let us look to see if we can find a solution of the form
$$\psi(x) = h(x)e^{-x^2/2}, \tag{B.7}$$
where $h(x)$ is some analytic function with a power series expansion
$$h(x) = \sum_{j=0}^{\infty}a_jx^j, \tag{B.8}$$
from which it follows that
$$h'(x) = \sum_{j=0}^{\infty}a_jjx^{j-1}, \qquad h''(x) = \sum_{j=0}^{\infty}a_jj(j-1)x^{j-2},$$
$$\Rightarrow \frac{d\psi(x)}{dx} = h'(x)e^{-x^2/2} - xh(x)e^{-x^2/2},$$
$$\frac{d^2\psi(x)}{dx^2} = h''(x)e^{-x^2/2} - 2xh'(x)e^{-x^2/2} - h(x)e^{-x^2/2} + x^2h(x)e^{-x^2/2}. \tag{B.9}$$
Thus,
$$\frac{d^2\psi(x)}{dx^2} - x^2\psi(x) + 2E\psi(x) = 0$$
$$\Rightarrow \sum_{j=0}^{\infty}\left[a_{j+2}(j+2)(j+1) - 2a_jj + (2E - 1)a_j\right]x^j = 0. \tag{B.10}$$
Now, as we noted earlier, power series coefficients are unique, so
$$a_{j+2}(j+2)(j+1) - 2a_jj + (2E - 1)a_j = 0$$
$$\Rightarrow a_{j+2} = \frac{(2j + 1 - 2E)a_j}{(j+1)(j+2)}. \tag{B.11}$$

Thus, if we know $a_0$ we can find all even coefficients, and if we have $a_1$ we can find all odd
coefficients. We may write
$$h(x) = h_{even}(x) + h_{odd}(x). \tag{B.12}$$
Since we are interested in asymptotics we can concentrate on the larger powers, $j \gg 1$,
for which
$$a_{j+2} \approx \frac{2}{j}a_j;$$
thus
$$h(x) = h_{even}(x) + h_{odd}(x) \approx a_0\left(1 + \sum_{j=1}^{\infty}\frac{2^j}{j!}x^{2j}\right) + a_1\left(x + \sum_{j=1}^{\infty}\frac{2^j}{(j+1)!}x^{2j+1}\right). \tag{B.13}$$
This diverges like $e^{+x^2}$, the solution that we didn't want. But if there exists an integer $j$
such that
$$2j + 1 = 2E, \tag{B.14}$$
then one of the series will terminate and we can set either $a_0$ or $a_1$ equal to zero to get rid of
the diverging series. The resulting finite series solution will have the correct asymptotic form. The
normalized eigenfunctions are
$$\psi_n(x) = \frac{1}{\sqrt{2^nn!\sqrt{\pi}}}H_n(x)e^{-\frac{x^2}{2}}, \tag{B.15}$$
where $H_n(x)$ is a Hermite polynomial. The first few are given by
$$H_0(x) = 1, \qquad H_1(x) = 2x, \qquad H_2(x) = 4x^2 - 2, \qquad H_3(x) = 8x^3 - 12x. \tag{B.16}$$
Notice that $H_0(x), H_2(x)$ are even functions of $x$ while $H_1(x), H_3(x)$ are odd. In Figure B.1 the first four eigenfunctions are plotted. As expected, the lowest eigenfunction, corresponding to $E_0$, is even with no zeros; $\psi_1$ is
odd with one zero, $\psi_2$ even with two zeros, and $\psi_3$ odd with three zeros.
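
These eigenfunctions are easy to generate for plotting; a minimal sketch (mine) using numpy's Hermite module:

    import numpy as np
    from numpy.polynomial.hermite import Hermite
    from math import factorial, pi, sqrt

    def phi(n, x):
        """Normalized oscillator eigenfunction, Eq. (B.15), hbar = m = omega = 1."""
        Hn = Hermite([0.0] * n + [1.0])(x)     # physicists' Hermite H_n(x)
        norm = 1.0 / sqrt(2.0**n * factorial(n) * sqrt(pi))
        return norm * Hn * np.exp(-x * x / 2.0)

    x = np.linspace(-4.0, 4.0, 801)
    print(np.trapz(phi(3, x)**2, x))           # approx. 1: normalization check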
[Figure B.1: a plot of $\phi_n(x)$ against $x$ for $-4 \le x \le 4$.]

Figure B.1: The first four harmonic oscillator eigenfunctions for unit frequency, $\omega = 1$, in units where $\hbar = m = 1$: $n = 0$, dashed red; $n = 1$, dotted blue; $n = 2$, dash-dotted green; $n = 3$, solid black.

Equation (B.14) leads us to the allowed energy eigenvalues

$$ E = j + \frac{1}{2}. \tag{B.17} $$
If $j$ is odd we must take $a_0 = 0$, and if $j$ is even we must take $a_1 = 0$; we then recover the Hermite polynomials, which, as you will recall, contain only odd or only even powers of $x$. Thus, the associated eigenfunctions, $\phi_n(x)$, are such that

$$ \phi_{2n}(-x) = \phi_{2n}(x), $$
$$ \phi_{2n+1}(-x) = -\phi_{2n+1}(x). \tag{B.18} $$
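The role of the termination condition (B.14) can be made concrete with a few lines of code. The following sketch (illustrative only; the function name is ours) iterates the recurrence (B.11): for $E = n + \tfrac{1}{2}$ the relevant series terminates after $x^n$, while for any other value of $E$ the coefficients fall off only factorially, which is exactly the divergent $e^{+x^2}$ behavior of (B.13):

def series_coefficients(E, a0=1.0, a1=0.0, jmax=20):
    """Coefficients a_j of h(x) from the recurrence (B.11),
    a_{j+2} = (2j + 1 - 2E) a_j / ((j + 1)(j + 2))."""
    a = [a0, a1]
    for j in range(jmax - 1):
        a.append((2 * j + 1 - 2 * E) * a[j] / ((j + 1) * (j + 2)))
    return a

# E = 2 + 1/2: the even series terminates after x^2 (H_2 up to scale).
print(series_coefficients(E=2.5)[:6])   # [1.0, 0.0, -2.0, 0.0, 0.0, 0.0]

# E = 2.4: no termination; the tail builds up e^{+x^2} behavior.
print(series_coefficients(E=2.4)[:6])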

APPENDIX C

First-Order Perturbation Theory
Suppose we have a Hamiltonian $\hat{H}_0$ which has a known set of eigenvalues, $E_j$, with an associated set of orthonormal eigenvectors $\{\psi_j^0\}$. We will assume there is no degeneracy, i.e., $E_i \neq E_j$ if $i \neq j$. If $\psi$ is any state of the system, then

$$ \psi = \sum_j a_j\, \psi_j^0. \tag{C.1} $$

Now, suppose our system is “perturbed” by a small extra potential, so we will have another Hamiltonian $\hat{H}$ which is “not too different” from $\hat{H}_0$. We can write

$$ \hat{H} = \hat{H}_0 + \hat{H}_I, $$

where $\hat{H}_I = \lambda \hat{H}_1$ and $\lambda$ is “quadratically small;” in other words, $\lambda^2$ is negligibly small. For example, if we place a one-electron atom in an electric field, $\mathbf{E} = \mathcal{E}\,\mathbf{e}_z$, the Hamiltonians are

$$ \hat{H}_0 = -\frac{\hbar^2}{2\mu}\nabla^2 - \frac{Z}{r}, \qquad \hat{H}_1 = z. \tag{C.2} $$

We could reasonably assume that if $\lambda = \mathcal{E}$ is small, the effect on the energy levels will also be small, i.e., we would expect the eigenenergies of the new Hamiltonian to be very similar to the originals; in other words, if $\tilde{E}_i$ is a new eigenvalue, then

$$ \hat{H}\,\psi_i = \tilde{E}_i\,\psi_i, \qquad \tilde{E}_i = E_i + \Delta E_i, \qquad |\Delta E_i| \ll 1. \tag{C.3} $$

We also expect that the new eigenvector, $\psi_i$, will be “not too different” from the original $\psi_i^0$. To be a little more precise, we can expand

$$ \psi_i = \sum_j c_{ij}\, \psi_j^0, \tag{C.4} $$

and since we require $\langle \psi_i | \psi_i \rangle = 1$, we have

$$ \sum_j |c_{ij}|^2 = 1. \tag{C.5} $$

We require

$$ c_{ii} \approx 1, \qquad c_{ij} \approx 0, \quad i \neq j, \tag{C.6} $$

i.e., we require $c_{ii}$ to differ from 1 by a quadratically small quantity. The eigenvalue equation for $\psi_i$ is

$$
\begin{aligned}
\hat{H}\,\psi_i &= \tilde{E}_i\,\psi_i, \\
\left(\hat{H}_0 + \lambda\hat{H}_1\right)\psi_i &= (E_i + \Delta E_i)\,\psi_i, \\
\left[\hat{H}_0 + \lambda\hat{H}_1\right]\sum_j c_{ij}\,\psi_j^0 &= (E_i + \Delta E_i)\sum_j c_{ij}\,\psi_j^0, \\
\sum_j c_{ij}\left(E_j\,\psi_j^0 + \lambda\hat{H}_1\,\psi_j^0\right) &= E_i\sum_j c_{ij}\,\psi_j^0 + \Delta E_i\sum_j c_{ij}\,\psi_j^0, \\
\sum_{j\neq i} c_{ij}\left[E_j - E_i\right]\psi_j^0 + \lambda\hat{H}_1\sum_j c_{ij}\,\psi_j^0 &= \Delta E_i\sum_j c_{ij}\,\psi_j^0.
\end{aligned}
\tag{C.7}
$$
j ¤i j j

Let us now neglect the quadratically small quantities $\lambda c_{ij}$ and $\Delta E_i\, c_{ij}$ when $i \neq j$, and assume $c_{ii} \approx 1$; then, taking the inner product with $\psi_i^0$, we have

$$ \langle \psi_i^0 |\, \hat{H}_I\, \psi_i^0 \rangle \approx \Delta E_i. \tag{C.8} $$

That is, the shift $\Delta E_i$ in the level $E_i$ resulting from the addition of the perturbation $\hat{H}_I$ to the original Hamiltonian is just the expectation value of the perturbing Hamiltonian calculated with the original eigenket $\psi_i^0$. For our purposes here we will not need more than the energy shift $\Delta E_i$; however, if we take the inner product of (C.7) with $\psi_q^0$, we find that

$$ c_{iq} = \frac{\langle \psi_q^0 |\, \hat{H}_I\, \psi_i^0 \rangle}{E_i - E_q}, \qquad i \neq q, \tag{C.9} $$

which, together with $c_{ii} = 1$, gives us an estimate for $\psi_i$. Notice that our original assumption of non-degeneracy means that (C.9) is well defined. If the level $i$ is degenerate, then our assumption that all $c_{ij}$, $i \neq j$, are quadratically small may not hold. We do not need all the eigenvalues to be non-degenerate, only the particular $E_i$ we want to study. The approximation (C.8) is correct only to first order in small quantities.
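The content of (C.8) is easy to check in a finite-dimensional model. The sketch below (a minimal check, assuming NumPy; the matrices stand in for $\hat{H}_0$ and $\hat{H}_1$ and are purely illustrative) compares the exact eigenvalues of $H_0 + \lambda H_1$ with the first-order estimates $E_i + \lambda\,\langle \psi_i^0 | H_1\, \psi_i^0 \rangle$:

import numpy as np

rng = np.random.default_rng(1)

n = 6
# Unperturbed Hamiltonian: diagonal, with non-degenerate eigenvalues E_j.
H0 = np.diag(np.arange(1.0, n + 1.0))
# A random symmetric perturbation H_1.
M = rng.normal(size=(n, n))
H1 = (M + M.T) / 2

for lam in (1e-1, 1e-2, 1e-3):
    exact = np.linalg.eigvalsh(H0 + lam * H1)
    # First-order estimate (C.8): E_i + lam <i|H_1|i>.
    first_order = np.diag(H0) + lam * np.diag(H1)
    err = np.max(np.abs(np.sort(first_order) - exact))
    print(f"lambda = {lam:.0e}:  max error = {err:.2e}")

Each factor of 10 reduction in $\lambda$ reduces the error by roughly a factor of 100, consistent with the neglected quadratically small terms.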
Bibliography

[1] William H. Press, Saul A. Teukolsky, William T. Vetterling, and Brian P. Flannery. Numerical Recipes in Fortran 77. Cambridge University Press, Cambridge, 1992. 4

[2] LAPACK (linear algebra package) is a standard software library for numerical linear algebra. www.netlib.org/lapack 52

[3] The Numerical Algorithms Group (NAG) library is a commercial software library. https://www.nag.co.uk/content/nag-library-fortran 4

[4] CASTEP is a shared-source suite for calculating the electronic properties of crystalline solids, surfaces, molecules, liquids, and amorphous materials from first principles. www.castep.org 4

[5] Quantum ESPRESSO is a suite for first-principles electronic-structure calculations and materials modeling. https://www.quantum-espresso.org

[6] Gaussian is a general-purpose computational chemistry software package. www.gaussian.com

[7] K. G. Dyall, I. P. Grant, C. T. Johnson, F. A. Parpia, and E. P. Plummer. GRASP: A general-purpose relativistic atomic structure program. Comput. Phys. Commun., 55:425, 1989. DOI: 10.1016/0010-4655(89)90136-7

[8] R. J. Needs, M. D. Towler, N. D. Drummond, and P. Lopez Rios. Continuum variational and diffusion quantum Monte Carlo calculations. J. Phys. Condens. Matter, 22:023201, 2010. DOI: 10.1088/0953-8984/22/2/023201 4

[9] Brian W. Kernighan and Dennis M. Ritchie. The C Programming Language, 2nd ed., Prentice Hall, 1988. DOI: 10.1007/978-3-662-09507-2_22 4

[10] John V. Guttag. Introduction to Computation and Programming Using Python: With Application to Understanding Data. MIT Press, 2016. 4

[11] Colm T. Whelan. A First Course in Mathematical Physics. Wiley-VCH, 2016. 5, 21, 33, 34, 63, 92, 97, 123

[12] S. Lang. Analysis I. Addison-Wesley, New York, 1968. 6, 11, 21

[13] Hans J. Weber and George B. Arfken. Essential Mathematical Methods for Physicists. Elsevier, 2004. 11, 66, 67, 71

[14] Milton Abramowitz and Irene A. Stegun. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables. Dover/National Bureau of Standards, 1972. DOI: 10.1115/1.3625776 16

[15] Endre Süli and David Mayers. An Introduction to Numerical Analysis. Cambridge University Press, 2003. DOI: 10.1017/cbo9780511801181 28, 47, 53, 55, 63

[16] Donald L. Kreider, Robert G. Kuller, Donald R. Ostberg, and Fred W. Perkins. An Introduction to Linear Analysis. Addison-Wesley, Reading, 1966. DOI: 10.2307/2313834 33, 38

[17] John R. Taylor. Classical Mechanics. University Science Books, 2005. 33, 44, 97, 98

[18] J. H. Wilkinson. The Algebraic Eigenvalue Problem. Oxford, 1965. DOI: 10.1007/978-93-86279-52-1_11 52

[19] James W. Longley. Modified Gram-Schmidt process vs. classical Gram-Schmidt. Communic. Statist. Simul. Computat., 10(5):517, 1981. DOI: 10.1080/03610918108812227 58

[20] Colm T. Whelan. On the Bethe approximation to the reactance matrix. J. Phys. B, 19:2343, 1986. DOI: 10.1088/0022-3700/19/15/015 67

[21] Alan Burgess and Colm T. Whelan. BETRT—a procedure to evaluate cross-sections for electron hydrogen collisions in the Bethe approximation to the reactance matrix. Comput. Phys. Commun., 47:295, 1987. DOI: 10.1016/0010-4655(87)90115-9 67

[22] Earl A. Coddington and Norman Levinson. Ordinary Differential Equations. McGraw-Hill, 1984. DOI: 10.1063/1.3059875 79

[23] Colm T. Whelan. Atomic Structure. IOP-Morgan & Claypool, 2018. DOI: 10.1088/978-1-6817-4880-1 83, 92, 107, 108, 109, 110, 115, 135

[24] C. Froese-Fischer. The Hartree-Fock Method for Atoms—a Numerical Approach. Wiley, 1977. 115

[25] Reiner M. Dreizler and Eberhard K. U. Gross. Density Functional Theory. Springer, 1999. DOI: 10.1007/978-3-642-86105-5 92

[26] David J. Griffiths. Introduction to Quantum Mechanics, 2nd ed., Cambridge University Press, 2017. DOI: 10.1017/9781316995433 108, 110, 135

[27] A. R. P. Rau. The negative ion of hydrogen. J. Astrophys. Astr., 17:113, 1996. DOI: 10.1007/bf02702300 111

[28] Shaun Lucey, Colm T. Whelan, R. J. Allan, and H. R. J. Walters. (e, 2e) on hydrogen minus. J. Phys. B, 29(13):L489, 1996. DOI: 10.1088/0953-4075/29/13/002 111

[29] H. Bethe. Berechnung der Elektronenaffinität des Wasserstoffs. Z. Phys., 57:815, 1929. DOI: 10.1007/bf01340659 111

[30] S. Chandrasekhar. Some remarks on the negative hydrogen ion and its absorption coefficient. Astrophys. J., 100:176, 1944. DOI: 10.1086/144654 112

[31] Attila Szabo and Neil S. Ostlund. Modern Quantum Chemistry: Introduction to Advanced Electronic Structure Theory. Dover, 1996. 112

[32] Serge Lang. Linear Algebra. Springer, 1987. DOI: 10.1007/978-1-4757-1949-9 125, 127

Author’s Biography
COLM T. WHELAN
Colm T. Whelan is a Professor of Physics and an Eminent Scholar at Old Dominion University
in Norfolk, Virginia. He received his Ph.D. in Theoretical Atomic Physics from the University
of Cambridge in 1985 and was awarded an Sc.D., also from Cambridge, in 2001. He is a Fellow
of both the American Physical Society and the Institute of Physics (UK). He has over 30 years
of experience in the teaching of physics.
Index

algorithms, 1–3
    unstable, 2
Cauchy–Schwarz inequality, 120
chaotic motion, 43–44
determinant, 126–128
differential equation
    homogeneous, 39
    inhomogeneous, 39
    particular solution, 39
error
    cancellation, 1
    roundoff, 1
Euler approximation, 25–27
Euler–Lagrange equations, 94–97
extrema
    functions of one variable, 6–7
    functions of several variables, 7–10
Gaussian orbitals, 112
Gram–Schmidt orthogonalization, 49, 52, 56, 57, 70, 122, 132
Hartree method, 112–115
interpolation, 59–63
interval of convergence, 11
Lagrange multiplier, 9–10, 98–104, 113–115
Least squares method, 54–58
Legendre
    differential equation, 63–67
    functions of the second kind, 67–68
    polynomials, 65–67
matrix, 123
    Hermitian, 129
    lower triangular, 47
    upper triangular, 46
Newton's laws, 97
numerical differentiation, 12–13
    3-point formula, 13
    5-point formula, 13
    second order, 13
Numerov method, 30–32
operator
    self-adjoint, 129–132
    unitary, 54, 132–133
orthogonal vectors, 121
oscillator
    critically damped, 38
    damped, 36–39
    forced, 39–44
    over damped, 38
    under damped, 37
parity
    even, 84
    odd, 84
perturbation theory
    time independent, 140
quadrature, 13–16
    Boole's (Bode's) rule, 16
    elementary, 17
    Gaussian, 74–75
    Newton–Cotes formulae, 14
    Simpson's 3/8 rule, 16
    Simpson's rule, 16
    trapezoidal rule, 14
quantum oscillator, 83–90
Rayleigh–Ritz theorem, 91–94
Rayleigh–Ritz quotient, 92
resonance, 40
roots
    finding, 17–19
    Newton–Raphson formula, 19
    secant method, 19
Runge function, 63
Runge–Kutta
    fourth order approximation, 28
    method, 27–29
    second order approximation, 28
shooting method, 86
significand, 1
similarity transformation, 51
simple harmonic oscillator, 40
Slater
    orbitals, 112
Sturm–Liouville theory, 77–81
Taylor's series, 5–7, 25
    remainder term
        Cauchy form, 6
        integral form, 5
        Lagrange form, 6
Vandermonde matrix, 60, 73
variational principles, 91–104
vector space
    basis, 118
