
WORLD SCIENTIFIC SERIES ON NONLINEAR SCIENCE
Series Editor: Leon O. Chua

MODELING AND COMPUTATIONS
IN DYNAMICAL SYSTEMS
In commemoration of the 100th anniversary
of the birth of John von Neumann
edited by
EUSEBIUS J. DOEDEL, GABOR DOMOKOS &
IOANNIS G. KEVREKIDIS

World Scientific
WORLD SCIENTIFIC SERIES ON NONLINEAR SCIENCE

Editor: Leon O. Chua


University of California, Berkeley

Series B. SPECIAL THEME ISSUES AND PROCEEDINGS


Volume 1: Chua's Circuit: A Paradigm for Chaos
Edited by R. N. Madan
Volume 2: Complexity and Chaos
Edited by N. B. Abraham, A. M. Albano, A. Passamante, P. E. Rapp,
and R. Gilmore
Volume 3: New Trends in Pattern Formation in Active Nonlinear Media
Edited by V. Perez-Villar, V. Perez-Munuzuri, C. Perez Garcia, and
V. I. Krinsky
Volume 4: Chaos and Nonlinear Mechanics
Edited by T. Kapitaniak and J. Brindley
Volume 5: Fluid Physics — Lecture Notes of Summer Schools
Edited by M. G. Velarde and C. I. Christov
Volume 6: Dynamics of Nonlinear and Disordered Systems
Edited by G. Martínez-Mekler and T. H. Seligman
Volume 7: Chaos in Mesoscopic Systems
Edited by H. A. Cerdeira and G. Casati
Volume 8: Thirty Years After Sharkovskiĭ's Theorem: New Perspectives
Edited by L. Alsedà, F. Balibrea, J. Llibre, and M. Misiurewicz
Volume 9: Discretely-Coupled Dynamical Systems
Edited by V. Perez-Munuzuri, V. Perez-Villar, L. O. Chua, and M. Markus
Volume 10: Nonlinear Dynamics & Chaos
Edited by S. Kim, R. P. Behringer, H.-T. Moon, and Y. Kuramoto
Volume 11: Chaos in Circuits and Systems
Edited by G. Chen and T. Ueta
Volume 12: Dynamics and Bifurcation of Patterns in Dissipative Systems
Edited by G. Dangelmayr and I. Oprea
WORLD SCIENTIFIC SERIES ON NONLINEAR SCIENCE, Series B, Vol. 13

Series Editor: Leon O. Chua

MODELING AND COMPUTATIONS


IN DYNAMICAL SYSTEMS
In commemoration of the 100th anniversary
of the birth of John von Neumann

edited by

Eusebius J. Doedel
Concordia University, Canada

Gabor Domokos
Budapest University of Technology and Economics, Hungary

Ioannis G. Kevrekidis
Princeton University, USA

World Scientific

NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI
Published by
World Scientific Publishing Co. Pte. Ltd.
5 Toh Tuck Link, Singapore 596224
USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601
UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library.

Cover Illustration: The image is an artistic rendering by Greg Jones (University of Bristol) of the Lorenz manifold as computed
by the five different methods; see the chapter "A Survey of Methods for Computing (Un)Stable Manifolds of Vector Fields", by
B. Krauskopf, H. M. Osinga, E. J. Doedel, M. E. Henderson, J. Guckenheimer, A. Vladimirsky, M. Dellnitz and O. Junge.

MODELING AND COMPUTATIONS IN DYNAMICAL SYSTEMS


Copyright © 2006 by World Scientific Publishing Co. Pte. Ltd.
All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical,
including photocopying, recording or any information storage and retrieval system now known or to be invented, without written
permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood
Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN 981-256-596-5

Typeset by Stallion Press


E-mail: enquiries@stallionpress.com

Printed by Fulsland Offset Printing (S) Pte Ltd, Singapore


CONTENTS

Editorial

Transport in Dynamical Astronomy and Multibody Problems
M. Dellnitz, O. Junge, W. S. Koon, F. Lekien, M. W. Lo, J. E. Marsden, K. Padberg, R. Preis, S. D. Ross and B. Thiere

A Brief Survey on the Numerical Dynamics for Functional Differential Equations
B. M. Garay

Bifurcations and Continuous Transitions of Attractors in Autonomous and Nonautonomous Systems
P. E. Kloeden and S. Siegmund

A Survey of Methods for Computing (Un)Stable Manifolds of Vector Fields
B. Krauskopf, H. M. Osinga, E. J. Doedel, M. E. Henderson, J. Guckenheimer, A. Vladimirsky, M. Dellnitz and O. Junge

Commutators of Skew-Symmetric Matrices
A. M. Bloch and A. Iserles

Simple Neural Networks that Optimize Decisions
E. Brown, J. Gao, P. Holmes, R. Bogacz, M. Gilzenrat and J. D. Cohen

Newton Flow and Interior Point Methods in Linear Programming
J.-P. Dedieu and M. Shub

Numerical Continuation of Branch Points of Equilibria and Periodic Orbits
E. J. Doedel, W. Govaerts, Yu. A. Kuznetsov and A. Dhooge

Coarse-Grained Observation of Discretized Maps
G. Domokos

Multiple Helical Perversions of Finite, Intrinsically Curved Rods
G. Domokos and T. J. Healey

Bifurcations of Stable Sets in Noninvertible Planar Maps
J. P. England, B. Krauskopf and H. M. Osinga

Multiparametric Bifurcations in an Enzyme-Catalyzed Reaction Model
E. Freire, L. Pizarro, A. J. Rodriguez-Luis and F. Fernandez-Sanchez

Straightforward Computation of Spatial Equilibria of Geometrically Exact Cosserat Rods
T. J. Healey and P. G. Mehta

Multiparameter Parallel Search Branch Switching
M. E. Henderson

Equation-Free, Effective Computation for Discrete Systems: A Time Stepper Based Approach
J. Möller, O. Runborg, P. G. Kevrekidis, K. Lust and I. G. Kevrekidis

Model Reduction for Fluids, Using Balanced Proper Orthogonal Decomposition
C. W. Rowley

Bifurcation Tracking Algorithms and Software for Large Scale Applications
A. G. Salinger, E. A. Burroughs, R. P. Pawlowski, E. T. Phipps and L. A. Romero

An Algorithm for Finding Invariant Algebraic Curves of a Given Degree for Polynomial Planar Vector Fields
G. Swirszcz

EDITORIAL

The papers in this issue are based on lectures presented at the October 2003 Budapest
workshop on Modeling and Computations in Dynamical Systems, and complemented by
selected additional contributions. The workshop, organized by G. Domokos, was held in
commemoration of the 100th anniversary of the date of birth of John von Neumann, and
made possible by generous support from The Thomas Cholnoky Foundation. Von Neumann
made fundamental contributions to Computing, and he had a keen interest in Dynamical
Systems, specifically, Hydrodynamic Turbulence. It was especially appropriate therefore, to
dedicate the workshop (and this special issue) to the memory of von Neumann, one of the
greatest and most influential mathematicians of the 20th century. While the topic of the
Budapest workshop was rather well-defined, concentrating on modeling and computations
in dynamical systems, the gathering attracted a diverse group of prominent researchers, the-
oreticians as well as computational scientists, with fields of expertise ranging from numerical
techniques, including large scale computing, to fundamental aspects of dynamical systems.
The papers in this special issue reflect these diverse interests, and, in fact, the wide-ranging
nature of the field of Dynamical Systems. Applications of the work reported in this spe-
cial issue include geometric integration, neural networks, linear programming, dynamical
astronomy, chemical reaction models, and structural and fluid mechanics.

Eusebius Doedel,
Concordia University, Montreal, Canada
Gabor Domokos,
Budapest University of Technology and Economics, Hungary
Ioannis Kevrekidis,
Princeton University, USA

TRANSPORT IN DYNAMICAL ASTRONOMY
AND MULTIBODY PROBLEMS
MICHAEL DELLNITZ*, OLIVER JUNGE*, WANG SANG KOON†,
FRANCOIS LEKIEN‡, MARTIN W. LO§, JERROLD E. MARSDEN†,
KATHRIN PADBERG*, ROBERT PREIS*, SHANE D. ROSS†,
and BIANCA THIERE*
*Faculty of Computer Science, Electrical Engineering and Mathematics,
University of Paderborn, D-33095 Paderborn, Germany
†Control and Dynamical Systems, MC 107-81,
California Institute of Technology, Pasadena, CA 91125, USA
‡Department of Mechanical and Aerospace Engineering,
Princeton University Engineering Quad, Olden Street,
Princeton, NJ 08544-5263, USA
§Navigation and Mission Design, Jet Propulsion Laboratory,
California Institute of Technology, M/S 301-140L,
4800 Oak Grove Drive, Pasadena, CA 91109, USA

Received April 28, 2004; Revised July 5, 2004

We combine the techniques of almost invariant sets (using tree structured box elimination and
graph partitioning algorithms) with invariant manifold and lobe dynamics techniques. The result
is a new computational technique for computing key dynamical features, including almost invari-
ant sets, resonance regions as well as transport rates and bottlenecks between regions in dynam-
ical systems. This methodology can be applied to a variety of multibody problems, including
those in molecular modeling, chemical reaction rates and dynamical astronomy. In this paper
we focus on problems in dynamical astronomy to illustrate the power of the combination of
these different numerical tools and their applicability. In particular, we compute transport rates
between two resonance regions for the three-body system consisting of the Sun, Jupiter and a
third body (such as an asteroid). These resonance regions are appropriate for certain comets
and asteroids.

Keywords: Three-body problem; transport rates; dynamical systems; almost invariant sets; graph
partitioning; set-oriented methods; invariant manifolds; lobe dynamics.

Contents
1. Introduction
1.1. Need for modification of current transport calculations
1.1.1. Chemistry
1.1.2. Dynamical astronomy
1.2. Current methods for the study of transport in the PCR3BP
1.2.1. Analytical methods: single resonance theory and resonance overlap criterion
1.2.2. Toward a global picture of the phase space
1.2.3. Mars escape rates
1.3. Set oriented approach to transport
1.4. What is achieved in this paper
2. Description of the PCR3BP Global Dynamics
2.1. Problem description
2.2. Equations of motion
2.3. Energy manifolds
3. Computing Transport
3.1. Lobe dynamics
3.1.1. Boundaries, regions, pips, lobes, and turnstiles defined
3.1.2. Multilobe, self-intersecting turnstiles
3.1.3. Expressions for the transport of species
3.2. Set oriented approach
3.2.1. The transfer operator
3.2.2. Discretization of the transfer operator
3.2.3. Approximation of transport rates
3.2.4. Convergence
3.2.5. Almost invariant decompositions
3.2.6. Graph formulation
3.2.7. Heuristics and tools for the graph partitioning problem
4. Example: The Sun-Jupiter-Asteroid System
4.1. Lobe dynamics
4.1.1. Symmetries of the Poincaré map f
4.1.2. Finding a fixed point p of f
4.1.3. Finding the stable and unstable manifolds of p under f
4.1.4. Defining the regions and finding the relevant lobes
4.1.5. Higher iterates of the map
4.1.6. Re-entrainment of the lobes
4.2. Set oriented approach
4.2.1. Almost invariant decomposition of the Poincaré section
4.2.2. Transport for a two-set partition
4.2.3. Local optimization
4.2.4. Extrapolation
4.2.5. Higher iterates of the map
4.2.6. Return times of the Poincaré map
5. Conclusions and Future Directions
5.1. Good agreement between approaches
5.2. Extension to higher dimensions and time dependent systems
5.3. Merging techniques into a single software package
5.4. Miscellany
5.5. Progress towards the grand challenges in computational science

1. Introduction

The mathematical description of transport phenomena applies to a wide range of physical systems across many scales [Meiss, 1992; Wiggins, 1992; Rom-Kedar, 1999]. The recent and surprisingly effective application of methods combining dynamical systems ideas with those from chemistry to the transport of Mars impact ejecta underlines
this point [Jaffe et al., 2002]. In this paper, we develop computational methods to study transport based on the relationship between statistics and geometry in a nonlinear dynamical system with mixed regular and chaotic motion. Our focus is on the transport of material throughout the solar system. However, these methods are fundamental and broad-based; they may be applied to diverse areas of study, including fluid mixing [Rom-Kedar et al., 1990; Malhotra & Wiggins, 1998; Poje & Haller, 1999; Coulliette & Wiggins, 2001; Lekien et al., 2003], N-body problems in physical chemistry [Jaffe et al., 2000; Lekien & Marsden, 2004] as well as other problems in dynamical astronomy. For example, the recent discovery of several binary pairs in the asteroid and Kuiper belts has stimulated interest in computing the formation and dissociation rates of such binary pairs (see, e.g. [Goldreich et al., 2002; Scheeres, 2002; Scheeres et al., 2002; Veillet et al., 2002]).

Dynamical processes in the solar system

Our understanding of the solar system has changed dramatically in the past several decades with the realization that the orbits of the planets and some minor bodies are chaotic. In the case of planets, this chaos is of a sufficiently weak nature that their motion appears quite regular on relatively short time scales [Laskar, 1989]. In contrast, small bodies such as asteroids, comets, and Kuiper-belt objects can exhibit strongly chaotic motion through their interactions with the planets and the Sun, exhibiting Lyapunov times of only a few decades [Torbett & Smoluchowski, 1990; Tancredi, 1995].

The ability to predict the behavior of populations of these small but numerous objects is essential for understanding key transport phenomena in dynamical astronomy, such as the evolution of short period comets [Torbett & Smoluchowski, 1990], scattered Kuiper-belt objects [Malhotra et al., 2000], and the intermediaries between these two populations [Tiscareno & Malhotra, 2003]. Furthermore, an understanding of how small bodies behave in n-body fields will aid in the gravitationally assisted transport of spacecraft using very little fuel [Koon et al., 2000, 2001a, 2002; Gomez et al., 2001; Dellnitz et al., 2001a; Ross et al., 2003; Yamato & Spencer, 2003]. This understanding also contributes to other fields such as astrobiology, for example, where comet impact rates are key for determining the delivery of water to the Earth [Morbidelli et al., 2000] and ejecta exchange rates are important for investigating the transportation of microbes between Mars and Earth [Gladman et al., 1996; Mileikowsky et al., 2000].

The recent discovery of several extrasolar planetary systems has stimulated interest in the morphological and dynamical features that may be present in generic planetary systems [Konacki et al., 2003]. Some quantities of interest are the following: likely distributions of objects in the presence of dynamical sculpting due to planets and moons (e.g. generic circumstellar belts and circumsolar rings); rates of small body collision with a planet; and rates of capture and escape from one orbital resonance with a planet to another.

Short period comets

In order to develop a theory of chaotic transport that is computationally tractable, we will consider a physically relevant example from dynamical astronomy: the motion of (short period) comets in the gravitational field of the Sun and Jupiter. Our model, the planar circular restricted three-body problem (PCR3BP), will be described in a later section.

The role of the planar circular restricted three-body problem

The PCR3BP has long been considered an appropriate "baseline" model for providing a reasonable explanation for much of the dynamical behavior found in the large scale numerical experiments of solar system dynamics [Levison & Duncan, 1993; Malhotra et al., 2000]. Malhotra's work [1996] provides a good recent example. Motivated by numerical studies of the stability of low-eccentricity and low-inclination orbits of small bodies in the trans-Neptunian Kuiper belt, Malhotra [1996] used the PCR3BP to describe the basic phase space structure in the neighborhood of Neptune's exterior mean motion resonances. The advantage of this simple model is that it allows the direct visualization, in two-dimensional surfaces-of-section, of a global mixed phase space structure of stable and chaotic zones. Much can be learned about populations of minor bodies from a semi-analytical study of the PCR3BP, i.e. careful numerics guided by dynamical systems theory.
1.1. Need for modification of current transport calculations

Several subjects make use of dynamical transport calculations. We indicate some of the reasons one would like to improve current techniques.

1.1.1. Chemistry

The transport of ensembles of points in phase space has been important for the theoretical determination of chemical reaction rates. One method, transition state theory (TST), has been a ubiquitous workhorse in the computational chemistry literature [Uzer et al., 2002]. It is based on the identification of a transition state (TS) between large realms of phase space which correspond to either "reactants" or "products." If one assumes the phase space in each realm is structureless [Marston & De Leon, 1989], then the chemical reaction rate for the reaction under study can be estimated from the flux through the TS. However, rates given by TST can be off from the true rate by orders of magnitude [De Leon, 1992]. Modifications of transition state theory are necessary to calculate statistical quantities of interest [Hammes-Schiffer & Tully, 1995; Hammes-Schiffer, 2002; Agarwal et al., 2002].

1.1.2. Dynamical astronomy

In principle, the computation of rates of mass transport can be accomplished by numerical simulations in which the orbits of vast numbers of test particles are propagated in time including as many gravitational interactions as desirable. Many investigators have used this approach successfully (cf. [Levison & Duncan, 1993]). However, such calculations are computationally demanding and it may be difficult to extract information from them about key dynamical mechanisms since the outcomes may depend sensitively on the initial conditions used for the simulation or may even be misleading. To obtain general features of planetary system evolution and morphology, which is a major goal of dynamical astronomy, other approaches may be necessary.

1.2. Current methods for the study of transport in the PCR3BP

Many of the important transport questions involve motion between different regions of the phase space. There have been a variety of approaches to deal with this question from various points of view. We recall some of them in this subsection.

1.2.1. Analytical methods: single resonance theory and resonance overlap criterion

One approach is to develop simple analytical models which provide answers to basic phase space transport questions. Much progress has been made in this area, but most of the work has focused on the study of the local dynamics around a single resonance, using a one-degree-of-freedom pendulum-like Hamiltonian with slowly varying parameters. Transport questions regarding capture into, and passage through resonance, have been addressed this way [Henrard, 1982; Neishtadt, 1996; Neishtadt et al., 1997].

An important result regarding the interaction between resonances was obtained by Wisdom [1980], where the method of Chirikov [1979] was applied to the PCR3BP to determine a resonance overlap criterion for the onset of chaotic behavior for small mass parameter ($\varepsilon$). These analytical methods are still used today (see [Murray & Holman, 2001] and references therein).

1.2.2. Toward a global picture of the phase space

In [Koon et al., 2000], dynamical systems techniques were applied to the problem of heteroclinic connections and interior-exterior transitions in the PCR3BP, laying the foundation for tube dynamics. In the point of view developed in [Koon et al., 2000], the invariant manifold structures associated to $L_1$ and $L_2$, the (Conley-McGehee) phase space tubes [Conley, 1968; McGehee, 1969] play a key role. These tubes provide fundamental tools that can aid in understanding transport throughout the phase space, e.g. transport between the inside and outside of a planet's orbit, as seen in the comet P/Oterma [Carusi et al., 1985], and chaotic trajectories leading to planetary impact, as in comet D/Shoemaker-Levy 9 [Benner & McKinnon, 1995].

The main new technical result in Koon, Lo, Marsden, and Ross [2000] is the numerical demonstration of the existence of a heteroclinic connection between pairs of periodic orbits, one around the libration point $L_1$ and the other around $L_2$, with the two periodic orbits having the same energy. This result is applied to the interior-exterior transition problem, providing insight into the "resonance
hopping" of some short period comets (cf. [Tancredi et al., 1990; Valsecchi, 1992; Belbruno & Marsden, 1997; Koon et al., 2001]). Furthermore, an explicit numerical construction of interesting orbits with prescribed itineraries is developed, based on ideas from a proof of global motion in the PCR3BP.

For particles in the PCR3BP with energy slightly greater than that of $L_2$, the interior, exterior and planetary realms are connected by bottlenecks about $L_1$ and $L_2$ (see Fig. 1(c) in the next section). Particles can pass between realms only through these bottlenecks by being inside phase space tubes, regions bounded by pieces of the stable and unstable invariant manifolds of periodic orbits around $L_1$ and $L_2$. We can determine the flux between realms by monitoring the flux through these tubes.

1.2.3. Mars escape rates

Building on the ideas described in the preceding paragraph, the rate of escape of particles temporarily captured by Mars was computed in [Jaffe et al., 2002; Ross, 2003]. The paper uses a statistical assumption that is common in transition state theory in chemistry, and which is appropriate for this problem. Theory and direct Monte Carlo simulations are shown to agree to within 1%, which shows the promise of a dynamical systems approach for the computation of interesting transport rates in dynamical astronomy.

The work of Rom-Kedar and Wiggins [1990] contains an investigation of the transport in the two-dimensional phase space of $C^r$ diffeomorphisms ($r \geq 1$) of two-manifolds between regions of the phase space bounded by pieces of the stable and unstable manifolds of hyperbolic points. The transport mechanism is associated with the dynamics of homoclinic and heteroclinic tangles, and the study of this dynamics leads to a general formulation of the transport rates in terms of distributions of small phase space regions called "lobes". By following the evolution of these lobes, lobe dynamics supplies a method for theoretically computing short and long term transport rates. However, computational issues have limited its applications [Rom-Kedar & Wiggins, 1990, 1991; Meiss, 1992]. Important contributions to this effort were made by Lichtenberg and Lieberman [1983]; MacKay et al. [1984, 1987]; Meiss [1992]; Meiss and Ott [1986].

The manifolds computed in such problems are typically complicated because of the nature of homoclinic and heteroclinic tangles. Furthermore, the length of these complicated curves grows quickly with the size of the time window of interest. The number of points needed to describe long segments of manifolds can be prohibitively large if naive computational methods are used. One also needs to take into account the fine structure of the lobes and manifolds, and in particular, the effect of re-entrainment of the lobes, i.e. the implications of the lobes leaving and re-entering the specified regions on the transport rate. We show later on that this effect is, in fact, important in the three-body problem and cannot be ignored.

Recent efforts made to incorporate lobe dynamics into geophysical, fluid and chemical transport calculations have brought new techniques to compute invariant manifolds (see [Coulliette & Wiggins, 2001; Lekien & Marsden, 2004; Lekien & Coulliette, 2004; Lekien et al., 2003]). Using these techniques, one is able to compute very long segments of stable and unstable manifolds with high accuracy by conditioning the manifolds adaptively, for instance, by inserting more points along the manifold where the curvature is high (see [Hobson, 1993; Lekien, 2003]). As a result, the length and shape of the manifold is not an obstacle anymore and many more iterates of lobes than hitherto possible can be generated accurately. Using this approach, one keeps track of all the points throughout the computation, with the drawback that the resulting algorithms often require a great deal of memory. A related set of studies [You et al., 1991; Kostelich et al., 1996] describes a method for restricting the invariant manifold computation to specific regions of interest, thereby using significantly less memory, while rigorously guaranteeing that the computed manifold lies no further than a specified tolerance from the "true" manifold.

1.3. Set oriented approach to transport

In contrast to the geometric approach to the analysis of transport phenomena as described in the preceding paragraphs, the set oriented approach focuses on a global description of the dynamics on a coarse level. To this end, one considers a transfer operator associated to the underlying map. Roughly speaking, this operator describes how some initial distribution evolves under the dynamics. Via a partition of some interesting invariant part in phase space this operator can be discretized, yielding
a stochastic matrix or, equivalently, a directed weighted graph, which may be viewed as a coarse-grain model of the global dynamics.

Transport rates between subsets of phase space can easily be computed using this matrix of transition probabilities. When these subsets are given as unions of partition elements, the computed rates are exact. However, in general the accuracy of the computed quantities is determined by the size of the partition elements.

In addition to computing transport rates, it is also possible to obtain insight about what "important" or interesting regions are in phase space. The idea is that the transfer operator encodes a macroscopic description of the dynamics. One way to reveal this information is to consider the corresponding graph, to which standard algorithms from graph theory can directly be applied for a further analysis. For example, we use algorithms for graph partitioning (see e.g. software libraries such as CHACO [Hendrickson & Leland, 1995], JOSTLE [Walshaw, 2000], METIS [Karypis & Kumar, 1999], SCOTCH [Pellegrini, 1996] or PARTY [Monien et al., 2000]) to find regions that are determined by (i) a high transport rate within the region and (ii) a small transport rate to other regions. In terms of dynamical systems, these sets are referred to as almost invariant sets [Dellnitz & Junge, 1999]. In particular, we use the PARTY library with extensions, which are explicitly developed for the analysis of almost invariant sets in dynamical systems [Dellnitz & Preis, 2003]. A key observation of this paper is that regions that we compute by this approach are actually those bounded by certain invariant manifolds.

1.4. What is achieved in this paper

The main results of this paper are

• Further development of the basic theory and application of computational techniques for transport. In particular, a comparison as well as a synthesis of tools from lobe dynamics and set oriented methods is presented. Error estimates are provided, which show, in particular, the convergence of the set-oriented methods.
• In regimes where the comparison makes sense, it is shown that the agreement is very good on a sample problem. Based on the initial information provided by the combination of the two methods, the set oriented methods are able to carry out many more iterates than heretofore possible.
• As a concrete nontrivial example illustrating the methods, the transport rate from an interesting resonant region $R_1$ to a surrounding region $R_2$ in the Sun-Jupiter system, exterior to the orbit of Jupiter and at a particular energy value, is computed. We find that the probability (in the sense of the fractional area) that a transition from $R_1$ to $R_2$ occurs is about 28% in a period of about 1817 Earth years.
• The methods of this paper lay the foundation for many other computations of astrodynamical interest. In particular, in [Dellnitz et al., 2004] we study the transport rate of asteroids from the Hilda region to a region defined by crossers of Mars' orbit as well as a remarkable relation between almost invariant sets associated with the Sun-Jupiter three-body system and the orbits of all the planets interior to Jupiter.

2. Description of the PCR3BP Global Dynamics

2.1. Problem description

The PCR3BP is a particular case of the general gravitational problem of three masses $m_1, m_2, m_3$ defined by the following restrictions: (a) the motion of all three bodies takes place in a common plane; (b) the masses $m_1$ and $m_2$ move on circular orbits about their common center of mass; and (c) the third body, $m_3$, has zero mass; therefore, it does not influence the motion of $m_1$ and $m_2$. In the context of this paper, $m_1$ represents the Sun and $m_2$ represents a planet, and we are concerned with the motion of the third body, the test particle $m_3$. The system is made nondimensional by the following choice of units: the unit of mass is taken to be $m_1 + m_2$; the unit of length is chosen to be $a_p$, the constant separation between $m_1$ and $m_2$ (i.e. the mean separation of the Sun and planet); the unit of time is chosen such that the orbital period of $m_1$ and $m_2$ about their center of mass is $2\pi$. Then the universal constant of gravitation, $G = 1$, and the masses of the Sun and planet are $1 - \varepsilon$ and $\varepsilon$, where $\varepsilon = m_2/(m_1 + m_2)$.

2.2. Equations of motion

Choosing a rotating coordinate system so that the origin is at the center of mass, the Sun and planet
are on the x-axis at the points $(-\varepsilon, 0)$ and $(1 - \varepsilon, 0)$ respectively. Let $(x, y)$ be the position of the particle in the plane, then the equations of motion for the particle in this rotating frame are:

$$\ddot{x} - 2\dot{y} = -U_x, \qquad \ddot{y} + 2\dot{x} = -U_y, \tag{1}$$

where

$$U = -\frac{x^2 + y^2}{2} - \frac{1 - \varepsilon}{r_s} - \frac{\varepsilon}{r_p} - \frac{\varepsilon(1 - \varepsilon)}{2}.$$

Here, the subscripts of $U$ denote partial differentiation in the respective variable, and $r_s, r_p$ are the distances from the particle to the Sun and planet, respectively. See [Szebehely, 1967] for more details on the derivation of this equation and [Koon et al., 2004] for its derivation using Lagrangian mechanics.

2.3. Energy manifolds

Equations (1) are autonomous and are in Euler-Lagrange form (and thus, using the Legendre transformation, can be put into Hamiltonian form as well). They have an energy integral

$$E = \frac{1}{2}(\dot{x}^2 + \dot{y}^2) + U(x, y), \tag{2}$$

which is related to the Jacobi constant $C$ by $C = -2E$. The motion of the test particle takes place on a three-dimensional energy manifold (defined by a particular value of $E$) embedded in the four-dimensional phase space, $(x, y, \dot{x}, \dot{y})$.

The value of the energy is an indicator of the type of global dynamics possible for a particle in the PCR3BP, which can be broken down into five cases (see Fig. 1). In case 1, shown in Fig. 1(a), the particle is trapped either exterior or interior to the planet's orbit, or around the planet itself (labeled the exterior, interior, and planetary realms, respectively). For energy values greater than that of $L_2$ (case 3), there is a bottleneck around $L_1$ and $L_2$, permitting particles to move between the three realms.

This paper considers case 1 to illustrate the techniques. It uses the Poincaré surface-of-section (s-o-s) defined by $y = 0$, $\dot{y} > 0$, and the coordinates $(x, \dot{x})$ on that section. The geometric interpretation is straightforward: we plot the x coordinate and velocity of the test particle at every conjunction with the planet. As a further restriction, we consider only the motion of test particles in the exterior realm (strictly speaking, with mean motion smaller than the planet's). For orbits exterior to the planet's, the s-o-s is crossed every time the test particle is aligned with the Sun and planet and is on the opposite side of the Sun from the planet, along the portion of the x-axis with $x < -1$, as shown in Fig. 2(a). Thus, the s-o-s becomes

$$y = 0, \qquad \dot{y} > 0, \qquad x < -1. \tag{3}$$

In the s-o-s so defined, periodic orbits of the test particle appear as a finite set of points. The successive crossings of the surface by a quasiperiodic

[Figure 1, panels: (a) Case 1: $E < E_1$; (b) Case 2: $E_1 < E < E_2$; (c) Case 3: $E_2 < E < E_3$; (d) Case 4: $E_3 < E < E_4 = E_5$.]
Fig. 1. There are five cases of allowable motion. The Sun and planet, denoted S and P, respectively, are fixed in this rotating frame. (a) In case 1, the particle is trapped either exterior or interior to the planet's orbit, or around the planet itself. It is energetically prohibited from crossing the forbidden realm, shown in gray. (b)-(d) As the energy $E$ of the particle increases, the bottlenecks connecting the realms open. In case 5, not shown, the entire configuration space is energetically accessible.
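As a rough numerical illustration of Eqs. (1) and (2), the following sketch integrates the equations of motion in the rotating frame and monitors the energy integral. It is a minimal example under stated assumptions, not the integrator used in the paper: the fixed-step fourth-order Runge-Kutta scheme, the step size, the starting point $x = -2$, $\dot{x} = 0$, and all function names are choices made here for illustration only. The mass parameter and energy are the Sun-Jupiter values quoted in Sec. 4.

```python
import math

E_MASS = 9.5368e-4  # mass parameter e (Sun-Jupiter value quoted in Sec. 4)


def potential(x, y, e=E_MASS):
    """Effective potential U(x, y) in the rotating frame."""
    r_s = math.hypot(x + e, y)        # distance to the Sun at (-e, 0)
    r_p = math.hypot(x - 1.0 + e, y)  # distance to the planet at (1 - e, 0)
    return -0.5 * (x * x + y * y) - (1.0 - e) / r_s - e / r_p - 0.5 * e * (1.0 - e)


def energy(state, e=E_MASS):
    """Energy integral (2) for state = (x, y, xdot, ydot)."""
    x, y, vx, vy = state
    return 0.5 * (vx * vx + vy * vy) + potential(x, y, e)


def rhs(state, e=E_MASS):
    """First-order form of Eq. (1)."""
    x, y, vx, vy = state
    r_s3 = math.hypot(x + e, y) ** 3
    r_p3 = math.hypot(x - 1.0 + e, y) ** 3
    u_x = -x + (1.0 - e) * (x + e) / r_s3 + e * (x - 1.0 + e) / r_p3
    u_y = -y + (1.0 - e) * y / r_s3 + e * y / r_p3
    return (vx, vy, 2.0 * vy - u_x, -2.0 * vx - u_y)


def rk4_step(state, dt, e=E_MASS):
    """One classical Runge-Kutta step (an illustrative choice of integrator)."""
    def shifted(base, slope, h):
        return tuple(b + h * s for b, s in zip(base, slope))

    k1 = rhs(state, e)
    k2 = rhs(shifted(state, k1, 0.5 * dt), e)
    k3 = rhs(shifted(state, k2, 0.5 * dt), e)
    k4 = rhs(shifted(state, k3, dt), e)
    return tuple(s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def initial_state_on_section(x, vx, energy_value, e=E_MASS):
    """Point on the section y = 0 with ydot > 0 at the prescribed energy."""
    vy_sq = 2.0 * (energy_value - potential(x, 0.0, e)) - vx * vx
    if vy_sq < 0.0:
        raise ValueError("point lies in the forbidden realm at this energy")
    return (x, 0.0, vx, math.sqrt(vy_sq))


# Hypothetical starting point in the exterior realm at the energy of Sec. 4.
state0 = initial_state_on_section(-2.0, 0.0, -1.525)
state = state0
for _ in range(5000):
    state = rk4_step(state, 1.0e-3)
drift = abs(energy(state) - energy(state0))
```

The quantity `drift` measures how well the energy integral is preserved along the computed orbit, which is a simple sanity check on any implementation of Eq. (1).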
10 M. Dellnitz et al.

orbit lie on a set of closed smooth curves, such as the cross-section of a KAM torus. Chaotic orbits appear to approximately fill a two-dimensional area.
In general, by taking a grid of points on this s-o-s and integrating them forward for several iterates, one observes a mixed phase space structure of KAM tori embedded within a "chaotic sea", as in Fig. 2(b).

[Figure 2. Panel (a): configuration-space sketch with labels Exterior Realm, Forbidden Realm, Interior Realm, Planetary Realm, Poincaré Section, and Particle. Panel (b): plot with horizontal axis x (nondim.).]
Fig. 2. A Poincaré section of the flow in the restricted three-body problem. (a) The location of the Poincaré surface-of-section (s-o-s) in this paper is shown in the configuration space for a case 1 energy, as in Fig. 1(a). (b) The mixed phase space structure of the PCR3BP is shown on this s-o-s. KAM tori and the chaotic sea are visible. Note that the Poincaré map of this s-o-s is area preserving.

3. Computing Transport
As laid out in the previous section, our task is to compute the transport between regions in phase space. More precisely, we consider a volume- and orientation-preserving map $f : M \to M$ (e.g. the Poincaré map in the PCR3BP as described in the previous section) on some compact set $M \subset \mathbb{R}^d$ with volume-measure $\mu$ and ask for a suitable (i.e. depending on the application in mind) partition of $M$ into compact regions of interest $R_i$, $i = 1, \ldots, N_R$, such that

$$M = \bigcup_{i=1}^{N_R} R_i \quad \text{and} \quad \mu(R_i \cap R_j) = 0 \ \text{ for } i \neq j. \tag{4}$$

Furthermore, we are interested in the following questions concerning the transport between the regions $R_i$ (see [Wiggins, 1992]): "In order to keep track of the initial condition of a point as it moves throughout the regions we say that initially (i.e. at $t = 0$) region $R_i$ is uniformly covered with species $S_i$. Thus, the species type of a point indicates the region in which it was located initially. Then we can generally state the transport problem as follows.
Describe the distribution of species $S_i$, $i = 1, \ldots, N_R$, throughout the regions $R_j$, $j = 1, \ldots, N_R$, for any time $t = n > 0$.
The quantity we want to compute is $T_{i,j}(n)$ = the total amount of species $S_i$ contained in region $R_j$ immediately after the $n$th iterate.
The flux $\alpha_{i,j}(n)$ of species $S_i$ into region $R_j$ on the $n$th iterate is the change in the amount of species $S_i$ in $R_j$ on iteration $n$; namely, $\alpha_{i,j}(n) = T_{i,j}(n) - T_{i,j}(n-1)$. Since $f$ is area-preserving, the flux is equal to the amount of species $S_i$ entering region $R_j$ at iteration $n$ minus the amount of species $S_i$ leaving $R_j$ at iteration $n$.
Our goal is to determine $T_{i,j}(n)$, $i, j = 1, \ldots, N_R$, for all $n$. Note that $T_{i,i}(0) = \mu(R_i)$, and $T_{i,j}(0) = 0$ for $i \neq j$. In the following we briefly describe the theoretical background behind the two computational approaches to the transport problem that we are going to compare in Sec. 4.

3.1. Lobe dynamics
Following Rom-Kedar and Wiggins [1990], lobe dynamics theory states that the two-dimensional phase space $M$ of the Poincaré map $f$ can be divided as outlined above (see Eq. (4)), as illustrated in Fig. 3(a). A region is a connected subset of $M$ with boundaries consisting of parts of the boundary of $M$

(which may be at infinity) and/or segments of stable and unstable manifolds of hyperbolic fixed points, $p_i$, $i = 1, \ldots, N$. Moreover, the transport between regions of phase space can be completely described by the dynamical evolution of small regions of phase space, "lobes" enclosed by segments of the stable and unstable manifolds, as shown schematically in Fig. 3(b), and defined below.

[Figure 3, panels (a) and (b).]
Fig. 3. Transport between regions of the phase space $M$ of a Poincaré map $f$. (a) The segment $S[p_1, q_2]$ of the stable manifold $W^s(p_1)$ from $p_1$ to $q_2$ and the segment $U[p_2, q_2]$ of the unstable manifold $W^u(p_2)$ from $p_2$ to $q_2$ intersect in the pip $q_2$. Therefore, the boundary $B_{12}$ can be defined as $B_{12} = U[p_2, q_2] \cup S[p_1, q_2]$. The region on one side of the boundary may be labeled $R_1$ and the other side labeled $R_2$. (b) $q_1$ is the only pip between the two pips $q_0$ and $f^{-1}(q_0)$ in $W^u(p_i) \cap W^s(p_j)$, thus $S[f^{-1}(q_0), q_0] \cup U[f^{-1}(q_0), q_0]$ forms the boundary of precisely two lobes; one in $R_1$, labeled $L_{1,2}(1)$, and the other in $R_2$, labeled $L_{2,1}(1)$. Under one iteration of $f$, the only points that can move from $R_1$ into $R_2$ by crossing the boundary $B$ are those in $L_{1,2}(1)$. Similarly, under one iteration of $f$ the only points that can move from $R_2$ into $R_1$ by crossing $B$ are those in $L_{2,1}(1)$.

3.1.1. Boundaries, regions, pips, lobes, and turnstiles defined
To define a boundary between regions, one first defines a primary intersection point, or pip. A point $q_k$ is called a pip if $S[p_i, q_k]$ intersects $U[p_j, q_k]$ only at the point $q_k$, where $U[p_j, q_k]$ is a segment of the unstable manifold $W^u(p_j)$ joining the unstable fixed point $p_j$ to $q_k$ and similarly $S[p_i, q_k]$ is a segment of the stable manifold $W^s(p_i)$ of the unstable fixed point $p_i$ joining $p_i$ to $q_k$. The union of segments of the unstable and stable manifolds naturally forms partial barriers, or boundaries $U[p_j, q_k] \cup S[p_i, q_k]$, between regions of interest $R_i$, $i = 1, \ldots, N_R$, in $M = \bigcup R_i$. In Fig. 3(a) several pips are shown as well as the boundary $B_{12}$. Note that we could have $p_i = p_j$, as will be the case studied in this paper.
Consider Fig. 3(b). Let $q_0, q_1 \in W^u(p_i) \cap W^s(p_j)$ be two adjacent pips, i.e. there are no other pips on $U[q_0, q_1]$ and $S[q_0, q_1]$, the segments of $W^u(p_i)$ and $W^s(p_j)$ connecting $q_0$ and $q_1$. We refer to the region interior to $U[q_0, q_1] \cup S[q_0, q_1]$ as a lobe. Then $S[f^{-1}(q_0), q_0] \cup U[f^{-1}(q_0), q_0]$ forms the boundary of precisely two lobes; one in $R_1$, defined by $L_{1,2}(1) := \mathrm{int}(U[q_0, q_1] \cup S[q_0, q_1])$, where int denotes the interior operation on sets, and the other in $R_2$, $L_{2,1}(1) := \mathrm{int}(U[f^{-1}(q_0), q_1] \cup S[f^{-1}(q_0), q_1])$. Under one iteration of $f$, the only points that can move from $R_1$ into $R_2$ by crossing $B_{12}$ are those in $L_{1,2}(1)$. Similarly, under one iteration of $f$ the only points that can move from $R_2$ into $R_1$ by crossing $B_{12}$ are those in $L_{2,1}(1)$. The two lobes $L_{1,2}(1)$ and $L_{2,1}(1)$ are called a turnstile. It is important to note that $f^{-n}(L_{1,2}(1))$, $n \geq 2$, need not be contained entirely in $R_1$, i.e. the lobes can leave and re-enter regions with strong implications for the dynamics. As will be shown, the quantities of interest, $T_{i,j}(n)$, can be expressed compactly in terms of intersection areas of images or preimages of turnstile lobes.

3.1.2. Multilobe, self-intersecting turnstiles
Before we derive expressions for the $T_{i,j}(n)$, some comments regarding technical points are in order [Rom-Kedar & Wiggins, 1990]. In the previous paragraph we assumed that there was only one pip between $q$ and $f^{-1}(q)$, but this is not the case for the application to the PCR3BP in Sec. 4. Suppose that there are $k$ pips, $k \geq 1$, along $U[f^{-1}(q), q]$ besides $q$ and $f^{-1}(q)$. This gives rise to $k + 1$ lobes; $m$ in $R_2$

and $(k + 1) - m$ in $R_1$. Suppose

$$L_0, L_1, \ldots, L_{k-m} \subset R_1, \qquad L_{k-m+1}, L_{k-m+2}, \ldots, L_k \subset R_2.$$

Then we define

$$L_{1,2}(1) = L_0 \cup L_1 \cup \cdots \cup L_{k-m}, \qquad L_{2,1}(1) = L_{k-m+1} \cup L_{k-m+2} \cup \cdots \cup L_k,$$

and all the previous results hold.
Furthermore, we previously assumed that $L_{1,2}(1)$ and $L_{2,1}(1)$ lie entirely in $R_1$ and $R_2$, respectively. But $L_{1,2}(1)$ may intersect $L_{2,1}(1)$, as shown schematically in Fig. 4(a). We want $U[q, f^{-1}(q)]$ and $S[q, f^{-1}(q)]$ to intersect only in pips, so we must redefine our lobes, as shown in Fig. 4(b). Let

$$I = \mathrm{int}(L_{1,2}(1) \cap L_{2,1}(1)).$$

The lobes defining the turnstile are redefined as

$$\tilde{L}_{1,2}(1) = L_{1,2}(1) - I, \qquad \tilde{L}_{2,1}(1) = L_{2,1}(1) - I,$$

and all our previous results hold. To the best of our knowledge, the PCR3BP is the first example of a physical system that has a multilobe turnstile, so the fact that it is a multilobe, self-intersecting turnstile is even more surprising. We believe this has a great effect on the dynamics.

[Figure 4, panels (a) and (b).]
Fig. 4. A multilobe, self-intersecting turnstile. The stable and unstable manifolds of the unstable fixed point $p$ intersect in such a way that there are three pips between $q$ and $f^{-1}(q)$, but our naively defined turnstile "lobes" have a nonempty intersection $I = \mathrm{int}(L_{1,2}(1) \cap L_{2,1}(1)) \neq \emptyset$. When we redefine the turnstile lobes such that $\tilde{L}_{1,2}(1) = L_{1,2}(1) - I$ and $\tilde{L}_{2,1}(1) = L_{2,1}(1) - I$, the result is a multilobe, self-intersecting turnstile consisting of a sequence of six regions; three defining $\tilde{L}_{1,2}(1)$ and three others defining $\tilde{L}_{2,1}(1)$.

3.1.3. Expressions for the transport of species
In the application in the present paper, the phase space $M$ is known to possess resonance regions whose boundaries have complicated lobe structures, which can lead to complicated transport properties (cf. [Meiss, 1992; Schroer & Ott, 1997; Koon et al., 2000]). In this paper, we limit ourselves to the study of transport between just two regions. We suppose that our map $f$ has a period-1 hyperbolic point $p$. We consider only one branch of the unstable manifold $W^u_+(p)$, and one branch of the stable manifold $W^s_+(p)$. We suppose that they intersect each other, as in Fig. 4, forming a boundary between two regions, $R_1$ and $R_2$. Using the lobe dynamics framework, the transport of species between the regions, $T_{i,j}(n)$, $i, j = 1, 2$, can be computed via the following formulas.
Let $L_{i,j}(m)$ denote the lobe that leaves $R_i$ and enters $R_j$ on the $m$th iterate, so that $f^{m-1}(L_{i,j}(m)) = L_{i,j}(1)$. Let $L^k_{i,j}(m) = L_{i,j}(m) \cap R_k$ denote the portion of lobe $L_{i,j}(m)$ that is in the region $R_k$. Then

$$T_{i,j}(n) - T_{i,j}(n-1) = \sum_{k=1}^{2} \left[ \mu\big(L^i_{k,j}(n)\big) - \mu\big(L^i_{j,k}(n)\big) \right] \tag{6}$$

where

$$\mu\big(L^i_{k,j}(n)\big) = \sum_{s=1}^{2} \sum_{m=0}^{n-1} \mu\big(L_{k,j}(1) \cap f^m(L_{i,s}(1))\big) - \sum_{s=1}^{2} \sum_{m=1}^{n-1} \mu\big(L_{k,j}(1) \cap f^m(L_{s,i}(1))\big). \tag{7}$$

Thus, the dynamics associated with particles crossing $B$ is reduced completely to a study of the dynamics of the turnstile lobes associated with $B$. The amount of computation necessary to obtain all the $T_{i,j}(n)$ can be reduced due to conservation of area and species, as well as symmetries of the map $f$ (to be discussed in Sec. 4).

3.2. Set oriented approach
3.2.1. The transfer operator
Computing transport between regions in phase space is a question about the global dynamical behavior of the underlying dynamical system $f : M \to M$. One is interested in the evolution of sets or, more generally, of densities or measures on $M$ instead of single trajectories. The evolution of e.g. a (signed) measure $\nu$ on $M$ is compactly described in terms of the transfer operator (or Perron-Frobenius operator) associated with $f$, which is the linear operator $P : \mathcal{M} \to \mathcal{M}$,

$$(P\nu)(A) = \nu(f^{-1}(A)), \qquad A \text{ measurable},$$

on the space $\mathcal{M}$ of signed measures on $M$.
To see how this operator relates to the transport quantities of interest, namely, the total amount $T_{i,j}(n)$ of species, consider the following observation.

Proposition 3.1. Let $f : M \to M$ be an area preserving map, then

$$T_{i,j}(n) = \mu(f^{-n}(R_j) \cap R_i)$$

(where, again, $\mu$ denotes the volume-measure on $M$).

Proof. By definition (see [Wiggins, 1992], p. 30 ff.), we have

$$T_{i,j}(n) = \mu\Big( \bigcup_{k=1}^{N_R} f^n\big(L^i_{k,j}(n)\big) \Big), \tag{8}$$

where $L^i_{k,j}(n)$ is the set of points that at time $t = n = 0$ is in $R_i$ and is mapped from region $R_k$ into region $R_j$ on the $n$th iterate, i.e.

$$L^i_{k,j}(n) = f^{-n}(R_j) \cap f^{-(n-1)}(R_k) \cap R_i. \tag{9}$$

Combining (8) and (9) with the fact that $f$ is a diffeomorphism yields

$$T_{i,j}(n) = \mu\Big( \bigcup_{k=1}^{N_R} f^n\big( f^{-n}(R_j) \cap f^{-(n-1)}(R_k) \cap R_i \big) \Big) = \mu\Big( \bigcup_{k=1}^{N_R} R_j \cap f(R_k) \cap f^n(R_i) \Big)$$
$$= \mu\Big( R_j \cap f^n(R_i) \cap \bigcup_{k=1}^{N_R} f(R_k) \Big) = \mu\big(R_j \cap f^n(R_i)\big) = \mu\big(f^{-n}(R_j) \cap R_i\big),$$

where the latter equality follows from the fact that $f$ is area-preserving. ∎

Since $\alpha_{i,j}(n) = T_{i,j}(n) - T_{i,j}(n-1)$, one obtains the formula

$$\alpha_{i,j}(n) = \mu(f^{-n}(R_j) \cap R_i) - \mu(f^{-(n-1)}(R_j) \cap R_i) \tag{10}$$

for the flux of species $S_i$ into region $R_j$ on the $n$th iterate.
The following consequence of Proposition 3.1 tells us how we can compute $\mu(f^{-n}(R_j) \cap R_i)$ using the transfer operator $P$ (where, as usual, $P^n$ refers to the $n$-fold application of $P$):

Corollary 3.2. Let $\mu_i \in \mathcal{M}$ be the measure $\mu_i(A) = \mu(A \cap R_i) = \int_A \chi_{R_i}\, d\mu$, where $\chi_{R_i}$ denotes the indicator function on the region $R_i$. Then

$$T_{i,j}(n) = (P^n \mu_i)(R_j). \tag{11}$$

Evidently, since we are interested in actually computing the quantities of interest for the PCR3BP, we need to explicitly deal with the transfer operator. Since an analytical expression for it will only be derivable for none but the most simple systems, we need to derive a finite-dimensional

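approximation to it. For more details on the following description see [Dellnitz et al., 1997; Dellnitz & Junge, 1999; Dellnitz et al., 2001b; Dellnitz & Junge, 2002].

3.2.2. Discretization of the transfer operator
Consider a covering of the phase space $M$ by a finite collection $\mathcal{B} = \{B_1, \ldots, B_b\}$ of compact sets, i.e. a partition

$$M = \bigcup_{i=1}^{b} B_i \quad \text{and} \quad \mu(B_i \cap B_j) = 0 \ \text{ for } i \neq j.$$

In practice such a partition can be efficiently computed using a hierarchical multilevel approach as described in [Dellnitz & Hohmann, 1997].
As a finite dimensional space $\mathcal{M}_{\mathcal{B}}$ of measures on $M$ we consider the space of absolutely continuous measures with density $h \in \Delta_{\mathcal{B}} := \mathrm{span}\{\chi_B : B \in \mathcal{B}\}$, i.e. one which is piecewise constant on the elements of the partition $\mathcal{B}$. Let $Q_{\mathcal{B}} : L^1 \to \Delta_{\mathcal{B}}$ be the projection

$$Q_{\mathcal{B}} h = \sum_{B \in \mathcal{B}} \frac{1}{\mu(B)} \int_B h\, d\mu \; \chi_B,$$

then for every set $A$ that is the union of partition elements we have

$$\int_A Q_{\mathcal{B}} h\, d\mu = \int_A h\, d\mu. \tag{12}$$

We define the discretized transfer operator $P_{\mathcal{B}} : \Delta_{\mathcal{B}} \to \Delta_{\mathcal{B}}$ as

$$P_{\mathcal{B}} = Q_{\mathcal{B}} P.$$

With respect to the basis $(\chi_B)_{B \in \mathcal{B}}$ it is represented by the matrix

$$P_{\mathcal{B}} = (p_{ij}), \quad \text{where} \quad p_{ij} = \frac{\mu(f^{-1}(B_i) \cap B_j)}{\mu(B_j)}, \qquad 1 \leq i, j \leq b. \tag{13}$$

For the computation of $\mu(f^{-1}(B_i) \cap B_j)$, that is, the measure of the subset of $B_j$ that is mapped into $B_i$, one can use a Monte Carlo approach as described in [Hunt, 1993]:

$$\mu(f^{-1}(B_i) \cap B_j) \approx \frac{\mu(B_j)}{K} \sum_{k=1}^{K} \chi_{B_i}(f(x_k)),$$

where the $x_k$'s are selected at random in $B_j$ from a uniform distribution. Evaluation of $\chi_{B_i}(f(x_k))$ only means that we have to check whether or not the point $f(x_k)$ is contained in $B_i$. There are efficient ways to perform this check based on a hierarchical construction and storage of the collection $\mathcal{B}$ (see [Dellnitz & Hohmann, 1997; Dellnitz et al., 1997]).

3.2.3. Approximation of transport rates
Note that we can write

$$T_{i,j}(n) = \int_{R_j} P^n \chi_{R_i}\, d\mu.$$

For some (measurable) set $A$ let

$$\underline{A} = \bigcup_{B \in \mathcal{B} : B \subset A} B \quad \text{and} \quad \overline{A} = \bigcup_{B \in \mathcal{B} : B \cap A \neq \emptyset} B.$$

Since $P$ is positive, it follows that for two given regions $R_i$ and $R_j$, $P^n(\chi_{R_i} - \chi_{\underline{R_i}}) \geq 0$, i.e. $P^n \chi_{R_i} \geq P^n \chi_{\underline{R_i}}$, and thus

$$\int_{\underline{R_j}} P^n \chi_{\underline{R_i}}\, d\mu \leq \int_{R_j} P^n \chi_{R_i}\, d\mu.$$

Similarly, we can bound the term $\int_{R_j} P^n \chi_{R_i}\, d\mu$ from above and thus get the following estimate.

Proposition 3.3

$$\int_{\underline{R_j}} P^n \chi_{\underline{R_i}}\, d\mu \leq T_{i,j}(n) \leq \int_{\overline{R_j}} P^n \chi_{\overline{R_i}}\, d\mu. \tag{14}$$

The next step is to replace $P^n$ by $P_{\mathcal{B}}^n$, since this is the operator we have at hand for computing. The error in making such a replacement is given by the estimate in the following Lemma.

Lemma 3.4. Let $R, S \subset M$ and

$$S_0 = S, \qquad S_{k+1} = f^{-1}(S_k), \qquad k = 0, 1, 2, \ldots.$$

Then for $n = 1, 2, \ldots$

$$\left| \int_S P^n \chi_R\, d\mu - \int_S P_{\mathcal{B}}^n \chi_R\, d\mu \right| \leq 2 \sum_{k=0}^{n-1} (n - k)\, \mu\big(R \cap \overline{S_k} \setminus \underline{S_k}\big).$$

Proof. We proceed by induction on $n$. For $n = 1$ we use (12) and the fact that $\|I - Q_{\mathcal{B}}\| \leq 2$ and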

|P|| = 1 to obtain
f_P{Pn-l-(QBPY-l)XRd^
Js
/ PxndiJ.- f
Js Js I _{Pn-l-{QBP)n-1)XRdv-
Jf-i-(S)
'f-HS)

< [_(P-PB)xRdv- [_ (P-PB)XRdfi Thus, by induction, we obtain the claim. •


Js Js\s
Using Proposition 3.3 and Lemma 3.4 we obtain
the following estimate on the error between the true
< [_(I-QB)PXRdfi transport rate Tij(n) and its approximation. To
Js
abbreviate the notation, let e ^ e j , ^ and Ui € K6
be defined by
I (I-QB)PxRdfi
Js\s Jl, ifBkcRu
fe)fc =
\ 0 , else
< 0 + 2pt{R n S\S) = 2 / J ( P n SoVSb).
l, i f P f e n P ^ 0 ,
Now note that since \\I — QB\\ < 2, ||Qs|| = 1 and (fii)
ilk
0, else
||P|| = 1,
and
n
\\p - (QBPTW (fi(Bk), ifPfcCPi,
(Mi)fe
\ 0, else,
< \\Pn - QBPU\\ + \\QsPn - (QBPTW
<2+||QB||||P||||Pn-1-(QBPr-1|| {Ui)k==
{ 0, else,
< 2n,
where k = 1 , . . . ,b.

by induction. For n > 1 we get Lemma 3.5. Let Ri,Rj C M and


1 y
R
o = Rj> R
i+i ~ f {Rkk), k=0,1,2,...,
[ PnXRd/i- [ PEXRdfi
Js Js then for n = 1, 2 , . . .
|T^(n)-eJP£^|
[_(Pn - Pg)XR dfi- [__ (Pn - Pg)xR dfi
Js Js\s
< ejPg(ui - u^ + (ej - ejfPgUi
< f_(Pn - Pg)XR d/i + 2nn{R n S\S) n-l
Js + 2j2(n-k)v(RinRJk\Ri).
k=0
and, using (12) and the definition of P , the first
term on the right-hand side can be estimated as Proof.
{TijW-efPgUil
n
f_(P - P£)XRd»
Js
f PnXR^~ f PEXR^
JRJ JRj
= f(Pn - QBP71 + QsPn - (QBP)n)xRdv
Js
f PnXR^~ I PBxRld»
n
= f_(I - QB)P XRdn
Js

+ [jQsPKP71-1 - (QePr-^XRdfi f PEXRid»- f PEXR^


Js jRi JR:

A bound on the first term on the right-hand side is


given by Lemma 3.4. For the second term, we use
the observation that led to Proposition 3.3 and get

< i PBXR.dfM - f Fgx&dn


JRj JRj

= eJPgui - ejPgut

= ejPgiui - Mi) + (ft - ejfPgui,


which proves the claim. •

This estimate gives a bound on the error


Fig. 5. Two box transitions that contribute to the error
between the true transport rate Tij(n) and the between the computed and the actual value of the transport
one computed via the transition matrix, eJPgUi, rate Tij{l) from region Ri into region Rj after one iterate.
in terms of those elements of the fine partition B
that either intersect the boundary of Ri and are
mapped into Rj or that intersect Ri at all and are
mapped "onto" the boundary of Rj. In Fig. 5 we §jPB^i essentially consists in n matrix-vector-
illustrate this idea by sketching two box-transitions multiplications — where the matrix PQ is sparse.
that contribute to the error. So an obvious conse-
quence of Lemma 3.5 is that in order to ensure a
certain degree of accuracy of the transport rates for 3.2.4. Convergence
large n, these particular boxes need to be refined.
Lemma 3.5 yields the following convergence state-
Using (12) it also follows that for n = 1 the estimate
ment for the approximate transport rate eJPgUj as
in Proposition 3.3 holds for the discretized transfer
operator, too, i.e. the partition B is refined. Let {B£)t D e a sequence
of partitions such that

ejPBUi<Tij{i)<ejPBUi- (15) max diam(5) 0 as £ —> oo. (18)


BeBe
Moreover, if Jlj = R\ for all sets Ri under consider-
Corollary 3.6. If the regions Ri, i — 1 , . . . , NR, are
ation (i.e. the sets Ri are box collections) then
chosen such that for all i
efPg^ = ejPBUi
(
for all n € N. Notably the estimate in Lemma 3.5 [J 5 -»0 as£ (19)
/' oo,
reduces to see.

\Tij{n)-ejP^Ui\
then for all n (fixed) and all i,j,
n-l
<2Y;(n-k)li(RinWk\Ri), (16) ejPgtui->TiJ(n) (20)
fc=0

and for the special case of n = 1 we even get the as £ —» oo.


exact transport rate: Clearly, under the assumption (18), the condi-
ejPBlLi = Tij(l) = ejPBUi. (17) tion (20) will be satisfied if the boundaries of the
regions Ri are piecewise smooth — as in our case,
Note in particular that the numerical effort where the boundaries of the regions are composed
to compute the approximate transport rate of pieces of invariant manifolds.
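The construction of Secs. 3.2.2 and 3.2.3 can be sketched in a few lines: estimate the entries $p_{ij}$ of (13) by Monte Carlo sampling, then obtain an approximate transport rate from $n$ sparse matrix-vector products applied to the vector of box measures of $R_i$. This is only an illustration under stated assumptions. The Chirikov standard map on the unit torus stands in for the Poincaré map of the PCR3BP, the grid is uniform rather than hierarchical, and every name and parameter value below is hypothetical; none of this reproduces GAIO.

```python
import math
import random

N = 16                    # boxes per axis (hypothetical resolution)
K = 50                    # Monte Carlo sample points per box
BOX_AREA = 1.0 / (N * N)  # mu(B_k) for a uniform grid on the unit torus


def f(x, y, kick=0.8):
    """Chirikov standard map, an area-preserving stand-in for the map f."""
    y_new = (y + kick / (2.0 * math.pi) * math.sin(2.0 * math.pi * x)) % 1.0
    x_new = (x + y_new) % 1.0
    return x_new, y_new


def box_index(x, y):
    """Index of the grid box containing (x, y); clamps the edge case x == 1.0."""
    ix = min(int(x * N), N - 1)
    iy = min(int(y * N), N - 1)
    return ix + N * iy


def transition_columns(rng):
    """columns[j][i] estimates p_ij = mu(f^{-1}(B_i) & B_j) / mu(B_j)."""
    columns = []
    for j in range(N * N):
        x0, y0 = (j % N) / N, (j // N) / N   # lower-left corner of B_j
        col = {}
        for _ in range(K):
            x, y = f(x0 + rng.random() / N, y0 + rng.random() / N)
            i = box_index(x, y)
            col[i] = col.get(i, 0.0) + 1.0 / K
        columns.append(col)
    return columns


def apply_matrix(columns, u):
    """One sparse matrix-vector product v = P_B u."""
    v = [0.0] * len(u)
    for j, col in enumerate(columns):
        if u[j] != 0.0:
            for i, p in col.items():
                v[i] += p * u[j]
    return v


def transport(columns, region_i, region_j, n):
    """Approximate T_ij(n) by e_j^T P_B^n u_i for regions given as sets of box indices."""
    u = [BOX_AREA if k in region_i else 0.0 for k in range(N * N)]
    for _ in range(n):
        u = apply_matrix(columns, u)
    return sum(u[k] for k in region_j)


cols = transition_columns(random.Random(0))
R1 = {k for k in range(N * N) if k // N < N // 2}   # lower half of the torus
R2 = set(range(N * N)) - R1                         # upper half
t11 = transport(cols, R1, R1, 3)
t12 = transport(cols, R1, R2, 3)
```

Because each estimated column sums to one, the total amount of species $S_1$ is conserved by construction, so `t11 + t12` reproduces $\mu(R_1) = 0.5$ up to rounding. Here both regions are unions of boxes, the case in which the text states that the inner and outer approximations coincide.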

3.2.5. Almost invariant decompositions
So far, we have discussed how to compute transport between two given regions. In the remainder of this section we will turn to the question of how to actually find regions of interest. In this context, a region will be of interest if it is almost invariant in the sense that typical points are mapped into the region itself with high probability. The problem of decomposing $M$ into almost invariant sets can be formulated in graph theoretic notation and then solved by applying graph partitioning methods.
The transition probability for two measurable sets $R_i$ and $R_j$ is defined as

$$\rho(R_i, R_j) = \frac{\mu(f^{-1}(R_i) \cap R_j)}{\mu(R_j)}, \qquad \mu(R_j) \neq 0. \tag{21}$$

If we consider the case $R_i = R_j = R$, then this transition probability measures which fraction (measured with respect to $\mu$) of the points in $R$ stays within $R$ after one iteration of $f$. For an invariant set $R = f(R)$ with positive $\mu$-measure, this ratio will be 1. We therefore define the invariance ratio of $R$ as

$$\rho(R) = \rho(R, R). \tag{22}$$

For a given map $f : M \to M$ one can decompose its maximal invariant set into invariant parts, as e.g. chain recurrent sets and connecting orbits between them. For details on these concepts see e.g. [Easton, 1998]. But one may go one step further and ask for macroscopic dynamical structures within the chain recurrent sets themselves. One possible decomposition is given by an almost invariant decomposition of $M$ (where for simplicity we assume $M$ to be chain recurrent from now on) as defined in [Froyland & Dellnitz, 2003]: We ask for a measurable partition $\mathcal{R} = \{R_1, \ldots, R_{N_R}\}$ of $M$ into $N_R$ sets (with $N_R$ fixed) with positive measure (i.e. $\mu(R_k) > 0$), such that the quantity

$$\rho(\mathcal{R}) = \frac{1}{N_R} \sum_{k=1}^{N_R} \rho(R_k) \tag{23}$$

is maximized over all such partitions.
Evidently the infinite dimensional optimization problem (23) needs to be discretized so it may be treated numerically. To this end we again restrict ourselves to sets within $\mathcal{C}_{\mathcal{B}}$, i.e. to sets that are unions of elements of the partition $\mathcal{B}$. Therefore, our goal is to look for partitions $\mathcal{R} = \{R_1, \ldots, R_{N_R}\}$, $R_k \in \mathcal{C}_{\mathcal{B}}$, $\mu(R_k) > 0$, such that

$$\rho(\mathcal{R}) = \frac{1}{N_R} \sum_{k=1}^{N_R} \rho(R_k) = \frac{1}{N_R} \sum_{k=1}^{N_R} \rho(R_k, R_k) \tag{24}$$

is maximized over these special partitions.

3.2.6. Graph formulation
Consider the transition matrix $P_{\mathcal{B}}$ from (13). A transition matrix $P = (p_{ij})$ is called reversible, if for all $i, j$ we have $\mu_j p_{ij} = \mu_i p_{ji}$, where $\mu$ is the stationary distribution of $P$, i.e. $P\mu = \mu$. The matrix $P_{\mathcal{B}}$ is not necessarily reversible. However, the matrix $Q_{\mathcal{B}}$ defined by

$$Q_{\mathcal{B}} = \frac{1}{2}\left(P_{\mathcal{B}} + D P_{\mathcal{B}}^T D^{-1}\right),$$

where $D = \mathrm{diag}(\mu)$ denotes the diagonal matrix with the entries of $\mu$ on the diagonal and which has matrix entries

$$q_{ij} = \frac{\mu(B_j)\, p_{ij} + \mu(B_i)\, p_{ji}}{2\mu(B_j)},$$

is reversible. Let $\mathcal{R} = \{R_1, \ldots, R_{N_R}\}$, $R_k \in \mathcal{C}_{\mathcal{B}}$, $\mu(R_k) > 0$, be a partition of $M$ into $N_R$ sets. The function (24) to be optimized can be written as

$$\rho(\mathcal{R}) = \frac{1}{N_R} \sum_{k=1}^{N_R} \frac{\sum_{B_i, B_j \subset R_k} p_{ij}\, \mu(B_j)}{\sum_{B_j \subset R_k} \mu(B_j)} = \frac{1}{N_R} \sum_{k=1}^{N_R} \frac{\sum_{B_i, B_j \subset R_k} q_{ij}\, \mu(B_j)}{\sum_{B_j \subset R_k} \mu(B_j)} \tag{25}$$

because of $\mu(B_j)\, p_{ij} + \mu(B_i)\, p_{ji} = 2\mu(B_j)\, q_{ij}$.
This optimization problem can be translated into the question of finding an optimal cut in a graph. Let $G = (V, E)$ be a graph with vertex set $V = \mathcal{B}$ and directed edge set

$$E = E(\mathcal{B}) = \{(B_1, B_2) \in \mathcal{B} \times \mathcal{B} : f(B_1) \cap B_2 \neq \emptyset\}.$$

The vertex weight function $vw : V \to \mathbb{R}$ with $vw(B_i) = \mu(B_i)$ assigns a weight to the vertices and the edge weight function $ew : E \to \mathbb{R}$ with $ew((B_i, B_j)) = \mu(B_i)\, p_{ji}$ assigns a weight to the edges. Furthermore, let

$$\bar{E} = \bar{E}(\mathcal{B}) = \{\{B_1, B_2\} \subset \mathcal{B} : (f(B_1) \cap B_2) \cup (f(B_2) \cap B_1) \neq \emptyset\}.$$
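The reversibilization of Sec. 3.2.6 and the cost function (24)/(25) can be checked on a toy example. The sketch below uses a hypothetical three-box mass-flow matrix chosen here so that its row sums equal its column sums (which makes the box measures stationary); the numbers and function names are illustrative assumptions and do not come from the PCR3BP computation.

```python
# Hypothetical mass-flow matrix F[i][j] = mu(f^{-1}(B_i) & B_j) for three boxes.
# Row sums equal column sums, so the box measures are stationary: P mu = mu.
F = [[0.35, 0.05, 0.10],
     [0.10, 0.20, 0.00],
     [0.05, 0.05, 0.10]]
b = len(F)
mu = [sum(F[i][j] for i in range(b)) for j in range(b)]       # box measures mu(B_j)
P = [[F[i][j] / mu[j] for j in range(b)] for i in range(b)]   # p_ij as in (13)


def reversibilize(P, mu):
    """Entries q_ij = (mu_j p_ij + mu_i p_ji) / (2 mu_j), i.e. (P + D P^T D^{-1}) / 2."""
    n = len(mu)
    return [[0.5 * (P[i][j] + mu[i] * P[j][i] / mu[j]) for j in range(n)]
            for i in range(n)]


def invariance_ratio(M, mu, part):
    """Invariance ratio of the union of the boxes listed in `part`, as in (25)."""
    inside = sum(M[i][j] * mu[j] for i in part for j in part)
    return inside / sum(mu[j] for j in part)


def cost(M, mu, partition):
    """Average invariance ratio over the parts, i.e. the cost function (24)."""
    return sum(invariance_ratio(M, mu, part) for part in partition) / len(partition)


Q = reversibilize(P, mu)
partition = [(0, 1), (2,)]
rho_P = cost(P, mu, partition)
rho_Q = cost(Q, mu, partition)
```

For this example both `rho_P` and `rho_Q` evaluate to 0.6875 up to rounding, illustrating the identity behind (25): replacing $p_{ij}$ by $q_{ij}$ symmetrizes the flows between boxes without changing the mass that stays inside each part.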

This defines an undirected graph $\bar{G} = (V, \bar{E})$ with a weight function $ew : \bar{E} \to \mathbb{R}$ with $ew(\{B_i, B_j\}) = 2\mu(B_i)\, q_{ji} = 2\mu(B_j)\, q_{ij} = \mu(B_j)\, p_{ij} + \mu(B_i)\, p_{ji}$ on the edges. The difference between the graphs $G$ and $\bar{G}$ is that in $\bar{G}$ the edge weight between two vertices is the sum of the edge weights of the two directed edges between the same vertices in $G$. Thus, the total edge weights of both graphs are identical.
The partition $\mathcal{R}$ corresponds to the partition of $V$ into $\mathcal{V} = \{V_1, \ldots, V_{N_R}\}$ with $V_i = \{B_j : B_j \subset R_i\}$. For a set $W \subset V$ we denote

$$C_{\mathrm{int}}(W) = \frac{\sum_{(v,w) \in E;\, v,w \in W} ew((v,w))}{\sum_{v \in W} vw(v)} = \frac{\sum_{\{v,w\} \in \bar{E};\, v,w \in W} ew(\{v,w\})}{\sum_{v \in W} vw(v)}$$

called the internal cost of $W$. Note that the internal cost is independent of the choice between the directed graph $G$ and the undirected graph $\bar{G}$. Thus, we are allowed to operate on undirected graphs, as we shall do in the following.
For a partition $\mathcal{V} = \{V_1, \ldots, V_{N_R}\}$ we denote

$$C_{\mathrm{int}}(\mathcal{V}) = \frac{1}{N_R} \sum_{i=1}^{N_R} C_{\mathrm{int}}(V_i) \tag{26}$$

called the internal cost of $\mathcal{V}$. It is an easy task to check that $\rho(\mathcal{R}) = C_{\mathrm{int}}(\mathcal{V})$. Thus, the optimization of our cost function (24) is identical to the optimization of the internal costs of the partition $\mathcal{V}$ (26) written in graph notation and we have established the graph partitioning problem

$$C_{\mathrm{int}}(\mathcal{V}) \to \max. \tag{27}$$

3.2.7. Heuristics and tools for the graph partitioning problem
The optimization problem (27) is known to be NP-complete (even for constant weights, see [Garey & Johnson, 1979]), i.e. an efficient algorithm for solving this problem is not known. Efficient graph partitioning heuristics have been developed for a number of different applications. There are several software libraries, each of which provides a range of different methods. Examples are CHACO [Hendrickson & Leland, 1995], JOSTLE [Walshaw, 2000], METIS [Karypis & Kumar, 1999], SCOTCH [Pellegrini, 1996] or PARTY [Monien et al., 2000]. These libraries are designed to create solutions to the balanced partitioning problem in which all parts are restricted to have an equal (or almost equal) volume of the underlying measure. Therefore, we will use parts of the library PARTY and combine them with some new code which is specially designed to address our cost function (27).
PARTY, like other graph partitioning tools, follows the Multilevel Paradigm which has been proven to be a very powerful approach to efficient graph partitioning. See e.g. [Gupta, 1997; Hendrickson & Leland, 1995; Karypis & Kumar, 1999; Monien et al., 2000; Ponnusamy et al., 1994; Preis, 2000] for a deeper discussion. The efficiency of this paradigm is dominated by two parts: graph coarsening and local improvement. The graph is coarsened down in several levels until a graph with a sufficiently small number of vertices is constructed. A single coarsening step between two levels can be performed by the use of graph matching (independent sets of vertex pairs).
Different methods for calculating the matching will result in different solutions of the partitioning problem. To achieve a selection of different results we will consider heuristics with the following graph matching algorithms:

1. Heavy Edge Matching (HEM): It is a simple, fast and widely used matching strategy in which the weights of the edges are considered.
2. Greedy Matching (GRM): The solution is within a factor of two from the optimal matching, but it requires the sorting of the edges in a preprocessing step.
3. Locally Heaviest Matching (LHM): It is a short algorithm which also guarantees a factor of at most two, but it runs in linear time [Preis, 1999].
4. Path Growing Matching (PGM): It has the same theoretical runtime and approximation quality as LHM but it follows a different strategy [Drake & Hougardy, 2002].

All these matching algorithms are implemented in PARTY and a discussion about their use in the graph partitioning context can be found in [Monien et al., 2000; Preis, 2000].
The coarsening process is stopped when the number of vertices is equal to the desired number of parts $N_R$. Thus, each vertex of the coarse graph is one part of the partition. However, it is also possible to stop the coarsening process as soon as the number of vertices is sufficiently small. Then, any standard graph partitioning method can be used to calculate a partition of the coarse graph.
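To make the coarsening step concrete, the following sketch shows one level of heavy edge matching followed by contraction of the matched vertex pairs. It is a generic textbook-style illustration and makes no claim to reproduce the PARTY implementation: the fixed vertex visiting order, the tie-breaking rule, the data layout (undirected edges stored as `frozenset` keys, no self-loops), and all function names are assumptions made here.

```python
def heavy_edge_matching(num_vertices, edge_weights):
    """Match each unmatched vertex with its heaviest still-unmatched neighbor.

    edge_weights maps frozenset({v, w}) with v != w to the undirected edge weight.
    Returns a dict `mate` with mate[v] == w and mate[w] == v for matched pairs.
    """
    neighbors = {v: [] for v in range(num_vertices)}
    for edge, weight in edge_weights.items():
        v, w = tuple(edge)
        neighbors[v].append((weight, w))
        neighbors[w].append((weight, v))
    mate = {}
    for v in range(num_vertices):  # fixed visiting order (an assumption)
        if v in mate:
            continue
        free = [(weight, w) for weight, w in neighbors[v] if w not in mate]
        if free:
            _, w = max(free)       # heaviest edge; ties go to the larger index
            mate[v], mate[w] = w, v
    return mate


def coarsen(num_vertices, edge_weights, vertex_weights, mate):
    """Contract matched pairs into single coarse vertices.

    Returns the coarse vertex weights, the coarse edge weights, and the map
    from fine to coarse vertex indices.
    """
    coarse_of, coarse_vw = {}, []
    for v in range(num_vertices):
        if v in coarse_of:
            continue
        coarse_of[v] = len(coarse_vw)
        weight = vertex_weights[v]
        if v in mate:
            coarse_of[mate[v]] = coarse_of[v]
            weight += vertex_weights[mate[v]]
        coarse_vw.append(weight)
    coarse_ew = {}
    for edge, weight in edge_weights.items():
        v, w = tuple(edge)
        cv, cw = coarse_of[v], coarse_of[w]
        if cv != cw:  # an edge inside a contracted pair becomes internal
            key = frozenset((cv, cw))
            coarse_ew[key] = coarse_ew.get(key, 0.0) + weight
    return coarse_vw, coarse_ew, coarse_of


# Hypothetical 4-cycle with two heavy and two light edges.
edges = {frozenset((0, 1)): 5.0, frozenset((1, 2)): 1.0,
         frozenset((2, 3)): 4.0, frozenset((3, 0)): 1.0}
mate = heavy_edge_matching(4, edges)
coarse_vw, coarse_ew, coarse_of = coarsen(4, edges, [1.0, 1.0, 1.0, 1.0], mate)
```

In this example the heavy edges {0, 1} and {2, 3} are contracted, so the weight they carry becomes internal to the coarse vertices, which is the behavior one wants when maximizing an internal cost such as (27).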

Finally, the partition of the smallest graph is projected back level-by-level to the initial graph and the partition is locally refined on each level. Standard methods for local improvement are Kernighan/Lin [Kernighan & Lin, 1970] type of algorithms with improvement ideas from Fiduccia/Mattheyses [Fiduccia & Mattheyses, 1982]. The algorithm moves single vertices between the parts to improve the cost function. The choice of the vertices to be moved depends on the cost function to be considered. Therefore, the Kernighan/Lin implementation has been modified in PARTY such that it optimizes the cost function $C_{\mathrm{int}}$.
The software environment GADS (Graph Algorithms for Dynamical Systems) has been established, which consists of a collection of graph algorithms which are useful for the analyses of dynamical systems. See [Dellnitz & Preis, 2003; Padberg et al., 2004]. It has an interface to the graph partitioning library PARTY [Monien et al., 2000] and is designed to work with the tool GAIO (Global Analysis of Invariant Objects, cf. [Dellnitz et al., 2001b]).

4. Example: The Sun-Jupiter-Asteroid System
We will compare and combine both methods from Sec. 3 within the example of the PCR3BP with the Sun and Jupiter as the main bodies, using $e = 9.5368 \times 10^{-4}$. We consider the motion of a particle (asteroid) that has an energy $E = -1.525$ (that is, the Jacobi constant is $C = 3.05$), case 1, as depicted in Fig. 1(a). We will study transport in the exterior realm, using the Poincaré section, $f : M \to M$ where $M \subset \mathbb{R}^2$, defined in Eq. (3), which is shown in Fig. 2(b).

4.1. Lobe dynamics
The only requirement to use lobe dynamics is being able to generate stable and unstable manifolds of the hyperbolic structures in phase space for the time window of interest. This has been done for many years using a simple principle. A small seed set near the hyperbolic point (positioned along the unstable eigenspace) will deform in time and stretch along the unstable invariant manifold. The same procedure performed backwards in time will render the stable manifold, but we can save computational effort by using symmetries of the map $f$.

4.1.1. Symmetries of the Poincaré map f
Using the following symmetry of the equations of motion (1),

$$y \mapsto -y, \qquad t \mapsto -t, \qquad \text{for all } x, y,$$

and therefore $\dot{x} \mapsto -\dot{x}$, the Poincaré map $f$ on the surface of section (3) has the corresponding symmetry

$$\mathrm{sym} : M \times \mathbb{Z} \to M \times \mathbb{Z}, \qquad (x, \dot{x}) \mapsto (x, -\dot{x}), \qquad n \mapsto -n. \tag{28}$$

This symmetry implies that the Poincaré map $f$ is symmetric with respect to reflection about the x-axis and time reversal. Note that this notion of symmetry with time reversal is very useful since it relates stable and unstable manifolds.

4.1.2. Finding a fixed point p of f
Due to the symmetry (28), we may expect to find a fixed point for the Poincaré map $f$ along the x-axis. Using differential correction, we numerically find an unstable fixed point at $p = (x, \dot{x}) = (-2.029579567343744, 0)$, shown in Fig. 6(a).

4.1.3. Finding the stable and unstable manifolds of p under f
Denote the four branches of the stable and unstable manifolds of $p$ by $W^u_+(p)$, $W^u_-(p)$, $W^s_+(p)$, and $W^s_-(p)$. We will consider only the "+" branches. Using the symmetry reduces the calculations by a factor of two, i.e. $W^s_+(p) = \mathrm{sym}(W^u_+(p))$. The local approximation to $W^u_+(p)$ can be obtained as given in [Parker & Chua, 1989]. The basic idea is to linearize the equations of motion about the periodic orbit in the energy surface and then use the monodromy matrix provided by Floquet theory to generate a linear approximation of $W^u_+(p)$. The linear approximation, in the form of a state vector, is numerically integrated in the nonlinear equations of motion to produce the approximation of $W^u_+(p)$.
In practice, we take a finite segment along this linear approximation described by an ordered array of points (the "seed"). Using a standard numerical integration scheme (in this case, RK78), we numerically integrate the equations of motion (1) to obtain the Poincaré map $f$. Under iterates of $f$, each point approaches the manifold at an exponential rate, reducing the positioning error. However, the seed also stretches in the direction of the manifold and
Fig. 6. Transport using lobe dynamics for the same Poincaré surface of section shown in Fig. 2(b). (a) The boundary $B$ between two regions is shown as the thick black line, formed by pieces of one branch of the stable and unstable manifolds of the unstable fixed point $p$. We can call the region inside of the boundary $R_1$ (in cyan) and the outside $R_2$ (in white). The pips $q$ and $f^{-1}(q)$ are shown as black dots along the boundary and the turnstile lobes that will determine the transport between $R_1$ and $R_2$ are shown as colored regions. (b) More details of the turnstile lobes. This is a case of a multilobe, self-intersecting turnstile discussed in Sec. 3.1. A schematic of this situation is shown in Fig. 4. In this case we define the turnstile lobes to be $L_{1,2}(1) = L^{(a)}_{1,2}(1) \cup L^{(b)}_{1,2}(1) \cup L^{(c)}_{1,2}(1)$ and $L_{2,1}(1) = L^{(a)}_{2,1}(1) \cup L^{(b)}_{2,1}(1) \cup L^{(c)}_{2,1}(1)$.

the distance between each point increases exponentially. Since the manifold experiences rapid stretching as it grows in length, it is necessary to check the distance between adjacent points and insert new points if necessary to ensure that sufficient spatial resolution is maintained [Lekien, 2003]. The software package MANGEN is used to implement the adaptive conditioning of the mesh of points approximating the manifold [Lekien & Coulliette, 2004; Lekien, 2003]. More points are added where curvature or stretching is high.

4.1.4. Defining the regions and finding the relevant lobes

The symmetry (28) is useful for defining the regions and lobes. The first intersection of $W^u_+(p)$ with the axis of symmetry is the natural choice for the pip $q$ defining the boundary, shown in Fig. 6(a). We define $R_1$ (in cyan) to be the region bounded by $B = U^+[p,q] \cup S^+[p,q]$, where $U^+[p,q]$ and $S^+[p,q]$ are segments of $W^u_+(p)$ and $W^s_+(p)$, respectively, between $p$ and $q$. We define $R_2$ (in white) to be the complement of $R_1$.

MANGEN can then be used to compute the turnstile lobes $L_{1,2}(1) \cup L_{2,1}(1)$. The turnstile lobes are shown as colored regions in the upper half plane of Fig. 6(a). The first iterate of the turnstile lobes is shown in the lower half plane of Fig. 6(a) in corresponding colors. In the enlarged view, Fig. 6(b), the turnstile lobes are shown in greater detail. This is a case of a multilobe, self-intersecting turnstile, discussed in Sec. 3.1.

The area of the turnstile lobes, i.e. the flux of phase space across the boundary $B$ (and the transport of species across $B$ for just the first iteration of the map $f$), is summarized in Table 1.

Table 1. Flux of phase space across the boundary in terms of canonical area per iterate. Note, $\mu(L_{1,2}(1))$ is the sum $\mu(L^{(a)}_{1,2}(1)) + \mu(L^{(b)}_{1,2}(1)) + \mu(L^{(c)}_{1,2}(1))$. This is the flux in both directions, i.e. $\mu(L_{1,2}(1)) = \mu(L_{2,1}(1))$, since the map $f$ is area-preserving on $M$.

$\mu(L^{(a)}_{1,2}(1))$    $\mu(L^{(b)}_{1,2}(1))$    $\mu(L^{(c)}_{1,2}(1))$    $\mu(L_{1,2}(1))$
0.000956                 0.000870                 0.000399                 0.002225

4.1.5. Higher iterates of the map

To compute all the transport quantities $T_{1,1}(n)$, $T_{1,2}(n)$, $T_{2,1}(n)$, and $T_{2,2}(n)$, it is only

necessary to compute one of them. We compute $T_{1,2}(n)$. By area preservation of the map $f$, we have

$T_{1,1}(n) = \mu(R_1) - T_{1,2}(n)$,
$T_{2,1}(n) = T_{1,2}(n)$,
$T_{2,2}(n) = \mu(R_2) - T_{1,2}(n)$.

The values for $T_{1,2}(n)$ up to $n = 5$ are given in Table 5. We cannot compute beyond $n = 5$ due to computer memory limitations of storing the windy boundaries of the lobes.

4.1.6. Re-entrainment of the lobes

We now illustrate the effect of re-entrainment of the lobes, i.e. lobes leaving and re-entering the specified regions. This geometric effect is believed to have important consequences for the behavior of $T_{i,j}(n)$ as $n$ increases [Wiggins, 1992].

Consider Fig. 7. In this figure we show preimages and images of only the lobe labeled $L^{(b)}_{2,1}(1)$ in Fig. 6(b). Four preimages and five images of this lobe are shown in Fig. 7(a). By definition, we must have $f(L^{(b)}_{2,1}(1)) \subset R_1$, but the other images, i.e. $f^k(L^{(b)}_{2,1}(1))$ for $k > 1$, need not be contained entirely in $R_1$. In the specific geometry shown here, $f^k(L^{(b)}_{2,1}(1)) \cap R_2 \neq \emptyset$ for $k > k_f$, where $k_f = 3$. The boxed region in Fig. 7(a) is shown in more detail in Fig. 7(b). The area of the lobe which lies in $R_1$ or $R_2$ is shown in Fig. 7(c). We conclude that some particles in $L^{(b)}_{2,1}(1)$, which begin in $R_2$, will enter $R_1$ only to return to $R_2$ after just three iterates in $R_1$.

[Fig. 7(a). Panel title: "Transport Across Two Regions (mu = 9.5368e-04, C = 3.05)"; horizontal axis: x (Sun-Jupiter distance = 1).]
Fig. 7. Re-entrainment of the lobes. We show preimages and images of only the lobe labeled $L^{(b)}_{2,1}(1)$ in Fig. 6(b). (a) Four preimages and five images of this lobe are shown. Notice that the images are not contained entirely in $R_1$, i.e. $f^k(L^{(b)}_{2,1}(1)) \cap R_2 \neq \emptyset$ for $k > k_f$, where $k_f = 3$. (b) The boxed region in (a) is shown in more detail. (c) The area of the lobe which lies in $R_1$ or $R_2$ is shown.
[Fig. 7 (continued), panels (b) and (c). Panel title: "Area of Lobe Images under Poincaré Map".]

4.2. Set oriented approach

For the Poincaré map $f : M \to M$ we consider $M$ to be the chain recurrent set within the rectangle $X = [-2.95, -1.05] \times [-0.5, 0.5]$ in the section $y = 0$, $\dot{y} > 0$ (see Sec. 2). For an efficient approach to the construction of the box coverings $\mathcal{B}$ as needed for the discretization of the transfer operator we refer to [Dellnitz et al., 2000]. We approximate the entries of the transition matrix (13) in analogy to the Monte-Carlo approach as described in Sec. 3.2. Only here instead of randomly choosing points in each box we employ a uniform grid of 16 x 16 points.

4.2.1. Almost invariant decomposition of the Poincaré section

The number of parts $N_R$ is an input to the graph partitioning tools. We have experimented with different values for $N_R$ and found that $N_R = 7$ exhibits a lot of valuable information for the current example. As stated in the previous section, we would like to find a partition that maximizes our internal cost. Since the problem of computing an optimal solution is NP-complete, we apply some heuristics as described in Sec. 3.2. Figure 8 shows four different decompositions of $M$ into seven almost invariant sets, obtained by different parameters for the coarsening step. For the partition $\mathcal{P}_1$ we used the matching strategy HEM, for $\mathcal{P}_2$ we used GRM, for $\mathcal{P}_3$ we used LHM and for $\mathcal{P}_4$ we used PGM.

The partition $\mathcal{P}_2$ has the highest internal cost, although the internal costs are almost equal for all partitions. We define the red region as $R_{11}$, light blue as $R_{12}$, dark blue as $R_{13}$, magenta as $R_{14}$, yellow as $R_{21}$, green as $R_{22}$ and white as $R_{23}$. To compare the regions obtained by computing an almost invariant decomposition of $M$ with the regions found by considering branches of stable and unstable manifolds we agglomerate the seven-set partition from above into a two-set partition $\mathcal{R} = \{R_1, R_2\}$ by defining $R_1 = \{R_{11}, R_{12}, R_{13}, R_{14}\}$, $R_2 = \{R_{21}, R_{22}, R_{23}\}$. Figure 9 shows $\mathcal{R}$ together with the boundary as computed in the previous section. It is intriguing to see how well the partitions which were found by the respective methods agree visually. However, using Eq. (17) (which follows from Lemma 3.5 for the special case of $n = 1$ and $R_1$, $R_2$ box collections) we get $T_{1,2}(1) \approx 0.005$ for this particular size of the boxes in the covering, which is considerably larger than the value of 0.0022 as computed in the previous section for the corresponding partition given by the invariant manifolds.

4.2.2. Transport for a two-set partition

In this section we are going to compare the value for the quantity $T_{1,2}(1)$ as computed using lobe
(a) The partition $\mathcal{P}_1$ obtained with the HEM coarsening strategy has an internal cost of 0.9453.
(b) The partition $\mathcal{P}_2$ obtained with the GRE coarsening strategy has an internal cost of 0.9493.
(c) The partition $\mathcal{P}_3$ obtained with the LHM coarsening strategy has an internal cost of 0.9472.
(d) The partition $\mathcal{P}_4$ obtained with the PGM coarsening strategy has an internal cost of 0.9458.
Fig. 8. Almost invariant decomposition of the chain recurrent set $M$ into seven sets, indicated by different colors. We used different partitioning strategies to obtain the subfigures above.

dynamics with the one resulting from an application of the set oriented approach. For both computations we consider the two-set partition $\mathcal{R} = \{R_1, R_2\}$ defined by the two segments of stable and unstable manifolds of a certain fixed point as computed in Sec. 4.1, see Fig. 6(a).

The third and fourth columns of Table 2 show the values $\underline{e}_2^T P_{\mathcal{B}} u_1$ and $\overline{e}_2^T P_{\mathcal{B}} u_1$ for different partitions $\mathcal{B}$ of equally sized boxes. As suggested in Eq. (15) these values indeed sandwich the value of 0.002225 for $T_{1,2}(1)$ as computed using lobe dynamics. However, the bounds are not very tight and seem to converge rather slowly towards the true value.

On the other hand, the error in computing $T_{i,j}(1)$ has to be related to a small subset of boxes of $\mathcal{B}$ only, see Lemma 3.5 and comments thereafter. It is therefore natural to consider an adaptive approach to the construction of the partitions $\mathcal{B}$ in the sense that one only refines boxes that
Fig. 9. Almost invariant decomposition into two sets. We are interested in the transport between the two regions $R_1$ and $R_2$ which we already displayed in Fig. 7(a). The red and yellow areas are an almost invariant decomposition into two sets. The border between the two sets roughly matches the boundary formed by the branches of the stable and unstable manifolds of the fixed point (-2.029579567343744, 0) drawn as a line.

contribute to the error. To be able to identify these, we rely on the results from the previous section on the lobe dynamics approach, where we computed the pieces of the invariant manifolds bounding the two regions to high accuracy. Figure 10 shows the result of an implementation of this approach; the third and fourth columns of Table 3 show the corresponding lower and upper bounds. Note that for a comparable number of boxes in the partitions $\mathcal{B}$ these bounds are much tighter than those computed using equally sized boxes.

Table 2. Lower and upper bounds for the total amount $T_{1,2}(1)$ of species $S_1$ in region $R_2$ after one iterate for the two-set partition $\mathcal{R} = \{R_1, R_2\}$ shown in Fig. 6(a) for various box coverings $\mathcal{B}$.

Box Volume        No. of Boxes   $\underline{e}_2^T P_{\mathcal{B}} u_1$   $\overline{e}_2^T P_{\mathcal{B}} u_1$
4.6387 x 10^-4    2238           0                          0.067417
2.3193 x 10^-4    4436           0                          0.058418
1.1597 x 10^-4    8673           0                          0.041038
5.7983 x 10^-5    17216          0.000034                   0.034708
2.8992 x 10^-5    32789          0.000258                   0.022962

4.2.3. Local optimization

The partitions obtained in the adaptive approach described above are used to compute an improved approximation of the transport rate $T_{1,2}(1)$. The idea is to use local optimization methods for graph partitioning to smooth the boundary between the two regions.

The adaptive approach is based on a partition into three sets $A_1$, $A_2$ and $A_b$, with $A_1 \cup A_2 \cup A_b = R_1 \cup R_2$, corresponding to an internal set $A_1 \subset R_1$, an external set $A_2 \subset R_2$ and a boundary set $A_b$, which is a box covering of the boundary between $R_1$ and $R_2$ provided by the results from the lobe dynamics approach. To get an approximation of the transport rate $T_{1,2}(1)$, we artificially construct a two-partition of the underlying graph corresponding to the sets $A_1$ and $A_2 \cup A_b$. This partition is then locally optimized (by maximizing the internal costs (27)) by the methods described in Sec. 3.2. In this way, one obtains approximations of the sets $R_1$ and $R_2$, which are given as box collections, so that for this particular setting we can again compute the transport rate $T_{1,2}(1)$ using Eq. (17). The fifth column of Table 3 presents the corresponding results.
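The box-based computation of a transport rate can be illustrated on a toy problem. The sketch below is not the GAIO implementation: it assumes a column-stochastic transition matrix whose entry $(i, j)$ is the fraction of test points in box $j$ that land in box $i$, takes $u_1$ as the vector of box areas on $R_1$ and $e_2$ as the indicator of the boxes in $R_2$, and reads the transport rate as $e_2^T P u_1$. Eqs. (13) and (17) are not reproduced in this section, so that reading is our assumption. The standard map on the unit torus and the split into lower and upper halves are hypothetical stand-ins for the Poincaré map and the regions $R_1$, $R_2$.

```python
import numpy as np


def standard_map(x, y, k=0.6):
    """Toy area-preserving map on the unit torus (stand-in for the map f)."""
    y_new = np.mod(y + k / (2 * np.pi) * np.sin(2 * np.pi * x), 1.0)
    x_new = np.mod(x + y_new, 1.0)
    return x_new, y_new


def transition_matrix(f, n_side=16, pts_per_side=4):
    """P[i, j] = fraction of the test points in box j that f maps into box i."""
    n_boxes = n_side * n_side
    # Offsets of a uniform grid of test points inside one box (cell centers).
    off = (np.arange(pts_per_side) + 0.5) / (pts_per_side * n_side)
    P = np.zeros((n_boxes, n_boxes))
    for j in range(n_boxes):
        ix, iy = j % n_side, j // n_side
        gx, gy = np.meshgrid(ix / n_side + off, iy / n_side + off)
        fx, fy = f(gx.ravel(), gy.ravel())
        # Box index of each image point; the clip guards against round-off at 1.0.
        bx = np.minimum((fx * n_side).astype(int), n_side - 1)
        by = np.minimum((fy * n_side).astype(int), n_side - 1)
        P[:, j] = np.bincount(by * n_side + bx, minlength=n_boxes) / fx.size
    return P


n_side = 16
P = transition_matrix(standard_map, n_side)
box_area = 1.0 / n_side**2

# Hypothetical regions: R_1 is the lower half of the torus, R_2 the upper half.
in_r1 = (np.arange(n_side**2) // n_side) < n_side // 2
u1 = np.where(in_r1, box_area, 0.0)   # species S_1 spread uniformly over R_1
u2 = np.where(~in_r1, box_area, 0.0)  # species S_2 spread uniformly over R_2
e1 = in_r1.astype(float)
e2 = (~in_r1).astype(float)

t12 = e2 @ P @ u1  # amount of S_1 found in R_2 after one iterate
t21 = e1 @ P @ u2  # amount of S_2 found in R_1 after one iterate
```

Because the toy map is area-preserving and the torus is closed, the columns of `P` sum to one and `t12` and `t21` agree closely. The text uses a 16 x 16 grid of test points per box; the sketch uses 4 x 4 to keep it fast.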
Fig. 10. Adaptive covering. Dynamical systems techniques have been used to identify locations in which box refinements are
needed, i.e. where the lobes are located. This speeds up the computation considerably.
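The refinement rule behind an adaptive covering of this kind can be sketched in a few lines: only boxes that contain part of the boundary are bisected in both coordinate directions, and all other boxes keep their size. This is a minimal illustration, not the procedure used to produce Fig. 10. The circular boundary, the unit-square domain and the function name `refine` are hypothetical, and the sketch refines only boxes on the boundary itself, not boxes that are mapped onto it.

```python
import numpy as np


def refine(boxes, boundary_pts):
    """Bisect in x and y every box (x0, y0, w, h) containing a boundary point."""
    out = []
    for x0, y0, w, h in boxes:
        inside = (
            (boundary_pts[:, 0] >= x0) & (boundary_pts[:, 0] < x0 + w)
            & (boundary_pts[:, 1] >= y0) & (boundary_pts[:, 1] < y0 + h)
        )
        if inside.any():
            hw, hh = w / 2, h / 2
            out += [(x0, y0, hw, hh), (x0 + hw, y0, hw, hh),
                    (x0, y0 + hh, hw, hh), (x0 + hw, y0 + hh, hw, hh)]
        else:
            out.append((x0, y0, w, h))
    return out


# Hypothetical boundary: a densely sampled circle standing in for the
# manifold segments that separate the two regions.
theta = np.linspace(0.0, 2 * np.pi, 4000, endpoint=False)
boundary = np.column_stack([0.5 + 0.3 * np.cos(theta), 0.5 + 0.3 * np.sin(theta)])

# Start from a uniform 4 x 4 covering of the unit square and refine four times.
boxes = [(i / 4, j / 4, 0.25, 0.25) for i in range(4) for j in range(4)]
for _ in range(4):
    boxes = refine(boxes, boundary)

total_area = sum(w * h for _, _, w, h in boxes)
n_uniform = 16 * 4**4  # box count if every box had been refined four times
```

The adaptive covering reaches the same minimum box size along the boundary as four uniform refinements would, with far fewer boxes than `n_uniform`.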

Table 3. Lower and upper bounds and the optimized value of the total amount $T_{1,2}(1)$ of species $S_1$ contained in region $R_2$ after one iteration. The third and fourth columns present lower and upper bounds for the total amount $T_{1,2}(1)$ of species $S_1$ in region $R_2$ after one iterate for the two-set partition $\mathcal{R} = \{R_1, R_2\}$ shown in Fig. 6(a) for various adaptively refined box partitions $\mathcal{B}$. The fifth column lists the approximate value for $T_{1,2}(1)$, obtained by additionally locally optimizing the partition of $\mathcal{B}$ into two sets.

Box Volume (min)   No. of Boxes   $\underline{e}_2^T P_{\mathcal{B}} u_1$   $\overline{e}_2^T P_{\mathcal{B}} u_1$   Optimized Value
4.6387 x 10^-4     2238           0                          0.067417                   0.008605
1.1597 x 10^-4     3269           0                          0.041038                   0.005166
2.8992 x 10^-5     5455           0.000258                   0.022962                   0.003497
7.2479 x 10^-6     10422          0.000790                   0.012654                   0.002622
1.8110 x 10^-6     21655          0.001362                   0.007508                   0.002324
4.5290 x 10^-7     45946          0.001722                   0.004887                   0.002314

4.2.4. Extrapolation

The results in Table 3 suggest that one should try to derive even better bounds by extrapolating the computed values. By Lemma 3.5 and comments thereafter, the error between $T_{i,j}(1)$ and its approximation $e_j^T P_{\mathcal{B}} u_i$ can be bounded in terms of the Lebesgue measure of a certain set of boxes that either intersect the boundary of region $R_i$ or are mapped onto the boundary of region $R_j$. Roughly speaking this means that whenever those boxes are refined by bisection with respect to both coordinate directions, the Lebesgue measure of the set of boxes that contribute to the error will shrink by a factor of 1/2. In view of this, we make the following Ansatz for an asymptotic expansion of the computed values in terms of the Lebesgue measure $\mu(\mathcal{B}) = \min_{B \in \mathcal{B}} \mu(B)$ of the relevant boxes:

$\underline{e}_j^T P_{\mathcal{B}} u_i \approx T_{i,j}(1) + C \sqrt{\mu(\mathcal{B})}$,   (29)

for some constant $C > 0$; similarly for $\overline{e}_j^T P_{\mathcal{B}} u_i$ and the value as computed after locally optimizing the partition. Table 4 shows the results of extrapolating the values in Table 3 based on the expansion (29). For the extrapolation we have been linearly interpolating the values of two subsequent rows of Table 3, respectively.

Table 4. Extrapolation of the results in Table 3. Using the asymptotic expansion (29), the extrapolation is based on a linear interpolation of the values of two subsequent rows of Table 3.

$\mu(\mathcal{B})$      $\underline{e}_2^T P_{\mathcal{B}} u_1$   $\overline{e}_2^T P_{\mathcal{B}} u_1$   Linear Extrapolation
1.811 x 10^-6    0.0019337                  0.0023648                  0.002026
4.529 x 10^-7    0.0020821                  0.0022651                  0.002304

Table 5. Comparison of the two approaches for higher iterates. Approximate values for the amount $T_{1,2}(n)$ of species $S_1$ in region $R_2$ after $n$ iterates.

n     $T_{1,2}(n)$ (lobe dynamics)   $e_2^T P_{\mathcal{B}}^n u_1$ (set oriented)
1     0.002230                     0.002314
2     0.004461                     0.004449
3     0.006692                     0.006533
4     0.008898                     0.008568
5     0.01110                      0.01056
6     -                            0.01250
7     -                            0.01438
8     -                            0.01623
9     -                            0.01803
10    -                            0.01978
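The extrapolated values in Table 4 can be checked with a few lines of arithmetic. The sketch below assumes that the Ansatz (29) has the form $a(\mu) = T + C\sqrt{\mu}$, with a separate constant $C$ for each of the three sequences, and eliminates $C$ from two consecutive rows of Table 3. The function name `extrapolate` is ours.

```python
import math


def extrapolate(mu_coarse, a_coarse, mu_fine, a_fine):
    """Solve a = T + C * sqrt(mu) for T, given two (mu, a) pairs."""
    s_c, s_f = math.sqrt(mu_coarse), math.sqrt(mu_fine)
    return (a_fine * s_c - a_coarse * s_f) / (s_c - s_f)


# Last three rows of Table 3:
# (minimum box volume, lower bound, upper bound, optimized value)
rows = [
    (7.2479e-6, 0.000790, 0.012654, 0.002622),
    (1.8110e-6, 0.001362, 0.007508, 0.002324),
    (4.5290e-7, 0.001722, 0.004887, 0.002314),
]

# One extrapolated triple per pair of consecutive rows.
table4 = []
for coarse, fine in zip(rows, rows[1:]):
    table4.append(tuple(
        extrapolate(coarse[0], coarse[k], fine[0], fine[k]) for k in (1, 2, 3)
    ))
```

Since the box volume drops by a factor of about four between consecutive rows, $\sqrt{\mu}$ halves and the formula reduces to roughly $2a_{\text{fine}} - a_{\text{coarse}}$. The two triples in `table4` agree with the two rows of Table 4 to within the rounding of the tabulated inputs.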

4.2.5. Higher iterates of the map

For the two-set partition $\mathcal{R} = \{R_1, R_2\}$ as employed in the previous section, Table 5 lists the approximate total amount $T_{1,2}(n)$ of species $S_1$ in region $R_2$ after $n$ time steps. The values in the second column are based on the $n$th iterates of lobe volumes, whereas the values in the third column have been computed as $T_{1,2}(n) \approx e_2^T P_{\mathcal{B}}^n u_1$ using the approximations of $R_1$ and $R_2$ obtained by the locally optimized partition on the finest box level in Table 3. Although the two methods do not use exactly the same two-set partition they agree to within 5% over their common domain. Table 5 and Fig. 11 show that using the set oriented approach one can efficiently approximate the quantities $T_{i,j}(n)$ for quite large $n$: every new iterate requires a single matrix-vector product (where the

Fig. 11. Higher iterates using the set oriented method. Approximate values for $T_{1,2}(n)$ and $T_{1,1}(n)$ up to $n = 50$ iterates using the set oriented approach. (Horizontal axis: n = Iterate Number.)

matrix is sparse) and a scalar product to be computed. Note however that since we are working with a covering consisting of boxes, typically there will be boxes that map outside the covering and thus the resulting transition matrix is not exactly stochastic. This will ultimately lead to $e_2^T P_{\mathcal{B}}^n u_1$ dropping to 0 with increasing $n$.

4.2.6. Return times of the Poincaré map

In terms of the time scale of the underlying differential equation, a species from $R = R_1 \cup R_2$ needs 13.02 to 36.34 years to return to $R$. In Fig. 11 the approximate values for $T_{1,2}(n)$ for 50 iterates are shown. Accordingly, the probability of the transition of a species from $R_1$ to $R_2$ is about 28% after 1817 years.

5. Conclusions and Future Directions

5.1. Good agreement between approaches

We have shown how invariant manifold techniques and the set oriented approach can work together in an important two degree of freedom example problem, reduced to a two-dimensional Poincaré map. For example, graph partitioning gives a coarse-grain global picture of the important regions and indicates where key unstable periodic points reside. The one-dimensional stable and unstable manifolds of those periodic points can then be computed and the lobe areas determined to yield highly accurate transport rates.

As one computes longer and longer stable and unstable manifold curves from an initial seed, computer memory restrictions and the rapid stretching of the manifolds limit the length of the manifold which can be computed. This translates to a maximum iterate, $n_{\max}$, up to which transport can be accurately computed using the invariant manifold/lobe dynamics method.

Based on a coarse model of the underlying system in the form of a finite-state Markov chain, the set oriented approach can compute transport quantities at higher iterates. Using a boundary between regions obtained from the invariant manifold method, one can implement adaptive refinement strategies for the underlying partition of phase space, which improves the efficiency of the method. The good agreement between the set oriented approach with adaptive refinement and the lobe dynamics method over their common domain (up to $n_{\max}$) leads us to believe that also for larger iterates $n > n_{\max}$ the computed transport rates are quite reliable. However, without proper modification the method is not yet suitable for very large iterates ($n > 100$). On the other hand, the method gives reliable transport rates for thousands of Earth years and with cautious extrapolation, one can conclude that the method is indeed of astrodynamical interest.

5.2. Extension to higher dimensions and time dependent systems

Some work has been done on transport in higher dimensions, for example, four-dimensional symplectic maps [Lekien & Marsden, 2004; Gillilan & Ezra, 1991]. In future work, we intend to use box methods and graph algorithms in conjunction with ideas from invariant manifold theory for studying phase space transport in higher dimensions.

Related to this, one can also consider an extension of lobe dynamics to the four-dimensional case [Lekien & Marsden, 2004; Lekien, 2003]. The four-dimensional phase space $M$ of a volume- and orientation-preserving Poincaré map $f : M \to M$ can be divided into disjoint regions of interest, $R_i$, $i = 1, \ldots, N_R$, where the boundaries between regions are pieces of three-dimensional stable and unstable manifolds of two-dimensional normally hyperbolic invariant manifolds (NHIMs), $p_i$, $i = 1, \ldots, N_p$. Moreover, transport between regions of phase space can be completely described by the dynamical evolution of the higher dimensional turnstile lobes, volumes of the phase space enclosed by segments of the stable and unstable manifolds.

One way to approach this problem is to simply use the box subdivision and graph partitioning algorithms to partition $M$ into its important regions and then compute the transport between them. However, this may not be computationally tractable. The complexity of box subdivision methods is proportional to the dimension of the object of interest, not the dimension of the embedding space. Thus, box subdivision methods could be used to (i) obtain the two-dimensional NHIMs, and then (ii) their stable and unstable three-dimensional manifolds which bound regions of $M$.

Another challenging but potentially very fruitful application is to put box subdivision methods and almost invariant sets into the time dependent context, such as occurs in, for instance, ocean dynamics [Lekien et al., 2003]. For such systems,

the idea of "fixed points" and "invariant manifolds" is problematic and one replaces them with notions such as those of Haller [2002] involving Lagrangian coherent structures. Preliminary computations suggest that set oriented methods may be able to reveal such objects with similar properties.

5.3. Merging techniques into a single software package

The merging of statistical and geometric approaches yields a powerful tool. This could be reflected in the merging of the two software packages used in the current study, GAIO and MANGEN. We envision a software package for transport calculations using the box formulation along with adaptive strategies to reduce the computational effort based on highest transport and/or curvature of a low codimensional object.

Furthermore, we can make use of variational integration (VI) techniques, which are known to perform well when computing long time dynamics and chaotic invariant sets for mechanical systems, with and without dissipation. See, for instance, [Kane et al., 2000; Rowley & Marsden, 2002; Marsden & West, 2001]. This also includes asynchronous VI techniques [Lew et al., 2003] which are appropriate for taking different time steps in different spatial regions and yet maintaining all the conservation properties of variational integrators.

5.4. Miscellany

Other topics that warrant further investigation are the addition of dissipation and forcing to the problem, including the effect of other bodies, the Poynting-Robertson drag and the Yarkovsky effect. It would also be of interest to know how the choice of the coordinate system affects the results; specifically, lobe boundaries may be easier to handle in other coordinates, such as Delaunay (action-angle canonical) coordinates for the PCR3BP; in fact, they will not appear as convoluted in Delaunay coordinates.

5.5. Progress towards the grand challenges in computational science

In this paper, we seek to lay a foundation for significant progress toward some of the grand challenges in computational science, including computational astrodynamics, protein folding, and detailed predictive ocean dynamics. Our long term vision is to make a link between (i) the statistical methods which have been used to probe the dynamics in high dimensional systems and (ii) the geometric methods which provide detailed insight into the dynamics of low dimensional systems. This is a gap that we believe the work here begins to bridge.

We are ultimately interested in investigating whether the techniques described here will work for models of direct, practical interest. Thus we first work on simple models with an eye towards building more complex models using the results of simple models as building blocks.

Acknowledgments

This research was partly supported by the DAAD, DFG Priority Program 1095, NSF-ITR grant ACI-0204932, a Max Planck Research Award and the California Institute of Technology President's Fund. This work was carried out in part at the Jet Propulsion Laboratory and California Institute of Technology under a contract with National Aeronautics and Space Administration.

References

Agarwal, P. K., Billeter, S. R., Rajagopalan, P. T., Benkovic, S. J. & Hammes-Schiffer, S. [2002] "Network of coupled promoting motions in enzyme catalysis," Proc. Nat. Acad. Sci. USA 99, 2794-2799.
Belbruno, E. & Marsden, B. [1997] "Resonance hopping in comets," Astron. J. 113, 1433-1444.
Benner, L. A. M. & McKinnon, W. B. [1995] "On the orbital evolution and origin of comet Shoemaker-Levy 9," Icarus 118, 155-168.
Carusi, A., Kresak, L., Pozzi, E. & Valsecchi, G. B. [1985] Long Term Evolution of Short Period Comets (Adam Hilger, Bristol, UK).
Chirikov, B. V. [1979] "A universal instability of many-dimensional oscillator systems," Phys. Rep. 52, 263-379.
Conley, C. [1968] "Low energy transit orbits in the restricted three-body problem," SIAM J. Appl. Math. 16, 732-746.
Coulliette, C. & Wiggins, S. [2001] "Intergyre transport in a wind-driven, quasigeostrophic double gyre: An application of lobe dynamics," Nonlin. Process. Geophys. 8, 69-94.
De Leon, N., Mehta, M. A. & Topper, R. Q. [1991a] "Cylindrical manifolds in phase space as mediators of chemical reaction dynamics and kinetics. I. Theory," J. Chem. Phys. 94, 8310-8328.

De Leon, N., Mehta, M. A. & Topper, R. Q. [1991b] "Cylindrical manifolds in phase space as mediators of chemical reaction dynamics and kinetics. II. Numerical considerations and applications to models with two degrees of freedom," J. Chem. Phys. 94, 8329-8341.
De Leon, N. [1992] "Cylindrical manifolds and reactive island kinetic theory in the time domain," J. Chem. Phys. 96, 285-297.
Dellnitz, M. & Hohmann, A. [1997] "A subdivision algorithm for the computation of unstable manifolds and global attractors," Numer. Math. 75, 293-317.
Dellnitz, M., Hohmann, A., Junge, O. & Rumpf, M. [1997] "Exploring invariant sets and invariant measures," Chaos 7, p. 221.
Dellnitz, M. & Junge, O. [1999] "On the approximation of complicated dynamical behavior," SIAM J. Numer. Anal. 36, 491-515.
Dellnitz, M., Junge, O., Rumpf, M. & Strzodka, R. [2000] "The computation of an unstable invariant set inside a cylinder containing a knotted flow," in Proc. Equadiff 99, eds. Fiedler, B., Groger, K. & Sprekels, J. (World Scientific, Singapore), pp. 1053-1059.
Dellnitz, M., Junge, O., Lo, M. & Thiere, B. [2001a] "On the detection of energetically efficient trajectories for spacecraft," AAS/AIAA Astrodynamics Specialist Conf., Quebec City, Paper AAS 01-326.
Dellnitz, M., Froyland, G. & Junge, O. [2001b] "The algorithms behind GAIO - Set oriented numerical methods for dynamical systems," in Ergodic Theory, Analysis, and Efficient Simulation of Dynamical Systems, ed. Fiedler, B. (Springer), pp. 145-174.
Dellnitz, M. & Junge, O. [2002] "Set oriented numerical methods for dynamical systems," in Handbook of Dynamical Systems II. Towards Applications, eds. Fiedler, B., Iooss, G. & Kopell, N. (World Scientific, Singapore), pp. 221-264.
Dellnitz, M., Junge, O., Lo, M. W., Marsden, J. E., Padberg, K., Preis, R., Ross, S. D. & Thiere, B. [2004] "Transport of Mars-crossers from the quasi-Hilda region," submitted for publication.
Dellnitz, M. & Preis, R. [2003] "Congestion and almost invariant sets in dynamical systems," in Proc. SNSC'01, ed. Winkler, F., LNCS, Vol. 2630 (Springer), pp. 183-209.
Drake, D. E. & Hougardy, S. [2002] "A simple approximation algorithm for the weighted matching problem," Inform. Process. Lett. 85, 211-213.
Easton, R. W. [1998] Geometric Methods for Discrete Dynamical Systems (Oxford University Press, NY).
Froyland, G. & Dellnitz, M. [2003] "Detecting and locating near-optimal almost-invariant sets and cycles," SIAM J. Sci. Comput. 24, 1839-1863.
Fiduccia, C. M. & Mattheyses, R. M. [1982] "A linear-time heuristic for improving network partitions," Proc. IEEE Design Automation Conf., pp. 175-181.
Gillilan, R. E. & Ezra, G. S. [1991] "Transport and turnstiles in multidimensional Hamiltonian mappings for unimolecular fragmentation: Application to van der Waals predissociation," J. Chem. Phys. 94, 2648-2668.
Gladman, B. J., Burns, J. A., Duncan, M., Lee, P. & Levison, H. F. [1996] "The exchange of impact ejecta between terrestrial planets," Science 271, 1387-1392.
Goldreich, P., Lithwick, Y. & Sari, R. [2002] "Formation of Kuiper-belt binaries by dynamical friction and three-body encounters," Nature 420, 643-646.
Gomez, G., Koon, W. S., Lo, M. W., Marsden, J. E., Masdemont, J. & Ross, S. D. [2001] "Invariant manifolds, the spatial three-body problem and space mission design," Adv. Astronaut. Sci. 109, 3-22.
Gupta, A. [1997] "Fast and effective algorithms for graph partitioning and sparse matrix reordering," IBM J. Res. Dev. 41, 171-183.
Haller, G. [2002] "Lagrangian coherent structures from approximate velocity data," Phys. Fluids 14, 1851-1861.
Hammes-Schiffer, S. & Tully, J. C. [1995] "Nonadiabatic transition state theory and multiple potential energy surface molecular dynamics of infrequent events," J. Chem. Phys. 103, 8528-8537.
Hammes-Schiffer, S. [2002] "Comparison of hydride, hydrogen atom, and proton-coupled electron transfer reactions," Chem. Phys. Chem. 3, 33-42.
Henrard, J. [1982] "Capture into resonance: an extension of the use of adiabatic invariants," Celest. Mech. 27, 3-22.
Hendrickson, B. & Leland, R. [1995] "A multilevel algorithm for partitioning graphs," Proc. Supercomputing '95, ACM.
Hobson, D. [1993] "An efficient method for computing invariant manifolds of planar maps," J. Comput. Phys. 104, 14-22.
Hunt, F. Y. [1993] "A Monte Carlo approach to the approximation of invariant measures," National Institute of Standards and Technology, NISTIR 4980.
Jaffe, C., Farrelly, D. & Uzer, T. [2000] "Transition state theory without time-reversal symmetry: chaotic ionization of the hydrogen atom," Phys. Rev. Lett. 84, 610-613.
Jaffe, C., Ross, S. D., Lo, M. W., Marsden, J. E., Farrelly, D. & Uzer, T. [2002] "Statistical theory of asteroid escape rates," Phys. Rev. Lett. 89, 011101.
Johnson, M. R. & Johnson, D. S. [1979] Computers and Intractability: A Guide to the Theory of NP-Completeness (W.H. Freeman and Co).
Kane, C., Marsden, J. E., Ortiz, M. & West, M. [2000] "Variational integrators and the Newmark algorithm for conservative and dissipative mechanical systems," Int. J. Num. Meth. Eng. 49, 1295-1325.

Karypis, G. & Kumar, V. [1999] "A fast and high quality multilevel scheme for partitioning irregular graphs," SIAM J. Sci. Comput. 20, 359-392.
Kernighan, B. W. & Lin, S. [1970] "An effective heuristic procedure for partitioning graphs," The Bell Syst. Tech. J., 291-307.
Konacki, M., Torres, G., Jha, S. & Sasselov, D. D. [2003] "An extrasolar planet that transits the disk of its parent star," Nature 421, 507-509.
Koon, W. S., Lo, M. W., Marsden, J. E. & Ross, S. D. [2000] "Heteroclinic connections between periodic orbits and resonance transitions in celestial mechanics," Chaos 10, 427-469.
Koon, W. S., Lo, M. W., Marsden, J. E. & Ross, S. D. [2001] "Resonance and capture of Jupiter comets," Celest. Mech. Dyn. Astron. 81, 27-38.
Koon, W. S., Lo, M. W., Marsden, J. E. & Ross, S. D. [2001a] "Low energy transfer to the Moon," Celest. Mech. Dyn. Astron. 81, 63-73.
Koon, W. S., Lo, M. W., Marsden, J. E. & Ross, S. D. [2002] "Constructing a low energy transfer between Jovian moons," Contemp. Math. 292, 129-145.
Koon, W. S., Marsden, J. E., Ross, S., Lo, M. & Scheeres, D. J. [2004] "Geometric mechanics and the dynamics of asteroid pairs," Ann. NY Acad. Sci. 1017, 11-38.
Kostelich, E. J., Yorke, J. A. & You, Z. [1996] "Plotting stable manifolds: Error estimates and noninvertible maps," Physica D93, 210-222.
Laskar, J. [1989] "A numerical experiment on the chaotic behaviour of the solar system," Nature 338, 237-238.
Lekien, F. [2003] "Time-dependent dynamical systems and geophysical flows," Ph.D. thesis, California Institute of Technology.
MacKay, R. S., Meiss, J. D. & Percival, I. C. [1984] "Transport in Hamiltonian systems," Physica D13, 55-81.
MacKay, R. S., Meiss, J. D. & Percival, I. C. [1987] "Resonances in area-preserving maps," Physica D27, 1-20.
Malhotra, N. & Wiggins, S. [1998] "Geometric structures, lobe dynamics, and Lagrangian transport in flows with aperiodic time dependence, with applications to Rossby wave flow," J. Nonlin. Sci. 8, 401-456.
Malhotra, R. [1996] "The phase space structure near Neptune resonances in the Kuiper belt," Astron. J. 111, 504-516.
Malhotra, R., Duncan, M. & Levison, H. [2000] "Dynamics of the Kuiper belt," in Protostars and Planets IV, eds. Mannings, V., Boss, A. P. & Russell, S. S. (Univ. of Arizona Press, Tucson), pp. 1231-1254.
Marsden, J. E. & West, M. [2001] "Discrete mechanics and variational integrators," Acta Numer. 10, 357-514.
Marston, C. C. & De Leon, N. [1989] "Reactive islands as essential mediators of unimolecular conformational isomerization: A dynamical study of 3-phospholene," J. Chem. Phys. 91, 3392-3404.
McGehee, R. [1969] "Some homoclinic orbits for the restricted three body problem," Ph.D. thesis, University of Wisconsin, Madison, Wisconsin.
Meiss, J. D. & Ott, E. [1986] "Markov tree model of transport in area-preserving maps," Physica D20, 387-402.
Meiss, J. D. [1992] "Symplectic maps, variational principles, and transport," Rev. Mod. Phys. 64, 795-848.
Mileikowsky, C., Cucinotta, F. A., Wilson, J. W.,
Lekien, F., Coulliette, C. & Marsden, J. E. [2003] Gladman, B., Horneck, G., Lindegren, L., Melosh,
"Lagrangian structures in very high frequency radar J., Rickman, H., Valtonen M. & Zheng, J. Q. [2000]
data and optimal pollution timing," 7th Experimental "Natural transfer of viable microbes in space — 1.
Chaos Conf. (AIP), pp. 162-168. From Mars to Earth and Earth to Mars," Icarus 145,
Lekien, F. & Coulliette, C. [2004] "MANGEN: Compu- 391-427.
tation of hyperbolic trajectories, invariant manifolds Monien, B., Preis, R. k. Diekmann, R. [2000] "Quality
and lobes of dynamical systems defined as 2D+1 data matching and local improvement for multilevel graph-
sets," in preparation. partitioning," Parall. Comput. 26, 1609-1634.
Lekien, F. & Marsden, J. E. [2004] "Separatrices in high- Morbidelli, A., Chambers, J., Lunine, J. I., Petit,
dimensional phase spaces: Application to Van Der J. M., Robert, F., Valsecchi, G. B. & Cyr, K. E.
Waals dissociation," in preparation. [2000] "Source regions and timescales for the deliv-
Levison, H. F. & Duncan, M. J. [1993] "The gravitational ery of water to the Earth," Meteor. Planet. Sci. 35,
sculpting of the Kuiper belt," Astrophys. J. 406, 1309-1320.
L35-L38. Murray, N. & Holman, M. [2001] "The role of chaotic res-
Lew, A., Marsden, J. E., Ortiz, M. & West, M. [2003] onances in the solar system," Nature 410, 773-779.
"Asynchronous variational integrators," Arch. Rat. Neishtadt, A. [1996] "Scattering by resonances," Celest.
Mech. An. 167, 85-146. Mech. Dyn. Astr. 65, 1-20.
Lew, A., Marsden, J. E., Ortiz, M. & West, M. [2004] Neishtadt, A. I., Sidorenko, V. V. k Treschev, D. V.
"Variational time integration for mechanical sys- [1997] "Stable periodic motions in the problem on pas-
tems," Int. J. Num. Meth. Engin. 60, 153-212. sage through a separatrix," Chaos 7, 2-11.
Lichtenberg, A. J. & Lieberman, M. A. [1983] Regular Ozorio de Almeida, A. M., De Leon, N., Mehta, M. A.
and Stochastic Motion (Springer-Verlag, NY). & Marston, C. C. [1990] "Geometry and dynamics
Transport in Dynamical Astronomy and Multibody Problems 31

of stable and unstable cylinders in Hamiltonian sys- 13th AAS/AIAA Space Flight Mechanics Meeting,
tems," Physica D46, 265-285. Ponce, Puerto Rico, Paper AAS 03-143.
Padberg, K., Preis, R. & Dellnitz, M. [2004] "Integrating Rowley, C. W. & Marsden, J. E. [2002] "Variational inte-
multilevel partitioning with hierarchical set-oriented grators for point vortices," Proc. CDC40, 1521-1527.
methods for the analysis of dynamical systems," Scheeres, D. J. [2002] "Stability of binary asteroids,"
Technical Report, University of Paderborn. Icarus 159, 271-283.
Parker, T. S. & Chua, L. 0 . [1989] Practical Numerical Scheeres, D. J., Durda, D. D. & Geissler, P. E. [2002]
Algorithms for Chaotic Systems (Springer-Verlag, "The fate of asteroid ejecta," in Asteroids III, eds.
NY). Bottk, W. M. et al., University of Arizona, Tuscon,
Pellegrini, F. [1996] "SCOTCH 3.1 user's guide," pp. 527-544.
Technical Report 1137-96, LaBRI, University of Schroer, C. G. & Ott, E. [1997] "Targeting in Hamilto-
Bordeaux. nian systems that have mixed regular/chaotic phase
Perry, A. D. & Wiggins, S. [1994] "KAM tori are very spaces," Chaos 7, 512-519.
sticky: Rigorous lower bounds on the time to move Szebehely, V. [1967] Theory of Orbits (Academic Press,
away from an invariant Lagrangian torus with linear NY-London).
flow," Physica D71, 102-121. Tancredi, G., Lindgren, M. & Rickman, H. [1990]
Poje, A. C , & Haller, G. [1999] "Geometry of cross- "Temporary satellite capture and orbital evolution of
stream mixing in a double-gyre ocean model," Phys. comet P/Helin-Roman-Crockett," Astron. Astrophys.
Oceanogr. 29, 1649-1665. 239, 375-380.
Ponnusamy, R., Mansour, N., Choudhary, A. & Tancredi, G. [1995] "The dynamical memory of Jupiter
Fox, G. C. [1994] "Graph contraction for mapping family comets," Astron. Astrophys. 299, 288-292.
data on parallel computers: A quality-cost tradeoff," Tiscareno, M. & Malhotra, R. [2003] "The dynamics of
Sci. Program. 3, 73-82. known Centaurs," Astron. J. 126, 3122-3131.
Preis, R. [1999] "Linear time 1/2-approximation algo- Torbett, M. V. & Smoluchowski, R. [1990] "Chaotic
rithm for maximum weighted matching in general motion in a primordial comet disk beyond Neptune
graphs," Symp. Theoretical Aspects in Computer Sci- and comet influx to the Solar System," Nature 345,
ence (STACS), pp. 259-269. 49-51.
Preis, R. [2000] "Analyses and design of efficient graph Uzer, T., Jaffe, C , Palacian, J., Yanguas, P.
partitioning methods," Dissertation. Heinz Nixdorf & Wiggins, S. [2002] "The geometry of reaction
Institut Verlagsschriftenreihe, Universitat Paderborn. dynamics," Nonlinearity 15, 957-992.
Rom-Kedar, V., Leonard, A. & Wiggins, S. [1990] "An Valsecchi, G. B. [1992] "Close encounters, planetary
analytical study of transport, mixing and chaos in an masses, and the evolution of cometary orbits,"
unsteady vortical flow," J. Fluid Mech. 214, 347-394. in Periodic Comets, eds. Fernandez, J. A. &
Rom-Kedar, V. k Wiggins, S. [1990] "Transport in Rickman, H. Univ. de la Republica, Montevideo,
two-dimensional maps," Arch. Rat. Mech. Anal. 109, Uruguay, pp. 143-157.
239-298. Veillet, C , Parker, J. W., Griffin, I., Marsden, B.,
Rom-Kedar, V. & Wiggins, S. [1991] "Transport in two- Doressoundiram, A., Buie, M., Tholen, D. J., Connel-
dimensional maps: Concepts, examples, and a com- ley, M. & Holman, M. J. [2002] "The binary kuiper-
parison of the theory of Rom-Kedar and Wiggins belt object 1998 ww31," Nature 416, 711-713.
with the Markov model of MacKay, Meiss, Ott, and Walshaw, C. [2000] "The Jostle user manual: Version
Percival," Physica D51, 248-266. 2.2," University of Greenwich.
Rom-Kedar, V. [1999] "Transport in a class of n-d.o.f. Wiggins, S. [1992] Chaotic transport in Dynamical Sys-
systems," in Hamiltonian Systems with Three or More tems, Interdisciplinary Appl. Math., Vol. 2 (Springer,
Degrees of Freedom (S'Agaro, 1995), NATO Adv. Sci. Berlin-Heidelberg-NY).
Inst. Ser. C Math. Phys. Sci., Vol. 533 (Kluwer Acad. Wisdom, J. [1980] "The resonance overlap criterion and
P u b l , Dordrecht), pp. 538-543. the onset of stochastic behavior in the restricted
Ross, S. D. [2003] "Statistical theory of interior-exterior three-body," Astron. J. 85, 1122-1133.
transition and collision probabilities for minor bod- Yamato, H. & Spencer, D. B. [2003] "Numerical inves-
ies in the solar system," Proc. Int. Conf. Libration tigation of perturbation effects on orbital classifi-
Point Orbits and Applications, Parador d'Aiguablava, cations in the restricted three-body problem, 13th
Spain, June 10-14, 2002, eds. Gomez, G., Lo, M. W. AAS/AIAA Space Flight Mechanics Meeting, Ponce,
& Masdemont, J. J. (World Scientific, Singapore), Puerto Rico, Paper AAS 03-235.
pp. 637-652. You, Z., Kostelich, E. J. & Yorke, J. A. [1991] "Calculat-
Ross, S. D., Koon, W. S., Lo, M. W. & Marsden, ing stable and unstable manifolds," Int. J. Bifurcation
J. E. [2003] "Design of a multi-moon orbiter," and Chaos 1, 605-623.
This page is intentionally left blank
A BRIEF SURVEY ON THE NUMERICAL DYNAMICS
FOR FUNCTIONAL DIFFERENTIAL EQUATIONS
BARNABAS M. GARAY
Department of Mathematics,
Budapest University of Technology and Economics,
H-1521 Budapest, Hungary

Received May 3, 2004; Revised June 16, 2004

GYULA FARKAS (1972-2002) IN MEMORIAM

This is a survey on discretizing delay equations from a geometric-qualitative view-point. Concepts like compact attractors, hyperbolic periodic orbits, the saddle structure around hyperbolic equilibria, center-unstable manifolds of equilibria, inertial manifolds, structural stability, and Kamke monotonicity are considered. Error estimates for smooth and nonsmooth initial data in various $C^j$ topologies are provided. The emphasis is put on Runge-Kutta methods with natural interpolants. The paper ends with a collection of the related results on retarded functional differential equations with bounded delay.

Keywords: Delay equations; Runge-Kutta discretizations; invariant manifolds.

1. Introduction

The first monograph on numerical methods for delay differential equations was published by Bellen and Zennaro [2003]. Together with their own, they present relevant results of Baker, Brunner, Enright, Guglielmi, Hayashi, Hairer, in't Hout, Iserles, Jackiewicz, Koto, Maset, Tavernini, Torelli, Vermiglio and many other researchers. The book is almost 400 pages long and the bibliography contains 288 items.

The principle of organizing the material in [Bellen & Zennaro, 2003] is that, mutatis mutandis, all discretization methods used for ordinary differential equations can be applied to delay and neutral equations as well. Results for ordinary differential equations are followed by those on delay and neutral equations. Both similarities and differences (compared to the case of ordinary differential equations) are analyzed in detail. In line with the mainstream tradition of presenting numerical analysis for ordinary differential equations [Butcher, 1987; Hairer et al., 1993], the emphasis is put on convergence and stability properties of Runge-Kutta methods. The technicalities, especially those related to the choice of the stepsize sequence, depend crucially on the type of the delay. Delay equations of the form $\dot x(t) = f(t, x(t), x(t-\tau))$ and neutral equations of the form $\dot x(t) = f(t, x(t), x(t-\tau), \dot x(t-\tau))$ ($t \ge t_0$, $t_0 \in \mathbb{R}$) are considered. With increasing complexity, the delay can be constant ($\tau = \tau_0 > 0$), bounded and time dependent ($\tau = \tau(t) \in [0, \tau_0]$), bounded and space dependent ($\tau = \tau(x(t)) \in [0, \tau_0]$), and proportional ($\tau = qt$ with some $q \in (0,1)$ and $t_0 > 0$); initial data are functions defined on the interval $[t_0 - \tau_0, t_0]$ and $[(1-q)t_0, t_0]$, respectively. Multiple and distributed delays are discussed incidentally.

The monograph [Bellen & Zennaro, 2003] has grown out of traditional numerical analysis. Of course, the authors are well aware of the fact that the phase space of a delay equation is a function space. The approximating solution is computed first


at the mesh points and then, via interpolation, on the intervals between mesh points. However, little attention is paid to the question whether geometric-qualitative aspects of the solution dynamics are preserved under discretization. The main object of investigation is the relation between individual solution trajectories and their numerical approximation in $\mathbb{R}^n$. In other words, the much younger tradition of numerical dynamics, i.e. of handling numerical methods from the view-point of dynamical system theory, plays a rather limited role in [Bellen & Zennaro, 2003].

Thus there is good reason to write a brief survey on delay equations placed within the general framework of numerical dynamics. When doing this, we reconsider some central topics discussed in [Stuart & Humphries, 1996], the first comprehensive presentation of numerical dynamics for ordinary differential equations:

• discretization methods as approximating semidynamical systems,
• compact attractors,
• hyperbolic periodic orbits,
• stable and unstable manifolds of hyperbolic equilibria.

Throughout this paper, we consider only autonomous equations with bounded delay and focus our attention on Runge-Kutta methods with polynomial interpolation. Aspects of

• inertial manifolds,
• structural stability,
• Kamke monotonicity

are also discussed. We refer frequently to papers of our late colleague Gyula Farkas, who died in a car accident on February 27, 2002; he was to receive his PhD Diploma at the end of the same week.

The development of numerical dynamics started for ordinary and parabolic partial differential equations simultaneously. Retarded equations followed with some delay and were influenced by the corresponding results on parabolic equations as well as by the general theory of semiflows in Banach spaces. An analysis of the related/underlying work on parabolic equations and general semiflows is beyond the scope of this paper. The most important contributions are cited in the literature we refer to. The monograph [Bellen & Zennaro, 2003] surveys connections to the numerics of Volterra integral equations.

2. Discretization as a Family of Approximating Discrete-Time Semidynamical Systems

In this section we collect some basic definitions and results on discretizing functional differential equations. This requires reformulation and restating within the framework of abstract dynamical systems theory. This level of abstractness, which is not needed for describing results on approximating individual solutions in [Bellen & Zennaro, 2003], is absolutely essential when treating qualitative-geometric phenomena.

2.1. Runge-Kutta discretization for delay equations

For simplicity, take $t_0 = 0$, $\tau_0 = 1$, and consider first the initial value problem

$$\begin{aligned} \dot x(t) &= f(x(t), x(t-1)) && \text{for } t \ge 0 \\ x(t) &= \eta(t) && \text{for } t \in [-1, 0] \end{aligned} \tag{1}$$

where $f : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n$ is a bounded $C^p$ function with bounded derivatives and $\eta \in C([-1,0], \mathbb{R}^n)$, the Banach space of continuous $\mathbb{R}^n$-valued functions on the interval $[-1,0]$. The maximum norm on $C = C([-1,0], \mathbb{R}^n)$ is denoted by $\|\cdot\|$. The Euclidean norm on $\mathbb{R}^n$ is denoted by $|\cdot|$. The smoothness and boundedness assumptions on $f$ imply that the initial value problem (1) has a unique solution $x = x_{\mathrm{exact}}(\cdot, \eta) : [-1, \infty) \to \mathbb{R}^n$. Moreover, formula $(\Phi(t,\eta))(s) = x(t+s, \eta)$, $s \in [-1,0]$ defines a semidynamical system $\Phi : \mathbb{R}^+ \times C \to C$. With respect to the second variable, $\Phi$ is of class $C^p$, $p = 1, 2, \ldots$. Note that $x_{\mathrm{exact}}(\cdot, \eta)$ is differentiable at $t_0 = 0$ if and only if the left-hand side derivative of the initial function $\eta$ exists at $t_0 = 0$ and $\dot\eta(0) = f(\eta(0), \eta(-1))$. It follows immediately that $\Phi$ is not differentiable with respect to its first variable. Since (1) defines a nonautonomous ordinary differential equation of the form $\dot x = f(x, \eta(t-1))$ on the interval $[0,1]$, $x_{\mathrm{exact}}(\cdot, \eta)|_{[0,1]} = \Phi(1, \eta)$ is of class $C^1$ and, by a simple induction argument, $x_{\mathrm{exact}}(\cdot, \eta)|_{[j,j+1]} = \Phi(j+1, \eta)$ is of class $C^{j+1}$, $j = 0, 1, \ldots, p$. Moreover, $x_{\mathrm{exact}}(\cdot, \eta)|_{[j,\infty)}$ is of class $C^{j+1}$, $j = 0, 1, \ldots, p$; this is the well-known smoothing property of the solution semidynamical system.
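The smoothing property can be checked by hand on a small example. The following sketch is an illustration only and rests on assumptions that are not part of the text: it takes the scalar linear equation $\dot x(t) = -x(t-1)$ with $\eta \equiv 1$ (so $f(x,y) = -y$, which is not bounded as the standing assumptions require), and the helper names `next_piece` and `matching_order` are hypothetical. It builds the exact solution interval by interval (the method of steps) and counts how many derivatives agree at the breakpoints $t = 0, 1, 2$.

```python
from fractions import Fraction


def evaluate(coeffs, u):
    """Evaluate the polynomial sum(coeffs[i] * u**i)."""
    return sum(c * u**i for i, c in enumerate(coeffs))


def derivative(coeffs):
    """Coefficients of the derivative polynomial."""
    return [i * c for i, c in enumerate(coeffs)][1:] or [Fraction(0)]


def next_piece(prev):
    """Method of steps for the assumed equation x'(t) = -x(t - 1).

    prev holds the solution on [j - 1, j] in the local variable u = t - (j - 1).
    The result holds it on [j, j + 1] in the local variable u = t - j, using
    x(j + u) = x(j) - integral_0^u prev(v) dv.
    """
    integral = [Fraction(0)] + [c / (i + 1) for i, c in enumerate(prev)]
    piece = [-c for c in integral]
    piece[0] = evaluate(prev, Fraction(1))  # continuity at the breakpoint
    return piece


def matching_order(left, right, max_order=10):
    """Highest k such that derivatives 0..k agree at the junction (-1 if none)."""
    order = -1
    for k in range(max_order + 1):
        if evaluate(left, Fraction(1)) != evaluate(right, Fraction(0)):
            break
        order = k
        left, right = derivative(left), derivative(right)
    return order


pieces = [[Fraction(1)]]  # eta == 1 on [-1, 0]
for _ in range(3):
    pieces.append(next_piece(pieces[-1]))

# Smoothness of the exact solution at the breakpoints t = 0, 1, 2.
orders = [matching_order(pieces[j], pieces[j + 1]) for j in range(3)]
print(orders)
```

In this example the solution is merely continuous at $t = 0$ (the compatibility condition $\dot\eta(0) = f(\eta(0), \eta(-1))$ fails), once differentiable at $t = 1$ and twice differentiable at $t = 2$, in line with the statement that $x_{\mathrm{exact}}(\cdot,\eta)|_{[j,\infty)}$ is of class $C^{j+1}$.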

Fix $h_0 \in (0,1]$ and let $h \in (0, h_0]$. The stepsize-$h$ explicit Euler discretization operator with piecewise linear interpolant is defined as $\varphi_{E,PLI} : (0, h_0] \times C \to C$, $(h, \eta) \mapsto \varphi_{E,PLI}(h, \eta)$,

$$(\varphi_{E,PLI}(h,\eta))(s) = \begin{cases} \eta(h+s) & \text{if } s \in [-1, -h] \\ -\dfrac{s}{h}\,\eta(0) + \left(1 + \dfrac{s}{h}\right) X_E(h,\eta) & \text{if } s \in [-h, 0] \end{cases} \tag{2}$$

where $X_E(h,\eta) = \eta(0) + h f(\eta(0), \eta(-1))$. The right-hand side of formula (2) defines the stepsize-$h$ implicit Euler discretization operator $\varphi_{I,PLI}$ with piecewise linear interpolant if $X_E(h,\eta)$ is replaced by $X_I(h,\eta)$, the unique solution of equation $X = \eta(0) + h f(X, \eta(h-1))$, for $h \in (0, h_1]$, where $h_1 > 0$ is sufficiently small.

Similarly, every Runge-Kutta method $M$ (known from the numerics of ordinary differential equations [Butcher, 1987; Hairer et al., 1993]) can be applied to Eq. (1). The stepsize-$h$ Runge-Kutta discretization operator with piecewise linear interpolant is defined as $\varphi_{M,PLI} : (0, h_1] \times C \to C$, $(h, \eta) \mapsto \varphi_{M,PLI}(h, \eta)$,

$$(\varphi_{M,PLI}(h,\eta))(s) = \begin{cases} \eta(h+s) & \text{if } s \in [-1, -h] \\ -\dfrac{s}{h}\,\eta(0) + \left(1 + \dfrac{s}{h}\right) X_M(h,\eta) & \text{if } s \in [-h, 0] \end{cases} \tag{3}$$

where

$$X_M(h,\eta) = \eta(0) + h \sum_{i=1}^{\nu} b_i f(X_i, \eta(c_i h - 1)) \tag{4}$$

with

$$X_i = \eta(0) + h \sum_{j=1}^{\nu} a_{ij} f(X_j, \eta(c_j h - 1)), \quad i = 1, 2, \ldots, \nu. \tag{5}$$

Here the positive integer $\nu$ and the real constants $\{a_{ij}\}_{i,j=1}^{\nu}$, $\{b_i\}_{i=1}^{\nu}$ and $\{c_i\}_{i=1}^{\nu}$ are the parameters of the Runge-Kutta method $M$. We leave them unspecified but assume that $c_i \in [0,1]$ for $i = 1, 2, \ldots, \nu$. Note that for $h$ sufficiently small, say $h \le h_1$ ($\le h_0$), the right-hand side of (5) defines a contraction operator on $\mathbb{R}^n \times \mathbb{R}^n \times \cdots \times \mathbb{R}^n$ ($\nu$ times).

The stepsize-$h$ Runge-Kutta discretization operator with a standard interpolant is defined as $\varphi_{M,NFI} : (0, h_1] \times C \to C$, $(h, \eta) \mapsto \varphi_{M,NFI}(h, \eta)$,

$$(\varphi_{M,NFI}(h,\eta))(s) = \begin{cases} \eta(h+s) & \text{if } s \in [-1, -h] \\ \eta(0) + h \displaystyle\sum_{i=1}^{\nu} \beta_i\!\left(-\dfrac{s}{h}\right) f(X_i, \eta(c_i h - 1)) & \text{if } s \in [-h, 0] \end{cases} \tag{6}$$

where $\{X_i\}_{i=1}^{\nu}$ is determined by (5) and the polynomials $\beta_i : [0,1] \to \mathbb{R}$ satisfy $\beta_i(0) = b_i$, $\beta_i(1) = 0$, and $\beta_i(1 - c_j) = a_{ji}$, $i, j = 1, 2, \ldots, \nu$. Note that the collection of the requirements on $\{\beta_i\}_{i=1}^{\nu}$ is equivalent to the collection of the properties

$$(\varphi_{M,NFI}(h,\eta))(-h) = \eta(0), \qquad (\varphi_{M,NFI}(h,\eta))(0) = X_M(h,\eta) \quad \text{and} \quad (\varphi_{M,NFI}(h,\eta))(-h + c_i h) = X_i, \quad i = 1, 2, \ldots, \nu.$$

The first two letters in NFI refer to the terminus technicus "natural" and "of the first class", respectively. Throughout this paper, we assume that our Runge-Kutta method, when applied to ordinary differential equations (like $\dot x = f(x, \eta(t-1))$ on the interval $[0,1]$), is of order $p$.

In most practical implementations, the approximating solution $x_{\mathrm{approx}}(\cdot, \eta) : [-1, \infty) \to \mathbb{R}^n$ is computed only at the mesh points $\{k/N\}_{k \ge 0}$. When keeping track of internal stage values, one arrives at the finite sequence of points $\{x_{\mathrm{approx}}((k + c_i)/N, \eta)\}_{k \ge 0;\, i = 1, \ldots, \nu} \subset \mathbb{R}^n$. In particular, $x_{\mathrm{approx}}(1/N, \eta) = X_M(h, \eta)$ and $x_{\mathrm{approx}}(c_i/N, \eta) = X_i$ from (5)-(6), $i = 1, 2, \ldots, \nu$.
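As an illustration of formula (2), the following sketch implements the explicit Euler operator with piecewise linear interpolant as a map on history functions and iterates it $N$ times with $h = 1/N$. It is a minimal sketch under assumptions that are not taken from the text: the scalar case $n = 1$, the function names `euler_pli_step` and `approx_at_time_one`, and the test problem $\dot x(t) = -x(t-1)$, $\eta(s) = 1 + s$ (again with an unbounded $f$, chosen only because the exact value $x(1) = 1/2$ is easy to compute).

```python
def euler_pli_step(h, eta, f):
    """Apply formula (2) once: return the new history function on [-1, 0]."""
    eta0 = eta(0.0)
    x_e = eta0 + h * f(eta0, eta(-1.0))  # X_E(h, eta)

    def new_eta(s):
        if s <= -h:                      # shifted part of the old history
            return eta(h + s)
        theta = 1.0 + s / h              # theta runs from 0 to 1 on [-h, 0]
        return (1.0 - theta) * eta0 + theta * x_e

    return new_eta


def approx_at_time_one(n_steps, eta, f):
    """Iterate the operator n_steps times with h = 1/n_steps; return the value at s = 0."""
    h = 1.0 / n_steps
    for _ in range(n_steps):
        eta = euler_pli_step(h, eta, f)
    return eta(0.0)


# Assumed test problem: x'(t) = -x(t - 1), eta(s) = 1 + s, so that x(1) = 1/2.
f = lambda x, y: -y
eta = lambda s: 1.0 + s
errors = {n: abs(approx_at_time_one(n, eta, f) - 0.5) for n in (10, 20, 40)}
print(errors)
```

For this particular problem the Euler value at $t = 1$ is $1/2 + h/2$ in exact arithmetic, so halving the stepsize halves the error, as expected for a method of order one.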

It is immediate that $\varphi_{M,NFI} : (0, h_1] \times C \to C$ is continuous. With respect to the second variable, $\varphi_{M,NFI}$ is of class $C^p$. For $h \in (0, h_1]$ fixed, the iterates $\{\varphi^k_{M,NFI}(h, \cdot)\}_{k=0}^{\infty}$ define a discrete-time semidynamical system on $C$. Our uniformity assumptions on $f$ imply that, for any time $T > 1$ and for any ball $B \subset C$, the set

$$\{\varphi^k_{M,NFI}(h,\eta) \in C \mid 1 \le kh \le T,\ k \in \mathbb{N},\ h \in (0, h_1],\ \eta \in B\}$$

consists of uniformly bounded and uniformly Lipschitz continuous functions. Note that the set $\Phi([1,T], B)$ consists of uniformly bounded and uniformly Lipschitz continuous functions, too; this is the well-known compactifying property of the solution semidynamical system.

From the view-point of a qualitative theory of discretizations, it is natural to define stepsize-$h$ discretization operators as above, i.e. as self-maps of the infinite-dimensional function space $C$. However, this is not quite satisfactory for practical purposes. In practice the initial function $\eta \in C$ is not always explicitly given but only its values on a uniform mesh are known. This leads to a parallel, more practical framework of establishing an abstract theory for discretizations.

Fix a positive integer $N$. By letting $\Pi_{1/N}(\eta)$ be the piecewise linear continuous function with vertices $\{(-1 + j/N, \eta(-1 + j/N))\}_{j=0}^{N}$, a linear projection $\Pi_{1/N} : C \to C$ is defined. The range of $\Pi_{1/N}$ is denoted by $C_{1/N} \subset C$. Obviously, $C_{1/N}$ can be identified with $\mathbb{R}^{n(N+1)}$ via the linear isomorphism $\eta \mapsto \{\eta(-1 + j/N)\}_{j=0}^{N}$. Thus the stepsize-$1/N$ explicit Euler discretization method when applied to the delay equation (1) can be understood as a mapping $\varphi(1/N, \cdot) = \varphi_{P,E,PLI}(1/N, \cdot) : C_{1/N} \to C_{1/N}$ defined, in the coordinates above, by

$$\varphi(1/N, \eta) = \left(\eta\!\left(-1 + \tfrac{1}{N}\right), \eta\!\left(-1 + \tfrac{2}{N}\right), \ldots, \eta(0),\ \eta(0) + \tfrac{1}{N} f(\eta(0), \eta(-1))\right). \tag{7}$$

Here the lower index $P$ is an abbreviation for "practical". Note that the sequence of iterates $\{\varphi^k_{E,PLI}(1/N, \eta)\}_{k=0}^{\infty} \subset C$ depends solely on ($f$ and) $\{\eta(-1 + j/N)\}_{j=0}^{N}$, the restriction of $\eta$ to the finite collection of the mesh points $\{-1 + j/N\}_{j=0}^{N}$ in $[-1, 0]$. By definition,

$$\varphi^k_{E,PLI}(1/N, \eta) = \varphi^k_{P,E,PLI}(1/N, \Pi_{1/N}\eta)$$

whenever $k \ge N$ and $\eta \in C$.

A similar construction is possible for general Runge-Kutta methods and leads to the definition of stepsize-$1/N$ practical Runge-Kutta discretization operators $\varphi_{P,M,PLI}$ with piecewise linear interpolant.

2.2. Error estimates for smooth initial functions

For ordinary differential equations on a finite time interval $[0,T]$, it is well-known that stepsize-$h$ Runge-Kutta approximating/discretized solutions converge to the exact solution as $h \to 0$. The order of a Runge-Kutta method $M$ refers to the order of this convergence process

(a) on the set of the mesh points $\{kh\}_{k \ge 0}$ in $[0,T]$,
(b) on the set of the stage points $\{kh\}_{k \ge 0} \cup \{(k + c_i)h\}_{k \ge 0;\, i = 1, \ldots, \nu}$ in $[0,T]$ and, in case method $M$ is combined with an interpolation operator INTRP,
(c) on the entire interval $[0,T]$.

The corresponding orders are called the classical (or nodal), the stage and the uniform order, respectively. A separate order can be defined for the interpolation operator INTRP as well.

The monograph [Bellen & Zennaro, 2003] discusses all the order concepts above thoroughly in the context of delay equations. For delay equations of the form (1), their main result goes back to [Bellen, 1984] and can be restated as follows.

Lemma 1. Let $\varphi_{M,NFI}$ be a Runge-Kutta discretization operator with standard interpolant. Given any finite interval $[0,T]$ and any $C^p$ initial function $\eta$, there exists a positive constant $K$ (depending only on $f$, $M$, $T$, as well as on $\|\eta'\|, \ldots, \|\eta^{(p)}\|$) such that

$$|(\Phi(k/N, \eta))(0) - (\varphi^k_{M,NFI}(1/N, \eta))(0)| \le K \cdot h^p \tag{8}$$

whenever $0 \ne N, k \in \mathbb{N}$, $k/N \le T$ and $h = 1/N \le h_1$. Under some additional assumptions on the

interpolant NFI, also inequalities

$$\left\| \frac{d^j}{ds^j} \Phi(k/N, \eta) - \frac{d^j}{ds^j} \varphi^k_{M,NFI}(1/N, \eta) \right\| \le K_j h^{q+1-j}, \quad h = 1/N, \quad j = 0, 1, \ldots, q \tag{9}$$

hold true. Here $q \le p - 1$ stands for the order of the interpolant NFI and constant $K_j$ depends also on the underlying interpolation operator, $j = 0, 1, \ldots, q$. Derivatives at the mesh points in (9) are meant in the left/right sense.

Suppose that $\eta^{(p)}$ does not exist at some $s = -1 + \Delta$ with $h_0 < \Delta < 1$ but $\eta$ is $C^p$ on the interval $[-1, -1 + \Delta]$. Then the local approximation error satisfies inequality

$$|x_{\mathrm{exact}}(h, \eta) - X_M(h, \eta)| \le \mathrm{const}(f, M, \eta) \cdot h^{p+1} \quad \text{for } h \in (0, \Delta] \tag{10}$$

where $\mathrm{const}(f, M, \eta)$ depends only on $f$, $M$, as well as on the bounds for $|\eta'|, \ldots, |\eta^{(p)}|$. The restriction "for $h \in (0, \Delta]$" in (10) (which means that the inequality in (10) is not necessarily satisfied on the interval $(0, h_0]$) has important consequences for mesh point selection. In order to have local $O(h^{p+1})$ error estimates, it implies that $\Delta$ has to be chosen as a mesh point. In view of the smoothing property of $\Phi$, a similar argument shows that the mesh (still in order to have local $O(h^{p+1})$ error estimates) should contain the points $0, 1, \ldots, p$ and also the points $1 + \Delta$, $2 + \Delta$, etc. In general this leads to a nonuniform mesh with a variable stepsize sequence $(h_1, h_2, \ldots)$, $0 < h_m \le h_0$, $m = 1, 2, \ldots$. The discretization operator $\varphi_{M,NFI}$ gives rise to one with variable stepsize sequence by defining $\varphi_{M,NFI}(\emptyset; \eta) = \eta$ and then, inductively,

$$\varphi_{M,NFI}(h_m, \ldots, h_1; \eta) = \varphi_{M,NFI}(h_m, \varphi_{M,NFI}(h_{m-1}, \ldots, h_1; \eta)) \quad \text{for } m = 1, 2, \ldots.$$

Chapters 4 and 6 of [Bellen & Zennaro, 2003] contain several generalizations of what we called Lemma 1 above, even for equations with state-dependent delay as well as for certain types of neutral equations where (still in order to have local $O(h^{p+1})$ error estimates) stepsize selection is subject to various constraints. These results, too, can be restated within the framework of a nonautonomous dynamical system theory. The "additional assumptions on the interpolant NFI" from Lemma 1 (which go back to [Zennaro, 1986] and constitute one of the most involved parts of [Bellen & Zennaro, 2003]) are discussed in Sec. 5.2.2. It is a challenging task to find an interpolation operator that preserves the order of convergence in Lemma 1 and makes $x_{\mathrm{approx}}(\cdot, \eta)$ of class $C^{j+1}$ on the interval $[j, \infty)$, $j = 0, 1, \ldots, p$. The monograph [Bellen & Zennaro, 2003] refers to several results in this direction but none of them seems to ensure the same smoothness improvement for $x_{\mathrm{approx}}(\cdot, \eta)$ along the intervals $\{[j, j+1]\}_{j=0}^{p}$ shared by $x_{\mathrm{exact}}(\cdot, \eta)$.

There are various pro and contra arguments for variable stepsize sequences. A major pro argument has already been discussed. Though conflicting with higher order local approximation error estimates, further pro arguments are those behind adaptive error control in [Bellen & Zennaro, 2003, Chapter 7]. The major contra argument for variable stepsize sequences is the obvious pro argument for the uniform mesh $\{k/N\}_{k \ge 0}$ we outlined in the last two paragraphs of Sec. 2.1.

However, despite all efforts of putting stepsize selection and the error control mechanism on a firm mathematical basis, heuristic aspects can hardly be avoided. This is particularly exemplified by considering a delay equation of the form

$$\dot x(t) = f(x(t), x(t - \rho), x(t - 1)) \quad \text{where } h_0 < \rho < 1.$$

Our first candidate is the uniform mesh $M_U = \{k/N\}_{k \ge 0}$. In order to go on with the explicit Euler method at a mesh point $k_0/N \le T$, two earlier values of the approximate solution $x_{\mathrm{approx}}$ (at $k_0/N - \rho$ and $k_0/N - 1$, given or computed previously) are needed. It follows that the values of $\eta$ at each point of the set

$$H_T = \{k/N - \ell\rho \in [-\rho, 0] \mid k \in \mathbb{Z},\ \ell \in \mathbb{N},\ k/N \le T\}$$

are also needed. This is a plea for interpolating and working within the $\varphi_{P,E,PLI}$ framework but, especially on moderate time intervals $[0,T]$, the choice of $M_U + H_T$ (the algebraic sum of the two discrete sets $M_U$ and $H_T$) as a new, nonuniform mesh $M_{NU} = M_U + H_T$ is also reasonable.

The subsection concludes with an example showing that local error estimates between exact and approximate solutions cannot be uniform in $C$. Nevertheless, it indicates that, on certain natural

subsets of $C$, uniform error estimates can be expected.

Example 1. If function $f$ does not depend on the first $n$ coordinates, then (1) simplifies to

$$\begin{aligned} \dot x(t) &= f(x(t-1)) && \text{for } t \ge 0 \\ x(t) &= \eta(t) && \text{for } t \in [-1, 0] \end{aligned} \tag{11}$$

Suppose that $f(0) = 0$ and $\eta(-jh) = 0$ whenever $h = 1/N$ and $j = 0, 1, \ldots, N$. Then $(\Phi(1,\eta))(s) = \int_{-1}^{s} f(\eta(u))\,du$ but $(\varphi^N_{E,PLI}(1/N, \eta))(s) = 0$ for each $s \in [-1, 0]$. In particular, $\|\Phi(1,\eta) - \varphi^N_{E,PLI}(1/N, \eta)\|$ can be arbitrarily large. Note that in our case,

$$\|\Phi(1,\eta) - \varphi^N_{E,PLI}(1/N, \eta)\| \le \int_{-1}^{0} |f(\eta(u))|\,du \le \mathcal{L} \int_{-1}^{0} |\eta(u)|\,du$$

where $\mathcal{L}$ stands for the Lipschitz constant of $f$. Note that $\|\Phi(1,\eta) - \varphi^N_{E,PLI}(1/N, \eta)\| \to 0$ as $N \to \infty$ for each $\eta \in C$ because on $[0,1]$ ((1) is equivalent to a nonautonomous ordinary differential equation and thus) (11) simplifies to the integration problem $x(t) - \eta(0) = \int_0^t f(\eta(u-1))\,du$.

2.3. Error estimates for nonsmooth initial functions

If the initial function $\eta \in C$ is Lipschitz with constant $\mathrm{Lip}(\eta) \le L$, then the local approximation error satisfies inequality

$$|x_{\mathrm{exact}}(h, \eta) - X_M(h, \eta)| \le \mathrm{const}(f, M, L) \cdot h^2 \quad \text{for } h \in (0, h_1]. \tag{12}$$

This is a consequence of (10) when applied to a sequence of $C^1$ Lipschitz functions $\{\eta_k\}_{k=1}^{\infty} \subset C$ with the properties that $\mathrm{Lip}(\eta_k) \to \mathrm{Lip}(\eta)$ and $\|\eta_k - \eta\| \to 0$ as $k \to \infty$.

Lemma 2. Let $\varphi_{M,NFI}$ be a Runge-Kutta discretization operator with standard interpolant. Given any finite time interval $[0,T]$ and a finite collection of Lipschitz initial functions $\eta, \xi_1, \ldots, \xi_j$ with constants $\mathrm{Lip}(\eta), \mathrm{Lip}(\xi_1), \ldots, \mathrm{Lip}(\xi_j) \le L$ and $\|\xi_1\|, \ldots, \|\xi_j\| \le 1$, the derivatives of the approximation error (as a $j$-linear operator between $C \times C \times \cdots \times C$ ($j$ times) and $C$) satisfy the $j = 0, 1, \ldots, p-1$ chain of inequalities

$$\left\| \left[\frac{d^j}{d\eta^j} \Phi(k/N, \eta)\right](\xi_1, \ldots, \xi_j) - \left[\frac{d^j}{d\eta^j} \varphi^k_{M,NFI}(1/N, \eta)\right](\xi_1, \ldots, \xi_j) \right\| \le K_j/N \tag{13}$$

whenever $0 \ne N, k \in \mathbb{N}$, $k/N \le T$ and $1/N \le h_1$. The positive constant $K_j$, $j = 0, 1, \ldots, p-1$, depends only on $f$, $M$, $T$, $L$, and on the underlying interpolation operator.

Proof. Case $j = 0$ is a direct consequence of inequality (12) via the standard Gronwall argument [Hairer et al., 1993].

In order to prove case $j = 1$, we pass to a somewhat higher level of abstractness. Still with the initial value problem (1) in mind, we use the standard notation from the theory of retarded functional differential equations [Hale, 1977] and write $\dot x(t) = g(x_t)$, $x_0 = \eta$ instead. We consider also the initial value problem $\dot y(t) = [g'(x_t)]y_t$, $y_0 = \xi$ for the first variational equation. Define

$$G\begin{pmatrix} x_t \\ y_t \end{pmatrix} = \begin{pmatrix} g(x_t) \\ [g'(x_t)]y_t \end{pmatrix} \quad \text{and} \quad \Theta\!\left(t, \begin{pmatrix} \eta \\ \xi \end{pmatrix}\right) = \begin{pmatrix} \Phi(t,\eta) \\ \left[\frac{d}{d\eta}\Phi(t,\eta)\right]\xi \end{pmatrix}$$

where $t \ge 0$ and $\Phi : \mathbb{R}^+ \times C \to C$ denotes the solution semidynamical system for equation $\dot x(t) = g(x_t)$. It is clear that $\Theta : \mathbb{R}^+ \times (C \times C) \to C \times C$ is the solution semidynamical system for the retarded functional differential equation

$$\begin{pmatrix} \dot x(t) \\ \dot y(t) \end{pmatrix} = G\begin{pmatrix} x_t \\ y_t \end{pmatrix}, \quad t \ge 0. \tag{14}$$

Similarly, with $\varphi : (0, h_1] \times C \to C$ denoting an "approximation operator" for $\Phi$, define

$$\psi\!\left(h, \begin{pmatrix} \eta \\ \xi \end{pmatrix}\right) = \begin{pmatrix} \varphi(h,\eta) \\ \left[\frac{d}{d\eta}\varphi(h,\eta)\right]\xi \end{pmatrix} \quad \text{whenever } h \in (0, h_1] \text{ and } \begin{pmatrix} \eta \\ \xi \end{pmatrix} \in C \times C.$$

Suppose that $\varphi = \varphi_{M,NFI,g}$ comes from a Runge-Kutta method with interpolant. Then the very same Runge-Kutta method applies to Eq. (14) and gives rise to operator $\varphi_{M,NFI,G}$. Analyzing (5)-(6), it is not hard to show that $\psi = \varphi_{M,NFI,G}$. We arrive at the conclusion that case $j = 1$ of (13) follows from inequality (12), when applied to Eq. (14) instead of $\dot x(t) = g(x_t)$, via the standard Gronwall argument. (Unfortunately, the uniformity assumptions we imposed on $g$ are no longer valid for $G$. The second coordinate of $G$ is unbounded when $\|y_t\| \to \infty$. However, if our interest is reduced to a bounded subset of $C \times C$, then no difficulties arise.)

The remaining cases $j = 2, \ldots, p-1$ follow by induction. ∎

Since $\Phi(0, \eta) = \mathrm{id}_C\,\eta = \eta$ for each $\eta \in C$ and $(d^j/d\eta^j)\Phi(\cdot, \eta)$ is differentiable, we obtain immediately from (13), or by a direct analysis of the definition, that

$$\left\| \left[\frac{d^j}{d\eta^j} \varphi_{M,NFI}(1/N, \eta)\right](\xi_1, \ldots, \xi_j) - \left[\frac{d^j}{d\eta^j} \mathrm{id}_C(\eta)\right](\xi_1, \ldots, \xi_j) \right\| \le \kappa_j/N \tag{15}$$

whenever $1/N \le h_1$ and the initial functions $\eta, \xi_1, \ldots, \xi_j$ are Lipschitz with constants $\mathrm{Lip}(\eta), \mathrm{Lip}(\xi_1), \ldots, \mathrm{Lip}(\xi_j) \le L$ and $\|\xi_1\|, \ldots, \|\xi_j\| \le 1$, $j = 0, 1, \ldots, p-1$. Of course the positive constant $\kappa_j$ depends only on $f$, $M$, $L$, and on the underlying interpolation operator, $j = 0, 1, \ldots, p-1$.

Recall that Lipschitz functions in $C$ are of bounded variation and that the total variation $\mathrm{Tot\,v}(\eta)$ is not greater than $\mathrm{Lip}(\eta)$. Hence our next result is an improvement over Lemma 2 for the explicit/implicit Euler method with piecewise linear interpolant.

Lemma 3. Consider only the special cases $\varphi_{M,NFI} = \varphi_{E,PLI}$ or $\varphi_{I,PLI}$. Given any finite time interval $[0,T]$ and a finite collection of initial functions $\eta, \xi_1, \ldots, \xi_j$ of bounded variation with $\mathrm{Tot\,v}(\eta), \mathrm{Tot\,v}(\xi_1), \ldots, \mathrm{Tot\,v}(\xi_j) \le L$, the $j = 0, 1, \ldots, p-1$ chain of inequalities (13) still holds true.

Proof. Applying the standard Gronwall argument [Hairer et al., 1993] we have already referred to, we point out first that

$$|(\Phi(1,\eta))(0) - (\varphi^N_{E,PLI}(1/N, \eta))(0)| \le 2^{-1} e^{\mathcal{L}_1} \left(\mathcal{L}_1 B + 4n\mathcal{L}_2 \cdot \mathrm{Tot\,v}(\eta)\right)/N \tag{16}$$

where $B$ is an upper bound for $|f|$ on $\mathbb{R}^n \times \mathbb{R}^n$ and $\mathcal{L}_i$ stands for the Lipschitz constant of $f$ with respect to the $i$th variable, $i = 1, 2$. On the time interval $[0,1]$, (1) is equivalent to the initial value problem $\dot x = f(x, q(t))$, $x(0) = x_0 = \eta(0)$ where $q(t) = \eta(t-1)$. Let $\Psi(\cdot\,; t_*, x_*) : [t_*, 1] \to \mathbb{R}^n$ denote the right-hand side solution to the nonautonomous ordinary differential equation $\dot x = f(x, q(t))$ with initial data $(t_*, x_*) \in [0,1] \times \mathbb{R}^n$. For brevity, we write $h = 1/N$,

$$t_k = kh, \quad x_k = (\varphi^N_{E,PLI}(1/N, \eta))(t_k - 1) \quad \text{for } k = 0, 1, \ldots, N,$$

and $a = \mathcal{L}_1 B h^2/2$, $b = \mathcal{L}_2$, $d = e^{\mathcal{L}_1 h}$. Finally, for $k = 0, 1, \ldots, N-1$, define

$$E_k = |\Psi(t_k; 0, x_0) - x_k| \quad \text{and} \quad c_k = \int_{t_k}^{t_{k+1}} |q(u) - q(t_k)|\,du$$

and observe that the left-hand side of inequality (16) is equal to $E_N = |\Psi(t_N; 0, x_0) - x_N|$.

We claim that $E_{k+1} \le E_k d + a + b c_k$ for each $k = 0, 1, \ldots, N-1$. In fact, we have for $k = 0, 1, \ldots, N-1$ by the triangle inequality that

$$E_{k+1} \le |\Psi(t_{k+1}; t_k, \Psi(t_k; 0, x_0)) - \Psi(t_{k+1}; t_k, x_k)| + |\Psi(t_{k+1}; t_k, x_k) - (x_k + h f(x_k, q(t_k)))|.$$

The first term can be estimated by using the Gronwall lemma. In fact, for each $t \in [t_k, t_{k+1}]$, we have that

$$\begin{aligned} |\Psi(t; t_k, \Psi(t_k; 0, x_0)) - \Psi(t; t_k, x_k)| &= \left| \Psi(t_k; 0, x_0) + \int_{t_k}^{t} f(\Psi(u; t_k, \Psi(t_k; 0, x_0)), q(u))\,du - \left( x_k + \int_{t_k}^{t} f(\Psi(u; t_k, x_k), q(u))\,du \right) \right| \\ &\le |\Psi(t_k; 0, x_0) - x_k| + \mathcal{L}_1 \int_{t_k}^{t} |\Psi(u; t_k, \Psi(t_k; 0, x_0)) - \Psi(u; t_k, x_k)|\,du \end{aligned}$$
40 B. M. Garay

and, a fortiori, the first term is not greater than $e^{\mathcal{L}_1 h} E_k$. On the other hand, the second term is bounded by

$$\left| x_k + \int_{t_k}^{t_{k+1}} f(\Psi(u;t_k,x_k),q(u))\,du - \left( x_k + \int_{t_k}^{t_{k+1}} f(x_k,q(t_k))\,du \right) \right|$$
$$\le \mathcal{L}_1 \int_{t_k}^{t_{k+1}} |\Psi(u;t_k,x_k) - x_k|\,du + \mathcal{L}_2 \int_{t_k}^{t_{k+1}} |q(u) - q(t_k)|\,du$$

and the claim follows from observing that $x_k = \Psi(t_k;t_k,x_k)$ and $\Psi(\cdot\,;t_k,x_k)$ is Lipschitz with constant $\le B$.

Starting from $E_0 = 0$, we conclude easily by induction that

$$E_N \le a\,\frac{d^N - 1}{d - 1} + b \sum_{k=0}^{N-1} c_k d^{N-k-1} \le 2^{-1} e^{\mathcal{L}_1} \left( Bh + 2\mathcal{L}_2 \sum_{k=0}^{N-1} c_k \right).$$

It remains to prove that $\sum_{k=0}^{N-1} c_k \le 2nh \cdot \mathrm{Totv}(\eta)$. In fact, by the Jordan decomposition theorem, every coordinate function of $q$ can be represented as $q_i = v_i - w_i$ where $v_i$ and $w_i$ are monotone increasing continuous real functions on $[0,1]$ with the property that

$$\mathrm{Totv}(v_i),\ \mathrm{Totv}(w_i) \le \mathrm{Totv}(q_i) \le \mathrm{Totv}(q), \qquad i = 1,\dots,n.$$

It follows immediately that

$$\sum_{k=0}^{N-1} c_k \le \sum_{k=0}^{N-1} \sum_{i=1}^{n} \int_{t_k}^{t_{k+1}} \left( |v_i(u) - v_i(t_k)| + |w_i(u) - w_i(t_k)| \right) du$$
$$\le \sum_{i=1}^{n} \sum_{k=0}^{N-1} \int_{t_k}^{t_{k+1}} \left( |v_i(t_{k+1}) - v_i(t_k)| + |w_i(t_{k+1}) - w_i(t_k)| \right) du$$
$$\le \sum_{i=1}^{n} h \cdot \left( \mathrm{Totv}(v_i) + \mathrm{Totv}(w_i) \right) \le 2nh \cdot \mathrm{Totv}(q) = 2nh\,\mathrm{Totv}(\eta).$$

(As a direct consequence of the uniform continuity of $\eta$, note that $\sum c_k \to 0$ as $N \to \infty$.)

By a repeated use of the standard Gronwall argument (when combined with piecewise linear interpolation), case $j = 0$ of the $\varphi_{M,NFI} = \varphi_{E,PLI}$, $L = \mathrm{Totv}(\eta)$ version of inequality (13) follows easily with $\kappa_0 = \mathrm{const} \cdot e^{\mathcal{L}_1 T}(1 + L)$.

The $\varphi_{M,NFI} = \varphi_{I,PLI}$ case can be reduced to the $\varphi_{M,NFI} = \varphi_{E,PLI}$ case already proven. The crucial point is to show that

$$\left|(\varphi^N_{E,PLI}(1/N,\eta))(0) - (\varphi^N_{I,PLI}(1/N,\eta))(0)\right| \le \mathrm{const} \cdot (1 + \mathrm{Totv}(\eta))/N.$$

Without referring to the Jordan decomposition theorem any more, this follows via a simplified version of the recursion we used in deriving (16) above. (Having only Theorem 4.B in mind, we did not check if (16) holds true for a general discretization operator $\varphi_{M,NFI}$.)

The proof of the remaining cases $j = 1,2,\dots,p-1$ is the same as in the proof of Lemma 2. □

Both for ordinary and delay differential equations, one-step and multistep methods, several versions of inequalities (8), (9), (13) and (15) are known from the literature. A weaker version of Lemma 3 has been stated in [Garay & Loczi, 2004]. Inequality (16) is new.

The definitions of an abstract discretization operator for Eq. (1), as well as for the more abstract equation $\dot x(t) = g(x_t)$ we refer to in Sec. 4 below, are based on cases $j = 0,1$ of inequalities (13) and (15).

In contrast to inequalities (8) and (9), which concern an individual exact and an individual approximating trajectory, the $j \ge 1$ cases of inequalities (13) and (15) relate to a collection of exact and approximating trajectories. Qualitative theory cannot live without differentiating with respect to initial data. This is why $C^1$ inequalities like case $j = 1$ of (13) (estimating the difference between exact and approximating solutions in $C^1$ topologies on the phase space) play a fundamental role in almost all papers on numerical dynamics. The typical result is that, for stepsizes sufficiently small, hyperbolic orbit configurations are preserved by discretization.

For ordinary differential equations, an abstract definition for discretization operators is based on $C^j$
A Brief Survey on the Numerical Dynamics for Functional Differential Equations 41

properties. In one of the earliest papers on the qualitative theory of discretizations, [Beyn & Lorenz, 1987] suggest the following definition. Consider an ordinary differential equation $\dot x = f(x)$ where $f : \Omega \to \mathbb{R}^n$ is a $C^{p+r+1}$ function. Let $S$ be a compact subset of $\Omega$. A mapping $\varphi : (0,h_0] \times S \to \mathbb{R}^n$ is an abstract discretization operator of order $p$ if

(i) $\varphi$ admits a $C^{p+r+1}$ extension to an open neighborhood of $[0,h_0] \times S$ in $\mathbb{R} \times \Omega$,
(ii) $|\Phi(h,x) - \varphi(h,x)| \le K h^{p+1}$ whenever $(h,x) \in (0,h_0] \times S$,
(iii) $\varphi$ is locally determined by $f$. In other words, there exists a continuous function $\Delta : [0,h_0] \to \mathbb{R}^+$ with the properties that $\Delta(0) = 0$ and, for all $(h,x) \in (0,h_0] \times S$, $\varphi(h,x)$ is determined by the restriction of $f$ to the set $\{z \in \mathbb{R}^n \mid |z - x| \le \Delta(h)\}$.

As a simple consequence of assumptions (i)-(ii), $\varphi(h,\cdot)$ is a $C^{p+r+1}$ diffeomorphism of $S$ onto $\varphi(h,S)$ for $h$ sufficiently small, and (on condition that $\Phi((k-1)h,x) \in S$ and $\varphi^{k-1}(h,x) \in S$)

$$\left| \frac{d^j}{dx^j}\Phi(kh,x) - \frac{d^j}{dx^j}\varphi^k(h,x) \right| \le K_j(T) \cdot h^{\min\{p,\,p+r-j\}}, \qquad j = 0,1,\dots,p+r \qquad (17)$$

whenever $k \in \mathbb{N}$, $h \in (0,h_0]$, $kh \le T$ and $x \in S$. Clearly Runge-Kutta methods are subject to assumptions (i)-(iii). Moreover, for Runge-Kutta methods, inequality (17) is satisfied if $f$ is chosen from the less smooth class of $C^{p+r}$ functions. Mutatis mutandis, assumptions (i)-(iii) make sense if $f$ is defined on a compact smooth manifold $\mathcal{M}$. This leads to the definition of abstract discretization operators on $\mathcal{M}$. For stepsize $h$ small enough, $\varphi(h,\cdot)$ is a $C^{p+r+1}$ self-diffeomorphism of $\mathcal{M}$. Also the $C^j$ inequality (17) remains valid in the manifold setting. For details, see [Li, 1997; Garay, 2001].

Thus $\{\varphi(h,\cdot)\}_{h \in (0,h_0]}$ is a one-parameter family of diffeomorphisms approximating the one-parameter family of time-$h$ diffeomorphisms $\{\Phi(h,\cdot)\}_{h \in (0,h_0]}$ of the continuous-time solution dynamical system $\Phi : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n$ (or locally, $\Phi : \mathbb{R} \times \Omega \to \mathbb{R}^n$; or $\Phi : \mathbb{R} \times \mathcal{M} \to \mathcal{M}$). Consequently, for $h \in (0,h_0]$ fixed, discretization theory is part of perturbation theory for discrete-time dynamical systems. However, with $h \to 0$, both $\varphi(h,\cdot)$ and $\Phi(h,\cdot)$ approach the identity, an operator which behaves badly in perturbation theory: the behavior of $h$ as a small parameter is not entirely regular. We conclude that the proof of a qualitative result in discretization theory requires a thorough reconsideration of the proof of the underlying abstract perturbation result in discrete dynamics with stepsize $h$ as an additional small parameter, and the derivation of the accompanying error estimates.

3. Qualitative Numerics for Delay Equations

What we described in the last paragraph for ordinary differential equations remains valid for delay equations, too. However, one is confronted with two major difficulties. These are the lack of uniform local error estimates and the lack of backward solvability. Fortunately, for $L$ large enough, say $L \ge L^*$, the closed set

$$C_{\mathrm{Lip}(L)} = \{\eta \in C \mid \eta \text{ is Lipschitz with constant } \mathrm{Lip}(\eta) \le L\}$$

is positively invariant with respect to the exact as well as to the discretized dynamics. By (13) and (15), $C^j$ estimates on $C_{\mathrm{Lip}(L)}$ are uniform. Unfortunately, $C_{\mathrm{Lip}(L)}$ is nowhere dense in $C$. What really helps is the smoothing/compactifying property of the exact and the discretized dynamics. As for the asymptotic theory, it implies that dynamical systems in finite and semidynamical systems in infinite dimension can be treated in a parallel way [Hale, 1988]. Distinguished subsets of the phase space like unstable manifolds of hyperbolic equilibria or of hyperbolic periodic orbits, inertial manifolds, and compact attractors consist of full trajectories, i.e. trajectories defined on the entire real line $\mathbb{R}$. In particular, compact attractors and certain kinds of invariant manifolds of delay equations belong to $C_{\mathrm{Lip}(L^*)}$. This is why, in a final analysis, their qualitative discretization properties are (almost) the same as those of their counterparts in ordinary differential equations.

From now on, let $p \ge 2$ and assume that all the regularity conditions we imposed on (1) in Sec. 2.1 are satisfied.

3.1. The simplest hyperbolic orbit configurations

The three major objects of the phase space investigated in [Stuart & Humphries, 1996] on numerical

ordinary differential equations are

• compact attractors (i.e. asymptotically stable compact invariant sets)
• hyperbolic periodic orbits
• hyperbolic equilibria, together with their stable and unstable manifolds

In a well-defined technical sense, compact attractors, the saddle structure about hyperbolic equilibria, and periodic orbits are only slightly perturbed under discretization. As for compact attractors, the hyperbolic structure preserved is the transversal intersection structure between trajectories near the attractor and the level surfaces of suitable Liapunov functions. (The dynamics within the attractor itself is not assumed to be hyperbolic and can be changed dramatically under discretization.) The presentation in [Stuart & Humphries, 1996] is based on the original papers [Kloeden & Lorenz, 1986; Beyn, 1987a, 1987b]. No doubt these three papers belong to those few marking the birth of numerical dynamics as an independent field of research in the late eighties.

In what follows we present the corresponding results for delay equations.

Theorem 1. Consider the delay equation $\dot x(t) = f(x(t),x(t-1))$ and let $\varphi_{M,NFI} : (0,h^*] \times C \to C$ be a Runge-Kutta discretization operator with standard interpolant.

• A.) [Kloeden & Schropp, 2004]: Let $\emptyset \ne A$ be a compact attractor for the continuous-time solution semidynamical system $\Phi$. Then, for stepsize-$1/N$ sufficiently small, the discrete-time semidynamical system $\varphi_{M,NFI}(1/N,\cdot)$ has a nonempty compact attractor $A_{1/N}$ and the limiting process $A_{1/N} \to A$ as $N \to \infty$ (both in $C_{\mathrm{Lip}(L^*)}$ with nice Liapunov estimates and consequently, by using the general attraction results in Chapter 2 of [Hale, 1988], also in $C$) is upper semicontinuous.
• B.1.) [In't Hout & Lubich, 1998]: Let $\Gamma$ be an exponentially stable periodic orbit for the continuous-time solution semidynamical system $\Phi$. Then, for stepsize-$1/N$ sufficiently small, the discrete-time semidynamical system $\varphi_{M,NFI}(1/N,\cdot)$ has an exponentially stable invariant curve $\Gamma_{1/N}$ (both in $\mathcal{N} \cap C_{\mathrm{Lip}(L^*)}$ and in $\mathcal{N}$, where $\mathcal{N}$ is a suitable neighborhood of $\Gamma$ in $C$, but in the second case the constant $K$ in the estimate $d(\varphi^k_{M,NFI}(1/N,\eta),\Gamma_{1/N}) \le K \cdot \mu^{k/N}$ depends on $\eta$; $\mu < 1$ is fixed) such that $d_{\mathrm{Hausdorff}}(\Gamma,\Gamma_{1/N}) \le \mathrm{const}/N^p$.
• B.2.) [Farkas, 2003]: Let $\Gamma$ be a hyperbolic periodic orbit for the continuous-time solution semidynamical system $\Phi$ and assume that the period of $\Gamma$ is at least two (i.e. two times the delay). Then, for stepsize-$h$ sufficiently small (and not only for $h = 1/N$ with $N$ large), the discrete-time semidynamical system $\varphi_{M,NFI}(h,\cdot)$ has a hyperbolic invariant curve $\Gamma_h$ such that in normal coordinates around $\Gamma$, both $|\Gamma_h|$ and $\mathrm{Lip}(\Gamma_h)$ are of order $h$.

In line with (17) and the general estimates for discretized normally hyperbolic compact invariant manifolds of ordinary differential equations [Garay, 2001], it seems plausible in Parts (B1) and (B2) that $\Gamma_{1/N} = \mathcal{T}_{1/N}(\Gamma)$ where $\mathcal{T}_{1/N}$ is a $C^{p+r}$ embedding of $\Gamma$ into $\mathbb{R}^n$ and the norm distance in $C^j(\Gamma,\mathbb{R}^n)$ between $\mathcal{T}_{1/N}$ and the inclusion of $\Gamma$ in $\mathbb{R}^n$ is of order $1/N^{\min\{p,\,p+r-j\}}$, $j = 0,1,\dots,p+r$.

The unpublished PhD dissertation Gyula Farkas: On Numerical Dynamics of Functional Differential Equations, Budapest University of Technology, 2002, contains a lower semicontinuity result for discretized compact attractors of delay equations, the analogue of the one in [Stuart & Humphries, 1996] from the theory of discretized ordinary differential equations. As for upper semicontinuity, Farkas refers to [Gedeon & Hines, 1999] on upper semicontinuity of Morse sets under explicit ODE-Euler discretization of a one-dimensional delay equation (which results in a cyclic feedback system of ordinary differential equations). Though conceptually much easier, we note that Theorem 1.A is not a consequence of the results in [Gedeon & Hines, 1999].

For the rest of this subsection, we assume that $f(0,0) = 0$ or, equivalently, that $\eta_0 = 0 \in C$ is an equilibrium for $\Phi$. The next result starts with a center-unstable versus strongly-stable $C = CU \times SS$ product decomposition of the phase space (invariant with respect to the linear semidynamical system generated by the solutions of the linearized equation $\dot y(t) = f'_x(0,0)y(t) + f'_y(0,0)y(t-1)$ and determined by its characteristic equation). Thus the equilibrium $0 \in C$ is not necessarily hyperbolic. As a consequence of basic spectral decomposition theory, note that the linear subspace $CU$ is of finite dimension.
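The step-$1/N$ maps appearing in Theorem 1 can be explored numerically. The following is only a minimal sketch, assuming the explicit Euler method on the grid $-1, -1+1/N, \dots, 0$ (so that the piecewise linear interpolant is fully described by its grid values) and a hypothetical scalar right-hand side $f(x,y) = -x + \tfrac{1}{2}\tanh y$, chosen solely because its zero equilibrium is attracting. The names `euler_pli_step` and `iterate` are illustrative and are not taken from the cited papers.

```python
import math

def euler_pli_step(state, f, N):
    """One explicit Euler step of size h = 1/N for x'(t) = f(x(t), x(t-1)).

    state[i] approximates the history segment at s = -1 + i/N, i = 0..N;
    between grid points the segment is understood as piecewise linear.
    """
    h = 1.0 / N
    x_now, x_delayed = state[-1], state[0]      # values at s = 0 and s = -1
    x_next = x_now + h * f(x_now, x_delayed)    # explicit Euler update
    return state[1:] + [x_next]                 # shift the window by h

def iterate(state, f, N, steps):
    """Apply the step map `steps` times."""
    for _ in range(steps):
        state = euler_pli_step(state, f, N)
    return state

def f(x, y):
    # hypothetical right-hand side with f(0, 0) = 0 (assumption of this sketch)
    return -x + 0.5 * math.tanh(y)

N = 20
eta = [math.cos(math.pi * (-1.0 + i / N)) for i in range(N + 1)]  # initial function
final = iterate(eta, f, N, steps=40 * N)                          # 40 time units
sup_norm = max(abs(v) for v in final)
```

Under these assumptions `sup_norm` becomes small, which is what one expects when the discretized system has an attractor close to the attractor $\{0\}$ of the exact system. The sketch illustrates the statement and is not a substitute for the estimates in the cited papers.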

Theorem 2 [Farkas, 2002a]. Consider the delay equation $\dot x(t) = f(x(t),x(t-1))$ again and let $\varphi_{P,E,PLI}(1/N,\cdot) : C_{1/N} \to C_{1/N}$ be the stepsize-$1/N$ practical explicit Euler discretization operator with piecewise linear interpolant. Assume that the equilibrium point $0 \in C$ has a center-unstable manifold of the form $\mathrm{Graph}(G)$, where $G : CU \to SS$ is a $C^2$ function. Then for $N$ large enough (and under very mild additional technical conditions) $C_{1/N}$ admits a center-unstable versus strongly-stable $C_{1/N} = CU_{1/N} \times SS_{1/N}$ product decomposition with the properties as follows. Operator $\varphi_{P,E,PLI}(1/N,\cdot)$ has an invariant manifold of the form $\mathrm{Graph}(G_{1/N})$, where $G_{1/N} : CU_{1/N} \to SS_{1/N}$ is a $C^2$ function. In addition, there exists a linear isomorphism $P_{1/N} : CU \to CU_{1/N}$ such that, for $j = 0$ and $j = 1$,

$$\frac{d^j}{d\eta^j}\,\Pi_{1/N} G - \frac{d^j}{d\eta^j}\,G_{1/N} P_{1/N} \to 0 \quad \text{as } N \to \infty. \qquad (18)$$

Together with $\mathrm{Graph}(G)$, also $\mathrm{Graph}(G_{1/N})$ is exponentially attractive, with asymptotic phase depending continuously on the stepsize as $N \to \infty$.

Theorem 2 in [Farkas, 2002a] is accompanied by $C^2$ existence and $C^1$ approximation results for exact and discretized stable manifolds that correspond to $S$ in the center-unstable versus stable product structure $CU \times S$ where $S$ is the finite-dimensional invariant subspace determined by a bounded set of the roots of the characteristic equation lying to the left of those belonging to $CU$. In the case of hyperbolic equilibria, also a numerical Grobman-Hartman lemma for partial linearizations [Farkas, 2001b] as well as $C^1$ shadowing results [Farkas, 2002a] are given.

As a preparation for the next subsection, we recall the simplest ordinary differential equation result on numerical structural stability [Garay, 1996]. The numerical saddle structure results in [Beyn, 1987a] can be interpreted as follows: Given a hyperbolic equilibrium $x_0 \in \mathbb{R}^n$ of an ordinary differential equation, there exist a neighborhood $\mathcal{U}$ of $x_0$ in $\mathbb{R}^n$, a constant $\kappa > 0$ and, for each $h \in (0,h_0]$, a homeomorphism $\mathcal{H}_h$ of $\mathcal{U}$ into $\mathbb{R}^n$ with the properties that

$$\mathcal{H}_h(x_0) = x_0 \quad \text{and} \quad |\mathcal{H}_h(x) - x| \le \kappa h^p \quad \text{for each } x \in \mathcal{U} \qquad (19)$$

and, last but not least,

$$\mathcal{H}_h(\Phi(h,x)) = \varphi(h,\mathcal{H}_h(x)) \quad \text{whenever } x \in \mathcal{U},\ \Phi(h,x) \in \mathcal{U}. \qquad (20)$$

In other words, in the vicinity of hyperbolic equilibria, the exact and the discretized dynamics are conjugate and discretization is nothing else but an almost-identical coordinate transformation. With $\mathcal{H}_h$ being a $C^{p+r}$ diffeomorphism, note that (19) and (20) can be proved in the vicinity of nonequilibria, too [Garay & Simon, 2001].

3.2. Inertial manifolds and structural stability

Throughout this subsection, we restrict ourselves to a certain type of delay equations with small delay. The smallness of the delay seems to be necessary for the $C^2$ smoothness of the inertial manifold. (The existence of $C^1$ inertial manifolds can be proved with moderate delay. However, if the delay is not small, then the gap condition (which is the basis for proving higher order smoothness) is violated and, for the time being, there is no way out of this difficulty. For details and references, see [Robinson, 1999; Farkas, 2002b, 2002c; Chicone, 2003]. Here we restrict ourselves to reminding the reader that inertial manifolds are global center-unstable invariant manifolds.) On the other hand, the $C^2$ smoothness of the inertial manifold is necessary to apply [Li, 1997] on numerical structural stability for ordinary differential equations in proving Part B of the Theorem below. All numerical structural stability results we are aware of require at least $C^2$ smoothness assumptions.

Recall the definition of the practical stepsize-$1/N$ explicit Euler discretization operator $\varphi_{P,E,PLI}(1/N,\cdot) : C_{1/N} \to C_{1/N}$ from the last paragraph of Sec. 2.1. If the delay is $\varepsilon > 0$, then the phase space is $C^{\varepsilon} = C([-\varepsilon,0],\mathbb{R}^n)$. A trivial modification of formula (7) gives rise to the definition of the stepsize-$\varepsilon/N$ practical explicit Euler discretization operator $\varphi^{\varepsilon}_{P,E,PLI}(1/N,\cdot) : C^{\varepsilon}_{1/N} \to C^{\varepsilon}_{1/N}$.

Theorem 3. Consider the delay equation $\dot x(t) = Ax(t) + a(x(t)) + b(x(t-\varepsilon))$ where $A$ is an $n \times n$ real matrix, $a,b : \mathbb{R}^n \to \mathbb{R}^n$ are bounded $C^2$ functions with bounded derivatives, and $\varepsilon$ is a positive parameter.

• A. [Farkas, 2002c]: Then there exists an $\varepsilon_0 > 0$ with the properties as follows. For every $\varepsilon \in$

$(0,\varepsilon_0]$, the delay equation has an invariant manifold of the form $\mathrm{Graph}(J^{\varepsilon})$ where $J^{\varepsilon} : CU^{\varepsilon} \to SS^{\varepsilon}$ is of class $C^2$, the linear subspace $CU^{\varepsilon}$ is finite-dimensional, and the $C^{\varepsilon} = CU^{\varepsilon} \times SS^{\varepsilon}$ product decomposition is given by $CU^{\varepsilon} = \pi^{\varepsilon}(C^{\varepsilon})$, $SS^{\varepsilon} = (\mathrm{id}|_{C^{\varepsilon}} - \pi^{\varepsilon})(C^{\varepsilon})$ with $\pi^{\varepsilon} : C^{\varepsilon} \to C^{\varepsilon}$, $(\pi^{\varepsilon}(\eta))(s) = e^{As}\eta(0)$, $s \in [-\varepsilon,0]$. In addition, for $N$ large enough (and under very mild additional technical conditions), $C^{\varepsilon}_{1/N}$ admits a $C^{\varepsilon}_{1/N} = CU^{\varepsilon}_{1/N} \times SS^{\varepsilon}_{1/N}$ product decomposition with the properties as follows. Operator $\varphi^{\varepsilon}_{P,E,PLI}(1/N,\cdot)$ has an invariant manifold of the form $\mathrm{Graph}(J^{\varepsilon}_{1/N})$, where $J^{\varepsilon}_{1/N} : CU^{\varepsilon}_{1/N} \to SS^{\varepsilon}_{1/N}$ is a $C^2$ function. In addition, there exists a linear isomorphism $P^{\varepsilon}_{1/N} : CU^{\varepsilon} \to CU^{\varepsilon}_{1/N}$ such that for $j = 0$ and $j = 1$

$$\frac{d^j}{d\eta^j}\,\Pi^{\varepsilon}_{1/N} J^{\varepsilon} - \frac{d^j}{d\eta^j}\,J^{\varepsilon}_{1/N} P^{\varepsilon}_{1/N} \to 0 \quad \text{as } N \to \infty. \qquad (21)$$

Together with $\mathrm{Graph}(J^{\varepsilon})$, also $\mathrm{Graph}(J^{\varepsilon}_{1/N})$ is exponentially attractive, with asymptotic phase depending continuously on the stepsize as $N \to \infty$.
• B. [Farkas, 2002c]: (CONTINUATION.) Assume, in addition, that the solution flow $\Psi : \mathbb{R} \times \mathbb{R}^n \to \mathbb{R}^n$ of the limiting ordinary differential equation $\dot x = Ax + a(x) + b(x)$ is structurally stable and that the point at infinity $\{\infty\}$ of $\mathbb{R}^n$ is repulsive. Then, for $N$ large enough, there exists a homeomorphism $\mathcal{H}^{\varepsilon}_{1/N}$ of $\mathbb{R}^n$ onto $\mathrm{Graph}(J^{\varepsilon}_{1/N})$ and a continuous time-reparametrization mapping $\tau^{\varepsilon}_{1/N} : \mathbb{R}^n \to \mathbb{R}^+$ such that

$$\mathcal{H}^{\varepsilon}_{1/N}(\Psi(\tau^{\varepsilon}_{1/N}(x),x)) = \varphi^{\varepsilon}_{P,E,PLI}(1/N,\mathcal{H}^{\varepsilon}_{1/N}(x)) \quad \text{whenever } x \in \mathbb{R}^n. \qquad (22)$$

If $\Psi$ is Morse-Smale and gradient-like, then $\tau^{\varepsilon}_{1/N}(x) = 1/N$ for each $x \in \mathbb{R}^n$.

The reader is asked to make a comparison between (18) and (21) as well as between (20) and (22).

The proof of Theorem 3 requires a very careful handling of standard inertial manifold techniques (spectral decomposition, manipulations with cut-off functions on finite-dimensional subspaces, fixed-point equations in weighted sequences of Banach spaces, the fiber contraction theorem, etc.) extended for discretizations. As for numerical structural stability, it is just an application of the fundamental theorem on numerical structural stability in [Li, 1997], derived as a by-product of the Moser-Robbin-Robinson approach to Smale's structural stability theorem.

3.3. Kamke monotonicity
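Assume that, for some constant $\gamma \ge 0$, the condition

$$(f_i)'_{x_j}(x,y) \ge \gamma \quad \text{if } (x,y) \in \mathbb{R}^n \times \mathbb{R}^n,\ i,j = 1,2,\dots,n \text{ and } i \ne j,$$
$$(f_i)'_{y_j}(x,y) \ge \gamma \quad \text{if } (x,y) \in \mathbb{R}^n \times \mathbb{R}^n,\ i,j = 1,2,\dots,n \qquad (23)$$

is satisfied.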

Here of course $f_i$ stands for the $i$th coordinate function of $f$; further, $x = (x_1,x_2,\dots,x_n)$ and $y = (y_1,y_2,\dots,y_n)$ denote the first $n$ and the last $n$ coordinate variables of $f_i$, $i = 1,2,\dots,n$, respectively. By letting $x \le \tilde x$ for $x, \tilde x \in \mathbb{R}^n$ if and only if $x_i \le \tilde x_i$ for each $i = 1,2,\dots,n$, a closed partial order on $\mathbb{R}^n$ is defined. The closed partial order $\le$ on $\mathbb{R}^n$ generates a closed partial order $\preceq$ on $C$. In particular, $\eta \preceq \tilde\eta$ holds if and only if $\eta(s) \le \tilde\eta(s)$ for each $s \in [-1,0]$. As an easy consequence of assumption (23), the semidynamical system $\Phi$ is Kamke monotone [Smith, 1995]. In other words, inequality $\Phi(t,\eta) \preceq \Phi(t,\tilde\eta)$ holds true whenever $t \ge 0$ and $\eta, \tilde\eta \in C$ with $\eta \preceq \tilde\eta$.

Theorem 4. Consider the initial value problem (1) under condition (23). Then

• A. [Garay & Loczi, 2004]: Let $\gamma > 0$. Given any Runge-Kutta method $M$ satisfying $b_i > 0$ for $i = 1,\dots,\nu$, the discretization operator $\varphi_{M,PLI}$ (with piecewise linear interpolation) is monotone in the sense that, for sufficiently small stepsize-$h$ and for any initial functions with $\eta \preceq \tilde\eta$, also the order relation $\varphi_{M,PLI}(h,\eta) \preceq \varphi_{M,PLI}(h,\tilde\eta)$ holds true.
• B. [Garay & Loczi, 2004]: Let $\gamma > 0$. Suppose that $f_i(x,y) > 0$ and $(f_i)'_{x_i}(x,y) > 0$ for

$i = 1,2,\dots,n$. Then, for sufficiently small stepsize-$h$ and for any nondecreasing initial function $\eta$, we have that $\varphi_{E,PLI}(h,\eta) \preceq \Phi(h,\eta) \preceq \varphi_{I,PLI}(h,\eta)$.
• C. [Kloeden & Schropp, 2004]: Let $\gamma = 0$. Suppose we are given a Runge-Kutta method $M$ with the properties that $a_{ij} > 0$ for $i,j = 1,\dots,\nu$ and $b_i > 0$ for $i = 1,\dots,\nu$. Then the discretization operator $\varphi_{M,PLI}$ is monotone. Moreover, the positivity assumption on the Runge-Kutta matrix $A = \{a_{ij}\}_{i,j=1}^{\nu}$ can be weakened to the nonnegativity assumption on the matrix function $\tau \to (I + \tau A)^{-1} A$, required on some nondegenerate $\tau$-interval $[0,\tau_0]$.

An iterative combination of Parts A and B formulates and generalizes the well-known observation that, given a one-dimensional ordinary differential equation with all solutions convex, every solution curve is above the broken line determined by the explicit, and under the broken line determined by the implicit Euler method. Part C is of an entirely different character. In a strong resemblance to results on contractivity in numerical ordinary differential equations [Hairer et al., 1993], it provides a sufficient condition for a Runge-Kutta method to preserve monotonicity of the solution dynamics under discretization. In the light of the elegant counterexamples in [Kloeden & Schropp, 2003], this sufficient condition is almost necessary.

Even in the numerical contexts of differential equation theory, the word "monotonicity" can be used in a number of different ways. Monotonicity of iterative methods for delay equations has already been investigated in [Erbe & Liu, 1991].

4. Remarks on Functional Differential Equations

The previous considerations suggest that all the Theorems above are valid for retarded functional differential equations of the form $\dot x(t) = g(x_t)$ where $x_t(s) = x(t+s)$ for $s \in [-1,0]$, and $g : C \to \mathbb{R}^n$ is of class $C^p$, $p \ge 2$. Gyula Farkas has always formulated his results in this more general framework. However, occasionally, he carried out the proofs only for the special case $g(x_t) = f(x(t),x(t-1))$ and indicated the technical modifications needed for a general $g$. He considered assumptions (2)-(4) in [Farkas, 2003] and assumptions (i)-(vi) of his Lemma 8 in [Farkas, 2002c] as general definitions of discretization operators for equations of the form $\dot x(t) = g(x_t)$ and $\dot x(t) = Lx_t + g(x_t)$, respectively. Here $L : C \to \mathbb{R}^n$ is a bounded linear operator (which, by a theorem of Riesz, can be represented as a Stieltjes integral of the form $L\eta = \int_{-1}^{0} d\vartheta(s)\,\eta(s)$).

The general definitions above have little relevance to practical purposes. For example, in all numerical implementations we are aware of, $L\eta$ is replaced by a finite sum like $\sum_{j=1}^{N} \left(\vartheta((1-j)/N) - \vartheta(-j/N)\right)\eta(-j/N)$. Several references on the numerics of retarded functional differential equations, chosen in the spirit of the Bellen-Zennaro monograph, are contained in [Maset, 2003].

The numerics of equations with infinite delay is more complicated, partly because of the depth of the underlying functional analysis. There are only sporadic results in this direction [Liu, 1997]. We cite also the papers [Koto, 1999; Insperger & Stepan, 2002] representing those devoted to some qualitative aspects of numerical bifurcation and numerical stability of retarded/delay equations. We are not aware of any papers on the numerics of delay equations with computer-assisted proofs.

All in all, we conclude by emphasizing that the large gap that characterized the relation of abstract dynamical systems theory and the numerical practice of solving differential equations until the early nineties of the last century has been considerably filled in the last ten years. Having read papers like [Shub, 1986] or [Matijasevich, 1985] on numerical methods, it is clear to us that many of the most distinguished mathematicians have (i) hoped for, (ii) guessed, (iii) foreseen, (iv) worked for this development. Among them John von Neumann is pioneer number one.

Acknowledgments

Parts of this paper were written during a stay of the author at the University of Padova. Hospitality of the Department of Mathematics is gratefully acknowledged. The author is indebted to Wolf-Jürgen Beyn, Giovanni Colombo, Peter Kloeden, and Johannes Schropp for valuable discussions during the preparation of the paper. The paper is supported by the Hungarian National Science Foundation OTKA No. T037491.

References

Bellen, A. [1984] "One-step collocation for delay differential equations," J. Comput. Appl. Math. 10, 275-283.

Bellen, A. & Zennaro, M. [2003] Numerical Methods for Delay Differential Equations (Oxford University Press, Oxford).
Beyn, W. J. [1987a] "On the numerical approximation of phase portraits near stationary points," SIAM J. Numer. Anal. 24, 1095-1113.
Beyn, W. J. [1987b] "On invariant closed curves of one-step methods," Numer. Math. 51, 103-122.
Beyn, W. J. & Lorenz, J. [1987] "Center manifolds of dynamical systems under discretization," Numer. Funct. Anal. Optimiz. 9, 318-414.
Butcher, J. C. [1987] The Numerical Analysis of Ordinary Differential Equations (Wiley, London).
Chicone, C. [2003] "Inertial and slow manifolds for delay equations with small delays," J. Diff. Eq. 190, 364-406.
Erbe, L. & Liu, X. [1991] "Monotone iterative methods for differential systems with finite delay," Appl. Math. Comput. 43, 43-64.
Farkas, G. [2001a] "Unstable manifolds for RFDEs under discretization: The Euler method," Comput. Math. Appl. 42, 1069-1081.
Farkas, G. [2001b] "A Grobman-Hartman result for retarded functional differential equations with an application to the numerics of hyperbolic equilibria," Z. Angew. Math. Phys. 52, 421-432.
Farkas, G. [2002a] "A numerical C1-shadowing result for retarded functional differential equations," J. Comput. Appl. Math. 45, 269-289.
Farkas, G. [2002b] "Nonexistence of uniform exponential dichotomies for delay equations," J. Diff. Eq. 182, 266-268.
Farkas, G. [2002c] "Small delay inertial manifolds under numerics: A numerical structural stability result," J. Dyn. Diff. Eq. 14, 549-588.
Farkas, G. [2003] "Discretizing hyperbolic periodic orbits of delay differential equations," Z. Angew. Math. Mech. 83, 38-49.
Garay, B. M. [1996] "On structural stability of ordinary differential equations with respect to discretization methods," Numer. Math. 72, 449-479.
Garay, B. M. [2001] "Estimates in discretizing normally hyperbolic compact invariant manifolds of ordinary differential equations," Comput. Math. Appl. 42, 1103-1122.
Garay, B. M. & Simon, P. L. [2001] "Numerical flow-box theorems under structural assumptions," IMA J. Numer. Anal. 21, 733-749.
Garay, B. M. & Loczi, L. [2004] "Monotone delay equations and Runge-Kutta discretizations," Funct. Diff. Eq. 11, 59-67.
Gedeon, T. & Hines, G. [1999] "Upper semicontinuity of Morse sets of a discretization of a delay-differential equation," J. Diff. Eq. 151, 36-78.
Hairer, E., Norsett, S. P. & Wanner, G. [1993] Solving Ordinary Differential Equations I. Nonstiff Problems (Springer, Berlin).
Hale, J. K. [1977] Theory of Functional Differential Equations (Springer, Berlin).
Hale, J. K. [1988] Asymptotic Behaviour of Dissipative Systems (AMS, Providence).
Insperger, T. & Stepan, G. [2002] "Stability chart for the delayed Mathieu equation," Roy. Soc. London Proc. Ser. A. Math. Phys. Eng. Sci. 458, 1989-1998.
In't Hout, K. & Lubich, Ch. [1998] "Periodic orbits of delay differential equations under discretization," BIT 38, 72-91.
Kloeden, P. E. & Lorenz, J. [1986] "Stable attracting sets in dynamical systems and their one-step discretization," SIAM J. Num. Anal. 23, 986-995.
Kloeden, P. E. & Schropp, J. [2003] "Runge-Kutta methods for monotone differential and delay equations," BIT 43, 571-586.
Kloeden, P. E. & Schropp, J. [2004] "Stable attracting sets in delay differential equations and in their Runge-Kutta discretization," submitted.
Koto, T. [1999] "Neumark-Sacker bifurcation in the Euler method for a delay differential equations," BIT 39, 110-115.
Li, M. C. [1997] "Structural stability of flows under numerics," J. Diff. Eq. 141, 1-12.
Liu, Y. [1997] "On the θ-method for delay equations with infinite time lag," J. Comput. Appl. Math. 71, 177-190.
Maset, S. [2003] "Numerical solution of retarded functional differential equations as abstract Cauchy problems," J. Comput. Appl. Math. 16, 259-282.
Matijasevich, Yu. V. [1985] "A posteriori interval analysis," EUROCAL, Vol. 2, ed. Caviness, B. F. (Springer, Berlin), pp. 328-334.
Robinson, J. C. [1999] "Inertial manifolds with and without delay," Discr. Cont. Dyn. Syst. 5, 813-824.
Shub, M. [1986] "Some remarks on dynamical systems and numerical analysis," Dynamical Systems and Partial Differential Equations, eds. Lara-Carrero, L. & Lewowicz, J. (Univ. Simon Bolivar, Caracas), pp. 69-91.
Smith, H. L. [1995] Monotone Dynamical Systems (AMS, Providence).
Stuart, A. M. & Humphries, A. R. [1996] Dynamical Systems and Numerical Analysis (Cambridge University Press, Cambridge).
Zennaro, M. [1986] "Natural continuous extensions of Runge-Kutta methods," Math. Comput. 46, 119-133.
BIFURCATIONS AND CONTINUOUS TRANSITIONS
OF ATTRACTORS IN AUTONOMOUS AND
NONAUTONOMOUS SYSTEMS
P. E. KLOEDEN and S. SIEGMUND
Fachbereich Mathematik, Johann Wolfgang Goethe Universität,
D-60054 Frankfurt am Main, Germany

Received February 16, 2004; Revised June 8, 2004

Nonautonomous bifurcation theory studies the change of attractors of nonautonomous systems
which are introduced here with the process formalism as well as the skew product formalism.
We present a total stability theorem ensuring the existence of nearby attractors of perturbed
systems. They depend continuously on a parameter if and only if the attraction is uniform w.r.t.
parameter, i.e. the attractors are equiattracting.
We apply these principles to explicit systems to clarify the meaning of continuous and abrupt
transitions of attractors in contrast to bifurcations, i.e. splitting of minimal invariant subsets into
others within the attractor. Several examples are treated, including a nonautonomous pitchfork
bifurcation.

Keywords: Total stability; attractor transition; attractor bifurcation; subcritical bifurcation;
supercritical bifurcation; nonautonomous pitchfork bifurcation; nonautonomous dynamical system;
process; skew product flow.

1. Introduction

We have several aims in writing this article, which is really more an essay than a research, survey or tutorial paper, although it contains elements of all three. Two main aims of particular long term interest are to understand what is meant by

1. a bifurcation or transition of a nontrivial attractor set in an autonomous system (e.g. such as a Lorenz attractor in its chaotic regime or a Chua attractor),
2. a bifurcation in a nonautonomous system.

As we shall see, both are closely related through the skew product flow representation of nonautonomous dynamical systems.

Our discussion is by no means complete. Nevertheless, we hope that our comments will provide the reader with some insight into the issues that are involved and will stimulate further investigations.

Underlying our considerations are two general principles. The first is the concept of total stability: if a system has a uniformly asymptotically stable compact set, such as a global attractor, then so do all nearby systems. In the autonomous case, the perturbed systems then have attractors, which converge in general upper semicontinuously to that of the original system. In the nonautonomous case, the perturbed systems also have nearby compact absorbing or attracting sets, but the existence of attractors is complicated by the ambiguity of just how an attractor should actually be defined in nonautonomous systems; we will introduce the reader to several possibilities below.

The idea of total stability underlies a rarely mentioned fact in autonomous bifurcation theory. Although subcritical and supercritical bifurcations are commonly encountered in such systems, it is not always easy to determine which of them actually occurs. However, if (at the bifurcation point) an


equilibrium point remains asymptotically stable for the nonlinear system when it loses stability in the linearized system, then the bifurcation is supercritical (e.g. see [Arrowsmith & Place, 1990, Theorem 4.2.1] in connection with Hopf bifurcations). In this case the global (or maximal if only a local) attractor of the nonlinear system in fact depends continuously on the bifurcation parameter. We will see below that this also holds for a subcritical bifurcation at the point of loss of linear stability, but not where the nonlocal bifurcating equilibrium points first arise. This is a consequence of a second general principle: as recently shown in [Li & Kloeden, 2004a] (see also [Li & Kloeden, 2004b; Wang et al., 2004]), attractors depend continuously on a parameter if and only if they are equiattracting, i.e. uniformly attracting with respect to the parameter.

The term "bifurcation" usually refers to situations when a system linearized about a minimal invariant set such as an equilibrium point, a periodic solution or an almost periodic solution loses asymptotic stability and several new invariant sets come into existence. Can one really talk about the bifurcation of a general global attractor? Firstly, it is not clear about which solutions within a global attractor one should linearize and, secondly, a global attractor is both unique and connected, if it exists, so cannot split into new disjoint invariant sets. Obviously, one needs to think in terms of changes of the dynamics within the global attractor rather than of the attractor itself. We will use the term transition when discussing changes to the attractors as system parameters vary and reserve the term "bifurcation" for the usual situations mentioned above, i.e. the splitting of minimal invariant subsets into others within the attractors. In particular, we will refer to a continuous transition when the global (or maximal if only local) attractors depend continuously on the parameters and to an abrupt or discontinuous transition when the attractors depend only upper semicontinuously (and not continuously) on the parameter. We will see in many examples that although transitions, in general, need only be abrupt, they are in fact typically continuous for most parameter values.

Next we describe the structure of this article, followed by some notation at the end of this section.

Section 2 deals with autonomous systems and some of their bifurcations. It contains scalar examples of a supercritical and a subcritical bifurcation and an explanation why e.g. the transcritical bifurcation does not fit into our total stability scenario. We also give examples of supercritical, subcritical and saddle-node bifurcations for triangular autonomous systems.

In Sec. 3 we introduce nonautonomous systems from two different points of view using the process formalism and the skew product formalism. Our examples are a nonautonomous version of the autonomous example for the supercritical pitchfork bifurcation and triangular autonomous systems, now interpreted differently.

Section 4 contains our theorem on total stability of nonautonomous systems under a uniform parametric dependence condition. We apply it in Sec. 5 to derive conditions for continuous bifurcations, e.g. supercritical bifurcations.

Section 6 concludes the discussion with some remarks and open questions. The proof of the total stability theorem is contained in an Appendix.

The Hausdorff semi-metric $H^*_X(A,B)$ of nonempty compact subsets $A$ and $B$ of a metric space $(X,d)$ is defined as

$$H^*_X(A,B) := \max_{a \in A} \operatorname{dist}(a,B), \quad \text{where} \quad \operatorname{dist}(a,B) := \min_{b \in B} d(a,b),$$

and $H_X(A,B) = \max\{H^*_X(A,B), H^*_X(B,A)\}$ is a metric, called the Hausdorff metric, on the space of nonempty compact subsets of $(X,d)$.

Remark 1. To simplify the exposition we will always assume that we have global bounds and constants (e.g. in Lipschitz conditions and approximation estimates). Our theorems and proofs are in fact valid for locally defined bounds and constants, but require technical modifications (e.g. see [Kloeden & Lorenz, 1986]), which, we feel, distract from our emphasis here on the dynamical behavior.

2. Autonomous Systems

We begin with several examples of well-known bifurcations in scalar autonomous systems and in two-dimensional triangular autonomous systems, which provide simple but useful insights into the topics we wish to discuss.

2.1. Scalar autonomous systems

2.1.1. Supercritical bifurcation

The autonomous differential equation

$$\frac{dx}{dt} = \nu x - b x^3 \tag{1}$$

with $b > 0$ has a global attractor $A_\nu = \{0\}$ for $\nu < 0$. For $\nu = 0$, the equilibrium point 0 loses asymptotic stability for the linearized equation (it remains stable there), but the set $A_0 = \{0\}$ is still a global attractor for the nonlinear system. To see this we use the Lyapunov function $V(x) = x^2$ to obtain

$$\frac{d}{dt} V(x_0(t)) = -2b\, x_0(t)^4 = -2b\, V(x_0(t))^2$$

and hence

$$V(x_0(t)) = \frac{V(x_0(0))}{1 + 2b\, V(x_0(0))\, t} \to 0 \quad \text{as } t \to \infty,$$

where $x_0(t)$ is any solution of the differential equation (1) for $\nu = 0$. (This Lyapunov function can also be used for $\nu < 0$.)

For $\nu > 0$, there are three equilibrium points 0 and $\pm\sqrt{\nu/b}$, and the global attractor is now

$$A_\nu = \left[-\sqrt{\nu/b},\ \sqrt{\nu/b}\right].$$

Here $H^*(A_\nu, A_0) = H^*(A_\nu, \{0\}) = \sqrt{\nu/b} \to 0$ as $\nu \to 0$, i.e. the set-valued mapping $\nu \mapsto A_\nu$ is continuous (in the Hausdorff metric) at the bifurcation point $\nu = 0$.

This classical example of a supercritical pitchfork bifurcation in an autonomous system is our first example of what we have called above a "continuous transition" of the global attractor.

2.1.2. Subcritical bifurcation

We now consider a subcritical bifurcation arising in the scalar autonomous differential equation

$$\frac{dx}{dt} = -x\left(x^4 - 2x^2 + 1 - \nu\right), \tag{2}$$

for which there are three parameter regimes for equilibrium solutions $\bar{x}_\nu$:

(i) $\bar{x}_\nu = 0$ for $\nu < 0$,
(ii) $\bar{x}_\nu = 0,\ \pm\sqrt{1 + \sqrt{\nu}},\ \pm\sqrt{1 - \sqrt{\nu}}$ for $0 \le \nu < 1$, and
(iii) $\bar{x}_\nu = 0,\ \pm\sqrt{1 + \sqrt{\nu}}$ for $\nu \ge 1$.

The zero solution here loses linear stability at $\nu = 1$ in a subcritical bifurcation to the nonlocal solutions $\pm\sqrt{1 + \sqrt{\nu}}$. Note, however, that these equilibria as well as $\pm\sqrt{1 - \sqrt{\nu}}$ first appear at $\nu = 0$. The equilibria $\pm\sqrt{1 + \sqrt{\nu}}$ are asymptotically stable for $\nu > 0$, whereas the equilibria $\pm\sqrt{1 - \sqrt{\nu}}$ are unstable in their existence interval $0 < \nu < 1$.

The global attractors here are $A_\nu = \{0\}$ for $\nu < 0$ and

$$A_\nu = \left[-\sqrt{1 + \sqrt{\nu}},\ \sqrt{1 + \sqrt{\nu}}\right]$$

for $\nu \ge 0$. In particular, the set-valued mapping $\nu \mapsto A_\nu$ is not continuous at $\nu = 0$ (being only

Fig. 1. Supercritical pitchfork bifurcation.
Fig. 2. Subcritical bifurcation.
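
As a rough numerical companion to the scalar examples (1) and (2), the following Python sketch evaluates the Hausdorff semi-distance $H^*$ and the Hausdorff metric between the interval attractors stated in the text. The function names and the default $b = 1$ are illustrative assumptions and not part of the original article. The interval formula uses the elementary fact that the distance to a closed interval, maximized over another closed interval, is attained at an endpoint.

```python
import math


def dist_to_interval(x, b):
    """Distance from the point x to the closed interval b = (lo, hi)."""
    lo, hi = b
    return max(lo - x, 0.0, x - hi)


def semi_distance(a, b):
    """Hausdorff semi-distance H*(A, B) for closed intervals A and B.

    dist(., B) is convex, so its maximum over A is attained at an endpoint of A.
    """
    return max(dist_to_interval(a[0], b), dist_to_interval(a[1], b))


def hausdorff(a, b):
    """Hausdorff metric H(A, B) = max{H*(A, B), H*(B, A)}."""
    return max(semi_distance(a, b), semi_distance(b, a))


def attractor_supercritical(nu, b=1.0):
    """Global attractor of dx/dt = nu*x - b*x^3 (b = 1 is an assumed default)."""
    r = math.sqrt(nu / b) if nu > 0 else 0.0
    return (-r, r)


def attractor_subcritical(nu):
    """Global attractor of dx/dt = -x*(x^4 - 2x^2 + 1 - nu)."""
    if nu < 0:
        return (0.0, 0.0)
    r = math.sqrt(1.0 + math.sqrt(nu))
    return (-r, r)
```

With these definitions, `hausdorff(attractor_supercritical(nu), attractor_supercritical(0.0))` shrinks like $\sqrt{\nu}$ as $\nu \to 0^+$. For the subcritical example the semi-distance from $A_\nu$ with $\nu < 0$ to $A_0$ is zero, while the reverse semi-distance equals 1, which is upper semicontinuity without continuity at $\nu = 0$.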



upper semicontinuous there), but is continuous at $\nu = 1$. The attractor thus undergoes a discontinuous transition at $\nu = 0$ and a continuous transition at $\nu = 1$ (and, in fact, at any $\nu \ne 0$). An inspection of the direction fields shows that the attractors are clearly not equiattracting for parameter values in a neighborhood of $\nu = 0$, since the attraction is not uniform there.

2.1.3. Some nonapplicable situations

We mention for completeness that a transcritical bifurcation does not fit into our total stability scenario since there is no nonlinear attractor at the bifurcation point, e.g. the autonomous equation

$$\frac{dx}{dt} = \nu x - x^2$$

has (nonglobal) local attractors $A_\nu = \{0\}$ for $\nu < 0$ and $A_\nu = \{\nu\}$ for $\nu > 0$, but $A_0 = \{0\}$ is not attracting from any neighborhood, being attracting on one side and repelling on the other.

A similar situation occurs in a small neighborhood of the nonlocal subcritically bifurcating equilibria $\pm 1$ in system (2) at $\nu = 0$, these equilibria also being attracting on one side and repelling on the other. Unlike in the transcritical bifurcation above, the equilibria here exist only on one side of the critical parameter value $\nu = 0$.

2.2. Triangular autonomous systems

We consider some autonomous differential equations in $\mathbb{R}^2$ with the triangular form

$$\frac{dx}{dt} = f(x,p), \quad \frac{dp}{dt} = g(p), \quad (x,p) \in \mathbb{R}^2.$$

Such triangular systems are examples of skew product flows with the uncoupled component for $p$ being considered as "driving" the coupled or "driven" system for $x$.

In the following three examples we consider bifurcations of the triangular system due to bifurcations in either the "driving" equation (for $p$) or the "driven" equation (for $x$), the first two involving a supercritical and subcritical bifurcation, respectively, of the driving system and the third a saddle-node bifurcation in the driven system. The systems are all Morse-Smale systems with global attractors consisting of a finite number of equilibria and their heteroclinic trajectories. Since the systems are two-dimensional their global attractors can be easily determined by elementary algebra and direction field arguments.

2.2.1. Supercritical bifurcation

The driving system of the triangular system

$$\frac{dx}{dt} = -x + p, \quad \frac{dp}{dt} = \nu p - p^3, \quad (x,p) \in \mathbb{R}^2 \tag{3}$$

is, in fact, the scalar autonomous differential equation (1) with $b = 1$, thus with equilibria $\bar{p}_\nu = 0$ for $\nu \le 0$ and $\bar{p}_\nu = 0, \pm\sqrt{\nu}$ for $\nu > 0$. The equilibria $(\bar{x}_\nu, \bar{p}_\nu)$ of the triangular system (3) satisfy $\bar{x}_\nu = \bar{p}_\nu$, where the $\bar{p}_\nu$ are equilibria of the driving system. Thus there are two cases

(i) $(\bar{x}_\nu, \bar{p}_\nu) = (0,0)$ for $\nu \le 0$, and
(ii) $(\bar{x}_\nu, \bar{p}_\nu) = (0,0),\ (\pm\sqrt{\nu}, \pm\sqrt{\nu})$ for $\nu > 0$.

The global attractor of the coupled system is thus $A_\nu = \{(0,0)\}$ for $\nu \le 0$ and

$$A_\nu = \{(0,0),\ (\pm\sqrt{\nu}, \pm\sqrt{\nu})\} \cup \{\text{heteroclinic trajectories}\}$$

for $\nu > 0$.

A heteroclinic trajectory lies below the $x = p$ line in the $(p,x)$ plane if $p' = \nu p - p^3$ is positive there, and above if it is negative, see Fig. 3.

The set-valued mapping $\nu \mapsto A_\nu$ is thus continuous for all $\nu$, including the supercritical bifurcation point $\nu = 0$ of the driving system, where the single equilibrium $(0,0)$ in $A_0$ (actually, $A_0 = \{(0,0)\}$ here) undergoes a supercritical bifurcation to yield the new equilibria $(\pm\sqrt{\nu}, \pm\sqrt{\nu})$ in $A_\nu$ for $\nu > 0$.

2.2.2. Subcritical bifurcation

The situation is analogous for the triangular system

$$\frac{dx}{dt} = -x + p, \quad \frac{dp}{dt} = -p\left(p^4 - 2p^2 + 1 - \nu\right), \quad (x,p) \in \mathbb{R}^2, \tag{4}$$

where the driving system is now the subcritically bifurcating scalar differential equation (2). The triangular system (4) has equilibria $(\bar{x}_\nu, \bar{p}_\nu)$ with $\bar{x}_\nu = \bar{p}_\nu$, where the $\bar{p}_\nu$ are equilibria of the driving

Fig. 3. The attractor for $\nu < 0$ and $\nu > 0$: Supercritical case.
Fig. 4. The attractor for different values of $\nu$: Subcritical case.

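system, i.e. with the three cases

(i) $\bar{p}_\nu = 0$ for $\nu < 0$,
(ii) $\bar{p}_\nu = 0,\ \pm\sqrt{1 - \sqrt{\nu}},\ \pm\sqrt{1 + \sqrt{\nu}}$ for $0 \le \nu < 1$, and
(iii) $\bar{p}_\nu = 0,\ \pm\sqrt{1 + \sqrt{\nu}}$ for $\nu \ge 1$.

The global attractor of the coupled system is thus $A_\nu = \{(0,0)\}$ for $\nu < 0$ with

$$A_\nu = \left\{(0,0),\ \left(\pm\sqrt{1 \pm \sqrt{\nu}},\ \pm\sqrt{1 \pm \sqrt{\nu}}\right)\right\} \cup \{\text{heteroclinic trajectories}\}$$

for $0 \le \nu < 1$, and

$$A_\nu = \left\{(0,0),\ \left(\pm\sqrt{1 + \sqrt{\nu}},\ \pm\sqrt{1 + \sqrt{\nu}}\right)\right\} \cup \{\text{heteroclinic trajectories}\}$$

for $\nu \ge 1$. A heteroclinic trajectory lies below the $x = p$ line in the $(p,x)$ plane if $p' = -p(p^4 - 2p^2 + 1 - \nu)$ is positive there, and above if it is negative, see Fig. 4. The transition in $A_\nu$ is thus continuous for all $\nu \ne 0$ and discontinuous only for $\nu = 0$.

2.2.3. Saddle-node bifurcation

The autonomous differential equation in $\mathbb{R}^2$,

$$\frac{dx}{dt} = x\left(1 - x^2 - y^2\right) + y(1 + \nu + x), \quad \frac{dy}{dt} = y\left(1 - x^2 - y^2\right) - x(1 + \nu + x),$$

is obviously not of triangular form. However, if we change from cartesian to polar coordinates, then we obtain an equivalent system in triangular form,

$$\frac{d\theta}{dt} = -(1 + \nu + r\cos\theta), \quad \frac{dr}{dt} = r - r^3, \quad (\theta, r) \in [0, 2\pi] \times \mathbb{R}^+ \tag{5}$$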

Fig. 5. Bifurcation diagram, cartesian coordinates (panels: $\nu < 0$, $\nu = 0$, $\nu > 0$).
Fig. 6. Bifurcation diagram, polar coordinates.
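
A small consistency check of the change to polar coordinates in the saddle-node example can be sketched in Python. The sketch assumes that the Cartesian system reads $\dot{x} = x(1 - x^2 - y^2) + y(1 + \nu + x)$, $\dot{y} = y(1 - x^2 - y^2) - x(1 + \nu + x)$, which is the form compatible with the polar system (5); all function names are hypothetical. It also counts the equilibria on the unit circle, where $1 + \nu + \cos\theta = 0$.

```python
import math


def cartesian_field(x, y, nu):
    """Assumed Cartesian vector field of the saddle-node example."""
    r2 = x * x + y * y
    w = 1.0 + nu + x
    return (x * (1.0 - r2) + y * w, y * (1.0 - r2) - x * w)


def polar_field(theta, r, nu):
    """Right-hand side (d theta/dt, dr/dt) of the triangular polar system (5)."""
    return (-(1.0 + nu + r * math.cos(theta)), r - r**3)


def polar_from_cartesian(theta, r, nu):
    """Transform the Cartesian field to polar rates, valid for r > 0."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    dx, dy = cartesian_field(x, y, nu)
    dr = (x * dx + y * dy) / r               # r * dr/dt = x*dx/dt + y*dy/dt
    dtheta = (x * dy - y * dx) / (r * r)     # r^2 * dtheta/dt = x*dy/dt - y*dx/dt
    return (dtheta, dr)


def circle_equilibria(nu):
    """Angles in [0, 2*pi) on the unit circle r = 1 with 1 + nu + cos(theta) = 0."""
    c = -(1.0 + nu)
    if abs(c) > 1.0:
        return []
    a = math.acos(c)
    return sorted({a, (2.0 * math.pi - a) % (2.0 * math.pi)})
```

Under these assumptions `circle_equilibria` returns two angles for small negative $\nu$, a single angle $\theta = \pi$ at $\nu = 0$, and none for $\nu > 0$, in line with the saddle-node scenario described in the text.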

for which the dynamical behavior is more transparent [Arrowsmith & Place, 1990; Glendinning, 1994; Reitmann, 1996].

The global attractor $A_\nu$ here is the unit disk $\{(x,y) \in \mathbb{R}^2 : r^2 = x^2 + y^2 \le 1\}$ for all $\nu$, i.e. it does not change at all as $\nu$ changes (so it obviously depends continuously on $\nu$). However, the dynamics within and near $A_\nu$ do change significantly as $\nu$ passes through zero. The origin is an equilibrium point and the unit circle is invariant for all values of $\nu$. For negative values of $\nu$, there are two equilibrium points on the unit circle, one a saddle point and the other a stable node, which coalesce at $\nu = 0$ to form a single saddle-node. For positive $\nu$ this saddle-node disappears. This is easily visualized in Cartesian coordinates (see Fig. 5), but it is interesting to also use polar coordinates (see Fig. 6).

This example also differs from the previous two on triangular systems in that the bifurcation occurs in the coupled "state" space equation ($\theta$ here) rather than in the "driving" system equation ($r$ here).

3. Nonautonomous Systems

We consider two abstract formalisms of nonautonomous dynamical systems through simple examples: the process formalism, which at first sight seems more natural, and the skew product formalism in which a state space system is driven by an autonomous input system. Various definitions of nonautonomous attractors will then be introduced and illustrated with examples.

3.1. Process formalism of a nonautonomous system

Solution mappings are one of the main motivations for the process definition [Dafermos, 1971] (see also

[Hale, 1988]) of an abstract nonautonomous dynamical system on a state space $X$. A process is a continuous mapping $(t,t_0,x_0) \mapsto x(t,t_0,x_0) \in X$ for $t \ge t_0$, $t_0 \in \mathbb{R}$ and $x_0 \in X$, with the initial value and evolution properties

(i) $x(t_0,t_0,x_0) = x_0$ for all $t_0 \in \mathbb{R}$ and $x_0 \in X$,
(ii) $x(t_2,t_0,x_0) = x(t_2,t_1,x(t_1,t_0,x_0))$ for all $t_0 \le t_1 \le t_2$ in $\mathbb{R}$ and $x_0 \in X$.

It is often also called a two-parameter semigroup on $X$ in contrast with the one-parameter semigroup of an autonomous dynamical system.

A nonautonomous attractor $\mathcal{A}$ now consists of a family of nonempty compact sets $\{A(t), t \in \mathbb{R}\}$ which is invariant in the sense that

$$x(t,t_0,A(t_0)) = A(t) \quad \text{for all } t \ge t_0,\ t_0 \in \mathbb{R}$$

(from which it follows that the set-valued mapping $t \mapsto A(t)$ is continuous in $t \in \mathbb{R}$). There are now two ways to define attraction of $\mathcal{A}$, which are equivalent in the autonomous case. The first, and perhaps more obvious, corresponds to Lyapunov asymptotic stability, i.e. with forward attraction in the sense of

$$\lim_{t \to \infty} H^*_X(x(t,t_0,x_0), A(t)) = 0. \tag{6}$$

Note that the target set $A(t)$ is changing in time. The other attraction, called pullback attraction, involves a fixed target set with progressively earlier starting time, i.e.

$$\lim_{t_0 \to -\infty} H^*_X(x(t,t_0,x_0), A(t)) = 0.$$

See Fig. 7 for the case where the family $\{A(t), t \in \mathbb{R}\}$ corresponds to a single trajectory $\bar{\varphi}(t)$, i.e. $A(t) = \{\bar{\varphi}(t)\}$ for all $t \in \mathbb{R}$. For an extensive, but elementary introduction to forward and pullback attractors see [Caraballo et al., 2003; Grüne & Kloeden, 2001].

We now consider a nonautonomous analogue of the pitchfork bifurcation of the differential equation (1), namely

$$\frac{dx}{dt} = \nu x - b(t)\, x^3, \tag{7}$$

with continuous $b : \mathbb{R} \to \mathbb{R}$, $b(t) \in [b_0, b_1]$ for all $t \in \mathbb{R}$ where $0 < b_0 < b_1 < \infty$. See [Kloeden, 2004] for the bifurcatory analysis of a multidimensional version of this equation.

Assuming $b(t)$ is nonconstant, the nonautonomous differential equation has only one equilibrium point 0 and this exists for all values of $\nu$. Supposing that $\nu \le 0$, we see for the Lyapunov function $V(x) = x^2$ that

$$\frac{d}{dt} V(x_\nu(t)) = 2\nu\, x_\nu(t)^2 - 2b(t)\, x_\nu(t)^4 \le 2\nu\, V(x_\nu(t)) - 2b_0\, V(x_\nu(t))^2 \le -2b_0\, V(x_\nu(t))^2,$$

so

$$V(x_\nu(t)) \le \frac{V(x_\nu(t_0))}{1 + 2b_0\, V(x_\nu(t_0))\,(t - t_0)} \to 0 \quad \text{as } t \to \infty,$$

Fig. 7. (a) Forward and (b) pullback attraction.

where $x_\nu(t)$ is any solution of the differential equation (7) for $\nu \le 0$. Moreover, the equilibrium point 0 loses stability in the linearized equation at $\nu = 0$ and is unstable for $\nu > 0$.

Unlike the autonomous case above, no new equilibrium points come into existence when $\nu > 0$. Instead we will show that there is a family of time-dependent sets

$$A_\nu(t) = \left[-\bar{\varphi}_\nu(t),\ \bar{\varphi}_\nu(t)\right], \quad t \in \mathbb{R}, \tag{8}$$

which are uniformly Lyapunov asymptotically stable (see Eq. (12) below). Here $\pm\bar{\varphi}_\nu$ are solutions of the nonautonomous differential equation (7) given by

$$\bar{\varphi}_\nu(t) = \frac{1}{\sqrt{2 \int_{-\infty}^{t} b(s)\, e^{-2\nu(t-s)}\, ds}} \tag{9}$$

and satisfies $\bar{\varphi}_\nu(t) \in \left[\sqrt{\nu/b_1},\ \sqrt{\nu/b_0}\right]$ for all $t \in \mathbb{R}$, which means, in particular, that

$$H_{\mathbb{R}}(A_\nu(t), A_0) = H_{\mathbb{R}}(A_\nu(t), \{0\}) \le \sqrt{\nu/b_0} \to 0 \quad \text{as } \nu \to 0,$$

for each $t \in \mathbb{R}$, see Fig. 8. (Note that $\bar{\varphi}_\nu$ is almost periodic when the coefficient $b$ is almost periodic [Fink, 1974; Sell, 1971].)

Thus we have another example of a continuous transition of the attractor, this time nonautonomous.

The proofs of the above assertions are instructive. We first note that Eq. (7) is a Bernoulli equation, so can be converted into a linear differential equation

$$\frac{dv}{dt} + 2\nu v = 2b(t)$$

with the substitution $v = x^{-2}$ (recall that $x = 0$ is an equilibrium of the Bernoulli equation (7)), which we integrate to obtain

$$\frac{1}{x_\nu(t,t_0,x_0)^2} = \frac{1}{x_0^2}\, e^{-2\nu(t-t_0)} + 2 \int_{t_0}^{t} b(s)\, e^{-2\nu(t-s)}\, ds. \tag{10}$$

We hold $t$ and $x_0$ fixed in (10) and take the pullback limit as $t_0 \to -\infty$, to obtain the pullback limit solution

$$\bar{\varphi}_\nu(t) = \lim_{t_0 \to -\infty} x_\nu(t,t_0,x_0)$$

given by (9). This is itself a solution of the differential equation (7) and thus satisfies (10) with $x_0 = \bar{\varphi}_\nu(t_0)$, specifically

$$\frac{1}{\bar{\varphi}_\nu(t)^2} = \frac{1}{\bar{\varphi}_\nu(t_0)^2}\, e^{-2\nu(t-t_0)} + 2 \int_{t_0}^{t} b(s)\, e^{-2\nu(t-s)}\, ds. \tag{11}$$

To show that $\bar{\varphi}_\nu(t)$ is asymptotically stable for all $x_0 > 0$ (cf. [Langa et al., 2002; Langa et al., 2004;

Fig. 8. Nonautonomous pitchfork bifurcation.
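
The pullback construction for the nonautonomous pitchfork equation (7) can be explored numerically. The Python sketch below uses the closed-form expression (10) for $x_\nu(t,t_0,x_0)$ and approximates the pullback solution (9) by truncating the lower limit of integration. The coefficient $b(s) = 1 + 0.5\sin s$ (so $b_0 = 0.5$, $b_1 = 1.5$), the truncation `horizon`, the Simpson rule and all function names are assumptions made purely for illustration.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def b_coeff(s):
    """Hypothetical coefficient with 0.5 <= b(s) <= 1.5."""
    return 1.0 + 0.5 * math.sin(s)


def inv_square_integral(nu, t, t0):
    """2 * integral over [t0, t] of b(s) * exp(-2*nu*(t - s)) ds."""
    return 2.0 * simpson(lambda s: b_coeff(s) * math.exp(-2.0 * nu * (t - s)), t0, t)


def solution(nu, t, t0, x0):
    """x_nu(t, t0, x0) for x0 > 0, obtained from formula (10)."""
    inv_sq = math.exp(-2.0 * nu * (t - t0)) / x0**2 + inv_square_integral(nu, t, t0)
    return 1.0 / math.sqrt(inv_sq)


def pullback_solution(nu, t, horizon=40.0):
    """Approximation of (9) for nu > 0, replacing -infinity by t - horizon."""
    return 1.0 / math.sqrt(inv_square_integral(nu, t, t - horizon))
```

For example, with $\nu = 1$ and $t = 0$ one can compare `solution(1.0, 0.0, t0, 0.1)` for progressively earlier `t0` with `pullback_solution(1.0, 0.0)`; the gap shrinks as $t_0 \to -\infty$ with $t$ held fixed, and the limit stays between $\sqrt{\nu/b_1}$ and $\sqrt{\nu/b_0}$.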



Langa & Suarez, 2002]), we subtract (10) from (11) to obtain

$$\frac{1}{\bar{\varphi}_\nu(t)^2} - \frac{1}{x_\nu(t,t_0,x_0)^2} = \left(\frac{1}{\bar{\varphi}_\nu(t_0)^2} - \frac{1}{x_0^2}\right) e^{-2\nu(t-t_0)} \to 0 \quad \text{as } t \to \infty,$$

from which the result follows. It is in fact uniformly asymptotically stable because using the fact that $\bar{\varphi}_\nu(t) \in \left[\sqrt{\nu/b_1},\ \sqrt{\nu/b_0}\right]$, we can find for every $\varepsilon > 0$ and $x_0 > 0$ a $T(x_0, \varepsilon) > 0$ such that

$$\left|\frac{1}{\bar{\varphi}_\nu(t)^2} - \frac{1}{x_\nu(t,t_0,x_0)^2}\right| = \left|\frac{1}{\bar{\varphi}_\nu(t_0)^2} - \frac{1}{x_0^2}\right| e^{-2\nu(t-t_0)} \le \left(\frac{b_1}{\nu} + \frac{1}{x_0^2}\right) e^{-2\nu(t-t_0)} < \varepsilon \tag{12}$$

for $t \ge T(x_0, \varepsilon) + t_0$. The corresponding result holds for the solution $-\bar{\varphi}_\nu(t)$ for all $x_0 < 0$.

We note here that the nonzero solutions $x_\nu(t,t_0,x_0)$ of the differential equation (7) converge to $\operatorname{sgn}(x_0)\,\bar{\varphi}_\nu(t)$ in both the usual forward sense (i.e. $t \to \infty$ with $t_0$ fixed) as well as in the pullback sense (i.e. $t_0 \to -\infty$ with $t$ fixed). In general, forward and pullback convergence are independent concepts and neither one implies the other [Cheban et al., 2002; Wang et al., 2004], although in autonomous systems or in uniform systems, as in this example, they are equivalent. Pullback convergence is useful in constructing limiting objects, since the limit is fixed and not changing in time as in forward convergence.

3.2. Triangular systems as nonautonomous systems

Here we will reconsider our examples in Sec. 2.2 of bifurcations in autonomous differential equations of the triangular form

$$\frac{dx}{dt} = f(x,p), \quad \frac{dp}{dt} = g(p), \quad (x,p) \in \mathbb{R}^2, \tag{13}$$

which we now interpret as nonautonomous differential equations

$$\frac{dx}{dt} = f_{p_0}(x,t) := f(x, p(t,p_0)), \quad x \in \mathbb{R}, \tag{14}$$

where the solution $p(t,p_0)$ with initial value $p(0,p_0) = p_0$ of the second uncoupled component of the triangular system (13) is considered as an external driving force. In general, $p(t,p_0)$ need not be the solution of an autonomous differential equation, but just a function or a trajectory of an autonomous dynamical system, examples of which will be given in Sec. 3.3 and later.

We will reinterpret the bifurcations in these examples as nonautonomous bifurcations, using the pullback convergence method to construct explicitly the heteroclinic trajectories inside the previously determined global attractors.

In particular, we consider the supercritical bifurcation in the system

$$\frac{dx}{dt} = -x + p, \quad \frac{dp}{dt} = \nu p - p^3, \quad (x,p) \in \mathbb{R}^2. \tag{15}$$

As seen earlier, the global attractor of the coupled system (15) is $A_\nu = \{(0,0)\}$ for $\nu \le 0$ and

$$A_\nu = \{(0,0),\ (\pm\sqrt{\nu}, \pm\sqrt{\nu})\} \cup \{\text{heteroclinic trajectories}\}$$

for $\nu > 0$.

We know that the global attractor of the uncoupled equation for $p$ when $\nu > 0$ is $P_\nu = [-\sqrt{\nu}, \sqrt{\nu}]$ and, moreover, that the solution $p_\nu(t,p_0)$ exists for all $t \in \mathbb{R}$ when $p_0 \in P_\nu$, since $P_\nu$ is a compact invariant set for the $p$-dynamics. For such a solution, the nonautonomous differential equation (14) for the first component,

$$\frac{dx}{dt} = -x + p_\nu(t,p_0),$$

with initial value $x(t_0) = x_0$ has the solution

$$x_\nu(t,t_0,x_0,p_0) = x_0\, e^{-(t-t_0)} + e^{-t} \int_{t_0}^{t} e^{s}\, p_\nu(s,p_0)\, ds. \tag{16}$$

Holding $t$ fixed and taking the limit as $t_0 \to -\infty$, we obtain the pullback limit

$$\lim_{t_0 \to -\infty} x_\nu(t,t_0,x_0,p_0) = \bar{\varphi}_\nu(t,p_0) := e^{-t} \int_{-\infty}^{t} e^{s}\, p_\nu(s,p_0)\, ds. \tag{17}$$

Obviously, we have $-\sqrt{\nu} \le \bar{\varphi}_\nu(t,p_0) \le \sqrt{\nu}$ for all $t \in \mathbb{R}$ with, in particular, $\bar{\varphi}_\nu(t,\bar{p}_\nu) = \bar{p}_\nu$, when $p_0 = \bar{p}_\nu$ is one of the equilibria $0, \pm\sqrt{\nu}$ of the $p$-equation. In fact, $(\bar{\varphi}_\nu(t,p_0), p_\nu(t,p_0)) \in A_\nu$ for all $t \in \mathbb{R}$ and the heteroclinic trajectories in the global attractor $A_\nu$ are the curves

$$p_0 \mapsto \Phi_\nu(p_0), \quad p_0 \in P_\nu = [-\sqrt{\nu}, \sqrt{\nu}],$$

where $\Phi_\nu(p_0) := \bar{\varphi}_\nu(0,p_0)$. In this case we can solve the Bernoulli equation for $p$ explicitly (cf. (10)) to obtain

$$\Phi_\nu(p_0) = \int_{-\infty}^{0} \frac{\sqrt{\nu}\, e^{(\nu+1)s}}{\sqrt{e^{2\nu s} + \left(\dfrac{\nu}{p_0^2} - 1\right)}}\, ds, \quad 0 < p_0 \le \sqrt{\nu}.$$

The analysis is similar in the other two examples, so we will not present the details (which are somewhat more complicated) here.

3.3. Skew product flow formalism of nonautonomous systems

Following [Kloeden et al., 1999] (see also [Berger & Siegmund, 2003; Cheban et al., 2002; Grüne & Kloeden, 2001; Kloeden & Kozyakin, 2001; Kloeden & Stonier, 1998; Langa et al., 2002; Li & Kloeden, 2004b; Wang et al., 2004; Wiggins, 2003]) we define a nonautonomous dynamical system $(\theta,\varphi)$, abbreviated NDS, in terms of a cocycle mapping $\varphi$ on a state space $X$ (a metric space) which is driven by an autonomous dynamical system $\theta$ acting on a base or parameter space $P$ (also a metric space). Specifically, $\theta = \{\theta_t : t \in \mathbb{R}\}$ is an autonomous dynamical system on $P$, i.e. a group of homeomorphisms under composition on $P$ with the properties that

1. $\theta_0(p) = p$ for all $p \in P$;
2. $\theta_{s+t}(p) = \theta_s(\theta_t(p))$ for all $s, t \in \mathbb{R}$;
3. the mapping $(t,p) \mapsto \theta_t(p)$ is continuous,

and the cocycle mapping $\varphi : \mathbb{R}^+ \times P \times X \to X$ satisfies

1. $\varphi(0,p,x) = x$ for all $(p,x) \in P \times X$;
2. $\varphi(s+t,p,x) = \varphi(s,\theta_t(p),\varphi(t,p,x))$ for all $s, t \in \mathbb{R}^+$, $(p,x) \in P \times X$;
3. the mapping $(t,p,x) \mapsto \varphi(t,p,x)$ is continuous.

Then the mapping $\pi : \mathbb{R}^+ \times Y \to Y$ defined by

$$\pi(t,(p,x)) := (\theta_t(p), \varphi(t,p,x))$$

forms an autonomous semidynamical system on $Y = P \times X$ over $\mathbb{R}^+$, which is called the skew product flow associated with the nonautonomous dynamical system $(\theta,\varphi)$.

Triangular autonomous systems (13) are special cases of skew product flows. In general, the driving system is not generated by an autonomous differential equation, so the autonomous global attractor of the skew product flow may not always be physically meaningful or, at least, not as physically meaningful as the dynamics in the state space variable $x$. For example, consider the nonautonomous Bernoulli differential equation

$$\frac{dx}{dt} = \nu x - b(t)\, x^3 \tag{18}$$

where $b : \mathbb{R} \to \mathbb{R}$ is an almost periodic function [Fink, 1974; Sell, 1971], which, in general, will not be the solution of an autonomous differential equation. However, we can use a construction of Bebutov [1940] to formulate changes in $b$ as an autonomous dynamical system, namely for the shift operators $\theta_t b(\cdot) := b(t + \cdot)$ for all $t \in \mathbb{R}$ which determine an autonomous dynamical system on the space

$$P := \operatorname{cl}\{b(t + \cdot) : t \in \mathbb{R}\},$$

where the closure is with respect to the norm $\|f\|_\infty := \sup_{t \in \mathbb{R}} |f(t)|$. Since $b$ is almost periodic, $P \subset C(\mathbb{R},\mathbb{R})$ is a compact metric space with the metric corresponding to this norm (however, if $b$ is only bounded we need the weak* topology for $P$ to be a compact metric space). The $x$ variable may be a physical quantity such as a chemical concentration or population density, but it is not clear how we should interpret a point $(p,x)$ in the global attractor $A_\nu \subset P \times \mathbb{R}$ of the corresponding skew product flow.$^1$

We note that the quasi-periodically forced linear systems of [Grebogi et al., 1984] (see also [Glendinning, 2004]) with their intriguing nonchaotic strange attractors can also be formulated as nonautonomous dynamical systems in this way.

The global attractor $\mathfrak{A}$ (if it exists) of a skew product flow $\pi = (\theta,\varphi)$ has the form [Cheban et al., 2002]

$$\mathfrak{A} = \bigcup_{p \in P^*} (\{p\} \times A_p),$$

where $P^*$ is the global attractor of the driving system $\theta$, thus a nonempty compact invariant

$^1$ A similar situation occurs for random dynamical systems [Arnold, 1998; Ashwin & Ochs, 2003], as when the function $b$ here is a stochastic process. Then we write $b(t,\omega) = b(\theta_t(\omega))$ where $\theta_t : \Omega \to \Omega$ is a metric dynamical system, i.e. with $\omega \mapsto \theta_t(\omega)$ measurable rather than continuous. Here $\Omega$ is the sample space in an appropriately chosen probability space $(\Omega, \mathcal{F}, \mathbb{P})$. The physical interpretation of a point $(\omega, x)$ is also not obvious here.

Fig. 9. The cocycle property.
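
The cocycle and pullback attraction properties can be checked on a toy nonautonomous dynamical system. The Python sketch below is a hypothetical example that does not appear in the article: the base space is $P = \mathbb{R}$ with the shift $\theta_t(p) = p + t$, and the state equation is $dx/dt = -x + \cos(\theta_t(p))$, whose solution has the closed form $\varphi(t,p,x) = (x - g(p))\,e^{-t} + g(p + t)$ with $g(q) = (\cos q + \sin q)/2$. All names are illustrative.

```python
import math


def theta(t, p):
    """Driving system on P = R: the shift theta_t(p) = p + t (assumed example)."""
    return p + t


def g(q):
    """Bounded entire solution of dx/dt = -x + cos(q) along the shift."""
    return 0.5 * (math.cos(q) + math.sin(q))


def cocycle(t, p, x):
    """phi(t, p, x): state at time t >= 0 of dx/dt = -x + cos(theta_t(p)), x(0) = x."""
    return (x - g(p)) * math.exp(-t) + g(theta(t, p))


def skew_product(t, point):
    """pi(t, (p, x)) = (theta_t(p), phi(t, p, x))."""
    p, x = point
    return (theta(t, p), cocycle(t, p, x))
```

In this toy system `cocycle(s + t, p, x)` agrees with `cocycle(s, theta(t, p), cocycle(t, p, x))` up to rounding, and `cocycle(t, theta(-t, p), x)` approaches `g(p)` as `t` grows, so the singleton sets $A_p = \{g(p)\}$ play the role of a pullback attractor. Because the exponential rate here does not depend on $p$, the same family is also forward attracting.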

subset of $P$, and $A_p$ are nonempty compact subsets of $X$ for each $p \in P^*$, which are $\varphi$-invariant in the sense that

$$\varphi(t,p,A_p) = A_{\theta_t(p)} \quad \text{for all } t \ge 0,\ p \in P^* \tag{19}$$

and pullback attracting in the sense that

$$\lim_{t \to \infty} H^*_X(\varphi(t,\theta_{-t}(p),B), A_p) = 0 \tag{20}$$

for every nonempty bounded subset $B$ of $X$ and every $p \in P^*$. However, the $A_p$ sets here need not be forward attracting, i.e. in the sense that

$$\lim_{t \to \infty} H^*_X(\varphi(t,p,B), A_{\theta_t(p)}) = 0 \tag{21}$$

for every nonempty bounded subset $B$ of $X$ and every $p \in P^*$.

This suggests several possible definitions of a nonautonomous attractor of an NDS $(\theta,\varphi)$ when one wishes to focus attention on the state space $X$ and the dynamics therein. A family $\mathcal{A} = \{A_p : p \in P\}$ of nonempty compact subsets of $X$ is said to be a pullback attractor of an NDS $(\theta,\varphi)$ if it is $\varphi$-invariant and pullback attracting as in (19) and (20) on $P$. (Usually one restricts attention to $P = P^*$, which we will do henceforth.) Such a family $\mathcal{A}$ is said to be a forward attractor of the NDS $(\theta,\varphi)$ if it is $\varphi$-invariant and forward attracting as in (19) and (21) on $P$.

The existence of a pullback attractor follows from that of an absorbing set (or family of absorbing sets) and a compactness property of the cocycle mapping [Cheban et al., 2002; Grüne & Kloeden, 2001; Wiggins, 2003]. Obviously, any uniform pullback attractor is also a uniform forward attractor, and vice versa, where uniformity is with respect to $p \in P$, but in general neither of pullback and forward convergence implies the other.

The relationship between pullback and forward attractors and the subset $\mathfrak{A}$ of $P \times X$ defined by $\bigcup_{p \in P} (\{p\} \times A_p)$ with component sets from such pullback or forward attractors and a possible global attractor of the associated autonomous skew product flow $\pi$ is discussed in [Cheban et al., 2002; Wang et al., 2004].

As a simple example, we note that the family of singleton sets $A_p = \{\Phi_\nu(p)\}$ for $p \in P_\nu = [-\sqrt{\nu}, \sqrt{\nu}]$, with $\Phi_\nu(p)$ as in the previous subsection, is a pullback attractor on $X = \mathbb{R}$ for $p \in P_\nu$. In this case, the family is also a forward attractor and the subset $\bigcup_{p \in P_\nu} (\{p\} \times A_p)$ of $P_\nu \times \mathbb{R}$ is the global attractor for the associated skew product flow, i.e. the triangular system (15).

The above skew product formalism is particularly advantageous when the base space $P$ of the driving system is compact, as for example, in almost periodically forced differential equations. However, $P$ need not be compact. In fact, with $P = \mathbb{R}$ and $\theta_t(t_0) := t + t_0$ for all $t$ and $t_0 \in \mathbb{R}$, the above formalism reduces to the process formalism of a nonautonomous dynamical system with $x(t + t_0,t_0,x_0) := \varphi(t,t_0,x_0)$ for all $t \ge 0$, $t_0 \in \mathbb{R}$ and $x_0 \in X$. We have already seen counterparts of the above definitions of pullback or forward attractors for such processes in Sec. 3.1 above. We observe that an attractor does not exist for the skew product flow in this context.

4. A Total Stability Theorem

For the total stability theorem and its application in the next section we consider parametrized nonautonomous differential equations

$$\frac{dx}{dt} = f_\nu(t,x), \quad \nu \in [-\nu^*, \nu^*], \tag{22}$$

where the vector fields $f_\nu : \mathbb{R} \times \mathbb{R}^d \to \mathbb{R}^d$ satisfy the following standard assumptions: The functions $f_\nu$ are continuous in $(t,x) \in \mathbb{R} \times \mathbb{R}^d$, globally Lipschitz in $x \in \mathbb{R}^d$ uniformly in $t \in \mathbb{R}$ with Lipschitz constant $L_\nu$, and satisfy the parametric dependence condition$^2$:

There exists $\omega(\nu)$ for $\nu \in [-\nu^*, \nu^*]$ with $\omega(\nu) \to 0$ as $\nu \to 0$ such that

$$\sup_{t \in \mathbb{R},\, x \in \mathbb{R}^d} |f_\nu(t,x) - f_0(t,x)| \le \omega(\nu) \tag{23}$$

for $\nu \in [-\nu^*, \nu^*]$.

We will denote the solution of (22) with initial value $x(t_0) = x_0$ by $x_\nu(t,t_0,x_0)$. We will also assume that for $\nu = 0$ the system (22) has a uniformly asymptotically stable compact connected set $A_0$ (which may be an attractor, but need not be). Our next theorem on total stability is a consequence of the uniform asymptotic stability of $A_0$, the uniformity being essential here.

Theorem 1. Consider Eq. (22) satisfying the standard assumptions and suppose that there is a nonempty compact set $A_0$ which is uniformly asymptotically stable (possibly only locally) for (22) with $\nu = 0$.

Then there is a $\nu^{**} \in (0, \nu^*]$ such that for each $\nu$ with $0 < |\nu| < \nu^{**}$ there exists a family $\mathcal{A}_\nu = \{A_\nu(t_0), t_0 \in \mathbb{R}\}$ of nonempty compact connected subsets of $\mathbb{R}^d$, which are invariant with respect to (22), i.e.

$$x_\nu(t,t_0,A_\nu(t_0)) = A_\nu(t), \quad t \ge t_0, \tag{24}$$

and converge upper semicontinuously to $A_0$ uniformly in $t \in \mathbb{R}$, i.e.

$$\sup_{t \in \mathbb{R}} H^*_{\mathbb{R}^d}(A_\nu(t), A_0) \to 0 \quad \text{as } \nu \to 0. \tag{25}$$

The family $\mathcal{A}_\nu$ is a pullback attractor for the nonautonomous process or two-parameter semigroup $\{x_\nu(t,t_0,\cdot)\}$ defined by the solutions of the differential equation (22). Note that the set-valued mapping $t \mapsto A_\nu(t)$ is continuous due to the invariance property (24) and the continuity of the process.

The proof of Theorem 1 is given in the Appendix. It is based on the following Lyapunov inequality (see Lemma 3 in the Appendix)

$$V(t, x_\nu(t,t_0,x_0)) \le e^{-(t-t_0)}\, V(t_0,x_0) + K L\, \omega(\nu), \quad 0 \le t - t_0 \le 1, \tag{26}$$

for all $t_0 \in \mathbb{R}$ and $x_0 \in \mathbb{R}^d$, where $V$ is a Lyapunov function with Lipschitz constant $L$ (and $K$ is another positive constant), which characterizes the uniform asymptotic stability of the set $A_0$ for the differential equation (22) with $\nu = 0$. The existence of $V$ is ensured by Theorem 2 in the Appendix. We use the Lyapunov inequality (26) to establish the existence of a family of nonempty compact subsets $\{A^*_\nu(t_0), t_0 \in \mathbb{R}\}$ of $\mathbb{R}^d$, provided $\nu$ is sufficiently small, which is positively invariant, i.e.

$$x_\nu(t,t_0,A^*_\nu(t_0)) \subset A^*_\nu(t), \quad \forall\, t \ge t_0,$$

and absorbing uniformly in $t_0 \in \mathbb{R}$, i.e. for every compact subset $D$ of $\mathbb{R}^d$ there exists $T_{D,\nu} \ge 0$ such that

$$x_\nu(t,t_0,D) \subset A^*_\nu(t), \quad t \ge t_0 + T_{D,\nu}, \quad \forall\, t_0 \in \mathbb{R}.$$

The component sets of the pullback attractor are then determined by

$$A_\nu(t_0) = \bigcap_{\tau \ge 0} x_\nu(t_0, t_0 - \tau, A^*_\nu(t_0 - \tau))$$

for each $t_0 \in \mathbb{R}$. These are pullback attracting in the sense that

$$\operatorname{dist}(x_\nu(t_0, t_0 - \tau, x_0), A_\nu(t_0)) \to 0 \quad \text{as } \tau \to \infty, \quad \forall\, t_0 \in \mathbb{R},\ x_0 \in \mathbb{R}^d, \tag{27}$$

but they need not be forward attracting in the sense that

$$\operatorname{dist}(x_\nu(t,t_0,x_0), A_\nu(t)) \to 0 \quad \text{as } t \to \infty, \quad \forall\, t_0 \in \mathbb{R},\ x_0 \in \mathbb{R}^d. \tag{28}$$

Note that the pullback attractor $\mathcal{A}_\nu$ consists of a single set $A_\nu$ if the differential equation (22) for this value of $\nu$ is autonomous. In this case $x_\nu(t,t_0,x_0) = x_\nu(t - t_0, 0, x_0)$, so pullback and forward convergences are equivalent. For a detailed discussion of the relationship between pullback and forward convergences see [Cheban et al., 2002; Wang et al., 2004].

The upper semicontinuous convergence (25) follows from the fact that $A_\nu(t_0) \subset A^*_\nu(t_0)$ and the

Unlike the uniformity in $x$ here (see Remark 1), the uniformity in $t \in \mathbb{R}$ is a strong restriction but is essential for total stability. However, it does hold for almost periodic functions.
Bifurcations and Continuous Transitions of Attractors 59

construction of the $\Lambda^*_\nu(t_0)$ sets, which leads to

$$H^*_{\mathbb{R}^d}(A_\nu(t_0), A_0) \le H^*_{\mathbb{R}^d}(\Lambda^*_\nu(t_0), A_0) \to 0 \quad \text{as } \nu \to 0$$

for all $t_0 \in \mathbb{R}$. In general, it cannot be strengthened to continuous convergence (i.e. with $H^*_{\mathbb{R}^d}$ replaced by the Hausdorff metric $H_{\mathbb{R}^d}$). A simple counterexample is given by the example of a subcritical bifurcation in the autonomous differential equation (2) at $\nu = 0$ (i.e. where the nonlocal equilibria first arise). As mentioned above, equiattraction ensures continuous convergence of attractors in the autonomous case. An analogous result also holds in the nonautonomous case [Li & Kloeden, 2004b].
Under suitable assumptions (see [Wang et al., 2004]) it can be shown that the set-valued mapping $t \mapsto A_\nu(t)$ is periodic, respectively almost periodic, when the functions $f_\nu(t, x)$ are so. In particular, this holds for the Bernoulli equation (7) when the function $b$ is periodic, respectively almost periodic, for which $A_\nu(t)$ are given by (8).

Remark 2. It is possible to generalize the above theorem to assume that the system has a uniformly Lyapunov asymptotically stable (in the forward sense) family of nonempty compact invariant sets $\{A(t) : t \in \mathbb{R}\}$ instead of the attracting set $A_0$ (see [Yoshizawa, 1966]) or to assume that the right-hand side of the differential equation (22) is of the form $f(x, p)$ with a uniform pullback attractor $\{A_p : p \in P\}$ with $P$ compact [Kloeden & Kozyakin, 2001].

Remark 3. The following example due to Li Desheng (private communication) shows that the perturbed attracting set need not be globally attracting even when the unperturbed system is globally attracting.
The autonomous scalar ordinary differential equation $x'(t) = -x e^{-x^2/2} + \nu$ has a global attractor $A_0 = \{0\}$ when $\nu = 0$. For $0 < \nu^2 < e^{-1}$, there are two equilibria with the one closer to zero being locally asymptotically stable and thus forming a singleton set local attractor. The other equilibrium is unstable and coalesces with the first as $\nu^2$ approaches $e^{-1}$ from below, then both equilibria disappear for $\nu^2 > e^{-1}$. This example also shows that the parameter interval in which the perturbed attractor exists can be very small, e.g. when we apply the total stability theorem to the above differential equation with a $\nu$ value such that $e^{-1/2} - \nu > 0$ is very small.

Fig. 10. Graph of $f(x) = -x e^{-x^2/2}$.

5. Applications of the Total Stability Theorem

As a first comment, if only to emphasize the obvious, we mention that uniform asymptotic stability is concerned solely with what happens outside of the set $A_0$ and says nothing about what may happen inside of $A_0$. Indeed the Lyapunov function vanishes on $A_0$. Nevertheless the internal dynamics of $A_0$ may have a significant effect as the above example of a saddle-node bifurcation shows. The importance of the total stability theorem is that it ensures the existence of nearby attracting objects in perturbed systems. The theorem of [Li & Kloeden, 2004a] (see [Li & Kloeden, 2004b] for nonautonomous dynamical systems) then says that the convergence of these perturbed attractors is in fact continuous rather than just upper semicontinuous when the attractors are equiattracting.
For a supercritical bifurcation one needs to assume more about the attractor $A_0$ of the reference system and the perturbed systems: for example, that the differential equations (22) with $f_\nu$ satisfy, in addition to the standard assumptions, the following properties:

1. $f_\nu(t, 0) = 0$ for all $t \in \mathbb{R}$ and all $\nu \in [-\nu^*, \nu^*]$;
2. the equilibrium solution $x_\nu(t) = 0$ of (22) is stable for $\nu < 0$ and unstable for $\nu > 0$ for the linearized system;
3. the equilibrium solution $x_0(t) = 0$ of (22) with $\nu = 0$ is uniformly asymptotically stable.

Thus Theorem 1 is applicable and we get $A_0 = \{0\}$ with $0 \in A_\nu(t_0) \ne \{0\}$ for $\nu > 0$, so

$$H_{\mathbb{R}^d}(A_\nu(t_0), \{0\}) = H^*_{\mathbb{R}^d}(A_\nu(t_0), \{0\}) \to 0 \quad \text{as } \nu \to 0$$

for all $t_0 \in \mathbb{R}$. In this case we have a continuous bifurcation at $\nu = 0$. The supercritical bifurcations of the autonomous differential equation (1) and the
60 P. E. Kloeden & S. Siegmund

nonautonomous differential equation (7) are exam- bifurcations. More complicated types of bifurca-
ples of this result. We note that a similar analysis is tions, e.g. homoclinic bifurcations, certainly might
possible if the equilibrium solution 0 here is replaced occur inside autonomous attractors and their
by a periodic or almost periodic solution. nonautonomous counterparts. On the other hand,
Two features are required here for a supercrit- it is known that a bifurcation at infinity may com-
ical bifurcation: pletely destroy a pullback attractor [Kloeden h
Kozyakin, 2001]. Less dramatically, as we have seen
(i) the bifurcating solution is uniformly asymptot- in the autonomous differential equation x'(t) =
ically stable for the nonlinear system at the —xe~x I2 -\-v above (see Remark 3), the bifurcation
parameter value where it loses linear stability, at infinity for v = 0 destroys the global attractiv-
and ity of the perturbed attractor but not the attractor
(ii) the bifurcating solution (or at least a contin- itself.
uation of it) exists and is unstable after the In all of our examples the bifurcation parame-
bifurcation point. ter appears in an autonomous linear part of the dif-
ferential equation. Johnson et al. [2002] considered
The first of these ensures the existence of an
the analog of Hopf bifurcations for equations where
attractor after the bifurcation point and the second
the bifurcation parameter has a time-dependent
that these attractors contain something more than
coefficient, as e.g. in the Duffing-van der Pol equa-
just the continuation of the bifurcating solution and
tion with almost periodic coefficient b(t)
thus the occurrence of not just a bifurcation, but a
supercritical bifurcation.
Usually the supercriticality of a bifurcation is dt \y) " \-a + ab(t) (3) \y J
determined by the sign of a coefficient in an expan-
sion of the new solution in what is essentially a nor-
mal form expression [Glendinning, 1994]. As can \x2y + x3)'
be seen from the examples in [Glendinning, 1994;
Marsden & McCracken, 1976], such expressions are The bifurcation then appears to occur in two
difficult enough to determine for specific examples stages, perhaps because the corresponding Sacker-
of codimension-one bifurcations and thus cannot be Sell spectrum consists typically of intervals rather
expected to be any easier for higher codimensions. than single points, with the bifurcation being only
The nonlinear asymptotic stability of the bifurcat- complete when the whole spectral interval has
ing solution at the point of loss of linear stability crossed over into the positive part of the real
provides an alternative test for supercriticality in line (see [Siegmund, 2002a] for spectral theory and
these cases. [Siegmund, 2001, 2002b] for a nonautonomous nor-
As a final point, we note that we are not mal form of (29)). Should one consider such bifurca-
restricted here to applying the total stability the- tions as the authentic nonautonomous bifurcations
orem to a global or maximal attractor, but could and the ones that we have focussed on in this article
equally well apply it locally to the bifurcation of only as special cases? In any case, the above discus-
an equilibrium solution, say, inside such an attrac- sion on total stability and supercriticality remains
tor. This would verify the continued existence of the valid here too.
solution or a continuation of it of some form after a Our autonomous triangular systems were all
bifurcation and the existence of other nearby min- Morse-Smale systems. They provided us with
imal solutions if the continued solution is unstable examples of nonautonomous systems with particu-
after the bifurcation point. larly robust attractors for the case that the driving
systems were generated by differential equations.
However, we also considered examples of nonau-
6. Concluding Remarks and Questions tonomous systems for which the driving system was
We have restricted our attention here to well-known not generated by a differential equation, but by, say,
basic bifurcations, partly because these already the shift operator on the hull of an almost periodic
illustrate our ideas clearly in the autonomous function. What then is the counterpart of a Morse-
case and partly because investigations of bifur- Smale system in such situations?
cations in the general nonautonomous case have In our examples, a discontinuous transition
not progressed much beyond these elementary of an attractor was always associated with a

Fig. 11. Two equilibria at $\nu_0$ and one equilibrium for $\nu < \nu_0$. (Panels: $y = -x^3 + x + 2\sqrt{3}/9$, and $y = -x^3 + x + 4\sqrt{3}/9 - \nu$ with $\nu < \nu_0$.)

subcritical bifurcation within the attractor. Is this the only cause of a discontinuous transition? Do such discontinuous transitions only occur at isolated parameter values? In fact, a subcritical bifurcation is not necessary but rather the sudden vanishing or rising of nonlocal equilibria (or other minimal invariant sets), i.e. which exist only on one side of a critical parameter value. This can be seen in Example 3.2 in [Li & Kloeden, 2004a] for the scalar equation

$$\dot{x} = -x^3 + x + 4\sqrt{3}/9 - \nu$$

with $\nu \in [0, \nu_0]$ where $\nu_0 := 2\sqrt{3}/9$. For $\nu = \nu_0$ the equation has two distinct equilibria $x_0^-$ and $x_0^+$ and for $\nu < \nu_0$ one equilibrium $x_\nu$ (see Fig. 11).
Note that the equilibrium near 1 is asymptotically stable for all $\nu$. In particular, there is no subcritical bifurcation.

References

Arnold, L. [1998] Random Dynamical Systems (Springer-Verlag, Heidelberg).
Arrowsmith, D. K. & Place, C. M. [1990] An Introduction to Dynamical Systems (Cambridge University Press, Cambridge).
Ashwin, P. & Ochs, G. [2003] "Convergence to local random attractors," Dyn. Syst. 18, 139-158.
Bebutov, M. V. [1940] "Sur les systemes dynamiques dans l'espace des fonctions continues," Doklady Akad. Nauk SSSR 27, 904-906.
Berger, A. & Siegmund, S. [2003] "On the gap between random dynamical systems and continuous skew products," J. Dyn. Diff. Eqs. 15, 237-279.
Caraballo, T., Kloeden, P. E. & Langa, J. [2003] "Atractores globales para sistemas diferenciales no autonomos," Cubo Matemática Educacional 5, 305-329.
Cheban, D., Kloeden, P. E. & Schmalfuß, B. [2002] "The relationship between pullback, forward and global attractors of nonautonomous dynamical systems," Nonlin. Dyn. Syst. Th. 2, 9-28.
Dafermos, C. M. [1971] "An invariance principle for compact processes," J. Diff. Eqs. 9, 239-252.
Fink, A. M. [1974] Almost Periodic Differential Equations, Springer Lecture Notes in Mathematics, Vol. 377 (Springer-Verlag, Heidelberg).
Glendinning, P. [1994] Stability, Instability and Chaos (Cambridge University Press, Cambridge).
Glendinning, P. [2004] "The non-smooth pitchfork bifurcation," Discr. Contin. Dyn. Syst. Ser. B 4, 457-464.
Grebogi, C., Ott, E., Pelikan, S. & Yorke, J. A. [1984] "Strange attractors that are not chaotic," Physica D 13, 261-268.
Grüne, L. & Kloeden, P. E. [2001] "Discretization, inflation and perturbation of attractors," in Ergodic Theory: Analysis and Efficient Simulation of Dynamical Systems, ed. Fiedler, B. (Springer-Verlag), pp. 399-416.
Hale, J. [1988] Asymptotic Behavior of Dissipative Dynamical Systems (Amer. Math. Soc., Providence).
Johnson, R. A., Kloeden, P. E. & Pavani, R. [2002] "Two-step transition in nonautonomous bifurcations: An explanation," Stoch. Dyn. 2, 67-92.
Kloeden, P. E. & Lorenz, J. [1986] "Stable attracting sets in dynamical systems and in their one-step discretizations," SIAM J. Numer. Anal. 23, 986-995.
Kloeden, P. E. & Stonier, D. J. [1998] "Cocycle attractors in nonautonomously perturbed differential equations," Dyn. Contin. Discr. Impuls. Syst. 4, 211-226.

Kloeden, P. E., Keller, H. & Schmalfuß, B. [1999] "Towards a theory of random numerical dynamics," in Stochastic Dynamics, eds. Crauel, H. & Gundlach, V. M. (Springer-Verlag, Heidelberg), pp. 259-282.
Kloeden, P. E. & Kozyakin, V. S. [2000] "The inflation of attractors and discretization: The autonomous case," Nonlin. Anal. TMA 40, 333-343.
Kloeden, P. E. & Kozyakin, V. S. [2001] "The perturbation of attractors of skew-product flows with a shadowing driving system," Discr. Contin. Dyn. Syst. 7, 883-893.
Kloeden, P. E. [2004] "Pitchfork and transcritical bifurcations in systems with homogeneous nonlinearities and an almost periodic time coefficient," Commun. Pure Appl. Anal. 3, 161-173.
Kloeden, P. E. & Kozyakin, V. S. [2004] "Uniform nonautonomous attractors under discretization," Discr. Contin. Dyn. Syst. 10, 423-433.
Koksch, N. & Siegmund, S. [2002] "Pullback attracting inertial manifolds for nonautonomous dynamical systems," J. Dyn. Diff. Eqs. 14, 889-941.
Krasnosel'skii, M. A., Burd, V. Sh. & Kolesov, Yu. S. [1973] Nonlinear Almost Periodic Solutions (John Wiley & Sons, NY).
Langa, J. A., Robinson, J. C. & Suarez, A. [2002] "Stability, instability and bifurcation phenomena in non-autonomous differential equations," Nonlinearity 15, 887-903.
Langa, J. A. & Suarez, A. [2002] "Bifurcation phenomena for a non autonomous logistic equation," Electron. J. Diff. Eqs. 72, 1-20.
Langa, J. A., Robinson, J. C. & Suarez, A. [2004] "Bifurcations in non-autonomous scalar equations," submitted.
Li Desheng & Kloeden, P. E. [2004a] "Equi-attraction and the continuous dependence of attractors on parameters," Glasgow Math. J. 46, 131-141.
Li Desheng & Kloeden, P. E. [2004b] "Equi-attraction and the continuous dependence of pullback attractors on parameters," Stoch. Dyn. 4, 373-384.
Marsden, J. & McCracken, M. [1976] The Hopf Bifurcation and its Applications (Springer-Verlag, NY).
Reitmann, V. [1996] Reguläre und Chaotische Dynamik (B.G. Teubner, Stuttgart).
Sell, G. R. [1971] Lectures on Topological Dynamics and Differential Equations (Van Nostrand-Reinhold, London).
Siegmund, S. [2001] "Normal form of Duffing-van der Pol oscillator under nonautonomous parametric perturbations," Discr. Contin. Dyn. Syst., 357-361, Kennesaw conference issue available from http://AIMSciences.org/
Siegmund, S. [2002a] "Dichotomy spectrum for nonautonomous differential equations," J. Dyn. Diff. Eqs. 14, 243-258.
Siegmund, S. [2002b] "Normal forms for nonautonomous differential equations," J. Diff. Eqs. 178, 541-573.
Wang Yejuan, Li Desheng & Kloeden, P. E. [2004] "Uniform attractors of almost periodic non-autonomous dynamical systems," Nonlin. Anal. TMA 59, 35-53.
Wiggins, S. [2003] Introduction to Applied Nonlinear Dynamical Systems and Chaos, 2nd edition (Springer-Verlag, Heidelberg).
Yoshizawa, T. [1966] Stability Theory by Lyapunov's Second Method (Mathematical Society of Japan, Tokyo).

A. Appendix: Proof of Theorem 1

The following theorem, based on Theorem 22.5 in [Yoshizawa, 1966], provides the existence of a Lyapunov function which characterizes the uniform asymptotical stability of a globally uniformly asymptotically stable compact set $A_0$ of a nonautonomous differential equation

$$\frac{dx}{dt} = f(t, x). \tag{A.1}$$

Theorem 2. Suppose that $f : \mathbb{R} \times \mathbb{R}^d \to \mathbb{R}^d$ in (A.1) is continuous in $(t, x)$ and globally Lipschitz in $x \in \mathbb{R}^d$ uniformly in $t \in \mathbb{R}$ and suppose that (A.1) has a globally uniformly asymptotically stable compact set $A_0$.
Then there exists a Lyapunov function $V : \mathbb{R} \times \mathbb{R}^d \to [0, \infty)$ for which:

1. $V$ is globally Lipschitz in $x \in \mathbb{R}^d$ uniformly in $t \in \mathbb{R}$, i.e. there exists a constant $L > 0$ such that

$$|V(t, x) - V(t, y)| \le L |x - y| \quad \text{for all } x, y \in \mathbb{R}^d,\ t \in \mathbb{R}; \tag{A.2}$$

2. there exist continuous strictly increasing functions $\alpha, \beta : \mathbb{R}^+ \to [0, \infty)$ with $\alpha(0) = \beta(0) = 0$ and $0 < \alpha(r) < \beta(r)$ for all $r > 0$ such that

$$\alpha(\operatorname{dist}(x, A_0)) \le V(t, x) \le \beta(\operatorname{dist}(x, A_0)) \quad \text{for all } x \in \mathbb{R}^d;$$

3. $V$ decreases exponentially fast along trajectories of (A.1) uniformly in $t_0 \in \mathbb{R}$, i.e. we have for $t_0 \in \mathbb{R}$, $x_0 \in \mathbb{R}^d$

$$V(t, x(t, t_0, x_0)) \le e^{-(t - t_0)} V(t_0, x_0) \quad \text{for all } t \ge t_0. \tag{A.3}$$
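The three properties of Theorem 2 can be checked by hand in the simplest setting. The following sketch assumes an illustrative equation that is not taken from the text, namely the scalar ODE $x' = -x$ with $A_0 = \{0\}$, and the candidate function $V(t, x) = |x|$ with $L = 1$, $\alpha(r) = r/2$ and $\beta(r) = 2r$ (all hypothetical choices). It only spot-checks (A.2), the two-sided bound and (A.3) on sample points, so it is a sanity check under these assumptions rather than a proof.

```python
import math

L = 1.0  # assumed Lipschitz constant of V in (A.2)


def solution(t, t0, x0):
    """Closed-form solution of the illustrative ODE x' = -x with x(t0) = x0."""
    return x0 * math.exp(-(t - t0))


def V(t, x):
    """Candidate Lyapunov function; it happens to be independent of t here."""
    return abs(x)


def alpha(r):
    """Assumed lower comparison function, alpha(0) = 0, strictly increasing."""
    return 0.5 * r


def beta(r):
    """Assumed upper comparison function, beta(0) = 0, strictly increasing."""
    return 2.0 * r


def check_theorem2_properties(points, times, t0=0.0, tol=1e-12):
    """Spot-check properties 1-3 of Theorem 2 on finitely many samples."""
    for x in points:
        for y in points:
            # Property 1, inequality (A.2): Lipschitz in x.
            assert abs(V(t0, x) - V(t0, y)) <= L * abs(x - y) + tol
        # Property 2: dist(x, A_0) = |x| because A_0 = {0}.
        d = abs(x)
        assert alpha(d) <= V(t0, x) <= beta(d)
        for t in times:
            # Property 3, inequality (A.3): exponential decay along solutions.
            lhs = V(t, solution(t, t0, x))
            assert lhs <= math.exp(-(t - t0)) * V(t0, x) + tol
    return True
```

For this example (A.3) holds with equality, since $V(t, x(t, t_0, x_0)) = |x_0| e^{-(t - t_0)}$.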

In fact, if differential equation (A.1) has a uniformly asymptotically stable equilibrium solution and the function $f$ in (A.1) is almost periodic uniformly in $x \in B[0; R]$ for each $R > 0$ (respectively, periodic or autonomous), then, from Theorem 19.8 of [Yoshizawa, 1966], the Lyapunov function $V$ can be chosen to be almost periodic in $t$ (respectively, periodic or autonomous).
The following Lyapunov inequality, which is similar to inequalities in [Kloeden & Kozyakin, 2000; Kloeden & Lorenz, 1986], will be one of the key tools in the proof of Theorem 1.

Lemma 3. Under the assumptions of Theorem 1 there is a Lyapunov function $V$ with

$$V(t, x_\nu(t, t_0, x_0)) \le e^{-(t - t_0)} V(t_0, x_0) + K L \omega(\nu), \quad 0 \le t - t_0 \le 1, \tag{A.4}$$

for the solution $x_\nu(t, t_0, x_0)$ with initial value $x_\nu(t_0, t_0, x_0) = x_0$ of the differential equation (22) with parameter $\nu \ne 0$ for any $t_0 \in \mathbb{R}$ and $x_0 \in \mathbb{R}^d$.

Proof. Apply Theorem 2 to (22) for $\nu = 0$. Using the Lipschitz property (A.2) and the exponential decay inequality (A.3) of the Lyapunov function $V$, we obtain

$$V(t, x_\nu(t, t_0, x_0)) \le V(t, x_0(t, t_0, x_0)) + |V(t, x_\nu(t, t_0, x_0)) - V(t, x_0(t, t_0, x_0))| \le e^{-(t - t_0)} V(t_0, x_0) + L |x_\nu(t, t_0, x_0) - x_0(t, t_0, x_0)|.$$

To estimate $|x_\nu(t, t_0, x_0) - x_0(t, t_0, x_0)| =: |x_\nu(t) - x_0(t)|$ for $0 \le t - t_0 \le 1$, we use the integral equation representation of the differential equation (22), the Lipschitz constant $L_0$ of (22) for $\nu = 0$ and the parametric dependence condition (23) to get

$$\begin{aligned}
|x_\nu(t) - x_0(t)| &= \left| x_0 + \int_{t_0}^{t} f_\nu(s, x_\nu(s))\, ds - x_0 - \int_{t_0}^{t} f_0(s, x_0(s))\, ds \right| \\
&\le \int_{t_0}^{t} |f_\nu(s, x_\nu(s)) - f_0(s, x_0(s))|\, ds \\
&\le \int_{t_0}^{t} |f_\nu(s, x_\nu(s)) - f_0(s, x_\nu(s))|\, ds + \int_{t_0}^{t} |f_0(s, x_\nu(s)) - f_0(s, x_0(s))|\, ds \\
&\le \int_{t_0}^{t} \omega(\nu)\, ds + L_0 \int_{t_0}^{t} |x_\nu(s) - x_0(s)|\, ds.
\end{aligned}$$

Thus we have

$$|x_\nu(t) - x_0(t)| \le \omega(\nu)(t - t_0) + L_0 \int_{t_0}^{t} |x_\nu(s) - x_0(s)|\, ds$$

for $t - t_0 \ge 0$. Hence by the Gronwall inequality we obtain

$$|x_\nu(t) - x_0(t)| \le e^{L_0 (t - t_0)} \omega(\nu)(t - t_0)$$

for any $t - t_0 \ge 0$. If we restrict to $0 \le t - t_0 \le 1$, then we have

$$|x_\nu(t) - x_0(t)| \le e^{L_0} \omega(\nu) =: K \omega(\nu),$$

proving the lemma. □

A.1. Existence of a positively invariant family of absorbing sets for the perturbed dynamics

Since $\omega(\nu) \to 0$ as $\nu \to 0$ (see (23)), we can choose $\nu^{**} \in (0, \nu^*]$ such that for each $\nu$ with $0 < |\nu| < \nu^{**}$ we have $\omega(\nu) < ((e - 1)/(Le))^2$. Then $\Delta_\nu := \ln\left(1/[1 - L\sqrt{\omega(\nu)}]\right) < 1$ and

$$1 - e^{-\Delta_\nu} = L \sqrt{\omega(\nu)} \quad \text{and} \quad \tfrac{1}{2}\left(1 + e^{-\Delta_\nu}\right) \le e^{-\frac{1}{4}\Delta_\nu} \tag{A.5}$$

(the reason for the last inequality will become apparent in the proofs of Lemmas 5 and 6, cf. Lemma 3.4 of [Kloeden & Lorenz, 1986]), and

$$\eta(\nu) := 2K\sqrt{\omega(\nu)} \to 0^+ \quad \text{and} \quad \Delta_\nu \to 0^+ \quad \text{as } \nu \to 0.$$

Then define

$$\Lambda_\nu(t_0) := \{x \in \mathbb{R}^d : V(t_0, x) \le \eta(\nu)\}$$

for each $t_0 \in \mathbb{R}$.

Lemma 4. $\Lambda_\nu(t_0)$ is a nonempty compact subset of $\mathbb{R}^d$ for each $t_0 \in \mathbb{R}$ with

$$H^*_{\mathbb{R}^d}(\Lambda_\nu(t_0), A_0) \le \alpha^{-1}(\eta(\nu)) \to 0 \quad \text{as } \nu \to 0. \tag{A.6}$$
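The interplay of the constants $K$, $\Delta_\nu$ and $\eta(\nu)$ is easier to follow with numbers. The sketch below assumes hypothetical values for the Lipschitz constants $L$, $L_0$ and for the perturbation size $\omega(\nu)$ (none of them come from the text), evaluates the definitions above, and exposes the right-hand side of (A.4) after one step of length $\Delta_\nu$. The function names are made up for this illustration.

```python
import math


def constants(L, L0, omega):
    """Return (K, Delta, eta) from the definitions preceding Lemma 4.

    L     : Lipschitz constant of V from (A.2)          (assumed value)
    L0    : Lipschitz constant of (22) for nu = 0       (assumed value)
    omega : size omega(nu) of the perturbation in (23)  (assumed value)
    """
    bound = ((math.e - 1.0) / (L * math.e)) ** 2
    if not 0.0 < omega < bound:
        raise ValueError("need 0 < omega < ((e - 1)/(L e))^2")
    K = math.exp(L0)                   # K := e^{L0}, from the proof of Lemma 3
    s = L * math.sqrt(omega)           # s = 1 - exp(-Delta), first part of (A.5)
    Delta = math.log(1.0 / (1.0 - s))  # step length Delta_nu
    eta = 2.0 * K * math.sqrt(omega)   # sublevel height eta(nu)
    return K, Delta, eta


def one_step_bound(v0, L, L0, omega):
    """Right-hand side of (A.4) at t - t0 = Delta for V(t0, x0) = v0."""
    K, Delta, _ = constants(L, L0, omega)
    return math.exp(-Delta) * v0 + K * L * omega
```

With, say, `L = 2.0`, `L0 = 1.0` and `omega = 1e-4`, one finds $\Delta_\nu < 1$, the second inequality in (A.5) holds, and `one_step_bound(eta, ...)` equals $\tfrac{1}{2}(1 + e^{-\Delta_\nu})\,\eta(\nu)$, which is strictly below $\eta(\nu)$. This is the mechanism that keeps the sublevel sets $\Lambda_\nu$ positively invariant over steps of length $\Delta_\nu$.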

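Proof. Since $V(t_0, x) = 0$ for $x \in A_0$, we have $A_0 \subset \Lambda_\nu(t_0)$, hence $\Lambda_\nu(t_0)$ is nonempty. It is compact by the continuity of $x \mapsto V(t_0, x)$ and the fact that $\Lambda_\nu(t_0) = V(t_0, \cdot)^{-1}([0, \eta(\nu)])$. The inequality (A.6) follows from the inequalities

$$\alpha(\operatorname{dist}(x, A_0)) \le V(t_0, x) \le \eta(\nu) \quad \text{for all } x \in \Lambda_\nu(t_0). \qquad \Box$$

The family of sets $\{\Lambda_\nu(t_0), t_0 \in \mathbb{R}\}$ is positively invariant with respect to the discrete time process $x_\nu(t_0 + n\Delta_\nu, t_0, x_0)$, in the sense that

Lemma 5. $x_\nu(t_0 + n\Delta_\nu, t_0, \Lambda_\nu(t_0)) \subset \Lambda_\nu(t_0 + n\Delta_\nu)$ for all $n \ge 0$, $t_0 \in \mathbb{R}$.

Proof. It suffices to consider the case $n = 1$. Take any $x_0 \in \Lambda_\nu(t_0)$. Then $V(t_0, x_0) \le \eta(\nu)$. By the key Lyapunov inequality (A.4), the definition of $\eta(\nu)$ and (A.5) we have

$$\begin{aligned}
V(t_0 + \Delta_\nu, x_\nu(t_0 + \Delta_\nu, t_0, x_0)) &\le e^{-\Delta_\nu} V(t_0, x_0) + K L \omega(\nu) \\
&\le e^{-\Delta_\nu} \eta(\nu) + \tfrac{1}{2}\left(1 - e^{-\Delta_\nu}\right) \eta(\nu) \\
&= \tfrac{1}{2}\left(1 + e^{-\Delta_\nu}\right) \eta(\nu) \\
&< \eta(\nu),
\end{aligned}$$

so $x_\nu(t_0 + \Delta_\nu, t_0, x_0) \in \Lambda_\nu(t_0 + \Delta_\nu)$. □

The family of sets $\{\Lambda_\nu(t_0), t_0 \in \mathbb{R}\}$ is in fact absorbing for the discrete time process $x_\nu(t_0 + n\Delta_\nu, t_0, x_0)$ uniformly in $t_0 \in \mathbb{R}$, provided $\nu$ (and hence $\Delta_\nu$) is sufficiently small.

Lemma 6. For each $\nu$ such that $0 < |\nu| < \nu^{**}$ and each compact subset $D$ of $\mathbb{R}^d$ there exists an integer $N_{D,\nu} \ge 0$, for which

$$x_\nu(t_0 + n\Delta_\nu, t_0, x_0) \in \Lambda_\nu(t_0 + n\Delta_\nu)$$

for all $n \ge N_{D,\nu}$, $x_0 \in D$ and $t_0 \in \mathbb{R}$.

Proof. Choose $x_0$ in a compact subset $D$ of $\mathbb{R}^d$, let us write $t_n = t_0 + n\Delta_\nu$, $x_n = x_\nu(t_0 + n\Delta_\nu, t_0, x_0)$. If $x_0 \in \Lambda_\nu(t_0)$ we have nothing to prove. Now assume that $x_0 \notin \Lambda_\nu(t_0)$. Then, by the Lyapunov inequality (A.4) and the definition of $\eta(\nu)$ we have

$$\begin{aligned}
V(t_1, x_1) &\le e^{-\Delta_\nu} V(t_0, x_0) + K L \omega(\nu) \\
&= e^{-\Delta_\nu} V(t_0, x_0) + \tfrac{1}{2}\left(1 - e^{-\Delta_\nu}\right) \eta(\nu) \\
&\le \tfrac{1}{2}\left(1 + e^{-\Delta_\nu}\right) V(t_0, x_0) \\
&\le e^{-\frac{1}{4}\Delta_\nu} V(t_0, x_0)
\end{aligned}$$

since $V(t_0, x_0) > \eta(\nu)$. Repeating this argument, we have

$$V(t_n, x_n) \le e^{-\frac{n}{4}\Delta_\nu} V(t_0, x_0)$$

as long as $x_j \notin \Lambda_\nu(t_0 + j\Delta_\nu)$ for $j = 0, \ldots, n - 1$. Now

$$V(t_0, x_0) \le \beta(\operatorname{dist}(x_0, A_0)) \le \beta(H^*_{\mathbb{R}^d}(D, A_0)) < \infty$$

for all $x_0 \in D$, so

$$V(t_n, x_n) \le e^{-\frac{n}{4}\Delta_\nu} \beta(H^*_{\mathbb{R}^d}(D, A_0))$$

as long as $x_j \notin \Lambda_\nu(t_0 + j\Delta_\nu)$ for $j = 0, \ldots, n - 1$. Define $N_{D,\nu}$ to be the smallest integer $n$ for which

$$e^{-\frac{n}{4}\Delta_\nu} \beta(H^*_{\mathbb{R}^d}(D, A_0)) \le \eta(\nu) < e^{-\frac{n-1}{4}\Delta_\nu} \beta(H^*_{\mathbb{R}^d}(D, A_0)).$$

Thus for each $x_0 \in D$ there exists an integer $n_0 \le N_{D,\nu}$, possibly 0, such that $x_{n_0} = x_\nu(t_0 + n_0\Delta_\nu, t_0, x_0) \in \Lambda_\nu(t_0 + n_0\Delta_\nu)$. By the positive invariance of the family of sets $\{\Lambda_\nu(t_0), t_0 \in \mathbb{R}\}$ proved in Lemma 5 all successive values $x_n$ remain in $\Lambda_\nu(t_0 + n\Delta_\nu)$, so the proof of Lemma 6 is complete. □

However, we need a family of nonempty compact subsets of $\mathbb{R}^d$ which is positively invariant and uniformly absorbing for the continuous time process $x_\nu(t, t_0, x_0)$. For this we define

$$\Lambda^*_\nu(t_0) = \bigcup_{t_0 - \Delta_\nu \le \tau \le t_0} x_\nu(t_0, \tau, \Lambda_\nu(\tau))$$

for each $t_0 \in \mathbb{R}$ (see Fig. 12). These sets are obviously nonempty and compact. Note that

$$x^* \in \Lambda^*_\nu(t_0) \ \Rightarrow\ x^* = x_\nu(t_0, \tau^*, z^*) \ \text{ with } \ \tau^* \in [t_0 - \Delta_\nu, t_0],\ z^* \in \Lambda_\nu(\tau^*). \tag{A.7}$$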
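The absorption index $N_{D,\nu}$ in the proof of Lemma 6 is determined by a single scalar inequality, so it can be computed directly. The sketch below takes the three quantities $\beta(H^*_{\mathbb{R}^d}(D, A_0))$, $\eta(\nu)$ and $\Delta_\nu$ as plain numbers, which are assumed inputs here, and returns the smallest $n \ge 0$ with $e^{-n\Delta_\nu/4}\,\beta(H^*_{\mathbb{R}^d}(D, A_0)) \le \eta(\nu)$. The function name `absorption_steps` is hypothetical.

```python
import math


def absorption_steps(beta_D, eta, Delta):
    """Smallest integer n >= 0 with exp(-n * Delta / 4) * beta_D <= eta.

    beta_D : the value beta(H*(D, A_0)) bounding V(t0, x0) on D  (assumed input)
    eta    : sublevel height eta(nu) > 0                          (assumed input)
    Delta  : step length Delta_nu > 0                             (assumed input)
    """
    if beta_D <= eta:
        return 0  # D already lies inside the sublevel sets
    # Solve exp(-n * Delta / 4) * beta_D = eta for n and round up.
    n = math.ceil(4.0 * math.log(beta_D / eta) / Delta)
    # Correct for floating-point rounding at the boundary in either direction.
    while n > 0 and math.exp(-(n - 1) * Delta / 4.0) * beta_D <= eta:
        n -= 1
    while math.exp(-n * Delta / 4.0) * beta_D > eta:
        n += 1
    return n
```

The corresponding continuous absorption time is then the product of this index with the step length, and neither quantity depends on the starting time $t_0$.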

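Fig. 12. The definition of $\Lambda^*_\nu(t_0)$. (The sketch shows the images $x_\nu(t_0, \tau_1, \Lambda_\nu(\tau_1))$ and $x_\nu(t_0, \tau_2, \Lambda_\nu(\tau_2))$ for times $\tau_1, \tau_2$ between $t_0 - \Delta_\nu$ and $t_0$ on the $t$-axis.)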

We will show that the family of sets $\{\Lambda^*_\nu(t_0), t_0 \in \mathbb{R}\}$ is positively invariant and absorbing for the continuous time process $x_\nu(t, t_0, x_0)$ uniformly in $t_0 \in \mathbb{R}$, provided $\nu$ (and hence $\Delta_\nu$) is sufficiently small.

Lemma 7. $x_\nu(t, t_0, \Lambda^*_\nu(t_0)) \subseteq \Lambda^*_\nu(t)$ for all $t \ge t_0$.

Proof. Consider an arbitrary point $x^* \in \Lambda^*_\nu(t_0)$; using (A.7) we get $x^* = x_\nu(t_0, \tau^*, z^*)$. We prove that $x_\nu(t, t_0, x^*) \in \Lambda^*_\nu(t)$ for $t \in [t_0, t_0 + \Delta_\nu]$ in two steps by showing it for $t \in [t_0, \tau^* + \Delta_\nu]$ and then for $t \in [\tau^* + \Delta_\nu, t_0 + \Delta_\nu]$.

Step 1. Using the fact that

$$x_\nu(t, t_0, x^*) = x_\nu(t, t_0, x_\nu(t_0, \tau^*, z^*)) = x_\nu(t, \tau^*, z^*) \in x_\nu(t, \tau^*, \Lambda_\nu(\tau^*))$$

for $t \ge t_0$, we get $x_\nu(t, t_0, x^*) \in \Lambda^*_\nu(t)$ for $t \in [t_0, \tau^* + \Delta_\nu]$.

Step 2. From Lemma 5, we have $x_\nu(\tau^* + \Delta_\nu, \tau^*, \Lambda_\nu(\tau^*)) \subset \Lambda_\nu(\tau^* + \Delta_\nu)$, so

$$x_\nu(\tau^* + \Delta_\nu, t_0, x^*) = x_\nu(\tau^* + \Delta_\nu, \tau^*, z^*) \in \Lambda_\nu(\tau^* + \Delta_\nu).$$

Hence $x_\nu(t, t_0, x^*) \in x_\nu(t, \tau^* + \Delta_\nu, \Lambda_\nu(\tau^* + \Delta_\nu))$ for all $t \ge \tau^* + \Delta_\nu$, from which it follows that $x_\nu(t, t_0, x^*) \in \Lambda^*_\nu(t)$ for at least $t \in [\tau^* + \Delta_\nu, t_0 + \Delta_\nu]$.
Combining Steps 1 and 2, we have

$$x_\nu(t, t_0, \Lambda^*_\nu(t_0)) \subset \Lambda^*_\nu(t)$$

for at least $t \in [t_0, t_0 + \Delta_\nu]$. We can repeat the above argument on the intervals $[t_0 + n\Delta_\nu, t_0 + (n + 1)\Delta_\nu]$ for $n = 1, 2, \ldots$ to obtain the inclusion for all $t \ge t_0$. □

The proof of the absorbing property is easier.

Lemma 8. For each compact subset $D$ of $\mathbb{R}^d$ there exists a time $T_{D,\nu} > 0$ such that

$$x_\nu(t, t_0, D) \subset \Lambda^*_\nu(t)$$

for all $t \ge t_0 + T_{D,\nu}$ and each $t_0 \in \mathbb{R}$.

Proof. We note from Lemma 6 that

$$x_\nu(t_0 + n\Delta_\nu, t_0, D) \subset \Lambda_\nu(t_0 + n\Delta_\nu) \subset \Lambda^*_\nu(t_0 + n\Delta_\nu)$$

for $n = N_{D,\nu} \ge 0$, so by the positive invariance property we then obtain

$$x_\nu(t, t_0, D) \subset \Lambda^*_\nu(t)$$

for $t \ge t_0 + T_{D,\nu}$, where $T_{D,\nu} := N_{D,\nu}\Delta_\nu$. □

We notice that the time elapsed until being absorbed, $T_{D,\nu}$, does not depend on $t_0$. From this, we conclude that the family $\{\Lambda^*_\nu(t_0), t_0 \in \mathbb{R}\}$ is also absorbing in the pullback sense of the following Lemma.

Lemma 9. For each compact subset $D$ of $\mathbb{R}^d$ there exists a time $T_{D,\nu} > 0$ such that

$$x_\nu(t_0, t_0 - T, D) \subset \Lambda^*_\nu(t_0)$$

for all $T \ge T_{D,\nu}$ and each $t_0 \in \mathbb{R}$.

Finally the pullback attracting component sets converge upper semicontinuously to $A_0$ uniformly in $t_0 \in \mathbb{R}$.

Lemma 10. $H^*_{\mathbb{R}^d}(\Lambda^*_\nu(t_0), A_0) \le \alpha^{-1}(\eta(\nu) + K L \omega(\nu)) \to 0$ as $\nu \to 0$.

Proof. We apply the Lyapunov inequality (A.4) to $x^* \in \Lambda^*_\nu(t_0)$ given by $x^* = x_\nu(t_0, \tau^*, z^*)$ for $\tau^* \in [t_0 - \Delta_\nu, t_0]$ and $z^* \in \Lambda_\nu(\tau^*)$, see (A.7), to obtain

$$\begin{aligned}
V(t_0, x^*) &= V(t_0, x_\nu(t_0, \tau^*, z^*)) \\
&\le e^{-(t_0 - \tau^*)} V(\tau^*, z^*) + K L \omega(\nu) \\
&\le e^{-(t_0 - \tau^*)} \eta(\nu) + K L \omega(\nu) \le \eta(\nu) + K L \omega(\nu).
\end{aligned}$$

The result then follows from the fact that $\alpha(\operatorname{dist}(x^*, A_0)) \le V(t_0, x^*)$. □

A.2. Existence and convergence of pullback attractors

We apply standard theoretic methods for nonautonomous dynamical systems to the continuous time process $x_\nu(t, t_0, x_0)$ and the family $\Lambda^*_\nu = \{\Lambda^*_\nu(t_0), t_0 \in \mathbb{R}\}$ of pullback absorbing sets defined in the previous subsection to obtain the existence of a pullback attractor $A_\nu = \{A_\nu(t_0), t_0 \in \mathbb{R}\}$ defined through

$$A_\nu(t_0) = \bigcap_{\tau \ge 0} x_\nu(t_0, t_0 - \tau, \Lambda^*_\nu(t_0 - \tau)),$$

which is a nonempty and compact set, since the intersecting sets are nonempty compact and nested, i.e.

$$x_\nu(t_0, t_0 - \tau_2, \Lambda^*_\nu(t_0 - \tau_2)) \subset x_\nu(t_0, t_0 - \tau_1, \Lambda^*_\nu(t_0 - \tau_1))$$

for $\tau_1 \le \tau_2$. This follows from the two-parameter evolution property and the fact that the family of absorbing sets is positively invariant, thus

$$\begin{aligned}
x_\nu(t_0, t_0 - \tau_2, \Lambda^*_\nu(t_0 - \tau_2)) &= x_\nu(t_0, t_0 - \tau_1, x_\nu(t_0 - \tau_1, t_0 - \tau_2, \Lambda^*_\nu(t_0 - \tau_2))) \\
&\subset x_\nu(t_0, t_0 - \tau_1, \Lambda^*_\nu(t_0 - \tau_1)).
\end{aligned}$$

The invariance

$$x_\nu(t, t_0, A_\nu(t_0)) = A_\nu(t)$$

follows from the above construction and the continuity $t \mapsto A_\nu(t)$ from the invariance and continuity of the process, since

$$H_{\mathbb{R}^d}(A_\nu(t), A_\nu(t_0)) = H_{\mathbb{R}^d}(x_\nu(t, t_0, A_\nu(t_0)), A_\nu(t_0)) \to 0 \quad \text{as } t \to t_0.$$

In addition, the uniform upper semicontinuous convergence follows from the fact that $A_\nu(t_0) \subset \Lambda^*_\nu(t_0)$, so

$$H^*_{\mathbb{R}^d}(A_\nu(t_0), A_0) \le H^*_{\mathbb{R}^d}(\Lambda^*_\nu(t_0), A_0)$$

for all $t_0 \in \mathbb{R}$ and the result follows from Lemma 10.
We also note that the $A_\nu(t)$ are connected sets, since the $\Lambda^*_\nu(t_0)$ are connected and the $A_\nu(t)$ are an intersecting family of nested connected sets.
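Pullback attraction and the two-parameter evolution property used in this appendix can be observed on an explicitly solvable example. The sketch below assumes the scalar equation $x' = -x + \nu \sin t$, which is chosen here purely for illustration and is not one of the paper's examples. Its bounded entire solution $x^*(t) = \tfrac{\nu}{2}(\sin t - \cos t)$ gives singleton component sets $\{x^*(t_0)\}$, and every solution started at time $t_0 - T$ approaches $x^*(t_0)$ as $T \to \infty$.

```python
import math


def x_star(t, nu):
    """Bounded entire solution of the illustrative ODE x' = -x + nu*sin(t)."""
    return 0.5 * nu * (math.sin(t) - math.cos(t))


def flow(t, s, x0, nu):
    """Value at time t of the solution with x(s) = x0, for s <= t (closed form)."""
    return x_star(t, nu) + (x0 - x_star(s, nu)) * math.exp(-(t - s))


def pullback_distance(t0, T, x0, nu):
    """Distance from x(t0, t0 - T, x0) to the singleton fibre {x_star(t0)}."""
    return abs(flow(t0, t0 - T, x0, nu) - x_star(t0, nu))
```

Here `pullback_distance(t0, T, x0, nu)` equals $|x_0 - x^*(t_0 - T)|\,e^{-T}$, so it decays exponentially in $T$ with the end time $t_0$ held fixed. Composing `flow` over an intermediate time reproduces the direct evaluation, which is the evolution property invoked for the nested intersection above.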
A SURVEY OF METHODS FOR COMPUTING
(UN)STABLE MANIFOLDS OF VECTOR FIELDS
B. KRAUSKOPF and H. M. OSINGA
Department of Engineering Mathematics, University of Bristol,
Queen's Building, Bristol BS8 1TR, UK
E. J. DOEDEL
Department of Computer Science, Concordia University,
1455 Boulevard de Maisonneuve O., Montreal Quebec, H3G 1M8 Canada
M. E. HENDERSON
IBM Research, PO Box 218, Yorktown Heights, NY 10598, USA
J. GUCKENHEIMER and A. VLADIMIRSKY
Department of Mathematics, Cornell University,
Malott Hall, Ithaca, NY 14853-4201, USA
M. DELLNITZ and O. JUNGE
Institute for Mathematics, University of Paderborn,
D-33095 Paderborn, Germany

Received May 14, 2004; Revised June 16, 2004

The computation of global invariant manifolds has seen renewed interest in recent years. We
survey different approaches for computing a global stable or unstable manifold of a vector field,
where we concentrate on the case of a two-dimensional manifold. All methods are illustrated with
the same example — the two-dimensional stable manifold of the origin in the Lorenz system.

Keywords: Stable and unstable manifolds; numerical methods; Lorenz equations.

1. Introduction

Many applications give rise to mathematical models in the form of a system of ordinary differential equations. Well-known examples are periodically forced oscillators and the Lorenz system (introduced in Sec. 1.1); see, for example [Guckenheimer & Holmes, 1986; Kuznetsov, 1998; Strogatz, 1994] for further references. Such a dynamical system can be written in the general form

$$\frac{dx}{dt} = f(x), \tag{1}$$

where $x \in \mathbb{R}^n$ and the map $f : \mathbb{R}^n \to \mathbb{R}^n$ is sufficiently smooth. We remark that, in general, the function $f$ will depend on parameters. However, we assume that all parameters are fixed and use (1) as the appropriate setting for the discussion of global manifolds.
The goal is to understand the overall dynamics of system (1). To this end, one needs to find special invariant sets, namely the equilibria, periodic orbits and possibly invariant tori. Furthermore, if these invariant sets are of saddle type then they come with global stable and unstable manifolds. For example, the stable and unstable manifolds $W^s(x_0)$ and $W^u(x_0)$ of a saddle equilibrium $x_0$ are defined as

$$W^s(x_0) := \left\{ x \in \mathbb{R}^n \;\middle|\; \lim_{t \to \infty} \phi^t(x) = x_0 \right\},$$

$$W^u(x_0) := \left\{ x \in \mathbb{R}^n \;\middle|\; \lim_{t \to \infty} \phi^{-t}(x) = x_0 \right\},$$
68 B. Krauskopf et al.

respectively, where 4>l is the flow of (1). Hence, tra-


jectories on the stable (unstable) manifold converge
to Xo in forward (backward) time. Knowing these
manifolds is crucial as they organize the dynamics
on a global scale. For example, stable manifolds may
form boundaries of basins of attraction, and it is
well known that intersections of stable and unstable
manifolds lead to complicated dynamics and chaos.
Generally, global stable and unstable manifolds
cannot be found analytically. Furthermore, they are
not implicitly defined, meaning that it is not pos-
sible to find them as the zero-set of some func-
tion of the phase space variables. Hence, points
on global invariant manifolds cannot be found
"locally". Instead, these manifolds must be "grown"
from local knowledge, for example from linear infor-
mation, near a fixed point Xo-
It is the purpose of this paper to review differ-
ent numerical techniques that have recently become Fig. 1. The unstable manifold Wu(0) (red curve) accumu-
available to compute these global objects. We rev- lates on the butterfly-shaped Lorenz attractor. The blue disk
iew five algorithms in detail and characterize their is the linear approximation Es(0) of the Lorenz manifold
properties using a common test-case example, nam- Ws(0). Also shown are the two equilibria at the centers of
the "wings" of the butterfly and their one-dimensional stable
ely, the Lorenz manifold which is introduced now. manifolds (blue curves).

1.1. The Lorenz manifold


The Lorenz system [Lorenz, 1963] is a classic exam- of (2). Each of these equilibria has one negative real
ple of a vector field with a chaotic attractor. It is given as

$$\begin{aligned} \dot{x} &= \sigma(y - x),\\ \dot{y} &= \varrho x - y - xz,\\ \dot{z} &= xy - \beta z, \end{aligned} \tag{2}$$

where we fix the parameters at the standard choice $\sigma = 10$, $\varrho = 28$ and $\beta = 8/3$, for which one finds the famous butterfly-shaped Lorenz attractor. Note that the Lorenz system (2) has the symmetry $(x, y, z) \mapsto (-x, -y, z)$ of rotation by $\pi$ about the $z$-axis. In particular, the $z$-axis is invariant under the flow.

The origin is a saddle point of (2) with real eigenvalues $-\beta$ and $-(\sigma + 1)/2 \pm (1/2)\sqrt{(\sigma + 1)^2 + 4\sigma(\varrho - 1)}$, that is, approximately -22.828, -2.667 and 11.828. The origin is contained in the Lorenz attractor, so that its one-dimensional unstable manifold $W^u(\mathbf{0})$ can be used to approximate the Lorenz attractor; this is illustrated in Fig. 1 where $W^u(\mathbf{0})$ is shown in red. At the centers of the "wings" of the butterfly are two more equilibria of (2), approximately at $(\pm 8.485, \pm 8.485, 27)$, which are each other's image under the symmetry

eigenvalue, giving rise to a one-dimensional stable manifold, and an unstable pair of complex conjugate eigenvalues with positive real part. Figure 1 shows all equilibria of (2) in green, together with their one-dimensional global manifolds. As mentioned, the red curve is the unstable manifold $W^u(\mathbf{0})$ of the origin, whose closure is the Lorenz attractor. The blue curves are the stable manifolds of the two other equilibria. The blue disk lies in the linear eigenspace $E^s(\mathbf{0})$ of the origin.

The Lorenz attractor, that is, the red curve in Fig. 1, conveys the chaotic nature of the system, but does not give any information on the overall organization of the phase space of (2). This role is played by the two-dimensional stable manifold $W^s(\mathbf{0})$ of the origin, which we refer to as the Lorenz manifold from now on. The Lorenz manifold $W^s(\mathbf{0})$ is tangent at $\mathbf{0}$ to the eigenspace $E^s(\mathbf{0})$ spanned by the eigenvectors associated with the eigenvalues -22.828 and -2.667. This is a generic property of stable and unstable manifolds; see Sec. 1.2. Note the large difference in magnitude between the two stable eigenvalues, leading to a dominance of the strong stable manifold, which is tangent to the eigenspace of the eigenvalue -22.828.
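The numerical values quoted for system (2) can be checked directly. The following Python sketch is an illustration only: the function names are hypothetical and not taken from any of the cited software. It evaluates the eigenvalue formula at the origin and the two secondary equilibria, whose closed form $(\pm\sqrt{\beta(\varrho - 1)}, \pm\sqrt{\beta(\varrho - 1)}, \varrho - 1)$ is the standard one for the Lorenz system.

```python
import math

# Standard parameter values used in the text.
SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0


def lorenz(p):
    """Right-hand side of the Lorenz system (2)."""
    x, y, z = p
    return (SIGMA * (y - x), RHO * x - y - x * z, x * y - BETA * z)


def origin_eigenvalues():
    """Eigenvalues of the Jacobian at the origin, sorted in increasing order.

    At the origin the z-direction decouples (eigenvalue -beta); the (x, y)
    block gives the two roots of the quadratic quoted in the text.
    """
    root = math.sqrt((SIGMA + 1.0) ** 2 + 4.0 * SIGMA * (RHO - 1.0))
    strong_stable = -(SIGMA + 1.0) / 2.0 - root / 2.0
    unstable = -(SIGMA + 1.0) / 2.0 + root / 2.0
    return (strong_stable, -BETA, unstable)


def secondary_equilibria():
    """The two equilibria at the centers of the 'wings'."""
    c = math.sqrt(BETA * (RHO - 1.0))
    return ((c, c, RHO - 1.0), (-c, -c, RHO - 1.0))


if __name__ == "__main__":
    print(origin_eigenvalues())
    print(secondary_equilibria())
```

The two equilibria returned by `secondary_equilibria` are mapped to each other by the symmetry $(x, y, z) \mapsto (-x, -y, z)$, as stated in the text.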
A Survey of Methods for Computing (Un) Stable Manifolds of Vector Fields 69

The Lorenz manifold has a number of astonishing properties. Imagine that the little blue disk in Fig. 1 "grows" to become the Lorenz manifold $W^s(\mathbf{0})$, but without ever intersecting the red unstable manifold $W^u(\mathbf{0})$. In other words, the Lorenz manifold stays "in between" trajectories on the Lorenz attractor, but "spirals" simultaneously into both wings of the butterfly. Now imagine how trajectories on this manifold must be able to pass from one wing to the other. Any finitely grown part of $W^s(\mathbf{0})$ is topologically still a two-dimensional disk, but one with a particularly intriguing embedding into $\mathbb{R}^3$. The geometry of $W^s(\mathbf{0})$ can only truly be appreciated if one can draw an image of it.

Some early work on the geometry of the Lorenz manifold can be found in [Perello, 1979]. Using "a desktop computer with a plotter" Perello studied the embedding of the stable manifold of the origin as a function of the parameter $\varrho$ and, in particular, provides a sketch for $\varrho$ close to 24.74. Pioneering efforts to visualize the Lorenz system are due to Stewart. Trajectories that illustrate the (local) stable manifold can be found in [Thompson & Stewart, 1986, Fig. 11.6], while [Stewart, 1986] is an extended abstract of a movie that visualizes the dynamics and global bifurcations (as a function of $\varrho$) of the Lorenz system with computer graphics in the three-dimensional phase space. The first, hand-drawn image of (the structure of) the Lorenz manifold (that is, for the standard parameter values also used here) appeared in the book [Abraham & Shaw, 1985]. The first published computer-generated image of the Lorenz manifold is that in [Guckenheimer & Worfolk, 1993].

Not least due to its intriguing nature, the Lorenz manifold has become a much-used test-case example for evaluating algorithms that compute two-dimensional (un)stable manifolds of vector fields. For each of the methods discussed in this paper we present an image of the computed Lorenz manifold that is always taken from a viewpoint along the line spanned by the vector $(\sqrt{3}, 1, 0)$ in the $(x, y)$-plane.

1.2. Stable and unstable manifolds

In order to explain the different methods for computing two-dimensional (un)stable manifolds, we need to introduce some notation. To keep the exposition simple, we consider here the case of a global (un)stable manifold of a hyperbolic saddle point $\mathbf{x}_0 \in \mathbb{R}^n$ of (1). Furthermore, we present all theory and the different methods for the case of an unstable manifold. This is not a restriction, because a stable manifold can be computed as an unstable manifold when time is reversed in system (1).

Suppose now that $f(\mathbf{x}_0) = 0$ and for some $1 \le k < n$ the Jacobian $Df(\mathbf{x}_0)$ of $f$ at $\mathbf{x}_0$ has $k$ eigenvalues with positive real parts and $(n - k)$ eigenvalues with negative real parts (counted with multiplicity). The Stable and Unstable Manifold Theorem (see, e.g. [Guckenheimer & Holmes, 1986; Kuznetsov, 1998]) states that a local unstable manifold $W^u_{loc}(\mathbf{x}_0)$ exists in a neighborhood of $\mathbf{x}_0$. Furthermore, $W^u_{loc}(\mathbf{x}_0)$ is as smooth as $f$ and tangent to the unstable (generalized) eigenspace $E^u(\mathbf{x}_0)$ of $Df(\mathbf{x}_0)$ at $\mathbf{x}_0$. This means that we may define the global unstable manifold $W^u(\mathbf{x}_0)$ as

$$W^u(\mathbf{x}_0) = \left\{ \mathbf{x} \in \mathbb{R}^n \;\middle|\; \lim_{t \to -\infty} \phi^t(\mathbf{x}) = \mathbf{x}_0 \right\} = \bigcup_{t > 0} \phi^t\left(W^u_{loc}(\mathbf{x}_0)\right). \tag{3}$$

Hence, $W^u(\mathbf{x}_0)$ is a $k$-dimensional (immersed) manifold, defined as the globalization of $W^u_{loc}(\mathbf{x}_0)$ under the flow $\phi^t$. Note that the local stable manifold $W^s_{loc}(\mathbf{x}_0)$ and the stable manifold $W^s(\mathbf{x}_0)$ are similarly related with respect to the reversed direction of time, namely

$$W^s(\mathbf{x}_0) = \left\{ \mathbf{x} \in \mathbb{R}^n \;\middle|\; \lim_{t \to \infty} \phi^t(\mathbf{x}) = \mathbf{x}_0 \right\} = \bigcup_{t < 0} \phi^t\left(W^s_{loc}(\mathbf{x}_0)\right). \tag{4}$$

This indeed shows that it is sufficient to consider only the case of an unstable manifold, possibly after reversing time.

Definition (3) already suggests a method for computing $W^u(\mathbf{x}_0)$: take a small $(k - 1)$-sphere (or other "outflow boundary" such as an ellipsoid) $S_\delta \subset W^u_{loc}(\mathbf{x}_0)$ with radius $\delta$ around $\mathbf{x}_0$ and grow the manifold $W^u(\mathbf{x}_0)$ by evolving $S_\delta$ under the flow $\phi^t$. As starting data, one can take $S_\delta \subset E^u(\mathbf{x}_0)$ or a higher-order approximation of $W^u_{loc}(\mathbf{x}_0)$.

In the special case $k = 1$ of computing a one-dimensional manifold, this method works well, because it boils down to evolving two points at distance $\delta$ from $\mathbf{x}_0$ under the flow. This can be done reliably by numerical integration of (1), so that computing one-dimensional unstable manifolds is straightforward. The one-dimensional manifolds in Fig. 1 were computed in this way.
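The case $k = 1$ can be sketched in a few lines. The following Python example is an illustration under stated assumptions, not the code used to produce Fig. 1: it takes the Lorenz system at the standard parameter values, uses the linear eigenvector as starting data, and integrates with a fixed-step classical Runge-Kutta scheme. The function names, the value of `delta` and the step size are all hypothetical choices.

```python
import math

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0


def lorenz(p):
    """Right-hand side of the Lorenz system."""
    x, y, z = p
    return (SIGMA * (y - x), RHO * x - y - x * z, x * y - BETA * z)


def rk4_step(f, p, h):
    """One classical Runge-Kutta step of size h for p' = f(p)."""
    k1 = f(p)
    k2 = f(tuple(a + 0.5 * h * b for a, b in zip(p, k1)))
    k3 = f(tuple(a + 0.5 * h * b for a, b in zip(p, k2)))
    k4 = f(tuple(a + h * b for a, b in zip(p, k3)))
    return tuple(
        a + h / 6.0 * (b1 + 2.0 * b2 + 2.0 * b3 + b4)
        for a, b1, b2, b3, b4 in zip(p, k1, k2, k3, k4)
    )


def unstable_direction():
    """Unit eigenvector of the origin for the positive eigenvalue mu.

    The first row of the Jacobian gives sigma * (y - x) = mu * x,
    so the eigenvector is proportional to (1, 1 + mu / sigma, 0).
    """
    root = math.sqrt((SIGMA + 1.0) ** 2 + 4.0 * SIGMA * (RHO - 1.0))
    mu = -(SIGMA + 1.0) / 2.0 + root / 2.0
    v = (1.0, 1.0 + mu / SIGMA, 0.0)
    norm = math.hypot(v[0], v[1])
    return tuple(c / norm for c in v)


def one_dim_unstable_manifold(delta=1e-3, h=0.01, steps=2000):
    """Evolve the two points at distance delta from the origin.

    Returns the two branches of the approximate manifold as point lists.
    """
    v = unstable_direction()
    branches = []
    for sign in (1.0, -1.0):
        p = tuple(sign * delta * c for c in v)
        branch = [p]
        for _ in range(steps):
            p = rk4_step(lorenz, p, h)
            branch.append(p)
        branches.append(branch)
    return branches
```

Because the starting points are symmetric images of each other, the two computed branches are related by $(x, y, z) \mapsto (-x, -y, z)$; for a stable manifold one would integrate with a negative step size instead.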
70 B. Krauskopf et al.

However, the above method of evolving a $(k - 1)$-sphere $S_\delta$ with $k \ge 2$ under the flow $\phi^t$ generally gives very poor results. This is so because $S_\delta$ will typically deform very rapidly under $\phi^t$. In particular, it will stretch out along the strong unstable directions (if present). Furthermore, $S_\delta$ is a continuous object that will have to be discretized by some mesh. Any mesh on $S_\delta$ will deteriorate rapidly under the flow $\phi^t$, so that it will not be a good representation of $W^u(\mathbf{x}_0)$ as a $k$-dimensional manifold.

1.3. Different approaches to computing $W^u(\mathbf{x}_0)$

It is quite a challenge to compute a global unstable manifold $W^u(\mathbf{x}_0)$ of dimension at least two. Indeed simple numerical integration of the flow is not sufficient (except in very special cases); dedicated algorithms are needed for this task. Before we describe some recent methods in more detail, we first explain the underlying approaches in general terms. It is useful to consider for this purpose different parametrizations of $W^u(\mathbf{x}_0)$.

We concentrate in this survey on the first nontrivial case $k = 2$ of a two-dimensional unstable manifold. While all methods could be used in principle to compute higher-dimensional manifolds, almost all implementations are for $k = 2$. Furthermore, visualizing higher-dimensional manifolds remains a serious challenge. The different methods use the idea of growing $W^u(\mathbf{x}_0)$ from a local neighborhood of $\mathbf{x}_0$. They differ in how they ensure that a good mesh representing $W^u(\mathbf{x}_0)$ is computed during this growth process.

Consider as starting data a small smooth closed curve $S_\delta \subset W^u_{loc}(\mathbf{x}_0)$, also referred to as a (topological) circle in what follows, of points that all lie within a distance $\delta$ from $\mathbf{x}_0$. (As was mentioned, one can take $S_\delta \subset E^u(\mathbf{x}_0)$ if $\delta$ is small enough.) The goal is to find a "nice" parametrization of $W^u(\mathbf{x}_0)$ in terms of the starting data $S_\delta$.

As we have seen above, the parametrization

$$W^u(\mathbf{x}_0) = \{\phi^t(S_\delta)\}_{t \in \mathbb{R}} \tag{5}$$

is not practical. While the $\phi^t(S_\delta)$ are smooth closed curves for all $t$, they are typically not "nice" and "round". Indeed the curvature along these curves typically varies dramatically, and they soon tend to look like very elongated ellipses.

In order to define the parametrization of $W^u(\mathbf{x}_0)$ as a family of the nicest possible topological circles, recall that the geodesic distance $d_g(\mathbf{x}, \mathbf{y})$ is defined as the arclength of the shortest path in $W^u(\mathbf{x}_0)$ connecting $\mathbf{x}$ and $\mathbf{y}$, called a geodesic. Consider now the geodesic parametrization of $W^u(\mathbf{x}_0)$ given by

$$W^u(\mathbf{x}_0) = \{S_\eta\}_{\eta > 0}, \quad \text{where } S_\eta := \{\mathbf{x} \in W^u(\mathbf{x}_0) \mid d_g(\mathbf{x}, \mathbf{x}_0) = \eta\}. \tag{6}$$

The geodesic parametrization (6) is entirely in terms of the geometry of $W^u(\mathbf{x}_0)$, and not in terms of the dynamics on the manifold. Since $W^u(\mathbf{x}_0)$ is a smooth manifold tangent to $E^u(\mathbf{x}_0)$ at $\mathbf{x}_0$, there must be some $\eta_{max} > 0$ so that the geodesic level sets $S_\eta$ for $0 < \eta < \eta_{max}$ are all smooth closed curves without self-intersection, that is, topological circles; see, for example, [Spivak, 1979]. We also refer to geodesic level sets for $\eta < \eta_{max}$ as geodesic circles. Up until $\eta_{max}$, the geodesic parametrization (6) is geometrically the nicest parametrization, because its elements, the geodesic circles, are the nicest possible topological circles on $W^u(\mathbf{x}_0)$. (This means here that the metric is exactly the identity.) For the Lorenz manifold, apparently $\eta_{max} = \infty$. However, the case of a finite $\eta_{max}$ is possible and it typically involves a non-smooth geodesic circle; see [Krauskopf & Osinga, 2003] for details.

The idea of computing $W^u(\mathbf{x}_0)$ as a sequence of geodesic circles goes back to [Guckenheimer & Worfolk, 1993]. Starting with a small geodesic circle (or ellipse) $S_\delta$ around $\mathbf{x}_0$, they modify the vector field so that the component tangential to the last computed geodesic level set is practically zero, retaining only the radial part. Then the flow of the rescaled radial vector field is used to evolve (a sufficient number of points on) this geodesic circle by integration over a suitably small and fixed integration time (now corresponding to geodesic distance up to a rescaling of the radial part of the vector field). Figure 2 shows 36 approximate geodesic circles of the Lorenz manifold computed with this method up to geodesic distance 180. The output was produced in the DsTool software environment [Back et al., 1992]; the manifold could be rendered as a two-dimensional surface by post-processing the data. When the vector field $f$ is largely tangential to the geodesic circles, the computation of that vector field's radial component becomes unstable unless the integration time $\tau$ is sufficiently small (see the ripples on the last few geodesic circles near the helix at the
center of Fig. 2). This CFL-type stability condition becomes increasingly restrictive as the angle between the trajectories and geodesic circles decreases. More generally, the method from [Guckenheimer & Worfolk, 1993] can approximate stably only a part of the manifold, on which the vector field remains transverse to each geodesic circle.

Fig. 2. The Lorenz manifold computed with the method of [Guckenheimer & Worfolk, 1993] up to geodesic distance 180; the computed approximate geodesic level sets are at increasing radial distances from the origin with steps of 5.0 in between, which are indicated by a color change from magenta (small) to blue (large).

The method by [Krauskopf & Osinga, 1999, 2003], discussed in detail in Sec. 2, also computes $W^u(\mathbf{x}_0)$ as a sequence of geodesic circles, but does not rescale the vector field. Instead, the idea is to find the next geodesic circle in a local (and changing) coordinate system given by hyperplanes perpendicular to the present geodesic circle. Determined by certain accuracy parameters, a suitable number of mesh points on the next geodesic circle is computed by solving appropriate boundary value problems. During the computation the interpolation error stays bounded, so that the overall quality of the mesh is guaranteed.

A different approach is to reparametrize time so that the flow with respect to the new time progresses with the same speed along all trajectories through $S_\delta$, meaning that the same arclength is covered per unit time along all trajectories. One also speaks of arclength integration. We then have the new parametrization of $W^u(\mathbf{x}_0)$ given by

$$W^u(\mathbf{x}_0) = \{A_\eta\}_{\eta > 0}, \quad \text{where } A_\eta := \{\mathbf{x} \in W^u(\mathbf{x}_0) \mid d_a(\mathbf{x}, \mathbf{x}_0) = \eta\}, \tag{7}$$

where $d_a(\mathbf{x}, \mathbf{y})$ denotes the arclength distance between two points $\mathbf{x}$ and $\mathbf{y}$ on the same trajectory; we set $d_a(\mathbf{x}, \mathbf{y}) = \infty$ if $\mathbf{x}$ and $\mathbf{y}$ are not on the same trajectory. This parametrization can be considered as the best in terms of dynamically defined topological circles on $W^u(\mathbf{x}_0)$.

Johnson et al. [1997] used essentially this parametrization by trajectory arclength, but considered integration in the product of time and phase space. They started with a uniform mesh on a first small circle $A_\delta \subset E^u(\mathbf{x}_0)$ and then integrated at each step the present mesh points up to a specified arclength. This leads to a new circle, on which a uniform mesh is then constructed by interpolation between the integration points. Figure 3 shows the Lorenz manifold computed with this method up to an approximate arclength distance of 200. The method is quite fast since it involves only direct integration and redistribution of points by interpolation. On the other hand, it is difficult to control the interpolation error, which is determined by the (unknown) dynamics on $W^u(\mathbf{x}_0)$.

An altogether different parametrization of $W^u(\mathbf{x}_0)$ is the dual parametrization to (5) and (7) that consists of the individual trajectories through a fixed $S_\delta \subset E^u(\mathbf{x}_0)$. It is formally given as

$$W^u(\mathbf{x}_0) = \{B_p\}_{p \in S_\delta}, \quad \text{where } B_p := \{\phi^t(p) \mid t \in \mathbb{R}\}. \tag{8}$$

Notice that, in the case of a two-dimensional manifold $W^u(\mathbf{x}_0)$ considered here, parametrization (8) is a one-parameter family of trajectories, while (5) and (7) are one-parameter families of closed curves.

The method by Doedel, discussed in detail in Sec. 3, computes two-dimensional (un)stable manifolds by following trajectories $B_p$ as a boundary value problem where the initial condition $p \in S_\delta$ is parametrized with one of the free continuation parameters. This method is very accurate and flexible by allowing for different boundary conditions at the other end point of the trajectory $B_p$, which includes specifying a fixed arclength $L$ of the trajectory. During a computation, mesh points are distributed along the trajectories to maintain the accuracy of the computation.
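The arclength reparametrization behind (7) amounts to integrating the normalized vector field $f/\|f\|$, so that every trajectory advances with unit speed. The Python sketch below illustrates only this one ingredient for the Lorenz manifold $W^s(\mathbf{0})$, integrating backward in time from a circle in $E^s(\mathbf{0})$. It is not the implementation of Johnson et al. [1997]: in particular, the redistribution of mesh points by interpolation is omitted, and the function names, the radius `delta`, the number of points and the fixed-step Runge-Kutta integrator are assumptions made for illustration.

```python
import math

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0


def lorenz(p):
    """Right-hand side of the Lorenz system."""
    x, y, z = p
    return (SIGMA * (y - x), RHO * x - y - x * z, x * y - BETA * z)


def unit_speed_field(p, direction=-1.0):
    """Normalized field f / ||f||; direction = -1 reverses time (stable manifold)."""
    f = lorenz(p)
    norm = math.sqrt(sum(c * c for c in f))
    return tuple(direction * c / norm for c in f)


def rk4_step(g, p, h):
    """One classical Runge-Kutta step of size h for p' = g(p)."""
    k1 = g(p)
    k2 = g(tuple(a + 0.5 * h * b for a, b in zip(p, k1)))
    k3 = g(tuple(a + 0.5 * h * b for a, b in zip(p, k2)))
    k4 = g(tuple(a + h * b for a, b in zip(p, k3)))
    return tuple(
        a + h / 6.0 * (b1 + 2.0 * b2 + 2.0 * b3 + b4)
        for a, b1, b2, b3, b4 in zip(p, k1, k2, k3, k4)
    )


def start_circle(delta, n_points):
    """Uniform mesh on the circle of radius delta in the stable eigenspace E^s(0)."""
    root = math.sqrt((SIGMA + 1.0) ** 2 + 4.0 * SIGMA * (RHO - 1.0))
    mu_ss = -(SIGMA + 1.0) / 2.0 - root / 2.0  # strong stable eigenvalue
    a, b = 1.0, 1.0 + mu_ss / SIGMA            # its eigenvector (a, b, 0)
    norm = math.hypot(a, b)
    e1 = (a / norm, b / norm, 0.0)
    e2 = (0.0, 0.0, 1.0)                       # eigenvector of -beta
    angles = [2.0 * math.pi * k / n_points for k in range(n_points)]
    return [
        tuple(delta * (math.cos(t) * u + math.sin(t) * w) for u, w in zip(e1, e2))
        for t in angles
    ]


def arclength_trajectories(arclength, delta=1.0, n_points=16, h=0.01):
    """Integrate every mesh point over the same trajectory arclength."""
    steps = int(round(arclength / h))
    trajectories = []
    for p in start_circle(delta, n_points):
        trajectory = [p]
        for _ in range(steps):
            p = rk4_step(unit_speed_field, p, h)
            trajectory.append(p)
        trajectories.append(trajectory)
    return trajectories
```

The end points of the returned trajectories form a discretization of one curve $A_\eta$; how evenly they are spread along that curve depends on the dynamics, which is why a remeshing step is needed in practice.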
Fig. 3. The Lorenz manifold computed with the method of [Johnson et al., 1997] up to a total trajectory arclength of about 200.

The method of [Henderson, 2003], discussed in detail in Sec. 4, also considers parametrization (8) of $W^u(\mathbf{x}_0)$ by orbits. However, the manifold is constructed directly as a two-dimensional object by computing fat trajectories. A fat trajectory is a string of polyhedral patches along a trajectory, where the size of each patch is given by local curvature information. When a fat trajectory reaches the prescribed total arclength $L$, the boundary of the computed part of the manifold is determined. Then a suitable starting point for the next fat trajectory is found and the computation continues. When no more possible starting points exist, the computation stops.

The method of [Guckenheimer & Vladimirsky, 2004], discussed in detail in Sec. 5, locally models $W^u(\mathbf{x}_0)$ as the graph of a function $g$ that satisfies a quasilinear partial differential equation (PDE) expressing the tangency of the vector field $f$ to the graph of $g$. The PDE is discretized in an Eulerian framework and the manifold is approximated by a triangulated mesh. At each step one new point is added to the mesh, leading to a new simplex whose other vertices are previously known mesh points. An Ordered Upwind Method determines where the next point/simplex is added and the ordering of new simplices is based on the arclength of the trajectories.

The method of Dellnitz and Hohmann [1996, 1997], discussed in detail in Sec. 6, is complementary to the previous methods in that it computes an outer approximation of the manifold by boxes of the same dimension $n$ as the phase space of (1). This method uses the time-$\tau$ map of the flow $\phi^t$ for some fixed $\tau$. A subdivision algorithm first finds a covering of $W^u_{loc}(\mathbf{x}_0)$ with $n$-dimensional boxes of suitably small diameter. This local box covering is then globalized in steps by adding new boxes (of the same small size) that are "hit" under the time-$\tau$ map by the present collection of boxes. The practical problem is to detect reliably when the image of one box intersects another box (for example, by using test points). If a priori bounds on the local growth rate of the vector field are known then it is possible to compute a rigorous box covering of $W^u(\mathbf{x}_0)$; see [Junge, 2000a].

In the following sections we present the different algorithms in more detail, again illustrated with the computation of the Lorenz manifold $W^s(\mathbf{0})$.

2. Approximation by Geodesic Level Sets

The method of Krauskopf and Osinga [1999, 2003] approximates a global (un)stable manifold as a sequence of geodesic circles of the parametrization (6). Only the case of a two-dimensional unstable manifold of a saddle point in a three-dimensional space is presented here. However, the method can be formulated in terms of computing a $k$-dimensional manifold of a vector field in $\mathbb{R}^n$, and has been implemented to compute two-dimensional (un)stable manifolds of saddle points and saddle periodic orbits in a phase space of any dimension;
see the examples in [Krauskopf & Osinga, 1999, 2003] and also in [Osinga, 2000, 2003]. Variants of this method exist to compute global manifolds of maps; see [Krauskopf & Osinga, 1998a, 1998b].

The method completely steps away from evolving an existing mesh. Instead, new mesh points are computed by means of solving appropriate boundary value problems; see Sec. 2.1. The boundary conditions predetermine where the new mesh points need to be added in order to achieve a prescribed mesh quality. This method is as independent of the dynamics as possible and it grows the manifold as a sequence of discretized geodesic circles until $\eta_{max}$ is reached where the geodesic level sets are no longer smooth circles; see Sec. 1.3.

To be more specific, let $M_i$ denote a circular list of mesh points from which a continuous topological circle $C_i$ is formed by connecting neighboring points of $M_i$ by line segments. The mesh points in $M_i$ are computed to ensure that $C_i$ is a good approximation (according to prespecified accuracy parameters) of an appropriate geodesic circle $S_{\eta_i}$. The manifold $W^u(\mathbf{x}_0)$ is then approximated up to a prescribed geodesic distance $L$ by the triangulation formed by the total mesh $M = \bigcup_{0 \le i \le l} M_i$, where $l \in \mathbb{N}$ depends on $L$ and the accuracy parameters.

The start data is a uniform mesh $M_0$ on an initial small geodesic circle $S_{\eta_0} = S_\delta \subset E^u(\mathbf{x}_0)$ at some prescribed distance $\delta$ from $\mathbf{x}_0$. The method then computes at each step $i$ a new circular list $M_{i+1}$ that approximates the next level set $S_{\eta_{i+1}}$. In other words, at every step a new band is added to $W^u(\mathbf{x}_0)$; the width of this band is determined by the curvature of geodesics. The method stops when the prespecified fixed geodesic distance $L$ from $\mathbf{x}_0$ is reached.

2.1. Finding a new point in $M_{i+1}$

Let us consider the task of finding $M_{i+1}$ at some prescribed increment $\Delta_i$ from a known circular list $M_i$ representing $S_{\eta_i}$. The circular list $M_{i+1}$ is constructed pointwise. Let $r \in M_i$ and consider the (half)plane $\mathcal{F}_r$ through $r$ that is (approximately) perpendicular to $C_i$ at $r$. (In the implementation the normal to $\mathcal{F}_r$ is defined as the average of the two unit vectors through $r$ and its immediate left and right neighbors.) Then $W^u(\mathbf{x}_0) \cap \mathcal{F}_r$ is a well-defined one-dimensional curve locally near $r$, which is parametrized by the time it takes to reach $W^u(\mathbf{x}_0) \cap \mathcal{F}_r$ by integration from $C_i$. Points in $W^u(\mathbf{x}_0) \cap \mathcal{F}_r$ can be found by solving the two-point boundary value problem

$$q_r(t) \in C_i, \tag{9}$$

$$b_r(t) := \phi^t(q_r(t)) \in \mathcal{F}_r, \tag{10}$$

where the integration time $t$ is a free parameter. The situation is shown in Fig. 4 with actual data for the Lorenz manifold $W^s(\mathbf{0})$ presented in Sec. 2.3. Note that for an unstable manifold $t > 0$ and for a stable manifold $t < 0$.

The point $b_r(t_r) \in \mathcal{F}_r$ is uniquely defined by the property that $t_r$ is the smallest integration time (in absolute value) for which $\|b_r(t_r) - r\| = \Delta_i$. If $\Delta_i$ is small enough then $b_r(t_r)$ exists and can be found by continuation of the trivial solution $b_r(0) = q_r(0) = r$ for $t = 0$ while checking for the first zero of the test function

$$\Delta_i - \|b_r(t) - r\|. \tag{11}$$

When the first zero is found then $b_r(t_r) = b_r(t)$ is the candidate for a point in $M_{i+1}$; see Fig. 4.

Fig. 4. The boundary value problem formulated for a mesh point $r$ on the geodesic level set $C_i$ is solved by a family of trajectories, starting at $q_r(t)$ on $C_i$ and ending at $b_r(t)$ in $\mathcal{F}_r$, that is parametrized by integration time $t$. There is a unique first orbit such that $\|b_r(t_r) - r\| = \Delta_i$. The image shows actual data for the Lorenz manifold $W^s(\mathbf{0})$ of Fig. 5 where $C_i \approx S_{\eta_i}$ with $\eta_i = 32.75$ and $\Delta_i = 4.0$.
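The search for $b_r(t_r)$ via (9)-(11) can be illustrated on a toy problem where everything is known in closed form. The Python sketch below is not the implementation of Krauskopf and Osinga. It assumes the linear vector field $f(x, y, z) = (x, 2y, -z)$, whose unstable manifold of the origin is the $(x, y)$-plane and whose flow is explicit; it takes $C_i$ to be an exact circle rather than a polygon; and it replaces the boundary value solver by nested bisection with a fixed bracket `width`. All function names and numerical tolerances are hypothetical.

```python
import math


def flow(p, t):
    """Exact time-t flow of the toy field f(x, y, z) = (x, 2y, -z)."""
    return (p[0] * math.exp(t), p[1] * math.exp(2.0 * t), p[2] * math.exp(-t))


def circle_point(radius, s):
    """Point on the circle C_i of the given radius in the plane z = 0."""
    return (radius * math.cos(s), radius * math.sin(s), 0.0)


def bisect(g, lo, hi, tol=1e-13):
    """Root of g in [lo, hi], assuming g changes sign on the interval."""
    g_lo = g(lo)
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if (g_mid > 0.0) == (g_lo > 0.0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def end_point(radius, s0, t, width=0.5):
    """Solve (9)-(10): find q on C_i such that flow(q, t) lies in the plane F_r."""
    r = circle_point(radius, s0)
    normal = (-math.sin(s0), math.cos(s0), 0.0)  # tangent to C_i at r

    def plane_residual(s):
        b = flow(circle_point(radius, s), t)
        return sum(n * (bi - ri) for n, bi, ri in zip(normal, b, r))

    s = bisect(plane_residual, s0 - width, s0 + width)
    return flow(circle_point(radius, s), t)


def new_mesh_point(radius, s0, delta_i, dt=0.01, t_max=5.0):
    """Continue from t = 0 and return (t_r, b_r(t_r)) at the first zero of (11)."""
    r = circle_point(radius, s0)

    def test_function(t):
        return delta_i - math.dist(end_point(radius, s0, t), r)

    t = 0.0
    while test_function(t + dt) > 0.0:
        t += dt
        if t >= t_max:
            raise RuntimeError("no zero of the test function found")
    t_r = bisect(test_function, t, t + dt)
    return t_r, end_point(radius, s0, t_r)
```

For example, `new_mesh_point(1.0, 1.0, 0.25)` starts from the mesh point at angle 1 on the unit circle and returns a point at distance 0.25 from it in the plane $\mathcal{F}_r$. In this flat toy example the returned point lies on the circle of radius 1.25, which is the next geodesic circle.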
2.2. Mesh adaptation

Once all candidate points in $M_{i+1}$ have been found, all for the same $\Delta_i$, then it is decided whether the step size $\Delta_i$ was appropriate. To this end, it is checked that the curvature of (approximate) geodesics through all points $r \in M_i$ was not too large. This is done with a criterion that was originally introduced for one-dimensional global manifolds of maps [Hobson, 1993]. Let $\alpha_r$ denote the angle between the line through $r$ and $b_r(t_r)$ and the line through $p_r$ and $r$, where $p_r \in M_{i-1}$ is the associated point of $M_{i-1}$ on the approximate geodesic. The step of geodesic distance $\Delta_i$ was acceptable if both

$$\alpha_r < \alpha_{max}, \quad \text{and} \tag{12}$$

$$\Delta_i \cdot \alpha_r < (\Delta\alpha)_{max} \tag{13}$$

hold for all $r \in M_i$. In this case $M_{i+1}$ is accepted and step $i$ is complete. If there is some $r \in M_i$ that fails either (12) or (13) then $\Delta_i$ is halved and step $i$ is repeated with this smaller $\Delta_i$. Similarly, $\Delta_i$ may be doubled if for every $r \in M_i$ both $\alpha_r$ and $\Delta_i \cdot \alpha_r$ are well below the respective upper bounds in (12) or (13), say, below $\alpha_{min}$ and $(\Delta\alpha)_{min}$ respectively. The parameters $\alpha_{min}$, $\alpha_{max}$, $(\Delta\alpha)_{min}$, and $(\Delta\alpha)_{max}$ implicitly determine the mesh adaptation along geodesics and are fixed by the user before a computation.

It is important to ensure that $C_{i+1}$ is also a good approximation of $S_{\eta_{i+1}}$. In other words, neighboring points of $M_{i+1}$ may not be too close or too far from each other. When two neighboring points of $M_i$ lead to two neighboring points of $M_{i+1}$ at more than the prespecified distance $\Delta_{\mathcal{F}}$ from each other, then a new point is added in between. This is not done by interpolating between points of $M_{i+1}$ but by applying the procedure of Sec. 2.1 for finding a new point in $M_{i+1}$ to the middle point on $C_i$. In other words, no interpolation is ever performed between points that are more than $\Delta_{\mathcal{F}}$ distance apart. In order to ensure proper order relations between directly neighboring points of $M_{i+1}$ a point is removed if two neighboring points in $M_{i+1}$ lie closer together than a prespecified distance $\delta_{\mathcal{F}}$.

The mesh adaptation as described ensures that the overall error of a computation up to a prescribed geodesic distance $L$ is bounded. This means that the computed piece of the manifold lies in an $\varepsilon$-neighborhood of $W^u(\mathbf{x}_0)$, provided the accuracy parameters are chosen small enough; see [Krauskopf & Osinga, 2003] for the proof.

2.3. The Lorenz manifold approximated by geodesic circles

Figure 5 shows the Lorenz manifold $W^s(\mathbf{0})$ represented by a total of 75 bands and with total geodesic distance 154.75. The manifold was computed starting with a mesh $M_0$ of 20 points on $S_\delta \subset E^s(\mathbf{0})$ with $\delta = 1.0$. The computation was initiated with $\Delta_i = 0.25$ and the mesh was generated using the accuracy parameters $\alpha_{min} = 0.3$, $\alpha_{max} = 0.4$, $(\Delta\alpha)_{min} = 0.1$, $(\Delta\alpha)_{max} = 1.0$, $\Delta_{\mathcal{F}} = 2.0$, and $\delta_{\mathcal{F}} = 0.67$. The coloring illustrates the geodesic distance from the origin, where blue is small, green is intermediate and red is large. The manifold was rendered as a two-dimensional surface with the visualization package Geomview [Phillips et al., 1993]; other illustrations of the Lorenz manifold can be found in [Krauskopf & Osinga, 2003, 2004; Osinga & Krauskopf, 2002] and animations with [Krauskopf & Osinga, 2003, 2004].

Figure 5(a) shows the entire computed part of the Lorenz manifold from the common viewpoint; notice the similarity with the geodesic level sets in Fig. 2. Figure 5(b) shows an enlargement of the Lorenz manifold where the manifold is now transparent. This brings out the detail of the manifold, in particular, the development of a pair of extra helices that follow the main helix along the $z$-axis. Notice that points of the same color are on the same geodesic circle, which shows that points on $W^s(\mathbf{0})$ that are close to the origin in Euclidean distance need not be close to the origin in geodesic distance. Figure 5(c) shows a further enlargement near the Lorenz attractor, which is illustrated in magenta by plotting the unstable manifold $W^u(\mathbf{0})$. In this image only every second band is shown to obtain a see-through effect, showing clearly how the Lorenz manifold "rolls" into the Lorenz attractor.

Figure 5(d) gives an impression of the computed mesh with an enlargement looking into one of the outer scrolls. Geodesic circles can be seen as spiraling curves (between bands of the same color). The approximate geodesics are the curves that point approximately radially out in the image. They are perpendicular to the geodesic circles, and locations where points were added can be identified as starting points of new approximate geodesics. Notice that the last six bands are closer together. The image illustrates how the distance between geodesic circles is determined by the curvature along geodesics, while the mesh distribution on the
geodesic circles is allowed to vary between $\delta_{\mathcal{F}} = 0.67$ and $\Delta_{\mathcal{F}} = 2.0$.

Fig. 5. The Lorenz manifold computed with the method of Krauskopf and Osinga up to geodesic distance 154.75. Panel (a) shows the entire manifold, panel (b) an enlargement where the manifold is transparent, panel (c) a further enlargement near the Lorenz attractor (in magenta) where only every second band is shown, and panel (d) the computed mesh when looking into the outer scroll.

3. BVP Continuation of Trajectories

It seems very natural to use parametrization (8) for defining a one-parameter family that describes the unstable manifold $W^u(\mathbf{x}_0)$ of a saddle equilibrium $\mathbf{x}_0$ of (1). An approximation to $W^u(\mathbf{x}_0)$ could then be attempted by simple integration of Eq. (1) for a sufficient number of initial conditions that lie on the circle (or ellipse) $S_\delta$ of small radius $\delta$ in the unstable eigenspace $E^u(\mathbf{x}_0)$ centered at $\mathbf{x}_0$. However, as was already explained in Sec. 1.3, this procedure
does not generally produce $W^u(\mathbf{x}_0)$ as a surface. The main task is to properly space the initial conditions around the circle, so that the result gives a reasonable distribution of the computed trajectories along the stable manifold. This is a major problem because the entire calculated trajectory (e.g. of a fixed finite length) depends very sensitively on the initial condition.

The method of Doedel uses numerical continuation to solve this problem. The basic idea of continuation is to follow a (one-dimensional) branch of solutions that exists according to the Implicit Function Theorem around a regular root of a system of $m$ equations with $m + 1$ unknowns. The step size in the continuation procedure (see Sec. 3.1 for details) measures the change of the entire computed trajectory (and various parameters), and not just the change in the initial condition. It is this key property of continuation that generally results in a reasonable distribution of trajectories along the stable manifold.

In this section we only consider the computation of one-parameter families of trajectories, which together describe a two-dimensional (un)stable manifold of a fixed point. Most existing continuation algorithms can handle the computation of such one-dimensional families (also called solution branches); see, for example [Beyn et al., 2002; Doedel et al., 1991a; Doedel et al., 1991b; Keller, 1977; Rheinboldt, 1986; Seydel, 1995], and [Kuznetsov, 1998, Chapter 10]. The continuation method described here was implemented in the continuation package AUTO [Doedel, 1981; Doedel et al., 1997; Doedel et al., 2000] by specifying the respective driver files.

Continuation algorithms have also been developed for the higher-dimensional case; see, for example [Allgower & Georg, 1996; Henderson, 2002]. Hence, this method could be applied, in principle, equally well to compute manifolds of dimension larger than two.

3.1. Pseudo-arclength continuation

Let us begin with a discussion of some basic notions of continuation. Consider the finite-dimensional equation

$$F(X) = 0, \quad F: \mathbb{R}^{m+1} \to \mathbb{R}^m, \tag{14}$$

where $F$ is assumed to be sufficiently smooth. This equation has one more variable than it has equations. Given a solution $X_0$, one has, generically, a locally unique solution branch that passes through $X_0$. To compute a next point, say, $X_1$, on this branch, one can use Newton's method to solve the extended system

$$F(X_1) = 0, \tag{15}$$

$$(X_1 - X_0)^* \dot{X}_0 = \Delta s. \tag{16}$$

Here $\dot{X}_0$ is the unit tangent to the path of solutions at $X_0$, the symbol $*$ denotes transpose, and $\Delta s$ is a step size in the continuation procedure. The vector $\dot{X}_0$ is a null vector of the $m \times (m + 1)$-dimensional Jacobian matrix $F_X(X_0)$, and it can be computed at little cost [Doedel et al., 1991a]. This continuation method is known as Keller's pseudo-arclength method [Keller, 1977]. The size of the pseudo-arclength step $\Delta s$ is normally adapted along the branch, depending, for example, on the convergence history of Newton's method. It is very important to note that the stepsize is measured with respect to all components of the solution, and not just one.

The continuation procedure is well posed near a regular solution $X_0$, that is, if the null space of $F_X(X_0)$ is one-dimensional. Namely, in this case the Jacobian of the entire system (15)-(16) at $X_0$, that is, the $(m + 1) \times (m + 1)$ matrix

$$\begin{pmatrix} F_X(X_0) \\ \dot{X}_0^* \end{pmatrix} \tag{17}$$

is nonsingular. The Implicit Function Theorem then guarantees that a locally unique solution branch passes through $X_0$. This branch can be parametrized locally by $\Delta s$. Moreover, for $\Delta s$ sufficiently small, and for sufficiently accurate initial approximation (for example, when taking $X_1^{(0)} = X_0 + \Delta s \, \dot{X}_0$), Newton's method for solving Eqs. (15)-(16) converges.

3.2. Boundary value problem formulation

When computing a branch of solutions to an ODE of the form (1), parametrized by initial conditions and the integration time $T$, one must keep in mind that (1) has infinitely many solutions and boundary or integral constraints must be imposed. Furthermore, the pseudo-arclength constraint (16) is then typically given in functional form; more details can be found in [Doedel et al., 1991b]. This means that the possibly unknown total integration time $T$ is embedded in the equations. To this end, the vector
field (1) is rescaled so that integration always takes place over the interval $[0,1]$, and the actual integration time $T$ appears as a parameter. Hence, in this context, Eqs. (15)-(16) take the form

$$\mathbf{x}_1'(t) = f(\mathbf{x}_1(t), \lambda_1), \tag{18}$$

$$b(\mathbf{x}_1(0), \mathbf{x}_1(1), \lambda_1) = 0, \tag{19}$$

$$\int_0^1 g(\mathbf{x}_1(s), \lambda_1)\, ds = 0, \tag{20}$$

$$\int_0^1 (\mathbf{x}_1(\tau) - \mathbf{x}_0(\tau))^* \dot{\mathbf{x}}_0(\tau)\, d\tau + (\lambda_1 - \lambda_0)^* \dot{\lambda}_0 = \Delta s, \tag{21}$$

where the dimension of $\lambda_1$ must be chosen consistently with the dimensions of the boundary conditions (19) and the integral constraints (20) in order to ensure a one-dimensional family of solutions. Again, we stress that the continuation stepsize is for the entire solution $X$, and not just for the parameter vector $\lambda_1$. Equations (15)-(16) must be solved for $X_1 = (\mathbf{x}_1(\cdot), \lambda_1)$, given a previous solution $X_0 = (\mathbf{x}_0(\cdot), \lambda_0)$ of the ODE and the path tangent $\dot{X}_0 = (\dot{\mathbf{x}}_0(\cdot), \dot{\lambda}_0)$. That is, in a function space setting, Eqs. (18)-(20) correspond to the equation $F(X) = 0$, as in Eq. (14). Note that the dimension $(m+1)$ of $X = (\mathbf{x}(\cdot), \lambda)$ may be much larger than the dimension $n$ of the phase space of (1). In particular, $\lambda$ always contains the parameter $T$, which may or may not vary during the continuation; see Sec. 3.3 for specific examples. If $\lambda = T$ then $f(\mathbf{x}_1(t), \lambda_1) = T f(\mathbf{x}_1(t))$.

In each continuation step, Eqs. (18)-(21) are solved by a numerical boundary value algorithm. Here, the package AUTO [Doedel, 1981; Doedel et al., 1997; Doedel et al., 2000] is used, which uses piecewise polynomial collocation with Gauss-Legendre collocation points (also called orthogonal collocation), similar to COLSYS with adaptive mesh selection [Ascher et al., 1995; De Boor & Swartz, 1973; Russell & Christiansen, 1978]. In combination with continuation, this allows the numerical solution of "difficult" orbits. Moreover, for the case of periodic solutions, AUTO determines the characteristic multipliers (or Floquet multipliers) that determine asymptotic stability and bifurcation properties, as a by-product of the decomposition of the Jacobian of the boundary value collocation system [Doedel et al., 1991b; Fairgrieve & Jepson, 1991]; see also [Lust, 2001].

3.3. BVP continuation of the (un)stable manifold of an equilibrium

Consider now the situation that (1) has a saddle equilibrium $\mathbf{x}_0$ with a two-dimensional unstable manifold, meaning that the Jacobian $Df(\mathbf{x}_0)$ has exactly two eigenvalues $\mu_1$ and $\mu_2$ with positive real part. Suppose further that $\mathbf{v}_1$ and $\mathbf{v}_2$ are the associated (generalized) eigenvectors. We are looking for solutions of the system

$$\mathbf{x}'(t) = T f(\mathbf{x}(t)), \tag{22}$$

$$\mathbf{x}(0) = \mathbf{x}_0 + \delta(\cos(\theta)\mathbf{v}_1 + \sin(\theta)\mathbf{v}_2), \tag{23}$$

which is a combination of Eqs. (18) and (19) with $\lambda = (\theta, T)$. Note that in Eqs. (22)-(23) the continuation equation corresponding to Eq. (21) (or Eq. (16)) has been omitted, even though it is an essential part of the continuation procedure. The continuation equation will also not be written explicitly in subsequent continuation systems.

If the eigenvalues $\mu_1$ and $\mu_2$ are real, then it is advantageous to choose the initial condition on the ellipse that is given by the ratio of the eigenvalues as

$$\mathbf{x}(0) = \mathbf{x}_0 + \delta\left(\cos(\theta)\frac{\mathbf{v}_1}{|\mu_1|} + \sin(\theta)\frac{\mathbf{v}_2}{|\mu_2|}\right). \tag{24}$$

In other words, in the continuation Eq. (23) is replaced by Eq. (24).

Obvious starting data for the system (22)-(23) consist of a value of $\theta$ ($0 \le \theta < 2\pi$), $T = 0$, and $\mathbf{x}(t) \equiv \mathbf{x}_0 + \delta(\cos(\theta)\mathbf{v}_1 + \sin(\theta)\mathbf{v}_2)$, that is, $\mathbf{x}(t)$ is constant. An actual trajectory for a specific value of $\theta$ can now be obtained using continuation as well. While this may seem superfluous, it has the added benefit that the output files of this first step in AUTO are then compatible with subsequent continuation steps. In this continuation step, system (22)-(23) is solved for $X = (\mathbf{x}(\cdot), T)$, keeping the angle $\theta$ fixed. Here, $T > 0$ for an unstable manifold and $T < 0$ for a stable manifold since then integration is backward (or negative) in time.

Once a single orbit is obtained up to a desired length, defined by a suitable end-point condition, then this orbit is continued numerically as a boundary value problem where the initial condition on the small circle (or ellipse) is now a component of the continuation variable. In this way, the family (8) of such orbits on (part of) the unstable manifold $W^u(\mathbf{x}_0)$ is approximated. The simplest way to do this is to fix $T$ in the continuation system (22)-(23) after the first step and allow $\theta$, the angle of the starting point on $S_\delta$, to vary freely. It is important to note that $\theta$ is not used as the sole continuation parameter. Instead each continuation step is taken in the full continuation variable $X = (\mathbf{x}(\cdot), \theta)$, so that the continuation stepsize includes variations along the entire orbit. Also, $\theta$ is one of the variables solved for in each continuation step and it is not fixed a priori.
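As an illustration of the starting data in Eqs. (22)-(24), the following minimal sketch builds a point on the ellipse (24) for the origin of the Lorenz system and integrates the rescaled system $\mathbf{x}' = T f(\mathbf{x})$ over $[0,1]$. It is not the collocation scheme used by AUTO: a plain fourth-order Runge-Kutta integrator is swapped in, the classical parameter values $\sigma = 10$, $\rho = 28$, $\beta = 8/3$ are assumed, and all function names are hypothetical.

```python
import math

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0  # classical Lorenz parameters (assumed)


def lorenz(x):
    """Right-hand side f of the Lorenz system."""
    return (SIGMA * (x[1] - x[0]),
            RHO * x[0] - x[1] - x[0] * x[2],
            x[0] * x[1] - BETA * x[2])


def stable_eigendata():
    """Stable eigenvalues of Df(0) with unit eigenvectors."""
    mu1 = 0.5 * (-(SIGMA + 1.0)
                 - math.sqrt((SIGMA - 1.0) ** 2 + 4.0 * SIGMA * RHO))
    mu2 = -BETA
    # First row of (Df(0) - mu1*I) v = 0 with v = (1, vy, 0).
    vy = (SIGMA + mu1) / SIGMA
    scale = math.hypot(1.0, vy)
    return (mu1, (1.0 / scale, vy / scale, 0.0)), (mu2, (0.0, 0.0, 1.0))


def ellipse_point(theta, delta):
    """Initial condition (24) for the equilibrium x0 = 0."""
    (mu1, v1), (mu2, v2) = stable_eigendata()
    a = delta * math.cos(theta) / abs(mu1)
    b = delta * math.sin(theta) / abs(mu2)
    return tuple(a * p + b * q for p, q in zip(v1, v2))


def integrate_scaled(f, x0, T, steps=2000):
    """RK4 for x'(t) = T f(x(t)) on [0, 1]; returns x(1)."""
    h = 1.0 / steps
    x = tuple(x0)
    for _ in range(steps):
        k1 = [T * c for c in f(x)]
        k2 = [T * c for c in f([a + 0.5 * h * k for a, k in zip(x, k1)])]
        k3 = [T * c for c in f([a + 0.5 * h * k for a, k in zip(x, k2)])]
        k4 = [T * c for c in f([a + h * k for a, k in zip(x, k3)])]
        x = tuple(a + h / 6.0 * (p + 2.0 * q + 2.0 * r + s)
                  for a, p, q, r, s in zip(x, k1, k2, k3, k4))
    return x


# T < 0 integrates backward in time, as required for a stable manifold.
start = ellipse_point(math.pi / 2.0, 5.0)
end = integrate_scaled(lorenz, start, -0.1)
```

For $\theta = \pi/2$ the starting point lies on the invariant $z$-axis, so the end point can be compared with the closed-form solution $z(1) = z(0)\,e^{-\beta T}$, which is a convenient sanity check of the time rescaling.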
Instead of keeping $T$ fixed, there are other ways to perform the continuation. For example, one can constrain the end point $\mathbf{x}(1)$ as one wishes. This is done by adding to system (22)-(23) the equation

$$g(\mathbf{x}(1), \theta, T) - \alpha = 0. \tag{25}$$

Here $g$ is an appropriate functional, chosen to control the end point in a desirable manner, for example, by requiring one coordinate to have a particular fixed value. The continuation variable can now be taken as $X = (\mathbf{x}(\cdot), \theta, T)$, while $\alpha$ is kept fixed.

Another possibility is to impose an integral constraint along the orbit, namely adding to (22)-(23) the equation

$$\int_0^1 h(\mathbf{x}(s), \theta, T)\, ds - L = 0. \tag{26}$$

Now $h$ is an appropriate functional, chosen to control the orbit in a desirable manner. The continuation variable can again be taken as $X = (\mathbf{x}(\cdot), \theta, T)$, but now keeping $L$ fixed. A particularly useful choice is $h(\mathbf{x}, \lambda, T) = T\|f(\mathbf{x}, \lambda)\|$, which results in the total arclength of the orbit being kept fixed during the continuation. Finally, it is entirely possible to use a combination of end-point conditions and integral constraints, but this will not be used here.

3.4. The Lorenz manifold as a family of trajectories

Figure 6 shows an enlargement near the origin of the orbits that were continued on the Lorenz manifold $W^s(0)$ (for negative $T$). The angle $\theta$ is allowed to vary from $0$ to $2\pi$, so that the initial condition varies along the ellipse in the middle of the image, which is defined by (24) with $\delta = 5.0$, $\mu_1 = -22.828$ and $\mu_2 = -2.667$. All orbits have the same arclength and the coloring is in terms of the total integration time $T$ along each trajectory. In other words, the coloring gives an indication of the speed of the flow along trajectories, where red is fast and green is slower. The flow is fastest along the strong stable manifold, which is located in the middle of the red region. Note that the distribution of points is much denser near the top and bottom of the ellipse, that is, near the invariant $z$-axis, which ensures a good distribution of orbits over the Lorenz manifold $W^s(0)$.

Fig. 6. Continued trajectories on $W^s(0)$ near the origin starting from the ellipse (24) with $\delta = 5.0$, $\mu_1 = -22.828$ and $\mu_2 = -2.667$; the coloring is according to integration time $T$, where red indicates faster and green slower flow.

Figures 7(a)-7(c) show the Lorenz manifold $W^s(0)$ covered by 2284 trajectories of arclength 250, where the ellipse of initial conditions is as in Fig. 6. The number of mesh points along each trajectory was NTST = 75, with NCOL = 4 collocation points in each mesh interval. Figure 7(a) shows the entire computed part of the Lorenz manifold from the common viewpoint. The coloring changes from blue to red according to the mesh point number along a trajectory, which gives an impression of the arclength of trajectories. Figures 7(b) and 7(c) show enlargements where the coloring shows the total integration time $T$ along trajectories. As in Fig. 6, this indicates the speed of the flow; the strong stable manifold is located in the red region of fast flow. In Fig. 7(b) every fourth trajectory is rendered as a thin tube. This results in a better sense of depth so that an impression is given of how trajectories lie in phase space to form $W^s(0)$. Figure 7(c) is an enlargement of Fig. 7(a) (though with a different color scheme) showing how the manifold forms a scroll.
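The fixed-arclength computations above rest on the integral constraint (26) with $h = T\|f\|$. A minimal shooting sketch of that constraint is given below: it evaluates the residual of (26) along an orbit of the rescaled system and adjusts $T$ by bisection until the orbit has length $L$. This is a stand-in for, not a reproduction of, the collocation and pseudo-arclength machinery in AUTO. The function names, the use of $|T|$ (so that backward orbits have positive length) and the bracketing value `T_max` are assumptions made for the illustration.

```python
import math


def lorenz(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz system (classical parameters assumed)."""
    return (sigma * (x[1] - x[0]),
            rho * x[0] - x[1] - x[0] * x[2],
            x[0] * x[1] - beta * x[2])


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def arclength_residual(f, x0, T, L, steps=1000):
    """Residual of (26) with h = |T| * ||f||: integral over [0, 1] minus L."""
    h = 1.0 / steps
    x = tuple(x0)
    total = 0.0
    speed = abs(T) * norm(f(x))
    for _ in range(steps):
        # One RK4 step of x' = T f(x).
        k1 = [T * c for c in f(x)]
        k2 = [T * c for c in f([a + 0.5 * h * k for a, k in zip(x, k1)])]
        k3 = [T * c for c in f([a + 0.5 * h * k for a, k in zip(x, k2)])]
        k4 = [T * c for c in f([a + h * k for a, k in zip(x, k3)])]
        x = tuple(a + h / 6.0 * (p + 2.0 * q + 2.0 * r + s)
                  for a, p, q, r, s in zip(x, k1, k2, k3, k4))
        new_speed = abs(T) * norm(f(x))
        total += 0.5 * h * (speed + new_speed)  # trapezoidal rule
        speed = new_speed
    return total - L


def solve_T_for_arclength(f, x0, L, T_max, iters=50):
    """Bisection between T = 0 and T_max, assuming the orbit for T_max
    is longer than L (T_max may be negative for backward integration)."""
    lo, hi = 0.0, T_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if arclength_residual(f, x0, mid, L) < 0.0:
            lo = mid  # orbit still too short: move away from T = 0
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Orbit on the invariant z-axis, integrated backward until its length is 2.
T_fit = solve_T_for_arclength(lorenz, (0.0, 0.0, 1.0), 2.0, -1.0)
```

On the $z$-axis the orbit length is $z(0)(e^{-\beta T} - 1)$, so for $z(0) = 1$ and $L = 2$ the exact answer is $T = -\ln 3 / \beta$, which the sketch reproduces to within the quadrature error.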
Fig. 7. The Lorenz manifold computed with the continuation method of Doedel. Panels (a)-(c) show the manifold where
the arclength of the trajectories is fixed at L = 250. In panel (a) the coloring indicates the arclength along trajectories and
in panels (b) and (c) the coloring is according to the total integration time T of trajectories; the strong stable manifold lies
inside the red region. Panels (a) and (c) show all trajectories, while panel (b) shows only every fourth trajectory as a tube.
Panel (d) demonstrates that only a part of interest of the stable manifold may be computed, such as a part of the main scroll;
this was done by fixing $x = -25$ at the end point of trajectories.
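The end-point condition (25) used for panel (d) can be illustrated in the same shooting style. The sketch below is only an assumption-laden stand-in for the boundary value formulation: it fixes the initial point, treats $T$ as the single unknown, and applies a secant iteration to $g(\mathbf{x}(1)) - \alpha = 0$. The linear saddle used as the vector field is hypothetical and was chosen because the answer is known in closed form; all names are illustrative.

```python
import math


def flow_endpoint(f, x0, T, steps=1000):
    """x(1) for x'(t) = T f(x(t)), x(0) = x0, computed with RK4."""
    h = 1.0 / steps
    x = tuple(x0)
    for _ in range(steps):
        k1 = [T * c for c in f(x)]
        k2 = [T * c for c in f([a + 0.5 * h * k for a, k in zip(x, k1)])]
        k3 = [T * c for c in f([a + 0.5 * h * k for a, k in zip(x, k2)])]
        k4 = [T * c for c in f([a + h * k for a, k in zip(x, k3)])]
        x = tuple(a + h / 6.0 * (p + 2.0 * q + 2.0 * r + s)
                  for a, p, q, r, s in zip(x, k1, k2, k3, k4))
    return x


def solve_T_for_endpoint(f, x0, g, alpha, T0, T1, tol=1e-12, max_iter=50):
    """Secant iteration on T for the end-point condition g(x(1)) - alpha = 0."""
    r0 = g(flow_endpoint(f, x0, T0)) - alpha
    r1 = g(flow_endpoint(f, x0, T1)) - alpha
    for _ in range(max_iter):
        if abs(r1) < tol or r1 == r0:
            break
        T0, T1, r0 = T1, T1 - r1 * (T1 - T0) / (r1 - r0), r1
        r1 = g(flow_endpoint(f, x0, T1)) - alpha
    return T1


def linear_saddle(x):
    """Hypothetical test field with stable directions x, y and unstable z."""
    return (-2.0 * x[0], -x[1], x[2])


# Integrate backward (T < 0) until the first coordinate reaches alpha = 4.
T_hit = solve_T_for_endpoint(linear_saddle, (1.0, 0.5, 0.0),
                             lambda x: x[0], 4.0, -0.1, -0.2)
```

For this field the first coordinate evolves as $e^{-2T}$, so the condition is met at $T = -\tfrac{1}{2}\ln 4$. In the continuation setting described in the text, $\theta$ would vary together with $T$ and the whole orbit, which this one-parameter sketch does not attempt.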
Figure 7(d) illustrates the flexibility of the method by showing part of the Lorenz manifold computed by numerical continuation of solutions to the boundary value problem (22)-(23) and (25) for the choice $g(\mathbf{x}, \lambda, T) = x$. This results in the $x$-coordinate of the end point $\mathbf{x}(1)$ being kept fixed during the continuation, and it was set to $x = -25$ in the computation. For an appropriate choice of $\alpha$, for which some trajectories intersect this plane several times, the continuation procedure then naturally leads to nonmonotonic variation of $\theta$, thereby allowing the computation of a scroll-like structure on the stable manifold. In Fig. 7(d) the origin is the point on the right from which all trajectories emerge.

4. Computation of Fat Trajectories

The method of Henderson [2003] computes a compact piece of a $k$-dimensional invariant manifold by covering it with $k$-dimensional spherical balls in the tangent space, centered at a set of well-distributed points. This set is found by computing so-called fat trajectories, which are trajectories augmented with tangent and curvature information at each point. The centers of the balls are points on the fat trajectory, and the radius is determined by the curvature.

For the implemented case of computing a two-dimensional unstable manifold $W^u(\mathbf{x}_0)$ of a saddle point of (1), the method starts with a small circle $S_\delta \subset E^u(\mathbf{x}_0)$ and at every step circular disks are added along a fat trajectory with a fixed total arclength (from $\mathbf{x}_0$) of $L$. Initially all fat trajectories start on $S_\delta$, but at later stages fat trajectories begin at points interpolated where two fat trajectories move too far from each other. The method stops when $W^u(\mathbf{x}_0)$ has been covered up to the prescribed arclength $L$.

4.1. Fat trajectories on the global stable manifold

The method requires a basis for the tangent space and the curvatures in that basis to construct the disks. As was mentioned in the introduction, invariant manifolds are not defined locally, so that there is no local way of determining the tangent space or curvature for a given point on the invariant manifold. This information is known at points on the initial curve $S_\delta \subset E^u(\mathbf{x}_0)$, for example, the tangent to $S_\delta$ is known, and if the flow is transverse to the initial curve $S_\delta$, $f$ can be used as the second tangent. The circle $S_\delta$ (or possibly an ellipse) may be chosen to be transverse to the flow for sufficiently small $\delta$. The curvature information can be obtained using the second derivative tensor.

The tangent and curvature can be "transported" over $W^u(\mathbf{x}_0)$ by deriving and solving evolution equations along a trajectory. To this end, one writes the parametrization (5) in the form

$$\mathbf{x}(t, \sigma) = \mathbf{c}(\sigma) + \int_0^t f(\mathbf{x}(s, \sigma))\, ds, \tag{27}$$

where $\mathbf{c}(\sigma)$ parametrizes $S_\delta$ with the one-dimensional parameter $\sigma$. (An example of such a parametrization is (24).) Then the tangent space at $\mathbf{x}(t, \sigma)$ is spanned by $\mathbf{x}_\sigma$ and $\mathbf{x}_t = f$, and the corresponding curvatures are given by the second derivatives $\mathbf{x}_{\sigma\sigma}$, $\mathbf{x}_{t\sigma} = f_x \mathbf{x}_\sigma$ and $\mathbf{x}_{tt} = f_x f$. Evolution equations for the unknown quantities can be found by differentiating (27)

$$\frac{d}{dt}\mathbf{x} = f, \tag{28}$$

$$\frac{d}{dt}\mathbf{x}_\sigma = f_x \mathbf{x}_\sigma, \tag{29}$$

$$\frac{d}{dt}\mathbf{x}_{\sigma\sigma} = f_{xx}\mathbf{x}_\sigma \mathbf{x}_\sigma + f_x \mathbf{x}_{\sigma\sigma}. \tag{30}$$

Note that, even if $\mathbf{x}_\sigma$ is orthogonal to $f$ at the initial point, there is no reason to expect the basis to remain orthogonal. In [Henderson, 2003], equations are derived for the evolution of a local parametrization which does remain orthonormal and has minimal change in the basis along the trajectory. (This is analogous to finding Riemannian normal coordinates in gravitation, where trajectories play the role of geodesics [Misner et al., 1970].) If the tangents in the local parametrization are $\mathbf{u}_0$ and $\mathbf{u}_1$, they evolve according to

$$\frac{d}{dt}\mathbf{u}_0 = f_x \mathbf{u}_0 - \mathbf{u}_0^T f_x \mathbf{u}_0\, \mathbf{u}_0 - \mathbf{u}_1^T f_x \mathbf{u}_0\, \mathbf{u}_1, \tag{31}$$

$$\frac{d}{dt}\mathbf{u}_1 = f_x \mathbf{u}_1 - \mathbf{u}_0^T f_x \mathbf{u}_1\, \mathbf{u}_0 - \mathbf{u}_1^T f_x \mathbf{u}_1\, \mathbf{u}_1. \tag{32}$$

4.2. Interpolation points on the invariant manifold

The method starts with a set of well-distributed points on the initial curve $S_\delta$, which can be found
using the algorithm described in [Henderson, 2002]. At each such point on $S_\delta$ an orthonormal basis for the invariant manifold and second derivatives of the manifold in that basis are computed, and used as initial conditions for finding a set of disks along a fat trajectory. Because trajectories may move apart from each other, these disks will generally not cover $W^u(\mathbf{x}_0)$; see Fig. 8. This means that additional fat trajectories must be started at suitable points until $W^u(\mathbf{x}_0)$ is covered. In order to generate a well-spaced set of points on $W^u(\mathbf{x}_0)$, one chooses a starting point from the boundary of the computed part of the manifold.

The method in [Henderson, 2002] represents the boundary of the union of disks $\{D_i\}$ using polygons related to the Voronoi regions of the centers of the disks. A disk $D_i$ consists of a center $\mathbf{x}(t_i, \sigma_i)$ (a point on a fat trajectory), the orthonormal basis for the tangent space of the manifold $\mathbf{u}_0(t_i, \sigma_i)$ and $\mathbf{u}_1(t_i, \sigma_i)$, a radius $R_i$, and polygon $P_i$. The polygon $P_i$ is represented by a list of vertices in the tangent space and edges joining them (which actually works in arbitrary dimensions). The polygons are constructed in such a way that each edge of $P_i$ that crosses the boundary of $D_i$ corresponds to a neighboring disk $D_j$. The situation is sketched in Fig. 8.

Suppose that part of $W^u(\mathbf{x}_0)$ is represented in this way, and a new disk $D_i$ is to be added. $P_i$ is initially a square centered at the origin with sides $2R_i$, and for each disk $D_j$ that intersects the new disk $D_i$ complementary half spaces are subtracted from $P_i$ and $P_j$. The projection of $D_j$ into the tangent space at $\mathbf{x}_i$ is approximated by a disk of radius $R_j$ centered at the projection of $\mathbf{x}_j$. If $R_i$ and $R_j$ are small enough so that the distance between the tangent space and the manifold is small (this depends on the curvature of $W^u(\mathbf{x}_0)$), then this is a good approximation. This pair of disks in the tangent space at $\mathbf{x}_i$ defines a line containing the intersection of the circles bounding the disks, and one subtracts from $P_i$ a half space bounded by this line. The same approach is used to update $P_j$ by projecting $\mathbf{x}_i$ into the tangent space at $\mathbf{x}_j$.

With these polygons a point on the boundary of the union can easily be found. Any point on $\partial D_i \cap P_i$ is near the boundary of the union (the distance to the boundary is controlled by the distance between the tangent space and the manifold at the radius). Points on the boundary where two disks meet correspond to points where an edge of $P_i$ crosses $\partial D_i$ (the point obtained is in the tangent space of the manifold and must be projected onto the manifold).
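The half-space subtraction described above can be sketched in the tangent plane of a disk $D_i$ whose center is taken as the origin. The line through the two intersection points of the circles $|p| = R_i$ and $|p - c| = R_j$ is obtained by subtracting the two circle equations, which gives $2\,c \cdot p = |c|^2 + R_i^2 - R_j^2$. The sketch below clips a convex polygon against the side of this line that contains the center of $D_i$ (assuming the center does lie on that side, as for two overlapping disks of comparable size). It is an illustration under these assumptions, with hypothetical function names, not the data structure of [Henderson, 2002].

```python
def radical_half_plane(center_j, R_i, R_j):
    """Return (n, c) so that {p : n . p <= c} is bounded by the line through
    the intersection points of |p| = R_i and |p - center_j| = R_j."""
    cx, cy = center_j
    n = (2.0 * cx, 2.0 * cy)
    c = cx * cx + cy * cy + R_i * R_i - R_j * R_j
    return n, c


def clip_polygon(vertices, n, c):
    """Keep the part of a convex polygon (ordered list of (x, y)) with n . p <= c."""
    def side(p):
        return n[0] * p[0] + n[1] * p[1] - c

    out = []
    for k, p in enumerate(vertices):
        q = vertices[(k + 1) % len(vertices)]
        sp, sq = side(p), side(q)
        if sp <= 0.0:
            out.append(p)
        if (sp < 0.0 < sq) or (sq < 0.0 < sp):
            t = sp / (sp - sq)  # parameter of the crossing point on edge pq
            out.append((p[0] + t * (q[0] - p[0]), p[1] + t * (q[1] - p[1])))
    return out


def polygon_area(vertices):
    """Shoelace formula for a simple polygon."""
    total = 0.0
    for k, (x0, y0) in enumerate(vertices):
        x1, y1 = vertices[(k + 1) % len(vertices)]
        total += x0 * y1 - x1 * y0
    return 0.5 * abs(total)


# New disk D_i of radius 1: P_i starts as the square with sides 2 * R_i.
R_i, R_j = 1.0, 1.0
P_i = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
n, c = radical_half_plane((1.0, 0.0), R_i, R_j)  # neighbor centered at (1, 0)
P_i = clip_polygon(P_i, n, c)
```

With two unit disks whose centers are a distance 1 apart, the bounding line is $x = 1/2$, so the square is cut down to the rectangle $[-1, 1/2] \times [-1, 1]$. Repeating the clip for every neighboring disk leaves a polygon whose edges that cross $\partial D_i$ each correspond to one neighbor, as stated in the text.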
Fig. 8. Two adjacent fat trajectories starting from $S_\delta$. A new fat trajectory starts from the point where the two fat trajectories separate. This point can be found by interpolation between two suitable mesh points, indicated by the green lines.

If one considers the part of the invariant manifold that is not yet covered (that is, the exterior of the union of neighborhoods, $t < T$), one can define something resembling a constrained minimization problem (it lacks a global objective function), which looks for a point in this region that lies furthest back in time under the flow. With a mild assumption about the shape of the region (it must be a topological ball), such a minimal point must exist. It must lie on the boundary of the region at the intersection of two disks. This point is a "minimum" if the flow vector extended backwards intersects the interior of the edge joining the centers of the intersecting disks. (This is, in fact, Guckenheimer and Vladimirsky's upwinding criterion; see Sec. 5.) One can easily find candidate points on the boundary from the edges of the polygons, and checking the upwinding criterion is a matter of computing a projection. One can then either interpolate tangents and curvatures from the disks' centers (the method used in the computations shown in Fig. 9) or use a homotopy (as Doedel uses in AUTO [Doedel, 1981; Doedel et al., 1997; Doedel et al., 2000]) to move from the fat trajectory from $S_\delta$ through the center of one of the disks to the fat trajectory which
starts on $S_\delta$ and passes underneath the interpolation point.

This interpolation to find new starting points for fat trajectories completes the algorithm. It computes a covering of the manifold $W^u(\mathbf{x}_0)$ with disks centered at well-spaced points. Provided the disks are sufficiently small compared to the curvature, the algorithm is guaranteed to terminate, and all points lie on trajectories that originate on the initial curve $S_\delta$ or at points interpolated between nearby trajectories.

The fat trajectory, with its string of disks and polygons, is integrated until a prespecified total arclength $L$ is reached. This is repeated for all the points on the initial curve. (The total integration time $T$ of fat trajectories varies with the initial condition.)

4.3. The Lorenz manifold covered by fat trajectories

Figure 9 shows the Lorenz manifold $W^s(0)$ computed (using integration backward in time) up to a total trajectory arclength of 250. The step was controlled so that the distance between the tangent space and $W^u(\mathbf{x}_0)$ over each disk was less than 0.5. The scaled time step along trajectories was 0.01 (many more than one time step is taken between successive points on a fat trajectory), and no radius is greater than 2.0. The result was a total of 221,210 disks. Figure 9(a) shows the entire computed part of the Lorenz manifold from the common viewpoint. Figure 9(b) shows an enlargement of the Lorenz manifold near the central region where the manifold is now transparent. Notice the different "sheets" of manifold in the scroll and the extra helices forming around the $z$-axis. This complicated structure of the Lorenz manifold is further illustrated in Fig. 9(c) where only the half of $W^s(0)$ with negative $x$-coordinate is shown. The intersection curves of the manifold with the plane $\{x = 0\}$ are shown in white. Also shown is the one-dimensional unstable manifold $W^u(0)$ (red curve) accumulating on the Lorenz attractor (yellow) and the stable manifolds (blue curves) of the other two equilibria.

Figure 9(d) gives an impression of the computed mesh. The fat trajectories are the white curves and they are surrounded by the polygons that make up the Lorenz manifold. Clearly visible are points where new fat trajectories are started from interpolated data. The boundary of the manifold at termination simply consists of the disks that are of distance $L$ from $\mathbf{x}_0$ (measured along trajectories).

Fig. 9. The Lorenz manifold computed with the method of Henderson up to a total trajectory arclength of 250. Panel (a) shows a view of the entire manifold, panel (b) a transparent enlargement near the main scroll, panel (c) shows the part of the manifold for $x < 0$ together with the Lorenz attractor and the one-dimensional stable manifolds of the two other equilibria, and panel (d) gives an impression of the computed mesh.

5. PDE Formulation

Another method for approximating invariant manifolds of hyperbolic equilibria was introduced by Guckenheimer and Vladimirsky [2004]. Their approach locally models a codimension-one invariant manifold as the graph of a function $g$ satisfying a quasi-linear PDE that expresses the tangency of the vector field $f$ of (1) to the graph of $g$. The PDE is then discretized in an Eulerian framework and the manifold is approximated by a triangulated mesh. We denote by $M$ the triangulated approximation of the "known" part of the manifold. It can be extended by adding simplices at the current polygonal boundary $\partial M$ in a locally-outward direction in the tangent plane. The discretized version of the PDE is then solved to obtain the correct slope for the newly added simplices. To avoid solving the discretized equations simultaneously, an Ordered Upwind Method (OUM) is used to decouple the system: the causality is ensured by ordering the addition/recomputation of new simplices based on the lengths $\Lambda$ of the vector field's trajectories.

Two key ideas provide for the method's efficiency:

1. The use of Eulerian discretization ensures that geometric stiffness, a high nonuniformity of separation rates for nearby trajectories on different parts of the manifold, does not affect the quality of the produced approximation: new simplices constructed at the current boundary $\partial M$ are as regular as is compatible with the previously constructed mesh.
2. Since OUM is noniterative, the PDE-solving step of the method is quite fast.

5.1. Tangency condition

The method is explained here for a two-dimensional manifold $W^u(\mathbf{x}_0)$ of a saddle point $\mathbf{x}_0$ in $\mathbb{R}^3$; see [Guckenheimer & Vladimirsky, 2004] for more details. Let $(u, g(u)) = (u_1, u_2, g(u_1, u_2))$ be a local parametrization of the manifold of (1). Then the vector field $f$ should be tangential to the graph of
$g(u_1, u_2)$, that is, the dot product

$$\left[\frac{\partial g}{\partial u_1}(u_1, u_2),\ \frac{\partial g}{\partial u_2}(u_1, u_2),\ -1\right] \cdot f(u_1, u_2, g(u_1, u_2)) = 0. \tag{33}$$

The above first-order quasi-linear PDE can be solved to grow the manifold in steps, because the Dirichlet boundary condition is specified on the boundary $\partial M$ of the piece of the manifold computed in previous steps. The initial boundary is chosen by discretizing a small circle or ellipse $S_\delta \subset E^u(\mathbf{x}_0)$ that is transverse to $f$, so that the vector field is outward-pointing everywhere.

Unlike a general quasi-linear PDE, Eq. (33) always has a smooth solution as long as the chosen parametrization remains valid. Thus, switching to local coordinates when solving the PDE avoids checking the continued validity of the parametrization.

In [Guckenheimer & Vladimirsky, 2004] the PDE formulation (33) is extended to approximate two-dimensional manifolds in $\mathbb{R}^n$. A similar characterization can be used for general $k$-dimensional invariant manifolds in $\mathbb{R}^n$, but the current numerical implementation relies on $k = 2$.

The PDE approach for characterizing invariant surfaces goes back to at least the 1960s. The existence and smoothness of solutions for equations equivalent to (33) were the subjects of Sacker's analytical perturbation theory [Sacker, 1965] and later served as a basis for several numerical methods, for example, those in [Dieci & Lorenz, 1995; Dieci et al., 1991; Edoh et al., 1995]. However, all this work was done for the computation of invariant tori. There are two very important distinctions between the PDE methods for tori and the method presented in this section:

1. These prior methods assume the existence of a coordinate system in which the invariant torus is indeed globally a graph of a function $g: \mathbb{T}^k \to \mathbb{R}^{n-k}$. This implies the availability of a global mesh, on which the PDE can be solved. For invariant manifolds of hyperbolic equilibria such a mesh is not available a priori and has to be constructed in the process of growing the approximation $M$.
2. For the invariant tori computations, the solution function $g$ has periodic boundary conditions; hence, the discretized equations are inherently coupled and have to be solved simultaneously. For the approximation of $W^u(\mathbf{x}_0)$ all characteristics of the PDE start at the initial boundary (chosen in $E^u(\mathbf{x}_0)$) and run "outward". Knowledge of the direction of information flow can be used to decouple the discretized system, resulting in a much faster computational method.

5.2. Eulerian discretization

To enable decoupling of the discretized system, our discretization of Eq. (33) at a "new" mesh point $y$ has to be "upwinding", i.e. it should use only previously-computed mesh points straddling $y$'s approximate trajectory. For a two-dimensional invariant manifold in $\mathbb{R}^3$, let $G(u_1, u_2)$ be a piecewise-linear numerical approximation of the local parameterization $g(u_1, u_2)$. Consider a simplex $y y^1 y^2$, where $y^i = (u_1^i, u_2^i, G(u_1^i, u_2^i)) = (u^i, G(u^i))$ and $y = (u_1, u_2, G(u_1, u_2)) = (u, G(u))$. Suppose that the vertices $y^1$ and $y^2$ are two adjacent mesh points on the discretization of the current manifold boundary, called AcceptedFront (thus, $G(u^1)$ and $G(u^2)$ are known and can be used in computing $G(u)$). If $u$ is chosen so that the simplex $u u^1 u^2$ is well-conditioned, then $y = (u, G(u))$ can be determined from the PDE. Define the unit vectors $P_i = (u - u^i)/\|u - u^i\|$ and let $P$ be the square invertible matrix with the $P_i$'s as its rows. The directional derivative of $G$ in the direction $P_i$ can be computed as $v_i(u) = (G(u) - G(u^i))/\|u - u^i\|$, for $i = 1, 2$. Therefore, $\nabla g(u) \approx \nabla G(u) = P^{-1} v$, where $v = [v_1, v_2]^T$. This yields the discretized version of Eq. (33) as

$$[P^{-1} v(u)]_1\, f_1(u, G(u)) + [P^{-1} v(u)]_2\, f_2(u, G(u)) = f_3(u, G(u)). \tag{34}$$

This nonlinear equation can be solved for $G(u)$ by the Newton-Raphson method or any other robust zero-solver. In addition, it has an especially simple geometric interpretation if the local coordinates are chosen so that $G(u^1) = G(u^2) = 0$. Namely, we reduce the problem to finding the correct "tilt" of the simplex $y y^1 y^2$ with respect to the simplex $\tilde{y} y^1 y^2$ where $\tilde{y} = (u, 0)$ can be interpreted as a preliminary position (predictor) of $y$. (As discussed in Sec. 5.3 below, when $\tilde{y}$ is first added to the mesh, $u$ is chosen so that $\tilde{y} y^1 y^2$ is a well-conditioned simplex in a tangent plane.) Hence, solving Eq. (34) is equivalent to finding an $\alpha \in \mathbb{R}$ such that $f(\tilde{y} + \alpha w)$ lies in the plane defined by $y^1$, $y^2$, and $y = \tilde{y} + \alpha w$, where $w$ is the unit vector normal to $\tilde{y} y^1 y^2$; see Fig. 10. A similar discretization and geometric interpretation can be derived for the general case of $k \ge 2$ and $n \ge 3$ [Guckenheimer & Vladimirsky, 2004].
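The geometric form of Eq. (34) lends itself to a short sketch: with $y = (u_1, u_2, z)$, the unknown height $z = G(u)$ is the root of $n(z) \cdot f(y)$, where $n(z)$ is a normal to the simplex $y y^1 y^2$. The code below solves this scalar equation with a secant iteration. The vector field is a hypothetical example, chosen because its unstable manifold at the origin is known to be the surface $z = x^2$; the function names and the choice of the graph direction as search direction (rather than the normal $w$ of a tangent-plane predictor) are assumptions of this illustration.

```python
def sub(a, b):
    return tuple(p - q for p, q in zip(a, b))


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def tangency_residual(f, y1, y2, u, z):
    """n(z) . f(y) for y = (u1, u2, z) and n normal to the simplex y y1 y2."""
    y = (u[0], u[1], z)
    n = cross(sub(y1, y), sub(y2, y))
    return dot(n, f(y))


def solve_height(f, y1, y2, u, z0, dz=0.1, tol=1e-12, max_iter=50):
    """Secant iteration for the height z = G(u) that makes f tangent
    to the simplex, which is the content of Eq. (34)."""
    z_prev, z = z0, z0 + dz
    r_prev = tangency_residual(f, y1, y2, u, z_prev)
    r = tangency_residual(f, y1, y2, u, z)
    for _ in range(max_iter):
        if abs(r) < tol or r == r_prev:
            break
        z_prev, z, r_prev = z, z - r * (z - z_prev) / (r - r_prev), r
        r = tangency_residual(f, y1, y2, u, z)
    return z


def field(p):
    """Hypothetical field whose invariant surface through 0 is z = x**2."""
    x, y, z = p
    return (x, 2.0 * y, -z + 3.0 * x * x)


# Two accepted points on the surface and a new mesh point above u = (1.1, 0.1).
y1, y2 = (1.0, 0.0, 1.0), (1.0, 0.2, 1.0)
z_new = solve_height(field, y1, y2, (1.1, 0.1), z0=1.0)
```

For these data the residual is linear in $z$ and the computed height is $0.2926/0.24 \approx 1.2192$, compared with the exact value $1.1^2 = 1.21$; the gap of about $0.009$ reflects the discretization error of a single simplex with edge length of order $0.1$.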
Fig. 10. Geometric interpretation of Eq. (34). The search space for $y$ is the normal subspace, here corresponding to the line spanned by $w$. The segment $y^1 y^2$ is a part of the AcceptedFront, and $\tilde{y}$ is a Considered point.

The described discretization procedure is similar in spirit to an implicit Euler's method for solving initial value problems since $y^1$ and $y^2$ are assumed to be known and the vector field is computed at the to-be-determined point $y$. In solving first-order PDEs, a fundamental condition for the numerical stability requires that the mathematical domain of dependence should be included in the numerical domain of dependence. Since the characteristics of PDE (33) coincide with the trajectories of the vector field, $G(u)$ should be computed using the triangle through which the corresponding (approximate) trajectory runs. Thus, having computed $y = (u, G(u))$ by (34) using two adjacent mesh points $y^i$ and $y^j$, we need to verify an additional upwinding condition: the linear approximation to the trajectory of $y$ should intersect the line $y^i y^j$ at a point $\hat{y} = (\hat{u}, G(\hat{u}))$ that lies between $y^i$ and $y^j$; see Fig. 11. An equivalent formulation is that $f(y)$ should point from the newly computed simplex $y y^i y^j$.

Algebraically, if $y$ solves (34), then $f(y) = \beta_1 (y - y^i) + \beta_2 (y - y^j)$; thus, the upwinding criterion above simply requires $\beta_1, \beta_2 > 0$. In this case the discretization is locally second-order accurate and the arclength $\Lambda(y)$ of the trajectory up to the point $y$ can be approximated as

$$\Lambda(y) = \|y - \hat{y}\| + \Lambda(\hat{y}) = \frac{\|f(y)\| + \beta_1 \Lambda(y^i) + \beta_2 \Lambda(y^j)}{\beta_1 + \beta_2} \approx d_a(0, y). \tag{35}$$

Numerical evidence indicates that the resulting method is globally first-order accurate [Guckenheimer & Vladimirsky, 2004].
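The upwinding test and the arclength update can be sketched together. Given $f(y)$ and the two AcceptedFront points, the coefficients $\beta_1, \beta_2$ are obtained here from the $2 \times 2$ normal equations of a least-squares fit (exact when $f(y)$ lies in the plane of the simplex, as it does once (34) is solved), and the arclength is then evaluated from the right-hand fraction of (35). The function names, the least-squares route and the sample arclength values are assumptions of this illustration; the data are those of a hypothetical field with invariant surface $z = x^2$.

```python
import math


def sub(a, b):
    return tuple(p - q for p, q in zip(a, b))


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def upwind_coefficients(fy, y, yi, yj):
    """Least-squares beta_1, beta_2 with fy ~ beta_1 (y - yi) + beta_2 (y - yj)."""
    a, b = sub(y, yi), sub(y, yj)
    aa, ab, bb = dot(a, a), dot(a, b), dot(b, b)
    af, bf = dot(a, fy), dot(b, fy)
    det = aa * bb - ab * ab  # nonzero for a nondegenerate simplex
    return (af * bb - bf * ab) / det, (bf * aa - af * ab) / det


def arclength_update(fy, y, yi, yj, lam_i, lam_j):
    """Arclength at y from the fraction in (35), or None if the
    upwinding criterion beta_1, beta_2 > 0 fails."""
    b1, b2 = upwind_coefficients(fy, y, yi, yj)
    if b1 <= 0.0 or b2 <= 0.0:
        return None
    speed = math.sqrt(dot(fy, fy))
    return (speed + b1 * lam_i + b2 * lam_j) / (b1 + b2)


# New point obtained from Eq. (34) for the field (x, 2y, -z + 3x^2).
z = 0.2926 / 0.24
y = (1.1, 0.1, z)
yi, yj = (1.0, 0.0, 1.0), (1.0, 0.2, 1.0)
fy = (1.1, 0.2, -z + 3.0 * 1.1 ** 2)

betas = upwind_coefficients(fy, y, yi, yj)
lam = arclength_update(fy, y, yi, yj, lam_i=1.0, lam_j=1.0)
rejected = arclength_update(tuple(-c for c in fy), y, yi, yj, 1.0, 1.0)
```

Here $\beta_1 = 6.5$ and $\beta_2 = 4.5$, so the criterion holds and the update adds $\|f(y)\|/(\beta_1 + \beta_2)$ to the interpolated arclength. Reversing the vector field flips both signs, and the update is rejected, which corresponds to the case in which another segment of the AcceptedFront has to be tried.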

Fig. 11. (a) An acceptable and (b) an unacceptable approximation of $f(y)$; the range of upwinding directions is shown by dotted lines; the local linear approximation to the trajectory is shown by a dashed line; $\hat{y}$ is its intersection with the line $y^i y^j$. In the second case the upwinding criterion is not satisfied and the update for $y$ should be computed using another segment of AcceptedFront.

5.3. Ordered Upwind Method

Ordered Upwind Methods (OUMs) were originally introduced for static Hamilton-Jacobi-Bellman PDEs [Sethian & Vladimirsky, 2003]. In [Guckenheimer & Vladimirsky, 2004] the same idea of space-marching for boundary value problems is used to solve Eq. (33). All mesh points are divided into those that are Accepted, that is, already fixed as belonging to the approximation $M$, and those Considered, which are in a tentative position adjacent to the current polygonal manifold boundary $\partial M$, called the AcceptedFront. A tentative position can be computed for each Considered mesh point $y$ under the assumption that its trajectory intersects $\partial M$ in some neighborhood $\mathcal{N}(y)$ of that point. In other words, $y$ is updated by solving Eq. (34) for a "virtual simplex" $y y^i y^j$ such that $y^i y^j \in \partial M \cap \mathcal{N}(y)$ and the upwinding criterion is satisfied. All Considered points are sorted based on the approximate trajectory arclengths $\Lambda(y)$ defined by (35). The method starts with $\partial M$ discretizing a small ellipse in $E^u(\mathbf{x}_0)$. That initial boundary is surrounded by a single "layer" of Considered mesh points (also in $E^u(\mathbf{x}_0)$).

A typical step of the algorithm consists of picking the Considered point $y$ with the smallest $\Lambda$ and making it Accepted. This operation modifies $\partial M$ ($y$ is included, and the mesh points that are no longer on the boundary are removed) and causes a possible recomputation of all the not-yet-Accepted mesh points near $y$. If $y^i$ is adjacent to $y$ and $y^i y$ is on the boundary, then the mesh is locally extended by adding a new Considered mesh point $\tilde{y}$ connected to $y^i y$ in a tangent plane. To maintain good aspect ratios of newly-created simplices, the current implementation relies on an "advancing front mesh generation" method similar to [Peraire et al., 1999]. Other local mesh-extension strategies can be implemented similarly to methods in [Rebay, 1993] or [Henderson, 2002].

The vector field near $\partial M$ determines the order in which the correct "tilts" for tentative simplex-patches are computed and the Considered mesh points are Accepted. This ordering has the effect of reducing the approximation error (since a mesh point $y$ first computed from a relatively far part of $\mathcal{N}(y)$ is likely to be recomputed before it gets Accepted). The default stopping criterion is to enforce $\Lambda(y) < L$, so that the algorithm terminates when the maximal approximate arclength $L$ is reached. Other stopping criteria (for example, based on Euclidean or geodesic distance or the maximum number of simplices) can be used as well. Current algorithmic parameters include $L$, the radius $R_N$ of the neighborhood $\mathcal{N}(y)$, and the desired simplex size $\Delta$. (The simplex size is fixed in the present implementation; it could be adapted according to curvature information.) As in the original OUMs, the computational complexity of the algorithm is $O(M \log M)$, where $M = O(L^2/\Delta^2)$ is the total number of mesh points and the $(\log M)$ factor results from the necessity to maintain a sorted list of Considered mesh points. A detailed discussion of the algorithmic issues can be found in [Guckenheimer & Vladimirsky, 2004].

5.4. The Lorenz manifold computed with the PDE formulation

Figure 12 shows the Lorenz manifold $W^s(0)$ computed up to an approximate total arclength of $L = 174$. The computation started from $S_\delta \subset E^s(0)$ with $\delta = 2.0$, $\Delta = 0.6$ and $R_N = 4\Delta$, which resulted in the total of 271469 mesh points. The coloring shows arclength along trajectories where blue is small and red is large. The manifold was rendered as a two-dimensional surface with MATLAB; other illustrations and associated animations can be found in [Guckenheimer & Vladimirsky, 2004].

Figure 12(a) shows the entire computed part of the Lorenz manifold from the common viewpoint. Figure 12(b) is an enlargement near the central scrolls where the manifold is now shown transparent. Clearly visible are two secondary spirals forming near the positive $z$-axis. The coloring is such that points of the same color are equally far away from the origin in arclength along trajectories. Figure 12(c) is a further enlargement near the unstable manifold $W^u(0)$ accumulating on the Lorenz attractor. This clearly shows how the Lorenz manifold "rolls" into both wings of the Lorenz attractor, creating different sheets that do not actually intersect the shown trajectories representing the unstable manifold $W^u(0)$.

Figure 12(d) gives an enlarged impression of the computed mesh looking into one of the outer scrolls. The simplices of the mesh are sufficiently uniform in spite of the complicated geometry of the manifold they represent. The red boundary of the computed manifold is not a smooth curve, because it is formed simply by the last simplices that were added locally.
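The Accepted/Considered bookkeeping of Sec. 5.3 can be sketched independently of the geometry. In the code below the mesh, the neighborhood search and the solution of Eqs. (34)-(35) are all hidden behind two hypothetical callbacks, `neighbors` and `update`; only the ordering by tentative arclength, the sorted list of Considered points (a binary heap, which is the source of the $\log M$ factor) and the default stopping criterion are shown. Keeping the smallest tentative value on recomputation is a simplifying assumption of this sketch, not a statement about the implementation in [Guckenheimer & Vladimirsky, 2004].

```python
import heapq


def ordered_march(seeds, neighbors, update, L):
    """Accept points in order of increasing tentative value up to the bound L.

    seeds      -- dict {point: value} for the initial boundary (Accepted)
    neighbors  -- neighbors(p): iterable of points adjacent to p
    update     -- update(q, accepted): tentative value of q computed from
                  Accepted data only, or None if no valid update exists
    """
    accepted = dict(seeds)
    considered = {}
    heap = []

    def reconsider(q):
        value = update(q, accepted)
        if value is not None and value < considered.get(q, float("inf")):
            considered[q] = value
            heapq.heappush(heap, (value, q))

    for p in seeds:
        for q in neighbors(p):
            if q not in accepted:
                reconsider(q)

    while heap:
        value, p = heapq.heappop(heap)
        if p in accepted or considered.get(p) != value:
            continue  # stale heap entry left over from a recomputation
        if value > L:
            break  # default stopping criterion
        accepted[p] = value
        del considered[p]
        for q in neighbors(p):
            if q not in accepted:
                reconsider(q)
    return accepted


# Stand-in problem (hypothetical): six points on a line with unit spacing,
# where the "update" simply adds the spacing to an Accepted neighbor.
def chain_neighbors(p):
    return [q for q in (p - 1, p + 1) if 0 <= q <= 5]


def chain_update(q, accepted):
    values = [accepted[n] + 1.0 for n in chain_neighbors(q) if n in accepted]
    return min(values) if values else None


result = ordered_march({0: 0.0}, chain_neighbors, chain_update, L=3.5)
```

In the stand-in problem the points at distance at most 3.5 from the seed are Accepted in order and the march stops at the first tentative value exceeding $L$. Each point enters the heap a bounded number of times, so the cost is dominated by the heap operations, in line with the $O(M \log M)$ complexity quoted above.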
Fig. 12. The Lorenz manifold computed with the method of Guckenheimer and Vladimirsky up to a total trajectory arclength
of about 174. Panel (a) shows a view of the entire manifold, panel (b) an enlargement near the main scroll where the manifold
is shown transparent, panel (c) shows how the manifold interacts with the Lorenz attractor, and panel (d) gives an impression
of the computed mesh.
88 B. Krauskopf et al.

6. Box Covering

In contrast to the techniques described so far, the method of Dellnitz and Hohmann [1996, 1997] presented in this section approximates invariant manifolds by objects of the same dimension as the underlying phase space. It first produces an outer covering of a local unstable manifold by a finite collection of sets. This covering is then grown in order to cover larger parts of the manifold analogously to what is described in Secs. 2 and 5. In combination with set-oriented multilevel techniques for the computation of invariant sets, such as periodic orbits, attractors and general chain recurrent sets, the technique allows, in principle, for the computation of manifolds of arbitrary dimension, where the numerical effort is essentially determined by the dimension of the manifold. In combination with rigorous techniques for the implementation of this approach, it is possible to compute rigorous coverings of the considered object. For a more detailed exposition of the general method see [Dellnitz & Hohmann, 1996, 1997; Dellnitz et al., 2001; Dellnitz & Junge, 2002]. The algorithm is implemented in the software package GAIO [Dellnitz et al., 2001].

6.1. The box covering algorithm

The box covering algorithm applies to a discrete-time dynamical system, that is, to a diffeomorphism $D$. In the context of approximating global manifolds, it can compute the unstable manifold of an (unstable) invariant set of $D$ in a compact region of interest $Q$. In this section, we explain how this method can be used for the computation of a two-dimensional (un)stable manifold of a saddle $x_0$ in $\mathbb{R}^3$.

Here, the diffeomorphism $D : \mathbb{R}^3 \to \mathbb{R}^3$ is given by the time-$\tau$ map of the vector field (1). For an unstable manifold $\tau > 0$, while for a stable manifold $\tau < 0$ to account for reversing time. Numerically, the map $D$ may be realized by classical one-step integration schemes. Since the algorithm involves integration over short time intervals only, typically the requirements in terms of accuracy or preservation of structures of the underlying vector field $f$ are rather mild. The diffeomorphism $D$ then has a hyperbolic saddle fixed point $x = x_0$ and, in the case $\tau < 0$, $x$ has a two-dimensional unstable manifold $W^u(x)$, which is identical to the stable manifold of $x_0$.

The idea of the algorithm is as follows. Imagine a finite partition $\mathcal{P}$ of $Q$. The method first finds a (small) collection $\mathcal{C}_0 \subset \mathcal{P}$ that covers the local unstable manifold $W^u_{\mathrm{loc}}(x)$. This local covering of $W^u(x)$ is extended in steps, where in each step the sets in the current collection $\mathcal{C}_k$ are mapped forward under $D$. All sets in $\mathcal{P}$ that have an intersection with the images of $\mathcal{C}_k$ are added to the current collection of sets, yielding $\mathcal{C}_{k+1}$.

More formally, let $\mathcal{P}_0, \mathcal{P}_1, \ldots$ be a nested sequence of successively finer partitions of $Q$: We take $\mathcal{P}_0 = \{Q\}$ and each element $P \in \mathcal{P}_{\ell+1}$ is contained in an element $P' \in \mathcal{P}_\ell$ and $\mathrm{diam}(P) \le \gamma\, \mathrm{diam}(P')$ for some fixed number $0 < \gamma < 1$.

The algorithm consists of two main steps:

1. Initialization: Compute an initial covering $\mathcal{C}_0^{(k)} \subset \mathcal{P}_{\ell+k}$ of the local unstable manifold $W^u_{\mathrm{loc}}(x)$ of $x$. (Here the index $k$ indicates the fineness of the initial partition.) This can be achieved by applying a subdivision algorithm for the computation of relative global attractors to the element $P \in \mathcal{P}_\ell$ containing $x$ for some suitable $\ell$; see [Dellnitz & Hohmann, 1996].

2. Growth: From the collection $\mathcal{C}_j^{(k)}$ the next collection $\mathcal{C}_{j+1}^{(k)}$ is obtained by setting

    $\mathcal{C}_{j+1}^{(k)} = \{ P \in \mathcal{P}_{\ell+k} : D(P') \cap P \neq \emptyset \text{ for some set } P' \in \mathcal{C}_j^{(k)} \}.$

This step is repeated until no more sets are added to the current collection, that is, until $\mathcal{C}_{j+1}^{(k)} = \mathcal{C}_j^{(k)}$.

We can show that this method converges to a certain subset of $W^u(x)$ in $Q$. Namely, let $W_0 = W^u_{\mathrm{loc}}(x) \cap P$, where $P$ is the element in $\mathcal{P}_\ell$ containing $x$ and define

    $W_{j+1} = D(W_j) \cap Q, \quad j = 0, 1, 2, \ldots.$

Then we have the following convergence result (see [Dellnitz & Hohmann, 1996]):

1. the sets $C_j^{(k)} = \bigcup_{P \in \mathcal{C}_j^{(k)}} P$ are coverings of $W_j$ for all $j, k = 0, 1, \ldots$;
2. for fixed $j$ and $k \to \infty$, the covering $C_j^{(k)}$ converges to $W_j$ in Hausdorff distance.

In general, one cannot guarantee that the algorithm leads to an approximation of the entire set $W^u(x) \cap Q$. This is due to the fact that parts of $W^u(x)$ that do not lie in $Q$ may map into $Q$.
In this case, the method will indeed not cover all of $W^u(x) \cap Q$.

Under certain hyperbolicity assumptions on $W^u(x)$ it is possible to obtain statements about the speed of convergence in terms of how the Hausdorff distance between the covering and the approximated subset of $W^u(x)$ depends on the diameter of the sets in the covering collection; see [Dellnitz & Junge, 2002] for details.

6.2. Realization of the method

The efficiency of the growth part of the algorithm significantly depends on the realization of the collections $\mathcal{P}_\ell$. In the implementation the $\mathcal{P}_\ell$ are partitions of $Q$ into boxes

    $B(c, r) = \{ y \in \mathbb{R}^n : |y_i - c_i| \le r_i \text{ for } i = 1, \ldots, n \},$

where $c, r \in \mathbb{R}^n$, $r_i > 0$, are the center and the sizes of the box $B(c, r)$, respectively. Moreover, only partitions are used that result from bisecting the initial box $Q$ repeatedly, where in this process of bisecting the relevant coordinate direction is changed systematically (typically, the bisected coordinate direction is varied cyclically).

Starting with $\mathcal{P}_0 = \{Q\}$, this process yields a sequence $\mathcal{P}_\ell$ of partitions of $Q$ that can efficiently be stored in a binary tree. Note that it is easy to store arbitrary subsets of the full partition $\mathcal{P}_\ell$ just by storing the corresponding part of the tree. In fact, in the initialization of the algorithm one starts with a single box on a given level $\ell$, so that the stored tree consists of a single leaf. Whenever sets are added to the current collection, the corresponding paths are added to the tree. Figure 13 illustrates the first three growth steps for the computation of

(a) (b)

(c)
Fig. 13. Coverings of the Lorenz manifold during the first three growth steps are shown in panels (a)-(c), where the covering
of the previous step (the initialization box in the case of (a)) is shown in yellow.
a covering of the Lorenz manifold on level 18 of the tree (all other parameters are as described in Sec. 6.3 below). The yellow box in Fig. 13(a) was created in the initialization step and then grown in one step to obtain the blue boxes. Panels (b) and (c) show two further growth steps, where the covering of the previous step is again shown in yellow.

The hierarchical storage scheme has another crucial computational advantage in that it is easier to decide which boxes are "hit" by mapping the boxes that were added in the previous step of the continuation algorithm. Namely, for each of these boxes $B \in \mathcal{P}_{\ell+k}$ one needs to compute the set $F(B) = \{ B' \in \mathcal{P}_{\ell+k} \mid D(B) \cap B' \neq \emptyset \}$. Since $B$ contains an uncountable number of points, this problem must be discretized. The obvious approach is to choose a finite set $T$ of test points in $B$ and to approximate $F(B)$ by $\tilde{F}(B) = \{ B' \in \mathcal{P}_{\ell+k} \mid D(T) \cap B' \neq \emptyset \}$. Using the tree structure, the determination of the box that contains the image of a test point can be accomplished with a complexity that only depends logarithmically on the number of boxes in $\mathcal{P}_\ell$ [Dellnitz & Hohmann, 1997].

6.3. Box covering of the Lorenz manifold

Figure 14 shows a box covering of the Lorenz manifold $W^s(0)$. For the computation the time-$\tau$ map of the Lorenz system (2) was considered with $\tau = -0.1$. This map is realized by the classical Runge-Kutta scheme of fourth order with a fixed step size of $-0.01$. The region of interest $Q$ is a box with radius $(70, 70, 70)$ and center $(10^{-1}, 10^{-1}, 10^{-1})$; this offset centering is for a practical reason: it avoids having the origin on the edge of a box. Level $\ell = 27$ of the tree was used and 16 growth steps were performed, starting from a single box containing the origin (i.e. $k = 0$). In each growth step, an equidistant grid of 125 test points in each box was mapped forward. The resulting object contains more than 4 million boxes.

Figure 14(a) shows the entire computed part of the Lorenz manifold from the common viewpoint. The same view is shown in Fig. 14(b) but now the manifold is transparent. Figure 14(c) shows an enlargement of the transparent rendering near the central region. Because the method is using the time-$\tau$ map of the Lorenz system (2), the Lorenz

(a) (b)
Fig. 14. The Lorenz manifold computed with the box covering method of Dellnitz and Hohmann seen from the common
viewpoint (a). In panels (b) and (c) the manifold is rendered transparently. Panel (c) shows an enlargement near the z-axis,
and panel (d) gives a closer look at the computed boxes.
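The time-$\tau$ map used in Sec. 6.3 can be sketched as ten classical fourth-order Runge-Kutta steps of size $-0.01$, giving $\tau = -0.1$. The sketch below is an illustration only: the Lorenz system (2) is not reproduced in this part of the text, so the classical parameter values $\sigma = 10$, $\rho = 28$, $\beta = 8/3$ are an assumption, and the function names are hypothetical.

```python
def lorenz(p, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations (classical parameters assumed)."""
    x, y, z = p
    return (sigma * (y - x), rho * x - y - x * z, x * y - beta * z)


def rk4_step(f, p, h):
    """One classical fourth-order Runge-Kutta step of size h (h may be negative)."""
    k1 = f(p)
    k2 = f(tuple(pi + 0.5 * h * ki for pi, ki in zip(p, k1)))
    k3 = f(tuple(pi + 0.5 * h * ki for pi, ki in zip(p, k2)))
    k4 = f(tuple(pi + h * ki for pi, ki in zip(p, k3)))
    return tuple(
        pi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for pi, a, b, c, d in zip(p, k1, k2, k3, k4)
    )


def time_tau_map(p, tau=-0.1, h=-0.01):
    """Approximate time-tau map D: integrate from p over time tau with fixed step h.

    With tau < 0 the flow is reversed, so the stable manifold of the origin
    becomes the unstable manifold of the fixed point of D.
    """
    n_steps = round(tau / h)
    for _ in range(n_steps):
        p = rk4_step(lorenz, p, h)
    return p


if __name__ == "__main__":
    print(time_tau_map((0.0, 0.0, 0.0)))  # the origin is a fixed point of D
    print(time_tau_map((1.0, 1.0, 1.0)))
```

A function of this form could be passed as the map `D` to a box covering routine; because only short integration times are involved, a fixed-step scheme is in line with the mild accuracy requirements noted in Sec. 6.1.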
Fig. 14. (Continued)

manifold initially grows mainly along the strong unstable direction until the boundary of the box of interest is reached. This can be seen nicely in Fig. 13. Later steps of the growth process then start to build up the other part of the manifold, resulting in the images in Figs. 14(a) and 14(b). The further enlargement near the scroll of the manifold in Fig. 14(d) gives a local impression of the box covering. Notice that the covering of the manifold has a thickness of several box diameters at the end of the scroll.

7. Discussion

After a recent flurry of research activity, several complementary methods are available today to compute global (un)stable manifolds in applications. While these methods are still somewhat under development and testing, we hope that this survey will encourage the reader to consider computing such global objects in systems arising in applications.

Each of the methods presented in the previous sections is based on a particular point of view of characterizing a global (un)stable manifold. Common to all is the idea that the manifold must be grown from local information near the saddle point, and the difference is in how this is done. The choice of method will generally depend on the application one has in mind and on the particular questions one wants to answer. This discussion is intended to give an indication of the specific properties of the different approaches.

7.1. Approximation by geodesic level sets

The method of Krauskopf and Osinga [1999, 2003] is presently implemented for two-dimensional manifolds of saddle points and saddle periodic orbits in a phase space of arbitrary dimension; see also [Osinga, 2000, 2003]. This implementation approximates the manifold linearly between mesh points, while the boundary value problems (9)-(10) are solved by single shooting. It would be possible to use higher order interpolation between mesh points and collocation for solving the boundary value problems. The method produces a very regular mesh that consists of (approximate) geodesic circles and approximate geodesics. This means that the manifold is rendered as a geometric object, independently of the dynamics on it. The mesh is, in fact, constructed so regularly that it can be interpreted as a crochet pattern. This allows one to produce a real-life model of the Lorenz manifold; see [Osinga & Krauskopf, 2004] for details. During a computation,
the interpolation error is controlled by prescribed mesh quality parameters, so that the correctness of the method can be proved; see [Krauskopf & Osinga, 2003] for details.

The price one has to pay for obtaining a guaranteed "geometric mesh" is that one needs to set up and continue a boundary value problem for each new mesh point. This makes the method more expensive compared to other methods. With the nonoptimized present implementation and the accuracy parameters as in Sec. 2.3, computing the Lorenz manifold up to geodesic distance 140 takes about 10 minutes, while the larger image in Fig. 5 with 69900 mesh points took 40 minutes and 47 seconds on an 800 MHz Pentium III machine.

Because it is based on the geodesic parametrization (6), the method works as long as the geodesic level sets of this parametrization remain smooth circles. While this is not an obstruction for computing the Lorenz manifold, there are examples where the computation stops when a geodesic circle ceases to be smooth; see [Krauskopf & Osinga, 2003]. Furthermore, the method stops when it encounters an equilibrium or a periodic orbit on (the closure of) the (un)stable manifold.

An implementation for global (un)stable manifolds of dimension three would already be quite challenging. First of all, geodesic level sets are spheres in this case, on which one needs to compute a regular mesh. Secondly, the method would require multiparameter continuation to continue the boundary value problems (9)-(10).

7.2. BVP continuation of trajectories

The method by Doedel is arguably the most straightforward one. The continuation calculations can be carried out using the standard boundary value continuation capabilities of AUTO. This means that all that is required are rather standard AUTO equations and parameter files. The orbits that make up the manifold are computed very accurately, due to the high accuracy of the orthogonal collocation method, which is superconvergent for the solution at the mesh points and for scalar variables. Furthermore, the boundary value continuation algorithms in AUTO, written in the f77 or C programming language, are rather efficient, so that the calculations can generally be done in relatively little computer time. For example, computing the Lorenz manifold up to a trajectory arclength of 250 with a high resolution of NTST = 75, as in Figs. 7(a)-7(c), takes 30 seconds on a 1.6 GHz Pentium M laptop; for NTST = 25, which still gives good resolution, the computation time (including writing the output) drops to just over 10 seconds.

The method is very flexible in that it allows for different boundary conditions at the endpoint of a trajectory. This means that one can compute only a part of interest of the manifold, as was illustrated in Fig. 7(d). However, the manifold cannot be grown, so that the continuation must be repeated if a larger part of the manifold is desired.

While visualizing or even animating the computed trajectories gives much insight into the geometry of the manifold, it would require substantial post-processing to produce a nice mesh representation of the manifold as a two-dimensional object. In particular, the density of the orbits may be high in areas where the further evolution of the trajectories depends sensitively on the current state. For example, in Figs. 7(a) and 7(c) the density of the orbits is high along a curve in the direction of the z-axis, that is, the direction of the weakly stable eigenvector.

7.3. Computation of fat trajectories

While also essentially computing trajectories, the method of [Henderson, 2003] does produce a nice mesh representation by "fattening" the trajectories with a string of polygonal patches. The method tends to minimize the need for interpolation. When interpolation is needed there is a guarantee that appropriate points exist, and at those points information is available which allows higher order interpolation or the generation of an interpolating trajectory. The algorithms for computing fat trajectories, for finding a third-order approximation to the manifold, and for finding interpolation points are implemented for any dimension $k$ of the manifold. The interpolation itself is presently limited to $k = 2$. The code used to compute the Lorenz manifold is available as OpenSource; see [Henderson, 2003].

The computation of a fat trajectory is more expensive than straightforward integration, because it adds equations for the evolution of the tangent space and curvatures. However, the implementation of updating the computed boundary is quite efficient; see also [Henderson, 2002]. The overall algorithm is relatively fast. For example, the Lorenz manifold in Fig. 9 was computed on a 375 MHz Power3 processor in about 7.3 hours.
Finally, the algorithm may encounter a geometric problem. It must be able to distinguish between mesh points on different sheets of the invariant manifold, for example, where a trajectory returns close to itself. This can be done by checking the values of t and a at the centers of the disks, but it demands sufficiently small disks so that those quantities vary only a little across each disk. This requirement may result in many more mesh points being computed than is necessary to obtain a geometrically smooth manifold. This geometric problem occurs when trajectories spiral tightly, as is the case, for example, on the unstable manifolds of the two equilibria on the wings of the Lorenz attractor.

7.4. PDE formulation

The PDE approach by [Guckenheimer & Vladimirsky, 2004] leads to a very efficient numerical method for computing a mesh representation of a global (un)stable manifold. The computational cost of this method is largely independent of the geometric stiffness present in the system. For example, the Lorenz manifold in Fig. 12 was computed in under 90 seconds on a Pentium III 850 MHz processor.

The constructed approximation $M$ is "causal", that is, it contains approximate trajectories for all the mesh points on $\partial M$. The method is not restricted to manifolds where the level curves of the geodesic distance remain smooth. In particular, the method can be used for approximating manifolds containing homoclinic and heteroclinic trajectories; see [Guckenheimer & Vladimirsky, 2004] for examples.

The computational cost of adding each mesh point is proportional to the codimension $(n - k)$ of the manifold. When approximating manifolds of high codimension, this is clearly a disadvantage compared to other methods for which this cost is proportional to the dimension $k$ of the manifold. A second limitation of the method is that the constructed approximation is globally only first-order accurate, in contrast with, for example, the second-order accuracy of computing fat trajectories.

A variant of the code exists that uses a global coordinate system defined by a triangulated mesh. This means that the PDE method could be used in a continuation framework, where an approximation of the manifold for one parameter value is used to build a global parametrization for nearby parameter values. This would reduce the cost of locally extending the mesh near $\partial M$ at every step of the continuation.

The current implementation of the PDE approach works for two-dimensional manifolds in a phase space of arbitrary dimension. An adaptive implementation for $k \ge 3$ will have to employ a robust algorithm for a higher-dimensional local mesh extension, which remains a challenge.

7.5. Box covering

The box covering algorithm of [Dellnitz & Hohmann, 1996, 1997; Dellnitz et al., 2001; Dellnitz & Junge, 2002] constructs a covering of (part of) the global invariant manifold. This covering consists of a collection of small boxes. The method is formulated for discrete-time systems, and differential equations can be handled by considering a corresponding time-$\tau$ map. It allows for the computation of (un)stable manifolds of arbitrary invariant sets. It is possible (and implemented in GAIO) to compute manifolds of arbitrary dimension. The "thickness" of the covering depends on the contraction rate transverse to the manifold. The stronger the contraction, the fewer "box-layers" along the manifold will be produced. In particular, the algorithm needs to be modified in order to apply it to Hamiltonian systems [Junge, 2000b].

The key implementational issue, namely how to compute the image of a given box, is typically handled by mapping a (finite) set of test points in each box. Evidently, depending on the properties of the underlying map, the choice of these points determines the quality of the resulting covering. Using too few points may lead to missing boxes, while using too many slows down the computation. There exist strategies for a near-optimal choice of these points. In the case that Lipschitz estimates of the dynamical system are available, one may compute rigorous coverings. In this case, it can be ensured that the manifold is contained inside the union of the sets in the constructed covering [Dellnitz et al., 2001; Junge, 2000a].

The overall computational cost is quite high when good resolution, that is, a large number of boxes, is required. For example, the Lorenz manifold in Fig. 14 of more than 4 million boxes took about 120 minutes on a 1.25 GHz G4 processor. Since the numerical cost depends on the dimension of the manifold, for manifolds of dimension larger than
two it may only be feasible to compute rather coarse approximations.

Acknowledgment

The authors thank Mike Jolly for providing the image in Fig. 3 of the Lorenz manifold computed with the method in [Johnson et al., 1997], and seen from the common viewpoint.

References

Abraham, R. H. & Shaw, C. D. [1985] Dynamics - The Geometry of Behavior, Part Three: Global Behavior (Aerial Press, Santa Cruz).
Allgower, E. L. & Georg, K. [1996] "Numerical path following," Handbook of Numerical Analysis, Vol. 5, eds. Ciarlet, P. G. & Lions, J. L. (North Holland Publishing), pp. 3-207.
Ascher, U. M., Mattheij, R. M. M. & Russell, R. D. [1995] Numerical Solution of Boundary Value Problems for Ordinary Differential Equations (SIAM).
Back, A., Guckenheimer, J., Myers, M. R., Wicklin, F. J. & Worfolk, P. A. [1992] "DsTool: Computer assisted exploration of dynamical systems," Notices Amer. Math. Soc. 39, p. 303.
Beyn, W.-J., Champneys, A., Doedel, E. J., Govaerts, W., Sandstede, B. & Kuznetsov, Yu. A. [2002] "Numerical continuation and computation of normal forms," Handbook of Dynamical Systems, Vol. 2, ed. Fiedler, B. (Elsevier Science), pp. 149-219.
De Boor, C. & Swartz, B. [1973] "Collocation at Gaussian points," SIAM J. Numer. Anal. 10, 582-606.
Dellnitz, M. & Hohmann, A. [1996] "The computation of unstable manifolds using subdivision and continuation," Nonlinear Dynamical Systems and Chaos PNLDE 19, eds. Broer, H. W., Van Gils, S. A., Hoveijn, I. & Takens, F. (Birkhauser, Basel), pp. 449-459.
Dellnitz, M. & Hohmann, A. [1997] "A subdivision algorithm for the computation of unstable manifolds and global attractors," Numer. Math. 75, 293-317.
Dellnitz, M., Hohmann, A., Junge, O. & Rumpf, M. [1997] "Exploring invariant sets and invariant measures," Chaos 7, 221-228.
Dellnitz, M., Froyland, G. & Junge, O. [2001] "The algorithms behind GAIO - Set oriented numerical methods for dynamical systems," Ergodic Theory, Analysis, and Efficient Simulation of Dynamical Systems, ed. Fiedler, B. (Springer-Verlag, Berlin), pp. 145-174; software available at http://www.dynamicalsystems.org/sw/sw/detail?item=30.
Dellnitz, M. & Junge, O. [2002] "Set oriented numerical methods for dynamical systems," Handbook of Dynamical Systems IT. Towards Applications, eds. Fiedler, B., Iooss, G. & Kopell, N. (World Scientific, Singapore), pp. 221-264.
Dieci, L. & Lorenz, J. [1995] "Computation of invariant tori by the method of characteristics," SIAM J. Num. Anal. 32, 1436-1474.
Dieci, L., Lorenz, J. & Russell, R. D. [1991] "Numerical calculation of invariant tori," SIAM J. Sci. Stat. Comput. 12, 607-647.
Doedel, E. J. [1981] "AUTO, a program for the automatic bifurcation analysis of autonomous systems," Congr. Numer. 30, 265-384.
Doedel, E. J., Keller, H. B. & Kernevez, J. P. [1991a] "Numerical analysis and control of bifurcation problems: I," Int. J. Bifurcation and Chaos 1, 493-520.
Doedel, E. J., Keller, H. B. & Kernevez, J. P. [1991b] "Numerical analysis and control of bifurcation problems: II," Int. J. Bifurcation and Chaos 1, 745-772.
Doedel, E. J., Champneys, A. R., Fairgrieve, T. F., Kuznetsov, Yu. A., Sandstede, B. & Wang, X. J. [1997] "AUTO97: Continuation and bifurcation software for ordinary differential equations," available via http://cmvl.cs.concordia.ca/.
Doedel, E. J., Paffenroth, R. C., Champneys, A. R., Fairgrieve, T. F., Kuznetsov, Yu. A., Oldeman, B. E., Sandstede, B. & Wang, X. J. [2000] "AUTO2000: Continuation and bifurcation software for ordinary differential equations," available via http://cmvl.cs.concordia.ca/.
Edoh, K. D., Russell, R. D. & Sun, W. [1995] "Orthogonal collocation for hyperbolic PDEs & computation of invariant tori," Australian National Univ., Mathematics Research Report No. MRR 060-95.
Fairgrieve, T. F. & Jepson, A. D. [1991] "O. K. Floquet multipliers," SIAM J. Numer. Anal. 28, 1446-1462.
Guckenheimer, J. & Holmes, P. [1986] Nonlinear Oscillations, Dynamical Systems and Bifurcations of Vector Fields, 2nd edition (Springer-Verlag, NY).
Guckenheimer, J. & Worfolk, P. [1993] "Dynamical systems: Some computational problems," Bifurcations and Periodic Orbits of Vector Fields, ed. Schlomiuk, D. (Kluwer Academic Publishers), pp. 241-277.
Guckenheimer, J. & Vladimirsky, A. [2004] "A fast method for approximating invariant manifolds," SIAM J. Appl. Dyn. Syst. 3, 232-260; animations available at http://epubs.siam.org/sam-bin/dbq/article/60017.
Henderson, M. E. [2002] "Multiple parameter continuation: Computing implicitly defined k-manifolds," Int. J. Bifurcation and Chaos 12, 451-476.
Henderson, M. E. [2003] "Computing invariant manifolds by integrating fat trajectories," SIAM J. Appl. Dyn. Syst., in press.
Hobson, D. [1993] "An efficient method for computing invariant manifolds of planar maps," J. Comput. Phys. 104, 14-22.
Johnson, M. E., Jolly, M. S. & Kevrekidis, I. G. [1997] "Two-dimensional invariant manifolds and global bifurcations: Some approximation and visualization studies," Numer. Alg. 14, 125-140.
Johnson, M. E., Jolly, M. S. & Kevrekidis, I. G. [2001] "The Oseberg transition: Visualization of global bifurcations for the Kuramoto-Sivashinsky equation," Int. J. Bifurcation and Chaos 11, 1-18.
Junge, O. [2000a] "Rigorous discretization of subdivision techniques," in Proc. Int. Conf. Diff. Eqs. Vol. 2, eds. Fiedler, B., Groger, K. & Sprekels, J. (World Scientific, Singapore), pp. 916-918.
Junge, O. [2000b] Mengenorientierte Methoden zur Numerischen Analyse Dynamischer Systeme (Shaker, Aachen).
Keller, H. B. [1977] "Numerical solution of bifurcation and nonlinear eigenvalue problems," Applications of Bifurcation Theory, ed. Rabinowitz, P. H. (Academic Press), pp. 359-384.
Krauskopf, B. & Osinga, H. M. [1998a] "Globalizing two-dimensional unstable manifolds of maps," Int. J. Bifurcation and Chaos 8, 483-503.
Krauskopf, B. & Osinga, H. M. [1998b] "Growing 1D and quasi 2D unstable manifolds of maps," J. Comp. Phys. 146, 404-419.
Krauskopf, B. & Osinga, H. M. [1999] "Two-dimensional global manifolds of vector fields," Chaos 9, 768-774.
Krauskopf, B. & Osinga, H. M. [2003] "Computing geodesic level sets on global (un)stable manifolds of vector fields," SIAM J. Appl. Dyn. Syst. 4, 546-569.
Krauskopf, B. & Osinga, H. M. [2004] "The Lorenz manifold as a collection of geodesic level sets," Nonlinearity 17, C1-C6.
Kuznetsov, Yu. A. [1998] Elements of Applied Bifurcation Theory, 2nd edition (Springer Verlag, NY).
Lorenz, E. N. [1963] "Deterministic nonperiodic flow," J. Atmosph. Sci. 20, 130-141.
Lust, K. [2001] "Improved numerical Floquet multipliers," Int. J. Bifurcation and Chaos 11, 2389-2410.
Misner, C. W., Thorne, K. S. & Wheeler, J. A. [1970] Gravitation (W. H. Freeman and Company, San Francisco).
Osinga, H. M. [2000] "Non-orientable manifolds of periodic orbits," in Proc. Int. Conf. Differential Equations, Equadiff 99 (Berlin) Vol. 2, eds. Fiedler, B., Groger, K. & Sprekels, J. (World Scientific, Singapore), pp. 922-924.
Osinga, H. M. & Krauskopf, B. [2002] "Visualizing the structure of chaos in the Lorenz system," Comput. Graph. 26, 815-823.
Osinga, H. M. [2003] "Non-orientable manifolds in three-dimensional vector fields," Int. J. Bifurcation and Chaos 13, 553-570.
Osinga, H. M. & Krauskopf, B. [2004] "Crocheting the Lorenz manifold," The Math. Intell. 26, 25-37.
Peraire, J., Peiro, J. & Morgan, K. [1999] "Advancing front grid generation," Handbook of Grid Generation, eds. Thompson, J. F., Soni, B. K. & Weatherill, N. P. (CRC Press), Chap. 17.
Perello, C. [1979] "Intertwining invariant manifolds and Lorenz attractor," in Global Theory of Dynamical Systems (Proc. Internat. Conf., Northwestern Univ., Evanston, Ill., 1979), Lecture Notes in Mathematics, Vol. 819 (Springer-Verlag, Berlin), pp. 375-378.
Phillips, M., Levy, S. & Munzner, T. [1993] "Geomview: An interactive geometry viewer," Not. Amer. Math. Soc. 40, 985-988.
Rebay, S. [1993] "Efficient unstructured mesh generation by means of Delaunay triangulation and Bowyer-Watson algorithm," J. Comp. Phys. 106, 125-138.
Rheinboldt, W. C. [1986] Numerical Analysis of Parametrized Nonlinear Equations, University of Arkansas Lecture Notes in the Mathematical Sciences (Wiley-Interscience).
Russell, R. D. & Christiansen, J. [1978] "Adaptive mesh selection strategies for solving boundary value problems," SIAM J. Numer. Anal. 15, 59-80.
Sacker, R. J. [1965] "A new approach to the perturbation theory of invariant surfaces," Comm. Pure Appl. Math. 18, 717-732.
Sethian, J. A. & Vladimirsky, A. [2003] "Ordered upwind methods for static Hamilton-Jacobi equations: Theory & applications," SIAM J. Numer. Anal. 41, 325-363.
Seydel, R. [1995] From Equilibrium to Chaos. Practical Bifurcation and Stability Analysis, 2nd edition (Springer-Verlag, NY).
Spivak, M. [1979] Differential Geometry, 2nd edition (Publish or Perish, Houston, Texas).
Stewart, H. B. [1986] "Visualization of the Lorenz system," Physica D18, 479-480.
Strogatz, S. H. [1994] Nonlinear Dynamics and Chaos (Addison-Wesley, Reading, MA).
Thompson, J. M. T. & Stewart, H. B. [1986] Nonlinear Dynamics and Chaos (John Wiley, Chichester/NY).
COMMUTATORS OF SKEW-SYMMETRIC MATRICES
ANTHONY M. BLOCH
Department of Mathematics, University of Michigan,
Ann Arbor, MI 48109, USA
ARIEH ISERLES
Department of Applied Mathematics and Theoretical Physics,
Centre for Mathematical Sciences, University of Cambridge,
Wilberforce Road, Cambridge CB3 0WA, England

Received March 26, 2004; Revised June 8, 2004

In this paper we develop a theory for analysing the "radius" of the Lie algebra of a matrix Lie group, which is a measure of the size of its commutators. Complete details are given for the Lie algebra $\mathfrak{so}(n)$ of skew-symmetric matrices where we prove $\|[X, Y]\| \le \sqrt{2}\, \|X\| \cdot \|Y\|$, $X, Y \in \mathfrak{so}(n)$, for the Frobenius norm. We indicate how these ideas might be extended to other matrix Lie algebras. We discuss why these ideas are of interest in applications such as geometric integration and optimal control.

Keywords: Lie algebras; symmetric gauges; commutator matrices.
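The inequality stated in the abstract can be probed numerically. The following sketch samples random skew-symmetric matrices and records the largest observed ratio $\|[X, Y]\|_F / (\|X\|_F \|Y\|_F)$. It is a sanity check under stated assumptions, not part of the proof: the uniform sampling of the entries and all function names are choices made here for illustration.

```python
import math
import random


def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(X, Y):
    """Matrix commutator [X, Y] = XY - YX."""
    XY, YX = matmul(X, Y), matmul(Y, X)
    n = len(X)
    return [[XY[i][j] - YX[i][j] for j in range(n)] for i in range(n)]


def frobenius(A):
    """Frobenius norm: square root of the sum of squared entries."""
    return math.sqrt(sum(a * a for row in A for a in row))


def random_skew(n, rng):
    """Random n x n skew-symmetric matrix with entries drawn uniformly from [-1, 1]."""
    S = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            S[i][j] = rng.uniform(-1.0, 1.0)
            S[j][i] = -S[i][j]
    return S


def largest_ratio(n, trials=200, seed=0):
    """Largest sampled value of ||[X, Y]||_F / (||X||_F ||Y||_F) over so(n)."""
    rng = random.Random(seed)
    best = 0.0
    for _ in range(trials):
        X, Y = random_skew(n, rng), random_skew(n, rng)
        denom = frobenius(X) * frobenius(Y)
        if denom > 0.0:
            best = max(best, frobenius(commutator(X, Y)) / denom)
    return best


if __name__ == "__main__":
    for n in (3, 4, 5):
        print(n, largest_ratio(n), "bound:", math.sqrt(2.0))
```

Random sampling only gives a lower estimate of the largest ratio, so agreement with the bound is consistent with the inequality but does not indicate whether the constant is attained.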

1. Norms and Commutators in $M_n[\mathbb{R}]$ and $\mathfrak{so}(n)$

This paper is concerned with the following question. Let $\mathfrak{g}$ be a matrix Lie algebra. (We refer the reader to [Carter et al., 1995; Humphreys, 1978; Varadarajan, 1984] for elements of Lie-algebraic theory of relevance to this paper.) Given $X, Y \in \mathfrak{g}$ and a norm $\|\cdot\| : \mathfrak{g} \to \mathbb{R}_+$, what is the size of $\|[X, Y]\|$ in comparison with $\|X\| \cdot \|Y\|$? We assume that the norm satisfies the Banach inequality

    $\|XY\| \le \|X\| \cdot \|Y\|.$

On the face of this, the question has little merit since the elementary inequality

    $\|[X, Y]\| \le 2 \|X\| \cdot \|Y\|$    (1)

always holds for $X, Y \in M_n[\mathbb{R}]$, the set of all $n \times n$ real matrices and an arbitrary matrix norm $\|\cdot\|$. (This follows purely from the additive and multiplicative properties of norms, writing $[X, Y] = XY - YX$.) Moreover, it is easy to prove that the bound (1) can be attained for most norms of practical interest. In particular, this is the case for two types of norms closely associated with a remarkable paper of von Neumann [1937].

We recall that a symmetric gauge is a vector norm $|\cdot|$ which is both symmetric and positive. In other words, for every $x \in \mathbb{R}^n$ it is true that $|x_\pi| = |x|$ and $\bigl|\,|x|\,\bigr| = |x|$, where $\pi$ is a permutation of $\{1, 2, \ldots, n\}$, $x_\pi^T = [x_{\pi_1}, x_{\pi_2}, \ldots, x_{\pi_n}]$ and $|x|^T = [|x_1|, |x_2|, \ldots, |x_n|]$. We consider two norms, firstly the operator norm and secondly the norm

    $\|X\| = |\sigma(X)|,$    (2)

where $\sigma(X)$ are the singular values of $X$, arbitrarily ordered. While it is easy to see that (2) is a unitary norm (i.e. invariant under multiplication by a unitary matrix), von Neumann proved that all unitary norms are of this form. We remark that the standard $\ell_p[\mathbb{R}^n]$ vector norm, $1 \le p \le \infty$, is a symmetric gauge. Therefore it gives rise to a unitarily-invariant


norm, the Schatten $p$-norm $\|\cdot\|_p = |\sigma(\,\cdot\,)|_p$ [Horn & Johnson, 1994].

We consider just the case $n = 2$, since it can be embedded in $M_n[\mathbb{R}]$ for any $n \ge 2$ and this is sufficient for analysing the upper bound for general $n \ge 2$. Let
$$X = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \quad Y = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix} \quad \Rightarrow \quad Z = [X,Y] = \begin{bmatrix} 0 & 2 \\ 2 & 0 \end{bmatrix}.$$
It is easy to verify that $|X| = |Y| = 1$ and $|Z| = 2$. Moreover, since $\sigma(X) = \sigma(Y) = [1,1]$ and $\sigma(Z) = [2,2]$, it is also true that $\|X\| = \|Y\| = |\mathbf{1}|$ and $\|Z\| = 2|\mathbf{1}|$, where $\mathbf{1}^\top = [1,1]$. In both cases the upper bound in (1) is attained.

Yet, there is a basic difference between $M_n[\mathbb{R}]$ and a Lie algebra $\mathfrak{g} \subset M_n[\mathbb{R}]$: while $\dim M_n[\mathbb{R}] = n^2$, the Lie algebra typically has a lower dimension: for example, $\dim \mathfrak{so}(n) = (1/2)(n-1)n$. Thus, it makes sense to pose the question whether, once $X$ and $Y$ are restricted to $\mathfrak{g}$, the inequality (1) might still be obeyed as an equality or whether 2 might be replaced by a smaller constant for all $X, Y \in \mathfrak{g}$. Thus, given a norm $\|\cdot\|$, we say that the radius of a Lie algebra $\mathfrak{g}$ is the least number $\omega(\mathfrak{g}) \in [0,2]$ such that
$$\|[X,Y]\| \le \omega(\mathfrak{g})\,\|X\| \cdot \|Y\|, \qquad X, Y \in \mathfrak{g}.$$
In other words,
$$\omega(\mathfrak{g}) = \max\left\{ \frac{\|[X,Y]\|}{\|X\| \cdot \|Y\|} : X, Y \in \mathfrak{g},\ X, Y \ne O \right\}, \tag{3}$$
the operator norm of the commutator.

It is important when defining $\omega$ to keep in mind which underlying norm we are using. In the following we shall denote by $\|v\|_p$, $v \in \mathbb{R}^n$, the vector $p$-norm and by
$$\|X\|_p = \max\left\{ \frac{\|Xv\|_p}{\|v\|_p} : v \ne 0 \right\}$$
the corresponding operator norm as above. In the case $p = 2$ we shall call this the Euclidean norm. We denote by
$$\|X\|_F = \left( \sum_{k,l=1}^{n} X_{k,l}^2 \right)^{1/2}$$
the Frobenius norm.

We recall also the important facts to be used below, namely that $\|X\|_2$ is equal to the magnitude of the largest singular value of $X$ while $\|X\|_F = \|\sigma(X)\|_2$.

When the context is not clear we will label $\omega$ by a subscript denoting which norm is being used.

Trivially, the Lie algebra $\mathfrak{g}$ is commutative if and only if $\omega(\mathfrak{g}) = 0$, but this observation is devoid of any insight. More interestingly, consider $\mathfrak{so}(3)$ and the Euclidean norm. Letting
$$X = \begin{bmatrix} 0 & x_1 & x_2 \\ -x_1 & 0 & x_3 \\ -x_2 & -x_3 & 0 \end{bmatrix}, \qquad Y = \begin{bmatrix} 0 & y_1 & y_2 \\ -y_1 & 0 & y_3 \\ -y_2 & -y_3 & 0 \end{bmatrix}$$
and observing that in $\mathfrak{so}(n)$ the Euclidean norm coincides with the spectral radius, we commence by noting that
$$\|X\| = \|x\|, \qquad \|Y\| = \|y\|.$$
Moreover, if $Z = [X,Y]$ then, by an easy direct calculation,
$$\|Z\|^2 = \|x\|^2 \cdot \|y\|^2 - (x^\top y)^2. \tag{4}$$
Therefore
$$\|Z\|^2 = \|X\|^2\,\|Y\|^2 - (x^\top y)^2 \le \|X\|^2 \cdot \|Y\|^2,$$
with the upper bound holding as an equality when $x$ is orthogonal to $y$. We thus deduce that $\omega(\mathfrak{so}(3)) = 1$.

Remark 1. There is a natural Lie algebra homomorphism between $\mathfrak{so}(3)$ and $\mathbb{R}^3$ endowed with the cross product. The above computation may be repeated with this in mind and (4) is a standard vector identity. One could of course use the "hat" notation (see e.g. [Marsden & Ratiu, 1999]) for this homomorphism but we prefer our notation here because we require below a more general relationship between vectors and matrices.

Remark 2. It is also of interest to repeat the above computation for the Frobenius norm. One determines immediately that $\omega_F(\mathfrak{so}(3)) = 1/\sqrt{2}$. However, for Lie algebraic reasons that will become apparent below, it is more natural to scale the Frobenius norm by a factor of $\sqrt{2}$. With this scaling we also have $\omega_F(\mathfrak{so}(3)) = 1$. Strikingly, this result does not hold for $n$ larger than 3.

The Main Result. In this paper we determine $\omega(\mathfrak{so}(n))$ for all $n \ge 3$ ($\mathfrak{so}(2)$ is a commutative

algebra, hence $\omega(\mathfrak{so}(2)) = 0$) with respect to the (scaled) Frobenius norm. Specifically, we prove that $\omega_F(\mathfrak{so}(n)) = \sqrt{2}$ for $n \ge 4$ (with the above-mentioned scaling). Note that $\|X\|_F^2 = -\langle X, X \rangle$, where $\langle \cdot, \cdot \rangle$ is a multiple of the Killing form in $\mathfrak{so}(n)$, hence it has deeper Lie-algebraic significance. The Killing form evaluated on a pair of $n \times n$ skew-symmetric matrices $A$, $B$ is actually $(n-2)\,\mathrm{trace}\,AB$ (see [Kobayashi & Nomizu, 1969]). (Of course for a noncompact Lie algebra the Killing form does not provide a norm since it is not definite.) Another reason why the Frobenius norm is of interest is that the radius of $\mathfrak{so}(n)$, $n \ge 4$, is just equal to 2 for most other norms of interest. Consider for example the following analysis.

Thus, again, let $|\cdot|$ be a symmetric gauge and
$$X = \begin{bmatrix} 0 & 1 & 0 & 0 \\ -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & -1 & 0 \end{bmatrix}, \quad Y = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \\ -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \quad \Rightarrow \quad [X,Y] = \begin{bmatrix} 0 & 0 & 0 & -2 \\ 0 & 0 & -2 & 0 \\ 0 & 2 & 0 & 0 \\ 2 & 0 & 0 & 0 \end{bmatrix}.$$
Note that for every $v \in \mathbb{R}^4$ positivity and symmetry of the symmetric gauge imply that
$$|Xv| = \left| [v_2, -v_1, v_4, -v_3]^\top \right| = |v|$$
and, similarly, $|Yv| = |v|$ and $|[X,Y]v| = 2|v|$. Therefore, in the underlying operator norm, $|X| = |Y| = 1$ and $|[X,Y]| = 2$. Consequently $\omega(\mathfrak{so}(4)) = 2$ and this can be extended to all Lie algebras $\mathfrak{so}(n)$, $n \ge 4$, since they form a flag. This example cannot, however, be extended to unitary norms (2), unless $\|\mathbf{1}\| = 1$. Note that the latter condition holds when $|\cdot| = |\cdot|_\infty$ (the $\infty$-Schatten norm [Horn & Johnson, 1994], which is equivalent to the operator Euclidean norm), but in that instance $\|\cdot\| = \|\cdot\|_2$ and we are back to the area covered earlier in this paragraph. On the other hand, $|\cdot| = |\cdot|_2$, whence $\|\mathbf{1}\| = 2$, results in $\|\cdot\| = \|\cdot\|_F$. This, of course, does not necessarily mean that $\omega(\mathfrak{so}(4)) < 2$ in the Frobenius norm.

In Sec. 2 we discuss the structure of the commutator operator, considered as a linear transformation from $\mathfrak{so}(n)$ to itself. We prove that, subject to an appropriate representation of $\mathfrak{so}(n)$, the commutator matrix in the $(1/2)(n-1)n$-dimensional linear space can be read explicitly from a certain directed graph and investigate its eigenstructure. Section 3 is devoted to the proof of the main result of this paper, namely that, once we use the (scaled) Frobenius norm, $\omega(\mathfrak{so}(n)) = \sqrt{2}$ for all $n \ge 4$.

The subject matter of this paper is motivated by a raft of issues arising from geometric numerical integration. The simplest such problem is the convergence of the sum
$$F(t; X, Y) = \sum_{m=0}^{\infty} a_m t^m\, \mathrm{ad}_X^m Y, \qquad X, Y \in \mathfrak{g},$$
where $\{a_m\}_{m \in \mathbb{Z}_+}$ is a given sequence and $\mathrm{ad}_X$ is the adjoint operator of the Lie algebra $\mathfrak{g}$,
$$\mathrm{ad}_X^0 Y = Y, \qquad \mathrm{ad}_X^m Y = [X, \mathrm{ad}_X^{m-1} Y], \quad m \in \mathbb{N}.$$
It is trivial to deduce from the triangle inequality that
$$\|F(t; X, Y)\| \le \|Y\| \sum_{m=0}^{\infty} |a_m| \left( |t|\, \omega(\mathfrak{g})\, \|X\| \right)^m,$$
thereby relating the convergence of $F$ to the domain of analyticity of the generating function $\sum_{m=0}^{\infty} a_m z^m$. The benefit of smaller $\omega(\mathfrak{g})$ in the convergence of such a function is clear. Similar and more complicated problems abound in the analysis of Lie-group methods [Iserles et al., 2000].

The norm of a bracket is also important in determining the maximum allowable step size in certain minimization problems on adjoint orbits, see the work of [Brockett, 1993].

A related problem of interest in analysing certain systems of differential equations is that of finding a bound on the norm of the bracket $[X, N]$ where $N$ is fixed and $X$ varies over the adjoint orbit of a group. This problem is discussed in [Brockett, 1994]. In that setting for $\mathfrak{so}(n)$ one has to solve the problem of maximizing $\|[X, N]\|$ over all $X = \Theta^\top \Lambda \Theta$ for $N$, $\Lambda$ fixed in $\mathfrak{so}(n)$ and $\Theta$ in the group $SO(n)$.

2. The Reduced Commutator Matrix in $\mathfrak{so}(n)$

2.1. The reduced commutator matrix

Let $\mathfrak{g} \subset M_n[\mathbb{R}]$ be an $m$-dimensional matrix Lie algebra, $1 \le m \le n^2$. An obvious means to

explore the norm of the commutator in $\mathfrak{g}$ is by means of the natural embedding $\theta : \mathfrak{g} \to \mathbb{R}^{n^2}$ that "stretches" a matrix $X$ into a vector, e.g. by letting $\theta_{(l-1)n+k}(X) = X_{k,l}$, $k, l = 1, 2, \ldots, n$ (columnwise ordering). Since commutation is a linear transformation, it follows that for every $X \in \mathfrak{g}$ there exists a matrix $C_X \in M_{n^2}[\mathbb{R}]$ such that
$$\theta([X,Y]) = C_X\, \theta(Y), \qquad Y \in \mathfrak{g}.$$
It is known that
$$\sigma(C_X) = \{ \lambda_k - \lambda_l : \lambda_k, \lambda_l \in \sigma(X),\ k, l = 1, 2, \ldots, n \}$$
[Hille, 1969], and this provides a useful tool to explore commutators in a classical setting. Yet, this line of reasoning disregards the fact that $\mathfrak{g}$ is a Lie algebra, typically of much smaller dimension than that of $M_n[\mathbb{R}]$. Thus, in place of $\theta$, we propose a restricted embedding $\nu : \mathfrak{g} \to \mathbb{R}^m$. Let $\mathcal{Q} = \{Q_1, Q_2, \ldots, Q_m\}$ be a basis of $\mathfrak{g}$. We define an isomorphism $\nu$ from $\mathfrak{g}$ on $\mathbb{R}^m$ through
$$\mathfrak{g} \ni X = \sum_{k=1}^{m} x_k Q_k \quad \Longleftrightarrow \quad \nu(X) = x = [x_1, x_2, \ldots, x_m]^\top.$$

Remark. We note that this is just a vector space isomorphism and not in general a Lie algebra homomorphism and there is in general no natural cross-product operation in $\mathbb{R}^m$. Thus one cannot use the earlier argument for $\mathfrak{so}(3)$.

The restricted commutator matrix $C_X \in M_m[\mathbb{R}]$ is then defined by the identity
$$\nu([X,Y]) = C_X\, \nu(Y), \qquad Y \in \mathfrak{g}.$$
Spectral information on $C_X$ is no longer readily and explicitly available, yet the procedure has the great virtue of reducing the dimension and allowing for a more natural incorporation of Lie-algebraic information. Specifically, let $\{c^{j}_{k,l}\}_{k,l,j=1,2,\ldots,m}$ be the structure constants of $\mathfrak{g}$ with respect to $\mathcal{Q}$,
$$[Q_k, Q_l] = \sum_{j=1}^{m} c^{j}_{k,l}\, Q_j.$$
It is an elementary exercise that
$$(C_X)_{j,l} = \sum_{k=1}^{m} x_k\, c^{j}_{k,l}, \qquad j, l = 1, 2, \ldots, m. \tag{5}$$

Recalling that the definition of the radius $\omega(\mathfrak{g})$ of the Lie algebra is
$$\omega(\mathfrak{g}) = \max\left\{ \frac{\|[X,Y]\|}{\|X\| \cdot \|Y\|} : X, Y \in \mathfrak{g},\ X, Y \ne O \right\},$$
where $\|\cdot\|$ is a given norm induced by the vector norm on $\mathbb{R}^m$, i.e. $\|X\| = \|\nu(X)\|$, we observe that

Proposition 1.
$$\omega(\mathfrak{g}) = \max\left\{ \frac{\|C_X\|}{\|X\|} : X \in \mathfrak{g},\ X \ne O \right\}, \tag{6}$$
where the Lie algebra norm is that induced by the map $\nu$.

Proof. We have
$$\|[X,Y]\| = \|\nu([X,Y])\| = \|C_X\, \nu(Y)\|.$$
Hence
$$\omega(\mathfrak{g}) = \max_{X \in \mathfrak{g} \setminus \{O\}}\ \max_{Y \in \mathfrak{g} \setminus \{O\}} \frac{\|C_X\, \nu(Y)\|}{\|\nu(X)\| \cdot \|\nu(Y)\|} = \max_{X \in \mathfrak{g} \setminus \{O\}} \frac{\|C_X\|}{\|\nu(X)\|}. \qquad \square$$

We conclude this section by addressing the question of multiple representations. Suppose thus that we have two bases of $\mathfrak{g}$, $\mathcal{Q} = \{Q_1, Q_2, \ldots, Q_m\}$ and $\mathcal{P} = \{P_1, P_2, \ldots, P_m\}$, say. Set
$$P = [p_1\ p_2\ \cdots\ p_m], \qquad Q = [q_1\ q_2\ \cdots\ q_m],$$
where $p_k = \nu(P_k)$, $q_k = \nu(Q_k)$, $k = 1, 2, \ldots, m$. Then
$$\mathfrak{g} \ni X = \sum_{k=1}^{m} x_k P_k = \sum_{k=1}^{m} \tilde{x}_k Q_k$$
implies at once that $x = Q\tilde{x}$. Therefore $C_X^{\mathcal{Q}} = Q^{-1} C_X^{\mathcal{P}} Q$, where $C_X^{\mathcal{P}}$ and $C_X^{\mathcal{Q}}$ are reduced commutators with respect to the two bases. In particular, if $Q$ is an orthogonal matrix and the bases are orthogonally similar then the radius of $\mathfrak{g}$ does not depend on the choice of the basis.

Example. We calculate $\omega(\mathfrak{so}(3))$ using this formalism. If
$$X = \begin{bmatrix} 0 & a & b \\ -a & 0 & c \\ -b & -c & 0 \end{bmatrix}$$

then we easily compute
$$C_X = \begin{bmatrix} 0 & c & -b \\ -c & 0 & a \\ b & -a & 0 \end{bmatrix}.$$
Now, $\nu(X) = [a, b, c]^\top$, and hence $\|\nu(X)\|_2 = (a^2 + b^2 + c^2)^{1/2}$. On the other hand, $\sigma(C_X) = \{0, \pm \mathrm{i}(a^2 + b^2 + c^2)^{1/2}\}$, hence $\|C_X\|_2 = (a^2 + b^2 + c^2)^{1/2}$ and we obtain $\omega(\mathfrak{so}(3)) = 1$.

2.2. The reduced commutator matrix in $\mathfrak{so}(n)$ and directed graphs

We denote by $E_{k,l} \in M_n[\mathbb{R}]$ the matrix whose $(k,l)$th component is $+1$ and otherwise is zero, $k, l = 1, 2, \ldots, n$, and choose the basis
$$\mathcal{Q} = \{ Q_{k,l} = E_{k,l} - E_{l,k} : 1 \le k < l \le n \}$$
of $\mathfrak{so}(n)$. The restricted embedding $\nu$ takes each $Q_{k,l}$ to $e_{\mu(k,l)} \in \mathbb{R}^m$, where $m = (1/2)(n-1)n$ and $\mu$ is an arbitrary isomorphism mapping pairs $\mathcal{I} = \{(k,l) : 1 \le k < l \le n\}$ into $\{1, 2, \ldots, m\}$. However, it is more convenient to discuss restricted commutator matrices in the formalism of the $Q_{k,l}$, bypassing $\nu$ altogether. Thus, we index the structure constants and the entries of the restricted commutator matrix by pairs $(i,j) \in \mathcal{I}$.

For ease of notation we let $Q_{k,l} = -Q_{l,k}$ for $k > l$ and $Q_{k,k} = O$. Since
$$[Q_{k,l}, Q_{r,s}] = \delta_{l,r} Q_{k,s} - \delta_{k,r} Q_{l,s} - \delta_{l,s} Q_{k,r} + \delta_{k,s} Q_{l,r},$$
the structure constants are
$$c^{(i,j)}_{(k,l),(r,s)} = \begin{cases} -1, & k = r,\ l \ne s,\ i = l,\ j = s, \\ +1, & k \ne r,\ l = r,\ i = k,\ j = s, \\ -1, & k \ne r,\ l = s,\ i = k,\ j = r, \\ +1, & k = s,\ l \ne r,\ i = l,\ j = r, \\ 0, & \text{otherwise}, \end{cases} \qquad (i,j) \in \mathcal{I}. \tag{7}$$
In other words, most structure constants vanish: not surprising, given that our basis is consistent with the root space decomposition of $\mathfrak{so}(n)$, hence likely to lend itself to a sparse set of structure constants. Specifically, the nonzero structure constants are precisely
$$\begin{aligned}
c^{(l,s)}_{(k,l),(k,s)} &= -1, \quad l < s, & c^{(s,l)}_{(k,l),(k,s)} &= +1, \quad l > s, \\
c^{(k,s)}_{(k,l),(l,s)} &= +1, \quad k < s, & c^{(s,k)}_{(k,l),(l,s)} &= -1, \quad k > s, \\
c^{(k,r)}_{(k,l),(r,l)} &= -1, \quad k < r, & c^{(r,k)}_{(k,l),(r,l)} &= +1, \quad k > r, \\
c^{(l,r)}_{(k,l),(r,k)} &= +1, \quad l < r, & c^{(r,l)}_{(k,l),(r,k)} &= -1, \quad l > r.
\end{aligned}$$
Given
$$\mathfrak{so}(n) \ni X = \sum_{k=1}^{n-1} \sum_{l=k+1}^{n} x_{k,l}\, Q_{k,l},$$
(5) implies that
$$(C_X)_{(k,l),(i,j)} = \sum_{(r,s) \in \mathcal{I}} x_{r,s}\, c^{(k,l)}_{(r,s),(i,j)} = -\sum_{(r,s) \in \mathcal{I}} x_{r,s}\, c^{(r,s)}_{(k,l),(i,j)}, \qquad (k,l), (i,j) \in \mathcal{I}.$$
We observe that $C_X \in \mathfrak{so}(m)$ and that it is a very sparse matrix. Specifically, for any $(k,l) \in \mathcal{I}$ the only nonzero entries are
$$\begin{aligned}
(C_X)_{(k,l),(l,j)} &= x_{k,j}, & j &= l+1, l+2, \ldots, n, \\
(C_X)_{(k,l),(k,j)} &= x_{l,j}, & j &= k+1, k+2, \ldots, n, \\
(C_X)_{(k,l),(i,l)} &= x_{i,k}, & i &= 1, 2, \ldots, l-1,\ i \ne k, \\
(C_X)_{(k,l),(i,k)} &= x_{i,l}, & i &= 1, 2, \ldots, k-1.
\end{aligned} \tag{8}$$
Altogether, just $(n-2)(n-1)n$, out of $(1/4)(n-1)^2 n^2$, entries of $C_X$ are nonzero.

The elements of $C_X$ lend themselves to a very convenient representation in terms of labeled digraphs. Any matrix $A \in M_m[\mathbb{R}]$ can be represented by a digraph with $m$ vertices, adopting the convention that, once $A_{k,l} \ne 0$, then there is a directed edge from vertex $k$ to vertex $l$ with the label $A_{k,l}$. As an example, let us examine the digraph

corresponding to $C_X$ for $n = 4$ (hence $m = 6$):

[Figure: labeled digraph of $C_X$ for $n = 4$]

We commence by noting that the graph is 4-regular [Chartrand & Lesniak, 1986]: each vertex is of degree 4. Moreover, two of the edges at each vertex commence and two terminate there.

A generalization for all $n \ge 3$ is clear from (8). For every $1 \le k < l < j \le n$ we have
$$(C_X)_{(k,l),(l,j)} = x_{k,j}, \qquad (C_X)_{(l,j),(k,j)} = x_{k,l}, \qquad (C_X)_{(k,j),(k,l)} = x_{l,j}.$$
In the notation of labeled digraphs this corresponds to the 3-cycle

[Figure: labeled 3-cycle on the vertices $(k,l)$, $(l,j)$, $(k,j)$] (9)

Needless to say, (9) can be read "backwards": an arrow from $(l,j)$ to $(k,j)$ with a label $x_{k,l}$ is the same as an arrow from $(k,j)$ to $(l,j)$ with the label $-x_{k,l}$.

Lemma 2. For all $n \ge 3$ the directed graph of $C_X$ is the sum of all $\binom{n}{3}$ 3-cycles (9) for all $1 \le k < l < j \le n$. It is $r$-regular, where $r = 2(n-2)$.

Proof. The first statement of the lemma follows at once from our analysis. Because of symmetry, clearly the graph must be $r$-regular for some $r \ge 1$. Therefore, the sum of all the degrees of all the vertices is $mr$. Since each 3-cycle (9) accounts for exactly six degrees and $m = (1/2)(n-1)n$, we have
$$r = \frac{6 \binom{n}{3}}{\tfrac{1}{2}(n-1)n} = 2(n-2). \qquad \square$$

Given $X, Y \in \mathfrak{so}(n)$, $n \ge 3$, we can reconstruct the representation of $[X,Y]$ in the basis $\mathcal{Q}$ directly from the digraph in Lemma 2. Since
$$[X,Y] = \sum_{(k,l) \in \mathcal{I}} \sum_{(r,s) \in \mathcal{I}} x_{k,l}\, y_{r,s} \left( \delta_{l,r} Q_{k,s} - \delta_{k,r} Q_{l,s} - \delta_{l,s} Q_{k,r} + \delta_{k,s} Q_{l,r} \right),$$
we have the following contributions to the $(i,j) \in \mathcal{I}$ component of the commutator,
$$\begin{array}{llll}
(k,s) = (i,j), & r = l: & +x_{i,l}\, y_{l,j}, & i+1 \le l \le j-1, \\
(l,s) = (i,j), & r = k: & -x_{k,i}\, y_{k,j}, & 1 \le k \le \min\{i,j\} - 1, \\
(k,r) = (i,j), & s = l: & -x_{i,l}\, y_{j,l}, & \max\{i,j\} + 1 \le l \le n, \\
(l,r) = (i,j), & s = k: & +x_{k,i}\, y_{j,k}, & j+1 \le k \le i-1, \\
(k,s) = (j,i), & r = l: & -x_{j,l}\, y_{l,i}, & j+1 \le l \le i-1, \\
(l,s) = (j,i), & r = k: & +x_{k,j}\, y_{k,i}, & 1 \le k \le \min\{i,j\} - 1, \\
(k,r) = (j,i), & s = l: & +x_{j,l}\, y_{i,l}, & \max\{i,j\} + 1 \le l \le n, \\
(l,r) = (j,i), & s = k: & -x_{k,j}\, y_{i,k}, & i+1 \le k \le j-1.
\end{array}$$
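
The restricted commutator matrix can also be assembled numerically straight from its defining identity $\nu([X,Y]) = C_X\, \nu(Y)$, without going through the structure constants. The sketch below is a minimal illustration under stated assumptions: the pairs $(k,l)$ are ordered lexicographically (one admissible choice of $\mu$, not one fixed by the text), column $j$ of $C_X$ is computed as $\nu([X, Q_j])$, and all function names are hypothetical. It then checks the identity for one $Y$, the skew-symmetry and row sparsity of $C_X$ for $n = 4$, and the $(1,3)$ component obtained from the list above.

```python
def pairs(n):
    # Index set {(k, l) : 1 <= k < l <= n}, lexicographic order (our choice of mu).
    return [(k, l) for k in range(1, n + 1) for l in range(k + 1, n + 1)]

def to_matrix(x, n):
    # X = sum of x_{k,l} * Q_{k,l}, with Q_{k,l} = E_{k,l} - E_{l,k}.
    X = [[0.0] * n for _ in range(n)]
    for (k, l), value in zip(pairs(n), x):
        X[k - 1][l - 1] = value
        X[l - 1][k - 1] = -value
    return X

def nu(X):
    # Coordinates of a skew-symmetric matrix in the basis Q_{k,l}.
    return [X[k - 1][l - 1] for (k, l) in pairs(len(X))]

def commutator(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] - Y[i][k] * X[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def restricted_commutator(x, n):
    # Column j of C_X is nu([X, Q_j]), so that nu([X, Y]) = C_X nu(Y).
    m = len(x)
    X = to_matrix(x, n)
    columns = []
    for j in range(m):
        unit = [0.0] * m
        unit[j] = 1.0
        columns.append(nu(commutator(X, to_matrix(unit, n))))
    return [[columns[j][i] for j in range(m)] for i in range(m)]

n = 4
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]    # x12, x13, x14, x23, x24, x34
y = [6.0, -1.0, 2.0, 0.5, -3.0, 4.0]  # y12, y13, y14, y23, y24, y34
C = restricted_commutator(x, n)
lhs = nu(commutator(to_matrix(x, n), to_matrix(y, n)))
rhs = [sum(C[i][j] * y[j] for j in range(len(y))) for i in range(len(y))]
nonzeros_per_row = [sum(1 for v in row if v != 0.0) for row in C]
# (1,3) component read off the contribution list for n = 4:
entry_13 = x[0] * y[3] - x[2] * y[5] + x[5] * y[2] - x[3] * y[0]
```

Because the matrix is built from the identity itself, the sketch does not depend on any particular sign convention for the digraph labels.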

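For $n = 4$ and $(i,j) = (1,3)$ just four terms survive from the above list,
$$[X,Y]_{1,3} = x_{1,2}\, y_{2,3} - x_{1,4}\, y_{3,4} + x_{3,4}\, y_{1,4} - x_{2,3}\, y_{1,2}.$$
Examine now the digraph for $n = 4$: $(1,3)$ is connected to $(2,3)$ with label $x_{1,2}$ and outgoing arrow, to $(3,4)$ with label $x_{1,4}$ and incoming arrow, etc. In general, it is easy to confirm the following general rule for the reconstruction of the commutator in our basis.

Lemma 3. Let $n \ge 3$. Then, for every $(i,j) \in \mathcal{I}$ the element $[X,Y]_{i,j}$ is the sum of terms of the form $\pm x_{k,l}\, y_{r,s}$ over all the $2(n-2)$ edges adjoining the vertices $(i,j)$ and $(r,s)$ with the weight $x_{k,l}$, and with the sign being $+1$ if the arrow is outgoing from $(i,j)$, $-1$ otherwise.

Lemma 3 becomes very useful when the matrix $Y$ is sparse, since the algorithm therein lends itself handily to the exploitation of structure and sparsity.

3. The Radius of $\mathfrak{so}(n)$ for $n \ge 4$

3.1. The eigenstructure of $\mathfrak{so}(n)$ in $\mathbb{R}^m$

The evaluation of the Frobenius norm of a commutator comes as something of an anticlimax, since the spectrum of the restricted commutator operator can be evaluated with relative ease. We have already noted that the eigenvalues of the full commutator operator, acting in $\mathbb{R}^{n^2}$, are $\{\mathrm{i}(\lambda_k - \lambda_l) : k, l = 1, 2, \ldots, n\}$, where $\sigma(X) = \{\mathrm{i}\lambda_1, \mathrm{i}\lambda_2, \ldots, \mathrm{i}\lambda_n\}$. Our contention is that $m = (1/2)(n-1)n$ of these eigenvalues survive intact once we consider the restricted commutator.

To this end, we commence by revisiting the classical analysis of the eigenstructure of the full commutator. Thus, suppose that $X \in M_n[\mathbb{R}]$ has a full set of eigenvectors, therefore $X = V D V^{-1}$, where $D = \mathrm{diag}\, \lambda$. For every $k, l = 1, 2, \ldots, n$, $k \ne l$, we set $E_{k,l} \in M_n[\mathbb{R}]$ as a zero matrix, except for a unit element at the $(k,l)$ entry. Therefore
$$V^{-1} [X, V E_{k,l} V^{-1}] V = D E_{k,l} - E_{k,l} D = (\lambda_k - \lambda_l) E_{k,l}$$
and $[X, W_{k,l}] = (\lambda_k - \lambda_l) W_{k,l}$, where $W_{k,l} = V E_{k,l} V^{-1}$, $k, l = 1, 2, \ldots, n$. However, if $X$ resides in a Lie algebra $\mathfrak{g}$, we cannot expect $W_{k,l}$ to belong to $\mathfrak{g}$: if $\mathfrak{g} = \mathfrak{so}(n)$ then this is in general false.

Suppose that $X \in \mathfrak{so}(n)$ and assume that $n = 2N$ (the case of an odd $n$ will be addressed briefly in the sequel). We set $J = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$. Then there exists a matrix $Q \in SO(n)$ such that
$$Q^\top X Q = \Lambda = \begin{bmatrix} \Lambda_1 & O & \cdots & O \\ O & \Lambda_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & O \\ O & \cdots & O & \Lambda_N \end{bmatrix},$$
where $\Lambda_k = \alpha_k J$, $k = 1, 2, \ldots, N$. Note that the eigenvalues of $X$ are $\pm \mathrm{i}\alpha_k$, $k = 1, 2, \ldots, N$.

Choose $1 \le k < l \le N$ and let $V \in \mathfrak{so}(n, \mathbb{C})$ be a zero matrix, except that
$$\begin{bmatrix} V_{2k-1,2l-1} & V_{2k-1,2l} \\ V_{2k,2l-1} & V_{2k,2l} \end{bmatrix} = U, \qquad \begin{bmatrix} V_{2l-1,2k-1} & V_{2l-1,2k} \\ V_{2l,2k-1} & V_{2l,2k} \end{bmatrix} = -U^\top,$$
where $U = \begin{bmatrix} u_{1,1} & u_{1,2} \\ u_{2,1} & u_{2,2} \end{bmatrix}$. Letting $Z = [\Lambda, V]$, we observe that all the entries of $Z$ vanish, except for
$$\begin{bmatrix} Z_{2k-1,2l-1} & Z_{2k-1,2l} \\ Z_{2k,2l-1} & Z_{2k,2l} \end{bmatrix} = \Lambda_k U - U \Lambda_l, \qquad \begin{bmatrix} Z_{2l-1,2k-1} & Z_{2l-1,2k} \\ Z_{2l,2k-1} & Z_{2l,2k} \end{bmatrix} = U^\top \Lambda_k - \Lambda_l U^\top.$$
Assume that $\gamma \in \mathbb{C}$ and $u \ne 0$, $u = [u_{1,1}, u_{1,2}, u_{2,1}, u_{2,2}]^\top$, are an eigenvalue and an eigenvector, respectively, of the matrix
$$\begin{bmatrix} 0 & \alpha_l & \alpha_k & 0 \\ -\alpha_l & 0 & 0 & \alpha_k \\ -\alpha_k & 0 & 0 & \alpha_l \\ 0 & -\alpha_k & -\alpha_l & 0 \end{bmatrix}.$$
Then $\Lambda_k U - U \Lambda_l = \gamma U$ and it follows that $\nu(V)$ is an eigenvector of $C_\Lambda$, corresponding to the eigenvalue $\gamma$. This results for each $k < l$ in four eigenvalue/eigenvector pairs,
$$\begin{aligned}
\gamma &= \mathrm{i}(\alpha_k + \alpha_l), & U &= \begin{bmatrix} 1 & \mathrm{i} \\ \mathrm{i} & -1 \end{bmatrix}, \\
\gamma &= \mathrm{i}(-\alpha_k + \alpha_l), & U &= \begin{bmatrix} 1 & \mathrm{i} \\ -\mathrm{i} & 1 \end{bmatrix}, \\
\gamma &= \mathrm{i}(\alpha_k - \alpha_l), & U &= \begin{bmatrix} 1 & -\mathrm{i} \\ \mathrm{i} & 1 \end{bmatrix}, \\
\gamma &= \mathrm{i}(-\alpha_k - \alpha_l), & U &= \begin{bmatrix} 1 & -\mathrm{i} \\ -\mathrm{i} & -1 \end{bmatrix}.
\end{aligned}$$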
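
The four eigenvalue/eigenvector pairs can be checked by direct substitution. The following sketch is an illustration under assumptions that are not fixed by the text: $U$ is vectorised row by row as $u = [u_{1,1}, u_{1,2}, u_{2,1}, u_{2,2}]$, $J = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$, and the values of $\alpha_k$, $\alpha_l$ are arbitrary test numbers. It evaluates both the $4 \times 4$ eigenvalue residual and the block residual $\Lambda_k U - U \Lambda_l - \gamma U$; the helper names are hypothetical.

```python
def coefficient_matrix(alpha_k, alpha_l):
    # Matrix of the linear map U -> Lambda_k U - U Lambda_l on row-wise vectorised U.
    return [[0, alpha_l, alpha_k, 0],
            [-alpha_l, 0, 0, alpha_k],
            [-alpha_k, 0, 0, alpha_l],
            [0, -alpha_k, -alpha_l, 0]]

def eigen_residual(alpha_k, alpha_l, gamma, u):
    M = coefficient_matrix(alpha_k, alpha_l)
    return max(abs(sum(M[i][j] * u[j] for j in range(4)) - gamma * u[i]) for i in range(4))

def matmul2(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def block_residual(alpha_k, alpha_l, gamma, u):
    # Largest entry of Lambda_k U - U Lambda_l - gamma U, with Lambda = alpha * J.
    J = [[0, 1], [-1, 0]]
    U = [[u[0], u[1]], [u[2], u[3]]]
    lam_k = [[alpha_k * J[i][j] for j in range(2)] for i in range(2)]
    lam_l = [[alpha_l * J[i][j] for j in range(2)] for i in range(2)]
    left, right = matmul2(lam_k, U), matmul2(U, lam_l)
    return max(abs(left[i][j] - right[i][j] - gamma * U[i][j])
               for i in range(2) for j in range(2))

alpha_k, alpha_l = 1.5, 0.25  # arbitrary test values
candidates = [
    (1j * (alpha_k + alpha_l), [1, 1j, 1j, -1]),
    (1j * (-alpha_k + alpha_l), [1, 1j, -1j, 1]),
    (1j * (alpha_k - alpha_l), [1, -1j, 1j, 1]),
    (1j * (-alpha_k - alpha_l), [1, -1j, -1j, -1]),
]
residuals = [(eigen_residual(alpha_k, alpha_l, g, u), block_residual(alpha_k, alpha_l, g, u))
             for g, u in candidates]
```

Both residuals should vanish up to rounding for each of the four pairs; a different vectorisation order for $U$ would permute the rows and columns of the $4 \times 4$ matrix but leave the eigenvalues unchanged.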

Altogether, this results in $2(N-1)N = (1/2)(n-1)n - (1/2)n$ eigenvalues of $C_\Lambda$.

The remaining $N = (1/2)n$ eigenvalues of $C_\Lambda$ are zero. This is easy to verify by letting, for any $k = 1, 2, \ldots, N$, $V \in \mathfrak{so}(n, \mathbb{C})$ be zero, except that
$$\begin{bmatrix} V_{2k-1,2k-1} & V_{2k-1,2k} \\ V_{2k,2k-1} & V_{2k,2k} \end{bmatrix} = J,$$
whence $[\Lambda, V] = O$.

Once we have determined $\sigma(C_\Lambda)$, we note that $\sigma(C_X) = \sigma(C_\Lambda)$ whenever $X$ and $\Lambda$ are similar, since $X = Q \Lambda Q^{-1}$ means that
$$C_X\, \nu(Y) = \nu(Z) \quad \Longleftrightarrow \quad C_\Lambda\, \nu(Q^{-1} Y Q) = \nu(Q^{-1} Z Q).$$

Lemma 4. Suppose that $n = 2N$ and that the eigenvalues of $X \in \mathfrak{so}(n)$ are $\pm \mathrm{i}\alpha_k$, $k = 1, 2, \ldots, N$. Then the eigenvalues of the restricted commutator $C_X$ are
$$\mathrm{i}(\pm \alpha_k \pm \alpha_l), \qquad 1 \le k < l \le N, \tag{10}$$
as well as a zero eigenvalue of multiplicity $N$.

We note as an aside that we have just determined that the centralizer of $X \in \mathfrak{so}(n)$ is $(n/2)$-dimensional, as well as presenting its basis.

Lemma 5. Suppose that $n = 2N + 1$ and that the eigenvalues of $X \in \mathfrak{so}(n)$ are $\pm \mathrm{i}\alpha_k$, $k = 1, 2, \ldots, N$, and zero. Then the eigenvalues of the restricted commutator $C_X$ are
$$\mathrm{i}(\pm \alpha_k \pm \alpha_l), \qquad 1 \le k < l \le N, \tag{11}$$
$$\pm \mathrm{i}\alpha_k, \qquad 1 \le k \le N, \tag{12}$$
as well as a zero eigenvalue of multiplicity $N$.

Proof. Since $X \in \mathfrak{so}(n)$ is necessarily singular, we need to add to $\Lambda$ a bottom row and rightmost column of zeros: we denote the new, $(2N+1) \times (2N+1)$ matrix by $\tilde{\Lambda}$. All the eigenvectors of $C_\Lambda$, suitably padded by zeros, can be extended to $C_{\tilde{\Lambda}}$. Moreover, let $v \in \mathbb{C}^{2N}$ be a nonzero eigenvector of $\Lambda$ with an eigenvalue $\gamma$ and set
$$V = \begin{bmatrix} O & v \\ -v^\top & 0 \end{bmatrix}.$$
Then we can easily verify that $[\tilde{\Lambda}, V] = \gamma V$. Hence we recover the eigenvalues (12). Altogether we have $N(2N+1)$ eigenvalues, hence the full spectrum of $C_{\tilde{\Lambda}}$. Since the spectrum of the restricted commutator is invariant under similarity transformation, the proof is complete. $\square$

3.2. The radius of $\mathfrak{so}(n)$

Up to $\sqrt{2}$, measuring $\mathfrak{so}(n)$ in the Frobenius norm is the same as using the Euclidean norm in $\mathbb{R}^{(1/2)(n-1)n}$, $\|X\|_F = \sqrt{2}\, \|\nu(X)\|_2$. Moreover, $C_X$ is skew symmetric, therefore normal, and its Euclidean norm coincides with its spectral radius.

Theorem 6. For every $n \ge 4$ it is true that
$$\omega(\mathfrak{so}(n)) = \sqrt{2}. \tag{13}$$

Proof. We commence with even $n = 2N$ and assume that, without loss of generality,
$$|\alpha_1| \ge |\alpha_2| \ge \cdots \ge |\alpha_N|.$$
Therefore, according to (10),
$$\|C_X\|_2 = \rho(C_X) = |\alpha_1| + |\alpha_2|.$$
Since $\|X\|_F^2 = \sum_{\lambda \in \sigma(X)} |\lambda|^2$, we deduce that
$$\frac{\|C_X\|_2}{\|\nu(X)\|_2} = \frac{|\alpha_1| + |\alpha_2|}{\tfrac{1}{\sqrt{2}} \|X\|_F} = \frac{|\alpha_1| + |\alpha_2|}{\sqrt{\sum_{k=1}^{N} |\alpha_k|^2}} \le \frac{|\alpha_1| + |\alpha_2|}{\sqrt{|\alpha_1|^2 + |\alpha_2|^2}} \le \sqrt{2},$$
with the upper bound attainable when $\alpha_1 = \alpha_2 > 0$, $\alpha_k = 0$, $k \ge 3$, which corresponds to an embedding of $\mathfrak{so}(4)$ in the algebra. Note that the inequality above holds by Young's inequality for $p = 2$, i.e. we have $2|\alpha_1||\alpha_2| \le |\alpha_1|^2 + |\alpha_2|^2$. Therefore $\omega(\mathfrak{so}(2N)) = \sqrt{2}$.

The proof for $n = 2N + 1$ is virtually identical, since
$$\rho(C_X) = \max\left\{ \max_{1 \le k < l \le N} \left( |\alpha_k| + |\alpha_l| \right),\ \max_{k = 1, 2, \ldots, N} |\alpha_k| \right\} = |\alpha_1| + |\alpha_2|,$$
and we again obtain the radius (13). $\square$

Example. It is instructive to analyse the special case $\mathfrak{so}(4)$. Using the structure constants one can compute that for
$$X = \begin{bmatrix} 0 & x_1 & x_2 & x_3 \\ -x_1 & 0 & x_4 & x_5 \\ -x_2 & -x_4 & 0 & x_6 \\ -x_3 & -x_5 & -x_6 & 0 \end{bmatrix}$$

we have
$$C_X = \begin{bmatrix} 0 & x_4 & x_5 & -x_2 & -x_3 & 0 \\ -x_4 & 0 & x_6 & x_1 & 0 & -x_3 \\ -x_5 & -x_6 & 0 & 0 & x_1 & x_2 \\ x_2 & -x_1 & 0 & 0 & x_6 & -x_5 \\ x_3 & 0 & -x_1 & -x_6 & 0 & x_4 \\ 0 & x_3 & -x_2 & x_5 & -x_4 & 0 \end{bmatrix}.$$
The 2-norm of $C_X$ may then be computed to be $\left( \|x\|^2 + 2|x_1 x_6 - x_2 x_5 + x_3 x_4| \right)^{1/2}$. Using Lagrange multipliers to maximize this subject to $\|x\| = 1$ yields indeed that $\omega(\mathfrak{so}(4)) \le \sqrt{2}$.

It is of interest in fact to characterize all $\mathfrak{so}(4)$ matrices whose restricted commutator has the norm $\sqrt{2}$. These take either the form
$$X = \begin{bmatrix} 0 & a & b & c \\ -a & 0 & c & -b \\ -b & -c & 0 & a \\ -c & b & -a & 0 \end{bmatrix}$$
or
$$X = \begin{bmatrix} 0 & a & b & c \\ -a & 0 & -c & b \\ -b & c & 0 & -a \\ -c & -b & a & 0 \end{bmatrix}$$
for arbitrary $a, b, c \in \mathbb{R}$ which are not all zero. In each case the spectrum of $C_X$ consists of four zero eigenvalues and $\pm 2\mathrm{i}(a^2 + b^2 + c^2)^{1/2}$. Hence $\|X\|_F = 2(a^2 + b^2 + c^2)^{1/2}$ and thus $\|\nu(X)\|_2 = \sqrt{2}(a^2 + b^2 + c^2)^{1/2}$ and $\|C_X\|_2 = 2(a^2 + b^2 + c^2)^{1/2}$, and it follows that $\|C_X\|_2 / \|\nu(X)\|_2 = \sqrt{2}$.

4. Conclusion

We have defined the radius of a Lie algebra and computed its value for $\mathfrak{so}(n)$ and the Frobenius norm. It is of interest to compute the radius for other Lie algebras. We intend to do this in a future publication. In generalizing the work here one needs to distinguish between compact and noncompact Lie algebras (where the Killing form is definite and indefinite, respectively) and between real and complex algebras. The compact real form of a complex Lie algebra is natural to look at, for example $\mathfrak{su}(n)$, the compact real form of $\mathfrak{sl}(n, \mathbb{C})$. In the case of $\mathfrak{su}(2)$ one has of course a Lie algebra isomorphism between $\mathfrak{su}(2)$ and $\mathfrak{so}(3)$ and $\mathbb{R}^3$ endowed with the cross product. The map in this case is given by (see e.g. [Marsden & Ratiu, 1999])
$$x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \mapsto X = \begin{bmatrix} -\mathrm{i}x_3 & -\mathrm{i}x_1 - x_2 \\ -\mathrm{i}x_1 + x_2 & \mathrm{i}x_3 \end{bmatrix}.$$
Thus our earlier argument for $\mathfrak{so}(3)$ shows that $\omega(\mathfrak{su}(2)) = 1$ with respect to the norm induced by the vector norm on $\mathbb{R}^3$.

Acknowledgments

We would like to thank Brad Baxter, Ren-Cang Li, Elizabeth Mansfield, Alexei Shadrin and Mike Shub for useful comments. We would like also to thank the referee whose comments greatly improved the exposition. The research of AMB was supported in part by the National Science Foundation.

References

Brockett, R. [1993] "Differential geometry and the design of gradient algorithms," Proc. Symp. Pure Math. 54, 69-92.
Brockett, R. [1994] "Differential equations and matrix inequalities on isospectral families," Lin. Alg. Appl. 203/204, 189-207.
Carter, R., Segal, G. & Macdonald, I. [1995] Lectures on Lie Groups and Lie Algebras (Cambridge University Press, Cambridge).
Chartrand, G. & Lesniak, L. [1986] Graphs and Digraphs (Wadsworth & Brooks/Cole).
Hille, E. [1969] Lectures on Ordinary Differential Equations (Addison-Wesley, Reading, MA).
Horn, R. A. & Johnson, C. R. [1994] Topics in Matrix Analysis (Cambridge University Press, Cambridge).
Humphreys, J. E. [1978] Introduction to Lie Algebras and Representation Theory (Springer-Verlag, Berlin).
Iserles, A., Munthe-Kaas, H. Z., Nørsett, S. P. & Zanna, A. [2000] "Lie-group methods," Acta Numerica 9, 215-365.
Kobayashi, S. & Nomizu, K. [1969] Foundations of Differential Geometry (John Wiley, NY).
Marsden, J. E. & Ratiu, T. [1999] Introduction to Mechanics and Symmetry (Springer-Verlag, NY).
Varadarajan, V. S. [1984] Lie Groups, Lie Algebras, and Their Representations (Springer-Verlag, NY).
von Neumann, J. [1937] "Some matrix inequalities and metrization of matrix space," Tomsk Univ. Rev. 1, 286-300; Reprinted [1962] in Collected Works (Pergamon, Oxford), Vol. IV, pp. 205-218.
SIMPLE NEURAL NETWORKS THAT
OPTIMIZE DECISIONS
ERIC BROWN*, JUAN GAO+, PHILIP HOLMES*-*, RAFAL BOGACZ*-*,
MARK GILZENRAT*, and JONATHAN D. COHEN*
*Program in Applied and Computational Mathematics,
^Department of Mechanical and Aerospace Engineering,
^Department of Psychology,
Princeton University, Princeton, NJ 08544, USA

Received April 2, 2004; Revised July 7, 2004

We review simple connectionist and firing rate models for mutually inhibiting pools of neurons
that discriminate between pairs of stimuli. Both are two-dimensional nonlinear stochastic ordi-
nary differential equations, and although they differ in how inputs and stimuli enter, we show
that they are equivalent under state variable and parameter coordinate changes. A key parame-
ter is gain: the maximum slope of the sigmoidal activation function. We develop piecewise-linear
and purely linear models, and one-dimensional reductions to Ornstein-Uhlenbeck processes that
can be viewed as linear filters, and show that reaction time and error rate statistics are well
approximated by these simpler models. We then pose and solve the optimal gain problem for
the Ornstein-Uhlenbeck processes, finding explicit gain schedules that minimize error rates for
time-varying stimuli. We relate these to time courses of norepinephrine release in cortical areas,
and argue that transient firing rate changes in the brainstem nucleus locus coeruleus may be
responsible for approximate gain optimization.

Keywords: Gain; neural network model; decision task; stochastic differential equation; reaction
time; optimal speed and accuracy; matched filter; locus coeruleus.

1. Introduction

The psychological and neural bases of decision making are active areas of inquiry in cognitive science [Schall, 2001; Gold & Shadlen, 2001; Schall et al., 2002; Gold & Shadlen, 2002; Shadlen & Newsome, 2001; Platt & Glimcher, 1999; Stone, 1960; Laming, 1968; Ratcliff, 1978; Ratcliff et al., 1999; Usher & McClelland, 2001; Roitman & Shadlen, 2002; Wang, 2002]. There is a wealth of data on simple decision tasks which require discrimination among alternative stimuli as quickly and accurately as possible. Typically, this discriminatory process has been modeled as a competition among different neural populations, each representing alternate interpretations of the current stimulus [Cohen et al., 1990; Usher & McClelland, 2001]. Recent direct recordings in visual and motor areas of monkeys performing sensory discrimination tasks support this interpretation by revealing that, following training, certain "decision" neurons become selective for different stimulus alternatives, and upon presentation of the relevant stimulus their firing rates gradually increase accordingly; when these rates cross thresholds, the corresponding behavioral response is initiated (e.g. [Schall, 2001; Gold & Shadlen, 2001; Schall et al., 2002; Roitman & Shadlen, 2002; Gold & Shadlen, 2002]). This neural evidence adds to behavioral evidence noted below, suggesting that decisions are made by comparing integrated "weights of evidence", encoded by the firing rates of neural groups. Here, we explore


the computational mechanisms required to optimize such a process.

The stimuli relevant to making a decision are often not static: their saliences may change over time. In the simplest case, change occurs only at the moment when the stimulus itself appears. This is typically modeled in simulations of decision tasks (e.g. in [Cohen & Huston, 1994; Brown & Holmes, 2001; Cho et al., 2002], cf. [Laming, 1968]) by dividing the task into two distinct periods: a preparatory period, in which no stimulus is present, and a trial period, in which a stimulus of constant discriminability is presented. Alternatively, stimulus discriminability may change in a stepwise manner or vary continuously.

The following specific example motivates our analysis of two specific cases in Sec. 2.5. In the "moving dots" paradigm of the two alternative forced choice task [Britten et al., 1993; Shadlen & Newsome, 2001; Gold & Shadlen, 2002] a display of moving dots is presented, and the subject must indicate whether a majority of dots is moving to the right or the left. In the simplest case, the subject focuses on a neutral fixation point during the preparatory period, after which the dots appear, with a certain "coherent" fraction moving either left or right, and the rest moving randomly. A variant is obtained by showing a zero coherence display of dots during the preparatory period, and suddenly increasing coherence to a fixed value.

Even if external stimuli have constant strengths, their representations in neural populations that decide between alternative hypotheses may gradually rise, due to accumulating activity in input layers, fluctuations in attention, or both [Mozer, 1988; Cohen et al., 1992; Usher et al., 1999; Gilzenrat et al., 2002]. Another possible source of time varying salience is the increasing noise levels that may accompany higher firing rates. A richer situation, in which the stimulus salience increases and decreases over time, is explored in [Huk et al., 2002]. A focus of the present paper is how stimuli with time-dependent salience can be optimally processed in simple neurally-based models of decision networks. We study the reduction of such networks to linearized, one-dimensional approximations (cf. [Usher & McClelland, 2001; Brown & Holmes, 2001; Bogacz et al., 2004]) for which optimality conditions can be fully characterized, and identify two distinct mechanisms, one involving intrinsic properties of decision networks and the other involving external modulation, that can implement optimal processing of time-varying stimuli.

Optimality principles have found wide application in psychology and neuroscience (e.g. [Bialek et al., 1991; Anderson, 1990; Fairhall et al., 2001]). In particular, Stone [1960] applied the optimal Sequential Probability Ratio Test (SPRT) to model behavioral data in a two-alternative forced choice task. This was followed by the extensive work of Laming [1968]. The SPRT computes time-dependent likelihood ratios between the probabilities of two competing hypotheses, a procedure equivalent to the signal processing strategy that maximizes signal-to-noise ratio in the difference between two incoming stimuli. For stimuli with constant signal-to-noise ratios, the SPRT is equivalent, in an appropriate continuum limit, to the constant-drift diffusion model, which has been shown by Ratcliff and others to fit a wide variety of behavioral data (see [Ratcliff, 1978; Ratcliff et al., 1999] and references therein) and also to describe the dynamics of neural firing rates in sensori-motor brain areas [Schall et al., 2002; Gold & Shadlen, 2002], cf. [Smith & Ratcliff, 2004]. Specifically, in [Gold & Shadlen, 2002], the notion of reward rate is introduced for the constant-drift diffusion model, and [Bogacz et al., 2004] shows that higher performing subjects do optimize this quantity in a specific behavioral task. However, although [Laming, 1968] does allow for accumulation of noise to have occurred before stimulus presentation (see Laming's Appendix A7), in all these studies the decision process is modeled only after presentation of a stimulus having constant signal-to-noise ratio; furthermore, the parameters describing processing of incoming information are not explicitly allowed to vary in time.

In this paper we show how models of mutually inhibiting neural populations can make nearly optimal decisions about the identity of time-varying stimuli. This is accomplished via dynamical adjustments in an effective gain parameter for the linearized population dynamics. The gain determines the sensitivity of (equilibrium) population firing rates to changes in averaged input currents to the population, and the word "effective" is used here because these changes can result either from transient variations in the gain parameter describing this sensitivity or directly from the nonlinearities of neural input-output functions. There is much current research into neural mechanisms for the modulation of gain in neural populations,
Simple Neural Networks that Optimize Decisions 109

identifying such factors as levels of norepinephrine [Usher et al., 1999] and the strength of fluctuations in individual neurons comprising the population (e.g. [Chance et al., 2002; Amit & Tsodyks, 1991; Brunel et al., 2001]). In particular, Shin et al. [1999] proposes a mechanism in which frequency-current curves of individual neurons adapt to match operating ranges to neural inputs, via intracellular calcium signals. This may be viewed as a biophysical implementation of the earlier "automatic gain control" (see Eq. (9) of [Grossberg, 1988] and references therein), which is implemented via multiplicative "shunting" terms in neural network models and also keeps neural units in the sensitive regimes of their input-output functions. Gain plays a different role in the present paper: we identify, for three different models, the distinct time-dependent (effective) gain schedules which implement optimal processing strategies for time-dependent signals. These provide predictions for gain manipulations that diverse neural mechanisms may implement to improve task performance.

The balance of the paper proceeds as follows. In Sec. 2 we introduce the forced and free response decision tasks, and three types of stochastic differential equation (SDE) models for these tasks. We show that two of these are related via a coordinate transformation, and discuss linearized and one-dimensional reductions of them, exploring the accuracy of these reductions in two rather general cases. In the following Sec. 3, we compute time-dependent values of gain that optimize signal processing in the one-dimensional models. This involves calculating gain functions that enable them to implement the classical signal processing notion of matched filters. Section 4 interprets these results in terms of cortical norepinephrine (NE) release mediated by the brainstem nucleus locus coeruleus (LC), showing that LC and NE dynamics indeed appear to approximate optimal time courses. Section 5 concludes the paper with a brief discussion.

Although we only consider simple models of a prototypical cognitive task, we believe that this paper is appropriate for a volume celebrating the centenary of John von Neumann's birth. Early in 1956 von Neumann was working on a manuscript in preparation for the Silliman memorial lectures at Yale, which he had been invited to deliver that Spring. Unfortunately, his final illness intervened and he entered the Walter Reed Hospital in April, where he remained until his death in February 1957. The lectures were never given, but his remarkable book, The Computer and the Brain [von Neumann, 1958], remains among his final work. In it, he makes elegant and simple estimates of human neural computational capacity based on notions drawn from the theory of analog and digital automata (which he had largely developed), and from information theory. Although neuronal spikes appear as 1's (and their absence as 0's), he argues that neural computation is necessarily inaccurate and noisy, and hence must be "statistical" rather than "digital." He points out that firing rates in sensory neurons tend to be monotone functions of stimulus strength and, as an early proponent of rate coding, he can be seen as pioneering the class of firing rate models treated here.

2. Models of Decision Tasks

2.1. Decision tasks: The forced and free response protocols

We consider two distinct tasks, both widely used in cognitive neuroscience, in each of which a decision maker must discriminate between two alternatives, henceforth denoted "1" and "2". The sensory information itself, as well as its neural representation, is assumed to be noisy, so that discrimination errors occur. The first task is the forced-response paradigm, in which subjects must respond at a fixed time T following stimulus onset with their best estimate of which alternative (1 or 2) was presented. Performance on this task is measured by the error rate, or one minus the fraction of correct responses. We will also refer to this as the interrogation protocol, noting that it is distinct from deadlining (not considered further here), in which subjects are apprised in advance of a fixed, maximal time before which all responses must be made.

In the second, free-response paradigm, decisions are not demanded at a preset time, but are given when the subject feels that sufficient evidence in favor of one alternative has accumulated. Since the sensory evidence is noisy, response times vary from trial to trial and performance under the free-response condition is characterized by both reaction times and error rates. Here, optimality requires an appropriate balance of speed and accuracy [Wickelgren, 1977; Gold & Shadlen, 2002; Bogacz et al., 2004].

Following [Usher & McClelland, 2001] and others, we shall model both these tasks by a pair of competing (mutually inhibitory) neural
110 E. Brown et al.

populations, each of which is selectively responsive to sensory input corresponding to one of the two alternatives. In the forced-response protocol, the neural population with the highest firing rate at time T determines the decision. For free responses, the first of the two populations to cross a firing rate threshold establishes the choice. We do not address the (interesting) question of how thresholds are set or threshold crossings are detected.

2.2. Two-dimensional nonlinear models and the neural gain parameter

In this section we consider the dynamics of two mutually inhibiting neural populations, each of which receives noisy sensory input from components of the stimulus representing one of the alternatives. We describe two models for such populations, both in wide use, and both in the form of systems of stochastic ordinary differential equations (SDEs) [Arnold, 1974].

The first of these, the leaky integrator connectionist model [McClelland, 1979; Usher & McClelland, 2001], is:

$$\tau_c \frac{dx_1}{dt} = -x_1 - \beta f_{g(t)}(x_2) + a_1(t) + \frac{c(t)}{\sqrt{2}}\,\eta_t^1, \tag{1}$$

$$\tau_c \frac{dx_2}{dt} = -x_2 - \beta f_{g(t)}(x_1) + a_2(t) + \frac{c(t)}{\sqrt{2}}\,\eta_t^2, \tag{2}$$

where the state variables $x_j(t)$ denote the mean input currents to cell bodies of the jth neural population, the integration implicit in the differential equations modeling temporal summation of dendritic synaptic inputs ([Grossberg, 1988] and references therein). Additionally, the parameter $\beta$ sets the strength of mutual inhibition via population firing rates $f_{g(t)}(x_j(t))$, where $f_{g(t)}(\cdot)$ is the sigmoidal "activation" (or "frequency-current" or neural "input-output") function to be described shortly. The stimulus signal received by each population is $a_j(t)$, and the noise terms polluting this signal are $c(t)\eta_t^j$, where $c(t)$ sets r.m.s. noise strength and the $\eta_t^j$ are (independent) white noise processes with variance $E(\eta_t^j \eta_{t'}^j) = \delta(t - t')$. The time constant $\tau_c$ reflects the rate at which neural activities decay in the absence of inputs and respond to input changes. Under the free-response paradigm a decision is made and the response initiated when the firing rate $f_{g(t)}(x_j)$ of either population first exceeds a preset threshold $\theta_j$; it is normally assumed that $\theta_1 = \theta_2 = \theta$. For the interrogation protocol, the population with greatest activity (and also firing rate) at time T determines the decision. We also assume that activities decay to zero after response and prior to the next trial, so that the initial conditions for (1)-(2) are $x_j(0) = 0$.

The subscript in $f_{g(t)}(\cdot)$ indicates dependence on the time-varying gain, or sensitivity, $g(t)$ of the neural populations: gain sets the slope of the activation function. For example, the logistic function

$$f_{g(t)}(x) = \frac{1}{1 + \exp(-4g(t)(x - b))} = \frac{1}{2}\left[1 + \tanh(2g(t)(x - b))\right] \tag{3}$$

has maximal slope $g(t)$ (see Fig. 1, left). While this specific form is not required for the results derived below, we do assume that $f_g$ takes its time-dependent maximal slope $g(t)$ at some time-independent point, as for (3).

As already mentioned, the connectionist model describes the time evolution of current inputs. A second model is derived in [Wilson & Cowan, 1972], cf. [Hopfield, 1984; Abbott, 1991; Gerstner & Kistler, 2002], in which the firing rates of neural populations are themselves integrated over time. First we give the linearized version of this firing rate model:

$$\tau_c \frac{dy_1}{dt} = -y_1 + f^{l}_{g(t)}\!\left(-\beta y_2 + a_1(t) + \frac{c(t)}{\sqrt{2}}\,\eta_t^1\right), \tag{4}$$

$$\tau_c \frac{dy_2}{dt} = -y_2 + f^{l}_{g(t)}\!\left(-\beta y_1 + a_2(t) + \frac{c(t)}{\sqrt{2}}\,\eta_t^2\right). \tag{5}$$

Here, the $y_j$ are the firing rates of population j and other terms are as above. The linear function

$$f^{l}_{g(t)}(x) = \frac{1}{2} + g(t)(x - b), \tag{6}$$

derives from replacing the logistic (or any similar monotonic) function by the linear approximation $f^{l}_{g(t)}(\cdot)$ around its point of maximal slope. Note that

the firing rate $y_j$ of the jth population approaches an equilibrium set by the input currents to this population, passed through the (linearized) frequency-current function. This model must be reformulated to allow for nonlinear functions $f_{g(t)}$, because white noise does not make sense as an argument in such a function, cf. [Gardiner, 1985]. In particular, we assume that, as in (4)-(5), the strength of firing rate fluctuations in response to noise in inputs scales with $g(t)$ (i.e. with the maximal sensitivity of firing rates to the deterministic component of the input). This yields

$$\tau_c \frac{dy_1}{dt} = -y_1 + f_{g(t)}(-\beta y_2 + a_1(t)) + g(t)\frac{c(t)}{\sqrt{2}}\,\eta_t^1, \tag{7}$$

$$\tau_c \frac{dy_2}{dt} = -y_2 + f_{g(t)}(-\beta y_1 + a_2(t)) + g(t)\frac{c(t)}{\sqrt{2}}\,\eta_t^2, \tag{8}$$

which is valid for all $f(\cdot)$ and reduces to the form (4)-(5) for linear $f(\cdot)$. Note that the firing rate model (7)-(8) is a standard two-unit recurrent neural network with additive noise [Hertz et al., 1991]. As above, we take initial conditions $y_j(0) = 0$, and note that threshold-crossing in the free-response case is detected directly via $y_j = \theta_j$.

For the questions of optimal stimulus processing addressed here, the most important distinction between the connectionist (1)-(2) and firing rate (4)-(5), (7)-(8) models is whether the inputs $a_j(t) + (c(t)/\sqrt{2})\,\eta_t^j$ enter as separate additive terms, as in the former, or as arguments to the activation function $f_{g(t)}$, as in the latter. As explained at the end of Sec. 3, this determines whether changes in gain directly adjust the sensitivity of neural units to all inputs or just to feedback from the competing unit, and it results in qualitatively different predictions for optimal gain schedules in the two models. While we expect that future work on low-dimensional descriptions of the population dynamics of spiking neurons (extending, e.g. [Brunel et al., 2001; Wang, 2002; Omurtag et al., 2000; Shelley & McLaughlin, 2002; Ermentrout, 1994] to include neurotransmitter effects) will result in more refined models, here we study the "simple" connectionist and firing rate descriptions. Throughout, we use variables $x_j$ in referring to the former and $y_j$ to the latter.

2.3. Equivalence of the firing rate and connectionist models

We now show that the firing rate and connectionist models are equivalent under a (generally time-dependent) coordinate change and corresponding adjustment of parameters, initial conditions, and thresholds. Specifically, for any activation function that is odd around some input value, such as (3), (7)-(8) can be written in the form (1)-(2). Hence, for every parameterization of the firing rate model, there is a connectionist model that produces identical trajectories as well as error rate and reaction time statistics, and vice-versa. This shows that the two models are effectively equivalent, up to parameterization. However, in Sec. 3 below we demonstrate that, because of the different ways that gain $g(t)$ enters them, their optimal gain trajectories differ significantly.

Starting with Eqs. (7)-(8), we extend the S-Σ exchange transformation of Grossberg [1988] to define the new coordinates

$$\tilde{y}_1 = 2b + \beta y_1 - a_2, \qquad \tilde{y}_2 = 2b + \beta y_2 - a_1, \tag{9}$$

so that $-\beta y_1 + a_2 = -\tilde{y}_1 + 2b$ and $-\beta y_2 + a_1 = -\tilde{y}_2 + 2b$. In terms of these (7)-(8) become

$$\tau_c \frac{d\tilde{y}_1}{dt} = \beta\left[-\frac{1}{\beta}(a_2 + \tilde{y}_1 - 2b) + f_{g(t)}(-\tilde{y}_2 + 2b) + g(t)\frac{c(t)}{\sqrt{2}}\,\eta_t^1\right] - \tau_c\frac{da_2}{dt}, \tag{10}$$

$$\tau_c \frac{d\tilde{y}_2}{dt} = \beta\left[-\frac{1}{\beta}(a_1 + \tilde{y}_2 - 2b) + f_{g(t)}(-\tilde{y}_1 + 2b) + g(t)\frac{c(t)}{\sqrt{2}}\,\eta_t^2\right] - \tau_c\frac{da_1}{dt}, \tag{11}$$

and using the following property of the logistic activation function (3):

$$\begin{aligned}
f_{g(t)}(-\xi + 2b) &= \tfrac{1}{2}\left[1 + \tanh(2g(t)[-\xi + 2b - b])\right] \\
&= \tfrac{1}{2}\left[1 + \tanh(-2g(t)[\xi - b])\right] \\
&= \tfrac{1}{2}\left[1 - \tanh(2g(t)[\xi - b])\right] \\
&= 1 - \tfrac{1}{2}\left[1 + \tanh(2g(t)[\xi - b])\right] \\
&= 1 - f_{g(t)}(\xi),
\end{aligned} \tag{12}$$

(10)-(11) become

$$\tau_c \frac{d\tilde{y}_1}{dt} = -\tilde{y}_1 - \beta f_{g(t)}(\tilde{y}_2) - a_2 - \tau_c\dot{a}_2 + 2b + \beta + \beta g(t)\frac{c(t)}{\sqrt{2}}\,\eta_t^1,$$

$$\tau_c \frac{d\tilde{y}_2}{dt} = -\tilde{y}_2 - \beta f_{g(t)}(\tilde{y}_1) - a_1 - \tau_c\dot{a}_1 + 2b + \beta + \beta g(t)\frac{c(t)}{\sqrt{2}}\,\eta_t^2.$$

This SDE has the same form as (1)-(2) with parameters mapped as follows:

$$a_1 \mapsto 2b + \beta - a_2 - \tau_c\dot{a}_2, \qquad a_2 \mapsto 2b + \beta - a_1 - \tau_c\dot{a}_1. \tag{13}$$

The firing rate model (7)-(8) therefore produces identical statistics to the connectionist model (1)-(2) with appropriately remapped parameters and state variables. Note that thresholds and initial conditions for the firing rate variables $y_1, y_2$ must be transformed under (9) to apply to the equivalent connectionist model, that $a_1$ and $a_2$ are interchanged in the inputs, and that the noise terms are multiplied by gain $g(t)$.

2.4. Piecewise-linear approximations

As in [Usher & McClelland, 2001; Brown & Holmes, 2001], Eq. (3) may be approximated by a piecewise-linear function:

$$f^{pl}_{g(t)}(\xi) = \begin{cases}
0 & \text{for } \xi \in \left(-\infty,\ b - \frac{1}{2g(t)}\right], \\[4pt]
\frac{1}{2} + g(t)(\xi - b) & \text{for } \xi \in \left[b - \frac{1}{2g(t)},\ b + \frac{1}{2g(t)}\right], \\[4pt]
1 & \text{for } \xi \in \left[b + \frac{1}{2g(t)},\ \infty\right),
\end{cases} \tag{14}$$

as illustrated in Fig. 1. Note that our choice to set the slope of $f^{pl}_{g(t)}$ in its central domain equal to the maximal slope $g(t)$ of the nonlinear function $f_{g(t)}$ does not minimize the distance between the two functions in the $L^\infty$ or $L^2$ norms. The best $L^\infty$ match is obtained by setting the maximal slope of $f^{pl}_{g(t)}$ equal to $0.71g(t)$, and in $L^2$ by a $g(t)$-dependent value ranging between $0.72g(t)$ and $0.76g(t)$ (for $g(t)$ between 0.25 and 3). However, all

Fig. 1. (Left) Comparison of logistic and piecewise-linear activation functions; $g = 1$, $b = 0.5$. (Right) Comparison of logistic and piecewise-linear vectorfields $F(y_1, y_2)$ and $F_{pw}(y_1, y_2)$ for the piecewise-linear firing rate model (15)-(16): the difference $F(y_1, y_2) - F_{pw}(y_1, y_2)$ is plotted. Also shown for reference are the nine phase space tiles described in Fig. 2. Here additionally $\tau_c = 1$, $\beta = 1$, $a_1 = 1.03$, $a_2 = 0.97$.
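The comparison of the logistic function (3) with its piecewise-linear approximation (14) can be explored numerically. The following is a minimal sketch, not the authors' procedure: it checks that the logistic function has slope $g$ at $x = b$ and grid-searches for the central slope that minimizes the sup-norm distance between the two functions. The function names, the sampling window, and the grid resolutions are our own assumptions.

```python
import math


def f_logistic(x, g, b):
    """Logistic activation, Eq. (3): maximal slope g, attained at x = b."""
    return 0.5 * (1.0 + math.tanh(2.0 * g * (x - b)))


def f_pl(x, slope, b):
    """Piecewise-linear activation as in Eq. (14), with central slope `slope`."""
    return min(1.0, max(0.0, 0.5 + slope * (x - b)))


def sup_distance(slope, g, b, n=1500):
    """Approximate sup-norm distance between f_pl and f_logistic.

    Both functions are odd about the point (b, 1/2), so only x >= b is
    sampled. The kink of f_pl at x = b + 1/(2*slope) is added explicitly,
    because the largest gap can occur exactly there.
    """
    half_width = 3.0 / g  # assumed window; the gap is negligible beyond it
    xs = [b + half_width * i / n for i in range(n + 1)]
    xs.append(b + 1.0 / (2.0 * slope))
    return max(abs(f_pl(x, slope, b) - f_logistic(x, g, b)) for x in xs)


def best_slope_ratio(g, b):
    """Grid search over slope/g in [0.5, 1.0] for the smallest sup distance."""
    ratios = [0.5 + 0.002 * i for i in range(251)]
    return min(ratios, key=lambda r: sup_distance(r * g, g, b))


g, b = 1.0, 0.5  # values used for the left panel of Fig. 1
h = 1e-6
max_slope = (f_logistic(b + h, g, b) - f_logistic(b - h, g, b)) / (2.0 * h)
ratio = best_slope_ratio(g, b)
print(f"numerical slope of the logistic at x = b: {max_slope:.4f}")
print(f"sup-norm-optimal central slope / g: {ratio:.3f}")
```

Under these assumptions the search should return a ratio close to the value 0.71 quoted in the text; the exact digits depend on the grid spacing.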

these choices result in similar error rate and reaction time statistics, and we use (14) in what follows.

For ease of reference, we rewrite Eqs. (7)-(8) following piecewise linearization:

$$\tau_c \frac{dy_1}{dt} = -y_1 + f^{pl}_{g(t)}(-\beta y_2 + a_1(t)) + g(t)\frac{c(t)}{\sqrt{2}}\,\eta_t^1, \tag{15}$$

$$\tau_c \frac{dy_2}{dt} = -y_2 + f^{pl}_{g(t)}(-\beta y_1 + a_2(t)) + g(t)\frac{c(t)}{\sqrt{2}}\,\eta_t^2. \tag{16}$$

The difference between the vectorfield of the fully nonlinear model (7)-(8) and that of (15)-(16) is illustrated in Fig. 1 (right) for a specific parameter choice. In Sec. 2.6 below, we shall explicitly compare reaction times and error rates predicted by these two models.

The $(y_1, y_2)$ phase space of the piecewise-linear firing rate model (and of the analogous connectionist model) is tiled by nine regions divided by pairs of horizontal and vertical lines at the break points of $f^{pl}_{g(t)}$, each having a distinct linear vectorfield: see Fig. 2. In the following section, we will describe two cases in which this tiled structure can be used to reduce Eqs. (7)-(8) to a one-dimensional system.

2.5. Representing decision dynamics in one dimension

As discussed above and in [Usher & McClelland, 2001], in the forced response protocol, the choice j = 1 or 2 is made according to which of the two neural populations has the greatest activity or firing rate at interrogation time T. Therefore, knowledge of the difference

$$y(T) = y_1(T) - y_2(T) \quad \text{or} \quad x(T) = x_1(T) - x_2(T) \tag{17}$$

determines the outcome and reduction of the original two-dimensional problem to a single variable does not inherently imply any loss in accuracy. For example, if the difference in firing rates is described by a time-dependent probability density $p(y, t)$ (whose distribution represents variability across behavioral trials), then the error rate at

Fig. 2. The piecewise-linear vectorfield of the firing rate model (15)-(16). The $(y_1, y_2)$ plane is divided into nine tiles by the vertical lines $\beta y_1 = a_2(t) - b \pm 1/[2g(t)]$ and the horizontal lines $\beta y_2 = a_1(t) - b \pm 1/[2g(t)]$; within each tile, each of $\tau_c\,dy_1/dt$ and $\tau_c\,dy_2/dt$ takes one of the three linear forms $-y_j$, $-y_j + 1$, or $-y_j + g(t)[-\beta y_k + a_j(t)] + 1/2 - g(t)b$ (with $k \neq j$). The central tile is surrounded by a solid box.
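The tiled structure of Fig. 2 can be made concrete with a short sketch that evaluates the deterministic part of (15)-(16) and reports which tile a point lies in. This is an illustration under stated assumptions: the helper names are hypothetical, and the tile numbering (1 to 9, left to right, bottom row first) is our inference from the text's statements that the central tile is tile 5 and that both activation functions vanish in tile 9.

```python
def f_pl(xi, g, b):
    """Piecewise-linear activation, Eq. (14)."""
    return min(1.0, max(0.0, 0.5 + g * (xi - b)))


def drift(y1, y2, a1, a2, g, b, beta, tau_c=1.0):
    """Deterministic part of Eqs. (15)-(16), returned as (dy1/dt, dy2/dt)."""
    dy1 = (-y1 + f_pl(-beta * y2 + a1, g, b)) / tau_c
    dy2 = (-y2 + f_pl(-beta * y1 + a2, g, b)) / tau_c
    return dy1, dy2


def band(v, centre, g):
    """Return 0, 1 or 2 for v below, inside or above centre -/+ 1/(2g)."""
    if v < centre - 0.5 / g:
        return 0
    if v > centre + 0.5 / g:
        return 2
    return 1


def tile(y1, y2, a1, a2, g, b, beta):
    """Tile label 1..9 (assumed numbering: left to right, bottom row first)."""
    col = band(beta * y1, a2 - b, g)  # vertical lines depend on a2
    row = band(beta * y2, a1 - b, g)  # horizontal lines depend on a1
    return 3 * row + col + 1


# Case 2 style parameters: constant gain, b = 1/(2g), stimuli switch on at onset.
g, b, beta = 1.0, 0.5, 1.0
point = (0.1, 0.1)  # hypothetical sample point near the origin

tile_before = tile(*point, a1=0.0, a2=0.0, g=g, b=b, beta=beta)
tile_after = tile(*point, a1=1.03, a2=0.97, g=g, b=b, beta=beta)
drift_before = drift(*point, a1=0.0, a2=0.0, g=g, b=b, beta=beta)
drift_after = drift(*point, a1=1.03, a2=0.97, g=g, b=b, beta=beta)
print(tile_before, drift_before)
print(tile_after, drift_after)
```

With these values the same point moves from the tile in which both activation functions are zero to the central tile when the stimulus switches on, although the gain parameter itself never changes.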

interrogation time T is

$$ER = \int_0^{\infty} p(y, T)\,dy \tag{18}$$

if alternative 2 was presented (i.e. if $a_2 > a_1$ for $t > t_s$), and

$$ER = \int_{-\infty}^{0} p(y, T)\,dy \tag{19}$$

if alternative 1 was presented. Similar conclusions hold for the connectionist model.

For the free choice protocol the situation is more subtle. The single variable x or y is sufficient to characterize the decision only if the probability density of solutions to (7)-(8) or (1)-(2) has approximately collapsed along a one-dimensional "decision manifold" $\mathcal{M}$ by the time the threshold is crossed; see Fig. 3. In this sketch, the decision manifold, parameterized by y, is the unstable, center or weak stable manifold [Guckenheimer & Holmes, 1983] of the indicated fixed point, which, for the linearized system, coincides with its eigenspace.

Fig. 3. Reduction to one dimension. The coordinate y (or x) of Eq. (17) parameterizes the decision manifold $\mathcal{M}$ (see text): the invariant manifold containing the fixed point indicated by the square. In the free response protocol, collapse of noisy solutions along $\mathcal{M}$ is required for accurate description in one dimension (cf. Figs. 4 and 5 (right)) so that sample paths (dotted line and point) cross thresholds arbitrarily close to the intersections of $\mathcal{M}$ with the thresholds $y_j = \theta$. This is not required for the forced response/interrogation protocol, in which the probability density $p(y, t)$ is simply cut along $y_1 = y_2$ at $t = T$.

The existence of center manifolds $\mathcal{M}$ for SDEs with additive noise, such as those considered here, has been proven rather generally: see [Boxler, 1991] and [Arnold, 1998, Chap. 7]; also [Knobloch & Wiesenfeld, 1983] for an early analysis and explicit examples. However, here we consider only the fully linear and piecewise linear systems, for which the "diagonal" coordinates $y = y_1 - y_2$, $\bar{y} = y_1 + y_2$ and assumption of independent white noise processes decouple the components of (7)-(8) (and analogously of (1)-(2)) [Bogacz et al., 2004], and so we do not need the full power of these results.

For collapse to $\mathcal{M}$ to occur, the eigenvalue characterizing dynamics normal to the manifold must be sufficiently negative compared with the other eigenvalue and the noise strength c, so that the joint probability density $p(y_1, y_2, t)$ rapidly concentrates near $\mathcal{M}$ and a substantial majority of sample paths crosses the thresholds $x_j = \theta$ (or $y_j = \theta$) near their intersections with $\mathcal{M}$ [Usher & McClelland, 2001; Brown & Holmes, 2001; Bogacz et al., 2004]. These requirements are met by two distinct parameter sets to be introduced below, and in Sec. 2.6 we compare the resulting reaction times and error rates determined from one-dimensional reductions with those of the original two-dimensional models.

2.5.1. Dimension reduction and transient gain in two simple cases

In two cases, a simple equation for the evolution of $x(t)$ or $y(t)$ may be derived. These cases are characterized by a dominant proportion of solutions to (15)-(16) (i.e. for "most" realizations of the noise processes $\eta_t^j$) (i) being confined to a single tile for the duration of the decision process or (ii) "jumping" together between tiles. The first of these situations occurs for Case 1 parameter sets, in which, for example, the onset of salience (i.e. $a_1 \neq a_2$) in input currents is accompanied by large transients in the magnitude of these inputs. The second, Case 2, occurs for stimuli in which salience appears without such transients in magnitude. We now consider these cases in detail for the firing rate model.

Case 1. Trajectories confined to the central tile, gain parameter directly modulated

The central tile of the firing rate phase plane, where both functions $f^{pl}_{g(t)}(\cdot)$ appearing in Eqs. (15)-(16) are linearly increasing, is defined by $\beta y_1 \in [a_2(t)$

$-\,b - 1/(2g(t)),\ a_2(t) - b + 1/(2g(t))]$ and $\beta y_2 \in [a_1(t) - b - 1/(2g(t)),\ a_1(t) - b + 1/(2g(t))]$. If

$$|a_1(t) - b| < \frac{1}{2g(t)}, \qquad |a_2(t) - b| < \frac{1}{2g(t)}, \tag{20}$$

then the central tile always contains the origin and some part of the first quadrant (note that this quadrant is invariant under the deterministic part of Eqs. (7)-(8) if f is non-negative) so that decision dynamics starting at the origin may (for suitable choices of other parameters) take place entirely within the central tile. For example, if $b = 0.5$ and $0 < g(t) < 1$, then $a_1(t)$, $a_2(t)$ may take values between 0 and 1 while still satisfying (20).

Figure 4 shows a sample of solutions of the piecewise-linearized firing rate model for the piecewise constant parameters $g(t) = \{0.3,\ t < t_s;\ 1,\ t > t_s\}$, $a_1(t) = \{1,\ t < t_s;\ 1.03,\ t > t_s\}$, $a_2(t) = \{1,\ t < t_s;\ 0.97,\ t > t_s\}$, $c(t) = 0.09\sqrt{2}$, $b = 0.5$, $\tau_c = 1$, $\theta = 0.725$, $t_s = 10$ and $\beta = 1$. Note that stimuli $a_j(t) \neq 0$ are present throughout, but that coherence ($a_1(t) \neq a_2(t)$) appears in the inputs $a_j$ only at $t = t_s$, so that times $t < t_s$ make up the preparatory phase mentioned in the introduction and the situation corresponds to the introduction of coherence into an entirely random pattern. Assuming that decision thresholds are set within the boundaries of the central tile or that the interrogation time T is sufficiently small so that only a negligible proportion of solutions have left this tile, solutions are effectively confined to the central tile for all times of interest. This behavior characterizes Case 1 parameter sets, for which subtraction of Eqs. (15)-(16) yields the one-dimensional SDE

$$\tau_c \frac{dy}{dt} = -y + g(t)(\beta y + a(t)) + g(t)c(t)\,\eta_t \quad \text{(firing rate model)}, \tag{21}$$

where we define the net rate of incoming evidence as

$$a(t) = a_1(t) - a_2(t). \tag{22}$$

Fig. 4. Case 1: solutions confined to central tile. Scatter plot of trajectories both at the end of the preparatory period and hence at the moment of stimulus onset $t_s$ (left) and during the stimulus ($t = t_s + 2$, right). The tiling of the plane is shown with dot-dashed lines; cf. Fig. 2; the central tile is outlined in solid and extends outside the plotted domain in the left panel. Parameter values are given in text. Also shown are nullclines for Eqs. (15)-(16) as thin solid lines. The lower panels show stimuli $a_j(t)$ and gain $g(t)$ as functions of time.
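The one-dimensional reduction (21) is straightforward to simulate. The sketch below is an illustration rather than the authors' code: it integrates (21) by the Euler-Maruyama method for the stimulus phase of the Case 1 example (constant $g = 1$, $\beta = 1$, so that gain balances decay) and estimates the free-response error rate. The step size, trial count, seed and function names are our own choices, and the one-dimensional threshold 0.45 is an assumption obtained by intersecting the line $y_1 + y_2 = 1$ (the deterministic equilibrium of the sum for these parameters) with $y_1 = 0.725$. The comparison value uses the standard closed-form error rate of a constant drift diffusion process between symmetric barriers.

```python
import math
import random


def simulate_trial(a, c, g, beta, theta, tau_c, dt, rng, t_max=200.0):
    """One free-response trial of Eq. (21) with constant parameters, y(0) = 0.

    Returns (choice, reaction_time): choice is +1 if y reaches +theta first,
    -1 if it reaches -theta first, and 0 if neither occurs before t_max.
    """
    y, t = 0.0, 0.0
    noise_scale = g * c * math.sqrt(dt) / tau_c
    while t < t_max:
        drift = (-y + g * (beta * y + a)) / tau_c
        y += drift * dt + noise_scale * rng.gauss(0.0, 1.0)  # Euler-Maruyama step
        t += dt
        if y >= theta:
            return 1, t
        if y <= -theta:
            return -1, t
    return 0, t


def error_rate(n_trials, seed, **params):
    """Fraction of decided trials ending at the 'wrong' barrier (-theta for a > 0)."""
    rng = random.Random(seed)
    errors = decided = 0
    for _ in range(n_trials):
        choice, _rt = simulate_trial(rng=rng, **params)
        if choice != 0:
            decided += 1
            errors += choice == -1
    return errors / decided


params = dict(
    a=1.03 - 0.97,            # net evidence a = a1 - a2, Eq. (22)
    c=0.09 * math.sqrt(2.0),  # noise strength of the Case 1 example
    g=1.0, beta=1.0, tau_c=1.0,
    theta=0.45,               # assumed 1-D threshold (see lead-in)
    dt=0.02,                  # assumed step size
)
er_sim = error_rate(2000, seed=0, **params)

# Closed-form error rate of pure drift-diffusion; applies here because g*beta = 1.
er_theory = 1.0 / (1.0 + math.exp(2.0 * params["a"] * params["theta"] / params["c"] ** 2))
print(f"simulated error rate: {er_sim:.3f}")
print(f"closed-form error rate: {er_theory:.3f}")
```

A finite step size lets paths overshoot the barriers slightly, so the simulated estimate is expected to differ a little from the closed-form value even apart from sampling noise.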

We note that transient gain values in this case result from modifications to the firing rate function itself, as solutions explore only the central region of this function in which it is practically linear. This is the "external" mechanism of dynamic gain change discussed in the Introduction.

For future reference, we also note that an analytical expression for the density of reaction times may be derived if the parameters in (21) are constant (i.e. $a(t) = a$, $c(t) = c$) and the gain "balances" the decay: e.g. $g(t) = g = 1$ in (21) (see, e.g. [Ratcliff et al., 1999]). In this case, (21) simplifies to a constant drift diffusion process and the probability that a trajectory first escapes the interval $[-\tilde{\theta}, \tilde{\theta}]$ at a time $RT = \inf\{t : |y(t)| > \tilde{\theta}\}$ from initial condition $y(0) = 0$ has density

$$p(RT) = \frac{\pi c^2}{4\tilde{\theta}^2}\, e^{-\frac{a^2 RT}{2c^2}} \left(e^{\frac{\tilde{\theta} a}{c^2}} + e^{-\frac{\tilde{\theta} a}{c^2}}\right) \sum_{k=1}^{\infty} k \sin\!\left(\frac{k\pi}{2}\right) \exp\!\left(-\frac{k^2\pi^2 c^2 RT}{8\tilde{\theta}^2}\right). \tag{23}$$

Here $\pm\tilde{\theta}$ correspond to the intersections of the decision manifold $\mathcal{M}$ with the thresholds $y_j = \theta$ of the two-dimensional process (Fig. 3). Equation (23) may be extended to account for distributed initial conditions $y(0) \neq 0$ and other generalizations [Ratcliff et al., 1999], but we do not use such extensions here.

Similar considerations yield the reduction of the connectionist model restricted to its respective central tile:

$$\tau_c \frac{dx}{dt} = -x + \beta g(t)x + a(t) + c(t)\,\eta_t \quad \text{(connectionist model)}. \tag{24}$$

Note that gain multiplies the last three terms in (21), but only the second in (24).

Case 2. Trajectories switch tiles, changing effective gain

We now consider the case of stimuli $a_j(t)$ that "suddenly" turn on from zero at time $t_s$ while the gain parameter $g(t) = g$ remains constant, and show how stimulus onset itself can give rise to a time-dependent one-dimensional reduction that resembles the reduction to (21) obtained above. This corresponds to appearance of a partially coherent stimulus replacing a fixation spot. Since $a_1(t) = a_2(t) = 0$ for $t < t_s$, in this period there is a stable fixed point at (0, 0) if $b > 1/(2g)$. If $b = 1/(2g)$, the situation simplifies: while $t < t_s$, (0, 0) lies exactly at the corner of tile 9 (see Fig. 2), to which tile solutions are confined (modulo noise effects). At stimulus onset $t_s$, tile boundaries shift, so that, for appropriate choices of $a_1(t), a_2(t) > 1/(2g(t)) - b$ for $t > t_s$, the origin and the cluster of solutions in its neighborhood at time $t = t_s^+$ suddenly finds itself in the central tile 5. For concreteness, we fix parameters meeting the requirements $b = 1/(2g)$ and $a_1(t) = a_2(t) = 0$ for $t < t_s$ as follows: $a_1(t) = \{0,\ t < t_s;\ 1.03,\ t > t_s\}$, $a_2(t) = \{0,\ t < t_s;\ 0.97,\ t > t_s\}$, $g = 1$ and all other parameters as for the example in Case 1. See Fig. 5.

To determine the appropriate linear (two- and one-dimensional) reductions for these parameters, we use Eqs. (15)-(16) restricted to tile 9 for the preparatory phase $t < t_s$, and restricted to tile 5 for times $t > t_s$ during stimulus presentation (we make the same assumptions about the interrogation time or thresholds as for Case 1, so that solutions remain in the central tile 5 for all times $t > t_s$ of relevance to the decision). This yields the one-dimensional equation

$$\tau_c \frac{dy}{dt} = -y + \begin{cases} g\,c(t)\,\eta_t & \text{for } t < t_s, \\ g[\beta y + a(t)] + g\,c(t)\,\eta_t & \text{for } t > t_s, \end{cases} \tag{25}$$

(and an analogous reduction to a linear two-dimensional model).

Equation (25) is similar to the reduction (21), if the stimulus and gain functions in the latter are piecewise constant, as for the example parameters of Case 1. The major difference is that the noise coefficient remains constant for (25). As we see in the next section, the statistics produced by the one-dimensional models (21) and (25) can nevertheless agree rather well. Thus, transient gain strategies to be derived for the more general (21) in Sec. 3 can be approximately implemented for stimuli undergoing large steps, with no changes in the gain of the activation functions per se.

Similar considerations hold for Cases 1 and 2 reductions of the connectionist model, but we do not pursue this here.

2.6. Accuracy of the reduced models

Figure 6 demonstrates that our simplifications of the nonlinear firing rate model (7)-(8) accurately capture reaction time statistics for Case 1

Fig. 5. Case 2: trajectories switch tiles. Scatter plot of trajectories both at the end of the preparatory period and hence at the moment of stimulus onset $t_s$ (left) and during the stimulus ($t = t_s + 2$, right). The tiling of the plane is shown with dot-dashed lines; cf. Fig. 2; the central tile is outlined in solid. Parameter values are given in text. Also shown are nullclines for Eqs. (15)-(16) as thin solid lines. The lower panels show stimuli $a_j(t)$ and gain $g(t)$ as functions of time.

parameters. For the one- and two-dimensional linear reductions, linearized activation functions take piecewise constant (in time) values appropriate to the tiles containing the dominant proportion of solution trajectories during the preparatory and trial periods, exactly as in (21). That is: for Case 1, $f^{l}_{g(t)}(x) = 1/2 + (x - b)$ for all t, as solutions remain in the central tile 5. For Case 2, $f^{l}_{g(t)}(x) = 0$ for all $t < t_s$ (when solutions are in tile 9) and $f^{l}_{g(t)}(x) = 1/2 + (x - b)$ for $t > t_s$, when solutions are in tile 5.

For Case 1, the error rates corresponding to the reaction time distributions of Fig. 6 are 0.050, 0.051, 0.051, 0.035, and 0.034 respectively for the two-dimensional firing rate model with logistic activation functions $f_{g(t)}$, the two-dimensional model with piecewise-linear activation functions $f^{pl}_{g(t)}$ (15)-(16), the two-dimensional model with linear activation functions, the one-dimensional reduction (21), and the expression (23), which describes the one-dimensional reduction with initial condition $y(t_s) = 0$ at the time of stimulus presentation (keeping the first 10 terms of sum). For Case 2, these error rates are 0.060, 0.065, 0.059, 0.042, 0.034. Thus, in both cases while the different two-dimensional models are in close agreement, the one-dimensional reductions produce significantly lower error rates. Figures 4 and 5 show why: the distribution of solutions is not entirely collapsed along the attracting decision manifold, and the spatially extended "incorrect" thresholds of the two-dimensional models require smaller (and hence more probable) excursions to cross. Closer agreement between one- and two-dimensional models can be achieved with, for example, higher values of $\beta$ or lower values of noise strength c: see [Bogacz et al., 2004].

As an additional comparison among the various models, we separately computed error rates for interrogation at a time $T = t_s + 1$ (see [Usher & McClelland, 2001] for an earlier, related comparison between the nonlinear two-dimensional and linear one-dimensional models). For Case 1,

Fig. 6. Reaction time densities for the nonlinear firing rate model of Eqs. (7)-(8) (stars) and its various reductions, with thresholds $\theta = 0.725$: dot-dashed line, two-dimensional model with piecewise-linear activation functions $f^{pl}_{g(t)}$; dotted line, two-dimensional model with linear activation functions $f^{l}_{g(t)}$; solid line, linear one-dimensional reduction; solid line with circles, analytic 1-D expression with zero variance at trial onset (see text). (a) Case 1: solutions confined to central tile. (b) Case 2: trajectories switch tiles. The horizontal axis in both panels is reaction time RT. Parameter values are given in main text.
Simple Neural Networks that Optimize Decisions 119

the interrogation error rates are (in the same order as above) 0.323, 0.321, 0.321, 0.324, and 0.319. For Case 2, these error rates are 0.374, 0.363, 0.354, 0.350, and 0.319. For both cases interrogation error rates are more similar for the various model reductions than the free response error rates reported in the previous paragraph. This is expected from the discussion in Sec. 2.5, since accurate description of the interrogation protocol by a one-dimensional model does not require that solutions are confined near the decision manifold.

2.7. Drift-diffusion and the one-dimensional models as linear filters

We introduce a third one-dimensional SDE, an extension of the drift-diffusion model of [Laming, 1968; Ratcliff, 1978] in which both drift and diffusion terms are multiplied by a common gain factor $g(t)$:

$$\tau_c \frac{dz}{dt} = g(t)[a(t) + c(t)\eta_t] \quad \text{((pure) drift-diffusion model)}. \qquad (26)$$

Equation (26) and the one-dimensional reductions of the firing rate and connectionist equations (21) and (24) are Ornstein-Uhlenbeck processes, (affine-) linear in the activities $x$, $y$ and $z$ and in the input

$$\underbrace{I(t)}_{\text{input}} = \underbrace{a(t)}_{\text{signal}} + \underbrace{c(t)\eta_t}_{\text{noise}}. \qquad (27)$$

We may explicitly solve all these SDEs, for a given realization of the white noise process $\eta_s$, $s \in [0, t]$, to obtain respectively

$$z(t) = \int_0^t \frac{g(s)a(s)}{\tau_c}\,ds + \int_0^t \frac{g(s)c(s)}{\tau_c}\,dW_s \qquad (28)$$

for the drift diffusion model,

$$x(t) = \int_0^t \frac{a(s)}{\tau_c}\exp\left(\frac{1}{\tau_c}\int_s^t [\beta g(s') - 1]\,ds'\right)ds + \int_0^t \frac{c(s)}{\tau_c}\exp\left(\frac{1}{\tau_c}\int_s^t [\beta g(s') - 1]\,ds'\right)dW_s \qquad (29)$$

for the connectionist model, and

$$y(t) = \int_0^t \frac{g(s)a(s)}{\tau_c}\exp\left(\frac{1}{\tau_c}\int_s^t [\beta g(s') - 1]\,ds'\right)ds + \int_0^t \frac{g(s)c(s)}{\tau_c}\exp\left(\frac{1}{\tau_c}\int_s^t [\beta g(s') - 1]\,ds'\right)dW_s \qquad (30)$$

for the firing rate model. Here, $dW_s$ is an increment of a Wiener process, of which the white noise process $\eta_s$ is the formal time derivative, and we have assumed unbiased initial data $x(0) = y(0) = z(0) = 0$. These expressions all take the form

$$w(t) = \int_0^t K(t,s)a(s)\,ds + \int_0^t K(t,s)c(s)\,dW_s, \qquad (31)$$

and so we conclude that (28)-(30) all compute linear filters of their inputs.

At any fixed time $t$, $w(t)$ is a Gaussian-distributed random variable with mean $\int_0^t K(t,s)a(s)\,ds$ and variance $\int_0^t K^2(t,s)c^2(s)\,ds$. Using this fact, after a change of variables the error rate expression (19) becomes

$$\mathrm{ER} = \frac{1}{2}\left[1 - \mathrm{erf}\left(\frac{\int_0^t K(t,s)a(s)\,ds}{\sqrt{2\int_0^t K^2(t,s)c^2(s)\,ds}}\right)\right]. \qquad (32)$$

3. Optimal Signal Discrimination in the One-Dimensional Models

We now ask what functional form of $g(t)$ optimizes performance for Eqs. (28)-(30), thereby computing optimal gain trajectories for the (reduced) drift-diffusion, connectionist, and firing rate models.

3.1. Optimal statistical tests

Given only the noisy input function (27), consider the task of deciding whether $I(t)$ was generated by time-dependent signals $a_0(t)$ or $a_1(t)$: hypotheses 0 and 1, resp. This can be accomplished in two distinct ways, mirroring the interrogation and free response protocols of Sec. 2. In the first, the decision is made at a fixed time $T$; in the second, it is made when some preset level of confidence is reached. Optimal performance in the first version of the task implies that as few errors as possible are made; in the second, it implies that

the decision must be made as quickly as possible for a fixed error tolerance, timed from stimulus onset at time $t = 0$. The best strategy in the first version is the (continuum limit of the) Neyman-Pearson test; in the second version it is the sequential probability ratio test (SPRT) [Wald, 1947; Lehmann, 1959]. Both tests compute an evolving estimate of the log likelihood ratio:

$$l(t) = \log\left[\frac{P(\{I(s) \mid a_0(s),\, s \in [0,t]\})}{P(\{I(s) \mid a_1(s),\, s \in [0,t]\})}\right] \triangleq \log\left[\frac{P_0(\{I(s),\, s \in [0,t]\})}{P_1(\{I(s),\, s \in [0,t]\})}\right] \qquad (33)$$

(the base of the logarithm is arbitrary). In the Neyman-Pearson test, hypothesis 0 is chosen if $l(T) > 0$ and hypothesis 1 if $l(T) < 0$; in the SPRT, hypothesis 0 (resp. 1) is chosen when $l(t)$ first crosses threshold $\theta$ (resp. $-\theta$), $\theta$ being determined by the error tolerance.

Writing the input $I(t)$ (27) as a sum of its increments for an appropriate discretization of time $\{t_j\}$:

$$I(t) = \sum_j dI_j = \sum_j a(t_j)\,dt + c(t_j)\,dW_j, \qquad (34)$$

we obtain

$$l(t) = \sum_j \log\left[\frac{P_0(dI_j)}{P_1(dI_j)}\right]. \qquad (35)$$

Now restrict to the special case in which $a_0(t) = -a_1(t) = a(t)$ and consider the likelihood distributions (now themselves time-dependent) that correspond to an increment $dI(t) = a(t)\,dt + c(t)\,dW_t$. Since the $dW_t$ are normally distributed with mean 0 and variance $dt$, we have

$$P_{0(t)}(dI(t)) = \frac{1}{\sqrt{2\pi c^2(t)\,dt}}\, e^{-(dI(t) - a(t)\,dt)^2/(2c^2(t)\,dt)}, \qquad (36)$$

$$P_{1(t)}(dI(t)) = \frac{1}{\sqrt{2\pi c^2(t)\,dt}}\, e^{-(dI(t) + a(t)\,dt)^2/(2c^2(t)\,dt)}. \qquad (37)$$

The corresponding increment of likelihood evidence to (33) is

$$dl_t = \log\left[\frac{P_0(dI_t)}{P_1(dI_t)}\right] = k\,\frac{a(t)}{c^2(t)}\,dI_t, \qquad (38)$$

where $k = 2\log(e)$ depends on the base of the logarithm. Substituting for $dI_t$, we obtain a differential equation for the total evidence $l_t$ accumulated at time $t$,

$$dl_t = k\,\frac{a^2(t)}{c^2(t)}\,dt + k\,\frac{a(t)}{c(t)}\,dW_t, \qquad (39)$$

which may be integrated to yield:

$$l(t) = \int_0^t k\,\frac{a^2(s)}{c^2(s)}\,ds + \int_0^t k\,\frac{a(s)}{c(s)}\,dW_s. \qquad (40)$$

Comparing with Eq. (31) shows that the optimal filter is

$$\bar K(t,s) = k\,\frac{a(s)}{c^2(s)}; \qquad (41)$$

this is the matched filter for white noise which is fundamental in signal processing [Papoulis, 1977]. Note that, in (39)-(40) only the signal-to-noise ratio $(a/c)$ appears.

3.2. A direct proof that the kernel $\bar K(t,s) = k(a(s)/c^2(s))$ is optimal in the interrogation paradigm

As follows from its matched filter property, the linear filter $\bar K(t,s) = k(a(s)/c^2(s))$ which computes log likelihood $l(t)$ for inputs with white noise also produces, for all times $t$, a filtered (and Gaussian) version $w(t)$ of the input [Eq. (31)] with a maximal integrated signal-to-noise ratio

$$F[K; a, c](t) = \frac{\int_0^t K(t,s)a(s)\,ds}{\sqrt{2\,\mathrm{Var}\left[\int_0^t K(t,s)c(s)\,dW_s\right]}} = \frac{\int_0^t K(t,s)a(s)\,ds}{\sqrt{2\int_0^t K^2(t,s)c^2(s)\,ds}}. \qquad (42)$$

For completeness, we now demonstrate this directly. Minimization of the error rate (18) or (19) for (fixed) interrogation at time $t = T$ is achieved by maximizing $F$ over all possible kernels $K(s)$. This problem in the calculus of variations is solved by computing the first and second variations, with respect to $K$, of the functional $F$, setting the first to zero to determine a candidate $\bar K$ for the optimal $K$, and evaluating the second at $\bar K$ to check that $\delta^2 F/\delta K^2$ is negative (semi-) definite. Henceforth we drop explicit reference to the (fixed, arbitrary) interrogation time $t = T$ in the function $K$ and

write $K(T, s) = K(s)$. We compute:

$$\frac{\delta F}{\delta K} = \lim_{\epsilon \to 0}\frac{d}{d\epsilon} F[K + \epsilon\gamma; a, c](T) = \lim_{\epsilon \to 0}\frac{d}{d\epsilon}\left\{\frac{\int_0^T a(s)[K(s) + \epsilon\gamma(s)]\,ds}{\sqrt{2\int_0^T c^2(s)[K^2(s) + 2\epsilon K(s)\gamma(s) + \epsilon^2\gamma^2(s)]\,ds}}\right\}$$

$$= \lim_{\epsilon \to 0}\frac{1}{\sqrt{2}}\left\{\frac{\int_0^T a(s)\gamma(s)\,ds}{[H(T,\epsilon)]^{1/2}} - \frac{\int_0^T a(s)[K(s) + \epsilon\gamma(s)]\,ds\,\int_0^T c^2(s)[K(s)\gamma(s) + \epsilon\gamma^2(s)]\,ds}{[H(T,\epsilon)]^{3/2}}\right\}$$

$$= \frac{\int_0^T a(s)\gamma(s)\,ds\,\int_0^T c^2(s)K^2(s)\,ds - \int_0^T a(s)K(s)\,ds\,\int_0^T c^2(s)K(s)\gamma(s)\,ds}{\sqrt{2}\left[\int_0^T c^2(s)K^2(s)\,ds\right]^{3/2}}, \qquad (43)$$

where $H(T,\epsilon) = \int_0^T c^2(s)[K^2(s) + 2\epsilon K(s)\gamma(s) + \epsilon^2\gamma^2(s)]\,ds$. Setting (43) equal to zero and using the fact that the variation $\gamma(s)$ is arbitrary, we conclude that the critical point indeed occurs at $\bar K(s) = k(a(s)/c^2(s))$, as given by (41).

To compute the second derivative we differentiate the expression within braces in the penultimate step of (43) with respect to $\epsilon$ once more, set $\epsilon = 0$, and evaluate the resulting expression at the critical point (41), obtaining:

$$\left.\frac{\delta^2 F}{\delta K^2}\right|_{K = \bar K} = -\frac{\int_0^T c^2(s)\bar K^2(s)\,ds\,\int_0^T c^2(s)\gamma^2(s)\,ds - \left(\int_0^T c^2(s)\bar K(s)\gamma(s)\,ds\right)^2}{\sqrt{2}\,k\left[\int_0^T c^2(s)\bar K^2(s)\,ds\right]^{3/2}} \le 0. \qquad (44)$$

In the last step we appeal to Schwarz's inequality. This proves that the second variation is negative semidefinite, and vanishes identically only for variations $\gamma(s) = \kappa\bar K(s)$ in the direction of $\bar K$ (as expected from (41), which contains the arbitrary "scaling" parameter $k$).

Substituting (41) into (42) we obtain

$$F[\bar K; a, c](T) = \sqrt{\frac{1}{2}\int_0^T \frac{a^2(s)}{c^2(s)}\,ds}, \qquad (45)$$

and using (32), we obtain the minimum possible error rate for interrogation at time $T$:

$$\mathrm{ER} = \frac{1}{2}\left[1 - \mathrm{erf}\left(\sqrt{\frac{1}{2}\int_0^T \frac{a^2(s)}{c^2(s)}\,ds}\right)\right]. \qquad (46)$$

Since the integrand $(a/c)^2$ is non-negative, the error rate continues to decrease or at worst remains constant as $T$ increases.

3.3. Optimal gains for the three models

We may now extract explicit expressions for optimal gains by setting $K(s) = \bar K(s)$ in (31) and comparing the resulting integrands with those in the SDE solutions (28)-(30).

3.3.1. Pure drift-diffusion model

Comparing (31) with (28), we see that the optimal gain is simply $\bar K$:

$$\bar g_{dd}(s) = \tau_c \bar K(s) = \tau_c k\,\frac{a(s)}{c^2(s)}; \qquad (47)$$

thus, there is a continuum of optimal schedules differing only by a multiplicative scale factor.

3.3.2. Connectionist model

Equations (31) and (29) give

$$\tau_c \bar K(s) = \tau_c k\,\frac{a(s)}{c^2(s)} = \exp\left(\frac{1}{\tau_c}\int_s^T [\beta \bar g_c(s') - 1]\,ds'\right), \qquad (48)$$

where $\bar g_c$ is the optimal gain for the connectionist model. Taking the log of this expression, differentiating with respect to $s$, and solving for $\bar g_c(s)$, we obtain:

$$\bar g_c(s) = \frac{1}{\beta}\left[1 - \tau_c \frac{d}{ds}\log\left(\frac{a(s)}{c^2(s)}\right)\right]. \qquad (49)$$

Note that $\bar g_c$ is unique and, in particular, independent of $k$ and the interrogation time $T$. However, $\bar g_c$ is not required to be positive, so may not always be physically admissible. The form of $\bar g_c$ may be interpreted as follows. When $(a(s)/c^2(s))$ is decreasing, $\bar g_c(s) > 1/\beta$ and the O-U process (24) is unstable; hence solutions "run away," in the direction $x(s)$, emphasizing higher-fidelity information that was previously collected. When $(a(s)/c^2(s))$ is increasing, $\bar g_c(s) < 1/\beta$, the O-U process is stable, and the linear term in (24) is attractive, thereby discounting previously integrated information in favor of the higher-fidelity input currently arriving.

We note that, because the "output" neural activity is determined by a gain-dependent function of the dynamical variable $x$ in the connectionist model (see text following Eqs. (1)-(2)), transient gain schedules also adjust the position of free-response thresholds with respect to $x$. We leave an exploration of this effect, which does not enter the interrogation protocol or affect the firing rate model, for future studies.

3.3.3. Firing rate model

Equations (31) and (30) give

$$\tau_c \bar K(s) = \tau_c k\,\frac{a(s)}{c^2(s)} = \bar g_f(s)\exp\left(\frac{1}{\tau_c}\int_s^T [\beta \bar g_f(s') - 1]\,ds'\right). \qquad (50)$$

Defining $f(s) = \tau_c k(a(s)/c^2(s))e^{(T-s)/\tau_c}$, differentiating with respect to $s$, and restricting to positive functions $\bar g_f$, $a$ and $c^2$ (which we justify below), (50) yields

$$\frac{df}{ds} = \bar g_f'(s)\exp\left(\frac{\beta}{\tau_c}\int_s^T \bar g_f(s')\,ds'\right) - \frac{\beta}{\tau_c}\bar g_f^2(s)\exp\left(\frac{\beta}{\tau_c}\int_s^T \bar g_f(s')\,ds'\right) = \bar g_f'(s)\frac{f(s)}{\bar g_f(s)} - \frac{\beta}{\tau_c}\bar g_f(s)f(s). \qquad (51)$$

Rewriting (51), we obtain

$$\frac{d\bar g_f}{ds} = \frac{\beta}{\tau_c}\bar g_f^2(s) + \bar g_f(s)\frac{d}{ds}\log(f(s)) = \frac{\beta}{\tau_c}\bar g_f^2(s) + \bar g_f(s)\left[\frac{d}{ds}\log\left(\frac{a(s)}{c^2(s)}\right) - \frac{1}{\tau_c}\right]. \qquad (52)$$

Thus, the condition for optimal gain in the linearized firing rate model is a differential equation, unlike the algebraic relationships for the drift-diffusion and connectionist cases. Note that solutions to (52) initialized at positive values remain positive for all times, since the equation has an equilibrium at $\bar g_f = 0$, preventing passage through this point. This justifies our assumption of positive $\bar g_f$ above and ensures that the optimum gain is "physical" in this sense. In fact, (52) may be solved explicitly using the integrating factor $I(s) = \exp\left(\int_0^s l(s')\,ds'\right)$, where $l(s') = (d/ds')\log(a(s')/c^2(s')) - 1/\tau_c$, yielding

$$\bar g_f(s) = \frac{\exp\left(\int_0^s l(s')\,ds'\right)}{-\frac{\beta}{\tau_c}\int_0^s \exp\left(\int_0^{s'} l(s'')\,ds''\right)ds' + \frac{1}{\bar g_f(0)}}. \qquad (53)$$

The integral equation (50) specifies only an arbitrary, positive final condition $\bar g_f(T) = k(a(T)/c^2(T))$ for (52), since $k$ is itself arbitrary. Any solution of (52) with positive initial condition

(as long as it is defined) therefore delivers a member of the continuum of optimal gain functions for the linearized firing rate model. This is in striking contrast to the unique optimal gain (49) in the connectionist model, and, since the different $\bar g_f$ generally have different forms (see below), it also contrasts with the multiplicity of "scaled" optimal drift-diffusion gain functions (47). The optimality of $\bar g_f$ schedules with such different forms follows from the fact that gain multiplies the inputs to the firing rate model (21). For example, optimal gain schedules with $(\beta\bar g_f(s) - 1) < 0$ may implement the SPRT even when the signal-to-noise ratio is constant (see Example 1 below), because discounting of previously integrated evidence is compensated for via weighting incoming evidence by a decreasing function $\bar g_f(s)$.

3.3.4. Numerical examples

Example 1. We first take constant signal $a(s) = \bar a = 0.06$ and constant noise strength $c(s) = 0.09$ with $\tau_c = \beta = 1$. Then, Eq. (47) gives the family of optimal constant gain functions for the pure drift-diffusion model (with the constant factor $1/c^2$ absorbed into $k$),

$$\bar g_{dd}(s) = \tau_c k \bar a, \qquad (54)$$

and Eq. (49) gives the unique optimal gain for the connectionist model, again a constant:

$$\bar g_c(s) = \frac{1}{\beta}. \qquad (55)$$

For the same parameter values, the firing rate model gain ODE (52) becomes

$$\frac{d\bar g_f}{ds} = \frac{\beta}{\tau_c}\bar g_f^2(s) - \frac{1}{\tau_c}\bar g_f(s). \qquad (56)$$

Initial conditions $\bar g_f(0) \in [0, 1/\beta)$ decay to the fixed point at $\bar g_f = 0$, while for $\bar g_f(0) > 1/\beta$, gain functions increase to $\infty$ in finite time. The initial condition $\bar g_f(0) = 1/\beta$ yields the constant gain function $\bar g_f(s) = 1/\beta$, for which the linearized firing rate model again becomes constant drift Brownian motion: see Fig. 7. As expected, all gain profiles

[Figure 7: three stacked panels plotted against time s from 0 to 2; the bottom panel shows c(s) and a(s).]
Fig. 7. Optimal gains for constant signal strength $a(s) = 0.06$ (solid line in bottom panel) and constant noise amplitude $c(s) = 0.09$ (dotted line). Top panel: three optimal gain schedules $\bar g_f$ solving (52); note that these include, but are not limited to, $\bar g_f(s) = 1/\beta$ (here $\beta = 1$). Central panel: the unique optimal gain function $\bar g_c(s) = 1/\beta$ for the connectionist model, given by Eq. (49).
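As a rough numerical check of Example 1, the following Python sketch evaluates the closed-form solution (53) of the gain equation (56) for constant signal and noise, builds the firing rate kernel read off from (30), and compares the resulting signal-to-noise functional (42) with its optimal value (45). The function names, the grid size, and the trapezoid quadrature are illustrative assumptions and are not part of the original analysis.

```python
import math

# Example 1 parameters as stated in the text.
A_SIG, C_NOISE, BETA, TAU_C, T_END = 0.06, 0.09, 1.0, 1.0, 2.0


def gain_firing_rate(s, g0):
    """Closed-form solution (53) of (56) for constant a and c, where l = -1/tau_c."""
    decay = math.exp(-s / TAU_C)  # integrating factor exp(int_0^s l)
    return decay / (1.0 / g0 - BETA * (1.0 - decay))


def trapezoid(values, h):
    """Composite trapezoid rule on a uniform grid with spacing h."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


def snr_functional(g0, n=4000):
    """F of (42) at t = T for the firing rate kernel read off from (30)."""
    h = T_END / n
    g = [gain_firing_rate(i * h, g0) for i in range(n + 1)]
    rate = [(BETA * gi - 1.0) / TAU_C for gi in g]
    # tail[i] approximates the integral of rate from s_i to T, built backwards.
    tail = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        tail[i] = tail[i + 1] + 0.5 * h * (rate[i] + rate[i + 1])
    kernel = [g[i] / TAU_C * math.exp(tail[i]) for i in range(n + 1)]
    signal = trapezoid([k * A_SIG for k in kernel], h)
    variance = trapezoid([(k * C_NOISE) ** 2 for k in kernel], h)
    return signal / math.sqrt(2.0 * variance)


def optimal_snr():
    """Right-hand side of (45) for constant a and c."""
    return math.sqrt(0.5 * (A_SIG / C_NOISE) ** 2 * T_END)


def fraction_correct(f_value):
    """1 - ER, with ER = (1/2)[1 - erf(F)] as in (32) and (46)."""
    return 0.5 * (1.0 + math.erf(f_value))


if __name__ == "__main__":
    for g0 in (0.2, 0.5, 1.0):  # initial gains in (0, 1/beta]
        f_val = snr_functional(g0)
        print(g0, f_val, fraction_correct(f_val))
    print("optimal F:", optimal_snr())
```

Under these assumptions, each initial gain in $(0, 1/\beta]$ should give a kernel that is constant in $s$ up to quadrature error, so all three schedules reach the same value of $F$ and hence the same fraction of correct responses.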

produced optimal performance (with 82.7% correct responses returned at interrogation time $T = 2$).

Example 2. We now assume that signal amplitude is zero up to stimulus presentation at time $t_s$ and rises exponentially toward $\bar a$ thereafter: $a(s) = \bar a[1 - e^{-r(s - t_s)}]$ for $s > t_s$. This form is motivated by the saturating dynamics of input layers which feed forward to decision units in simple connectionist models. We set $\bar a = 0.06$, $r = 10$, $t_s = 1$ and take constant noise strength $c(s) = 0.09$ and $\tau_c = \beta = 1$ as previously: see Fig. 8 (bottom). As $r \to \infty$, $a(s)$ approaches the piecewise constant functions of Secs. 2.5.1-2.6, for which the one-dimensional reduction was shown to be an adequate model.

For the pure drift-diffusion model, Eq. (47) gives

$$\bar g_{dd}(s) = \tau_c k a(s), \qquad (57)$$

so that, as above, optimal gain trajectories are scaled versions of the signal strength and, in particular, $\bar g_{dd}(s) = 0$ for $s < t_s$. For the connectionist and firing rate models, however, the formulae (49) and (52) are valid only while $a(s) > 0$, and additional reasoning is needed to determine optimal gain values in the pre-stimulus period $s < t_s$. For the connectionist model, the integral equation (48) is clearly satisfied for $a(s) = 0$ if $\bar g_c(s) = -\infty$, so we set $\bar g_c(s) = -\infty$, $s < t_s$. Since for a "physical" neural network, activation functions $f_{g(t)}(\cdot)$ are nondecreasing, such negative gain values are not directly relevant to biological applications, but illustrate the demand that relative activation $x$ be clamped at zero before the stimulus arrives. As before, we define $\bar g_c(s)$ via (49) for $s > t_s$. That is, for $s > t_s$,

$$\bar g_c(s) = \frac{1}{\beta}[1 - \tau_c l(s)], \qquad (58)$$

where $l(s) = (d/ds)\log(a(s)/c^2(s)) = r/(e^{r(s - t_s)} - 1)$ decays from $\infty$ to 0 as time $s$ increases.

For the firing rate model, we also appeal directly to the integral equation (50) to define $\bar g_f(s)$ when $a(s) = 0$. Since (50) is satisfied by $\bar g_f(s) = 0$,

[Figure 8: three stacked panels plotted against time s from 0 to 2.]
Fig. 8. Optimal gains for exponentially asymptoting signal strength $a(s)$ (solid line in bottom panel) and constant noise amplitude $c(s) = 0.09$ (dotted line). Top panel: three optimal gain schedules $\bar g_f$ for the firing rate model solving (52) (solid curves); the nonoptimal constant gain $g = 1/\beta$ is shown as dot-dashed for reference. The lowest of the solid $\bar g_f$'s displays the rise-decay form discussed in the text. Central panel: the unique optimal gain function for the connectionist model, given by Eq. (49); $\bar g_c(s) = -\infty$ for $s < t_s$.
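The performance figures quoted for Example 2 can be cross-checked with a short Python sketch. It treats the filtered variable $w(T)$ of (31) as Gaussian, so that the fraction correct is $\Phi(\text{mean}/\text{sd})$, and compares the matched kernel (41) with the kernel implied by constant gain $1/\beta$ in (30). The helper names, the choice $k = 1$, and the trapezoid grid are assumptions made for illustration only.

```python
import math

# Example 2 parameters as stated in the text.
A_BAR, RISE, T_S, C_NOISE, BETA, TAU_C, T_END = 0.06, 10.0, 1.0, 0.09, 1.0, 1.0, 2.0


def signal(s):
    """a(s): zero before t_s, then saturating exponentially toward A_BAR."""
    return A_BAR * (1.0 - math.exp(-RISE * (s - T_S))) if s > T_S else 0.0


def trapezoid(values, h):
    """Composite trapezoid rule on a uniform grid with spacing h."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def fraction_correct(kernel, n=4000):
    """P(correct) = Phi(mean / sd) for the filtered variable w(T) of (31)."""
    h = T_END / n
    grid = [i * h for i in range(n + 1)]
    mean = trapezoid([kernel(s) * signal(s) for s in grid], h)
    var = trapezoid([(kernel(s) * C_NOISE) ** 2 for s in grid], h)
    return normal_cdf(mean / math.sqrt(var))


def matched_kernel(s):
    """Optimal kernel (41), with the arbitrary scale k set to 1."""
    return signal(s) / C_NOISE ** 2


def constant_gain_kernel(s):
    """Firing rate kernel from (30) with g = 1/beta, so the exponent vanishes."""
    return (1.0 / BETA) / TAU_C


def connectionist_gain(s):
    """Eq. (58) for s > t_s, with l(s) = r / (exp(r (s - t_s)) - 1)."""
    log_slope = RISE / math.expm1(RISE * (s - T_S))
    return (1.0 - TAU_C * log_slope) / BETA


if __name__ == "__main__":
    print("matched filter:", fraction_correct(matched_kernel))
    print("constant gain :", fraction_correct(constant_gain_kernel))
    print("g_c(1.5)      :", connectionist_gain(1.5))
```

With these parameters the two fractions should land close to the 73.1% and 66.4% reported in the text, the gap arising because constant gain also integrates the pre-stimulus noise.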

we assume this for $s < t_s$. We then determine $\bar g_f(s)$ for $s > t_s$ from (52), allowing a discontinuity at $t_s$ and taking arbitrary "initial" conditions $\bar g_f(t_s)$. Figure 8 illustrates several optimal functions arising from different choices of $\bar g_f(t_s)$. The following fact is helpful in understanding positive solutions of (52): orbits lying below $(1/\beta)[1 - \tau_c l(s)]$ at any time $s$ decrease toward 0; those above this value increase. Since $(1/\beta)[1 - \tau_c l(s)] \to 1/\beta$ as $s \to \infty$, $1/\beta$ asymptotically forms a separatrix between optimal gain trajectories that decay and those that diverge to $\infty$. Also, note that Case 2 parameters for the two-dimensional firing rate model of Sec. 2.5.1 implement a step in effective gain values up to $1/\beta = 1$, so that in this case nearly optimal signal processing occurs with no explicit adjustment of the gain parameter. The performance resulting from optimal gain trajectories in all models is 73.1% correct responses at interrogation at time $T = 2$; for comparison, the (nonoptimal) constant gain $\bar g_f(s) = 1/\beta$ produces only 66.4% correct.

Gains must remain bounded for all time to be of practical interest. A family of optimal gain schedules of this form, determined by their (sufficiently small) initial conditions, will always exist for monotonically rising and bounded stimuli $a(s)$ such as that chosen here. As we elaborate in Sec. 4, their "rise-decay" pattern resembles the gain produced by dissipating pulses of the neuromodulator norepinephrine delivered to cortical decision areas via the locus coeruleus, hence providing a clue that this brainstem organ may be assisting near-optimal decision making.

Example 3. We finally assume that $a(s)$ smoothly increases from a low to a higher level and then returns to its original level, corresponding to a transient increase in stimulus salience. We model this as a difference of two sigmoids: $a(s) = a_0 + (\bar a/(1 + \exp(-4r(t_{s,1} - s)))) - (\bar a/(1 + \exp(4r(t_{s,2} - s))))$, with parameters $a_0 = -0.04$, $\bar a = 0.045$, $t_{s,1} = 0.75$, $t_{s,2} = 1.25$, and $r = 20$: see Fig. 9. Additionally, we take constant noise strength $c(s) = 0.06$ and $\tau_c = \beta = 1$.

For the pure drift-diffusion model, Eq. (47) again gives $\bar g_{dd}(s) = \tau_c k a(s)$, and for the

[Figure 9: three stacked panels plotted against time s from 0 to 2.]
Fig. 9. Optimal gains for pulsed signal strength $a(s)$ (solid line in bottom panel) and constant noise amplitude $c(s) = 0.06$ (dotted line). Top panel: three optimal gain schedules $\bar g_f$ for the firing rate model solving (52) (solid curves); the nonoptimal constant gain function $g = 1/\beta$ is shown dot-dashed for reference. Central panel: the unique optimal gain function for the connectionist model, given by Eq. (49).
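The qualitative behaviour of the connectionist gain (49) for a pulsed stimulus can be illustrated with a small Python sketch. The pulse below is a hypothetical positive signal built from two logistic functions; its baseline, amplitude, steepness, and switching times are assumed values chosen for illustration and are not claimed to reproduce the parameters of Example 3. Noise is taken constant, so that $(d/ds)\log(a/c^2) = a'(s)/a(s)$.

```python
import math

BETA, TAU_C = 1.0, 1.0
# Illustrative pulse parameters (assumed here, not the values of Example 3).
BASE, AMP, RATE, T_ON, T_OFF = 0.005, 0.045, 80.0, 0.75, 1.25


def sigmoid(x):
    """Logistic function; arguments stay within about +/-100 on [0, 2]."""
    return 1.0 / (1.0 + math.exp(-x))


def signal(s):
    """Hypothetical positive pulse: rises near T_ON, falls near T_OFF."""
    return BASE + AMP * (sigmoid(RATE * (s - T_ON)) - sigmoid(RATE * (s - T_OFF)))


def signal_slope(s):
    """Exact derivative da/ds, using sigmoid' = sigmoid * (1 - sigmoid)."""
    up = sigmoid(RATE * (s - T_ON))
    down = sigmoid(RATE * (s - T_OFF))
    return AMP * RATE * (up * (1.0 - up) - down * (1.0 - down))


def connectionist_gain(s):
    """Eq. (49) with constant noise, so d/ds log(a/c^2) = a'(s)/a(s)."""
    return (1.0 - TAU_C * signal_slope(s) / signal(s)) / BETA


if __name__ == "__main__":
    for s in (0.0, 0.75, 1.0, 1.25, 2.0):
        print(s, signal(s), connectionist_gain(s))
```

In this sketch the gain dips below $1/\beta$ while the pulse rises, exceeds $1/\beta$ while it falls, and sits near $1/\beta$ where the signal is flat, which is the pattern described in Sec. 3.3.2.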

connectionist and firing rate models, we may use (49) and (52) for the entire time interval of interest since $a(s)$ is strictly positive. The resulting optimal gain trajectories, shown in Fig. 9, yield 70.8% correct responses at interrogation time $T = 2$, compared with 64.9% correct obtained for constant gain $\bar g_f(s) = 1/\beta$ in the firing rate model. Note that the form of the optimal $\bar g_c(s)$ illustrates the intuitive explanation given in Sec. 3.3.2: when the signal-to-noise ratio increases, $\bar g_c(s)$ decreases, suppressing previously integrated information, and vice-versa.

In summary, we have shown in this section that gain schedules yielding optimal performance in (reduced) models of decision tasks depend strongly on the time course of task stimuli as well as the structure of the underlying model, although they all implement matched filters and maximize the signal-to-noise ratio in the difference between activities of neural populations representing competing alternatives. For systems well-described by connectionist models, neural mechanisms may be expected to depress the gain (i.e. strength of inhibitory feedback) below the "balanced" level of $1/\beta$ when stimulus salience is increasing, and enhance it above this level when salience is decreasing. However, for the firing rate model an optimal network can "choose" among a variety of gain schedules of qualitatively different forms. One neurobiological implication of this flexibility is explored in the following section.

4. The Locus Coeruleus Brainstem Area and Optimal Gain Trajectories

Neurons comprising the brainstem nucleus locus coeruleus (LC) emit the neurotransmitter norepinephrine (NE) to targets widely distributed throughout the brain, including cortical areas involved in decision tasks. While NE has disparate and complex effects on different brain regions, a dominant cortical role is believed to be modulation of neuronal gain at both the single cell and population levels [Usher et al., 1999; Servan-Schreiber et al., 1990]. Recordings of cortical neuron responses to stereotyped inputs at various latencies following activation of LC reveal these gain effects: responses to a fixed input are larger (in certain experimental ranges) following LC activation than in control recordings without LC, and this elevated sensitivity decays with a time constant $\tau_{NE} \approx 0.2$ sec [Waterhouse et al, 1998].

Since the firing rate of LC neurons governs NE release rate, we propose the following simple model for cortical gain $g(t)$:

$$\tau_{NE}\,\dot g(t) = k_{LC}\,LC(t) - g(t). \qquad (59)$$

Here, $LC(t)$ denotes the time-dependent rate of LC firing and $k_{LC}$ is a constant relating this rate to equilibrium values of cortical gain. This model's limitations in describing the underlying biology include the fact that $g(t)$ decays to zero in the absence of LC firing (this could be rectified by adding a constant "gain floor" $g_{base}$). Nevertheless, it allows us to make an interesting qualitative point in relating recent data on LC firing rates to optimal strategies for the processing of noisy sensory stimuli. Inverting (59) and inserting an optimal gain trajectory yields a prediction for the optimal time course of LC activity:

$$LC(t) = \frac{1}{k_{LC}}\left(\tau_{NE}\,\dot{\bar g}(t) + \bar g(t)\right). \qquad (60)$$

Figure 10(d) shows histograms of LC firing rates recorded from monkeys performing two different psychological tasks: target identification, in which a horizontal or vertical bar must be detected, and the Eriksen flanker task, in which a central cue must be identified while an array of distractors is ignored. Since the second task involves more complex stimulus processing, we assume as in [Brown et al., 2004b] that the onset of stimulus representation in cortical decision areas is more gradual in this than in the target identification task. Specifically, for $t$ greater than the time $t_s$ of stimulus arrival we take $a(t) = \bar a(1 - e^{-r(t - t_s)})$ with $r = 50$ (time constant 0.02 sec) for target identification and $r = 10$ (time constant 0.1 sec) for the Eriksen task; also, we set $\bar a = 0.06$ and $\tau_c = 0.5$ sec: see Fig. 10(b). Additionally, we assume that $t_s$ follows presentation of the sensory cue by a processing time lag of 0.1 sec (cf. [Aston-Jones et al, 1994]). Optimal gain schedules $\bar g_f(t)$ for the firing rate model with these stimuli, computed as in the preceding section, are shown in Fig. 10(a). To produce panel (c), these gain functions were inserted into Eq. (60) to yield corresponding optimal LC firing rates, the discontinuity in $\bar g_f(t)$ at stimulus onset having negligible effect. (Also note that assuming a smoother profile for $a(t)$ would eliminate the jump in $LC(t)$.) The similarity between overall form and decay rates of optimal gain functions $LC(t)$ and the empirical data of Fig. 10(d) supports the
[Figure 10: panels (a)-(d) in two columns, "Target detection task" (left) and "Eriksen flanker task" (right), plotted against time (sec.).]
Fig. 10. Comparison of optimal gain theory with empirical data for two psychological tasks. (a) Optimal gain schedules for the firing rate model, for rapid (left) and gradual (right) onset of stimulus $a(t)$ to neural units (with a processing time lag of 0.1 sec following sensory cue), as shown in (b). (c) The corresponding optimal time courses of LC firing rate. (d) Histograms of LC firing rates recorded in the two tasks: (left) the target detection task [Usher et al, 1999]; (right) the Eriksen flanker task, with data kindly provided by the authors of [Clayton et al, 2004]. Vertical dashed lines indicate onset of sensory stimuli, and vertical gray (solid) lines indicate mean behavioral reaction time (standard deviations are ≈ 34 and 114 msec for the target detection and Eriksen tasks, respectively).
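The inversion from gain to LC firing rate in (59)-(60) can be sketched in Python as follows. The rise-decay gain profile used here is a hypothetical stand-in, not a solution of (52), and the value of $k_{LC}$ is an arbitrary assumption since the text does not specify it. The sketch computes $LC(t)$ from (60) and then integrates (59) forward with a simple Euler scheme to confirm that the original gain profile is recovered.

```python
import math

TAU_NE = 0.2   # sec, time constant quoted in the text
K_LC = 1.0     # hypothetical scale factor; the text leaves k_LC unspecified
TAU_G = 0.3    # hypothetical time scale of the illustrative gain pulse


def gain(t):
    """Illustrative rise-decay gain profile (an assumption, not a solution of (52))."""
    return t * math.exp(-t / TAU_G)


def gain_rate(t):
    """Exact time derivative of gain(t)."""
    return (1.0 - t / TAU_G) * math.exp(-t / TAU_G)


def lc_rate(t):
    """Eq. (60): LC firing rate that would produce gain(t) under model (59)."""
    return (TAU_NE * gain_rate(t) + gain(t)) / K_LC


def simulate_gain(t_end=2.0, dt=1e-4):
    """Forward-Euler integration of (59) driven by lc_rate, starting from g(0) = 0."""
    steps = int(round(t_end / dt))
    g, path = 0.0, [0.0]
    for i in range(steps):
        t = i * dt
        g += dt * (K_LC * lc_rate(t) - g) / TAU_NE
        path.append(g)
    return path


if __name__ == "__main__":
    dt = 1e-4
    path = simulate_gain(dt=dt)
    worst = max(abs(path[i] - gain(i * dt)) for i in range(len(path)))
    print("max round-trip error:", worst)
```

Because (60) involves the derivative of the gain, a jump in the gain at stimulus onset would appear as a spike in $LC(t)$; the smooth profile assumed here avoids that issue.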

hypothesis that the LC may affect near-optimal processing of sensory stimuli. This is true even though LC firing rates are not sustained at the initial high values that follow stimulus onset; in fact, both LC firing rate relaxation and NE time constants are compatible with optimal gain schedules.

We note that the optimal gains, and hence $LC(t)$ time courses, are computed assuming prior knowledge of the stimulus $a(t)$ and signal-to-noise ratio $a(t)/c(t)$. If this were the case, LC firing patterns should be well-correlated with stimulus onset. However, experimental data of [Clayton et al, 2004], which involved variable stimulus onset times, indicate tighter correlations with behavioral responses. Here, the function $a(t)$ is perhaps better interpreted as input to motor neurons, the onsets of sensory stimuli having been detected earlier in decision layers. Thus, the most appropriate LC data for use in Fig. 10 would be aligned with transients in firing rates in intermediate processing layers; here we provide data aligned with sensory stimuli as the closest available surrogate. Explicit models of multi-layer decision/response dynamics with variable gain are studied in [Brown et al, 2004a].

5. Discussion and Conclusions

In this paper we explicitly compute optimal gain trajectories for one-dimensional, linearized reductions of simplified models for competing neural groups involved in decisions between two alternatives. We also demonstrate via simulations that such reductions provide good approximations for the reaction time and error rate statistics of the nonlinear two-dimensional connectionist and firing rate models from which they were derived.

We first show that the nonlinear connectionist and firing rate models are equivalent, under suitable variable and parameter coordinate changes. We then develop a piecewise-linear approximation to the canonical sigmoidal activation or firing rate function. The resulting two-dimensional piecewise-linear SDEs (15)-(16) introduced in Sec. 2.4 form a midpoint in our simplification process. This system can be easily solved on each of nine "tiles" forming its phase plane, but solutions must be assembled by matching constants of integration. To illustrate this, we focus on two specific cases in Sec. 2.5.1, motivated by the "moving dots" paradigm [Britten et al, 1993; Shadlen & Newsome, 2001; Gold & Shadlen, 2002], that correspond to differing stimulus presentation conditions and rely on different neural mechanisms to implement transient effective gain values.

In Case 1, the development of salience (i.e. $a_1 \neq a_2$) in sensory stimuli at time $t_s$ is not accompanied by large changes in the stimulus magnitudes; in fact the summed magnitude is unchanged. This mild stimulus onset is insufficient to move solutions between tiles, so variations in gain must result from modulation of the gain of the neural activation function itself, presumably via influence of other brain areas such as the locus coeruleus. However, in Case 2, the appearance of salience is accompanied by large changes in stimulus magnitude, either due to properties of the stimulus itself or due to additive biases that shift the activation function to the left, as has been proposed in connectionist models that address the effects of attention [Mozer, 1988; Cohen et al, 1992]. In this case, no external modulation of gain is required, since the decision dynamics themselves move the system between regions of the activation function where desired sensitivities (and hence gains) are achieved. The possibility that neural systems are tuned so that the presence of target stimuli causes solutions to move into sensitive regions of their activation functions has been previously suggested in behavioral neuroscience [Servan-Schreiber et al., 1990]; here we reformulate this idea in terms of optimal signal processing.

We end by showing that the (nonunique) optimal gain schedules for the firing rate model include time courses that are consistent with release of norepinephrine due to transient increases in the activity of neurons in locus coeruleus.

The external modification of gain considered in Case 1 assumes prior knowledge of the time course of the absolute values of sensory inputs $a_j(t)$, the task of the decision maker being merely to identify their signs. In [Brown et al, 2004a] the more general case in which this information is not available is treated, and strategies must additionally include a mechanism for detecting increases in signal-to-noise ratio of sensory inputs.

Acknowledgments

This work was partially supported by DoE grant DE-FG02-95ER25238 and PHS grants MH58480 and MH62196 (Cognitive and Neural Mechanisms of Conflict and Control, Silvio M. Conte Center). E. Brown was supported under a National Science Foundation Graduate Fellowship and a Burroughs-Wellcome Training Grant in Biological Dynamics.

The authors thank Josh Gold and Jaime Cisternas for useful contributions and discussions, as well as Ed Clayton and Gary Aston-Jones for providing the data of Fig. 10 and for their insights into the role of the LC in modulating decisions.

References

Abbott, L. [1991] "Firing-rate models for neural populations," in Neural Networks: From Biology to High-Energy Physics, eds. Benhar, O., Bosio, C., Del Giudice, P. & Tabat, E. (ETS Editrice, Pisa), pp. 179-196.
Amit, D. & Tsodyks, M. [1991] "Quantitative study of attractor neural network retrieving at low spike rates: I. Substrate-spikes, rates, and neuronal gain," Network 2, 259-273.
Anderson, J. [1990] The Adaptive Character of Thought (Lawrence Erlbaum, Hillsdale, NJ).
Arnold, L. [1974] Stochastic Differential Equations (John Wiley, NY).
Arnold, L. [1998] Random Dynamical Systems (Springer, Heidelberg).
Aston-Jones, G., Rajkowski, J., Kubiak, P. & Alexinsky, T. [1994] "Locus coeruleus neurons in the monkey are selectively activated by attended stimuli in a vigilance task," J. Neurosci. 14, 4467-4480.
Bialek, W., Rieke, F., de Ruyter van Steveninck, R. & Warland, D. [1991] "Reading a neural code," Science 252, 1854-1857.
Bogacz, R., Brown, E., Moehlis, J., Hu, P., Holmes, P. & Cohen, J. D. [2004] "The physics of optimal decision making: A formal analysis of models of performance in two alternative forced choice tasks," Psych. Rev., in review.
Boxler, P. [1991] "How to construct stochastic center manifolds on the level of vector fields," in Lyapunov Exponents, eds. Arnold, L., Crauel, H. & Eckmann, J.-P., Lecture Notes in Mathematics, Vol. 1486 (Springer, Heidelberg), pp. 141-158.
Britten, K. H., Shadlen, M. N., Newsome, W. T. & Movshon, J. A. [1993] "Responses of neurons in macaque MT to stochastic motion signals," Vis. Neurosci. 10, 1157-1169.
Brown, E. & Holmes, P. [2001] "Modeling a simple choice task: stochastic dynamics of mutually inhibitory neural groups," Stochast. Dyn. 1, 159-191.
Brunel, N., Chance, F., Fourcaud, N. & Abbott, L. F. [2001] "Effects of synaptic noise and filtering on the frequency response of spiking neurons," Phys. Rev. Lett. 86, 2186-2189.
Chance, F. S., Abbott, L. F. & Reyes, A. D. [2002] "Gain modulation from background synaptic input," Neuron 35, 773-782.
Cho, R., Nystrom, L., Brown, E., Jones, A., Braver, T., Holmes, P. & Cohen, J. D. [2002] "Mechanisms underlying performance dependencies on stimulus history in a two-alternative forced choice task," Cogn. Affect. Behav. Neurosci. 2, 283-299.
Clayton, E., Rajkowski, J., Cohen, J. D. & Aston-Jones, G. [2004] "Decision-related activation of monkey locus coeruleus neurons in a forced choice task," under preparation.
Cohen, J. D., Dunbar, K. & McClelland, J. L. [1990] "On the control of automatic processes: A parallel distributed processing model of the Stroop effect," Psychol. Rev. 97, 332-361.
Cohen, J. D., Servan-Schreiber, D. & McClelland, J. L. [1992] "A parallel distributed processing approach to automaticity," Amer. J. Psychol. 105, 239-269.
Cohen, J. D. & Huston, T. A. [1994] "Progress in the use of interactive models for understanding attention and performance," in Attention and Performance XV, eds. Umilta, C. & Moscovitch, M. (MIT Press, Cambridge), pp. 453-476.
Ermentrout, G. B. [1994] "Reduction of conductance-based models with slow synapses to neural nets," Neur. Comput. 6, 679-695.
Fairhall, A., Lewen, G., Bialek, W. & de Ruyter van Steveninck, R. [2001] "Efficiency and ambiguity in an adaptive neural code," Nature 412, 787-792.
Gardiner, C. W. [1985] Handbook of Stochastic Methods, 2nd edition (Springer, NY).
Gerstner, W. & Kistler, W. [2002] Spiking Neuron Models (Cambridge University Press, Cambridge, UK).
Gilzenrat, M. S., Holmes, B. D., Rajkowski, J., Aston-Jones, G. & Cohen, J. D. [2002] "Simplified dynamics in a model of noradrenergic modulation of cognitive performance," Neural Networks 15, 647-663.
Gold, J. I. & Shadlen, M. N. [2001] "Neural computations that underlie decisions about sensory stimuli," Trends Cogn. Sci. 5, 10-16.
Brown, E., Gilzenrat, M. S. & Cohen, J. D. [2004a] "The Gold, J. I. & Shadlen, M. N. [2002] "Banburismus and
locus coeruleus, adpative gain, and the optimization the brain: Decoding the relationship between sensory
of simple decision tasks," Technical Report #04-02, stimuli, decisions, and reward," Neuron 36, 299-308.
Center for the Study of Mind, Brain, and Behavior, Grossberg, S. [1988] "Nonlinear neural networks:
Princeton University. Principles, mechanisms, and architectures," Neural
Brown, E., Moehlis, J., Holmes, P., Clayton, E., Networks 1, 17-61.
Rajkowski, J. & Aston-Jones, G. [2004b] "The influ- Guckenheimer, J. & Holmes, P. J. [1983] Nonlinear
ence of spike rate and stimulus duration on noradren- Oscillations, Dynamical Systems and Bifurcations of
ergic neurons," J. Comput. Neurosci 17, 5-21. Vector Fields (Springer-Verlag, NY).
130 E. Brown et al.

Hertz, J., Krough, A. & Palmer, R. [1991] Introduction Servan-Schreiber, D., Printz, H. & Cohen, J. D. [1990]
to the Theory of Neural Computation (Perseus Book "A network model of catecholamine effects: Gain,
Group, NY). signal-to-noise ratio, and behavior," Science 249,
Hopfield, J. J. [1984] "Neurons with graded response 892-895.
have collective computational properties like those of Shadlen, M. N. & Newsome, W. T. [2001] "Neural basis
two-state neurons," Proc. Natl. Acad. Sci. USA 82, of a perceptual decision in the parietal cortex (area
3088-3092. LIP) of the rhesus monkey," J. Neurophysiol. 86,
Huk, A., Palmer, J. & Shadlen, M. [2002] "Tempo- 1916-1936.
ral integration of motion energy underlies percep- Shelley, M. & McLaughlin, D. [2002] "Coarse-grained
tual decisions and response times," Annual Society for reduction and analysis of a network model of corti-
Neuroscience Meeting, Orlando, FL, Nov 2-7, 2002, cal response. I. drifting grating stimuli," J. Comput.
Abstract No. 353.5. Neurosci. 12, 97-122.
Knobloch, E. k Weisenfeld, K. A. [1983] "Bifurcations in Shin, J., Koch, C. & Douglas, R. [1999] "Adaptive neu-
fluctuating systems: The center manifold approach," ral coding dependent on the time varying statistics
J. Stat. Phys. 33, 611-637. of the somatic input current," Neural Comput. 11,
Laming, D. R. J. [1968] Information Theory of Choice- 1083-1913.
Reaction Times (Academic Press, NY). Smith, P. L. & Ratcliff, R. [2004] "Psychology and neuro-
Lehmann, E. L. [1959] Testing Statistical Hypotheses biology of simple decisions," Trends in Neurosci. 27,
(John Wiley, NY). 161-168.
McClelland, J. L. [1979] "On the time relations of men- Stone, M. [1960] "Models for choice-reaction time,"
tal processes: An examination of systems of processes Psychometrika 25, 251-260.
in cascade," Psychol. Rev. 86, 287-330. Usher, M., Cohen, J. D., Servan-Schreiber, D.,
Mozer, M. [1998] "A connectionist model of selective Rajkowsky, J. & Aston-Jones, G. [1999] "The role of
attention in visual perception," in Proc. Tenth Ann. locus coeruleus in the regulation of cognitive perfor-
Conf. Cognitive Science Society (Erlbaum, Hillsdale, mance," Science 283, 549-554.
NJ), pp. 195-201. Usher, M. & McClelland, J. L. [2001] "On the time
Omurtag, A., Kaplan, E., Knight, B. W. & Sirovich, L. course of perceptual choice: The leaky competing
[2000] "A population approach to cortical dynamics accumulator model," Psych. Rev. 108, 550-592.
with an application to orientation tuning," Network von Neumann, J. [1958] The Computer and the Brain
11, 247-260. (Yale University Press, New Haven, CT); 2nd edition
Papoulis, A. [1977] Signal Analysis (McGraw-Hill, NY). [2000], with a foreword by Paul and Patricia Church-
Piatt, M. L. & Glimcher, P. W. [2001] "Neural correlates land.
of decision variable in parietal cortex," Nature 400, Wald, A. [1947] Sequential Analysis (John Wiley, NY).
233-238. Wang, X.-J. [2002] "Probabilistic decision making by
Ratcliff, R. [1978] "A theory of memory retrieval," slow reverberation in cortical circuits," Neuron 36,
Psych. Rev. 85, 59-108. 955-968.
Ratcliff, R., Van Zandt, T. & McKoon, G. [1999] "Con- Waterhouse, B., Moises, H. & Woodward, D. [1998]
nectionist and diffusion models of reaction time," "Phasic activation of the locus coeruleus enhances
Psych. Rev. 106, 261-300. responses of primary sensory cortical neurons to
Roitman, J. & Shadlen, M. [2002] "Response of neurons peripheral receptive field stimulation," Brain Res.
in the lateral intraparietal area during a combined 790,33-44.
visual discrimination reaction time task," J. Neurosci. Wickelgren, W. A. [1977] "Speed-accuracy tradeoff and
22, 9475-9489. information processing dynamics," Acta Psychol. 41,
Schall, J. D. [2001] "Neural basis of deciding, choosing, 67-85.
and acting," Nature Rev.: Neurosci. 2, 33-42. Wilson, H. & Cowan, J. [1972] "Excitatory and
Schall, J., Stuphorn, V. & Brown, J. [2002] "Monitoring inhibitory interactions in localized populations of
and control of action by the frontal lobes," Neuron model neurons," Biophys. J. 12, 1-24.
36, 309-322.
NEWTON FLOW AND INTERIOR POINT METHODS
IN LINEAR PROGRAMMING
JEAN-PIERRE DEDIEU
MIP. Département de Mathématique, Université Paul Sabatier,
31062 Toulouse cedex 04, France
MIKE SHUB
Department of Mathematics, University of Toronto,
100 St. George Street, Toronto, Ontario M5S 3G3, Canada

Received January 12, 2004; Revised June 17, 2004

We study the geometry of the central paths of linear programming theory. These paths are the
solution curves of the Newton vector field of the logarithmic barrier function. This vector field
extends to the boundary of the polytope and we study the main properties of this extension:
continuity, analyticity, singularities.

Keywords: Linear programming; interior point method; central path; Newton vector field;
extension.

1. Introduction

In this paper we take up once again the subject of the geometry of the central paths of linear programming theory. We study the boundary behavior of these paths as in [Megiddo & Shub, 1989], but from a different perspective and with a different emphasis. Our main goal will be to give a global picture of the central paths, even for degenerate problems, as solution curves of the Newton vector field, $N(x)$, of the logarithmic barrier function which we describe below. See also [Bayer & Lagarias, 1989a, 1989b, 1991]. The Newton vector field extends to the boundary of the polytope. It has the properties that it is tangent to the boundary and that, restricted to any face of dimension $i$, it has a unique source with unstable manifold dimension equal to $i$, the rest of the orbits tending to the boundary of the face. Every orbit tends either to a vertex or to one of these sources in a face. See Corollary 4.1. This highly cellular structure of the flow lends itself to the conjecture that the total curvature of these central paths may be linearly bounded by the dimension $n$ of the polytope. The orbits may be relatively straight, except for orbits which come close to an orbit in a face of dimension $i$ which itself comes close to a singularity in a boundary face of dimension less than $i$. This orbit is then forced to turn almost parallel to the lower dimensional face, so its tangent vector may be forced to turn as well. See the two figures at the end of this paper. As this process involves a reduction of the dimension of the face, it can happen at most as many times as the dimension of the polytope. So our optimistic conjecture is that the total curvature of a central path is $O(n)$. We have verified the conjecture in an average sense in [Dedieu et al.]. It is not difficult to give an example showing that $O(n)$ is the best possible for the worst case. Such an example is worked out in [Megiddo & Shub, 1989]. The average behavior may however be much better. Ultimately we hope that an understanding of the curvature of the central paths may contribute to the analysis of algorithms which use them. In [Vavasis & Ye, 1996] the authors explore similar structure to give an algorithm whose running time depends only on the polytope.

We prove in Corollary 4.1 that the extended vector field is Lipschitz on the closed polytope.
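The picture described above can be explored numerically. The sketch below is only an illustration under stated assumptions: it uses the formula $N(x) = (A^T D_x^{-2} A)^{-1} A^T D_x^{-1} e$ of Definition 2.2 (Sec. 2), a hypothetical quadrilateral in $\mathbb{R}^2$ that does not come from the paper, and a standard fourth-order Runge-Kutta step. It integrates $\dot{x} = -N(x)$ and checks the property behind Lemma 2.2, namely that the barrier gradient $g$ grows like $e^{s}$ along the orbit, so the orbit is the central path for $c = g(x_0)$. The names `slacks`, `grad_barrier`, `newton_field` and `flow` are ours.

```python
import math

# Hypothetical polytope {x in R^2 : A_i x >= b_i} (not taken from the paper).
A = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0), (-1.0, 0.0)]
b = [0.0, 0.0, -2.0, -1.5]


def slacks(x):
    """Return the values A_i x - b_i (diagonal of D_x)."""
    return [a[0] * x[0] + a[1] * x[1] - bi for a, bi in zip(A, b)]


def grad_barrier(x):
    """g(x) = A^T D_x^{-1} e, the gradient of f(x) = sum_i ln(A_i x - b_i)."""
    s = slacks(x)
    return tuple(sum(a[i] / si for a, si in zip(A, s)) for i in range(2))


def newton_field(x):
    """N(x) = (A^T D_x^{-2} A)^{-1} A^T D_x^{-1} e, solved by the 2x2 inverse."""
    s = slacks(x)
    h00 = sum(a[0] * a[0] / si ** 2 for a, si in zip(A, s))
    h01 = sum(a[0] * a[1] / si ** 2 for a, si in zip(A, s))
    h11 = sum(a[1] * a[1] / si ** 2 for a, si in zip(A, s))
    g0, g1 = grad_barrier(x)
    det = h00 * h11 - h01 * h01
    return ((h11 * g0 - h01 * g1) / det, (h00 * g1 - h01 * g0) / det)


def flow(x0, s_end, steps):
    """Integrate dx/ds = -N(x) from s = 0 to s = s_end with classical RK4."""
    h = s_end / steps
    x = x0

    def rhs(p):
        n = newton_field(p)
        return (-n[0], -n[1])

    for _ in range(steps):
        k1 = rhs(x)
        k2 = rhs((x[0] + 0.5 * h * k1[0], x[1] + 0.5 * h * k1[1]))
        k3 = rhs((x[0] + 0.5 * h * k2[0], x[1] + 0.5 * h * k2[1]))
        k4 = rhs((x[0] + h * k3[0], x[1] + h * k3[1]))
        x = (x[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0,
             x[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0)
    return x


x0 = (0.4, 0.7)            # interior starting point
c = grad_barrier(x0)       # the orbit through x0 is the central path for this c
x3 = flow(x0, 3.0, 300)
g3 = grad_barrier(x3)
ratios = (g3[0] / c[0], g3[1] / c[1])   # both should be close to exp(3)
```

Both entries of `ratios` should be close to `math.exp(3.0)`, and the objective $\langle c, x\rangle$ should have decreased between `x0` and `x3`.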

Under a genericity hypothesis we prove in Theorem 5.1 that it extends to be real analytic on a neighborhood of the polytope. Under the same genericity hypothesis we prove in Theorem 5.2 that the singularities are all hyperbolic. The eigenvalues of $-N(x)$ at the singularities are all $+1$ tangent to the face and $-1$ transversal to the face. In dynamical systems terminology the vector field is Morse-Smale. The vertices are the sinks. Finally, we mention that in order to prove that $N(x)$ always extends continuously to the boundary of the polytope we prove Lemma 4.2, which may be of independent interest, about the continuity of the Moore-Penrose inverse of a family of linear maps of variable rank.

2. The Central Path is a Trajectory of the Newton Vector Field

Linear programming problems are frequently presented in different formats. We will work with one of them here which we find convenient. The polytopes defined in one format are usually affinely equivalent to the polytopes defined in another. So we begin with a discussion of Newton vector fields and how they transform under affine equivalence. This material is quite standard. An excellent source for this fact and linear programming in general is [Renegar, 2001].

Let $Q$ be an affine subspace of $\mathbb{R}^n$ (or a Hilbert space if you prefer, in which case, assume $Q$ is closed). Denote the tangent space of $Q$ by $V$. Suppose that $U$ is an open subset of $Q$. Let $f : U \to \mathbb{R}$ be twice continuously differentiable. The derivative $Df(x)$ belongs to $L(V, \mathbb{R})$, the linear maps from $V$ to $\mathbb{R}$. So $Df(x)$ defines a map from $U$ to $L(V, \mathbb{R})$. The second derivative $D^2 f(x)$ is an element of $L(V, L(V, \mathbb{R}))$. Thus $D^2 f(x)$ is a linear map from a vector space to another isomorphic space and $D^2 f(x)$ may be invertible.

Definition 2.1. If $f$ is as above and $D^2 f(x)$ is invertible we define the Newton vector field, $N_f(x)$, by

$$N_f(x) = -(D^2 f(x))^{-1} Df(x).$$

Note that if $V$ has a nondegenerate inner product $\langle \cdot , \cdot \rangle$ then the gradient of $f$, $\operatorname{grad} f(x) \in V$, and Hessian, $\operatorname{hess} f(x) \in L(V, V)$, are defined by

$$Df(x)u = \langle u, \operatorname{grad} f(x) \rangle$$

and

$$D^2 f(x)(u, v) = \langle u, (\operatorname{hess} f(x)) v \rangle.$$

It follows then that $N_f(x) = -(\operatorname{hess} f(x))^{-1} \operatorname{grad} f(x)$.

Now let $A$ be an affine map from $P$ to $Q$ whose linear part $L$ is an isomorphism. Suppose $U_1$ is open in $P$ and $A(U_1) \subset U$. Let $g = f \circ A$.

Proposition 2.1. $A$ maps the solution curves of $N_g$ to the solution curves of $N_f$.

Proof. By the chain rule $Dg(y) = Df(A(y))L$ and

$$D^2 g(y)(u, v) = D^2 f(A(y))(Lu, Lv).$$

So $u = N_g(y)$ if and only if $D^2 g(y)(u, v) = -Dg(y)(v)$ for all $v$, if and only if $D^2 f(A(y))(Lu, Lv) = -Df(A(y))Lv$ for all $v$, i.e. $N_f(A(y)) = L(u)$ or $L N_g(y) = N_f(A(y))$. This last is the equation expressing that the vector field $N_f$ is the push forward by the map $A$ of the vector field $N_g$, and hence the solution curves of the $N_g$ field are mapped by $A$ to the solution curves of $N_f$. $\square$

Now we make explicit the linear programming format used in this paper, define the central paths and relate them to the Newton vector field of the logarithmic barrier function.

Let $\mathcal{P}$ be a compact polytope in $\mathbb{R}^n$ defined by $m$ affine inequalities

$$A_i x \ge b_i, \quad 1 \le i \le m.$$

Here $A_i x$ denotes the matrix product of the row vector $A_i = (a_{i1}, \ldots, a_{in})$ by the column vector $x = (x_1, \ldots, x_n)^T$, $A$ is the $m \times n$ matrix with rows $A_i$ and we assume $\operatorname{rank} A = n$. Given $c \in \mathbb{R}^n$, we consider the linear programming problem

$$(LP) \quad \min_{\substack{A_i x \ge b_i \\ 1 \le i \le m}} \langle c, x \rangle.$$

Let us denote by

$$f(x) = \sum_{i=1}^{m} \ln(A_i x - b_i)$$

($\ln(s) = -\infty$ when $s \le 0$) the logarithmic barrier function associated with the description $Ax \ge b$ of $\mathcal{P}$. The barrier technique considers the family of
nonlinear convex optimization problems

$$(LP(t)) \quad \min_{x \in \operatorname{Int} \mathcal{P}} \ t\langle c, x \rangle - f(x)$$

with $t > 0$. The objective function

$$f_t(x) = t\langle c, x \rangle - f(x)$$

is strictly convex, smooth, and satisfies

$$\lim_{x \to \partial \mathcal{P}} f_t(x) = \infty.$$

Thus, there exists a unique optimal solution $\gamma(t)$ to $(LP(t))$ for any $t > 0$. This curve is called the central path of our problem. Let us denote as $D_x$ the $m \times m$ diagonal matrix $D_x = \operatorname{Diag}(A_i x - b_i)$. This matrix is nonsingular for any $x \in \operatorname{Int} \mathcal{P}$. We also let

$$e = (1, \ldots, 1)^T \in \mathbb{R}^m,$$

$$g(x) = \operatorname{grad} f(x) = \sum_{i=1}^{m} \frac{A_i^T}{A_i x - b_i} = A^T D_x^{-1} e$$

and

$$h(x) = \operatorname{hess} f(x) = -A^T D_x^{-2} A.$$

Since $f_t$ is smooth and strictly convex the central path is given by the equation $\operatorname{grad} f_t(\gamma(t)) = 0$, i.e.

$$g(\gamma(t)) = tc, \quad t > 0.$$

When $t \to 0$, the limit of $\gamma(t)$ is given by

$$-f(\gamma(0)) = \min_{x \in \mathcal{P}} -f(x).$$

It is called the analytic center of $\mathcal{P}$ and denoted by $c_{\mathcal{P}}$.

Lemma 2.1. $g : \operatorname{Int} \mathcal{P} \to \mathbb{R}^n$ is real analytic and invertible. Its inverse is also real analytic.

Proof. For any $c \in \mathbb{R}^n$ the optimization problem

$$\min_{x \in \mathcal{P}} \ \langle c, x \rangle - f(x)$$

has a unique solution in $\operatorname{Int} \mathcal{P}$ because the objective function is smooth, strictly convex and $\mathcal{P}$ is compact. Thus $g(x) = c$ has a unique solution, that is, $g$ is bijective. We also notice that, for any $x$, $Dg(x)$ is nonsingular. Thus $g^{-1}$ is real analytic by the inverse function theorem. $\square$

According to this lemma, the central path is the inverse image by $g$ of the ray $c\mathbb{R}_+$. When $c$ varies in $\mathbb{R}^n$ we obtain a family of curves. Our aim in this paper is to investigate the structure of this family.

For a subspace $B \subset \mathbb{R}^m$ we denote by $\Pi_B$ the orthogonal projection $\mathbb{R}^m \to B$. Let $b_1, \ldots, b_r$ be a basis of $B$ and let us also denote by $B$ the $m \times r$ matrix whose columns are the vectors $b_j$. Then $\Pi_B$, written with the same subscript whether $B$ stands for the subspace or for the matrix, is given by $\Pi_B = B(B^T B)^{-1} B^T = BB^{\dagger}$ ($B^{\dagger}$ is the generalized inverse of $B$, equal to $(B^T B)^{-1} B^T$ because $B$ is injective).

Definition 2.2. The Newton vector field associated with $g$ is

$$N(x) = -Dg(x)^{-1} g(x) = (A^T D_x^{-2} A)^{-1} A^T D_x^{-1} e = A^{\dagger} D_x \Pi_{D_x^{-1} A}\, e.$$

It is defined and analytic on $\operatorname{Int} \mathcal{P}$.

Note that the expression $A^{\dagger} D_x \Pi_{D_x^{-1} A}\, e$ is defined for all $x \in \mathbb{R}^n$ for which $A_i x - b_i$ is not equal to 0 for all $i$. Thus $N(x)$ is defined by the rational expression in Definition 2.2 for almost all $x \in \mathbb{R}^n$. Later we will prove that this rational expression has a continuous extension to all $\mathbb{R}^n$.

Lemma 2.2. The central paths $\gamma(t)$, $c \in \mathbb{R}^n$, are the trajectories of the vector field $-N(x)$.

Proof. A central path is given by

$$g(\gamma(t)) = tc, \quad t > 0,$$

for a given $c \in \mathbb{R}^n$. Let us change variable: $t = \exp s$ and $d(s) = \gamma(t)$ with $s \in \mathbb{R}$. Then

$$g(d(s)) = \exp(s)\, c, \quad s \in \mathbb{R},$$

so that

$$\frac{d}{ds}\, g(d(s)) = \exp(s)\, c = g(d(s)).$$

Let us denote $\dot{d}(s) = (d/ds)\, d(s)$. We have

$$\frac{d}{ds}\, g(d(s)) = Dg(d(s))\, \dot{d}(s)$$

thus

$$\dot{d}(s) = Dg(d(s))^{-1} g(d(s)) = -N(d(s))$$

and $d(s)$ is a trajectory of the Newton vector field. Conversely, if $\dot{d}(s) = -N(d(s)) = Dg(d(s))^{-1} g(d(s))$, $s \in \mathbb{R}$, then

$$\frac{d}{ds}\, g(d(s)) = Dg(d(s))\, \dot{d}(s) = g(d(s))$$
so that

$$g(d(s)) = \exp(s)\, g(d(0))$$

which is the central path related to $c = g(d(0))$. $\square$

Remark 2.1. The trajectories of $N(x)$ and $-N(x)$ are the same with time reversed. As $t \to \infty$, $\gamma(t)$ tends to the optimal points of the linear programming problem. So we are interested in the positive time trajectories of $-N(x)$.

Lemma 2.3. The analytic center $c_{\mathcal{P}}$ is the unique singular point of the Newton vector field $N(x)$, $x \in \operatorname{Int} \mathcal{P}$.

Proof. $N(x) = 0$ if and only if $g(x) = 0$, that is $x = c_{\mathcal{P}}$. $\square$

3. An Analytic Expression for the Newton Vector Field

In this section we compute an analytic expression for $N(x)$ which will be useful later. For any subset $K_n \subset \{1, \ldots, m\}$, $K_n = \{k_1 < \cdots < k_n\}$, we denote by $A_{K_n}$ the $n \times n$ submatrix of $A$ with rows $A_{k_1}, \ldots, A_{k_n}$, by $b_{K_n}$ the vector in $\mathbb{R}^n$ with coordinates $b_{k_1}, \ldots, b_{k_n}$, and by $u_{K_n}$ the unique solution of the system $A_{K_n} u_{K_n} = b_{K_n}$ when the matrix $A_{K_n}$ is nonsingular. With these notations we have:

Proposition 3.1. For any $x \in \operatorname{Int} \mathcal{P}$,

$$N(x) = \frac{\displaystyle \sum_{\substack{K_n \subset \{1, \ldots, m\} \\ \det A_{K_n} \ne 0}} (x - u_{K_n}) (\det A_{K_n})^2 \prod_{l \notin K_n} (A_l x - b_l)^2}{\displaystyle \sum_{\substack{K_n \subset \{1, \ldots, m\} \\ \det A_{K_n} \ne 0}} (\det A_{K_n})^2 \prod_{l \notin K_n} (A_l x - b_l)^2}.$$

Proof. Let us denote $\Pi = \prod_{i=1}^{m} (A_i x - b_i)$ and $\Pi_k = \prod_{i \ne k} (A_i x - b_i)$. We already know (Definition 2.2) that $N(x) = (A^T D_x^{-2} A)^{-1} A^T D_x^{-1} e$ with

$$(A^T D_x^{-2} A)_{ij} = \sum_{k=1}^{m} \frac{a_{ki} a_{kj}}{(A_k x - b_k)^2} = \frac{1}{\Pi^2} X_{ij}$$

where $X$ is the $n \times n$ matrix given by $X_{ij} = \sum_{k=1}^{m} a_{ki} a_{kj} \Pi_k^2$. Moreover

$$(A^T D_x^{-1} e)_i = \sum_{k=1}^{m} \frac{a_{ki}}{A_k x - b_k} = \frac{1}{\Pi} V_i$$

where $V$ is the $n$ vector given by $V_i = \sum_{k=1}^{m} a_{ki} \Pi_k$. This gives

$$N(x) = \Pi\, X^{-1} V.$$

To compute $X^{-1}$ we use Cramer's formula: $X^{-1} = \operatorname{cof}(X)^T / \det(X)$ where $\operatorname{cof}(X)$ denotes the matrix of cofactors: $\operatorname{cof}(X)_{ij} = (-1)^{i+j} \det(X^{ij})$ with $X^{ij}$ the $(n-1) \times (n-1)$ matrix obtained by deleting in $X$ the $i$th row and $j$th column. We first compute $\det X$. We have

$$\det X = \sum_{\sigma \in S_n} \varepsilon(\sigma)\, X_{1\sigma(1)} \cdots X_{n\sigma(n)}$$

where $S_n$ is the group of permutations of $\{1, \ldots, n\}$ and $\varepsilon(\sigma)$ the signature of $\sigma$. Thus

$$\det X = \sum_{\sigma \in S_n} \varepsilon(\sigma) \prod_{j=1}^{n} \sum_{k_j = 1}^{m} a_{k_j j}\, a_{k_j \sigma(j)}\, \Pi_{k_j}^2$$

$$= \sum_{\substack{1 \le k_j \le m \\ 1 \le j \le n}} \Pi_{k_1}^2 \cdots \Pi_{k_n}^2\; a_{k_1 1} \cdots a_{k_n n} \sum_{\sigma \in S_n} \varepsilon(\sigma)\, a_{k_1 \sigma(1)} \cdots a_{k_n \sigma(n)}$$

$$= \sum_{\substack{1 \le k_j \le m \\ 1 \le j \le n}} \Pi_{k_1}^2 \cdots \Pi_{k_n}^2\; a_{k_1 1} \cdots a_{k_n n}\, \det A_{k_1 \ldots k_n}$$
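where $A_{k_1 \ldots k_n}$ is the matrix with rows $A_{k_1}, \ldots, A_{k_n}$. When two or more indices $k_j$ are equal the corresponding coefficient $\det A_{k_1 \ldots k_n}$ is zero. For this reason, instead of this sum taken for $n$ independent indices $k_j$ we consider a set $K_n \subset \{1, \ldots, m\}$, $K_n = \{k_1 < \cdots < k_n\}$, and all the possible permutations $\sigma \in S(K_n)$. We obtain

$$\det X = \sum_{K_n \subset \{1, \ldots, m\}} \sum_{\sigma \in S(K_n)} \Pi_{\sigma(k_1)}^2 \cdots \Pi_{\sigma(k_n)}^2\; a_{\sigma(k_1) 1} \cdots a_{\sigma(k_n) n}\, \det A_{\sigma(k_1) \ldots \sigma(k_n)}$$

$$= \sum_{K_n \subset \{1, \ldots, m\}} \Pi_{k_1}^2 \cdots \Pi_{k_n}^2 \sum_{\sigma \in S(K_n)} \varepsilon(\sigma)\, a_{\sigma(k_1) 1} \cdots a_{\sigma(k_n) n}\, \det A_{k_1 \ldots k_n}$$

$$= \sum_{K_n \subset \{1, \ldots, m\}} \Pi_{k_1}^2 \cdots \Pi_{k_n}^2\, (\det A_{K_n})^2.$$

Note that, for any $l = 1, \ldots, m$, the product $\Pi_{k_1}^2 \cdots \Pi_{k_n}^2$ contains $(A_l x - b_l)^{2n}$ if $l \notin K_n$ and $(A_l x - b_l)^{2n-2}$ otherwise. For this reason

$$\det X = \Pi^{2n-2} \sum_{K_n \subset \{1, \ldots, m\}} (\det A_{K_n})^2 \prod_{l \notin K_n} (A_l x - b_l)^2.$$

Let us now compute $Y = \operatorname{cof}(X)^T V$. We have

$$Y_i = \sum_{j=1}^{n} \operatorname{cof}(X)_{ji} V_j = \sum_{j=1}^{n} (-1)^{i+j} \det(X^{ji}) \sum_{k=1}^{m} a_{kj} \Pi_k = \sum_{k=1}^{m} \Pi_k \sum_{j=1}^{n} (-1)^{i+j} \det(X^{ij})\, a_{kj}$$

because $X$ is symmetric. This last sum is the determinant of the matrix with rows $X_1, \ldots, X_{i-1}, A_k, X_{i+1}, \ldots, X_n$ so that

$$Y_i = \sum_{k=1}^{m} \Pi_k \sum_{\sigma \in S_n} \varepsilon(\sigma)\, X_{1\sigma(1)} \cdots X_{i-1\,\sigma(i-1)}\; a_{k\sigma(i)}\; X_{i+1\,\sigma(i+1)} \cdots X_{n\sigma(n)}$$

$$= \sum_{k=1}^{m} \Pi_k \sum_{\sigma \in S_n} \varepsilon(\sigma)\, a_{k\sigma(i)} \prod_{j \ne i} \sum_{k_j = 1}^{m} a_{k_j j}\, a_{k_j \sigma(j)}\, \Pi_{k_j}^2$$

$$= \sum_{k=1}^{m} \Pi_k \sum_{\substack{1 \le k_j \le m \\ 1 \le j \le n,\ j \ne i}} \Big( \prod_{j \ne i} a_{k_j j}\, \Pi_{k_j}^2 \Big) \sum_{\sigma \in S_n} \varepsilon(\sigma)\, a_{k_1 \sigma(1)} \cdots a_{k \sigma(i)} \cdots a_{k_n \sigma(n)}$$

which gives

$$Y_i = \sum_{k=1}^{m} \Pi_k \sum_{\substack{1 \le k_j \le m \\ 1 \le j \le n,\ j \ne i}} \Big( \prod_{j \ne i} a_{k_j j}\, \Pi_{k_j}^2 \Big) \det A_{k_1 \ldots k_{i-1}\, k\, k_{i+1} \ldots k_n}.$$

By a similar argument as before we sum up for any set with $n - 1$ elements $K_{n-1} \subset \{1, \ldots, m\}$, $K_{n-1} = \{k_1 < \cdots < k_{i-1} < k_{i+1} < \cdots < k_n\}$, and for any permutation $\sigma \in S(K_{n-1})$. We obtain as previously

$$Y_i = \sum_{k=1}^{m} \Pi_k \sum_{K_{n-1} \subset \{1, \ldots, m\}} \Big( \prod_{j \ne i} \Pi_{k_j}^2 \Big) \det A^{i}_{k_1 \ldots k_{i-1} k_{i+1} \ldots k_n}\; \det A_{k_1 \ldots k_{i-1}\, k\, k_{i+1} \ldots k_n}$$

with $A^{i}_{k_1 \ldots k_{i-1} k_{i+1} \ldots k_n}$ the matrix with rows $A_{k_j}$, $k_j \in K_{n-1}$, and the $i$th column removed. The quantity $A_l x - b_l$ appears in the product $\Pi_k \prod_{j \ne i} \Pi_{k_j}^2$ with an exponent equal to

• $2n - 1$ when $l \ne k$ and $l \notin K_{n-1}$,
• $2n - 2$ when $l = k$ and $l \notin K_{n-1}$,
• $2n - 3$ when $l \ne k$ and $l \in K_{n-1}$,
• $2n - 4$ when $l = k$ and $l \in K_{n-1}$.

In this latter case, two rows of the matrix $A_{k_1 \ldots k_{i-1}\, k\, k_{i+1} \ldots k_n}$ are equal and its determinant is zero. Thus, each term $A_l x - b_l$ appears at least $2n - 3$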
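times so that

$$Y_i = \Pi^{2n-3} \sum_{k=1}^{m} \sum_{K_{n-1}} (A_k x - b_k) \prod_{\substack{l \ne k \\ l \notin K_{n-1}}} (A_l x - b_l)^2\; \det A^{i}_{k_1 \ldots k_{i-1} k_{i+1} \ldots k_n}\; \det A_{k_1 \ldots k_{i-1}\, k\, k_{i+1} \ldots k_n}.$$

The $i$th component of the Newton vector field is equal to $N(x)_i = \Pi\, Y_i / \det X$ so that

$$N(x)_i = \frac{\displaystyle \sum_{k=1}^{m} \sum_{K_{n-1}} (A_k x - b_k) \prod_{\substack{l \ne k \\ l \notin K_{n-1}}} (A_l x - b_l)^2\; \det A^{i}_{k_1 \ldots k_{i-1} k_{i+1} \ldots k_n}\; \det A_{k_1 \ldots k_{i-1}\, k\, k_{i+1} \ldots k_n}}{\displaystyle \sum_{K_n} (\det A_{K_n})^2 \prod_{l \notin K_n} (A_l x - b_l)^2}.$$

Instead of a sum taken for $k$ and $K_{n-1}$ in the numerator we use a subset $K_n \subset \{1, \ldots, m\}$ equal to the union of $k$ and $K_{n-1}$. Notice that $\det A_{k_1 \ldots k_{i-1}\, k\, k_{i+1} \ldots k_n} = 0$ when $k \in K_{n-1}$ so that this case is not considered. Conversely, for a given $K_n = \{k_1, \ldots, k_n\}$, we can write it in $n$ different ways as a union of $k = k_j$ and $K_{n-1} = K_n \setminus \{k_j\}$. For these reasons we get

$$N(x)_i = \frac{\displaystyle \sum_{K_n} \Big( \sum_{j=1}^{n} (A_{k_j} x - b_{k_j}) \det A^{j,i}_{K_n}\, \det A_{K_n, i, j} \Big) \prod_{l \notin K_n} (A_l x - b_l)^2}{\displaystyle \sum_{K_n} (\det A_{K_n})^2 \prod_{l \notin K_n} (A_l x - b_l)^2}$$

with $A^{j,i}_{K_n}$ the matrix obtained from $A_{K_n}$ by deleting the $j$th row and $i$th column, and $A_{K_n, i, j}$ obtained from $A_{K_n}$ by removing the line $A_{k_j}$ and reinserting it as the $i$th line, the other lines remaining with the same ordering. Note that $\det A_{K_n, i, j} = (-1)^{i+j} \det A_{K_n}$, thus

$$N(x)_i = \frac{\displaystyle \sum_{K_n} \Big( \sum_{j=1}^{n} (A_{k_j} x - b_{k_j}) (-1)^{i+j} \det A^{j,i}_{K_n}\, \det A_{K_n} \Big) \prod_{l \notin K_n} (A_l x - b_l)^2}{\displaystyle \sum_{K_n} (\det A_{K_n})^2 \prod_{l \notin K_n} (A_l x - b_l)^2}.$$

In fact this sum is taken for the sets $K_n$ such that $A_{K_n}$ is nonsingular; otherwise, the coefficient $\det A_{K_n}$ vanishes and the corresponding term is zero.

According to Cramer's formulas, the expression $(-1)^{i+j} \det A^{j,i}_{K_n} / \det A_{K_n}$ is equal to $(A_{K_n}^{-1})_{ij}$. Thus

$$\sum_{j=1}^{n} (A_{k_j} x - b_{k_j}) (-1)^{i+j} \frac{\det A^{j,i}_{K_n}}{\det A_{K_n}} = \big( A_{K_n}^{-1} (A_{K_n} x - b_{K_n}) \big)_i = x_i - (A_{K_n}^{-1} b_{K_n})_i = x_i - u_{K_n, i}.$$

We get

$$N(x)_i = \frac{\displaystyle \sum_{K_n} (x_i - u_{K_n, i}) (\det A_{K_n})^2 \prod_{l \notin K_n} (A_l x - b_l)^2}{\displaystyle \sum_{K_n} (\det A_{K_n})^2 \prod_{l \notin K_n} (A_l x - b_l)^2}$$

and we are done. $\square$

4. Extension to the Faces of $\mathcal{P}$

Our aim is to extend the Newton vector field, defined in the interior of $\mathcal{P}$, to its different faces.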
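Before turning to the faces, the formula of Proposition 3.1 can be checked numerically, and evaluating it at boundary points gives a first look at the extension studied in this section. The sketch below is only an illustration under stated assumptions: the quadrilateral `(A, b)` in $\mathbb{R}^2$ is a hypothetical example that does not come from the paper, the function names are ours, and everything is restricted to $n = 2$ so that plain Python with 2×2 Cramer solves suffices. It compares the formula of Definition 2.2 with that of Proposition 3.1 at an interior point, then evaluates the latter at a vertex and at a point on an edge.

```python
from itertools import combinations

# Hypothetical polytope {x in R^2 : A_i x >= b_i} (not taken from the paper).
A = [(1.0, 0.0), (0.0, 1.0), (-1.0, -1.0), (-1.0, 0.0)]
b = [0.0, 0.0, -2.0, -1.5]


def slacks(x):
    """Return the values A_i x - b_i."""
    return [a[0] * x[0] + a[1] * x[1] - bi for a, bi in zip(A, b)]


def solve2(M, r):
    """Solve the 2x2 system M z = r by Cramer's rule (M must be nonsingular)."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return ((r[0] * M[1][1] - M[0][1] * r[1]) / det,
            (M[0][0] * r[1] - r[0] * M[1][0]) / det)


def newton_definition(x):
    """N(x) = (A^T D^-2 A)^-1 A^T D^-1 e; only valid when all slacks are nonzero."""
    s = slacks(x)
    H = [[sum(a[i] * a[j] / si ** 2 for a, si in zip(A, s)) for j in range(2)]
         for i in range(2)]
    g = [sum(a[i] / si for a, si in zip(A, s)) for i in range(2)]
    return solve2(H, g)


def newton_prop31(x):
    """Rational expression of Proposition 3.1 (sum over pairs K of row indices)."""
    s = slacks(x)
    num = [0.0, 0.0]
    den = 0.0
    for K in combinations(range(len(A)), 2):
        M = [A[K[0]], A[K[1]]]
        det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
        if abs(det) < 1e-14:          # skip singular submatrices A_K
            continue
        u = solve2(M, (b[K[0]], b[K[1]]))   # u_K solves A_K u = b_K
        w = det ** 2
        for l in range(len(A)):
            if l not in K:
                w *= s[l] ** 2
        num[0] += (x[0] - u[0]) * w
        num[1] += (x[1] - u[1]) * w
        den += w
    return (num[0] / den, num[1] / den)


interior = (0.4, 0.7)
n_def = newton_definition(interior)
n_p31 = newton_prop31(interior)       # should agree with n_def
at_vertex = newton_prop31((0.0, 0.0)) # vertex where rows 0 and 1 are active
on_edge = newton_prop31((0.5, 0.0))   # point on the edge x_2 = 0
```

On this example the two formulas should agree in the interior, the value at the vertex is zero, and the value on the edge $x_2 = 0$ has vanishing second component, in line with the tangency to the boundary described in the Introduction. A hand computation with the one-dimensional barrier $\ln x_1 + \ln(2 - x_1) + \ln(1.5 - x_1)$ at $x_1 = 0.5$ gives $3/49$ for the first component.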
Let $\mathcal{P}_I$ be the face of $\mathcal{P}$ defined by

$$\mathcal{P}_I = \{x \in \mathbb{R}^n : A_i x = b_i \text{ for any } i \in I \text{ and } A_i x \ge b_i \text{ for any } i \in J\}.$$

Here $I$ is a subset of $\{1, 2, \ldots, m\}$ containing $m_I$ integers, $J = \{1, 2, \ldots, m\} \setminus I$ and $m_J = m - m_I$.

Definition 4.1. The face $\mathcal{P}_I$ is regularly described when the relative interior of the face is given by

$$\text{ri-}\mathcal{P}_I = \{x \in \mathbb{R}^n : A_i x = b_i \text{ for any } i \in I \text{ and } A_i x > b_i \text{ for any } i \in J\}.$$

The polytope is regularly described when all its faces have this property.

We assume here that $\mathcal{P}$ is regularly described. This definition avoids, for example, in the description of a $\mathcal{P}_I$ a hyperplane defined by two inequalities: $A_i x \ge b_i$ and $A_i x \le b_i$ instead of $A_i x = b_i$. Note that every face of a regularly described $\mathcal{P}$ has a unique regular description: the set $I$ consists of all indices $i$ such that $A_i x = b_i$ on the face. The affine hull of $\mathcal{P}_I$ is denoted by

$$F_I = \{x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n : A_i x = b_i \text{ for any } i \in I\}$$

which is parallel to the vector subspace

$$G_I = \{x = (x_1, \ldots, x_n)^T \in \mathbb{R}^n : A_i x = 0 \text{ for any } i \in I\}.$$

We also let

$$E_J = \{y = (y_1, \ldots, y_m)^T \in \mathbb{R}^m : y_i = 0 \text{ for any } i \in I\}.$$

$E_I$ is defined similarly.

Let us denote by $A_J$ (resp. $A_I$) the $m_J \times n$ (resp. $m_I \times n$) matrix whose $i$th row is $A_i$, $i \in J$ (resp. $i \in I$). $A_J$ defines a linear operator $A_J : \mathbb{R}^n \to \mathbb{R}^{m_J}$. We also let

$$b_J : G_I \to \mathbb{R}^{m_J}, \quad b_J = A_J|_{G_I}$$

so that

$$b_J^T : \mathbb{R}^{m_J} \to G_I, \quad b_J^T = \Pi_{G_I} A_J^T.$$

Here, for a vector subspace $E$, $\Pi_E$ denotes the orthogonal projection onto $E$. Let $D_{x,J}$ (resp. $D_{x,I}$) be the diagonal matrix with diagonal entries $A_i x - b_i$, $i \in J$ (resp. $i \in I$). It defines a linear operator $D_{x,J} : \mathbb{R}^{m_J} \to \mathbb{R}^{m_J}$.

Since the faces of the polytope are regularly described, for any $x \in \text{ri-}\mathcal{P}_I$, $D_{x,J}$ is nonsingular.

$\mathcal{P}_I$ is associated with the linear program

$$(LP_I) \quad \min_{x \in \mathcal{P}_I} \langle c, x \rangle.$$

The barrier function

$$f_I(x) = \sum_{i \in J} \ln(A_i x - b_i)$$

is defined for any $x \in F_I$ and finite in $\text{ri-}\mathcal{P}_I$, the relative interior of $\mathcal{P}_I$. The barrier technique considers the family of nonlinear convex optimization problems $(LP_I(t))$

$$\min_{x \in F_I} \ t\langle c, x \rangle - f_I(x)$$

with $t > 0$. The objective function

$$f_{t,I}(x) = t\langle c, x \rangle - f_I(x)$$

is smooth, strictly convex and

$$\lim_{x \to \partial \mathcal{P}_I} f_{t,I}(x) = \infty,$$

thus $(LP_I(t))$ has a unique solution $\gamma_I(t) \in \text{ri-}\mathcal{P}_I$ given by

$$Df_{t,I}(\gamma_I(t)) = 0.$$

For any $x \in \text{ri-}\mathcal{P}_I$, the first derivative of $f_I$ is given by

$$Df_I(x)u = \sum_{i \in J} \frac{A_i u}{A_i x - b_i} = \langle A_J^T D_{x,J}^{-1} e_J, u \rangle$$

with $u \in G_I$ and $e_J = (1, \ldots, 1)^T \in \mathbb{R}^{m_J}$. We have

$$g_I(x) = \operatorname{grad} f_I(x) = \Pi_{G_I} A_J^T D_{x,J}^{-1} e_J = b_J^T D_{x,J}^{-1} e_J.$$

The second derivative of $f_I$ at $x \in \text{ri-}\mathcal{P}_I$ is given by

$$D^2 f_I(x)(u, v) = -\sum_{i \in J} \frac{(A_i u)(A_i v)}{(A_i x - b_i)^2}$$
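for any $u, v \in G_I$ so that

$$Dg_I(x) = \operatorname{hess} f_I(x) = -b_J^T D_{x,J}^{-2} b_J.$$

To $\mathcal{P}_I$ we associate the Newton vector field given by

$$N_I(x) = -Dg_I(x)^{-1} g_I(x), \quad x \in \text{ri-}\mathcal{P}_I.$$

We have:

Lemma 4.1. For any $x \in \text{ri-}\mathcal{P}_I$ this vector field is defined and

$$N_I(x) = (b_J^T D_{x,J}^{-2} b_J)^{-1} b_J^T D_{x,J}^{-1} e_J = b_J^{\dagger} D_{x,J}\, \Pi_{\operatorname{im}(D_{x,J}^{-1} b_J)}\, e_J \in G_I.$$

Proof. We first have to prove that $Dg_I(x)$ is nonsingular and that $N_I(x) \in G_I$. This second point is clear. For the first, we take $u \in G_I$ such that $Dg_I(x)u = 0$. This gives $A_J u = b_J u = 0$ which implies $Au = 0$ because $u \in G_I$, that is $A_I u = 0$. Since $A$ is injective we get $u = 0$. By the same argument we see that $b_J$ is injective so that $b_J^T b_J$ is nonsingular. The first expression for $N_I(x)$ comes from the description of $g_I$ and $Dg_I$. We have

$$N_I(x) = (b_J^T D_{x,J}^{-2} b_J)^{-1} b_J^T D_{x,J}^{-1} e_J$$

$$= (b_J^T b_J)^{-1} b_J^T D_{x,J}\; D_{x,J}^{-1} b_J (b_J^T D_{x,J}^{-2} b_J)^{-1} b_J^T D_{x,J}^{-1}\, e_J$$

$$= b_J^{\dagger} D_{x,J}\, \Pi_{\operatorname{im}(D_{x,J}^{-1} b_J)}\, e_J \in G_I.$$

The curve $\gamma_I(t)$, $0 < t < \infty$, is the central path of the face $\mathcal{P}_I$. It is given by

$$\gamma_I(t) \in F_I \quad \text{and} \quad Df_I(\gamma_I(t)) - tc = 0$$

that is

$$x \in F_I, \quad A_J^T D_{x,J}^{-1} e_J - tc \in G_I^{\perp} \quad \text{and} \quad \gamma_I(t) = x$$

or, projecting on $G_I$,

$$A_i x = b_i, \ i \in I, \quad b_J^T D_{x,J}^{-1} e_J - t\,\Pi_{G_I} c = 0 \quad \text{and} \quad \gamma_I(t) = x.$$

When $t \to 0$, $\gamma_I(t)$ tends to the analytic center $\gamma_I(0)$ of $\mathcal{P}_I$ defined as the unique solution of the convex program

$$-f_I(\gamma_I(0)) = \min_{x \in \mathcal{P}_I} -f_I(x).$$

The analytic center is also given by

$$A_i x = b_i, \ i \in I, \quad b_J^T D_{x,J}^{-1} e_J = 0 \quad \text{and} \quad \gamma_I(0) = x$$

so that $\gamma_I(0)$ is the unique singular point of $N_I$ in the face $\mathcal{P}_I$.

We now investigate the properties of this extended vector field: continuity, differentiability and so on. We shall investigate the following abstract problem: for any $y \in \mathbb{R}^m$ we consider the linear operator

$$D_y : \mathbb{R}^m \to \mathbb{R}^m$$

given by the $m \times m$ diagonal matrix $D_y = \operatorname{Diag}(y_i)$. Let $P$ be a vector subspace in $\mathbb{R}^m$. Then, for any $y \in \mathbb{R}^m$ with nonzero coordinates, the operator

$$D_y \circ \Pi_{D_y^{-1}(P)} : \mathbb{R}^m \to \mathbb{R}^m$$

is well defined. Can we extend its definition to any $y \in \mathbb{R}^m$? The answer is yes and is proved in the following

Lemma 4.2. Let $\bar{y} \in E_J$ be such that $\bar{y}_i \ne 0$ for any $i \in J$. Then $D_{\bar{y}}|_{E_J} : E_J \to E_J$ is nonsingular and

$$\lim_{y \to \bar{y}} D_y \circ \Pi_{D_y^{-1}(P)} = D_{\bar{y}}|_{E_J} \circ \Pi_{(D_{\bar{y}}|_{E_J})^{-1}(P \cap E_J)}.$$

Proof. To prove this lemma we suppose that $I = \{1, 2, \ldots, m_1\}$ and $J = \{m_1 + 1, \ldots, m_1 + m_2 = m\}$. Let us denote $p = \dim P$. $P$ is identified with an $m \times p$ matrix with $\operatorname{rank} P = p$. We also introduce the following matrices:

$$D_y = \begin{pmatrix} D_{y,1} & 0 \\ 0 & D_{y,2} \end{pmatrix}, \quad P = \begin{pmatrix} U & 0 \\ V & W \end{pmatrix}.$$

The different blocks appearing in these two matrices have the following dimensions: $D_{y,1}: m_1 \times m_1$, $D_{y,2}: m_2 \times m_2$, $U: m_1 \times p_1$, $V: m_2 \times p_1$, $W: m_2 \times p_2$. We also suppose that the columns of $\begin{pmatrix} 0 \\ W \end{pmatrix}$ are a basis for $P \cap E_J$ and those of $\begin{pmatrix} U \\ V \end{pmatrix}$ a basis of the orthogonal complement of $P \cap E_J$ in $P$, that is $(P \cap E_J)^{\perp} \cap P$. Let us notice that $p_2 \le m_2$ and $\operatorname{rank} W = p_2$, and also that $p_1 \le m_1$ and $\operatorname{rank} U = p_1$. Let us prove this last assertion. Let $U_i$, $1 \le i \le p_1$, be the columns of $U$. If $\alpha_1 U_1 + \cdots + \alpha_{p_1} U_{p_1} = 0$, we have

$$\alpha_1 \begin{pmatrix} U_1 \\ V_1 \end{pmatrix} + \cdots + \alpha_{p_1} \begin{pmatrix} U_{p_1} \\ V_{p_1} \end{pmatrix} = \begin{pmatrix} 0 \\ \alpha_1 V_1 + \cdots + \alpha_{p_1} V_{p_1} \end{pmatrix}.$$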
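The left-hand side of this equation is in $(P \cap E_J)^{\perp} \cap P$ and the right-hand side in $P \cap E_J$. Thus this vector is equal to 0 and since $\operatorname{rank} P = p$ we get $\alpha_1 = \cdots = \alpha_{p_1} = 0$.

For every subspace $X$ in $\mathbb{R}^m$ with $\dim X = p$, identified with an $m \times p$ rank $p$ matrix, we have

$$\Pi_X = X (X^T X)^{-1} X^T.$$

This gives here

$$\Pi_{D_y^{-1} P} = \begin{pmatrix} D_{y,1}^{-1} U & 0 \\ D_{y,2}^{-1} V & D_{y,2}^{-1} W \end{pmatrix} \begin{pmatrix} U^T D_{y,1}^{-2} U + V^T D_{y,2}^{-2} V & V^T D_{y,2}^{-2} W \\ W^T D_{y,2}^{-2} V & W^T D_{y,2}^{-2} W \end{pmatrix}^{-1} \begin{pmatrix} U^T D_{y,1}^{-1} & V^T D_{y,2}^{-1} \\ 0 & W^T D_{y,2}^{-1} \end{pmatrix}$$

and

$$\Pi_{E_J} = \begin{pmatrix} 0 & 0 \\ 0 & I_{m_2} \end{pmatrix}.$$

We also notice that

$$D_y \Pi_{D_y^{-1} P} = D_y \Pi_{E_I} \Pi_{D_y^{-1} P} + D_y \Pi_{E_J} \Pi_{D_y^{-1} P}.$$

We have

$$\lim_{y \to \bar{y}} D_y \Pi_{E_I} \Pi_{D_y^{-1} P} = 0.$$

This is a consequence of the two following facts:

$$\| \Pi_{E_I} \Pi_{D_y^{-1} P} \| \le 1$$

because it is the product of two orthogonal projections, and

$$\lim_{y \to \bar{y}} D_y \Pi_{E_I} = D_{\bar{y}} \Pi_{E_I} = 0.$$

We now have to study the limit

$$\lim_{y \to \bar{y}} D_y \Pi_{E_J} \Pi_{D_y^{-1} P}.$$

Let us denote $A = U^T D_{y,1}^{-2} U$. The following identities hold:

$$D_y \Pi_{E_J} \Pi_{D_y^{-1} P} = \begin{pmatrix} D_{y,1} & 0 \\ 0 & D_{y,2} \end{pmatrix} \begin{pmatrix} 0 & 0 \\ 0 & I_{m_2} \end{pmatrix} \begin{pmatrix} D_{y,1}^{-1} U & 0 \\ D_{y,2}^{-1} V & D_{y,2}^{-1} W \end{pmatrix} \begin{pmatrix} U^T D_{y,1}^{-2} U + V^T D_{y,2}^{-2} V & V^T D_{y,2}^{-2} W \\ W^T D_{y,2}^{-2} V & W^T D_{y,2}^{-2} W \end{pmatrix}^{-1} \begin{pmatrix} U^T D_{y,1}^{-1} & V^T D_{y,2}^{-1} \\ 0 & W^T D_{y,2}^{-1} \end{pmatrix}$$

$$= \begin{pmatrix} 0 & 0 \\ V & W \end{pmatrix} \left[ \begin{pmatrix} A & 0 \\ 0 & I_{p_2} \end{pmatrix} \begin{pmatrix} I_{p_1} + A^{-1}(V^T D_{y,2}^{-2} V) & A^{-1}(V^T D_{y,2}^{-2} W) \\ W^T D_{y,2}^{-2} V & W^T D_{y,2}^{-2} W \end{pmatrix} \right]^{-1} \begin{pmatrix} U^T D_{y,1}^{-1} & V^T D_{y,2}^{-1} \\ 0 & W^T D_{y,2}^{-1} \end{pmatrix}$$

$$= \begin{pmatrix} 0 & 0 \\ V & W \end{pmatrix} \begin{pmatrix} I_{p_1} + A^{-1}(V^T D_{y,2}^{-2} V) & A^{-1}(V^T D_{y,2}^{-2} W) \\ W^T D_{y,2}^{-2} V & W^T D_{y,2}^{-2} W \end{pmatrix}^{-1} \begin{pmatrix} A^{-1}(U^T D_{y,1}^{-1}) & A^{-1}(V^T D_{y,2}^{-1}) \\ 0 & W^T D_{y,2}^{-1} \end{pmatrix}.$$

We will prove later that

$$\lim A^{-1} = \lim A^{-1} (U^T D_{y,1}^{-1}) = 0$$

when $y \to \bar{y}$. Since

$$\lim_{y \to \bar{y}} D_{y,2} = D_{\bar{y},2}$$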
is a nonsingular matrix we get

$$\lim_{y \to \bar{y}} D_y \Pi_{E_J} \Pi_{D_y^{-1} P} = \begin{pmatrix} 0 & 0 \\ V & W \end{pmatrix} \begin{pmatrix} I_{p_1} & 0 \\ W^T D_{\bar{y},2}^{-2} V & W^T D_{\bar{y},2}^{-2} W \end{pmatrix}^{-1} \begin{pmatrix} 0 & 0 \\ 0 & W^T D_{\bar{y},2}^{-1} \end{pmatrix}$$

$$= \begin{pmatrix} 0 & 0 \\ 0 & W (W^T D_{\bar{y},2}^{-2} W)^{-1} W^T D_{\bar{y},2}^{-1} \end{pmatrix}$$

and this last matrix represents the operator

$$D_{\bar{y}}|_{E_J} \circ \Pi_{(D_{\bar{y}}|_{E_J})^{-1}(P \cap E_J)}$$

as announced in this lemma.

To achieve the proof of this lemma we have to show that

$$\lim A^{-1} = \lim A^{-1} (U^T D_{y,1}^{-1}) = 0$$

with $A = U^T D_{y,1}^{-2} U$. In fact it suffices to prove $\lim A^{-1} = 0$ because

$$\| A^{-1} (U^T D_{y,1}^{-1}) \|^2 = \| A^{-1} (U^T D_{y,1}^{-1}) (A^{-1} (U^T D_{y,1}^{-1}))^T \| = \| A^{-1} \|.$$

Since $U$ is full rank, the matrix $U^T U$ is positive definite so that

$$\mu = \min \operatorname{Spec}(U^T U) > 0.$$

Let us denote by $\succeq$ the ordering on square matrices given by the cone of non-negative matrices. We have

$$D_{y,1}^{-2} \succeq \frac{1}{\max_{1 \le i \le m_1} y_i^2}\, I_{m_1} \succeq \frac{1}{\|(y_1, \ldots, y_{m_1})\|^2}\, I_{m_1}$$

so that

$$U^T D_{y,1}^{-2} U \succeq \frac{1}{\|(y_1, \ldots, y_{m_1})\|^2}\, U^T U \succeq \frac{\mu}{\|(y_1, \ldots, y_{m_1})\|^2}\, I_{p_1}.$$

Taking the inverses changes this inequality into the following:

$$A^{-1} \preceq \frac{\|(y_1, \ldots, y_{m_1})\|^2}{\mu}\, I_{p_1} \to 0$$

when $y \to \bar{y}$ and we are done. $\square$

Corollary 4.1. The vector field $N(x)$ extends continuously to all of $\mathbb{R}^n$. Moreover it is Lipschitz on compact sets. When all the faces of the polytope $\mathcal{P}$ are regularly described, the continuous extension of $N(x)$ to the face $\mathcal{P}_I$ of $\mathcal{P}$ equals $N_I(x)$. Consequently any orbit of $N(x)$ in the polytope $\mathcal{P}$ tends to one of the singularities of the extended vector field, i.e. either to a vertex or an analytic center of one of the faces.

Proof. It is a consequence of Definition 2.2, Lemmas 4.1 and 4.2 and the equality $A^{\dagger} y = b_J^{\dagger} y$ for any $y \in E_J$ that $N(x)$ extends continuously to all of $\mathbb{R}^n$ and equals $N_I(x)$ on $\mathcal{P}_I$. Moreover a rational function which is continuous on $\mathbb{R}^n$ has bounded partial derivatives on compact sets and hence is Lipschitz. Now we use the characterization of the vector field restricted to the face to see that any orbit which is not the analytic center of a face tends to the boundary of the face, and any orbit which enters a small enough neighborhood of a vertex tends to that vertex. $\square$

Remark 4.1. We have shown that $N(x)$ is Lipschitz. We do not know an example where it is not analytic and wonder as to what its order of smoothness is, in general. In the next section we will show it is analytic generically.

5. Analyticity and Derivatives

In Sec. 3 we gave the following expression for the Newton vector field:

$$N(x) = \frac{\displaystyle \sum_{\substack{K_n \subset \{1, \ldots, m\} \\ \det A_{K_n} \ne 0}} (x - u_{K_n}) (\det A_{K_n})^2 \prod_{l \notin K_n} (A_l x - b_l)^2}{\displaystyle \sum_{\substack{K_n \subset \{1, \ldots, m\} \\ \det A_{K_n} \ne 0}} (\det A_{K_n})^2 \prod_{l \notin K_n} (A_l x - b_l)^2}$$

for any $x \in \operatorname{Int} \mathcal{P}$. Under a mild geometric assumption, the denominator of this fraction never vanishes so that $N(x)$ may be extended to a real analytic vector field.

Theorem 5.1. Suppose that for any $x \in \partial \mathcal{P}$ contained in the relative interior of a codimension $d$ face of $\mathcal{P}$, we have $A_{k_i} x = b_{k_i}$ for exactly $d$ indices in $\{1, \ldots, m\}$ and $A_l x > b_l$ for the other indices. In that case the line vectors $A_{k_i}$, $1 \le i \le d$, are linearly
Newton Flow and Interior Point Methods in Linear Programming 141

independent. Moreover, for such an x The first one is a well-known fact about the Newton
operator: its derivative is equal to —id at a zero
E (detAK^H^x-btf^O (if N(x) = 0, then DN(x) = D{-Dg{x)~lg{x)) =
KnC{l,...,m}
detKn^O
1<?K„ Di-Dgix)-1^) - Dg{x)-lDg{x) = - i d ) . The
second fact is proved in Sec. 4: the restriction of
so that N(x) extends analytically to a neighborhood N{x) to a face is the Newton vector field associated
ofV. with the restriction of g(x) to this face.
We have now take care of the 1 eigenvalues. To
Proof. Under this assumption, for any x € V, there simplify the notations we suppose that A^x = bi for
exists a subset Kn C { 1 , . . . , m} such that the sub- 1 < i < d, AiX > bi when i + 1 < i < m, and
matrix AKU is nonsingular and A\x — b\ > 0 for any N(x) — 0. N is analytic and its derivative in the
x£Kn. • direction v is given by
Our next objective is to describe the singular „,., . Num
(detAO 2
y j v yj.ju —
points of this extended vector field. Y[(AlX--h?
K„C{l,-,m } l£Kn
Theorem 5.2. Under the previous geometric ass- detKn^0
umption, the singularities of the extended vector
field are: the analytic center of the polytope and the VlZil
2
analytic centers of the different faces of the polytope, Num = 2_2 l
»(det AK J J] (AlX - h?
including the vertices. Each of them is hyperbolic: if K„C{l,...,m} l£Kn
x 6 dV is the analytic center of a codimension d detKn^O
face T of' V, then the derivative DN(x) has n — d
eigenvalues equal to —1 with corresponding eigen- + K„C{l,...,m
E (x-uKn){detAKn)2
vectors contained in the linear space TQ parallel to }
detKn^0
J- and d eigenvalues equal to 1 with corresponding
eigenvectors contained in a complement of FQ .
x E 2A^ v(Ai x 0 -•bi0) Ri^x - 6 | ) :
Proof. The first part of this theorem, about the lo^Kn
l&o
—1 eigenvalues, is the consequence of two facts.
which gives

E (x - ^ J ( d e t AKn)2 E 2A,0v(Aj0x - blo) H (Aix - hf


Kn to£K„ l£Kn
DN(x)v = v + det K„^0 = v + Mv
2 2
E (det^J n(^-^)
KnC{l,...,m}
det Kn^0

where M is, up to a constant factor (i.e. constant in v), the n x n matrix equal to

E (detAKn)2(Alox - blo) I Yl (Aix - bt)2 (x - uKn)Ah


K
Knn V i?Kn J
det Kn 7^0

which is also equal to

(AAAKJ'
E Ajnx — bin ,
Y[(Aix-bi)2\{x-uKn)Ah
{l,...,d}cKn '° '° \l*K,
detKnj^O
d-\-l<lo<m
142 J.-P. Dedieu & M. Shub

because $A_i x = b_i$ when $1 \le i \le d$ and $A_i x > b_i$ otherwise.

To prove our theorem we have to show that $\dim\ker M \ge d$. This gives at least $d$ independent vectors $v_i$ such that $M v_i = 0$, that is, $DN(x)v_i = v_i$; thus 1 is an eigenvalue of $DN(x)$ and its multiplicity is $\ge d$. In fact it is exactly $d$ because we already have the eigenvalue $-1$ with multiplicity $n - d$. The inequality $\dim\ker M \ge d$ is given by $\operatorname{rank} M \le n - d$. Why is it true? $M$ is a linear combination of rank 1 matrices $(x - u_{K_n})A_{l_0}$, so that the rank of $M$ is less than or equal to the dimension of the system of vectors $x - u_{K_n}$ with $K_n$ as before. Since $\{1,\dots,d\} \subset K_n$, $A_i x = b_i$ when $1 \le i \le d$, and $A_{K_n}u_{K_n} = b_{K_n}$, we have $A(x - u_{K_n}) = (0,\dots,0,y_{d+1},\dots,y_m)^T$. From the hypothesis, the line vectors $A_1,\dots,A_d$ defining the face $\mathcal{F}$ are independent, thus the set of vectors $u \in \mathbb{R}^n$ such that the vector $Au \in \mathbb{R}^m$ begins with $d$ zeros has dimension $n - d$ and we are done. □

Remark 5.1. The last theorem implies that $N(x)$ is Morse-Smale in the terminology of dynamical systems. Recall also that we are really interested in the positive time trajectories of $-N(x)$. For $-N(x)$ the eigenvalues at the critical points are multiplied by $-1$, so in the faces the critical points of $-N(x)$ are sources and their stable manifolds are transverse to the faces.

6. Example

Let us consider the case of a triangle in the plane. Since the Newton vector field is affinely invariant (Proposition 2.1) we may only consider the triangle with vertices (0,0), (1,0) and (0,1). A dual description is given by the three inequalities $x \ge 0$, $y \ge 0$, $-x - y \ge -1$, which correspond to the following data:

$$A = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ -1 & -1 \end{pmatrix}, \qquad b = \begin{pmatrix} 0 \\ 0 \\ -1 \end{pmatrix}, \qquad D(x,y) = \begin{pmatrix} x & 0 & 0 \\ 0 & y & 0 \\ 0 & 0 & 1-x-y \end{pmatrix}.$$

Newton vector field in the triangle

The corresponding Newton vector field is given by the rational expressions

$$N(x,y) = \begin{pmatrix} \dfrac{xz^2 - x^2z + xy^2 - x^2y}{z^2 + y^2 + x^2} \\[2ex] \dfrac{x^2y - xy^2 + yz^2 - y^2z}{z^2 + y^2 + x^2} \end{pmatrix}$$

with $z = 1 - x - y$. This vector field is analytic on the whole plane. The singular points are the three vertices, the midpoints of the three sides and the center of gravity. The arrows in the figure are for $-N(x)$ and the critical points are clearly sources in their faces.
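The claims about the triangle can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are illustrative, and the Jacobian is approximated by central differences rather than computed symbolically. It evaluates the field in exact rational arithmetic at the seven singular points and compares the trace and determinant of $DN$ with the values expected from Theorem 5.2 for $n = 2$ (that is, $n - d$ eigenvalues $-1$ and $d$ eigenvalues $1$).

```python
from fractions import Fraction as F

def newton_field(x, y):
    """Newton vector field N(x, y) of the triangle, with z = 1 - x - y."""
    x, y = F(x), F(y)
    z = 1 - x - y
    den = z * z + y * y + x * x  # never zero, since x + y + z = 1
    n1 = (x * z * z - x * x * z + x * y * y - x * x * y) / den
    n2 = (x * x * y - x * y * y + y * z * z - y * y * z) / den
    return n1, n2

def trace_det(x, y, h=F(1, 10**6)):
    """Trace and determinant of DN(x, y), approximated by central differences."""
    x, y = F(x), F(y)
    dx = [(p - q) / (2 * h)
          for p, q in zip(newton_field(x + h, y), newton_field(x - h, y))]
    dy = [(p - q) / (2 * h)
          for p, q in zip(newton_field(x, y + h), newton_field(x, y - h))]
    # DN is approximately [[dx[0], dy[0]], [dx[1], dy[1]]]
    return float(dx[0] + dy[1]), float(dx[0] * dy[1] - dy[0] * dx[1])

# (point, codimension d of the face whose analytic center it is)
CENTERS = [
    ((F(1, 3), F(1, 3)), 0),   # center of gravity
    ((F(1, 2), F(0)), 1),      # midpoints of the three sides
    ((F(0), F(1, 2)), 1),
    ((F(1, 2), F(1, 2)), 1),
    ((F(0), F(0)), 2),         # vertices
    ((F(1), F(0)), 2),
    ((F(0), F(1)), 2),
]

for (px, py), d in CENTERS:
    assert newton_field(px, py) == (0, 0)   # exact zero of the field
    tr, det = trace_det(px, py)
    print((float(px), float(py)), "d =", d,
          "trace ~", round(tr, 6), "det ~", round(det, 6))
```

With eigenvalues $-1$ ($n - d$ times) and $1$ ($d$ times), the trace should be close to $2d - 2$ and the determinant close to $(-1)^d$.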

Five trajectories.


References

Bayer, D. & Lagarias, J. [1989a] "The non-linear geometry of linear programming I: Affine and projective scaling trajectories," Trans. Amer. Math. Soc. 314, 499-526.
Bayer, D. & Lagarias, J. [1989b] "The non-linear geometry of linear programming II: Legendre transform coordinates and central trajectories," Trans. Amer. Math. Soc. 314, 527-581.
Bayer, D. & Lagarias, J. [1991] "Karmarkar's linear programming algorithm and Newton's method," Math. Progr. A50, 291-330.
Dedieu, J.-P., Malajovich, G. & Shub, M. "On the curvature of the central path of linear programming theory," to appear.
Megiddo, N. & Shub, M. [1989] "Boundary behaviour of interior point algorithms in linear programming," Math. Oper. Res. 14, 97-146.
Renegar, J. [2001] A Mathematical View of Interior-Point Methods in Convex Optimization (SIAM, Philadelphia).
Vavasis, S. & Ye, Y. [1996] "A primal-dual accelerated interior point method whose running time depends only on A," Math. Progr. A74, 79-120.
NUMERICAL CONTINUATION OF BRANCH POINTS
OF EQUILIBRIA AND PERIODIC ORBITS
E. J. DOEDEL
Department of Computer Science, Concordia University,
1455 Boulevard de Maisonneuve O., Montreal Quebec, H3G 1M8, Canada
W. GOVAERTS
Department of Applied Mathematics and Computer Science,
Ghent University, Krijgslaan 281-S9, B-9000 Gent, Belgium
YU. A. KUZNETSOV
Mathematisch Instituut, Universiteit Utrecht,
Boedapestlaan 6, 3584 CD Utrecht, The Netherlands
A. DHOOGE
Department of Applied Mathematics and Computer Science,
Ghent University, Krijgslaan 281-S9, B-9000 Gent, Belgium

Received March 10, 2004; Revised June 14, 2004

We consider the three-parameter numerical continuation of branch points in dynamical systems, with emphasis on the continuation of branch points of periodic orbits. We consider both the case of branch points along one-parameter families of limit cycles (typical in systems with symmetry), and the case of branch points along fold-curves of limit cycles (typical in generic systems). We discuss new algorithms based on bordered matrices for both detection and continuation. We apply the techniques to a model of a chemical reactor, a model of an electronic circuit, and a model from celestial mechanics. Our algorithms have been implemented in freely available software.

Keywords: Rank defect; test function; bifurcation.

1. Introduction

We deal with the numerical continuation of special solution families associated with a smooth dynamical system of the form

$$\frac{dx}{dt} = f(x,\alpha), \qquad (1)$$

with $x \in \mathbb{R}^n$, $f(x,\alpha) \in \mathbb{R}^n$, and $\alpha$ a vector of parameters.

Software packages such as AUTO [Doedel et al., 2001], CONTENT [Kuznetsov & Levitin, 1997] and MATCONT [Dhooge et al., 2003] can compute families ("branches") of equilibria and periodic orbits, as well as families of fold and Hopf bifurcation points, and fold-, flip- and torus bifurcations of periodic orbits, Bogdanov-Takens points, etc. The singular points can be classified in terms of the codimension of point type and the number of free parameters required for continuation, cf. [Beyn et al., 2002; Govaerts, 2000; Kuznetsov, 1998].

Branch points disturb this nice picture. First, unlike other bifurcation points, a branch point is defined with respect to a particular parameter. If the equilibrium or periodic orbit of (1) is defined by

$$F(X,\alpha_0) = 0, \qquad (2)$$

146 E. J. Doedel et al.

where $\alpha_0$ is a component of $\alpha$, and $X$ and $F(X,\alpha_0)$ are in compatible state spaces, then a branch point is characterized by the fact that $[F_X \; F_{\alpha_0}]$ is rank deficient. This is a codimension-2 phenomenon, and therefore not generic in one-parameter problems. In generic systems, branch points can be expected only in two-parameter problems, and their continuation requires three free parameters. Second, branch points depend more intimately on the parameters of the problem than, say, limit points. Unlike the previously mentioned bifurcation points, their numerical continuation uses second-order derivatives with respect to parameters.

There are reasons for not considering branch points at all in generic systems. However, problems arising in applications often have a special structure (e.g. equivariant, Hamiltonian, etc.). For this reason, standard software packages do provide the option to detect and accurately locate branch points, as well as branch switching, although they do not provide for their numerical continuation. One of the first standard codes that supported detection of equilibrium branch points and allowed for branch switching was STAFF [Borisyuk, 1981]. Similar facilities are provided by AUTO [Doedel et al., 2001], CONTENT [Kuznetsov & Levitin, 1997], and several other bifurcation programs.

Generic software can often deal with structured problems if an artificial "unfolding" parameter is introduced to break the special structure, thereby embedding the problem in a generic class; see, for example, [Doedel et al., 2003] and [Munoz-Almarez et al., 2003]. Thus, three-parameter continuation of branch points for generic systems is also useful for structured problems.

In this paper we describe a mathematical framework for the three-parameter continuation of branch points of equilibria and branch points of periodic solutions, concentrating on the use of minimal extended systems. The adjective "minimal" reflects the fact that we append only two scalar equations to the system that defines the equilibria or the periodic solutions, much in the spirit of the computation of fold, flip and torus bifurcations of periodic orbits in [Doedel et al., 2003].

We note that the computation of branches bifurcating at branch points of equilibria is studied intensively in many papers; we refer in particular to the Proceedings volumes [Küpper et al., 1984] and [Mittelmann & Weber, 1980], as well as to the book [Allgower & Georg, 1990]. Fully extended systems for branch points of systems of equations (including boundary value problems) are discussed in [Moore, 1980] and [Mei, 1989, 2000]. These systems can be compared to our system (28), except that in (28) $[p_1^* \; p_2^* \; p_3]^*$ is a fixed vector, while the corresponding entities in [Moore, 1980] and [Mei, 1989] are unknowns of the problem. Therefore the systems that we obtain are essentially smaller. Minimal extended systems for branch points of equilibria were proposed in [Griewank et al., 1984] and [Allgower & Schwetlick, 1997] and ours are equivalent to these.

On the other hand, the fact that branch points generically appear in families of limit points of equilibria and periodic orbits seems to have received little attention in the numerical literature. Our example in Sec. 5 shows how it helps to understand how connections between various objects can switch if parameters change.

In Secs. 2 and 3 we discuss mathematical features of branch points of equilibria and periodic orbits, respectively. In Sec. 4 we deal with the numerical implementation, i.e. the detection, computation and continuation of branch points of periodic orbits. In Secs. 5-7 we give numerical examples, using an implementation of the algorithms in the MATLAB-based software MATCONT.

2. Equilibria and their Branch Points

An equilibrium is a constant solution of (1), i.e. a solution of

$$f(x,\alpha) = 0. \qquad (3)$$

Let $(x^0,\alpha^0)$ be an equilibrium point, and assume that a component $\beta$ of $\alpha$ is free. By the implicit function theorem there is a unique solution family of (3) passing through $(x^0,\beta^0)$ in $(x,\beta)$-space, if the $(n, n+1)$-dimensional Jacobian matrix

$$[\,f_x(x^0,\alpha^0) \;\; f_\beta(x^0,\alpha^0)\,], \qquad (4)$$

has full rank $n$. For a generic $(n, n+1)$ matrix, corank 1 is a codimension-2 phenomenon, and corank 2 is a codimension-6 phenomenon; see, e.g. [Govaerts, 2000], Proposition 3.4.2. We restrict to the corank 1 case, and simply call it a branch point. In this case, the existence of a unique family of equilibria is not guaranteed. In fact, the behavior of the equilibrium solutions near $(x^0,\alpha^0)$ can be quite complicated; this is the subject of singularity theory

[Golubitsky & Schaeffer, 1985; Golubitsky et al., 1988; Govaerts, 2000]. However, the most common situation is that of a transcritical or pitchfork bifurcation.

A local characterization of the manifold of $(n, n+1)$ corank 1 matrices near (4) follows from [Govaerts, 2000, Propositions 3.4.1-2]. We recall the main facts, using "*" to denote transposed matrices.

Proposition 1. Let $\phi_{11}, \phi_{21} \in \mathbb{R}^n$, $\phi_{12}, \phi_{22} \in \mathbb{R}$, be such that $\phi_1 = (\phi_{11}^*, \phi_{12})^*$, $\phi_2 = (\phi_{21}^*, \phi_{22})^*$, together with the rows of (4) span $\mathbb{R}^{n+1}$. Also, let $\psi \in \mathbb{R}^n$ be a vector that together with the columns of (4) spans $\mathbb{R}^n$. Then the bordered matrix

$$B_J = \begin{pmatrix} f_x(x^0,\alpha^0) & f_\beta(x^0,\alpha^0) & \psi \\ \phi_{11}^* & \phi_{12} & 0 \\ \phi_{21}^* & \phi_{22} & 0 \end{pmatrix} \qquad (5)$$

is nonsingular.

Proof. This follows from [Govaerts, 2000, Proposition 3.2.1]. □

Proposition 2. Let $\phi_{11}, \phi_{21} \in \mathbb{R}^n$, $\phi_{12}, \phi_{22} \in \mathbb{R}$ and $\psi \in \mathbb{R}^n$ be as in Proposition 1. Let $B$ be a matrix having the same structure as the matrix $B_J$ in (5):

$$B = \begin{pmatrix} B_1 & B_2 & \psi \\ \phi_{11}^* & \phi_{12} & 0 \\ \phi_{21}^* & \phi_{22} & 0 \end{pmatrix} \qquad (6)$$

and define $v_{11}, v_{21} \in \mathbb{R}^n$, $v_{12}, v_{22}, g_1, g_2 \in \mathbb{R}$ by requiring

$$B \begin{pmatrix} v_{11} & v_{21} \\ v_{12} & v_{22} \\ g_1 & g_2 \end{pmatrix} = \begin{pmatrix} 0_n & 0_n \\ 1 & 0 \\ 0 & 1 \end{pmatrix}. \qquad (7)$$

If $B_1, B_2$ are sufficiently close to $f_x(x^0,\alpha^0)$, $f_\beta(x^0,\alpha^0)$, respectively, then $B$ is nonsingular. Furthermore, $[B_1 \; B_2]$ has corank 1 if and only if

$$g_1 = 0, \qquad g_2 = 0. \qquad (8)$$

Moreover, (8) is locally a regular defining system for the manifold of corank 1 matrices.

Proof. This follows from [Govaerts, 2000, Proposition 3.4.2]. □

By the previous results, the branch points of the equilibria of (1) near $(x^0,\alpha^0)$ are defined by the system consisting of (3) and (8), where $B_1, B_2$ are replaced by $f_x(x,\alpha)$, $f_\beta(x,\alpha)$, respectively, so that $g_1, g_2$ are functions of $x$, $\alpha$. If three components of $\alpha$ are freed, then, generically, a family of branch points can be computed. "Generically" means that the Jacobian matrix of (3), (8) with respect to the components of $x$ and the free parameters has full rank; geometrically this can be related to a transversal intersection of manifolds.

We note that the definition of branch points depends on the choice of the component $\beta$ of $\alpha$; a solution of (3) may be a branch point with respect to one component of $\alpha$ but not with respect to another component. Also, the three free parameters generically needed to compute a family of branch points with respect to $\beta$ may or may not include $\beta$. In applications they usually do.

In the numerical continuation of a family of branch points it is highly desirable to have explicit expressions for the derivatives of $g_1, g_2$ with respect to the state variables and the free parameters. To this end we solve the adjoint system corresponding to (7)

$$B^* \begin{pmatrix} w \\ g_1 \\ g_2 \end{pmatrix} = \begin{pmatrix} 0_{n+1} \\ 1 \end{pmatrix}, \qquad (9)$$

where $w \in \mathbb{R}^n$. If $z$ is one of the components of $x$, or a free parameter, then by taking derivatives of (7) and multiplying from the left with

$$[\,w^* \;\; g_1 \;\; g_2\,]$$

we obtain

$$g_{iz} = -w^* f_{xz} v_{i1} - w^* f_{\beta z} v_{i2}, \qquad (i = 1, 2). \qquad (10)$$

If $z$ is a parameter component, then these expressions involve the second derivatives of $f$ with respect to parameters.
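The defining functions $g_1, g_2$ of (7)-(8) can be illustrated on a small example. The sketch below is an assumption-laden illustration, not the MATCONT implementation: it uses the planar pitchfork system $f(x,\beta) = (\beta x_1 - x_1^3,\; -x_2)$, which is not taken from the paper, fixes the borders $\phi_1 = (1,0,0)^*$, $\phi_2 = (0,0,1)^*$, $\psi = (1,0)^*$ (chosen to satisfy Proposition 1 at the origin), and uses a small hand-written Gaussian elimination routine in place of a linear algebra library.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[piv][col]) < 1e-14:
            raise ValueError("bordered matrix is numerically singular")
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def branch_test_functions(x1, beta):
    """g1, g2 of (7) for the illustrative system f = (beta*x1 - x1**3, -x2)."""
    fx = [[beta - 3.0 * x1 ** 2, 0.0], [0.0, -1.0]]   # f_x
    fb = [x1, 0.0]                                    # f_beta
    bordered = [
        fx[0] + [fb[0], 1.0],     # [f_x  f_beta  psi], first row
        fx[1] + [fb[1], 0.0],     # [f_x  f_beta  psi], second row
        [1.0, 0.0, 0.0, 0.0],     # [phi_1^*  0]
        [0.0, 0.0, 1.0, 0.0],     # [phi_2^*  0]
    ]
    g1 = solve(bordered, [0.0, 0.0, 1.0, 0.0])[-1]   # last entry of column 1 of (7)
    g2 = solve(bordered, [0.0, 0.0, 0.0, 1.0])[-1]   # last entry of column 2 of (7)
    return g1, g2

print(branch_test_functions(0.0, 0.0))    # pitchfork point: both vanish
print(branch_test_functions(0.0, 1.0))    # regular point on the trivial branch
print(branch_test_functions(0.5, 0.25))   # regular point on the bifurcating branch
```

For this particular choice of borders one finds by hand $g_1 = -(\beta - 3x_1^2)$ and $g_2 = -x_1$, so both functions vanish only at the pitchfork point $x_1 = 0$, $\beta = 0$, in agreement with Proposition 2.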

3. Periodic Solutions and their Branch Points

3.1. Periodic solutions

A periodic solution is a nonconstant solution of (1) with finite period $T > 0$, i.e. $x(0) = x(T)$. Since $T$ is not known in advance, we use an equivalent system defined on the fixed interval [0,1], by rescaling time. Then the system, with $\dot{x} = dx/dt$, reads

$$\begin{cases} \dot{x} - T f(x,\alpha) = 0, \\ x(0) - x(1) = 0. \end{cases} \qquad (11)$$

The phase shifted function $\phi(t) = x(t + s)$ is also a solution of (11), for any value of $s$. In order to have a unique solution, an extra constraint is needed. The following integral constraint is often used [Doedel et al., 2001; Kuznetsov & Levitin, 1997]:

$$\langle x, \dot{x}_{old} \rangle = 0, \qquad (12)$$

where $\dot{x}_{old}$ is the time derivative of a previously calculated periodic solution, and therefore known. For given $x, y \in C^0([0,1],\mathbb{R}^n)$, we denote

$$\langle x, y \rangle = \mathrm{Int}_y(x) = \int_0^1 x^*(t)\,y(t)\,dt.$$

The phase condition (12) selects the periodic solution $x$ with the smallest phase difference compared to the previous solution $x_{old}$. The most appropriate choice for $x_{old}$ is the preceding periodic solution computed in the continuation process.

The complete boundary value problem (BVP) defining a periodic solution now consists of (11) and (12).

Consider the variational equation

$$\dot{X} - T f_x(x(t),\alpha)X = 0, \qquad (13)$$

and the adjoint variational equation

$$\dot{X} + T f_x^*(x(t),\alpha)X = 0. \qquad (14)$$

Denote by $\Phi(t)$ the fundamental matrix solution of (13), for which $\Phi(0) = I$, where $I = I_n$ is the $n$-dimensional identity matrix. Then $\Phi(1)$ is the monodromy matrix of the periodic solution. The eigenvalues of $\Phi(1)$ are the Floquet multipliers. There is always at least one multiplier that is equal to 1, with corresponding eigenvector $\dot{x}(0)$, where $x(t)$ is a solution of (11). For a regular periodic solution, the multiplier 1 has geometric multiplicity 1. Similarly denote by $\Psi(t)$ the fundamental matrix solution of (14), for which $\Psi(0) = I$. One has $\Psi(t) = [(\Phi(t))^{-1}]^*$.

If $v(t)$ is a vector solution of (13), with initial values $v(0) = v_0$, and $w(t)$ is a vector solution to (14), with initial values $w(0) = w_0$, then the inner product satisfies $w^*(t)v(t) = w_0^* v_0$, i.e. it is independent of time $t$.

The left and right eigenvectors of the monodromy matrix $\Phi(1)$ for a geometrically simple eigenvalue 1 will be denoted $p_0$, $q_0$ respectively. It is easily seen that $p_0$ (respectively, $q_0$) is also the right (respectively, left) eigenvector of $\Psi(1)$ for the eigenvalue 1. Furthermore, $q_0$ is a scalar multiple of $\dot{x}(0)$.

3.2. Branch points of periodic solutions

If we select a component $\beta$ of the parameter vector $\alpha$, then the periodic solution equations (11), (12) admit a smooth solution family in $(x(t),T,\beta)$-space, passing through a given periodic solution, if the Jacobian operator

$$J = \begin{pmatrix} D - T f_x(x(t),\alpha) & -f(x(t),\alpha) & -T f_\beta(x(t),\alpha) \\ \delta_0 - \delta_1 & 0 & 0 \\ \mathrm{Int}_{\dot{x}_{old}(t)} & 0 & 0 \end{pmatrix} \qquad (15)$$

is onto and has a one-dimensional kernel. If this condition is violated, then the periodic solution is called a branch point, and, as in the case of equilibria, more information is needed to decide about the behavior of nearby periodic solutions.

We call $J$ the branching operator. To study it in more detail we first recall some basic facts about the operator that is implicitly defined by (11), when linearized about a regular solution $(x(t),T,\alpha)$.

Proposition 3. If $(x(t),T,\alpha)$ is a regular solution of (11), (12), then the operator

$$\begin{pmatrix} D - T f_x(x(t),\alpha) \\ \delta_0 - \delta_1 \end{pmatrix} : C^1([0,1],\mathbb{R}^n) \to C^0([0,1],\mathbb{R}^n) \times \mathbb{R}^n \qquad (16)$$

has a one-dimensional kernel spanned by $\Phi q_0$. Its range has codimension 1; if $\zeta \in C^0([0,1],\mathbb{R}^n)$, $r \in \mathbb{R}^n$ then $(\zeta, r)^*$ is in the range if and only if $\langle \Psi p_0, \zeta \rangle = p_0^* r$. In particular, if $r = 0$ then $(\zeta, 0)^*$ is in the range if and only if $\langle \Psi p_0, \zeta \rangle = 0$.

Proof. See [Doedel et al., 2003b, Proposition 1]. □

Proposition 4. Let $(x(t),T,\alpha)$ be a regular solution of (11), (12), and assume that $x_{old}(t)$ is close enough to $x(t)$, so that $\langle \dot{x}, \dot{x}_{old} \rangle \ne 0$. Then the operator

$$\begin{pmatrix} D - T f_x(x(t),\alpha) \\ \delta_0 - \delta_1 \\ \mathrm{Int}_{\dot{x}_{old}(t)} \end{pmatrix} : C^1([0,1],\mathbb{R}^n) \to C^0([0,1],\mathbb{R}^n) \times \mathbb{R}^n \times \mathbb{R} \qquad (17)$$

is one-to-one. Its range has codimension 1; if $\zeta \in C^0([0,1],\mathbb{R}^n)$, $r \in \mathbb{R}^n$, $s \in \mathbb{R}$ then $(\zeta, r, s)^*$ is in the range if and only if $\langle \Psi p_0, \zeta \rangle = p_0^* r$. In particular, if $r = 0$, $s = 0$ then $(\zeta, 0, 0)^*$ is in the range if and only if $\langle \Psi p_0, \zeta \rangle = 0$.

Proof. First assume that $y(t)$ is in the kernel of (17). Then it is also in the kernel of (16), i.e. $y(t)$ is a multiple of $\dot{x}(t)$ by Proposition 3. By the assumption on the closeness of $x_{old}$, this implies that $y(t) = 0$.

Next, consider any $\zeta \in C^0([0,1],\mathbb{R}^n)$, $r \in \mathbb{R}^n$, $s \in \mathbb{R}$ for which $(\zeta, r, s)^*$ is in the range of (17). By Proposition 3 this implies $\langle \Psi p_0, \zeta \rangle = p_0^* r$. Conversely, if this condition holds then, again by Proposition 3, there exists an $s' \in \mathbb{R}$ such that $(\zeta, r, s')^*$ is in the range of (17). On the other hand, this range also contains a vector of the form $(0, 0, s - s')^*$ (the image of a multiple of $\dot{x}(t)$). □

Proposition 5. Let $(x(t),T,\alpha)$ be a regular solution of (11), (12) and assume that $x_{old}(t)$ is close enough to $x(t)$, so that $\langle \dot{x}, \dot{x}_{old} \rangle \ne 0$. Let $k_1, k_2 \in C^0([0,1],\mathbb{R}^n)$ and consider the operator

$$J_1 = \begin{pmatrix} D - T f_x(x(t),\alpha) & k_1(t) & k_2(t) \\ \delta_0 - \delta_1 & 0 & 0 \\ \mathrm{Int}_{\dot{x}_{old}(t)} & 0 & 0 \end{pmatrix}, \qquad (18)$$

where $J_1 : C^1([0,1],\mathbb{R}^n) \times \mathbb{R} \times \mathbb{R} \to C^0([0,1],\mathbb{R}^n) \times \mathbb{R}^n \times \mathbb{R}$. Then the following statements are equivalent:

(1) One of $\langle \Psi p_0, k_1 \rangle$, $\langle \Psi p_0, k_2 \rangle$ is nonzero.
(2) $J_1$ is onto and has a one-dimensional kernel.
(3) $J_1$ is onto.

Proof. To prove that (1) implies (2), we may assume that $\langle \Psi p_0, k_1 \rangle \ne 0$. By Proposition 4, the second column of $J_1$ is not in the range of (17), and so $J_1$ is an onto operator. Next, the first two block columns of $J_1$ span the range of $J_1$, so the third column is a linear combination of the first two block columns; this implies that the kernel of $J_1$ is at least one-dimensional. To prove that the kernel is one-dimensional, suppose that $(y_1(t), u_1, v_1)$, $(y_2(t), u_2, v_2)$ are both in the kernel of $J_1$. Then there exist $a, b \in \mathbb{R}$, not both equal to zero, such that $a v_1 + b v_2 = 0$. Hence

$$\begin{pmatrix} (a u_1 + b u_2)\,k_1(t) \\ 0 \\ 0 \end{pmatrix}$$

is in the range of (17). By Proposition 4 and the assumption on $k_1(t)$, this implies that $a u_1 + b u_2 = 0$. Hence $a y_1(t) + b y_2(t)$ is in the kernel of (17). However, this operator is one-to-one, so $a y_1(t) + b y_2(t) = 0$. This proves that $(y_1(t), u_1, v_1)$ and $(y_2(t), u_2, v_2)$ are linearly dependent, so the kernel of $J_1$ is one-dimensional.

The implication (2) ⇒ (3) is trivial. Now assume that $J_1$ is onto. If $\langle \Psi p_0, k_1 \rangle = \langle \Psi p_0, k_2 \rangle = 0$ then, by Proposition 4, the range of $J_1$ is the range of (17), so $J_1$ is not onto. □

Corollary 1. Let $(x(t),T,\alpha)$ be a regular solution of (11), (12) and assume that $x_{old}(t)$ is close enough to $x(t)$, so that $\langle \dot{x}, \dot{x}_{old} \rangle \ne 0$. Let $\beta$ be a component of $\alpha$. Then $(x(t),T,\alpha)$ is a branch point of the periodic solutions in $(x,\beta)$-space if and only if $\langle \Psi p_0, f(x,\alpha) \rangle = \langle \Psi p_0, f_\beta(x,\alpha) \rangle = 0$.

From [Doedel et al., 2003b], Proposition 5 we recall that the condition $\langle \Psi p_0, f(x,\alpha) \rangle = 0$ generically characterizes fold bifurcations of limit cycles. This is in accordance with the fact that in generic systems BPC points appear as special points in families of LPC points.

4. Numerical Detection, Computation and Continuation of Branch Points of Periodic Solutions

4.1. Time discretization

We concentrate on the orthogonal collocation method [Ascher et al., 1995] to discretize the periodic solutions, because of its good convergence properties [De Boor & Swartz, 1973], and its widespread use, e.g. in COLSYS [Ascher et al., 1981], AUTO [Doedel et al., 2001], CONTENT [Kuznetsov & Levitin, 1997] and MATCONT [Dhooge et al., 2003]. We recall the basic features.

First the interval [0,1] is subdivided into $N$ smaller intervals:

$$0 = \tau_0 < \tau_1 < \cdots < \tau_N = 1.$$

In each of these intervals the solution $x(\tau)$ is approximated by an order $m$ vector valued polynomial $x^{(i)}(\tau)$. This is done by defining $m + 1$ equidistant points on each interval:

$$\tau_{i,j} = \tau_i + \frac{j}{m}(\tau_{i+1} - \tau_i) \qquad (j = 0, 1, \ldots, m),$$

and defining the polynomials $x^{(i)}(\tau)$ as

$$x^{(i)}(\tau) = \sum_{j=0}^{m} x^{i,j}\,\ell_{i,j}(\tau).$$

Here $x^{i,j}$ is the discretization of $x(\tau)$ at $\tau = \tau_{i,j}$ (we note that $x^{i,m} = x^{i+1,0}$), and the $\ell_{i,j}(\tau)$'s are the Lagrange basis polynomials

$$\ell_{i,j}(\tau) = \prod_{k=0,\,k \ne j}^{m} \frac{\tau - \tau_{i,k}}{\tau_{i,j} - \tau_{i,k}}.$$

In each interval $[\tau_i, \tau_{i+1}]$ we require that the polynomials $x^{(i)}(\tau)$ satisfy the BVP exactly at $m$ collocation points $\zeta_{i,j}$ $(j = 1, \ldots, m)$. It can be proved that the best choice for the collocation points are the Gauss points [De Boor & Swartz, 1973], i.e. the roots of the Legendre polynomial of degree $m$, relative to the interval $[\tau_i, \tau_{i+1}]$.

Now let $x(t)$ be a function defined in [0,1], and assume that we want to integrate it over [0,1]. If, for example, $N = 3$ (mesh intervals), and $m = 2$ (collocation points), then the following data are associated with the discretized interval [0,1]:

mesh point: $\tau_{0,0} = \tau_0$; contribution: $t_1 w_1$; weight: $\sigma_{0,0}$
mesh point: $\tau_{0,1}$; contribution: $t_1 w_2$; weight: $\sigma_{0,1}$
mesh point: $\tau_{0,2} = \tau_{1,0} = \tau_1$; contribution: $t_1 w_3 + t_2 w_1$; weight: $\sigma_{1,0}$
mesh point: $\tau_{1,1}$; contribution: $t_2 w_2$; weight: $\sigma_{1,1}$
mesh point: $\tau_{1,2} = \tau_{2,0} = \tau_2$; contribution: $t_2 w_3 + t_3 w_1$; weight: $\sigma_{2,0}$
mesh point: $\tau_{2,1}$; contribution: $t_3 w_2$; weight: $\sigma_{2,1}$
mesh point: $\tau_{2,2} = \tau_{3,0} = \tau_3$; contribution: $t_3 w_3$; weight: $\sigma_{3,0}$

The total number of mesh points (tps) is $N \times m + 1$, the total number of points (ncoords) is tps $\times\, n$. Each mesh point $\tau_{i,j}$ in a mesh interval $[\tau_i, \tau_{i+1}]$ has a particular weight $w_{j+1}$, the Gauss-Lagrange quadrature coefficient. Some mesh points (the black bullets) belong to two mesh intervals. We set $t_i = \tau_i - \tau_{i-1}$, $(i = 1, \ldots, N)$. The integration weight $\sigma_{i,j}$ of $\tau_{i,j}$ is given by $w_{j+1} t_{i+1}$, for $0 \le i \le N - 1$ and $0 < j < m$. For $i = 0, \ldots, N - 2$, the integration weight of $\tau_{i,m} = \tau_{i+1,0}$ is given by $\sigma_{i,m} = w_{m+1} t_{i+1} + w_1 t_{i+2}$, and the integration weights of $\tau_0$ and $\tau_N$ are given by $w_1 t_1$ and $w_{m+1} t_N$, respectively. The integral $\int_0^1 x(t)\,dt$ is approximated by $\sum_{i=0}^{N-1} \sum_{j=0}^{m-1} x(\tau_{i,j})\,\sigma_{i,j} + x(1)\,\sigma_{N,0}$.

4.2. Discretization of the BVP

Using the discretization described in Sec. 4.1 we obtain the discretized BVP

$$\begin{cases} \displaystyle \sum_{j=0}^{m} x^{i,j}\,\ell'_{i,j}(\zeta_{i,k}) - T f\Big(\sum_{j=0}^{m} x^{i,j}\,\ell_{i,j}(\zeta_{i,k}),\,\alpha\Big) = 0, \\[1ex] x^{0,0} - x^{N-1,m} = 0, \\[1ex] \displaystyle \sum_{i=0}^{N-1} \sum_{j=0}^{m-1} \sigma_{i,j}\,\langle x^{i,j}, \dot{x}^{i,j}_{old} \rangle + \sigma_{N,0}\,\langle x^{N,0}, \dot{x}^{N,0}_{old} \rangle = 0. \end{cases} \qquad (19)$$
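The integration weights $\sigma_{i,j}$ of Sec. 4.1 can be sketched as follows. This is a hedged illustration, not the MATCONT routine: it assumes that the coefficients $w_1, \ldots, w_{m+1}$ are the integrals over the unit interval of the Lagrange basis polynomials on the $m + 1$ equidistant points (an interpolatory choice that the text does not spell out), and all function names are hypothetical. Exact rational arithmetic is used so that the result can be compared with known values.

```python
from fractions import Fraction as F

def lagrange_weights(m):
    """Integrals over [0, 1] of the Lagrange basis polynomials on nodes j/m."""
    nodes = [F(j, m) for j in range(m + 1)]
    weights = []
    for j in range(m + 1):
        coeffs = [F(1)]                      # polynomial, lowest degree first
        for k in range(m + 1):
            if k == j:
                continue
            scale = nodes[j] - nodes[k]
            new = [F(0)] * (len(coeffs) + 1)
            for p, c in enumerate(coeffs):   # multiply by (s - nodes[k]) / scale
                new[p] += -nodes[k] * c / scale
                new[p + 1] += c / scale
            coeffs = new
        weights.append(sum(c / (p + 1) for p, c in enumerate(coeffs)))
    return weights

def mesh_points(mesh, m):
    """The N*m + 1 points tau_{i,j}, with shared end points listed once."""
    pts = []
    for i in range(len(mesh) - 1):
        for j in range(m):
            pts.append(mesh[i] + F(j, m) * (mesh[i + 1] - mesh[i]))
    pts.append(mesh[-1])
    return pts

def mesh_weights(mesh, m):
    """Integration weights sigma for the points returned by mesh_points."""
    w = lagrange_weights(m)
    sigma = [F(0)] * ((len(mesh) - 1) * m + 1)
    for i in range(len(mesh) - 1):
        t = mesh[i + 1] - mesh[i]            # interval length t_{i+1}
        for j in range(m + 1):
            sigma[i * m + j] += w[j] * t     # shared points collect two terms
    return sigma

mesh = [F(0), F(1, 3), F(2, 3), F(1)]        # N = 3 equal intervals
sigma = mesh_weights(mesh, 2)                # m = 2
points = mesh_points(mesh, 2)
print(len(points), [str(s) for s in sigma])
print(sum(s * p ** 3 for s, p in zip(sigma, points)))   # approximates the integral of t^3
```

Under the stated assumption, $m = 2$ gives the Simpson coefficients $1/6, 2/3, 1/6$, so on a uniform mesh the weights reduce to the composite Simpson rule, which integrates cubic polynomials exactly.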

The first equation actually represents Nm a linear system consisting of this Jacobian matrix,
equations, one for each combination of i = with a extra row that corresponds to the tangent
0,1, 2 , . . . , N - 1 and k = 1,2,..., m. vector to the solution branch. For example, if N = 3
The Jacobian of the discretized system is (mesh intervals), m = 2 (collocation points), and
sparse. During the continuation process, each n = 2, this matrix has the following sparsity struc-
Newton iteration requires the numerical solution of ture [Doedel et al, 1991]:

/ 2,0,0 3,0,1 3,1,0 3,1,1 3,2,0 3,2,1 .3,0


a

(20)

where the »'s denote elements that are generally


nonzero. The columns of (20) label the unknowns of corresponding to x°>° and xN'° (i.e. ±12)- The
the discretized problem. The first n = 2 rows corre- next to last row in (20) is the derivative of the
spond to the first collocation point, etc. In (11) and discretization of the phase condition (12). The last
(12) there are three unknown quantities: the orbit row, which basically corresponds to Keller's pseudo-
x, the period T and a parameter a. The part of the arclength continuation equation [Keller, 1977], is
Jacobian that corresponds to the first equation in automatically added in our implementation.
(11), has the following form:

[D-Tfx(x,a) -f(x,a) -Tfa(x,a)]. 4.3. Numerical detection of BPC


cycles in generic systems
In (20), D — Tfx(x, a) corresponds to N = 3 blocks,
For generic systems, Corollary 1 provides the key to
of dimension nm x n(m +1), i.e. 4 x 6 . The part of
detect and compute BPC cycles. First we consider
(20) that defines the periodic boundary conditions
the test functions
has the form:

[In 0,nx(Nm—l)n -In 0, TLpc = {^Po,f{x,a))

In (20), these are n = 2 rows following the 4 x = f [*(t)po]*f(x(t),a)dt


6 blocks. These rows contain two nonzero parts, Jo

and respectively

TBPC = {^Po,fp{x,a)) (f/3(x(t),a))dc


On
= / [*(t)po]*ff)(x(t),a)dt are in the range space of MD, or, equivalently, that
Jo
they are orthogonal to the left singular vector of
from Corollary 1. By Proposition 3 ME>. To compute this vector, we expand M by
adding a column w and a row v* so that
*(*)Po
Md w
-Po MDb =
v* 0
is orthogonal to the range of is nonsingular, and we solve the system
D-Tfx(x(t),a)
(21)
M-. So -Si "Or
>{Nm+l)n
MlDb
are
1
Thus the conditions TLPC = 0,TBPC = 0, V>3
equivalent to the statements that
where ipi,^ have Nmn,n components, respec-
f(x(t),a) tively, and ^3 is a scalar. We note that in exact
arithmetic ^3 = 0. So the numerical test func-
On tions are
and TLPCd = ri(f(x(t),a))dc
fp(x(t),a) and
0n
TBpcd = i>l(fp(x(t),a))dc,
are in the range of (21). Now the discretized form of
respectively.
(21) is the square matrix MB, obtained from (20)
by removing the last two rows and columns. To be
precise, if h G Cx([0, l],lR n ), then
4.4. Numerical detection of BPC
h-Tfx(x(t),a)h cycles in families of limit cycles
Mh =
h(0) - h{\) BPC cycles are not generic in families of limit
cycles, but they are common in the case of sym-
and metries, if the branching parameter is also the con-
tinuation parameter; examples are given in Sees.
(h-Tfx(x(t),a)h)dc 6 and 7. The test function in AUTO and CON-
MD(h)dm =
/i(0) - h(l) TENT is the determinant of a small matrix obtained
from (20), by an elimination that preserves the
where ()dm and ()dc denote discretization in mesh rank of the matrix. MATCONT uses a strategy that
points and in collocation points, respectively. requires only the solution of linear systems; it is
Mo has rank defect one, and its right singular based on the fact that in a symmetry-breaking
vector is (${t)q0)dm, or, equivalently, {x(t))dm- BPC cycle MB has rank defect two. Therefore we
For numerical purposes we therefore replace the border MB with two additional rows and columns
conditions TLPC = 0 and TBPC = 0 by the require- to obtain
ments that
MD W\ w2
(f(x(t),a))dc M,Bbb v{ 0 0
On V*2 0 0

so that Mobb is nonsingular in the B P C cycle. Then Proposition 7. Let 0 i , 0 2 G IR' /Vr>+1 , and ip €E
we solve t h e systems I R ^ be as in Proposition 6. Let Bo be a matrix
with the structure of BJO
ipn 1pl2
MDbb 9BPC11 9BPC12 "B •f
9BPC21 9BPC22 _ BD = 01 0 (23)
02" 0
®{Nm+l)n ®(Nm+l)n
1 0 and define voi,vo2 £ H i V D + 1 , 9o\,9D2 G IR by
requiring
0 1
0JV D 0jV c
where ipn,ipi2 have (Nm + \)n components, and V02
vox (24)
$g_{BPC11}$, $g_{BPC12}$, $g_{BPC21}$, and $g_{BPC22}$ are scalar test functions for the BPC. In the BPC cycle they all vanish. In the examples in Secs. 6 and 7 we show that they indeed change sign, and that they can therefore detect the BPC cycles.

4.5. Numerical computation of BPC cycles in families of LPC cycles

In the generic case, $T_{BPCd}$ can be used to locate a BPC cycle on a curve of LPC cycles exactly. This method is implemented in MATCONT. We note that the branching parameter may be different from the continuation parameters. An example is given in Sec. 5.

4.6. Numerical continuation of BPC cycles

The discretization of the branching operator is a matrix $J_D$, that we formally obtain by removing the last row of (20), and replacing $\alpha$ by a specific component. Thus $J_D$ is an $N_D$ by $N_D + 1$ matrix where $N_D = (Nm + 1) \times n + 1$. As in Sec. 2, we shall express that this matrix has rank defect 1. We formulate the essential results, omitting the proofs.

Proposition 6. Let $\phi_1, \phi_2 \in \mathbb{R}^{N_D+1}$ be such that together with the rows of $J_D$ they span $\mathbb{R}^{N_D+1}$. Also, let $\psi \in \mathbb{R}^{N_D}$ be a vector that together with the columns of $J_D$ spans $\mathbb{R}^{N_D}$. Then the bordered matrix

$$B_{J_D} = \begin{pmatrix} J_D & \psi \\ \phi_1^{*} & 0 \\ \phi_2^{*} & 0 \end{pmatrix} \qquad (22)$$

is nonsingular.

$$B_D \begin{pmatrix} v_{D1} & v_{D2} \\ g_{D1} & g_{D2} \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

If $B$ is sufficiently close to $J_D$, then $B_D$ is nonsingular. Furthermore, $B$ has corank 1 if and only if

$$g_{D1} = 0, \quad g_{D2} = 0. \qquad (25)$$

The numerical equations for a branch point of a periodic solution of (1), near a given $(x^0(t), T^0, \alpha^0)$, are defined by the system consisting of (19) and (25), where $B$ is replaced by $J_D$, so that $g_{D1}, g_{D2}$ are functions of the discretized orbit $x(t)$, and of $T$ and $\alpha$. If three components of $\alpha$ are freed, then, generically, a family of branch points can be computed.

In the computations we also need the derivatives of $g_{D1}, g_{D2}$ with respect to the components of $x(t)$, $T$ and $\alpha$. This can be done as in Sec. 2. We solve the adjoint system corresponding to (24)

$$B_D^{*} \begin{pmatrix} w_D \\ g_{D1} \\ g_{D2} \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \qquad (26)$$

where $w_D \in \mathbb{R}^{N_D}$. If $z$ is one of the components of $x$, $T$, or a free parameter, then by taking derivatives of (24), and multiplying from the left with

$$[w_D^{*} \;\; g_{D1} \;\; g_{D2}],$$

we obtain

$$g_{Di,z} = -w_D^{*} B_{Dz} v_{Di}, \quad (i = 1, 2). \qquad (27)$$

The continuation of BPC cycles is supported by MATCONT. We note that the second-order partial

derivatives (the Hessian) of $f$ with respect to $x$ and $\alpha$ are required.

4.7. Numerical computation of BPC cycles in families of limit cycles

In the case of symmetries, where a BPC point occurs along a family of limit cycles, AUTO and CONTENT locate the BPC by a rootfinding procedure on the test function for detection. MATCONT, on the other hand, uses a specific locator algorithm that guarantees quadratic convergence. This locator has many features in common with the numerical continuation described in Sec. 4.6.

The idea is to set up a system based on (19), that contains an artificial scalar unknown $\beta$, and two additional equations:

$$\left( \sum_{j=0}^{m} x^{i,j}\, \ell'_{i,j}(\zeta_{i,k}) \right) - T f\left( \sum_{j=0}^{m} x^{i,j}\, \ell_{i,j}(\zeta_{i,k}), \alpha \right) + \beta p_1 = 0,$$
$$x^{0,0} - x^{N-1,m} + \beta p_2 = 0,$$
$$\sum_{i=0}^{N-1} \sum_{j=0}^{m-1} \sigma_{i,j} \langle x^{i,j}, \dot{x}^{i,j}_{old} \rangle + \sigma_{N,0} \langle x^{N,0}, \dot{x}^{N,0}_{old} \rangle + \beta p_3 = 0, \qquad (28)$$
$$g_{D1}(x, T, \alpha) = 0,$$
$$g_{D2}(x, T, \alpha) = 0,$$

where $g_{D1}, g_{D2}$ are defined as in (23), and $[p_1 \; p_2 \; p_3]^{*}$ is the bordering vector $\psi$ that appears in (22). We solve this system with respect to $x, T, \alpha, \beta$ by Newton's method with initial $\beta = 0$. A branch point $(x, T, \alpha)$ corresponds to a regular solution $(x, T, \alpha, 0)$ of system (28) (see [Beyn et al., 2002, p. 165]). We note again that the second-order partial derivatives (Hessian) of $f$ with respect to $x$ and $\alpha$ are required.

5. Example 1: The A → B → C Reaction

In this section we discuss a generic example, i.e. a model without symmetries. The continuation of BPC points then involves three effective parameters.

The model is that of a continuous stirred tank reactor, with consecutive A → B → C reactions, as studied by [Doedel & Heinemann, 1983]. It has three state variables, $u_1, u_2, u_3$, and five parameters, $p_1, p_2, p_3, p_4, p_5$:

$$\dot{u}_1 = -u_1 + p_1 (1 - u_1) e^{u_3},$$
$$\dot{u}_2 = -u_2 + p_1 e^{u_3} (1 - u_1 - p_5 u_2), \qquad (29)$$
$$\dot{u}_3 = -u_3 - p_3 u_3 + p_1 p_4 e^{u_3} (1 - u_1 + p_2 p_5 u_2).$$

This model is used as a demo in the AUTO manual [Doedel et al., 2001]. In the notation of [Doedel & Heinemann, 1983], we have $u_1 = y$, where $1 - y$ is the concentration of reactant A, $u_2 = z$, the concentration of reactant B, $u_3 = \theta$, the temperature, $p_1 = D$, the Damköhler number, $p_2 = \alpha$, the ratio of reaction heats, $p_3 = \beta$, the heat transfer coefficient, $p_4 = B$, the adiabatic temperature rise, and $p_5 = \sigma$, the selectivity ratio.

Figure 1(a) reproduces the equilibria found in [Doedel & Heinemann, 1983], recomputed with MATCONT. The parameter values are $p_2 = 1$, $p_3 = 1.5$, $p_4 = 8$, $p_5 = 0.04$, with free parameter $p_1$, starting from the equilibrium at $p_1 = 0.1$, for which $u_1 = 0.13304$, $u_2 = 0.13223$, $u_3 = 0.42833$. The curve of equilibria contains four Hopf points, denoted, from left to right, $H_1, H_2, H_3, H_4$, respectively.

Using MATCONT, we reproduce in Fig. 1(b) another equilibrium curve, for the same parameter values, except with $p_2 = 0.9$. This curve looks qualitatively similar to that in Fig. 1(a), and it also has four Hopf points. As shown in [Doedel & Heinemann, 1983], in the case $p_2 = 1$, the Hopf points $H_1$ and $H_4$ are connected by a family of periodic solutions, and $H_2$ and $H_3$ are similarly connected. Figure 2(a) shows the family of periodic solutions that connects $H_1$ to $H_4$.

Interestingly, the situation is different when $p_2 = 0.9$. In this case $H_1$ and $H_2$ are connected by a family of periodic solutions, and so are $H_3$ and $H_4$. Figure 2(b) shows the family of periodic solutions that connects $H_1$ to $H_2$.

As shown in Fig. 2(a) ($p_2 = 1$), the family of solutions that connects $H_1$ to $H_4$ contains three fold bifurcations of periodic solutions, as also observed in [Doedel & Heinemann, 1983]. In Fig. 5(a) we

Fig. 1. (a) Equilibrium curve of the A → B → C reaction for $p_2 = 1$. (b) Equilibrium curve of the A → B → C reaction for $p_2 = 0.9$.
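As a quick consistency check on the reactor model (29), the following minimal sketch evaluates its right-hand side at the starting equilibrium quoted in the text for $p_1 = 0.1$. The function name `abc_rhs` and the tuple layout of the parameters are illustrative assumptions, not part of MATCONT or AUTO; the residual should be small, limited by the five digits to which the equilibrium is quoted.

```python
import math

def abc_rhs(u, p):
    """Right-hand side of the A -> B -> C reactor model (29).

    u = (u1, u2, u3), p = (p1, p2, p3, p4, p5); names are illustrative.
    """
    u1, u2, u3 = u
    p1, p2, p3, p4, p5 = p
    e = math.exp(u3)
    return (
        -u1 + p1 * (1.0 - u1) * e,
        -u2 + p1 * e * (1.0 - u1 - p5 * u2),
        -u3 - p3 * u3 + p1 * p4 * e * (1.0 - u1 + p2 * p5 * u2),
    )

# Parameter values quoted for Fig. 1(a), with the free parameter p1 = 0.1.
params = (0.1, 1.0, 1.5, 8.0, 0.04)
# Starting equilibrium quoted in the text (rounded to five digits).
u_start = (0.13304, 0.13223, 0.42833)

residual = max(abs(r) for r in abc_rhs(u_start, params))
print("max |rhs| at the quoted equilibrium:", residual)
```

A continuation code would refine this point with Newton's method before following the equilibrium curve in $p_1$.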

plot $T_{LPCd}$ versus $p_1$; $T_{LPCd}$ clearly vanishes (and changes sign) at the fold bifurcations.

The observations above imply that, for certain nearby values of other parameters, we can expect an exchange of connections, i.e. a branch point of periodic orbits with respect to $p_1$. In order to locate it, we continue the first fold bifurcation of periodic solutions numerically, freeing both $p_1$ and $p_2$. This

Fig. 2. (a) Family of periodic orbits connecting the first and fourth Hopf points in Fig. 1(a). (b) Family of periodic orbits connecting the first and second Hopf points in Fig. 1(b).

family contains indeed a BPC point with respect to $p_1$; see Figs. 3 and 4. This BPC point was detected as a zero of $T_{BPCd}$; the symbol BPC1 in Figs. 3 and 4 reminds us that the branching is with respect to the first parameter of the system, $p_1$. The critical parameter values are $p_1 = 0.211201156173$, $p_2 = 0.940211847478$. We note that the local extremum with respect to $p_1$ in Fig. 3 corresponds to a cusp in

Fig. 3. LPC curve, with a BPC point with respect to $p_1$, for the A → B → C reaction.

Fig. 4. The family from Fig. 3 in $(p_1, p_2)$-space.



the parameter plane in Fig. 4. Also, in the parameter plane, the branch point with respect to $p_1$ corresponds to a local extremum with respect to $p_2$. In Fig. 5(b) we plot $T_{LPCd}$ and $T_{BPCd}$ versus $p_1$. As expected, $T_{LPCd}$ vanishes at all points of the curve, while $T_{BPCd}$ vanishes (and changes sign) at the BPC point only.

It is now possible to continue the BPC point by freeing a third parameter. Selecting another point on this family, and freezing again the third

Fig. 5. (a) $T_{LPCd}$ on an LC curve. (b) $T_{BPCd}$ and $T_{LPCd}$ on an LPC curve.

parameter we can produce pictures qualitatively similar to Fig. 4.

6. Example 2: An Electronic Circuit

In this section we discuss a nongeneric situation, i.e. a problem with a symmetry, where the continuation of BPC points includes two effective parameters, and one artificial parameter.

The model is that of an autonomous electronic circuit, studied in [Freire et al., 1993]. It has three state variables $x, y, z$, and six parameters $\gamma, r, a_3, b_3, \nu, \beta$:

$$\dot{x} = \left[ -(\beta + \nu) x + \beta y - a_3 x^3 + b_3 (y - x)^3 \right] / r,$$
$$\dot{y} = \beta x - (\beta + \gamma) y - z - b_3 (y - x)^3, \qquad (30)$$
$$\dot{z} = y.$$

This model is also used as a demo in AUTO2000 [Doedel et al., 2001]. It has the trivial solution family, where $x = y = z = 0$, for all parameter values. Moreover, it has the $\mathbb{Z}_2$-symmetry $x \mapsto -x$, $y \mapsto -y$, $z \mapsto -z$.

We start by computing the trivial family, with fixed parameters $\gamma = -0.6$, $r = 0.6$, $a_3 = 0.328578$, $b_3 = 0.933578$, $\beta = 0.5$, and with $\nu$ as free parameter, with initially $\nu = -0.9$. Along this family a Hopf point is detected at $\nu = -0.58933644$, and a branch point of equilibria at $\nu = -0.5$. From the Hopf point we start the computation of a family of periodic solutions, using 25 mesh intervals and four collocation points. This is a family of symmetric solutions of (30); we detect one LPC and two BPC, see Fig. 6.

In Fig. 7 we show how the first BPC point is detected by the simultaneous sign changes of the test functions $g_{BPC11}$, $g_{BPC12}$, $g_{BPC21}$, $g_{BPC22}$ discussed in Sec. 4.4.

To compute the family of BPC points with respect to $\nu$, through the first BPC, with free parameters $\nu, \beta$, we need to introduce an additional free parameter that breaks the symmetry. There are many choices for this; we choose to introduce a parameter $\epsilon$, and extend the system (30) by simply adding a term $+\epsilon$ to the first right-hand side. For $\epsilon = 0$ this reduces to (30), while for $\epsilon \neq 0$ the symmetry is broken. Using the algorithm for the continuation of generic BPC points, with three

Fig. 6. Family of periodic solutions, with LPC and branch points, in the circuit example.
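The bifurcation values quoted for the trivial family of the circuit model (30) can be cross-checked from the linearization at $x = y = z = 0$. The sketch below is only an illustration under stated assumptions: the helper names are hypothetical, and the Hopf point is tested with the standard Routh-Hurwitz condition $c_2 c_1 = c_0$ (with $c_1 > 0$) for the characteristic polynomial $\lambda^3 + c_2 \lambda^2 + c_1 \lambda + c_0$, which is not the test function used by MATCONT.

```python
def jacobian_at_origin(nu, beta=0.5, gamma=-0.6, r=0.6):
    """Jacobian of the circuit model (30) at the trivial solution.

    The cubic terms (with a3 and b3) do not contribute at the origin.
    """
    return [
        [-(beta + nu) / r, beta / r, 0.0],
        [beta, -(beta + gamma), -1.0],
        [0.0, 1.0, 0.0],
    ]

def char_poly_coeffs(J):
    """Return (c2, c1, c0) of lambda^3 + c2*lambda^2 + c1*lambda + c0."""
    (a, b, c), (d, e, f), (g, h, i) = J
    trace = a + e + i
    minors = (a * e - b * d) + (a * i - c * g) + (e * i - f * h)
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return -trace, minors, -det

def hopf_residual(nu):
    """Vanishes when a pair of purely imaginary eigenvalues exists (if c1 > 0)."""
    c2, c1, c0 = char_poly_coeffs(jacobian_at_origin(nu))
    return c2 * c1 - c0

def det_at_origin(nu):
    """Determinant of the Jacobian; a zero signals a zero eigenvalue."""
    return -char_poly_coeffs(jacobian_at_origin(nu))[2]

print("Hopf residual at nu = -0.58933644:", hopf_residual(-0.58933644))
print("det J at nu = -0.5:", det_at_origin(-0.5))
```

Both quantities should be close to zero at the quoted parameter values, which also serves as a check on the form of (30) used here.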

Fig. 7. Evolution of the test functions for branch points in a BPC in the circuit example.

Fig. 8. Curve of BPC points in the circuit example.

free parameters $\nu, \beta, \epsilon$, we continue the curve of nongeneric BPC points, where $\epsilon$ remains zero, up to numerical precision. Figure 8 illustrates the fact that the symmetry is preserved.

7. Example 3

In this section we discuss an even less generic situation than in the preceding example, namely, an equation with several symmetries, where the

continuation of BPC points involves one effective parameter, and two artificial parameters.

Our model is that of the circular restricted three-body problem (CR3BP), as studied, for example, in [Doedel et al., 2003a; Doedel et al., 2003c]:

$$\ddot{x} = 2\dot{y} + x - (1 - \mu)(x + \mu) r_1^{-3} - \mu (x - 1 + \mu) r_2^{-3},$$
$$\ddot{y} = -2\dot{x} + y - (1 - \mu) y r_1^{-3} - \mu y r_2^{-3}, \qquad (31)$$
$$\ddot{z} = -(1 - \mu) z r_1^{-3} - \mu z r_2^{-3}.$$

The larger and smaller primary bodies (say, the Earth and the Moon) are located at $(-\mu, 0, 0)$ and $(1 - \mu, 0, 0)$ respectively, where $\mu$ is the mass-ratio parameter. In the Earth-Moon case, $\mu = 0.01215$. In (31) $r_1 = \sqrt{(x + \mu)^2 + y^2 + z^2}$ and $r_2 = \sqrt{(x - 1 + \mu)^2 + y^2 + z^2}$ are the distances of the negligible-mass satellite $(x, y, z)$ to the two primary bodies, respectively.

We convert (31) to a first-order system in the usual way, by introducing the velocities $v_x, v_y, v_z$ as additional variables; furthermore we introduce artificial parameters $\lambda_1, \lambda_2$, to obtain:

$$\dot{x} = v_x + \lambda_1 \frac{\partial E}{\partial x},$$
$$\dot{y} = v_y + \lambda_1 \frac{\partial E}{\partial y},$$
$$\dot{z} = v_z + \lambda_1 \frac{\partial E}{\partial z},$$
$$\dot{v}_x = 2 v_y + x - (1 - \mu)(x + \mu) r_1^{-3} - \mu (x - 1 + \mu) r_2^{-3} + \lambda_1 \frac{\partial E}{\partial v_x}, \qquad (32)$$
$$\dot{v}_y = -2 v_x + y - (1 - \mu) y r_1^{-3} - \mu y r_2^{-3} + \lambda_1 \frac{\partial E}{\partial v_y},$$
$$\dot{v}_z = -(1 - \mu) z r_1^{-3} - \mu z r_2^{-3} + \lambda_1 \frac{\partial E}{\partial v_z} + \lambda_2,$$

where

$$E = \frac{1}{2}(v_x^2 + v_y^2 + v_z^2) - \frac{1}{2}(x^2 + y^2) - \frac{1 - \mu}{r_1} - \frac{\mu}{r_2}$$

is the Jacobi constant (the energy), which is preserved along orbits.

We are only interested in the case where $\lambda_1 = \lambda_2 = 0$. It is known that in this case, for each value of $\mu$ $(0 < \mu < 1)$, the $(x, y)$ plane contains five equilibria, the so-called libration points. Three of them, L1, L2, L3, are collinear with the primary bodies, while both L4 and L5 form an equilateral triangle with the primaries. Here we compute the planar periodic orbits that arise from L1, the libration point between the two primary bodies. L1 is a solution of the equilibrium equations of (32), with $\lambda_1 = \lambda_2 = y = z = v_x = v_y = v_z = 0$. We detect it by a continuation of the equilibrium equations of (32), with $\mu$ as the free parameter. Since MATCONT can locate zeroes of user functions, we detect L1 as a zero of the function $\mu - 0.01215$.

L1 is a (degenerate) Hopf bifurcation point of (32); we compute the family of planar periodic solutions (with $z = 0$) born there, using 40 mesh intervals and four collocation points, with $\lambda_1$ as the free parameter. Along this family $\lambda_1$ remains zero, up to numerical precision, and a BPC is detected, see Fig. 9.

In Fig. 10, we show how the BPC point is detected by the simultaneous sign changes of the test functions $g_{BPC11}$, $g_{BPC12}$, $g_{BPC21}$, $g_{BPC22}$ discussed in Sec. 4.4.

Selecting this BPC point as starting point, we compute the locus of branch points, using $\mu, \lambda_1, \lambda_2$ as the free parameters. The resulting curve is shown in Fig. 11. We note that the curve that represents the BPC point shrinks to a single point, $x = -1$, $y = 0$, if $\mu$ tends to 1 (left side of Fig. 11), and also to a single point, $x = 1$, $y = 0$, if $\mu$ tends to 0 (right side of Fig. 11).
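The location of L1 for a given mass ratio can also be obtained directly from the equilibrium equations on the $x$-axis. The following minimal sketch uses plain bisection instead of the continuation in $\mu$ described above; the function names, the bracketing offsets and the iteration count are illustrative assumptions.

```python
def collinear_residual(x, mu):
    """x-component of the equilibrium equations of (32) on the x-axis.

    Assumes y = z = 0, zero velocities and lambda1 = lambda2 = 0.
    """
    r1 = abs(x + mu)        # distance to the larger primary at (-mu, 0, 0)
    r2 = abs(x - 1.0 + mu)  # distance to the smaller primary at (1 - mu, 0, 0)
    return x - (1.0 - mu) * (x + mu) / r1**3 - mu * (x - 1.0 + mu) / r2**3

def find_l1(mu, iterations=200):
    """Bisection for the libration point between the two primaries.

    Between the primaries the residual increases monotonically from
    -infinity (near the larger body) to +infinity (near the smaller body).
    """
    lo = -mu + 1e-6
    hi = 1.0 - mu - 1e-6
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if collinear_residual(mid, mu) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

mu_earth_moon = 0.01215
x_l1 = find_l1(mu_earth_moon)
print("x-coordinate of L1:", x_l1)
```

The resulting point can serve as the equilibrium from which the planar periodic family is started.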

Fig. 9. Some planar Lyapunov orbits, with BPC point, and some bifurcating orbits in the CR3BP.

Fig. 10. Evolution of the test functions for branch points in a BPC in the CR3BP example.

Fig. 11. Family of BPC points in the CR3BP.

References

Allgower, E. L. & Georg, K. [1990] Numerical Continuation Methods. An Introduction (Springer-Verlag, NY).
Allgower, E. L. & Schwetlick, H. [1997] "A general view of minimally extended systems for simple bifurcation points," Z. Angew. Math. Mech. 77, 83-97.
Ascher, U. M., Christiansen, J. & Russell, R. D. [1981] "Collocation software for boundary value ODEs," ACM Trans. Math. Softw. 7, 209-222.
Ascher, U. M., Mattheij, R. M. M. & Russell, R. D. [1995] Numerical Solution of Boundary Value Problems for Ordinary Differential Equations (SIAM, Philadelphia).
Beyn, W.-J., Champneys, A., Doedel, E. J., Govaerts, W., Sandstede, B. & Kuznetsov, Yu. A. [2002] "Numerical continuation and computation of normal forms," Handbook of Dynamical Systems, Vol. 2, ed. Fiedler, B. (Elsevier Science), pp. 149-219.
Borisyuk, R. M. [1981] Stationary Solutions of a System of Ordinary Differential Equations Depending upon a Parameter, FORTRAN Software Series, Vol. 6 (Research Computing Centre, USSR Academy of Sciences, Pushchino) [in Russian].
De Boor, C. & Swartz, B. [1973] "Collocation at Gaussian points," SIAM J. Numer. Anal. 10, 582-606.
Dhooge, A., Govaerts, W., Kuznetsov, Yu. A., Mestrom, W. & Riet, A. M. [2003] "CL_MATCONT: A continuation toolbox in MATLAB," Proc. 2003 ACM Symp. Applied Computing (Melbourne, Florida, March 2003), pp. 161-166.
Dhooge, A., Govaerts, W. & Kuznetsov, Yu. A. [2003] "MATCONT: A MATLAB package for numerical bifurcation analysis of ODEs," ACM Trans. Math. Softw. 29, 141-164.
Doedel, E. J. & Heinemann, R. F. [1983] "Numerical computation of periodic solution branches and oscillatory dynamics of the stirred tank reactor with A → B → C reactions," Chem. Engin. Sci. 38, 1493-1499.
Doedel, E. J., Keller, H. B. & Kernevez, J. P. [1991] "Numerical analysis and control of bifurcation problems: (II)," Int. J. Bifurcation and Chaos 1, 745-772.
Doedel, E. J., Champneys, A. R., Fairgrieve, T. F., Kuznetsov, Yu. A., Sandstede, B. & Wang, X. J. [2001] AUTO97-AUTO2000: "Continuation and Bifurcation Software for Ordinary Differential Equations (with HomCont)," User's Guide, Concordia University, Montreal, Canada, http://cmvl.cs.concordia.ca.
Doedel, E. J., Dichmann, D. J., Galan-Vioque, J., Keller, H. B., Paffenroth, R. C. & Vanderbauwhede, A. [2003a] "Elemental periodic orbits of the CR3BP: A brief selection of computational results," in Proc. Equadiff Conf., Maastricht, to appear.

Doedel, E. J., Govaerts, W. & Kuznetsov, Yu. A. [2003b] "Computation of periodic solution bifurcations in ODEs using bordered systems," SIAM J. Numer. Anal. 41, 401-435.
Doedel, E. J., Paffenroth, R. C., Keller, H. B., Dichmann, D. J., Galan, J. & Vanderbauwhede, A. [2003c] "Continuation of periodic solutions in conservative systems with application to the 3-Body problem," Int. J. Bifurcation and Chaos 13, 1-29.
Freire, E., Rodriguez-Luis, A., Gamero, E. & Ponce, E. [1993] "A case study for homoclinic chaos in an autonomous electronic circuit: A trip from Takens-Bogdanov to Hopf-Shilnikov," Physica D62, 230-253.
Golubitsky, M. & Schaeffer, D. G. [1985] Singularities and Groups in Bifurcation Theory, Vol. I (Springer Verlag, NY).
Golubitsky, M., Stewart, I. & Schaeffer, D. G. [1988] Singularities and Groups in Bifurcation Theory, Vol. II (Springer Verlag, NY).
Govaerts, W. [2000] Numerical Methods for Bifurcations of Dynamical Equilibria (SIAM, Philadelphia).
Griewank, A. & Reddien, G. W. [1984] "Characterization and computation of generalized turning points," SIAM J. Numer. Anal. 21, 176-185.
Keller, H. B. [1977] "Numerical solution of bifurcation and nonlinear eigenvalue problems," in Applications of Bifurcation Theory, ed. Rabinowitz, P. H. (Academic Press), pp. 359-384.
Küpper, T., Mittelmann, H. D. & Weber, H. (eds.) [1984] Numerical Methods for Bifurcation Problems, ISNM Vol. 70 (Birkhäuser Verlag, Boston).
Kuznetsov, Yu. A. & Levitin, V. V. [1997] "CONTENT: Integrated environment for analysis of dynamical systems," CWI, Amsterdam: ftp://ftp.cwi.nl/pub/CONTENT
Kuznetsov, Yu. A. [1998] Elements of Applied Bifurcation Theory, 2nd Ed. (Springer-Verlag, NY).
Mei, Z. [1989] "A numerical approximation for the simple bifurcation problems," Numer. Funct. Anal. Optim. 10, 383-400.
Mei, Z. [2000] Numerical Bifurcation Analysis for Reaction-Diffusion Equations (Springer-Verlag, Berlin).
Mittelmann, H. D. & Weber, H. (eds.) [1980] Bifurcation Problems and their Numerical Solutions (Birkhäuser Verlag, Boston).
Moore, G. [1980] "The numerical treatment of non-trivial bifurcation points," Numer. Funct. Anal. Optim. 2, 441-472.
Muñoz-Almaraz, J., Freire, E., Galan, J., Doedel, E. J. & Vanderbauwhede, A. [2003] "Continuation of periodic orbits in conservative and Hamiltonian systems," Physica D181, 1-38.
COARSE-GRAINED OBSERVATION OF DISCRETIZED MAPS
GABOR DOMOKOS
Department of Mechanics, Materials and Structures and
Center for Applied Mathematics and Computational Physics,
Budapest University of Technology and Economics, H-1521 Budapest, Hungary

Received April 30, 2004; Revised June 15, 2004

We investigate why discretized versions $f_N$ of one-dimensional ergodic maps $f : I \to I$ behave in many ways similarly to their continuous counterparts. We propose to register observations of the $N \times N$ discretization $f_N$ on a coarse $M \times M$ grid, with $N = cM$, $c$ being an integer. We prove that rounding errors behave like uniformly distributed random variables, and by assuming their independence, the $M \times M$ incidence matrix $A^M$ associated with the continuous map (indicating which of the $M$ equal subintervals is mapped onto which) can be expected to be identical to the incidence matrix $B^{M,N}$ associated with the aforementioned coarse grid, if $c > \sqrt{\deg(f) N}$, where $\deg(f)$ denotes the degree of $f$. We show how coarse-grained registration can be used as a "digital" definition of an unstable orbit and how this can be applied in real computations. Combination of these results with ideas from the random map model suggests an intuitive explanation for the statistical similarity between $f$ and $f_N$. Our approach is not a rigorous one; however, we hope that the results will be useful for the computational community and may facilitate a rigorous mathematical description.

Keywords: Coarse-grained model; discretization; chaos.

1. Introduction

Computer simulations of ergodic maps $f : I \to I$ necessarily distort the map due to roundoff errors. These simulations have many surprising qualities. For physicists and engineers it may appear as self-evident that by applying sufficiently high arithmetic precision, the aforementioned distortion becomes negligible. This is, however, not always the case. Regard, for example, the "diadic" doubling map $f : x \to 2x \bmod 1$ which, when interpreted as an iteration, exhibits for any typical (irrational) $x_0$ random-like, ergodic behavior. Similar to many other mixing and expanding maps, the diadic map possesses a unique, absolutely continuous, invariant density function $\rho(x)$, and for this map $\rho$ happens to be uniform, so the statistical distribution of the iterated points is uniform on $I$. By applying any finite arithmetic which subdivides the unit interval into $N = 2^k$ subintervals, the iteration ends after a maximum of $k$ steps at the fixed point at $x = 0$. (Since computers do use discretizations of this type, the reader is encouraged to test this statement by writing a 3-line code and running it.) By changing $N$ to $N = 2^k + 1$ we will see radically different dynamical behavior, with several, finite cycles. Each of these $n$ cycles is associated with a discrete density $\rho_{N,i}$, $i = 1, 2, \ldots, n$, so the discrete map typically possesses several densities as opposed to the single one associated with the continuous map. Depending on the choice of initial condition the statistics converges to one of the densities $\rho_{N,i}$, $i = 1, 2, \ldots, n$. Although the diadic map is rather special, it is true for any ergodic map $f$ that the exact dynamical behavior of the discretization $f_N$ (i.e. the number and length


of the cycles) depends sensitively on $N$. One central computational problem is how to restore on the basis of $f_N$ the original density $\rho$. Apparently, the first person to think about this problem was Stanislaw Ulam, who proposed in his book [Ulam, 1960] an averaging scheme for the computation of invariant densities. Later on, several authors used the idea of random perturbations to restore the invariant density $\rho$ from the discrete map $f_N$; fundamental convergence results for expanding maps are due to [Kifer, 1997; Liverani, 2001] and [Keller, 1982], see also [Benettin et al., 1978; Blank, 1988; Gora & Boyarski, 1988]. In [Domokos & Szasz, 2003] the authors slightly improve previous results to identify the minimally necessary random perturbation; also they show that the mentioned "minimally perturbed" scheme is equivalent to a special version of Ulam's original scheme. So, the "engineering puzzle" can be regarded as settled as far as the reconstruction of $\rho$ is concerned, at least in case of expanding maps. However, discretized ergodic maps can surprise not only engineers or physicists, but mathematicians as well.

For the latter, the radical difference between a piecewise continuous map and a discrete one is almost evident. What may come as a surprise is that discrete maps can reproduce several relevant properties of their continuous counterparts in a relatively robust manner. For example, by applying double-precision arithmetic (i.e. $N \approx 2^{53}$) physicists are able to compute unstable cycles of the continuous map with reasonable precision (cf. [Grebogi & Yorke, 1988; Cvitanovic et al., 1999]), and the invariant densities $\rho_{N,i}$ associated with the discrete cycles tend to approximate fairly accurately the unique invariant density $\rho$ of the continuous map [Lanford, 1998]. These quantitative agreements may appear as rather baffling from the mathematical point of view and, as Lanford points out [Lanford, 1998], there is no simple explanation at hand. One possible approach to predict the qualitative behavior (i.e. the length and number of cycles) of discrete maps $f_N$ for very large $N$ is to investigate random maps, where in each iterative step a different map is chosen randomly from the set of all possible discrete maps. In several respects, this approach is equivalent to the drawing of numbered balls from an urn with replacement, until the same number appears the second time. The number of draws between the first and second appearances of this number can be regarded as an approximation of the cycle length. The number of draws before the first appearance may be regarded as the length of the transient leading into a cycle. According to [Lanford, 1998], this random map approach has been first proposed by D. Ruelle in order to explain some features of numerical experiments performed in [Levy, 1982]. This model has been developed further in [Grebogi & Yorke, 1988]. More recent results on the random map model are summarized in [Lanford, 1998] and [Kloeden et al., 1996]. Independently, the same idea appears in [Domokos, 1990]. The random map model predicts cycles of length $\approx \sqrt{N}$ and transients of the same length [Domokos, 1990; Lanford, 1998]. The number of cycles is predicted to grow approximately as $\log N$.

Both previously described approaches introduce randomness into the discrete map (either by adding random perturbation or by regarding the whole system as random) in order to give meaningful predictions for large $N$. In this paper we take a different approach and regard directly the discrete map. Our goal is also to describe asymptotic behavior for large $N$; in particular, we would like to demonstrate that if we regard the discrete map on a coarse-grained level (i.e. on an $M \times M$ grid, with $M \ll N$), then the discrete map does approximate the continuous one. We will show that the incidence matrices (indicating which subinterval is mapped onto which one), associated with the coarse-grained system and the continuous map can be expected to be identical if $M < \sqrt{N}$. In Sec. 2 we give the basic definitions, Sec. 3 proves the main result. In Sec. 4 we outline applications and in Sec. 5 we summarize our results and survey other ways to investigate the discrete to continuous convergence for ergodic maps. Our approach is not a rigorous one; rather, we hope that as a plausible argument it will be useful for the computational community and that it also facilitates a rigorous mathematical description.

2. Definitions and Assumptions

2.1. Definitions and assumptions about the maps $f$ and $f_N$

We investigate maps of the unit interval $f : I \to I$ with independent variable $x \in I$. We will assume that $f(x)$ is piecewise continuous so its derivative $f'(x)$ exists for almost all values of $x$. The map $f$ defines the iterated sequence

$$x_{i+1} = f(x_i), \quad x_i \in [0, 1). \qquad (1)$$
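The behavior of the doubling map on a finite grid, described in the Introduction, is easy to try out. The sketch below is a minimal illustration under the assumption that a grid point $i/N$ is represented by the integer $i$, so that the discretized doubling map acts as $i \mapsto 2i \bmod N$ with exact integer arithmetic; the function names are hypothetical.

```python
def doubling_step(i, n):
    """Discretized doubling map on {0, 1/n, ..., (n-1)/n}: i/n -> (2i mod n)/n."""
    return (2 * i) % n

def steps_to_zero(i, n, max_steps=10_000):
    """Number of iterations needed to reach the fixed point 0 (None if not reached)."""
    steps = 0
    while i != 0:
        i = doubling_step(i, n)
        steps += 1
        if steps > max_steps:
            return None
    return steps

def cycle_lengths(n):
    """Lengths of all cycles of i -> 2i mod n for odd n.

    For odd n the map is a permutation of the grid, so every point lies on a cycle.
    """
    seen = [False] * n
    lengths = []
    for start in range(n):
        if seen[start]:
            continue
        length, i = 0, start
        while not seen[i]:
            seen[i] = True
            i = doubling_step(i, n)
            length += 1
        lengths.append(length)
    return lengths

k = 10
worst = max(steps_to_zero(i, 2**k) for i in range(2**k))
lengths = cycle_lengths(2**k + 1)
print("N = 2^k: longest transient into 0 is", worst, "steps")
print("N = 2^k + 1:", len(lengths), "cycles, longest has length", max(lengths))
```

With $N = 2^k$ every grid point collapses onto $x = 0$ within $k$ steps, while $N = 2^k + 1$ yields several finite cycles, in line with the sensitivity to $N$ noted in the text.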

The discretized map $f_N$ will be interpreted on the $N \times N$ quadratic lattice, where $N$ is the integer describing the arithmetic precision of the computer in the independent variable $x$ (the smallest value of $x$ different from zero is $1/N$). We define the discretized map as

$$f_N\left(\frac{i}{N}\right) = \frac{[N f(i/N)]}{N}, \quad i = 0, 1, 2, \ldots, N, \qquad (2)$$

where $[\ ]$ denotes the integer part of a real number. The discrete map $f_N$ defines an iterated sequence $X_i$:

$$X_{i+1} = f_N(X_i), \quad X_i \in \{0, 1/N, 2/N, \ldots, (N-1)/N\}. \qquad (3)$$

For convenience, the equivalent, "inflated" sequence $N X_i$ of integers can be used as well.

Now we can proceed to define the incidence matrix $A^N$ associated with the continuous map $f(x)$. We denote the $j$th subinterval $[(j-1)/N, j/N)$ by $I_j^N$, and using this notation the incidence matrix is defined as

$$A^N_{i,j} = \begin{cases} 0 & \text{if } f(I_i^N) \cap I_j^N = \emptyset \\ 1 & \text{otherwise.} \end{cases} \qquad (4)$$

The incidence matrix $B^{N,N}$ associated with the discrete map $f_N(i/N)$ is easy to define as

$$B^{N,N}_{i,j} = \begin{cases} 0 & \text{if } f_N\left(\frac{i-1}{N}\right) \neq \frac{j-1}{N} \\ 1 & \text{otherwise.} \end{cases} \qquad (5)$$

2.2. Definitions for the coarse-grained map $f_{M,N}$

The basic idea behind the coarse-grained approach is that we deliberately ignore a certain amount of information contained in the discrete trajectories of $f_N$ by registering the iterated values on a coarse $M \times M$ mesh with

$$N = cM, \quad c \in \mathbb{Z}^{+} \qquad (6)$$

so that instead of the original iterated sequence $X_i$ we regard the modified sequence

$$\tilde{X}_i = [M X_i]/M, \quad \tilde{X}_i \in \{0, 1/M, 2/M, \ldots, (M-1)/M\}, \qquad (7)$$

where $[\ ]$ denotes the integer part of a real number. Here again, the equivalent, "inflated" sequence $M \tilde{X}_i$ of integers may be more convenient to use.

In an analogous manner, one can also look for the coarse-grained version $\tilde{x}_i$ of the sequence $x_i$ in (1) associated with the continuous map:

$$\tilde{x}_i = [M x_i]/M, \quad \tilde{x}_i \in \{0, 1/M, 2/M, \ldots, (M-1)/M\}, \qquad (8)$$

or at the inflated version $M \tilde{x}_i$.

From now on the $N$-mesh will refer to the original, $N \times N$ mesh, the $M$-mesh to the coarse one. Although it would be misleading to speak about a coarse-grained map, we can readily define a coarse-grained incidence matrix $B^{M,N}$ by using the subintervals $I_j^M = [(j-1)/M, j/M)$ in the following way:

$$B^{M,N}_{i,j} = \begin{cases} 1 & \text{if } \exists k \text{ such that } X_k \in I_i^M \text{ and } X_{k+1} \in I_j^M \\ 0 & \text{otherwise.} \end{cases} \qquad (9)$$

(Observe that for $N = M$ the definitions (9) and (5) coincide.) We also define rounding errors on the coarse $M$-mesh as

$$\delta(X, M) = f(X) - f_M(X), \quad X \in \{0, 1/M, 2/M, \ldots, (M-1)/M\}, \qquad (10)$$

and their inflated versions

$$\Delta(X, M) = M \delta(X, M). \qquad (11)$$

Now we will present a plausible argument showing that in the limit as $M \to \infty$ the $\Delta(X_1, M)$, $\Delta(X_2, M)$ values behave like uniformly distributed random variables on $I$ which have arbitrarily small correlation if the inflated distance

$$D = M |X_1 - X_2| \qquad (12)$$

is sufficiently large. We can regard $\Delta(X, M)$ as a function of $M$ for fixed $X$, and we achieve the $M \to \infty$ limit by taking a series

$$M_j = k^j M_0. \qquad (13)$$
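The inflated rounding error (11) along the series (13) can be evaluated exactly with rational arithmetic. The following minimal sketch does so for the logistic map $f(x) = 4x(1-x)$, which is chosen here purely as an illustrative assumption (it is not a map singled out in the text), and checks that multiplying the mesh size by $k$ acts on $\Delta$ as multiplication by $k$ modulo 1. The function names and the values of $X$, $M_0$ and $k$ are likewise hypothetical.

```python
from fractions import Fraction
from math import floor

def logistic(x):
    """Illustrative example map f(x) = 4x(1 - x); works with Fraction arguments."""
    return 4 * x * (1 - x)

def inflated_error(f, X, M):
    """Delta(X, M) = M*f(X) - [M*f(X)], following (10)-(11) with f_M(X) = [M f(X)]/M."""
    y = M * f(X)
    return y - floor(y)

M0, k = 7, 3
X = Fraction(1, M0)  # a point of the initial M0-mesh
errors = [inflated_error(logistic, X, k**j * M0) for j in range(8)]
for j, err in enumerate(errors):
    print(f"j = {j}: M_j = {k**j * M0}, Delta = {err}")
```

Because the initial error is rational here, the sequence is eventually periodic rather than random-like; the point of the sketch is only the exact step-to-step relation between consecutive mesh sizes.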

It is easy to see that $\Delta(X, M)$, which measures the rounding errors relative to the meshsize, will grow by the same factor as the number of meshpoints, of course, modulo 1. This intuition is confirmed, as we can construct a map $\Delta(X, M_j) \to \Delta(X, M_{j+1})$, based on (2), (10), (11) and (13):

$$\Delta(X, M_{j+1}) = M_{j+1}\, \delta(X, M_{j+1})$$
$$= M_{j+1} \left( f(X) - [M_{j+1} f(X)]/M_{j+1} \right)$$
$$= M_{j+1} \left( f(X) - [M_j f(X)]/M_j - [M_{j+1} f(X)]/M_{j+1} + [M_j f(X)]/M_j \right)$$
$$= M_{j+1} \left( \delta(X, M_j) - \left( [M_{j+1} f(X)] - k [M_j f(X)] \right)/M_{j+1} \right)$$
$$= M_{j+1} \left( \delta(X, M_j) - \left[ M_{j+1} f(X) - k [M_j f(X)] \right]/M_{j+1} \right)$$
$$= M_{j+1}\, \delta(X, M_j) - [M_{j+1}\, \delta(X, M_j)]$$
$$= k \Delta(X, M_j) - [k \Delta(X, M_j)]$$
$$= k \Delta(X, M_j) \bmod 1. \qquad (14)$$

Maps of type (14) ("$k$-adic" maps, with $k$ being integer) are known for being ergodic and having an absolutely continuous invariant measure which is uniform [Renyi, 1957]. For typical (irrational) initial values $\Delta(M_0)$ the iterated series (14) will produce random-like numbers, uniformly distributed on $I$, and the correlation between two trajectories decays exponentially with the number of iterates.

Equation (14) describes the evolution of the $\Delta$ errors on the initial $M = M_0$ grid as $M$ is multiplied by powers of $k$, so the errors $\Delta(i/M_0, M_j)$, $\Delta((i+1)/M_0, M_j)$ behave like independent, uniformly distributed random variables for typical (irrational) initial errors $\Delta(i/M_0, M_0)$, $\Delta((i+1)/M_0, M_0)$ and sufficiently high $j$. If we regard the inflated distance (12) associated with two adjacent points of the $M_0$-mesh we obtain $D = M_j |(i+1)/M_0 - i/M_0| = M_j/M_0 = k^j$, which becomes arbitrarily large after a sufficiently high number of iterates on $M$, and this is an important condition guaranteeing the independence of the random variables.

Although the above argument is interesting from the theoretical point of view, its practical implementation needs further explanation. Independence is certainly not true for $\Delta(i/M_j, M_j)$, $\Delta((i+1)/M_j, M_j)$; however, in order to achieve closed-form solutions, we will have to assume this as well. (As we will point out later, the assumption of independence shifts our estimates towards the "safe" side.) It is not true either that the initial error $\Delta(i/M_0, M_j)$ is typically an irrational number. If $f$ is a polynomial with rational coefficients then it will assume rational values at the meshpoints. Also, even if $f$ is not rational, the computer simulation will truncate it to rational values. Nevertheless, we expect that rational trajectories will behave similarly to irrational ones, in the sense that the coarse-grained statistics will be similar to the discretized invariant measure. This expectation is based on the arguments in Sec. 4.3, which, in turn, rely on the main body of the paper. This seems to indicate a logical trap; however, this is not the case. We use "$k$-adic" maps solely to show the uniformness of the rounding errors of "typical" maps; meanwhile "$k$-adic" maps themselves have identically zero rounding errors, so there is no vicious circle in the argument.

We will further assume that $M$ is large enough so that the derivative $f'(x)$ can be regarded as a constant on any subinterval $I_i^M$. Formally, this can be expressed as

$$f(x) \approx a_i x + b_i \quad \text{if } x \in I_i^M, \qquad (15)$$

where $a_i = f'(x_0)$, $b_i = f(x_0) - a_i x_0$, $x_0 = (i-1)/M$. Our next goal is to investigate (based on the previous definitions and assumptions) under which conditions $A^M = B^{M,N}$. The equivalence of the incidence matrices would signal a qualitative agreement between the discrete and the continuous maps, although it does not provide immediately quantitative agreement between the invariant densities or the individual trajectories.

3. The Main Result: The Conditions for $A^M = B^{M,N}$

Assume that the interval $I_i^M$ is mapped by $f$ onto the intervals $I_j^M, I_{j+1}^M, \ldots, I_{j+k}^M$. The linearity assumption (15) implies that the preimages $S_{i,0}, S_{i,1}, \ldots, S_{i,k} \subset I_i^M$ of $I_j^M, I_{j+1}^M, \ldots, I_{j+k}^M$ will be subintervals of equal length $1/(a_i M)$, except for the first and last ($S_{i,0}$ and $S_{i,k}$), which will be typically shorter [cf. Fig. 1(a)].

We can guarantee $A^M = B^{M,N}$ if each of these preimages contains at least one $N$-meshpoint. The first one ($S_{i,0}$) always does, since it contains the

Fig. 1. Interpretation of the rounding error $\delta_i$, the interval $I_i^M$, its images in the intervals $I_j^M, I_{j+1}^M, \ldots, I_{j+k}^M$ and the preimages $S_{i,0}, S_{i,1}, \ldots, S_{i,k} \subset I_i^M$ of the latter in case of (a) positive and (b) negative slope.

point $x = (i-1)/M = c(i-1)/N$. If we can guarantee that

$$|S_{i,k}| > 1/N, \tag{16}$$

then this implies $|S_{i,1}| = |S_{i,2}| = \cdots = |S_{i,k-1}| > 1/N$, so each preimage will contain at least one $N$-meshpoint. The length $|S_{i,k}|$ can be expressed via the rounding error $\delta_i$ (cf. Fig. 1):

$$|S_{i,k}| = \begin{cases} \delta_i/a_i & \text{if } a_i > 0, \\ (\delta_i - 1/M)/a_i & \text{if } a_i < 0. \end{cases} \tag{17}$$

Substituting (16) into (17) and using (11) yields the following conditions for the inflated rounding error $\Delta_i$:

$$\begin{cases} \Delta_i > a_i/c & \text{if } a_i > 0, \\ \Delta_i < 1 + a_i/c & \text{if } a_i < 0. \end{cases} \tag{18}$$

Since we proved that the errors $\Delta_i$ behave like uniformly distributed random variables, the probability $P_i$ associated with the event (16) can be readily obtained via (18) as

$$P_i = P\{|S_{i,k}| > 1/N\} = 1 - \frac{|a_i|}{c}. \tag{19}$$

Since $c \to \infty$ implies $P_i \to 1$, we are looking for the minimal value of $c$ for which $A^M = B^{M,N}$ can be expected. We will denote this minimal value of $c$ by the random variable $\xi$ with distribution

$$P(\xi < c) = \prod_{i=0}^{M-1} P_i. \tag{20}$$

Here we used the assumption that the roundoff values $\Delta_i$ are independent. Without this assumption we could not obtain a closed form solution; on the other hand, observe that the expected value $E(\xi)$ would decrease if we considered $\Delta_i$ to be correlated, so we are erring on the safe side. Since we are interested in the expected value $E(\xi)$, we will replace the individual values $|a_i| = |f'((i-1)/M)|$ of the

derivative by its average, which is identical to the degree $D$ of $f$. So, using (19), (20), (6) we obtain

$$P(\xi < c) = \left(1 - \frac{D}{c}\right)^{N/c}. \tag{21}$$

This distribution function can be differentiated to yield the density

$$p(c) = \left(1 - \frac{D}{c}\right)^{N/c} \times \left( \frac{DN}{c^3\left(1 - \frac{D}{c}\right)} - \frac{N \log\left(1 - \frac{D}{c}\right)}{c^2} \right). \tag{22}$$

We are interested in the asymptotic behavior for $N, M \to \infty$ (implying $N/c \gg 1$). Using the identity [Korn & Korn, 1968]

$$\lim_{n \to \infty} \left(1 + \frac{a}{n}\right)^{n} = e^{a} \tag{23}$$

yields

$$\lim_{N \to \infty} \left(1 - \frac{D}{c}\right)^{N/c} = e^{-DN/c^2}. \tag{24}$$

Taking the first term in the $D/c \ll 1$ Taylor expansion of

$$\frac{DN}{c^3\left(1 - \frac{D}{c}\right)} - \frac{N \log\left(1 - \frac{D}{c}\right)}{c^2} \tag{25}$$

yields $2DN/c^3$, so the density function can be identified in this limit as

$$\lim_{N,M \to \infty} p(c) = \bar{p}(c) = \frac{2DN}{c^3}\, e^{-DN/c^2}. \tag{26}$$

The expected value $E(\xi)$ of $\bar{p}(c)$ can be computed by using the identity [Korn & Korn, 1968]

$$\int_0^{\infty} x^{2k} e^{-a x^2}\, dx = \frac{1 \times 3 \times 5 \times \cdots \times (2k-1)\,\sqrt{\pi}}{2^{k+1} a^{k+1/2}} \tag{27}$$

yielding

$$E(\xi) = \sqrt{\pi D N}. \tag{28}$$

This formula implies that if we use a coarse grid with $c = \sqrt{\pi D N}$ (i.e. $M \approx 0.56\sqrt{N/D}$) then we can expect $A^M = B^{M,N}$.

4. Application of the Coarse-Grained Model

4.1. Verification of formula (28)

4.1.1. Numerical verification

As we pointed out earlier, since the roundoff errors are correlated, we expect to see $A^M = B^{M,N}$ already for somewhat smaller $c$ (larger $M$) than predicted by (28), or, equivalently, we expect to find smaller values for the constant $k$ in numerical experiments on individual maps.

We carried out approximately $10^3$ such experiments on maps $f(x) = ax(1 - x)$ and $f(x) = (ax + b) \bmod 1$ with different values for $a$ and $b$ and for discretizations with $10^3 < N < 10^5$. We assumed that the relationship (28) is qualitatively true, i.e. $E(\xi) = k\sqrt{DN}$, and in each experiment we identified the value of the constant $k$ under the condition that $A^M = B^{M,N}$ (in fact, we just checked the condition that whenever $A^M_{ij} = 1$, this implies $B^{M,N}_{ij} = 1$). We found values in the range $0.4 < k < 2.5$ with a mean value of roughly $k \approx 1$. This numerical result certainly confirms not only the qualitative correctness of (28), but also shows fair quantitative agreement. We can also observe that, as predicted, the mean of the measured $k$ parameters was below the theoretical expected value $k \approx 1.7724$, which indicates the correlation of the rounding errors.

4.1.2. Theoretical verification

We can observe that the role of $N$ and $D$ is symmetric in (28). This is not a coincidence; in fact it confirms the validity of the formula. We proved that the rounding errors $\delta_i$ behave like uniformly distributed random variables. This implies that their preimages under $f$ should behave similarly; the preimages can be regarded as "rounding errors" in the $x$ direction. We found (28) under the condition that the preimages should be larger than $1/N$, which implied that the rounding error should be larger than $D/N$, and we considered $M = N/c$ such independent events. We could equally concentrate on the preimages themselves, however, the factor of $D$ would not enter into the length of the interval. On the other hand, recall that $D$ is the degree of the map $f(x)$, so if we derive our formula based on the horizontal rounding errors then we have to consider $MD = ND/c$ independent events, so $D$ has been smuggled back, and because of the $N$-$D$ symmetry of (28) we arrive at the same result.
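The expected value (28) can be cross-checked numerically from the limiting density (26). The following is only a minimal sketch: the function name `expected_c`, the midpoint quadrature, and the sample values of $D$ and $N$ are assumptions made for illustration and are not part of the original computation. Substituting $t = 1/c$ turns $\int c\,\bar{p}(c)\,dc$ into $2DN\int_0^\infty e^{-DN t^2}\,dt$, which is what the code integrates.

```python
import math

def expected_c(D, N, steps=200_000):
    """Midpoint-rule estimate of E(xi), the integral of c * p(c) dc for the density (26).

    With t = 1/c the integrand becomes 2*D*N*exp(-D*N*t**2) on (0, infinity).
    """
    a = D * N
    t_max = 10.0 / math.sqrt(a)  # the Gaussian tail beyond t_max is negligible
    h = t_max / steps
    total = 0.0
    for j in range(steps):
        t = (j + 0.5) * h
        total += math.exp(-a * t * t)
    return 2.0 * a * total * h

# Illustrative (assumed) parameters: degree D = 2, fine mesh N = 2**20.
D, N = 2, 2**20
c_numeric = expected_c(D, N)
c_formula = math.sqrt(math.pi * D * N)  # formula (28)
M_coarse = N / c_formula                # coarse mesh size M = N/c
print(c_numeric, c_formula, M_coarse)
```

The ratio `M_coarse / sqrt(N / D)` equals $1/\sqrt{\pi} \approx 0.564$, which is the constant quoted as 0.56 after (28).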

4.2. Templates and conservation of information

We can associate the oriented graphs $G^M$, $G^{M,N}$ with the incidence matrices $A^M$, $B^{M,N}$ in a natural manner: both $G^M$ and $G^{M,N}$ have $M$ vertices and if $A^M_{ij} = 1$ ($B^{M,N}_{ij} = 1$, respectively) then the vertices $i$ and $j$ are connected by an edge oriented towards the $j$th vertex. At constant $M$, as we increase $c$ (and thus increase $N$), gradually more and more edges are added to $G^{M,N}$ until it becomes "saturated", i.e. identical to $G^M$. Based on the previous results we expect this to happen at $N \approx M^2$. At this "saturated" stage $G^{M,N}$ can be regarded as a template carrying all the unstable cycles of the map $f(x)$. This, of course, does not imply that one can observe all these cycles in the coarse-grained model, since only a finite number of steps $s$ will agree; we will call these steps "topologically correct". One can give a good estimate of $s$ based on the amount of information carried by computation. We know that the actual discrete map is working on an $N$-mesh, so the amount of information contained in an initial value $x_0 = i/N$ is

$$I_0 = \log_2(N). \tag{29}$$

This amount of information can be utilized in different ways as we iterate $x_0$ forward with $f_N$. One part of the information, which we could call metric information $I_m$, determines how accurately the location of any iterated value is known: obviously

$$I_m = \log_2(M). \tag{30}$$

(If we choose $M = N$ then $I_m = I_0$ and no further information is available.) The remaining part we may call topological information ($I_t$), indicating which vertex of the graph we are visiting. This information is equivalent to the number of "topologically correct" steps:

$$I_t = s. \tag{31}$$

Since we are not creating new information in the process of the iteration we have

$$I_0 = I_m + I_t, \tag{32}$$

which yields via (6), (29) and (30)

$$I_t = \log_2(c), \tag{33}$$

implying, via (31),

$$s = \log_2(c). \tag{34}$$

In the next subsection we will explore the applications of (28) and (34).

4.3. Statistical convergence to the continuous model

As mentioned in the Introduction, numerical experience shows that despite its fundamentally different structure, the discretized map tends to reproduce fairly accurately the invariant density of the continuous one, as long as the discretization is sufficiently fine. It may not be easy even to formulate this statement in a rigorous way, let alone to prove it. However, the idea of coarse-grained lattices coupled with the random map model may help to understand intuitively why the statistical similarities are observed.

In the previous subsection we showed that we can expect $s = \log_2(c)$ "topologically correct" steps on the coarse lattice. By choosing $c > \sqrt{\pi D N}$ according to (28), we can expect $s \approx \log_2 N$ steps which will agree (on the coarse-grained level) with the continuous map. If we regard the coarse lattice as a statistical sampling mesh, then these steps will be "statistically correct" on that mesh, i.e. they will be in the same sampling box as the iterated values of the continuous map. After $s = \log_2 N$ steps the discrete simulation will trail off; its global fate will be determined by the underlying fine ($N$-mesh) discretization. Cycle length will agree on the fine-grained ($N$-mesh) and coarse-grained ($M$-mesh) level. Although we have no direct evidence on the cycle length of the $N$-mesh model, the random map approach [Domokos, 1990; Lanford, 1998] suggests that for high $N$ the cycles can be expected to be of length $L \approx \sqrt{N}$. As long as the cycle is not finished, the coarse-grained iteration can be regarded as a series of subsequent, statistically correct segments of length $s = \log_2 N$, and the random map predicts that we will have approximately $\sqrt{N}/\log_2 N$ such segments. For high $N$, this is more than sufficient to produce reasonably good statistical data on the $M \approx \sqrt{N}$-mesh. (In fact, since in the random map model the transients leading into the cycles have the same expected length as the cycles themselves, we can expect even more "statistically correct" segments.)

Although the above considerations are far from rigorous, they may help to develop more rigorous arguments.

4.4. Numerical example

Equation (28) and the measured average value $k \approx 1$ suggest the following strategies for numerical applications. If we define a double-precision

(52 bits) variable and carry out the iteration (3) at this arithmetic precision ($N = 2^{52}$), but register the iterated values only at single precision (26 bits), regarding the latter as the coarse-grained sequence (7) ($M = 2^{26}$), then we have $M = c = \sqrt{N}$, and, assuming $D = 2$, this yields $k = 0.707$. With this arrangement we can expect the incidence matrices of the coarse lattice and the continuous map to be approximately equal, so we can explore the structure of $A^M$ with $M = 2^{26}$.

Based on (34) we can expect the number of correct steps $s$ to be approximately 26, so we can hope to find cycles of length up to approximately 13. However, one could make more specific claims as well, since Eq. (34) could serve as a "digital definition" for unstable cycles. If we regard the number of correct steps $s$ as a function of $c$ and, as we are varying $c$ over a certain range, $s(c)$ follows the rule given in (34), then we can be more confident to have identified an unstable cycle digitally. Of course, the number of "correct" steps is difficult to measure if the steps are not known in advance. However, assume that the cycle length $L$ is small compared to $c$. As we increase $c$, we will observe the identical repetition of the same periodic pattern of integers $MX_i \in \{0, 1, 2, \ldots, M-1\}$ (cf. (7)) on the coarse $M$-lattice, and each time the discrete pattern is repeated we can say that the number of correct steps has increased by $L$. The number of registered repeating patterns (and thus our confidence in having discovered a cycle) grows with $c$, while our information about the location of the cycle decreases simultaneously. This inverse relationship is expressed formally in (6) and in its logarithmic version (32).

As we mentioned in the Introduction, the diadic map $f(x) = 2x \bmod 1$ shows extremely negative properties for $N = 2^k$ discretizations: all trajectories end after maximum $k$ steps at $x = 0$. At first sight the discrete and the continuous maps have little in common. However, this example is an almost trivial illustration of our method: it is well-known

Table 1. $N = 2^{21}$, $M = 2^2$ computation of the unstable cycle {1/9, 2/9, 4/9, 8/9, 7/9, 5/9} in the diadic map $f(x) = 2x \bmod 1$ for 21 steps. First column: serial number $i$ of step. Second column: $x_i$, the iteration on the continuous map. Third column: $X_i$, iteration on the $N$-lattice. Fourth column: $|x_i - X_i|$, difference between the continuous map and the $N$-discretization. Fifth column: $NX_i$, "$N$-inflated" value of $X_i$. Sixth column: $MX_i$, "$M$-inflated" value of the coarse-grained iteration. Seventh column: $Mx_i$, "$M$-inflated" value of the continuous iteration.

i   x_i   X_i   |x_i - X_i|   NX_i   MX_i   Mx_i

0 0.1111111 0.1111106 0.0000004239 233016 0 0


1 0.2222222 0.2222213 0.0000008477 466032 0 0
2 0.4444444 0.4444427 0.0000016954 932064 1 1
3 0.8888888 0.8888855 0.0000033908 1864128 3 3
4 0.7777777 0.7777771 0.0000067817 1631104 3 3
5 0.5555555 0.5555419 0.0000135630 1165056 2 2
6 0.1111111 0.1110839 0.0000271260 232960 0 0
7 0.2222222 0.2221679 0.0000542530 465920 0 0
8 0.4444444 0.4443359 0.0001085000 931840 1 1
9 0.8888888 0.8886718 0.0002170100 1863680 3 3
10 0.7777777 0.7773437 0.0004340200 1630208 3 3
11 0.5555555 0.5546875 0.0008680500 1163264 2 2
12 0.1111111 0.1093750 0.0017361000 229376 0 0
13 0.2222222 0.2187500 0.0034722000 458752 0 0
14 0.4444444 0.4375000 0.0069444000 917504 1 1
15 0.8888888 0.8750000 0.0138888888 1835008 3 3
16 0.7777777 0.7500000 0.0277777777 1572864 3 3
17 0.5555555 0.5000000 0.0555555555 1048576 2 2
18 0.1111111 0.0000000 0.1111111111 0 0 0
19 0.2222222 0.0000000 0.2222222222 0 0 0
20 0.4444444 0.0000000 0.4444444444 0 0 1
21 0.8888888 0.0000000 0.8888888888 0 0 3
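
The integer columns of Table 1 can be regenerated with exact arithmetic. The following is a minimal sketch, not the original program: the function name `dyadic_table` and the row layout are assumptions made for illustration. The continuous orbit is iterated with exact fractions, the $N$-lattice orbit with integers $NX_i$, and both are reduced to the coarse $M = 4$ lattice by taking integer parts.

```python
from fractions import Fraction

def dyadic_table(k=21, M=4, steps=22):
    """Iterate f(x) = 2x mod 1 exactly and on the N = 2**k lattice.

    Returns rows (i, N*X_i, coarse value of the lattice orbit, coarse value of the exact orbit).
    """
    N = 2**k
    x = Fraction(1, 9)  # exact point of the unstable 6-cycle
    n = N // 9          # integer N*X_0, i.e. X_0 = floor(N/9)/N
    rows = []
    for i in range(steps):
        coarse_discrete = (M * n) // N  # M-inflated coarse value of the lattice orbit
        coarse_exact = int(M * x)       # M-inflated coarse value of the exact orbit
        rows.append((i, n, coarse_discrete, coarse_exact))
        x = (2 * x) % 1
        n = (2 * n) % N
    return rows

rows = dyadic_table()
# Index of the first step at which the two coarse sequences differ.
first_mismatch = next(i for i, _, cd, ce in rows if cd != ce)
for row in rows:
    print(*row)
```

In this sketch the two coarse sequences first differ at step 20, so steps 0 through 19 agree, as in the last two columns of the table.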

that the diadic map delivers the binary expansion of the initial value $x_0$ if one registers whether the iterated numbers are on the first or second continuous segment of the map. The meshpoints of an $N = 2^k$ discretization have exactly $k$ nontrivial binary digits. In our terminology this corresponds exactly to taking $M = 2$, and (34) predicts that the initial value and the first $s = \log_2(c) = \log_2(N/M) = k - 1$ binary digits of any trajectory will be computed correctly; this is the "$M$-inflated" series $MX_i$, the latter defined in (7).

Below we illustrate a slightly less trivial case $M = 4$: we describe the computation of the unstable cycle {1/9, 2/9, 4/9, 8/9, 7/9, 5/9} of length 6 in the diadic map. The "$M$-inflated" iterated series (8) for $M = 4$ is {0, 0, 1, 3, 3, 2} for this cycle. We set $N = 2^{21}$ and $X_0 = [N/9]/N$ and computed the discrete sequence $X_i$ and its coarse-grained counterpart for 21 steps. The results are summarized in Table 1. As can be observed, the "$M$-inflated" integer series $MX_i$ and $Mx_i$ in the sixth and seventh columns, belonging to the $N$-discretized and the continuous map, respectively, agree up to 19 steps, which confirms the prediction in (34): $s = \log_2(c) = \log_2(N/M) = 19$.

5. Summary and Related Topics

In this paper we presented a "direct" approach to the main question why discretized maps resemble their continuous counterparts. We showed that in case of an $N$-discretization the coarse-grained model with an $M = N/c$-mesh is approaching the continuous one in the sense that the appropriate incidence matrices become identical as we increase $c$. We showed that for $c \approx \sqrt{\pi D N}$ one can expect the two matrices to have the same entries. This suggests performing computations in double-precision arithmetic but registering the numbers only at single precision.

The principal benefit of the agreement between the incidence matrices is that at that stage all unstable cycles of the continuous map will appear on the coarse grid temporarily. The number of correct steps will be $s \approx \log_2(N/M)$, so if we are looking for longer cycles we have to settle for less accurate information concerning their location. Although this sounds plausible, we believe that the quantitative relationship derived in this paper can be useful. Combining these results with those from the random map model [Domokos, 1990; Lanford, 1998] offered an intuitive explanation why invariant measures can often be reliably reproduced in numerical experiments.

The relationship between discrete and continuous maps has been approached in various ways. The random map model [Lanford, 1998; Domokos, 1990] aims to predict the cycle and transient length for high $N$. The random perturbation methods pioneered by [Kifer, 1997; Liverani, 2001] aim to reconstruct the invariant measure associated with the continuous map based on the discrete map. (Domokos and Szasz [2003] determined the minimal amount of necessary perturbation.) Although rather efficient in achieving their goals, neither of these approaches looks directly at the discrete map (either they substitute it with a random process or they add a random process). Our approach in this paper was different since we regarded directly the discrete map, and tried to squeeze out information which is relevant when studying the continuous map. The first results consistent with this philosophy appeared in [Domokos, 1990], where the quality of the discrete model is defined as $Q \in [0,1]$, and $Q = 1$ if the discrete model agrees with the continuous model in the following sense: an interval of random length and random location (both chosen uniformly) is visited with probability 1 by the discrete iterated sequence. In [Domokos, 1990] the formulae for this probability are derived, based on the incidence matrix of the discrete map. The plots showing $Q(N)$ for specific maps are interesting; they suggest that $Q \to 1$ as $N \to \infty$, but it may not be easy to prove this statement. Still another direct approach would be to study deterministic iterations on randomly generated lattices.

Acknowledgments

Several ideas in this paper originated in conversations with Tamas Tel, Mike Shub and Oscar Lanford, whom the author would like to thank. Tamas Tel also pointed out several useful formulas and helped to shape the paper. This work was supported by OTKA grant T046646 and the Bolyai Research Fellowship.

References

Benettin, G. et al. [1978] "On the reliability of numerical studies of stochasticity," Nuovo Cimento B44, 183-195.
Blank, M. L. [1988] "Metric properties of epsilon-trajectories of dynamical systems with stochastic behaviour," Ergod. Th. Dyn. Syst. 8, 365-378.

Cvitanovic, P. et al. [1999] Classical and Quantum Chaos, 1st edition, Niels Bohr Institute, Copenhagen, http://www.nbi.dk/ChaosBook/.
Domokos, G. [1990] "Digital modelling of chaotic motion," Studia Sci. Math. Hung. 25, 323-341.
Domokos, G. & Szasz, D. [2003] "Ulam's scheme revisited: Digital modeling of chaotic attractors via micro-perturbations," Discr. Contin. Dyn. Syst. A4, 859-876.
Gora, P. & Boyarsky, A. [1988] "Why computers like Lebesgue measure," Comput. Math. Appl. 16, 321-329.
Keller, G. [1982] "Stochastic stability in some chaotic dynamical systems," Monatshefte der Math. 94, 313-333.
Kifer, Yu. [1997] "Computations in dynamical systems via random perturbations," Discr. Contin. Dyn. Syst. 3, 457-476.
Kloeden, P., Diamond, P., Klemm, A. & Pokrovski, A. [1996] "Basin of attraction of cycles of discretizations of dynamical systems with SRB invariant measures," J. Stat. Phys. 84, 713-733.
Korn, G. A. & Korn, T. M. [1968] Mathematical Handbook for Scientists and Engineers, 2nd edition (McGraw-Hill Book Company, NY).
Lanford, O. E. [1998] "Informal remarks on the orbit structure of discrete approximations to chaotic maps," Experim. Math. 7, 317-324.
Levy, Y. E. [1982] "Some remarks about computer studies of dynamical systems," Phys. Lett. A88, 1-3.
Liverani, C. [2001] "Rigorous numerical investigation of the statistical properties of piecewise expanding maps. A feasibility study," Nonlinearity 14, 463-490.
Ott, E., Grebogi, C. & Yorke, J. A. [1988] "Roundoff-induced periodicity and the correlation dimension in chaotic attractors," Phys. Rev. A38, 3688-3692.
Renyi, A. [1957] "Representations of real numbers and their ergodic properties," Acta Math. Akad. Sc. Hung. 8, 477-493.
Ulam, S. [1960] Problems in Modern Mathematics (Interscience Publishers).

MULTIPLE HELICAL PERVERSIONS OF FINITE,
INTRINSICALLY CURVED RODS
G. DOMOKOS
Department of Mechanics, Materials and Structures and
Center for Applied Mathematics and Computational Physics,
Budapest University of Technology and Economics,
H-1521 Budapest, Hungary
T. J. HEALEY
Department of Theoretical and Applied Mechanics,
Cornell University, Ithaca, NY 14853-1503, USA

Received April 15, 2004; Revised June 15, 2004

We investigate mechanical spatial equilibria of slender elastic rods with intrinsic curvature. Our
work is, to some extent, motivated by papers [Goriely & Tabor, 1998; Goriely & McMillen,
2002]. There such rods of infinite length were recently studied to quantify the behavior of
botanical filaments. In particular, an adequate explanation for the existence of helical perver-
sions (the transition between helical segments of opposite handedness) is provided in [Goriely
& Tabor, 1998]. However, this theory fails to describe multiple perversions, which can be
observed in Nature. In contrast, we formulate a two-point boundary-value problem describ-
ing rods of finite length with initial curvature and clamped ends. We identify trivial solutions
as straight configurations and also k-covered circles, rigorously establish the existence of local
bifurcations, and then compute global solutions via the Parallel Hybrid Algorithm [Domokos
& Szeberenyi, 2004] to find spatially complex equilibria characterized by multiple perversions.
Based on computational results and the White-Fuller theorem [White, 1969; Fuller, 1971;
Calugareanu, 1961] we describe a heuristic global picture of the bifurcation diagram, which
can serve as an explanation for the evolution of physically observable tendril shapes.

Keywords: Intrinsic curvature; rod theory; bifurcations; helical perversion; botanical tendrils.

1. Introduction

In this paper we examine spatially complex equilibria of rods possessing intrinsic curvature. Such models have recently been employed to quantify the behavior of botanical filaments [Goriely & Tabor, 1998; Goriely & McMillen, 2002]. As observed already by Darwin [1888], the tendrils of climbing plants often assume configurations consisting of subsequent helices of opposite handedness (see Fig. 1 for Darwin's original drawing and Fig. 2 for an example in a tropical rainforest). The transition between the different helical segments is referred to as "perversion". Goriely and McMillen [2002] give various examples of perversions occurring in long thin filamentary structures including umbilical cords. Following the ideas presented in [Goriely & Tabor, 1998], they offer a simple mechanical model predicting the occurrence of a single perversion: they study static equilibria of an infinitely long Kirchhoff rod of constant intrinsic curvature. Regarded as an initial value problem, this approach permits the application of dynamical systems tools. In that context a perversion is represented by a heteroclinic orbit joining asymptotically two fixed points, the latter corresponding to helices with opposite handedness. While offering an adequate

explanation for the existence of helical perversions, this theory fails to describe multiple perversions which can be observed in Nature (cf. Fig. 2) as depicted in Darwin's drawings (cf. Fig. 1). Also, due to the infinite length of the model, the genesis of helical perversions cannot be described by this model. (In [Goriely & Tabor, 1998] finite rods are also mentioned, however, just in the context of periodic solutions of the initial value problem, i.e. the boundary conditions are not specified.)

Fig. 1. Multiple helical perversions. Drawing by Darwin [1888].

Fig. 2. Multiple helical perversions. Daintree National Park, Queensland, Australia.

Like the approach in [Goriely & McMillen, 2002], we also look for spatial equilibria of slender, elastic rods with constant initial curvature. However, we treat boundary value problems for rods of finite length. Our formulation is motivated by the following hand-held experiment: Take any helical telephone cord and straighten a finite segment of the cord by holding it firmly with two hands and stretching it out. If the two ends are then slowly displaced toward each other, a single perversion is born on the segment. Accordingly we consider a uniform rod with intrinsic curvature, initially occupying a straight configuration with clamped ends. The clamped ends also provide a good model for the relatively firm attachment of tendrils to their environment. We first provide a detailed local analysis of bifurcations from the straight state, as the applied end tension in the rod is relaxed. In particular, we demonstrate the birth of a single perversion that is locally stable, in accordance with the hand-held telephone cord experiment described above. We then combine numerical results with geometrical ideas based on the White-Fuller theorem (cf. [White, 1969]) to develop a global picture of the bifurcation diagram. In particular, we identify equilibria characterized by an arbitrary number of perversions with intermittent helical segments. These configurations are apparently similar to those observed on plants or on telephone cords.

In recent years, there has been considerable interest in the descriptions of spatially complex equilibria of finite elastic rods serving as models in biology. In particular, the geometry of twisted rings has been studied as a model for DNA configurations. Global, symmetry-based analysis for the contact-free problem is carried out in [Domokos, 1995; Domokos & Healey, 2001], and the global problem with contact is solved in [Coleman et al., 1995; Swigon et al., 1998; Tobias et al., 1994], see also [Li & Maddocks].

In the current problem of tendril perversion, the clamped-clamped boundary conditions admit an interesting family of trivial solutions, consisting of the straight rod and the series of $k$-covered circles. In Sec. 2 we describe the fundamental equations and analyze the straight, trivial solution. We prove that classical, planar Euler buckling modes are possible in compression. Subsequently, for rods with sufficiently high initial curvature, we identify spatial modes in tension. The first part of Sec. 3 describes our computational approach, the Parallel Simplex Algorithm and the results obtained with this method. In Sec. 3.2 we describe a heuristic, global picture of the bifurcation diagram which was confirmed both by analysis and computations. In particular, in Sec. 3.4 we apply the previous results to explain the existence and genesis of multiple tendril perversions. Our global picture is based on the application of the White-Fuller theorem [White, 1969]. In Sec. 4 we summarize our results and draw conclusions.

2. Analysis

2.1. Formulation of the governing equations

Let $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ denote a fixed, right-handed, orthonormal basis for $\mathbb{E}^3$. We consider a straight reference configuration parallel to $\mathbf{e}_3$. Let "$s$" denote the arclength coordinate (of the centerline) in the undeformed rod, and let $\mathbf{r}(s)$ denote the position vector (with respect to some fixed origin) of the material point originally at "$s$" in the reference configuration. We let $\mathbf{R}(s)$ denote the rotation of the cross-section spanned by $\{\mathbf{e}_1, \mathbf{e}_2\}$ at "$s$" in the undeformed rod. The first two unit vectors of the orthonormal field defined by

$$\mathbf{d}_i(s) = \mathbf{R}(s)\mathbf{e}_i, \quad i = 1,2,3, \tag{1}$$

are called directors in the special Cosserat theory, which we employ here. The deformed configuration of the rod is uniquely specified by the fields $\mathbf{r}(s)$ and $\mathbf{R}(s)$.

For simplicity, we consider only inextensible, unshearable rods, viz.

$$\mathbf{r}' = \mathbf{d}_3. \tag{2}$$

Next, we differentiate (1) to get

$$\mathbf{d}_i' = \mathbf{R}'\mathbf{R}^T \mathbf{d}_i, \quad i = 1,2,3. \tag{3}$$

Since the tensor field

$$\mathbf{K} = \mathbf{R}'\mathbf{R}^T \tag{4}$$

is skew-symmetric, there is a unique vector field $\boldsymbol{\kappa}$ such that

$$\mathbf{d}_i' = \boldsymbol{\kappa} \times \mathbf{d}_i, \quad i = 1,2,3, \tag{5}$$

i.e. $\boldsymbol{\kappa}$ is the axial vector of $\mathbf{K}$. We then write

$$\boldsymbol{\kappa} = \kappa_j \mathbf{d}_j, \tag{6}$$

where $\kappa_1$ and $\kappa_2$ are "bending curvatures" and $\kappa_3$ is the "twist".

We let $\mathbf{n}(s)$ and $\mathbf{m}(s)$ denote the internal contact force and internal contact couple, respectively, acting on the cross-section originally at "$s$" in the reference configuration. We write

$$\mathbf{n} = n_j \mathbf{d}_j, \quad \text{and} \quad \mathbf{m} = m_j \mathbf{d}_j. \tag{7}$$

Recall that the $n_i$ and $m_i$, $i = 1,2,3$, are called forces and moments, respectively, cf. [Alexander & Antman, 1982]; $n_1, n_2$ are "shear forces", $n_3$ is the "axial force", $m_1, m_2$ are "bending moments", and $m_3$ is the "torque" or "twisting moment". For a homogeneous hyperelastic rod, we assume the existence of a sufficiently smooth, scalar-valued stored energy function, $W(\kappa_1, \kappa_2, \kappa_3)$, such that

$$m_j = \frac{\partial W}{\partial \kappa_j}, \quad j = 1,2,3. \tag{8}$$

In accordance with the presumed curvature of the rod in a relaxed state, we assume that

$$W(\mu, 0, 0) = 0 \text{ is the global minimum of } W(\kappa_1, \kappa_2, \kappa_3), \tag{9}$$

where $\mu \neq 0$ is the intrinsic curvature. In addition, we make the physically reasonable assumption that the Hessian matrix

$$D^2 W(\cdot) \text{ is positive definite on } \mathbb{R}^3. \tag{10}$$

We further assume that the straight rod admits two distinct transverse symmetries: a proper rotation of 180° about $\mathbf{e}_2$, and a reflection across the plane spanned by $\{\mathbf{e}_2, \mathbf{e}_3\}$. It is not hard to show (cf. [Healey, 2002, Sec. 7]) that these two operations induce the actions $(\kappa_1, \kappa_2, \kappa_3) \to (\kappa_1, -\kappa_2, \kappa_3)$ and $(\kappa_1, \kappa_2, \kappa_3) \to (\kappa_1, -\kappa_2, -\kappa_3)$, respectively. Accordingly, we require the stored energy function to satisfy

$$W(\kappa_1, -\kappa_2, \kappa_3) = W(\kappa_1, \kappa_2, \kappa_3), \tag{11}$$

$$W(\kappa_1, -\kappa_2, -\kappa_3) = W(\kappa_1, \kappa_2, \kappa_3). \tag{12}$$

Condition (11) implies that $W$ is an even function of $\kappa_2$, and then (12), in turn, implies evenness of $W$ in the argument $\kappa_3$ as well, viz.

$$W(\kappa_1, \kappa_2, \kappa_3) = \Phi(\kappa_1 - \mu, \kappa_2^2, \kappa_3^2), \tag{13}$$

where $\Phi$ is some sufficiently smooth function on $\mathbb{R} \times [0, \infty) \times [0, \infty)$. From (8) we obtain

$$\begin{aligned} m_1 &= D_1\Phi(\kappa_1 - \mu, \kappa_2^2, \kappa_3^2), \\ m_2 &= 2 D_2\Phi(\kappa_1 - \mu, \kappa_2^2, \kappa_3^2)\,\kappa_2, \\ m_3 &= 2 D_3\Phi(\kappa_1 - \mu, \kappa_2^2, \kappa_3^2)\,\kappa_3, \end{aligned} \tag{14}$$

where "$D_i\Phi$" denotes the partial derivative of the function $\Phi$ with respect to its $i$th argument, $i = 1,2,3$.
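
The passage from (8) and (13) to (14) is an application of the chain rule, and it can be checked numerically. The following is a minimal sketch under stated assumptions: the quadratic form chosen for $\Phi$, the moduli `A`, `B`, `C`, the value of `mu`, and all function names are hypothetical and serve only as an illustration. The moments are computed once by finite differences of $W$, as in (8), and once from the closed form (14).

```python
def phi(x, y, z, A=1.0, B=0.7, C=0.5):
    """Hypothetical quadratic Phi(x, y, z); A, B, C are illustrative moduli."""
    return 0.5 * (A * x * x + B * y + C * z)

def W(k1, k2, k3, mu=2.0):
    """Stored energy in the form (13): W = Phi(k1 - mu, k2**2, k3**2)."""
    return phi(k1 - mu, k2 * k2, k3 * k3)

def moments_fd(k, h=1e-6):
    """Moments m_j = dW/dk_j from (8), by central finite differences."""
    m = []
    for j in range(3):
        kp, km = list(k), list(k)
        kp[j] += h
        km[j] -= h
        m.append((W(*kp) - W(*km)) / (2 * h))
    return m

def moments_formula(k, mu=2.0, A=1.0, B=0.7, C=0.5):
    """Moments from (14); for this Phi, D1 Phi = A*(k1 - mu), D2 Phi = B/2, D3 Phi = C/2."""
    k1, k2, k3 = k
    return [A * (k1 - mu), 2 * (B / 2) * k2, 2 * (C / 2) * k3]

k = (0.3, -1.1, 0.8)  # arbitrary sample curvatures
m_fd = moments_fd(k)
m_formula = moments_formula(k)
print(m_fd, m_formula)
```

The same functions can be used to confirm the symmetry conditions (11) and (12) and the normalization $W(\mu, 0, 0) = 0$ in (9) for this particular energy.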

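The simplest example of a stored energy function fulfilling (9)-(15) is the Kirchhoff model

\[ W(\kappa_1,\kappa_2,\kappa_3) = \tfrac{1}{2}\left[A(\kappa_1-\mu)^2 + B\kappa_2^2 + C\kappa_3^2\right], \tag{15} \]

where $A, B, C > 0$ are the elastic moduli. Of course we can obtain (15) directly from (9), (10) and (13) via a truncated Taylor expansion of the latter about $(\kappa_1,\kappa_2,\kappa_3) = (\mu,0,0)$, where $A = D_1^2\Phi(0,0,0)$, $B = 2D_2\Phi(0,0,0)$, $C = 2D_3\Phi(0,0,0)$.

In the absence of body forces and body couples, the well-known local forms of balance of forces and moments are

\[ \mathbf{n}' = \mathbf{0}, \tag{16} \]
\[ \mathbf{m}' + \mathbf{d}_3 \times \mathbf{n} = \mathbf{0}, \tag{17} \]

respectively. We impose "clamped" conditions at each end:

\[ \mathbf{R}(0) = \mathbf{R}(1) = \mathbf{I}. \tag{18} \]

In addition we fix the left end; we constrain the right end to move along $\mathbf{e}_3$ while prescribing the $\mathbf{e}_3$ component of the force:

\[ \mathbf{r}(0) = \mathbf{0}, \qquad \mathbf{e}_\alpha \cdot \mathbf{r}(1) = 0, \quad \alpha = 1,2, \tag{19} \]
\[ \mathbf{e}_3 \cdot \mathbf{n}(1) = \lambda. \tag{20} \]

We now express the field equations in a convenient component form. Recalling (6) and (7), we define the triples

\[ \mathsf{n} = (n_1,n_2,n_3), \quad \mathsf{m} = (m_1,m_2,m_3) \quad \text{and} \quad \kappa = (\kappa_1,\kappa_2,\kappa_3), \tag{21} \]

and we define a unique skew matrix $\mathsf{K}$ via

\[ \kappa \times \mathsf{a} = \mathsf{K}\mathsf{a} \quad \text{for all } \mathsf{a} \in \mathbb{R}^3. \tag{22} \]

Using (5), we then express (16) and (17) with respect to the convected basis $\{\mathbf{d}_1,\mathbf{d}_2,\mathbf{d}_3\}$:

\[ \mathsf{n}' + \kappa \times \mathsf{n} = 0, \tag{23} \]
\[ \mathsf{m}' + \kappa \times \mathsf{m} + (0,0,1) \times \mathsf{n} = 0. \tag{24} \]

On the other hand, we write $\mathbf{r}$ and $\mathbf{R}$ with respect to the fixed basis:

\[ \mathbf{r} = r_i\mathbf{e}_i, \qquad \mathsf{r} = (r_1,r_2,r_3), \tag{25} \]
\[ \mathbf{R} = R_{ij}\,\mathbf{e}_i \otimes \mathbf{e}_j, \qquad \mathsf{R} = \begin{pmatrix} R_{11} & R_{12} & R_{13} \\ R_{21} & R_{22} & R_{23} \\ R_{31} & R_{32} & R_{33} \end{pmatrix}. \tag{26} \]

Then (1), (2), (4), (25), (26) lead to

\[ \mathsf{r}' = \mathsf{R}\,(0,0,1)^T = (R_{13},R_{23},R_{33}), \tag{27} \]
\[ \mathsf{R}' = \mathsf{R}\mathsf{K}. \tag{28} \]

2.2. Planar configurations

We consider the possibility of planar solutions in this section. We show that such solutions may occur only in the plane spanned by $\{\mathbf{e}_2,\mathbf{e}_3\}$. In particular, this is true in the special case (15) with $A = B$. To this end, it is convenient to first consider the description of deformation with respect to some other fixed orthonormal basis $\{\mathbf{a}_1,\mathbf{a}_2,\mathbf{a}_3\}$, where

\[ \mathbf{a}_1 = \cos\psi\,\mathbf{e}_1 + \sin\psi\,\mathbf{e}_2, \qquad \mathbf{a}_2 = -\sin\psi\,\mathbf{e}_1 + \cos\psi\,\mathbf{e}_2, \qquad \mathbf{a}_3 = \mathbf{e}_3, \tag{29} \]

with $\psi$ being some fixed, but unspecified angle. Then, as in (1), we have

\[ \mathbf{d}^*_\alpha(s) = \mathbf{R}(s)\mathbf{a}_\alpha, \quad \alpha = 1,2, \qquad \mathbf{d}^*_3 = \mathbf{d}_3. \tag{30} \]

It is not hard to show that

\[ \mathbf{d}_1 = \cos\psi\,\mathbf{d}^*_1 - \sin\psi\,\mathbf{d}^*_2, \qquad \mathbf{d}_2 = \sin\psi\,\mathbf{d}^*_1 + \cos\psi\,\mathbf{d}^*_2. \tag{31} \]

Writing

\[ \boldsymbol{\kappa} = \kappa^*_1\mathbf{d}^*_1 + \kappa^*_2\mathbf{d}^*_2 + \kappa_3\mathbf{d}_3, \tag{32} \]

we find that

\[ \kappa_1 = \kappa^*_1\cos\psi - \kappa^*_2\sin\psi, \qquad \kappa_2 = \kappa^*_1\sin\psi + \kappa^*_2\cos\psi. \tag{33} \]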
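The component form above lends itself to a quick numerical sanity check. The following is a minimal sketch, not the authors' code: it builds the skew matrix of (22) and the Kirchhoff energy (15), and it assumes (this is an assumption, since the constitutive relation (14) is not reproduced in this section) that the moment triple is the gradient of $W$ with respect to the curvature triple. The function names `hat`, `kirchhoff_energy` and `kirchhoff_moment` are hypothetical.

```python
import numpy as np

def hat(kappa):
    """Skew matrix K of (22): K @ a equals np.cross(kappa, a) for every a."""
    k1, k2, k3 = kappa
    return np.array([[0.0, -k3,  k2],
                     [ k3, 0.0, -k1],
                     [-k2,  k1, 0.0]])

def kirchhoff_energy(kappa, A, B, C, mu):
    """Stored energy (15) with intrinsic curvature mu about the 1-axis."""
    k1, k2, k3 = kappa
    return 0.5 * (A * (k1 - mu) ** 2 + B * k2 ** 2 + C * k3 ** 2)

def kirchhoff_moment(kappa, A, B, C, mu):
    """Moment triple, ASSUMING m = dW/dkappa for the energy (15)."""
    k1, k2, k3 = kappa
    return np.array([A * (k1 - mu), B * k2, C * k3])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    kappa, a = rng.normal(size=3), rng.normal(size=3)
    # Property (22): the skew matrix reproduces the cross product.
    print(np.allclose(hat(kappa) @ a, np.cross(kappa, a)))
    # In the straight state kappa = 0 the rod carries a residual couple.
    print(kirchhoff_moment(np.zeros(3), A=1.0, B=1.0, C=1.0, mu=10.0))
```

Under the stated assumption, the moment at $\kappa = 0$ is $(-A\mu, 0, 0)$, which agrees with the value $\tau(\mu) = -A\mu$ quoted later for the Kirchhoff model.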
Multiple Helical Perversions of Finite, Intristically Curved Rods 179

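Next we seek planar solutions of the form

\[ \mathbf{d}^*_2 = \mathbf{a}_2, \qquad \mathbf{d}^*_1 = -\sin\theta\,\mathbf{e}_3 + \cos\theta\,\mathbf{a}_1, \qquad \mathbf{d}^*_3 = \mathbf{d}_3 = \cos\theta\,\mathbf{e}_3 + \sin\theta\,\mathbf{a}_1. \tag{34} \]

Then, as in (5), we have

\[ \mathbf{d}^{*\prime}_i = \boldsymbol{\kappa} \times \mathbf{d}^*_i, \quad i = 1,2,3, \tag{35} \]

and we readily find that

\[ \boldsymbol{\kappa} = \theta'\,\mathbf{d}^*_2, \tag{36} \]

i.e.

\[ \kappa^*_1 = \kappa_3 = 0, \qquad \kappa^*_2 = \theta'. \tag{37} \]

Next we write $\mathbf{m} = m^*_i\mathbf{d}^*_i$ and compute $m^*_i$. Using (7), (14), (31), (33) and (37), we obtain

\[ m^*_1 = g(\theta',\psi)\cos\psi, \tag{38} \]

where

\[ g(\theta',\psi) = D_1\Phi\!\left(-\theta'\sin\psi - \mu,\ (\theta')^2\cos^2\psi,\ 0\right) + 2\theta'\sin\psi\, D_2\Phi\!\left(-\theta'\sin\psi - \mu,\ (\theta')^2\cos^2\psi,\ 0\right). \tag{39} \]

A similar calculation shows that $m^*_2 = m_3 = 0$. Hence, the dot product of (17) with $\mathbf{d}_3$, employing (34)-(36), reveals

\[ \theta' m^*_1 = \theta' g(\theta',\psi)\cos\psi = 0, \tag{40} \]

i.e. either $\theta'$ or $g(\theta',\psi)$ or $\cos\psi$ vanishes. If $\theta' = 0$, then from (34), $\mathbf{d}_3$ is constant, and (2) and the boundary conditions (19) imply that $\mathbf{r}(s) = s\mathbf{e}_3$, i.e. the rod is in the (trivial) reference configuration. From (9), (10) and (13) and (39) we see that $g(\theta',\psi) = 0$ iff $\psi = \pm\pi/2$ and $\theta' = \mp\mu$, which is a special case of $\cos\psi = 0$. Without loss of generality, we choose $\psi = \pi/2$, in which case (29) yields $\mathbf{a}_1 = \mathbf{e}_2$ and $\mathbf{a}_2 = -\mathbf{e}_1$. From (2), (19) and (34), we then conclude:

Any planar nontrivial solution of (14), (18)-(28) is characterized by $\mathbf{r}(s) \in \mathrm{span}\{\mathbf{e}_2,\mathbf{e}_3\}$ and $\boldsymbol{\kappa}(s) \in \mathrm{span}\{\mathbf{e}_1\}$ for all $s \in [0,1]$.

2.3. Linearized problem

In this section we obtain nontrivial solutions of the linearized problem about the straight state. Observe that for any value of the loading, $\lambda \in \mathbb{R}$, specified in (20), the straight configuration

\[ \mathsf{r} = (0,0,s), \quad \mathsf{R} = I, \quad \kappa = 0, \quad \mathsf{m} = (\tau(\mu),0,0), \quad \mathsf{n} = (0,0,\lambda), \tag{41} \]

satisfies the field equations, (14), (18)-(28), i.e. (41) characterizes the trivial line of solutions. Here "$I$" denotes the $3 \times 3$ identity matrix, and

\[ \tau(\mu) = D_1\Phi(-\mu,0,0) \tag{42} \]

is the "residual" couple maintained by the rod in the straight state (supported by the clamped ends).

In order to investigate bifurcation from the trivial line, we first obtain a consistent linearization of (14), (23), (24), (27) and (28) as follows. Set

\[ \mathsf{n} = (0,0,\lambda) + \varepsilon\mathsf{N}, \tag{43} \]
\[ \mathsf{r} = (0,0,s) + \varepsilon\mathsf{u}, \tag{44} \]
\[ \mathsf{R} = \exp(\varepsilon\Theta), \tag{45} \]

where $\mathsf{N}$, $\mathsf{u}$ are vector fields, $\Theta$ is a skew-matrix field, and $\varepsilon$ is a small parameter. From (28) we then find $\mathsf{K} = \mathsf{R}^T\mathsf{R}' = \varepsilon\Theta' + o(\varepsilon)$, from which we deduce

\[ \kappa = \varepsilon\theta' + o(\varepsilon), \tag{46} \]

where $\theta = (\theta_1,\theta_2,\theta_3)$ is the axial vector of $\Theta$. We then substitute (43)-(46) into (14), (23), (24) and (27), compute the derivative of each with respect to "$\varepsilon$", and evaluate the resulting expressions at $\varepsilon = 0$, to obtain:

\[ \begin{aligned}
N_1' + \lambda\theta_2' &= 0, \\
N_2' - \lambda\theta_1' &= 0, \\
N_3' &= 0, \\
u_1' - \theta_2 &= 0, \\
u_2' + \theta_1 &= 0, \\
u_3' &= 0, \\
A(\mu)\theta_1'' - N_2 &= 0, \\
B(\mu)\theta_2'' + \tau(\mu)\theta_3' + N_1 &= 0, \\
C(\mu)\theta_3'' - \tau(\mu)\theta_2' &= 0,
\end{aligned} \tag{47} \]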
180 G. Domokos & T. J. Healey

where

\[ A(\mu) = D_1^2\Phi(-\mu,0,0), \qquad B(\mu) = 2D_2\Phi(-\mu,0,0), \qquad C(\mu) = 2D_3\Phi(-\mu,0,0) \tag{48} \]

are the "instantaneous moduli" at the straight state $\kappa = 0$. Finally, for compatibility with the boundary conditions (18)-(20), the "incremental" fields appearing in (43)-(45) must satisfy

\[ u_\alpha(0) = u_\alpha(1) = 0, \quad \alpha = 1,2, \qquad u_3(0) = 0, \quad N_3(1) = 0, \tag{49} \]
\[ \theta(0) = \theta(1) = 0. \tag{50} \]

A necessary condition for bifurcation is that the linearized system (47)-(50) admit nontrivial solutions. To solve the linearized problem, we first observe from (47)$_{3,6}$ and (49)$_2$ that

\[ N_3 = u_3 = 0. \tag{51} \]

Next, integration of (47)$_{4,5}$, using (49)$_1$, shows that

\[ \int_0^1 \theta_\alpha\,ds = 0, \quad \alpha = 1,2, \tag{52} \]

and then the integration of (47)$_9$, using (50) and (52), yields

\[ \theta_3' = \frac{\tau(\mu)}{C(\mu)}\,\theta_2. \tag{53} \]

We then integrate (47)$_{1,2}$ and substitute into (47)$_{7,8}$, employing (52) and (53), to obtain

\[ \theta_1'' - \frac{\lambda}{A(\mu)}\,\theta_1 = \theta_1'(1) - \theta_1'(0), \tag{54} \]
\[ \theta_2'' + \frac{1}{B(\mu)}\left(\frac{\tau(\mu)^2}{C(\mu)} - \lambda\right)\theta_2 = \theta_2'(1) - \theta_2'(0). \tag{55} \]

Both (54) and (55) are of the form

\[ \theta'' + \sigma^2\theta = \theta'(1) - \theta'(0), \tag{56} \]

subject to

\[ \theta(0) = \theta(1) = 0, \tag{57} \]

which is equivalent to the classical eigenvalue problem associated with the planar buckling of a compressed "clamped-clamped" rod, cf. [Timoshenko & Gere, 1961]. (If we differentiate (56) and use either of (47)$_{4,5}$, we obtain the precise fourth-order formulation found in [Timoshenko & Gere, 1961].) The general solution of (56) is

\[ \theta = C_1\left(\cos\sigma s - \frac{\sin\sigma}{\sigma}\right) + C_2\left(\sin\sigma s + \frac{\cos\sigma - 1}{\sigma}\right), \tag{58} \]

where $C_1$, $C_2$ are constants. Enforcing the boundary conditions (57), we find nontrivial solutions iff

\[ \sigma\sin\sigma + 2(\cos\sigma - 1) = 4\sin\frac{\sigma}{2}\left(\frac{\sigma}{2}\cos\frac{\sigma}{2} - \sin\frac{\sigma}{2}\right) = 0, \tag{59} \]

(which agrees with Eq. (d) on p. 54 of [Timoshenko & Gere, 1961]). Accordingly, we find two families of solutions:

\[ \sigma = 2n\pi; \qquad \theta = \sin 2n\pi s, \quad n = 1,2,\ldots, \tag{60} \]
\[ \frac{\sigma(m)}{2} = \tan\frac{\sigma(m)}{2}; \qquad \theta = \cos\sigma(m)s + \alpha(m)\sin\sigma(m)s - 1, \quad m = 1,2,\ldots, \tag{61} \]

where $0 < \sigma(1) < \sigma(2) < \cdots$ denotes the positive solutions of the transcendental equation in (61) and

\[ \alpha(m) = \frac{\sigma(m) - \sin\sigma(m)}{1 - \cos\sigma(m)}. \tag{62} \]

As discussed in [Timoshenko & Gere, 1961], the family (60) corresponds to configurations that are symmetric (reflection symmetric) about the midspan ($s = 1/2$), while (61) yields antisymmetric configurations with respect to the midspan.

From (51) and (53), we can now read off nontrivial solutions of the linearization (47), (49) and (50). There are two families of distinct solutions. The first is characterized by compressive critical loads only ($\lambda < 0$), with the linearized solutions corresponding to the planar configurations discussed in Sec. 2.2. We denote these planar solutions by $P_I^n$,

which are reflection symmetric, and $P_{II}^n$, which are anti-symmetric or flip-symmetric.

Planar compressive:

\[ N_1 = N_3 = u_1 = u_3 = \theta_2 = \theta_3 = 0; \tag{63} \]

\[ P_I^n:\quad \begin{cases}
\lambda^{2n-1} = -4n^2\pi^2 A(\mu), \\
\theta_1^{2n-1} = \sin 2n\pi s, \\
u_2^{2n-1} = \dfrac{1}{2n\pi}\,(\cos 2n\pi s - 1), \\
N_2^{2n-1} = \lambda^{2n-1}\theta_1^{2n-1}, \qquad n = 1,2,\ldots
\end{cases} \tag{64} \]

\[ P_{II}^n:\quad \begin{cases}
\lambda^{2n} = -A(\mu)\,(\sigma(n))^2, \\
\theta_1^{2n} = \cos\sigma(n)s + \alpha(n)\sin\sigma(n)s - 1, \\
u_2^{2n} = s - \dfrac{1}{\sigma(n)}\sin\sigma(n)s + \dfrac{\alpha(n)}{\sigma(n)}\,(\cos\sigma(n)s - 1), \\
N_2^{2n} = \lambda^{2n}\theta_1^{2n}(s) - A(\mu)\,(\sigma(n))^2, \qquad n = 1,2,\ldots
\end{cases} \tag{65} \]

The second family potentially admits tensile critical loads ($\lambda > 0$) as well as compressive critical loads, and unlike the previous family, the solutions are characterized by $\theta_3 \neq 0$. We denote these spatial solutions by $S_I^n$, which are reflection symmetric, and by $S_{II}^n$, which are flip symmetric.

Non planar tensile-compressive:

\[ N_2 = N_3 = u_2 = u_3 = \theta_1 = 0; \tag{66} \]

\[ S_I^n:\quad \begin{cases}
\lambda^{2n-1} = \dfrac{(\tau(\mu))^2}{C(\mu)} - 4n^2\pi^2 B(\mu), \\
\theta_2^{2n-1} = \sin 2n\pi s, \\
u_1^{2n-1} = \dfrac{1}{2n\pi}\,(1 - \cos 2n\pi s), \\
\theta_3^{2n-1} = \dfrac{\tau(\mu)}{C(\mu)}\,u_1^{2n-1}, \\
N_1^{2n-1} = -\lambda^{2n-1}\theta_2^{2n-1}, \qquad n = 1,2,\ldots
\end{cases} \tag{67} \]

\[ S_{II}^n:\quad \begin{cases}
\lambda^{2n} = \dfrac{(\tau(\mu))^2}{C(\mu)} - B(\mu)\,(\sigma(n))^2, \\
\theta_2^{2n} = \cos\sigma(n)s + \alpha(n)\sin\sigma(n)s - 1, \\
u_1^{2n} = -s + \dfrac{1}{\sigma(n)}\sin\sigma(n)s + \dfrac{\alpha(n)}{\sigma(n)}\,(1 - \cos\sigma(n)s), \\
\theta_3^{2n} = \dfrac{\tau(\mu)}{C(\mu)}\,u_1^{2n}, \\
N_1^{2n} = -\lambda^{2n}\theta_2^{2n} + B(\mu)\,(\sigma(n))^2, \qquad n = 1,2,\ldots
\end{cases} \tag{68} \]

If the intrinsic curvature "$\mu$" is sufficiently large, then from (8)-(10) and (42) we see that tensile "buckling loads" ($\lambda > 0$) are possible. In particular, if we substitute the Kirchhoff model (15), viz. $A(\mu) = A$, $\tau(\mu) = -A\mu$, $B(\mu) = B$, $C(\mu) = C$, into (67) and (68), the characteristic equations (67)$_1$ and (68)$_1$ (non planar solutions) reduce to

\[ \text{Symmetric:}\quad \lambda^{2n-1} = \frac{A^2\mu^2}{C} - 4n^2\pi^2 B, \tag{69} \]

and

\[ \text{Antisymmetric:}\quad \lambda^{2n} = \frac{A^2\mu^2}{C} - (\sigma(n))^2 B, \tag{70} \]

respectively.

2.4. Local bifurcation

In this section, we verify the standard transversality condition ensuring that the linearized solutions (63)-(65) and (66)-(68) correspond to actual solutions of the nonlinear problem. In order to make this precise, we need a little extra notation: Referring to (47), define the field

\[ x = (N_1,N_2,N_3,u_1,u_2,u_3,\theta_1,\theta_2,\theta_3), \tag{71} \]

and for all fields $u$, $v$ on $[0,1]$, define the inner product

\[ \langle u,v\rangle = \int_0^1 \sum_{i=1}^{9} u_i(s)v_i(s)\,ds. \tag{72} \]
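The critical loads above are easy to evaluate numerically. The sketch below is an illustration, not part of the original analysis: it finds the roots $\sigma(m)$ of the transcendental equation in (61) by bisection, evaluates $\alpha(m)$ from (62), and computes the Kirchhoff critical loads (69) and (70). The function names and the bisection brackets are choices made here; the rewriting of $\tan t = t$ as $\sin t - t\cos t = 0$ is used only to avoid the poles of the tangent.

```python
import math

def sigma_root(m, tol=1e-14):
    """m-th positive root sigma(m) of tan(sigma/2) = sigma/2, cf. (61)."""
    g = lambda t: math.sin(t) - t * math.cos(t)  # zero iff tan t = t
    lo, hi = m * math.pi, m * math.pi + 0.5 * math.pi  # g changes sign here
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0.0:
            hi = mid
        else:
            lo = mid
        if hi - lo < tol:
            break
    return lo + hi  # sigma = 2 t with t = (lo + hi) / 2

def alpha(sigma):
    """Coefficient alpha of (62)."""
    return (sigma - math.sin(sigma)) / (1.0 - math.cos(sigma))

def lambda_symmetric(n, A, B, C, mu):
    """Critical load (69) of the reflection-symmetric spatial mode."""
    return A ** 2 * mu ** 2 / C - 4.0 * n ** 2 * math.pi ** 2 * B

def lambda_antisymmetric(n, A, B, C, mu):
    """Critical load (70) of the flip-symmetric spatial mode."""
    return A ** 2 * mu ** 2 / C - sigma_root(n) ** 2 * B

if __name__ == "__main__":
    for n in (1, 2, 3):
        s = sigma_root(n)
        print(n, s, alpha(s),
              lambda_symmetric(n, 1.0, 1.0, 1.0, 10.0),
              lambda_antisymmetric(n, 1.0, 1.0, 1.0, 10.0))
```

For $A = B = C = 1$ and $\mu = 10$ (the values used later for Fig. 4), the first symmetric load is $100 - 4\pi^2 > 0$, i.e. tensile. A side observation from the numbers: on the roots of (61) the coefficient (62) reduces to $\alpha(m) = \sigma(m)/2$.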

Next we express (47) subject to (49), (50) via differential-operator notation:

\[ L(\lambda)x = 0. \tag{73} \]

Let $(\lambda^n, x^n)$ denote a nontrivial solution of the linearized problem, viz.

\[ L(\lambda^n)x^n = 0. \tag{74} \]

We further assume, for a given $\lambda = \lambda^n$, that (73) has only one linearly independent solution, $x = x^n$. Observe that this is the case, provided that $\lambda^n$ given by (64) or (65) does not coincide with some other $\lambda^m$ ($m \neq n$) given by (67) or (68). Then a sufficient condition for local bifurcation in the nonlinear problem is (cf. [Crandall & Rabinowitz, 1971])

\[ \langle y^n, L'(\lambda^n)x^n\rangle \neq 0, \tag{75} \]

where $y^n$ denotes the adjoint null vector satisfying

\[ L^*(\lambda^n)y^n = 0. \tag{76} \]

Here $L^*(\lambda^n)$ is the adjoint operator defined by

\[ \langle L^*(\lambda^n)y, x\rangle = \langle y, L(\lambda^n)x\rangle, \tag{77} \]

for all sufficiently smooth fields $x$, $y$ satisfying (49), (50).

From (47) and (71) we see that

\[ L'(\lambda)x = (\theta_2', -\theta_1', 0, \ldots, 0), \]

and thus, at the nontrivial solution $(\lambda^n, x^n)$ of the linearized problem, we find either

\[ L'(\lambda^n)x^n = \left(0, -\theta_1^{n\prime}, 0, \ldots, 0\right) \tag{78} \]

for the "planar compressive" solutions (cf. (63)-(65)) or

\[ L'(\lambda^n)x^n = \left(\theta_2^{n\prime}, 0, \ldots, 0\right) \tag{79} \]

for the "nonplanar tensile-compressive" solutions (cf. (66)-(68)).

To compute the adjoint operator, we let

\[ y = (P_1,P_2,P_3,v_1,v_2,v_3,\phi_1,\phi_2,\phi_3), \]

and substitute (47) into the right side of (77). Integration by parts, using the boundary conditions (49), (50), yields the adjoint equations $L^*(\lambda)y = 0$:

\[ \begin{aligned}
-P_1' + \phi_2 &= 0, \\
-P_2' - \phi_1 &= 0, \\
-P_3' &= 0, \\
-v_1' &= 0, \\
-v_2' &= 0, \\
-v_3' &= 0, \\
A(\mu)\phi_1'' + \lambda P_2' + v_2 &= 0, \\
B(\mu)\phi_2'' + \tau(\mu)\phi_3' - \lambda P_1' - v_1 &= 0, \\
C(\mu)\phi_3'' - \tau(\mu)\phi_2' &= 0,
\end{aligned} \tag{80} \]

subject to

\[ P_\alpha(0) = P_\alpha(1) = 0, \quad \alpha = 1,2, \qquad P_3(0) = v_3(1) = 0, \tag{81} \]
\[ \phi_i(0) = \phi_i(1) = 0, \quad i = 1,2,3. \tag{82} \]

Observe that (80)$_9$ is identical to (47)$_9$, while (80)$_{3,6}$ and (81)$_2$ yield $v_3 = P_3 = 0$. Next (80)$_{1,2}$ and (81)$_1$ imply that (52) holds for $\phi_\alpha$, $\alpha = 1,2$, as well. Finally, if we substitute (80)$_{1,2,4,5,9}$ into (80)$_{7,8}$, using (52) for $\phi_\alpha$, we again obtain (54) and (55) with $\phi_\alpha$ in place of $\theta_\alpha$. We conclude that the components $\theta_\alpha^n$ and $\phi_\alpha^n$, $\alpha = 1,2$, are identical, and from (80)$_{1,2}$ and (82), we obtain the first two arguments of the adjoint null vectors as follows:

\[ y^n = \left(0, -\int_0^s \theta_1^n(\xi)\,d\xi, \ldots\right) \tag{83} \]

for the "planar compressive" solutions, and

\[ y^n = \left(\int_0^s \theta_2^n(\xi)\,d\xi, 0, \ldots\right) \tag{84} \]

for the "nonplanar tensile-compressive" solutions. Finally, we substitute either (78) and (83) into the left side of (75) or (79) and (84) into the left side of (75). For either case ($\alpha = 1$ or $2$), integration by parts using (82) yields:

\[ \langle y^n, L'(\lambda^n)x^n\rangle = \int_0^1 \theta_\alpha^{n\prime}(s)\left(\int_0^s \theta_\alpha^n(\xi)\,d\xi\right)ds = -\int_0^1 \left(\theta_\alpha^n(s)\right)^2 ds \neq 0, \tag{85} \]

which verifies the transversality condition (75). Accordingly, we conclude (cf. [Crandall & Rabinowitz, 1971]):

Each of the linearized solutions given in (63)-(65) and (66)-(68), denoted $\lambda^n, \mathsf{N}^n, \mathsf{u}^n, \Theta^n$ (where $\theta^n$ is the axial vector of $\Theta^n$), corresponds to local bifurcating solutions of the nonlinear problem in the sense that (43)-(45) are asymptotically valid, viz.

\[ \begin{aligned}
\mathsf{n} &= (0,0,\lambda) + \varepsilon\mathsf{N}^n + o(\varepsilon), \\
\mathsf{r} &= (0,0,s) + \varepsilon\mathsf{u}^n + o(\varepsilon), \\
\mathsf{R} &= I + \varepsilon\Theta^n + o(\varepsilon), \\
\lambda &= \lambda^n + o(\varepsilon),
\end{aligned} \tag{86} \]

for all sufficiently small $\varepsilon$, yields a curve of nontrivial solutions of the nonlinear problem.

As suggested in (86)$_4$, each of these local bifurcations is, in fact, a so-called pitchfork. In each case, this is a consequence of $\mathbb{Z}_2$ symmetry breaking. Indeed, our boundary value problem (14), (16)-(20) is equivariant under "mirror" reflections about the midplane of the straight, undeformed rod perpendicular to $\mathbf{e}_3$ and also under 180° rotations or "flips" about the midpoint axes (to the undeformed rod) parallel to $\mathbf{e}_1$ and $\mathbf{e}_2$. The bifurcations associated with (64) and (67) each break a flip symmetry, while those coming from (65) and (68) break the reflection symmetry, cf. Fig. 5 in Sec. 3. Standard arguments show that each of these is necessarily a pitchfork bifurcation [Golubitsky & Schaeffer, 1985].

In the remainder of this section we consider the Kirchhoff model (15), and we provide a more detailed analysis of the bifurcation (86) associated with (67) and (69) for $n = 1$. In particular, we assume that the intrinsic curvature $\mu$ is sufficiently large so that the "buckling load" given by (69) is positive (and hence, tensile), viz. $\lambda^1 = A^2\mu^2/C - 4\pi^2 B > 0$. This is precisely the situation in the "hand-held experiment" for a telephone cord, as discussed in the introduction. First we demonstrate that the straight (trivial) solution (41) is stable (the potential energy is a local minimum) for all (tensile) loading $\lambda > \lambda^1$ and unstable (the potential energy is not a local minimum) for all $\lambda < \lambda^1$. We start with the total potential energy functional for the rod:

\[ V(\mathbf{r},\mathbf{R}) = \int_0^1 \left[W(\kappa) + \mathbf{n}\cdot(\mathbf{r}' - \mathbf{R}\mathbf{e}_3)\right]ds - \lambda\,\mathbf{e}_3\cdot\mathbf{r}(1), \tag{87} \]

where the internal contact force $\mathbf{n}$ is the Lagrange multiplier field enforcing the constraint that the rod be inextensible and unshearable. We then compute the second variation at the trivial solution via:

\[ \delta^2 V(s\mathbf{e}_3, I) = \frac{d^2}{d\varepsilon^2}\left[V(s\mathbf{e}_3 + \varepsilon\mathbf{u}, \exp(\varepsilon\Theta))\right]_{\varepsilon=0}. \tag{88} \]

A lengthy calculation leads to

\[ \delta^2 V(s\mathbf{e}_3, I) = \int_0^1 \left\{A(\theta_1')^2 + B(\theta_2')^2 + C(\theta_3')^2 + \lambda\left[(\theta_1)^2 + (\theta_2)^2\right] + 2A\mu\,\theta_3'\theta_2\right\}ds, \tag{89} \]

for all smooth test functions $\theta(s) = (\theta_1(s),\theta_2(s),\theta_3(s))$ satisfying (50) and (52), where $\theta(s)$ is the axial vector field corresponding to the skew-matrix field $\Theta(s)$. By the minimum property of the smallest eigenvalue, we have the following "sharp" Poincaré inequalities:

\[ \int_0^1 (\theta_\alpha')^2\,ds \geq 4\pi^2\int_0^1 (\theta_\alpha)^2\,ds, \quad \alpha = 1,2, \tag{90} \]

cf. (56), (60) for $n = 1$. Also, by the arithmetic-geometric means inequality, we have

\[ \int_0^1 |\theta_3'\theta_2|\,ds \leq \int_0^1 \left(\frac{\varepsilon}{2}(\theta_3')^2 + \frac{1}{2\varepsilon}\theta_2^2\right)ds, \tag{91} \]

for all numbers $\varepsilon > 0$. For any $\lambda > \lambda^1$, we choose

\[ \frac{A|\mu|}{C} < \frac{1}{\varepsilon} < \frac{A|\mu|}{C} + \alpha, \tag{92} \]

where $\alpha = (\lambda - \lambda^1)/A|\mu|$. It then follows from (89)-(92) that

\[ \delta^2 V(s\mathbf{e}_3, I) \geq K\int_0^1 \left[(\theta_1)^2 + (\theta_2)^2 + (\theta_3)^2\right]ds, \tag{93} \]

for all test functions satisfying (50), (52), where $K > 0$ is a constant.

On the other hand, suppose that $\lambda < \lambda^1$. In (89) we choose $\theta_1 \equiv 0$ and integrate by parts to get:

\[ \delta^2 V(s\mathbf{e}_3, I) = \int_0^1 \left\{\left[-B\theta_2'' + \lambda\theta_2 + A\mu\theta_3'\right]\theta_2 - \left[C\theta_3'' + A\mu\theta_2'\right]\theta_3\right\}ds, \tag{94} \]

for all test functions $\theta_2$, $\theta_3$ satisfying (50), (52). We now choose $\theta_2$, $\theta_3$ to coincide with the nontrivial solutions $\theta_2^1$, $\theta_3^1$ given in (67)$_{2,4}$, the substitution of which into (94) yields $\delta^2 V(s\mathbf{e}_3, I) = (\lambda - \lambda^1)/2$.
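The closing value $(\lambda - \lambda^1)/2$ can be checked by direct quadrature. The sketch below is an independent check, not the authors' calculation. It inserts the first spatial eigenfunction $\theta_2 = \sin 2\pi s$, $\theta_3' = (\tau/C)\theta_2$ (from (53) with $\tau = -A\mu$) and $\theta_1 = 0$ into the second variation, with the cross term written as $-2\tau\,\theta_3'\theta_2 = +2A\mu\,\theta_3'\theta_2$; that sign is the one consistent with (94) and with the linearized equations (47), and should be read as an assumption about the convention in (89). The function name and the midpoint rule are choices made here.

```python
import math

def second_variation(lam, A, B, C, mu, n_cells=4000):
    """Second variation at the straight state, evaluated on the first
    spatial eigenfunction (theta_1 = 0, theta_2 = sin 2 pi s)."""
    tau = -A * mu                      # residual couple of the Kirchhoff model
    h = 1.0 / n_cells
    total = 0.0
    for i in range(n_cells):
        s = (i + 0.5) * h              # midpoint rule on [0, 1]
        th2 = math.sin(2.0 * math.pi * s)
        dth2 = 2.0 * math.pi * math.cos(2.0 * math.pi * s)
        dth3 = (tau / C) * th2         # relation (53)
        integrand = (B * dth2 ** 2 + C * dth3 ** 2
                     + lam * th2 ** 2 - 2.0 * tau * dth3 * th2)
        total += integrand * h
    return total

if __name__ == "__main__":
    A = B = C = 1.0
    mu = 10.0
    lam1 = A ** 2 * mu ** 2 / C - 4.0 * math.pi ** 2 * B   # critical load (69), n = 1
    for lam in (50.0, 70.0):
        print(lam, second_variation(lam, A, B, C, mu), 0.5 * (lam - lam1))
```

The two printed columns should agree: the quadratic form is negative along this direction for $\lambda < \lambda^1$ and positive for $\lambda > \lambda^1$, in line with the stability statement above.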

We now determine the next nonzero term "7" the Parallel Simplex Algorithm (PSA) which, to the
in the Taylor expansion (86) for n = 1, viz. best of our knowledge, is the only available code
capable of determining all equilibria, connected or
A = A 1 + 7 e 2 + o(e2). (95)
not, in a given domain of the solution space. We
To make this precise, we need a bit more nota- describe the method very briefly below.
tion. First we substitute (8), (13) and (15) into (24) The PSA, introduced in [Domokos, 1994;
(calling it (24)') and denote the system (23), (27) Domokos &: Gaspar, 1995; Gaspar et al., 1997] is
and (24)' via based on some simple ideas from the theory of ordi-
nary differential equations (ODEs), combined with
F(X,x)=0. (96) the Piecewise Linear (PL) Algorithm [Allgower k,
Since we have a Z 2 -symmetry-breaking pitchfork, it Georg, 1990]. In contrast to path-continuation tech-
can be shown [Kielhofer, 2004] that the coefficient niques, which deliver equilibria in sequence along
"7" is given by the formula solution branches, the PSA resolves simultaneously
all equilibria (in a given domain) lying on all
{y\DlF{X\Q)[xl'x^]) branches. The PSA can be directly applied to two-
7
3{y\L'(X^) ' ^ point BVPs associated with ODEs of the form
where the calculation for the numerator is facili- x(t) = f(x(t), A), x e R2n, A e l 1 , te [0, 2TT].
tated by
(101)
DlFiWo^x^x1} EE ^ [F(X\x(e))]e=Q, (98) Let us assume that the initial (t = 0) conditions
apply to the first n components {x{ (0) = etj, i =
with x(e) is represented by (43)-(45). In particular, 1,2, . . . , n ) and far-end (t = 2ir) conditions apply
we use an expansion of (45) to obtain the higher- to the n components with indices Vi(xUi(27r) —
order extension of (46), bi, i = 1, 2 , . . . , n), where the a$,fejare given scalars.
2 3 Let us denote the unspecified initial components
K= eev + -91 x ey + Te} by Vi-n = Xi(0), i = n + 1, n + 2 , . . . , 2n ("vari-
ables"). The (n + l)-dimensional space spanned by
x ( 0 x x6v) + o(e3), (99) the variables and the parameter A will be called the
which is needed in the calculation (98). The denom- Global Representation Space (GRS) for the bifurca-
inator in (97) is given by (85). In the special case tion problem. By using any convergent forward inte-
A — B = C (cf. (15)), which we employ in our grator for the Initial Value Problem (IVP), we can
numerical work to follow, a straightforward but express the far-end values x„i(2ir), (i = 1, 2 , . . . , n)
laborious calculation, employing (99), leads to as functions of the variables Vi and the parameter
A : xUi(2ir) = gi(vi,V2 • • • ,vn, X) and solve the alge-
braic equation system
7 = - ^ , (100)
gi(vj,X) -bi = 0,
i.e. the pitchfork (95) is "subcritical". By the
usual exchange-of stability-argument (cf. [Kielhofer, (i,j = 1,2,... n, Vj € [«J, v)), X G [A0, A1].)
2004]), it then follows that the local bifurcating (102)
solution (95), (100) is stable. These results are
summarized in Pig. 4: observe the pitchfork off by the PL algorithm [Allgower & Georg, 1990] in
branch A. the prescribed (n + l)-dimensional domain of the
GRS (defined by the constants with superscript in
(102)). Geometrically, (102) describes the intersec-
3. Global C o m p u t a t i o n s and their tion of n hypersurfaces in the (n + l)-dimensional
Interpretation space, yielding typically (locally) one-dimensional
solution sets, thus branches. This fact can be also
3.1. The parallel simplex algorithm
expressed as
and the global representation
space V = F + 1, (103)
In the remainder of this work, we seek a "global pic- where V and F denote the numbers of variables and
ture" of the solution diagram. To that end, we apply functions, respectively. These branches will appear

as polygons, due to the piecewise-linear approximation. (We remark that variables can have a far more general interpretation in the PSA. However, the version described above is sufficient to introduce the most important concepts.)

We now return to our problem. Due to the clamped boundary conditions (18) we have 6 scalar "free" initial conditions to (16), (17), viz. the values of the components of $\mathsf{n}$ and $\mathsf{m}$ at $s = 0$, i.e. we have $V = 6$ variables. In view of (18) and (20), observe that $n_3(0) = \lambda$. The far-end condition (19) defines two scalar equations, and from (18)$_2$ we deduce the three independent scalar conditions

\[ R_{12}(1) = 0, \qquad R_{13}(1) = 0, \qquad R_{23}(1) = 0. \tag{104} \]

In addition, (18)$_2$ yields $\mathrm{trace}\,\mathsf{R}(1) = R_{11} + R_{22} + R_{33} = 3$, which we also impose.

Equations (23), (24) can be integrated forward using standard techniques; we relied on [Gaspar, 1977, 1978, 1979].

In the following subsections we will attempt to give a partial picture of the global bifurcation diagram. Our description is based partly on computational results and partly on integer labels assigned to the branches according to local bifurcation patterns. We will also utilize ideas connected to the White-Fuller theorem [White, 1969] and its extension [Alexander & Antman, 1982; Heijden et al., 2004].

Fig. 3. Part of the global bifurcation diagram, illustrated in the $[n_3(0) = \lambda,\ m_1(0),\ m_2(0)]$ space.

3.2. Classification of branches and branch labels

The computed equilibria can be identified by the six-dimensional vector consisting of the nonconstant initial conditions $[n_1(0), n_2(0), n_3(0), m_1(0), m_2(0), m_3(0)]$, where $n_3(0) = \lambda$. For purposes of graphical representation we will use the three-dimensional subspace $[n_3(0) = \lambda,\ m_1(0),\ m_2(0)]$.

The computations revealed a highly complex bifurcation diagram, consisting of a large variety of equilibria. One portion is illustrated in Fig. 3. All calculations were carried out for the Kirchhoff model (15), with $A = B = C$. The bifurcation diagram in Fig. 4 has been computed with $\mu = 10$; all other computations were carried out with $\mu = 40$.

The main thrust of this section is to gain some (partial) understanding of the diagram. Since our focus is the description of helical perversions, we will concentrate on the following classes of equilibria:

1. The primary, straight configurations, forming the primary trivial branch $A$: $[0, 0, \lambda, -\mu, 0, 0]$.
2. Planar, "classical" Euler modes $P_I^n$, $P_{II}^n$, bifurcating off the primary $A$ branch for $\lambda < 0$ (compression), forming branches. The family $P_I^n$ is reflection-symmetric, the family $P_{II}^n$ is flip-symmetric, cf. Eqs. (64), (65).
3. Spatial modes $S_I^n$, $S_{II}^n$, bifurcating off branch $A$ both for positive and negative values of the tension $\lambda$, forming branches. The family $S_I^n$ is reflection-symmetric, the family $S_{II}^n$ is flip-symmetric, cf. Eqs. (67), (68).
4. Asymptotically straight, twisted configurations $b_k$, located at the GRS points $[\infty, 0, 0, -\mu, 0, 2k\pi]$. As we will show, some branches approach $b_k$ points as $\lambda \to \infty$. (In the case of zero initial curvature, twist is decoupled from bending and one obtains branches of straight, twisted equilibria. In our case, the direction of preferred curvature changes as the cross-section is twisted, so straight, twisted equilibria can be realized only asymptotically.)
5. Planar, untwisted, self-intersecting equilibria forming the branches $C_k$ ($k = 1,2,\ldots$) emerging from $k$-covered circles $c_k$ at $[0, 0, 0, 2k\pi - \mu, 0, 0]$ in the GRS. Equilibria on $C_k$ branches correspond to planar, "noninflectional" elastica lines in physical space, cf. [Love, 1927]. In the GRS description, these have the form $[0, n_2^*(k,\lambda), \lambda, m_1^*(k,\lambda), 0, 0]$, where $n_2^*(k,\lambda)$, $m_1^*(k,\lambda)$ can be expressed in closed form using Jacobian elliptic integrals. In the graphical representation these curves appear as $[\lambda, m_1^*(k,\lambda), 0]$.
6. Spatial modes bifurcating off $C_k$, forming branches.
7. Branches created in secondary, tertiary, etc. bifurcations.
8. Disconnected branches.

These observations are in full agreement with the findings of Sec. 2. In particular, there we show not only the existence of the branches $A$, $P_I^n$, $P_{II}^n$, $S_I^n$, $S_{II}^n$, but also the corresponding critical load parameters and eigenfunctions are given explicitly, cf. (64), (65), (67), (68), and a detailed local analysis of the branch $S_I^1$ is provided. We can observe some of the listed equilibria in Fig. 3. All solutions shown are tensile ($\lambda > 0$). Observe that $S_I^1$, which connects the trivial branch $A$ to $C_2$, contains "perversion" equilibria. We illustrate this more fully in Fig. 4, displaying physical shapes as insets. Moreover, the absence of any turning points and bifurcation points along $S_I^1$, combined with the local stability results from Sec. 2, implies that the entire branch (excluding the bifurcation points on $A$ and $C_2$) contains stable solutions. The almost parallel, almost straight lines in Fig. 3 correspond to the branches $b_3$, $b_4$, $b_5$, $C_6$ and $C_7$. The latter two have been computed for $\mu = 40$ on much longer segments. Observe the spatial modes bifurcating off $C_4$, $C_6$ and $C_7$ and branches created in secondary bifurcations. We illustrate some characteristic physical shapes in Fig. 5.

Fig. 4. Branch $S_I^1$ connecting the trivial $A$-branch to the $C_2$ branch. Observe perversion on the physical configurations. (Axes: $n_3(0) = \lambda$, $m_1(0)$, $m_2(0)$.)

Fig. 5. Physical shapes illustrated as ribbons. (1) The trivial equilibria on branch $A$. (2) Asymptotic equilibrium $b_2$ at $\lambda = \infty$. (3) Equilibrium on the branch $C_2$. (4) Planar, compressive Euler mode $P_I^1$. (5) Planar, compressive Euler mode $P_{II}^1$. (6) Spatial (possibly tensile) mode $S_I^1$. (7) Spatial (possibly tensile) mode $S_{II}^1$.
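The shooting formulation of Sec. 3.1, with the six variables $[n_1(0),\ldots,m_3(0)]$ and the five far-end conditions from (19) and (104), can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' PSA code: it uses a classical fourth-order Runge-Kutta integrator in place of the methods of [Gaspar, 1977, 1978, 1979], it assumes the constitutive law $\mathsf{m} = \partial W/\partial\kappa$ for the Kirchhoff model (15) so that the curvature can be recovered from the moment, and it only evaluates the residual functions; the simplex scan of the GRS is not shown. All function names are hypothetical.

```python
import numpy as np

def hat(k):
    """Skew matrix of (22): hat(k) @ a equals np.cross(k, a)."""
    return np.array([[0.0, -k[2], k[1]],
                     [k[2], 0.0, -k[0]],
                     [-k[1], k[0], 0.0]])

def rhs(state, A, B, C, mu):
    """Right-hand sides of (27), (28), (23), (24) in component form."""
    R = state[3:12].reshape(3, 3)
    n, m = state[12:15], state[15:18]
    # ASSUMED constitutive law m = dW/dkappa for (15), solved for kappa.
    kappa = np.array([m[0] / A + mu, m[1] / B, m[2] / C])
    e3 = np.array([0.0, 0.0, 1.0])
    dr = R @ e3                                   # (27)
    dR = R @ hat(kappa)                           # (28)
    dn = -np.cross(kappa, n)                      # (23)
    dm = -np.cross(kappa, m) - np.cross(e3, n)    # (24)
    return np.concatenate([dr, dR.ravel(), dn, dm])

def far_end_residual(v, A=1.0, B=1.0, C=1.0, mu=10.0, steps=400):
    """Map v = (n1, n2, n3, m1, m2, m3) at s = 0 to the five far-end
    residuals (r1, r2, R12, R13, R23) at s = 1, cf. (19) and (104)."""
    state = np.concatenate([np.zeros(3), np.eye(3).ravel(),
                            np.asarray(v, dtype=float)])
    h = 1.0 / steps
    for _ in range(steps):                        # classical RK4 steps
        k1 = rhs(state, A, B, C, mu)
        k2 = rhs(state + 0.5 * h * k1, A, B, C, mu)
        k3 = rhs(state + 0.5 * h * k2, A, B, C, mu)
        k4 = rhs(state + h * k3, A, B, C, mu)
        state = state + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    r, R = state[:3], state[3:12].reshape(3, 3)
    return np.array([r[0], r[1], R[0, 1], R[0, 2], R[1, 2]])

if __name__ == "__main__":
    lam, mu = 5.0, 10.0
    trivial = [0.0, 0.0, lam, -mu, 0.0, 0.0]      # point on the trivial branch A
    print(far_end_residual(trivial, mu=mu))
    print(far_end_residual([0.1, 0.0, lam, -mu, 0.2, 0.0], mu=mu))
```

With six unknowns and five residuals, the zero set is generically one-dimensional, which is the count $V = F + 1$ of (103). The trace condition on $\mathsf{R}(1)$ would be checked separately on each computed zero.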

At first sight it may be surprising that the branches $C_k$ and the asymptotic points $b_k$ have been included in the list; solutions on those branches are characterized by far-end clamping undergoing $2k\pi$ rotations about the various coordinate axes. Nonetheless, the boundary conditions (18) are satisfied in both cases. We will see that these solutions can be connected to each other and to the trivial branch as well.

Now we proceed to describe other perversions occurring among the computed equilibria. Consider the branches $C_k$ consisting of planar equilibria in the [2,3] plane. As we explain below, these planar curves have $k \leq n \leq k^2$ self-intersection points for $\lambda > 0$. For $\lambda = 0$, continuous intervals overlap, forming a $k$-covered circle. For small $\lambda$ the clamped ends are slightly pulled apart and the $k$ overlapping circles form $k$ slightly shifted loops. Each loop has two intersection points with each other loop (we will refer to these points as "inter-loop" points). In addition, each loop intersects itself once (we will call these points "bottom" points). So the total number $n$ of self-intersections is

\[ n = 2\,\frac{k(k-1)}{2} + k = k^2 \tag{105} \]

in this case. As $\lambda$ increases and the clamped ends move further apart, the intersection points between different loops become gradually disassociated, and the self-intersection at the bottom of each loop prevails. So we have exactly $k$ such ("bottom") points for sufficiently high $\lambda$.

In general, one would expect that bifurcations destroy the self-intersections. Indeed, this seems to be the case. However, not all self-intersections are destroyed simultaneously. In some cases several subsequent bifurcations are needed to obtain a physically relevant shape (without self-intersection). The key to identifying the bifurcations is the separation pattern.

The pattern associated with the "inter-loop" points can be rather complex for two reasons: on one hand, the number of such points is not characteristic for the branch, since it is changing with $\lambda$. On the other hand, some patterns correspond to knotted curves. Although such equilibria certainly exist, we do not discuss them because they are not directly related to helical perversions. One example is illustrated in Fig. 6.

Fig. 6. Knotted equilibrium shape.

For the listed reasons, we will not discuss the pattern associated with "inter-loop" points. Rather, we will assume that loops are moving independently of each other (for sufficiently high $\lambda$ there are no "inter-loop" points so this assumption is certainly true). We describe only the bifurcation behavior associated with the $k$ self-intersections of the loops ("bottom" points) surviving on the $C_k$ branch for arbitrarily high $\lambda$. These self-intersections occur at $2k$ points $P_1, P_2, \ldots, P_{2k}$, as we follow the arclength $s$ from 0 to 1. Pairs of points $P_{2j-1}, P_{2j}$ ($j = 1,2,\ldots,k$) are coincident, forming the self-intersection. According to our computations, spatial separation of these self-intersection points, moving out of the $[x_2,x_3]$ plane, corresponds to bifurcations of branches of solutions containing spatial equilibria (for $\lambda > 0$). Coincident points $P_{2j-1}, P_{2j}$ will move either in the same or in opposite directions. In the former case they remain coincident, in the latter they separate.

We will characterize such a branch by an integer vector label $w_i$, with $i = 1,2,\ldots,k$, $w_i \in \{-1,0,1\}$, defined by the sign of the initial relative displacement in the $x_1$ direction of the point-pair $P_{2i-1}, P_{2i}$. (The labels can be interpreted for the general, $k \leq n \leq k^2$ case as well!) The restricted labels have exactly $k$ entries, thus we have $3^k - 1$ different labels for each value of $k$. The label $\{0,0,\ldots,0\}$ corresponds to the original branch. Bifurcating branches appear in pairs, with labels $w_i^1 = -w_i^2$, thus the labels admit $(3^k - 1)/2$ different possibilities for pairs.

In case of $k = 1$ we have only the pair $\{-1\}$ and $\{+1\}$. Based on numerical observation we believe that to each possible label-pair the corresponding branch-pair exists physically. (Others exist as well, we dealt only with $k$ self-intersections out of the total $k^2$!) Labels containing $w_i = 0$ entries correspond to self-intersecting, nonphysical shapes. We will be particularly interested in physical equilibria without self-penetration; the corresponding labels do not contain zeroes: there are $2^k$ such labels and $2^{k-1}$ pairs can be identified.

3.3. The White-Fuller theorem and global invariants

One interesting property of the branch labels is that for shapes without self-penetration the writhing number $W$ of the centerline can be obtained at the bifurcation point as

\[ W = \sum_{i=1}^{n} w_i. \tag{106} \]

(We described above only the restricted $n = k$ case, however, (106) is valid in general as well.) The writhing number has been defined originally for closed curves [Calugareanu, 1961; Fuller, 1971], however, the extension to clamped-clamped end conditions is possible [Alexander & Antman, 1982; Heijden et al., 2004] via a closure, i.e. a virtual rod segment connecting the two clamped ends. Later, we will list conditions under which this theory can be applied; these conditions will refer both to the rod as well as to the closure. The writhing number is the number of self-intersections of planar projections of an oriented space curve, averaged over all possible projections. Each intersection is given a sign depending on whether the point with smaller or larger arclength value is closer to the plane onto which one projects. If the curve is almost planar, the writhing number is almost an integer. However, for self-intersecting curves the writhing number is not interpreted. The reason for this is that the same configuration can be approached in different limits, resulting in different (integer) writhing numbers.

Our $C_k$ branches are such curves with $k$ self-intersection points for which the writhing number cannot be defined. As soon as these points separate, the writhing number $W$ can be interpreted, and evidently it agrees with the sum of the $w_i$ labels. The pairs of branches with labels $w_i^1 = -w_i^2$ illustrate why the writhing number cannot be defined for self-intersecting curves: arbitrarily close to the bifurcation point the two equilibria almost coincide, but the sign of their writhing number is different.

The White-Fuller theorem states that if we interpret the rod as a ribbon, then the linking number $L$ of the two edges of the ribbon can be written as

\[ L = T + W, \tag{107} \]

where $T$ denotes the total twist in the rod, proportional to the integral of the twist moment $m_3(s)$. Although neither $W$ nor $T$ are typically invariant along branches, their sum $L$ is, as long as the following conditions are met [Alexander & Antman, 1982; Heijden et al., 2004]:

1. self-penetration of the rod does not occur,
2. self-penetration of the closure does not occur,
3. the ends remain aligned.

Condition 1 has to be monitored along the branch. Conditions 2 and 3 can be guaranteed by the clamped-clamped end conditions and by admitting only positive (tensile) values for $\lambda$.

If we take any of the equilibria bifurcating off the $C_k$ branches and increase the axial distance between the clamped ends, it is evident that the physical shape will become straight, at least asymptotically as the tension grows to infinity. In some cases there are branches carrying these equilibria. We will call the values of $L$, $W$ and $T$ close to the bifurcation point "initial values" and close to the

straight shape "final values". Since $L$ is a branch-invariant, we have

$$L_{\mathrm{final}} = L_{\mathrm{initial}}, \qquad (108)$$

and from the definition of the writhing number follows that

$$W_{\mathrm{final}} = 0. \qquad (109)$$

Since the $C_k$ branches consist of planar, untwisted equilibria,

$$T_{\mathrm{initial}} = 0. \qquad (110)$$

From (106)-(110) it follows that

$$T_{\mathrm{final}} = W_{\mathrm{initial}} = \sum_{i=1}^{n} w_i. \qquad (111)$$

(We see that by using the local bifurcation patterns we can define a global invariant quantity for the branch.)

If we consider that along the primary, trivial A branch we have $L = T = W = 0$, and at the special, straight equilibria $b_k$ we have $W = 0$, $L = T = k$, then (111) predicts two kinds of different global scenarios: (I.) If $W_{\mathrm{initial}} = \sum_{i=1}^{n} w_i = 0$ then $T_{\mathrm{final}} = 0$, so such a branch may be connected to the trivial branch A at finite A. (II.) If $W_{\mathrm{initial}} = \sum_{i=1}^{n} w_i > 0$ then $T_{\mathrm{final}} > 0$, so such a branch may approach an equilibrium $b_k$ as A → ∞.

The computations show that the two possibilities do in fact happen: we computed type I. branches connecting $C_k$ branches with the trivial A branch, these connecting branches appear as loops in the global representation space, ending at two bifurcation points. We have to stress that the condition $W_{\mathrm{initial}} = \sum_{i=1}^{n} w_i = 0$ is a necessary one: it simply indicates the possibility of a connecting branch. A direct connection appears to be only possible for branches with labels

$$w_j = (-1)^j, \quad j = 1, 2, \ldots, 2i, \quad k = 2i. \qquad (112)$$

These branches are identical with the reflection-symmetric spatial modes S(. In case of other branches with $W_{\mathrm{initial}} = \sum_{i=1}^{n} w_i = 0$ we conjecture that the connection can be established via secondary bifurcations.

We also computed type II branches, where with increasing A the solution becomes asymptotically straight.

3.4. The existence and genesis of simple and multiple perversions

The branch labels $w_i$ tell more than the scalar branch-invariant $L = W$. The exact sequence of the $w_i$ entries defines the (approximate) shape: as long as $w_i$ does not change sign, we have a helical segment, at the sign-change a perversion will occur. (If we include the labels defined by the "inter-loop" self-intersections, they also define the knot type of the solution, however, we do not investigate knotted solutions in this paper.) The simplest example are the branches bifurcating off the $C_2$ branch (originating in the double-covered circle at A = 0). If we regard only physically relevant (thus non-self-intersecting) shapes, we arrive at the pattern-pairs [1,-1], [-1,1] and [1,1], [-1,-1]. The former two have a sign-change, consequently they contain a perversion, while the latter two labels correspond to a left-handed and a right-handed helical shape respectively, each with two total turns. Observe that for the first shapes with perversion we not only have $L = W = 0$, but the labels $w_i$ do satisfy (112). So we expect a type I branch connecting to the trivial branch. In case of the helical shapes we have $L = W = \pm 2$ so we expect a type II branch converging to the $b_2$ equilibrium at A → ∞.

We computed the [1,-1], [-1,1] branch-pair and found that it actually does connect to the trivial A branch via a bifurcation point: In essence these branches form a loop in the GRS with one point connected to the $C_2$ branch, and one to the A branch. Figure 4 illustrates the topology of the bifurcation diagram with some physical shapes shown as insets. Observe that the perversion is most apparent at the points which are equally far both from the $C_k$ and the A branch.

The fact that the branch-pair [1,-1], [-1,1] connects to the trivial branch is remarkable. At the end of Sec. 2.3 we demonstrated explicitly the existence of spatial buckling modes S( (cf. Eq. (69)); the computations reveal that these branches can contain helical perversions.

The multiple perversions observed in nature (cf. Figs. 1 and 2) fit easily into this qualitative picture. Similarly to the $C_2$ branch, any branch $C_k$ may be connected to the trivial A branch, as long as the bifurcating solution has $L = W = \sum_{i=1}^{n} w_i = 0$, and this is possible for all even values of k. Such

(a)

(b)
Fig. 7. Multiple perversions. (a) Configuration computed on the [1,-1,1,-1,1,-1] branch with five perversions. (b) Telephone cord with three subsequent perversions.

an example is illustrated in Fig. 7(a), showing an equilibrium with five subsequent perversions on the [1,-1,1,-1,1,-1] branch (also satisfying the (112) condition), which connects the $C_6$ to the A branch. Observe the similarity to Fig. 7(b), showing three subsequent perversions on a telephone cord. Of course, one can see multiple perversions on type II branches as well. Figure 8 illustrates such a shape with two perversions on the [1,1,-1,-1,1,1] branch with $L = W = \sum_{i=1}^{n} w_i = 2$, connecting the $C_6$ branch to the equilibrium point $b_2$ at A = ∞. The C branches with odd subscript cannot be

Fig. 8. Configuration on the [1,1,-1,-1,1,1] branch with two perversions.

connected to the untwisted A branch, however, they may be connected to the asymptotically straight b equilibria with odd subscripts, i.e. with odd number of total twist. Such a shape is illustrated in Fig. 9 on the [1,-1,-1,1,-1,1,-1] branch with $L = W = \sum_{i=1}^{n} w_i = -1$, connecting the $C_7$ branch to the equilibrium point $b_{-1}$ at A = ∞. Observe that we have five sign-changes in the last mentioned label and the physical shape exhibits exactly five perversions.

As seen, some branches bifurcating off the $C_k$ branches connect to the trivial solution A, some

Fig. 9. Configuration on the [1,-1,-1,1,-1,1,-1] branch with five perversions.


Fig. 10. The first tensile buckling mode of the trivial solution, converging to a localized loop with L = T + W = 1 — 1 = 0.

others do not. The inverse is also true: some buckling modes of the trivial solution (discussed in Sec. 2) connect to $C_k$ branches, some others do not. An example for the latter is the first member of the Sf1 family, illustrated in Fig. 10: we can observe a branch converging to a localized loop with $L = T + W = 1 - 1 = 0$ as A → ∞.

4. Summary and Related Issues

In this paper we give a partial, global picture of the equilibria of intrinsically curved, finite, elastic rods, serving as mechanical models for things like telephone cords or botanical tendrils. In nature one can observe that the tendrils of climbing plants show spatially complex shapes consisting of several helical segments, interrupted by helical perversions, connecting two helical segments with opposite handedness.

In contrast to the works [Goriely & Tabor, 1998; Goriely & McMillen, 2002], we analyze the equilibria of finite rods with clamped-clamped boundary conditions. The latter are meant to model the fact that tendrils have a solid "grip" on their environment. Our main focus was the description of equilibria connected to the trivial, straight shape and the identification of equilibria with multiple perversions. (We assumed that some of the complex shapes observable in nature evolve from simple, almost straight equilibria.) By applying analytical and computational techniques we found that the straight configuration undergoes bifurcations in tension, resulting in spatial buckling modes. Some of these branches connect to other branches originating from k-covered circles. The fact that they connect helped to identify how the geometric quantities in the White-Fuller theorem (Link, Twist, Writhe) evolve along the branch. We found that these connecting branches carry equilibria with an arbitrary number of perversions. We also identified branches carrying equilibria with arbitrary number of perversions, connected by helical segments of arbitrary length. The computed shapes correspond well to the ones observable in nature and experiments.

While the multi-covered circles and the $C_k$ branches originating from them certainly help to understand the geometry of perversions, and in some cases it might be convenient to compute perversions starting from $C_k$ branches, one has to be aware that the physical evolution is rather different since equilibria on the $C_k$ branches are self-intersecting and thus un-physical. On the other hand, we believe that approaching a perversion from the trivial, straight solution on the bifurcation diagram is qualitatively similar to the physical evolution of equilibria, so our model can shed some light on the existence of these highly interesting, spatially complex shapes.
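The bookkeeping with branch labels used in Sec. 3.4 can be illustrated with a short sketch. It is not the authors' code: the function names are hypothetical, and the rule "type II whenever the label sum is nonzero" is our reading of scenarios (I.) and (II.) together with the $b_{-1}$ example. The sketch counts sign changes (perversions), sums the labels (the value of $L = W$), and tests the alternating condition (112) up to an overall sign, as the pair [1,-1], [-1,1] suggests.

```python
def classify_label(w):
    """Return (link, perversions, kind) for a branch label w = [w_1, ..., w_n].

    link        : sum of the labels, i.e. L = W on the branch
    perversions : number of sign changes between neighbouring labels
    kind        : "I" if the sum is zero (may connect to the trivial branch),
                  "II" otherwise (assumed reading: approaches a straight b_k)
    """
    if not w or any(x not in (1, -1) for x in w):
        raise ValueError("labels must be a non-empty sequence of +1/-1")
    link = sum(w)
    perversions = sum(1 for left, right in zip(w, w[1:]) if left != right)
    kind = "I" if link == 0 else "II"
    return link, perversions, kind


def satisfies_112(w):
    """Alternating labels of even length, cf. condition (112), up to overall sign."""
    alternating = all(left == -right for left, right in zip(w, w[1:]))
    return len(w) % 2 == 0 and alternating


if __name__ == "__main__":
    for label in ([1, -1], [1, 1], [1, -1, 1, -1, 1, -1],
                  [1, 1, -1, -1, 1, 1], [1, -1, -1, 1, -1, 1, -1]):
        print(label, classify_label(label), satisfies_112(label))
```

For the labels quoted in the text this reproduces the stated counts: five perversions with zero sum on [1,-1,1,-1,1,-1], two perversions with sum 2 on [1,1,-1,-1,1,1], and five perversions with sum -1 on [1,-1,-1,1,-1,1,-1].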

Acknowledgments

This work was supported by OTKA grant T046646 and the Bolyai Research Fellowship (G. Domokos), and NSF grant DMS-0072514 (T. J. Healey).

References

Alexander, J. C. & Antman, S. S. [1982] "The ambiguous twist of love," Quart. Appl. Math. 40, 83-92.
Allgower, E. L. & Georg, K. [1990] Numerical Continuation Methods: An Introduction (Springer-Verlag, Berlin).
Calugareanu, G. [1961] "Sur les classes d'isotopie de noeuds tridimensionnels et leurs invariants," Czechoslovak Math. J. 11, 588-625.
Coleman, B. D., Tobias, I. & Swigon, D. [1995] "Theory of the influence of end conditions on self-contact in DNA loops," J. Chem. Phys. 103, 9101-9109.
Crandall, M. & Rabinowitz, P. H. [1971] "Bifurcation from simple eigenvalues," J. Funct. Anal. 8, 321-340.
Darwin, Ch. [1888] The Movements and Habits of Climbing Plants (Appleton, NY), available online at http://promo.net/pg/.
Domokos, G. [1994] "Global description of elastic bars," Zeitschr. Angew. Math. Mech. 74, T289-T291.
Domokos, G. [1995] "A group-theoretic approach to the geometry of elastic rings," J. Nonlin. Sci. 5, 453-478.
Domokos, G. & Gaspar, Zs. [1995] "A global, direct algorithm for path-following and active static control of elastic bar structures," Int. J. Struct. Mach. 23, 549-571.
Domokos, G. & Healey, T. J. [2001] "Hidden symmetry of global solutions in twisted elastic rings," J. Nonlin. Sci. 11, 47-67.
Domokos, G. & Szeberenyi, I. [2004] "A hybrid parallel approach to nonlinear boundary value problems," Comput. Assist. Mech. Eng. Sci. 11, 15-34.
Fuller, F. B. [1971] "The writhing number of a space curve," Proc. Nat. Acad. Sci. USA 68, 815-819.
Gaspar, Zs. [1977] "The form of an ideally elastic bar with a space curve axis," Acta Techn. Hung. Acad. Sci. 84, 293-306.
Gaspar, Zs. [1978] "Large deflection analysis of bar structures," Acta Techn. Hung. Acad. Sci. 87, 49-58.
Gaspar, Zs. [1979] "An exact analysis of elastic bar-structures," Zeitschr. Angew. Math. Mech. 59, T179-T180.
Gaspar, Zs., Domokos, G. & Szeberenyi, I. [1997] "A parallel algorithm for the global computation of elastic bar structures," Comput. Assist. Mech. Eng. Sci. 4, 55-68.
Goriely, A. & McMillen, T. [2002] "Tendril perversion in intrinsically curved rods," J. Nonlin. Sci. 12, 241-281.
Golubitsky, M. & Schaeffer, D. G. [1985] Singularities and Groups in Bifurcation Theory, Vol. I (Springer Verlag, NY).
Goriely, A. & Tabor, M. [1998] "The mechanics and dynamics of tendril perversion in climbing plants," Phys. Lett. A250, 311-318.
Healey, T. J. [2002] "Material symmetry and chirality in nonlinearly elastic rods," Math. Mech. Solids 7, 405-420.
Heijden, G., Peletier, G. H. M. & Planque, R. [2004] "A consistent treatment of link and writhe for open rods, and their relation to end rotation," Arch. Rat. Mech. Anal., submitted.
Kielhofer, H. [2004] Bifurcation Theory (Springer Verlag, NY).
Li, Y. & Maddocks, J. "On the computation of equilibria of elastic rods, part I: Integrals, symmetry and a Hamiltonian formulation," J. Comput. Phys., submitted.
Love, A. E. H. [1927] A Treatise on the Mathematical Theory of Elasticity (Cambridge University Press, Cambridge, UK); Reprinted (Dover Publications, Inc. NY).
Swigon, D., Coleman, B. D. & Tobias, I. [1998] "The elastic rod model for DNA and its application to tertiary structure of DNA minicircles in mononucleosomes," Biophys. J. 74, 2515-2530.
Timoshenko, S. P. & Gere, J. M. [1961] Theory of Elastic Stability (McGraw-Hill, NY).
Tobias, I., Coleman, B. D. & Olson, W. [1994] "The dependence of DNA tertiary structure on end conditions: Theory and implications for topological transitions," J. Chem. Phys. 101, 10990-10996.
White, J. H. [1969] "Self-linking and the Gauss integral in higher dimensions," Amer. J. Math. 91, 693-728.

BIFURCATIONS OF STABLE SETS IN
NONINVERTIBLE PLANAR MAPS
J. P. ENGLAND, B. KRAUSKOPF and H. M. OSINGA
Bristol Centre for Applied Nonlinear Mathematics,
Department of Engineering Mathematics, University of Bristol,
Queen's Building, Bristol BS8 1TR, UK

Received May 4, 2004; Revised June 9, 2004

Many applications give rise to systems that can be described by maps that do not have a unique
inverse. We consider here the case of a planar noninvertible map. Such a map folds the phase
plane, so that there are regions with different numbers of preimages. The locus, where the
number of preimages changes, is made up of so-called critical curves, that are defined as the
images of the locus where the Jacobian is singular. A typical critical curve corresponds to a fold
under the map, so that the number of preimages changes by two.
We consider the question of how the stable set of a hyperbolic saddle of a planar noninvertible
map changes when a parameter is varied. The stable set is the generalization of the stable
manifold for the case of an invertible map. Owing to the changing number of preimages, the
stable set of a noninvertible map may consist of finitely or even infinitely many disjoint branches.
It is now possible to compute stable sets with the Search Circle algorithm that we developed
recently.
We take a bifurcation theory point of view and consider the two basic codimension-one
interactions of the stable set with a critical curve, which we call the outer-fold and the inner-
fold bifurcations. By taking into account how the stable set is organized globally, these two
bifurcations allow one to classify the different possible changes to the structure of a basin of
attraction that are reported in the literature. The fundamental difference between the stable set
and the unstable manifold is discussed. The results are motivated and illustrated with a single
example of a two-parameter family of planar noninvertible maps.

Keywords: Noninvertible map; stable set; critical curve; bifurcation; basin of attraction.

1. Introduction

One often encounters maps arising in applications that are noninvertible, by which is meant that the given map is smooth, but does not have a uniquely defined inverse. A well-referenced example of such a noninvertible system is that of a discrete-time adaptive control system [Adomaitis et al., 1991; Frouzakis et al., 1992; Frouzakis et al., 1996]. In this example one finds multistability and the noninvertibility plays an important role in the structure of the basins of attraction of the coexisting attractors, which may consist of disconnected regions. Generally, knowledge of the structure of the basins of attraction is key to understanding the long term evolution of the system. Other applications that give rise to noninvertible maps include models from economics [Agliari, 2000; Agliari et al., 2003], radiophysics [Maistrenko et al., 1996] and neural networks [Rico-Martinez et al., 2000].

In this paper we focus on noninvertible maps of the plane. That is, we consider a dynamical system that is given by a smooth planar map

$$f : \mathbb{R}^2 \to \mathbb{R}^2$$

that does not have a unique inverse.


Geometrically such a noninvertible map folds the phase plane. Adopting the notation in [Nien & Wicklin, 1998] the curve of merging preimages (also denoted as $LC_{-1}$) is defined as

$$J_0 = \{x \in \mathbb{R}^2 \mid Df(x) \text{ is singular}\},$$

and the first iterate of this curve, $J_1 = f(J_0)$, is called a critical curve (also denoted as $LC$). The dynamics of a planar map is such that the phase plane folds along the critical curves. Generically, the number of preimages of two points on either side of a fold line differs by two [Arnol'd, 1992], and points on the critical curve have two coincident preimages. We focus on this generic case of a simple fold, such that points on one side of $J_1$ have two more preimages than points on the other side. Common notation denotes $Z_k$ as a region having k rank-one preimages. The simplest case of a single fold is then denoted by $(Z_0 - Z_2)$, where points on one side of the fold have no preimages and points on the other side of the fold have two preimages. The folding of the phase plane may be more complicated, for example, for the case denoted as $(Z_1 - Z_3 - Z_1)$ there are regions with one and three preimages.

Much work has been done to investigate the dynamics of noninvertible maps and how it is related with the folding of the phase plane [Abraham et al., 1997; Gumowski & Mira, 1977, 1980a, 1980b; Mira et al., 1996a; Mira et al., 1996b]. In particular, there has been considerable interest in bifurcations that lead to qualitative changes of basins of attraction [Agliari et al., 2003; Cathala, 1998; Kitajima et al., 2000; Lopez-Ruiz & Fournier-Prunaret, 2003; Mira et al., 1994]. Such basins are typically determined by computing the orbits for a large number of initial conditions. An alternative is to compute the stable set of a suitable saddle point, which forms the boundary of a given basin of attraction.

To define the stable set formally, assume that $f$ has a saddle fixed point $x_0 = f(x_0)$ and that $f$ is differentiable in a neighborhood of $x_0$. The global stable set $W^s(x_0)$ of $x_0$ is defined as the set of points that converge to $x_0$ under forward iteration of $f$,

$$W^s(x_0) = \{x \in \mathbb{R}^2 \mid f^n(x) \to x_0 \text{ as } n \to \infty\}.$$

For an invertible map $W^s(x_0)$ is an embedded manifold and one speaks of $W^s(x_0)$ as the stable manifold. However, when multiple inverses exist $W^s(x_0)$ may consist of disjoint pieces. In particular, this set is not an embedded manifold and this is why one speaks of $W^s(x_0)$ as the global stable set [Mira et al., 1996b]. Throughout this paper, the primary manifold is the unique connected subset of $W^s(x_0)$ that contains the fixed point $x_0$.

The computation of stable sets and inverse orbits is difficult due to the fact that the Jacobian may become singular and that the critical curves separate the phase plane into regions that have different numbers of preimages. Furthermore, the stable set may also consist of pieces that are disconnected from the saddle point. The recently developed Search Circle (SC) algorithm [England et al., 2004] overcomes the problem of computing the primary manifold past intersections with the curve $J_0$ where the Jacobian is singular. It can also be used to compute disjoint pieces of the stable set. All that is needed to start a computation are the system equations themselves along with the saddle point and the stable eigenvector. All primary manifolds and stable sets in this paper have been computed with the implementation of the SC algorithm [Osinga & England, 2003] in the DsTool environment [Back et al., 1992].

The SC algorithm makes it possible to find and consider codimension-one bifurcations where the stable set interacts with a critical curve. There are exactly two such codimension-one bifurcations, the outer-fold and inner-fold bifurcations, that are discussed in detail in Sec. 3. (Similar interactions between unstable manifolds and attracting invariant circles have been investigated in [Frouzakis et al., 1997; Frouzakis et al., 2003; Maistrenko et al., 1996].) Depending on the global organization of the stable set, these two basic codimension-one bifurcations may give rise to the different changes of basins of attraction that have been studied independently in the literature. This is discussed in Sec. 4. Finally, in Sec. 5 we illustrate the fundamental difference between the stable set and the unstable manifold. We discuss the codimension-one bifurcation that leads to structurally stable self-intersections of the unstable manifold, a phenomenon that cannot occur for the stable set.

2. Example

Throughout this paper we use a single example, namely the two-parameter family of planar noninvertible maps

$$Q \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} y \\ a x + b x^2 + y^2 \end{pmatrix}. \qquad (1)$$


We call this map the modified Gumowski-Mira map, because Gumowski and Mira [1980a, 1980b] investigated the special case a = 4/5 and b = 1. For b ≠ -1 this map has two fixed points. The origin is always a fixed point and it is attracting for |a| < 1. The other fixed point p is located at

$$p = \left( \frac{1-a}{1+b}, \frac{1-a}{1+b} \right).$$

For the special case considered in [Gumowski & Mira, 1980a, 1980b] the point p is a saddle with a negative stable eigenvalue. As is standard in such a situation, the stable set can be computed in this case by using the second iterate $Q^2$, but the folding by $Q^2$ is more complex because its Jacobian is singular whenever $DQ(x)$ or $DQ(Q(x))$ is singular. By choosing different values for a and b this difficulty can be avoided. Specifically, we choose a = -0.8 throughout this paper and vary the parameter b; see also [England et al., 2004]. Then the fixed point p is a saddle for b < 3, b ≠ -1. The two sides of the primary manifold of the stable set of the saddle join to form a smooth closed loop. The primary manifold bounds the basin of attraction of the sink.

The map (1) is designed such that the Jacobian matrix becomes singular along a vertical line, namely

$$J_0 = \left\{ x = -\frac{a}{2b} \right\}. \qquad (2)$$

The rank-one critical curve is the parabola

$$J_1 = \left\{ y = x^2 - \frac{a^2}{4b} \right\}.$$

The phase plane folds along $J_0$ under Q, and the image $J_1$ of the fold divides the plane into two distinct regions, one with two preimages and one with no preimages, denoted $Z_2$ and $Z_0$, respectively. Since a fixed point always has at least one preimage, namely itself, it must lie in $Z_2$. Hence, there is typically a distinct second preimage of the saddle p, which we denote by q.

The map (1) has the reflectional symmetry of a perfect fold around $J_0$. This means that the points

$$\left( \pm x - \frac{a}{2b}, \; y \right)$$

map to the same point under Q. In particular, the point q is the mirror image of p and has the same y-coordinate, namely

$$q = \left( -\frac{a+b}{b(1+b)}, \frac{1-a}{1+b} \right).$$

2.1. Sequence of bifurcations of the stable set

In [England et al., 2004] the two particular cases b = 0.2 and b = 0.1 were used to illustrate the SC algorithm. Here we present a more detailed study of the bifurcation sequence as b is varied. Specifically, we can explain all bifurcations encountered by studying the interaction of the stable set with the critical curve $J_1$ in some small neighborhood. Figure 1 shows the stable set for nine decreasing values of b. The scale of the vertical axes is the same in all panels, while the scale of the horizontal axes is adjusted, because the primary manifold is growing wider with decreasing b > 0. Specifically, the left point on the x-axis of each panel is fixed at x = -3 and the right point is chosen such that $J_0$ is always displayed at the center of each panel. The origin is a sink, and it is denoted by a blue triangle. In all panels the saddle p lies in the $Z_2$-region, above $J_1$, and its preimage q lies in the $Z_0$-region, below $J_1$, as indicated by green crosses. The stable sets $W^s(p)$, including the primary manifold, are shown in blue and the critical curves $J_0$ and $J_1$ are shown in gray. The red crosses are either points where the stable set is tangent to $J_1$, or preimages of such tangency points.

Figure 1(a) shows $W^s(p)$ for b = 0.25, where it only consists of the primary manifold that is connected to the saddle point. Both sides of the manifold join smoothly at q to form a closed loop. All points on $W^s(p)$ map to the segment in the $Z_2$-region above $J_1$. The situation is topologically the same for 0.189860 < b < 3. At b ≈ 0.189860 the primary manifold $W^s(p)$ becomes tangent to $J_1$; see panel (b). We observe that the point of tangency has a double preimage, which lies on $J_0$ (indicated by the red crosses). When decreasing b further, as is shown in panel (c) for b = 0.14, a new part of the stable set $W^s(p)$, namely a closed loop, is formed that is disconnected from the primary manifold. This so-called bubble has grown from the single point at the tangency and maps to the additional segment of the primary manifold that has moved above the $J_1$-curve. Panel (d) shows the case b = 0.08995, at which the disjoint bubble of the stable set is approximately tangent to $J_1$, giving rise to the birth of another disconnected closed

Fig. 1. Bifurcations of the stable set $W^s(p)$ (blue curves) of the saddle p of (1) for a = -0.8 as b is varied. The curves $J_0$ and $J_1$ are shown in gray. The saddle point p and its preimage q are indicated by green crosses. Tangency points between $W^s(p)$ and $J_1$ and their preimages are indicated by red crosses. From (a) to (i) the parameter b takes the values 0.25, 0.189860, 0.14, 0.08995, 0.086, 0.0845735, 0.07, 0.04375 and 0.035.

curve, which will grow from the red cross; this is shown in panel (e) for b = 0.086. Approximately at b = 0.0845735, shown in Fig. 1(f), a tangency occurs at the left side of the picture, between the primary manifold and the curve $J_1$. We observe that the two separate segments of manifold that lie in the $Z_2$-region have joined. At the same time the original disjoint bubble of $W^s(p)$ connects with and then forms a part of the primary manifold. This connection happens at the preimage of the tangency with $J_1$. Decreasing b further, the primary manifold now has the shape of a horseshoe; the situation for b = 0.07 is shown in panel (g). The second bubble is still disconnected from the primary manifold. As this bubble grows with decreasing b, there is a tangency between this bubble and $J_1$ at approximately b = 0.04375; see panel (h). This bifurcation gives birth to a third disjoint bubble, which is shown in panel (i) for b = 0.035.

In the bifurcation sequence above, there are effectively only two different bifurcations that lead to the observed changes of the stable set $W^s(p)$. Both are generic codimension-one bifurcations where there is a tangency between $W^s(p)$ and $J_1$.

Fig. 2. Further bifurcations of the stable set $W^s(p)$ of the saddle p of (1) for a = -0.8; compare Fig. 1. From (a) to (c) the parameter b takes the values 0.014, 0.01306669, and 0.0125, while the curve $J_0$ is at x = 28.571, x = 30.612 and x = 32, respectively. As b → 0 the disjoint bubbles disappear in inner-fold bifurcations; one such inner-fold bifurcation is shown in panel (b).

The first bifurcation, which we call an outer-fold bifurcation, results in the creation (or disappearance) of a new isolated closed curve that belongs to $W^s(p)$. The second bifurcation, which we call an inner-fold bifurcation, changes the local connectedness of branches of $W^s(p)$. Each of these bifurcations is discussed in detail from the point of view of bifurcation theory in the next section.

We finish this section by showing what happens if b is decreased further towards zero. Since the formula (2) for $J_0$ has b in the denominator, the curve $J_0$ tends to infinity in the limit as b approaches zero. For b = 0 the map (1) is actually a diffeomorphism, that is, each point has a unique preimage and the map Q is invertible. In particular, the saddle p still exists, but its second preimage q does not. Furthermore, $W^s(p)$ must be a simply connected smooth stable manifold. Hence, as b gets closer to zero all bubbles must disappear. Figure 2 gives an indication of how this happens with the three phase portraits for (a) b = 0.014, (b) b = 0.01306669 and (c) b = 0.0125. The bubbles are joined one by one to the primary manifold in inner-fold bifurcations; compare Fig. 2(b). As b tends to zero the left branch of $W^s(p)$ retracts further and further into the $Z_2$-region and, since $J_0$ disappears to infinity, the right branch goes off to infinity for b = 0.

3. Generic Codimension-One Bifurcations of Stable Sets

The outer-fold and inner-fold bifurcations in Sec. 2 underlie various types of "basin bifurcations" [Cathala, 1998; Kitajima et al., 2000; Mira et al., 1994] and "contact bifurcations" [Agliari, 2000; Agliari et al., 2003; Lopez-Ruiz & Fournier-Prunaret, 2003] that can be found in the literature. As was mentioned in the introduction, these bifurcations were not viewed in the first instance as bifurcations of a stable set, but as bifurcations of a basin of attraction. Because the underlying outer-fold or inner-fold bifurcations can change a given basin of attraction in different ways, they are associated with many different names, depending on their global flavor and which basin of attraction is under consideration. Indeed the notation in the literature is rather complicated. It is generally related to the connectedness of the basin of attraction and some papers even contain a glossary of the names that are given to the different bifurcations [Lopez-Ruiz & Fournier-Prunaret, 2003; Mira et al., 1994].

The main point of this paper is to take the point of view of bifurcation theory and singularity theory in order to provide a systematic way of classifying qualitative changes to the stable set and, hence, to basins of attraction. To this end, we first consider the generic codimension-one bifurcations where the stable set interacts with the critical curve $J_1$. The assumption that the codimension-one bifurcation be generic means, in particular, that the stable set crosses the critical curve $J_1$ at a generic point, that is, at an image of a regular fold point of $J_0$. Furthermore, genericity demands that the stable set and $J_1$ have a quadratic tangency. (We do not consider here the case that the stable set and $J_1$ already have a generic crossing, in which case the generic bifurcation would be a cubic tangency.) In other

words, there are exactly two generic codimension-one bifurcations where the stable set crosses $J_1$ upon change of a parameter, namely the outer-fold and inner-fold bifurcations we already encountered in the previous section. We stress that this is true irrespective of the particular folding structure of the given map.

These two bifurcations, when seen only in a local neighborhood of the tangency, are the basic building blocks of any change in the structure of a basin boundary when a single parameter is varied. The second step is to consider the global arrangement of the stable set at the moment that it undergoes either an outer-fold or an inner-fold bifurcation. Since there are different possibilities of connecting branches of the stable set globally, there are different ways in which an outer-fold or inner-fold bifurcation manifests itself, for example, when one is interested in the change of a basin of attraction. In this way, all changes to a given basin of attraction, including changes to its local connectedness, can be understood and classified in a systematic way as a combination of an outer-fold or inner-fold bifurcation with a particular global flavor.

3.1. Outer-fold bifurcation

The outer-fold bifurcation occurs when a segment of the stable set becomes tangent to the $J_1$ curve on the outer side of the fold, so that a segment of the stable set crosses into the region with two extra preimages. The additional preimages of this segment form a disjoint bubble that is part of the stable set.

Figure 3 demonstrates how the outer-fold bifurcation creates a disjoint bubble by using data for

Fig. 3. A schematic representation of the outer-fold bifurcation. The map f folds the vertical plane along $J_0$ and maps it to the right of $J_1$ onto the horizontal plane. There are two regions $Z_k$ and $Z_{k+2}$ to the left and right of $J_1$ with k and k + 2 rank-one preimages, respectively. The stable set $W^s(p)$ does not intersect $J_1$ before the bifurcation, but is tangent to $J_1$ at the outer-fold bifurcation, and has two intersection points with $J_1$ after the bifurcation. After the bifurcation a part of $W^s(p)$ extends into the region $Z_{k+2}$ where there are (locally) two extra preimages. This part of $W^s(p)$ lifts to the folded phase plane, resulting in an isolated closed curve near the preimage of the tangency point on $J_0$. The shown manifolds are (scaled) data of the map (1) for a = -0.8 and b = 0.25, b = 0.189860 and b = 0.14, respectively.

the map (1). The illustration is in the spirit of singularity theory and shows the folded phase plane
in such a way that the action of the map $f$ can be interpreted as a simple projection onto regions
near $J_0$ and $J_1$, respectively. Indeed, a local neighborhood of $J_0$, shown as a vertical plane, maps to
a local neighborhood of $J_1$, shown as a horizontal plane. It is indicated in the figure that the stable
set crosses from a region $Z_k$ with $k$ preimages into a region $Z_{k+2}$ with $k+2$ preimages. However, because
we only consider the situation locally near the tangency point, this is (locally) equivalent to the case
of a $Z_0$-$Z_2$ map, such as (1).

For $b = 0.25$, a segment of the stable set is shown in light blue. Since it does not intersect $J_1$
it has $k$ preimages (outside the local neighborhood that we are interested in). For $b = 0.189860$ the stable
set is shown in a darker blue and it is tangent to $J_1$. This means that the intersection point on $W^s(p)$
and $J_1$ (red cross) has one extra (double) preimage, which is illustrated by projecting the point up to
the folded plane and then across, giving the double preimage (also denoted by a red cross) on the
curve $J_0$. The stable set for $b = 0.14$ is shown in even darker blue and it crosses $J_1$ into the folded
region, so that a whole segment of the manifold lies in the $Z_{k+2}$-region. If one projects this piece
of manifold up to the folded plane and then across to the unfolded phase plane, it is clear how the
disjoint bubble is formed around the preimage of the tangency point on $J_0$. Note that the piece of the
stable set in the $Z_{k+2}$-region has $k$ other preimages, which do not take part in the bifurcation; they
correspond to the $k$ preimages in the $Z_k$-region.

We remark that an outer-fold bifurcation occurs three times in the bifurcation sequence
shown in Fig. 1, namely in panels (b), (d) and (h). Each case is different with respect to which part of
the stable set has an outer-fold bifurcation with $J_1$. For example, in panel (d) there is a tangency of the
first bubble with the curve $J_1$, leading to the creation of the second bubble. As is shown in Fig. 4,
this bifurcation can be interpreted as a tangency of the primary manifold with the curve $J_2 = f(J_1)$.
Indeed, all images of tangencies are again tangencies of the respective images.

Fig. 4. The stable set $W^s(p)$ (blue curves) of (1) for $a = -0.8$ and $b = 0.08995$ shown together with the curves $J_0$,
$J_1$ and $J_2 = f(J_1)$ (gray curves). The saddle point $p$ and its preimage $q$ are indicated by green crosses. A tangency
between the disjoint bubble and $J_1$ must map to a tangency between the primary manifold and $J_2$. The preimage on $J_0$
of the tangency with $J_1$ is then the onset of a new disjoint bubble. Tangencies and their preimages are indicated by red
crosses.

3.2. Inner-fold bifurcation
The inner-fold bifurcation occurs when a segment of the stable set becomes tangent to the curve $J_1$
on the inner side of the fold, so that a segment of the stable set crosses into the region with two
fewer preimages. This leads to a different connectivity between the four branches of the stable set that are
involved in this bifurcation. Figure 5 demonstrates this with data for the map (1). The three panels
(a)-(c) show the situation before, at and after the inner-fold bifurcation in the same way as in Fig. 3.
Again, a local neighborhood of $J_0$, shown as the vertical plane, maps to a local neighborhood of $J_1$,
shown as the horizontal plane.

Before the tangency, in Fig. 5(a), the stable set locally extends across $J_1$ into the $Z_k$-region.
The two segments in the $Z_{k+2}$-region each have two (local) preimages which are connected across $J_0$
as indicated by the vertical plane, the projection to a local neighborhood of $J_0$. At the inner-fold
bifurcation the stable set is tangent to $J_1$ and the two segments in the $Z_{k+2}$-region connect at the
tangency point with $J_1$, indicated again by the red cross. Therefore, their preimages are also connected
at a single point on $J_0$. After the bifurcation, the stable set remains entirely inside the $Z_{k+2}$-region.
Its two disjoint preimages do not connect across $J_0$
202 J. P. England et al.

Fig. 5. A schematic representation of the inner-fold bifurcation, illustrated in the same way as in Fig. 3. The three panels
show the local phase portrait before (a), at (b), and after (c) the inner-fold bifurcation using (scaled) data from (1) for
a = -0.8 and b = 0.086, b = 0.0845735 and b = 0.07, respectively.

any longer, which means that the connectivity of the branches near $J_0$ has changed. As before, $k$ other
preimages of the stable set do not take part in the bifurcation.

We remark that an inner-fold bifurcation occurs in Fig. 1(f). The change of the local connectivity can
clearly be seen by comparing panels (e) and (g). Since the branches of the stable set involve the
primary manifold and the first bubble, this inner-fold bifurcation manifests itself as a qualitative change
of the primary manifold.

3.3. Different global flavors of the inner-fold bifurcation
The overall or global manifestation of an inner-fold bifurcation depends on which part of the stable set
crosses $J_1$. Furthermore, it is important to know how the preimages of the two segments of the stable
set, which meet at the preimage of the tangency point, are connected outside the local neighborhood
that we consider.

Figure 6 shows two topologically different global phase portraits at the moment of an inner-fold

bifurcation. Both agree in the small gray neighborhood, but the global structure outside this gray
neighborhood is different. In panel (a) the two branches are connected across $J_0$ outside a
neighborhood of the point where they meet. We have already seen this possibility for the inner-fold
bifurcation in Figs. 1(e)-1(g). This means that the stable set must cross $J_1$ at two different points, as shown in
the image under $f$ in panel (c). In Fig. 6(b), on the other hand, the two branches are connected in such
a way that they do not cross $J_0$ outside a neighborhood of the point where they meet. This means
that the image, shown in panel (d), remains entirely inside the $Z_{k+2}$-region. In this case, the stable
set in panel (b) takes the form of a pinched bubble. In the unfolding of this inner-fold bifurcation a
single bubble that crosses $J_0$ at two nearby points pinches and then splits into two separate bubbles.

Fig. 6. Two different global flavors of the inner-fold bifurcation. The stable sets in panels (a) and (b) are mapped to
the $Z_{k+2}$-region as shown in panels (c) and (d), respectively. Panels (a) and (c) demonstrate the case where, at the
bifurcation, the stable set intersects $J_1$ outside a local neighborhood of the tangency point, while in (b) and (d) the
stable set does not intersect $J_1$ outside a local neighborhood.

4. Bifurcations of Basin Boundaries
The bifurcations of the stable set that we discussed so far can be interpreted directly as bifurcations
of basins of attraction. To make this point, we show in Fig. 7 four instances of panels from Fig. 1
where we colored the basin of the origin in green and the basin of infinity in blue. The coloring is
motivated by the literature on basins of attraction that speaks of "sea", "land", "lakes" and "islands".
Indeed, as noted in [Kitajima et al., 2000; Mira et al., 1994], "islands" and "lakes" are equivalent,
simply by exchanging the coloring of the respective basins.

Figure 7(a) shows a situation where the basin of attraction is a simply connected domain. However,
as is clear from the other panels in Fig. 7, a basin of attraction of a noninvertible map need not
be simply connected. While the literature speaks of the changes to the "island number" or "lake
number" [Mira et al., 1994], a topological classification of a basin of attraction would require one to
consider its fundamental group and how it changes in bifurcations.

In this paper we argue that the notation in the literature is rather phenomenological, leading
to many seemingly different cases, while the underlying bifurcation is always either an outer-fold or an
inner-fold bifurcation of a particular global flavor. For example, an outer-fold bifurcation leads to the
situation in Fig. 7(b) where the green basin is multiply connected (has a nontrivial fundamental group).
In the literature one often finds the description that the green basin, which is an "island" or "continent"
in the "sea", has a "hole" or "lake". The bifurcation itself has been called a "connected basin ↔
multiply connected basin bifurcation" when seen from the point of view of the green basin of the origin;
it has also been called a "connected basin ↔ disconnected basin bifurcation", when seen from the
point of view of the blue basin of infinity [Mira et al., 1994]. (These two names are equivalent in
the case of only two basins, because the green basin being simply connected means, by definition, that
the blue basin is connected, and vice versa.) The second outer-fold bifurcation changes the
connectivity of the green basin again. As shown in Fig. 7(c), there are now two "lakes" (and the fundamental
group has two generators). As far as we are aware, any further increase in the connectivity of a basin is
referred to in the literature as a change to either the "island number" or the "lake number" [Mira et al.,
1994].

Finally, in an inner-fold bifurcation with the global flavor as was shown in Figs. 6(a) and 6(c), the
connectivity of the green basin changes again. As is shown in Fig. 7(d), the "lake" has now joined the
Fig. 7. The stable sets of (1) for (a) $b = 0.25$, (b) $b = 0.14$, (c) $b = 0.086$ and (d) $b = 0.07$. The green shaded area indicates
the basin of attraction for the origin (blue triangle). The blue shaded area indicates the basin of attraction of infinity.

"sea", which is also referred to as a "lake ↔ roadstead" bifurcation [Mira et al., 1994]. The
fundamental group becomes smaller, and the topology of the basins in Fig. 7(d) is the same as in Fig. 7(b).

In summary, changes to the connectivity of a basin of attraction can be classified in the spirit of
bifurcation theory by only two ingredients: whether an outer-fold or an inner-fold bifurcation is involved
and the global organization of the stable set at the moment of bifurcation. The latter involves
information about which part of the stable set, the primary branch or a disjoint piece, undergoes the bifurcation
and how branches are connected outside a neighborhood of the bifurcation point.

5. Stable Set versus Unstable Manifold
If one considers an invertible map, that is, a diffeomorphism, then the stable manifold is the
unstable manifold of the inverse $f^{-1}$, and vice versa. However, for a noninvertible map $f$ as
considered here this duality between stable and unstable manifolds does not exist. Indeed, there is a
fundamental difference between the stable set $W^s(x_0)$ of a saddle point $x_0$ and the unstable manifold $W^u(x_0)$.
This means that one also finds fundamentally different bifurcations when these sets interact with the
curves $J_0$ and $J_1$.

By definition, the global unstable manifold $W^u(x_0)$ consists of points that converge to $x_0$ under
backward iteration, that is, under application of a sequence of inverse branches of $f$. In terms of
forward iterates this can be expressed as

$$W^u(x_0) = \left\{ x \in \mathbb{R}^2 \;\middle|\; \exists\, \{q_k\}_{k=0}^{\infty},\ q_0 = x \text{ and } f(q_{k+1}) = q_k, \text{ such that } \lim_{k \to \infty} q_k = x_0 \right\}.$$

Because we assume that the Jacobian is nonsingular at $x_0$, there exists the local unstable
manifold $W^u_{loc}(x_0)$, which is associated with the unique inverse branch that fixes $x_0$. The global unstable
manifold $W^u(x_0)$ can then be expressed as

$$W^u(x_0) = \bigcup_{n=1}^{\infty} f^n\left(W^u_{loc}(x_0)\right). \qquad (3)$$

Note that even for noninvertible $f$ the images of $W^u_{loc}(x_0)$ are unique. Indeed there may be other
preimages of $W^u(x_0)$, but these are not part of $W^u(x_0)$ because points in these preimages do not
converge to $x_0$ under backward iteration. Overall, $W^u(x_0)$ is generically an immersed manifold (see
e.g. [Spivak, 1979]), so that it is justified to speak of $W^u(x_0)$ as the unstable manifold. Note that $W^u(x_0)$
may have generic transverse self-intersections as discussed below. The unstable manifold $W^u(x_0)$ can be
computed numerically with any algorithm that was developed for invertible maps.

As we discuss now, $W^u(x_0)$ may have cusp points, in which case it is a piecewise immersed
manifold. However, this situation is not generic and corresponds to a bifurcation of codimension at
least one.

As seen, an interaction between the stable set $W^s(x_0)$ and the critical curve $J_1$ leads to a
bifurcation occurring in the neighborhood of its preimages on $J_0$. For the unstable manifold $W^u(x_0)$,
the converse is true: one needs to consider the interaction between $W^u(x_0)$ and the curve $J_0$ where the
Jacobian is singular.

A transverse intersection of $W^u(x_0)$ with $J_0$ generically corresponds in the image to a tangency
between $W^u(x_0)$ and $J_1$. The genericity condition is that the tangent to $W^u(x_0)$ at the crossing point
with $J_0$ (which is, in fact, the eigenvector of the zero eigenvalue) does not coincide with the normal to $J_0$
at this point. A codimension-one bifurcation occurs when this genericity condition is violated. This is
shown in Fig. 8, where panels (a) and (c) show structurally stable tangencies of $W^u(x_0)$ with $J_1$. At
the bifurcation point, as in Fig. 8(b), the tangent to $W^u(x_0)$ and the normal to $J_0$ at the crossing point
coincide. This means that (generically) $W^u(x_0)$ has a cusp point where $W^u(x_0)$ and $J_1$ meet. This
bifurcation leads to the creation of a structurally stable, transverse self-intersection and a little loop of
$W^u(x_0)$ in a neighborhood of the image near $J_1$, as shown in Fig. 8(c). Note that there is a clear sense
of direction along $W^u(x_0)$ even when it has loops. Since $W^u(x_0)$ is invariant under $f$, this bifurcation
creates infinitely many self-intersections and loops, which are the images under $f^l$ (for any integer $l \geq 1$)
of the intersection point of $W^u(x_0)$ with $J_0$.

Self-intersections and the associated loops of an unstable manifold have been reported in the
literature; see [Lorenz, 1989], where the loops are called "antennae", and [Frouzakis et al., 1997;
Frouzakis et al., 2003; Maistrenko et al., 2003; Mira et al., 1996b]. These authors are mainly
interested in bifurcations leading to the destruction of an invariant curve (also called "IC" or "torus"),
which is the closure of unstable manifolds of suitable periodic points. The development of cusps and
then loops of the unstable manifold is interpreted as a global bifurcation of the invariant curve. For
example, Frouzakis et al. [2003, p. 107] reported the "destruction of the IC through a global bifurcation,
appearance of loops on an unstable manifold, and the reappearance of an attractor, this time chaotic
with loops". The ensuing global attractor "inherits" loops from the global unstable manifold, and this
type of attractor has been called a "weakly chaotic ring" [Frouzakis et al., 1997; Mira et al., 1996b].
The papers [Frouzakis et al., 1997; Frouzakis et al., 2003] and [Maistrenko et al., 2003] contain
explanations of how loops are formed. In [Frouzakis et al., 1997, p. 1178] the point on $J_1$ around which the
loop forms is called a "self intersection of projection". Both [Frouzakis et al., 2003] and [Maistrenko
et al., 2003] state the condition that the eigenvector corresponding to the zero eigenvalue coincides
with the normal to $J_0$ at the moment when the unstable manifold develops cusps. However, none of
the authors gives this codimension-one bifurcation a name or describes it in the spirit of bifurcation
theory.

It does not appear to have been reported explicitly in the literature that this codimension-one
bifurcation is given by a cubic tangency of $W^u(x_0)$ with respect to the normal vector of $J_0$, and that
it simply unfolds as a cusp singularity. This can be seen in the vertical neighborhood, on the left
in Fig. 8, around the transverse crossing point of $W^u(x_0)$ and $J_0$. The unfolding can be written as
the normal form

$$R(s) = \mu s - s^3 \qquad (4)$$

in the $(r, s)$-plane, where $J_0 = \{s = 0\}$, the normal vector is $(0, 1)$, $W^u(x_0) = \mathrm{graph}(R)$ and $\mu$ is
the unfolding parameter. Figure 8 was obtained by using the normal form (4) in the vertical
neighborhood near $J_0$ and then using projections via the folded plane to the horizontal neighborhood of $J_1$.
To avoid confusion with the codimension-two cusp bifurcation of equilibria, we call this
codimension-one bifurcation of the unstable manifold the loop bifurcation.

We finish this section with a contrasting statement.
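The loop bifurcation can be explored numerically with the normal form (4). The sketch below is an illustration under stated assumptions, not the construction used for Fig. 8: the projection through the folded plane is modeled by the hypothetical fold map $(r, s) \mapsto (r, s^2)$, which is singular exactly on $J_0 = \{s = 0\}$ and sends it to $J_1$, taken here as the line where the second coordinate vanishes. All function names are illustrative.

```python
import math


def unstable_branch(mu, s):
    """Point (r, s) on graph(R) with R(s) = mu*s - s**3, as in normal form (4)."""
    return (mu * s - s ** 3, s)


def fold(point):
    """Assumed fold model (r, s) -> (r, s**2); its Jacobian is singular on s = 0."""
    r, s = point
    return (r, s * s)


def image_velocity(mu, s):
    """Derivative with respect to s of fold(unstable_branch(mu, s))."""
    return (mu - 3.0 * s * s, 2.0 * s)


def loop_parameters(mu):
    """Return (s1, s2), s1 != s2, whose images coincide, or None if there is no loop.

    Equal second coordinates force s2 = -s1; equal first coordinates then
    force s1 * (mu - s1**2) = 0, so a self-intersection needs mu > 0.
    """
    if mu <= 0:
        return None
    s = math.sqrt(mu)
    return (-s, s)


# mu < 0: smooth tangency with J_1; mu = 0: cusp; mu > 0: loop.
for mu in (-0.25, 0.0, 0.25):
    print(mu, loop_parameters(mu), image_velocity(mu, 0.0))
```

In this model the image curve is tangent to $J_1$ at $s = 0$ whenever $\mu \neq 0$, its velocity vanishes there for $\mu = 0$ (the cusp), and for $\mu > 0$ the two parameter values $s = \pm\sqrt{\mu}$ map to the same point, which is the transverse self-intersection closing the loop.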

[Fig. 8, panels (a)-(c): the figure and most of its caption are not recoverable from the source.]

Proposition 1. The stable set $W^s(x_0)$ of a hyperbolic saddle point $x_0$ of a noninvertible map
$f: \mathbb{R}^n \to \mathbb{R}^n$, $n > 1$, cannot have structurally stable transverse self-intersections.

Proof. Since $x_0$ is hyperbolic it does not lie on $J_0$, so that there is a small neighborhood $U$ of $x_0$ in
which $f$ is a local diffeomorphism (taking again the appropriate branch of the inverse). Hence,
$W^s_{loc}(x_0) \subset U$ does not have self-intersections.

Consider now a transverse intersection of two branches of $W^s(x_0)$ in [illegible]. The intersection
point $t$ [illegible] must map under some iterate of $f$, say, under $f^L$, [illegible] the intersection point $t$ does
not lie on the set of preimages $\bigcup_{0 \leq l < L} f^{-l}(J_0)$ of $J_0$. Then all iterates $f^l$ for $1 \leq l \leq L$ are
diffeomorphisms on the neighborhood $V$. In particular, $f^L(V)$ contains a transverse intersection of two branches

of $W^s(x_0)$. Since $f^L(t) \in f^L(V) \cap U \neq \emptyset$ we have found a transverse intersection on $W^s_{loc}(x_0)$, which
is a contradiction.

In other words, we must have $f^l(t) \in J_0$ for some $0 \leq l < L$. This shows that the
self-intersection $t$ of $W^s(x_0)$ is not structurally stable, because a small perturbation of $f$ destroys this
property. □

The condition that $f^l(t) \in J_0$ means that, generically, we find a codimension-one inner-fold
bifurcation at $f^l(t)$ as was described in Sec. 3.2. Hence, the last self-intersection $f^l(t)$ of the stable
set unfolds as shown in Fig. 5.

Note that Proposition 1 was formulated for a saddle point $x_0$ for convenience and can be
generalized to stable sets of other hyperbolic invariant sets of saddle type. In particular, the statement holds
for a saddle periodic orbit simply by means of considering an appropriate iterate of $f$.

6. Conclusions
Bifurcations of stable sets in noninvertible planar maps occur when the stable set interacts with a
critical curve where the number of preimages changes. It is now possible to compute stable sets and
find such bifurcations directly. Note that a method for computing the regions of different numbers of
preimages of noninvertible maps has been developed in [Nien & Wicklin, 1998] and implemented in the
program PISCES [Wicklin, 1995].

We considered a generic codimension-one bifurcation of the stable set, of which there are exactly
two cases, the outer-fold and the inner-fold bifurcation. Both were illustrated with data from an
example family of noninvertible maps. Furthermore, we contrasted the properties of the stable set with
that of the unstable manifold. We showed how the unstable manifold may develop structurally stable
self-intersections, while this is not possible for the stable set.

It must be stressed that the results presented here are valid for any noninvertible map,
irrespective of its particular folding of the phase plane. This is so because the generic case is
always a quadratic tangency with a critical curve along which the number of preimages changes by
two. Furthermore, the outer-fold and the inner-fold bifurcations are the generic codimension-one
interactions of any invariant curve (not necessarily a segment of a stable set) with the critical curve.
In other words, the unfoldings presented here are equally valid, for example, to describe bifurcations
of invariant circles.

Since a basin of attraction of a noninvertible planar map is typically bounded by stable sets, the
outer-fold and the inner-fold bifurcations lead to changes in the structure of a given basin. In fact,
we argued that all changes to basins of attraction can be classified in the spirit of bifurcation theory
by the type of bifurcation in combination with its global flavor, by which is meant the global
structure of the part of the stable set that undergoes the bifurcation.

Obvious future work is the consideration of generic bifurcations of stable sets (or invariant
curves) of higher codimension. The codimension may be increased by interacting with the critical
curve at nongeneric points, for example a cusp. Alternatively, one may look at a nonquadratic
(quartic) tangency with a generic segment of the critical curve.

Finally, we mention that the general approach presented here may be used to consider
noninvertible maps on higher-dimensional spaces. The next step would be to consider noninvertible maps of $\mathbb{R}^3$.
This is already quite a challenge. One now deals with a two-dimensional critical manifold $J_1$ (instead
of a critical curve) along which the number of preimages changes. Clearly, the singularity theory
of smooth noninvertible maps of $\mathbb{R}^3$ is more complicated than that of smooth noninvertible maps
of $\mathbb{R}^2$. Furthermore, the stable set may be up to two-dimensional.

Acknowledgments
We are grateful to Bruce Peckham for sharing his insight into the literature on noninvertible maps
and for helpful comments on a draft of this paper. We thank Yuri Maistrenko for stimulating
discussions. The research of J. P. England was supported by grant GR/R94572/01 from the Engineering and
Physical Sciences Research Council (EPSRC).

References
Abraham, R. H., Gardini, L. & Mira, C. [1997] Chaos in Discrete Dynamical Systems: A Visual Introduction
in 2 Dimensions (Springer-Verlag, NY).
Adomaitis, R. A. & Kevrekidis, I. G. [1991] "Noninvertibility and structure of basins of attraction in a model
adaptive control system," J. Nonlin. Sci. 1, 95-105.
Agliari, A. [2000] "Global bifurcations in the basins of attraction in noninvertible maps and economic

applications," Proc. Third World Congr. Nonlinear Analysts, Part 8 (Catania, 2000); [2001] Nonlin. Anal.
47, 5241-5252.
Agliari, A., Gardini, L. & Mira, C. [2003] "On the fractal structure of basin boundaries in two-dimensional
noninvertible maps," Int. J. Bifurcation and Chaos 13, 1767-1785.
Arnol'd, V. I. [1992] Catastrophe Theory, 3rd, revised and expanded edition (Springer-Verlag, Berlin).
Back, A., Guckenheimer, J., Myers, M. R., Wicklin, F. J. & Worfolk, P. A. [1992] "DsTool: Computer assisted
exploration of dynamical systems," Not. Amer. Math. Soc. 39, 303-309.
Cathala, J. C. [1998] "Basin properties in two-dimensional noninvertible maps," Int. J. Bifurcation
and Chaos 8, 2147-2189.
England, J. P., Krauskopf, B. & Osinga, H. M. [2004] "Computing one-dimensional stable manifolds and
stable sets of planar maps without the inverse," SIAM J. Appl. Dyn. Syst. 3, 161-190.
Frouzakis, C. E., Adomaitis, R. A., Kevrekidis, I. G., Golden, M. P. & Ydstie, B. E. [1992] "The structure
of basin boundaries in a simple adaptive control system," Proc. NATO 1992, Advanced Summer Institute
(Ed. Bountis T.), pp. 195-210.
Frouzakis, C. E., Adomaitis, R. A. & Kevrekidis, I. G. [1996] "An experimental and computational study
of subcriticality, hysteresis and global dynamics for a model adaptive control system," Comput. Chem.
Engin. 120, 1029-1034.
Frouzakis, C. E., Gardini, L., Kevrekidis, I. G., Millerioux, G. & Mira, C. [1997] "On some properties
of invariant sets of two-dimensional noninvertible maps," Int. J. Bifurcation and Chaos 7, 1167-1194.
Frouzakis, C. E., Kevrekidis, I. G. & Peckham, B. [2003] "A route to computational chaos revisited:
Noninvertibility and the breakup of an invariant circle," Physica D177, 101-121.
Gumowski, I. & Mira, C. [1977] "Solutions chaotiques bornee d'une recurrence ou transformation ponctuelle
du second ordre a inverse non-unique," Comptes Rendus Acad. Sc. Paris A285, 477-480.
Gumowski, I. & Mira, C. [1980a] Dynamique Chaotique (Ed. Cepadues, Toulouse).
Gumowski, I. & Mira, C. [1980b] Recurrences and Discrete Dynamic Systems (Springer-Verlag, NY).
Kitajima, H., Kawakami, H. & Mira, C. [2000] "A method to calculate basin bifurcation sets for a
two-dimensional noninvertible map," Int. J. Bifurcation and Chaos 10, 2001-2014.
Lopez-Ruiz, R. & Fournier-Prunaret, D. [2003] "Complex patterns on the plane: Different types of basin
fractalization in a two-dimensional mapping," Int. J. Bifurcation and Chaos 13, 287-310.
Lorenz, E. N. [1989] "Computational chaos - a prelude to computational instability," Physica D35, 299-317.
Maistrenko, V., Maistrenko, Y. & Sushko, I. [1996] "Noninvertible two-dimensional maps arising in
radiophysics," Int. J. Bifurcation and Chaos 4, 383-400.
Maistrenko, V., Maistrenko, Y. & Mosekilde, E. [2003] "Torus breakdown in noninvertible maps," Phys. Rev.
E67, 046215.
Mira, C., Fournier-Prunaret, D., Gardini, L., Kawakami, H. & Cathala, J. C. [1994] "Basin bifurcations of
two-dimensional noninvertible maps: Fractalization of basins," Int. J. Bifurcation and Chaos 4, 343-381.
Mira, C., Jean-Pierre, C., Millerioux, G. & Gardini, L. [1996a] "Plane foliation of two-dimensional
noninvertible maps," Int. J. Bifurcation and Chaos 6, 1439-1462.
Mira, C., Gardini, L., Barugola, A. & Cathala, J. C. [1996b] Chaotic Dynamics in Two-Dimensional
Noninvertible Maps, Series of Nonlinear Science Series A (World Scientific, Singapore).
Nien, C. H. & Wicklin, F. J. [1998] "An algorithm for the computation of preimages in noninvertible
mappings," Int. J. Bifurcation and Chaos 8, 415-422.
Osinga, H. M. & England, J. P. [2003] "Global manifold 1D code, Version 2, software for use with
DsTool," http://www.dynamicalsystems.org/sw/sw/detail?item=27.
Palis, J. & de Melo, W. [1982] Geometric Theory of Dynamical Systems (Springer-Verlag, NY/Berlin).
Rico-Martinez, R., Adomaitis, R. A. & Kevrekidis, I. G. [2000] "Noninvertibility in neural networks," Comput.
Chem. Engin. 24, 2417-2433.
Spivak, M. [1979] Differential Geometry, Volume I, 2nd edition (Publish or Perish, Houston, Texas).
Wicklin, F. J. [1995] "Pisces: a platform for implicit surfaces and curves and the exploration of
singularities," Technical Report GCG #89, The Geometry Center, University of Minnesota, Minneapolis,
MN. Available online at http://www.geom.uiuc.edu/~fjw/pisces/.
MULTIPARAMETRIC BIFURCATIONS IN AN
ENZYME-CATALYZED REACTION MODEL
E. FREIRE, L. PIZARRO, A. J. RODRIGUEZ-LUIS
and F. FERNANDEZ-SANCHEZ
Department of Applied Mathematics II, E.T.S. Ingenieros, Univ. Sevilla,
Camino de los Descubrimientos s/n, 41092-Sevilla, Spain

Received April 26, 2004; Revised June 17, 2004

An exhaustive analysis of local and global bifurcations in an enzyme-catalyzed reaction model
is carried out. The model, given by a planar five-parameter system of autonomous ordinary
differential equations, presents a great richness of bifurcations. This enzyme-catalyzed model
has been considered previously by several authors, but they only detected a minimal part of the
dynamical and bifurcation behavior exhibited by the system.
First, we study local bifurcations of equilibria up to codimension-three (saddle-node, cusps,
nondegenerate and degenerate Hopf bifurcations, and nondegenerate and degenerate
Bogdanov-Takens bifurcations) by using analytical and numerical techniques. The numerical continuation
of curves of global bifurcations allows us to improve the results provided by the study of local
bifurcations of equilibria and to detect new homoclinic connections of codimension-three. Our
analysis shows that such a system exhibits up to sixteen different kinds of homoclinic orbits
and thirty different configurations of equilibria and periodic orbits. The coexistence of up to five
periodic orbits is also pointed out. Several bifurcation sets are sketched in order to show the
dynamical behavior the system exhibits. The different codimension-one and -two bifurcations
are organized around five codimension-three degeneracies.

Keywords: Local bifurcations; homoclinic connections; enzyme model.

1. Introduction
In this paper we study local bifurcations of equilibria that occur in a planar five-parameter system of
autonomous ordinary differential equations, arising from a reaction-diffusion model governed by two
partial differential equations, proposed in the study of enzyme catalyzed reactions. We refer the
interested reader to [Thomas, 1975; Murray, 1981a, 1981b; Kernevez et al., 1979; Kernevez et al., 1983;
Murray, 2002, 2003]. Namely, the system we consider, that will be called enzyme system in the
sequel, is:

[system (1): the displayed equations are illegible in the source]

where $s, a, a_0, s_0, \rho, \alpha, K > 0$.

Our objective is to describe exhaustively the bifurcation behavior that the enzyme system
exhibits. One of the principal methods we use is normal form theory, which provides information
about the nonlinear terms essential for describing the bifurcation behavior. The analytical study of
the local bifurcation behavior of a degenerate equilibrium provides knowledge of the bifurcation set in
a neighborhood of the parameter space point where such degenerate equilibrium occurs (organizing
centers in both state and parameter spaces). For a general discussion on bifurcations and normal form
theory, see e.g. [Guckenheimer & Holmes, 1997; Kuznetsov, 1998; Wiggins, 2003].

The local information obtained from the unfolding of a singularity is only valid in a
certain neighborhood. In practice, the size of this neighborhood is not usually very small and,

210 E. Freire et al.

therefore, the local results persist far away from the organizing centers. The numerical techniques of
continuation are a good tool for extending the local results. In this way, the local analytical information
gives the starting points in both state and parameter spaces.

The continuation codes we have used not only allow us to extend the local analytical results, but
also to detect new bifurcation phenomena. These numerical techniques are described in [Freire et al.,
1999b; Freire et al., 2000].

Now let us briefly describe the bifurcations we have obtained (see Table 1). Bifurcations up to
codimension-three of equilibria, periodic orbits and homoclinic connections appear.

System (1) undergoes several cases of Hopf bifurcation. Let us consider the Hopf bifurcation
normal form [Guckenheimer & Holmes, 1997; Kuznetsov, 1998]:

$$\dot{r} = \lambda(\eta)\, r + a_1 r^3 + a_2 r^5 + a_3 r^7 + \cdots \qquad (2)$$

where $\lambda(0) = 0$ and $\eta$ is the bifurcation parameter. Assuming the transversality condition
$(d\lambda/d\eta)(0) \neq 0$ holds: a nondegenerate Hopf bifurcation occurs when $a_1 \neq 0$; a degenerate
codimension-two Hopf bifurcation, labeled $H_1$, arises if $a_1 = 0$ and $a_2 \neq 0$; a degenerate
codimension-three Hopf bifurcation, labeled $H_2$, appears if $a_1 = a_2 = 0$, $a_3 \neq 0$. When the
transversality condition fails, i.e. $(d\lambda/d\eta)(0) = 0$ and $a_1 = 0$, a nontransversal degenerate
codimension-three Hopf bifurcation, labeled HT, occurs.

Let us now consider a second-order normal form of the Bogdanov-Takens bifurcation
(see, for instance [Guckenheimer & Holmes, 1997; Kuznetsov, 1998]):

$$\dot{x} = y, \qquad \dot{y} = a x^2 + b x y.$$

(Note that, in this normal form, $a$ is not the state variable of the enzyme system.) A nondegenerate
Bogdanov-Takens bifurcation occurs if $a, b \neq 0$. When $a \neq 0$ and $b = 0$ a degenerate
codimension-three Bogdanov-Takens bifurcation, called cusp of order three and labeled E, appears; if $a = 0$
and $b \neq 0$ another degenerate codimension-three Bogdanov-Takens bifurcation, labeled D,
arises. These degeneracies have been studied in

Table 1. Bifurcation phenomena exhibited by the enzyme system. The symbol (•)
means that this bifurcation has been previously studied by other authors. Abbreviations
are explained in the text.

Codimension   Equilibria                                 Periodic Orbits    Homoclinic Orbits

1             saddle-node (•)                            saddle-node (•)    nonzero trace
              nondegenerate Hopf (•)                                        central saddle-node

2             cusp                                       cusp               zero trace
              degenerate Hopf ($H_1$) (•)                                   noncentral saddle-node (5 types)
              nondegenerate Bogdanov-Takens                                 double

3             degenerate Bogdanov-Takens (E)                                CL
              degenerate Bogdanov-Takens (D)                                HEID
              degenerate Hopf ($H_2$)
              nontransversal degenerate Hopf (HT) (•)
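The case distinctions for the Hopf normal form (2) and for the second-order Bogdanov-Takens normal form can be written down as a small decision procedure. The sketch below only restates the conditions given in the text. The function names, the returned labels and the tolerance `tol` are assumptions made for illustration, and combinations of coefficients that the text does not discuss are reported as "unclassified" rather than guessed.

```python
def classify_hopf(dlambda_deta, a1, a2, a3, tol=1e-12):
    """Label a Hopf point from the coefficients of normal form (2).

    dlambda_deta is (d lambda / d eta)(0); a1, a2, a3 are the coefficients
    of r**3, r**5 and r**7. Values below tol in modulus count as zero.
    """
    def nonzero(v):
        return abs(v) > tol

    if nonzero(dlambda_deta):  # transversality condition holds
        if nonzero(a1):
            return "nondegenerate Hopf (codimension one)"
        if nonzero(a2):
            return "H1 (degenerate, codimension two)"
        if nonzero(a3):
            return "H2 (degenerate, codimension three)"
        return "unclassified"
    if not nonzero(a1):  # transversality fails and a1 = 0
        return "HT (nontransversal degenerate, codimension three)"
    return "unclassified"


def classify_bogdanov_takens(a, b, tol=1e-12):
    """Label a Bogdanov-Takens point from x' = y, y' = a*x**2 + b*x*y."""
    a_nonzero = abs(a) > tol
    b_nonzero = abs(b) > tol
    if a_nonzero and b_nonzero:
        return "nondegenerate Bogdanov-Takens"
    if a_nonzero:
        return "E (cusp of order three, codimension three)"
    if b_nonzero:
        return "D (degenerate, codimension three)"
    return "unclassified"


print(classify_hopf(1.0, 0.0, -0.5, 0.2))
print(classify_bogdanov_takens(1.0, 0.0))
```

In practice the coefficients come from a numerical or symbolic normal form computation, so an exact zero is never observed; the tolerance stands in for the test function that a continuation code would monitor for a sign change.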
Multiparametric Bifurcations in an Enzyme-Catalyzed Reaction Model 211

[Dumortier et al., 1987] and [Dumortier et al., 1991], respectively.

The different homoclinic bifurcations of Table 1 are described in Secs. 3 and 4.

The enzyme system (1) has been previously studied by several authors, but their works describe only a small part of its dynamical behavior. Thus, [Doedel et al., 1991; Doedel & Kernevez, 1986; Doedel et al., 1998] consider the enzyme system as an example for applying the software package AUTO, and they only provide numerical evidence of the existence of saddle-node and Hopf bifurcations of equilibria and saddle-node bifurcations of periodic orbits. Kernevez et al. [1983, 1985] performed a numerical study of system (1), obtaining the same bifurcation phenomena. Hassard and Jiang devoted two papers to the study of the enzyme system. In [Hassard & Jiang, 1992], they detected, numerically, a nontransversal Hopf bifurcation point (they also obtained, using AUTO, curves of saddle-node bifurcation of periodic orbits which appear in an unfolding of this singularity). Note that a suitable choice of the bifurcation parameters can avoid this failure of the transversality condition. In [Hassard & Jiang, 1993], they carried out a detailed study of a degenerate nontransversal Hopf bifurcation point given by the vanishing of the third-order coefficient of the normal form; they located this singularity in both parameter and state spaces and they obtained some information about its unfolding.

Thus, in this work we present a more comprehensive analysis of system (1) and we study every local and global bifurcation, up to codimension-three. The reason why our analysis improves on the previous ones is that the enzyme system can be written as a low-order polynomial system by means of a suitable change of variables.

The paper is organized as follows. Section 2 is first devoted to the transformation of the enzyme system into a low-order polynomial system, which facilitates the application of normal form theory in order to obtain analytical results about high-codimension bifurcations of equilibria. Once the transformation is done, we study, in several subsections, bifurcations of equilibria of codimension-one and -two: saddle-node and cusp bifurcations, nondegenerate and degenerate Hopf bifurcations and nondegenerate Bogdanov-Takens bifurcations. We also consider a case of nontransversal degenerate Hopf bifurcation. Two kinds of degenerate Bogdanov-Takens bifurcations and a limit case are studied in Sec. 3. In Sec. 4 we explain the numerical methods we have used to detect and to continue the global bifurcations up to codimension-three. Note that sixteen different kinds of homoclinic orbits appear. In Sec. 5 we apply those numerical methods, analyzing basically codimension-two and -three homoclinic orbits. In Sec. 6, several bifurcation sets show the way the bifurcations are organized by five local and global bifurcations of codimension-three. We finish this work with some conclusions and remarks.

2. Codimension-One and -Two Bifurcations of Equilibria

The first part of this section is devoted to transforming the enzyme system (1) into a low-degree polynomial system. We note that system (1) could be defined for negative state variables, although this has no biochemical meaning. In fact, it is easy to verify that if $\kappa > 1/4$ then system (1) is defined for all $(s, a) \in \mathbb{R}^2$, whereas if $\kappa < 1/4$ it is defined in a half-plane containing the positive quadrant $\mathbb{R}^+ \times \mathbb{R}^+$.

An equilibrium $(\bar s, \bar a)$ of system (1) has to verify

$$s_0 - \bar s = \alpha(a_0 - \bar a) = \frac{\rho\, \bar s\, \bar a}{1 + \bar s + \kappa \bar s^2} > 0$$

and, therefore, $(\bar s, \bar a) \in (0, s_0) \times (0, a_0)$; moreover, $(\bar s, \bar a)$ lies on the straight line $s_0 - s = \alpha(a_0 - a)$.

It is easy to prove that the rectangle $[0, s_0] \times [0, a_0]$ is a positively invariant set and, even more, it is an attracting set for the positive quadrant.

The following result will play a key role throughout this work, since it states that the enzyme system can be rewritten as a polynomial vector field whose components are of degree two and six.

Lemma 2.1. The system (1) is, for $s > 0$, $C^\infty$ orbital equivalent to the system

$$\dot u = u v, \qquad \dot v = v^2 + F_1(u)\, v + F_2(u), \qquad (3)$$

where

$$F_1(u) = u(s_0 - u)\,p'(u) - (\alpha u + s_0)\,p(u) - \rho u^2,$$
$$F_2(u) = u\, p(u)\, h(u),$$
$$p(u) = 1 + u + \kappa u^2,$$
$$h(u) = h(u, a_0, s_0, \rho) = -\rho \alpha a_0 u + (s_0 - u)[\alpha p(u) + \rho u], \qquad (4)$$

and the symbol $'$ stands for $d/du$.
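The change of variables behind Lemma 2.1 can be checked numerically. The sketch below is only an illustration and rests on one assumption that this section does not restate: system (1) is taken to be $\dot s = s_0 - s - \rho s a / p(s)$, $\dot a = \alpha(a_0 - a) - \rho s a / p(s)$, which is the form suggested by the equilibrium condition above. The helper names (`enzyme_field`, `lemma_residual`) are hypothetical. The code maps the original vector field through $u = s$, $v = (s_0 - s)p(s) - \rho s a$ with the combined time rescaling factor $u\,p(u)$ and compares the result with the polynomial field (3).

```python
def p(u, kappa):
    """p(u) = 1 + u + kappa*u^2."""
    return 1.0 + u + kappa * u * u


def dp(u, kappa):
    """p'(u)."""
    return 1.0 + 2.0 * kappa * u


def h(u, a0, s0, rho, alpha, kappa):
    """Cubic h(u) of (4)."""
    return -rho * alpha * a0 * u + (s0 - u) * (alpha * p(u, kappa) + rho * u)


def F1(u, s0, rho, alpha, kappa):
    return (u * (s0 - u) * dp(u, kappa)
            - (alpha * u + s0) * p(u, kappa)
            - rho * u * u)


def F2(u, a0, s0, rho, alpha, kappa):
    return u * p(u, kappa) * h(u, a0, s0, rho, alpha, kappa)


def enzyme_field(s, a, a0, s0, rho, alpha, kappa):
    # ASSUMPTION: explicit form of system (1), inferred from the
    # equilibrium condition; it is not restated in this section.
    reaction = rho * s * a / p(s, kappa)
    return s0 - s - reaction, alpha * (a0 - a) - reaction


def lemma_residual(s, a, a0, s0, rho, alpha, kappa):
    """Normalized mismatch between the mapped field and the field (3)."""
    ds, da = enzyme_field(s, a, a0, s0, rho, alpha, kappa)
    u = s
    v = (s0 - s) * p(s, kappa) - rho * s * a
    time_scale = u * p(u, kappa)  # both time reparameterizations combined
    # Chain rule for v(s, a).
    dv_ds = -p(s, kappa) + (s0 - s) * dp(s, kappa) - rho * a
    dv_da = -rho * s
    du_mapped = time_scale * ds
    dv_mapped = time_scale * (dv_ds * ds + dv_da * da)
    # Polynomial field (3).
    f1v = F1(u, s0, rho, alpha, kappa) * v
    f2 = F2(u, a0, s0, rho, alpha, kappa)
    du_poly = u * v
    dv_poly = v * v + f1v + f2
    size = 1.0 + abs(v * v) + abs(f1v) + abs(f2)
    return max(abs(du_mapped - du_poly), abs(dv_mapped - dv_poly)) / size


if __name__ == "__main__":
    for s, a in [(2.0, 5.0), (12.91003, 40.0), (0.5, 100.0)]:
        print(s, a, lemma_residual(s, a, 330.0, 67.0, 2.3, 0.2, 0.1))
```

The residual is normalized by the size of the individual terms, so values near machine precision indicate that the two vector fields agree at the sampled points; this is a consistency check, not a proof.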
212 E. Freire et al.

Proof. If we make the time reparameterization $t \to (1 + s + \kappa s^2)\,t$ (note that $1 + s + \kappa s^2 > 0$ for $s > 0$), followed by the change of variables

$$u = s, \qquad v = (s_0 - s)(1 + s + \kappa s^2) - \rho s a,$$

and the new time reparameterization $t \to u t$ ($u > 0$), system (1) may be written as in (3). □

Now the system is written in a suitable form for its bifurcation analysis. Then, we start the study of codimension-one and -two bifurcations of equilibria of system (3), which are given by $(u_*, 0)$, where $u_*$ is a root of the third-degree polynomial $h$ given in (4).

The linearization matrix of (3) at the equilibrium is

$$A = \begin{pmatrix} 0 & u_* \\ F_2'(u_*) & F_1(u_*) \end{pmatrix}.$$

Since $\det A = -u_*^2\, p(u_*)\, h'(u_*)$ and the leading term coefficient in $h$ is $-\alpha\kappa < 0$, when $h(u)$ has three roots, these equilibria correspond to a hyperbolic saddle located between two foci or nodes; in the case that $h(u)$ has only one root, it corresponds to a hyperbolic node or focus. Double and triple roots correspond to nonhyperbolic equilibria: saddle-node and cusp bifurcations of equilibria, respectively. Moreover, other bifurcations may appear: a Hopf bifurcation occurs when a focus becomes nonhyperbolic, and a Bogdanov-Takens bifurcation appears when a Hopf and a saddle-node bifurcation collide.

In Sec. 2.1 the saddle-node bifurcation of equilibria and the cusp of saddle-node bifurcation are studied. Hopf bifurcation and its degeneracies are treated in Sec. 2.2. The existence of nontransversal degenerate Hopf bifurcation is considered in Sec. 2.3. Finally, nondegenerate Bogdanov-Takens bifurcations are analyzed in Sec. 2.4.

2.1. Saddle-node and cusp bifurcations

The following proposition states the existence of saddle-node and cusp bifurcations.

Proposition 2.1. For each $\alpha$ and $\kappa$ positive, in the $(a_0, s_0, \rho)$-space:

(a) a saddle-node bifurcation of equilibria occurs on the parameterized surface given by the rational expressions

$$s_0 = \frac{u_*^2\,(\rho + \alpha p'(u_*))}{\alpha\,(u_* p'(u_*) - p(u_*))}, \qquad a_0 = \frac{(\alpha p(u_*) + \rho u_*)^2}{\alpha^2\,(u_* p'(u_*) - p(u_*))\,\rho},$$

for $u_*, \rho > 0$ and $h''(u_*) \neq 0$;

(b) the parameterized curve, given by the rational expressions

$$a_0 = \frac{4\,(u_* p'(u_*) - p(u_*))^3}{\alpha\,[u_*^2 p''(u_*) - 2(u_* p'(u_*) - p(u_*))]\,[2 p'(u_*)(u_* p'(u_*) - p(u_*)) - u_* p(u_*) p''(u_*)]},$$
$$s_0 = u_* + \frac{2 u_*\,(u_* p'(u_*) - p(u_*))}{u_*^2 p''(u_*) - 2(u_* p'(u_*) - p(u_*))},$$
$$\rho = \frac{2\alpha p'(u_*)(u_* p'(u_*) - p(u_*)) - \alpha u_* p(u_*) p''(u_*)}{u_*^2 p''(u_*) - 2(u_* p'(u_*) - p(u_*))},$$

for $u_* > 0$ and $h'''(u_*) \neq 0$, is the locus where a cusp bifurcation of equilibria occurs.

Proof. Since $F_2'(u_*) = u_* p(u_*) h'(u_*)$ and $u_*, p(u_*) > 0$, the equilibrium $(u_*, 0)$ undergoes a saddle-node bifurcation, for each $\alpha$ and $\kappa$ positive, if $h(u_*) = h'(u_*) = 0$ and $h''(u_*) \neq 0$. Solving the above two equations, we obtain the required expression for the surface of saddle-node bifurcation. If $h(u_*) = h'(u_*) = h''(u_*) = 0$ and $h'''(u_*) \neq 0$ the equilibrium $(u_*, 0)$ undergoes a cusp bifurcation. Solving these last equations, we obtain the expression for the curve of cusp bifurcation stated in the proposition. □

This result states where the system has one, two or three equilibria in the parameter space. The saddle-node bifurcation locus separates the regions where either one or three equilibria exist.

In Fig. 1 we observe, in the $(a_0, \rho)$-plane, for $s_0 = 37$, $\alpha = 0.2$ and $\kappa = 0.1$, that the two

saddle-node curves $sn_L$ and $sn_R$ collide in a cusp C. Two Bogdanov-Takens points (see Sec. 2.4), $BT_L$ and $BT_R$, are also present. In the narrow region between $sn_L$ and $sn_R$ three equilibria coexist, and outside this zone the system only has one equilibrium point. The subscript L (left) in $sn_L$ indicates that the middle and the left equilibria of the system collapse in the saddle-node bifurcation, and in $BT_L$ it means that the bifurcation is exhibited by the nonhyperbolic left equilibrium. Similar comments are valid for the subscript R (right).

A projection on the $(a_0, s_0)$-parameter plane of the curve of cusp bifurcations of equilibria in the $(a_0, s_0, \rho)$-parameter space, labeled C, appears in Fig. 2, for $\alpha = 0.2$ and $\kappa = 0.1$.

Fig. 1. Curves of saddle-node bifurcations of equilibria, $sn_R$ and $sn_L$, in the $(a_0, \rho)$-plane, for $s_0 = 37$, $\alpha = 0.2$ and $\kappa = 0.1$. Two Bogdanov-Takens bifurcation points, $BT_R$ and $BT_L$, and a cusp bifurcation of equilibria, C, appear on those curves. The L (R) index indicates that the corresponding bifurcation is exhibited by the left (right) equilibrium.

Fig. 2. Projection, onto the $(a_0, s_0)$-plane, of the curve of cusp bifurcations of equilibria in the $(a_0, s_0, \rho)$ parameter space, for $\alpha = 0.2$ and $\kappa = 0.1$.

2.2. Hopf bifurcations

The analytical and numerical study of the normal form of the Hopf bifurcation guarantees the existence of these bifurcations and some degeneracies up to codimension-three. For the next discussion, see, e.g. [Guckenheimer & Holmes, 1997; Kuznetsov, 1998].

Other cases of higher codimension Hopf bifurcation are possible when the higher-order coefficients in the Hopf bifurcation normal form vanish.

Proposition 2.2. For each $\alpha$ and $\kappa$ positive, in the $(a_0, s_0, \rho)$-space:

(a) a codimension-one Hopf bifurcation occurs on the parameterized surface given by the rational expressions

$$a_0 = \frac{(s_0 - u_*)\,[u_* p'(u_*)(u_* - s_0) + p(u_*)\, s_0]}{\alpha\,[u_* p'(u_*)(u_* - s_0) + p(u_*)(\alpha u_* + s_0)]}, \qquad \rho = \frac{u_*(s_0 - u_*)\,p'(u_*) - (\alpha u_* + s_0)\,p(u_*)}{u_*^2}, \qquad (5)$$

for $u_*, s_0 > 0$ with $\sigma'(u_*) \neq 0$ ($\sigma(u_*)$ is defined in (7)), $h'(u_*) < 0$ and

$$\psi(u_*) = \frac{F_1''(u_*)}{F_1'(u_*)} - \frac{2p'(u_*)}{p(u_*)} - \frac{h''(u_*)}{h'(u_*)} \neq 0;$$

(b) a degenerate Hopf bifurcation of codimension-two occurs on the curve given by the rational expressions

$$\sigma(u_*) = 0, \qquad \psi(u_*) = 0,$$

for $u_* > 0$ with $\sigma'(u_*) \neq 0$, $h'(u_*) < 0$ and $a_2(u_*) \neq 0$ ($a_2$ is defined in (12)).

Proof. The value of $a_0$ can be obtained from $h(u) = 0$ as

$$a_0 = a_0(u) = \frac{(s_0 - u)(\alpha p(u) + \rho u)}{\rho \alpha u}. \qquad (6)$$
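As a hedged illustration of how a point of the Hopf surface can be produced from the conditions used in this proof, the sketch below solves the zero-trace condition $F_1(u_*) = 0$ for $\rho$, evaluates $a_0$ from (6), and compares it with the closed form (5). The functions $p$, $h$ and $F_1$ follow Lemma 2.1; the function names and the sample values $u_* = 10$, $s_0 = 50$ are illustrative choices, not taken from the paper.

```python
def p(u, kappa):
    return 1.0 + u + kappa * u * u


def dp(u, kappa):
    return 1.0 + 2.0 * kappa * u


def h(u, a0, s0, rho, alpha, kappa):
    return -rho * alpha * a0 * u + (s0 - u) * (alpha * p(u, kappa) + rho * u)


def dh(u, a0, s0, rho, alpha, kappa):
    """h'(u); its sign decides whether the equilibrium is a saddle."""
    return (-rho * alpha * a0
            - (alpha * p(u, kappa) + rho * u)
            + (s0 - u) * (alpha * dp(u, kappa) + rho))


def F1(u, s0, rho, alpha, kappa):
    return (u * (s0 - u) * dp(u, kappa)
            - (alpha * u + s0) * p(u, kappa)
            - rho * u * u)


def hopf_point(u, s0, alpha, kappa):
    """Return (a0, rho) with F1(u) = 0 (zero trace) and h(u) = 0."""
    rho = (u * (s0 - u) * dp(u, kappa) - (alpha * u + s0) * p(u, kappa)) / (u * u)
    a0 = (s0 - u) * (alpha * p(u, kappa) + rho * u) / (rho * alpha * u)  # eq. (6)
    return a0, rho


def a0_closed_form(u, s0, alpha, kappa):
    """a0 as written in (5)."""
    num = (s0 - u) * (u * dp(u, kappa) * (u - s0) + p(u, kappa) * s0)
    den = alpha * (u * dp(u, kappa) * (u - s0) + p(u, kappa) * (alpha * u + s0))
    return num / den


if __name__ == "__main__":
    u, s0, alpha, kappa = 10.0, 50.0, 0.2, 0.1
    a0, rho = hopf_point(u, s0, alpha, kappa)
    print("rho =", rho, "a0 =", a0)
    print("a0 from (5) =", a0_closed_form(u, s0, alpha, kappa))
    print("h  =", h(u, a0, s0, rho, alpha, kappa))
    print("F1 =", F1(u, s0, rho, alpha, kappa))
    print("h' =", dh(u, a0, s0, rho, alpha, kappa))
```

A negative value of $h'(u_*)$ confirms that the determinant of the linearization is positive, so the zero-trace equilibrium has purely imaginary eigenvalues. The nondegeneracy conditions of the proposition are not checked here.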

The linearization matrix of (3) at the


equilibrium is ^-u±p{u*)h'{u+)
0 u* \ allow to write the system (3) as
A = (7)

where
_/o - m \ / « \ //(«,«)'
(11)
7(/u*) = u*p(u*)h'(u*,ao(u*), s0,p),
o(u±) = Fi(u*,s0,p). where

Thus, the equilibrium (u*,0) of system (3) w0 = ^-u2p(u±)h'(u+),


undergoes a Hopf bifurcation if a(u+) = 0
and 7(11*) < 0. The transversality condition is f(u,v) = V'-u*p(u*)h''(u*)v2 + Fiiy/u^u + u*)v
given by
F2(y/u^u + u*) - u2p(u*)ti(u*)u
K^H^*^
where A = r\ ± uji are the eigenvalues of (7). Note
(8)
+ •
^-u+p(u*)h'(u±)

g(u,v) = V'-u±p{u±)h''(u* w .
that we have taken u* as the bifurcation parameter.
From In the study of the Hopf bifurcation of sys-
tem (11) and its possible degeneracies, the hand
a{u±) = u*(s0 - u*)p'(u*) calculation (as opposed to numerical evaluation) of
-(aw* + s0)p(Ui,) - pul = 0, (9) very long expressions is required, when the corre-
sponding bifurcation formulae are being used (see
it is easy to obtain, for each value of p, a and K, the
[Hassard & Jiang, 1992, 1993]). Freire et al. [1989]
curve of Hopf bifurcation in the (u*, so)-plane
developed a recursive algorithm well suited to sym-
SQ = U± + [pu* + (a + \)p{u* (10) bolic computation implementation, that turns out
1
KUX to be an efficient procedure to obtain the coeffi-
This curve has two asymptotes: u* = 1/y/n and cients of the Hopf bifurcation normal form. This
s 0 = (a + 2)u* + (p + a + l ) / « . For «* € (0,1/V«), algorithm is based upon the use of Lie transforms;
the value of SQ is negative. Therefore, the range of the calculations are arranged in a recursive scheme
u* is the interval (1/i/R, +oo). using complex variables and so the computational
Combining (9) and (6) we obtain, for each a effort is optimized.
and K positive, the parameterized surface (5). To The application of the aforementioned algo-
assure the codimension-one character, we need to rithm, by means of a MAPLE program, to compute
compute the normal form. the coefficients a\, a2 and 03 of order 3, 5 and 7,
The translation respectively, of the Hopf bifurcation normal form,
provides:
u —> u + «*,
1 A2,i
v —>• v, a\ = 16.4i: a2 =
1152 A 2 ) 2 '
followed by the rescaling transformation (12)
A3,i
1 «3
u u, 18432 A 3 2'
with

A n = u^F{'F^ - u^F[ + 2F[F'2,

^1,2 = K
A2,i = lOuffiFi"^ + 30u^(F{')2 - 12F^F^F{ - l O u ^ ' i f F / + 2,ul(F[)2 F2IV
- 16U*F2,F{"F; + Au*{F[)2F%,
A2,2 = F^F{,

AsA = S5ui(Fi)2Fiv(F^f - 35utFiFV(F{)2(FZ)2 + 35uiF? (Ftf FJV F? F{


+ 5uiFg(F{)2(FtfF?1 - \mulFl'F'2F[{Fllf - 315^F^ 2 /V '{F[f{F%f
+ 204ulF2"{F[)2F2v(F2\)2 - U0ulF^'{F^3(F{")2 - 126u^)3F2IVF{"F{
- ttuliFtfiFtfFY1 + 1260ulF["(F^2F{{F^)2 + 942«2(JF2')2F2IV{F[)2F%
- 288ul(F{)2F¥{Ftf + 504u 2 (i^) 4 (F{") 2 - 3768u*F[" (F^)3 F[F^'
-936u^)3Fiv{F{)2 + 37MF{"(Ft,)4F{,
A3j2 = (F^3F{(5u^-12F^,

where the derivatives of the polynomial functions coefficients a\ and 02 given in (12) vanish, for each
F\ and F2 are evaluated in u+. a and K positive. This point lies on H^ (a 2 < 0 for
Note that we have obtained the coefficients up all the points of Hi) and corresponds to a degen-
to order 7 of the Hopf bifurcation normal form only erate codimension-three Hopf bifurcation. A curve
in terms of the functions F\ and F^. of points DH, denoted by H2, projected on the
From F2(u) = up(u)h(u) an even more sim- (it*, a)-plane appears in Fig. 5 (obtained numeri-
plified expression for the coefficient a± can be pro- cally for K, = 0.1 by using the aforementioned code
vided, namely, PITCON 6.0). Local analysis provides the existence
of a curve of cusp bifurcation of periodic orbits,
ai = ai(u*) Cu, emerging from the point DH. We have eval-
j/K) , /i>* uated the coefficient 03 at the parameter values at
16 p(u+) +' h'(u. *i(«*) which the codimension-three Hopf bifurcation DH
occurs (for (a,n) e (0,1) x (0,10000)), obtaining
(cf. the long expressions used for the evaluation of that as is always negative. It means, on the one
a\ given in Appendix B of [Hassard & Jiang, 1992]). hand, the nonexistence of degenerate codimension-
The nondegeneracy condition a\(u±) 7^ 0 is equiva- four Hopf bifurcation points and, on the other, that
lent to ^(w*) 7^ 0 and this proves (a). the curve Cu emerges from point DH by the side of
The Hopf conditions /&(«*) = 0 and a(u+) = 0 the curve H'i where the coefficient a 2 is positive (see
along with the additional one, a\(u*) = 0, allow [Takens, 1973] and Fig. 6). In Fig. 7 two qualitative
to obtain, in the (ao, so>p)-space (for each a and K
positive), the curve of degenerate codimension-two
Hopf bifurcation stated in (b), assuming that it ver-
ifies the nondegeneracy condition 0.2(11*) 7^ 0. •

In Fig. 3 we show the curve of nondegenerate


Hopf bifurcation (solid line) in the (u*, so)-plane,
given by (10), for p = 1, a = 0.2 and K — 0.1.
Its two asymptotes are also drawn (dashed lines).
By using a general purpose continuation code
(PITCON 6.0, see [Rheinboldt, 1986]), we have
obtained numerically for a = 0.2 and K = 0.1 the
curve of degenerate codimension-two Hopf bifurca-
tion. It is formed by two components, labeled Hi
and H'1; respectively, which are shown in Fig. 4.
Other bifurcations shown in this figure are ana-
lyzed below.
Fig. 3. Curve of nondegenerate Hopf bifurcation (solid line) and its asymptotes (dashed line) in the $(u_*, s_0)$-plane for $\rho = 1$, $\alpha = 0.2$ and $\kappa = 0.1$.

Remarks. There exists a Hopf bifurcation point, in the $(a_0, s_0, \rho)$-space, labeled DH, where both

" 715 720 725 Fig. 6. Qualitative picture, in the (ao, so,p)-parameter
space, of the curves of Bogdanov-Takens (BT), degenerate Hopf ($H_1$ and $H_1'$) and cusp of periodic orbits (Cu).

pictures are represented: they show the relative position (with respect to the nondegenerate Hopf bifurcation curves), in the $(a_0, \rho)$-parameter plane (for $\alpha$ and $\kappa$ constant), of the saddle-node of periodic orbits curves emerging from the codimension-two Hopf bifurcation points $H_1$ and $H_1'$, for two values of $s_0$: one of them greater than the critical value of $s_0$ for which the point DH occurs (in this case, the saddle-node curves of periodic orbits coalesce in the mentioned cusp of periodic orbits) and the other one less than this critical value (in this case, the cusp of periodic orbits no longer exists).

Fig. 4. Projection, onto the $(a_0, s_0)$-parameter plane, for $\alpha = 0.2$ and $\kappa = 0.1$, of the curves of Bogdanov-Takens (BT), degenerate Hopf ($H_1$ and $H_1'$) and cusp of equilibria (C). The degenerate Bogdanov-Takens points, E and D, are also shown.

Fig. 5. Curves of codimension-three bifurcations of equilibria in the $(u_*, \alpha)$-plane (for $\kappa = 0.1$). E = cusps of order three; D = weak foci; $H_2$ = degenerate Hopf points.

2.3. Nontransversal degenerate Hopf bifurcations

When a complex-conjugate pair of eigenvalues of the linearization matrix crosses the imaginary axis in a degenerate way (i.e. the crossing is nontransversal), a nontransversal Hopf bifurcation arises. If we consider the Hopf bifurcation normal form given in (2), the nontransversality condition of the Hopf bifurcation merely means that $(d\lambda/d\eta)(0) = 0$.

In this subsection we study a case of nontransversal degenerate Hopf bifurcation arising in the system (3) (this kind of degeneracy corresponds to a topological $\mathbb{Z}_2$-codimension 2 in the context of [Golubitsky & Schaeffer, 1985]). Our only aim is to show how the results of [Hassard & Jiang, 1993] may be obtained from our analysis.

We note that a first case of nontransversal degenerate codimension-two Hopf bifurcation arises when the transversality condition given in (8) fails. Hassard and Jiang [1992] studied this degeneracy. We will not consider this kind of degeneracy because of the simplicity it exhibits, in contrast to the great richness of bifurcation behavior of system (3). Thus, it will not appear in Table 1.

The case of nontransversal degenerate topological $\mathbb{Z}_2$-codimension-two Hopf bifurcations we are

Fig. 7. Qualitative picture (for $\alpha$ and $\kappa$ constant) of the curves of saddle-node bifurcations of periodic orbits, $SN_1$ and $SN_2$, emerging from the codimension-two Hopf bifurcation points, $H_1$ and $H_1'$, for (a) $s_0$ greater than and (b) $s_0$ less than the critical value of the parameter $s_0$ at which the codimension-three Hopf bifurcation point DH occurs. The cusp of periodic orbits point Cu appears for $a_2$ (coefficient of the normal form of the Hopf bifurcation evaluated at the parameter values of $H_1'$) positive.

interested arises when both of the following situations hold: the coefficient $a_1$ given in (12) vanishes and the transversality condition given in (8) fails. Such degeneracy, called HT, can be studied in the context of the singularity theory developed by Golubitsky and Schaeffer [1985]. In [Hassard & Jiang, 1993], this degenerate point is numerically located in the $(a_0, s_0, \rho)$-parameter space as well as in the $(s, a)$-phase plane for the values $\alpha = 0.2$ and $\kappa = 0.1$.

The analytical information we have about the enzyme system allows us to characterize the manifold of points HT in the $(a_0, s_0, \rho, \alpha, \kappa)$-parameter space.

If we fix arbitrary values of the parameters $\alpha$ and $\kappa$, we have to find the point HT along the curve, in the $(a_0, s_0, \rho)$-space, of degenerate codimension-two Hopf bifurcation given by the conditions $\sigma(u_*) = 0$ and $a_1(u_*) = 0$ (see paragraph (b) in Proposition 2.2). Thus, we have to obtain a solution of the system

$$\sigma(u_*) = \sigma'(u_*) = a_1(u_*) = 0,$$

with $h'(u_*) < 0$.

In order to obtain this solution, we find $\rho$ from $h(u_*) = 0$, $\rho = \rho(u_*, a_0, s_0)$; this enables us to get a new expression for the function $\sigma$, namely

$$\sigma(u_*, a_0, s_0) = F_1(u_*, s_0, \rho(u_*, a_0, s_0)).$$

Thus, the nontransversality condition $(\partial/\partial u_*)\,\sigma(u_*, a_0, s_0) = 0$ becomes

$$\frac{\partial}{\partial u_*}\sigma(u_*, a_0, s_0) = \frac{\partial F_1}{\partial u_*} + \frac{\partial F_1}{\partial \rho}\frac{\partial \rho}{\partial u_*} = 0. \qquad (13)$$

But

$$\frac{\partial h}{\partial u_*} + \frac{\partial h}{\partial \rho}\frac{\partial \rho}{\partial u_*} = 0$$

and, therefore, (13) is equivalent to

$$\frac{\partial F_1}{\partial u_*}\frac{\partial h}{\partial \rho} - \frac{\partial F_1}{\partial \rho}\frac{\partial h}{\partial u_*} = 0. \qquad (14)$$

This is the new expression for the nontransversality condition.

Equations (14), $\sigma(u_*) = 0$ and $a_1(u_*, a_0, s_0, \rho(u_*)) = 0$ determine a simple equation system whose solutions provide the manifold of points HT. For example, for $\alpha = 0.2$ and $\kappa = 0.1$, the point HT is given by

$$u_* \approx 12.91003, \quad a_0 \approx 330.20156, \quad s_0 \approx 67.70777, \quad \rho \approx 2.30884,$$

and these parameter values are, precisely, those obtained in [Hassard & Jiang, 1993].

A projection of the curve in the $(u_*, a_0, s_0, \alpha)$-space of points HT obtained for $\kappa = 0.1$ appears in Fig. 8.

2.4. Bogdanov-Takens bifurcations

Finally, we consider in the following result the existence of nondegenerate Bogdanov-Takens bifurcations. A Bogdanov-Takens bifurcation arises when the linearization matrix has, for certain critical parameter values, a double-zero eigenvalue (see, e.g. [Guckenheimer & Holmes, 1997; Kuznetsov, 1998]).
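For the point HT reported in Sec. 2.3, a simple numerical sanity check is possible. The sketch below is an illustration rather than the authors' procedure: it evaluates, at the rounded values quoted for $\alpha = 0.2$ and $\kappa = 0.1$, the equilibrium condition $h(u_*) = 0$, the zero-trace condition $F_1(u_*) = 0$ and the nontransversality condition written as $\partial_u F_1\, \partial_\rho h - \partial_\rho F_1\, \partial_u h = 0$. The functions follow Lemma 2.1, the partial derivatives are computed by hand from those definitions, and the function names are hypothetical. The vanishing of $a_1$ is not tested.

```python
ALPHA, KAPPA = 0.2, 0.1
# Rounded values of the point HT quoted in Sec. 2.3.
U, A0, S0, RHO = 12.91003, 330.20156, 67.70777, 2.30884


def p(u):
    return 1.0 + u + KAPPA * u * u


def dp(u):
    return 1.0 + 2.0 * KAPPA * u


def h(u, a0, s0, rho):
    return -rho * ALPHA * a0 * u + (s0 - u) * (ALPHA * p(u) + rho * u)


def F1(u, s0, rho):
    return u * (s0 - u) * dp(u) - (ALPHA * u + s0) * p(u) - rho * u * u


def dF1_du(u, s0, rho):
    # p''(u) = 2*KAPPA
    return ((s0 - 2.0 * u) * dp(u) + u * (s0 - u) * 2.0 * KAPPA
            - ALPHA * p(u) - (ALPHA * u + s0) * dp(u) - 2.0 * rho * u)


def dF1_drho(u):
    return -u * u


def dh_du(u, a0, s0, rho):
    return (-rho * ALPHA * a0 - (ALPHA * p(u) + rho * u)
            + (s0 - u) * (ALPHA * dp(u) + rho))


def dh_drho(u, a0, s0):
    return -ALPHA * a0 * u + (s0 - u) * u


def nontransversality_residual(u, a0, s0, rho):
    """Relative size of the left-hand side of the nontransversality condition."""
    t1 = dF1_du(u, s0, rho) * dh_drho(u, a0, s0)
    t2 = dF1_drho(u) * dh_du(u, a0, s0, rho)
    return abs(t1 - t2) / max(abs(t1), abs(t2))


if __name__ == "__main__":
    print("h  residual:", h(U, A0, S0, RHO))
    print("F1 residual:", F1(U, S0, RHO))
    print("h'(u*):     ", dh_du(U, A0, S0, RHO))
    print("relative nontransversality residual:",
          nontransversality_residual(U, A0, S0, RHO))
```

Because the quoted coordinates are rounded to five decimals, the residuals are small but not zero; they should be compared with the size of the individual terms, which are of order $10^3$.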

Fig. 8. Projection, onto the $(u_*, a_0)$-parameter plane, of the curve of degenerate Hopf bifurcation given by the failure of the transversality condition and the vanishing of the cubic term coefficient of the normal form ($\kappa = 0.1$).

Recall that system (3) has an equilibrium at $(u_*, 0)$. It undergoes a double-zero degeneracy if the trace and the determinant of the linearization matrix (7) are equal to zero, that is,

$$\sigma(u_*) = h'(u_*) = 0. \qquad (15)$$

Initially, we perform the rescaling $t \to t/u_*$ which transforms (3) into

$$\dot u = \frac{1}{u_*}\, u v, \qquad \dot v = \frac{1}{u_*}\,\bigl(v^2 + F_1(u)\, v + F_2(u)\bigr). \qquad (16)$$

For the critical values determined by (15), the linearization matrix of this system at $(u_*, 0)$ is a Jordan block

$$\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.$$

Proposition 2.3. For $\alpha \in (0, 1)$ and $\kappa > 0$, a codimension-two nondegenerate Bogdanov-Takens bifurcation occurs on the parameterized curve given by the rational expressions

$$a_0 = \frac{u_*\, p(u_*)}{\alpha^2\,(u_* p'(u_*) - p(u_*))(1 - \alpha)}, \qquad s_0 = u_* + \frac{u_*\, p(u_*)}{(u_* p'(u_*) - p(u_*))(1 - \alpha)}, \qquad \rho = \frac{\alpha^2\, p(u_*)}{u_*\,(1 - \alpha)}, \qquad (17)$$

for $u_* > 0$ with $F_1'(u_*) \neq 0$ and $h''(u_*) \neq 0$.

Proof. The parameterized curve (17) can be easily obtained from $\sigma(u_*) = 0$ and $h'(u_*) = 0$, for $\alpha \in (0, 1)$ and $\kappa > 0$. To guarantee that the Bogdanov-Takens bifurcation is nondegenerate we have to check that the coefficients of the second-order normal form

$$\dot u = v, \qquad \dot v = a u^2 + b u v, \qquad (18)$$

are nonzero (see, e.g. [Guckenheimer & Holmes, 1997]). The computation of these coefficients can be done using the algorithm developed in [Freire et al., 1991] and then, we obtain the following expressions

$$a = \tfrac{1}{2}\, p(u_*)\, h''(u_*), \qquad b = \frac{1}{u_*}\, F_1'(u_*),$$

that trivially lead to the nondegeneracy conditions stated in the proposition. □

The curve in the $(a_0, s_0, \rho)$-space of Bogdanov-Takens bifurcation stated in the last proposition is the organizing center for three surfaces (see [Guckenheimer & Holmes, 1997]): one of codimension-one Hopf bifurcation, another of saddle-node bifurcation of equilibria, and a third one corresponding to a nondegenerate homoclinic bifurcation.

A projection of this Bogdanov-Takens curve is drawn in Fig. 4 for $\alpha = 0.2$ and $\kappa = 0.1$. On such a curve two degenerate points (E and D) appear. Their study is performed in the next section.

3. Degenerate Bogdanov-Takens Bifurcations

Degenerate codimension-three Bogdanov-Takens bifurcations arise when one of the quadratic coefficients in the Bogdanov-Takens normal form vanishes. Two degenerate cases may appear:

• the first one arises when $a \neq 0$ and $b = 0$ in (18) and it corresponds to a cusp of order three;
• the other case arises when $a = 0$ and $b \neq 0$ in (18) and it corresponds to a weak focus.

The existence of these degenerate Bogdanov-Takens bifurcations in the enzyme system is proved in the following subsections.
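Before turning to the degenerate cases, the Bogdanov-Takens curve (17) of Proposition 2.3 can be checked numerically. The following sketch, with hypothetical function names, evaluates (17) at a sample $u_*$ and verifies that the equilibrium, zero-determinant and zero-trace conditions $h = h' = F_1 = 0$ hold; it also evaluates the normal form coefficients $a = \tfrac12 p(u_*) h''(u_*)$ and $b = F_1'(u_*)/u_*$ from the proof. The functions follow Lemma 2.1, and the sample value $u_* = 8.5$ is an arbitrary illustration.

```python
def p(u, kappa):
    return 1.0 + u + kappa * u * u


def dp(u, kappa):
    return 1.0 + 2.0 * kappa * u


def bt_curve(u, alpha, kappa):
    """Parameters (a0, s0, rho) on the Bogdanov-Takens curve (17)."""
    q = u * dp(u, kappa) - p(u, kappa)  # u p'(u) - p(u) = kappa*u^2 - 1
    a0 = u * p(u, kappa) / (alpha ** 2 * q * (1.0 - alpha))
    s0 = u + u * p(u, kappa) / (q * (1.0 - alpha))
    rho = alpha ** 2 * p(u, kappa) / (u * (1.0 - alpha))
    return a0, s0, rho


def h(u, a0, s0, rho, alpha, kappa):
    return -rho * alpha * a0 * u + (s0 - u) * (alpha * p(u, kappa) + rho * u)


def dh(u, a0, s0, rho, alpha, kappa):
    return (-rho * alpha * a0 - (alpha * p(u, kappa) + rho * u)
            + (s0 - u) * (alpha * dp(u, kappa) + rho))


def d2h(u, s0, rho, alpha, kappa):
    # p''(u) = 2*kappa
    return -2.0 * (alpha * dp(u, kappa) + rho) + (s0 - u) * alpha * 2.0 * kappa


def F1(u, s0, rho, alpha, kappa):
    return (u * (s0 - u) * dp(u, kappa)
            - (alpha * u + s0) * p(u, kappa) - rho * u * u)


def dF1(u, s0, rho, alpha, kappa):
    return ((s0 - 2.0 * u) * dp(u, kappa) + u * (s0 - u) * 2.0 * kappa
            - alpha * p(u, kappa) - (alpha * u + s0) * dp(u, kappa)
            - 2.0 * rho * u)


def bt_coefficients(u, alpha, kappa):
    """Coefficients a, b of the second-order normal form (18)."""
    a0, s0, rho = bt_curve(u, alpha, kappa)
    a = 0.5 * p(u, kappa) * d2h(u, s0, rho, alpha, kappa)
    b = dF1(u, s0, rho, alpha, kappa) / u
    return a, b


if __name__ == "__main__":
    u, alpha, kappa = 8.5, 0.2, 0.1
    a0, s0, rho = bt_curve(u, alpha, kappa)
    print("a0, s0, rho =", a0, s0, rho)
    print("h, h', F1   =", h(u, a0, s0, rho, alpha, kappa),
          dh(u, a0, s0, rho, alpha, kappa), F1(u, s0, rho, alpha, kappa))
    print("a, b        =", bt_coefficients(u, alpha, kappa))
```

For $\alpha = 0.2$, $\kappa = 0.1$ and $u_* = 8.5$ the curve gives $s_0$ close to 37, which is the slice shown in Fig. 1. A nonzero pair $(a, b)$ indicates a nondegenerate point; zeros of $b$ and $a$ along the curve correspond to the points E and D, respectively.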

3.1. Cusp of order three case + 2 + V 9 « 2 ^ - 26««2 _ 8u* + l [-32/C4ul


Theorem 3.1. For each K > 0 there is, in the (a®,
so, p, a)-parameter space, a curve of cusps of order - 34re3?4 + (44K - 8 ) « 2 M 5 + 46K2U$
three given by the rational expression (17) and + (8 - 24K)KU\ - 30ra 2 - 4(1 + «)«* + 2]
(21)
a = -(V9-4Q(u*)-l), where
(19) The expression given in (19) corresponds to a
2p(u.k curve of cusp of order three assuming that 64 7^ 0.
QM = (u±p'(u*) - p(u*))2'
foru* > 0, with /I"(M*) ^ 0 and b'4 ^ 0 (b'4 is defined
in {21)). The curve of cusp of order three points for
K = 0.1, projected onto the (u+, a)-plane, appears
in Fig. 5.
Proof. If F{(u*) = 0 and h"(u*) ^ 0 then a ^ 0
and b = 0 in (18). A point verifying this kind
of degeneracy will be labeled E in the sequel. We Remark. The vanishing of 64 provides the even-
obtain its parameter values solving the system tual cusps of order greater than three.
Dumortier et al. [1987] stated the unfolding of a
F1 («,) = F[(u*) = hfa) = ti(u+) = 0. (20) cusp of order three. The intersection of the unfold-
ing of the point E with a half-sphere with center
From the first and second equations in (20) in the parameter space point corresponding to the
we get cusp presents the following bifurcation phenomena:

(u*p'(u*) -p(u*))u*(a + 2) codimension-one:


5 0 = W* +
1. subcritical and supercritical Hopf (H su b and
Comparing this value of so with the correspond- HSUper, respectively);
ing one given in (17), we obtain the following poly- 2. saddle-node of equilibria (sn);
nomial in u+ 3. saddle-node of periodic orbits (SN);
4. left homoclinic orbit (HL);
[2 - a(l + a)}n2ui - 2[3 - a ( l + <X)]KUI
codimension-two:
— 2u* — a ( l + a) = 0,
1. degenerate Hopf (Hi);
that leads to the required expression. 2. Bogdanov-Takens (BT);
To assure codimension-three of the points E, 3. left homoclinic orbit with zero trace (H^).
we need to compute a fourth-order normal form (see
[Dumortier et al, 1987]). This can be done using the (Description of the different homoclinic orbits
ideas of [Algaba et ai, 2003]. Then we obtain the listed above appears at the end of Sec. 3.2.)
following fourth-order normal form under smooth
orbital equivalence:
3.2. Weak focus case
u = v, Theorem 3.2. For K > 0, in the (ao,so, p,a)-
v = a2ix2 + b'4v?v, parameter space, there exists a curve of degenerate
Bogdanov-Takens bifurcations corresponding to the
where
vanishing of the term u2 coefficient in the normal
liff'(tt*) + 3 i f f K ) form. This curve is parameterized by the rational
°2 = -p{u*)ti'(u*), a3 expression (17) and
6 ul

64 = 96«;5u^ + 102/« 4 ^ + (24 - 268K)S~H


3
U. 1 P(«*) (22)
a (u*j/(u*) -p(u*))2
- 3 1 6 K 3 ^ + ( 1 9 6 K - 8 8 ) K 2 * 4 + ( 2 6 4 K - 4)KU:
foru* > 0, with F[(u+) ^ 0 and 63 ^ 0 (63 is defined
+ (76 - 20K)KUI + (4-52K)UI - (4K + 12)u, in (24)).

Moreover, this Bogdanov-Takens degeneracy A point D can be classified into three topologi-
corresponds to a weak focus, for each value of K > 0 cally different types, depending on the values of the
and M* £ (0, +oo). For each K > 0, there exist 0, 2, coefficients of t h e system (23): saddle (if as > 0),
4 or 6 points on the curve given in (22) where the focus (if as < 0 and b2 + 803 < 0) a n d elliptic (if
foci change their stability. These points correspond 03 < 0 and b\ + 803 > 0). We are interested t o know
to a codimension-four degeneracy given by the van- what type of point D does t h e system (16) have.
ishing of both coefficients of the terms v? and u2v W i t h this aim, let us define
of the Bogdanov-Takens normal form.
d(n, M*) = b\ + 8a 3
Proof. If h"(u*) = 0 and i^(u*) ^ 0, then a = 0, \/ K2U\ — 3KM* — 1 — 2 V / 2M*K£>(M*)
b y£ 0 in t h e second-order normal form (18). We
V/M*K^(M*)
can obtain t h e parameter values of a point verify-
ing this kind of degeneracy, labeled D in the sequel, If a tends t o zero in t h e expression given in
by solving t h e equation system (22), it is easy t o see t h a t t h e abscissa M* of t h e
equilibrium undergoing t h e degeneracy D tends t o
Fi(u*) = h{u*) = h ' K ) = h"(u*) = 0. a value u% which is a root of the polynomial -P(M*) =
K 2 M 3 — 3KM* — 1. Since P ' ( M * ) = 3K(KM 2 — 1) > 0
Substituting the values of oo, so a n d P given in
for all M* > 1/y/K, a n d P ( 1 / A / K ) < 0, it follows
(17) into the fourth equation, we obtain t h e follow-
ing polynomial in M* t h a t u® is the unique root of P ( M * ) in (1/^/K, + 0 0 ) .
Thus, t h e curve of points D is solely defined for
ulp{u*)p"'(it*) - 2(M*P'(M*) - p(u+)) M* € (M°, +00) and, moreover, P ( M * ) > 0 for all
M* > u®. Therefore, as < 0 for all M* £ ( M $ , + C O )
x [(1 - a)u+p'(u±) + ap(Ui,)} = 0,
and for all K > 0.
t h a t leads t o t h e expression stated in t h e theorem. T h e proof of t h e following statements is
To guarantee codimension-three of t h e points direct:
D we need t o compute a fourth-order normal form (a) lim^-^+oo d(K, M*) = 1 — 2\/2~, independently
(see [Dumortier et al, 1991]). As in t h e previous of «;
theorem, this can be done following [Algaba et al, (b) d(/c, u®) = —2\/2, independently of K;
2003]. In this manner we get t h e normal form (c) The function d(n, M*) is increasing in (u®, +00),
for each value of K > 0, since

{ it = v,
• ,

v = b2uv + a%u + b3u v,


3 ,, 2 (23)
-—d(«,M*)
OM*
where
_ 1 K 2 M 4 + 8 K 2 M 3 + 6KM 2 + 2M* + 1
2
_ Fl(ui,) _ K U\ - 3KM* - 1
u% V K P ( M * ) 2 y/n2ul - 3KM* - 1
M* u*p'(u*) - p(u*Y
in (M°,+OO).
«3 = - p ( u * ) / i " ' ( i t * )
A sketch of t h e function d(ft, M*) appears in Fig. 9.
M*P(M*)K(K M 2 3
— 3 K M * — 1) Therefore, d(n, M*) < 0 for each value of K > 0
and for each value of w* £ (M°, +00). Thus, t h e point
(M*P'(M*)-J?(M*))2
D is always of focus type.
r T h e stability of t h e focus is given b y t h e sign
y = _I ±A (24) of b'3, t h a t is, t h e sign of T(M*) given in (25).
3
5u,pK)MK)-pK)) ' l '
2
Since r ( 0 ) = 1, T(M*) has, at least, one nega-
where tive root and, therefore, it has, at most, six posi-
tive roots. These positive roots correspond t o a
T~(M*) = 15K 4 M^ + 18K 3 M* + 18KM* + 1
change of stability of t h e focus (note t h a t t h e degen-
-M*[4K3^ + 37K2M3 eracy condition is 63 = 0) and they provide t h e
degenerate codimension-four points stated in t h e
+ ( 4 9 K 2 + 3K)M 2 + 2KM* + 2]. (25) theorem. •

Fig. 9. Qualitative sketch of $d(\kappa, u_*)$ versus $u_*$.

The curve of points D for $\kappa = 0.1$, projected onto the $(u_*, \alpha)$-plane, appears in Fig. 5.

Dumortier et al. [1991] stated the unfolding of a point D in the focus case. The intersection of the unfolding of the point D in the focus case with a sphere with center in the parameter space point corresponding to the singular point presents the following bifurcation phenomena:

codimension-one:

1. subcritical and supercritical Hopf ($H_{sub}$ and $H_{super}$, respectively);
2. saddle-node of equilibria (sn);
3. saddle-node of periodic orbits (SN);
4. left homoclinic orbit ($H_L$);
5. right homoclinic orbit ($H_R$);
6. lower concave homoclinic orbit ($H_{LC}$);
7. right central saddle-node homoclinic orbit ($CSNH_R$);
8. left central saddle-node homoclinic orbit ($CSNH_L$);

codimension-two:

1. degenerate Hopf ($H_1$);
2. cusp of equilibria (C);
3. Bogdanov-Takens (BT);
4. lower concave homoclinic orbit with zero trace ($H^0_{LC}$);
5. right saddle-node homoclinic orbit ($SNH_R$);
6. left saddle-node homoclinic orbit ($SNH_L$);
7. right lower concave saddle-node homoclinic orbit ($SNH_{RC}$);
8. left lower concave saddle-node homoclinic orbit ($SNH_{LC}$).

The equilibrium with smallest (largest) abscissa is called the left (right) equilibrium; the third equilibrium (that is always a saddle) located between the other two is called the intermediate equilibrium. So, left (right) homoclinic orbit means a homoclinic connection surrounding the left (right) equilibrium. A homoclinic orbit connecting from below (above) the intermediate equilibrium to itself and surrounding the left and right equilibria is called a lower (upper) concave homoclinic orbit.

The central saddle-node homoclinic orbit occurs when the isolated center manifold of the saddle-node point returns to it through the interior of the nodal sector; the noncentral left (right) saddle-node homoclinic orbit occurs when the isolated center manifold returns to the saddle-node point through one of the hyperbolic separatrices, surrounding the left (right) equilibrium. If the isolated center manifold returns to the saddle-node point from below (above) through one of the hyperbolic separatrices enclosing the nodal sector and the hyperbolic equilibrium is placed at the left of the nonhyperbolic one, the homoclinic connection is called a left lower (upper) concave saddle-node homoclinic orbit; if the hyperbolic equilibrium is placed at the right of the nonhyperbolic one, the homoclinic connection is called a right lower (upper) concave saddle-node homoclinic orbit. Several of the above homoclinic connections are sketched in Fig. 10.

3.3. Additional comments

The codimension-three degenerate Bogdanov-Takens bifurcations considered in this section have also been studied by [Medved, 1985; Guckenheimer, 1986a]. Moreover, the bifurcation diagrams for both types of the degenerate Bogdanov-Takens bifurcation have been described in [Bazykin et al., 1989; Berezovskaya & Khibnik, 1985]. The latter publication contains the complete analysis leading to the bifurcation set in a neighborhood of the point E.

Points E and D appear in Fig. 4, for $\alpha = 0.2$ and $\kappa = 0.1$. Obviously, these two points are located on the Bogdanov-Takens curve BT. The curve of degenerate Hopf bifurcations ($H_1$) emerges from D whereas the other degenerate Hopf bifurcations ($H_1'$) occur in a locus arising from E. Moreover, the curve of cusp bifurcations of equilibria (C) goes across D.

We end our study of degenerate Bogdanov-Takens bifurcations with two remarks. The first

Fig. 10. Qualitative picture of: a left homoclinic orbit, $H_L$; a right homoclinic orbit, $H_R$; a lower concave homoclinic orbit, $H_{LC}$; a right central saddle-node homoclinic orbit, $CSNH_R$; a right saddle-node homoclinic orbit, $SNH_R$; a right lower concave saddle-node homoclinic orbit, $SNH_R^{LC}$. Focus, saddle and nonhyperbolic saddle-node equilibria are represented, respectively, by a filled point, a cross and an empty square.

one is that both quadratic coefficients of the Bogdanov-Takens normal form cannot vanish simultaneously. From (19) and (22) such a situation would lead to

P{u*
l(V^ 4Q(u*) - 1) = 1
{u*p'{u* •?K))!

that can occur if, and only if, $\alpha = 0$, that has no biochemical meaning.
A case where both quadratic coefficients of the Bogdanov-Takens normal form vanish appears in [Dangelmayr & Guckenheimer, 1987].
The second remark refers to the limit case $\alpha = 1$. It is easy to verify that the curves of points E and D obtained in Theorems 3.1 and 3.2, respectively, tend to $\alpha = 1$ as the parameter $u^*$ tends to infinity. An evaluation of the curve of degenerate Hopf points $H_2$ provides an analogous result (see Fig. 5). This fact is justified by the following reasonings.
If we consider the system (1) for $\alpha = 1$,

$$\dot s = s_0 - s - \rho\,\frac{s a}{1 + s + \kappa s^2}, \qquad \dot a = a_0 - a - \rho\,\frac{s a}{1 + s + \kappa s^2}, \tag{26}$$

and we make the change of variables $w = s - a$, this new variable verifies the differential equation

$$\dot w = w_0 - w, \tag{27}$$

where $w_0 = s_0 - a_0$.
The equilibrium of Eq. (27), $w = w_0$, corresponds to the straight line $s - a = s_0 - a_0$, that is invariant for the flow of (26). Thus, the equilibria of system (26) lie on the line $s - a = s_0 - a_0$ and, therefore, the existence of limit cycles, homoclinic connections and equilibria of focus type is impossible. We remark that, in spite of the richness of the dynamic and bifurcation behavior of the system (1) for all values of parameter $\alpha$ arbitrarily close to the critical value $\alpha = 1$, there are no more limit sets than equilibria for the value $\alpha = 1$, and therefore the only bifurcations that remain are saddle-nodes and cusps of equilibria.
In short, the limit sets that exist for $\alpha < 1$ suffer a stretching when $\alpha$ tends to 1 and they disappear for $\alpha = 1$ (see Fig. 5).

4. Homoclinic Orbits and Their Numerical Continuation
When dealing with parameterized systems of autonomous ordinary differential equations, the presence of a homoclinic orbit (that is, a trajectory which is bi-asymptotic to the same stationary point in both forwards and backwards time) may reveal the existence of other bifurcations (see e.g. [Wiggins, 2003]).
In autonomous planar systems, under certain nondegeneracy conditions, a homoclinic bifurcation simply creates or destroys a single periodic orbit. Roughly speaking, it is a bifurcation of a periodic

orbit to "infinite period". Nevertheless, the presence of degenerate homoclinic connections will lead to the existence of several bifurcations where more than one periodic orbit is involved (for example, saddle-node and cusp of saddle-node bifurcations). Then, it will be of importance to combine analytical tools with numerical methods that detect and continue degenerate homoclinic connections, since they act as important organizing centers in the dynamical behavior of systems.
The techniques to study homoclinic orbits in planar vector fields were well developed by the 1920's in the works of Dulac. The fundamental idea is that the recurrent behavior near a connecting orbit should be studied in a fashion similar to that used in studying periodic orbits via a Poincaré return map. But there are some additional complications in the study of homoclinic orbits compared to that of periodic orbits which significantly complicate the analysis (see e.g. [Guckenheimer & Worfolk, 1993]).
There are many types of codimension-two bifurcations of connecting orbits. Failure of one of the conditions that characterize a generic homoclinic orbit will lead to a degenerate bifurcation. This can occur, in planar systems, for eigenvalues degeneracies and for multiple connecting orbits [Guckenheimer & Worfolk, 1993]. In the system we consider, as we are going to see, two kinds of eigenvalue degeneracies may occur. The first one appears for homoclinic orbits to nonhyperbolic equilibria (a zero eigenvalue). The second one is present when a nonresonant condition is violated (zero trace). On the other hand, a double homoclinic connection appears for certain values of the parameters. Furthermore, the presence of two kinds of codimension-three homoclinic orbits will be pointed out.
Among the numerical continuation methods proposed in the literature there exist basically two groups: boundary-value and shooting methods. The boundary-value methods truncate the homoclinic problem to a finite time interval and impose certain boundary conditions at the end points of that interval (see e.g. [Beyn, 1990; Friedman & Doedel, 1993]). The second technique uses shooting, that is, the numerical integration of orbits in the stable and unstable manifolds of the equilibrium and the computation of a distance between them (see e.g. [Rodriguez-Luis et al., 1990]).
The above methods detect the homoclinic orbit and provide the curve, in a parameter plane, where the global codimension-one bifurcation occurs. As global bifurcations may exhibit degeneracies, numerical techniques are also needed for these higher codimension situations. In this direction, [Champneys & Kuznetsov, 1994, 1996], have developed a continuation code for several cases of codimension-two homoclinic bifurcations. This code, called HomCont, has been included in the AUTO97 continuation and bifurcation software [Doedel et al., 1998].
In the following we briefly describe the numerical shooting methods used along this work for the continuation of homoclinic orbits. Moreover, we will give information about some theoretical results on the homoclinic orbits that appear in this system. In particular, in Sec. 4.3, we will emphasize on cuspidal loops, a codimension-three homoclinic connection.
First, in Sec. 4.1, we deal with nondegenerate homoclinic connections and, in Sec. 4.2, we consider the cases of degenerate homoclinic orbits. The basic idea of the numerical method we use is to establish a correspondence, under the adequate hypothesis, between the homoclinic connections and the zeros of a certain function. Then, the continuation of homoclinic connections will be equivalent to the continuation of the zeros of such a function.

4.1. Continuation of nondegenerate homoclinic connections
We consider the one-parameter autonomous planar system

$$\dot x = X(x, \mu), \qquad x = (x_1, x_2) \in \mathbb{R}^2, \quad \mu \in I \subset \mathbb{R},$$

where $X \in C^\infty(\mathbb{R}^2 \times I; \mathbb{R}^2)$ is a family of vector fields and $I$ is some neighborhood of $\mu_0 \in \mathbb{R}$, for which value a homoclinic orbit occurs.
Suppose the origin, $x = 0$, is a hyperbolic equilibrium, $X(0, \mu) = 0$ for all $\mu \in I$, of saddle-type. Without loss of generality we may suppose that the linearization matrix has the form

$$D_x X(0, \mu) = \begin{pmatrix} -\lambda_1(\mu) & 0 \\ 0 & \lambda_2(\mu) \end{pmatrix},$$

where $\lambda_1(\mu)$ and $\lambda_2(\mu)$ are positive scalars, for $\mu \in I$.
Under the adequate hypothesis [Freire et al., 1999b], the existence of a nondegenerate homoclinic connection corresponds to a regular zero of a certain scalar function $G_1(\mu)$ that measures, on an

adequate transversal section, the distance between the stable and the unstable manifolds of the equilibrium. If now the system is bi-parametric, $\mu \in \mathbb{R}^2$, under the adequate hypothesis, a curve of nondegenerate homoclinic orbits may be continued solving $G_1(\mu) = 0$, that is, the continuation in the parameter plane of the homoclinic connections locus is a problem equivalent to tracing zeros of a function of one component and two independent variables. If there are more parameters, $\mu \in \mathbb{R}^m$, $m > 2$, to continue curves of degenerate homoclinic bifurcations it is enough to add the appropriate test functions defining the degeneracy in question.

4.2. Continuation of degenerate homoclinic connections
For $\mu \in \mathbb{R}^2$, to detect a codimension-two point along the homoclinic curve we monitor a test function $\Psi_1$ (for instance, the test function that detects the vanishing of the trace is simply $\Psi_1(\mu) = -\lambda_1(\mu) + \lambda_2(\mu)$). If $\Psi_1$ changes sign we first accurately locate its zero. We can then continue numerically the curve of codimension-two homoclinic orbits in three parameters by restarting from the detected zero of $\Psi_1$, freeing an additional parameter ($\mu \in \mathbb{R}^3$) and appending the extra algebraic constraint $\Psi_1(\mu) = 0$ to $G_1(\mu) = 0$. It is clear that this strategy may be applied to compute curves of codimension-three points as four parameters are allowed to vary and to continue curves of codimension-four points when $\mu \in \mathbb{R}^5$. Obviously, some extra transversality assumption has to be satisfied to guarantee that we have a regular zero of the corresponding test functions (see details of this detection and continuation strategy in [Champneys & Kuznetsov, 1994]).
A first eigenvalue degeneracy (codimension-two) appears when the homoclinic orbit connects an hyperbolic equilibrium point with zero trace, the so-called neutral resonant saddle case. This situation was studied completely by [Nozdrachova, 1982] for two-dimensional systems. A curve of fold (saddle-node) bifurcations of periodic orbits emerges from this codimension-two bifurcation point.
In the bi-parametric case $\mu = (\mu_1, \mu_2)$, the curve of homoclinic connections is defined by $(\mu_1(s), \mu_2(s))$, where $s$ adequately parameterizes such a curve. To guarantee a regular zero of the test function $\Psi_1(\mu(s))$, at $s = s_0$ say, we have to add the extra transversality assumption $d\Psi_1/ds|_{s = s_0} \neq 0$.
In the zero-trace case, the stability of the homoclinic orbit is determined by the integral of the divergence of the vector field along the homoclinic orbit. In fact, the homoclinic orbit $\Gamma$ is asymptotically stable (resp. unstable) if, and only if, $\int_\gamma \operatorname{div} X < 0$ (resp. $\int_\gamma \operatorname{div} X > 0$), where $\gamma(t)$, $t \in (-\infty, \infty)$, parameterizes $\Gamma$. Then, an additional degeneracy (codimension-three) appears when $\int_\gamma \operatorname{div} X = 0$. As it is easier to compute the exponential of the integral of the divergence EID [Freire et al., 1999b],

$$\mathrm{EID} \overset{\mathrm{def}}{=} e^{\int_\gamma \operatorname{div} X}, \tag{28}$$

this codimension-three homoclinic singularity has simultaneously zero trace and EID = 1. Its numerical continuation will be done using the test functions $\Psi_1 = -\lambda_1 + \lambda_2$ and $\Psi_2 = \mathrm{EID} - 1$, with the corresponding transversality assumptions to guarantee the regularity of such zeros. From this codimension-three point a curve of cusps of saddle-node of periodic orbits will emerge.
The stability of this codimension-three homoclinic orbit is governed by a new resonant local coefficient RES which may be computed taking advantage of the duality between the Hopf bifurcation (and its degeneracies) and the homoclinic bifurcation (and its degeneracies in the case of zero trace). (See details in [Freire et al., 1999b; Joyal, 1988].) The first result used to look for an expression of RES is the following:

Proposition 4.1. Let $u_0$ be a hyperbolic saddle point of the planar system

$$\dot u = X(u), \qquad u = (x, y) \in \mathbb{R}^2, \tag{29}$$

with $\operatorname{div} X(u_0) = 0$. Under these conditions, system (29) is $C^\infty$ orbitally equivalent to

| v = -y + E «*+!*V+1 + o{\x, y\2n+s).
I k=l

System (29) can be written as

[yj [l 0/ [yj {g(x,y))'

where $f(x, y), g(x, y) = O(|x, y|^2)$, and we have assumed that $u_0 = 0$ is a hyperbolic equilibrium of (29).

It is then possible to obtain that RES is given by

R E o = [jyygyy + Jyyjxy + 9yy9xy ~~ Jxyjxx
~ 9xy9xx Jxx9xx 9yyy
~ Jxyy i~ 9xxy T Jxxx)/^- \"^)

When this coefficient RES vanishes, a codimension-four homoclinic singularity appears. In this situation (we have numerically checked that this does not occur in the enzyme system), a curve of swallowtail singularities of periodic orbits will emerge in a four-parameter space from such a codimension-four point.
A second eigenvalue degeneracy (codimension two) appears when the homoclinic curve (in a bi-parametric space) reaches a curve of saddle-node bifurcations of equilibria. This situation, known as the saddle-node separatrix-loop bifurcation, was analyzed by [Schecter, 1987]. He showed that the homoclinic curve meets the fold curve with a quadratic tangency.
When the curve of nondegenerate homoclinic connections is approaching the fold curve, it is better to take, in the numerical method, the abscissa of the equilibrium as continuation parameter [Freire et al., 2000]. This is the way to detect such a codimension-two point.
To continue the curve of these degenerate homoclinic connections in a three-parameter space we have to adapt our strategy to the presence of a nonhyperbolic equilibrium. We take a linear approximation for the hyperbolic manifold (stable or unstable) and a quadratic approximation for the center manifold. In this way we have a continuation problem with a function of three variables (the three parameters) and two components (the first one is the distance, on a transversal section, between the orbits integrated from the approximations of the center and the hyperbolic manifolds; the second one is the condition of saddle-node bifurcation of equilibria).
On the other hand, a double homoclinic connection (degeneracy for multiple connecting orbits) is easily detected looking at the crossing of two homoclinic curves, one corresponding to left homoclinic orbits, $H_L$, and the other one to right homoclinic orbits, $H_R$. Its continuation in a three-parameter space is performed looking at the zeros of a two-component function (each component corresponds to the condition of existence of one homoclinic connection). Other curves of homoclinic orbits (namely, of lower concave, $H_{LC}$; and of upper concave, $H_{UC}$) emerge from such a double homoclinic point, HH. An example of this situation appears, for instance, in [Freire et al., 1996].

4.3. Cuspidal loops
In this subsection we will summarize some results about cuspidal loops and the numerical method for the continuation of these planar codimension-three homoclinic orbits [Freire et al., 2000]. A cuspidal loop occurs when the separatrices of an equilibrium of cusp type intersect and a cusp point is a nonhyperbolic equilibrium with a double-zero eigenvalue (Bogdanov-Takens bifurcation).
Let $X$ be a planar vector field, $X \in C^\infty(\mathbb{R}^2)$, and

$$\dot x = X_1(x, y), \qquad \dot y = X_2(x, y), \tag{31}$$

a dynamical system, with an equilibrium at the origin of cusp type, that is, the equilibrium has stable and unstable local separatrices forming a cusp. It is well known (see e.g. [Guckenheimer & Holmes, 1997]) that system (31) is $C^\infty$ orbitally equivalent to a system in the form

$$\dot x = y + O(|x, y|^{k+1}), \qquad \dot y = \sum_{j=2}^{k} \left( a_j x^j + b_j x^{j-1} y \right) + O(|x, y|^{k+1}). \tag{32}$$

When $a_2 \neq 0$, the topological type of (32) is determined by the truncated second-order system

$$\dot x = y, \qquad \dot y = a x^2 + b x y, \tag{33}$$

(where $a = a_2$ and $b = b_2$). We assume that the separatrices intersect forming a cuspidal loop.
We also assume that the cuspidal loop is parameterized by the function $\gamma(t)$, for $t \in (-\infty, +\infty)$. The stability of the cuspidal loop is governed by the integral of the divergence of the vector field along the homoclinic loop, $\int_\gamma \operatorname{div} X$, in the case that this quantity does not vanish, as is stated in [Dumortier et al., 1997] (the cuspidal loop is an attracting singular cycle if $\int_\gamma \operatorname{div} X < 0$, and it is a repelling one if $\int_\gamma \operatorname{div} X > 0$).
To carry out the analysis of the local stability of a cuspidal loop and to establish the different

unfoldings of such a singularity, two local transversal sections to the loop are taken and three maps are considered [Freire et al., 1999a]:

• The Dulac map D, that provides local information of the behavior in the vicinity of the cusp equilibrium point.
• The regular transition map along the homoclinic orbit, R.
• The Poincaré map, P, given by the composition of R and D.

Two cases appear, depending on the signs of $a$ and $b$:

1. $a > 0$, $b > 0$ (topologically equivalent to the case $a < 0$, $b < 0$). The slope of D, in this case, is greater than 1, that indicates that the local behavior, in the vicinity of the cusp point, is asymptotically unstable. The global behavior, along the regular arc of the homoclinic orbit, and, therefore, the stability of the cuspidal loop, is given by the sign of $\omega$ (the slope of R is $1 + \omega$):
(i) If $\omega < 0$ the cuspidal loop is an attracting cycle;
(ii) if $\omega > 0$ the cuspidal loop is a repelling cycle.
The limit situation, $\omega = 0$, corresponds, therefore, to a codimension-four degeneracy, since the stability of the homoclinic orbit has changed.
The case (ii) corresponds to the simplest type of cuspidal loop. In this case the homoclinic orbit that rises from the Bogdanov-Takens point and the cuspidal loop have the same stability (this kind of cuspidal loop occurs in the enzyme system). The case (i) is more complex, since both stabilities are now opposite, producing, in the corresponding unfolding, a much richer dynamical behavior (an example of this kind of cuspidal loop appears in the continuous flow stirred tank reactor, CSTR [Guckenheimer, 1986b]). In Fig. 11 the unfolding of the first type of (unstable) cuspidal loop is shown [Dumortier et al., 1997]. This figure has been obtained intersecting the unfolding of the cuspidal loop with a sphere with center in the parameter space point corresponding to the cuspidal loop. Numbers in this figure make reference to the different phase portraits displayed at the bottom of it. In these phase portraits, the solid (dotted) line represents a stable (unstable) periodic orbit; the symbols •, ○ and + mean stable, unstable and saddle equilibrium, respectively.
The bifurcation phenomena that appear in the vicinity of a cuspidal loop can be classified by their codimension:

codimension-one:
(a) (subcritical) Hopf ($H_{sub}$);
(b) saddle-node of equilibria ($sn_L$);
(c) saddle-node of periodic orbits (SN);
(d) left homoclinic orbit ($H_L$);
(e) right homoclinic orbit ($H_R$);
(f) lower concave homoclinic orbit ($H_{LC}$);
(g) upper concave homoclinic orbit ($H_{UC}$);
(h) right central saddle-node homoclinic orbit ($CSNH_R$);

codimension-two:
(a) cusp of periodic orbits (Cu);
(b) Bogdanov-Takens ($BT_L$);
(c) right homoclinic orbit with zero trace ($H_R^0$);
(d) lower concave homoclinic orbit with zero trace ($H_{LC}^0$);
(e) double homoclinic orbit (HH);
(f) right saddle-node homoclinic orbit ($SNH_R$);
(g) right lower concave saddle-node homoclinic orbit ($SNH_R^{LC}$);
(h) right upper concave saddle-node homoclinic orbit ($SNH_R^{UC}$).

Recall the description of the different homoclinic orbits given at the end of Sec. 3.2. Moreover, all the above homoclinic connections are sketched in Fig. 23.
The subscript L (left) in $sn_L$ indicates that in the saddle-node bifurcation collapse the middle and the left equilibria of the system and $BT_L$ means that the bifurcation is exhibited by the nonhyperbolic left equilibrium.

2. $a > 0$, $b < 0$ (topologically equivalent to the case $a < 0$, $b > 0$). Similarly, $p < 1$ is obtained ($p$ is the slope of D), which means asymptotically stable local behavior, and the following results can be summarized:
(1) $\omega < 0$: asymptotically stable global behavior; the cuspidal loop is asymptotically stable. Simple case.
(2) $\omega > 0$: asymptotically unstable global behavior; the cuspidal loop is asymptotically stable. Complex case.
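The case analysis of this subsection (local behavior from the signs of a and b, global behavior from the sign of ω, and the simple/complex distinction according to whether both agree) can be summarized in a few lines of code. The following is only an illustrative sketch: the function and class names are hypothetical, and it encodes nothing beyond the four cases listed in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CuspidalLoopCase:
    local_behavior: str   # near the cusp point (Dulac map D)
    global_behavior: str  # along the regular arc (transition map R)
    kind: str             # "simple" if both agree, "complex" otherwise


def classify_cuspidal_loop(a: float, b: float, omega: float) -> CuspidalLoopCase:
    """Classify a cuspidal loop from the signs of a, b in (33) and of omega."""
    if a == 0 or b == 0 or omega == 0:
        raise ValueError("degenerate input: a, b and omega must be nonzero")
    # a*b > 0: slope of D greater than 1 (locally unstable);
    # a*b < 0: slope of D smaller than 1 (locally stable).
    local = "unstable" if a * b > 0 else "stable"
    # The slope of R is 1 + omega: omega < 0 contracts, omega > 0 expands.
    glob = "stable" if omega < 0 else "unstable"
    kind = "simple" if local == glob else "complex"
    return CuspidalLoopCase(local, glob, kind)


if __name__ == "__main__":
    for args in [(1, 1, 0.5), (1, 1, -0.5), (1, -1, -0.5), (1, -1, 0.5)]:
        print(args, classify_cuspidal_loop(*args))
```

With this encoding, the case reported for the enzyme system (a and b of equal sign, ω > 0) is classified as simple, and the CSTR example (equal signs, ω < 0) as complex.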

Fig. 11. Unfolding of the simplest type of (unstable) cuspidal loop. The different phase portraits appearing in the unfolding of this type of cuspidal loop are also sketched. A solid (dotted) line represents a stable (unstable) periodic orbit; the symbols •, ○ and + mean stable, unstable and saddle equilibrium, respectively.

With respect to the numerical continuation of cuspidal loops, the separatrices of the nonhyperbolic equilibrium can be approximated by means of the semi-cubic $y^2 = a x^3$ (if the cusp point is at the origin). This is deduced from (33), as in the vicinity of the origin (cusp point) the separatrices have horizontal tangent and then the term $bxy$ is negligible, in first approximation, with respect to the $ax^2$ term [Freire et al., 2000].
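The leading-order shape of the separatrices can be checked with a short computation. The following is a sketch under the stated assumption that the $bxy$ term of (33) is dropped; the constants $C$ and $c$ are introduced here only for illustration.

```latex
\begin{align*}
\dot x = y,\quad \dot y = a x^{2}
  &\;\Longrightarrow\; \frac{dy}{dx} = \frac{a x^{2}}{y}
   \;\Longrightarrow\; y\,dy = a x^{2}\,dx \\
  &\;\Longrightarrow\; \tfrac12\, y^{2} = \tfrac13\, a x^{3} + C .
\end{align*}
% The separatrices pass through the cusp point (0,0), hence C = 0:
\[
  y^{2} = c\,x^{3}, \qquad c = \tfrac{2}{3}\,a .
\]
```

The separatrices therefore lie, in first approximation, on a semi-cubic whose two branches $y = \pm\sqrt{c}\,x^{3/2}$ are tangent to the $x$-axis at the origin, in agreement with the horizontal tangent mentioned above. With this truncation the coefficient of the semi-cubic is a fixed multiple ($2/3$) of the coefficient $a$ of (33), so the form $y^2 = a x^3$ quoted in the text is recovered up to the value of the constant.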

5. Homoclinic Bifurcations in the Enzyme System
We start this section adapting the numerical methods for homoclinic continuation, summarized above, to the system under study. The information we get on homoclinic connections allows to show, in Sec. 6, eight representative bifurcation sets with the dynamical behavior exhibited by this system. These bifurcation sets include the previous information we obtained by using analytical methods concerning local bifurcations as well as the global bifurcations arising in enzyme system not yet considered. These are the upper concave homoclinic orbit ($H_{UC}$); the right upper concave saddle-node homoclinic orbit ($SNH_R^{UC}$), the double homoclinic orbit (HH), the cuspidal loop (CL) and the lower concave homoclinic orbit with simultaneously zero trace and coefficient EID = 1 ($H_{EID}$). In this way, we obtain two objectives: to show the unfoldings of the different codimension-three bifurcation phenomena as well as the transition among these important organizing centers of the dynamics exhibited by the enzyme system.
We are now interested in how an orbit in the s-a phase plane appears in the u-v plane and vice versa (see Fig. 12). First, note that the curve $\dot s = 0$ is mapped into the u-axis and the straight line labeled as r (namely, $s_0 - s = \alpha(a_0 - a)$) is transformed into the $v = h(u)$ curve. Secondly, the region where $\dot s < 0$ maps into the region $v < 0$. Thus, the equilibria of system (1), that appeared on the straight line r, occur now on the u-axis, in system (4).
In general, an orbit in the region $\dot s < 0$ will move from right to left (as $t$ increases) whereas the corresponding orbit in the u-v plane will move from left to right in the $v < 0$ zone. Analogously, the orbits in the $\dot s > 0$ region, that move from left to right, correspond to orbits moving from right to left in the $v > 0$ region.

Fig. 12. Sketch to understand how the orbits in the s-a plane map into the u-v plane. We have represented, in the plane s-a, the nullcline corresponding to $\dot s = 0$ and the straight line r where the equilibria appear. In the plane u-v we have drawn the equilibria, as the intersection points between the u-axis and the curve $v = h(u)$. A homoclinic orbit also appears in both planes: it is lower concave in the s-a plane and upper concave in the u-v plane.

In particular, the lower concave homoclinic orbit drawn in the s-a plane is transformed into the upper concave homoclinic orbit in the u-v plane. Thus, all kinds of homoclinic connections (except the concave ones) keep their shape in both planes but the orbits are described in opposite senses.
Note that the names and labels of the homoclinic orbits along this work correspond to their shape in the u-v plane.
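The shooting formulation of Sec. 4.1, in which a homoclinic connection is a zero of a scalar function measuring, on a transversal section, the distance between the stable and unstable manifolds (both started from linear approximations at the saddle), can be illustrated on a toy problem. The sketch below is not the authors' code and does not use the enzyme system. It assumes a hypothetical one-parameter family x' = y, y' = x − x² + μy, for which the homoclinic loop exists only at μ = 0, a fixed-step RK4 integrator, the section {y = 0, x > 1}, and plain bisection in μ. All names are illustrative.

```python
import math


def field(x, y, mu):
    """Toy planar family: saddle at the origin, homoclinic loop only for mu = 0."""
    return y, x - x * x + mu * y


def rk4_step(x, y, mu, h):
    """One classical Runge-Kutta step of size h (h < 0 integrates backwards)."""
    k1x, k1y = field(x, y, mu)
    k2x, k2y = field(x + 0.5 * h * k1x, y + 0.5 * h * k1y, mu)
    k3x, k3y = field(x + 0.5 * h * k2x, y + 0.5 * h * k2y, mu)
    k4x, k4y = field(x + h * k3x, y + h * k3y, mu)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)


def hit_section(x, y, mu, h, max_steps=200_000):
    """Integrate until y changes sign with x > 1; return x on the section y = 0."""
    for _ in range(max_steps):
        xn, yn = rk4_step(x, y, mu, h)
        if xn > 1.0 and y * yn <= 0.0 and y != yn:
            t = y / (y - yn)              # linear interpolation to y = 0
            return x + t * (xn - x)
        x, y = xn, yn
    raise RuntimeError("section not reached")


def distance(mu, eps=1e-3, h=0.01):
    """G(mu): signed gap on the section between unstable and stable manifolds."""
    root = math.sqrt(mu * mu + 4.0)
    lam_u = 0.5 * (mu + root)             # unstable eigenvalue (> 0)
    lam_s = 0.5 * (mu - root)             # stable eigenvalue (< 0)
    # Linear approximations: eigenvectors (1, lambda) of [[0, 1], [1, mu]].
    x_u = hit_section(eps, eps * lam_u, mu, +h)   # forward along W^u
    x_s = hit_section(eps, eps * lam_s, mu, -h)   # backward along W^s
    return x_u - x_s


def bisect(func, lo, hi, tol=1e-8):
    """Locate a sign change of func in [lo, hi] by bisection."""
    f_lo = func(lo)
    if f_lo * func(hi) > 0.0:
        raise ValueError("no sign change in the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if f_lo * f_mid <= 0.0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


mu_hom = bisect(distance, -0.1, 0.15)     # expected to be close to 0

if __name__ == "__main__":
    print("homoclinic parameter value:", mu_hom)
```

In a two-parameter problem the same function G would be followed along a curve in the parameter plane instead of being solved by bisection, and the test functions discussed in Sec. 4.2 would be evaluated along that curve.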

Now we describe the strategy followed in the continuation of the homoclinic connection loci. In all cases of homoclinic orbits to a hyperbolic saddle point, we have taken linear approximations to the stable and unstable manifolds of the equilibrium. The addition of the test function $\Psi_1 = -\lambda_1 + \lambda_2$ allows to detect degenerate homoclinic connections with zero trace (codimension-two). If we consider the test functions $\Psi_1$ along with $\Psi_2 = \mathrm{EID} - 1$ we may continue the codimension-three homoclinic orbits given by the vanishing of both trace and integral of the divergence (see algorithm EID developed in [Freire et al., 1999b]).
In principle, there are four kinds of homoclinic orbits that may become degenerate due to the vanishing of the trace: $H_L$, $H_R$, $H_{UC}$ and $H_{LC}$. However, we have checked that this degeneration only appears in two cases (labeled as $H_R^0$ and $H_{LC}^0$). Moreover, the following degeneration (EID = 1) only occurs for the $H_{LC}$ homoclinic connections. We might call ilfj? to this codimension-three homoclinic orbits but, for simplicity, we denote them as $H_{EID}$.
For the reference values found in the literature $\alpha = 0.2$ and $\kappa = 0.1$ (we will use along all this section), we have located a homoclinic connection $H_{EID}$ (it is a degeneration point in the curve of lower concave homoclinic orbit with zero trace $H_{LC}^0$) for the parameter values

$$u^* \approx 7.4884, \quad a_0 \approx 717.2305, \quad s_0 \approx 36.122, \quad \rho \approx 0.0939.$$

We remark that the other homoclinic connections with zero trace do not present this additional degeneracy (EID = 1).
We have also computed the resonant coefficient RES given in (30), that determines the stability of the homoclinic with zero trace and EID = 1, $H_{EID}$:

1 u* [F{(u*)FZ(u*) - F['(u*)F^u*)} - 2Fl(uic)F£(uir)


RES
16 W*2K)*2(«*)
We have verified that, for $\kappa = 0.1$, it does not vanish (in fact, it remains always positive). Therefore, no codimension-four homoclinic orbit with zero-trace, EID = 1 and RES = 0 arises.
When a curve of left homoclinic orbits intersects with a curve of right homoclinic orbits, a double homoclinic connection occurs. In this situation, the equilibrium point is hyperbolic, but other codimension-one homoclinic curves appear. In Fig. 13, we show the details of the tangency between the curves of right homoclinic orbits ($H_R$) and lower concave homoclinic orbits ($H_{LC}$) and between the curves of left homoclinic orbits ($H_L$) and upper concave homoclinic orbits ($H_{UC}$). Note that if we superimpose both figures, the curve $H_{UC}$ would be imperceptible with respect to $H_{LC}$, as the parameter region shown in (a) is approximately ten times the region drawn in (b).
At this moment we think it is interesting to have a realistic idea of the phase portrait of the homoclinic connections exhibited by the enzyme system. In Fig. 14 we represent the phase portraits of two homoclinic orbits, obtained with Dstool [Guckenheimer & Kim, 1992], for the following values of the parameters: $s_0 = 37$, $\alpha = 0.2$ and $\kappa = 0.1$. The first one, a right homoclinic connection $H_R$, occurs for $a_0 \approx 683.39886$ and $\rho \approx 0.1036702$. Its phase portrait in the original s-a plane appears in Fig. 14(a) whereas its phase portrait in the u-v plane is drawn in Fig. 14(b). The second one, a lower concave homoclinic connection $H_{LC}$ in the u-v plane (and an upper concave homoclinic connection in the s-a), exists for $a_0 \approx 683.46591$ and $\rho \approx 0.1036624$. We show its phase portrait in the original s-a plane in Fig. 14(c) and in the u-v plane in Fig. 14(d).
Two conclusions follow from this picture. First, it is evident that the u-v plane is more convenient for the representation of the orbits (they are easier to see in such a plane). Secondly, small variations of the parameters imply important changes in the dynamic behavior of the system (compare the parameter values of the two homoclinic orbits). Then this shows clearly the importance of analytical and very precise numerical results to understand the full dynamical behavior this system exhibits. Therefore, the narrow interval of the parameters where the phenomena occur (that has a parallelism in the narrow region of the phase plane where the important orbits are) makes completely useless the utilization of a brute-force simulation strategy of the system.
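To give a feeling for the scales involved at the parameter values quoted above, the following sketch locates the equilibria and classifies them through the sign of the Jacobian determinant. It rests on an assumption that should be checked against the definition of system (1) earlier in the paper: the system is taken to be s' = s0 − s − ρsa/(1 + s + κs²), a' = α(a0 − a) − ρsa/(1 + s + κs²), a form inferred here from its α = 1 restriction (26) and from the line r: s0 − s = α(a0 − a). The grid scan and bisection are illustrative choices, not the method used by the authors, and all function names are hypothetical.

```python
def rate(s, a, rho, kappa):
    """Reaction term rho*s*a / (1 + s + kappa*s**2) of the ASSUMED system (1)."""
    return rho * s * a / (1.0 + s + kappa * s * s)


def equilibrium_residual(s, s0, a0, alpha, rho, kappa):
    """Residual of s' = 0 restricted to the line r: s0 - s = alpha*(a0 - a)."""
    a = a0 - (s0 - s) / alpha
    return s0 - s - rate(s, a, rho, kappa)


def jacobian_det(s, a, alpha, rho, kappa):
    """Determinant of the Jacobian of the assumed system (1) at (s, a)."""
    q = 1.0 + s + kappa * s * s
    r_s = rho * a * (1.0 - kappa * s * s) / (q * q)   # d(rate)/ds
    r_a = rho * s / q                                  # d(rate)/da
    return (1.0 + r_s) * (alpha + r_a) - r_a * r_s


def find_equilibria(s0, a0, alpha, rho, kappa, n_grid=4000):
    """Scan 0 < s < s0 for sign changes of the residual, then refine by bisection."""
    def g(s):
        return equilibrium_residual(s, s0, a0, alpha, rho, kappa)

    roots = []
    step = s0 / n_grid
    for i in range(n_grid):
        lo, hi = i * step, (i + 1) * step
        g_lo = g(lo)
        if g_lo * g(hi) < 0.0:
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                g_mid = g(mid)
                if g_lo * g_mid <= 0.0:
                    hi = mid
                else:
                    lo, g_lo = mid, g_mid
            s = 0.5 * (lo + hi)
            roots.append((s, a0 - (s0 - s) / alpha))
    return roots


# Parameter values quoted in the text for the right homoclinic connection H_R.
params = dict(s0=37.0, a0=683.39886, alpha=0.2, rho=0.1036702, kappa=0.1)
equilibria = find_equilibria(**params)
dets = [jacobian_det(s, a, params["alpha"], params["rho"], params["kappa"])
        for s, a in equilibria]

if __name__ == "__main__":
    for (s, a), d in zip(equilibria, dets):
        kind = "saddle" if d < 0 else "node or focus"
        print(f"s = {s:.4f}, a = {a:.4f}, det J = {d:+.4e} ({kind})")
```

Under the assumed form of (1), the three equilibria found lie within a short interval of s, and the intermediate one has a negative Jacobian determinant, in line with the statement that the intermediate equilibrium is always a saddle.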

Fig. 13. Detail of the tangency between the curves of: (a) right homoclinic orbits ($H_R$) and lower concave homoclinic orbits ($H_{LC}$); (b) left homoclinic orbits ($H_L$) and upper concave homoclinic orbits ($H_{UC}$). In both cases the point of tangency corresponds to a double homoclinic orbit (HH) ($s_0 = 37$, $\alpha = 0.2$ and $\kappa = 0.1$).

Another important case of codimension-two homoclinic connections appears when the homoclinic curve meets a curve of saddle-node bifurcation of equilibria. In these cases of saddle-node homoclinic orbits (there are five in the enzyme system, corresponding to $SNH_R$, $SNH_L$, $SNH_R^{LC}$, $SNH_L^{LC}$, $SNH_R^{UC}$) we have taken a quadratic approximation for the center manifold and a linear approximation for the hyperbolic manifold. For that, we have translated the nonhyperbolic equilibrium of (3) to the origin, followed by a Taylor expansion in a neighborhood of the origin, obtaining the following second-order truncated system:

$$\dot u = u^* v + u v, \qquad \dot v = F_1(u^*)\,v + v^2 + F_1'(u^*)\,u v + \tfrac12 F_2''(u^*)\,u^2. \tag{34}$$

The linear change of variables given by

$$\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} 1 & u^* \\ 0 & F_1(u^*) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$

uncouples the linear part of (34), obtaining in the new variables $x$ and $y$ the following system

$$\begin{pmatrix} \dot x \\ \dot y \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & F_1(u^*) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} + \begin{pmatrix} f(x, y) \\ g(x, y) \end{pmatrix}, \tag{35}$$

$$f(x, y) = -u^* \left[ F_1(u^*)\,y^2 + F_1'(u^*)\,y\,(u^* y + x) + \frac{F_2''(u^*)}{2 F_1(u^*)}\,(u^* y + x)^2 \right] + F_1(u^*)\,y\,(u^* y + x),$$

$$g(x, y) = F_1(u^*)\,y^2 + F_1'(u^*)\,y\,(u^* y + x) + \frac{F_2''(u^*)}{2 F_1(u^*)}\,(u^* y + x)^2.$$

The origin, that is a semi-hyperbolic equilibrium of (35), has a center manifold with tangent space on the OX axis. This center manifold is given, up to second order, by the equation $y = a x^2$, for a certain value of $a$. Differentiating and identifying coefficients, we obtain, in the original variables, the following expression for the center manifold

$$v = -\frac{F_2''(u^*)}{2 F_1(u^*)} \left[ u^2 - \frac{2 u^*}{F_1(u^*)}\,u v + \frac{(u^*)^2}{F_1^2(u^*)}\,v^2 \right].$$

With respect to the location and the continuation of cuspidal loops, we have approximated the separatrices of the nonhyperbolic equilibrium by means of the semi-cubic $(u - u^*)^3 = a v^2$, where $u^*$ is the abscissa of the equilibrium and $a$ is a parameter to be determined. Let us consider the system (3) written in the Bogdanov-Takens normal form, in a

Fig. 14. Phase portraits of two homoclinic orbits for $s_0 = 37$, $\alpha = 0.2$ and $\kappa = 0.1$. The first one is a right homoclinic connection $H_R$ that occurs for $a_0 \approx 683.39886$ and $\rho \approx 0.1036702$: (a) plane s-a; (b) plane u-v. The second one is a lower concave homoclinic connection $H_{LC}$ that occurs for $a_0 \approx 683.46591$ and $\rho \approx 0.1036624$: (c) plane s-a; (d) plane u-v.

neighborhood $(u^*, 0)$, up to second order but avoiding the $uv$ term:

$$\dot u = v, \qquad \dot v = \frac{F_2''(u^*)}{2 u^*}\,(u - u^*)^2. \tag{36}$$

Differentiating with respect to $t$ in the semi-cubic, and from (36), we get the following value

$$a = \frac{3 u^*}{F_2''(u^*)}.$$

To start the continuation, we have to previously locate a cuspidal loop. It has been an easy task proceeding in the following way. Firstly, we have detected a parameter space point corresponding to a simultaneous Hopf bifurcation in an equilibrium and a Bogdanov-Takens in the other equilibrium. Next, the periodic orbit arising from the Hopf bifurcation evolves, as the parameters run over the Bogdanov-Takens curve, towards the cuspidal loop.
Let the parameter-space point $(a_0, s_0, \rho, \alpha, \kappa)$ be such that, simultaneously, the equilibrium $(u_1^*, 0)$ undergoes a Hopf bifurcation and the equilibrium $(u_2^*, 0)$ undergoes a Bogdanov-Takens bifurcation. Since both $u_1^*$ and $u_2^*$ are zeros of the equation $F_1(u) = 0$, it is easily deduced that

$$\alpha = \frac{\left( u_2^* p'(u_2^*) - p(u_2^*) \right)^2 - p(u_2^*)}{\left( u_2^* p'(u_2^*) - p(u_2^*) \right)^2 + p(u_2^*)}, \tag{37}$$

that is an expression, for $\kappa$ constant, of the value of $\alpha$ in terms of the abscissa of the equilibrium undergoing the Bogdanov-Takens bifurcation.
Since $u_2^*$ is a double root and $u_1^*$ is a single root of the cubic polynomial $h(u)$ given in (4), it is deduced, using the Cardano relations, that

K(U\)2 + a __ 1 K2(u\f + KU\ + 1
Un = (38)
(1 + Q)KU\ 2 *\2
K{K(U\

From (37) and (38), and for $\alpha = 0.2$ and $\kappa = 0.1$, we obtain the parameter values where a Hopf
232 E. Freire et al.

bifurcation and a Bogdanov-Takens bifurcation simultaneously occur:

u1* ≈ 6.4560, u2* ≈ 7.4798, a0 ≈ 716.0002, s0 ≈ 36.1198, ρ ≈ 0.094084.

If we continue the periodic orbit arising from this Hopf bifurcation point (as the parameters run over the Bogdanov-Takens curve), we obtain the parameter values for which a cuspidal loop exists:

(u*, a0, s0, ρ) ≈ (7.5602, 715.2247, 36.1692, 0.094415).

In Fig. 15 we show, using simulation with Dstool, the evolution of the phase portraits of system (3), for α = 0.2 and κ = 0.1, as the parameters run over the Bogdanov-Takens curve. We focus on the transition in the vicinity of a cuspidal loop point. At first, only a stable large-amplitude periodic orbit exists (i.e. a periodic orbit surrounding all the equilibria) [see Fig. 15(a)]. An unstable small-amplitude periodic orbit (i.e. a periodic orbit surrounding only one equilibrium point) appears in a Hopf bifurcation exhibited by the left equilibrium [see Fig. 15(b)]. When this orbit grows, it collides with the cusp point, giving rise to an unstable cuspidal loop [see Fig. 15(c)]. This codimension-three global connection does not destroy the periodic orbit but it allows the transition between a small- and a large-amplitude periodic orbit [see Fig. 15(d)]. Finally, this unstable periodic orbit disappears in a saddle-node bifurcation together with the stable large-amplitude periodic orbit that has been present in all this sequence [see Fig. 15(e)].

Now we compute, from the expressions given in Sec. 3, the parameter values where the three codimension-three bifurcations of equilibria occur, for α = 0.2 and κ = 0.1. On the one hand, the parameter values for the degenerate Bogdanov-Takens points D and E are, respectively,

(u*, a0, s0, ρ) ≈ (7.111385, 721.304124, 35.963550, 0.09258790)

and

(u*, a0, s0, ρ) ≈ (6.931252, 725.121350, 35.936106, 0.091869965).

On the other hand, the codimension-three degenerate Hopf bifurcation point H2 occurs when

(u*, a0, s0, ρ) ≈ (4.4721298, 458.8523045, 51.6934269, 0.3533743).

This point lies on the codimension-two degenerate Hopf bifurcation curve H1' arising from the codimension-three degenerate Bogdanov-Takens point E.

Moreover, from the analytical expressions obtained we conclude that no codimension-four bifurcations of equilibria (Hopf and Bogdanov-Takens) may occur. In the case κ = 0.1, we have checked that the seventh-order coefficient a3 of the normal form of the Hopf bifurcation (2) remains always negative for all the degenerate Hopf bifurcation points H2; thus, no codimension-four Hopf bifurcation arises.

Such a codimension-four point would lead to the existence of a curve of swallowtail singularities of periodic orbits (codimension-three). Although we have not numerically found a swallowtail singularity (a cusp of cusps) of periodic orbits, its existence would not be strange (there are several codimension-three bifurcations of equilibria and homoclinic orbits).

The sign of the seventh-order Hopf bifurcation coefficient a3 determines the relative position of the curve of cusp bifurcations of periodic orbits arising from the point H2 with respect to the degenerate Hopf curve H1'. Analogously, the sign of the resonant coefficient RES determines the relative position of the curve of cusp bifurcations of periodic orbits that emerges from the point H_EID with respect to the curve of lower concave homoclinic orbits with zero trace H_LC^D. In Fig. 16, a qualitative picture of all of these curves is sketched. As predicted by the theory (see [Takens, 1973]), the cusp curve Cu emerges from H2 by the side of the degenerate Hopf curve H1' where the coefficient a2 is positive. In a similar manner [Dumortier et al., 1994], the cusp curve Cu emerges from H_EID by the side of the degenerate homoclinic curve H_LC^D where the coefficient EID is less than one. This second cusp curve Cu ends at the cuspidal loop point CL. In this figure the degenerate Bogdanov-Takens points E and D are also shown because they are the starting points, respectively, of the curves H1' and H_LC^D.

To demonstrate the usefulness of both the analytical results obtained in Secs. 2 and 3 and the numerical methods for homoclinic connections, we
Fig. 15. Evolution of the phase portraits along the Bogdanov-Takens bifurcation in the (u, v)-plane for α = 0.2 and κ = 0.1: (a) u* = 7.2; (b) u* = 7.5; (c) u* = 7.56022; (d) u* = 7.561 and (e) u* = 7.6. The meaning of the symbols is the following: S = Stable periodic orbit; U = Unstable periodic orbit; • = Stable equilibrium; ○ = Unstable equilibrium; + = Saddle equilibrium. Note that the right equilibrium is nonhyperbolic (marked by superimposing a cross and a filled circle).

Fig. 16. Relative position of the curves of cusp bifurcations of periodic orbits (Cu) with respect to the curves of degenerate Hopf H1' and of lower concave homoclinic orbit with zero trace H_LC^D. This scheme corresponds to a projection from the (a0, s0, ρ)-parameter space onto the (a0, s0)-plane, for α = 0.2 and κ = 0.1.

Fig. 17. Curves of codimension-three bifurcations, in the (u*, α)-plane, for κ = 0.1: E = cusps of order three (Bogdanov-Takens bifurcation with degeneracy in the uv term of its normal form); D = foci (Bogdanov-Takens bifurcation with degeneracy in the u^2 term of its normal form); H2 = Hopf bifurcation with degeneracy in both order three and five of its normal form; CL = cuspidal loops and H_EID = homoclinic orbits with zero trace and coefficient EID equal to one.
will now consider the (a0, s0, ρ, α)-parameter space, for κ = 0.1. In this situation, the codimension-three bifurcations the enzyme system exhibits occur on curves we are able to compute from the analytical results or with the numerical continuation procedures. In Fig. 17, all the curves corresponding to codimension-three bifurcations are represented in the (u*, α)-plane [recall that u* determines uniquely a point (a0, s0, ρ)]: the cusps of order three (E), the foci (D), the degenerate Hopf (H2), the cuspidal loops (CL) and the lower concave homoclinic orbits with simultaneously zero trace and zero integral of the divergence (H_EID). Observe that all the curves approach α = 1 asymptotically since, for this value, all the dynamics disappear, as was pointed out in Sec. 3.3. At the other extreme, four of the curves collapse for α = 0. This would be a codimension-four point if this parameter value had biological meaning, but this is not the case.

Now we are interested in the stability of the cuspidal loop, to determine if some additional degeneracy, that is, a change in its stability (that will imply a codimension-four bifurcation) may be present. In the case α = 0.2 and κ = 0.1, we have verified that both the cuspidal loop and the small-amplitude periodic orbits arising from the homoclinic orbit associated with the Bogdanov-Takens bifurcation are unstable. This means that this cuspidal loop lies in the simplest case of cuspidal loops considered in Sec. 4.3.

To detect possible changes in the stability of the cuspidal loops, we have performed a numerical study for α ∈ (0, 0.9994705) and κ = 0.1. This change would give rise to the richest case of cuspidal loops. The result of this study has been negative. We have not detected any change in the stability of such cuspidal loops. However, we have verified that for α > α0 ≈ 0.98, a greater richness in the dynamics of the enzyme system occurs. It is due to the appearance of a degenerate Hopf bifurcation over the curve of points where both a Hopf bifurcation in one equilibrium and a Bogdanov-Takens in the other one simultaneously occur (this curve is of coexistence of Hopf and Bogdanov-Takens bifurcations, HT).

From this degenerate point a curve of saddle-node of small periodic orbits, sn, arises, that coexists with the aforementioned curve of saddle-node of large-amplitude periodic orbits, SN. For α ∈ (0, α0) the Hopf bifurcation is subcritical and for α ∈ (α0, 1) the Hopf bifurcation is supercritical. In Fig. 18 we show qualitatively the relative positions, in the (u*, α)-plane, of the curve of cuspidal loops (CL) and the curve of coexistence of Hopf and Bogdanov-Takens bifurcations (HT). We have verified that both curves intersect at the value α1 ≈ 0.99, exchanging their relative positions. There exists, moreover, a value α2 ≈ 0.994 for which both curves HT and SN intersect.
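The curve HT of coexistence of Hopf and Bogdanov-Takens bifurcations can be explored numerically from relations (37) and (38). The following minimal Python sketch evaluates both relations for a given abscissa u2* of the Bogdanov-Takens equilibrium. It assumes p(u) = κu² + u + 1, a form that is not stated in this part of the text and is adopted here only because it reproduces the quoted values α ≈ 0.2 and u1* ≈ 6.4560 for u2* ≈ 7.4798 and κ = 0.1. All function names are hypothetical.

```python
def p(u, kappa):
    """Assumed quadratic p(u) = kappa*u**2 + u + 1 (an assumption, not from the text)."""
    return kappa * u**2 + u + 1.0


def dp(u, kappa):
    """Derivative of the assumed p(u)."""
    return 2.0 * kappa * u + 1.0


def alpha_on_ht(u2, kappa):
    """Relation (37): alpha as a function of the Bogdanov-Takens abscissa u2*."""
    q = u2 * dp(u2, kappa) - p(u2, kappa)
    return (q**2 - p(u2, kappa)) / (q**2 + p(u2, kappa))


def u1_from_cardano(u2, alpha, kappa):
    """First form of relation (38): abscissa u1* of the Hopf equilibrium."""
    return (kappa * u2**2 + alpha) / ((1.0 + alpha) * kappa * u2)


def u1_on_ht(u2, kappa):
    """Second form of relation (38), with alpha eliminated through (37)."""
    return 0.5 * (kappa**2 * u2**3 + kappa * u2 + 1.0) / (
        kappa * (kappa * u2**2 - 1.0)
    )


if __name__ == "__main__":
    u2, kappa = 7.4798, 0.1
    alpha = alpha_on_ht(u2, kappa)
    print("alpha =", alpha)
    print("u1* (first form)  =", u1_from_cardano(u2, alpha, kappa))
    print("u1* (second form) =", u1_on_ht(u2, kappa))
```

Sweeping u2 over an interval and recording (u2, alpha_on_ht(u2, kappa)) would give a rough picture of HT in the (u*, α)-plane under the same assumption on p(u).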

Fig. 18. Qualitative representation, for κ = 0.1, of the curves of: cuspidal loops (CL); coexistence of Hopf and Bogdanov-Takens bifurcations (HT); saddle-node of small-amplitude periodic orbits (sn); saddle-node of large-amplitude periodic orbits (SN). The solid line (resp. dashed) in the HT curve means that the Hopf bifurcation is supercritical (resp. subcritical).

Fig. 19. Qualitative bifurcation diagrams of a family of periodic orbits near the cuspidal loop, for κ = 0.1: (a) α ∈ (0, α0); (b) α ∈ (α0, α1); (c) α ∈ (α1, α2); (d) α ∈ (α2, 1). The ordinate, A, represents the amplitude of the periodic orbit. A solid (resp. dashed) line means stable (resp. unstable) periodic orbit.

The above bifurcation set will give rise to four different bifurcation diagrams depending on the α value, for κ = 0.1 (see Fig. 19). In the first case, when α ∈ (0, α0), the sequence of the bifurcation points is HT-CL-SN. For α ∈ (α0, α1), a point of saddle-node bifurcation of small-amplitude periodic orbits (sn) appears as a consequence of the degenerate Hopf bifurcation. For α ∈ (α1, α2), the bifurcations HT and CL have changed their relative positions. And finally, for α ∈ (α2, 1), HT and SN interchange their position. Note that the cuspidal loop CL always occurs on the unstable branch: it connects an unstable small-amplitude periodic orbit with an unstable large-amplitude periodic orbit.

In Fig. 20 we show, using again simulation with Dstool, the evolution of the phase portraits of system (3), for α = 0.9971544168 and κ = 0.1, as the parameters run over the Bogdanov-Takens bifurcation curve. At first, only a stable large-amplitude periodic orbit is present [see Fig. 20(a)]. Later, two small-amplitude periodic orbits emerge from a saddle-node bifurcation [see Fig. 20(b)]. The unstable small-amplitude periodic orbit grows and collides with the cusp point, giving rise to a cuspidal loop [see Fig. 20(c)]. After the
Fig. 20. Evolution of the phase portraits in the (u, v)-plane for α = 0.9971544168 and κ = 0.1: (a) u* = 88.3; (b) u* = 88.32; (c) u* = 88.4715; (d) u* = 88.473; (e) u* = 88.48; (f) u* = 88.8. S = Stable limit cycle; U = Unstable limit cycle; • = Stable equilibrium; ○ = Unstable equilibrium; + = Equilibrium of saddle type. Note that the right equilibrium is nonhyperbolic (marked by superimposing a cross and a filled circle).
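The four bifurcation diagrams of Fig. 19 can be summarized by a small lookup. The sketch below is only an illustration: it uses the approximate thresholds α0 ≈ 0.98, α1 ≈ 0.99 and α2 ≈ 0.994 quoted in the text for κ = 0.1, encodes only the relative order of HT, CL and SN stated there, and treats the presence of the sn point as a separate flag because its position is not spelled out for every case. The function names are hypothetical.

```python
# Approximate thresholds quoted in the text for kappa = 0.1.
ALPHA_0 = 0.98    # degenerate Hopf point on HT; the curve sn arises here
ALPHA_1 = 0.99    # HT and CL exchange their relative positions
ALPHA_2 = 0.994   # HT and SN exchange their relative positions


def _check(alpha):
    """The text only considers alpha in the open interval (0, 1)."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie in (0, 1)")


def hopf_type(alpha):
    """Criticality of the Hopf bifurcation on the HT curve."""
    _check(alpha)
    return "subcritical" if alpha < ALPHA_0 else "supercritical"


def has_small_saddle_node(alpha):
    """True when the saddle-node of small-amplitude periodic orbits (sn) exists."""
    _check(alpha)
    return alpha > ALPHA_0


def order_of_points(alpha):
    """Relative order of HT, CL and SN along the Bogdanov-Takens curve."""
    _check(alpha)
    if alpha < ALPHA_1:
        return ("HT", "CL", "SN")
    if alpha < ALPHA_2:
        return ("CL", "HT", "SN")
    return ("CL", "SN", "HT")


if __name__ == "__main__":
    for alpha in (0.2, 0.985, 0.992, 0.9971544168):
        print(alpha, hopf_type(alpha), has_small_saddle_node(alpha),
              order_of_points(alpha))
```

For α = 0.9971544168 the lookup returns the order CL, SN, HT with the sn point present, which agrees with the sequence of phase portraits described for Fig. 20.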

breaking of the cuspidal loop, an unstable large-amplitude periodic orbit appears [see Fig. 20(d)]. The two large-amplitude periodic orbits collide and disappear in a saddle-node bifurcation [see Fig. 20(e)]. Finally, the stable small-amplitude periodic orbit disappears in a supercritical Hopf bifurcation and only the two equilibria are present [see Fig. 20(f)].

To complete the special attention we have devoted to the cuspidal loop, we now show two

such global connections in the (s, a)-phase plane (see Fig. 21). The first one corresponds to the values of the parameters α = 0.2 and κ = 0.1 whereas the second one occurs for α ≈ 0.97552 and κ = 0.1. We pay attention to several things. First, the orbits are better observed in the (u, v)-plane than in the (s, a)-plane [compare these cuspidal loops with the other two represented in Figs. 15(c) and 20(c)]. Second, the cuspidal loop of Fig. 21(b) lets us see the behavior the system exhibits when α tends to 1: we observe how it approaches the line s − a = s0 − a0. Recall that for α = 1, the straight line s − a = s0 − a0 is invariant for the enzyme system and even all the equilibria lie on such a line (see Sec. 3.3). Thus, the existence of limit cycles, homoclinic connections and foci is not possible: there are no more limit sets than equilibria for the value α = 1, and therefore the only bifurcations that remain are saddle-nodes and cusps of equilibria.

To have an idea of the rich homoclinic behavior exhibited by the enzyme system, we show in Fig. 22, for α = 0.2 and κ = 0.1, the projection on the (a0, s0)-parameter plane of all the curves of codimension-two homoclinic orbits (that are in the (a0, s0, ρ)-parameter space). We remark that these curves of codimension-two homoclinic orbits start and/or end at the codimension-three points D, E, CL and H_EID. The two degenerate Bogdanov-Takens points (D and E) as well as the cuspidal loop (CL) appear on the Bogdanov-Takens curve BT whereas the H_EID point is on the curve of lower concave homoclinic orbits with zero trace H_LC^D.

Note that six codimension-two homoclinic connections are related to the cuspidal loop point CL. One of them, the curve of left saddle-node homoclinic orbits (SNH_L), exists on both sides of CL. The other five curves, corresponding to double homoclinic orbits (HH), left upper concave saddle-node homoclinic orbits (SNH_L^UC), lower concave homoclinic orbits with zero trace (H_LC^D), left lower concave saddle-node homoclinic orbits (SNH_L^LC) and left homoclinic orbits with zero trace (H_L^D), emerge from CL. Three of these six curves end at the degenerate Bogdanov-Takens point D: H_LC^D, SNH_L^LC and SNH_L, whereas the curve H_L^D ends at the other degenerate Bogdanov-Takens point E. Moreover, two homoclinic curves, corresponding to right saddle-node homoclinic orbit (SNH_R) and right lower concave saddle-node homoclinic orbit (SNH_R^LC), emerge from D.

Some comments are now in order. First, in the unfolding of the cuspidal loop shown in Fig. 11, several of the homoclinic connections correspond to the right equilibrium whereas in Fig. 22 the same kinds of homoclinic orbits connect the left equilibrium. The reason is that, in the unfolding shown, the cuspidal loop occurs at the left equilibrium whereas in the enzyme system the cusp point connected by a loop is the right one.

Secondly, note that some of these curves are almost indistinguishable, hence the need for a qualitative picture (see the bottom sketch of Fig. 22). Finally, the tangential behavior a homoclinic curve presents when it approaches a curve of saddle-node of equilibria (in the so-called saddle-node separatrix-loop bifurcation, see Sec. 4.2) is also present in the curves SNH_R, SNH_R^LC, SNH_L^LC and SNH_L when they approach the degenerate Bogdanov-Takens point D.

Fig. 21. Cuspidal loops in the (s, a)-phase plane, for κ = 0.1 and: (a) α = 0.2, a0 ≈ 715.2247, s0 ≈ 36.1692 and ρ ≈ 0.09441; (b) α ≈ 0.97552, a0 ≈ 1881.2647, s0 ≈ 1823.2647 and ρ ≈ 168.8043.

Fig. 22. Projection, onto the (a0, s0)-plane, of the curves of codimension-two homoclinic orbits arising from the codimension-three points E, D, CL and H_EID. These numerical (top) and qualitative (bottom) pictures are obtained for α = 0.2 and κ = 0.1.

Due to the great variety of homoclinic connections exhibited by the enzyme system, we find it useful to schematize as much information as possible in Fig. 23. We then devote a window to each of the sixteen kinds of homoclinic connections. In each window we show the following information about a homoclinic orbit: in the upper left corner we write the label used for the homoclinic connection throughout the text; its codimension is indicated in the upper right corner; a qualitative picture of its phase portrait, with information about the equilibria involved (center); finally, at the bottom of the window we have put the bifurcations of higher codimension in whose unfolding such a homoclinic connection appears.

The convention we have used in such a table is now indicated. Hyperbolic equilibria are represented by a filled point, except in the case of zero trace where a star is used. Saddle-node equilibria appear as an empty point and cusp points

Fig. 23. Table of the sixteen kinds of homoclinic connections exhibited by the enzyme system. In each window we show the
following information about a homoclinic orbit: the label used for the homoclinic connection along the text (upper left corner);
its codimension (upper right corner); a scheme of its phase portrait (center); the bifurcations of higher codimension related to
the homoclinic (bottom). The following convention has been used: filled point (nonresonant hyperbolic equilibrium); empty
point (nonhyperbolic saddle-node equilibrium); star (hyperbolic equilibrium with zero trace); empty square (cusp point); filled
head of the arrow (noncentral homoclinic orbit); empty head of the arrow (central homoclinic orbit); dashed orbit (EID = 1).

as an empty square. The presence of a central homoclinic orbit (codimension-one) is denoted by an empty head of the arrow whereas a noncentral homoclinic orbit (codimension-two) is indicated by a filled head of the arrow. Finally, the vanishing of the integral of the divergence along the homoclinic orbit (EID = 1) is marked with a dashed orbit.

6. Bifurcation Sets
Our aim in this section is to show all the information obtained from the previous analytical and/or numerical study of the bifurcations of codimension-one, -two and -three. In this way, we get a complete picture of the nice and complex dynamics exhibited by the enzyme system.

The need for qualitative diagrams is made evident by looking at the following pictures. First, see again Fig. 1, that shows the curves of saddle-node bifurcations of equilibria, snL and snR, in the a0-ρ parameter plane, for s0 = 37, α = 0.2 and κ = 0.1. In the narrow area between both curves three equilibria exist whereas outside this region the system only has one equilibrium point. We observe how these two curves collapse in a cusp C and also the presence of two Bogdanov-Takens bifurcations, BT_R and BT_L. In fact, all the homoclinic bifurcation phenomena occur inside the zone of three equilibria, delimited by the curves snL and snR, and between the cusp of equilibria point C and the Bogdanov-Takens point BT_L. Thus, the different curves numerically obtained would not be distinguishable from one another.

In Fig. 24(a) we show, for the same values of the parameters, the two Hopf curves H that emerge from the Bogdanov-Takens points BT_R and BT_L. The presence of two degenerate Hopf points, H1 and H1', indicates the existence of saddle-node bifurcations of periodic orbits SN1 and SN2 that we have numerically continued. We have drawn the Hopf curves as dashed lines to distinguish them from the curves SN1 and SN2.

Even in the zoom of the region where the saddle-node curves exist, shown in Fig. 24(b), it would be easy to believe that both curves collapse in a cusp. This optical illusion is due to the proximity of both curves (see Fig. 27 to see the correct bifurcation behavior). On the other hand, it is evident that if we superimpose the curves of Figs. 1 and 24(a) a not very useful bifurcation picture would appear.

Fig. 24. (a) Curves of Hopf bifurcations of equilibria (dashed lines) and curves of saddle-node bifurcations of periodic orbits (solid lines), in the (a0, ρ)-plane, for s0 = 37, α = 0.2 and κ = 0.1. The degenerate Hopf points (H1 and H1') and the Bogdanov-Takens points (BT_R and BT_L) are marked. (b) Zoom of the region where the curves of saddle-node bifurcations of periodic orbits exist.

Our next objective is to perform a complete study of the bifurcation sets related to the five codimension-three points D, E, H_EID, H2 and CL, as well as of the transitions among them. For that, we have intersected the (a0, s0, ρ)-parameter space (for α = 0.2 and κ = 0.1) with different planes s0 = constant, in the eight situations represented in Fig. 25.

The bifurcation sets intersected by the aforementioned planes appear in Figs. 26-33. In all these cases we have performed the numerical location and continuation of all the bifurcation phenomena appearing in the aforementioned figures. We recall that we have the analytical expression for several of the bifurcation curves and points which appear in Figs. 26-33, namely saddle-node bifurcation of equilibria (snL and snR), cusp bifurcation of equilibria (C), supercritical and subcritical Hopf bifurcation (Hsuper and Hsub, respectively), codimension-two Hopf bifurcation (H1 and H1') and cusp of order three and foci bifurcations (E and D, respectively). The saddle-node bifurcation curves of periodic
Fig. 25. Relative position, with respect to the codimension-three points, of the different planes s0 = constant for which the bifurcation sets of Figs. 26-33, for α = 0.2 and κ = 0.1, are obtained.

Fig. 26. Intersection of the (a0, s0, ρ)-space with the plane s0 = 52.

Fig. 27. Intersection of the (a0, s0, ρ)-space with the plane s0 = 37.

Fig. 28. Intersection of the (a0, s0, ρ)-space with the plane s0 ≈ 36.1692.

Fig. 29. Intersection of the (a0, s0, ρ)-space with the plane s0 = 36.15.

Fig. 30. Intersection of the (a0, s0, ρ)-space with the plane s0 = 36.1.

Fig. 31. Intersection of the (a0, s0, ρ)-space with the plane s0 = 36.06.

Fig. 32. Intersection of the (a0, s0, ρ)-space with the plane s0 = 35.94.

Fig. 33. Intersection of the (a0, s0, ρ)-space with the plane s0 = 35.936.
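The choice of the eight planes s0 = constant can be checked against the s0 values of the codimension-three points quoted in the text (for α = 0.2 and κ = 0.1). The following Python sketch is only a bookkeeping aid under that reading: the dictionary and function names are hypothetical, and the plane of Fig. 28 is taken at the rounded value 36.1692 of the cuspidal loop itself.

```python
# s0 value of the plane used in each figure (Figs. 26-33).
SLICES = {26: 52.0, 27: 37.0, 28: 36.1692, 29: 36.15,
          30: 36.1, 31: 36.06, 32: 35.94, 33: 35.936}

# s0 coordinate of each codimension-three point, as quoted in the text.
CRITICAL_S0 = {"H2": 51.6934269, "CL": 36.1692, "HEID": 36.122,
               "D": 35.9635498, "E": 35.9361063}


def points_between(fig_a, fig_b):
    """Codimension-three points whose s0 lies strictly between two planes.

    The result is sorted by decreasing s0, i.e. in the order in which the
    points are crossed when s0 decreases.
    """
    hi = max(SLICES[fig_a], SLICES[fig_b])
    lo = min(SLICES[fig_a], SLICES[fig_b])
    names = [name for name, s0 in CRITICAL_S0.items() if lo < s0 < hi]
    return sorted(names, key=lambda name: -CRITICAL_S0[name])


if __name__ == "__main__":
    figures = sorted(SLICES)
    for upper, lower in zip(figures, figures[1:]):
        print(f"Fig. {upper} -> Fig. {lower}:", points_between(upper, lower))
```

With these values, each pair of consecutive planes is separated by at most one codimension-three point, which is consistent with the planes having been chosen to isolate the transitions one at a time.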

orbits (SN1 and SN2) have been continued using AUTO97 (see [Doedel et al., 1998]). The numerical continuation of all kinds of homoclinic orbits has been performed with the methods developed in [Freire et al., 1999b] and [Freire et al., 2000] that have been summarized in Sec. 4 and, when it has been possible, the corresponding loci have also been computed with HomCont [Doedel et al., 1998].

In Figs. 26-33, the symbol [X;Y] beside a codimension-two point P means that point P tends to the codimension-three point X as the parameter s0 increases and P tends to the codimension-three point Y if s0 decreases; if the symbol [Z] is beside P, it means that P tends to the codimension-three point Z as s0 either increases or decreases.

In Fig. 34 we have represented the thirty phase portraits corresponding to the different regions bounded by curves of codimension-one arising in Figs. 26-33.

The first bifurcation set, sketched in Fig. 26, corresponds to s0 = 52. There are five codimension-two bifurcations of equilibria (Bogdanov-Takens, BT_R and BT_L; cusp C; degenerate Hopf points, H1 and H1'), five codimension-two global connections (double homoclinic orbit, HH; left upper concave saddle-node homoclinic orbit, SNH_L^UC; left saddle-node homoclinic orbit, SNH_L; right lower concave saddle-node homoclinic orbit, SNH_R^LC; right saddle-node homoclinic orbit, SNH_R) and one codimension-two bifurcation of periodic orbits (cusp, Cu).

From BT_R, that is placed on the curve of saddle-node of equilibria snR, a subcritical Hopf curve (Hsub) emerges as well as a curve of right homoclinic orbits, H_R. The Hopf curve has a first degeneracy point (H1) where it becomes supercritical. Later, in a new degeneracy point (H1'), it becomes again subcritical and finally it ends at the point BT_L, that is on the curve snL. From H1, a curve of saddle-node of periodic orbits appears (SN). This curve is connected with the saddle-node curve emerging from H1', but in the vicinity of this degenerate Hopf point a cusp (Cu) appears, due to the proximity in the parameter space to the double-degenerate Hopf point H2. Precisely, the points Cu and H1' collapse at H2 (in other words, the curves of cusps Cu and of degenerate Hopf H1', in the (a0, s0, ρ)-space, emerge from H2). The homoclinic curve H_R finishes, on the snL curve, at SNH_R. Analogously, the left homoclinic curve H_L, started at BT_L, finishes at SNH_L, on the snR curve. From the HH point, the cross-point between H_R and H_L, a curve of upper concave homoclinic orbits (H_UC) and a curve of lower concave homoclinic orbits (H_LC) emerge. The first one ends at SNH_L^UC and the second one at SNH_R^LC. Note that on a portion of snR, delimited by SNH_L and SNH_L^UC, a curve of central saddle-node homoclinic connections (CSNH_L) appears. Analogously, on a portion of snL, delimited by SNH_R and SNH_R^LC, a curve of central saddle-node homoclinic orbits exists (CSNH_R). Note that, for simplicity, we have not drawn in all these qualitative figures the homoclinic curves touching tangentially the saddle-node curves snL and snR, as really occurs.

In Fig. 27, the bifurcation set for s0 = 37 is drawn. In the meantime, the system has exhibited a double-degenerate Hopf bifurcation H2 (it occurs for s0 ≈ 51.6934269). For this reason, the cusp of periodic orbits Cu has disappeared in the curve of saddle-node bifurcations of periodic orbits SN, that connects the degenerate Hopf points H1 and H1'. All the other bifurcations are present for s0 = 37.

Decreasing s0, the codimension-two points SNH_L, SNH_L^UC, HH and BT_R approach the cuspidal loop point CL and collapse at such a point. This codimension-three global bifurcation occurs for s0 ≈ 36.1692. The bifurcation set for this parameter value appears in Fig. 28.

Evidently, between s0 = 37 and s0 ≈ 36.1692, the relative position of the curve SN and the point BT_R has changed. We have not drawn this situation in a different bifurcation set for the sake of brevity and because the only consequence this change will have is the disappearance of regions 25 and 26.

On the other side of the CL point, for instance, for s0 = 36.15 (see Fig. 29) important changes appear. The three curves H_R, H_L and H_LC do not intersect (and hence HH does not exist). Moreover, a degenerate zero-trace left homoclinic orbit (H_L^D) appears on the curve H_L, a degenerate zero-trace lower concave homoclinic orbit (H_LC^D) appears on the curve H_LC and a cusp of periodic orbits Cu appears on the saddle-node curve of periodic orbits SN2 that connects the points H1' and H_LC^D. Now, the central saddle-node homoclinic connection curve (CSNH_L), on snR, is bounded by SNH_L^LC and SNH_L.

As parameter s0 decreases and reaches the critical value s0 ≈ 36.122, the coefficient EID, given in (28), equals one and the codimension-two degenerate homoclinic orbit H_LC^D becomes a codimension-three degenerate homoclinic orbit H_EID. From this point a new cusp of periodic orbits Cu2 arises as s0 decreases, as appears in Fig. 30.

Fig. 34. Phase portraits for Figs. 26-33. A solid (dotted) line represents a stable (unstable) periodic orbit; the symbols •, o
and + mean stable, unstable and saddle equilibrium, respectively.
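The grouping of the thirty phase portraits of Fig. 34 by their number of attractors, as listed in the discussion of this section, can be tabulated for quick lookup. The sketch below merely transcribes the eight groups (i)-(viii) of the text; the data layout and the function name are hypothetical.

```python
# (number of stable equilibria, number of stable limit cycles) -> regions of Fig. 34
GROUPS = {
    (1, 0): [1, 17, 25],                          # (i)
    (0, 1): [2, 5],                               # (ii)
    (1, 1): [3, 6, 7, 10, 20, 23, 24, 29, 30],    # (iii)
    (2, 0): [11, 13, 26],                         # (iv)
    (2, 1): [8, 9, 12, 16, 18, 19, 22, 27],       # (v)
    (2, 2): [14, 15],                             # (vi)
    (1, 2): [4, 28],                              # (vii)
    (0, 2): [21],                                 # (viii)
}


def attractors_in(region):
    """Return (stable equilibria, stable limit cycles) for a region number."""
    for counts, regions in GROUPS.items():
        if region in regions:
            return counts
    raise KeyError(f"region {region} is not one of the thirty regions")


if __name__ == "__main__":
    all_regions = sorted(r for regions in GROUPS.values() for r in regions)
    # Every region from 1 to 30 should appear exactly once.
    print(all_regions == list(range(1, 31)))
    print("region 15:", attractors_in(15))
```

Region 15, for example, falls in the group with two stable equilibria and two stable limit cycles, that is, four coexisting attractors.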

The curve of cusps of periodic orbits connecting points CL and H_EID has a minimal point with respect to the s0 axis (see Fig. 25). Thus, as s0 decreases, points Cu1 and Cu2 appearing in Fig. 30 coalesce and disappear, obtaining a situation such as shown in Fig. 31.

Between the two situations shown in Figs. 31 and 32 a codimension-three degenerate Bogdanov-Takens point D occurs (at that moment the BT_R point is just on the cusp point C). For the critical value of s0 corresponding to point D, namely, s0 ≈ 35.9635498, the codimension-two points BT_R, SNH_L^LC, SNH_L, H_LC^D, SNH_R^LC, SNH_R and H1 coalesce in the cusp of equilibria point C. This produces the vanishing of the following curves: the arc of subcritical Hopf bifurcation Hsub connecting the points BT_R and H1, the right homoclinic orbit curve H_R, the lower concave homoclinic orbit curve H_LC, and both central saddle-node homoclinic orbit curves CSNH_L and CSNH_R. Notice that in Fig. 32 both Bogdanov-Takens points BT_R and BT_L lie now on the saddle-node of equilibria curve snL.

As parameter s0 decreases and reaches the critical value s0 ≈ 35.9361063, the Bogdanov-Takens point BT_R shown in Fig. 32 degenerates into a cusp of order three point E. At this critical value the codimension-two points H1' and H_L^D coalesce at E and disappear as s0 decreases. Thus, the curve SN connecting the points H1' and H_L^D also disappears. The situation corresponding to a value of s0 smaller than the aforementioned critical value is shown in Fig. 33. At this level, only the saddle-node of equilibria curves snL and snR are present, the cusp of equilibria point C, the Bogdanov-Takens points BT_R and BT_L and the two curves connecting them, namely, the subcritical Hopf bifurcation curve Hsub and the left homoclinic orbit curve H_L.

The Bogdanov-Takens bifurcation curve in the (a0, s0, ρ)-space has a minimum point for the value s0 ≈ 35.9350685. Therefore, for values of s0 below this critical value, the Bogdanov-Takens points have disappeared, as well as both Hopf bifurcation Hsub and left homoclinic orbit H_L curves. At this moment, the only bifurcation phenomena that still persist are the saddle-node of equilibria bifurcations snL and snR and the cusp bifurcation of equilibria C. Then, in this case the only configurations of equilibria present are 1 and 13 of Fig. 34.

In fact, we have chosen the (a0, ρ)-plane to
we have seen the bifurcation sets in Figs. 26-33 with two Bogdanov-Takens points, and then the interaction between all the curves related to this bifurcation has been made evident.

Our bifurcation analysis shows the presence of thirty different regimes (phase portraits) of dynamical behavior of the enzyme system depending on the parameters. Multistability and oscillatory regimes are present in different combinations. We can classify the phase portraits of the system by the number of attractors (stable equilibrium and stable limit cycle), in a similar way as is done in [Bazykin, 1998]. In this manner, there are eight different groups:

(i) a single equilibrium, in regions 1, 17 and 25;
(ii) a single limit cycle, in regions 2 and 5;
(iii) one equilibrium and one periodic orbit, in regions 3, 6, 7, 10, 20, 23, 24, 29 and 30;
(iv) two equilibria, in regions 11, 13 and 26;
(v) two equilibria and one limit cycle, in regions 8, 9, 12, 16, 18, 19, 22 and 27;
(vi) two equilibria and two periodic orbits, in regions 14 and 15;
(vii) one equilibrium and two limit cycles, in regions 4 and 28;
(viii) two limit cycles, in region 21.

Note that when there are three equilibria, the middle one is a saddle and then its manifolds will play an important role in the delimitation of the basin of attraction of the corresponding attractors. On the other hand, unstable periodic orbits act as boundaries of the basin of attraction.

As we can see, the system has one globally attracting equilibrium in regions 1, 17 and 25. The distinction in the system behavior between region 1 and the other two is the transitional processes of getting back to the equilibrium after the system has been perturbed.

In cases where more than one attractor is present, the initial condition will determine in which of them the system will end up. Observe this situation, for instance in region 15. The three unstable periodic orbits and the manifolds of the saddle equilibrium mark the boundaries of the basins of attraction for the four coexisting attractors.

Note that, for instance, in region 3 we may find hard generation of oscillations (also called abrupt excitation of oscillations). In this case the phase
represent the bifurcation sets (for so = constant) portrait includes a stable equilibrium with a basin
because of the shape of the Bogdanov-Takens curve of attraction that is bounded by an unstable limit
we perfectly know analytically. For this reason, cycle. For small perturbations, damped oscillations
248 E. Freire et al.

restore the equilibrium, but the system goes into oscillations for rather strong perturbations.

We now proceed by describing some events that may occur in the enzyme system with respect to attractors if we gradually vary parameters. When parameters change, stable equilibria may show different types of behavior:

(1) Jump from one equilibrium to another. This hysteresis phenomenon between equilibria occurs, for instance, when the system is in the right equilibrium of region 13 and, changing the parameters, it crosses the curve $sn_R$ (where such an equilibrium disappears) and enters region 1. Then the system jumps to the other stable equilibrium.
(2) Gradual excitation of oscillations. When the parameters cross from region 1 into region 2 (supercritical Hopf bifurcation, $H_{super}$) oscillations are gradually excited around the unique equilibrium.
(3) Abrupt excitation of oscillations. When the parameters cross, for example, from region 3 into 2 or from region 6 into 5 or from region 8 (if the system is initially on the right equilibrium) into 7 (subcritical Hopf bifurcation, $H_{sub}$) we find this hysteresis phenomenon between equilibrium and limit cycle. In the aforementioned transitions the initial values of the variables at the equilibrium are within the range where the oscillations (abruptly excited) exist. This situation also occurs in the transition from region 14 into region 4 (saddle-node of equilibria, $sn_R$) provided the system was initially at the right equilibrium.
(4) Jump from an equilibrium to a distant limit cycle. When the parameters cross from region 16 into region 4 (saddle-node of equilibria, $sn_R$) the system, if it is initially in the right equilibrium, moves to an oscillatory regime after the equilibrium disappears. The difference from the previous case is that the initial values of the variables at the equilibrium lie generally beyond the range that they have in the new stable oscillations.

Let us comment on some events that may occur if the system is in an oscillatory regime and the parameters are changed.

• Gradual decay of oscillations. This phenomenon, reverse to the phenomenon of gradual excitation of oscillations, occurs for parameters crossing from region 2 into region 1.
• Abrupt termination of oscillations. This phenomenon, reverse to the phenomenon of abrupt excitation of oscillations, occurs, for example, for parameters crossing from region 10 into region 17 and from region 3 into region 1.
• Breaking up of oscillations in a homoclinic loop. This occurs when the parameters cross from region 16 into region 9, provided that for parameters from region 16 the system was in the oscillatory regime. As the parameters approach the bifurcation curve $H_L$, the amplitude increases and the oscillations change to relaxation type. On the other hand, note that other crossings of the curves $H_L$, $H_R$, $H_{LC}$ and $H_{UC}$ give rise to the appearance/disappearance of an unstable limit cycle.

In this system, the appearance of oscillations from a saddle-node loop (and then its reverse phenomenon, termination of oscillations in a saddle-node loop bifurcation) is not directly observable because the periodic orbits that emerge/disappear when crossing the curves $CSNH_R$ and $CSNH_L$ are unstable (see transitions from regions 20 and 23 into region 3). The presence of the unstable limit cycle may be detected by looking at the basin of attraction.

Note that the twelve different phase plane portraits that appear for cubic autocatalysis with decay (see Fig. 8.14 of [Gray & Scott, 1990]) are included among the thirty phase portraits that the enzyme system exhibits for $\alpha = 0.2$ and $\kappa = 0.1$.

7. Conclusions

We have shown how the local bifurcation theory may provide important analytical information about the organizing centers of the dynamical behavior of the five-parameter enzyme system considered throughout this work. Sometimes, as in this case, it is very useful to rewrite the system in a more convenient way for the application of the Bifurcation Theory tools.

The complete study of codimension-one, -two and -three bifurcations of equilibria (and the proof that there are no local bifurcations of higher codimension) indicates the presence of a very rich dynamical scenario for a planar system: for instance, the emergence of up to three periodic orbits from degenerate Hopf bifurcations and the presence of several degenerate codimension-two homoclinic connections (that are also related to periodic orbits).
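The abrupt (hard) excitation of oscillations described in Sec. 6 can be illustrated with a minimal sketch. It does not use the enzyme model: it assumes the radial part of a generic subcritical Hopf normal form with a stabilizing quintic term, $\dot r = \mu r + r^3 - r^5$, and the names `radial_rate` and `settle` are hypothetical helpers introduced only for this illustration. For $\mu < 0$ the origin is stable, and an unstable cycle separates its basin from that of a stable large-amplitude cycle.

```python
import math

def radial_rate(r, mu):
    # Radial part of an assumed subcritical Hopf normal form with quintic saturation.
    return mu * r + r**3 - r**5

def settle(r0, mu, t_end=200.0, h=0.01):
    # Integrate dr/dt with classical RK4 and return the final amplitude.
    r = r0
    for _ in range(int(round(t_end / h))):
        k1 = radial_rate(r, mu)
        k2 = radial_rate(r + 0.5 * h * k1, mu)
        k3 = radial_rate(r + 0.5 * h * k2, mu)
        k4 = radial_rate(r + h * k3, mu)
        r += h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return r

mu = -0.1
disc = math.sqrt(1.0 + 4.0 * mu)
r_unstable = math.sqrt((1.0 - disc) / 2.0)  # unstable cycle: basin boundary
r_stable = math.sqrt((1.0 + disc) / 2.0)    # stable large-amplitude cycle

small = settle(0.9 * r_unstable, mu)  # perturbation inside the unstable cycle
large = settle(1.1 * r_unstable, mu)  # perturbation outside the unstable cycle
print(small, large, r_stable)
```

A perturbation that stays inside the unstable cycle decays back to the equilibrium, while a slightly stronger one is captured by the stable cycle. This is the same qualitative mechanism as in region 3, under the stated normal-form assumption.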
Multiparametric Bifurcations in an Enzyme-Catalyzed Reaction Model 249

However, numerical methods are needed to complete the analysis. On the one hand, several analytical expressions are rather cumbersome, and their numerical evaluation (and/or the numerical continuation of the locus where a bifurcation occurs) is needed to extract all the information they contain.

On the other hand, the results on global connections (homoclinic orbits in this system) are only first-order approximations that need to be extended with the help of numerical methods. The information is very useful to guarantee the existence of such homoclinic connections as well as to help in their detection and continuation, as occurs in the case of the Bogdanov-Takens bifurcation.

A brute-force simulation strategy would provide very few results, because the narrow parameter intervals where the phenomena occur make them very difficult to find without previous analytical information. In contrast, an exhaustive description of the dynamical behavior this system exhibits has been carried out in our study. However, although the presented analysis is rather detailed, we cannot exclude the existence of closed curves related to limit cycle bifurcations away from the studied codimension-one, -two and -three points.

In the case of the Bogdanov-Takens bifurcation, the theoretical study of the corresponding unfolding provides information about many codimension-one and -two bifurcations, but when numerical methods extend these local results, new bifurcations that are theoretically unexpected may appear (this is the case of the cuspidal loop, CL, and of the lower concave homoclinic orbit, $H_{LC}$). These numerical results may open new research frontiers in the theoretical field, and an interesting feedback process may provide advances in both theoretical and numerical areas.

The presence of hysteresis behavior between equilibria and/or periodic orbits is one of the features that may be deduced from the results achieved. Other characteristics of excitable media are present (trigger mechanism, threshold phenomena, slow-fast motions, ...; see, for instance, [Murray, 2002, 2003]) and would be easy to find in the five-parameter space. We have not emphasized these topics for the sake of brevity.

The mathematical results we have obtained about the enzyme system provide a deep insight into the model and will be useful for biochemist-mathematicians to check its validity and limitations by comparing them with the reality it tries to model. For example, they have to evaluate whether some evolutionary factors can force the system to operate in the narrow domains where the complicated phase portraits occur (idea suggested by [Bazykin, 1998]).

Acknowledgments

This work has been partially supported by the Ministerio de Ciencia y Tecnologia, fondos FEDER, in the frame of the project BFM2001-2608, and by the Consejeria de Educacion de la Junta de Andalucia (TIC-0130). The authors thank A. R. Champneys and E. Gamero for their comments on a draft of this paper.

References

Algaba, A., Freire, E. & Gamero, E. [2003] "Computing simplest normal forms for the Takens-Bogdanov singularity," Qual. Th. Dyn. Syst. 3, 377-435.
Bazykin, A. D., Kuznetsov, Yu. A. & Khibnik, A. I. [1989] Bifurcation Portraits: Bifurcation Diagrams of Dynamical Systems on the Plane, Series in Mathematics and Cybernetics, Vol. 89 (Znanie, Moscow) (in Russian).
Bazykin, A. D. [1998] Nonlinear Dynamics of Interacting Populations, World Scientific Series on Nonlinear Science, Series A, Vol. 11 (World Scientific, Singapore).
Berezovskaya, F. S. & Khibnik, A. I. [1985] "Bifurcations of a dynamical second-order system with two zero eigenvalues and additional degeneracy," in Methods of Qualitative Theory of Differential Equations (Gorkii State University, Gorkii) (in Russian), pp. 128-138.
Beyn, W.-J. [1990] "The numerical computation of connecting orbits in dynamical systems," IMA J. Numer. Anal. 9, 379-405.
Champneys, A. R. & Kuznetsov, Yu. A. [1994] "Numerical detection and continuation of codimension-two homoclinic bifurcations," Int. J. Bifurcation and Chaos 4, 785-822.
Champneys, A. R., Kuznetsov, Yu. A. & Sandstede, B. [1996] "A numerical toolbox for homoclinic bifurcation analysis," Int. J. Bifurcation and Chaos 6, 867-888.
Dangelmayr, G. & Guckenheimer, J. [1987] "On a four parameter family of planar vector fields," Arch. Rat. Mech. Anal. 97, 321-352.
Doedel, E. J. & Kernevez, J. P. [1986] "AUTO: Software for continuation and bifurcation problems in ordinary differential equations," Applied Mathematics Report, California Institute of Technology.
Doedel, E. J., Keller, H. B. & Kernevez, J. P. [1991] "Analysis and control of bifurcation problems, Part I: Bifurcation in finite dimensions," Int. J. Bifurcation and Chaos 1, 493-520.

Doedel, E. J., Champneys, A. R., Fairgrieve, T. F., Kuznetsov, Yu. A., Sandstede, B. & Wang, X. [1998] "AUTO97: Continuation and bifurcation software for ordinary differential equations (with HomCont), User's Guide," Concordia University, Montreal, Canada.
Dumortier, F., Roussarie, R. & Sotomayor, J. [1987] "Generic 3-parameter families of vector fields on the plane, unfolding a singularity with nilpotent linear part," Ergod. Th. Dyn. Syst. 7, 375-413.
Dumortier, F., Roussarie, R. & Sotomayor, J. [1991] Generic 3-Parameter Families of Planar Vector Fields, Unfoldings of Saddle, Focus and Elliptic Singularities with Nilpotent Linear Parts, Lecture Notes in Mathematics, Vol. 1480 (Springer, Berlin).
Dumortier, F., Roussarie, R. & Sotomayor, J. [1994] "Elementary graphics of cyclicity 1 and 2," Nonlinearity 7, 1001-1043.
Dumortier, F., Roussarie, R. & Sotomayor, J. [1997] "Bifurcations of cuspidal loops," Nonlinearity 10, 1369-1408.
Fernandez-Sanchez, F., Freire, E., Pizarro, L. & Rodriguez-Luis, A. J. [1996] "Analytical and numerical study of a van der Pol-Duffing oscillator," in NDES '96: Fourth Int. Workshop on Nonlinear Dynamics of Electronic Systems (Centro Nacional de Microelectronica, Sevilla), pp. 321-326.
Freire, E., Gamero, E. & Ponce, E. [1989] "An algorithm for symbolic computation of Hopf bifurcation," in Computers and Mathematics, eds. Kaltofen, E. & Watt, S. M. (Springer, NY), pp. 109-118.
Freire, E., Pizarro, L. & Rodriguez-Luis, A. J. [1999a] "Examples of non-degenerate and degenerate cuspidal loops in planar systems," Dyn. Stab. Syst. 14, 129-161.
Freire, E., Pizarro, L. & Rodriguez-Luis, A. J. [1999b] "Numerical continuation of degenerate homoclinic orbits in planar systems," IMA J. Numer. Anal. 19, 51-75.
Freire, E., Pizarro, L. & Rodriguez-Luis, A. J. [2000] "Numerical continuation of homoclinic orbits to non-hyperbolic equilibria in planar systems," Nonlin. Dyn. 23, 353-375.
Friedman, M. J. & Doedel, E. J. [1993] "Computational methods for global analysis of homoclinic and heteroclinic orbits: A case study," J. Dyn. Diff. Eqs. 5, 37-57.
Gamero, E., Freire, E. & Ponce, E. [1991] "On the normal forms for planar systems with nilpotent linear parts," in Bifurcation and Chaos: Analysis, Algorithms, Applications, eds. Seydel, R., Schneider, F. W., Küpper, T. & Troger, H., International Series of Numerical Mathematics, Vol. 97 (Birkhäuser, Basel), pp. 123-127.
Golubitsky, M. & Schaeffer, D. G. [1985] Singularities and Groups in Bifurcation Theory, Vol. I, Applied Mathematics Science Series, Vol. 51 (Springer, Berlin).
Gray, P. & Scott, S. K. [1990] Chemical Oscillations and Instabilities. Non-Linear Chemical Kinetics, International Series of Monographs on Chemistry, Vol. 21 (Clarendon Press, Oxford).
Guckenheimer, J. [1986a] "Multiple bifurcation problems for chemical reactors," Physica D 20, 1-20.
Guckenheimer, J. [1986b] Global Bifurcations in Simple Models of a Chemical Reactor, Lectures in Applied Mathematics, Vol. 24, pp. 163-174.
Guckenheimer, J. & Kim, S. [1992] "Dstool: A dynamical system toolkit with an interactive graphical interface, User's Guide," Center for Applied Mathematics, Cornell University, Ithaca, NY.
Guckenheimer, J. & Worfolk, P. [1993] "Dynamical systems: some computational problems," in Bifurcations and Periodic Orbits of Vector Fields, ed. Schlomiuk, D., NATO ASI Series, Series C, Vol. 408 (Kluwer, Dordrecht), pp. 241-277.
Guckenheimer, J. & Holmes, P. J. [1997] Nonlinear Oscillations, Dynamical Systems, and Bifurcations of Vector Fields, Applied Mathematical Science Series, Vol. 42 (Springer, Berlin).
Hassard, B. & Jiang, K. [1992] "Unfolding a point of degenerate Hopf bifurcation in an enzyme-catalyzed reaction model," SIAM J. Math. Anal. 23, 1291-1304.
Hassard, B. & Jiang, K. [1993] "Degenerate Hopf bifurcation and isolas of periodic solutions in an enzyme-catalyzed reaction model," J. Math. Anal. Appl. 177, 170-189.
Joyal, P. [1988] "Generalized Hopf bifurcation and its dual generalized homoclinic bifurcation," SIAM J. Appl. Math. 48, 481-496.
Kernevez, J. P., Joly, G., Duban, M. C., Bunow, B. & Thomas, D. [1979] "Hysteresis, oscillations, and pattern formation in realistic immobilized enzyme systems," J. Math. Biol. 7, 41-56.
Kernevez, J. P., Doedel, E., Duban, M. C., Hervagault, J. F., Joly, G. & Thomas, D. [1983] "Spatio-temporal organization in immobilized enzyme systems," in Rhythms in Biology and Other Fields: Deterministic and Stochastic Approaches, eds. Demongeot, J. & Le Breton, A., Lecture Notes in Biomathematics, Vol. 49 (Springer, Berlin), pp. 50-70.
Kernevez, J. P., Doedel, E. & Thomas, D. [1985] "Mathematical modeling of immobilized enzyme systems," Biomed. Biochim. Acta 44(6), 993-1003.
Kuznetsov, Yu. A. [1998] Elements of Applied Bifurcation Theory, Applied Mathematical Science Series, Vol. 112 (Springer, Berlin).
Medved, M. [1985] "The unfoldings of a germ of vector fields in the plane with a singularity of codimension 3," Czech. Math. J. 35, 1-42.
Murray, J. D. [1981a] "On pattern formation mechanism for lepidopteran wing pattern and mammalian coat markings," Phil. Trans. Roy. Soc. B 295, 473-496.

Murray, J. D. [1981b] "A pre-pattern formation mechanism for animal coat markings," J. Theor. Biol. 88, 161-199.
Murray, J. D. [2002] Mathematical Biology. I: An Introduction, Interdisciplinary Applied Mathematics, Vol. 17 (Springer, Berlin).
Murray, J. D. [2003] Mathematical Biology. II: Spatial Models and Biomedical Applications, Interdisciplinary Applied Mathematics, Vol. 18 (Springer, Berlin).
Nozdrachova, V. [1982] "Bifurcation of a noncoarse separatrix loop," Diff. Eqs. 18, 1098-1104.
Rheinboldt, W. C. [1986] Numerical Analysis of Parametrized Nonlinear Equations, The University of Arkansas Lecture Notes in the Mathematical Science, Vol. 7 (John Wiley, NY).
Rodriguez-Luis, A. J., Freire, E. & Ponce, E. [1990] "A method for homoclinic and heteroclinic continuation in two and three dimensions," in Continuation and Bifurcations: Numerical Techniques and Applications, eds. Roose, D., de Dier, B. & Spence, A., NATO ASI Series, Series C, Vol. 313 (Kluwer, Dordrecht), pp. 197-210.
Schecter, S. [1987] "The saddle-node separatrix-loop bifurcation," SIAM J. Math. Anal. 18, 1142-1156.
Takens, F. [1973] "Unfoldings of certain singularities of vectorfields: Generalized Hopf bifurcations," J. Diff. Eqs. 14, 476-493.
Thomas, D. [1975] "Artificial enzyme membranes, transport, memory, and oscillatory phenomena," in Analysis and Control of Immobilized Enzyme Systems, eds. Thomas, D. & Kernevez, J. P. (Springer, Berlin), pp. 115-150.
Wiggins, S. [2003] Introduction to Applied Nonlinear Dynamical Systems and Chaos, Texts in Applied Mathematics, Vol. 2 (Springer, Berlin).
STRAIGHTFORWARD COMPUTATION OF
SPATIAL EQUILIBRIA OF GEOMETRICALLY
EXACT COSSERAT RODS
T. J. HEALEY
Theoretical & Applied Mechanics and Center for Applied Mathematics,
Cornell University, Ithaca, NY 14850, USA
P. G. MEHTA*
Center for Applied Mathematics,
Cornell University, Ithaca, NY 14850, USA

Received July 9, 2004; Revised July 28, 2004

In this paper, we present a well posed "force" based formulation for nonlinearly elastic Cosserat
rods with general boundary conditions enabling straightforward, efficient computation of spatial
equilibria. We illustrate the ease and utility of our approach in four example problems, each
exhibiting large spatial buckling, employing the path-following software AUTO.

Keywords: Elastic Cosserat rods; geometrically exact; computation of spatial equilibria.

1. Introduction

In this paper we consider the problem of computation of spatial equilibria of nonlinear elastic Cosserat rods, cf. [Antman, 1995]. At first glance this appears innocuous: merely the solution of a nonlinear two-point boundary value problem is required. However, in the special Cosserat theory, which we consider here, and which contains the classical Kirchhoff theory [Love, 1934] as a special case, the kinematical description of the rod requires the determination of the rotation field of the cross-sections. Herein lies the main difficulty: use of a standard (two-point boundary-value) solver with Newton type iteration will typically lead to "drift" in the rotation field. Namely, the rotations belong to a set of SO(3)-valued mappings, which is not a linear space. In the absence of explicit constraint equations, this is generally at odds with an additive iteration scheme. In this paper, we present a consistent formulation of the rod equations in the setting of a linear space, free of algebraic constraints, enabling the computation of equilibria via standard numerical techniques for two-point boundary value problems.

Simo and Vu-Quoc [1986] proposed a solution algorithm for statical Cosserat rod problems featuring a multiplicative updating procedure, in essence a Newton solver on the differentiable manifold. There, the rotations are parametrized via unit quaternions (Euler parameters), and the incremental rotations (coming from the solution of a linearized problem) are efficiently exponentiated via the so-called Rodrigues formula. Their approach is geometrically natural and correct, if a bit formidable, and their reported numerical results are certainly very good. However, from a practical point of view, their methodology is not compatible with standard numerical packages, and consequently, it has not been widely adopted.

More recently the numerical implementation of a Hamiltonian formulation for the statical rod

*Current address: United Technologies Research Center, 411 Silver Lane, East Hartford, CT 06018, USA.

254 T. J. Healey & P. G. Mehta

equations (treating the arc-length of the rod as a "time-like" variable) has been proposed [Li & Maddocks, 1996; Dichmann et al., 1996]. The rotation field is explicitly parametrized via unit quaternions or Euler parameters. Of course, this introduces a different nonlinear space: a set of $S^3$-valued mappings (instead of SO(3)-valued mappings), where $S^3$ denotes the unit sphere in $\mathbb{R}^4$. A four-vector field, conjugate to the quaternion field, arises in lieu of the couple field in [Li & Maddocks, 1996; Dichmann et al., 1996]. In particular, this leads to an extra differential equation in the system replacing the balance-of-moments equation. The role of this extra equation and, in particular, the consistent assignment of boundary conditions within a general class of problems is not addressed.

In this work, we propose a "force" formulation, based directly upon the convective form of the balance laws (force and moment balance). We too explicitly employ unit quaternions for the rotation field. This formulation reveals an inconsistency in the assignment of boundary conditions: For example, the prescription of a boundary rotation requires the assignment of the four components of the quaternion, whereas a prescribed couple at the boundary has only three components. Of course in the former case the four components are not independent: the quaternion field must have unit length, and a consistent formulation would generally require the inclusion of that algebraic constraint. Instead, we exploit the fact that the "unit-constraint equation" is actually a conservation law, and we eliminate its explicit appearance from the field equations via an approach similar to that employed in the proof of the Liapunov Center Theorem, sometimes referred to as "vertical Hopf bifurcation", e.g. cf. [Ambrosetti & Prodi, 1993]. We note that this same type of approach has been used successfully in the numerical computation of periodic solutions of the three-body problem in [Doedel, 2000; Doedel et al., 2003], and more generally, for periodic solutions of conservative and Hamiltonian systems in [Munoz-Almaraz et al., 2003]. For the convenience of the reader, we now summarize the well-known theorem underlying the method:

Consider a system of differential equations of the form

$$\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}), \tag{1}$$

where $\mathbf{f}: \mathbb{R}^n \to \mathbb{R}^n$ is sufficiently smooth. Further we presume the existence of a real-valued conservation law. Specifically, for some differentiable function $E: \mathbb{R}^n \to \mathbb{R}$, we have

$$E(\mathbf{x}) = C \tag{2}$$

on all solutions of (1), where $C$ is a constant, viz.

$$\frac{d}{dt} E(\mathbf{x}) = \langle \nabla E(\mathbf{x}), \dot{\mathbf{x}} \rangle = \langle \nabla E(\mathbf{x}), \mathbf{f}(\mathbf{x}) \rangle = 0, \tag{3}$$

where $\langle \cdot, \cdot \rangle$ denotes the standard Euclidean inner product on $\mathbb{R}^n$. The following is the cornerstone of our approach:

Proposition 1.1. Consider the augmented system

$$\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x}) + \mu \nabla E(\mathbf{x}), \quad a < t < b, \tag{4}$$

where $\mu \in \mathbb{R}$ is an unspecified parameter. Then any solution of (4) satisfying the end conditions

$$E(\mathbf{x}(a)) = E(\mathbf{x}(b)), \tag{5}$$

is also a solution of (1) and (2).

The proof of Proposition 1.1 is simple: Substituting (4) into the first equality in (3), while using the last equality in (3), we see that

$$\frac{dE}{dt} = \mu \|\nabla E\|^2. \tag{6}$$

Integrating (6) over $(a, b)$ and using (5), we conclude that $\mu = 0$ (provided $\nabla E$ does not vanish identically along the solution), i.e. (2) holds and (4) coincides with (1).

From a computational point of view, one advantage of treating (4), (5) is clear: The accuracy of any reasonable method employed in solving (4), (5) (supplemented by appropriate initial or boundary conditions) is carried over automatically to (2). The free parameter $\mu$ is simply a "dummy" unknown in (4); its computed value is an extremely small number in practice. On the other hand, even a highly accurate solver for (1) will be inconsistent, in general, with a conservation law (2). Thus, working directly with (1) requires either that (2) be carried along as an algebraic constraint or that a discretization scheme be found for (1) that automatically fulfills the conservation law (2); the latter approach may be neither convenient nor practical. Moreover, as discussed below, the presence of the free parameter $\mu$ in the rod equations enables a consistent prescription of boundary conditions for a general class of boundary value problems.
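The role of the dummy parameter in (4), (5) can be illustrated with a minimal sketch. It is not the authors' AUTO computation: it assumes a planar pendulum as the conservative system (1), integrates the augmented system with a deliberately coarse classical RK4 scheme, and finds $\mu$ by a secant iteration on the end condition (5). All function names (`augmented`, `integrate`, `solve_mu`) are hypothetical and introduced only for this example.

```python
import math

def f(x):
    # Assumed test system (1): pendulum, theta' = p, p' = -sin(theta).
    theta, p = x
    return (p, -math.sin(theta))

def energy(x):
    # Conserved quantity E of the pendulum, playing the role of (2).
    theta, p = x
    return 0.5 * p * p - math.cos(theta)

def grad_energy(x):
    theta, p = x
    return (math.sin(theta), p)

def augmented(x, mu):
    # Right-hand side of the augmented system (4): f(x) + mu * grad E(x).
    return tuple(fi + mu * gi for fi, gi in zip(f(x), grad_energy(x)))

def integrate(x0, mu, t_end=10.0, steps=50):
    # Classical RK4 with a coarse step, so the plain scheme visibly drifts in E.
    h = t_end / steps
    x = tuple(x0)
    for _ in range(steps):
        k1 = augmented(x, mu)
        k2 = augmented(tuple(xi + 0.5 * h * ki for xi, ki in zip(x, k1)), mu)
        k3 = augmented(tuple(xi + 0.5 * h * ki for xi, ki in zip(x, k2)), mu)
        k4 = augmented(tuple(xi + h * ki for xi, ki in zip(x, k3)), mu)
        x = tuple(xi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                  for xi, a, b, c, d in zip(x, k1, k2, k3, k4))
    return x

def solve_mu(x0, mu0=-0.01, mu1=0.01, tol=1e-13, max_iter=50):
    # Secant iteration on g(mu) = E(x(b)) - E(x(a)), i.e. the end condition (5).
    e0 = energy(x0)
    g = lambda mu: energy(integrate(x0, mu)) - e0
    g0, g1 = g(mu0), g(mu1)
    for _ in range(max_iter):
        if abs(g1) < tol or g1 == g0:
            break
        mu0, mu1, g0 = mu1, mu1 - g1 * (mu1 - mu0) / (g1 - g0), g1
        g1 = g(mu1)
    return mu1

x0 = (1.0, 0.0)
mu = solve_mu(x0)
drift_plain = energy(integrate(x0, 0.0)) - energy(x0)  # solving (1) directly
drift_aug = energy(integrate(x0, mu)) - energy(x0)     # solving (4) with (5)
print(mu, drift_plain, drift_aug)
```

In this sketch the computed $\mu$ is tiny but nonzero: it absorbs the discretization error, so the discrete solution satisfies the end condition (5) to the tolerance of the iteration, whereas the same scheme applied directly to (1) shows a visible energy drift.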
Computation of Spatial Equilibria of Cosserat Rods 255

The outline of the paper is as follows: In Sec. 2, we present the equilibrium equations solely in terms of force and moment fields via a complementary-energy formulation. These, in turn, are coupled to kinematical equations for the displacements and rotations. Here we see plainly the difficulty in working directly with rotation matrices: several constraint equations arise. We eliminate all but one constraint equation in the usual way via the Euler parameters. We then eliminate the unit constraint equation via Proposition 1.1. Here we see another advantage of treating (4), (5): In rod problems, the multiplier $\mu$ provides an extra unknown, which enables the consistent prescription of boundary conditions. For example, it turns out that for a prescribed rotation at an end-point, one of the quantities (5) is necessarily equal to unity; when a couple is prescribed at a boundary point, we impose one of (5) to be unity as a boundary condition at that location. In this way we always have the same number of boundary conditions appropriate for the number of unknowns in the field equations. This is true for a general class of "mixed" boundary conditions as well. Thus we always have well-posed two-point boundary value problems (in the absence of other symmetries). We also point out another advantage of our force-based formulation (over displacement-based formulations, e.g. [Simo & Vu-Quoc, 1986]), viz. the various classical constraints, such as inextensibility and/or unshearability, are readily incorporated with only minor modifications and without the need of Lagrange multipliers. In Sec. 3 we present four numerical examples of large, spatial buckling of elastic rods using the software package AUTO [Doedel, 2000], demonstrating the ease and utility of our formulation. In the first example we consider large lateral buckling of an end-loaded cantilevered rod in the shape of a thin ruler. Next we obtain large helical buckled states of a compressed hemitropic rod in the absence of external twist, cf. [Papadopoulos, 1999; Healey, 2002]. Third we consider a boundary value problem governing the spatial equilibria of a finite rod with intrinsic curvature (cf. [Domokos & Healey, 2005] for a systematic study). In particular, we obtain large helical solutions and so-called helical "perversions" [McMillen & Goriely, 2002] bifurcating from the straight rod in tension. Finally we consider again a long thin "ruler" with one end clamped while the other end is twisted via a hinged connection, i.e. the orientation of the cross-section is only partially prescribed. The second and third examples illustrate the utility of our approach in the presence of "mixed" boundary conditions.

2. Formulation

Let $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ denote a fixed, right-handed, orthonormal basis for $\mathbb{E}^3$. We consider a straight rod of unit length occupying a reference configuration parallel to $\mathbf{e}_3$. Let $s \in [0, 1]$ denote the arclength coordinate (of the centerline) in the undeformed rod, and let $\mathbf{r}(s)$ denote the position vector (with respect to some fixed origin) of the material point originally at "s" in the reference configuration. We let $\mathbf{R}(s)$ denote the rotation of the cross-section spanned by $\{\mathbf{e}_1, \mathbf{e}_2\}$ at "s" in the undeformed rod. The first two unit vectors of the orthonormal field defined by

$$\mathbf{d}_i(s) = \mathbf{R}(s)\mathbf{e}_i, \quad i = 1, 2, 3, \tag{7}$$

are called directors in the special Cosserat theory, which we employ here. The deformed configuration of the rod is uniquely specified by the fields $\mathbf{r}(s)$ and $\mathbf{R}(s)$.

Differentiation of (7) yields

$$\mathbf{d}_i' = \mathbf{R}'\mathbf{R}^T \mathbf{d}_i, \quad i = 1, 2, 3. \tag{8}$$

Since the tensor field

$$\mathbf{K} = \mathbf{R}'\mathbf{R}^T \tag{9}$$

is skew-symmetric, there is a unique vector field $\boldsymbol{\kappa}$ such that

$$\mathbf{d}_i' = \boldsymbol{\kappa} \times \mathbf{d}_i, \quad i = 1, 2, 3, \tag{10}$$

i.e. $\boldsymbol{\kappa}$ is the axial vector of $\mathbf{K}$. We write

$$\mathbf{r}' = \nu_i \mathbf{d}_i, \quad \text{and} \quad \boldsymbol{\kappa} = \kappa_i \mathbf{d}_i. \tag{11}$$

The numbers $\nu_i, \kappa_i$ are the "strains" in this theory, cf. [Antman, 1995]; $\nu_1, \nu_2$ are "shears", $\nu_3$ is the "stretch", $\kappa_1, \kappa_2$ are "curvatures", and $\kappa_3$ is the "twist".

We let $\mathbf{n}(s)$ and $\mathbf{m}(s)$ denote the internal contact force and internal contact couple, respectively, acting on the cross-section originally at "s" in the reference configuration. We write

$$\mathbf{n} = n_i \mathbf{d}_i, \quad \text{and} \quad \mathbf{m} = m_i \mathbf{d}_i. \tag{12}$$

Recall that the $n_i$ and $m_i$, $i = 1, 2, 3$, are called forces and moments, respectively, cf. [Antman, 1995];

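$n_1, n_2$ are "shear forces", $n_3$ is the "axial force", $m_1, m_2$ are "bending moments", and $m_3$ is the "torque" or "twisting moment". For a hyperelastic rod, we assume the existence of a twice-differentiable, scalar-valued stored energy function, $W(\nu_1, \nu_2, \nu_3, \kappa_1, \kappa_2, \kappa_3, s)$, such that

$$n_j = \frac{\partial W}{\partial \nu_j} \quad \text{and} \quad m_j = \frac{\partial W}{\partial \kappa_j}, \quad j = 1, 2, 3. \tag{13}$$

If we define the triples $\mathsf{n} = (n_1, n_2, n_3)$, $\mathsf{m} = (m_1, m_2, m_3)$, $\mathsf{v} = (\nu_1, \nu_2, \nu_3)$, and $\mathsf{k} = (\kappa_1, \kappa_2, \kappa_3)$, and define $W(\mathsf{v}, \mathsf{k}, s) = W(\nu_1, \nu_2, \nu_3, \kappa_1, \kappa_2, \kappa_3, s)$, then (13) takes the compact form

$$\mathsf{n} = \frac{\partial W}{\partial \mathsf{v}} \quad \text{and} \quad \mathsf{m} = \frac{\partial W}{\partial \mathsf{k}}. \tag{14}$$

We make the physically reasonable assumption that the Hessian $D^2 W(\cdot)$ is a positive-definite matrix for each of its arguments on $\mathbb{R}^2 \times (0, \infty) \times \mathbb{R}^3$. Consequently, there is a complementary energy function (the Legendre transform of $W$), denoted by $\Gamma(\mathsf{n}, \mathsf{m}, s)$, such that

$$\mathsf{v} = \frac{\partial \Gamma}{\partial \mathsf{n}} \quad \text{and} \quad \mathsf{k} = \frac{\partial \Gamma}{\partial \mathsf{m}}. \tag{15}$$

Next we assume that the rod is subjected to a distributed, external body force per unit undeformed length, $\mathbf{b}(s)$, and a distributed, external body couple per unit undeformed length, $\mathbf{g}(s)$. Then the well-known local forms of balance of forces and moments are given by (cf. [Antman, 1995])

$$\mathbf{n}' + \mathbf{b} = \mathbf{0}, \tag{16}$$

and

$$\mathbf{m}' + \mathbf{r}' \times \mathbf{n} + \mathbf{g} = \mathbf{0}, \tag{17}$$

respectively.

Finally, we must specify boundary conditions at the two ends of the rod, $s = 0$ and $s = 1$. For boundary conditions of place, we specify the configuration $(\mathbf{r}, \mathbf{R})$ at an endpoint, while boundary conditions of force entail the specification of $(\mathbf{n}, \mathbf{m})$ at a boundary point. Of course, various "mixed" combinations can also be imposed, e.g. $(\mathbf{r}, \mathbf{m})$ could be specified at an endpoint.

3. Numerical Implementation

We first rewrite Eqs. (16) and (17) with respect to the convected basis $\{\mathbf{d}_1, \mathbf{d}_2, \mathbf{d}_3\}$, employing (10)-(12):

$$\mathsf{n}' + \mathsf{k} \times \mathsf{n} + \mathsf{b} = \mathsf{0}, \tag{18}$$

$$\mathsf{m}' + \mathsf{k} \times \mathsf{m} + \mathsf{v} \times \mathsf{n} + \mathsf{g} = \mathsf{0}, \tag{19}$$

where $\mathbf{b} = b_i \mathbf{d}_i$ and $\mathsf{b} = (b_1, b_2, b_3)$, etc., as in (14). On the other hand, we write $\mathbf{r}$ and $\mathbf{R}$ with respect to the fixed basis:

$$\mathbf{r} = r_i \mathbf{e}_i, \quad \mathsf{r} = (r_1, r_2, r_3), \tag{20}$$

$$\mathbf{R} = R_{ij}\, \mathbf{e}_i \otimes \mathbf{e}_j, \quad \mathsf{R} = \begin{bmatrix} R_{11} & R_{12} & R_{13} \\ R_{21} & R_{22} & R_{23} \\ R_{31} & R_{32} & R_{33} \end{bmatrix}. \tag{21}$$

Then (7), (9) and (11) lead to

$$\mathsf{r}' = \mathsf{R}\mathsf{v}, \tag{22}$$

$$\mathsf{R}' = \mathsf{R}\mathsf{K}, \tag{23}$$

where $\mathsf{K}$ is uniquely defined by $\mathrm{axial}(\mathsf{K}) = \mathsf{k}$.

Next we employ (15) in (18), (19), (22) and (23), to obtain the following system of first-order ODEs:

$$\mathsf{n}' = \mathsf{n} \times \frac{\partial \Gamma}{\partial \mathsf{m}} - \mathsf{b}, \tag{24}$$

$$\mathsf{m}' = \mathsf{m} \times \frac{\partial \Gamma}{\partial \mathsf{m}} + \mathsf{n} \times \frac{\partial \Gamma}{\partial \mathsf{n}} - \mathsf{g}, \tag{25}$$

$$\mathsf{r}' = \mathsf{R} \frac{\partial \Gamma}{\partial \mathsf{n}}, \tag{26}$$

$$\mathsf{R}' = \mathsf{R}\mathsf{K}(\mathsf{n}, \mathsf{m}), \tag{27}$$

where $\mathsf{K}(\mathsf{n}, \mathsf{m})$ denotes the skew-matrix-valued function uniquely defined by

$$\mathrm{axial}(\mathsf{K}(\mathsf{n}, \mathsf{m})) = \frac{\partial \Gamma}{\partial \mathsf{m}}. \tag{28}$$

The main difficulty with the numerical implementation of formulation (24)-(27) is that $\mathsf{R}(s) \in SO(3)$, which is not a linear space, i.e.

$$\det(\mathsf{R}(s)) = 1 \quad \text{and} \quad \mathsf{R}^T\mathsf{R} = \mathsf{I} \tag{29}$$

must be imposed as constraints.

In an effort to reduce the number of constraints, we look to a well-known, singularity-free parametrization of SO(3). For any rotation $\mathsf{R}$, Euler's theorem asserts the existence of an axis of rotation,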
Computation of Spatial Equilibria of Cosserat Rods 257

which corresponds to an eigenvector $\mathbf{a}$ satisfying

    $\mathbf{R}\mathbf{a} = \mathbf{a}$, $|\mathbf{a}| = 1$.    (30)

Let $\theta \in \mathbb{R} \pmod{2\pi}$ denote the counterclockwise rotation angle (according to the right-hand rule about $\mathbf{a}$) also given by Euler's theorem. We then introduce the quantities

    $q_0 = \cos\left(\dfrac{\theta}{2}\right)$, $(q_1,q_2,q_3) = \sin\left(\dfrac{\theta}{2}\right)\mathsf{a}$,    (31)

and the four-vector

    $\mathsf{q} = (q_0,q_1,q_2,q_3)$.    (32)

We then observe

    $\langle \mathsf{q},\mathsf{q}\rangle = q_0^2 + q_1^2 + q_2^2 + q_3^2 = 1$,    (33)

viz. $\mathsf{q}$ is a unit quaternion; the scalars $q_0, q_1, q_2, q_3$ are typically called the Euler parameters. Here $\langle\cdot,\cdot\rangle$ denotes the standard Euclidean inner-product on $\mathbb{R}^4$. It can then be shown [Darboux, 1972] that

    $\mathsf{R} = \mathsf{R}(\mathsf{q}) = 2\begin{bmatrix} q_0^2+q_1^2-\tfrac12 & q_1q_2-q_0q_3 & q_1q_3+q_0q_2 \\ q_1q_2+q_0q_3 & q_0^2+q_2^2-\tfrac12 & q_2q_3-q_0q_1 \\ q_1q_3-q_0q_2 & q_2q_3+q_0q_1 & q_0^2+q_3^2-\tfrac12 \end{bmatrix}$    (34)

and

    $\mathsf{q}' = \mathsf{A}(\mathsf{q})\mathsf{k}$,    (35)

where

    $\mathsf{A}(\mathsf{q}) = \dfrac12\begin{bmatrix} -q_1 & -q_2 & -q_3 \\ q_0 & -q_3 & q_2 \\ q_3 & q_0 & -q_1 \\ -q_2 & q_1 & q_0 \end{bmatrix}$.    (36)

We can now use (15), (34) and (35) to replace (26) and (27) in our earlier formulation:

    $\mathsf{r}' = \mathsf{R}(\mathsf{q})\,\dfrac{\partial T}{\partial \mathsf{n}}(\mathsf{n},\mathsf{m})$,    (37)
    $\mathsf{q}' = \mathsf{A}(\mathsf{q})\,\dfrac{\partial T}{\partial \mathsf{m}}(\mathsf{n},\mathsf{m})$.    (38)

Our new system comprises (24), (25), (37) and (38), subject to the constraint (33).
In spite of the drastic reduction in the number of constraints (cf. (29) versus (33)), system (24), (25), (37) and (38) possesses a slight inconsistency that is revealed through the prescription of boundary conditions. For example, for a placement boundary condition, $(\mathsf{r}, \mathsf{q})$ is prescribed at an endpoint, which entails seven quantities. On the other hand, a force boundary condition entails only six specified quantities, e.g. $(\mathsf{n}, \mathsf{m})$. Of course, the four components of $\mathsf{q}$ in the former must also satisfy (33), which, in some sense, reconciles the actual count. At any rate, we would like to eliminate (33) altogether, which would enable the use of standard two-point boundary-value problem solvers.
Observe from (36) that

    $\mathsf{A}^T(\mathsf{q})\mathsf{q} = \mathsf{0}$ for all $\mathsf{q}$.    (39)

In view of (35) and (39), we find that

    $\dfrac{d}{ds}\langle \mathsf{q},\mathsf{q}\rangle = 2\langle \mathsf{q},\mathsf{q}'\rangle = 2\langle \mathsf{A}^T(\mathsf{q})\mathsf{q},\mathsf{k}\rangle = 0$,

i.e. $E = \langle \mathsf{q}, \mathsf{q}\rangle = 1$ is a conservation law for (35). The following is a special case of Proposition 1.1:

Proposition 3.1. Consider the augmented system (24), (25), (37), with (38) replaced by

    $\mathsf{q}' = \mathsf{A}(\mathsf{q})\,\dfrac{\partial T}{\partial \mathsf{m}}(\mathsf{n},\mathsf{m}) + \mu\mathsf{q}$,    (40)

where $\mu \in \mathbb{R}$ is a free parameter, in the absence of (33). If (33) is satisfied at the endpoints, $s = 0$ and $s = 1$, then any solution of the augmented system, (24), (25), (37) and (40), is also a solution of the algebraic-differential system (24), (25), (33), (37) and (38).

Proposition 3.1 has important practical ramifications for numerical implementation: The augmented system (24), (25), (37), and (40) can be solved without explicitly enforcing (33) pointwise, provided that the latter is satisfied at the endpoints. We accommodate this as follows: For boundary conditions of placement, for which the orientation of the cross-section is prescribed, (33) is naturally satisfied at the boundary points. For force boundary conditions, (33) must be prescribed as a boundary condition as well. In either case, we always end up with seven boundary conditions at each boundary point, which agrees with the fourteen unknowns inherent in the augmented system, viz. $\mathsf{n}$, $\mathsf{m}$, $\mathsf{r}$, $\mathsf{q}$ and $\mu$. This is also true for (well-posed) mixed boundary value problems, as demonstrated in the next section.
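The quaternion formulas (31), (34), (36) and the identity (39) are easy to check numerically. The following is a minimal sketch in plain Python; the function names (`rotation_from_quaternion`, `kinematic_matrix`, and so on) and the sample axis and angle are illustrative choices, not part of the paper or of AUTO.

```python
import math

def quaternion_from_axis_angle(axis, theta):
    """Euler parameters of Eq. (31); `axis` is assumed to be a unit vector."""
    s = math.sin(theta / 2)
    return (math.cos(theta / 2), s * axis[0], s * axis[1], s * axis[2])

def rotation_from_quaternion(q):
    """3x3 rotation matrix R(q) of Eq. (34)."""
    q0, q1, q2, q3 = q
    return [
        [2 * (q0 * q0 + q1 * q1) - 1, 2 * (q1 * q2 - q0 * q3), 2 * (q1 * q3 + q0 * q2)],
        [2 * (q1 * q2 + q0 * q3), 2 * (q0 * q0 + q2 * q2) - 1, 2 * (q2 * q3 - q0 * q1)],
        [2 * (q1 * q3 - q0 * q2), 2 * (q2 * q3 + q0 * q1), 2 * (q0 * q0 + q3 * q3) - 1],
    ]

def kinematic_matrix(q):
    """4x3 matrix A(q) of Eq. (36), so that q' = A(q) k."""
    q0, q1, q2, q3 = q
    return [
        [-0.5 * q1, -0.5 * q2, -0.5 * q3],
        [0.5 * q0, -0.5 * q3, 0.5 * q2],
        [0.5 * q3, 0.5 * q0, -0.5 * q1],
        [-0.5 * q2, 0.5 * q1, 0.5 * q0],
    ]

def at_times_q(q):
    """The 3-vector A(q)^T q, which Eq. (39) says is zero for every q."""
    A = kinematic_matrix(q)
    return [sum(A[i][j] * q[i] for i in range(4)) for j in range(3)]

def rt_times_r(R):
    """The product R^T R, which should be the identity, cf. Eq. (29)."""
    return [[sum(R[k][i] * R[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

axis = (1 / 3, 2 / 3, 2 / 3)                 # unit vector: 1/9 + 4/9 + 4/9 = 1
q = quaternion_from_axis_angle(axis, 0.7)    # arbitrary sample angle
R = rotation_from_quaternion(q)

print("A(q)^T q =", at_times_q(q))
print("R^T R    =", rt_times_r(R))
```

Note that `at_times_q` vanishes for any four-vector, unit or not, which is what makes (39) usable in the augmented system, where (33) is not enforced pointwise.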

Another major advantage of our "force-based" approach is that we can easily accommodate the common constrained rod theories automatically, without Lagrange multipliers. We summarize the three most common cases. Recall that a rod is said to be inextensible if $\nu_3 = 1$ is imposed as a constraint. A rod is said to be unshearable if $\nu_1 = \nu_2 = 0$ are imposed as constraints. Of course, all three of these may be imposed, in which case the rod is said to be inextensible and unshearable. In each of these cases, we simply replace the vector-valued function $\mathsf{v} = (\partial T/\partial \mathsf{n})(\mathsf{n}, \mathsf{m})$ in (25) and (37) with the following expression:

    Inextensible: $\mathsf{v} = \left(\dfrac{\partial T}{\partial n_1}, \dfrac{\partial T}{\partial n_2}, 1\right)$.    (41)
    Unshearable: $\mathsf{v} = \left(0, 0, \dfrac{\partial T}{\partial n_3}\right)$.    (42)
    Inextensible, unshearable: $\mathsf{v} = (0,0,1)$.    (43)

4. Examples
In this section, we apply the above framework to obtain computational bifurcation and continuation results for four different example problems. We use the software package AUTO [Doedel, 2000] to carry out the computations. AUTO has the capability to continue and locate bifurcations of the solutions of general two-point boundary value problems (BVP). A well-posed BVP (with a correct count of boundary conditions) is discretized in AUTO using the method of orthogonal collocation. A solution curve for the resulting square system of algebraic equations is continued in a single parameter using the method of arc length continuation employing Newton iteration [Keller, 1977]. The discretized state, say $x$, and the parameter $\lambda$ are parametrized as $(x(s),\lambda(s))$ where $s$ denotes the arc length continuation parameter. For this purpose, an initial solution $x(0)$ for some given parameter value $\lambda(0)$ is needed for the continuation to begin.
In the examples presented below, there is always an "extra" boundary condition relative to the number of (first order) ODEs and the parameter $\mu$ is then the "extra" unknown which makes the discretized algebraic system well-posed (square). The parametrization for the arc length continuation is then given as $(x(s), \lambda(s), \mu(s))$ where $s$ as before denotes the arc length continuation parameter. In consonance with Proposition 3.1, all computed values of $\mu$ in each of the four examples presented below are observed to be numerically zero, i.e. on the order of $10^{-14}$.

4.1. Large lateral buckling of a "ruler"
Consider an unshearable rod with one end clamped and with the other end subjected to a "dead" transverse force $\lambda$, as shown in Fig. 1. The assumed constitutive laws are summarized in Table 1. Observe that one bending stiffness is ten times the other. Accordingly, we call such a rod a "ruler," as suggested by the depiction in Fig. 1.

Fig. 1. Schematic of the ruler showing reference configuration (where $\lambda = 0$) and boundary conditions at the two ends.

Table 1. Constitutive laws for ruler.
    Unshearable         $\nu_1 = 0$, $\nu_2 = 0$
    Axial force         $n_3 = 20\log(\nu_3)$
    Bending moments     $m_1 = \kappa_1$, $m_2 = 10\kappa_2$
    Twisting moment     $m_3 = \kappa_3$

The boundary conditions at the clamped end ($s = 0$) are

    $\mathsf{r}(0) = \mathsf{0}$,    (44)
    $q_0(0) = 1$, $(q_1,q_2,q_3)(0) = \mathsf{0}$,    (45)

and at the end ($s = 1$) where normal force $\lambda$ is applied, we impose

    $\mathsf{m}(1) = \mathsf{0}$,    (46)
    $\mathbf{n}(1) = \lambda\mathbf{e}_2$,    (47)

for a total of thirteen boundary conditions. In the convective coordinates used for computations, the boundary condition (47) is expressed as

    $n_1(1) = 2\lambda(q_1q_2 + q_0q_3)$,    (48)
    $n_2(1) = 2\lambda(q_0^2 + q_2^2 - 0.5)$,    (49)
    $n_3(1) = 2\lambda(q_2q_3 - q_0q_1)$.    (50)

We are interested in computing the solutions of this problem as the force parameter $\lambda$ is increased from zero (for which the ruler is in its reference configuration). The boundary conditions (44)-(46), (48)-(50) together with the system of differential Eqs. (24), (25), (37), and (38) describe a well-posed continuation problem in the single parameter $\lambda$. However, on account of numerical errors (as $\lambda$ is increased) implicit in any boundary value solver, the solution as it is continued may "move off" the admissible manifold of solutions defined by constraint (33). For this purpose, and as discussed in the previous section, we consider the augmented system of Eqs. (24), (25), (37), and (40) and introduce an additional boundary condition at $s = 1$

    $(q_0^2 + q_1^2 + q_2^2 + q_3^2)(1) = 1$.    (51)

The solution of the system of Eqs. (24), (25), (37), and (40) together with the boundary condition Eqs. (44)-(46), (48)-(51) is continued in two parameters $(\lambda,\mu)$. As the applied force $\lambda$ is increased from zero, the tip of the rod (at $s = 1$) moves in the $\mathbf{e}_2$-$\mathbf{e}_3$ plane in the direction of the applied force. At a critical value, the planar solution buckles, thereby resulting in a bifurcated nonplanar solution. Figure 2 plots the centerline of planar solutions for increasing values of $\lambda$ together with the centerline of a single bifurcated nonplanar solution.

[Figure 2 plot: "Centerline of the ruler as λ increases"; axes e2 (y) and e3 (z).]
Fig. 2. Centerline of the ruler as the force at the tip, λ, is increased.
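The role of the extra boundary condition (51) can be seen from a short calculation. The sketch below assumes only the identity (39), the augmented equation (40) with k standing for the strain triple supplied by (15), and that μ is constant in s (it enters as a continuation parameter). It is meant to illustrate the mechanism behind Proposition 3.1, not to reproduce the authors' proof.

```latex
\begin{align*}
\frac{d}{ds}\langle q, q\rangle
  &= 2\langle q, q'\rangle
   = 2\big\langle q,\; A(q)\,k + \mu q \big\rangle \\
  &= 2\big\langle A^{T}(q)\,q,\; k \big\rangle + 2\mu\,\langle q, q\rangle
   = 2\mu\,\langle q, q\rangle, \\[4pt]
\langle q, q\rangle(s) &= \langle q, q\rangle(0)\, e^{2\mu s}.
\end{align*}
```

With ⟨q,q⟩(0) = 1 from (45) and ⟨q,q⟩(1) = 1 from (51), this gives e^{2μ} = 1, hence μ = 0 and ⟨q,q⟩ = 1 for all s, so that (40) reduces to (38). This is consistent with the reported computed values of μ on the order of 10^{-14}.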

[Figure 3 plot: "Bifurcation diagram for ruler"; horizontal axis: λ, force boundary condition; branches labeled "Basic solution" and "Bifurcated nonplanar solution".]
Fig. 3. Bifurcation diagram for ruler.
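Bifurcation diagrams such as Fig. 3 are traced by the arc length continuation described at the start of Section 4. The following is a minimal sketch of a pseudo-arclength predictor-corrector in the spirit of [Keller, 1977], applied to a hypothetical scalar toy problem f(x, λ) = x² + λ − 1 = 0 with a fold at (0, 1). It is not AUTO's implementation (which works on the collocation discretization of the BVP); the function names, step size and step count are assumptions made for illustration.

```python
import math

def f(x, lam):
    """Toy equilibrium equation with a fold at (x, lam) = (0, 1)."""
    return x * x + lam - 1.0

def df(x, lam):
    """Partial derivatives (df/dx, df/dlam)."""
    return 2.0 * x, 1.0

def solve2(a, b, c, d, r1, r2):
    """Solve [[a, b], [c, d]] [u, v]^T = [r1, r2]^T by Cramer's rule."""
    det = a * d - b * c
    return (r1 * d - b * r2) / det, (a * r2 - c * r1) / det

def continue_branch(x, lam, tx, tl, ds=0.1, steps=30):
    """Follow f = 0 from (x, lam) along the unit tangent (tx, tl)."""
    branch = [(x, lam)]
    for _ in range(steps):
        xp, lp = x + ds * tx, lam + ds * tl      # predictor along the tangent
        x, lam = xp, lp
        for _ in range(20):                      # Newton corrector
            fx, fl = df(x, lam)
            r1 = f(x, lam)                       # equilibrium residual
            r2 = tx * (x - xp) + tl * (lam - lp) # stay on the plane normal to the tangent
            dx, dl = solve2(fx, fl, tx, tl, -r1, -r2)
            x, lam = x + dx, lam + dl
            if abs(dx) + abs(dl) < 1e-13:
                break
        # New tangent: null vector of [fx, fl], oriented like the old tangent.
        fx, fl = df(x, lam)
        nx, nl = solve2(fx, fl, tx, tl, 0.0, 1.0)
        norm = math.hypot(nx, nl)
        tx, tl = nx / norm, nl / norm
        branch.append((x, lam))
    return branch

# Start at (x, lam) = (-1, 0); the tangent there is proportional to (1, 2).
branch = continue_branch(-1.0, 0.0, 1 / math.sqrt(5), 2 / math.sqrt(5))
for x, lam in branch[::5]:
    print(f"x = {x:+.4f}, lam = {lam:+.4f}")
```

Because the step is taken in arc length rather than in λ, the bordered 2×2 system stays nonsingular at the fold, where continuation in λ alone would fail. In the rod computations the same bordering is applied with (λ, μ) as the continued parameters.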

Figure 3 plots the bifurcation diagram as a function of $\lambda$ and Fig. 4 plots a typical bifurcated nonplanar solution showing the buckling experienced by the ruler as the normal force exceeds the critical value. We note that the load at bifurcation agrees well with the buckling load predicted in [Timoshenko & Gere, 1961, Eqs. (6)-(23)], the latter of which is only approximate, given that an infinitesimal pre-buckled configuration is presumed.

4.2. Nonplanar solutions of a compressed "cable" or "DNA strand"
Consider an unshearable hemitropic rod (see Table 2 for the constitutive laws assumed). Hemitropy is a natural model of long filaments having a helical micro-structure in the relaxed state, cf. [Healey, 2002]. The two ends of the rod are "clamped" against rotation and transverse displacements, while the axial displacements of the two endpoints are prescribed (and equal):

    $r_\alpha(-1) = 0$, $\alpha = 1,2$, $r_3(-1) = -1 + \lambda$,    (52)
    $\mathsf{q}(-1) = (1,0,0,0)$,    (53)
    $r_\alpha(1) = 0$, $\alpha = 1,2$, $r_3(1) = 1 - \lambda$,    (54)
    $\mathsf{q}(1) = (1,0,0,0)$    (55)

imposed at the two ends $s = -1$ and $s = 1$ as shown in Fig. 5.
We are interested in computing the solutions of this problem as the displacement parameter $\lambda$ is increased from zero (for which the rod is assumed to be in its reference configuration). The bifurcation problem for the unshearable case has been considered in [Papadopoulos, 1999] and the linearization at any bifurcation point shown to possess a two-dimensional null space. This two-dimensional null space arises due to the presence of the symmetry group $O(2) \subset SO(3)$: $SO(2)$ due to rotations of the rod about $\mathbf{e}_3$ (the centerline), and $\mathbb{Z}_2$ corresponding to 180° rotations of the rod about any perpendicular

[Figure 4 plot: "A typical solution along first bifurcated branch"; axis e2 (y).]
Fig. 4. A typical 3D nonplanar solution along the bifurcated branch for the ruler.

Table 2. Constitutive laws for hemitropic rod (see [Healey, 2002] for the definition of hemitropic rod).
    Unshearable         $\nu_1 = 0$, $\nu_2 = 0$
    Axial force         $n_3 = 10(\nu_3 - 1) - 10\kappa_3$
    Bending moments     $m_1 = \kappa_1$, $m_2 = \kappa_2$
    Twisting moment     $m_3 = \kappa_3 - 10(\nu_3 - 1)$

Fig. 5. Schematic of the rod showing reference configuration (where $\lambda = 0$) and boundary conditions at the two ends.

bisector of the centerline. We obtain solutions by working in a suitable fixed-point space corresponding to solutions which are symmetric with respect to 180° rotations of the rod about $\mathbf{e}_2$ at $s = 0$, the so-called $\mathbb{Z}_2$ isotropy subgroup, cf. [Papadopoulos, 1999]. In this fixed-point space, the boundary conditions at $s = 1$ remain as before [Eqs. (52) and (53)] while the boundary conditions are now imposed at the midpoint $s = 0$ and are given as

    $n_2(0) = 0$,    (56)
    $m_2(0) = 0$,    (57)
    $r_1(0) = 0$,    (58)
    $r_3(0) = 0$,    (59)
    $q_1(0) = 0$,    (60)
    $q_3(0) = 0$.    (61)

Once the solution is obtained in the fixed-point space for $s \in [0,1]$, the solution for $s \in [-1,0]$ is obtained by 180° rotation. Moreover, an entire orbit of solutions may be obtained by rotating the

[Figure 6 plot: "Bifurcation diagram for hemitropic rod"; horizontal axis: λ, displacement at the end; branches labeled "Basic solution" and "Bifurcated nonplanar solutions".]
Fig. 6. Bifurcation diagram for the hemitropic rod.

above solution (applying the $SO(2)$ quotient group). Physically though, these solutions are all the same as they correspond to rigid rotation of the rod about its center line.
As in the case of the ruler, the thirteen boundary conditions (52), (53), (56)-(61) in the fixed-point space are augmented with an extra boundary condition

    $(q_0^2 + q_1^2 + q_2^2 + q_3^2)(0) = 1$    (62)

in order to satisfy the constraint. In view of the two boundary conditions (60) and (61), instead of Eq. (62),

    $(q_0^2 + q_2^2)(0) = 1$    (63)

is actually used in carrying out the computations. The solution of the augmented system of Eqs. (24), (25), (37), and (40) together with the augmented set of fourteen boundary condition Eqs. (52), (53), (56)-(61) and (63) is continued in two parameters $(\lambda, \mu)$. Figure 6 plots the bifurcation diagram of the obtained solutions showing two bifurcation points for the problem. These bifurcation points agree with the analysis in [Papadopoulos, 1999; Papadopoulos & Healey, 2004] and Figs. 7 and 8 plot two typical nonplanar solutions along the resulting bifurcated branches.

4.3. Perversions of a "telephone cord"
Next we compute helical and so-called perversion or helical-reversal solutions exhibited by a rod of finite length with intrinsic curvature, e.g. a telephone cord. We refer to [McMillen & Goriely, 2002] for an analytical study of the perversion solutions for infinite rods and to [Domokos & Healey, 2005] for a systematic study of the class of finite-length rod problems considered here. For our computational study, we assume an unshearable, inextensible rod with initial curvature $\kappa_0$ about the $\mathbf{e}_1$ direction; the constitutive laws are summarized in Table 3. The initial curvature $\kappa_0$ is related to the length $L$ of the rod via

    $N\,\dfrac{2\pi}{\kappa_0} = L$,    (64)
Fig. 7. A typical 3D nonplanar solution along the first bifurcated branch for the hemitropic rod.

Fig. 8. A typical 3D nonplanar solution along the second bifurcated branch for the hemitropic rod.

Table 3. Constitutive laws for telephone cord: $\kappa_0$ is the initial curvature.
    Unshearable         $\nu_1 = 0$, $\nu_2 = 0$
    Inextensible        $\nu_3 = 1$
    Bending moments     $m_1 = \kappa_1$, $m_2 = \kappa_2 - \kappa_0$
    Twisting moment     $m_3 = \kappa_3$

Fig. 9. Schematic of the telephone cord showing reference configuration (where $\lambda = 0$) and boundary conditions at the two ends.

where $N$ corresponds to the number of turns of the telephone cord being modeled and $1/\kappa_0$ is the radius of curvature. At $s = 0$ the rod is clamped; the end at $s = 1$ is clamped against rotation and transverse displacements. We also impose axial tension at the end $s = 1$ given by

    $\mathbf{n}(1) \cdot \mathbf{e}_3 = \lambda$,    (65)

where $\lambda$ denotes the magnitude of the imposed tensile force. Since the rotation at $s = 1$ is completely constrained, Eq. (65) becomes

    $n_3(1) = \lambda$.    (66)

The rest of the boundary conditions correspond to the placements

    $\mathsf{r}(0) = \mathsf{0}$,    (67)
    $\mathsf{q}(0) = (1,0,0,0)$,    (68)
    $r_1(1) = 0$,    (69)
    $r_2(1) = 0$,    (70)
    $\mathsf{q}(1) = (1,0,0,0)$,    (71)

for a total of fourteen boundary conditions. In particular, the rotations at the two ends are fixed, which ensures that the straight rod configuration with bending moment

    $m_1(s) = -\kappa_0$    (72)

is a basic solution for all values of the loading $\lambda$. We compute the solution as the tensile load $\lambda$ on the cord is increased from its initial zero value. Note that the problem with differential Eqs. (24), (25), (37), and (40) and the fourteen boundary condition Eqs. (66)-(71) is well-posed.
Figure 10 plots the bifurcation diagram as a function of $\lambda$ for the choice of $N = 1.5$ turns in Eq. (64). As the parameter $\lambda$ is increased from its zero value, the basic solution undergoes a bifurcation into a nonplanar solution shown in Fig. 11. As the parameter $\lambda$ is further increased, the basic solution undergoes a second bifurcation resulting in a so-called perversion as shown in Fig. 12.
The computed shapes depicted in Figs. 11 and 12 were first obtained using the Parallel Hybrid Algorithm [Domokos & Szeberenyi, 2004] in [Domokos & Healey, 2005]. The latter paper also contains a detailed local analysis, with which our computational results agree well. The computational results in [Domokos & Healey, 2005] go well beyond those presented in this section. In particular, solutions characterized by more than one helical reversal along the length of the rod are obtained.

4.4. Multiple solutions for a twisted "ruler"
Consider finally a buckling problem for a classical unshearable, inextensible rod with constitutive laws summarized in Table 4. The rod is clamped at one end ($s = 0$) and is subjected to a tensile load at the other end ($s = 1$). In addition, the rod is attached to a movable hinge at $s = 1$. The rod is free to rotate about the axis of the hinge, initially aligned with $\mathbf{e}_1$, while the orientation or twist of the hinge about $\mathbf{e}_3$, through a counter-clockwise angle "$\alpha$", is prescribed, as illustrated in Fig. 13. The transverse displacements of the rod at $s = 1$ are constrained. We are interested in computing the solutions of the problem in the presence of prescribed tensile end load $\lambda$, as in the telephone cord problem, and in the presence of twist $\alpha$. The twist boundary condition is specified by the rotation imposed at the cross-section at $s = 1$:

    $\mathbf{R}(1)\mathbf{e}_1 = \cos(\alpha)\mathbf{e}_1 + \sin(\alpha)\mathbf{e}_2$,    (73)

and in terms of coordinates used for computations

    $2q_0^2(1) + 2q_1^2(1) - 1 = \cos(\alpha)$,    (74)
    $q_1(1)q_3(1) - q_0(1)q_2(1) = 0$.    (75)
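The coordinate form (74), (75) of the twist condition (73) can be checked numerically: (74) is the (1,1) entry and (75) is half the (3,1) entry of R(q) in (34). The sketch below makes one modeling assumption that is not spelled out in the text, namely that the end rotation is a twist α about e3 composed with a free hinge rotation β about the (twisted) hinge axis, i.e. R = Rz(α)Rx(β). The helper names and the sample angles are hypothetical.

```python
import math

def quat_mul(p, q):
    """Hamilton product p*q, so that R(p*q) = R(p) R(q) for unit quaternions."""
    p0, p1, p2, p3 = p
    q0, q1, q2, q3 = q
    return (p0 * q0 - p1 * q1 - p2 * q2 - p3 * q3,
            p0 * q1 + p1 * q0 + p2 * q3 - p3 * q2,
            p0 * q2 - p1 * q3 + p2 * q0 + p3 * q1,
            p0 * q3 + p1 * q2 - p2 * q1 + p3 * q0)

def first_column(q):
    """R(q) e1, i.e. the first column of the matrix in Eq. (34)."""
    q0, q1, q2, q3 = q
    return (2 * (q0 * q0 + q1 * q1) - 1,
            2 * (q1 * q2 + q0 * q3),
            2 * (q1 * q3 - q0 * q2))

def twist_residuals(q, alpha):
    """Residuals of the boundary conditions (74) and (75) at s = 1."""
    q0, q1, q2, q3 = q
    return (2 * q0 * q0 + 2 * q1 * q1 - 1 - math.cos(alpha),
            q1 * q3 - q0 * q2)

alpha = math.radians(30.0)   # prescribed twist of the hinge about e3
beta = 0.4                   # arbitrary free rotation about the hinge axis

q_twist = (math.cos(alpha / 2), 0.0, 0.0, math.sin(alpha / 2))
q_hinge = (math.cos(beta / 2), math.sin(beta / 2), 0.0, 0.0)
q_end = quat_mul(q_twist, q_hinge)

print("residuals of (74), (75):", twist_residuals(q_end, alpha))
print("R(1) e1 =", first_column(q_end))
```

Under this assumption the residuals vanish for every β, which reflects that (74) and (75) fix only the direction of the hinge axis and leave the rotation about it free.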
[Figure 10 plot: "Bifurcation diagram of telephone cord"; horizontal axis: λ, tension at the end; one branch labeled "perversion".]
Fig. 10. Bifurcation diagram for telephone cord.

[Figure 11 plot: "A typical solution along first bifurcated branch"; axes e2 (y) and e3 (z).]
Fig. 11. A typical nonplanar solution along the first bifurcated branch for the telephone cord.

[Figure 12 plot: "A typical solution (perversion) along second bifurcated branch"; axis e3 (z).]
Fig. 12. A typical perversion along the second bifurcated branch for the telephone cord.

Table 4. Constitutive laws for the twisted ruler problem.
    Unshearable         $\nu_1 = 0$, $\nu_2 = 0$
    Inextensible        $\nu_3 = 1$
    Bending moments     $m_1 = \kappa_1$, $m_2 = 10\kappa_2$
    Twisting moment     $m_3 = \kappa_3$

Fig. 13. Schematic of the ruler showing reference configuration and boundary conditions at the two ends: $\lambda$ denotes the tensile load and $\alpha$ denotes the angle of twist.

The rest of the boundary conditions arise as

    $\mathsf{r}(0) = \mathsf{0}$,    (76)
    $\mathsf{q}(0) = (1,0,0,0)$,    (77)
    $r_1(1) = 0$,    (78)
    $r_2(1) = 0$,    (79)
    $m_1(1) = 0$,    (80)

and a tensile boundary condition as in Eq. (65)

    $2(q_3q_1 - q_0q_2)\,n_1 + 2(q_3q_2 + q_0q_1)\,n_2 + 2\left(q_0^2 + q_3^2 - \tfrac12\right) n_3 = \lambda$,    (81)

for a total of thirteen boundary conditions. These thirteen boundary conditions are then augmented by the extra boundary condition

    $(q_0^2 + q_1^2 + q_2^2 + q_3^2)(1) = 1$,    (82)

in accordance with Proposition 1.1.
We next describe some of the continuation and bifurcation results obtained with this example. With no twist ($\alpha = 0$), the primary response is governed by a standard planar buckling problem. Figure 14 depicts this branch together with a branch corresponding to the case in which a small amount of twist ($\alpha = 30°$) is applied. A more interesting situation arises where the buckling load is kept at a constant value and twist applied. Figure 15
[Figure 14 plot: "Buckled solution for α = 0 (dashed) and α = 30° (solid)"; vertical axis: r3(0); horizontal axis: λ (tensile).]
Fig. 14. Bifurcation diagram (in parameter λ) for twisted ruler.

[Figure 15 plot: "Bifurcation diagram for twisted ruler (λ = 0)"; vertical axis: r3(0); horizontal axis: α (twist) in degrees; branches labeled "Basic solution" and "bifurcated solutions".]
Fig. 15. Bifurcation diagram (in parameter α) for twisted ruler.

[Figure 16 plot: "3 distinct multiple solutions at α = λ = 0 (along first bifurcating branch)".]
Fig. 16. Multiple solutions for parameter values λ = α = 0 (no loading, no twist).

plots the bifurcation diagram for this case where $\lambda = 0$ and parameter $\alpha$ is varied. Continuation here shows the intricate nature of multiple buckled nonplanar solutions for any given choice of angle. For example, Fig. 16 plots three distinct solutions for $\alpha = 0$ along the first bifurcating branch.

5. Concluding Remarks
The use of quaternions in rigid body dynamics and its applications, including their use in simulations, is well known, e.g. [Darboux, 1972; Mitchell & Rodgers, 1965; Spurrier, 1978; Kane et al., 1983; Taylor & Paul, 1990; Cooke et al., 1994]. In particular, in [Taylor & Paul, 1990] the direct integration of (38) (where $\mathsf{k}$ (cf. (15)) is now the triple of the components of the angular velocity in body coordinates) is advocated provided that "periodic normalization to unit magnitude is accomplished"; this is simply an ad hoc version of Proposition 3.1. Of course, Proposition 3.1 can be used in rigid-body dynamical simulations, but this requires the use of implicit methods, given that the unit condition is specified at the end of a time step or sequence of time steps. We plan to pursue such ideas elsewhere. We also point out that in [Cooke et al., 1994] equations of the form (40) are advocated (presumably for explicit methods of numerical integration), but there $\mu \neq 0$ is imposed as an "integration drift correction gain".
John Maddocks has kindly pointed out to us that the discretization scheme employed in the package AUTO [Doedel, 2000] preserves quadratic conservation laws, e.g. (33). In the first example above, this means that we could directly tackle (38) without imposing (33) as a pointwise constraint and

without the need for (51). Indeed one then has the thirteen boundary conditions (44)-(46) and thirteen (scalar) unknowns, $\mathsf{n}$, $\mathsf{m}$, $\mathsf{r}$, and $\mathsf{q}$, so with one loading parameter, we expect solution branches. Indeed, we carried this procedure out, which yields results virtually identical to those obtained above. At the very least, this is a nice check on our results. On the other hand, if we try this procedure for any problem like the third example (telephone cord), for which the orientation at both ends is prescribed, we end up with fourteen boundary conditions for the thirteen unknowns; with one loading parameter we will not be able to compute solution branches. As pointed out earlier, all such inconsistencies are avoided by our approach based upon Proposition 3.1. We always prescribe fourteen boundary conditions for the fourteen scalar unknowns, $\mathsf{n}$, $\mathsf{m}$, $\mathsf{r}$, $\mathsf{q}$ and $\mu$, and with one loading parameter (and in the absence of other symmetries) we get solution branches. Moreover, our approach automatically ensures the same accuracy for the conservation law as that employed in the solution of the differential equations, independent of the particular integration scheme chosen. In particular, problems for elastic frameworks of Cosserat rods (with more than two rods coming together at a joint) are beyond the capabilities of any two-point boundary value problem solver like AUTO. Here a more general finite-element formulation is required, for which our method is very attractive. We will pursue such work elsewhere.

Acknowledgment
This work was supported in part by the National Science Foundation through grant DMS-0072514.

References
Ambrosetti, A. & Prodi, G. [1993] A Primer of Nonlinear Analysis (Cambridge University Press, Cambridge).
Antman, S. [1995] Nonlinear Problems of Elasticity (Springer-Verlag, NY).
Cooke, J. M., Zyda, M. J., Pratt, D. R. & McGhee, R. B. [1994] "Npsnet: Flight simulation dynamic modeling using quaternions," Presence 1, 404-420.
Darboux, G. [1972] Lecons Sur La Theorie Generale Des Surfaces, Premiere Partie (Chelsea Publishing Company, NY).
Dichmann, D., Li, Y. & Maddocks, J. [1996] "Hamiltonian formulation and symmetries in rod mechanics," in Mathematical Approaches to Biomolecular Structure and Dynamics, ed. Mesirov, J. (Springer-Verlag, NY), pp. 71-113.
Doedel, E. J. [2000] "AUTO2000: Continuation and bifurcation software for ordinary differential equations."
Doedel, E. J., Paffenroth, R. C., Keller, H. B., Dichmann, D. J., Galan-Vioque, J. & Vanderbauwhede, A. [2003] "Computation of periodic solutions of conservative systems with application to the 3-body problem," Int. J. Bifurcation and Chaos 12, 1353-1381.
Domokos, G. & Szeberenyi, I. [2004] "A hybrid parallel approach to nonlinear boundary value problems," Comp. Ass. Mech. Eng. Sci. 11, 15-34.
Domokos, G. & Healey, T. J. [2005] "Multiple helical perversions of finite, intristically curved rods," Int. J. Bifurcation and Chaos 15, 871-890.
Healey, T. J. [2002] "Material symmetry and chirality in nonlinearly elastic rods," Math. Mech. Solids 7, 405-420.
Kane, T. R., Likins, P. W. & Levinson, D. A. [1983] Spacecraft Dynamics (McGraw-Hill Book Company).
Keller, H. B. [1977] "Numerical solution of bifurcation and nonlinear eigenvalue problems," in Applications of Bifurcation Theory, ed. Rabinowitz, P. H. (Academic Press), pp. 359-384.
Li, Y. & Maddocks, J. [1996] "On the computation of equilibria of elastic rods. Part I: Integrals symmetry and a Hamiltonian formulation," manuscript.
Love, A. [1934] A Treatise on the Mathematical Theory of Elasticity, 4th edition (Cambridge University Press, Cambridge).
McMillen, T. & Goriely, A. [2002] "Tendril perversion in intrinsically curved rods," J. Nonlin. Sci. 12, 169-205.
Mitchell, E. E. & Rodgers, A. E. [1965] "Quaternion parameters in the simulation of a spinning rigid body," Simulation 18.
Munoz-Almaraz, F., Freire, E., Galan, J., Doedel, E. & Vanderbauwhede, A. [2003] "Continuation of periodic orbits in conservative and Hamiltonian systems," Physica D181, 1-38.
Papadopoulos, C. M. [1999] "Nonplanar buckled states of hemitropic rods," Ph.D. thesis, Cornell University.
Papadopoulos, C. & Healey, T. J. [2004] "Large buckling of a compressed hemitropic rod," manuscript.
Simo, J. & Vu-Quoc, L. [1986] "A three-dimensional finite-strain rod model. Part II: Computational aspects," Comput. Meth. Appl. Mech. Engin. 58, 79-116.
Spurrier, R. A. [1978] "Comment on singularity-free extraction of a quaternion from a direction-cosine matrix," J. Spacecraft 15, p. 255.
Taylor, R. H. & Paul, R. P. [1990] "On homogeneous transforms, quaternions, and computational efficiency," IEEE Trans. Robot. Autom. 6, 382-387.
Timoshenko, S. & Gere, J. [1961] Theory of Elastic Stability (McGraw-Hill, NY).
MULTIPARAMETER PARALLEL SEARCH
BRANCH SWITCHING
MICHAEL E. HENDERSON
IBM Research Division, T. J. Watson Research Center,
Yorktown Heights, NY 10598, USA
mhender@watson.ibm.com

Received March 10, 2004; Revised June 8, 2004

A continuation method (sometimes called path following) is a way to compute solution curves
of a nonlinear system of equations with a parameter. We derive a simple algorithm for branch
switching at bifurcation points for multiple parameter continuation, where surfaces bifurcate
along singular curves on a surface. It is a generalization of the parallel search technique used in
the continuation code AUTO, and avoids the need for second derivatives and a full analysis of
the bifurcation point.
The one parameter case is special. While the generalization is not difficult, it is nontrivial,
and the geometric interpretation may be of some interest. An additional tangent calculation at
a point near the singular point is used to estimate the tangent to the singular set.

Keywords: Numerical continuation; continuation methods; multiparameter branch switching; implicitly defined manifolds.

1. Background and Basic Result
A continuation method (sometimes called path following) is a way to compute solution curves of a nonlinear system of equations with a parameter. For an introduction to these methods see, for example [Allgower & Georg, 2003; Garcia & Zangwill, 1981] and more recently [Govaerts, 2000; Beyn et al., 2002], and papers [Doedel, 1997] and [Seydel, 1997].
In [Henderson, 2002] the author described a generalization of these methods to problems with more than one parameter, where the solution manifolds are surfaces instead of curves. One practical issue that was not addressed there is how to generalize the second-derivative-free parallel search branch switching algorithm that is used in codes like AUTO [Keller, 2001; Keller & Doedel, 2003]. The one parameter case is special. While the generalization is not difficult, it is nontrivial, and the geometric interpretation may be of some interest.
Suppose $T_0$ is a regular connected component of the solution manifold of

    $\mathbf{F}(\mathbf{u}) = \mathbf{0}$, $\mathbf{u} \in \mathbb{R}^n$, $\mathbf{F}: \mathbb{R}^n \to \mathbb{R}^{n-k}$

containing the initial point $\mathbf{u}_0$ and restricted to some computational domain $\Omega \subset \mathbb{R}^n$. That is, a point $\mathbf{v}$ is in $T_0$ if there is a continuous curve $\mathbf{u}(s)$, $s \in [0,1]$, of regular solutions of $\mathbf{F} = \mathbf{0}$ connecting $\mathbf{v}$ to $\mathbf{u}_0$ through $\Omega$ (see Fig. 1)

    $\mathbf{F}(\mathbf{u}(s)) = \mathbf{0}$, $\mathbf{u}(s) \subset \Omega$,
    $\mathrm{rank}(\mathbf{F}_\mathbf{u}(\mathbf{u}(s))) = n - k$,
    $\mathbf{u}(0) = \mathbf{u}_0$, $\mathbf{u}(1) = \mathbf{v}$.

If $\mathbf{F}$ is smooth, $T_0$ is a $k$-dimensional manifold with a boundary, and the boundary is made up of $(k-1)$-dimensional manifolds (again with boundaries) which either lie on $\delta\Omega$, or are such that the Jacobian $\mathbf{F}_\mathbf{u}$ is of rank $n - k - 1$.
Consider a point $\mathbf{u}^*$ on the singular boundary of $T_0$ (see Fig. 2). This point can be found by monitoring an indicator function $\chi(\mathbf{u})$, which changes

272 M. E. Henderson

sign or jumps when evaluated for points on opposite sides of a singular curve. (See [Beyn et al., 2002] for a description of indicator functions for various bifurcations.) Bisection or a root finding algorithm may then be used to locate $\mathbf{u}^*$ in the interval $[\mathbf{u}_a,\mathbf{u}_b]$ where the indicator function changes. The aim of branch switching is to find points near $\mathbf{u}^*$ that are interior to the other regular connected components containing $\mathbf{u}^*$ (Fig. 3).

Fig. 1. A regular connected component $T_0$ of $\mathbf{F} = \mathbf{0}$ in $\Omega$. For every point $\mathbf{v}$ in $T_0$ there is a path $\mathbf{u}(s)$ of regular solutions connecting it to $\mathbf{u}_0$.

Fig. 2. The singular boundary of a regular connected component. Points on this boundary, like $\mathbf{u}^*$, are found by monitoring a test function which changes between points on opposite sides of the boundary ($\mathbf{u}_a$ and $\mathbf{u}_b$). Bisection or a root finding algorithm may be used to locate $\mathbf{u}^*$.

Fig. 3. The regular connected components sharing a point $\mathbf{u}^*$ on their singular boundaries.

1.1. The geometry of the solution manifold near a singular point
The tangent space of $T_0$ at the singular boundary point can be found by interpolation between the tangent spaces at $\mathbf{u}_a$ and $\mathbf{u}_b$ (which are regular points and have unique tangent spaces). We can therefore find an orthonormal basis $\{\phi_0, \ldots, \phi_{k-1}\}$ for the $k$-dimensional tangent space of $T_0$ at $\mathbf{u}^*$

    $\mathbf{F}_\mathbf{u}(\mathbf{u}^*)\phi_i = \mathbf{0}$
    $\phi_i^T\phi_j = \delta_{ij}$

If $\mathbf{u}^*$ is interior to the singular boundary, the rank of $\mathbf{F}_\mathbf{u}(\mathbf{u}^*)$ will be $n - k - 1$, and so there is a right null vector $\phi_k \in \mathbb{R}^n$, and left null vector $\psi \in \mathbb{R}^{n-k}$ of the augmented Jacobian

    $\mathbf{F}_\mathbf{u}(\mathbf{u}^*)\phi_k = \mathbf{0}$, $\psi^T\mathbf{F}_\mathbf{u}(\mathbf{u}^*) = \mathbf{0}$
    $\phi_i^T\phi_k = 0$ $(i \neq k)$

To study the geometry of the bifurcation we use a Lyapunov-Schmidt decomposition. Let

    $\mathbf{u} = \mathbf{u}^* + \sum_{i=0}^{k} \phi_i s^i + \eta$

and consider first the projection of $\mathbf{F}$ onto the range of the Jacobian

    $(I - \psi\psi^T)\,\mathbf{F}\left(\mathbf{u}^* + \sum_{i=0}^{k} \phi_i s^i + \eta\right) = \mathbf{0}$
as a system for $\eta$. The Jacobian at the solution $s^i = 0$, $\eta = 0$ is nonsingular, so using the Implicit Function Theorem (IFT) there is a unique function $\eta(s^0, \ldots, s^k)$ which satisfies the projected equations in a neighborhood of $s^i = 0$. At $s^i = 0$ we have

$$\eta = 0, \qquad \eta_{s^i} = 0,$$
$$(I - \psi\psi^T)\big(F_u\,\eta_{s^i s^j} + F_{uu}\phi_i\phi_j\big) = 0,$$
$$\phi_l^T\eta_{s^i s^j} = 0.$$

(Similar equations can be written for the higher derivatives of $\eta$ by repeated differentiation.)

To satisfy $F = 0$, one further scalar equation must be satisfied (the Bifurcation Equation Eq. (1))

$$\psi^T F\Big(u^* + \sum_{i=0}^{k}\phi_i s^i + \eta(s^0, \ldots, s^k)\Big) = 0. \qquad (1)$$

The linearization of this is zero at $s = 0$, and we can remove this (so that the IFT can be used), by introducing a small parameter $\epsilon$:

$$\frac{1}{\epsilon^2}\,\psi^T F\Big(u^* + \epsilon\sum_{i=0}^{k}\phi_i s^i + \eta(\epsilon s^0, \ldots, \epsilon s^k)\Big) = 0, \qquad s^T s = 1.$$

A Taylor series (in $\epsilon$) of this begins:

$$\sum_{i,j}\psi^T F_{uu}\phi_i\phi_j s^i s^j + \epsilon\,\big(\text{terms cubic in } s \text{, involving } \psi^T F_{uuu}\phi_i\phi_j\phi_l \text{ and } \psi^T F_{uu}\phi_i\eta_{s^j s^l}\big) + \cdots$$

Suppose that the Algebraic Bifurcation Equation (ABE) Eq. (2)

$$\sum_{i,j}\psi^T F_{uu}\phi_i\phi_j s^i s^j = 0 \qquad (2)$$

is satisfied and the first-order term is nonzero. Using the IFT, a set of functions $s^i(\epsilon)$ with $s^i(0) = s^i$ exists in a neighborhood of $\epsilon = 0$. Each solution of the ABE therefore corresponds to a curve (parameterized by $\epsilon$) on the solution surface through $u^*$. Varying the $s^i$ subject to the ABE traces out the surface.

We know one set of solutions: any vector $s$ with $s^k = 0$. (This is because we chose the first $k$ null vectors to be a basis for the tangent space of $\Gamma_0$.) The ABE is therefore of the form

$$s^k\sum_{i=0}^{k}\psi^T F_{uu}\phi_i\phi_k s^i = 0, \quad\text{or}\quad s^k\sum_{i=0}^{k}N_i s^i = 0.$$

Therefore $N \in \mathbb{R}^{k+1}$ is orthogonal to the bifurcating branch, and $N_i = \psi^T F_{uu}\phi_i\phi_k$. The tangent space of the singular boundary is

$$s^k = 0, \qquad \sum_{i=0}^{k}N_i s^i = 0.$$

Let $\{\sigma_0, \ldots, \sigma_{k-2}\}$ be an orthonormal basis for this $(k-1)$-dimensional tangent space. The tangent space to the bifurcating sheet includes the additional vector $\sigma_{k-1} = (N_k N_0, \ldots, N_k N_{k-1}, -\sum_{i=0}^{k-1}N_i N_i)$. This is orthogonal to both $N$ and the other $\sigma_j$. (It is not normalized.)

1.2. Special case: k = 1

When $k = 1$ the singular set is a point (see Fig. 4). We have

$$(N_0, N_1) = (\psi^T F_{uu}\phi_0\phi_1,\ \psi^T F_{uu}\phi_1\phi_1)$$

and the tangent (not normalized) of the bifurcating branch in $s$-space is

$$\sigma_{k-1} = N_0\,(N_1, -N_0).$$

1.3. Special case: k = 2

The vector $(s^0, s^1, s^2)$ gives a point in the $k + 1 = 3$ dimensional null space of $F_u(u^*)$ (in the

basis $\phi_0, \phi_1, \phi_2$) (Fig. 5). We have

$$(N_0, N_1, N_2) = (\psi^T F_{uu}\phi_0\phi_2,\ \psi^T F_{uu}\phi_1\phi_2,\ \psi^T F_{uu}\phi_2\phi_2).$$

The branch corresponding to $\Gamma_0$ (and $\Gamma_2$) is $s^2 = 0$

$$u^* + \epsilon(\phi_0 s^0 + \phi_1 s^1) + \eta(\epsilon s^0, \epsilon s^1, 0), \qquad s^0 s^0 + s^1 s^1 = 1.$$

Fig. 4. (Left) The parallel search branch switching algorithm used in AUTO. A bifurcation point $u^*$ is located between $u_a$ and $u_b$, and the null vector of the augmented Jacobian $\phi_1$ is used as a tangent for the next step. (Right) A sketch of the corresponding $s$-space, showing the four roots of the ABE.

Fig. 5. (Left) The basis for the right null space of $F_u(u^*)$. The first two basis vectors lie in the tangent plane of $\Gamma_0$ and $\Gamma_2$. The third is orthogonal to the first two. (Right) Solutions to the ABE's in $s$-space. Circle $C_0$ is $s^2 = 0$, $s^0 s^0 + s^1 s^1 = 1$. Circle $C_1$ lies in the plane $N \cdot s = 0$, where $N = (\psi^T F_{uu}\phi_0\phi_2, \psi^T F_{uu}\phi_1\phi_2, \psi^T F_{uu}\phi_2\phi_2)$. The singular set is $N \cdot s = 0$, $s^k = 0$.

The other branches ($\Gamma_1$ and $\Gamma_3$) are

$$u^* + \epsilon(\phi_0 s^0 + \phi_1 s^1 + \phi_2 s^2) + \eta(\epsilon s^0, \epsilon s^1, \epsilon s^2),$$
$$N_0 s^0 + N_1 s^1 + N_2 s^2 = 0, \qquad s^0 s^0 + s^1 s^1 + s^2 s^2 = 1.$$

See Fig. 5. The tangent to the singular set is, in $s$-space and $\mathbb{R}^n$

$$\sigma_0 = (-N_1, N_0, 0) \;\leftrightarrow\; -N_1\phi_0 + N_0\phi_1$$

and the tangent vector (not normalized, and in $s$-space) to the bifurcating branch orthogonal to the singular set is

$$\sigma_1 = (N_0 N_2,\ N_1 N_2,\ -N_0 N_0 - N_1 N_1).$$

1.4. Parallel search branch switching

These quantities can be computed, and the tangent to the bifurcating components found directly. However, in many instances the second derivatives are not available, and we need a branch switching algorithm which does not assume they are.

The goal is to find a point on the bifurcating component which can be used as an initial point to compute the component. To project a point $v$ onto a regular component, a system of the form

$$F(u) = 0, \qquad \Phi^T(u - v) = 0$$

is used. As long as the projection of the $k$ vectors $\Phi$ (the columns) onto the null space of the Jacobian at $u$ spans the null space this is a nonsingular system. As the name parallel search implies, we choose $\Phi$ orthogonal to the tangent to $\Gamma_0$. The condition that the augmented Jacobian be nonsingular is that the bifurcation be transverse to $\Gamma_0$.

For $k = 1$, $\phi_k$ has a nonzero projection onto the tangent to the bifurcating curve $N_1\phi_0 - N_0\phi_1$ if $N_0 \neq 0$ (the nontransverse case). So a point on the bifurcating branch may be found by solving

$$F(u) = 0, \qquad \phi_1^T\big(u - (u^* + \Delta s\,\phi_1)\big) = 0.$$

This is the technique used in AUTO (described in [Beyn et al., 2002] and [Keller, 2001]).

For $k > 1$ we need to find a $k$-dimensional subspace whose projection onto the tangent space of the bifurcating sheet spans that tangent space. For $k = 1$ we projected orthogonal to $\phi_k$. This defines a $(k-1)$-dimensional curve on the bifurcating sheet, and so if $k \neq 1$ we need additional constraints to define a unique point on the bifurcating sheet. The tangent to the singular set lies in both tangent spaces, and we need to project orthogonal to that as well.

To estimate the tangent to the singular set we use a perturbation

$$\tilde F(u) = F(u) - F(u^* + \epsilon\Delta\phi_k).$$

For this (as before $N_i = \psi^T F_{uu}\phi_i\phi_k$)

$$\frac{1}{\epsilon^2}\,\psi^T\tilde F = \sum_{i=0}^{k}N_i s^i s^k - N_k\Delta^2 + O(\epsilon).$$

In $s$-space solutions of this perturbed equation are a pair of hyperbolic sheets which asymptote to the solutions of the unperturbed equation (Fig. 6).

Fig. 6. (Left) The perturbed surface $F(u) = F(u^* + \epsilon\Delta\phi_k)$. (Right) The same surface in the $s$-space defined by the unperturbed problem.
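The $k = 1$ parallel search step of Sec. 1.4 can be tried on a toy problem. The sketch below is an illustration only and is not taken from AUTO: it assumes the scalar pitchfork $F(x, \lambda) = x(\lambda - x^2)$ with $u = (x, \lambda)$, $u^* = (0, 0)$, trivial branch $x = 0$ (tangent $\phi_0 = (0, 1)$) and $\phi_1 = (1, 0)$. The function names `F`, `F_u` and `parallel_search` are hypothetical. Newton's method is applied to the augmented system $F(u) = 0$, $\phi_1^T(u - (u^* + \Delta s\,\phi_1)) = 0$.

```python
def F(u):
    """Toy pitchfork (an assumption for illustration): F(x, lam) = x*(lam - x^2)."""
    x, lam = u
    return x * (lam - x * x)


def F_u(u):
    """Jacobian of F as the row (dF/dx, dF/dlam)."""
    x, lam = u
    return (lam - 3.0 * x * x, x)


def parallel_search(u_star, phi_k, ds, tol=1e-12, max_iter=20):
    """Newton iteration on F(u) = 0, phi_k^T (u - (u* + ds*phi_k)) = 0."""
    target = (u_star[0] + ds * phi_k[0], u_star[1] + ds * phi_k[1])
    u = list(target)  # start from the predicted point u* + ds*phi_k
    for _ in range(max_iter):
        r0 = F(u)
        r1 = phi_k[0] * (u[0] - target[0]) + phi_k[1] * (u[1] - target[1])
        if abs(r0) < tol and abs(r1) < tol:
            return tuple(u)
        a, b = F_u(u)      # first row of the augmented Jacobian
        c, d = phi_k       # second row: the parallel-search constraint
        det = a * d - b * c
        if det == 0.0:
            raise ZeroDivisionError("augmented Jacobian is singular")
        # Solve [[a, b], [c, d]] @ du = -(r0, r1) by Cramer's rule.
        u[0] += (-r0 * d + b * r1) / det
        u[1] += (-a * r1 + c * r0) / det
    raise RuntimeError("Newton iteration did not converge")


if __name__ == "__main__":
    # Positive and negative steps land on the two halves of the bifurcating branch.
    print(parallel_search((0.0, 0.0), (1.0, 0.0), 0.1))
    print(parallel_search((0.0, 0.0), (1.0, 0.0), -0.1))
```

For this toy problem the bifurcating branch is $\lambda = x^2$, so the two calls should return points close to $(\pm 0.1, 0.01)$. Note that the augmented Jacobian is nonsingular at the predicted point even though $F_u(u^*)$ itself vanishes.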
Suppose $\sigma$ is any vector in the tangent space of the singular set. That is

$$\sum_{i=0}^{k}N_i\sigma^i = 0, \qquad \sigma^k = 0.$$

By construction, we know one point on the perturbed surface, $u^* + \epsilon\Delta\phi_k$, which corresponds to the point $(0, \ldots, 0, \Delta)$ in the $s$-space since it is a solution of

$$\sum_{i=0}^{k}N_i s^i s^k = \Delta^2 N_k.$$

This equation is invariant to a shift in the $\sigma$ direction, so $\sigma$ lies in the tangent space of the perturbed system at the known point $(0, \ldots, 0, \Delta)$. This gives us a way to compute the tangent space of the singular set: it is the common $(k-1)$-dimensional subspace of the tangent to $\Gamma_0$ and the tangent of the perturbed system (the null space of $F_u(u^* + \epsilon\Delta\phi_k)$).

2. Statement of the Algorithm

Detection: Locate a pair of points $u_a$ and $u_b$ on $F = 0$ such that $\chi(u_a) \neq \chi(u_b)$.

Location: Using bisection, or a root finding method, locate the point $u^*$ on $F = 0$ in the interval at which $\chi(u)$ changes. Use the tangent spaces at $u_a$ and $u_b$ to interpolate $(\phi_0, \ldots, \phi_{k-1})$, an orthonormal approximation to a basis for the tangent space of $\Gamma_0$ at $u^*$.

Branch Switching:

(1) Find the right null vector $\phi_k$, $\phi_k^T\phi_k = 1$

$$F_u(u^*)\phi_k = 0, \qquad \phi_i^T\phi_k = 0, \quad i = 0, \ldots, k-1.$$

(2) Find an orthonormal basis $\tilde\Phi$ of the $k$-dimensional null space of

$$F_u(u^* + \epsilon\Delta\phi_k)\tilde\phi_i = 0, \qquad \tilde\phi_i^T\tilde\phi_j = \delta_{ij}.$$

(3) Find an orthonormal basis $\{\sigma_0, \ldots, \sigma_{k-2}\}$ for the common subspace of $\Phi$ and $\tilde\Phi$. Since $\phi_k$ is orthogonal to $\Phi$ this is the subspace $\phi_k^T\tilde\Phi = 0$, so the Gram-Schmidt algorithm can be used on the set of $k+1$ vectors $\{\phi_k, \tilde\phi_0, \ldots, \tilde\phi_{k-1}\}$ to find the subspace (the first will be $\phi_k$, and the last will be zero. The ones in between are the $\sigma_i$).

(4) Find points on the bifurcating sheets by solving

$$F(u) = 0,$$
$$\sigma_i^T\big(u - (u^* + \Delta s\,\phi_k)\big) = 0, \quad i = 0, \ldots, k-2,$$
$$\phi_k^T\big(u - (u^* + \Delta s\,\phi_k)\big) = 0.$$

With $\Delta s > 0$ we get a point on $\Gamma_1$, and $\Delta s < 0$ gives a point on $\Gamma_3$. For the point on $\Gamma_2$ we can use $u_b$, which was found in the detection step.

Notes:

• $\Delta$ controls the shape of the hyperbola in $s$-space, and $\epsilon$ is small relative to the norm of $u^*$. Therefore $\epsilon\Delta$ should be something like $10^{-3}|u^*|$.
• There is a technique, described in [Allgower & Georg, 2003], which perturbs the problem in order to switch branches. This approach does that in some sense by using the tangent of a perturbation.
• For Hopf and other bifurcations the null vector $\phi_k$ at the singular point is of a different class than the other null vectors. For example, $\phi_0, \ldots, \phi_{k-1}$ may be in $\mathbb{R}^n$, while $\phi_k$ is in $\mathbb{R}^n \times S^1$. All this means is that the other null vectors must be promoted to the larger space, since the tangent space $\tilde\Phi$ at the perturbed point is in the larger space.

3. Examples

3.1. Cusp

Our first example is a complexified cusp [Henderson & Keller, 1990].

$$(x + iy)\cdot\big((x + iy)^2 + \lambda\big) = \mu.$$

This is $n = 4$, $k = 2$, and we can easily find an initial solution $x_0 = y_0 = 0$ at $\mu_0 = 0$, $\lambda_0 = 1$. Figure 7 shows $x + y$ as a function of $(\lambda, \mu)$. The single initial point, with the branch switching algorithm described in the preceding section, was sufficient to compute the four regular connected components. Note that the blue components ($y = 0$) are the cusp catastrophe.

3.2. (2,4) cell interaction model

Our second example is a model of the (2,4) cell mode interaction in Taylor-Couette flow [Meyer-Spasche, 1991, pp. 106-110]. It is based on an analysis of a two eigenvalue bifurcation by
Andreichikov [1979], with coefficients computed by Bolstad [1992]. The computation of the coefficients is as described in [Ramaswamy & Keller, 1995]. At radius ratio $\eta = 0.615$ and a $12 \times 48$ grid the bifurcation point was found to be at Reynolds number $R = 78.53836$, aspect ratio $\lambda = 2.881799$. The model is

$$x\,(x^2 + a_1 y^2 - f_1 + b_1 y) = 0$$
$$y\,(a_2 x^2 + y^2 - f_2) + b_2 x^2 = 0$$

where

$$a_1 = 3.67$$
$$b_1 = -0.0975 - 0.00392\,\Delta R + 0.0543\,\Delta\lambda$$
$$f_1 = 0.00117\,\Delta R - 0.0137\,\Delta\lambda - 0.00000427\,\Delta R^2 - 0.000407\,\Delta R\,\Delta\lambda + 0.00106\,\Delta\lambda^2$$
$$a_2 = 1.19$$
$$b_2 = 0.0331 + 0.000476\,\Delta R - 0.00546\,\Delta\lambda$$
$$f_2 = 0.000681\,\Delta R + 0.00955\,\Delta\lambda - 0.000002605\,\Delta R^2 - 0.000216\,\Delta R\,\Delta\lambda - 0.004925\,\Delta\lambda^2.$$

The solution manifold consists of three pieces with different symmetries (Fig. 8):

$x = 0$, $y = 0$: The trivial solutions.
$x = 0$, $y \neq 0$: The 4-cell solutions, $y^2 - f_2(\Delta R, \Delta\lambda) = 0$.
$x \neq 0$, $y \neq 0$: The mixed 2-cell/4-cell solutions.

Fig. 7. A computation of solutions of $u(u^2 - \lambda) = \mu$. The projection used for rendering is $(\mu, \lambda, x + y)$. (Left) $\Gamma_0$, the regular component connected to the initial point. (Right) All components ($y \neq 0$ is red, $y = 0$ is blue).

Fig. 8. A computation of solutions of a model for the (2,4) mode interaction in Taylor-Couette flow. The projection used for rendering is $(\Delta R, \Delta\lambda, x + y)$. (Left) $\Gamma_0$, the regular component connected to the initial point. The inset shows tiles created using $\{\sigma_0, \ldots, \sigma_{k-2}, \phi_k\}$. (Right) All components.

References

Allgower, E. L. & Georg, K. [2003] Introduction to Numerical Continuation Methods, Classics in Applied Mathematics, Vol. 45 (SIAM, Philadelphia).
Andreichikov, I. P. [1979] "Branching of secondary modes in the flow between rotating cylinders," Fluid Dyn. 12, 38-43.
Beyn, W.-J., Champneys, A., Doedel, E., Govaerts, W., Kuznetsov, Yu. A. & Sandstede, B. [2002] Numerical Continuation, and Computation of Normal Forms, Handbook of Dynamical Systems, Vol. 2 (Elsevier Science).
Bolstad, J. [1992] Private communication.
Doedel, E. J. [1997] "Nonlinear numerics," Int. J. Bifurcation and Chaos 7, 2127-2143.
Garcia, C. B. & Zangwill, W. I. [1981] Pathways to Solutions, Fixed Points and Equilibria (Prentice-Hall).
Govaerts, W. J. F. [2000] Numerical Methods for Bifurcations of Dynamical Equilibria (SIAM, Philadelphia).
Henderson, M. E. & Keller, H. B. [1990] "Complex bifurcation from real paths," SIAM J. Appl. Math. 50, 460-482.
Henderson, M. E. [2002] "Multiple parameter continuation: Computing implicitly defined k-manifolds," Int. J. Bifurcation and Chaos 12, 451-476.
Keller, H. B. [2001] "Continuation and bifurcations in scientific computation," Math. Today 76, 493-520.
Keller, H. B. & Doedel, E. J. [2003] "Path following in scientific computing and its implementation in AUTO," in Sourcebook of Parallel Computing, eds. Dongarra, J., Foster, I., Fox, G., Gropp, W., Kennedy, K., Torczon, L. & White, A. (Morgan Kaufman, San Francisco), Chap. 23, pp. 670-700.
Meyer-Spasche, R. [1991] Pattern Formation in Viscous Flows (Springer-Verlag, NY).
Ramaswamy, M. & Keller, H. B. [1995] "A local study of a double critical point in Taylor-Couette flow," Acta Mech. 109, 27-39.
Seydel, R. [1997] "Nonlinear computation," Int. J. Bifurcation and Chaos 7, 2105-2126.
EQUATION-FREE, EFFECTIVE COMPUTATION
FOR DISCRETE SYSTEMS: A TIME STEPPER
BASED APPROACH

J. MÖLLER and O. RUNBORG
Department of Numerical Analysis and Computer Science, KTH,
10044 Stockholm, Sweden

P. G. KEVREKIDIS
Department of Mathematics and Statistics, University of Massachusetts,
Amherst, MA 01003, USA

K. LUST
Departement Computerwetenschappen, Katholieke Universiteit Leuven,
Celestijnenlaan 200A, B-3001 Heverlee, Belgium

I. G. KEVREKIDIS
Department of Chemical Engineering,
Program for Applied and Computational Mathematics,
Department of Mathematics,
Princeton University, Princeton, NJ 08544, USA

Received May 5, 2004; Revised August 3, 2004

We propose a computer-assisted approach to studying the effective continuum behavior of spatially discrete evolution equations. The advantage of the approach is that the "coarse model" (the continuum, effective equation) need not be explicitly constructed. The method only uses a time-integration code for the discrete problem and judicious choices of initial data and integration times; our bifurcation computations are based on the so-called Recursive Projection Method (RPM) with arc-length continuation [Shroff & Keller, 1993]. The technique is used to monitor features of the genuinely discrete problem such as the pinning of coherent structures and its results are compared to quasi-continuum approaches such as the ones based on Padé approximations.

Keywords: Equation-free methods; homogenization; discrete problems; bifurcation; pinning condition.

1. Introduction

In contemporary science and engineering modeling many situations arise in which the physical system consists of a lattice of discrete interacting units. The role of discreteness in modifying the behavior of solutions of continuum nonlinear PDEs has recently been increasingly appreciated. The relevant physical contexts can be quite diverse, ranging from the calcium burst waves in living cells [Dawson et al., 1999] to the propagation of action potentials through the tissue of the cardiac cells [Keener, 1991] and from chains of chemical reactions [Laplante & Erneux, 1992] to applications in superconductivity and Josephson junctions [Ustinov et al., 1993], nonlinear optics and waveguide arrays [Christodoulides & Joseph, 1988], complex electronic materials [Swanson et al., 1999], the dynamics of neuron chains or lattices [Rinzel et al., 1998; McLaughlin
et al., 2000] or the local denaturation of the DNA double strand [Peyrard & Bishop, 1989].

Whether the phenomenon in question is the propagation of an excitation wave along a neuron lattice, the electric field envelope in an optical waveguide array, or the behavior of a tissue consisting of an array of individual cells, we would often like to model the system through a "coarse level" effective continuum evolution equation that retains the essential features of the actual (discrete) problem. Typically computational modeling of such systems involves two steps: the derivation of effective continuum equations, followed by their analysis through traditional numerical tools. In this paper we attempt to circumvent the derivation of explicit (closed) continuum effective equations, and analyze the effective behavior directly. This is accomplished through short, appropriately initialized simulations of the detailed discrete process, a procedure that we call the "coarse time stepper". These simulations provide estimates of the quantities (residuals, action of Jacobians, time derivatives, Fréchet derivatives) that would be directly evaluated from the effective equation, had such an equation been available. The estimated quantities are processed by a higher level numerical procedure (in this case, the Recursive Projection Method, RPM, of [Shroff & Keller, 1993]) which computes the effective, macroscopic behavior (in this case, traveling waves and their coarse bifurcations). A more general discussion of the combination of coarse time stepping with continuum numerical techniques beyond RPM can be found in [Gear et al., 2002; Kevrekidis et al., 2003]. We have recently demonstrated such an approach to the computation of the effective behavior (in some sense, homogenization) of spatially heterogeneous problems [Runborg et al., 2002]. This paper constitutes an extension of this idea to spatially discrete problems.

The paper is organized as follows: We begin with a brief review of the coarse time stepper for spatially discrete problems. We then discuss our illustrative problem (a front in a discrete reaction-diffusion system) and its properties. A description of our implementation of the coarse time stepper for the bifurcation analysis of this particular problem is then presented, followed by numerical results. We conclude with a discussion of an alternative approach that involves the derivation of an explicit effective evolution equation (based on Padé approximations), and of the scope and applicability of our method.

2. A Coarse Time Stepper for Discrete Systems

Consider a discrete system where each unknown is associated with a point on a lattice in space. In the discussion here, we consider a one-dimensional regular lattice for simplicity. Higher dimensional and/or possibly irregular lattices can be treated in a similar way. We denote the unknowns $\{u_\ell\}$, with $\ell \in \mathbb{Z}$, and the corresponding points $\{x_\ell\}$, such that $x_\ell = \ell\Delta x$, where $\Delta x$ is the lattice spacing. We assume that the system is governed by the ordinary differential equations

$$\frac{du_\ell}{dt} = F(t, u_{\ell-n}, \ldots, u_{\ell+n}), \qquad \ell \in \mathbb{Z}, \qquad (1)$$

where $n > 0$ is an integer representing the range of interaction between lattice points. We want to describe this discrete system dynamics through a continuous function $v(t, x)$ that models the "coarse" behavior of the unknowns on the lattice:

$$u_\ell(t) \approx v(t, x_\ell), \qquad \forall t, \ell,$$

in some appropriate sense. We denote $v$ as the coarse continuous solution of (1) and we assume that $n$ is not large and that there exists an effective, spatially continuous evolution equation for $v(t, x)$ of the form

$$v_t = P(t, v, \partial_x v, \ldots, \partial_x^M v), \qquad (2)$$

for some $P$ and integer $M$. Such an effective equation for $v$ should "average over" the detailed discrete structure of the medium; if there are no macroscopic variations of the discrete medium, this equation should therefore be translationally invariant; for the moment, we will confine ourselves to this case. In terms of (1), we can express this as: if $F$ does not depend on $\ell$, and if $v$ and $\tilde v$ are two solutions to the effective equation (2) satisfying $v(0, x) = \tilde v(0, x + s)$ for all $x$, then $v(t, x) = \tilde v(t, x + s)$ for all time $t > 0$, all $x$, and all shifts $s$.

It is interesting to consider what the result of integrating such an effective equation with a particular continuum initial condition $v_0(x)$ would physically mean. There clearly exists an uncertainty in how such a continuum initial condition would be imparted to (sampled by) the lattice. One way would be to set $u_\ell(0) = v_0(x_\ell)$, for all $\ell$, but we could equally well set $u_\ell(0) = v_0(x_\ell + s)$ for any $s \in [0, \Delta x)$. There exists, therefore, a one-parameter uncertainty parametrized by a continuous shift $s$.
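The shift ambiguity described above can be illustrated with a few lines of code. This is a minimal sketch under assumed values: the Gaussian profile `v0`, its width, and the lattice of 11 sites with spacing 1 are choices made here for illustration and do not come from the paper. Two samplings of the same continuum profile are compared, one with the peak on a lattice site and one with the peak halfway between sites.

```python
import math


def v0(x, center=5.0, width=0.3):
    """Narrow single-peaked continuum profile (illustrative assumption)."""
    return math.exp(-(((x - center) / width) ** 2))


def sample(v, n_sites, dx, shift):
    """Lattice sampling u_l(0) = v(x_l + s) with x_l = l*dx and shift s in [0, dx)."""
    return [v(l * dx + shift) for l in range(n_sites)]


if __name__ == "__main__":
    dx = 1.0
    on_site = sample(v0, 11, dx, 0.0)   # peak coincides with lattice point l = 5
    between = sample(v0, 11, dx, 0.5)   # peak falls halfway between two sites
    print("largest lattice value, s = 0:   ", max(on_site))
    print("largest lattice value, s = dx/2:", max(between))
```

With the peak on a site the lattice sees the full amplitude; with the half-cell shift it sees only the tails of the profile, so the two discrete initial states are very different even though they sample the same continuum function.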
Simulations resulting from different lattice samplings of the same continuum initial condition could be quite different. This is best illustrated by thinking of a single-peaked function as the continuum initial condition: the peak may lie precisely at a lattice point, or could fall in-between lattice points. It is reasonable to consider as a useful effective continuum equation one which takes into account all possible shifts of the initial condition within a cell; in analogy with our earlier work [Runborg et al., 2002], we would like to analyze an effective equation that would describe the expected result, taken over all possible shifts, of sampling the initial condition by the lattice.

We will use the coarse time stepper approach to simulate an effective equation like (2). In this setting, we approximate $v(t, x)$ by the coarse time stepper solution $u(t, x)$ at discrete times $nT$, where $T$ is the time horizon of the coarse time stepper. Using the terminology of this framework, we take the following steps, starting from a continuous initial condition $v_0(x) = u(0, x)$.

• Lifting. This initial data $v_0(x)$ is "lifted" to an ensemble of $N_c$ different initial states of (1) by sampling,

$$u_\ell^j(0) = v_0(x_\ell + j\Delta s), \qquad \Delta s = \Delta x / N_c, \qquad j = 0, \ldots, N_c - 1. \qquad (3)$$

Setting $\mathbf{u}_j = \{u_\ell^j\}$, we write this symbolically as

$$\mathbf{u}_j(0) = \mu_j v_0,$$

where $\{\mu_j\}$ are called the lifting operators. In this case they simply sample a continuous function.

• Evolve. Each ensemble of initial data is evolved till time $T$ according to the "true dynamics" (1),

$$\mathbf{u}_j(T) = \mathcal{T}_T\,\mathbf{u}_j(0), \qquad j = 0, \ldots, N_c - 1, \qquad (4)$$

where $\mathcal{T}_\tau$ is the solution operator of (1) evolving $\mathbf{u}(t)$ to $\mathbf{u}(t + \tau)$. This step thus generates an ensemble of solutions $\mathbf{u}_j(T)$ at time $T$.

• Restrict. Via the restriction operator $\mathcal{M}$, the ensemble of solutions is brought back to a continuous function,

$$u(T, x) = \mathcal{M}\{\mathbf{u}_j(T)\}, \qquad j = 0, \ldots, N_c - 1. \qquad (5)$$

To ensure consistency we require that $\mathcal{M}\{\mu_j\} = I$. The restriction operator $\mathcal{M}$ is typically defined as follows. The solutions $\mathbf{u}_j(T)$ are thought of as sample values of a function $\tilde u$ such that $\tilde u(x_\ell + j\Delta s) = u_\ell^j$. The function $\tilde u$ is recovered by interpolating the sample values and the restriction $u(T, x) = \mathcal{M}\{\mathbf{u}_j(T)\}$ is finally given as a coarse scale filtering of $\tilde u(x)$.

These steps are illustrated in Fig. 1. For $n > 0$ we define $u(nT, x)$ recursively by applying the same construction. Hence,

$$u(nT, x) = \mathcal{M}\{\mathcal{T}_T\,\mu_j\}\,u((n-1)T, x). \qquad (6)$$

The hope is that the coarse time stepper solution $u(nT, x)$, at these discrete points in time, can be obtained from a closed evolution equation like (2) whose solution, $v(t, x)$ (defined for all $t$), agrees, at least approximately, with the coarse solution obtained from the procedure above, at the discrete points in time, $v(nT, x) \approx u(nT, x)$. We will refer to the procedure as the coarse time stepper.

In order to approximate $v$ numerically, we must use a finite representation of $u(nT, x)$. We let $\mathbf{v}^n = \{v_k^n\}_{k=0}^{M-1}$ be this representation at time $t = nT$. The elements $\{v_k^n\}$ could be nodal values, cell averages or, more generally, coefficients for finite elements or other basis functions. Let $\Pi$ be the operator realizing the function from the finite representation, $(\Pi\mathbf{v}^n)(x) = u(nT, x)$. We also require that the restriction operator projects on the subspace spanned by the finite representation, and we can redefine it to also convert the projected function to this representation. Symbolically, we then write the coarse time stepping

$$\mathbf{v}^{n+1} = \mathcal{M}\{\mathcal{T}_T\,\mu_j\}\Pi\,\mathbf{v}^n =: G(\mathbf{v}^n). \qquad (7)$$

Note that we may not be able to write down the explicit expression for $G$ or Eq. (2) for $v(t, x)$, but our definition of $u(t, x)$ allows us to realize its time-$T$ map numerically in a straightforward fashion.

Applied directly to the simulation, the coarse time stepper does nothing to reduce the cost of detailed computation with the discrete dynamics. It is only in conjunction with other techniques (like projective integration [Gear & Kevrekidis, 2003], or matrix-free fixed point techniques) that the coarse time stepper may provide computational or analytical benefits. Here we will make use of the coarse time stepper in conjunction with the Recursive Projection Method (RPM), to perform stability and
bifurcation analysis of certain types of solutions of the (unavailable) coarse evolution equation. For a schematic illustration of the coarse time stepper with RPM, see Fig. 2.

RPM helps locate fixed points, allows us to trace fixed point branches and locate their local bifurcations; when the bifurcations in (7) that we are interested in do not involve fixed points, $G$ has to be reformulated. How this is done depends on the application; for the type of solutions considered here (traveling fronts), the appropriate modification is discussed in Sec. 3.2.

[Figure 1 panels: Coarse initial condition; Sample to get shifted copies; Integrate each copy independently; Line up copies and restrict (filter).]

Fig. 1. The coarse time stepper: Starting from a coarse initial condition $v_0(x)$, lift it by sampling to an ensemble of initial data, $\{\mathbf{u}_j(0)\}$, $j = 0, \ldots, N_c - 1$, for the system and evolve each set for time $T$. Line up solutions at time $T$ and interpolate to get $\tilde u(x)$. Finally, filter $\tilde u(x)$ to get $u(T, x)$, the result of the coarse time stepper at $t = T$.
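The lift, evolve and restrict steps of Eqs. (3)-(5) can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the operators used in the paper: the coarse state is stored as nodal values on a grid of spacing $\Delta s = \Delta x / N_c$, the micro model is explicit-Euler lattice diffusion on a periodic lattice (a stand-in for Eq. (1)), and the filter is a periodic moving average. The names `lift`, `evolve`, `restrict` and `coarse_step` are hypothetical.

```python
import math


def lift(v, n_copies):
    """Copy j holds u_l^j(0) = v_0(x_l + j*ds): every n_copies-th node, offset j."""
    return [v[j::n_copies] for j in range(n_copies)]


def evolve(u, dx, dt, n_steps):
    """Stand-in micro model (assumption): explicit Euler for periodic lattice diffusion."""
    n = len(u)
    u = list(u)
    for _ in range(n_steps):
        u = [u[l] + dt * (u[l - 1] - 2.0 * u[l] + u[(l + 1) % n]) / dx ** 2
             for l in range(n)]
    return u


def restrict(copies, window=1):
    """Line up the shifted copies, then apply a periodic moving average of odd width."""
    n_copies, n = len(copies), len(copies[0])
    lined_up = [copies[j][l] for l in range(n) for j in range(n_copies)]
    m, half = len(lined_up), window // 2
    return [sum(lined_up[(i + d) % m] for d in range(-half, half + 1)) / (2 * half + 1)
            for i in range(m)]


def coarse_step(v, n_copies, dx, dt, n_steps, window=1):
    """One application of the coarse time-T map: restrict(evolve(lift(v)))."""
    return restrict([evolve(u, dx, dt, n_steps) for u in lift(v, n_copies)], window)


if __name__ == "__main__":
    n_copies, n_sites = 4, 8
    v = [math.sin(2.0 * math.pi * i / (n_copies * n_sites))
         for i in range(n_copies * n_sites)]
    out = coarse_step(v, n_copies, dx=1.0, dt=0.1, n_steps=10)
    print(max(abs(a - b) for a, b in zip(out, v)))
```

With `window=1` and zero micro steps the map returns its input, which is the consistency requirement $\mathcal{M}\{\mu_j\} = I$ for this simplified pair of operators. A wider averaging window filters more strongly but no longer satisfies that requirement exactly.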
[Figure 2 labels: Parameter; Arclength Cont.; RPM; Bifurcation Results; F: BLACK BOX, here the "Coarse Time Stepper"; Coarse IC; Detailed IC; Micro Timestepper; u(0); u(T).]

Fig. 2. An overview of the coarse time stepper with RPM.

3. A Discrete Traveling Front Example

The effects of discreteness on the propagation of traveling wave solutions have been documented and analyzed in many different settings over the last two decades. From the pinning of traveling waves in discrete arrays of coupled torsional pendula and Hamiltonian models [Ishimori & Munakata, 1982; Peyrard & Kruskal, 1984], to the trapping of coherent structures in dissipative lattices of coupled cells [Keener, 2000; Keener & Sneyd, 1998; Fath, 1998] (see also references therein), the role of spatial discreteness has triggered a large interest in a diverse host of settings. Recent studies have addressed rather extensively the possibility for stable, traveling wave fronts to exist in discrete reaction-diffusion systems; see, e.g. [Zinner, 1991, 1992; Zinner et al., 1993], as well as the more recent work of [Bates et al., 2003; Beyn & Thümmler, 2003].

Herein we focus on an alternative viewpoint (with respect to the above works), namely the one of effective equations. Such models, if capable of describing the nature of the solutions of discrete problems, should successfully capture the effects of discreteness on the traveling wave shape and speed. More importantly, they should be capable of accurately predicting qualitative transitions (bifurcations) that are inherently due to the discreteness. The most prominent of those is probably the pinning of traveling waves and fronts often observed when the lattice spacing becomes sufficiently large. To illustrate the performance of our proposed coarse equation in capturing such a front pinning, we have chosen what is arguably a prototypical spatially discrete problem capable of exhibiting it: a one-dimensional lattice with scalar bistable on-site kinetics and nearest neighbor diffusive coupling between lattice sites. Our test problem is, therefore, a discrete reaction-diffusion system described by

$$\frac{du_\ell}{dt} = \frac{1}{(\Delta x)^2}\,(u_{\ell-1} - 2u_\ell + u_{\ell+1}) + f(u_\ell), \qquad (8)$$
[Figure 3 panels: (a) $u(t, x)$ in the $xt$-plane, horizontal axis $x/\Delta x$; (b) $u(t, x)$ versus $x/\Delta x$ at $t = 0, 2.5, 5, \ldots, 40$.]

Fig. 3. The plot illustrates how the front advances when $\Delta x = 1.75$. In (a) the front in the $xt$-plane is shown; the grayscale is proportional to the solution $u(t, x)$. In (b) the solution as a function of $x$ at different time levels is shown. The time interval is $t \in [0, 40]$. Looking at the spacing between the solution instances, we can see how the front speed varies in a lurching manner.
with

$$f(u) = 2u(u - 1)(\eta - u), \qquad \eta = 0.45. \qquad (9)$$

This can serve as a model of e.g. individual cells in the cardiac tissue which are resistively coupled through gap junctions (see e.g. [Keener, 2000] and references therein). In this case the solution $u_\ell$ would correspond to the electrical potential of the cells. For small $\Delta x$ the system possesses solutions that can be characterized as discrete traveling fronts: see Fig. 3. These solutions have a near constant shape and travel in a "lurching" manner. When $\Delta x$ becomes sufficiently large, front propagation fails (front pinning). In our example, this happens at $\Delta x = \Delta x^* \approx 2.3$, see Fig. 4. The front speed for an infinite lattice approaches the asymptotic "PDE speed" value 0.1 as the lattice size tends to zero.

We will examine how faithful the coarse time stepper is to the properties of the solutions of the full discrete model (8). Our numerical simulations are restricted to a finite domain, using $N = 64$ grid points. At the boundaries, we prescribe Neumann-type conditions

$$u_N - u_{N-1} = 0, \qquad u_0 - u_{-1} = 0.$$

This should model the full problem accurately as long as the (relatively narrow) front is positioned sufficiently far from the boundary.

3.1. Construction of the coarse time stepper

In this section we detail the procedures associated with the coarse time stepper applied to the test problem (8, 9) on the finite interval $I = [0, L]$, where $L = N\Delta x$ and the cell locations are $x_j = j\Delta x$, with $j = 0, \ldots, N - 1$.

Our choice of finite representation of the coarse solution are $M$ nodal values $\mathbf{v}^n = \{v_k^n\}$, $k = 0, \ldots, M - 1$, evaluated at $t = nT$ and $y_k = k\Delta y$, with $M\Delta y = N\Delta x$.

For many solution shapes Fourier interpolation would be a natural interpolation operator realizing the coarse solution $u(nT, x)$ from $\mathbf{v}^n$. We denote direct Fourier interpolation by $\Pi^F$. We could then define the corresponding lifting operators $\mu_j^F$ via the shifting operator $S_s^F : \mathbb{R}^M \to \mathbb{R}^N$,

$$\mu_j^F\mathbf{u} = S_{j\Delta s}^F\mathbf{u}, \qquad (S_s^F\mathbf{u})_\ell := (\Pi^F\mathbf{u})(x_\ell + s), \qquad s > 0,$$

where $\Pi^F$ uses $\{y_k\}$ as interpolation nodes. In our case, however, the solution is not periodic on $I$ and
[Figure 4 plot, titled "Speed versus grid partitioning": front speed $v$ (from 0 to 0.1) versus $\Delta x$ (from 0.8 to 2.6).]

Fig. 4. The speed of the front as a function of $\Delta x$. As the lattice spacing is increased, the speed $v$ approaches zero; the front stops at $\Delta x^* \approx 2.3$.
286 J. Moller et al.

we get large errors if we use $\mathcal{S}^F_s$ directly. Instead we apply Fourier interpolation to the differences of the $v^n$ sequence. We thus use the modified shifting operator $\mathcal{S}_s : \mathbb{R}^M \to \mathbb{R}^N$ given by

\[ \mathcal{S}_s u := C \mathcal{S}^F_s D u, \qquad (Cu)_\ell := 1 + \sum_{j=0}^{\ell} u_j, \qquad (Du)_\ell := \begin{cases} u_0 - 1, & \ell = 0, \\ u_\ell - u_{\ell-1}, & \ell > 0. \end{cases} \tag{10} \]

We then define the lifting operator $\mu : \mathbb{R}^M \to \mathbb{R}^{N \times N_c}$ (acting directly on $v^n$) as

\[ \mu v^n = \{\mu_j v^n\}, \qquad \mu_j v^n := \mathcal{S}_{j\Delta s} v^n, \qquad \Delta s = \frac{\Delta y}{N_c}, \]

where $j = 0, \ldots, N_c - 1$.

The restriction operator $\mathcal{M} : \mathbb{R}^{N \times N_c} \to \mathbb{R}^M$ is also defined using the shifting operators, but now with negative shifts,

\[ \mathcal{S}^F_{-s} : \mathbb{R}^N \to \mathbb{R}^M, \qquad (\mathcal{S}^F_{-s} u)_k := (\Pi^F u)(y_k - s), \qquad s > 0, \]

where $\Pi^F$ uses $\{x_\ell\}$ as interpolation nodes. We then set $\mathcal{S}_{-s} = C \mathcal{S}^F_{-s} D$ and let

\[ \mathcal{M} w := \frac{1}{N_c} \sum_{j=0}^{N_c-1} \mathcal{S}_{-j\Delta s} w_j, \qquad w = \{w_j\}. \]

Note that these choices of $\mu$ and $\mathcal{M}$ are consistent when $N \ge M$. Then, by the sampling theorem, $\mathcal{S}^F_{-s} \mathcal{S}^F_s = I$ on $\mathbb{R}^M$. Moreover, it is easy to see that $CD = DC = I$. Therefore, we also have

\[ \mathcal{S}_{-s} \mathcal{S}_s = C \mathcal{S}^F_{-s} D C \mathcal{S}^F_s D = C \mathcal{S}^F_{-s} \mathcal{S}^F_s D = CD = I \]

on $\mathbb{R}^M$ and consequently,

\[ \mathcal{M} \mu v^n = \frac{1}{N_c} \sum_{j=0}^{N_c-1} \mathcal{S}_{-j\Delta s} \mathcal{S}_{j\Delta s} v^n = \frac{1}{N_c} \sum_{j=0}^{N_c-1} v^n = v^n. \]

We should also remark here that, in the special case when $N = M$, we have

\[ \frac{1}{N_c} \sum_{j=0}^{N_c-1} (\mathcal{S}^F_{-j\Delta s} u_j)_\ell = (P_N \Pi^F U)(x_\ell), \qquad U = \{U_r\}, \]

where $U$ collects the entries of the lined-up copies $u_j$ and $P_N$ is a projection on the $N$ lowest Fourier modes. Hence, if we used direct Fourier interpolation and $M = N$, then our definition of $\mathcal{M}$ is equivalent to lowpass filtering of $U$, the lined-up copies described in Fig. 1, top right. When we replace $\mathcal{S}^F_s$ by $\mathcal{S}_s$ we do not retain exactly this property, and a definition of $\mathcal{M}$ based on simple lowpass filtering is no longer consistent. However, our procedure still corresponds to a type of lowpass filtering, although a more complicated one.

For the time integration of (8) we use the Crank-Nicolson method, treating the nonlinear term explicitly. Thus, with $w^0 = \{w^0_\ell\} \in \mathbb{R}^N$,

\[ \mathcal{T}_T w^0 := w^{N_T} = \{w^{N_T}_\ell\}, \qquad N_T \Delta t = T, \]

where $\{w^n_\ell\}$ are given iteratively by

\[ w^{n+1}_\ell - \frac{\Delta t}{2(\Delta x)^2} \left( w^{n+1}_{\ell-1} - 2 w^{n+1}_\ell + w^{n+1}_{\ell+1} \right) = w^n_\ell + \frac{\Delta t}{2(\Delta x)^2} \left( w^n_{\ell-1} - 2 w^n_\ell + w^n_{\ell+1} \right) + \Delta t\, f(w^n_\ell), \]

for $\ell = 0, \ldots, N - 1$, together with the free boundary conditions

\[ w^n_{-1} - w^n_0 = 0, \qquad w^n_N - w^n_{N-1} = 0. \]

In our computations we use the time step $\Delta t = 0.01$.

3.2. Steady state formulation

The coarse solution $u(nT, x)$ as we have defined it is a (practically) constant shape moving front. In order to convert this moving state into a stationary state, we can factor out the movement through a procedure based on template fitting ([Rowley & Marsden, 2000; Runborg et al., 2002], see also [Chen & Goldenfeld, 1995]) which pins the traveling front at a fixed $x$-coordinate. This is performed by a "pinning-shift" operator, which we denote as $\mathcal{P}$. Our coarse time stepping is then modified from (7) to

\[ v^{n+1} = \mathcal{P} \mathcal{M} \{\mathcal{T}_T \mu_j\} v^n =: G(v^n). \tag{11} \]
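The Crank-Nicolson step described above, with the diffusion term treated implicitly and the nonlinear term explicitly, can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the function names, the tridiagonal (Thomas) solver, and in particular the cubic nonlinearity `bistable` with threshold `a` are assumptions made for the example, since the precise form of $f$ in (8) is not restated here. The ghost-point conditions $w_{-1} = w_0$ and $w_N = w_{N-1}$ are folded into the first and last matrix rows.

```python
def thomas_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system (lower[0] and upper[-1] are ignored)."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def second_difference(w):
    """w[l-1] - 2 w[l] + w[l+1] with ghost values w[-1] = w[0], w[N] = w[N-1]."""
    n = len(w)
    out = []
    for i in range(n):
        left = w[i - 1] if i > 0 else w[0]
        right = w[i + 1] if i < n - 1 else w[n - 1]
        out.append(left - 2.0 * w[i] + right)
    return out


def crank_nicolson_step(w, dx, dt, f):
    """One step: implicit/explicit average of diffusion, explicit f."""
    n = len(w)
    r = dt / (2.0 * dx * dx)
    lap = second_difference(w)
    rhs = [w[i] + r * lap[i] + dt * f(w[i]) for i in range(n)]
    diag = [1.0 + 2.0 * r] * n
    diag[0] = 1.0 + r    # ghost point w[-1] = w[0]
    diag[-1] = 1.0 + r   # ghost point w[N] = w[N-1]
    return thomas_solve([-r] * n, diag, [-r] * n, rhs)


def time_stepper(w0, dx, dt, T, f):
    """Integrate up to the reporting horizon T (N_T * dt = T)."""
    w = list(w0)
    for _ in range(int(round(T / dt))):
        w = crank_nicolson_step(w, dx, dt, f)
    return w


def bistable(u, a=0.45):
    """Hypothetical cubic nonlinearity; an assumption for illustration only."""
    return u * (1.0 - u) * (u - a)
```

With `f` identically zero, the scheme preserves constant states and conserves the sum of the lattice values, which gives two cheap consistency checks on the boundary treatment.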
Equation-Free, Effective Computation for Discrete Systems 287
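The identity $CD = DC = I$ used for the operators in (10) is easy to check numerically. The sketch below is an illustration under one stated assumption: the second case of $D$, $(Du)_\ell = u_\ell - u_{\ell-1}$ for $\ell > 0$, is taken to be the backward difference, which is the choice that makes $C$ and $D$ mutually inverse. The sample front is hypothetical.

```python
def C(u):
    """(Cu)_l = 1 + sum_{j=0}^{l} u_j  (cumulative sum offset by 1)."""
    out = []
    total = 1.0
    for value in u:
        total += value
        out.append(total)
    return out


def D(u):
    """(Du)_0 = u_0 - 1, (Du)_l = u_l - u_{l-1} for l > 0 (assumed form)."""
    return [u[0] - 1.0] + [u[l] - u[l - 1] for l in range(1, len(u))]


# A hypothetical front going from 1 on the left to 0 on the right.
front = [1.0, 1.0, 0.7, 0.2, 0.0, 0.0]
differences = D(front)     # vanishes at both ends of the domain
round_trip = C(differences)  # recovers the front up to rounding
```

The front itself is not periodic, but its differences vanish at both ends of the domain, which is why Fourier interpolation is applied to $Du$ rather than to $u$.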

This formulation has a steady state at the constant shape moving front.

Let us start from the basic, Fourier based, pinning-shift operator $\mathcal{P}^F : \mathbb{R}^M \to \mathbb{R}^M$. After introducing a template function $S(x)$, we define

\[ \mathcal{P}^F w := \mathcal{S}^F_c w, \qquad c = \arg\max_{c'} \int_0^L (\Pi^F w)(x + c')\, S(x)\, dx. \tag{12} \]

Hence, $\mathcal{P}^F w$ is the shifted version of $w$ that best fits the template $S(x)$, in the sense that it maximizes the $L^2$-inner product between its Fourier interpolant and $S$. Upon convergence, the effective front speed $v$ can be deduced from the converged value of $c$ and the time reporting horizon $T$ simply by taking $v = c/T$. With the template $S(x) = 1 - \cos(2\pi x/L)$ we can compute the inner product in (12) explicitly,

\[ \frac{1}{L} \int_0^L (\Pi^F w)(x + c')\, S(x)\, dx = \hat{w}_0 - \Re\left( \hat{w}_1 e^{ic'} \right), \tag{13} \]

where $\hat{w}_k$ are the Fourier coefficients of $w$. Hence, since $\hat{w}_0$ is real, $c$ in (12) should be chosen such that $\hat{w}_1 e^{ic}$ is real and negative. This is easily implemented numerically together with the Fourier shift $\mathcal{S}^F_c$.

For the same reasons as in the implementation of the coarse time stepper, we would like to avoid direct Fourier interpolation of the solution, since it is not periodic. Therefore, we modify $\mathcal{P}^F$ to operate on differences instead. In the same spirit as in Sec. 3.1, we let

\[ \mathcal{P} := C \mathcal{P}^F D, \]

with $C$ and $D$ defined in (10). We still use the effective propagation speed given by $\mathcal{P}^F$.

An important property of the Fourier based pinning shift operator is that it satisfies $(\mathcal{P}^F)^2 = \mathcal{P}^F$, which follows from the sampling theorem [Runborg et al., 2002]. For other types of interpolation, such as piecewise polynomial interpolation, the pinning shift operator will not have this property and a steady moving coarse shape may not translate into a fixed point for (11). Our modification still has this property though, since

\[ \mathcal{P}^2 = C \mathcal{P}^F D C \mathcal{P}^F D = C (\mathcal{P}^F)^2 D = C \mathcal{P}^F D = \mathcal{P}, \]

where we used the fact that $DC = I$.

3.3. The RPM with pseudo-arclength continuation

RPM is an iterative procedure which can accelerate the location of fixed points of processes; under certain conditions it can help locate steady states of dynamic processes (in particular, discretized parabolic PDEs). It can be an acceleration technique for the solution of nonlinear equations, and a stabilizer of unstable numerical procedures (as first presented in [Shroff & Keller, 1993]). Consider the fixed point problem

\[ F(u; \lambda) = u, \tag{14} \]

and let $J$ be the Jacobian of $F$.

• Like the Newton method, RPM can converge rapidly to the fixed point solution $u^*$ provided the initial guess is good enough; the convergence occurs even if $J(u^*)$ has a few eigenvalues larger than one. The computational cost and convergence rate depend on the eigenvalues of $J$. Optimally there should be a clear gap in the spectrum between small and large (near the unit circle) eigenvalues and a limited number of large (in norm) eigenvalues for RPM to perform well.
• $J$ never needs to be evaluated directly, only $F$. We can therefore apply RPM to any "black box" code that defines a function $F$; it is a "matrix-free" method.
• As a by-product, RPM also computes approximations of the largest eigenvalues of $J$. This gives approximate stability information about the fixed point.

When RPM is used for the computer-assisted bifurcation analysis of steady states of (usually dissipative evolution) PDEs, the function $F$ represents a time stepper: a subroutine that takes initial data and reports the solution of the PDE after some fixed time (the reporting horizon $T$). A fixed point then satisfies (14). The conventional way of finding the steady state using a time stepper would be to call it many times in succession, in effect integrating the PDE for a long time, which corresponds to solving (14) by simple fixed point (Picard) iteration.

RPM can improve this approach in two important respects. First, the convergence can be significantly accelerated. The nature of many transport PDEs usually encountered in engineering modeling (the action of viscosity, heat conduction, diffusion, and the resulting spectra) dictates that there exists a separation of time-scales, which translates

into an eigenvalue gap in the spectrum of $J$ at the steady state. Second, RPM converges even if the steady state is slightly unstable, i.e. when $J$ has a few eigenvalues outside the unit circle. It may thus be possible to compute (mildly) unsteady branches of the bifurcation diagram using forward integration (but in a nonconventional way, dictated by the RPM protocol). RPM still retains the simplicity of the fixed point iteration, in the sense that no more information is needed than just the time-integration code. This code, which may be a legacy code, and can incorporate the best physics and modeling available for the process, is used by RPM as a black box.

RPM can be seen as a modified version of fixed point iteration. It adaptively identifies the subspace corresponding to large (in norm) eigenvalues of $J$, hence the directions of slow or unstable time-evolution in phase space. In these directions the fixed point iteration is replaced by (approximate) Newton iteration. More precisely, suppose $F : \mathbb{R}^N \times \mathbb{R} \to \mathbb{R}^N$ in (14). Let $\mathbb{P}$ be the maximal invariant subspace of $J$ corresponding to the $m$ largest eigenvalues and let $\mathbb{Q}$ be its orthogonal complement in $\mathbb{R}^N$. The solution $u$ is decomposed as $u = p + q = Pu + Qu$, where $P$ and $Q$ are the projection operators in $\mathbb{R}^N$ on $\mathbb{P}$ and $\mathbb{Q}$. These are constructed from an orthogonal basis $V_p$,

\[ P = V_p V_p^T, \qquad Q = I - V_p V_p^T. \]

In a pseudo-arclength continuation context the solution $u = u(s)$ and $\lambda = \lambda(s)$, where $s$ parameterizes the bifurcation curve. In addition to (14) we then use an algebraic equation to be able to handle turning points,

\[ S(u, \lambda, \Delta s) = \frac{\|u(s) - u(s - \Delta s)\|^2}{\Delta s} + \frac{|\lambda(s) - \lambda(s - \Delta s)|^2}{\Delta s} - \Delta s = 0, \tag{15} \]

where $u(s - \Delta s)$ and $\lambda(s - \Delta s)$ refer to the converged solution at the previous point on the continuation curve.

The solution is advanced using a predictor-corrector method. Via extrapolation from previous points $u_i = u(s_i)$, $\lambda_i = \lambda(s_i)$ and $\Delta s_i = s_{i+1} - s_i$, the predictor-solution is obtained. Comparing a first-order extrapolation

\[ \lambda^* = \lambda_i + \frac{\lambda_i - \lambda_{i-1}}{\Delta s_{i-1}} \Delta s_i, \qquad u^* = u_i + \frac{u_i - u_{i-1}}{\Delta s_{i-1}} \Delta s_i, \]

with a second-order extrapolation,

\[ \lambda^{**} = \lambda^* + \frac{1}{2} \frac{\lambda_i (1 - \gamma) - 2 \lambda_{i-1} + (1 + \gamma) \lambda_{i-2}}{\Delta s_{i-1} \Delta s_{i-2}} \Delta s_i^2, \]

\[ u^{**} = u^* + \frac{1}{2} \frac{u_i (1 - \gamma) - 2 u_{i-1} + (1 + \gamma) u_{i-2}}{\Delta s_{i-1} \Delta s_{i-2}} \Delta s_i^2, \qquad \gamma = \frac{\Delta s_{i-1} - \Delta s_{i-2}}{\Delta s_{i-1} + \Delta s_{i-2}}, \]

and requiring that

\[ \max\left( \|u^{**} - u^*\|, |\lambda^{**} - \lambda^*| \right) < \epsilon \tag{16} \]

the stepsize is determined. Here $\epsilon$ is a user specified tolerance. As the corrector method, we use RPM with pseudo-arclength continuation, see [Shroff & Keller, 1993; Lust, 1997]. Starting from $u^0 = u^{**}$ and $\lambda^0 = \lambda^{**}$, the iterative scheme is given by

\[ q^{n+1} = Q F(u^n, \lambda^n), \]

\[ \begin{pmatrix} V_p^T J V_p - I & V_p^T F_\lambda \\ S_u^T V_p & S_\lambda \end{pmatrix} \begin{pmatrix} \Delta p^n \\ \Delta \lambda^n \end{pmatrix} = - \begin{pmatrix} V_p^T F(p^n + q^{n+1}, \lambda^n) - p^n \\ S(p^n + q^{n+1}, \lambda^n) \end{pmatrix}, \]

\[ u^{n+1} = p^n + V_p \Delta p^n + q^{n+1}, \qquad \lambda^{n+1} = \lambda^n + \Delta \lambda^n, \]

where the left-hand side consists of partial derivatives of $S$ in (15) and of $F$ in (14) with respect to $u$ and $\lambda$. The iterates $u^n = p^n + q^n$ will converge to the solution of (14) under the assumptions discussed above. If the number of large norm eigenvalues, $m$, is limited, the dimension of $\mathbb{P}$ and the projected Jacobian in the Newton iteration, $V_p^T J V_p - I$, remains small. Only this small matrix needs to be inverted. For a more complete description of RPM we refer to [Shroff & Keller, 1993].

4. Numerical Results

In this section we present some numerical results using the coarse time stepper and the procedure

described above to simulate an effective equation for the discrete problem in (8). We will start by discussing the "exact" bifurcation diagram of the discrete system, which we attempt to approximate. We will then show results obtained through the coarse time stepper, and discuss the effect of time stepper "construction parameters" like the reporting time horizon, $T$ (the time to which (1) is integrated within the coarse time stepper), and the number of different initial shifted copies, $N_c$.

Figure 5 shows the bifurcation diagram of the discrete problem as a function of the parameter $\Delta x$, the lattice spacing, in the regime close to the onset of pinning. For lattice spacings smaller than $\Delta x^* \approx 2.3$ the system has, as we discussed, an attracting, front-like solution that travels; its motion is modulated as it "passes over" the lattice points. For an infinite lattice, this modulated traveling solution possesses a discrete translational invariance: $u_{\ell+1}(t + \tau) = u_\ell(t)$. The shape of the modulating front is shifted by one (resp. $2, 3, \ldots, n$) lattice spacing after time $\tau$ (resp. $2\tau, 3\tau, \ldots, n\tau$); this helps us define its effective speed $v(\Delta x) = \Delta x/\tau$ (see Fig. 4). As $\Delta x$ approaches zero, for an infinite lattice, the discrete front approaches the continuum front of the PDE, and its speed ($\Delta x$ divided by the period of the modulation) approaches the PDE front speed, 0.1 (see Fig. 4).

If we identify shapes shifted by one lattice constant, the attractor appears as a limit cycle with period $\tau$. As the lattice spacing approaches the critical value $\Delta x^*$ the speed of propagation approaches zero (the period of the "limit cycle" approaches infinity); asymptotically, $v(\Delta x) \sim |\Delta x - \Delta x^*|^{0.5}$. As discussed in [Kevrekidis et al., 2001; Carpio & Bonilla, 2003a, 2003b] what occurs is a Saddle-Node Infinite Period (SNIPER) bifurcation: a saddle-node bifurcation where both new fixed points appear "on" the limit cycle. For larger values of $\Delta x$ the "saddle" and the "node" move away from each other, and what used to be the limit cycle is now composed of the saddle, the node, and both sides of the one-dimensional unstable manifold of the saddle, which asymptotically approaches the node.

The saddle and the node are, of course, stationary fronts. A pair of them exists for every "unit cell": all "node fronts" are shifts of each other by one lattice spacing, and all "saddle fronts" are also shifts of each other by one lattice spacing. Since the medium has a discrete translational invariance, this makes sense: if an initial condition gives rise to a front eventually pinned at some location in the discrete medium, the shift of this initial condition by one lattice spacing will eventually get trapped one lattice spacing further. This saddle-node bifurcation can be seen in Fig. 5(a); linearizing around the saddle front will give a positive eigenvalue $\lambda_s$, while the corresponding eigenvalue $\lambda_n$ for the node front would be negative. Since we look at the problem in discrete time, what is plotted is the multiplier $\mu_{n,s} = \exp(\lambda_{n,s} T)$, where $T$ is the reporting horizon. The saddle front has a multiplier larger than 1, while the corresponding multiplier for the stable node is less than 1; both multipliers asymptote to 1 at the SNIPER ($\Delta x^*$).

Figure 5(b) shows the bifurcation diagram in terms of the front traveling speed. Since both the saddle and the node fronts are pinned (have zero speed) they both fall on the zero axis; we plotted their eigenvalues in Fig. 5(a) to distinguish between them. The true traveling speed (broken line) is compared with the effective traveling speed predicted by a coarse time stepper using $N_c = 5$ copies within each unit cell, and a reporting horizon of $T = 32$. The coarse time stepper speed is a byproduct of fixed point computation and continuation with it; short bursts of detailed simulation are used in the RPM framework to construct a contraction mapping that converges to a fixed point of the time stepper. The final shift upon convergence (from the pinning-shift computation), divided by the time stepper reporting horizon, gives us an estimate of the "effective speed". Inspection of Fig. 5(b) indicates that the coarse time stepper never predicts a speed that is exactly zero; yet it gives a good approximation of the effective speed, all the way from small $\Delta x$ to the near neighborhood of the pinning transition, when the effective speed becomes small.

We will return to discussing this issue of "small residual motion" for the coarse time stepper shortly. To give an indication of when the procedure stops being quantitative, we have included the $\Delta x/T$ curve in Fig. 5(b): disagreement starts well in the regime where the effective movement is less than one unit cell per observation period. In the next section we will compare the "goodness of approximation" of our coarse time stepper to the effective speed predicted by the Padé approach to extracting effective continuum equations. It is interesting that the coarse time stepper sometimes predicts a small hysteresis loop at low speeds, relatively close to "true pinning"; notice in Fig. 5(a) the unstable (larger than one) multipliers for the brief saddle

[Figure 5: (a) multipliers (logarithmic vertical axis) versus $\Delta x$; (b) effective front speed $v$ versus $\Delta x$. Curves are labeled "Real Problem", "Coarse Problem" and, in (b), $\Delta x/T$.]

Fig. 5. Detailed bifurcation diagram and coarse time stepper bifurcation diagram with parameters $N_c = 5$, $T = 32$. (a) Multipliers versus $\Delta x$, (b) effective front speed versus $\Delta x$.

part of this loop. We will discuss a tentative rationalization of this below.

Figure 6 illustrates the effects of "time stepper construction" parameters on the effective behavior predicted by the time stepper: the reporting time-horizon, for two different sets of shifted copies ($N_c = 3$ and $N_c = 10$) as well as the effect of the number of copies for a fixed time horizon ($T = 16$). Augmenting the time stepper reporting horizon is shown in Figs. 6(a) and 6(b); clearly, in both cases, extending the time stepper reporting horizon extends the region over which its effective speed agrees with the true problem closer to $\Delta x^*$. Larger numbers of copies ($N_c = 5, 10, 20$) also perform slightly better than smaller numbers ($N_c = 3$). In all cases the qualitative behavior is the same: (a) successful approximation of the effective speed until reasonably close to true pinning; (b) all differences occur when the average front motion is significantly less than one unit cell per reporting horizon; (c) there is always a slight residual motion, which (possibly after a small hysteresis loop close to true pinning) eventually becomes negligible.

We now turn to the discussion of the slight residual motion of the coarse time stepper at large $\Delta x$ beyond $\Delta x^*$. For an infinite domain, the saddle and node pinned fronts appearing there are invariant to translations by one lattice spacing; for a large enough computational domain we still see two pinned front solutions per cell. When we "sprinkle" initial conditions along the cell, depending on their location with respect to the saddle front, the trajectories may either be attracted to the stable node "to the right" or to the one "to the left" of the saddle. It is instructive to represent these solutions as in Fig. 7(a), in a way that identifies the "right" node front with the "left" one; here translation along the lattice corresponds roughly to rotation along the circle. The node is denoted by a black circle, and the saddle by a white one. The small squares represent the initial positions of our initial condition "copies". The fate of our distribution of initial conditions is

[Figure 6, panels (a), (b), (c): effective front speed $v$ versus $\Delta x$; annotations "Real Problem" and "Increasing Time".]

Fig. 6. Effective front speed versus $\Delta x$ and the effect of varying the time horizon $T$ and the number of copies $N_c$. (a) Varying time horizon, $T = 10, \ldots, 50$, with fixed $N_c = 3$. Dashed lines show $\Delta x/T$, i.e. speed required to traverse one cell, (b) varying time horizon, $T = 5, \ldots, 16$, with fixed $N_c = 10$, (c) varying number of copies, $N_c = 3, 5, 10, 20$, with fixed $T = 16$.

[Figure 7, panels (a), (b), (c); panel (b) has $\Delta x$ on the horizontal axis and one curve per copy, labeled by copy number 1 to 5.]

Fig. 7. Movement of the individual copies, for $N_c = 5$, $T = 32$. (a) Schematic movement of copies in phase space, (b) distance traversed by copies, (c) real movement of copies in phase space for $\Delta x = 1.6$ (left) and $\Delta x = 2.3$ (right). (See text for specification of axes.)

governed by their initial "angle" on the circle: as our time horizon grows all initial conditions will asymptote to a stable front, either the left one (moving counterclockwise on the circle) or the right one (clockwise movement). We now see clearly the physical reason behind the net residual motion for any finite time horizon for the coarse time stepper. An initial condition that is put down "at random" in a unit cell deep in the pinned regime, even if it never exits this unit cell, will gradually traverse the part of the circle separating it from the closest node front.

When the critical parameter value is approached from the pinned side, the saddle and the node fronts approach each other on the circle, on their way to coalescing at the SNIPER bifurcation point [Kevrekidis et al., 2001; Carpio & Bonilla, 2003a, 2003b]. Figure 7(b) shows how this process becomes manifest in the coarse time stepper computations, using the problem in Fig. 5 as our example. Deep in the pinning regime (high $\Delta x$, marked $\alpha$) the relative "phase" of the saddle and the node pinned fronts on the circle remains roughly constant. The distance each member of our ensemble of initial conditions has traversed during one time horizon can be deduced from Fig. 7(b): the copy with the largest negative movement is the one closest to the saddle but on its left (copy number two). One can similarly rationalize the labeling of the remaining curves in Fig. 7(b). When $\Delta x$ is reduced approaching the onset of pinning, at some point the saddle front starts moving appreciably towards the node front. As part of this movement, it "sweeps" the circle counterclockwise; at $\Delta x \approx 2.8$ it has its first encounter with one of our initial conditions, the closest one on the left. When the saddle "moves past" it into the regime marked $\beta$, this copy, which was responsible for the largest negative displacement, now approaches asymptotically the node front on the right, performing the largest positive displacement (and so on for the remaining copies). Eventually, in the propagating regime, marked $\gamma$, and for long enough reporting horizons, the initial "phase" difference (a fraction of a cell) becomes negligible compared to the net displacement of each point (several cells).

The real movement in phase space is shown in Fig. 7(c) for two different $\Delta x$. In these subfigures, the $x$-axis represents $\sin(2\pi x_c)$ where $x_c$ corresponds to the location of the front, more specifically $x_c = \sum_\ell (Du)_\ell\, \ell$. The $y$-axis represents $\max_\ell |(Du)_\ell|$. The initial positions of the copies are indicated by small squares and their locations at $t = T$, the time horizon, are marked by filled circles. The labels refer to the same copies as in Fig. 7(b).

As the reporting time horizon of the time stepper goes to infinity, it is clear that one can compute the average residual movement from the asymptotic position of the saddle front, i.e. from the relative extent of the circle "to the right" and "to the left" of the saddle front. The most reasonable point to "declare" as an estimate of the true pinning from coarse time-stepper computations would come from a polynomial extrapolation of the "successful" regime (close to the tip of the "apparent parabola" in Fig. 5); alternatively, a value of $\Delta x$ where the speed is small enough (well below one unit cell per time horizon) and its variation with number of copies and time horizon is below a user-prescribed tolerance, would also serve this purpose. While there is no well-defined pinning bifurcation for the coarse time stepper (since pinning is an inherently nontranslationally invariant bifurcation), the procedure can provide a good approximation of the effective shape and speed of the traveling fronts, as well as "common sense" ways of numerically estimating the true pinning.

5. An Alternative Continuum Approach: Padé Approximations

In this section, we propose an alternative scheme for capturing effects of discreteness, by means of a (now explicit) continuum equation. This PDE is obtained by means of Padé approximations [Cabannes, 1976; Elphick et al., 1990] which can be used to approximate discreteness in a quasi-continuum way, through the use of pseudo-differential operators. In particular, starting from the Taylor expansion for analytic functions, see e.g. [Christiansen et al., 2001],

\[ u(x + m) = \exp(m \partial_x)\, u(x), \]

one can then express spatial discreteness as

\[ u_{\ell+1} + u_{\ell-1} - 2 u_\ell = \left( \exp(\Delta x\, \partial_x) + \exp(-\Delta x\, \partial_x) - 2 \right) u(x) = 4 \sinh^2\!\left( \frac{\Delta x\, \partial_x}{2} \right) u(x, t). \]
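The operator identity above can be checked on a single Fourier mode $u_\ell = e^{ik\ell\Delta x}$, for which $\partial_x$ acts as multiplication by $ik$, so that the second difference multiplies the mode by $4\sinh^2(ik\Delta x/2) = -4\sin^2(k\Delta x/2)$. The short sketch below is only a numerical sanity check; the function names and the sample values of $k$ and $\Delta x$ are chosen for illustration.

```python
import cmath
import math


def second_difference_symbol(k, dx):
    """Factor multiplying exp(i k x) under u(x+dx) + u(x-dx) - 2 u(x)."""
    return cmath.exp(1j * k * dx) + cmath.exp(-1j * k * dx) - 2.0


def sinh_form(k, dx):
    """The same factor written as 4 sinh^2(dx * d/dx / 2) with d/dx -> i k."""
    return 4.0 * cmath.sinh(1j * k * dx / 2.0) ** 2


def apply_to_mode(k, dx, ell):
    """Second difference of the lattice mode u_l = exp(i k l dx) at site ell."""
    u = lambda l: cmath.exp(1j * k * l * dx)
    return u(ell + 1) + u(ell - 1) - 2.0 * u(ell)


# Largest mismatch between the two forms over a few sample values.
max_mismatch = max(
    abs(second_difference_symbol(k, dx) - sinh_form(k, dx))
    for k in (0.1, 1.0, 2.5)
    for dx in (0.5, 2.3)
)
```

Both forms reduce to the real, non-positive number $-4\sin^2(k\Delta x/2)$, which is bounded below by $-4$; this bound is what distinguishes the lattice operator from the unbounded continuum Laplacian symbol $-k^2\Delta x^2$.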

Expanding $\exp(\pm \Delta x\, \partial_x)$ [Elphick et al., 1990], one then obtains

\[ \exp(\pm \Delta x\, \partial_x) = 1 + \frac{1}{2} \Delta x^2 \partial_x^2 \left( 1 + \frac{\Delta x^2}{12} \partial_x^2 + \cdots \right) \pm \Delta x\, \partial_x \left( 1 + \frac{1}{6} \Delta x^2 \partial_x^2 + \cdots \right). \]

Finally, regrouping the terms in the manner of Padé [Cabannes, 1976; Elphick et al., 1990] yields

\[ \exp(\pm \Delta x\, \partial_x) \approx 1 + \frac{1}{2} \frac{\Delta x^2 \partial_x^2}{1 - \frac{\Delta x^2}{12} \partial_x^2} \pm \frac{\Delta x\, \partial_x}{1 - \frac{\Delta x^2}{6} \partial_x^2}. \tag{17} \]

We now use the pseudo-differential operator approximation in (17) to convert the discrete model in (8) into the PDE approximation of the form:

\[ u_t = \frac{\partial_x^2}{1 - \frac{\Delta x^2}{12} \partial_x^2}\, u + f(u). \tag{18} \]

Such approaches were introduced and used extensively by Rosenau and collaborators [Rosenau, 1986, 1987, 1989, 1992; Doering et al., 1987] to regularize nonlinear wave equations, particularly of the Klein-Gordon type.

Equation (18) clearly emulates the discrete setting in some key aspects of the relevant spectral operator properties (i.e. the discrete Laplacian in comparison with the pseudo-differential operator of (18)). For example, considering plane wave solutions of the form $\exp(\lambda t - ikx)$, we obtain in the discrete case the linearized dispersion relation (around a uniform state $u = u_{\rm hom}$)

\[ \lambda = \frac{2}{\Delta x^2} \left( \cos(k \Delta x) - 1 \right) + f'(u_{\rm hom}). \]

In the case of (18), the corresponding equation becomes

\[ \lambda = - \frac{k^2}{1 + \frac{\Delta x^2}{12} k^2} + f'(u_{\rm hom}). \]

Apart from sharing the continuum limit, the two dispersion relations share another qualitative feature which is particularly important [Rosenau, 1986, 1987, 1989, 1992; Doering et al., 1987]; namely, the presence of a lower bound in the continuous spectrum. Notice, however, that the two lower bounds are different ($f'(u_{\rm hom}) - 4/\Delta x^2$ in the discrete case versus $f'(u_{\rm hom}) - 12/\Delta x^2$ in the Padé approximation).

It would then be of interest to alleviate this spectral discrepancy, as well as to match the discrete operator (if possible) to a higher order in the Taylor expansion

\[ \frac{u_{n+1} + u_{n-1} - 2 u_n}{\Delta x^2} = \sum_{m=1}^{\infty} \frac{2 \Delta x^{2m-2}}{(2m)!}\, \partial_x^{2m} u = u_{xx} + \frac{\Delta x^2}{12} u_{4x} + \frac{2 \Delta x^4}{6!} u_{6x} + O(\Delta x^6). \]

This can be achieved by a natural generalization in the form of a continued fraction such as e.g.

\[ \cfrac{\partial_x^2}{1 - \cfrac{A \partial_x^2}{1 - \cfrac{B \partial_x^2}{1 - C \partial_x^2}}}. \tag{19} \]

In order to use (19) in practice (i.e. for computational purposes), we convert the three fractions into one of the form

\[ \frac{\partial_x^2 \left( 1 + \alpha \Delta x^2 \partial_x^2 \right)}{1 + (\alpha + \beta) \Delta x^2 \partial_x^2 + \gamma \Delta x^4 \partial_x^4}, \tag{20} \]

where a simple (algebraic) reduction of $A, B, C$ to $\alpha, \beta, \gamma$ has been used. We then use Taylor expansion of the denominator to convert the expression of (20) into one resembling (19). By matching up to $O(h^6)$ the exact Taylor expansion, we obtain three algebraic equations for $\alpha$, $\beta$ and $\gamma$. In this way, we obtain a set of solutions for $\alpha$, $\beta$ and $\gamma$. We use here the set $\alpha = -0.007912$, $\beta = -1/12$, $\gamma = 0.002056$. An additional benefit (to the matching of the Taylor expansion up to correction terms of $O(h^8)$) that should be highlighted here is the value $\alpha/(\gamma \Delta x^2) = 3.848/\Delta x^2$ of the lower bound expression for $\lambda$, which is much closer to the theoretical lower bound of $4/\Delta x^2$ than the prediction $12/\Delta x^2$ of the leading order approximation presented previously. The resulting evolution equation will then read:

\[ u_t = \frac{\partial_x^2 \left( 1 + \alpha \Delta x^2 \partial_x^2 \right)}{1 + (\alpha + \beta) \Delta x^2 \partial_x^2 + \gamma \Delta x^4 \partial_x^4}\, u + f(u). \tag{21} \]

Both (18) and (21) can be numerically implemented in a straightforward manner, by means of

the spectral techniques described in [Kevrekidis et al., 2002]. We have performed numerical simulations of the front propagation, using 1024 modes in the spectral decomposition of (18) and (21). We will refer to these equations as the (Padé) models A and B, respectively. A fourth order Runge-Kutta algorithm has been used for the time integration. For each value of $\Delta x$, we identify the position $x_c$ of the front as the point where the ordinate of the front acquires the value $u = 1/2$. The linear interpolation scheme suggested in [Boesch et al., 1989] has been implemented and has proved to be an efficient front tracking algorithm in all the examined cases.

Our results of this quasi-continuum approach to the discrete problem can be summarized in Figs. 8 and 9. Figure 8 shows the speed of the fronts in Padé models A and B, respectively. We can observe that the critical value of $\Delta x$ beyond which trapping of the front occurs is significantly displaced from the actual one of $\Delta x^* \approx 2.3$, for $\eta = 0.45$. In particular, for model A, $\Delta x^* \approx 6.4$, while for model B, the corresponding critical value is $\Delta x^* \approx 3.8$. We can deduce that the latter model is closer to the actual physical reality, even though the relevant prediction is still considerably higher than its actual value for the discrete model.

In part at least, these results (and the discrepancy from the actual discrete case) can be justified by observing Fig. 9. The bottom panel of the figure suggests that the only way in which the front can stop in these quasi-continuum Padé approximations is by becoming practically a vertical shock-like structure. In this case, the "mass" of the front which is given by $\int_{-\infty}^{\infty} u_x^2\, dx$ (see e.g. [Boesch et al., 1989] and references therein) becomes practically infinite. This means that the inertia of the front becomes too big for the front to move and hence "pinning" occurs. However, notice that this process of pinning is significantly different than the details of the discrete structure of the problem (such as e.g. the saddle-node bifurcation and the transition to pinned solutions). The translationally invariant quasi-continuum Padé approximations of models A and B do not "see" such features. Instead, they incorporate the well-known feature of front steepening for stronger discreteness [Peyrard & Kruskal, 1984] and the criticality of the latter feature eventually leads to pinning.

An additional pointer to the fact that such (pseudo-differential operator) models are "eligible" to pinning is that they are devoid of some of the important symmetries that are inherently related

[Figure 8: front speed $v$ versus $\Delta x$; top panel compares "Padé A" with "Real problem", bottom panel compares "Padé B" with "Real problem".]

Fig. 8. Effective front speed as a function of $\Delta x$, for the Padé model A (top panel) and model B (bottom panel).

[Figure 9: two panels; the bottom panel shows the front profile against $x/\Delta x$.]

Fig. 9. The figure shows the time evolution of the front for model A and for $\Delta x = 6.4$. The top panel shows the time evolution of the front center which eventually leads to trapping. The bottom panel shows the final front configuration of the numerical simulation at $t = 150$.

to traveling such as the Galilean invariance in the case of continuum bistable equation or the Lorentz invariance of its Hamiltonian (nonlinear Klein-Gordon) analog.

6. Summary and Discussion

We presented a computer-assisted approach for the solution of effective, translationally invariant equations for spatially discrete problems without deriving these equations in closed form. Assuming that such an equation exists, its time-one map is approximated through the coarse time stepper, constructed through an ensemble of appropriately initialized simulations of the detailed discrete problem. Combining the coarse time stepper with matrix-free based numerical analysis techniques, e.g. contraction mappings such as RPM, can then help analyze the unavailable effective equation. We are currently exploring the use of our coarse time stepper with coarse projective integration [Gear & Kevrekidis, 2003; Gear et al., 2002; Kevrekidis et al., 2003; Rico-Martinez et al., 2004]. Matrix-free eigenanalysis techniques should also be explored, especially since they can help test the "fast slaving" hypothesis underlying the existence of a closed effective equation (see, for example, the discussion in [Makeev et al., 2002; Hummer & Kevrekidis, 2003]).

We also presented initial computational results exploring the effect of certain "construction parameters" of the approach: the number of shifted copies in the ensemble of initial conditions, as well as the time-horizon used. We included a comparison between our approach and a particular way of obtaining explicit approximate translationally invariant evolution equations for such a problem (the Padé approximation). More work is necessary along these lines, exploring the relation of our approach with traditional homogenization methods at small lattice spacings. A discrete problem whose detailed solution can be obtained explicitly (perhaps a piecewise-linear kinetics problem) or at least approximated very well analytically over short times, would be the ideal context in which to study these issues.

Several extensions of the approach can be envisioned, and might be interesting to explore. A time stepper based approach can be applied without modification to hybrid discrete-continuum media,

e.g. continuum transport with a lattice of sources or sinks, such as cells secreting ligands into and binding them back from a liquid solution [Pribyl et al., 2003]. It is clear that it can be tried in more than one dimension, and for regular lattices of different geometry. For irregular lattices the averaging "over all shifts" we performed here for periodic media can be substituted with a Monte Carlo sampling over the distribution of possible lattices that takes into account what we know about the statistical geometry of the lattices. In this paper we assumed that an equation existed and closed for the expected shape of the solution. Conceivably one can attempt to develop time steppers not only for the expectation (the first moment of a distribution of possible results), but, say, for the expectation and the standard deviation of possible results; the lifting operator would then have to be appropriately modified. Finally, our time stepper here was built on short simulations of the entire detailed discrete system in space. Hybrid simulations, where a known, explicit effective equation is accurate over

is smeared out and rendered a "continuum transition" (see, for example, materials science models of the onset of movement of a front [Cahn, 1962; Maroudas & Brown, 1991]). On the other hand, one might argue that this is an acceptable, and possibly optimal way for a continuum equation to represent the discrete bifurcation to pinning. We can see that other procedures, such as the discreteness-emulating Padé type ones, lose a lot of the quantitative structure of the relevant transition. On the other hand, if a continuum differential (as opposed to pseudo-differential) equation was constructed to "model" this transition, the latter would possess other artificial features such as a topologically mandated, unstable branch of traveling wave solutions. See e.g. [Kness et al., 1992] and references therein. It is conceivable that the short hysteresis loop sometimes predicted by the coarse time stepper close to pinning conditions is a "vestige" of this unstable branch that translationally invariant equations would necessarily predict. In conclusion, it can be appreciated that
part of the physical domain can be done; an "over- genuinely discrete problems and continuum ones
all hybrid coarse" time stepper (explicit equation have inherent differences1 that cannot be fully cap-
over part of the domain, and the coarse time step- tured by emulating (or "summarizing") the one con-
per in this paper over the rest of the domain) will text through the other. Nevertheless, the approach
then be used. In a multiscale context, we have pro- proposed here, combined with a "common sense"
posed "gaptooth" and "patch dynamics" simula- interpretation of its results with respect to the gen-
tions [Gear et al, 2003; Kevrekidis et al, 2003], uinely discrete problem, performs in a satisfactory
where the present coarse time stepper integrations way for the modeler, even for the "most different"
are performed not over the entire domain, but over a features between discrete and continuum models.
mesh of small computational "boxes". Both hybrid
and "gaptooth" simulations, if possible, require
careful boundary conditions for the "handshaking" Acknowledgments
between the continuum equation and the discrete Part of the research for this paper was carried
simulations, or the discrete simulations in distant out while Olof Runborg held a post-doctoral
boxes, effectively implementing smoothness of the appointment with the Program for Applied
solution of the unavailable effective equation (e.g. and Computational Mathematics at Princeton
[Kevrekidis et al, 2003; Li et al, 1998a, 1983b; University, supported by NSF KDI grant DMS-
Shenoy et al, 1999; E & Huang, 2001]). 9872890. Panayotis G. Kevrekidis gratefully
We close with a discussion of the "onset of acknowledges support from a UMass FRG, NSF-
pinning", the transition around which our test DMS-0204585 and from the Eppley Foundation
example of the coarse time stepper was focused. for Research. Kurt Lust is a postdoctoral fellow
Continuum effective equations such as the ones dis- of the Fund for Scientific Research-Flanders. This
cussed here through the numerical time-stepping paper presents research results of the Belgian Pro-
procedure do not, strictly speaking, possess a bifur- gramme on Interuniversity Poles of Attraction, ini-
cation at the critical point of the genuinely discrete tiated by the Belgian State, Prime Minister's Office
problem. In this effective process, the bifurcation for Science, Technology and Culture. The scientific

responsibility rests with its authors. Ioannis G. Kevrekidis gratefully acknowledges the support of AFOSR (Dynamics and Control) and an NSF-ITR grant.

¹ A similar example can be found in the comparison of discrete and periodic continuum problems, where the former ones possess a single permissible band of excitations, while the latter possess an infinity of such bands and hence allow for interband transitions [Alfimov et al., 2002].

References

Alfimov, G. L., Kevrekidis, P. G., Konotop, V. V. & Salerno, M. [2002] "Wannier functions analysis of the nonlinear Schrödinger equation with a periodic potential," Phys. Rev. E66, 046608-6.
Bates, P. W., Chen, X. F. & Chmaj, A. J. J. [2003] "Traveling waves of bistable dynamics on a lattice," SIAM J. Math. Anal. 35, 520-546.
Beyn, W.-J. & Thümmler, V. [2004] "Freezing solutions of equivariant evolution equations," SIAM J. Appl. Dyn. Syst. 3, 85-116.
Boesch, R., Willis, C. R. & El-Batanouny, M. [1989] "Spontaneous emission of radiation from a discrete sine-Gordon kink," Phys. Rev. B40, 2284-2296.
Cabannes, H. (ed.) [1976] Padé Approximants Method and its Applications to Mechanics (Springer-Verlag, Berlin).
Cahn, J. W. [1962] "The impurity drag effect in grain boundary motion," Acta Metall. 10, 789-798.
Carpio, A. & Bonilla, L. [2003a] "Oscillatory wave fronts in chains of coupled nonlinear oscillators," Phys. Rev. E67, 056621-11.
Carpio, A. & Bonilla, L. L. [2003b] "Depinning transitions in discrete reaction-diffusion equations," SIAM J. Appl. Math. 63, 1056-1082.
Chen, L.-Y. & Goldenfeld, N. [1995] "Numerical renormalization-group calculations for similarity solutions and traveling waves," Phys. Rev. E51, 5577-5581.
Christiansen, P. L., Gaididei, Y. G., Mertens, F. G. & Mingaleev, S. F. [2001] "Multi-component structure of nonlinear excitations in systems with length-scale competition," Eur. Phys. J. B19, 545-553.
Christodoulides, D. N. & Joseph, R. I. [1988] "Discrete self-focusing in nonlinear arrays of coupled waveguides," Opt. Lett. 13, 794-796.
Dawson, S. P., Keizer, J. & Pearson, J. E. [1999] "Fire-diffuse-fire model of dynamics of intracellular calcium waves," Proc. Natl. Acad. Sci. USA 96, 6060-6063.
Doering, C. R., Hagan, P. S. & Rosenau, P. [1987] "Random-walk in a quasi-continuum," Phys. Rev. A36, 985-988.
E, W. & Huang, Z. [2001] "Matching conditions in atomistic-continuum modeling of materials," Phys. Rev. Lett. 87, 135501-4.
Elphick, C., Meron, E. & Spiegel, E. A. [1990] "Patterns of propagating pulses," SIAM J. Appl. Math. 50, 490-503.
Fath, G. [1998] "Propagation failure of traveling waves in a discrete bistable medium," Physica D116, 176-190.
Gear, C. W., Kevrekidis, I. G. & Theodoropoulos, C. [2002] "'Coarse' integration/bifurcation analysis via microscopic simulators: Micro-Galerkin methods," Comp. Chem. Eng. 26, 941-963.
Gear, C. W. & Kevrekidis, I. G. [2003] "Projective methods for stiff differential equations: Problems with gaps in their eigenvalue spectrum," SIAM J. Sci. Comput. 24, 1091-1106 (electronic).
Gear, C. W., Li, J. & Kevrekidis, I. G. [2003] "The gap-tooth method in particle simulations," Phys. Lett. A316, 190-195.
Hummer, G. & Kevrekidis, I. G. [2003] "Coarse molecular dynamics of a peptide fragment: Free energy, kinetics and long time dynamics computations," J. Chem. Phys. 118, 10762-10773.
Ishimori, Y. & Munakata, T. [1982] "Kink dynamics in the discrete sine-Gordon system: A perturbational approach," J. Phys. Soc. Jpn. 51, 3367-3374.
Keener, J. & Sneyd, J. [1998] Mathematical Physiology (Springer-Verlag, NY).
Keener, J. P. [1991] "The effects of discrete gap junction coupling on propagation in myocardium," J. Theor. Biol. 148, 49-82.
Keener, J. P. [2000] "Homogenization and propagation in the bistable equation," Physica D136, 1-17.
Kevrekidis, I. G., Gear, C. W., Hyman, J. M., Kevrekidis, P. G., Runborg, O. & Theodoropoulos, C. [2003] "Equation-free, coarse-grained multiscale computation: Enabling microscopic simulators to perform system-level analysis," Commun. Math. Sci. 1, 715-762; Original version can be found as physics/0209043 at arXiv.org.
Kevrekidis, P. G., Kevrekidis, I. G. & Bishop, A. R. [2001] "Propagation failure, universal scaling and Goldstone modes," Phys. Lett. A279, 361-369.
Kevrekidis, P. G., Kevrekidis, I. G., Bishop, A. R. & Titi, E. S. [2002] "Continuum approach to discreteness," Phys. Rev. E65, 046613-13.
Kness, M., Tuckermann, L. S. & Barkley, D. [1992] "Symmetry-breaking bifurcations in one-dimensional excitable media," Phys. Rev. A46, 5054-5062.
Laplante, J. P. & Erneux, T. [1992] "Propagation failure in arrays of coupled bistable chemical reactors," J. Phys. Chem. 96, 4931-4934.
Li, J., Liao, D. & Yip, S. [1998a] "Coupling continuum to molecular-dynamics simulation: Reflecting particle method and the field estimator," Phys. Rev. E57, 7259-7267.
Li, J., Liao, D. & Yip, S. [1998b] "Imposing field boundary conditions in MD simulations of fluids: Optimal particle controller and buffer zone feedback," Mat. Res. Soc. Symp. Proc. 538, 473-478.
Lust, K. [1997] "Numerical bifurcation analysis of periodic solutions of partial differential equations," Ph.D. thesis, Katholieke Universiteit Leuven.
Makeev, A. G., Maroudas, D. & Kevrekidis, I. G. [2002] "'Coarse' stability and bifurcation analysis using stochastic simulators: Kinetic Monte Carlo examples," J. Chem. Phys. 116, 10083-10091.
Maroudas, D. & Brown, R. A. [1991] "Model for dislocation locking by oxygen gettering in silicon crystals," Appl. Phys. Lett. 58, 1842-1844.
McLaughlin, D., Shapley, R., Shelley, M. & Wielaard, D. J. [2000] "A neuronal network model of macaque primary visual cortex (V1): Orientation tuning and dynamics in the input layer 4Cα," Proc. Natl. Acad. Sci. USA 97, 8087-8092.
Peyrard, M. & Kruskal, M. D. [1984] "Kink dynamics in the highly discrete sine-Gordon system," Physica D14, 88.
Peyrard, M. & Bishop, A. R. [1989] "Statistical mechanics of a nonlinear model for DNA denaturation," Phys. Rev. Lett. 62, 2755-2758.
Pribyl, M., Muratov, C. B. & Shvartsman, S. [2003] "Discrete models of autocrine cell communication in epithelial layers," Biophys. J. 84, 3624-3635.
Rico-Martinez, R., Gear, C. W. & Kevrekidis, I. G. [2004] "Coarse projective kMC integration: Forward/reverse initial and boundary value problems," J. Comp. Phys. 196, 474-489.
Rinzel, J., Terman, D., Wang, X.-J. & Ermentrout, B. [1998] "Propagating activity patterns in large-scale inhibitory neuronal networks," Science 279, 1351-1355.
Rosenau, P. [1986] "Dynamics of nonlinear mass-spring chains near the continuum-limit," Phys. Lett. A118, 222-227.
Rosenau, P. [1987] "Dynamics of dense lattices," Phys. Rev. B36, 5868-5876.
Rosenau, P. [1989] "Extending hydrodynamics via the regularization of Chapman-Enskog expansion," Phys. Rev. A40, 7193-7196.
Rosenau, P. [1992] "Tempered diffusion: A transport process with propagating fronts and inertial delay," Phys. Rev. A46, R7371-R7374.
Rowley, C. W. & Marsden, J. E. [2000] "Reconstruction equations and the Karhunen-Loève expansion for systems with symmetry," Physica D142, 1-19.
Runborg, O., Theodoropoulos, C. & Kevrekidis, I. G. [2002] "Effective bifurcation analysis: a time-stepper based approach," Nonlinearity 15, 491-511.
Shenoy, V. B., Miller, R., Tadmor, E. B., Rodney, D., Phillips, R. & Ortiz, M. [1999] "An adaptive finite element approach to atomic-scale mechanics - The quasicontinuum method," J. Mech. Phys. Solids 47, 611-642.
Shroff, G. M. & Keller, H. B. [1993] "Stabilization of unstable procedures: The recursive projection method," SIAM J. Numer. Anal. 30, 1099-1120.
Swanson, B. L., A. Brozik, J., Love, S. P., Strouse, G. F., Shreve, A. P., Bishop, A. P., Wang, W.-Z. & Salkola, M. I. [1999] "Observation of intrinsically localized modes in a discrete low-dimensional material," Phys. Rev. Lett. 82, 3288-3291.
Ustinov, A. V., Doderer, T., Vernik, I. V., Pedersen, N. F., Huebener, R. P. & Oboznov, V. A. [1993] "Experiments with solitons in annular Josephson junctions," Physica D68, 41-44.
Zinner, B. [1991] "Stability of traveling wave-fronts for the discrete Nagumo equation," SIAM J. Math. Anal. 22, 1016-1020.
Zinner, B. [1992] "Existence of traveling wave-front solutions for the discrete Nagumo equation," J. Diff. Eqs. 96, 1-27.
Zinner, B., Harris, G. & Hudson, W. [1993] "Traveling wave-fronts for the discrete Fisher's equation," J. Diff. Eqs. 105, 46-62.
MODEL REDUCTION FOR FLUIDS, USING BALANCED
PROPER ORTHOGONAL DECOMPOSITION
C. W. ROWLEY
Department of Mechanical and Aerospace Engineering,
Princeton University, Princeton, NJ 08544, USA

Received May 15, 2004; Revised June 7, 2004

Many of the tools of dynamical systems and control theory have gone largely unused for flu-
ids, because the governing equations are so dynamically complex, both high-dimensional and
nonlinear. Model reduction involves finding low-dimensional models that approximate the full
high-dimensional dynamics. This paper compares three different methods of model reduction:
proper orthogonal decomposition (POD), balanced truncation, and a method called balanced
POD. Balanced truncation produces better reduced-order models than POD, but is not compu-
tationally tractable for very large systems. Balanced POD is a tractable method for computing
approximate balanced truncations, that has computational cost similar to that of POD. The
method presented here is a variation of existing methods using empirical Gramians, and the
main contributions of the present paper are a version of the method of snapshots that allows
one to compute balancing transformations directly, without separate reduction of the Gramians;
and an output projection method, which allows tractable computation even when the number
of outputs is large. The output projection method requires minimal additional computation,
and has a priori error bounds that can guide the choice of rank of the projection. Connections
between POD and balanced truncation are also illuminated: in particular, balanced truncation
may be viewed as POD of a particular dataset, using the observability Gramian as an inner
product. The three methods are illustrated on a numerical example, the linearized flow in a
plane channel.

Keywords: Model reduction; proper orthogonal decomposition; balanced truncation; snapshots.

1. Introduction

The past several decades have produced major advances in techniques for analyzing dynamical systems, both analytically and numerically. However, despite continuing improvements in computing power, many systems of interest remain out of reach of these tools, because of their high dimension. For instance, the mechanisms by which a fluid flow transitions from laminar to turbulent are still not fully understood: at this point, it is not even clear whether the mechanisms are fundamentally nonlinear [Holmes et al., 1996] or linear [Farrell & Ioannou, 1993; Bamieh & Dahleh, 2001]. The full nonlinear partial differential equations that describe a fluid flow are too complex to be analyzed directly, so in order to answer questions such as these, lower-dimensional models that approximate the full system are desirable.

The problem of obtaining a lower-dimensional approximation to a high-dimensional dynamical system is known as model reduction. This paper reviews two well-known approaches to model reduction, and presents a method which compares favorably with both of these. The method of proper orthogonal decomposition (POD) and Galerkin projection is popular in the fluids community, and in this method, one obtains a lower-dimensional approximation by projecting the full nonlinear system onto a set of basis functions
determined from empirical data. However, the POD/Galerkin method can yield unpredictable results, and is sensitive to details such as the empirical data used [Rathinam & Petzold, 2003], and the choice of inner product [Colonius & Freund, 2002]. POD/Galerkin models near stable equilibrium points can even be unstable [Smith, 2003]. A related method known as balanced truncation was developed in the control theory community for stable, linear, input-output systems, and does not suffer the same limitations as the POD method. Most notably, balanced truncation has error bounds that are close to the lowest error possible from any reduced-order model. In addition, this method has recently been extended to nonlinear systems using two distinct approaches [Lall et al., 2002; Scherpen, 1993]. Balanced truncation has been used on some fluid problems [Cortelezzi & Speyer, 1998], but becomes computationally intractable for systems of very large dimension (e.g. 10000 states or more), and so is not practical for many fluids systems.

This paper presents a method we refer to as balanced proper orthogonal decomposition, which combines ideas from POD and balanced truncation. The goal is to compute balanced truncations, or approximations to these, with computational cost similar to POD. Several previous methods have combined ideas from POD and balanced truncation, including the original work of Moore [1981]. The method presented here relies heavily on the work of Lall et al. [1999, 2002], who used empirical Gramians to generalize balanced truncation to nonlinear systems. Our goal is to use empirical Gramians to compute balancing transformations for very large systems. Previous works have addressed this problem as well, notably the work of Willcox and Peraire [2002], which used POD to compute low-rank approximations to the Gramians, from which the balancing transformation was computed using an efficient solver to find the eigenvectors of their product. However, this method has several drawbacks. In particular, it becomes intractable when the number of outputs is large, as a separate adjoint simulation is required for each output. Furthermore, in reducing the rank of the controllability and observability Gramians before the balancing is performed, one risks prematurely truncating states that are poorly observable yet very strongly controllable, which can lead to less accurate models, as we shall see in the numerical example shown in Figs. 7 and 8.

The present method overcomes this latter drawback using a different method of snapshots, described in Sec. 3.1, in which one computes the balancing transformation directly from the snapshots, without individual reduction of the Gramians, and without a separate eigenvector solve. Furthermore, we describe an output projection method in Sec. 3.2, which allows the empirical observability Gramian to be computed even when the number of outputs is large, using many fewer adjoint simulations. This output projection is optimal in an $L_2$ sense, involves very little extra computation, and comes with an a priori error bound which can guide the rank of the output projection used [Eq. (27)]. Like balanced truncation, the present method is limited to stable, linear systems. However, because our method uses many of the same ideas as Lall et al. [1999] (in particular, empirical Gramians constructed from impulse responses), it is likely that similar computational techniques may be applied to nonlinear systems as well.

The paper is outlined as follows: in Sec. 2, we review the methods of POD/Galerkin projection and balanced truncation; we present our method in Sec. 3; and in Sec. 4, we compare the three methods on an example, the linearized flow in a plane channel.

2. Background on Model Reduction

The model reduction methods discussed in this paper fall in the category of projection methods, in that they involve projecting the equations of motion onto a subspace of the original phase space. The methods of POD/Galerkin and balanced truncation are briefly reviewed here, both for comparison with balanced POD, and also because our method uses ideas from both POD and balanced truncation. There are many other methods available for reducing both linear and nonlinear systems, and several of these are reviewed in [Antoulas et al., 2001].

2.1. Proper orthogonal decomposition

Proper orthogonal decomposition, also known as principal component analysis, or the Karhunen-Loève expansion, has been used for some time in developing low-dimensional models of fluids [Lumley, 1970; Sirovich, 1987; Holmes et al., 1996]. The idea is, given a set of data that lies in a vector space $V$, to find a subspace $V_r$ of fixed dimension $r$
such that the error in the projection onto the subspace is minimized. Here, for simplicity, we will consider the case where $V = \mathbb{R}^n$. For a fluid, $V$ will be infinite-dimensional, consisting of functions on some spatial domain (for instance, velocity and pressure everywhere), but we will assume that the equations have already been discretized in space, for instance by a finite-difference or spectral method, so that $V$ has finite dimension $n$ (e.g. for a finite-difference simulation, $n$ is the number of gridpoints times the number of flow variables). For the infinite-dimensional case, see [Holmes et al., 1996; Rowley et al., 2004].

Suppose we have a set of data given by $x(t) \in \mathbb{R}^n$, with $0 \le t \le T$. We seek a projection $P_r : \mathbb{R}^n \to \mathbb{R}^n$ of fixed rank $r$, that minimizes the total error

$$\int_0^T \|x(t) - P_r x(t)\|^2 \, dt. \tag{1}$$

To solve this problem, introduce the $n \times n$ matrix

$$R = \int_0^T x(t)\, x(t)^* \, dt, \tag{2}$$

where $*$ denotes the transpose, and find the eigenvalues and eigenvectors of $R$, given by

$$R \varphi_k = \lambda_k \varphi_k, \qquad \lambda_1 \ge \cdots \ge \lambda_n \ge 0. \tag{3}$$

Since $R$ is symmetric, positive-semidefinite, all the eigenvalues $\lambda_k$ are real and non-negative, and the eigenvectors $\varphi_k$ may be chosen to be orthonormal. The main result of POD is that the optimal subspace of dimension $r$ is spanned by $\{\varphi_1, \ldots, \varphi_r\}$, and the optimal projection $P_r$ is then given by

$$P_r = \sum_{k=1}^{r} \varphi_k \varphi_k^*.$$

The vectors $\varphi_k$ are called POD modes.

2.1.1. Galerkin projection

One can then form reduced order models using Galerkin projection onto this subspace. Suppose the dynamics of a system are described by

$$\dot{x}(t) = f(x(t)). \tag{4}$$

Galerkin projection specifies dynamics of a variable $x_r(t) \in \mathrm{span}\{\varphi_1, \ldots, \varphi_r\}$ by $\dot{x}_r(t) = P_r f(x_r(t))$, that is, simply projecting the original vector field $f$ onto the $r$-dimensional subspace. Writing

$$x_r(t) = \sum_{j=1}^{r} a_j(t)\, \varphi_j, \tag{5}$$

substituting into the equations, and multiplying by $\varphi_k^*$, one obtains

$$\dot{a}_k(t) = \varphi_k^* f(x_r), \qquad k = 1, \ldots, r, \tag{6}$$

a set of $r$ ODEs that describe the evolution of $x_r(t)$.

2.1.2. Method of snapshots

To compute the POD modes, one must solve an $n \times n$ eigenvalue problem (3). For a discretization of a fluid problem, the dimension $n$ often exceeds $10^6$, so direct solution of this eigenvalue problem is not often feasible. If the data is given as "snapshots" $x(t_j)$ at discrete times $t_1, \ldots, t_m$, then one can transform the $n \times n$ eigenvalue problem (3) into an $m \times m$ eigenvalue problem [Sirovich, 1987]. In this case, the integral in (2) becomes a sum

$$R = \sum_{j=1}^{m} x(t_j)\, x(t_j)^* \, \delta_j, \tag{7}$$

where $\delta_j$ are quadrature coefficients. Assembling the data into an $n \times m$ matrix

$$X = \begin{bmatrix} x(t_1)\sqrt{\delta_1} & \cdots & x(t_m)\sqrt{\delta_m} \end{bmatrix}, \tag{8}$$

the sum (7) may be written $R = XX^*$. In the method of snapshots, one then solves the $m \times m$ eigenvalue problem

$$X^* X u_k = \lambda_k u_k, \qquad u_k \in \mathbb{R}^m, \tag{9}$$

where the eigenvalues $\lambda_k$ are the same as in (3). The eigenvectors $u_k$ may be chosen to be orthonormal, and the POD modes are then given by $\varphi_k = X u_k / \sqrt{\lambda_k}$. In matrix form, with $\Phi = [\varphi_1 \ \cdots \ \varphi_m]$, and $U = [u_1 \ \cdots \ u_m]$, this becomes

$$\Phi = X U \Lambda^{-1/2}. \tag{10}$$

The $m \times m$ eigenvalue problem (9) is more efficient than the $n \times n$ eigenvalue problem (3) when the number of snapshots $m$ is smaller than the number of states $n$.

2.1.3. Remarks and limitations

A physical explanation of POD modes is that they maximize the average energy in the projection of the data onto the subspace spanned by the modes. This is equivalent to minimizing the error (1), since

$$\operatorname*{argmin}_{\{\varphi_k\}} \left\langle \|x - P_r x\|^2 \right\rangle = \operatorname*{argmax}_{\{\varphi_k\}} \left\langle \|P_r x\|^2 \right\rangle$$
where $\langle \cdot \rangle$ is the average over the data ensemble (this follows from the Pythagorean theorem, since $P_r$ is an orthogonal projection). In particular, the energy in the projection is given by

$$\int_0^T \|P_r x(t)\|^2 \, dt = \sum_{k=1}^{r} \lambda_k. \tag{11}$$

Though POD modes are very effective (indeed optimal) at approximating a given dataset, they are not necessarily the best modes for describing the dynamics that generate a particular dataset, since low-energy features may be critically important to the dynamics. For instance, in a fluid flow where acoustic resonances occur, acoustic waves play a crucial role, even though they have much smaller energy than hydrodynamic pressure fluctuations. In practice, one sometimes neglects some of the higher-energy POD modes in forming reduced-order models [Smith, 2003], in favor of lower-energy modes that are more dynamically important. In fact, adding more POD modes can even make dynamical models worse [Rowley et al., 2004]. These are undesirable characteristics of a model reduction procedure, and part of the motivation behind balanced POD is to improve on these limitations.

2.2. Balanced truncation

Balanced truncation is a method of model reduction for stable, linear input-output systems, introduced by [Moore, 1981]. Consider a stable linear input-output system

$$\dot{x} = Ax + Bu, \qquad y = Cx, \tag{12}$$

where $u(t) \in \mathbb{R}^p$ is a vector of inputs, $y(t) \in \mathbb{R}^q$ is a vector of outputs, and $x(t) \in \mathbb{R}^n$ is the state vector.

One begins by defining controllability and observability Gramians, which are symmetric, positive-semidefinite matrices defined by

$$W_c = \int_0^\infty e^{At} B B^* e^{A^* t} \, dt, \qquad W_o = \int_0^\infty e^{A^* t} C^* C e^{At} \, dt, \tag{13}$$

usually computed by solving the Lyapunov equations

$$A W_c + W_c A^* + B B^* = 0, \qquad A^* W_o + W_o A + C^* C = 0. \tag{14}$$

The controllability Gramian $W_c$ measures to what degree each state is excited by an input. For two states $x_1$ and $x_2$ with $\|x_1\| = \|x_2\|$, if $x_1^* W_c x_1 > x_2^* W_c x_2$, then state $x_1$ is "more controllable" than $x_2$ (i.e. it takes a smaller input to drive the system from rest to $x_1$ than to $x_2$). The Gramian $W_c$ is positive-definite if and only if all states are reachable with some input $u(t)$.

Conversely, the observability Gramian $W_o$ measures to what degree each state excites future outputs. For an initial state $x_0$, and with zero input, one has $\|y\|_2^2 = x_0^* W_o x_0$, where $\|\cdot\|_2$ denotes the $L_2[0,\infty)$ norm. States which excite larger output signals are called "more observable," and in this sense are more dynamically important than states that are less observable.

The Gramians depend on the coordinates, and under a change of coordinates $x = Tz$, they transform as

$$W_c \mapsto T^{-1} W_c (T^{-1})^*, \qquad W_o \mapsto T^* W_o T.$$

Balancing refers to changing to coordinates in which the controllability and observability properties are balanced; more precisely, the transformed Gramians are equal and diagonal:

$$T^{-1} W_c (T^{-1})^* = T^* W_o T = \Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_n). \tag{15}$$

The diagonal elements $\sigma_1 \ge \cdots \ge \sigma_n \ge 0$ are called the Hankel singular values of the system, and are independent of the coordinate system. A basic result is that a balancing transformation $T$ exists as long as the system is both controllable and observable (i.e. $W_c, W_o > 0$). The transformation is found by computing appropriately scaled eigenvectors of the product $W_c W_o$ (in particular, $W_c W_o T = T \Sigma^2$). In the balanced coordinates, the states that are least influenced by the input also have the least influence on the output. Balanced truncation involves first changing to these coordinates, and then truncating the least controllable/observable states, which have little effect on the input-output behavior.

2.2.1. Error bounds

A useful property of balanced truncation is that one has a priori error bounds that are close to the lower bound achievable by any reduced-order model. To understand these error bounds, consider the transfer function

$$G(s) = C(sI - A)^{-1} B,$$
which relates the Laplace transform of the input to the Laplace transform of the output ($y(s) = G(s)u(s)$). The $L_2$-induced operator norm of $G$ is defined by

$$\max_u \frac{\|Gu\|_2}{\|u\|_2} = \|G\|_\infty = \max_\omega \sigma_1(G(i\omega)), \tag{16}$$

where $\sigma_1(M)$ denotes the maximum singular value of the matrix $M$. The following error bounds are standard results [Dullerud & Paganini, 1999]: first, any reduced order model $G_r$ with $r$ states must satisfy

$$\|G - G_r\|_\infty \ge \sigma_{r+1}, \tag{17}$$

where $\sigma_{r+1}$ is the first neglected Hankel singular value of $G$. This is a fundamental limitation for any reduced order model. Balanced truncation also guarantees an upper bound of the error:

$$\|G - G_r\|_\infty \le 2 \sum_{j=r+1}^{n} \sigma_j, \tag{18}$$

which is usually close to the lower bound (17), if the Hankel singular values drop off quickly. Balanced truncation is not optimal, in the sense that there may be other reduced-order models with smaller error norms, but a priori guarantees and strong heuristic justification make it a popular and effective technique.

2.2.2. Empirical Gramians

Instead of computing the Gramians by solving Lyapunov equations (14), one may compute them from data from numerical simulations. This was the original approach used by [Moore, 1981], and was used in [Lall et al., 1999, 2002] to extend balanced truncation to nonlinear systems.

2.2.3. Controllability Gramian

To compute the controllability Gramian for a system with $p$ inputs, writing $B = [b_1, \ldots, b_p]$, one forms the state responses to unit impulses

$$x_1(t) = e^{At} b_1 = \text{response to impulsive input } u_1(t) = \delta(t)$$
$$\vdots$$
$$x_p(t) = e^{At} b_p = \text{response to impulsive input } u_p(t) = \delta(t)$$

Then the controllability Gramian is given by

$$W_c = \int_0^\infty \left( x_1(t) x_1(t)^* + \cdots + x_p(t) x_p(t)^* \right) dt. \tag{19}$$

Note the similarity between the expression above and the operator in (2) that arises in POD of the dataset $\{x_1(t), \ldots, x_p(t)\}$. In fact, the POD modes for this dataset of impulse responses are just the largest eigenvectors of $W_c$, or, in other words, the most controllable modes of the realization. Note that since the Gramian matrices depend on the coordinate system, so do the POD modes of this dataset.

If data from simulations is used to find the impulse responses, then it is usually given at discrete times $t_1, \ldots, t_m$, and the integral above becomes a quadrature sum, as in (7), and we may stack the snapshots as columns of a matrix

$$X = \begin{bmatrix} x_1(t_1)\sqrt{\delta_1} & \cdots & x_1(t_m)\sqrt{\delta_m} & \cdots & x_p(t_1)\sqrt{\delta_1} & \cdots & x_p(t_m)\sqrt{\delta_m} \end{bmatrix}, \tag{20}$$

where again $\delta_j$ are quadrature coefficients. The quadrature approximation to (19) is then

$$W_c = XX^*. \tag{21}$$

2.2.4. Observability Gramian

The procedure for computing the empirical observability Gramian proceeds analogously: we compute impulse responses of the adjoint system

$$\dot{z} = A^* z + C^* v.$$

If $q$ is the number of outputs and $C^* = (c_1, \ldots, c_q)$, then let

$$z_1(t) = e^{A^* t} c_1 = \text{response to impulsive input } v_1(t) = \delta(t)$$
$$\vdots$$
$$z_q(t) = e^{A^* t} c_q = \text{response to impulsive input } v_q(t) = \delta(t),$$

from which the observability Gramian is given by

$$W_o = \int_0^\infty \left( z_1(t) z_1(t)^* + \cdots + z_q(t) z_q(t)^* \right) dt.$$
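As an illustration of Secs. 2.2.3 and 2.2.4 (a sketch added for clarity, not code from the paper), the Python fragment below stacks impulse responses as in (20) for a small stable system and compares the resulting empirical Gramians with solutions of the Lyapunov equations (14). The test matrices A, B, C, the helper name impulse_snapshots, the time step, the trapezoidal quadrature weights, and the use of an exact matrix-exponential propagator are all assumptions made for this example; the adjoint snapshot matrix is named Y, and since everything is real the transpose plays the role of *.

```python
import numpy as np
from scipy.linalg import expm, solve_continuous_lyapunov


def impulse_snapshots(A, B, dt, m):
    """Return X as in (20): columns e^{A t_j} b_i * sqrt(delta_j), with t_j = j*dt."""
    n, p = B.shape
    step = expm(A * dt)                # exact propagator over one time step
    delta = np.full(m, dt)             # trapezoidal quadrature weights (assumed)
    delta[0] = delta[-1] = dt / 2
    snaps = np.empty((m, n, p))
    x = B.astype(float)                # responses at t = 0 are the columns b_i
    for j in range(m):
        snaps[j] = np.sqrt(delta[j]) * x
        x = step @ x
    # reorder so that all snapshots of input 1 come first, then input 2, ...
    return snaps.transpose(1, 2, 0).reshape(n, p * m)


# Hypothetical 3-state test system (stable: eigenvalues -1, -2, -3)
A = np.array([[-1.0, 2.0, 0.0], [0.0, -2.0, 1.0], [0.0, 0.0, -3.0]])
B = np.array([[1.0], [0.0], [1.0]])
C = np.array([[1.0, 0.0, 0.0]])

X = impulse_snapshots(A, B, dt=0.01, m=2001)        # primal impulse responses
Y = impulse_snapshots(A.T, C.T, dt=0.01, m=2001)    # adjoint impulse responses
Wc_emp, Wo_emp = X @ X.T, Y @ Y.T                   # empirical Gramians

# Reference Gramians from the Lyapunov equations (14)
Wc_lyap = solve_continuous_lyapunov(A, -B @ B.T)
Wo_lyap = solve_continuous_lyapunov(A.T, -C.T @ C)

# Largest entrywise discrepancy between empirical and Lyapunov Gramians
print(np.abs(Wc_emp - Wc_lyap).max(), np.abs(Wo_emp - Wo_lyap).max())
```

For a large fluid simulation one would presumably replace the matrix exponential by the flow solver's own time stepper; the dense n × n Gramians are formed here only to check the quadrature against (14).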

One then forms the data matrix $Y$, as in (20), and writes the Gramian as

$$W_o = YY^*.$$

Note that this method requires $q$ integrations of the adjoint system, where $q$ is the number of outputs. Thus, this method is not feasible when the number of outputs is large, for instance, if the output is the full state. The empirical Gramian may also be computed from $n$ simulations of the primal system $\dot x = Ax$, where $n$ is the number of states (as is done in [Lall et al., 2002]), but clearly this is also not feasible when the number of states is large. This difficulty is the motivation behind the output projection method to be discussed in Sec. 3.2.

3. Balanced POD

The main idea of balanced POD is to obtain an approximation to balanced truncation that is computationally tractable for large systems. The present method involves two components: computing the balancing transformation directly from snapshots of empirical Gramians, without needing to compute the Gramians themselves; and an output projection method to enable tractable computation even when the number of outputs is large. The method has deep connections with POD: it may be viewed as POD with respect to a particular inner product, or as a biorthogonal decomposition, as discussed in Sec. 3.4.

3.1. Balanced truncation using the method of snapshots

Suppose the controllability and observability Gramians may be factored as

$$W_c = XX^*, \qquad W_o = YY^*, \tag{22}$$

where $W_c$ and $W_o$ are $n \times n$ square matrices, but $X$ and $Y$ may be rectangular, with differing dimensions. For instance, $X$ and $Y$ may be data matrices used to form empirical Gramians, as described in the previous section. In the method of snapshots used here, the balancing modes are computed by forming the singular value decomposition (SVD) of the matrix $Y^*X$:

$$Y^*X = U \Sigma V^* = \begin{bmatrix} U_1 & U_2 \end{bmatrix} \begin{bmatrix} \Sigma_1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} V_1^* \\ V_2^* \end{bmatrix} = U_1 \Sigma_1 V_1^*, \tag{23}$$

where $\Sigma_1 \in \mathbb{R}^{r \times r}$ is invertible, $r$ is the rank of $Y^*X$, and $U_1^* U_1 = V_1^* V_1 = I_r$. Define the matrices $T_1 \in \mathbb{R}^{n \times r}$ and $S_1 \in \mathbb{R}^{r \times n}$ by

$$T_1 = X V_1 \Sigma_1^{-1/2}, \qquad S_1 = \Sigma_1^{-1/2} U_1^* Y^*. \tag{24}$$

A proposition proved in the appendix establishes that if $r = n$ (that is, the Gramians are full rank), then the matrix $\Sigma_1$ contains the Hankel singular values, $T_1$ determines the balancing transformation, and $S_1$ is its inverse. Furthermore, if $r < n$, then the columns of $T_1$ form the first $r$ columns of the balancing transformation, and the rows of $S_1$ form the first $r$ rows of the inverse transformation.

Remark. The major advantage of the above method for computing the balancing transformation is that the Gramians themselves never need to be computed. Only one SVD is needed, of a matrix with dimension $N_d \times N_p$, where $N_p$ is the number of primal snapshots (columns of $X$), and $N_d$ is the number of dual snapshots (columns of $Y$). If the number of snapshots is much smaller than the number of states $n$, as is typical for a problem in fluids, then this represents considerable savings. In particular, the size of the SVD is independent of $n$, and once the snapshots are computed, the entire method scales linearly with $n$. Thus, the overall computation time is similar to POD (compare (23)-(24) with (9)-(10)), except that here one also needs to compute adjoint snapshots, which do not arise in POD.

The method above is also similar to a well-known method for computing balancing transformations from the Cholesky factorization of the Gramians [Laub et al., 1987]. The present method differs in that the factorization (22) need not be the Cholesky factorization, and neither of the Gramians needs to be full-rank. (In particular, the system does not need to be controllable or observable.) The present method does share the same desirable numerical characteristics as the method in [Laub et al., 1987], in particular that the Gramians never need to be "squared up," and thus the method is less sensitive to numerical round-off than methods that
Model Reduction for Fluids, Using Balanced POD 307

involve computing the full Gramians $W_c$ and $W_o$, rather than a factorization.¹

3.2. Output projection

Recall from Sec. 2.2 that in order to compute data for the observability Gramian, one requires $q$ simulations of the adjoint system, where $q$ is the number of outputs. This procedure is clearly not feasible if the number of outputs is large. The idea of this section is to alleviate this problem by projecting the output onto an appropriate subspace, in such a way that the input-output behavior is almost unchanged. Instead of the system (12), consider the related system

$$\dot x = Ax + Bu, \qquad y = P_r C x, \tag{25}$$

where $P_r$ is an orthogonal projection with rank $r$. Such a projection allows us to compute the empirical observability Gramian using only $r$ simulations of the adjoint system, rather than $q$ simulations. To see this, write the projection $P_r$ as the product $P_r = \Phi_r \Phi_r^*$, where $\Phi_r$ is a $q \times r$ matrix, with $\Phi_r^* \Phi_r = I_r$ (this can always be done for any orthogonal projection). The observability Gramian (13) then becomes

$$W_o = \int_0^\infty e^{A^* t} C^* \Phi_r \Phi_r^* C e^{At} \, dt$$

and so may be computed from $r$ simulations of the adjoint system

$$\dot z(t) = A^* z + C^* \Phi_r v,$$

where $v \in \mathbb{R}^r$. When the number of outputs $q$ is large, the reduction in computational cost is substantial.

We would like to choose $P_r$ such that the input-output behavior of (25) is as close as possible to the input-output behavior of (12). We can measure this input-output behavior by considering the impulse response matrix $G(t)$, whose element $G_{ij}(t)$ is the output component $y_i(t)$ corresponding to an impulsive input $u_j(t) = \delta(t)$. The impulse response completely determines the input-output behavior of a linear system. If $G(t)$ is the impulse response of (12), then the impulse response of (25) is $P_r G(t)$, and we seek a projection $P_r$ that minimizes the error

$$\int_0^\infty \| G(t) - P_r G(t) \|^2 \, dt \tag{26}$$

with respect to some norm on matrices. If we use a norm induced by an inner product, for instance the Frobenius norm $\|A\|_F^2 = \mathrm{Tr}(A^*A)$, which is induced by the inner product $\langle A, B \rangle = \mathrm{Tr}(A^*B)$, then the projection $P_r$ that minimizes the error (26) is the projection onto the first $r$ POD modes of the dataset $G(t)$. For instance, if $\Phi_r = [\varphi_1 \; \cdots \; \varphi_r]$ is a matrix containing the first $r$ POD modes of $G(t)$, then $P_r = \Phi_r \Phi_r^*$ is the projection that minimizes (26).

A convenient numerical feature of this method for computing $P_r$ is that the necessary snapshots for computing the POD modes of $G(t)$ have already been computed, for the empirical controllability Gramian. To compute the snapshots for $W_c$, as in Sec. 2.2, we compute impulse responses $x_1(t), \ldots, x_p(t)$, for each of the $p$ inputs. The dataset required for computing $P_r$ is simply $Cx_1(t), \ldots, Cx_p(t)$, so we need only to multiply each of our snapshots by the output matrix $C$.

3.2.1. Error bounds

One can also quantify the error for the projected system. In particular, if $\lambda_1, \ldots, \lambda_m$ denote the POD eigenvalues of the dataset $\{Cx_1(t), \ldots, Cx_p(t)\}$, then

$$\| G - P_r G \|_2^2 = \sum_{j=r+1}^{m} \lambda_j, \tag{27}$$

where $m$ is the number of outputs, and the 2-norm is given by

$$\| G \|_2^2 = \int_0^\infty \mathrm{Tr}\left( G(t)^* G(t) \right) dt. \tag{28}$$

The proof follows immediately from a variant of (11). This result gives us guidance in choosing the number of modes to keep in the projection, based on the desired accuracy of the reduced-order model, and the POD eigenvalues computed from the impulse response data.

¹ As one reviewer remarked, POD modes may also be computed by an SVD of the snapshot matrix $X$ from (8). This approach also has better roundoff properties than computing the eigenvalue decomposition of $X^*X$ as in (9), although it requires more computation.
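The output projection of Sec. 3.2 and the error identity (27) can be illustrated with a short sketch. The assumptions here are ours, not the paper's: a random, hypothetical stable system with full-state output, a single input, trapezoid quadrature weights, and POD modes computed by an SVD of the weighted output snapshots (as in the footnote) rather than by the eigenvalue problem (9). In this discrete setting the projection error equals the sum of the discarded POD eigenvalues up to rounding.

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical stable system with full-state output (so q = n); illustration only.
rng = np.random.default_rng(1)
n, p = 8, 1
A = rng.standard_normal((n, n))
A -= (np.max(np.linalg.eigvals(A).real) + 0.5) * np.eye(n)
B = rng.standard_normal((n, p))
C = np.eye(n)

# Weighted output snapshots C x_k(t_i) sqrt(delta_i) of the impulse responses.
dt, m = 0.01, 2001
E = expm(A * dt)
w = np.full(m, dt)
w[[0, -1]] = dt / 2            # trapezoid quadrature coefficients
x, blocks = B.copy(), []
for i in range(m):
    blocks.append(np.sqrt(w[i]) * (C @ x))
    x = E @ x
G = np.hstack(blocks)          # q x (m p) discretized impulse response data

# POD modes of the output data via SVD; POD eigenvalues are squared singular values.
Phi, s, _ = np.linalg.svd(G, full_matrices=False)
lam = s**2

r = 3
Phi_r = Phi[:, :r]
P_r = Phi_r @ Phi_r.T          # orthogonal projection of rank r

proj_error = np.linalg.norm(G - P_r @ G, "fro") ** 2   # discrete form of (26)
tail = lam[r:].sum()                                   # right-hand side of (27)
print(f"fraction of output energy discarded: {tail / lam.sum():.3e}")

# Only r adjoint simulations are then needed, started from the columns of C^* Phi_r.
adjoint_ics = C.T @ Phi_r      # n x r
```

Inspecting `tail / lam.sum()` for several values of `r` is one way to choose the projection rank before any adjoint simulation is run.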

3.3. Summary

To summarize, the steps in the balanced POD method are as follows:

1. Integrate solutions $x_1(t), \ldots, x_p(t)$ of the system $\dot x = Ax$, with initial conditions $x_k(0) = b_k$, where $b_k$ denotes the $k$th column of the $B$ matrix in (12).
2. Compute POD modes $\varphi_k$ of the dataset $\{Cx_1(t), \ldots, Cx_p(t)\}$, and choose a projection rank $r$ such that the error (27) is acceptable.
3. Integrate solutions $z_1(t), \ldots, z_r(t)$ of the adjoint system $\dot z = A^* z$, with initial conditions $z_k(0) = C^* \varphi_k$.
4. Form the data matrices $X$ and $Y$ for the primal and dual solutions, as in (20).
5. Compute the SVD of $Y^*X$, and the balanced POD modes are given by (24).

If the number of outputs is small, then one may skip step 2 and in step 3 use initial conditions $z_k(0) = c_k^*$, where $c_k$ is the $k$th row of $C$.

Reduced-order models may then be formed by transforming to balanced coordinates and projecting. Note that there is no need to transform all of the states: if we write

$$x(t) = T z(t) = \begin{bmatrix} T_1 & T_2 \end{bmatrix} \begin{bmatrix} z_1(t) \\ z_2(t) \end{bmatrix} = T_1 z_1(t) + T_2 z_2(t),$$

where $z_1(t)$ are states to be retained and $z_2(t)$ are states to be truncated, then the transformed equations are

$$\dot z_1 = S_1 A T_1 z_1 + S_1 A T_2 z_2 + S_1 B u$$
$$\dot z_2 = S_2 A T_1 z_1 + S_2 A T_2 z_2 + S_2 B u$$
$$y = C T_1 z_1 + C T_2 z_2,$$

where $S = T^{-1}$. Setting $z_2 = 0$ gives the truncated model

$$\dot z_1 = S_1 A T_1 z_1 + S_1 B u$$
$$y = C T_1 z_1.$$

Thus, to compute a reduced-order model of order $r$, all we need is the first $r$ columns of $T$ and the first $r$ rows of $S$, given by (24). Note, however, that this is not the same as orthogonal projection onto the subspace spanned by the first $r$ columns of $T$, since the columns of $T$ are not orthogonal.

3.4. Relation to POD

There are deep connections between the POD/Galerkin method and balanced truncation, which are elucidated by the balanced POD procedure. For instance, balanced truncation may be viewed as a biorthogonal decomposition, instead of the orthogonal decomposition given by POD. Alternatively, balanced truncation may be viewed as a special case of POD, using a particular dataset (impulse responses), and using the observability Gramian as an inner product. The former point of view is useful for numerics, and the latter is useful for analysis, as it yields a guarantee that if balanced POD is used, then Galerkin projections of stable nonlinear systems are stable as well.

3.4.1. Biorthogonal decomposition

In the POD/Galerkin procedure, one finds a sequence of orthogonal basis functions $\{\varphi_j\}$, for projection of the dynamics. Balanced truncation can be viewed in the same way, but using a sequence of biorthogonal functions $\{\varphi_j\}$, $\{\psi_j\}$. Let the matrices $T_1$ and $S_1$ from (24) be written

$$T_1 = \begin{bmatrix} \varphi_1 & \cdots & \varphi_r \end{bmatrix}, \qquad S_1 = \begin{bmatrix} \psi_1^* \\ \vdots \\ \psi_r^* \end{bmatrix},$$

with $\varphi_j, \psi_j \in \mathbb{R}^n$. Then since $S_1 T_1 = I_r$, we have $\psi_i^* \varphi_j = \delta_{ij}$, so the sequences are biorthogonal. Now, approximate $x(t)$ as in (5), as

$$x_r(t) = \sum_{j=1}^{r} a_j(t) \varphi_j, \qquad a_j(t) = \psi_j^* x(t).$$

Substituting into the equation $\dot x = f(x)$, multiplying by $\psi_k^*$ and using biorthogonality now gives

$$\dot a_k = \psi_k^* f(x),$$

which is identical to (6), but using the adjoint modes $\psi_k$ for the projection. Of course, one needs a linear system to define Gramians or adjoint equations, but the idea is that even for a nonlinear system, one may compute balancing modes $\{\varphi_j\}$, $\{\psi_j\}$ using a linearization, or a method similar to that in [Lall et al., 2002], and then project the nonlinear system $\dot x = f(x)$ without having to transform the entire state before truncating.
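The steps summarized in Sec. 3.3, together with the biorthogonality property of Sec. 3.4.1, can be sketched as follows. This is an illustrative sketch under our own assumptions: a random, hypothetical stable system with few outputs (so step 2 is skipped), trapezoid quadrature weights, a deliberately small number of snapshots so that the SVD of $Y^*X$ stays cheap, and an invented helper name `weighted_snapshots`.

```python
import numpy as np
from scipy.linalg import expm

# Hypothetical stable test system with few outputs (illustration only).
rng = np.random.default_rng(2)
n, p, q = 6, 1, 3
A = rng.standard_normal((n, n))
A -= (np.max(np.linalg.eigvals(A).real) + 1.0) * np.eye(n)
B = rng.standard_normal((n, p))
C = rng.standard_normal((q, n))

def weighted_snapshots(A, B, dt, m):
    """Columns x_k(t_i) * sqrt(delta_i) for x' = A x, x_k(0) = b_k (trapezoid weights)."""
    E = expm(A * dt)
    w = np.full(m, dt)
    w[[0, -1]] = dt / 2
    x, blocks = B.astype(float).copy(), []
    for i in range(m):
        blocks.append(np.sqrt(w[i]) * x)
        x = E @ x
    return np.hstack(blocks)

dt, m = 0.05, 200
X = weighted_snapshots(A, B, dt, m)        # steps 1 and 4: primal data, n x N_p
Y = weighted_snapshots(A.T, C.T, dt, m)    # steps 3 and 4: adjoint data, n x N_d

# Step 5: SVD of the N_d x N_p matrix Y^* X, as in (23); its size does not depend on n.
U, s, Vt = np.linalg.svd(Y.T @ X, full_matrices=False)
r = 3                                      # order of the reduced model
U1, V1, sig = U[:, :r], Vt[:r].T, s[:r]
T1 = X @ V1 / np.sqrt(sig)                 # (24): X V1 Sigma1^{-1/2}, columns are balancing modes
S1 = (U1 / np.sqrt(sig)).T @ Y.T           # (24): Sigma1^{-1/2} U1^* Y^*, rows are adjoint modes

# Truncated model in balanced coordinates.
Ar, Br, Cr = S1 @ A @ T1, S1 @ B, C @ T1

# The squared singular values are the leading eigenvalues of the product of the
# empirical Gramians, although that product was never needed to compute them.
hsv_sq = np.sort(np.linalg.eigvals((X @ X.T) @ (Y @ Y.T)).real)[::-1]
print("approximate Hankel singular values:", sig)
```

The Gramian product in the last lines is formed only to check the result; in a large problem one would stop after computing `T1` and `S1`.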

3.4.2. Observability Gramian as an inner product

One of the difficulties with the POD/Galerkin method is that the inner product used for computing POD modes and projecting the dynamics is arbitrary. Sometimes, an appropriate inner product is obvious, as for incompressible flow [Holmes et al., 1996], but other times, as for compressible flow, a suitable inner product is not obvious [Rowley et al., 2004], and different choices can give dramatically different results [Colonius & Freund, 2002]. Perhaps the deepest connection between POD/Galerkin and balanced truncation is that for a stable linear system, balanced truncation may be viewed as a special case of POD, using impulse responses for a dataset (i.e. the matrix $X$ in (20)), and using the observability Gramian as an inner product.

To see this, first define an inner product on $\mathbb{R}^n$ by

$$\langle a, b \rangle_{W_o} = a^* W_o b, \tag{29}$$

where $W_o$ is the observability Gramian (which is positive definite as long as the system is observable). As mentioned in Sec. 2.2, $W_o$ measures states of large "dynamical importance," so this inner product weights dynamically important states more heavily. The POD modes of the dataset $X$ with respect to this inner product are eigenvectors of $R = XX^* W_o$ (see [Rowley et al., 2004] for an explanation of POD with respect to an arbitrary inner product). These eigenvectors will be orthogonal with respect to the inner product (29), though not with respect to the standard inner product.

POD modes are normalized balancing modes. Since the dataset $X$ was produced such that $XX^* = W_c$, the POD modes are just the eigenvectors of $R = W_c W_o$: in other words, they are the balancing modes, normalized differently. Furthermore, the eigenvalues of $R$ are the squares of the Hankel singular values. If we compute the POD modes using the method of snapshots as in (9), we form the SVD $X^* W_o X = V_1 \Sigma_1^2 V_1^*$, and the POD modes are columns of

$$\Phi = \begin{bmatrix} \varphi_1 & \cdots & \varphi_r \end{bmatrix} = X V_1 \Sigma_1^{-1}.$$

Note that these modes are the same as columns of $T_1$ in (24), with a different scaling. If we define "adjoint modes" $\psi_j = W_o \varphi_j$, then

$$\langle \varphi_i, \varphi_j \rangle_{W_o} = \varphi_i^* W_o \varphi_j = \psi_i^* \varphi_j = \delta_{ij},$$

so these adjoint modes may be viewed as a biorthogonal decomposition with respect to the standard inner product $\langle \varphi, \psi \rangle = \psi^* \varphi$, as in the previous section. These adjoint modes are also rescaled versions of the rows of $S_1$ in (24), since one easily checks that, with $W_o = YY^*$, and $X^*Y = V_1 \Sigma_1 U_1^*$,

$$\begin{bmatrix} \psi_1^* \\ \vdots \\ \psi_r^* \end{bmatrix} := \Phi^* W_o = \Sigma_1^{-1} V_1^* X^* Y Y^* = U_1^* Y^*,$$

a rescaling of $S_1$ in (24).

3.4.3. Guaranteed stability

A useful consequence of using the observability Gramian as an inner product for Galerkin projection is that in this case, the reduced-order model preserves the stability of an equilibrium point at the origin, even if the full model is nonlinear. It is well-known that balanced truncations of stable linear systems are stable, but POD/Galerkin models of nonlinear systems may be unstable even if the nonlinear system is linearly stable at the origin [Smith, 2003].

The stability result follows from a result in [Rowley et al., 2004]: if the norm induced by an inner product is a Lyapunov function for a nonlinear system with a stable equilibrium point at the origin, then orthogonal projection of the dynamics onto any subspace will also be stable at the origin. One sees from (14) that $V(x) = \langle x, x \rangle_{W_o}$ is a Lyapunov function of the linearized system $\dot x = Ax$, with $\dot V(x) = -x^* C^* C x \le 0$. If the nonlinear system $\dot x = f(x)$ has a linearly stable equilibrium point at the origin, with $Df(0) = A$, then $V(x)$ is also a Lyapunov function for the nonlinear system, and so Galerkin projections using $\langle \cdot, \cdot \rangle_{W_o}$ will also be stable.

4. Example: Linearized Channel Flow

In order to compare the effectiveness of the three model reduction methods considered in this paper, we consider the problem of fluid flow in a plane channel. In particular, we use linearized equations with a coarse enough discretization that conventional balanced truncation is still computationally tractable. Since balanced POD is meant to approximate balanced truncation, we may evaluate how close the approximation is, and compare the resulting models to those formed with the standard POD/Galerkin method. Focusing on linearized

equations allows us to use operator norms to objectively compare the errors in the reduced order models.

4.1. Equations of motion

Consider the problem of a fluid flowing in a plane channel, as depicted in Fig. 1. We focus on the linearized case, considering small perturbations about a steady, laminar flow. The flow is assumed periodic in the $x$- and $z$-directions, with no-slip boundary conditions at the walls $y = \pm 1$. We force the flow with a body force given by $B(y,z) f(t)$, acting in the wall-normal direction (here $B(y,z)$ specifies the spatial distribution of the force, and $f(t)$ is regarded as an input). We restrict ourselves to streamwise-constant perturbations (no variations in the $x$-direction), and for this case the equations are given by

where $v$ is the wall-normal velocity and $\eta = u_z - w_x$ is the perturbation in wall-normal vorticity. Numerical investigations indicate that the laminar velocity profile $u = (U(y), 0, 0)$, with $U(y) = 1 - y^2$, is linearly stable for Reynolds numbers $R < 5772$ [Drazin & Reid, 1981], so the infinite-time Gramians will be well defined.

For the numerical examples considered here, we consider $R = 100$, on the domain $z \in [0, 2\pi]$, and discretize the problem using 16 Chebyshev modes in the $y$-direction, and 16 Fourier modes in the $z$-direction. The forcing $B(y,z)$ is zero everywhere except in a small region at the center of the domain ($y = 0$, $z = \pi$). We take the output to be the entire state, that is, the values of $(v, \eta)$ everywhere in space. The total number of states is $2 \cdot 16 \cdot 15 = 480$, which is small enough that we may compute the full Gramians exactly, for comparison with our approximate methods.

Fig. 1. Schematic of channel flow example.

4.2. Results

4.2.1. Hankel singular values

We begin by comparing the Hankel singular values $\sigma_j$, shown in Fig. 2. Here, the exact values for balanced truncation are compared to the approximate values for balanced POD, for both five-mode and ten-mode output projections $P_r$. Also shown are the POD eigenvalues $\lambda_j$, computed from (9); observe that the eigenvalues fall off quite rapidly. The first five POD modes capture 95.6% of the energy, while the first ten modes capture 99.8% of the energy. Thus, one expects that five-mode and ten-mode output projections should closely match the full input-output system.

In Fig. 2, the exact Hankel singular values are computed using the algorithm in [Laub et al., 1987], while the approximate versions are computed from (23). Both the primal and dual solutions were computed using 1000 snapshots equally spaced within time $0 \le t \le 200$, by which time transients have decayed to a maximum value of 0.0002, from a maximum value of 1 at the initial time.

For the five-mode output projection, the first five singular values match closely, while for the ten-mode output projection, the first ten singular values match. Though there is no guarantee that for an output projection of rank $r$, the first $r$ singular values will be approximated well, empirically this seems to be the case, at least for the channel flow problem.

4.2.2. Modes

The first three modes are plotted in Figs. 3-5, which compare modes from exact balanced truncation, balanced POD with a five-mode output projection, and conventional POD. As explained in Sec. 3.4, for exact balanced truncation and balanced POD, the $k$th mode is the $k$th column of the transformation $T$, from (15) and (24), respectively. The POD modes are the eigenvectors from (3), also columns of the matrix $\Phi$ from (10).

The modes from balanced POD are nearly identical to those from exact balanced truncation, even for the five-mode output projection. For the ten-mode output projection, the modes also look

visually identical, so these are not shown. The conventional POD modes look similar in general structure, especially mode 1, but there are distinct differences in modes 2 and 3. Of course, we would not expect the POD modes to be the same as the balancing modes, unless the observability Gramian $W_o$ is the identity, so it is interesting that the POD modes look so similar.

Fig. 2. Hankel singular values $\sigma_j$ for linearized channel flow: balanced truncation (×), balanced POD with five-mode output projection (○), ten-mode output projection (□); and POD eigenvalues $\lambda_j$ (△). [Vertical axis: $\sigma_j$, $\lambda_j$.]

Fig. 3. Mode 1 for channel flow. [Panels: normal velocity $v$ and normal vorticity $\eta$.]

Fig. 4. Mode 2 for channel flow. [Panels: normal velocity $v$ and normal vorticity $\eta$.]

Fig. 5. Mode 3 for channel flow. [Panels: normal velocity $v$ and normal vorticity $\eta$.]

4.2.3. Adjoint modes

The corresponding adjoint modes for balanced POD are shown in Fig. 6. These look visually identical to the adjoint modes from balanced truncation (i.e. the first three rows of $S_1$ in (24)), so these are not shown. Recall that the POD modes are orthogonal, not biorthogonal, so the "adjoint modes" for POD are the same as the primal modes shown in Figs. 3-5. The adjoint modes in Fig. 6 look quite different from the primal modes or the POD modes,

so it is reasonable to say that, for this problem, the main difference between balanced POD and conventional POD is the choice of inner product used for the projection.

Fig. 6. Adjoint modes 1-3 for balanced POD. The adjoint modes for balanced truncation are nearly identical, and the adjoint modes for POD are the same as the primal modes. [Panels: normal velocity $v$ and normal vorticity $\eta$, one row per mode.]

4.2.4. Error norms

The main reason for using a linear system to compare these model reduction procedures is to have an objective measure of how effective the various reduced-order models are at approximating the full-order system. For linear systems, we have norms which enable such an objective comparison. Perhaps the most intuitive norm is the $H_2$ norm, defined by (28). Since we have a single input, the impulse response matrix $G(t)$ is a column vector $g(t)$, and so

$$\| G \|_2^2 = \int_0^\infty | g(t) |^2 \, dt,$$

just the regular $L_2[0, \infty)$ norm of the impulse response vector. We can think of the error norm $\| G - G_r \|_2$ as being the RMS error between a simulation of the reduced-order model $G_r$ and a simulation of the full model $G$, where the simulation begins with $v(x, z, 0) = \eta(x, z, 0) = 0$, and the forcing is $f(t) = \delta(t)$. This error is shown in Fig. 7, as the order $r$ varies from 1 to 10. Notice that the error norms for balanced POD with both five-mode and ten-mode output projections are virtually the same as for balanced truncation, while POD is significantly worse for models of dimension six or smaller. For models of dimension greater than six, the error norms become smaller and all methods perform about the same.

Also shown is the error from an approximate balanced truncation in which the exact Gramians are computed, and then separately approximated by low-rank projections (to rank 30) using SVD. This separate reduction of Gramians is performed in the method of snapshots used in [Willcox & Peraire, 2002], although here their method of snapshots was not literally used, since it would require 480 adjoint simulations (the exact Gramians were computed by solving (14) instead). The balancing transformations are then found from the low-rank Gramians, and the $L_2$ errors of the resulting models are plotted in Fig. 7. One sees that the errors are significantly increased. It is interesting that if only the controllability Gramian is reduced to rank 30, while the exact observability Gramian is retained, then the results are similar to full balanced truncation or balanced POD (though these results are not shown in the figure). Thus, in truncating the observability Gramian, one is removing states that

are almost unobservable, but apparently strongly controllable, and this causes increased errors in the resulting models. This illustrates one of the advantages of our method of snapshots (Sec. 3.1), which does not require separate reduction of the Gramians.

Fig. 7. Error $\| G - G_r \|_2 / \| G \|_2$, for balanced truncation (×), balanced POD with five-mode and ten-mode output projection (○ and □), POD (△), and approximate balanced truncation with separate reduction of Gramians to rank 30 (▽). [Horizontal axis: $r$ (order of reduced model).]

Fig. 8. Error $\| G - G_r \|_\infty / \| G \|_\infty$, for balanced truncation (×), balanced POD with five-mode and ten-mode output projection (○ and □), POD (△), approximate balanced truncation with separate reduction of Gramians to rank 30 (▽), and lower bound for any model reduction scheme (line). [Horizontal axis: $r$ (order of reduced model).]

The differences between balanced truncation and POD become even more apparent when one

considers the $H_\infty$ norm $\| G - G_r \|_\infty$, defined by (16). This norm is perhaps the most useful, because it is an induced norm, and measures the maximum error over all possible inputs, not just an impulsive input. Figure 8 shows the error $\| G - G_r \|_\infty$ for the various reduced-order models $G_r$. Again, the norms for balanced POD are almost identical to the norms for exact balanced truncation, for both five-mode and ten-mode output projections. Here, the norms for POD are about an order of magnitude higher, for all models considered. The error from an approximate balanced truncation using a rank-30 reduction of the exact Gramians is also shown, and again results in larger errors for the more accurate models. Also shown in this figure is the lower bound (17) achievable by any reduced-order model of dimension $r$, and the balanced POD norms are indeed very close to this lower bound.

5. Conclusions

The balanced POD method described here is not the first to use empirical Gramians to compute approximate balanced truncations using simulation data. These empirical Gramians were used by Moore [1981] in his original development of balancing, and by others in extending balancing to nonlinear systems [Lall et al., 1999, 2002], and computing balancing transformations for large systems [Willcox & Peraire, 2002].

This work addresses computing balancing transformations (or approximations of them) for very large systems with, e.g. millions of states, as arise in discretizations of problems in fluids. Standard methods for computing balanced truncations involve singular value decompositions of the empirical Gramians, which are full $n \times n$ matrices (where $n$ is the number of states); this is not feasible when $n$ is large. Previous computational methods for large systems [Willcox & Peraire, 2002] involve separate reduction of the Gramians, which can lead to less accurate models, as we have seen (Figs. 7 and 8). The method of snapshots described in Sec. 3.1 allows computing balanced truncations from SVDs of much smaller matrices, with dimension $N_d \times N_p$, where $N_p$ and $N_d$ are numbers of snapshots in a dataset of primal and dual solutions, respectively, without separate reduction of the Gramians.

Furthermore, previous methods as in [Lall et al., 1999] and [Willcox & Peraire, 2002] are not tractable for systems with large numbers of outputs, as one must integrate an adjoint solution for each output. Section 3.2 describes an output projection method that approximates full balanced truncation with guaranteed error bounds, and dramatically reduces the number of adjoint solutions necessary. In the example shown, integration of five adjoint solutions produced models that were virtually indistinguishable in the $H_\infty$ norm from full balanced truncations, which would have required 480 adjoint simulations using previous methods.

The formulation of balanced POD also clarifies some connections between balanced truncation and POD, most importantly that for a linear system, balanced truncation is a special case of POD. In particular, one uses a dataset consisting of responses to unit impulses (one for each input), and uses the observability Gramian for the inner product. This inner product weights states of large "dynamical importance," as opposed to POD, which retains only the most energetic modes. This suggests that even for a nonlinear system, the observability Gramian from a linearization might be a good choice of inner product for POD, if reduced-order models are desired. The balanced POD procedure not only removes subjectivity in the choice of inner product for POD, but also guarantees that a Galerkin projection of a nonlinear system with a stable equilibrium point at the origin will also have a stable equilibrium point at the origin.

Although many of the developments in this paper are restricted to stable, linear systems, Sec. 3.4 suggests how many of these ideas might be extended to large-scale nonlinear systems as well, following the approaches in [Lall et al., 2002].

Acknowledgments

This work was partially supported by the NSF, grant CMS-0347239, under program manager M. Tomizuka; and by AFOSR, grant F49620-03-1-0081, under program managers B. King, S. Heise and J. Schmisseur.

References

Antoulas, A. C., Sorensen, D. C. & Gugercin, S. [2001] "A survey of model reduction methods for large-scale systems," Contemp. Math. 280, 193-219.
Bamieh, B. & Dahleh, M. [2001] "Energy amplification in channel flows with stochastic excitation," Phys. Fluids 13, 3258-3269.
Colonius, T. & Freund, J. B. [2002] "POD analysis of sound generation by a turbulent jet," AIAA Paper 2002-0072.

Cortelezzi, L. & Speyer, J. L. [1998] "Robust reduced-order controller of laminar boundary layer transitions," Phys. Rev. E 58, 1906-1910.
Drazin, P. G. & Reid, W. H. [1981] Hydrodynamic Stability (Cambridge University Press).
Dullerud, G. E. & Paganini, F. [1999] A Course in Robust Control Theory: A Convex Approach, Texts in Applied Mathematics, Vol. 36 (Springer-Verlag).
Farrell, B. F. & Ioannou, P. J. [1993] "Stochastic forcing of the linearized Navier-Stokes equations," Phys. Fluids A 5, 2600-2609.
Holmes, P., Lumley, J. L. & Berkooz, G. [1996] Turbulence, Coherent Structures, Dynamical Systems and Symmetry (Cambridge University Press).
Lall, S., Marsden, J. E. & Glavaski, S. [1999] "Empirical model reduction of controlled nonlinear systems," in Proc. IFAC World Congress, Vol. F, pp. 473-478.
Lall, S., Marsden, J. E. & Glavaski, S. [2002] "A subspace approach to balanced truncation for model reduction of nonlinear control systems," Int. J. Robust Nonlin. Contr. 12, 519-535.
Laub, A. J., Heath, M. T., Paige, C. C. & Ward, R. C. [1987] "Computation of balancing transformations and other applications of simultaneous diagonalization algorithms," IEEE Trans. Automat. Contr. 32, 115-122.
Lumley, J. L. [1970] Stochastic Tools in Turbulence (Academic Press).
Moore, B. C. [1981] "Principal component analysis in linear systems: Controllability, observability, and model reduction," IEEE Trans. Automat. Contr. 26, 17-32.
Rathinam, M. & Petzold, L. R. [2003] "A new look at proper orthogonal decomposition," SIAM J. Numer. Anal. 41, 1893-1925.
Rowley, C. W., Colonius, T. & Murray, R. M. [2004] "Model reduction for compressible flow using POD and Galerkin projection," Physica D 189, 115-129.
Scherpen, J. M. A. [1993] "Balancing for nonlinear systems," Syst. Contr. Lett. 21, 143-153.
Sirovich, L. [1987] "Turbulence and the dynamics of coherent structures, parts I-III," Q. Appl. Math. XLV, 561-590.
Smith, T. R. [2003] "Low-dimensional models of plane Couette flow using the proper orthogonal decomposition," PhD thesis, Princeton University.
Willcox, K. & Peraire, J. [2002] "Balanced model reduction via the proper orthogonal decomposition," AIAA J. 40, 2323-2330.

Appendix A

Theorems on Computing Balancing Transformations

Here, we consider empirical Gramians defined by (22), with balancing transformations $T_1$ and $S_1$ defined by (23)-(24). The following proposition establishes that if one takes enough snapshots that the empirical Gramians $W_c$ and $W_o$ have full rank $n$ (clearly, at least $n$ snapshots are required, and the system must be both controllable and observable), then $\Sigma_1$ contains the Hankel singular values (square roots of the eigenvalues of the product $W_c W_o$), and $T_1$ is the balancing transformation that simultaneously diagonalizes $W_c$ and $W_o$.

Proposition 1. Let $W_c$ and $W_o$ be empirical Gramians defined by (22), and suppose $Y^*X$ has rank $r = n$. Then the matrix $T_1$ is square and invertible, with inverse $S_1$, and

$$S_1 W_c S_1^* = T_1^* W_o T_1 = \Sigma_1.$$

Proof. To show $S_1 = T_1^{-1}$, we have

$$S_1 T_1 = \Sigma_1^{-1/2} U_1^* Y^* X V_1 \Sigma_1^{-1/2} = \Sigma_1^{-1/2} \Sigma_1 \Sigma_1^{-1/2} = I_n.$$

Also,

$$S_1 W_c S_1^* = \Sigma_1^{-1/2} U_1^* Y^* X X^* Y U_1 \Sigma_1^{-1/2} = \Sigma_1^{-1/2} (\Sigma_1 V_1^*)(V_1 \Sigma_1) \Sigma_1^{-1/2} = \Sigma_1,$$

and a similar calculation shows $T_1^* W_o T_1 = \Sigma_1$. ∎

Of course, our main interest is in large systems for which the number of snapshots, and hence the rank of $W_c$, $W_o$, is much smaller than $n$. The following proposition establishes that in this case, $\Sigma_1$ also contains all nonzero Hankel singular values, and $T_1$ contains the first $r$ columns of the balancing transformation.

Proposition 2. Suppose $Y^*X$ has rank $r < n$. Then there exist matrices $T_2 \in \mathbb{R}^{n \times (n-r)}$ and $S_2 \in \mathbb{R}^{(n-r) \times n}$ such that for

$$T = \begin{bmatrix} T_1 & T_2 \end{bmatrix}, \qquad S = \begin{bmatrix} S_1 \\ S_2 \end{bmatrix},$$

$T$ is invertible with $T^{-1} = S$, and

$$S W_c W_o T = \begin{bmatrix} \Sigma_1^2 & 0 \\ 0 & 0 \end{bmatrix}, \tag{A.1}$$

and furthermore, columns are linearly independent. Define S2 as the


last n — r rows of T _ 1 , and it follows that S2T1 = 0.
Si 0 First, we show
swcs* =
0 Mi
(A.2)
•Tfwyzi T;W0T2~\ _ rsi o
T*W0T =
Si 0 _T2*W0Ti T^W0T2\ ~ [ 0 M2
T*W0T =
0 M2 As in the proof of Theorem 1, T^W0TX = E i . Next,
where M\ and M 2 are matrices in (n-r)x(n-r) T?W0T2 = Z~1/2V1*X*YY*T2

Proof. As in the proof of Theorem 1, S\T\ = Ir.


= sr 1/2 (Sit/r)y*T 2 = EiS-ir2 = o rx(n _ r) ,
Choose T2 such that its columns form a basis for the and thus T%W0TX = (T?W0T2)* = 0 ( n _ r ) x r . The
nullspace of S\ (an (n — r)-dimensional subspace of results for 5W C 5* and SW C W 0 T follow similarly,
R n ). Then S{T2 = 0, and T is invertible, since its using S2Ti = 0 . •
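Proposition 2 can be checked numerically on a small random example. The Python sketch below is only an illustration: it assumes the factorizations $W_c = XX^*$ and $W_o = YY^*$, the SVD $Y^*X = U_1\Sigma_1V_1^*$, and the definitions $T_1 = XV_1\Sigma_1^{-1/2}$, $S_1 = \Sigma_1^{-1/2}U_1^*Y^*$ that are implied by the proof above. The sizes `n` and `r` and the random snapshot matrices are hypothetical.

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(0)
n, r = 8, 3                        # illustrative state dimension and snapshot count
X = rng.standard_normal((n, r))    # assumed factor of W_c = X X^*
Y = rng.standard_normal((n, r))    # assumed factor of W_o = Y Y^*
Wc, Wo = X @ X.T, Y @ Y.T

# SVD of Y^* X (r x r, full rank for generic random data).
U, s, Vt = np.linalg.svd(Y.T @ X)
T1 = X @ Vt.T / np.sqrt(s)         # T_1 = X V_1 Sigma_1^{-1/2}
S1 = (U / np.sqrt(s)).T @ Y.T      # S_1 = Sigma_1^{-1/2} U_1^* Y^*

# Complete T_1 with a basis for the nullspace of S_1, as in the proof.
T2 = null_space(S1)                # n x (n - r)
T = np.hstack([T1, T2])
S = np.linalg.inv(T)               # first r rows of S coincide with S_1

expected = np.zeros((n, n))
expected[:r, :r] = np.diag(s**2)   # right-hand side of (A.1)

print("max |S Wc Wo T - diag(Sigma^2, 0)| =",
      np.abs(S @ Wc @ Wo @ T - expected).max())
print("Hankel singular values:", s)
```

In this example the number of snapshots equals $r$, so the blocks $M_1$ and $M_2$ of (A.2) happen to vanish; with more general Gramians they need not.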
BIFURCATION TRACKING ALGORITHMS AND
SOFTWARE FOR LARGE SCALE APPLICATIONS
A. G. SALINGER*, E. A. BURROUGHS†, R. P. PAWLOWSKI,
E. T. PHIPPS and L. A. ROMERO
Sandia National Laboratories, Albuquerque,
NM 87185-1111, USA
*agsalin@sandia.gov
†Current address: Department of Mathematics, Humboldt State University, Arcata, CA.

Received March 30, 2004; Revised June 28, 2004

We present the bifurcation tracking algorithms that have been developed in the LOCA software library to work with large-scale application codes that use a fully coupled Newton's method with iterative linear solvers. Turning point (fold), pitchfork, and Hopf bifurcation tracking algorithms based on Newton's method have been implemented, with particular attention to scalability to large problem sizes on parallel computers and to ease of implementation with new application codes. The ease of implementation is accomplished by using block elimination algorithms to solve the Newton iterations of the augmented bifurcation tracking systems. The applicability of such algorithms to large applications is in doubt, since the main computational kernel of these routines is the iterative linear solve of the same matrix that is being driven singular by the algorithm. To test the robustness and scalability of these algorithms, the LOCA library has been interfaced with the MPSalsa massively parallel finite element reacting flows code. A bifurcation analysis of a 1.6 million unknown model of 3D Rayleigh-Bénard convection in a 5 x 5 x 1 box is successfully undertaken, showing that the algorithms can indeed scale to problems of this size while producing solutions of reasonable accuracy.

Keywords: Bifurcation analysis; continuation; flow stability.

1. Introduction

Bifurcation analysis is an important and powerful tool for performing computational design of modeled systems. Identifying bifurcations in parameter space is important since they represent a discontinuous change in a system's behavior with respect to changes in a parameter. This behavior is often an undesirable phenomenon to be designed away from, such as the onset of flow instabilities in a chemical vapor deposition reactor [Pawlowski et al., 2001] or the buckling of a structure [Fujii et al., 2000]. It can also be desired, such as the onset of oscillations in a resonant tunnelling diode [Lasater et al., 2004] or symmetry breaking in morphogenesis [Muratov & Shvartsman, 2003]. In any event, the computational design process of numerous systems can be aided by the availability of software with efficient and robust algorithms for tracking bifurcations. In this work we present the algorithms implemented in the LOCA library that have been developed for large-scale applications, such as those arising from the discretizations of PDEs in multiple dimensions. (By "large-scale" Newton-based applications we mean those that use approximate iterative methods to solve the linear system in Newton's method, and that are likely to run in parallel.)

Certainly, bifurcation analysis software exists. A partial list includes the AUTO code of Doedel

et al. [1997], CONTENT by Kuznetsov and Levitin [1995-1997], the MATCONT package for use within Matlab by Dhooge et al. [2003], and DDE-Biftool for delay differential equations by Engelborghs et al. [2002]. As a generalization, these software packages are aimed at applications consisting of sets of ODEs, including those that come from discretizations of 1D PDEs. For these problems, the developers of these codes have implemented bifurcation analysis capabilities that go well beyond the generic one-parameter bifurcations that are the focus of this paper, including the tracking of periodic orbits, heteroclinic orbits, bifurcations of delay equations, and tracking of higher co-dimension bifurcations. It is our understanding that the only general purpose bifurcation analysis software for large-scale systems is the PDEcont code of Lust (e.g. [Lust et al., 1998]), which uses a Newton-Picard algorithm and is aimed at transient-based simulation codes.

The development of algorithms for larger problems, such as those coming from multi-dimensional PDEs, is not new. A thorough treatment is presented in the book by Govaerts [2000]. Also, an excellent review of the theory, algorithms, and applications to problems in fluid mechanics was published by Cliffe et al. [2000a]. What distinguishes our present work from previous work (such as that using the Entwife code [Cliffe et al., 2000b]) is that we have worked towards developing a general purpose software library for these problems, and have therefore maintained a separation between the bifurcation library and the application code. Because of this, we have refrained from major modifications to the application codes, such as explicitly forming the augmented systems for distinguishing bifurcation points or computing analytic derivatives for additional quantities needed in the bifurcation analysis. Furthermore, we have targeted very large systems where direct solvers are no longer a scalable option. In this respect, our present application is most closely related to the methods of Tuckerman and coworkers (e.g. [Mamun & Tuckerman, 1995; Nore et al., 2003; Xin & Le Quéré, 2002]), who use a matrix-free Newton-Krylov approach to solve for fixed points and to perform stability analysis of a time-stepper for 2D and 3D fluid mechanics applications.

In our development of the LOCA software, we have targeted codes that use a Newton-based solution algorithm and iterative linear solvers to reach equilibrium solutions. (While the algorithms presented here do work for problems with direct solvers, they do not take advantage of such capabilities as convenient monitoring of the sign of the determinant.) This is a different set of application codes than those targeted by the PDEcont code, and our approach is therefore complementary. Because there are numerous linear solver algorithms tailored to different physics, data structures, and even discretizations, and because this is an active area of research and development, we have chosen not to own this computation in the continuation and bifurcation library. Instead, we have implemented block elimination algorithms, sometimes referred to as bordering algorithms, that use the solve of a linear system with the Jacobian matrix ($\mathbf{J}^{-1}\mathbf{v}$) as the main computational kernel. (The Hopf tracking algorithm is an exception.)

The ramifications of this approach, which is motivated by implementation and software considerations rather than numerics, are many. On the positive side, this approach renders the library readily usable by any Newton-based code, which must by definition already possess this inversion capability. The library can be written with no knowledge of the matrix and its (parallel) data structures or solution algorithm. On the negative side, the bifurcation tracking algorithms are numerically unstable, because they use the linear solve of the Jacobian matrix as part of the iteration process that drives that same matrix singular. This will be seen clearly in the presentations of the algorithms in Sec. 2, and the effect will be documented in a numerical experiment in Sec. 3.4.

To demonstrate and evaluate the algorithms, we present in Sec. 3 results for tracking secondary bifurcations in the classical Rayleigh-Bénard problem. This problem involves natural convection flows and the discretization of five coupled PDEs in three dimensions. A brief description of the problem and PDE solution algorithms is presented in Sec. 3.1, followed by bifurcation tracking results in Secs. 3.2 and 3.3.

The bifurcation tracking algorithms presented and demonstrated in this paper are included in the LOCA software library along with complementary capabilities for parameter continuation and linear stability analysis. The parameter continuation routines include the pseudo-arclength continuation algorithm [Keller, 1977] and multiparameter continuation using the multifario code of Henderson [2002]. We have previously reported (see
[Lehoucq & Salinger, 2001a; Burroughs et al., 2001, 2004]) on our approach to large-scale eigenvalue approximation using the generalized Cayley transformation followed by Arnoldi iterations with the ARPACK code [Lehoucq et al., 1998; Maschhoff & Sorensen, 1996]. We have found that the eigensolver exhibits even better scalability than the steady state solution algorithm, since the matrix requiring inversion in the Cayley transform is better conditioned than the Jacobian matrix.

The application presented in this paper is the largest we have analyzed with LOCA and serves to demonstrate the scalability of the algorithms on a familiar problem. Other large-scale applications that have been analyzed include natural convection flows in 2D enclosures [Salinger et al., 2002a; Burroughs et al., 2004], flows in chemical reactors [Pawlowski et al., 2001], and density functional theory calculations of capillary condensation of confined fluids [Salinger & Frink, 2003; Frink & Salinger, 2003] and polymer self-assembly [Frischknecht et al., 2002]. Current work includes the release of a completely new version of LOCA as part of a larger solver framework effort [Heroux et al., 2003], and the development and implementation of alternative algorithms for mitigating or removing the solves of the nearly singular systems in the current algorithms.

2. Bifurcation Tracking Algorithms

In this section we describe the methods implemented in the LOCA library for locating three common instabilities exhibited in nonlinear systems: turning point, pitchfork and Hopf bifurcations. Each of the algorithms solves simultaneously for the steady state solution vector $\mathbf{x}$ of length $n$, the parameter at which the bifurcation occurs, $\lambda$, and the null vector $\mathbf{w} = \mathbf{y} + i\mathbf{z}$, which is the eigenvector associated with the eigenvalue that has zero real part. The bifurcations are tracked as a function of a second parameter to generate the loci of bifurcation points in two-parameter space.

It is assumed that the application code uses a fully coupled Newton method to solve for steady states of a set of nonlinear equations. In this paper, the equations are the $n$ residual equations $\mathbf{R}$ of the finite element discretization of the PDEs that govern fluid flow and heat transfer. The steady state problem is written as

$$\mathbf{R}(\mathbf{x}, \lambda) = \mathbf{0}, \tag{1}$$

which, given an initial guess for $\mathbf{x}$, is solved iteratively with Newton's method,

$$\mathbf{J}\,\Delta\mathbf{x} = -\mathbf{R}; \qquad \mathbf{x}_{\text{new}} = \mathbf{x} + \Delta\mathbf{x}, \tag{2}$$

where the Jacobian matrix is $\mathbf{J} = \partial\mathbf{R}/\partial\mathbf{x}$. The iteration on $\mathbf{x}$ converges when $\|\Delta\mathbf{x}\|$ and/or $\|\mathbf{R}\|$ decrease below some tolerances. For scalability to large applications, the matrix equation (2) must be solved iteratively.

Steady state solution branches are tracked using continuation algorithms. Zero order continuation (natural parameter continuation using the previous solution as the initial guess), first order continuation (natural continuation with an Euler predictor requiring an extra matrix solve), and pseudo-arclength continuation algorithms [Keller, 1977] have all been implemented in the LOCA library. Details of these methods can be found elsewhere [Cliffe et al., 2000a; Salinger et al., 2002b], and include code to automatically balance the scaling between the solution and parameter components of the arclength constraint, as well as step size control algorithms.

The stability of the steady solutions to small perturbations can be ascertained through linear stability analysis. Linearization of the transient equations, which can be written generally as $\mathbf{R}(\dot{\mathbf{x}}, \mathbf{x}, \lambda) = \mathbf{0}$, around the steady state leads to a generalized eigenvalue problem of the form

$$\mathbf{J}\mathbf{w} = \gamma\mathbf{B}\mathbf{w}, \tag{3}$$

where $\mathbf{B} = -\partial\mathbf{R}/\partial\dot{\mathbf{x}}$ is the matrix of coefficients of the time-dependent terms, $\gamma$ is an eigenvalue of the system (generally complex), and $\mathbf{w}$ is the associated eigenvector, which can be written in terms of real-valued vectors as $\mathbf{w} = \mathbf{y} + i\mathbf{z}$. If any eigenvalue has positive real part, then perturbations with any component in the direction of the associated eigenvector will grow exponentially, and the steady state solution is deemed unstable. A system loses stability, and experiences a bifurcation, when a stable steady state solution branch, as parameterized by a system parameter $\lambda$, passes through a point where $\mathrm{Real}(\gamma) = 0$.

We have developed a robust linear stability analysis capability for large-scale problems that accurately approximates the leading eigenvalues of the system in Eq. (3). Details of the method are found in [Lehoucq & Salinger, 2001b], while benchmarking and application of the method to incompressible flows are found in [Burroughs et al., 2001, 2004; Salinger et al., 1999].
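The three ingredients just described, Newton's method (2), zero order continuation, and the generalized eigenvalue problem (3), can be illustrated on a toy problem. The Python sketch below is an illustration under stated assumptions, not the LOCA implementation: the two-equation `residual`, its `jacobian`, and the identity mass matrix `B` are hypothetical stand-ins for an application code, and dense NumPy/SciPy solves replace the iterative linear solvers and the Cayley/Arnoldi eigensolver described above.

```python
import numpy as np
from scipy.linalg import eigvals

def residual(x, lam):
    # Hypothetical steady-state residual R(x, lam); steady states have x0 = +/- sqrt(lam).
    return np.array([lam - x[0] ** 2, x[0] - x[1]])

def jacobian(x, lam):
    # J = dR/dx for the toy residual above.
    return np.array([[-2.0 * x[0], 0.0], [1.0, -1.0]])

B = np.eye(2)  # assumed mass matrix, B = -dR/d(xdot)

def newton(x, lam, tol=1e-12, max_iter=25):
    """Newton's method, Eq. (2): solve J dx = -R, then x <- x + dx."""
    for _ in range(max_iter):
        dx = np.linalg.solve(jacobian(x, lam), -residual(x, lam))
        x = x + dx
        if np.linalg.norm(dx) < tol and np.linalg.norm(residual(x, lam)) < tol:
            return x
    raise RuntimeError("Newton iteration did not converge")

# Zero order continuation: the previous solution is the next initial guess.
x = np.array([1.0, 1.0])
branch = []
for lam in np.linspace(1.0, 0.04, 9):
    x = newton(x, lam)
    gamma = eigvals(jacobian(x, lam), B)   # generalized eigenvalues of Eq. (3)
    branch.append((lam, x.copy(), gamma.real.max()))

for lam, xs, g in branch:
    print(f"lambda = {lam:5.2f}  x0 = {xs[0]:.6f}  leading Re(gamma) = {g:+.4f}")
```

On this branch the leading eigenvalue stays negative but approaches zero as the parameter decreases toward the fold of the toy problem, which is the situation the tracking algorithms of the following sections are designed to locate directly.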
2.1. The turning point (fold) tracking algorithm

The turning point (fold) tracking algorithm in LOCA uses Newton's method to converge to a turning point and simple zero order continuation to track it as a function of a second parameter. At a turning point bifurcation (or fold), there is a single eigenvalue $\gamma = 0$ with an associated real null vector $\mathbf{y}$. We use the formulation of Moore and Spence [1980] to characterize the turning point:

$$\mathbf{R} = \mathbf{0}, \tag{4}$$

$$\mathbf{J}\mathbf{y} = \mathbf{0}, \tag{5}$$

$$\boldsymbol{\phi}\cdot\mathbf{y} - 1 = 0. \tag{6}$$

Here $\boldsymbol{\phi}$ is a constant vector. The first vector equation (which is $n$ scalar equations) specifies that the solution be on the steady state solution branch, the second vector equation specifies that a real-valued eigenvector $\mathbf{y}$ exists that corresponds to a zero eigenvalue, and the last scalar equation pins the length of the null vector (and removes the trivial solution $\mathbf{y} = \mathbf{0}$). This set of $2n + 1$ equations uniquely specifies the values of $\mathbf{x}$, $\mathbf{y}$, and $\lambda$ given a nondegenerate turning point, as long as $\boldsymbol{\phi}$ is chosen to be any vector such that $\boldsymbol{\phi}\cdot\mathbf{y} \neq 0$.

A full Newton's method applied to this system requires linear solves of the form

$$
\begin{bmatrix}
\mathbf{J} & \mathbf{0} & \dfrac{\partial\mathbf{R}}{\partial\lambda} \\
\dfrac{\partial\mathbf{J}\mathbf{y}}{\partial\mathbf{x}} & \mathbf{J} & \dfrac{\partial\mathbf{J}\mathbf{y}}{\partial\lambda} \\
\mathbf{0}^T & \boldsymbol{\phi}^T & 0
\end{bmatrix}
\begin{bmatrix} \Delta\mathbf{x} \\ \Delta\mathbf{y} \\ \Delta\lambda \end{bmatrix}
= -\begin{bmatrix} \mathbf{R} \\ \mathbf{J}\mathbf{y} \\ \boldsymbol{\phi}\cdot\mathbf{y} - 1 \end{bmatrix}. \tag{7}
$$

It would be desirable to formulate this system and send it to an efficient linear solver, but this is not practical with many large-scale engineering simulation codes. One hurdle would be the formulation of $\partial\mathbf{J}\mathbf{y}/\partial\mathbf{x}$, which is a matrix formed by the derivative of the vector $\mathbf{J}\mathbf{y}$ with respect to the vector $\mathbf{x}$ (the same as the notation $\mathbf{J} = \partial\mathbf{R}/\partial\mathbf{x}$). The computation of this matrix requires derivatives not normally calculated in an engineering code and does not lend itself well to efficient numerical differentiation. The second issue is the work involved in determining the sparse matrix storage for iterative linear solvers and the partitioning and load balancing for applications sent to parallel computers. The last row and column are not in general sparse, and so matrix-vector multiplications would require global communications between all processors. The sparsity of the matrix $\mathbf{J}$ coming from many PDE solution methods (e.g. finite element, finite difference, finite volume) limits communications in the linear solver to only local communications between a processor and roughly 10 of its neighbors.

To reduce the effort in implementing the bifurcation algorithms with application codes, block elimination algorithms are used to solve the system of equations in (7). The solution to (7) can be equivalently formulated with four linear solves of the matrix $\mathbf{J}$ [Eqs. (8)-(11)] and some simple algebra:

$$\mathbf{J}\mathbf{a} = -\mathbf{R}, \tag{8}$$

$$\mathbf{J}\mathbf{b} = -\frac{\partial\mathbf{R}}{\partial\lambda}, \tag{9}$$

$$\mathbf{J}\mathbf{c} = -\frac{\partial\mathbf{J}\mathbf{y}}{\partial\mathbf{x}}\,\mathbf{a}, \tag{10}$$

$$\mathbf{J}\mathbf{d} = -\frac{\partial\mathbf{J}\mathbf{y}}{\partial\mathbf{x}}\,\mathbf{b} - \frac{\partial\mathbf{J}\mathbf{y}}{\partial\lambda}, \tag{11}$$

$$\Delta\lambda = \frac{1 - \boldsymbol{\phi}\cdot\mathbf{c}}{\boldsymbol{\phi}\cdot\mathbf{d}}, \tag{12}$$

$$\Delta\mathbf{x} = \mathbf{a} + \Delta\lambda\,\mathbf{b}, \tag{13}$$

$$\Delta\mathbf{y} = \mathbf{c} + \Delta\lambda\,\mathbf{d} - \mathbf{y}. \tag{14}$$

The variables $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ and $\mathbf{d}$ are temporary vectors of length $n$. Each of the four linear solves with $\mathbf{J}$ is performed by the application code, in the same way that this matrix is solved in the Newton iteration (2). Work is saved in the second, third and fourth solves by reusing the preconditioner for a preconditioned iterative solver (or the factorization for a direct solver). The algorithm requires initial guesses for $\mathbf{x}$ and $\lambda$, which usually come from a steady solution near the turning point as located by an arclength continuation run. The initial guess for the null vector is chosen to be a scaled version of the $\mathbf{b}$ vector from Eq. (9),

$$\mathbf{y}_{\text{init}} = \frac{\mathbf{b}}{\boldsymbol{\phi}\cdot\mathbf{b}}. \tag{15}$$

The logic for this choice is based upon the realization that if $\mathbf{J}$ is nearly singular, then $\mathbf{J}^{-1}(\partial\mathbf{R}/\partial\lambda)$ should have a large component in the direction of the null vector. For coupled PDE applications, we have found that a good choice for the scaling vector $\boldsymbol{\phi}$ is the vector given by the inverse of the average
of the solution values for each PDE variable. This tends to make each variable's contribution to $\boldsymbol{\phi}\cdot\mathbf{y}$ of the same relative magnitude.

The derivatives on the right-hand side of Eqs. (9)-(11) are all calculated with first-order finite differences and directional derivatives. The following formulas are used:

$$\frac{\partial\mathbf{R}}{\partial\lambda} \approx \frac{\mathbf{R}(\mathbf{x}, \lambda + \varepsilon_1) - \mathbf{R}(\mathbf{x}, \lambda)}{\varepsilon_1}, \tag{16}$$

$$\frac{\partial\mathbf{J}\mathbf{y}}{\partial\mathbf{x}}\,\mathbf{a} \approx \frac{\mathbf{J}(\mathbf{x} + \varepsilon_2\mathbf{a}, \lambda)\mathbf{y} - \mathbf{J}(\mathbf{x}, \lambda)\mathbf{y}}{\varepsilon_2}, \tag{17}$$

$$\frac{\partial\mathbf{J}\mathbf{y}}{\partial\mathbf{x}}\,\mathbf{b} + \frac{\partial\mathbf{J}\mathbf{y}}{\partial\lambda} \approx \frac{1}{\varepsilon_3}\mathbf{J}(\mathbf{x} + \varepsilon_3\mathbf{b}, \lambda)\mathbf{y} + \frac{1}{\varepsilon_1}\mathbf{J}(\mathbf{x}, \lambda + \varepsilon_1)\mathbf{y} - \left(\frac{1}{\varepsilon_3} + \frac{1}{\varepsilon_1}\right)\mathbf{J}(\mathbf{x}, \lambda)\mathbf{y}. \tag{18}$$

The robustness and accuracy of the algorithm depend on the choice of the perturbations $\varepsilon$. The following choices have been found to work well on sample applications for $\delta = 10^{-6}$:

$$\varepsilon_1 = \delta\,(|\lambda| + \delta), \tag{19}$$

$$\varepsilon_2 = \delta\left(\frac{\|\mathbf{x}\|}{\|\mathbf{a}\|} + \delta\right), \tag{20}$$

$$\varepsilon_3 = \delta\left(\frac{\|\mathbf{x}\|}{\|\mathbf{b}\|} + \delta\right). \tag{21}$$

After convergence to a turning point, a slight modification of simple zero order continuation is often used to converge to the next turning point at the next value of the second parameter. We have found more robust convergence when the solution vector $\mathbf{x}$ is perturbed off the singularity by a small random perturbation of relative magnitude $10^{-5}$. The initial guesses for $\lambda$ and $\mathbf{y}$ are the converged values at the previous turning point. The constant vector $\boldsymbol{\phi}$ may be recomputed by recalculating the average of the solution values across the mesh, although we typically find this unnecessary.

It should be pointed out that the minimally augmented system

$$\mathbf{R} = \mathbf{0}, \tag{22}$$

$$\sigma = 0, \tag{23}$$

is recommended in the literature for use as the set of defining equations for a turning point bifurcation in place of Eqs. (4)-(6) listed above [Govaerts, 2000]. Here $\sigma$ is a scalar measure of the singularity of $\mathbf{J}$ that is implicitly defined through additional matrix equations. An algorithm based on this approach should be more robust than the current method, yet requires solves with the transpose of the Jacobian matrix. This would preclude its use by codes that do not have the capability to solve $\mathbf{J}^{-T}\mathbf{v}$, such as those that use matrix-free Newton-Krylov methods.

2.2. The pitchfork tracking algorithm

An algorithm for tracking pitchfork bifurcations has been developed that requires few modifications to the application code and model. Pitchfork bifurcations occur when a symmetric solution loses stability to a pair of asymmetric solutions. In this algorithm, we require that the user define the symmetry by supplying a constant vector, $\boldsymbol{\psi}$, that is antisymmetric with respect to the symmetry being broken. We specify the pitchfork by the following set of $(2n + 2)$ coupled equations:

$$\mathbf{R} + \sigma\boldsymbol{\psi} = \mathbf{0}, \tag{24}$$

$$\mathbf{J}\mathbf{y} = \mathbf{0}, \tag{25}$$

$$\langle\mathbf{x}, \boldsymbol{\psi}\rangle = 0, \tag{26}$$

$$\boldsymbol{\phi}\cdot\mathbf{y} - 1 = 0. \tag{27}$$

The variable not previously defined in the turning point algorithm is the scalar $\sigma$, a slack variable representing the asymmetry in the problem. This additional unknown is associated with the additional equation (26), which enforces that the solution vector is orthogonal to the antisymmetric vector. The notation in this equation represents an inner product. For a symmetric model, $\sigma$ will go to zero at the solution. This approach for generating a regular system has been presented by Govaerts [2000] as an alternative to the approach of Werner and Spence [1984].

There are a few assumptions that were made to ease the implementation of the pitchfork tracking algorithm, yet can make it trickier to use. First, we require that any odd symmetry in the variables is about zero, so that the inner product of the solution vector with the antisymmetric vector is zero. For instance, the cold and hot temperatures in a thermal flow problem should be set at $-0.5$ and $0.5$ instead of $0$ and $1$. Second, our current implementation uses a dot product of the
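vectors to calculate the inner product $\langle\mathbf{x}, \boldsymbol{\psi}\rangle$; however, this strictly should be an integral over the computational domain. For instance, if the discretization (i.e. finite element mesh) is not symmetric with respect to the symmetry, then the dot product of the solution vector and antisymmetric coefficient vectors would not be zero. We allow the users of the LOCA library to supply the integrated inner product, yet in our applications we have replaced it with the vector dot product. If the mesh is not symmetric with respect to the symmetry in the PDEs that is being broken at the pitchfork bifurcation, the discretized system will exhibit an imperfect bifurcation. The algorithm presented here will converge to a point that is a reasonable approximation of the pitchfork bifurcation. However, at this point $\sigma \neq 0$ and therefore we will not have $\mathbf{R} = \mathbf{0}$.

To start the algorithm, we require the user to supply the vector $\boldsymbol{\psi}$. The null vector $\mathbf{y}$ has the antisymmetry that we are requiring of $\boldsymbol{\psi}$. We calculate $\boldsymbol{\psi}$ and the initial guess for $\mathbf{y}$ by first detecting the pitchfork bifurcation with an eigensolver. The eigenvector associated with the eigenvalue that is passing through zero at the pitchfork is used for $\boldsymbol{\psi}$ and the initial guess for $\mathbf{y}$. For problems that have multiple pitchfork bifurcations in the same region of parameter space, which is often the case when the system can go unstable to different modes, the pitchfork algorithm can be started multiple times with different $\boldsymbol{\psi}$ vectors to track each pitchfork separately. We choose $\sigma = 0$ as an initial guess and we rarely see it increase past $10^{-10}$ throughout the iterations. The constant vector $\boldsymbol{\phi}$ is chosen to be the scaling vector as in the turning point algorithm.

As with the turning point algorithm, we use a fully coupled Newton method to converge to the pitchfork bifurcation and a block elimination algorithm to simplify the solution of the Newton iteration. The Newton iteration for this system is

$$
\begin{bmatrix}
\mathbf{J} & \mathbf{0} & \boldsymbol{\psi} & \dfrac{\partial\mathbf{R}}{\partial\lambda} \\
\dfrac{\partial\mathbf{J}\mathbf{y}}{\partial\mathbf{x}} & \mathbf{J} & \mathbf{0} & \dfrac{\partial\mathbf{J}\mathbf{y}}{\partial\lambda} \\
\boldsymbol{\psi}^T & \mathbf{0}^T & 0 & 0 \\
\mathbf{0}^T & \boldsymbol{\phi}^T & 0 & 0
\end{bmatrix}
\begin{bmatrix} \Delta\mathbf{x} \\ \Delta\mathbf{y} \\ \Delta\sigma \\ \Delta\lambda \end{bmatrix}
= -\begin{bmatrix} \mathbf{R} + \sigma\boldsymbol{\psi} \\ \mathbf{J}\mathbf{y} \\ \langle\mathbf{x}, \boldsymbol{\psi}\rangle \\ \boldsymbol{\phi}\cdot\mathbf{y} - 1 \end{bmatrix}. \tag{28}
$$

It can be solved using a mathematically (but not numerically) equivalent block elimination algorithm:

$$\mathbf{J}\mathbf{a} = -\mathbf{R}, \tag{29}$$

$$\mathbf{J}\mathbf{b} = -\frac{\partial\mathbf{R}}{\partial\lambda}, \tag{30}$$

$$\mathbf{J}\mathbf{c} = -\boldsymbol{\psi}, \tag{31}$$

$$\mathbf{J}\mathbf{d} = -\frac{\partial\mathbf{J}\mathbf{y}}{\partial\mathbf{x}}\,\mathbf{a}, \tag{32}$$

$$\mathbf{J}\mathbf{e} = -\frac{\partial\mathbf{J}\mathbf{y}}{\partial\mathbf{x}}\,\mathbf{b} - \frac{\partial\mathbf{J}\mathbf{y}}{\partial\lambda}, \tag{33}$$

$$\mathbf{J}\mathbf{f} = -\frac{\partial\mathbf{J}\mathbf{y}}{\partial\mathbf{x}}\,\mathbf{c}, \tag{34}$$

$$\Delta\sigma = -\sigma + \frac{\left(\langle\mathbf{x}, \boldsymbol{\psi}\rangle + \langle\mathbf{a}, \boldsymbol{\psi}\rangle\right)\boldsymbol{\phi}\cdot\mathbf{e} + \langle\mathbf{b}, \boldsymbol{\psi}\rangle\left(1 - \boldsymbol{\phi}\cdot\mathbf{d}\right)}{\langle\mathbf{b}, \boldsymbol{\psi}\rangle\,\boldsymbol{\phi}\cdot\mathbf{f} - \langle\mathbf{c}, \boldsymbol{\psi}\rangle\,\boldsymbol{\phi}\cdot\mathbf{e}}, \tag{35}$$

$$\Delta\lambda = \frac{1 - \boldsymbol{\phi}\cdot\mathbf{d} - \boldsymbol{\phi}\cdot\mathbf{f}\,(\Delta\sigma + \sigma)}{\boldsymbol{\phi}\cdot\mathbf{e}}, \tag{36}$$

$$\Delta\mathbf{x} = \mathbf{a} + \Delta\lambda\,\mathbf{b} + (\Delta\sigma + \sigma)\,\mathbf{c}, \tag{37}$$

$$\Delta\mathbf{y} = \mathbf{d} + \Delta\lambda\,\mathbf{e} + (\Delta\sigma + \sigma)\,\mathbf{f} - \mathbf{y}. \tag{38}$$

This algorithm has six temporary vectors ($\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$, $\mathbf{d}$, $\mathbf{e}$, and $\mathbf{f}$), each of which is the result of a linear solve with the same matrix $\mathbf{J}$. Again the preconditioner is only calculated once. The use of a block solver, where the solves for $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$ are performed simultaneously (as are those for $\mathbf{d}$, $\mathbf{e}$, and $\mathbf{f}$), would be advantageous. The right-hand sides of these six linear systems are mostly the same as for the turning point algorithm, and so reuse the same routines, differencing schemes and perturbations presented above [Eqs. (16) and (19)].

2.3. The Hopf tracking algorithm

The algorithm for tracking Hopf bifurcations, where a complex pair of eigenvalues have zero real part, is similar to the above turning point and pitchfork tracking algorithms. It is however more complicated in that it involves complex numbers. The purely imaginary eigenvalues at the bifurcation point can be written $\gamma = \pm i\omega$ with complex eigenvectors $\mathbf{w} = \mathbf{y} + i\mathbf{z}$. The following set of equations,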
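presented by Griewank and Reddien [1983], specify the Hopf bifurcation,

$$\mathbf{R} = \mathbf{0}, \tag{39}$$

$$\mathbf{J}\mathbf{y} + \omega\mathbf{B}\mathbf{z} = \mathbf{0}, \tag{40}$$

$$\mathbf{J}\mathbf{z} - \omega\mathbf{B}\mathbf{y} = \mathbf{0}, \tag{41}$$

$$\boldsymbol{\phi}\cdot\mathbf{y} - 1 = 0, \tag{42}$$

$$\boldsymbol{\phi}\cdot\mathbf{z} = 0. \tag{43}$$

This system of $3n + 2$ equations and unknowns solves for $\mathbf{x}$, $\mathbf{y}$, $\mathbf{z}$, $\omega$, and $\lambda$. The first vector equation specifies that we are on the solution branch, the next two equations specify that we are at a place where there is a purely imaginary eigenvalue, and the last two scalar equations set the phase and amplitude of the eigenvectors (which are otherwise free). The same Hopf bifurcation can admit a second solution to this system of equations at $(\mathbf{x}, \mathbf{y}, -\mathbf{z}, -\omega, \lambda)$.

One Newton iteration for the fully coupled solution of this system is the linear system,

$$
\begin{bmatrix}
\mathbf{J} & \mathbf{0} & \mathbf{0} & \mathbf{0} & \dfrac{\partial\mathbf{R}}{\partial\lambda} \\
\dfrac{\partial\mathbf{J}\mathbf{y}}{\partial\mathbf{x}} + \omega\dfrac{\partial\mathbf{B}\mathbf{z}}{\partial\mathbf{x}} & \mathbf{J} & \omega\mathbf{B} & \mathbf{B}\mathbf{z} & \dfrac{\partial\mathbf{J}\mathbf{y}}{\partial\lambda} + \omega\dfrac{\partial\mathbf{B}\mathbf{z}}{\partial\lambda} \\
\dfrac{\partial\mathbf{J}\mathbf{z}}{\partial\mathbf{x}} - \omega\dfrac{\partial\mathbf{B}\mathbf{y}}{\partial\mathbf{x}} & -\omega\mathbf{B} & \mathbf{J} & -\mathbf{B}\mathbf{y} & \dfrac{\partial\mathbf{J}\mathbf{z}}{\partial\lambda} - \omega\dfrac{\partial\mathbf{B}\mathbf{y}}{\partial\lambda} \\
\mathbf{0}^T & \boldsymbol{\phi}^T & \mathbf{0}^T & 0 & 0 \\
\mathbf{0}^T & \mathbf{0}^T & \boldsymbol{\phi}^T & 0 & 0
\end{bmatrix}
\begin{bmatrix} \Delta\mathbf{x} \\ \Delta\mathbf{y} \\ \Delta\mathbf{z} \\ \Delta\omega \\ \Delta\lambda \end{bmatrix}
= -\begin{bmatrix} \mathbf{R} \\ \mathbf{J}\mathbf{y} + \omega\mathbf{B}\mathbf{z} \\ \mathbf{J}\mathbf{z} - \omega\mathbf{B}\mathbf{y} \\ \boldsymbol{\phi}\cdot\mathbf{y} - 1 \\ \boldsymbol{\phi}\cdot\mathbf{z} \end{bmatrix}. \tag{44}
$$

In this derivation we have allowed for $\partial\mathbf{B}/\partial\mathbf{x} \neq 0$ and $\partial\mathbf{B}/\partial\lambda \neq 0$. While in many situations these terms can be neglected, the matrix $\mathbf{B}$ can depend on the solution vector through dependence of the inertial coefficients (e.g. density and heat capacity) on the local state vector. The matrix $\mathbf{B}$ will depend on the parameter very strongly when $\lambda$ is a geometric parameter that moves the mesh locations.

Again we solve this linear system by a block elimination algorithm that breaks it into simpler linear solves. This system cannot be solved with solves of the matrix $\mathbf{J}$ alone; it also requires solves of the complex matrix $\mathbf{J} + i\omega\mathbf{B}$. The block elimination algorithm for the Newton iteration of the Hopf tracking algorithm, written in terms of real-valued variables, is

$$\mathbf{J}\mathbf{a} = -\mathbf{R}, \tag{45}$$

$$\mathbf{J}\mathbf{b} = -\frac{\partial\mathbf{R}}{\partial\lambda}, \tag{46}$$

$$\begin{bmatrix} \mathbf{J} & \omega\mathbf{B} \\ -\omega\mathbf{B} & \mathbf{J} \end{bmatrix}\begin{bmatrix} \mathbf{c} \\ \mathbf{d} \end{bmatrix} = \begin{bmatrix} \mathbf{B}\mathbf{z} \\ -\mathbf{B}\mathbf{y} \end{bmatrix}, \tag{47}$$

$$\begin{bmatrix} \mathbf{J} & \omega\mathbf{B} \\ -\omega\mathbf{B} & \mathbf{J} \end{bmatrix}\begin{bmatrix} \mathbf{e} \\ \mathbf{f} \end{bmatrix} = -\begin{bmatrix} \left(\dfrac{\partial\mathbf{J}\mathbf{y}}{\partial\mathbf{x}} + \omega\dfrac{\partial\mathbf{B}\mathbf{z}}{\partial\mathbf{x}}\right)\mathbf{a} \\ \left(\dfrac{\partial\mathbf{J}\mathbf{z}}{\partial\mathbf{x}} - \omega\dfrac{\partial\mathbf{B}\mathbf{y}}{\partial\mathbf{x}}\right)\mathbf{a} \end{bmatrix}, \tag{48}$$

$$\begin{bmatrix} \mathbf{J} & \omega\mathbf{B} \\ -\omega\mathbf{B} & \mathbf{J} \end{bmatrix}\begin{bmatrix} \mathbf{g} \\ \mathbf{h} \end{bmatrix} = -\begin{bmatrix} \left(\dfrac{\partial\mathbf{J}\mathbf{y}}{\partial\mathbf{x}} + \omega\dfrac{\partial\mathbf{B}\mathbf{z}}{\partial\mathbf{x}}\right)\mathbf{b} + \dfrac{\partial\mathbf{J}\mathbf{y}}{\partial\lambda} + \omega\dfrac{\partial\mathbf{B}\mathbf{z}}{\partial\lambda} \\ \left(\dfrac{\partial\mathbf{J}\mathbf{z}}{\partial\mathbf{x}} - \omega\dfrac{\partial\mathbf{B}\mathbf{y}}{\partial\mathbf{x}}\right)\mathbf{b} + \dfrac{\partial\mathbf{J}\mathbf{z}}{\partial\lambda} - \omega\dfrac{\partial\mathbf{B}\mathbf{y}}{\partial\lambda} \end{bmatrix}, \tag{49}$$

$$\Delta\lambda = \frac{(\boldsymbol{\phi}\cdot\mathbf{c})(\boldsymbol{\phi}\cdot\mathbf{f}) - (\boldsymbol{\phi}\cdot\mathbf{e})(\boldsymbol{\phi}\cdot\mathbf{d}) + (\boldsymbol{\phi}\cdot\mathbf{d})}{(\boldsymbol{\phi}\cdot\mathbf{d})(\boldsymbol{\phi}\cdot\mathbf{g}) - (\boldsymbol{\phi}\cdot\mathbf{c})(\boldsymbol{\phi}\cdot\mathbf{h})}, \tag{50}$$

$$\Delta\omega = \frac{(\boldsymbol{\phi}\cdot\mathbf{h})\,\Delta\lambda + (\boldsymbol{\phi}\cdot\mathbf{f})}{\boldsymbol{\phi}\cdot\mathbf{d}}, \tag{51}$$

$$\Delta\mathbf{x} = \mathbf{a} + \Delta\lambda\,\mathbf{b}, \tag{52}$$

$$\Delta\mathbf{y} = \mathbf{e} + \Delta\lambda\,\mathbf{g} - \Delta\omega\,\mathbf{c} - \mathbf{y}, \tag{53}$$

$$\Delta\mathbf{z} = \mathbf{f} + \Delta\lambda\,\mathbf{h} - \Delta\omega\,\mathbf{d} - \mathbf{z}. \tag{54}$$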
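The block elimination pattern shared by the algorithms of this section is easiest to see in its simplest instance, the turning point update of Eqs. (8)-(14) with the finite differences of Eqs. (16)-(21). The Python sketch below is a minimal illustration under stated assumptions, not the LOCA code: the two-equation `residual` and `jacobian` are hypothetical (their fold lies at $\mathbf{x} = (1, 1)$, $\lambda = e^{-1}$), a dense LU factorization stands in for the application's preconditioned iterative solver, $\boldsymbol{\phi}$ is simply a constant vector, and the initial null vector is $\mathbf{b}$ scaled so that $\boldsymbol{\phi}\cdot\mathbf{y} = 1$, which is an assumption about the scaling in Eq. (15).

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def residual(x, lam):
    # Hypothetical toy residual with a fold at x = (1, 1), lam = exp(-1).
    return np.array([lam - x[0] * np.exp(-x[0]), x[0] - x[1]])

def jacobian(x, lam):
    # J = dR/dx for the toy residual above.
    return np.array([[(x[0] - 1.0) * np.exp(-x[0]), 0.0], [1.0, -1.0]])

def ratio(x, v):
    # ||x|| / ||v||, guarded against a zero vector v.
    return np.linalg.norm(x) / max(np.linalg.norm(v), np.finfo(float).tiny)

def turning_point_step(x, y, lam, phi, delta=1e-6):
    """One Newton update of (x, y, lam) using only solves with J."""
    J = jacobian(x, lam)
    lu = lu_factor(J)                    # factor once, reuse for all four solves
    R, Jy = residual(x, lam), J @ y
    eps1 = delta * (abs(lam) + delta)                           # Eq. (19)
    dR_dlam = (residual(x, lam + eps1) - R) / eps1              # Eq. (16)
    a = lu_solve(lu, -R)                                        # Eq. (8)
    b = lu_solve(lu, -dR_dlam)                                  # Eq. (9)
    eps2 = delta * (ratio(x, a) + delta)                        # Eq. (20)
    eps3 = delta * (ratio(x, b) + delta)                        # Eq. (21)
    dJy_a = (jacobian(x + eps2 * a, lam) @ y - Jy) / eps2       # Eq. (17)
    dJy_b = (jacobian(x + eps3 * b, lam) @ y - Jy) / eps3       # Eq. (18), x part
    dJy_lam = (jacobian(x, lam + eps1) @ y - Jy) / eps1         # Eq. (18), lam part
    c = lu_solve(lu, -dJy_a)                                    # Eq. (10)
    d = lu_solve(lu, -(dJy_b + dJy_lam))                        # Eq. (11)
    dlam = (1.0 - phi @ c) / (phi @ d)                          # Eq. (12)
    # Eqs. (13)-(14): x + dx, y + dy (the -y in dy cancels), lam + dlam.
    return x + a + dlam * b, c + dlam * d, lam + dlam

x, lam = np.array([0.8, 0.8]), 0.36      # guess near the fold (illustrative)
phi = np.full(2, 0.5)                    # constant scaling vector (assumed)
eps1 = 1e-6 * (abs(lam) + 1e-6)
b0 = lu_solve(lu_factor(jacobian(x, lam)),
              -(residual(x, lam + eps1) - residual(x, lam)) / eps1)
y = b0 / (phi @ b0)                      # scaled b as initial null vector

for iteration in range(20):
    err = max(np.linalg.norm(residual(x, lam)),
              np.linalg.norm(jacobian(x, lam) @ y),
              abs(phi @ y - 1.0))
    if err < 1e-6:                       # stop before J becomes too singular
        break
    x, y, lam = turning_point_step(x, y, lam, phi)
else:
    raise RuntimeError("turning point iteration did not converge")

print("fold at lambda =", lam, "x =", x, "after", iteration, "Newton steps")
```

The convergence tolerance is deliberately loose. Each solve uses the matrix that the iteration is driving singular, so the temporary vectors grow as the fold is approached and the finite-difference perturbations lose accuracy; this is the numerical price of the block elimination approach discussed in Sec. 1.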
This algorithm has eight temporary vectors $\mathbf{a}$ through $\mathbf{h}$, which are solved with two solves of the $\mathbf{J}$ matrix and three solves of the $2n \times 2n$ matrix $\begin{bmatrix} \mathbf{J} & \omega\mathbf{B} \\ -\omega\mathbf{B} & \mathbf{J} \end{bmatrix}$. This algorithm differs from the turning point and pitchfork tracking algorithms, which only require solution of the steady state Jacobian $\mathbf{J}$, a capability already possessed by codes using Newton's method. Since the location of the nonzeros in the sparse matrix $\mathbf{B}$ is typically a subset of those for the matrix $\mathbf{J}$, a parallel iterative solver for the $2n \times 2n$ matrix can use the same local communication maps as used for solves of $\mathbf{J}$. An algorithm for solving complex matrix equations with a real-valued sparse iterative solver has been published [Day & Heroux, 2001] and implemented in the Komplex extension to the Aztec library of preconditioned iterative Krylov solvers. This algorithm also requires the formulation of the $\mathbf{B}$ matrix, for which a code performing linear stability analysis of Eq. (3) will already have a routine.

To initialize the routine, we assume that an initial Hopf bifurcation has been detected with an eigensolver, by having the real part of a complex pair of eigenvalues pass through zero with successive steps in the parameter. This gives good starting values for all the unknowns in the Hopf tracking algorithm. Also, the constant vector $\boldsymbol{\phi}$ is chosen as before.

2.4. Interface requirements

By using block elimination algorithms to solve the Newton iterations of the augmented systems describing the bifurcations, and using finite differencing to compute all the derivatives except for the Jacobian matrix, the requirements on the application code to access these algorithms are rather small. The requirements are summarized in Table 1.

Table 1. This table summarizes the functionality that an application code must supply in order to access each of the stability analysis algorithms. The requirement for the Hopf tracking is in addition to those listed above for the other methods.

Method                    Requirements                                   Description
Parameter continuation    $\mathbf{R}$                                   Residual calculation
Turning point tracking    $\mathbf{J}\mathbf{v}$                         Jacobian-vector multiply
Pitchfork tracking        $\mathbf{J}^{-1}\mathbf{v}$                    Solve with Jacobian
                          set $\lambda$                                  Set parameters
Eigensolve                $\mathbf{B}\mathbf{v}$                         Mass matrix-vector multiply
                          $(\mathbf{J} - \sigma\mathbf{B})^{-1}\mathbf{v}$    Solve with shifted Jacobian
Hopf tracking             $(\mathbf{J} + i\omega\mathbf{B})^{-1}\mathbf{v}$   Solve with complex matrix

3. Bifurcation Analysis of Rayleigh-Bénard Convection in a 5 x 5 x 1 Box

As a demonstration of the algorithms presented above, we choose to study the secondary bifurcations in the Rayleigh-Bénard problem, which consists of a fluid that is heated from below and cooled from above, so that the thermal expansion of the fluid in the presence of gravity produces a destabilizing density gradient. In particular, we compute the convection rolls arising from the first symmetry breaking bifurcation and then analyze the loss of stability of these rolls in two-parameter space. This system is controlled by two dimensionless groups (which will be defined in the following section): the Rayleigh number $Ra$, which is a measure of the destabilizing buoyancy effect compared to the stabilizing diffusive effects, and the Prandtl number $Pr$, which is a property of the fluid comparing the relative diffusive strengths of momentum and heat. We have chosen our parameters and boundary conditions so that we get all three bifurcations generic to a one-parameter system with symmetry.

Of particular interest in the past is the onset of oscillatory instabilities. The stability of the rolls has been considered both experimentally [Willis & Deardorff, 1970] and numerically [Busse &
Bifurcation Tracking Algorithms and Software for Large Scale Applications 327

Clever, 1979; Clever & Busse, 1995; Tangborn et al., 1995; Nakamura, 1997; Sone et al., 1997; Cox & Matthews, 2000]. In their paper, Busse and Clever [1979] numerically analyzed the stability of the convection rolls in the absence of any side walls. Their equilibrium solution is a two-dimensional solution periodic in the direction perpendicular to the axis of the rolls. They analyzed the stability of this solution by Fourier transforming the disturbances and looking for the most unstable wavelength. Their results show that as the Prandtl number goes to zero, the rolls have an oscillatory instability at a Rayleigh number close to the critical Rayleigh number of the first bifurcation. Although their results are in qualitative agreement with experiments, quantitatively they predict that this bifurcation occurs closer to the original bifurcation than the experiments do. They argue that this is most likely a result of ignoring the side walls.

In a technical report [Burroughs et al., 2001], we computed the steady convective rolls in a 5 x 5 x 1 box, and analyzed their stability using the eigensolver. For a fluid with Pr = 0.01, the onset of an oscillatory instability was estimated to be at $\mathrm{Ra}_{cr} = 1910$ on the finest finite element discretization of 16 million unknowns. The appeal of bifurcation tracking for this problem is to calculate the whole curve $\mathrm{Ra}_{cr}(\mathrm{Pr})$ and further determine under what conditions the convective rolls undergo an oscillatory instability.

3.1. Model and methods

The governing partial differential equations for the Rayleigh-Benard problem are the incompressible Navier-Stokes equations for momentum transport, the continuity equation for mass conservation, and the heat equation. The Boussinesq approximation is used, which allows for a linear dependence of density on temperature in the body force term, yet assumes constant density in all other terms. In nondimensional form, the equations are:

\[ \frac{\partial \mathbf{u}}{\partial t} + \mathbf{u} \cdot \nabla \mathbf{u} + \nabla P = \nabla^2 \mathbf{u} + \mathrm{Ra}\,\mathrm{Pr}^{-1}\, T\, \mathbf{e}_z, \tag{55} \]

\[ \nabla \cdot \mathbf{u} = 0, \tag{56} \]

\[ \frac{\partial T}{\partial t} + \mathbf{u} \cdot \nabla T = \mathrm{Pr}^{-1} \nabla^2 T. \tag{57} \]

Here $\mathbf{u} = u\mathbf{e}_x + v\mathbf{e}_y + w\mathbf{e}_z$, $P$ and $T$ are the unknown velocity, pressure and temperature fields. The vector $\mathbf{e}_z$ is the unit vector in the direction of the gravitational acceleration. The two dimensionless groups controlling the system are the Rayleigh number

\[ \mathrm{Ra} = \frac{\rho^2 C_p\, g\, \beta\, \Delta T\, L^3}{\kappa \mu} \]

and the Prandtl number

\[ \mathrm{Pr} = \frac{\mu C_p}{\kappa}. \]

Here $\rho$ is the density at reference temperature $T = 0$, $C_p$ is the heat capacity of the fluid, $g$ is the gravitational acceleration, $\beta$ is the coefficient of thermal expansion, $\Delta T$ is the temperature difference over the box, $L$ is the height of the box, $\kappa$ is the thermal conductivity, and $\mu$ is the viscosity. In this formulation, distances are made dimensionless with respect to $L$, the velocities with respect to $\mu/(\rho L)$, the pressure with respect to $\mu^2/(\rho L^2)$, the temperature with respect to $\Delta T$, and time with respect to $\rho L^2/\mu$.

Even though we only solve for steady solutions in this paper, the time-dependent versions of the equations are written since these terms come into play in the linear stability analysis and the Hopf tracking algorithm. Note that for incompressible flow there are no time derivatives of the pressure field, so the mass matrix B is 20% rank deficient.

In the following sections we analyze the flow stability for two closely related systems that differ in the boundary conditions on the side walls. The first is the closed box, with no-slip boundary conditions on the side walls for the entire velocity vector,

closed box:

\[ \mathbf{u}(0, y, z) = \mathbf{u}(5, y, z) = \mathbf{u}(x, 0, z) = \mathbf{u}(x, 5, z) = 0. \]

The second is the symmetric box with symmetry boundary conditions on the side walls, including no normal flow and no shear stress,

symmetric box:

\[ u(0, y, z) = u(5, y, z) = v(x, 0, z) = v(x, 5, z) = 0, \]

\[ \frac{\partial u}{\partial y}(x, 0, z) = \frac{\partial u}{\partial y}(x, 5, z) = \frac{\partial w}{\partial y}(x, 0, z) = \frac{\partial w}{\partial y}(x, 5, z) = 0, \]

\[ \frac{\partial v}{\partial x}(0, y, z) = \frac{\partial v}{\partial x}(5, y, z) = \frac{\partial w}{\partial x}(0, y, z) = \frac{\partial w}{\partial x}(5, y, z) = 0. \]
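As a small numerical illustration of the two dimensionless groups of this section, the sketch below evaluates Ra and Pr from dimensional fluid properties. The function names and the property values (roughly water near room temperature, with an arbitrary 1 cm layer and a 1 K temperature difference) are assumptions chosen only for illustration and are not data from this paper.

```python
def rayleigh(rho, cp, g, beta, delta_t, length, kappa, mu):
    """Ra = rho^2 * Cp * g * beta * dT * L^3 / (kappa * mu)."""
    return rho**2 * cp * g * beta * delta_t * length**3 / (kappa * mu)


def prandtl(mu, cp, kappa):
    """Pr = mu * Cp / kappa."""
    return mu * cp / kappa


# Assumed SI property values, roughly water near room temperature.
rho, cp, g, beta = 998.0, 4182.0, 9.81, 2.1e-4
kappa, mu = 0.6, 1.0e-3
length, delta_t = 0.01, 1.0  # assumed 1 cm layer and 1 K temperature difference

ra = rayleigh(rho, cp, g, beta, delta_t, length, kappa, mu)
pr = prandtl(mu, cp, kappa)
print(f"Ra = {ra:.4g}, Pr = {pr:.3g}")
```

Because Ra scales with the cube of the layer height, doubling the assumed height raises Ra by a factor of eight while leaving Pr unchanged, which is why the two groups can be varied independently in the parameter studies.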

The rest of the boundary conditions are the same for both systems, including adiabatic conditions on the side walls,

\[ \frac{\partial T}{\partial x}(0, y, z) = \frac{\partial T}{\partial x}(5, y, z) = \frac{\partial T}{\partial y}(x, 0, z) = \frac{\partial T}{\partial y}(x, 5, z) = 0, \]

with no-slip boundary conditions and a hot temperature on the bottom surface,

\[ \mathbf{u}(x, y, 0) = 0, \quad T(x, y, 0) = 0.5, \]

and no-slip boundary conditions and a cold temperature on the top surface,

\[ \mathbf{u}(x, y, 1) = 0, \quad T(x, y, 1) = -0.5. \]

The temperature boundary conditions are chosen so that the solution is symmetric about zero, as required by the pitchfork algorithm.

The above system of five coupled PDEs and boundary conditions is solved for the unknowns u, v, w, P, T with the MPSalsa code. MPSalsa uses a Galerkin/least-squares finite element method [Hughes et al., 1989a, 1989b; Shadid, 1999] to discretize these equations over the spatial domain. This stabilization procedure allows for the use of equal order linear FE basis functions for all variables while avoiding spurious pressure oscillations for incompressible flows. For these calculations we did not need to include the streamline upwind Petrov-Galerkin (SUPG) type methodology for controlling oscillations due to convective effects. An additional important aspect of this stabilization procedure is that a fully-implicit (for time-dependent systems) and a direct-to-steady-state solution procedure using Newton-Krylov methods can be implemented [Shadid, 1999]. The Aztec package of preconditioned iterative Krylov methods is used to solve the linear systems [Hutchinson et al., 1995]. In this work, we have used the ILUT domain decomposition preconditioner, where each processor owns one domain. We chose 1 level of overlap between domains and a fill factor of 1.5, which allows the preconditioner to have 1.5 times as many nonzeros as the Jacobian itself. The GMRES linear solver was used without restarts, and orthogonality is maintained with the modified Gram-Schmidt algorithm. A typical linear solve for the solution of a steady state used a relative tolerance of $10^{-3}$ and built a Krylov space of size 220, while a solve for a bifurcation tracking run used a relative tolerance of $10^{-8}$ and built a Krylov space of size 400.

The MPSalsa code is designed for general unstructured meshes in 2D and 3D, and runs on massively parallel computers. The majority of the results in this paper were calculated for a 100 x 100 x 30 mesh of eight-node trilinear hexahedral elements, which corresponds to 316231 nodes and over 1.58 million equations and unknowns.

Fig. 1. A visualization of the partition of the 316231 node mesh for 48 processors is shown. The colored patches are elements
(in the finite element discretization) whose nodes are all owned by a given processor, while the red strips are elements whose
nodes are owned by multiple processors. Inter-processor communication is needed only across these elements for performing
the finite element method.
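The problem size quoted for the main mesh can be checked with a few lines of arithmetic. The sketch below assumes a structured box mesh in which every node, boundary nodes included, carries all five unknowns (u, v, w, P, T), as equal-order interpolation suggests; the helper name `node_count` is hypothetical.

```python
def node_count(nx, ny, nz):
    """Nodes of a structured nx x ny x nz mesh of eight-node hexahedra."""
    return (nx + 1) * (ny + 1) * (nz + 1)


UNKNOWNS_PER_NODE = 5  # u, v, w, P, T

nodes = node_count(100, 100, 30)
unknowns = UNKNOWNS_PER_NODE * nodes

# The pressure equation has no time derivative, so one unknown in five
# contributes nothing to the mass matrix B.
mass_matrix_rank_deficiency = 1 / UNKNOWNS_PER_NODE

print(nodes, unknowns, mass_matrix_rank_deficiency)
```

Under these assumptions the counts reproduce the 316231 nodes and roughly 1.58 million unknowns stated in the text, and the one-in-five ratio matches the 20% rank deficiency of the mass matrix.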

Fig. 2. Visualization of the stable convective flow state for the symmetric box at Ra = 3328.1 and Pr = 1.0. The black circles
are streamlines, and the color contours are heat flux through the bottom surface.

The mesh is produced using the CUBIT software [Shepherd, 2000] and decomposed for parallel solution using the Chaco graph partitioning package [Hendrickson & Leland, 1995a, 1995b]. The partitioner assigns each node to a processor so as to distribute the work load evenly while minimizing interprocessor communication. The decomposition of the mesh for 48 processors is visualized in Fig. 1. Finite elements with all eight corner nodes owned by the same processor are given a color unique to that processor. Elements broken over multiple processors are colored red (and form jagged lines) and are representative of the amount of information that needs to be communicated to perform a matrix fill or matrix-vector multiply.

A steady-state solution of the convective roll cells in the symmetric box is shown in Fig. 2. Note that the solution is two-dimensional, with the streamlines showing five tubes of circulating flow. The color contours show heat flux through the bottom of the box, where the high red values correspond to regions of downward flow and the low blue values to regions of upward flow.

3.2. Results for closed box

Our first model system is the closed box, with no-slip boundary conditions on all walls. A bifurcation diagram with respect to the Rayleigh number Ra for fixed Pr = 1.0 is shown in Fig. 3. The no-flow, conduction solution was calculated to bifurcate to the convective rolls solution at Ra = 1774.0. This calculation involved the computation of the eigenvector near this singularity, followed by a solve with the pitchfork tracking algorithm using the eigenvector both as the antisymmetric vector $\psi$ and as the initial guess for the null vector.

Fig. 3. Plot of steady solution branches in the closed box as a function of Rayleigh number for Pr = 1.0, showing a pitchfork bifurcation from the stationary solution to a branch of convective rolls. A turning point (fold) is seen on the second branch, which was subsequently calculated to occur at Ra = 4915.2. [Plot: horizontal axis Rayleigh Number, 0 to 5000; curve labeled Pr = 1.0.]

This convective flow branch has a predominantly two-dimensional profile, with roll cells resembling those in Fig. 2, yet with 3D effects due to the no-slip walls. Since we were not able to adequately visualize the flow field, we instead show the heat flux through the bottom surface in Fig. 4(a). The

Fig. 4. The heat flux through the bottom of the closed box is shown for (a) the solution and (b) the null vector, at the
turning point at Ra = 4915.2 and Pr = 1.0. The red and blue regions correspond to downward and upward flow.

red and blue regions correspond to downward and upward flow. The effect of the no-slip wall can be clearly seen by comparison with the results for the symmetric box, which were shown in Fig. 2 and are reproduced in the same format in Fig. 7(a).

The solution branch was tracked with pseudo-arclength continuation until it encountered a turning point near Ra = 4900. (The unstable branch, shown as the dotted line, was followed back through other bifurcations to another turning point near Ra = 3400, where linear stability analysis revealed nine eigenvalues with positive real parts.) The turning point algorithm of Sec. 2.1 was then used to converge to the bifurcation at Pr = 1.0 and Ra = 4915.2. The null vector is visualized in Fig. 4(b) and seen to have significant variation in the y-direction, indicating an end to the flow branch consisting of the tubular roll cells. (The convergence details of this calculation are the subject of the numerical experiments in Sec. 3.4.)

Fig. 5. The results of tracking the turning point bifurcation seen in Fig. 3 (at Pr = 1.0 and Ra = 4915.2) as a function of Prandtl number are shown. This bifurcation represents the limit of the nearly two-dimensional roll cells in a closed box. The branch of turning point bifurcations ends in what is presumably a cusp near Pr = 0.4075, and the stability behavior in the region of the ? symbol remains uninvestigated. The pitchfork bifurcation signalling the onset of flow is drawn in as well. [Plot: horizontal axis Prandtl Number, 0.4 to 1.0; vertical axis ticks from 1000 to 5000; regions labeled No Flow, Stable Rolls, and No Stable Rolls.]

The turning point was then tracked with decreasing Pr. The results of this tracking are shown in Fig. 5. The pitchfork bifurcation from the trivial branch, known to be independent of Pr, is drawn in as well. The symbols at Pr = 1 correspond to the similarly marked solutions in Fig. 3. Regions of no-flow, stable convective roll cells of predominantly 2D flow, and the region where the cells are no longer a stable solution are delineated by these curves of bifurcations. The calculation of the curve of turning points involved 12 consecutive turning point calculations, each requiring about 45 min on a cluster of 48 processors (3.0 GHz). The last solution on this branch is calculated at Pr = 0.4075 and Ra = 2831.7, below which the branch ends,
presumably in a cusp. Further investigation of the stability behavior at lower Pr is beyond the scope of this study.
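The direct calculation of a turning point used in this section can be illustrated on a toy problem. The sketch below assumes the standard Moore-Spence augmented system, F(x, λ) = 0, J y = 0, φ·y = 1, which may differ in detail from the formulation of Sec. 2.1. The two-equation model, the fixed vector φ = (1, 0), the finite-difference Jacobian, and the dense elimination are all illustrative stand-ins and are not the block elimination and iterative solves used in LOCA.

```python
def residual(z):
    """Augmented turning-point residual for z = (x0, x1, y0, y1, lam)."""
    x0, x1, y0, y1, lam = z
    # Toy steady-state problem F(x, lam) = 0 with a fold at lam = 2, x = (1, 3).
    f = [(lam - 2.0) + (x0 - 1.0) ** 2, x1 - 3.0 * x0]
    # Jacobian J = dF/dx applied to the null-vector estimate y.
    jy = [2.0 * (x0 - 1.0) * y0, -3.0 * y0 + y1]
    # Normalization phi . y = 1 with the fixed vector phi = (1, 0).
    return f + jy + [y0 - 1.0]


def solve(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x


def fd_jacobian(func, z, h=1e-7):
    """Forward-difference approximation of the Jacobian of func at z."""
    r0 = func(z)
    n = len(z)
    jac = [[0.0] * n for _ in range(n)]
    for j in range(n):
        zp = z[:]
        zp[j] += h
        rj = func(zp)
        for i in range(n):
            jac[i][j] = (rj[i] - r0[i]) / h
    return jac


def newton(func, z, tol=1e-10, max_iter=20):
    """Newton iteration on the augmented system; returns (solution, iterations)."""
    for it in range(max_iter):
        r = func(z)
        if max(abs(v) for v in r) < tol:
            return z, it
        dz = solve(fd_jacobian(func, z), [-v for v in r])
        z = [zi + di for zi, di in zip(z, dz)]
    raise RuntimeError("Newton did not converge")


# Start from a point on the branch below the fold, with a rough null vector.
z, iters = newton(residual, [0.7, 2.0, 1.0, 0.0, 1.8])
x0, x1, y0, y1, lam = z
print(f"fold at lam = {lam:.8f}, x = ({x0:.8f}, {x1:.8f}), iterations = {iters}")
```

The augmented Jacobian stays nonsingular at a simple fold even though J itself becomes singular there, which is what makes a direct Newton solve for the turning point possible.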

3.3. Results for symmetric box

A similar set of calculations was performed on the symmetric box, where symmetry boundary conditions were placed on all side walls. The linear stability of this system has been previously probed, including the detection of a Hopf bifurcation from the convective rolls solution for Pr = 0.01 in the range Ra = 1900 to 1950 [Burroughs et al., 2001].

A parameter continuation study in Ra was performed on this system at Pr = 1.0, and is presented in Fig. 6. Again, a pitchfork bifurcation from the trivial no-flow, conduction solution is found. The critical Rayleigh number is found to be Ra = 1703.7, about 4% lower than for the closed box, where the side walls stabilize the no-flow solution. The two-dimensional convective rolls solution is continued until the linear stability analysis detects a secondary pitchfork bifurcation near Ra = 3300. This singularity represents the end of the stable two-dimensional solution. The solution at the pitchfork bifurcation was visualized in Fig. 2. The heat flux through the bottom of the box is again shown for the solution and null vector in Figs. 7(a) and 7(b). The null vector shows that the convective roll solution destabilizes to 3D disturbances.
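Detecting a bifurcation through linear stability analysis amounts to watching for the parameter value at which the leading eigenvalue crosses the imaginary axis. The sketch below shows that idea on two hypothetical 2 x 2 Jacobians: a real eigenvalue crossing zero signals a steady bifurcation (pitchfork or turning point), while a complex pair crossing signals a Hopf bifurcation. The bisection loop and all names are illustrative assumptions; the calculations in this paper use a large-scale eigensolver on the discretized equations.

```python
import cmath


def eig2(a):
    """Eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    root = cmath.sqrt(tr * tr - 4.0 * det)
    return (tr + root) / 2.0, (tr - root) / 2.0


def leading(a):
    """Eigenvalue with the largest real part."""
    return max(eig2(a), key=lambda ev: ev.real)


def locate_crossing(jac, lo, hi, tol=1e-10):
    """Bisect on the parameter until the leading real part changes sign."""
    f_lo = leading(jac(lo)).real
    assert f_lo * leading(jac(hi)).real < 0, "no sign change in bracket"
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if leading(jac(mid)).real * f_lo > 0:
            lo = mid
        else:
            hi = mid
    p = 0.5 * (lo + hi)
    omega = abs(leading(jac(p)).imag)
    kind = "Hopf" if omega > 1e-6 else "steady"
    return p, omega, kind


def steady_jac(p):
    """Hypothetical Jacobian with a real eigenvalue crossing zero at p = 2."""
    return [[p - 2.0, 0.0], [0.0, -1.0]]


def hopf_jac(p):
    """Hypothetical Jacobian with a complex pair crossing at p = 1, frequency 3."""
    return [[p - 1.0, -3.0], [3.0, p - 1.0]]


for name, jac in (("steady", steady_jac), ("hopf", hopf_jac)):
    p, omega, kind = locate_crossing(jac, 0.0, 4.0)
    print(f"{name}: crossing near p = {p:.6f}, frequency = {omega:.3f}, type = {kind}")
```

An eigenvector computed at such a crossing is the natural initial guess for the null vector in the tracking algorithms, which is how the runs described in this section are started.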

Fig. 6. Plot of steady solution branches in a symmetric box as a function of Rayleigh number for Pr = 1.0, showing a pitchfork bifurcation from the stationary solution to a branch of convective rolls. A second pitchfork bifurcation is seen on the asymmetric branch at $\mathrm{Ra}_{cr} = 3338.1$. [Plot: horizontal axis Rayleigh Number, 1000 to 4000.]

Fig. 7. The heat flux through the bottom of the symmetric box is shown for (a) the solution and (b) the null vector, at the pitchfork bifurcation at Ra = 3338.1 and Pr = 1.0. The red and blue regions correspond to downward and upward flow.

The pitchfork tracking algorithm in Sec. 2.2 was launched using an eigenvector calculated with the linear stability analysis capability as the $\psi$ vector. This same vector was used as the initial guess for the null vector y. This algorithm located the pitchfork at Pr = 1.0 to be at Ra = 3338.1 for the mesh of 1.58 million unknowns. About 30 solutions were calculated along the branch, down to Pr = 0.037, as shown in Fig. 8. The symbols at Pr = 1 correspond to the similarly marked solutions in Fig. 6. Each solution required 40 min on average on a cluster of 48 processors (3.0 GHz). The pitchfork bifurcation corresponding to the initial bifurcation
from the trivial solution, known to be independent of Prandtl number, is drawn in as well. The different flow regions are delineated by curves of bifurcation points.

Fig. 8. The results of tracking the pitchfork bifurcation seen in Fig. 6 in a symmetric box as a function of Prandtl number are shown. Furthermore, a Hopf bifurcation branch is computed, since this is the bifurcation signaling the loss of stability of the convective rolls solution at low Pr. [Plot: horizontal axis Prandtl Number, 0.0 to 1.0; vertical axis ticks from 1500 to 3500; regions labeled No Flow, Stable Rolls, and No Stable Rolls.]

Linear stability calculations at Pr = 0.04 revealed a complex conjugate pair of eigenvalues with positive real part. This indicates that a Hopf bifurcation had overtaken the pitchfork as the first destabilizing mode, confirming the results of previous works that oscillatory instabilities destabilize the convective rolls at low Pr. Since we have not developed algorithms for directly locating higher co-dimension bifurcations, the coincidence of the Hopf and pitchfork bifurcations was found, by repeated stability analysis calculations along the curve of pitchfork bifurcations, to occur near Pr = 0.0434 and Ra = 2106. The real and imaginary parts of the eigenvector corresponding to the Hopf bifurcation are visualized in Fig. 9. These solutions show five cells developing in the y-direction, where previously there was no variation.

Starting from this point, and using the real and imaginary eigenvectors as initial guesses, the Hopf tracking algorithm was launched. Since the solutions of the complex matrix (or rank 2n real-valued matrix) require considerable extra memory and time, the solution and eigenvector were first interpolated to a coarser mesh. Since the solution with these boundary conditions has no variation in the y-direction (although the eigenvectors do), the mesh was coarsened only in this dimension. By reducing from 100 to 32 elements in this dimension, a mesh corresponding to 516 K unknowns was produced.

The results of the Hopf tracking runs are shown (along with the pitchfork curves from Fig. 8) in Fig. 10, with the Prandtl number axis switched to a log scale. The Rayleigh number of the Hopf bifurcation appears to be approaching a low Prandtl

Fig. 9. The heat flux through the bottom of the symmetric box is shown for the real and imaginary parts of the null vector for the Hopf bifurcation, in the neighborhood of the higher codimension bifurcation at Pr = 0.0434 and Ra = 2106. The red and blue regions correspond to downward and upward flow.

number limit around Ra = 1900. The inset figure showing the frequency of the Hopf bifurcation shows that this quantity (with our choice of the nondimensionalization of time) is also becoming insensitive to the Prandtl number. The calculation of 13 points along the Hopf curve required on average 50 min per solution on a cluster of 48 processors (3.0 GHz).

Fig. 10. The results of tracking the pitchfork bifurcation seen in Fig. 6 in a symmetric box as a function of Prandtl number are shown. Furthermore, a Hopf bifurcation branch is computed, since this is the bifurcation signaling the loss of stability of the convective rolls solution at low Pr. [Plot: horizontal axis Prandtl Number, log scale from $10^{-3}$ to $10^{0}$; vertical axis ticks from 1500 to 3500; regions labeled No Flow and Stable Rolls.]

3.4. Effect of linear solver tolerance

In this section we present a simple numerical experiment designed to inform both on the accuracy and robustness of the bifurcation tracking algorithms described in this paper. One would expect that the algorithms, which use iterative linear solves of the matrix being driven singular, would continue to converge towards the singularity until the condition number of the matrix being inverted multiplied by the error in the linear solve was order one. Therefore a tighter tolerance on the iterative linear solves would give a more accurate solution.

We performed a set of four computations using the turning point tracking algorithm to converge to the turning point at Ra = 4915.2 (and Pr = 1.0), starting from a converged steady state solution at Ra = 4875. Each computation was forced to run for 12 Newton iterations, by setting an unreachable convergence tolerance for the nonlinear system. This was repeated for four different tolerances for the reduction in the residual for the iterative linear solves: $10^{-4}$, $10^{-6}$, $10^{-8}$, $10^{-10}$. Figure 11 shows a plot of the norm of the residual of the turning point equations (4) (which are dominated by $\|Jy\|$) as a function of Newton iteration for each of these four tolerances. (We should note that several of the linear solves did not reach their requested tolerances.)

Fig. 11. Convergence history of the turning point algorithm is plotted as a function of Newton iteration for four different linear solver tolerances. [Plot: horizontal axis Newton Iteration, up to 12.]

The results can be interpreted both in terms of robustness and accuracy. When looking at robustness, the numerical instability of the algorithms becomes apparent. After reaching a level of convergence to the singularity, the inexact solves of the nearly singular matrix can lead to bad Newton steps. This is particularly noticeable in the $10^{-4}$ run. This results in unacceptably large increases in the residual, and we have seen occurrences where the code has not recovered from these lapses. (This behavior can be mitigated by damping or other globalizations of Newton's method [Pawlowski et al., 2004].) In practice, we set a tight tolerance on the iterative linear solves (e.g. $10^{-8}$),
and a moderate tolerance on the nonlinear system, and the robustness issues do not usually come into play. This does limit the scalability of the algorithms since this requirement puts a larger burden on the preconditioners and linear solver algorithms than the steady state solve.

With regard to accuracy, it can be seen that a tighter linear solver tolerance leads to a more accurate solution of the bifurcation. All runs succeeded in dropping the residual seven orders of magnitude after six Newton iterations, and the $10^{-10}$ curve reaches a very low residual near $10^{-14}$. All predicted the same value of the bifurcation parameter to five digits, Ra = 4915.2, and only the $10^{-4}$ run moved away from this value with a bad step. To put this in perspective, the same turning point calculated on a mesh of 208 K unknowns, corresponding to half as many elements in each direction, was located at Ra = 5167.8, a full 5% difference, implying that the solution on our current mesh still has a 1-2% discretization error. Furthermore, in many applications the modeling error, such as knowing the true value for the viscosity, even swamps the discretization error, further decreasing the importance of locating the bifurcation point to high accuracy.

One final point regarding the results in Fig. 11 comes from the observation that the four curves overlap for the first five Newton iterations. This suggests that an inexact Newton algorithm, where the linear solver convergence tolerance starts out very loose and is dynamically tightened at later Newton iterations, would be appropriate for these calculations. The savings would be significant, since the typical linear solver time to reach a $10^{-4}$ tolerance was half that needed to reach the tightest $10^{-10}$ tolerance.

4. Summary and Conclusions

In this paper we present a set of bifurcation tracking algorithms used in the LOCA software library and aimed at large scale applications, such as those coming from discretizations of PDEs in multiple dimensions. The augmented systems defining the bifurcations are solved with a Newton method. The linear solves within the Newton methods are solved with block elimination, resulting in a numerically unstable procedure involving the linear solve of the same Jacobian matrix being driven singular. This choice, however, leads to a simple interface to existing Newton-based application codes and frees the bifurcation library from needing any information about the storage or solution of the linear systems.

To demonstrate the scalability of the algorithms, a bifurcation analysis of a three-dimensional natural convection flow application was undertaken. The limit of stability of convective roll cells in the Rayleigh-Benard problem was investigated as a function of the Rayleigh number and Prandtl number. Turning point, pitchfork, and Hopf bifurcations indicating the limit of stability of the convective roll solutions were successfully tracked, with no failures in the continuation process. The first two were tracked on a mesh corresponding to 1.58 million unknowns, and the third on a mesh of 0.51 million unknowns.

The accuracy and robustness of the algorithms were shown in a numerical experiment to be a strong function of the tolerance of the iterative linear solver used to invert the Jacobian matrix. While the accuracy of this approach was found to be more than adequate for this problem (finding the parameter value of the bifurcation to several digits), the current algorithms lack robustness when trying to solve the bifurcation problem to high accuracy. Work is underway to improve the robustness by looking at reformulations of the linear solves, to implement algorithms based on minimally augmented systems [Govaerts, 2000], and to look at more invasive approaches.

Acknowledgments

The authors would like to thank those that contributed code, advice and support for this work, including John Shadid, Rich Lehoucq, David Day, Ray Tuminaro, Ed Wilkes, David Womble and Sudip Dosanjh. Funding for this work came from the US DOE MICS and ASCI programs. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy under Contract DE-AC04-94AL85000.

References

Burroughs, E. A., Romero, L. A., Lehoucq, R. B. & Salinger, A. G. [2001] "Large scale eigenvalue calculations for computing the stability of buoyancy driven flows," Technical Report SAND2001-0113, Sandia National Laboratories, Albuquerque, NM.
Burroughs, E. A., Romero, L. A., Lehoucq, R. B. & Salinger, A. G. [2004] "Linear stability of flow in a
differentially heated cavity via large-scale eigenvalue calculations," Int. J. Numer. Meth. Heat Fluid Flow 14, 803-822.
Busse, F. H. & Clever, R. M. [1979] "Instabilities of convection rolls of moderate Prandtl number," J. Fluid Mech. 91, 319-335.
Clever, R. M. & Busse, F. H. [1995] "Convection rolls and their instabilities in the presence of a nearly insulating upper boundary," Phys. Fluids 7, 92-97.
Cliffe, K., Spence, A. & Tavener, S. [2000a] The Numerical Analysis of Bifurcation with Application to Fluid Mechanics, Acta Numerica (Cambridge University Press), pp. 39-131.
Cliffe, K., Spence, A. & Tavener, S. [2000b] "O(2)-symmetry breaking bifurcation: With application to the flow past a sphere in a pipe," Int. J. Numer. Meth. Fluids 32, 175-200.
Cox, S. M. & Matthews, P. C. [2000] "Instability of rotating convection," J. Fluid Mech. 403, 153-172.
Day, D. & Heroux, M. [2001] "Solving complex-valued linear systems via equivalent real formulations," SIAM J. Sci. Comp. 23, 480-498.
Dhooge, A., Govaerts, W. & Kuznetsov, Y. [2003] "MATCONT: A MATLAB package for numerical bifurcation analysis of ODEs," ACM Trans. Math. Softw. 29, 141-164.
Doedel, E. J., Champneys, A. R., Fairgrieve, T. F., Kuznetsov, Y. A., Sandstede, B. & Wang, X. J. [1997] "AUTO: Continuation and bifurcation software with ordinary differential equations (with homcont), user's guide," Technical Report, Concordia University, Montreal, Canada.
Engelborghs, K., Luzyanina, T. & Roose, D. [2002] "Numerical bifurcation analysis of delay differential equations using DDE-BIFTOOL," ACM Trans. Math. Softw. 28, 1-21.
Frink, L. & Salinger, A. [2003] "Rapid analysis of phase behavior with density functional theory, part II: Capillary condensation in disordered porous media," J. Chem. Phys. 118, 7466-7476.
Frischknecht, A., Weinhold, J., Salinger, A., Curro, J., Frink, L. & McCoy, J. [2002] "Density functional theory of inhomogeneous polymer systems: I. Numerical methods," J. Chem. Phys. 117, 10385-10397.
Fujii, F., Noguchi, H. & Ramm, E. [2000] "Static path jumping to attain postbuckling equilibria of a compressed circular cylinder," Comput. Mech. 26, 259-266.
Govaerts, W. [2000] Numerical Methods for Bifurcations of Dynamic Equilibria (SIAM, Philadelphia, PA).
Griewank, A. & Reddien, G. [1983] "The calculation of Hopf points by a direct method," IMA J. Numer. Anal. 3, 295-303.
Henderson, M. [2002] "Multiple parameter continuation: Computing implicitly defined k-manifolds," Int. J. Bifurcation and Chaos 12, 451-476.
Hendrickson, B. & Leland, R. [1995a] "The Chaco user's guide: Version 2.0," Technical Report SAND94-2692, Sandia National Labs, Albuquerque, NM.
Hendrickson, B. & Leland, R. [1995b] "An improved spectral graph partitioning algorithm for mapping parallel communications," SIAM J. Sci. Comput. 16, 452-469.
Heroux, M., Bartlett, R., Howie, V., Hoekstra, R., Hu, J., Kolda, T., Lehoucq, R., Long, K., Pawlowski, R., Phipps, E., Salinger, A., Thornquist, H., Tuminaro, R., Willenbring, J. & Williams, A. [2003] "An overview of Trilinos," Technical Report SAND2003-2927, Sandia National Labs, Albuquerque, NM.
Hughes, T. J. R., Franca, L. P. & Balestra, M. [1989a] "A new finite element formulation for computational fluid dynamics: V. Circumventing the Babuska-Brezzi condition: A stable Petrov-Galerkin formulation of the Stokes problem accommodating equal-order interpolation," Comput. Meth. Appl. Mech. Engin. 59, 85-99.
Hughes, T. J. R., Franca, L. P. & Hulbert, G. M. [1989b] "A new finite element formulation for computational fluid dynamics: VII. The Galerkin/Least-Squares method for advective-diffusive equation," Comput. Meth. Appl. Mech. Engin. 73, 173-189.
Hutchinson, S. A., Shadid, J. N. & Tuminaro, R. S. [1995] "Aztec user's guide: Version 1.0," Technical Report SAND95-1559, Sandia National Laboratories, Albuquerque, New Mexico 87185.
Keller, H. B. [1977] "Numerical solution of bifurcation and nonlinear eigenvalue problems," in Applications of Bifurcation Theory, ed. Rabinowitz, P. H. (Academic Press, NY), pp. 159-384.
Kuznetsov, Y. A. & Levitin, V. V. [1995-1997] "CONTENT: A multiplatform environment for analyzing dynamical systems," Dynamical Systems Laboratory, CWI, Amsterdam, The Netherlands.
Lasater, M. S., Kelley, C. T., Salinger, A. G., Woolard, D. L. & Zhao, P. [2004] "Parallel solution of the Wigner-Poisson equations for RTDs," Proc. 2004 Int. Symp. Distributed Computing and Applications to Business, Engineering, and Science.
Lehoucq, R. & Salinger, A. [2001a] "Large-scale eigenvalue calculations for stability analysis of steady flows on massively parallel computers," Int. J. Numer. Meth. Fluids 36, 309-327.
Lehoucq, R. B. & Salinger, A. G. [2001b] "Large-scale eigenvalue calculations for stability analysis of steady flows on massively parallel computers," Int. J. Numer. Meth. Fluids 36, 309-327.
Lehoucq, R. B., Sorensen, D. C. & Yang, C. [1998] ARPACK USERS GUIDE: Solution of Large Scale Eigenvalue Problems with Implicitly Restarted Arnoldi Methods (SIAM, Philadelphia, PA).
Lust, K., Roose, D., Spence, A. & Champneys, A. [1998] "An adaptive Newton-Picard algorithm with subspace iteration for computing periodic solutions," SIAM J. Sci. Comput. 19, 1188-1209.
Mamun, C. & Tuckerman, L. [1995] "Asymmetric and Hopf bifurcation in spherical Couette flow," Phys. Fluids 7, 80-91.
Maschhoff, K. J. & Sorensen, D. C. [1996] "P_ARPACK: An efficient portable large scale eigenvalue package for distributed memory parallel architectures," in Applied Parallel Computing in Industrial Problems and Optimization, eds. Wasniewski, J., Dongarra, J., Madsen, K. & Olesen, D., Lecture Notes in Computer Science, Vol. 1184 (Springer-Verlag, Berlin).
Moore, G. & Spence, A. [1980] "The calculation of turning points of nonlinear equations," SIAM J. Numer. Anal. 17, 567-576.
Muratov, C. & Shvartsman, S. [2003] "An asymptotic study of the inductive pattern formation mechanism in drosophila egg development," Physica D 186, 93-108.
Nakamura, Y. [1997] "Spatio-temporal dynamics of forced periodic flows in a confined domain," Phys. Fluids 9, 3275-3287.
Nore, C., Tuckerman, L., Daube, O. & Xin, S. [2003] "The 1:2 mode interaction in exactly counter-rotating von Karman swirling flow," J. Fluid Mech. 477, 51-88.
Pawlowski, R. P., Salinger, A. G., Romero, L. A. & Shadid, J. N. [2001] "Computational design and analysis of MPOVPE reactors," J. Phys. IV 11, 197-204.
Pawlowski, R. P., Simonis, J. P., Shadid, J. N. & Walker, H. F. [2004] "Globalization techniques for Newton-Krylov methods and applications to the fully-coupled solution of the Navier-Stokes equations," Technical Report SAND2004, Sandia National Labs, Albuquerque, NM.
Salinger, A. & Frink, L. [2003] "Rapid analysis of phase behavior with density functional theory, part I: Novel numerical methods," J. Chem. Phys. 118, 7457-7465.
Salinger, A., Lehoucq, R., Pawlowski, R. & Shadid, J. [2002a] "Computational bifurcation and stability studies of the 8:1 cavity problem," Int. J. Numer. Meth. Fluids 40, 1059-1073.
Salinger, A. G., Bou-Rabee, N., Pawlowski, R. P., Wilkes, E. D., Burroughs, E. A., Lehoucq, R. B. & Romero, L. A. [2002b] "LOCA 1.0: Library of continuation algorithms - Theory and implementation manual," Technical Report SAND2002-0396, Sandia National Laboratories, Albuquerque, New Mexico 87185.
Salinger, A. G., Shadid, J. N., Hutchinson, S. A., Hennigan, G. L., Devine, K. D. & Moffat, H. K. [1999] "Analysis of gallium arsenide deposition in a horizontal chemical vapor deposition reactor using massively parallel computations," J. Cryst. Growth 203, 516-533.
Shadid, J. N. [1999] "A fully-coupled Newton-Krylov solution method for parallel unstructured finite element fluid flow, heat and mass transport," IJCFD 12, 199-211.
Shepherd, J. F. [2000] "CUBIT mesh generation toolkit," Technical Report SAND2000-2647, Sandia National Laboratories, Albuquerque, New Mexico 87185.
Sone, Y., Aoki, K. & Sugimoto, H. [1997] "The Benard problem for a rarefied gas: Formation of steady flow patterns and stability of array of rolls," Phys. Fluids 9, 3898-3914.
Tangborn, A. V., Zhang, S. Q. & Lakshminarayanan, V. [1995] "A three-dimensional instability in mixed convection with streamwise periodic heating," Phys. Fluids 7, 2648-2658.
Werner, B. & Spence, A. [1984] "The computation of symmetry-breaking bifurcation points," SIAM J. Numer. Anal. 21, 388-399.
Willis, G. E. & Deardorff, J. W. [1970] "The oscillatory motions of Rayleigh convection," J. Fluid Mech. 44, 661-672.
Xin, S. & Le Quere, P. [2002] "An extended Chebyshev pseudo-spectral benchmark for the 8:1 differentially heated cavity," Int. J. Numer. Meth. Fluids 40, 981-998.
AN ALGORITHM FOR FINDING INVARIANT
ALGEBRAIC CURVES OF A GIVEN DEGREE FOR
POLYNOMIAL PLANAR VECTOR FIELDS
GRZEGORZ SWIRSZCZ
IBM Watson Research Center, Yorktown Heights NY 10598, USA
Institute of Mathematics, University of Warsaw,
02-097 Warsaw, Banacha 2, Poland
swirszcz@us.ibm.com
swirszcz@mimuw.edu.pl

Received March 10, 2004; Revised June 10, 2004

Given a system of two autonomous ordinary differential equations whose right-hand sides are
polynomials, it is very hard to tell if any nonsingular trajectories of the system are contained
in algebraic curves. We present an effective method of deciding whether a given system has an
invariant algebraic curve of a given degree. The method also allows the construction of examples
of polynomial systems with invariant algebraic curves of a given degree. We present the first
known example of a degree 6 algebraic saddle-loop for a polynomial system of degree 2, which
has been found using the described method. We also present some new examples of invariant
algebraic curves of degrees 4 and 5 with an interesting geometry.

Keywords: Invariant algebraic curve; symbolic computations; linear algebra.

1. Introduction and Preliminary Definitions

Since Darboux [1878] found connections between algebraic geometry and the existence of first integrals of polynomial systems (polynomial planar vector fields), algebraic invariant curves have been a central object in the theory of integrability of polynomial systems in $\mathbb{R}^2$. Today, after more than a century of investigations, the theory of invariant algebraic curves is still full of open questions. One of the reasons for this is the fact that examples of polynomial systems with invariant algebraic curves are extremely hard to find. The calculations required to find such examples exceed human abilities. Recent development of the theory of integrability would have been impossible without the use of automatic computations. Thanks to the pioneering works of von Neumann [1946, 1958] and Turing [1950] we now have at our disposal sound foundations of methods performing symbolic computations and providing us with reliable results. The visionary ideas of von Neumann about the architecture of computers (called today "von Neumann Architecture") [von Neumann, 1945] and his concepts of "code" and data processing have laid the foundations of modern computer science and its applications to pure and applied mathematics. Thanks to his ideas we now know how to obtain "reliable answers from unreliable computer components", and with the aid of computers we are able to develop proofs and theories which are strictly correct from the mathematical point of view.

Dynamical systems are one of the fields of mathematics where the combination of pure science, modeling and computational methods has led to amazing results. In the theory of iterations of maps, thanks to the use of computers, we have beautiful visualizations of fractals (see for example the


famous book [Mandelbrot, 1982]). Computer simulations have also provided useful tools and intuitions for many mathematical proofs; see for example [Lanford, 1982]. In the theory of vector fields, computer-assisted methods have a wide array of applications. Modeling methods are used to obtain approximate phase portraits for systems of differential equations, but a probably even more important application is the use of symbolic arithmetic. The possibility of performing extremely complex symbolic operations in a relatively short time has led in recent years to the discovery of new examples of invariant algebraic curves for polynomial systems and to a much better understanding of the theory of their integrability. Nevertheless, even with the help of computers it is far from obvious how to look for such examples. In the present paper we propose an approach based on symbolic computations and methods of linear algebra, which turned out to be very effective for low-degree polynomial systems. It allows one to reduce the problem of finding invariant algebraic curves to the problem of finding zeroes of a set of relatively simple polynomial equations. This gives a link between the theory of integrability of polynomial systems and a classical chapter in computer-assisted mathematics, the theory of Gröbner bases. Before we proceed with the introduction we present some definitions.

A polynomial system of degree $k$ in $\mathbb{R}^2$ is a system of two autonomous differential equations

$$\dot{x} = p(x,y), \qquad \dot{y} = q(x,y), \tag{1}$$

where $p$, $q$ are coprime polynomials of degree $k$, that is,

$$p(x,y) = \sum_{i+j=0}^{k} p_{ij}\, x^i y^j, \qquad q(x,y) = \sum_{i+j=0}^{k} q_{ij}\, x^i y^j.$$

We say that an algebraic curve is an invariant algebraic curve of degree $n$ if it is contained in the union of trajectories of (1) and it is given by the zeroes of a polynomial $\varphi$ of degree $n$,

$$\varphi(x,y) = \sum_{i+j=0}^{n} \varphi_{ij}\, x^i y^j.$$

From basic properties of polynomials follows the fundamental fact that the algebraic curve $\varphi(x,y) = 0$ is an invariant algebraic curve of system (1) if and only if there exists a polynomial $\kappa = \kappa(x,y)$ satisfying

$$p\,\frac{\partial \varphi}{\partial x} + q\,\frac{\partial \varphi}{\partial y} = \kappa\,\varphi. \tag{2}$$

The polynomial $\kappa$ is called a cofactor of the curve $\varphi = 0$. Of course, the degree of the cofactor can be at most $k - 1$, so

$$\kappa(x,y) = \sum_{i+j=0}^{k-1} \kappa_{ij}\, x^i y^j. \tag{3}$$

An invariant algebraic curve $\varphi = 0$ is called irreducible if the polynomial $\varphi$ is irreducible. In the rest of the paper all the invariant algebraic curves are assumed to be irreducible unless stated otherwise.

A trajectory $\gamma$ of system (1) is a limit cycle if it is nonconstant periodic and there are no other periodic trajectories in some neighborhood of $\gamma$. The orbit $\gamma$ is an algebraic limit cycle of system (1) if it is a limit cycle and if it is contained in some irreducible algebraic invariant curve $\varphi = 0$ of system (1). An algebraic saddle-loop is defined analogously.

A polynomial system (1) that has enough invariant algebraic curves must be integrable. Let $\varphi_i(x,y)$ be polynomials defining invariant algebraic curves of system (1). We say that a first integral $H$ of polynomial system (1) is in Darboux form if it satisfies, for some $K \in \mathbb{N}$ and some $\alpha_i \in \mathbb{C}$,

$$H(x,y) = \prod_{i=1}^{K} \varphi_i^{\alpha_i}(x,y).$$

We say that an integrating factor $M$ of polynomial system (1) is in Darboux form if it satisfies, for some $L \in \mathbb{N}$ and some $\beta_i \in \mathbb{C}$,

$$M(x,y) = \prod_{i=1}^{L} \varphi_i^{\beta_i}(x,y).$$

The classical result of Darboux is

Theorem 1.1. (Darboux) If polynomial system (1) of degree $k$ has more than $k(k+1)/2$ irreducible invariant algebraic curves $\varphi_i(x,y) = 0$, then there exist constants $\alpha_i$ such that the product

$$\prod_{i=1}^{\frac{k(k+1)}{2}+1} \varphi_i^{\alpha_i}$$

is a first integral in Darboux form. When the number of invariant algebraic curves is equal to $k(k+1)/2$, the system has an integrating factor in Darboux form.

Nevertheless, the above conditions are too strong in general. This motivates the following problem.

Problem 1. What are the connections between the possible degrees and numbers of invariant algebraic curves of a polynomial system of degree $k$ and the existence and type of its first integral?

Understanding the significance of invariant algebraic curves, Poincaré [1891, 1897] formulated a slightly different question: estimate the greatest possible degree $n = n(k)$ of an invariant algebraic curve for a polynomial system of degree $k$. In this formulation the question has a simple answer: the system

$$\dot{x} = n x, \qquad \dot{y} = y \tag{4}$$

has the invariant algebraic curve $x - y^n = 0$, therefore even $n(1)$ is unbounded. Nevertheless, system (4) has a rational first integral $x y^{-n}$, so each of its trajectories is contained in some algebraic curve. Therefore, Problem 1 is often referred to as "Poincaré's problem". Another approach is to look for "nontrivial" examples of invariant algebraic curves of high degrees, like algebraic limit cycles or algebraic saddle-loops.

One of the main problems in the development of the theory of invariant algebraic curves is the fact that there are not many examples known. Even for systems of degree 2 the structure of invariant algebraic curves turned out to be much more complex than had been expected. For example, it has been conjectured that

Conjecture (Lins-Neto). There exists a number $N(2)$ such that, if a quadratic system has an invariant algebraic curve of degree $n > N(2)$, then the system has a rational first integral.

This conjecture has been proved to be false by [Christopher & Llibre, 2002], who have found a class of quadratic systems that can have an invariant algebraic curve of any degree and not have a rational first integral. Their example has a rational integrating factor. Later [Chavarriga & Grau, 2002] have found a family of quadratic systems which can have an invariant algebraic curve of any degree and without a rational integrating factor. It has an integrating factor in Darboux form and it is still an open question if the following conjecture is true.

Conjecture (Weakened Lins-Neto). There exists a number $N(2)$ such that if a quadratic system has an invariant algebraic curve of degree greater than $N(2)$, then the system has an integrating factor in Darboux form.

The problem of classification of algebraic limit cycles for quadratic systems is also open. For almost 30 years there have been only three known examples, one of degree 2 [Qin, 1958] and two of degree 4 [Yablonskii, 1966; Filiptsov, 1973]. It was also known [Evdokimenco, 1970, 1974, 1979] that there are no quadratic systems with algebraic limit cycles of degree 3. Then in the year 2000 two more families of quadratic systems with algebraic limit cycles of degree 4 were found; see [Chavarriga et al., 2001; Chavarriga et al., 2000]. It has also been proved by [Chavarriga et al., 2000] that there are no other families of quadratic systems with algebraic limit cycles of degree 4. The question whether there exist quadratic systems with algebraic limit cycles of degree greater than 4 remained open until recently, when two new examples, one of degree 5 and one of degree 6, were found [Christopher et al., 2003]. Also [Christopher et al., 2003] presents the first example of an algebraic saddle-loop of degree 5. In Sec. 4.3 we give the first example of an algebraic saddle-loop of degree 6.

Another simple and interesting class of polynomial systems for which one may ask about the existence of algebraic limit cycles are Liénard systems $\dot{x} = y$, $\dot{y} = -F_k(x)\,y - G_m(x)$ ($F_k$, $G_m$ are polynomials of degrees $k$ and $m$, respectively). In this case, the question has been answered by Zoladek [1998] for all values of $k$ and $m$ except for $k = 1$, $m = 3$, for which the question still remains open.

These, and many more similar questions, motivate the need for an efficient algorithm for finding examples of families of polynomial systems with invariant algebraic curves. Until now, most attempts were based on looking for algebraic curves in some special form (usually hyperelliptic) for the sake of simplifying the calculations. However successful this simple approach was in many cases, it is far from being general and fails completely when one tries to look for invariant curves of a high degree. This is the reason that there have been practically no known examples of invariant algebraic curves of degrees higher than 4. For quadratic systems even the invariant algebraic curves of degree 4 are not well investigated. As one of the examples of the application of the presented algorithm, we give in Sec. 4 two examples of invariant algebraic curves of degree 4 with an interesting geometry, which to our knowledge have not been known before.

With the method described in the present paper we have been able to successfully investigate some families of quadratic systems with invariant algebraic curves of degrees as high as 14.

2. The Problem of Invariant Algebraic Curves from the Point of View of Linear Algebra

The method we present is based on the observation that the problem of existence and finding a solution to Eq. (2) is a purely linear problem. To be more precise, we look for a polynomial $\varphi(x,y)$ of degree less than or equal to $n$. Such polynomials form a linear space $V_n$ of dimension $(n+1)(n+2)/2$. Given a polynomial system (1) of degree $k$ and a polynomial $\kappa(x,y)$ of degree $k-1$ we define an operator $\Xi : V_n \to V_{n+k-1}$ as

$$\Xi(\varphi) = p\,\frac{\partial \varphi}{\partial x} + q\,\frac{\partial \varphi}{\partial y} - \kappa\,\varphi.$$

Of course, $\Xi$ is a linear operator. An obvious consequence of the definition is:

Proposition 2.1. Polynomial system (1) has an invariant algebraic curve $\varphi$ of degree less than or equal to $n$ with cofactor $\kappa$ if and only if the operator $\Xi$ has a nontrivial kernel.

To investigate the kernel of $\Xi$ we shall use the language of matrices. We introduce the following basis $B$ in $V_n$:

$$x^i y^j = e_{\mu(i,j)},$$

where $\mu(i,j) = (i+j)(i+j+1)/2 + i$. This comes from linearly ordering the monomials in the following way: $x^i y^j > x^k y^l$ if and only if $i + j > k + l$, or $i + j = k + l$ and $i > k$.

Remark 2.2. Note that the function $\mu$ is a bijection from $\mathbb{N} \times \mathbb{N}$ to $\mathbb{N}$, so it has an inverse function. Therefore, it makes sense to use both $\mu = \mu(i,j)$ and $i = i(\mu)$, $j = j(\mu)$.

Every polynomial $\varphi \in V_n$ has a unique representation as a vector in the basis $B$: its coordinates are simply the coefficients of the polynomial $\varphi$. Now the operator $\Xi$ is represented in the basis $B$ by a $[(n+k)(n+k+1)/2] \times [(n+1)(n+2)/2]$ matrix $A = (a_{IJ})$. The terms $a_{IJ}$ satisfy

$$a_{IJ} = i(J)\, p_{\,i(I)-i(J)+1,\; j(I)-j(J)} + j(J)\, q_{\,i(I)-i(J),\; j(I)-j(J)+1} - \kappa_{\,i(I)-i(J),\; j(I)-j(J)}, \tag{5}$$

where $i(I)$, $j(I)$, $i(J)$, $j(J)$ are the unique numbers satisfying $\mu(i(I), j(I)) = I$ and $\mu(i(J), j(J)) = J$ (see Remark 2.2). We apply the convention that we set $p_{ij}$, $q_{ij}$, $\kappa_{ij}$ equal to 0 if $(i,j)$ is out of the range of definition, i.e. $i$ or $j$ is negative, or their sum is greater than the degree of the corresponding polynomial.

Matrix $A$ has the following block-multidiagonal form

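$$
A = \begin{pmatrix}
B^{n+k-1}_{n} & & & & \\
B^{n+k-2}_{n} & B^{n+k-2}_{n-1} & & & \\
\vdots & \vdots & \ddots & & \\
B^{n-1}_{n} & B^{n-1}_{n-1} & \cdots & & \\
 & B^{n-2}_{n-1} & \cdots & & \\
 & & \ddots & \vdots & \vdots \\
 & & B^{1}_{2} & B^{1}_{1} & B^{1}_{0} \\
 & & & B^{0}_{1} & B^{0}_{0}
\end{pmatrix}
$$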

where each of the blocks $B^i_j$ is an $(i+1) \times (j+1)$ matrix.

Let $M_0$ denote the set of all the minors of maximum dimension (determinants of $\frac{(n+1)(n+2)}{2} \times \frac{(n+1)(n+2)}{2}$ submatrices) of the matrix $A$. $M_0$ is a set of polynomials in the variables $p_{ij}$, $q_{ij}$ and $\kappa_{ij}$. The number of polynomials in the set $M_0$ is equal to

$$\binom{(n+k)(n+k+1)/2}{(n+1)(n+2)/2}$$

and each of its elements depends, in general, on $(k+1)(3k+4)/2$ variables. From fundamental facts of linear algebra follows

Theorem 2.3. Polynomial system (1) has an invariant algebraic curve $\varphi$ of degree less than or equal to $n$ with cofactor $\kappa$ if and only if all the polynomials in $M_0$ vanish simultaneously.

Theorem 2.3 suggests the following algorithm. If we want to find a polynomial system of a given degree $k$ with an invariant algebraic curve of degree less than or equal to $n$, we calculate the corresponding matrix $A$ for system (1), and the corresponding set $M_0$. Next we try to solve the equation $M_0 = 0$. (In the language of algebraic geometry this means that we look for a simple description of the algebraic set $V(M_0)$.) Methods for solving systems of polynomial equations are very well developed. The theory of Gröbner bases and multipolynomial resultants can be applied here; see for example [Cox et al., 1998]. Nevertheless, one can immediately see that, if we try to use this straightforward approach, we end up with an enormous number of equations in many variables.

Fortunately, when we look for examples of polynomial systems with invariant algebraic curves, we usually consider certain families, depending only on a few parameters. Therefore, the number of variables is usually not too big.

The key to reducing the number of equations is a standard linear-algebra approach. First we note that, if there is a row $i$ in the matrix $A$ containing only a single nonzero constant term $a_{ij}$, then each of the vectors in the kernel of $A$ must have 0 at the $j$th coordinate. Therefore, we can remove the column $j$ from the matrix $A$, limiting our considerations to a certain subspace of the space $V_n$. Moreover, after the removal there can appear more rows with only one nonzero constant term in them, so sometimes the size of the matrix $A$ can be reduced significantly in that way. We can also remove all the rows containing only zeroes. We obtain the reduced matrix $B$.

Once we have found the matrix $B$ we apply Gauss-Jordan elimination. We should note here that applicability of numerical methods to Gauss elimination is the subject of a fundamental paper by [Goldstine & von Neumann, 1947]. When the polynomial system is expressed in a normal form, one may expect the matrix $B$ to have a lot of terms which are constants, that is, they do not depend on the parameters of the system and the coefficients of the cofactor.

3. The Algorithm

We get the following algorithm. Given a family of polynomial systems

$$\dot{x} = \sum_{i+j=0}^{k} p_{ij}\, x^i y^j, \qquad \dot{y} = \sum_{i+j=0}^{k} q_{ij}\, x^i y^j, \tag{6}$$

whose coefficients $p_{ij}$, $q_{ij}$ depend on some parameters $\rho_1, \ldots, \rho_s$, and an integer $n$, we want to find those values of the parameters for which the system has an invariant algebraic curve of degree $n$.

The procedure

1. We use changes of variables to transform simultaneously the system (6) and the potential cofactor $\kappa(x,y) = \sum_{i+j=0}^{k-1} \kappa_{ij}\, x^i y^j$ to the simplest form. Usually we strive to make as many of the coefficients $p_{ij}$, $q_{ij}$, $\kappa_{ij}$ as possible zero or equal to constants; all other coefficients are treated as the parameters of the family. We shall call the family obtained in this way the simplified family.
2. We find the matrix $A$ for the simplified family.
3. We generate a vector $W \in \mathbb{K}[x,y]^{\frac{(n+1)(n+2)}{2}}$ whose $i$th coordinate is the monomial $e_i$, i.e. $W = (x^n, x^{n-1}y, x^{n-2}y^2, \ldots, y^n, x^{n-1}, x^{n-2}y, \ldots, x, y, 1)$. We create an extended matrix $\tilde{A}$ obtained by adding the vector $W$ as the last row to the matrix $A$. This is done only to make the transformation of the obtained vector-solution into a corresponding polynomial more convenient.
4. We perform the preliminary simplification of the extended matrix $\tilde{A}$: if there is any row $i$ in the matrix $A$ containing only a single nonzero constant term $a_{ij}$, we remove the $j$th column from the matrix $\tilde{A}$. We keep repeating this process until there are no more rows with only one nonzero constant term. Then from the obtained matrix we remove all the rows with only zeroes in them. We denote the extended reduced matrix we have obtained by $\tilde{B}$.
5. We denote the last row of the matrix $\tilde{B}$ by $\tilde{W}$. We remove it. The matrix we obtain is the reduced matrix $B$ for the simplified family.
6. We apply the process of Gauss-Jordan elimination to the matrix $B$, using only nonzero constant terms. Namely, starting from the leftmost column we pick a nonzero constant term and use row reduction to make all the other terms in that column equal to zero. Then we proceed to the next column. If there is a column with all the terms in it depending on the parameters, we skip it in the process. We denote the obtained matrix by $C$.
7. We apply the process described in step 4 to the matrix $C$. In other words, we remove all columns with precisely one constant term in them, and then we remove all rows with only zeroes in them. The matrix we obtain is denoted by $D$.
8. We calculate the set $M_1$ of minors of maximum dimension of the matrix $D$. From standard facts of linear algebra it follows that $M_0$ vanishes if and only if $M_1$ vanishes.
9. We try to solve the system of equations $M_1 = 0$. We find a set of solutions $\{S_1, S_2, \ldots, S_d\}$.
10. For each $S_i$ we substitute it into the matrix $B$, obtaining a matrix $B_i = B|_{S_i}$. Next, we solve the linear system of equations $B_i \cdot X = 0$. Of course, each of the matrices $B_i$ is a degenerate matrix, so for each $i$ we have a nonempty set of $l_i$ solutions $\{X^i_l\}_{l=1}^{l_i}$, $l_i \ge 1$. Note that in most cases $B_i$ is a family of matrices: after the substitution of the solution $S_i$, $B$ usually still depends on some parameters, and so does each of the corresponding vectors $X^i_l$. Therefore, we shall refer to each $X^i_l$ as a family of solutions, although in some cases it can be a constant family (see Sec. 4.2).
11. For each pair $(i,l)$, the family of polynomials $\varphi^i_l(x,y) = \tilde{W} \cdot X^i_l$ defines a family of invariant algebraic curves for the subfamily of the simplified family (6) defined by the conditions $S_i$. Note that $S_i$ usually contains some equations that must be satisfied by the coefficients of the cofactor, as well as the coefficients of the system.

Remark 3.1. One may notice that steps 3-5 of our algorithm seem unnecessary. Indeed, one could apply Gauss-Jordan elimination immediately to the matrix $A$. Nevertheless, the form of the vector $\tilde{W}$ and the simplified matrix $B$ contain some information about the structure of the invariant algebraic curve we are trying to find. This is particularly helpful when we try to determine if the family of systems we are investigating is a good candidate. Sometimes it can suggest how to change the family. Another advantage is that performing this preliminary reduction makes the elimination process run faster.

Remark 3.2. In most cases, the system of linear equations $B_i \cdot X = 0$ in step 10 of our algorithm has only one solution $X^i_1$. In case $l_i > 1$ the polynomial system corresponding to $S_i$ has a rational first integral. Indeed, the invariant algebraic curves $\varphi^i_1$ and $\varphi^i_j$ have the same cofactor $\kappa$, so

$$\left(\frac{\varphi^i_1}{\varphi^i_j}\right)^{\!\cdot} = \frac{\kappa\,\varphi^i_1\,\varphi^i_j - \varphi^i_1\,\kappa\,\varphi^i_j}{(\varphi^i_j)^2} = 0.$$

Remark 3.3. To solve or simplify the system of polynomial equations $M_1 = 0$ the methods of applied algebraic geometry can be used; see [Cox et al., 1998]. In many cases standard packages using Gröbner bases are efficient; in other cases the combination of those and the use of resultants turned out to be very effective.

4. Examples

4.1. Degree 4 invariant algebraic curves for a certain family of quadratic systems

We look for invariant algebraic curves of degree 4 within the family of quadratic systems

$$\dot{x} = x + y + xy, \qquad \dot{y} = Kx + Ly + \alpha x^2 + \beta xy + 2y^2$$

with cofactor $4y$. This family depends on the four parameters $\{K, L, \alpha, \beta\}$. We perform steps 1-3 of our algorithm. The extended matrix $\tilde{A}$ for the family is as follows.
[Rotated full-page matrix display from the source; the entries are not recoverable from the extracted text.]
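The matrix construction of Sec. 2 can be sketched in a few lines. This is an illustrative sketch rather than the author's implementation: it assumes polynomials are stored as dictionaries mapping exponent pairs (i, j) to coefficients, and the helper names (`mu`, `mu_inverse`, `operator_matrix`, `rank`) are hypothetical. Each column is obtained by applying the map φ → p φ_x + q φ_y − κφ to one monomial. Rows and columns are indexed by μ in increasing order, which is the reverse of the display order used for the vector W. As a check, it uses the member K = L = β = −1, α = −2 of the family above with cofactor 4y and confirms that the coefficient vector of (x + x² + y)² lies in the kernel.

```python
from fractions import Fraction


def mu(i, j):
    """Position of the monomial x^i y^j: graded by total degree, then by i."""
    d = i + j
    return d * (d + 1) // 2 + i


def mu_inverse(m):
    """Exponents (i, j) with mu(i, j) == m."""
    d = 0
    while (d + 1) * (d + 2) // 2 <= m:
        d += 1
    i = m - d * (d + 1) // 2
    return i, d - i


def operator_matrix(p, q, kappa, n, k):
    """Matrix of phi -> p*phi_x + q*phi_y - kappa*phi from V_n to V_{n+k-1}."""
    rows = (n + k) * (n + k + 1) // 2  # dimension of V_{n+k-1}
    cols = (n + 1) * (n + 2) // 2      # dimension of V_n
    A = [[Fraction(0)] * cols for _ in range(rows)]
    for J in range(cols):
        a, b = mu_inverse(J)           # column monomial x^a y^b
        for I in range(rows):
            c, d = mu_inverse(I)       # row monomial x^c y^d
            A[I][J] = Fraction(
                a * p.get((c - a + 1, d - b), 0)
                + b * q.get((c - a, d - b + 1), 0)
                - kappa.get((c - a, d - b), 0)
            )
    return A


def rank(M):
    """Rank by exact Gauss-Jordan elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for col in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][col] != 0:
                f = M[i][col] / M[r][col]
                M[i] = [x - f * y for x, y in zip(M[i], M[r])]
        r += 1
    return r


# x' = x + y + xy,  y' = Kx + Ly + alpha x^2 + beta xy + 2 y^2
# with K = L = beta = -1, alpha = -2 (illustrative parameter choice).
p = {(1, 0): 1, (0, 1): 1, (1, 1): 1}
q = {(1, 0): -1, (0, 1): -1, (2, 0): -2, (1, 1): -1, (0, 2): 2}
kappa = {(0, 1): 4}                    # cofactor 4y
n, k = 4, 2
A = operator_matrix(p, q, kappa, n, k)

# Coefficients of (x + x^2 + y)^2, placed at the positions given by mu.
phi = {(2, 0): 1, (3, 0): 2, (4, 0): 1, (1, 1): 2, (2, 1): 2, (0, 2): 1}
v = [Fraction(0)] * len(A[0])
for (i, j), coeff in phi.items():
    v[mu(i, j)] = Fraction(coeff)

residual = [sum(a * x for a, x in zip(row, v)) for row in A]
print(all(r == 0 for r in residual), rank(A) < len(A[0]))
```

Exact rational arithmetic is used because a floating-point rank test would need a tolerance. For a family with symbolic parameters the same loop applies with a computer algebra system supplying the coefficient type.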

We proceed to step 7 of our algorithm and get

$$
D = \begin{pmatrix}
2\alpha - 2\beta + 3\alpha\beta - 3\beta^2 - 2L + 3\alpha L - 6\beta L - 2KL - L^2 & 2 - 2\alpha + 5\beta + 2K + L \\
3\alpha\beta - 3\beta^2 - \beta K + \alpha L - 3\beta L - KL & -\alpha + 3\beta + K \\
2\alpha - 2\beta + 3\alpha\beta - 3\beta^2 + 2K - 4L + 3\alpha L - 6\beta L - 3L^2 & 4 - 3\alpha + 6\beta + 3K + 3L
\end{pmatrix}.
$$

Now we are ready to calculate $M_1$. It consists of three terms that, after multiplication by a constant, are equal to

$$(\alpha - \beta - K)(-2\alpha + 3\alpha\beta - 6\beta^2 - 2\beta K - \alpha L),$$
$$(\alpha - \beta - K)(-4 + 2\alpha - 8\beta + 3\alpha\beta - 3\beta^2 - 4K - 8L + 3\alpha L - 12\beta L - 6KL - 3L^2),$$
$$(\alpha - \beta - K)(-2\alpha - 6\beta + 6\alpha\beta - 9\beta^2 - 2K - 3\beta K - 9\beta L - 3KL).$$

The set of equations $M_1 = 0$ can be solved explicitly, and we have the following solutions

$$S_1 = \{\alpha = \beta + K\},$$
$$S_2 = \{K = -1 \wedge L = -1 \wedge \alpha = -2 \wedge \beta = -1\},$$
$$S_3 = \{K = -1 \wedge L = -1 \wedge \beta = \tfrac{1}{3}\},$$
$$S_4 = \{\beta = -1 \wedge K = 2\alpha + 3 \wedge K \ne -1 \wedge L = -1\},$$
$$S_5 = \Big\{\big(3\beta + 2\sqrt{-2-3L} = 2 + 3L \;\vee\; 2 + 2\sqrt{-2-3L} + 3L = 3\beta\big) \wedge L \ne -1 \wedge 2\alpha + \frac{(1+K)(1+\beta)}{1+L} = 5 + 3\beta + 3L + 3K\Big\}.$$

The kernel of $B_1 = B|_{S_1}$ is generated by the vector $X^1_1 = ((\beta + K)^2,\ 4K(\beta + K),\ -4(\beta + K),\ 2K(\beta + 3K) - 2(\beta + K)L,\ -8K,\ 4,\ 4K(K - L),\ -4K + 4L,\ (K - L)^2)^T$. Therefore the invariant algebraic curve $\tilde{W} \cdot X^1_1 = (L - \beta x^2 - K(1+x)^2 + 2y)^2 = 0$, which is reducible, corresponds to $S_1$.

Similarly, the reducible invariant algebraic curve $(x + x^2 + y)^2 = 0$ corresponds to $S_2$.

The invariant curve $18x^2 + 4x^3 - 12\alpha x^3 - 3\alpha x^4 + 36xy + 12x^2y + 18y^2 = 0$ corresponding to $S_3$ has, for $\alpha < 2/3$, the form of a cuspidal loop (see Fig. 1) containing all three singular points of the system.

Fig. 1. The curve corresponding to K = -1, L = -1, α = -4/3, β = 1/3.

Corresponding to $S_4$ is the invariant curve $(x^2 + 2y)(x(4 + 3x) + 2y) - Kx^2(2 + x)^2 = 0$.

The solution $S_5$ corresponds, in fact, to several families of algebraic invariant curves. Here we present only one example, belonging to a two-parameter family

$$\alpha = \frac{K\,(2 - \sqrt{-2-3L} + 3L) + (2 + 3L)(4 + \sqrt{-2-3L} + 3L)}{3(1+L)}, \qquad \beta = \tfrac{2}{3}\big(1 + \sqrt{-2-3L}\big) + L,$$

with an invariant algebraic curve

$$\varphi_{0,0} + \varphi_{1,0}x + \varphi_{2,0}x^2 + \varphi_{3,0}x^3 + \varphi_{4,0}x^4 + \varphi_{0,1}y + \varphi_{1,1}xy + \varphi_{2,1}x^2y + \varphi_{0,2}y^2 = 0$$

where

$$\varphi_{0,0} = 27(K - L)(1 + L)^3,$$
$$\varphi_{1,0} = 108K(1 + L)^3,$$
$$\varphi_{2,0} = 18(1 + L)\big((4 + \sqrt{-2-3L} + 3L)(2 + 5L + 3L^2) + K(8 - \sqrt{-2-3L} + 18L + 9L^2)\big),$$
$$\varphi_{3,0} = 4(2 + 3L)\big(6(4 + \sqrt{-2-3L}) + 17(4 + \sqrt{-2-3L})L + (60 + 9\sqrt{-2-3L})L^2 + 18L^3 + K(10 - 2\sqrt{-2-3L} + 21L + 9L^2)\big),$$
$$\varphi_{4,0} = (2 + 3L)K(10 - 2\sqrt{-2-3L} + 21L + 9L^2) + (2 + 3L)^2\big(14 + 8\sqrt{-2-3L} + 3(7 + 2\sqrt{-2-3L})L + 9L^2\big),$$
$$\varphi_{0,1} = -108(1 + L)^3,$$
$$\varphi_{1,1} = -36(1 + L)\big(6 + (13 + \sqrt{-2-3L})L + 6L^2\big),$$
$$\varphi_{2,1} = -12(4 + \sqrt{-2-3L} + 3L)(2 + 5L + 3L^2),$$
$$\varphi_{0,2} = 18(1 + \sqrt{-2-3L})(1 + L).$$

The shape of the curve for $K = -17/3$, $L = -8/5$ is presented in Fig. 2.

Fig. 2. The curve corresponding to K = -17/3, L = -8/5, α = -(1/135)(1358 + 43√70), β = (2/15)(√70 - 7).

4.2. Degree 5 invariant algebraic curves for quadratic systems

We present two examples of degree 5 invariant algebraic curves for quadratic systems. They have been found by applying our algorithm to the family of quadratic systems

$$\dot{x} = x + y + xy, \qquad \dot{y} = Kx + Ly + \alpha x^2 + \beta xy + \tfrac{5}{4}y^2$$

with cofactor $5y$. The set of minors $M_1$ consists of four polynomials depending on the four variables $K$, $L$, $\alpha$, $\beta$ that, after multiplication by constants, are equal to

$$80\alpha^2(909\beta + 86L - 25) - 5(4\beta + K)\big(22176\beta^3 + 245K(5 + 6L) + 36\beta^2(1725 + 1078L) + \beta(4490K + 3(5 + 6L)(745 + 462L))\big) + 2\alpha\big(133056\beta^3 - 33(5 + L)(5 + 6L)(10 + 7L) + 48\beta^2(924L - 1585) - 5K(2665 + 953L) + \beta(490K - 2(85175 + 22L(2570 + 63L)))\big),$$

$$400\alpha^2(9 + 606\beta + 83L) + 20\alpha\big(44352\beta^3 + 48\beta^2(462L - 307) - 2K(1940 + 1039L) + 2\beta(-26245 + 109K - 20338L + 1386L^2) - 3(1025 + L(2460 + 1501L))\big) - (4\beta + K)\big(310464\beta^3 + 504\beta^2(1945 + 1562L) + 7(5 + 6L)(520K + 3(5 + 11L)(5 + 16L)) + 2\beta(106175 + 25780K + 12L(19795 + 11088L))\big),$$

$$40\alpha^2(1704\beta - 83L - 625) - 55\beta(4\beta + K)\big(100K + 3\beta(365 + 672\beta + 168L)\big) + 2\alpha\big(94248\beta^3 - 10K(380 + 181L) - 6\beta^2(38245 + 3696L) - \beta(39125 + 19610K + L(25885 + 2772L))\big),$$

$$-210\beta^3(4\beta + K) + \alpha\beta\big(-95K + 2\beta(-505 + 168\beta - 63L)\big) + 12\alpha^2(11\beta - L - 5).$$

The system of equations $M_1 = 0$ has many solutions, some of them being isolated points in $\mathbb{C}^4$. The examples we present correspond to two of these isolated solutions, namely

$$K = \frac{375(8836\sqrt{21} - 1828897)}{722131963}, \quad L = \frac{5(170\sqrt{21} - 41951)}{219961}, \quad \alpha = \frac{46875(748\sqrt{21} - 2331)}{2888527852}, \quad \beta = \frac{375(9\sqrt{21} + 182)}{439922}$$

and $K = 189$, $L = -11$, $\alpha = 405/4$, $\beta = -27/2$. Therefore these examples are isolated, not belonging to any families of quadratic systems with an invariant algebraic curve of degree 5. We have:

The system

$$\dot{x} = x + y + xy,$$
$$\dot{y} = \frac{375(8836\sqrt{21} - 1828897)}{722131963}\,x + \frac{5(170\sqrt{21} - 41951)}{219961}\,y + \frac{46875(748\sqrt{21} - 2331)}{2888527852}\,x^2 + \frac{375(9\sqrt{21} + 182)}{439922}\,xy + \frac{5}{4}\,y^2$$

has the invariant algebraic curve

$$-3.1973 \cdot 10^{57} + 2.06748 \cdot 10^{60}x - 3.7594 \cdot 10^{62}x^2 + 1.32337 \cdot 10^{64}x^3 - 1.46055 \cdot 10^{64}x^4 + 2.21 \cdot 10^{62}x^5 + 2.22619 \cdot 10^{60}y - 8.09555 \cdot 10^{62}xy + 4.27331 \cdot 10^{64}x^2y - 6.20874 \cdot 10^{64}x^3y - 4.36964 \cdot 10^{62}y^2 + 4.63717 \cdot 10^{64}xy^2 - 1.09432 \cdot 10^{65}x^2y^2 + 1.69051 \cdot 10^{64}y^3 - 9.20718 \cdot 10^{64}xy^3 - 3.02394 \cdot 10^{64}y^4 = 0$$

with cofactor $5y$. We present the coefficients in numerical form because the exact formula is over two pages long. The curve is presented in Fig. 3.

Fig. 3. Invariant algebraic curve of degree 5.

The system

$$\dot{x} = x + y + xy, \qquad \dot{y} = 189x - 11y + \frac{405}{4}x^2 - \frac{27}{2}xy + \frac{5}{4}y^2$$

has the invariant algebraic curve

$$25600000 + 120960000x + 224272800x^2 + 203163552x^3 + 89367381x^4 + 15116544x^5 - 640000y - 2030400xy - 2137104x^2y - 746496x^3y + 16800y^2 + 35136xy^2 + 18306x^2y^2 - 208y^3 - 216xy^3 + y^4 = 0$$

with cofactor $5y$. This curve is presented in Fig. 4.

Fig. 4. Another invariant algebraic curve of degree 5.
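A curve produced by the procedure can be checked independently of the linear algebra by substituting it into the defining identity p φ_x + q φ_y = κφ. The sketch below does this for the second degree 5 example with exact rational arithmetic. It is an illustration under stated assumptions, not the author's code: polynomials are dictionaries from exponent pairs (i, j) to coefficients, the helper names (`pmul`, `padd`, `pdx`, `pdy`, `defect`) are hypothetical, and the y² coefficient 5/4 together with the cofactor 5y are the values for which the identity closes.

```python
from fractions import Fraction as F


def pmul(f, g):
    """Product of two polynomials stored as {(i, j): coefficient}."""
    out = {}
    for (a, b), u in f.items():
        for (c, d), v in g.items():
            key = (a + c, b + d)
            out[key] = out.get(key, 0) + u * v
    return {m: c for m, c in out.items() if c != 0}


def padd(f, g, sign=1):
    """f + sign * g, with zero coefficients dropped."""
    out = dict(f)
    for m, c in g.items():
        out[m] = out.get(m, 0) + sign * c
    return {m: c for m, c in out.items() if c != 0}


def pdx(f):
    """Partial derivative with respect to x."""
    return {(a - 1, b): a * c for (a, b), c in f.items() if a > 0}


def pdy(f):
    """Partial derivative with respect to y."""
    return {(a, b - 1): b * c for (a, b), c in f.items() if b > 0}


def defect(p, q, phi, kappa):
    """p*phi_x + q*phi_y - kappa*phi; an empty result means phi = 0 is invariant."""
    lhs = padd(pmul(p, pdx(phi)), pmul(q, pdy(phi)))
    return padd(lhs, pmul(kappa, phi), sign=-1)


# x' = x + y + xy,  y' = 189x - 11y + (405/4)x^2 - (27/2)xy + (5/4)y^2
p = {(1, 0): 1, (0, 1): 1, (1, 1): 1}
q = {(1, 0): 189, (0, 1): -11, (2, 0): F(405, 4), (1, 1): F(-27, 2), (0, 2): F(5, 4)}
kappa = {(0, 1): 5}  # cofactor 5y

phi = {
    (0, 0): 25600000, (1, 0): 120960000, (2, 0): 224272800,
    (3, 0): 203163552, (4, 0): 89367381, (5, 0): 15116544,
    (0, 1): -640000, (1, 1): -2030400, (2, 1): -2137104, (3, 1): -746496,
    (0, 2): 16800, (1, 2): 35136, (2, 2): 18306,
    (0, 3): -208, (1, 3): -216, (0, 4): 1,
}

print(defect(p, q, phi, kappa) == {})

# Changing a single coefficient breaks the identity.
phi_bad = dict(phi)
phi_bad[(0, 4)] = 2
print(defect(p, q, phi_bad, kappa) == {})
```

The same `defect` function can be applied to the other explicit curves of Sec. 4 after substituting numerical parameter values.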

4.3. Degree 6 invariant algebraic curve <pitl = -1200(192 + 584L + 544L2


containing a saddle-loop for a + 152L3 + 3L 4 )
certain family of quadratic
<p21 = -600(2 + 3L) 2 (28 + 20L - 11L2 - 3L 3 )
systems
Application of our algorithm to the family of p 3 ,i = -120(L - 2) 2 (3 + L)(2 + 3L) 3
systems y>4,i = 60(L - 2)(3 + L) 2 (2 + 3L) 4
x = 1 + x + xy, V?0,2 = 120000L(1 + L)
2 2
y = (K - a) + Kx + Ly + ax + (3xy + 2y <pi,2 = 24000(6 + 17 L + 14L2 + 3L 3 )
with cofactor 6y and n = 6 leads to the discov- V2,2 = -600(4 + 4L - 3L 2 ) 2
ery of a degree 6 algebraic saddle-loop. As far as we
ipofi = 80000
know, this is the first known example of an algebraic
saddle-loop of degree greater than 5 for quadratic
with cofactor 6y. For 1 < L < 2 this curve contains
systems.
a saddle-loop.
Theorem 4.1. The system
The shape of the curve for L — 11/7 is pre-
x = 1 + x + xy, sented in Fig. 5.

. - 2 2 - 47L - 21L2 34 + 87L + 60L 2 + 9L 3 Remark 4-2. Most examples presented in the paper
y = T7T-
10 -r^r~
10 —X belong to a very special class of quadratic systems.
There are certain conditions that must be satisfied
for a quadratic system to have an invariant alge-
10 braic curve of a high degree. Such quadratic systems
(3 + L) (2 + 3L) have been studied by Llibre and Swirszcz [2003] and
+ xy + 2y"
10 all quadratic systems admitting high-degree limit
has an invariant algebraic curve defined by cycles have been classified. In particular, the family

V0,0 + pi,QX + f2fiX2 + V3,0Z 3 + <P4flX4 + <^5,0Z5 x = x + y + xy,


+ m,ox6 + <fo,iy + <Pi,ixy + <p2,ix2y
y = Kx + Ly + ax2 + f3xy + 7y 2
+ <P3,ix3y + f^ix^y + ip0i2y2 + <Pi,2xy2
+ <P2,2X2y2 + <p0,3y3 = 0
where
ip0fi = -200(192 + 1104L + 2184L2
+ 1732L3 + 463L4)
<pho = -2400(88 + 474L + 937L2 + 834L3
+ 325L4 + 42L5) -1. 0.25 0.5
2 2
^ 0 = -60(2 + 3L) (1296 + 3160L + 2506L
+ 701L3 + 47L4)
^3,0 = 20(2 + 3L) 3 (-884 - 1496L - 615L2
+ 4L 3 + 19L4)
<p4,o = 12(2 + 3L) 4 (-138 - 109L + 33L2
+ 24L 3 + 2L4)
(^5,o = 6 ( L - 2 ) ( 3 + L) 3 (2 + 3L) 5
¥?6,o = ( L - 2 ) ( 3 + L) 3 (2-|-3L) 6
<^o,i = -8000(12 + 29L + 33L2 + 6L3) Fig. 5. Degree 6 algebraic saddle-loop for L = 11/7.
348 G. Swirszcz

with cofactor ny (denoted by 5 " in [Llibre & Swirszcz, 2003]) is a very promising class of systems. Many other examples of quadratic systems with invariant algebraic curves have been found using the described algorithm, but they usually do not have such interesting geometry. Conditions similar to some of those presented in [Llibre & Swirszcz, 2003] have been found for polynomial systems (not necessarily quadratic) by Chavarriga et al. [2003].

Acknowledgment

This paper has been partially supported by Polish KBN Grant 2 P03A 01022.

References

Chavarriga, J., Llibre, J. & Sorolla, J. [2000] "Algebraic limit cycles of quadratic systems," preprint.
Chavarriga, J., Giacomini, H. & Llibre, J. [2001] "Uniqueness of algebraic limit cycles for quadratic systems," J. Math. Anal. Appl. 261, 85-99.
Chavarriga, J. & Grau, M. [2002] "A family of non Darboux-integrable quadratic polynomial differential systems with algebraic solutions of arbitrarily high degree," Appl. Math. Lett. 16, 833-837.
Chavarriga, J., Giacomini, H. & Grau, M. [2003] "Necessary conditions for the existence of invariant algebraic curves for planar polynomial systems," Bull. Sci. Math., to appear.
Christopher, C. & Llibre, J. [2002] "A family of quadratic polynomial differential systems with invariant algebraic curves of arbitrarily high degree without rational first integrals," Proc. Amer. Math. Soc. 130, 2025-2030.
Christopher, C., Llibre, J. & Swirszcz, G. [2003] "Invariant algebraic curves of large degree for quadratic systems," J. Math. Anal. Appl., to appear.
Cox, D., Little, J. & O'Shea, D. [1998] Using Algebraic Geometry, Graduate Texts in Mathematics, Vol. 185 (Springer-Verlag, NY).
Darboux, G. [1878] "Mémoire sur les équations différentielles algébriques du premier ordre et du premier degré (Mélanges)," Bull. Sci. Math. 2ème série 2, 60-96, 123-144, 151-200.
Evdokimenco, R. M. [1970] "Construction of algebraic paths and the qualitative investigation in the large of the properties of integral curves of a system of differential equations," Diff. Eqs. 6, 1349-1358.
Evdokimenco, R. M. [1974] "Behavior of integral curves of a dynamic system," Diff. Eqs. 9, 1095-1103.
Evdokimenco, R. M. [1979] "Investigation in the large of a dynamic system," Diff. Eqs. 15, 215-221.
Filiptsov, V. F. [1973] "Algebraic limit cycles," Diff. Eqs. 9, 983-986.
Goldstine, H. H. & von Neumann, J. [1947] "Numerical inversion of matrices of high order," Bull. AMS, 1021-1099.
Lanford, O. E. [1982] "A computer-assisted proof of the Feigenbaum conjectures," Bull. Amer. Math. Soc. 6, 427-434.
Llibre, J. & Swirszcz, G. [2003] "Classification of quadratic systems admitting the existence of an algebraic limit cycle," preprint.
Mandelbrot, B. [1982] The Fractal Geometry of Nature (W.H. Freeman and Company, NY).
Poincaré, H. [1891] "Sur l'intégration des équations différentielles du premier ordre et du premier degré I and II," Rendiconti del Circolo Matematico di Palermo 5 (1891), 161-191; [1897] 11 (1897), 193-239.
Qin, Y.-X. [1958] "On the algebraic limit cycles of second degree of the differential equation $dy/dx = \sum_{0 \le i+j \le 2} a_{ij} x^i y^j / \sum_{0 \le i+j \le 2} b_{ij} x^i y^j$," Acta Math. Sin. 8, 23-35.
Turing, A. [1950] "Computing machinery and intelligence," Mind 49, 433-460.
von Neumann, J. [1945] "First Draft of a Report on the EDVAC," Contract No. W-670-ORD-492, Moore School of Electrical Engineering, Univ. of Penn., Philadelphia (30 June 1945); Reprinted (in part): Randell, B., Origins of Digital Computers: Selected Papers (Springer-Verlag, Berlin, Heidelberg), pp. 383-392.
von Neumann, J. [1946] The Principles of Large-Scale Computing Machines; Reprinted, Ann. Hist. Comp. 3, 263-273.
von Neumann, J. [1958] The Computer and the Brain (Yale University Press, New Haven).
Yablonskii, A. I. [1966] "Limit cycles of a certain differential equations," Diff. Eqs. 2, 335-344 (in Russian).
Zoladek, H. [1998] "Algebraic invariant curves for the Liénard equation," Trans. Amer. Math. Soc. 4, 1681-1701.
AUTHOR INDEX

Bloch, A. M. 97
Bogacz, R. 107
Brown, E. 107
Burroughs, E. A. 319

Cohen, J. D. 107

Dedieu, J.-P. 131
Dellnitz, M. 3, 67
Dhooge, A. 145
Doedel, E. J. 67, 145
Domokos, G. 165, 175

England, J. P. 195

Fernandez-Sanchez, F. 209
Freire, E. 209

Gao, J. 107
Garay, B. M. 33
Gilzenrat, M. 107
Govaerts, W. 145
Guckenheimer, J. 67

Healey, T. J. 175, 253
Henderson, M. E. 67, 271
Holmes, P. 107

Iserles, A. 97

Junge, O. 3, 67

Kevrekidis, I. G. 279
Kevrekidis, P. G. 279
Kloeden, P. E. 47
Koon, W. S. 3
Krauskopf, B. 67, 195
Kuznetsov, Yu. A. 145

Lekien, F. 3
Lo, M. W. 3
Lust, K. 279

Marsden, J. E. 3
Mehta, P. G. 253
Moller, J. 279

Osinga, H. M. 67, 195

Padberg, K. 3
Pawlowski, R. P. 319
Phipps, E. T. 319
Pizarro, L. 209
Preis, R. 3

Rodriguez-Luis, A. J. 209
Romero, L. A. 319
Ross, S. D. 3
Rowley, C. W. 301
Runborg, O. 279

Salinger, A. G. 319
Shub, M. 131
Siegmund, S. 47
Swirszcz, G. 337

Thiere, B. 3

Vladimirsky, A. 67
The Hungarian-born mathematical genius, John von Neumann, was undoubtedly one of the greatest and most influential scientific minds of the 20th century. Von Neumann made fundamental contributions to Computing and he had a keen interest in Dynamical Systems, specifically Hydrodynamic Turbulence. This book, offering a state-of-the-art collection of papers in computational dynamical systems, is dedicated to the memory of von Neumann. Including contributions from M Dellnitz, J Guckenheimer, P J Holmes, A Iserles, J E Marsden and M Shub, this book offers a unique combination of theoretical and applied research in areas such as geometric integration, neural networks, linear programming, dynamical astronomy, chemical reaction models, structural and fluid mechanics.

www.worldscientific.com
5982 he
